Description
This specialized oncology pipeline can:
- extract oncology biomarker entities,
- assign assertion status to the extracted entities,
- establish relations between the entities extracted from clinical documents.
To perform these tasks, the pipeline combines the NER models ner_oncology, ner_oncology_test, ner_oncology_biomarker, ner_biomarker and cancer_diagnosis_matcher, the assertion models assertion_oncology and assertion_oncology_test_binary, and the relation extraction models re_oncology_granular and re_oncology_biomarker_result.
- Clinical Entity Labels: Histological_Type, Direction, Staging, Cancer_Score, Imaging_Test, Cycle_Number, Tumor_Finding, Site_Lymph_Node, Invasion, Response_To_Treatment, Smoking_Status, Tumor_Size, Cycle_Count, Adenopathy, Age, Biomarker_Result, Unspecific_Therapy, Site_Breast, Chemotherapy, Targeted_Therapy, Radiotherapy, Performance_Status, Pathology_Test, Site_Other_Body_Part, Cancer_Surgery, Line_Of_Therapy, Pathology_Result, Hormonal_Therapy, Site_Bone, Biomarker, Immunotherapy, Cycle_Day, Frequency, Route, Duration, Death_Entity, Metastasis, Site_Liver, Cancer_Dx, Grade, Date, Site_Lung, Site_Brain, Relative_Date, Race_Ethnicity, Gender, Oncogene, Dosage, Radiation_Dose, Drug, CancerModifier, Radiological_Test_Result, Biomarker_Measurement, Radiological_Test, Test, Test_Result, Prognostic_Biomarkers, Predictive_Biomarkers
- Assertion Status Labels: Past, Family, Absent, Hypothetical, Possible, Present, Hypothetical_Or_Absent, Medical_History
- Relation Extraction Labels: is_related_to, is_size_of, is_date_of, is_location_of, is_finding_of
Predicted Entities
Histological_Type, Direction, Staging, Cancer_Score, Imaging_Test, Cycle_Number, Tumor_Finding, Site_Lymph_Node, Invasion, Response_To_Treatment, Smoking_Status, Tumor_Size, Cycle_Count, Adenopathy, Age, Biomarker_Result, Unspecific_Therapy, Site_Breast, Chemotherapy, Targeted_Therapy, Radiotherapy, Performance_Status, Pathology_Test, Site_Other_Body_Part, Cancer_Surgery, Line_Of_Therapy, Pathology_Result, Hormonal_Therapy, Site_Bone, Biomarker, Immunotherapy, Cycle_Day, Frequency, Route, Duration, Death_Entity, Metastasis, Site_Liver, Cancer_Dx, Grade, Date, Site_Lung, Site_Brain, Relative_Date, Race_Ethnicity, Gender, Oncogene, Dosage, Radiation_Dose, Drug, CancerModifier, Radiological_Test_Result, Biomarker_Measurement, Radiological_Test, Test, Test_Result, Prognostic_Biomarkers, Predictive_Biomarkers
How to use
# Python
from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("oncology_biomarker_pipeline", "en", "clinical/models")

result = pipeline.fullAnnotate("""Immunohistochemistry was negative for thyroid transcription factor-1 and napsin A. The test was positive for ER and PR,
and negative for HER2.""")

// Scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = PretrainedPipeline("oncology_biomarker_pipeline", "en", "clinical/models")

val result = pipeline.fullAnnotate("""Immunohistochemistry was negative for thyroid transcription factor-1 and napsin A. The test was positive for ER and PR,
and negative for HER2.""")
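
In Python, fullAnnotate returns one dictionary per input text, mapping each output column to a list of annotation objects. The sketch below shows one possible way to flatten a chunk column into rows shaped like the Results tables. It rests on several assumptions: each annotation exposes result, begin, end and a metadata dictionary, and that dictionary carries entity and confidence keys. The output column names of this pipeline are not listed on this page, so inspect result[0].keys() before choosing one. chunks_to_rows is a hypothetical helper name, and the SimpleNamespace object only stands in for a real annotation so the snippet runs without a licensed session.

```python
from types import SimpleNamespace


def chunks_to_rows(annotations):
    """Flatten chunk-style annotations into plain dictionaries.

    Assumes each item has .result, .begin, .end and a .metadata dict
    (assumption: 'entity' and 'confidence' are the metadata keys).
    """
    rows = []
    for ann in annotations:
        meta = ann.metadata or {}
        rows.append({
            "chunk": ann.result,
            "begin": ann.begin,
            "end": ann.end,
            # Missing keys become None instead of raising KeyError.
            "ner_label": meta.get("entity"),
            "confidence": meta.get("confidence"),
        })
    return rows


# Stand-in for one annotation, using the values of the first Results row.
sample = [
    SimpleNamespace(
        result="Immunohistochemistry",
        begin=0,
        end=19,
        metadata={"entity": "Test", "confidence": "0.9561"},
    )
]

rows = chunks_to_rows(sample)
print(rows[0]["chunk"], rows[0]["ner_label"])
```

With a real result, the call would look like chunks_to_rows(result[0][column]), where column is one of the names found in result[0].keys().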
Results
******************** ner_biomarker results ********************
| chunk | begin | end | ner_label | confidence |
|:-------------------------------|--------:|------:|:----------------------|-------------:|
| Immunohistochemistry | 0 | 19 | Test | 0.9561 |
| negative | 25 | 32 | Biomarker_Measurement | 0.968 |
| thyroid transcription factor-1 | 38 | 67 | Biomarker | 0.610925 |
| napsin A | 73 | 80 | Biomarker | 0.8696 |
| positive | 96 | 103 | Biomarker_Measurement | 0.9228 |
| ER | 109 | 110 | Biomarker | 0.9978 |
| PR | 116 | 117 | Biomarker | 0.9932 |
| negative | 124 | 131 | Biomarker_Measurement | 0.9781 |
| HER2 | 137 | 140 | Biomarker | 0.7243 |
******************** assertion results ********************
| chunk | ner_label | assertion | assertion_source |
|:-------------------------------|:-----------------|:------------|:-------------------|
| Immunohistochemistry | Pathology_Test | Past | assertion_oncology |
| negative | Biomarker_Result | Past | assertion_oncology |
| thyroid transcription factor-1 | Biomarker | Present | assertion_oncology |
| napsin A | Biomarker | Present | assertion_oncology |
| positive | Biomarker_Result | Present | assertion_oncology |
| ER | Biomarker | Present | assertion_oncology |
| PR | Biomarker | Present | assertion_oncology |
| negative | Biomarker_Result | Present | assertion_oncology |
| HER2 | Oncogene | Present | assertion_oncology |
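
A common next step is to separate entities by assertion status, for example to keep only those asserted as Present. The sketch below groups the rows of the assertion table above by status. The rows are copied by hand from that table, and group_by_assertion is a hypothetical helper name, not part of the library.

```python
from collections import defaultdict

# (chunk, ner_label, assertion) rows copied from the assertion table above.
assertion_rows = [
    ("Immunohistochemistry", "Pathology_Test", "Past"),
    ("negative", "Biomarker_Result", "Past"),
    ("thyroid transcription factor-1", "Biomarker", "Present"),
    ("napsin A", "Biomarker", "Present"),
    ("positive", "Biomarker_Result", "Present"),
    ("ER", "Biomarker", "Present"),
    ("PR", "Biomarker", "Present"),
    ("negative", "Biomarker_Result", "Present"),
    ("HER2", "Oncogene", "Present"),
]


def group_by_assertion(rows):
    """Map each assertion status to the (chunk, ner_label) pairs carrying it."""
    grouped = defaultdict(list)
    for chunk, ner_label, assertion in rows:
        grouped[assertion].append((chunk, ner_label))
    return dict(grouped)


by_status = group_by_assertion(assertion_rows)
for status, chunks in by_status.items():
    print(status, len(chunks))
```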
******************** re results ********************
| chunk1 | entity1 | chunk2 | entity2 | relation |
|:---------------------|:-----------------|:-------------------------------|:-----------------|:--------------|
| negative | Biomarker_Result | thyroid transcription factor-1 | Biomarker | is_related_to |
| negative | Biomarker_Result | napsin A | Biomarker | is_related_to |
| positive | Biomarker_Result | ER | Biomarker | is_related_to |
| positive | Biomarker_Result | PR | Biomarker | is_related_to |
| negative | Biomarker_Result | HER2 | Oncogene | is_related_to |
| negative | Biomarker_Result | thyroid transcription factor-1 | Biomarker | is_finding_of |
| negative | Biomarker_Result | napsin A | Biomarker | is_finding_of |
| positive | Biomarker_Result | ER | Biomarker | is_finding_of |
| positive | Biomarker_Result | PR | Biomarker | is_finding_of |
| positive | Biomarker_Result | HER2 | Oncogene | is_finding_of |
| negative | Biomarker_Result | HER2 | Oncogene | is_finding_of |
| negative | Biomarker_Result | thyroid transcription factor-1 | Biomarker | is_finding_of |
| negative | Biomarker_Result | napsin A | Biomarker | is_finding_of |
| positive | Biomarker_Result | ER | Biomarker | is_finding_of |
| positive | Biomarker_Result | PR | Biomarker | is_finding_of |
| negative | Biomarker_Result | HER2 | Oncogene | is_finding_of |
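
The table above lists some rows more than once, and HER2 appears with both positive and negative under is_finding_of. The sketch below shows one way to post-process such output: drop exact duplicates, then flag biomarkers linked to more than one result so they can be reviewed. The triples are copied by hand from the table, and results_by_biomarker is a hypothetical helper name.

```python
from collections import defaultdict

# (chunk1, chunk2, relation) triples copied from the re results table above.
relation_rows = [
    ("negative", "thyroid transcription factor-1", "is_related_to"),
    ("negative", "napsin A", "is_related_to"),
    ("positive", "ER", "is_related_to"),
    ("positive", "PR", "is_related_to"),
    ("negative", "HER2", "is_related_to"),
    ("negative", "thyroid transcription factor-1", "is_finding_of"),
    ("negative", "napsin A", "is_finding_of"),
    ("positive", "ER", "is_finding_of"),
    ("positive", "PR", "is_finding_of"),
    ("positive", "HER2", "is_finding_of"),
    ("negative", "HER2", "is_finding_of"),
    ("negative", "thyroid transcription factor-1", "is_finding_of"),
    ("negative", "napsin A", "is_finding_of"),
    ("positive", "ER", "is_finding_of"),
    ("positive", "PR", "is_finding_of"),
    ("negative", "HER2", "is_finding_of"),
]

# dict.fromkeys keeps the first occurrence of each triple and preserves order.
unique_rows = list(dict.fromkeys(relation_rows))


def results_by_biomarker(rows, relation="is_finding_of"):
    """Collect the set of result chunks linked to each biomarker chunk."""
    grouped = defaultdict(set)
    for result_chunk, biomarker_chunk, rel in rows:
        if rel == relation:
            grouped[biomarker_chunk].add(result_chunk)
    return dict(grouped)


findings = results_by_biomarker(unique_rows)
# Biomarkers linked to more than one distinct result need manual review.
conflicting = sorted(name for name, results in findings.items() if len(results) > 1)
print(len(unique_rows), conflicting)
```

Flagging rather than silently choosing one result keeps the decision with a reviewer, which seems the safer default for clinical text.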
Model Information
| Field | Value |
|:--------------|:----------------------------|
| Model Name | oncology_biomarker_pipeline |
| Type | pipeline |
| Compatibility | Healthcare NLP 5.4.1+ |
| License | Licensed |
| Edition | Official |
| Language | en |
| Size | 1.8 GB |
Included Models
- DocumentAssembler
- SentenceDetectorDLModel
- TokenizerModel
- WordEmbeddingsModel
- MedicalNerModel
- NerConverterInternalModel
- MedicalNerModel
- NerConverterInternalModel
- MedicalNerModel
- NerConverterInternalModel
- MedicalNerModel
- NerConverterInternalModel
- TextMatcherInternalModel
- ChunkMergeModel
- ChunkMergeModel
- AssertionDLModel
- ChunkFilterer
- AssertionDLModel
- AssertionMerger
- PerceptronModel
- DependencyParserModel
- RelationExtractionModel
- RelationExtractionModel
- AnnotationMerger