Description
This specialized oncology pipeline can:
- extract oncology biomarker entities,
- assign assertion status to the extracted entities,
- establish relations between the extracted entities in clinical documents.
To perform these tasks, the pipeline uses the ner_oncology, ner_oncology_test, ner_oncology_biomarker, ner_biomarker and cancer_diagnosis_matcher NER models, the assertion_oncology and assertion_oncology_test_binary assertion models, and the re_oncology_granular and re_oncology_biomarker_result relation extraction models.
- Clinical Entity Labels: Histological_Type, Direction, Staging, Cancer_Score, Imaging_Test, Cycle_Number, Tumor_Finding, Site_Lymph_Node, Invasion, Response_To_Treatment, Smoking_Status, Tumor_Size, Cycle_Count, Adenopathy, Age, Biomarker_Result, Unspecific_Therapy, Site_Breast, Chemotherapy, Targeted_Therapy, Radiotherapy, Performance_Status, Pathology_Test, Site_Other_Body_Part, Cancer_Surgery, Line_Of_Therapy, Pathology_Result, Hormonal_Therapy, Site_Bone, Biomarker, Immunotherapy, Cycle_Day, Frequency, Route, Duration, Death_Entity, Metastasis, Site_Liver, Cancer_Dx, Grade, Date, Site_Lung, Site_Brain, Relative_Date, Race_Ethnicity, Gender, Oncogene, Dosage, Radiation_Dose, Drug, CancerModifier, Radiological_Test_Result, Biomarker_Measurement, Radiological_Test, Test, Test_Result, Prognostic_Biomarkers, Predictive_Biomarkers
- Assertion Status Labels: Past, Family, Absent, Hypothetical, Possible, Present, Hypothetical_Or_Absent, Medical_History
- Relation Extraction Labels: is_related_to, is_size_of, is_date_of, is_location_of, is_finding_of
Predicted Entities
Histological_Type, Direction, Staging, Cancer_Score, Imaging_Test, Cycle_Number, Tumor_Finding, Site_Lymph_Node, Invasion, Response_To_Treatment, Smoking_Status, Tumor_Size, Cycle_Count, Adenopathy, Age, Biomarker_Result, Unspecific_Therapy, Site_Breast, Chemotherapy, Targeted_Therapy, Radiotherapy, Performance_Status, Pathology_Test, Site_Other_Body_Part, Cancer_Surgery, Line_Of_Therapy, Pathology_Result, Hormonal_Therapy, Site_Bone, Biomarker, Immunotherapy, Cycle_Day, Frequency, Route, Duration, Death_Entity, Metastasis, Site_Liver, Cancer_Dx, Grade, Date, Site_Lung, Site_Brain, Relative_Date, Race_Ethnicity, Gender, Oncogene, Dosage, Radiation_Dose, Drug, CancerModifier, Radiological_Test_Result, Biomarker_Measurement, Radiological_Test, Test, Test_Result, Prognostic_Biomarkers, Predictive_Biomarkers
How to use
# Python
from sparknlp.pretrained import PretrainedPipeline

# Load the pretrained pipeline from the clinical models repository
pipeline = PretrainedPipeline("oncology_biomarker_pipeline", "en", "clinical/models")

# Annotate a sample text; the line break inside the string is part of the input
result = pipeline.fullAnnotate("""Immunohistochemistry was negative for thyroid transcription factor-1 and napsin A. The test was positive for ER and PR,
and negative for HER2.""")

// Scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Load the pretrained pipeline from the clinical models repository
val pipeline = PretrainedPipeline("oncology_biomarker_pipeline", "en", "clinical/models")

// Annotate a sample text; the line break inside the string is part of the input
val result = pipeline.fullAnnotate("""Immunohistochemistry was negative for thyroid transcription factor-1 and napsin A. The test was positive for ER and PR,
and negative for HER2.""")
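The value returned by `fullAnnotate` is a list of dictionaries that map output column names to lists of annotation objects. The following is a minimal sketch of how such chunk annotations might be flattened into plain rows. It runs without Spark by using a hypothetical `StandInAnnotation` class that only mimics the `begin`, `end`, `result` and `metadata` fields. The helper name `chunks_to_rows` and the metadata keys `entity` and `confidence` are assumptions for illustration, not confirmed details of this pipeline.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class StandInAnnotation:
    """Hypothetical stand-in exposing only the fields read below."""
    begin: int
    end: int
    result: str
    metadata: Dict[str, str] = field(default_factory=dict)


def chunks_to_rows(annotations: List[Any]) -> List[Dict[str, Any]]:
    """Flatten chunk annotations into one plain dict per chunk."""
    rows = []
    for ann in annotations:
        meta = ann.metadata or {}
        rows.append({
            "chunk": ann.result,
            "begin": ann.begin,
            "end": ann.end,
            # Assumed metadata keys; missing keys yield None.
            "ner_label": meta.get("entity"),
            "confidence": meta.get("confidence"),
        })
    return rows


# Stand-in data for illustration only (not produced by running the pipeline).
demo = [
    StandInAnnotation(0, 19, "Immunohistochemistry",
                      {"entity": "Test", "confidence": "0.9561"}),
    StandInAnnotation(25, 32, "negative",
                      {"entity": "Biomarker_Measurement", "confidence": "0.968"}),
]
rows = chunks_to_rows(demo)
for row in rows:
    print(row)
```

With a real result, the helper would be called on one of the lists in `result[0]`. The output column names depend on the pipeline, so inspecting `result[0].keys()` first is the safer route.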
Results
******************** ner_biomarker results ********************
| chunk | begin | end | ner_label | confidence |
|:-------------------------------|--------:|------:|:----------------------|-------------:|
| Immunohistochemistry | 0 | 19 | Test | 0.9561 |
| negative | 25 | 32 | Biomarker_Measurement | 0.968 |
| thyroid transcription factor-1 | 38 | 67 | Biomarker | 0.610925 |
| napsin A | 73 | 80 | Biomarker | 0.8696 |
| positive | 96 | 103 | Biomarker_Measurement | 0.9228 |
| ER | 109 | 110 | Biomarker | 0.9978 |
| PR | 116 | 117 | Biomarker | 0.9932 |
| negative | 124 | 131 | Biomarker_Measurement | 0.9781 |
| HER2 | 137 | 140 | Biomarker | 0.7243 |
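The `begin` and `end` columns in the table above read as zero-based character offsets into the input text, with `end` inclusive. The short check below illustrates that reading by slicing the sample text with the offsets copied from the table. The helper name `slice_chunk` is introduced here for illustration only.

```python
# Sample input, including the line break after "PR," (it counts as one character).
text = (
    "Immunohistochemistry was negative for thyroid transcription factor-1 "
    "and napsin A. The test was positive for ER and PR,\n"
    "and negative for HER2."
)

# (chunk, begin, end) triples copied from the ner_biomarker results table.
table_rows = [
    ("Immunohistochemistry", 0, 19),
    ("negative", 25, 32),
    ("thyroid transcription factor-1", 38, 67),
    ("napsin A", 73, 80),
    ("positive", 96, 103),
    ("ER", 109, 110),
    ("PR", 116, 117),
    ("negative", 124, 131),
    ("HER2", 137, 140),
]


def slice_chunk(source: str, begin: int, end: int) -> str:
    """Return the substring for an inclusive [begin, end] offset pair."""
    return source[begin:end + 1]


# Collect any row whose offsets do not reproduce its chunk text.
mismatches = [
    (chunk, begin, end)
    for chunk, begin, end in table_rows
    if slice_chunk(text, begin, end) != chunk
]
print(mismatches)  # → []
```

Because `end` is inclusive, Python slicing needs `end + 1`; using `text[begin:end]` would drop the last character of every chunk.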
******************** assertion results ********************
| chunk | ner_label | assertion | assertion_source |
|:-------------------------------|:-----------------|:------------|:-------------------|
| Immunohistochemistry | Pathology_Test | Past | assertion_oncology |
| negative | Biomarker_Result | Past | assertion_oncology |
| thyroid transcription factor-1 | Biomarker | Present | assertion_oncology |
| napsin A | Biomarker | Present | assertion_oncology |
| positive | Biomarker_Result | Present | assertion_oncology |
| ER | Biomarker | Present | assertion_oncology |
| PR | Biomarker | Present | assertion_oncology |
| negative | Biomarker_Result | Present | assertion_oncology |
| HER2 | Oncogene | Present | assertion_oncology |
******************** re results ********************
| chunk1 | entity1 | chunk2 | entity2 | relation |
|:---------------------|:-----------------|:-------------------------------|:-----------------|:--------------|
| negative | Biomarker_Result | thyroid transcription factor-1 | Biomarker | is_related_to |
| negative | Biomarker_Result | napsin A | Biomarker | is_related_to |
| positive | Biomarker_Result | ER | Biomarker | is_related_to |
| positive | Biomarker_Result | PR | Biomarker | is_related_to |
| negative | Biomarker_Result | HER2 | Oncogene | is_related_to |
| negative | Biomarker_Result | thyroid transcription factor-1 | Biomarker | is_finding_of |
| negative | Biomarker_Result | napsin A | Biomarker | is_finding_of |
| positive | Biomarker_Result | ER | Biomarker | is_finding_of |
| positive | Biomarker_Result | PR | Biomarker | is_finding_of |
| positive | Biomarker_Result | HER2 | Oncogene | is_finding_of |
| negative | Biomarker_Result | HER2 | Oncogene | is_finding_of |
| negative | Biomarker_Result | thyroid transcription factor-1 | Biomarker | is_finding_of |
| negative | Biomarker_Result | napsin A | Biomarker | is_finding_of |
| positive | Biomarker_Result | ER | Biomarker | is_finding_of |
| positive | Biomarker_Result | PR | Biomarker | is_finding_of |
| negative | Biomarker_Result | HER2 | Oncogene | is_finding_of |
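The relation table above lists several rows more than once, and under `is_finding_of` HER2 is paired with both `positive` and `negative`. As a rough sketch of one way to post-process such output, the snippet below deduplicates the rows, groups results by biomarker for each relation label, and flags biomarkers with conflicting results. This is an illustrative downstream step, not part of the pipeline; the function name `results_by_biomarker` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (chunk1, chunk2, relation) triples copied from the re results table.
relations: List[Tuple[str, str, str]] = [
    ("negative", "thyroid transcription factor-1", "is_related_to"),
    ("negative", "napsin A", "is_related_to"),
    ("positive", "ER", "is_related_to"),
    ("positive", "PR", "is_related_to"),
    ("negative", "HER2", "is_related_to"),
    ("negative", "thyroid transcription factor-1", "is_finding_of"),
    ("negative", "napsin A", "is_finding_of"),
    ("positive", "ER", "is_finding_of"),
    ("positive", "PR", "is_finding_of"),
    ("positive", "HER2", "is_finding_of"),
    ("negative", "HER2", "is_finding_of"),
    ("negative", "thyroid transcription factor-1", "is_finding_of"),
    ("negative", "napsin A", "is_finding_of"),
    ("positive", "ER", "is_finding_of"),
    ("positive", "PR", "is_finding_of"),
    ("negative", "HER2", "is_finding_of"),
]


def results_by_biomarker(rows, relation: str) -> Dict[str, List[str]]:
    """Group distinct result chunks by biomarker for one relation label."""
    grouped = defaultdict(set)  # sets drop repeated rows
    for result_chunk, biomarker, rel in rows:
        if rel == relation:
            grouped[biomarker].add(result_chunk)
    return {name: sorted(values) for name, values in grouped.items()}


related = results_by_biomarker(relations, "is_related_to")
finding = results_by_biomarker(relations, "is_finding_of")

# Biomarkers linked to more than one distinct result deserve manual review.
conflicting = sorted(name for name, values in finding.items() if len(values) > 1)
print(related)
print(conflicting)  # → ['HER2']
```

In this sample, the `is_related_to` rows give a single result per biomarker, while the `is_finding_of` rows leave HER2 ambiguous, so comparing the two relation labels can help decide which pairing to trust.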
Model Information
| Model Name: | oncology_biomarker_pipeline |
| Type: | pipeline |
| Compatibility: | Healthcare NLP 5.4.1+ |
| License: | Licensed |
| Edition: | Official |
| Language: | en |
| Size: | 1.8 GB |
Included Models
- DocumentAssembler
- SentenceDetectorDLModel
- TokenizerModel
- WordEmbeddingsModel
- MedicalNerModel
- NerConverterInternalModel
- MedicalNerModel
- NerConverterInternalModel
- MedicalNerModel
- NerConverterInternalModel
- MedicalNerModel
- NerConverterInternalModel
- TextMatcherInternalModel
- ChunkMergeModel
- ChunkMergeModel
- AssertionDLModel
- ChunkFilterer
- AssertionDLModel
- AssertionMerger
- PerceptronModel
- DependencyParserModel
- RelationExtractionModel
- RelationExtractionModel
- AnnotationMerger