Pipeline for Extracting Clinical Entities Related to ICD-O Codes

Description

This pipeline is designed to extract all entities mappable to ICD-O codes.

Two NER models and a Text Matcher are used to accomplish this task.


How to use


# Python
from sparknlp.pretrained import PretrainedPipeline

# Load the pretrained pipeline from the licensed clinical models repository
ner_pipeline = PretrainedPipeline("ner_icdo_pipeline", "en", "clinical/models")

result = ner_pipeline.annotate("""
TRAF6 is a  putative oncogene in a variety of cancers  including
bladder cancer and skin cancer . WWP2 appears to regulate the expression of the well characterized
tumor  suppressor phosphatase and tensin homolog (PTEN) in endometrial cancer and squamous cell carcinoma.
""")


// Scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Load the pretrained pipeline from the licensed clinical models repository
val ner_pipeline = PretrainedPipeline("ner_icdo_pipeline", "en", "clinical/models")

val result = ner_pipeline.annotate("""
TRAF6 is a  putative oncogene in a variety of cancers  including
bladder cancer and skin cancer . WWP2 appears to regulate the expression of the well characterized
tumor  suppressor phosphatase and tensin homolog (PTEN) in endometrial cancer and squamous cell carcinoma.
""")
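In Python, annotate returns plain strings per output column, so the begin and end offsets listed under Results would instead come from fullAnnotate, which returns Annotation objects carrying result, begin, end, and metadata. The helper below is a minimal sketch of flattening one such result into rows. The column name "ner_chunk" and the helper name chunks_to_rows are assumptions made for illustration, not names confirmed by this model card, and the demo uses stand-in objects rather than a live pipeline call.

```python
from types import SimpleNamespace


def chunks_to_rows(full_result, chunk_col="ner_chunk"):
    """Flatten one fullAnnotate() result dict into (chunk, begin, end, entity) tuples.

    chunk_col is an assumed output column name; check the pipeline's actual
    output columns before relying on it.
    """
    rows = []
    for ann in full_result.get(chunk_col, []):
        # Each annotation exposes the matched text, its character span, and metadata.
        rows.append((ann.result, ann.begin, ann.end, ann.metadata.get("entity")))
    return rows


# Stand-in for ner_pipeline.fullAnnotate(text)[0]; values copied from the Results table.
demo = {
    "ner_chunk": [
        SimpleNamespace(result="cancers", begin=47, end=53,
                        metadata={"entity": "Cancer_dx"}),
        SimpleNamespace(result="tumor", begin=165, end=169,
                        metadata={"entity": "Oncological"}),
    ]
}

for row in chunks_to_rows(demo):
    print(row)
```

With a live session, the same helper would be applied to the first element of the list returned by ner_pipeline.fullAnnotate(text).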

Results

|    | chunks                  |   begin |   end | entities    |
|---:|:------------------------|--------:|------:|:------------|
|  0 | cancers                 |      47 |    53 | Cancer_dx   |
|  1 | bladder cancer          |      66 |    79 | Cancer_dx   |
|  2 | skin cancer             |      85 |    95 | Cancer_dx   |
|  3 | tumor                   |     165 |   169 | Oncological |
|  4 | endometrial cancer      |     224 |   241 | Cancer_dx   |
|  5 | squamous cell carcinoma |     247 |   269 | Cancer_dx   |
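The begin and end columns read as zero-based character offsets into the input string, with end inclusive and the newline that follows the opening triple quote counted as offset 0. The sketch below checks that reading against the example text and the table rows using plain Python only. The helper name chunk_at is hypothetical, and the text must keep its original spacing (including the double spaces) for the offsets to line up.

```python
# Input text copied from the example; the leading newline occupies offset 0.
text = """
TRAF6 is a  putative oncogene in a variety of cancers  including
bladder cancer and skin cancer . WWP2 appears to regulate the expression of the well characterized
tumor  suppressor phosphatase and tensin homolog (PTEN) in endometrial cancer and squamous cell carcinoma.
"""

# (chunk, begin, end, entity) rows copied from the Results table.
rows = [
    ("cancers", 47, 53, "Cancer_dx"),
    ("bladder cancer", 66, 79, "Cancer_dx"),
    ("skin cancer", 85, 95, "Cancer_dx"),
    ("tumor", 165, 169, "Oncological"),
    ("endometrial cancer", 224, 241, "Cancer_dx"),
    ("squamous cell carcinoma", 247, 269, "Cancer_dx"),
]


def chunk_at(source, begin, end):
    """Return the substring for an inclusive [begin, end] character span."""
    return source[begin:end + 1]


# Collect any rows whose offsets do not reproduce the chunk text.
mismatches = [(chunk, begin, end) for chunk, begin, end, _ in rows
              if chunk_at(text, begin, end) != chunk]
print(mismatches)  # an empty list means every span matches its chunk
```

Because end is inclusive, a Python slice needs end + 1; using text[begin:end] would drop the last character of each chunk.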

Model Information

Model Name: ner_icdo_pipeline
Type: pipeline
Compatibility: Healthcare NLP 6.0.2+
License: Licensed
Edition: Official
Language: en
Size: 1.7 GB

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • WordEmbeddingsModel
  • MedicalNerModel
  • NerConverterInternalModel
  • MedicalNerModel
  • NerConverterInternalModel
  • TextMatcherInternalModel
  • ChunkMergeModel