Pipeline to Detect Anatomical Structures in Medical Text

Description

This pretrained pipeline is built on top of the bert_token_classifier_ner_anatem model and detects anatomical structures in medical text.

Predicted Entities

How to use

Python:

from sparknlp.pretrained import PretrainedPipeline

# Load the pretrained clinical pipeline (downloaded on first use)
pipeline = PretrainedPipeline("bert_token_classifier_ner_anatem_pipeline", "en", "clinical/models")

text = '''Malignant cells often display defects in autophagy, an evolutionarily conserved pathway for degrading long-lived proteins and cytoplasmic organelles. However, as yet, there is no genetic evidence for a role of autophagy genes in tumor suppression. The beclin 1 autophagy gene is monoallelically deleted in 40 - 75 % of cases of human sporadic breast, ovarian, and prostate cancer.'''

# fullAnnotate keeps character offsets and metadata for every annotation
result = pipeline.fullAnnotate(text)

Scala:

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Load the pretrained clinical pipeline (downloaded on first use)
val pipeline = new PretrainedPipeline("bert_token_classifier_ner_anatem_pipeline", "en", "clinical/models")

val text = "Malignant cells often display defects in autophagy, an evolutionarily conserved pathway for degrading long-lived proteins and cytoplasmic organelles. However, as yet, there is no genetic evidence for a role of autophagy genes in tumor suppression. The beclin 1 autophagy gene is monoallelically deleted in 40 - 75 % of cases of human sporadic breast, ovarian, and prostate cancer."

// fullAnnotate keeps character offsets and metadata for every annotation
val result = pipeline.fullAnnotate(text)

Results

|    | ner_chunk              |   begin |   end | ner_label   |   confidence |
|---:|:-----------------------|--------:|------:|:------------|-------------:|
|  0 | Malignant cells        |       0 |    14 | Anatomy     |     0.999951 |
|  1 | cytoplasmic organelles |     126 |   147 | Anatomy     |     0.999937 |
|  2 | tumor                  |     229 |   233 | Anatomy     |     0.999871 |
|  3 | breast                 |     343 |   348 | Anatomy     |     0.999842 |
|  4 | ovarian                |     351 |   357 | Anatomy     |     0.99998  |
|  5 | prostate cancer        |     364 |   378 | Anatomy     |     0.999968 |
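A table like the one above can be assembled from the Python `result` by walking the chunk annotations. The sketch below is illustrative, not the pipeline's own code. It assumes the chunks are stored under the output key `ner_chunk`, and that each chunk's metadata carries `entity` and `confidence` entries. Those names are inferred from the table columns, so check `result[0].keys()` on a real run. `FakeAnnotation` and `chunks_to_rows` are hypothetical helpers, and the stand-in object only mirrors the `result`, `begin`, `end`, and `metadata` fields of a Spark NLP annotation so the example runs without a Spark session.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeAnnotation:
    """Hypothetical stand-in exposing the annotation fields read below."""
    result: str
    begin: int
    end: int
    metadata: Dict[str, str] = field(default_factory=dict)


def chunks_to_rows(full_annotate_output: List[Dict[str, list]],
                   chunk_key: str = "ner_chunk") -> List[Dict[str, Any]]:
    """Flatten fullAnnotate-style output into one dict per detected chunk.

    `chunk_key` and the metadata keys "entity" / "confidence" are
    assumptions; adjust them to match the actual pipeline output.
    """
    rows = []
    for doc in full_annotate_output:          # one dict per input text
        for chunk in doc.get(chunk_key, []):  # one annotation per chunk
            meta = chunk.metadata
            confidence = meta.get("confidence")
            rows.append({
                "ner_chunk": chunk.result,
                "begin": chunk.begin,
                "end": chunk.end,
                "ner_label": meta.get("entity"),
                # metadata values are strings, so convert when present
                "confidence": float(confidence) if confidence is not None else None,
            })
    return rows


# Demo on stand-in data shaped like the first two rows of the table
demo_output = [{
    "ner_chunk": [
        FakeAnnotation("Malignant cells", 0, 14,
                       {"entity": "Anatomy", "confidence": "0.999951"}),
        FakeAnnotation("cytoplasmic organelles", 126, 147,
                       {"entity": "Anatomy", "confidence": "0.999937"}),
    ]
}]

rows = chunks_to_rows(demo_output)
for row in rows:
    print(row)
```

With a real pipeline, `chunks_to_rows(result)` would take the place of the demo call, and the rows could be passed to `pandas.DataFrame(rows)` for display.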

Model Information

Model Name: bert_token_classifier_ner_anatem_pipeline
Type: pipeline
Compatibility: Healthcare NLP 4.4.4+
License: Licensed
Edition: Official
Language: en
Size: 404.8 MB

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • MedicalBertForTokenClassifier
  • NerConverterInternalModel