Pipeline to Detect Genes/Proteins (BC2GM) in Medical Texts

Description

This pretrained pipeline is built on top of the ner_biomedical_bc2gm model. It detects gene and protein mentions in medical text.

Predicted Entities

GENE_PROTEIN

How to use

Python:

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("ner_biomedical_bc2gm_pipeline", "en", "clinical/models")

text = '''Immunohistochemical staining was positive for S-100 in all 9 cases stained, positive for HMB-45 in 9 (90%) of 10, and negative for cytokeratin in all 9 cases in which myxoid melanoma remained in the block after previous sections.'''

result = pipeline.fullAnnotate(text)

Scala:

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("ner_biomedical_bc2gm_pipeline", "en", "clinical/models")

val text = "Immunohistochemical staining was positive for S-100 in all 9 cases stained, positive for HMB-45 in 9 (90%) of 10, and negative for cytokeratin in all 9 cases in which myxoid melanoma remained in the block after previous sections."

val result = pipeline.fullAnnotate(text)

NLU:

import nlu

nlu.load("en.med_ner.biomedical_bc2gm.pipeline").predict("""Immunohistochemical staining was positive for S-100 in all 9 cases stained, positive for HMB-45 in 9 (90%) of 10, and negative for cytokeratin in all 9 cases in which myxoid melanoma remained in the block after previous sections.""")
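The snippets above stop at the raw annotation result. As a minimal sketch, the Python result could be flattened into rows shaped like the Results table. This assumes that fullAnnotate returns a list of dictionaries keyed by output column, that the chunk column is named "ner_chunk", and that each chunk's metadata carries "entity" and "confidence" keys; none of these names are stated on this page. StubAnnotation and chunks_to_rows are hypothetical helpers, and the stub only stands in for a real annotation object so the example runs without a licensed installation.

```python
from dataclasses import dataclass, field


@dataclass
class StubAnnotation:
    """Hypothetical stand-in exposing only the fields read below."""
    begin: int
    end: int
    result: str
    metadata: dict = field(default_factory=dict)


def chunks_to_rows(annotated, column="ner_chunk"):
    """Flatten a fullAnnotate-style result into a list of row dicts.

    `column` is an assumed output column name; adjust it to whatever
    key the pipeline result actually contains.
    """
    rows = []
    for doc in annotated:
        for chunk in doc.get(column, []):
            conf = chunk.metadata.get("confidence")  # assumed metadata key
            rows.append({
                "ner_chunks": chunk.result,
                "begin": chunk.begin,
                "end": chunk.end,
                "ner_label": chunk.metadata.get("entity"),  # assumed metadata key
                "confidence": float(conf) if conf is not None else None,
            })
    return rows


# Stubbed input mirroring the first row of the Results table.
sample = [{
    "ner_chunk": [
        StubAnnotation(46, 50, "S-100",
                       {"entity": "GENE_PROTEIN", "confidence": "0.9911"}),
    ]
}]
rows = chunks_to_rows(sample)
```

With a real pipeline, the same helper would be called as chunks_to_rows(result), provided the assumptions about the column and metadata names hold.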

Results

|    | ner_chunks   |   begin |   end | ner_label    |   confidence |
|---:|:-------------|--------:|------:|:-------------|-------------:|
|  0 | S-100        |      46 |    50 | GENE_PROTEIN |       0.9911 |
|  1 | HMB-45       |      89 |    94 | GENE_PROTEIN |       0.9944 |
|  2 | cytokeratin  |     131 |   141 | GENE_PROTEIN |       0.9951 |
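The begin and end columns read as zero-based character offsets into the input text, with the end offset inclusive. A short check of that reading against the three rows above, using only the example sentence and the table values (span_text is a hypothetical helper name):

```python
text = (
    "Immunohistochemical staining was positive for S-100 in all 9 cases "
    "stained, positive for HMB-45 in 9 (90%) of 10, and negative for "
    "cytokeratin in all 9 cases in which myxoid melanoma remained in the "
    "block after previous sections."
)

# (chunk, begin, end) copied from the Results table.
reported = [
    ("S-100", 46, 50),
    ("HMB-45", 89, 94),
    ("cytokeratin", 131, 141),
]


def span_text(source, begin, end):
    """Return the substring for an inclusive [begin, end] offset pair."""
    return source[begin:end + 1]


# Each reported span should reproduce its chunk text exactly.
matches = [span_text(text, b, e) == chunk for chunk, b, e in reported]
```

Because the end offset is inclusive, a Python slice needs end + 1 as its upper bound.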

Model Information

Model Name: ner_biomedical_bc2gm_pipeline
Type: pipeline
Compatibility: Healthcare NLP 4.4.4+
License: Licensed
Edition: Official
Language: en
Size: 1.7 GB

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • WordEmbeddingsModel
  • MedicalNerModel
  • NerConverter