Detect Genes/Proteins (BC2GM) in Medical Texts

Description

This pretrained pipeline is built on the top of ner_biomedical_bc2gm model.

Open in Colab Copy S3 URI

How to use

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("ner_biomedical_bc2gm_pipeline", "en", "clinical/models")

result = pipeline.fullAnnotate("""Immunohistochemical staining was positive for S-100 in all 9 cases stained, positive for HMB-45 in 9 (90%) of 10, and negative for cytokeratin in all 9 cases in which myxoid melanoma remained in the block after previous sections.""")
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("ner_biomedical_bc2gm_pipeline", "en", "clinical/models")

val result = pipeline.fullAnnotate("""Immunohistochemical staining was positive for S-100 in all 9 cases stained, positive for HMB-45 in 9 (90%) of 10, and negative for cytokeratin in all 9 cases in which myxoid melanoma remained in the block after previous sections""")
import nlu
nlu.load("en.med_ner.biomedical_bc2gm.pipeline").predict("""Immunohistochemical staining was positive for S-100 in all 9 cases stained, positive for HMB-45 in 9 (90%) of 10, and negative for cytokeratin in all 9 cases in which myxoid melanoma remained in the block after previous sections.""")

Results

+-----------+------------+
|chunk      |ner_label   |
+-----------+------------+
|S-100      |GENE_PROTEIN|
|HMB-45     |GENE_PROTEIN|
|cytokeratin|GENE_PROTEIN|
+-----------+------------+

Model Information

Model Name: ner_biomedical_bc2gm_pipeline
Type: pipeline
Compatibility: Healthcare NLP 3.5.3+
License: Licensed
Edition: Official
Language: en
Size: 1.7 GB

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • WordEmbeddingsModel
  • MedicalNerModel
  • NerConverter