Pipeline to Detect Chemicals and Proteins in text (biobert)

Description

This pretrained pipeline is built on top of the ner_chemprot_biobert model.

Predicted Entities

GENE-N, CHEMICAL, GENE-Y

How to use

Python:

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("ner_chemprot_biobert_pipeline", "en", "clinical/models")

text = '''Keratinocyte growth factor and acidic fibroblast growth factor are mitogens for primary cultures of mammary epithelium.'''

result = pipeline.fullAnnotate(text)

Scala:

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("ner_chemprot_biobert_pipeline", "en", "clinical/models")

val text = "Keratinocyte growth factor and acidic fibroblast growth factor are mitogens for primary cultures of mammary epithelium."

val result = pipeline.fullAnnotate(text)

NLU:

import nlu
nlu.load("en.med_ner.chemprot_biobert.pipeline").predict("""Keratinocyte growth factor and acidic fibroblast growth factor are mitogens for primary cultures of mammary epithelium.""")
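
The `result` returned by `fullAnnotate` in the Python snippet is a list with one entry per input text. The sketch below shows one possible way to flatten it into rows with the same columns as the Results table. It rests on several assumptions not stated on this page: that the chunk annotations live under an output column named `ner_chunk`, that each annotation exposes `result`, `begin`, `end` and `metadata`, and that the metadata carries `entity` and `confidence` keys. The `Annotation` namedtuple and the `chunks_to_rows` helper are hypothetical stand-ins so the example runs without a Spark session.

```python
from collections import namedtuple

# Hypothetical stand-in for a Spark NLP annotation object, so the sketch
# runs without Spark. Assumption: real annotations expose these attributes.
Annotation = namedtuple("Annotation", ["result", "begin", "end", "metadata"])


def chunks_to_rows(result, column="ner_chunk"):
    """Flatten fullAnnotate-style output into one dict per detected chunk."""
    rows = []
    for doc in result:                   # one entry per input text
        for ann in doc.get(column, []):  # column name is an assumption
            rows.append({
                "ner_chunk": ann.result,
                "begin": ann.begin,
                "end": ann.end,
                # metadata keys below are assumptions about the converter output
                "ner_label": ann.metadata.get("entity"),
                "confidence": ann.metadata.get("confidence"),
            })
    return rows


# Mock input shaped like the first row of the Results table
mock_result = [{
    "ner_chunk": [
        Annotation("Keratinocyte", 0, 11,
                   {"entity": "GENE-Y", "confidence": "0.894"}),
    ]
}]

rows = chunks_to_rows(mock_result)
for row in rows:
    print(row)
```

With a real pipeline, `chunks_to_rows(result)` would replace the mock call, provided the assumptions above hold for the installed version.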

Results

|    | ner_chunk    |   begin |   end | ner_label   |   confidence |
|---:|:-------------|--------:|------:|:------------|-------------:|
|  0 | Keratinocyte |       0 |    11 | GENE-Y      |       0.894  |
|  1 | growth       |      13 |    18 | GENE-Y      |       0.4833 |
|  2 | factor       |      20 |    25 | GENE-Y      |       0.7991 |
|  3 | acidic       |      31 |    36 | GENE-Y      |       0.9765 |
|  4 | fibroblast   |      38 |    47 | GENE-Y      |       0.3905 |
|  5 | growth       |      49 |    54 | GENE-Y      |       0.7109 |
|  6 | factor       |      56 |    61 | GENE-Y      |       0.8693 |
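
The `begin` and `end` columns read as character offsets into the input text, with `end` inclusive. A small sketch in plain Python can confirm this against the table above; the `span` helper is hypothetical and the rows are copied from the table.

```python
text = ("Keratinocyte growth factor and acidic fibroblast growth factor "
        "are mitogens for primary cultures of mammary epithelium.")

# (ner_chunk, begin, end) copied from the Results table
table_rows = [
    ("Keratinocyte", 0, 11),
    ("growth", 13, 18),
    ("factor", 20, 25),
    ("acidic", 31, 36),
    ("fibroblast", 38, 47),
    ("growth", 49, 54),
    ("factor", 56, 61),
]


def span(source, begin, end):
    """Return the substring for an inclusive [begin, end] offset pair."""
    return source[begin:end + 1]


# Collect any row whose offsets do not reproduce the chunk text
mismatches = [(chunk, begin, end)
              for chunk, begin, end in table_rows
              if span(text, begin, end) != chunk]

print(mismatches)  # prints [] because every offset pair matches its chunk
```

Because `end` is inclusive, slicing needs `end + 1`; using `text[begin:end]` would drop the last character of each chunk.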

Model Information

Model Name: ner_chemprot_biobert_pipeline
Type: pipeline
Compatibility: Healthcare NLP 4.4.4+
License: Licensed
Edition: Official
Language: en
Size: 422.0 MB

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • BertEmbeddings
  • MedicalNerModel
  • NerConverter