Description
This pretrained pipeline is built on top of the bert_token_classifier_ner_bc2gm_gene model.
Predicted Entities
GENE/PROTEIN
How to use
# Python
from sparknlp.pretrained import PretrainedPipeline
# Load the pretrained pipeline from the "clinical/models" repository
pipeline = PretrainedPipeline("bert_token_classifier_ner_bc2gm_gene_pipeline", "en", "clinical/models")
text = '''ROCK-I, Kinectin, and mDia2 can bind the wild type forms of both RhoA and Cdc42 in a GTP-dependent manner in vitro. These results support the hypothesis that in the presence of tryptophan the ribosome translating tnaC blocks Rho ' s access to the boxA and rut sites, thereby preventing transcription termination.'''
# Annotate the sample text
result = pipeline.fullAnnotate(text)
// Scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
// Load the pretrained pipeline from the "clinical/models" repository
val pipeline = new PretrainedPipeline("bert_token_classifier_ner_bc2gm_gene_pipeline", "en", "clinical/models")
val text = "ROCK-I, Kinectin, and mDia2 can bind the wild type forms of both RhoA and Cdc42 in a GTP-dependent manner in vitro. These results support the hypothesis that in the presence of tryptophan the ribosome translating tnaC blocks Rho ' s access to the boxA and rut sites, thereby preventing transcription termination."
// Annotate the sample text
val result = pipeline.fullAnnotate(text)
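The object returned by `fullAnnotate` can be flattened into one row per detected chunk. The following is a minimal sketch, not the pipeline's own post-processing. It assumes that the chunks are stored under an output column named `ner_chunk`, and that each annotation exposes `result`, `begin`, `end`, and a `metadata` mapping with `entity` and `confidence` keys. The helper `chunks_to_rows` is hypothetical, and `SimpleNamespace` objects stand in for real annotations so the snippet runs without a licensed session.

```python
from types import SimpleNamespace


def chunks_to_rows(result, column="ner_chunk"):
    """Flatten fullAnnotate-style output into one dict per detected chunk.

    `column` is an assumed output column name; adjust it to whatever the
    loaded pipeline actually produces.
    """
    rows = []
    for doc in result:  # one entry per input text
        for ann in doc.get(column, []):
            rows.append({
                "ner_chunk": ann.result,
                "begin": ann.begin,
                "end": ann.end,
                "ner_label": ann.metadata.get("entity"),
                "confidence": ann.metadata.get("confidence"),
            })
    return rows


# Stand-in for `pipeline.fullAnnotate(text)`; the values are copied from
# the first two rows of the Results table.
sample_result = [{
    "ner_chunk": [
        SimpleNamespace(result="ROCK-I", begin=0, end=5,
                        metadata={"entity": "GENE/PROTEIN", "confidence": "0.999978"}),
        SimpleNamespace(result="Kinectin", begin=8, end=15,
                        metadata={"entity": "GENE/PROTEIN", "confidence": "0.999973"}),
    ]
}]

for row in chunks_to_rows(sample_result):
    print(row)
```

With a real session, `chunks_to_rows(result)` would take the place of the stand-in call, provided the assumptions about the column name and metadata keys hold for the loaded pipeline.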
Results
| | ner_chunk | begin | end | ner_label | confidence |
|---:|:------------|--------:|------:|:-------------|-------------:|
| 0 | ROCK-I | 0 | 5 | GENE/PROTEIN | 0.999978 |
| 1 | Kinectin | 8 | 15 | GENE/PROTEIN | 0.999973 |
| 2 | mDia2 | 22 | 26 | GENE/PROTEIN | 0.999974 |
| 3 | RhoA | 65 | 68 | GENE/PROTEIN | 0.999976 |
| 4 | Cdc42 | 74 | 78 | GENE/PROTEIN | 0.999979 |
| 5 | tnaC | 213 | 216 | GENE/PROTEIN | 0.999978 |
| 6 | Rho | 225 | 227 | GENE/PROTEIN | 0.999976 |
| 7 | boxA | 247 | 250 | GENE/PROTEIN | 0.999837 |
| 8 | rut sites | 256 | 264 | GENE/PROTEIN | 0.99115 |
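The `begin` and `end` columns above are character offsets into the input text, with `end` inclusive, so a chunk is recovered with `text[begin:end + 1]`. The sketch below checks this against the sample text using plain Python only. The helper name `span` is illustrative and not part of any library.

```python
# Sample text exactly as passed to the pipeline (spacing preserved, since
# the offsets depend on it).
text = "ROCK-I, Kinectin, and mDia2 can bind the wild type forms of both RhoA and Cdc42 in a GTP-dependent manner in vitro. These results support the hypothesis that in the presence of tryptophan the ribosome translating tnaC blocks Rho ' s access to the boxA and rut sites, thereby preventing transcription termination."

# (ner_chunk, begin, end) triples copied from the Results table.
expected = [
    ("ROCK-I", 0, 5),
    ("Kinectin", 8, 15),
    ("mDia2", 22, 26),
    ("RhoA", 65, 68),
    ("Cdc42", 74, 78),
    ("tnaC", 213, 216),
    ("Rho", 225, 227),
    ("boxA", 247, 250),
    ("rut sites", 256, 264),
]


def span(source, begin, end):
    """Return the substring for inclusive character offsets."""
    return source[begin:end + 1]


# Collect any rows whose offsets do not reproduce the chunk text.
mismatches = [(chunk, begin, end)
              for chunk, begin, end in expected
              if span(text, begin, end) != chunk]
print(mismatches)
```

An empty list means every row of the table maps back to its chunk. Note that the offsets rely on the tokenized spacing in `Rho ' s`, so normalizing it to `Rho's` would shift the last two rows.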
Model Information
| Property | Value |
|:---------|:------|
| Model Name: | bert_token_classifier_ner_bc2gm_gene_pipeline |
| Type: | pipeline |
| Compatibility: | Healthcare NLP 4.4.4+ |
| License: | Licensed |
| Edition: | Official |
| Language: | en |
| Size: | 404.8 MB |
Included Models
- DocumentAssembler
- SentenceDetectorDLModel
- TokenizerModel
- MedicalBertForTokenClassifier
- NerConverterInternalModel