Description
This pretrained pipeline is built on top of the bert_token_classifier_ner_bc2gm_gene model.
Predicted Entities
GENE/PROTEIN
How to use
from sparknlp.pretrained import PretrainedPipeline
pipeline = PretrainedPipeline("bert_token_classifier_ner_bc2gm_gene_pipeline", "en", "clinical/models")
text = '''ROCK-I, Kinectin, and mDia2 can bind the wild type forms of both RhoA and Cdc42 in a GTP-dependent manner in vitro. These results support the hypothesis that in the presence of tryptophan the ribosome translating tnaC blocks Rho ' s access to the boxA and rut sites, thereby preventing transcription termination.'''
result = pipeline.fullAnnotate(text)
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
val pipeline = new PretrainedPipeline("bert_token_classifier_ner_bc2gm_gene_pipeline", "en", "clinical/models")
val text = "ROCK-I, Kinectin, and mDia2 can bind the wild type forms of both RhoA and Cdc42 in a GTP-dependent manner in vitro. These results support the hypothesis that in the presence of tryptophan the ribosome translating tnaC blocks Rho ' s access to the boxA and rut sites, thereby preventing transcription termination."
val result = pipeline.fullAnnotate(text)
Results
|    | ner_chunk   |   begin |   end | ner_label    |   confidence |
|---:|:------------|--------:|------:|:-------------|-------------:|
|  0 | ROCK-I      |       0 |     5 | GENE/PROTEIN |     0.999978 |
|  1 | Kinectin    |       8 |    15 | GENE/PROTEIN |     0.999973 |
|  2 | mDia2       |      22 |    26 | GENE/PROTEIN |     0.999974 |
|  3 | RhoA        |      65 |    68 | GENE/PROTEIN |     0.999976 |
|  4 | Cdc42       |      74 |    78 | GENE/PROTEIN |     0.999979 |
|  5 | tnaC        |     213 |   216 | GENE/PROTEIN |     0.999978 |
|  6 | Rho         |     225 |   227 | GENE/PROTEIN |     0.999976 |
|  7 | boxA        |     247 |   250 | GENE/PROTEIN |     0.999837 |
|  8 | rut sites   |     256 |   264 | GENE/PROTEIN |     0.99115  |
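A table like the one above can be assembled from the object returned by `fullAnnotate`. The following is a minimal sketch, not the official post-processing code. It assumes that the pipeline's chunk output column is named `ner_chunk` (as the table header suggests) and that each annotation exposes `result`, `begin`, `end`, and a `metadata` mapping with `entity` and `confidence` keys. The helper name `chunks_to_rows` is hypothetical, and the stand-in data only mimics the shape of a real result so the snippet runs without a Spark session.

```python
from types import SimpleNamespace  # used only to build stand-in annotations


def chunks_to_rows(result, column="ner_chunk"):
    """Flatten fullAnnotate()-style output into one dict per detected chunk.

    `column` is an assumed output column name; adjust it if the pipeline
    stores its chunks under a different key.
    """
    rows = []
    for doc in result:                    # one dict per input text
        for ann in doc.get(column, []):   # annotation-like objects
            rows.append({
                "ner_chunk": ann.result,
                "begin": ann.begin,
                "end": ann.end,
                "ner_label": ann.metadata.get("entity"),
                "confidence": ann.metadata.get("confidence"),
            })
    return rows


# Stand-in for `result = pipeline.fullAnnotate(text)`, shaped like the
# first row of the table above (metadata values are assumed to be strings).
fake_result = [{
    "ner_chunk": [
        SimpleNamespace(
            result="ROCK-I",
            begin=0,
            end=5,
            metadata={"entity": "GENE/PROTEIN", "confidence": "0.999978"},
        )
    ]
}]

rows = chunks_to_rows(fake_result)
for row in rows:
    print(row)
```

With a real pipeline, `chunks_to_rows(result)` would be called on the value returned by `pipeline.fullAnnotate(text)`; the resulting list of dicts can be passed to `pandas.DataFrame` if a tabular view is wanted.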
Model Information
| Model Name: | bert_token_classifier_ner_bc2gm_gene_pipeline | 
| Type: | pipeline | 
| Compatibility: | Healthcare NLP 4.4.4+ | 
| License: | Licensed | 
| Edition: | Official | 
| Language: | en | 
| Size: | 404.8 MB | 
Included Models
- DocumentAssembler
- SentenceDetectorDLModel
- TokenizerModel
- MedicalBertForTokenClassifier
- NerConverterInternalModel