Description
This pretrained pipeline is built on top of the bert_token_classifier_ner_species model.
Predicted Entities
SPECIES
How to use
# Python
# Assumes a Spark session with Healthcare NLP is already running.
from sparknlp.pretrained import PretrainedPipeline
# Load the pretrained pipeline (downloaded on first use).
pipeline = PretrainedPipeline("bert_token_classifier_ner_species_pipeline", "en", "clinical/models")
text = '''As determined by 16S rRNA gene sequence analysis, strain 6C (T) represents a distinct species belonging to the class Betaproteobacteria and is most closely related to Thiomonas intermedia DSM 18155 (T) and Thiomonas perometabolis DSM 18570 (T) .'''
# Annotate the text and keep the full annotation objects (offsets, metadata).
result = pipeline.fullAnnotate(text)
// Scala
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
// Load the pretrained pipeline (downloaded on first use).
val pipeline = new PretrainedPipeline("bert_token_classifier_ner_species_pipeline", "en", "clinical/models")
val text = "As determined by 16S rRNA gene sequence analysis, strain 6C (T) represents a distinct species belonging to the class Betaproteobacteria and is most closely related to Thiomonas intermedia DSM 18155 (T) and Thiomonas perometabolis DSM 18570 (T) ."
// Annotate the text and keep the full annotation objects (offsets, metadata).
val result = pipeline.fullAnnotate(text)
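In Python, `fullAnnotate` returns one dictionary per input text, keyed by output column name, with lists of annotation objects as values. The sketch below flattens chunk annotations into rows shaped like the Results table. It assumes the chunks sit under a `ner_chunk` key and that each annotation's metadata carries `entity` and `confidence` entries. Neither detail is stated on this page, so inspect `result[0].keys()` on real output first. `FakeAnnotation` and `chunks_to_rows` are hypothetical names introduced only for illustration, so the helper can be tried without a Spark session.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAnnotation:
    """Hypothetical stand-in for a Spark NLP annotation (illustration only)."""
    result: str
    begin: int
    end: int
    metadata: dict = field(default_factory=dict)


def chunks_to_rows(result, column="ner_chunk"):
    """Flatten fullAnnotate-style output into (chunk, begin, end, label, confidence).

    `column` is an assumed output key, not confirmed by this model card.
    """
    rows = []
    for doc in result:                      # one dict per input text
        for ann in doc.get(column, []):     # missing key yields no rows
            rows.append((
                ann.result,
                ann.begin,
                ann.end,
                ann.metadata.get("entity"),      # assumed metadata key
                ann.metadata.get("confidence"),  # assumed metadata key
            ))
    return rows


# Stand-in data shaped like the first row of the Results table.
fake_result = [{
    "ner_chunk": [
        FakeAnnotation("6C (T)", 57, 62,
                       {"entity": "SPECIES", "confidence": "0.998955"}),
    ]
}]
rows = chunks_to_rows(fake_result)
print(rows)
```

Under the same assumptions, `chunks_to_rows(pipeline.fullAnnotate(text))` would be the call against the real pipeline.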
Results
|    | ner_chunk               |   begin |   end | ner_label   |   confidence |
|---:|:------------------------|--------:|------:|:------------|-------------:|
|  0 | 6C (T)                  |      57 |    62 | SPECIES     |     0.998955 |
|  1 | Betaproteobacteria      |     117 |   134 | SPECIES     |     0.99973  |
|  2 | Thiomonas intermedia    |     167 |   186 | SPECIES     |     0.999822 |
|  3 | DSM 18155 (T)           |     188 |   200 | SPECIES     |     0.997657 |
|  4 | Thiomonas perometabolis |     206 |   228 | SPECIES     |     0.999614 |
|  5 | DSM 18570 (T)           |     230 |   242 | SPECIES     |     0.997146 |
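The `begin` and `end` columns read as zero-based character offsets into the input text, with `end` inclusive. The page does not say so explicitly, so the plain-Python check below confirms it against the rows above by slicing the example text. The `span` helper is a hypothetical name used only here.

```python
text = (
    "As determined by 16S rRNA gene sequence analysis, strain 6C (T) "
    "represents a distinct species belonging to the class Betaproteobacteria "
    "and is most closely related to Thiomonas intermedia DSM 18155 (T) "
    "and Thiomonas perometabolis DSM 18570 (T) ."
)

# (ner_chunk, begin, end) copied from the Results table.
table_rows = [
    ("6C (T)", 57, 62),
    ("Betaproteobacteria", 117, 134),
    ("Thiomonas intermedia", 167, 186),
    ("DSM 18155 (T)", 188, 200),
    ("Thiomonas perometabolis", 206, 228),
    ("DSM 18570 (T)", 230, 242),
]


def span(source, begin, end):
    """Return the substring for an inclusive [begin, end] offset pair."""
    return source[begin:end + 1]


# Every reported chunk should match the slice of the input text.
offsets_ok = all(span(text, b, e) == chunk for chunk, b, e in table_rows)
print(offsets_ok)
```

Because `end` is inclusive, a Python slice needs `end + 1` as its upper bound.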
Model Information
| Model Name: | bert_token_classifier_ner_species_pipeline | 
| Type: | pipeline | 
| Compatibility: | Healthcare NLP 4.4.4+ | 
| License: | Licensed | 
| Edition: | Official | 
| Language: | en | 
| Size: | 404.8 MB | 
Included Models
- DocumentAssembler
- SentenceDetectorDLModel
- TokenizerModel
- MedicalBertForTokenClassifier
- NerConverterInternalModel