Description
This pretrained pipeline is built on top of the bert_token_classifier_ner_species model.
Predicted Entities
SPECIES
How to use
```python
# Requires a running Spark session with the licensed Healthcare NLP library
from sparknlp.pretrained import PretrainedPipeline

# Download and load the pretrained pipeline from the clinical models repository
pipeline = PretrainedPipeline("bert_token_classifier_ner_species_pipeline", "en", "clinical/models")
text = '''As determined by 16S rRNA gene sequence analysis, strain 6C (T) represents a distinct species belonging to the class Betaproteobacteria and is most closely related to Thiomonas intermedia DSM 18155 (T) and Thiomonas perometabolis DSM 18570 (T) .'''
result = pipeline.fullAnnotate(text)
```
```scala
// Requires a running Spark session with the licensed Healthcare NLP library
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Download and load the pretrained pipeline from the clinical models repository
val pipeline = new PretrainedPipeline("bert_token_classifier_ner_species_pipeline", "en", "clinical/models")
val text = "As determined by 16S rRNA gene sequence analysis, strain 6C (T) represents a distinct species belonging to the class Betaproteobacteria and is most closely related to Thiomonas intermedia DSM 18155 (T) and Thiomonas perometabolis DSM 18570 (T) ."
val result = pipeline.fullAnnotate(text)
```
Results
| | ner_chunk | begin | end | ner_label | confidence |
|---:|:------------------------|--------:|------:|:------------|-------------:|
| 0 | 6C (T) | 57 | 62 | SPECIES | 0.998955 |
| 1 | Betaproteobacteria | 117 | 134 | SPECIES | 0.99973 |
| 2 | Thiomonas intermedia | 167 | 186 | SPECIES | 0.999822 |
| 3 | DSM 18155 (T) | 188 | 200 | SPECIES | 0.997657 |
| 4 | Thiomonas perometabolis | 206 | 228 | SPECIES | 0.999614 |
| 5 | DSM 18570 (T) | 230 | 242 | SPECIES | 0.997146 |
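The rows above can be read as one record per detected chunk. The following is a minimal sketch of how such rows might be flattened out of the `fullAnnotate` output. It assumes the output is a list of dicts keyed by output column, that the chunk column is named `ner_chunk`, and that each annotation exposes `result`, `begin`, `end`, and a `metadata` dict with `entity` and `confidence` keys. None of these names are confirmed on this page. The helper `chunks_to_rows` is hypothetical, and stand-in objects built from the table values replace the real pipeline output so the snippet runs without Spark.

```python
from types import SimpleNamespace

TEXT = ("As determined by 16S rRNA gene sequence analysis, strain 6C (T) "
        "represents a distinct species belonging to the class "
        "Betaproteobacteria and is most closely related to Thiomonas "
        "intermedia DSM 18155 (T) and Thiomonas perometabolis DSM 18570 (T) .")


def chunks_to_rows(annotated, chunk_col="ner_chunk"):
    """Flatten annotation-like objects into plain dict rows.

    Hypothetical helper: `chunk_col` and the metadata keys are assumptions
    about the pipeline's output schema, not documented facts.
    """
    rows = []
    for doc in annotated:
        for chunk in doc.get(chunk_col, []):
            rows.append({
                "ner_chunk": chunk.result,
                "begin": chunk.begin,
                "end": chunk.end,
                "ner_label": chunk.metadata.get("entity"),
                "confidence": float(chunk.metadata.get("confidence", "nan")),
            })
    return rows


# Stand-in for `pipeline.fullAnnotate(text)`: values copied from the Results table.
table = [
    ("6C (T)", 57, 62, "0.998955"),
    ("Betaproteobacteria", 117, 134, "0.99973"),
    ("Thiomonas intermedia", 167, 186, "0.999822"),
    ("DSM 18155 (T)", 188, 200, "0.997657"),
    ("Thiomonas perometabolis", 206, 228, "0.999614"),
    ("DSM 18570 (T)", 230, 242, "0.997146"),
]
fake_result = [{
    "ner_chunk": [
        SimpleNamespace(result=chunk, begin=begin, end=end,
                        metadata={"entity": "SPECIES", "confidence": conf})
        for chunk, begin, end, conf in table
    ]
}]

rows = chunks_to_rows(fake_result)

# The begin/end values in the table are inclusive character offsets into TEXT.
for row in rows:
    assert TEXT[row["begin"]:row["end"] + 1] == row["ner_chunk"]
    print(row)
```

With a real pipeline, `fake_result` would be replaced by `result` from the usage example. The slice check relies on `end` being inclusive, which holds for every row in the table.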
Model Information
| Field | Value |
|:--------------|:-------------------------------------------|
| Model Name | bert_token_classifier_ner_species_pipeline |
| Type | pipeline |
| Compatibility | Healthcare NLP 4.4.4+ |
| License | Licensed |
| Edition | Official |
| Language | en |
| Size | 404.8 MB |
Included Models
- DocumentAssembler
- SentenceDetectorDLModel
- TokenizerModel
- MedicalBertForTokenClassifier
- NerConverterInternalModel