Description
This pretrained pipeline is built on top of the bert_token_classifier_ner_bc4chemd_chemicals model.
Predicted Entities
CHEM
How to use
from sparknlp.pretrained import PretrainedPipeline
pipeline = PretrainedPipeline("bert_token_classifier_ner_bc4chemd_chemicals_pipeline", "en", "clinical/models")
text = '''The main isolated compounds were triterpenes (alpha - amyrin, beta - amyrin, lupeol, betulin, betulinic acid, uvaol, erythrodiol and oleanolic acid) and phenolic acid derivatives from 4 - hydroxybenzoic acid (gallic and protocatechuic acids and isocorilagin).'''
result = pipeline.fullAnnotate(text)
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
val pipeline = new PretrainedPipeline("bert_token_classifier_ner_bc4chemd_chemicals_pipeline", "en", "clinical/models")
val text = "The main isolated compounds were triterpenes (alpha - amyrin, beta - amyrin, lupeol, betulin, betulinic acid, uvaol, erythrodiol and oleanolic acid) and phenolic acid derivatives from 4 - hydroxybenzoic acid (gallic and protocatechuic acids and isocorilagin)."
val result = pipeline.fullAnnotate(text)
Results
| | ner_chunk | begin | end | ner_label | confidence |
|---:|:--------------------------------|--------:|------:|:------------|-------------:|
| 0 | triterpenes | 33 | 43 | CHEM | 0.99999 |
| 1 | alpha - amyrin | 46 | 59 | CHEM | 0.999939 |
| 2 | beta - amyrin | 62 | 74 | CHEM | 0.999679 |
| 3 | lupeol | 77 | 82 | CHEM | 0.999968 |
| 4 | betulin | 85 | 91 | CHEM | 0.999975 |
| 5 | betulinic acid | 94 | 107 | CHEM | 0.999984 |
| 6 | uvaol | 110 | 114 | CHEM | 0.99998 |
| 7 | erythrodiol | 117 | 127 | CHEM | 0.999987 |
| 8 | oleanolic acid | 133 | 146 | CHEM | 0.999984 |
| 9 | phenolic acid | 153 | 165 | CHEM | 0.999985 |
| 10 | 4 - hydroxybenzoic acid | 184 | 206 | CHEM | 0.999973 |
| 11 | gallic and protocatechuic acids | 209 | 239 | CHEM | 0.999984 |
| 12 | isocorilagin | 245 | 256 | CHEM | 0.999985 |
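A table like the one above can be assembled from the `fullAnnotate` output with a small helper. The following is a minimal sketch, not the pipeline's official post-processing: it assumes the chunks are stored under an output key named `ner_chunk` and that each chunk's metadata carries `entity` and `confidence` keys. The helper name `chunks_to_rows` is hypothetical. Because the pipeline itself needs a licensed Healthcare NLP session, the example uses stand-in objects that mimic the annotation attributes `result`, `begin`, `end`, and `metadata`.

```python
from types import SimpleNamespace


def chunks_to_rows(annotations, chunk_key="ner_chunk"):
    """Flatten one fullAnnotate() result dict into a list of row dicts.

    chunk_key and the metadata keys below are assumptions; adjust them
    to match the output columns of the pipeline you actually load.
    """
    rows = []
    for chunk in annotations.get(chunk_key, []):
        meta = chunk.metadata or {}
        rows.append({
            "ner_chunk": chunk.result,
            "begin": chunk.begin,
            "end": chunk.end,
            "ner_label": meta.get("entity"),
            "confidence": meta.get("confidence"),
        })
    return rows


# Stand-in for `pipeline.fullAnnotate(text)`, built from the first two
# rows of the results table so the sketch runs without a Spark session.
text = "The main isolated compounds were triterpenes (alpha - amyrin, beta - amyrin)"
fake_result = [{
    "ner_chunk": [
        SimpleNamespace(result="triterpenes", begin=33, end=43,
                        metadata={"entity": "CHEM", "confidence": "0.99999"}),
        SimpleNamespace(result="alpha - amyrin", begin=46, end=59,
                        metadata={"entity": "CHEM", "confidence": "0.999939"}),
    ]
}]

rows = chunks_to_rows(fake_result[0])

# The `end` offset is inclusive, so slicing needs end + 1.
for row in rows:
    assert text[row["begin"]:row["end"] + 1] == row["ner_chunk"]
```

With a real pipeline, `chunks_to_rows(result[0])` would be called on the first element of the list returned by `fullAnnotate`, and the rows can be passed to `pandas.DataFrame` if a tabular view is wanted.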
Model Information
| Model Name: | bert_token_classifier_ner_bc4chemd_chemicals_pipeline |
| Type: | pipeline |
| Compatibility: | Healthcare NLP 4.4.4+ |
| License: | Licensed |
| Edition: | Official |
| Language: | en |
| Size: | 404.7 MB |
Included Models
- DocumentAssembler
- SentenceDetectorDLModel
- TokenizerModel
- MedicalBertForTokenClassifier
- NerConverterInternalModel