Pipeline to Detect Organism in Medical Texts

Description

This pretrained pipeline is built on top of the bert_token_classifier_ner_linnaeus_species model. It tags organism mentions in medical text as SPECIES entities.

Predicted Entities

SPECIES

How to use

Python:

from sparknlp.pretrained import PretrainedPipeline

# Load the pretrained pipeline from the licensed clinical models repository
pipeline = PretrainedPipeline("bert_token_classifier_ner_linnaeus_species_pipeline", "en", "clinical/models")

text = '''First identified in chicken, vigilin homologues have now been found in human (6), Xenopus laevis (7), Drosophila melanogaster (8) and Schizosaccharomyces pombe.'''

# Annotate the text; the result holds the annotations produced by each pipeline stage
result = pipeline.fullAnnotate(text)

Scala:

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

// Load the pretrained pipeline from the licensed clinical models repository
val pipeline = new PretrainedPipeline("bert_token_classifier_ner_linnaeus_species_pipeline", "en", "clinical/models")

val text = "First identified in chicken, vigilin homologues have now been found in human (6), Xenopus laevis (7), Drosophila melanogaster (8) and Schizosaccharomyces pombe."

// Annotate the text; the result holds the annotations produced by each pipeline stage
val result = pipeline.fullAnnotate(text)

Results

|    | ner_chunk                 |   begin |   end | ner_label   |   confidence |
|---:|:--------------------------|--------:|------:|:------------|-------------:|
|  0 | chicken                   |      20 |    26 | SPECIES     |     0.998697 |
|  1 | human                     |      71 |    75 | SPECIES     |     0.999767 |
|  2 | Xenopus laevis            |      82 |    95 | SPECIES     |     0.999918 |
|  3 | Drosophila melanogaster   |     102 |   124 | SPECIES     |     0.999925 |
|  4 | Schizosaccharomyces pombe |     134 |   158 | SPECIES     |     0.999881 |
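
A table like the one above can be derived from the fullAnnotate output. The following is a minimal sketch, not the method used to produce this page. It assumes the chunk annotations are stored under an output column named "ner_chunk" and that each chunk's metadata carries "entity" and "confidence" keys as strings; check these names against your own result. The Annotation namedtuple is a hypothetical stand-in so the sketch runs without Spark, and chunks_to_rows is an illustrative helper, not part of the library. The sketch also checks that begin and end are inclusive character offsets into the input text.

```python
from collections import namedtuple

# Hypothetical stand-in for a Spark NLP annotation, so this runs without Spark.
Annotation = namedtuple("Annotation", ["begin", "end", "result", "metadata"])


def chunks_to_rows(annotated, text, chunk_col="ner_chunk"):
    """Flatten one fullAnnotate-style result dict into table rows.

    chunk_col and the metadata keys are assumptions; adjust them to
    match the columns and keys in your own pipeline output.
    """
    rows = []
    for chunk in annotated[chunk_col]:
        rows.append({
            "ner_chunk": chunk.result,
            "begin": chunk.begin,
            "end": chunk.end,
            "ner_label": chunk.metadata["entity"],
            "confidence": float(chunk.metadata["confidence"]),
            # end is inclusive, so the Python slice needs end + 1
            "offsets_match": text[chunk.begin:chunk.end + 1] == chunk.result,
        })
    return rows


text = ("First identified in chicken, vigilin homologues have now been found "
        "in human (6), Xenopus laevis (7), Drosophila melanogaster (8) and "
        "Schizosaccharomyces pombe.")

# Two rows copied from the Results table, wrapped in the stand-in type.
demo = {
    "ner_chunk": [
        Annotation(20, 26, "chicken",
                   {"entity": "SPECIES", "confidence": "0.998697"}),
        Annotation(82, 95, "Xenopus laevis",
                   {"entity": "SPECIES", "confidence": "0.999918"}),
    ]
}

rows = chunks_to_rows(demo, text)
for row in rows:
    print(row)
```

With a real pipeline, the first argument would be `pipeline.fullAnnotate(text)[0]`, since fullAnnotate on a single string returns a list with one result dict.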

Model Information

Model Name: bert_token_classifier_ner_linnaeus_species_pipeline
Type: pipeline
Compatibility: Healthcare NLP 4.4.4+
License: Licensed
Edition: Official
Language: en
Size: 404.8 MB

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • MedicalBertForTokenClassifier
  • NerConverterInternalModel