Pipeline to Detect Bacterial Species (BertForTokenClassification)

Description

This pretrained pipeline is built on top of the bert_token_classifier_ner_bacteria model and detects bacterial species mentions in text.

Predicted Entities

SPECIES

How to use

Python

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("bert_token_classifier_ner_bacteria_pipeline", "en", "clinical/models")

text = '''Based on these genetic and phenotypic properties, we propose that strain SMSP (T) represents a novel species of the genus Methanoregula, for which we propose the name Methanoregula formicica sp. nov., with the type strain SMSP (T) (= NBRC 105244 (T) = DSM 22288 (T)).'''

result = pipeline.fullAnnotate(text)

Scala

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("bert_token_classifier_ner_bacteria_pipeline", "en", "clinical/models")

val text = "Based on these genetic and phenotypic properties, we propose that strain SMSP (T) represents a novel species of the genus Methanoregula, for which we propose the name Methanoregula formicica sp. nov., with the type strain SMSP (T) (= NBRC 105244 (T) = DSM 22288 (T))."

val result = pipeline.fullAnnotate(text)

NLU

import nlu

nlu.load("en.classify.token_bert.bacteria_ner.pipeline").predict("""Based on these genetic and phenotypic properties, we propose that strain SMSP (T) represents a novel species of the genus Methanoregula, for which we propose the name Methanoregula formicica sp. nov., with the type strain SMSP (T) (= NBRC 105244 (T) = DSM 22288 (T)).""")
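
The object returned by fullAnnotate in the Python example can be flattened into rows like those in the Results table. The sketch below is not part of the original model card and rests on several assumptions: that the result is a list with one dict per input text, that the chunks sit under an output column named "ner_chunk", and that each annotation exposes begin, end, result, and a metadata dict with "entity" and "confidence" keys. The Annotation namedtuple and the chunks_to_rows helper are hypothetical stand-ins so that the example runs without a Spark session.

```python
from collections import namedtuple

# Hypothetical stand-in for a Spark NLP annotation object, used only so this
# sketch runs without Spark; the real object is assumed to expose these fields.
Annotation = namedtuple("Annotation", ["begin", "end", "result", "metadata"])


def chunks_to_rows(result, column="ner_chunk"):
    """Flatten a fullAnnotate-style result into a list of plain dicts.

    `column` is an assumption: the name of the chunk output column.
    """
    rows = []
    for doc in result:                      # one dict per input text
        for ann in doc.get(column, []):     # one annotation per detected chunk
            rows.append({
                "ner_chunk": ann.result,
                "begin": ann.begin,
                "end": ann.end,
                # Metadata keys are assumptions; missing keys yield None.
                "ner_label": ann.metadata.get("entity"),
                "confidence": ann.metadata.get("confidence"),
            })
    return rows


# Mock input shaped like the assumed output, using one row from the Results table.
mock_result = [{
    "ner_chunk": [
        Annotation(167, 189, "Methanoregula formicica",
                   {"entity": "SPECIES", "confidence": "0.999787"}),
    ]
}]

rows = chunks_to_rows(mock_result)
```

With a real pipeline, passing the value returned by pipeline.fullAnnotate(text) in place of mock_result should work if the assumptions above hold; inspecting result[0].keys() first is a quick way to confirm the column name.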

Results

|    | ner_chunk               |   begin |   end | ner_label   |   confidence |
|---:|:------------------------|--------:|------:|:------------|-------------:|
|  0 | SMSP (T)                |      73 |    80 | SPECIES     |     0.99985  |
|  1 | Methanoregula formicica |     167 |   189 | SPECIES     |     0.999787 |
|  2 | SMSP (T)                |     222 |   229 | SPECIES     |     0.999871 |
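
The begin and end columns are zero-based character offsets into the input text, and the end offset is inclusive. A minimal Python check of that reading against the three rows above follows; the chunk_at helper is a hypothetical name introduced for illustration.

```python
# Input text from the usage example, kept on one line so offsets are unchanged.
text = "Based on these genetic and phenotypic properties, we propose that strain SMSP (T) represents a novel species of the genus Methanoregula, for which we propose the name Methanoregula formicica sp. nov., with the type strain SMSP (T) (= NBRC 105244 (T) = DSM 22288 (T))."

# (ner_chunk, begin, end) copied from the Results table.
table_rows = [
    ("SMSP (T)", 73, 80),
    ("Methanoregula formicica", 167, 189),
    ("SMSP (T)", 222, 229),
]


def chunk_at(source, begin, end):
    """Return the substring for an inclusive [begin, end] offset pair."""
    return source[begin:end + 1]  # +1 because Python slices exclude the stop index


recovered = [chunk_at(text, begin, end) for _, begin, end in table_rows]
expected = [chunk for chunk, _, _ in table_rows]
print(recovered == expected)  # True
```

Because the end offset is inclusive, a plain text[begin:end] slice would drop the last character of each chunk.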

Model Information

Model Name: bert_token_classifier_ner_bacteria_pipeline
Type: pipeline
Compatibility: Healthcare NLP 4.4.4+
License: Licensed
Edition: Official
Language: en
Size: 404.9 MB

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • MedicalBertForTokenClassifier
  • NerConverterInternalModel