Pipeline to Detect Cancer Genetics (BertForTokenClassification)

Description

This pretrained pipeline is built on top of the bert_token_classifier_ner_bionlp model.

Predicted Entities

Amino_acid, Anatomical_system, Cancer, Cell, Cellular_component, Developing_anatomical_structure, Gene_or_gene_product, Immaterial_anatomical_entity, Multi-tissue_structure, Organ, Organism, Organism_subdivision, Organism_substance, Pathological_formation, Simple_chemical, Tissue

How to use

Python:

from sparknlp.pretrained import PretrainedPipeline

pipeline = PretrainedPipeline("bert_token_classifier_ner_bionlp_pipeline", "en", "clinical/models")

text = '''Both the erbA IRES and the erbA/myb virus constructs transformed erythroid cells after infection of bone marrow or blastoderm cultures. The erbA/myb IRES virus exhibited a 5-10-fold higher transformed colony forming efficiency than the erbA IRES virus in the blastoderm assay.'''

result = pipeline.fullAnnotate(text)

Scala:

import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline

val pipeline = new PretrainedPipeline("bert_token_classifier_ner_bionlp_pipeline", "en", "clinical/models")

val text = "Both the erbA IRES and the erbA/myb virus constructs transformed erythroid cells after infection of bone marrow or blastoderm cultures. The erbA/myb IRES virus exhibited a 5-10-fold higher transformed colony forming efficiency than the erbA IRES virus in the blastoderm assay."

val result = pipeline.fullAnnotate(text)

NLU:

import nlu

nlu.load("en.classify.token_bert.biolp.pipeline").predict("""Both the erbA IRES and the erbA/myb virus constructs transformed erythroid cells after infection of bone marrow or blastoderm cultures. The erbA/myb IRES virus exhibited a 5-10-fold higher transformed colony forming efficiency than the erbA IRES virus in the blastoderm assay.""")

Results

|    | ner_chunk           |   begin |   end | ner_label              |   confidence |
|---:|:--------------------|--------:|------:|:-----------------------|-------------:|
|  0 | erbA IRES           |       9 |    17 | Organism               |     0.999188 |
|  1 | erbA/myb virus      |      27 |    40 | Organism               |     0.999434 |
|  2 | erythroid cells     |      65 |    79 | Cell                   |     0.999837 |
|  3 | bone                |     100 |   103 | Multi-tissue_structure |     0.999846 |
|  4 | marrow              |     105 |   110 | Multi-tissue_structure |     0.999876 |
|  5 | blastoderm cultures |     115 |   133 | Cell                   |     0.999823 |
|  6 | erbA/myb IRES virus |     140 |   158 | Organism               |     0.999751 |
|  7 | erbA IRES virus     |     236 |   250 | Organism               |     0.999749 |
|  8 | blastoderm          |     259 |   268 | Cell                   |     0.999897 |
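
A table like the one above can be assembled by flattening the chunk annotations returned by fullAnnotate. The following is a minimal sketch, not the pipeline's own code. It assumes the chunk output column is named `ner_chunk` (as the table header suggests) and that each chunk carries `entity` and `confidence` entries in its metadata. The `Chunk` class and `chunks_to_rows` helper are hypothetical stand-ins so the example runs without a licensed Spark session, and the sample data is copied by hand from the first three table rows.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chunk:
    """Hypothetical stand-in for an annotation object; only the fields used below."""
    begin: int
    end: int
    result: str
    metadata: Dict[str, str] = field(default_factory=dict)


def chunks_to_rows(annotated: List[dict], chunk_col: str = "ner_chunk") -> List[dict]:
    """Flatten fullAnnotate-style output into one row per detected chunk."""
    rows = []
    for doc in annotated:
        for chunk in doc.get(chunk_col, []):
            rows.append({
                "ner_chunk": chunk.result,
                "begin": chunk.begin,
                "end": chunk.end,
                # Metadata key names are assumptions about the converter's output.
                "ner_label": chunk.metadata.get("entity"),
                "confidence": chunk.metadata.get("confidence"),
            })
    return rows


text = "Both the erbA IRES and the erbA/myb virus constructs transformed erythroid cells"

# Hand-built sample mirroring the first three rows of the Results table.
sample = [{"ner_chunk": [
    Chunk(9, 17, "erbA IRES", {"entity": "Organism", "confidence": "0.999188"}),
    Chunk(27, 40, "erbA/myb virus", {"entity": "Organism", "confidence": "0.999434"}),
    Chunk(65, 79, "erythroid cells", {"entity": "Cell", "confidence": "0.999837"}),
]}]

rows = chunks_to_rows(sample)
for row in rows:
    # In the table, begin and end behave as inclusive character offsets.
    assert text[row["begin"]:row["end"] + 1] == row["ner_chunk"]
    print(row)
```

With a licensed session, the analogous call would be `chunks_to_rows(pipeline.fullAnnotate(text))`, provided the column and metadata names match the assumptions above.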

Model Information

Model Name: bert_token_classifier_ner_bionlp_pipeline
Type: pipeline
Compatibility: Healthcare NLP 4.4.4+
License: Licensed
Edition: Official
Language: en
Size: 404.8 MB

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • MedicalBertForTokenClassifier
  • NerConverter