Examples

 

Usage examples of nlp.load()

The following examples demonstrate how to use NLU's load API, accompanied by the outputs it generates. The API enables loading any model or pipeline in one line. You need to pass one NLU reference to the load method.
You can also pass multiple whitespace-separated references.
You can find all NLU references here
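
As a small illustration of the reference format, the string passed to load is a whitespace-separated list of references. The helper below is hypothetical (it is not part of NLU) and only sketches how such a string breaks down into individual references.

```python
from typing import List


def split_references(spec: str) -> List[str]:
    """Split a whitespace-separated load string into individual references."""
    # str.split() without arguments collapses repeated whitespace.
    return spec.split()


refs = split_references("med_ner.jsl.wip.clinical en.resolve_chunk.cpt_clinical")
print(refs)
```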

Medical Named Entity Recognition (NER)

Medical NER tutorial notebook

NLU provides separate, highly tuned medical NER models for various healthcare domains.
These medical NER models are trained to extract various medical named entities.

data ="""The patient is a 5-month-old infant who presented initially on Monday with a cold, cough, and runny nose for 2 days."""
df = nlp.load('med_ner.jsl.wip.clinical en.resolve_chunk.cpt_clinical').predict(data)
entities@clinical_results meta_entities@clinical_entity meta_entities@clinical_confidence chunk_resolution_results meta_chunk_resolution_all_k_aux_labels meta_chunk_resolution_target_text meta_chunk_resolution_distance meta_chunk_resolution_confidence meta_chunk_resolution_all_k_results meta_chunk_resolution_all_k_distances meta_chunk_resolution_all_k_cosine_distances
5-month-old Age 0.9982 49496   5-month-old 15.0536 1 49496 15.0536 0.5153
infant Age 0.9999 49492   infant 6.7093 1 49492 6.7093 0.3702
Monday RelativeDate 0.9983 59857   Monday 12.6501 1 59857 12.6501 0.5324
cold Symptom 0.7517 50547   cold 2.6313 1 50547 2.6313 0.4492
cough Symptom 0.9969 32215   cough 3.5559 1 32215 3.5559 0.4847
runny nose Symptom 0.7796 60281   runny nose 3.3286 1 60281 3.3286 0.3959
for 2 days Duration 0.5479 35390   for 2 days 2.3929 1 35390 2.3929 0.22

See the Models Hub for all available Entity Resolution Models
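
A common follow-up step is to keep only high-confidence entities. The sketch below assumes the prediction rows have been copied into plain dictionaries; the short keys entity, label, and confidence are hypothetical stand-ins for the longer column names in the output above, and the 0.9 threshold is an arbitrary choice.

```python
# Rows copied from the output table above (only three columns kept).
rows = [
    {"entity": "5-month-old", "label": "Age", "confidence": 0.9982},
    {"entity": "infant", "label": "Age", "confidence": 0.9999},
    {"entity": "Monday", "label": "RelativeDate", "confidence": 0.9983},
    {"entity": "cold", "label": "Symptom", "confidence": 0.7517},
    {"entity": "cough", "label": "Symptom", "confidence": 0.9969},
    {"entity": "runny nose", "label": "Symptom", "confidence": 0.7796},
    {"entity": "for 2 days", "label": "Duration", "confidence": 0.5479},
]


def filter_by_confidence(rows, threshold):
    """Keep only the rows whose confidence is at or above the threshold."""
    return [row for row in rows if row["confidence"] >= threshold]


confident = filter_by_confidence(rows, 0.9)
print([row["entity"] for row in confident])
```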

Zero-Shot NER

Zero-Shot NER Tutorial Notebook
Based on John Snow Labs Enterprise-NLP ZeroShotNerModel
Zero-shot models excel at generalization, meaning that the model can accurately predict entities in very different data sets without the need to fine-tune the model or train from scratch for each different domain. Even though a model trained to solve a specific problem can achieve better accuracy than a zero-shot model on that specific task, it probably won't be useful on a different task. That is where zero-shot models show their usefulness: they can achieve good results in various domains.

Usage:

We just need to load the zero-shot NER model and configure a set of entity definitions.

# load zero-shot ner model
enterprise_zero_shot_ner = nlp.load('en.zero_shot.ner_roberta')

# Configure entity definitions
enterprise_zero_shot_ner['zero_shot_ner'].setEntityDefinitions(
  {
    "PROBLEM": [
      "What is the disease?",
      "What is his symptom?",
      "What is her disease?",
      "What is his disease?",
      "What is the problem?",
      "What does a patient suffer",
      "What was the reason that the patient is admitted to the clinic?",
    ],
    "DRUG": [
      "Which drug?",
      "Which is the drug?",
      "What is the drug?",
      "Which drug does he use?",
      "Which drug does she use?",
      "Which drug do I use?",
      "Which drug is prescribed for a symptom?",
    ],
    "ADMISSION_DATE": ["When did patient admitted to a clinic?"],
    "PATIENT_AGE": [
      "How old is the patient?",
      "What is the age of the patient?",
    ],
  }
)
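
Because the entity definitions are a plain dictionary mapping each label to a list of questions, it can help to sanity-check the structure before handing it to the model. The validator below is a hypothetical helper, not part of the library; it only checks the shape described above.

```python
from typing import Dict, List


def validate_entity_definitions(definitions: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems found; an empty list means the shape looks fine."""
    problems = []
    for label, questions in definitions.items():
        if not isinstance(label, str) or not label:
            problems.append(f"label {label!r} must be a non-empty string")
        if not isinstance(questions, list) or not questions:
            problems.append(f"{label!r} needs a non-empty list of questions")
            continue
        for question in questions:
            if not isinstance(question, str) or not question.strip():
                problems.append(f"{label!r} has an empty or non-string question")
    return problems


# PATIENT_AGE has no questions, so one problem is reported.
example = {"DRUG": ["Which drug?", "What is the drug?"], "PATIENT_AGE": []}
print(validate_entity_definitions(example))
```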

We can then use this pipeline to predict labels.

# Predict entities
df = enterprise_zero_shot_ner.predict(
  [
    "The doctor pescribed Majezik for my severe headache.",
    "The patient was admitted to the hospital for his colon cancer.",
    # Note the trailing space, so the two parts join as "Dr. X"
    "27 years old patient was admitted to clinic on Sep 1st by Dr. " +
    "X for a right-sided pleural effusion for thoracentesis.",
  ]
)
df
document entities_zero_shot entities_zero_shot_class entities_zero_shot_confidence entities_zero_shot_origin_chunk entities_zero_shot_origin_sentence
The doctor pescribed Majezik for my severe headache. Majezik DRUG 0.646716 0 0
The doctor pescribed Majezik for my severe headache. severe headache PROBLEM 0.552635 1 0
The patient was admitted to the hospital for his colon cancer. colon cancer PROBLEM 0.88985 0 0
27 years old patient was admitted to clinic on Sep 1st by Dr. X for a right-sided pleural effusion for thoracentesis. 27 years old PATIENT_AGE 0.694308 0 0
27 years old patient was admitted to clinic on Sep 1st by Dr. X for a right-sided pleural effusion for thoracentesis. Sep 1st ADMISSION_DATE 0.956461 1 0
27 years old patient was admitted to clinic on Sep 1st by Dr. X for a right-sided pleural effusion for thoracentesis. a right-sided pleural effusion for thoracentesis PROBLEM 0.500266 2 0
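
Since the output has one row per extracted entity, a typical next step is to collect the entities of each document. The sketch below works on tuples copied from the first three rows above; the helper name group_entities is an illustrative assumption.

```python
from collections import defaultdict

# (document, entity chunk, predicted class) copied from the first rows above.
predictions = [
    ("The doctor pescribed Majezik for my severe headache.", "Majezik", "DRUG"),
    ("The doctor pescribed Majezik for my severe headache.", "severe headache", "PROBLEM"),
    ("The patient was admitted to the hospital for his colon cancer.", "colon cancer", "PROBLEM"),
]


def group_entities(preds):
    """Map each document to the list of (chunk, class) pairs found in it."""
    grouped = defaultdict(list)
    for document, chunk, label in preds:
        grouped[document].append((chunk, label))
    return dict(grouped)


grouped = group_entities(predictions)
for document, entities in grouped.items():
    print(document, "->", entities)
```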

Entity Resolution (for sentences)

Entity Resolution tutorial notebook

Classify each sentence extracted by a sentence detector into one of C resolvable classes. These classes are usually international disease, medicine, or procedure codes based on ICD standards.

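data = ["""The patient is a 5-month-old infant who presented initially on Monday with a cold, cough, and runny nose for 2 days. Mom states she had no fever. Her appetite was good but she was spitting up a lot."""]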
nlp.load('med_ner.jsl.wip.clinical resolve.icd10pcs').predict(data)
sentence_results sentence_resolution_results entities@clinical_results meta_entities@clinical_entity meta_entities@clinical_confidence
The patient is a 5-month-old infant who presented initially on Monday with a cold, cough, and runny nose for 2 days. DU12BBZ [‘5-month-old’, ‘infant’, ‘Monday’, ‘cold’, ‘cough’, ‘runny nose’, ‘for 2 days’, ‘Mom’, ‘she’, ‘fever’, ‘Her’, ‘she’, ‘spitting up a lot’] [‘Age’, ‘Age’, ‘RelativeDate’, ‘Symptom’, ‘Symptom’, ‘Symptom’, ‘Duration’, ‘Gender’, ‘Gender’, ‘VS_Finding’, ‘Gender’, ‘Gender’, ‘Symptom’] [‘0.9982’, ‘0.9999’, ‘0.9983’, ‘0.7517’, ‘0.9969’, ‘0.7796’, ‘0.5479’, ‘0.9427’, ‘0.9994’, ‘0.9975’, ‘0.9996’, ‘0.9985’, ‘0.30217502’]
Mom states she had no fever. F00ZNQZ [‘5-month-old’, ‘infant’, ‘Monday’, ‘cold’, ‘cough’, ‘runny nose’, ‘for 2 days’, ‘Mom’, ‘she’, ‘fever’, ‘Her’, ‘she’, ‘spitting up a lot’] [‘Age’, ‘Age’, ‘RelativeDate’, ‘Symptom’, ‘Symptom’, ‘Symptom’, ‘Duration’, ‘Gender’, ‘Gender’, ‘VS_Finding’, ‘Gender’, ‘Gender’, ‘Symptom’] [‘0.9982’, ‘0.9999’, ‘0.9983’, ‘0.7517’, ‘0.9969’, ‘0.7796’, ‘0.5479’, ‘0.9427’, ‘0.9994’, ‘0.9975’, ‘0.9996’, ‘0.9985’, ‘0.30217502’]
Her appetite was good but she was spitting up a lot. F08Z3YZ [‘5-month-old’, ‘infant’, ‘Monday’, ‘cold’, ‘cough’, ‘runny nose’, ‘for 2 days’, ‘Mom’, ‘she’, ‘fever’, ‘Her’, ‘she’, ‘spitting up a lot’] [‘Age’, ‘Age’, ‘RelativeDate’, ‘Symptom’, ‘Symptom’, ‘Symptom’, ‘Duration’, ‘Gender’, ‘Gender’, ‘VS_Finding’, ‘Gender’, ‘Gender’, ‘Symptom’] [‘0.9982’, ‘0.9999’, ‘0.9983’, ‘0.7517’, ‘0.9969’, ‘0.7796’, ‘0.5479’, ‘0.9427’, ‘0.9994’, ‘0.9975’, ‘0.9996’, ‘0.9985’, ‘0.30217502’]

See the Models Hub for all available Entity Resolution Models

Relation Extraction

Relation Extraction tutorial notebook

Classify what kind of relation exists between pairs of entities.
For every named entity, the model classifies which type of relationship it has to the other entities.
More precisely, the relation extractor internally classifies every pair of entities into one of C potential relation classes.
A pair of entities may have no relation, or it may have a relation, which is specified by the predicted relation label.

You can specify predict(data, output_level='relation') to have one row per classified relation in your resulting dataframe.
Depending on which models are loaded in your pipe, NLU infers output_level='relation' automatically and configures itself accordingly, unless specified otherwise.
See the Models Hub for all available Relation Extractor Models

data = 'MRI demonstrated infarction in the upper brain stem , left cerebellum and  right basil ganglia'
df = nlp.load('en.med_ner.jsl.wip.clinical.greedy en.relation').predict(data)
document_results relation_results meta_relation_entity1 meta_relation_entity2 meta_relation_chunk1 meta_relation_chunk2 meta_relation_confidence entities@greedy_results meta_entities@greedy_entity meta_entities@greedy_confidence
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Test Disease_Syndrome_Disorder MRI infarction 0.900999 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Test Direction MRI upper 0.947945 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Test Internal_organ_or_component MRI brain stem 0.654686 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Test Direction MRI left 0.944728 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Test Internal_organ_or_component MRI cerebellum 0.683124 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Test Direction MRI right 0.96001 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Test Internal_organ_or_component MRI basil ganglia 0.958023 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Disease_Syndrome_Disorder Direction infarction upper 0.986427 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Disease_Syndrome_Disorder Internal_organ_or_component infarction brain stem 0.872217 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Disease_Syndrome_Disorder Direction infarction left 0.983788 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Disease_Syndrome_Disorder Internal_organ_or_component infarction cerebellum 0.974557 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Disease_Syndrome_Disorder Direction infarction right 0.981092 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Disease_Syndrome_Disorder Internal_organ_or_component infarction basil ganglia 0.968148 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 1 Direction Internal_organ_or_component upper brain stem 0.999582 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Direction Direction upper left 0.98803 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Direction Internal_organ_or_component upper cerebellum 0.990115 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Direction Direction upper right 0.989708 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Direction Internal_organ_or_component upper basil ganglia 0.971543 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Internal_organ_or_component Direction brain stem left 0.768312 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 1 Internal_organ_or_component Internal_organ_or_component brain stem cerebellum 0.504254 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Internal_organ_or_component Direction brain stem right 0.939806 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Internal_organ_or_component Internal_organ_or_component brain stem basil ganglia 0.944104 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 1 Direction Internal_organ_or_component left cerebellum 0.999842 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Direction Direction left right 0.99164 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Direction Internal_organ_or_component left basil ganglia 0.985331 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Internal_organ_or_component Direction cerebellum right 0.986705 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 0 Internal_organ_or_component Internal_organ_or_component cerebellum basil ganglia 0.975779 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
MRI demonstrated infarction in the upper brain stem , left cerebellum and right basil ganglia” 1 Direction Internal_organ_or_component right basil ganglia 0.999613 [‘MRI’, ‘infarction’, ‘upper’, ‘brain stem’, ‘left’, ‘cerebellum’, ‘right’, ‘basil ganglia’] [‘Test’, ‘Disease_Syndrome_Disorder’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’, ‘Direction’, ‘Internal_organ_or_component’] [‘0.9979’, ‘0.5062’, ‘0.2152’, ‘0.2636’, ‘0.4775’, ‘0.8135’, ‘0.5086’, ‘0.3236’]
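
The row count above follows from the pairwise classification: the sentence contains 8 entities, and every unordered pair is classified once. The sketch below only reproduces that counting with the standard library; it is not the extractor's internal code.

```python
from itertools import combinations

# The eight entities detected in the example sentence, in order of appearance.
entities = [
    "MRI", "infarction", "upper", "brain stem",
    "left", "cerebellum", "right", "basil ganglia",
]

# Every unordered pair of entities is one candidate relation.
pairs = list(combinations(entities, 2))

print(len(pairs))   # 28, one per row in the output above
print(pairs[0])     # ('MRI', 'infarction')
print(pairs[-1])    # ('right', 'basil ganglia')
```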

Assertion

Assertion tutorial notebook

Assign an assertion status to each entity, chosen from one of C classes. These classes are usually: hypothetical, present, absent, possible, conditional, associated_with_someone_else.

data = "He has a starvation ketosis but nothing found for significant for dry oral mucosa"
assert_df = nlp.load('en.med_ner.clinical en.assert').predict(data)
entities@clinical_results meta_entities@clinical_entity meta_entities@clinical_confidence assertion_results meta_assertion_confidence
a starvation ketosis PROBLEM 0.932233 present 0.9938
dry oral mucosa PROBLEM 0.797567 present 0.9997

See the Models Hub for all available Assertion Models

De-Identification

De-Identification tutorial notebook

Detect sensitive information in a string and replace the sensitive data with anonymized labels.

data= 'DR Johnson administerd to the patient Peter Parker last week 30 MG of penicilin on Friday 25. March 1999'
df = nlp.load('de_identify').predict(data)
deidentified_results entities@ner_results meta_entities@ner_entity
['DR administerd to the patient last week 30 MG of penicilin on Friday 25.', ' March '] Johnson PER
['DR administerd to the patient last week 30 MG of penicilin on Friday 25.', ' March '] Peter Parker PER

See the Models Hub for all available De-Identification Models
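
To make the idea of replacing sensitive data with labels concrete, here is a minimal sketch using plain string replacement on the entities detected above. It is an illustration only: the function mask_entities and the <LABEL> placeholder format are assumptions, and the pipeline's own output format differs.

```python
def mask_entities(text, entities):
    """Replace each detected entity string with a <LABEL> placeholder."""
    # Replace longer chunks first so a short chunk never clobbers part of a longer one.
    for chunk, label in sorted(entities, key=lambda e: len(e[0]), reverse=True):
        text = text.replace(chunk, f"<{label}>")
    return text


text = "DR Johnson administerd to the patient Peter Parker last week 30 MG of penicilin"
detected = [("Johnson", "PER"), ("Peter Parker", "PER")]

masked = mask_entities(text, detected)
print(masked)
```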

Drug Normalizer

Drug Normalizer tutorial notebook

Normalize raw text from clinical documents, e.g. scraped web pages or XML documents. Removes all dirty characters from text following one or more input regex patterns. Can remove unwanted characters according to a specific policy. Can apply lowercase normalization.

Parameters:

  • lowercase: whether to convert strings to lowercase. Default is False.
  • policy: rule to remove patterns from text. Valid policy values are all, abbreviations, and dosages. Default is all. The abbreviations policy expands common drug abbreviations; the dosages policy converts drug dosages and values to the standard form (see examples below).
data = ["Agnogenic one half cup","adalimumab 54.5 + 43.2 gm","aspirin 10 meq/ 5 ml oral sol","interferon alfa-2b 10 million unit ( 1 ml ) injec","Sodium Chloride/Potassium Chloride 13bag"]
nlp.load('norm_drugs').predict(data)
drug_norm text
Agnogenic 0.5 oral solution Agnogenic one half cup
adalimumab 97700 mg adalimumab 54.5 + 43.2 gm
aspirin 2 meq/ml oral solution aspirin 10 meq/ 5 ml oral sol
interferon alfa - 2b 10000000 unt ( 1 ml ) injection interferon alfa-2b 10 million unit ( 1 ml ) injec
Sodium Chloride / Potassium Chloride 13 bag Sodium Chloride/Potassium Chloride 13bag
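
Two of the normalized rows above come down to simple arithmetic: 54.5 g + 43.2 g is 97.7 g, or 97700 mg, and 10 meq in 5 ml is 2 meq/ml. The sketch below reproduces only these two calculations with exact decimals; the function names are hypothetical and this is not how the normalizer is implemented.

```python
from decimal import Decimal


def grams_to_milligrams(*amounts_g: str) -> int:
    """Sum gram amounts given as strings and convert the total to milligrams."""
    total_g = sum(Decimal(amount) for amount in amounts_g)
    return int(total_g * 1000)


def concentration(amount: str, volume: str) -> Decimal:
    """Reduce 'amount per volume' to an amount per single unit of volume."""
    return Decimal(amount) / Decimal(volume)


print(grams_to_milligrams("54.5", "43.2"))  # 97700
print(concentration("10", "5"))             # 2
```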

Text Generator

Text Generation tutorial notebook

Given a few tokens as an intro, the model can generate human-like, conceptually meaningful text of up to 512 tokens from an input text of at most 1024 tokens.

data = ['Covid 19 is']
df = nlu.load('en.text_generator.biomedical_biogpt_base').predict(data)
text generated
Covid 19 is Covid 19 is a pandemic that has affected the world economy and health. The World Health Organization ( WHO ) has declared the pandemic a global emergency.

See the Models Hub for all available Text Generation Models

Rule based NER with Context Matcher

Rule based NER with context matching tutorial notebook

Define a rule-based NER algorithm by providing regex patterns and resolution mappings. The confidence value is computed using a heuristic approach based on how many matches it has.
A dictionary can be provided with setDictionary to map extracted entities to a unified representation. The first column of the dictionary file should be the representation, with the following columns listing the possible matches.

import nlu
import json

# Define helper functions to write NER rules to file
def dump_dict_to_json_file(data, path):
  """Generate a JSON file with the dict contexts at the target path."""
  with open(path, 'w') as f:
    json.dump(data, f)

def dump_file_to_csv(data, path):
  """Dump raw text to a file at the target path."""
  with open(path, 'w') as f:
    f.write(data)

sample_text = """A 28-year-old female with a history of gestational diabetes mellitus diagnosed eight years prior to presentation and subsequent type two diabetes mellitus ( T2DM ), one prior episode of HTG-induced pancreatitis three years prior to presentation , associated with an acute hepatitis , and obesity with a body mass index ( BMI ) of 33.5 kg/m2 , presented with a one-week history of polyuria , polydipsia , poor appetite , and vomiting. Two weeks prior to presentation , she was treated with a five-day course of amoxicillin for a respiratory tract infection . She was on metformin , glipizide , and dapagliflozin for T2DM and atorvastatin and gemfibrozil for HTG . She had been on dapagliflozin for six months at the time of presentation . Physical examination on presentation was significant for dry oral mucosa ; significantly , her abdominal examination was benign with no tenderness , guarding , or rigidity . Pertinent laboratory findings on admission were : serum glucose 111 mg/dl , bicarbonate 18 mmol/l , anion gap 20 , creatinine 0.4 mg/dL , triglycerides 508 mg/dL , total cholesterol 122 mg/dL , glycated hemoglobin ( HbA1c ) 10% , and venous pH 7.27 . Serum lipase was normal at 43 U/L . Serum acetone levels could not be assessed as blood samples kept hemolyzing due to significant lipemia . The patient was initially admitted for starvation ketosis , as she reported poor oral intake for three days prior to admission . However , serum chemistry obtained six hours after presentation revealed her glucose was 186 mg/dL , the anion gap was still elevated at 21 , serum bicarbonate was 16 mmol/L , triglyceride level peaked at 2050 mg/dL , and lipase was 52 U/L . β-hydroxybutyrate level was obtained and found to be elevated at 5.29 mmol/L - the original sample was centrifuged and the chylomicron layer removed prior to analysis due to interference from turbidity caused by lipemia again . 
The patient was treated with an insulin drip for euDKA and HTG with a reduction in the anion gap to 13 and triglycerides to 1400 mg/dL , within 24 hours . Twenty days ago. Her euDKA was thought to be precipitated by her respiratory tract infection in the setting of SGLT2 inhibitor use . At birth the typical boy is growing slightly faster than the typical girl, but the velocities become equal at about seven months, and then the girl grows faster until four years. From then until adolescence no differences in velocity can be detected. 21-02-2020 21/04/2020 """

# Define Gender NER matching rules
gender_rules = {
  "entity": "Gender",
  "ruleScope": "sentence",
  "completeMatchRegex": "true"    }

# Define dict data in csv format
gender_data = '''male,man,male,boy,gentleman,he,him
female,woman,female,girl,lady,old-lady,she,her
neutral,neutral'''

# Dump configs to file 
dump_file_to_csv(gender_data, 'gender.csv')  # write the raw CSV text, not JSON
dump_dict_to_json_file(gender_rules, 'gender.json')
gender_NER_pipe = nlu.load('match.context')
gender_NER_pipe.print_info()
gender_NER_pipe['context_matcher'].setJsonPath('gender.json')
gender_NER_pipe['context_matcher'].setDictionary('gender.csv', options={"delimiter":","})
gender_NER_pipe.predict(sample_text)
context_match context_match_confidence
female 0.13
she 0.13
she 0.13
she 0.13
she 0.13
boy 0.13
girl 0.13
girl 0.13
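
The dictionary format described earlier (first column is the unified representation, remaining columns are the possible matches) can be sketched as a simple lookup table. The code below parses the same gender data with the standard csv module; build_lookup is a hypothetical helper and not the matcher's own implementation.

```python
import csv
import io

gender_data = """male,man,male,boy,gentleman,he,him
female,woman,female,girl,lady,old-lady,she,her
neutral,neutral"""


def build_lookup(csv_text, delimiter=","):
    """Map every possible match to the representation in the first column."""
    lookup = {}
    for row in csv.reader(io.StringIO(csv_text), delimiter=delimiter):
        if not row:
            continue
        representation, *matches = row
        for match in matches:
            lookup[match] = representation
    return lookup


lookup = build_lookup(gender_data)
# Unified representations for the distinct matches in the output above.
print([lookup[match] for match in ["female", "she", "boy", "girl"]])
```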

Context Matcher Parameters

You can define the following parameters in your rules.json file to define the entities to be matched

Parameter Type Description
entity str The name of this rule
regex Optional[str] Regex pattern to extract candidates
contextLength Optional[int] Maximum distance that prefix and suffix words can be from the word to match, whereas context words must be immediately before or after the word to match
prefix Optional[List[str]] Words preceding the regex match that are at most contextLength characters away
regexPrefix Optional[str] Regex pattern of words preceding the regex match that are at most contextLength characters away
suffix Optional[List[str]] Words following the regex match that are at most contextLength characters away
regexSuffix Optional[str] Regex pattern of words following the regex match that are at most contextLength characters away
context Optional[List[str]] List of words that must be immediately before/after a match
contextException Optional[List[str]] List of words that may not be immediately before/after a match
exceptionDistance Optional[int] Distance exceptions must be away from a match
regexContextException Optional[str] Regex pattern of exceptions that may not be within exceptionDistance range of the match
matchScope Optional[str] Either token or sub-token, to match on a character basis
completeMatchRegex Optional[str] Whether to use complete or partial matching, either "true" or "false"
ruleScope str Currently only sentence is supported
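
As a sketch of how these parameters fit together, the example below builds a hypothetical rule for numeric dates and checks its keys against the parameter names in the table. The Date rule, its regex, and the check_rules helper are illustrative assumptions; treating entity and ruleScope as required is inferred only from their non-Optional types above.

```python
import json
import re

KNOWN_PARAMETERS = {
    "entity", "regex", "contextLength", "prefix", "regexPrefix", "suffix",
    "regexSuffix", "context", "contextException", "exceptionDistance",
    "regexContextException", "matchScope", "completeMatchRegex", "ruleScope",
}
# Assumption: the two parameters not typed as Optional are required.
REQUIRED_PARAMETERS = {"entity", "ruleScope"}


def check_rules(rules):
    """Return (unknown keys, missing required keys), both sorted."""
    unknown = sorted(set(rules) - KNOWN_PARAMETERS)
    missing = sorted(REQUIRED_PARAMETERS - set(rules))
    return unknown, missing


# Hypothetical rule matching dates such as 21-02-2020 or 21/04/2020.
date_rules = {
    "entity": "Date",
    "ruleScope": "sentence",
    "regex": r"\d{2}[-/]\d{2}[-/]\d{4}",
    "matchScope": "token",
    "completeMatchRegex": "true",
}

print(check_rules(date_rules))
print(bool(re.fullmatch(date_rules["regex"], "21/04/2020")))
print(json.dumps(date_rules, indent=2))
```

The resulting dictionary could be written to a rules file with json.dump, in the same way as the gender rules.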

Authorize access to licensed features and install healthcare dependencies

You need a set of credentials to access the licensed healthcare features.
You can grab one here

Automatically Authorize Google Colab via JSON file

By default, nlu checks /content/spark_nlp_for_healthcare.json in Google Colab environments for the spark_nlp_for_healthcare.json file that you receive via e-mail from us. If you upload the spark_nlp_for_healthcare.json file to the standard Colab directory, nlp.load() will automatically find it and authorize your environment.

Authorize anywhere by providing a JSON file

You can specify the location of your spark_nlp_for_healthcare.json like this:

path = '/path/to/spark_nlp_for_healthcare.json'
nlp.auth(path).load('licensed_model').predict(data)

Authorize by providing string parameters

import nlu
SPARK_NLP_LICENSE           = 'YOUR_SECRETS'
AWS_ACCESS_KEY_ID           = 'YOUR_SECRETS'
AWS_SECRET_ACCESS_KEY       = 'YOUR_SECRETS'
JSL_SECRET                  = 'YOUR_SECRETS'

nlu.auth(SPARK_NLP_LICENSE, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, JSL_SECRET)
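
Rather than hard-coding secrets in a script, the four values can be read from environment variables. This is a minimal sketch under the assumption that the variables carry the same names as the Python variables above; read_credentials is a hypothetical helper, not part of the library.

```python
import os

CREDENTIAL_NAMES = (
    "SPARK_NLP_LICENSE",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "JSL_SECRET",
)


def read_credentials(env=None):
    """Return the four credentials in auth-call order, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in CREDENTIAL_NAMES if not env.get(name)]
    if missing:
        raise KeyError(f"missing credentials: {', '.join(missing)}")
    return tuple(env[name] for name in CREDENTIAL_NAMES)


# Stand-in mapping for demonstration; real code would call read_credentials()
# and read from os.environ.
creds = read_credentials({name: "YOUR_SECRETS" for name in CREDENTIAL_NAMES})
print(len(creds))
```

The resulting tuple can then be unpacked into the auth call, in the same argument order as above.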