Detect clinical entities (ner_jsl_biobert)

Description

Detects symptoms, modifiers, age, drugs, treatments, tests, and other clinical entities with a single pretrained NER model. It expects BioBERT embeddings (biobert_pubmed_base_cased) as input.

Predicted Entities

Symptom_Name, Negated, Pulse_Rate, Negation, Date_of_death, Age, Modifier, Substance_Name, Causative_Agents_(Virus_and_Bacteria), Drug_incident_description, Diagnosis, Weight, Drug_Name, Procedure_Name, Lab_Name, Blood_Pressure, Cause_of_death, Lab_Result, Gender, Name, Temperature, Procedure_Findings, Section_Name, Route, Maybe, O2_Saturation, Respiratory_Rate, Procedure, Procedure_incident_description, Frequency, Dosage, Allergenic_substance
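
When post-processing results, it can help to keep the label set above as a constant, so that a filter list with a misspelled label fails early instead of silently matching nothing. The following is a minimal sketch in plain Python; the helper name validate_labels is hypothetical and is not part of Spark NLP.

```python
# Labels copied from the Predicted Entities list of this model card.
PREDICTED_ENTITIES = frozenset([
    "Symptom_Name", "Negated", "Pulse_Rate", "Negation", "Date_of_death",
    "Age", "Modifier", "Substance_Name",
    "Causative_Agents_(Virus_and_Bacteria)", "Drug_incident_description",
    "Diagnosis", "Weight", "Drug_Name", "Procedure_Name", "Lab_Name",
    "Blood_Pressure", "Cause_of_death", "Lab_Result", "Gender", "Name",
    "Temperature", "Procedure_Findings", "Section_Name", "Route", "Maybe",
    "O2_Saturation", "Respiratory_Rate", "Procedure",
    "Procedure_incident_description", "Frequency", "Dosage",
    "Allergenic_substance",
])


def validate_labels(wanted):
    """Return the requested labels, raising if the model does not predict one.

    Hypothetical helper for illustration; not a Spark NLP API.
    """
    unknown = sorted(set(wanted) - PREDICTED_ENTITIES)
    if unknown:
        raise ValueError(f"Labels not predicted by ner_jsl_biobert: {unknown}")
    return list(wanted)


# Example: keep only vital-sign labels for a downstream filter.
vital_signs = validate_labels(
    ["Pulse_Rate", "Blood_Pressure", "Temperature", "O2_Saturation", "Respiratory_Rate"]
)
```

A validated list like this could then be used to filter extracted chunks by label, or passed to a chunk converter's whitelist option if the converter in use provides one.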


How to use


# Python. The upstream stages (document_assembler, sentence_detector, tokenizer) are elided.
...
embeddings_clinical = BertEmbeddings.pretrained("biobert_pubmed_base_cased") \
    .setInputCols(["sentence", "token"]) \
    .setOutputCol("embeddings")
clinical_ner = MedicalNerModel.pretrained("ner_jsl_biobert", "en", "clinical/models") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")
...
nlpPipeline = Pipeline(stages=[document_assembler, sentence_detector, tokenizer, embeddings_clinical, clinical_ner, ner_converter])
# Fitting on an empty DataFrame is enough because the pretrained stages need no training data.
model = nlpPipeline.fit(spark.createDataFrame([[""]]).toDF("text"))
results = model.transform(spark.createDataFrame([["EXAMPLE_TEXT"]]).toDF("text"))

// Scala. The upstream stages (document_assembler, sentence_detector, tokenizer) are elided.
...
val embeddings_clinical = BertEmbeddings.pretrained("biobert_pubmed_base_cased")
  .setInputCols(Array("sentence", "token"))
  .setOutputCol("embeddings")
val ner = MedicalNerModel.pretrained("ner_jsl_biobert", "en", "clinical/models")
  .setInputCols(Array("sentence", "token", "embeddings"))
  .setOutputCol("ner")
...
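val pipeline = new Pipeline().setStages(Array(document_assembler, sentence_detector, tokenizer, embeddings_clinical, ner, ner_converter))
// Requires `import spark.implicits._` for toDS/toDF on a Seq.
val data = Seq("EXAMPLE_TEXT").toDF("text")
val result = pipeline.fit(Seq.empty[String].toDS.toDF("text")).transform(data)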
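
Both pipelines end with ner_converter, whose definition is elided above. Spark NLP NER models typically emit one IOB tag per token (for example B-Symptom_Name, I-Symptom_Name, O), and a converter stage merges those tags into labeled chunks. The sketch below illustrates that merging idea in plain Python. It is not the library's implementation; the function name iob_to_chunks is hypothetical, and the tags are hand-written for illustration rather than real model output.

```python
def iob_to_chunks(tokens, tags):
    """Merge token-level IOB tags into (chunk_text, label) pairs."""
    chunks = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        # "B-Drug_Name" -> ("B", "-", "Drug_Name"); "O" -> ("O", "", "")
        prefix, _, label = tag.partition("-")
        starts_new = prefix == "B" or (prefix == "I" and label != current_label)
        if prefix == "O" or starts_new:
            # Close the chunk that was being built, if any.
            if current_tokens:
                chunks.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
        if prefix in ("B", "I"):
            current_tokens.append(token)
            current_label = label
    # Flush a chunk that runs to the end of the sentence.
    if current_tokens:
        chunks.append((" ".join(current_tokens), current_label))
    return chunks


# Hand-written tags for illustration only; not actual ner_jsl_biobert output.
tokens = ["The", "patient", "denies", "chest", "pain", "and", "takes", "aspirin", "daily"]
tags = ["O", "O", "B-Negation", "B-Symptom_Name", "I-Symptom_Name",
        "O", "O", "B-Drug_Name", "B-Frequency"]
chunks = iob_to_chunks(tokens, tags)
```

With these inputs, chunks holds four pairs: "denies" as Negation, "chest pain" as Symptom_Name, "aspirin" as Drug_Name, and "daily" as Frequency.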

Model Information

Model Name: ner_jsl_biobert
Compatibility: Spark NLP for Healthcare 3.0.0+
License: Licensed
Edition: Official
Input Labels: [sentence, token, embeddings]
Output Labels: [ner]
Language: en
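
The input and output labels above determine how the model is wired into a pipeline: each stage's input columns must be produced by an earlier stage. The plain-Python sketch below checks that wiring for the stage order used in How to use. The column names for the elided stages ("document", "ner_chunk", and the inputs of sentence_detector, tokenizer, and ner_converter) are assumptions, and find_missing_inputs is a hypothetical helper, not a Spark NLP function.

```python
# (stage name, input columns, output column)
STAGES = [
    ("document_assembler", ["text"], "document"),                   # assumed columns
    ("sentence_detector", ["document"], "sentence"),                # assumed input
    ("tokenizer", ["sentence"], "token"),                           # assumed input
    ("embeddings_clinical", ["sentence", "token"], "embeddings"),   # from the snippet
    ("clinical_ner", ["sentence", "token", "embeddings"], "ner"),   # from Model Information
    ("ner_converter", ["sentence", "token", "ner"], "ner_chunk"),   # assumed columns
]


def find_missing_inputs(stages, initial_columns=("text",)):
    """Return (stage_name, missing_columns) for every stage whose inputs
    are not produced by an earlier stage or present in the input data."""
    available = set(initial_columns)
    problems = []
    for name, inputs, output in stages:
        missing = [col for col in inputs if col not in available]
        if missing:
            problems.append((name, missing))
        available.add(output)
    return problems


# The full pipeline is consistently wired under the assumed column names.
problems = find_missing_inputs(STAGES)

# Dropping the embeddings stage leaves the NER model without one of its inputs.
without_embeddings = [stage for stage in STAGES if stage[0] != "embeddings_clinical"]
problems_without_embeddings = find_missing_inputs(without_embeddings)
```

Here problems is empty, while problems_without_embeddings reports that clinical_ner is missing its "embeddings" input.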