Adverse Drug Events Binary Classifier (BioBERT)

Description

This model is a [BioBERT-based](https://github.com/dmis-lab/biobert) binary classifier that labels tweets as either reporting an adverse drug event (ADE) or not (noADE).

Predicted Entities

ADE, noADE

How to use

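# Python (assumes an active Spark session `spark` started with Spark NLP for Healthcare)
from pyspark.ml import Pipeline
from pyspark.sql.types import StringType
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer
from sparknlp_jsl.annotator import MedicalBertForSequenceClassification

document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")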

tokenizer = Tokenizer() \
    .setInputCols("document") \
    .setOutputCol("token")

sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_ade_augmented", "en", "clinical/models")\
    .setInputCols(["document","token"])\
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    document_assembler, 
    tokenizer,
    sequenceClassifier
])

data = spark.createDataFrame(["So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st",
                              "Religare Capital Ranbaxy has been accepting approval for Diovan since 2012"], StringType()).toDF("text")
              
result = pipeline.fit(data).transform(data)

result.select("text", "class.result").show(truncate=False)
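
// Scala: the same pipeline
val documenter = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

// The tokenizer reads the "document" column produced by the assembler above
val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")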

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_ade_augmented", "en", "clinical/models")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documenter, tokenizer, sequenceClassifier))

import spark.implicits._  // provides .toDF on a Seq

// One row per tweet, matching the Python example
val data = Seq("So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st",
               "Religare Capital Ranbaxy has been accepting approval for Diovan since 2012").toDF("text")

val result = pipeline.fit(data).transform(data)

# Python, using the NLU library
import nlu
nlu.load("en.classify.adverse_drug_events").predict("""So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st""")

Results

+-----------------------------------------------------------------------------------------------------------------------+-------+
|text                                                                                                                   |result |
+-----------------------------------------------------------------------------------------------------------------------+-------+
|So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st|[ADE]  |
|Religare Capital Ranbaxy has been accepting approval for Diovan since 2012                                             |[noADE]|
+-----------------------------------------------------------------------------------------------------------------------+-------+
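
As the table shows, `class.result` is an array column, so each prediction arrives wrapped in a list. The following is a minimal sketch of unwrapping the label and keeping only the ADE tweets in plain Python. The `rows` list is a hand-written stand-in for what `result.select("text", "class.result").collect()` would return, and `first_label` is a hypothetical helper, not part of Spark NLP.

```python
# Hypothetical stand-in for the collected rows of
# result.select("text", "class.result"): (text, list of predicted labels).
rows = [
    ("So glad I am off effexor, so sad it ruined my teeth. tip Please be "
     "carefull taking antideppresiva and read about it 1st", ["ADE"]),
    ("Religare Capital Ranbaxy has been accepting approval for Diovan "
     "since 2012", ["noADE"]),
]


def first_label(labels, default=None):
    """Return the first label of the array, or `default` if it is empty."""
    return labels[0] if labels else default


# Pair every text with a single label string instead of a one-element list.
labeled = [(text, first_label(labels)) for text, labels in rows]

# Keep only the texts flagged as reporting an adverse drug event.
ade_texts = [text for text, label in labeled if label == "ADE"]

for text, label in labeled:
    print(f"{label}\t{text[:40]}")
```

On a large DataFrame it would usually be preferable to do this filtering in Spark before collecting; the plain-Python version is only meant to show the shape of the output.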

Model Information

Model Name: bert_sequence_classifier_ade_augmented
Compatibility: Healthcare NLP 4.0.0+
License: Licensed
Edition: Official
Input Labels: [document, token]
Output Labels: [class]
Language: en
Size: 406.5 MB
Case sensitive: true
Max sentence length: 128

Benchmarking

       label  precision    recall  f1-score   support
         ADE     0.9696    0.9595    0.9645      2763
       noADE     0.9670    0.9753    0.9712      3366
    accuracy       -         -       0.9682      6129
   macro-avg     0.9683    0.9674    0.9678      6129
weighted-avg     0.9682    0.9682    0.9682      6129
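
The summary rows can be cross-checked against the per-class rows. The sketch below recomputes F1, the macro average and the support-weighted average from the rounded precision, recall and support values in the table. Because the inputs are already rounded to four decimals, the recomputed numbers may differ from the table in the last digit. The variable and function names are illustrative only.

```python
# Per-class values copied from the benchmark table above.
per_class = {
    "ADE":   {"precision": 0.9696, "recall": 0.9595, "support": 2763},
    "noADE": {"precision": 0.9670, "recall": 0.9753, "support": 3366},
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro(values):
    """Unweighted mean over classes."""
    return sum(values.values()) / len(values)


def weighted(values, supports):
    """Mean over classes, weighted by each class's support."""
    total = sum(supports.values())
    return sum(values[k] * supports[k] for k in values) / total


supports = {k: m["support"] for k, m in per_class.items()}
precisions = {k: m["precision"] for k, m in per_class.items()}
recalls = {k: m["recall"] for k, m in per_class.items()}
f1_scores = {k: f1(m["precision"], m["recall"]) for k, m in per_class.items()}

total_support = sum(supports.values())
macro_precision = macro(precisions)
macro_recall = macro(recalls)
macro_f1 = macro(f1_scores)
weighted_precision = weighted(precisions, supports)
# Support-weighted recall is the same quantity as overall accuracy.
weighted_recall = weighted(recalls, supports)
weighted_f1 = weighted(f1_scores, supports)

print(f"total support: {total_support}")
print(f"macro-avg:     {macro_precision:.4f} {macro_recall:.4f} {macro_f1:.4f}")
print(f"weighted-avg:  {weighted_precision:.4f} {weighted_recall:.4f} {weighted_f1:.4f}")
```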