Self Treatment Changes Classifier in Tweets (BioBERT)

Description

This model is a BioBERT-based sequence classifier that identifies tweets in which patients report non-adherence to their treatments, along with their reasons for it.

Predicted Entities

negative, positive

How to use

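# Imports assumed from the standard Spark NLP / Healthcare NLP Python packages
from pyspark.ml import Pipeline
from pyspark.sql.types import StringType
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer
from sparknlp_jsl.annotator import MedicalBertForSequenceClassification

# Assumes an active Healthcare NLP session is available as `spark`
document_assembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')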

tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_treatment_changes_sentiment_tweet", "en", "clinical/models")\
    .setInputCols(["document",'token'])\
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    document_assembler, 
    tokenizer,
    sequenceClassifier
])

data = spark.createDataFrame(["I love when they say things like this. I took that ambien instead of my thyroid pill.",
                              "I am a 30 year old man who is not overweight but is still on the verge of needing a Lipitor prescription."], StringType()).toDF("text")
                          
result = pipeline.fit(data).transform(data)

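result.select("text", "class.result").show(truncate=False)

// Scala: imports assumed from the standard Spark NLP / Healthcare NLP packages
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.annotators.classification.MedicalBertForSequenceClassification
import org.apache.spark.ml.Pipeline

val document_assembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")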

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_treatment_changes_sentiment_tweet", "en", "clinical/models")
    .setInputCols(Array("document","token"))
    .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))

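// Needed for .toDF on a local Seq
import spark.implicits._

// One string per row, so that "text" is a string column rather than an array column
val data = Seq("I love when they say things like this. I took that ambien instead of my thyroid pill.",
               "I am a 30 year old man who is not overweight but is still on the verge of needing a Lipitor prescription.").toDF("text")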

val result = pipeline.fit(data).transform(data)

import nlu
nlu.load("en.classify.bert_sequence.treatment_sentiment_tweets").predict("""I am a 30 year old man who is not overweight but is still on the verge of needing a Lipitor prescription.""")

Results

+---------------------------------------------------------------------------------------------------------+----------+
|text                                                                                                     |result    |
+---------------------------------------------------------------------------------------------------------+----------+
|I love when they say things like this. I took that ambien instead of my thyroid pill.                    |[positive]|
|I am a 30 year old man who is not overweight but is still on the verge of needing a Lipitor prescription.|[negative]|
+---------------------------------------------------------------------------------------------------------+----------+

Model Information

Model Name: bert_sequence_classifier_treatment_changes_sentiment_tweet
Compatibility: Healthcare NLP 4.0.2+
License: Licensed
Edition: Official
Input Labels: [document, token]
Output Labels: [class]
Language: en
Size: 406.5 MB
Case sensitive: true
Max sentence length: 128

Benchmarking

       label  precision    recall  f1-score   support
    negative     0.9515    0.9751    0.9632      1368
    positive     0.6304    0.4603    0.5321       126
    accuracy     -         -         0.9317      1494
   macro-avg     0.7910    0.7177    0.7476      1494
weighted-avg     0.9244    0.9317    0.9268      1494
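
The averaged rows in the table can be reproduced from the per-class rows. The sketch below is only an illustration in plain Python: it copies the rounded precision, recall and support figures from the table, so results match the published values to about four decimals rather than exactly. The variable names (`per_class`, `majority_baseline`, and so on) are hypothetical and not part of any library, and the per-class counts of correct predictions are estimates obtained by rounding recall times support.

```python
# Per-class figures copied from the benchmarking table: (precision, recall, support)
per_class = {
    "negative": (0.9515, 0.9751, 1368),
    "positive": (0.6304, 0.4603, 126),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total = sum(support for _, _, support in per_class.values())

f1_scores = {label: f1(p, r) for label, (p, r, _) in per_class.items()}

# Macro average: every class counts equally, regardless of its size.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

# Weighted average: every class counts in proportion to its support.
weighted_f1 = sum(f1_scores[label] * s for label, (_, _, s) in per_class.items()) / total

# recall * support estimates how many tweets of each class were classified correctly.
correct = {label: round(r * s) for label, (_, r, s) in per_class.items()}
accuracy = sum(correct.values()) / total

# Reference point: accuracy of always predicting the larger class.
majority_baseline = max(s for _, _, s in per_class.values()) / total

for label, score in f1_scores.items():
    print(f"{label:>12}  f1={score:.4f}  correct~{correct[label]}")
print(f"{'macro-avg':>12}  f1={macro_f1:.4f}")
print(f"{'weighted-avg':>12}  f1={weighted_f1:.4f}")
print(f"{'accuracy':>12}  {accuracy:.4f}  (majority baseline {majority_baseline:.4f})")
```

Because the negative class makes up 1368 of the 1494 test tweets, accuracy and the weighted averages are dominated by it. Always predicting negative would already score about 0.916, so the macro average and the positive-class row are the more informative figures when the goal is to find treatment changes.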