Description
This model is a [BioBERT-based](https://github.com/dmis-lab/biobert) classifier that detects whether a tweet reports an ADE (Adverse Drug Event).
Predicted Entities
ADE, noADE
How to use
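# Python. Assumes an active, licensed Healthcare NLP Spark session named `spark`.
from pyspark.ml import Pipeline
from pyspark.sql.types import StringType
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer
from sparknlp_jsl.annotator import MedicalBertForSequenceClassification

# Wrap the raw "text" column in a document annotation.
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Load the pretrained ADE classifier from the clinical models repository.
sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_ade_augmented", "en", "clinical/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequenceClassifier
])

data = spark.createDataFrame(["So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st",
                              "Religare Capital Ranbaxy has been accepting approval for Diovan since 2012"], StringType()).toDF("text")

result = pipeline.fit(data).transform(data)
result.select("text", "class.result").show(truncate=False)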
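// Scala. Assumes an active, licensed Healthcare NLP Spark session named `spark`.
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.annotators.classification.MedicalBertForSequenceClassification
import org.apache.spark.ml.Pipeline
import spark.implicits._  // needed for .toDF on a Seq

val documenter = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// The tokenizer reads the "document" column produced by the assembler.
val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_ade_augmented", "en", "clinical/models")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(documenter, tokenizer, sequenceClassifier))
// One row per tweet, in a single string column named "text".
val data = Seq("So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st",
  "Religare Capital Ranbaxy has been accepting approval for Diovan since 2012").toDF("text")

val result = pipeline.fit(data).transform(data)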
# Python, using the NLU library.
import nlu
nlu.load("en.classify.adverse_drug_events").predict("""So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st""")
Results
+-----------------------------------------------------------------------------------------------------------------------+-------+
|text                                                                                                                   |result |
+-----------------------------------------------------------------------------------------------------------------------+-------+
|So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st|[ADE]  |
|Religare Capital Ranbaxy has been accepting approval for Diovan since 2012                                             |[noADE]|
+-----------------------------------------------------------------------------------------------------------------------+-------+
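Once predictions are collected to the driver, the labels can be handled as ordinary Python values. The sketch below assumes rows shaped like the table above: a text paired with a one-element list of labels, which is presumably what `result.select("text", "class.result").collect()` yields. The `rows` literal and the `flag_ade` helper are hypothetical names used only for illustration, not part of the library.

```python
# Hypothetical stand-in for the collected (text, [label]) rows of the Results table.
rows = [
    ("So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull "
     "taking antideppresiva and read about it 1st", ["ADE"]),
    ("Religare Capital Ranbaxy has been accepting approval for Diovan since 2012",
     ["noADE"]),
]


def flag_ade(rows):
    """Return the texts whose predicted label list contains "ADE".

    List membership is used on purpose: a substring test on the label
    string would also match "noADE".
    """
    return [text for text, labels in rows if "ADE" in labels]


ade_texts = flag_ade(rows)
print(len(ade_texts), "of", len(rows), "texts flagged as ADE")
```

Keeping the check on the label list rather than on a joined string avoids confusing the two class names, since one is a suffix of the other.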
Model Information
| Field | Value |
|---|---|
| Model Name: | bert_sequence_classifier_ade_augmented |
| Compatibility: | Healthcare NLP 4.0.0+ |
| License: | Licensed |
| Edition: | Official |
| Input Labels: | [document, token] |
| Output Labels: | [class] |
| Language: | en |
| Size: | 406.5 MB |
| Case sensitive: | true |
| Max sentence length: | 128 |
Benchmarking
| label | precision | recall | f1-score | support |
|---|---|---|---|---|
| ADE | 0.9696 | 0.9595 | 0.9645 | 2763 |
| noADE | 0.9670 | 0.9753 | 0.9712 | 3366 |
| accuracy | - | - | 0.9682 | 6129 |
| macro-avg | 0.9683 | 0.9674 | 0.9678 | 6129 |
| weighted-avg | 0.9682 | 0.9682 | 0.9682 | 6129 |
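The averaged rows can be re-derived from the two per-class rows. The following is a minimal sketch, assuming the usual definitions: F1 as the harmonic mean of precision and recall, macro average as the unweighted mean over classes, and weighted average as the support-weighted mean. The `report` dictionary and helper names are illustrative. Because the published figures are rounded to four decimals, recomputed values agree only to within rounding.

```python
# Per-class rows copied from the benchmark: (precision, recall, f1, support).
report = {
    "ADE": (0.9696, 0.9595, 0.9645, 2763),
    "noADE": (0.9670, 0.9753, 0.9712, 3366),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_avg(report, idx):
    """Unweighted mean of column `idx` over the classes."""
    return sum(row[idx] for row in report.values()) / len(report)


def weighted_avg(report, idx):
    """Mean of column `idx` weighted by each class's support."""
    total = sum(row[3] for row in report.values())
    return sum(row[idx] * row[3] for row in report.values()) / total


total_support = sum(row[3] for row in report.values())
macro = [macro_avg(report, i) for i in range(3)]
weighted = [weighted_avg(report, i) for i in range(3)]

print("support:", total_support)
print("macro-avg:", [round(v, 4) for v in macro])
print("weighted-avg:", [round(v, 4) for v in weighted])
```

With single-label predictions the support-weighted recall equals overall accuracy, which is consistent with both being reported as 0.9682.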