
Classifier for Adverse Drug Events in Small Conversations

Description

Classifies a sentence into one of two categories:

True: The sentence mentions a possible adverse drug event (ADE).

False: The sentence contains no information about an ADE.

Predicted Entities

True, False


How to use

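from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import Tokenizer, BertEmbeddings, SentenceEmbeddings, ClassifierDLModel

# Assumes an active SparkSession named `spark` with access to the licensed clinical models.
document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")

tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")

# Token-level BioBERT embeddings (the dependency listed under Model Information).
embeddings = BertEmbeddings.pretrained("biobert_pubmed_base_cased") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("word_embeddings")

# Pool the token embeddings into one vector per document.
sentence_embeddings = SentenceEmbeddings() \
    .setInputCols(["document", "word_embeddings"]) \
    .setOutputCol("sentence_embeddings") \
    .setPoolingStrategy("AVERAGE")

# The classifier consumes only the sentence embeddings (see Input Labels below).
classifier = ClassifierDLModel.pretrained("classifierdl_ade_conversational_biobert", "en", "clinical/models") \
    .setInputCols(["sentence_embeddings"]) \
    .setOutputCol("class")

nlp_pipeline = Pipeline(stages=[document_assembler, tokenizer, embeddings, sentence_embeddings, classifier])

# Fit on an empty DataFrame: every stage is pretrained, so no training data is needed.
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([[""]]).toDF("text")))

annotations = light_pipeline.fullAnnotate(["I feel a bit drowsy & have a little blurred vision after taking an insulin", "I feel great after taking tylenol"])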
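
The pipeline above sets the pooling strategy to "AVERAGE". As a rough conceptual illustration only (this is not Spark NLP's internal implementation), average pooling collapses the per-token vectors of a sentence into a single vector by taking the element-wise mean. The helper name `average_pool` and the toy three-dimensional vectors are assumptions made for this sketch; real BioBERT embeddings have many more dimensions.

```python
def average_pool(token_vectors):
    """Element-wise mean of equal-length token vectors (hypothetical helper)."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    # Average each dimension across all tokens.
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


# Toy embeddings for a three-token sentence (made-up numbers, not model output).
toy_token_vectors = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [5.0, 0.0, 1.0],
]

sentence_vector = average_pool(toy_token_vectors)
print(sentence_vector)  # [3.0, 2.0, 3.0]
```

The classifier then sees one fixed-size vector per input text, regardless of how many tokens the text contains.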

import nlu
nlu.load("en.classify.ade.conversational").predict("""I feel a bit drowsy & have a little blurred vision after taking an insulin""")

Results

|   | text                                                                       | label |
|--:|:---------------------------------------------------------------------------|:------|
| 0 | I feel a bit drowsy & have a little blurred vision after taking an insulin | True  |
| 1 | I feel great after taking tylenol                                          | False |
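
A table like the one above can be assembled from the list returned by `fullAnnotate`, which holds one dictionary per input text, keyed by output column, with annotation objects that expose a `result` attribute. The sketch below runs without Spark by using a hypothetical stand-in class (`FakeAnnotation`) and hand-written mock annotations copied from the Results table; the helper name `labels_from_annotations` is also an assumption. With a live pipeline, the real `annotations` list would be passed instead of the mock.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAnnotation:
    """Hypothetical stand-in exposing the `result` and `metadata` attributes."""
    result: str
    metadata: dict = field(default_factory=dict)


def labels_from_annotations(texts, annotations, output_col="class"):
    """Pair each input text with its first predicted label, or None if absent."""
    rows = []
    for text, annotation in zip(texts, annotations):
        predictions = annotation.get(output_col, [])
        label = predictions[0].result if predictions else None
        rows.append((text, label))
    return rows


texts = [
    "I feel a bit drowsy & have a little blurred vision after taking an insulin",
    "I feel great after taking tylenol",
]

# Mock data mirroring the Results table; not produced by running the model here.
mock_annotations = [
    {"class": [FakeAnnotation("True")]},
    {"class": [FakeAnnotation("False")]},
]

rows = labels_from_annotations(texts, mock_annotations)
for text, label in rows:
    print(f"{label:5} | {text}")
```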

Model Information

Model Name:	classifierdl_ade_conversational_biobert
Compatibility:	Spark NLP 2.7.1+
License:	Licensed
Edition:	Official
Input Labels:	[sentence_embeddings]
Output Labels:	[class]
Language:	en
Dependencies:	biobert_pubmed_base_cased

Data Source

Trained on a custom dataset comprising CADEC, DRUG-AE, and Twimed.

Benchmarking

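|              | precision | recall | f1-score | support |
|:-------------|----------:|-------:|---------:|--------:|
| False        |      0.91 |   0.94 |     0.93 |    5706 |
| True         |      0.80 |   0.70 |     0.74 |    1800 |
| micro avg    |      0.89 |   0.89 |     0.89 |    7506 |
| macro avg    |      0.85 |   0.82 |     0.84 |    7506 |
| weighted avg |      0.88 |   0.89 |     0.88 |    7506 |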
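
To make the averaged rows easier to interpret, the sketch below recomputes the macro and weighted averages from the two per-class rows reported above. The macro average is the unweighted mean over classes, while the weighted average weights each class by its support. Because the per-class figures are already rounded to two decimals, the recomputed values should only be expected to agree with the reported ones to within about 0.01. The dictionary layout and function names are assumptions made for this illustration.

```python
# Per-class figures copied from the benchmark above.
report = {
    "False": {"precision": 0.91, "recall": 0.94, "f1": 0.93, "support": 5706},
    "True": {"precision": 0.80, "recall": 0.70, "f1": 0.74, "support": 1800},
}


def macro_avg(report, metric):
    """Unweighted mean of a metric across classes."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_avg(report, metric):
    """Mean of a metric across classes, weighted by class support."""
    total = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total


total_support = sum(row["support"] for row in report.values())
print("total support:", total_support)  # 7506

for metric in ("precision", "recall", "f1"):
    print(
        f"{metric:9} macro={macro_avg(report, metric):.3f} "
        f"weighted={weighted_avg(report, metric):.3f}"
    )
```

The gap between the macro and weighted figures reflects the class imbalance: the False class has roughly three times the support of the True class, so it dominates the weighted average.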
