Classifier for Adverse Drug Events

Description

This model classifies whether a text is ADE-related (True) or not (False).

Classified Labels

True, False.


How to use

To classify whether a text is ADE-related, use this model as part of an NLP pipeline with the following stages: DocumentAssembler, SentenceDetector, Tokenizer, BertEmbeddings (biobert_pubmed_base_cased), SentenceEmbeddings, ClassifierDLModel.

...
embeddings = BertEmbeddings.pretrained('biobert_pubmed_base_cased')\
    .setInputCols(["document", 'token'])\
    .setOutputCol("word_embeddings")

sentence_embeddings = SentenceEmbeddings() \
      .setInputCols(["document", "word_embeddings"]) \
      .setOutputCol("sentence_embeddings") \
      .setPoolingStrategy("AVERAGE")

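# Model name matches the "Model Name" entry under Model Information;
# the classifier consumes only the sentence embeddings column.
classifier = ClassifierDLModel.pretrained('classifierdl_ade_biobert', 'en', 'clinical/models')\
    .setInputCols(['sentence_embeddings'])\
    .setOutputCol('class')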

nlp_pipeline = Pipeline(stages=[document_assembler, tokenizer, embeddings, sentence_embeddings, classifier])
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))
annotations = light_pipeline.fullAnnotate("I feel a bit drowsy & have a little blurred vision after taking an insulin")

...
val embeddings = BertEmbeddings.pretrained("biobert_pubmed_base_cased")
    .setInputCols(Array("document", "token"))
    .setOutputCol("word_embeddings")
// Average the token vectors into one vector per document
val sentence_embeddings = new SentenceEmbeddings()
      .setInputCols(Array("document", "word_embeddings"))
      .setOutputCol("sentence_embeddings")
      .setPoolingStrategy("AVERAGE")
val classifierADE = ClassifierDLModel.pretrained("classifierdl_ade_biobert", "en", "clinical/models")
      .setInputCols(Array("sentence_embeddings"))
      .setOutputCol("class")
val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, embeddings, sentence_embeddings, classifierADE))
// .toDS requires spark.implicits._ to be in scope
val data = Seq("I feel a bit drowsy & have a little blurred vision after taking an insulin").toDS.toDF("text")
val result = pipeline.fit(data).transform(data)
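
Both snippets set the pooling strategy to AVERAGE, which turns the per-token word embeddings into a single vector for the classifier. As a rough illustration of the idea only (not the library's implementation), the sketch below averages toy token vectors element-wise in plain Python. The function name `average_pool` and the 3-dimensional toy vectors are assumptions made for this example.

```python
from typing import List


def average_pool(token_vectors: List[List[float]]) -> List[float]:
    """Element-wise mean of token vectors (illustrative stand-in for AVERAGE pooling)."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    if any(len(vec) != dim for vec in token_vectors):
        raise ValueError("all token vectors must have the same dimension")
    count = len(token_vectors)
    # Average each dimension across all tokens.
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


# Toy vectors for two tokens; real word embeddings have far more dimensions.
toy_tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
pooled = average_pool(toy_tokens)
print(pooled)
```

The pooled vector has the same dimensionality as each token vector, so the classifier receives a fixed-size input regardless of how many tokens the text contains.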

Results

True: The sentence describes a possible ADE.

False: The sentence contains no information about an ADE.

'True'
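
The label is returned as a string, so downstream code usually needs to pull it out of the annotation output and convert it to a boolean. The sketch below assumes the output of `fullAnnotate` is a list with one dictionary per input text, keyed by output column, whose values are annotation objects exposing a `result` attribute. `StubAnnotation`, `predicted_labels`, and `is_ade` are hypothetical names used only so the example runs without a Spark session.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StubAnnotation:
    """Hypothetical stand-in for an annotation object; only `result` and `metadata` are modelled."""
    result: str
    metadata: Dict[str, str] = field(default_factory=dict)


def predicted_labels(annotations: List[dict], output_col: str = "class") -> List[List[str]]:
    """Collect the label strings found under `output_col` for each annotated text."""
    return [[ann.result for ann in doc.get(output_col, [])] for doc in annotations]


def is_ade(label: str) -> bool:
    """Map the string labels 'True' / 'False' to a Python boolean."""
    return label.strip().lower() == "true"


# Mocked output shaped like the assumed result for a single input text.
mock_annotations = [{"class": [StubAnnotation(result="True")]}]
labels = predicted_labels(mock_annotations)
flags = [is_ade(label) for label in labels[0]]
print(labels, flags)
```

With the real pipeline, the `annotations` variable from the Python example would take the place of `mock_annotations`, provided the output structure matches the assumption stated above.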

Model Information

Model Name: classifierdl_ade_biobert
Type: ClassifierDLModel
Compatibility: Spark NLP for Healthcare 2.6.2+
Edition: Official
License: Licensed
Input Labels: [sentence_embeddings]
Output Labels: [class]
Language: [en]
Case sensitive: True

Data Source

Trained on a custom dataset comprising CADEC, DRUG-AE, and Twimed, using biobert_pubmed_base_cased embeddings.

Benchmarking

|    | label            | precision | recall | F1     |
|---:|-----------------:|----------:|-------:|-------:|
|  0 | False            | 0.9469    | 0.9327 | 0.9398 |
|  1 | True             | 0.7603    | 0.8030 | 0.7811 |
|  2 | Macro-average    | 0.8536    | 0.8679 | 0.8604 |
|  3 | Weighted-average | 0.9077    | 0.9055 | 0.9065 |
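
As a consistency check on the table, the per-class F1 can be recomputed from precision and recall as their harmonic mean, and the macro-average row as the unweighted mean of the two classes. The sketch below uses only the figures reported above. Because those figures are rounded to four decimals, the recomputed values may differ in the last digit. The helper name `f1_score` is an assumption of this example, not part of any library used here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per class, copied from the benchmark table.
per_class = {
    "False": (0.9469, 0.9327, 0.9398),
    "True": (0.7603, 0.8030, 0.7811),
}

# Recompute F1 for each class from its precision and recall.
recomputed_f1 = {label: f1_score(p, r) for label, (p, r, _) in per_class.items()}

# Macro-average: unweighted mean over the classes.
n_classes = len(per_class)
macro_precision = sum(p for p, _, _ in per_class.values()) / n_classes
macro_recall = sum(r for _, r, _ in per_class.values()) / n_classes
macro_f1 = sum(f for _, _, f in per_class.values()) / n_classes

for label, value in recomputed_f1.items():
    print(f"{label}: recomputed F1 = {value:.4f}")
print(f"macro precision = {macro_precision:.4f}, "
      f"macro recall = {macro_recall:.4f}, macro F1 = {macro_f1:.4f}")
```

The weighted-average row cannot be reproduced this way because the per-class support counts are not reported.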