Description
This model is a [BioBERT-based](https://github.com/dmis-lab/biobert) classifier that labels tweets according to whether they report an adverse drug event (ADE).
Predicted Entities
ADE, noADE
How to use
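# Python, using the sparknlp / sparknlp_jsl API.
# `spark` is assumed to be an already started, licensed Spark session.
from pyspark.ml import Pipeline
from pyspark.sql.types import StringType
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer
from sparknlp_jsl.annotator import MedicalBertForSequenceClassification
# Wrap the raw "text" column as document annotations
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")
# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")
# Load the pretrained ADE classifier
sequence_classifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_ade_augmented_onnx", "en", "clinical/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")
pipeline = Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequence_classifier
])
# Two sample tweets, one per row, in a single "text" column
data = spark.createDataFrame(["So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st",
    "Religare Capital Ranbaxy has been accepting approval for Diovan since 2012"], StringType()).toDF("text")
model = pipeline.fit(data)
result = model.transform(data)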
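# Python, using the johnsnowlabs API.
# `spark` is assumed to be an already started, licensed Spark session.
from johnsnowlabs import nlp, medical
from pyspark.sql.types import StringType
# Wrap the raw "text" column as document annotations
document_assembler = nlp.DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")
# Split each document into tokens
tokenizer = nlp.Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")
# Load the pretrained ADE classifier (this variant writes to "classes")
sequenceClassifier = medical.BertForSequenceClassification.pretrained("bert_sequence_classifier_ade_augmented_onnx", "en", "clinical/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("classes")
pipeline = nlp.Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequenceClassifier
])
# Two sample tweets, one per row, in a single "text" column
data = spark.createDataFrame(["So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st",
    "Religare Capital Ranbaxy has been accepting approval for Diovan since 2012"], StringType()).toDF("text")
model = pipeline.fit(data)
result = model.transform(data)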
// Scala. `spark` is assumed to be an already started, licensed Spark session.
import spark.implicits._ // needed for .toDF on a Seq
val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")
val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")
val sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_ade_augmented_onnx", "en", "clinical/models")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")
val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))
// One tweet per row: a Seq of strings, not a Seq wrapping a single Array,
// which would produce one row holding an array column instead of two text rows.
val data = Seq("So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st",
  "Religare Capital Ranbaxy has been accepting approval for Diovan since 2012").toDF("text")
val model = pipeline.fit(data)
val result = model.transform(data)
Results
+-----------------------------------------------------------------------------------------------------------------------+-------+
|text                                                                                                                   |result |
+-----------------------------------------------------------------------------------------------------------------------+-------+
|So glad I am off effexor, so sad it ruined my teeth. tip Please be carefull taking antideppresiva and read about it 1st|[ADE]  |
|Religare Capital Ranbaxy has been accepting approval for Diovan since 2012                                             |[noADE]|
+-----------------------------------------------------------------------------------------------------------------------+-------+
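Each `result` value in the table above is an array holding one label per document, so downstream code usually flattens it before counting or filtering. The following is a minimal sketch of that post-processing in plain Python. It assumes the rows have already been collected into `(text, labels)` pairs, for example from a select on the text column and the classifier output column; the `rows` list, `first_label`, and `summarize` are hypothetical names used for illustration and are not part of the library.

```python
from collections import Counter

# Hypothetical rows mirroring the Results table: (text, result), where
# result is the array of predicted labels for that document.
rows = [
    ("So glad I am off effexor, so sad it ruined my teeth. tip Please be "
     "carefull taking antideppresiva and read about it 1st", ["ADE"]),
    ("Religare Capital Ranbaxy has been accepting approval for Diovan since 2012",
     ["noADE"]),
]


def first_label(labels, default=None):
    """Return the single predicted label, or `default` for an empty array."""
    return labels[0] if labels else default


def summarize(rows):
    """Count predictions per label and collect the texts flagged as ADE."""
    counts = Counter(first_label(labels) for _, labels in rows)
    flagged = [text for text, labels in rows if first_label(labels) == "ADE"]
    return counts, flagged


counts, flagged = summarize(rows)
print(dict(counts))   # label -> number of documents
print(len(flagged))   # number of texts predicted as ADE
```

Keeping the empty-array case explicit in `first_label` avoids an `IndexError` if a row ever comes back without a prediction.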
Model Information
| Property | Value |
|---|---|
| Model Name: | bert_sequence_classifier_ade_augmented_onnx |
| Compatibility: | Healthcare NLP 6.1.1+ |
| License: | Licensed |
| Edition: | Official |
| Input Labels: | [document, token] |
| Output Labels: | [label] |
| Language: | en |
| Size: | 437.7 MB |
| Case sensitive: | true |
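The compatibility entry (Healthcare NLP 6.1.1+) can be checked up front, before attempting to download a 437.7 MB model. Below is a minimal sketch of such a check, assuming the installed library version is available as a plain string such as "6.1.1". `parse_version` and `meets_minimum` are hypothetical helpers written for this example, not library functions, and the comparison only considers the leading numeric parts of the version.

```python
import re


def parse_version(version: str) -> tuple:
    """Turn '6.1.1' (or '6.2', '6.1.1.dev0') into a (major, minor, patch) tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(installed: str, minimum: str = "6.1.1") -> bool:
    """True if `installed` is at least `minimum` (default taken from the model card)."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    for candidate in ["6.1.0", "6.1.1", "6.10.0"]:
        print(candidate, meets_minimum(candidate))
```

Comparing integer tuples rather than raw strings matters here: as strings, "6.10.0" would sort before "6.2.0", which is the wrong order for versions.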