RCT Classifier (BioBERT)

Description

This model is a BioBERT-based classifier that can classify the sections within the abstracts of scientific articles regarding randomized clinical trials (RCT).

Predicted Entities

BACKGROUND, CONCLUSIONS, METHODS, OBJECTIVE, RESULTS

Copy S3 URI

How to use

document_assembler = DocumentAssembler() \
.setInputCol("text") \
.setOutputCol("document")


tokenizer = Tokenizer() \
.setInputCols(["document"]) \
.setOutputCol("token")


sequenceClassifier_loaded = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_rct_biobert", "en", "clinical/models")\
.setInputCols(["document",'token'])\
.setOutputCol("class")


pipeline = Pipeline(stages=[
document_assembler, 
tokenizer,
sequenceClassifier_loaded   
])


data = spark.createDataFrame([["""Previous attempts to prevent all the unwanted postoperative responses to major surgery with an epidural hydrophilic opioid , morphine , have not succeeded . The authors ' hypothesis was that the lipophilic opioid fentanyl , infused epidurally close to the spinal-cord opioid receptors corresponding to the dermatome of the surgical incision , gives equal pain relief but attenuates postoperative hormonal and metabolic responses more effectively than does systemic fentanyl ."""]]).toDF("text")


result = pipeline.fit(data).transform(data)
val documenter = new DocumentAssembler() 
.setInputCol("text") 
.setOutputCol("document")


val tokenizer = new Tokenizer()
.setInputCols("document")
.setOutputCol("token")


val sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_rct_biobert", "en", "clinical/models")
.setInputCols(Array("document","token"))
.setOutputCol("class")


val pipeline = new Pipeline().setStages(Array(documenter, tokenizer, sequenceClassifier))


val data = Seq("Previous attempts to prevent all the unwanted postoperative responses to major surgery with an epidural hydrophilic opioid , morphine , have not succeeded . The authors ' hypothesis was that the lipophilic opioid fentanyl , infused epidurally close to the spinal-cord opioid receptors corresponding to the dermatome of the surgical incision , gives equal pain relief but attenuates postoperative hormonal and metabolic responses more effectively than does systemic fentanyl .").toDF("text")


val result = pipeline.fit(data).transform(data)
import nlu
nlu.load("en.med_ner.clinical_trials").predict("""Previous attempts to prevent all the unwanted postoperative responses to major surgery with an epidural hydrophilic opioid , morphine , have not succeeded . The authors ' hypothesis was that the lipophilic opioid fentanyl , infused epidurally close to the spinal-cord opioid receptors corresponding to the dermatome of the surgical incision , gives equal pain relief but attenuates postoperative hormonal and metabolic responses more effectively than does systemic fentanyl .""")

Results

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------+
|text                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |class       |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------+
|[Previous attempts to prevent all the unwanted postoperative responses to major surgery with an epidural hydrophilic opioid , morphine , have not succeeded . The authors ' hypothesis was that the lipophilic opioid fentanyl , infused epidurally close to the spinal-cord opioid receptors corresponding to the dermatome of the surgical incision , gives equal pain relief but attenuates postoperative hormonal and metabolic responses more effectively than does systemic fentanyl .]|[BACKGROUND]|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------+




Model Information

Model Name: bert_sequence_classifier_rct_biobert
Compatibility: Healthcare NLP 3.4.1+
License: Licensed
Edition: Official
Input Labels: [document, token]
Output Labels: [class]
Language: en
Size: 406.0 MB
Case sensitive: true
Max sentence length: 128

References

https://arxiv.org/abs/1710.06071

Benchmarking

label         precision  recall  f1-score  support
BACKGROUND    0.77       0.86    0.81      2000   
CONCLUSIONS   0.96       0.95    0.95      2000   
METHODS       0.96       0.98    0.97      2000   
OBJECTIVE     0.85       0.77    0.81      2000   
RESULTS       0.98       0.95    0.96      2000   
accuracy      0.9        0.9     0.9       10000  
macro-avg     0.9        0.9     0.9       10000  
weighted-avg  0.9        0.9     0.9       10000