Detect concepts in drug development trials (BertForTokenClassification)

Description

This is a BertForTokenClassification-based NER model that identifies concepts related to drug development trials, such as Trial Group, End Point, Hazard Ratio, and other entities, in free text.

Predicted Entities

Patient_Count, Duration, End_Point, Value, Trial_Group, Hazard_Ratio, Total_Patients
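The classifier emits token-level tags in the BIO scheme, so each entity above corresponds to a B- (begin) and an I- (inside) tag, plus a single O tag for tokens outside any entity. A quick sketch of the resulting tag set (pure Python, independent of Spark NLP):

```python
# The seven entity types predicted by the model.
entities = [
    "Patient_Count", "Duration", "End_Point", "Value",
    "Trial_Group", "Hazard_Ratio", "Total_Patients",
]

# BIO scheme: B- marks the first token of a chunk, I- a continuation,
# and O any token that belongs to no entity.
bio_tags = ["O"] + [f"{prefix}-{e}" for e in entities for prefix in ("B", "I")]

print(len(bio_tags))  # 7 entities * 2 prefixes + O = 15 tags
```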


How to use

documentAssembler = DocumentAssembler()\
.setInputCol("text")\
.setOutputCol("document")


sentenceDetector = SentenceDetectorDLModel.pretrained() \
.setInputCols(["document"]) \
.setOutputCol("sentence") 


tokenizer = Tokenizer()\
.setInputCols("sentence")\
.setOutputCol("token")


tokenClassifier = MedicalBertForTokenClassifier.pretrained("bert_token_classifier_drug_development_trials", "en", "clinical/models")\
.setInputCols(["token", "sentence"])\
.setOutputCol("ner")

ner_converter = NerConverter()\
.setInputCols(["sentence","token","ner"])\
.setOutputCol("ner_chunk") 


pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier, ner_converter])


test_sentence = """In June 2003, the median overall survival with and without topotecan were 4.0 and 3.6 months, respectively. The best complete response ( CR ) , partial response ( PR ) , stable disease and progressive disease were observed in 23, 63, 55 and 33 patients, respectively, with topotecan, and 11, 61, 66 and 32 patients, respectively, without topotecan."""


data = spark.createDataFrame([[test_sentence]]).toDF('text')


result = pipeline.fit(data).transform(data)
Scala equivalent:

val documentAssembler = new DocumentAssembler()
.setInputCol("text")
.setOutputCol("document")


val sentenceDetector = SentenceDetectorDLModel.pretrained()
.setInputCols("document") 
.setOutputCol("sentence") 


val tokenizer = new Tokenizer()
.setInputCols("sentence")
.setOutputCol("token")


val tokenClassifier = MedicalBertForTokenClassifier.pretrained("bert_token_classifier_drug_development_trials", "en", "clinical/models")
.setInputCols(Array("token", "sentence"))
.setOutputCol("ner")


val ner_converter = new NerConverter()
.setInputCols(Array("sentence", "token", "ner"))
.setOutputCol("ner_chunk")


val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, tokenizer, tokenClassifier, ner_converter))


val data = Seq("In June 2003, the median overall survival with and without topotecan were 4.0 and 3.6 months, respectively. The best complete response ( CR ) , partial response ( PR ) , stable disease and progressive disease were observed in 23, 63, 55 and 33 patients, respectively, with topotecan, and 11, 61, 66 and 32 patients, respectively, without topotecan.").toDF("text")


val result = pipeline.fit(data).transform(data)
NLU equivalent:

import nlu
nlu.load("en.ner.drug_development_trials").predict("""In June 2003, the median overall survival with and without topotecan were 4.0 and 3.6 months, respectively. The best complete response ( CR ) , partial response ( PR ) , stable disease and progressive disease were observed in 23, 63, 55 and 33 patients, respectively, with topotecan, and 11, 61, 66 and 32 patients, respectively, without topotecan.""")

Results

+-----------------+-------------+
|chunk            |ner_label    |
+-----------------+-------------+
|median           |Duration     |
|overall survival |End_Point    |
|with             |Trial_Group  |
|without topotecan|Trial_Group  |
|4.0              |Value        |
|3.6 months       |Value        |
|23               |Patient_Count|
|63               |Patient_Count|
|55               |Patient_Count|
|33 patients      |Patient_Count|
|topotecan        |Trial_Group  |
|11               |Patient_Count|
|61               |Patient_Count|
|66               |Patient_Count|
|32 patients      |Patient_Count|
|without topotecan|Trial_Group  |
+-----------------+-------------+
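The ner_chunk column shown above is produced by the NerConverter stage, which merges consecutive B-/I- tagged tokens into entity chunks. Conceptually it works like this minimal pure-Python sketch (an illustration only, not the actual Spark NLP implementation):

```python
def merge_bio(tokens_with_tags):
    """Merge token-level BIO tags into (chunk, label) pairs.

    A B- tag starts a new chunk; an I- tag with the same label extends it;
    an O tag (or an inconsistent I- tag) closes the open chunk.
    """
    chunks, current_tokens, current_label = [], [], None
    for token, tag in tokens_with_tags:
        if tag.startswith("B-"):
            if current_tokens:
                chunks.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:  # O tag or label mismatch closes the current chunk
            if current_tokens:
                chunks.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        chunks.append((" ".join(current_tokens), current_label))
    return chunks

# Token/tag pairs as the classifier might emit them for a fragment above.
tagged = [
    ("overall", "B-End_Point"), ("survival", "I-End_Point"),
    ("with", "B-Trial_Group"), ("and", "O"),
    ("without", "B-Trial_Group"), ("topotecan", "I-Trial_Group"),
]
print(merge_bio(tagged))
# [('overall survival', 'End_Point'), ('with', 'Trial_Group'),
#  ('without topotecan', 'Trial_Group')]
```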

Model Information

Model Name: bert_token_classifier_drug_development_trials
Compatibility: Healthcare NLP 3.3.4+
License: Licensed
Edition: Official
Input Labels: [sentence, token]
Output Labels: [ner]
Language: en
Size: 404.4 MB
Case sensitive: true
Max sentence length: 256

References

Trained on data obtained from clinicaltrials.gov and annotated in-house.

Benchmarking

label            precision  recall  f1-score  support
B-Duration            0.93    0.94      0.93     1820
B-End_Point           0.99    0.98      0.98     5022
B-Hazard_Ratio        0.97    0.95      0.96      778
B-Patient_Count       0.81    0.88      0.85      300
B-Trial_Group         0.86    0.88      0.87     6751
B-Value               0.94    0.96      0.95     7675
I-Duration            0.71    0.82      0.76      185
I-End_Point           0.94    0.98      0.96     1491
I-Patient_Count       0.48    0.64      0.55       44
I-Trial_Group         0.78    0.75      0.77     4561
I-Value               0.93    0.95      0.94     1511
O                     0.96    0.95      0.95    47423
accuracy              0.94    0.94      0.94    77608
macro-avg             0.79    0.82      0.80    77608
weighted-avg          0.94    0.94      0.94    77608
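Each per-label f1-score above is the harmonic mean of that row's precision and recall (up to rounding, since the table shows two decimals):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Spot-check two rows of the benchmarking table above.
print(round(f1(0.93, 0.94), 2))  # B-Duration     -> 0.93
print(round(f1(0.97, 0.95), 2))  # B-Hazard_Ratio -> 0.96
```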