Detect concepts in drug development trials (BertForTokenClassification)


This BertForTokenClassification NER model identifies concepts related to drug development in free text, including Trial Groups, Efficacy and Safety End Points, Hazard Ratio, and others.

Predicted Entities

Hazard_Ratio, Confidence_Interval, Patient_Count, Trial_Group, Patient_Group, Duration, Confidence_level, P_Value, Confidence_Range, End_Point, Follow_Up, ADE, Value, DATE


How to use

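# Python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetectorDLModel, Tokenizer, NerConverter
from sparknlp_jsl.annotator import MedicalBertForTokenClassifier

# Assumes a Spark NLP for Healthcare session is already running and bound to `spark`.
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

sentenceDetector = SentenceDetectorDLModel.pretrained() \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

tokenizer = Tokenizer()\
    .setInputCols(["sentence"])\
    .setOutputCol("token")

tokenClassifier = MedicalBertForTokenClassifier.pretrained("bert_token_classifier_drug_development_trials", "en", "clinical/models")\
    .setInputCols("token", "sentence")\
    .setOutputCol("ner")

ner_converter = NerConverter()\
    .setInputCols(["sentence", "token", "ner"])\
    .setOutputCol("ner_chunk")

pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, tokenClassifier, ner_converter])

test_sentence = """In June 2003, the median overall survival  with and without topotecan were 4.0 and 3.6 months, respectively. The best complete response  ( CR ) , partial response  ( PR ) , stable disease and progressive disease were observed in 23, 63, 55 and 33 patients, respectively, with  topotecan,  and 11, 61, 66 and 32 patients, respectively, without topotecan."""

data = spark.createDataFrame([[test_sentence]]).toDF('text')

# Fit the pipeline (no training happens; all stages are pretrained) and annotate the text.
result = pipeline.fit(data).transform(data)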
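// Scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // for Seq(...).toDF; assumes a SparkSession named `spark`

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val sentenceDetector = SentenceDetectorDLModel.pretrained()
    .setInputCols(Array("document"))
    .setOutputCol("sentence")

val tokenizer = new Tokenizer()
    .setInputCols(Array("sentence"))
    .setOutputCol("token")

val tokenClassifier = BertForTokenClassification.pretrained("bert_token_classifier_drug_development_trials", "en", "clinical/models")
    .setInputCols(Array("token", "sentence"))
    .setOutputCol("ner")

val ner_converter = new NerConverter()
    .setInputCols(Array("sentence", "token", "ner"))
    .setOutputCol("ner_chunk")

val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, tokenizer, tokenClassifier, ner_converter))

val data = Seq("In June 2003, the median overall survival  with and without topotecan were 4.0 and 3.6 months, respectively. The best complete response  ( CR ) , partial response  ( PR ) , stable disease and progressive disease were observed in 23, 63, 55 and 33 patients, respectively, with  topotecan,  and 11, 61, 66 and 32 patients, respectively, without topotecan.").toDF("text")

val result = pipeline.fit(data).transform(data)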
# Python, using the NLU library
import nlu
nlu.load("en.ner.drug_development_trials").predict("""In June 2003, the median overall survival  with and without topotecan were 4.0 and 3.6 months, respectively. The best complete response  ( CR ) , partial response  ( PR ) , stable disease and progressive disease were observed in 23, 63, 55 and 33 patients, respectively, with  topotecan,  and 11, 61, 66 and 32 patients, respectively, without topotecan.""")


Results

+-----------------+-------------+
|chunk            |ner_label    |
+-----------------+-------------+
|median           |Duration     |
|overall survival |End_Point    |
|with             |Trial_Group  |
|without topotecan|Trial_Group  |
|4.0              |Value        |
|3.6 months       |Value        |
|23               |Patient_Count|
|63               |Patient_Count|
|55               |Patient_Count|
|33 patients      |Patient_Count|
|topotecan        |Trial_Group  |
|11               |Patient_Count|
|61               |Patient_Count|
|66               |Patient_Count|
|32 patients      |Patient_Count|
|without topotecan|Trial_Group  |
+-----------------+-------------+
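
The rows above are chunk-level, whereas the token classifier predicts one B-/I-/O tag per token (the same labels that appear in the per-label metrics further down); the NerConverter stage is what turns those tags into chunks. The following is a minimal plain-Python sketch of that merging idea only, not the library's implementation. The function name `merge_iob` is hypothetical, and the token tags in the example are illustrative assumptions chosen to be consistent with three rows of the table, not actual model output.

```python
from typing import List, Optional, Tuple


def merge_iob(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level B-/I-/O tags into (chunk_text, label) pairs."""
    chunks: List[Tuple[str, str]] = []
    words: List[str] = []          # tokens of the chunk being built
    label: Optional[str] = None    # label of the chunk being built

    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and label == tag[2:]:
            # Continuation of the open chunk.
            words.append(token)
            continue
        # Anything else closes the open chunk, if there is one.
        if words:
            chunks.append((" ".join(words), label))
        if tag.startswith("B-"):
            words, label = [token], tag[2:]
        else:
            # "O", or an I- tag with no matching open chunk: start nothing.
            words, label = [], None

    if words:
        chunks.append((" ".join(words), label))
    return chunks


# Illustrative tags (assumed, not model output) for a fragment of the test sentence.
tokens = ["overall", "survival", "were", "4.0", "and", "3.6", "months"]
tags = ["B-End_Point", "I-End_Point", "O", "B-Value", "O", "B-Value", "I-Value"]

print(merge_iob(tokens, tags))
# [('overall survival', 'End_Point'), ('4.0', 'Value'), ('3.6 months', 'Value')]
```

In this sketch an I- tag that does not continue an open chunk of the same label is simply dropped; the real converter may treat such cases differently.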

Model Information

Model Name: bert_token_classifier_drug_development_trials
Compatibility: Healthcare NLP 3.4.1+
License: Licensed
Edition: Official
Input Labels: [sentence, token]
Output Labels: [ner]
Language: en
Size: 400.7 MB
Case sensitive: true
Max sentence length: 256


Trained on data obtained from and annotated in-house.


Benchmarking

label                   precision recall    f1     support
B-ADE                   0.50      0.33      0.40         3
B-Confidence_Interval   0.46      1.00      0.63        12
B-Confidence_Range      1.00      0.98      0.99        42
B-Confidence_level      1.00      0.67      0.81        43
B-DATE                  0.95      0.93      0.94        40
B-Duration              1.00      0.82      0.90        11
B-End_Point             0.91      0.98      0.95        54
B-Follow_Up             1.00      1.00      1.00         2
B-Hazard_Ratio          0.77      1.00      0.87        24
B-P_Value               1.00      0.56      0.71         9
B-Patient_Count         1.00      0.95      0.97        19
B-Patient_Group         0.79      0.63      0.70        43
B-Trial_Group           0.96      0.94      0.95       274
B-Value                 0.98      0.83      0.90        77
I-ADE                   0.71      1.00      0.83        12
I-Confidence_Range      0.98      1.00      0.99        43
I-DATE                  0.95      1.00      0.98        60
I-Duration              1.00      1.00      1.00         1
I-End_Point             0.92      1.00      0.96        44
I-Follow_Up             1.00      1.00      1.00         2
I-P_Value               0.82      1.00      0.90        18
I-Patient_Count         0.00      0.00      0.00         0
I-Patient_Group         0.79      0.94      0.86       187
I-Trial_Group           0.92      0.90      0.91       156
I-Value                 1.00      1.00      1.00        10
O                       0.98      0.98      0.98      2622
accuracy                -         -         0.96      3808
macro-avg               0.86      0.86      0.85      3808
weighted-avg            0.96      0.96      0.96      3808
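
As a reading aid for the table above, here is a small sketch of how the F1 column relates to precision and recall, and why the macro and weighted averages differ. It uses three rows copied from the table; the helper name `f1_score` is hypothetical, and the averages it prints cover only those three rows, so they are not the table's overall macro or weighted figures. Because the table's precision and recall are already rounded to two decimals, recomputing F1 from them can differ from the reported F1 by 0.01 on some rows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, support) copied from three rows of the table.
rows = {
    "B-ADE": (0.50, 0.33, 3),
    "B-Confidence_Interval": (0.46, 1.00, 12),
    "B-Hazard_Ratio": (0.77, 1.00, 24),
}

f1_by_label = {label: f1_score(p, r) for label, (p, r, _) in rows.items()}

# Macro average: every label counts equally, regardless of how often it occurs.
macro_f1 = sum(f1_by_label.values()) / len(f1_by_label)

# Weighted average: each label counts in proportion to its support.
total_support = sum(support for _, _, support in rows.values())
weighted_f1 = sum(f1_by_label[label] * rows[label][2] for label in rows) / total_support

for label, f1 in f1_by_label.items():
    print(f"{label:<24}{f1:.2f}")
print(f"macro (3 rows): {macro_f1:.2f}  weighted (3 rows): {weighted_f1:.2f}")
```

The weighted figure is pulled toward B-Hazard_Ratio, the best-supported of the three rows. The same effect explains why the table's weighted average (0.96) sits above its macro average (0.85): high-support labels such as O and B-Trial_Group dominate it. The zero-support I-Patient_Count row corresponds to the guarded case where precision and recall are both zero.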