Detect Assertion Status (assertion_dl_healthcare)

Description

Assigns an assertion status to clinical entities using a deep learning model.

Assertion Status

hypothetical, present, absent, possible, conditional, associated_with_someone_else.


How to use

Use as part of an NLP pipeline with the following stages: DocumentAssembler, SentenceDetector, Tokenizer, WordEmbeddingsModel, NerDLModel, NerConverter, AssertionDLModel.

...
word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
  .setInputCols(["sentence", "token"])\
  .setOutputCol("embeddings")
clinical_ner = NerDLModel.pretrained("ner_clinical", "en", "clinical/models") \
  .setInputCols(["sentence", "token", "embeddings"]) \
  .setOutputCol("ner")
ner_converter = NerConverter() \
  .setInputCols(["sentence", "token", "ner"]) \
  .setOutputCol("ner_chunk")
clinical_assertion = AssertionDLModel.pretrained("assertion_dl_healthcare","en","clinical/models")\
    .setInputCols(["document","ner_chunk","embeddings"])\
    .setOutputCol("assertion")
    
nlpPipeline = Pipeline(stages=[document_assembler, sentence_detector, tokenizer, word_embeddings, clinical_ner, ner_converter, clinical_assertion])
 
# Build the input DataFrame once so it can be used for both fit and transform
data = spark.createDataFrame([['Patient has a headache for the last 2 weeks and appears anxious when she walks fast. No alopecia noted. She denies pain']]).toDF("text")
model = nlpPipeline.fit(data)
results = model.transform(data)
...
val word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
  .setInputCols(Array("sentence", "token"))
  .setOutputCol("embeddings")
val clinical_ner = NerDLModel.pretrained("ner_clinical", "en", "clinical/models")
  .setInputCols(Array("sentence", "token", "embeddings")) 
  .setOutputCol("ner")
val ner_converter = new NerConverter()
  .setInputCols(Array("sentence", "token", "ner"))
  .setOutputCol("ner_chunk")
val clinical_assertion = AssertionDLModel.pretrained("assertion_dl_healthcare","en","clinical/models")
     .setInputCols(Array("document", "ner_chunk", "embeddings"))
     .setOutputCol("assertion")
    
val pipeline = new Pipeline().setStages(Array(document_assembler, sentence_detector, tokenizer, word_embeddings, clinical_ner, ner_converter, clinical_assertion))

// toDS requires spark.implicits._ to be in scope
val data = Seq("Patient has a headache for the last 2 weeks and appears anxious when she walks fast. No alopecia noted. She denies pain").toDS.toDF("text")
val result = pipeline.fit(data).transform(data)

Result


|   | chunks     | entities| assertion   |
|--:|-----------:|--------:|------------:|
| 0 | a headache | PROBLEM | present     |
| 1 | anxious    | PROBLEM | conditional |
| 2 | alopecia   | PROBLEM | absent      |
| 3 | pain       | PROBLEM | absent      |
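
Downstream code usually needs the chunks grouped by their assertion status, for example to keep only the problems asserted as present. The following is a minimal pure-Python sketch of that step. It assumes the pipeline output has already been collected into `(chunk, entity, assertion)` tuples shaped like the table rows; `group_by_assertion` is a hypothetical helper, not part of the library.

```python
from collections import defaultdict

# Rows copied from the Result table: (chunk, entity, assertion).
rows = [
    ("a headache", "PROBLEM", "present"),
    ("anxious", "PROBLEM", "conditional"),
    ("alopecia", "PROBLEM", "absent"),
    ("pain", "PROBLEM", "absent"),
]


def group_by_assertion(rows):
    """Map each assertion status to the chunks that received it.

    Hypothetical helper for illustration; chunk order is preserved.
    """
    grouped = defaultdict(list)
    for chunk, _entity, assertion in rows:
        grouped[assertion].append(chunk)
    return dict(grouped)


grouped = group_by_assertion(rows)

# Problems the model asserts as present for the patient.
present_problems = grouped.get("present", [])
# Problems that the text explicitly negates.
absent_problems = grouped.get("absent", [])

print(present_problems)
print(absent_problems)
```

Using `.get` with an empty list as the default keeps the lookup safe for statuses that do not occur in a given document, such as `hypothetical` in this example.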

Model Information

Name: assertion_dl_healthcare  
Type: AssertionDLModel  
Compatibility: 2.6.0  
License: Licensed  
Edition: Official  
Input labels: [document, chunk, word_embeddings]  
Output labels: [assertion]  
Language: en  
Case sensitive: False  
Dependencies: embeddings_healthcare_100d  

Data Source

Trained with embeddings_clinical on the 2010 i2b2/VA challenge data on concepts, assertions, and relations in clinical text, available from https://portal.dbmi.hms.harvard.edu/projects/n2c2-nlp/

Benchmarking

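| label                        | precision | recall | f1     |
|-----------------------------:|----------:|-------:|-------:|
| absent                       | 0.9289    | 0.9466 | 0.9377 |
| present                      | 0.9433    | 0.9559 | 0.9496 |
| conditional                  | 0.6888    | 0.5    | 0.5794 |
| associated_with_someone_else | 0.9285    | 0.9122 | 0.9203 |
| hypothetical                 | 0.9079    | 0.8654 | 0.8862 |
| possible                     | 0.7       | 0.6146 | 0.6545 |
| macro-avg                    | 0.8496    | 0.7991 | 0.8236 |
| micro-avg                    | 0.9245    | 0.9245 | 0.9245 |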
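
The averaged rows can be checked against the per-label scores. The sketch below recomputes F1 from the precision and recall values in the benchmark table, using only the rounded figures shown there, so small differences in the last digit are expected. Under that assumption, the reported macro-avg F1 appears consistent with the harmonic mean of macro precision and macro recall rather than with the mean of the per-label F1 scores. The variable and function names are illustrative.

```python
# (precision, recall, reported f1) per label, copied from the benchmark table.
scores = {
    "absent": (0.9289, 0.9466, 0.9377),
    "present": (0.9433, 0.9559, 0.9496),
    "conditional": (0.6888, 0.5, 0.5794),
    "associated_with_someone_else": (0.9285, 0.9122, 0.9203),
    "hypothetical": (0.9079, 0.8654, 0.8862),
    "possible": (0.7, 0.6146, 0.6545),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for every label from its rounded precision and recall.
per_label_f1 = {label: f1(p, r) for label, (p, r, _) in scores.items()}

# Macro averages: unweighted means over the six labels.
macro_precision = sum(p for p, _, _ in scores.values()) / len(scores)
macro_recall = sum(r for _, r, _ in scores.values()) / len(scores)

# Two common ways to define a macro F1.
f1_of_macro_averages = f1(macro_precision, macro_recall)
mean_of_label_f1 = sum(per_label_f1.values()) / len(per_label_f1)

print(f"macro precision:          {macro_precision:.4f}")
print(f"macro recall:             {macro_recall:.4f}")
print(f"F1 of macro averages:     {f1_of_macro_averages:.4f}")
print(f"mean of per-label F1:     {mean_of_label_f1:.4f}")
```

The gap between the two macro F1 definitions comes mostly from the `conditional` and `possible` labels, whose low recall pulls the per-label mean down more than it pulls down the averaged precision and recall.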