Detect Assertion Status (assertion_bert_classifier_jsl_slim)

Description

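This model assigns an assertion status (present, absent, or possible) to clinical entities detected by an upstream NER model. It is loaded through the BertAssertionClassifier annotator.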

Predicted Entities

present, absent, possible


How to use

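# Python
# Assumes `spark` is an active Spark session with Spark NLP and Healthcare NLP loaded.
from pyspark.ml import Pipeline
from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *

document_assembler = DocumentAssembler() \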
     .setInputCol("text") \
     .setOutputCol("document")
sentence_detector = SentenceDetector() \
     .setInputCols(["document"]) \
     .setOutputCol("sentence")
tokenizer = Tokenizer() \
     .setInputCols(["sentence"]) \
     .setOutputCol("token")
word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models") \
     .setInputCols(["sentence", "token"]) \
     .setOutputCol("embeddings")
clinical_ner = MedicalNerModel.pretrained("ner_clinical", "en", "clinical/models") \
     .setInputCols(["sentence", "token", "embeddings"]) \
     .setOutputCol("ner")
ner_converter = NerConverterInternal() \
     .setInputCols(["sentence", "token", "ner"]) \
     .setOutputCol("ner_chunk")
clinical_assertion = BertAssertionClassifier.pretrained("assertion_bert_classifier_jsl_slim", "en", "clinical/models") \
     .setInputCols(["sentence", "ner_chunk"]) \
     .setOutputCol("assertion")
pipeline = Pipeline().setStages([
     document_assembler,
     sentence_detector,
     tokenizer,
     word_embeddings,
     clinical_ner,
     ner_converter,
     clinical_assertion
 ])
text = """Patient with severe fever and sore throat. He shows no stomach pain and he maintained on an epidural.
and PCA for pain control. He also became short of breath with climbing a flight of stairs. After CT,
lung tumor located at the right lower lobe. Father with Alzheimer."""
data = spark.createDataFrame([[text]]).toDF("text")
result_df = pipeline.fit(data).transform(data)
result_df.selectExpr("explode(assertion) as result")\
    .select("result.metadata.ner_chunk", "result.begin", "result.end","result.metadata.ner_label", "result.result")\
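    .show(100, False)

// Scala
// Assumes `spark` is an active SparkSession. MedicalNerModel, NerConverterInternal and
// BertAssertionClassifier come from the licensed Healthcare NLP library, which must
// also be on the classpath and imported.
import org.apache.spark.ml.Pipeline
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import spark.implicits._ // needed for Seq(...).toDF

val document_assembler = new DocumentAssembler()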
  .setInputCol("text")
  .setOutputCol("document")
val sentence_detector = new SentenceDetector()
  .setInputCols("document")
  .setOutputCol("sentence")
val tokenizer = new Tokenizer()
  .setInputCols("sentence")
  .setOutputCol("token")
val word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
  .setInputCols("sentence", "token")
  .setOutputCol("embeddings")
val clinical_ner = MedicalNerModel.pretrained("ner_clinical", "en", "clinical/models")
  .setInputCols("sentence", "token", "embeddings")
  .setOutputCol("ner")
val ner_converter = new NerConverterInternal()
  .setInputCols("sentence", "token", "ner")
  .setOutputCol("ner_chunk")
val clinical_assertion = BertAssertionClassifier.pretrained("assertion_bert_classifier_jsl_slim", "en", "clinical/models")
  .setInputCols("sentence", "ner_chunk")
  .setOutputCol("assertion")
val pipeline = new Pipeline().setStages(Array(
  document_assembler,
  sentence_detector,
  tokenizer,
  word_embeddings,
  clinical_ner,
  ner_converter,
  clinical_assertion
))
val text = """Patient with severe fever and sore throat. He shows no stomach pain and he maintained on an epidural.
           |and PCA for pain control. He also became short of breath with climbing a flight of stairs. After CT,
           |lung tumor located at the right lower lobe. Father with Alzheimer.""".stripMargin
val data = Seq(text).toDF("text")
val result_df = pipeline.fit(data).transform(data)
result_df.selectExpr("explode(assertion) as result")
  .select("result.metadata.ner_chunk", "result.begin", "result.end","result.metadata.ner_label", "result.result")
  .show(100, false)

Results

+---------------+-----+---+---------+-------+
|ner_chunk      |begin|end|ner_label|result |
+---------------+-----+---+---------+-------+
|severe fever   |13   |24 |PROBLEM  |present|
|sore throat    |30   |40 |PROBLEM  |present|
|stomach pain   |55   |66 |PROBLEM  |absent |
|an epidural    |89   |99 |TREATMENT|present|
|PCA            |106  |108|TREATMENT|present|
|pain control   |114  |125|TREATMENT|present|
|short of breath|143  |157|PROBLEM  |present|
|CT             |199  |200|TEST     |present|
|lung tumor     |203  |212|PROBLEM  |present|
|Alzheimer      |259  |267|PROBLEM  |present|
+---------------+-----+---+---------+-------+
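
The begin and end columns in the table above appear to be inclusive character offsets into the input text. The following sketch, in plain Python with no Spark dependency, checks that reading and picks out the negated entities. It assumes the sample text uses single newline characters at its line breaks with no trailing spaces; the rows are copied by hand from the table, and the helper name chunk_at is hypothetical.

```python
# Sample text from the example above; line breaks are assumed to be single "\n" characters.
text = (
    "Patient with severe fever and sore throat. He shows no stomach pain and he maintained on an epidural.\n"
    "and PCA for pain control. He also became short of breath with climbing a flight of stairs. After CT,\n"
    "lung tumor located at the right lower lobe. Father with Alzheimer."
)

# (ner_chunk, begin, end, ner_label, result), copied from the Results table.
rows = [
    ("severe fever", 13, 24, "PROBLEM", "present"),
    ("sore throat", 30, 40, "PROBLEM", "present"),
    ("stomach pain", 55, 66, "PROBLEM", "absent"),
    ("an epidural", 89, 99, "TREATMENT", "present"),
    ("PCA", 106, 108, "TREATMENT", "present"),
    ("pain control", 114, 125, "TREATMENT", "present"),
    ("short of breath", 143, 157, "PROBLEM", "present"),
    ("CT", 199, 200, "TEST", "present"),
    ("lung tumor", 203, 212, "PROBLEM", "present"),
    ("Alzheimer", 259, 267, "PROBLEM", "present"),
]


def chunk_at(text, begin, end):
    """Return the substring covered by the inclusive offsets [begin, end]."""
    return text[begin:end + 1]


# Rows whose offsets do not reproduce the chunk text (expected to be empty).
mismatches = [r for r in rows if chunk_at(text, r[1], r[2]) != r[0]]

# Entities the model marked as negated.
absent_chunks = [r[0] for r in rows if r[4] == "absent"]

print("offset mismatches:", mismatches)
print("absent entities:", absent_chunks)
```

Because end is inclusive, the slice needs end + 1; using text[begin:end] would drop the last character of every chunk.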

Model Information

Model Name: assertion_bert_classifier_jsl_slim
Compatibility: Healthcare NLP 5.5.3+
License: Licensed
Edition: Official
Input Labels: [document, ner_chunk]
Output Labels: [assertion]
Language: en
Size: 406.3 MB
Case sensitive: false

Benchmarking

       label  precision  recall  f1-score  support
      absent      0.988   0.931     0.959     2594
    possible      0.730   0.755     0.742      652
     present      0.964   0.979     0.971     8622
    accuracy          -       -     0.956    11868
   macro avg      0.894   0.888     0.891    11868
weighted avg      0.957   0.956     0.956    11868
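
To make the two average rows concrete, here is a small sketch that recomputes them from the per-class rows of the benchmark table. The macro average is the unweighted mean over the three classes, while the weighted average weights each class by its support. Since the inputs are already rounded to three decimals, the recomputed values may differ from the reported ones in the last digit. The names per_class, macro_avg and weighted_avg are illustrative only.

```python
# Per-class (precision, recall, f1-score, support), copied from the benchmark table.
per_class = {
    "absent": (0.988, 0.931, 0.959, 2594),
    "possible": (0.730, 0.755, 0.742, 652),
    "present": (0.964, 0.979, 0.971, 8622),
}

# Total number of evaluated examples across all classes.
total_support = sum(values[3] for values in per_class.values())


def macro_avg(metric_index):
    """Unweighted mean of one metric across classes."""
    return sum(values[metric_index] for values in per_class.values()) / len(per_class)


def weighted_avg(metric_index):
    """Mean of one metric across classes, weighted by class support."""
    weighted_sum = sum(values[metric_index] * values[3] for values in per_class.values())
    return weighted_sum / total_support


for index, name in enumerate(["precision", "recall", "f1-score"]):
    print(f"{name:>9}  macro={macro_avg(index):.3f}  weighted={weighted_avg(index):.3f}")
print("total support:", total_support)
```

The gap between the macro and weighted rows comes mostly from the possible class, which has the lowest scores and the smallest support, so it pulls the macro average down more than the weighted one.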