Detect Clinical Events (Admissions)

Description

This model detects clinical events in medical text, with a focus on admission entities.

Predicted Entities

DATE, TIME, PROBLEM, TEST, TREATMENT, OCCURRENCE, CLINICAL_DEPT, EVIDENTIAL, DURATION, FREQUENCY, ADMISSION, DISCHARGE.


How to use

...
word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
  .setInputCols(["sentence", "token"])\
  .setOutputCol("embeddings")
clinical_ner = MedicalNerModel.pretrained("ner_events_admission_clinical", "en", "clinical/models") \
  .setInputCols(["sentence", "token", "embeddings"]) \
  .setOutputCol("ner")
...
nlp_pipeline = Pipeline(stages=[document_assembler, sentence_detector, tokenizer, word_embeddings, clinical_ner, ner_converter])
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))
annotations = light_pipeline.fullAnnotate("The patient presented to the emergency room last evening")
...
val word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
  .setInputCols(Array("sentence", "token"))
  .setOutputCol("embeddings")
val ner = MedicalNerModel.pretrained("ner_events_admission_clinical", "en", "clinical/models")
  .setInputCols(Array("sentence", "token", "embeddings"))
  .setOutputCol("ner")
...
val pipeline = new Pipeline().setStages(Array(document_assembler, sentence_detector, tokenizer, word_embeddings, ner, ner_converter))
val data = Seq("The patient presented to the emergency room last evening").toDF("text")
val result = pipeline.fit(data).transform(data)
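
In the Python example, fullAnnotate returns a list with one dictionary per input text, keyed by output column name. The sketch below shows one way to flatten the chunk annotations into rows like those in the Results section. It assumes the elided ner_converter stage writes to a column named ner_chunk and stores the label under the entity metadata key; both are assumptions, and chunks_to_rows is a hypothetical helper, not part of Spark NLP. A stand-in object replaces the real annotation so the helper can be tried without a Spark session.

```python
from types import SimpleNamespace


def chunks_to_rows(annotated, chunk_col="ner_chunk"):
    """Flatten chunk annotations into (chunk, begin, end, entity) tuples.

    annotated: one element of the list returned by fullAnnotate.
    chunk_col: ASSUMED output column name of the NER converter stage.
    """
    rows = []
    for ann in annotated.get(chunk_col, []):
        # Each annotation is expected to expose result, begin, end, metadata.
        rows.append((ann.result, ann.begin, ann.end, ann.metadata.get("entity")))
    return rows


# Stand-in mimicking the attributes read above (not a real Spark NLP object).
demo = {
    "ner_chunk": [
        SimpleNamespace(
            result="last evening", begin=44, end=55, metadata={"entity": "DATE"}
        )
    ]
}

for row in chunks_to_rows(demo):
    print(row)
```

With the real pipeline, the call would look like chunks_to_rows(annotations[0]), adjusting chunk_col to whatever output column the converter stage actually uses.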

Results

+----+-----------------------------+---------+---------+-----------------+
|    | chunk                       |   begin |   end   |     entity      |
+====+=============================+=========+=========+=================+
|  0 | presented                   |    12   |    20   |   EVIDENTIAL    |
+----+-----------------------------+---------+---------+-----------------+
|  1 | the emergency room          |    25   |    42   |  CLINICAL_DEPT  |
+----+-----------------------------+---------+---------+-----------------+
|  2 | last evening                |    44   |    55   |     DATE        |
+----+-----------------------------+---------+---------+-----------------+
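
The begin and end columns read as 0-based character offsets into the input text, with end inclusive. A quick check in plain Python, using only the sentence and the rows shown above (slice_chunk is a hypothetical helper name):

```python
text = "The patient presented to the emergency room last evening"

# (chunk, begin, end, entity) rows copied from the results table
rows = [
    ("presented", 12, 20, "EVIDENTIAL"),
    ("the emergency room", 25, 42, "CLINICAL_DEPT"),
    ("last evening", 44, 55, "DATE"),
]


def slice_chunk(text, begin, end):
    # end is inclusive, so add 1 for Python's exclusive slice bound
    return text[begin:end + 1]


for chunk, begin, end, entity in rows:
    # Each table row should map back to the exact substring of the input.
    assert slice_chunk(text, begin, end) == chunk
    print(f"{entity:<14} {begin:>3}-{end:<3} {slice_chunk(text, begin, end)!r}")
```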

Model Information

Model Name: ner_events_admission_clinical
Compatibility: Spark NLP for Healthcare 3.0.0+
License: Licensed
Edition: Official
Input Labels: [sentence, token, embeddings]
Output Labels: [ner]
Language: en

Data Source

Trained on augmented and enriched i2b2 events data, using the embeddings_clinical word embeddings. The training data was enriched specifically for admission events.

Benchmarking

label            tp    fp   fn   prec        rec         f1
I-TIME           42    6    9    0.875       0.8235294   0.8484849
I-TREATMENT      1134  111  312  0.9108434   0.7842324   0.8428094
B-OCCURRENCE     406   344  382  0.5413333   0.51522845  0.52795845
I-DURATION       160   42   71   0.7920792   0.6926407   0.73903
B-DATE           500   32   49   0.9398496   0.9107468   0.92506933
I-DATE           309   54   49   0.8512397   0.8631285   0.8571429
B-ADMISSION      206   1    2    0.9951691   0.99038464  0.9927711
I-PROBLEM        2394  390  412  0.85991377  0.85317177  0.8565295
B-CLINICAL_DEPT  327   64   77   0.8363171   0.8094059   0.8226415
B-TIME           44    12   15   0.78571427  0.7457627   0.76521736
I-CLINICAL_DEPT  597   62   78   0.90591806  0.8844444   0.8950525
B-PROBLEM        1643  260  252  0.86337364  0.86701846  0.86519223
I-FREQUENCY      35    21   39   0.625       0.47297296  0.5384615
I-TEST           1082  171  117  0.86352754  0.9024187   0.8825449
B-TEST           781   125  127  0.8620309   0.86013216  0.86108047
B-TREATMENT      1283  176  202  0.87936944  0.8639731   0.87160325
B-DISCHARGE      155   0    1    1.0         0.99358976  0.99678457
B-EVIDENTIAL     269   25   75   0.914966    0.78197676  0.84326017
B-DURATION       97    43   44   0.69285715  0.6879433   0.6903914
B-FREQUENCY      70    16   33   0.81395346  0.6796116   0.7407407

tp: 11841 fp: 2366 fn: 2680 labels: 22
Macro-average	 prec: 0.8137135, rec: 0.7533389, f1: 0.7823631
Micro-average	 prec: 0.83346236, rec: 0.8154397, f1: 0.8243525
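
The prec, rec, and f1 columns follow from the tp, fp, and fn counts, and the micro-average applies the same formulas to the summed counts. A minimal sketch (prf is a hypothetical helper name) that recomputes the I-TIME row and the micro-average line from the figures reported above:

```python
def prf(tp, fp, fn):
    """Return (precision, recall, f1) from raw counts; 0.0 where undefined."""
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1


# I-TIME row: tp=42, fp=6, fn=9
print("I-TIME       ", prf(42, 6, 9))

# Micro-average: same formulas applied to the totals from the summary line
print("Micro-average", prf(11841, 2366, 2680))
```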