Description
Deep learning model that assigns an assertion status to clinical entities.
Assertion Status
hypothetical, present, absent, possible, conditional, associated_with_someone_else.
How to use
Use as part of an NLP pipeline with the following stages: DocumentAssembler, SentenceDetector, Tokenizer, WordEmbeddingsModel, NerDLModel, NerConverter, AssertionDLModel.
...
# Clinical word embeddings consumed by both the NER and the assertion model
word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models") \
    .setInputCols(["sentence", "token"]) \
    .setOutputCol("embeddings")

# Token-level clinical NER
clinical_ner = NerDLModel.pretrained("ner_clinical", "en", "clinical/models") \
    .setInputCols(["sentence", "token", "embeddings"]) \
    .setOutputCol("ner")

# Merge token-level NER tags into entity chunks
ner_converter = NerConverter() \
    .setInputCols(["sentence", "token", "ner"]) \
    .setOutputCol("ner_chunk")

# Assign an assertion status to each entity chunk
clinical_assertion = AssertionDLModel.pretrained("assertion_dl_healthcare", "en", "clinical/models") \
    .setInputCols(["document", "ner_chunk", "embeddings"]) \
    .setOutputCol("assertion")

nlpPipeline = Pipeline(stages=[document_assembler, sentence_detector, tokenizer, word_embeddings, clinical_ner, ner_converter, clinical_assertion])

# Build the input DataFrame once so the same data is used for fit and transform
data = spark.createDataFrame([['Patient has a headache for the last 2 weeks and appears anxious when she walks fast. No alopecia noted. She denies pain']]).toDF("text")
model = nlpPipeline.fit(data)
results = model.transform(data)
...
// Clinical word embeddings consumed by both the NER and the assertion model
val word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
  .setInputCols(Array("sentence", "token"))
  .setOutputCol("embeddings")

// Token-level clinical NER
val clinical_ner = NerDLModel.pretrained("ner_clinical", "en", "clinical/models")
  .setInputCols(Array("sentence", "token", "embeddings"))
  .setOutputCol("ner")

// Merge token-level NER tags into entity chunks
val ner_converter = new NerConverter()
  .setInputCols(Array("sentence", "token", "ner"))
  .setOutputCol("ner_chunk")

// Assign an assertion status to each entity chunk
val clinical_assertion = AssertionDLModel.pretrained("assertion_dl_healthcare", "en", "clinical/models")
  .setInputCols(Array("document", "ner_chunk", "embeddings"))
  .setOutputCol("assertion")

val pipeline = new Pipeline().setStages(Array(document_assembler, sentence_detector, tokenizer, word_embeddings, clinical_ner, ner_converter, clinical_assertion))

// toDF on a Seq needs the session implicits in scope
import spark.implicits._
val data = Seq("Patient has a headache for the last 2 weeks and appears anxious when she walks fast. No alopecia noted. She denies pain").toDF("text")
val result = pipeline.fit(data).transform(data)
Result
| | chunks | entities| assertion |
|--:|-----------:|--------:|------------:|
| 0 | a headache | PROBLEM | present |
| 1 | anxious | PROBLEM | conditional |
| 2 | alopecia | PROBLEM | absent |
| 3 | pain | PROBLEM | absent |
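As a rough illustration of how these labels might be used downstream, the sketch below groups the chunks from the table by assertion status. It is plain Python over rows copied by hand from the table, not the Spark DataFrame the pipeline returns, and the helper name `group_by_assertion` is hypothetical.

```python
from collections import defaultdict

# Rows copied from the Result table: (chunk, entity, assertion).
rows = [
    ("a headache", "PROBLEM", "present"),
    ("anxious", "PROBLEM", "conditional"),
    ("alopecia", "PROBLEM", "absent"),
    ("pain", "PROBLEM", "absent"),
]


def group_by_assertion(rows):
    """Map each assertion status to the chunks that received it (hypothetical helper)."""
    grouped = defaultdict(list)
    for chunk, _entity, assertion in rows:
        grouped[assertion].append(chunk)
    return dict(grouped)


grouped = group_by_assertion(rows)
for status, chunks in grouped.items():
    print(f"{status}: {chunks}")
```

Grouping this way separates findings the note affirms ("a headache") from those it negates ("alopecia", "pain") or ties to a condition ("anxious").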
Model Information
| Field | Value |
|:------|:------|
| Name | assertion_dl_healthcare |
| Type | AssertionDLModel |
| Compatibility | 2.6.0 |
| License | Licensed |
| Edition | Official |
| Input labels | [document, chunk, word_embeddings] |
| Output labels | [assertion] |
| Language | en |
| Case sensitive | False |
| Dependencies | embeddings_healthcare_100d |
Data Source
Trained using embeddings_clinical on the 2010 i2b2/VA challenge data on concepts, assertions, and relations in clinical text, available from https://portal.dbmi.hms.harvard.edu/projects/n2c2-nlp/
Benchmarking
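| label | prec | rec | f1 |
|:-----------------------------|-------:|-------:|-------:|
| absent | 0.9289 | 0.9466 | 0.9377 |
| present | 0.9433 | 0.9559 | 0.9496 |
| conditional | 0.6888 | 0.5 | 0.5794 |
| associated_with_someone_else | 0.9285 | 0.9122 | 0.9203 |
| hypothetical | 0.9079 | 0.8654 | 0.8862 |
| possible | 0.7 | 0.6146 | 0.6545 |
| macro-avg | 0.8496 | 0.7991 | 0.8236 |
| micro-avg | 0.9245 | 0.9245 | 0.9245 |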
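To make the relationship between the columns concrete, the sketch below recomputes f1 as the harmonic mean of precision and recall from the per-label values above, and derives the macro averages as unweighted means over the six labels. The variable names are illustrative, and because the inputs are already rounded to four decimals, the recomputed values agree with the table only to within rounding. Under these assumptions, the reported macro-avg f1 appears to match the harmonic mean of the macro precision and macro recall rather than the mean of the per-label f1 scores.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# (precision, recall) per label, copied from the benchmarking table.
scores = {
    "absent": (0.9289, 0.9466),
    "present": (0.9433, 0.9559),
    "conditional": (0.6888, 0.5),
    "associated_with_someone_else": (0.9285, 0.9122),
    "hypothetical": (0.9079, 0.8654),
    "possible": (0.7, 0.6146),
}

per_label_f1 = {label: f1(p, r) for label, (p, r) in scores.items()}

# Macro averages weight every label equally, regardless of its support.
macro_precision = sum(p for p, _ in scores.values()) / len(scores)
macro_recall = sum(r for _, r in scores.values()) / len(scores)

# Two common definitions of macro f1; they generally differ.
macro_f1_of_averages = f1(macro_precision, macro_recall)
macro_f1_mean_of_labels = sum(per_label_f1.values()) / len(per_label_f1)

for label, value in per_label_f1.items():
    print(f"{label:30s} {value:.4f}")
print(f"macro precision             {macro_precision:.4f}")
print(f"macro recall                {macro_recall:.4f}")
print(f"macro f1 (of averages)      {macro_f1_of_averages:.4f}")
print(f"macro f1 (mean of labels)   {macro_f1_mean_of_labels:.4f}")
```

The identical micro-avg precision, recall, and f1 are what one would expect when each entity receives exactly one predicted label, since all three then reduce to overall accuracy.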