Description
This model can be used to identify temporal relationships among clinical events.
Predicted Entities
AFTER, BEFORE, OVERLAP
How to use
Use as part of an NLP pipeline with the following stages: DocumentAssembler, SentenceDetector, Tokenizer, PerceptronModel, DependencyParserModel, WordEmbeddingsModel, NerDLModel, NerConverter, RelationExtractionModel.
The table below lists the re_temporal_events_clinical RE model, its labels, the optimal NER model, and the meaningful relation pairs.
RE MODEL | RE MODEL LABELS | NER MODEL | RE PAIRS |
---|---|---|---|
re_temporal_events_clinical | AFTER, BEFORE, OVERLAP | ner_events_clinical | [“No need to set pairs.”] |
# Assumes `spark` is an active session, e.g. one started with sparknlp_jsl.start(...).
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import (SentenceDetector, Tokenizer, PerceptronModel,
                                WordEmbeddingsModel, NerConverter, DependencyParserModel)
from sparknlp_jsl.annotator import MedicalNerModel, RelationExtractionModel

document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

sentence_detector = SentenceDetector()\
    .setInputCols(["document"])\
    .setOutputCol("sentences")

tokenizer = Tokenizer()\
    .setInputCols(["sentences"])\
    .setOutputCol("tokens")

pos_tagger = PerceptronModel.pretrained("pos_clinical", "en", "clinical/models")\
    .setInputCols(["sentences", "tokens"])\
    .setOutputCol("pos_tags")

word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
    .setInputCols(["sentences", "tokens"])\
    .setOutputCol("embeddings")

# ner_events_clinical is the NER model listed in the table above; it produces
# the OCCURRENCE, DATE and PROBLEM entities that appear in the results.
clinical_ner = MedicalNerModel.pretrained("ner_events_clinical", "en", "clinical/models")\
    .setInputCols(["sentences", "tokens", "embeddings"])\
    .setOutputCol("ner_tags")

ner_converter = NerConverter()\
    .setInputCols(["sentences", "tokens", "ner_tags"])\
    .setOutputCol("ner_chunks")

dependency_parser = DependencyParserModel.pretrained("dependency_conllu", "en")\
    .setInputCols(["sentences", "pos_tags", "tokens"])\
    .setOutputCol("dependencies")

clinical_re_Model = RelationExtractionModel.pretrained("re_temporal_events_clinical", "en", "clinical/models")\
    .setInputCols(["embeddings", "pos_tags", "ner_chunks", "dependencies"])\
    .setOutputCol("relations")\
    .setMaxSyntacticDistance(4)\
    .setPredictionThreshold(0.9)\
    .setRelationPairs(["date-problem", "occurrence-date"])  # Possible relation pairs. Default: All Relations.

nlp_pipeline = Pipeline(stages=[document_assembler, sentence_detector, tokenizer, pos_tagger, word_embeddings, clinical_ner, ner_converter, dependency_parser, clinical_re_Model])

# Fit on an empty DataFrame: every stage is pretrained, so no training data is needed.
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))

annotations = light_pipeline.fullAnnotate("""The patient is a 56-year-old right-handed female with longstanding intermittent right low back pain, who was involved in a motor vehicle accident in September of 2005. At that time, she did not notice any specific injury, but five days later, she started getting abnormal right low back pain.""")
Results
+----+------------+------------+-----------------+---------------+--------------------------+-----------+-----------------+---------------+---------------------+--------------+
| | relation | entity1 | entity1_begin | entity1_end | chunk1 | entity2 | entity2_begin | entity2_end | chunk2 | confidence |
+====+============+============+=================+===============+==========================+===========+=================+===============+=====================+==============+
| 0 | OVERLAP | OCCURRENCE | 121 | 144 | a motor vehicle accident | DATE | 149 | 165 | September of 2005 | 0.999975 |
+----+------------+------------+-----------------+---------------+--------------------------+-----------+-----------------+---------------+---------------------+--------------+
| 1 | OVERLAP | DATE | 171 | 179 | that time | PROBLEM | 201 | 219 | any specific injury | 0.956654 |
+----+------------+------------+-----------------+---------------+--------------------------+-----------+-----------------+---------------+---------------------+--------------+
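A table like the one above can be assembled from the relation annotations returned by fullAnnotate. The following is a minimal sketch, not the library's own helper: `relations_to_rows` and `min_confidence` are hypothetical names, and the metadata keys are assumed to match the column names of the results table. A stand-in object built from the two rows above replaces the real annotations so the snippet runs without a Spark session.

```python
from types import SimpleNamespace

# Metadata keys assumed to be present on each relation annotation;
# they mirror the columns of the results table.
FIELDS = ["entity1", "entity1_begin", "entity1_end", "chunk1",
          "entity2", "entity2_begin", "entity2_end", "chunk2"]
OFFSET_FIELDS = {"entity1_begin", "entity1_end", "entity2_begin", "entity2_end"}


def relations_to_rows(relations, min_confidence=0.0):
    """Flatten relation annotations into a list of dicts, one per relation.

    `relations` is any iterable of objects exposing `.result` (the label)
    and `.metadata` (a dict of strings). Hypothetical helper for illustration.
    """
    rows = []
    for rel in relations:
        confidence = float(rel.metadata["confidence"])
        if confidence < min_confidence:
            continue  # drop low-confidence predictions
        row = {"relation": rel.result}
        for key in FIELDS:
            value = rel.metadata[key]
            # Character offsets arrive as strings; convert them to int.
            row[key] = int(value) if key in OFFSET_FIELDS else value
        row["confidence"] = confidence
        rows.append(row)
    return rows


# Stand-in annotations copied from the results table above.
demo = [
    SimpleNamespace(result="OVERLAP", metadata={
        "entity1": "OCCURRENCE", "entity1_begin": "121", "entity1_end": "144",
        "chunk1": "a motor vehicle accident",
        "entity2": "DATE", "entity2_begin": "149", "entity2_end": "165",
        "chunk2": "September of 2005", "confidence": "0.999975"}),
    SimpleNamespace(result="OVERLAP", metadata={
        "entity1": "DATE", "entity1_begin": "171", "entity1_end": "179",
        "chunk1": "that time",
        "entity2": "PROBLEM", "entity2_begin": "201", "entity2_end": "219",
        "chunk2": "any specific injury", "confidence": "0.956654"}),
]

rows = relations_to_rows(demo)
for row in rows:
    print(row["relation"], row["chunk1"], "|", row["chunk2"], row["confidence"])
```

With the pipeline above, the same helper would be called as `relations_to_rows(annotations[0]["relations"])`, since "relations" is the output column of the RE model.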
Model Information
Model Name: | re_temporal_events_clinical |
Type: | re |
Compatibility: | Healthcare NLP 2.6.0 + |
Edition: | Official |
License: | Licensed |
Input Labels: | [embeddings, pos_tags, ner_chunks, dependencies] |
Output Labels: | [relations] |
Language: | [en] |
Case sensitive: | false |
Dependencies: | embeddings_clinical |
Data Source
Trained on data gathered and manually annotated by John Snow Labs. Source: https://portal.dbmi.hms.harvard.edu/projects/n2c2-nlp/
Benchmarking
|Relation | Recall | Precision | F1 |
|---------:|--------:|----------:|-----:|
| OVERLAP | 0.81 | 0.73 | 0.77 |
| BEFORE | 0.85 | 0.88 | 0.86 |
| AFTER | 0.38 | 0.46 | 0.43 |
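The F1 column can be sanity-checked against the recall and precision columns. The sketch below assumes F1 is the usual harmonic mean of precision and recall; the source does not say how the scores were computed. `f1_score` is a hypothetical helper. Recomputed from the rounded table values, OVERLAP and BEFORE match the published F1, while AFTER comes out near 0.42, slightly below the published 0.43, which may reflect rounding or a different averaging in the original evaluation.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision) pairs copied from the benchmarking table.
benchmarks = {
    "OVERLAP": (0.81, 0.73),
    "BEFORE": (0.85, 0.88),
    "AFTER": (0.38, 0.46),
}

recomputed = {
    label: round(f1_score(precision, recall), 2)
    for label, (recall, precision) in benchmarks.items()
}

for label, value in recomputed.items():
    print(f"{label}: F1 = {value}")
```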