Relation Extraction between Tests, Results, and Dates

Description

Extracts relations between lab test names and their findings, measurements, results, and dates.

Predicted Entities

is_finding_of, is_result_of, is_date_of, O.


How to use

Use as part of an NLP pipeline with the following stages: DocumentAssembler, SentenceDetector, Tokenizer, PerceptronModel, DependencyParserModel, WordEmbeddingsModel, NerDLModel, NerConverter, RelationExtractionModel.
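Because each stage reads columns produced by earlier stages, the order of the list matters. The sketch below is a plain-Python sanity check of that wiring, not Spark NLP code. The `STAGES` table and the `missing_inputs` helper are hypothetical. Column names follow the snippet on this page where it names them; `document` and the input columns of stages not shown on this page are assumptions.

```python
# Hypothetical wiring table: (stage name, input columns, output column).
# "document" and the inputs of stages not configured on this page are assumptions.
STAGES = [
    ("DocumentAssembler", ["text"], "document"),
    ("SentenceDetector", ["document"], "sentences"),
    ("Tokenizer", ["sentences"], "tokens"),
    ("PerceptronModel", ["sentences", "tokens"], "pos_tags"),
    ("DependencyParserModel", ["sentences", "pos_tags", "tokens"], "dependencies"),
    ("WordEmbeddingsModel", ["sentences", "tokens"], "embeddings"),
    ("NerDLModel", ["sentences", "tokens", "embeddings"], "ner_tags"),
    ("NerConverter", ["sentences", "tokens", "ner_tags"], "ner_chunks"),
    ("RelationExtractionModel",
     ["embeddings", "pos_tags", "ner_chunks", "dependencies"], "relations"),
]


def missing_inputs(stages, initial=("text",)):
    """Return (stage, column) pairs whose input column is not available yet."""
    available = set(initial)
    problems = []
    for name, inputs, output in stages:
        for col in inputs:
            if col not in available:
                problems.append((name, col))
        available.add(output)
    return problems


# An empty list means every stage finds its inputs among earlier outputs.
print(missing_inputs(STAGES))
```

Running the same check on a reordered list shows which stage would be starved of input, which is usually quicker than waiting for the pipeline to fail at fit time.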

The table below lists the labels of the re_test_result_date RE model, the optimal NER model to use with it, and the meaningful relation pairs.

| RE MODEL | RE MODEL LABELS | NER MODEL | RE PAIRS |
|----------|-----------------|-----------|----------|
| re_test_result_date | is_finding_of, is_result_of, is_date_of, O | ner_jsl | test-test_result, test_result-test, test-date, date-test, test-imagingfindings, imagingfindings-test, test-ekg_findings, ekg_findings-test, date-test_result, test_result-date, date-imagingfindings, imagingfindings-date, date-ekg_findings, ekg_findings-date |

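Relation pairs are plain strings of the form entity1-entity2, and the table lists each pair in both directions. A minimal sketch of checking a candidate list against the table before passing it to setRelationPairs is shown below. The names `LISTED_PAIRS`, `unlisted_pairs`, and `both_directions` are hypothetical helpers, not part of Spark NLP, and whether the library accepts pairs outside this list is not stated here.

```python
# Pairs copied from the RE PAIRS column of the table above.
LISTED_PAIRS = {
    "test-test_result", "test_result-test",
    "test-date", "date-test",
    "test-imagingfindings", "imagingfindings-test",
    "test-ekg_findings", "ekg_findings-test",
    "date-test_result", "test_result-date",
    "date-imagingfindings", "imagingfindings-date",
    "date-ekg_findings", "ekg_findings-date",
}


def unlisted_pairs(pairs):
    """Return the pairs that do not appear in the table (case-insensitive)."""
    return [p for p in pairs if p.strip().lower() not in LISTED_PAIRS]


def both_directions(entity_a, entity_b):
    """Build the two directed pair strings for a pair of entity labels."""
    return [f"{entity_a}-{entity_b}", f"{entity_b}-{entity_a}"]


candidates = both_directions("test", "date") + ["external_body_part_or_region-test"]
print(unlisted_pairs(candidates))
```

Any pair the check reports is one the table does not describe as meaningful for this model, so it is worth reviewing before use.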
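from pyspark.ml import Pipeline
from sparknlp.annotator import NerDLModel
from sparknlp.base import LightPipeline
from sparknlp_jsl.annotator import RelationExtractionModel

# Assumes an active `spark` session and that documenter, sentencer, tokenizer, words_embedder,
# pos_tagger, ner_chunker, and dependency_parser have already been defined.
clinical_ner_tagger = NerDLModel.pretrained("jsl_ner_wip_clinical", "en", "clinical/models")\
    .setInputCols(["sentences", "tokens", "embeddings"])\
    .setOutputCol("ner_tags")

re_model = RelationExtractionModel.pretrained("re_test_result_date", "en", "clinical/models")\
    .setInputCols(["embeddings", "pos_tags", "ner_chunks", "dependencies"])\
    .setOutputCol("relations")\
    .setMaxSyntacticDistance(4)\
    .setPredictionThreshold(0.9)\
    .setRelationPairs(["external_body_part_or_region-test"])
# setMaxSyntacticDistance default: 0. setPredictionThreshold default: 0.5.
# setRelationPairs restricts the candidate pairs (see the RE PAIRS column above); default: all relations.

nlp_pipeline = Pipeline(stages=[documenter, sentencer, tokenizer, words_embedder, pos_tagger, clinical_ner_tagger, ner_chunker, dependency_parser, re_model])

# Fit on an empty DataFrame to obtain a PipelineModel; the pretrained stages need no training data.
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))

annotations = light_pipeline.fullAnnotate("""He was advised chest X-ray or CT scan after checking his SpO2 which was <= 93%""")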

Results

| index | relations    | entity1      | chunk1              | entity2      |  chunk2 |
|-------|--------------|--------------|---------------------|--------------|---------|
| 0     | O            | TEST         | chest X-ray         | MEASUREMENTS |  93%    | 
| 1     | O            | TEST         | CT scan             | MEASUREMENTS |  93%    |
| 2     | is_result_of | TEST         | SpO2                | MEASUREMENTS |  93%    |
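A table like the one above can be built by flattening the relation annotations into rows. The sketch below assumes each relation exposes a `result` label and a `metadata` dict with the keys `entity1`, `chunk1`, `entity2`, and `chunk2`; the `Relation` namedtuple is a stand-in for the real annotation objects, and `relations_to_rows` is a hypothetical helper. The label O is treated here as "no relation".

```python
from collections import namedtuple

# Stand-in for a relation annotation; the real objects are assumed to expose
# `result` (the label) and `metadata` (a dict) in the same way.
Relation = namedtuple("Relation", ["result", "metadata"])


def relations_to_rows(relations, drop_negative=False):
    """Flatten relation annotations into dicts, optionally dropping label O."""
    rows = []
    for rel in relations:
        if drop_negative and rel.result == "O":
            continue
        meta = rel.metadata
        rows.append({
            "relation": rel.result,
            "entity1": meta["entity1"],
            "chunk1": meta["chunk1"],
            "entity2": meta["entity2"],
            "chunk2": meta["chunk2"],
        })
    return rows


def _rel(label, chunk1):
    # Mirrors the rows of the Results table above.
    return Relation(label, {"entity1": "TEST", "chunk1": chunk1,
                            "entity2": "MEASUREMENTS", "chunk2": "93%"})


sample = [_rel("O", "chest X-ray"), _rel("O", "CT scan"), _rel("is_result_of", "SpO2")]

for row in relations_to_rows(sample, drop_negative=True):
    print(row["relation"], row["chunk1"], row["chunk2"])
```

With the pipeline on this page, the input would be along the lines of `annotations[0]["relations"]`, assuming the output column is named relations as configured above.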

Model Information

Model Name: re_test_result_date
Type: re
Compatibility: Spark NLP for Healthcare 2.7.4+
License: Licensed
Edition: Official
Input Labels: [embeddings, pos_tags, train_ner_chunks, dependencies]
Output Labels: [relations]
Language: en

Data Source

Trained on internal data.

Benchmarking

| relation        | precision |
|-----------------|-----------|
| O               | 0.77      |
| is_finding_of   | 0.80      |
| is_result_of    | 0.96      |
| is_date_of      | 0.94      |