Extract relations between problem, test, and findings in reports

Description

Finds relations between diagnoses, tests, and imaging findings in radiology reports. The model predicts one of two labels. 1: the two entities are related. 0: the two entities are not related.

Predicted Entities

0, 1


How to use

The table below lists the re_test_problem_finding RE model, its labels, the optimal NER model to pair it with, and the relation pairs that are meaningful to extract.

| RE MODEL                | RE MODEL LABELS | NER MODEL |
|-------------------------|-----------------|-----------|
| re_test_problem_finding | 0, 1            | ner_jsl   |

RE PAIRS:

["test-cerebrovascular_disease", "cerebrovascular_disease-test",
"test-communicable_disease", "communicable_disease-test",
"test-diabetes", "diabetes-test",
"test-disease_syndrome_disorder", "disease_syndrome_disorder-test",
"test-heart_disease", "heart_disease-test",
"test-hyperlipidemia", "hyperlipidemia-test",
"test-hypertension", "hypertension-test",
"test-injury_or_poisoning", "injury_or_poisoning-test",
"test-kidney_disease", "kidney_disease-test",
"test-obesity", "obesity-test",
"test-oncological", "oncological-test",
"test-psychological_condition", "psychological_condition-test",
"test-symptom", "symptom-test",
"ekg_findings-disease_syndrome_disorder", "disease_syndrome_disorder-ekg_findings",
"ekg_findings-heart_disease", "heart_disease-ekg_findings",
"ekg_findings-symptom", "symptom-ekg_findings",
"imagingfindings-cerebrovascular_disease", "cerebrovascular_disease-imagingfindings",
"imagingfindings-communicable_disease", "communicable_disease-imagingfindings",
"imagingfindings-disease_syndrome_disorder", "disease_syndrome_disorder-imagingfindings",
"imagingfindings-heart_disease", "heart_disease-imagingfindings",
"imagingfindings-hyperlipidemia", "hyperlipidemia-imagingfindings",
"imagingfindings-hypertension", "hypertension-imagingfindings",
"imagingfindings-injury_or_poisoning", "injury_or_poisoning-imagingfindings",
"imagingfindings-kidney_disease", "kidney_disease-imagingfindings",
"imagingfindings-oncological", "oncological-imagingfindings",
"imagingfindings-psychological_condition", "psychological_condition-imagingfindings",
"imagingfindings-symptom", "symptom-imagingfindings",
"vs_finding-cerebrovascular_disease", "cerebrovascular_disease-vs_finding",
"vs_finding-communicable_disease", "communicable_disease-vs_finding",
"vs_finding-diabetes", "diabetes-vs_finding",
"vs_finding-disease_syndrome_disorder", "disease_syndrome_disorder-vs_finding",
"vs_finding-heart_disease", "heart_disease-vs_finding",
"vs_finding-hyperlipidemia", "hyperlipidemia-vs_finding",
"vs_finding-hypertension", "hypertension-vs_finding",
"vs_finding-injury_or_poisoning", "injury_or_poisoning-vs_finding",
"vs_finding-kidney_disease", "kidney_disease-vs_finding",
"vs_finding-obesity", "obesity-vs_finding",
"vs_finding-oncological", "oncological-vs_finding",
"vs_finding-overweight", "overweight-vs_finding",
"vs_finding-psychological_condition", "psychological_condition-vs_finding",
"vs_finding-symptom", "symptom-vs_finding"]
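
Every pair in the list appears in both directions (for example test-symptom and symptom-test). As a minimal sketch, such a list could be generated rather than typed by hand. The helper name `bidirectional_pairs` is hypothetical and not part of Spark NLP; the entity labels are copied from the list above.

```python
from typing import Iterable, List


def bidirectional_pairs(sources: Iterable[str], targets: Iterable[str]) -> List[str]:
    """Return "source-target" and "target-source" for every combination.

    Hypothetical helper for illustration only; it is not a Spark NLP API.
    """
    targets = list(targets)  # allow re-iteration for each source
    pairs: List[str] = []
    for source in sources:
        for target in targets:
            pairs.append(f"{source}-{target}")
            pairs.append(f"{target}-{source}")
    return pairs


# Rebuild the EKG findings subset of the RE PAIRS list.
ekg_pairs = bidirectional_pairs(
    ["ekg_findings"],
    ["disease_syndrome_disorder", "heart_disease", "symptom"],
)
print(ekg_pairs)
```

A list built this way can be passed to setRelationPairs in place of a hand-written one.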
# Python. Assumes `spark` is an active session started with Spark NLP for Healthcare.
from pyspark.ml import Pipeline
from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *

# Turn raw text into a document annotation
documenter = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Split documents into sentences, then sentences into tokens
sentencer = SentenceDetector()\
    .setInputCols(["document"])\
    .setOutputCol("sentences")

tokenizer = Tokenizer()\
    .setInputCols(["sentences"])\
    .setOutputCol("tokens")

# Clinical word embeddings, shared by the NER and RE models
words_embedder = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
    .setInputCols(["sentences", "tokens"])\
    .setOutputCol("embeddings")

pos_tagger = PerceptronModel.pretrained("pos_clinical", "en", "clinical/models")\
    .setInputCols(["sentences", "tokens"])\
    .setOutputCol("pos_tags")

# Detect clinical entities and merge the tags into chunks
ner_tagger = MedicalNerModel.pretrained("jsl_ner_wip_clinical", "en", "clinical/models")\
    .setInputCols(["sentences", "tokens", "embeddings"])\
    .setOutputCol("ner_tags")

ner_chunker = NerConverterInternal()\
    .setInputCols(["sentences", "tokens", "ner_tags"])\
    .setOutputCol("ner_chunks")

# Dependency parse, used to measure the syntactic distance between chunks
dependency_parser = DependencyParserModel.pretrained("dependency_conllu", "en")\
    .setInputCols(["sentences", "pos_tags", "tokens"])\
    .setOutputCol("dependencies")

# Relation extraction, restricted to the listed entity pairs
re_model = RelationExtractionModel.pretrained("re_test_problem_finding", "en", "clinical/models")\
    .setInputCols(["embeddings", "pos_tags", "ner_chunks", "dependencies"])\
    .setOutputCol("relations")\
    .setMaxSyntacticDistance(4)\
    .setPredictionThreshold(0.9)\
    .setRelationPairs(["procedure-symptom"])

nlp_pipeline = Pipeline(stages=[documenter, sentencer, tokenizer, words_embedder, pos_tagger, ner_tagger, ner_chunker, dependency_parser, re_model])

# All stages are pretrained, so fitting on an empty DataFrame is enough
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))

annotations = light_pipeline.fullAnnotate("""Targeted biopsy of this lesion for histological correlation should be considered.""")
// Scala. Assumes `spark` is an active SparkSession and that the Spark NLP and
// Spark NLP for Healthcare annotators are already imported.
import org.apache.spark.ml.Pipeline
import spark.implicits._

val documenter = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val sentencer = new SentenceDetector()
    .setInputCols(Array("document"))
    .setOutputCol("sentences")

val tokenizer = new Tokenizer()
    .setInputCols(Array("sentences"))
    .setOutputCol("tokens")

val words_embedder = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
    .setInputCols(Array("sentences", "tokens"))
    .setOutputCol("embeddings")

val pos_tagger = PerceptronModel.pretrained("pos_clinical", "en", "clinical/models")
    .setInputCols(Array("sentences", "tokens"))
    .setOutputCol("pos_tags")

val ner_tagger = MedicalNerModel.pretrained("jsl_ner_wip_clinical", "en", "clinical/models")
    .setInputCols(Array("sentences", "tokens", "embeddings"))
    .setOutputCol("ner_tags")

val ner_chunker = new NerConverterInternal()
    .setInputCols(Array("sentences", "tokens", "ner_tags"))
    .setOutputCol("ner_chunks")

val dependency_parser = DependencyParserModel.pretrained("dependency_conllu", "en")
    .setInputCols(Array("sentences", "pos_tags", "tokens"))
    .setOutputCol("dependencies")

val re_model = RelationExtractionModel.pretrained("re_test_problem_finding", "en", "clinical/models")
    .setInputCols(Array("embeddings", "pos_tags", "ner_chunks", "dependencies"))
    .setOutputCol("relations")
    .setMaxSyntacticDistance(4)
    .setPredictionThreshold(0.9)
    .setRelationPairs(Array("procedure-symptom"))

val nlp_pipeline = new Pipeline().setStages(Array(documenter, sentencer, tokenizer, words_embedder, pos_tagger, ner_tagger, ner_chunker, dependency_parser, re_model))

val data = Seq("""Targeted biopsy of this lesion for histological correlation should be considered.""").toDS.toDF("text")

val result = nlp_pipeline.fit(data).transform(data)

Results

| index | relations | entity1   | chunk1 | entity2 | chunk2 |
|-------|-----------|-----------|--------|---------|--------|
| 0     | 1         | PROCEDURE | biopsy | SYMPTOM | lesion |
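
A table like the one above can be assembled from the relations column of the fullAnnotate output. The sketch below is illustrative and rests on two assumptions: each relation annotation exposes a result attribute (the predicted label) and a metadata dictionary with the keys entity1, chunk1, entity2, and chunk2. The function name `relations_to_rows` is hypothetical, and a SimpleNamespace stand-in carrying the values from the table replaces a real annotation so that the snippet runs without Spark.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def relations_to_rows(annotation: Dict[str, List[Any]], column: str = "relations") -> List[Dict[str, Any]]:
    """Flatten relation annotations into one dictionary per relation.

    Hypothetical helper; assumes each item has `.result` and `.metadata`.
    """
    rows: List[Dict[str, Any]] = []
    for index, rel in enumerate(annotation.get(column, [])):
        meta = rel.metadata
        rows.append({
            "index": index,
            "relations": rel.result,  # "1" = related, "0" = not related
            "entity1": meta["entity1"],
            "chunk1": meta["chunk1"],
            "entity2": meta["entity2"],
            "chunk2": meta["chunk2"],
        })
    return rows


# Stand-in for one element of the fullAnnotate output, using the values
# shown in the Results table (not produced by running the model here).
example = {
    "relations": [
        SimpleNamespace(
            result="1",
            metadata={"entity1": "PROCEDURE", "chunk1": "biopsy",
                      "entity2": "SYMPTOM", "chunk2": "lesion"},
        )
    ]
}

rows = relations_to_rows(example)
# Keep only the pairs the model labelled as related.
related = [row for row in rows if row["relations"] == "1"]
print(related)
```

With the pipeline from the How to use section, the call would be relations_to_rows(annotations[0]).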

Model Information

Model Name: re_test_problem_finding
Type: re
Compatibility: Healthcare NLP 2.7.1+
License: Licensed
Edition: Official
Input Labels: [embeddings, pos_tags, train_ner_chunks, dependencies]
Output Labels: [relations]
Language: en

Data Source

Trained on internal datasets.