Extract relations between problem, test, and findings in reports


Find relations between diagnoses, tests, and imaging findings in radiology reports. The model predicts 1 when the two entities are related and 0 when they are not.

Predicted Entities

0, 1
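
As a minimal sketch of how the two labels might be interpreted downstream, the snippet below keeps only the entity pairs predicted as related. It assumes the label can arrive either as a string or as an integer; the helper names and the sample predictions are hypothetical and are not output from the model.

```python
# Meaning of the two predicted labels, as described above.
LABEL_MEANING = {"1": "related", "0": "not related"}


def describe(label):
    """Return the meaning of a label; str() tolerates an integer label."""
    return LABEL_MEANING[str(label)]


def related_only(predictions):
    """Keep the (chunk1, chunk2) pairs whose label is 1."""
    return [(c1, c2) for label, c1, c2 in predictions if str(label) == "1"]


# Hypothetical predictions, for illustration only: (label, chunk1, chunk2).
predictions = [
    ("1", "biopsy", "lesion"),
    ("0", "biopsy", "correlation"),
]

for label, c1, c2 in predictions:
    print(f"{c1} / {c2}: {describe(label)}")

print(related_only(predictions))
```

Normalizing the label with `str()` avoids a silent mismatch when comparing a string label against the integer 1.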


How to use

The table below lists the re_test_problem_finding RE model, its labels, the optimal NER model to use with it, and the meaningful relation pairs.

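| RE Model | RE Model Labels | NER Model | RE Pairs |
|----------|-----------------|-----------|----------|
| re_test_problem_finding | 0, 1 | ner_jsl | ["test-cerebrovascular_disease", "test-diabetes", "diabetes-test", "test-symptom", "symptom-test", "vs_finding-symptom", "symptom-vs_finding"] |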
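
The relation pairs listed above are ordered: each string names the first entity label, a hyphen, then the second entity label. The sketch below shows one way to check whether a candidate pair is in that list. The helper names are hypothetical, and the case-insensitive comparison is an assumption made for convenience.

```python
# Relation pairs copied from the list above; each is "<entity1>-<entity2>".
RELATION_PAIRS = [
    "test-cerebrovascular_disease",
    "test-diabetes", "diabetes-test",
    "test-symptom", "symptom-test",
    "vs_finding-symptom", "symptom-vs_finding",
]


def split_pair(pair):
    """Split a pair string at its single hyphen (labels use underscores, not hyphens)."""
    first, second = pair.split("-", 1)
    return first, second


# Set of (entity1, entity2) tuples for fast lookup.
ALLOWED = {split_pair(p) for p in RELATION_PAIRS}


def is_meaningful(entity1, entity2):
    """Hypothetical helper: True if the ordered pair appears in the list."""
    return (entity1.lower(), entity2.lower()) in ALLOWED


print(is_meaningful("Test", "Symptom"))
print(is_meaningful("cerebrovascular_disease", "test"))
```

Because the pairs are ordered, a direction that is not listed is treated as not meaningful. In a pipeline, a list of this shape is what one would typically pass to the RE model's `setRelationPairs` setter to restrict which entity pairs are scored.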
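from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, LightPipeline
from sparknlp.annotator import (SentenceDetector, Tokenizer, WordEmbeddingsModel,
                                PerceptronModel, DependencyParserModel)
from sparknlp_jsl.annotator import MedicalNerModel, NerConverterInternal, RelationExtractionModel

# Assumes `spark` is an active Healthcare NLP session.
documenter = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")  # "document" is an assumed column name

sentencer = SentenceDetector()\
    .setInputCols(["document"])\
    .setOutputCol("sentences")

tokenizer = Tokenizer()\
    .setInputCols(["sentences"])\
    .setOutputCol("tokens")

words_embedder = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
    .setInputCols(["sentences", "tokens"])\
    .setOutputCol("embeddings")

pos_tagger = PerceptronModel.pretrained("pos_clinical", "en", "clinical/models")\
    .setInputCols(["sentences", "tokens"])\
    .setOutputCol("pos_tags")

# ner_jsl is the NER model listed for this RE model in the table above
ner_tagger = MedicalNerModel.pretrained("ner_jsl", "en", "clinical/models")\
    .setInputCols(["sentences", "tokens", "embeddings"])\
    .setOutputCol("ner_tags")

ner_chunker = NerConverterInternal()\
    .setInputCols(["sentences", "tokens", "ner_tags"])\
    .setOutputCol("ner_chunks")

dependency_parser = DependencyParserModel.pretrained("dependency_conllu", "en")\
    .setInputCols(["sentences", "pos_tags", "tokens"])\
    .setOutputCol("dependencies")

re_model = RelationExtractionModel.pretrained("re_test_problem_finding", "en", "clinical/models")\
    .setInputCols(["embeddings", "pos_tags", "ner_chunks", "dependencies"])\
    .setOutputCol("relations")

nlp_pipeline = Pipeline(stages=[documenter, sentencer, tokenizer, words_embedder, pos_tagger, ner_tagger, ner_chunker, dependency_parser, re_model])

# Fit on an empty DataFrame: every stage is pretrained, so no training data is needed
light_pipeline = LightPipeline(nlp_pipeline.fit(spark.createDataFrame([['']]).toDF("text")))

annotations = light_pipeline.fullAnnotate("""Targeted biopsy of this lesion for histological correlation should be considered.""")
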
import org.apache.spark.ml.Pipeline
import spark.implicits._  // needed for .toDS; assumes `spark` is an active session

val documenter = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")  // "document" is an assumed column name

val sentencer = new SentenceDetector()
    .setInputCols(Array("document"))
    .setOutputCol("sentences")

val tokenizer = new Tokenizer()
    .setInputCols(Array("sentences"))
    .setOutputCol("tokens")

val words_embedder = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
    .setInputCols(Array("sentences", "tokens"))
    .setOutputCol("embeddings")

val pos_tagger = PerceptronModel.pretrained("pos_clinical", "en", "clinical/models")
    .setInputCols(Array("sentences", "tokens"))
    .setOutputCol("pos_tags")

// ner_jsl is the NER model listed for this RE model in the table above
val ner_tagger = MedicalNerModel.pretrained("ner_jsl", "en", "clinical/models")
    .setInputCols(Array("sentences", "tokens", "embeddings"))
    .setOutputCol("ner_tags")

val ner_chunker = new NerConverterInternal()
    .setInputCols(Array("sentences", "tokens", "ner_tags"))
    .setOutputCol("ner_chunks")

val dependency_parser = DependencyParserModel.pretrained("dependency_conllu", "en")
    .setInputCols(Array("sentences", "pos_tags", "tokens"))
    .setOutputCol("dependencies")

val re_model = RelationExtractionModel.pretrained("re_test_problem_finding", "en", "clinical/models")
    .setInputCols(Array("embeddings", "pos_tags", "ner_chunks", "dependencies"))
    .setOutputCol("relations")

val nlp_pipeline = new Pipeline().setStages(Array(documenter, sentencer, tokenizer, words_embedder, pos_tagger, ner_tagger, ner_chunker, dependency_parser, re_model))

val data = Seq("""Targeted biopsy of this lesion for histological correlation should be considered.""").toDS.toDF("text")

val result = nlp_pipeline.fit(data).transform(data)


Results

| index | relations | entity1   | chunk1 | entity2 | chunk2 |
|-------|-----------|-----------|--------|---------|--------|
| 0     | 1         | PROCEDURE | biopsy | SYMPTOM | lesion |
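
A minimal sketch of how rows like the one above could be assembled from relation annotations. It assumes each annotation exposes `.result` (the predicted label) and a `.metadata` dict with `entity1`, `chunk1`, `entity2`, and `chunk2` keys. Those key names, the `FakeAnnotation` stand-in, and the `relations_to_rows` helper are illustrative assumptions, not confirmed details of the library.

```python
from collections import namedtuple

# Stand-in for an annotation object so the sketch runs without Spark.
FakeAnnotation = namedtuple("FakeAnnotation", ["result", "metadata"])


def relations_to_rows(relations):
    """Flatten relation annotations into dicts matching the table columns."""
    rows = []
    for index, rel in enumerate(relations):
        meta = rel.metadata
        rows.append({
            "index": index,
            "relations": rel.result,
            "entity1": meta.get("entity1"),  # assumed metadata key
            "chunk1": meta.get("chunk1"),    # assumed metadata key
            "entity2": meta.get("entity2"),  # assumed metadata key
            "chunk2": meta.get("chunk2"),    # assumed metadata key
        })
    return rows


# Sample mirroring the single row in the table above.
sample = [
    FakeAnnotation(
        result="1",
        metadata={
            "entity1": "PROCEDURE", "chunk1": "biopsy",
            "entity2": "SYMPTOM", "chunk2": "lesion",
        },
    )
]

for row in relations_to_rows(sample):
    print(row)
```

With the Python pipeline above, the input would presumably be `annotations[0]["relations"]` in place of `sample`, assuming the RE stage writes to a column named `relations`.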

Model Information

Model Name: re_test_problem_finding
Type: re
Compatibility: Healthcare NLP 2.7.1+
License: Licensed
Edition: Official
Input Labels: [embeddings, pos_tags, train_ner_chunks, dependencies]
Output Labels: [relations]
Language: en

Data Source

Trained on internal datasets.