Relation extraction between body parts and problem entities

Description

This model extracts relations between body part entities and problem entities (diagnoses, symptoms, etc.) in clinical texts.

Predicted Entities

1 : There is a relation between the body part entity and the entity labeled as a problem (diagnosis, symptom, etc.).

0 : There is no relation between the body part entity and the entity labeled as a problem (diagnosis, symptom, etc.).


How to use


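import sparknlp
from pyspark.ml import Pipeline
from sparknlp.base import LightPipeline
from sparknlp_jsl.annotator import RelationExtractionModel

# Assumed to exist already: an active licensed `spark` session and the stages
# documenter, sentencer, tokenizer, words_embedder, pos_tagger, ner_chunker and
# dependency_parser. Their output column names must match the input columns below.
ner_tagger = sparknlp.annotators.NerDLModel()\
    .pretrained('jsl_ner_wip_greedy_clinical','en','clinical/models')\
    .setInputCols("sentences", "tokens", "embeddings")\
    .setOutputCol("ner_tags")

# Only symptom / external body part pairs are scored as relation candidates.
reModel = RelationExtractionModel.pretrained("re_bodypart_problem","en","clinical/models")\
    .setInputCols(["embeddings","chunk","pos","dependency"])\
    .setOutputCol("relations")\
    .setRelationPairs(['symptom-external_body_part_or_region'])

pipeline = Pipeline(stages=[documenter, sentencer, tokenizer, words_embedder, pos_tagger, ner_tagger, ner_chunker, dependency_parser, reModel])

# Every stage is pretrained, so fitting on an empty DataFrame only assembles the pipeline.
model = pipeline.fit(spark.createDataFrame([[""]]).toDF("text"))

results = LightPipeline(model).fullAnnotate('''No neurologic deficits other than some numbness in his left hand.''')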

Results

| index | relations | entity1 | entity1_begin | entity1_end | chunk1              | entity2                      | entity2_begin | entity2_end | chunk2 | confidence |
|-------|-----------|---------|---------------|-------------|---------------------|------------------------------|---------------|-------------|--------|------------|
| 0     | 0         | Symptom | 3             | 21          | neurologic deficits | external_body_part_or_region | 60            | 63          | hand   | 0.999998   |
| 1     | 1         | Symptom | 39            | 46          | numbness            | external_body_part_or_region | 60            | 63          | hand   | 1          |
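A table like the one above can be assembled by flattening the relation annotations into rows. The following is a minimal sketch, not the library's own helper. It assumes each relation annotation exposes a `result` (the predicted label) and a `metadata` dictionary whose keys mirror the table columns. `FakeAnnotation`, `relations_to_rows`, and `related_pairs` are hypothetical names introduced here, and the sample values are copied from the table as strings so the sketch runs without Spark.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAnnotation:
    """Hypothetical stand-in for an annotation: only .result and .metadata are used."""
    result: str
    metadata: dict = field(default_factory=dict)


# Assumed metadata keys, taken from the column names of the Results table.
COLUMNS = ["entity1", "entity1_begin", "entity1_end", "chunk1",
           "entity2", "entity2_begin", "entity2_end", "chunk2", "confidence"]


def relations_to_rows(annotations):
    """Flatten relation annotations into one dict per candidate pair."""
    rows = []
    for ann in annotations:
        row = {"relations": ann.result}
        for key in COLUMNS:
            row[key] = ann.metadata.get(key)  # None if a key is absent
        rows.append(row)
    return rows


def related_pairs(rows, min_confidence=0.5):
    """Keep only pairs predicted as related (label "1") above a confidence cutoff."""
    return [(r["chunk1"], r["chunk2"]) for r in rows
            if r["relations"] == "1" and float(r["confidence"]) >= min_confidence]


# Sample annotations built from the two rows of the Results table.
sample = [
    FakeAnnotation("0", {"entity1": "Symptom", "entity1_begin": "3", "entity1_end": "21",
                         "chunk1": "neurologic deficits",
                         "entity2": "external_body_part_or_region",
                         "entity2_begin": "60", "entity2_end": "63",
                         "chunk2": "hand", "confidence": "0.999998"}),
    FakeAnnotation("1", {"entity1": "Symptom", "entity1_begin": "39", "entity1_end": "46",
                         "chunk1": "numbness",
                         "entity2": "external_body_part_or_region",
                         "entity2_begin": "60", "entity2_end": "63",
                         "chunk2": "hand", "confidence": "1"}),
]

rows = relations_to_rows(sample)
print(related_pairs(rows))  # [('numbness', 'hand')]
```

With a real pipeline, the same function would presumably be called on `results[0]["relations"]`, provided the annotations carry these metadata keys.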

Model Information

Model Name: re_bodypart_problem
Type: re
Compatibility: Spark NLP 2.7.1+
License: Licensed
Edition: Official
Input Labels: [embeddings, pos_tags, train_ner_chunks, dependencies]
Output Labels: [relations]
Language: en
Dependencies: embeddings_clinical

Data Source

Trained on custom datasets annotated internally

Benchmarking

| relation | recall | precision |
|----------|--------|-----------|
| 0        | 0.72   | 0.82      |
| 1        | 0.94   | 0.91      |
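The table reports recall and precision only. As a rough aid for comparing the two classes, the sketch below derives an F1 score (the harmonic mean of precision and recall) from those figures. The F1 values are computed here from the rounded numbers above and are not reported by the source; `f1_score` is a hypothetical helper name.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recall and precision copied from the benchmarking table.
benchmark = {
    "0": {"recall": 0.72, "precision": 0.82},
    "1": {"recall": 0.94, "precision": 0.91},
}

f1 = {label: round(f1_score(m["precision"], m["recall"]), 2)
      for label, m in benchmark.items()}
print(f1)  # {'0': 0.77, '1': 0.92}
```

Because the inputs are already rounded to two decimals, the derived scores are approximate.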