Chunk Entity Resolver for ICD10 codes

Description

This model maps extracted medical entities to ICD10-GM codes for German-language text using chunk embeddings. The embeddings are augmented with synonyms, making them four times richer than those of the previous resolver.
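To make the idea of resolving a chunk through its embedding more concrete, here is a minimal sketch in plain Python. It is not the model's implementation: the three-dimensional vectors, the `CANDIDATES` table with its placeholder codes, and the `resolve` helper are all hypothetical, and cosine distance is assumed purely for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical candidate table: placeholder code -> toy embedding.
# Real embeddings have far more dimensions; these values are made up.
CANDIDATES = {
    "CODE_A": [1.0, 0.0, 0.0],
    "CODE_B": [0.0, 1.0, 0.0],
    "CODE_C": [0.7, 0.7, 0.0],
}


def resolve(chunk_embedding, candidates=CANDIDATES, top_k=2):
    """Rank candidate codes by ascending distance to the chunk embedding."""
    scored = [
        (code, cosine_distance(chunk_embedding, vec))
        for code, vec in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


# Toy query vector standing in for the embedding of an extracted chunk.
ranked = resolve([0.9, 0.1, 0.0])
for code, distance in ranked:
    print(code, round(distance, 4))
```

In this toy setup the closest candidate is returned first, followed by the next-closest alternatives with their distances.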

Predicted Entities

ICD10 codes


How to use


...
resolver = ChunkEntityResolverModel.pretrained("chunkresolve_ICD10GM_2021", "de", "clinical/models") \
    .setInputCols("token", "chunk_embeddings") \
    .setOutputCol("entity")

pipeline = Pipeline(stages = [documentAssembler, sentenceDetector, tokenizer, word_embeddings, clinical_ner, ner_converter, chunk_embeddings, resolver])

data = spark.createDataFrame([["metastatic lung cancer"]]).toDF("text")
model = pipeline.fit(data)
results = model.transform(data)
...


...
val resolver = ChunkEntityResolverModel.pretrained("chunkresolve_ICD10GM_2021", "de", "clinical/models")
    .setInputCols("token", "chunk_embeddings")
    .setOutputCol("entity")

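// Stages mirror the Python pipeline; the resolver needs the "token" and "chunk_embeddings" columns they produce.
val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, tokenizer, word_embeddings, clinical_ner, ner_converter, chunk_embeddings, resolver))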

val data = Seq("metastatic lung cancer").toDF("text")

val result = pipeline.fit(data).transform(data)

Results


|    | chunks | code | resolutions | all_codes | billable_hcc_status_score | all_distances |
|---:|:-------|:-----|:------------|:----------|:--------------------------|:--------------|
|  0 | metastatic lung cancer | C7800 | ['cancer metastatic to lung', 'metastasis from malignant tumor of lung', 'cancer metastatic to left lung', 'history of cancer metastatic to lung', 'metastatic cancer', 'history of cancer metastatic to lung (situation)', 'metastatic adenocarcinoma to bilateral lungs', 'cancer metastatic to chest wall', 'metastatic malignant neoplasm to left lower lobe of lung', 'metastatic carcinoid tumour', 'cancer metastatic to respiratory tract', 'metastatic carcinoid tumor'] | ['C7800', 'C349', 'C7801', 'Z858', 'C800', 'Z8511', 'C780', 'C798', 'C7802', 'C799', 'C7830', 'C7B00'] | ['1', '1', '8'] | ['0.0464', '0.0829', '0.0852', '0.0860', '0.0914', '0.0989', '0.1133', '0.1220', '0.1220', '0.1253', '0.1249', '0.1260'] |
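
The `all_codes` and `all_distances` cells in the row above each hold twelve entries. The sketch below shows one way to read them together in plain Python. It assumes the cells are available as stringified lists exactly as printed in the table, and that the two lists align by position; neither assumption is confirmed by this page. The `pair_candidates` helper is hypothetical and not part of the library.

```python
import ast

# Cell values copied verbatim from the results row (stringified lists).
all_codes_cell = (
    "['C7800', 'C349', 'C7801', 'Z858', 'C800', 'Z8511', "
    "'C780', 'C798', 'C7802', 'C799', 'C7830', 'C7B00']"
)
all_distances_cell = (
    "['0.0464', '0.0829', '0.0852', '0.0860', '0.0914', '0.0989', "
    "'0.1133', '0.1220', '0.1220', '0.1253', '0.1249', '0.1260']"
)


def pair_candidates(codes_cell, distances_cell):
    """Parse both cells and pair each code with its distance by position."""
    codes = ast.literal_eval(codes_cell)
    distances = [float(d) for d in ast.literal_eval(distances_cell)]
    if len(codes) != len(distances):
        raise ValueError("codes and distances differ in length")
    return list(zip(codes, distances))


candidates = pair_candidates(all_codes_cell, all_distances_cell)

# Pick the candidate with the smallest distance (assumed positional pairing).
best_code, best_distance = min(candidates, key=lambda pair: pair[1])
print(best_code, best_distance)  # prints: C7800 0.0464
```

Under these assumptions the smallest distance belongs to the same code that appears in the `code` column.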

Model Information

Model Name: chunkresolve_ICD10GM_2021
Compatibility: Healthcare NLP 3.0.0+
License: Licensed
Edition: Official
Input Labels: [token, chunk_embeddings]
Output Labels: [recognized]
Language: de