Sentence Entity Resolver for billable ICD10-CM HCC codes (Slim, JSL Medium Bert)

Description

This model maps extracted medical entities to ICD10-CM codes using sentence embeddings. It has been augmented with synonyms; synonyms with low cosine similarity were dropped, which makes the model slim. It uses the fine-tuned sbert_jsl_medium_uncased Sentence BERT model.
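
To illustrate the idea of dropping low-similarity synonyms, here is a minimal sketch in plain Python. It is not the procedure used to build this model: the function names, the toy 3-dimensional vectors, and the 0.8 threshold are all assumptions chosen for illustration, and the sketch does not guard against zero-length vectors.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def prune_synonyms(term_vec, synonym_vecs, threshold):
    """Keep only the synonyms whose similarity to the term meets the threshold."""
    return [
        name
        for name, vec in synonym_vecs.items()
        if cosine_similarity(term_vec, vec) >= threshold
    ]

# Toy vectors standing in for real sentence embeddings (illustrative only).
term = [1.0, 0.0, 0.0]
synonyms = {
    "close synonym": [0.9, 0.1, 0.0],
    "distant synonym": [0.1, 0.9, 0.2],
}

# The threshold value is hypothetical; the source does not state one.
kept = prune_synonyms(term, synonyms, threshold=0.8)
print(kept)
```

In this toy case only the synonym pointing in nearly the same direction as the term survives the cut.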

Predicted Entities

Outputs 7-digit billable ICD codes. In the result, look for the aux_label parameter in the metadata to get the HCC status. The HCC status can be split into three parts: billable status, HCC status, and HCC score.
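
Splitting the HCC status into its three parts could look like the sketch below. The `||` separator and the field order (billable, HCC status, HCC score) are assumptions to check against the aux_label values your pipeline actually returns; `HccInfo` and `parse_aux_label` are hypothetical helper names, not part of the library.

```python
from typing import NamedTuple

class HccInfo(NamedTuple):
    billable: str
    hcc_status: str
    hcc_score: str

def parse_aux_label(aux_label: str, sep: str = "||") -> HccInfo:
    """Split an aux_label string into its three parts.

    The separator and the field order are assumptions; adjust them to match
    the metadata produced by your pipeline.
    """
    parts = aux_label.split(sep)
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {aux_label!r}")
    return HccInfo(*parts)

# Example value shaped after the billable_hcc_status_score column in the Results table.
info = parse_aux_label("1||1||8")
print(info.billable, info.hcc_status, info.hcc_score)
```

Raising on an unexpected field count keeps a format mismatch from silently producing wrong HCC values.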


How to use

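from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import BertSentenceEmbeddings
from sparknlp_jsl.annotator import SentenceEntityResolverModel

document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")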

sbert_embedder = BertSentenceEmbeddings\
    .pretrained("sbert_jsl_medium_uncased","en","clinical/models")\
    .setInputCols(["document"])\
    .setOutputCol("sbert_embeddings")

icd10_resolver = SentenceEntityResolverModel\
    .pretrained("sbertresolve_icd10cm_slim_billable_hcc_med","en", "clinical/models")\
    .setInputCols(["document", "sbert_embeddings"])\
    .setOutputCol("icd10cm_code")\
    .setDistanceFunction("EUCLIDEAN")\
    .setReturnCosineDistances(True)

bert_pipeline_icd = Pipeline(stages = [document_assembler, sbert_embedder, icd10_resolver]) 

data = spark.createDataFrame([["metastatic lung cancer"]]).toDF("text") 

results = bert_pipeline_icd.fit(data).transform(data)

val document_assembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val sbert_embedder = BertSentenceEmbeddings
    .pretrained("sbert_jsl_medium_uncased","en","clinical/models")
    .setInputCols(Array("document"))
    .setOutputCol("sbert_embeddings")

val icd10_resolver = SentenceEntityResolverModel
    .pretrained("sbertresolve_icd10cm_slim_billable_hcc_med","en", "clinical/models") 
    .setInputCols(Array("document", "sbert_embeddings")) 
    .setOutputCol("icd10cm_code")
    .setDistanceFunction("EUCLIDEAN")
    .setReturnCosineDistances(true)

val bert_pipeline_icd = new Pipeline().setStages(Array(document_assembler, sbert_embedder, icd10_resolver))

val data = Seq("metastatic lung cancer").toDF("text")

val result = bert_pipeline_icd.fit(data).transform(data)

Results

|    | chunks                 | code   | resolutions                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | all_codes                                                                                              | billable_hcc_status_score   | all_distances                                                                                                            |
|---:|:-----------------------|:-------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------|:----------------------------|:-------------------------------------------------------------------------------------------------------------------------|
|  0 | metastatic lung cancer | C7800  | ['cancer metastatic to lung', 'metastasis from malignant tumor of lung', 'cancer metastatic to left lung', 'history of cancer metastatic to lung', 'metastatic cancer', 'history of cancer metastatic to lung (situation)', 'metastatic adenocarcinoma to bilateral lungs', 'cancer metastatic to chest wall', 'metastatic malignant neoplasm to left lower lobe of lung', 'metastatic carcinoid tumour', 'cancer metastatic to respiratory tract', 'metastatic carcinoid tumor'] | ['C7800', 'C349', 'C7801', 'Z858', 'C800', 'Z8511', 'C780', 'C798', 'C7802', 'C799', 'C7830', 'C7B00'] | ['1', '1', '8']             | ['0.0464', '0.0829', '0.0852', '0.0860', '0.0914', '0.0989', '0.1133', '0.1220', '0.1220', '0.1253', '0.1249', '0.1260'] |
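
The all_codes, resolutions, and all_distances columns are parallel lists, so candidates can be paired up and shortlisted by distance. The sketch below uses the first three candidates from the row above, copied as plain Python lists. `rank_candidates` and the 0.085 cutoff are hypothetical, and the sketch assumes that a smaller distance means a closer match.

```python
def rank_candidates(codes, resolutions, distances, max_distance):
    """Pair parallel lists, keep candidates within max_distance, sort by distance."""
    rows = [
        {"code": code, "resolution": text, "distance": float(dist)}
        for code, text, dist in zip(codes, resolutions, distances)
    ]
    rows = [row for row in rows if row["distance"] <= max_distance]
    return sorted(rows, key=lambda row: row["distance"])

# First three candidates from the Results table (distances arrive as strings).
codes = ["C7800", "C349", "C7801"]
resolutions = [
    "cancer metastatic to lung",
    "metastasis from malignant tumor of lung",
    "cancer metastatic to left lung",
]
distances = ["0.0464", "0.0829", "0.0852"]

# The cutoff is an illustrative assumption, not a recommended value.
shortlist = rank_candidates(codes, resolutions, distances, max_distance=0.085)
for row in shortlist:
    print(row["code"], row["distance"], row["resolution"])
```

With this cutoff the shortlist keeps C7800 and C349 and drops C7801.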

Model Information

Model Name: sbertresolve_icd10cm_slim_billable_hcc_med
Compatibility: Spark NLP for Healthcare 3.0.4+
License: Licensed
Edition: Official
Input Labels: [ner_chunk, sbert_embeddings]
Output Labels: [icd10_code]
Language: en
Case sensitive: false