Description
This model maps extracted medical entities to ICD-10-CM codes using sbiobert_base_cased_mli Sentence BERT embeddings. It loads about 6x faster than previous versions, and the load process is more memory-friendly: peak memory during load is lower, which reduces the chance of OOM exceptions and relaxes hardware requirements. It has been augmented with synonyms, making it four times richer than the previous resolver. It also adds support for 7-digit codes with HCC status.
Predicted Entities
Outputs 7-digit billable ICD codes. In the result, look for the aux_label parameter in the metadata to get the HCC status. The HCC status can be split into three fields: billable status, HCC status, and HCC score. For example, in the result shared below the billable status is 1, the HCC status is 1, and the HCC score is 8.
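As a rough illustration of splitting the aux_label value into its three fields, the sketch below uses plain Python. The helper name `parse_aux_label`, the `HccInfo` tuple, and the `"||"` separator are assumptions made for this example, not part of the model's documented API; check the raw metadata string in your own output and adjust the separator if it differs.

```python
from typing import NamedTuple


class HccInfo(NamedTuple):
    """The three fields carried by aux_label (hypothetical container)."""
    billable_status: int
    hcc_status: int
    hcc_score: int


def parse_aux_label(aux_label: str, sep: str = "||") -> HccInfo:
    """Split an aux_label string into billable status, HCC status, HCC score.

    ASSUMPTION: the fields are joined by `sep`. The default "||" is a guess
    at the raw encoding; pass a different separator if your metadata differs.
    """
    parts = [p.strip() for p in aux_label.split(sep)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {aux_label!r}")
    billable_status, hcc_status, hcc_score = (int(p) for p in parts)
    return HccInfo(billable_status, hcc_status, hcc_score)


# Values taken from the example result: billable 1, HCC status 1, HCC score 8.
info = parse_aux_label("1||1||8")
print(info.billable_status, info.hcc_status, info.hcc_score)
```

Keeping the fields as integers avoids guessing at their meaning beyond what the result shows.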
How to use
The sbiobertresolve_icd10cm_augmented_billable_hcc resolver model must be used with sbiobert_base_cased_mli as embeddings and ner_clinical as the NER model. Set PROBLEM in .setWhiteList().
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import BertSentenceEmbeddings
from sparknlp_jsl.annotator import SentenceEntityResolverModel

# Assumes an active Healthcare NLP session is available as `spark`.
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

sbert_embedder = BertSentenceEmbeddings\
    .pretrained("sbiobert_base_cased_mli", "en", "clinical/models")\
    .setInputCols(["document"])\
    .setOutputCol("sbert_embeddings")

icd10_resolver = SentenceEntityResolverModel\
    .pretrained("sbiobertresolve_icd10cm_augmented_billable_hcc", "en", "clinical/models")\
    .setInputCols(["document", "sbert_embeddings"])\
    .setOutputCol("icd10cm_code")\
    .setDistanceFunction("EUCLIDEAN")\
    .setReturnCosineDistances(True)

bert_pipeline_icd = Pipeline(stages=[document_assembler, sbert_embedder, icd10_resolver])

data = spark.createDataFrame([["metastatic lung cancer"]]).toDF("text")
results = bert_pipeline_icd.fit(data).transform(data)
val document_assembler = DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val sbert_embedder = BertSentenceEmbeddings
  .pretrained("sbiobert_base_cased_mli", "en", "clinical/models")
  .setInputCols(Array("document"))
  .setOutputCol("sbert_embeddings")

val icd10_resolver = SentenceEntityResolverModel
  .pretrained("sbiobertresolve_icd10cm_augmented_billable_hcc", "en", "clinical/models")
  .setInputCols(Array("document", "sbert_embeddings"))
  .setOutputCol("icd10cm_code")
  .setDistanceFunction("EUCLIDEAN")
  .setReturnCosineDistances(true)

val bert_pipeline_icd = new Pipeline().setStages(Array(document_assembler, sbert_embedder, icd10_resolver))

// toDF on a Seq needs spark.implicits._ in scope (spark-shell imports it automatically).
val data = Seq("metastatic lung cancer").toDF("text")
val result = bert_pipeline_icd.fit(data).transform(data)
import nlu
nlu.load("en.resolve.icd10cm.augmented_billable").predict("""metastatic lung cancer""")
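The examples here feed the whole document text to the resolver, while the usage note asks for ner_clinical with PROBLEM whitelisted. A minimal sketch of how those pieces might be wired together is given below. The function name, the column names, and the use of embeddings_clinical as word embeddings for ner_clinical are assumptions made for this illustration; the resolver input columns mirror the two-column form used in the examples and may need adjusting for your library version.

```python
WHITELISTED_LABELS = ["PROBLEM"]  # entity types passed on to the resolver


def build_ner_icd10_pipeline():
    """Return an unfitted Pipeline: NER chunks -> SBERT embeddings -> ICD-10-CM codes."""
    # Imports are deferred so this module can be loaded without Spark installed.
    from pyspark.ml import Pipeline
    from sparknlp.base import DocumentAssembler, Chunk2Doc
    from sparknlp.annotator import (
        SentenceDetector, Tokenizer, WordEmbeddingsModel,
        NerConverter, BertSentenceEmbeddings,
    )
    from sparknlp_jsl.annotator import MedicalNerModel, SentenceEntityResolverModel

    document = DocumentAssembler().setInputCol("text").setOutputCol("document")
    sentence = SentenceDetector().setInputCols(["document"]).setOutputCol("sentence")
    token = Tokenizer().setInputCols(["sentence"]).setOutputCol("token")

    # ASSUMPTION: embeddings_clinical is the word-embedding model paired with ner_clinical.
    word_emb = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
        .setInputCols(["sentence", "token"]).setOutputCol("embeddings")
    ner = MedicalNerModel.pretrained("ner_clinical", "en", "clinical/models")\
        .setInputCols(["sentence", "token", "embeddings"]).setOutputCol("ner")

    # Keep only PROBLEM chunks, as the usage note requires.
    chunks = NerConverter().setInputCols(["sentence", "token", "ner"])\
        .setOutputCol("ner_chunk").setWhiteList(WHITELISTED_LABELS)
    chunk_doc = Chunk2Doc().setInputCols(["ner_chunk"]).setOutputCol("ner_chunk_doc")

    sbert = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", "en", "clinical/models")\
        .setInputCols(["ner_chunk_doc"]).setOutputCol("sbert_embeddings")
    resolver = SentenceEntityResolverModel\
        .pretrained("sbiobertresolve_icd10cm_augmented_billable_hcc", "en", "clinical/models")\
        .setInputCols(["ner_chunk", "sbert_embeddings"])\
        .setOutputCol("icd10cm_code")\
        .setDistanceFunction("EUCLIDEAN")

    return Pipeline(stages=[document, sentence, token, word_emb, ner,
                            chunks, chunk_doc, sbert, resolver])
```

Calling `build_ner_icd10_pipeline().fit(data).transform(data)` on a DataFrame with a `text` column would then resolve each PROBLEM chunk rather than the whole input string.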
Results
| | chunks | code | resolutions | all_codes | billable_hcc_status_score | all_distances |
|---:|:-----------------------|:-------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------|:----------------------------|:-------------------------------------------------------------------------------------------------------------------------|
| 0 | metastatic lung cancer | C7800 | ['cancer metastatic to lung', 'metastasis from malignant tumor of lung', 'cancer metastatic to left lung', 'history of cancer metastatic to lung', 'metastatic cancer', 'history of cancer metastatic to lung (situation)', 'metastatic adenocarcinoma to bilateral lungs', 'cancer metastatic to chest wall', 'metastatic malignant neoplasm to left lower lobe of lung', 'metastatic carcinoid tumour', 'cancer metastatic to respiratory tract', 'metastatic carcinoid tumor'] | ['C7800', 'C349', 'C7801', 'Z858', 'C800', 'Z8511', 'C780', 'C798', 'C7802', 'C799', 'C7830', 'C7B00'] | ['1', '1', '8'] | ['0.0464', '0.0829', '0.0852', '0.0860', '0.0914', '0.0989', '0.1133', '0.1220', '0.1220', '0.1253', '0.1249', '0.1260'] |
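The all_codes, resolutions, and all_distances columns are parallel lists, so each candidate code can be paired with its description and distance. The sketch below does this in plain Python for the first five candidates of the row above. The helper `candidates_within` and the 0.085 cut-off are hypothetical choices for illustration, not recommended settings.

```python
# First five candidates copied from the example row above.
all_codes = ["C7800", "C349", "C7801", "Z858", "C800"]
resolutions = [
    "cancer metastatic to lung",
    "metastasis from malignant tumor of lung",
    "cancer metastatic to left lung",
    "history of cancer metastatic to lung",
    "metastatic cancer",
]
all_distances = ["0.0464", "0.0829", "0.0852", "0.0860", "0.0914"]


def candidates_within(codes, names, distances, max_distance):
    """Pair codes with descriptions and distances, keep those at or below
    max_distance, and return them sorted from closest to farthest."""
    rows = [(c, n, float(d)) for c, n, d in zip(codes, names, distances)]
    kept = [row for row in rows if row[2] <= max_distance]
    return sorted(kept, key=lambda row: row[2])


# 0.085 is an arbitrary threshold chosen only for this example.
for code, name, dist in candidates_within(all_codes, resolutions, all_distances, 0.085):
    print(f"{code}\t{dist:.4f}\t{name}")
```

Sorting explicitly instead of trusting list order guards against rows where the distances are not strictly ascending.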
Model Information
| Field | Value |
|:---------------|:-----------------------------------------------|
| Model Name | sbiobertresolve_icd10cm_augmented_billable_hcc |
| Compatibility | Healthcare NLP 3.0.4+ |
| License | Licensed |
| Edition | Official |
| Input Labels | [sentence_embeddings] |
| Output Labels | [icd10cm_code] |
| Language | en |
| Case sensitive | false |