Mapping ICD-10-CM Codes to Their Corresponding General Codes

Description

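This pretrained model maps ICD-10-CM codes to their generalised 3-character ICD-10-CM codes and the main concept each one belongs to (for example, O24.4 maps to O24, "Pregnancy, childbirth and the puerperium").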
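To make the idea of a generalised code concrete, here is a minimal sketch of how a full ICD-10-CM code relates to its 3-character category. It is an illustration only, not the mapper's implementation: the pretrained model uses its own lookup and also returns the main concept. The helper name `generalise_icd10cm` and the loose shape check are assumptions made for this example.

```python
import re

# Loose shape check (an assumption for this sketch, not a full validator):
# a letter, a digit, one alphanumeric, then an optional dot and 1-4 alphanumerics.
_ICD10CM_SHAPE = re.compile(r"^[A-Z][0-9][0-9A-Z](\.?[0-9A-Z]{1,4})?$")


def generalise_icd10cm(code: str) -> str:
    """Return the 3-character category of an ICD-10-CM code (hypothetical helper)."""
    normalised = code.strip().upper()
    if not _ICD10CM_SHAPE.match(normalised):
        raise ValueError(f"not an ICD-10-CM-shaped code: {code!r}")
    # The category is the first three characters, i.e. the part before the dot.
    return normalised[:3]


print(generalise_icd10cm("O24.4"))  # O24
print(generalise_icd10cm("J44.9"))  # J44
```

The two inputs are the codes from the Results table; the concept text attached to each category comes from the mapper and cannot be derived by truncation.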

How to use

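# Python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import BertSentenceEmbeddings
from sparknlp_jsl.annotator import SentenceEntityResolverModel, Resolution2Chunk, ChunkMapperModel

# Assumes an active, licensed Healthcare NLP Spark session named `spark`.
# Each input text is treated as a single chunk, hence the "ner_chunk" output name.
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("ner_chunk")

# Sentence embeddings consumed by the entity resolver
sbert_embedder = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", "en", "clinical/models")\
    .setInputCols(["ner_chunk"])\
    .setOutputCol("sbert_embeddings")\
    .setCaseSensitive(False)

# Resolve each chunk to an ICD-10-CM code
icd10_resolver = SentenceEntityResolverModel.pretrained("sbiobertresolve_icd10cm_augmented", "en", "clinical/models")\
    .setInputCols(["sbert_embeddings"])\
    .setOutputCol("icd10_code")\
    .setDistanceFunction("EUCLIDEAN")

# Convert the resolved codes back to chunks so the mapper can consume them
resolver2chunk = Resolution2Chunk()\
    .setInputCols(["icd10_code"])\
    .setOutputCol("icd102chunk")

# Map each ICD-10-CM code to its generalised code and main concept
chunkMapper = ChunkMapperModel.pretrained("icd10cm_generalised_mapper", "en", "clinical/models")\
    .setInputCols(["icd102chunk"])\
    .setOutputCol("mappings")

pipeline = Pipeline(stages=[
    documentAssembler,
    sbert_embedder,
    icd10_resolver,
    resolver2chunk,
    chunkMapper])

data = spark.createDataFrame([["gestational diabetes mellitus"], ["Chronic obstructive pulmonary disease"]]).toDF("text")

mapper_model = pipeline.fit(data)
result = mapper_model.transform(data)
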
// Scala
import org.apache.spark.ml.Pipeline
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
// The licensed annotators (SentenceEntityResolverModel, Resolution2Chunk, ChunkMapperModel)
// ship with Healthcare NLP; import them from your installed version.
import spark.implicits._ // required for .toDF on a Seq

// Each input text is treated as a single chunk, hence the "ner_chunk" output name.
val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("ner_chunk")

val sbert_embedder = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", "en", "clinical/models")
    .setInputCols(Array("ner_chunk"))
    .setOutputCol("sbert_embeddings")
    .setCaseSensitive(false)

val icd10_resolver = SentenceEntityResolverModel.pretrained("sbiobertresolve_icd10cm_augmented", "en", "clinical/models")
    .setInputCols(Array("sbert_embeddings"))
    .setOutputCol("icd10_code")
    .setDistanceFunction("EUCLIDEAN")

val resolver2chunk = new Resolution2Chunk()
    .setInputCols(Array("icd10_code"))
    .setOutputCol("icd102chunk")

val chunkMapper = ChunkMapperModel.pretrained("icd10cm_generalised_mapper", "en", "clinical/models")
    .setInputCols(Array("icd102chunk"))
    .setOutputCol("mappings")

val pipeline = new Pipeline().setStages(Array(
    documentAssembler,
    sbert_embedder,
    icd10_resolver,
    resolver2chunk,
    chunkMapper))

val data = Seq("gestational diabetes mellitus", "Chronic obstructive pulmonary disease").toDF("text")

val mapper_model = pipeline.fit(data)
val result = mapper_model.transform(data)

Results

+-------------------------------------+------------+--------------------------------------------+
|chunk                                |icd10cm_code|generalised_code                            |
+-------------------------------------+------------+--------------------------------------------+
|gestational diabetes mellitus        |O24.4       |O24:Pregnancy, childbirth and the puerperium|
|Chronic obstructive pulmonary disease|J44.9       |J44:Diseases of the respiratory system      |
+-------------------------------------+------------+--------------------------------------------+
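In the table above, each generalised_code value holds the 3-character code and its main concept joined by a colon. If the two parts are needed separately downstream, a small parser along these lines may help. This is a sketch that assumes the `code:concept` format shown in the table; `GeneralisedCode` and `parse_generalised` are hypothetical names introduced here, not part of the library.

```python
from typing import NamedTuple


class GeneralisedCode(NamedTuple):
    """Hypothetical container for one parsed mapper result."""
    code: str
    concept: str


def parse_generalised(value: str) -> GeneralisedCode:
    """Split a 'code:concept' string on the first colon only."""
    code, sep, concept = value.partition(":")
    if not sep:
        raise ValueError(f"expected 'code:concept', got {value!r}")
    return GeneralisedCode(code.strip(), concept.strip())


parsed = parse_generalised("O24:Pregnancy, childbirth and the puerperium")
print(parsed.code)     # O24
print(parsed.concept)  # Pregnancy, childbirth and the puerperium
```

Splitting on the first colon only keeps the concept text intact even if a concept name itself contains a colon.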

Model Information

Model Name: icd10cm_generalised_mapper
Compatibility: Healthcare NLP 5.2.1+
License: Licensed
Edition: Official
Input Labels: [ner_chunk]
Output Labels: [mappings]
Language: en
Size: 1.3 MB