Mapping ICD10CM Codes to Their Corresponding UMLS Codes

Description

This pretrained model maps ICD10CM codes to their corresponding Unified Medical Language System (UMLS) codes.

Predicted Entities

umls_code

How to use

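# Python (sparknlp + sparknlp_jsl). Assumes `spark` is an active, licensed Healthcare NLP session.
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler, Doc2Chunk
from sparknlp_jsl.annotator import ChunkMapperModel

# Wrap each raw ICD10CM code string in a document annotation
document_assembler = DocumentAssembler()\
    .setInputCol('text')\
    .setOutputCol('document')

# Treat each document (the code string) as one chunk
chunk_assembler = Doc2Chunk()\
    .setInputCols(['document'])\
    .setOutputCol('ner_chunk')

# Look up the UMLS code for each chunk
mapperModel = ChunkMapperModel.pretrained("icd10cm_umls_mapper", "en", "clinical/models")\
    .setInputCols(["ner_chunk"])\
    .setOutputCol("mappings")\
    .setRels(["umls_code"])

mapper_pipeline = Pipeline(stages=[
    document_assembler,
    chunk_assembler,
    mapperModel
])

data = spark.createDataFrame([["A01.2"], ["F10.220"]]).toDF("text")

result = mapper_pipeline.fit(data).transform(data)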
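
# Python (johnsnowlabs library). Assumes `spark` is an active, licensed session.
from johnsnowlabs import nlp, medical
from pyspark.ml import Pipeline

# Wrap each raw ICD10CM code string in a document annotation
document_assembler = nlp.DocumentAssembler()\
    .setInputCol('text')\
    .setOutputCol('document')

# Treat each document (the code string) as one chunk
chunk_assembler = medical.Doc2Chunk()\
    .setInputCols(['document'])\
    .setOutputCol('ner_chunk')

# Look up the UMLS code for each chunk
mapperModel = medical.ChunkMapperModel.pretrained("icd10cm_umls_mapper", "en", "clinical/models")\
    .setInputCols(["ner_chunk"])\
    .setOutputCol("mappings")\
    .setRels(["umls_code"])

mapper_pipeline = Pipeline(stages=[
    document_assembler,
    chunk_assembler,
    mapperModel
])

data = spark.createDataFrame([["A01.2"], ["F10.220"]]).toDF("text")

result = mapper_pipeline.fit(data).transform(data)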
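
// Scala. Assumes `spark` is an active, licensed session.
// The com.johnsnowlabs import paths below are assumptions; adjust them to your installed version.
import com.johnsnowlabs.nlp.{DocumentAssembler, Doc2Chunk}
import com.johnsnowlabs.nlp.annotators.chunker.ChunkMapperModel
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for Seq(...).toDF

// Wrap each raw ICD10CM code string in a document annotation
val document_assembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

// Treat each document (the code string) as one chunk
val chunk_assembler = new Doc2Chunk()
    .setInputCols("document")
    .setOutputCol("ner_chunk")

// Look up the UMLS code for each chunk
val chunkerMapper = ChunkMapperModel
    .pretrained("icd10cm_umls_mapper", "en", "clinical/models")
    .setInputCols(Array("ner_chunk"))
    .setOutputCol("mappings")
    .setRels(Array("umls_code"))

val mapper_pipeline = new Pipeline().setStages(Array(
    document_assembler,
    chunk_assembler,
    chunkerMapper))

val data = Seq("A01.2", "F10.220").toDF("text")

val result = mapper_pipeline.fit(data).transform(data)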
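
The example inputs are dotted, uppercase ICD10CM codes such as A01.2 and F10.220. This page does not say whether the mapper also accepts undotted or lowercase forms, so it may be worth normalizing codes before building the input DataFrame. The following is a minimal sketch in plain Python. `normalize_icd10cm` is a hypothetical helper, not part of the library, and it only checks the general ICD10CM shape (letter, digit, alphanumeric, then up to four more alphanumerics after the dot). It does not check that the code actually exists.

```python
import re

# General ICD10CM shape: 3-character category, optional dot, up to 4 more characters.
# Hypothetical helper for illustration; not part of Healthcare NLP.
_ICD10CM_PATTERN = re.compile(r"^([A-Z][0-9][0-9A-Z])\.?([0-9A-Z]{0,4})$")


def normalize_icd10cm(code: str) -> str:
    """Return the code uppercased, trimmed, and with the dot after the 3rd character."""
    cleaned = code.strip().upper()
    match = _ICD10CM_PATTERN.match(cleaned)
    if match is None:
        raise ValueError(f"not an ICD10CM-shaped code: {code!r}")
    category, extension = match.groups()
    # Codes that consist only of the category (e.g. "A01") carry no dot.
    return f"{category}.{extension}" if extension else category


if __name__ == "__main__":
    raw_codes = ["a012", " F10220 ", "A01.2"]
    print([normalize_icd10cm(c) for c in raw_codes])
```

The normalized strings can then be passed to `spark.createDataFrame` in place of the literal list used in the examples.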

Results

+------------+---------+---------+
|icd10cm_code|umls_code| relation|
+------------+---------+---------+
|       A01.2| C0343376|umls_code|
|     F10.220| C2874385|umls_code|
+------------+---------+---------+
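
`transform` returns annotation columns rather than the flat table shown above, so the rows have to be unpacked. The following is a rough sketch of that step in plain Python. It assumes the output rows have already been collected and converted to dictionaries (for example with `Row.asDict(recursive=True)`), that each annotation exposes `result` and `metadata` fields, and that the mapper emits one mapping per chunk with a `relation` metadata key. The sample rows are hand-built from the Results table, not captured model output, and `flatten_mappings` is a hypothetical helper.

```python
from typing import Any, Dict, List, Tuple


def flatten_mappings(rows: List[Dict[str, Any]]) -> List[Tuple[str, str, str]]:
    """Pair each chunk with its mapping and return (icd10cm_code, umls_code, relation) tuples."""
    table = []
    for row in rows:
        # Assumption: chunk i corresponds to mapping i when a single relation is requested.
        for chunk, mapping in zip(row["ner_chunk"], row["mappings"]):
            relation = mapping["metadata"].get("relation", "")
            table.append((chunk["result"], mapping["result"], relation))
    return table


# Hand-built rows mirroring the Results table (illustrative only).
sample_rows = [
    {
        "ner_chunk": [{"result": "A01.2", "metadata": {}}],
        "mappings": [{"result": "C0343376", "metadata": {"relation": "umls_code"}}],
    },
    {
        "ner_chunk": [{"result": "F10.220", "metadata": {}}],
        "mappings": [{"result": "C2874385", "metadata": {"relation": "umls_code"}}],
    },
]

if __name__ == "__main__":
    for icd10cm_code, umls_code, relation in flatten_mappings(sample_rows):
        print(icd10cm_code, umls_code, relation)
```

On a real pipeline output the same pairing can be done inside Spark instead of on collected rows, which scales better for large inputs.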

Model Information

Model Name: icd10cm_umls_mapper
Compatibility: Healthcare NLP 6.0.2+
License: Licensed
Edition: Official
Input Labels: [ner_chunk]
Output Labels: [mappings]
Language: en
Size: 1.4 MB

References

Trained on concepts from ICD10CM for the 2025AA release of the Unified Medical Language System® (UMLS) Knowledge Sources: https://www.nlm.nih.gov/research/umls/index.html