Mapping UMLS Codes with Their Corresponding SNOMED Codes

Description

This pretrained model maps UMLS codes to corresponding SNOMED codes.
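
Conceptually, the mapper behaves like a keyed lookup from a UMLS concept identifier to a related SNOMED code. The sketch below only illustrates that idea and is not the library's implementation: `UMLS_TO_SNOMED` and `map_umls_to_snomed` are hypothetical names, and the two entries are copied from the Results table on this page.

```python
# Hypothetical lookup table; the two entries come from the Results table on this page.
UMLS_TO_SNOMED = {
    "C0000946": "68088000",   # acebutolol
    "C0004057": "319770009",  # aspirin
}


def map_umls_to_snomed(umls_code):
    """Return the SNOMED code for a UMLS code, or None if no mapping is known."""
    return UMLS_TO_SNOMED.get(umls_code)


print(map_umls_to_snomed("C0000946"))  # 68088000
```

Returning `None` for an unknown code is a choice made for this sketch; how the pretrained model reports unmapped codes is not stated here.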

Predicted Entities

snomed

How to use

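# Python (assumes an active Spark session `spark` started with the licensed Healthcare NLP library)
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import BertSentenceEmbeddings
from sparknlp_jsl.annotator import SentenceEntityResolverModel, Resolution2Chunk, ChunkMapperModel

# Treat each input text as a single chunk to resolve
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("ner_chunk")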

sbert_embedder = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", "en", "clinical/models")\
    .setInputCols(["ner_chunk"])\
    .setOutputCol("sbert_embeddings")\
    .setCaseSensitive(False)

umls_resolver = SentenceEntityResolverModel.pretrained("sbiobertresolve_umls_clinical_drugs", "en", "clinical/models")\
    .setInputCols(["sbert_embeddings"])\
    .setOutputCol("umls_code")\
    .setDistanceFunction("EUCLIDEAN")

resolver2chunk = Resolution2Chunk()\
    .setInputCols(["umls_code"])\
    .setOutputCol("umls2chunk")

chunkerMapper = ChunkMapperModel.pretrained("umls_snomed_mapper", "en", "clinical/models")\
    .setInputCols(["umls2chunk"])\
    .setOutputCol("mappings")\
    .setRels(["snomed_code"])

pipeline = Pipeline(stages = [
    documentAssembler,
    sbert_embedder,
    umls_resolver,
    resolver2chunk,
    chunkerMapper])

data = spark.createDataFrame([["acebutolol"],["aspirin"]]).toDF("text")

mapper_model = pipeline.fit(data)
result = mapper_model.transform(data)

// Scala
val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("ner_chunk")
	
val sbert_embedder = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli","en","clinical/models")
    .setInputCols(Array("ner_chunk"))
    .setOutputCol("sbert_embeddings")
    .setCaseSensitive(false)
	
val umls_resolver = SentenceEntityResolverModel.pretrained("sbiobertresolve_umls_clinical_drugs","en","clinical/models")
    .setInputCols(Array("sbert_embeddings"))
    .setOutputCol("umls_code")
    .setDistanceFunction("EUCLIDEAN")
	
val resolver2chunk = new Resolution2Chunk()
    .setInputCols(Array("umls_code"))
    .setOutputCol("umls2chunk")
	
val chunkerMapper = ChunkMapperModel.pretrained("umls_snomed_mapper","en","clinical/models")
    .setInputCols(Array("umls2chunk"))
    .setOutputCol("mappings")
    .setRels(Array("snomed_code"))

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  sbert_embedder, 
  umls_resolver,
  resolver2chunk,
  chunkerMapper))

val data = Seq("acebutolol", "aspirin").toDF("text")
	
val mapper_model = pipeline.fit(data)

val result = mapper_model.transform(data)

Results

+----------+---------+-----------+
|chunk     |umls_code|snomed_code|
+----------+---------+-----------+
|acebutolol|C0000946 |68088000   |
|aspirin   |C0004057 |319770009  |
+----------+---------+-----------+
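
A table like the one above can be assembled by reading the `result` field of each annotation column in the pipeline output. The following is a minimal sketch, assuming each row exposes the `ner_chunk`, `umls2chunk`, and `mappings` columns as lists of annotations with a `result` field, and that each chunk yields exactly one UMLS code and one mapping. `flatten_mappings` is a hypothetical helper, and the sample rows are plain dictionaries standing in for collected rows, filled with the values from the table.

```python
def flatten_mappings(rows):
    """Build (chunk, umls_code, snomed_code) tuples from collected pipeline rows.

    Assumes the three annotation lists in a row are aligned one-to-one.
    """
    table = []
    for row in rows:
        chunks = [ann["result"] for ann in row["ner_chunk"]]
        umls_codes = [ann["result"] for ann in row["umls2chunk"]]
        snomed_codes = [ann["result"] for ann in row["mappings"]]
        table.extend(zip(chunks, umls_codes, snomed_codes))
    return table


# Stand-in rows using the values shown in the Results table.
sample_rows = [
    {
        "ner_chunk": [{"result": "acebutolol"}],
        "umls2chunk": [{"result": "C0000946"}],
        "mappings": [{"result": "68088000"}],
    },
    {
        "ner_chunk": [{"result": "aspirin"}],
        "umls2chunk": [{"result": "C0004057"}],
        "mappings": [{"result": "319770009"}],
    },
]

for chunk, umls_code, snomed_code in flatten_mappings(sample_rows):
    print(chunk, umls_code, snomed_code)
```

With a live pipeline, the same helper could be applied to `result.select("ner_chunk", "umls2chunk", "mappings").collect()`, since collected rows also support lookup by field name. If more than one relation is requested with `setRels`, the one-to-one alignment assumed here may no longer hold.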

Model Information

Model Name: umls_snomed_mapper
Compatibility: Healthcare NLP 5.2.1+
License: Licensed
Edition: Official
Input Labels: [ner_chunk]
Output Labels: [mappings]
Language: en
Size: 6.9 MB