Description
This pretrained model maps ICD-10 codes to corresponding MedDRA PT (Preferred Term) codes.
Predicted Entities
icd10 code
How to use
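# Python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import BertSentenceEmbeddings
from sparknlp_jsl.annotator import SentenceEntityResolverModel, Resolution2Chunk, ChunkMapperModel

documentAssembler = DocumentAssembler()\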
.setInputCol("text")\
.setOutputCol("ner_chunk")
sbert_embedder = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", "en", "clinical/models")\
.setInputCols(["ner_chunk"])\
.setOutputCol("sbert_embeddings")\
.setCaseSensitive(False)
icd10_resolver = SentenceEntityResolverModel.pretrained("sbiobertresolve_icd10cm_augmented", "en", "clinical/models")\
.setInputCols(["sbert_embeddings"]) \
.setOutputCol("icd10_code")\
.setDistanceFunction("EUCLIDEAN")
resolver2chunk = Resolution2Chunk()\
.setInputCols(["icd10_code"])\
.setOutputCol("icd102chunk")
# load() reads the mapper from a local path; point it at your licensed copy of the model
chunkerMapper = ChunkMapperModel.load("icd10_meddra_pt_mapper")\
.setInputCols(["icd102chunk"])\
.setOutputCol("mappings")
pipeline = Pipeline(stages = [
documentAssembler,
sbert_embedder,
icd10_resolver,
resolver2chunk,
chunkerMapper])
data = spark.createDataFrame([["Type 2 diabetes mellitus"], ["Typhoid fever"], ["Malignant neoplasm of oesophagus"]]).toDF("text")
mapper_model = pipeline.fit(data)
result = mapper_model.transform(data)
// Scala
val documentAssembler = DocumentAssembler()
.setInputCol("text")
.setOutputCol("ner_chunk")
val sbert_embedder = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", "en", "clinical/models")
.setInputCols(Array("ner_chunk"))
.setOutputCol("sbert_embeddings")
.setCaseSensitive(false)
val icd10_resolver = SentenceEntityResolverModel.pretrained("sbiobertresolve_icd10cm_augmented", "en", "clinical/models")
.setInputCols(Array("sbert_embeddings"))
.setOutputCol("icd10_code")
.setDistanceFunction("EUCLIDEAN")
val resolver2chunk = Resolution2Chunk()
.setInputCols(Array("icd10_code"))
.setOutputCol("icd102chunk")
// load() reads the mapper from a local path; point it at your licensed copy of the model
val chunkerMapper = ChunkMapperModel.load("icd10_meddra_pt_mapper")
.setInputCols(Array("icd102chunk"))
.setOutputCol("mappings")
val pipeline = new Pipeline().setStages(Array(
documentAssembler,
sbert_embedder,
icd10_resolver,
resolver2chunk,
chunkerMapper))
val data = Seq("Type 2 diabetes mellitus", "Typhoid fever", "Malignant neoplasm of oesophagus").toDF("text")
val mapper_model = pipeline.fit(data)
val result = mapper_model.transform(data)
Results
+--------------------------------+----------+-----------------------------------+
|chunk                           |icd10_code|meddra_code                        |
+--------------------------------+----------+-----------------------------------+
|Type 2 diabetes mellitus        |E11       |10067585.0:Type 2 diabetes mellitus|
|Typhoid fever                   |A01.0     |10045275.0:Typhoid fever           |
|Malignant neoplasm of oesophagus|C15.9     |10030155.0:Oesophageal carcinoma   |
+--------------------------------+----------+-----------------------------------+
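Each `meddra_code` value in the table packs the MedDRA PT code and the Preferred Term into one string separated by a colon. Downstream code will usually want the two parts separately. The following is a minimal sketch of one way to split them in plain Python. The helper `split_meddra_mapping` and the `MeddraMapping` type are hypothetical names introduced here, not part of the library, and treating the trailing `.0` as a float-formatting artifact is an assumption based only on the sample output above.

```python
from typing import NamedTuple


class MeddraMapping(NamedTuple):
    code: str  # MedDRA PT code, kept as a string of digits
    term: str  # MedDRA Preferred Term text


def split_meddra_mapping(value: str) -> MeddraMapping:
    """Split a '<code>:<term>' string like those in the Results table.

    Hypothetical helper for illustration; not part of the library.
    """
    # Split on the first colon only, in case the term itself contains one.
    raw_code, sep, term = value.partition(":")
    if not sep:
        raise ValueError(f"expected '<code>:<term>', got {value!r}")
    code = raw_code.strip()
    # Assumption: a trailing '.0' is a float-formatting artifact, so drop it.
    if code.endswith(".0"):
        code = code[:-2]
    return MeddraMapping(code=code, term=term.strip())


# Sample values copied from the Results table.
samples = [
    "10067585.0:Type 2 diabetes mellitus",
    "10045275.0:Typhoid fever",
    "10030155.0:Oesophageal carcinoma",
]
parsed = [split_meddra_mapping(s) for s in samples]
for mapping in parsed:
    print(mapping.code, "|", mapping.term)
```

Keeping the code as a string rather than converting it to an integer avoids assumptions about its numeric range and preserves any leading zeros.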
Model Information
| Model Name:    | icd10_meddra_pt_mapper |
| Compatibility: | Healthcare NLP 5.3.0+  |
| License:       | Licensed               |
| Edition:       | Official               |
| Input Labels:  | [ner_chunk]            |
| Output Labels: | [mappings]             |
| Language:      | en                     |
| Size:          | 210.0 KB               |
References
This model was trained on the January 2024 release of the ICD-10 to MedDRA Map dataset.
Using this model requires a valid MedDRA license. If you hold one and want to use the model, contact us at support@johnsnowlabs.com.