ICD10CM Neoplasms Entity Resolver

Description

Entity resolution model based on KNN (k-nearest neighbors), using word embeddings and Word Mover's Distance.
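The idea of a nearest-neighbor lookup under a word-moving distance can be illustrated with a small sketch. It is not the model's actual implementation: it uses the relaxed Word Mover's Distance (a cheaper lower bound on the exact distance) in place of the full optimal-transport computation, and the 2-d vectors in TOY_VECTORS, the shortened code descriptions in CANDIDATES, and all function names are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical 2-d word vectors for illustration only;
# the real model relies on clinical_embeddings.
TOY_VECTORS = {
    "malignant": (1.0, 0.0),
    "benign": (0.0, 1.0),
    "neoplasm": (0.7, 0.7),
    "tumor": (0.6, 0.8),
    "nasal": (-1.0, 0.2),
    "nose": (-0.9, 0.3),
    "cavity": (-0.5, -0.5),
    "skin": (0.2, -1.0),
    "ear": (-0.2, -0.9),
}

# Hypothetical, shortened code descriptions used as the search space.
CANDIDATES = [
    ("C300", ["malignant", "neoplasm", "nasal", "cavity"]),
    ("D2321", ["benign", "neoplasm", "skin", "ear"]),
]


def one_way_cost(src, dst):
    """Average distance from each src token to its closest dst token."""
    total = 0.0
    for s in src:
        total += min(math.dist(TOY_VECTORS[s], TOY_VECTORS[d]) for d in dst)
    return total / len(src)


def relaxed_wmd(a, b):
    """Relaxed Word Mover's Distance: the larger of the two one-way costs."""
    return max(one_way_cost(a, b), one_way_cost(b, a))


def nearest_codes(chunk_tokens, candidates, k=1):
    """Return the k (code, tokens) pairs closest to the chunk."""
    ranked = sorted(candidates, key=lambda c: relaxed_wmd(chunk_tokens, c[1]))
    return ranked[:k]


if __name__ == "__main__":
    for chunk in (["nose"], ["tumor", "ear"]):
        code, _ = nearest_codes(chunk, CANDIDATES)[0]
        print(chunk, "->", code)
```

With these toy vectors, the chunk "nose" lands nearest to the nasal cavity description and "tumor ear" nearest to the ear description. A nearest neighbor is always returned, even for chunks that have nothing to do with neoplasms.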

Predicted Entities

ICD10-CM codes and their normalized definitions, resolved using clinical_embeddings.

How to use

...
# Load the pretrained neoplasm resolver; it consumes the tokens and chunk embeddings produced by the earlier stages.
neoplasm_resolver = ChunkEntityResolverModel.pretrained("chunkresolve_icd10cm_neoplasms_clinical", "en", "clinical/models")\
	.setInputCols("token", "chunk_embeddings")\
	.setOutputCol("entity")
pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, word_embeddings, clinical_ner, ner_converter, chunk_embeddings, neoplasm_resolver])

# Build the input DataFrame once so that fit() and transform() use the same data.
data = spark.createDataFrame([["""The patient is a 5-month-old infant who presented initially on Monday with a cold, cough, and runny nose for 2 days. Mom states she had no fever. Her appetite was good but she was spitting up a lot. She had no difficulty breathing and her cough was described as dry and hacky. At that time, physical exam showed a right TM, which was red. Left TM was okay. She was fairly congested but looked happy and playful. She was started on Amoxil and Aldex and we told to recheck in 2 weeks to recheck her ear. Mom returned to clinic again today because she got much worse overnight. She was having difficulty breathing. She was much more congested and her appetite had decreased significantly today. She also spiked a temperature yesterday of 102.6 and always having trouble sleeping secondary to congestion."""]]).toDF("text")

model = pipeline.fit(data)
results = model.transform(data)
...
// Load the pretrained neoplasm resolver; it consumes the tokens and chunk embeddings produced by the earlier stages.
val neoplasm_resolver = ChunkEntityResolverModel.pretrained("chunkresolve_icd10cm_neoplasms_clinical","en","clinical/models")
	.setInputCols(Array("token","chunk_embeddings"))
	.setOutputCol("resolution")
val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, tokenizer, word_embeddings, clinical_ner, ner_converter, chunk_embeddings, neoplasm_resolver))

// toDS requires the session implicits to be in scope.
import spark.implicits._
val data = Seq("The patient is a 5-month-old infant who presented initially on Monday with a cold, cough, and runny nose for 2 days. Mom states she had no fever. Her appetite was good but she was spitting up a lot. She had no difficulty breathing and her cough was described as dry and hacky. At that time, physical exam showed a right TM, which was red. Left TM was okay. She was fairly congested but looked happy and playful. She was started on Amoxil and Aldex and we told to recheck in 2 weeks to recheck her ear. Mom returned to clinic again today because she got much worse overnight. She was having difficulty breathing. She was much more congested and her appetite had decreased significantly today. She also spiked a temperature yesterday of 102.6 and always having trouble sleeping secondary to congestion.").toDS.toDF("text")

val result = pipeline.fit(data).transform(data)

Results

    chunk    entity                icd10_neoplasm_description                         icd10_neoplasm_code
0   patient  Organism              Acute myelomonocytic leukemia, in remission        C9251
1   infant   Organism              Malignant (primary) neoplasm, unspecified          C801
2   nose     Organ                 Malignant neoplasm of nasal cavity                 C300
3   She      Organism              Malignant neoplasm of thyroid gland                C73
4   She      Organism              Malignant neoplasm of thyroid gland                C73
5   She      Organism              Malignant neoplasm of thyroid gland                C73
6   Aldex    Gene_or_gene_product  Acute megakaryoblastic leukemia not having ach...  C9420
7   ear      Organ                 Other benign neoplasm of skin of right ear and...  D2321
8   She      Organism              Malignant neoplasm of thyroid gland                C73
9   She      Organism              Malignant neoplasm of thyroid gland                C73
10  She      Organism              Malignant neoplasm of thyroid gland                C73
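
The codes in the table are printed without the decimal point that ICD-10-CM notation places after the third character (for example, C9251 is conventionally written C92.51). If the dotted form is needed downstream, a small helper along these lines could add it. The function name format_icd10cm is hypothetical and is not part of Spark NLP; this is only a sketch that assumes the input is an undotted code string like those shown above.

```python
def format_icd10cm(code: str) -> str:
    """Insert the ICD-10-CM decimal point after the third character.

    Codes of three characters or fewer (e.g. "C73") and codes that
    already contain a dot are returned unchanged.
    """
    code = code.strip().upper()
    if "." in code or len(code) <= 3:
        return code
    return f"{code[:3]}.{code[3:]}"


if __name__ == "__main__":
    for raw in ["C9251", "C801", "C300", "C73", "C9420", "D2321"]:
        print(raw, "->", format_icd10cm(raw))
```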

Model Information

Model Name: chunkresolve_icd10cm_neoplasms_clinical
Compatibility: Spark NLP for Healthcare 3.0.0+
License: Licensed
Edition: Official
Input Labels: [token, chunk_embeddings]
Output Labels: [icd10cm]
Language: en