Sentence Entity Resolver for ICD10-CM (general 3 character codes)

Description

This model maps extracted medical entities to ICD10-CM codes using sbiobert_base_cased_mli Sentence Bert Embeddings. It predicts ICD codes up to 3 characters (according to ICD10 code structure the first three characters represent general type of the injury or disease).

Predicted Entities

Live Demo Open in Colab Copy S3 URI

How to use

sbiobertresolve_icd10cm_generalised resolver model must be used with sbiobert_base_cased_mli as embeddings ner_clinical as NER model. PROBLEM set in .setWhiteList().

...
chunk2doc = Chunk2Doc().setInputCols("ner_chunk").setOutputCol("ner_chunk_doc")

sbert_embedder = BertSentenceEmbeddings\
.pretrained("sbiobert_base_cased_mli","en","clinical/models")\
.setInputCols(["ner_chunk_doc"])\
.setOutputCol("sbert_embeddings")

icd10_resolver = SentenceEntityResolverModel\
.pretrained("sbiobertresolve_icd10cm_generalised","en", "clinical/models") \
.setInputCols(["ner_chunk", "sbert_embeddings"]) \
.setOutputCol("resolution")\
.setDistanceFunction("EUCLIDEAN")

nlpPipeline = Pipeline(stages=[document_assembler, sentence_detector, tokenizer, word_embeddings, clinical_ner, ner_converter, chunk2doc, sbert_embedder, icd10_resolver])

data = spark.createDataFrame([["This is an 82 - year-old male with a history of prior tobacco use , hypertension , chronic renal insufficiency , COPD , gastritis , and TIA who initially presented to Braintree with a non-ST elevation MI and Guaiac positive stools , transferred to St . Margaret\'s Center for Women & Infants for cardiac catheterization with PTCA to mid LAD lesion complicated by hypotension and bradycardia requiring Atropine , IV fluids and transient dopamine possibly secondary to vagal reaction , subsequently transferred to CCU for close monitoring , hemodynamically stable at the time of admission to the CCU ."]]).toDF("text")

results = nlpPipeline.fit(data).transform(data)
chunk2doc = Chunk2Doc().setInputCols("ner_chunk").setOutputCol("ner_chunk_doc")

val sbert_embedder = BertSentenceEmbeddings
.pretrained("sbiobert_base_cased_mli","en","clinical/models")
.setInputCols(Array("ner_chunk_doc"))
.setOutputCol("sbert_embeddings")

val icd10_resolver = SentenceEntityResolverModel
.pretrained("sbiobertresolve_icd10cm_generalised","en", "clinical/models")
.setInputCols(Array("ner_chunk", "sbert_embeddings"))
.setOutputCol("resolution")
.setDistanceFunction("EUCLIDEAN")

val pipeline = new Pipeline().setStages(Array(document_assembler, sentence_detector, tokenizer, word_embeddings, clinical_ner, ner_converter, chunk2doc, sbert_embedder, icd10_resolver))

val data = Seq("This is an 82 - year-old male with a history of prior tobacco use , hypertension , chronic renal insufficiency , COPD , gastritis , and TIA who initially presented to Braintree with a non-ST elevation MI and Guaiac positive stools , transferred to St . Margaret\'s Center for Women & Infants for cardiac catheterization with PTCA to mid LAD lesion complicated by hypotension and bradycardia requiring Atropine , IV fluids and transient dopamine possibly secondary to vagal reaction , subsequently transferred to CCU for close monitoring , hemodynamically stable at the time of admission to the CCU .").toDF("text")

val result = pipeline.fit(data).transform(data)
import nlu
nlu.load("en.resolve.icd10cm_generalised").predict("""This is an 82 - year-old male with a history of prior tobacco use , hypertension , chronic renal insufficiency , COPD , gastritis , and TIA who initially presented to Braintree with a non-ST elevation MI and Guaiac positive stools , transferred to St . Margaret\'s Center for Women & Infants for cardiac catheterization with PTCA to mid LAD lesion complicated by hypotension and bradycardia requiring Atropine , IV fluids and transient dopamine possibly secondary to vagal reaction , subsequently transferred to CCU for close monitoring , hemodynamically stable at the time of admission to the CCU .""")

Results

|    | chunk                       |   begin |   end | entity   | code   | code_desc                                                |   distance | all_k_resolutions                                                                                                                                                                                                                                                                                                                                       | all_k_codes                                                                 |
|---:|:----------------------------|--------:|------:|:---------|:-------|:---------------------------------------------------------|-----------:|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------|
|  0 | hypertension                |      68 |    79 | PROBLEM  | I10    | hypertension                                             |     0      | hypertension:::hypertension (high blood pressure):::h/o: hypertension:::fh: hypertension:::hypertensive heart disease:::labile hypertension:::history of hypertension (situation):::endocrine hypertension                                                                                                                                              | I10:::I15:::Z86:::Z82:::I11:::R03:::Z87:::E27                               |
|  1 | chronic renal insufficiency |      83 |   109 | PROBLEM  | N18    | chronic renal impairment                                 |     0.014  | chronic renal impairment:::renal insufficiency:::renal failure:::anaemia of chronic renal insufficiency:::impaired renal function disorder:::history of renal insufficiency:::prerenal renal failure:::abnormal renal function:::abnormal renal function                                                                                                | N18:::P96:::N19:::D63:::N28:::Z87:::N17:::N25:::R94                         |
|  2 | COPD                        |     113 |   116 | PROBLEM  | J44    | chronic obstructive lung disease (disorder)              |     0.1197 | chronic obstructive lung disease (disorder):::chronic obstructive pulmonary disease leaflet given:::chronic pulmonary congestion (disorder):::chronic respiratory failure (disorder):::chronic respiratory insufficiency:::cor pulmonale (chronic):::history of - chronic lung disease (situation)                                                      | J44:::Z76:::J81:::J96:::R06:::I27:::Z87                                     |
|  3 | gastritis                   |     120 |   128 | PROBLEM  | K29    | gastritis                                                |     0      | gastritis:::bacterial gastritis:::parasitic gastritis                                                                                                                                                                                                                                                                                                   | K29:::B96:::K93                                                             |
|  4 | TIA                         |     136 |   138 | PROBLEM  | S06    | cerebral concussion                                      |     0.1662 | cerebral concussion:::transient ischemic attack (disorder):::thalamic stroke:::cerebral trauma:::stroke:::traumatic amputation:::spinal cord stroke                                                                                                                                                                                                     | S06:::G45:::I63:::S09:::I64:::T14:::G95                                     |
|  5 | a non-ST elevation MI       |     182 |   202 | PROBLEM  | I21    | non-st elevation (nstemi) myocardial infarction          |     0.1615 | non-st elevation (nstemi) myocardial infarction:::nonruptured cerebral artery dissection:::acute stroke, nonatherosclerotic:::nontraumatic ischemic infarction of muscle, unsp shoulder:::history of nonatherosclerotic stroke without residual deficits:::non-traumatic cerebral hemorrhage                                                            | I21:::I67:::I63:::M62:::Z86:::I61                                           |
|  6 | Guaiac positive stools      |     208 |   229 | PROBLEM  | R85    | abnormal anal pap                                        |     0.1807 | abnormal anal pap:::straining at stool (finding):::amine test positive:::appendiceal colic:::fecal smearing:::epiploic appendagitis:::diverticulosis of intestine (finding):::appendicitis (disorder):::colostomy present (finding):::thickened anal verge (finding):::anal fissure:::amoebic enteritis:::zenkers diverticulum                          | R85:::R19:::Z78:::K38:::R15:::K65:::K57:::K37:::Z93:::K62:::K60:::A06:::K22 |
|  7 | mid LAD lesion              |     332 |   345 | PROBLEM  | I21    | stemi involving left anterior descending coronary artery |     0.1595 | stemi involving left anterior descending coronary artery:::divided left atrium:::disorder of left atrium:::double inlet left ventricle:::left os acromiale:::furuncle of left upper limb:::left anterior fascicular hemiblock (heart rhythm):::aberrant origin of left subclavian artery:::stent in circumflex branch of left coronary artery (finding) | I21:::Q24:::I51:::Q20:::M89:::L02:::I44:::Q27:::Z95                         |
|  8 | hypotension                 |     362 |   372 | PROBLEM  | I95    | hypotension                                              |     0      | hypotension:::supine hypotensive syndrome                                                                                                                                                                                                                                                                                                               | I95:::O26                                                                   |
|  9 | bradycardia                 |     378 |   388 | PROBLEM  | R00    | bradycardia                                              |     0      | bradycardia:::bradycardia (finding):::drug-induced bradycardia:::bradycardia (disorder)                                                                                                                                                                                                                                                                 | R00:::P29:::T50:::P20                                                       |
| 10 | vagal reaction              |     466 |   479 | PROBLEM  | G52    | vagus nerve finding                                      |     0.0926 | vagus nerve finding:::vasomotor reaction:::vesicular breathing (finding):::abdominal muscle tone - finding:::agonizing state:::paresthesia (finding):::glossolalia (finding):::tactile alteration (finding)                                                                                                                                             | G52:::I73:::R09:::R19:::R45:::R20:::R41:::R44                               |

Model Information

Model Name: sbiobertresolve_icd10cm_generalised
Compatibility: Healthcare NLP 3.2.1+
License: Licensed
Edition: Official
Input Labels: [sentence_chunk_embeddings]
Output Labels: [icd10cm_code]
Language: en
Case sensitive: false

Data Source

Trained on ICD10 Clinical Modification dataset with sbiobert_base_cased_mli sentence embeddings. https://www.icd10data.com/ICD10CM/Codes/