Sentence Entity Resolver for CPT (sbiobert_base_cased_mli embeddings)

Description

This model maps extracted medical entities (NER chunks) to CPT codes using sbiobert_base_cased_mli sentence embeddings of those chunks.
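As a rough mental model only, and not the library's actual implementation, resolving a chunk can be pictured as a nearest-neighbour search: the chunk's embedding is compared with one embedding per CPT code, and the closest codes are returned. The sketch below assumes Euclidean distance, matching the setDistanceFunction("EUCLIDEAN") call in the usage section. The helper names, the code labels, and the toy 3-dimensional vectors are all invented for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def resolve(chunk_vec, code_vecs, k=2):
    """Return the k (code, distance) pairs nearest to chunk_vec, closest first."""
    ranked = sorted(
        ((code, euclidean(chunk_vec, vec)) for code, vec in code_vecs.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:k]


# Hypothetical placeholders: these are not real CPT codes or real embeddings.
code_vecs = {
    "CODE_A": [0.0, 0.0, 1.0],
    "CODE_B": [1.0, 0.0, 0.0],
    "CODE_C": [0.0, 1.0, 0.0],
}
chunk_vec = [0.9, 0.1, 0.0]

top = resolve(chunk_vec, code_vecs)
print(top[0][0])  # CODE_B
```

Under this picture, the ranked list corresponds to the candidate codes reported per chunk, with the first entry being the resolved code.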

Predicted Entities

CPT codes and their normalized definitions, resolved with sbiobert_base_cased_mli sentence embeddings.


How to use

...
chunk2doc = Chunk2Doc()\
    .setInputCols("ner_chunk")\
    .setOutputCol("ner_chunk_doc")

sbert_embedder = BertSentenceEmbeddings\
    .pretrained("sbiobert_base_cased_mli","en","clinical/models")\
    .setInputCols(["ner_chunk_doc"])\
    .setOutputCol("sbert_embeddings")  

cpt_resolver = SentenceEntityResolverModel.load("sbiobertresolve_cpt") \
    .setInputCols(["ner_chunk", "sbert_embeddings"]) \
    .setOutputCol("resolution")\
    .setDistanceFunction("EUCLIDEAN")

nlpPipeline = Pipeline(stages=[
    document_assembler, 
    sentence_detector, 
    tokenizer, 
    word_embeddings, 
    clinical_ner, 
    ner_converter, 
    chunk2doc, 
    sbert_embedder, 
    cpt_resolver])

data = spark.createDataFrame([["This is an 82 - year-old male with a history of prior tobacco use , hypertension , chronic renal insufficiency , COPD , gastritis , and TIA who initially presented to Braintree with a non-ST elevation MI and Guaiac positive stools , transferred to St . Margaret\'s Center for Women & Infants for cardiac catheterization with PTCA to mid LAD lesion complicated by hypotension and bradycardia requiring Atropine , IV fluids and transient dopamine possibly secondary to vagal reaction , subsequently transferred to CCU for close monitoring , hemodynamically stable at the time of admission to the CCU ."]]).toDF("text")

results = nlpPipeline.fit(data).transform(data)

...
val chunk2doc = new Chunk2Doc()
    .setInputCols("ner_chunk")
    .setOutputCol("ner_chunk_doc")

val sbert_embedder = BertSentenceEmbeddings
    .pretrained("sbiobert_base_cased_mli","en","clinical/models")
    .setInputCols(Array("ner_chunk_doc"))
    .setOutputCol("sbert_embeddings")

val cpt_resolver = SentenceEntityResolverModel
    .load("sbiobertresolve_cpt")
    .setInputCols(Array("ner_chunk", "sbert_embeddings"))
    .setOutputCol("resolution")
    .setDistanceFunction("EUCLIDEAN")

val pipeline = new Pipeline().setStages(
    Array(
        document_assembler, 
        sentence_detector, 
        tokenizer, 
        word_embeddings, 
        clinical_ner, 
        ner_converter, 
        chunk2doc, 
        sbert_embedder, 
        cpt_resolver))

val data = Seq("This is an 82 - year-old male with a history of prior tobacco use , hypertension , chronic renal insufficiency , COPD , gastritis , and TIA who initially presented to Braintree with a non-ST elevation MI and Guaiac positive stools , transferred to St . Margaret\'s Center for Women & Infants for cardiac catheterization with PTCA to mid LAD lesion complicated by hypotension and bradycardia requiring Atropine , IV fluids and transient dopamine possibly secondary to vagal reaction , subsequently transferred to CCU for close monitoring , hemodynamically stable at the time of admission to the CCU .").toDF("text")
val result = pipeline.fit(data).transform(data)

Results

+--------------------+-----+---+---------+-----+----------+--------------------+--------------------+
|               chunk|begin|end|   entity| code|confidence|   all_k_resolutions|         all_k_codes|
+--------------------+-----+---+---------+-----+----------+--------------------+--------------------+
|        hypertension|   68| 79|  PROBLEM|49425|    0.0967|Insertion of peri...|49425:::36818:::3...|
|chronic renal ins...|   83|109|  PROBLEM|50070|    0.2569|Nephrolithotomy; ...|50070:::49425:::5...|
|                COPD|  113|116|  PROBLEM|49425|    0.0779|Insertion of peri...|49425:::31592:::4...|
|           gastritis|  120|128|  PROBLEM|43810|    0.5289|Gastroduodenostom...|43810:::43880:::4...|
|                 TIA|  136|138|  PROBLEM|25927|    0.2060|Transmetacarpal a...|25927:::25931:::6...|
|a non-ST elevatio...|  182|202|  PROBLEM|33300|    0.3046|Repair of cardiac...|33300:::33813:::3...|
|Guaiac positive s...|  208|229|  PROBLEM|47765|    0.0974|Anastomosis, of i...|47765:::49425:::1...|
|cardiac catheteri...|  295|317|     TEST|62225|    0.1996|Replacement or ir...|62225:::33722:::4...|
|                PTCA|  324|327|TREATMENT|60500|    0.1481|Parathyroidectomy...|60500:::43800:::2...|
|      mid LAD lesion|  332|345|  PROBLEM|33722|    0.3097|Closure of aortic...|33722:::33732:::3...|
+--------------------+-----+---+---------+-----+----------+--------------------+--------------------+
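The all_k_codes column packs the candidate codes for each chunk into a single string joined by ":::". The sketch below shows one way to split such a cell back into a ranked list. It assumes that all_k_resolutions uses the same separator and ordering, which the truncated table does not show. The function name is hypothetical, and the descriptions are placeholders because the table cuts the real ones off.

```python
def split_candidates(all_k_codes, all_k_resolutions, sep=":::"):
    """Pair each candidate code with its description, keeping the given order."""
    codes = all_k_codes.split(sep)
    descriptions = all_k_resolutions.split(sep)
    return list(zip(codes, descriptions))


# The two codes are the visible start of the first row's all_k_codes cell.
# The descriptions are placeholders, not real CPT definitions.
pairs = split_candidates("49425:::36818", "description 1:::description 2")
print(pairs)  # [('49425', 'description 1'), ('36818', 'description 2')]
```

Note that zip stops at the shorter list, so a length mismatch between the two columns would silently drop trailing candidates.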

Model Information

Name: sbiobertresolve_cpt
Type: SentenceEntityResolverModel
Compatibility: Spark NLP 2.6.4+
License: Licensed
Edition: Official
Input labels: [ner_chunk, chunk_embeddings]
Output labels: [resolution]
Language: en
Dependencies: sbiobert_base_cased_mli

Data Source

Trained on the Current Procedural Terminology (CPT) dataset with sbiobert_base_cased_mli sentence embeddings.

References

CPT resolver models have been removed from the Models Hub due to license restrictions and can only be shared with users who already hold a valid CPT license. If you have one and wish to use this model, please contact us at support@johnsnowlabs.com.