Description
This model maps extracted medical entities to CPT codes using sbiobert_base_cased_mli
Sentence Bert Embeddings, and has faster load time, with a speedup of about 6X when compared to previous versions. Also the load process now is more memory friendly meaning that the maximum memory required during load time is smaller, reducing the chances of OOM exceptions, and thus relaxing hardware requirements.
Predicted Entities
Predicts CPT codes and their descriptions.
How to use
document_assembler = DocumentAssembler()\
.setInputCol('text')\
.setOutputCol('document')
sentence_detector = SentenceDetector() \
.setInputCols(["document"]) \
.setOutputCol("sentence")
tokenizer = Tokenizer() \
.setInputCols(["sentence"]) \
.setOutputCol("token")
word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
.setInputCols(["sentence", "token"])\
.setOutputCol("embeddings")
clinical_ner = MedicalNerModel.pretrained("ner_clinical", "en", "clinical/models") \
.setInputCols(["sentence", "token", "embeddings"]) \
.setOutputCol("ner")
ner_converter = NerConverter() \
.setInputCols(["sentence", "token", "ner"]) \
.setOutputCol("ner_chunk")
chunk2doc = Chunk2Doc()\
.setInputCols("ner_chunk")\
.setOutputCol("ner_chunk_doc")
sbert_embedder = BertSentenceEmbeddings\
.pretrained("sbiobert_base_cased_mli","en","clinical/models")\
.setInputCols(["ner_chunk_doc"])\
.setOutputCol("sbert_embeddings")
cpt_resolver = SentenceEntityResolverModel\
.load("sbiobertresolve_cpt") \
.setInputCols(["ner_chunk", "sbert_embeddings"]) \
.setOutputCol("resolution")\
.setDistanceFunction("EUCLIDEAN")
nlpPipeline = Pipeline(stages=[
document_assembler,
sentence_detector,
tokenizer,
word_embeddings,
clinical_ner,
ner_converter,
chunk2doc,
sbert_embedder,
cpt_resolver])
data = spark.createDataFrame([["This is an 82 - year-old male with a history of prior tobacco use , hypertension , chronic renal insufficiency , COPD , gastritis , and TIA who initially presented to Braintree with a non-ST elevation MI and Guaiac positive stools , transferred to St . Margaret\'s Center for Women & Infants for cardiac catheterization with PTCA to mid LAD lesion complicated by hypotension and bradycardia requiring Atropine , IV fluids and transient dopamine possibly secondary to vagal reaction , subsequently transferred to CCU for close monitoring , hemodynamically stable at the time of admission to the CCU ."]]).toDF("text")
results = nlpPipeline.fit(data).transform(data)
val documentAssembler = new DocumentAssembler()
.setInputCol("text")
.setOutputCol("document")
val sentenceDetector = new SentenceDetector()
.setInputCols("document")
.setOutputCol("sentence")
val tokenizer = new Tokenizer()
.setInputCols("sentence")
.setOutputCol("token")
val word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
.setInputCols(Array("sentence", "token"))
.setOutputCol("embeddings")
val clinical_ner = MedicalNerModel.pretrained("ner_clinical", "en", "clinical/models")
.setInputCols(Array("sentence", "token", "embeddings"))
.setOutputCol("ner")
val ner_converter = new NerConverter()
.setInputCols(Array("sentence", "token", "ner"))
.setOutputCol("ner_chunk")
val chunk2doc = new Chunk2Doc()
.setInputCols("ner_chunk")
.setOutputCol("ner_chunk_doc")
val sbert_embedder = BertSentenceEmbeddings
.pretrained("sbiobert_base_cased_mli","en","clinical/models")
.setInputCols("ner_chunk_doc")
.setOutputCol("sbert_embeddings")
val cpt_resolver = SentenceEntityResolverModel
.load("sbiobertresolve_cpt")
.setInputCols(Array("ner_chunk", "sbert_embeddings"))
.setOutputCol("resolution")
.setDistanceFunction("EUCLIDEAN")
val pipeline = new Pipeline().setStages(
Array(
document_assembler,
sentence_detector,
tokenizer, word_embeddings,
clinical_ner,
ner_converter,
chunk2doc,
sbert_embedder,
cpt_resolver))
val data = Seq("""This is an 82 - year-old male with a history of prior tobacco use , hypertension , chronic renal insufficiency , COPD , gastritis , and TIA who initially presented to Braintree with a non-ST elevation MI and Guaiac positive stools , transferred to St . Margaret\'s Center for Women & Infants for cardiac catheterization with PTCA to mid LAD lesion complicated by hypotension and bradycardia requiring Atropine , IV fluids and transient dopamine possibly secondary to vagal reaction , subsequently transferred to CCU for close monitoring , hemodynamically stable at the time of admission to the CCU .""").toDS().toDF("text")
val result = pipeline.fit(data).transform(data)
import nlu
nlu.load("en.resolve").predict("""This is an 82 - year-old male with a history of prior tobacco use , hypertension , chronic renal insufficiency , COPD , gastritis , and TIA who initially presented to Braintree with a non-ST elevation MI and Guaiac positive stools , transferred to St . Margaret\'s Center for Women & Infants for cardiac catheterization with PTCA to mid LAD lesion complicated by hypotension and bradycardia requiring Atropine , IV fluids and transient dopamine possibly secondary to vagal reaction , subsequently transferred to CCU for close monitoring , hemodynamically stable at the time of admission to the CCU .""")
Results
+--------------------+-----+---+---------+-----+----------+--------------------+--------------------+
| chunk|begin|end| entity| code|confidence| all_k_resolutions| all_k_codes|
+--------------------+-----+---+---------+-----+----------+--------------------+--------------------+
| hypertension| 68| 79| PROBLEM|49425| 0.0967|Insertion of peri...|49425:::36818:::3...|
|chronic renal ins...| 83|109| PROBLEM|50070| 0.2569|Nephrolithotomy; ...|50070:::49425:::5...|
| COPD| 113|116| PROBLEM|49425| 0.0779|Insertion of peri...|49425:::31592:::4...|
| gastritis| 120|128| PROBLEM|43810| 0.5289|Gastroduodenostom...|43810:::43880:::4...|
| TIA| 136|138| PROBLEM|25927| 0.2060|Transmetacarpal a...|25927:::25931:::6...|
|a non-ST elevatio...| 182|202| PROBLEM|33300| 0.3046|Repair of cardiac...|33300:::33813:::3...|
|Guaiac positive s...| 208|229| PROBLEM|47765| 0.0974|Anastomosis, of i...|47765:::49425:::1...|
|cardiac catheteri...| 295|317| TEST|62225| 0.1996|Replacement or ir...|62225:::33722:::4...|
| PTCA| 324|327|TREATMENT|60500| 0.1481|Parathyroidectomy...|60500:::43800:::2...|
| mid LAD lesion| 332|345| PROBLEM|33722| 0.3097|Closure of aortic...|33722:::33732:::3...|
+--------------------+-----+---+---------+-----+----------+--------------------+--------------------+
Model Information
Model Name: | sbiobertresolve_cpt |
Compatibility: | Healthcare NLP 3.0.4+ |
License: | Licensed |
Edition: | Official |
Input Labels: | [ner_chunk, sbert_embeddings] |
Output Labels: | [cpt_code] |
Language: | en |
Case sensitive: | false |
Data Source
Trained on Current Procedural Terminology dataset with sbiobert_base_cased_mli
sentence embeddings.
References
CPT resolver models are removed from the Models Hub due to license restrictions and can only be shared with the users who already have a valid CPT license. If you possess one and wish to use this model, kindly contact us at support@johnsnowlabs.com.