Sentence Entity Resolver for RxCUI (``sbiobert_base_cased_mli`` embeddings)

Description

This model maps extracted medical entities (NER chunks) to RxNorm concept unique identifiers (RxCUI codes) using sbiobert_base_cased_mli sentence embeddings of each chunk.
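As a rough illustration of embedding-based resolution, the sketch below picks the candidate term whose vector is closest to a chunk vector by Euclidean distance. It is a toy model of the idea, not the library's implementation: the three-dimensional vectors are made up for illustration, and the names `CANDIDATES` and `resolve` are hypothetical.

```python
import math

# Toy 3-dimensional vectors standing in for sentence embeddings.
# ASSUMPTION: these numbers are invented for illustration; real
# sbiobert_base_cased_mli embeddings are much larger.
CANDIDATES = {
    "825427": ("eltrombopag 50 MG Oral Tablet", [0.9, 0.1, 0.0]),
    "197361": ("amlodipine 5 MG Oral Tablet", [0.1, 0.8, 0.1]),
}


def resolve(chunk_vector, candidates=CANDIDATES):
    """Return (code, term, distance) for the candidate nearest to chunk_vector."""
    code, (term, vector) = min(
        candidates.items(),
        key=lambda item: math.dist(chunk_vector, item[1][1]),
    )
    return code, term, math.dist(chunk_vector, vector)


# A chunk vector lying close to the first candidate.
code, term, distance = resolve([0.85, 0.15, 0.05])
print(code, term, round(distance, 3))
```

In the real pipeline the chunk vector would come from the sentence embedder and the candidates from the trained resolver, so only the selection rule carries over from this sketch.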

Predicted Entities

RxCUI codes and their normalized definitions, resolved with sbiobert_base_cased_mli embeddings.


How to use

...
# Wrap each NER chunk as a document so it can be embedded on its own
chunk2doc = Chunk2Doc().setInputCols("ner_chunk").setOutputCol("ner_chunk_doc")

# Compute a sentence embedding for each chunk
sbert_embedder = BertSentenceEmbeddings\
     .pretrained("sbiobert_base_cased_mli","en","clinical/models")\
     .setInputCols(["ner_chunk_doc"])\
     .setOutputCol("sbert_embeddings")

# Resolve each chunk to an RxCUI code using Euclidean distance
rxcui_resolver = SentenceEntityResolverModel.pretrained("sbiobertresolve_rxcui","en", "clinical/models")\
     .setInputCols(["ner_chunk", "sbert_embeddings"])\
     .setOutputCol("resolution")\
     .setDistanceFunction("EUCLIDEAN")

nlpPipeline = Pipeline(stages=[document_assembler, sentence_detector, tokenizer, word_embeddings, clinical_ner, ner_converter, chunk2doc, sbert_embedder, rxcui_resolver])

# Define the input once so it can be used for both fit and transform
data = spark.createDataFrame([["He was seen by the endocrinology service and she was discharged on 50 mg of eltrombopag oral at night, 5 mg amlodipine with meals, and metformin 1000 mg two times a day"]]).toDF("text")

model = nlpPipeline.fit(data)

results = model.transform(data)

...
// Wrap each NER chunk as a document so it can be embedded on its own
val chunk2doc = new Chunk2Doc().setInputCols("ner_chunk").setOutputCol("ner_chunk_doc")

// Compute a sentence embedding for each chunk
val sbert_embedder = BertSentenceEmbeddings
     .pretrained("sbiobert_base_cased_mli","en","clinical/models")
     .setInputCols(Array("ner_chunk_doc"))
     .setOutputCol("sbert_embeddings")

// Resolve each chunk to an RxCUI code using Euclidean distance
val rxcui_resolver = SentenceEntityResolverModel.pretrained("sbiobertresolve_rxcui","en", "clinical/models")
     .setInputCols(Array("ner_chunk", "sbert_embeddings"))
     .setOutputCol("resolution")
     .setDistanceFunction("EUCLIDEAN")

val pipeline = new Pipeline().setStages(Array(document_assembler, sentence_detector, tokenizer, word_embeddings, clinical_ner, ner_converter, chunk2doc, sbert_embedder, rxcui_resolver))

// Required for calling .toDF on a Seq
import spark.implicits._

val data = Seq("He was seen by the endocrinology service and she was discharged on 50 mg of eltrombopag oral at night, 5 mg amlodipine with meals, and metformin 1000 mg two times a day").toDF("text")
val result = pipeline.fit(data).transform(data)

Results

+---------------------------+--------+-----------------------------------------------------+
| chunk                     | code   | term                                                |
+---------------------------+--------+-----------------------------------------------------+
| 50 mg of eltrombopag oral | 825427 | eltrombopag 50 MG Oral Tablet                       |
| 5 mg amlodipine           | 197361 | amlodipine 5 MG Oral Tablet                         |
| metformin 1000 mg         | 861004 | metformin hydrochloride 1000 MG Oral Tablet         |
+---------------------------+--------+-----------------------------------------------------+
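Because resolution is driven by embedding distance, a resolved term can in principle differ from the chunk in details such as strength. The sketch below is one possible post-processing check, not part of the model or the library: it compares the milligram strengths mentioned in a chunk with those in its resolved term. The helper names are hypothetical, and the third row is an invented mismatch, not model output.

```python
import re

# Matches a number followed by "mg" in any letter case, e.g. "50 mg" or "5 MG".
STRENGTH = re.compile(r"(\d+(?:\.\d+)?)\s*mg", re.IGNORECASE)


def strengths(text):
    """Return the set of milligram strengths mentioned in text, as floats."""
    return {float(value) for value in STRENGTH.findall(text)}


def strength_matches(chunk, term):
    """True when chunk and term mention the same set of mg strengths."""
    return strengths(chunk) == strengths(term)


rows = [
    ("50 mg of eltrombopag oral", "eltrombopag 50 MG Oral Tablet"),
    ("5 mg amlodipine", "amlodipine 5 MG Oral Tablet"),
    # HYPOTHETICAL mismatch, invented to show a failing case.
    ("aspirin 81 mg", "aspirin 325 MG Oral Tablet"),
]

flags = [strength_matches(chunk, term) for chunk, term in rows]
print(flags)  # [True, True, False]
```

A check like this only covers strengths written in milligrams; other units or dose forms would need their own patterns.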

Model Information

Name: sbiobertresolve_rxcui
Type: SentenceEntityResolverModel
Compatibility: Spark NLP 2.6.5+
License: Licensed
Edition: Official
Input labels: [ner_chunk, sbert_embeddings]
Output labels: [resolution]
Language: en
Dependencies: sbiobert_base_cased_mli

Data Source

Trained on the November 2020 RxNorm Clinical Drugs ontology graph with sbiobert_base_cased_mli embeddings: https://www.nlm.nih.gov/pubs/techbull/nd20/brief/nd20_rx_norm_november_release.html