Sentence Entity Resolver for RxNorm (NDC)

Description

This model maps DRUG entities to RxNorm codes and their National Drug Codes (NDC) using sbiobert_base_cased_mli sentence embeddings. The NDC codes for each drug are returned in the all_k_aux_labels field of the metadata, separated by the | symbol.
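As a rough illustration of how the NDC codes could be pulled apart once the metadata string has been read, here is a minimal sketch in plain Python. The helper name split_ndc_codes is hypothetical, and the ":::" separator between candidates is an assumption about how the resolver joins its top-k entries; only the | separator between NDC codes is stated above. The sample string reuses codes from the Results table, with the grouping chosen purely for illustration.

```python
from typing import List


def split_ndc_codes(all_k_aux_labels: str,
                    candidate_sep: str = ":::",
                    code_sep: str = "|") -> List[List[str]]:
    """Return one list of NDC codes per resolution candidate.

    candidate_sep is an ASSUMPTION (separator between top-k candidates);
    code_sep is the | symbol described in the model card.
    """
    if not all_k_aux_labels:
        return []
    return [
        # Drop empty fragments so stray separators do not yield "" codes.
        [code.strip() for code in candidate.split(code_sep) if code.strip()]
        for candidate in all_k_aux_labels.split(candidate_sep)
    ]


# Illustrative input only: codes copied from the Results table, grouping assumed.
example = "69784030828:::00395052791:::08679001362|86790016280|00067004490"
ndc_per_candidate = split_ndc_codes(example)
print(ndc_per_candidate)
```

If the metadata in your version uses a different candidate separator, pass it through candidate_sep rather than changing the function.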

Predicted Entities

RxNorm Codes, NDC Codes

How to use

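# Python (assumes an active Spark session named `spark` with Healthcare NLP loaded)
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import BertSentenceEmbeddings
from sparknlp_jsl.annotator import SentenceEntityResolverModel

# The input text is already a drug mention, so it is emitted directly as "ner_chunk"
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("ner_chunk")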

sbert_embedder = BertSentenceEmbeddings\
    .pretrained('sbiobert_base_cased_mli', 'en','clinical/models')\
    .setInputCols(["ner_chunk"])\
    .setOutputCol("sentence_embeddings")

rxnorm_ndc_resolver = SentenceEntityResolverModel\
    .pretrained("sbiobertresolve_rxnorm_ndc", "en", "clinical/models") \
    .setInputCols(["sentence_embeddings"]) \
    .setOutputCol("rxnorm_code")\
    .setDistanceFunction("EUCLIDEAN")

rxnorm_ndc_pipeline = Pipeline(stages = [
    documentAssembler,
    sbert_embedder,
    rxnorm_ndc_resolver])

data = spark.createDataFrame([["activated charcoal 30000 mg powder for oral suspension"]]).toDF("text")

res = rxnorm_ndc_pipeline.fit(data).transform(data)

// Scala version of the same pipeline
val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("ner_chunk")

val sbert_embedder = BertSentenceEmbeddings
    .pretrained("sbiobert_base_cased_mli", "en","clinical/models")
    .setInputCols(Array("ner_chunk"))
    .setOutputCol("sentence_embeddings")

val rxnorm_ndc_resolver = SentenceEntityResolverModel
    .pretrained("sbiobertresolve_rxnorm_ndc", "en", "clinical/models") 
    .setInputCols(Array("sentence_embeddings")) 
    .setOutputCol("rxnorm_code")
    .setDistanceFunction("EUCLIDEAN")

val rxnorm_ndc_pipeline = new Pipeline().setStages(Array(
    documentAssembler, 
    sbert_embedder, 
    rxnorm_ndc_resolver))

val data = Seq("activated charcoal 30000 mg powder for oral suspension").toDF("text")

val res = rxnorm_ndc_pipeline.fit(data).transform(data)

# NLU version
import nlu
nlu.load("en.resolve.rxnorm_ndc").predict("""activated charcoal 30000 mg powder for oral suspension""")

Results

+--+------------------------------------------------------+-----------+-----------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------+
|  |ner_chunk                                             |rxnorm_code|all_codes                                      |resolutions                                                                                                                                                                                                                                                                               |all_k_aux_labels (ndc_codes)                                                                           |
+--+------------------------------------------------------+-----------+-----------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------+
|0 |activated charcoal 30000 mg powder for oral suspension|1440919    |[1440919, 808917, 1088194, 1191772, 808921,...]|'activated charcoal 30000 MG Powder for Oral Suspension', 'Activated Charcoal 30000 MG Powder for Oral Suspension', 'wheat dextrin 3000 MG Powder for Oral Solution [Benefiber]', 'cellulose 3000 MG Oral Powder [Unifiber]', 'fosfomycin 3000 MG Powder for Oral Solution [Monurol]', ...|69784030828, 00395052791, 08679001362|86790016280|00067004490, 46017004408|68220004416, 00456430001,...|

Model Information

Model Name: sbiobertresolve_rxnorm_ndc
Compatibility: Healthcare NLP 3.2.3+
License: Licensed
Edition: Official
Input Labels: [sentence_embeddings]
Output Labels: [rxnorm_code]
Language: en
Case sensitive: false