Description
This model maps DRUG entities to RxNorm codes and their National Drug Codes (NDC) using sbiobert_base_cased_mli sentence embeddings. The NDC codes for each drug are returned in the all_k_aux_labels field of the output metadata, separated by the | symbol.
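To make the | convention concrete, here is a minimal Python sketch that splits one auxiliary label string into individual NDC codes. The helper names `split_ndc_codes` and `format_ndc_5_4_2` are hypothetical and not part of any library, and the 11-digit filter is an assumption based on the codes shown in the Results table.

```python
import re
from typing import List


def split_ndc_codes(aux_label: str) -> List[str]:
    """Split one auxiliary label string into individual NDC codes."""
    # Assumption: codes inside one label are separated by "|", as described above.
    parts = [p.strip() for p in aux_label.split("|")]
    # Assumption: valid codes are 11 digits, like those in the Results table.
    return [p for p in parts if re.fullmatch(r"\d{11}", p)]


def format_ndc_5_4_2(code: str) -> str:
    """Render an 11-digit NDC with hyphens in the 5-4-2 layout."""
    return f"{code[:5]}-{code[5:9]}-{code[9:]}"


# Sample string taken from the Results table
codes = split_ndc_codes("08679001362|86790016280|00067004490")
print(codes)
print([format_ndc_5_4_2(c) for c in codes])
```

Tokens that are not 11 digits are dropped silently here; depending on the use case it may be preferable to log or raise on them instead.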
Predicted Entities
RxNorm Codes, NDC Codes
How to use
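# Python: assumes a Spark session `spark` started with Spark NLP for Healthcare
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import BertSentenceEmbeddings
from sparknlp_jsl.annotator import SentenceEntityResolverModel

# Treat the whole input text as a single chunk to resolve
documentAssembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("ner_chunk")

# Embed each chunk with sbiobert_base_cased_mli sentence embeddings
sbert_embedder = BertSentenceEmbeddings\
    .pretrained('sbiobert_base_cased_mli', 'en', 'clinical/models')\
    .setInputCols(["ner_chunk"])\
    .setOutputCol("sentence_embeddings")

# Resolve each embedding to an RxNorm code; NDC codes are returned in the metadata
rxnorm_ndc_resolver = SentenceEntityResolverModel\
    .pretrained("sbiobertresolve_rxnorm_ndc", "en", "clinical/models")\
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("rxnorm_code")\
    .setDistanceFunction("EUCLIDEAN")

rxnorm_ndc_pipeline = Pipeline(stages = [
    documentAssembler,
    sbert_embedder,
    rxnorm_ndc_resolver])

data = spark.createDataFrame([["activated charcoal 30000 mg powder for oral suspension"]]).toDF("text")
res = rxnorm_ndc_pipeline.fit(data).transform(data)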
// Scala: assumes a SparkSession named `spark` is in scope
import org.apache.spark.ml.Pipeline
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.BertSentenceEmbeddings
import com.johnsnowlabs.nlp.annotators.resolution.SentenceEntityResolverModel
import spark.implicits._

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("ner_chunk")

val sbert_embedder = BertSentenceEmbeddings
  .pretrained("sbiobert_base_cased_mli", "en", "clinical/models")
  .setInputCols(Array("ner_chunk"))
  .setOutputCol("sentence_embeddings")

val rxnorm_ndc_resolver = SentenceEntityResolverModel
  .pretrained("sbiobertresolve_rxnorm_ndc", "en", "clinical/models")
  .setInputCols(Array("sentence_embeddings"))
  .setOutputCol("rxnorm_code")
  .setDistanceFunction("EUCLIDEAN")

val rxnorm_ndc_pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  sbert_embedder,
  rxnorm_ndc_resolver))

val data = Seq("activated charcoal 30000 mg powder for oral suspension").toDF("text")
val res = rxnorm_ndc_pipeline.fit(data).transform(data)
# Python: the same resolver through the NLU one-liner API
import nlu
nlu.load("en.resolve.rxnorm_ndc").predict("""activated charcoal 30000 mg powder for oral suspension""")
Results
+--+------------------------------------------------------+-----------+-----------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------+
| |ner_chunk |rxnorm_code|all_codes |resolutions |all_k_aux_labels (ndc_codes) |
+--+------------------------------------------------------+-----------+-----------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------+
|0 |activated charcoal 30000 mg powder for oral suspension|1440919 |[1440919, 808917, 1088194, 1191772, 808921,...]|'activated charcoal 30000 MG Powder for Oral Suspension', 'Activated Charcoal 30000 MG Powder for Oral Suspension', 'wheat dextrin 3000 MG Powder for Oral Solution [Benefiber]', 'cellulose 3000 MG Oral Powder [Unifiber]', 'fosfomycin 3000 MG Powder for Oral Solution [Monurol]', ...|69784030828, 00395052791, 08679001362|86790016280|00067004490, 46017004408|68220004416, 00456430001,...|
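The all_codes, resolutions and all_k_aux_labels columns above are parallel lists of candidates. The following is a rough Python sketch of how such lists could be paired up from one annotation's metadata. It assumes the metadata is a plain dict of strings with the keys `all_k_results`, `all_k_resolutions` and `all_k_aux_labels`, and that candidates are delimited by `:::`; both are assumptions to verify against your installed version. The function name `candidates_from_metadata` is hypothetical, and the grouping of NDC codes per candidate in the sample dict is illustrative only, not taken from real model output.

```python
from typing import Dict, List

SEP = ":::"  # assumed delimiter between candidates in the metadata strings


def candidates_from_metadata(meta: Dict[str, str], sep: str = SEP) -> List[dict]:
    """Pair each candidate RxNorm code with its resolution text and NDC codes."""
    if not meta.get("all_k_results"):
        return []
    codes = meta["all_k_results"].split(sep)
    names = meta.get("all_k_resolutions", "").split(sep)
    aux = meta.get("all_k_aux_labels", "").split(sep)
    candidates = []
    # zip stops at the shortest list, so incomplete metadata yields fewer rows
    for code, name, ndc in zip(codes, names, aux):
        candidates.append({
            "rxnorm_code": code.strip(),
            "resolution": name.strip(),
            # NDC codes for one candidate are separated by "|"
            "ndc_codes": [c.strip() for c in ndc.split("|") if c.strip()],
        })
    return candidates


# Illustrative metadata: values come from the table above, but the
# assignment of NDC codes to each candidate is a guess for demonstration.
sample_meta = {
    "all_k_results": "1440919:::808917",
    "all_k_resolutions": (
        "activated charcoal 30000 MG Powder for Oral Suspension:::"
        "Activated Charcoal 30000 MG Powder for Oral Suspension"
    ),
    "all_k_aux_labels": "69784030828:::08679001362|86790016280",
}

for cand in candidates_from_metadata(sample_meta):
    print(cand["rxnorm_code"], cand["ndc_codes"])
```

Keeping the parsing in a plain function like this makes it easy to apply to collected rows or to wrap in a Spark UDF, whichever fits the pipeline.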
Model Information
Model Name: | sbiobertresolve_rxnorm_ndc |
Compatibility: | Healthcare NLP 3.2.3+ |
License: | Licensed |
Edition: | Official |
Input Labels: | [sentence_embeddings] |
Output Labels: | [rxnorm_code] |
Language: | en |
Case sensitive: | false |