RxNorm Scd ChunkResolver

Description

Entity Resolution model Based on KNN using Word Embeddings + Word Movers Distance.

Predicted Entities

RxNorm Codes and their normalized definition with clinical_embeddings.

Open in Colab Copy S3 URI

How to use

...

rxnormResolver = ChunkEntityResolverModel()\
    .pretrained('chunkresolve_rxnorm_scd_clinical', 'en', "clinical/models")\
    .setEnableLevenshtein(True)\
    .setNeighbours(200).setAlternatives(5).setDistanceWeights([3,3,2,0,0,7])\
    .setInputCols(['token', 'chunk_embs_drug'])\
    .setOutputCol('rxnorm_resolution')\

pipeline_rxnorm = Pipeline(stages = [documentAssembler, sentenceDetector, tokenizer, stopwords, word_embeddings, jslNer, drugNer, jslConverter, drugConverter, jslChunkEmbeddings, drugChunkEmbeddings, rxnormResolver])

model = pipeline_rxnorm.fit(spark.createDataFrame([['']]).toDF("text"))

results = model.transform(data)

...

val rxnormResolver = ChunkEntityResolverModel()
    .pretrained('chunkresolve_rxnorm_scd_clinical', 'en', "clinical/models")
    .setEnableLevenshtein(True)
    .setNeighbours(200).setAlternatives(5).setDistanceWeights(Array(3,3,2,0,0,7))
    .setInputCols('token', 'chunk_embs_drug')
    .setOutputCol('rxnorm_resolution')

val pipeline = new Pipeline().setStages(Array(documentAssembler, sentenceDetector, tokenizer, stopwords, word_embeddings, jslNer, drugNer, jslConverter, drugConverter, jslChunkEmbeddings, drugChunkEmbeddings, rxnormResolver))

val result = pipeline.fit(Seq.empty[String]).transform(data)

Results

| coords       | chunk       | entity    | rxnorm_opts                                                                             |
|--------------|-------------|-----------|-----------------------------------------------------------------------------------------|
| 3::278::287  | creatinine  | DrugChem  | [(849628, Creatinine 800 MG Oral Capsule), (252180, Urea 10 MG/ML Topical Lotion), ...] |
| 7::83::93    | cholesterol | DrugChem  | [(2104173, beta Sitosterol 35 MG Oral Tablet), (832876, phytosterol esters 500 MG O...] |
| 10::397::406 | creatinine  | DrugChem  | [(849628, Creatinine 800 MG Oral Capsule), (252180, Urea 10 MG/ML Topical Lotion), ...] |

Model Information

Name: chunkresolve_rxnorm_scd_clinical  
Type: ChunkEntityResolverModel  
Compatibility: Spark NLP 2.5.1+  
License: Licensed  
Edition: Official  
Input labels: [token, chunk_embeddings]  
Output labels: [entity]  
Language: en  
Case sensitive: True  
Dependencies: embeddings_clinical  

Data Source

Trained on December 2019 RxNorm Clinical Drugs (TTY=SCD) ontology graph with embeddings_clinical https://www.nlm.nih.gov/pubs/techbull/nd19/brief/nd19_rxnorm_december_2019_release.html