SDOH Alcohol Usage For Binary Classification


This Generic Classifier model detects alcohol use in clinical notes and was trained with the GenericClassifierApproach annotator. It assigns one of three labels. Present: the patient currently consumes alcohol, or consumed it in the past and has quit. Never: the patient has never consumed alcohol. None: the note contains no alcohol-related text.

Predicted Entities

Present, Never, None
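
Downstream code often needs to collapse these three labels into a single yes/no/unknown flag. The sketch below shows one way to do that in plain Python. The helper name `alcohol_history_flag` is hypothetical and not part of Spark NLP, and mapping the None label to Python's `None` is an assumed convention.

```python
from typing import Optional

# Label meanings as described for this model.
LABEL_MEANINGS = {
    "Present": "current alcohol consumer, or past consumer who has quit",
    "Never": "patient has never consumed alcohol",
    "None": "no alcohol-related text in the note",
}

# Assumed tri-state convention: True / False / unknown (None).
LABEL_TO_FLAG = {"Present": True, "Never": False, "None": None}


def alcohol_history_flag(label: str) -> Optional[bool]:
    """Hypothetical helper: map a predicted label to a tri-state flag."""
    if label not in LABEL_MEANINGS:
        raise ValueError(f"unexpected label: {label!r}")
    return LABEL_TO_FLAG[label]


print(alcohol_history_flag("Present"))  # True
print(alcohol_history_flag("Never"))    # False
print(alcohol_history_flag("None"))     # None
```

Because Present also covers past drinkers who have quit, a `True` flag means alcohol use is documented, not that the patient drinks today.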


How to use

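from pyspark.ml import Pipeline
from pyspark.sql.types import StringType
from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *

# Assumes a licensed session bound to `spark`; the "document" and "sentence_embeddings" column names are assumed.
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

sentence_embeddings = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", 'en','clinical/models')\
    .setInputCols(["document"])\
    .setOutputCol("sentence_embeddings")

features_asm = FeaturesAssembler()\
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("features")

generic_classifier = GenericClassifierModel.pretrained("genericclassifier_sdoh_alcohol_usage_binary_sbiobert_cased_mli", 'en', 'clinical/models')\
    .setInputCols(["features"])\
    .setOutputCol("class")

pipeline = Pipeline(stages=[document_assembler, sentence_embeddings, features_asm, generic_classifier])

text_list = ["Retired schoolteacher, now substitutes. Lives with wife in location 1439. Has a 27 yo son and a 25 yo daughter. He uses alcohol and cigarettes",
             "Employee in neuro departmentin at the Center Hospital 18. Widower since 2001. Current smoker since 20 years. No EtOH or illicits.",
             "Patient smoked 4 ppd x 37 years, quitting 22 years ago. He is widowed, lives alone, has three children."]
df = spark.createDataFrame(text_list, StringType()).toDF("text")

# Fit the pipeline, run it on the sample notes, and show the predicted label for each one
result = pipeline.fit(df).transform(df)
result.select("text", "class.result").show(truncate=100)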
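import org.apache.spark.ml.Pipeline
import spark.implicits._

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val sentence_embeddings = BertSentenceEmbeddings.pretrained("sbiobert_base_cased_mli", "en", "clinical/models")
  .setInputCols(Array("document"))
  .setOutputCol("sentence_embeddings")

val features_asm = new FeaturesAssembler()
  .setInputCols(Array("sentence_embeddings"))
  .setOutputCol("features")

val generic_classifier = GenericClassifierModel.pretrained("genericclassifier_sdoh_alcohol_usage_binary_sbiobert_cased_mli", "en", "clinical/models")
  .setInputCols(Array("features"))
  .setOutputCol("class")

// Pipeline (not PipelineModel) is the estimator that exposes setStages; intermediate column names are assumed.
val pipeline = new Pipeline().setStages(Array(document_assembler, sentence_embeddings, features_asm, generic_classifier))

val data = Seq("Retired schoolteacher, now substitutes. Lives with wife in location 1439. Has a 27 yo son and a 25 yo daughter. He uses alcohol and cigarettes.").toDS.toDF("text")

val result = pipeline.fit(data).transform(data)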
import nlu
nlu.load("en.classify.generic.sdoh_alchol_binary_sbiobert_cased").predict("""Retired schoolteacher, now substitutes. Lives with wife in location 1439. Has a 27 yo son and a 25 yo daughter. He uses alcohol and cigarettes""")


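Results

+----------------------------------------------------------------------------------------------------+---------+
|                                                                                                text|   result|
+----------------------------------------------------------------------------------------------------+---------+
|Retired schoolteacher, now substitutes. Lives with wife in location 1439. Has a 27 yo son and a 2...|[Present]|
|Employee in neuro departmentin at the Center Hospital 18. Widower since 2001. Current smoker sinc...|  [Never]|
|Patient smoked 4 ppd x 37 years, quitting 22 years ago. He is widowed, lives alone, has three chi...|   [None]|
+----------------------------------------------------------------------------------------------------+---------+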
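
Each value in the `result` column is an array holding a single label, so a small unwrapping step is usually needed before counting or joining predictions. The following is a minimal sketch in plain Python, assuming the rows have already been collected from Spark as `(text, result)` pairs. The `rows` literal mirrors the truncated output above, and the helper name `unwrap_label` is hypothetical.

```python
from collections import Counter

# (text, result) pairs mirroring the output table; texts are truncated as displayed.
rows = [
    ("Retired schoolteacher, now substitutes. Lives with wife in location 1439...", ["Present"]),
    ("Employee in neuro departmentin at the Center Hospital 18. Widower since 2001...", ["Never"]),
    ("Patient smoked 4 ppd x 37 years, quitting 22 years ago...", ["None"]),
]


def unwrap_label(result):
    """Return the single label from an array-valued result, or None if it is empty."""
    return result[0] if result else None


labels = [unwrap_label(result) for _, result in rows]
counts = Counter(labels)

print(labels)        # ['Present', 'Never', 'None']
print(dict(counts))  # {'Present': 1, 'Never': 1, 'None': 1}
```

Note that the label "None" is a string returned by the model, distinct from the Python `None` that `unwrap_label` returns for an empty array.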

Model Information

Model Name: genericclassifier_sdoh_alcohol_usage_binary_sbiobert_cased_mli
Compatibility: Healthcare NLP 4.2.4+
License: Licensed
Edition: Official
Input Labels: [features]
Output Labels: [prediction]
Language: en
Size: 3.4 MB


Benchmarking

       label  precision    recall  f1-score   support
       Never       0.85      0.86      0.85       523
        None       0.81      0.82      0.81       341
     Present       0.88      0.86      0.87       516
    accuracy        -         -        0.85      1380
   macro-avg       0.85      0.85      0.85      1380
weighted-avg       0.85      0.85      0.85      1380
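
The per-class F1 scores and the averaged rows can be cross-checked from the precision, recall, and support columns. The sketch below recomputes them in plain Python from the rounded values in the table, so small differences against the original unrounded metrics are possible. Variable names are illustrative.

```python
# (precision, recall, support) per label, copied from the table above.
report = {
    "Never": (0.85, 0.86, 523),
    "None": (0.81, 0.82, 341),
    "Present": (0.88, 0.86, 516),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


per_class_f1 = {label: f1(p, r) for label, (p, r, _) in report.items()}
total_support = sum(support for _, _, support in report.values())

# Macro average weights every class equally; weighted average uses class support.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
weighted_f1 = sum(
    per_class_f1[label] * support for label, (_, _, support) in report.items()
) / total_support

for label, score in per_class_f1.items():
    print(f"{label:>8}: {score:.2f}")
print(f"macro-avg: {macro_f1:.2f}  weighted-avg: {weighted_f1:.2f}  support: {total_support}")
```

Run as written, this reproduces the table's F1 column (0.85, 0.81, 0.87), both averaged F1 values (0.85), and the total support of 1380.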