Description
This pipeline extracts substance use information from medical text.
SUBSTANCE_USE
: Mentions of illegal recreational drug use. Also includes substances that can create dependency, such as caffeine and tea. Examples: “overdose”, “cocaine”, “illicit substance intoxication”, “coffee”.
Predicted Entities
SUBSTANCE_USE
How to use
from sparknlp.pretrained import PretrainedPipeline
ner_pipeline = PretrainedPipeline("ner_substance_use_benchmark_pipeline", "en", "clinical/models")
text = """SOCIAL HISTORY : The patient is a nonsmoker . Denies any alcohol or illicit drug use . The patient does live with his family .
SOCIAL HISTORY : The patient smokes approximately 2 packs per day times greater than 40 years . He does drink occasional alcohol approximately 5 to 6 alcoholic drinks per month . He denies any drug use . He is a retired liquor store owner .
SOCIAL HISTORY : Patient admits alcohol use , Drinking is described as heavy , Patient denies illegal drug use , Patient denies STD history , Patient denies tobacco use .
SOCIAL HISTORY : The patient is employed in the finance department . He is a nonsmoker . He does consume alcohol on the weekend as much as 3 to 4 alcoholic beverages per day on the weekends . He denies any IV drug use or abuse .
SOCIAL HISTORY : The patient is a smoker . Admits to heroin use , alcohol abuse as well . Also admits today using cocaine .
"""
result = ner_pipeline.fullAnnotate(text)
# johnsnowlabs variant: `nlp` must be imported for nlp.PretrainedPipeline below
from johnsnowlabs import nlp
ner_pipeline = nlp.PretrainedPipeline("ner_substance_use_benchmark_pipeline", "en", "clinical/models")
text = """SOCIAL HISTORY : The patient is a nonsmoker . Denies any alcohol or illicit drug use . The patient does live with his family .
SOCIAL HISTORY : The patient smokes approximately 2 packs per day times greater than 40 years . He does drink occasional alcohol approximately 5 to 6 alcoholic drinks per month . He denies any drug use . He is a retired liquor store owner .
SOCIAL HISTORY : Patient admits alcohol use , Drinking is described as heavy , Patient denies illegal drug use , Patient denies STD history , Patient denies tobacco use .
SOCIAL HISTORY : The patient is employed in the finance department . He is a nonsmoker . He does consume alcohol on the weekend as much as 3 to 4 alcoholic beverages per day on the weekends . He denies any IV drug use or abuse .
SOCIAL HISTORY : The patient is a smoker . Admits to heroin use , alcohol abuse as well . Also admits today using cocaine .
"""
result = ner_pipeline.fullAnnotate(text)
import com.johnsnowlabs.nlp.pretrained.PretrainedPipeline
val ner_pipeline = PretrainedPipeline("ner_substance_use_benchmark_pipeline", "en", "clinical/models")
val text = """SOCIAL HISTORY : The patient is a nonsmoker . Denies any alcohol or illicit drug use . The patient does live with his family .
SOCIAL HISTORY : The patient smokes approximately 2 packs per day times greater than 40 years . He does drink occasional alcohol approximately 5 to 6 alcoholic drinks per month . He denies any drug use . He is a retired liquor store owner .
SOCIAL HISTORY : Patient admits alcohol use , Drinking is described as heavy , Patient denies illegal drug use , Patient denies STD history , Patient denies tobacco use .
SOCIAL HISTORY : The patient is employed in the finance department . He is a nonsmoker . He does consume alcohol on the weekend as much as 3 to 4 alcoholic beverages per day on the weekends . He denies any IV drug use or abuse .
SOCIAL HISTORY : The patient is a smoker . Admits to heroin use , alcohol abuse as well . Also admits today using cocaine .
"""
val result = ner_pipeline.fullAnnotate(text)
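The `fullAnnotate` call returns one dictionary per input document, keyed by output column. The following is a minimal sketch of flattening the chunk annotations into rows like those reported in the Results section. The output column name `ner_chunk` and the helper `chunks_to_rows` are assumptions for illustration (check the pipeline's actual output columns), and `FakeAnnotation` is a hypothetical stand-in so the snippet runs without a Spark session. Real Spark NLP annotations expose the same `result`, `begin`, `end`, and `metadata` attributes.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAnnotation:
    """Hypothetical stand-in mirroring the fields of a Spark NLP Annotation."""
    result: str
    begin: int
    end: int
    metadata: dict = field(default_factory=dict)


def chunks_to_rows(full_annotate_result, chunk_col="ner_chunk"):
    """Flatten fullAnnotate-style output into (chunk, begin, end, label) tuples.

    `chunk_col` is an assumed output column name; adjust it to match the pipeline.
    """
    rows = []
    for document in full_annotate_result:
        for ann in document.get(chunk_col, []):
            # Chunk annotations conventionally carry their label under "entity".
            rows.append((ann.result, ann.begin, ann.end, ann.metadata.get("entity")))
    return rows


# Stand-in for `result = ner_pipeline.fullAnnotate(text)`,
# populated with the first two rows of the Results table.
example_result = [{
    "ner_chunk": [
        FakeAnnotation("illicit drug use", 68, 83, {"entity": "SUBSTANCE_USE"}),
        FakeAnnotation("drug use", 320, 327, {"entity": "SUBSTANCE_USE"}),
    ]
}]

rows = chunks_to_rows(example_result)
for row in rows:
    print(row)
```

With a live pipeline, passing the real `result` in place of `example_result` should work the same way, provided the chunk column name matches.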
Results
| | chunk | begin | end | ner_label |
|---:|:-----------------|--------:|------:|:--------------|
| 0 | illicit drug use | 68 | 83 | SUBSTANCE_USE |
| 1 | drug use | 320 | 327 | SUBSTANCE_USE |
| 2 | illegal drug use | 462 | 477 | SUBSTANCE_USE |
| 3 | IV drug use | 745 | 755 | SUBSTANCE_USE |
| 4 | heroin use | 821 | 830 | SUBSTANCE_USE |
| 5 | using cocaine | 876 | 888 | SUBSTANCE_USE |
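The `begin` and `end` columns read as zero-based character offsets into the input string, with `end` inclusive. A minimal check follows, using the first two notes of the sample text and the first two rows of the table. The helper name `span_text` is illustrative and not part of any library.

```python
# First two notes of the sample input, separated by a newline
# exactly as in the triple-quoted string passed to the pipeline.
text = (
    "SOCIAL HISTORY : The patient is a nonsmoker . Denies any alcohol or "
    "illicit drug use . The patient does live with his family .\n"
    "SOCIAL HISTORY : The patient smokes approximately 2 packs per day "
    "times greater than 40 years . He does drink occasional alcohol "
    "approximately 5 to 6 alcoholic drinks per month . He denies any "
    "drug use . He is a retired liquor store owner .\n"
)


def span_text(source, begin, end):
    """Return the substring covered by inclusive [begin, end] character offsets."""
    return source[begin:end + 1]


# (chunk, begin, end) taken from rows 0 and 1 of the Results table.
reported = [("illicit drug use", 68, 83), ("drug use", 320, 327)]

for chunk, begin, end in reported:
    # Each reported span should slice back to the reported chunk text.
    assert span_text(text, begin, end) == chunk
```

Because the offsets are relative to the whole input string, splitting the notes into separate documents before annotation would change the reported positions.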
Model Information
| Property | Value |
|:---------------|:-------------------------------------|
| Model Name: | ner_substance_use_benchmark_pipeline |
| Type: | pipeline |
| Compatibility: | Healthcare NLP 5.5.3+ |
| License: | Licensed |
| Edition: | Official |
| Language: | en |
| Size: | 1.7 GB |
Included Models
- DocumentAssembler
- SentenceDetector
- TokenizerModel
- WordEmbeddingsModel
- TextMatcherInternalModel
- MedicalNerModel
- NerConverterInternalModel
- ChunkMergeModel
- ChunkMergeModel
Benchmarking
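| label | precision | recall | f1-score | support |
|:--------------|----------:|-------:|---------:|--------:|
| O | 1.000 | 1.000 | 1.000 | 82313 |
| SUBSTANCE_USE | 1.000 | 0.981 | 0.990 | 258 |
| accuracy | - | - | 1.000 | 82571 |
| macro-avg | 1.000 | 0.990 | 0.995 | 82571 |
| weighted-avg | 1.000 | 1.000 | 1.000 | 82571 |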
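The f1-score column is the harmonic mean of precision and recall, and the macro average is the unweighted mean over the two labels. Below is a quick sanity check against the rounded figures in the benchmarking table. The function name `f1` is illustrative, and because the inputs are already rounded to three decimals, the last digit could differ in general.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported (rounded) in the benchmarking table.
f1_o = f1(1.000, 1.000)
f1_substance = f1(1.000, 0.981)

# Macro average: unweighted mean of the per-label f1 scores.
macro_f1 = (f1_o + f1_substance) / 2

print(f"SUBSTANCE_USE f1: {f1_substance:.3f}")
print(f"macro-avg f1:     {macro_f1:.3f}")
```

The weighted average stays near 1.000 because the `O` label accounts for 82313 of the 82571 tokens, so the `SUBSTANCE_USE` recall has little influence on it.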