Description
This model is a BioBERT-based sequence classifier that identifies the source of emotional stress expressed in a text.
Predicted Entities
Family_Issues, Financial_Problem, Health_Fatigue_or_Physical Pain, Other, School, Work, Social_Relationships
How to use
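# Python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer
from sparknlp_jsl.annotator import MedicalBertForSequenceClassification

# Assumes `spark` is an active, licensed Healthcare NLP session.
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Download the pretrained stressor classifier from the clinical models repository.
sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_stressor", "en", "clinical/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequenceClassifier
])

data = spark.createDataFrame([["All the panic about the global pandemic has been stressing me out!"]]).toDF("text")

result = pipeline.fit(data).transform(data)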
// Scala
// Assumes the Spark NLP and Healthcare NLP classes are imported and `spark` is an active session.
import spark.implicits._ // needed for Seq(...).toDF

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_stressor", "en", "clinical/models")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))

val data = Seq("All the panic about the global pandemic has been stressing me out!").toDF("text")

val result = pipeline.fit(data).transform(data)
# Python (NLU)
import nlu

nlu.load("en.classify.stressor").predict("""All the panic about the global pandemic has been stressing me out!""")
Results
+------------------------------------------------------------------+-----------------------------------+
|text                                                              |class                              |
+------------------------------------------------------------------+-----------------------------------+
|All the panic about the global pandemic has been stressing me out!|[Health, Fatigue, or Physical Pain]|
+------------------------------------------------------------------+-----------------------------------+
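The bracketed value in the class column is an array of strings printed as a table cell, so the row above holds one label that happens to contain commas, not three separate labels. A minimal pure-Python sketch of that reading follows. It assumes the prediction has been collected as a list with a single string; `render_array`, `KNOWN_LABELS`, and `predicted` are hypothetical names used only for illustration, and the label names are taken from the Benchmarking table of this card.

```python
# Label names as they appear in the Benchmarking table of this card.
KNOWN_LABELS = {
    "Family Issues",
    "Financial Problem",
    "Health, Fatigue, or Physical Pain",
    "Other",
    "School",
    "Social Relationships",
    "Work",
}


def render_array(values):
    """Mimic how the results table prints an array of strings (hypothetical helper)."""
    return "[" + ", ".join(values) + "]"


# Assumption: the collected value of the class column for the example row.
predicted = ["Health, Fatigue, or Physical Pain"]

print(render_array(predicted))                             # same text as the table cell
print(len(predicted))                                      # number of labels in the cell
print(all(label in KNOWN_LABELS for label in predicted))   # every label is a known class
```

Checking collected predictions against a fixed label set like this is one way to avoid splitting a label on its internal commas.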
Model Information
Model Name: bert_sequence_classifier_stressor
Compatibility: Healthcare NLP 4.0.0+
License: Licensed
Edition: Official
Input Labels: [document, token]
Output Labels: [class]
Language: en
Size: 406.5 MB
Case sensitive: true
Max sentence length: 128
Benchmarking
label                              precision  recall  f1-score  support
Family Issues                           0.80    0.87      0.84      161
Financial Problem                       0.87    0.83      0.85      126
Health, Fatigue, or Physical Pain       0.75    0.81      0.78      168
Other                                   0.82    0.80      0.81      384
School                                  0.89    0.91      0.90      127
Social Relationships                    0.83    0.71      0.76      133
Work                                    0.87    0.89      0.88      271
accuracy                                   -       -      0.83     1370
macro-avg                               0.83    0.83      0.83     1370
weighted-avg                            0.83    0.83      0.83     1370
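The summary rows can be cross-checked against the per-label rows. The sketch below assumes the table follows the usual classification-report conventions: F1 is the harmonic mean of precision and recall, macro-avg is the unweighted mean over labels, and weighted-avg weights each label by its support. The names `REPORT`, `f1_from`, `macro_avg`, and `weighted_avg` are illustrative and not part of any library.

```python
# Per-label scores copied from the Benchmarking table:
# (precision, recall, f1-score, support).
REPORT = {
    "Family Issues": (0.80, 0.87, 0.84, 161),
    "Financial Problem": (0.87, 0.83, 0.85, 126),
    "Health, Fatigue, or Physical Pain": (0.75, 0.81, 0.78, 168),
    "Other": (0.82, 0.80, 0.81, 384),
    "School": (0.89, 0.91, 0.90, 127),
    "Social Relationships": (0.83, 0.71, 0.76, 133),
    "Work": (0.87, 0.89, 0.88, 271),
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_avg(values):
    """Unweighted mean over labels."""
    return sum(values) / len(values)


def weighted_avg(values, weights):
    """Mean over labels, each weighted by its support."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


f1_scores = [row[2] for row in REPORT.values()]
supports = [row[3] for row in REPORT.values()]

total_support = sum(supports)
macro_f1 = macro_avg(f1_scores)
weighted_f1 = weighted_avg(f1_scores, supports)

print(total_support)          # 1370
print(round(macro_f1, 2))     # 0.83
print(round(weighted_f1, 2))  # 0.83

# F1 recomputed from the rounded precision and recall can differ from the
# reported value in the last digit, so compare with a small tolerance.
for label, (p, r, f1, _) in REPORT.items():
    assert abs(f1_from(p, r) - f1) <= 0.01, label
```

Because the published precision and recall are rounded to two decimals, the recomputed F1 for Family Issues and Social Relationships lands within 0.01 of the reported value rather than matching it exactly.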