Description
This BioBERT-based sentiment analysis model classifies tweets related to the COVID-19 pandemic. It predicts whether a tweet expresses positive, negative, or neutral sentiment about the pandemic.
Predicted Entities
neutral, positive, negative
How to use
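# Python (Spark NLP / Healthcare NLP imports); assumes an active SparkSession named `spark`
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer
from sparknlp_jsl.annotator import MedicalBertForSequenceClassification

# Wrap the raw "text" column into document annotations
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Load the pretrained sentiment classifier
sequence_classifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_covid_sentiment_onnx", "en", "clinical/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequence_classifier
])

data = spark.createDataFrame([
    ["British Department of Health confirms first two cases of in UK"],
    ["so my trip to visit my australian exchange student just got canceled bc of coronavirus. im heartbroken :("],
    ["I wish everyone to be safe at home and stop pandemic"]
]).toDF("text")

# Fit the pipeline and classify the sample tweets
model = pipeline.fit(data)
result = model.transform(data)

# Show each tweet next to its predicted label
result.select("text", "class.result").show(truncate=False)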
# Python (johnsnowlabs library); assumes an active SparkSession named `spark`
from johnsnowlabs import nlp, medical

# Wrap the raw "text" column into document annotations
document_assembler = nlp.DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into tokens
tokenizer = nlp.Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Load the pretrained sentiment classifier
sequence_classifier = medical.BertForSequenceClassification.pretrained("bert_sequence_classifier_covid_sentiment_onnx", "en", "clinical/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = nlp.Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequence_classifier
])

data = spark.createDataFrame([
    ["British Department of Health confirms first two cases of in UK"],
    ["so my trip to visit my australian exchange student just got canceled bc of coronavirus. im heartbroken :("],
    ["I wish everyone to be safe at home and stop pandemic"]
]).toDF("text")

# Fit the pipeline and classify the sample tweets
model = pipeline.fit(data)
result = model.transform(data)

# Show each tweet next to its predicted label
result.select("text", "class.result").show(truncate=False)
// Scala; assumes an active SparkSession named `spark` with the Spark NLP and Healthcare NLP annotator classes already imported
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for Seq(...).toDF

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_covid_sentiment_onnx", "en", "clinical/models")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))

val data = Seq(
  "British Department of Health confirms first two cases of in UK",
  "so my trip to visit my australian exchange student just got canceled bc of coronavirus. im heartbroken :(",
  "I wish everyone to be safe at home and stop pandemic"
).toDF("text")

val model = pipeline.fit(data)
val result = model.transform(data)

// Show each tweet next to its predicted label
result.select("text", "class.result").show(false)
Results
+---------------------------------------------------------------------------------------------------------+----------+
|text                                                                                                     |result    |
+---------------------------------------------------------------------------------------------------------+----------+
|British Department of Health confirms first two cases of in UK                                           |[neutral] |
|so my trip to visit my australian exchange student just got canceled bc of coronavirus. im heartbroken :(|[negative]|
|I wish everyone to be safe at home and stop pandemic                                                     |[positive]|
+---------------------------------------------------------------------------------------------------------+----------+
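Once predictions are collected to the driver, a common next step is to tally how many tweets fall under each label. The following is a minimal sketch in plain Python, not part of the model's API: `rows`, `LABELS`, and `count_sentiments` are hypothetical names, and the rows are hard-coded to mirror the table above rather than read from Spark.

```python
# Hypothetical rows shaped like the Results table: (text, list of predicted labels).
# With Spark, rows of this shape could be collected from the transformed DataFrame,
# assuming the classifier's output column is named "class".
rows = [
    ("British Department of Health confirms first two cases of in UK", ["neutral"]),
    ("so my trip to visit my australian exchange student just got canceled bc of coronavirus. im heartbroken :(", ["negative"]),
    ("I wish everyone to be safe at home and stop pandemic", ["positive"]),
]

# The three labels listed under Predicted Entities
LABELS = ("neutral", "positive", "negative")


def count_sentiments(rows):
    """Count how many predictions fall under each known label."""
    counts = {label: 0 for label in LABELS}
    for _text, predicted in rows:
        for label in predicted:
            if label not in counts:
                # Fail loudly on anything outside the documented label set
                raise ValueError(f"unexpected label: {label!r}")
            counts[label] += 1
    return counts


sentiment_counts = count_sentiments(rows)
print(sentiment_counts)
```

Labels with no predictions stay in the output with a count of zero, which keeps downstream reporting stable across batches.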
Model Information
| Model Name: | bert_sequence_classifier_covid_sentiment_onnx |
| Compatibility: | Healthcare NLP 6.1.1+ |
| License: | Licensed |
| Edition: | Official |
| Input Labels: | [document, token] |
| Output Labels: | [label] |
| Language: | en |
| Size: | 437.7 MB |
| Case sensitive: | true |