COVID-19 Sentiment Classifier (BioBERT) ONNX

Description

This BioBERT-based sequence classification model analyzes sentiment in tweets related to the COVID-19 pandemic. It predicts whether a tweet expresses positive, negative, or neutral sentiment about the pandemic.

Predicted Entities

neutral, positive, negative

How to use

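# Assumes an active Spark session named `spark` with Healthcare NLP loaded.
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer
from sparknlp_jsl.annotator import MedicalBertForSequenceClassification

document_assembler = DocumentAssembler() \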
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

sequence_classifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_covid_sentiment_onnx", "en", "clinical/models")\
  .setInputCols(["document", "token"])\
  .setOutputCol("class")

pipeline = Pipeline(stages=[
    document_assembler, 
    tokenizer,
    sequence_classifier    
])

data = spark.createDataFrame([
    ["British Department of Health confirms first two cases of in UK"],
    ["so my trip to visit my australian exchange student just got canceled bc of coronavirus. im heartbroken :("], 
    [ "I wish everyone to be safe at home and stop pandemic"]]
).toDF("text")

model = pipeline.fit(data)
result = model.transform(data)

# Same pipeline using the johnsnowlabs library.
from johnsnowlabs import nlp, medical

document_assembler = nlp.DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

tokenizer = nlp.Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

sequenceClassifier = medical.BertForSequenceClassification.pretrained("bert_sequence_classifier_covid_sentiment_onnx", "en", "clinical/models")\
    .setInputCols(["document", "token"])\
    .setOutputCol("class")

pipeline = nlp.Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequenceClassifier
])

data = spark.createDataFrame([
    ["British Department of Health confirms first two cases of in UK"],
    ["so my trip to visit my australian exchange student just got canceled bc of coronavirus. im heartbroken :("], 
    [ "I wish everyone to be safe at home and stop pandemic"]]
).toDF("text")

model = pipeline.fit(data)
result = model.transform(data)

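// Scala; assumes an active SparkSession named `spark`.
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.annotators.classification.MedicalBertForSequenceClassification
import org.apache.spark.ml.Pipeline
import spark.implicits._

val document_assembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

val tokenizer = new Tokenizer()
    .setInputCols(Array("document"))
    .setOutputCol("token")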

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_covid_sentiment_onnx", "en", "clinical/models")
  .setInputCols(Array("document","token"))
  .setOutputCol("class")

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))

val data = Seq(
  "British Department of Health confirms first two cases of in UK",
  "so my trip to visit my australian exchange student just got canceled bc of coronavirus. im heartbroken :(",
  "I wish everyone to be safe at home and stop pandemic"
).toDF("text")

val model = pipeline.fit(data)
val result = model.transform(data)

Results


+---------------------------------------------------------------------------------------------------------+----------+
|text                                                                                                     |result    |
+---------------------------------------------------------------------------------------------------------+----------+
|British Department of Health confirms first two cases of in UK                                           |[neutral] |
|so my trip to visit my australian exchange student just got canceled bc of coronavirus. im heartbroken :(|[negative]|
|I wish everyone to be safe at home and stop pandemic                                                     |[positive]|
+---------------------------------------------------------------------------------------------------------+----------+
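
Once predictions like those in the table are collected to the driver, the label arrays can be tallied with plain Python. The following is a minimal sketch, not part of the model or the library: `summarize_sentiment` is a hypothetical helper, and the rows are copied from the table above rather than produced by a live Spark session. With the pipeline above, similar rows could presumably be obtained with something like `result.select("text", "class.result").collect()`.

```python
from collections import Counter

# The three classes listed under "Predicted Entities".
LABELS = ("neutral", "positive", "negative")


def summarize_sentiment(rows):
    """Count predicted labels in (text, [label, ...]) pairs.

    Hypothetical helper: `rows` is assumed to mirror the Results table,
    where each item holds the tweet text and the list of predicted labels.
    """
    counts = Counter({label: 0 for label in LABELS})
    for _text, labels in rows:
        for label in labels:
            if label not in LABELS:
                raise ValueError(f"unexpected label: {label!r}")
            counts[label] += 1
    return dict(counts)


# Rows copied from the Results table (assumption: not a live Spark result).
rows = [
    ("British Department of Health confirms first two cases of in UK", ["neutral"]),
    ("so my trip to visit my australian exchange student just got canceled "
     "bc of coronavirus. im heartbroken :(", ["negative"]),
    ("I wish everyone to be safe at home and stop pandemic", ["positive"]),
]

summary = summarize_sentiment(rows)
print(summary)
```

Rejecting labels outside the three documented classes is a design choice for this sketch; it surfaces a mismatched model or output column early instead of silently adding a fourth bucket.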

Model Information

Model Name: bert_sequence_classifier_covid_sentiment_onnx
Compatibility: Healthcare NLP 6.1.1+
License: Licensed
Edition: Official
Input Labels: [document, token]
Output Labels: [label]
Language: en
Size: 437.7 MB
Case sensitive: true