Financial Twitter Texts Sentiment Analysis (Large)

Description

This model performs sentiment analysis on financial Twitter texts, assigning each text one of three sentiments: Bullish, Bearish, or Neutral. It is the large version of finclf_bert_twitter_financial_news_sentiment, trained on a much larger dataset.

Predicted Entities

Bullish, Bearish, Neutral

How to use

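# Assumes the johnsnowlabs library is installed with a valid Finance NLP license
from johnsnowlabs import nlp, finance

# Start (or reuse) a Spark session with the licensed libraries loaded
spark = nlp.start()

document_assembler = nlp.DocumentAssembler() \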
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = nlp.Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

sequenceClassifier = finance.BertForSequenceClassification.pretrained("finclf_bert_twitter_financial_text_sentiment_lg", "en", "finance/models") \
    .setInputCols(['document', 'token']) \
    .setOutputCol('class')

pipeline = nlp.Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequenceClassifier
])

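# All stages are pretrained or rule-based, so fitting on an empty DataFrame
# only assembles the PipelineModel
empty_data = spark.createDataFrame([[""]]).toDF("text")
model = pipeline.fit(empty_data)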

# A few sample texts, one per row
data = [
    ["$GM: Deutsche Bank cuts to Hold "],
    ["HELSINKI (Thomson Financial)- Kemira GrowHow swung into profit in its first quarter earnings on improved sales , especially in its fertilizer business in Europe , which is normally stronger during the first quarter ."],
    ["Vianor sells tires for cars and trucks as well as a range of other car parts and provides maintenance services ."],
    ["Pharmaceuticals group Orion Corp reported a fall in its third-quarter earnings that were hit by larger expenditures on R&D and marketing ."],
]

# Classify the sample texts
example = model.transform(spark.createDataFrame(data).toDF("text"))

# Show each text next to its predicted label
example.select("text", "class.result").show(truncate=False)

Results

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+
|text                                                                                                                                                                                                                    |result   |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+
|$GM: Deutsche Bank cuts to Hold                                                                                                                                                                                         |[Bearish]|
|HELSINKI (Thomson Financial)- Kemira GrowHow swung into profit in its first quarter earnings on improved sales , especially in its fertilizer business in Europe , which is normally stronger during the first quarter .|[Bullish]|
|Vianor sells tires for cars and trucks as well as a range of other car parts and provides maintenance services .                                                                                                        |[Neutral]|
|Pharmaceuticals group Orion Corp reported a fall in its third-quarter earnings that were hit by larger expenditures on R&D and marketing .                                                                              |[Bearish]|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+
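
Once the predictions are collected to the driver, they can be summarized in plain Python. The following is a minimal sketch, not part of the original model card: count_labels is a hypothetical helper, and the sample rows are the result column values copied from the table above. With a live pipeline, the same list could likely be obtained with [row["result"] for row in example.select("class.result").collect()].

```python
from collections import Counter


def count_labels(result_column):
    """Count predicted labels; each row is a list of labels, as in class.result."""
    return Counter(label for row in result_column for label in row)


# Values of the `result` column from the results table, in row order.
results = [["Bearish"], ["Bullish"], ["Neutral"], ["Bearish"]]

label_counts = count_labels(results)
for label, count in label_counts.most_common():
    print(f"{label}: {count}")
```

Each row holds a list because class.result is an array column; for this classifier the list carries a single label per document in the output shown above.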

Model Information

Model Name: finclf_bert_twitter_financial_text_sentiment_lg
Compatibility: Finance NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [document, token]
Output Labels: [class]
Language: en
Size: 406.4 MB
Case sensitive: true
Max sentence length: 512

References

In-house annotations on financial reports

Benchmarking

label          precision    recall  f1-score   support
Bearish             0.84      0.85      0.84       624
Bullish             0.90      0.88      0.89      1064
Neutral             0.94      0.94      0.94      2679
accuracy               -         -      0.91      4367
macro-avg           0.89      0.89      0.89      4367
weighted-avg        0.91      0.91      0.91      4367
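
The macro and weighted averages in the table can be recomputed from the per-class rows. The sketch below is an illustration rather than the original evaluation script: macro_avg and weighted_avg are hypothetical helper names, and the inputs are the two-decimal values printed in the table, so the recomputed figures are only approximate.

```python
# Per-class scores and supports copied from the benchmarking table.
per_class = {
    "Bearish": {"precision": 0.84, "recall": 0.85, "f1": 0.84, "support": 624},
    "Bullish": {"precision": 0.90, "recall": 0.88, "f1": 0.89, "support": 1064},
    "Neutral": {"precision": 0.94, "recall": 0.94, "f1": 0.94, "support": 2679},
}

total_support = sum(row["support"] for row in per_class.values())


def macro_avg(metric):
    """Unweighted mean of a metric over the classes."""
    return sum(row[metric] for row in per_class.values()) / len(per_class)


def weighted_avg(metric):
    """Mean of a metric over the classes, weighted by class support."""
    weighted_sum = sum(row[metric] * row["support"] for row in per_class.values())
    return weighted_sum / total_support


print("total support:", total_support)
for metric in ("precision", "recall", "f1"):
    print(f"{metric}: macro={macro_avg(metric):.3f} weighted={weighted_avg(metric):.3f}")
```

Because the inputs are already rounded, a recomputed average can differ from the reported one by up to about 0.01. The weighted averages sit above the macro averages because Neutral, the best-scoring class, accounts for most of the support.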