Description
This model performs sentiment analysis on financial Twitter data, classifying each text as one of three sentiments: Bullish, Bearish, or Neutral. It is the large version of finclf_bert_twitter_financial_news_sentiment, trained on a much larger dataset.
Predicted Entities
Bullish, Bearish, Neutral
How to use
# Assumption: the johnsnowlabs package is installed with a valid Finance NLP license,
# and nlp.start() provides the Spark session used below.
from johnsnowlabs import nlp, finance
spark = nlp.start()

# Wrap the raw text column in a document annotation.
document_assembler = nlp.DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')
# Split each document into tokens.
tokenizer = nlp.Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')
# Load the pretrained sentiment classifier.
sequenceClassifier = finance.BertForSequenceClassification.pretrained("finclf_bert_twitter_financial_text_sentiment_lg", "en", "finance/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")
pipeline = nlp.Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequenceClassifier
])
# Every stage is pretrained or rule-based, so fitting on an empty DataFrame only builds the pipeline model.
empty_data = spark.createDataFrame([[""]]).toDF("text")
model = pipeline.fit(empty_data)

# A couple of simple examples
data = [
    ["""$GM: Deutsche Bank cuts to Hold """],
    ["""HELSINKI (Thomson Financial)- Kemira GrowHow swung into profit in its first quarter earnings on improved sales , especially in its fertilizer business in Europe , which is normally stronger during the first quarter ."""],
    ["""Vianor sells tires for cars and trucks as well as a range of other car parts and provides maintenance services ."""],
    ["""Pharmaceuticals group Orion Corp reported a fall in its third-quarter earnings that were hit by larger expenditures on R&D and marketing ."""],
]
example = model.transform(spark.createDataFrame(data).toDF("text"))
example.select("text", "class.result").show(truncate=False)
Results
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+
|text |result |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+
|$GM: Deutsche Bank cuts to Hold |[Bearish]|
|HELSINKI (Thomson Financial)- Kemira GrowHow swung into profit in its first quarter earnings on improved sales , especially in its fertilizer business in Europe , which is normally stronger during the first quarter .|[Bullish]|
|Vianor sells tires for cars and trucks as well as a range of other car parts and provides maintenance services . |[Neutral]|
|Pharmaceuticals group Orion Corp reported a fall in its third-quarter earnings that were hit by larger expenditures on R&D and marketing . |[Bearish]|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+
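Each result value above is an array holding a single label, so a natural next step is to flatten those arrays and tally the labels. The sketch below is only an illustration: it works on plain Python lists that stand in for the result column of the four example rows, and the names collected_results and count_labels are hypothetical helpers introduced here, not part of the library.

```python
from collections import Counter

# Stand-in for the `result` column of the four example rows shown above.
# In a real run these lists would be read from the transformed DataFrame.
collected_results = [["Bearish"], ["Bullish"], ["Neutral"], ["Bearish"]]


def count_labels(results):
    """Flatten single-label result arrays and count each sentiment."""
    # Skip empty arrays defensively; each non-empty array carries one label.
    labels = [row_labels[0] for row_labels in results if row_labels]
    return Counter(labels)


label_counts = count_labels(collected_results)
print(dict(label_counts))
```

Keeping the tally in a Counter makes it easy to report a missing class as zero, since looking up an absent label returns 0 rather than raising an error.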
Model Information
| Model Name: | finclf_bert_twitter_financial_text_sentiment_lg |
| Compatibility: | Finance NLP 1.0.0+ |
| License: | Licensed |
| Edition: | Official |
| Input Labels: | [document, token] |
| Output Labels: | [class] |
| Language: | en |
| Size: | 406.4 MB |
| Case sensitive: | true |
| Max sentence length: | 512 |
References
In-house annotations on financial reports
Benchmarking
label         precision  recall  f1-score  support
Bearish       0.84       0.85    0.84      624
Bullish       0.90       0.88    0.89      1064
Neutral       0.94       0.94    0.94      2679
accuracy      -          -       0.91      4367
macro-avg     0.89       0.89    0.89      4367
weighted-avg  0.91       0.91    0.91      4367
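The two average rows can be cross-checked against the per-class rows: the macro average is the unweighted mean of a metric over the three classes, while the weighted average weights each class by its support. The following is a minimal sketch, assuming the standard definitions of these averages; it recomputes them from the rounded values printed in the table, so results can differ from the reported figures by about 0.01. The names per_class, macro_average, and weighted_average are hypothetical.

```python
# Per-class metrics copied from the benchmarking table:
# (precision, recall, f1-score, support).
per_class = {
    "Bearish": (0.84, 0.85, 0.84, 624),
    "Bullish": (0.90, 0.88, 0.89, 1064),
    "Neutral": (0.94, 0.94, 0.94, 2679),
}

total_support = sum(row[3] for row in per_class.values())


def macro_average(metric_index):
    """Unweighted mean of one metric across the classes."""
    return sum(row[metric_index] for row in per_class.values()) / len(per_class)


def weighted_average(metric_index):
    """Mean of one metric across the classes, weighted by class support."""
    weighted_sum = sum(row[metric_index] * row[3] for row in per_class.values())
    return weighted_sum / total_support


# Index 0 = precision, 1 = recall, 2 = f1-score.
macro = [macro_average(i) for i in range(3)]
weighted = [weighted_average(i) for i in range(3)]

print("total support:", total_support)
print("macro avg:   ", [f"{value:.3f}" for value in macro])
print("weighted avg:", [f"{value:.3f}" for value in weighted])
```

Support-weighted recall is the same quantity as overall accuracy (correct predictions divided by total examples), which is why the accuracy row and the weighted-avg recall agree.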