Description
This model performs sentiment analysis on Twitter data, assigning each text one of three sentiments: Positive, Negative, or Neutral.
Predicted Entities
Positive, Negative, Neutral
How to use
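# Assumes the johnsnowlabs package is installed with a valid Finance NLP license.
from johnsnowlabs import nlp, finance

spark = nlp.start()  # start a Spark session with the licensed libraries loaded

# Convert the raw text column into document annotations.
document_assembler = nlp.DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into tokens.
tokenizer = nlp.Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Load the pretrained sentiment classifier from the finance models repository.
sequence_classifier = finance.BertForSequenceClassification.pretrained(
        "finclf_bert_twitter_financial_text_sentiment", "en", "finance/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")

pipeline = nlp.Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequence_classifier,
])

# Every stage is pretrained, so fitting on an empty DataFrame only builds the pipeline model.
empty_data = spark.createDataFrame([[""]]).toDF("text")
model = pipeline.fit(empty_data)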
# A couple of simple examples
data = [
    ["""Early Crater Lake Drill Results Return Better Than Expected Grades and Intersection Lengths – 79.7 meters at 311 g/t Scandium Oxide, 0.326% Rare Earths Oxides and Yttrium -- Imperial Mining Group Ltd. ("Imperial") (TSX VENTURE: IPG; OTCQB: IMPNF) is pleased to announce that it has completed its Summer 2022 exploration and definition diamond drill program on the Ta-Nb Target and the TG Zone. Early results are encouraging and give inference to grade and tonnage increases to the TG North Lobe Deposit resource (see Imperial Press release - SEP 23, 2021)."""],
    ["""Noranda Income Fund Provides an Update on Operational and Production Challenges and Announces a Cellhouse Maintenance Shutdown -- Noranda Income Fund (TSX:NIF.UN) (the “Fund”) today provided an update regarding its previously disclosed challenges with cellhouse operating conditions and equipment fragility, which have been adversely affecting zinc production volumes and output quality."""],
]
example = model.transform(spark.createDataFrame(data).toDF("text"))
example.select("text", "class.result").show(truncate=False)
Results
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|text |result |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|Early Crater Lake Drill Results Return Better Than Expected Grades and Intersection Lengths – 79.7 meters at 311 g/t Scandium Oxide, 0.326% Rare Earths Oxides and Yttrium -- Imperial Mining Group Ltd. ("Imperial") (TSX VENTURE: IPG; OTCQB: IMPNF) is pleased to announce that it has completed its Summer 2022 exploration and definition diamond drill program on the Ta-Nb Target and the TG Zone. Early results are encouraging and give inference to grade and tonnage increases to the TG North Lobe Deposit resource (see Imperial Press release - SEP 23, 2021).|[Positive]|
|Noranda Income Fund Provides an Update on Operational and Production Challenges and Announces a Cellhouse Maintenance Shutdown -- Noranda Income Fund (TSX:NIF.UN) (the “Fund”) today provided an update regarding its previously disclosed challenges with cellhouse operating conditions and equipment fragility, which have been adversely affecting zinc production volumes and output quality. |[Negative]|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
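Each value in the result column is an array of labels, such as [Positive], rather than a bare string. The sketch below shows one way to unwrap those arrays and tally the predictions once the rows are in plain Python. It is a minimal illustration under stated assumptions: `rows` is a hand-written stand-in for collected output (with the texts shortened), and `first_label` is a hypothetical helper, not part of the library.

```python
from collections import Counter


def first_label(result, default=None):
    """Return the first label of a result array, or `default` if it is empty."""
    return result[0] if result else default


# Hypothetical stand-in for rows collected from the DataFrame as (text, result) pairs.
# The texts are shortened here; the labels match the table above.
rows = [
    ("Early Crater Lake Drill Results Return Better Than Expected Grades ...", ["Positive"]),
    ("Noranda Income Fund Provides an Update on Operational and Production Challenges ...", ["Negative"]),
]

# Unwrap each single-element array into a plain string label.
labels = [first_label(result) for _text, result in rows]

# Count how many texts received each sentiment.
counts = Counter(labels)
print(labels)
print(dict(counts))
```

Guarding against an empty array keeps the helper from raising an IndexError if a row comes back without a prediction.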
Model Information
| Model Name: | finclf_bert_twitter_financial_text_sentiment |
| Compatibility: | Finance NLP 1.0.0+ |
| License: | Licensed |
| Edition: | Official |
| Input Labels: | [document, token] |
| Output Labels: | [class] |
| Language: | en |
| Size: | 406.4 MB |
| Case sensitive: | true |
| Max sentence length: | 512 |
References
In-house annotations on financial reports
Benchmarking
label         precision  recall  f1-score  support
Negative      0.75       0.60    0.67      15
Neutral       0.89       0.87    0.88      207
Positive      0.83       0.87    0.85      134
accuracy      -          -       0.86      356
macro-avg     0.82       0.78    0.80      356
weighted-avg  0.86       0.86    0.86      356
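The macro-avg and weighted-avg rows can be reproduced from the three per-class rows. The sketch below recomputes them in plain Python, assuming the usual definitions: the macro average is the unweighted mean over classes, and the weighted average weights each class by its support. The function and variable names are illustrative only.

```python
# Per-class rows copied from the benchmark table: (precision, recall, f1, support).
per_class = {
    "Negative": (0.75, 0.60, 0.67, 15),
    "Neutral": (0.89, 0.87, 0.88, 207),
    "Positive": (0.83, 0.87, 0.85, 134),
}


def macro_avg(rows, idx):
    """Unweighted mean of one metric across classes."""
    return sum(r[idx] for r in rows) / len(rows)


def weighted_avg(rows, idx):
    """Mean of one metric across classes, weighted by class support."""
    total = sum(r[3] for r in rows)
    return sum(r[idx] * r[3] for r in rows) / total


rows = list(per_class.values())
total_support = sum(r[3] for r in rows)

# Indices 0, 1, 2 are precision, recall, and f1.
macro = tuple(round(macro_avg(rows, i), 2) for i in range(3))
weighted = tuple(round(weighted_avg(rows, i), 2) for i in range(3))

print("support:", total_support)
print("macro:", macro)
print("weighted:", weighted)
```

The gap between the two averages comes from class imbalance: Negative accounts for only 15 of the 356 test examples, so its lower scores pull down the macro average far more than the weighted one.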