Financial Assertion of Aspect-Based Sentiment (md, Medium)

Description

This assertion model assigns an aspect-based sentiment (POSITIVE, NEGATIVE, or NEUTRAL) to each financial entity. It is designed to be used together with the associated NER model, finner_aspect_based_sentiment_md.

Predicted Entities

POSITIVE, NEGATIVE, NEUTRAL

How to use

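# Imports and Spark session (assumes the licensed johnsnowlabs package is installed and configured)
from johnsnowlabs import nlp, finance
import pyspark.sql.functions as F

spark = nlp.start()

# Document Assembler turns the raw text column into document annotations
documentAssembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")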

# Sentence Detector annotator, processes various sentences per line
sentenceDetector = nlp.SentenceDetector()\
    .setInputCols(["document"])\
    .setOutputCol("sentence")

# Tokenizer splits words in a relevant format for NLP
tokenizer = nlp.Tokenizer()\
    .setInputCols(["sentence"])\
    .setOutputCol("token")

# Token embeddings consumed by both the NER model and the assertion model
bert_embeddings = nlp.BertEmbeddings.pretrained("bert_embeddings_sec_bert_base", "en")\
    .setInputCols(["sentence", "token"])\
    .setOutputCol("embeddings")\
    .setMaxSentenceLength(512)

# NER model that tags the financial entities
finance_ner = finance.NerModel.pretrained("finner_aspect_based_sentiment_md", "en", "finance/models")\
    .setInputCols(["sentence", "token", "embeddings"])\
    .setOutputCol("ner")

# Converter that merges token-level NER tags into entity chunks
ner_converter = finance.NerConverterInternal()\
    .setInputCols(["sentence", "token", "ner"])\
    .setOutputCol("ner_chunk")

# Assertion model that assigns a sentiment to each entity chunk
assertion_model = finance.AssertionDLModel.pretrained("finassertion_aspect_based_sentiment_md", "en", "finance/models")\
    .setInputCols(["sentence", "ner_chunk", "embeddings"])\
    .setOutputCol("assertion")


nlpPipeline = nlp.Pipeline(
    stages=[documentAssembler,
            sentenceDetector,
            tokenizer,
            bert_embeddings,
            finance_ner,
            ner_converter,
            assertion_model])

text = "Equity and earnings of affiliates in Latin America increased to $4.8 million in the quarter from $2.2 million in the prior year as the commodity markets in Latin America remain strong through the end of the quarter."

spark_df = spark.createDataFrame([[text]]).toDF("text")

result = nlpPipeline.fit(spark_df).transform(spark_df)

# Zip chunks with their assertions so each output row pairs one entity with its sentiment
result.select(F.explode(F.arrays_zip("ner_chunk.result", "ner_chunk.metadata", "assertion.result", "assertion.metadata")).alias("cols"))\
      .select(F.expr("cols['0']").alias("entity"),
              F.expr("cols['1']['entity']").alias("label"),
              F.expr("cols['2']").alias("assertion"),
              F.expr("cols['3']['confidence']").alias("confidence")).show(50, truncate=False)

Results

+--------+---------+---------+----------+
|entity  |label    |assertion|confidence|
+--------+---------+---------+----------+
|Equity  |LIABILITY|POSITIVE |0.9895    |
|earnings|PROFIT   |POSITIVE |0.995     |
+--------+---------+---------+----------+
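
The confidence column can be used to keep only high-confidence assertions downstream. The following is a minimal sketch in plain Python, assuming the rows above have already been collected into tuples. The helper name `filter_confident` and the 0.99 threshold are hypothetical and chosen only for illustration. Annotation metadata values in Spark NLP are strings, so the sketch casts the confidence to a float before comparing.

```python
# Rows copied from the Results table above: (entity, label, assertion, confidence).
# Confidence is kept as a string because annotation metadata values are strings.
rows = [
    ("Equity", "LIABILITY", "POSITIVE", "0.9895"),
    ("earnings", "PROFIT", "POSITIVE", "0.995"),
]


def filter_confident(rows, threshold=0.99):
    """Keep rows whose assertion confidence is at least `threshold` (hypothetical helper)."""
    return [row for row in rows if float(row[3]) >= threshold]


confident = filter_confident(rows)
for entity, label, assertion, confidence in confident:
    print(f"{entity} ({label}): {assertion}, confidence {confidence}")
```

With the illustrative 0.99 threshold only the earnings row is kept. A lower threshold such as 0.9 keeps both rows.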

Model Information

Model Name: finassertion_aspect_based_sentiment_md
Compatibility: Finance NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [document, chunk, embeddings]
Output Labels: [assertion]
Language: en
Size: 2.7 MB

Benchmarking

 label         precision  recall  f1-score  support 
 NEGATIVE      0.68       0.43    0.53      232     
 NEUTRAL       0.44       0.65    0.53      441     
 POSITIVE      0.79       0.69    0.74      947     
 accuracy      -          -       0.64      1620    
 macro-avg     0.64       0.59    0.60      1620    
 weighted-avg  0.68       0.64    0.65      1620
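
The macro-avg and weighted-avg rows can be recomputed from the per-class rows as a sanity check: macro averages weight each class equally, while weighted averages weight each class by its support. The following is a minimal sketch in plain Python that uses only the figures from the table above. The variable and function names are illustrative.

```python
# Per-class metrics copied from the benchmark table: (precision, recall, f1, support).
per_class = {
    "NEGATIVE": (0.68, 0.43, 0.53, 232),
    "NEUTRAL": (0.44, 0.65, 0.53, 441),
    "POSITIVE": (0.79, 0.69, 0.74, 947),
}

total_support = sum(metrics[3] for metrics in per_class.values())


def macro_avg(index):
    """Unweighted mean of one metric across classes."""
    return sum(metrics[index] for metrics in per_class.values()) / len(per_class)


def weighted_avg(index):
    """Mean of one metric across classes, weighted by class support."""
    return sum(metrics[index] * metrics[3] for metrics in per_class.values()) / total_support


# Index 0 = precision, 1 = recall, 2 = f1-score
macro = [round(macro_avg(i), 2) for i in range(3)]
weighted = [round(weighted_avg(i), 2) for i in range(3)]

print("support:", total_support)
print("macro-avg:", macro)
print("weighted-avg:", weighted)
```

Support-weighted recall is the same quantity as overall accuracy, which is why the weighted-avg recall matches the 0.64 in the accuracy row.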