Description
This is a Sentiment Analysis model that classifies auditors’ comments into one of three sentiments: positive, negative, or neutral.
Predicted Entities
positive, negative, neutral
How to use
# Spark NLP for Finance is accessed through the johnsnowlabs library.
from johnsnowlabs import nlp

documentAssembler = nlp.DocumentAssembler() \
    .setInputCol("sentence") \
    .setOutputCol("document")

embeddings = nlp.BertSentenceEmbeddings.pretrained("sent_bert_base_cased", "en") \
    .setInputCols("document") \
    .setOutputCol("sentence_embeddings")

sentiment = nlp.ClassifierDLModel.pretrained("finclf_auditor_sentiment_analysis", "en", "finance/models") \
    .setInputCols("sentence_embeddings") \
    .setOutputCol("category")

pipeline = nlp.Pipeline() \
    .setStages([
        documentAssembler,
        embeddings,
        sentiment
    ])

# sdf_test is assumed to be a Spark DataFrame with a "sentence" column
# holding the auditors' comments to classify.
pipelineModel = pipeline.fit(sdf_test)
res = pipelineModel.transform(sdf_test)

res.select("sentence", "category.result").show(truncate=100)
Results
+----------------------------------------------------------------------------------------------------+----------+
| sentence| result|
+----------------------------------------------------------------------------------------------------+----------+
|In our opinion, the consolidated financial statements referred to above present fairly..............|[positive]|
+----------------------------------------------------------------------------------------------------+----------+
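For quick inference on individual strings, the fitted pipeline above can also be wrapped in a Spark NLP LightPipeline. The snippet below is a minimal sketch; the input sentence and the printed output are illustrative, not part of the original card.

light = nlp.LightPipeline(pipelineModel)

annotations = light.annotate(
    "In our opinion, the consolidated financial statements referred to above present fairly the financial position of the Company."
)

# The "category" key holds the predicted sentiment, e.g. ['positive']
print(annotations["category"])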
Model Information
Model Name: finclf_auditor_sentiment_analysis
Compatibility: Finance NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [sentence_embeddings]
Output Labels: [category]
Language: en
Size: 23.1 MB
References
Proprietary auditors’ reports
Benchmarking
label         precision  recall  f1-score  support
negative           0.66    0.78      0.72      124
neutral            0.88    0.77      0.82      559
positive           0.65    0.76      0.70      286
accuracy              -       -      0.77      969
macro-avg          0.73    0.77      0.74      969
weighted-avg       0.78    0.77      0.77      969
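The aggregate rows (accuracy, macro-avg, weighted-avg) follow from the per-class scores and supports. A minimal sketch for reproducing such a report with scikit-learn is shown below; it assumes sdf_test also carries a gold-label column named "label", a hypothetical name not given in this card.

from sklearn.metrics import classification_report

# "label" is the assumed gold-sentiment column; "category.result" is the
# pipeline output, a one-element list of predicted labels per row.
pdf = res.select("label", "category.result").toPandas()
pdf["prediction"] = pdf["result"].apply(lambda r: r[0] if r else None)

print(classification_report(pdf["label"], pdf["prediction"], digits=2))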