Financial Indian News Sentiment Analysis (Medium)

Description

This is the medium (md) version of the Indian News Sentiment Analysis text classifier, which predicts whether a text expresses a positive or a negative sentiment.

Predicted Entities

POSITIVE, NEGATIVE

How to use


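from johnsnowlabs import nlp, finance

# Start a Spark session (a valid Finance NLP license is required)
spark = nlp.start()

# Wrap the raw text column into document annotations
document_assembler = nlp.DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into tokens
tokenizer = nlp.Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Load the pretrained sentiment classifier
classifierdl = finance.BertForSequenceClassification.pretrained("finclf_indian_news_sentiment_medium", "en", "finance/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("label")

bert_clf_pipeline = nlp.Pipeline(stages=[document_assembler,
                                         tokenizer,
                                         classifierdl])

text = ["Eliminating shadow economy to have positive impact on GDP : Arun Jaitley"]

# All stages are pretrained, so fitting on an empty DataFrame only assembles the pipeline
empty_df = spark.createDataFrame([[""]]).toDF("text")
model = bert_clf_pipeline.fit(empty_df)
res = model.transform(spark.createDataFrame([text]).toDF("text"))

# Show the input text next to the predicted label
res.select("text", "label.result").show(truncate=False)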


Results

+------------------------------------------------------------------------+----------+
|text                                                                    |result    |
+------------------------------------------------------------------------+----------+
|Eliminating shadow economy to have positive impact on GDP : Arun Jaitley|[POSITIVE]|
+------------------------------------------------------------------------+----------+

Model Information

Model Name: finclf_indian_news_sentiment_medium
Compatibility: Finance NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [document, token]
Output Labels: [class]
Language: en
Size: 412.3 MB
Case sensitive: true
Max sentence length: 128

References

An in-house augmented version of this dataset

Benchmarking

       label  precision    recall  f1-score   support
    NEGATIVE       0.85      0.86      0.86     10848
    POSITIVE       0.83      0.83      0.83      9202
    accuracy        -         -        0.84     20050
   macro-avg       0.84      0.84      0.84     20050
weighted-avg       0.84      0.84      0.84     20050
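
The averaged rows above can be recomputed from the two per-class rows. The following is a minimal sketch in plain Python, not part of the original model card: the helper names `macro_average` and `weighted_average` are made up for illustration, and because the inputs are the already-rounded table values, the results can differ from the reported 0.84 in the last digit.

```python
# Per-class rows copied from the benchmark table: (precision, recall, f1, support).
per_class = {
    "NEGATIVE": (0.85, 0.86, 0.86, 10848),
    "POSITIVE": (0.83, 0.83, 0.83, 9202),
}

def macro_average(rows):
    """Unweighted mean of precision, recall and f1 across classes (hypothetical helper)."""
    n = len(rows)
    return tuple(sum(r[i] for r in rows.values()) / n for i in range(3))

def weighted_average(rows):
    """Mean of precision, recall and f1 weighted by class support (hypothetical helper)."""
    total = sum(r[3] for r in rows.values())
    return tuple(sum(r[i] * r[3] for r in rows.values()) / total for i in range(3))

total_support = sum(r[3] for r in per_class.values())
macro = macro_average(per_class)
weighted = weighted_average(per_class)

print("total support:", total_support)
print("macro-avg    :", [round(v, 3) for v in macro])
print("weighted-avg :", [round(v, 3) for v in weighted])
```

The total support matches the 20050 shown in the table, and both averages fall within rounding distance of the reported 0.84. The weighted average sits slightly closer to the NEGATIVE scores because that class has more test examples.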