Financial NER (xlg, XLarge)

Description

This financial NER model is the xlg (XLarge) version. It has been trained with more general labels than the other versions (md, lg, …) available in the Models Hub. The training corpus combines Broker Reports, Earnings Calls, and 10-K filings.

Predicted Entities

CF, INCOME, KPI_INCREASE, CFO, PROFIT, PROFIT_INCREASE, AMOUNT, REVENUE, CFI, EXPENSE, FISCAL_YEAR, Expense, KPI, LIABILITY, TARGET_PRICE, CFO_decrease, STOCKHOLDERS_EQUITY, PROFIT_DECLINE, CMP, CFF, Expense_decrease, Revenue_decline, COUNT, Contra_LIABILITY, Expense_Increase, STOCK_EXCHANGE, LOSS, FCF, Revenue_increase, CFN, CFO_Increase, Income, PERCENTAGE, CURRENCY, ASSET, STOCKHOLDERS_DEFICIT, DATE, RATING
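
Some labels in the list above differ only in letter case (for example, both an upper-case and a capitalized form of the same word appear). The model card does not say whether these are meant as distinct classes, so the following is only a minimal sketch, in plain Python, of how such case variants could be detected before aggregating predictions downstream. The helper name `case_variants` is hypothetical and not part of any library.

```python
from collections import defaultdict

# Label set copied from the "Predicted Entities" list of this model card.
LABELS = (
    "CF, INCOME, KPI_INCREASE, CFO, PROFIT, PROFIT_INCREASE, AMOUNT, REVENUE, "
    "CFI, EXPENSE, FISCAL_YEAR, Expense, KPI, LIABILITY, TARGET_PRICE, "
    "CFO_decrease, STOCKHOLDERS_EQUITY, PROFIT_DECLINE, CMP, CFF, "
    "Expense_decrease, Revenue_decline, COUNT, Contra_LIABILITY, "
    "Expense_Increase, STOCK_EXCHANGE, LOSS, FCF, Revenue_increase, CFN, "
    "CFO_Increase, Income, PERCENTAGE, CURRENCY, ASSET, STOCKHOLDERS_DEFICIT, "
    "DATE, RATING"
).split(", ")


def case_variants(labels):
    """Group labels that are identical once letter case is ignored.

    Hypothetical helper for illustration only; returns a dict that maps the
    upper-cased form to the list of original spellings when there are several.
    """
    groups = defaultdict(list)
    for label in labels:
        groups[label.upper()].append(label)
    return {key: names for key, names in groups.items() if len(names) > 1}


variants = case_variants(LABELS)
for key, names in variants.items():
    print(key, "->", names)
```

Whether to merge such variants or keep them separate is a design choice for the consuming application; the benchmark below reports them as separate labels.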

How to use

 
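from johnsnowlabs import nlp, finance

# Start a Spark session (requires a valid Finance NLP license)
spark = nlp.start()

# Test classifier in Spark NLP pipeline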
document_assembler = nlp.DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

tokenizer = nlp.Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# Load the pretrained token classifier from the Models Hub
token_classifier = finance.BertForTokenClassification.pretrained("finner_financial_xlarge", "en", "finance/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("label") \
    .setCaseSensitive(True)

converter = finance.NerConverterInternal()\
    .setInputCols(["document", "token", "label"])\
    .setOutputCol("ner_span")

pipeline = nlp.Pipeline(stages=[
    document_assembler, 
    tokenizer,
    token_classifier,
    converter    
])

# Generating example
example = spark.createDataFrame([['''We expect Revenue / PAT CAGR of ~ 19 %/~ 22 % over FY2022-FY2024E EPS . Hence , we retain our Buy recommendation on VGIL with an unchanged price target ( PT ) of . This includes $ 1 billion in cash and cash equivalents , $ 2 billion in property and equipment , and $ 2 billion in intangible assets .''']]).toDF("text")

result = pipeline.fit(example).transform(example)

Results


+-------------------------+----------------+
|chunk                    |entity          |
+-------------------------+----------------+
|Revenue                  |Revenue_increase|
|PAT                      |PROFIT_INCREASE |
|19                       |PERCENTAGE      |
|22                       |PERCENTAGE      |
|Buy                      |RATING          |
|price target             |TARGET_PRICE    |
|$                        |CURRENCY        |
|1 billion                |AMOUNT          |
|cash and cash equivalents|CF              |
|$                        |CURRENCY        |
|2 billion                |AMOUNT          |
|property and equipment   |ASSET           |
|$                        |CURRENCY        |
|2 billion                |AMOUNT          |
|intangible assets        |ASSET           |
+-------------------------+----------------+
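
The chunks in the table above are multi-token spans, while the classifier itself emits one tag per token using the B-/I- prefixes that appear in the Benchmarking section. The sketch below illustrates, in plain Python, how such token-level IOB tags can be grouped into chunks. It is an illustration of the general technique only, not the implementation of `NerConverterInternal`; the function name `iob_to_chunks` and the example tags are assumptions chosen to match the table.

```python
def iob_to_chunks(tokens, tags):
    """Merge token-level IOB tags into (chunk_text, entity) pairs.

    Illustrative sketch only: a "B-X" tag starts a new chunk, an "I-X" tag
    extends the current chunk if its label matches, and "O" closes it.
    """
    chunks = []
    current_tokens, current_label = [], None

    def flush():
        # Emit the chunk collected so far, if any.
        if current_tokens:
            chunks.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # Start a new chunk (a stray "I-" tag is treated like "B-").
            flush()
            current_tokens, current_label = [token], label
        elif prefix == "I":
            current_tokens.append(token)
        else:
            # "O" tag: close any open chunk.
            flush()
            current_tokens, current_label = [], None
    flush()
    return chunks


# Assumed token-level tags for part of the example sentence.
tokens = ["This", "includes", "$", "1", "billion", "in",
          "cash", "and", "cash", "equivalents"]
tags = ["O", "O", "B-CURRENCY", "B-AMOUNT", "I-AMOUNT", "O",
        "B-CF", "I-CF", "I-CF", "I-CF"]

chunks = iob_to_chunks(tokens, tags)
for text, entity in chunks:
    print(f"{text:<27}{entity}")
```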

Model Information

Model Name: finner_financial_xlarge
Compatibility: Finance NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [sentence, token]
Output Labels: [ner]
Language: en
Size: 401.0 MB
Case sensitive: true
Max sentence length: 128

References

In-house dataset

Benchmarking

label                      precision    recall  f1-score   support
        I-CFO_Increase       0.87      0.94      0.90       349
                 I-CFF       0.75      0.85      0.80       486
                B-LOSS       0.84      0.89      0.87       122
        B-CFO_Increase       0.92      0.95      0.93       233
     B-Revenue_decline       0.61      0.77      0.68        93
                 B-CFO       0.81      0.89      0.85       298
        B-KPI_INCREASE       0.71      0.29      0.41        42
            I-CURRENCY       1.00      0.96      0.98        70
                 I-CFI       0.88      0.85      0.87       489
      I-PROFIT_DECLINE       0.92      0.78      0.84        45
               I-COUNT       0.80      0.90      0.85        31
                 B-CFN       0.99      1.00      1.00       327
        I-KPI_INCREASE       0.55      0.40      0.46        30
     I-Revenue_decline       0.66      0.88      0.75        94
               B-ASSET       0.62      0.57      0.60       282
    I-Contra_LIABILITY       0.84      0.84      0.84        92
                 B-KPI       0.48      0.36      0.41        58
 I-STOCKHOLDERS_EQUITY       0.90      0.67      0.77       164
      B-STOCK_EXCHANGE       1.00      0.94      0.97        52
         I-FISCAL_YEAR       0.94      0.97      0.96      1999
              I-Income       0.77      0.76      0.76       168
      B-PROFIT_DECLINE       0.72      0.76      0.74        50
             I-Expense       0.69      0.67      0.68       450
         B-FISCAL_YEAR       0.93      0.96      0.94       621
                 I-CFO       0.81      0.83      0.82       581
           B-LIABILITY       0.67      0.83      0.74       305
             B-Expense       0.71      0.64      0.67       318
              B-INCOME       0.62      0.33      0.43        39
 B-STOCKHOLDERS_EQUITY       0.77      0.71      0.74        83
               I-ASSET       0.61      0.68      0.65       377
                I-DATE       0.90      0.93      0.91      1146
                  B-CF       0.83      0.81      0.82       135
    I-Expense_Increase       0.90      0.89      0.90       353
              B-PROFIT       0.86      0.87      0.86       970
I-STOCKHOLDERS_DEFICIT       0.96      0.82      0.88        28
B-STOCKHOLDERS_DEFICIT       0.89      0.75      0.81        32
    B-Expense_Increase       0.85      0.87      0.86       267
              I-AMOUNT       0.96      0.97      0.97      3009
                  I-CF       0.85      0.94      0.89       291
      I-STOCK_EXCHANGE       0.98      1.00      0.99       170
              I-PROFIT       0.87      0.92      0.90       794
    I-Expense_decrease       0.76      0.84      0.80       198
             B-REVENUE       0.75      0.75      0.75       410
    B-Revenue_increase       0.72      0.82      0.76       498
             I-EXPENSE       0.53      0.66      0.59       169
     I-PROFIT_INCREASE       0.83      0.92      0.87       160
    B-Contra_LIABILITY       0.64      0.66      0.65        95
              I-INCOME       0.75      0.43      0.55        35
             I-REVENUE       0.73      0.67      0.70       226
              B-AMOUNT       0.96      0.97      0.97     10381
                 I-CMP       1.00      1.00      1.00        33
               B-COUNT       0.85      0.99      0.91       294
        I-TARGET_PRICE       0.97      0.99      0.98       113
                 B-CFI       0.75      0.79      0.77       154
    I-Revenue_increase       0.64      0.78      0.71       329
                B-DATE       0.91      0.97      0.94      2112
        I-CFO_decrease       0.78      0.63      0.70       101
                 I-FCF       0.74      0.95      0.83        66
                 I-KPI       0.46      0.46      0.46        50
    B-Expense_decrease       0.80      0.86      0.83       140
     B-PROFIT_INCREASE       0.77      0.85      0.81       239
              B-Income       0.77      0.75      0.76        79
          B-PERCENTAGE       0.96      0.98      0.97      2885
            B-CURRENCY       0.98      0.99      0.99      3631
        B-CFO_decrease       0.89      0.70      0.78        57
                I-LOSS       0.87      0.80      0.84       155
              B-RATING       0.94      0.98      0.96       536
          I-PERCENTAGE       0.83      0.50      0.62        30
                 B-CMP       0.98      1.00      0.99        41
                 B-CFF       0.69      0.63      0.66       184
        B-TARGET_PRICE       0.98      0.97      0.98       153
             B-EXPENSE       0.68      0.74      0.71       230
           I-LIABILITY       0.79      0.84      0.82       371
                 B-FCF       0.77      0.86      0.81        35
             micro-avg       0.90      0.92      0.91     39733
             macro-avg       0.81      0.80      0.80     39733
          weighted-avg       0.90      0.92      0.91     39733
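
The gap between the macro average and the weighted average in the table can be understood from how the two are usually computed. The sketch below assumes the common classification-report conventions (F1 as the harmonic mean of precision and recall, macro as the unweighted mean over labels, weighted as the support-weighted mean); the model card does not state which tool produced the report. It uses three rows copied from the table, so the printed averages describe that subset only, not the full benchmark.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (label, precision, recall, f1, support) for three rows of the table above.
rows = [
    ("B-KPI_INCREASE", 0.71, 0.29, 0.41, 42),
    ("B-AMOUNT", 0.96, 0.97, 0.97, 10381),
    ("B-CMP", 0.98, 1.00, 0.99, 41),
]

# Macro average: every label counts equally, regardless of how rare it is.
macro_f1 = sum(f1 for _, _, _, f1, _ in rows) / len(rows)

# Weighted average: each label contributes in proportion to its support.
total_support = sum(support for *_, support in rows)
weighted_f1 = sum(f1 * support for _, _, _, f1, support in rows) / total_support

print(f"F1 for B-KPI_INCREASE from P/R: {f1_score(0.71, 0.29):.2f}")
print(f"macro F1 (subset):    {macro_f1:.2f}")
print(f"weighted F1 (subset): {weighted_f1:.2f}")
```

Under these assumptions, a rare label with a low score pulls the macro average down much more than the weighted one, which is consistent with the macro-avg row being lower than the weighted-avg row in the full report.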