Description
This financial model is an xlg (extra-large) version, trained with more general labels than the other versions (md, lg, …) available in the Models Hub. The training corpus combines Broker Reports, Earnings Calls, and 10-K filings.
Predicted Entities
CF, INCOME, KPI_INCREASE, CFO, PROFIT, PROFIT_INCREASE, AMOUNT, REVENUE, CFI, EXPENSE, FISCAL_YEAR, Expense, KPI, LIABILITY, TARGET_PRICE, CFO_decrease, STOCKHOLDERS_EQUITY, PROFIT_DECLINE, CMP, CFF, Expense_decrease, Revenue_decline, COUNT, Contra_LIABILITY, Expense_Increase, STOCK_EXCHANGE, LOSS, FCF, Revenue_increase, CFN, CFO_Increase, Income, PERCENTAGE, CURRENCY, ASSET, STOCKHOLDERS_DEFICIT, DATE, RATING
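Several labels in this list differ only by letter case (EXPENSE and Expense, INCOME and Income), so downstream code that filters on entity names should compare exact strings. The sketch below surfaces such pairs. The `labels` list is copied from the list above, and `case_variants` is a hypothetical helper written for illustration, not part of Spark NLP.

```python
from collections import defaultdict

# Labels copied verbatim from the "Predicted Entities" list.
labels = [
    "CF", "INCOME", "KPI_INCREASE", "CFO", "PROFIT", "PROFIT_INCREASE",
    "AMOUNT", "REVENUE", "CFI", "EXPENSE", "FISCAL_YEAR", "Expense", "KPI",
    "LIABILITY", "TARGET_PRICE", "CFO_decrease", "STOCKHOLDERS_EQUITY",
    "PROFIT_DECLINE", "CMP", "CFF", "Expense_decrease", "Revenue_decline",
    "COUNT", "Contra_LIABILITY", "Expense_Increase", "STOCK_EXCHANGE", "LOSS",
    "FCF", "Revenue_increase", "CFN", "CFO_Increase", "Income", "PERCENTAGE",
    "CURRENCY", "ASSET", "STOCKHOLDERS_DEFICIT", "DATE", "RATING",
]


def case_variants(names):
    """Group names that are identical once letter case is ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Keep only the groups with more than one spelling.
    return {key: found for key, found in groups.items() if len(found) > 1}


variants = case_variants(labels)
for key, found in variants.items():
    print(key, "->", found)
```

Whether these pairs should be treated as one class or two depends on the use case; the benchmark reports them separately.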
How to use
from johnsnowlabs import nlp, finance

# Start a Spark session (assumes the johnsnowlabs package is installed
# and a valid Finance NLP license is configured)
spark = nlp.start()

# Convert raw text into Spark NLP documents
document_assembler = nlp.DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

# Split documents into tokens
tokenizer = nlp.Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# Load the pretrained token classifier
token_classifier = finance.BertForTokenClassification.pretrained("finner_financial_xlarge", "en", "finance/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("label") \
    .setCaseSensitive(True)

# Merge token-level labels into entity chunks
converter = finance.NerConverterInternal() \
    .setInputCols(["document", "token", "label"]) \
    .setOutputCol("ner_span")

pipeline = nlp.Pipeline(stages=[
    document_assembler,
    tokenizer,
    token_classifier,
    converter
])

# Build an example DataFrame and run the pipeline on it
example = spark.createDataFrame([['''We expect Revenue / PAT CAGR of ~ 19 %/~ 22 % over FY2022-FY2024E EPS . Hence , we retain our Buy recommendation on VGIL with an unchanged price target ( PT ) of . This includes $ 1 billion in cash and cash equivalents , $ 2 billion in property and equipment , and $ 2 billion in intangible assets .''']]).toDF("text")
result = pipeline.fit(example).transform(example)
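The classifier emits one tag per token using B- and I- prefixes (the same prefixes that appear in the Benchmarking section), and the converter stage groups those tags into chunks. The following pure-Python sketch illustrates the grouping idea only; it is not the library's implementation. `merge_bio` is a hypothetical helper, and the token tags in the example are assumed for illustration rather than taken from actual model output.

```python
def merge_bio(tokens, tags):
    """Group B-/I- tagged tokens into (chunk_text, label) pairs."""
    chunks = []
    current_tokens, current_label = [], None

    def flush():
        # Emit the chunk collected so far, if any.
        if current_tokens:
            chunks.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # A B- tag, or an I- tag with a different label, starts a new chunk.
            flush()
            current_tokens, current_label = [token], label
        elif prefix == "I":
            # Continuation of the current chunk.
            current_tokens.append(token)
        else:
            # "O" (outside) closes any open chunk.
            flush()
            current_tokens, current_label = [], None
    flush()
    return chunks


# Assumed tags for a fragment of the example sentence.
tokens = ["$", "1", "billion", "in", "cash", "and", "cash", "equivalents"]
tags = ["B-CURRENCY", "B-AMOUNT", "I-AMOUNT", "O",
        "B-CF", "I-CF", "I-CF", "I-CF"]
chunks = merge_bio(tokens, tags)
print(chunks)
```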
Results
+-------------------------+----------------+
|chunk                    |entity          |
+-------------------------+----------------+
|Revenue                  |Revenue_increase|
|PAT                      |PROFIT_INCREASE |
|19                       |PERCENTAGE      |
|22                       |PERCENTAGE      |
|Buy                      |RATING          |
|price target             |TARGET_PRICE    |
|$                        |CURRENCY        |
|1 billion                |AMOUNT          |
|cash and cash equivalents|CF              |
|$                        |CURRENCY        |
|2 billion                |AMOUNT          |
|property and equipment   |ASSET           |
|$                        |CURRENCY        |
|2 billion                |AMOUNT          |
|intangible assets        |ASSET           |
+-------------------------+----------------+
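Currency symbols and amounts are returned as separate chunks, so a consumer usually has to recombine them. A minimal sketch, assuming the rows above have already been collected into Python tuples: `parse_amount`, `pair_money`, and the `SCALES` table are hypothetical helpers for illustration, and the pairing rule (a CURRENCY chunk immediately followed by an AMOUNT chunk) is a simple heuristic, not something the model guarantees.

```python
# (chunk, entity) rows copied from the Results table.
rows = [
    ("Revenue", "Revenue_increase"), ("PAT", "PROFIT_INCREASE"),
    ("19", "PERCENTAGE"), ("22", "PERCENTAGE"), ("Buy", "RATING"),
    ("price target", "TARGET_PRICE"),
    ("$", "CURRENCY"), ("1 billion", "AMOUNT"),
    ("cash and cash equivalents", "CF"),
    ("$", "CURRENCY"), ("2 billion", "AMOUNT"),
    ("property and equipment", "ASSET"),
    ("$", "CURRENCY"), ("2 billion", "AMOUNT"),
    ("intangible assets", "ASSET"),
]

# Illustrative scale words; extend as needed for other corpora.
SCALES = {"thousand": 1e3, "million": 1e6, "billion": 1e9}


def parse_amount(text):
    """Turn a chunk such as '1 billion' into a float.

    Raises KeyError for an unknown scale word.
    """
    parts = text.replace(",", "").split()
    value = float(parts[0])
    if len(parts) > 1:
        value *= SCALES[parts[1].lower()]
    return value


def pair_money(entity_rows):
    """Pair each CURRENCY chunk with the AMOUNT chunk that directly follows it."""
    pairs = []
    for (chunk, entity), (next_chunk, next_entity) in zip(entity_rows, entity_rows[1:]):
        if entity == "CURRENCY" and next_entity == "AMOUNT":
            pairs.append((chunk, parse_amount(next_chunk)))
    return pairs


money = pair_money(rows)
print(money)
```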
Model Information
Model Name: | finner_financial_xlarge |
Compatibility: | Finance NLP 1.0.0+ |
License: | Licensed |
Edition: | Official |
Input Labels: | [sentence, token] |
Output Labels: | [ner] |
Language: | en |
Size: | 401.0 MB |
Case sensitive: | true |
Max sentence length: | 128 |
References
In-house dataset
Benchmarking
label precision recall f1-score support
I-CFO_Increase 0.87 0.94 0.90 349
I-CFF 0.75 0.85 0.80 486
B-LOSS 0.84 0.89 0.87 122
B-CFO_Increase 0.92 0.95 0.93 233
B-Revenue_decline 0.61 0.77 0.68 93
B-CFO 0.81 0.89 0.85 298
B-KPI_INCREASE 0.71 0.29 0.41 42
I-CURRENCY 1.00 0.96 0.98 70
I-CFI 0.88 0.85 0.87 489
I-PROFIT_DECLINE 0.92 0.78 0.84 45
I-COUNT 0.80 0.90 0.85 31
B-CFN 0.99 1.00 1.00 327
I-KPI_INCREASE 0.55 0.40 0.46 30
I-Revenue_decline 0.66 0.88 0.75 94
B-ASSET 0.62 0.57 0.60 282
I-Contra_LIABILITY 0.84 0.84 0.84 92
B-KPI 0.48 0.36 0.41 58
I-STOCKHOLDERS_EQUITY 0.90 0.67 0.77 164
B-STOCK_EXCHANGE 1.00 0.94 0.97 52
I-FISCAL_YEAR 0.94 0.97 0.96 1999
I-Income 0.77 0.76 0.76 168
B-PROFIT_DECLINE 0.72 0.76 0.74 50
I-Expense 0.69 0.67 0.68 450
B-FISCAL_YEAR 0.93 0.96 0.94 621
I-CFO 0.81 0.83 0.82 581
B-LIABILITY 0.67 0.83 0.74 305
B-Expense 0.71 0.64 0.67 318
B-INCOME 0.62 0.33 0.43 39
B-STOCKHOLDERS_EQUITY 0.77 0.71 0.74 83
I-ASSET 0.61 0.68 0.65 377
I-DATE 0.90 0.93 0.91 1146
B-CF 0.83 0.81 0.82 135
I-Expense_Increase 0.90 0.89 0.90 353
B-PROFIT 0.86 0.87 0.86 970
I-STOCKHOLDERS_DEFICIT 0.96 0.82 0.88 28
B-STOCKHOLDERS_DEFICIT 0.89 0.75 0.81 32
B-Expense_Increase 0.85 0.87 0.86 267
I-AMOUNT 0.96 0.97 0.97 3009
I-CF 0.85 0.94 0.89 291
I-STOCK_EXCHANGE 0.98 1.00 0.99 170
I-PROFIT 0.87 0.92 0.90 794
I-Expense_decrease 0.76 0.84 0.80 198
B-REVENUE 0.75 0.75 0.75 410
B-Revenue_increase 0.72 0.82 0.76 498
I-EXPENSE 0.53 0.66 0.59 169
I-PROFIT_INCREASE 0.83 0.92 0.87 160
B-Contra_LIABILITY 0.64 0.66 0.65 95
I-INCOME 0.75 0.43 0.55 35
I-REVENUE 0.73 0.67 0.70 226
B-AMOUNT 0.96 0.97 0.97 10381
I-CMP 1.00 1.00 1.00 33
B-COUNT 0.85 0.99 0.91 294
I-TARGET_PRICE 0.97 0.99 0.98 113
B-CFI 0.75 0.79 0.77 154
I-Revenue_increase 0.64 0.78 0.71 329
B-DATE 0.91 0.97 0.94 2112
I-CFO_decrease 0.78 0.63 0.70 101
I-FCF 0.74 0.95 0.83 66
I-KPI 0.46 0.46 0.46 50
B-Expense_decrease 0.80 0.86 0.83 140
B-PROFIT_INCREASE 0.77 0.85 0.81 239
B-Income 0.77 0.75 0.76 79
B-PERCENTAGE 0.96 0.98 0.97 2885
B-CURRENCY 0.98 0.99 0.99 3631
B-CFO_decrease 0.89 0.70 0.78 57
I-LOSS 0.87 0.80 0.84 155
B-RATING 0.94 0.98 0.96 536
I-PERCENTAGE 0.83 0.50 0.62 30
B-CMP 0.98 1.00 0.99 41
B-CFF 0.69 0.63 0.66 184
B-TARGET_PRICE 0.98 0.97 0.98 153
B-EXPENSE 0.68 0.74 0.71 230
I-LIABILITY 0.79 0.84 0.82 371
B-FCF 0.77 0.86 0.81 35
micro-avg 0.90 0.92 0.91 39733
macro-avg 0.81 0.80 0.80 39733
weighted-avg 0.90 0.92 0.91 39733
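Assuming the report follows the usual classification-report convention, each f1-score is the harmonic mean of precision and recall, the macro average weights every label equally, and the weighted average weights labels by support. This would explain why the macro f1 (0.80) sits below the weighted f1 (0.91): large, well-predicted labels such as B-AMOUNT dominate the weighted figure. The sketch below checks this on a few rows copied from the table. Recomputing from the rounded values shown can differ from the reported f1 by 0.01 for some rows.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (label, precision, recall, reported f1, support) copied from the table.
rows = [
    ("B-KPI_INCREASE", 0.71, 0.29, 0.41, 42),
    ("B-INCOME", 0.62, 0.33, 0.43, 39),
    ("B-KPI", 0.48, 0.36, 0.41, 58),
    ("I-PERCENTAGE", 0.83, 0.50, 0.62, 30),
]
for label, p, r, reported, _ in rows:
    print(f"{label}: recomputed {f1(p, r):.2f}, reported {reported:.2f}")

# Macro vs. weighted average on a three-label subset (reported f1, support).
subset = [(0.97, 10381), (0.41, 58), (0.43, 39)]  # B-AMOUNT, B-KPI, B-INCOME
macro = sum(score for score, _ in subset) / len(subset)
weighted = sum(score * n for score, n in subset) / sum(n for _, n in subset)
print(f"macro={macro:.2f} weighted={weighted:.2f}")
```

On this subset the weighted average stays close to the B-AMOUNT score, while the macro average is pulled down by the two small, harder labels.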