Classify Earning Calls, Broker Reports and 10K

Description

This is a text classification model that identifies whether a document is an Earning Call, a Broker Report, a 10K filing, or something else.

Predicted Entities

earning_call, broker_report, 10k, other

How to use

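from johnsnowlabs import nlp, finance

# Start a Spark session (assumes the johnsnowlabs library is installed with a valid Finance NLP license)
spark = nlp.start()

# Wrap the raw text column in a document annotation
documentAssembler = nlp.DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Compute a sentence embedding for each document annotation
embeddings = nlp.BertSentenceEmbeddings.pretrained("sent_bert_base_cased", "en") \
    .setInputCols(["document"]) \
    .setOutputCol("sentence_embeddings")

# Classify the document from its embedding
docClassifier = finance.ClassifierDLModel.pretrained("finclf_earning_broker_10k", "en", "finance/models") \
    .setInputCols(["sentence_embeddings"]) \
    .setOutputCol("label")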

nlpPipeline = nlp.Pipeline(stages=[
    documentAssembler, 
    embeddings,
    docClassifier])

text = """Varun Beverages  
 
 
Investors are advised to refer through important disclosures made at the last page of the Research Report.  
Motilal Oswal research is available on www.motilaloswal.com/Institutional -Equities, Bloomberg, Thomson Reuters, Factset and S&P Capital.  Research Analyst: Sumant Kumar (Sumant.Kumar@MotilalOswal.com)         
Research Analyst: Meet  Jain (Meet.Jain@ Motilal Oswal.com)  / Omkar Shintre  (Omkar.Shintre @Motilal Oswal.com)"""

sdf = spark.createDataFrame([[text]]).toDF("text")
fit = nlpPipeline.fit(sdf)
res = fit.transform(sdf)
res = res.select('label.result')

Results

[broker_report]
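
When the pipeline is run over many documents, the predicted labels can be tallied once they have been collected from the `label.result` column into plain Python lists. The following is a minimal sketch under that assumption; the helper name `tally_labels` and the example input are hypothetical and not part of Finance NLP.

```python
from collections import Counter

# The four classes listed under "Predicted Entities".
KNOWN_LABELS = {"earning_call", "broker_report", "10k", "other"}


def tally_labels(predictions):
    """Count predicted labels across documents.

    `predictions` holds one entry per document; each entry is the list of
    strings taken from that row's `label.result` column.
    """
    counts = Counter()
    for labels in predictions:
        for label in labels:
            # Fail loudly on anything outside the documented label set.
            if label not in KNOWN_LABELS:
                raise ValueError(f"unexpected label: {label!r}")
            counts[label] += 1
    return counts


# Illustrative input only; the first entry mirrors the result shown above.
example = [["broker_report"], ["10k"], ["broker_report"]]
print(tally_labels(example))
```

Checking against the documented label set is a design choice for this sketch: it surfaces a mismatched model or column early instead of silently counting unknown values.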

Model Information

Model Name: finclf_earning_broker_10k
Compatibility: Finance NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [sentence_embeddings]
Output Labels: [label]
Language: en
Size: 22.8 MB

References

  • Scraped broker reports, earning calls, and 10K filings from the internet
  • Other financial documents

Benchmarking

        label  precision    recall  f1-score   support
          10k       1.00      1.00      1.00        17
broker_report       1.00      1.00      1.00        18
 earning_call       1.00      1.00      1.00        19
        other       1.00      1.00      1.00        98
     accuracy          -         -      1.00       152
    macro-avg       1.00      1.00      1.00       152
 weighted-avg       1.00      1.00      1.00       152
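
The summary rows of the table can be recomputed from the per-class rows. The sketch below hard-codes the four per-class rows from the table and shows how the macro average (unweighted mean over classes) and the weighted average (mean weighted by support) are derived; the helper names `macro_avg` and `weighted_avg` are hypothetical.

```python
# Per-class rows copied from the benchmark table:
# (precision, recall, f1-score, support)
rows = {
    "10k": (1.00, 1.00, 1.00, 17),
    "broker_report": (1.00, 1.00, 1.00, 18),
    "earning_call": (1.00, 1.00, 1.00, 19),
    "other": (1.00, 1.00, 1.00, 98),
}

SUPPORT = 3  # index of the support column in each row


def macro_avg(rows, idx):
    """Unweighted mean of one metric column over all classes."""
    return sum(r[idx] for r in rows.values()) / len(rows)


def weighted_avg(rows, idx):
    """Mean of one metric column, weighted by each class's support."""
    total = sum(r[SUPPORT] for r in rows.values())
    return sum(r[idx] * r[SUPPORT] for r in rows.values()) / total


total_support = sum(r[SUPPORT] for r in rows.values())
print("total support:", total_support)
print("macro f1:", macro_avg(rows, 2))
print("weighted f1:", weighted_avg(rows, 2))
```

Note that the `other` class accounts for 98 of the 152 test documents (about 64%), so the weighted average is dominated by that class, while the macro average treats all four classes equally.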