Company Name Normalization using Nasdaq

Description

This Financial Entity Resolver model maps company names to their normalized forms as registered in NASDAQ. Apply it to a company name extracted with any NER model to obtain the official name of that company as listed in the NASDAQ database.

You can then pass the normalized name to finmapper_nasdaq_data_company_name to retrieve further NASDAQ data about the company, including ticker, sector, location, and currency.
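To make the idea of resolving a mention to a canonical name more concrete, here is a toy sketch in plain Python. It is not how this model is implemented: the model matches sentence embeddings against NASDAQ names, whereas this sketch substitutes character-trigram counts and cosine similarity. The reference list and the function names `trigram_vector`, `cosine`, and `resolve` are hypothetical and exist only for illustration.

```python
import math
from collections import Counter

# Hypothetical reference list; the real model resolves against the NASDAQ database.
CANONICAL_NAMES = [
    "FIDUS INVESTMENT CORP",
    "ASPECT DEVELOPMENT INC",
    "CFSB BANCORP INC",
    "DALEEN TECHNOLOGIES INC",
    "GLEASON CORP",
]


def trigram_vector(name):
    """Bag of lowercase character trigrams, a crude stand-in for a sentence embedding."""
    padded = f"  {name.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def resolve(mention, candidates=CANONICAL_NAMES):
    """Return the candidate whose trigram vector is closest to the mention."""
    query = trigram_vector(mention)
    return max(candidates, key=lambda c: cosine(query, trigram_vector(c)))


print(resolve("GLEASON Corporation"))  # GLEASON CORP
```

Lowercasing before comparison mirrors the model's case-insensitive behaviour, so "corp" and "CORP" are treated alike.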

How to use

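import pandas
from johnsnowlabs import nlp, finance

# Start a Spark session (assumes the johnsnowlabs package is installed and licensed)
spark = nlp.start()

# Company names as they might come out of an NER step
test = ["FIDUS INVESTMENT corp","ASPECT DEVELOPMENT Inc","CFSB BANCORP","DALEEN TECHNOLOGIES","GLEASON Corporation"]
testdf = pandas.DataFrame(test, columns=['text'])
testsdf = spark.createDataFrame(testdf).toDF('text')

# Wrap each input string in a document annotation
documentAssembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("sentence")

# Embed each name with the Universal Sentence Encoder
use = nlp.UniversalSentenceEncoder.pretrained("tfhub_use_lg", "en")\
    .setInputCols(["sentence"])\
    .setOutputCol("embeddings")

# Resolve each embedding to a normalized NASDAQ company name
use_er_model = finance.SentenceEntityResolverModel.pretrained('finel_nasdaq_data_company_name', 'en', 'finance/models')\
    .setInputCols(["embeddings"])\
    .setOutputCol('normalized')

prediction_Model = nlp.Pipeline(stages=[documentAssembler, use, use_er_model])

# A Pipeline is an estimator, so fit it before calling transform
test_pred = prediction_Model.fit(testsdf).transform(testsdf)

# Show each input next to its normalized name
test_pred.select("text", "normalized.result").show(truncate=False)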

Results

+----------------------+-------------------------+
|text                  |result                   |
+----------------------+-------------------------+
|FIDUS INVESTMENT corp |[FIDUS INVESTMENT CORP]  |
|ASPECT DEVELOPMENT Inc|[ASPECT DEVELOPMENT INC] |
|CFSB BANCORP          |[CFSB BANCORP INC]       |
|DALEEN TECHNOLOGIES   |[DALEEN TECHNOLOGIES INC]|
|GLEASON Corporation   |[GLEASON CORP]           |
+----------------------+-------------------------+
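
The result column holds an array per row, so downstream code usually wants a single name for each input. Below is a minimal sketch of that flattening step in plain Python. It assumes the rows have already been collected into (text, result) pairs, for example with `test_pred.select("text", "normalized.result").collect()`. The helper name `first_result` is hypothetical.

```python
def first_result(rows):
    """Map each input text to its first normalized name, or None if the array is empty."""
    return {text: (names[0] if names else None) for text, names in rows}


# Rows copied from the results table above
rows = [
    ("FIDUS INVESTMENT corp", ["FIDUS INVESTMENT CORP"]),
    ("ASPECT DEVELOPMENT Inc", ["ASPECT DEVELOPMENT INC"]),
    ("CFSB BANCORP", ["CFSB BANCORP INC"]),
    ("DALEEN TECHNOLOGIES", ["DALEEN TECHNOLOGIES INC"]),
    ("GLEASON Corporation", ["GLEASON CORP"]),
]

normalized = first_result(rows)
print(normalized["CFSB BANCORP"])  # CFSB BANCORP INC
```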

Model Information

Model Name: finel_nasdaq_data_company_name
Compatibility: Finance NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [embeddings]
Output Labels: [normalized]
Language: en
Size: 69.7 MB
Case sensitive: false

References

NASDAQ Database