Description
This Financial Entity Resolver model is trained to return normalized versions of the names of companies listed on NASDAQ. Use it after extracting a company name with any NER model to obtain the company's official name as recorded in the NASDAQ database.
You can then pass the normalized name to finmapper_nasdaq_data_company_name to retrieve further information about the company from the NASDAQ data source, including ticker, sector, location, and currency.
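As rough intuition for what a sentence-embedding resolver does, the sketch below picks the closest candidate name by cosine similarity. It is not the model's implementation: character trigram counts stand in for the sentence embeddings, the candidate list is just the five normalized names shown in this card's Results section, and `embed`, `cosine`, and `resolve` are hypothetical helper names.

```python
from collections import Counter
from math import sqrt

# Candidate names copied from this card's Results table (illustrative only;
# the real model resolves against the full NASDAQ company list).
CANDIDATES = [
    "FIDUS INVESTMENT CORP",
    "ASPECT DEVELOPMENT INC",
    "CFSB BANCORP INC",
    "DALEEN TECHNOLOGIES INC",
    "GLEASON CORP",
]

def embed(text, n=3):
    """Toy stand-in for a sentence embedding: case-insensitive character n-gram counts."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def resolve(name, candidates=CANDIDATES):
    """Return the candidate whose toy embedding is closest to the query."""
    query = embed(name)
    return max(candidates, key=lambda c: cosine(query, embed(c)))

resolved = {name: resolve(name) for name in ["CFSB BANCORP", "GLEASON Corporation"]}
```

The production model replaces the trigram counts with Universal Sentence Encoder embeddings, which is why the pipeline places the encoder stage before the resolver.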
Predicted Entities
How to use
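import pandas
from johnsnowlabs import nlp, finance

# Assumes an active Spark session named `spark`, e.g. one returned by nlp.start()
test = ["FIDUS INVESTMENT corp","ASPECT DEVELOPMENT Inc","CFSB BANCORP","DALEEN TECHNOLOGIES","GLEASON Corporation"]
testdf = pandas.DataFrame(test, columns=['text'])
testsdf = spark.createDataFrame(testdf).toDF('text')

# Wrap each raw string as a document annotation
documentAssembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("sentence")

# Encode each company name with the Universal Sentence Encoder
use = nlp.UniversalSentenceEncoder.pretrained("tfhub_use_lg", "en")\
    .setInputCols(["sentence"])\
    .setOutputCol("embeddings")

# Resolve each embedding to the official NASDAQ company name
use_er_model = finance.SentenceEntityResolverModel.pretrained('finel_nasdaq_data_company_name', 'en', 'finance/models')\
    .setInputCols(["embeddings"])\
    .setOutputCol('normalized')

# A Pipeline is an estimator, so fit it before calling transform
prediction_Model = nlp.Pipeline(stages=[documentAssembler, use, use_er_model])
test_pred = prediction_Model.fit(testsdf).transform(testsdf)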
Results
+----------------------+-------------------------+
|text                  |result                   |
+----------------------+-------------------------+
|FIDUS INVESTMENT corp |[FIDUS INVESTMENT CORP]  |
|ASPECT DEVELOPMENT Inc|[ASPECT DEVELOPMENT INC] |
|CFSB BANCORP          |[CFSB BANCORP INC]       |
|DALEEN TECHNOLOGIES   |[DALEEN TECHNOLOGIES INC]|
|GLEASON Corporation   |[GLEASON CORP]           |
+----------------------+-------------------------+
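Each `result` cell holds an array rather than a plain string. A minimal sketch of flattening it into a lookup table follows, assuming the rows have already been collected into plain Python tuples. The `rows` list is copied by hand from the table above rather than produced by Spark, and `lookup` and `changed` are hypothetical names.

```python
# (text, result) pairs copied from the Results table above
rows = [
    ("FIDUS INVESTMENT corp", ["FIDUS INVESTMENT CORP"]),
    ("ASPECT DEVELOPMENT Inc", ["ASPECT DEVELOPMENT INC"]),
    ("CFSB BANCORP", ["CFSB BANCORP INC"]),
    ("DALEEN TECHNOLOGIES", ["DALEEN TECHNOLOGIES INC"]),
    ("GLEASON Corporation", ["GLEASON CORP"]),
]

# Keep the first element of each result array; use None when the array is empty
lookup = {text: (result[0] if result else None) for text, result in rows}

# Inputs whose normalized name differs by more than letter case
changed = [text for text, name in lookup.items() if text.upper() != name]
```

In this sample, two of the five inputs differ from their normalized names only in letter case; the other three gain or shorten a legal suffix such as INC or CORP.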
Model Information
Model Name: | finel_nasdaq_data_company_name |
Compatibility: | Finance NLP 1.0.0+ |
License: | Licensed |
Edition: | Official |
Input Labels: | [embeddings] |
Output Labels: | [normalized] |
Language: | en |
Size: | 69.7 MB |
Case sensitive: | false |
References
NASDAQ Database