Description
This Financial Entity Resolver model normalizes company names to the official names listed in the NASDAQ Stock Screener. Run it on a company name extracted by any NER model to obtain the official name of that company as it appears in the NASDAQ Stock Screener.
After that, you can use the finmapper_nasdaq_company_name_stock_screener model to enrich the result with further NASDAQ Stock Screener information about the company, including Ticker, Sector, Country, etc.
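The enrichment step can be pictured as a lookup keyed on the official name returned by the resolver. The following is a minimal sketch in plain Python, not the mapper's implementation: the CSV column names, the single row, and the helpers `build_index` and `enrich` are illustrative assumptions.

```python
import csv
import io

# Illustrative stand-in for a screener export. The column names and the
# single row are assumptions for this sketch, not output of the mapper.
SCREENER_CSV = """Symbol,Name,Country,Sector
NKE,Nike Inc. Common Stock,United States,Consumer Discretionary
"""


def build_index(csv_text):
    """Index screener rows by their official company name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Name"]: row for row in reader}


def enrich(normalized_name, index):
    """Return the row for a resolved name, or None if it is not indexed."""
    return index.get(normalized_name)


index = build_index(SCREENER_CSV)
record = enrich("Nike Inc. Common Stock", index)
print(record["Symbol"], record["Country"], record["Sector"])
```

The lookup is an exact string match, so it only works once the resolver has produced the official name.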
Predicted Entities
How to use
from johnsnowlabs import nlp, finance

# Start a Spark session (assumes the johnsnowlabs library is installed
# and a valid Finance NLP license is configured)
spark = nlp.start()

# Turn raw text into a document annotation
documentAssembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Split the document into tokens
tokenizer = nlp.Tokenizer()\
    .setInputCols(["document"])\
    .setOutputCol("token")

# Token-level embeddings consumed by the NER model
embeddings = nlp.BertEmbeddings.pretrained("bert_embeddings_sec_bert_base", "en")\
    .setInputCols(["document", "token"])\
    .setOutputCol("embeddings")

# Detect organization, product and alias mentions
ner_model = finance.NerModel.pretrained("finner_orgs_prods_alias", "en", "finance/models")\
    .setInputCols(["document", "token", "embeddings"])\
    .setOutputCol("ner")

# Merge token-level NER tags into entity chunks
ner_converter = nlp.NerConverter()\
    .setInputCols(["document", "token", "ner"])\
    .setOutputCol("ner_chunk")

# Treat each chunk as its own document so it can be embedded
chunkToDoc = nlp.Chunk2Doc()\
    .setInputCols("ner_chunk")\
    .setOutputCol("ner_chunk_doc")

# Embed each chunk document
bge_embeddings = nlp.BGEEmbeddings.pretrained("finance_bge_base_embeddings", "en", "finance/models")\
    .setInputCols("ner_chunk_doc")\
    .setOutputCol("sentence_embeddings")

# Resolve each chunk embedding to an official company name
fe_er_model = finance.SentenceEntityResolverModel.pretrained("finel_nasdaq_company_name_stock_screener_fe", "en", "finance/models")\
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("normalized")\
    .setDistanceFunction("EUCLIDEAN")

nlpPipeline = nlp.Pipeline(stages=[
    documentAssembler,
    tokenizer,
    embeddings,
    ner_model,
    ner_converter,
    chunkToDoc,
    bge_embeddings,
    fe_er_model
])

text = """NIKE is an American multinational corporation that is engaged in the design, development, manufacturing, and worldwide marketing and sales of footwear, apparel, equipment, accessories, and services."""

test_data = spark.createDataFrame([[text]]).toDF("text")
model = nlpPipeline.fit(test_data)

# LightPipeline runs the fitted pipeline on plain strings
lp = nlp.LightPipeline(model)
result = lp.annotate(text)
result["normalized"]
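The resolver above is configured with a Euclidean distance function. As a rough illustration of what distance-based resolution means, the sketch below picks the candidate whose vector is closest to a query vector. It is a toy example, not the library's implementation: the 3-dimensional vectors, the second candidate name, and the helpers `euclidean` and `resolve` are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def resolve(query_vec, candidates):
    """Return (name, distance) for the candidate closest to query_vec."""
    scored = ((name, euclidean(query_vec, vec)) for name, vec in candidates.items())
    return min(scored, key=lambda pair: pair[1])


# Hypothetical embeddings of official names (real embeddings have many
# more dimensions and come from the embedding model, not from hand).
candidates = {
    "Nike Inc. Common Stock": [0.9, 0.1, 0.0],
    "Example Corp. Common Stock": [0.0, 0.8, 0.6],
}

# Hypothetical embedding of an extracted mention such as "NIKE".
query = [1.0, 0.0, 0.0]

name, dist = resolve(query, candidates)
print(name, round(dist, 4))
```

A smaller distance means the mention embedding sits closer to that official name, so the nearest candidate is taken as the normalized form.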
Results
['Nike Inc. Common Stock']
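To see which extracted mention produced which official name, the chunk and resolver outputs can be read side by side. The sketch below assumes the annotate result is a dictionary of output column names to lists of strings, and that the resolver emits one name per chunk in the same order. The `sample_result` dictionary is a hand-written stand-in (the "NIKE" chunk is an assumption, not captured output), and `pair_chunks` is a hypothetical helper.

```python
def pair_chunks(result):
    """Pair each extracted chunk with its resolved official name."""
    chunks = result.get("ner_chunk", [])
    names = result.get("normalized", [])
    return list(zip(chunks, names))


# Hand-written stand-in shaped like an annotate result for the example text.
sample_result = {
    "ner_chunk": ["NIKE"],
    "normalized": ["Nike Inc. Common Stock"],
}

pairs = pair_chunks(sample_result)
for chunk, official_name in pairs:
    print(f"{chunk} -> {official_name}")
```

If no company name is detected, both lists are empty and the helper returns an empty list.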
Model Information
Model Name: finel_nasdaq_company_name_stock_screener_fe
Compatibility: Finance NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [sentence_embeddings]
Output Labels: [normalized]
Language: en
Size: 115.7 MB
Case sensitive: false
References
https://www.nasdaq.com/market-activity/stocks/screener