Description
Given a company name extracted from text, this model returns the following information about that company from the Nasdaq Stock Screener:
- Country
- IPO_Year
- Industry
- Last_Sale
- Market_Cap
- Name
- Net_Change
- Percent_Change
- Sector
- Ticker (reported under the key Symbol in the output)
- Volume
It can optionally be combined with Entity Resolution to normalize the company name first.
Predicted Entities
How to use
from johnsnowlabs import nlp, finance

# Start a Spark session (assumes license credentials are already configured)
spark = nlp.start()

document_assembler = nlp.DocumentAssembler()\
    .setInputCol('text')\
    .setOutputCol('document')

tokenizer = nlp.Tokenizer()\
    .setInputCols("document")\
    .setOutputCol("token")

embeddings = nlp.BertEmbeddings.pretrained("bert_embeddings_sec_bert_base", "en")\
    .setInputCols(["document", "token"])\
    .setOutputCol("embeddings")

# NER model that extracts the company name
ner_model = finance.NerModel.pretrained('finner_orgs_prods_alias', 'en', 'finance/models')\
    .setInputCols(["document", "token", "embeddings"])\
    .setOutputCol("ner")

ner_converter = nlp.NerConverter()\
    .setInputCols(["document", "token", "ner"])\
    .setOutputCol("ner_chunk")

# Optional: To normalize the ORG name using NASDAQ data before the mapping
##########################################################################
chunkToDoc = nlp.Chunk2Doc()\
    .setInputCols("ner_chunk")\
    .setOutputCol("ner_chunk_doc")

chunk_embeddings = nlp.UniversalSentenceEncoder.pretrained("tfhub_use", "en")\
    .setInputCols(["ner_chunk_doc"])\
    .setOutputCol("chunk_embeddings")

use_er_model = finance.SentenceEntityResolverModel.pretrained('finel_nasdaq_company_name_stock_screener', 'en', 'finance/models')\
    .setInputCols("chunk_embeddings")\
    .setOutputCol('normalized')\
    .setDistanceFunction("EUCLIDEAN")
##########################################################################

# Maps the (normalized) company name to its Nasdaq Stock Screener record.
# If the optional normalization stages are removed, use "ner_chunk" as input instead.
CM = finance.ChunkMapperModel.pretrained('finmapper_nasdaq_company_name_stock_screener', 'en', 'finance/models')\
    .setInputCols(["normalized"])\
    .setOutputCol("mappings")

pipeline = nlp.Pipeline().setStages([
    document_assembler,
    tokenizer,
    embeddings,
    ner_model,
    ner_converter,
    chunkToDoc,        # Optional for normalization
    chunk_embeddings,  # Optional for normalization
    use_er_model,      # Optional for normalization
    CM])

# All stages are pretrained, so fitting on an empty DataFrame only assembles the pipeline
empty_data = spark.createDataFrame([[""]]).toDF("text")
model = pipeline.fit(empty_data)

lp = nlp.LightPipeline(model)

text = """Nike is an American multinational association that is involved in the design, development, manufacturing and worldwide marketing and sales of apparel, footwear, accessories, equipment and services."""

result = lp.fullAnnotate(text)
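To inspect what the mapper returned, the annotations in the mappings output column can be gathered into one record per company. The sketch below assumes, as is usual for Spark NLP annotations, that each item exposes a result string and a metadata dictionary, and that the chunk mapper stores the field name under a "relation" metadata key. The helper collect_mappings and the stand-in objects are hypothetical; they are used here only so the snippet runs without a Spark session.

```python
from types import SimpleNamespace


def collect_mappings(annotations):
    """Build a {relation: value} dict from chunk-mapper annotations.

    Assumes each annotation has `.result` (the mapped value) and
    `.metadata["relation"]` (the field name). Annotations without a
    relation key are skipped.
    """
    record = {}
    for ann in annotations:
        relation = ann.metadata.get("relation")
        if relation is not None:
            record[relation] = ann.result
    return record


# Hypothetical stand-ins for the objects found in result[0]["mappings"]
sample = [
    SimpleNamespace(result="United States", metadata={"relation": "Country"}),
    SimpleNamespace(result="NKE", metadata={"relation": "Symbol"}),
]
record = collect_mappings(sample)
print(record)
```

With the pipeline above, the equivalent call would be collect_mappings(result[0]["mappings"]), provided the metadata layout matches the assumption stated here.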
Results
{
    "Country": "United States",
    "IPO_Year": "0",
    "Industry": "Shoe Manufacturing",
    "Last_Sale": "$128.85",
    "Market_Cap": "1.9979004036E11",
    "Name": "Nike Inc. Common Stock",
    "Net_Change": "0.96",
    "Percent_Change": "0.751%",
    "Sector": "Consumer Discretionary",
    "Symbol": "NKE",
    "Volume": "4854668"
}
Model Information
Model Name: finmapper_nasdaq_company_name_stock_screener
Compatibility: Finance NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [ner_chunk]
Output Labels: [mappings]
Language: en
Size: 599.1 KB
References
https://www.nasdaq.com/market-activity/stocks/screener