Description
This pretrained pipeline extracts companies (ORG), people (PERSON), job titles (ROLE), and dates (DATE). It combines several pretrained NER models to improve coverage.
Predicted Entities
ORG, PERSON, ROLE, DATE
How to use
from johnsnowlabs import *

# Load the pretrained pipeline
ner_pipeline = PretrainedPipeline("finpipe_org_per_role_date", "en", "finance/models")

# Annotate a sentence once and keep the result
res = ner_pipeline.annotate("John Smith works as Computer Engineer at Amazon since 2020")

# Print each token next to its predicted tag
for token, ner in zip(res['token'], res['ner']):
    print(f"{token} ({ner})")
Results
John (B-PERSON)
Smith (I-PERSON)
works (O)
as (O)
Computer (B-ROLE)
Engineer (I-ROLE)
at (O)
Amazon (B-ORG)
since (O)
2020 (B-DATE)
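The tags above follow the BIO scheme, so consecutive B-/I- tokens belong to one entity. As a minimal sketch of how they could be grouped into whole entities, the plain-Python helper below (`bio_to_chunks` is a hypothetical name, not part of the library) works on token and tag lists hard-coded from the Results above, so it runs without loading the pipeline.

```python
from typing import List, Optional, Tuple


def bio_to_chunks(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, label) pairs.

    Hypothetical helper for illustration; not part of johnsnowlabs.
    """
    chunks: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Emit the entity collected so far, if any.
        if current_tokens and current_label is not None:
            chunks.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and tag[2:] == current_label:
            # Continuation of the entity currently being collected.
            current_tokens.append(token)
        elif tag.startswith("B-") or tag.startswith("I-"):
            # A B- tag, or a stray I- tag with a new label, starts an entity.
            flush()
            current_tokens, current_label = [token], tag[2:]
        else:
            # An O tag closes any open entity.
            flush()
            current_tokens, current_label = [], None
    flush()
    return chunks


# Tokens and tags copied from the Results section (assumed pipeline output).
tokens = ["John", "Smith", "works", "as", "Computer", "Engineer",
          "at", "Amazon", "since", "2020"]
tags = ["B-PERSON", "I-PERSON", "O", "O", "B-ROLE", "I-ROLE",
        "O", "B-ORG", "O", "B-DATE"]

chunks = bio_to_chunks(tokens, tags)
for text, label in chunks:
    print(f"{text} -> {label}")
```

With the pipeline loaded, the same helper could presumably be called on `res['token']` and `res['ner']` from the snippet in How to use.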
Model Information
| Model Name: | finpipe_org_per_role_date |
| Type: | pipeline |
| Compatibility: | Finance NLP 1.0.0+ |
| License: | Licensed |
| Edition: | Official |
| Language: | en |
| Size: | 828.4 MB |
References
In-house annotations on legal and financial documents, OntoNotes, CoNLL-2003, Finsec CoNLL, the CUAD dataset, and 10-K filings.
Included Models
- DocumentAssembler
- SentenceDetectorDLModel
- TokenizerModel
- BertEmbeddings
- FinanceNerModel
- FinanceBertForTokenClassification
- NerConverter
- NerConverter
- ChunkMergeModel
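
The list above contains two NER models, two NerConverter stages, and a ChunkMergeModel, which suggests that the entity chunks from both models are combined into one output. The sketch below is not the ChunkMergeModel implementation; it only illustrates one possible merge policy in plain Python, under the assumption that chunks from the first model win when two chunks overlap. The `Chunk` type, the `merge_chunks` function, and the split of entities between the two models are all hypothetical.

```python
from typing import List, Tuple

# (start_token_index, end_token_index_inclusive, label); hypothetical layout
Chunk = Tuple[int, int, str]


def merge_chunks(primary: List[Chunk], secondary: List[Chunk]) -> List[Chunk]:
    """Union two chunk lists, keeping the primary chunk on any overlap.

    Assumed policy for illustration only; the real stage may behave differently.
    """
    merged = list(primary)
    for start, end, label in secondary:
        # Two inclusive token ranges overlap when each starts before the other ends.
        overlaps = any(start <= m_end and m_start <= end
                       for m_start, m_end, _ in merged)
        if not overlaps:
            merged.append((start, end, label))
    # Sort by position so the output follows the order of the text.
    return sorted(merged)


# Hypothetical outputs of two NER models on the ten-token example sentence.
model_a = [(0, 1, "PERSON"), (7, 7, "ORG")]
model_b = [(4, 5, "ROLE"), (7, 7, "ORG"), (9, 9, "DATE")]

merged = merge_chunks(model_a, model_b)
for start, end, label in merged:
    print(start, end, label)
```

In this toy input, neither model alone covers all four entity types, while the merged list does, which is the kind of coverage gain the Description refers to.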