Financial Pipeline (ORG-PER-ROLE-DATE)

Description

This pretrained pipeline extracts companies (ORG), people (PERSON), job titles (ROLE), and dates (DATE). It combines several pretrained NER models to improve coverage.

Predicted Entities

ORG, PERSON, ROLE, DATE

How to use

from johnsnowlabs import *

# Load the pretrained pipeline from the finance model repository
pipeline = PretrainedPipeline("finpipe_org_per_role_date", "en", "finance/models")

# annotate() returns a dict of output columns; 'token' and 'ner' are aligned lists
res = pipeline.annotate("John Smith works as Computer Engineer at Amazon since 2020")
for token, ner in zip(res['token'], res['ner']):
    print(f"{token} ({ner})")

Results

John (B-PERSON)
Smith (I-PERSON)
works (O)
as (O)
Computer (B-ROLE)
Engineer (I-ROLE)
at (O)
Amazon (B-ORG)
since (O)
2020 (B-DATE)
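
The token-level tags above follow the BIO scheme: B- opens an entity and I- continues it. The following is a minimal sketch of collapsing such tags into whole entities in plain Python. The helper name group_bio is hypothetical and not part of the library, and the token and tag lists are copied by hand from the results above rather than read from the pipeline.

```python
def group_bio(tokens, tags):
    """Collapse token-level BIO tags into (entity_text, label) pairs."""
    entities = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        # "B-PERSON" -> ("B", "PERSON"); "O" -> ("O", "")
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == current_label:
            # Continuation of the entity that is currently open
            current_tokens.append(token)
            continue
        if current_tokens:
            # Any other tag closes the open entity
            entities.append((" ".join(current_tokens), current_label))
        if prefix in ("B", "I"):
            # Start a new entity (a stray I- tag is treated like B-)
            current_tokens, current_label = [token], label
        else:
            current_tokens, current_label = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_label))
    return entities


# Tokens and tags copied from the results listed above
tokens = ["John", "Smith", "works", "as", "Computer", "Engineer",
          "at", "Amazon", "since", "2020"]
tags = ["B-PERSON", "I-PERSON", "O", "O", "B-ROLE", "I-ROLE",
        "O", "B-ORG", "O", "B-DATE"]

for text, label in group_bio(tokens, tags):
    print(f"{text} -> {label}")
```

With real pipeline output, the same helper could be called as group_bio(res['token'], res['ner']), assuming both lists are aligned as in the usage example.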

Model Information

Model Name: finpipe_org_per_role_date
Type: pipeline
Compatibility: Finance NLP 1.0.0+
License: Licensed
Edition: Official
Language: en
Size: 828.4 MB

References

In-house annotations on legal and financial documents, OntoNotes, CoNLL 2003, FinSec CoNLL, CUAD dataset, 10-K filings

Included Models

  • DocumentAssembler
  • SentenceDetectorDLModel
  • TokenizerModel
  • BertEmbeddings
  • FinanceNerModel
  • FinanceBertForTokenClassification
  • NerConverter
  • NerConverter
  • ChunkMergeModel
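
The stage list suggests that two NER models each produce entity chunks, which a final stage then merges. The sketch below illustrates that idea in plain Python and is not the library's ChunkMergeModel implementation. The function merge_chunks, the rule of preferring the first list on overlap, and the split of labels between the two models are all assumptions made for illustration. Chunks are (start, end, label) tuples with character offsets into the example sentence.

```python
def merge_chunks(primary, secondary):
    """Union two chunk lists; on overlap keep the chunk from `primary`.

    Each chunk is (start, end, label) with an exclusive end offset.
    Assumed policy for illustration only, not the library's actual rule.
    """
    merged = list(primary)
    for start, end, label in secondary:
        # Two half-open spans overlap when each starts before the other ends
        overlaps = any(start < p_end and p_start < end
                       for p_start, p_end, _ in primary)
        if not overlaps:
            merged.append((start, end, label))
    return sorted(merged)


text = "John Smith works as Computer Engineer at Amazon since 2020"

# Hypothetical outputs of the two NER stages (label split is assumed)
chunks_a = [(0, 10, "PERSON"), (41, 47, "ORG"), (54, 58, "DATE")]
chunks_b = [(20, 37, "ROLE"), (41, 47, "ORG")]

for start, end, label in merge_chunks(chunks_a, chunks_b):
    print(f"{text[start:end]} ({label})")
```

The duplicate ORG chunk from the second list is dropped because it overlaps a chunk already present in the first, while the ROLE chunk fills a gap the first list left open.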