Temporality / Certainty Assertion Status (sm)

Description

This is a small Assertion Status model that detects the temporality (PRESENT, PAST, FUTURE) or certainty (POSSIBLE) of entities in your legal documents.

Predicted Entities

PRESENT, PAST, FUTURE, POSSIBLE


How to use

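from johnsnowlabs import nlp, legal

# Define documentAssembler, tokenizer and YOUR NER HERE (as `ner`).
# Together they must produce the columns consumed below: "sentence", "token" and "entity".
# ...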
embeddings = nlp.BertEmbeddings.pretrained("bert_embeddings_sec_bert_base","en") \
    .setInputCols(["sentence", "token"]) \
    .setOutputCol("embeddings")

chunk_converter = nlp.ChunkConverter() \
    .setInputCols(["entity"]) \
    .setOutputCol("ner_chunk")

assertion = legal.AssertionDLModel.pretrained("legassertion_time", "en", "legal/models")\
    .setInputCols(["sentence", "ner_chunk", "embeddings"]) \
    .setOutputCol("assertion")
    
nlpPipeline = nlp.Pipeline(stages=[
    documentAssembler, 
    tokenizer,
    embeddings,
    ner,
    chunk_converter,
    assertion
    ])

empty_data = spark.createDataFrame([[""]]).toDF("text")

model = nlpPipeline.fit(empty_data)

lp = nlp.LightPipeline(model)

texts = ["The subsidiaries of Atlantic Inc will participate in a merging operation",
    "The Conditions and Warranties of this agreement might be modified"]

lp.annotate(texts)

Results

chunk,begin,end,entity_type,assertion
Atlantic Inc,20,31,ORG,FUTURE

chunk,begin,end,entity_type,assertion
Conditions and Warranties,4,28,DOC,POSSIBLE
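
Rows like the ones above carry character offsets, which `annotate` does not return (it yields plain strings). A minimal sketch of how such rows could be assembled from the annotation objects that `lp.fullAnnotate(texts)` returns is given below. The helper name `assertion_rows` is hypothetical, and the code assumes that the "ner_chunk" and "assertion" columns are aligned one-to-one and that the entity type is stored under the chunk's `metadata["entity"]` key. The stand-in objects only mimic the annotation attributes so the snippet runs without a Spark session.

```python
from types import SimpleNamespace


def assertion_rows(annotated):
    """Build (chunk, begin, end, entity_type, assertion) tuples from one
    fullAnnotate()-style result: a dict mapping output column names to
    lists of annotation objects."""
    rows = []
    # Assumption: the i-th assertion annotation belongs to the i-th chunk.
    for chunk, status in zip(annotated["ner_chunk"], annotated["assertion"]):
        rows.append((
            chunk.result,                  # chunk text
            chunk.begin,                   # start offset
            chunk.end,                     # end offset (inclusive)
            chunk.metadata.get("entity"),  # entity type, if present
            status.result,                 # assertion label
        ))
    return rows


# Stand-in for one annotated text, using the values from the first Results row.
example = {
    "ner_chunk": [SimpleNamespace(result="Atlantic Inc", begin=20, end=31,
                                  metadata={"entity": "ORG"})],
    "assertion": [SimpleNamespace(result="FUTURE", begin=20, end=31,
                                  metadata={})],
}
rows = assertion_rows(example)
print(rows)
```

With a fitted pipeline, the same helper would be applied to each element of `lp.fullAnnotate(texts)`, provided the assumptions above hold for your NER stage.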

Model Information

Model Name:	legassertion_time
Compatibility:	Legal NLP 1.0.0+
License:	Licensed
Edition:	Official
Input Labels:	[document, doc_chunk, embeddings]
Output Labels:	[assertion]
Language:	en
Size:	2.2 MB

References

In-house annotations on financial and legal corpora

Benchmarking

label            tp      fp    fn    prec         rec         f1
PRESENT          201     11    16    0.9481132    0.9262672   0.937063
POSSIBLE         171     3     6     0.9827586    0.9661017   0.974359
FUTURE           119     6     4     0.952        0.9674796   0.959677
PAST             270     16    10    0.9440559    0.9642857   0.954063
Macro-average    761     36    36    0.9567319    0.9560336   0.9563826
Micro-average    761     36    36    0.9548306    0.9548306   0.9548306
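
The figures in the table can be recomputed from the tp/fp/fn counts. The sketch below copies the counts from the table and derives precision, recall and F1 per label, plus the two averages. The macro F1 is computed here as the harmonic mean of macro precision and macro recall, which appears to match the table; that reading is an inference from the numbers, not something the model card states.

```python
# Counts copied from the benchmarking table: label -> (tp, fp, fn)
counts = {
    "PRESENT": (201, 11, 16),
    "POSSIBLE": (171, 3, 6),
    "FUTURE": (119, 6, 4),
    "PAST": (270, 16, 10),
}


def prf(tp, fp, fn):
    """Return (precision, recall, f1) from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


per_label = {label: prf(*c) for label, c in counts.items()}

# Micro-average: pool the counts first, then compute the metrics once.
total_tp, total_fp, total_fn = (sum(col) for col in zip(*counts.values()))
micro = prf(total_tp, total_fp, total_fn)

# Macro-average: unweighted mean of per-label precision and recall.
macro_p = sum(p for p, _, _ in per_label.values()) / len(per_label)
macro_r = sum(r for _, r, _ in per_label.values()) / len(per_label)
# Assumption: macro F1 is the harmonic mean of macro precision and recall.
macro_f1 = 2 * macro_p * macro_r / (macro_p + macro_r)

for label, (p, r, f1) in per_label.items():
    print(f"{label:<10} {p:.7f} {r:.7f} {f1:.6f}")
print(f"macro      {macro_p:.7f} {macro_r:.7f} {macro_f1:.7f}")
print(f"micro      {micro[0]:.7f} {micro[1]:.7f} {micro[2]:.7f}")
```

Because total fp and total fn are both 36, micro precision, recall and F1 coincide.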
