Legal Assertion Status (Negation)

Description

This legal assertion status model identifies whether an entity extracted by an NER model is negated in the context where it is mentioned.

Predicted Entities

positive, negative

How to use

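from johnsnowlabs import nlp, legal
import pyspark.sql.functions as F

# Start a Spark session with the licensed Legal NLP libraries loaded
spark = nlp.start()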

document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

sentence_detector = nlp.SentenceDetector() \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

tokenizer = nlp.Tokenizer()\
    .setInputCols(["sentence"])\
    .setOutputCol("token")

embeddings = nlp.BertEmbeddings.pretrained("bert_embeddings_sec_bert_base","en") \
    .setInputCols(["sentence", "token"]) \
    .setOutputCol("embeddings")

ner = legal.NerModel.pretrained("legner_orgs_prods_alias","en","legal/models")\
    .setInputCols(["sentence", "token", "embeddings"])\
    .setOutputCol("ner")

ner_converter = nlp.NerConverter() \
    .setInputCols(["sentence", "token", "ner"]) \
    .setOutputCol("ner_chunk")

legassertion = legal.AssertionDLModel.pretrained("legassertion_negation", "en", "legal/models")\
    .setInputCols(["sentence", "ner_chunk", "embeddings"])\
    .setOutputCol("leglabel")

pipe = nlp.Pipeline(stages=[
    document_assembler,
    sentence_detector,
    tokenizer,
    embeddings,
    ner,
    ner_converter,
    legassertion
])

text = "Gradio INC will not be entering into a joint agreement with Hugging Face, Inc."

sdf = spark.createDataFrame([[text]]).toDF("text")
res = pipe.fit(sdf).transform(sdf)

# Pair each NER chunk with its assertion label, one output row per entity
res.select(F.explode(F.arrays_zip(res.ner_chunk.result, 
                                  res.leglabel.result)).alias("cols"))\
                  .select(F.expr("cols['0']").alias("ner_chunk"),
                          F.expr("cols['1']").alias("assertion")).show(200, truncate=100)

Results

+-----------------+---------+
|        ner_chunk|assertion|
+-----------------+---------+
|       Gradio INC| negative|
|Hugging Face, Inc| positive|
+-----------------+---------+
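Downstream code often needs only the negated entities. The following is a minimal sketch in plain Python, assuming the (ner_chunk, assertion) pairs have already been brought to the driver (for example by calling collect() on the selected DataFrame). The rows list and the group_by_assertion helper are hypothetical names used for illustration and are not part of the library.

```python
from collections import defaultdict

# Hypothetical rows mirroring the results table: (ner_chunk, assertion)
rows = [
    ("Gradio INC", "negative"),
    ("Hugging Face, Inc", "positive"),
]

def group_by_assertion(pairs):
    """Group entity chunks by their assertion label, preserving input order."""
    grouped = defaultdict(list)
    for chunk, status in pairs:
        grouped[status].append(chunk)
    return dict(grouped)

grouped = group_by_assertion(rows)

# Entities the model flagged as negated in their context
negated = grouped.get("negative", [])
print(negated)
```

Grouping on the driver is only reasonable for small result sets. For larger data, filtering on the assertion column in Spark before collecting is likely the better choice.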

Model Information

Model Name:	legassertion_negation
Compatibility:	Legal NLP 1.0.0+
License:	Licensed
Edition:	Official
Input Labels:	[document, chunk, embeddings]
Output Labels:	[assertion]
Language:	en
Size:	2.2 MB

References

In-house annotated legal sentences

Benchmarking

label	 tp	 fp	 fn	 prec	 rec	 f1
negative	 26	 0	 1	 1.0	 0.962963	 0.9811321
positive	 38	 1	 0	 0.974359	 1.0	 0.987013
Macro-average	 64	 1	 1	 0.9871795	 0.9814815	 0.9843222
Micro-average	 -	 -	 -	 0.9846154	 0.9846154	 0.9846154
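The averaged scores can be recomputed from the per-label counts as a sanity check. The sketch below uses only the tp, fp and fn values for the negative and positive labels. It assumes the reported macro F1 is the harmonic mean of macro precision and macro recall, which reproduces 0.9843222; averaging the two per-label F1 scores instead would give a slightly different value. The prf helper is a hypothetical name.

```python
# Per-label counts copied from the benchmark table: label -> (tp, fp, fn)
counts = {
    "negative": (26, 0, 1),
    "positive": (38, 1, 0),
}

def prf(tp, fp, fn):
    """Return (precision, recall, f1) for one set of counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

per_label = {label: prf(*c) for label, c in counts.items()}

# Macro average: unweighted mean of per-label precision and recall
macro_precision = sum(p for p, _, _ in per_label.values()) / len(per_label)
macro_recall = sum(r for _, r, _ in per_label.values()) / len(per_label)
# Assumption: macro F1 is the harmonic mean of the macro precision and recall
macro_f1 = 2 * macro_precision * macro_recall / (macro_precision + macro_recall)

# Micro average: pool the counts across labels, then score once
total_tp = sum(tp for tp, _, _ in counts.values())
total_fp = sum(fp for _, fp, _ in counts.values())
total_fn = sum(fn for _, _, fn in counts.values())
micro_precision, micro_recall, micro_f1 = prf(total_tp, total_fp, total_fn)

print(round(macro_precision, 7), round(macro_recall, 7), round(macro_f1, 7))
print(round(micro_precision, 7), round(micro_recall, 7), round(micro_f1, 7))
```

Because each false positive for one label is a false negative for the other in this two-label setting, the pooled fp and fn totals are equal, so micro precision, recall and F1 coincide.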
