Legal NER for NDA (Assignment Clause)

Description

This NER model is meant to run only on text that a classifier has already identified as an ASSIGNMENT clause (use legmulticlf_mnda_sections_paragraph_other for that purpose). It extracts a single entity type: ASSIGN_EXCEPTION.
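The two-step usage described above (classify first, then extract) can be sketched in plain Python. This is a minimal sketch and not the library's API: `extract_from_assignment_clauses`, `toy_classify` and `toy_extract` are hypothetical names, the toy functions stand in for the real classifier and NER pipelines, and the exact label string emitted by the classifier is assumed to be "ASSIGNMENT". The first sample paragraph is made up for illustration.

```python
from typing import Callable, Iterable, List, Tuple

Entity = Tuple[str, str]  # (chunk text, entity label)


def extract_from_assignment_clauses(
    paragraphs: Iterable[str],
    classify: Callable[[str], str],
    extract: Callable[[str], List[Entity]],
    target_label: str = "ASSIGNMENT",  # assumed label name
) -> List[Tuple[str, List[Entity]]]:
    """Run the extractor only on paragraphs the classifier tags as target_label."""
    results = []
    for paragraph in paragraphs:
        if classify(paragraph) != target_label:
            continue  # skip clauses the NER model was not trained on
        results.append((paragraph, extract(paragraph)))
    return results


# Toy stand-ins so the sketch runs without Spark or a license.
def toy_classify(paragraph: str) -> str:
    return "ASSIGNMENT" if "assignment" in paragraph.lower() else "OTHER"


def toy_extract(paragraph: str) -> List[Entity]:
    phrase = "written consent"
    return [(phrase, "ASSIGN_EXCEPTION")] if phrase in paragraph else []


paragraphs = [
    "This Agreement shall be governed by the laws of the State of New York.",
    "Any attempted or purported assignment of this Agreement by either party "
    "without the prior written consent of the other party shall be null and void.",
]

gated = extract_from_assignment_clauses(paragraphs, toy_classify, toy_extract)
for paragraph, entities in gated:
    print(entities)
```

In a real setup the two callables would wrap the fitted classifier and NER pipelines; the gating logic itself stays the same.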

Predicted Entities

ASSIGN_EXCEPTION


How to use

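from johnsnowlabs import nlp, legal

# Start a Spark session (requires a valid license for the legal models)
spark = nlp.start()

document_assembler = nlp.DocumentAssembler()\
        .setInputCol("text")\
        .setOutputCol("document")
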
sentence_detector = nlp.SentenceDetector()\
        .setInputCols(["document"])\
        .setOutputCol("sentence")

tokenizer = nlp.Tokenizer()\
        .setInputCols(["sentence"])\
        .setOutputCol("token")

embeddings = nlp.RoBertaEmbeddings.pretrained("roberta_embeddings_legal_roberta_base","en") \
        .setInputCols(["sentence", "token"]) \
        .setOutputCol("embeddings")\
        .setMaxSentenceLength(512)\
        .setCaseSensitive(True)

ner_model = legal.NerModel.pretrained("legner_nda_assigment", "en", "legal/models")\
        .setInputCols(["sentence", "token", "embeddings"])\
        .setOutputCol("ner")

ner_converter = nlp.NerConverter()\
        .setInputCols(["sentence", "token", "ner"])\
        .setOutputCol("ner_chunk")

nlpPipeline = nlp.Pipeline(stages=[
        document_assembler,
        sentence_detector,
        tokenizer,
        embeddings,
        ner_model,
        ner_converter])

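# All stages are pretrained or rule-based, so fitting on an empty DataFrame only assembles the pipeline
empty_data = spark.createDataFrame([[""]]).toDF("text")

model = nlpPipeline.fit(empty_data)

text = ["""Any attempted or purported assignment of this Agreement by either party without the prior written consent of the other party shall be null and void."""]

# Annotate the sample clause; the extracted chunks are in the "ner_chunk" column
result = model.transform(spark.createDataFrame([text]).toDF("text"))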

Results

+---------------+----------------+
|chunk          |ner_label       |
+---------------+----------------+
|written consent|ASSIGN_EXCEPTION|
+---------------+----------------+
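The model tags individual tokens with B- and I- prefixed labels (see the Benchmarking section), and the converter stage merges them into chunks like the one in the table above. The following is a rough pure-Python sketch of that merging idea, not the NerConverter implementation: `bio_to_chunks` is a hypothetical helper, and the per-token tags are assumed so that they reproduce the chunk shown in the table.

```python
from typing import List, Optional, Tuple


def bio_to_chunks(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge B-/I- tagged tokens into (chunk text, label) pairs."""
    chunks: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Emit the chunk collected so far, if any
        if current_tokens and current_label is not None:
            chunks.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()  # a B- tag always starts a new chunk
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)  # continue the open chunk
        else:
            # "O" or an I- tag with no matching open chunk: close and skip
            flush()
            current_tokens, current_label = [], None
    flush()
    return chunks


# Assumed token-level output for part of the sample sentence
tokens = ["without", "the", "prior", "written", "consent", "of", "the", "other", "party"]
tags = ["O", "O", "O", "B-ASSIGN_EXCEPTION", "I-ASSIGN_EXCEPTION", "O", "O", "O", "O"]

chunks = bio_to_chunks(tokens, tags)
print(chunks)  # [('written consent', 'ASSIGN_EXCEPTION')]
```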

Model Information

Model Name:	legner_nda_assigment
Compatibility:	Legal NLP 1.0.0+
License:	Licensed
Edition:	Official
Input Labels:	[sentence, token, embeddings]
Output Labels:	[ner]
Language:	en
Size:	16.3 MB

References

In-house annotations on Non-disclosure Agreements

Benchmarking

label               precision  recall  f1-score  support 
B-ASSIGN_EXCEPTION  0.96       0.96    0.96      24      
I-ASSIGN_EXCEPTION  0.94       0.94    0.94      17      
micro-avg           0.95       0.95    0.95      41      
macro-avg           0.95       0.95    0.95      41      
weighted-avg        0.95       0.95    0.95      41 
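As a quick sanity check, the macro and weighted averages in the table can be approximately recomputed from the two per-label rows. This sketch uses the rounded F1 values and supports printed above, so it only matches the reported figures to two decimals. The micro average is left out because it needs raw true-positive and false-positive counts, which the table does not give.

```python
# Per-label F1 and support, copied from the benchmark table
per_label = {
    "B-ASSIGN_EXCEPTION": {"f1": 0.96, "support": 24},
    "I-ASSIGN_EXCEPTION": {"f1": 0.94, "support": 17},
}

total_support = sum(row["support"] for row in per_label.values())

# Macro average: unweighted mean over labels
macro_f1 = sum(row["f1"] for row in per_label.values()) / len(per_label)

# Weighted average: mean over labels, weighted by each label's support
weighted_f1 = sum(row["f1"] * row["support"] for row in per_label.values()) / total_support

print(total_support)          # 41
print(f"{macro_f1:.2f}")      # 0.95
print(f"{weighted_f1:.2f}")   # 0.95
```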
