Legal NER - Warranties (sm)

Description

IMPORTANT: Don’t run this model on the whole legal agreement. Instead:

Split by paragraphs. You can use notebook 1 in Finance or Legal as inspiration;
Use the legclf_warranty_clause Text Classifier to select only these paragraphs;

This is a Legal Named Entity Recognition Model to identify the Subject (who), Action (what), Object(the indemnification) and Indirect Object (to whom) from Warranty clauses.

Predicted Entities

WARRANTY, WARRANTY_ACTION, WARRANTY_SUBJECT, WARRANTY_INDIRECT_OBJECT

Download Copy S3 URI

How to use

documentAssembler = nlp.DocumentAssembler()\
        .setInputCol("text")\
        .setOutputCol("document")
        
sentenceDetector = nlp.SentenceDetectorDLModel.pretrained("sentence_detector_dl","xx")\
        .setInputCols(["document"])\
        .setOutputCol("sentence")

tokenizer = nlp.Tokenizer()\
        .setInputCols(["sentence"])\
        .setOutputCol("token")

embeddings = nlp.RoBertaEmbeddings.pretrained("roberta_embeddings_legal_roberta_base","en") \
    .setInputCols(["sentence", "token"]) \
    .setOutputCol("embeddings")

ner_model = legal.NerModel.pretrained('legner_warranty', 'en', 'legal/models')\
        .setInputCols(["sentence", "token", "embeddings"])\
        .setOutputCol("ner")

ner_converter = nlp.NerConverter()\
        .setInputCols(["sentence","token","ner"])\
        .setOutputCol("ner_chunk")

nlpPipeline = nlp.Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,embeddings,ner_model,ner_converter])

data = spark.createDataFrame([["""8 . Representations and Warranties SONY hereby makes the following representations and warranties to PURCHASER , each of which shall be true and correct as of the date hereof and as of the Closing Date , and shall be unaffected by any investigation heretofore or hereafter made : 8.1 Power and Authority SONY has the right and power to enter into this IP Agreement and to transfer the Transferred Patents and to grant the license set forth in Section 3.1 ."""]]).toDF("text")

result = nlpPipeline.fit(data).transform(data)

Results

+--------------------------------------------------------------------------+------------------------+
|chunk                                                                     |entity                  |
+--------------------------------------------------------------------------+------------------------+
|SONY                                                                      |WARRANTY_SUBJECT        |
|makes the following representations and warranties                        |WARRANTY_ACTION         |
|PURCHASER                                                                 |WARRANTY_INDIRECT_OBJECT|
|shall be true and correct as of the date hereof and as of the Closing Date|WARRANTY                |
|shall be unaffected by any investigation                                  |WARRANTY                |
|SONY                                                                      |WARRANTY_SUBJECT        |
|has the right and power to enter into this IP Agreement                   |WARRANTY                |
+--------------------------------------------------------------------------+------------------------+

Model Information

Model Name:	legner_warranty
Compatibility:	Legal NLP 1.0.0+
License:	Licensed
Edition:	Official
Input Labels:	[sentence, token, embeddings]
Output Labels:	[ner]
Language:	en
Size:	16.3 MB

References

In-house annotated examples from CUAD legal dataset

Benchmarking

                     label  precision    recall  f1-score   support
                B-WARRANTY     0.8993    0.9178    0.9085       146
         B-WARRANTY_ACTION     1.0000    0.9318    0.9647        44
B-WARRANTY_INDIRECT_OBJECT     1.0000    0.9474    0.9730        19
        B-WARRANTY_SUBJECT     0.8554    0.9726    0.9103        73
                I-WARRANTY     0.9695    0.9618    0.9656      1885
         I-WARRANTY_ACTION     0.9515    0.9800    0.9655       100
I-WARRANTY_INDIRECT_OBJECT     0.8333    0.8333    0.8333         6
        I-WARRANTY_SUBJECT     1.0000    0.9444    0.9714        36
                         O     0.9758    0.9772    0.9765      3381
                  accuracy        -        -       0.9698      5690
                 macro-avg     0.9428    0.9407    0.9410      5690
              weighted-avg     0.9700    0.9698    0.9698      5690

PREVIOUSLegal Confidentiality NER

NEXTLegal Relation Extraction (Confidentiality, sm, Bidirectional)