Description
IMPORTANT: Don’t run this model on the whole legal agreement. Instead:
- Split by paragraphs. You can use notebook 1 in Finance or Legal as inspiration;
 - Use the 
legclf_warranty_clauseText Classifier to select only these paragraphs; 
This is a Legal Named Entity Recognition Model to identify the Subject (who), Action (what), Object(the indemnification) and Indirect Object (to whom) from Warranty clauses.
Predicted Entities
WARRANTY, WARRANTY_ACTION, WARRANTY_SUBJECT, WARRANTY_INDIRECT_OBJECT
How to use
documentAssembler = nlp.DocumentAssembler()\
        .setInputCol("text")\
        .setOutputCol("document")
        
sentenceDetector = nlp.SentenceDetectorDLModel.pretrained("sentence_detector_dl","xx")\
        .setInputCols(["document"])\
        .setOutputCol("sentence")
tokenizer = nlp.Tokenizer()\
        .setInputCols(["sentence"])\
        .setOutputCol("token")
embeddings = nlp.RoBertaEmbeddings.pretrained("roberta_embeddings_legal_roberta_base","en") \
    .setInputCols(["sentence", "token"]) \
    .setOutputCol("embeddings")
ner_model = legal.NerModel.pretrained('legner_warranty', 'en', 'legal/models')\
        .setInputCols(["sentence", "token", "embeddings"])\
        .setOutputCol("ner")
ner_converter = nlp.NerConverter()\
        .setInputCols(["sentence","token","ner"])\
        .setOutputCol("ner_chunk")
nlpPipeline = nlp.Pipeline(stages=[documentAssembler,sentenceDetector,tokenizer,embeddings,ner_model,ner_converter])
data = spark.createDataFrame([["""8 . Representations and Warranties SONY hereby makes the following representations and warranties to PURCHASER , each of which shall be true and correct as of the date hereof and as of the Closing Date , and shall be unaffected by any investigation heretofore or hereafter made : 8.1 Power and Authority SONY has the right and power to enter into this IP Agreement and to transfer the Transferred Patents and to grant the license set forth in Section 3.1 ."""]]).toDF("text")
result = nlpPipeline.fit(data).transform(data)
Results
+--------------------------------------------------------------------------+------------------------+
|chunk                                                                     |entity                  |
+--------------------------------------------------------------------------+------------------------+
|SONY                                                                      |WARRANTY_SUBJECT        |
|makes the following representations and warranties                        |WARRANTY_ACTION         |
|PURCHASER                                                                 |WARRANTY_INDIRECT_OBJECT|
|shall be true and correct as of the date hereof and as of the Closing Date|WARRANTY                |
|shall be unaffected by any investigation                                  |WARRANTY                |
|SONY                                                                      |WARRANTY_SUBJECT        |
|has the right and power to enter into this IP Agreement                   |WARRANTY                |
+--------------------------------------------------------------------------+------------------------+
Model Information
| Model Name: | legner_warranty | 
| Compatibility: | Legal NLP 1.0.0+ | 
| License: | Licensed | 
| Edition: | Official | 
| Input Labels: | [sentence, token, embeddings] | 
| Output Labels: | [ner] | 
| Language: | en | 
| Size: | 16.3 MB | 
References
In-house annotated examples from CUAD legal dataset
Benchmarking
                     label  precision    recall  f1-score   support
                B-WARRANTY     0.8993    0.9178    0.9085       146
         B-WARRANTY_ACTION     1.0000    0.9318    0.9647        44
B-WARRANTY_INDIRECT_OBJECT     1.0000    0.9474    0.9730        19
        B-WARRANTY_SUBJECT     0.8554    0.9726    0.9103        73
                I-WARRANTY     0.9695    0.9618    0.9656      1885
         I-WARRANTY_ACTION     0.9515    0.9800    0.9655       100
I-WARRANTY_INDIRECT_OBJECT     0.8333    0.8333    0.8333         6
        I-WARRANTY_SUBJECT     1.0000    0.9444    0.9714        36
                         O     0.9758    0.9772    0.9765      3381
                  accuracy        -        -       0.9698      5690
                 macro-avg     0.9428    0.9407    0.9410      5690
              weighted-avg     0.9700    0.9698    0.9698      5690