Legal Relation Extraction (Obligations, md, Unidirectional)

Description

IMPORTANT: Don’t run this model on the whole legal agreement. Instead:

  • Split the agreement into paragraphs. You can use notebook 1 in Finance or Legal as inspiration;
  • Use the legclf_cuad_obligations_clause Text Classifier to keep only the paragraphs that contain obligations.
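
The two preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the notebook's code: `split_paragraphs`, `select_obligation_paragraphs` and `keyword_stub` are hypothetical names, the sample agreement is made up around the example sentence used on this page, and the keyword check is only a placeholder for the predictions of legclf_cuad_obligations_clause.

```python
import re
from typing import Callable, List


def split_paragraphs(agreement: str) -> List[str]:
    """Split raw text into paragraphs on one or more blank lines."""
    parts = re.split(r"\n\s*\n", agreement)
    return [p.strip() for p in parts if p.strip()]


def select_obligation_paragraphs(
    paragraphs: List[str], is_obligation: Callable[[str], bool]
) -> List[str]:
    """Keep only the paragraphs that the given predicate flags as obligations."""
    return [p for p in paragraphs if is_obligation(p)]


def keyword_stub(paragraph: str) -> bool:
    """Placeholder only (assumption): in practice, replace this with the
    prediction of the legclf_cuad_obligations_clause classifier."""
    lowered = paragraph.lower()
    return any(keyword in lowered for keyword in ("agrees to", "shall", "must"))


# Made-up sample text, for illustration only.
agreement = (
    "This Agreement is dated as of January 1.\n\n"
    "Licensee agrees to reasonably cooperate with Licensor in achieving "
    "registration of the Licensed Mark.\n\n\n"
    "Governing law: New York."
)

paragraphs = split_paragraphs(agreement)
obligation_paragraphs = select_obligation_paragraphs(paragraphs, keyword_stub)
```

Only the paragraphs in `obligation_paragraphs` would then be sent to the relation extraction pipeline.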

We call an “obligation” any sentence in the text stating that a Party (OBLIGATION_SUBJECT) must do (OBLIGATION_ACTION) something (OBLIGATION_OBJECT) to another Party (OBLIGATION_INDIRECT_OBJECT). This model extracts the relations that connect those parts of the sentence (subject with action, action with object, etc.).

This model requires legner_obligations as an NER in the pipeline. It’s an md model with Unidirectional Relations, meaning that the model returns the left side of the relation (source) in chunk1 and the right side (target) in chunk2.

This is a Deep Learning model, meaning only semantics are taken into account, not grammatical structures. If you want to parse the relations using a grammatical dependency tree, please feel free to use this other model.

Predicted Entities

is_obliged_to, is_obliged_with, is_obliged_object

How to use

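from johnsnowlabs import nlp, legal

# Start a Spark session with the licensed libraries loaded
spark = nlp.start()

document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")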

tokenizer = nlp.Tokenizer()\
    .setInputCols("document")\
    .setOutputCol("token")

ner_model = legal.BertForTokenClassification.pretrained("legner_obligations", "en", "legal/models")\
    .setInputCols("token", "document")\
    .setOutputCol("ner")\
    .setCaseSensitive(True)

ner_converter = nlp.NerConverter()\
    .setInputCols(["document","token","ner"])\
    .setOutputCol("ner_chunk")

re_model = legal.RelationExtractionDLModel.pretrained("legre_obligations_md", "en", "legal/models")\
    .setPredictionThreshold(0.5)\
    .setInputCols(["ner_chunk", "document"])\
    .setOutputCol("relations")

pipeline = nlp.Pipeline(stages=[
        document_assembler, 
        tokenizer,
        ner_model,
        ner_converter,
        re_model
])

empty_df = spark.createDataFrame([[""]]).toDF("text")

model = pipeline.fit(empty_df)

text="""Licensee agrees to reasonably cooperate with Licensor in achieving registration of the Licensed Mark."""

data = spark.createDataFrame([[text]]).toDF("text")

result = model.transform(data)


Results

| relation          | entity1                    | entity1_begin | entity1_end | chunk1                         | entity2                    | entity2_begin | entity2_end | chunk2                                         | confidence |
|-------------------|----------------------------|---------------|-------------|--------------------------------|----------------------------|---------------|-------------|------------------------------------------------|------------|
| is_obliged_to     | OBLIGATION_ACTION          | 9             | 38          | agrees to reasonably cooperate | OBLIGATION_SUBJECT         | 0             | 7           | Licensee                                       | 0.91654503 |
| is_obliged_with   | OBLIGATION_SUBJECT         | 0             | 7           | Licensee                       | OBLIGATION_INDIRECT_OBJECT | 45            | 52          | Licensor                                       | 0.803172   |
| is_obliged_to     | OBLIGATION_SUBJECT         | 0             | 7           | Licensee                       | OBLIGATION                 | 54            | 99          | in achieving registration of the Licensed Mark | 0.7439706  |
| is_obliged_object | OBLIGATION_ACTION          | 9             | 38          | agrees to reasonably cooperate | OBLIGATION_INDIRECT_OBJECT | 45            | 52          | Licensor                                       | 0.96132916 |
| is_obliged_object | OBLIGATION_ACTION          | 9             | 38          | agrees to reasonably cooperate | OBLIGATION                 | 54            | 99          | in achieving registration of the Licensed Mark | 0.9174475  |
| is_obliged_to     | OBLIGATION_INDIRECT_OBJECT | 45            | 52          | Licensor                       | OBLIGATION                 | 54            | 99          | in achieving registration of the Licensed Mark | 0.9091029  |
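
Because the relations are unidirectional, each row can be read as a directed edge from chunk1 (source) to chunk2 (target). The sketch below shows one way to filter such rows by confidence and print them as edges. It assumes the rows have already been collected into plain dictionaries; that shape, the helper name `directed_edges` and the 0.9 cutoff are illustrative choices, not part of the library. The values are copied from the table above.

```python
from typing import Dict, List, Tuple, Union

Row = Dict[str, Union[str, float]]

# Rows copied from the Results table (only the columns needed here).
rows: List[Row] = [
    {"relation": "is_obliged_to", "chunk1": "agrees to reasonably cooperate",
     "chunk2": "Licensee", "confidence": 0.91654503},
    {"relation": "is_obliged_with", "chunk1": "Licensee",
     "chunk2": "Licensor", "confidence": 0.803172},
    {"relation": "is_obliged_to", "chunk1": "Licensee",
     "chunk2": "in achieving registration of the Licensed Mark", "confidence": 0.7439706},
    {"relation": "is_obliged_object", "chunk1": "agrees to reasonably cooperate",
     "chunk2": "Licensor", "confidence": 0.96132916},
    {"relation": "is_obliged_object", "chunk1": "agrees to reasonably cooperate",
     "chunk2": "in achieving registration of the Licensed Mark", "confidence": 0.9174475},
    {"relation": "is_obliged_to", "chunk1": "Licensor",
     "chunk2": "in achieving registration of the Licensed Mark", "confidence": 0.9091029},
]


def directed_edges(rows: List[Row], min_confidence: float = 0.5) -> List[Tuple[str, str, str]]:
    """Return (source, relation, target) triples at or above the confidence cutoff."""
    return [
        (str(r["chunk1"]), str(r["relation"]), str(r["chunk2"]))
        for r in rows
        if float(r["confidence"]) >= min_confidence
    ]


# 0.9 is an arbitrary cutoff chosen for illustration.
edges = directed_edges(rows, min_confidence=0.9)
for source, relation, target in edges:
    print(f"{source} --{relation}--> {target}")
```

A stricter cutoff trades recall for precision, so the right value depends on how the extracted obligations are used downstream.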

Model Information

Model Name: legre_obligations_md
Compatibility: Legal NLP 1.0.0+
License: Licensed
Edition: Official
Language: en
Size: 402.3 MB

References

Manual annotations on CUAD dataset

Benchmarking

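| label             | Recall | Precision | F1    | Support |
|-------------------|--------|-----------|-------|---------|
| is_obliged_object | 0.989  | 0.994     | 0.992 | 177     |
| is_obliged_to     | 0.995  | 1.000     | 0.998 | 202     |
| is_obliged_with   | 1.000  | 0.961     | 0.980 | 49      |
| Avg.              | 0.996  | 0.989     | 0.992 | -       |
| Weighted-Avg.     | 0.996  | 0.996     | 0.996 | -       |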
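
For readers who want to sanity-check the per-label figures, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded values in the table. `f1_score` is a helper written for this example, and small differences in the third decimal are expected because the inputs are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the benchmarking table.
reported = {
    "is_obliged_object": (0.994, 0.989, 0.992),
    "is_obliged_to": (1.000, 0.995, 0.998),
    "is_obliged_with": (0.961, 1.000, 0.980),
}

for label, (precision, recall, f1) in reported.items():
    recomputed = f1_score(precision, recall)
    print(f"{label}: reported {f1:.3f}, recomputed {recomputed:.4f}")
```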