Understanding Perpetuity in "Return of Confidential Information" Clauses (BERT)

Description

Given a clause classified as RETURN_OF_CONF_INFO by the legmulticlf_mnda_sections_paragraph_other classifier, you can use the legclf_nda_perpetuity_bert model to subclassify its sentences as PERPETUITY or OTHER. The model was trained using a state-of-the-art approach: a BERT-based sequence classifier.

Predicted Entities

PERPETUITY, OTHER


How to use

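import pandas as pd
from johnsnowlabs import nlp, legal

# Start a Spark session with the licensed Legal NLP libraries loaded
spark = nlp.start()

# Wrap the raw text column into Spark NLP documents
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")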

tokenizer = nlp.Tokenizer()\
    .setInputCols(["document"])\
    .setOutputCol("token")

sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_perpetuity_bert", "en", "legal/models")\
    .setInputCols(["document", "token"])\
    .setOutputCol("class")\
    .setCaseSensitive(True)\
    .setMaxSentenceLength(512)

clf_pipeline = nlp.Pipeline(stages=[
    document_assembler, 
    tokenizer,
    sequence_classifier    
])

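# Every stage is pretrained or rule-based, so fitting on an empty DataFrame
# only assembles the pipeline model; nothing is trained here
empty_df = spark.createDataFrame([['']]).toDF("text")

model = clf_pipeline.fit(empty_df)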

text_list = [
"""Notwithstanding the return or destruction of all Evaluation Material, you or your Representatives shall continue to be bound by your obligations of confidentiality and other obligations hereunder.""",
"""There are no intended third-party beneficiaries to this Agreement."""
]

df = spark.createDataFrame(pd.DataFrame({"text" : text_list}))

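result = model.transform(df)

# Show each sentence next to its first predicted label
# (assumes one prediction per document, as in this pipeline)
from pyspark.sql import functions as F
result.select("text", F.col("class.result")[0].alias("class")).show(truncate=80)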

Results

+--------------------------------------------------------------------------------+----------+
|                                                                            text|     class|
+--------------------------------------------------------------------------------+----------+
|Notwithstanding the return or destruction of all Evaluation Material, you or ...|PERPETUITY|
|              There are no intended third-party beneficiaries to this Agreement.|     OTHER|
+--------------------------------------------------------------------------------+----------+

Model Information

Model Name: legclf_nda_perpetuity_bert
Compatibility: Legal NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [document, token]
Output Labels: [class]
Language: en
Size: 406.4 MB
Case sensitive: true
Max sentence length: 512

References

In-house annotations of non-disclosure agreements

Benchmarking

label         precision  recall  f1-score  support 
OTHER         0.98       1.00    0.99      60      
PERPETUITY    1.00       0.89    0.94      9       
accuracy      -          -       0.99      69      
macro avg     0.99       0.94    0.97      69      
weighted avg  0.99       0.99    0.99      69
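
As a sanity check on the report above, the per-label scores can be recomputed from a confusion table. The counts below are not published with the model; they are an assumption inferred from the support and recall columns (all 60 OTHER sentences predicted correctly, 8 of the 9 PERPETUITY sentences predicted correctly, and 1 predicted as OTHER). The helper name per_label_metrics is hypothetical. A minimal sketch in plain Python:

```python
# Confusion counts keyed by (true label, predicted label).
# ASSUMPTION: inferred from the benchmark table, not taken from the source data.
counts = {
    ("OTHER", "OTHER"): 60,
    ("PERPETUITY", "PERPETUITY"): 8,
    ("PERPETUITY", "OTHER"): 1,
}
labels = ["OTHER", "PERPETUITY"]


def per_label_metrics(counts, label):
    """Return (precision, recall, f1, support) for one label."""
    tp = counts.get((label, label), 0)
    fp = sum(n for (true, pred), n in counts.items() if pred == label and true != label)
    fn = sum(n for (true, pred), n in counts.items() if true == label and pred != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1, tp + fn


report = {label: per_label_metrics(counts, label) for label in labels}
total = sum(counts.values())
accuracy = sum(n for (true, pred), n in counts.items() if true == pred) / total

for label, (p, r, f1, support) in report.items():
    print(f"{label:<12}{p:.2f}  {r:.2f}  {f1:.2f}  {support}")
print(f"accuracy    {accuracy:.2f}  ({total} sentences)")
```

Under these assumed counts the rounded values agree with the table. With only 9 PERPETUITY sentences in the test set, a single misclassification moves recall by about 0.11, so the PERPETUITY row is best read as a rough estimate.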