Understanding Non-compete Items in Non-Compete Clauses (Bert)

Description

Given a clause classified as NON_COMP by the legmulticlf_mnda_sections_paragraph_other classifier, you can use the legclf_nda_non_compete_items_bert model to subclassify its sentences as NON_COMPETE_ITEMS or OTHER. The model is a BERT sequence classifier trained with a state-of-the-art approach.
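
The two-stage flow described above (clause-level filtering first, then sentence-level labeling) can be sketched in plain Python. This is only an illustration of the routing logic: the function `subclassify_non_compete`, the naive regex sentence split, and the two keyword-based toy classifiers are all hypothetical stand-ins, not part of the library. In practice the callables would wrap the two pretrained models.

```python
import re
from typing import Callable, List, Tuple


def subclassify_non_compete(
    clauses: List[str],
    clause_clf: Callable[[str], str],
    sentence_clf: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Route NON_COMP clauses to a sentence-level classifier."""
    labeled: List[Tuple[str, str]] = []
    for clause in clauses:
        # Stage 1: keep only clauses the paragraph classifier marks as NON_COMP
        if clause_clf(clause) != "NON_COMP":
            continue
        # Naive split after '.' or ';' (an assumption; a real pipeline
        # would use a proper sentence detector)
        sentences = [s.strip() for s in re.split(r"(?<=[.;])\s+", clause) if s.strip()]
        # Stage 2: label each sentence as NON_COMPETE_ITEMS or OTHER
        labeled.extend((s, sentence_clf(s)) for s in sentences)
    return labeled


# Toy keyword stand-ins so the sketch runs without Spark (hypothetical)
def toy_clause_clf(text: str) -> str:
    return "NON_COMP" if "compet" in text.lower() else "OTHER"


def toy_sentence_clf(text: str) -> str:
    return "NON_COMPETE_ITEMS" if "competition" in text.lower() else "OTHER"


clauses = [
    "The following activities are restricted during the Term. "
    "Activity that is in direct competition with the Company's business is prohibited.",
    "This Agreement is governed by the laws of the State.",
]

pairs = subclassify_non_compete(clauses, toy_clause_clf, toy_sentence_clf)
for sentence, label in pairs:
    print(label, "|", sentence)
```

Passing the classifiers in as callables keeps the routing independent of how each model is invoked, so the toy functions can be swapped for real model calls without changing the loop.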

Predicted Entities

NON_COMPETE_ITEMS, OTHER

How to use

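import pandas as pd
from johnsnowlabs import nlp, legal

# Assumes a licensed John Snow Labs installation; nlp.start() returns the Spark session
spark = nlp.start()

# Wrap the raw text column in a Spark NLP document annotation
document_assembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")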

tokenizer = nlp.Tokenizer()\
    .setInputCols(["document"])\
    .setOutputCol("token")

sequence_classifier = legal.BertForSequenceClassification.pretrained("legclf_nda_non_compete_items_bert", "en", "legal/models")\
    .setInputCols(["document", "token"])\
    .setOutputCol("class")\
    .setCaseSensitive(True)\
    .setMaxSentenceLength(512)

clf_pipeline = nlp.Pipeline(stages=[
    document_assembler, 
    tokenizer,
    sequence_classifier    
])

empty_df = spark.createDataFrame([['']]).toDF("text")

model = clf_pipeline.fit(empty_df)

text_list = [
"""This Agreement will be binding upon and inure to the benefit of each Party and its respective heirs, successors and assigns""",
"""Activity that is in direct competition with the Company's business, including but not limited to developing, marketing, or selling products or services that are similar to those of the Company."""
]

df = spark.createDataFrame(pd.DataFrame({"text" : text_list}))

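result = model.transform(df)

# One way to display each text next to its first predicted label
# (the source does not show the exact call behind the table below)
result.select("text", result["class"]["result"][0].alias("class")).show(truncate=80)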

Results

+--------------------------------------------------------------------------------+-----------------+
|                                                                            text|            class|
+--------------------------------------------------------------------------------+-----------------+
|This Agreement will be binding upon and inure to the benefit of each Party an...|            OTHER|
|Activity that is in direct competition with the Company's business, including...|NON_COMPETE_ITEMS|
+--------------------------------------------------------------------------------+-----------------+

Model Information

Model Name: legclf_nda_non_compete_items_bert
Compatibility: Legal NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [document, token]
Output Labels: [class]
Language: en
Size: 406.4 MB
Case sensitive: true
Max sentence length: 512

References

In-house annotations on non-disclosure agreements

Benchmarking

label              precision  recall  f1-score  support 
NON_COMPETE_ITEMS  1.00       1.00    1.00      10      
OTHER              1.00       1.00    1.00      64      
accuracy           -          -       1.00      74      
macro avg          1.00       1.00    1.00      74      
weighted avg       1.00       1.00    1.00      74
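
As a quick sanity check on the table above, the aggregate rows can be recomputed from the per-label rows. The sketch below is plain Python with the numbers copied from the table. The helper names `macro_avg` and `weighted_avg` are illustrative, not part of any library. It also prints the share of NON_COMPETE_ITEMS examples, since that label has a support of only 10 and the perfect scores rest on a small test set.

```python
# Per-label rows from the benchmarking table: (precision, recall, f1, support)
report = {
    "NON_COMPETE_ITEMS": (1.00, 1.00, 1.00, 10),
    "OTHER": (1.00, 1.00, 1.00, 64),
}


def macro_avg(rows, idx):
    """Unweighted mean of one metric across labels."""
    return sum(r[idx] for r in rows.values()) / len(rows)


def weighted_avg(rows, idx):
    """Mean of one metric across labels, weighted by each label's support."""
    total = sum(r[3] for r in rows.values())
    return sum(r[idx] * r[3] for r in rows.values()) / total


total_support = sum(r[3] for r in report.values())
macro_f1 = macro_avg(report, 2)
weighted_f1 = weighted_avg(report, 2)
minority_share = report["NON_COMPETE_ITEMS"][3] / total_support

print(f"total support: {total_support}")
print(f"macro f1:      {macro_f1:.2f}")
print(f"weighted f1:   {weighted_f1:.2f}")
print(f"NON_COMPETE_ITEMS share of test set: {minority_share:.1%}")
```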