IMPORTANT: Don’t run this model on the whole legal agreement. Instead:
- Split by paragraphs. You can use notebook 1 in Finance or Legal as inspiration;
- Use the legclf_cuad_obligations_clause Text Classifier to select only the paragraphs that contain obligations.
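The two steps above can be sketched in plain Python. This is only an illustration under stated assumptions: `split_paragraphs` and `select_obligation_paragraphs` are hypothetical helpers, paragraphs are assumed to be separated by blank lines, and `is_obligation` stands in for whatever call wraps the obligations Text Classifier. The keyword rule below is a placeholder so the sketch runs on its own, not the classifier's logic.

```python
import re
from typing import Callable, List


def split_paragraphs(text: str) -> List[str]:
    """Split on one or more blank lines and drop empty pieces."""
    parts = re.split(r"\n\s*\n", text)
    return [p.strip() for p in parts if p.strip()]


def select_obligation_paragraphs(
    paragraphs: List[str], is_obligation: Callable[[str], bool]
) -> List[str]:
    """Keep only the paragraphs that the predicate flags as obligations."""
    return [p for p in paragraphs if is_obligation(p)]


def keyword_stub(paragraph: str) -> bool:
    # Placeholder for the real classifier call; NOT how the model decides.
    return bool(re.search(r"\b(shall|agrees to|must)\b", paragraph, re.IGNORECASE))


agreement = (
    "This Agreement is made on 1 January 2022.\n\n"
    "The Supplier agrees to provide the Buyer with all the necessary documents.\n\n"
    "Governing law: State of New York."
)
obligation_paragraphs = select_obligation_paragraphs(
    split_paragraphs(agreement), keyword_stub
)
print(obligation_paragraphs)
```

Only the paragraphs that survive this filter would then be passed to the pipeline.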
This is a Pretrained Pipeline for processing agreements, specifically the sentences that express the obligations of the parties (what they agreed upon in the contract).
This pipeline returns:
- NER entities for the subject, the action/verb, the object and the indirect object of the clause;
- Syntactic dependencies of the chunks, so that you can disambiguate in case different clauses/agreements are present in the same sentence.
This pipeline does not include a Sentence Detector; it processes everything at document level. To work at sentence level, split the text into sentences first and call the pipeline on each sentence.
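Because sentence splitting has to happen before the call, a small wrapper can do it. This is a minimal sketch, assuming a naive punctuation-based splitter (a stand-in, not Spark NLP's Sentence Detector; it will mis-split abbreviations such as "Inc.") and a hypothetical `annotate_sentences` helper that accepts any callable shaped like the pipeline's `annotate` method.

```python
import re
from typing import Any, Callable, Dict, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    pieces = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in pieces if p]


def annotate_sentences(
    text: str, annotate: Callable[[str], Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Call `annotate` once per sentence and collect the results in order."""
    return [annotate(sentence) for sentence in split_sentences(text)]


def fake_annotate(sentence: str) -> Dict[str, Any]:
    # Stand-in so the sketch runs without Spark; pass the pipeline's
    # annotate method here in real use.
    return {"document": [sentence]}


paragraph = (
    "The Supplier agrees to provide the Buyer with all the necessary documents. "
    "The Buyer shall pay within 30 days."
)
results = annotate_sentences(paragraph, fake_annotate)
print(len(results))
```

Keeping the annotator as a parameter means the same helper works with a stub in tests and with the loaded pipeline in production.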
How to use
    from johnsnowlabs import *

    pipeline = nlp.PretrainedPipeline("legpipe_obligations", "en", "legal/models")

    text = 'The Supplier agrees to provide the Buyer with all the necessary documents to fulfill the agreement'

    # annotate() returns the results as plain strings, keyed by output column
    pipeline_result = pipeline.annotate(text)

    # Return NER chunks
    pipeline_result['ner_chunk']

    # Visualize the dependencies.
    # The visualizer reads Annotation objects, so pass the fullAnnotate()
    # result of a single example, not the complete dataframe.
    full_result = pipeline.fullAnnotate(text)[0]
    dependency_vis = viz.DependencyParserVisualizer()
    dependency_vis.display(
        full_result,
        pos_col='pos',                          # POS column
        dependency_col='dependencies',          # dependency column
        dependency_type_col='dependency_type',  # dependency type column
    )

Results

    # NER
    ['Supplier', 'agrees to provide', 'Buyer', 'with all the necessary documents to fulfill the agreement']

    # DEP
    # Use Spark NLP Display to see the dependency tree
Compatibility: Legal NLP 1.0.0+
In-house annotations on the CUAD dataset.