Whereas Pipeline

Description

IMPORTANT: Don’t run this model on the whole legal agreement. Instead:

  • Split the agreement into paragraphs. You can use notebook 1 in Finance or Legal as inspiration;
  • Use the legclf_cuad_whereas_clause Text Classifier to keep only the paragraphs that contain whereas clauses, and run this pipeline on those.
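
The two preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the notebook's method: `split_paragraphs`, `select_paragraphs` and `starts_with_whereas` are hypothetical helpers, and the keyword check is only a placeholder for the prediction of the legclf_cuad_whereas_clause classifier, which you would plug in as the `is_whereas` callback.

```python
import re
from typing import Callable, List


def split_paragraphs(agreement: str) -> List[str]:
    """Split on blank lines and collapse the whitespace inside each paragraph."""
    pieces = re.split(r"\n\s*\n", agreement)
    return [" ".join(p.split()) for p in pieces if p.strip()]


def select_paragraphs(paragraphs: List[str],
                      is_whereas: Callable[[str], bool]) -> List[str]:
    """Keep only the paragraphs the callback accepts."""
    return [p for p in paragraphs if is_whereas(p)]


def starts_with_whereas(paragraph: str) -> bool:
    # Placeholder only (assumption): in practice this decision would come
    # from the legclf_cuad_whereas_clause Text Classifier, not a keyword test.
    return paragraph.upper().startswith("WHEREAS")


agreement = (
    "This Agreement is made by the parties.\n\n"
    "WHEREAS VerticalNet owns and operates\n"
    "a series of online communities.\n\n\n"
    "NOW, THEREFORE, the parties agree."
)

paragraphs = split_paragraphs(agreement)
candidates = select_paragraphs(paragraphs, starts_with_whereas)
print(candidates)
```

Only the paragraphs in `candidates` would then be passed to the pipeline, one at a time.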

This Pretrained Pipeline extracts the components of whereas clauses (Subject, Action and Object) and the relationships between them, using two approaches:

  • A Semantic Relation Extraction Model;
  • A Dependency Parser Tree;

These entities are hard to extract because they are entirely free text, and OBJECT in particular can be very long and use highly varied vocabulary. The NER and the Relation Extraction (RelationExtractionDLModel) models help you identify them, and the Dependency Parser has been added so you can navigate the tree looking for specific direct objects or other phrases.
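
As a rough illustration of what navigating the tree can look like, the sketch below finds a token by its dependency label and collects the full phrase under it. Everything here is an assumption made for the example: the `Arc` structure, the helper names, and the label names (`nsubj`, `dobj`, and so on), which depend on the tag set the parser was trained with. The toy parse is written by hand and is not output of this pipeline.

```python
from typing import List, NamedTuple


class Arc(NamedTuple):
    token: str
    head: int    # index of the head token, -1 for the root
    label: str   # dependency label (names are assumptions)


def find_by_label(arcs: List[Arc], label: str) -> List[int]:
    """Indices of all tokens attached with the given dependency label."""
    return [i for i, arc in enumerate(arcs) if arc.label == label]


def subtree(arcs: List[Arc], root: int) -> List[int]:
    """Indices of `root` and all of its descendants, in sentence order."""
    keep = {root}
    changed = True
    while changed:
        changed = False
        for i, arc in enumerate(arcs):
            if arc.head in keep and i not in keep:
                keep.add(i)
                changed = True
    return sorted(keep)


def phrase(arcs: List[Arc], root: int) -> str:
    """Surface text of the subtree rooted at `root`."""
    return " ".join(arcs[i].token for i in subtree(arcs, root))


# Hand-written toy parse for illustration only.
toy_parse = [
    Arc("VerticalNet", 1, "nsubj"),
    Arc("operates", -1, "root"),
    Arc("a", 3, "det"),
    Arc("series", 1, "dobj"),
    Arc("of", 6, "case"),
    Arc("online", 6, "amod"),
    Arc("communities", 3, "nmod"),
]

direct_objects = [phrase(toy_parse, i) for i in find_by_label(toy_parse, "dobj")]
print(direct_objects)
```

To apply the same idea to real output, you would first build the list of arcs from the pipeline's token, dependency and dependency type columns.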

Predicted Entities

WHEREAS_SUBJECT, WHEREAS_OBJECT, WHEREAS_ACTION


How to use

from johnsnowlabs import *

whereas_pipeline = nlp.PretrainedPipeline("legpipe_whereas", "en", "legal/models")

text = 'WHEREAS VerticalNet owns and operates a series of online communities.'

# annotate() returns a dict that maps each output column to a list of strings
pipeline_result = whereas_pipeline.annotate(text)

# Return NER chunks
pipeline_result['ner_chunk']

# Return RE
pipeline_result['relations']

# Visualize the Dependencies
# The visualizer needs the full annotations, so use fullAnnotate() here
full_result = whereas_pipeline.fullAnnotate(text)

dependency_vis = viz.DependencyParserVisualizer()

dependency_vis.display(full_result[0], # should be the results of a single example, not the complete dataframe
                       pos_col = 'pos', # specify the pos column
                       dependency_col = 'dependencies', # specify the dependency column
                       dependency_type_col = 'dependency_type' # specify the dependency type column
                       )

Results

# NER
['VerticalNet', 'operates', 'a series of online communities']

# Relations
['has_subject', 'has_subject', 'has_object']

# DEP
# Use Spark NLP Display to see the dependency tree
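
The flat lists above do not show which label belongs to which chunk, or which chunks a relation connects. The sketch below shows one way to recover that from full annotations. It assumes each annotation exposes `result` and a `metadata` dict, with an `entity` key on chunks and `chunk1` / `chunk2` keys on relations; these key names are assumptions, so check them against your own output. The helper names are hypothetical, and the stand-in objects are built by hand and are not real pipeline output.

```python
from types import SimpleNamespace
from typing import List, Optional, Tuple


def chunks_with_labels(chunks) -> List[Tuple[str, Optional[str]]]:
    """Pair each chunk's text with the entity label found in its metadata."""
    return [(c.result, c.metadata.get("entity")) for c in chunks]


def relation_triples(relations) -> List[Tuple[Optional[str], str, Optional[str]]]:
    """Return (chunk1, relation, chunk2) for each relation annotation."""
    return [
        (r.metadata.get("chunk1"), r.result, r.metadata.get("chunk2"))
        for r in relations
    ]


# Hand-built stand-ins mimicking the assumed annotation shape.
fake_chunks = [
    SimpleNamespace(result="VerticalNet", metadata={"entity": "WHEREAS_SUBJECT"}),
]
fake_relations = [
    SimpleNamespace(
        result="has_subject",
        metadata={"chunk1": "operates", "chunk2": "VerticalNet"},
    ),
]

print(chunks_with_labels(fake_chunks))
print(relation_triples(fake_relations))
```

With real output, the same helpers would be called on the 'ner_chunk' and 'relations' entries of a single fullAnnotate() result.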

Model Information

Model Name: legpipe_whereas
Type: pipeline
Compatibility: Legal NLP 1.0.0+
License: Licensed
Edition: Official
Language: en
Size: 918.6 MB

References

In-house annotations on CUAD dataset

Included Models

  • nlp.DocumentAssembler
  • nlp.Tokenizer
  • nlp.PerceptronModel
  • nlp.DependencyParserModel
  • nlp.TypedDependencyParserModel
  • nlp.RoBertaEmbeddings
  • legal.NerModel
  • nlp.NerConverter
  • legal.RelationExtractionDLModel