Description
This model detects license grants and permissions in agreements, given by a Subject (PERMISSION_SUBJECT) to a Recipient (PERMISSION_INDIRECT_OBJECT). The permission itself is tagged as PERMISSION.
A lighter, non-transformer version of this model is available as legner_grants_md.
Predicted Entities
PERMISSION, PERMISSION_SUBJECT, PERMISSION_OBJECT, PERMISSION_INDIRECT_OBJECT
How to use
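from johnsnowlabs import nlp, legal

# Assumes a licensed johnsnowlabs installation; nlp.start() returns the Spark session
spark = nlp.start()

# Wrap the raw text column into Spark NLP document annotations
documentAssembler = nlp.DocumentAssembler()\
  .setInputCol("text")\
  .setOutputCol("document")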
tokenizer = nlp.Tokenizer()\
  .setInputCols("document")\
  .setOutputCol("token")
tokenClassifier = legal.BertForTokenClassification.pretrained("legner_bert_grants", "en", "legal/models")\
  .setInputCols("token", "document")\
  .setOutputCol("label")\
  .setCaseSensitive(True)
pipeline = nlp.Pipeline(stages=[
  documentAssembler,
  tokenizer,
  tokenClassifier
])
import pandas as pd

# No stage needs training data, so fitting on an empty DataFrame only builds the PipelineModel
p_model = pipeline.fit(spark.createDataFrame(pd.DataFrame({'text': ['']})))
text = """Fox grants to Licensee a limited, exclusive (except as otherwise may be provided in this Agreement), 
non-transferable (except as permitted in Paragraph 17(d)) right and license"""
res = p_model.transform(spark.createDataFrame([[text]]).toDF("text"))
from pyspark.sql import functions as F

# Pair each token with its predicted label and show one row per token
res.select(F.explode(F.arrays_zip('token.result', 'label.result')).alias("cols")) \
   .select(F.expr("cols['0']").alias("token"),
           F.expr("cols['1']").alias("ner_label")) \
   .show(20, truncate=100)
Results
+----------------+----------------------------+
|           token|                   ner_label|
+----------------+----------------------------+
|             Fox|        B-PERMISSION_SUBJECT|
|          grants|                           O|
|              to|                           O|
|        Licensee|B-PERMISSION_INDIRECT_OBJECT|
|               a|                           O|
|         limited|                B-PERMISSION|
|               ,|                I-PERMISSION|
|       exclusive|                I-PERMISSION|
|               (|                I-PERMISSION|
|          except|                I-PERMISSION|
|              as|                I-PERMISSION|
|       otherwise|                I-PERMISSION|
|             may|                I-PERMISSION|
|              be|                I-PERMISSION|
|        provided|                I-PERMISSION|
|              in|                I-PERMISSION|
|            this|                I-PERMISSION|
|       Agreement|                I-PERMISSION|
|              ),|                I-PERMISSION|
|non-transferable|                I-PERMISSION|
+----------------+----------------------------+
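The labels above follow the BIO scheme: a B- tag opens an entity and the I- tags that follow continue it. Downstream code usually needs whole spans instead of per-token tags. The following is a minimal plain-Python sketch of that grouping step, assuming the tokens and labels have already been collected into two lists. The function name `group_bio` is illustrative and not part of the library; inside a Spark NLP pipeline this job is normally handled by a converter stage such as NerConverter.

```python
from typing import List, Tuple


def group_bio(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level BIO labels into (entity_type, text) chunks."""
    chunks: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the entity being built, if there is one.
        if current_tokens:
            chunks.append((current_type, " ".join(current_tokens)))

    for token, label in zip(tokens, labels):
        prefix, _, entity = label.partition("-")
        if prefix == "B" or (prefix == "I" and entity != current_type):
            # A B- tag, or an I- tag of a different type, starts a new entity.
            flush()
            current_type, current_tokens = entity, [token]
        elif prefix == "I":
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O": the token is outside any entity.
            flush()
            current_type, current_tokens = None, []
    flush()
    return chunks


# First tokens of the result table above.
tokens = ["Fox", "grants", "to", "Licensee", "a", "limited", ",", "exclusive"]
labels = [
    "B-PERMISSION_SUBJECT", "O", "O", "B-PERMISSION_INDIRECT_OBJECT",
    "O", "B-PERMISSION", "I-PERMISSION", "I-PERMISSION",
]
chunks = group_bio(tokens, labels)
for entity_type, text in chunks:
    print(f"{entity_type}: {text}")
```

Tokens are joined with single spaces, so punctuation spacing in the chunk text will differ from the original sentence; use character offsets if exact spans are needed.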
Model Information
| Model Name: | legner_bert_grants | 
| Type: | legal | 
| Compatibility: | Legal NLP 1.0.0+ | 
| License: | Licensed | 
| Edition: | Official | 
| Input Labels: | [sentence, token] | 
| Output Labels: | [ner] | 
| Language: | en | 
| Size: | 412.2 MB | 
| Case sensitive: | true | 
| Max sentence length: | 128 | 
References
Manual annotations on the CUAD dataset
Benchmarking
                       label  precision    recall  f1-score   support
                B-PERMISSION       0.88      0.79      0.83        38
B-PERMISSION_INDIRECT_OBJECT       0.85      0.94      0.89        36
        B-PERMISSION_SUBJECT       0.89      0.85      0.87        40
                I-PERMISSION       0.80      0.69      0.74       342
                           O       0.94      0.97      0.95      1827
                    accuracy         -         -       0.92      2292
                   macro-avg       0.85      0.81      0.86      2292
                weighted-avg       0.91      0.92      0.91      2292
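The f1-score column is the harmonic mean of precision and recall. As a sanity check, the sketch below recomputes it from the per-label precision and recall values listed in the table. The names `REPORT` and `f1_score` are illustrative. Because the inputs are already rounded to two decimals, a recomputed value could in general differ from a published one in the last digit.

```python
# Per-label (precision, recall) pairs copied from the benchmark table.
REPORT = {
    "B-PERMISSION": (0.88, 0.79),
    "B-PERMISSION_INDIRECT_OBJECT": (0.85, 0.94),
    "B-PERMISSION_SUBJECT": (0.89, 0.85),
    "I-PERMISSION": (0.80, 0.69),
    "O": (0.94, 0.97),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


recomputed = {
    label: round(f1_score(p, r), 2) for label, (p, r) in REPORT.items()
}
for label, f1 in recomputed.items():
    print(f"{label:>28}  {f1:.2f}")
```

For the five labels listed, the recomputed values agree with the f1-score column.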