Applicable Law Clause NER Model

Description

This NER model extracts applicable-law mentions from applicable_law clauses and labels them as APPLIC_LAW. Run it only on clauses that the legclf_applicable_law_cuad classifier has already identified as applicable_law clauses.
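
The filter-then-extract order described above can be sketched in plain Python. This only illustrates the control flow: `is_applicable_law` and `run_ner` are hypothetical stand-ins for the legclf_applicable_law_cuad classifier and this NER model, not real library calls, and the two sample clauses are made up for the demonstration.

```python
from typing import Callable, Iterable, List, Tuple

Entity = Tuple[str, str]  # (chunk text, label)


def extract_applicable_law(
    clauses: Iterable[str],
    is_applicable_law: Callable[[str], bool],
    run_ner: Callable[[str], List[Entity]],
) -> List[Entity]:
    """Run NER only on clauses that the classifier accepts."""
    entities: List[Entity] = []
    for clause in clauses:
        if is_applicable_law(clause):  # gate: skip every other clause type
            entities.extend(run_ner(clause))
    return entities


# Toy stand-ins (assumptions for illustration only, not the real models).
def toy_classifier(clause: str) -> bool:
    return "laws of" in clause.lower()


def toy_ner(clause: str) -> List[Entity]:
    # Placeholder: tags the whole clause instead of a precise span.
    return [(clause, "APPLIC_LAW")]


clauses = [
    "This Agreement is governed by the laws of Ontario.",
    "Payment is due within 30 days.",
]
entities = extract_applicable_law(clauses, toy_classifier, toy_ner)
print(entities)
```

In a real pipeline the two callables would wrap the classifier and NER pipelines, so that the NER model never sees clauses it was not trained on.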

Predicted Entities

APPLIC_LAW

How to use

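from johnsnowlabs import nlp, legal

# Start a Spark session with the licensed Legal NLP libraries loaded
# (assumes the johnsnowlabs package is installed and a valid license is configured).
spark = nlp.start()

documentAssembler = nlp.DocumentAssembler()\
        .setInputCol("text")\
        .setOutputCol("document")
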
sentenceDetector = nlp.SentenceDetectorDLModel.pretrained("sentence_detector_dl","xx")\
        .setInputCols(["document"])\
        .setOutputCol("sentence")

tokenizer = nlp.Tokenizer()\
        .setInputCols(["sentence"])\
        .setOutputCol("token")

embeddings = nlp.RoBertaEmbeddings.pretrained("roberta_embeddings_legal_roberta_base", "en")\
        .setInputCols(["sentence", "token"])\
        .setOutputCol("embeddings")

ner_model = legal.NerModel.pretrained("legner_applicable_law_clause", "en", "legal/models")\
        .setInputCols(["sentence", "token", "embeddings"])\
        .setOutputCol("ner")

ner_converter = nlp.NerConverter()\
        .setInputCols(["sentence","token","ner"])\
        .setOutputCol("ner_chunk")

nlpPipeline = nlp.Pipeline(stages=[
        documentAssembler,
        sentenceDetector,
        tokenizer,
        embeddings,
        ner_model,
        ner_converter])

# Every stage is pretrained, so fitting on an empty DataFrame only assembles the pipeline.
empty_data = spark.createDataFrame([[""]]).toDF("text")

model = nlpPipeline.fit(empty_data)

text = ["""ELECTRAMECCANICA VEHICLES CORP., an entity incorporated under the laws of the Province of British Columbia, Canada, with an address of Suite 102 East 1st Avenue, Vancouver, British Columbia, Canada, V5T 1A4 ("EMV")""" ]

result = model.transform(spark.createDataFrame([text]).toDF("text"))

Results

+----------------------------------------+----------+----------+
|chunk                                   |ner_label |confidence|
+----------------------------------------+----------+----------+
|laws of the Province of British Columbia|APPLIC_LAW|0.95625716|
+----------------------------------------+----------+----------+
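
A table like the one above can be built from the annotations in the `ner_chunk` column. The sketch below is a minimal, Spark-free illustration: plain dictionaries stand in for the annotation rows that something like `result.select("ner_chunk").collect()` would return, and the metadata keys `entity` and `confidence` are assumptions about what the converter emits. The helper name `chunks_to_rows` and the `min_confidence` parameter are hypothetical.

```python
from typing import Any, Iterable, List, Mapping, Tuple

Row = Tuple[str, str, float]  # (chunk, ner_label, confidence)


def chunks_to_rows(
    annotations: Iterable[Mapping[str, Any]],
    min_confidence: float = 0.0,
) -> List[Row]:
    """Flatten chunk annotations into rows, dropping low-confidence chunks."""
    rows: List[Row] = []
    for ann in annotations:
        meta = ann["metadata"]
        # Metadata values are assumed to be strings, so convert before comparing.
        confidence = float(meta["confidence"])
        if confidence >= min_confidence:
            rows.append((ann["result"], meta["entity"], confidence))
    return rows


# Stand-in for one collected annotation, using the values from the Results table.
sample = [
    {
        "result": "laws of the Province of British Columbia",
        "metadata": {"entity": "APPLIC_LAW", "confidence": "0.95625716"},
    }
]

rows = chunks_to_rows(sample, min_confidence=0.5)
for chunk, label, confidence in rows:
    print(f"{chunk} | {label} | {confidence}")
```

Raising `min_confidence` is a simple way to trade recall for precision when downstream steps need only high-certainty spans.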

Model Information

Model Name: legner_applicable_law_clause
Compatibility: Legal NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [sentence, token, embeddings]
Output Labels: [ner]
Language: en
Size: 1.1 MB

References

In-house dataset

Benchmarking

       label  precision    recall  f1-score   support
B-APPLIC_LAW       0.90      0.89      0.90        84
I-APPLIC_LAW       0.98      0.93      0.96       425
   micro-avg       0.97      0.93      0.95       509
   macro-avg       0.94      0.91      0.93       509
weighted-avg       0.97      0.93      0.95       509
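
To make the average rows easier to interpret, the sketch below recomputes the macro and weighted averages from the two per-label rows, assuming the usual definitions (macro is the unweighted mean over labels, weighted is the support-weighted mean). Because the per-label inputs are already rounded to two decimals, the recomputed values can differ from the table in the last digit.

```python
from typing import Dict, Tuple

Metrics = Tuple[float, float, float, int]  # (precision, recall, f1, support)

# Per-label rows copied from the benchmarking table above.
per_label: Dict[str, Metrics] = {
    "B-APPLIC_LAW": (0.90, 0.89, 0.90, 84),
    "I-APPLIC_LAW": (0.98, 0.93, 0.96, 425),
}


def macro_average(metrics: Dict[str, Metrics]) -> Tuple[float, float, float]:
    """Unweighted mean of precision, recall and F1 across labels."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics.values()) / n for i in range(3))


def weighted_average(metrics: Dict[str, Metrics]) -> Tuple[float, float, float]:
    """Mean of precision, recall and F1 weighted by each label's support."""
    total = sum(m[3] for m in metrics.values())
    return tuple(sum(m[i] * m[3] for m in metrics.values()) / total for i in range(3))


total_support = sum(m[3] for m in per_label.values())
macro = macro_average(per_label)
weighted = weighted_average(per_label)

print("support:", total_support)
print("macro-avg:", [round(v, 2) for v in macro])
print("weighted-avg:", [round(v, 2) for v in weighted])
```

The weighted average sits close to the I-APPLIC_LAW row because that label accounts for 425 of the 509 tagged tokens.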