Legal Law Area Prediction Classifier (French)

Description

This is a multiclass classification model that assigns one of four law-area labels (civil_law, penal_law, public_law, social_law) to French-language court cases.

Predicted Entities

civil_law, penal_law, public_law, social_law


How to use

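from johnsnowlabs import nlp, legal

# Start a Spark session with the licensed Legal NLP libraries loaded
spark = nlp.start()

# Convert the raw "text" column into Spark NLP documents
document_assembler = nlp.DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Multilingual sentence-level BERT embeddings, computed over each whole document
embeddings = nlp.BertSentenceEmbeddings.pretrained("sent_bert_multi_cased", "xx") \
    .setInputCols(["document"]) \
    .setOutputCol("sentence_embeddings")

# Pretrained law-area classifier; predictions are written to "category"
docClassifier = legal.ClassifierDLModel.pretrained("legclf_law_area_prediction_french", "fr", "legal/models") \
    .setInputCols(["sentence_embeddings"]) \
    .setOutputCol("category")

nlpPipeline = nlp.Pipeline(stages=[
    document_assembler,
    embeddings,
    docClassifier
])

# One-row DataFrame holding the closing passage of a French-language court ruling
df = spark.createDataFrame([["par ces motifs, le Juge unique prononce : 1. Le recours est irrecevable. 2. Il n'est pas perçu de frais judiciaires. 3. Le présent arrêt est communiqué aux parties, au Tribunal administratif fédéral et à l'Office fédéral des assurances sociales. Lucerne, le 2 juin 2016 Au nom de la IIe Cour de droit social du Tribunal fédéral suisse Le Juge unique : Meyer Le Greffier : Cretton"]]).toDF("text")

# Every stage is pretrained, so fit() only assembles the pipeline model
model = nlpPipeline.fit(df)
result = model.transform(df)

# Show the input text next to the predicted label(s)
result.select("text", "category.result").show(truncate=100)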

Results

+----------------------------------------------------------------------------------------------------+------------+
|                                                                                                text|      result|
+----------------------------------------------------------------------------------------------------+------------+
|par ces motifs, le Juge unique prononce : 1. Le recours est irrecevable. 2. Il n'est pas perçu de...|[social_law]|
+----------------------------------------------------------------------------------------------------+------------+
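The result column holds an array of labels per document, as in the table above. A minimal sketch of how such arrays could be reduced to a single label in plain Python follows. The helper first_law_area and the sample rows are hypothetical, not part of the library; the rows only imitate the shape one might get after collecting the DataFrame and converting each row to a dict.

```python
from typing import Iterable, Optional

# The four labels listed under "Predicted Entities".
LAW_AREAS = {"civil_law", "penal_law", "public_law", "social_law"}


def first_law_area(result: Optional[Iterable[str]]) -> Optional[str]:
    """Return the first recognised label in a `result` array, or None."""
    for label in result or []:
        if label in LAW_AREAS:
            return label
    return None


# Hypothetical rows (assumption): same shape as the output shown above.
rows = [
    {"text": "par ces motifs, le Juge unique prononce : ...", "result": ["social_law"]},
    {"text": "", "result": []},
]

labels = [first_law_area(row["result"]) for row in rows]
print(labels)  # ['social_law', None]
```

Returning None for an empty or unrecognised array keeps missing predictions visible downstream instead of silently mapping them to a default class.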

Model Information

Model Name: legclf_law_area_prediction_french
Compatibility: Legal NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [sentence_embeddings]
Output Labels: [class]
Language: fr
Size: 22.3 MB

References

Train dataset available here

Benchmarking

label         precision  recall  f1-score  support 
civil_law     0.93       0.91    0.92      613     
penal_law     0.94       0.96    0.95      579     
public_law    0.92       0.91    0.92      605     
social_law    0.97       0.98    0.97      478     
accuracy      -          -       0.94      2275    
macro-avg     0.94       0.94    0.94      2275    
weighted-avg  0.94       0.94    0.94      2275
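
As a sanity check on the table above, the macro and weighted averages can be recomputed from the per-class rows. This is a small sketch in plain Python; the variable and function names are illustrative, and the inputs are the rounded values from the table, so the results only match to two decimals.

```python
# (precision, recall, f1, support) copied from the per-class rows above.
per_class = {
    "civil_law":  (0.93, 0.91, 0.92, 613),
    "penal_law":  (0.94, 0.96, 0.95, 579),
    "public_law": (0.92, 0.91, 0.92, 605),
    "social_law": (0.97, 0.98, 0.97, 478),
}

total_support = sum(row[3] for row in per_class.values())


def macro_avg(metric_index: int) -> float:
    """Unweighted mean of one metric across classes."""
    return sum(row[metric_index] for row in per_class.values()) / len(per_class)


def weighted_avg(metric_index: int) -> float:
    """Mean of one metric across classes, weighted by class support."""
    weighted_sum = sum(row[metric_index] * row[3] for row in per_class.values())
    return weighted_sum / total_support


print(total_support)  # 2275
for index, name in enumerate(["precision", "recall", "f1-score"]):
    print(f"{name}: macro={macro_avg(index):.2f} weighted={weighted_avg(index):.2f}")
# precision: macro=0.94 weighted=0.94
# recall: macro=0.94 weighted=0.94
# f1-score: macro=0.94 weighted=0.94
```

The support-weighted recall is the same quantity as overall accuracy, which is why it agrees with the 0.94 in the accuracy row.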