Description
This multi-label text classification model assigns up to 5 classes to legal texts in Italian related to COVID-19 exception measures, to facilitate their analysis, discovery, and comparison. The classes are as follows:
- Closures/lockdown
- Government_oversight
- Restrictions_of_daily_liberties
- Restrictions_of_fundamental_rights_and_civil_liberties
- State_of_Emergency
Predicted Entities
Closures/lockdown, Government_oversight, Restrictions_of_daily_liberties, Restrictions_of_fundamental_rights_and_civil_liberties, State_of_Emergency
How to use
# Assumption: the johnsnowlabs package is installed with a valid Legal NLP license,
# and nlp.start() returns the licensed Spark session.
from johnsnowlabs import nlp

spark = nlp.start()

# Wrap the raw text column into Spark NLP documents.
document_assembler = nlp.DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Split each document into tokens.
tokenizer = nlp.Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")

# Token-level embeddings from an Italian BERT model.
embeddings = nlp.BertEmbeddings.pretrained("bert_embeddings_bert_base_italian_xxl_cased", "it") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("embeddings")

# Average the token embeddings into one vector per document.
embeddingsSentence = nlp.SentenceEmbeddings() \
    .setInputCols(["document", "embeddings"]) \
    .setOutputCol("sentence_embeddings") \
    .setPoolingStrategy("AVERAGE")

# Pretrained multi-label classifier for COVID-19 exception measures.
multilabelClfModel = nlp.MultiClassifierDLModel.pretrained("legmulticlf_covid19_exceptions_italian", "it", "legal/models") \
    .setInputCols(["sentence_embeddings"]) \
    .setOutputCol("class")

clf_pipeline = nlp.Pipeline(
    stages=[
        document_assembler,
        tokenizer,
        embeddings,
        embeddingsSentence,
        multilabelClfModel,
    ])

df = spark.createDataFrame([["Al di fuori di tale ultima ipotesi, secondo le raccomandazioni impartite dal Ministero della salute, occorre provvedere ad assicurare la corretta applicazione di misure preventive quali lavare frequentemente le mani con acqua e detergenti comuni."]]).toDF("text")

# No stage learns from this data; fit() only builds the PipelineModel.
model = clf_pipeline.fit(df)
result = model.transform(df)
result.select("text", "class.result").show(truncate=False)
Results
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+
|text |result |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+
|Al di fuori di tale ultima ipotesi, secondo le raccomandazioni impartite dal Ministero della salute, occorre provvedere ad assicurare la corretta applicazione di misure preventive quali lavare frequentemente le mani con acqua e detergenti comuni.|[Restrictions_of_daily_liberties]|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+
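Each row of the result column is a list of label strings, and a text can receive several labels or none. For downstream analysis it is often handy to turn that list into a fixed-length 0/1 vector. The following is a minimal sketch in plain Python; the helper name `to_indicator` and the label order are assumptions made for illustration and are not part of the model or of Spark NLP.

```python
# Label order is an assumption for this sketch; the model itself
# only returns label strings.
LABELS = [
    "Closures/lockdown",
    "Government_oversight",
    "Restrictions_of_daily_liberties",
    "Restrictions_of_fundamental_rights_and_civil_liberties",
    "State_of_Emergency",
]


def to_indicator(predicted):
    """Return a 0/1 list aligned with LABELS for one row of predicted labels."""
    unknown = set(predicted) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    return [1 if label in predicted else 0 for label in LABELS]


# The prediction shown in the results table above.
example = to_indicator(["Restrictions_of_daily_liberties"])
print(example)  # [0, 0, 1, 0, 0]
```

An empty prediction list maps to an all-zero vector, which keeps rows with no predicted class in the same shape as the rest.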
Model Information
Model Name: legmulticlf_covid19_exceptions_italian
Compatibility: Legal NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [sentence_embeddings]
Output Labels: [class]
Language: it
Size: 13.9 MB
References
Train dataset available here
Benchmarking
label                                                   precision  recall  f1-score  support
Closures/lockdown                                            0.88    0.94      0.91       47
Government_oversight                                         1.00    0.50      0.67        4
Restrictions_of_daily_liberties                              0.88    0.79      0.83       28
Restrictions_of_fundamental_rights_and_civil_liberties       0.62    0.62      0.62       16
State_of_Emergency                                           0.67    1.00      0.80        6
micro-avg                                                    0.82    0.83      0.83      101
macro-avg                                                    0.81    0.77      0.77      101
weighted-avg                                                 0.83    0.83      0.83      101
samples-avg                                                  0.81    0.84      0.81      101
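The macro-avg and weighted-avg rows can be recomputed from the per-label rows, which shows how the two differ: macro averaging treats every label equally, while weighted averaging scales each label by its support. The sketch below assumes the usual definitions of these averages and uses the rounded per-label figures from the table, so it only reproduces the reported values to two decimals. The micro-avg and samples-avg rows need the raw predictions and cannot be rebuilt this way.

```python
# (precision, recall, f1-score, support) per label, copied from the table above.
per_label = {
    "Closures/lockdown": (0.88, 0.94, 0.91, 47),
    "Government_oversight": (1.00, 0.50, 0.67, 4),
    "Restrictions_of_daily_liberties": (0.88, 0.79, 0.83, 28),
    "Restrictions_of_fundamental_rights_and_civil_liberties": (0.62, 0.62, 0.62, 16),
    "State_of_Emergency": (0.67, 1.00, 0.80, 6),
}


def macro_avg(rows, idx):
    """Unweighted mean of one metric across labels."""
    return sum(row[idx] for row in rows) / len(rows)


def weighted_avg(rows, idx):
    """Mean of one metric across labels, weighted by each label's support."""
    total_support = sum(row[3] for row in rows)
    return sum(row[idx] * row[3] for row in rows) / total_support


rows = list(per_label.values())
macro = [round(macro_avg(rows, i), 2) for i in range(3)]
weighted = [round(weighted_avg(rows, i), 2) for i in range(3)]

print(macro)     # [0.81, 0.77, 0.77]
print(weighted)  # [0.83, 0.83, 0.83]
```

The gap between the two averages on recall comes mostly from Government_oversight, whose recall of 0.50 rests on only 4 test examples.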