Legal Word Embeddings

Description

These word embeddings are based on Word2Vec and were trained on a mix of public data and in-house annotated documents.

How to use

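from johnsnowlabs import nlp

# Load the pretrained legal embeddings; they read the sentence and token columns.
model = nlp.WordEmbeddingsModel.pretrained("legal_word_embeddings", "en", "legal/models") \
    .setInputCols(["sentence", "token"]) \
    .setOutputCol("embeddings")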
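
The embeddings stage reads "sentence" and "token" columns, so it needs upstream stages that produce them. The following is a minimal sketch of one way to wire that up, assuming the johnsnowlabs library is installed and licensed. The helper names build_embedding_pipeline and demo, the "text" and "document" column names, and the sample sentence are illustrative assumptions, not part of the model card.

```python
def build_embedding_pipeline(nlp):
    """Chain document -> sentence -> token -> embeddings stages.

    `nlp` is expected to be the module from `from johnsnowlabs import nlp`.
    The "text" and "document" column names are assumptions for this sketch.
    """
    document_assembler = (
        nlp.DocumentAssembler()
        .setInputCol("text")
        .setOutputCol("document")
    )
    sentence_detector = (
        nlp.SentenceDetector()
        .setInputCols(["document"])
        .setOutputCol("sentence")
    )
    tokenizer = (
        nlp.Tokenizer()
        .setInputCols(["sentence"])
        .setOutputCol("token")
    )
    # Same call as in the snippet above: pretrained legal embeddings.
    embeddings = (
        nlp.WordEmbeddingsModel.pretrained(
            "legal_word_embeddings", "en", "legal/models"
        )
        .setInputCols(["sentence", "token"])
        .setOutputCol("embeddings")
    )
    return nlp.Pipeline(
        stages=[document_assembler, sentence_detector, tokenizer, embeddings]
    )


def demo():
    """Run the pipeline on one made-up sentence (requires a licensed session)."""
    from johnsnowlabs import nlp  # imported lazily so the helper above stays importable

    spark = nlp.start()
    data = spark.createDataFrame(
        [["This Agreement is governed by the laws of the State of New York."]]
    ).toDF("text")
    fitted = build_embedding_pipeline(nlp).fit(data)
    # One row per token, with the length of its embedding vector.
    (
        fitted.transform(data)
        .selectExpr("explode(embeddings) AS emb")
        .selectExpr("emb.result AS token", "size(emb.embeddings) AS dim")
        .show(truncate=False)
    )
```

Calling demo() in a licensed environment should print one row per token. The library import sits inside demo() so that the pipeline-building helper can be reused or inspected without starting a Spark session.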

Model Information

Model Name:	legal_word_embeddings
Type:	embeddings
Compatibility:	Legal NLP 1.0.0+
License:	Licensed
Edition:	Official
Input Labels:	[document, token]
Output Labels:	[word_embeddings]
Language:	en
Size:	84.9 MB
Case sensitive:	false
Dimension:	200

References

Public data and in-house annotated documents
