Legal Word Embeddings

Description

These word embeddings are based on Word2Vec and were trained on a mix of datasets that combines public data with in-house annotated documents.

How to use

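from johnsnowlabs import nlp

# Load the pretrained legal embeddings; the input columns must match the upstream annotators' output columns
model = nlp.WordEmbeddingsModel.pretrained("legal_word_embeddings", "en", "legal/models")\
    .setInputCols(["sentence", "token"])\
    .setOutputCol("embeddings")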
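
The snippet above only configures the embeddings stage, which expects "sentence" and "token" columns to exist already. A minimal sketch of one way to produce them is shown below. It assumes the johnsnowlabs library is installed with a valid Legal NLP license, that the raw text lives in a column named "text", and that a sentence detector sits between the document assembler and the tokenizer; the function name build_embedding_pipeline and the column name "text" are illustrative choices, not part of this model card.

```python
MODEL_NAME = "legal_word_embeddings"
LANGUAGE = "en"
REMOTE_LOC = "legal/models"


def build_embedding_pipeline():
    """Assemble document -> sentence -> token -> embeddings stages (a sketch)."""
    # Imported lazily so that defining this function does not require the library.
    from johnsnowlabs import nlp

    # Wrap the raw "text" column (assumed name) into document annotations.
    document_assembler = nlp.DocumentAssembler() \
        .setInputCol("text") \
        .setOutputCol("document")

    # Split documents into sentences, the first input of the embeddings stage.
    sentence_detector = nlp.SentenceDetector() \
        .setInputCols(["document"]) \
        .setOutputCol("sentence")

    # Split sentences into tokens, the second input of the embeddings stage.
    tokenizer = nlp.Tokenizer() \
        .setInputCols(["sentence"]) \
        .setOutputCol("token")

    # Same configuration as in the snippet above.
    embeddings = nlp.WordEmbeddingsModel.pretrained(MODEL_NAME, LANGUAGE, REMOTE_LOC) \
        .setInputCols(["sentence", "token"]) \
        .setOutputCol("embeddings")

    return nlp.Pipeline(stages=[document_assembler, sentence_detector, tokenizer, embeddings])
```

In a licensed Spark session, the returned pipeline would typically be fitted on a DataFrame that has a "text" column and then used to transform it; the resulting "embeddings" column holds one vector per token.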

Model Information

Model Name: legal_word_embeddings
Type: embeddings
Compatibility: Legal NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [document, token]
Output Labels: [word_embeddings]
Language: en
Size: 84.9 MB
Case sensitive: false
Dimension: 200
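
To illustrate what "Case sensitive: false" and "Dimension: 200" mean in practice, here is a toy sketch in plain Python. It does not use the real model: the table, its single entry, and its values are made up, and returning a zero vector for unknown tokens is a placeholder assumption, since this card does not state how out-of-vocabulary tokens are handled.

```python
from typing import Dict, List

DIMENSION = 200  # matches "Dimension: 200" above


def lookup(table: Dict[str, List[float]], token: str) -> List[float]:
    """Case-insensitive lookup: fold the token to lower case before indexing."""
    key = token.lower()
    # Placeholder assumption: unknown tokens map to a zero vector.
    return table.get(key, [0.0] * DIMENSION)


# Hypothetical one-entry table with made-up values, for illustration only.
toy_table = {"indemnify": [0.1] * DIMENSION}

vec_upper = lookup(toy_table, "INDEMNIFY")
vec_lower = lookup(toy_table, "indemnify")
vec_unknown = lookup(toy_table, "estoppel")
```

Because the lookup ignores case, "INDEMNIFY" and "indemnify" resolve to the same 200-dimensional vector in this sketch.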

References

Public data and in-house annotated documents