Legal BGE Embeddings

Description

This BGE embedding model for legal text was trained on a mix of public datasets and in-house annotated documents.

How to use

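from johnsnowlabs import nlp

# Load the pretrained legal BGE embeddings and set the input/output columns
embeddings = nlp.BGEEmbeddings.pretrained("legal_bge_base_embeddings", "en", "legal/models")\
    .setInputCols(["document"])\
    .setOutputCol("embeddings")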
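
Once embeddings have been computed, a common next step is to compare two texts by the cosine similarity of their vectors. The following is a minimal sketch of that comparison in plain Python. The helper name cosine_similarity and the short toy vectors are assumptions made for illustration; they are not output from this model, and real embedding vectors would be much longer.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 4-dimensional vectors standing in for embeddings (made-up values,
# not produced by legal_bge_base_embeddings).
clause_a = [0.12, -0.40, 0.33, 0.08]
clause_b = [0.10, -0.38, 0.30, 0.11]
unrelated = [-0.50, 0.20, -0.05, 0.60]

# Vectors pointing in similar directions score closer to 1.
print(cosine_similarity(clause_a, clause_b))
print(cosine_similarity(clause_a, unrelated))
```

In this toy data the first pair scores higher than the second, which is the behaviour one would hope for from two similar clauses versus an unrelated one.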

Model Information

Model Name:	legal_bge_base_embeddings
Compatibility:	Legal NLP 1.0.0+
License:	Licensed
Edition:	Official
Input Labels:	[document]
Output Labels:	[sentence_embeddings]
Language:	en
Size:	394.4 MB

References

Public data and in-house annotated documents
