Legal BGE Embeddings

Description

This legal BGE embedding model was trained on a mix of datasets, combining public data with in-house annotated documents.

How to use

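from johnsnowlabs import nlp

# Load the pretrained legal BGE embeddings; reads the "document" column, writes "embeddings"
embeddings = nlp.BGEEmbeddings.pretrained("legal_bge_base_embeddings", "en", "legal/models")\
    .setInputCols("document")\
    .setOutputCol("embeddings")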
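Once the embedding vectors have been collected from the output column, a common next step is to compare two texts by the cosine similarity of their vectors. The sketch below is only an illustration in plain Python: the `cosine_similarity` helper and the three short vectors are hypothetical placeholders, not part of the library and not real output of this model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for clause embeddings (assumption:
# real vectors would come from the "embeddings" output column and be much longer).
clause_a = [0.12, -0.40, 0.33, 0.08]
clause_b = [0.10, -0.35, 0.30, 0.11]
clause_c = [-0.50, 0.20, -0.10, 0.45]

sim_ab = cosine_similarity(clause_a, clause_b)
sim_ac = cosine_similarity(clause_a, clause_c)
print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

A higher score suggests the two texts are closer in the embedding space; any threshold for treating two clauses as similar would need to be chosen on your own data.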

Model Information

Model Name: legal_bge_base_embeddings
Compatibility: Legal NLP 1.0.0+
License: Licensed
Edition: Official
Input Labels: [document]
Output Labels: [sentence_embeddings]
Language: en
Size: 394.4 MB

References

Public data and in-house annotated documents