Sentence Embeddings - Bluebert uncased (MedNLI)

Description

This model generates contextual sentence embeddings for input sentences. It has been fine-tuned on the MedNLI dataset to provide state-of-the-art performance on the STS and SentEval benchmarks.

How to use

Use as part of an NLP pipeline with the following stages: DocumentAssembler, SentenceDetector, BertSentenceEmbeddings. The output of this model can be used in downstream tasks such as NER, classification, and entity resolution.

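# Python
from sparknlp.annotator import BertSentenceEmbeddings

# Load the pretrained model; the "ner_chunk_doc" column must come from an upstream stage.
sbiobert_embeddings = BertSentenceEmbeddings\
    .pretrained("sbluebert_base_uncased_mli", "en", "clinical/models")\
    .setInputCols(["ner_chunk_doc"])\
    .setOutputCol("sbert_embeddings")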


// Scala
import com.johnsnowlabs.nlp.embeddings.BertSentenceEmbeddings
val sbiobert_embeddings = BertSentenceEmbeddings
  .pretrained("sbluebert_base_uncased_mli", "en", "clinical/models")
  .setInputCols(Array("ner_chunk_doc"))
  .setOutputCol("sbert_embeddings")

# Python, using the NLU library
import nlu
nlu.load("en.embed_sentence.bluebert.mli").predict("""Put your text here.""")

Results

The model returns a 768-dimensional vector representation of each input sentence.
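
Sentence vectors of this kind are commonly compared with cosine similarity, for example when matching a text chunk against candidate entries in entity resolution. The following is a minimal sketch in plain Python, not part of the model's API. The helper name cosine_similarity and the placeholder vectors vec_a and vec_b are assumptions made for illustration; real vectors would be read from the model's output column.

```python
import math

EMBEDDING_DIM = 768  # vector size stated in the Results section


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical placeholder vectors standing in for two sentence embeddings.
vec_a = [1.0] * EMBEDDING_DIM
vec_b = [1.0] * (EMBEDDING_DIM // 2) + [0.0] * (EMBEDDING_DIM // 2)

score = cosine_similarity(vec_a, vec_b)
print(f"cosine similarity: {score:.4f}")
```

Scores close to 1 indicate vectors pointing in nearly the same direction. Any threshold for treating two sentences as a match is application-specific and would need to be tuned on your own data.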

Model Information

Model Name: sbluebert_base_uncased_mli
Type: BertSentenceEmbeddings
Compatibility: Spark NLP 2.6.4+
Edition: Official
License: Licensed
Input Labels: [ner_chunk]
Output Labels: [sentence_embeddings]
Language: [en]
Case sensitive: false

Data Source

Fine-tuned on the MedNLI dataset, starting from Bluebert weights.