Sentence Embeddings - Bluebert uncased (MedNLI)

Description

This model is trained to generate contextual sentence embeddings of input sentences. It has been fine-tuned on the MedNLI dataset to provide state-of-the-art performance on STS and SentEval benchmarks.


How to use

Use as part of an NLP pipeline with the following stages: DocumentAssembler, SentenceDetector, BertSentenceEmbeddings. The output of this model can be used in downstream tasks such as NER, classification, and entity resolution.

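from sparknlp.annotator import BertSentenceEmbeddings

# Load the pretrained BlueBERT sentence embeddings from the licensed clinical models repository
sbluebert_embeddings = BertSentenceEmbeddings\
.pretrained("sbluebert_base_uncased_mli","en","clinical/models")\
.setInputCols(["ner_chunk_doc"])\
.setOutputCol("sbert_embeddings")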

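import com.johnsnowlabs.nlp.embeddings.BertSentenceEmbeddings

// Load the pretrained BlueBERT sentence embeddings from the licensed clinical models repository
val sbluebert_embeddings = BertSentenceEmbeddings.pretrained("sbluebert_base_uncased_mli","en","clinical/models")
  .setInputCols(Array("ner_chunk_doc"))
  .setOutputCol("sbert_embeddings")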

import nlu
nlu.load("en.embed_sentence.bluebert.mli").predict("""Put your text here.""")

Results

The model produces a 768-dimensional vector representation of each input sentence.
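
Because each sentence maps to a fixed-length vector, two sentences can be compared by the cosine of the angle between their embeddings. The snippet below is a minimal sketch of that comparison in plain Python. The helper `cosine_similarity` and the placeholder vectors `vec_a` and `vec_b` are illustrative assumptions, not part of the model or of Spark NLP; in practice the vectors would be read from the `sbert_embeddings` output column.

```python
import math
from typing import Sequence

EMBEDDING_DIM = 768  # dimensionality stated in this model card


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length vectors.

    Returns 0.0 if either vector has zero norm.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Placeholder vectors standing in for real model output (assumption:
# these are NOT actual embeddings, only inputs of the right length).
vec_a = [1.0] * EMBEDDING_DIM
vec_b = [1.0] * (EMBEDDING_DIM // 2) + [0.0] * (EMBEDDING_DIM // 2)

score = cosine_similarity(vec_a, vec_b)
print(f"cosine similarity: {score:.4f}")
```

Scores closer to 1 indicate vectors pointing in similar directions. Whether a given threshold is meaningful for a particular task would need to be checked against real embeddings.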

Model Information

Model Name:	sbluebert_base_uncased_mli
Type:	BertSentenceEmbeddings
Compatibility:	Spark NLP 2.6.4+
Edition:	Official
License:	Licensed
Input Labels:	[ner_chunk]
Output Labels:	[sentence_embeddings]
Language:	[en]
Case sensitive:	false

Data Source

Fine-tuned on the MedNLI dataset, starting from BlueBERT weights.
