Sentence Embeddings - Biobert cased (MedNLI)

Description

This model generates contextual sentence embeddings for input sentences. It has been fine-tuned on the MedNLI dataset to provide state-of-the-art performance on STS and SentEval benchmarks.


How to use

Use as part of an NLP pipeline with the following stages: DocumentAssembler, SentenceDetector, BertSentenceEmbeddings. The output of this model can be used in tasks such as NER, classification, and entity resolution.

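from sparknlp.annotator import BertSentenceEmbeddings

# Load the pretrained model; "ner_chunk_doc" must be produced by an upstream stage.
sbiobert_embeddings = BertSentenceEmbeddings\
    .pretrained("sbiobert_base_cased_mli","en","clinical/models")\
    .setInputCols(["ner_chunk_doc"])\
    .setOutputCol("sbert_embeddings")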

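import com.johnsnowlabs.nlp.embeddings.BertSentenceEmbeddings

// Load the pretrained model; "ner_chunk_doc" must be produced by an upstream stage.
val sbiobert_embeddings = BertSentenceEmbeddings
    .pretrained("sbiobert_base_cased_mli","en","clinical/models")
    .setInputCols(Array("ner_chunk_doc"))
    .setOutputCol("sbert_embeddings")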

import nlu
nlu.load("en.embed_sentence.biobert.mli").predict("""Put your text here.""")
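
The three stages named in this section can be chained as in the following minimal sketch. It is an illustration under stated assumptions, not the official example: the helper name build_pipeline and the column names "text", "document", and "sentence" are hypothetical, and a Spark session with a valid license for "clinical/models" is assumed to be running already. The imports sit inside the function so the sketch can be defined without Spark installed.

```python
def build_pipeline(text_col="text"):
    """Assemble DocumentAssembler -> SentenceDetector -> BertSentenceEmbeddings."""
    # Local imports: pyspark and sparknlp are only needed when the helper is called.
    from pyspark.ml import Pipeline
    from sparknlp.base import DocumentAssembler
    from sparknlp.annotator import SentenceDetector, BertSentenceEmbeddings

    # Wrap the raw text column (assumed name) into a document annotation.
    document_assembler = (
        DocumentAssembler()
        .setInputCol(text_col)
        .setOutputCol("document")
    )

    # Split each document into sentences.
    sentence_detector = (
        SentenceDetector()
        .setInputCols(["document"])
        .setOutputCol("sentence")
    )

    # Embed each sentence with the pretrained model from this page.
    sbiobert_embeddings = (
        BertSentenceEmbeddings
        .pretrained("sbiobert_base_cased_mli", "en", "clinical/models")
        .setInputCols(["sentence"])
        .setOutputCol("sbert_embeddings")
    )

    return Pipeline(stages=[document_assembler, sentence_detector, sbiobert_embeddings])
```

Given a DataFrame df with a "text" column, build_pipeline().fit(df).transform(df) would add an "sbert_embeddings" column holding one embedding per detected sentence.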

Results

Produces a 768-dimensional vector representation of each sentence.
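
For similarity tasks such as STS, two sentence vectors are commonly compared with cosine similarity. The sketch below shows that comparison in plain Python. The vectors are toy placeholders with the model's dimensionality, not real model output, and the function name cosine_similarity is a hypothetical helper rather than part of Spark NLP.

```python
import math

EMBEDDING_DIM = 768  # dimensionality stated on this page


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Toy vectors standing in for two sentence embeddings (assumption, not model output).
vec_a = [1.0] * EMBEDDING_DIM
vec_b = [1.0] * (EMBEDDING_DIM // 2) + [0.0] * (EMBEDDING_DIM // 2)

score = cosine_similarity(vec_a, vec_b)
print(round(score, 4))  # 0.7071
```

In practice the two inputs would be the embedding arrays read from the pipeline's output column, one per sentence.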

Model Information

Model Name:	sbiobert_base_cased_mli
Type:	BertSentenceEmbeddings
Compatibility:	Spark NLP 2.6.4+
Edition:	Official
License:	Licensed
Input Labels:	[ner_chunk]
Output Labels:	[sentence_embeddings]
Language:	[en]
Case sensitive:	false

Data Source

Fine-tuned on the MedNLI dataset, starting from BioBERT weights.
