Description
This model is trained to generate contextual sentence embeddings of input sentences. It has been fine-tuned on the MedNLI dataset to provide state-of-the-art performance on the STS and SentEval benchmarks.
How to use
Use this model as part of an NLP pipeline with the following stages: DocumentAssembler, SentenceDetector, BertSentenceEmbeddings. Its output can feed downstream tasks such as NER, classification, and entity resolution.
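# Python: load the pretrained model and set its input and output columns.
from sparknlp.annotator import BertSentenceEmbeddings
sbiobert_embeddings = BertSentenceEmbeddings\
    .pretrained("sbiobert_base_cased_mli", "en", "clinical/models")\
    .setInputCols(["ner_chunk_doc"])\
    .setOutputCol("sbert_embeddings")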
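// Scala: load the pretrained model and set its input and output columns.
import com.johnsnowlabs.nlp.embeddings.BertSentenceEmbeddings
val sbiobert_embeddings = BertSentenceEmbeddings
  .pretrained("sbiobert_base_cased_mli", "en", "clinical/models")
  .setInputCols(Array("ner_chunk_doc"))
  .setOutputCol("sbert_embeddings")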
# Python (NLU library): load the model by its NLU reference and embed a text.
import nlu
nlu.load("en.embed_sentence.biobert.mli").predict("""Put your text here.""")
Results
The model outputs a 768-dimensional vector representation of each input sentence.
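Sentence embeddings like these are often compared with cosine similarity, for example in semantic similarity or entity resolution tasks. The following is a minimal sketch in plain Python, not part of the Spark NLP API: the helper `cosine_similarity` and the vectors `vec_a` and `vec_b` are hypothetical placeholders standing in for two 768-dimensional embeddings collected from the model's output column.

```python
import math
from typing import Sequence

EMBEDDING_DIM = 768  # dimensionality stated in this model card


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors (assumption): real values would come from the
# embeddings column produced by the pipeline, not from these constants.
vec_a = [1.0] * EMBEDDING_DIM
vec_b = [1.0] * (EMBEDDING_DIM // 2) + [0.0] * (EMBEDDING_DIM // 2)

score = cosine_similarity(vec_a, vec_b)
print(f"cosine similarity: {score:.4f}")
```

Scores close to 1 indicate vectors pointing in nearly the same direction. What threshold counts as "similar" depends on the task and would need to be tuned on real data.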
Model Information
Model Name: | sbiobert_base_cased_mli |
Type: | BertSentenceEmbeddings |
Compatibility: | Spark NLP 2.6.4+ |
Edition: | Official |
License: | Licensed |
Input Labels: | [ner_chunk] |
Output Labels: | [sentence_embeddings] |
Language: | [en] |
Case sensitive: | false |
Data Source
Fine-tuned on the MedNLI dataset, starting from BioBERT weights.