BioBERT PubMed PMC Base Cased

Description

BERT (Bidirectional Encoder Representations from Transformers) provides dense vector representations for natural language by using a deep, pre-trained neural network with the Transformer architecture. This model, biobert_pubmed_pmc_base_cased, produces contextual word embeddings using BioBERT, a BERT variant pre-trained on biomedical text.

Predicted Entities

Contextual feature vectors based on biobert_pubmed_pmc_base_cased

How to use

Python:

model = BertEmbeddings.pretrained("biobert_pubmed_pmc_base_cased","en","clinical/models")\
	.setInputCols("document","sentence","token")\
	.setOutputCol("word_embeddings")

Scala:

val model = BertEmbeddings.pretrained("biobert_pubmed_pmc_base_cased","en","clinical/models")
	.setInputCols("document","sentence","token")
	.setOutputCol("word_embeddings")
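The snippets above configure only the embeddings stage. Below is a minimal end-to-end sketch in Python; the upstream pipeline stages and the example sentence are illustrative, and downloading from clinical/models assumes valid John Snow Labs license credentials.

import sparknlp
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetector, Tokenizer, BertEmbeddings

# Start a Spark NLP session. For licensed clinical models, starting the
# session through sparknlp_jsl with your license secret is typically required.
spark = sparknlp.start()

document_assembler = DocumentAssembler() \
	.setInputCol("text") \
	.setOutputCol("document")

sentence_detector = SentenceDetector() \
	.setInputCols(["document"]) \
	.setOutputCol("sentence")

tokenizer = Tokenizer() \
	.setInputCols(["sentence"]) \
	.setOutputCol("token")

embeddings = BertEmbeddings.pretrained("biobert_pubmed_pmc_base_cased", "en", "clinical/models") \
	.setInputCols("document", "sentence", "token") \
	.setOutputCol("word_embeddings")

pipeline = Pipeline(stages=[document_assembler, sentence_detector, tokenizer, embeddings])

# Illustrative input text.
data = spark.createDataFrame([["He was prescribed metformin for type 2 diabetes."]]).toDF("text")
result = pipeline.fit(data).transform(data)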

Model Information

Name: biobert_pubmed_pmc_base_cased  
Type: BertEmbeddings  
Compatibility: Spark NLP 2.5.0+  
License: Licensed  
Edition: Official  
Input labels: [document, sentence, token]  
Output labels: [word_embeddings]  
Language: en  
Dimension: 768  
Case sensitive: True  
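
Each token in the word_embeddings output column carries a 768-dimensional float vector, matching the Dimension entry above. A quick sanity check in Python (assuming the result DataFrame produced by the pipeline sketch under How to use):

# Explode the per-token annotations and report each token with its vector size.
result.selectExpr("explode(word_embeddings) as emb") \
	.selectExpr("emb.result as token", "size(emb.embeddings) as dim") \
	.show(truncate=False)
# Every row is expected to report dim == 768.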

Data Source

Trained on PubMed abstracts and PMC full-text articles: https://github.com/naver/biobert-pretrained