FastText Word Embeddings in German

Description

Word Embeddings lookup annotator that maps tokens to vectors.
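To make "maps tokens to vectors" concrete, here is a minimal sketch of a lookup in plain Python. It is not the annotator's implementation: the toy table, the 3-dimensional vectors, the lowercasing, and the zero vector for unknown tokens are all assumptions chosen for illustration, and `lookup` is a hypothetical helper.

```python
from typing import Dict, List

# Hypothetical toy table; the real model holds 300-dimensional vectors
# for a large German vocabulary.
DIM = 3
TABLE: Dict[str, List[float]] = {
    "patient": [0.1, 0.2, 0.3],
    "fieber": [0.4, 0.5, 0.6],
}


def lookup(tokens: List[str], table: Dict[str, List[float]], dim: int) -> List[List[float]]:
    """Return one vector per token.

    Assumptions for this sketch: tokens are lowercased before lookup and
    out-of-vocabulary tokens fall back to a zero vector.
    """
    zero = [0.0] * dim
    return [list(table.get(tok.lower(), zero)) for tok in tokens]


vectors = lookup(["Patient", "hat", "Fieber"], TABLE, DIM)
for tok, vec in zip(["Patient", "hat", "Fieber"], vectors):
    print(tok, vec)
```

The output has exactly one vector per input token, which is the same shape of result the annotator writes to its output column.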

How to use

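# Python: load the pretrained German embeddings and set the input/output columns
from sparknlp.annotator import WordEmbeddingsModel

model = WordEmbeddingsModel.pretrained("w2v_cc_300d","de","clinical/models")\
	.setInputCols(["document","token"])\
	.setOutputCol("word_embeddings")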

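// Scala: the same model with the same input/output columns
import com.johnsnowlabs.nlp.embeddings.WordEmbeddingsModel

val model = WordEmbeddingsModel.pretrained("w2v_cc_300d","de","clinical/models")
	.setInputCols(Array("document","token"))
	.setOutputCol("word_embeddings")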

import nlu
nlu.load("de.embed.w2v").predict("""Put your text here.""")

Results

The word_embeddings output column holds one 300-dimensional feature vector per token, looked up from w2v_cc_300d.

Model Information

Name:	w2v_cc_300d
Type:	WordEmbeddingsModel
Compatibility:	Healthcare NLP 2.5.5+
License:	Licensed
Edition:	Official
Input labels:	[document, token]
Output labels:	[word_embeddings]
Language:	de
Dimension:	300

Data Source

FastText Common Crawl word vectors for German: https://fasttext.cc/docs/en/crawl-vectors.html
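For readers who want to inspect the upstream vectors directly, the sketch below reads the fastText text format, where the first line gives the vocabulary size and vector dimension and each following line is a word followed by space-separated values. How this model was built from those files is not stated here; `read_vec` is a hypothetical helper and the two-word sample stands in for the real download.

```python
import io
from typing import Dict, List, Optional, TextIO, Tuple


def read_vec(handle: TextIO, limit: Optional[int] = None) -> Tuple[int, Dict[str, List[float]]]:
    """Parse fastText text-format vectors from an open text handle.

    Returns the declared dimension and a word -> vector mapping.
    `limit` optionally stops after that many words (useful for large files).
    """
    header = handle.readline().split()
    dim = int(header[1])  # header is "<vocabulary size> <dimension>"
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed lines rather than guessing
        vectors[parts[0]] = [float(x) for x in parts[1:]]
        if limit is not None and len(vectors) >= limit:
            break
    return dim, vectors


# Tiny in-memory sample in the same layout (made-up values, 3 dimensions).
SAMPLE = "2 3\nhaus 0.1 0.2 0.3\nkatze -0.4 0.5 0.6\n"
dim, vectors = read_vec(io.StringIO(SAMPLE))
print(dim, sorted(vectors))
```

With a real file, the same function can be called on a handle opened with `open(path, encoding="utf-8")`, ideally with `limit` set to keep memory use small.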
