FastText Word Embeddings in German

Description

Word Embeddings lookup annotator that maps tokens to vectors.
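To make the lookup idea concrete, here is a minimal pure-Python sketch of mapping tokens to vectors. It is not the Spark NLP implementation: the table `TOY_VECTORS`, the helper `lookup`, the lowercasing step, and the zero-vector fallback for unknown tokens are all assumptions made for illustration, and the toy vectors have 3 dimensions rather than 300.

```python
from typing import Dict, List

# Hypothetical toy table; the real model stores 300-dimensional vectors.
TOY_VECTORS: Dict[str, List[float]] = {
    "haus": [0.1, 0.2, 0.3],
    "katze": [0.4, 0.5, 0.6],
}
DIM = 3


def lookup(tokens: List[str], table: Dict[str, List[float]], dim: int) -> List[List[float]]:
    """Return one vector per token.

    Tokens are lowercased before the lookup, and tokens missing from the
    table receive a zero vector. Both choices are assumptions of this sketch.
    """
    return [table.get(tok.lower(), [0.0] * dim) for tok in tokens]


vectors = lookup(["Haus", "Katze", "Xyz"], TOY_VECTORS, DIM)
```

Each input token yields exactly one output vector, so the output stays aligned with the token sequence.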

Predicted Entities

Word2Vec feature vectors based on w2v_cc_300d.


How to use

# Python
from sparknlp.annotator import WordEmbeddingsModel
model = WordEmbeddingsModel.pretrained("w2v_cc_300d", "de", "clinical/models") \
	.setInputCols("document", "token") \
	.setOutputCol("word_embeddings")

// Scala
import com.johnsnowlabs.nlp.embeddings.WordEmbeddingsModel
val model = WordEmbeddingsModel.pretrained("w2v_cc_300d", "de", "clinical/models")
	.setInputCols("document", "token")
	.setOutputCol("word_embeddings")

Model Information

Name: w2v_cc_300d
Type: WordEmbeddingsModel
Compatibility: Spark NLP for Healthcare 2.5.5+
License: Licensed
Edition: Official
Input labels: [document, token]
Output labels: [word_embeddings]
Language: de
Dimension: 300

Data Source

FastText Common Crawl word embeddings for German: https://fasttext.cc/docs/en/crawl-vectors.html
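The text vectors published on that page use a simple layout: a header line with the vocabulary size and the vector dimension, followed by one line per word with its space-separated values. The sketch below shows one way such a file could be parsed. The helper name `read_vec` and the two-word sample are hypothetical, and the sample uses 3 dimensions instead of 300 to stay readable.

```python
import io
from typing import Dict, List, TextIO, Tuple


def read_vec(handle: TextIO) -> Tuple[int, Dict[str, List[float]]]:
    """Parse fastText text-format vectors from an open text handle.

    Returns the dimension declared in the header and a word -> vector table.
    Lines whose value count does not match the header are skipped.
    """
    header = handle.readline().split()
    dim = int(header[1])  # header[0] is the declared vocabulary size
    table: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            continue  # malformed or truncated line
        table[word] = [float(v) for v in values]
    return dim, table


# Made-up sample in the same layout, for illustration only.
sample = io.StringIO("2 3\nhaus 0.1 0.2 0.3\nkatze 0.4 0.5 0.6\n")
dim, table = read_vec(sample)
```

For the real German vectors, the same function could be given a file handle opened on the downloaded and decompressed text file.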