Computes contextualized word representations using character-based word representations and bidirectional LSTMs.
This model outputs fixed embeddings at each LSTM layer and a learnable aggregation of the 3 layers:

- word_emb: the character-based word representations, with shape [batch_size, max_length, 512].
- lstm_outputs1: the first LSTM hidden state, with shape [batch_size, max_length, 1024].
- lstm_outputs2: the second LSTM hidden state, with shape [batch_size, max_length, 1024].
- elmo: the weighted sum of the 3 layers, where the weights are trainable. This tensor has shape [batch_size, max_length, 1024].
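For reference, the trainable aggregation follows the weighted-sum formulation from the ELMo paper (dropping the task-specific superscripts used there):

$$\mathrm{ELMo}_k = \gamma \sum_{j=0}^{2} s_j \, \mathbf{h}_{k,j}$$

where $\mathbf{h}_{k,0}$ is the character-based representation of token $k$ (dimension-matched to the LSTM states), $\mathbf{h}_{k,1}$ and $\mathbf{h}_{k,2}$ are the two LSTM hidden states, the $s_j$ are softmax-normalized trainable weights, and $\gamma$ is a trainable scalar.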
This architecture achieves state-of-the-art results on several benchmarks. Note that this module is very computationally expensive compared to word embedding modules that only perform embedding lookups, so the use of an accelerator (e.g., a GPU) is recommended.
The details are described in the paper “Deep contextualized word representations” (Peters et al., NAACL 2018).
How to use
```python
from sparknlp.annotator import ElmoEmbeddings

embeddings = ElmoEmbeddings.pretrained("elmo", "en") \
    .setInputCols("sentence", "token") \
    .setOutputCol("embeddings") \
    .setPoolingLayer("elmo")
```
```scala
import com.johnsnowlabs.nlp.embeddings.ElmoEmbeddings

val embeddings = ElmoEmbeddings.pretrained("elmo", "en")
  .setInputCols("sentence", "token")
  .setOutputCol("embeddings")
  .setPoolingLayer("elmo")
```
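A minimal end-to-end sketch in Python, showing the annotator inside a full pipeline. The upstream stages, DataFrame, and input text here are illustrative, not part of the model card:

```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetector, Tokenizer, ElmoEmbeddings
from pyspark.ml import Pipeline

spark = sparknlp.start()

# Upstream annotators producing the "sentence" and "token" inputs
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

sentenceDetector = SentenceDetector() \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

tokenizer = Tokenizer() \
    .setInputCols(["sentence"]) \
    .setOutputCol("token")

embeddings = ElmoEmbeddings.pretrained("elmo", "en") \
    .setInputCols("sentence", "token") \
    .setOutputCol("embeddings") \
    .setPoolingLayer("elmo")

pipeline = Pipeline(stages=[documentAssembler, sentenceDetector, tokenizer, embeddings])

# Illustrative input; any string column named "text" works
data = spark.createDataFrame([["ELMo computes contextualized word representations."]]).toDF("text")
result = pipeline.fit(data).transform(data)

# Each token annotation carries a 1024-dimensional vector for the "elmo" layer
result.selectExpr("explode(embeddings) as embedding").show(truncate=80)
```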
|Compatibility:|Spark NLP 2.4.0+|
|Input Labels:|[sentence, token]|
The model is imported from https://tfhub.dev/google/elmo/3.