Computes contextualized word representations using character-based word representations and bidirectional LSTMs.
This model outputs fixed embeddings at each LSTM layer and a learnable aggregation of the 3 layers.
- `word_emb`: the character-based word representations, with shape [batch_size, max_length, 512].
- `lstm_outputs1`: the first LSTM hidden state, with shape [batch_size, max_length, 1024].
- `lstm_outputs2`: the second LSTM hidden state, with shape [batch_size, max_length, 1024].
- `elmo`: the weighted sum of the 3 layers, where the weights are trainable. This tensor has shape [batch_size, max_length, 1024].
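The `elmo` output above is a trainable softmax-weighted, scaled sum of the three layers. The snippet below is a minimal plain-Python sketch of that mixing step for a single token; the function name `elmo_combine`, the toy vectors, and the equal layer dimensions are illustrative assumptions (in the actual module, `word_emb` is 512-dimensional while the LSTM outputs are 1024-dimensional).

```python
import math

def elmo_combine(layers, weights, gamma=1.0):
    """Hypothetical sketch of the ELMo layer mixing for one token.

    layers:  3 equal-length toy vectors standing in for word_emb,
             lstm_outputs1, and lstm_outputs2.
    weights: 3 trainable scalars, softmax-normalized before mixing.
    gamma:   trainable scalar applied to the weighted sum.
    """
    exps = [math.exp(w) for w in weights]
    total = sum(exps)
    s = [e / total for e in exps]  # softmax-normalized mixing weights
    dim = len(layers[0])
    return [gamma * sum(s[j] * layers[j][i] for j in range(3))
            for i in range(dim)]

# With equal weights, the result is a plain average of the three layers:
out = elmo_combine([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [0.0, 0.0, 0.0])
```

During training, the softmax weights let the downstream task decide how much each layer contributes, while `gamma` rescales the combined vector.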
The complex architecture achieves state-of-the-art results on several benchmarks. Note that this is a very computationally expensive module compared to word embedding modules that only perform embedding lookups. The use of an accelerator is recommended.
The details are described in the paper “Deep contextualized word representations”.
How to use
```python
embeddings = ElmoEmbeddings.pretrained("elmo", "en") \
    .setInputCols(["sentence", "token"]) \
    .setOutputCol("embeddings") \
    .setPoolingLayer("elmo")
```
```scala
val embeddings = ElmoEmbeddings.pretrained("elmo", "en")
  .setInputCols("sentence", "token")
  .setOutputCol("embeddings")
  .setPoolingLayer("elmo")
```
|---|---|
|Compatibility:|Spark NLP 2.4.0+|
|Input Labels:|[sentence, token]|
The model is imported from https://tfhub.dev/google/elmo/3.