Elmo

Description

Computes contextualized word representations using character-based word representations and bidirectional LSTMs.

This model outputs fixed embeddings from the character-based layer and each LSTM layer, plus a learnable aggregation of the 3 layers.

  • word_emb: the character-based word representations, with shape [batch_size, max_length, 512].
  • lstm_outputs1: the first LSTM hidden state, with shape [batch_size, max_length, 1024].
  • lstm_outputs2: the second LSTM hidden state, with shape [batch_size, max_length, 1024].
  • elmo: the weighted sum of the 3 layers, where the weights are trainable. This tensor has shape [batch_size, max_length, 1024].
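The elmo output above can be sketched in plain NumPy. This is an illustrative reimplementation of the weighted-sum formula from the ELMo paper, not the module's actual code; the layer names and shapes follow this model card, and the scalar weights and gamma stand in for the trainable parameters. In the TF Hub module the 512-dimensional word_emb is concatenated with itself so that all three layers share the 1024-dimensional shape before summing.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def elmo_weighted_sum(word_emb, lstm1, lstm2, scalars, gamma=1.0):
    """Sketch of the 'elmo' output: a trainable weighted sum of the 3 layers.

    scalars are the (trainable) per-layer weights before softmax
    normalization; gamma is a trainable global scale.
    """
    # word_emb is 512-d; double it (concatenate with itself) so all
    # three layers are 1024-d before summing.
    layers = [np.concatenate([word_emb, word_emb], axis=-1), lstm1, lstm2]
    s = softmax(scalars)  # normalized layer weights
    return gamma * sum(w * layer for w, layer in zip(s, layers))

batch, max_len = 2, 5
word_emb = np.random.rand(batch, max_len, 512)
lstm1 = np.random.rand(batch, max_len, 1024)
lstm2 = np.random.rand(batch, max_len, 1024)

# With equal scalars, softmax gives uniform weights (a plain average).
out = elmo_weighted_sum(word_emb, lstm1, lstm2, scalars=np.zeros(3))
print(out.shape)  # (2, 5, 1024)
```

With zero scalars the softmax weights are all 1/3, so the result is the average of the three (1024-d) layers; during training the model learns which layers matter most for the downstream task.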

This architecture achieves state-of-the-art results on several benchmarks. Note that the module is far more computationally expensive than word embedding modules that only perform embedding lookups, so the use of an accelerator (e.g. a GPU) is recommended.

The details are described in the paper “Deep contextualized word representations” (Peters et al., 2018).

How to use


Python:

embeddings = ElmoEmbeddings.pretrained("elmo", "en") \
      .setInputCols("sentence", "token") \
      .setOutputCol("embeddings") \
      .setPoolingLayer("elmo")

Scala:

val embeddings = ElmoEmbeddings.pretrained("elmo", "en")
      .setInputCols("sentence", "token")
      .setOutputCol("embeddings")
      .setPoolingLayer("elmo")
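The setPoolingLayer argument selects which of the four outputs described above is emitted, which in turn determines the embedding dimension. A small lookup sketch (the mapping itself is just the shapes documented on this card; the helper name is hypothetical):

```python
# Dimension of each pooling-layer choice, as documented above.
POOLING_LAYER_DIMS = {
    "word_emb": 512,        # character-based word representations
    "lstm_outputs1": 1024,  # first LSTM hidden state
    "lstm_outputs2": 1024,  # second LSTM hidden state
    "elmo": 1024,           # trainable weighted sum of the 3 layers
}

def embedding_dim(pooling_layer: str) -> int:
    """Return the output dimension for a given setPoolingLayer value."""
    return POOLING_LAYER_DIMS[pooling_layer]

print(embedding_dim("elmo"))      # 1024
print(embedding_dim("word_emb"))  # 512
```

Downstream annotators that consume the embeddings column should be configured for the matching dimension.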

Model Information

Model Name: elmo  
Type: embeddings  
Compatibility: Spark NLP 2.4.0+  
License: Open Source  
Edition: Official  
Input Labels: [sentence, token]  
Output Labels: [word_embeddings]  
Language: [en]  
Dimension: 512 (word_emb) or 1024 (lstm_outputs1, lstm_outputs2, elmo), depending on the pooling layer  
Case sensitive: true  

Data Source

The model is imported from TensorFlow Hub: https://tfhub.dev/google/elmo/3