Longformer Base (longformer_base_4096)


Longformer is a transformer model for long documents.

longformer_base_4096 is a BERT-like model initialized from the RoBERTa checkpoint and pretrained with masked language modeling (MLM) on long documents. It supports sequences of up to 4,096 tokens.

Longformer combines sliding-window (local) attention with global attention. Global attention is configured by the user according to the task, which lets the model learn task-specific representations.
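The attention pattern described above can be illustrated with a small, dependency-free sketch. It is not Longformer's actual implementation: the function name `build_attention_mask`, the window size, and the choice of position 0 as the global token are all assumptions made for illustration. Each token attends to its neighbours within the window, and any position marked global attends to every token and is attended to by every token.

```python
def build_attention_mask(seq_len, window, global_positions):
    """Return a seq_len x seq_len boolean matrix.

    mask[i][j] is True when token i may attend to token j.
    `window` is the total local window size (window // 2 tokens on each side).
    `global_positions` are the indices given global attention (hypothetical,
    task-dependent choice, e.g. a classification token).
    """
    half = window // 2
    global_set = set(global_positions)
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        for j in range(seq_len):
            # Local attention: j lies inside the sliding window around i.
            is_local = abs(i - j) <= half
            # Global attention is symmetric: a global token sees everything,
            # and every token sees the global token.
            is_global = i in global_set or j in global_set
            mask[i][j] = is_local or is_global
    return mask


# Toy example: 8 tokens, one neighbour on each side, token 0 is global.
mask = build_attention_mask(seq_len=8, window=2, global_positions=[0])
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

With a fixed window, the number of allowed pairs grows linearly with the sequence length rather than quadratically, which is what makes 4,096-token inputs practical.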

If you use Longformer in your research, please cite Longformer: The Long-Document Transformer.

@article{Beltagy2020Longformer,
  title={Longformer: The Long-Document Transformer},
  author={Iz Beltagy and Matthew E. Peters and Arman Cohan},
  journal={arXiv:2004.05150},
  year={2020},
}

Longformer is an open-source project developed by the Allen Institute for Artificial Intelligence (AI2). AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering.


How to use

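Python:

embeddings = LongformerEmbeddings\
      .pretrained("longformer_base_4096", "en")\
      .setInputCols(["document", "token"])\
      .setOutputCol("embeddings")

Scala:

val embeddings = LongformerEmbeddings.pretrained("longformer_base_4096", "en")
    .setInputCols("document", "token")
    .setOutputCol("embeddings")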
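The embeddings annotator expects `document` and `token` columns, so it is normally placed after a document assembler and a tokenizer. The following is a minimal sketch of such a pipeline in Python, assuming Spark NLP 3.2.0+ and PySpark are installed. The function name `build_pipeline` and the input column name `text` are choices made here for illustration, not part of the model card.

```python
def build_pipeline():
    """Assemble a Spark ML pipeline that produces Longformer token embeddings."""
    # Imported inside the function so this sketch can be loaded
    # without a Spark installation; a Spark NLP session must be
    # running (e.g. via sparknlp.start()) before calling it.
    from pyspark.ml import Pipeline
    from sparknlp.base import DocumentAssembler
    from sparknlp.annotator import Tokenizer, LongformerEmbeddings

    # Wrap the raw text column (assumed to be named "text") as a document.
    document_assembler = DocumentAssembler() \
        .setInputCol("text") \
        .setOutputCol("document")

    # Split each document into tokens.
    tokenizer = Tokenizer() \
        .setInputCols(["document"]) \
        .setOutputCol("token")

    # Download the pretrained model and configure it to match the
    # values listed under Model Information.
    embeddings = LongformerEmbeddings \
        .pretrained("longformer_base_4096", "en") \
        .setInputCols(["document", "token"]) \
        .setOutputCol("embeddings") \
        .setCaseSensitive(True) \
        .setMaxSentenceLength(4096)

    return Pipeline(stages=[document_assembler, tokenizer, embeddings])
```

Given a DataFrame `df` with a `text` column, `build_pipeline().fit(df).transform(df)` should yield an `embeddings` column with one vector per token. The first call downloads the model, so it needs network access.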

Model Information

Model Name: longformer_base_4096
Compatibility: Spark NLP 3.2.0+
License: Open Source
Edition: Official
Input Labels: [token, sentence]
Output Labels: [embeddings]
Language: en
Case sensitive: true
Max sentence length: 4096

Data Source