T5 text-to-text model

Description

This is the T5 transformer model described in the paper “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer”. T5 frames every task as text in, text out, so a single model can perform a variety of tasks, such as text summarization, question answering, and translation. The task is selected with a text prefix such as "summarize:". More details about using the model can be found in the paper (https://arxiv.org/pdf/1910.10683.pdf).
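
Because T5 treats every task as text-to-text, the task is chosen by a short prefix placed in front of the input. The sketch below illustrates that idea in plain Python. The helper `build_t5_input` is hypothetical and not part of Spark NLP; it only approximates what a task prefix such as the one passed to `setTask` presumably does to each input, and the whitespace handling is an assumption.

```python
# Hypothetical helper for illustration only; not a Spark NLP API.
def build_t5_input(task_prefix: str, text: str) -> str:
    """Return the text a T5-style model would see: '<prefix> <text>'."""
    prefix = task_prefix.strip()
    body = " ".join(text.split())  # collapse runs of whitespace (assumption)
    return f"{prefix} {body}" if prefix else body


# Prefixes of the kind used in the T5 paper.
examples = {
    "summarize:": "Put your text here.",
    "translate English to German:": "That is good.",
}

for prefix, text in examples.items():
    print(build_t5_input(prefix, text))
```

Changing only the prefix switches the task, which is why the same pretrained model can be reused for summarization and translation.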


How to use

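# Python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import T5Transformer

# Wrap the raw "text" column into Spark NLP document annotations.
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("documents")

# Load the pretrained model; the task prefix selects summarization.
t5 = T5Transformer.pretrained("t5_small") \
    .setTask("summarize:") \
    .setMaxOutputLength(200) \
    .setInputCols(["documents"]) \
    .setOutputCol("summaries")

pipeline = Pipeline().setStages([document_assembler, t5])

# data_df is a Spark DataFrame with a string column named "text".
results = pipeline.fit(data_df).transform(data_df)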
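
// Scala
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.seq2seq.T5Transformer
import org.apache.spark.ml.Pipeline

val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("documents")

val t5 = T5Transformer
    .pretrained("t5_small")
    .setTask("summarize:")
    .setInputCols(Array("documents"))
    .setOutputCol("summaries")

val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))

// dataDf is a Spark DataFrame with a string column named "text".
val result = pipeline.fit(dataDf).transform(dataDf)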

# Python, using the NLU library
import nlu

nlu.load("en.t5.small").predict("""Put your text here.""")
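
The pipeline writes its output to the "summaries" column as a list of Spark NLP annotations, each carrying the generated text in a `result` field. A minimal sketch of pulling out just the text follows. The helper `extract_results` is hypothetical, and the rows here are plain-dict stand-ins for collected Spark rows so the snippet runs without a Spark session.

```python
from typing import Any, Iterable, List


# Hypothetical helper for illustration only; not a Spark NLP API.
def extract_results(rows: Iterable[Any], column: str = "summaries") -> List[List[str]]:
    """Collect the `result` text of every annotation in `column`, row by row."""
    return [[annotation["result"] for annotation in row[column]] for row in rows]


# Stand-in for collected rows: one row holding one annotation (assumed shape).
fake_rows = [{"summaries": [{"result": "a short summary"}]}]

print(extract_results(fake_rows))  # [['a short summary']]
```

With the Python pipeline above, the same helper should accept `results.select("summaries").collect()`, assuming the collected rows support lookup by field name as PySpark `Row` objects do.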

Model Information

Model Name: t5_small
Compatibility: Spark NLP 2.7.0+
Edition: Official
Language: en

Data Source

C4 (Colossal Clean Crawled Corpus)