T5 text-to-text model

Description

T5 is the transformer model introduced in the paper “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer”. It treats every task as text-to-text, so the same model can perform a variety of tasks, such as text summarization, question answering, and translation, with the task selected by a text prefix such as "summarize:". More details about the model can be found in the paper (https://arxiv.org/pdf/1910.10683.pdf).
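
The text-to-text idea can be illustrated with a minimal sketch in plain Python. The helper name `with_task_prefix` is hypothetical and is not part of Spark NLP; the sketch only shows how a task prefix and the input text form a single input string, and it makes no claim about how the library builds its inputs internally.

```python
def with_task_prefix(task: str, text: str) -> str:
    """Prepend a T5-style task prefix to the input text.

    Hypothetical helper for illustration only; not a Spark NLP API.
    """
    task = task.strip()
    text = text.strip()
    # With an empty task, the text is passed through unchanged.
    return f"{task} {text}" if task else text


# The same model input format covers different tasks; only the prefix changes.
summarization_input = with_task_prefix("summarize:", "Some long article text.")
translation_input = with_task_prefix("translate English to German:", "That is good.")

print(summarization_input)
print(translation_input)
```

In the pipeline shown under "How to use", the prefix is supplied through `setTask` rather than being added to the text by hand.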

How to use

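        # Python
        from sparknlp.base import DocumentAssembler
        from sparknlp.annotator import T5Transformer
        from pyspark.ml import Pipeline

        # data_df is assumed to be an existing Spark DataFrame
        # with a string column named "text".

        # Convert the raw text column into Spark NLP document annotations.
        document_assembler = DocumentAssembler() \
            .setInputCol("text") \
            .setOutputCol("documents")

        # Load the pretrained model; the task prefix selects summarization.
        t5 = T5Transformer.pretrained("t5_small") \
            .setTask("summarize:") \
            .setMaxOutputLength(200) \
            .setInputCols(["documents"]) \
            .setOutputCol("summaries")

        pipeline = Pipeline().setStages([document_assembler, t5])
        results = pipeline.fit(data_df).transform(data_df)

        # Each row holds the generated summaries as an array of strings.
        results.select("summaries.result").show(truncate=False)
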
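    // Scala
    import com.johnsnowlabs.nlp.DocumentAssembler
    import com.johnsnowlabs.nlp.annotators.seq2seq.T5Transformer
    import org.apache.spark.ml.Pipeline

    // dataDf is assumed to be an existing DataFrame
    // with a string column named "text".

    // Convert the raw text column into Spark NLP document annotations.
    val documentAssembler = new DocumentAssembler()
      .setInputCol("text")
      .setOutputCol("documents")

    // Load the pretrained model; the task prefix selects summarization.
    val t5 = T5Transformer
      .pretrained("t5_small")
      .setTask("summarize:")
      .setMaxOutputLength(200)
      .setInputCols(Array("documents"))
      .setOutputCol("summaries")

    val pipeline = new Pipeline().setStages(Array(documentAssembler, t5))

    val model = pipeline.fit(dataDf)
    val results = model.transform(dataDf)

    // Each row holds the generated summaries as an array of strings.
    results.select("summaries.result").show(truncate = false)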

Model Information

Model Name: t5_small
Compatibility: Spark NLP 2.7.0+
Edition: Official
Language: en
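
The compatibility requirement above can be checked before loading the model. The following is a minimal sketch in plain Python; the names `MIN_VERSION`, `parse_version`, and `supports_t5_small` are hypothetical helpers for illustration, and the check assumes version strings of the form `major.minor[.patch]`.

```python
import re

# Minimum version taken from "Compatibility: Spark NLP 2.7.0+".
MIN_VERSION = (2, 7, 0)


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from a string such as '2.7.0'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def supports_t5_small(version: str) -> bool:
    """Return True if the given Spark NLP version is at least 2.7.0."""
    return parse_version(version) >= MIN_VERSION
```

In a live session the version string would typically come from `sparknlp.version()`; here it is passed in explicitly so the sketch runs without Spark NLP installed.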

Data Source

C4 (Colossal Clean Crawled Corpus)