Generic Text Generation - Large

Description

This model is based on Google's Flan-T5 Large and performs conditional text generation: given a prompt, it generates a text response. The sequence length is 512 tokens.
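
Because the model is driven entirely by the prompt, it can help to build prompts in a consistent shape and to sanity-check their length before sending them to the pipeline. The sketch below is only an illustration: `build_prompt` and `rough_token_count` are hypothetical helpers, not part of any library, and the word count is a crude stand-in for the model's real subword tokenizer, so it usually undercounts.

```python
MAX_SEQUENCE_TOKENS = 512  # sequence length stated in the description


def build_prompt(instruction: str, text: str) -> str:
    """Join an instruction and its input text with a blank line between them."""
    return f"{instruction.strip()}\n\n{text.strip()}"


def rough_token_count(prompt: str) -> int:
    """Count whitespace-separated words (an assumption, NOT the model's tokenizer)."""
    return len(prompt.split())


prompt = build_prompt(
    "Classify the following review as negative or positive:",
    "Not a huge fan of her acting, but the movie was actually quite good!",
)

# Flag prompts that clearly exceed the limit. Shorter prompts may still be
# too long once they are split into subword tokens.
if rough_token_count(prompt) > MAX_SEQUENCE_TOKENS:
    print("Prompt is likely too long for the 512-token sequence length")
```

A prompt built this way has the same instruction, blank line, input layout as the example prompt used in this card.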

How to use

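# Python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp_jsl.annotator import MedicalTextGenerator

# Assumes `spark` is an active session with Healthcare NLP loaded.

# Wrap the raw prompt column in a document annotation.
document_assembler = DocumentAssembler()\
    .setInputCol("prompt")\
    .setOutputCol("document_prompt")

# Generate up to 256 new tokens, sampling among the 3 most likely tokens at each step.
med_text_generator = MedicalTextGenerator.pretrained("text_generator_generic_flan_t5_large", "en", "clinical/models")\
    .setInputCols("document_prompt")\
    .setOutputCol("answer")\
    .setMaxNewTokens(256)\
    .setDoSample(True)\
    .setTopK(3)\
    .setRandomSeed(42)

pipeline = Pipeline(stages=[document_assembler, med_text_generator])
data = spark.createDataFrame([["""Classify the following review as negative or positive:

Not a huge fan of her acting, but the movie was actually quite good!"""]]).toDF("prompt")
result = pipeline.fit(data).transform(data)

# The generated text is in the `result` field of the `answer` annotations.
result.select("answer.result").show(truncate=False)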
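
// Scala
import org.apache.spark.ml.Pipeline
import com.johnsnowlabs.nlp.DocumentAssembler
// MedicalTextGenerator comes from the licensed Healthcare NLP library.
import spark.implicits._ // needed for Seq(...).toDF

// Wrap the raw prompt column in a document annotation.
val document_assembler = new DocumentAssembler()
    .setInputCol("prompt")
    .setOutputCol("document_prompt")

// Generate up to 256 new tokens, sampling among the 3 most likely tokens at each step.
val med_text_generator = MedicalTextGenerator.pretrained("text_generator_generic_flan_t5_large", "en", "clinical/models")
    .setInputCols("document_prompt")
    .setOutputCol("answer")
    .setMaxNewTokens(256)
    .setDoSample(true)
    .setTopK(3)
    .setRandomSeed(42)

val pipeline = new Pipeline().setStages(Array(document_assembler, med_text_generator))
// The prompt must be a plain string column, so build the DataFrame from a Seq of strings.
val data = Seq("""Classify the following review as negative or positive:

Not a huge fan of her acting, but the movie was actually quite good!""").toDF("prompt")
val result = pipeline.fit(data).transform(data)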
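
The pipelines above enable sampling with `setDoSample`, restrict it to the 3 most likely tokens with `setTopK(3)`, and fix the seed with `setRandomSeed(42)`. As a rough intuition for what top-k sampling does, here is a toy sketch in plain Python. It is not the library's internal implementation; `sample_top_k` and the `toy_logits` scores are made up for illustration.

```python
import math
import random


def sample_top_k(logits: dict[str, float], k: int, rng: random.Random) -> str:
    """Sample one token from the k highest-scoring candidates."""
    # Keep only the k tokens with the largest scores.
    top = sorted(logits.items(), key=lambda item: item[1], reverse=True)[:k]
    # Softmax over the surviving scores (shifted by the max for numerical stability).
    max_logit = top[0][1]
    weights = [math.exp(score - max_logit) for _, score in top]
    tokens = [token for token, _ in top]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-token scores, invented for this example.
toy_logits = {"positive": 4.0, "negative": 2.5, "neutral": 1.0, "banana": -3.0}

rng = random.Random(42)  # fixed seed, so repeated runs draw the same tokens
draws = [sample_top_k(toy_logits, k=3, rng=rng) for _ in range(5)]
print(draws)
```

With k=3 the lowest-scoring candidate can never be drawn, and reusing the same seed reproduces the same sequence of draws, which is the practical reason to set a random seed when sampling is enabled.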

Results

positive
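
The output is free-form generated text, and with sampling enabled its casing or punctuation may differ between prompts. If downstream code needs a fixed label, one option is to normalize the generated string. The helper below is a hypothetical sketch, not part of the library, and assumes the answer has already been pulled out of the pipeline as a plain Python string.

```python
import re
from typing import Optional, Sequence


def normalize_label(
    generated: str, labels: Sequence[str] = ("positive", "negative")
) -> Optional[str]:
    """Return the first known label found in the generated text, or None."""
    # Lowercase and split into alphabetic words so "Positive." matches "positive".
    for word in re.findall(r"[a-z]+", generated.lower()):
        if word in labels:
            return word
    return None


print(normalize_label("positive"))       # prints: positive
print(normalize_label(" Positive.\n"))   # prints: positive
print(normalize_label("I am not sure"))  # prints: None
```

Returning `None` for unrecognized text keeps unexpected generations visible instead of silently mapping them to a label.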

Model Information

Model Name: text_generator_generic_flan_t5_large
Compatibility: Healthcare NLP 4.3.2+
License: Licensed
Edition: Official
Language: en
Size: 2.9 GB