Medical Question Answering (biogpt_pubmed_qa)

Description

This model has been trained with medical documents and can generate two types of answers, short and long. Types of questions are supported: "short" (producing yes/no/maybe) answers and "full" (long answers).

Predicted Entities

Download Copy S3 URI

How to use

document_assembler = MultiDocumentAssembler()\
    .setInputCols("question", "context")\
    .setOutputCols("document_question", "document_context")

med_qa = MedicalQuestionAnswering\
    .pretrained("biogpt_pubmed_qa", "en", "clinical/models")\
    .setInputCols(["document_question", "document_context"])\
    .setOutputCol("answer")\
    .setMaxNewTokens(30)\
    .setTopK(1)\
    .setQuestionType("long") # "short"

pipeline = Pipeline(stages=[document_assembler, med_qa])

paper_abstract = "The visual indexing theory proposed by Zenon Pylyshyn (Cognition, 32, 65-97, 1989) predicts that visual attention mechanisms are employed when mental images are projected onto a visual scene."
long_question = "What is the effect of directing attention on memory?"
yes_no_question = "Does directing attention improve memory for items?"

data = spark.createDataFrame(
    [
        [long_question, paper_abstract, "long"],
        [yes_no_question, paper_abstract, "short"],
    ]
).toDF("question", "context", "question_type")

pipeline.fit(data).transform(data.where("question_type == 'long'"))\
    .select("answer.result")\
    .show(truncate=False)

pipeline.fit(data).transform(data.where("question_type == 'short'"))\
    .select("answer.result")\
    .show(truncate=False)
val document_assembler = new MultiDocumentAssembler()
    .setInputCols("question", "context")
    .setOutputCols("document_question", "document_context")

val med_qa = MedicalQuestionAnswering
    .pretrained("biogpt_pubmed_qa","en","clinical/models")
    .setInputCols(("document_question", "document_context"))
    .setOutputCol("answer")
    .setMaxNewTokens(30)
    .setTopK(1)
    .setQuestionType("long") # "short"

val pipeline = new Pipeline().setStages(Array(document_assembler, med_qa))

paper_abstract = "The visual indexing theory proposed by Zenon Pylyshyn (Cognition, 32, 65-97, 1989) predicts that visual attention mechanisms are employed when mental images are projected onto a visual scene."
long_question = "What is the effect of directing attention on memory?"
yes_no_question = "Does directing attention improve memory for items?"

val data = Seq( 
    (long_question, paper_abstract,"long" ),
    (yes_no_question, paper_abstract, "short"))
    .toDS.toDF("question", "context", "question_type")

val result = pipeline.fit(data).transform(data)

Results

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|result                                                                                                                                                                        |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[the present study investigated whether directing spatial attention to one location in a visual array would enhance memory for the array features. participants memorized two]|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Model Information

Model Name: biogpt_pubmed_qa
Compatibility: Healthcare NLP 4.4.2+
License: Licensed
Edition: Official
Language: en
Size: 1.1 GB
Case sensitive: true