Description
This large language model (LLM) is trained to perform summarization and question answering (Q&A) based on a given context.
How to use
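# Python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp_jsl.llm import MedicalLLM  # import path assumed; adjust to your Healthcare NLP version

# Wrap the raw prompt text in a Spark NLP document annotation.
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Load the pretrained model and configure generation
# (up to 100 predicted tokens, temperature 0, chat template enabled).
medical_llm = MedicalLLM.pretrained("jsl_meds_q16_v1", "en", "clinical/models")\
    .setInputCols("document")\
    .setOutputCol("completions")\
    .setBatchSize(1)\
    .setNPredict(100)\
    .setUseChatTemplate(True)\
    .setTemperature(0)

pipeline = Pipeline(
    stages = [
        document_assembler,
        medical_llm
    ])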
prompt = """
Based on the following text, what age group is most susceptible to breast cancer?
## Text:
The exact cause of breast cancer is unknown. However, several risk factors can increase your likelihood of developing breast cancer, such as:
- A personal or family history of breast cancer
- A genetic mutation, such as BRCA1 or BRCA2
- Exposure to radiation
- Age (most commonly occurring in women over 50)
- Early onset of menstruation or late menopause
- Obesity
- Hormonal factors, such as taking hormone replacement therapy
"""
data = spark.createDataFrame([[prompt]]).toDF("text")
results = pipeline.fit(data).transform(data)
results.select("completions").show(truncate=False)
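// Scala
import com.johnsnowlabs.nlp.DocumentAssembler
import org.apache.spark.ml.Pipeline
// MedicalLLM ships with the licensed Healthcare NLP library and must be imported from it as well.

// Wrap the raw prompt text in a Spark NLP document annotation.
val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Load the pretrained model and configure generation.
val medical_llm = MedicalLLM.pretrained("jsl_meds_q16_v1", "en", "clinical/models")
  .setInputCols("document")
  .setOutputCol("completions")
  .setBatchSize(1)
  .setNPredict(100)
  .setUseChatTemplate(true)
  .setTemperature(0)

val pipeline = new Pipeline().setStages(Array(
  document_assembler,
  medical_llm
))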
val prompt = """
Based on the following text, what age group is most susceptible to breast cancer?
## Text:
The exact cause of breast cancer is unknown. However, several risk factors can increase your likelihood of developing breast cancer, such as:
- A personal or family history of breast cancer
- A genetic mutation, such as BRCA1 or BRCA2
- Exposure to radiation
- Age (most commonly occurring in women over 50)
- Early onset of menstruation or late menopause
- Obesity
- Hormonal factors, such as taking hormone replacement therapy
"""
import spark.implicits._  // required for Seq(...).toDF
val data = Seq(prompt).toDF("text")
val results = pipeline.fit(data).transform(data)
results.select("completions").show(truncate = false)
Results
The age group most susceptible to breast cancer, as mentioned in the text, is women over the age of 50.
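The prompt in the example places the question first, followed by a `## Text:` marker and the context. As a minimal sketch, a small helper could assemble prompts in that layout. `build_prompt` is a hypothetical name introduced here for illustration and is not part of Healthcare NLP; the layout is inferred only from the example above, and the context is shortened.

```python
def build_prompt(question: str, context: str) -> str:
    """Assemble a prompt in the question-then-context layout of the example."""
    # Leading and trailing newlines mirror the triple-quoted prompt above.
    return f"\n{question.strip()}\n## Text:\n{context.strip()}\n"


question = "Based on the following text, what age group is most susceptible to breast cancer?"
context = """
The exact cause of breast cancer is unknown. However, several risk factors can increase your likelihood of developing breast cancer, such as:
- A personal or family history of breast cancer
- Age (most commonly occurring in women over 50)
"""

prompt = build_prompt(question, context)
print(prompt)
```

The resulting string can be passed to `spark.createDataFrame([[prompt]]).toDF("text")` exactly as in the Python example.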
Model Information
| Field         | Value                 |
|---------------|-----------------------|
| Model Name    | jsl_meds_q16_v1       |
| Compatibility | Healthcare NLP 5.5.0+ |
| License       | Licensed              |
| Edition       | Official              |
| Language      | en                    |
| Size          | 6.1 GB                |
Benchmarking
We generated a total of 400 questions, 100 from each category. These questions were labeled and reviewed by 3 physician annotators. The % columns report the preference rate as a fraction of 1 (for example, 0.24 corresponds to 24%).
More benchmark information is available here.
## Overall
| Model | Factuality % | Clinical Relevancy % | Conciseness % |
|------------|--------------|----------------------|---------------|
| JSL-MedS | 0.24 | 0.25 | 0.38 |
| GPT4o | 0.19 | 0.26 | 0.27 |
| Neutral | 0.43 | 0.36 | 0.18 |
| None | 0.14 | 0.13 | 0.17 |
| Total | 1.00 | 1.00 | 1.00 |
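One way to read the Overall table is to compare the two models directly. The sketch below copies the values from the table and computes, per metric, the margin between JSL-MedS and GPT4o and the share of decisive preferences that went to JSL-MedS. The "decisive share" is a derived figure introduced here for illustration and is not reported by the benchmark; it assumes that the Neutral and None rows both mean that neither model was preferred outright.

```python
# Preference rates copied from the "Overall" table (fractions of 1).
overall = {
    "Factuality": {"JSL-MedS": 0.24, "GPT4o": 0.19, "Neutral": 0.43, "None": 0.14},
    "Clinical Relevancy": {"JSL-MedS": 0.25, "GPT4o": 0.26, "Neutral": 0.36, "None": 0.13},
    "Conciseness": {"JSL-MedS": 0.38, "GPT4o": 0.27, "Neutral": 0.18, "None": 0.17},
}


def decisive_share(rates, model="JSL-MedS", other="GPT4o"):
    """Share of decisive preferences (Neutral and None excluded) won by `model`."""
    decisive = rates[model] + rates[other]
    return rates[model] / decisive


for metric, rates in overall.items():
    margin = rates["JSL-MedS"] - rates["GPT4o"]
    share = decisive_share(rates)
    print(f"{metric}: margin {margin:+.2f}, decisive share {share:.2f}")
```

The same dictionary layout can be filled with the values of any per-category table to repeat the comparison.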
## Summary
| Model | Factuality % | Clinical Relevancy % | Conciseness % |
|------------|--------------|----------------------|---------------|
| JSL-MedS | 0.47 | 0.48 | 0.42 |
| GPT4o | 0.25 | 0.25 | 0.25 |
| Neutral | 0.22 | 0.22 | 0.25 |
| None | 0.07 | 0.05 | 0.08 |
| Total | 1.00 | 1.00 | 1.00 |
## QA
| Model | Factuality % | Clinical Relevancy % | Conciseness % |
|------------|--------------|----------------------|---------------|
| JSL-MedS | 0.35 | 0.36 | 0.42 |
| GPT4o | 0.24 | 0.24 | 0.29 |
| Neutral | 0.33 | 0.33 | 0.18 |
| None | 0.09 | 0.07 | 0.11 |
| Total | 1.00 | 1.00 | 1.00 |
## BioMedical
| Model | Factuality % | Clinical Relevancy % | Conciseness % |
|------------|--------------|----------------------|---------------|
| JSL-MedS | 0.33 | 0.24 | 0.57 |
| GPT4o | 0.12 | 0.08 | 0.16 |
| Neutral | 0.45 | 0.57 | 0.16 |
| None | 0.10 | 0.10 | 0.10 |
| Total | 1.00 | 1.00 | 1.00 |
## OpenEnded
| Model | Factuality % | Clinical Relevancy % | Conciseness % |
|------------|--------------|----------------------|---------------|
| JSL-MedS | 0.35 | 0.30 | 0.39 |
| GPT4o | 0.30 | 0.33 | 0.41 |
| Neutral | 0.19 | 0.20 | 0.02 |
| None | 0.17 | 0.17 | 0.19 |
| Total | 1.00 | 1.00 | 1.00 |