Generic Classifier for Oncology


This model is an oncology classifier that determines whether a clinical sentence contains oncology-related terms.

  • True: Contains oncology-related terms.
  • False: Does not contain oncology-related terms.

Predicted Entities

True, False

How to use

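# Assumes an active Spark session (`spark`) started with Healthcare NLP.
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, WordEmbeddingsModel, SentenceEmbeddings
from sparknlp_jsl.base import FeaturesAssembler
from sparknlp_jsl.annotator import GenericClassifierModel

# Intermediate column names only need to match between consecutive stages.
document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

tokenizer = Tokenizer()\
    .setInputCols(["document"])\
    .setOutputCol("token")

word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical","en","clinical/models")\
    .setInputCols(["document", "token"])\
    .setOutputCol("word_embeddings")

sentence_embeddings = SentenceEmbeddings()\
    .setInputCols(["document", "word_embeddings"])\
    .setOutputCol("sentence_embeddings")

features_asm = FeaturesAssembler()\
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("features")

generic_classifier = GenericClassifierModel.pretrained("generic_classifier_oncology","en","clinical/models")\
    .setInputCols(["features"])\
    .setOutputCol("prediction")

clf_Pipeline = Pipeline(stages=[
    document_assembler, tokenizer, word_embeddings,
    sentence_embeddings, features_asm, generic_classifier])

data = spark.createDataFrame([
["The patient was diagnosed with a malignant tumor, and surgery was promptly scheduled to remove the mass."],
["Following this adjustment, the patient's ECG remained in sinus rhythm, with heart rates varying between 45 and 70 bpm and no significant QTc prolongation."],
["During the treatment review, the oncologist discussed the progression of metastases from the primary lesion to nearby lymph nodes."],
["Functional MRI (fMRI) showed increased activation in the motor cortex during the finger-tapping task."]
]).toDF("text")

result = clf_Pipeline.fit(data).transform(data)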

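// Assumes an active SparkSession named `spark` and the Spark NLP / Healthcare NLP imports in scope.
import org.apache.spark.ml.Pipeline
import spark.implicits._

val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical","en","clinical/models")
  .setInputCols(Array("document", "token"))
  .setOutputCol("word_embeddings")

val sentence_embeddings = new SentenceEmbeddings()
  .setInputCols(Array("document", "word_embeddings"))
  .setOutputCol("sentence_embeddings")

val features_asm = new FeaturesAssembler()
  .setInputCols(Array("sentence_embeddings"))
  .setOutputCol("features")

val generic_classifier = GenericClassifierModel.pretrained("generic_classifier_oncology","en","clinical/models")
  .setInputCols(Array("features"))
  .setOutputCol("prediction")

val clf_Pipeline = new Pipeline().setStages(Array(
  documentAssembler, tokenizer, word_embeddings,
  sentence_embeddings, features_asm, generic_classifier))

val data = Seq(
"The patient was diagnosed with a malignant tumor, and surgery was promptly scheduled to remove the mass.",
"Following this adjustment, the patient's ECG remained in sinus rhythm, with heart rates varying between 45 and 70 bpm and no significant QTc prolongation.",
"During the treatment review, the oncologist discussed the progression of metastases from the primary lesion to nearby lymph nodes.",
"Functional MRI (fMRI) showed increased activation in the motor cortex during the finger-tapping task."
).toDF("text")

val result = clf_Pipeline.fit(data).transform(data)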


Results

|text                                                                                                                                                      |result |
|The patient was diagnosed with a malignant tumor, and surgery was promptly scheduled to remove the mass.                                                  | True  |
|Following this adjustment, the patient's ECG remained in sinus rhythm, with heart rates varying between 45 and 70 bpm and no significant QTc prolongation.| False |
|During the treatment review, the oncologist discussed the progression of metastases from the primary lesion to nearby lymph nodes.                        | True  |
|Functional MRI (fMRI) showed increased activation in the motor cortex during the finger-tapping task.                                                     | False |
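
Spark NLP annotation results are strings, so the labels arrive as "True" and "False" rather than as booleans. The following is a minimal post-processing sketch in plain Python, assuming the predictions have already been collected into (text, label) pairs. The `rows` list and the `is_oncology` helper are hypothetical names used for illustration, not part of the library.

```python
def is_oncology(label: str) -> bool:
    """Map the classifier's string label to a Python bool."""
    return label.strip() == "True"


# Hypothetical collected output: (sentence, predicted label) pairs
# mirroring the table above.
rows = [
    ("The patient was diagnosed with a malignant tumor, and surgery was "
     "promptly scheduled to remove the mass.", "True"),
    ("Following this adjustment, the patient's ECG remained in sinus rhythm, "
     "with heart rates varying between 45 and 70 bpm and no significant QTc "
     "prolongation.", "False"),
    ("During the treatment review, the oncologist discussed the progression "
     "of metastases from the primary lesion to nearby lymph nodes.", "True"),
    ("Functional MRI (fMRI) showed increased activation in the motor cortex "
     "during the finger-tapping task.", "False"),
]

# Keep only the sentences flagged as oncology-related.
oncology_sentences = [text for text, label in rows if is_oncology(label)]

for sentence in oncology_sentences:
    print(sentence)
```

Comparing against the string "True" explicitly avoids the common pitfall of calling `bool("False")`, which evaluates to `True` for any non-empty string.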

Model Information

Model Name: generic_classifier_oncology
Compatibility: Healthcare NLP 5.4.0+
License: Licensed
Edition: Official
Input Labels: [features]
Output Labels: [prediction]
Language: en
Size: 1.5 MB


Benchmarking

       label  precision    recall  f1-score   support
       False       0.90      0.86      0.88      2093
        True       0.89      0.93      0.91      2714
    accuracy          -         -      0.90      4807
   macro-avg       0.90      0.89      0.89      4807
weighted-avg       0.90      0.90      0.90      4807
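
As a sanity check on how the columns relate, the F1 scores and the accuracy can be recomputed from the per-class precision, recall, and support. This is a sketch that uses only the rounded figures from the table above, so it illustrates the formulas rather than reproducing the original evaluation; the variable names are illustrative.

```python
# Per-class precision, recall, and support copied from the table above.
metrics = {
    "False": {"precision": 0.90, "recall": 0.86, "support": 2093},
    "True": {"precision": 0.89, "recall": 0.93, "support": 2714},
}


def f1(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total = sum(m["support"] for m in metrics.values())
f1_scores = {label: f1(m["precision"], m["recall"]) for label, m in metrics.items()}

# Macro average: unweighted mean over classes.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)
# Weighted average: mean over classes, weighted by support.
weighted_f1 = sum(f1_scores[k] * metrics[k]["support"] for k in metrics) / total
# Accuracy equals the support-weighted mean of per-class recall.
accuracy = sum(m["recall"] * m["support"] for m in metrics.values()) / total

for label, score in f1_scores.items():
    print(f"F1 ({label}): {score:.2f}")
print(f"macro F1: {macro_f1:.2f}")
print(f"weighted F1: {weighted_f1:.2f}")
print(f"accuracy: {accuracy:.2f}")
```

The recomputed F1 values (0.88 and 0.91), macro F1 (0.89), weighted F1 (0.90), and accuracy (0.90) agree with the table. Other averaged cells may differ in the last digit when recomputed this way, because the inputs here are already rounded to two decimals.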