Generic Classifier for Therapy

Description

This model is a therapy classification model that determines whether clinical sentences include terms related to therapy.

  • True: Contains therapy related terms.
  • False: Doesn’t contain therapy related terms.

Predicted Entities

True, False

Copy S3 URI

How to use


document_assembler = DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

tokenizer = Tokenizer()\
    .setInputCols("document")\
    .setOutputCol("token")

word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical","en","clinical/models")\
    .setInputCols(["document","token"])\
    .setOutputCol("word_embeddings")

sentence_embeddings = SentenceEmbeddings()\
    .setInputCols(["document", "word_embeddings"])\
    .setOutputCol("sentence_embeddings")\
    .setPoolingStrategy("AVERAGE")

features_asm = FeaturesAssembler()\
    .setInputCols(["sentence_embeddings"])\
    .setOutputCol("features")

generic_classifier = GenericClassifierModel.pretrained("generic_classifier_therapy","en","clinical/models")\
    .setInputCols(["features"])\
    .setOutputCol("prediction")

clf_Pipeline = Pipeline(
  stages=[
    document_assembler,
    tokenizer,
    word_embeddings,
    sentence_embeddings,
    features_asm,
    generic_classifier])

data = spark.createDataFrame([
['Patient was briefly started on Arimidex due to significant complaints of hot flashes and joint pain, for which these symptoms increased while on Arimidex.'],
['The patient continues to report pain more at the base of her skull radiating to the top of the head and into the jaws.'],
['Post-Surgical Finding: There are post-surgical findings from a previous lumpectomy with radiation seen in the left breast.'],
['Significant narrowing of the disc with tiny osteophyte and bulging disc producing mild ventral impress and minimal facet degenerative change.']]).toDF("text")

result = clf_Pipeline.fit(data).transform(data)


val documentAssembler = new DocumentAssembler()
  .setInputCol(Array("text"))
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")

val word_embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical","en","clinical/models")
  .setInputCols(Array("document","token"))
  .setOutputCol("word_embeddings")

val sentence_embeddings = new SentenceEmbeddings()
  .setInputCols(Array("document", "word_embeddings"))
  .setOutputCol("sentence_embeddings")
  .setPoolingStrategy("AVERAGE")

val features_asm = new FeaturesAssembler()
  .setInputCols(Array("sentence_embeddings"))
  .setOutputCol("features")

val generic_classifier = GenericClassifierModel.pretrained("generic_classifier_therapy","en","clinical/models")
  .setInputCols(Array("features"))
  .setOutputCol("prediction")

val clf_Pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  tokenizer,
  word_embeddings,
  sentence_embeddings,
  features_asm,
  generic_classifier
))


val data = Seq([
["Patient was briefly started on Arimidex due to significant complaints of hot flashes and joint pain, for which these symptoms increased while on Arimidex."],
["The patient continues to report pain more at the base of her skull radiating to the top of the head and into the jaws."],
["Post-Surgical Finding: There are post-surgical findings from a previous lumpectomy with radiation seen in the left breast."],
["Significant narrowing of the disc with tiny osteophyte and bulging disc producing mild ventral impress and minimal facet degenerative change."]]).toDF("text")

val result = clf_Pipeline.fit(data).transform(data)

Results


+----------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
|text                                                                                                                                                      |result |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
|Patient was briefly started on Arimidex due to significant complaints of hot flashes and joint pain, for which these symptoms increased while on Arimidex.| True  |
|The patient continues to report pain more at the base of her skull radiating to the top of the head and into the jaws.                                    | False |
|Post-Surgical Finding: There are post-surgical findings from a previous lumpectomy with radiation seen in the left breast.                                | True  |
|Significant narrowing of the disc with tiny osteophyte and bulging disc producing mild ventral impress and minimal facet degenerative change.             | False |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+-------+

Model Information

Model Name: generic_classifier_therapy
Compatibility: Healthcare NLP 5.4.0+
License: Licensed
Edition: Official
Input Labels: [features]
Output Labels: [prediction]
Language: en
Size: 1.5 MB

Benchmarking

       label  precision    recall  f1-score   support
       False       0.94      0.92      0.93      3044
        True       0.86      0.90      0.88      1607
    accuracy          -         -      0.92      4651
   macro-avg       0.90      0.91      0.91      4651
weighted-avg       0.92      0.92      0.92      4651