Description
This BioBERT-based sequence classification model determines whether a clinical sentence contains metastasis-related terms.
1: Contains metastasis-related terms.
0: Does not contain metastasis-related terms.
Predicted Entities
True, False
How to use
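from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetectorDLModel, Tokenizer
from sparknlp_jsl.annotator import MedicalBertForSequenceClassification

# Wrap the raw text column in a document annotation
document_assembler = DocumentAssembler()\
    .setInputCol('text')\
    .setOutputCol('document')

# Split each document into sentences
sentence_detector = SentenceDetectorDLModel.pretrained("sentence_detector_dl_healthcare","en","clinical/models")\
    .setInputCols(["document"])\
    .setOutputCol("sentence")

tokenizer = Tokenizer()\
    .setInputCols(['sentence'])\
    .setOutputCol('token')

# Classify each sentence as 1 (metastasis-related terms) or 0
sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_metastasis","en","clinical/models")\
    .setInputCols(["sentence",'token'])\
    .setOutputCol("prediction")

pipeline = Pipeline(stages=[
    document_assembler,
    sentence_detector,
    tokenizer,
    sequenceClassifier
])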
sample_texts = [
    ["Contrast MRI confirmed the findings of meningeal carcinomatosis."],
    ["A 62-year-old male presents with weight loss, persistent cough, and episodes of hemoptysis."],
    ["The primary tumor (T) is staged as T3 due to its size and local invasion, there is no nodal involvement (N0), and due to multiple bone and liver lesions, it is classified as M1, reflecting distant metastatic foci."],
    ["After all procedures done and reviewing the findings, biochemical results and screening, the TNM classification is determined."],
    ["The oncologist noted that the tumor had spread to the liver, indicating advanced stage cancer."],
    ["The patient's care plan is adjusted to focus on symptom management and slowing the progression of the disease."],
]

# `spark` is an active SparkSession started with Healthcare NLP
sample_data = spark.createDataFrame(sample_texts).toDF("text")
result = pipeline.fit(sample_data).transform(sample_data)
result.select("text", "prediction.result").show(truncate=False)
// Scala. MedicalBertForSequenceClassification comes from the licensed Healthcare NLP library.
import org.apache.spark.ml.Pipeline
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._

// setInputCol takes a single column name, not an Array
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val sentenceDetector = SentenceDetectorDLModel.pretrained("sentence_detector_dl_healthcare","en","clinical/models")
  .setInputCols(Array("document"))
  .setOutputCol("sentence")

val tokenizer = new Tokenizer()
  .setInputCols(Array("sentence"))
  .setOutputCol("token")

val sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_metastasis","en","clinical/models")
  .setInputCols(Array("sentence", "token"))
  .setOutputCol("prediction")

val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  sentenceDetector,
  tokenizer,
  sequenceClassifier
))
// toDF on a Seq needs the session implicits; `spark` is an active SparkSession
import spark.implicits._

// One string per row, so that "text" is a string column rather than a single array-typed row
val data = Seq(
  "Contrast MRI confirmed the findings of meningeal carcinomatosis.",
  "A 62-year-old male presents with weight loss, persistent cough, and episodes of hemoptysis.",
  "The primary tumor (T) is staged as T3 due to its size and local invasion, there is no nodal involvement (N0), and due to multiple bone and liver lesions, it is classified as M1, reflecting distant metastatic foci.",
  "After all procedures done and reviewing the findings, biochemical results and screening, the TNM classification is determined.",
  "The oncologist noted that the tumor had spread to the liver, indicating advanced stage cancer.",
  "The patient's care plan is adjusted to focus on symptom management and slowing the progression of the disease."
).toDF("text")

val result = pipeline.fit(data).transform(data)
result.select("text", "prediction.result").show(false)
Results
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+
|text |result|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+
|Contrast MRI confirmed the findings of meningeal carcinomatosis. |[1] |
|A 62-year-old male presents with weight loss, persistent cough, and episodes of hemoptysis. |[0] |
|The primary tumor (T) is staged as T3 due to its size and local invasion, there is no nodal involvement (N0), and due to multiple bone and liver lesions, it is classified as M1, reflecting distant metastatic foci.|[1] |
|After all procedures done and reviewing the findings, biochemical results and screening, the TNM classification is determined. |[0] |
|The oncologist noted that the tumor had spread to the liver, indicating advanced stage cancer. |[1] |
|The patient's care plan is adjusted to focus on symptom management and slowing the progression of the disease. |[0] |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+
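Each `result` value in the table is a list with one label per detected sentence, so a text with several sentences can carry several labels. The following is a minimal sketch of how those lists might be post-processed in plain Python. It assumes the labels arrive as the strings "1" and "0", as displayed above; `summarize_predictions` and `LABEL_MEANING` are hypothetical names for illustration and are not part of Spark NLP or Healthcare NLP.

```python
# Hypothetical post-processing helper; not part of Spark NLP or Healthcare NLP.
# Assumption: labels are "1" / "0", as shown in the Results table.
LABEL_MEANING = {
    "1": "contains metastasis-related terms",
    "0": "does not contain metastasis-related terms",
}


def summarize_predictions(rows):
    """Summarize (text, labels) pairs.

    `rows` is an iterable of (text, labels) tuples, where `labels` is the
    list taken from `prediction.result` (one label per sentence).
    """
    summary = []
    for text, labels in rows:
        labels = [str(label) for label in labels]
        summary.append({
            "text": text,
            "labels": labels,
            # A text is flagged if any of its sentences is labeled "1".
            "has_metastasis_terms": "1" in labels,
            "meanings": [LABEL_MEANING.get(label, "unknown label") for label in labels],
        })
    return summary


rows = [
    ("Contrast MRI confirmed the findings of meningeal carcinomatosis.", ["1"]),
    ("A 62-year-old male presents with weight loss, persistent cough, and episodes of hemoptysis.", ["0"]),
]
for item in summarize_predictions(rows):
    print(item["has_metastasis_terms"], "-", item["text"])
```

With a live pipeline, the rows could come from `result.select("text", "prediction.result").collect()`, since each collected Row unpacks like a tuple.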
Model Information
| Model Name: | bert_sequence_classifier_metastasis |
| Compatibility: | Healthcare NLP 5.4.0+ |
| License: | Licensed |
| Edition: | Official |
| Input Labels: | [document, token] |
| Output Labels: | [prediction] |
| Language: | en |
| Size: | 406.4 MB |
| Case sensitive: | false |
| Max sentence length: | 512 |
Benchmarking
       label  precision    recall  f1-score   support
           0     0.9979    0.9986    0.9983      4357
           1     0.9944    0.9916    0.9930      1072
    accuracy          -         -    0.9972      5429
   macro-avg     0.9962    0.9951    0.9956      5429
weighted-avg     0.9972    0.9972    0.9972      5429
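To make the relationship between the rows explicit, here is a small sketch that recomputes the derived figures from the per-class rows: F1 as the harmonic mean of precision and recall, macro-avg as the unweighted mean over the two labels, and weighted-avg as the support-weighted mean. The function names are illustrative only. Because the inputs are already rounded to four decimals, the recomputed values should be expected to match the table only to within rounding in the last digit.

```python
# Per-class values copied from the benchmark table above.
report = {
    "0": {"precision": 0.9979, "recall": 0.9986, "f1": 0.9983, "support": 4357},
    "1": {"precision": 0.9944, "recall": 0.9916, "f1": 0.9930, "support": 1072},
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_avg(metric):
    """Unweighted mean of a metric over the labels."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_avg(metric):
    """Mean of a metric over the labels, weighted by support."""
    total = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total


for label, row in report.items():
    print("f1 for label", label, round(f1_score(row["precision"], row["recall"]), 4))
for metric in ("precision", "recall", "f1"):
    print(metric, "macro:", round(macro_avg(metric), 4), "weighted:", round(weighted_avg(metric), 4))
```

For single-label classification, accuracy equals the support-weighted recall, which is why the accuracy row and the weighted-avg recall show the same value.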