Description
This is a BERT-based model for classifying sections of clinical documents. It is trained on section text that does not include the section header, e.g., the output of splitting a document with the ChunkSentenceSplitter annotator with the parameter setInsertChunk=False.
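To make "without the section header" concrete, here is a minimal plain-Python sketch of the kind of input the model expects. It is not the ChunkSentenceSplitter implementation; the header pattern `HEADER_RE`, the function `split_headless_sections`, and the sample note are hypothetical and only illustrate the effect of dropping header lines from each section.

```python
import re
from typing import List

# Hypothetical header pattern (an assumption for this sketch): a capitalized
# phrase followed by a colon, alone on its own line.
HEADER_RE = re.compile(r"^[A-Z][A-Za-z ]+:[ \t]*$", re.MULTILINE)


def split_headless_sections(document: str) -> List[str]:
    """Split a note at header lines and return the section bodies only."""
    matches = list(HEADER_RE.finditer(document))
    sections = []
    for i, match in enumerate(matches):
        body_start = match.end()  # skip the header line itself
        body_end = matches[i + 1].start() if i + 1 < len(matches) else len(document)
        body = document[body_start:body_end].strip()
        if body:
            sections.append(body)
    return sections


# Illustrative note, not taken from any dataset.
note = (
    "History of Present Illness:\n"
    "Patient presented with stomach pain and worsening distension.\n"
    "Discharge Instructions:\n"
    "Please take these medications daily and eat a low salt diet.\n"
)

sections = split_headless_sections(note)
for section in sections:
    print(section)
```

Each printed string is a section body with its header removed, which matches the form of text this model was trained on.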
Predicted Entities
Consultation and Referral, Habits, Complications and Risk Factors, Diagnostic and Laboratory Data, Discharge Information, History, Impression, Patient Information, Procedures, Other
How to use
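# Python (Spark NLP for Healthcare); assumes an active Spark session named `spark`
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer
from sparknlp_jsl.annotator import MedicalBertForSequenceClassification

# Wrap the raw "text" column as Spark NLP documents
document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")
# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")
# Load the pretrained section classifier
sequence_classifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_clinical_sections_headless_onnx", "en", "clinical/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")
pipeline = Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequence_classifier
])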
example_df = spark.createDataFrame(
[["""It was a pleasure taking care of you! You came to us with
stomach pain and worsening distension. While you were here we
did a paracentesis to remove 1.5L of fluid from your belly. We
also placed you on you 40 mg of Lasix and 50 mg of Aldactone to
help you urinate the excess fluid still in your belly. As we
discussed, everyone has a different dose of lasix required to
make them urinate and it's likely that you weren't taking a high
enough dose. Please take these medications daily to keep excess
fluid off and eat a low salt diet. You will follow up with Dr.
___ in liver clinic and from there have your colonoscopy
and EGD scheduled. """]]).toDF("text")
model = pipeline.fit(example_df)
result = model.transform(example_df)
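# Python (johnsnowlabs library); assumes an active Spark session named `spark`
from johnsnowlabs import nlp, medical

# Wrap the raw "text" column as Spark NLP documents
document_assembler = nlp.DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")
# Split each document into tokens
tokenizer = nlp.Tokenizer() \
    .setInputCols(["document"]) \
    .setOutputCol("token")
# Load the pretrained section classifier
sequenceClassifier = medical.BertForSequenceClassification.pretrained("bert_sequence_classifier_clinical_sections_headless_onnx", "en", "clinical/models") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("class")
pipeline = nlp.Pipeline(stages=[
    document_assembler,
    tokenizer,
    sequenceClassifier
])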
example_df = spark.createDataFrame(
[["""It was a pleasure taking care of you! You came to us with
stomach pain and worsening distension. While you were here we
did a paracentesis to remove 1.5L of fluid from your belly. We
also placed you on you 40 mg of Lasix and 50 mg of Aldactone to
help you urinate the excess fluid still in your belly. As we
discussed, everyone has a different dose of lasix required to
make them urinate and it's likely that you weren't taking a high
enough dose. Please take these medications daily to keep excess
fluid off and eat a low salt diet. You will follow up with Dr.
___ in liver clinic and from there have your colonoscopy
and EGD scheduled. """]]).toDF("text")
model = pipeline.fit(example_df)
result = model.transform(example_df)
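// Scala; assumes an active SparkSession named `spark`.
// MedicalBertForSequenceClassification comes from the licensed Healthcare NLP library
// and must be imported from it as well.
import org.apache.spark.ml.Pipeline
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import spark.implicits._ // needed for .toDF on a Seq

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")
val tokenizer = new Tokenizer()
  .setInputCols(Array("document"))
  .setOutputCol("token")
val sequenceClassifier = MedicalBertForSequenceClassification.pretrained("bert_sequence_classifier_clinical_sections_headless_onnx", "en", "clinical/models")
  .setInputCols(Array("document", "token"))
  .setOutputCol("class")
val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, sequenceClassifier))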
val data = Seq("""It was a pleasure taking care of you! You came to us with
stomach pain and worsening distension. While you were here we
did a paracentesis to remove 1.5L of fluid from your belly. We
also placed you on you 40 mg of Lasix and 50 mg of Aldactone to
help you urinate the excess fluid still in your belly. As we
discussed, everyone has a different dose of lasix required to
make them urinate and it's likely that you weren't taking a high
enough dose. Please take these medications daily to keep excess
fluid off and eat a low salt diet. You will follow up with Dr.
___ in liver clinic and from there have your colonoscopy
and EGD scheduled. """).toDF("text")
val model = pipeline.fit(data)
val result = model.transform(data)
Results
+-----------------------+
|result                 |
+-----------------------+
|[Discharge Information]|
+-----------------------+
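The table above shows the `result` field of the classifier's output column. With the pipelines above, that field would typically be reached with something like `result.select("class.result")`. As a minimal sketch of working with the predictions afterwards, the plain-Python helper below takes annotations already collected as dictionaries and checks each label against the predicted entities listed earlier. The helper name `extract_labels` and the sample annotation are hypothetical, and only the `result` field is assumed to be present.

```python
from typing import Dict, List

# The ten section labels listed under "Predicted Entities".
PREDICTED_ENTITIES = [
    "Consultation and Referral",
    "Habits",
    "Complications and Risk Factors",
    "Diagnostic and Laboratory Data",
    "Discharge Information",
    "History",
    "Impression",
    "Patient Information",
    "Procedures",
    "Other",
]


def extract_labels(annotations: List[Dict]) -> List[str]:
    """Return the predicted label of each annotation, rejecting unknown labels."""
    labels = [annotation["result"] for annotation in annotations]
    unknown = [label for label in labels if label not in PREDICTED_ENTITIES]
    if unknown:
        raise ValueError(f"Unexpected labels: {unknown}")
    return labels


# Hypothetical collected output for one row (illustrative values only).
row_annotations = [{"result": "Discharge Information", "metadata": {}}]

labels = extract_labels(row_annotations)
print(labels)
```

Validating against the fixed label list is a design choice for this sketch: it surfaces a mismatched model or output column early instead of passing unexpected strings downstream.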
Model Information
Model Name: bert_sequence_classifier_clinical_sections_headless_onnx
Compatibility: Healthcare NLP 6.1.1+
License: Licensed
Edition: Official
Input Labels: [document, token]
Output Labels: [label]
Language: en
Size: 437.7 MB
Case sensitive: true