Description
This is a ViT (Vision Transformer) model for classifying DICOM images. It was trained in-house on several corpora, including DICOM, COCO, and in-house annotated documents.
You can use this model to classify inputs as images or document notes, and then use Visual NLP to extract information using the layout and text features.
Predicted Entities
'image', 'document_notes', 'others'
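The labels above lend themselves to a simple routing step before any Visual NLP extraction. The following is a minimal sketch in plain Python, not part of the model or the library: the three labels come from this page, while the step names and the `next_step` helper are hypothetical placeholders for whatever a given pipeline does next.

```python
# Labels are taken from the "Predicted Entities" list above.
# The step names on the right are hypothetical placeholders.
NEXT_STEP = {
    "document_notes": "run_ocr_and_extraction",
    "image": "keep_as_pixel_data",
    "others": "send_to_manual_review",
}


def next_step(label: str) -> str:
    """Return the follow-up step for a predicted label."""
    try:
        return NEXT_STEP[label]
    except KeyError:
        # Fail loudly on a label this page does not list.
        raise ValueError(f"unexpected label: {label!r}") from None


for label in ("image", "document_notes", "others"):
    print(label, "->", next_step(label))
```

Raising on an unknown label, rather than falling back to a default, makes a mismatch between the model's output and the routing table visible early.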
How to use
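# Assumes the johnsnowlabs package, which exposes these classes under `nlp`,
# and an active, licensed Spark session named `spark`.
from johnsnowlabs import nlp

# Convert the Spark image column into the annotation format the classifier expects.
image_assembler = nlp.ImageAssembler() \
    .setInputCol("image") \
    .setOutputCol("image_assembler")

# Load the pretrained DICOM classifier.
image_classifier = nlp.ViTForImageClassification.pretrained("visualclf_vit_dicom", "en", "clinical/ocr") \
    .setInputCols(["image_assembler"]) \
    .setOutputCol("class")

pipeline = nlp.Pipeline().setStages([
    image_assembler,
    image_classifier
])

# Read the image with Spark's image data source, dropping unreadable files.
test_image = spark.read \
    .format("image") \
    .option("dropInvalid", True) \
    .load("./dicom.JPEG")

result = pipeline.fit(test_image).transform(test_image)
result.select("class.result").show(1, False)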
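When more than one image is classified, it can help to summarize the predictions. The sketch below assumes each row of `class.result` arrives as a list of label strings (the usual shape of an annotation `result` column); `count_labels` and `sample_rows` are hypothetical names used only for illustration, and the sample rows are hand-written stand-ins, not model output.

```python
from collections import Counter

# Labels listed under "Predicted Entities" on this page.
KNOWN_LABELS = {"image", "document_notes", "others"}


def count_labels(rows):
    """Count predicted labels; each row is one list taken from `class.result`."""
    counts = Counter()
    for labels in rows:
        for label in labels:
            if label not in KNOWN_LABELS:
                raise ValueError(f"unexpected label: {label!r}")
            counts[label] += 1
    return counts


# With Spark available, the rows could be collected along these lines:
#   rows = [r["result"] for r in result.select("class.result").collect()]
# Hand-written stand-in rows so the sketch runs without Spark:
sample_rows = [["image"], ["document_notes"], ["image"]]
print(count_labels(sample_rows))
```

Collecting to the driver is only reasonable for small batches; for large datasets the same count could be done with a Spark aggregation instead.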
Model Information
| Model Name: | visualclf_vit_dicom |
| Compatibility: | Healthcare NLP 5.0.1+ |
| License: | Licensed |
| Edition: | Official |
| Input Labels: | [image_assembler] |
| Output Labels: | [class] |
| Language: | en |
| Size: | 321.6 MB |