3.14.0
Release date: 13-06-2022
Overview
We are glad to announce that Spark OCR 3.14.0 has been released! This release focuses on Visual Document Classification models, native image preprocessing on the JVM, and bug fixes.
New Features
- VisualDocumentClassifierv2:
  - New annotator for classifying documents based on multimodal (text + image) features.
- VisualDocumentClassifierv3:
  - New annotator for classifying documents based on image features only.
- ImageTransformer:
  - New transformer that provides image transformations on the JVM. Supported transforms are Scaling, Adaptive Thresholding, Median Blur, Dilation, Erosion, and Object Removal.
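For readers unfamiliar with the transforms listed for ImageTransformer, the following is a minimal pure-Python sketch of what three of them (Dilation, Erosion, Median Blur) do to a small binary image. It is not the ImageTransformer API, and it makes no claim about how Spark OCR implements these operations on the JVM; the function names, the fixed 3x3 window, and the border handling (windows are clipped at the image edge) are all assumptions chosen for illustration.

```python
from statistics import median_low

# Hypothetical helpers for illustration only; not part of Spark OCR.

def _window(img, r, c):
    """Collect the pixel values in the 3x3 window centred on (r, c), clipped at the borders."""
    rows, cols = len(img), len(img[0])
    return [
        img[rr][cc]
        for rr in range(max(0, r - 1), min(rows, r + 2))
        for cc in range(max(0, c - 1), min(cols, c + 2))
    ]

def _apply(img, reducer):
    """Build a new image where each pixel is reducer(3x3 window) of the input."""
    return [
        [reducer(_window(img, r, c)) for c in range(len(img[0]))]
        for r in range(len(img))
    ]

def dilate(img):
    """Dilation: each pixel becomes the maximum of its window (foreground grows)."""
    return _apply(img, max)

def erode(img):
    """Erosion: each pixel becomes the minimum of its window (foreground shrinks)."""
    return _apply(img, min)

def median_blur(img):
    """Median blur: each pixel becomes the (low) median of its window."""
    return _apply(img, median_low)

# A 5x5 image with a single foreground pixel in the centre.
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1

grown = dilate(speck)        # the single pixel grows into a 3x3 block
restored = erode(grown)      # eroding the block shrinks it back to one pixel
denoised = median_blur(speck)  # the isolated pixel is removed entirely
```

In document preprocessing, dilation and erosion are typically used to thicken or thin strokes, while a median blur suppresses isolated specks such as scanner noise before OCR.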
New Notebooks
- SparkOCRVisualDocumentClassifierv2.ipynb, example of Visual Document Classification using multimodal (text + visual) features.
- SparkOCRVisualDocumentClassifierv3.ipynb, example of Visual Document Classification using only visual features.
- SparkOCRCPUImageOperations.ipynb, example of ImageTransformer.