Spark OCR release notes 3.14.0

 

3.14.0

Release date: 13-06-2022

Overview

We are glad to announce that Spark OCR 3.14.0 has been released! This release focuses on Visual Document Classification models, native Image Preprocessing on the JVM, and bug fixes.

New Features

  • VisualDocumentClassifierv2:
    • New annotator for classifying documents based on multimodal (text + image) features.
  • VisualDocumentClassifierv3:
    • New annotator for classifying documents based on image features.
  • ImageTransformer:
    • New transformer that provides different image transformations on the JVM. Supported transforms are Scaling, Adaptive Thresholding, Median Blur, Dilation, Erosion, and Object Removal.
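For readers unfamiliar with the transforms listed above, the following is a minimal, library-free sketch of what dilation, erosion, median blur, and adaptive thresholding compute on a small grayscale image stored as a list of rows. It is an illustration only and does not use or reproduce the ImageTransformer API: every function name and parameter here (`dilate`, `erode`, `median_blur`, `adaptive_threshold`, `k`, `offset`) is hypothetical, and the border handling (windows clipped at the image edge) is an assumption that may differ from the JVM implementation.

```python
from statistics import median


def _window(img, r, c, k=1):
    """Pixel values in the (2k+1)x(2k+1) neighbourhood of (r, c), clipped at the borders."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - k), min(rows, r + k + 1))
            for j in range(max(0, c - k), min(cols, c + k + 1))]


def _apply(img, fn, k=1):
    """Replace every pixel with fn(neighbourhood values)."""
    return [[fn(_window(img, r, c, k)) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img, k=1):
    # Dilation: each pixel takes the maximum of its neighbourhood (bright regions grow).
    return _apply(img, max, k)


def erode(img, k=1):
    # Erosion: each pixel takes the minimum of its neighbourhood (bright regions shrink).
    return _apply(img, min, k)


def median_blur(img, k=1):
    # Median blur: each pixel takes the neighbourhood median, which suppresses isolated specks.
    return _apply(img, lambda w: int(median(w)), k)


def adaptive_threshold(img, k=1, offset=0):
    # Adaptive thresholding (mean variant, assumed here): a pixel becomes 1 when it is
    # brighter than its local mean minus `offset`, otherwise 0.
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            w = _window(img, r, c, k)
            local_mean = sum(w) / len(w)
            row.append(1 if img[r][c] > local_mean - offset else 0)
        out.append(row)
    return out


# A 5x5 binary image with a single bright pixel in the centre.
speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 1

grown = dilate(speck)        # the speck grows into a 3x3 block
restored = erode(grown)      # eroding the block shrinks it back to one pixel
cleaned = median_blur(speck) # the isolated speck is removed entirely
```

Chaining `dilate` and then `erode` is the classic "closing" operation, while `median_blur` is a common way to remove salt-and-pepper noise from scanned pages before OCR.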

New notebooks
