Spark OCR release notes 4.0.2

4.0.2

Release date: 12-09-2022

Overview

We are glad to announce that Spark OCR 4.0.2 has been released! This release comes with new features, improvements, and fixes.

New Features

  • VisualDocumentClassifierV2 is now trainable! Continuing the effort to make the most useful models easily trainable, we added training capabilities to this annotator.
  • Added support for Simplified Chinese.
  • Added a new ‘PdfToForm’ annotator, capable of extracting forms from digital PDFs. Unlike the previously introduced VisualDocumentNER annotator, which handles scanned forms, PdfToForm works only on digital documents. The two annotators are complementary.

Improvements

  • Support for multi-frame DICOM has been added.
  • Added the missing load() method in ImageToTextV2.

New Notebooks
