Annotating text included in image documents (e.g. scanned documents) is a common use case in many verticals but comes with several challenges. With the new Visual NER Labeling config, we aim to ease the work of annotators by allowing them to simply select text from an image and assign the corresponding label to it. This feature is powered by Spark OCR 3.5.0; thus a valid Spark OCR license is required to get access to it.
Here is how this can be used:
- Upload a valid Spark OCR license. See how to do this here.
- Create a new project, specify a name for your project, add team members if necessary, and from the list of predefined templates (Default Project Configs) choose “Visual NER Labeling”.
- Update the configuration if necessary, for example to use labels other than the predefined ones. Click the save button. While the project is being saved, a confirmation dialog lets you know that the Spark OCR pipeline for Visual NER is being deployed.
- Import the tasks you want to annotate (images).
- Start annotating text on top of the image by clicking on the text tokens or by drawing bounding boxes on top of chunks or image areas.
- Export annotations in your preferred format.
The entire process is illustrated below:
Support for multi-page PDF documents
When a valid Spark OCR license is available, Annotation Lab offers support for multi-page PDF annotation. The complete flow of import, annotation, and export for multi-page PDF files is currently supported.
Users have two options for importing a new PDF file into the Visual NER project:
- Import PDF file from local storage;
- Add a link to the PDF file in the file attribute.
After import, the task becomes available on the Tasks Page. The title of the new task is the name of the imported file.
On the labeling page, the PDF file is displayed with pagination so that annotators can annotate the document one page at a time.
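The link-based import option can be prepared with a short script. The sketch below is only an illustration: the `file` attribute comes from the description above, while the helper name `build_pdf_tasks`, the one-object-per-task JSON layout, and the example URL are assumptions that should be checked against the import format of your Annotation Lab version.

```python
import json
from urllib.parse import urlparse


def build_pdf_tasks(pdf_urls):
    """Build one task per PDF link, storing the link in the `file` attribute.

    Hypothetical helper: the list-of-objects layout is an assumption,
    not a confirmed Annotation Lab import schema.
    """
    tasks = []
    for url in pdf_urls:
        parsed = urlparse(url)
        # Accept only web links; local files are imported through the UI instead.
        if parsed.scheme not in ("http", "https"):
            raise ValueError(f"not an http(s) link: {url}")
        # A light sanity check on the extension; it does not inspect the file itself.
        if not parsed.path.lower().endswith(".pdf"):
            raise ValueError(f"not a PDF link: {url}")
        tasks.append({"file": url})
    return tasks


if __name__ == "__main__":
    # example.com is a placeholder host used for illustration only.
    tasks = build_pdf_tasks(["https://example.com/docs/invoice.pdf"])
    print(json.dumps(tasks, indent=2))
```

Validating the links before import is a design choice made here to catch typos early; the resulting JSON would then be uploaded through the Import page like any other task file.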
OCR and Visual NER servers
As with preannotation servers, Annotation Lab 3.0.0 supports the deployment of multiple OCR servers. If a user has uploaded a Spark OCR license, be it airgap or floating, OCR inference is enabled.
To create a Visual NER project, users have to deploy at least one OCR server. Any OCR server can perform preannotation. To select the OCR server, users go to the Import page, toggle the OCR option, and choose one of the available OCR servers from the popup. If no suitable OCR server is available, one can be created by choosing the “Create Server” option.