5.3.2
Release date: 15-05-2024
Visual NLP 5.3.2 Release Notes 🕶️
We are glad to announce that Visual NLP 5.3.2 has been released! 📢📢📢
Highlights 🔴
- OCR Metrics against Cloud Providers: Textract and GCP.
- LightPipeline support for Table Recognition and Clustering.
- PositionFinder supports entities spanning multiple lines.
- Other Changes.
OCR Metrics against Cloud Providers: Textract and GCP
The following metrics for the Text Detection and Recognition tasks were collected on the FUNSD dataset. The final metric is the average F score across the two tasks.
Detection | Recognition | Detection Precision | Detection Recall | Recognition Precision | Recognition Recall | Avg. F Score |
---|---|---|---|---|---|---|
Google OCR | Google OCR | 0.3528 | 0.7776 | 0.8889 | 0.8823 | 0.6854 |
Amazon Textract | Amazon Textract | 0.5284 | 0.8534 | 0.8236 | 0.8539 | 0.7455 |
ImageTextDetector (memOpt) | ImageToTextV2 (base checkpoint) | 0.6199 | 0.9044 | 0.9354 | 0.9331 | 0.8349 |
ImageTextDetector (memOpt) | ImageToTextV2 (large checkpoint) | 0.6199 | 0.9044 | 0.9457 | 0.9426 | 0.8398 |
ImageTextDetectorV2 | ImageToTextV2 (base checkpoint) | 0.598 | 0.9046 | 0.9354 | 0.9331 | 0.8271 |
ImageTextDetectorV2 | ImageToTextV2 (large checkpoint) | 0.598 | 0.9046 | 0.9457 | 0.9426 | 0.8320 |
ImageTextDetector (memOpt) | ImageToText | 0.6199 | 0.9044 | 0.464 | 0.4654 | 0.6001 |
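The last column can be approximately reproduced from the other four. The sketch below assumes that "F score" means the F1 score (the harmonic mean of precision and recall) and that the final metric is the plain average of the Detection and Recognition F1 values. The release notes do not state the exact formula, so the function names and the definition are assumptions; they do match the published values to within rounding.

```python
def f_score(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def avg_f_score(det_p: float, det_r: float, rec_p: float, rec_r: float) -> float:
    """Average of the Detection F1 and the Recognition F1."""
    return (f_score(det_p, det_r) + f_score(rec_p, rec_r)) / 2


# (detection precision, detection recall, recognition precision, recognition recall)
# Values copied from the table above.
rows = {
    "Google OCR": (0.3528, 0.7776, 0.8889, 0.8823),
    "Amazon Textract": (0.5284, 0.8534, 0.8236, 0.8539),
    "ImageTextDetector (memOpt) + ImageToTextV2 (base)": (0.6199, 0.9044, 0.9354, 0.9331),
}

for name, values in rows.items():
    print(f"{name}: {avg_f_score(*values):.4f}")
```

Small differences in the fourth decimal place are expected, since the published precision and recall values are themselves rounded.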
Not only are the scores slightly better than those of the cloud providers, but the cost is also lower (*):
Service | Cost(USD) |
---|---|
Amazon | 120 |
Azure | 30 |
43.5 | |
JSL | 17.6 |
(*) JSL costs were estimated assuming a Databricks setup.
LightPipeline support for Table Recognition and Clustering
Table Extraction and Clustering pipelines can now be used as LightPipelines. To do so, create the LightPipeline as usual.
- New LightPipeline.fromBinary() method that allows in-memory binary buffers to be used as inputs to Visual NLP pipelines.
The following example uses PretrainedPipeline:
lp = PretrainedPipeline("digital_pdf_table_extractor")
lp.fromLocalPath("page_with_tables.png")
For other examples please check this notebook.
PositionFinder
PositionFinder was not working properly when an entity spanned multiple lines.
It now returns the expected bounding boxes for the entity. Keep in mind that, as before, more than one bounding box will be returned, and all of them will share the same chunk_id.
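Because a multi-line entity comes back as several boxes with the same chunk_id, downstream code may want to group them. The sketch below is only an illustration on plain dictionaries: apart from chunk_id, the field names (x, y, width, height) and the helper merge_boxes_by_chunk are hypothetical and do not describe PositionFinder's actual output schema.

```python
from collections import defaultdict


def merge_boxes_by_chunk(boxes):
    """Group boxes by chunk_id and compute one enclosing box per entity."""
    grouped = defaultdict(list)
    for box in boxes:
        grouped[box["chunk_id"]].append(box)

    merged = {}
    for chunk_id, parts in grouped.items():
        left = min(b["x"] for b in parts)
        top = min(b["y"] for b in parts)
        right = max(b["x"] + b["width"] for b in parts)
        bottom = max(b["y"] + b["height"] for b in parts)
        merged[chunk_id] = {
            "x": left,
            "y": top,
            "width": right - left,
            "height": bottom - top,
        }
    return merged


# Hypothetical input: entity 0 starts at the end of one line and continues
# at the beginning of the next, so it is described by two boxes.
boxes = [
    {"chunk_id": 0, "x": 300, "y": 100, "width": 200, "height": 20},
    {"chunk_id": 0, "x": 50, "y": 125, "width": 120, "height": 20},
    {"chunk_id": 1, "x": 60, "y": 200, "width": 80, "height": 20},
]

merged = merge_boxes_by_chunk(boxes)
print(merged)
```

An enclosing box is convenient for cropping, but it also covers text outside the entity; for highlighting or redaction, keeping the per-line boxes grouped by chunk_id is usually the better choice.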
Other Changes
- CVE related to commons-compress was removed.
- Bug fix: ImageToPdf no longer renders images outside page boundaries.