Spark OCR Visual Document Understanding
Visual Document Classification
Classify documents using text and layout data with the new features offered by Spark OCR.
Extract Data from Scanned Invoices
Detect companies, total amounts, and dates in scanned invoices using out-of-the-box Spark OCR models.
Extract Data from FoundationOne Sequencing Reports
Extract patient, genomic and biomarker information from FoundationOne Sequencing Reports.
Recognize Entities in Scanned PDFs
End-to-end example of a regular NER pipeline: import scanned images from cloud storage, preprocess them to improve their quality, recognize the text with Spark OCR, correct spelling mistakes to improve the OCR results, and finally run NER to extract entities.
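The last two stages of that pipeline (spelling correction, then entity extraction) can be sketched in plain Python. This is only an illustration under stated assumptions, not Spark OCR or Spark NLP code: the image loading, preprocessing, and OCR stages are not shown, the input string stands in for OCR output, the spell corrector is a simple closest-match lookup from the standard library's `difflib`, and the "NER" step is a set of regular expressions rather than a trained model. The names `VOCABULARY`, `ENTITY_PATTERNS`, `correct_spelling`, `extract_entities`, and `run_text_stages` are all hypothetical.

```python
import difflib
import re

# Hypothetical domain vocabulary for the toy spell corrector.
# A real pipeline would use a trained spell-checking model instead.
VOCABULARY = ["patient", "admitted", "hospital", "diagnosed", "diabetes"]

# Hypothetical regex patterns standing in for a trained NER model.
ENTITY_PATTERNS = {
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "CONDITION": re.compile(r"\b(?:diabetes|hypertension)\b", re.IGNORECASE),
}


def correct_spelling(text, vocabulary=VOCABULARY, cutoff=0.8):
    """Replace each token with its closest vocabulary word, if one is close enough."""

    def fix(match):
        token = match.group(0)
        # Leave pure numbers and already-known words untouched.
        if token.isdigit() or token.lower() in vocabulary:
            return token
        close = difflib.get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
        return close[0] if close else token

    # Tokens may contain digits because OCR often confuses e.g. "l" and "1".
    return re.sub(r"[A-Za-z0-9]+", fix, text)


def extract_entities(text, patterns=ENTITY_PATTERNS):
    """Return (entity_text, label) pairs in order of appearance."""
    found = []
    for label, pattern in patterns.items():
        for match in pattern.finditer(text):
            found.append((match.start(), match.group(0), label))
    return [(entity, label) for _, entity, label in sorted(found)]


def run_text_stages(ocr_text):
    """Run spelling correction first, then entity extraction on the corrected text."""
    corrected = correct_spelling(ocr_text)
    return corrected, extract_entities(corrected)


if __name__ == "__main__":
    # Stand-in for text produced by an OCR stage, with two typical OCR errors.
    noisy = "patient admitted to hospita1 on 2021-03-14, diagnosed with diabetas"
    corrected, entities = run_text_stages(noisy)
    print(corrected)
    print(entities)
```

Running the correction before extraction matters here: the misspelled "diabetas" would not match the condition pattern, which mirrors why the pipeline corrects OCR output before running NER.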