Spark OCR Extract Tables & Structured Data
Recognize entities in scanned PDFs
An end-to-end example of a standard NER pipeline: import scanned images from cloud storage, preprocess them to improve image quality, recognize the text with Spark OCR, correct spelling mistakes to improve the OCR output, and finally run NER to extract entities.
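The order of the stages described above can be sketched in plain Python. This is only an illustration of how the steps chain together, not the Spark OCR or Spark NLP API: every name here (`Document`, `preprocess`, `recognize_text`, `correct_spelling`, `extract_entities`, `CORRECTIONS`, `GAZETTEER`) is a hypothetical stand-in, the "scan" is simulated by a noisy string, and the spelling and entity steps are simple dictionary lookups rather than trained models.

```python
from dataclasses import dataclass, field
from functools import reduce
from typing import Callable, Dict, List, Tuple


@dataclass
class Document:
    path: str                     # where the scan was loaded from (hypothetical)
    raw: str                      # stand-in for the scanned image content
    text: str = ""                # text recognized so far
    entities: List[Tuple[str, str]] = field(default_factory=list)


Stage = Callable[[Document], Document]


def preprocess(doc: Document) -> Document:
    # Stand-in for image cleanup: collapse runs of whitespace "noise".
    doc.raw = " ".join(doc.raw.split())
    return doc


def recognize_text(doc: Document) -> Document:
    # Stand-in for OCR: a real pipeline would call an OCR engine here.
    doc.text = doc.raw
    return doc


# Assumed examples of typical OCR confusions (l/1, hn/lm).
CORRECTIONS: Dict[str, str] = {"Jolm": "John", "hospita1": "hospital"}


def correct_spelling(doc: Document) -> Document:
    # Replace known OCR misreadings token by token.
    doc.text = " ".join(CORRECTIONS.get(tok, tok) for tok in doc.text.split())
    return doc


# Toy gazetteer standing in for a trained NER model.
GAZETTEER: Dict[str, str] = {"John": "PERSON", "Boston": "LOCATION"}


def extract_entities(doc: Document) -> Document:
    # Look up each token (minus trailing punctuation) in the gazetteer.
    words = [tok.strip(".,") for tok in doc.text.split()]
    doc.entities = [(w, GAZETTEER[w]) for w in words if w in GAZETTEER]
    return doc


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    # Apply each stage in order, feeding its output to the next one.
    return reduce(lambda d, stage: stage(d), stages, doc)


STAGES: List[Stage] = [preprocess, recognize_text, correct_spelling, extract_entities]

scan = Document(path="scans/page1.png", raw="Jolm   visited the hospita1 in  Boston.")
result = run_pipeline(scan, STAGES)
print(result.text)
print(result.entities)
```

Spelling correction runs before entity extraction because the lookup for "John" would miss the misread token "Jolm"; this is the reason the description places the correction step ahead of NER.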
Extract tables from selectable PDF documents with the new features offered by Spark OCR.
Extract Data from FoundationOne Sequencing Reports
Use our transformer to parse patient info, genomic and biomarker findings, and gene lists.
Classify visual documents
Classify documents using text and layout data.
Detect tables in documents
Detect tables in an image using a pretrained model based on CascadeTabNet.