Spark OCR: Extract Text from Documents
PDF to Text
Extract text from generated/selectable PDF documents and keep the original structure of the document by using our out-of-the-box Spark OCR library.
DICOM to Text
Recognize text in DICOM documents. This feature extracts both the text on the image and the text in the metadata.
Image to Text
Recognize text in images and scanned PDF documents by using our out-of-the-box Spark OCR library.
DOCX to Text
Extract text from Word (DOCX) documents by using Spark OCR.
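
To give a rough sense of what DOCX text extraction involves, here is a minimal sketch that uses only the Python standard library. It is not Spark OCR's API and makes no claim about how Spark OCR works internally. It relies on the fact that a DOCX file is a ZIP archive whose main text lives in word/document.xml. The function names docx_to_text and make_sample_docx are hypothetical, and the sample archive contains only that one part, so it is a test fixture rather than a file Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML namespace used by the main document part of a .docx file.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(source):
    """Return the paragraphs of a .docx file as newline-separated plain text.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper for illustration; not part of Spark OCR.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for paragraph in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> elements.
        runs = [node.text or "" for node in paragraph.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def make_sample_docx(paragraphs):
    """Build an in-memory archive holding only word/document.xml (demo fixture)."""
    body = "".join(
        "<w:p><w:r><w:t>{}</w:t></w:r></w:p>".format(escape(text))
        for text in paragraphs
    )
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main">'
        "<w:body>{}</w:body></w:document>"
    ).format(body)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    sample = make_sample_docx(["First paragraph", "Second paragraph"])
    print(docx_to_text(sample))
```

This sketch reads body paragraphs only. It ignores headers, footers, footnotes, and formatting, and text boxes nested inside a paragraph may be repeated in the output.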