Spark NLP in Action
Run 300+ live demos and notebooks

Extract Text from Documents - Live Demos & Notebooks

PDF to Text
Extract text from digitally generated PDF documents (those with selectable text) and keep the original structure of the document by using our out-of-the-box Spark OCR library. (...)
DICOM to Text
Recognize text in DICOM format documents. This feature extracts both the text in the image and the text from the file's metadata. (...)
Image to Text
Recognize text in images and scanned PDF documents by using our out-of-the-box Spark OCR library. (...)
DOCX to Text
Extract text from Word documents with Spark OCR (...)
Extract text from PowerPoint slides
This demo shows how text can be extracted from PPTX files using Spark OCR. (...)
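
To give a rough idea of what DOCX and PPTX text extraction involves, here is a minimal sketch that uses only the Python standard library. It is not Spark OCR's API or implementation, and the function names `docx_to_text` and `pptx_to_text` are hypothetical. It relies on the fact that both formats are ZIP archives of XML parts: a DOCX keeps its body in `word/document.xml`, and a PPTX keeps each slide in `ppt/slides/slideN.xml`.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces for text runs in WordprocessingML (DOCX) and DrawingML (PPTX).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"

_SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml")


def _paragraph_texts(xml_bytes, ns):
    """Return the text of every <p> element in namespace `ns`, in document order."""
    root = ET.fromstring(xml_bytes)
    lines = []
    for para in root.iter(f"{{{ns}}}p"):
        # A paragraph's text is split across one or more <t> run elements.
        lines.append("".join(t.text or "" for t in para.iter(f"{{{ns}}}t")))
    return lines


def docx_to_text(data):
    """Hypothetical helper: body text of a DOCX file, one paragraph per line."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return "\n".join(_paragraph_texts(zf.read("word/document.xml"), W_NS))


def pptx_to_text(data):
    """Hypothetical helper: one string per slide of a PPTX file.

    Slides are ordered by file number, which is an assumption: the real
    presentation order is defined elsewhere in the package and may differ.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        slides = []
        for name in zf.namelist():
            match = _SLIDE_RE.fullmatch(name)
            if match:
                slides.append((int(match.group(1)), name))
        return [
            "\n".join(_paragraph_texts(zf.read(name), A_NS))
            for _, name in sorted(slides)
        ]
```

Both helpers take the raw file bytes, for example `docx_to_text(open("report.docx", "rb").read())`, where the file name is a placeholder. The sketch ignores headers, footers, speaker notes, and layout, and it cannot handle scanned content, which requires OCR.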