Pretrained Pipeline for Reading Handwritten Text with Image Documents

Description

This pretrained pipeline extracts handwritten text from document images. It takes images of handwritten documents as input and returns the recognized content as digital text, so the transcription can be passed on to downstream workflows.

How to use

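# Python
# Assumes an active SparkSession `spark` with Visual NLP loaded
# and PretrainedPipeline already imported.

# Load the pretrained handwritten-text pipeline.
img_pipeline = PretrainedPipeline('image_handwritten_transformer_extraction', 'en', 'clinical/ocr')

# Read every file under img_path as binary content, one row per file.
img_path = '/content/images/'
img_example_df = spark.read.format("binaryFile").load(img_path).cache()

# Run the pipeline on the loaded images.
result = img_pipeline.transform(img_example_df)

// Scala
// Same steps as the Python version above.
val img_pipeline = new PretrainedPipeline("image_handwritten_transformer_extraction", "en", "clinical/ocr")

val img_path = "/content/images/"
val img_example_df = spark.read.format("binaryFile").load(img_path).cache()

val result = img_pipeline.transform(img_example_df)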
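
Because the binaryFile reader loads every file it finds under the given path, it can help to check what the folder contains before running the pipeline. The following is a minimal sketch using only the Python standard library. The helper name `list_image_files` and the extension list are illustrative assumptions and do not come from the model card, so adjust them to the formats you actually use.

```python
from pathlib import Path
from typing import List

# Assumption: a hypothetical set of image extensions, not taken from the model card.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def list_image_files(folder: str) -> List[str]:
    """Return the sorted paths of image files directly inside `folder`.

    Returns an empty list when the folder does not exist.
    """
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        str(p)
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


# Same folder as in the usage snippet above.
files = list_image_files("/content/images/")
print(f"{len(files)} image file(s) found")
```

If the folder mixes images with other files, Spark's binaryFile reader also accepts a `pathGlobFilter` option to restrict which files are loaded.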

Example

Input

[Screenshot of the handwritten input image]

Output

"This is an example of handwritten
text .
Let's # check the performance !
I hope it will be awesome ."
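
In the sample output above, punctuation marks are separated from the preceding word by a space. If that spacing is unwanted downstream, a small post-processing step can remove it. The sketch below is one possible approach, not part of the pipeline. The function name `tidy_ocr_text` is hypothetical, and the code deliberately leaves other recognition artifacts (such as the stray `#`) untouched.

```python
import re


def tidy_ocr_text(text: str) -> str:
    """Remove spaces placed before common punctuation marks.

    Line breaks are preserved; only spaces and tabs are touched.
    """
    # Drop spaces or tabs that directly precede . , ! ? ; :
    text = re.sub(r"[ \t]+([.,!?;:])", r"\1", text)
    # Collapse any remaining runs of spaces or tabs into one space.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text.strip()


# The sample output from this page, used as input.
sample = (
    "This is an example of handwritten\n"
    "text .\n"
    "Let's # check the performance !\n"
    "I hope it will be awesome ."
)

cleaned = tidy_ocr_text(sample)
print(cleaned)
```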

Model Information

Model Name:	image_handwritten_transformer_extraction
Type:	pipeline
Compatibility:	Visual NLP 5.0.2+
License:	Licensed
Edition:	Official
Language:	en
