Description
Pretrained pipeline that extracts printed text from document images and returns it as digital text. It is intended for text recognition (OCR) on printed material such as scanned documents and receipts.
Predicted Entities
How to use
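# Python: assumes an active Visual NLP session `spark` and that PretrainedPipeline is already imported
# Load the pretrained OCR pipeline
img_pipeline = PretrainedPipeline('image_printed_transformer_extraction', 'en', 'clinical/ocr')
# Folder that contains the input images
img_path = '/content/images/'
# Read every file in the folder as a binary record and cache the DataFrame
img_example_df = spark.read.format("binaryFile").load(img_path).cache()
result = img_pipeline.transform(img_example_df)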
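// Scala: assumes an active Visual NLP session `spark` and that PretrainedPipeline is already imported
// Load the pretrained OCR pipeline
val img_pipeline = new PretrainedPipeline("image_printed_transformer_extraction", "en", "clinical/ocr")
// Folder that contains the input images
val img_path = "/content/images/"
// Read every file in the folder as a binary record and cache the DataFrame
val img_example_df = spark.read.format("binaryFile").load(img_path).cache()
val result = img_pipeline.transform(img_example_df)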
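Once `transform` has run, the recognized text still has to be pulled out of the result DataFrame. The following is a minimal Python sketch of one way to group it per source file. It assumes the result exposes a `path` column (as the `binaryFile` reader does) and a recognized-text column named `text`; neither column name is confirmed on this page, and the helper `text_by_path` and the file names in the sample rows are hypothetical. Stand-in rows are used so the sketch runs without a Spark session.

```python
from collections import defaultdict


def text_by_path(rows, path_key="path", text_key="text"):
    """Join recognized text fragments per source file.

    `rows` is any iterable of mappings, for example
    [r.asDict() for r in result.select("path", "text").collect()].
    The column names "path" and "text" are assumptions.
    """
    grouped = defaultdict(list)
    for row in rows:
        text = row.get(text_key)
        if text:  # skip empty or missing recognition results
            grouped[row[path_key]].append(text.strip())
    # One newline-joined string per input file
    return {path: "\n".join(parts) for path, parts in grouped.items()}


# Stand-in rows (hypothetical file names) in place of collected Spark rows
sample_rows = [
    {"path": "file:/content/images/receipt.png", "text": "STARBUCKS Store #19208\n"},
    {"path": "file:/content/images/receipt.png", "text": "Total $4.95"},
    {"path": "file:/content/images/blank.png", "text": ""},
]

print(text_by_path(sample_rows))
```

Collecting to the driver like this only suits small batches; for larger image sets it is usually better to keep the data in Spark and write the text column out directly.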
Example
Input
Output
STARBUCKS Store #19208
11902 Euclid Avenue
Cleveland, OH (216) 229-U749
CHK 664250
12/07/2014 06:43 PM
112003. Drawers 2. Reg: 2
¥t Pep Mocha 4.5
Sbux Card 495
AMXARKERARANG 228
Subtotal $4.95
Total $4.95
Change Cue BO LOO
- Check Closed ~
"49/07/2014 06:43 py
oBUX Card «3228 New Balance: 37.45
Card is registertd
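As the output above shows, recognized text can contain character-level errors, so downstream code may want to accept only lines that match a strict pattern. The sketch below is one possible approach, not part of the pipeline: it keeps lines of the form `<label> $<amount>` and ignores everything else. The helper name `labeled_amounts` and the regular expression are illustrative assumptions; the sample text is copied from the output above.

```python
import re

# A label made of letters and spaces, then a dollar amount with two decimals
AMOUNT_LINE = re.compile(r"^(?P<label>[A-Za-z ]+?)\s+\$(?P<amount>\d+\.\d{2})$")


def labeled_amounts(ocr_text):
    """Return {label: amount} for lines that look like 'Label $1.23'."""
    found = {}
    for line in ocr_text.splitlines():
        match = AMOUNT_LINE.match(line.strip())
        if match:  # lines damaged by OCR errors simply fail to match
            found[match.group("label")] = float(match.group("amount"))
    return found


ocr_text = """Sbux Card 495
Subtotal $4.95
Total $4.95
Change Cue BO LOO"""

print(labeled_amounts(ocr_text))
```

Here only the `Subtotal` and `Total` lines are accepted; `Sbux Card 495` and `Change Cue BO LOO` are skipped because they lack a well-formed dollar amount.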
Model Information
Model Name: image_printed_transformer_extraction
Type: pipeline
Compatibility: Visual NLP 5.0.2+
License: Licensed
Edition: Official
Language: en