Scanning Text - OCR

It's quite difficult to find an open source OCR (Optical Character Recognition) program that gives good results. I tried Kooka (Ocrad) and gOCR, and neither worked well for me. I may not have been using the programs properly, but hardly any words were recognised correctly, and it would have been easier to type the text in from scratch.

The only free OCR program I have found that works well is Tesseract.

OCR Using Tesseract

(installing Tesseract in SUSE)

gimp

acquire->XSane:Epkowa
grey
dpi=400
gamma=0.67
brightness=-72 (-30 removes reverse side)
contrast=100

window:preview
set border

scan

image->mode->indexed
dithering: off
use black&white (1 bit)

save image tiff
compression 1

tesseract in1.tiff out1
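
When there are several scanned pages, the same command can be run once per file. The following is only a minimal sketch, assuming Python 3 and that tesseract is on the PATH. The function names (build_tesseract_command, ocr_directory) and the choice to name each output after its input file are my own assumptions, not part of Tesseract.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base):
    """Return the argument list for: tesseract <image> <output_base>."""
    return ["tesseract", str(image_path), str(output_base)]


def ocr_directory(scan_dir):
    """Run Tesseract on every .tiff file in scan_dir.

    Returns the list of text files Tesseract is expected to write.
    (Hypothetical helper: output names are derived from the input names.)
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")

    written = []
    for image in sorted(Path(scan_dir).glob("*.tiff")):
        # in1.tiff -> in1; Tesseract adds the .txt extension itself
        output_base = image.with_suffix("")
        subprocess.run(build_tesseract_command(image, output_base), check=True)
        written.append(output_base.with_suffix(".txt"))
    return written
```

For example, ocr_directory("scans") would process scans/in1.tiff, scans/in2.tiff and so on, leaving the recognised text in scans/in1.txt, scans/in2.txt. The single command above does the same thing for one file and puts the text in out1.txt.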




This site may have errors. Don't use for critical systems.

Copyright (c) 1998-2015 Martin John Baker - All rights reserved - privacy policy.