Dear bugtrackers and developers,
I am trying to use Ocrad, the GNU OCR program, to extract text from
scanned documents. Reviews of the software call it "reasonably good"
and say it should "produce fairly accurate results". Unfortunately,
when I run Ocrad on my images, I do not get even barely usable results.
The output looks more like a dump of a GPG-encrypted file than a
document - I'm serious, it is not even remotely readable.
I have tried everything I could think of. I printed "quick brown fox
jumps over the lazy dog" in Arial and Times New Roman, at sizes ranging
from 9 to 16, on A4 paper and scanned it in color, grayscale and
black-and-white at 72, 300, 750 and 1200 dpi. Each of the 12 scanned
images was saved as both pbm and ppm. That makes 24 files, and not one
of them produced remotely readable output from ocrad. The best
approximation was "qa\;c_br0mfox ipmpsO wer the |psYdOq", obtained from
the 750 dpi grayscale pbm...
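
To make the test matrix explicit, here is a minimal sketch of how the 24
files break down. The file naming scheme is made up for illustration and
is not what kooka actually wrote to disk; only the modes, resolutions
and formats come from the description above.

```python
from itertools import product

# Scan settings taken from the description above.
MODES = ["color", "grayscale", "bw"]
DPIS = [72, 300, 750, 1200]
FORMATS = ["pbm", "ppm"]


def scan_files():
    """Return one hypothetical file name per mode/dpi/format combination."""
    return [
        "fox_{0}_{1}dpi.{2}".format(mode, dpi, ext)
        for mode, dpi, ext in product(MODES, DPIS, FORMATS)
    ]


files = scan_files()
# 3 modes * 4 resolutions = 12 scans, each saved in 2 formats.
print(len(files))
for name in files:
    print(name)
```

Each of these files was then passed to ocrad in turn.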
Obviously, I am doing something wrong here, but I do not know what. I
am using kooka to scan the images from an HP Deskjet 4620F. Ocrad is
version 0.17, running on SUSE 11.1.
If you could give me a hint as to what I am doing wrong, please do.
Thanks in advance for your help.
_______________________________________________
Bug-ocrad mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/bug-ocrad