[Bug-ocrad] Re: Request about adding more characters


From: Antonio Diaz Diaz
Subject: [Bug-ocrad] Re: Request about adding more characters
Date: Mon, 10 Jan 2005 12:40:57 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.3) Gecko/20040913

Hello Donald. Thanks for your interest in Ocrad.

Donald Rogers wrote:
> I have recently started using ocrad for OCR of English texts. I am impressed with it - partly because it handles UTF-8 text. IMHO any OCR program that does not handle Unicode characters is useless.

Well, my life would be a lot easier with an 8-bit charset, but people can't stop inventing letters. ;-)


> I would like to use ocrad for OCR of Esperanto texts. What is involved with adding the recognition of extra characters to ocrad?

A lot of work. I have already hacked ocrad too much while adding new characters, so I have to rewrite some things before I can add support for a new charset (iso-8859-3 in this case). (Off-topic note: given that Zamenhof invented Esperanto, why did he choose accented letters instead of, say, the Latin alphabet?)


> I noticed in the ocrad source code that there are already some characters with breves and some with circumflexes.

Yes, some from iso-8859-15 and some from iso-8859-9.


> I could also send a file or two of scanned Esperanto text in, say, PBM format, with the 12 letters:

Please send them. I will try to add Esperanto support in version 0.12 of ocrad.

Regards,
Antonio Diaz.



