[Bug-ocrad] Request about adding more characters

From:	Donald Rogers
Subject:	[Bug-ocrad] Request about adding more characters
Date:	Mon, 10 Jan 2005 08:32:37 +1300
User-agent:	Mozilla Thunderbird 0.8 (X11/20041020)

I have recently started using ocrad for OCR of English texts. I am impressed with it - partly because it handles UTF-8 text. IMHO any OCR program that does not handle Unicode characters is useless.

I would like to use ocrad for OCR of Esperanto texts. What is involved with adding the recognition of extra characters to ocrad? I have looked up the Unicode values of all the accented Esperanto letters and here they are in the format used in file ucs.h:


Unicode characters for Esperanto:
CCCIRCU = 0x0108, // latin capital letter c with circumflex
SCCIRCU = 0x0109, // latin small letter c with circumflex
CGCIRCU = 0x011C, // latin capital letter g with circumflex
SGCIRCU = 0x011D, // latin small letter g with circumflex
CHCIRCU = 0x0124, // latin capital letter h with circumflex
SHCIRCU = 0x0125, // latin small letter h with circumflex
CJCIRCU = 0x0134, // latin capital letter j with circumflex
SJCIRCU = 0x0135, // latin small letter j with circumflex
CSCIRCU = 0x015C, // latin capital letter s with circumflex
SSCIRCU = 0x015D, // latin small letter s with circumflex
CUBREVE = 0x016C, // latin capital letter u with breve
SUBREVE = 0x016D, // latin small letter u with breve
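
As a quick check of these values, here is a minimal sketch of how such code points turn into UTF-8 bytes. The enum name `Esperanto_ucs` and the function `utf8_encode` are made up for illustration and are not taken from the ocrad sources. The sketch uses U+0108 and U+0109 for c with circumflex, as given in the Unicode Latin Extended-A chart (U+010C and U+010D are c with caron). It only handles code points below U+0800, which covers all 12 letters.

```cpp
#include <string>

// Hypothetical enum in the style of ucs.h; not ocrad's actual header.
enum Esperanto_ucs {
  CCCIRCU = 0x0108, // latin capital letter c with circumflex
  SCCIRCU = 0x0109, // latin small letter c with circumflex
  CGCIRCU = 0x011C, // latin capital letter g with circumflex
  SGCIRCU = 0x011D, // latin small letter g with circumflex
  CHCIRCU = 0x0124, // latin capital letter h with circumflex
  SHCIRCU = 0x0125, // latin small letter h with circumflex
  CJCIRCU = 0x0134, // latin capital letter j with circumflex
  SJCIRCU = 0x0135, // latin small letter j with circumflex
  CSCIRCU = 0x015C, // latin capital letter s with circumflex
  SSCIRCU = 0x015D, // latin small letter s with circumflex
  CUBREVE = 0x016C, // latin capital letter u with breve
  SUBREVE = 0x016D  // latin small letter u with breve
};

// Encode one code point below U+0800 as UTF-8 (one or two bytes).
// Returns an empty string for anything else; this sketch does not
// handle three- and four-byte sequences.
std::string utf8_encode(const int code)
{
  std::string s;
  if (code < 0) return s;
  if (code < 0x80) {
    s += static_cast<char>(code);                 // plain ASCII
  } else if (code < 0x800) {
    s += static_cast<char>(0xC0 | (code >> 6));   // lead byte: 110xxxxx
    s += static_cast<char>(0x80 | (code & 0x3F)); // continuation: 10xxxxxx
  }
  return s;
}
```

With this, `utf8_encode(CCCIRCU)` gives the two bytes C4 88 and `utf8_encode(SUBREVE)` gives C5 AD.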

I noticed in the ocrad source code that there are already some characters with breves and some with circumflexes. Would it be a big job for you to add the extra 12 characters?

The Esperanto letters are also in ISO-8859-3. I can send you a list of their codes in this set too if you wish. I could also send a file or two of scanned Esperanto text in, say, PBM format, with the 12 letters: ĈĜĤĴŜŬĉĝĥĵŝŭ.
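
For the ISO-8859-3 side, a rough sketch of the mapping for the 12 letters could look like this. The function name `latin3_to_ucs` is hypothetical, and the byte values are as I read them from the ISO-8859-3 code chart, so they are worth checking against the standard before use. Bytes other than ASCII and the 12 Esperanto letters are deliberately left unmapped here.

```cpp
// Map an ISO-8859-3 byte to its Unicode code point.
// Only ASCII and the 12 accented Esperanto letters are covered;
// any other byte returns -1 in this sketch.
int latin3_to_ucs(const unsigned char byte)
{
  if (byte < 0x80) return byte;   // ASCII is identical in both sets
  switch (byte) {
    case 0xC6: return 0x0108;     // capital c with circumflex
    case 0xE6: return 0x0109;     // small c with circumflex
    case 0xD8: return 0x011C;     // capital g with circumflex
    case 0xF8: return 0x011D;     // small g with circumflex
    case 0xA6: return 0x0124;     // capital h with circumflex
    case 0xB6: return 0x0125;     // small h with circumflex
    case 0xAC: return 0x0134;     // capital j with circumflex
    case 0xBC: return 0x0135;     // small j with circumflex
    case 0xDE: return 0x015C;     // capital s with circumflex
    case 0xFE: return 0x015D;     // small s with circumflex
    case 0xDD: return 0x016C;     // capital u with breve
    case 0xFD: return 0x016D;     // small u with breve
    default:   return -1;         // not handled in this sketch
  }
}
```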


Donald Rogers
New Zealand
