aspell-devel

Re: [aspell-devel] Thoughts on using aspell for Indian language ing


From: Kevin Atkinson
Subject: Re: [aspell-devel] Thoughts on using aspell for Indian language ing
Date: Mon, 13 Nov 2006 01:50:00 -0700 (MST)

On Mon, 13 Nov 2006, address@hidden wrote:

[Sorry if this messes up the Unicode characters, pine is lame and doesn't support utf-8]

The base characters themselves certainly fit. However, if one wishes to
operate on syllables (made by combining consonants in the base
character set), the number of these syllables can exceed 256.
Here is a short example of just one of the issues that come up when
treating characters, rather than syllables, as the base unit in Hindi.
Take, for example, the conjunct "kra", क्र. This is represented
linguistically, and in UTF-8, as क + ् + र (U+0915 + U+094D + U+0930).
It makes no sense to swap the "halant" (U+094D) with the "ka" or the
"ra", as that creates a completely different conjunct, and is not a
mistake that would typically be made. As you suggest, I could just
include "kra" in the encoding, but in many Indian languages the
256 available slots are not sufficient for all such conjuncts.
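
To make the "kra" example concrete, here is a rough Python sketch. It
only illustrates the point above and is not anything aspell does: the
helper name split_clusters and its grouping rule (attach combining marks,
and attach whatever follows a halant) are assumptions made for
illustration, and the rule is a simplification of real Devanagari
syllable segmentation.

```python
import unicodedata

HALANT = "\u094D"  # DEVANAGARI SIGN VIRAMA, the "halant"

def split_clusters(text):
    """Group code points into rough syllable-like clusters.

    Hypothetical helper with a simplified rule: a code point joins the
    previous cluster if it is a combining mark (halant, vowel sign, ...)
    or if the previous cluster ends in a halant (conjunct formation).
    """
    clusters = []
    for ch in text:
        joins_previous = bool(clusters) and (
            unicodedata.category(ch) in ("Mn", "Mc")
            or clusters[-1].endswith(HALANT)
        )
        if joins_previous:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

kra = "\u0915\u094D\u0930"    # ka + halant + ra
kram = kra + "\u092E"         # the same conjunct followed by "ma"

print(len(kra))               # number of code points in the conjunct
print(split_clusters(kram))   # clusters, treating the conjunct as one unit
```

With this grouping, an edit operation such as a swap would act on whole
clusters, so the halant is never moved away from the consonants it joins.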

I am going to need a better explanation.

So "kra" is stored in Unicode using three "characters", but you want to store it as a single "kra" conjunct, which is not the way it is normally stored? What is the Unicode character for "kra"?


