[aspell-devel] UTF-8 in phonetic code table
From: gora
Subject: [aspell-devel] UTF-8 in phonetic code table
Date: Thu, 23 Nov 2006 22:25:15 +0100

Hi,
I have been trying out rules for Hindi in the phonetic code table,
by adding hi_phonet.dat, appropriately modifying the hi.dat file,
and remaking the dictionary. Using UTF-8 in this file is OK, is it
not? Simple rules seem to work, like Devanagari vowel sign i being
equivalent to Devanagari vowel sign ii. However, I am getting mixed
results with another simple rule: that a consonant sounds similar to
the same consonant followed by vowel sign aa. Here is one such rule
that I have added to the file:
ह हा
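
To make the two sides of this rule concrete, here is a small Python
sketch that lists the code points on each side and their UTF-8 byte
lengths. It only inspects the strings; it says nothing about how aspell
itself reads the table, and the names describe, lhs and rhs are made up
for illustration.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# The two columns of the rule, written as escapes to avoid any
# ambiguity about which characters are meant.
lhs = "\u0939"        # ह
rhs = "\u0939\u093e"  # हा

for label, s in (("lhs", lhs), ("rhs", rhs)):
    # Each of these code points takes three bytes in UTF-8.
    print(label, describe(s), len(s.encode("utf-8")), "bytes in UTF-8")
```

So the right-hand side is the left-hand side plus one extra code point,
not a single precomposed character.
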
By adding this rule, I would have expected any two words to be
zero edit distance apart if they differ only in that one uses ह
and the other हा. However, I am not seeing that. The way I am testing
is by misspelling a word so that it is an edit distance of two away
from a known word in the dictionary, assuming that the rule above
makes the edit distance between ह and हा zero. I would then expect
the correct word to show up in the list of suggestions, and indeed
close to the top, but it does not. Am I missing something?
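
The expectation described above can be written down as a small Python
sketch. This is only a model of what I expect, not aspell's algorithm:
normalize is a hypothetical stand-in for the phonetic code, and treating
the rule as "collapse हा to ह" is an assumption about the intent of the
rule, not a statement of how aspell interprets the two columns.

```python
def normalize(word, rules):
    """Apply each (pattern, replacement) pair in order.

    Hypothetical stand-in for a phonetic code; not aspell's implementation.
    """
    for pattern, replacement in rules:
        word = word.replace(pattern, replacement)
    return word

def edit_distance(a, b):
    """Plain Levenshtein distance over code points (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

# Assumed reading of the rule: हा is treated the same as ह.
RULES = [("\u0939\u093e", "\u0939")]

with_sign = "\u0939\u093e\u0932"   # हाल
without_sign = "\u0939\u0932"      # हल

raw = edit_distance(with_sign, without_sign)
coded = edit_distance(normalize(with_sign, RULES),
                      normalize(without_sign, RULES))
print("raw distance:", raw, "distance after normalizing:", coded)
```

Under this model the two strings are one edit apart as typed and zero
edits apart once the rule is applied, which is the behaviour I was
hoping to see reflected in the suggestions.
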
Regards,
Gora