m17n-list

From: Mike FABIAN
Subject: Re: kn-itrans.mim: About the updates in https://github.com/indic-transliteration/m17n-db-indic
Date: Mon, 21 Aug 2023 18:38:27 +0200
User-agent: Gnus/5.13 (Gnus v5.13)

"विश्वासो वासुकिजः (Vishvas Vasuki)" <vishvas.vasuki@gmail.com> wrote:

> On Fri, 11 Aug 2023 at 23:29, Mike FABIAN <mfabian@redhat.com> wrote:
>
>>
>> Compared to what is in m17n-db/MIM/kn-itrans.mim at the moment,
>>
>> https://github.com/indic-transliteration/m17n-db-indic/blob/master/kn-itrans.mim
>>
>> has the additions/changes as in the diff below.
>>
>> Can I just include these changes or does anybody disagree with that?:
>>
>> --- kn-itrans.mim       2023-08-11 19:39:40.994532711 +0200
>> +++ kn-itrans.mim      2023-08-11 19:32:04.781108733 +0200
>> @@ -52,7 +52,7 @@
>>
>>  (map
>>   (starter
>> -  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
>> +  (".") ("~") ("#") ("$") ("^") ("*") ((C-#)) ((C-@))
>>    ("0") ("1") ("2") ("3") ("4")
>>    ("5") ("6") ("7") ("8") ("9")
>>    ("A") ("C") ("D") ("E") ("G") ("H") ("I") ("J") ("K")
>> @@ -137,12 +137,13 @@
>>    ("shh" "ಷ್")
>>    ("s" "ಸ್")
>>    ("h" "ಹ್")
>> -  ("f" "ೞ್")                            ; not in ITRANS Kannada table
>> +  ("LH" "ೞ್")                           ; not in ITRANS Kannada table
>>
>
> Indeed - typing fakIra should yield ಫಕೀರ , however it seems that I get
> ಫ಼್ಕೀರ - could you check and fix that?

Are you sure that fakIra should yield ಫಕೀರ ?
Because http://aksharamukha.appspot.com/converter/ seems to disagree:

When converting to Kannada:

phakIra  ಫಕೀರ
fakIra   ಫ಼ಕೀರ

When converting to Devanagari:


phakIra  फकीर
fakIra   फ़कीर

Your hi-itrans.mim agrees with
http://aksharamukha.appspot.com/converter/ when converting to Devanagari
but it does not agree with your statement: "fakIra should yield ಫಕೀರ".

When I compare the entries for ph and f in hi-itrans.mim and
kn-itrans.mim I find:

("ph" "ಫ್") U+0CAB KANNADA LETTER PHA U+0CCD KANNADA SIGN VIRAMA
("f" "ಫ಼್‌") U+0CAB KANNADA LETTER PHA U+0CCD KANNADA SIGN VIRAMA
            U+0CBC KANNADA SIGN NUKTA U+200C ZERO WIDTH NON-JOINER

(and (".ph" "ಫ಼್‌") is the same as ("f" "ಫ಼್‌"))

and for hi-itrans.mim:

("ph" "फ्") U+092B DEVANAGARI LETTER PHA U+094D DEVANAGARI SIGN VIRAMA
("f" "फ़्") U+092B DEVANAGARI LETTER PHA U+093C DEVANAGARI SIGN NUKTA
           U+094D DEVANAGARI SIGN VIRAMA

(and (".ph" "फ़्") is the same as ("f" "फ़्"))

Comparing the two, I notice that NUKTA and VIRAMA are in a different order in
hi-itrans.mim and kn-itrans.mim **and** that in kn-itrans.mim there is a
final ZERO WIDTH NON-JOINER.
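
The difference can also be checked mechanically. The following is a minimal
Python sketch, assuming the code point sequences are exactly as listed above;
the escapes are transcribed from this message, not read from the actual .mim
files, and the names KN_F, HI_F and describe are only illustrative.

```python
import unicodedata

# Sequences transcribed from the listing in this message (an assumption:
# they are not read from the actual .mim files).
KN_F = "\u0cab\u0ccd\u0cbc\u200c"  # kn-itrans.mim entry for "f"
HI_F = "\u092b\u093c\u094d"        # hi-itrans.mim entry for "f"


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for label, seq in (("kn-itrans", KN_F), ("hi-itrans", HI_F)):
    print(label)
    for line in describe(seq):
        print("   ", line)
```

Printing one code point per line makes both the swapped NUKTA/VIRAMA order and
the extra trailing code point in the Kannada entry easy to spot.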

So when one types "fa" with kn-itrans.mim, the "a" does:

("a" (delete @-) "")

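i.e. it deletes the last code point. For the "f" entry that is only the
trailing ZERO WIDTH NON-JOINER, so the VIRAMA stays in place, which would
explain the stray VIRAMA in the output you reported.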

But when one types "fa" with hi-itrans.mim, the "a" removes the U+094D
DEVANAGARI SIGN VIRAMA and leaves U+092B DEVANAGARI LETTER PHA U+093C
DEVANAGARI SIGN NUKTA (फ़).
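
The effect of typing "a" in the two input methods can be simulated roughly as
follows. This is only a sketch: it approximates (delete @-) as "drop the last
code point of the preedit text", and it assumes the code point sequences
listed earlier in this message. The function name type_a is made up for the
illustration.

```python
def type_a(preedit):
    """Approximate ("a" (delete @-) ""): drop the last code point."""
    return preedit[:-1]


# Sequences as listed earlier in this message (assumed, not read from the files).
KN_F = "\u0cab\u0ccd\u0cbc\u200c"  # PHA, VIRAMA, NUKTA, ZWNJ
HI_F = "\u092b\u093c\u094d"        # PHA, NUKTA, VIRAMA

kn_fa = type_a(KN_F)
hi_fa = type_a(HI_F)

print([f"U+{ord(c):04X}" for c in kn_fa])  # ['U+0CAB', 'U+0CCD', 'U+0CBC']
print([f"U+{ord(c):04X}" for c in hi_fa])  # ['U+092B', 'U+093C']
```

Under these assumptions the Kannada result still contains U+0CCD (VIRAMA),
while the Devanagari result does not contain U+094D.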

So **if** hi-itrans.mim and http://aksharamukha.appspot.com/converter/
do it correctly, I could fix kn-itrans.mim by removing the ZERO WIDTH
NON-JOINER in

("f" "ಫ಼್‌")

and reversing the order of VIRAMA and NUKTA.
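
A small Python sketch of what that change would do, again approximating
(delete @-) as dropping the last code point. KN_F_PROPOSED is a hypothetical
replacement value for the "f" entry, not something taken from either
repository. The sketch also checks the canonical combining classes, because
Unicode normalization sorts adjacent combining marks by class, and NUKTA sorts
before VIRAMA.

```python
import unicodedata

# Hypothetical replacement for the "f" entry in kn-itrans.mim:
# PHA, NUKTA, VIRAMA, with no trailing ZERO WIDTH NON-JOINER.
KN_F_PROPOSED = "\u0cab\u0cbc\u0ccd"


def type_a(preedit):
    """Approximate ("a" (delete @-) ""): drop the last code point."""
    return preedit[:-1]


kn_fa = type_a(KN_F_PROPOSED)
print([unicodedata.name(c) for c in kn_fa])
# ['KANNADA LETTER PHA', 'KANNADA SIGN NUKTA']

# Canonical combining classes: NUKTA has a lower class than VIRAMA, so
# normalization reorders VIRAMA+NUKTA into NUKTA+VIRAMA.
nukta_ccc = unicodedata.combining("\u0cbc")
virama_ccc = unicodedata.combining("\u0ccd")
reordered = unicodedata.normalize("NFC", "\u0cab\u0ccd\u0cbc")
print(nukta_ccc < virama_ccc, reordered == KN_F_PROPOSED)
```

With the proposed entry, "fa" would leave PHA followed by NUKTA, mirroring
what hi-itrans.mim produces for Devanagari.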

But of course I am not sure what is correct.

-- 
Mike FABIAN <mfabian@redhat.com>
Lack of sleep is the enemy of good work.



