bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method


From: Visuwesh
Subject: bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
Date: Fri, 01 Jul 2022 22:07:38 +0530
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

[Friday July 01, 2022] Eli Zaretskii wrote:

>> I mostly meant to ask if the weighted approach was good but I wasn't
>> clear enough, sorry.  Let me try to explain it better:
>> 
>> Let's suppose that string-lessp does not work for English for the
>> discussion here.  The task is to sort a list of jumbled English
>> alphabets in alphabetical order.  What I'm currently doing is creating
>> an alist where the key is the alphabet and the value is the alphabet's
>> order (so a will be 1, b will be 2, etc.).  Then in the sort function, I
>> look for this order.  If the alphabet is not in this list, then I fall
>> back to a large number.
>> 
>> So the code above would look like this if it were in English,
>> 
>>     (sort '("b" "z" "c" "n" "a" "aa" "p")
>>           (lambda (x y)
>>             (let ((cp '(("a" . 0) ("b" . 1) ("c" . 2) ("d" . 3) ("e" . 4)
>>                         ("f" . 5) ("g" . 6) ("h" . 7) ("i" . 8) ("j" . 9)
>>                         ("k" . 10) ("l" . 11) ("m" . 12) ("n" . 13)
>>                         ("o" . 14) ("p" . 15) ("q" . 16) ("r" . 17)
>>                         ("s" . 18) ("t" . 19) ("u" . 20) ("v" . 21)
>>                         ("w" . 22) ("x" . 23) ("y" . 24) ("z" . 25))))
>>               (< (or (assoc-default x cp) 10000)
>>                  (or (assoc-default y cp) 10000)))))
>> 
>> and the sorted list comes out as ("a" "b" "c" "n" "p" "z" "aa")
>> which is exactly what I desire.  I hope this is clear enough.
>
> The above just gives each letter its order in the alphabet.  But if
> that is what you wanted, string-lessp (or even just direct comparison
> of characters) would have worked for you.  So there's still something
> important missing from your description, I think.
>

Unfortunately, string-lessp does not do the job.  (string-lessp "ஞ" "ஜ")
should return t, but it returns nil, probably because ஞ's codepoint is
2974 and ஜ's codepoint is 2972.  Moreover, ஜ is not even part of the
"core" Tamil characters and hence should come last.  This is why I went
with defining an alist with the _actual_ order of the characters.  To
demonstrate this using English, it would be something like...

    Suppose c's codepoint were 29 and d's codepoint were 27.  Clearly,
    c comes before d, but since string-lessp seems to rely on the
    Unicode codepoint, sorting with string-lessp gives "... d c ..."
    in the list instead of the desired "... c d ...".

I hope this is clear.
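
To make the ordering problem concrete, here is a minimal sketch of the
weighted comparison applied to Tamil.  The names my-tamil-order and
my-tamil-lessp are made up for illustration, and the table is
deliberately partial (only a few consonants, with weights I picked to
reflect their relative order); it is not the table from the patch.

    ;; Hypothetical, partial order table: a few core consonants get small
    ;; weights, and the Grantha letter ஜ gets a weight after all of them.
    (defvar my-tamil-order
      '(("க" . 0) ("ங" . 1) ("ச" . 2) ("ஞ" . 3) ("ஜ" . 18)))

    (defun my-tamil-lessp (x y)
      "Return non-nil if X sorts before Y according to `my-tamil-order'.
    Strings missing from the table fall back to a large weight."
      (< (or (cdr (assoc x my-tamil-order)) 10000)
         (or (cdr (assoc y my-tamil-order)) 10000)))

    ;; Codepoint comparison puts ஜ (2972) before ஞ (2974):
    (string-lessp "ஞ" "ஜ")                       ; => nil
    ;; The weighted predicate gives the desired order:
    (sort (list "ஜ" "ஞ" "ச") #'my-tamil-lessp)   ; => ("ச" "ஞ" "ஜ")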

>> Yep, it is misalignment.  I could try to use those pixel-resolution
>> alignment features but I really don't think I can do a good enough job.
>> It is something I tried in the past but gave up since it was too complex
>> for me.  The current code produces a Good Enough™ table and I think I
>> will just leave it unless Someone™ complains since after all, the
>> current situation is much better than what we have in Emacs 28 (the
>> docfix that happened as part of bug#50143 isn't in Emacs 28).
>
> I thought vtable.el was about solving such problems?

Okay then, I will use that.  I was mostly unsure if using vtable would
be alright, especially since it puts keymap properties and the entire
vtable object as a text property -- it seemed excessive for a
docstring.  Maybe some of this can be addressed?

>> BTW, do you have any other code/documentation review?  And what about
>> the patch I posted in 
>> https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html?
>> No rush but I would like to know if it can go in since it only addresses
>> fallouts from the previous bug in this area.  Thanks.
>
> It sounded to me like you are still working on the code, so I didn't
> see a need to review it.  If you have specific parts that you'd like
> me to review nonetheless, please tell which parts are those.

Thanks.  The patch I posted in
https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-06/msg02256.html
is done, and can be pushed to master if you see no problems.  All it
does is tie up a few loose ends that were missed when fixing
bug#50143.  Specifically, it adds an entry for the TAMIL OM character,
and adds two more Sanskrit consonants to the Tamil itrans table.

Also, I would like to know if there's a better way to write the :set
function for the defcustoms tamil-vowel-translation,
tamil-consonant-translation, tamil-misc-translation and
tamil-native-digits without the boundp check chain below,

    (defun tamil--set-variable (sym val)
      (set-default sym val)
      (when (and (boundp 'tamil-vowel-translation)
                 (boundp 'tamil-consonant-translation)
                 (boundp 'tamil-misc-translation)
                 (boundp 'tamil-native-digits))
        (tamil--update-quail-rules)))
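
One possible alternative, as a sketch under assumptions rather than
something taken from the patch: give each defcustom
:initialize #'custom-initialize-default, so that the :set function is
not called while the variables are first being defined, and build the
rules once after the last defcustom.  All my-tamil- names below are
hypothetical stand-ins, the rule builder is a stub that only counts its
calls, and only two of the four options are shown.

    ;; Stub standing in for the real rule regeneration (assumption).
    (defvar my-tamil--updates 0
      "Number of times the rules were rebuilt.")
    (defun my-tamil--update-quail-rules ()
      (setq my-tamil--updates (1+ my-tamil--updates)))

    (defun my-tamil--set-variable (sym val)
      "Set SYM to VAL and rebuild the rules; no boundp checks needed."
      (set-default sym val)
      (my-tamil--update-quail-rules))

    (defcustom my-tamil-vowel-translation '(("a" . "அ"))
      "Hypothetical vowel rules."
      :type '(alist :key-type string :value-type string)
      ;; Assign the initial value without calling :set.
      :initialize #'custom-initialize-default
      :set #'my-tamil--set-variable)

    (defcustom my-tamil-native-digits nil
      "Hypothetical digits option."
      :type 'boolean
      :initialize #'custom-initialize-default
      :set #'my-tamil--set-variable)

    ;; Build the rules once, after every option has a value.
    (my-tamil--update-quail-rules)

With this arrangement the rules are rebuilt once at load time, and
again whenever an option is changed through Customize or
customize-set-variable.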

I'm also doubtful about the current group being used for these
defcustoms.  Should I go ahead and make a new 'tamil' group and make it
a subgroup of leim or i18n?  And is the prefix tamil- okay or should I
change it to something else?

Finally, I'm unsure if "List of input sequences to translate to ..." is
clear.  I think it is a mouthful and there should be a better way to
put it.  I think "translation rules" is quite nice, but I'm afraid that
it is too Quail-specific and might not be well understood.




