bug-gnu-emacs
bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method


From: Visuwesh
Subject: bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
Date: Sat, 02 Jul 2022 13:41:17 +0530
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

[Sat Jul 02, 2022] Eli Zaretskii wrote:

>> From: Visuwesh <visuweshm@gmail.com>
>> Cc: 56323@debbugs.gnu.org
>> Date: Sat, 02 Jul 2022 12:24:39 +0530
>> 
>> [Sat Jul 02, 2022] Eli Zaretskii wrote:
>> 
>> >> There are two "classes" of consonants: those that are part of Tamil
>> >> (let's call them "core") and those borrowed from Sanskrit.  When one
>> >> writes the consonants in order, the core consonants come first then the
>> >> Sanskrit ones.  You can find the order of the core consonants in
>> >> wikipedia here in the table titled "Tamil consonants":
>> >> https://en.wikipedia.org/wiki/Tamil_script#Letters
>> >> 
>> >> We need not worry too much about the order of Sanskrit consonants, we
>> >> just need to ensure that they come after the core consonants.  You can
>> >> find these Sanskrit consonants in the table titled "Grantha consonants
>> >> in Tamil" in the same link.
>> >> 
>> >> I hope this is clear.
>> >> 
>> >> As for the criteria, it is simply "Tamil consonants then the Sanskrit
>> >> consonants."
>> >
>> > Then your comparison function should first see whether a character is
>> > in the former or the latter group, and use string-lessp or character
>> > codepoint comparison with each group, right?  But that's not what you
>> > did, so I wonder whether my understanding is correct.
>> 
>> It didn't occur to me to do it this way so I tried it out but then I
>> noticed, string-lessp even within a group won't work.  When you evaluate
>> the following sexp, you don't get a list of increasing numbers...
>> 
>>     (let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
>>                              "ந" "ப" "ம" "ய" "ர" "ல"
>>                              "வ" "ழ" "ள" "ற" "ன")))
>>       (mapcar (lambda (c) (string-to-char c)) core-consonants))
>> 
>>       ;; => (2965 2969 2970 2974 2975 2979 2980 2984 2986 2990 2991 2992
>>       ;;     2994 2997 2996 2995 2993 2985)
>> 
>> and sure enough when you do (sort core-consonants #'string-lessp) the
>> list is jumbled up instead of retaining the order.
>> [ core-consonants, as declared, is in the right order but sort jumbles
>>   it up.  ]
>> 
>> But string-lessp works for vowels.  It is the consonants that are the
>> problem.
>
> Sorry, I don't understand what you are saying here.  How is the above
> code related to the issue at hand, which is how to sort characters in
> the order you want them to be sorted?  (And please keep in mind that I
> don't even know which of those characters are consonants and which are
> vowels -- if you want me to say something intelligent about that.)

I'm trying to explain the behaviour of string-lessp, which seems to sort
the characters by their Unicode codepoints.  But the order in which these
characters appear in Unicode is not the same as their actual order in
the script, so string-lessp does not do the job we want it to.  For
example, ன is the last core consonant, but its codepoint (2985) places
it right after ந (2984).

[Sat Jul 02, 2022] Eli Zaretskii wrote:

>
> Or maybe my guess below will be lucky.  You probably want this:
>
>   (defun sort-by-codepoint (c1 c2)
>     (< (string-to-char c1) (string-to-char c2)))
>
>   (let ((core-consonants '("க" "ங" "ச" "ஞ" "ட" "ண" "த"
>                            "ந" "ப" "ம" "ய" "ர" "ல"
>                            "வ" "ழ" "ள" "ற" "ன")))
>     (sort core-consonants 'sort-by-codepoint))
>   => ("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ன" "ப" "ம" "ய" "ர" "ற" "ல" "ள" "ழ" "வ")
>
> (To understand why, read the doc string of 'sort' carefully, where it
> explains what is expected from PREDICATE.)

Unfortunately not: sorting by codepoint jumbles up the list.  The
desired outcome is the list exactly as declared, since core-consonants
is already in the right order.
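
One way to get that order, as a rough sketch only and not what the
patch does: compare consonants by their position in an explicit
reference list instead of by codepoint.  The names
my-tamil-core-consonants and my-tamil-consonant-lessp are made up for
illustration.  Anything missing from the list (for instance the
Sanskrit consonants) is assumed to sort after every core consonant,
with string-lessp as a fallback among those, since their relative
order matters less.

```elisp
(require 'seq)

;; Hypothetical variable: the core consonants in their traditional
;; order, copied from the list earlier in this thread.
(defvar my-tamil-core-consonants
  '("க" "ங" "ச" "ஞ" "ட" "ண" "த" "ந" "ப" "ம" "ய" "ர" "ல" "வ" "ழ" "ள" "ற" "ன"))

;; Hypothetical predicate, suitable as the PREDICATE argument of `sort'.
(defun my-tamil-consonant-lessp (c1 c2)
  "Return non-nil if consonant string C1 should sort before C2.
Core consonants are ordered by their index in
`my-tamil-core-consonants'; everything else comes after them."
  (let ((i1 (seq-position my-tamil-core-consonants c1))
        (i2 (seq-position my-tamil-core-consonants c2)))
    (cond ((and i1 i2) (< i1 i2))       ; both core: compare positions
          (i1 t)                        ; only C1 is core: C1 first
          (i2 nil)                      ; only C2 is core: C2 first
          (t (string-lessp c1 c2)))))   ; neither is core: codepoint order

;; `sort' is destructive on lists, so sort a copy of the literal.
(sort (copy-sequence '("ன" "க" "ஜ" "ழ" "வ")) #'my-tamil-consonant-lessp)
;; => ("க" "வ" "ழ" "ன" "ஜ")
```

With this predicate, sorting core-consonants returns it unchanged,
which is the behaviour wanted here.  The cost is a linear lookup per
comparison, which should not matter for a list of this size.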




