bug#56323: closed (29.0.50; Add new customisable phonetic Tamil input me

emacs-bug-tracker

From:	GNU bug Tracking System
Subject:	bug#56323: closed (29.0.50; Add new customisable phonetic Tamil input method)
Date:	Thu, 14 Jul 2022 06:35:01 +0000

Your message dated Thu, 14 Jul 2022 09:34:14 +0300
with message-id <83y1wwtd7d.fsf@gnu.org>
and subject line Re: bug#56323: 29.0.50; [v2] Add new customisable phonetic 
Tamil input method
has caused the debbugs.gnu.org bug report #56323,
regarding 29.0.50; Add new customisable phonetic Tamil input method
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs@gnu.org.)


-- 
56323: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=56323
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

--- Begin Message ---
Subject: 29.0.50; Add new customisable phonetic Tamil input method
Date: Thu, 30 Jun 2022 17:43:21 +0530
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

Tags: patch

The attached patchset adds a new customisable phonetic Tamil input
method. I tried to reuse as much of the existing itrans input method
code as possible, since it greatly simplifies the creation of an Indic
input method (see `indian-make-hash').

The first patch fixes fallout from bug#50143, which asked to add
TAMIL OM ௐ to the itrans table; with the fix, one can insert the TAMIL OM
character using the tamil-itrans input methods as well. I'd prefer it
if this patch could be pushed quickly.

The second patch actually adds the new phonetic input method. I will
leave the rationale for making it a _customisable_ input method in
footnote [1]. To reuse the existing code that calculates the various
tables for the tamil-itrans IM, I turned the code in defvars into defuns.
However, the definition of the almighty
quail-tamil-itrans-syllable-table is still huge since I needed to do a
whole lot to convert the indian-tml-base-table to a format that will
be accepted by the new defun `quail-tamil-itrans-compute-syllable-table'.

The current quail rules are inspired by the ones in
https://github.com/rnchzn/tamil-phonetic/raw/main/tamil-phonetic.el and
the comments in
https://emacsnotes.wordpress.com/2022/03/07/tamil-phonetic-input-method-in-emacs-emacs-%E0%AE%87%E0%AE%B2%E0%AF%8D-%E0%AE%A4%E0%AE%AE%E0%AE%BF%E0%AE%B4%E0%AF%8D-%E0%AE%83%E0%AE%AA%E0%AF%8A%E0%AE%A9%E0%AF%86%E0%AE%9F%E0%AE%BF%E0%AE%95%E0%AF%8D/.

Avid readers might notice that I went for a nil SIMPLE argument despite
my recent complaint in emacs-devel. The reason is that we need a way
to end the ongoing translation (C-SPC). E.g., if one decides
to transliterate ல் as "l" and ள் as "ll", then to type ல்ல the key
sequence will be

l C-SPC la

Without the C-SPC, "lla" would be translated to ள. The better way
forward would be to present _both_ ல்ல and ள் for the sequence "lla" but I
have no idea how to do it. Any pointers would be _highly_ appreciated.

I plan to modify indian--puthash-char to have one-to-many translations,
i.e., "l" would translate to both ல் and ள் and then the user could decide
which one to insert. This combined with the DETERMINISTIC argument to
quail-define-package would make it an attractive option, I think. But
I'm leaving it out right now since I want the current patch to be
reviewed first.
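
As a rough illustration of the overlapping key sequences discussed above, here is a minimal sketch of a toy Quail package. The package name "example-tamil-toy" and the "L" rule are hypothetical and are not taken from the patch; only `quail-define-package' and `quail-define-rules' are real Quail entry points. The last rule uses the documented vector form of a translation, in which each element is one candidate, as one possible way to offer several translations for a single key.

```elisp
(require 'quail)

;; Hypothetical toy package, for illustration only; this is NOT the
;; input method added by the patch.  SIMPLE is left at its default (nil).
(quail-define-package
 "example-tamil-toy" "Tamil" "toy" t
 "Toy rules illustrating overlapping key sequences.")

(quail-define-rules
 ;; Multi-character translations are wrapped in a vector, because a
 ;; bare multi-character string is read as one candidate per character.
 ("l"   ["ல்"])
 ("la"  ?ல)
 ("ll"  ["ள்"])
 ("lla" ?ள)
 ;; Hypothetical rule: one key with two candidates to choose from.
 ("L"   ["ல்" "ள்"]))
```

With rules shaped like these, "l" is a prefix of "ll" and "lla", which is why the ongoing translation has to be ended explicitly before typing "la".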

I think adding an optional NAME argument to tamil--update-quail-rules
might be more flexible since then a user could let-bind the relevant
defcustoms to define other Tamil input methods without hassle (like the
tamil99 layout, which I plan to get to at Some Point™). WDYT?

The code for tamil--update-quail-rules is sort of convoluted because of
the conversion mentioned above. tamil--make-trans-table is also kind of
complicated because,

1. I couldn't make the tamil-vowel-translation (and consonant, and
   misc) alist have a character key since the Customize interface
   shows those characters as numbers!! I really do not want to dig
   into the Customize UI code, sorry. :(

2. indian-tml-base-table has the character க in it but the defcustom
   tamil-consonant-translation has the character க் in it because the
   latter makes more sense to a native speaker and also because of
   (1) above. More explanation as to why in footnote [2].

There are some FIXMEs scattered in the code but I will get to them in a
later revision. I also don't have a :set function for the defcustoms
since I'm not sure if something along the following lines is the only
way to automagically recalculate the quail rules:

(defun tamil--set-variable (sym val)
  (set-default sym val)
  (when (and (boundp 'tamil-vowel-translation)
             (boundp 'tamil-consonant-translation)
             (boundp 'tamil-misc-translation)
             (boundp 'tamil-native-digits))
    (tamil--update-quail-rules)))
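
To show how a shared setter of this shape could be wired up, here is a minimal, self-contained sketch. All `example-im-' names are hypothetical stand-ins, not the patch's defcustoms, and the "rebuild" step merely increments a counter. The `boundp' guard matters because `defcustom' calls the :set function while each option is being initialised, so the first option is set before the second one exists.

```elisp
(defgroup example-im nil
  "Hypothetical group for this sketch."
  :group 'i18n)

(defvar example-im--rebuild-count 0
  "How many times the (pretend) rules were rebuilt.")

(defun example-im--update-rules ()
  "Stand-in for recomputing the Quail rules."
  (setq example-im--rebuild-count (1+ example-im--rebuild-count)))

(defun example-im--set-variable (sym val)
  "Set SYM to VAL, then rebuild once every option is defined."
  (set-default sym val)
  (when (and (boundp 'example-im-vowels)
             (boundp 'example-im-consonants))
    (example-im--update-rules)))

;; The setter must be defined before the options that use it.
(defcustom example-im-vowels '(("a" . "அ"))
  "Hypothetical vowel translations."
  :type '(alist :key-type string :value-type string)
  :set #'example-im--set-variable
  :group 'example-im)

(defcustom example-im-consonants '(("k" . "க்"))
  "Hypothetical consonant translations."
  :type '(alist :key-type string :value-type string)
  :set #'example-im--set-variable
  :group 'example-im)
```

In this sketch the rebuild runs once when the second option is defined, and again whenever an option is changed through Customize or `customize-set-variable'; a plain `setq' bypasses the :set function.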

Comments on this, and general code review would be much appreciated.
I don't think I have missed anything and if you want me to add more
comments on some of the stuff, please do tell. Thanks.

If Tamil speakers are reading this bug report, shout at me if you want
something else and if you have other general comments. Or if I made an
embarrassing typo somewhere. Thanks!

0001-Fix-fallout-from-bug-50143.patch
Description: Text Data

0002-Add-new-customizable-phonetic-Tamil-input-method.patch
Description: Text Data


---

Footnotes:

1. The itrans input method is absolutely horrible for Tamil since, unlike
   the other Indic languages, it doesn't have a lot of consonants.
   HOWEVER, the consonant sound _changes_ depending on where it ends up.
   So ideally, the Tamil input method should allow multiple _ways_ to
   insert a single character.  As an example, consider the following
   words

        தும்பிக்கை - thumbikai            (elephant's trunk)
        படம் - padam                      (photograph/image)

    The consonant of interest is "ப".  In the first word, the letter
    "பி" is pronounced "bi", as in "bicycle"; in the second word,
    however, the letter "ப" is pronounced "pa", as in "party".  This is
    just one of many examples.

    There are also pairs of very similar sounding consonants and when
    transliterated (when you type in "Tanglish" for example), all the
    characters in the pair use the same letter.  E.g., such a pair is
    the ல/ள family; when one casually chats in "Tanglish", we just type
    "lXX" as the transliteration for that family.  Obviously, when one
    is typing in _Tamil_, he/she needs to distinguish between these two
    characters.  Leaving the choice of input sequence to transliterate
    these characters to the writer is much better.  For more, please
    read the wordpress article I linked, thanks.

2. Opting not to go for a character key in tamil-consonant-translation
   because of the Customize interface is only part of the reason.

   Having the key be TAMIL LETTER XXX + TAMIL SIGN VIRAMA is much more
   intuitive for the native speaker.  Take பு for example, the way you
   break it down into consonant and vowel is

        ப் + உ = பு
        (ippu + u = pu)

   and NOT

        ப + உ = பு
        (pa + u = pu)
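
   The decomposition above can be sketched in code.  This is only an
   illustration under stated assumptions: the `example-tamil-' names are
   hypothetical, the vowel table covers just two vowels, and none of it
   is taken from the patch.

```elisp
(require 'subr-x)                       ; for `string-remove-suffix'

(defconst example-tamil-virama ?\u0BCD
  "TAMIL SIGN VIRAMA, the dot that marks a pure consonant.")

(defconst example-tamil-vowel-signs
  '((?\u0B85 . "")            ; அ: inherent vowel, no sign needed
    (?\u0B89 . "\u0BC1"))     ; உ: TAMIL VOWEL SIGN U
  "Hypothetical, deliberately tiny map from vowel to dependent sign.")

(defun example-tamil-compose (pure-consonant vowel)
  "Join PURE-CONSONANT (letter + virama) with VOWEL into a syllable."
  (let ((base (string-remove-suffix (string example-tamil-virama)
                                    pure-consonant))
        (sign (cdr (assq vowel example-tamil-vowel-signs))))
    (unless sign
      (error "Vowel not in the example table: %c" vowel))
    (concat base sign)))

(example-tamil-compose "ப்" ?உ)          ; => "பு"
(example-tamil-compose "ப்" ?அ)          ; => "ப"
```

   Keying the table on the virama form keeps the user-facing option in
   the shape a native speaker expects, at the cost of stripping the
   virama before a vowel sign is attached.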

--- End Message ---

--- Begin Message ---
Subject: Re: bug#56323: 29.0.50; [v2] Add new customisable phonetic Tamil input method
Date: Thu, 14 Jul 2022 09:34:14 +0300

> From: Visuwesh <visuweshm@gmail.com>
> Cc: 56323@debbugs.gnu.org
> Date: Sun, 10 Jul 2022 13:02:11 +0530
> 
> > Updated patch attached.
> >
> 
> I managed to miss a comment, sorry about that.  Now fixed in attached
> patch.

Thanks, installed.
--- End Message ---
