Re: Request to add a new layout for Malayalam

From: Ajith R
Subject: Re: Request to add a new layout for Malayalam
Date: Tue, 7 Nov 2023 11:44:11 +0000 (UTC)

Dear Mike,

Thanks a lot for the detailed explanation. Much appreciated.
I will try ibus typing booster as you have suggested and try to understand the 
layouts that implement surrounding text.


On Tuesday, 7 November, 2023 at 05:04:14 pm IST, Mike FABIAN 
<mfabian@redhat.com> wrote: 

Ajith R <ajithramayyan@yahoo.co.in> wrote:

> Thanks for all your help. I have incorporated almost all features that
> I wanted in the layouts. The feature that is not implemented is
> automatic vowel changes after using delete or after moving cursor.
> Even in bn-national-jatiya.mim, this is not working as I would want. For 
> example,
> if I press    h    I get আ
> if I press    l    I get দ
> if I press    lh    I get দা
> Till now everything is as I want. Suppose now,
> if I press    lh, then the backspace and then    h, I get    দআ and not দা 

To understand why this happens, it helps to make the preedit
visible in the setup of “bn-national-jatiya (m17n)”. By default,
ibus-m17n does not mark the preedit in any way, so it is hard to know
what is still in preedit and what is already committed (unless you use
something like Gnome Wayland, where the preedit is unfortunately
**always** underlined at the moment).

Let’s assume you have made the preedit visible by underlining and use
ibus-m17n with “bn-national-jatiya (m17n)”.

Now if you type `l`, you see দ and it is underlined. That means it is
still in "preedit": the input method is still holding this text and
waiting for what comes next. If you now type a space, the underline
vanishes and the দ is "committed": waiting for further input is not
necessary anymore, and this দ can be considered "finished".

If you don’t type a space after the `l` but an `h`, you get দা because
the input method still had the দ in preedit when the `h` was typed, so it
"knew" that the `h` was typed after a consonant and could produce দা.
After the `lh` the underline disappears when using ibus-m17n because
typing more letters could not change the result দা anymore. ibus-m17n
commits as early as possible: as soon as typing more letters cannot
change the result, that result is committed.

But when something is committed, it is not controlled by the input
method anymore but by the application you are typing into.

If you have typed `lh`, দা is committed, and you type Backspace, you remove
the "া" U+09BE BENGALI VOWEL SIGN AA and only দ is left in the
application. But ibus-m17n does not know that the দ is there to the left
of the cursor. If you type an `h` now you get আ because, as far as
ibus-m17n knows, this is the first letter typed.

Now if you do the same using ibus-typing-booster, the behaviour is
slightly different: unlike ibus-m17n, ibus-typing-booster keeps
everything in preedit until a whole word is finished (until you type
space, Return, or something like that). That means if you use
bn-national-jatiya in ibus-typing-booster and type `lh`, you get দা but
it is still underlined because the word is not finished and
ibus-typing-booster still keeps this in preedit and waits for further
input. If you type Backspace now, again "া" U+09BE BENGALI VOWEL SIGN AA
is removed and only দ is left **but** it is in preedit, you see the
underline, ibus-typing-booster still knows that the দ is there. So if
you now type the `h` again, you get দা again. If you type a space now,
the word is finished, the দা is committed and the underline disappears.
And if you type Backspace twice now, you remove the space and the
"া" U+09BE BENGALI VOWEL SIGN AA, leaving দ but **not** in preedit (there
is no underline), so ibus-typing-booster does not know it is there.
Therefore, if you type `h` now you get দআ because ibus-typing-booster
does not know that the দ is there.
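
The difference between the two behaviours described above can be sketched
with a small toy model in Python. This is only an illustration, not how
ibus-m17n or ibus-typing-booster are actually implemented: the class
`ToyEngine`, the helper `run`, and the two-key mapping (`l` and `h` only)
are made up for this example.

```python
# Toy model only: names and the two-key layout are assumptions for illustration.
DA = "\u09a6"              # দ, produced by `l`
AA_INDEPENDENT = "\u0986"  # আ, produced by `h` when no consonant precedes it
AA_SIGN = "\u09be"         # া, produced by `h` directly after a consonant


def compose(keys):
    """Turn the keys held in preedit into text."""
    out = []
    after_consonant = False
    for key in keys:
        if key == "l":
            out.append(DA)
            after_consonant = True
        elif key == "h":
            out.append(AA_SIGN if after_consonant else AA_INDEPENDENT)
            after_consonant = False
    return "".join(out)


class ToyEngine:
    def __init__(self, commit_early):
        self.commit_early = commit_early
        self.committed = ""  # text owned by the application
        self.keys = []       # keys still held in preedit

    def _commit(self):
        self.committed += compose(self.keys)
        self.keys = []

    def press(self, key):
        if key == "space":
            self._commit()
            self.committed += " "
        elif key == "Backspace":
            if self.keys:
                self.keys.pop()  # the input method edits its own preedit
            else:
                # the application deletes; the input method learns nothing
                self.committed = self.committed[:-1]
        else:
            self.keys.append(key)
            # Early commit: after a vowel no further key can change the result.
            if self.commit_early and key != "l":
                self._commit()

    def text(self):
        return self.committed + compose(self.keys)


def run(commit_early, keys):
    engine = ToyEngine(commit_early)
    for key in keys:
        engine.press(key)
    return engine.text()
```

In this model, `run(True, ["l", "h", "Backspace", "h"])` ends with দআ
(early commit), while `run(False, ["l", "h", "Backspace", "h"])` ends with
দা (late commit), because only in the second case is the দ still in preedit
when the second `h` arrives.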

Here is a video where I type

`l` `space` `h` `space` `lh` `Backspace` `h` `space` `Backspace` `Backspace` `h`

first with bn-national-jatiya with ibus-m17n and then with
bn-national-jatiya with ibus-typing-booster. I hope that makes the
difference clear:

Both strategies (committing as early as possible as in ibus-m17n and
committing only after a full word is completed as in
ibus-typing-booster) have advantages and disadvantages.

1) committing early:
  + the committed part can already be used by firefox for example when
    doing a google search, firefox already "owns" that committed text.
  + when going with arrow-left over a word just typed, the cursor moves
    over each character in one step, even if several keys were typed to
    create that character.
  - What is already committed is lost and now unknown to the input
    method and cannot be used for further editing

2) committing late only after a word is finished:
  + The input method has more information left, editing the word with
    backspace or even going left with the arrow-left key and inserting
    something inside the word is still possible.
  + predictions on how to complete the word can be made
    (ibus-typing-booster needs this because it shows suggestions how
    to complete words)
  - Editing characters inside the preedit which were composed by typing
    several keys can sometimes be awkward because the cursor can only be
    shown to the left or the right of a glyph. If that glyph was composed
    by typing several keys, you have to count how often you press
    arrow-left; the visible cursor bar does not show you where "inside"
    that glyph you are.
  - If you don’t commit the word by typing space and instead step back
    with arrow-left until you reach the left side of the word (where
    another arrow-left causes a commit and moves the cursor out of the
    word on the left side), you sometimes need surprisingly many
    arrow-left presses: one for each key you typed to compose that
    preedit, which can be many more than the number of glyphs shown in
    the preedit. Personally I think that is OK, but some people don’t
    like that.
  - firefox does not yet see what is in preedit, so a Google search
    in firefox cannot yet start searching for stuff which is still in
    preedit (That could be considered a bug in firefox though, as
    google-chrome can do just that, google-chrome also uses the
    contents of the preedit to do google searches. So apparently it
    is possible for an application to know what is in preedit and
    do something with that but firefox just does not do it).

So it is not clear whether early or late committing is better; it
depends on what one wants to do. Try your input method with both
ibus-m17n and ibus-typing-booster and see which you like
better. ibus-typing-booster has many extra features like word
completions from remembering what you typed before or from what is in a
dictionary like /usr/share/hunspell/ml_IN.dic. But if you don’t like
these extra features and just want to do the same as ibus-m17n, only
with committing the preedit late, then you can switch almost all of
these extra features off; see this chapter in the ibus-typing-booster
documentation:

“Simulate the behaviour of ibus-m17n”

There is another way for an input method to know what is to the left of
the cursor even if it is not in preedit anymore. That feature is called
"surrounding text". An input method can ask the application whether it
supports surrounding text and, if yes, get a chunk of text near the cursor
position. Then it can react differently depending on what it finds there.
It can, for example, delete some of this surrounding text from the
application and put it into preedit again: one could take দ (`l`) out of
the application and put it into preedit so that typing a following `h`
gives দা. Or one could leave the দ as it is in the application and just
make an `h` typed in that situation produce a "া" U+09BE BENGALI VOWEL
SIGN AA.
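
The second variant (leave the দ in the application and only change what
`h` produces) could be sketched in Python roughly like this. The function
name `output_for_h` and the convention that `None` means "the application
does not support surrounding text" are assumptions for this illustration
and not part of any real input method API.

```python
from typing import Optional

DA = "\u09a6"              # দ
AA_INDEPENDENT = "\u0986"  # আ
AA_SIGN = "\u09be"         # া
CONSONANTS = {DA}          # toy set: only the one consonant used in the example


def output_for_h(text_before_cursor: Optional[str]) -> str:
    """Decide what `h` should produce.

    text_before_cursor is the surrounding text to the left of the cursor,
    or None if the application does not support surrounding text
    (hypothetical convention for this sketch).
    """
    if text_before_cursor and text_before_cursor[-1] in CONSONANTS:
        # A consonant is directly to the left: produce the vowel sign.
        return AA_SIGN
    # Nothing known to the left, or no consonant there: independent vowel.
    return AA_INDEPENDENT
```

When surrounding text is unavailable the sketch falls back to the
independent vowel, which is the same result as an input method that knows
nothing about the committed text.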

Unfortunately there are still many problems with surrounding text:

- some applications do not support it at all, it does not work
  at all in gnome-terminal, xfce4-terminal, xterm, ...
- in many applications surrounding text sort of works to some extent,
  but not all surrounding text functions are implemented, or they are
  very buggy. Then trying to use surrounding text can cause a lot of
  confusion and make everything much worse instead of being helpful.

You **can** try to use surrounding text with ibus-m17n.
Some input methods in m17n-db try to do that. For example
si-wijesekera.mim. si-wijesekera.mim contains:

(use-surrounding-text (_"Surrounding text vs. preedit.
If 1, try to use surrounding text.  Otherwise, use preedit.")
                      0 1 0))

You can read si-wijesekera.mim to understand how this works.  Apart from
si-wijesekera.mim, there are a few other input methods in m17n-db which
try to do something with surrounding text:

mfabian@hathi:/local/mfabian/src/m17n/m17n-db/MIM (release-candidate-1-8-5 *)
$ git grep surrounding
si-wijesekera.mim:Although this code supports both surrounding text and preedit,
si-wijesekera.mim: (use-surrounding-text (_"Surrounding text vs. preedit.
si-wijesekera.mim:If 1, try to use surrounding text.  Otherwise, use preedit.")
si-wijesekera.mim:    ((& (= use-surrounding-text 1) (= @-0 -1))
si-wijesekera.mim:    (shift surrounding-text))
si-wijesekera.mim: (surrounding-text
si-wijesekera.mim:  (shift surrounding-text))
si-wijesekera.mim:  (shift surrounding-text)))
ta-lk-renganathan.mim: (use-surrounding-text (_"Surrounding text vs. preedit
ta-lk-renganathan.mim:If 1, try to use surrounding text.  Otherwise, use 
ta-lk-renganathan.mim: (check-surrounding-text
ta-lk-renganathan.mim:    ((& (= use-surrounding-text 1) (= @-0 -1))
ta-lk-renganathan.mim:    (shift surrounding-text))
ta-lk-renganathan.mim: (surrounding-text
ta-lk-renganathan.mim:    ;; additional check-surrounding-text for this vowel 
ta-lk-renganathan.mim:    ;; ordinary check-surrounding-text
ta-lk-renganathan.mim:    (check-surrounding-text)
ta-lk-renganathan.mim:  (check-surrounding-text)
ta-lk-renganathan.mim:  (check-surrounding-text)
th-kesmanee.mim:    ;; If surrounding text is supported, commit the only char in preedit.
th-pattachote.mim:    ;; If surrounding text is supported, commit the only char in preedit.
th-tis820.mim:    ;; If surrounding text is supported, commit the only char in 
vi-tcvn.mim:  ;; typed after vowel.  NST is 1 iff surrounding text is not 
vi-tcvn.mim: ;; surrounding text is not supported.
vi-telex.mim:  ;; typed after vowel.  NST is 1 iff surrounding text is not 
vi-telex.mim: ;; surrounding text is not supported.
vi-viqr.mim:  ;; typed after vowel.  NST is 1 iff surrounding text is not 
vi-viqr.mim: ;; surrounding text is not supported.
vi-vni.mim:  ;; typed after vowel.  NST is 1 iff surrounding text is not 
vi-vni.mim: ;; surrounding text is not supported.

You can read these for reference.

But unfortunately, as surrounding text still has many bugs in many
applications, it may often make things worse instead of better. I hope
that can be improved in the future. If surrounding text works well, it is
an incredibly useful thing for input methods.

Mike FABIAN <mfabian@redhat.com>
