Fw: Outputting NFC with ibus-m17n
From: Richard Wordingham
Subject: Fw: Outputting NFC with ibus-m17n
Date: Sat, 22 Jun 2024 13:13:50 +0100
Copying to list.
Begin forwarded message:
Date: Sat, 22 Jun 2024 13:11:31 +0100
From: Richard Wordingham <richard.wordingham@ntlworld.com>
To: Mike FABIAN <mfabian@redhat.com>
Subject: Re: Outputting NFC with ibus-m17n
On Thu, 20 Jun 2024 10:54:57 +0200
Mike FABIAN <mfabian@redhat.com> wrote:
> Richard Wordingham <richard.wordingham@ntlworld.com> wrote:
>
> > I have a couple of problems using m17n with ibus on Ubuntu.
> >
> > ibus 1.5.26-4
> > ibus-m17n 1.4.9-1
> > m17n-db 1.8.0-3
> >
> > The core of the keyboard definition is a set of keystroke
> > definitions such as
> >
> > (map
> > (simple
> > ;; cy3sn-p1:
> > ("B" ?β) ; U+03b2 GREEK SMALL LETTER BETA
> > ...
> > ("e_H" ?é) ; U+00E9 LATIN SMALL LETTER E WITH ACUTE
> > ))
> > (state (init (simple)))
> >
> > These problems didn't occur when I was using fcitx as the input
> > method.
> >
> > The first problem is that when the sequence of keystrokes is an
> > initial portion of another sequence of keystrokes, the character I
> > thought I had entered simply disappears after a while if I don't
> > firm it up by moving to another character, e.g. by entering more
> > strokes or moving the cursor.
>
> Can you give an example?
For example, if I hit shift-B three times into a Gnome-terminal and
then go away for a while, there are three betas when I get up but
only two when I come back. The wait varies a lot. It can be as little
as 20s, but can be several minutes!
Part of the cause of the problem is that I also have:
("B\\" ?ʙ) ; U+0299 LATIN LETTER SMALL CAPITAL B
("B\\\\" ?B)
Therefore the entry has not been finalised because I have not entered
a backslash, and it seems that unfinalised entries get deleted after a
timeout. This is extremely annoying when copy-editing diacritic-rich
text. At least one can often notice base characters disappearing
before one's eyes. Can a user or keyboard definition set the timeout to
something more reasonable, like a week?
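The prefix relationship described above can be checked mechanically. What follows is only a minimal Python sketch: it assumes the three "B" entries quoted in this message are the only ones starting with B, and `pending_prefixes` is a hypothetical helper, not part of m17n or ibus. It lists the key sequences that are proper prefixes of a longer sequence, i.e. the ones that would stay unfinalised until another key arrives. It does not model the timeout itself.

```python
def pending_prefixes(sequences):
    """Return the key sequences that are proper prefixes of another sequence."""
    seqs = set(sequences)
    return sorted(
        s for s in seqs
        if any(t != s and t.startswith(s) for t in seqs)
    )


# Key sequences as typed. Python and the .mim source happen to share the
# same escaping here: "B\\" is B followed by ONE backslash.
keymap = {
    "B": "\u03b2",      # GREEK SMALL LETTER BETA
    "B\\": "\u0299",    # LATIN LETTER SMALL CAPITAL B
    "B\\\\": "B",
}

ambiguous = pending_prefixes(keymap)
print(ambiguous)
```

Under these assumptions the helper reports "B" and "B" plus one backslash as ambiguous, which matches the observation that a lone shift-B is never finalised on its own.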
I have put my keyboard definition on the internet as
https://wrdingham.co.uk/fonts/xsampa.mim.zip. (Zipped because it's
UTF-8, not Latin-1.) Its comments haven't been fully converted from
Quail to m17n - I want the _source_ code to be the same for the Quail
and m17n versions.
> > The second problem is that the character I entered gets converted to
> > NFD. Thus I can't enter e acute to grep for the line defining its
> > keystrokes - I have to create my own regular expression engine to do
> > searches respecting Unicode canonical equivalence.
** PROBLEM 2 APPARENTLY SOLVED - SEE BELOW **
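For the grep difficulty quoted above, a full regular expression engine may not be needed for literal searches: normalising both the search string and each line to the same form makes canonically equivalent spellings compare equal. A rough Python sketch, where `find_equivalent` is a hypothetical helper and the sample lines are made up for illustration:

```python
import unicodedata


def find_equivalent(needle, lines, form="NFC"):
    """Return the lines containing needle, ignoring NFC/NFD differences."""
    key = unicodedata.normalize(form, needle)
    return [ln for ln in lines if key in unicodedata.normalize(form, ln)]


# Illustrative input only: one line holds precomposed U+00E9.
sample = [
    '("B" ?\u03b2)',
    '("e_H" ?\u00e9)',
]

# Search with the decomposed spelling: e followed by U+0301.
hits = find_equivalent("e\u0301", sample)
print(hits)
```

This only does literal substring matching; it is not a substitute for a pattern matcher that is aware of canonical equivalence.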
> > Am I missing some tricks to avoid these problems? The idea of the
> > keyboard is that I can just type IPA by typing XSAMPA.
> I am not sure whether I understand exactly what problems you are
> facing.
> Long ago I wrote my own ipa-x-sampa.mim which I have in my home
> directory:
> $ ls ~/.m17n.d/ipa-x-sampa.mim ~/.m17n.d/icons/ipa-x-sampa.png
> /home/mfabian/.m17n.d/icons/ipa-x-sampa.png
> /home/mfabian/.m17n.d/ipa-x-sampa.mim
> It seems to work for me.
> It did not contain a line like
> ("e_H" ?é) ; U+00E9 LATIN SMALL LETTER E WITH ACUTE
> it contained only:
> ("_H" ?́) ;; U+0301 COMBINING ACUTE ACCENT
> Because of that, when I typed “e_H” I got é (U+0065 LATIN SMALL
> LETTER E followed by U+0301 COMBINING ACUTE ACCENT)
> But when I add this line:
> ("e_H" ?é) ;; LATIN SMALL LETTER E WITH ACUTE <- this line added
> today!
>
>
> then typing “e_H” gives me é U+00E9 LATIN SMALL LETTER E WITH ACUTE.
When I enter the command line "echo e_He_He_H |wc" on the Gnome-terminal
running bash, I get a line length of 10, being the three characters in
NFD and the line-terminator. Additionally, the e-acutes are displayed
differently to the precomposed character.
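The figure of 10 can be reproduced by counting UTF-8 bytes. A small Python sketch, assuming a UTF-8 locale and that the number reported is the byte count from wc; `wc_bytes` is a hypothetical helper:

```python
import unicodedata


def wc_bytes(line):
    """Bytes that `echo line | wc` would count in a UTF-8 locale."""
    return len((line + "\n").encode("utf-8"))


nfc = "\u00e9" * 3                        # three precomposed e-acute
nfd = unicodedata.normalize("NFD", nfc)   # three times e + U+0301

print(wc_bytes(nfc))  # 7: 3 x 2 bytes, plus the newline
print(wc_bytes(nfd))  # 10: 3 x (1 + 2) bytes, plus the newline
```

So a count of 10 is what decomposed output would give, and 7 is what precomposed output would give.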
Possibly relevant definitions are:
("e_" ?e (pushback 1))
("e_H" ?é) ; U+00E9 LATIN SMALL LETTER E WITH ACUTE
("e_H\\" ["e˦"]) ; e, U+02E6
There was a time, back in October 2020, that a sequence would only
work if all its prefixes were single characters or defined, which is
when and why I started using 'pushback'. That side issue is currently
not present with ibus-m17n. The dates are consistent with moving from
deceased 32-bit Ubuntu to 64-bit Ubuntu, at which point (definitely
Focal) I gave ibus another try, having previously found it too
ill-supported and moved to fcitx.
Indeed, that seems to explain why I'm getting NFD. If I comment out
the definition for input sequence "e_", I now get NFC for e_H.
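The effect can be illustrated with a toy model. This is emphatically not m17n's matching algorithm, just a Python sketch of the hypothesis above: it assumes a rule with (pushback 1) fires as soon as its keys are typed and hands every key after the first back for re-matching, and it assumes the keyboard also has an "_H" rule producing U+0301 (as in the ipa-x-sampa.mim quoted in this thread). `type_keys` and the rule tables are hypothetical.

```python
def type_keys(keys, rules):
    """Toy matcher: rules map a key sequence to (text, handled_or_None)."""
    out, buf, pending = [], "", list(keys)
    while pending:
        buf += pending.pop(0)
        text, handled = rules.get(buf, (None, None))
        longer = any(k != buf and k.startswith(buf) for k in rules)
        if text is not None and handled is not None:
            # Pushback: commit the text, re-feed keys after the first `handled`.
            out.append(text)
            pending = list(buf[handled:]) + pending
            buf = ""
        elif text is not None and not longer:
            out.append(text)   # complete match, nothing longer possible
            buf = ""
        elif text is None and not longer:
            out.append(buf)    # no rule can match: pass the keys through
            buf = ""
    if buf:                    # flush whatever is still pending at the end
        out.append(rules[buf][0] if buf in rules else buf)
    return "".join(out)


with_pushback = {
    "e_": ("e", 1),            # ("e_" ?e (pushback 1))
    "e_H": ("\u00e9", None),   # ("e_H" ?é)
    "_H": ("\u0301", None),    # assumed combining-acute rule
}
without_pushback = {k: v for k, v in with_pushback.items() if k != "e_"}

print(type_keys("e_H", with_pushback) == "e\u0301")      # decomposed
print(type_keys("e_H", without_pushback) == "\u00e9")    # precomposed
```

In this model the "e_" rule consumes the "e" before the longer "e_H" rule gets a chance, so "_H" is matched separately as a combining accent; removing the "e_" rule lets "e_H" match as a whole.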
> But typing “a_H” still gives á (U+0061 LATIN SMALL LETTER A followed
> by U+0301 COMBINING ACUTE ACCENT), because I did not add a ("a_H" ?á)
> line.
> So by default, my ipa-x-sampa.mim produces NFD for é, that’s the way
> I wrote it. But it could be changed to behave differently.
>
> My ipa-x-sampa.mim is here:
>
> https://github.com/mike-fabian/m17n-db-ipa-x-sampa
>
> If you show me yours I can test it and see what goes wrong.
Thanks. As said above, now available at
https://wrdingham.co.uk/fonts/xsampa.mim.zip.
Richard.