[groff] Accented Cyrillic characters


From: Robin Haberkorn
Subject: [groff] Accented Cyrillic characters
Date: Thu, 2 Aug 2018 01:15:16 +0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

Hello!

I'm working on a small Russian offline dictionary that formats the entries of
words into Troff/Man pages, so you can view them in the terminal.

There is a small problem when trying to format accented Cyrillic characters.
Accents are commonly used in Russian to mark word stress by placing them on
the vowel of the stressed syllable.
Currently, I'm just adding a standalone Unicode combining acute accent (U+0301)
after every vowel I want to show stress on, since Unicode does not seem to define
separate codepoints for all of the accented Cyrillic vowels.
AFAIK, Groff does not really interpret the accent; it treats it as just another
standalone glyph. But the terminal emulator (at least URXVT) will combine the
accent and the vowel into a single glyph.
For instance, саморазруше\[u0301]ние will effectively render as саморазруше́ние.
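
The claim about missing codepoints can be checked outside of Groff. The following is only a small sketch in Python (the helper name nfc_length is made up for illustration): it asks Unicode NFC normalization to compose a base letter with U+0301 and reports how many code points remain. If a precomposed character exists, the result is one code point; otherwise the pair stays as two.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def nfc_length(base: str) -> int:
    """Number of code points left after NFC-composing base + acute accent."""
    return len(unicodedata.normalize("NFC", base + ACUTE))


if __name__ == "__main__":
    # Latin small letter e: a precomposed form (U+00E9) exists.
    print("Latin e:    ", nfc_length("e"))
    # Cyrillic small letter ie (U+0435): no precomposed form with acute.
    print("Cyrillic ie:", nfc_length("\u0435"))
```

So normalizing the dictionary data before handing it to Groff would not help for the Cyrillic vowels; the accent has to stay a separate combining character.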

This approach of adding accents causes problems with tbl, though. Combining
the two characters into a single glyph screws up tbl's (and/or Groff's)
assumptions about how wide the text is. For instance, in a table like:
| саморазруше́ние |
| foo bar         |
the bars won't properly line up.
It will probably cause other more subtle formatting issues as well, but that's
where I personally caught it.
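
To illustrate the mismatch (this is not how tbl or Groff actually measure text, just a sketch of the effect), the Python snippet below compares the number of code points in a string with the number of terminal columns it occupies once combining marks are merged into the preceding character. The function names are hypothetical, and the column count deliberately ignores double-width characters.

```python
import unicodedata


def code_points(text: str) -> int:
    """What a formatter sees if it counts every character as one glyph."""
    return len(text)


def display_columns(text: str) -> int:
    """Columns used on a terminal that merges combining marks.

    Simplification: every non-combining character is assumed to be one
    column wide (no handling of double-width or zero-width characters).
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


if __name__ == "__main__":
    for cell in ("саморазруше\u0301ние", "foo bar"):
        print(cell, code_points(cell), display_columns(cell))
```

For the accented word the two counts differ by one, which matches the single-column shift of the closing bar in the table above.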

I tried to use the Groff Unicode composite syntax, so it becomes clear to Groff
that the accented character is a single glyph. For instance,
\[u0435_0301] should theoretically also format as an accented Cyrillic e.
But what happens instead is that the accent is dropped during formatting.
Curiously, this works when using Latin characters. For instance, \[e u0301],
\[e aa], \[e '] will result in a properly accented Latin e.
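
For reference, the composite glyph names used above can be generated mechanically. This is a minimal Python sketch, assuming the uXXXX_YYYY naming with four-digit uppercase hexadecimal code points as in \[u0435_0301]; the function name groff_composite is invented for this example. It only builds the escape string and does nothing about the accent being dropped during formatting.

```python
def groff_composite(base: str, *marks: str) -> str:
    """Build a Groff escape like \\[u0435_0301] from a base character
    and one or more combining characters (assumed naming scheme)."""
    # Each code point is written as at least four uppercase hex digits.
    parts = [format(ord(ch), "04X") for ch in (base, *marks)]
    return "\\[u" + "_".join(parts) + "]"


if __name__ == "__main__":
    # Cyrillic small letter ie + combining acute accent
    print(groff_composite("\u0435", "\u0301"))
```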

Why is that so? Did I catch a grotty bug here?
Do you know any workaround I could employ?

Best regards,
Robin


