Re: [Groff] Latin-2 woes...


From: Ted Harding
Subject: Re: [Groff] Latin-2 woes...
Date: Thu, 12 Oct 2000 10:10:00 +0100 (BST)

On 11-Oct-00 Lukasz Wiechec wrote:
>> Nevertheless, the general solution is to write a proper input encoding
>> file to map Latin-2 to glyph names; something like
>> 
>>   .char £ \[/L]
>>   .char ³ \[/l]
>>   ...
> 
> I don't quite follow. '.char' isn't a groff macro, is it ? (4 letters
> ?)

1. '.char' is indeed part of groff; strictly speaking it is a built-in
   request rather than a macro (groff names can be any length), but it
   will not work in "compatibility" mode, since traditional troff
   names can be at most two letters long.

   .char \[name] string

   defines a character whose name is "name" and which is constructed
   by formatting "string". You then use it by putting '\[name]' where
   you want it in the input text.

   If a character with name "name" already exists, then it is replaced
   by the new definition. Any character can be treated in this way;
   you can even redefine the ordinary English character "a", as I do
   for Cyrillic:

   .de Cyrillic
   .ft AntCy
   .ftr Cy AntCy
   .char \(yu \N'192'
   .char a \N'193'
   .............
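
As a minimal sketch of the two points above (definition and
replacement), assuming a made-up glyph name "smiley" that is not in
any standard font:

```groff
.\" "smiley" is a hypothetical name chosen only for this illustration.
.char \[smiley] :-)
Have a nice day \[smiley]
.\" A second .char for the same name replaces the first definition.
.char \[smiley] ;-)
Have a nice day \[smiley]
```

The first text line is set using the first definition and the second
text line using the replacement.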


2. The groff command '.rchar \[name]' removes the definition. This
   gives rise to one of two situations.

   a) The character-name "name" can be found in the font file for
      the current font or a currently-searchable Special font. In
      this case the character with that name will be used as though
      the ".char" definition had never been given.

   b) Otherwise, the character-name "name" disappears altogether. In
      particular, if you give one definition '.char \[name] ... ' and
      then another definition '.char \[name] ... ', followed later by
      '.rchar \[name]', then the first definition is gone too.
      Character definitions do not "stack", and if you want the first
      one again then you must redefine it.
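
Both situations can be tried out with a short test file. This is only
a sketch: "foo" is a hypothetical name assumed to be absent from the
fonts in use, and \[*a] (Greek alpha) is assumed to be available from
a Special font, as it is for devps.

```groff
.\" Case (a): "a" exists in the font, so .rchar restores the font's glyph.
.char a \[*a]
banana
.rchar a
banana
.\" Case (b): "foo" is hypothetical and exists in no font.
.char \[foo] first
.char \[foo] second
.\" The next line uses the second definition; the first was replaced.
\[foo]
.rchar \[foo]
.\" Now \[foo] is undefined; the first definition does not come back.
\[foo]
```

After the final '.rchar', groff may warn that the character is not
defined, and nothing is printed for it.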

Therefore, for instance, if (using my .Cyrillic which defines Cyrillic
characters as above, and my ./Cyrillic which undoes all the definitions)
I write

   .Cyrillic
   Vladimir Putin
   ./Cyrillic
   Vladimir Putin

the first "Vladimir Putin" will print in Cyrillic, and the second will
print in English, since all these characters can still be found in the
standard fonts. On the other hand,

   .Cyrillic
   \[Ch]e\[ch]ni\[ya]
   ./Cyrillic
   \[Ch]e\[ch]ni\[ya]

will first print the Cyrillic version of the name which is written
"Chechniya" in English, and then print "eni" in English, since \[Ch],
\[ch] and \[ya] have been removed and are not in the standard font files,
while e, n and i can still be found.

Werner's suggestion amounts to the following.

In the standard PostScript fonts, there are characters with PostScript
names "Lslash" and "lslash": the glyphs for these form part of standard
PostScript and so do not need separate definition.

If you look in one of the devps font files (say .../groff/font/devps/TR)
you will find

   ....
   /l      278,683 2       0234    lslash
   ....
   /L      611,662 2       0237    Lslash
   ....

so these characters have groff names \[/l] and \[/L] already defined
in the font files: therefore ".char anything \[/L]" means that whenever
a character named "anything" occurs in the input stream, the string
"\[/L]" is used instead. Therefore this will print as intended.

In the Latin-1 encoding (which is what groff recognises), the "£" sign
occurs in the same position as "Lslash" in Latin-2. Therefore, when
groff sees the input-byte corresponding to "£" in Latin-1, it consults
the ".char" definition and replaces it with "\[/L]". This does not involve
any builtin "knowledge" of Latin-2 encoding: it is the user who has
supplied this in the ".char" definition:

  .char £ \[/L]
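
As a hedged illustration, here is how a small Latin-2 test file would
look when viewed as Latin-1. It assumes the file is saved in an 8-bit
encoding, so that "£" and "³" are the single bytes 0xA3 and 0xB3 (the
Latin-2 positions of Lslash and lslash); the two place names are just
sample text.

```groff
.\" Map the Latin-1 readings of bytes 0xA3 and 0xB3 to the Polish glyphs.
.char £ \[/L]
.char ³ \[/l]
.\" Typed in a Latin-2 editor as "Lslash-owicz, Wis-lslash-a";
.\" a Latin-1 viewer shows the same bytes as follows.
£owicz, Wis³a
```

Only these two mappings come from the discussion above; the remaining
Latin-2 characters would each need a '.char' line of their own.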

Unfortunately for Poles and others, the ready-made glyphs in the
standard PostScript fonts do not cover all the needed possibilities,
and you will then have to define your characters with strings which
cause them to be directly constructed. For this, fortunately the
standard PostScript fonts contain glyphs for a variety of accents.


For instance, the ms macros allow a character to be used as an
"accent-under" for the preceding character, in particular "ogonek"
(PostScript name) for which the groff name is "\[ho]" ("hook"):

   ho    333,0,165       0       0230    ogonek

So you can define the string \*[ogonek] which performs the
correct placement of this accent under the preceding character by

  .acc*under-def ogonek \[ho]

  .char \[o,] o\*[ogonek]

and then use \[o,] whenever you need "o-ogonek".

I had been meaning to pick up on this topic earlier, giving a
complete repertoire of Latin-2 translations, but I have been too
taken up with other things recently to finish it. However, I will
do it later.

Meanwhile, I hope the above at least points the way.

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <address@hidden>
Fax-to-email: +44 (0)870 284 7749
Date: 12-Oct-00                                       Time: 10:10:00
------------------------------ XFMail ------------------------------
