aliasing fonts (was: Computer Modern Font)


From: G. Branden Robinson
Subject: aliasing fonts (was: Computer Modern Font)
Date: Thu, 8 Jun 2023 11:47:01 -0500

Hi Alexis,

Thanks for working on this use case!

At 2023-06-08T13:40:36+0200, Alexis wrote:
> When using the generated Computer Modern Unicode fonts with pdfroff,
> grops complains about invalid input characters in the pfb files, e.g.:
> 
>   % pdfroff -t -ms -mcmu doc/ms.ms > doc/ms.pdf
>   grops:$GROFF_FONT_PATH/devps/../cm-unicode-0.7.0/cmunbi.pfb (doc/ms.ms):3047: invalid input character code 3
> 
> What could be the cause for that?

To all appearances, it is a diagnostic from one of the following two
places in the source.

src/devices/grops/psrm.cpp:429:      error("invalid input character code %1", int(c));
src/libs/libgroff/font.cpp:116: error("invalid input character code %1", int(c));

The second diagnostic tests for valid input characters to the formatter.
These are documented (at least as of groff 1.23.0 RC yadda) in our
Texinfo manual and in the groff(7) page.

    Invalid input characters are a subset of control characters (from the
    sets "C0 Controls" and "C1 Controls" as Unicode describes them).
    When troff encounters one in an identifier, it produces a warning in
    category "input" (see section "Warnings" in troff(1)).  They are
    removed during interpretation: an identifier "foo", followed by an
    invalid character and then "bar", is processed as "foobar".

    On a machine using the ISO 646, 8859, or 10646 character encodings,
    invalid input characters are 0x00, 0x08, 0x0B, 0x0D-0x1F, and
    0x80-0x9F.  On an EBCDIC host, they are 0x00-0x01, 0x08, 0x09, 0x0B,
    0x0D-0x14, 0x17-0x1F, and 0x30-0x3F.  Some of these code points are
    used by troff internally, making it non-trivial to extend the
    program to accept UTF-8 or other encodings that use characters from
    these ranges.
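
As an illustration only, here is a minimal Python sketch that transcribes
the ISO 646/8859/10646 ranges quoted above into a set and tests byte
values against it.  The names INVALID_CODES, is_invalid_troff_input, and
find_invalid_bytes are made up for this example; none of this is groff
code, and it does not cover the EBCDIC case.

```python
# Invalid input character codes for GNU troff on ISO 646/8859/10646
# hosts, transcribed from the ranges quoted from the manual.
# (Hypothetical helper names; this is not part of groff.)
INVALID_CODES = (
    {0x00, 0x08, 0x0B}
    | set(range(0x0D, 0x20))  # 0x0D-0x1F
    | set(range(0x80, 0xA0))  # 0x80-0x9F
)


def is_invalid_troff_input(code: int) -> bool:
    """Return True if the byte value is in the quoted invalid set."""
    return code in INVALID_CODES


def find_invalid_bytes(data: bytes):
    """Yield (offset, code) for each byte that falls in the invalid set."""
    for offset, code in enumerate(data):
        if code in INVALID_CODES:
            yield offset, code
```

With these definitions, is_invalid_troff_input(3) is false and
is_invalid_troff_input(0x0D) is true.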

That the diagnostic above is prefixed neither with "error" nor "warning"
suggests to me that it is groff 1.22.4 output.  It also looks like I
might want to recast some of these diagnostics slightly to distinguish
them.  Character code 3 _is_ valid GNU troff input, so by elimination I
begin to suspect the source of the diagnostic is the first file,
psrm.cpp.

"psrm" is short for PostScript Resource Manager.  It's responsible for
loading fonts and whatever else can be embedded in PostScript.  It has
its own table of valid input characters, and code 3 is _not_ valid
there.

https://git.savannah.gnu.org/cgit/groff.git/tree/src/devices/grops/psrm.cpp?h=1.23.0.rc4#n38

> Does someone know of a fix or workaround?

It sounds like we need a PostScript expert to tell us if grops's table
is accurate.  If it is, then something is probably producing a corrupt
PFB file.

A workaround might be to convert it to PFA instead.  groff's own
pfbtops(1) command can do this.  Ghostscript provides pfbtopfa(1) too.
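
Before converting, it may help to check whether the PFB file is
structurally sound.  The Python sketch below is a rough sanity check,
not what grops or pfbtops do internally.  It assumes the usual PFB
layout: each segment starts with the byte 0x80 and a type byte (1 for
ASCII, 2 for binary, 3 for end of file), and segment types 1 and 2
carry a four-byte little-endian length followed by that many bytes of
data.  The function name pfb_segments is invented for this example.

```python
import struct


def pfb_segments(data: bytes):
    """Return a list of (segment_type, length) pairs from PFB data.

    Raises ValueError if a segment header is malformed or truncated,
    or if no end-of-file segment is found.
    """
    segments = []
    pos = 0
    while pos < len(data):
        # Every segment header starts with 0x80 followed by a type byte.
        if len(data) - pos < 2 or data[pos] != 0x80:
            raise ValueError(f"bad segment marker at offset {pos}")
        seg_type = data[pos + 1]
        if seg_type == 3:
            # The end-of-file segment has no length field.
            segments.append((3, 0))
            return segments
        if seg_type not in (1, 2):
            raise ValueError(f"unknown segment type {seg_type} at offset {pos}")
        if len(data) - pos < 6:
            raise ValueError(f"truncated segment header at offset {pos}")
        (length,) = struct.unpack_from("<I", data, pos + 2)
        start = pos + 6
        if start + length > len(data):
            raise ValueError(f"truncated segment data at offset {start}")
        segments.append((seg_type, length))
        pos = start + length
    raise ValueError("missing end-of-file segment")
```

Calling pfb_segments(open("cmunbi.pfb", "rb").read()) on a well-formed
file should return a short list of segments ending in (3, 0); a
ValueError would hint that the file itself is damaged.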

> What follows are some thoughts on adding support for font aliases to
> groff.
> 
> With groff 1.22.4 doing less validation of font files, it was possible
> to simply create a symlink to a font file and that symlink would serve
> as a font alias.
>
> Since groff 1.23.0 does more validation of the font files than groff
> 1.22.4, each font alias needs to be a separate file on disk, although
> only the filename and value for the name directive differ.
> 
> Is it possible and feasible to change the format of the name directive
> so that it allows for several names/aliases?
>
> This would allow a single font file to be used for several font
> aliases.
> 
> Imagine a font file cmunrm containing the following name directive:
> 
>   name cmunrm CMUSerifR CMUSerifRoman
>   
> and CMUSerifR and CMUSerifRoman being symlinks to cmunrm.
> 
> This would make it easy to add a new font alias to a font and show
> the relation of fonts and font aliases on the file system too.

It is probably feasible, but...

> So far I've only looked into supporting multiple names and aliases
> during font file parsing in function font::load from
> src/libs/libgroff/font.cpp:779 but know too little about groff's font
> loading mechanism in general. Any pointers are greatly appreciated.
> 
> If folks think that this might be a useful change I'd be happy to
> learn what other code parts might need changing and possibly have
> a first go at it.
>
> What are your thoughts?

I think there is already a mechanism for this.  When I learned of it, I
found it a surprisingly old one, dating back to what we might call "late
Kernighan troff".

Also, there is already some precedent for shipping relatively small
supporting macro files as companions to font descriptions.  This
precedent is "ec.tmac", which has been around for many years as support
for the EC fonts for TeX and our grodvi(1) output driver.

So if a font packager for groff were willing to maintain and supply a
macro file as well, they could alias the font when it is mounted.

Quoting our Texinfo manual from the groff Git master branch...

 -- Request: .fp pos id [font-description-file-name]
 -- Register: \n[.f]
 -- Register: \n[.fp]
     Mount a font under the name ID at mounting position POS, a
     non-negative integer.  When the formatter starts up, it reads the
     output device's description to mount an initial set of faces, and
     selects font position 1.  Position 0 is unused by default.  Unless
     the FONT-DESCRIPTION-FILE-NAME argument is given, ID should be the
     name of a font description file stored in a directory corresponding
     to the selected output device.  GNU 'troff' does not traverse
     directories to locate the font description file.

     The optional third argument enables font names to be aliased, which
     can be necessary in compatibility mode since AT&T 'troff' syntax
     affords no means of identifying fonts with names longer than two
     characters, like 'TBI' or 'ZCMI', in a font selection escape
     sequence.  *Note Compatibility Mode::.  You can also alias fonts on
     mounting for convenience or abstraction.  (See below regarding the
     '.fp' register.)

          .fp \n[.fp] SC ZCMI
          Send a \f(SChand-written\fP thank-you note.
          .fp \n[.fp] Emph TI
          .fp \n[.fp] Strong TB
          Are \f[Emph]these names\f[] \f[Strong]comfortable\f[]?

     'DESC', 'P', and non-negative integers are not usable as font
     identifiers.

     The position of the currently selected font (or abstract style) is
     available in the read-only register '.f'.  It is associated with
     the environment (*note Environments::).

     You can copy the value of '.f' to another register to save it for
     later use.

          .nr saved-font \n[.f]
          ... text involving many font changes ...
          .ft \n[saved-font]

     The index of the next (non-zero) free font position is available in
     the read-only register '.fp'.  Fonts not listed in the 'DESC' file
     are automatically mounted at position '\n[.fp]' when selected with
     the 'ft' request or '\f' escape sequence.  When mounting a font at
     a position explicitly with the 'fp' request, this same practice
     should be followed, although GNU 'troff' does not enforce this
     strictly.
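
To make this concrete for the Computer Modern Unicode case, here is a
small Python sketch that generates the 'fp' requests a font packager
might put in a companion macro file.  The function name fp_alias_lines
and the idea of generating the lines with a script are assumptions of
this example; the alias names CMUSerifR and CMUSerifRoman and the font
description file name cmunrm come from Alexis's message.  The check
mirrors the manual's statement that 'DESC', 'P', and non-negative
integers are not usable as font identifiers.

```python
def fp_alias_lines(aliases):
    """Build 'fp' requests mounting each alias on a font description file.

    `aliases` maps an alias (font identifier) to the name of the font
    description file it should refer to.
    """
    lines = []
    for alias, font_file in aliases.items():
        # Per the manual, these cannot be used as font identifiers.
        if alias in ("DESC", "P") or alias.isdigit():
            raise ValueError(f"{alias!r} is not usable as a font identifier")
        # \n[.fp] interpolates the next free mounting position.
        lines.append(rf".fp \n[.fp] {alias} {font_file}")
    return lines


if __name__ == "__main__":
    for line in fp_alias_lines(
        {"CMUSerifR": "cmunrm", "CMUSerifRoman": "cmunrm"}
    ):
        print(line)
```

Run as a script, this prints one 'fp' request per alias, each mounting
cmunrm at the next free position under a different name.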

Dave Kemper and I mused about these issues fairly extensively in
<https://savannah.gnu.org/bugs/?61423>.

Does this show the way to a better solution?  Is there anything unclear
above that I might make more lucid?

Regards,
Branden
