bug-groff
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: PDFPIC misbehaves if called two times in a row


From: G. Branden Robinson
Subject: Re: PDFPIC misbehaves if called two times in a row
Date: Sun, 31 Jul 2022 16:42:20 -0500

At 2022-07-31T20:59:44+0200, joerg van den hoff wrote:
> On 31.07.22 18:57, Deri wrote:
> > On Sunday, 31 July 2022 17:34:19 BST joerg van den hoff wrote:
> > > hi deri,
> > > 
> > > thank you for looking into this!
> > > 
> > > question: the pdfpic.tmac you've send is supposed to work if I
> > > include it at top of my document (e.g. the fragment tt.trf) so
> > > that it overrules the pdfpic that is probably loaded previously?
> > > 
> > > well, I have done that (include at top), but then it does only
> > > generate empty output files (both, with grops as well as gropdf)
> > > 
> > > am I missing something?
[...]
> ok, understood (question: why does explicit inclusion in file not
> work??).

A lot of the files in groff's tmac directory have "include guards", like
C header files.  pdfpic.tmac's looks like this.

.do if d PDFPIC .nx

This means to skip to the next input file (`nx`) if a macro[1] named
`PDFPIC` is already defined.  Since there is no next input file, troff
exits.

And `PDFPIC` _will_ be defined, if you're using the 'pdf' output device,
because the default troffrc file unconditionally includes it.

I would take that line out of your troffrc, or the `do` line above out
of your inlined "pdfpic" macro file--but I didn't test the latter.

> > I notice that Branden has just checked in the two line changes in
> > the pdfpic.tmac file I sent to you, so I think they were Ok, saves
> > me some typing too. :-)
> 
> I've know replaced it in .../groff/1.22.4/tmac and noted that the old
> and new version differ rather massively: is 1.22.4 really that
> obsolete already?

It's not obsolete; I refactored it for clarity and robustness.

Here are all the commits to the file since groff 1.22.4 was released
(skip if bored).

$ git log --reverse 1.22.4..HEAD tmac/pdfpic.tmac
commit bdf82767ba8a22502f1bf245665c6f710c215ad0
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Fri Apr 17 20:47:20 2020 +1000

    tmac/*.tmac: Add editor assist comments.
    
    Most *.tmac files already had some.

commit 1474eebe1f83c439dd6d422a311c1c39417fac40
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Sun Apr 19 15:24:24 2020 +1000

    **/*.{man,tmac}: Save compatibility mode robustly.
    
    Use new \n[.cp] register to save compatibilty mode.
    
    Use register names based on the filename (at the source maintenance
    level) to avoid clobbering other files' saved compatibility modes.
    
    tmac/html.tmac: Eliminate reference to saved-compatibility register by
    moving its test inside the block where compatibility mode is off.  This
    is the only part of this changeset that was not automated.

commit f3240eda596796b19a4cd4cc7cf7e665d410df91
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Sat Jul 10 16:06:47 2021 +1000

    [tmac]: Remove compatibility mode save registers.
    
    ...after using them.
    
    * tmac/an.tmac:
    * tmac/andoc.tmac:
    * tmac/cp1047.tmac:
    * tmac/devtag.tmac:
    * tmac/ec.tmac:
    * tmac/fallbacks.tmac:
    * tmac/latin1.tmac:
    * tmac/latin2.tmac:
    * tmac/latin5.tmac:
    * tmac/latin9.tmac:
    * tmac/papersize.tmac:
    * tmac/pdfpic.tmac:
    * tmac/psold.tmac:
    * tmac/pspic.tmac:
    * tmac/trace.tmac:
    * tmac/unicode.tmac: Do it.
    
    * doc/groff.texi (Implementation Differences):
    * man/groff_diff.7.man (Implementation Differences): Illustrate doing so
      in relevant examples.

commit 8bff992449d6728a50e2905023bdec0a2d8c5c4e
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Fri Jan 21 16:51:34 2022 +1100

    [pdfpic]: Improve robustness.
    
    * tmac/pdfpic.tmac (@abort): Rename to `pdfpic@error`.  This is an
      auxiliary package, and something else could very well step on the
      former name (or worse, we break it if we're loaded later).
    
      (pdfpic@error): Stop aborting; simply report an error.  It's up to the
      user how serious `PDFPIC` macro problems are.  As noted in a comment,
      the user can easile `am PDFPIC` to tack an `ab` request onto the end
      of its definition.  Always report input file name and line number.
      Replace "[PDFPIC]" prefix with the name of the macro file complaining,
      to make it easier for groff non-experts to find.
    
      (PDFPIC): Return upon errors.  Recast diagnostic messages.  Stop
      implying that we perform any sort of probing test of file type
      (there's no telling what pdfinfo(1) will say).  Apply new 'stringdown'
      request so that we accept '.pdf' file name extension in any
      lettercase.  Test file argument for existence before proceeding
      (acknowledge TOCTTOU exposure); see Savannah #61892.  Skip file if
      pipeline returned a non-zero exit status or the registers into which
      we extract the height and width are undefined (indicating failure of a
      temporary file to be created or read).  Reject files with non-positive
      image width or height reported by `pdfinfo`.  Validate `width` and
      `height` arguments, if given, rejecting non-positive values.
    
    Fixes <https://savannah.gnu.org/bugs/?61892>.

commit 0fd6ab6b4ca1e4bbb5740925d38fba0a35a2541f
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Fri Jan 21 19:31:08 2022 +1100

    [pdfpic]: Refactor.
    
    Now that the package does not abort upon the first whiff of any trouble,
    avoid littering groff's name spaces.  Take this opportunity to rename
    registers and strings to have obvious meaning to even the casual reader.
    
    * tmac/pdfpic.tmac: Do it.
    
      (pdfpic@cleanup): New macro removes temporary strings and registers.
    
      (PDFPIC): Call the cleanup macro upon entry; this way, if we errored
      out from a previous call, we avoid confusion.  (We don't clean up upon
      an error return because the leftover objects might be useful for
      troubleshooting.)  Rename registers and strings, to get them under
      name space discipline and also to better suggest their purpose.
    
      - convert-pdf     -> pdfpic*do-conversion
      - pdf-offset-mode -> pdfpic*offset-mode
      - pspic-args      -> pdfpic*pspic-args
      - pdf-offset      -> pdfpic*indentation
      - is-pdf          -> pdfpic*file-extension
      - img-file        -> pdfpic*file-name
      - pdf-wid         -> pdfpic*width
      - pdf-ht          -> pdfpic*height
      - pdf-deswid      -> pdfpic*desired-width
      - pdf-desht       -> pdfpic*desired-height
    
      Call cleanup macro before returning upon successful operation.

commit 1483e1168224e9d37ff0986ccf5e3346af7e9281
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Fri Jan 21 20:38:19 2022 +1100

    [pdfpic]: Fix Savannah #58206.
    
    * tmac/pdfpic.tmac (PDFPIC): Scrub null bytes out of pdfinfo(1) output.
      Thanks to an anonymous contributor for the patch (the commentary about
      it is mine, if someone wants an argument).
    
    Fixes <https://savannah.gnu.org/bugs/?58206>.

commit cad31e26c118decd8c19be7ef7a53665c22a0117
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Sat Jan 22 04:03:49 2022 +1100

    tmac/pdfpic.tmac: Fix comment nits.

commit d026a4c4949156fc1a0bb0e61987c19ee3b2cac4
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Tue Feb 15 02:18:06 2022 +1100

    tmac/pdfpic.tmac: Refactor PDFPIC_NOSPACE handling.
    
    * tmac/pdfpic.tmac (PDFPIC): Refactor PDFPIC_NOSPACE handling.  Stop
      shelling out to create, and then sourcing, a temporary file just to
      obtain the value of an environment variable.  groff already has the
      `\V` escape sequence for that purpose (it's even safe!).  Check that
      the value of $GROFF_PDFPIC_NOSPACE is a valid numeric expression
      before assigning it to a register.  Trivially, use '=' instead of '=='
      as an equality operator.  *roff languages do not use '=' as an
      assignment operator, and I believe the '==' synonym to be a sop to C
      programmers.

commit 38da9fee2c00a60abd7282b29db757cd30594099
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Tue Feb 15 17:22:30 2022 +1100

    tmac/pdfpic.tmac: Slightly refactor.
    
    * tmac/pdfpic.tmac: Slightly refactor.
    
      (pdfpic@cleanup, PDFPIC): Rename `pdfpic*file-name` to
      `pdfpic*file-name-base` since it is not used as a complete file name
      (but rather a basis for an ".eps" extension).
    
      (PDFPIC): When testing PDF file extension, include the "."; a file
      name like "mxyzptlkpdf" is too dubious to accept.  Also explicitly
      compare `systat` register as equal or not equal to zero, since its
      truth value is inverted from the expectations of *roff expressions.

commit afc1999af86d95d0dfce22235c8bf4c02154c917
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Tue Feb 15 19:30:50 2022 +1100

    [pdfpic]: Fix Savannah #62055 for Unix systems.
    
    * tmac/pdfpic.tmac: Search for temporary directories as groff(1) does,
      instead of going straight to /tmp.
    
      (pdfpic@get-temporary-directory): New function checks each of its
      arguments for validity as a temporary directory; if one is found, its
      name is left in `pdfpic*temporary-directory`.
    
      (pdfpic@cleanup): Remove strings `pdfpic*temporary-directory` and
      `pdfpic*temporary-file`.
    
      (PDFPIC): Call `pdfpic@get-temporary-directory`, using the contents of
      the environment variables $GROFF_TMPDIR and $TMPDIR, and then /tmp, in
      that order.  Store the temporary file name in string
      `pdfpic*temporary-file`.  Use this string in subsequent `sy` and `so`
      requests.
    
    Fixes <https://savannah.gnu.org/bugs/?62055> for Unix systems.

commit 24900cf6d73088e4b00c797221ede1b5ac36d863
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Tue Feb 15 19:47:25 2022 +1100

    [pdfpic]: Fix Savannah #62055 for Cygwin/MinGW.
    
    * tmac/pdfpic.tmac: Add support for Cygwin/MinGW temporary directory
      conventions.

commit 5cf4d08deba2a46018da840c581dd5efa3fc4e50
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Fri Mar 25 18:16:43 2022 +1100

    [pdfpic]: Fix breakage if no temp dir defined.
    
    * tmac/pdfpic.tmac (PDFPIC): Fix breakage when no temporary directory
      environment variables are defined: actually use string interpolation
      syntax in comparand to output comparison operator.  Problem introduced
      by me in commits adc1999af and 24900cf6d, 15 February.

commit 339cb9ff4aef5e50dbaa1cea95e95087bae44942
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Tue Apr 12 11:16:51 2022 +1000

    tmac/pdfpic.tmac: Fix style nit (wrap long line).

commit 343d2567a1fb2c3eb7248d44280e0a7df7c4c24a
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date:   Sun Jul 31 11:06:13 2022 -0500

    [pdfpic]: Finish incomplete string renames.
    
    * tmac/pdfpic.tmac: Finish incomplete string renames.  Continues
      0fd6ab6b4c, 21 January.
            pdfpic*file-name -> pdfpic*file-name-base
            pspic-args -> pdfpic*pspic-args

> grops now works (thank you!) but gropdf know throws error (NB: moved the
> picture file to subdir `figs' of the dir where the trf-file resides in the
> meantime -- therefore different path to file):
> 
>  error: retrieval of 'figs/aitail.pdf' image dimensions failed; skipping

Here's the logic.  I regret that the syntax is pretty thick.

.  \" Get image dimensions.  The `tr` command to strip null bytes is
.  \" distasteful, but its necessity is imposed on us.  See
.  \" <https://gitlab.freedesktop.org/poppler/poppler/-/issues/776>.
.  ec @
.  sy pdfinfo @$1 | \
tr -d '\000' | \
grep "Page *size" | \
sed -e 's/Page *size: *\\([[:digit:].]*\\) *x *\\([[:digit:].]*\\).*$/\
.nr pdfpic*width (p;\\1)\\n\
.nr pdfpic*height  (p;\\2)/' \
> @*[pdfpic*temporary-file]
.  ec
.  if \\n[systat] \{\
.    pdfpic@error retrieval of '\\$1' image dimensions failed with \
exit status \\n[systat]; skipping
.    return
.  \}

> (and yes, I do have poppler's pdfinfo and pdftops installed :))

I would try running the command that the `PDFPIC` macro is running.
Build up the pipeline step by step and check the exit status at every
point.  (I have the exit status of the previous command in my shell
prompt for this reason.)

pdfinfo yourfile.pdf || echo failed, status $?
pdfinfo yourfile.pdf | tr -d '\000' || echo failed, status $?
pdfinfo yourfile.pdf | tr -d '\000' | grep "Page *size" \
  || echo failed, status $?
pdfinfo yourfile.pdf | tr -d '\000' | grep "Page *size" \
  | sed -e 's/Page *size: *\\([[:digit:].]*\\) *x *\\([[:digit:].]*\\).*$/\
.nr pdfpic*width (p;\\1)\\n\
.nr pdfpic*height  (p;\\2)/' || echo failed, status $?

> so no inclusion of pictures at all...?

Since the `PDFPIC` macro is failing to perform an essential
operation--it has to know how big the image is--that's right.

However it should be possible to sort this out with the procedure above.
It is a combination of Bernd Warken's code and mine; a true
Frankenstein's monster, in other words.  But without a PDF interpreter
built in to the GNU troff formatter or requiring the user to do
preparatory work to create PDF or image file descriptions out-of-band, I
can't imagine any other way this work could get done.

Some details may need to be fiddled, of course.

Regards,
Branden

[1] ...or string, or diversion...

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]