Re: [Groff] PDFPIC macro
From: Keith Marshall
Subject: Re: [Groff] PDFPIC macro
Date: Mon, 9 Oct 2017 09:10:18 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0
Hi Deri,
Thanks for trying it out.
On 09/10/17 01:21, Deri James wrote:
> Some pdfs I have tried fail with "syntax error".
That's yacc's default behaviour, when the sequence of tokens returned
by the lexer doesn't conform to its notion of a valid grammar -- either
the order isn't as expected, or the sequence is incomplete.
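
To illustrate those two failure modes, here is a minimal sketch in Python.
It is a hand-written checker, not yacc, and not the psbb grammar; the
function name parse_dict and the token representation (a list of strings)
are assumptions made purely for illustration. It accepts only the sequence
"<<", then name/value pairs, then ">>", and reports "syntax error" for
anything else.

```python
def parse_dict(tokens):
    """Accept '<<' (name value)* '>>' and nothing else; return the pairs.

    Hypothetical toy grammar for a flat PDF-style dictionary; any other
    token sequence is rejected with a bare "syntax error", much as a
    yacc-generated parser does by default.
    """
    it = iter(tokens)
    if next(it, None) != "<<":
        raise SyntaxError("syntax error")      # wrong first token
    pairs = {}
    while True:
        key = next(it, None)
        if key == ">>":
            break                              # dictionary closed properly
        if key is None or not key.startswith("/"):
            raise SyntaxError("syntax error")  # incomplete, or not a name
        value = next(it, None)
        if value is None or value in ("<<", ">>"):
            raise SyntaxError("syntax error")  # name without a simple value
        pairs[key] = value
    if next(it, None) is not None:
        raise SyntaxError("syntax error")      # trailing tokens after '>>'
    return pairs


print(parse_dict(["<<", "/Type", "/Page", ">>"]))  # {'/Type': '/Page'}

for bad in (["<<", "612", "/Type", ">>"],   # order isn't as expected
            ["<<", "/Type", "/Page"]):      # sequence is incomplete
    try:
        parse_dict(bad)
    except SyntaxError as err:
        print(bad, "->", err)
```

In both rejected cases the message alone says nothing about which token
was at fault, which is why the offending token sequence has to be seen
before the failure can be diagnosed.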
> It seems to occur if MediaBox is defined in an ancestor object rather
> than in a /Page object. There are a number of page attributes which
> are inheritable in this way; MediaBox is one of them.
I do know that, thanks; it is a configuration which I did test (albeit
with contrived, hand-crafted test files):
$ ./psbb *.pdf
inherited.pdf: bounding box = (0,0)..(612,792)
minimal.pdf: bounding box = (0,0)..(612,792)
override.pdf: bounding box = (0,0)..(606,809)
> So in case a MediaBox is superseded by an entry further down the tree
> you still have to continue looking till you get to the object for
> page 1, to make sure.
And this is exactly what my code does! (To be precise, it parses the
trailer dictionary to locate the /Catalog object, whence it follows the
indirect object reference to the top-level /Pages object; thence it
follows the chain of first /Kids references, through as many /Pages
objects as it may find, until it reaches the first /Page object. In each
/Pages object it traverses, it evaluates any /MediaBox specification
it may find; at each lower level, any such specification overrides any
which was evaluated at a higher level. Thus, when the /Page object is
parsed, the last /MediaBox encountered, which may be within the /Page
object itself or in its nearest /Pages ancestor which specified one,
will prevail.)
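
The traversal just described can be sketched roughly as follows. This is
a Python illustration over a hypothetical in-memory model (a dict mapping
object numbers to dictionaries, with indirect references reduced to plain
integers); it is not the psbb code, which parses the PDF file itself. The
function name first_page_media_box and the sample data are assumptions.

```python
def first_page_media_box(objects, trailer):
    """Return the /MediaBox that applies to the first /Page object.

    `objects` maps object numbers to dictionaries and `trailer` is the
    trailer dictionary; both are a simplified, hypothetical model.
    """
    catalog = objects[trailer["Root"]]   # trailer -> /Catalog
    node_id = catalog["Pages"]           # /Catalog -> top-level /Pages
    media_box = None
    visited = set()
    while True:
        if node_id in visited:
            raise ValueError("cycle in page tree")
        visited.add(node_id)
        node = objects[node_id]
        # A /MediaBox at this level overrides any seen higher up.
        if "MediaBox" in node:
            media_box = node["MediaBox"]
        if node["Type"] == "Page":
            return media_box             # last one encountered prevails
        if not node.get("Kids"):
            raise ValueError("/Pages object without /Kids")
        node_id = node["Kids"][0]        # follow the first /Kids reference


trailer = {"Root": 1}
inherited = {
    1: {"Type": "Catalog", "Pages": 2},
    2: {"Type": "Pages", "Kids": [3], "MediaBox": (0, 0, 612, 792)},
    3: {"Type": "Page"},
}
override = {
    1: {"Type": "Catalog", "Pages": 2},
    2: {"Type": "Pages", "Kids": [3], "MediaBox": (0, 0, 612, 792)},
    3: {"Type": "Page", "MediaBox": (0, 0, 606, 809)},
}
print(first_page_media_box(inherited, trailer))  # (0, 0, 612, 792)
print(first_page_media_box(override, trailer))   # (0, 0, 606, 809)
```

The two sample trees mimic the "inherited" and "override" cases: in the
first the /Page object relies on its /Pages ancestor, in the second its
own /MediaBox takes precedence.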
Perhaps, you could:
$ make clean
$ make CFLAGS=-DDEBUGGING
and check your failing PDFs again, so we can see whatever unexpected
token sequence is leading to the "syntax error"; only when we know that
will we have any chance of handling it, before the parser simply gives
up on the offending PDF.
--
Regards,
Keith.
Attachment: samples.tar.xz (application/xz)