groff

Re: Problems with .PDFPIC caused by pdfinfo


From: Keith Marshall
Subject: Re: Problems with .PDFPIC caused by pdfinfo
Date: Tue, 21 Sep 2021 22:24:20 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.1.1

On 21/09/2021 13:34, Heinz-Jürgen Oertel wrote:
> I did some more research. The result: it's not "pdfinfo", it is
> ImageMagick "convert". I mostly use JPG files converted to PDF by
> "convert".

Since your graphic originates as JPG, is there any particular reason why
you cannot convert to EPS, and use .PSPIC to import it into groff?  That
way you would be using groff's built-in .psbb request, so no potentially
unsafe call-out to pdfinfo is required, to get the bounding box.

> The example file "Selz.pdf"
>
> % pdfinfo Selz.pdf | hexdump -xc
> 0000000    6954    6c74    3a65    2020    2020    2020    2020    2020
> 0000000   T   i   t   l   e   :
> 0000010    5300    6500    6c00    7a00    0000    410a    7475    6f68

Looks like UTF-16 creeping into what is otherwise a UTF-8 (or ASCII)
data stream.
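
To make the byte order explicit, here is a minimal Python sketch that
rebuilds the raw bytes from the 16-bit words in the second hexdump row
quoted above.  It assumes the dump was taken on a little-endian host
(hexdump -x prints words in host byte order); the variable names are
purely illustrative.

```python
import struct

# Words copied from the row at offset 0000010 of the quoted `hexdump -x` output.
words = ["5300", "6500", "6c00", "7a00", "0000", "410a", "7475", "6f68"]

# Assumption: little-endian host, so each displayed word is stored low byte first.
raw = b"".join(struct.pack("<H", int(w, 16)) for w in words)

# Show the reconstructed byte sequence, NUL bytes included.
print(raw)

# The first eight bytes, read as big-endian 16-bit code units, give the title text.
print(raw[:8].decode("utf-16-be"))
```

Under that assumption, each letter of the title is preceded by a NUL
byte in the stream, which is what the character row of the dump shows.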

> 0000010  \0   S  \0   e  \0   l  \0   z  \0  \0  \n   A   u   t   h   o
> 0000020    3a72    2020    2020    2020    2020    6820    7474    7370
> 0000020   r   :                                      h   t   t   p   s
>     ...
>
> As one can see, there are \0 chars already in the title.
> Looking at the PDF:
>
> /Title <00530065006C007A0000>

So, here the title is encoded as an ASCII hex-digit representation of
UTF-16BE text (each character's zero byte comes first).  IIRC, that's a
valid PDF encoding, but why is pdfinfo not decoding it in a format which
is consistent with the rest of its output?  Looks like a pdfinfo bug, to
me.
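
For anyone wanting to check the hex string by hand, a minimal Python
sketch follows.  It takes only the hex digits from the /Title entry
quoted above and tries both UTF-16 byte orders; it is an illustration,
not a description of what pdfinfo does internally.

```python
# Hex digits copied from the /Title entry quoted in this thread.
hex_string = "00530065006C007A0000"

# Convert the ASCII hex digits to the ten raw bytes they represent.
data = bytes.fromhex(hex_string)

# Try both byte orders; only one of them yields readable Latin text.
as_be = data.decode("utf-16-be")
as_le = data.decode("utf-16-le")

print(repr(as_be))
print(repr(as_le))
```

The big-endian reading gives "Selz" followed by a trailing NUL
character, while the little-endian reading gives unrelated CJK
characters.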

--
Cheers,
Keith


