Re: Re: PDF outline not capturing Cyrillic text
From: Robin Haberkorn
Subject: Re: Re: PDF outline not capturing Cyrillic text
Date: Wed, 7 Feb 2024 04:07:37 +0300
On Tue, Feb 06, 2024 at 01:39:51PM +0000, Deri wrote:
> Hi Robin,
>
> The current gropdf (in the master branch) does support UTF-16BE for pdf
> outlines (see attached pdf), but Branden has not released the other parts to
> make it work! If you can compile and install the current git, then applying
> the attached patch should give you what you want.
>
> To apply the patch, cd into the git groff directory and
> "patch -p1 < path-to-patch-file", and then run make and install as usual.
>
> I would be very interested in how you get on, and whether it gives you what
> you need. Note that I am assuming you are feeding groff a file in UTF-8 and
> the -k flag. I can see some hyphenation happening, but I don't know if it is
> correct.
>
> Cheers
>
> Deri
Hello Deri!
This patch works. All the outline titles are correct and .pdfinfo /Title,
/Author etc. also work with Cyrillic.
That's very cool.
But it only works when using UTF-8 as the input encoding (-Kutf-8).
As reported earlier in the corresponding Savannah ticket, even hyphenation
works with UTF-8 input, and I see no difference in the hyphenation result
compared to KOI-8 input. I have no idea how you did this.
Still, when using UTF-8 input, there are problems (missing letters) with
link texts autogenerated by .pdfhref L.
With KOI-8 input, all the outlines are incomprehensible, i.e. they consist of
крокозябры (mojibake), as it would be called in Russian. ;-)
Apparently gropdf does not know that it has to convert from KOI-8 instead of
UTF-8.
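To illustrate the effect, here is a minimal Python sketch. It is not
gropdf's actual code, and the exact path the bytes take inside groff is not
known here. It assumes the outline title ends up as a PDF text string
(BOM FE FF followed by UTF-16BE), and it uses Latin-1 as a hypothetical
stand-in for "the wrong source encoding". The helper name pdf_text_string
and the sample title are made up for this example.

```python
# Sketch: the same KOI8-R bytes give a correct or a garbled outline title,
# depending on which source encoding the converter assumes.

def pdf_text_string(text: str) -> bytes:
    """Encode text as BOM (FE FF) + UTF-16BE, the form assumed for PDF outlines."""
    return b"\xfe\xff" + text.encode("utf-16-be")

title = "Введение"  # sample outline title ("Introduction")

# The same title as it would sit in a KOI8-R and in a UTF-8 input file.
koi8_bytes = title.encode("koi8_r")
utf8_bytes = title.encode("utf-8")

# Correct: decode with the encoding the file is really in.
good = pdf_text_string(koi8_bytes.decode("koi8_r"))

# Wrong (hypothetical): the bytes are taken for another 8-bit encoding,
# so every Cyrillic letter turns into an unrelated character.
garbled_text = koi8_bytes.decode("latin-1")
bad = pdf_text_string(garbled_text)

print(good.hex())
print(bad.hex())
```

Both results are well-formed UTF-16BE strings, which is why the PDF viewer
shows the garbled title without complaint; only the decoding step differs.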
So I am still going to disable the outlines for the time being and go with
KOI-8.
It's more of a nice-to-have anyway, rather than a necessity.
I need Russian support as I am writing my master's thesis in Russian.
At the end of the day, this will be printed, so I can live without
PDF outlines.
Best regards,
Robin
PS: And to comment on some of the heated discussions on this list:
It's great that you and Branden spend so much time on improving Groff.
I think you do a great job. Regressions are sometimes unavoidable,
especially when taking over a large code base from somebody else.