groff

Re: PDF outline not capturing Cyrillic text


From: Deri
Subject: Re: PDF outline not capturing Cyrillic text
Date: Fri, 23 Jun 2023 22:40:42 +0100

On Friday, 23 June 2023 19:17:58 BST Robin Haberkorn wrote:
> Hello Peter,
> 
> I am also now stumbling across Cyrillic-related issues with pdfmark. I am
> using ms for the time being. The bug also affects autogenerating link texts
> given via `.pdfhref L`.
> In the simplest case, preconv will turn your Cyrillic characters into
> escapes which are apparently not further interpreted by pdfmark (or
> anything that follows). I see text like "[u0421][u043F]..." in my outline.
> 
> I believe that this is why you have .pdfmomclean in MOM. Do I understand
> correctly that this is supposed to turn the escapes back into Latin-1?
> This is presumably mainly the work of .asciify, which would be misnamed
> anyway. It does not work with Cyrillic at all, which is not surprising.
> That's also why you don't get "mojibake garbage" in the outline. None of the
> Cyrillic characters end up in intermediate output.
> 
> It also explains why I previously had no problems with German Unicode
> characters (that was using MOM) - they can be converted back into Latin-1.
> 
> Manually editing the ps:exec lines in the intermediate output and inserting
> Unicode characters there does not produce the desired results, which is
> also not surprising.
> 
> So it seems that the main problem really lies in grops and/or gropdf which
> should ideally work with the Unicode escapes produced by preconv.
> I am not sure if we would still need .pdfmomclean. But whatever useful stuff
> it currently does, it should probably be in pdfmark.tmac (and/or pdf.tmac?)
> instead.
> 
> Best regards,
> Robin
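
The behaviour Robin describes can be illustrated outside groff with a
minimal Python sketch. It is not the code of preconv, pdfmark.tmac, grops,
or gropdf. The function names are hypothetical, and the escape handling is
a simplification that assumes plain \[uXXXX] escapes with no composite
glyph names. It shows why German text survives a round trip through
Latin-1 while Cyrillic does not, and how a PDF text string can carry such
characters as UTF-16BE with a byte order mark.

```python
import re

# Matches simple \[uXXXX] escapes (assumption: no composite names
# such as u0438_0306).
ESCAPE = re.compile(r"\\\[u([0-9A-Fa-f]{4,6})\]")


def preconv_like(text):
    """Keep ASCII as-is and write every other character as \\[uXXXX]."""
    out = []
    for ch in text:
        cp = ord(ch)
        out.append(ch if cp < 0x80 else "\\[u%04X]" % cp)
    return "".join(out)


def unescape(text):
    """Turn \\[uXXXX] escapes back into the characters they name."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def fits_latin1(text):
    """Report whether the text can be represented in Latin-1 at all."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def pdf_text_string_hex(text):
    """Encode text as a PDF hex string: UTF-16BE preceded by the BOM FEFF."""
    return "<" + (b"\xfe\xff" + text.encode("utf-16-be")).hex().upper() + ">"


escaped = preconv_like("Сп")          # the two letters from Robin's outline
print(escaped)                        # \[u0421]\[u043F]
print(fits_latin1("Größe"))           # True: German fits in Latin-1
print(fits_latin1(unescape(escaped))) # False: Cyrillic does not
print(pdf_text_string_hex(unescape(escaped)))  # <FEFF0421043F>
```

Under these assumptions, any step that maps escapes back to Latin-1 can
only handle characters below U+0100. An outline entry with Cyrillic text
needs the escapes to be carried through to the output driver and written
in a Unicode form.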

Hi Robin,

The features you require are coming. This is an example of Russian with
bookmarks in Cyrillic. I'm afraid I don't know what it means and I have
forgotten where I got the text.

Cheers 

Deri

Attachment: Rus2.pdf
Description: Adobe PDF document

