Re: [Groff] pdfmom grep (was parallel text processing)
From: Deri James
Subject: Re: [Groff] pdfmom grep (was parallel text processing)
Date: Sat, 09 Sep 2017 19:56:14 +0100
User-agent: KMail/4.14.10 (Linux/4.4.82-desktop-1.mga5; KDE/4.14.35; x86_64; ; )

On Sat 09 Sep 2017 09:51:27 Peter Schaffter wrote:
> On Sat, Sep 09, 2017, Ralph Corderoy wrote:
> > Hi Peter,
> >
> > > The grep in pdfmom is returning a binary file hit when it encounters
> > > the diacritic in
> > >
> > > .ds pdf:look(pdf:bm1) L'étranger
> >
> > What does locale(1) output for you where you run this pdfmom command?
>
> LANG=en_CA.UTF-8
> LANGUAGE=en_CA:en
> LC_CTYPE="en_CA.UTF-8"
> LC_NUMERIC="en_CA.UTF-8"
> LC_TIME="en_CA.UTF-8"
> LC_COLLATE="en_CA.UTF-8"
> LC_MONETARY="en_CA.UTF-8"
> LC_MESSAGES="en_CA.UTF-8"
> LC_PAPER="en_CA.UTF-8"
> LC_NAME="en_CA.UTF-8"
> LC_ADDRESS="en_CA.UTF-8"
> LC_TELEPHONE="en_CA.UTF-8"
> LC_MEASUREMENT="en_CA.UTF-8"
> LC_IDENTIFICATION="en_CA.UTF-8"
> LC_ALL=en_CA.UTF-8
>
> > > The solution is to pass the -a flag to grep.
> >
> > How about
> >
> > groff ... 2>&1 | LC_ALL=C grep '^\.ds' | groff ...
>
> Yes, that's the solution I thought of before suggesting the tidier
> but, as Steffen pointed out, not universal -a flag.
>
>
> > BTW, pdfmom has a bug shown by that strace command I suggested.
> >
> > system("groff ... 2>&1 | grep '^\.ds' | groff ...");
> >
> > That's a double-quoted Perl string so `\.' is escaping the dot and grep
> > sees a plain dot for `any character'. The backslash needs doubling.
>
> Missed that. Argh. Why don't they make special glasses that let
> you see code as if for the first time whenever you put them on?
>
> --
> Peter Schaffter

I can't actually recreate the problem, i.e. grep does not spit out the
"binary" error. I've tried with an en_GB.UTF-8 and an en_GB environment; neither
shows the message. The version of grep I'm using is:-

    grep (GNU grep) 2.20

The double escaping of the "." in the grep pattern used to be there:-

    grep \"^\\.ds\"

but got changed.

Cheers
Deri
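
For anyone following the escaping point, here is a minimal shell sketch of what grep does with each form of the pattern. It is not taken from pdfmom: the sample lines and the variable names (sample, escaped, unescaped) are made up for illustration. The patterns are single-quoted so the shell passes the backslash through untouched, which is what the doubled backslash is meant to achieve inside the Perl string.

```shell
# Hypothetical sample input: one real string definition, one line that
# only matches when "." means "any character", and one unrelated request.
sample=$(printf '%s\n' '.ds pdf:look(pdf:bm1) Title' 'xds not a request' '.nr foo 1')

# Escaped dot: matches only lines that begin with a literal ".ds".
escaped=$(printf '%s\n' "$sample" | grep '^\.ds')

# Unescaped dot: also matches "xds ...", because "." matches any character.
unescaped=$(printf '%s\n' "$sample" | grep '^.ds')

printf 'escaped:\n%s\n\nunescaped:\n%s\n' "$escaped" "$unescaped"
```

With the backslash lost, the pipeline would still pass every genuine .ds line, so the slip is easy to miss; it only shows up when some other line happens to have "ds" as its second and third characters.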