help-gnu-emacs

Re: doc-view.el now allows searching


From: Tassilo Horn
Subject: Re: doc-view.el now allows searching
Date: Fri, 31 Aug 2007 22:16:40 +0200
User-agent: Gnus/5.110007 (No Gnus v0.7) Emacs/23.0.50 (gnu/linux)

Peter Dyballa <Peter_Dyballa@Web.DE> writes:

Hi Pete,

>> First, I thought it might have something to do with the size of the
>> pdf file or blank pages not being counted; however, searches of the
>> sicp.pdf book all seemed to produce correct results, so I'm not sure
>> why the search functionality doesn't work properly in some instances.
>
> The reason is probably that pdftotext is used with the -raw option,
> which deletes all empty lines and leading white space. There is another
> option: -layout.

I use -raw here and don't have any problems.  Both outputs (-raw and
-layout) have 528 occurrences of ^L (form feed), and that's what doc-view
uses for counting pages.
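
To check the page count independently, one can count the form feed (^L)
characters that pdftotext writes at the end of each page.  A minimal
sketch, assuming a POSIX shell; count_pages is a hypothetical helper name
and not part of doc-view.el:

```shell
#!/bin/sh
# count_pages (hypothetical helper): read text on stdin and print the
# number of form feed (^L) characters it contains. pdftotext ends each
# page with a form feed, so this approximates the page count.
count_pages() {
    # Delete every byte except form feeds, then count what is left.
    n=$(tr -cd '\f' | wc -c)
    # Arithmetic expansion strips any padding some wc versions add.
    printf '%s\n' "$((n))"
}

# Example usage ("-" makes pdftotext write to stdout):
#   pdftotext -raw practicalcommonlisp.pdf - | count_pages
#   pdftotext -layout practicalcommonlisp.pdf - | count_pages
```

If both commands print the same number, the choice between -raw and
-layout is probably not what throws off the page numbers.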

> I think doc-view should check the number of pages first (pdfinfo
> <file> | grep -i pages | awk '{print $NF}'). If the number is greater
> than 1, then -layout should be more appropriate ...
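
For reference, the check proposed in the quote could be sketched roughly
as follows.  This is only an illustration of the suggestion, not what
doc-view.el does; pages_from_pdfinfo and choose_layout_flag are
hypothetical helper names, and the sketch assumes pdfinfo prints a line
of the form "Pages:  N":

```shell
#!/bin/sh
# pages_from_pdfinfo (hypothetical helper): read pdfinfo output on stdin
# and print the value of its "Pages:" line.
pages_from_pdfinfo() {
    awk 'tolower($1) == "pages:" { print $NF }'
}

# choose_layout_flag (hypothetical helper): print the pdftotext option
# the quoted proposal would pick for a given page count.
choose_layout_flag() {
    if [ "$1" -gt 1 ]; then
        printf '%s\n' '-layout'
    else
        printf '%s\n' '-raw'
    fi
}

# Example usage:
#   pages=$(pdfinfo file.pdf | pages_from_pdfinfo)
#   pdftotext "$(choose_layout_flag "$pages")" file.pdf out.txt
```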

I don't like -layout, because then the context of the search matches
contains all that whitespace, too.

Maybe you use an older (buggy) pdftotext version?

> BTW, -layout is a bit faster!

Not here:

heimdall@baldur ~/t/test> time pdftotext -raw practicalcommonlisp.pdf pcl.txtraw
21.11user 0.21system 0:21.60elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1135minor)pagefaults 0swaps
heimdall@baldur ~/t/test> time pdftotext -layout practicalcommonlisp.pdf pcl.txtlayout
21.84user 0.22system 0:22.34elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1400minor)pagefaults 0swaps

Bye,
Tassilo
-- 
Chuck Norris is the only man who has, literally, beaten the odds. With his 
fists. 

