help-gnu-emacs

Re: Efficiently checking the initial contents of a file


From: Nordlöw
Subject: Re: Efficiently checking the initial contents of a file
Date: Mon, 19 May 2008 00:20:26 -0700 (PDT)
User-agent: G2/1.0

On 16 May, 16:03, "Juanma Barranquero" <lek...@gmail.com> wrote:
> On Fri, May 16, 2008 at 2:52 PM, Nordlöw <per.nord...@gmail.com> wrote:
> > (defun file-begin-p (filename beg)
> >  "Determine if FILENAME begins with BEG."
> >  (interactive "fFile to investigate: ")
> >  (if (and (file-exists-p filename)
> >           (file-readable-p filename))
> >      (with-temp-buffer
> >        (let ((width (string-width beg)))
> >          (insert-file-contents-literally filename nil 0 width)
> >          (looking-at beg)
> >          ))))
>
> A few additional comments:
>
>  - BEG can be a regular expression, in which case the length of it can
> be a red herring; for example (file-begin-p "[ABC]\\{20\\}") will
> always return nil. Perhaps you could do
>
>    (defun file-begin-p (filename beg &optional len)
>       ...
>      (let ((width (or len (string-width beg))))
>        ...
>
> so you can pass a length if needed.
>
>  - If you don't want to pass a regexp, it is advisable to remember
> using regexp-quote, otherwise (file-begin-p "A*") is always going to
> return t.
>
>  - Mixing `insert-file-contents-literally' and `string-width' does not
> seem like a good idea. Better use `string-bytes', or, if BEG can
> contain non-ASCII chars, use `insert-file-contents' and `length'. I'd
> recommend that second route.
>
> Hope this helps,
>
>    Juanma

Hey again!

As I see it, the most general and efficient solution to this problem
would be to make the looking-at() logic stream-based, so that the whole
file never has to be read into memory, regardless of the length of BEG.
Is there some way of opening a file into a buffer without reading its
entire contents into memory before they are actually used by, in our
case, looking-at()?
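
One way to approximate the stream-based idea with existing primitives is
to read a growing prefix of the file and retry the match after each read.
The sketch below is only an illustration under stated assumptions: the
name my-file-begin-p-chunked and the CHUNK argument are hypothetical, the
match is byte-oriented and case-sensitive, and a regexp that depends on
the end of the buffer (for example one using \\' or $) may match a
prefix even though it would not match the full file. If nothing matches,
the whole file still ends up being read.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-file-begin-p-chunked (filename regexp &optional chunk)
  "Return non-nil if the bytes of FILENAME start with a match for REGEXP.
Read CHUNK bytes first (default 4096) and double the amount on each
failed attempt, so a match near the start never loads the whole file."
  (when (file-readable-p filename)
    (let ((size (nth 7 (file-attributes (file-truename filename))))
          (chunk (or chunk 4096))
          (end 0)
          matched)
      (with-temp-buffer
        (while (progn
                 (let ((new-end (min size (+ end chunk))))
                   (when (> new-end end)
                     ;; Append only the bytes END..NEW-END of the file.
                     (goto-char (point-max))
                     (insert-file-contents-literally
                      filename nil end new-end))
                   (setq end new-end
                         chunk (* 2 chunk)))
                 (goto-char (point-min))
                 ;; Assumption: match case-sensitively.
                 (setq matched (let ((case-fold-search nil))
                                 (looking-at-p regexp)))
                 ;; Continue only if there is no match yet and more to read.
                 (and (not matched) (< end size)))))
      matched)))
```

For instance, (my-file-begin-p-chunked "/path/to/script" "#!") would
normally need only the first chunk. Doubling the chunk keeps the number
of repeated looking-at calls small when the match needs more context.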

A less optimal solution could make use of a function, say
regexp-max-match-length(REGEXP), that determines the length of the
longest string a regexp can match, possibly infinity. The return value
of this function could then be used as the length argument to
insert-file-contents-literally().
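
Since regexp-max-match-length is only a proposed function and, as far as
I know, does not exist in Emacs, a simpler stand-in is to let the caller
supply the bound, along the lines of Juanma's optional LEN argument. The
following is a rough sketch under that assumption; the name
my-file-begin-p-capped and the MAX-LEN argument are hypothetical, and the
comparison is byte-oriented, so it is only reliable for ASCII patterns.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-file-begin-p-capped (filename beg &optional max-len)
  "Return non-nil if FILENAME starts with a match for the regexp BEG.
Read at most MAX-LEN bytes.  The default, the byte length of BEG, is
only sufficient when BEG is a literal passed through `regexp-quote'."
  (when (file-readable-p filename)
    (with-temp-buffer
      ;; Only bytes 0..MAX-LEN of the file are inserted.
      (insert-file-contents-literally
       filename nil 0 (or max-len (string-bytes beg)))
      (goto-char (point-min))
      ;; Assumption: match case-sensitively.
      (let ((case-fold-search nil))
        (looking-at-p beg)))))
```

A literal prefix would be checked with
(my-file-begin-p-capped file (regexp-quote "%PDF-")), while a real regexp
needs an explicit bound, as in
(my-file-begin-p-capped file "%PDF-[0-9]\\.[0-9]" 16).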

By the way, I am surprised that the function I am looking for does not
already exist in GNU Emacs. Could that be because it is difficult to
design a solution that satisfies *all* of the needs given above?

/Nordlöw

