bug-gnu-utils


Re: diff's type used for holding line numbers


From: Andreas Gruenbacher
Subject: Re: diff's type used for holding line numbers
Date: Sun, 5 Apr 2009 18:40:20 +0200
User-agent: KMail/1.9.9

On Sunday, 5 April 2009 18:00:21 Bruno Haible wrote:
> Andreas Gruenbacher wrote about the type used to hold line numbers
> in 'diff': 
> > > %td is for ptrdiff_t, not off_t.
> >
> > Exactly, and ptrdiff_t should be machine word size, which is what we want
> > here, right?
>
> No. ptrdiff_t may be too small.
>
> ISO C99 section 6.5.6 (paragraph 9) says that ptrdiff_t is the type for the
> difference of two pointers into the *same array*. There is no requirement
> that ptrdiff_t is near the size of available RAM. For example, you can have
> platforms where arrays are limited to 4 GB (or 2 GB) in size, but there is
> 64 GB available RAM, and the user wants to diff two files of size 5 GB,
> each of them consisting mostly of newlines. The line numbers will be >
> 2^32.

Alright.  Diff uses arrays for storing the files it compares and for computing
the diff, so it can only compare files whose line numbers fit into ptrdiff_t.
For diff, ptrdiff_t is therefore still safe.
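
To spell out why the in-memory case is bounded, here is a minimal sketch.  It
is not taken from the diff sources; count_lines is a hypothetical helper, and
it assumes the whole file sits in one array of at most PTRDIFF_MAX bytes.
Every line occupies at least one byte of that array, so the line count can
never exceed the array length:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the lines in an in-memory buffer.  Hypothetical helper for
   illustration only, not code from diff.  */
ptrdiff_t
count_lines (const char *buf, size_t len)
{
  /* Assumption: the buffer is a single array, so its length fits
     into ptrdiff_t.  */
  assert (len <= (size_t) PTRDIFF_MAX);

  ptrdiff_t lines = 0;
  for (size_t i = 0; i < len; i++)
    if (buf[i] == '\n')
      lines++;                  /* one byte consumed per line counted */

  /* A last line without a trailing newline still counts as a line.  */
  if (len > 0 && buf[len - 1] != '\n')
    lines++;

  return lines;                 /* always <= len <= PTRDIFF_MAX */
}
```

The argument breaks down as soon as the file is not held in a single array,
which is the case discussed next.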

Patch is the weird beast here, though: its Plan B creates an index of line
offsets into the input file and doesn't keep the entire file in memory.
That's crazy, but it seems to work (at least when tried on small files with
--debug=16).
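
As a rough illustration of the general idea of a line-offset index (this is
not patch's actual code; struct line_index and build_line_index are made-up
names, and off_t is assumed to come from POSIX <sys/types.h>), both the
offsets and the line count are kept in off_t, since neither is tied to the
size of an in-memory array holding the file:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>          /* off_t (POSIX) */

/* Hypothetical index of line start offsets, for illustration only.  */
struct line_index
{
  off_t *start;                 /* start[i] = byte offset of line i + 1 */
  off_t nlines;                 /* number of lines found */
  size_t alloc;                 /* entries allocated in start[] */
};

/* Read FP sequentially and record where each line begins.
   Return 0 on success, -1 on allocation or read failure.  */
int
build_line_index (FILE *fp, struct line_index *ix)
{
  off_t pos = 0;
  int at_line_start = 1;
  int c;

  ix->start = NULL;
  ix->nlines = 0;
  ix->alloc = 0;

  while ((c = getc (fp)) != EOF)
    {
      if (at_line_start)
        {
          if ((size_t) ix->nlines == ix->alloc)
            {
              /* Grow the offset array geometrically.  */
              size_t n = ix->alloc ? 2 * ix->alloc : 64;
              off_t *p = NULL;
              if (n <= SIZE_MAX / sizeof (off_t))
                p = realloc (ix->start, n * sizeof (off_t));
              if (!p)
                {
                  free (ix->start);
                  ix->start = NULL;
                  return -1;
                }
              ix->start = p;
              ix->alloc = n;
            }
          ix->start[ix->nlines++] = pos;
          at_line_start = 0;
        }
      if (c == '\n')
        at_line_start = 1;
      pos++;                    /* each line uses at least one byte */
    }

  if (ferror (fp))
    {
      free (ix->start);
      ix->start = NULL;
      return -1;
    }
  return 0;
}
```

Only the file contents stay out of memory in this sketch; the offset array
itself still has to be stored somewhere, so this shows the types involved
rather than a way to handle arbitrarily many lines.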

> off_t is guaranteed to be sufficiently large, because a file cannot
> contain more lines than it contains bytes on disk.

Okay, thanks for explaining!

Andreas



