bug-coreutils

Re: dd skip bug?


From: Pádraig Brady
Subject: Re: dd skip bug?
Date: Fri, 18 Apr 2008 17:24:35 +0100
User-agent: Thunderbird 2.0.0.6 (X11/20071008)

Jim Meyering wrote:
> Pádraig Brady <address@hidden> wrote:
> 
>> Jim Meyering wrote:
>>> Pádraig Brady <address@hidden> wrote:
>>>> dd handles skip weirdly
>>>>
>>>> disk=/dev/sda8
>>>> dd if=$disk bs=8M count=1 skip=1000 of=/dev/null  #ok
>>>> dd if=$disk bs=8M count=1 skip=1000K of=/dev/null #reads whole disk! as seek fails
>>>>
>>>> I had a 10s look at the source and noticed a comment
>>>> saying POSIX doesn't specify what we should do when
>>>> skipping past the end of input. For seekable files though,
>>>> reading the whole thing is unexpected to me at least.
>>>> I would expect it to do:
>>>>
>>>> if (seekable && !seek(skip_len))
>>>>     exit(EXIT_FAILURE);
>>> Thanks, but the existing behavior is deliberate, and IMHO, necessary.
>>>
>>> skip=N is required to try to seek, and failing that, position
>>> the read pointer by calling read.  That is so it works on
>>> e.g., redirected stdin as well as on regular files.
>> redirected stdin is seekable.
>> Note the logic I presented above.
> 
> Hi Pádraig,
> 
> Redirected stdin is seekable, as long as it's from a seekable file.
> I meant "piped stdin".  Same holds for any other non-seekable input source.
> I think if you try the code above, it will cause test failures.
> 

There are actually 3 cases to consider with large skip values.
Currently dd does the following for various values of skip
(taken here as the byte offset, i.e. skip * bs):

    if skip > file_size && skip < max_file_size
      lseek returns new offset, and read() returns 0

    if skip > file_size && skip > max_file_size
      lseek returns error and read() used to advance input

    if skip would overflow off_t
      read() used to advance input

I think it should at least be consistent across all three cases,
i.e. for cases 2 and 3 we should seek to the end of the file
so that read() just returns 0, as in case 1.

A patch to do the above is attached.
Note that I haven't tested it fully; it's just for illustration.
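
Case 1 can be checked from the shell. This is a minimal sketch, assuming a
POSIX shell, a dd that seeks on regular files, and a scratch file from mktemp;
bytes_after_skip is a hypothetical helper, not part of coreutils:

```shell
# Hypothetical helper: print how many bytes dd yields after skipping
# $2 blocks of $3 bytes into the regular file $1.
bytes_after_skip() {
    dd if="$1" bs="$3" skip="$2" count=1 2>/dev/null | wc -c | tr -d ' '
}

# Build a 4096-byte scratch file.
demo_file=$(mktemp)
dd if=/dev/zero of="$demo_file" bs=1024 count=4 2>/dev/null

bytes_after_skip "$demo_file" 1 1024     # offset inside the file: prints 1024
bytes_after_skip "$demo_file" 100 1024   # offset past end of file: prints 0

rm -f "$demo_file"
```

The second call yields no data because the seek succeeds and the following
read() hits end of file straight away.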

The more contentious option in all 3 cases is to terminate with an error;
just printing a warning would be a good first step, I suppose.
Exiting, though, would let one, for example, terminate a script that reads
a file in chunks with dd. The only way I can see to do that at present is to
use the file size and chunk number ourselves to terminate the loop in the script.
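
That workaround might look like the following. It is only a sketch, assuming a
POSIX shell and a regular file whose size "wc -c" can report cheaply (a block
device would need another way to get its size); chunk_sizes is a hypothetical
helper name:

```shell
# Hypothetical helper: read file $1 in chunks of $2 bytes and print the
# byte count of each chunk, stopping once the file size is covered.
chunk_sizes() {
    file=$1
    bs=$2
    size=$(wc -c < "$file" | tr -d ' ')
    # Round up so that a trailing partial chunk is still read.
    chunks=$(( (size + bs - 1) / bs ))
    i=0
    while [ "$i" -lt "$chunks" ]; do
        # "wc -c" stands in for whatever processing each chunk needs.
        dd if="$file" bs="$bs" skip="$i" count=1 2>/dev/null | wc -c | tr -d ' '
        i=$((i + 1))
    done
}
```

The loop never asks dd to skip past the end of the file, so it does not depend
on how dd behaves in that situation.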

thanks,
Pádraig
diff --git a/src/dd.c b/src/dd.c
index 0a7b154..76739f9 100644
--- a/src/dd.c
+++ b/src/dd.c
@@ -1178,6 +1178,7 @@ skip (int fdesc, char const *file, uintmax_t records, size_t blocksize,
       char *buf)
 {
   uintmax_t offset = records * blocksize;
+  off_t soffset;
 
   /* Try lseek and if an error indicates it was an inappropriate operation --
      or if the file offset is not representable as an off_t --
@@ -1194,6 +1195,11 @@ skip (int fdesc, char const *file, uintmax_t records, size_t blocksize,
   else
     {
       int lseek_errno = errno;
+      if (fdesc == STDIN_FILENO && input_seekable)
+       {
+         soffset = lseek(fdesc, 0, SEEK_END);
+         advance_input_offset (soffset);
+       }
 
       do
        {
