Re: dd skip bug?
From: Pádraig Brady
Subject: Re: dd skip bug?
Date: Fri, 18 Apr 2008 17:24:35 +0100
User-agent: Thunderbird 2.0.0.6 (X11/20071008)
Jim Meyering wrote:
> Pádraig Brady <address@hidden> wrote:
>
>> Jim Meyering wrote:
>>> Pádraig Brady <address@hidden> wrote:
>>>> dd handles skip weirdly
>>>>
>>>> disk=/dev/sda8
>>>> dd if=$disk bs=8M count=1 skip=1000 of=/dev/null #ok
>>>> dd if=$disk bs=8M count=1 skip=1000K of=/dev/null #reads whole disk! as seek fails
>>>>
>>>> I had a 10s look at the source and noticed a comment
>>>> saying POSIX doesn't specify what we should do when
>>>> skipping past the end of input. For seekable files though,
>>>> reading the whole thing is unexpected to me at least.
>>>> I would expect it to do:
>>>>
>>>> if (seekable && !seek(skip_len))
>>>> exit(EXIT_FAILURE);
>>> Thanks, but the existing behavior is deliberate, and IMHO, necessary.
>>>
>>> skip=N is required to try to seek, and failing that, position
>>> the read pointer by calling read. That is so it works on
>>> e.g., redirected stdin as well as on regular files.
>> redirected stdin is seekable.
>> Note the logic I presented above.
>
> Hi Pádraig,
>
> Redirected stdin is seekable, as long as it's from a seekable file.
> I meant "piped stdin". Same holds for any other non-seekable input source.
> I think if you try the code above, it will cause test failures.
>
There are actually 3 cases to consider with large skip values.
Currently dd does the following for various values of skip:

  1. if skip > file_size && skip < max_file_size
       lseek returns the new offset, and read() returns 0
  2. if skip > file_size && skip > max_file_size
       lseek returns an error, and read() is used to advance the input
  3. if skip would overflow off_t
       read() is used to advance the input

It should at least be consistent for all cases I think.
I.e., for cases 2 & 3, we should seek to the end of the file
so that read() will just return 0 as in case 1.
A patch to do the above is attached.
Note that I haven't tested this fully; it's just for illustration.
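Case 1 can be observed from the shell. The following is only a minimal sketch, assuming a dd that accepts bs=, skip= and count= and a mktemp command; the 8-byte sample file and the variable names are made up for illustration. Cases 2 and 3 depend on the filesystem's maximum file size and the width of off_t, so they are not reproduced here.

```shell
#!/bin/sh
# Illustration only: a tiny regular file stands in for the disk.
tmp=$(mktemp)
printf 'abcdefgh' > "$tmp"

# skip=1 with bs=4 lands inside the file, so one 4-byte block is copied.
inside=$(dd if="$tmp" bs=4 skip=1 count=1 2>/dev/null)

# skip=100 is past EOF but well within the maximum file size (case 1):
# the seek succeeds, the following read() returns 0, nothing is copied.
# stderr is discarded because the diagnostics vary between dd versions.
past=$(dd if="$tmp" bs=4 skip=100 count=1 2>/dev/null)
past_len=${#past}

rm -f "$tmp"
echo "inside=$inside past_len=$past_len"
```

The point of the sketch is that dd copies nothing in case 1 rather than reading the whole input, which is the behaviour the proposal would extend to cases 2 and 3.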
The more contentious thing to do in all 3 cases is to terminate with an error,
though just printing a warning would be a good first step I suppose.
Exiting with an error, though, would allow one, for example, to terminate a
script that reads a file in chunks with dd. The only way I can see to do this
at present is to use the file size and chunk number ourselves to terminate the
loop in the script.
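As a rough sketch of that workaround, a script can compute the number of chunks from the file size and stop the loop itself. The sample file, the 4-byte chunk size and all variable names below are assumptions for illustration; a real script would use its own input and a much larger block size, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Illustration only: a 10-byte sample file read in 4-byte chunks.
file=$(mktemp)
printf '0123456789' > "$file"
chunk=4

# Work out how many chunks cover the file: ceil(size / chunk).
size=$(wc -c < "$file")
nchunks=$(( (size + chunk - 1) / chunk ))

i=0
out=""
# The loop bound comes from the file size, not from dd's exit status.
while [ "$i" -lt "$nchunks" ]; do
    piece=$(dd if="$file" bs="$chunk" skip="$i" count=1 2>/dev/null)
    out="$out[$piece]"
    i=$((i + 1))
done

rm -f "$file"
echo "$nchunks chunks: $out"
```

If dd exited with an error when skipping past the end of the input, the size calculation could be dropped and the loop could simply run until dd fails.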
thanks,
Pádraig
diff --git a/src/dd.c b/src/dd.c
index 0a7b154..76739f9 100644
--- a/src/dd.c
+++ b/src/dd.c
@@ -1178,6 +1178,7 @@ skip (int fdesc, char const *file, uintmax_t records, size_t blocksize,
       char *buf)
 {
   uintmax_t offset = records * blocksize;
+  off_t soffset;
   /* Try lseek and if an error indicates it was an inappropriate operation --
      or if the file offset is not representable as an off_t --
@@ -1194,6 +1195,11 @@ skip (int fdesc, char const *file, uintmax_t records, size_t blocksize,
   else
     {
       int lseek_errno = errno;
+      if (fdesc == STDIN_FILENO && input_seekable)
+        {
+          soffset = lseek(fdesc, 0, SEEK_END);
+          advance_input_offset (soffset);
+        }
       do
         {