help-gnu-utils

Re: dd strangeness


From: abc
Subject: Re: dd strangeness
Date: Wed, 08 Dec 2010 15:04:17 -0000
User-agent: G2/1.0

On Nov 14, 11:55 pm, "Colin S. Miller" <no-spam-
thank-...@csmiller.demon.co.uk> wrote:
> abc wrote:
> > I need to split a large stream of approx 600 GB that gets generated by
> > Solaris' zfs send command. I could try the whole chain with a big
> > enough file but I don't see how the zfs command could be the problem
> > as it only generates a stream.
> <snip!>
> > Thanks,
> > Lucia
>
> Lucia,
> Did you try the 'split' command, which is designed to split an
> infinitely large input into chunks whose size you specify?
>
> HTH,
> Colin S. Miller
>
> --
> Replace the obvious in my email address with the first three letters of the 
> hostname to reply.

Colin,

split will not do as it wants to write the whole stream to file(s) and
I don't have the space for that. Besides, in split I see no way to
select one chunk, or skip ahead, which would solve the problem.

I talked to one of the dd maintainers - dd has design flaws
(choices?) that make it break the principle of least surprise, and
they are not going away. In short, it's not the perfect tool for the
job. But who cares :)

I came up with this, which works:

# zfs send <backup> | dd obs=512 | dd bs=512 count=n

and in successive iterations specify skip=n, 2*n, 3*n, ... on the
second dd to get at every chunk.
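
To make the iteration concrete, here is a minimal sketch of the same
pipeline wrapped in a shell function. The name extract_chunk and its
two arguments (a 0-based chunk index and the number of 512-byte blocks
per chunk) are made up for illustration; only the dd options are the
ones from the command above.

```shell
# Hypothetical helper: write chunk number $1 (0-based) of stdin to stdout,
# where one chunk is $2 blocks of 512 bytes.
extract_chunk() {
    idx=$1
    n=$2
    # First dd: reblock whatever short reads arrive into 512-byte writes.
    # Second dd: skip idx*n of those blocks, then copy the next n.
    dd obs=512 2>/dev/null |
        dd bs=512 skip=$((idx * n)) count="$n" 2>/dev/null
}
```

The stream has to be regenerated for every chunk, so each run would
pipe the zfs send command from above into something like
"extract_chunk 3 n" and redirect the output wherever that chunk needs
to go.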

The first dd coalesces multiple short reads into a 512-byte block and
the second one just picks them up. This stops working when the block
size is increased, but that's no problem for me. Your mileage may vary
here, with Linux slowing down as read()s increase, but I don't see any
slowdown in Solaris.

It'd also work with only one dd specifying ibs=1, obs=whatever you
want, but I suspect it would take ages and I haven't tried it. I may be
wrong. The manual is particularly misleading about the distinction
between bs=n and ibs=n/obs=n. It literally says "bs=n forces ibs=n and
obs=n". That's only half true. If you specify ibs/obs, dd uses two
buffers; if you specify only bs, dd uses a single buffer and writes
out each short read as its own short block, leading to fuck-ups like
this one, and hours lost. The manual doesn't explain this very
fundamental distinction.
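
A small sketch of the difference, for anyone who wants to see it. The
function names slow_writer, demo_bs and demo_obs are invented for this
example, and the sleep is only there to force a short read on the
pipe. I'd expect this behaviour from GNU dd and from the Solaris dd
described above, but treat it as an illustration and not a guarantee
for every dd.

```shell
# Hypothetical producer: two 2-byte writes, one second apart,
# so the reader sees a short read first.
slow_writer() {
    printf ab
    sleep 1
    printf cd
}

# bs=4 only: single buffer, so the 2-byte short read is passed on
# as a 2-byte block and count=1 downstream stops after it.
demo_bs() {
    slow_writer | dd bs=4 2>/dev/null | dd bs=4 count=1 2>/dev/null
}

# obs=4: separate output buffer, so short reads are collected until
# a full 4-byte block can be written.
demo_obs() {
    slow_writer | dd obs=4 2>/dev/null | dd bs=4 count=1 2>/dev/null
}
```

Running demo_bs should give only the first two bytes, while demo_obs
should give all four.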

