bug-coreutils

Re: dd PATCH: add conv=direct


From: Bryce
Subject: Re: dd PATCH: add conv=direct
Date: Wed, 14 Apr 2004 12:46:08 +0100

On Tue, 2004-04-13 at 22:51, Paul Eggert wrote:
> Jim Meyering <address@hidden> writes:
> 
> >>> > http://oss.oracle.com/projects/ocfs/dist/files/source/RHAT/RHAS3/coreutils-4.5.3-33.src.rpm
> 
> I briefly looked at the following patches in that RPM:
> 
> coreutils-4.5.3-O_DIRECT-NFS.patch
> coreutils-4.5.3-O_DIRECT-dd.patch
> coreutils-4.5.3-O_DIRECT-valloc.patch
> coreutils-4.5.3-o_direct-copy-valloc.patch
> coreutils-4.5.3-o_direct.patch
> 
> and I found the following differences between those ideas and what's
> in coreutils CVS right now:
> 
> * Coreutils dd simply aligns the I/O buffers to getpagesize()
>   boundaries, 4.5.3-33 has a complicated alignment strategy that I
>   don't fully follow, but which seems to do the same thing.
>   (There may be some differences if I/O errors occur; is that
>   the point?)

Partly. There's also a side annoyance I was trying to handle: RHAT's
AS2.1 kernels have a limitation regarding reads on FS types that did or
did not support O_DIRECT (this all dates from when O_DIRECT in the
kernel was quite new). I was trying to make this a non-issue. It's fixed
in RHAT's AS3, but we're tied to supporting AS2.1 as long as RHAT does,
joy... I'm not sure when RHAT's AS2.1 support dies, but I'm hoping it's
soon; then all that cruft that I put in can be stripped out.
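
For reference, the simple strategy mentioned in the quoted text (aligning
I/O buffers to page boundaries) can be sketched roughly as follows. This is
a minimal illustration, not the code from coreutils CVS or from the
4.5.3-33 patches; the function names are hypothetical, and
sysconf (_SC_PAGESIZE) is used here as the POSIX way of asking for what
getpagesize() returns.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200112L  /* for posix_memalign under strict modes */
#endif
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: allocate SIZE bytes aligned to the system page
   size.  Returns NULL on failure; release the buffer with free().  */
void *
alloc_page_aligned (size_t size)
{
  long page = sysconf (_SC_PAGESIZE);
  void *buf = NULL;

  if (page <= 0)
    return NULL;
  if (posix_memalign (&buf, (size_t) page, size) != 0)
    return NULL;
  return buf;
}

/* Hypothetical helper: report whether P sits on a page boundary.  */
int
is_page_aligned (const void *p)
{
  long page = sysconf (_SC_PAGESIZE);
  return page > 0 && (uintptr_t) p % (uintptr_t) page == 0;
}
```

A buffer obtained this way satisfies the address-alignment requirement
that O_DIRECT transfers typically impose; the transfer length and file
offset usually need to be block-aligned as well, which this sketch does
not address.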

> * 4.5.3-33 aligns buffers to page size boundaries in copy.c.
>   This looks to me like it's worth doing (independently of O_DIRECT),
>   so I'll propose a patch along those lines via separate email
>   to bug-coreutils.
> 
> * cp, mv, and md5sum have --o_direct options.  I'm not convinced that
>   md5sum needs this (why not all the other commands that read files,
>   too, while you're at it? cat, say?) but perhaps cp and mv should
>   have it (what are the application areas here?).  Also, option names
>   should not have underscores, so I'd suggest --direct (or perhaps
>   --direct-io) as a better name for this sort of option.

The reasoning behind md5sum was... the boss wanted it.
Actually, it's because in automated scripting, if you write a large file
using O_DIRECT, it may not be fully committed to disk after the write
process has finished. The upshot of this is that you will not be able to
access the file until it's fully committed, and md5sum could suddenly
encounter a 'permission denied' condition. By using O_DIRECT, the access
is channelled through the O_DIRECT mechanisms in the kernel, so md5sum
can read the data even though it has not hit the disk, avoiding this
'permission denied' issue. I didn't convert more than
tar/mv/cp/dd/md5sum because I'm pressed for time on other projects.

As for why cp/mv should have these options, I'll direct your attention
to the following chart:
http://oss.oracle.com/~bryce/cp.gif
When you're talking about 4 TB databases, the savings in time and coffee
are substantial (8 hours vs 2.5 hours for a backup copy).

The --o_direct name was, well, I was stuck for another name to call it
at the time, and it kinda stuck. It's no biggie for it to be tossed on
the fire.

> * The dd options are spelled differently, e.g.:
> 
>   dd ibs=512 obs=1024 iflags=direct oflags=direct (coreutils CVS)
>   dd --o_direct=512,1024   (4.5.3-33)
> 
>   Here I prefer the coreutils CVS version as it's a bit more orthogonal.


Aye, again there is method in my madness (maybe).

There are situations where you can be asked to copy from a non-O_DIRECT
capable FS/stream to an O_DIRECT FS, or vice versa,
e.g.:

dd --o_direct=8192,0 if=some_O_DIRECT_file | gzip -f > backup_file.gz
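
The per-side choice in that example (direct I/O on one side, ordinary
buffered I/O on the other) comes down to whether O_DIRECT is passed to
open() for each file. A minimal sketch, assuming a hypothetical helper
named open_maybe_direct and assuming that a filesystem without O_DIRECT
support rejects the flag with EINVAL; it is not taken from either patch
set:

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE   /* glibc exposes O_DIRECT only with this defined */
#endif
#include <errno.h>
#include <fcntl.h>

#ifndef O_DIRECT
# define O_DIRECT 0    /* no O_DIRECT on this platform: use buffered I/O */
#endif

/* Hypothetical helper: open PATH with FLAGS, adding O_DIRECT when
   WANT_DIRECT is nonzero.  If the open fails with EINVAL (assumed here
   to mean "no O_DIRECT support"), retry without the flag.  If
   GOT_DIRECT is non-NULL, it reports whether direct I/O is in effect.
   Returns a file descriptor, or -1 with errno set.  */
int
open_maybe_direct (const char *path, int flags, int want_direct,
                   int *got_direct)
{
  int fd = -1;
  int direct = 0;

  if (got_direct)
    *got_direct = 0;

  if (want_direct && O_DIRECT != 0)
    {
      fd = open (path, flags | O_DIRECT, 0666);
      if (fd >= 0)
        direct = 1;
      else if (errno != EINVAL)
        return -1;      /* genuine failure: leave errno for the caller */
    }

  if (fd < 0)
    fd = open (path, flags, 0666);   /* buffered open or fallback */

  if (got_direct && fd >= 0)
    *got_direct = direct;
  return fd;
}
```

In the pipeline above, the input file would be opened with want_direct
set, while the output side (a pipe to gzip) would stay buffered.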


I was trying to sort that out by limiting the o_direct method to a
single long option which modified the behavior.

I.e., --o_direct by itself would assume both sides were O_DIRECT capable
and attempt to automatically determine the correct block size.

It could also be passed two arguments, for the read and write block
sizes, where a 0 would denote that the read or the write was to be done
via non-O_DIRECT methods, and a -1 would try to determine what the
O_DIRECT block size was. The reasoning behind this was for systems where
two O_DIRECT FSs were in use that had DIFFERING block sizes. You need to
be able to, say, copy a 128K chunk over to an FS in 32K chunks, hence
all that fun code in full_write.c/safe_read.c I wrote.
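
The "128K chunk written out in 32K chunks" idea can be illustrated with a
loop along these lines. This is only a sketch with a hypothetical function
name, not the actual full_write.c code from the patch; for real O_DIRECT
output the buffer and chunk size would also have to meet the
filesystem's alignment rules.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write LEN bytes from BUF to FD using write()
   calls of at most CHUNK bytes each.  Returns the number of bytes
   written, which is less than LEN only if CHUNK is 0 or a write
   failed (errno is then left as set by write()).  */
size_t
write_in_chunks (int fd, const char *buf, size_t len, size_t chunk)
{
  size_t done = 0;

  if (chunk == 0)
    return 0;

  while (done < len)
    {
      size_t want = len - done < chunk ? len - done : chunk;
      ssize_t n = write (fd, buf + done, want);

      if (n < 0)
        {
          if (errno == EINTR)
            continue;           /* interrupted: retry the same chunk */
          break;
        }
      if (n == 0)
        break;                  /* no progress: avoid looping forever */
      done += (size_t) n;       /* a short write just shrinks the next one */
    }
  return done;
}
```

With a 128K input buffer, write_in_chunks (fd, buf, 128 * 1024, 32 * 1024)
issues the output in pieces of at most 32K.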

eg cp --o_direct=0,8192 /normal_fs/file /o_direct_fs/file
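
The argument convention described above (no argument means auto-detect on
both sides, 0 means non-O_DIRECT, -1 means auto-detect, a positive value
is an explicit block size) could be parsed roughly like this. The function
and constant names are hypothetical and the error handling is an
assumption; it is not the parser from the 4.5.3-33 patch.

```c
#include <errno.h>
#include <stdlib.h>

enum
{
  DIRECT_OFF = 0,       /* do this side with ordinary buffered I/O */
  DIRECT_AUTO = -1      /* determine the O_DIRECT block size automatically */
};

/* Hypothetical parser for --o_direct[=READ,WRITE].  ARG is NULL when
   the option was given without an argument.  On success, stores the
   read and write settings in *RD and *WR and returns 0; returns -1 on
   malformed input (assumed behavior).  */
int
parse_o_direct (const char *arg, long *rd, long *wr)
{
  char *end;
  long r, w;

  if (arg == NULL)
    {
      *rd = *wr = DIRECT_AUTO;  /* bare option: both sides O_DIRECT */
      return 0;
    }

  errno = 0;
  r = strtol (arg, &end, 10);
  if (end == arg || *end != ',' || errno != 0 || r < DIRECT_AUTO)
    return -1;

  arg = end + 1;
  w = strtol (arg, &end, 10);
  if (end == arg || *end != '\0' || errno != 0 || w < DIRECT_AUTO)
    return -1;

  *rd = r;
  *wr = w;
  return 0;
}
```

For the cp example above, parsing "0,8192" would leave the read side
buffered (DIRECT_OFF) and set an 8192-byte block size for the write side.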

By comparison, my convolution of embedding the status in the arguments
is expressed by your iflags/oflags options, which is better. To be
honest, I was simply trying too hard to have the functionality embedded
in the one option.


OK, next step, what would you like from me?

Phil
=--=




