bug-coreutils

bug#15578: Parameter -d or --direct to open files with flag O_DIRECT?


From: Pádraig Brady
Subject: bug#15578: Parameter -d or --direct to open files with flag O_DIRECT?
Date: Fri, 11 Oct 2013 00:21:46 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 10/10/2013 09:09 AM, Pádraig Brady wrote:
> On 10/10/2013 05:49 AM, Kyle Sallee wrote:
>> Please forgive the inconvenience.
>> Attaching a gzipped patch file
>> ensures the integrity of patch file content.
>> I should have presented only the idea
>> rather than a patch for the implementation.
>>
>> I concur dd is already suited to the task.
>>
>> Please consider
>>
>> | sed 's/^/if=/' | xargs -r --max-lines=1 dd iflag=direct  # in contrast with
>>                    | xargs -r --max-lines=4096 cat -d --
>>
>> Invoking dd 466059 times costs only a slight performance decrease
>> as compared with invoking cat 114 times.
>> However, this example probably represents rare usage for cat.
>>
>> Thanks for granting the time and consideration.
> 
> Fair point, but still not worth adding to cat(1)
> since it's not special in this regard.
> 
> Something like this might be more appropriate:
> https://github.com/Feh/nocache
> 
> Note that doesn't avoid the page cache completely,
> and so may be more performant/portable than O_DIRECT.
> (dd has this functionality too as described at 'nocache' at:
>  http://www.gnu.org/software/coreutils/manual/html_node/dd-invocation.html)

One possibility worth mentioning would be to add a files0-from=F option to dd,
like the --files0-from=F option that du, sort and wc already have.

Those tools have it because they need to operate on the complete input set,
for accumulation or sorting, and thus can't fall back on separate runs
with xargs or similar. dd could justify it for a different reason: its
operand syntax (if=FILE) is very different from that of the standard tools,
so file names can't simply be appended to the command line. The option would
give dd a general method to efficiently read many files.
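
To illustrate the contrast, here is a rough sketch of what exists today.
It assumes GNU coreutils and findutils; the helper names and scratch files
are made up for the example. The first helper uses wc's existing
--files0-from, the second shows the one-process-per-file workaround that dd
currently needs. Nothing here is the proposed dd option itself.

```shell
# Hypothetical helper: total bytes via wc's existing --files0-from.
# One wc process sees the whole list; the last line is the grand total
# when more than one file is listed.
total_with_files0_from() {
  find "$1" -type f -print0 | wc -c --files0-from=- | tail -n 1
}

# Hypothetical helper: what dd needs today, one dd process per file,
# because each input has to be spelled if=FILE.
# iflag=direct could be added to the dd call where O_DIRECT is supported.
total_with_dd_per_file() {
  find "$1" -type f -print0 |
    xargs -0 -r -n 1 sh -c 'dd if="$1" status=none' sh | wc -c
}

# Small demonstration on scratch files.
dir=$(mktemp -d)
printf 'aaa\n' > "$dir/one"
printf 'bbbb\n' > "$dir/two"
total_with_files0_from "$dir"
total_with_dd_per_file "$dir"
rm -rf "$dir"
```

Both helpers report the same byte count; the difference is that the second
forks a shell and a dd for every file name.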

Another related point to consider: the above would allow a single
process to handle everything, but it might be better to split the
load into one process per CPU.
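
For what it's worth, the per-CPU split can already be approximated with
existing tools. A minimal sketch, assuming GNU xargs (-P) and nproc;
read_parallel is a hypothetical name, and this still starts one dd per file
rather than one per CPU in total:

```shell
# Read every regular file under directory $1 and discard the data,
# keeping up to one dd running per available CPU.
# iflag=direct could be added where the file system supports O_DIRECT.
read_parallel() {
  find "$1" -type f -print0 |
    xargs -0 -r -n 1 -P "$(nproc)" \
      sh -c 'dd if="$1" of=/dev/null status=none' sh
}

# Usage (hypothetical path):
#   read_parallel /srv/data
```

Whether this beats a single sequential reader will depend on the storage,
so it is worth measuring rather than assuming.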

thanks,
Pádraig.




