bug-coreutils

Re: [PATCH] md5: accepts a new --threads option


From: Jim Meyering
Subject: Re: [PATCH] md5: accepts a new --threads option
Date: Wed, 21 Oct 2009 10:52:56 +0200

Pádraig Brady wrote:
> Pádraig Brady wrote:
>>
>> You wouldn't want multiple threads/processes fighting over
>> the disk head so you would do something like:
>>
>>   find /disk1 | xargs md5sum & find /disk2 | xargs md5sum
>>
>> Note if we're piping/redirecting the output of the above
>> then we must be careful to line buffer the output from md5sum
>> so that it's not interspersed. Hmm I wonder should
>> we linebuffer the output from *sum by default.
>
> In the attached patch, I've changed the default buffering
> to line buffered to address the above issue. For standard
> size files there is a 2% performance drop.

Good catch.
It sounds like this fixes a real (albeit obscure) bug, so this
might deserve a NEWS item, though I admit it is borderline.

Thanks!

> p.s. I'll look at bypassing stdio on input to see
> if I can get at least the 2% back

IMHO, even if it did, it would not be worth it.

> From 0db7057c6256d9cd25e988b3fe23e97a0e30f717 Mon Sep 17 00:00:00 2001
> From: Pádraig Brady <address@hidden>
> Date: Tue, 20 Oct 2009 19:19:58 +0100
> Subject: [PATCH] md5sum, sha*sum, sum: line buffer the outputted checksums

s/outputted/printed/
s/line buffer/line-buffer/

> * src/md5sum.c (main): Set stdout to line buffered mode
> to ensure parallel running instances don't intersperse
> their output.  This adds 5% to the run time in the worst case
> of many zero length files, or 2% with standard file sizes.
> * src/sum.c (main): Likewise.
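
For readers following the thread, here is a minimal sketch of the kind of change the log above describes. It is not the code from the patch: the helper names are hypothetical, and the only call assumed is the standard C setvbuf function. When stdout is line buffered, stdio normally flushes at each newline, so each checksum line tends to reach a shared pipe or file in one piece rather than in arbitrary buffer-sized chunks.

```c
#include <stdio.h>

/* Hypothetical helper (not taken from the patch): switch stdout to
   line-buffered mode.  Call it before anything is written to stdout.
   Returns 0 on success and nonzero on failure, as setvbuf does.  */
static int
use_line_buffered_stdout (void)
{
  return setvbuf (stdout, NULL, _IOLBF, 0);
}

/* Hypothetical helper: print one line in md5sum's text-mode layout,
   "<hex digest>  <file name>".  Returns printf's result, i.e. the
   number of characters written, or a negative value on error.  */
static int
print_checksum_line (const char *hex_digest, const char *file_name)
{
  return printf ("%s  %s\n", hex_digest, file_name);
}
```

A main function would call use_line_buffered_stdout () once at startup and then print_checksum_line () per file. The cost is one flush per line, which is presumably where the overhead quoted in the log comes from.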
