bug-coreutils

Re: new coreutil? shuffle - randomize file contents


From: Philip Rowlands
Subject: Re: new coreutil? shuffle - randomize file contents
Date: Sat, 4 Jun 2005 02:11:30 +0100 (BST)

On Fri, 3 Jun 2005, Davis Houlton wrote:

>Now that I've heard of some samples, I think a sort --random is an excellent
>idea, and I hope its inclusion to coreutils occurs at some point.  There is
>scope in sort that is far beyond shuffle.  Assuming sort --random's eventual
>entry to coreutils is a given, I think it sounds like we have two questions
>to decide on--
> a) Implementation details aside, does a shuffle command merit entry?

No. (IMHO - at least, not as a standalone coreutil.)

> b) Examining implementation details, what is the best way to go?

Extend sort.

My rationale for the above answers is to consider the issues of
randomization and partial ("top 10") sort separately. They are
orthogonal challenges, and while extending sort with a "head" operator
would be contrary to the unix-y way of small, separate do-one-thing-well
filters, the efficiency gains are an overwhelming plus (and not only in
the context of shuffling).
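
To illustrate the efficiency argument for a partial ("top 10") sort, here is
a minimal Python sketch. It is not how sort(1) works or would work; the
function name top_k_lines and the sample data are made up for the example.
It keeps only k candidates in a heap instead of ordering the whole input,
which is roughly O(N log k) work rather than O(N log N).

```python
import heapq
import random

def top_k_lines(lines, k):
    """Return the k smallest lines in sorted order without a full sort.

    Hypothetical helper for illustration only: heapq.nsmallest keeps at
    most k candidates in memory while scanning the input once.
    """
    return heapq.nsmallest(k, lines)

# Made-up sample input: 100,000 zero-padded numeric strings.
lines = [f"{random.randrange(10**6):06d}" for _ in range(100_000)]

top10 = top_k_lines(lines, 10)

# Same answer as sorting everything and then taking the head.
full_sort_head = sorted(lines)[:10]
```

The result matches what "sort | head" would give, but the full ordering of
the remaining lines is never computed.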

Paul Eggert wrote:

> Suppose you're randomizing an input file of 10 million lines.  And
> suppose you want to approximate a "truly random" key by using a
> 128-bit random key for each input line. Then you'll need about 1.3
> billion random bits.

I don't think that is necessarily the case, even if "secure" shuffling is
required. Common SSL and VPN crypto seems to manage without horrendous
amounts of random data - several cryptographic PRNG algorithms exist.
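
The arithmetic behind the numbers in this exchange can be checked with a
short Python sketch. The variable names are invented for the example. It
compares the per-line key approach (128 bits for each of 10 million lines)
with a single 256-bit seed and, as a side note, computes log2(n!), the
number of bits needed to pick one of the n! possible orderings uniformly.

```python
import math

n_lines = 10_000_000      # input size from the quoted example
key_bits_per_line = 128   # one random key per line
seed_bits = 256           # a single seed for a PRNG, as an alternative

# Total random bits if every line gets its own 128-bit key.
key_bits_total = n_lines * key_bits_per_line

# Bits needed to select one of n! permutations uniformly: log2(n!).
# lgamma(n + 1) is ln(n!), so dividing by ln(2) converts to bits.
permutation_bits = math.lgamma(n_lines + 1) / math.log(2)

print(f"per-line keys:       {key_bits_total:,} bits")
print(f"log2(n!) for n={n_lines:,}: about {permutation_bits:,.0f} bits")
print(f"single PRNG seed:    {seed_bits} bits")
```

The first figure is the "about 1.3 billion" quoted above. A PRNG seeded with
256 bits needs far less system entropy, but it can produce at most 2^256
distinct output streams, which is fewer than the number of permutations of a
large file. Whether that matters depends on what "uniform" is required to
mean in practice.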

David Feuer wrote:
> There seems to be some sloppy thinking regarding efficiency and
> uniform randomness.

I'm feeling lucky - how does a Knuth shuffle fed by a Mersenne twister
(fast) or something Yarrow-based (secure) seeded by 256 bits of
/dev/{,u}random sound? Corrections of sloppy thinking are welcome :)
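
A minimal Python sketch of that suggestion, under stated assumptions:
CPython's random.Random is a Mersenne Twister, so it stands in for the
"fast" option; random.SystemRandom, which draws from the operating system's
generator, stands in for the "secure" option and is NOT Yarrow. The function
names knuth_shuffle and seeded_mersenne_twister are invented for the
example.

```python
import os
import random

def knuth_shuffle(items, rng):
    """Shuffle a list in place with the Knuth (Fisher-Yates) algorithm.

    rng is any object with a randrange() method, for example
    random.Random or random.SystemRandom.
    """
    for i in range(len(items) - 1, 0, -1):
        j = rng.randrange(i + 1)  # uniform index in 0..i inclusive
        items[i], items[j] = items[j], items[i]
    return items

def seeded_mersenne_twister(seed_bytes=32):
    """Return a Mersenne Twister seeded with 256 bits from the OS."""
    seed = int.from_bytes(os.urandom(seed_bytes), "big")
    return random.Random(seed)

# Made-up sample input.
lines = [f"line {n}" for n in range(10)]

# "Fast" variant: Mersenne Twister seeded once from os.urandom.
fast = knuth_shuffle(list(lines), seeded_mersenne_twister())

# "Secure" stand-in: every draw comes from the OS generator.
secure = knuth_shuffle(list(lines), random.SystemRandom())
```

Each swap picks j uniformly from 0..i, which is what makes every permutation
equally likely given an ideal source of random numbers. The shuffle needs
the whole input in memory, so this sketch says nothing about files too large
to hold at once.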


Cheers,
Phil



