Re: new coreutil? shuffle - randomize file contents
From: Davis Houlton
Subject: Re: new coreutil? shuffle - randomize file contents
Date: Fri, 3 Jun 2005 07:16:33 +0000
User-agent: KMail/1.7.2
On Thursday 02 June 2005 10:31, Jim Meyering wrote:
> It sure sounds like shuffle and sort should share a lot of code,
> one way or another, so why not have them share the line- and key-
> handling code, too? I won't rule out adding a new program, like
> shuffle, but I confess I'm less inclined now than when I started
> typing this message.
Now that I've heard some examples, I think a sort --random is an excellent
idea, and I hope it is included in coreutils at some point. The scope of
sort goes far beyond that of shuffle. Assuming sort --random's eventual
entry into coreutils is a given, I think we have two questions
to decide on--
a) Implementation details aside, does a shuffle command merit entry?
b) Examining implementation details, what is the best way to go?
My take is that shuffle is good for the lazy man--shuffle, as currently
written, is destructive, replacing a file A with random(A). I intend (after I
add -z and -o, of course :] ) to add a --head (-h) option, so that if all we
want is the first line, that's all we have to process. My thought is that
these properties make shuffle ideal for simple, quick hitters via "system"
type calls in the various scripting languages that are commonly used.
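One way the --head idea could avoid unnecessary work is a partial Fisher-Yates shuffle: only the first k positions are randomized, so the number of swaps depends on k rather than on the input size. The following is a minimal sketch under that assumption, not the code in shuffle itself; the function name `shuffle_head` and its parameters are hypothetical.

```python
import random


def shuffle_head(lines, k, rng=random):
    """Return k lines drawn uniformly at random, without replacement.

    Hypothetical helper: a partial Fisher-Yates shuffle that stops after
    the first k positions have been filled, so only k swaps are done.
    """
    items = list(lines)          # still reads all input once
    k = min(k, len(items))       # asking for more than exists returns everything
    for i in range(k):
        # Pick a random element from the not-yet-fixed tail [i, len).
        j = rng.randrange(i, len(items))
        items[i], items[j] = items[j], items[i]
    return items[:k]


if __name__ == "__main__":
    names = ["a.mp3", "b.mp3", "c.mp3", "d.mp3"]
    print(shuffle_head(names, 1))
```

Note that the whole input is still read here; only the shuffling work is reduced. Avoiding the full read as well would need a different approach, which this sketch does not attempt.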
To help illustrate, here are the common use cases I envision:
USE CASE 1: Randomizing file contents
    shuffle *
    ls * | xargs -l -ii sort --random "i" -o "i"
USE CASE 2: Grabbing a file at random
    ls * | shuffle -h 1
    ls * | sort --random | head -n 1
USE CASE 3: Generating a list of random files
    find . -name \*.mp3 | shuffle -o "playlist.m3u"
    find . -name \*.mp3 | sort --random -o "playlist.m3u"
Efficiency-wise, I think shuffle will run quicker, but that may not be an issue
given the size of average cases (small). For question a) above, I'm thinking
there is room. In the same way we have grep -r and find . | xargs grep, I'm
assuming we can have both shuffle and sort (from a user's perspective).
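To make the efficiency comparison concrete, here is a rough sketch of the two approaches in Python. It assumes shuffle does a linear-time Fisher-Yates pass and that sort --random would amount to sorting on a random key per line; neither is confirmed as the actual implementation, and both function names are hypothetical.

```python
import random


def shuffle_linear(lines, rng=random):
    """Fisher-Yates shuffle: one pass, O(n) swaps (assumed shuffle approach)."""
    items = list(lines)
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)    # inclusive on both ends
        items[i], items[j] = items[j], items[i]
    return items


def shuffle_by_sort(lines, rng=random):
    """Attach a random key to each line, then sort: O(n log n) comparisons
    (assumed stand-in for what a sort --random might do)."""
    keyed = [(rng.random(), line) for line in lines]
    keyed.sort(key=lambda pair: pair[0])
    return [line for _, line in keyed]


if __name__ == "__main__":
    files = ["one", "two", "three", "four"]
    print(shuffle_linear(files))
    print(shuffle_by_sort(files))
```

Both return a permutation of the input; the difference is only in the work done, which is unlikely to matter for small inputs.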
If we assume that the potential exists for both sort and shuffle, the devil
is then in the details. How much of sort would exist in shuffle--or vice
versa? Should there be a GNU coreutils include that deals specifically with
temp files, for use by any utility?
Ahhh, questions questions...I'm not sure how we should approach it. Is the
answer to b) unknown at present? Trying to get our arms around the issue
could lead to a great deal of analysis paralysis, though I'm always willing
to try. If we agree that a) is a given, maybe we should just try and add the
N-scale code to shuffle, with a parallel --random effort in sort? Then we
can operate in hindsight, refactoring and adjusting as necessary. One
potential plan at any rate. Of course, if we agree that shuffle should not be
included, then no harm no foul either. But, as it sounds like sort --random
is far from trivial, sometimes a bird in the hand??? Thoughts?
Thanks,
Davis