bug-coreutils

Re: [PATCH] split: --chunks option


From: Chen Guo
Subject: Re: [PATCH] split: --chunks option
Date: Tue, 15 Dec 2009 14:46:50 -0800 (PST)

Hey guys,
    Alright, as of now everything that we originally talked about has been
implemented. Tests were done on a 980K ASCII file with the buffer size
limited to 2 bytes (I will test on binary files later). Everything works great.


As of now, I've got the following syntax:
    -bN or --bytes=N: original usage
    -b/N or --bytes=/N: split into N equal-sized files
    -bK/N or --bytes=K/N: extract the Kth of N equal chunks to stdout
    -nN is equivalent to -b/N, and -nK/N is equivalent to -bK/N
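
To make the K/N form concrete, here is a rough sketch of how the -n
argument could be parsed and turned into byte offsets. It is not the
code from the patch: parse_chunk_spec and chunk_bounds are made-up
names, and letting the last chunk absorb the remainder when the file
size is not a multiple of N is only an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <inttypes.h>

/* Hypothetical helper (not from split.c): parse "N" or "K/N".
   Returns 0 on success, -1 on a malformed spec.
   *k is set to 0 when only N was given.  */
static int
parse_chunk_spec (const char *spec, uintmax_t *k, uintmax_t *n)
{
  char *end;

  /* strtoumax would accept leading blanks and a sign; reject them.  */
  if (!isdigit ((unsigned char) *spec))
    return -1;
  errno = 0;
  uintmax_t first = strtoumax (spec, &end, 10);
  if (errno != 0)
    return -1;

  if (*end == '\0')             /* plain "N" */
    {
      if (first == 0)
        return -1;
      *k = 0;
      *n = first;
      return 0;
    }

  if (*end != '/' || !isdigit ((unsigned char) end[1]))
    return -1;
  errno = 0;
  uintmax_t second = strtoumax (end + 1, &end, 10);
  if (errno != 0 || *end != '\0')
    return -1;
  if (first == 0 || second == 0 || first > second)
    return -1;                  /* need 1 <= K <= N */
  *k = first;
  *n = second;
  return 0;
}

/* Hypothetical helper: byte range [*start, *end) of chunk K of N.
   Assumption: the last chunk takes any leftover bytes.  */
static void
chunk_bounds (uintmax_t file_size, uintmax_t k, uintmax_t n,
              uintmax_t *start, uintmax_t *end)
{
  assert (n > 0 && 1 <= k && k <= n);
  uintmax_t chunk_size = file_size / n;
  *start = (k - 1) * chunk_size;
  *end = (k == n) ? file_size : k * chunk_size;
}
```

With a 10-byte file and -n 4/4, this gives the range [6, 10), so the
final chunk is 4 bytes while the first three are 2 bytes each.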

> Right. Also doing -n lines:4 would allow one to specify
> a distribution method which may be required. I.E. this
> could be used to specify round robin distribution of lines
> which might be required.
> 
> -n lines-rr:4

I haven't handled the non-seekable file case yet, but yeah, this works.
As for extracting byte chunks to stdout, I see no other way than to
read from the start of the file and begin outputting once the desired
chunk is reached.
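
A minimal sketch of that read-and-discard approach, assuming the byte
range of the wanted chunk is already known. copy_byte_range is a
made-up name and the 4096-byte buffer is arbitrary; this is not the
code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: copy bytes [start, end) of IN to OUT without
   seeking, by reading from the beginning and discarding everything
   before START.  Returns the number of bytes written.  */
static uintmax_t
copy_byte_range (FILE *in, FILE *out, uintmax_t start, uintmax_t end)
{
  assert (start <= end);
  char buf[4096];
  uintmax_t pos = 0;            /* input offset of buf[0] */
  uintmax_t written = 0;
  size_t got;

  while (pos < end && (got = fread (buf, 1, sizeof buf, in)) > 0)
    {
      /* Part of this buffer that overlaps [start, end).  */
      uintmax_t lo = pos < start ? start - pos : 0;
      uintmax_t hi = end - pos < got ? end - pos : got;
      if (lo < hi)
        written += fwrite (buf + lo, 1, hi - lo, out);
      pos += got;
    }
  return written;
}
```

Reading stops once the end of the range has been passed, so the tail
of the input is never consumed unnecessarily.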

> Also specifying other delimiters might be useful like:
> 
> -n nul:4

Actually, at the top of split.c I see a TODO that talks about a -t option
which specifies a CHAR or REGEX delimiter. REGEX might be
kind of complicated, but a delimiter char stored in a global char eol
should be trivial to implement. We can leave eol = '\n' by default, and
the -t option can override it.
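
Roughly, the global-delimiter idea could look like the sketch below.
Only the name eol comes from the paragraph above; count_records is a
made-up helper, and none of this is the actual split.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Record delimiter; newline by default.  A -t option (not implemented
   here) would simply assign a different character to it.  */
static char eol = '\n';

/* Hypothetical helper: count the records terminated by EOL in the
   LEN bytes starting at BUF.  */
static size_t
count_records (const char *buf, size_t len)
{
  assert (buf != NULL);
  size_t count = 0;
  const char *p = buf;
  const char *limit = buf + len;

  while (p < limit && (p = memchr (p, eol, limit - p)) != NULL)
    {
      count++;
      p++;                      /* resume just past the delimiter */
    }
  return count;
}
```

Because the search uses memchr with an explicit length, setting eol
to '\0' works the same way as any other delimiter.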

But then this raises the question: how would you enter, say, '\0' at
the terminal? And the only way I know of entering a newline is rather awkward:
-t '
'
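
One possible answer, offered only as a sketch: a NUL byte cannot appear
inside an argv string at all, so the option parser could accept a few
backslash escapes typed literally, e.g. -t '\0' or -t '\n'.
parse_delim and the set of escapes it accepts are assumptions, not
anything from split.c or the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: turn the argument of -t into a delimiter.
   Accepts a single character, or one of the two-character escapes
   \0, \n, \t and \\ typed literally.  Returns 0 on success and -1
   for anything else.  */
static int
parse_delim (const char *arg, char *delim)
{
  assert (arg != NULL && delim != NULL);

  /* Backslash followed by exactly one character.  */
  if (arg[0] == '\\' && arg[1] != '\0' && arg[2] == '\0')
    {
      switch (arg[1])
        {
        case '0':  *delim = '\0'; return 0;
        case 'n':  *delim = '\n'; return 0;
        case 't':  *delim = '\t'; return 0;
        case '\\': *delim = '\\'; return 0;
        default:   return -1;
        }
    }

  /* Exactly one ordinary character.  */
  if (arg[0] != '\0' && arg[1] == '\0')
    {
      *delim = arg[0];
      return 0;
    }
  return -1;
}
```

In this sketch a lone backslash is taken as an ordinary delimiter,
while the empty string and longer strings are rejected.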

One last thing: would I be wrong to say we can't support splitting by
chunks with stdin? Barring, of course, the round-robin line splitting.
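
The round-robin case is the exception because it never needs the total
input size: each line goes to the next output in turn as it arrives.
A minimal sketch, assuming the output streams are already open;
split_round_robin is a made-up name and this is not the code from the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: distribute the lines of IN over the N streams
   in OUTS, one line each in turn.  Works on pipes because the input
   is only read forward.  Returns the number of complete lines seen.  */
static size_t
split_round_robin (FILE *in, FILE **outs, size_t n)
{
  assert (in != NULL && outs != NULL && n > 0);
  size_t lines = 0;
  size_t cur = 0;               /* index of the current output */
  int c;

  while ((c = getc (in)) != EOF)
    {
      putc (c, outs[cur]);
      if (c == '\n')
        {
          lines++;
          cur = (cur + 1) % n;  /* next line goes to the next output */
        }
    }
  return lines;
}
```

A trailing partial line without a newline is still written to the
current output, but it is not included in the returned count.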



