Re: [coreutils] added ability in sort to skip n number of lines for each file

From:	Pádraig Brady
Subject:	Re: [coreutils] added ability in sort to skip n number of lines for each file
Date:	Mon, 22 Nov 2010 17:28:45 +0000
User-agent:	Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3

On 18/11/10 16:36, Jim Hester wrote:
> A common problem when sorting files stems from the file containing one
> or more header lines, which should not be sorted.  As of now, the
> common solution to this problem is to remove the header lines
> manually, or to output only the non-header lines with tail, awk, or
> some other program and pipe the results to sort.

Thanks for the patch!

> This was likely not
> deemed a problem when sort was only single-threaded, as the printing
> and pipe were likely still faster than the sort itself.  However, with
> multi-threaded sort this results in the operation bottlenecking while
> waiting for more information from the pipe.

I don't follow the argument above.
Can't one always print the header synchronously?
I.e., the `head` below is guaranteed to run before the `sort`:

printf "z_header\nb\na\n" > file
(head -n1 file; sort <(tail -n+2 file) <(tail -n+2 file))

Now the above is awkward (it needs one construct per file)
and dependent on bash, so your idea has some merit I think.
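For comparison, the same workaround can be sketched without bash process substitution. This is only an illustration: the function name `sort_skip_header` and its argument order are made up here, and it still feeds sort through a pipe, so it does nothing for the multi-threading concern raised above.

```shell
# Hypothetical helper (not part of coreutils): keep the first N lines of
# the first file as the header, then sort the remaining lines of all files.
# Usage: sort_skip_header N FILE...
sort_skip_header() {
    n=$1
    shift
    # Print the header synchronously, before sort produces any output.
    head -n "$n" "$1"
    # Emit each file's body (everything after line N) and sort the lot.
    for f in "$@"; do
        tail -n +"$((n + 1))" "$f"
    done | sort
}
```

With the `file` created above, `sort_skip_header 1 file file` prints the header line once, followed by the sorted bodies of both arguments.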

> This common operation
> would be greatly improved if sort could simply print a user-defined
> number of lines for each file.  I have made a simple patch to
> implement this feature, which I have attached to this email.

Note that `join` recently got the --header option
http://lists.gnu.org/archive/html/bug-coreutils/2010-01/msg00284.html
which likewise exists essentially to exclude starting lines from order comparisons.
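As a rough illustration of that precedent, assuming a GNU join new enough to support --header (the wrapper name `join_with_header` is invented for this sketch):

```shell
# Hypothetical wrapper: join two files whose first line is a header.
# With --header, GNU join treats the first line of each input as a header,
# so it is exempt from the sort-order check applied to the other lines.
join_with_header() {
    join --header "$1" "$2"
}
```

Both inputs are still expected to be sorted on the join field from the second line onwards.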

cheers,
Pádraig.
