bug#9995: problem about sort -u -k


From: 夏凯
Subject: bug#9995: problem about sort -u -k
Date: Wed, 9 Nov 2011 22:02:26 +0800

Thanks for your reply.
If I want to use the entire line as the key for uniqueness but sort by the
third field, should I use sort -u -k3 -k1 -k2 a to do that?
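
A quick way to check whether the extra -k2 changes anything, as a rough sketch: the file contents are copied from the example quoted below, while the temporary directory and LC_ALL=C are my own assumptions to keep the run reproducible. Since -k1 without an end field already runs to the end of the line, I would expect -k2 to never break a tie.

```shell
# Sketch only: temp directory and LC_ALL=C are assumptions, not from the thread.
export LC_ALL=C
tmp=$(mktemp -d)
printf '%s\n' '1 a q' '2 a q' '1 a q' '1 a w' '3 a w' > "$tmp/a"

# -k3 is the primary key; -k1 has no end field, so it spans the whole line.
two_keys=$(sort -u -k3 -k1 "$tmp/a")
# -k2 is only consulted for lines that -k1 already found equal.
three_keys=$(sort -u -k3 -k1 -k2 "$tmp/a")

printf '%s\n' "$three_keys"
if [ "$two_keys" = "$three_keys" ]; then
    echo "-k2 made no difference"
fi
rm -rf "$tmp"
```

If both commands print the same lines, -k2 is redundant for this input and -k3 -k1 is enough.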

On Wed, Nov 9, 2011 at 03:45, Eric Blake <address@hidden> wrote:
> On 11/08/2011 11:54 AM, Eric Blake wrote:
>>>
>>> 22:41:39#tp#~> /usr/local/bin/sort -u -k1,3 a
>>> 1 a q
>>> 1 a w
>>> 3 a w
>>> 22:41:48#tp#~> /usr/local/bin/sort -u -k3 a
>>> 1 a q
>>> 1 a w
>
>> Since you didn't tell us what output you were hoping to get, I can't
>> tell you the proper command line that would match your expected output.
>> Feel free to reply, even while this bug is closed, if you need more help
>> in getting the output you want.
>
> I'll give a preemptive attempt at guessing what you meant, as well:
>
> If you wanted to sort on just the third and subsequent fields, but then
> strip duplicate lines only if the entire line is duplicate, then you have to
> use two processes:
>
> sort [-s] -k3 a | uniq
>
> If you don't mind a two-key sort, where the primary key is the third and
> subsequent fields, but where the secondary key is the entire line so as to
> force sort -u to consider the entire line when determining uniqueness, then
> one process will do:
>
> sort -u -k3 -k1 a
>
> To see the difference, and remembering that sort -u implies sort -s,
> consider these contents for a:
>
> $ cat a
> 1 a q
> 2 a q
> 1 a q
> 1 a w
> 3 a w
> $ sort -u -k3 -k1 a
> 1 a q
> 2 a q
> 1 a w
> 3 a w
> $ sort -s -k3 a | uniq
> 1 a q
> 2 a q
> 1 a q
> 1 a w
> 3 a w
> $ sort -k3 a | uniq
> 1 a q
> 2 a q
> 1 a w
> 3 a w
>
> That is, if the stable sort of just -k3 leaves identical lines that are not
> adjacent ("1 a q" in my example), then the separate uniq process won't
> filter them; while using sort -u with -k1 as the means to force the entire
> line as a secondary sort key loses the ability to leave identical lines
> separated by a distinct line.  Likewise, omitting both -s and -u lets sort
> imply a last-resort -k1, at which point uniq sees the same line order as
> sort -u sees.
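
The three variants above can be compared side by side with a small script. This is only a sketch: the data is the sample file from the quoted example, and the temporary directory, the variable names and LC_ALL=C are my own assumptions.

```shell
# Sketch only: variable names, temp directory and LC_ALL=C are assumptions.
export LC_ALL=C
tmp=$(mktemp -d)
printf '%s\n' '1 a q' '2 a q' '1 a q' '1 a w' '3 a w' > "$tmp/a"

# One process: third field first, whole line as the secondary key.
unique_two_key=$(sort -u -k3 -k1 "$tmp/a")
# Stable sort keeps input order among equal keys, so the two "1 a q"
# lines stay separated by "2 a q" and uniq cannot merge them.
stable_then_uniq=$(sort -s -k3 "$tmp/a" | uniq)
# Without -s or -u, sort falls back to a whole-line comparison,
# which brings identical lines together before uniq sees them.
default_then_uniq=$(sort -k3 "$tmp/a" | uniq)

for name in unique_two_key stable_then_uniq default_then_uniq; do
    eval "value=\$$name"
    echo "$name: $(printf '%s\n' "$value" | grep -c '') lines"
done
rm -rf "$tmp"
```

Only the stable pipeline should report all five lines; the other two should agree with each other.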
>
>>> i read
>>> http://www.gnu.org/s/coreutils/manual/html_node/sort-invocation.html,
>>> but got nothing about this.
>
> Actually, it does - under the option -u, I see:
>
> The commands sort -u and sort | uniq are equivalent, but this equivalence
> does not extend to arbitrary sort options. For example, sort -n -u inspects
> only the value of the initial numeric string when checking for uniqueness,
> whereas sort -n | uniq inspects the entire line. See uniq invocation.
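
The manual's -n example can be tried the same way. A minimal sketch, with made-up input lines (they are my assumption, not from the manual) that share a leading number but differ in the rest of the line:

```shell
# Sketch only: the input lines and LC_ALL=C are assumptions for illustration.
export LC_ALL=C
input=$(printf '%s\n' '1 apple' '1 banana' '2 cherry')

# -n -u: uniqueness is decided on the leading number alone,
# so the two lines starting with 1 count as duplicates.
numeric_unique=$(printf '%s\n' "$input" | sort -n -u)
# -n | uniq: uniq compares whole lines, so nothing is dropped here.
numeric_then_uniq=$(printf '%s\n' "$input" | sort -n | uniq)

echo "sort -n -u:      $(printf '%s\n' "$numeric_unique" | grep -c '') lines"
echo "sort -n | uniq:  $(printf '%s\n' "$numeric_then_uniq" | grep -c '') lines"
```

The first count should be smaller than the second, which is the non-equivalence the manual describes.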
>
> --
> Eric Blake   address@hidden    +1-801-349-2682
> Libvirt virtualization library http://libvirt.org
>



-- 
contact me:
MSN: address@hidden
GTALK: address@hidden