bug-coreutils


From: Eric Blake
Subject: bug#18273: closed (Re: bug#18273: sort seems to misbehave if both -u and -n or -k are used)
Date: Fri, 15 Aug 2014 14:32:14 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0

On 08/15/2014 02:22 PM, Lennart Sorensen wrote:

> OK I accept that it is correct behaviour.
> 
> The documentation on the other hand is awful in that case.  I went and
> checked the documentation to try and make sense of what it was doing
> before sending the report, and there was nothing there that gave any
> hint that this was expected behaviour.

'info sort' says:

     The '--stable' ('-s') option disables this "last-resort comparison"
     so that lines in which all fields compare equal are left in their
     original relative order.  The '--unique' ('-u') option also disables
     the last-resort comparison.

and later on:

'-u'
'--unique'

     Normally, output only the first of a sequence of lines that compare
     equal.  For the '--check' ('-c' or '-C') option, check that no pair
     of consecutive lines compares equal.

     This option also disables the default last-resort comparison.

     The commands 'sort -u' and 'sort | uniq' are equivalent, but this
     equivalence does not extend to arbitrary 'sort' options.  For
     example, 'sort -n -u' inspects only the value of the initial
     numeric string when checking for uniqueness, whereas 'sort -n |
     uniq' inspects the entire line.  *Note uniq invocation::.
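
To make that last paragraph concrete, here is a small sketch, assuming
GNU sort and the C locale.  The sample data and the variable names are
made up for illustration; they are not from the original report.

```shell
# Three lines that all start with the same number (1).
input='1 apple
1 banana
1 apple'

# -n -u looks only at the leading number when deciding uniqueness,
# so all three lines compare equal and a single line survives.
unique_numeric=$(printf '%s\n' "$input" | LC_ALL=C sort -n -u)

# sort -n | uniq: uniq compares entire lines, so only the exact
# duplicate '1 apple' is dropped.
piped_uniq=$(printf '%s\n' "$input" | LC_ALL=C sort -n | uniq)

printf 'sort -n -u:\n%s\n\n' "$unique_numeric"
printf 'sort -n | uniq:\n%s\n' "$piped_uniq"
```

The first command should print one line and the second two, which is
exactly the difference the manual is warning about.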


> 
> Why does it have a blob talking about which options implicitly enable -s,
> rather than mention that in the documentation for the options that do it.

-u is the only option that implicitly enables -s.
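
A rough illustration of what that means in practice, assuming GNU sort
in the C locale (the two-line input and the variable names are
hypothetical):

```shell
# Two lines whose second field compares equal under -k2,2n.
input='b 1
a 1'

# Default: the keys tie, so the last-resort comparison of the whole
# line decides the order.
default_order=$(printf '%s\n' "$input" | LC_ALL=C sort -k2,2n)

# -s: no last-resort comparison, so tied lines keep their input order.
stable_order=$(printf '%s\n' "$input" | LC_ALL=C sort -s -k2,2n)

# -u: no last-resort comparison either, and only one line of the
# tied pair is output.
unique_only=$(printf '%s\n' "$input" | LC_ALL=C sort -u -k2,2n)

printf 'default:\n%s\n\n' "$default_order"
printf 'with -s:\n%s\n\n' "$stable_order"
printf 'with -u:\n%s\n' "$unique_only"
```

With the default settings 'a 1' sorts ahead of 'b 1'; with -s the input
order is preserved; with -u only one of the two lines is printed.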

You are welcome to propose a patch to the documentation that would
clarify the situation; we can reopen this bug if a patch materializes.
Maybe even a change to 'sort --help' output to mention that -u implies
-s (which would also feed the 'man sort' page).

> 
> Why does it not mention for -n that anything that isn't a number is
> ignored and treated as if it didn't exist when it comes to deciding
> things like uniqueness?  Are people expected to go read the posix
> standard instead?

The info page DOES mention this:

'-n'
'--numeric-sort'
'--sort=numeric'
     Sort numerically.  The number begins each line and consists of
     optional blanks, an optional '-' sign, and zero or more digits
     possibly separated by thousands separators, optionally followed by
     a decimal-point character and zero or more digits.  An empty number
     is treated as '0'.  The 'LC_NUMERIC' locale specifies the
     decimal-point character and thousands separator.  By default a
     blank is a space or a tab, but the 'LC_CTYPE' locale can change
     this.
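
The consequence of "an empty number is treated as '0'" can be seen with
a short sketch like the following, assuming GNU sort and the C locale
(the input lines and variable names are invented for the example):

```shell
# Two lines begin with a number; two do not.
input='10 apples
abc
9 pears
xyz'

# Lines without a leading number have an empty number, which counts
# as 0, so they sort ahead of 9 and 10.
numeric_sorted=$(printf '%s\n' "$input" | LC_ALL=C sort -n)

# Adding -u: 'abc' and 'xyz' both compare as 0, so they are
# considered duplicates and only one of them is kept.
numeric_unique=$(printf '%s\n' "$input" | LC_ALL=C sort -n -u)

printf 'sort -n:\n%s\n\n' "$numeric_sorted"
printf 'sort -n -u:\n%s\n' "$numeric_unique"
```

Plain 'sort -n' keeps all four lines, while 'sort -n -u' outputs three,
because the two non-numeric lines are equal as far as -n is concerned.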

The --help output is intentionally terse, so I don't know what we could
do there to make it more obvious without exploding the size of what is
supposed to be brief.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
