bug-coreutils



From: Lennart Sorensen
Subject: bug#18273: closed (Re: bug#18273: sort seems to misbehave if both -u and -n or -k are used)
Date: Fri, 15 Aug 2014 17:05:23 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Aug 15, 2014 at 02:32:14PM -0600, Eric Blake wrote:
> 'info sort' says:
> 
>   The '--stable' ('-s') option
> disables this "last-resort comparison" so that lines in which all fields
> compare equal are left in their original relative order.  The '--unique'
> ('-u') option also disables the last-resort comparison.
> 
> and later on:
> 
> '-u'
> '--unique'
> 
>      Normally, output only the first of a sequence of lines that compare
>      equal.  For the '--check' ('-c' or '-C') option, check that no pair
>      of consecutive lines compares equal.
> 
>      This option also disables the default last-resort comparison.
> 
>      The commands 'sort -u' and 'sort | uniq' are equivalent, but this
>      equivalence does not extend to arbitrary 'sort' options.  For
>      example, 'sort -n -u' inspects only the value of the initial
>      numeric string when checking for uniqueness, whereas 'sort -n |
>      uniq' inspects the entire line.  *Note uniq invocation::.

OK, I guess that does somewhat point out the behaviour.
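
For illustration, here is a minimal sketch of the difference the quoted
text describes. It assumes GNU sort and uses made-up input (the variable
names are hypothetical too); LC_ALL=C is set only to keep the ordering
predictable.

```shell
# Hypothetical input: two lines share the leading number 1.
input='1 apple
1 banana
2 cherry'

# With -n -u only the leading numeric value is compared for uniqueness,
# so one of the two "1 ..." lines is dropped.
sort_nu=$(printf '%s\n' "$input" | LC_ALL=C sort -n -u)

# sort -n | uniq compares entire lines, so both "1 ..." lines survive.
sort_n_uniq=$(printf '%s\n' "$input" | LC_ALL=C sort -n | uniq)

printf '%s\n' "sort -n -u:" "$sort_nu" "sort -n | uniq:" "$sort_n_uniq"
```

The first command should print two lines and the second three, which is
the non-equivalence the info page warns about.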

> -u is the only option that implicitly enables -s.
> 
> You are welcome to propose a patch to the documentation that would
> clarify the situation; we can reopen this bug if a patch materializes.
> Maybe even a change to 'sort --help' output to mention that -u implies
> -s (which would also feed the 'man sort' page).

I do wonder why there isn't an option to undo that implicit option,
but perhaps it would not actually make sense.
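
As a rough sketch of what the last-resort comparison does, and what
disabling it with -s looks like, assuming GNU sort and made-up input
(the variable names are only for illustration):

```shell
# Hypothetical input: both lines have the same numeric key, 1.
input='1 b
1 a'

# Default: keys tie, so the last-resort comparison orders the whole
# lines bytewise and "1 a" comes first.
default_order=$(printf '%s\n' "$input" | LC_ALL=C sort -n)

# With -s the last-resort comparison is disabled, so the tied lines
# keep their original relative order.
stable_order=$(printf '%s\n' "$input" | LC_ALL=C sort -n -s)

printf '%s\n' "sort -n:" "$default_order" "sort -n -s:" "$stable_order"
```

Since -u disables the same last-resort comparison, lines that tie on
the key are what it treats as duplicates.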

> The info page DOES mention this:
> 
> '-n'
> '--numeric-sort'
> '--sort=numeric'
>      Sort numerically.  The number begins each line and consists of
>      optional blanks, an optional '-' sign, and zero or more digits
>      possibly separated by thousands separators, optionally followed by
>      a decimal-point character and zero or more digits.  An empty number
>      is treated as '0'.  The 'LC_NUMERIC' locale specifies the
>      decimal-point character and thousands separator.  By default a
>      blank is a space or a tab, but the 'LC_CTYPE' locale can change
>      this.
> 
> The --help output is intentionally terse, so I don't know what we could
> do there to make it more obvious without exploding the size of what is
> supposed to be brief.

Well, I always thought info was meant to be complete documentation.

I see nothing in the -n description above that makes me think it would
ignore the part of the line that isn't a number.  The -u description
does seem to point out that this is the behaviour.
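
A small sketch of how that plays out on input that is not purely
numeric, assuming GNU sort and hypothetical data (variable names are
made up for the example):

```shell
# Hypothetical input: "abc" has no leading number, so -n treats it as 0.
mixed='10 x
9 y
abc'

# Only the leading numeric string is used as the key: 0 < 9 < 10.
numeric_order=$(printf '%s\n' "$mixed" | LC_ALL=C sort -n)

# Lines with no leading number all have the key 0, so with -u they
# compare equal to each other and to a literal "0".
zeros='abc
xyz
0'
unique_zeros=$(printf '%s\n' "$zeros" | LC_ALL=C sort -n -u)

printf '%s\n' "sort -n:" "$numeric_order" "sort -n -u:" "$unique_zeros"
```

The second command should print a single line, because everything after
the (empty) number is ignored when checking for uniqueness.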

I think this might be the first time I ever used -n when the input was
not pure numbers, so I never hit this before.

-- 
Len Sorensen




