Re: sort -u -n error : sort v 4.5.3
From: Eric Blake
Subject: Re: sort -u -n error : sort v 4.5.3
Date: Sun, 05 Mar 2006 03:12:32 +0000
> > address@hidden data]# sort -n < /tmp/new_spammers
> > 64.202.165.131
> > 64.202.165.132
> > 64.202.165.133
> >
> >
> > address@hidden data]# sort -u -n < /tmp/new_spammers
> > 64.202.165.132
>
> Not a bug. This behavior is required by POSIX,
> http://www.opengroup.org/onlinepubs/009695399/utilities/sort.html,
> which states that -n restricts the key to the first numeric
> field, and that -u uniquifies based on the sort key.
>
> In your sample file, the unique key is 64, and all
> three lines have the same key. When -n or -u is
> used alone, a tiebreaker is used that sorts based
> on the rest of the line. But when used together,
> the behavior correctly stops after one line.
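For the data in the original report, one way to get the intended result is to key on each dotted field separately, so that -u compares all four octets instead of only the leading number. A minimal sketch, assuming a sort that supports the standard -t and -k options; the helper name sort_ips and the sample input are made up for illustration:

```shell
# sort_ips (hypothetical helper): sort dotted-quad addresses numerically
# by octet and drop exact duplicates.
# -t . splits fields on dots; each -kN,Nn key compares one octet as a
# number, so -u only merges lines whose four octets all match.
sort_ips() {
    LC_ALL=C sort -u -t . -k1,1n -k2,2n -k3,3n -k4,4n
}

# Sample input (made up), including one duplicate address.
result=$(printf '%s\n' 64.202.165.132 64.202.165.131 \
                       64.202.165.133 64.202.165.131 | sort_ips)

# Prints the three distinct addresses in ascending order.
printf '%s\n' "$result"
```

With the real file this would be `sort_ips < /tmp/new_spammers`.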
However, I think the report does point out two bugs in GNU sort's POSIX
compliance. GNU sort currently parses the decimal separator as part of a
numeric sort key, and it treats signed zeros as significant. As I read
POSIX, neither is permitted:
Actual behavior, sort 5.94:
$ export LANG=C LC_ALL=C POSIXLY_CORRECT=1
$ cat blah
64.202
64.201
64.203
0
-0
0
$ sort -n blah
-0
0
0
64.201
64.202
64.203
$ sort -nu blah
0
64.201
64.202
64.203
Expected behavior under POSIX:
$ sort -n blah # lines with equal numeric keys should stay stable during sort
0
-0
0
64.202
64.201
64.203
$ sort -nu blah # numeric keys do not include decimal points in LANG=C
0
64.202
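The output I expect under that reading can be approximated with an existing sort by making the key explicit: split on '.' and compare only the first field numerically. This is only a sketch of the interpretation, not a fix. Note that -s (stable sort) is an extension that POSIX does not specify, and the variable names are mine:

```shell
# Same contents as the sample file above, held in a variable.
input=$(printf '%s\n' 64.202 64.201 64.203 0 -0 0)

# Key on the text before the first '.', compared numerically.
# -s keeps lines with equal keys in input order (an extension, not POSIX).
stable=$(printf '%s\n' "$input" | LC_ALL=C sort -s -t . -k1,1n)

# Same key with -u: only one line per distinct key is kept.
unique=$(printf '%s\n' "$input" | LC_ALL=C sort -u -t . -k1,1n)

printf '%s\n' "$stable"
echo ---
printf '%s\n' "$unique"
```

Here the three zero lines share one key and the three 64.x lines share another, so the -u run leaves two lines.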
--
Eric Blake