bug-coreutils

bug#25550: Apparent unicode bug in uniq 8.26


From: David Loyall
Subject: bug#25550: Apparent unicode bug in uniq 8.26
Date: Thu, 26 Jan 2017 16:45:37 -0600

Hello.  I think I found a bug in uniq 8.26.

Here's a demo:

address@hidden:~/e2-scratch$ cat faces_mre.txt
(◕‿◕)
(︺︹︺)

address@hidden:~/e2-scratch$ uniq -c faces_mre.txt
2 (◕‿◕)

Here's some background info:

address@hidden:~/e2-scratch$ od -x faces_mre.txt
0000000 e228 9597 80e2 e2bf 9597 0a29 ef28 bab8
0000020 b8ef efb9 bab8 0a29
0000030
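
To make the dump easier to read, here is a minimal sketch that turns the `od -x` output above back into bytes and lists the code points of each line. It assumes the dump came from a little-endian host (so each 16-bit word is printed with its two bytes swapped) and that the file has an even number of bytes. The helper name `od_x_to_bytes` is made up for this illustration.

```python
import struct

# The od -x output from the report, copied verbatim.
OD_DUMP = """\
0000000 e228 9597 80e2 e2bf 9597 0a29 ef28 bab8
0000020 b8ef efb9 bab8 0a29
0000030
"""


def od_x_to_bytes(dump: str) -> bytes:
    """Rebuild the raw bytes from `od -x` output (hypothetical helper).

    Assumes a little-endian host: od prints each pair of bytes as one
    16-bit word, so "e228" stands for the byte 0x28 followed by 0xe2.
    """
    out = bytearray()
    for line in dump.splitlines():
        fields = line.split()
        # fields[0] is the octal offset; the remaining fields are data words.
        for word in fields[1:]:
            out += struct.pack("<H", int(word, 16))
    return bytes(out)


data = od_x_to_bytes(OD_DUMP)
lines = data.decode("utf-8").splitlines()

# Print only ASCII so the output is safe on any terminal encoding.
for number, line in enumerate(lines, start=1):
    print(number, [f"U+{ord(ch):04X}" for ch in line])
```

Under those assumptions the dump decodes as valid UTF-8 into two lines whose bytes differ, so the input file itself does not contain a duplicate line.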

address@hidden:~/e2-scratch$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

address@hidden:~/e2-scratch$ uniq --version
uniq (GNU coreutils) 8.26
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Richard M. Stallman and David MacKenzie.

The bug disappears when LC_COLLATE is overridden with the C locale.

address@hidden:~/e2-scratch$ LC_COLLATE=c uniq -c faces_mre.txt
1 (◕‿◕)
1 (︺︹︺)
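
Because changing only LC_COLLATE changes the result, one plausible reading is that the lines are being compared with a locale-aware collation function rather than byte by byte. That is an assumption, not something the report establishes. The sketch below contrasts the two kinds of comparison for the two lines from the report, using Python's `locale.strcoll`; the helper name `collate_compare` is hypothetical, and the `en_US.UTF-8` result depends entirely on the locale data installed on the system.

```python
import locale

# The two lines from the report, written as escapes to avoid encoding issues.
FIRST = "(\u25d5\u203f\u25d5)"
SECOND = "(\ufe3a\ufe39\ufe3a)"


def collate_compare(a: str, b: str, locale_name: str):
    """Compare a and b under the given LC_COLLATE locale (hypothetical helper).

    Returns the strcoll result (0 means "collates as equal"), or None if
    the locale is not available on this system.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return locale.strcoll(a, b)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


# Byte-wise comparison never treats these two lines as equal.
print("bytes equal:", FIRST.encode("utf-8") == SECOND.encode("utf-8"))

# Collation-based comparison may differ from locale to locale.
for name in ("C", "en_US.UTF-8"):
    print(name, "strcoll:", collate_compare(FIRST, SECOND, name))
```

If the `en_US.UTF-8` call returns 0 on a given system, that would be consistent with the `uniq -c` output shown at the top of this report; a nonzero value or `None` means the behavior cannot be reproduced this way on that system.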

I hope this helps.

Cheers,

--Dave Loyall
Omaha, Nebraska, USA
