bug-coreutils

bug#16168: uniq mis-handles UTF8 (8bit) characters


From: Pádraig Brady
Subject: bug#16168: uniq mis-handles UTF8 (8bit) characters
Date: Mon, 16 Dec 2013 17:33:23 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

tag 16168 notabug
close 16168
stop

On 12/16/2013 01:50 PM, Shlomo Urbach wrote:
> Lines with CJK letters are deemed equal by length only, since the
> characters seem to be ignored.
> I understand this is due to locale.
> But, it would be nice if a simple flag would do a locale-free comparison
> (i.e. equal = all bytes are equal).

If you want to compare byte by byte, run uniq in the C locale:

LC_ALL=C uniq ....
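
As a small illustration of the command above, here is a sketch using made-up sample input (the `sample_input` helper and its three lines are assumptions for demonstration, not data from the original report). With the C locale forced, two adjacent lines should be treated as equal only when all their bytes match, so the repeated line is collapsed while the different CJK line of the same length is kept:

```shell
# Hypothetical sample input: a repeated line followed by a
# different line with the same number of characters.
sample_input() {
    printf '你好\n你好\n世界\n'
}

# LC_ALL=C overrides every locale category for this one command,
# so uniq compares the raw bytes of adjacent lines.
result=$(sample_input | LC_ALL=C uniq)

printf '%s\n' "$result"
```

Setting the variable on the command line like this affects only that single uniq invocation; the rest of the shell session keeps its normal locale.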

thanks,
Pádraig.




