bug#36718: uniq treats distinct Korean characters equal
From: Felix Hamme
Subject: bug#36718: uniq treats distinct Korean characters equal
Date: Fri, 19 Jul 2019 12:18:32 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

Thanks @Paul Eggert, it seems like this isn't a bug at all.
My locale (de_DE.utf8) appears to lack collation definitions for the
Korean characters in question. After setting my system language to Korean
(ko_KR.utf8), uniq produces the expected output.
For my purposes, I'll set LC_COLLATE=C in my environment, which forces
byte-wise comparison and should work for all languages (as long as LC_ALL
is not set, since it would override LC_COLLATE).
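
To illustrate the difference, here is a minimal sketch in Python rather than
uniq itself. The helper names (uniq_adjacent, same_bytes, same_collation) are
made up for this example, and it assumes, as I understand it, that uniq
compares lines with a strcoll-style function outside the C locale. It only
mimics the "drop adjacent equal lines" behaviour with two different notions
of equality.

```python
import locale
from typing import Callable, Iterable, List


def uniq_adjacent(lines: Iterable[str], same: Callable[[str, str], bool]) -> List[str]:
    """Keep a line only if `same` says it differs from the last kept line."""
    kept: List[str] = []
    for line in lines:
        if kept and same(kept[-1], line):
            continue  # treated as a repeat of the previous line
        kept.append(line)
    return kept


def same_bytes(a: str, b: str) -> bool:
    # Byte-wise equality, which is what LC_COLLATE=C amounts to.
    return a.encode("utf-8") == b.encode("utf-8")


def same_collation(a: str, b: str) -> bool:
    # Locale-aware equality: the result depends on the active LC_COLLATE,
    # so a locale without rules for these characters may call them equal.
    return locale.strcoll(a, b) == 0


if __name__ == "__main__":
    # Use the collation locale from the environment (LC_ALL / LC_COLLATE / LANG).
    try:
        locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        pass  # requested locale not installed; stay with the default

    sample = ["가", "나", "나", "다"]
    print("byte-wise:", uniq_adjacent(sample, same_bytes))
    print("collation:", uniq_adjacent(sample, same_collation))
```

The byte-wise variant never merges distinct characters, whatever the locale;
the collation variant is only as good as the locale's definitions.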
Admittedly, I could have searched for this first:
https://unix.stackexchange.com/questions/373848/why-does-uniq-think-%E3%81%82%E3%81%84-and-%E3%81%84%E3%81%82-are-the-same