bug#38627: uniq -c gets wrong count with non-ascii strings
From: Andreas Schwab
Subject: bug#38627: uniq -c gets wrong count with non-ascii strings
Date: Sun, 23 Feb 2020 21:02:30 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

On Feb 23 2020, Pádraig Brady wrote:
> On 17/12/2019 17:25, Roy Smith wrote:
>> I stopped short of actually building uniq.c from source (bootstrap,
>> prerequisites, ...), but looking at the code, it looks like the call chain
>> is:
>>
>> different()
>> xmemcoll()
>> memcoll()
>> strcoll()
>>
>> so I tried a little test at the strcoll() level:
>>
>> #include <stdio.h>
>> #include <unistd.h>
>> #include <string.h>
>>
>> int
>> main (int argc, char **argv)
>> {
>>   unsigned char null[] = {
>>     0342, 0201, 0277, 0341, 0265, 0230, 0313, 0241, 0313, 0241, 0
>>   };
>>   unsigned char iraq[] = {
>>     0334, 0245, 0334, 0235, 0334, 0252, 0334, 0220, 0334, 0251, 0
>>   };
>>
>>   printf("%s\n", null);
>>   printf("%s\n", iraq);
>>
>>   int m = strcoll(null, iraq);
>>   printf("m = %d\n", m);
>> }
This lacks setlocale, so the program runs in the default "C" locale,
where strcoll compares like strcmp.
Andreas.
--
Andreas Schwab, address@hidden
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."