Re: gawk: length return incorrect value when MB_CUR_MAX > 1
From: Hirofumi Saito
Subject: Re: gawk: length return incorrect value when MB_CUR_MAX > 1
Date: Wed, 30 Nov 2005 23:33:42 +0900
User-agent: Wanderlust/2.10.1 (Watching The Wheels) SEMI/1.14.6 (Maruoka) FLIM/1.14.6 (Marutamachi) APEL/10.6 Emacs/21.4 (i386-pc-linux-gnu) MULE/5.0 (SAKAKI)
KIMURA Koichi wrote:
> Hi,
> A certain user found the bug of gawk 3.1.5's length function.
> $ LANG=ja_JP.utf8 gawk 'BEGIN {print length("abc\0def")}'
> This script prints '3', not '7'. I have tested Windows and GNU/Linux
> (Fedora Core3).
I tried this script on Fedora Core 5 test 1 and got the same
result.
$ LANG=ja_JP.utf8 gawk 'BEGIN {print length("abc\0def")}'
3
$ LANG=ja_JP.UTF-8 gawk 'BEGIN {print length("abc\0def")}'
3
$ LANG=ja_JP.eucJP gawk 'BEGIN {print length("abc\0def")}'
3
Next, I tried this script on Debian GNU/Linux (sarge).
The version of gawk there is 3.1.4.
$ LANG=ja_JP.utf8 /usr/bin/gawk 'BEGIN{print length("abc\0def")}'
7
Then I tried gawk 3.1.5, which I built myself on sarge.
$ LANG=ja_JP.utf8 gawk 'BEGIN {print length("abc\0def")}'
7
$ LANG=ja_JP.eucJP gawk 'BEGIN {print length("abc\0def")}'
3
Does this mean the result depends on LANG?
By the way, after applying Kimura's patch I get:
$ LANG=ja_JP.utf8 gawk 'BEGIN {print length("abc\0def")}'
7
$ LANG=ja_JP.eucJP gawk 'BEGIN {print length("abc\0def")}'
7
That looks good: both locales now report 7.
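For illustration only, here is a minimal C sketch of how an embedded NUL can
truncate a character count in a multibyte code path. It is not gawk's code,
and it does not claim to show where gawk 3.1.5 goes wrong; the helper names
count_chars_nul_terminated and count_chars_with_length are hypothetical. The
first helper relies on the POSIX behaviour of mbstowcs() with a NULL
destination, which stops at the first NUL byte. The second walks a buffer of
known length with mbrtowc() and counts an embedded NUL as one character.
Both use whatever locale the caller has selected with setlocale().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: counts characters the NUL-terminated way.
   With a NULL destination, POSIX mbstowcs() returns the number of
   wide characters before the first NUL byte, so anything after an
   embedded NUL is never seen. */
size_t count_chars_nul_terminated(const char *s)
{
    assert(s != NULL);
    return mbstowcs(NULL, s, 0);
}

/* Hypothetical helper: counts characters in a buffer whose length is
   known, treating an embedded NUL as an ordinary one-byte character. */
size_t count_chars_with_length(const char *s, size_t len)
{
    mbstate_t st;
    size_t count = 0;
    size_t i = 0;

    assert(s != NULL);
    memset(&st, 0, sizeof st);
    while (i < len) {
        size_t n = mbrtowc(NULL, s + i, len - i, &st);
        if (n == 0) {
            /* mbrtowc() reports a NUL character; it occupies one byte. */
            n = 1;
        } else if (n == (size_t)-1 || n == (size_t)-2) {
            /* Invalid or incomplete sequence: count one byte and
               reset the conversion state (an assumption of this sketch). */
            memset(&st, 0, sizeof st);
            n = 1;
        }
        i += n;
        count++;
    }
    return count;
}
```

Called on the 7-byte buffer "abc\0def", the first helper sees only the part
before the NUL, while the second is passed the length 7 explicitly and walks
the whole buffer.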
Regards,
--
----+----1----+----2----+----3----+----4----+----5----+----6----+----7
Hirofumi Saito
Mail: address@hidden