bug-gnu-utils

gawk 3.1.4 and multibyte


From: Ricardo Erbano
Subject: gawk 3.1.4 and multibyte
Date: Wed, 6 Oct 2004 15:32:35 -0300 (BRT)


Hello,

I found a bug in gawk 3.1.4: field splitting with a regexp FS does not work correctly when the locale is set to UTF-8. Here is how I reproduce the problem:

$ export LANG=pt_BR.utf8

$ gawk --version
GNU Awk 3.1.4

$ cat gawk.txt
5       GPG_ERR_SOURCE_PINENTRY         Pinentry
6       GPG_ERR_SOURCE_SCD              SCD

$ od -tx1c -Ax gawk.txt
000000 35 09 47 50 47 5f 45 52 52 5f 53 4f 55 52 43 45
         5  \t   G   P   G   _   E   R   R   _   S   O   U   R   C   E
000010 5f 50 49 4e 45 4e 54 52 59 09 09 50 69 6e 65 6e
         _   P   I   N   E   N   T   R   Y  \t  \t   P   i   n   e   n
000020 74 72 79 0a 36 09 47 50 47 5f 45 52 52 5f 53 4f
         t   r   y  \n   6  \t   G   P   G   _   E   R   R   _   S   O
000030 55 52 43 45 5f 53 43 44 09 09 53 43 44 0a
         U   R   C   E   _   S   C   D  \t  \t   S   C   D  \n
00003e
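
The dump above shows that the fields are separated by tabs (0x09): one after the first field and two before the last. For anyone wanting to reproduce the report, here is a minimal sketch that recreates the file byte for byte. It assumes a POSIX shell with printf and mktemp; the function name make_sample and the temporary file are illustrative only and not part of the original report.

```shell
# Recreate the sample file from the od dump: 0x09 is a tab, 0x0a a newline.
# make_sample is a hypothetical helper name, used here for illustration.
make_sample() {
    printf '5\tGPG_ERR_SOURCE_PINENTRY\t\tPinentry\n' > "$1"
    printf '6\tGPG_ERR_SOURCE_SCD\t\tSCD\n' >> "$1"
}

sample=$(mktemp)
make_sample "$sample"

# The dump ends at offset 0x3e, so the file should be 62 bytes long.
wc -c < "$sample"
```

Running od -tx1c -Ax on the generated file should give the same dump as above.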

$ gawk  'BEGIN { FS="[\t]+" } { print $1 }' gawk.txt
5
6

$ gawk  'BEGIN { FS="[\t]+" } { print $2 }' gawk.txt
GPG_ERR_SOURCE_PINENTRY
GPG_ERR_SOURCE_SCD              SCD

$ gawk  'BEGIN { FS="[\t]+" } { print $3 }' gawk.txt
Pinentry
[ blank line ]

but if I set the locale without UTF-8 and use the same file:

$ export LANG=pt_BR

$ gawk  'BEGIN { FS="[\t]+" } { print $1 }' gawk.txt
5
6

$ gawk  'BEGIN { FS="[\t]+" } { print $2 }' gawk.txt
GPG_ERR_SOURCE_PINENTRY
GPG_ERR_SOURCE_SCD

$ gawk  'BEGIN { FS="[\t]+" } { print $3 }' gawk.txt
Pinentry
SCD

or if I use the older gawk version, with or without UTF-8:

$ gawk --version
GNU Awk 3.1.3

$ gawk  'BEGIN { FS="[\t]+" } { print $1 }' gawk.txt
5
6

$ gawk  'BEGIN { FS="[\t]+" } { print $2 }' gawk.txt
GPG_ERR_SOURCE_PINENTRY
GPG_ERR_SOURCE_SCD

$ gawk  'BEGIN { FS="[\t]+" } { print $3 }' gawk.txt
Pinentry
SCD

the output is correct.
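
The comparison above can also be scripted. The following is only a sketch under some assumptions: gawk is on PATH (it falls back to any awk so that the script still runs), the pt_BR and pt_BR.utf8 locales are installed, and LC_ALL is used instead of LANG so that no other locale variable interferes. The names AWK, sample and field are illustrative and not part of the original report.

```shell
# Assumption: gawk is on PATH; fall back to any awk so the sketch still runs.
AWK=${AWK:-$(command -v gawk || command -v awk)}

# Same content as gawk.txt: tab-separated, two tabs before the last field.
sample=$(mktemp)
printf '5\tGPG_ERR_SOURCE_PINENTRY\t\tPinentry\n6\tGPG_ERR_SOURCE_SCD\t\tSCD\n' > "$sample"

# Print field number $1 of every record, running awk under locale $2.
field() {
    LC_ALL=$2 "$AWK" -v n="$1" 'BEGIN { FS="[\t]+" } { print $n }' "$sample"
}

# Use the fields as split in the C locale as the reference result.
for loc in pt_BR pt_BR.utf8; do
    for n in 1 2 3; do
        if [ "$(field "$n" "$loc")" = "$(field "$n" C)" ]; then
            echo "$loc \$$n: same as C locale"
        else
            echo "$loc \$$n: DIFFERS from C locale"
        fi
    done
done
```

On an affected gawk 3.1.4, the pt_BR.utf8 lines for $2 and $3 would be expected to be reported as differing. If a locale is not installed, the run for it typically falls back to the C locale and will not show the problem.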

Best Regards,
Ricardo Erbano
Conectiva Linux
Development and Research







