Re: [gawk] printf does not recognize .PREC if locale is en_US.UTF-8
From: Aharon Robbins
Subject: Re: [gawk] printf does not recognize .PREC if locale is en_US.UTF-8
Date: Fri, 01 Jan 2010 11:46:32 +0200
Greetings. Re: this:
> Date: Thu, 31 Dec 2009 13:32:39 +0200
> From: tczy <address@hidden>
> To: address@hidden
> Subject: [gawk] printf does not recognize .PREC if locale is en_US.UTF-8
>
> *** ISSUE AND HOW TO REPRODUCE (IT): ***
>
> echo nothing | awk '{printf "%.3s", "foobar"}'
>
> produces 'foobar' if LC_ALL is en_US.UTF-8. Other variations of the same
> program (with awk 'BEGIN{printf ...', etc.) produce the same. If LC_ALL
> is set to C, everything is fine.
>
> awk '{a=sprintf("%.3s", "foobar"); print a}'
>
> also has this issue.
>
> IRC reports 3.1.5 working well with UTF locale.
>
> *** SYSTEM INFO ***
>
> % gawk --version
> GNU Awk 3.1.7
>
> % uname -a
> Linux sidep.ath.cx 2.6.31-ARCH #1 SMP PREEMPT Tue Nov 10 19:01:40 CET 2009 x86_64 Intel(R) Core(TM)2 Duo CPU T5670 @ 1.80GHz GenuineIntel GNU/Linux
>
> Also GLibC 2.11.1.
It is indeed a bug. Dealing with multibyte characters in general has been
a continuing source of pain. Attached is a patch. It will wend its way
into the Savannah CVS shortly.
Happy New Year!
Arnold
---------------------------------------------------------------------------------
Fri Jan  1 11:41:50 2010  Arnold D. Robbins  <address@hidden>

	* builtin.c (format_tree): At pr_tail, remember to take the precision
	into account when determining how many characters to copy out.
	Thanks to tczy <address@hidden> for the bug report.

Index: builtin.c
===================================================================
RCS file: /d/mongo/cvsrep/gawk-stable/builtin.c,v
retrieving revision 1.38
diff -u -r1.38 builtin.c
--- builtin.c 21 Nov 2009 21:16:50 -0000 1.38
+++ builtin.c 1 Jan 2010 09:40:49 -0000
@@ -1223,9 +1223,18 @@
 			if (fw == 0 && ! have_prec)
 				;
 			else if (gawk_mb_cur_max > 1 && (cs1 == 's' || cs1 == 'c')) {
+				int nchars_needed = 0;
+
 				assert(cp == arg->stptr || cp == cpbuf);
-				copy_count = mbc_byte_count(arg->stptr,
-						cs1 == 's' ? arg->stlen : 1);
+
+				if (cs1 == 'c')
+					nchars_needed = 1;
+				else if (have_prec)
+					nchars_needed = prec;
+				else
+					nchars_needed = arg->stlen;
+
+				copy_count = mbc_byte_count(arg->stptr, nchars_needed);
 			}
 			bchunk(cp, copy_count);
 			while (fw > prec) {
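
For readers who want to see the idea in isolation: the patch picks a character count (1 for %c, the precision if one was given, otherwise the whole string) and only then converts that count to bytes. The sketch below is a standalone illustration, not gawk code. Both utf8_prefix_bytes() and chars_needed() are hypothetical names invented here, and the byte counter assumes well-formed UTF-8 rather than going through the locale the way gawk's mbc_byte_count() presumably does.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (NOT gawk's mbc_byte_count): return how many bytes
 * the first `nchars` characters of a UTF-8 string occupy, never looking
 * past `len` bytes. Assumes the input is well-formed UTF-8.
 */
static size_t
utf8_prefix_bytes(const char *s, size_t len, size_t nchars)
{
	size_t i = 0;

	assert(s != NULL);

	while (i < len && nchars > 0) {
		i++;	/* consume the lead byte of this character */
		/* consume any continuation bytes, which look like 10xxxxxx */
		while (i < len && ((unsigned char) s[i] & 0xC0) == 0x80)
			i++;
		nchars--;
	}
	return i;
}

/*
 * Hypothetical helper mirroring the decision the patched branch makes:
 * how many characters should be copied out for %c or %s.
 */
static size_t
chars_needed(char conv, int have_prec, size_t prec, size_t nchars_in_arg)
{
	if (conv == 'c')
		return 1;
	if (have_prec)
		return prec;
	return nchars_in_arg;
}
```

With these definitions, utf8_prefix_bytes("foobar", 6, chars_needed('s', 1, 3, 6)) yields 3, so "%.3s" would copy only "foo". The pre-patch logic corresponds to always passing the full string length for %s, which is why the precision was ignored in multibyte locales.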