Re: Gawk match() strange behaviour
From: Alain Ketterlin
Subject: Re: Gawk match() strange behaviour
Date: Thu, 06 Sep 2007 23:07:01 +0200
User-agent: Internet Messaging Program (IMP) H3 (4.1.4) / FreeBSD-6.2
Hi, thanks for your help.
The following program:
{
r = match($0,/^ */,t);
print "R=" r " S=" RSTART " L=" RLENGTH;
}
produces this (< marks input, > marks output):
<
> R=-1208966831 S=-1208966831 L=1208966850
< random
> R=1 S=1 L=34
<   random
> R=1 S=1 L=2
I could not reproduce this using either stock gawk 3.1.5 or the current CVS
sources. I suggest that you try building from scratch from the CVS archive
on savannah.gnu.org.
For the empty line I get:
R=1 S=1 L=0
Things are getting strange (for me, I mean :). I just noticed that
the locale has an impact.
With gawk-3.1.5 (compiled from the tarball), under en_US.utf-8 I get:
- from an empty line: R=1 S=1 L=18
- from a line containing "random" (no space at beginning): R=1 S=1 L=34
- from "  random" (two spaces at beginning): R=1 S=1 L=2 (correct)
Under en_US.iso-8859-1, everything is ok. So it seems that utf-8
input is the problem.
With gawk-stable checked out from savannah, everything is correct,
under both locales.
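
For anyone who wants to repeat the comparison, here is a rough sketch of how the three inputs could be run under both locales. It is not the exact command line used above. The helper name probe_match and the AWK variable are made up for illustration, the locale names are assumed to be installed under these spellings, and the script uses the portable two-argument match() rather than the gawk-only three-argument form from the program above, so it may or may not trigger the same bug.

```shell
# Which awk to test; override with e.g. AWK=/path/to/gawk (hypothetical variable).
AWK=${AWK:-awk}

# probe_match LOCALE LINE
# Feeds LINE to awk under LOCALE and prints match()'s return value,
# RSTART and RLENGTH for the regex /^ */ (leading spaces).
probe_match() {
    printf '%s\n' "$2" | LC_ALL="$1" "$AWK" '{
        r = match($0, /^ */)
        print "R=" r " S=" RSTART " L=" RLENGTH
    }'
}

# Try the empty line, "random", and "  random" under each locale.
for loc in en_US.UTF-8 en_US.ISO-8859-1; do
    for line in "" "random" "  random"; do
        printf '%s [%s] ' "$loc" "$line"
        probe_match "$loc" "$line"
    done
done
```

A correct build should report the same values under both locales, with L equal to the number of leading spaces on each line.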
Thanks for your help.
-- Alain.