bug-gnu-utils

Re: [Fwd: gawk bug]


From: Aharon Robbins
Subject: Re: [Fwd: gawk bug]
Date: Tue, 23 Nov 2004 14:16:57 +0200

Greetings. Re this:

> Date: Tue, 23 Nov 2004 10:13:07 +0100
> From: Eiso AB <address@hidden>
> To: address@hidden
> Subject: [Fwd: gawk bug]
>
> hi again(?),
>
> not sure if this got through yesterday, and I didn't include output.
>
> the script produces
> H99 N|
> instead of the expected
> H99 N
>
> so in the second gsub not all |'s
> are removed.
>
> goodluck, Eiso
>
>
> -------- Original Message --------
> Subject: gawk bug
> Date: Mon, 22 Nov 2004 18:20:42 +0100
> From: Eiso AB <address@hidden>
> To: address@hidden
> CC: address@hidden
> References: <address@hidden> <address@hidden> 
> <address@hidden>
>
>   hi Arnold,
>
> this produces a bug in gawk 3.1.4
>
>
> put this in an awk script (bug.awk)
> {
>            ( FILENAME ~ /[.]save$/ )
>            h=$2
>            gsub("[|]"," ",h)
>            x=$1
>            gsub("[|]"," ",x)
>            print x
> }
>
> then run as
>
> echo "|H99|N| |H99|HN|" > t.save ; gawk -f bug.awk t.save
>
>
>
> removal of:
> * ( FILENAME ~ /[.]save$/ )
> * gsub("[|]"," ",h)
> and change of the filename(!)
> will make the error disappear.
>
>
> very strange!
> goodluck,

It turns out that my current development code doesn't show the
problem.  Here is the patch.

I verified that dropping this into 3.1.4 fixes the problem.
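
For anyone who wants to check a build against the report, the quoted
reproduction can be wrapped up roughly as follows. This is only a sketch:
the AWK variable (for pointing at a particular binary, e.g. AWK=./gawk) and
the temporary directory are assumptions added for convenience, not part of
the original report. The patched code is guarded by MB_CUR_MAX > 1, so
presumably the failure only appears in a multibyte locale; the script
leaves the locale to the caller.

```shell
#!/bin/sh
# Reproduction sketch for the quoted report.
# AWK is a hypothetical override so a specific binary can be tested.
AWK=${AWK:-awk}

# Work in a scratch directory and clean it up on exit.
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT

# The script from the report, unchanged apart from whitespace.
cat > "$dir/bug.awk" <<'EOF'
{
    ( FILENAME ~ /[.]save$/ )
    h = $2
    gsub("[|]", " ", h)
    x = $1
    gsub("[|]", " ", x)
    print x
}
EOF

# The input file name must end in ".save", as in the report.
printf '%s\n' '|H99|N| |H99|HN|' > "$dir/t.save"

out=$("$AWK" -f "$dir/bug.awk" "$dir/t.save")

# Brackets make leading and trailing spaces visible.
printf '[%s]\n' "$out"
```

A correct awk replaces every | in the first field with a space, so the
bracketed line should contain no | at all. The failure described in the
report leaves a trailing | in place.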

Whew, I'm glad it was this easy!

Arnold
-----------------------
--- ../gawk-3.1.4/dfa.c 2004-07-26 17:11:41.000000000 +0300
+++ dfa.c       2004-10-21 17:12:19.000000000 +0200
@@ -2871,6 +2871,14 @@
   if (MB_CUR_MAX > 1)
     {
       int remain_bytes, i;
+#if 0
+      /*
+       * This caching can get things wrong:
+
+      printf "ab\n\tb\n" | LC_ALL=de_DE.UTF-8 ./gawk '/^[ \t]/ { print }'
+
+       * should print \tb but doesn't
+       */
       buf_begin -= buf_offset;
       if (buf_begin <= (unsigned char const *)begin && (unsigned char const *) end <= buf_end) {
        buf_offset = (unsigned char const *)begin - buf_begin;
@@ -2878,6 +2886,7 @@
        buf_end = end;
        goto go_fast;
       }
+#endif
 
       buf_offset = 0;
       buf_begin = begin;
@@ -2916,7 +2925,9 @@
       mblen_buf[i] = 0;
       inputwcs[i] = 0; /* sentinel */
     }
+#if 0
 go_fast:
+#endif
 #endif /* MBS_SUPPORT */
 
   for (;;)
@@ -2930,7 +2941,7 @@
             s1 = s;
            if (d->states[s].mbps.nelem != 0)
              {
-               /* Can match with a multibyte character( and multi character
+               /* Can match with a multibyte character (and multi character
                   collating element).  */
                unsigned char const *nextp;
 
@@ -3668,9 +3679,9 @@
  done:
   if (strlen(result))
     {
-      dm = (struct dfamust *) malloc(sizeof (struct dfamust));
+      MALLOC(dm, struct dfamust, 1);
       dm->exact = exact;
-      dm->must = malloc(strlen(result) + 1);
+      MALLOC(dm->must, char, strlen(result) + 1);
       strcpy(dm->must, result);
       dm->next = dfa->musts;
       dfa->musts = dm;



