bug-gnu-utils

Re: Memory exhausted when doing a case-insensitive match to an empty reg


From: Aharon Robbins
Subject: Re: Memory exhausted when doing a case-insensitive match to an empty regexp (gawk)
Date: Fri, 09 Mar 2007 12:59:14 +0200

Greetings.  Sorry for the long delay in replying. I am finally working my way
through old emails.

This is now fixed in the CVS archive available on savannah.gnu.org.  Thanks
for the patch!

Arnold

> Date: Wed, 26 Oct 2005 15:42:07 -0700
> From: Tony Leneis <address@hidden>
> Subject: Re: Memory exhausted when doing a case-insensitive match to an empty
>  regexp (gawk)
> To: Karel Zak <address@hidden>
> Cc: address@hidden
>
> On Wed, Oct 26, 2005 at 04:31:44PM +0200, Karel Zak wrote:
> > On Tue, 2005-10-25 at 23:25 -0700, Tony Leneis wrote:
> > >   Gawk has started having problems with case-insensitive empty
> > > regexp matches sometime between version 3.1.1 and 3.1.4.  Here's what I
> > > see with gawk 3.1.4 and 3.1.5:
> > > 
> > > # gawk 'BEGIN { IGNORECASE=0; print "test" ~ "" }'
> > > 1
> > > # gawk 'BEGIN { IGNORECASE=1; print "test" ~ "" }'
> > > gawk: fatal: memory exhausted
> > > 
> > > When I try the same program under gawk 3.0.3, 3.1.0, and 3.1.1 I get a
> > > response of 1 regardless of how IGNORECASE is set.
> > 
> > It works for me:
> > 
> >     $ ./gawk 'BEGIN { IGNORECASE=1; print "test" ~ "" }';
> >     1
> >     $ ./gawk --version | head -1
> >     GNU Awk 3.1.5
> > 
> > Note that it's raw upstream version without any patch. You should try it
> > with gdb.
>
> The error is coming from dfacomp() in dfa.c (gawk itself isn't crashing.)
> This works:
>
> # GAWK_NO_DFA=1 gawk 'BEGIN { IGNORECASE=1; print "test" ~ "" }'
> 1
>
> The following code is near the top of dfacomp() and is run if case_fold
> is set:
>
>       lcopy = malloc(len);
>       if (!lcopy)
>         dfaerror(_("memory exhausted"));
>
> My guess is that len == 0, so malloc() is being asked to allocate a
> block of 0 bytes.  According to my copy of the standard C library, the
> behavior of malloc() in that case is implementation-defined.  The
> implementation on my system happens to return a NULL pointer, which then
> triggers dfaerror().  Your implementation probably returned a unique,
> non-NULL pointer to 0 bytes that must not be dereferenced...
>
> Here is an extremely naive and barely tested patch that seems to solve
> the problem for me (I just treat an empty case insensitive regexp the
> same as an empty case sensitive regexp since both should be handled the
> same way.)  Note that this is just the quick hack I did to make gawk
> 3.1.5 work on my system, and is not necessarily the best way to solve
> the problem:
>
> --- dfa.c.orig  2005-10-26 22:20:10.000000000 +0000
> +++ dfa.c       2005-10-26 22:20:26.000000000 +0000
> @@ -3060,7 +3060,7 @@
>  void
>  dfacomp (char const *s, size_t len, struct dfa *d, int searchflag)
>  {
> -  if (case_fold)       /* dummy folding in service of dfamust() */
> +  if (case_fold && len)        /* dummy folding in service of dfamust() */
>      {
>        char *lcopy;
>        int i;
>
> For example, it might be better to do something in re.c to not use the
> dfa code when the regexp length is 0.
>
> -Tony
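
For readers following the malloc(0) discussion above, here is a minimal
sketch of the pattern behind the patch. It is not gawk's dfa.c code:
fold_copy() and its ok flag are hypothetical names made up for this
illustration. The point is only that a NULL result from malloc() should be
treated as "memory exhausted" when the requested length is nonzero, and that
a zero-length request can skip the allocation entirely.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not taken from gawk): return a lowercased copy of
   the LEN bytes at S.  *OK is set to 0 only on a real allocation failure.
   For LEN == 0 no allocation is attempted, so a malloc() that returns
   NULL for zero-byte requests cannot be mistaken for an out-of-memory
   condition.  */
char *
fold_copy (char const *s, size_t len, int *ok)
{
  char *copy;
  size_t i;

  *ok = 1;
  if (len == 0)
    return NULL;                /* nothing to fold; avoid malloc (0) */

  copy = malloc (len);
  if (!copy)
    {
      *ok = 0;                  /* genuine failure: len > 0 but no memory */
      return NULL;
    }

  for (i = 0; i < len; i++)
    copy[i] = (char) tolower ((unsigned char) s[i]);
  return copy;
}
```

A caller would check ok rather than the returned pointer, and report an
error only when ok is 0. The returned buffer is not NUL-terminated and
should be released with free().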
