bug-gnu-emacs

bug#34525: replace-regexp missing some matches


From: Alan Mackenzie
Subject: bug#34525: replace-regexp missing some matches
Date: Sun, 24 Feb 2019 21:00:58 +0000
User-agent: Mutt/1.10.1 (2018-07-13)

Hello, Eli.

On Sun, Feb 24, 2019 at 19:56:13 +0200, Eli Zaretskii wrote:
> > Date: Sun, 24 Feb 2019 17:37:46 +0000
> > Cc: daniel.lopez999@gmail.com, 34525@debbugs.gnu.org,
> >   Stefan Monnier <monnier@iro.umontreal.ca>
> > From: Alan Mackenzie <acm@muc.de>

> > The query-replace command ends up calling re-search-forward.
> > Fre_search_forward ends up calling re_search_2 (which is called
> > rpl_re_search_2 in gdb.  :-( ).

> > This calls re_match_2_internal, which scans through the compiled regexp,
> > "\<Bitmap\>".

> > Up till now, we have said yes to replace the first Bitmap with
> > SharedBitmap in query-replace.  Emacs is now seeking out the second
> > occurrence of Bitmap, which is on L69 of the OP's test file, and looks
> > like "Bitmap<", where the < has a syntax-table text property of (4 . 62),
> > an opening paren which matches ">".

> > re_match_2_internal finds its way to case wordbeg: to handle the "\<" of
> > the regexp.  It invokes UPDATE_SYNTAX_TABLE (charpos) to get the syntax
> > for the "B" it has already found.

> > Sadly, UPDATE_SYNTAX_TABLE sets its internal structure gl_state not for
> > the current contents of position 1948, but the contents of 1948 before
> > the change at the top of the buffer (Bitmap -> SharedBitmap) was made.
> > So it picks up the syntax for the "<" rather than the "B".

> Are you saying that we've modified buffer text, but
> re_match_2_internal still holds to a C pointer to buffer text before
> the change?

I don't think that's the case.  The relevant buffer pointers/sizes are
calculated (in search_buffer_re) as

    p1 = BEGV_ADDR;
    s1 = GPT_BYTE - BEGV_BYTE;
    p2 = GAP_END_ADDR;
    s2 = ZV_BYTE - GPT_BYTE;

each time before a search.
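
To illustrate why recomputing those four values before every search
keeps the C pointers fresh, here is a minimal sketch of a toy gap
buffer.  It is not the Emacs implementation: struct toy_buffer,
toy_init, toy_insert_at_gap and toy_compute_segments are hypothetical
names, and the fixed 64-byte capacity is an assumption made to keep the
example short.  p1/s1 and p2/s2 only mirror the roles of the variables
above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CAPACITY 64

/* Toy gap buffer (hypothetical, not Emacs's struct buffer): text lives
   in data[0 .. gpt) and data[gpt + gap_size .. TOY_CAPACITY).  */
struct toy_buffer {
  char data[TOY_CAPACITY];
  size_t gpt;      /* bytes of text before the gap */
  size_t gap_size; /* bytes currently in the gap */
  size_t z;        /* total bytes of text */
};

/* The two text segments a search would be handed.  */
struct toy_segments {
  const char *p1; size_t s1; /* text before the gap */
  const char *p2; size_t s2; /* text after the gap */
};

/* Place TEXT in the buffer with the gap at offset GAP_AT.  */
static void toy_init(struct toy_buffer *b, const char *text, size_t gap_at)
{
  size_t len = strlen(text);
  assert(len <= TOY_CAPACITY && gap_at <= len);
  b->gap_size = TOY_CAPACITY - len;
  b->gpt = gap_at;
  b->z = len;
  memcpy(b->data, text, gap_at);
  memcpy(b->data + gap_at + b->gap_size, text + gap_at, len - gap_at);
}

/* Insert TEXT at the gap; the gap shrinks and gpt advances.  */
static void toy_insert_at_gap(struct toy_buffer *b, const char *text)
{
  size_t n = strlen(text);
  assert(n <= b->gap_size);
  memcpy(b->data + b->gpt, text, n);
  b->gpt += n;
  b->gap_size -= n;
  b->z += n;
}

/* Recompute both segments from the buffer's current state.  Calling
   this before each search means no stale pointer survives an edit.  */
static struct toy_segments toy_compute_segments(const struct toy_buffer *b)
{
  struct toy_segments s;
  s.p1 = b->data;
  s.s1 = b->gpt;
  s.p2 = b->data + b->gpt + b->gap_size;
  s.s2 = b->z - b->gpt;
  return s;
}
```

With the gap at the start of "Bitmap x; Bitmap<T>", inserting "Shared"
at the gap and calling toy_compute_segments again yields s1 == 6, while
p2 and s2 still describe the original text after the gap.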

> If so, it's a simple matter of recomputing the C pointer using the
> buffer position after the change, right?  We do such things in a few
> places, like coding.c, by recording the offset of the text before the
> change and reapplying it after the change.
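
The record-the-offset technique mentioned here can be sketched in
isolation.  This is not the code from coding.c; struct toy_text,
toy_text_append and toy_append_keeping_cursor are hypothetical names,
and a realloc'd string stands in for buffer text that may move when it
is changed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable text; base may move whenever it grows.  */
struct toy_text {
  char *base;
  size_t len;
};

/* Append S to T.  realloc may relocate the storage, which invalidates
   any raw pointer previously taken into T->base.  Returns 0 on
   success, -1 on allocation failure.  */
static int toy_text_append(struct toy_text *t, const char *s)
{
  size_t n = strlen(s);
  char *nb = realloc(t->base, t->len + n + 1);
  if (nb == NULL)
    return -1;
  memcpy(nb + t->len, s, n + 1); /* copies the terminating NUL too */
  t->base = nb;
  t->len += n;
  return 0;
}

/* Append S while keeping *CURSOR valid: record its offset from the
   start of the text before the change, reapply it afterwards.
   T must already hold text, and *CURSOR must point into it.  */
static int toy_append_keeping_cursor(struct toy_text *t, const char *s,
                                     char **cursor)
{
  ptrdiff_t offset = *cursor - t->base; /* record before the change */
  if (toy_text_append(t, s) != 0)
    return -1;
  *cursor = t->base + offset;           /* reapply after the change */
  return 0;
}
```

This only covers text added after the cursor.  For an insertion before
the cursor, the offset itself would also have to grow by the inserted
length.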

> > I think the glitch is in the text property interval handling code.
> > It is as though after the replacement of Bitmap by SharedBitmap, the
> > interval starting positions have not been adjusted for the extra six
> > characters.
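
The suspected failure can be modelled with a toy interval table.  This
is a sketch under stated assumptions, not the code in intervals.c:
struct toy_interval, toy_syntax_at and toy_adjust_for_insertion are
hypothetical names, positions are 0-based, and every position without
an explicit property is assumed to have word syntax.

```c
#include <assert.h>
#include <stddef.h>

enum toy_syntax { TOY_WORD, TOY_OPEN_PAREN };

/* A property covering the half-open position range [start, end).  */
struct toy_interval {
  size_t start, end;
  enum toy_syntax syntax;
};

/* Syntax at POS; positions outside every interval default to word
   syntax (a simplifying assumption of this sketch).  */
static enum toy_syntax toy_syntax_at(const struct toy_interval *iv,
                                     size_t n, size_t pos)
{
  for (size_t i = 0; i < n; i++)
    if (pos >= iv[i].start && pos < iv[i].end)
      return iv[i].syntax;
  return TOY_WORD;
}

/* Shift intervals to account for LEN characters inserted at AT.
   Skipping this step after an edit leaves the table describing the
   text as it was before the insertion.  */
static void toy_adjust_for_insertion(struct toy_interval *iv, size_t n,
                                     size_t at, size_t len)
{
  for (size_t i = 0; i < n; i++) {
    if (iv[i].start >= at)
      iv[i].start += len;
    if (iv[i].end > at)
      iv[i].end += len;
  }
}
```

Take "Bitmap a; Bitmap<T>" with the open-paren property on the "<" at
position 16.  After "Shared" is inserted at position 0, the second "B"
sits at position 16.  A table that was never adjusted still reports
TOY_OPEN_PAREN there; after toy_adjust_for_insertion (iv, 1, 0, 6) the
property moves to position 22 and position 16 reads as TOY_WORD.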

> If the code has variables that record C pointers to buffer text, those
> need to be updated after every change, or else they will become
> invalid.

> But I'm surprised we have such blatant bugs in such veteran code, ....

The bug was introduced sometime between 25.3 and 26.1.  I tried to bisect
the commits between 25.2 and 26.1, but couldn't, because autogen.sh was
broken in lots of the pertinent commits, so I couldn't build these Emacs
versions.

> .... so I'm probably missing something.  Can you describe the above
> again, this time showing the relevant code fragments and variables
> involved in this?

I'm afraid my gdb session is too long and chaotic to extract anything
meaningful from.  I'll have to recreate it more purposefully to get
these results.  Not tonight!

We'll get this sorted out.

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).




