bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#70077: An easier way to track buffer changes


From: Stefan Monnier
Subject: bug#70077: An easier way to track buffer changes
Date: Tue, 02 Apr 2024 11:17:41 -0400
User-agent: Gnus/5.13 (Gnus v5.13)

>> I'm not sure how to combine the benefits of combining small changes into
>> larger ones with the benefits of keeping distant changes separate.
> I am not sure if combining small changes into larger ones is at all a
> good idea.

In my experience, if the amount of work to be done "per change" is not
completely trivial, combining small changes is indispensable (the worst
part being that the cases where it's needed are sufficiently infrequent
that naive users of `*-change-functions` won't know about that, so we
end up with unacceptable slowdowns in corner cases).

It also keeps the API simpler since the clients only need to care about
at most one (BEG END BEFORE) item at any given time.

> The changes are stored in strings, which get allocated and
> re-allocated repeatedly.

Indeed.  Tho for N small changes, my code should perform only O(log N)
re-allocations.

> Repeated string allocations, especially when strings keep growing
> towards the buffer size, is likely going to increase consing and make
> GCs more frequent.

Similar allocations presumably take place anyway while running the code
(e.g. for the `buffer-undo-list`), so I'm hoping the effect will be
"lost in the noise".

But admittedly for code which does not need the `before` string at all
(such as `diff-mode.el`), it's indeed a waste of effort.  I haven't been
able to come up with an API which is still simple but without such costs.

> Aside...
> How nice would it be if buffer state and buffer text were persistent.
> Like in 
> https://code.visualstudio.com/blogs/2018/03/23/text-buffer-reimplementation

🙂

>> We could expose a list of simultaneous (and thus disjoint) changes,
>> which avoids the last problem.  But it's a fair bit more work for us, it
>> makes the API more complex for the clients, and it's rarely what the
>> clients really want anyway.
> FYI, Org parser maintains such a list.

Could you point me to the relevant code (I failed to find it)?

> We previously discussed a similar API in
> https://yhetil.org/emacs-bugs/87o7iq1emo.fsf@localhost/

IIUC this discusses a *sequence* of edits.  In the point to which you
replied I was discussing keeping a list of *simultaneous* edits.

This said, the downside in both cases is that the specific data that we
need from such a list tends to depend on the user.  E.g. you suggest

    (BEG END_BEFORE END_AFTER COUNTER)

but that is not sufficient reconstruct the corresponding buffer state,
so things like Eglot/CRDT can't use it.  Ideally for CRDT I think you'd
want a sequence of

    (BEG END-BEFORE STRING-AFTER)

but for Eglot this is not sufficient because Eglot needs to convert BEG
and END_BEFORE into LSP positions (i.e. "line+col") and for that it
needs to reproduce the past buffer state.  So instead, what Eglot needs
(and does indeed build using `*-change-functions`) is a sequence of

    (LSP-BEG LSP-END-BEFORE STRING-AFTER)

[ Tho it seems it also needs a "LENGTH" of the deleted chunk, not sure
  exactly why, but I guess it's a piece of redundant info the servers
  can use to sanity-check the data?  ]

>> But it did occur to me that we could solve the "disjoint changes"
>> problem in the following way: signal the client (from
>> `before-change-functions`) when a change is about to be made "far" from
>> the currently pending changes.
> This makes sense.

The code changes were actually quite simple, so I have now included it.
See my current PoC code below.


        Stefan

Attachment: track-changes.el
Description: application/emacs-lisp


reply via email to

[Prev in Thread] Current Thread [Next in Thread]