emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Question collaborative editing.


From: Ergus
Subject: Re: Question collaborative editing.
Date: Thu, 1 Oct 2020 01:11:59 +0200

On Wed, Sep 30, 2020 at 09:08:27PM +0300, Eli Zaretskii wrote:
From: Qiantan Hong <qhong@mit.edu>
CC: Ergus <spacibba@aol.com>, Fermin <fmfs@posteo.net>,
        Jean Louis
        <bugs@gnu.support>, Noam Postavsky <npostavs@gmail.com>,
        Emacs developers
        <emacs-devel@gnu.org>,
        Karl Fogel <kfogel@red-bean.com>,
        Stefan Monnier
        <monnier@iro.umontreal.ca>
Date: Wed, 30 Sep 2020 17:48:13 +0000

> I'm sorry, I probably miss too much background information here,
> because I don't really understand what is the issue you are describing
> here.  E.g., why is user intent important, when all you need is to
> keep buffer text synchronized between several sessions?

That’s the baseline for collaborative editing, there’s apparently
solution that satisfies this single property (convergence property)
but are extremely bad — e.g. the effect of some user input takes
no effect, or a trivial program displaying an empty buffer for
every user. Actually if we only want this property there’s much
simpler way than CRDT/OT, e.g. let the server (or one client)
send the whole buffer periodically.

One example of diff not preserving user intent is:
suppose in the buffer there is
“one two one”
And users agree that this sentence should always end in “one”.
Now user A deleted the first two words “one two “
user B inserted “three” between the first “one” and “two”
However, the diff have no idea about users’ agreement
and what exactly user does (unless we get it from modification
hooks). For user A, the diff just see “one” after and assume
that A delete the suffix “two one”. The CRDT then does it job
correctly and gives final synchronization result “one three”.
Clearly, user intent is violated.

I understand all that, but (a) you can overcome that by techniques
different from CRDT, and (b) unless we mean very different things by
"user intent", the intent is not important here.  What is important,
AFAIU, is to specify the changes in a way that will produce the
desired result even if the buffer in the other Emacs was meanwhile
modified.

Hi Eli:

I will probably describe the obvious, but it seems that there is a miss
communication here.

The CRDT algorithm works in a way that every single character in the
buffer has a unique ID if you have "ab" then a could be zero and b could
be 1... Inserting a char between them (acb) will make c be 0.5; adding
another (acdb) will make d=0.75 (this is just a simplified example)

So adding a letter only needs to contain the new number and the char to
insert {0.5; c} {0.75; d}. And to insert "c" we only need to find which
char in the buffer are x < 0.5 and x+1 > 0.5.

If I understood the Qiantan idea; his approach is to add such ID as a
text property within the emacs buffer. To modify the buffer it is only
needed to search (go to) the property ID of text chars/words whatever in
the emacs buffer and perform the action. So the error probability is
smaller because the IDs ARE in the text itself.

This will reduce undetected errors inserting in the wrong positions when
translating form the CRDT ID to the real "global" buffer position to do
an "insert". Which could happen in some corner cases... (lets say
basically synchronization errors between local modifications (which
modify local absolute positions) and processing remote ones using
outdated information (translating to global indices)...

Said that; the external library approach seemed to work with tandem
https://github.com/jscheid/tandem-emacs

I am not sire how well it did.

My previous concerns about the internal storage and performance was
because the linear search of text properties in a long linear buffer may
be very expensive as described in the refereed paper; so using a two
steps search improved performance as described there... There are also
other optimizations like grouping operations or remove redundant ones
that with the C approach could reduce the operation that arrive to
emacs.  Probably in our case the improvement may be to use a bisection
search and position hints.

Any way. I am pretty sure that the Qiantan lisp only implementation
could work; but performance is something that needs to be tested
(hopefully performance issues will be negligible compared to network
latency).

The C library implementation was my utopic ideal implementation in order
to make it efficient, portable and reusable with other editors... But it
will somehow require to start a completely independent project from
scratch and would require much more time to implement (algorithm,
network handling, json, synchronization between clients and with
emacs)... And basically emacs provides all that. Maybe this time simple
is not better, but more realistic.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]