Brief Introduction to Event Graph Walker (Eg-Walker)
Eg-walker is a novel CRDT algorithm introduced in:
Collaborative Text Editing with Eg-walker: Better, Faster, Smaller (opens in a new tab)
By: Joseph Gentle, Martin Kleppmann
Whether dealing with real-time collaboration or multi-end synchronization, a directed acyclic graph (DAG) forms over the history of these parallel edits, similar to Git's history. The Eg-walker algorithm records the history of user edits on the DAG. Unlike conventional CRDTs, Eg-walker can record just the original description of operations, not the metadata of CRDTs.
For instance, in text editing scenarios, the RGA algorithm (opens in a new tab) needs the op ID and Lamport timestamp (opens in a new tab) of the character to the left to determine the insertion point. Yjs (opens in a new tab)/Fugue, however, requires the op ID of both the left and right characters at insertion. In contrast, Eg-walker simplifies this by only recording the index at the time of insertion. Loro, which uses Fugue (opens in a new tab) upon Eg-walker, inherits these advantages.
An index is not a stable position descriptor, as the index of an operation can
be affected by other operations. For example, if you highlight content from
index=x
to index=y
, and concurrently someone inserts n characters at
index=n
where n<x
, then your highlighted range should shift to cover from
x+n
to y+n
. However, Eg-walker can determine the exact position of this index and
reconstruct the corresponding CRDT structure by replaying history.
Reconstructing history might seem time-consuming, but Eg-walker can backtrack only some. When merging updates from remote, it only needs to replay the operations between the current version and the remote version up to their lowest common ancestor, constructing a temporary CRDTs to calculate the effect of the remote update.
The Eg-walker algorithm excels with its fast local update speeds and eliminates concerns about tombstone collection in CRDTs. For instance, if an operation has been synchronized across all endpoints, no new operations will occur concurrently with it, allowing it to be safely removed from the history.