# FILE: pages/changelog/v1.0.0-beta.mdx

---
version: "v1.0.0"
title: "Release Loro v1.0.0"
date: 2024/10/21
# breakingChange: false
# category: ["Encoding", "Tree"]
---

We are excited to announce the release of Loro v1.0, a major milestone. It brings a stable encoding format, faster document import and export, better version control capabilities, and shallow snapshots. For more information, please check [the blog](https://loro.dev/blog/v1.0).

The following are the specific API changes:

## New

### LoroDoc

- `getChange(id: ID)`: Get the `ChangeMeta` for the given `ID`.
- `setDetachedEditing(flag: boolean)`: Enable editing in detached mode, which is disabled by default.
- `isDetachedEditingEnabled()`: Whether editing is enabled in detached mode.
- `setNextCommitMessage(msg: string)`: Set the commit message of the next commit.
- `shallowSinceVV()`: The version since which the doc contains history, as a version vector.
- `shallowSinceFrontiers()`: The version since which the doc contains history, as frontiers.
- `export(mode: ExportMode)`: Export the document based on the specified `ExportMode`. See more details [here](/docs/tutorial/encoding).
- `getDeepValueWithID()`: Get the deep value of the document, including container IDs.
- `subscribeLocalUpdates(callback: (bytes: Uint8Array) => void)`: Subscribe to updates from local edits.
- `getPathToContainer(id: ContainerID)`: Get the path from the root to the container.
- `JSONPath(jsonPath: string)`: Evaluate a JSONPath expression against a LoroDoc.
- `forkAt(frontiers: Frontiers): LoroDoc`: Create a new LoroDoc at the specified version (frontiers).
- `getPendingTxnLength(): number`: Get the number of operations in the pending transaction.
- `travelChangeAncestors(ids: ID[], callback: (meta: ChangeMeta) => boolean)`: Iterate over all changes, including those with the input IDs, in causal order; stop iterating if the callback returns `false`.

### LoroText

- `updateByLine(text: string)`: Update the current text to match the provided text, diffing line by line like git.
### LoroList

- `toArray(): ValueOrContainer[]`: Get the elements of the list. If a value is a child container, the corresponding `Container` is returned.
- `clear()`: Delete all elements in the list.

### LoroMovableList

- `toArray(): ValueOrContainer[]`: Get the elements of the list. If a value is a child container, the corresponding `Container` is returned.
- `clear()`: Delete all elements in the list.

### LoroMap

- `clear()`: Delete all key-value pairs in the map.

### LoroTree

- `enableFractionalIndex(jitter: number)`: Enable generating fractional indexes for tree positions, with the given jitter.
- `disableFractionalIndex()`: Disable fractional index generation for tree positions when you don't need the tree's siblings to be sorted. The fractional index will always be the default value.
- `isFractionalIndexEnabled()`: Whether fractional index generation is enabled for the tree.
- `isNodeDeleted(id: TreeID)`: Return `undefined` if the node does not exist; otherwise return whether the node has been deleted.
- `getNodes(prop: getNodesProp): LoroTreeNode[]`: Get the flat array of the forest. If `with_deleted` is true, deleted nodes are included.

### UndoManager

- `clear()`: Clear the undo and redo stacks of the `UndoManager`.

## Changes

### LoroDoc

- Moved `setFractionalIndexJitter()` to `LoroTree`; you can now enable or disable it per tree container.
- `import()`, `importWith()`, and `importJsonUpdates()` now return an `ImportStatus` indicating which ops were successfully applied and which are pending.
- Subscribing now returns a new `Subscription` type.
- In Loro 1.0, `doc.version()`, `doc.frontiers()`, `doc.oplogVersion()`, and `doc.oplogFrontiers()` indicate the latest version across all operations, even those that have not yet been committed.
- Renamed `Loro` to `LoroDoc`.

### LoroTree

- `contains(id: TreeID)`: Returns `true` even for a deleted node, as long as it exists in the internal state.
- `nodes()`: Deleted nodes are now included; you can use `isDeleted()` to filter them out.
- `toJSON()`: Now uses a hierarchical representation to express the tree structure.

## Deprecation

### LoroDoc

- `exportFrom(version)` and `exportSnapshot()` are deprecated; use `export(mode: ExportMode)` instead.

# FILE: pages/changelog/v1.3.0.mdx

---
version: "v1.3.0"
title: "Release Loro v1.3.0"
date: 2025/01/09
---

## New

- UndoManager's `onPush` can now access the change event.
- Added `getShallowValue` for each container.

### LoroDoc

- `toJsonWithReplacer(replacer: (k, v) => Value)`: Convert the document to a JSON value with a custom replacer function.
- `revertTo(frontiers: Frontiers)`: Revert the document to the given frontiers.
- `findIdSpansBetween(from: Frontiers, to: Frontiers)`: Find the op ID spans that lie between the `from` version and the `to` version.
- `exportJsonInIdSpan(idSpan: IdSpan)`: Export the readable `Change`s in the given `IdSpan`.

## Fix

- fix: prevent merging remote changes based on local `changeMergeInterval` config [#643](https://github.com/loro-dev/loro/pull/643)
- fix: should commit before travel_change_ancestors [#599](https://github.com/loro-dev/loro/pull/599)
- fix: panic when detach then attach [#592](https://github.com/loro-dev/loro/pull/592)
- fix: move child in current parent [#589](https://github.com/loro-dev/loro/pull/589)
- fix: panic when returned non-boolean value from text.iter(f) [#578](https://github.com/loro-dev/loro/pull/578)

# FILE: pages/changelog/v1.1.0.mdx

---
version: "v1.1.0"
title: "Release Loro v1.1.0"
date: 2024/11/09
---

## New

### LoroDoc

- `forkAt(frontiers: Frontiers)`: Fork the document at the given frontiers.
- `getChangedContainersIn(id: ID, len: number)`: Get the container IDs modified in the given ID range.

### LoroText

- `getEditorOf(pos: number)`: Get the editor of the text at the given position.
- `push(s: string)`: Push a string to the end of the text.
### LoroMap

- `getLastEditor(key: string)`: Get the peer ID of the last editor of the given entry.

### LoroList

- `getIdAt(pos: number)`: Get the ID of the list item at the given position.
- `pushContainer(child: Container)`: Push a container to the end of the list.

### LoroMovableList

- `getCreatorAt(pos: number)`: Get the creator of the list item at the given position.
- `getLastMoverAt(pos: number)`: Get the last mover of the list item at the given position.
- `getLastEditorAt(pos: number)`: Get the last editor of the list item at the given position.
- `pushContainer(child: Container)`: Push a container to the end of the list.

### LoroTree

- `toJSON()`: Get the JSON representation of the LoroTreeNode.

## Fix

- fix: get correct encode blob info [#545](https://github.com/loro-dev/loro/pull/545)
- fix: avoid creating non-root containers that don't exist via the get_container API [#541](https://github.com/loro-dev/loro/pull/541)
- fix: define the fork behavior when the doc is detached [#537](https://github.com/loro-dev/loro/pull/537)

# FILE: pages/changelog/inspector-v0.1.0.mdx

---
version: "v0.1.0"
title: "Release Loro Inspector v0.1.0"
date: 2025/04/30
---

Try it here: [Loro Inspector](https://inspector.loro.dev/)

Now you can directly browse the current state and complete edit history of your Loro documents in the browser. You can also use this tool to time travel to any version in the history of your Loro document.

import { ReactPlayer } from "components/video";

# FILE: pages/changelog/v1.5.0.mdx

---
version: "v1.5.0"
title: "Release Loro v1.5.0"
date: 2025/04/04
---

## New

### 1. New Hooks

`doc.subscribePreCommit(listener)` - Modify commit options before processing:

This hook is particularly useful because `doc.commit()` is often invoked implicitly by various methods such as `doc.import`, `doc.export`, `doc.checkout`, and `doc.exportJsonUpdates`. Without this hook, users attempting to add custom messages to each commit might miss these implicit commit triggers.
```ts
const doc = new LoroDoc();
doc.setPeerId(0);
doc.subscribePreCommit((e) => {
  e.modifier.setMessage("test").setTimestamp(Date.now());
});
doc.getList("list").insert(0, 100);
doc.commit();
expect(doc.getChangeAt({ peer: "0", counter: 0 }).message).toBe("test");
```

Advanced Example - Creating a Merkle DAG:

```ts no_run
const doc = new LoroDoc();
doc.setPeerId(0);
doc.subscribePreCommit((e) => {
  const changes = doc.exportJsonInIdSpan(e.changeMeta);
  expect(changes).toHaveLength(1);
  const hash = crypto.createHash("sha256");
  const change = {
    ...changes[0],
    deps: changes[0].deps.map((d) => {
      const depChange = doc.getChangeAt(idStrToId(d));
      return depChange.message;
    }),
  };
  hash.update(JSON.stringify(change));
  const sha256Hash = hash.digest("hex");
  e.modifier.setMessage(sha256Hash);
});

doc.getList("list").insert(0, 100);
doc.commit();
expect(doc.getChangeAt({ peer: "0", counter: 0 }).message).toBe(
  "2af99cf93869173984bcf6b1ce5412610b0413d027a5511a8f720a02a4432853",
);
```

`doc.subscribeFirstCommitFromPeer(listener)` - Triggers on first peer interaction:

This hook provides an ideal point to associate peer information (such as author identity) with the document.

```ts
const doc = new LoroDoc();
doc.setPeerId(0);
doc.subscribeFirstCommitFromPeer((e) => {
  doc.getMap("users").set(e.peer, "user-" + e.peer);
});
doc.getList("list").insert(0, 100);
doc.commit();
expect(doc.getMap("users").get("0")).toBe("user-0");
```

### 2. EphemeralStore

EphemeralStore is a better alternative to Awareness for ephemeral states:

Awareness is commonly used as a state-based CRDT for handling ephemeral states in real-time collaboration scenarios, such as cursor positions and application component highlights. As application complexity grows, Awareness may be set in multiple places, from cursor positions to user presence. However, the current version of Awareness doesn't support partial state updates, which means even minor mouse movements require synchronizing the entire Awareness state.
```ts no_run
awareness.setLocalState({
  ...awareness.getLocalState(),
  x: 167,
});
```

Since Awareness is primarily used in real-time collaboration scenarios where consistency requirements are relatively low, we can make it more flexible. We've introduced EphemeralStore as an alternative to Awareness. Think of it as a simple key-value store that uses timestamp-based last-write-wins for conflict resolution. You can choose the appropriate granularity for your key-value pairs based on your application's needs, and only modified key-value pairs are synchronized.

```ts
import {
  EphemeralStore,
  EphemeralListener,
  EphemeralStoreEvent,
} from "loro-crdt";

const store = new EphemeralStore();
// Set ephemeral data
store.set("user-alice", {
  anchor: 10,
  focus: 20,
  user: "Alice",
});

// Encode only the data under the "user-alice" key
const encoded = store.encode("user-alice");

const newStore = new EphemeralStore();
newStore.subscribe((e: EphemeralStoreEvent) => {
  // Listen to changes from `local`, `remote`, or `timeout` events
});
newStore.apply(encoded);

console.log(newStore.get("user-alice"));
// {
//   anchor: 10,
//   focus: 20,
//   user: "Alice"
// }
```

## Fix

- Fixed text styling at the trailing "\n" character.
- Added JSON support for the current transaction's operations.
- For environments that support multi-threading, such as Rust and Swift, LoroDoc can now be directly and safely shared and accessed in parallel across multiple threads without triggering the previous WouldBlock panic.

# FILE: pages/changelog/v1.4.7.mdx

---
version: "v1.4.7"
title: "Release Loro v1.4.7"
date: 2025/04/01
---

## New

- You can get the version of Loro via `LORO_VERSION`.
- `setNextCommitOrigin(origin: string)`: Set the origin of the next commit.
- `setNextCommitTimestamp(timestamp: number)`: Set the timestamp of the next commit.
- `setNextCommitOptions(options: CommitOption)`: Set the options of the next commit.
- `clearNextCommitOptions()`: Clear the options of the next commit.
- `configDefaultTextStyle(style: TextStyle)`: Configure the default text style for the document.
- `getUncommittedOpsAsJson()`: Get the pending operations from the current transaction in JSON format.

## Fix

- fix: memory leak issue [#647](https://github.com/loro-dev/loro/pull/647)
- fix: mark err on detached LoroText [#659](https://github.com/loro-dev/loro/pull/659)
- fix: detached loro text issues [#665](https://github.com/loro-dev/loro/pull/665)
- fix: entity index when the tree is empty [#670](https://github.com/loro-dev/loro/pull/670)

# FILE: pages/changelog/v1.2.0.mdx

---
version: "v1.2.0"
title: "Release Loro v1.2.0"
date: 2024/12/10
---

## New

- Added an `isDeleted()` method to all container types (Text, Map, List, Tree, etc.).

### LoroDoc

- `changeCount()`: Get the number of changes in the oplog.
- `opCount()`: Get the number of ops in the oplog.

### VersionVector

- `setEnd(id: ID)`: Set the exclusive ending point. The target ID will NOT be included.
- `setLast(id: ID)`: Set the inclusive ending point. The target ID will be included.
- `remove(peer: PeerID)`: Remove the given peer ID.
- `length()`: Get the number of peers in the VersionVector.

## Change

- `importUpdateBatch` now returns an `ImportStatus`.
- Fractional index is now enabled by default.

## Fix

- fix: getOrCreateContainer should not throw if value is null [#576](https://github.com/loro-dev/loro/pull/576)
- fix: dead loop when importing updates [#570](https://github.com/loro-dev/loro/pull/570)

# FILE: pages/changelog/v1.4.0.mdx

---
version: "v1.4.0"
title: "Release Loro v1.4.0"
date: 2025/02/13
---

## New

- Added `unsubscribe()` for `Subscription`.
## Fix

- fix: getting values by path in LoroTree [#643](https://github.com/loro-dev/loro/pull/643)
- fix: should be able to call subscription after diffing [#637]
- fix: update long text may fail [#633](https://github.com/loro-dev/loro/pull/633)
- fix: map.keys() may return keys from deleted entries [#618](https://github.com/loro-dev/loro/pull/618)

# FILE: pages/docs/examples.mdx

---
keywords: "crdt, example, get started, excalidraw"
description: "The examples of basic usage in Loro"
---

# Examples

You can find examples of basic usage in [Loro examples in Deno](https://github.com/loro-dev/loro-examples-deno).

### loro-prosemirror

GitHub: [loro-dev/loro-prosemirror](https://github.com/loro-dev/loro-prosemirror)

It provides seamless integration between Loro and ProseMirror, a powerful rich text editor framework. It includes:

- Document state synchronization with rich text support
- Cursor awareness and synchronization
- Undo/Redo support in collaborative editing

It can also be used with [Tiptap](https://tiptap.dev/), a popular rich text editor built on top of ProseMirror. This means you can easily add collaborative editing capabilities to your Tiptap-based applications.

### loro-codemirror

GitHub: [loro-dev/loro-codemirror](https://github.com/loro-dev/loro-codemirror)

This package provides integration between Loro and CodeMirror 6, a versatile code editor. It supports:

- Document state synchronization
- Cursor awareness synchronization
- Undo/Redo functionality

### loro-inspector

GitHub: [loro-dev/loro-inspector](https://github.com/loro-dev/loro-inspector)

import { ReactPlayer } from "components/video";

# FILE: pages/docs/advanced/inspector.mdx

# Loro Inspector

[Loro Inspector](https://inspector.loro.dev/) is an open-source web tool that helps developers debug and visualize Loro documents. It provides a user-friendly interface to inspect the state and history of Loro documents.
import { ReactPlayer } from "components/video";

# FILE: pages/docs/advanced/doc_state_and_oplog.md

---
keywords: "crdt, oplog, snapshot, doc state, checkout, version"
description: "Introducing Loro's DocState and OpLog concepts"
---

# DocState and OpLog

Although not explicitly exposed in the WASM interface, internally in Loro, we distinctly differentiate between:

- The current state of the document: DocState
- The edit history of the document: OpLog

During local operations, we update the DocState and record the operations in OpLog. When merging remote updates, we add the new ops to OpLog and compute a delta. This delta is applied to DocState and also emitted as an event.

DocState can switch between different versions, similar to Git's checkout. In this case, we calculate the delta based on the edit history. The same mechanism applies: the delta is emitted as an event and applied to DocState.

Impact on the encoding schema:

- When calling `doc.export({ mode: "update" })` or `doc.export({ mode: "update-in-range" })`, we only encode the operations that occurred after the specified version.
- When calling `doc.export({ mode: "snapshot" })` or `doc.export({ mode: "shallow-snapshot" })`, we encode both OpLog and DocState, providing rapid loading speed (as it doesn't require recalculating the DocState).

## Attached/Detached LoroDoc Status

As we aim to support version control and the ability to load OpLog without state, the version of DocState and the latest version recorded in OpLog may not always match. When they align, the doc is in an _attached_ state; otherwise, it's in a _detached_ state.
```ts
const doc = new LoroDoc();
doc.setPeerId(1);
doc.getText("text").insert(0, "Hello");
const doc2 = doc.fork(); // create a fork of the doc

console.log(doc.version().toJSON()); // Map(1) { "1" => 5 }
console.log(doc.oplogVersion().toJSON()); // Map(1) { "1" => 5 }

doc.checkout([{ peer: "1", counter: 1 }]);
console.log(doc.version().toJSON()); // Map(1) { "1" => 2 }
console.log(doc.oplogVersion().toJSON()); // Map(1) { "1" => 5 }

doc2.setPeerId(2);
doc2.getText("text").insert(5, "!");
doc.import(doc2.export({ mode: "update" }));
console.log(doc.version().toJSON()); // Map(1) { "1" => 2 }
console.log(doc.oplogVersion().toJSON()); // Map(2) { "1" => 5, "2" => 1 }
console.log(doc.isDetached()); // true

doc.attach();
console.log(doc.version().toJSON()); // Map(2) { "1" => 5, "2" => 1 }
console.log(doc.oplogVersion().toJSON()); // Map(2) { "1" => 5, "2" => 1 }
```

![DocState and OpLog Detached Example](./images/version-4.png)

The doc cannot be edited in detached mode. Users must call `attach()` to return to the latest version to continue editing.

### Attached/Detached Container Status

This refers to whether a container is associated with a document. If not, it's a detached container created by methods like `new LoroText()`.

The `.isAttached()` state never changes for a given container instance. When you insert a detached container into an attached one, you get a new attached container with the same content as the detached one.

> This is different from the LoroDoc's attached/detached status.

# FILE: pages/docs/advanced/event_graph_walker.mdx

---
keywords: "crdt, event graph walker, eg-walker, synchronization, collaboration"
description: "introduction to Event Graph Walker, a crdt algorithm for real-time collaboration and synchronization."
---

# Brief Introduction to Event Graph Walker (Eg-Walker)

Eg-walker is a novel CRDT algorithm introduced in:

> [Collaborative Text Editing with Eg-walker: Better, Faster, Smaller](https://arxiv.org/abs/2409.14252)
> By: Joseph Gentle, Martin Kleppmann

import { ReactPlayer } from "components/video";

Whether dealing with real-time collaboration or multi-device synchronization, a directed acyclic graph (DAG) forms over the history of these parallel edits, similar to Git's history. The Eg-walker algorithm records the history of user edits on the DAG. Unlike conventional CRDTs, Eg-walker can record just the original description of operations, not the metadata of CRDTs.

For instance, in text editing scenarios, the [RGA algorithm] needs the op ID and [Lamport timestamp][Lamport] of the character to the left to determine the insertion point. [Yjs]/Fugue, however, requires the op IDs of both the left and right characters at insertion. In contrast, Eg-walker simplifies this by only recording the index at the time of insertion. Loro, which builds [Fugue] upon Eg-walker, inherits these advantages.

An index is not a stable position descriptor, as the index of an operation can be affected by other operations. For example, if you highlight content from `index=x` to `index=y`, and concurrently someone inserts characters at a position before `x`, the highlighted range needs to shift accordingly.

Each insertion or deletion by the user generates an `op`, and multiple consecutive `op`s can merge into a larger `Change`. `Change`s log a `Timestamp`, but each `Change` can only associate with one `Timestamp`. Hence, if the time gap is too wide, merging `Changes` becomes impractical. However, treating each Change as new based on slight timestamp differences (e.g., key presses milliseconds apart) introduces significant additional overhead for each Change. Therefore, users can customize the `change merge interval` according to their needs.

Note that these settings do not persist in exported `Updates` or `Snapshots`.
Thus, if custom configurations are required, they must be reapplied upon each initialization of `LoroDoc`. Without timestamp recording enabled, the `Timestamp` defaults to the current maximum known `Timestamp`.

# FILE: pages/docs/advanced/op_and_change.md

# Operations and Change

In Loro, every basic operation, such as setting a key-value pair on a Map, adding a list item, or inserting/deleting a character in text, is considered an individual op. (Don't worry about the cost: in Loro's internal memory representation and export format, consecutive ops are merged into a larger op, such as consecutive text insertions and deletions.)

One or more local consecutive `Op`s constitute a `Change`, which includes the following information:

- ID: The ID of the Change is essentially the first op's ID.
- Timestamp: An optional timestamp, which can be enabled with `setRecordTimestamp(true)`. If not enabled, there is no extra storage overhead.
- Dependency IDs: Used to represent the causal order; the op IDs that the current Change directly depends on.
- Commit Message: An optional commit message (WIP, not yet released); when not enabled, there is no extra storage overhead.

Each time `doc.commit()` is called, a new `Change` is generated, which will be merged with the previous local `Change` as much as possible to reduce the amount of metadata that needs to be stored.

> Note: Each time you export, a `doc.commit()` is implicitly performed by the
> LoroDoc.

Unlike a Git commit, Loro's Change can be merged; it is neither atomic nor indivisible. This design allows Loro to better accommodate real-time collaboration scenarios (where each keystroke would have its own `doc.commit()`, which would be hugely costly if not merged) and asynchronous collaboration scenarios (like Git, which combines many modifications into one).

## When a New Change is Formed

> Note: You may not need to understand the content of this section, and the
> content may change in future versions.
> Skip it unless you want to understand Loro's internal implementation or aim for more extreme performance optimization.

By default, each commit-generated `Change` will merge with the previous local `Change`. However, there are exceptions in several cases:

- The current Change depends on a Change from a different peer. This occurs when local operations build upon recently applied remote operations. For example, deleting a character sequence that was just inserted by a remote peer. These causal relationships form a DAG (Directed Acyclic Graph). After importing remote updates, the next local Change will have new dependency IDs, necessitating a separate Change.
- When `setRecordTimestamp(true)` is set, if the time interval between successive Changes exceeds the "change merge interval" (default duration: 1000s).
- When the current Change has a different commit message from the previous Change by the same peer.

## Example

```ts
import { Change, LoroDoc } from "npm:loro-crdt@1.0.0-beta.5";

const docA = new LoroDoc();
docA.setPeerId("0");
const textA = docA.getText("text");
// This creates 3 operations
textA.insert(0, "123");
// This creates a new Change
docA.commit();
// This creates 2 operations
textA.insert(0, "ab");
// This will NOT create a new Change
docA.commit();

{
  const changeMap: Map<`${number}`, Change[]> = docA.getAllChanges();
  console.log(changeMap);
  // Output:
  //
  // Map(1) {
  //   "0" => [
  //     {
  //       lamport: 0,
  //       length: 5,
  //       peer: "0",
  //       counter: 0,
  //       deps: [],
  //       timestamp: 0
  //     }
  //   ]
  // }
}

// Create docB from docA
const docB = LoroDoc.fromSnapshot(docA.export({ mode: "snapshot" }));
docB.setPeerId("1");
const textB = docB.getText("text");
// This creates 2 operations
textB.insert(0, "cd");

// Import the Change from docB to docA
const bytes = docB.export({ mode: "update" }); // Exporting has an implicit commit
docA.import(bytes);

// This creates 1 operation
textA.insert(0, "1");
// Because docA imported a Change from docB, it will create a new Change for
// the new commit to record this causal order
docA.commit();
{
  const changeMap: Map<`${number}`, Change[]> = docA.getAllChanges();
  console.log(changeMap);
  // Output:
  //
  // Map(2) {
  //   "0" => [
  //     {
  //       lamport: 0,
  //       length: 5,
  //       peer: "0",
  //       counter: 0,
  //       deps: [],
  //       timestamp: 0
  //     },
  //     {
  //       lamport: 7,
  //       length: 1,
  //       peer: "0",
  //       counter: 5,
  //       deps: [ { peer: "1", counter: 1 } ],
  //       timestamp: 0
  //     }
  //   ],
  //   "1" => [
  //     {
  //       lamport: 5,
  //       length: 2,
  //       peer: "1",
  //       counter: 0,
  //       deps: [ { peer: "0", counter: 4 } ],
  //       timestamp: 0
  //     }
  //   ]
  // }
}
```

# FILE: pages/docs/advanced/cid.md

# Container ID

A Container ID is a unique identifier that comes in two forms:

- Root Container: Composed of a type and a root name
- Normal Container: Created through user operations; composed of an ID and a type

**Rust `ContainerID`**

```rust
pub enum ContainerID {
    Root {
        name: InternalString,
        container_type: ContainerType,
    },
    Normal {
        peer: PeerID,
        counter: Counter,
        container_type: ContainerType,
    },
}
```

**TypeScript `ContainerID`**

```typescript
export type ContainerID =
  | `cid:root-${string}:${ContainerType}`
  | `cid:${number}@${PeerID}:${ContainerType}`;
```

1. **Root Containers**
   - Created implicitly when accessing a root container for the first time (e.g., calling `doc.getText("text")`). No operation is generated in the history.
   - Uniquely identified by a string name and container type.
2. **Normal Containers**
   - Created explicitly through operations like `insertContainer` or `createNode`.
   - Generated automatically when applying operations that create child containers.
   - Contains the Operation ID of its creation within its Container ID.

## Container Overwrites

When initializing child containers in parallel, overwrites can occur instead of automatic merging.
For example:

```typescript
const doc = new LoroDoc();
const map = doc.getMap("map");

// Parallel initialization of child containers
const docB = doc.fork();
const textA = doc.getMap("map").setContainer("text", new LoroText());
textA.insert(0, "A");
const textB = docB.getMap("map").setContainer("text", new LoroText());
textB.insert(0, "B");

doc.import(docB.export({ mode: "update" }));
// Result: Either { "map": { "text": "A" } } or { "map": { "text": "B" } }
```

This behavior poses a significant risk of data loss if the editing history is not preserved. Even when the complete history is available and allows for data recovery, the recovery process can be complex. When a container holds substantial data or serves as the primary storage for document content, overwriting it can lead to the unintended hiding or loss of critical information. For this reason, it is essential to implement careful and systematic container initialization practices to prevent such issues.

### Best Practices

1. When containers might be initialized concurrently, prefer initializing them at the root level rather than as nested containers.
2. When using map containers:
   - If possible, initialize all child containers during the map container's initialization.
   - Avoid concurrent creation of child containers with the same key in the map container to prevent overwrites.

The overwrite behavior occurs because parallel creation of child containers results in different container IDs, preventing automatic merging of their contents.

# FILE: pages/docs/advanced/import_batch.md

# Batch Import

## Performance Differences and Their Causes

When importing multiple updates into a document, using `doc.importBatch(updates)` is significantly faster than importing updates individually. This performance difference stems from how data merging is handled in each approach.
```ts
import { LoroDoc } from "loro-crdt";

const doc = new LoroDoc();
doc.getText("text").update("Hello");
const update1 = doc.export({ mode: "update" });
const version = doc.version();
doc.getText("text").update("Hello World");
const update2 = doc.export({ mode: "update", from: version });

const newDoc1 = new LoroDoc();
newDoc1.importBatch([update1, update2]); // faster

const newDoc2 = new LoroDoc();
for (const update of [update1, update2]) {
  // slower
  newDoc2.import(update);
}
```

### Key Advantages of Import Batch

#### 1. Single Diff Calculation

The most significant advantage is that import batch performs only one diff calculation. In contrast, each individual import follows these steps:

- Merge remote updates into local history
- Calculate document state changes from the current version to the merged version
- Apply the diff to the current document state

This diff calculation has fixed overhead costs that accumulate with each import. `doc.importBatch(...)` performs only one diff calculation, which is faster than multiple individual diff calculations.

#### 2. Reduced Communication Overhead

Import batch also results in more concise events. Each individual import generates a new event, but `doc.importBatch(...)` generates only a single event that contains all the changes.

# FILE: pages/docs/advanced/version_deep_dive.md

---
keywords: "crdt, version, frontier, DAG, git, history, time travel"
description: "The concept of several versions in Loro, DAG, Frontiers, and Version Vectors"
---

# Loro's Versioning Deep Dive: DAG, Frontiers, and Version Vectors

In centralized environments, we can use linear version numbers to represent a version, such as an incrementing number or a timestamp. However, CRDTs can be used in decentralized environments, and their version representation is different. In Loro, you can express a document's version through a [Version Vector](https://en.wikipedia.org/wiki/Version_vector) or Frontiers.
```ts
doc.version(); // State Version Vector
doc.oplogVersion(); // OpLog Version Vector
doc.frontiers(); // State Frontiers
doc.oplogFrontiers(); // OpLog Frontiers
```

In most cases, you might only need the Version Vector, which can be used for data synchronization and version comparison. Frontiers are more space-efficient than Version Vectors but come with more limitations. The following sections detail these two different ways of describing versions.

## Background Knowledge

Loro belongs to the family of operation-based CRDTs. Two Loro documents are consistent if they have the same set of applied operations. A version is a compact way to describe which ops belong to this set.

Each operation has a unique ID, known as an OpId. For example, each user's deletion or insertion of characters has a corresponding unique OpId (don't worry about the overhead here; we've compressed them in memory and in the encoding). An OpId consists of a PeerId and a Counter.

> 💡 Note: To simplify, the specific type descriptions below may not be accurate
> (e.g., the type of peer), but this does not affect the understanding of the
> content.

```tsx
interface OpId {
  // Simplified here
  // PeerId is actually close to a uuid type; Loro uses u64
  peerId: number;
  counter: number;
}
```

PeerId represents the unique identifier of the device operating on the document, and Counter is a number that starts from 0 and increments on that device.

## Version Vector

A Version Vector is a vector that describes, for each peer, how many operations starting from 0 need to be included.

```tsx
// In TypeScript
type VersionVector = { [peerId in number]: number };
```

For example, for the following version vector A,

```tsx
const A = {
  0: 2,
  1: 3,
};
```

It includes:

- All operations where PeerId == 0 && Counter < 2
- Or PeerId == 1 && Counter < 3

This expression allows us to easily compare two versions and get the operations they are missing from each other.
For example, in the following case, suppose we know that the document on the server has reached version B, and the device has version A. To let the user's device get the Operations it is missing, ```tsx const A = { 0: 2, 1: 3, }; const B = { 0: 5, 1: 3, 2: 9, }; ``` We can easily calculate the Operations the device is missing: - All Ops where PeerId == 0 && 2 ≤ Counter < 5 - Or PeerId == 2 && Counter < 9 Version Vectors help us identify a version and allow us to quickly calculate the differences between two versions (quickly calculating the set of Ops missing in two documents to be synchronized). For example, currently in Loro, you can use the following method to synchronize two documents, where they only import the Ops missing from each other: ```tsx // doc.version() returns the version vector of the doc a.import(b.export({ mode: "update", from: a.version() })); b.import(a.export({ mode: "update", from: b.version() })); ``` ## Directed Acyclic Graph (DAG) History In Loro, we record an operation history similar to Git's Directed Acyclic Graph (DAG). We represent DAG by recording Dependencies links on each Op. Like document states, when the Op set is the same, the DAG is consistent across all endpoints. ![DAG Illustration](./images/version-3.png) When a new Op is added, its Dependencies link points to the current Operations that are not being pointed to. > 💡 Note: The order of Ops is a partial order relationship. An Op may be > before, after, or parallel to another Op. The structure of the DAG allows us > to compactly express this partial order. DAG history information helps us design some special CRDT algorithms and facilitates time-travel implementation. Moreover, when automatic merges don't meet expectations, DAG allows users a better chance to manually resolve potential conflicts. > 💡 We use Counter@PeerId for a more compact description of OpId. 
Below is an example of how a DAG is formed:

![DAG Formation Example](./images/version-0.png)

## Frontiers

Version Vectors grow with the number of Peers. To prevent PeerId conflicts, most CRDT libraries use a new random PeerId each time a document is opened, which increases the overhead of Version Vectors.

> 📌 If PeerId repeats, there's a high probability of duplicate OpId, potentially leading to inconsistent CRDT documents. Automerge has solved this problem, ensuring document consistency even with duplicate OpId, but at the cost of higher sync complexity.

Based on DAG information, we have a more compact way to describe versions: given one or multiple OpIds, include all operations with a Causal Order less than or equal to them. We call this method Frontiers.

```tsx
type Frontiers = OpId[];
```

> 💡 We use Counter@PeerId for a more compact description of OpId.

![Frontiers Illustration](./images/version-1.png)

`Frontiers = [3@0]` in the above DAG includes 6 operations:

- PeerId == 0 && Counter < 4
- PeerId == 1 && Counter < 2

It is equivalent to the Version Vector:

```tsx
{
  0: 4,
  1: 2,
}
```

Frontiers can also specify multiple parallel operations:

![Parallel Operations Example](./images/version-2.png)

In this example, `Frontiers = [1@1, 1@2]`, and the corresponding Version Vector is:

```tsx
{
  0: 1,
  1: 2,
  2: 2,
}
```

One great property of Frontiers as a version is that `Frontiers = [A]` represents the document version right after operation A was executed. Thus, combined with Loro's time-travel capability, you can easily return to the document version corresponding to any specific operation.
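The causal-closure reading of Frontiers can be sketched with a toy dependency table mirroring the `[3@0]` example above. This is a hand-rolled illustrative model, not Loro's API; the dependency table is invented for the example:

```typescript
interface OpId {
  peer: number;
  counter: number;
}
type VersionVector = { [peer: number]: number };

// Toy dependency table for the `[3@0]` example; each op also lists its
// same-peer predecessor explicitly.
const deps: Record<string, OpId[]> = {
  "0@0": [],
  "1@0": [{ peer: 0, counter: 0 }],
  "2@0": [{ peer: 0, counter: 1 }],
  "3@0": [{ peer: 0, counter: 2 }, { peer: 1, counter: 1 }],
  "0@1": [{ peer: 0, counter: 0 }], // peer 1 branched off after 0@0
  "1@1": [{ peer: 1, counter: 0 }],
};

// Walk every causal ancestor of the frontier ops and record, per peer,
// the highest counter reached.
function frontiersToVV(frontiers: OpId[]): VersionVector {
  const vv: VersionVector = {};
  const visited = new Set<string>();
  const stack = [...frontiers];
  while (stack.length > 0) {
    const op = stack.pop()!;
    const key = `${op.counter}@${op.peer}`;
    if (visited.has(key)) continue;
    visited.add(key);
    vv[op.peer] = Math.max(vv[op.peer] ?? 0, op.counter + 1);
    stack.push(...(deps[key] ?? []));
  }
  return vv;
}

// Equivalent to the version vector { 0: 4, 1: 2 } shown above.
console.log(frontiersToVV([{ peer: 0, counter: 3 }]));
```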
Here is a complete example:

```tsx
const doc = new LoroDoc();
doc.setPeerId(0n);
const text = doc.getText("text");
text.insert(0, "H");
doc.commit();
const v = doc.frontiers();
text.insert(1, "i");
expect(doc.toJSON()).toStrictEqual({
  text: "Hi",
});

doc.checkout([{ peer: 0n, counter: 0 }]);
expect(doc.toJSON()).toStrictEqual({
  text: "H",
});
```

You can convert between Version Vectors and Frontiers in Loro if the specified version is included in the document's history:

```tsx
const frontiers = doc.vvToFrontiers(doc.version());
const vv = doc.frontiersToVV(doc.frontiers());
```

# FILE: pages/docs/advanced/undo.mdx

---
keywords: "crdt, undo, redo, undo manager, cursor, selection, collaborate"
description: "how to use loro undo manager and show all APIs of loro undo manager."
---

# Undo

We provide an `UndoManager` that helps you manage the undo/redo stack and can be used during concurrent editing. It also supports cursor position transformation. In the case of concurrent editing, you will want the undo/redo to be **local**, meaning it should only undo/redo local operations without affecting other collaborators' edits.

### Why Local Undo/Redo?

If undo/redo is global, it often does not meet expectations. Consider the following scenario:

- **User A and User B are collaborating**: They are likely editing different parts of the document.
- **User A performs an undo**: If this undoes User B's operations, User A might see no changes and think the undo failed, as User B might be editing content outside of User A's view.
- **User B's perspective**: User B would find their recent edits deleted without knowing how it happened.

## Usage

To create an `UndoManager`, you can specify:

- Which local operations are not recorded.
- The merge range for undo operations.
- The stack depth.

```ts no_run
const undoManager = new UndoManager(doc, {
  maxUndoSteps: 100, // default 100
  mergeInterval: 1000, // default 1000ms
  excludeOriginPrefixes: ["sys:"], // default []
  onPush: ...,
  onPop: ...
});
```

## Limitations

It can only track a single peer. When the peer ID of the document changes, it clears both the undo stack and the redo stack and begins tracking the new peer ID.

## Restoring Selections

When using undo/redo, the application should restore the selection to its position prior to the operation being undone or redone. This is particularly challenging in a collaborative setting, where remote operations can alter the cursor's position (for instance, the cursor needs to revert to the 5th position, but remote operations have added new characters before that position).
### Challenges

- **Local undo/redo**: Local undo and redo delete and recreate elements. If we use CRDT-based stable positions, they might lock onto the deleted elements, while the user's expectation is for the cursor to return to the newly created elements after redo.
- **Example**:

  ```
  A fox jumped
  ```

  - Select "fox" and delete it.
  - Undo.
  - After undo, the three characters "fox" should be recreated, and the cursor should select these three characters. However, if we use stable positions, it would lock onto the initially deleted characters, and after undo, the absolute positions obtained would be `start = 2` and `end = 2`.
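To see why stable positions collapse here, consider a toy tombstone model. This is purely illustrative (Loro's internals differ); the assumption that the undo re-inserts after the tombstones is part of the example:

```typescript
// Toy model: characters carry stable ids; deleted ones remain as
// tombstones so a CRDT-stable cursor can still resolve.
interface Ch {
  id: number;
  ch: string;
  deleted: boolean;
}
const chars: Ch[] = "A fox jumped"
  .split("")
  .map((ch, id) => ({ id, ch, deleted: false }));

// Resolve an anchored id to its visible index; a tombstone collapses
// onto the position of the next visible character.
function resolve(id: number): number {
  let idx = 0;
  for (const c of chars) {
    if (c.id === id) return idx;
    if (!c.deleted) idx++;
  }
  return idx;
}

// Select "fox" (ids 2..4) and delete it...
const startId = 2;
const endId = 4;
for (const c of chars) if (c.id >= 2 && c.id <= 4) c.deleted = true;

// ...then undo. The undo re-inserts *new* characters with fresh ids
// (assumed here to land right after the tombstones).
let nextId = 100;
chars.splice(
  5,
  0,
  ..."fox".split("").map((ch) => ({ id: nextId++, ch, deleted: false }))
);

// The old anchors still point at tombstones, so the selection collapses
// to start = 2, end = 2 instead of covering the recreated "fox".
console.log(resolve(startId), resolve(endId)); // 2 2
```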
### Solution

We support storing [`cursors`](/docs/tutorial/cursor) for each undo/redo action within the `UndoManager`. The `UndoManager` adjusts the stored cursors to reflect changes from remote edits or other undo/redo operations, ensuring they match the current state of the document. They need to be handled by the `onPush` and `onPop` callbacks: on `onPush`, you can return a list of `Cursor`s that you want to store; on `onPop`, you can retrieve the stored cursors and use them to restore the selection.

Typically, you may need to store selections in an undo/redo item in the following cases:

- When a new local change is applied, we need to record a new undo item. Therefore, we must store the selection **before** the operation to be undone.
  - **Purpose**: Storing the selection **before** is crucial because we may lose the selection after applying the operation. If the user selects text and deletes it, after undo, the `onPop` can retrieve the state of the selected deleted content.
- **First undo operation**: Store the current document's selection for the corresponding redo item.
  - **Purpose**: After redo, it can return to the initial selection state.

Internally, we also automatically handle the storage and reset of cursors in the undo/redo loop state.
```ts const doc = new LoroDoc(); let cursors: Cursor[] = []; let poppedCursors: Cursor[] = []; const undo = new UndoManager(doc, { mergeInterval: 0, onPop: (isUndo, value, counterRange) => { poppedCursors = value.cursors; }, onPush: () => { return { value: null, cursors: cursors }; }, }); doc.getText("text").insert(0, "hello world"); doc.commit(); cursors = [ doc.getText("text").getCursor(0)!, doc.getText("text").getCursor(5)!, ]; // delete "hello ", the cursors should be transformed doc.getText("text").delete(0, 6); doc.commit(); expect(poppedCursors.length).toBe(0); undo.undo(); expect(poppedCursors.length).toBe(2); expect(doc.toJSON()).toStrictEqual({ text: "hello world", }); // expect the cursors to be transformed back expect(doc.getCursorPos(poppedCursors[0]).offset).toBe(0); expect(doc.getCursorPos(poppedCursors[1]).offset).toBe(5); ```
## Implementation

[The implementation of undo/redo](https://github.com/loro-dev/loro/pull/361) follows a model similar to ProseMirror/Quill, which are based on OT (Operational Transformation) algorithms (so we also implement basic OT primitives internally). When implementing undo/redo operations, we need to ensure the following properties:

- Do not undo remote inserts.
- Redo after undo should return to the original state.
- If there is no concurrent editing, undo should return to the previous version's state.

Therefore, we have also added relevant checks to our internal fuzzing tests to ensure correctness.
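The first property falls out of addressing characters by stable identity rather than by index. A minimal sketch, illustrative only and not Loro's actual implementation:

```typescript
// Each character carries a stable id; undoing a local insert removes the
// character by id, so a concurrent remote insert that shifted positions
// is left untouched.
interface Item {
  id: number;
  ch: string;
  author: string;
}
let nextId = 0;
const text: Item[] = [];

function insert(pos: number, ch: string, author: string): Item {
  const item = { id: nextId++, ch, author };
  text.splice(pos, 0, item);
  return item;
}

function undoInsert(item: Item): void {
  const i = text.findIndex((x) => x.id === item.id);
  if (i >= 0) text.splice(i, 1);
}

const local = insert(0, "A", "me");
insert(0, "R", "remote"); // a remote edit lands *before* our character
undoInsert(local); // removes "A" by id, not by its original index 0

console.log(text.map((x) => x.ch).join("")); // "R"
```

Undoing by index 0 instead would have deleted the remote "R" — exactly the behavior the first property forbids.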
## Demonstration

import { ReactPlayer } from "components/video";
import Caption from "components/caption";

ProseMirror with Loro binding

## Understanding the Undo/Redo Stack

The `UndoManager` maintains two stacks:

1. **Undo Stack**: Contains operations that can be undone
2. **Redo Stack**: Contains operations that were undone and can be redone

### How the Callbacks Work

The `onPush` and `onPop` callbacks are triggered when these stacks change:

- **onPush(isUndo, range, event)**: Called when a new item is pushed to either stack
  - `isUndo: boolean`: `true` for the undo stack, `false` for the redo stack
  - `range: (number, number)`: The counter range of the operations associated with the undo/redo action
  - Returns: An object that can include `value` (any data you want to store) and `cursors` (cursor positions)
- **onPop(isUndo, value)**: Called when an item is popped from either stack
  - `isUndo`: `true` for the undo stack, `false` for the redo stack
  - `value`: The value you returned from `onPush` when this item was created

### Understanding Action Merging

The `mergeInterval` option in the UndoManager controls how closely spaced operations are grouped:

```ts no_run
const undoManager = new UndoManager(doc, {
  mergeInterval: 1000, // 1000ms = 1 second (default)
});
```

**How mergeInterval works:**

- Operations occurring within the specified time interval (in milliseconds) will be merged into a single undo action
- Even though these operations are merged, `onPush` events will still be triggered for each individual operation
- When undoing, all merged operations will be undone as a single unit
- A lower value results in more granular undo steps; a higher value creates fewer, more comprehensive undo steps
- Set to `0` to disable merging entirely (every operation becomes a separate undo step)

### Stack Operations Flow

1. When a local transaction is committed, a new undo item is pushed to the undo stack (triggers `onPush` with `isUndo=true`)
2.
When `.undo()` is called: - An item is popped from the undo stack (triggers `onPop` with `isUndo=true`) - A corresponding item is pushed to the redo stack (triggers `onPush` with `isUndo=false`) 3. When `.redo()` is called: - An item is popped from the redo stack (triggers `onPop` with `isUndo=false`) - A corresponding item is pushed to the undo stack (triggers `onPush` with `isUndo=true`) ### Example: Text Editing with Undo/Redo Consider a simple text editor that uses Loro for collaboration. Let's walk through what happens during typical editing operations: ```ts // Create document and undo manager const doc = new LoroDoc(); const textField = doc.getText("textField"); // Set up undo manager with callbacks to track changes const undoManager = new UndoManager(doc, { // Store cursor positions and any other state you need onPush: (isUndo, range) => { if (isUndo) { console.log("Action recorded for undo"); } else { console.log("Action recorded for redo"); } // Return whatever data you want associated with this action return { value: { affectedRange: range }, cursors: [/* your cursor positions */] }; }, onPop: (isUndo, storedData) => { // Access the data you stored during onPush const { value, cursors } = storedData; if (isUndo) { console.log("Retrieving data for undo"); } else { console.log("Retrieving data for redo"); } // Use the stored cursors to restore selection // applyStoredCursors(cursors); }, mergeInterval: 0, }); // User types "Hello" textField.insert(0, "Hello"); doc.commit(); // → onPush triggered (isUndo=true) - Adds to undo stack // User types " World" textField.insert(5, " World"); doc.commit(); // → onPush triggered (isUndo=true) - Adds to undo stack // User clicks Undo button undoManager.undo(); // → onPop triggered (isUndo=true) - Removes " World" from document // → onPush triggered (isUndo=false) - Adds to redo stack // Document now contains only "Hello" // User clicks Redo button undoManager.redo(); // → onPop triggered (isUndo=false) - Retrieves " 
World" operation // → onPush triggered (isUndo=true) - Adds back to undo stack // Document now contains "Hello World" again ``` When the user clicks Undo, two things happen: 1. The last action is popped from the undo stack (removing " World") 2. That action is pushed to the redo stack so it can be redone later This approach ensures that local changes can be undone without affecting other users' edits, making it ideal for collaborative editing. ### Cursor Efficiency The built-in cursor solution is optimized for performance and handles collaborative scenarios efficiently, including situations where peers may change the document concurrently during undo/redo operations. For complex editors like rich text editors, the cursor implementation provides the best balance of performance and correctness. # FILE: pages/docs/tutorial/text.md --- keywords: "text crdt, richtext, richtext editor" description: "how to use loro richtext crdt and show all APIs of loro text crdt." --- # Text Loro supports both plain text and rich text. When rich text features (like mark and unmark) are not used, the text container operates as plain text without any rich text overhead, making it efficient for simple text operations. LoroText offers excellent performance, particularly when handling large strings. It significantly outperforms native JavaScript string operations due to its internal B-tree structure. All basic operations like insert and delete have O(log N) time complexity, making it highly efficient even when working with documents containing several million characters. > To learn how rich text CRDT in Loro works under the hood, please refer to our > blog: [Introduction to Loro's Rich Text CRDT](/blog/loro-richtext). 
## Editor Bindings

Loro provides official bindings for popular editors to make it easier to integrate Loro's CRDT capabilities:

### ProseMirror Binding

The [loro-prosemirror](https://github.com/loro-dev/loro-prosemirror) package provides seamless integration between Loro and ProseMirror, a powerful rich text editor framework. It includes:

- Document state synchronization with rich text support
- Cursor awareness and synchronization
- Undo/Redo support in collaborative editing

The ProseMirror binding can also be used with [Tiptap](https://tiptap.dev/), a popular rich text editor built on top of ProseMirror. This means you can easily add collaborative editing capabilities to your Tiptap-based applications.

```ts
import {
  CursorAwareness,
  LoroCursorPlugin,
  LoroSyncPlugin,
  LoroUndoPlugin,
  undo,
  redo,
} from "loro-prosemirror";
import { LoroDoc } from "loro-crdt";
import { EditorView } from "prosemirror-view";
import { EditorState } from "prosemirror-state";
import { keymap } from "prosemirror-keymap";

const loroDoc = new LoroDoc();
const awareness = new CursorAwareness(loroDoc.peerIdStr);
const plugins = [
  // `pmPlugins` stands for your editor's other ProseMirror plugins
  ...pmPlugins,
  LoroSyncPlugin({ doc: loroDoc }),
  LoroUndoPlugin({ doc: loroDoc }),
  keymap({
    "Mod-z": undo,
    "Mod-y": redo,
    "Mod-Shift-z": redo,
  }),
  LoroCursorPlugin(awareness, {}),
];
// `schema` and `editorDom` come from your editor setup
const editor = new EditorView(editorDom, {
  state: EditorState.create({ schema, plugins }),
});
```

### CodeMirror Binding

The [loro-codemirror](https://github.com/loro-dev/loro-codemirror) package provides integration between Loro and CodeMirror 6, a versatile code editor. It supports:

- Document state synchronization
- Cursor awareness
- Undo/Redo functionality

```ts
import { EditorState } from "@codemirror/state";
import { EditorView } from "@codemirror/view";
import { LoroExtensions } from "loro-codemirror";
import { Awareness, LoroDoc, UndoManager } from "loro-crdt";

const doc = new LoroDoc();
const awareness = new Awareness(doc.peerIdStr);
const undoManager = new UndoManager(doc, {});

new EditorView({
  state: EditorState.create({
    extensions: [
      // ... other extensions
      LoroExtensions(
        doc,
        {
          awareness: awareness,
          user: { name: "Bob", colorClassName: "user1" },
        },
        undoManager,
      ),
    ],
  }),
  parent: document.querySelector("#editor")!,
});
```

## LoroText vs String

It's important to understand that LoroText behaves very differently from a regular string value, so the following code has different merge results:

Using `LoroText`:

```ts
const doc = new LoroDoc();
doc.setPeerId("0");
doc.getText("text").insert(0, "Hello");
const doc2 = new LoroDoc();
doc2.setPeerId("1");
doc2.getText("text").insert(0, "World");
doc.import(doc2.export({ mode: "update" }));
console.log(doc.getText("text").toString()); // "HelloWorld"
```

Using `String`:

```ts
const doc = new LoroDoc();
doc.setPeerId("0");
doc.getMap("map").set("text", "Hello");
const doc2 = new LoroDoc();
doc2.setPeerId("1");
doc2.getMap("map").set("text", "World");
doc.import(doc2.export({ mode: "update" }));
console.log(doc.getMap("map").get("text")); // "World"
```

### Merge Semantics

Unlike LoroMap, which uses Last-Write-Wins (LWW) semantics, LoroText is designed to preserve edits. Here's how they differ:

When user A and user B make concurrent edits to the same text:

- LoroText will merge both users' edits in sequence, preserving both changes
- LoroMap will use LWW semantics, keeping only one user's changes

### When to Use String in LoroMap

There are specific scenarios where using a string in LoroMap (with LWW semantics) might be more appropriate than using LoroText:

- **URLs**: When dealing with hyperlinks, automatic merging could result in invalid URLs. In this case, it's better to use LoroMap's LWW semantics to ensure the URL remains valid.
- **Hash strings**: When handling hash strings, LWW semantics are more appropriate to maintain data accuracy and consistency.

## Rich Text Config

To use rich text in Loro, you need to specify the expanding behaviors for each format first.
When we insert new text at the format boundaries, they define whether the inserted text should inherit the format. There are four kinds of expansion behaviors: - `after`(default): when inserting text right after the given range, the mark will be expanded to include the inserted text - `before`: when inserting text right before the given range, the mark will be expanded to include the inserted text - `none`: the mark will not be expanded to include the inserted text at the boundaries - `both`: when inserting text either right before or right after the given range, the mark will be expanded to include the inserted text ### Example ```ts const doc = new LoroDoc(); doc.configTextStyle({ bold: { expand: "after" }, link: { expand: "before" }, }); const text = doc.getText("text"); text.insert(0, "Hello World!"); text.mark({ start: 0, end: 5 }, "bold", true); expect(text.toDelta()).toStrictEqual([ { insert: "Hello", attributes: { bold: true, }, }, { insert: " World!", }, ] as Delta[]); // " Test" will inherit the bold style because `bold` is configured to expand forward text.insert(5, " Test"); expect(text.toDelta()).toStrictEqual([ { insert: "Hello Test", attributes: { bold: true, }, }, { insert: " World!", }, ] as Delta[]); ``` ## Methods ### `insert(pos: number, s: string)` Insert text at the given pos. ### `delete(pos: number, len: number)` Delete text at the given range. ### `slice(start: number, end: number): string` Get a string slice. ### `toString(): string` Get the plain text value. ### `charAt(pos: number): char` Get the character at the given position. ### `splice(pos: number, len: number, text: string): string` Delete and return the string at the given range and insert a string at the same position. ### `length(): number` Get the length of text ### `getCursor(pos: number, side: Side): Cursor | undefined` Get the cursor at the given position. ### `toDelta(): Delta[]` Get the rich text value. It's in [Quill's Delta format](https://quilljs.com/docs/delta/). 
### `mark(range: {start: number, end: number}, key: string, value: any): void`

Mark the given range with a key-value pair.

### `unmark(range: {start: number, end: number}, key: string): void`

Remove key-value pairs with the given key in the given range.

### `update(text: string)`

Update the current text based on the provided text.

### `applyDelta(delta: Delta[]): void`

Change the state of this text by delta. If a delta item is `insert`, it should include all the attributes of the inserted text. Loro's rich text CRDT may make the inserted text inherit some styles when you use the `insert` method directly. However, when you use `applyDelta`, if some attributes are inherited from the CRDT but not included in the delta, they will be removed.

Another special property of `applyDelta` is that if you format an attribute for ranges beyond the text length, Loro will insert new lines to fill the gap first. This is useful when building a binding between Loro and rich text editors like Quill, which might implicitly assume there is always a newline at the end of the text.

```ts
const doc = new LoroDoc();
const text = doc.getText("text");
doc.configTextStyle({ bold: { expand: "after" } });
text.insert(0, "Hello World!");
text.mark({ start: 0, end: 5 }, "bold", true);
const delta = text.toDelta();
const text2 = doc.getText("text2");
text2.applyDelta(delta);
expect(text2.toDelta()).toStrictEqual(delta);
```

### `subscribe(f: (event: Listener)): number`

This method returns a number that can be used to remove the subscription. The text event is in `Delta[]` format, so it can be used to bind a rich text editor.
It has the same type as the arg of `applyDelta`, so the following example works: ```ts (async () => { const doc1 = new LoroDoc(); doc1.configTextStyle({ link: { expand: "none" }, bold: { expand: "after" }, }); const text1 = doc1.getText("text"); const doc2 = new LoroDoc(); doc2.configTextStyle({ link: { expand: "none" }, bold: { expand: "after" }, }); const text2 = doc2.getText("text"); text1.subscribe((e) => { for (const event of e.events) { const d = event.diff as TextDiff; text2.applyDelta(d.diff); } }); text1.insert(0, "foo"); text1.mark({ start: 0, end: 3 }, "link", true); doc1.commit(); await new Promise((r) => setTimeout(r, 1)); expect(text2.toDelta()).toStrictEqual(text1.toDelta()); text1.insert(3, "baz"); doc1.commit(); await new Promise((r) => setTimeout(r, 1)); expect(text2.toDelta()).toStrictEqual([ { insert: "foo", attributes: { link: true } }, { insert: "baz" }, ]); expect(text2.toDelta()).toStrictEqual(text1.toDelta()); text1.mark({ start: 2, end: 5 }, "bold", true); doc1.commit(); await new Promise((r) => setTimeout(r, 1)); expect(text2.toDelta()).toStrictEqual(text1.toDelta()); })(); ``` # FILE: pages/docs/tutorial/tips.md # Tips and Tricks ##### `LoroDoc` will be initialized with a new random PeerID each time
**What if I need to set the initial state?**

If your document requires an initial state, you should not edit the document to achieve this state right after creating it with `new LoroDoc()`. This approach can cause problems: each time someone opens the document, new operations with different PeerIDs would be added just to set up the initial state.

The better approach is to initialize your document by loading the same Snapshot. This ensures all users start from an identical baseline without generating unnecessary operations.
--- ##### Be careful when using `doc.setPeerId(newId)` When using `setPeerId`, you must avoid having two parallel peers use the same PeerId. This can cause serious consistency problems in your application.
**Why?**

Loro determines whether an operation has already been included by checking its operation ID. Since operation IDs are composed of `PeerId + Counter`, duplicate PeerIds can easily lead to duplicate operation IDs. During synchronization, Loro might incorrectly assume certain operations have already been processed, resulting in document inconsistency across peers.
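A toy model of ID-based deduplication makes the failure mode concrete. This is illustrative only and not Loro's sync protocol:

```typescript
// Toy sync: a peer ignores any incoming op whose ID it already has,
// assuming it was applied earlier.
type Ops = Map<string, string>; // "counter@peer" -> op payload

function syncInto(local: Ops, remote: Ops): void {
  for (const [id, op] of remote) {
    if (!local.has(id)) local.set(id, op);
  }
}

// Two *parallel* peers were both given peer ID "7", so both of their
// first ops get the ID "0@7":
const a: Ops = new Map([["0@7", "insert 'A'"]]);
const b: Ops = new Map([["0@7", "insert 'B'"]]);

syncInto(a, b);
// b's op is silently dropped: `a` believes it already applied "0@7".
console.log(a.get("0@7")); // "insert 'A'"
console.log(a.size); // 1
```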
**How to reuse PeerIds safely**

Be careful when reusing PeerIds (this optimization is often unnecessary). You should not assign a fixed PeerId to a user, as one user might use multiple devices. Similarly, you shouldn't assign a fixed PeerId to a device, because even in a single browser, multiple tabs might open the same document simultaneously.

If you must reuse PeerIds, you need to carefully manage your local PeerId cache with proper locking mechanisms. This would allow only one tab to "take" a specific PeerId, while other tabs use random IDs. The PeerId should be returned to the cache when no longer in use.
--- ##### Root containers don't need operations to be initialized Root Containers are created implicitly in Loro. This means that when you call `doc.getText("text")`, no new operations appear in the LoroDoc history, and there are no operations that need to be synchronized with other peers. This behavior contrasts with non-root containers. For example, when you execute `doc.getMap("meta").setContainer("text", new LoroText())`, it generates an operation to insert the LoroText container into the map. --- ##### When initializing child containers of LoroMap in parallel, overwrites can occur instead of automatic merging.
**Why this happens**

Parallel creation of child containers results in different container IDs, preventing automatic merging of their contents. When a container holds substantial data or serves as the primary storage for document content, overwriting it can lead to unintended hiding or loss of critical information.

```typescript
const doc = new LoroDoc();
const map = doc.getMap("map");

// Parallel initialization of child containers
const docB = doc.fork();
const textA = doc.getMap("map").setContainer("text", new LoroText());
textA.insert(0, "A");
const textB = docB.getMap("map").setContainer("text", new LoroText());
textB.insert(0, "B");

doc.import(docB.export({ mode: "update" }));
// Result: Either { "map": { "text": "A" } } or { "map": { "text": "B" } }
```
**Best practices for container initialization**

1. When using map containers:
   - If possible, initialize all child containers during the map container's initialization
   - Avoid concurrent creation of child containers with the same key in the map container to prevent overwrites
2. If it's impossible to initialize all child containers when the map container is initialized, prefer initializing them at the root level rather than as nested containers.
   - For example, use `doc.getMap("user." + userId)` instead of `doc.getMap("user").getOrCreateContainer(userId, new LoroMap())` to avoid this problem.
--- ##### Use redaction to safely share document history There are times when users might accidentally paste sensitive information (like API keys, passwords, or personal data) into a collaborative document. When this happens, you need a way to remove just that sensitive content from the document history without compromising the rest of the document's integrity.
**How to safely redact sensitive content**

Loro provides a `redactJsonUpdates` function that allows you to selectively redact operations within specific version ranges. For example, if a user accidentally pastes a password or API key into a document:

```typescript
const doc = new LoroDoc();
doc.setPeerId("1");

// Create some content to be redacted
const text = doc.getText("text");
text.insert(0, "Sensitive information");
doc.commit();

const map = doc.getMap("map");
map.set("password", "secret123");
map.set("public", "public information");
doc.commit();

// Export JSON updates
const jsonUpdates = doc.exportJsonUpdates();

// Define version range to redact (redact the text content)
const versionRange = {
  "1": [0, 21], // Redact the "Sensitive information"
};

// Apply redaction
const redactedJson = redactJsonUpdates(jsonUpdates, versionRange);

// Create a new document with redacted content
const redactedDoc = new LoroDoc();
redactedDoc.importJsonUpdates(redactedJson);

// The text content is now redacted with replacement characters
console.log(redactedDoc.getText("text").toString());
// Outputs: "���������������������"

// You can also redact specific map entries
const versionRange2 = {
  "1": [21, 22], // Redact the "secret123" password
};
const redactedJson2 = redactJsonUpdates(jsonUpdates, versionRange2);
const redactedDoc2 = new LoroDoc();
redactedDoc2.importJsonUpdates(redactedJson2);
console.log(redactedDoc2.getMap("map").get("password")); // null
console.log(redactedDoc2.getMap("map").get("public")); // "public information"
```

This approach is safer than manually editing document content because:

1. It maintains document structure and CRDT consistency
2. It keeps key metadata like operation IDs and dependencies intact
3. It allows concurrent editing to continue working after redaction
4.
It selectively redacts only specific operations, not the entire document The redaction process follows these rules: - Preserves delete, tree move, and list move operations - Replaces text insertion content with Unicode replacement characters '�' - Substitutes list and map insert values with null - Maintains structure of child containers - Replaces text mark values with null - Preserves map keys and text annotation keys **Important**: Your application needs to ensure that all peers receive the redacted version, otherwise the original document with sensitive information will still exist on other peers.
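The counter-range semantics of text redaction can be illustrated with a toy model. This simplifies heavily — each text insert is assumed to span `content.length` counters, and the real API is `redactJsonUpdates`, not this hand-rolled `redact`:

```typescript
// Toy redaction over simplified text-insert ops: replace only the
// characters whose counters fall inside the given version range.
interface TextOp {
  peer: string;
  counter: number;
  content: string;
}
type VersionRange = Record<string, [number, number]>;

function redact(ops: TextOp[], range: VersionRange): TextOp[] {
  return ops.map((op) => {
    const r = range[op.peer];
    if (!r) return op;
    const [from, to] = r;
    const chars = op.content.split("").map((ch, i) => {
      const c = op.counter + i; // the counter this character occupies
      return c >= from && c < to ? "\uFFFD" : ch;
    });
    return { ...op, content: chars.join("") };
  });
}

const ops: TextOp[] = [
  { peer: "1", counter: 0, content: "Sensitive information" },
  { peer: "1", counter: 21, content: " (public note)" },
];
const redacted = redact(ops, { "1": [0, 21] });
console.log(redacted[0].content); // 21 replacement characters
console.log(redacted[1].content); // " (public note)" — untouched
```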
--- ##### Use shallow snapshots to completely remove old history When you need to completely remove ALL history older than a certain version point, shallow snapshots provide the solution.
**How to remove old history with shallow snapshots**

Shallow snapshots create a new document that preserves the current state but completely eliminates all history before a specified point, similar to Git's shallow clone functionality.

```typescript
const doc = new LoroDoc();
doc.setPeerId("1");

// Old history - will be completely removed
const text = doc.getText("text");
text.insert(0, "This document has a long history with many edits");
doc.commit();
text.insert(0, "Including some potentially sensitive information. ");
doc.commit();

// More recent history - will be preserved
text.delete(11, 55); // Remove the middle part
text.insert(11, "with sanitized history");
doc.commit();

// Create a sanitized version that removes ALL history before current point
const sanitizedSnapshot = doc.export({
  mode: "shallow-snapshot",
  frontiers: doc.oplogFrontiers(),
});

// Create a new document from the sanitized snapshot
const sanitizedDoc = new LoroDoc();
sanitizedDoc.import(sanitizedSnapshot);

// The document has the final state
console.log(sanitizedDoc.getText("text").toString());
// Outputs: "Including with sanitized history"

// But ALL history before the snapshot point is completely removed
console.log(sanitizedDoc.isShallow()); // true
console.log(sanitizedDoc.shallowSinceFrontiers()); // Shows the starting point
```

This approach is useful for:

1. Completely removing all old history that might contain various sensitive information
2. Significantly reducing document size by eliminating unnecessary history
3. Creating clean document instances after certain milestones
4. Ensuring old operations cannot be recovered or examined

Compared to redaction:

- Shallow snapshots completely remove all operations before a version point
- Redaction selectively replaces just specific content with placeholders

**Important**: While both methods maintain future synchronization consistency, your application must distribute the sanitized document to all peers.
Otherwise, the original document with sensitive information will still exist on other clients.

**When to use each approach**:

- Use **redaction** when you need to sanitize specific operations (like an accidental password paste) while preserving older history
- Use **shallow snapshots** when you want to completely eliminate all history before a certain point
---

##### You can store mappings between LoroDoc's peerIds and user IDs in the document itself

Use `doc.subscribeFirstCommitFromPeer(listener)` to associate peer information with user identities when a peer first interacts with the document.
How to track peer-to-user mappings

This functionality is essential for building user-centric features in collaborative applications. You often need bidirectional mapping between user IDs and peer IDs:

- **Finding all edits by a user**: When you need to retrieve all document edits made by a specific user ID, you must first find all peer IDs associated with that user
- **Showing edit attribution**: When displaying which user edited a piece of text, you need to map from the peer ID (stored in the operation) back to the user ID for display

This hook provides an ideal point to associate peer information (such as author identity) with the document. The listener is triggered on the first commit from each peer, allowing you to store user metadata within the document itself.

```typescript
const doc = new LoroDoc();
doc.setPeerId(0);
doc.subscribeFirstCommitFromPeer((e) => {
  doc.getMap("users").set(e.peer, "user-" + e.peer);
});
doc.getList("list").insert(0, 100);
doc.commit();
expect(doc.getMap("users").get("0")).toBe("user-0");
```

This approach allows you to:

1. Automatically track which peers have contributed to the document
2. Store user metadata (names, emails, etc.) alongside the document
3. Build features like author attribution, presence indicators, or edit history

The mapping is stored within the document, so it automatically synchronizes across all peers and persists with the document's state.
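Once the `users` map is populated, the reverse lookup described above is plain data manipulation. Below is a minimal TypeScript sketch of building the user-to-peers index; the data is hypothetical, standing in for what reading back the document's `users` map would yield.

```typescript
// Hypothetical peer -> user entries, as they might be read back from the
// document's "users" map after each peer's first commit.
const peerToUser: Record<string, string> = {
  "1": "alice",
  "2": "bob",
  "3": "alice", // alice edited from a second device, so she owns two peer IDs
};

// Build the reverse index (user -> peer IDs) needed to find all edits by a user.
const userToPeers = new Map<string, string[]>();
for (const [peer, user] of Object.entries(peerToUser)) {
  const peers = userToPeers.get(user) ?? [];
  peers.push(peer);
  userToPeers.set(user, peers);
}

console.log(userToPeers.get("alice")); // alice's peer IDs: "1" and "3"
```

With this index in hand, "find all edits by alice" reduces to filtering the document history by her peer IDs.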
---

##### You can use https://loro.dev/llms-full.txt to prompt your AI

When working with AI assistants or language models on Loro-related tasks, you can use these URLs to provide comprehensive context about Loro's capabilities and API:

- `https://loro.dev/llms-full.txt` - All the documentation in one file
- `https://loro.dev/llms.txt` - An overview of the Loro website

# FILE: pages/docs/tutorial/map.md

---
keywords: "map crdt, last write win, key value, conflict"
description: "how to use loro map crdt and show all APIs of loro map crdt."
---

# Map

Loro's Map uses LWW (Last-Write-Wins) semantics. When concurrent edits conflict, it compares Lamport logical timestamps to determine the winner. Here is how to use it:

```ts
const docA = new LoroDoc();
docA.setPeerId("0");
const docB = new LoroDoc();
docB.setPeerId("1");
const mapA = docA.getMap("map");
const mapB = docB.getMap("map");
mapA.set("a", 1);
const textB = mapB.setContainer("a", new LoroText());
textB.insert(0, "Hi");
console.log(docA.toJSON()); // OUTPUT: { map: { a: 1 } }
console.log(docB.toJSON()); // OUTPUT: { map: { a: "Hi" } }
docA.import(docB.export({ mode: "snapshot" }));
docB.import(docA.export({ mode: "snapshot" }));
// docB wins because it has the larger peerId, thus the larger logical timestamp
console.log(docA.toJSON()); // OUTPUT: { map: { a: "Hi" } }
console.log(docB.toJSON()); // OUTPUT: { map: { a: "Hi" } }
```

> **Note**: When calling `map.set(key, value)` on a LoroMap, if `map.get(key)` already returns `value`, the operation will be a no-op (no operation recorded).

# FILE: pages/docs/tutorial/counter.md

---
keywords: "counter crdt"
description: "how to use loro counter crdt and show all APIs of loro counter crdt."
---

# Counter

Loro's Counter adds up all the applied values, and supports both integers and floating-point numbers.
Here is how to use it:

```ts
const doc = new LoroDoc();
const counter = doc.getCounter("counter");
counter.increment(1);
counter.increment(2);
counter.decrement(1);
expect(counter.value).toBe(2);
```

# FILE: pages/docs/tutorial/cursor.md

---
keywords: "cursor, crdts, multi-player, awareness, position, loro"
description: "How to use Loro to implement cursor position in real-time collaboration"
---

# Cursor

Cursor is an independently storable entity, meaning it can store content separately from the Loro document. It is used to stably represent positions within structures such as Text, List, or MovableList. Cursors can be utilized to represent collaborative cursor positions, highlight ranges, or comment ranges.

## Motivation

Using "index" to denote cursor positions can be unstable, as positions may shift with document edits. To reliably represent a position or range within a document, it is more effective to leverage the unique ID of each item/character in a List CRDT or Text CRDT.

## Updating Cursors

Loro optimizes state metadata by not storing the IDs of deleted elements. This approach, while efficient, complicates tracking cursor positions since they rely on these IDs for precise locations within the document. The solution recalculates the position by replaying the relevant history to update stable positions accurately. To minimize the performance impact of history replay, the system updates cursor info to reference only the IDs of currently present elements, thereby reducing the need for replay.

Each position carries "Side" information, indicating whether the actual cursor position is to the left of, to the right of, or directly at the target ID.

Note: In JavaScript, the offset returned when querying a Stable Position is based on the UTF-16 index.
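This UTF-16 caveat matters for text containing characters outside the Basic Multilingual Plane. JavaScript string indices already count UTF-16 code units, so they line up with these offsets. A small self-contained illustration (the sample string is hypothetical):

```typescript
// Hypothetical text mixing BMP characters with an emoji (a surrogate pair).
const text = "a😀b";

// .length counts UTF-16 code units, not user-perceived characters:
console.log(text.length); // 4: "😀" occupies two code units

// So the code-unit index of "b" is 3, even though it is the 3rd character:
console.log(text.indexOf("b")); // 3

// Unicode code points can be counted by iterating the string instead:
console.log([...text].length); // 3
```

Keep this in mind when mapping offsets into editors that index by code points or grapheme clusters rather than UTF-16 code units.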
## Example ```ts const loro = new LoroDoc(); const list = loro.getList("list"); list.insert(0, "a"); const pos0 = list.getCursor(0); list.insert(1, "b"); { const ans = loro.getCursorPos(pos0!); expect(ans.offset).toEqual(0); expect(ans.side).toEqual(0); expect(ans.update).toBeUndefined(); } list.insert(0, "c"); { const ans = loro.getCursorPos(pos0!); expect(ans.offset).toEqual(1); expect(ans.side).toEqual(0); expect(ans.update).toBeUndefined(); } list.delete(1, 1); { const ans = loro.getCursorPos(pos0!); expect(ans.offset).toEqual(1); expect(ans.side).toEqual(-1); expect(ans.update).toBeDefined(); } ``` # FILE: pages/docs/tutorial/composition.md --- keywords: "crdts, json, data model, document state, semantics" description: "Everyone can effectively model the states and the updates of documents that conform to the JSON schema." --- # Composing CRDTs In Loro, you can build complex data structures using basic CRDTs such as List, MovableList, Map and Tree. These containers can include sub-containers, which in turn can contain more sub-containers, allowing for the composition of intricate data structures. It's important to note that documents in Loro must adhere to a tree structure. This means that while a parent can have multiple children, each child is restricted to only one parent. Therefore, the document forms a tree rather than a graph (like a DAG). By leveraging these fundamental CRDTs, you can effectively model the states and the updates of documents that conform to the JSON schema. ```ts const doc = new LoroDoc(); const map = doc.getMap("map"); let callTimes = 0; // Events from a child are propagated to all ancestor nodes. 
map.subscribe((event) => { console.log(event); callTimes++; }); // Create a sub container for map // { map: { list: [] } } const list = map.setContainer("list", new LoroList()); list.push(0); list.push(1); // Create a sub container for list // { map: { list: [0, 1, LoroText] } } const text = list.insertContainer(2, new LoroText()); expect(doc.toJSON()).toStrictEqual({ map: { list: [0, 1, ""] } }); { // Commit will trigger the event, because list is a sub container of map doc.commit(); await new Promise((resolve) => setTimeout(resolve, 1)); expect(callTimes).toBe(1); } text.insert(0, "Hello, "); text.insert(7, "World!"); expect(doc.toJSON()).toStrictEqual({ map: { list: [0, 1, "Hello, World!"] } }); { // Commit will trigger the event, because text is a descendant of map doc.commit(); await new Promise((resolve) => setTimeout(resolve, 1)); expect(callTimes).toBe(2); } ``` # FILE: pages/docs/tutorial/list.md --- keywords: "list crdt, movable list, move element, cursor, awareness" description: "how to use loro list and movable list crdt and show all APIs of loro list and movable crdt." --- # List and Movable List Loro supports two types of lists: `List` and `MovableList`. The `List` is a standard list that supports Insert and Delete operations. In contrast, the `MovableList` supports additional Set and Move operations. Using a combination of insert and delete operations, one can simulate set and move operations on a `List`. However, this approach fails in concurrent editing scenarios. For example, if the same element is set or moved concurrently, the simulation would result in the deletion of the original element and the insertion of two new elements, which does not meet expectations. The `MovableList` addresses these issues by ensuring that set and move operations do not lead to such problems, though it incurs additional overheads. 
Specifically, when performing only insertions/deletions, the `MovableList` is approximately 80% slower in encode/decode and consumes about 50% more memory compared to the `List`.

Both `List` and `MovableList` utilize the [_Fugue_](https://arxiv.org/abs/2305.00583) algorithm to achieve _maximal non-interleaving_. Additionally, `MovableList` uses the algorithm from [_Moving Elements in List CRDTs_](https://martin.kleppmann.com/2020/04/27/papoc-list-move.html) to implement the move operation.

## Basic Usage

### List

```ts
const docA = new LoroDoc();
docA.setPeerId("1");
const listA = docA.getList("list");
listA.push(0);
listA.push(1);
listA.push(2);
const bytes: Uint8Array = docA.export({ mode: "snapshot" });
const docB = LoroDoc.fromSnapshot(bytes);
docB.setPeerId("2");
const listB = docB.getList("list");
{
  // Concurrently docA and docB update the element at index 2
  // docA updates it to 8
  // docB updates it to 9
  // docA.toJSON() should return { list: [0, 1, 8] }
  // docB.toJSON() should return { list: [0, 1, 9] }
  listB.delete(2, 1);
  listB.insert(2, 9);
  expect(docB.toJSON()).toStrictEqual({ list: [0, 1, 9] });
  listA.delete(2, 1);
  listA.insert(2, 8);
  expect(docA.toJSON()).toStrictEqual({ list: [0, 1, 8] });
}
{
  // Merge docA and docB
  docA.import(docB.export({ mode: "update", from: docA.version() }));
  docB.import(docA.export({ mode: "update", from: docB.version() }));
}
expect(docA.toJSON()).toStrictEqual({ list: [0, 1, 8, 9] });
expect(docB.toJSON()).toStrictEqual({ list: [0, 1, 8, 9] });
```

### MovableList

```ts
const docA = new LoroDoc();
docA.setPeerId("1");
const listA = docA.getMovableList("list");
listA.push(0);
listA.push(1);
listA.push(2);
const bytes: Uint8Array = docA.export({ mode: "snapshot" });
const docB = LoroDoc.fromSnapshot(bytes);
docB.setPeerId("2");
const listB = docB.getMovableList("list");
{
  // Concurrently docA and docB update the element at index 2
  // docA updates it to 8
  // docB updates it to 9
  // docA.toJSON() should return { list: [0, 1, 8] }
  // docB.toJSON() should return { list: [0, 1, 9] }
  listA.set(2, 8);
  expect(docA.toJSON()).toStrictEqual({ list: [0, 1, 8] });
  listB.set(2, 9);
  expect(docB.toJSON()).toStrictEqual({ list: [0, 1, 9] });
}
{
  // Merge docA and docB
  docA.import(docB.export({ mode: "update", from: docA.version() }));
  docB.import(docA.export({ mode: "update", from: docB.version() }));
}
// Converge to [0, 1, 9] because docB has the larger peerId, thus the larger logical time
expect(docA.toJSON()).toStrictEqual({ list: [0, 1, 9] });
expect(docB.toJSON()).toStrictEqual({ list: [0, 1, 9] });
{
  // Concurrently docA and docB move the element at index 0
  // docA moves it to 2
  // docB moves it to 1
  // docA.toJSON() should return { list: [1, 9, 0] }
  // docB.toJSON() should return { list: [1, 0, 9] }
  listA.move(0, 2);
  listB.move(0, 1);
  expect(docA.toJSON()).toStrictEqual({ list: [1, 9, 0] });
  expect(docB.toJSON()).toStrictEqual({ list: [1, 0, 9] });
}
{
  // Merge docA and docB
  docA.import(docB.export({ mode: "update", from: docA.version() }));
  docB.import(docA.export({ mode: "update", from: docB.version() }));
}
// Converge to [1, 0, 9] because docB has the larger peerId, thus the larger logical time
expect(docA.toJSON()).toStrictEqual({ list: [1, 0, 9] });
expect(docB.toJSON()).toStrictEqual({ list: [1, 0, 9] });
```

## Using Cursor on List

The Cursor on a List changes with the list's modifications. If new elements are inserted in front of it, it moves to the right. If elements in front are deleted, it moves to the left. If elements are inserted or deleted behind it, it remains stationary. If you use a List to represent paragraphs in an article, you can use a Cursor to stably represent the selection range on the paragraph.
```ts
import {
  Cursor,
  LoroDoc,
  LoroList,
  LoroMovableList,
  LoroText,
} from "loro-crdt";

const doc = new LoroDoc();
doc.setPeerId("1");
const list = doc.getList("list");
list.push("Hello");
list.push("World");
const cursor = list.getCursor(1)!;
console.log(cursor.pos()); // OUTPUT: { peer: "1", counter: 1 }
const encodedCursor: Uint8Array = cursor.encode();
const exported: Uint8Array = doc.export({ mode: "snapshot" });

// Sending the exported snapshot and the encoded cursor to peer 2.
// Peer 2 will decode the cursor and get the position of the cursor in the list.
// Peer 2 will then insert "Foo" at the beginning of the list.
const docB = new LoroDoc();
docB.setPeerId("2");
const listB = docB.getList("list");
docB.import(exported);
listB.insert(0, "Foo");
console.log(docB.toJSON()); // OUTPUT: { list: ["Foo", "Hello", "World"] }
const cursorB = Cursor.decode(encodedCursor);
{
  // The cursor position is shifted to the right by 1
  const pos = docB.getCursorPos(cursorB);
  console.log(pos.offset); // OUTPUT: 2
}
listB.insert(1, "Bar");
console.log(docB.toJSON()); // OUTPUT: { list: ["Foo", "Bar", "Hello", "World"] }
{
  // The cursor position is shifted to the right by 1
  const pos = docB.getCursorPos(cursorB);
  console.log(pos.offset); // OUTPUT: 3
}
listB.delete(3, 1);
console.log(docB.toJSON()); // OUTPUT: { list: ["Foo", "Bar", "Hello"] }
{
  // The position the cursor points to is now deleted,
  // but it should still get the position
  const pos = docB.getCursorPos(cursorB);
  console.log(pos.offset); // OUTPUT: 3
  // It will also offer an update on the cursor position.
  //
  // Because the old cursor position is deleted, `doc.getCursorPos()` will slow down over time.
  // Internally, it needs to traverse the related history to find the correct position for a deleted
  // cursor position.
  //
  // After refreshing the cursor, the performance of `doc.getCursorPos()` will improve.
  console.log(pos.update); // OUTPUT: { peer: "2", counter: 1 }
  const newCursor: Cursor = pos.update!;
  // The new cursor position is undefined because the cursor is at the end of the list
  console.log(newCursor.pos()); // OUTPUT: undefined
  // The side is 1 because the cursor is at the right end of the list
  console.log(newCursor.side()); // OUTPUT: 1
  const newPos = docB.getCursorPos(newCursor);
  // The offset doesn't change
  console.log(newPos.offset); // OUTPUT: 3
  // The update is undefined because the cursor no longer needs to be updated
  console.log(newPos.update); // OUTPUT: undefined
}
```

# FILE: pages/docs/tutorial/encoding.md

---
keywords: "crdts, schema, encoding, persist, snapshot, gc"
description: "Introduce how to encode and decode Loro documents, and how to persist data"
---

# Export Mode

> Loro 1.0 has stabilized the data format and will not have any breaking
> changes.

Loro introduces three encoding modes to meet the needs of different use cases:

- **Updates Encoding**: Encodes all or a specified range of operations, mainly used for delta updates of documents.
- **Snapshot Encoding**: Encodes the entire document, including both all the operations and the current document state.
- **Shallow Snapshot Encoding**: Loro 1.0 provides a new encoding format for discarding useless historical operations. It is a special snapshot that encodes the most recent historical operations, the starting state of the remaining history, and the latest state of the document.

Different encoding formats are provided through the unified `doc.export(mode)`, and all binary encoding formats can be imported via `doc.import(bytes)`.

```ts
export type ExportMode =
  | {
      mode: "update";
      from?: VersionVector;
    }
  | {
      mode: "updates-in-range";
      spans: {
        id: ID;
        len: number;
      }[];
    }
  | {
      mode: "snapshot";
    }
  | {
      mode: "shallow-snapshot";
      frontiers: Frontiers;
    };
```

## Updates Encoding

There are two modes for updates encoding: `update` and `updates-in-range`.
- `update` mode will encode all ops from the given `from` version to the latest version of the document.
- `updates-in-range` mode will encode all ops within the specified version range.

```ts
const doc1 = new LoroDoc();
const doc2 = new LoroDoc();
doc2.setPeerId(2);
doc2.getText("text").insert(0, "hello");
const bytes = doc2.export({ mode: "update", from: doc1.version() }); // Uint8Array
// const bytes = doc2.export({ mode: "updates-in-range", spans: [{ id: { peer: 2, counter: 0 }, len: 1 }] });
// These bytes can be stored in a database or transmitted over a network.
doc1.import(bytes);
console.log(doc1.toJSON()); // { text: "hello" }
```

Updates Encoding selectively encodes operations from a chosen initial version to the most recent, enhancing support for real-time collaboration by focusing on incremental changes rather than the entire document state.

## Snapshot Encoding

```ts
const doc1 = new LoroDoc();
const doc2 = new LoroDoc();
doc2.getText("text").insert(0, "hello");
const bytes = doc2.export({ mode: "snapshot" }); // Uint8Array
// These bytes can be stored in a database or transmitted over a network.
doc1.import(bytes);
console.log(doc1.toJSON()); // { text: "hello" }
```

Snapshot Encoding comprehensively captures a document's current state and its historical operations. This mode makes it quick to load the latest document state.

## Shallow Snapshot Encoding

Loro saves all editing history to support conflict resolution and history backtracking. However, in most scenarios, most of the older history is useless; it takes up extra memory and requires more storage space. Loro 1.0 provides a shallow snapshot encoding mode: you can specify the starting historical version to be retained, and Loro will truncate all the history before this version.
```ts const doc1 = new LoroDoc(); const doc2 = new LoroDoc(); doc2.setPeerId(2); const text = doc2.getText("text"); text.insert(0, "hello"); const frontiers = doc2.frontiers(); text.insert(0, "w"); text.insert(0, "o"); text.insert(0, "r"); text.insert(0, "l"); text.insert(0, "d"); const bytes = doc2.export({ mode: "shallow-snapshot", frontiers }); // Uint8Array // These bytes can be stored in a database or transmitted over a network. doc1.import(bytes); console.log(doc1.toJSON()); // { text: "hello" } ``` Note: When using shallow snapshots, you cannot import updates that are concurrent to the snapshot's start version. For details, see [shallow snapshot](/docs/advanced/shallow_snapshot). ## Loro's Snapshot File Format To support lazy loading capabilities, we referenced common storage formats in databases, introduced a simple LSM engine, and abstracted the encoding results as simple key bytes and value bytes representations, preparing for future integration with common key-value databases. ### Encoding Details In Loro, multiple consecutive Ops of the same Container are merged into one Change, i.e., one commit record. Each Change has additional metadata, such as ID, Lamport, Timestamp, etc. In most scenarios, Changes are also consecutive, and related metadata can be compressed to some extent. Therefore, we combine multiple consecutive Changes into a ChangeBlock, which is the minimum unit for encoding Op history. Detailed encoding content can be found in the [documentation](https://github.com/loro-dev/loro/blob/main/crates/loro-internal/src/oplog/change_store/block_encode.rs). We use the first Change ID of the ChangeBlock as the query key, and the encoded ChangeBlock as the value. This allows for quick querying of specified ID Changes and their Ops without decoding all Changes. The complete State is composed of the states of all Containers starting from the Root Container. 
The key is the bytes representation of the ContainerID, and the value is composed of each Container's metadata and the bytes encoded according to its own semantic state expression. These key-value pairs will be stored in our implemented simple LSM structure.

The final exported encoding will be divided into four parts: encoding header, oplog bytes, latest state bytes, and shallow state bytes. The Header consists of a 4-byte magic number, a 16-byte checksum, and a 2-byte encode mode. The oplog bytes, latest state bytes, and shallow state bytes are all exported using the [LSM encoding structure](https://github.com/loro-dev/loro/blob/main/crates/kv-store/src/lib.rs).

## Snapshot Encoding Layout

![](../advanced/shallow-imgs/image-4.png)

# FILE: pages/docs/tutorial/version.md

# Version

In centralized environments, we can use linear version numbers to represent a version, such as incrementing a number each time or using timestamps. However, CRDTs can be used in decentralized environments, and their version representation is different. In Loro, you can express a document's version through a [Version Vector](https://en.wikipedia.org/wiki/Version_vector) or Frontiers.

```ts
doc.version(); // State Version vector
doc.oplogVersion(); // OpLog Version vector
doc.frontiers(); // State Frontiers
doc.oplogFrontiers(); // OpLog Frontiers
```

In most cases, you might only need the Version Vector, which can be used for data synchronization and version comparison. To understand the difference and the definition of `Frontiers`, see [Loro's Versioning Deep Dive: DAG, Frontiers, and Version Vectors](/docs/advanced/version_deep_dive).

# FILE: pages/docs/tutorial/persistence.md

## Best Practices for Persisting Loro Documents

The simplest approach is to use full snapshots for both import and export operations. Here's how it works:

1. Import the entire snapshot when loading the document.
2. Export a complete snapshot when saving changes.
3.
Implement a debounce or throttle mechanism to trigger snapshot saves after a certain number of edits. This method simplifies initial application development but has a drawback: user edits are not immediately saved. Let's explore how to quickly save each user edit while minimizing resource consumption. ### Balancing Quick Saves and Resource Efficiency To achieve both quick saves and resource efficiency: - Use [`Snapshot Encoding`](./encoding#snapshot-encoding) to periodically store the entire document. - Use [`Updates Encoding`](./encoding#updates-encoding) to export delta updates frequently (e.g., after each keystroke or with debounce/throttle). Store these binary data in fast-write storage like user disks or a key-value database. This ensures quick saves with low resource cost. - When loading a document, import the snapshot and all related updates to get the latest version. - After importing, export a new snapshot to replace the old one and remove imported updates for faster future loading. - If your `LoroDoc` has grown large and older history can be safely recycled, use `Shallow Snapshot Encoding` to reduce snapshot size. You can archive the history before the shallow snapshot's start version in cold storage. For collaboration, the binary data generated by snapshot/updates can be transmitted through any medium, such as WebRTC, WebSocket, or HTTP. The strong eventual consistency in CRDTs ensures that peers with identical sets of operations will converge to the same document state, obviating concerns about the order, duplication, or timing of operation delivery. # FILE: pages/docs/tutorial/ephemeral.md --- keywords: "ephemeral, awareness, presence, collaborative" description: "How to use Loro's ephemeral store feature to implement user awareness and online status management in real-time collaboration." --- # Ephemeral Store In real-time collaborative scenarios, Presence information is just as important as maintaining document consistency across peers through CRDTs. 
This includes information such as the current collaborator's username, mouse pointer position, or selected objects. We need a mechanism that doesn't persist in the CRDT document but remains ephemeral, allowing collaborators to perceive each other's presence for better coordination and to avoid conflicts when multiple users edit the same object. This is why we've introduced the Ephemeral Store.

![](/images/ephemeral.png)

Since ephemeral information is primarily used for real-time collaboration, we've chosen a simple yet effective approach. The Ephemeral Store is a timestamp-based, last-write-wins key-value store. Each entry maintains its own timestamp of the last update, enabling the system to send only the updated entry content rather than the complete current state.

## Example

```ts
import {
  EphemeralStore,
  EphemeralListener,
  EphemeralStoreEvent,
} from "loro-crdt";

const store = new EphemeralStore();
// Set ephemeral data
store.set("loro-prosemirror", {
  anchor: ...,
  focus: ...,
  user: "Alice",
});
store.set("online-users", ["Alice", "Bob"]);

expect(store.get("online-users")).toEqual(["Alice", "Bob"]);

// Encode only the data for `loro-prosemirror`
const encoded = store.encode("loro-prosemirror");

store.subscribe((e: EphemeralStoreEvent) => {
  // Listen to changes from `local`, `import`, or `timeout` events
});
```

## API

- `constructor(timeout)`: Creates a new EphemeralStore instance with an optional timeout parameter (default: 30000ms). The timeout determines how long ephemeral data remains valid before being automatically removed.
- `set(key, value)`: Sets a value for the specified key in the ephemeral store. If the key already exists, its value will be updated.
- `get(key)`: Retrieves the current value for the specified key, or returns `undefined` if the key doesn't exist.
- `delete(key)`: Removes the specified key and its associated value from the ephemeral store.
- `getAllStates()`: Returns all current key-value pairs in the ephemeral store.
- `keys()`: Returns an array of all keys currently in the ephemeral store.
- `encode(key)`: Encodes the value associated with the specified key into a binary format that can be transmitted to other peers.
- `encodeAll()`: Encodes all key-value pairs in the ephemeral store into a binary format.
- `apply(bytes)`: Applies encoded ephemeral data received from other peers to the local ephemeral store.
- `subscribe((event: EphemeralStoreEvent) => void)`: Registers a listener function that will be called whenever the ephemeral store is updated, either from local changes, remote changes, or timeout events.

```ts
interface EphemeralStoreEvent {
  // The source of the event: local changes, imported from remote, or timeout expiration
  by: "local" | "import" | "timeout";
  // Array of keys that were newly added
  added: string[];
  // Array of keys that had their values updated
  updated: string[];
  // Array of keys that were removed
  removed: string[];
}
```

- `subscribeLocalUpdates((bytes: Uint8Array) => void)`: Registers a listener that will be called only for local updates to the ephemeral store.

```ts
// You need to keep the subscriptions referenced to prevent them from being garbage collected
const _sub1 = ephemeral1.subscribeLocalUpdates((update) => {
  ephemeral2.apply(update);
});
const _sub2 = ephemeral2.subscribeLocalUpdates((update) => {
  ephemeral1.apply(update);
});
```

# FILE: pages/docs/tutorial/sync.md

## Sync

Two documents with concurrent edits can be synchronized by just two message exchanges.
Below is an example of synchronization between two documents:

```ts
const docA = new LoroDoc();
const docB = new LoroDoc();
const listA: LoroList = docA.getList("list");
listA.insert(0, "A");
listA.insert(1, "B");
listA.insert(2, "C");

// B imports the ops from A
const data: Uint8Array = docA.export({ mode: "update" });
// The data can be sent to B through the network
docB.import(data);
expect(docB.toJSON()).toStrictEqual({
  list: ["A", "B", "C"],
});

const listB: LoroList = docB.getList("list");
listB.delete(1, 1);

// `doc.export({ mode: "update", from: version })` can encode all the ops from that version to the latest version
// `version` is the version vector of the other document
const missingOps = docB.export({
  mode: "update",
  from: docA.oplogVersion(),
});
docA.import(missingOps);
expect(docA.toJSON()).toStrictEqual({
  list: ["A", "C"],
});
expect(docA.toJSON()).toStrictEqual(docB.toJSON());
```

## Real-time Collaboration

Due to CRDT properties, document consistency is guaranteed when peers receive the same updates, regardless of order or duplicates.

### Sync Strategies

1. **First Sync** (Initial synchronization between peers):
   - New peers can exchange their [Version Vectors](/docs/tutorial/version) to determine missing updates
   - Use `doc.export({ mode: "update", from: versionVector })` to get updates since the peer's last known state. You can also send the whole history via `doc.export({ mode: "update" })`, as shown in the example above.
   - The example above shows a basic first-sync scenario
2. **Realtime Sync** (Continuous updates):
   - Subscribe to local updates
   - Broadcast updates directly to all other peers
   - No need for version comparison after initial sync
   - As long as updates reach all peers, consistency is maintained

### Example

Here's how two peers can establish realtime sync when one comes online with offline changes:

1. Both peers exchange their version information
2.
Each peer shares their missing updates: - `doc2` gets updates it's missing from `doc1` - `doc1` gets updates it's missing from `doc2` 3. Both peers establish realtime sync to stay connected ```ts const doc1 = new LoroDoc(); doc1.getText("text").insert(0, "Hello"); // Peer2 joins the network const doc2 = new LoroDoc(); // ... doc2 may import its local snapshot // 1. Exchange version information const peer2Version = doc2.oplogVersion(); const peer1Version = doc1.oplogVersion(); // 2. Request missing updates from existing peers const missingOps = doc1.export({ mode: "update", from: peer2Version }); doc2.import(missingOps); const missingOps2 = doc2.export({ mode: "update", from: peer1Version, }); doc1.import(missingOps2); // 3. Establish realtime sync doc2.subscribeLocalUpdates((update) => { // websocket.send(update); }); doc1.subscribeLocalUpdates((update) => { // websocket.send(update); }); // Now both peers are in sync and can collaborate ``` ## Understanding the `import()` Return Value The `import()` method in Loro's JavaScript/WASM binding returns an object that provides feedback on the import operation. This object, let's call it `ImportStatusJS`, has the following structure: ```typescript interface ImportStatusJS { success: PeerVersionRange; pending?: PeerVersionRange; // Optional: only present if there are pending operations } interface PeerVersionRange { [peerId: string]: { start: number; // Start counter (inclusive) end: number; // End counter (exclusive) }; } ``` ### Fields Explained: 1. **`success`** (Object, `PeerVersionRange`) * **Description**: This field is always present and details the ranges of operations (changes) that were successfully imported and applied to the Loro document. * **Structure**: It's an object where: * Each **key** is a `string` representing a `PeerID` (the unique identifier of a collaborator or a source of changes). 
* Each **value** is an object `{ start: number, end: number }` defining a continuous range of operation counters for that specific peer. * `start`: The starting counter of the successfully imported range (inclusive). * `end`: The ending counter of the successfully imported range (exclusive). This means operations from `start` up to, but not including, `end` were processed. * **Purpose**: Helps understand which parts of the provided update data have been integrated into the local document's state. * **Example**: ```javascript // Assuming importResult is the return value of doc.import(bytes) console.log(importResult.success); // Example output: // { // "clientA_peerId": { "start": 0, "end": 50 }, // "server_peerId": { "start": 120, "end": 150 } // } // This means operations from clientA (counters 0-49) and // operations from server (counters 120-149) were successfully imported. ``` 2. **`pending`** (Object, `PeerVersionRange`, optional) * **Description**: This field is only present if some operations from the imported data could not be applied because they depend on other operations that Loro has not seen yet (i.e., their causal dependencies are missing). It details these "pending" operation ranges. * **Structure**: Identical to the `success` field. An object mapping `PeerID` strings to `{ start: number, end: number }` counter ranges. * **Purpose**: Informs the application that certain changes are known but are "on hold" awaiting their prerequisites. To apply these pending changes, the missing prerequisite operations must be imported first. This is crucial for maintaining data consistency in collaborative scenarios. 
* **Example**: ```javascript // Assuming importResult is the return value of doc.import(bytes) if (importResult.pending) { console.log(importResult.pending); // Example output: // { // "clientA_peerId": { "start": 50, "end": 60 }, // "clientB_peerId": { "start": 10, "end": 25 } // } // This means operations from clientA (counters 50-59) and // operations from clientB (counters 10-24) are pending due to missing dependencies. } ``` ### How to Use This Information: * Check the `success` field to confirm which updates were applied. * If the `pending` field exists and is not empty, it signals that further updates (dependencies) are required to fully integrate all known changes. Your application might need to fetch or request these missing updates from other peers or a central server. # FILE: pages/docs/tutorial/event.md # Event Handling in Loro Loro implements an event system to track changes in the document. This section explains when events are emitted and how transactions work in Loro. ## Event Emission Points Events in Loro are emitted whenever the internal document state changes. This mechanism allows application-level derived states to automatically synchronize with changes in the document state. 1. **Local Operations**: For local operations (like insertions or deletions on text), the operations are first placed in a pending state within an internal transaction. 2. **Transaction Commit**: When a transaction is committed, all pending operations collectively emit their corresponding events. This transaction commit occurs in two scenarios: - When `LoroDoc.commit()` is explicitly called - Automatically before an import or export operation Note that events are emitted asynchronously after a microtask. 
If you need to handle events immediately after a commit, you should await a microtask: ```javascript const doc = new LoroDoc(); doc.subscribe((event) => { console.log("Event:", event); }); const text = doc.getText("text"); text.insert(0, "Hello"); doc.commit(); // Event hasn't been emitted yet await Promise.resolve(); // Now the event has been emitted ``` 3. **Import**: When importing changes from a remote source using the `import()` method, respective events are emitted. This allows the local document to react to changes made by other peers. ```javascript no_run const doc = new LoroDoc(); doc.subscribe((event) => { console.log("Event:", event); }); doc.import(remoteChanges); // This will trigger events after a microtask await Promise.resolve(); // Wait for events to be emitted ``` 4. **Version Checkout**: When you switch document state to a different version using `doc.checkout(frontiers)`, Loro emits an event to reflect this change. Like other events, these are also emitted after a microtask. ```javascript const doc = new LoroDoc(); const frontiers = doc.frontiers(); doc.checkout(frontiers); // This will trigger events after a microtask await Promise.resolve(); // Wait for events to be emitted ``` ## Transaction Behavior Transactions in Loro primarily serve to bundle related operations and emit their events together as a cohesive unit. This is useful in several scenarios: 1. **Related Local Operations**: When performing multiple local operations that are logically connected, you may want them to: - Share the same commit message - Have the same timestamp - Move together during undo/redo operations 2. **Event Handling**: Applications often benefit from receiving related changes as a single batch rather than individual updates. 
Transactions facilitate this by:

- Allowing you to set an origin identifier during commit
- Including this origin value in the emitted events
- Enabling better event filtering and processing based on the origin

## Triggering a Commit

There are several ways to trigger a commit:

1. **Explicit Commit**: Directly calling the `commit()` method on the Loro document.

```javascript
const doc = new LoroDoc();
const text = doc.getText("myText");
text.insert(0, "Hello, Loro!");
doc.commit(); // This commits pending operations and emits events
```

2. **Before Import/Export**: A commit is automatically triggered before executing an import or export operation.

```javascript
const doc1 = new LoroDoc();
doc1.setPeerId(1);
const doc2 = new LoroDoc();
doc2.setPeerId(2);

// Some ops on doc1 and doc2
doc1.getText("text").insert(0, "Alice");
doc2.getText("text").insert(0, "Hello, Loro!");
console.log(doc1.version().toJSON()); // Map(0) {}
console.log(doc2.version().toJSON()); // Map(0) {}

const updates = doc1.export({ mode: "snapshot" });
doc2.import(updates); // This first commits any pending operations in doc2
console.log(doc2.version().toJSON()); // Map(2) { "1" => 5, "2" => 12 }
console.log(doc1.version().toJSON()); // Map(1) { "1" => 5 }
```

## Transactions in Loro

It's important to note that Loro's concept of a transaction differs from traditional database transactions:

- Loro transactions do not have ACID properties.
- They primarily serve as event wrappers.
- There is no rollback mechanism if an operation fails.

# FILE: pages/docs/tutorial/get_started.md

---
keywords: "loro-crdt, build collaboration software, local-first, operation transform, crdts, ot"
description: "How to use Loro to build real-time or asynchronous collaboration software."
--- # Getting Started You can use Loro in your application by using: - [`loro-crdt`](https://www.npmjs.com/package/loro-crdt) NPM package - [`loro`](https://crates.io/crates/loro) Rust crate - [`loro-swift`](https://github.com/loro-dev/loro-swift) Swift package - [`loro-py`](https://github.com/loro-dev/loro-py) Python package - You can also find a list of examples in [Loro examples in Deno](https://github.com/loro-dev/loro-examples-deno). You can use [Loro Inspector](/docs/advanced/inspector) to debug and visualize the state and history of Loro documents. The following guide will use `loro-crdt` js package as the example. [![Open in StackBlitz](https://developer.stackblitz.com/img/open_in_stackblitz.svg)](https://stackblitz.com/edit/loro-basic-test?file=test%2Floro-sync.test.ts) ## Install ```bash npm install loro-crdt # Or pnpm install loro-crdt # Or yarn add loro-crdt ``` If you're using `Vite`, you should add the following to your vite.config.ts: ```ts no-run import wasm from "vite-plugin-wasm"; import topLevelAwait from "vite-plugin-top-level-await"; export default defineConfig({ plugins: [...otherConfigures, wasm(), topLevelAwait()], }); ``` If you're using `Next.js`, you should add the following to your next.config.js: ```js no-run module.exports = { webpack: function (config) { config.experiments = { layers: true, asyncWebAssembly: true, }; return config; }, }; ``` You can also use Loro directly in the browser via ESM imports. Here's a minimal example: ```html ESM Module Example
```

## Introduction

Syncing data and building realtime collaborative apps is notoriously challenging, especially when devices can go offline or form a peer-to-peer network. Loro simplifies this for you. Once you model your app state with Loro, syncing is simple:

```ts
const docA = new LoroDoc();
const docB = new LoroDoc();
//...operations on docA and docB

// Assume docA and docB are two Loro documents on two different devices
const bytesA = docA.export({ mode: "update" });
// send bytes to docB by any method
docB.import(bytesA);
// docB is now updated with all the changes from docA

const bytesB = docB.export({ mode: "update" });
// send bytes to docA by any method
docA.import(bytesB);
// docA and docB are now in sync; they have the same state
```

Saving your app state is also straightforward:

```ts
const doc = new LoroDoc();
doc.getText("text").insert(0, "Hello world!");
const bytes = doc.export({ mode: "snapshot" });
// Bytes can be saved to local storage, a database, or sent over the network
```

Loading your app state:

```ts no_run
const newDoc = new LoroDoc();
newDoc.import(bytes);
```

Loro also makes it easy to time-travel through the history and add version control to your app. [Learn more about time travel](/docs/tutorial/time_travel).

```ts no_run
doc.checkout(version); // Checkout the doc to the given version
```

Loro is compatible with the JSON schema. If you can model your app state with JSON, you can probably sync your app with Loro. Because we need to adhere to the JSON schema, using a number as a key in a Map is not permitted, and cyclic links should be avoided.

```ts no_run
doc.toJSON(); // Get the JSON representation of the doc
```

## Entry Point: LoroDoc

LoroDoc is the entry point for using Loro. You must create a LoroDoc to use Map, List, Text, and the other container types, and to synchronize data.
```ts
const doc = new LoroDoc();
const text: LoroText = doc.getText("text");
text.insert(0, "Hello world!");
console.log(doc.toJSON()); // { "text": "Hello world!" }
```

## Container

We refer to CRDT types such as `List`, `Map`, `Tree`, `MovableList`, and `Text` as `Container`s. Here are their basic operations:

```ts
const doc = new LoroDoc();
const list: LoroList = doc.getList("list");
list.insert(0, "A");
list.insert(1, "B");
list.insert(2, "C");

const map: LoroMap = doc.getMap("map");
// map keys can only be strings
map.set("key", "value");
expect(doc.toJSON()).toStrictEqual({
  list: ["A", "B", "C"],
  map: { key: "value" },
});

// delete 2 elements starting at index 0
list.delete(0, 2);
expect(doc.toJSON()).toStrictEqual({
  list: ["C"],
  map: { key: "value" },
});

// Insert a text container into the list
const text = list.insertContainer(0, new LoroText());
text.insert(0, "Hello");
text.insert(0, "Hi! ");
expect(doc.toJSON()).toStrictEqual({
  list: ["Hi! Hello", "C"],
  map: { key: "value" },
});

// Insert a list container into the map
const list2 = map.setContainer("test", new LoroList());
list2.insert(0, 1);
expect(doc.toJSON()).toStrictEqual({
  list: ["Hi! Hello", "C"],
  map: { key: "value", test: [1] },
});
```

## Save and Load

Loro is a pure library and does not handle network protocols or storage mechanisms. It is your responsibility to manage the storage and transmission of the binary data exported by Loro.

To save the document, use `doc.export({mode: "snapshot"})` to get its binary form. To open it again, use `doc.import(data)` to load this binary data.

```ts
const doc = new LoroDoc();
doc.getText("text").insert(0, "Hello world!");
const data = doc.export({ mode: "snapshot" });

const newDoc = new LoroDoc();
newDoc.import(data);
expect(newDoc.toJSON()).toStrictEqual({
  text: "Hello world!",
});
```

Exporting the entire document on each keypress is inefficient. Instead, use `doc.export({mode: "update", from: VersionVector})` to obtain binary data for operations since the last export.
```ts const doc = new LoroDoc(); doc.getText("text").insert(0, "Hello world!"); const data = doc.export({ mode: "snapshot" }); let lastSavedVersion = doc.version(); doc.getText("text").insert(0, "✨"); const update0 = doc.export({ mode: "update", from: lastSavedVersion }); lastSavedVersion = doc.version(); doc.getText("text").insert(0, "😶‍🌫️"); const update1 = doc.export({ mode: "update", from: lastSavedVersion }); { /** * You can import the snapshot and the updates to get the latest version of the document. */ // import the snapshot const newDoc = new LoroDoc(); newDoc.import(data); expect(newDoc.toJSON()).toStrictEqual({ text: "Hello world!", }); // import update0 newDoc.import(update0); expect(newDoc.toJSON()).toStrictEqual({ text: "✨Hello world!", }); // import update1 newDoc.import(update1); expect(newDoc.toJSON()).toStrictEqual({ text: "😶‍🌫️✨Hello world!", }); } { /** * You may also import them in a batch */ const newDoc = new LoroDoc(); newDoc.importUpdateBatch([update1, update0, data]); expect(newDoc.toJSON()).toStrictEqual({ text: "😶‍🌫️✨Hello world!", }); } ``` If updates accumulate, exporting a new snapshot can quicken import times and decrease the overall size of the exported data. You can store the binary data exported from Loro wherever you prefer. ## Sync Two documents with concurrent edits can be synchronized by just two message exchanges. 
Below is an example of synchronization between two documents:

```ts
const docA = new LoroDoc();
const docB = new LoroDoc();
const listA: LoroList = docA.getList("list");
listA.insert(0, "A");
listA.insert(1, "B");
listA.insert(2, "C");

// B imports the ops from A
const data: Uint8Array = docA.export({ mode: "update" });
// The data can be sent to B through the network
docB.import(data);
expect(docB.toJSON()).toStrictEqual({
  list: ["A", "B", "C"],
});

const listB: LoroList = docB.getList("list");
listB.delete(1, 1);

// `doc.export({mode: "update", from: version})` can encode all the ops from the
// given version to the latest version; `version` is the version vector of the
// other document
const missingOps = docB.export({
  mode: "update",
  from: docA.oplogVersion(),
});

docA.import(missingOps);
expect(docA.toJSON()).toStrictEqual({
  list: ["A", "C"],
});
expect(docA.toJSON()).toStrictEqual(docB.toJSON());
```

## Event

You can subscribe to events from `Container`s. `LoroText` and `LoroList` can receive updates in [Quill Delta](https://quilljs.com/docs/delta/) format.

The events will be emitted after a transaction is committed. A transaction is committed when:

- `doc.commit()` is called.
- `doc.export(mode)` is called.
- `doc.import(data)` is called.
- `doc.checkout(version)` is called.

Below is an example of a rich text event:

```ts
// The code is from https://github.com/loro-dev/loro-examples-deno
const doc = new LoroDoc();
const text = doc.getText("text");
text.insert(0, "Hello world!");
doc.commit();
let ran = false;
text.subscribe((e) => {
  for (const event of e.events) {
    if (event.diff.type === "text") {
      expect(event.diff.diff).toStrictEqual([
        {
          retain: 5,
          attributes: { bold: true },
        },
      ]);
      ran = true;
    }
  }
});
text.mark({ start: 0, end: 5 }, "bold", true);
doc.commit();
await new Promise((r) => setTimeout(r, 1));
expect(ran).toBeTruthy();
```

The types of events are defined as follows:

```ts
export interface LoroEvent {
  /**
   * The container ID of the event's target.
   */
  target: ContainerID;
  diff: Diff;
  /**
   * The absolute path of the event's emitter, which can be an index of a list container or a key of a map container.
   */
  path: Path;
}

export type Path = (number | string | TreeID)[];

export type Diff = ListDiff | TextDiff | MapDiff | TreeDiff | CounterDiff;

export type ListDiff = {
  type: "list";
  diff: Delta<ValueOrContainer[]>[];
};

export type TextDiff = {
  type: "text";
  diff: Delta<string>[];
};

export type MapDiff = {
  type: "map";
  updated: Record<string, ValueOrContainer | undefined>;
};

export type TreeDiffItem =
  | {
      target: TreeID;
      action: "create";
      parent: TreeID | undefined;
      index: number;
      fractionalIndex: string;
    }
  | {
      target: TreeID;
      action: "delete";
      oldParent: TreeID | undefined;
      oldIndex: number;
    }
  | {
      target: TreeID;
      action: "move";
      parent: TreeID | undefined;
      index: number;
      fractionalIndex: string;
      oldParent: TreeID | undefined;
      oldIndex: number;
    };

export type TreeDiff = {
  type: "tree";
  diff: TreeDiffItem[];
};

export type CounterDiff = {
  type: "counter";
  increment: number;
};

export type Delta<T> =
  | {
      insert: T;
      attributes?: { [key in string]: {} };
      retain?: undefined;
      delete?: undefined;
    }
  | {
      delete: number;
      attributes?: undefined;
      retain?: undefined;
      insert?: undefined;
    }
  | {
      retain: number;
      attributes?: { [key in string]: {} };
      delete?: undefined;
      insert?: undefined;
    };
```

# FILE: pages/docs/tutorial/tree.md

---
keywords: "tree crdt, move operation, fractional index, hierarchical, tree"
description: "How to use the Loro tree CRDT and all of its APIs."
---

# Tree

Tree structures are essential for modeling hierarchical data, especially when operations span different levels of the hierarchy. Loro implements a movable tree CRDT based on the algorithm created by Kleppmann et al. in **_[A highly-available move operation for replicated trees](https://martin.kleppmann.com/2021/10/07/crdt-tree-move-operation.html)_**.
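The safety rule at the heart of such a move operation is that a replica applies a move only if it would not make a node an ancestor of itself. Here is a toy illustration of that cycle check in TypeScript (not Loro's actual implementation; `parentOf`, `isAncestor`, and `tryMove` are hypothetical names for sketching purposes):

```typescript
// Toy movable-tree cycle check: the tree is a map from node to parent
// (roots map to undefined). A move is rejected if it would create a cycle.
type NodeId = string;

function isAncestor(
  parentOf: Map<NodeId, NodeId | undefined>,
  a: NodeId,
  b: NodeId
): boolean {
  // Walk from `b` up to the root; return true if we pass through `a`.
  let cur = parentOf.get(b);
  while (cur !== undefined) {
    if (cur === a) return true;
    cur = parentOf.get(cur);
  }
  return false;
}

function tryMove(
  parentOf: Map<NodeId, NodeId | undefined>,
  node: NodeId,
  newParent: NodeId
): boolean {
  // Reject moves that would make `node` its own ancestor.
  if (node === newParent || isAncestor(parentOf, node, newParent)) {
    return false;
  }
  parentOf.set(node, newParent);
  return true;
}
```

Kleppmann's algorithm builds on this check by ordering all move operations globally and undoing/redoing them on concurrent edits, so every replica converges to the same valid tree.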
In addition to this, Loro employs **Fractional Index** (an algorithm used in distributed systems to maintain the order of sequences) to sort child nodes, ensuring that sibling nodes keep an orderly sequence. As a result, Loro can provide APIs such as `moveAfter()` and `moveBefore()`. This is particularly useful when a custom-ordered hierarchical view is required.

## Node Data

Loro associates a [Map](https://www.loro.dev/docs/tutorial/map) with each tree node, serving as a data container for the node and allowing you to nest any data structure supported by Loro.

> Note: The associated Map Container is considered a child of the Tree Container within the Loro document hierarchy.

You can get the associated map via the `.data` property of a `LoroTreeNode`, then invoke `set` or `setContainer` on that map to add data or sub-containers to the tree node. For example:

```ts
import { LoroDoc, LoroTree, LoroTreeNode, LoroMap, LoroList } from "loro-crdt";

let doc = new LoroDoc();
let tree: LoroTree = doc.getTree("tree");
let node: LoroTreeNode = tree.createNode();
node.data.set("key", "value");
node.data.setContainer("list", new LoroList());
```

## Ordered Tree Nodes

In certain scenarios, such as graphic design or file systems, sibling nodes may also require a sequential relationship, so we introduced `Fractional Index` in Loro to support this capability. You can read more about `Fractional Index` in the [Figma blog](https://www.figma.com/blog/realtime-editing-of-ordered-sequences). In simple terms, `Fractional Index` assigns a sortable value to each object; when a new object is inserted between two existing ones, its `Fractional Index` falls between the left and right values.

The Rust-based `Fractional Index` [implementation by Drifting-in-Space](https://github.com/drifting-in-space/fractional_index) is well designed and keeps document-size growth minimal in most cases. We forked it and made a small extension for use in Loro.
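To build intuition for how such keys behave, here is a toy key generator (not Loro's forked Rust implementation) that produces a lowercase-letter string strictly between two neighbors. It assumes `lo < hi` lexicographically, treats `""` as the unbounded lower/upper end, and never emits keys ending in `'a'` (the minimum digit), so there is always room for further insertions:

```typescript
// Toy fractional index over the alphabet 'a'..'z' (illustrative only).
function keyBetween(lo: string, hi: string): string {
  // lo === "" means "before everything"; hi === "" means "after everything".
  let result = "";
  for (let i = 0; ; i++) {
    const l = i < lo.length ? lo.charCodeAt(i) - 97 : -1; // -1: below 'a'
    const h = i < hi.length ? hi.charCodeAt(i) - 97 : 26; // 26: above 'z'
    const mid = Math.floor((l + h) / 2);
    if (mid > 0 && mid > l && mid < h) {
      // A digit strictly between both bounds exists (and is not 'a',
      // which would block later insertions before the new key).
      return result + String.fromCharCode(97 + mid);
    }
    // Otherwise copy lo's digit (or 'a') and go one position deeper.
    result += String.fromCharCode(97 + Math.max(l, 0));
  }
}
```

For example, inserting repeatedly between `"a"` and `"b"` yields progressively longer keys (`"am"`, `"at"`, ...), which is exactly the growth behavior that careful implementations like the one Loro forked try to minimize.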
Whenever a new tree node is created or a node is moved to a new position, Loro generates a `Fractional Index` for the node based on its position, so it can be sorted among its siblings.

In collaborative environments, conflicts between `Fractional Index`es can arise, such as different nodes ending up with the same `Fractional Index`, or a new node's position being calculated between two identical `Fractional Index`es. We discuss the corresponding conflict-resolution methods in detail in [the blog post](https://www.loro.dev/blog/movable-tree).

Call `enableFractionalIndex(jitter: number)` on the `LoroTree` to enable `Fractional Index` generation.

> Note: Fractional Index has an interleaving issue, but we believe this is acceptable for tree structures.

## Events

There are three types of events in the tree structure: `Create`, `Move`, and `Delete`, as follows:

```typescript
export type TreeDiffItem =
  | {
      target: TreeID;
      action: "create";
      parent: TreeID | undefined;
      index: number;
      fractionalIndex: string;
    }
  | {
      target: TreeID;
      action: "delete";
      oldParent: TreeID | undefined;
      oldIndex: number;
    }
  | {
      target: TreeID;
      action: "move";
      parent: TreeID | undefined;
      index: number;
      fractionalIndex: string;
      oldParent: TreeID | undefined;
      oldIndex: number;
    };
```

Here, `index` and `fractionalIndex` represent the node's index and the hex-string representation of the `Fractional Index`, respectively. Loro emits events in causal order.

The deletion event signifies that the target node, along with the entire subtree rooted at it, has been deleted. Deletion events for the child nodes are not emitted.

### Events for a Node's Data

Since the data of tree nodes is represented using `MapContainer`, each `MapContainer` associated with a tree node is a child of the `TreeContainer` in the document. If you modify the data of a tree node, you will receive an event from the `MapContainer`.
However, the event path contains an element of type `TreeID` to indicate which node's data has been altered.

## Retrieving All Nodes

There are multiple ways to retrieve all nodes from a `LoroTree`:

1. `nodes()`: Retrieves all `LoroTreeNode` instances from the current `LoroTree`, but this function does not guarantee the order of the nodes.
2. `roots()`: Retrieves all root nodes from the current `LoroTree`, with the root nodes being ordered. Subsequently, the `children()` method of `LoroTreeNode` can be used to perform a hierarchical traversal.
3. `toArray()`: Retrieves all node information in hierarchical-traversal order, where each node's data structure is as follows:

```ts no_run
{
  id: TreeID;
  parent: TreeID | undefined;
  index: number;
  fractionalIndex: string;
  meta: LoroMap;
}
```

4. `toJSON()`: The JSON representation of `toArray()`, where each node's `meta` is also recursively parsed into JSON format.

## Basic Usage

### Example

```ts
import { LoroDoc, LoroTree, LoroTreeNode, LoroMap } from "loro-crdt";

let doc = new LoroDoc();
let tree: LoroTree = doc.getTree("tree");
let root: LoroTreeNode = tree.createNode();
// By default, append to the end of the parent node's children list
let node = root.createNode();
// Specify the child's position
let node2 = root.createNode(0);
// Move the node to become a child of another node
node.move(node2);
// Specify the child's position within the new parent
node.move(node2, 0);
// Move the node to become a root node
node.move();
// Move the node to be positioned after another node
node.moveAfter(node2);
// Move the node to be positioned before another node
node.moveBefore(node2);
// Retrieve the index of the node within its parent's children
let index = node.index();
// Get the fractional index of the node
let fractionalIndex = node.fractionalIndex();
// Access the associated data map container
let nodeData: LoroMap = node.data;
```

# FILE: pages/docs/tutorial/time_travel.mdx

---
keywords: "crdt, time travel, history, checkout, version
control"
description: "time travel in Loro"
---

# How to Use Time Travel in Loro

In Loro, you can call `doc.checkout(frontiers)` to jump to the version specified by the frontiers ([learn more about frontiers](/docs/advanced/version_deep_dive#frontiers)).

Note that using `doc.checkout(frontiers)` to jump to a specific version places the document in a detached state, preventing further edits. To learn more, see [_Attached/Detached Status_](/docs/advanced/doc_state_and_oplog#attacheddetached-status). To continue editing, reattach the document to the latest version using `doc.attach()`. This design is temporary and will be phased out once we have a more refined version control API in place.

## Read-only Time Travel

Below we demonstrate how to implement simple, read-only time travel. You could, for example, combine this with a slider in a UI to allow users to view the document over time.

### Enable Timestamps

For this example to work, the edits made to the document must have had [timestamp storage](/docs/advanced/timestamp) enabled:

```ts no_run
doc.setRecordTimestamp(true);
```

This ensures that every change to the document has a timestamp attached. We will use these timestamps to sort changes so that the ordering matches user intuition.

### Implementing Time Travel

The first step is to load our document. Here we assume that you have a snapshot from your database or API.

```ts no_run
// Get the snapshot for your doc from your database / API
let snapshot = fetchSnapshot();

// Import into a new document
const doc = new LoroDoc();
doc.import(snapshot);
```

Next we must collect and sort the timestamps for every change in the document. We want users to be able to drag a slider to select a timestamp from this list.
```ts no_run
// Collect all changes from the document
const changes = doc.getAllChanges();

// Get the timestamps for all changes
const timestamps = Array.from(
  new Set(
    [...changes.values()]
      .flat() // Flatten changes from all peers into one list
      .map((x) => x.timestamp) // Get the timestamp of each change
      .filter((x) => !!x)
  )
);

// Sort the timestamps
timestamps.sort((a, b) => a - b);
```

Next we need a helper function that returns a list of [Frontiers](/docs/advanced/version_deep_dive#frontiers) for any timestamp.

For each peer that has edited a document, there is a list of changes by that peer. Each change has a `counter` and a `length`. The `counter` is like an ever-incrementing version number for the changes made by that peer. A change's `counter` is its starting point, and the `length` indicates how much the change incremented the counter by the end of the change.

The frontiers are the list of counters that we want to check out from each peer. Since we are going for a timeline view, we want, for each peer, the highest counter that we know happened before our target timestamp. Here we make a helper function to do that.

```ts
const getFrontiersForTimestamp = (
  changes: Map<string, { counter: number; length: number; timestamp: number }[]>,
  ts: number
): { peer: string; counter: number }[] => {
  const frontiers = [] as { peer: string; counter: number }[];
  // Record the highest counter for each peer whose change is not later than
  // our target timestamp.
  changes.forEach((changes, peer) => {
    let counter = -1;
    for (const change of changes) {
      if (change.timestamp <= ts) {
        counter = Math.max(counter, change.counter + change.length - 1);
      }
    }
    if (counter > -1) {
      frontiers.push({ counter, peer });
    }
  });
  return frontiers;
};
```

Finally, we can get the index from our slider, look up the timestamp in our list, and then check out the calculated frontiers.
```ts no_run let sliderIdx = 3; const timestamp = timestamps[sliderIdx - 1]; const frontiers = getFrontiersForTimestamp(changes, timestamp); doc.checkout(frontiers); ``` ## Time Travel With Editing Below is a more complete example demonstrating Time Travel functionality with a node editor. # FILE: pages/docs/tutorial/loro_doc.md # Getting Started with LoroDoc LoroDoc is the main entry point for almost all Loro functionality. It serves as a container manager and coordinator that provides: 1. **Container Management**: Create and manage different types of CRDT containers (Text, List, Map, Tree, MovableList) 2. **Version Control**: Track document history, checkout versions, and manage branches 3. **Event System**: Subscribe to changes at both document and container levels 4. **Import/Export**: Save and load documents/updates in various formats ## Basic Usage First, let's create a new LoroDoc instance: ```typescript import { LoroDoc } from "loro-crdt"; // Create a new document with a random peer ID const doc = new LoroDoc(); // Or set a specific peer ID doc.setPeerId("1"); // Create containers const text = doc.getText("text"); const list = doc.getList("list"); const map = doc.getMap("map"); const tree = doc.getTree("tree"); const movableList = doc.getMovableList("tasks"); ``` To model a document with the following format: ```json { "meta": { "title": "Document Title", "createdBy": "Author" }, "content": "Article", "comments": [ { "user": "userId", "comment": "comment" } ] } ``` ```ts const doc = new LoroDoc(); const meta = doc.getMap("meta"); meta.set("title", "Document Title"); meta.set("createdBy", "Author"); doc.getText("content").insert(0, "Article"); const comments = doc.getList("comments"); const comment1 = comments.insertContainer(0, new LoroMap()); comment1.set("user", "userId"); comment1.set("comment", "comment"); ``` ## Container Types LoroDoc supports several container types: 1. **Text** - For rich text editing 2. **List** - For ordered collections 3. 
**Map** - For key-value pairs 4. **Tree** - For hierarchical data structures 5. **MovableList** - For lists with movable items Let's look at how to use each type: ### Text Container ```typescript const doc = new LoroDoc(); const text = doc.getText("text"); text.insert(0, "Hello"); text.insert(5, " World!"); console.log(text.toString()); // "Hello World!" // Rich text support doc.configTextStyle({ bold: { expand: "after" }, link: { expand: "none" } }); text.mark({ start: 0, end: 5 }, "bold", true); ``` ### List Container ```typescript const doc = new LoroDoc(); const list = doc.getList("list"); list.insert(0, "first"); list.insert(1, "second"); console.log(list.toArray()); // ["first", "second"] // Nested containers const nestedText = list.insertContainer(2, new LoroText()); nestedText.insert(0, "nested text"); ``` ### Map Container ```typescript const doc = new LoroDoc(); const map = doc.getMap("map"); map.set("name", "John"); map.set("age", 30); console.log(map.get("name")); // "John" // Nested containers const userText = map.setContainer("bio", new LoroText()); userText.insert(0, "Software Engineer"); ``` ### Tree Container ```typescript const doc = new LoroDoc(); const tree = doc.getTree("tree"); const root = tree.createNode(); root.data.set("name", "Root"); const child1 = root.createNode(); child1.data.set("name", "Child 1"); const child2 = root.createNode(); child2.data.set("name", "Child 2"); ``` ### MovableList Container ```typescript const doc = new LoroDoc(); const movableList = doc.getMovableList("tasks"); movableList.insert(0, "Task 1"); movableList.insert(1, "Task 2"); movableList.move(0, 1); // Move Task 1 after Task 2 ``` ## Collaboration Features LoroDoc can be used for real-time collaboration. 
Here's how to sync changes between peers: ```typescript // First peer const doc1 = new LoroDoc(); doc1.setPeerId("1"); const text1 = doc1.getText("text"); // Second peer const doc2 = new LoroDoc(); doc2.setPeerId("2"); const text2 = doc2.getText("text"); // Set up two-way sync doc1.subscribeLocalUpdates((updates) => { doc2.import(updates); }); doc2.subscribeLocalUpdates((updates) => { doc1.import(updates); }); // Now changes in doc1 will be reflected in doc2 and vice versa text1.insert(0, "Hello"); doc1.commit(); await Promise.resolve(); // await for the event to be emitted text2.insert(5, " World!"); doc2.commit(); ``` ## Undo/Redo Support Loro provides built-in undo/redo functionality: ```typescript import { UndoManager, LoroDoc } from "loro-crdt"; const doc = new LoroDoc(); const undoManager = new UndoManager(doc, { maxUndoSteps: 100, mergeInterval: 1000 }); const text = doc.getText("text"); // Make some changes text.insert(0, "Hello"); doc.commit(); // Undo the changes if (undoManager.canUndo()) { undoManager.undo(); } // Redo the changes if (undoManager.canRedo()) { undoManager.redo(); } ``` ## Exporting and Importing You can save and load the document state: ```typescript const doc = new LoroDoc(); // Export the document const snapshot = doc.export({ mode: "snapshot" }); // Create a new document from the snapshot const newDoc = LoroDoc.fromSnapshot(snapshot); const doc2 = new LoroDoc(); // Or import into an existing document doc2.import(snapshot); ``` ### Shallow Import/Export Shallow import/export is a feature that allows you to create and share document snapshots without including the complete history. This is particularly useful for: 1. Reducing the size of exported data 2. Sharing the document with others without revealing the complete history 3. 
Speeding up the import/export process

Here's how to use shallow export:

```typescript
const doc = new LoroDoc();
// Export a shallow snapshot that only includes the history since `doc.oplogFrontiers()`.
// It works like `git clone --depth=1`: the exported data only contains the most recent ops.
const shallowSnapshot = doc.export({
  mode: "shallow-snapshot",
  frontiers: doc.oplogFrontiers(),
});

// Check if a document is shallow
const isShallow = doc.isShallow();

// Get the version since which the history is available
const sinceVersion = doc.shallowSinceVV();
// Or get it in frontiers format
const sinceFrontiers = doc.shallowSinceFrontiers();
```

Note: A shallow document only contains history after a certain version point. Operations before the shallow start point are not included, but the document remains fully functional for collaboration.

### Redacting Sensitive Content

Loro allows you to redact specific segments of document history while preserving the rest. This is particularly useful when:

1. A user accidentally pastes sensitive information (like passwords or API keys) into the document
2. You need to remove just the sensitive part of the history while keeping older and newer edits intact
3.
You want to share document history with sensitive segments sanitized

Here's how to use the redaction functionality:

```typescript
const doc = new LoroDoc();
doc.setPeerId("1");

// Create some content to be redacted
const text = doc.getText("text");
text.insert(0, "Sensitive information");
doc.commit();

const map = doc.getMap("map");
map.set("password", "secret123");
map.set("public", "public information");
doc.commit();

// Export JSON updates
const jsonUpdates = doc.exportJsonUpdates();

// Define the version range to redact (redact the text content)
const versionRange = {
  "1": [0, 21], // Redact the "Sensitive information"
};

// Apply redaction
const redactedJson = redactJsonUpdates(jsonUpdates, versionRange);

// Create a new document with the redacted content
const redactedDoc = new LoroDoc();
redactedDoc.importJsonUpdates(redactedJson);

// The text content is now redacted with replacement characters
console.log(redactedDoc.getText("text").toString());
// Outputs: "���������������������"

// Map operations after the redacted range remain intact
console.log(redactedDoc.getMap("map").get("password")); // "secret123"
console.log(redactedDoc.getMap("map").get("public")); // "public information"
```

Redaction applies these rules to preserve document structure while removing sensitive content:

- Preserves delete and move operations
- Replaces text insertion content with the Unicode replacement character '�'
- Substitutes list and map insert values with null
- Maintains the structure of nested containers
- Replaces text mark values with null
- Preserves map keys and text annotation keys

Note that redaction doesn't remove the operations completely; it just replaces the sensitive content with placeholders. If you need to completely remove portions of history, see the section on shallow snapshots in [Tips](./tips.md).
#### Important: Synchronization Considerations

Both redaction and shallow snapshots maintain future synchronization consistency, but your application is responsible for ensuring all peers get the sanitized version. Otherwise, old instances of the document with sensitive information will still exist on other peers.

## Event Subscription

Subscribe to changes in the document:

```typescript
const doc = new LoroDoc();
doc.subscribe((event) => {
  console.log("Document changed:", event);
});

const text = doc.getText("text");
// Container-specific subscription
text.subscribe((event) => {
  console.log("Text changed:", event);
});
```

### Event Emission

Events in LoroDoc are emitted only after a transaction is committed, and importantly, the events are emitted after a microtask. This means you need to await a microtask if you want to handle the events immediately after a commit. A commit can happen in two ways:

1. Explicitly calling `doc.commit()`:

```typescript
const doc = new LoroDoc();
const text = doc.getText("text");

// Subscribe to changes
doc.subscribe((event) => {
  console.log("Change event:", event);
});

text.insert(0, "Hello"); // No event emitted yet
doc.commit(); // Event will be emitted after a microtask

// If you need to wait for the event:
await Promise.resolve();
// Now the event has been emitted
```

2. Implicitly through certain operations:

```typescript no_run
const doc = new LoroDoc();
const text = doc.getText("text");

// These operations trigger implicit commits:
doc.export({ mode: "snapshot" }); // Implicit commit
doc.import(someData); // Implicit commit
doc.checkout(someVersion); // Implicit commit
```

You can also specify additional information when committing:

```typescript no_run
doc.commit({
  origin: "user-edit", // Mark the event source
  message: "Add greeting", // Like a git commit message
  timestamp: Date.now(), // Custom timestamp
});
await Promise.resolve(); // Wait for the event if needed
```

Note: Multiple operations before a `commit` are batched into a single event.
This helps reduce event overhead and provides atomic changes. The event will still be emitted after a microtask, regardless of whether the commit was explicit or implicit.

## Version Control and History

LoroDoc provides powerful version control features that allow you to track and manage document history:

### Version Representation

Loro uses two ways to represent versions:

1. **Version Vector**: A map from peer ID to counter

```typescript
const doc = new LoroDoc();
// Get current version vector
const vv = doc.version();
// Get oplog version vector (latest known version)
const oplogVv = doc.oplogVersion();
```

2. **Frontiers**: A list of operation IDs that represent the latest operations from each peer. This is more compact than a version vector. In most cases, it has only one element.

```typescript
const doc = new LoroDoc();
doc.setPeerId("0");
doc.getMap("map").set("text", "Hello");
// Get current frontiers
const frontiers = doc.frontiers();
// Get oplog frontiers (latest known version)
const oplogFrontiers = doc.oplogFrontiers(); // { "0": 0 }
```

### Checkout and Time Travel

You can navigate through document history using checkout:

```typescript
const doc = new LoroDoc();
// Save current version
const frontiers = doc.frontiers();
const text = doc.getText("text");
// Make some changes
text.insert(0, "Hello World!");
// Checkout to previous version
doc.checkout(frontiers);
// Return to latest version
doc.checkoutToLatest();
// or doc.attach();
```

Note: After checkout, the document enters "detached" mode. In this mode:

- The document is not editable by default
- Import operations are recorded but not applied to the document state
- You need to call `attach()` or `checkoutToLatest()` to go back to the latest version and make it editable again

### Detached Mode

The document enters "detached" mode after a `checkout` operation or when explicitly calling `doc.detach()`. In detached mode, the document state is not synchronized with the latest version in the OpLog.
```typescript const doc = new LoroDoc(); // Check if document is in detached mode console.log(doc.isDetached()); // false // Explicitly detach the document doc.detach(); console.log(doc.isDetached()); // true // Return to attached mode doc.attach(); console.log(doc.isDetached()); // false ``` By default, editing is disabled in detached mode. However, you can enable it: ```typescript const doc = new LoroDoc(); // Enable editing in detached mode doc.setDetachedEditing(true); console.log(doc.isDetachedEditingEnabled()); // true ``` #### Key Behaviors in Detached Mode 1. **Import Operations** - Operations imported via `doc.import()` are recorded in the OpLog - These operations are not applied to the document state until checkout ```typescript const oldDoc = new LoroDoc(); oldDoc.getMap("map").set("name", "John"); const updates = oldDoc.export({ mode: "update" }); const doc = new LoroDoc(); // In detached mode doc.import(updates); // Updates are stored but not applied doc.checkoutToLatest(); // Now updates are applied ``` 2. **Version Management** - Each checkout uses a different PeerID to prevent conflicts - The document maintains two version states: ```typescript const doc = new LoroDoc(); // Current state version const stateVersion = doc.version(); // Latest known version in OpLog const latestVersion = doc.oplogVersion(); ``` 3. **Forking** - You can create a new document at a specific version: ```typescript const doc = new LoroDoc(); doc.setPeerId("0"); doc.getText("text").insert(0, "Hello"); // Fork at current frontiers const forkedDoc = doc.fork(); // Or fork at specific frontiers const forkedAtVersion = doc.forkAt([{ peer: "0", counter: 1 }]); console.log(forkedAtVersion.getText("text").toString()); // "He" ``` #### Common Use Cases 1. 
**Time Travel and History Review**

```typescript no_run
const doc = new LoroDoc();
const text = doc.getText("text");
// Save current version
const frontiers = doc.frontiers();
// Make changes
text.insert(0, "New content");
// Review previous version
doc.checkout(frontiers);
// Return to latest version
doc.checkoutToLatest();
```

2. **Branching**

```typescript no_run
const doc = new LoroDoc();
// Enable detached editing
doc.setDetachedEditing(true);
// Create a branch
const branch = doc.fork();
// Make changes in the branch
const branchText = branch.getText("text");
branchText.insert(0, "Branch changes");
```

## Subscription and Sync

### Local Updates Subscription

Subscribe to local changes for syncing between peers:

```typescript no_run
const doc = new LoroDoc();
// Subscribe to local updates
const unsubscribe = doc.subscribeLocalUpdates((updates) => {
  // Send updates to other peers
  otherDoc.import(updates);
});
// Later, unsubscribe when needed
unsubscribe();
```

### Document Events

Subscribe to all document changes. The event may be triggered by local operations, importing updates, or switching to another version.
```typescript const doc = new LoroDoc(); doc.subscribe((event: LoroEventBatch) => { console.log("Event triggered by:", event.by); // "local" | "import" | "checkout" console.log("Event origin:", event.origin); for (const e of event.events) { console.log("Target container:", e.target); console.log("Path:", e.path); console.log("Changes:", e.diff); } }); ``` ### Container-specific Events Subscribe to changes in specific containers: ```typescript const doc = new LoroDoc(); const text = doc.getText("text"); text.subscribe((event: LoroEventBatch) => { // Handle text-specific changes console.log("Text changed:", event); }); const list = doc.getList("list"); list.subscribe((event: LoroEventBatch) => { // Handle list-specific changes console.log("List changed:", event); }); ``` ## Advanced Features ### Cursor Support Loro provides stable cursor position tracking that remains valid across concurrent edits: ```typescript const doc = new LoroDoc(); const text = doc.getText("text"); text.insert(0, "123"); // Get cursor at position with side (-1, 0, or 1) const cursor = text.getCursor(0, 0); if (cursor) { // Get current cursor position const pos = doc.getCursorPos(cursor); console.log(pos.offset); // Current position console.log(pos.side); // Cursor side // Cursor position updates automatically with concurrent edits text.insert(0, "abc"); const newPos = doc.getCursorPos(cursor); console.log(newPos.offset); // Position updated } ``` ### Change Tracking Track and analyze document changes: ```typescript const doc = new LoroDoc(); doc.setPeerId("1"); doc.getText("text").insert(0, "Hello"); doc.commit(); // Get number of changes and operations console.log(doc.changeCount()); // Number of changes console.log(doc.opCount()); // Number of operations // Get all changes const changes = doc.getAllChanges(); for (const [peer, peerChanges] of changes.entries()) { for (const change of peerChanges) { console.log("Change:", { peer: change.peer, counter: change.counter, lamport: change.lamport, 
timestamp: change.timestamp, message: change.message }); } } // Get specific change const changeId = { peer: "1", counter: 0 }; const change = doc.getChangeAt(changeId); // Get operations in a change const ops = doc.getOpsInChange(changeId); // Track change ancestors doc.travelChangeAncestors([changeId], (change) => { console.log("Ancestor change:", change); return true; // continue traversal }); // Get modified containers in a change const modifiedContainers = doc.getChangedContainersIn(changeId, 1); ``` ### Advanced Import/Export Loro supports various import and export modes: ```typescript // Export modes const doc = new LoroDoc(); const previousVersion = doc.version(); doc.getText("text").insert(0, "Hello"); const snapshot = doc.export({ mode: "snapshot" }); const updates = doc.export({ mode: "update", from: previousVersion }); const shallowSnapshot = doc.export({ mode: "shallow-snapshot", frontiers: doc.oplogFrontiers() }); const rangeUpdates = doc.export({ mode: "updates-in-range", spans: [{ id: { peer: "1", counter: 0 }, len: 10 }] }); // Import with status tracking const status = doc.import(updates); console.log("Successfully imported:", status.success); console.log("Pending imports:", status.pending); // Batch import const status2 = doc.importBatch([snapshot, updates]); // Import JSON updates const jsonStatus = doc.importJsonUpdates({ schema_version: 1, start_version: new Map([["1", 0]]), peers: ["1"], changes: [] }); ``` ### Path and Value Access Access document content through paths: ```typescript const doc = new LoroDoc(); // Get value or container by path const value = doc.getByPath("map/key"); const container = doc.getByPath("list"); // Get path to a container const path = doc.getPathToContainer("cid:root-list:List"); // JSONPath support const results = doc.JSONPath("$.list[*]"); // Get shallow values (container IDs instead of resolved values) const shallowDoc = doc.getShallowValue(); console.log(shallowDoc); // { list: 'cid:root-list:List', ... 
}

// Custom JSON serialization
const json = doc.toJsonWithReplacer((key, value) => {
  if (value instanceof LoroText) {
    return value.toDelta();
  }
  return value;
});
```

### Debug and Metadata

Access debug information and metadata:

```typescript
import { setDebug, LoroDoc, decodeImportBlobMeta } from "loro-crdt";

const doc = new LoroDoc();
// Enable debug info
setDebug();

const blob = doc.export({ mode: "update" });
// Get import blob metadata
const metadata = decodeImportBlobMeta(blob, true);
console.log({
  startTimestamp: metadata.startTimestamp,
  endTimestamp: metadata.endTimestamp,
  mode: metadata.mode,
  changeNum: metadata.changeNum,
});
```

# FILE: pages/docs/index.mdx

# Introduction to Loro

It is well known that syncing data and building real-time collaborative apps is challenging, especially when devices can be offline or part of a peer-to-peer network. Loro simplifies this process for you. We want to provide better DevTools to make building [local-first apps](https://www.inkandswitch.com/local-first/) easy and enjoyable.

Loro uses [Conflict-free Replicated Data Types (CRDTs)](/docs/concepts/crdt) to resolve parallel edits. By utilizing Loro's data types, your applications can be made collaborative and keep the editing history with low overhead.
After you model your app state with Loro, syncing is simple:

```ts
const docA = new LoroDoc();
const docB = new LoroDoc();
docA.getText("text").insert(0, "Hello world!");
docB.getText("text").insert(0, "Hi!");

// Assume docA and docB are two Loro documents on two different devices
const bytesA = docA.export({ mode: "update" });
// Send bytes to docB by any method
docB.import(bytesA);
// docB is now updated with all the changes from docA

const bytesB = docB.export({ mode: "update" });
// Send bytes to docA by any method
docA.import(bytesB);
// docA and docB are now in sync; they have the same state
```

Saving your app state is also straightforward:

```ts
const doc = new LoroDoc();
doc.getText("text").insert(0, "Hello world!");
const bytes = doc.export({ mode: "snapshot" });
// Bytes can be saved to local storage, a database, or sent over the network
```

Loading your app state:

```ts no_run
const newDoc = new LoroDoc();
newDoc.import(bytes);
```

Loro also makes it easy for you to time-travel through the history and add version control to your app. [Learn more about time travel](/docs/tutorial/time_travel).

```ts no_run
doc.checkout(version); // Checkout the doc to the given version
```

Loro is compatible with the JSON schema. If you can model your app state with JSON, you can probably sync your app with Loro. Because we need to adhere to the JSON schema, using a number as a key in a Map is not permitted, and cyclic links should be avoided.

```ts no_run
doc.toJSON(); // Get the JSON representation of the doc
```

![Getting started](/images/GettingStarted.png)

# Differences from other CRDT libraries

The table below summarizes Loro's features, which may not be present in other CRDT libraries.
| Features/Important design decisions | Loro | Diamond-types | Yjs | Automerge |
| :---------------------------------- | :--- | :------------ | :-- | :-------- |
| [Event Graph Walker](https://loro.dev/docs/advanced/replayable_event_graph) | ✅ | ✅ Inventor | ❌ | ❌ |
| Rich Text CRDT | ✅ | ❌ | ❌ | ✅ |
| [Movable Tree](https://ieeexplore.ieee.org/document/9563274) | ✅ | ❌ | ❌ | ❌ Inventor |
| [Movable List](https://loro.dev/docs/tutorial/list) | ✅ | ❌ | ❌ | ❌ Inventor |
| Time Travel | ✅ | ✅ | ✅[1] | ✅ |
| [Fugue](https://arxiv.org/abs/2305.00583) / Maximal non-interleaving | ✅ | ✅ | ❌ | ❌ |
| JSON Types | ✅ | ❓ | ✅ | ✅ |
| Merging Elements in Memory by Run Length Encoding | ✅ | ✅ | ✅ Inventor | ❌ |
| Byzantine-fault-tolerance | ❌ | ❌ | ❌ | ✅ |
| Version Control | ✅ | ❌ | ❌ | ✅ |

- [1] Unlike the others, Yjs requires users to store a version vector and a delete set to enable time travel back to a specific point.
- [Fugue](https://arxiv.org/abs/2305.00583) is a text/list CRDT that minimizes the chance of interleaving anomalies.

# FILE: pages/docs/concepts/when_not_crdt.md

---
keywords: "crdt, crdts, difficulty, synchronization"
description: "When Not to Rely on CRDTs"
---

# When Not to Rely on CRDTs

If there are no concurrent operations, CRDTs and other solutions don't differ much. In handling concurrent editing, CRDTs use predefined algorithms to ensure strong eventual consistency (i.e., if multiple people receive the same set of ops, they will see the same content). Furthermore, some CRDT algorithms make the merge results as close to user expectations as possible. However, their problems include:

- Difficulty in maintaining user-defined invariants. For instance, in a meeting room reservation scenario, a room can only be booked once.
But if CRDTs represent the reservation status, multiple people booking the same room simultaneously may all see a successful operation, but in reality, only one person will have the reservation after synchronization. Thus, other systems are needed to maintain these invariants.
- Automatically merged results of concurrent edits may violate the schema. For example, automatically merged code edits may not comply with the code's syntax structure, requiring human intervention to adjust. The requirement that a `<table>` can only contain one `<thead>` might also be violated in concurrent editing (two people inserting a new `<thead>` concurrently). For such scenarios, this issue can be ignored, or `<thead>` can be represented differently (changing `<thead>` from a List element to a Register on a separate Map can prevent this problem).

CRDTs alone may not solve all problems in these scenarios. If maintaining these invariants is crucial for your application, you might need to combine CRDTs with centralized solutions for synchronization. Alternatively, you can implement an additional layer on top of the CRDT that derives a view conforming to the correct schema or maintaining the required invariants, using the CRDT state as the source of truth. This approach can ensure consistency across clients, but its suitability depends on your specific use case and user expectations.

When merging edits from a peer that has been offline for a long time, extensive changes may result in an automatically merged state that doesn't meet expectations. If your application cannot tolerate such merge results, you shouldn't rely solely on the CRDT's automatic merging. Instead, detect these situations of long-term concurrent edits and provide a diff view to users, allowing them to resolve conflicts manually. Fortunately, Loro's internal architecture supports implementing such manual conflict resolution logic.

# FILE: pages/docs/concepts/crdt.mdx

import Image from "next/image";

# What are CRDTs

CRDT (conflict-free replicated data type) is a data structure that can be replicated across multiple computers in a network, where replicas can be updated independently and in parallel, without the need for coordination between replicas, and with a guarantee that no conflicts will occur.

CRDT is often used in collaborative software, such as scenarios where multiple users need to work together to edit/read a shared document, database, or state. It can be used in database software, text editing software, chat software, etc.
# What problems does CRDT solve?

Consider, for example, a scenario where multiple users edit the same document online at the same time. This scenario requires that each user sees the same content, even after concurrent edits by different users (e.g. two users changing the title at the same time), which is known as **consistency**. (To be precise, CRDTs satisfy eventual consistency; see below for more details.)

Users can use CRDTs even when they are offline. They can get back in sync with others once the network is restored. CRDTs also support collaborative editing with other users via P2P, which is known as **partition tolerance**. This allows CRDTs to support **decentralized** applications very well: synchronization can be done even without a centralized server.

# The Emergence of CRDTs

The formal concept of Conflict-free Replicated Data Types (CRDTs) was first introduced in Marc Shapiro's 2011 paper, [Conflict-Free Replicated Data Types](https://inria.hal.science/hal-00932836/file/CRDTs_SSS-2011.pdf). However, it can be argued that the groundwork for CRDTs was laid earlier, in the 2006 study [WOOT](https://doi.org/10.1145%2F1180875.1180916): An Algorithm for Collaborative Real-time Editing.

The primary motivation behind developing CRDTs was to address the challenges associated with designing conflict resolution mechanisms for eventual consistency. Prior to the introduction of CRDTs, literature on the subject offered limited guidance, and ad hoc solutions were often prone to errors. Consequently, Shapiro's paper presented a simple, theoretically sound approach to achieving eventual consistency through the use of CRDTs.

(PS: Marc Shapiro actually wrote a paper, [Designing a commutative replicated data type](https://hal.inria.fr/inria-00177693v2/document), in 2007.
In 2011, he reworded "commutative" into "conflict-free", expanding the definition of commutative to include state-based CRDTs.)

According to the [CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem), it is impossible for a distributed computing system to perfectly satisfy the following three points at the same time:

- _Consistency_: each read receives the result of the most recent write or reports an error; the system behaves as if it is accessing the same piece of data
- _Availability_: every request gets a non-error response, but there is no guarantee that the data fetched is up-to-date
- _Partition tolerance_: the ability of a distributed system to continue functioning properly even when communication between its different components is lost or delayed, resulting in a partition or network failure

If the system cannot achieve data consistency within the time limit, it means that partitioning has occurred and a choice must be made between C and A for the current operation, so "perfect consistency" is in conflict with "perfect availability".

CRDTs do not provide "perfect consistency", but Strong Eventual Consistency (SEC). This means that site A may not immediately reflect the state changes from site B, but when A and B synchronize their messages they both regain consistency and do not need to resolve potential conflicts (CRDTs mathematically prevent conflicts from occurring). _Strong Eventual Consistency_ does not conflict with _Availability_ and _Partition Tolerance_; CRDTs provide a good CAP tradeoff.
![CAP](./crdt-images/a4858e2a50bc1a2d79722060156e89b0cac5815cf25e8c67e409aa0926280cef.png)

_CRDT satisfies A + P + Eventual Consistency; a good tradeoff under CAP_

(PS: In 2012, Eric Brewer, author of the CAP theorem, wrote an article, [CAP Twelve Years Later: How the "Rules" Have Changed](https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed/), explaining that the "two out of three CAP features" description is actually misleading: CAP only prohibits perfect availability and consistency in a very small part of the design space, i.e., in the presence of partitions; in fact, the design of the tradeoff between C and A is very flexible. A good example is CRDT.)

# A simple CRDT case

We can use a few simple examples to get a general idea of how CRDTs achieve **Strong Eventual Consistency**.

> **Grow-only Counter**

How can we count the number of times something happens in a distributed system without locking?

G-Counter:

- Let each replica increment only its own counter => no locking synchronization & no conflicts
- Each replica also keeps the count values of all other replicas
- Number of occurrences = sum of the count values of all replicas
- Since each replica only updates its own count and does not conflict with the other counters, this type satisfies consistency after message synchronization

> **Grow-only Set**

G-Set:

- The elements in a Grow-only Set can only be added and never removed
- To merge two such states, you only need to take the union of the two sets
- This type satisfies consistency after message synchronization because there are no conflicting operations: elements only grow and are never removed
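The G-Counter rules above can be sketched in a few lines of TypeScript (an illustrative toy, not part of Loro's API):

```typescript
// Toy G-Counter: each replica increments only its own slot;
// merging takes the per-peer maximum, and the value is the sum of all slots.
type GCounter = Map<string, number>; // peer ID -> that peer's count

function increment(counter: GCounter, peer: string): void {
  counter.set(peer, (counter.get(peer) ?? 0) + 1);
}

function merge(a: GCounter, b: GCounter): GCounter {
  const merged = new Map(a);
  for (const [peer, count] of b) {
    merged.set(peer, Math.max(merged.get(peer) ?? 0, count));
  }
  return merged;
}

function value(counter: GCounter): number {
  let sum = 0;
  for (const count of counter.values()) sum += count;
  return sum;
}
```

Because `merge` is commutative, associative, and idempotent, replicas can exchange states in any order, any number of times, and still converge.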
Both of these methods are CRDTs, and they both satisfy the following properties:

- They can both be updated independently and concurrently, without coordination (locking) between replicas
- There is no possibility of conflict between multiple updates
- Eventual consistency can always be guaranteed

# Introduction to the Principle

There are two types of CRDTs: op-based CRDTs and state-based CRDTs. This article focuses on the concept of op-based CRDTs.

Op-based CRDTs operate on the principle that if two users perform identical sequences of operations, the final state of the document should also be identical. To achieve this, each user saves all the operations performed on the data (operations) and synchronizes these operations with other users to ensure a consistent final state. A critical challenge in this approach is keeping the order of operations consistent, especially when concurrent modification operations occur. To address this, op-based CRDTs require that all possible concurrent operations be commutative, which satisfies the eventual consistency requirement.
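To make the commutativity requirement concrete, here is a toy op-based counter (illustrative only, not Loro's implementation). Because its only operation, `add`, commutes, two replicas that receive the same operations in different orders converge to the same state:

```typescript
// Toy op-based counter: additions commute, so replicas that receive
// the same ops in different orders still converge.
type AddOp = { type: "add"; amount: number };

function apply(state: number, op: AddOp): number {
  return state + op.amount;
}

const opsFromPeerA: AddOp[] = [
  { type: "add", amount: 1 },
  { type: "add", amount: 2 },
];
const opsFromPeerB: AddOp[] = [{ type: "add", amount: 10 }];

// Replica 1 receives A's ops first; replica 2 receives B's ops first
const replica1 = [...opsFromPeerA, ...opsFromPeerB].reduce(apply, 0);
const replica2 = [...opsFromPeerB, ...opsFromPeerA].reduce(apply, 0);
console.log(replica1 === replica2); // true: both replicas converge to 13
```

Operations like text insertion do not naturally commute, which is why text CRDTs need the more elaborate ordering machinery described elsewhere in these docs.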
# Comparison of CRDT and OT

Both CRDT and [Operation Transformation (OT)][ot] can be used in online collaborative applications, with the following differences:

| OT | CRDT |
| :-- | :-- |
| OT relies on a centralized server for collaboration; [it is extremely difficult to make it work in a distributed environment](https://digitalfreepen.com/2018/01/04/operational-transform-hard.html) | The CRDT algorithm can synchronize data in a P2P manner |
| The earliest paper on OT was presented in 1989 | The earliest paper on CRDT appeared in 2006 |
| The OT algorithm is designed with higher complexity to ensure consistency | The CRDT algorithm is designed to be simpler to ensure consistency |
| It is easier to design OT to preserve user intent | It is more difficult to design a CRDT algorithm that preserves user intent |
| OT does not affect document size | CRDT documents are larger than the original document data |

[a highly-available move operation for replicated trees]: https://martin.kleppmann.com/papers/move-op.pdf
[moving elements in list crdts]: https://martin.kleppmann.com/papers/list-move-papoc20.pdf
[interleaving anomalies in collaborative text editors]: https://martin.kleppmann.com/papers/interleaving-papoc19.pdf
[conflict-free replicated data types]: https://readpaper.com/paper/1516319412
[5000x faster crdts: an adventure in optimization]: https://josephg.com/blog/crdts-go-brrr/
[json crdt]: https://arxiv.org/abs/1608.03960
[yata]: https://www.researchgate.net/publication/310212186_Near_Real-Time_Peer-to-Peer_Shared_Editing_on_Extensible_Data_Types
[yjs]: https://github.com/yjs/yjs
[loro]: https://loro.dev
[automerge]: https://github.com/automerge/automerge
[half grid]: https://zh.wikipedia.org/wiki/%E5%8D%8A%E6%A0%BC
[ot]: https://en.wikipedia.org/wiki/Operational_transformation
[crdts and the quest for distributed consistency]: https://www.infoq.com/presentations/crdt-distributed-consistency/

# FILE: pages/docs/concepts/choose_crdt_type.md

---
keywords: "crdt, crdts, application, data model, concurrent, conflict"
description: "Loro supports many CRDT types. You need to choose the correct type to model the data based on the algorithm semantics."
---

# How to Choose the Right CRDT Types

Choosing the right CRDT type means understanding each type's potential behavior in concurrent editing situations and judging whether that behavior is acceptable for your application.

For text, you can represent it directly as a value on a Map (where the value is a string), or you can use a Text CRDT. With the former, each operation completely overwrites the previous one, so if A and B make concurrent modifications, only one of their edits will remain in the end. With the latter, the CRDT retains the concurrent insertions of both people, and concurrent deletions are combined. For most text box edits, you might prefer the latter, but for something like editing a link, you might want the former.

For Lists, when two peers concurrently "update" the same element by removing it and inserting a replacement, each insertion creates a new element, which differs from the `set` semantics of a Map (we may consider providing a list `set` method in the future). For representing coordinates, it's better to use a Map rather than a List. If you represent coordinates as [x, y], and client A updates the y coordinate by deleting the y element and reinserting a new y_a, while client B also deletes y and inserts y_b, then after merging the array will become [x, y_a, y_b], which does not conform to the user's schema. Using a Map can prevent this problem.
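The coordinate advice above can be illustrated with a toy last-writer-wins register, which is how register-style `set` semantics behave in general (a sketch, not Loro's internal algorithm). Two concurrent `set`s keep exactly one deterministic winner, so every peer converges to a single y value instead of `[x, y_a, y_b]`:

```typescript
// Toy LWW register: merging two concurrent writes deterministically keeps one.
interface LwwRegister<T> {
  value: T;
  lamport: number; // logical timestamp
  peer: string; // tie-breaker when timestamps are equal
}

function mergeLww<T>(a: LwwRegister<T>, b: LwwRegister<T>): LwwRegister<T> {
  if (a.lamport !== b.lamport) return a.lamport > b.lamport ? a : b;
  return a.peer > b.peer ? a : b; // deterministic tie-break
}

// Two peers concurrently update y with the same logical timestamp
const yA: LwwRegister<number> = { value: 10, lamport: 5, peer: "A" };
const yB: LwwRegister<number> = { value: 20, lamport: 5, peer: "B" };

// Both peers compute the same winner regardless of merge order
console.log(mergeLww(yA, yB) === mergeLww(yB, yA)); // true: both pick peer "B"
```

With List semantics there is no such register to merge: each peer's delete+insert survives independently, which is exactly how the `[x, y_a, y_b]` anomaly arises.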
# FILE: pages/docs/performance/docsize.md

---
keywords: "loro, yjs, automerge, diamond-type, benchmark, document size, crdt"
description: "Comparing the document size of Loro and popular CRDTs"
---

# Document Size

In this benchmark, we use the Automerge paper dataset. Source: https://github.com/automerge/automerge-perf/tree/master/edit-by-index

The dataset consists of:

- 182,315 single-character insertion operations
- 77,463 single-character deletion operations
- A total of 259,778 operations
- 104,852 characters in the final document

The first row of settings in the table below indicates configurations without `gc` and `compress`.

| Settings | loro-snapshot | loro-update | diamond-type | yrs | automerge |
| -------------------- | ------------- | ----------- | ------------ | ------ | --------- |
| Default (no options) | 273561 | 251352 | 281042 | 226973 | 292742 |
| gc | x | x | 203564 | 159921 | x |
| compress | 132459 | 105724 | 150723 | 91777 | 129062 |
| gc & compress | x | x | 106242 | 71033 | x |

> The `x` in the table above signifies that the corresponding setting is not supported.

Loro also supports a shallow snapshot encoding format with gc capabilities, achieved by truncating the history. For details, see [the doc](/docs/tutorial/encoding). If truncated from the latest version, the result will be:

| Settings | loro-shallow-snapshot |
| -------- | --------------------- |
| Default | 63352 |
| compress | 54517 |

# FILE: pages/docs/performance/native.mdx

# Native Benchmarks

[This native benchmark](https://github.com/zxch3n/crdt-bench-native) is based on the Rust implementation of each crate.

- Conducted on an M2 Max CPU, dated 2024-10-18.
- The tasks with names starting with `automerge` use the automerge paper dataset.
- In this benchmark, compression is disabled for both automerge and loro.
- Diamond-type doesn't support the list type yet.
| Tasks | automerge | loro | diamond-type | yrs |
| :---------------------- | :-------- | :-------- | :----------- | :-------- |
| automerge - apply | 450.91 ms | 88.19 ms | 15.63 ms | 4238.8 ms |
| automerge - decode time | 506.30 ms | 0.189 ms | 2.19 ms | 3.82 ms |
| automerge - encode time | 17.65 ms | 0.416 ms | 1.15 ms | 0.759 ms |
| concurrent list inserts | 81.07 ms | 130.63 ms | 57.08 ms | 13.95 ms |
| list_random_insert_1k | 296.64 ms | 12.15 ms | 4.32 ms | 5.83 ms |

# FILE: pages/docs/performance/index.md

---
keywords: "loro, yjs, automerge, benchmark, memory, crdt"
description: "CRDT benchmarks, comparing the performance of Loro and popular CRDTs"
---

# JS/WASM Benchmarks

> The primary role of these benchmarks should be to serve as indicators of the absence of performance pitfalls rather than as measures of which project is superior. This is because different projects consistently make different trade-offs. It is inaccurate to claim that Project A is superior to Project B simply because A performs better in certain benchmarks, while Project B may excel in other areas by a significant margin.

The benchmark can be reproduced using the [crdt-benchmarks](https://github.com/zxch3n/crdt-benchmarks) repo.

- The benchmarks were performed on a MacBook Pro M1 2020 with 16GB RAM
- loro-old is the version of Loro as of 2023-11-10; it is compiled from [this commit](https://github.com/loro-dev/loro/tree/c1613ee680c6a4757e55fcda76e4f5f627daeb56). Loro has undergone numerous changes since then, particularly in terms of the [encoding schema](https://github.com/loro-dev/loro/pull/219), shifting from a performance-focused version to one that prioritizes compatibility. Because we want Loro to have good backward and forward compatibility after reaching version 1.0, we have adopted a more easily extensible encoding method.
It is slower than the encoding method used in the `loro-old` version, but it better ensures our ability to iterate quickly after reaching version 1.0 without introducing breaking changes. - There is a more exhaustive benchmark at the bottom that only runs benchmarks on Yjs. - Automerge can perform the `B4` benchmark in about 1 second (see `time`) if all changes are applied within a single `change` transaction. However, our benchmarks test individual edits that generate individual update events, as this more closely simulates actual user behavior. See #21 - Note that `parseTime` is significantly higher with `automerge` and `loro` when the initial document is not empty (e.g. when syncing content from a remote server). - Loro and Automerge can store a complete DAG of editing history for each keystroke, but Yjs requires additional storage for a Version Vector + Delete Set for each version saved, which incurs significant extra overhead beyond the document size reported.
Benchmark setup #### B1: No conflicts Simulate two clients. One client modifies a text object and sends update messages to the other client. We measure the time to perform the task (`time`), the amount of data exchanged (`avgUpdateSize`), the size of the encoded document after the task is performed (`docSize`), the time to parse the encoded document (`parseTime`), and the memory used to hold the decoded document (`memUsed`). #### B2: Two users producing conflicts Simulate two clients. Both start with a synced text object containing 100 characters. Both clients modify the text object in a single transaction and then send their changes to the other client. We measure the time to sync concurrent changes into a single client (`time`), the size of the update messages (`updateSize`), the size of the encoded document after the task is performed (`docSize`), the time to parse the encoded document (`parseTime`), and the memory used to hold the decoded document (`memUsed`). #### B3: Many conflicts Simulate `√N` concurrent actions. We measure the time to perform the task and sync all clients (`time`), the size of the update messages (`updateSize`), the size of the encoded document after the task is performed (`docSize`), the time to parse the encoded document (`parseTime`), and the memory used to hold the decoded document (`memUsed`). The square root of `N` was chosen because `√N` concurrent actions may result in up to `(√N)^2 - 1` conflicts (applying action 1: 0 conflicts; applying action 2: 1 conflict; applying action 3: 2 conflicts, …). #### B4: Real-world editing dataset Replay a real-world editing dataset.
This dataset contains the character-by-character editing trace of a large-ish text document, the LaTeX source of this paper: https://arxiv.org/abs/1608.03960 Source: https://github.com/automerge/automerge-perf/tree/master/edit-by-index - 182,315 single-character insertion operations - 77,463 single-character deletion operations - 259,778 operations in total - 104,852 characters in the final document We simulate one client replaying all changes and storing each update. We measure the time to replay the changes and the size of all update messages (`updateSize`), the size of the encoded document after the task is performed (`docSize`), the time to encode the document (`encodeTime`), the time to parse the encoded document (`parseTime`), and the memory used to hold the decoded document in memory (`memUsed`). ##### [B4 x 100] Real-world editing dataset 100 times Replay the [B4] dataset one hundred times. The final document has a size of over 10 million characters. For comparison, the book "Game of Thrones: A Song of Ice and Fire" is only 1.6 million characters long (including whitespace). - 18,231,500 single-character insertion operations - 7,746,300 single-character deletion operations - 25,977,800 operations in total - 10,485,200 characters in the final document
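Concretely, each entry in the automerge-perf trace is an array of the form `[position, numDeleted, ...insertedStrings]`. The following library-agnostic sketch of the replay loop applies the trace to a plain string rather than a CRDT document; the `Edit` type and `replay` function are our own illustrative names, not part of any benchmarked library:

```typescript
// Each edit: [position, numDeleted, ...insertedStrings] (automerge-perf format).
type Edit = [number, number, ...string[]];

// Replay all edits against a plain string. A CRDT benchmark would instead call
// the library's insert/delete APIs here and record each generated update.
function replay(edits: Edit[]): string {
  let doc = "";
  for (const [pos, del, ...inserted] of edits) {
    doc = doc.slice(0, pos) + inserted.join("") + doc.slice(pos + del);
  }
  return doc;
}
```

In the actual benchmarks, each iteration of this loop additionally triggers an update event, which is what makes B4 much heavier than applying the whole trace in one transaction.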
| N = 6000 | yjs | ywasm | loro | loro-old | automerge | automerge-wasm | | :----------------------------------------------------------------------- | ---------------: | --------------: | ---------------: | ---------------: | ---------------: | --------------: | | Version | 13.6.15 | 0.17.4 | 1.0.0-beta.2 | 0.15.2 | 2.1.10 | 0.9.0 | | Bundle size | 84,017 bytes | 938,991 bytes | 2,919,363 bytes | 1,583,094 bytes | 1,696,176 bytes | 1,701,136 bytes | | Bundle size (gzipped) | 25,105 bytes | 284,616 bytes | 894,460 bytes | 592,039 bytes | 591,049 bytes | 594,071 bytes | | [B1.1] Append N characters (time) | 141 ms | 171 ms | 164 ms | 115 ms | 279 ms | 110 ms | | [B1.1] Append N characters (avgUpdateSize) | 27 bytes | 27 bytes | 88 bytes | 58 bytes | 121 bytes | 121 bytes | | [B1.1] Append N characters (encodeTime) | 1 ms | 0 ms | 3 ms | 0 ms | 5 ms | 6 ms | | [B1.1] Append N characters (docSize) | 6,031 bytes | 6,031 bytes | 12,382 bytes | 6,219 bytes | 3,992 bytes | 3,992 bytes | | [B1.1] Append N characters (memUsed) | 0 B | 0 B | 0 B | 0 B | 0 B | 0 B | | [B1.1] Append N characters (parseTime) | 22 ms | 65 ms | 28 ms | 28 ms | 59 ms | 61 ms | | [B1.2] Insert string of length N (time) | 1 ms | 1 ms | 0 ms | 0 ms | 18 ms | 14 ms | | [B1.2] Insert string of length N (avgUpdateSize) | 6,031 bytes | 6,031 bytes | 6,089 bytes | 6,088 bytes | 6,201 bytes | 6,201 bytes | | [B1.2] Insert string of length N (encodeTime) | 0 ms | 0 ms | 0 ms | 0 ms | 2 ms | 2 ms | | [B1.2] Insert string of length N (docSize) | 6,031 bytes | 6,031 bytes | 12,313 bytes | 6,146 bytes | 3,974 bytes | 3,974 bytes | | [B1.2] Insert string of length N (parseTime) | 27 ms | 47 ms | 27 ms | 26 ms | 29 ms | 30 ms | | [B1.3] Prepend N characters (time) | 118 ms | 30 ms | 111 ms | 47 ms | 272 ms | 73 ms | | [B1.3] Prepend N characters (avgUpdateSize) | 27 bytes | 26 bytes | 87 bytes | 57 bytes | 116 bytes | 116 bytes | | [B1.3] Prepend N characters (encodeTime) | 2 ms | 0 ms | 2 ms | 1 ms | 4 ms | 4 ms 
| | [B1.3] Prepend N characters (docSize) | 6,041 bytes | 6,040 bytes | 15,414 bytes | 6,165 bytes | 3,988 bytes | 3,988 bytes | | [B1.3] Prepend N characters (parseTime) | 45 ms | 38 ms | 27 ms | 37 ms | 73 ms | 47 ms | | [B1.4] Insert N characters at random positions (time) | 128 ms | 101 ms | 113 ms | 51 ms | 268 ms | 89 ms | | [B1.4] Insert N characters at random positions (avgUpdateSize) | 29 bytes | 29 bytes | 88 bytes | 58 bytes | 121 bytes | 121 bytes | | [B1.4] Insert N characters at random positions (encodeTime) | 3 ms | 0 ms | 2 ms | 1 ms | 6 ms | 5 ms | | [B1.4] Insert N characters at random positions (docSize) | 29,571 bytes | 29,554 bytes | 39,040 bytes | 29,503 bytes | 24,743 bytes | 24,743 bytes | | [B1.4] Insert N characters at random positions (parseTime) | 46 ms | 30 ms | 27 ms | 26 ms | 73 ms | 64 ms | | [B1.5] Insert N words at random positions (time) | 149 ms | 264 ms | 112 ms | 54 ms | 539 ms | 291 ms | | [B1.5] Insert N words at random positions (avgUpdateSize) | 36 bytes | 36 bytes | 95 bytes | 65 bytes | 131 bytes | 131 bytes | | [B1.5] Insert N words at random positions (encodeTime) | 6 ms | 1 ms | 3 ms | 1 ms | 14 ms | 15 ms | | [B1.5] Insert N words at random positions (docSize) | 87,868 bytes | 87,924 bytes | 135,713 bytes | 98,901 bytes | 96,203 bytes | 96,203 bytes | | [B1.5] Insert N words at random positions (parseTime) | 50 ms | 33 ms | 27 ms | 33 ms | 101 ms | 111 ms | | [B1.6] Insert string, then delete it (time) | 1 ms | 0 ms | 2 ms | 1 ms | 39 ms | 31 ms | | [B1.6] Insert string, then delete it (avgUpdateSize) | 6,053 bytes | 6,052 bytes | 6,189 bytes | 6,179 bytes | 6,338 bytes | 6,338 bytes | | [B1.6] Insert string, then delete it (encodeTime) | 0 ms | 0 ms | 0 ms | 0 ms | 3 ms | 2 ms | | [B1.6] Insert string, then delete it (docSize) | 6,040 bytes | 6,039 bytes | 6,409 bytes | 6,145 bytes | 3,993 bytes | 3,993 bytes | | [B1.6] Insert string, then delete it (parseTime) | 29 ms | 29 ms | 26 ms | 28 ms | 52 ms | 44 ms | | 
[B1.7] Insert/Delete strings at random positions (time) | 153 ms | 112 ms | 136 ms | 60 ms | 423 ms | 212 ms | | [B1.7] Insert/Delete strings at random positions (avgUpdateSize) | 31 bytes | 31 bytes | 100 bytes | 61 bytes | 135 bytes | 135 bytes | | [B1.7] Insert/Delete strings at random positions (encodeTime) | 3 ms | 1 ms | 3 ms | 1 ms | 13 ms | 11 ms | | [B1.7] Insert/Delete strings at random positions (docSize) | 41,917 bytes | 41,592 bytes | 81,700 bytes | 51,470 bytes | 59,281 bytes | 59,281 bytes | | [B1.7] Insert/Delete strings at random positions (parseTime) | 53 ms | 41 ms | 25 ms | 27 ms | 80 ms | 78 ms | | [B1.8] Append N numbers (time) | 140 ms | 59 ms | 155 ms | 73 ms | 448 ms | 113 ms | | [B1.8] Append N numbers (avgUpdateSize) | 32 bytes | 32 bytes | 94 bytes | 62 bytes | 125 bytes | 125 bytes | | [B1.8] Append N numbers (encodeTime) | 2 ms | 0 ms | 2 ms | 2 ms | 6 ms | 6 ms | | [B1.8] Append N numbers (docSize) | 35,641 bytes | 35,634 bytes | 71,568 bytes | 47,625 bytes | 26,985 bytes | 26,985 bytes | | [B1.8] Append N numbers (parseTime) | 32 ms | 43 ms | 31 ms | 29 ms | 76 ms | 67 ms | | [B1.9] Insert Array of N numbers (time) | 3 ms | 3 ms | 13 ms | 9 ms | 47 ms | 22 ms | | [B1.9] Insert Array of N numbers (avgUpdateSize) | 35,653 bytes | 35,657 bytes | 35,715 bytes | 35,717 bytes | 31,199 bytes | 31,199 bytes | | [B1.9] Insert Array of N numbers (encodeTime) | 1 ms | 0 ms | 2 ms | 1 ms | 4 ms | 2 ms | | [B1.9] Insert Array of N numbers (docSize) | 35,653 bytes | 35,657 bytes | 71,578 bytes | 47,646 bytes | 26,953 bytes | 26,953 bytes | | [B1.9] Insert Array of N numbers (parseTime) | 43 ms | 29 ms | 31 ms | 29 ms | 56 ms | 44 ms | | [B1.10] Prepend N numbers (time) | 157 ms | 31 ms | 125 ms | 53 ms | 484 ms | 159 ms | | [B1.10] Prepend N numbers (avgUpdateSize) | 32 bytes | 32 bytes | 93 bytes | 61 bytes | 120 bytes | 120 bytes | | [B1.10] Prepend N numbers (encodeTime) | 3 ms | 1 ms | 3 ms | 1 ms | 6 ms | 6 ms | | [B1.10] Prepend N numbers 
(docSize) | 35,669 bytes | 35,665 bytes | 76,773 bytes | 47,645 bytes | 26,987 bytes | 26,987 bytes | | [B1.10] Prepend N numbers (parseTime) | 53 ms | 33 ms | 48 ms | 39 ms | 65 ms | 67 ms | | [B1.11] Insert N numbers at random positions (time) | 160 ms | 121 ms | 125 ms | 56 ms | 517 ms | 121 ms | | [B1.11] Insert N numbers at random positions (avgUpdateSize) | 34 bytes | 34 bytes | 94 bytes | 60 bytes | 125 bytes | 125 bytes | | [B1.11] Insert N numbers at random positions (encodeTime) | 2 ms | 1 ms | 4 ms | 1 ms | 8 ms | 6 ms | | [B1.11] Insert N numbers at random positions (docSize) | 59,132 bytes | 59,137 bytes | 100,632 bytes | 70,901 bytes | 47,746 bytes | 47,746 bytes | | [B1.11] Insert N numbers at random positions (parseTime) | 52 ms | 50 ms | 58 ms | 41 ms | 428 ms | 80 ms | | [B2.1] Concurrently insert string of length N at index 0 (time) | 1 ms | 0 ms | 1 ms | 0 ms | 75 ms | 32 ms | | [B2.1] Concurrently insert string of length N at index 0 (updateSize) | 6,094 bytes | 6,094 bytes | 6,188 bytes | 9,244 bytes | 9,499 bytes | 9,499 bytes | | [B2.1] Concurrently insert string of length N at index 0 (encodeTime) | 0 ms | 0 ms | 0 ms | 0 ms | 7 ms | 5 ms | | [B2.1] Concurrently insert string of length N at index 0 (docSize) | 12,151 bytes | 12,151 bytes | 24,735 bytes | 12,281 bytes | 8,011 bytes | 8,011 bytes | | [B2.1] Concurrently insert string of length N at index 0 (parseTime) | 28 ms | 34 ms | 30 ms | 27 ms | 60 ms | 48 ms | | [B2.2] Concurrently insert N characters at random positions (time) | 46 ms | 166 ms | 55 ms | 125 ms | 270 ms | 399 ms | | [B2.2] Concurrently insert N characters at random positions (updateSize) | 33,420 bytes | 33,444 bytes | 23,779 bytes | 350,337 bytes | 27,476 bytes | 1,093,293 bytes | | [B2.2] Concurrently insert N characters at random positions (encodeTime) | 2 ms | 1 ms | 4 ms | 1 ms | 6 ms | 11 ms | | [B2.2] Concurrently insert N characters at random positions (docSize) | 66,808 bytes | 66,852 bytes | 79,937 bytes | 
59,358 bytes | 50,686 bytes | 50,704 bytes | | [B2.2] Concurrently insert N characters at random positions (parseTime) | 67 ms | 69 ms | 27 ms | 27 ms | 51 ms | 95 ms | | [B2.3] Concurrently insert N words at random positions (time) | 105 ms | 459 ms | 68 ms | 120 ms | 435 ms | 630 ms | | [B2.3] Concurrently insert N words at random positions (updateSize) | 89,143 bytes | 88,994 bytes | 62,640 bytes | 408,723 bytes | 122,485 bytes | 1,185,202 bytes | | [B2.3] Concurrently insert N words at random positions (encodeTime) | 7 ms | 2 ms | 7 ms | 3 ms | 28 ms | 35 ms | | [B2.3] Concurrently insert N words at random positions (docSize) | 178,428 bytes | 178,130 bytes | 268,884 bytes | 197,284 bytes | 185,019 bytes | 191,498 bytes | | [B2.3] Concurrently insert N words at random positions (parseTime) | 82 ms | 67 ms | 31 ms | 30 ms | 134 ms | 184 ms | | [B2.4] Concurrently insert & delete (time) | 145 ms | 1,243 ms | 118 ms | 282 ms | 653 ms | 1,311 ms | | [B2.4] Concurrently insert & delete (updateSize) | 140,984 bytes | 141,122 bytes | 123,725 bytes | 798,123 bytes | 298,810 bytes | 2,395,876 bytes | | [B2.4] Concurrently insert & delete (encodeTime) | 6 ms | 3 ms | 14 ms | 4 ms | 43 ms | 56 ms | | [B2.4] Concurrently insert & delete (docSize) | 282,112 bytes | 282,358 bytes | 392,151 bytes | 304,592 bytes | 293,831 bytes | 307,291 bytes | | [B2.4] Concurrently insert & delete (parseTime) | 105 ms | 73 ms | 34 ms | 31 ms | 185 ms | 269 ms | | [B3.1] 20√N clients concurrently set number in Map (time) | 54 ms | 60 ms | 27 ms | 32 ms | 1,058 ms | 21 ms | | [B3.1] 20√N clients concurrently set number in Map (updateSize) | 49,167 bytes | 49,162 bytes | 132,376 bytes | 63,832 bytes | 283,296 bytes | 283,296 bytes | | [B3.1] 20√N clients concurrently set number in Map (encodeTime) | 3 ms | 1 ms | 16 ms | 1 ms | 9 ms | 12 ms | | [B3.1] 20√N clients concurrently set number in Map (docSize) | 36,763 bytes | 36,751 bytes | 78,764 bytes | 38,428 bytes | 86,165 bytes | 86,164 bytes 
| | [B3.1] 20√N clients concurrently set number in Map (parseTime) | 57 ms | 67 ms | 83 ms | 49 ms | 59 ms | 54 ms | | [B3.2] 20√N clients concurrently set Object in Map (time) | 55 ms | 62 ms | 27 ms | 40 ms | 1,126 ms | 28 ms | | [B3.2] 20√N clients concurrently set Object in Map (updateSize) | 85,084 bytes | 85,073 bytes | 171,370 bytes | 99,753 bytes | 398,090 bytes | 325,370 bytes | | [B3.2] 20√N clients concurrently set Object in Map (encodeTime) | 2 ms | 1 ms | 18 ms | 2 ms | 21 ms | 17 ms | | [B3.2] 20√N clients concurrently set Object in Map (docSize) | 72,682 bytes | 72,659 bytes | 84,255 bytes | 75,227 bytes | 112,588 bytes | 93,401 bytes | | [B3.2] 20√N clients concurrently set Object in Map (parseTime) | 73 ms | 68 ms | 83 ms | 48 ms | 68 ms | 59 ms | | [B3.3] 20√N clients concurrently set String in Map (time) | 124 ms | 61 ms | 63 ms | 73 ms | 1,869 ms | 166 ms | | [B3.3] 20√N clients concurrently set String in Map (updateSize) | 7,826,229 bytes | 7,826,225 bytes | 7,912,520 bytes | 7,840,917 bytes | 8,063,440 bytes | 8,063,440 bytes | | [B3.3] 20√N clients concurrently set String in Map (encodeTime) | 56 ms | 5 ms | 70 ms | 23 ms | 59 ms | 61 ms | | [B3.3] 20√N clients concurrently set String in Map (docSize) | 7,813,826 bytes | 7,813,814 bytes | 241,646 bytes | 7,815,537 bytes | 97,997 bytes | 98,008 bytes | | [B3.3] 20√N clients concurrently set String in Map (parseTime) | 141 ms | 75 ms | 161 ms | 44 ms | 80 ms | 76 ms | | [B3.4] 20√N clients concurrently insert text in Array (time) | 45 ms | 49 ms | 195 ms | 28 ms | 1,768 ms | 19 ms | | [B3.4] 20√N clients concurrently insert text in Array (updateSize) | 52,751 bytes | 52,735 bytes | 137,490 bytes | 70,514 bytes | 311,830 bytes | 285,330 bytes | | [B3.4] 20√N clients concurrently insert text in Array (encodeTime) | 1 ms | 0 ms | 13 ms | 1 ms | 13 ms | 7 ms | | [B3.4] 20√N clients concurrently insert text in Array (docSize) | 26,596 bytes | 26,580 bytes | 100,791 bytes | 47,943 bytes | 96,423 
bytes | 86,519 bytes | | [B3.4] 20√N clients concurrently insert text in Array (parseTime) | 55 ms | 37 ms | 94 ms | 30 ms | 43 ms | 37 ms | | [B4] Apply real-world editing dataset (time) | 2,616 ms | 17,556 ms | 2,271 ms | 768 ms | 7,109 ms | 2,775 ms | | [B4] Apply real-world editing dataset (encodeTime) | 4 ms | 8 ms | 11 ms | 3 ms | 165 ms | 161 ms | | [B4] Apply real-world editing dataset (docSize) | 226,981 bytes | 226,981 bytes | 230,556 bytes | 260,813 bytes | 129,116 bytes | 129,116 bytes | | [B4] Apply real-world editing dataset (parseTime) | 27 ms | 22 ms | 6 ms | 4 ms | 1,185 ms | 1,123 ms | | [B4x100] Apply real-world editing dataset 100 times (time) | 279,705 ms | skipped | 233,739 ms | 75,122 ms | skipped | skipped | | [B4x100] Apply real-world editing dataset 100 times (encodeTime) | 417 ms | skipped | 743 ms | 205 ms | skipped | skipped | | [B4x100] Apply real-world editing dataset 100 times (docSize) | 22,694,543 bytes | skipped | 21,016,454 bytes | 26,826,427 bytes | skipped | skipped | | [B4x100] Apply real-world editing dataset 100 times (parseTime) | 1,270 ms | skipped | 66 ms | 200 ms | skipped | skipped | | [B3.5] 20√N clients concurrently insert text (time) | 42 ms | 49 ms | 212 ms | 115 ms | 1,869 ms | 36 ms | | [B3.5] 20√N clients concurrently insert text (updateSize) | 48,129 bytes | 48,135 bytes | 132,870 bytes | 68,978 bytes | 298,020 bytes | 298,020 bytes | | [B3.5] 20√N clients concurrently insert text (encodeTime) | 1 ms | 0 ms | 13 ms | 1 ms | 12 ms | 16 ms | | [B3.5] 20√N clients concurrently insert text (docSize) | 24,313 bytes | 24,325 bytes | 102,818 bytes | 48,053 bytes | 90,768 bytes | 90,781 bytes | | [B3.5] 20√N clients concurrently insert text (parseTime) | 38 ms | 38 ms | 64 ms | 63 ms | 67 ms | 55 ms | | [C1.1] Concurrently insert & delete 100K (time) | 27,138 ms | skipped | 2,335 ms | skipped | 50,692 ms | skipped | | [C1.1] Concurrently insert & delete 100K (updateSize) | 4,908,122 bytes | skipped | 4,269,521 bytes | 
skipped | 10,326,487 bytes | skipped | | [C1.1] Concurrently insert & delete 100K (encodeTime) | 129 ms | skipped | 210 ms | skipped | 837 ms | skipped | | [C1.1] Concurrently insert & delete 100K (docSize) | 4,908,186 bytes | skipped | 6,748,052 bytes | skipped | 5,404,289 bytes | skipped | | [C1.1] Concurrently insert & delete 100K (parseTime) | 1,653 ms | skipped | 78 ms | skipped | 7,308 ms | skipped | | [C1.1] Concurrently insert & delete 100K (versionSize) | 222,681 bytes | skipped | 28 bytes | skipped | 64 bytes | skipped | | [C1.2] Concurrently set Map 100K (time) | 31,598 ms | skipped | 488 ms | skipped | 472,547 ms | skipped | | [C1.2] Concurrently set Map 100K (updateSize) | 980,804 bytes | skipped | 2,601,744 bytes | skipped | 5,541,744 bytes | skipped | | [C1.2] Concurrently set Map 100K (encodeTime) | 60 ms | skipped | 203 ms | skipped | 2,652 ms | skipped | | [C1.2] Concurrently set Map 100K (docSize) | 738,949 bytes | skipped | 1,539,209 bytes | skipped | 1,743,081 bytes | skipped | | [C1.2] Concurrently set Map 100K (parseTime) | 4,069 ms | skipped | 631 ms | skipped | 338 ms | skipped | | [C1.2] Concurrently set Map 100K (versionSize) | 416,246 bytes | skipped | 314,810 bytes | skipped | 1,949,999 bytes | skipped | # FILE: pages/blog/movable-tree.mdx --- title: "Movable tree CRDTs and Loro's implementation" date: 2024/07/18 description: This article introduces the difficulties and challenges of implementing Movable Tree CRDTs in collaborative settings, how Loro implements them, and how it sorts child nodes. The algorithm has high performance and can be used in production.
image: https://i.ibb.co/nMrgzZJ/DALL-E-2024-01-31-21-29-16-Create-a-black-and-white-illustration-with-a-black-background-that-matche.png --- # Movable tree CRDTs and Loro's implementation import Caption from "components/caption"; import Authors, { Author } from "components/authors"; ![](./movable-tree/movable-tree-cover.png) This article introduces the difficulties and challenges of implementing Movable Tree CRDTs in collaborative settings, how Loro implements them, and how it sorts child nodes. The algorithm has high performance and can be used in production. ## Background In distributed systems and collaborative software, managing hierarchical relationships is difficult and complex. Challenges arise in resolving conflicts and meeting user expectations when working with data structures that model movement as a combination of deletion and insertion. For instance, if a node is concurrently moved to different parents on different replicas, the result may be the unintended creation of duplicate nodes with the same content, because the node is deleted twice and re-created under two different parents. Currently, many software solutions offer different levels of support and functionality for managing hierarchical data structures in distributed environments. The key variation among these solutions lies in their approaches to handling potential conflicts. ### Conflicts in Movable Trees A movable tree has three primary operations: creation, deletion, and movement. Consider a scenario where two peers independently execute various operations on their respective replicas of the same movable tree.
Synchronizing these operations can lead to potential conflicts, such as: - The same node was deleted and moved - The same node was moved under different parents - Different nodes were moved, resulting in a cycle - An ancestor node was deleted while a descendant node was moved #### Deletion and Movement of the Same Node ![Deletion and Movement of the Same Node](./movable-tree/move-delete-dark.png) This situation is relatively easy to resolve. It can be addressed by applying one of the operations while ignoring the other, based on the timestamps in the distributed system or the application's specific requirements. Either approach yields an acceptable outcome. #### Moving the Same Node Under Different Parents ![Moving the Same Node Under Different Parents](./movable-tree/move-same-node-dark.png) Merging concurrent movement operations of the same node is slightly more complex. Different approaches can be adopted depending on the application: - Delete the node and create copies of it under the different parent nodes. Subsequent operations then treat these nodes independently. This approach is acceptable when node uniqueness is not critical. - Allow the node to have two edges pointing to different parents. However, this approach breaks the fundamental tree structure and is generally not considered acceptable. - Sort all operations, then apply them one by one. The order can be determined by timestamps in a distributed system. Provided the system maintains a consistent operation sequence, it ensures uniform results across all peers. #### Movement of Different Nodes Resulting in a Cycle ![cycle](./movable-tree/cycle-dark.png) Concurrent movement operations that cause cycles make the conflict resolution of movable trees complex. Matthew Weidner listed several solutions to resolve cycles in his [blog](https://mattweidner.com/2023/09/26/crdt-survey-2.html#forests-and-trees). > 1. Error. Some desktop file sync apps do this in practice ([Martin Kleppmann et al.
(2022)](https://doi.org/10.1109/TPDS.2021.3118603) give an example). > 2. Render the cycle nodes (and their descendants) in a special “time-out” zone. They will stay there until some user manually fixes the cycle. > 3. Use a server to process move ops. When the server receives an op, if it would create a cycle in the server’s own state, the server rejects it and tells users to do likewise. This is [what Figma does](https://www.figma.com/blog/how-figmas-multiplayer-technology-works/#syncing-trees-of-objects). Users can still process move ops optimistically, but they are tentative until confirmed by the server. (Optimistic updates can cause temporary cycles for users; in that case, Figma uses strategy (2): it hides the cycle nodes.) > 4. Similar, but use a [topological sort](https://mattweidner.com/2023/09/26/crdt-survey-2.html#topological-sort) (below) instead of a server’s receipt order. When processing ops in the sort order, if an op would create a cycle, skip it [(Martin Kleppmann et al. 2022)](https://doi.org/10.1109/TPDS.2021.3118603). > 5. For forests: Within each cycle, let `B.parent = A` be the edge whose `set` operation has the largest LWW timestamp. At render time, “hide” that edge, instead rendering `B.parent = "none"`, but don’t change the actual CRDT state. This hides one of the concurrent edges that created the cycle. > • To prevent future surprises, users’ apps should follow the rule: before performing any operation that would create or destroy a cycle involving a hidden edge, first “affirm” that hidden edge, by performing an op that sets `B.parent = "none"`. > 6. For trees: Similar, except instead of rendering `B.parent = "none"`, render the previous parent for `B` - as if the bad operation never happened. More generally, you might have to backtrack several operations. Both [Hall et al. (2018)](http://dx.doi.org/10.1145/3209280.3229110) and [Nair et al. (2022)](https://arxiv.org/abs/2103.04828) describe strategies along these lines. 
#### Ancestor Node Deletion and Descendant Node Movement ![Ancestor Node Deletion and Descendant Node Movement](./movable-tree/move_chlid_delete_parent_dark.png) The most easily overlooked scenario is moving descendant nodes while concurrently deleting one of their ancestors. If all descendants of the deleted ancestor are removed outright, users can easily get the mistaken impression that their data has been lost. ### How Popular Applications Handle Conflicts Dropbox is a file synchronization service. Initially, Dropbox treated file movement as a two-step process: deletion from the original location followed by creation at a new location. However, this method risked data loss, especially if a power outage or system crash occurred between the delete and create operations. Today, when multiple people move the same file concurrently and attempt to save their changes, Dropbox detects a conflict. In this scenario, it typically saves one version of the original file and creates a new ["conflicted copy"](https://help.dropbox.com/organize/conflicted-copy) for the changes made by one of the users. ![Solution for conflicts when moving files with Dropbox](./movable-tree/dropbox_move.gif) The image shows the conflict that occurs when folder A is moved into folder B and folder B is moved into folder A concurrently. Figma is a real-time collaborative prototyping tool. Its developers consider tree structures the most complex part of their collaborative system, as detailed in their [blog post about multiplayer technology](https://www.figma.com/blog/how-figmas-multiplayer-technology-works/#syncing-trees-of-objects). To maintain consistency, each element in Figma has a "parent" attribute. The centralized server plays a crucial role in ensuring the integrity of these structures. It monitors updates from various users and checks whether any operation would result in a cycle. If a potential cycle is detected, the server rejects the operation.
However, due to network delays and similar issues, there can be instances where updates from users temporarily create a cycle before the server has the chance to reject them. Figma acknowledges that this situation is uncommon. Their [solution](https://www.figma.com/blog/how-figmas-multiplayer-technology-works/#syncing-trees-of-objects) is straightforward yet effective: they temporarily preserve this state and hide the elements involved in the cycle. This approach lasts until the server formally rejects the operation, ensuring both the stability of the system and a seamless user experience.
![An animation that demonstrates how Figma resolves conflicts.](./movable-tree/figma-tree.gif)
An animation that demonstrates how [Figma](https://www.figma.com/blog/how-figmas-multiplayer-technology-works/#syncing-trees-of-objects) resolves conflicts. ## Movable Tree CRDTs The applications mentioned above use movable trees and resolve conflicts with centralized solutions. An alternative approach to collaborative tree structures is using Conflict-free Replicated Data Types (CRDTs). Initial CRDT-based algorithms were challenging to implement and incurred significant storage overhead, as noted in prior research such as [Abstract unordered and ordered trees CRDT](https://arxiv.org/pdf/1201.1784.pdf) or [File system on CRDT](https://arxiv.org/pdf/1207.5990.pdf), but continual optimization and improvement have made several CRDT-based tree synchronization algorithms suitable for certain production environments. This article highlights two innovative CRDT-based approaches for movable trees. The first is presented by Martin Kleppmann et al. in their work **_[A highly-available move operation for replicated trees](https://martin.kleppmann.com/2021/10/07/crdt-tree-move-operation.html)_** and the second by Evan Wallace in his **_[CRDT: Mutable Tree Hierarchy](https://madebyevan.com/algos/crdt-mutable-tree-hierarchy/)_**. ### A highly-available move operation for replicated trees This paper unifies the three tree operations (creating, deleting, and moving nodes) into a single move operation. The move operation is defined as a four-tuple `Move t p m c`, where `t` is the operation's unique and ordered timestamp, such as a [`Lamport timestamp`](https://en.wikipedia.org/wiki/Lamport_timestamp), `p` is the parent node ID, `m` is the metadata associated with the node, and `c` is the child node ID. If the tree does not yet contain a node `c`, this is a **creation** operation that creates a child node `c` under parent node `p`. Otherwise, it is a **move** operation that moves `c` from its original parent to the new parent `p`.
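The create-vs-move distinction above can be sketched in a few lines. This is an illustrative sketch, not Loro's API: the tree state is just a map from each child to its current parent, and the `MoveOp` field names mirror the paper's tuple:

```typescript
type NodeId = string;

// The paper's four-tuple: Move t p m c.
interface MoveOp {
  t: number;     // unique, ordered timestamp (e.g. a Lamport timestamp)
  p: NodeId;     // new parent node ID
  m?: unknown;   // metadata associated with the node
  c: NodeId;     // child node being created or moved
}

// Applying an op either creates `c` under `p` (if `c` is unknown to the tree)
// or re-parents the existing node `c`.
function applyMove(parents: Map<NodeId, NodeId>, op: MoveOp): "create" | "move" {
  const kind = parents.has(op.c) ? "move" : "create";
  parents.set(op.c, op.p);
  return kind;
}
```

Note that this sketch omits the cycle check; the sections below explain why an op must also be tested for safety before its effect is applied.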
Additionally, node deletion is elegantly handled by introducing a designated `TRASH` node; moving a node to `TRASH` implies its deletion, with all descendants of `TRASH` considered deleted. However, deleted nodes remain in memory, both to prevent concurrent edits from moving them under other nodes and to handle the previously mentioned situation of deleting an ancestor node while concurrently moving one of its descendants. Of the potential conflicts mentioned earlier, since deletion is also defined as a move operation, **deleting and moving the same node** is transformed into two move operations, leaving only two remaining problems: - **Moving the same node under different parents** - **Moving different nodes, creating a cycle** Logical timestamps are added so that all operations can be linearly ordered. The first conflict is thus avoided, because the two movements of the same node can be expressed as two operations in sequence rather than as concurrent operations. Therefore, when modeling a tree using only move operations, the only exceptional case in concurrent editing is creating a cycle, and operations causing a cycle are termed **unsafe operations**. This algorithm sorts all move operations according to their timestamps and then applies them sequentially. Before applying an operation, the algorithm performs cycle detection to determine whether it is safe. If the operation would create a cycle, the unsafe operation is ignored to preserve the correct structure of the tree. Based on the above approach, the consistency problem of movable trees reduces to two questions: 1. How to introduce a global order over operations 2. How to apply a remote operation that should be inserted in the middle of an existing sorted sequence of operations #### Globally Ordered Logical Timestamps [Lamport Timestamps](https://en.wikipedia.org/wiki/Lamport_timestamp) can determine the causal order of events in a distributed system. Here's how they work: each peer starts with a counter initialized to `0`.
When a local event occurs, the counter is increased by `1`, and this value becomes the event's Lamport Timestamp. When peer `A` sends a message to peer `B`, `A` attaches its Lamport Timestamp to the message. Upon receiving the message, peer `B` compares its current logical clock value with the timestamp in the message and updates its logical clock to the larger value. To globally sort events, we first compare Lamport Timestamps: smaller numbers mean earlier events. If two events have the same timestamp, the unique ID of the peer serves as a tiebreaker. #### Apply a Remote Operation An op's safety depends on the state of the tree at the moment it is applied: it must not create a cycle. Inserting an op into the sequence therefore requires evaluating the state formed by all preceding ops. For remote updates, we may need to: 1. Undo recent ops 2. Insert the new op 3. Reapply the undone ops This ensures proper integration of new ops into the existing sequence. ##### Undo Recent Ops Since we've modeled all operations on the tree as move operations, undoing a move operation involves either moving the node back to its old parent or undoing the operation that created the node. To enable quick undoing, we cache and record the **old parent** of the node before applying each move operation. ##### Apply the Remote Op Upon encountering an unsafe operation, we disregard its effects to prevent the creation of a cycle. Nevertheless, it's essential to record the operation, because the safety of an operation is determined **dynamically**. For instance, if we receive an update that sorts before this operation and deletes another node involved in the cycle, the operation that was initially unsafe becomes safe. Additionally, we need to mark the unsafe operation as ineffective, since during undo operations it's necessary to query the **old parent** node, which is the target parent of the last effective operation in the sequence targeting this node.
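The safety check itself is just a walk up the ancestor chain: moving `c` under a new parent `p` is unsafe exactly when `c` is `p` itself or an ancestor of `p`. A minimal sketch with illustrative names (the real implementation must also skip ineffective operations, as described above):

```typescript
// Returns true if applying `Move(..., p, ..., c)` would create a cycle,
// i.e. if `c` is `p` itself or an ancestor of `p`.
function wouldCreateCycle(
  parents: Map<string, string>, // child → current parent
  c: string,
  p: string
): boolean {
  let cur: string | undefined = p;
  while (cur !== undefined) {
    if (cur === c) return true;
    cur = parents.get(cur); // step to the next ancestor; the root has no entry
  }
  return false;
}
```

Because this check reads the current tree state, the same operation can flip between safe and unsafe as earlier operations are undone or redone, which is why unsafe ops are recorded rather than discarded.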
##### Reapply Undone Ops

Cycles only occur when receiving updates from other peers, so the undo-do-redo process is only needed at that time. When receiving a new op:

```jsx
function apply(newOp) {
  // Compare the ID of the new operation with existing operations
  if (largerThanExistingOpId(newOp.id, oplog)) {
    // If the new operation's ID is the greatest, apply it directly
    oplog.applyOp(newOp);
  } else {
    // Otherwise, undo operations until the new operation can be applied
    const undoneOps = oplog.undoUntilCanBeApplied(newOp);
    oplog.applyOp(newOp);
    // After applying the new operation, redo the undone operations
    // to maintain the sequence order
    oplog.redoOps(undoneOps);
  }
}
```

- If the new operation depends on an op that has not yet been seen locally, some updates between the versions are still missing. In that case, the new op must be cached temporarily and applied only after the missing updates arrive.
- Compare the new operation with all existing operations. If the `opId` of the new operation is greater than that of all existing operations, it can be applied directly. If the new operation is safe, record the current parent of the target node as the old parent, then apply the move operation to change the current state. If it is not safe, mark the operation as ineffective and ignore its impact.
- If the new `opId` sorts into the middle of the existing sequence, the operations that sort after it must be popped from the sequence one by one, undoing the impact of each one (i.e., moving the target back under its old parent), until the new operation can be applied. After applying the new operation, reapply the undone ops in sequence order, ensuring that all operations are applied in order.

The following animated GIF demonstrates the process executed by `Peer1`:

1. Received `Peer0` creating node `A` with the `root` node as its parent.
2. Received `Peer0` creating node `B` with `A` as its parent.
3.
Created node `C` with `A` as its parent and synchronized it with `Peer0`.
4. Moved `C` to have `B` as its parent.
5. Received `Peer0`'s move of `B` to have `C` as its parent.
![](./movable-tree/undo-do-redo.gif)
The queue at the top right of the animation represents the order of local operations and newly received updates. The interpretation of each element in each `Block` is as follows:
![](./movable-tree/explain.png)
A particular part of this process to note is the pair of operations with `lamport timestamps` `0:3` and `1:3`. Initially, the `1:3` operation moving `C` to `B` was created and applied locally; then `Peer0`'s `0:3` operation moving `B` to `C` was received. In `lamport timestamp` order, `0:3` is less than `1:3` but greater than `1:2` (the peer ID serves as the tiebreaker when the counters are equal). To apply the new op, the `1:3` operation is undone first, moving `C` back to its old parent `A`; then `0:3`, moving `B` to `C`, is applied. After that, `1:3` is redone, attempting to move `C` to `B` again (the old parent remains `A`, which is omitted in the animation). However, a cycle is detected during this attempt, so the operation does not take effect and the state of the tree remains unchanged. This completes one `undo-do-redo` pass.

### CRDT: Mutable Tree Hierarchy

Evan Wallace developed an innovative algorithm in which each node tracks all of its historical parents, attaching a counter to each recorded parent. The counter of a new parent is 1 higher than those of all the node's historical parents, indicating the order in which the node's parents were updated, and the parent with the highest counter is considered the current parent. During synchronization, this parent information is synced as well. If a cycle occurs, a heuristic algorithm reattaches the nodes causing the cycle to the nearest historical parent that does not cause a cycle and is connected to the root node, updating the parent records accordingly. This process repeats until all cycle-causing nodes are reattached to the tree, at which point all replicas converge on the same tree structure. The demo in [Evan's blog](https://madebyevan.com/algos/crdt-mutable-tree-hierarchy/) clearly illustrates this process. As Evan notes at the end of the article, this algorithm does not require the expensive `undo-do-redo` process.
However, each time a remote move is received, the algorithm needs to check whether all nodes are connected to the root node and reattach the nodes causing cycles back to the tree, which can perform poorly when there are many nodes. I built a [benchmark](https://github.com/Leeeon233/movable-tree-crdt) to compare the performance of the movable tree algorithms.

## Movable Tree CRDTs implementation in Loro

Loro implements the algorithm proposed by Martin Kleppmann et al. in **_[A highly-available move operation for replicated trees](https://martin.kleppmann.com/2021/10/07/crdt-tree-move-operation.html)_**. On one hand, this algorithm performs well in most real-world scenarios. On the other hand, the core `undo-do-redo` process of the algorithm is very similar to how Eg-walker (Event Graph Walker) applies remote updates in Loro. An introduction to **Eg-walker** can be found in our previous [blog](https://www.loro.dev/blog/loro-richtext#brief-introduction-to-replayable-event-graph).

We have introduced the movable tree in detail, but one problem of the tree structure remains unsolved: in some real use cases, we also need the ability to sort child nodes. This is necessary for outline notes or layer management in graphic design software, where users need to adjust node order and sync it to other collaborators and devices. We integrated the `Fractional Index` algorithm into Loro and combined it with the movable tree, making the child nodes of the movable tree sortable. There are many introductions to `Fractional Index` on the web; you can read more in the [Figma blog](https://www.figma.com/blog/realtime-editing-of-ordered-sequences) or [Evan's blog](https://madebyevan.com/algos/crdt-fractional-indexing/).
In simple terms, `Fractional Index` assigns a sortable value to each object; when a new object is inserted between two existing ones, its `Fractional Index` lies between the left and right values. What we want to discuss in more depth here is how to deal with the potential conflicts `Fractional Index` brings in a CRDT system.

### Potential Conflicts in Child Node Sorting

Because our applications run in a distributed setting, when multiple peers insert new nodes at the same position, the same `Fractional Index` will be assigned to nodes that have different content but the same position. When remote updates are applied locally, conflicts arise because the same `Fractional Index` is encountered. In Loro, we retain these identical `Fractional Index`es and use the `PeerID` (the unique ID of each peer) as the tiebreaker for the relative order of nodes with the same `Fractional Index`.

![](./movable-tree/FI-and-PeerID-dark.png)

Although this solves the sorting problem among nodes with the same `Fractional Index` from different peers, it affects the generation of new `Fractional Index`es, as we cannot generate a new `Fractional Index` between two identical ones. We use two methods to solve this problem:

1. The first method, as described in Evan's blog, is to add a certain amount of jitter to each generated `Fractional Index` (for ease of explanation, all examples below use decimal fractions as the `Fractional Index`). For example, when generating a new `Fractional Index` between 0 and 1, the result would normally be 0.5, but with random jitter it could be `0.52712`, `0.58312`, `0.52834`, etc., significantly reducing the chance of identical `Fractional Index`es appearing.
2. If identical `Fractional Index`es do appear on both sides, we can handle the problem by resetting these `Fractional Index`es.
For example, if we need to insert a new node between `0.7@A` and `0.7@B` (which denotes `Fractional Index` @ `PeerID`), instead of generating a new `Fractional Index` between 0.7 and 0.7, we assign two new `Fractional Index`es between 0.7 and 1, one for the new node and one for the `0.7@B` node. This can be understood as an extra move operation.

![](./movable-tree/same-FI-dark.png)

### Implementation and Encoding Size

Introducing `Fractional Index` brings the advantage of ordered child nodes. What about the encoding size? Loro uses the [drifting-in-space](https://github.com/drifting-in-space/fractional_index) `Fractional Index` implementation, which is based on `Vec` and is therefore base 256. In other words, you need to insert 128 values consecutively forward or backward from the default value to increase the byte size of the `Fractional Index` by 1. The worst case for storage overhead is inserting new values alternately. For example, if the initial sequence is `ab`, insert `c` between `a` and `b`, then `d` between `c` and `b`, then `e` between `c` and `d`:

```js no_run
ab    // [128] [129, 128]
acb   // [128] [129, 127, 128] [129, 128]
acdb  // [128] [129, 127, 128] [129, 127, 129, 128] [129, 128]
acedb // [128] [129, 127, 128] [129, 127, 129, 127, 128] [129, 127, 129, 128] [129, 128]
```

Here each new operation needs one additional byte, but such a situation is very rare.

Considering that potential conflicts don't appear frequently in most applications, Loro simply extends this implementation. The original implementation produces a new `Fractional Index` in a `Vec` by only increasing or decreasing the byte at a certain index by 1 to achieve the relative ordering. We added the simple jitter solution on top of it by appending `jitter` random bytes to each `Fractional Index`. To enable jitter in JS, you can call `doc.setFractionalIndexJitter(number)` with a positive value. This increases the encoding size slightly, but each `Fractional Index` only adds `jitter` bytes.
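As a simplified decimal analogue of the scheme above (Loro's real implementation works on base-256 byte vectors; the function name and the jitter radius here are illustrative assumptions), generating an index between two neighbors with optional jitter might look like:

```typescript
// Simplified decimal model of Fractional Index generation with jitter.
// Generate an index strictly between `left` and `right`.
function indexBetween(left: number, right: number, jitter = 0): number {
  const mid = (left + right) / 2;
  if (jitter === 0) return mid;
  // Jitter shifts the midpoint by a small random offset while keeping
  // the result inside (left, right), so relative ordering is preserved.
  const radius = (right - left) / 4;
  return mid + (Math.random() * 2 - 1) * radius;
}
```

Without jitter, two peers inserting at the same position both compute the same midpoint and the `PeerID` tiebreak (or index resetting) kicks in; with jitter, the generated values almost always differ.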
If you want concurrent `Fractional Index` generation at the same position to remain conflict-free with 99% probability, the relationship between the `jitter` setting and the maximum number of concurrent edits `n` is:

| jitter | max number of concurrent edits |
| :----- | :----------------------------- |
| 1      | 3                              |
| 2      | 37                             |
| 3      | 582                            |
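One plausible way to reproduce figures of this shape is the standard birthday-problem approximation: `jitter` random bytes give a space of 256^jitter suffixes, and the chance that `n` independent draws are all distinct is roughly `exp(-n² / (2 · space))`. The sketch below is our own back-of-the-envelope estimate, not code from Loro, and the approximation can land within one or two of the table's value at the larger sizes.

```typescript
// Estimate the largest `n` such that `n` independently jittered
// Fractional Indexes at the same position stay distinct with
// probability >= pNoCollision (default 99%).
// Birthday approximation: P(all distinct) ≈ exp(-n^2 / (2 * space)).
function maxConcurrentEdits(jitter: number, pNoCollision = 0.99): number {
  const space = Math.pow(256, jitter); // number of possible jitter suffixes
  return Math.ceil(Math.sqrt(2 * space * Math.log(1 / pNoCollision)));
}
```

For `jitter = 1` and `jitter = 2` this yields 3 and 37, matching the table.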
When there are many `Fractional Index`es, they share long common prefixes once sorted. When Loro encodes them, it applies prefix compression: each `Fractional Index` only stores the length of the prefix it shares with the previous one plus its remaining bytes, which further reduces the overall encoding size.

### Related work

Besides `Fractional Index`, there are other movable list CRDTs that can keep the sibling nodes of a tree in order. One of these algorithms is Martin Kleppmann's [Moving Elements in List CRDTs](https://martin.kleppmann.com/2020/04/27/papoc-list-move.html), which is used in Loro's [Movable List](https://www.loro.dev/docs/tutorial/list). In comparison, the `Fractional Index` solution is simpler to implement, at the cost of not providing a stable position representation for child nodes; modeling that inside a tree would make the overall structure too complex. `Fractional Index` also has the [interleaving](https://vlcn.io/blog/fractional-indexing#interleaving) problem, but this is acceptable for applications that only need a relative order rather than strict sequential semantics, such as Figma layer lists or multi-level bookmarks.

## Benchmark

We benchmarked Loro's Movable Tree implementation, covering random node movement, switching to historical versions, and extreme conditions with very deep tree structures. The results indicate that it can support real-time collaboration and seamless historical version checkouts.
| Task | Time | Setup |
| :---------------------------------------- | :----- | :------------------------------------------------ |
| Move 10000 times randomly | 28 ms | Create 1000 nodes first |
| Switch to different versions 1000 times | 153 ms | Create 1000 nodes and move 1000 times first |
| Switch to different versions 1000 times in a tree with depth of 300 | 701 ms | The new node is a child node of the previous node |

Test environment: M2 Max CPU. You can find the bench code [here](https://github.com/loro-dev/loro/blob/main/crates/loro-internal/benches/tree.rs).

## Usage

```tsx
import { Loro, LoroTree, LoroTreeNode, LoroMap } from "loro-crdt";

let doc = new Loro();
let tree: LoroTree = doc.getTree("tree");
let root: LoroTreeNode = tree.createNode();
// By default, append to the end of the parent node's children list
let node = root.createNode();
// Specify the child's position
let node2 = root.createNode(0);
// Move `node2` to be the last child of `node`
node2.move(node);
// Move `node` to be the first child of `node2`
node.move(node2, 0);
// Move the node to become the root node
node.move();
// Move the node to be positioned after another node
node.moveAfter(node2);
// Move the node to be positioned before another node
node.moveBefore(node2);
// Retrieve the index of the node within its parent's children
let index = node.index();
// Get the `Fractional Index` of the node
let fractionalIndex = node.fractionalIndex();
// Access the associated data map container
let nodeData: LoroMap = node.data;
```

### Demo

We built a simulated Todo app with data synchronized among multiple peers using Loro. It uses a `Movable Tree` to represent subtask relationships, a `Map` for the various attributes of tasks, and `Text` for task titles. In addition to basic creation, moving, modification, and deletion, we also implemented version switching based on Loro.
You can drag the scrollbar to switch among all the historical versions produced by these operations.