CRDTs in Production: Building Conflict-Free Real-Time Collaboration Systems

Why real-time collaboration systems are moving from Operational Transformation toward CRDTs.

3/9/2026 · 7 min read

Last updated: 3/9/2026

The collaboration problem that broke Operational Transformation

For years, Operational Transformation (OT) was the standard algorithm powering real-time collaborative editing in applications like Google Docs and Microsoft Office 365. The core idea was elegant: when two users edit the same text simultaneously, transform their operations to achieve convergence.
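To make the transform step concrete, here is a minimal sketch of OT restricted to concurrent insertions. The `Insert` shape, the `site` tiebreaker, and the function names are assumptions made for illustration; they are not taken from any particular OT implementation.

```typescript
// Hypothetical insert-only operation; `site` breaks ties at equal positions.
interface Insert {
  pos: number;
  text: string;
  site: string;
}

// Apply an insert to a plain string document.
function applyInsert(doc: string, op: Insert): string {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Rewrite `op` so it can be applied after the concurrent operation `against`.
function transform(op: Insert, against: Insert): Insert {
  const shifts =
    against.pos < op.pos ||
    (against.pos === op.pos && against.site < op.site);
  return shifts ? { ...op, pos: op.pos + against.text.length } : op;
}

const base = "abc";
const a: Insert = { pos: 1, text: "X", site: "alice" };
const b: Insert = { pos: 2, text: "Y", site: "bob" };

// Alice applies her own edit first, then Bob's edit transformed against hers.
const atAlice = applyInsert(applyInsert(base, a), transform(b, a));
// Bob applies his own edit first, then Alice's edit transformed against his.
const atBob = applyInsert(applyInsert(base, b), transform(a, b));

console.log(atAlice, atBob); // aXbYc aXbYc
```

A full OT system also needs transforms for deletions and for every pair of operation types, which is where much of the complexity comes from.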

In theory, OT works perfectly. In production, it collapses under edge cases that become inevitable at scale.

The fundamental flaw is that OT, as typically deployed, requires centralized coordination. Every operation must pass through a central server that transforms it against all concurrent operations. This creates a single point of failure, adds latency proportional to the distance to that server, and makes offline operation nearly impossible. When a network partition occurs, replicas diverge, and bringing them back together requires complex and error-prone conflict resolution.

The industry's solution is Conflict-Free Replicated Data Types (CRDTs).

What makes CRDTs different

CRDTs are mathematical data structures designed to guarantee eventual consistency without requiring coordination. When two replicas of a CRDT diverge and later reconnect, they will automatically converge to the same state—no conflict resolution needed.

This is achieved through two key properties:

  1. Commutativity: The order in which concurrent operations are applied doesn't affect the final state.
  2. Idempotency: Applying the same operation twice has no additional effect.

If Alice inserts "Hello" and Bob inserts "World" concurrently, a CRDT ensures both replicas end up with the same text (either "HelloWorld" or "WorldHello", chosen by a deterministic rule), regardless of which message arrived first or whether they communicated directly.
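The two properties can be seen in the simplest possible case, a grow-only set whose only operation is "add". This is a minimal sketch; `applyAll` and `fingerprint` are hypothetical helper names used only for this illustration.

```typescript
// Build a grow-only set by applying "add" operations in the order received.
function applyAll(ops: string[]): Set<string> {
  const state = new Set<string>();
  for (const op of ops) {
    state.add(op); // adding is order-independent, and re-adding changes nothing
  }
  return state;
}

// Order-independent representation, used only to compare two replicas.
function fingerprint(state: Set<string>): string {
  return Array.from(state).sort().join(",");
}

const replicaA = applyAll(["x", "y", "z"]);
// Same operations, reordered and with duplicates redelivered.
const replicaB = applyAll(["z", "x", "x", "y", "z"]);

console.log(fingerprint(replicaA) === fingerprint(replicaB)); // true
```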

Types of CRDTs: Operation-based vs State-based

State-based CRDTs (CvRDTs)

State-based CRDTs (also called CvRDTs, or Convergent Replicated Data Types) work by exchanging entire state snapshots. Each replica compares its state with incoming state and merges using a deterministic merge function:

interface CvRDT<T> {
  state: T;
  merge(otherState: T): void;  // merge must be commutative, associative, idempotent
}

// Two-way sync: each replica merges the other's state into its own.
function sync<T>(replicaA: CvRDT<T>, replicaB: CvRDT<T>): void {
  replicaA.merge(replicaB.state);
  // A now holds the merged state. Merging it into B gives the same result
  // because merge is associative and idempotent.
  replicaB.merge(replicaA.state);
}

The merge function is the heart of the CvRDT. It must be:

  • Commutative: merge(A, B) === merge(B, A)
  • Associative: merge(merge(A, B), C) === merge(A, merge(B, C))
  • Idempotent: merge(A, A) === A

Common CvRDT implementations include G-Counters (grow-only counters) and G-Sets (grow-only sets).
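As an illustration, a G-Counter can be sketched as a map from replica ID to that replica's local increment total, merged by taking the element-wise maximum. The type and function names below are assumptions for this sketch, not a library API.

```typescript
// Per-replica increment totals, keyed by replica ID.
type GCounterState = Record<string, number>;

// Each replica only ever increments its own entry.
function increment(state: GCounterState, replicaId: string, by = 1): GCounterState {
  return { ...state, [replicaId]: (state[replicaId] ?? 0) + by };
}

// Merge takes the element-wise maximum, which is commutative,
// associative, and idempotent.
function mergeCounters(a: GCounterState, b: GCounterState): GCounterState {
  const merged: GCounterState = { ...a };
  for (const id of Object.keys(b)) {
    merged[id] = Math.max(merged[id] ?? 0, b[id] ?? 0);
  }
  return merged;
}

// The counter's value is the sum of all per-replica totals.
function counterValue(state: GCounterState): number {
  return Object.keys(state).reduce((sum, id) => sum + (state[id] ?? 0), 0);
}

// Two replicas increment independently while disconnected.
let alice: GCounterState = {};
let bob: GCounterState = {};
alice = increment(alice, "alice");
alice = increment(alice, "alice");
bob = increment(bob, "bob");

const ab = mergeCounters(alice, bob);
const ba = mergeCounters(bob, alice);

console.log(counterValue(ab), counterValue(ba), counterValue(mergeCounters(ab, ab))); // 3 3 3
```

Taking the maximum rather than the sum is what makes repeated merges safe: merging the same state twice cannot double-count an increment.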

Operation-based CRDTs (CmRDTs)

Operation-based CRDTs (CmRDTs, or Commutative Replicated Data Types) exchange operations rather than full state. Each operation is broadcast to all replicas and must be applied exactly once:

interface Op {
  id: string;        // unique operation identifier
  timestamp: number; // Lamport clock or hybrid logical clock
  payload: unknown;  // operation-specific data
}

interface CmRDT {
  apply(op: Op): void;
}

// In-process stand-in for a network broadcast: deliver op to every replica.
function broadcast(replicas: CmRDT[], op: Op): void {
  replicas.forEach(r => r.apply(op));
}

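CmRDTs are more bandwidth-efficient, but they require a reliable delivery channel in which every operation reaches every replica and takes effect exactly once, usually in causal order. Duplicate operations must be detected and ignored using operation IDs.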
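Duplicate detection by operation ID can be sketched with a simple counter. The `CounterOp` shape, the `OpCounter` class, and the `"<replica>:<sequence>"` ID format are assumptions chosen for this example.

```typescript
interface CounterOp {
  id: string;    // unique operation identifier, here "<replica>:<sequence>"
  delta: number; // amount to add (may be negative)
}

class OpCounter {
  private seen = new Set<string>();
  private total = 0;

  // Apply an operation once; returns false if it was a duplicate.
  apply(op: CounterOp): boolean {
    if (this.seen.has(op.id)) return false;
    this.seen.add(op.id);
    this.total += op.delta;
    return true;
  }

  value(): number {
    return this.total;
  }
}

const first: CounterOp = { id: "alice:1", delta: 5 };
const second: CounterOp = { id: "bob:1", delta: -2 };

const replica1 = new OpCounter();
const replica2 = new OpCounter();

// replica1 receives the operations in order.
[first, second].forEach(op => replica1.apply(op));
// replica2 receives them reversed, with one operation redelivered.
[second, first, second].forEach(op => replica2.apply(op));

console.log(replica1.value(), replica2.value()); // 3 3
```

The `seen` set in this sketch grows without bound. Tracking the highest sequence number applied per replica is a common way to keep that bookkeeping compact.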

Practical CRDTs for collaborative editing

RGA: Replicated Growable Array

The Replicated Growable Array (RGA) is the workhorse CRDT for collaborative text editing. It represents text as a linked list where each character node contains:

type NodeId = { replicaId: string; counter: number };  // unique across replicas

interface RGANode {
  id: NodeId;
  value: string;          // the character
  leftId: NodeId | null;  // ID of the node this one was inserted after (null at the start)
  deleted: boolean;       // tombstone flag
}

When Alice inserts "A" at position 5, her node is linked after the node currently at position 5. If Bob simultaneously inserts "B" at position 5, both nodes link after the same left neighbor. The final order is determined by a deterministic tiebreaker, typically a comparison of the two node IDs (counter first, then replica ID).

This guarantees that both replicas converge to the same sequence without requiring coordination.
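The integration rule can be sketched as follows. This is a simplified illustration, not a production RGA: it stores nodes in a flat array, omits deletion, assumes operations arrive in causal order, and uses a tiebreaker (higher counter first, then higher replica ID) chosen for this example. The `RGA` class and its method names are hypothetical.

```typescript
type NodeId = { replicaId: string; counter: number };

interface RGANode {
  id: NodeId;
  value: string;
  leftId: NodeId | null; // node this one was inserted after (null = start)
  deleted: boolean;
}

function sameId(a: NodeId, b: NodeId): boolean {
  return a.replicaId === b.replicaId && a.counter === b.counter;
}

// Tiebreaker assumed by this sketch: higher counter wins, then higher replica ID.
function outranks(a: NodeId, b: NodeId): boolean {
  return a.counter !== b.counter ? a.counter > b.counter : a.replicaId > b.replicaId;
}

class RGA {
  private nodes: RGANode[] = [];
  private clock = 0; // Lamport clock: at least as large as every counter seen
  private replicaId: string;

  constructor(replicaId: string) {
    this.replicaId = replicaId;
  }

  // Create a node locally, integrate it, and return it for broadcasting.
  insertAfter(leftId: NodeId | null, value: string): RGANode {
    const id: NodeId = { replicaId: this.replicaId, counter: this.clock + 1 };
    const node: RGANode = { id, value, leftId, deleted: false };
    this.integrate(node);
    return node;
  }

  // Integrate a local or remote node. Assumes causal delivery: the left
  // neighbor has already been integrated.
  integrate(node: RGANode): void {
    if (this.nodes.some(n => sameId(n.id, node.id))) return; // duplicate delivery
    this.clock = Math.max(this.clock, node.id.counter);

    let index = 0;
    const left = node.leftId;
    if (left !== null) {
      const leftIndex = this.nodes.findIndex(n => sameId(n.id, left));
      if (leftIndex === -1) throw new Error("left neighbor not yet delivered");
      index = leftIndex + 1;
    }
    // Skip past nodes that outrank the new one, so every replica picks the same slot.
    while (index < this.nodes.length && outranks(this.nodes[index].id, node.id)) {
      index++;
    }
    this.nodes.splice(index, 0, node);
  }

  text(): string {
    return this.nodes.filter(n => !n.deleted).map(n => n.value).join("");
  }
}

const alice = new RGA("alice");
const bob = new RGA("bob");

// Shared starting point "ab", created by Alice and delivered to Bob.
const a = alice.insertAfter(null, "a");
const b = alice.insertAfter(a.id, "b");
bob.integrate(a);
bob.integrate(b);

// Concurrent inserts after the same node "a".
const x = alice.insertAfter(a.id, "X");
const y = bob.insertAfter(a.id, "Y");

// Each replica receives the other's operation after applying its own.
alice.integrate(y);
bob.integrate(x);

console.log(alice.text(), bob.text()); // aYXb aYXb
```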

LWW-Element-Set: Last-Writer-Wins Element Set

For applications where exact ordering is less critical than convergence, LWW-Element-Set provides a simpler alternative. Each element is paired with a timestamp, and merge semantics resolve conflicts by keeping the element with the higher timestamp:

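interface LWWElement<T> {
  value: T;
  addedAt: number;    // timestamp of the most recent add
  removedAt: number;  // timestamp of the most recent remove (0 if never removed)
}

// Merge two replicas: per value, keep the latest add and the latest remove.
// Values are matched by Map key equality, so T should be a primitive.
function merge<T>(a: LWWElement<T>[], b: LWWElement<T>[]): LWWElement<T>[] {
  const byValue = new Map<T, LWWElement<T>>();
  for (const el of [...a, ...b]) {
    const prev = byValue.get(el.value) ?? el;
    byValue.set(el.value, {
      value: el.value,
      addedAt: Math.max(prev.addedAt, el.addedAt),
      removedAt: Math.max(prev.removedAt, el.removedAt),
    });
  }
  return Array.from(byValue.values());
}

// An element is present if its latest add is newer than its latest remove.
function isPresent<T>(el: LWWElement<T>): boolean { return el.addedAt > el.removedAt; }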

LWW-Element-Set is ideal for collaborative presence indicators, cursors, and other metadata where occasional "last-writer-wins" behavior is acceptable.

Production CRDT frameworks

Yjs: The de facto standard for web-based collaboration

Yjs has emerged as the dominant CRDT framework for web applications due to its:

  • TypeScript type definitions included with the package
  • Extensive binding ecosystem (React, Vue, Svelte, Solid)
  • Binary encoding for efficient network transmission
  • Undo/redo support built into the data model

import * as Y from 'yjs';
import { WebsocketProvider } from 'y-websocket';

const doc = new Y.Doc();
const ytext = doc.getText('codemirror');

// Connect to collaboration server
const wsProvider = new WebsocketProvider(
  'wss://your-collab-server.com',
  'document-room-123',
  doc
);

// Local changes automatically sync
ytext.insert(0, 'Hello World!');

// Observe changes; the callback fires for local and remote edits alike
ytext.observe((event, transaction) => {
  if (transaction.local) return; // skip edits made on this client
  event.delta.forEach(op => {
    if (op.insert) console.log('Inserted:', op.insert);
    if (op.delete) console.log('Deleted', op.delete, 'chars');
  });
});

Automerge: JSON-compatible CRDTs

Automerge focuses on making CRDTs feel like working with plain JavaScript objects:

import * as Automerge from '@automerge/automerge';

// Note: this uses the Automerge.Text API from earlier Automerge releases.
// Newer releases model text as plain strings edited with Automerge.splice,
// so check the version you depend on.
let doc = Automerge.init<{ text: Automerge.Text }>();
doc = Automerge.change(doc, d => {
  d.text = new Automerge.Text();
  d.text.insertAt(0, ...'Hello');
});

// Create a fork
let doc2 = Automerge.clone(doc);
doc2 = Automerge.change(doc2, d => {
  d.text.insertAt(5, ...' World');
});

// Merge forks without conflicts
let merged = Automerge.merge(doc, doc2);
// merged.text.toString() === "Hello World"

Automerge is particularly suited for applications that need to persist CRDT state to traditional databases, as it provides compact binary serialization.

Infrastructure considerations for CRDT deployments

WebSocket vs WebRTC transport

Real-time CRDT systems need a bidirectional channel to exchange updates while clients are online. The choice between WebSocket and WebRTC depends on your deployment architecture:

WebSocket (via y-websocket or similar):

  • Requires a central server to relay updates between clients
  • Better for web applications with browser clients
  • Easier to implement authentication and authorization
  • Server can mediate and record all changes

WebRTC (via y-webrtc or similar):

  • Peer-to-peer mesh topology
  • Better for desktop/mobile applications with native clients
  • Lower latency as data flows directly between peers
  • No central data server needed, though peers still rely on a signaling server to find each other

Persistence strategies

CRDT state must be persisted for:

  1. Reconnection: When clients reconnect, they need to receive missing operations
  2. New participants: Late joiners need the complete document state
  3. Backup and recovery: Document restoration after data loss

Recommended persistence patterns:

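import * as Y from 'yjs';

// `db` and `compress` are placeholders for your storage client and
// compression helper; they are not part of Yjs.

// Snapshotting strategy for long-running documents
async function saveSnapshot(doc: Y.Doc, docId: string) {
  const state = Y.encodeStateAsUpdateV2(doc);
  const compressed = await compress(state);
  await db.save(`snapshot:${docId}`, compressed);
}

// Operation log for point-in-time recovery
async function appendToLog(op: Uint8Array, docId: string) {
  // lpush prepends, so read the list in reverse to replay updates in order
  await db.lpush(`ops:${docId}`, op);
  // The log expires after 7 days, so take snapshots at least that often
  await db.expire(`ops:${docId}`, 60 * 60 * 24 * 7); // 7 days
}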

Hybrid approaches work well: keep frequent operation logs for recent changes and periodic full snapshots for long-term storage.
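The hybrid pattern can be sketched independently of any CRDT library. The `DocumentStore` class below is a hypothetical in-memory stand-in for a database, and the grow-only set used as document state is only there to keep the example self-contained.

```typescript
// Hypothetical in-memory store; a real deployment would back it with a database.
class DocumentStore<S, U> {
  private snapshot: S;
  private log: U[] = [];
  private applyUpdate: (state: S, update: U) => S;

  constructor(initial: S, applyUpdate: (state: S, update: U) => S) {
    this.snapshot = initial;
    this.applyUpdate = applyUpdate;
  }

  // Record one incremental update in the operation log.
  append(update: U): void {
    this.log.push(update);
  }

  // Rebuild the current state: latest snapshot plus every update logged since.
  load(): S {
    let state = this.snapshot;
    for (const update of this.log) {
      state = this.applyUpdate(state, update);
    }
    return state;
  }

  // Fold the log into a fresh snapshot, then truncate the log.
  compact(): void {
    this.snapshot = this.load();
    this.log = [];
  }

  pendingUpdates(): number {
    return this.log.length;
  }
}

// Example state: a grow-only set of strings, so replayed duplicates are harmless.
const addToSet = (state: string[], item: string): string[] =>
  state.indexOf(item) === -1 ? [...state, item] : state;

const store = new DocumentStore<string[], string>([], addToSet);
store.append("a");
store.append("b");
store.compact();   // snapshot now holds "a" and "b"; the log is empty
store.append("b"); // duplicate update, has no effect on replay
store.append("c");

console.log(store.load().join(","), store.pendingUpdates()); // a,b,c 2
```

Late joiners and reconnecting clients would call something like `load()`, while a background job would call `compact()` on a schedule to keep the log short.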

When CRDTs are overkill

CRDTs are powerful but come with trade-offs:

  1. Memory overhead: Maintaining full edit history with tombstones consumes significant memory. A 100KB document might require 500KB+ of CRDT state.
  2. Complexity: Understanding and debugging CRDT behavior requires specialized knowledge. Production issues are harder to diagnose.
  3. Bandwidth: Initial synchronization requires sending the full CRDT state, which can be large for long-lived documents.
  4. Overhead for simple cases: For applications with a single active editor or read-heavy workloads, a simpler optimistic locking strategy may suffice.

CRDTs are justified when:

  • You need multi-user real-time collaboration with 3+ concurrent editors
  • Offline-first behavior is a requirement
  • Conflict-free convergence is more important than exact edit order
  • Latency is critical and server round-trips are unacceptable

Conclusion

CRDTs have transformed real-time collaboration from a fragile, coordination-heavy problem into a mathematically sound, fault-tolerant architecture. By leveraging frameworks like Yjs and Automerge, engineering teams can build collaborative applications that scale without the operational complexity that plagued Operational Transformation systems.

The investment in CRDT architecture pays dividends in user experience: your users get Google Docs-style collaboration without the "conflicting changes" dialogs, offline support that actually works, and instant responsiveness regardless of server location.

