As discussed in a previous article, I’m now able to send messages when markers are added, or properties are updated on the map.
So far, the way I’ve added collaboration features on uMap is by a) catching when changes are done on the interface, b) sending messages to the other party and c) applying the changes on the receiving client.
This works well in general, but it doesn’t take care of conflicts handling, especially when disconnection can happen.
One way to do this is to use CRDTs (Conflict-free Resolution Data Types). You can see CRDTs as a specific type of data that’s able to merge its state with other states without generating conflicts. Append-only sets are probably the most common type of CRDT: if multiple parties add the same element, it will be present only once, because it’s how sets work.
Requirements
I’m looking for something that:
- Stores key/value pairs, for most of the case, a Last Writer Wins (LWW) register might be enough
- Propagates the changes to another party
- Handles disconnections, so that it’s possible to reconcialiate local changes with remote ones when getting back online
The API could be as simple as this:
// A callback is called when new values are received
// We would obviously need a way to distinguish between local and remote changes
let store = new Store(onUpdate=callback)
store.set('key', 'value')
One thing that I would like to clarify is how does these lib work when peers get offline, and back online. I suppose I will want something like:
- When you loose the connection, you continue to apply the changes locally, but messages are piling up
- When you’re getting back online, you need a way to sync with other clients. One way to handle this is to ask the other peers for changes since the last known update, and then reapply your changes, which should sync.
What’s the complexity about?
CRDTs are intimidating. When trying to understand what’s going on, I felt I was missing some context. A lot of terms aren’t familiar to me, and as such, it’s easy to feel a bit lost.
It turns out that what I’m trying to do is rather simple. Don’t get me wrong, CRDTs are solving a hard problem, but mainly they’re solving a problem we don’t have: lists. We’re mainly interested in maps and registers.
YATA and RGA
The two popular CRDTs implementation out there use different approaches for the virtual counter:
- RGA maintains a single globally incremented counter (which can be ordinary integer value), that’s updated anytime we detect that remote insert has an id with sequence number higher that local counter. Therefore every time, we produce a new insert operation, we give it a highest counter value known at the time.
- YATA also uses a single integer value, however unlike in case of RGA we don’t use a single counter shared with other replicas, but rather let each peer keep its own, which is incremented monotonically only by that peer. Since increments are monotonic, we can also use them to detect missing operations eg. updates marked as A:1 and A:3 imply, that there must be another (potentially missing) update A:2.Y.js and Automerge.
Y.js
YJS uses YATA (Yet Another Transformation Approach), which is a delta-state based variant.
The API seem to offer what we look for, and provides a way to observe changes
const store = new Y.Doc()
const map = ydoc.getMap()
map.set('key', 'value')
map.observe((event) => {
// read the keys that changed
event.keysChanged
// If I need to iterate on the keys, or get the old values, it's possible.
event.changes.keys.forEach((change, key) {
map.get(key)
})
})
Pros:
- Awareness support
Cons:
Automerge
The API looks like this:
let store = repo.create()
store.change(d => d.key = "value")
store.on("change", ({ doc }) => {
})
Pros
Cons
- Documentation hard to understand. I didn’t see what’s getting passed to the callback for observers, for instance.
JSON Joy
import {Model} from 'json-joy/es2020/json-crdt';
// Create a new JSON CRDT document.
const model = Model.withLogicalClock();
// Find "obj" object node at path [].
const obj = model.api.obj([]);
// Overwrite the "counter" last-write-wins register to 25.
obj.set({ counter: 25 });
Pros:
- Low level
- Atomic libraries
Cons:
- Doesn’t provide high level interface for sync
Comparison
YATA / RGA are two different types of CRDTs,
There are two types of CRDTs: state-based (convergent) and operation-based (commutative).
Name | Type | Size | Bundler | Conflicts |
---|---|---|---|---|
Y.js | YATA | Not sure | required | |
Automerge | RGA | |||
Json Joy | ||||
RXDB |
Resources
- Bartosz Sypytkowski introduction on CRDTs, with practical exemples is very intuitive.