mirror of
https://github.com/almet/notmyidea.git
synced 2025-04-29 20:12:38 +02:00
<!DOCTYPE html>
<html lang="fr">
<head>
<title>
CRDTs - Alexis Métaireau </title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet"
href="https://blog.notmyidea.org/theme/css/main.css?v2"
type="text/css" />
<link href="https://blog.notmyidea.org/feeds/all.atom.xml"
type="application/atom+xml"
rel="alternate"
title="Alexis Métaireau ATOM Feed" />
</head>
<body>
<div id="content">
<section id="links">
<ul>
<li>
<a class="main" href="/">Alexis Métaireau</a>
</li>
<li>
<a class=""
href="https://blog.notmyidea.org/journal/index.html">Journal</a>
</li>
<li>
<a class="selected"
href="https://blog.notmyidea.org/code/">Code, etc.</a>
</li>
<li>
<a class=""
href="https://blog.notmyidea.org/weeknotes/">Weekly notes</a>
</li>
<li>
<a class=""
href="https://blog.notmyidea.org/lectures/">Readings</a>
</li>
<li>
<a class=""
href="https://blog.notmyidea.org/projets.html">Projects</a>
</li>
</ul>
</section>
<header>
<h1 class="post-title">CRDTs</h1>
<time datetime="2024-02-24T00:00:00+01:00">24 February 2024</time>
</header>
<article>
<p>As discussed in a previous article, I’m now able to send messages
when markers are added, or properties are updated on the map.</p>
<p>So far, I’ve added collaboration features to uMap by a) catching
changes made in the interface, b) sending messages to the other party and c)
applying the changes on the receiving client.</p>
<p>This works well in general, but it doesn’t handle conflicts,
especially when disconnections can happen.</p>
<p>One way to do this is to use CRDTs (Conflict-free Replicated Data Types).
You can see a CRDT as a specific type of data that’s able to merge its state with
other states without generating conflicts. Append-only sets are probably the
most common type of <span class="caps">CRDT</span>: if multiple parties add the same element, it will be
present only once, because that’s how sets work.</p>
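<p>To make the append-only set idea concrete, here is a minimal sketch. The <code>GSet</code> class and its method names are made up for this example and don’t come from any of the libraries discussed below.</p>

```javascript
// Grow-only set (G-Set): a minimal state-based CRDT sketch.
// The class and method names are illustrative, not taken from any library.
class GSet {
  constructor(items = []) {
    this.items = new Set(items)
  }

  add(item) {
    this.items.add(item)
  }

  // Merging is a set union: commutative, associative and idempotent,
  // so replicas converge whatever the order of the merges.
  merge(other) {
    return new GSet([...this.items, ...other.items])
  }

  values() {
    return [...this.items].sort()
  }
}

const alice = new GSet()
const bob = new GSet()
alice.add('marker-1')
bob.add('marker-1') // the same element, added concurrently
bob.add('marker-2')

const mergedA = alice.merge(bob)
const mergedB = bob.merge(alice)
```

<p>Both merges contain <code>marker-1</code> once and <code>marker-2</code> once, whichever side starts the merge.</p>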
<h2 id="requirements">Requirements</h2>
<p>I’m looking for something that:</p>
<ul>
<li><strong>Stores key/value pairs</strong>. For most cases, a Last Writer Wins (<span class="caps">LWW</span>)
register might be enough</li>
<li><strong>Propagates the changes</strong> to another party</li>
<li><strong>Handles disconnections</strong>, so that it’s possible to reconcile local
changes with remote ones when getting back online</li>
</ul>
<p>The <span class="caps">API</span> could be as simple as this:</p>
<div class="highlight"><pre><span></span><code><span class="c1">// `Store` is a hypothetical class, it doesn't exist yet</span>
<span class="c1">// A callback is called when new values are received</span>
<span class="c1">// We would obviously need a way to distinguish between local and remote changes</span>
<span class="kd">let</span><span class="w"> </span><span class="nx">store</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ow">new</span><span class="w"> </span><span class="nx">Store</span><span class="p">({</span><span class="w"> </span><span class="nx">onUpdate</span><span class="o">:</span><span class="w"> </span><span class="nx">callback</span><span class="w"> </span><span class="p">})</span>
<span class="nx">store</span><span class="p">.</span><span class="nx">set</span><span class="p">(</span><span class="s1">'key'</span><span class="p">,</span><span class="w"> </span><span class="s1">'value'</span><span class="p">)</span>
</code></pre></div>
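<p>As a rough sketch of what such a store could look like with Last Writer Wins registers: everything here (the <code>Store</code> class, <code>peerId</code>, <code>receive</code>, the message shape) is an assumption made for the example, not an existing library. Each write is stamped with a logical clock, and ties are broken by comparing peer ids.</p>

```javascript
// Hypothetical Store: a map of last-writer-wins (LWW) registers.
// Each value carries a (timestamp, peerId) pair; the highest pair wins.
class Store {
  constructor({ peerId, onUpdate = () => {} }) {
    this.peerId = peerId
    this.onUpdate = onUpdate
    this.clock = 0 // logical clock, not wall-clock time
    this.entries = new Map() // key -> { value, timestamp, peerId }
  }

  // Local write: stamp it and return the message to send to other peers.
  set(key, value) {
    this.clock += 1
    const entry = { value, timestamp: this.clock, peerId: this.peerId }
    this.entries.set(key, entry)
    this.onUpdate(key, value, { remote: false })
    return { key, ...entry }
  }

  get(key) {
    const entry = this.entries.get(key)
    return entry === undefined ? undefined : entry.value
  }

  // Remote write: keep it only if it is newer than what we have.
  // Ties on the timestamp are broken by comparing peer ids.
  receive({ key, value, timestamp, peerId }) {
    this.clock = Math.max(this.clock, timestamp)
    const current = this.entries.get(key)
    const wins =
      current === undefined ||
      timestamp > current.timestamp ||
      (timestamp === current.timestamp && peerId > current.peerId)
    if (!wins) return false
    this.entries.set(key, { value, timestamp, peerId })
    this.onUpdate(key, value, { remote: true })
    return true
  }
}

const a = new Store({ peerId: 'a' })
const b = new Store({ peerId: 'b' })

// Both peers write the same key concurrently, then exchange messages.
const fromA = a.set('color', 'red')
const fromB = b.set('color', 'blue')
a.receive(fromB)
b.receive(fromA)
```

<p>After the exchange both stores hold the same value for <code>color</code>: the timestamps are equal, so the write coming from the peer with the highest id is kept on both sides.</p>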
<p>One thing that I would like to clarify is how these libraries behave when peers go offline and come back online. I suppose I will want something like:</p>
<ol>
<li>When you lose the connection, you continue to apply the changes locally, but
messages pile up</li>
<li>When you get back online, you need a way to sync with other clients.
One way to handle this is to ask the other peers for changes since the last
known update, and then reapply your changes, which should bring everyone back in sync.</li>
</ol>
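<p>The two steps above can be sketched as follows. This is a toy model under simplifying assumptions: <code>Hub</code>, <code>Peer</code> and <code>changesSince</code> are names invented for the example, the hub stands in for “the other peers”, and conflicts are resolved naively (the reapplied local change wins).</p>

```javascript
// Toy model of the offline/online flow. Hub stands in for the other peers.
class Hub {
  constructor() {
    this.log = [] // ordered list of every change the hub has seen
  }
  publish(change) {
    this.log.push(change)
  }
  // Changes appended after the given position in the log.
  changesSince(position) {
    return this.log.slice(position)
  }
}

class Peer {
  constructor(hub) {
    this.hub = hub
    this.online = true
    this.state = {}
    this.pending = [] // local changes not sent yet
    this.lastSeen = 0 // position of the last known update in the hub log
  }

  // Fetch and apply the changes we have not seen yet.
  pull() {
    const missed = this.hub.changesSince(this.lastSeen)
    for (const { key, value } of missed) this.state[key] = value
    this.lastSeen += missed.length
  }

  // Apply a change locally and publish it (only called when online).
  applyAndPublish(change) {
    this.state[change.key] = change.value
    this.hub.publish(change)
    this.lastSeen += 1
  }

  set(key, value) {
    const change = { key, value }
    if (this.online) {
      this.pull()
      this.applyAndPublish(change)
    } else {
      // 1. Offline: apply locally, messages pile up.
      this.state[key] = value
      this.pending.push(change)
    }
  }

  disconnect() {
    this.online = false
  }

  // 2. Back online: ask for what we missed, then reapply our changes.
  reconnect() {
    this.online = true
    this.pull()
    for (const change of this.pending) this.applyAndPublish(change)
    this.pending = []
  }
}

const hub = new Hub()
const alice = new Peer(hub)
const bob = new Peer(hub)

bob.disconnect()
bob.set('name', 'My map') // queued locally
alice.set('zoom', 12) // published
alice.set('name', 'Our map') // published
bob.reconnect()
alice.pull()
```

<p>Once Bob is back and Alice has pulled, both states agree. Bob’s offline rename overrides Alice’s, which is exactly the kind of decision an <span class="caps">LWW</span> timestamp would make explicit.</p>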
<h2 id="whats-the-complexity-about">What’s the complexity about?</h2>
<p>CRDTs are intimidating. When trying to understand what’s going on, I felt I was
missing some context. A lot of terms aren’t familiar to me, and as such, it’s easy
to feel a bit lost.</p>
<p>It turns out that what I’m trying to do is rather simple. Don’t get me wrong,
CRDTs are solving a hard problem, but mainly they’re solving a problem we don’t
have: lists. We’re mainly interested in maps and registers.</p>
<h2 id="yata-and-rga"><span class="caps">YATA</span> and <span class="caps">RGA</span></h2>
<p>The two popular <span class="caps">CRDT</span> implementations out there, Y.js and Automerge, use different approaches for the
virtual counter:</p>
<blockquote>
<ul>
<li><span class="caps">RGA</span> maintains a single globally incremented counter (which can be ordinary
integer value), that’s updated anytime we detect that remote insert has an id
with sequence number higher than local counter. Therefore every time, we produce
a new insert operation, we give it a highest counter value known at the time.</li>
<li><span class="caps">YATA</span> also uses a single integer value, however unlike in case of <span class="caps">RGA</span> we
don’t use a single counter shared with other replicas, but rather let each
peer keep its own, which is incremented monotonically only by that peer. Since
increments are monotonic, we can also use them to detect missing operations eg.
updates marked as A:1 and A:3 imply, that there must be another (potentially
missing) update A:2.</li>
</ul>
</blockquote>
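<p>Here is how I understand the two counters from the quote above, as a small sketch. It only follows the quoted description: <code>RgaClock</code> and <code>YataClock</code> are illustrative names, and this is not how Automerge or Y.js actually implement it.</p>

```javascript
// RGA-style: a single logical counter, bumped past any higher remote value.
class RgaClock {
  constructor(peerId) {
    this.peerId = peerId
    this.counter = 0
  }
  // Called when producing a local insert.
  next() {
    this.counter += 1
    return { peer: this.peerId, seq: this.counter }
  }
  // Called for the id of every remote insert.
  observe(id) {
    this.counter = Math.max(this.counter, id.seq)
  }
}

// YATA-style: each peer only increments its own counter, and we remember
// the last sequence number applied for every peer.
class YataClock {
  constructor(peerId) {
    this.peerId = peerId
    this.seen = new Map() // peer -> last sequence number applied
  }
  next() {
    const seq = (this.seen.get(this.peerId) || 0) + 1
    this.seen.set(this.peerId, seq)
    return { peer: this.peerId, seq }
  }
  // Sequence numbers from this peer that should have arrived before `id`.
  missingBefore(id) {
    const last = this.seen.get(id.peer) || 0
    const missing = []
    for (let seq = last + 1; seq < id.seq; seq++) missing.push(seq)
    return missing
  }
  // Record an update only when nothing is missing before it.
  observe(id) {
    if (this.missingBefore(id).length > 0) return false
    this.seen.set(id.peer, Math.max(this.seen.get(id.peer) || 0, id.seq))
    return true
  }
}

const rga = new RgaClock('B')
rga.observe({ peer: 'A', seq: 7 })
const rgaId = rga.next() // B jumps past the highest counter it knows

const yata = new YataClock('B')
yata.observe({ peer: 'A', seq: 1 })
const gap = yata.missingBefore({ peer: 'A', seq: 3 }) // A:2 has not arrived
const yataId = yata.next() // B's own counter is independent from A's
```

<p>With the <span class="caps">RGA</span>-style clock, B’s next id is 8 because it has seen A:7. With the <span class="caps">YATA</span>-style clock, B’s next id is 1 and receiving A:3 after A:1 reveals that A:2 is missing.</p>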
<h2 id="yjs">Y.js</h2>
<p><span class="caps">YJS</span> uses <span class="caps">YATA</span> (Yet Another Transformation Approach), which is a delta-state based variant.</p>
<p>The <span class="caps">API</span> seems to offer what we’re looking for, and provides a way to <a href="https://docs.yjs.dev/api/shared-types/y.map#observing-changes-y.mapevent">observe changes</a>:</p>
<div class="highlight"><pre><span></span><code><span class="k">import</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="kr">as</span><span class="w"> </span><span class="nx">Y</span><span class="w"> </span><span class="kr">from</span><span class="w"> </span><span class="s1">'yjs'</span>

<span class="kd">const</span><span class="w"> </span><span class="nx">ydoc</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ow">new</span><span class="w"> </span><span class="nx">Y</span><span class="p">.</span><span class="nx">Doc</span><span class="p">()</span>
<span class="kd">const</span><span class="w"> </span><span class="nx">map</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nx">ydoc</span><span class="p">.</span><span class="nx">getMap</span><span class="p">()</span>
<span class="nx">map</span><span class="p">.</span><span class="nx">set</span><span class="p">(</span><span class="s1">'key'</span><span class="p">,</span><span class="w"> </span><span class="s1">'value'</span><span class="p">)</span>
<span class="nx">map</span><span class="p">.</span><span class="nx">observe</span><span class="p">((</span><span class="nx">event</span><span class="p">)</span><span class="w"> </span><span class="p">=></span><span class="w"> </span><span class="p">{</span>
<span class="w">  </span><span class="c1">// read the keys that changed</span>
<span class="w">  </span><span class="nx">event</span><span class="p">.</span><span class="nx">keysChanged</span>

<span class="w">  </span><span class="c1">// If I need to iterate on the keys, or get the old values, it's possible.</span>
<span class="w">  </span><span class="nx">event</span><span class="p">.</span><span class="nx">changes</span><span class="p">.</span><span class="nx">keys</span><span class="p">.</span><span class="nx">forEach</span><span class="p">((</span><span class="nx">change</span><span class="p">,</span><span class="w"> </span><span class="nx">key</span><span class="p">)</span><span class="w"> </span><span class="p">=></span><span class="w"> </span><span class="p">{</span>
<span class="w">    </span><span class="nx">map</span><span class="p">.</span><span class="nx">get</span><span class="p">(</span><span class="nx">key</span><span class="p">)</span>
<span class="w">  </span><span class="p">})</span>
<span class="p">})</span>
</code></pre></div>
<p>Pros:</p>
<ul>
<li>Awareness support</li>
</ul>
<p>Cons:</p>
<ul>
<li></li>
</ul>
<h2 id="automerge">Automerge</h2>
<p>The <span class="caps">API</span> looks like this:</p>
<div class="highlight"><pre><span></span><code><span class="c1">// Assumes `repo` is an existing automerge-repo Repo instance</span>
<span class="kd">let</span><span class="w"> </span><span class="nx">store</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nx">repo</span><span class="p">.</span><span class="nx">create</span><span class="p">()</span>
<span class="nx">store</span><span class="p">.</span><span class="nx">change</span><span class="p">(</span><span class="nx">d</span><span class="w"> </span><span class="p">=></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nx">d</span><span class="p">.</span><span class="nx">key</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"value"</span><span class="w"> </span><span class="p">})</span>
<span class="nx">store</span><span class="p">.</span><span class="nx">on</span><span class="p">(</span><span class="s2">"change"</span><span class="p">,</span><span class="w"> </span><span class="p">({</span><span class="w"> </span><span class="nx">doc</span><span class="w"> </span><span class="p">})</span><span class="w"> </span><span class="p">=></span><span class="w"> </span><span class="p">{</span>
<span class="w">  </span><span class="c1">// react to the change here</span>
<span class="p">})</span>
</code></pre></div>
<h3 id="pros">Pros</h3>
<ul>
<li><a href="https://automerge.org/docs/documents/conflicts/">Get informed when a conflict occurred</a></li>
<li><a href="https://automerge.org/docs/repositories/ephemeral/">An <span class="caps">API</span> to send ephemeral messages</a></li>
</ul>
<h3 id="cons">Cons</h3>
<ul>
<li>The documentation is hard to understand. I didn’t see what’s getting passed to the
callback for observers, for instance.</li>
</ul>
<h2 id="json-joy"><span class="caps">JSON</span> Joy</h2>
<div class="highlight"><pre><span></span><code><span class="k">import</span><span class="w"> </span><span class="p">{</span><span class="nx">Model</span><span class="p">}</span><span class="w"> </span><span class="kr">from</span><span class="w"> </span><span class="s1">'json-joy/es2020/json-crdt'</span><span class="p">;</span>

<span class="c1">// Create a new JSON CRDT document.</span>
<span class="kd">const</span><span class="w"> </span><span class="nx">model</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nx">Model</span><span class="p">.</span><span class="nx">withLogicalClock</span><span class="p">();</span>

<span class="c1">// Find "obj" object node at path [].</span>
<span class="kd">const</span><span class="w"> </span><span class="nx">obj</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nx">model</span><span class="p">.</span><span class="nx">api</span><span class="p">.</span><span class="nx">obj</span><span class="p">([]);</span>

<span class="c1">// Overwrite the "counter" last-write-wins register to 25.</span>
<span class="nx">obj</span><span class="p">.</span><span class="nx">set</span><span class="p">({</span><span class="w"> </span><span class="nx">counter</span><span class="o">:</span><span class="w"> </span><span class="mf">25</span><span class="w"> </span><span class="p">});</span>
</code></pre></div>
<p>Pros:</p>
<ul>
<li>Low level</li>
<li>Atomic libraries</li>
</ul>
<p>Cons:</p>
<ul>
<li>Doesn’t provide high level interface for sync</li>
</ul>
<h2 id="comparison">Comparison</h2>
<p><span class="caps">YATA</span> and <span class="caps">RGA</span> are two different types of CRDTs.</p>
<p>There are two types of CRDTs: state-based (convergent) and operation-based (commutative).</p>
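<p>To illustrate the difference between the two families, here is a sketch using counters, which are simpler than maps. The class names are made up for the example. In the state-based version the peers exchange their whole state and merge it; in the operation-based version they exchange operations, which have to be delivered exactly once.</p>

```javascript
// State-based (convergent): replicas send their full state and merge it.
class StateCounter {
  constructor(peerId) {
    this.peerId = peerId
    this.counts = {} // peer -> number of increments made by that peer
  }
  increment() {
    this.counts[this.peerId] = (this.counts[this.peerId] || 0) + 1
  }
  // Keep the highest count known for each peer.
  merge(other) {
    for (const [peer, count] of Object.entries(other.counts)) {
      this.counts[peer] = Math.max(this.counts[peer] || 0, count)
    }
  }
  get value() {
    return Object.values(this.counts).reduce((sum, n) => sum + n, 0)
  }
}

// Operation-based (commutative): replicas send operations to each other.
class OpCounter {
  constructor() {
    this.value = 0
  }
  increment() {
    this.value += 1
    return { type: 'increment' } // the operation to broadcast
  }
  apply(op) {
    if (op.type === 'increment') this.value += 1
  }
}

const s1 = new StateCounter('a')
const s2 = new StateCounter('b')
s1.increment()
s2.increment()
s2.increment()
s1.merge(s2)
s1.merge(s2) // merging twice is harmless: merge is idempotent
s2.merge(s1)

const o1 = new OpCounter()
const o2 = new OpCounter()
const op = o1.increment()
o2.apply(op) // applying the same operation twice would double count
```

<p>The state-based version tolerates duplicated or reordered messages at the cost of sending more data, while the operation-based one sends small messages but relies on the transport to deliver each of them once.</p>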
<table>
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th>Size</th>
<th>Bundler</th>
<th>Conflicts</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://github.com/yjs/yjs">Y.js</a></td>
<td><span class="caps">YATA</span></td>
<td>Not sure</td>
<td><a href="https://github.com/yjs/yjs/issues/282">required</a></td>
<td></td>
</tr>
<tr>
<td><a href="https://automerge.org">Automerge</a></td>
<td><span class="caps">RGA</span></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><a href="https://jsonjoy.com">Json Joy</a></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td><a href="https://rxdb.info/crdt.html"><span class="caps">RXDB</span></a></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
<h3 id="resources">Resources</h3>
<ul>
<li><a href="https://www.bartoszsypytkowski.com/the-state-of-a-state-based-crdts/">Bartosz Sypytkowski</a>’s introduction to CRDTs, with practical
examples, is very intuitive.</li>
<li><a href="https://jzhao.xyz/thoughts/CRDT-Implementations#replicated-growable-array-rga">CRDT Implementations: Replicated Growable Array (<span class="caps">RGA</span>)</a></li>
</ul>
<p>
<a href="https://blog.notmyidea.org/tag/crdts.html">#crdts</a>
, <a href="https://blog.notmyidea.org/tag/umap.html">#umap</a>
, <a href="https://blog.notmyidea.org/tag/sync.html">#sync</a>
- Posted in category <a href="https://blog.notmyidea.org/code/">code</a>
</p>
</article>
<footer>
<a id="feed" href="/feeds/all.atom.xml">
<img alt="RSS Logo" src="/theme/rss.svg" />
</a>
</footer>
</div>
</body>
</html>