blog.notmyidea.org/content/code/2024-06-20-umap-4.md
2024-07-12 02:18:10 +02:00

---
title: uMap realtime sync #4
tags: umap, geojson, websockets
slug: adding-collaboration-on-umap-fourth-update
---
The main branch of uMap now ships a web socket server, enabling
local changes to be replicated to other peers.
Here is a short video capturing how the import of some data can be synced between
two browsers.
<video width="100%" controls src="https://files.notmyidea.org/choropleth-sync.webm">
</video>
It's pretty exciting, but the feature is not complete yet, and it's still barely usable.
Over the past few months, we made the following changes to the code:
- We [replaced the datalayers ids with uuids](https://github.com/umap-project/umap/pull/1630), making it possible to have them generated by the clients in the long term.
- We [assigned semi-unique IDs to each map "feature"](https://github.com/umap-project/umap/pull/1649), to be able to refer to them consistently (without these IDs, peers cannot know they're talking about the same object).
- We changed how the data flows, by [separating data updates from UI-rendering](https://github.com/umap-project/umap/pull/1692). With this change, we introduced a way to regenerate only the required parts when changing a value.
- Lastly, we added [a websocket server, making it possible to apply remote data updates locally](https://github.com/umap-project/umap/pull/1754).
## The roadmap
Our roadmap for having "real-time" collaboration currently looks like this:
1. ☑︎ Learn about CRDTs: what they are and how they can be useful;
2. ☑︎ Make structural changes to the codebase;
3. ☑︎ Allow a replication of a peer state to another connected peer;
4. **Bringing peers to the same page**: that's where we are.
5. **Interface changes**: we have some mockups to integrate, making it obvious
that other peers are connected and interacting on the map, and we need to
handle web socket disconnects.
6. **Security**: making peers communicate with each other enables new data
flows, and with them new "attack vectors". We want to take the time to think
this through, and address some of the problems we already foresee around
handling permissions, and escalation through a peer with greater privileges.
7. **Scaling things up**: uMap serves quite a large number of maps,
and we will probably need to make sure our changes are able to scale. This will
potentially require some structural changes on the web socket server, because
not everything might fit in memory anymore.
## Bringing peers to the same page
The current code of uMap allows for syncing **already connected** clients.
You only get operations happening *while you're connected*, meaning
you miss the changes that happened before you joined, or while you were
disconnected (think flaky connections).
Basically, we'll need to:
- Assign each operation an `id` and store them locally;
- Find a way for the peers to ask another peer about the missing changes since a specific date;
- Get these changes and reapply them locally.
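
To make these three steps more concrete, here is a minimal sketch, in Python for brevity. It is not uMap's code: `Operation`, `OperationLog`, `since` and `catch_up` are hypothetical names, and the `id` is assumed to be a sortable, unique `(timestamp, peer_id)` pair.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Operation:
    id: tuple   # assumed sortable and unique, here (timestamp, peer_id)
    key: str    # what is being changed (hypothetical addressing scheme)
    value: Any  # the new value


class OperationLog:
    def __init__(self):
        self._ops = {}  # id -> Operation

    def add(self, op):
        """Store an operation; return False if it was already known."""
        if op.id in self._ops:
            return False
        self._ops[op.id] = op
        return True

    def since(self, op_id):
        """Operations that happened strictly after `op_id`, oldest first."""
        newer = (op for op in self._ops.values() if op.id > op_id)
        return sorted(newer, key=lambda op: op.id)


def catch_up(local, remote, last_seen):
    """Ask `remote` for what happened after `last_seen`, store what is new.

    Returns the operations that still have to be reapplied locally.
    """
    return [op for op in remote.since(last_seen) if local.add(op)]
```

Here the remote log is a local object; in practice the call to `since` would be a message sent to another peer.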
### Storing operations locally, using HLCs
We already have `operation` messages sent over our transport protocol. We will need to
store them locally. Each operation will be tied to a particular "time".
We will be using Hybrid Logical Clocks (HLCs) for this, as they offer us some nice
properties on distributed systems. I won't go into too much detail about them, but
you can just consider that the time is unique and it's possible for all peers to
agree on which operation came before another one.
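
For the curious, here is roughly what a Hybrid Logical Clock looks like. This is a sketch of the textbook algorithm in Python, not the implementation uMap will ship: the `Timestamp` layout, the `peer_id` tie-breaker and the injected `now_ms` callable are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Timestamp:
    # Field order matters: it defines how timestamps are compared.
    wall_ms: int  # physical time, in milliseconds
    counter: int  # logical counter, breaks ties within the same millisecond
    peer_id: str  # breaks ties between peers, so that timestamps are unique


class HybridLogicalClock:
    def __init__(self, peer_id, now_ms):
        self.peer_id = peer_id
        self.now_ms = now_ms  # callable returning the local wall clock in ms
        self.wall_ms = 0
        self.counter = 0

    def tick(self):
        """Return a timestamp for a local event (e.g. a new operation)."""
        physical = self.now_ms()
        if physical > self.wall_ms:
            self.wall_ms, self.counter = physical, 0
        else:
            # The wall clock did not advance (or went backwards).
            self.counter += 1
        return Timestamp(self.wall_ms, self.counter, self.peer_id)

    def receive(self, remote):
        """Merge a remote timestamp; return a local one that sorts after it."""
        latest = max(self.now_ms(), self.wall_ms, remote.wall_ms)
        if latest == self.wall_ms and latest == remote.wall_ms:
            self.counter = max(self.counter, remote.counter) + 1
        elif latest == self.wall_ms:
            self.counter += 1
        elif latest == remote.wall_ms:
            self.counter = remote.counter + 1
        else:
            self.counter = 0
        self.wall_ms = latest
        return Timestamp(self.wall_ms, self.counter, self.peer_id)


# Bob's wall clock is late, but his timestamp still sorts after Alice's.
alice = HybridLogicalClock("alice", now_ms=lambda: 1000)
bob = HybridLogicalClock("bob", now_ms=lambda: 900)
sent = alice.tick()
received = bob.receive(sent)
assert sent < received
```

Because timestamps compare as `(wall_ms, counter, peer_id)`, two peers can never produce the same one, and sorting them gives every peer the same order.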
The following operations will be required:
- Receive remote operations and decide if we should apply the updates.
- Compact the data: sometimes, the same value gets updated multiple times, and
we might not want to send useless info.
- Keep old values: they might prove useful to implement an "undo" feature.
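
One possible shape for these operations, assuming a simple "last writer wins" rule per key (the rule, the `LastWriterWinsStore` class and the `(timestamp, key, value)` tuples are assumptions for the sake of the example, not decisions we have made):

```python
class LastWriterWinsStore:
    def __init__(self):
        self.history = {}  # key -> [(timestamp, value), ...], oldest first

    def apply(self, timestamp, key, value):
        """Record an operation; return True if it becomes the current value."""
        entries = self.history.setdefault(key, [])
        if any(ts == timestamp for ts, _ in entries):
            return False  # already received, nothing to do
        entries.append((timestamp, value))
        entries.sort(key=lambda entry: entry[0])
        # An operation older than what we have is kept in the history,
        # but it must not overwrite the newer value.
        return entries[-1][0] == timestamp

    def current(self, key):
        return self.history[key][-1][1]

    def previous(self, key):
        """The value before the current one, a starting point for "undo"."""
        entries = self.history[key]
        return entries[-2][1] if len(entries) > 1 else None


def compact(operations):
    """Keep only the most recent (timestamp, key, value) for each key."""
    latest = {}
    for timestamp, key, value in operations:
        if key not in latest or timestamp > latest[key][0]:
            latest[key] = (timestamp, key, value)
    return sorted(latest.values(), key=lambda op: op[0])
```

Timestamps are plain values that can be compared here; anything unique and sortable would do.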
### Communicating with other peers
At the moment, the server broadcasts all messages (to all the connected peers).
We will need a way to send messages only to a specific peer.
This will be useful to retrieve information known only by the peers, such as…
the list of operations :-)
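
A rough sketch of what the routing could look like on the server side. Everything here is hypothetical: the `Router` class, the `recipient` and `sender` fields, and the choice of not echoing a broadcast back to its sender are assumptions, and each peer is represented by a plain async `send` callable rather than a real web socket connection.

```python
import json


class Router:
    def __init__(self):
        self.peers = {}  # peer_id -> async callable sending text to that peer

    def register(self, peer_id, send):
        self.peers[peer_id] = send

    def unregister(self, peer_id):
        self.peers.pop(peer_id, None)

    async def dispatch(self, sender_id, raw):
        """Forward a message to one peer if it names a recipient, else to all.

        Returns the list of peer ids the message was sent to.
        """
        message = json.loads(raw)
        recipient = message.get("recipient")
        if recipient is None:
            # Broadcast (assumption: the sender does not get its own message).
            targets = [pid for pid in self.peers if pid != sender_id]
        elif recipient in self.peers:
            targets = [recipient]
        else:
            targets = []  # unknown or disconnected peer
        # Tag the message so the recipient knows whom to answer.
        payload = json.dumps({**message, "sender": sender_id})
        for pid in targets:
            await self.peers[pid](payload)
        return targets
```

With something like this, a peer that just joined could send a "list your operations" request to one specific peer and get the answer back via the `sender` field.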
Let's get back to it, we're almost there!