blog.notmyidea.org/content/code/2024-06-20-umap-4.md
2024-07-12 02:18:10 +02:00

title: uMap realtime sync
tags: umap, geojson, websockets
slug: adding-collaboration-on-umap-fourth-update

The main branch of uMap now ships a web socket server, enabling local changes to be replicated to other peers.

Here is a short video showing how an import of some data gets synced between two browsers.

It's pretty exciting, but the feature is not complete yet, and it's still barely usable.

Over the past few months, we made the following changes to the code:

The roadmap

Our roadmap for having "real-time" collaboration currently looks like this:

  1. ☑︎ Learn about CRDTs: what they are and how they can be useful;
  2. ☑︎ Make structural changes to the codebase;
  3. ☑︎ Allow a replication of a peer state to another connected peer;
  4. Bring peers to the same page: that's where we are.
  5. Interface changes: we have some mockups to integrate, making it obvious other peers are connected and interacting on the map, and handle web socket disconnects.
  6. Security: making peers communicate with each other enables new data flows, and with them new "attack vectors". We want to take the time to think this through, and cover some of the problems we are already envisioning around permission handling and privilege escalation through a peer with greater privileges.
  7. Scaling things up: uMap serves quite a large number of maps, and we will probably need to make sure our changes are able to scale. This will potentially require some structural changes on the web socket server, because not everything might fit in memory anymore.

Bringing peers to the same page

The current code of uMap allows for syncing already connected clients.

You only get the operations that happen while you're connected, so you miss the changes made before you joined or while you were disconnected (think flaky connections).

Basically, we'll need to:

  • Assign each operation an id and store them locally;
  • Find a way for a peer to ask another peer for the changes it has missed since a specific date;
  • Get these changes and reapply them locally.
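
To make these three steps a bit more concrete, here is a minimal sketch in Python. It is not uMap code (the client side of uMap is JavaScript), and every name in it (`Operation`, `Peer`, `operations_since`, `catch_up`) is hypothetical. It also assumes a plain integer id that grows over time as a stand-in for a real timestamp.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Operation:
    id: int        # assumption: a growing integer standing in for a real timestamp
    key: str       # hypothetical: what the operation changes
    value: object  # the new value


@dataclass
class Peer:
    log: list = field(default_factory=list)    # operations stored locally
    state: dict = field(default_factory=dict)  # current local state

    def apply(self, op: Operation) -> None:
        """Store an operation and apply it, skipping the ones already seen."""
        if any(known.id == op.id for known in self.log):
            return
        self.log.append(op)
        self.log.sort(key=lambda o: o.id)
        # Replay the log in id order, so the highest id wins for each key.
        self.state = {}
        for known in self.log:
            self.state[known.key] = known.value

    def last_seen(self) -> int:
        """Id of the most recent operation we know about (0 if none)."""
        return self.log[-1].id if self.log else 0

    def operations_since(self, op_id: int) -> list:
        """Answer another peer asking for what it has missed."""
        return [op for op in self.log if op.id > op_id]


def catch_up(late: Peer, other: Peer) -> None:
    """Ask `other` for the missing operations and reapply them locally."""
    for op in other.operations_since(late.last_seen()):
        late.apply(op)
```

In this sketch the two peers call each other directly. In practice the request and the answer would travel over the web socket connection.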

Storing operations locally, using HLCs

We already have operation messages sent over our transport protocol. We will need to store them locally. Each operation will be tied to a particular "time".

We will be using Hybrid Logical Clocks (HLCs) for this, as they offer some nice properties for distributed systems. I won't go into too much detail about them, but you can consider that each time is unique and that all peers can agree on which operation came before another one.
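
For the curious, here is a small Python sketch of how a Hybrid Logical Clock can work. It follows the usual HLC update rules, plus a peer id to break ties, which is an assumption made here to keep timestamps unique. The names (`Timestamp`, `HybridLogicalClock`, `tick`, `receive`) are hypothetical, and this is not the implementation used in uMap.

```python
import time
from dataclasses import dataclass
from typing import Callable


def wall_clock_ms() -> int:
    """Physical time, in milliseconds."""
    return time.time_ns() // 1_000_000


@dataclass(frozen=True, order=True)
class Timestamp:
    # Field order matters: timestamps compare as (wall, counter, node).
    wall: int     # physical time, in milliseconds
    counter: int  # logical counter, breaks ties within the same millisecond
    node: str     # peer id, breaks ties between peers (assumption)


class HybridLogicalClock:
    def __init__(self, node: str, now: Callable[[], int] = wall_clock_ms):
        self.node = node
        self.now = now
        self.wall = 0
        self.counter = 0

    def tick(self) -> Timestamp:
        """Return the timestamp for a new local operation."""
        physical = self.now()
        if physical > self.wall:
            self.wall, self.counter = physical, 0
        else:
            # The physical clock did not move forward: bump the counter.
            self.counter += 1
        return Timestamp(self.wall, self.counter, self.node)

    def receive(self, remote: Timestamp) -> Timestamp:
        """Merge a remote timestamp, so later local ones sort after it."""
        wall = max(self.wall, remote.wall, self.now())
        if wall == self.wall == remote.wall:
            counter = max(self.counter, remote.counter) + 1
        elif wall == self.wall:
            counter = self.counter + 1
        elif wall == remote.wall:
            counter = remote.counter + 1
        else:
            counter = 0
        self.wall, self.counter = wall, counter
        return Timestamp(wall, counter, self.node)
```

Even if the physical clock of one peer is lagging behind, the timestamps it produces after receiving a message sort after that message, which is the property we care about here.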

The following operations will be required:

  • Receive remote operations and decide if we should apply the updates.
  • Compact the data: sometimes, the same value gets updated multiple times, and we might not want to send useless info.
  • Keep old values: this might prove useful for implementing an "undo" feature.
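
Here is a rough Python sketch of what such a local store could look like. It assumes a "last writer wins" rule per key, which is one possible way to decide whether to apply an update and not necessarily what uMap will do. The names (`Operation`, `OperationStore`, `receive`, `compacted`, `previous`) are hypothetical, and timestamps are plain tuples standing in for HLC values.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Operation:
    time: tuple  # stand-in for an HLC timestamp; it only needs to be orderable
    key: str     # hypothetical: identifies the property being updated
    value: Any


@dataclass
class OperationStore:
    # key -> every operation seen for that key, oldest first
    history: dict = field(default_factory=dict)

    def receive(self, op: Operation) -> bool:
        """Store an operation. Return True if it should be applied locally."""
        ops = self.history.setdefault(op.key, [])
        if op in ops:
            return False  # already known, nothing to do
        # Last writer wins: only apply if newer than everything we have.
        is_newest = not ops or op.time > ops[-1].time
        ops.append(op)
        ops.sort(key=lambda o: o.time)
        return is_newest

    def value(self, key: str) -> Any:
        """Current value for a key."""
        return self.history[key][-1].value

    def compacted(self) -> list:
        """Only the latest operation per key, to avoid sending useless info."""
        latest = [ops[-1] for ops in self.history.values() if ops]
        return sorted(latest, key=lambda o: o.time)

    def previous(self, key: str):
        """The operation before the current one, which an undo could restore."""
        ops = self.history.get(key, [])
        return ops[-2] if len(ops) > 1 else None
```

Older operations are kept in `history` even when they lose against a newer one, which is what makes both compaction and a possible "undo" cheap to add later.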

Communicating with other peers

At the moment, the server broadcasts all messages (to all the connected peers). We will need a way to send messages only to a specific peer.

This will be useful to retrieve information known only by the peers, such as… the list of operations :-)
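
As an illustration, here is a minimal Python sketch of a routing function that still broadcasts by default, but delivers a message to a single peer when it names one. It leaves out the actual web socket layer: each peer is represented by a `send` callable. The `Router` class and the `sender` and `recipient` message fields are assumptions made for this example, not the uMap protocol.

```python
import json
from typing import Callable


class Router:
    def __init__(self) -> None:
        # peer id -> function sending a text message to that peer
        self.peers: dict[str, Callable[[str], None]] = {}

    def connect(self, peer_id: str, send: Callable[[str], None]) -> None:
        self.peers[peer_id] = send

    def disconnect(self, peer_id: str) -> None:
        self.peers.pop(peer_id, None)

    def route(self, sender: str, raw: str) -> None:
        """Deliver a message to one peer if it names one, else broadcast it."""
        message = json.loads(raw)
        # The server sets the sender itself, so recipients know whom to answer.
        message["sender"] = sender
        recipient = message.get("recipient")
        payload = json.dumps(message)
        if recipient is None:
            # Broadcast to everyone except the sender.
            for peer_id, send in self.peers.items():
                if peer_id != sender:
                    send(payload)
        elif recipient in self.peers:
            self.peers[recipient](payload)
        # Messages for unknown or disconnected peers are dropped.
```

With something like this, a peer that just joined could send a request for the list of operations to one specific peer, and get the answer back without the other peers being involved.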

Let's get back to it, we're almost there!