blog.notmyidea.org/content/code/2024-02-12-umap3.md
2024-07-12 02:18:10 +02:00


---
title: uMap realtime sync #3
slug: adding-collaboration-on-umap-third-update
tags: umap, geojson, websockets
---
I've spent the last few weeks working on [uMap](https://umap-project.org), still
with the goal of bringing real-time collaboration to the maps. I'm not there
yet, but I've made some progress that I will relate here.
## JavaScript modules
uMap has been around [since 2012](https://github.com/umap-project/umap/commit/0cce7f9e2a19c83fa76645d7773d39d54f357c43), at a time
when ES6 [wasn't out yet](https://fr.wikipedia.org/wiki/ECMAScript).
Back then, it wasn't possible to use JavaScript modules or modern JavaScript
syntax. The project kept these constraints for a long time, in order to support
people with old browsers. But as time goes on, we now have access to more browser features,
and it's now possible to use modules!

The team has been working hard on bringing modules to the mix. It
wasn't a piece of cake, but the result is here: we're [now able to use modern
JavaScript modules](https://github.com/umap-project/umap/pull/1463/files) and we
are now more confident [about which browser features we can and cannot
use](https://github.com/umap-project/umap/commit/65f1cdd6b4569657ef5e219d9b377fec85c41958).

---

I then spent some time trying to integrate existing CRDTs like
Automerge and YJS in our project. Unfortunately, both libraries expect us to
use a bundler, which we currently don't.

uMap is plain old JavaScript, and as such is not using React or any other framework. The way
I see this is that it makes it possible to have something "close to the
metal" (if that means anything when it comes to web development).
As a result, we're not tied to the development pace of these frameworks, and have more
control over what we do (read "it's easier to debug").

So, after making tweaks and learning how "modules", "requires" and "bundling"
work, I ultimately decided to take a break from this path, to work on the
wiring with uMap. After all, CRDTs might not even be the way forward for us.
## Internals
After some time with my head under water, I'm now able to better
understand the big picture, and I'm not getting lost in the details like I was at first.
Let me try to summarize what I've learned.

uMap appears to be doing a lot of different things, but in the end it's:
- Using [Leaflet.js](https://leafletjs.com/) to render *features* on the map;
- Using [Leaflet Editable](https://github.com/Leaflet/Leaflet.Editable) to edit
complex shapes, like polylines and polygons, and to draw markers;
- Using the [Formbuilder](https://github.com/yohanboniface/Leaflet.FormBuilder)
to expose a way for the users to edit the features, and the data of the map;
- Serializing the layers to and from [GeoJSON](https://geojson.org/). That's
what's being sent to and received from the server;
- Providing different layer types (marker cluster, choropleth, etc.) to display
the data in different ways.
### Naming matters
There is some naming overlap between the different projects we're using, and
it's important to have these small clarifications in mind:
#### Leaflet layers and uMap features
**In Leaflet, everything is a layer**. What we call *features* in GeoJSON are
Leaflet layers, and even a (uMap) layer is a layer. We need to be extra careful
about what our inputs and outputs are in this context.

We actually have different layer concepts: the *datalayer* and the different
kinds of layers (choropleth, marker cluster, etc.). A datalayer is (as you can
guess) where the data is stored. It's what uMap serializes. It contains the
features (with their properties). But here's the catch: these features are named
*layers* by Leaflet.
#### GeoJSON and Leaflet
We're using GeoJSON to share data with the server, but we're using Leaflet
internally. And these two have different ways of naming things.

The different geometries are named differently (a Leaflet `Marker` is a GeoJSON
`Point`), and their coordinates are stored differently: Leaflet stores `lat,
long` where GeoJSON stores `long, lat`. Not a big deal, but it's a good thing
to know.

Leaflet stores data in `options`, where GeoJSON stores it in `properties`.
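
To make these differences concrete, here is a minimal sketch that turns a GeoJSON `Point` feature into a plain description of a marker. It is written without Leaflet so that it runs on its own, and the function names are made up for the example (Leaflet itself ships `L.GeoJSON.coordsToLatLng` and `L.GeoJSON.latLngToCoords` for the coordinate part).

```javascript
// Hypothetical helpers, written without Leaflet so the snippet is standalone.

// GeoJSON positions are [longitude, latitude]; Leaflet expects lat first.
function coordsToLatLng(coords) {
  const [lng, lat] = coords
  return { lat, lng }
}

// The reverse conversion, back to a GeoJSON position.
function latLngToCoords(latlng) {
  return [latlng.lng, latlng.lat]
}

// Describe a marker from a GeoJSON Point feature: the geometry becomes a
// latlng, and (following the naming above) `properties` become `options`.
function pointFeatureToMarker(feature) {
  if (feature.geometry.type !== 'Point') {
    throw new Error(`Expected a Point, got ${feature.geometry.type}`)
  }
  return {
    latlng: coordsToLatLng(feature.geometry.coordinates),
    options: { ...feature.properties },
  }
}

const feature = {
  type: 'Feature',
  geometry: { type: 'Point', coordinates: [2.3522, 48.8566] },
  properties: { name: 'Paris' },
}
const marker = pointFeatureToMarker(feature)
console.log(marker.latlng, marker.options)
```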
### This is not reactive programming
I was expecting the frontend to be organised similarly to Elm apps (or React
apps): a global state and a data flow ([à la Redux](https://react-redux.js.org/introduction/getting-started)), with events changing the data, which in turn triggers
a rerendering of the interface.

Things work differently for us: different components can write to the map, and
get updated without going through a central place. It's just a different paradigm.
## A syncing proof of concept
With that in mind, I started thinking about a simple way to implement syncing.
I left aside all the thinking about how this would relate to CRDTs. It can
be useful, but later. For now, I "just" want to synchronize two maps. I want a
proof of concept to make informed decisions.
### Syncing map properties
I started syncing map properties. Things like the name of the map, the default
color and type of the marker, the description, the default zoom level, etc.
All of these are handled by "the formbuilder". You pass it an object, a list of
properties and a callback to call when an update happens, and it builds the
form inputs for you.

Taken from the documentation (and simplified):
```js
var tilelayerFields = [
  ['name', { handler: 'BlurInput', placeholder: 'display name' }],
  ['maxZoom', { handler: 'BlurIntInput', placeholder: 'max zoom' }],
  ['minZoom', { handler: 'BlurIntInput', placeholder: 'min zoom' }],
  ['attribution', { handler: 'BlurInput', placeholder: 'attribution' }],
  ['tms', { handler: 'CheckBox', helpText: 'TMS format' }]
];
var builder = new L.FormBuilder(myObject, tilelayerFields, {
  callback: myCallback,
  callbackContext: this
});
```
In uMap, the formbuilder is used for every form you see on the right panel. Map
properties are stored in the `map` object.
We want two different clients to work together. When one changes the value of a
property, the other client needs to receive the change and update its interface.

I've started by creating a mapping of property names to rerender methods, and
added a method `renderProperties(properties)` which updates the interface,
depending on the properties passed to it.

We now have two important things:

1. Some code getting called each time a property is changed;
2. A way to refresh the right interface when a property is changed.

In other words, from one client we can send the message to the other client,
which will be able to rerender itself.

Looks like a plan.
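
As an illustration of that mapping, here is a minimal sketch. Only `renderProperties` comes from the actual work; the `FakeMap` class, the property names and the render method names are invented for the example.

```javascript
// A stand-in for the map object. The mapping and the render* methods are
// hypothetical; only the renderProperties() name comes from the text above.
class FakeMap {
  constructor() {
    this.properties = { name: 'Untitled', color: 'blue', zoom: 6 }
    this.rendered = [] // records which render methods ran, for the demo
    this.propertyRenderers = {
      name: ['renderTitle'],
      color: ['renderFeatures'],
      zoom: ['renderControls'],
    }
  }

  renderTitle() { this.rendered.push('title') }
  renderFeatures() { this.rendered.push('features') }
  renderControls() { this.rendered.push('controls') }

  // Rerender only what depends on the given properties, each method once.
  // Accepts a single property name or a list of names.
  renderProperties(properties) {
    const names = Array.isArray(properties) ? properties : [properties]
    const methods = new Set()
    for (const name of names) {
      for (const method of this.propertyRenderers[name] ?? []) {
        methods.add(method)
      }
    }
    for (const method of methods) this[method]()
  }
}

const map = new FakeMap()
map.renderProperties(['name', 'color'])
console.log(map.rendered)
```

Unknown property names are simply ignored here, which keeps a client from failing when it receives a property it doesn't know how to render.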
## Websockets
We need a way for the data to go from one side to the other. The easiest
way is probably websockets.

A basic way to do this on the server side is to use Python's
[websockets](https://websockets.readthedocs.io/) library.
Here is some simple code which relays the messages received from one websocket
to the other connected clients. It's not the final code, it's just for demo purposes.
```python
import asyncio
import json

import websockets
from websockets.server import serve

# Just relay all messages to other connected peers for now
CONNECTIONS = set()


async def join_and_listen(websocket):
    CONNECTIONS.add(websocket)
    try:
        async for message in websocket:
            # Recompute the list of peers at the time of sending.
            # Doing so beforehand would miss new connections.
            peers = CONNECTIONS - {websocket}
            websockets.broadcast(peers, message)
    finally:
        CONNECTIONS.remove(websocket)


async def handler(websocket):
    message = await websocket.recv()
    event = json.loads(message)

    # The first event should always be 'join'
    assert event["kind"] == "join"
    await join_and_listen(websocket)


async def main():
    async with serve(handler, "localhost", 8001):
        await asyncio.Future()  # run forever


asyncio.run(main())
```
On the client side, it's fairly easy as well. I won't even cover it here.
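
For reference, a minimal client could look roughly like the sketch below. The `createTransport` helper and the shape of the `update` message are assumptions made for the example; only the initial `join` event comes from the server code above. The socket is passed in, so that a fake one can stand in for the browser's `WebSocket` when no server is running.

```javascript
// Minimal client-side transport sketch. `socket` is anything exposing the
// browser WebSocket interface used here (send, onopen, onmessage).
function createTransport(socket, onMessage) {
  socket.onopen = () => {
    // The relay server expects the first message to be a "join" event.
    socket.send(JSON.stringify({ kind: 'join' }))
  }
  socket.onmessage = (event) => {
    onMessage(JSON.parse(event.data))
  }
  return {
    send(message) {
      socket.send(JSON.stringify(message))
    },
  }
}

// In a browser, the socket would be: new WebSocket('ws://localhost:8001')
// A fake socket is used here so the snippet runs without a server.
const fakeSocket = {
  sent: [],
  send(data) { this.sent.push(data) },
}
const received = []
const transport = createTransport(fakeSocket, (message) => received.push(message))

fakeSocket.onopen() // simulate the connection opening
transport.send({ verb: 'update', subject: 'map', key: 'name', value: 'My map' })
// Simulate a message relayed from another client.
fakeSocket.onmessage({ data: '{"verb":"update","subject":"map","key":"name","value":"Other"}' })
console.log(fakeSocket.sent, received)
```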
We now have a way to send data from one client to the other.
Let's consider the actions we do as "verbs". For now, we're just updating
properties values, so we just need the `update` verb.
## Code architecture
We need different parts:
- the **transport**, which connects to the websockets, and sends and receives messages;
- the **message sender**, which relays local messages to the other party;
- the **message receiver**, which is called each time we receive a message;
- the **sync engine**, which glues everything together;
- different **updaters**, which know how to apply received messages, the goal being
to update the interface in the end.

When a message is received, it is routed to the correct updater, which
knows what to do with it.

In our case, it's fairly simple: when updating the `name` property, we send a
message with `name` and `value`. We also need to send along some additional
info: the `subject`.
In our case, it's `map` because we're updating map properties.
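
To give an idea of what goes over the wire, here is a rough sketch of the sending side. The message layout (`verb`, `subject`, `key`, `value`) and the constructor taking a transport are assumptions made for the example, not the actual uMap code.

```javascript
// Sketch of the sending side. The message field names are hypothetical;
// the real format may differ.
class SyncEngine {
  constructor(transport) {
    this.transport = transport
  }

  // "update" is the only verb needed for now.
  update(subject, key, value) {
    this.transport.send({ verb: 'update', subject, key, value })
  }
}

// A stand-in transport that records what would be sent over the websocket.
const outbox = []
const engine = new SyncEngine({ send: (message) => outbox.push(message) })

engine.update('map', 'name', 'My new map name')
console.log(outbox[0])
```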
When initializing the `map`, we're initializing the `SyncEngine`, like this:
```js
// Inside the map, at initialization time
let syncEngine = new umap.SyncEngine(this)

// Later, in a different scope, when we need to send data to the other party
let syncEngine = this.obj.getSyncEngine()
let subject = this.obj.getSyncSubject()
syncEngine.update(subject, field, value)
```
The code on the other side of the wire is simple enough: when you receive the
message, change the data and rerender the properties:
```js
this.updateObjectValue(this.map, key, value)
this.map.renderProperties(key)
```
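
Put together with the routing described earlier, the receiving side could be sketched as follows. `MapUpdater` and `MessageReceiver` are hypothetical names, the message layout (`verb`, `subject`, `key`, `value`) is an assumption, and the map is replaced by a small fake object.

```javascript
// Applies "update" messages targeting the map (hypothetical class).
class MapUpdater {
  constructor(map) {
    this.map = map
  }

  updateObjectValue(obj, key, value) {
    obj.properties[key] = value
  }

  update({ key, value }) {
    this.updateObjectValue(this.map, key, value)
    this.map.renderProperties(key)
  }
}

// Routes each incoming message to the updater matching its subject.
class MessageReceiver {
  constructor(updaters) {
    this.updaters = updaters
  }

  receive(message) {
    const updater = this.updaters[message.subject]
    if (!updater) throw new Error(`Unknown subject: ${message.subject}`)
    // Only accept the verbs we know about.
    if (message.verb !== 'update') throw new Error(`Unknown verb: ${message.verb}`)
    updater.update(message)
  }
}

// A fake map that records which properties were rerendered.
const fakeMap = {
  properties: { name: 'Old name' },
  rerendered: [],
  renderProperties(key) { this.rerendered.push(key) },
}
const receiver = new MessageReceiver({ map: new MapUpdater(fakeMap) })
receiver.receive({ verb: 'update', subject: 'map', key: 'name', value: 'New name' })
console.log(fakeMap.properties.name, fakeMap.rerendered)
```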
## Syncing features
At this stage I was able to sync the properties of the map. A
small victory, but not the end of the trip.
The next step was to add syncing for features: markers, polygons and polylines,
alongside their properties.

All of these features have a uMap class representation (which extends Leaflet's
ones). All of them share some code in the `FeatureMixin` class.
That seems like a good place to make the changes.

I made a few changes:

- Each feature now has an identifier, so clients know they're talking about the
same thing. This identifier is also stored in the database when saved.
- I've added an `upsert` verb, because we don't have any way (from the
interface) to make a distinction between the creation of a new feature and
its modification. The way we intercept the creation of a feature (or its
update) is to use Leaflet Editable's `editable:drawing:commit` event. We just
have to listen to it and then send the appropriate messages!
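
The idea behind `upsert` can be sketched with a small store keyed by feature identifier. The `FeatureStore` class is made up for the example, and it assumes the identifier lives in the top-level `id` member of the GeoJSON feature, which may not be where uMap actually puts it.

```javascript
// Hypothetical in-memory store of GeoJSON features, keyed by identifier.
class FeatureStore {
  constructor() {
    this.features = new Map()
  }

  // Create the feature if its id is unknown, replace it otherwise.
  // Returns what happened, which is handy for logging or debugging.
  upsert(feature) {
    if (feature.id === undefined) {
      throw new Error('Cannot upsert a feature without an id')
    }
    const existed = this.features.has(feature.id)
    this.features.set(feature.id, feature)
    return existed ? 'updated' : 'created'
  }
}

const store = new FeatureStore()

// What a client could send after a feature is drawn, then after it is moved.
const drawn = {
  type: 'Feature',
  id: 'a1b2c3',
  geometry: { type: 'Point', coordinates: [2.35, 48.85] },
  properties: { name: 'Here' },
}
const moved = { ...drawn, geometry: { type: 'Point', coordinates: [2.36, 48.86] } }

console.log(store.upsert(drawn))
console.log(store.upsert(moved))
```

Because the receiving side does not need to know whether the feature already exists, the sender can emit the same message on every `editable:drawing:commit`.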

After some fiddling around (ah, everybody wants to create a new protocol!) I
went with reusing GeoJSON. It allowed me to have a better understanding of how
Leaflet is using latlngs. That's a multi-dimensional array, whose nesting depth
depends on the type of geometry and the number of shapes in each of
these.

Clearly not something I want to redo, so I'm now reusing some Leaflet code, which handles this serialization for me.
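
To show what this nesting looks like, here is a standalone sketch of the conversion from GeoJSON coordinates to latlngs. It is loosely modelled on what Leaflet's `L.GeoJSON.coordsToLatLngs` does, but it is a simplified rewrite for illustration, not Leaflet's code; the `NESTING` table and `geometryToLatLngs` are names invented here.

```javascript
// Convert GeoJSON coordinates to {lat, lng} objects, recursing through
// `levelsDeep` levels of nesting (0 means a flat list of positions).
function coordsToLatLngs(coords, levelsDeep = 0) {
  return coords.map((item) =>
    levelsDeep > 0
      ? coordsToLatLngs(item, levelsDeep - 1)
      : { lat: item[1], lng: item[0] }
  )
}

// How deeply the list of positions is nested for each geometry type.
const NESTING = {
  LineString: 0,
  MultiLineString: 1,
  Polygon: 1,
  MultiPolygon: 2,
}

function geometryToLatLngs(geometry) {
  if (geometry.type === 'Point') {
    const [lng, lat] = geometry.coordinates
    return { lat, lng }
  }
  const levelsDeep = NESTING[geometry.type]
  if (levelsDeep === undefined) {
    throw new Error(`Unsupported geometry: ${geometry.type}`)
  }
  return coordsToLatLngs(geometry.coordinates, levelsDeep)
}

const line = { type: 'LineString', coordinates: [[0, 1], [2, 3]] }
const polygon = {
  type: 'Polygon',
  coordinates: [[[0, 0], [4, 0], [4, 4], [0, 0]]],
}
console.log(geometryToLatLngs(line))
console.log(JSON.stringify(geometryToLatLngs(polygon)))
```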
I'm now able to sync different types of features with their properties.
<video controls width="80%">
<source src="/images/umap/sync-features.webm" type="video/webm">
</video>
Point properties are also editable, using the already-existing table editor. I
was expecting this to require some work, but it just works without any further
changes.
## What's next?
I'm able to sync map properties, features and their properties, but I'm not
yet syncing layers. That's the next step! I also plan to make some pull
requests with the interesting bits I'm sure will go in the final
implementation:
- Adding ids to features, so we have a way to refer to them.
- Having a way to map properties to the parts of the interface they rerender, the `renderProperties` bits.

Once this demo is working, I'll probably spend some time updating it with the latest changes (uMap is moving a lot these weeks).
I will probably focus on how to integrate websockets on the server side, and will then see how to leverage (maybe) some magic from CRDTs, if we need it.

See you for the next update!