Adding Real-Time Collaboration to uMap, second week

A heads-up on what I've been doing this week on uMap

I continued working on uMap, an open-source map-making tool to create and share customizable maps, based on OpenStreetMap data.

Here is a summary of what I did:

The optimistic-merge approach

There was an open pull request implementing an “optimistic merge”. Yohan and I spent some time together to understand what the pull request is doing, discuss it, and make a few changes.

Here’s the logic of the changes:

  1. On the server side, we detect whether there is a conflict between the incoming changes and what’s stored on the server (is the last document save fresher than the If-Unmodified-Since header we get?);
  2. In case of conflict, we look up the reference document in the history (let’s name this the “local reference”);
  3. We merge the 3 documents together, that is:
     a. Find what the incoming changes are, by comparing the incoming doc to the local reference;
     b. Re-apply these changes on top of the latest doc.
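
The conflict check of the first step can be sketched in a few lines. This is only an illustration, not the code of the pull request: the `has_conflict` name and the dates are made up for the example, and it assumes the server knows the time of the last save as a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def has_conflict(last_saved: datetime, if_unmodified_since: str) -> bool:
    """Return True when the stored document is fresher than the client's copy."""
    client_reference = parsedate_to_datetime(if_unmodified_since)
    # HTTP dates have a one-second resolution, so drop microseconds before comparing.
    return last_saved.replace(microsecond=0) > client_reference


# Hypothetical scenario: the client loaded the document at 10:00,
# and someone else saved a new version at 10:05.
loaded_at = datetime(2023, 11, 20, 10, 0, tzinfo=timezone.utc)
header = format_datetime(loaded_at, usegmt=True)

print(has_conflict(datetime(2023, 11, 20, 10, 5, tzinfo=timezone.utc), header))
print(has_conflict(loaded_at, header))
```

Only when this check reports a conflict does the server need to go through the history and attempt a merge; otherwise the incoming document can be saved as is.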

One could compare this logic to what happens when you do a git rebase. Here is some pseudo-code:

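from copy import copy


class ConflictError(Exception):
    """Raised when latest and incoming changed the same item."""


def get_difference(reference: list, incoming: list) -> tuple[list, list]:
    """Assumed helper (not shown in the pull request): items removed from reference, items added by incoming."""
    removed = [item for item in reference if item not in incoming]
    added = [item for item in incoming if item not in reference]
    return removed, added


def merge_features(reference: list, latest: list, incoming: list) -> list:
    """Finds the changes between reference and incoming, and reapplies them on top of latest."""
    if latest == incoming:
        return latest

    reference_removed, incoming_added = get_difference(reference, incoming)

    # Ensure that items changed by the incoming doc weren't also changed in the latest.
    for removed in reference_removed:
        if removed not in latest:
            raise ConflictError

    merged = copy(latest)
    # Reapply the changes on top of the latest.
    for removed in reference_removed:
        merged.remove(removed)

    for added in incoming_added:
        merged.append(added)

    return merged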

The pull request is not ready yet, as I still want to add tests with real data and improve the naming, but that’s a step in the right direction :-)

Using SQLite in the browser

At the moment, (almost) everything is stored on the server side as GeoJSON files. They are simple to use, read and write, and keeping them as files on the storage makes it easy to handle multiple revisions.

I’ve been asked to challenge this idea for a moment. What if we were using some other technology to store the data? What would that give us? What would be the challenges?

I went with SQLite, just to see how it would work and what the challenges around this technology would be. I wrote a small application with it. It turns out that writing to a local in-browser SQLite works.

Here is what it would look like:

I’m not sure SQLite by itself is useful here. It sure is fun, but I don’t see what we would gain compared with a more classical CRDT approach, except complexity. The technology is still quite young and rough around the edges, and uses Rust and WebAssembly, which are still strange beasts to me.

Here are some interesting projects I’ve found this week:

Two libraries seem useful for us:

I’m noting that:

How to transport the data?

One of the related subjects is the transport of the data between the client and the server. Once we have the local changes, we’ll need to find a way to send this data to the other clients, and ultimately to the server.

There are multiple ways to do this, and I spent some time trying to figure out the pros and cons of each approach. Here is a list:

All of these scenarios are possible, and each of them has pros and cons. I’ll be working on a document this week to better understand what’s hidden behind each of these, so we can ultimately make a choice.

Server-Sent Events (SSE)

Here are some notes about SSE. I’ve learned that:

It raises questions for me about the infrastructure changes it would require.
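
To make the protocol a bit more concrete, here is a minimal sketch of what one SSE message looks like on the wire: a stream of text lines (`event:`, `data:`) ended by a blank line, served with the `text/event-stream` content type. The `format_sse` helper, the `update` event name and the payload are made up for the example, they are not part of uMap.

```python
import json
from typing import Optional


def format_sse(data: dict, event: Optional[str] = None) -> str:
    """Serialize one message in the text/event-stream wire format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps emits no raw newlines by default, so a single "data:" line is enough.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


# Hypothetical payload describing a change made by another client.
message = format_sse({"layer": 12, "action": "feature-added"}, event="update")
print(message)
```

On the browser side, such a stream is read with the standard `EventSource` API. Since the server has to keep one connection open per connected client, this is where the infrastructure questions come from.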

#Python, #CRDT, #Sync, #uMap - Posted in the code category