mirror of
https://github.com/freedomofpress/dangerzone.git
synced 2025-05-17 18:51:50 +02:00
FIXUP: Keep only the necessary instructions for checking reproducibility
parent
685cf431a3
commit
acbc433717
1 changed files with 13 additions and 80 deletions
|
@@ -11,105 +11,38 @@ Our build artifacts consist of:
|
|||
* Fedora packages (for regular Fedora distros and Qubes)
|
||||
* Debian packages (for Debian and Ubuntu)
|
||||
|
||||
As of writing this, none of the above artifacts are reproducible. For this
|
||||
reason, we purposefully build them in machines owned by FPF, since we can't
|
||||
trust third-party servers. A security hole in GitHub, or
|
||||
in our CI pipeline (check out the
|
||||
[Ultralytics cryptominer saga](https://github.com/ultralytics/ultralytics/issues/18027)),
|
||||
may allow attackers to plant a malicious artifact with no detection.
|
||||
As of writing this, only the following artifacts are reproducible:
|
||||
* Container images (see [#1047](https://github.com/freedomofpress/dangerzone/issues/1047))
|
||||
|
||||
Still, building our artifacts in private is not ideal. Third parties cannot
|
||||
easily audit if our artifacts have been built correctly or if they have been
|
||||
tampered with. For instance, our Apple Silicon container image builds PyMuPDF
|
||||
from source, and while the PyPI source package is hashed, the produced output
|
||||
does not have a known hash. So, it's not easy to verify it's been built
|
||||
correctly (read also the seminal
|
||||
["Reflections on Trusting Trust"](https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_ReflectionsonTrustingTrust.pdf)
|
||||
lecture by Ken Thompson on that subject).
|
||||
|
||||
In order to make our builds auditable and allow building artifacts in
|
||||
third-party servers safely, we want to make each artifact build reproducible. In
|
||||
the following sections, we'll lay down the plan to do so for each artifact type.
|
||||
In the following sections, we'll mention some specifics about enforcing
|
||||
reproducibility for each artifact type.
|
||||
|
||||
## Container image
|
||||
|
||||
### Current limitations
|
||||
|
||||
Our container image is currently not reproducible for the following main
|
||||
reasons:
|
||||
|
||||
* We build PyMuPDF from source, since it's not available in Alpine Linux. The
|
||||
result of this build is not reproducible. Note that PyMuPDF wheels are
|
||||
available from PyPI, but there are no ARM wheels for the musl libc platforms.
|
||||
* Alpine Linux does not have a way to pin packages and their dependencies, and
|
||||
does not retain old packages. There's a
|
||||
[workaround](https://github.com/reproducible-containers/repro-pkg-cache)
|
||||
to download the required packages and store them elsewhere, but then the
|
||||
cached package downloads cannot be easily audited.
|
||||
|
||||
## Proposed implementation
|
||||
|
||||
We can take advantage of the
|
||||
[Debian snapshot archives](https://snapshot.debian.org/)
|
||||
and pin our packages by specifying a date. There's already
|
||||
[prior art](https://github.com/reproducible-containers/repro-sources-list.sh/)
|
||||
for that, thanks to the incredible work of @AkihiroSuda on
|
||||
[reproducible containers](https://github.com/reproducible-containers).
|
||||
As for PyMuPDF, it is available from the Debian repos, so we won't have to build
|
||||
it from source.
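
To make the date pinning concrete, here is a minimal sketch of how a date could map to a snapshot archive URL. The `snapshot_url` and `sources_line` helpers, the `bookworm` suite, and the midnight-UTC timestamp are assumptions made for illustration; they are not taken from `repro-sources-list.sh` or from the Dangerzone Dockerfile.

```python
from datetime import datetime, timezone

# Base of the Debian snapshot service mentioned above.
SNAPSHOT_BASE = "https://snapshot.debian.org/archive"


def snapshot_url(archive, date):
    """Turn a YYYYMMDD date into a snapshot URL pinned at midnight UTC."""
    ts = datetime.strptime(date, "%Y%m%d").replace(tzinfo=timezone.utc)
    return f"{SNAPSHOT_BASE}/{archive}/{ts:%Y%m%dT%H%M%SZ}/"


def sources_line(date, suite="bookworm", component="main"):
    """Build a one-line APT source entry for the pinned snapshot.

    Snapshot metadata is old by design, so the Valid-Until check is
    switched off for this entry. The suite name is an assumption.
    """
    url = snapshot_url("debian", date)
    return f"deb [check-valid-until=no] {url} {suite} {component}"


print(snapshot_url("debian", "20250113"))
# https://snapshot.debian.org/archive/debian/20250113T000000Z/
print(sources_line("20250113"))
```

Since the date is the only input, bumping it is the only way the resolved package set can change.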
|
||||
|
||||
Here are a few other obstacles that we need to overcome:
|
||||
* We currently download the
|
||||
[latest gVisor version](https://gvisor.dev/docs/user_guide/install/#latest-release)
|
||||
from a GCS bucket. Now that we have switched to Debian, we can take advantage
|
||||
of their
|
||||
[timestamped APT repos](https://gvisor.dev/docs/user_guide/install/#specific-release)
|
||||
and download specific releases from those. An extra benefit is that such
|
||||
releases are signed with their APT key.
|
||||
* We can no longer update the packages in the container image by rebuilding it.
|
||||
We have to bump the dates in the Dockerfile first, which is a minor hassle,
|
||||
but much more declarative.
|
||||
* The `repro-sources-list.sh` script uses the release date of the container
|
||||
image. However, the Debian image is not updated daily (see
|
||||
[newest tags](https://hub.docker.com/_/debian/tags)
|
||||
in DockerHub). So, if we want to ship an emergency release, we have to
|
||||
circumvent this limitation. A simple way is to trick the script by bumping the
|
||||
date of the `/etc/apt/sources.list.d/debian.sources` and
|
||||
`/etc/apt/sources.list` files.
|
||||
* While we talk about image reproducibility, we can't actually achieve the exact
|
||||
same SHA-256 hash for two different image builds. That's because the file
|
||||
timestamps in the image layers will differ, depending on when the build took
|
||||
place. The rest of the image though (file contents, permissions, manifest)
|
||||
should be byte-for-byte the same. A simple way to check this is with the
|
||||
[`diffoci`](https://github.com/reproducible-containers/diffoci) tool, and
|
||||
specifically this invocation:
|
||||
|
||||
```
|
||||
./diffoci diff podman://<new_image_tag> podman://<old_image_tag> \
|
||||
--ignore-timestamps --ignore-image-name --verbose
|
||||
```
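
The timestamp effect described above can be shown with a small, self-contained sketch. It packs the same files into two tar archives (standing in for image layers) with different modification times, then compares them both by digest and by content. This is only an illustration of the idea. It is not how `diffoci` is implemented, and all helper names are hypothetical.

```python
import hashlib
import io
import tarfile


def build_layer(files, mtime):
    """Pack `files` (name -> bytes) into an uncompressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed order, so only mtime can vary
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mode = 0o644
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def layer_digest(blob):
    """SHA-256 over the raw archive bytes, timestamps included."""
    return hashlib.sha256(blob).hexdigest()


def contents_without_timestamps(blob):
    """List (name, mode, size, data digest) per entry, leaving out mtime."""
    entries = []
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for member in tar:
            data = tar.extractfile(member).read() if member.isfile() else b""
            entries.append(
                (member.name, member.mode, member.size,
                 hashlib.sha256(data).hexdigest())
            )
    return entries


files = {"etc/hostname": b"dangerzone\n", "usr/bin/tool": b"#!/bin/sh\n"}
first = build_layer(files, mtime=1_700_000_000)
second = build_layer(files, mtime=1_700_086_400)  # same files, one day later

print(layer_digest(first) == layer_digest(second))  # False
print(contents_without_timestamps(first) == contents_without_timestamps(second))  # True
```

The digests differ only because the tar headers carry the modification time. Once that field is left out, the two archives describe identical content, which is the property the `--ignore-timestamps` comparison relies on.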
|
||||
|
||||
### Updating the image
|
||||
|
||||
The fact that our image is reproducible also means that it's frozen in time.
|
||||
This means that rebuilding the image without updating our Dockerfile will **not**
|
||||
receive security updates.
|
||||
This means that rebuilding the image without updating our Dockerfile will
|
||||
**not** receive security updates.
|
||||
|
||||
We list the necessary variables that make up our image in the `Dockerfile.env`
|
||||
file. These are:
|
||||
Here are the necessary variables that make up our image in the `Dockerfile.env`
|
||||
file:
|
||||
* `DEBIAN_IMAGE_DATE`: The date that the Debian container image was released
|
||||
* `DEBIAN_ARCHIVE_DATE`: The Debian snapshot repo that we want to use
|
||||
* `GVISOR_ARCHIVE_DATE`: The gVisor APT repo that we want to use
|
||||
* `H2ORESTART_CHECKSUM`: The SHA-256 checksum of the H2ORestart plugin
|
||||
* `H2ORESTART_VERSION`: The version of the H2ORestart plugin
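
As a rough illustration, the variables above could be sanity-checked before rendering the Dockerfile. The sketch below assumes that `Dockerfile.env` holds plain `KEY=VALUE` lines with optional `#` comments. The `parse_env` and `check_env` helpers and the sample values are hypothetical and are not part of the Dangerzone code base.

```python
import re

# Variable names as listed in the documentation above.
REQUIRED = (
    "DEBIAN_IMAGE_DATE",
    "DEBIAN_ARCHIVE_DATE",
    "GVISOR_ARCHIVE_DATE",
    "H2ORESTART_CHECKSUM",
    "H2ORESTART_VERSION",
)


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE line: {raw!r}")
        values[key.strip()] = value.strip()
    return values


def check_env(values):
    """Return a list of problems; an empty list means the file looks sane."""
    problems = [f"missing {name}" for name in REQUIRED if not values.get(name)]
    checksum = values.get("H2ORESTART_CHECKSUM", "")
    if checksum and not re.fullmatch(r"[0-9a-f]{64}", checksum):
        problems.append("H2ORESTART_CHECKSUM is not a 64-digit hex string")
    return problems


# Placeholder values for illustration only, not the real pins.
sample = """
# dates are placeholders
DEBIAN_IMAGE_DATE=20250101
DEBIAN_ARCHIVE_DATE=20250101
GVISOR_ARCHIVE_DATE=20250101
H2ORESTART_VERSION=v0.0.0
"""
print(check_env(parse_env(sample)))  # ['missing H2ORESTART_CHECKSUM']
```

A check along these lines would fail early when a pin is dropped or a checksum is mistyped, rather than at image build time.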
|
||||
|
||||
If you update these values in `Dockerfile.env`, you can create a new Dockerfile
|
||||
with:
|
||||
If you update these values in `Dockerfile.env`, you must also create a new
|
||||
Dockerfile with:
|
||||
|
||||
```
|
||||
poetry run jinja2 Dockerfile.in Dockerfile.env > Dockerfile
|
||||
```
|
||||
|
||||
Updating `Dockerfile` without bumping `Dockerfile.in` is detected and should
|
||||
trigger a CI error.
|
||||
|
||||
### Reproducing the image
|
||||
|
||||
For a simple way to reproduce a Dangerzone container image, either local or
|
||||
|
|
Loading…
Reference in a new issue