mirror of
https://github.com/freedomofpress/dangerzone.git
synced 2025-05-18 03:01:50 +02:00
FIXUP: Keep only the necessary instructions for checking reproducibility
parent
685cf431a3
commit
acbc433717
1 changed files with 13 additions and 80 deletions
|
@ -11,105 +11,38 @@ Our build artifacts consist of:
|
||||||
* Fedora packages (for regular Fedora distros and Qubes)
|
* Fedora packages (for regular Fedora distros and Qubes)
|
||||||
* Debian packages (for Debian and Ubuntu)
|
* Debian packages (for Debian and Ubuntu)
|
||||||
|
|
||||||
As of writing this, none of the above artifacts are reproducible. For this
|
As of writing this, only the following artifacts are reproducible:
|
||||||
reason, we purposefully build them in machines owned by FPF, since we can't
|
* Container images (see [#1047](https://github.com/freedomofpress/dangerzone/issues/1047))
|
||||||
trust third-party servers. A security hole in GitHub, or
|
|
||||||
in our CI pipeline (check out the
|
|
||||||
[Ultralytics cryptominer saga](https://github.com/ultralytics/ultralytics/issues/18027)),
|
|
||||||
may allow attackers to plant a malicious artifact with no detection.
|
|
||||||
|
|
||||||
Still, building our artifacts in private is not ideal. Third parties cannot
|
In the following sections, we'll mention some specifics about enforcing
|
||||||
easily audit if our artifacts have been built correctly or if they have been
|
reproducibility for each artifact type.
|
||||||
tampered with. For instance, our Apple Silicon container image builds PyMuPDF
|
|
||||||
from source, and while the PyPI source package is hashed, the produced output
|
|
||||||
does not have a known hash. So, it's not easy to verify it's been built
|
|
||||||
correctly (read also the seminal
|
|
||||||
["Reflections on Trusting Trust"](https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_ReflectionsonTrustingTrust.pdf)
|
|
||||||
lecture by Ken Thompson on that subject).
|
|
||||||
|
|
||||||
In order to make our builds auditable and allow building artifacts in
|
|
||||||
third-party servers safely, we want to make each artifact build reproducible. In
|
|
||||||
the following sections, we'll lay down the plan to do so for each artifact type.
|
|
||||||
|
|
||||||
## Container image
|
## Container image
|
||||||
|
|
||||||
### Current limitations
|
|
||||||
|
|
||||||
Our container image is currently not reproducible for the following main
|
|
||||||
reasons:
|
|
||||||
|
|
||||||
* We build PyMuPDF from source, since it's not available in Alpine Linux. The
|
|
||||||
result of this build is not reproducible. Note that PyMuPDF wheels are
|
|
||||||
available from PyPI, but there are no ARM wheels for the musl libc platforms.
|
|
||||||
* Alpine Linux does not have a way to pin packages and their dependencies, and
|
|
||||||
does not retain old packages. There's a
|
|
||||||
[workaround](https://github.com/reproducible-containers/repro-pkg-cache)
|
|
||||||
to download the required packages and store them elsewhere, but then the
|
|
||||||
cached package downloads cannot be easily audited.
|
|
||||||
|
|
||||||
## Proposed implementation
|
|
||||||
|
|
||||||
We can take advantage of the
|
|
||||||
[Debian snapshot archives](https://snapshot.debian.org/)
|
|
||||||
and pin our packages by specifying a date. There's already
|
|
||||||
[prior art](https://github.com/reproducible-containers/repro-sources-list.sh/)
|
|
||||||
for that, thanks to the incredible work of @AkihiroSuda on
|
|
||||||
[reproducible containers](https://github.com/reproducible-containers).
|
|
||||||
As for PyMuPDF, it is available from the Debian repos, so we won't have to build
|
|
||||||
it from source.
|
|
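As an illustration of pinning by date, here is a minimal Python sketch that turns a pinned date into a Debian snapshot archive URL. The function name and the example date are hypothetical, and the URL layout (`/archive/debian/<timestamp>/`) is the scheme used by snapshot.debian.org rather than something taken from the Dangerzone Dockerfile.

```python
from datetime import datetime, timezone


def debian_snapshot_url(date: str) -> str:
    """Map a YYYYMMDD date to a snapshot.debian.org archive URL (illustrative)."""
    # Parsing also validates the date, e.g. it rejects "20250230".
    day = datetime.strptime(date, "%Y%m%d").replace(tzinfo=timezone.utc)
    stamp = day.strftime("%Y%m%dT%H%M%SZ")
    return f"https://snapshot.debian.org/archive/debian/{stamp}/"


# Hypothetical pinned date, for illustration only.
print(debian_snapshot_url("20250101"))
```

Pointing APT at such a URL is what keeps package versions fixed between rebuilds: the package set only changes when the date is bumped.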
||||||
|
|
||||||
Here are a few other obstacles that we need to overcome:
|
|
||||||
* We currently download the
|
|
||||||
[latest gVisor version](https://gvisor.dev/docs/user_guide/install/#latest-release)
|
|
||||||
from a GCS bucket. Now that we have switched to Debian, we can take advantage
|
|
||||||
of their
|
|
||||||
[timestamped APT repos](https://gvisor.dev/docs/user_guide/install/#specific-release)
|
|
||||||
and download specific releases from those. An extra benefit is that such
|
|
||||||
releases are signed with their APT key.
|
|
||||||
* We can no longer update the packages in the container image by rebuilding it.
|
|
||||||
We have to bump the dates in the Dockerfile first, which is a minor hassle,
|
|
||||||
but much more declarative.
|
|
||||||
* The `repro-source-list-.sh` script uses the release date of the container
|
|
||||||
image. However, the Debian image is not updated daily (see
|
|
||||||
[newest tags](https://hub.docker.com/_/debian/tags)
|
|
||||||
in DockerHub). So, if we want to ship an emergency release, we have to
|
|
||||||
circumvent this limitation. A simple way is to trick the script by bumping the
|
|
||||||
date of the `/etc/apt/sources.list.d/debian.sources` and
|
|
||||||
`/etc/apt/sources.list` files.
|
|
||||||
* While we talk about image reproducibility, we can't actually achieve the exact
|
|
||||||
same SHA-256 hash for two different image builds. That's because the file
|
|
||||||
timestamps in the image layers will differ, depending on when the build took
|
|
||||||
place. The rest of the image though (file contents, permissions, manifest)
|
|
||||||
should be byte-for-byte the same. A simple way to check this is with the
|
|
||||||
[`diffoci`](https://github.com/reproducible-containers/diffoci) tool, and
|
|
||||||
specifically this invocation:
|
|
||||||
|
|
||||||
```
|
|
||||||
./diffoci diff podman://<new_image_tag> podman://<old_image_tag> \
|
|
||||||
--ignore-timestamps --ignore-image-name --verbose
|
|
||||||
```
|
|
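The timestamp effect described above can be seen in a small Python sketch. It builds an in-memory tar archive as a stand-in for an image layer (real layers carry more metadata, so this is an assumption-laden simplification) and hashes it. The helper name, file name, and mtime values are hypothetical.

```python
import hashlib
import io
import tarfile


def layer_digest(files: dict[str, bytes], mtime: int) -> str:
    """Build an in-memory tar "layer" and return its SHA-256 hex digest."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):  # fixed order, so only mtime varies
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = mtime
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(files[name]))
    return hashlib.sha256(buf.getvalue()).hexdigest()


# Same file contents, built at two different (hypothetical) times.
files = {"etc/hostname": b"dangerzone\n"}
first_build = layer_digest(files, mtime=1_700_000_000)
later_build = layer_digest(files, mtime=1_700_086_400)
same_time_rebuild = layer_digest(files, mtime=1_700_000_000)

print(first_build == later_build)        # False: only the mtime differs
print(first_build == same_time_rebuild)  # True: identical inputs, identical hash
```

This is why a comparison that ignores timestamps, such as the `diffoci` invocation above, is needed to confirm that everything else matches.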
||||||
|
|
||||||
### Updating the image
|
### Updating the image
|
||||||
|
|
||||||
The fact that our image is reproducible also means that it's frozen in time.
|
The fact that our image is reproducible also means that it's frozen in time.
|
||||||
This means that rebuilding the image without updating our Dockerfile will **not**
|
This means that rebuilding the image without updating our Dockerfile will
|
||||||
receive security updates.
|
**not** receive security updates.
|
||||||
|
|
||||||
We list the necessary variables that make up our image in the `Dockerfile.env`
|
Here are the necessary variables that make up our image in the `Dockerfile.env`
|
||||||
file. These are:
|
file:
|
||||||
* `DEBIAN_IMAGE_DATE`: The date that the Debian container image was released
|
* `DEBIAN_IMAGE_DATE`: The date that the Debian container image was released
|
||||||
* `DEBIAN_ARCHIVE_DATE`: The Debian snapshot repo that we want to use
|
* `DEBIAN_ARCHIVE_DATE`: The Debian snapshot repo that we want to use
|
||||||
* `GVISOR_ARCHIVE_DATE`: The gVisor APT repo that we want to use
|
* `GVISOR_ARCHIVE_DATE`: The gVisor APT repo that we want to use
|
||||||
* `H2ORESTART_CHECKSUM`: The SHA-256 checksum of the H2ORestart plugin
|
* `H2ORESTART_CHECKSUM`: The SHA-256 checksum of the H2ORestart plugin
|
||||||
* `H2ORESTART_VERSION`: The version of the H2ORestart plugin
|
* `H2ORESTART_VERSION`: The version of the H2ORestart plugin
|
||||||
|
|
||||||
If you update these values in `Dockerfile.env`, you can create a new Dockerfile
|
If you update these values in `Dockerfile.env`, you must also create a new
|
||||||
with:
|
Dockerfile with:
|
||||||
|
|
||||||
```
|
```
|
||||||
poetry run jinja2 Dockerfile.in Dockerfile.env > Dockerfile
|
poetry run jinja2 Dockerfile.in Dockerfile.env > Dockerfile
|
||||||
```
|
```
|
||||||
|
|
||||||
|
Updating `Dockerfile` without bumping `Dockerfile.in` is detected and should
|
||||||
|
trigger a CI error.
|
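A check of this kind could look roughly like the Python sketch below. It is not the project's actual CI job. It assumes `Dockerfile.env` holds plain `KEY=VALUE` lines, and it uses a tiny regex substitution as a stand-in for the real `jinja2` rendering. All function names and sample inputs are hypothetical.

```python
import re


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def render(template: str, values: dict[str, str]) -> str:
    """Stand-in for Jinja2: substitutes only plain `{{ NAME }}` placeholders."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: values[m.group(1)], template)


def dockerfile_in_sync(template: str, env_text: str, committed: str) -> bool:
    """Return True if the committed Dockerfile equals a fresh render."""
    return render(template, parse_env(env_text)) == committed


# Hypothetical inputs, for illustration only.
TEMPLATE = "FROM debian:bookworm-{{ DEBIAN_IMAGE_DATE }}-slim\n"
ENV = "# pinned dates\nDEBIAN_IMAGE_DATE=20250101\n"

print(dockerfile_in_sync(TEMPLATE, ENV, "FROM debian:bookworm-20250101-slim\n"))
print(dockerfile_in_sync(TEMPLATE, ENV, "FROM debian:bookworm-20241201-slim\n"))
```

In CI, the same comparison would run against the files in the repository, failing the job whenever the committed `Dockerfile` differs from the rendered output.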
||||||
|
|
||||||
### Reproducing the image
|
### Reproducing the image
|
||||||
|
|
||||||
For a simple way to reproduce a Dangerzone container image, either local or
|
For a simple way to reproduce a Dangerzone container image, either local or
|
||||||
|
|