Commit graph

37 commits

Author SHA1 Message Date
Alexis Métaireau
4c9139201f
Remove duplicated python3 dependency from Dockerfile 2025-04-22 12:55:48 +02:00
Alex Pyrgiotis
972b264236
Update the Dangerzone image and its dependencies
Bump the Debian container image, gVisor version, and H2Orestart plugin.
2025-04-01 10:33:55 +03:00
sudoforge
e8ca12eb11
Use a fully qualified URI for the debian image
Some checks are pending
Tests / build-deb (debian trixie) (push) Blocked by required conditions
Tests / build-deb (ubuntu 22.04) (push) Blocked by required conditions
Tests / build-deb (ubuntu 24.04) (push) Blocked by required conditions
Tests / build-deb (ubuntu 24.10) (push) Blocked by required conditions
Tests / build-deb (ubuntu 25.04) (push) Blocked by required conditions
Tests / install-deb (debian bookworm) (push) Blocked by required conditions
Tests / install-deb (debian bullseye) (push) Blocked by required conditions
Tests / install-deb (debian trixie) (push) Blocked by required conditions
Tests / install-deb (ubuntu 22.04) (push) Blocked by required conditions
Tests / install-deb (ubuntu 24.04) (push) Blocked by required conditions
Tests / install-deb (ubuntu 24.10) (push) Blocked by required conditions
Tests / install-deb (ubuntu 25.04) (push) Blocked by required conditions
Tests / build-install-rpm (fedora 40) (push) Blocked by required conditions
Tests / build-install-rpm (fedora 41) (push) Blocked by required conditions
Tests / build-install-rpm (fedora 42) (push) Blocked by required conditions
Tests / run tests (debian bookworm) (push) Blocked by required conditions
Tests / run tests (debian bullseye) (push) Blocked by required conditions
Tests / run tests (debian trixie) (push) Blocked by required conditions
Tests / run tests (fedora 40) (push) Blocked by required conditions
Tests / run tests (fedora 41) (push) Blocked by required conditions
Tests / run tests (fedora 42) (push) Blocked by required conditions
Tests / run tests (ubuntu 22.04) (push) Blocked by required conditions
Tests / run tests (ubuntu 24.04) (push) Blocked by required conditions
Tests / run tests (ubuntu 24.10) (push) Blocked by required conditions
Tests / run tests (ubuntu 25.04) (push) Blocked by required conditions
Release multi-arch container image / build-push-image (push) Waiting to run
Scan latest app and container / security-scan-container (ubuntu-24.04) (push) Waiting to run
Scan latest app and container / security-scan-container (ubuntu-24.04-arm) (push) Waiting to run
Scan latest app and container / security-scan-app (ubuntu-24.04) (push) Waiting to run
Scan latest app and container / security-scan-app (ubuntu-24.04-arm) (push) Waiting to run
This change adds the registry prefix to the `debian` image we pull
from `docker.io/library`. By adding this, we improve support for
non-interactive builds, as users who do not have a preferred default
registry defined in their local configuration will no longer be prompted
to select which registry to pull this from.
2025-03-31 09:26:25 -07:00
sudoforge
491cca6341
Use a digest for the debian base image
66600f32dc introduced various improvements
to the determinism of the container image in this repository. This
change builds on this effort by ensuring that the base image is pulled
by digest. Image digests are immutable references, unlike tags, which
are mutable (except when optionally configured as immutable in certain
container registries, but not `docker.io`).
2025-03-31 08:04:05 -07:00
Alex Pyrgiotis
66600f32dc
Remove sources of non-determinism from our image
Make our container image more reproducible, by changing the following in
our Dockerfile:
1. Touch `/etc/apt/sources.list` with a UTC timestamp. Else, builds on
   different countries (!?) may result to different Unix epochs for the
   same date, and therefore different modification time for the
   file.
2. Turn the third column of `/etc/shadow` (date of last password change)
   for the `dangerzone` user into a constant number.
3. Fix r-s file permissions in some copied files, due to inconsistent
   COPY behavior in containerized vs non-containerized Buildkit. This
   requires creating a full file hierarchy in a separate directory (see
   new_root/).
4. Set a specific modification time for the entrypoint script, because
   rewrite-timestamp=true does not overwrite it.
2025-03-20 17:15:15 +02:00
Alex Pyrgiotis
d41f604969
Bump container image parameters
Bump all the values in Dockerfile.env, since there are new releases out
for all of them.
2025-03-20 17:15:15 +02:00
Alex Pyrgiotis
fb90243668
Symlink /usr in Debian container image
Update our Dockerfile and entrypoint script in order to reuse the /usr
dir in the inner and outer container image.

Refs #1048
2025-01-27 21:40:27 +02:00
Alex Pyrgiotis
0ce7773ca1
Render the Dockerfile from a template and some params
Allow updating the Dockerfile from a template and some envs, so that
it's easier to bump the dates in it.
2025-01-27 21:40:27 +02:00
Alex Pyrgiotis
033ce0986d
Switch base image to Debian Stable
Switch base image from Alpine Linux to Debian Stable, in order to reduce
our image footprint, improve our security posture, and build our
container image reproducibly.

Fixes #1046
Refs #1047
2025-01-23 23:24:48 +02:00
Alex Pyrgiotis
935396565c
Reuse the same rootfs for the inner and outer container
Remove the need to copy the Dangerzone container image (used by the
inner container) within a wrapper gVisor image (used by the outer
container). Instead, use the root of the container filesystem for both
containers. We can do this safely because we don't mount any secrets to
the container, and because gVisor offers a read-only view of the
underlying filesystem

Fixes #1048
2025-01-23 23:24:48 +02:00
Alex Pyrgiotis
8568b4bb9d
Move container-only build context to dangerzone/container
Move container-only build context (currently just the entrypoint script)
from `dangerzone/gvisor_wrapper` to `dangerzone/container_helpers`.
Update the rest of the scripts to use this location as well.
2025-01-23 23:24:48 +02:00
Alexis Métaireau
c4bb7c28c8
Unpin gVisor, now that upstream is able to support Linux Yama Mode 2
Fixes #298
2024-11-20 16:41:55 +01:00
Alex Pyrgiotis
7ea7c8a0cc
Remove dead code 2024-10-17 15:50:12 +03:00
Alex Pyrgiotis
8f71df56d9
Handle PyMuPDF 1.24.11 wheels in our Dockerfile
The PyMuPDF wheels for version 1.24.11 have changed the way they are
being built, which means we have to adapt our Dockerfile in order to
install them properly.
2024-10-15 13:08:33 +03:00
Alex Pyrgiotis
93b960cd23
Bump H2ORestart to version 0.6.6
Follow Debian's lead [1] and bump this version to 0.6.6. This change
should bring some stability improvements to our CI tests as well.

[1]: https://packages.debian.org/unstable/text/libreoffice-h2orestart
2024-10-07 18:36:06 +03:00
Alexis Métaireau
eb10082a62
Merge branch 'hotfix-0.7.1' into main 2024-10-01 15:16:25 +02:00
Alex Pyrgiotis
bd2dc0ea3c
Pin gVisor to the last working release
Temporarily pin gVisor to the latest working version
(`release-20240826.0`), since the latest one breaks our container image.

Refs #928
2024-09-27 12:55:59 +03:00
Alexis Métaireau
e4af44c220
Use PyMuPDF wheels for non-ARM architectures.
This removes the need to build the PyMuPDF project by ourselves, but
only when on non-ARM architectures since the wheels for these are not
provided yet.

Changes the `Dockerfile` and `build-image.py` script, introducing a new
`ARCH` flag to conditionally build the wheels.
2024-09-10 14:47:57 +02:00
Etienne Perot
f03bc71855
Sandbox all Dangerzone document processing within gVisor.
This wraps the existing container image inside a gVisor-based sandbox.

gVisor is an open-source OCI-compliant container runtime.
It is a userspace reimplementation of the Linux kernel in a
memory-safe language.

It works by creating a sandboxed environment in which regular Linux
applications run, but their system calls are intercepted by gVisor.
gVisor then redirects these system calls and reinterprets them in
its own kernel. This means the host Linux kernel is isolated
from the sandboxed application, thereby providing protection against
Linux container escape attacks.

It also uses `seccomp-bpf` to provide a secondary layer of defense
against container escapes. Even if its userspace kernel gets
compromised, attackers would have to additionally have a Linux
container escape vector, and that exploit would have to fit within
the restricted `seccomp-bpf` rules that gVisor adds on itself.

Fixes #126
Fixes #224
Fixes #225
Fixes #228
2024-06-12 13:40:04 +03:00
Alexis Métaireau
c01515b775
Bump the minimum python version to 3.9
The minimum python version when installing from source is now python
3.9, as Pyside6 6.7.1 dropped support for python 3.8 (see #780 for more
information).

On Debian-derivatives distributions, the minimum Python version is now
set to 3.8. In practice, because Pyside6 is not packaged for Debian, we
use Pyside2 [0], which is why we can relax the python version requirement.

In practice, when installing from source on an environment where
python3.9 is not the default python, poetry will look for it and use it
if available

> For various reasons, this Python version might not be compatible with
> the python range supported by the project. In this case, Poetry will
> try to find one that is and use it.
>
> [Poetry docs](https://python-poetry.org/docs/managing-environments/)

On Ubuntu Focal (20.04) where Python 3.9 is not installed by default,
it is possible to install it using the `python3.9` package.

Additionally, In version 1.24.3, PyMuPDF changed its package name from `fitz`
to `pymupdf` [2], resulting in a breakage on how it is installed in our
container. This is now fixed.

[0] More information on how Pyside6 packaging affects dangerzone on #221
[1] See [the current status of Pyside6 packaging](https://repology.org/
project/python:pyside6/packages)
[2] PyMuPDF changelog: https://pymupdf.readthedocs.io/en/latest/changes.html#change-log
2024-06-04 19:57:40 +02:00
Alex Pyrgiotis
76898471e7
Bump Python system path to 3.12 in Dockerfile
Alpine Linux 3.20 was released recently [1]. As a result, the
`alpine:latest` image ref, that our Dockerfile uses, switched from the
3.19 to the 3.20 Alpine Linux release. This release has Python 3.12,
meaning that the following line in our Dockerfile now fails:

    COPY --from=pymupdf-build /usr/lib/python3.11/site-packages/fitz/ /usr/lib/python3.11/site-packages/fitz

Bump the Python version in the Python system path to 3.12, so that we
can successfully build the container image.

[1]: https://alpinelinux.org/posts/Alpine-3.20.0-released.html
2024-05-23 12:14:00 +03:00
deeplow
0a099540c8
Stream pages in containers: merge isolation providers
Merge Qubes and Containers isolation providers core code into the class
parent IsolationProviders abstract class.

This is done by streaming pages in containers for exclusively in first
conversion process. The commit is rather large due to the multiple
interdependencies of the code, making it difficult to split into various
commits.

The main conversion method (_convert) now in the superclass simply calls
two methods:
  - doc_to_pixels()
  - pixels_to_pdf()

Critically, doc_to_pixels is implemented in the superclass, diverging
only in a specialized method called "start_doc_to_pixels_proc()". This
method obtains the process responsible that communicates with the
isolation provider (container / disp VM) via `podman/docker` and qrexec
on Containers and Qubes respectively.

Known regressions:
  - progress reports stopped working on containers

Fixes #443
2024-02-06 19:42:33 +00:00
Prateek Jain
699b116d4d
Add clang-dev to Dockerfile 2024-01-15 16:54:00 +00:00
deeplow
f676891482
Remove Dockerfile dependencies replaced by PyMuPDF
PyMuPDF replaced the need for almost all dependencies, which this commit
now removes.

We are also removing tesseract-ocr as a dependency since
(to our surprise) PyMuPDF ships directly with tesseract binaries [1].
However, now that tesseract-ocr is not available directly as a binary
tool, the `test_ocr.py` needed to be changed.

Fixes #658

[1]: https://github.com/freedomofpress/dangerzone/issues/658#issuecomment-1861033149
2024-01-03 12:58:36 +00:00
deeplow
e0b092692d
Multi-stage Dockerfile build
Breaks down the container build into multiple stages in order to speed
up build times. Building PyMuPDF was taking too long and this way it can
be cached.

The original version was made by @apyrgio
2024-01-03 12:58:35 +00:00
deeplow
2b082913a0
Bump pymupdf version 1.23.7
The build was failing due to a missing kernel libraries. Adding the
linux-headers dependency solves the issue.
2024-01-03 12:58:35 +00:00
deeplow
250d8356cd
Hash-verify container pip install & merge build-image
Ensure that when the container image is installing pymupdf (unavailable
in the repos) with verified hashes. To do so, it has the pymupdf
dependency declared in a "container" group in `pyproject.toml`, which
then gets exported into a requirements.txt, which is then used for
hash-verification when building the container.

Because this required modifying the container image build scripts, they
were all merged to avoid duplicate code. This was an overdue change
anyways.
2024-01-03 12:58:35 +00:00
deeplow
7b57cb209e
PIP force --break-system-packages
We're intentionally bypassing PEP 668 [1], which prevents the
installation of non-distro python wheels alongside system packages to
avoid incompatibilities at distro-level.

We are intentionally bypassing this since our container image is a
controlled environment (we only ship a version after rigorous testing).

[1]: https://peps.python.org/pep-0668/
2024-01-03 12:58:35 +00:00
deeplow
327ab8791f
Replace pdftoppm logic with PyMuPDF (native python)
Use PyMuPDF (AGPL-licensed) within the container conversion to replace
the pdf conversion to RGB. This massively simplifies the code since
PyMuPDF is a native python library.
2024-01-03 12:40:45 +00:00
Alex Pyrgiotis
6973845ec9
Revert "Switch to the edge repo of Alpine Linux"
This reverts commit acd615e0e1. The
rationale is that we want to wait until the LibreOffice package that
allows HWP conversion in Alpine Linux lands in `alpine:latest`.

For more info, read
https://github.com/freedomofpress/dangerzone/issues/498#issuecomment-1739894100
2023-10-02 14:22:46 +03:00
Alex Pyrgiotis
cbca9110ca
Switch to tessdata-fast Tesseract model
Switch to the tessdata-fast Tesseract model, instead of the tessdata
one. The tessdata-fast Tesseract model is much smaller, and a bit faster
than the other one. Also, it's the model that Debian/Fedora ship by
default.

Closes #545
2023-09-25 12:48:05 +03:00
Moon Sungjoon
acd615e0e1
Switch to the edge repo of Alpine Linux
The Alpine Linux team has enabled Java support for LibreOffice on ARM
architecture:

    74d443f479

This commit is included in 7.5.5.2-r2, so the installed LibreOffice
package should be 7.5.5.2-r2 or higher to fix this issue.

However 3.18 doesn't have the 7.5.5.2-r2 package:

    https://pkgs.alpinelinux.org/package/v3.18/community/aarch64/libreoffice

The Dangerzone image uses the alpine:latest image which is 3.18 as of
writing this.

For this reason, we switch to the edge repo of Alpine Linux, which
includes this fix.

Refs #498
Refs #540
Refs #542
2023-09-06 13:09:34 +03:00
deeplow
d16961bed6
Security: Dynamically load libreoffice extension (PoC)
Only load the LibreOffice extension for opening hwp/hwpx when it is
actually needed. Adding an extension to libreoffice may allow for it to
run arbitrary code. This makes it trust more scalable by trusting
LibreOffice extensions only for the filetypes which they target.

Reasoning
---------

Assuming a malicious `.oxt` extension this means that the extension has
arbitrary code execution in the container. While this is not an
existential threat in itself, we should not expose every Dangerzone user
to it. This is achieved by dynamically loading the extension at runtime
only when needed.

This ensures that a compromised extension will in its least malicious
form be able to modify the visual content of any hancom office files but
not *every file*. In the more malicious version, if the code execution
manages to do a container escape, this will only affect users that have
converted a Hancom office file.
2023-08-01 14:28:34 +01:00
Moon Sungjoon
3e895adbab
Add hwp hwpx support
hwp/hwpx has several custom MIME types

.hwp:
 - application/x-hwp
 - application/haansofthwp
 - application/vnd.hancom.hwp

.hwpx:
 - application/haansofthwpx
 - application/vnd.hancom.hwpx,
 - application/hwp+zip

Fixes #243
2023-08-01 14:27:18 +01:00
Moon Sungjoon
d8cc24cebe
container: Add H2ORestart for HWP/HWPX support
H2ORestart is a LibreOffice extension which adds Hancom HWP/HWPX (Hangul Word Processor)
supports for LibreOffice. This format is widely used in South Korea.

Version: v0.5.7
Extension Repository: https://github.com/ebandal/H2Orestart/releases
2023-08-01 14:07:51 +01:00
Moon Sungjoon
5b58576854
Add --no-cache on the apk install option 2023-06-26 20:02:58 +03:00
deeplow
814d533c3b
Restructure container code
The files in `container/` no longer make sense to have that name since
the "document to pixels" part will run in Qubes OS in its own virtual
machine.

To adapt to this, this PR does the following:
- Moves all the files in `container` to `dangerzone/conversion`
- Splits the old `container/dangerzone.py` into its two components
  `dangerzone/conversion/{doc_to_pixels,pixels_to_pdf}.py` with a
  `common.py` file for shared functions
- Moves the Dockerfile to the project root and adapts it to the new
  container code location
- Updates the CircleCI config to properly cache Docker images.
- Updates our install scripts to properly build Docker images.
- Adds the new conversion module to the container image, so that it can
  be imported as a package.
- Adapts the container isolation provider to use the new way of calling
  the code.

NOTE: We have made zero changes to the conversion code in this commit,
except for necessary imports in order to factor out some common parts.
Any changes necessary for Qubes integration follow in the subsequent
commits.
2023-06-21 11:44:47 +03:00
Renamed from container/Dockerfile (Browse further)