When the flag is set, the `RUNSC_DEBUG=1` environment variable is added
to the outer container, and stderr is captured in a separate thread, before printing its output.
Add the following two methods in the isolation provider:
1. `.is_available()`: Mainly used for the Container isolation provider,
it specifies whether the container runtime is up and running. May be
used in the future by other similar providers.
2. `.should_wait_install()`: Whether the isolation provider takes a
while to be installed. Should be `True` only for the Container
isolation provider, for the time being.
Revamp the container image installation process in a way that does not
involve using image IDs. We don't want to rely on image IDs anymore,
since they are brittle (see
https://github.com/freedomofpress/dangerzone/issues/933). Instead, we
use image tags, as provided in the `image-id.txt` file. This allows us
to check fast if an image is up to date, and we no longer need to
maintain multiple image IDs from various container runtimes.
Refs #933
Refs #988Fixes#1020
Add the following methods that allow the `Container` isolation provider
to work with tags for the Dangerzone image:
* `list_image_tag()`
* `delete_image_tag()`
* `add_image_tag()`
Move the `is_runtime_available()` method from the base
`IsolationProvider` class, and into the `Dummy` provider class. This
method was originally defined in the base class, in order to be mocked
in our tests for the `Dummy` provider. There's no reason for the `Qubes`
class to have it though, so we can just move it to the `Dummy` provider.
This reverts commit 73b0f8b7d4.
Unfortunately, disabling DirectFS causes a problem in Linux systems that
enable Yama mode 2. Turns out that Tails is such a system, so we have to
revert this change, if we want to support it.
Refs #982
Do not close stderr as part of the Qubes termination logic, since we
need to read the debug logs. This shouldn't affect typical termination
scenarios, since we expect our disposable qube to be either busy reading
from stdin, or writing to stdout. If this is not the case, then
forcefully killing the `qrexec-client-vm` process should unblock the
qube.
Make the Dummy isolation provider follow the rest of the isolation
providers and perform the second part of the conversion on the host. The
first part of the conversion is just a dummy script that reads a file
from stdin and prints pixels to stdout.
Extend the base isolation provider to immediately convert each page to
a PDF, and optionally use OCR. In contract with the way we did things
previously, there are no more two separate stages (document to pixels,
pixels to PDF). We now handle each page individually, for two main
reasons:
1. We don't want to buffer pixel data, either on disk or in memory,
since they take a lot of space, and can potentially leave traces.
2. We can perform these operations in parallel, saving time. This is
more evident when OCR is not used, where the time to convert a page
to pixels, and then back to a PDF are comparable.
Add a new way to detect where the Tesseract data are stored in a user's
system. On Linux, the Tesseract data should be installed via the package
manager. On macOS and Windows, they should be bundled with the
Dangerzone application.
There is also the exception of running Dangerzone locally, where even
on Linux, we should get the Tesseract data from the Dangerzone share/
folder.
Make the dummy provider behave a bit more like the other providers, with
a proper function and termination logic. This will be helpful soon in
the tests.
Instead of killing just the invoked Podman/Docker/qrexec process, kill
the whole process group, to make sure that other components that have
been spawned die as well. In the case of Podman, conmon is one of the
processes that lingers, so that's one way to kill it.
Start the conversion process in a new session, so that we can later on
kill the process group, without killing the controlling script (i.e.,
the Dangezone UI). This should not affect the conversion process in any
other way.
As per Etienne Perot's comment on #908:
> Then it seems to me like it would be easy to simply apply this seccomp
profile under all container runtimes (since there's no reason why the
same image and the same command-line would call different syscalls under
different container runtimes).
Docker Desktop 4.30.0 uses the containerd image store by default, which
generates different IDs for the images, and as a result breaks the logic
we are using when verifying the images IDs are present.
Now, multiple IDs can be stored in the `image-id.txt` file.
Fixes#933
Use "podman" when on Linux, and "docker" otherwise.
This commit also adds a text widget to the interface, showing the actual
content fo the error that happened, to help debug further if needed.
Fixes#212
DirectFS is enabled by default in gVisor to improve I/O performance,
but comes at the cost of enabling the `openat(2)` syscall (with severe
restrictions, but still). As Dangerzone is not performance-sensitive,
and that it is desirable to guarantee for the document conversion
process to not open any files (to mimic some of what SELinux provides),
might as well disable it by default.
See #226.
PyMUPDF logs to stdout by default, which is problematic because we use
the stdout of the conversion process to read the pixel stream of a
document.
Make PyMuPDF always log to stderr, by setting the following environment
variables: PYMUPDF_MESSAGE and PYMUPDF_LOG.
Fixes#877
Set the `container_engine_t` SELinux on the **outer** Podman container,
so that gVisor does not break on systems where SELinux is enforcing.
This label is provided for container engines running within a container,
which fits our `runsc` within `crun` situation.
We have considered using the more permissive `label=disable` option, to
disable SELinux labels altogether, but we want to take advantage of as
many SELinux protections as we can, even for the **outer** container.
Cherry-picked from e1e63d14f8Fixes#880
Set the `container_engine_t` SELinux on the **outer** Podman container,
so that gVisor does not break on systems where SELinux is enforcing.
This label is provided for container engines running within a container,
which fits our `runsc` within `crun` situation.
We have considered using the more permissive `label=disable` option, to
disable SELinux labels altogether, but we want to take advantage of as
many SELinux protections as we can, even for the **outer** container.
Fixes#880
We have encountered several conversions where the `docker kill` command
hangs. Handle this case by specifying a timeout to this command. If the
timeout expires, log a warning and proceed with the rest of the
termination logic (i.e., kill the conversion process).
Fixes#854
We are aware that some Docker Desktop releases before 25.0.0 ship with a
seccomp policy which disables the `ptrace(2)` system call. In such
cases, we opt to use our own seccomp policy which allows this system
call. This seccomp policy is the default one in the latest releases of
Podman, and we use it in Linux distributions where Podman version is <
4.0.
Fixes#846
This wraps the existing container image inside a gVisor-based sandbox.
gVisor is an open-source OCI-compliant container runtime.
It is a userspace reimplementation of the Linux kernel in a
memory-safe language.
It works by creating a sandboxed environment in which regular Linux
applications run, but their system calls are intercepted by gVisor.
gVisor then redirects these system calls and reinterprets them in
its own kernel. This means the host Linux kernel is isolated
from the sandboxed application, thereby providing protection against
Linux container escape attacks.
It also uses `seccomp-bpf` to provide a secondary layer of defense
against container escapes. Even if its userspace kernel gets
compromised, attackers would have to additionally have a Linux
container escape vector, and that exploit would have to fit within
the restricted `seccomp-bpf` rules that gVisor adds on itself.
Fixes#126Fixes#224Fixes#225Fixes#228
Get the (major, minor) parts of the Docker/Podman version, to check if
some specific features can be used, or if we need a fallback. These
features are related with the upcoming gVisor integration, and will be
added in subsequent commits.