When security scanning our poetry.lock file for the **released**
Dangerzone version, retain the Grype ignore
list (.grype.yaml) of the current branch, which would be otherwise
overwritten by a git checkout to the latest released tag (v0.9.0 as of
writing this). This way, we can instruct Grype to ignore vulnerabilities
in the latest Dangerzone release.
Ignore an h11 vulnerability that is present in the Dangerzone
application released from the `v0.9.0` tag. This vulnerability
reportedly affects web servers behind reverse proxies, which is not
Dangerzone's case.
The way to handle the trust for a PGP key has changed in recent versions
of `apt-secure` and now requires the use of PGP keys in something
different than the internal GPG keybox database.
When updating the CI checks, I found that there were a difference between
them and the instructions that were provided in the INSTALL.md file, which
was using the armored version.
The instructions now require the unarmored keys, stored in a `.gpg`
file, and installation of these keys differ depending on the system,
using `sq` on newer distributions.
In case we run on Windows and use Podman Desktop (for which we currently
offer experimental support), we must not pass some Podman flags in order
to avoid conversion errors.
Refs #1127
It now outputs the following:
```
build-linux Build linux packages (.rpm and .deb)
build-macos-arm Build macOS Apple Silicon package (.dmg)
build-macos-intel Build macOS intel package (.dmg)
Dockerfile Regenerate the Dockerfile from its template
fix apply all the suggestions from ruff
help Print this message and exit.
lint Check the code for linting, formatting, and typing issues with ruff and mypy
regenerate-reference-pdfs Regenerate the reference PDFs
test Run the tests
test-large Run large test set
```
Do not unconditionally install the Poetry plugin for exporting
dependencies as a requirements.txt file, since it's used only when
building a Debian package. Keep it instead in the Linux instructions and
when building a Dangerzone environment.
Use `poetry sync` instead of `poetry install --sync`, since the latter
is deprecated and will be removed after June 2025, as seen in the
following warning message:
The `--sync` option is deprecated and slated for removal in the next
minor release after June 2025, use the `poetry sync` command instead.
Remove the tesseract data dir from the doit targets, else we encounter
the following error:
Traceback (most recent call last):
[...]
File "[...]/Library/Caches/pypoetry/virtualenvs/dangerzone-52Yr5wv_-py3.11/lib/python3.11/site-packages/doit/dependency.py", line 39, in get_file_md5
with open(path, 'rb') as file_data:
^^^^^^^^^^^^^^^^
IsADirectoryError: [Errno 21] Is a directory: 'share/tessdata'
Use abbreviated months in the Debian changelog, else we'll have warnings
like the following:
LINE: -- Freedom of the Press Foundation <info@freedom.press> Mon, 31 March 2025 15:57:18 +0300
dpkg-source: warning: dangerzone/debian/changelog(l5): cannot parse non-conformant date '31 March 20
Update the minimum Docker Desktop version prior to the 0.9.0 release.
The new version should also fix a recent Docker bug, whereby the container
stdout was truncated, and caused our conversions to fail.
Fixes#1101
This change adds the registry prefix to the `debian` image we pull
from `docker.io/library`. By adding this, we improve support for
non-interactive builds, as users who do not have a preferred default
registry defined in their local configuration will no longer be prompted
to select which registry to pull this from.
66600f32dc introduced various improvements
to the determinism of the container image in this repository. This
change builds on this effort by ensuring that the base image is pulled
by digest. Image digests are immutable references, unlike tags, which
are mutable (except when optionally configured as immutable in certain
container registries, but not `docker.io`).
This sets the container runtime in the settings, and provides an easy
way to do so for users, without having to mess with the json settings.
When setting the container runtime, one can just pass "podman" and the
path to the executable will be stored in the settings.
Update our `RELEASE.md` so that we don't forget to bump the download
links in `INSTALL.md` prior to tagging a release. This way, we won't
have a versioned `INSTALL.md` page pointing to an older download link.
Note that this means that the latest version of the `INSTALL.md` page
will point to a broken link, in the short period of time between the
pre-release and the actual release. That's not an issue in our case,
because we don't point to the latest version of our `INSTALL.md` from
our `README.md`. We use versioned links instead, and thus we minimize
the chance that a user may encounter a broken link.
Fixes#1100
This is useful to reduce the computation time when creating PDF visual
diffs. Here is a comparison of the same operation using python arrays
and numpy arrays + lookups:
Python arrays:
```
diff took 5.094218431997433 seconds
diff took 3.1553626069980965 seconds
diff took 3.3721952960004273 seconds
diff took 3.2134646750018874 seconds
diff took 3.3410625500000606 seconds
diff took 3.2893160990024626 seconds
```
Numpy:
```
diff took 0.13705662599750212 seconds
diff took 0.05698924000171246 seconds
diff took 0.15319590600120137 seconds
diff took 0.06126453700198908 seconds
diff took 0.12916332699751365 seconds
diff took 0.05839455900058965 seconds
Currently, this is only returning warnings, but we seem to just skip
them. As it's possible to merge PRs when the CI is red, issuing an error
would help us to think about populating this file.
Create a reusable GitHub Actions workflow that does the following:
1. Create a multi-architecture container image for Dangerzone, instead
of having two different tarballs (or no option at all)
2. Build the Dangerzone container image on our supported architectures
(linux/amd64 and linux/arm64). It so happens that GitHub also offers
ARM machine runners, which speeds up the build.
3. Combine the images from these two architectures into one, multi-arch
image.
4. Generate provenance info for each manifest, and the root manifest
list.
5. Check the image's reproduciblity.
Also, remove an older CI job for checking the reproducibility of the
image, which is now obsolete.
Fixes#1035
Make a major change to the `reproduce-image.py` script: drop `diffoci`,
build the container image, and ensure it has the exact same hash as the
source image.
We can drop the `diffoci` script when comparing the two images, because
we are now able build bit-for-bit reproducible images.
Loading an image built with Buildkit in Podman 3.4 messes up its name.
The tag somehow becomes the name of the loaded image.
We know that older Podman versions are not generally affected, since
Podman v3.0.1 on Debian Bullseye works properly. Also, Podman v4.0 is
not affected, so it makes sense to target only Podman v3.4 for a fix.
The fix is simple, tag the image properly based on the expected tag from
`share/image-id.txt` and delete the incorrect tag.
Refs containers/podman#16490
Find all references to the `container.tar.gz` file, and replace them
with references to `container.tar`. Moreover, remove the `--no-save`
argument of `build-image.py` since we now always save the image.
Finally, fix some stale references to Poetry, which are not necessary
anymore.
Invoke the `repro-build` script when building a container image, instead
of the underlying Docker/Podman commands. The `repro-build` script
handles the underlying complexity to call Docker/Podman in a manner that
makes the image reproducible.
Moreover, mirror some arguments from the `repro-build` script, so that
consumers of `build-image.py` can pass them to it.
Important: the resulting image will be in .tar format, not .tar.gz,
starting from this commit. This means that our tests will be broken for
the next few commits.
Fixes#1074
Vendor the `repro-build` script in our codebase, which will be used to
build our container image in a reproducible manner. We prefer to copy it
verbatim for the time-being, since its interface is not stable enough,
and the repro-build repo is not reviewed after all.
In the future, we want to store this script in a separate place, and
pull it when necessary.
Refs #1085
Make our container image more reproducible, by changing the following in
our Dockerfile:
1. Touch `/etc/apt/sources.list` with a UTC timestamp. Else, builds on
different countries (!?) may result to different Unix epochs for the
same date, and therefore different modification time for the
file.
2. Turn the third column of `/etc/shadow` (date of last password change)
for the `dangerzone` user into a constant number.
3. Fix r-s file permissions in some copied files, due to inconsistent
COPY behavior in containerized vs non-containerized Buildkit. This
requires creating a full file hierarchy in a separate directory (see
new_root/).
4. Set a specific modification time for the entrypoint script, because
rewrite-timestamp=true does not overwrite it.
This is a common pitfall of pyinstaller, when using multiprocessing.
In our case, the spawned processes is passed the -B option, thinking
it's python (but it's dangerzone).
> -B Don't write .pyc files on import. See also PYTHONDONTWRITEBYTECODE.
As a result, dangerzone is spawned with the -B option, which doesn't
mean anything for it.
> In the frozen application, sys.executable points to your application
> executable. So when the multiprocessing module in your main process
> attempts to spawn a subprocess (a worker or the resource tracker), it
> runs another instance of your program, with the following arguments for
> resource tracker:
>
> my_program -B -S -I -c "from multiprocessing.resource_tracker import main;main(5)"
https://pyinstaller.org/en/stable/common-issues-and-pitfalls.html#multi-processing
Unpin the PyMuPDF package that we vendor in our Debian packages. We
originally pinned it to version 1.24.11, because it was the last version
that supported Ubuntu Focal, but we can now unpin it, since we have
dropped Ubuntu Focal support.
Fixes#1018
Drop Ubuntu 20.04 (Focal) support, because it's nearing its end-of-life
date. By doing so, we can remove several workarounds and notices we had
in place for this version, and most importantly, remove the pin to our
vendored PyMuPDF package.
Refs #1018
Refs #965
Scan ARM images using Anchore's scan action, by utilizing the Ubuntu ARM
runners provided by GitHub. While our ARM images are used only in macOS
silicon platforms, we can use the Ubuntu ARM runners just for scanning.
Closes#1008
Update our CI job and build instructions with the latest WiX version, so
that we don't encounter any installation issues when new WiX versions
are released.
Also, add a reminder in our release instruction to bump the WiX version
before we start a new release.
Fixes#1087
Remove the Shiboken6 pin for our Linux and macOS platforms, since a new
upstream package has been released, that has wheels for every platform.
Also, remove the `sed` command from our dangerzone.spec, whose purpose
was to nullify this pin for our Fedora packages.
Fixes#1061
Ignore the CVE-2025-0665 vulnerability, since it's a libcurl one, and
the Dangerzone container does not make network calls. Also, it seems
that Debian Bookworm is not affected.
Mask some paths of the outer container in the OCI config of the inner
container. This is done to avoid leaking any sensitive information from
Podman / Docker / gVisor, since we reuse the same rootfs
Refs #1048
Add a dev script for Linux platforms that verifies that a source image
can be reproducibly built from the current Git commit. The
reproducibility check is enforced by the `diffoci` tool, which is
downloaded as part of running the script.
Allow setting a tag for the container image, when building it with the
`build-image.py` script. This should be used for development purposes
only, since the proper image name should be dictated by the script.
Remove our suggestions for not using the container cache, which stemmed
from the fact that our Dangerzone image was not reproducible. Now that
we have switched to Debian Stable and the Dockerfile is all we need to
reproducibly build the exact same container image, we can just use the
cache to speed up builds.
Switch base image from Alpine Linux to Debian Stable, in order to reduce
our image footprint, improve our security posture, and build our
container image reproducibly.
Fixes#1046
Refs #1047
Remove the need to copy the Dangerzone container image (used by the
inner container) within a wrapper gVisor image (used by the outer
container). Instead, use the root of the container filesystem for both
containers. We can do this safely because we don't mount any secrets to
the container, and because gVisor offers a read-only view of the
underlying filesystem
Fixes#1048
Download and copy the following artifacts that will be used for building
a Debian-based Dangerzone container image in the subsequent commits:
* The APT key for the gVisor repo [1]
* A helper script for building reproducible Debian images [2]
[1] https://gvisor.dev/archive.key
[2] d15cf12b26/repro-sources-list.sh
Move container-only build context (currently just the entrypoint script)
from `dangerzone/gvisor_wrapper` to `dangerzone/container_helpers`.
Update the rest of the scripts to use this location as well.
Starting with Debian Trixie, `apt secure` relies on `sqv` to do its verification, which doesn't support the GPG keybox database format.
At the same time, using the standard PGP base64 format makes the verification fail for versions of `apt secure` which relies on `gpg`, as the subkey isn't detected there.
Fixes#1055
When the flag is set, the `RUNSC_DEBUG=1` environment variable is added
to the outer container, and stderr is captured in a separate thread, before printing its output.
Since Poetry 2.0.0, the `export` command has been removed and it's
advised to use the "poetry-plugin-export" package instead.
This commit adds this dependency to the different places it's needed
(debian environments, CI, build instructions, etc).
- Use `ruff` instead of `black` and `isort` in the `lint` target for linting and code formatting.
- Add a new target `fix` which applies all suggestions from `ruff check` and `ruff format`.
Ignore the CVE-2024-11053 vulnerability, since it's a libcurl one, and
the Dangerzone container does not make network calls.
Also, clear the previous vulnerabilities, now that we have a new image
out.
Update our release instructions with a way to run manual tasks via
`doit`. Also, add developer documentation on how to use `doit`, and some
tips and tricks.
Create a `dodo.py` file where we define the dependencies and targets of
each release task, as well as how to run it. Currently, we have
automated all of our Linux and macOS tasks, except for adding Linux
packages to the respective APT/YUM repos.
The tasks we have automated follow below:
build_image Build the container image using ./install/common/build-image.py
check_container_runtime Test that the container runtime is ready.
clean_container_runtime Clean the storage space of the container runtime.
clean_prompt Make sure that the user really wants to run the clean tasks.
debian_deb Build a Debian package for Debian Bookworm.
debian_env Build a Debian Bookworm dev environment.
download_tessdata Download the Tesseract data using ./install/common/download-tessdata.py
fedora_env Build Fedora dev environments.
fedora_env:40 Build Fedora 40 dev environments
fedora_env:41 Build Fedora 41 dev environments
fedora_rpm Build Fedora packages for every supported version.
fedora_rpm:40 Build a Fedora 40 package
fedora_rpm:40-qubes Build a Fedora 40 package for Qubes
fedora_rpm:41 Build a Fedora 41 package
fedora_rpm:41-qubes Build a Fedora 41 package for Qubes
git_archive Build a Git archive of the repo.
init_release_dir Create a directory for release artifacts.
macos_build_dmg Build the macOS .dmg file for Dangerzone.
macos_check_cert Test that the Apple developer certificate can be used.
macos_check_system Run macOS specific system checks, as well as the generic ones.
poetry_install Setup the Poetry environment
Closes#1016
Add the doit automation tool in our `pyproject.toml` and `poetry.lock`
file as a package-related dependency, since we don't want to ship it to
our end users.
Now that our image tarball is not tagged as 'latest', we must first grab
the image tag first, and then refer to it. We can grab the tag either
from `share/image-id.txt` (if available) or with:
docker load dangerzone.rocks/dangerzone --format {{ .Tag }}
Add the following two methods in the isolation provider:
1. `.is_available()`: Mainly used for the Container isolation provider,
it specifies whether the container runtime is up and running. May be
used in the future by other similar providers.
2. `.should_wait_install()`: Whether the isolation provider takes a
while to be installed. Should be `True` only for the Container
isolation provider, for the time being.
Revamp the container image installation process in a way that does not
involve using image IDs. We don't want to rely on image IDs anymore,
since they are brittle (see
https://github.com/freedomofpress/dangerzone/issues/933). Instead, we
use image tags, as provided in the `image-id.txt` file. This allows us
to check fast if an image is up to date, and we no longer need to
maintain multiple image IDs from various container runtimes.
Refs #933
Refs #988Fixes#1020
Build Dangerzone images and tag them with a unique ID that stems from
the Git reop. Note that using tags as image IDs instead of regular image
IDs breaks the current Dangerzone expectations, but this will be
addressed in subsequent commits.
Build Dangerzone images and tag them with a unique ID that stems from
the Git reop. Note that using tags as image IDs instead of regular image
IDs breaks the current Dangerzone expectations, but this will be
addressed in subsequent commits.
Add the following methods that allow the `Container` isolation provider
to work with tags for the Dangerzone image:
* `list_image_tag()`
* `delete_image_tag()`
* `add_image_tag()`
Move the `is_runtime_available()` method from the base
`IsolationProvider` class, and into the `Dummy` provider class. This
method was originally defined in the base class, in order to be mocked
in our tests for the `Dummy` provider. There's no reason for the `Qubes`
class to have it though, so we can just move it to the `Dummy` provider.
Workaround for an issue after upgrading from WiX Toolset v3 to v5 where the previous
version of Dangerzone is not uninstalled during the upgrade by checking if the older installation
exists in "C:\Program Files (x86)\Dangerzone".
Also handle a special case for Dangerzone 0.8.0 which allows choosing the install location
during install by checking if the registry key for it exists.
Note that this seems to allow installing Dangerzone 0.8.0 after installing Dangerzone from this branch.
In this case the installer errors until Dangerzone 0.8.0 is uninstalled again
- WiX Toolset v3 used to validate the msi package by default. In v5 that has moved to a new command, so add a new validation step to the script.
- Also emove the step that uses `insignia.exe` to sign the Dangerzone.msi with the digital signatures from its external cab archives.
In WiX Toolset v4 and newer, insignia is replaced with a new command `wix msi inscribe`, but we tell wix to embed the cabinets into the .msi
(That's what`EmbedCab="yes"` in the Media / MediaTemplate element does) so singning them separately is not necessary. [0]
[0] https://wixtoolset.org/docs/tools/signing/
With this, all the files are organised into Components,
each of which points to a Directory defined in the StandardDirectory element.
This simplifies the Feature element considerable as only thing it needs to
include everything in the built msi is a reference to `ApplicationComponents`
- The Keywords and Description items move under a new SummaryInformation element.
- Shuffle things around so that elements previously under the product element are now under the Package element.
- Rename SummaryCodepage in SummaryInformation to Codepage and remove a duplicate Manufacturer item.
- Remove InstallerVersion and let WiX set it to default value. (500 a.k.a Windows 7)
This commit makes changes to the release instructions, prefering bash
exemples when that's possible. As a result, the QA.md and RELEASE.md
files have been separated and a new `generate-release-tasks.py` script
is making its apparition.
People might not now if the issue is related to docker or not, and we've
had to ask them for additional information after they opened the issue.
This makes it clearer that this information might be useful.
There are various place in our release process
(build/installation/release instructions and CI checks) where we make
sure that the FPF-maintained PySide6 package works in Fedora 39. Now
that Fedora 39 is nearing its EOL date, we can remove those.
Our repo's README.md should point to our INSTALL.md for installation
instructions, and not the other way around. This fixes an issue with
INSTALL.md pointing to a stale README.md version. Updating our README
before tagging is not possible, since the latest version is the one that
our users visit, and it can't point to download links that do not exist.
Fixes#1003
Implement the following steps from the QA docs:
1. Check if the latest Python version that we support is installed. For
example, we currently support Python 3.12, so we add code to check
that the latest Python 3.12.x version is installed.
2. Download the Tesseract data using our script, both on Windows and
Linux.
Increase the size of the `dz` qube in our build instructions. We
increase it from 2GiB (default), to 5GiB (suggested), in order to cater
for some extra space that our build instructions need (e.g., the
download of the Tesseract data).
Previously, the actions were duplicated, due to the fact when developing
we often create feature branches and open pull requests.
This new setup requires us to open pull requests to trigger the CI.
It seem that these tests are flaky, and as a result our CI pipeline is
failing from time to time. This will rerun it automatically when there
is an error.
See https://github.com/freedomofpress/dangerzone/issues/968 for more
information
The filename of the container image tarball that we published in our
release assets has changed from `container.tar.gz` to
`container-0.8.0-i686.tar.gz`. Change the scan action accordingly.
This reverts commit 73b0f8b7d4.
Unfortunately, disabling DirectFS causes a problem in Linux systems that
enable Yama mode 2. Turns out that Tails is such a system, so we have to
revert this change, if we want to support it.
Refs #982
Restore the `isolation_provider.base.kill_process_group()` function,
which was previously mocked, at the end of the
`test_linger_unkillable()` test. This function is initially mocked, in
order to simulate a hang process. After the mocking completes, the test
needs the original function once more, in order to actually kill the
spawned process.
Do not close stderr as part of the Qubes termination logic, since we
need to read the debug logs. This shouldn't affect typical termination
scenarios, since we expect our disposable qube to be either busy reading
from stdin, or writing to stdout. If this is not the case, then
forcefully killing the `qrexec-client-vm` process should unblock the
qube.
Add a doc that contains an MP4 video in it, which has an audio and video
stream. This type of document could not be converted with the latest
Dangerzone releases, because PyMuPDF threw this error in the container's
stdout:
MuPDF error: unsupported error: cannot create appearance stream for
Screen annotations
This error message was treated literally by our client code, which
parsed the first few bytes in order to find out the page height/width.
This resulted to a misleading Dangerzone error, e.g.:
A page exceeded the maximum height
This issue started occurring since 0.6.0, which added streaming support,
and was fixed by commit 3f86e7b465. That
fix was not accompanied by a test document that would ensure we would
not have this regression from now on, so we add it in this
commit.
Refs #877Closes#917
Remove some macOS entitlements that are not necessary for the current
iteration of Dangerzone. Those are the ability to run as a hypervisor,
and the ability to accept network connections. They are a relic from
when we were experimenting with VMs, instead of relying on Docker
Desktop.
Make the Dummy isolation provider follow the rest of the isolation
providers and perform the second part of the conversion on the host. The
first part of the conversion is just a dummy script that reads a file
from stdin and prints pixels to stdout.
Extend the base isolation provider to immediately convert each page to
a PDF, and optionally use OCR. In contract with the way we did things
previously, there are no more two separate stages (document to pixels,
pixels to PDF). We now handle each page individually, for two main
reasons:
1. We don't want to buffer pixel data, either on disk or in memory,
since they take a lot of space, and can potentially leave traces.
2. We can perform these operations in parallel, saving time. This is
more evident when OCR is not used, where the time to convert a page
to pixels, and then back to a PDF are comparable.
The PyMuPDF package was previously mainly used within the Dangerzone
container, as well as on Qubes. With on-host conversion, PyMuPDF will be
used in all supported platforms by default. For this reason, we can
promote it to a main dependency.
Add a new way to detect where the Tesseract data are stored in a user's
system. On Linux, the Tesseract data should be installed via the package
manager. On macOS and Windows, they should be bundled with the
Dangerzone application.
There is also the exception of running Dangerzone locally, where even
on Linux, we should get the Tesseract data from the Dangerzone share/
folder.
Add a Python script that can run in all supported platforms, and can
download and extract the Tesseract language data from GitHub, while
also:
1. Checking that the expected hash matches.
2. Informing the user if the language data have already been downloaded.
3. Extracting only the subset of language data that Dangerzone needs
GitHub actions somehow managed to downgrade our runners from Ubuntu
24.04 to Ubuntu 22.04, even though we use `ubuntu-latest`. Make the
Ubuntu 24.04 requirement more explicit, until GitHub migrates fully to
this version for the `ubuntu-latest` tag.
Fixes#957
Unreleased Fedora versions may refer to themselves as "rawhide", instead
of their version (e.g., "41"). For this reason, we should try and
replace the "rawhide" string with the proper Fedora version.
Fedora 41 has a newer dnf interface (dnf v5), and the config-manager
plugin that we use is not compatible with it. Suggest running it with
`dnf-3` instead, which is present in all Fedora versions.
It seems that the container image for Ubuntu 24.10 also ships with a
default Ubuntu user with UID 1000, so we need to remove it when creating
our dev environment.
Try installing `passt`, which is responsible for user networking in
later Podman releases. If not installed, building the container image
within an Ubuntu 24.10 environment fails with:
setup network: could not find pasta, the network namespace can't be
configured: exec: "pasta": executable file not found in $PATH
Note that this package is not available in older Ubuntu versions. In
these cases, we should swallow installation failures and continue.
Install PyMuPDF under ./dangerzone/vendor, right before we build the
.deb package. We vendor PyMuPDF just for Debian, since the provided
versions don't have OCR support enabled.
Currently, we don't use PyMuPDf on the host, but this will change once
we fully implement the on-host conversion feature.
Refs #625
Add a script that installs PyMuPDF under ./dangerzone/vendor. This will
be useful in subsequent commits, for vendoring PyMuPDF when building
Debian packages.
The PyMuPDF wheels for version 1.24.11 have changed the way they are
being built, which means we have to adapt our Dockerfile in order to
install them properly.
Remove the installation steps for Xvfb, since it's already included in
GitHub actions, and fire up an Xvfb server with disabled host-based
access control.
Initially, we tried to wrap our CI tests with `xvfb-run`, but any
X11 client within our Podman container failed with the following error
message:
Authorization required, but no authorization protocol specified.
This error message is usually thrown when the X11 client does not
provide the magic cookie in the Xauthority file back to the X11 server.
In our case though, we can verify that commands in our Podman container
read the Xauthority file successfully:
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X99"}, 21) = -1 ECONNREFUSED (Connection refused)
close(3) = 0
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0) = 3
getsockopt(3, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
connect(3, {sa_family=AF_UNIX, sun_path="/tmp/.X11-unix/X99"}, 110) = 0
getpeername(3, {sa_family=AF_UNIX, sun_path="/tmp/.X11-unix/X99"}, [124->21]) = 0
uname({sysname="Linux", nodename="dangerzone-dev", ...}) = 0
access("/home/runner/work/dangerzone/dangerzone/cookie", R_OK) = 0
openat(AT_FDCWD, "/home/runner/work/dangerzone/dangerzone/cookie", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0600, st_size=59, ...}) = 0
read(4, "\1\0\0\rfv-az1915-957\0\299\0\22MIT-MAGIC"..., 4096) = 59
read(4, "", 4096) = 0
close(4) = 0
fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
poll([{fd=3, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=3, revents=POLLOUT}])
writev(3, [{iov_base="l\0\v\0\0\0\0\0\0\0\0\0", iov_len=12}, {iov_base="", iov_len=0}], 2) = 12
recvfrom(3, 0x55a5635c0050, 8, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}], 1, -1) = 1 ([{fd=3, revents=POLLIN}])
recvfrom(3, "\0@\v\0\0\0\20\0", 8, 0, NULL, NULL) = 8
recvfrom(3, "Authorization required, but no a"..., 64, 0, NULL, NULL) = 64
write(2, "Authorization required, but no a"..., 64Authorization required, but no authorization protocol specified
) = 64
The line with the magic cookie is:
read(4, "\1\0\0\rfv-az1915-957\0\299\0\22MIT-MAGIC"..., 4096) = 59
Since we are not sure why we are not allowed access to the X11 server
from the Podman container, we decided to disable host-based access
controls altogether. This is not a security concern, since this X11
session is a remote one. However, we shouldn't run tests this way in dev
machines.
Fixes#949
Do not use the `provider_wait` fixture in our termination logic tests,
and switch instead to the `provider` fixture, which instantiates a
typical isolation provider.
The `provider_wait` fixture's goal was to emulate how would the process
behave if it had fully spawned. In practice, this masked some
termination logic issues that became apparent in the WIP on-host
conversion PR. Now that we kill the spawned process via its process
group, we can just use the default isolation provider in our tests.
In practice, in this PR we just do `s/provider_wait/provider`, and
remove some stale code.
Add a fixture that returns our stock Dummy provider. Also, explicitly
use a blocking Dummy provider (`DummyWait`) for a specific test case.
This will prove useful when we stop using the `provider_wait` variant of
our isolation providers in the next commits.
Make the dummy provider behave a bit more like the other providers, with
a proper function and termination logic. This will be helpful soon in
the tests.
Instead of killing just the invoked Podman/Docker/qrexec process, kill
the whole process group, to make sure that other components that have
been spawned die as well. In the case of Podman, conmon is one of the
processes that lingers, so that's one way to kill it.
Start the conversion process in a new session, so that we can later on
kill the process group, without killing the controlling script (i.e.,
the Dangezone UI). This should not affect the conversion process in any
other way.
GitHub provides an Intel macOS runner as `macos-13`. Add it alongside
our M1 macOS runner (`macos-latest`), in order to cover all of our
target environments.
It seems that we need to specify that Python files have LF line endings
on Windows environments, else they will get converted to CRLF. If this
happens, then the container image we build in this environment will have
Python files with wrong endings, and tests will break.
Refs #838 for previous attempt.
Make sure that the Debian package we build conforms to the expected
naming scheme else, it's possible that something is off. A scenario
we've encountered is bumping `share/version.txt`, but not
`debian/changelog`, which would create a Debian package with an older
version.
As part of this change, the dev (build) and end-user test images names
changed from `dangerzone.rocks/*` to `ghcr.io`.
A new `--sync` option is provided in the `env.py` command, in order to
retrieve the images from the registry, or build and upload otherwise.
As per Etienne Perot's comment on #908:
> Then it seems to me like it would be easy to simply apply this seccomp
profile under all container runtimes (since there's no reason why the
same image and the same command-line would call different syscalls under
different container runtimes).
Docker Desktop 4.30.0 uses the containerd image store by default, which
generates different IDs for the images, and as a result breaks the logic
we are using when verifying the images IDs are present.
Now, multiple IDs can be stored in the `image-id.txt` file.
Fixes#933
Switch to using `WixUI_InstallDir` dialog set in the windows installer and add the `WIXUI_INSTALLDIR` property it needs to let user choose where Dangerzone is installed.
resolves#148
Update the gVisor design doc, to better reflect the current state of the
gVisor integration. More specifically, the following have changed since
this design doc was merged:
* We have dropped the need for the `SETFCAP` capability.
* We have added the SELinux label `container_engine_t` to the outer
container.
Use "podman" when on Linux, and "docker" otherwise.
This commit also adds a text widget to the interface, showing the actual
content fo the error that happened, to help debug further if needed.
Fixes#212
Previously, these files where stored inside the repository (under
`dev_scripts/env/`), which could lead to conflicts with some tooling
(black, debian-helper).
(Linux only): as a convenience, here is how to move data to the new
location:
```bash
mkdir -p ~/.local/share/dangerzone-dev
mv dev_scripts/envs/ ~/.local/share/dangerzone-dev/.
```
As a result, a new `debian` folder is now living in the repository.
Debian packaging is now done manually rather than using tools that do
the heavy-lifting for us.
The `build-deb.py` script has also been updated to use `dpkg-buildpackage`
Do not use by default the builder cache, when we build the Dangerzone
container image. This way, we can always have the most fresh result when
we run the `./install/common/build-image.py` command.
If a dev wants to speed up non-release builds, we add the `--use-cache`
flag to use the builder cache.
DirectFS is enabled by default in gVisor to improve I/O performance,
but comes at the cost of enabling the `openat(2)` syscall (with severe
restrictions, but still). As Dangerzone is not performance-sensitive,
and that it is desirable to guarantee for the document conversion
process to not open any files (to mimic some of what SELinux provides),
might as well disable it by default.
See #226.
This removes the need to build the PyMuPDF project by ourselves, but
only when on non-ARM architectures since the wheels for these are not
provided yet.
Changes the `Dockerfile` and `build-image.py` script, introducing a new
`ARCH` flag to conditionally build the wheels.
Update our CircleCI machines for specific tests (Debian Bookworm and
Fedora 40). It seems that the newest Podman version (5.2.1+), when
creating a container using the `--userns nomap` triggers a permission
denied error in older kernels. E.g.:
Error: crun: cannot stat `/tmp/storage-run-1000/containers/overlay-containers/d00932f2600df7b0d8f4cc78e2346487ec92bfd17307127f3ae8d4e5bbc7887b/userdata/hosts`: Permission denied: OCI permission denied
The solution that works for us is to update the machine image from
Ubuntu 20.04 to Ubuntu 22.04.
Some of the files in our large test set can make LibreOffice hang. We
do not have a proper solution for this yet, but we can at least make
the tests timeout quickly, so that they can finish at some point.
Refs #878
PyMUPDF logs to stdout by default, which is problematic because we use
the stdout of the conversion process to read the pixel stream of a
document.
Make PyMuPDF always log to stderr, by setting the following environment
variables: PYMUPDF_MESSAGE and PYMUPDF_LOG.
Fixes#877
Ignore CVE-2024-5175 from our security scans, because Dangerzone is not
affected by it. Our assessment follows:
The affected library, `libaom.so`, is linked by GStreamer's
`libgstaom.so` library. The vulnerable `aom_img_alloc` function is only
used when **encoding** a video to AV1. LibreOffce uses the **decode**
path instead, when generating thumbnails.
Closes#895
Update our `env.py` script to auto-detect the correct Dangerzone package
name. This is useful when building an end-user environment, i.e., a
container image where we copy the respective Dangerzone .deb/.rpm
package and install it via a package manager.
To achieve this, we replace the hardcoded patch level (`-1`) in the
package name with a glob character (`*`). Then, we check in the
respective build directory if there's exactly one match for this
pattern. If yes, we return the full path. If not, we raise an exception.
Note that this limitation was triggered when we were building RPM
packages for the 0.7.0 hotfix release.
Refs #880
Set the `container_engine_t` SELinux on the **outer** Podman container,
so that gVisor does not break on systems where SELinux is enforcing.
This label is provided for container engines running within a container,
which fits our `runsc` within `crun` situation.
We have considered using the more permissive `label=disable` option, to
disable SELinux labels altogether, but we want to take advantage of as
many SELinux protections as we can, even for the **outer** container.
Cherry-picked from e1e63d14f8Fixes#880
Set the `container_engine_t` SELinux on the **outer** Podman container,
so that gVisor does not break on systems where SELinux is enforcing.
This label is provided for container engines running within a container,
which fits our `runsc` within `crun` situation.
We have considered using the more permissive `label=disable` option, to
disable SELinux labels altogether, but we want to take advantage of as
many SELinux protections as we can, even for the **outer** container.
Fixes#880
We believe that Dangerzone is not affected by CVE-2024-5535 for the
following reasons:
1. This CVE affects applications that make network calls. The Dangerzone
container does not perform any such calls, and has no access to the
internet.
2. The OpenSSL devs have marked this issue as low severity.
We have encountered several conversions where the `docker kill` command
hangs. Handle this case by specifying a timeout to this command. If the
timeout expires, log a warning and proceed with the rest of the
termination logic (i.e., kill the conversion process).
Fixes#854
We have a container-specific test that deals with missing OCR files in
the container image. This test _can_ be run under Qubes, and it may
fail since it requires Podman.
Make the pytest guard more strict and don't allow running this test on
Qubes.
Also, fix a typo in the word "omission".
The platform where we run our tests directly affects the isolation
providers we can choose. For instance, we cannot run Qubes tests on a
Windows/macOS platform, nor can we spawn containers in a Qubes platform,
if the `QUBES_CONVERSION` envvar has been specified.
This platform incompatibility was never an issue before, because
Dangerzone is capable of selecting the proper isolation provider under
the hood. However, with the addition of tests that target specific
isolation providers, it's possible that we may run by mistake a test
that does not apply to our platform.
To counter this, we employed `pytest.skipif()` guards around classes,
but we may omit those by mistake. Case in point, the `TestContainer`
class does not have such a guard, which means that we attempt to run
this test case on Qubes and it fails.
Add module-level guards in our isolation provider tests using pytest's
`pytest.skip("...", allow_module_level=True)` function, so that we make
such restrictions more explicit, and less easy to forget when we add a
new class.
Ask the user to add the Dangerzone repo, when following the build
instructions for Qubes. The reason is that on Fedora 39 and 40, there's
no other way to install PySide6 than use our repo.
With the addition of the drag-and-drop QA scenario, the numbering of the
QA steps has changed. Mirror this numbering change in the qa.py script
as well, which tracks which QA scenarios do not apply to Linux
platforms.
We are aware that some Docker Desktop releases before 25.0.0 ship with a
seccomp policy which disables the `ptrace(2)` system call. In such
cases, we opt to use our own seccomp policy which allows this system
call. This seccomp policy is the default one in the latest releases of
Podman, and we use it in Linux distributions where Podman version is <
4.0.
Fixes#846
This was mostly done to fix an issue where `gvisor_wrapper/
entrypoint.py` didn't have the correct line-ending on Windows, leading
to a situation where the containers couldn't start.
This wraps the existing container image inside a gVisor-based sandbox.
gVisor is an open-source OCI-compliant container runtime.
It is a userspace reimplementation of the Linux kernel in a
memory-safe language.
It works by creating a sandboxed environment in which regular Linux
applications run, but their system calls are intercepted by gVisor.
gVisor then redirects these system calls and reinterprets them in
its own kernel. This means the host Linux kernel is isolated
from the sandboxed application, thereby providing protection against
Linux container escape attacks.
It also uses `seccomp-bpf` to provide a secondary layer of defense
against container escapes. Even if its userspace kernel gets
compromised, attackers would have to additionally have a Linux
container escape vector, and that exploit would have to fit within
the restricted `seccomp-bpf` rules that gVisor adds on itself.
Fixes#126Fixes#224Fixes#225Fixes#228
Add Podman's default seccomp policy as of 2024-06-10 [1]. This policy
will be used in subsequent commits in platforms with Podman version 3,
whose seccomp policy does not allow the `ptrace()` syscall.
[1] d3283f8401/pkg/seccomp/seccomp.json
Get the (major, minor) parts of the Docker/Podman version, to check if
some specific features can be used, or if we need a fallback. These
features are related with the upcoming gVisor integration, and will be
added in subsequent commits.
Add a design document for the gVisor integration, which is currently
under review. The associated pull request has lots of architectural
discussions about integrating gVisor, so in this document we collect
them all in one place.
Refs #590
Move the documentation on how to create and use containerized Dangerzone
environments under `docs/developer`, which seems like a more natural
place than a README under `dev_scripts/`.
I'm actually ensure how the previous version was working, but since we
are now loading the pytest fixtures automatically, it uncovered a misuse
in the tests.
The `updater` fixture sets `updater.dangerzone.app` to a magic mock
instance, whereas `qt_updater` returns the real QT app, which is what we
want in our tests.
They ideally should find their way by themselves.
> You don’t need to import the fixture you want to use in a test,
> it automatically gets discovered by pytest. The discovery of fixture
> functions starts at test classes, then test modules, then conftest.py
> files and finally builtin and third party plugins.>
>
> — [pytest docs](https://docs.pytest.org/en/4.6.x/fixture.html#conftest-py-sharing-fixture-functions)
A few minor changes about when to use `==` and when to use `is`.
Basically, this uses `is` for booleans, and `==` for other values.
With a few other changes about coding style which was enforced by
`ruff`.
This commit removes code that's not being used, it can be exceptions
with the `as e` where the exception itself is not used, the same with
`with` statements, and some other parts where there were duplicated
code.
> f-strings are a convenient way to format strings, but they are not
> necessary if there are no placeholder expressions to format. In this
> case, a regular string should be used instead, as an f-string without
> placeholders can be confusing for readers, who may expect such a
> placeholder to be present.
>
> — [ruff docs](https://docs.astral.sh/ruff/rules/f-string-missing-placeholders/)
The minimum python version when installing from source is now python
3.9, as Pyside6 6.7.1 dropped support for python 3.8 (see #780 for more
information).
On Debian-derivatives distributions, the minimum Python version is now
set to 3.8. In practice, because Pyside6 is not packaged for Debian, we
use Pyside2 [0], which is why we can relax the python version requirement.
In practice, when installing from source on an environment where
python3.9 is not the default python, poetry will look for it and use it
if available
> For various reasons, this Python version might not be compatible with
> the python range supported by the project. In this case, Poetry will
> try to find one that is and use it.
>
> [Poetry docs](https://python-poetry.org/docs/managing-environments/)
On Ubuntu Focal (20.04) where Python 3.9 is not installed by default,
it is possible to install it using the `python3.9` package.
Additionally, In version 1.24.3, PyMuPDF changed its package name from `fitz`
to `pymupdf` [2], resulting in a breakage on how it is installed in our
container. This is now fixed.
[0] More information on how Pyside6 packaging affects dangerzone on #221
[1] See [the current status of Pyside6 packaging](https://repology.org/
project/python:pyside6/packages)
[2] PyMuPDF changelog: https://pymupdf.readthedocs.io/en/latest/changes.html#change-log
Bump the PySide6 version used in our user environments to 6.7.1, to
mirror the one we ship to our users, and also fix a segfault issue in
our CI tests.
Refs #801
When building the Dangerzone RPMs, we were seeing the following shebang
warnings:
+ /usr/lib/rpm/redhat/brp-mangle-shebangs
mangling shebang in /usr/lib/python3.12/site-packages/dangerzone/conversion/doc_to_pixels.py from /usr/bin/env python3 to #!/usr/bin/python3
mangling shebang in /usr/lib/python3.12/site-packages/dangerzone/conversion/common.py from /usr/bin/env python3 to #!/usr/bin/python3
mangling shebang in /usr/lib/python3.12/site-packages/dangerzone/conversion/pixels_to_pdf.py from /usr/bin/env python3 to #!/usr/bin/python3
mangling shebang in /etc/qubes-rpc/dz.ConvertDev from /usr/bin/env python3 to #!/usr/bin/python3
mangling shebang in /etc/qubes-rpc/dz.Convert from /bin/sh to #!/usr/bin/sh
These warnings are benign in nature, but coupled with #727, they could
lead to incorrect file permissions.
Remove shebangs from the following files, since they are not executed
directly, but are imported instead:
dangerzone/conversion/common.py
dangerzone/conversion/doc_to_pixels.py
dangerzone/conversion/pixels_to_pdf.py
Also, accept the suggestions by Fedora (/bin/sh -> /usr/bin/sh,
/usr/bin/env python3 -> /usr/bin/python3) for the following files:
qubes/dz.Convert
qubes/dz.ConvertDev
Refs #727
Switch build directory for the `rpmbuild` command from
`./install/linux/rpm-build` to `~/rpmbuild`. The main reason for this is
that we want a build directory that will not be mounted in the
container, since we've experienced issues with file permissions.
Regarding the choice of directories, we went with `~/rpmbuild` because
it's outside the Dangerzone source, and also because it's the default
choice in Fedora [1].
[1]: 3ae1eeafee/rpmdev-setuptree (L60)Closes#727
When building the Dangerzone RPM package, detect if the files bundled in
it have any incorrect permissions. We have seen in the past that
building RPMs from the Dangerzone source, mounted to a macOS Docker
container, can lead to files readable only by the root user (600 /
rw-------).
Refs #727
Dangezone has received a security audit in December 2023, and published
on February 2024. It would be nice for people seeing this project to
learn about this audit.
Add some articles about the Dangerzone project that may be useful for
those evaluating this tool. This article list is not complete, and has
been sampled from various links we have encountered in the past.
Alpine Linux 3.20 was released recently [1]. As a result, the
`alpine:latest` image ref, that our Dockerfile uses, switched from the
3.19 to the 3.20 Alpine Linux release. This release has Python 3.12,
meaning that the following line in our Dockerfile now fails:
COPY --from=pymupdf-build /usr/lib/python3.11/site-packages/fitz/ /usr/lib/python3.11/site-packages/fitz
Bump the Python version in the Python system path to 3.12, so that we
can successfully build the container image.
[1]: https://alpinelinux.org/posts/Alpine-3.20.0-released.html
Zsh users that attempt to run the following command in our Ubuntu/Debian
installation instructions:
echo deb [signed-by=/etc/apt/keyrings/fpf-apt-tools-archive-keyring.gpg] \
https://packages.freedom.press/apt-tools-prod ${VERSION_CODENAME?} main \
| sudo tee /etc/apt/sources.list.d/fpf-apt-tools.list
encounter the following error:
zsh: no matches found:
[signed-by=/etc/apt/keyrings/fpf-apt-tools-archive-keyring.gpg]
Quote this command to ensure compatibility with other shells, and update
our CI checks.
Fixes#805
Inform users that for specific distros and versions, we install some
extra packages (PySide6, conmon), in order to fix some incompatibilities
between Dangerzone and the base system. Provide also a link to the
source / build instructions for the package, as well as any relevant
issues.
Fixes#767
Add a section for our end-users in INSTALL.md, that explains how to
verify that our Dangerzone assets have been signed by our advertised
signing key.
This section explains what are the .asc files that users see next to our
release assets, and how they can verify each asset individually using
GPG. It is heavily inspired by a similar section for OnionShare.
Closes#761
Add a new script called `sign-assets.py`, which produces the hash of all
the Dangerzone assets for a release (Windows/macOS installers, container
image), and signs them individually.
Also update our RELEASE.md document, to incorporate this script into our
release workflow.
Gracefully terminate certain conversion processes that may get stuck
when writing lots of data to stdout. Also, handle a race condition when
a conversion process terminates slightly after the associated container.
Fixes#791
We have recently [1] changed the name of the Dangerzone application to
capital-case "Dangerzone", but this breaks our PDF viewer detection
logic. Adjust our check to exclude Dangerzone from the list.
Fixes#790
[1]: See commit 3d426ed36b
In d632908a44 we improved our
`replace_control_chars()` function, by replacing every control or
invalid Unicode character with a placeholder one. This change, however,
made our debug logs harder to read, since newlines were not preserved.
There are indeed various cases in which replacing newlines is wise
(e.g., in filenames), so we should keep this behavior by default.
However, specifically for reading debug logs, we add an option to keep
newlines to improve readability, at no expense to security.
The `exit()` [1] function is not necessarily present in every Python
environment, as it's added by the `site` module. Also, this function is
"[...] useful for the interactive interpreter shell and should not be
used in programs"
For this reason, we replace all such occurrences with `sys.exit()` [2],
which is the canonical function to exit Python programs.
[1]: https://docs.python.org/3/library/constants.html#exit
[2]: https://docs.python.org/3/library/sys.html#sys.exit
On Windows, if we don't use the `startupinfo=` argument of
subprocess.Popen, then a terminal window will flash while running the
command.
Use `startupinfo=` when killing a container, as we do for every other
command.
On Windows, if we somehow attempt to archive the same document twice
(e.g, because it got archived once, and then we copy it back), we will
get an error, because Windows does not overwrite the target path, if it
already exists.
Fix this issue by always removing the previously archived version, when
performing the next archival action, and update our tests.
Quote from `pytest-mock` docs [1]:
> The purpose of this plugin is to make the use of context managers and
> function decorators for mocking unnecessary, so it will emit a warning
> when used as such.
Thus using it as a context manager currently produces a warning during
test runs in CI which is extra noise that could make new (possibly more
important) warnings harder to spot.
[1]: https://pytest-mock.readthedocs.io/en/latest/usage.html#usage-as-context-manager
On Linux, the `%u` field code results in multiple Dangerzone instances
being launched when opening multiple documents with Dangerzone from
e.g. Nautilus, as `%u` signifies that the application (in this case -
Dangerzone) can only open a single file/URL at once.
This changes the field code to `%F` as Dangerzone (now) supports
converting multiple files at once. We use `%F` (multiple local files)
instead of `%U` (multiple files and/or URLs) since Dangerzone does not
support opening URLs.
See also the Desktop Entry Specification [1] for more information on the
field codes.
Fixes#797
[1]: https://specifications.freedesktop.org/desktop-entry-spec/latest/ar01s07.html
Currently, the app ID of the Dangerzone GUI application when running
under Wayland is `python3`, which is not very useful if one wants to
automate some action related to the Dangerzone application window (e.g.
to always start Dangerzone window in floating mode under Sway WM).
Setting the desktop filename property also sets the app ID of the
application under Wayland. According to Qt documentation[1], the
property value should be the name of the application's .desktop file but
without the extension.
Qt documentation also states:
> This property gives a precise indication of what desktop entry
> represents the application and it is needed by the windowing system to
> retrieve such information without resorting to imprecise heuristics.
Therefore I also think that setting this property is needed to display
the correct application name and icon (taken from the .desktop entry)
when running under certain windowing systems (like Wayland)
(see also #402).
Note that this property is not enough, as we've encountered systems
where setting just the desktop file name does not alter the detected
application name by the window manager. For this reason, we also use
set the application name [2] to `dangerzone`, to remove any ambiguity.
[1]: https://doc.qt.io/qt-6/qguiapplication.html#desktopFileName-prop
[2]: https://doc.qt.io/qt-6/qcoreapplication.html#applicationName-propFixes#402
Bump our poetry.lock file, to get the latest versions of our
dependencies. Note that we are aware that this bump does not bring in
the latest PySide6 version.
Refs #773
Update the description in the pre-release task for PySide6, since a lot
has changed after writing this section. Now that `python3-pyside6` is in
the Fedora Rawhide repo, and will soon get backported in the stable
repos, we no longer check for newer upstream versions, but if Fedora
finally did the backport.
Refs freedomofpress/maint-dangerzone-pyside6#5
On Unix systems a filename can be a sequence of bytes that is not valid
UTF-8. Python uses[1] surrogate escapes to allow to decode such
filenames to Unicode (bytes that cannot be decoded are replaced by a
surrogate; upon encoding the surrogate is converted to the original
byte).
From `click` docs[2]:
> Invalid bytes or surrogate escapes will raise an error when written
> to a stream with `errors="strict"`. This will typically happen with
> `stdout` when the locale is something like `en_GB.UTF-8`.
To fix that, we use `utils.replace_control_chars()` before printing the
filenames to `stdout` so that surrogate escapes are replaced by �.
Fixes#768
The `util.replace_control_chars()` function was overly strict, and
would replace every non-ASCII character with "_". This included both
control characters, as well as normal characters in a non-English
alphabet.
Relax these restrictions by checking each character and deciding if it's
a Unicode control character, using the `unicodedata` Python package.
With this change, emojis and non-English letters are now allowed.
Add a pytest fixture that crafts a filename with Unicode characters that
are not considered common for this use. By default, this fixture uses
an invalid Unicode character as well, but we strip it in case of macOS
(APFS) since filenames must be UTF-8 encoded.
[1]: https://en.wikipedia.org/wiki/Filename#Comparison_of_filename_limitations
Include the test dependencies when linting, especially `pytest`. We need
this because `mypy` cannot understand the `pytest.raises` exception, and
specifically the fact that it catches exceptions. It assumes that the
next code block is unreachable, since it doesn't see any try ... except.
Add termination tests for the Dummy provider, so that we can have
cross-platform coverage in our Windows/macOS CI runners, which can't use
the Container isolation providers.
Add termination-related tests for Qubes. To achieve this, we need
to make a change to the Qubes isolation provider. More specifically,
we need to make the isolation provider yield control to the caller only
when the disposable qube is up and running.
Qubes does not provide us a solid guarantee to do so, but we've found a
hacky workaround, whereby we wait until the `qrexec-client-vm` process
opens a `/dev/xen` character device. This should happen, in theory, once
the disposable qube is ready, and has sent a `MSG_SERVICE_CONNECT` RPC
message to the caller.
Add termination-related tests for containers. To achieve this, we need
to make a change to the container isolation provider. More specifically,
we need to make the isolation provider yield control to the caller only
when the container is up and running. Failure to do so may lead to
lingering processes.
Add some test cases in the isolation provider tests, that check how it
behaves when a process completes successfully, lingers, or cannot
terminate.
These tests cannot run yet, since they must be imported by a concrete
isolation provider test class. In subsequent commits, we will start
enabling them.
Previously, we always assumed that the spawned process would quit
within 3 seconds. This was an arbitrary call, and did not work in
practice.
We can improve our standing here by doing the following:
1. Make `Popen.wait()` calls take a generous amount of time (since they
are usually on the sad path), and handle any timeout errors that they
throw. This way, a slow conversion process cleanup does not take too
much of our users time, nor is it reported as an error.
2. Always make sure that once the conversion of doc to pixels is over,
the corresponding process will finish within a reasonable amount of
time as well.
Fixes#749
Get the exit code of the spawned process for the doc-to-pixels phase,
without timing out. More specifically, if the spawned process has not
finished within a generous amount of time (hardcode to 15 seconds),
return UnexpectedConversionError, with a custom message.
This way, the happy path is not affected, and we still make our best to
learn the underlying cause of the I/O error.
Extend the IsolationProvider class with a
`terminate_doc_to_pixels_proc()` method, which must be implemented by
the Qubes/Container providers and gracefully terminate a process started
for the doc to pixels phase.
Refs #563
Set a unique name for spawned containers, based on the ID of the
provided document. This ID is not globally unique, as it has few bits of
entropy. However, since we only want to avoid collisions within a
single Dangerzone invocation, and since we can't support multiple
containers running in parallel, this ID will suffice.
Pass the Document instance that will be converted to the
`IsolationProvider.start_doc_to_pixels_proc()` method. Concrete classes
can then associate this name with the started process, so that they can
later on kill it.
Extend our CircleCI jobs to run CI tests in Ubuntu Noble.
This commit also adds support for building the Dangerzone .deb package
in Ubuntu Noble, but does not actually enable it. The reason is that
stdeb, which produces our Debian packages, does not work with Python
3.12, which ships with Ubuntu Noble
Refs #773
Our end-user Fedora environments, that we create for testing how
Dangerzone would operate on a clean Fedora system, require PySide6 to be
installed. This package is not available from the official Fedora repos
yet.
We have a way instead to check the poetry.lock file, grab the latest
PySide6 version from there, and install it from a URL. This is no longer
necessary, now that PySide6 6.7.0 will soon be available in all stable
Fedora releases. Since the last release maintained by FPF will be
6.6.3.1, we should pin this version in our env.py script. This way, we
can bump poetry.lock independently, and let Windows/macOS users get
different versions.
Refs freedomofpress/maint-dangerzone-pyside6#5
The `checkout`, `setup-python`, `upload-artifact` and `download-artifact`
actions produce warnings about deprecated Node.js 16.
Update the actions to use Node.js 20.
Previously settings was implicitly tested on tests/gui/test_updater.py.
However this was concerned with updater-related tests only, which
incidentally covered almost all of settings.py. However, a few tests
were missing. This commit increases the test coverage, but also tests
additional test conditions.
The goal is to help us increase the test coverage of the previous
scenario, which tested for the persistence of user data (settings). This
way we can drop the requirement to test this on linux hosts, which is
slightly harder (more cumbersome) to do.
Settings().set() would fail if we were trying to set a setting that did
not exist before. The reason is because before setting it would try to
get the previous value, but though direct key access, which would lead
to an exception.
The previous scenario 10 tested the handling of state upon Dangerzone
updates. This, however was particularly difficult to do on Linux due to
the need to add a repository and install, especially in our
semi-automated QA environment.
For this reason this commits removes Linux from this scenario and moves
it closer to the top of the scenarios list to reduce the change of
state "contamination". In other words, before testing the new version,
the tester now installs a previous version and then the new one, thus
guaranteeing that there is no inconsistent state due to installing an
earlier version later in QA.
Fixes#719
The problem (MuPDF C++ bindings generation breakage) was
apparently caused by a recent libclang update on pypi, and
fixed in the 1.24.0 release[1].
Fixes#750
[1]: https://github.com/pymupdf/PyMuPDF/issues/3279
PyMuPDF has some hardcoded log messages that print to stdout [1]. We don't
have a way to silence them, because they don't use the Python logging
infrastructure.
What we can do here is silence a particular call that's been creating
debug messages. For a long term solution, we have sent a PR to the
PyMuPDF team, and we will follow up there [2].
Fixes#700
[1]: https://github.com/freedomofpress/dangerzone/issues/700
[2]: https://github.com/pymupdf/PyMuPDF/pull/3137
There are times where we may want to build the container image for
testing, but compression takes too much time. If we don't plan to use
this image for production builds, we can specify instead a compression
level that is so low, that the image will be compressed instantly.
In this commit, we allow the user to specify the Gzip compression level,
and even set it to 0. The default will always be 9, so that we don't
make a mistake during release.
For a while now, we didn't get logs for the second-stage conversion
when using containers. Extend the code to log any captured output from
the second stage conversion, only if we run Dangerzone via our dev
entrypoint.
Note that the Qubes isolation provider was always logging output from
the second stage of the conversion.
On Qubes the conversion in dev mode would fail when converting from a
Fedora 38 development qube via a Fedora 39 disposable qube. The reason
was that dz.ConvertDev was receiving `.pyc` files, which were compiled
for python 3.11 but running on python 3.12.
Unfortunately PyZipFile objects cannot send source python files, even
though the documentation is a little bit unclear on this [1].
Fixes#723
[1]: https://docs.python.org/3/library/zipfile.html#pyzipfile-objects
The file container-pip-dependencies.txt was being left a directory when
building the docker image. This meant that it was being packaged when it
wasn't supposed to.
To avoid this, we remove file with the help from a context manager.
The change is minimal and the biggest part of the diff are indentation
changes.
Fixes#739
Simplifies the release announcement drafting by providing some
templates. It would have been preferable to be a .github config file,
but GitHub does not yet support content templates for release notes.
Provide a fix for an OCR bug that affected Fedora 38 templates of Qubes
OS. In that specific configuration, the PyMuPDF version accepts the
Tesseract data directory only from the `TESSDATA_PREFIX` environment
variable. Our mistake was that we were setting this environment variable
in a dev script, instead of setting it for all configurations.
In this commit, we set an attribute in the fitz.fitz module, so that
both dev scripts and end-user installations can work. This is hacky, but
it targets an old PyMuPDF release after all, so we don't expect things
to break in the long run.
Fixes#737
Accept `.svg` and `.bmp` files when browsing via the Dangerzone GUI.
Support for these extensions has already been added in the converter
code that runs in the sandbox (cd99122385)
but they were erroneously left out from the filter in the Dangerzone
main window.
Do not throw exceptions for unknown error codes. If
`get_proc_exception()` gets called from within an exception context and
raises an exception itself, then this exception will not get caught, and
it will get lost.
Prefer instead to return an exception class that we have for this
purpose, and show to the user the unknown error code of the converesion
process.
Inform testers that the container code no longer returns "UNTRUSTED >"
strings in its output. Every string is trusted now, and the output will
be similar for container and Qubes isolation providers alike.
When we get an early EOF from the converter process, we should
immediately get the exit code of that process, to find out the actual
underlying error. Currently, the exception we raise masks the underlying
error.
Raise a ConverterProcException, that in turns makes our error handling
code read the exit code of the spawned process, and converts it to a
helpful error message.
Fixes#714
50% would show twice in the conversion progress due to an overlap in
conversion progress values. The doc_to_pixels would be from 0-50% and
the pixels_to_pdf from 50%-100%.
This commit makes the first part go from 0 to 49% instead.
Fixes#715
On Windows platforms, we can't consume the stdin using select(), because
it's not available for pipes [1]. We can instead consume it using some
native Windows calls.
[1]: From https://docs.python.org/3/library/select.html#select.select:
"File objects on Windows are not acceptable, but sockets are. On
Windows, the underlying select() function is provided by the
WinSock library, and does not handle file descriptors that don’t
originate from WinSock."
Update the build instructions for Ubuntu Jammy regarding conmon, now
that oldstable-proposed-updates no longer offers a patched conmon
package. Propose instead to install conmon from our apt-tools-prod repo.
Instead of installing a patched conmon version from the
oldstable-proposed-updates repo, install it from our apt-tools-prod
repo. This applies to just Ubuntu Jammy, since the rest of the platforms
don't have this problem.
Now that the conmon package with version 2.0.25+ds1-1.1+deb11u1 has been
released [1] for Debian Bullseye, there is no need to install it from
the oldstable-proposed-updates repo any more.
[1]: https://tracker.debian.org/pkg/conmon
PyMuPDF 1.23.9 swapped the new fitz implementation (fitz_new)
with the fitz module. In the new module there are prints in the code
that interfere with our stdout for sending JSON from the container.
Pinning the version seems to have no adverse consequences [1], since
fitz_old hasn't had significant changes and it gives breathing room for
the print-related issue to be tackled in PR [2].
Fixes temporarily #700
[1]: https://github.com/freedomofpress/dangerzone/issues/700#issuecomment-1938357651
[2]: https://github.com/pymupdf/PyMuPDF/pull/3137
The container image does not need the TESSDATA_PREFIX env variable since
its PyMuPDF version is new enough to support `tessdata` as an argument
when calling the PyMuPDF tesseract method.
Switching from mounting files to writing to stdout has introduced some
Podman crashes in specific environments (Ubuntu Jammy / Debian Bullseye)
due to a conmon bug that affects version 2.0.25.
Fixing it for various permutations of the environments we support
requires the following:
1. CI tests: Install conmon from the oldstable-proposed-updates in
our Debian Bullseye / Ubuntu Jammy dev/end-user environments.
2. Developers: Add a line in BUILD.md that suggests users to install
conmon from the oldstable-proposed-updates repo, or some other repo
they prefer.
3. End-user installations: We will build conmon for Ubuntu Jammy, and
wait until the proposed updates repo gets merged in Debian Bullseye.
Fixes#685
Since the progress information is now inferred on host based on the
number of pages obtained, progress-tracking variables should be removed
from the server.
Remove timeouts due to several reasons:
1. Lost purpose: after implementing the containers page streaming the
only subprocess we have left is LibreOffice. So don't have such a
big risk of commands hanging (the original reason for timeouts).
2. Little benefit: predicting execution time is generically unsolvable
computer science problem. Ultimately we were guessing an arbitrary
time based on the number of pages and the document size. As a guess
we made it pretty lax (30s per page or MB). A document hanging for
this long will probably lead to user frustration in any case and the
user may be compelled to abort the conversion.
3. Technical Challenges with non-blocking timeout: there have been
several technical challenges in keeping timeouts that we've made effort
to accommodate. A significant one was having to do non-blocking read to
ensure we could timeout when reading conversion stream (and then used
here)
Fixes#687
This reverts commit fea193e935.
This is part of the purge of timeout-related code since we no longer
need it [1]. Non-blocking reads were introduced in the reverted commit
in order to be able to cut a stream mid-way due to a timeout. This is
no longer needed now that we're getting rid of timeouts.
[1]: https://github.com/freedomofpress/dangerzone/issues/687
If we increased the number of parallel conversions, we'd run into an
issue where the streams were getting mixed together. This was because
the Converter.proc was a single attribute. This breaks it down into a
local variable such that this mixup doesn't happen.
Conversions methods had changed and that was part of the reason why
the tests were failing. Furthermore, due to the `provider.proc`, which
stores the associated qrexec / container process, "server" exceptions
raise a IterruptedConversion error (now ConverterProcException), which
then requires interpretation of the process exit code to obtain the
"real" exception.
Avoids downloading the container image 4 times in the multi-stage build
by first pulling the alpine image once and then building without any
pulls.
Implemented following a suggestion of @apyrgio.
Now that only the second container can send JSON-encoded progress
information, we can the untrusted JSON parsing. The parse_progress was
also renamed to `parse_progress_trusted` to ensure future developers
don't mistake this as a safe method.
The old methods for sending untrusted JSON were repurposed to send the
progress instead to stderr for troubleshooting in development mode.
Fixes#456
If one converted more than one document, since the state of
IsolationProvider.percentage would be stored in the IsolationProvider
instance, it would get reused for the second document. The fix is to
keep it as a local variable, but we can explore having progress stored
on the document itself, for example. Or having one IsolationProvider per
conversion.
Merge Qubes and Containers isolation providers core code into the class
parent IsolationProviders abstract class.
This is done by streaming pages in containers for exclusively in first
conversion process. The commit is rather large due to the multiple
interdependencies of the code, making it difficult to split into various
commits.
The main conversion method (_convert) now in the superclass simply calls
two methods:
- doc_to_pixels()
- pixels_to_pdf()
Critically, doc_to_pixels is implemented in the superclass, diverging
only in a specialized method called "start_doc_to_pixels_proc()". This
method obtains the process responsible that communicates with the
isolation provider (container / disp VM) via `podman/docker` and qrexec
on Containers and Qubes respectively.
Known regressions:
- progress reports stopped working on containers
Fixes#443
Reclaim some storage space in the middle of the CI job that builds and
installs Dangerzone in Fedora. The reason is that previously, we
encountered an issues with CI runners running out of space.
Explain what happens when we bump our `poetry.lock`, and a new
Pyside6 version. Also, have a step-by-step guide on how the maintainer
should create a new PySide6 RPM and update FPF's repo, so that
Dangerzone can be released.
Add some Fedora CI jobs that build RPMs, install them in an end-user
environment, and make a simple conversion and GUI import check. These
are basically smoke tests for Fedora, similar to the ones we have for
Debian.
Now that we can create a Dangerzone RPM that depends on PySide6, we can
officially support Fedora 39 as a platform. Add this platform in our CI
tests, as well as our install/release notes.
Fixes#606
Extend the env.py script to build an end-user, Fedora 39+ environment
with PySide6 installed, as a regular RPM package. Previously, this was
only possible for development environments with PySide6 downloaded from
PyPI.
As a way to simplify builds, the env.py script offers the option to
download the RPM package itself from FPF's RPM repo [1], if the package
has been uploaded.
[1]: https://packages.freedom.press/yum-tools-prod
Fedora 39 ships with Python 3.12 by default, which Dangerzone previously
did not support due to limitations from the PySide6 package. Now that
the PySide6 package has been updated to 6.6.1, and the limitation has
lifted, we should to reflect this in pyproject.toml.
Install a subset of Podman dependencies, so that we don't also install
Systemd. Doing so can introduce some subtle issues of its own, which is
why we prefer cherry-picking the Podman packages we really need.
Fixes#689
Make Poetry include data files only in the source distribution, and not
on our wheels. This mainly makes RPM packaging a bit easier, but does
not solve the problem of how to install files to
`/usr/share/dangerzone`.
Also, include files using globs, which is the way Poetry prefers.
Fixes#678
Refs #677
Our security scans for the released container image have flagged
CVE-2023-7104. Our assessment is that this CVE doesn't affect
Dangerzone, mainly because our understanding is that attackers cannot
embed SQLite dbs within LibreOffice spreadsheets.
Add the following functionality to the build image script:
1. Let the user choose the container runtime of their choice. In some
systems, both Docker and Podman may be available, so we need to let
the user choose which runtime they want.
2. Let users choose if they want to save the image. For non-production
builds, we may want to simply build the container image, without
the time penalty of compression.
PyMuPDF replaced the need for almost all dependencies, which this commit
now removes.
We are also removing tesseract-ocr as a dependency since
(to our surprise) PyMuPDF ships directly with tesseract binaries [1].
However, now that tesseract-ocr is not available directly as a binary
tool, the `test_ocr.py` needed to be changed.
Fixes#658
[1]: https://github.com/freedomofpress/dangerzone/issues/658#issuecomment-1861033149
Some tests [1] lead to the conclusion that ocr_compression does the same
to the file (performance and size-wise) to the file as deflating images
when saving the file. However, both methods active do add a bit of extra
time. For this reason we're disabling the image deflation (default
option).
[1]: https://github.com/freedomofpress/dangerzone/pull/622#discussion_r1434042296
Qubes does on-host pixels-to-pdf whereas the containers version doesn't.
This leads to an issue where on the containers version it tries to load
fitz, which isn't installed there, just because it's trying to check if
it should run the Qubes version.
The error it was showing was something like this:
ImportError while loading conftest '/home/user/dangerzone/tests/conftest.py'.
tests/__init__.py:8: in <module>
from dangerzone.document import SAFE_EXTENSION
dangerzone/__init__.py:16: in <module>
from .gui import gui_main as main
dangerzone/gui/__init__.py:28: in <module>
from ..isolation_provider.qubes import Qubes, is_qubes_native_conversion
dangerzone/isolation_provider/qubes.py:15: in <module>
from ..conversion.pixels_to_pdf import PixelsToPDF
dangerzone/conversion/pixels_to_pdf.py:16: in <module>
import fitz
E ModuleNotFoundError: No module named 'fitz'
For context see discussion in [1].
[1]: https://github.com/freedomofpress/dangerzone/pull/622#issuecomment-1839164885
Breaks down the container build into multiple stages in order to speed
up build times. Building PyMuPDF was taking too long and this way it can
be cached.
The original version was made by @apyrgio
Ensure that when the container image is installing pymupdf (unavailable
in the repos) with verified hashes. To do so, it has the pymupdf
dependency declared in a "container" group in `pyproject.toml`, which
then gets exported into a requirements.txt, which is then used for
hash-verification when building the container.
Because this required modifying the container image build scripts, they
were all merged to avoid duplicate code. This was an overdue change
anyways.
We're intentionally bypassing PEP 668 [1], which prevents the
installation of non-distro python wheels alongside system packages to
avoid incompatibilities at distro-level.
We are intentionally bypassing this since our container image is a
controlled environment (we only ship a version after rigorous testing).
[1]: https://peps.python.org/pep-0668/
The original document was larger in dimensions than the original one due
to a mismatch in DPI settings. When converting documents to pixels we
were setting the DPI to 150 pixels per inch. Then when converting back
into a PDF we were using 70 DPI. This difference would result in an
overall larger document in dimensions (though not necessarily in file
size).
Fixes#626
Adding PyMuPDF essentially make the code much simpler since it can do
everything that we'd need multiple programs for. It also includes
tesseract-OCR integration, which this commit makes use of.
Timeout can no longer be used since we're not calling a subprocess. We
could still implement it, but it's more worthy to reply in
yet-to-implement client-side timeouts (in containers).
Use PyMuPDF (AGPL-licensed) within the container conversion to replace
the pdf conversion to RGB. This massively simplifies the code since
PyMuPDF is a native python library.
Many instructions relied on the fact that the developer would have to
copy over the RPC policies and install the dependencies manually on the
template. This is no longer needed since a Qubes-built package ships
the necessary RPC policies and dependencies.
Removing the dependencies installation also helps with documentation
maintenance since it would be yet another place where we would need to
keep the dependency list up to date.
Make the first part of the Dangerzone development just to install the
Qubes RPC policies. Poetry install and other development related tasks
should be pointed to in the Fedora part of the instructions to avoid
duplication.
Create a new GitHub Actions workflow which aims to continuously test our
official installation instructions. The way we do it is the following:
1. Create two jobs, one for the Debian-based distros, and one for Fedora
ones.
2. Copy the instructions from INSTALL.md into each job.
3. Create a matrix that runs the installation jobs in parallel, for each
supported distro and version.
The jobs will run only on 00:00 UTC, and not on every PR, since it
wouldn't make sense otherwise.
Fix#653
Add a script to upload release assets to GitHub. This script can take
either a release ID, a Git tag, or the latest draft release.
Note that while GitHub's official client can upload assets to releases,
it cannot upload them to draft releases [1], hence why we created this
script.
[1]: https://cli.github.com/manual/gh_release_upload
This PR reverts the patch that disables HWP / HWPX conversion on MacOS M1.
It does not fix conversion on Qubes OS (#494).
Previously, HWP / HWPX conversion didn't work on MacOS (Apple silicon CPU) (#498)
because libreoffice wasn't built with Java support on Alpine Linux for ARM (aarch64).
Gratefully, the Alpine team has enabled Java support on the aarch64
system [1], so we can enable it again for ARM architectures.
And this patch is included in Alpine 3.19
This commit was included in #541 and reverted on #562 due to a stability issue.
Fixes#498
[1]: 74d443f479
Dangerzone was failing to convert documents in Qubes due to missing
client-side dependencies. In particular poppler-utils, ghostscript and
graphicsmagick.
Fixes#647
Our security scans previously alerted us on critical CVEs that have a
fix. In this commit, we ask to be alerted on CVEs that don't have a fix
yet, so that we can have them in our radar.
Since the introduction of these security checks, we have only once
encountered a case where our container was vulnerable to a CVE that
Alpine Linux had not fixed yet. This means that the maintenance burden
of this change will probably be minimal.
In Qubes the disposable netVM is internet connected. For this reason,
on Qubes we chose create our own disposable VM (dz-dvm). However, in
reality this could still be bypassed since dz-dvm had the default
disposable dispvm.
By setting the default_dispvm to '' we prevent this bypass. For VMs
users who have already followed the setup instructions, the following
command should (to be ran in dom0) will fix this issue:
qvm-prefs dz-dvm default_dispvm ''
By using `--skip / --extend-skip .gitignore`, we actually never read the
.gitignore file. We have to use `--skip-gitignore` instead.
This requires Git in the development environment, so we need to install
Git in our CI runners as well.
Fix a bug in the "Change Selection" action, whereby changing your
selection and picking files from another directory results in:
"Dangerzone does not support adding documents from multiple
locations. The newly added documents were ignored."
To fix this, change the output directory when we change selection as
well.
The original intention of leaving the update checkbox in the hamburger
menu was to let non-supported Linux distros (e.g. compiled from source)
to check for updates. However, on Linux it ended up being disabled
forcefully by default on startup.
This takes into account an overriden update checkbox.
Fixes#596
This commit fixes 3 small issues with the way we produce our Qubes RPM:
1. The `.exists()` method follows symlinks by default, whereas we want
to check if a symlink exists. This functionality has been added in
Python
3.12.
Instead of checking if a symlink exists and then removing it, simply
remove it and don't throw an error if it doesn't exist in the first
place.
2. The `dz.Convert*` policies were not installed with the executable bit
set, therefore the qube could not start.
3. The `dz.ConvertDev` policy in particular had an ambiguous shebang,
thus we change it to explicitly call Python3
If a Qubes conversion encounters an exception that is not a subclass of
ConversionException, it will still show a preview of a file that does
not exist.
Send an error progress report in that case, so that the GUI code can
detect that an error occurred and not open a file preview
Fixes#581
Create a temporary dir before the conversion begins, and store every
file necessary for the conversion there. We are mostly concerned about
the second stage of the conversion, which runs in the host. The first
stage runs in a disposable qube and cleanup is implicit.
Fixes#575Fixes#436
Extend the PixelsToPDF converter by adding an additional `tempdir`
argument. This argument can be used to make the conversion use a
different temporary directory other than `/tmp`.
For containers, this extra arguments makes no difference, as it won't be
used. For Qubes, this argument will allow storing files in a temporary
dir that will be cleaned up once the conversion completes. Previously,
these files would linger in the user's `/tmp`.
Refs #575
If a command encounters an error or times out during the second stage of
the conversion in Qubes, handle it the same way as we would have handled
it in the first stage:
1. Get its error message.
2. Throw an UnexpectedConversionError exception, with the original
message.
Note that, because the second stage takes place locally, users will see
the original content of the error.
Refs #567Closes#430
This should only affect the alpha version of Qubes OS (in containers
it only allows the attacker to control the timeout). In short, an
attacker could have PDF metadata that would show before "Pages:" in
the `pdfinfo` command output and this would essentially override the
number of pages measured in the server. This could enable the attacker
to shorten the number of pages of a document for example.
Fixes#565
When qrexec-client-vm fails, it could be a symptom of various issues:
- the system being out of RAM
- dz-dvm not existing
The exit code is the same in all cases (126), which makes it
particularly tricky to solve in the client application. For this reason
the approach is now to tell the user to see the qubes error notification
on the top right of their screen.
Add a sanity check at the end of the conversion from doc to pixels, to
ensure that the resulting document will have the same number of pages as
the original one.
Refs #560
Stream page data back to the caller, immediately after we read them from
pdftoppm. This way, we have more accurate progress reports and timeouts.
Fixes#557
Introduce 4 new methods that can be overloaded by the Qubes isolation
provider to stream page data/metadata back to the caller. For the time
being, these methods do what they did before, i.e., write this info in
files within the pixels directory.
Do not read a line from the command output and then check if
we are at EOF, because it's possible that the writer immediately exited
after writing the last line of output. Instead, switch the order of
actions.
This is a very serious bug that can lead to Dangerzone excluding the
last page of the document. It should have bit us right from the start
(see aeeed411a0), but it seems that the
small period of time it takes the kernel to close the file descriptors
was hiding this bug.
Fixes#560
This preserves isort's default behavior of ignoring virtualenvs with
common names like `venv` or `.venv`, which is helpful when running
`isort` in a local development environment that uses such a
virtualenv.
To be consistent with Light Mode, the background of the
safe_extension_filename QLabel should match the adjacent QTextField,
but the text should be "grayed out"/disabled to indicate that it's not
supposed to be editable.
Sets the detected OS color mode (dark/light) as a property on the
QApplication so it can be referenced in stylesheets to select style
rules suited to the OS color mode.
In Qubes OS it's often the case that the user doesn't have enough
RAM to start the conversion. In this case it raises BrokenPipeException
and exits with code 126.
It didn't seem possible to distinguish this kind of failure to one
where the user has misconfigured qrexec policies.
NOTE: this approach is not ideal UX-wise. After the first doc failing
the next one will also try and fail. Upon first failure we should
inform the user that they need to close some programs or qubes.
Because the server also checks the MAX_PAGES limit, the test in base
would hide the fact that the client is not enforcing the limit. This
ensures that's not the case.
When the pages in containers are streamed (#443), then this test should
be in base.py.
Theoretically the max pages would be 65536 (2byte unsigned int.
However this limit is much higher than practical documents have
and larger ones can lead to unforseen problems, for example RAM
limitations.
We thus opted to use a lower limit of 10K. The limit must be
detected client-side, given that the server is distrusted. However
we also check it in the server, just as a fail-early mechanism.
Isolation provider tests done in tests/test_base.py and had
pytest.mark.parameterize() for each isolation provider. This logic
would not work well when we had test that diverge. We could have marked
each one as compatible with one provider or another, but in the end it
turned out to be better to have the common ones in a base class and
the divergent ones in each.
NOTE: this has a strange side-effect: inherited test classes need to
have imports for all of the fixtures even if they are not explictly used
Add an error for interrupted conversions, in order to better
differentiate this scenario from other ValueErrors that may be raised
throughout the code's lifetime.
Store, in an instance attribute, the process that we have started for
the spawned disposable qube. In subsequent commits, we will use it from
other places as well, aside from the `_convert` method.
Note that this commit does not alter the conversion logic, and only does
the following:
1. Renames `p.` to `self.proc.`
2. Adds an `__init__` method to the Qubes isolation provider, and
initializes the `self.proc` attribute to `None`.
3. Adds an assert that `self.proc` is not `None` after it's spawned, to
placate Mypy.
Add instructions for installing Dangerzone on Qubes from our official
repos. These instructions are adapted from the build instructions, but
have been greatly simplified because we don't need some of the qubes
that the development environment needs.
Closes#431
Add Tesseract models for the 10 most spoken languages as package
requirements for Qubes. For containers, this problem is already solved
since we install all Tesseract models.
If a user is not covered by the installed models, they can install
extras on their own. We will add a note for this in subsequent commits.
Refs #431
Switch to the tessdata-fast Tesseract model, instead of the tessdata
one. The tessdata-fast Tesseract model is much smaller, and a bit faster
than the other one. Also, it's the model that Debian/Fedora ship by
default.
Closes#545
Extend the client-side capabilities of the Qubes isolation provider, by
adding client-side timeout logic.
This implementation brings the same logic that we used server-side to
the client, by taking into account the original file size and the number
of pages that the server returns.
Since the code does not have the exact same insight as the server has,
the calculated timeouts are in two places:
1. The timeout for getting the number of pages. This timeout takes into
account:
* the disposable qube startup time, and
* the time it takes to convert a file type to PDF
2. The total timeout for converting the PDF into pixels, in the same way
that we do it on the server-side.
Besides these changes, we also ensure that partial reads (e.g., due to
EOF) are detected (see exact=... argument)
Some things that are not resolved in this commit are:
* We have both client-side and server-side timeouts for the first phase
of the conversion. Once containers can stream data back to the
application (see #443), these server-side timeouts can be removed.
* We do not show a proper error message when a timeout occurs. This will
be part of the error handling PR (see #430)
Fixes#446
Refs #443
Refs #430
Replace the deprecated `bdist_rpm` method of creating RPMs for
Dangerzone. Instead, update our `install/linux/build-rpm.py` script, to
build Dangerzone RPMs using our SPEC file under
`install/linux/dangerzone.spec`. The script now essentially creates a
source distribution (sdist) using `poetry build`, and then uses
`rpmbuild` to create binary and source RPMs.
Fixes#298
Add an `rpm-build` directory under `install/linux`, which will be used
for building Dangerzone RPMs. For the time being, it only has a
.gitignore file there, but in the future, invoking
`install/linux/build-rpm.py` will populate it.
Update the dependencies required to build RPM packages. More
specifically, remove the older python3-setuptools dependency, and depend
instead on python3-devel and python3-poetry-core.
Note that this commit may break our CI, but it will be resolved in
subsequent commits.
Introduce a SPEC file that can be used to create an RPM from a Python
source distribution. Some notable features of this SPEC file follow:
1. We can use this SPEC file to create both regular RPM packages and
ones targeted for Qubes.
2. It has a post installation script that removes stale .egg-info
directories, which previously caused issues to our users.
3. It automatically creates a changelog from our Git logs, which differs
from the actual CHANGELOG.md.
4. It folloes the latest Fedora guidelines (as of writing this) for
packaging Python projects.
Fixes#514
Update our pyproject.toml file to include some non-Python data files,
e.g., our container image and assets. This way, we can use `poetry
build` to create a source distribution / Python wheel from our source
repository.
Note that this list of data files is already defined in our `setup.py`
script. In that script, one can find some extra goodies:
1. We can conditionally include data files in our Python package. We use
this to include Qubes data only in our Qubes packages.
2. We can specify where will the data files be installed in the end-user
system.
The above are non-goals for Poetry [1], especially (2), because modern
Python wheels are not supposed to install files in arbitrary places
within the user's host, nor should the install invocation use sudo.
Instead, this is a task that's better suited for the .deb / .rpm
packages.
So, why do we bother updating our `pyproject.toml` and not use
`setup.py` instead? Because `setup.py` is deprecated [2,3], and the
latest Python packaging RFCs [4], as well as most recent Fedora
guidelines [5] use `pyproject.toml` as the source of truth, instead of
`setup.py`.
In subsequent commits, we will also use just `pyproject.toml` for RPM
packaging.
[1]: https://github.com/python-poetry/poetry/issues/890
[2]: https://peps.python.org/pep-0517/#source-trees
[3]: https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html
[4]: https://peps.python.org/pep-0517/
[5]: https://docs.fedoraproject.org/en-US/packaging-guidelines/Python/
The dangerzone-container entrypoint, as specified in pyproject.toml, is
stale, for the following reasons:
1. It's not mentioned in the setup.py script, so it was never included
in our Linux distributions.
2. The code in `dangerzone.__init__.py` that decides if it will invoke
the GUI or CLI backend, just takes `dangerzone-cli` into account for
this decision, and does not mention dangerzone-container anywhere.
In order to let isort respect .gitignore, we need to specify this in the
tool.isort entry, in pyproject.toml.
For black, we don't need any extra tweaks. This is weird, since until a
few months ago black did not respect .gitignore. Maybe something has
changed in the meantime but if not, we should revert this change.
Creates exceptions in the server code to be shared with the client via an
identifying exit code. These exceptions are then reconstructed in the
client.
Refs #456 but does not completely fix it. Unexpected exceptions and
progress descriptions are still passed in Containers.
This PR reverts the patch that disables HWP / HWPX conversion on MacOS
M1. It does not fix conversion on Qubes OS (#494)
Previously, HWP / HWPX conversion didn't work on MacOS M1 systems (#498)
because libreoffice wasn't built with Java support on Alpine Linux for
ARM (aarch64).
Gratefully, the Alpine team has enabled Java support on the aarch64
system [1], so we can enable it again for ARM architectures.
Fixes#498
[1]: 74d443f479
The Alpine Linux team has enabled Java support for LibreOffice on ARM
architecture:
74d443f479
This commit is included in 7.5.5.2-r2, so the installed LibreOffice
package should be 7.5.5.2-r2 or higher to fix this issue.
However 3.18 doesn't have the 7.5.5.2-r2 package:
https://pkgs.alpinelinux.org/package/v3.18/community/aarch64/libreoffice
The Dangerzone image uses the alpine:latest image which is 3.18 as of
writing this.
For this reason, we switch to the edge repo of Alpine Linux, which
includes this fix.
Refs #498
Refs #540
Refs #542
qvm-copy-to-vm since a long time doesn't respect the qube name
provided. Instead it is enforced by the dom0 policy prompt. This is
probably a leftover from a command ran in dom0, where this command
actually works.
The "check for updates" button wasn't showing up immediately as checked
as soon as the user is prompted for checking updates. This fixes that.
Fixes#513
Reporting script now parses JunitXML instead of a series of
".container_log" files. The script in in changed submodule.
Additionally it makes failed tests actually fail so that this is
recorded in the JunitXML report.
Adds a large pool of document that can and should be used prior to a
release to understand effects of the new release over a real-world
scenario.
Documents are stored in an external git LFS repo under
`tests/test_docs_large` and currently it's about 11K documents gathered
from multiple PDF readers and office suite's test sets.
Documentation on how to run the tests is under
`docs/developer/TESTING.md`
Certain characters may be abused. Particularly ANSI escape codes.
Solution inspired by Qubes OS's hardening of ther RPC mechanism [1]:
> Terminal control characters are a security issue, which in worst case
> amount to arbitrary command execution. In the simplest case this
> requires two often found codes: terminal title setting (which puts
> arbitrary string in the window title) and title repo reporting (which
> puts that string on the shell's standard input. [sic]
>
> -- qvm-run.rst [2]
[1]: e005836286
[2]: c70da44702/doc/manpages/qvm-run.rst (L126)
Store the conversion log to a file (captured-output.txt) in the
container and when in development mode, have its output displayed on the
terminal output.
Use qrexec stdout to send conversion data (pixels) and stderr to send
conversion progress at the end of the conversion. This happens
regardless of whether or not the conversion is in developer mode or not.
It's the client that decides if it reads the debug data from stderr or
not. In this case, it only reads it if developer mode is enabled.
We don't tend to use Docker for development tasks in Linux, since we
have Podman for that. In MacOS and Windows, we do use Docker, but
typically without sudo.
Make our MacOS / Windows dev tasks non-interactive, by ditching the
`sudo` invocation.
Closes#519
Makes it clear that one needs to install Docker for Desktop to use Dangerzone
on Mac or Windows and Podman on linux. The app itself will warn the user about
this, but we should state the prerequisites more clearly upfront.
Mentions mac and windows in INSTALL.md so that anyone reading this page does
not wrongly assume that Dangerzone is a Linux-only app.
Fixes#475
The markdown dependency uses importlib to monkeypatch 'html.parser'
[1]. Due to this approach 'html.parser' is never explicitly stated
as a dependency. This works fine in most cases, since it's part of
the python standard lib. But on Windows the build tool (CxFreeze)
ships in the .exe only the modules needed. And because html.parser
is never mentioned, it fails with an error (see issue #501).
Fixes#501
[1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/htmlparser.py#L29
The HWP / HWPX conversion feature does not work on the following
platforms:
* MacOS with Apple Silicon CPU
* Native Qubes OS
For this reason, we need to:
1. Disable it on the GUI side, by not allowing the user to select these
files.
2. Throw an error on the isolation provider side, in case the user
directly attempts to convert the file (either through CLI or via
"Open With").
Refs #494
Refs #498
Sometimes, LibreOffice returns with status code 0, but in reality, it
fails. It doesn't create a file, and Dangerzone does not detect this.
What happens next is that it fails in the next command, and throws an
unrelated error.
Detect that LibreOffice fails, by checking if the output file exists,
after the PDF conversion.
Always pull the base container image (alpine:latest) before building our
own container image. Else, in an environments that we haven't touched
for a while, an older image may be used.
Update our release instructions in the following ways:
1. Make sure to check the Python dependencies / version before the
release.
2. Make sure to upload the final container.tar.gz image as a release
artifact.
Add extra files and base64 encode externally contributed docs. This
prevents the accidental opening of such documents, since they couldn't
be rebuit by the Dangerzone developers to ensure their safety.
Use the MIME types actually used by the `file` command, which was
recently changed for the detection of the HWPX format [1].
application/hwp+zip -> application/x-hwp+zip
But the HWPX format includes a 'mimetype' file, which contains the
MIME type string "application/hwp+zip", so that was left so because
it may be possible to detect it as "application/hwp+zip".
[1]: ceef7ead3a
HWPX MIME type is recognized as 'application/zip' with current version of file command (file-5.44).
It will be recognized as 'application/hwp+zip' when new version of file is released.
For a temporary fix, when MIME type of file is 'application/zip',
check the file type again (without the MIME option).
And then check if it's 'Zip data (MIME type "application/hwp+zip"?)' or not.
Only load the LibreOffice extension for opening hwp/hwpx when it is
actually needed. Adding an extension to libreoffice may allow for it to
run arbitrary code. This makes it trust more scalable by trusting
LibreOffice extensions only for the filetypes which they target.
Reasoning
---------
Assuming a malicious `.oxt` extension this means that the extension has
arbitrary code execution in the container. While this is not an
existential threat in itself, we should not expose every Dangerzone user
to it. This is achieved by dynamically loading the extension at runtime
only when needed.
This ensures that a compromised extension will in its least malicious
form be able to modify the visual content of any hancom office files but
not *every file*. In the more malicious version, if the code execution
manages to do a container escape, this will only affect users that have
converted a Hancom office file.
H2ORestart is a LibreOffice extension which adds Hancom HWP/HWPX (Hangul Word Processor)
supports for LibreOffice. This format is widely used in South Korea.
Version: v0.5.7
Extension Repository: https://github.com/ebandal/H2Orestart/releases
Force Podman to use the overlay storage driver in our Dangerzone
environments. We have seen that in certain cases, Podman may opt to use
the vfs storage driver instead, which is more space-intensive.
Closes#489
Improve the `parse_progress()` method of the container isolation
provider in the following ways:
1. Make sure that the fields of the progress report have the expected
type.
2. In case of a JSON parsing error, sanitize the invalid string so that
it doesn't contain escape sequences, or the user considers it as
trusted.
Update the common `print_progress()` method in the base
`IsolationProvider` class, with two extra features:
1. Always sanitize the provided text argument.
2. Mark the sanitized text argument as untrusted.
This is default behavior from now on, since this function is commonly
used to parse progress reports from the conversion sandbox.
Sanitize filenames in various places in the code, before we write them
to the user's terminal. Filenames, especially in Linux, can contain
virtually any character except for '\0' and '/', so it's important to
sanitize them.
Make the `error_label` widget always render messages as plain text,
instead of auto discovering if the text is rich. We need this because
the error message may contain input from the sandbox, which we consider
untrusted.
Run our GUI tests on separate processes, because the combination of
Ubuntu Focal, Qt5, PySide6, and pytest-qt somehow leads to segfaults,
probably due to stale global state.
Closes#493
The isort tool is not compatible with Black by default. This leads to a
tug of war between these tools, when we run `make lint-apply` -> `make
lint`. Fix this by forcing isort to be compatible with Black.
Move the "Ok" button in the prompt that asks users if they want to
enable update checks to the right, to further reinforce that this is
the default action.
Fully test the update check logic, by introducing several Qt tests.
Also, improve the `UpdaterThread.get_letest_info()` method, that gets
the latest version and changelog from GitHub, with several checks.
These checks are also tested in our newly added tests.
We want to differentiate between the user clicking on "Cancel" and
clicking on "X", since in the second case, we want to remind them again
on the next run.
Reverse the logic in Qubes to run in containers by default and only
perform the conversion with VMs when explicitly set by the env var
QUBES_CONVERSION=1. This will avoid surprises when someone installs
Dangerzone on Qubes expecting it to work out of the box just like any
other Linux.
Fixes#451
Parallel tests had given us issues in the part [1]. This time, they
weren't playing well with pytest-qt. One hypothesis is that Qt
application components run as singletons and don't play well when there
are two instances.
The symptom we were experiencing was infinite recursion and removing
pytest-xdist solved the issue.
[1]: https://github.com/freedomofpress/dangerzone/issues/217
Now that sample_doc was renamed to sample_pdf it could cause some
confusion the fact that that the TestBase class had an attribute called
sample_doc which referenced the sample PDF.
By removing this attribute and passing the fixture instead we are
following a more pytest-native approach of passing arguments explicitly.
Implements the GUI logic necessary to change the selected document. When
"Change Selection" is clicked, it opens a File Dialog on the directory
of the previously selected files (if any)
Fixes#428
Upgrade from Qt5 to Qt6 in our CI runners and dev environments, since
the latest PySide6 versions do not support Qt5. This leaves only our
Debian / Fedora packages relying on Qt5, since there's no PySide6
package for them yet.
There are some caveats to the Qt6 upgrade:
1. Debian Bullseye has a missing dependency to `libgl1`, so we need to
install it separately.
2. Ubuntu Jammy has a missing dependency to `libxkbcommon-x11-0`, which
we have to install separately.
3. Ubuntu Focal does not have Qt6, but surprisingly PySide6 works with
Qt5.
4. All Debian-based distros require `libxcb-cursor0`.
As a side effect, we have to make our `env.py` a bit more complicated,
to cater to these exceptions.
Refs #482
Change the signal type in `UpdaterThread.check_for_updates()` from
`dict` to `UpdateReport`. The `dict` parameter is stale and should have
never been used.
When building the *end-user* environment for Ubuntu Lunar using
`./dev_scripts/env.py ... build`, we erroneously used a Dockerfile
snippet that is actually reserved for the *development* environment.
This pairing worked by chance, but we should use the proper Dockerfile
snippet, so that we don't mix these two environments.
Add a hamburger button in the main window of Dangerzone, that will be
the entry point for update information. Whenever a new update is
released, users will see a green notification bubble. If an update error
happens, they will see a red notification bubble.
In the hamburger menu, users have the option to enable or disable update
checks. Depending on the update check status, users will see in a pop-up
dialog more info about the new update or the error.
Closes#189
Add a dialog that we will show for update-related tasks. This dialog has
a different layout than the Alert class: it has a message, followed by
a widget that the user chooses (can be a text box or collapsible
element), and then one last message.
Add a Qt widget called "CollapsibleBox", in order to build sections that
you can hide/show with a single click. There is no native widget for
this functionality, so we borrow some code from a StackOverflow user:
https://stackoverflow.com/a/52617714
Factor out some parts of the Alert class into a more generic dialog
class. This class will be used for a new type of dialog that we will
introduce in a subsequent commit.
Note that this commit does not alter the functionality of the Alert
class.
Add three status icon for the hamburger menu:
* hamburger_menu.svg: The typical hamburger menu. Taken from
https://commons.wikimedia.org/wiki/File:Hamburger_icon.svg, which is
in the public domain.
* hamburger_menu_update_error.svg: A hamburger menu with a red
notification bubble on the top right corner.
* hamburger_menu_update_success.svg: A hamburger menu with a green
notification bubble on the top right corner.
Add a Pytest fixture that returns an UpdaterThread instance which has
its own unique settings directory. Note that the UpdaterThread instance
needs to be slightly nerfed, so that it doesn't rely on Qt functionality
or any isolation providers.
Add a new Python module called "updater", which contains the logic for
prompting the user to enable updates, and checking our GitHub releases
for new updates.
This class has some light dependency to Qt functionality, since it needs
to:
* Show a prompt to the user,
* Run update checks asynchronously in a Qt thread,
* Provide the main window with the result of the update check
Refs #189
Get the default settings of Dangezone for the current version, without
having to instantiate the Settings class. Note that instantiating the
Settings class also writes the settings to the underlying
`settings.json` file, and there are cases where we don't want this
behavior.
Add the following two features in the Settings class:
1. Add a way to save the settings, if the contents of a key have
changed.
2. Add a way to get all the updater settings, by getting fetching the
keys that start with `"updater_"`.
Pass the X11 socket of the Linux CI runners to the container where our
CI tests run, with the `-g` flag of `dev_scripts/env.py`. By having a
working X11 socket, we can run GUI tests. Prior to this fix, we would
encounter this error:
tests/gui/test_main_window.py::test_change_document_button qt.qpa.xcb: could not connect to display
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
Available platform plugins are: xcb, offscreen, wayland-egl, wayland, eglfs, vnc, minimalegl, vkkhrdisplay, linuxfb, minimal.
Another alternative we considered was to use the
`QT_QPA_PLATFORM=offscreen` environment variable. This alternative
works, but it's less close to the end-user's environment, so we decided
in favor of the approach above.
A workaround to an issue related to SIP imposed on Docker has been identified in #371. Update README.md to include friendly instructions for MacOS 11+ users blocked by this issue.
stdout_callback is used to flow progress information from the conversion
to some front-end. It was always used in tandem with printing to the
terminal (which is kind of a front-end). So it made sense to put them
always together.
Python 3.10.12 fixes some CVEs for which Dangerzone does not appear to be
affected, however its binaries are not made available by the python
foundation. Moving to 3.11 should be trivial since this was already
deployed in Fedora 37+.
The Ubuntu 23.04 docker image includes a user by default (ubuntu) which
overtakes the 1000 uid and so our user becomes 1001 which makes the user
directory unwritable. The solution as suggested in [1] was to remove
that user.
[1]: https://bugs.launchpad.net/cloud-images/+bug/2005129Fixes#452
Allow creating an RPM package that is to be installed specifically on
Qubes. This package has the following extra properties from our regular
RPM packages:
1. Make `python3-magic`, `libreoffice` and `tesseract` requirements
for installing Dangerzone, since the conversion takes place in a
disposable qube that needs these packages.
2. Ignore the container.tar.gz file, if it exists.
3. Add our RPC calls under `/etc/qubes-rpc`
Add an isolation provider for Qubes, that performs the document
conversion as follows:
Document to pixels phase
------------------------
1. Starts a disposable qube by calling either the dz.Convert or the
dz.ConvertDev RPC call, depending on the execution context.
2. Sends the file to disposable qube through its stdin.
* If we call the conversion from the development environment, also
pass the conversion module as a Python zipfile, before the
suspicious document.
3. Reads the number of pages, their dimensions, and the page data.
Pixels to PDF phase
-------------------
1. Writes the page data under /tmp/dangerzone, so that the
`pixels_to_pdf` module can read them.
2. Pass OCR parameters as envvars.
3. Call the `pixels_to_pdf` main function, as if it was running within a
container. Wait until the PDF gets created.
4. Move the resulting PDF to the proper directory.
Fixes#414
Add two RPC calls that can run on disposable VMs:
* dz.Convert: This call simply imports the dangerzone package and runs
the Qubes wrapper for the "document to pixels" code. This call is
similar to the way we run the conversion part in a container.
* dz.ConvertDev: This call is for development purposes, and does the
following:
- First it receives the `dangerzone.conversion` module as Python
zipfile. This way, we can quickly iterate on changes on the
server-side part of Qubes, without altering the templates.
- Second, it calls the Qubes wrapper for the "document to pixels"
code, as dz.Convert does.
The "document to pixels" code assumes that the client has called it with
some mount points in which it can write files. This is true for the
container isolation provider, but not for Qubes, who can communicate
with the client only via stdin/stdout.
Add a Qubes wrapper for this code that reads the suspicious document
from stdin and writes the pages to stdout. The on-wire format is the
same as the one that TrustedPDF uses.
It seems that there are at least two Python libraries with libmagic
support:
* PyPI: python-magic (https://pypi.org/project/python-magic/)
On Fedora it's `python3-magic`
* PyPI: filemagic (https://pypi.org/project/filemagic/)
On Fedora it's `python3-file-magic`
The first package corresponds to the `py3-magic` package on Alpine
Linux, and it's the one we install in the container. The second package
uses a different API, and it's the only one we can use on Qubes.
To make matters worse, we:
* Can't install the first package on Fedora, because it installs the
second under the hood:
https://bugzilla.redhat.com/show_bug.cgi?id=1899279
* Can't install the second package on Alpine Linux (untested), due to
Musl being used instead of libC:
https://stackoverflow.com/a/53936722
Ultimately, we need to support both, by trying the first API, and on
failure using the other API.
The files in `container/` no longer make sense to have that name since
the "document to pixels" part will run in Qubes OS in its own virtual
machine.
To adapt to this, this PR does the following:
- Moves all the files in `container` to `dangerzone/conversion`
- Splits the old `container/dangerzone.py` into its two components
`dangerzone/conversion/{doc_to_pixels,pixels_to_pdf}.py` with a
`common.py` file for shared functions
- Moves the Dockerfile to the project root and adapts it to the new
container code location
- Updates the CircleCI config to properly cache Docker images.
- Updates our install scripts to properly build Docker images.
- Adds the new conversion module to the container image, so that it can
be imported as a package.
- Adapts the container isolation provider to use the new way of calling
the code.
NOTE: We have made zero changes to the conversion code in this commit,
except for necessary imports in order to factor out some common parts.
Any changes necessary for Qubes integration follow in the subsequent
commits.
Update our GitHub Actions workflow with the following tests:
1. Build a .deb for Dangerzone on Debian Bookworm.
2. Install this .deb on every Debian-based platform that we support.
3. Test that the installed version runs successfully.
This way, we can be sure that .deb that we create on a single Debian
version (here we choose Debian Bookworm) works on all platforms.
Refs #358
When we run our Dangerzone environments through dev_scripts/env.py, we
use the Podman flag `--userns keep-id`. This option maps the UID in the
host to the *same* UID in the container. This way, the container can
access mounted files from the host.
The reason this works is because the user within the container has UID
1000, and the user in the host *typically* has UID 1000 as well. This
setup can break though if the user outside the host has a different UID.
For instance, the UID of the GitHub actions user that runs our CI
command is 1001.
To fix this, we need to always map the host user UID (whatever that is)
to container UID 1000. We can achieve this with the following mapping:
1000:0:1 # Map container UID 1000 to subordinate UID 0
# (sub UID 0 = owner of the user ns = host user UID)
0:1:1000 # Map container UIDs 0-999 to subordinate UIDs 1-1000
1001:1001:64536 # Map container UIDs 1001-65535 to subordinate UIDs 1001-65535
Refs #228
In Debian-based images, there are some Podman dependencies that are
marked as recommended, but are essential for rootless containers. These
dependencies will not be installed in our Dangerzone environments, due
to the `--no-install-recommends` flag.
Our approach was to find these dependencies through trial and error,
and hardcode them in our image. Turns out though that there are some
dependencies (e.g., `netavark`) that may be necessary in some Debian
flavors, and not others.
In order to not impact the readability of the env.py file, we prefer
installing Podman with all of its recommended packages. On one hand,
this will make the image size of our Debian-based Dangerzone
environments slightly larger, but on the other hand, it will make CI
tests less flaky.
Fix transient errors in Debian Bullseye CI tests by using a different
machine image (Ubuntu 22.04 vs Ubuntu 20.04), and solving some Podman
config issues along the way.
Fixes#388
Remove the Kurdish (Arabic) language ("kur_ara") from the list of
languages that we offer for OCR, since it's not included in the
installed languages.
Interestingly, it is not present in the Apline Linux repos as well, so
this was probably an omission in the first place.
Restore the OCR languages to the state they were in
66d3c40163, with some minor changes. We
can now do so because we download all the trained models, not just the
ones that Alpine Linux offers.
Grab Tesseract's trained models from GitHub, instead of from the Alpine
Linux repos. Over the past few months, the models in the Alpine Linux
repos did not remain stable, leading to CI issues.
Since the models are already pre-trained and available through
Tesseract's repo on GitHub, we can use the release tarball that they
offer to install them in the container image, which is basically what
the upstream packages are doing as well.
In order to make sure that we have no regressions, at the time of this
commit we ensured that the hashes of the models offered through the
Alpine Linux repos and the models offered from the GitHub release are
the same. Also, in order to detect future regressions or foul play, we
check the downloaded models against a known checksum. Given that these
models change every few years, updating the checksum should not be an
issue.
Fix#357
Ignore two CVEs from our security scans, which were triggered when
scanning the Dangerzone container image for v0.4.1. These CVEs do not
affect out users, and we offer an explanation why.
Add two GitHub Actions workflows, that perform the following checks:
* Security scan the Python dependencies of the Dangerzone application
(`poetry.lock`), for the current/main branch.
* Build and security scan the Dangerzone container image for the
current/main branch.
* Security scan the Python dependencies of the Dangerzone application
(`poetry.lock`), for the latest release of Dangerzone (currently
v0.4.1).
* Download and security scan the Dangerzone container image for the
latest release of Dangerzone (currently v0.4.1).
The first two checks will run on branch pushes, PRs, and nightly. The
last two checks will run only nightly, since the code in the current
branch cannot affect already released artifacts.
Also, besides the security scans, these workflows will also update the
Security alerts in the GitHub page for the Dangerzone project, and print
the SARIF report to the stdout, for debugging purposes.
Closes#222
Replace our reference to an Apple development certificate with a
Developer ID Application certificate. The former is not accepted during
the code notarization phase, whereas the latter is.
This release brings a split in the MacOS binaries, since we now have
separate ones for Intel and Apple Silicon architectures, so we must
reflect this in the README as well.
Remove any -rc identifiers (e.g., 0.4.1-rc3) from the Dangerzone
version, if it includes them. If we don't remove them, then building
the MSI for Windows will fail as follows:
error CNDL0108: The Product/@Version attribute's value, '0.4.1-rc3',
is not a valid version. Legal version values should look like
'x.x.x.x' where x is an integer from 0 to 65534.
Install the following packages in Dangerzone envs:
* python3-setuptools: We've seen that this package is necessary to build
the RPM package for Dangerzone. The error that we encountered was the
following:
* Deleting old build and dist
* Building RPM package
Traceback (most recent call last):
File "/home/user/dangerzone/setup.py", line 5, in <module>
import setuptools
ModuleNotFoundError: No module named 'setuptools'
Traceback (most recent call last):
File "/home/user/./dangerzone/install/linux/build-rpm.py", line 43, in <module>
main()
File "/home/user/./dangerzone/install/linux/build-rpm.py", line 30, in main
subprocess.run(
File "/usr/lib64/python3.11/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'python3 setup.py bdist_rpm --requires='podman,python3-pyside2,python3-appdirs,python3-click,python3-pyxdg,python3-colorama'' returned non-zero exit status 1.
* fuse-overlayfs: In Ubuntu 22.10 (at least), we encountered the
following error when running Podman:
ERRO[0000] User-selected graph driver "overlay" overwritten by
graph driver "vfs" from database - delete libpod local files to
resolve
The `vfs` driver is much slower than the `overlayfs` storage driver,
so we need to fix this. The reason why we encounter this error is
explained in the Podman docs [1]:
[...] and is vfs for non-root users when fuse-overlayfs is not
available.
Normally, the `fuse-overlayfs` package would have been installed, but
we don't install it due to the `--no-install-recommends` flag, so we
install it manually.
[1]: https://docs.podman.io/en/latest/markdown/podman.1.html#storage-driver-value
In PR #378 ("container: Allow converting more document formats"), we
added support for the following MIME types:
* application/zip
* application/octet-stream
* application/x-ole-storage
* application/vnd.oasis.opendocument.spreadsheet-template
* application/vnd.oasis.opendocument.text-template
However, we forgot to add some tests for these MIME types in the repo.
In this commit, we add a file for each of these MIME types, to make sure
we have no regressions in the future.
The main use of safe mode [1] in LibreOffice is to run with a fresh user
profile, in case the default one got borked somehow. This is actually
not a concern of ours, since the user's profile is in the container and
is not persistent.
The main reason we want to preemptively run LibreOffice in safe mode is
to remove hardware acceleration capabilities. Whether hardware
acceleration actually works in a container is another question, but we
want to be extra sure.
[1]: https://help.libreoffice.org/latest/en-US/text/shared/01/profile_safe_mode.html
Remove the association between MIME types and export filters, because
LibreOffice is able to auto-detect them on its own. Instead, ask
LibreOffice to simply convert the document to a .pdf.
This association was cumbersome for yet another reason; there are MIME
types that may be associated with more than one file type. That's why
it's better to let LibreOffice decide the proper filter for the
conversion.
Our current understanding is that this change won't widen our attack
surface for the following reasons:
* The output filters for PDF documents are pretty specific, and we don't
affect the input filters somehow.
* The default behavior of LibreOffice on Alpine Linux is to disable
macros.
Closes#369
Due to a bump in our Python dependencies, we now install Mypy 1.1.1
instead of 0.982. This change triggered the following errors:
* Incompatible default for argument <a> (default has type
None, argument has type <t>):
Mypy further explains here that PEP 484 prohibits implicit Optional,
so we need to make these types explicit Optional.
* Unused "type: ignore" comment, use narrower [method-assign] instead of
[assignment]:
Mypy has specialized some of its lints, meaning that we should switch
to the newer variants.
Also, it detected several other small inconsistencies. We fix all of
these errors in this commit.
Run `poetry lock` and allow updating the existing dependencies. This
fixes a CI regression that was introduced by Poetry 1.4.1, which added
stricter Python wheels validation
Fixes#376
Pave the way for deploying .deb and .rpm packages to
packages.freedom.press. Remove the code that deploys to PackageCloud
once we tag a commit with `v<semver>`.
Refs #291
Update several references to First Look Media in the code, to better
reflect the current status, where Freedom of the Press Foundation has
taken over the stewardship of the project.
Fixes#343
Remove a stale QA requirement for running the tests manually in the rest
of our Linux distros. Our CI jobs take care of that, so we don't need to
do it.
Use the full image tag (dangerzone.rocks/dangerzone:latest) when
building the image. Else, we risk creating a `share/image-id.txt` file
with multiple IDs in it, if we have another
`dangerzone.rocks/dangerzone` image (with a different tag) in our dev
environment.
Update our QA instructions for ARM-based MacOS systems. The main change
in 0.4.1 is that we can build an ARM container image for Dangerzone,
which is different from Intel Macs. So, we need to build and test it
during release.
Perform the following timeout bumps:
1. Increase the minimum timeout per page/MiB by x3. The rationale is that
10 seconds is a reasonable timeout, but to be on the safe side, it's
best if we multiply it by a safety factor.
2. Increase the minimum timeout from 10 seconds to 60 seconds. 10
seconds may be too little if the application runtime (e.g.,
LibreOffice) is slow to start due to background CPU thrashing.
Replace the command to install Poetry globally via `pip` in our build
instructions, with a command that installs Poetry under ~/.local/bin
via `pipx`. The rationale is the same as in the previous commit, i.e.,
PEP 668 does not allow it.
Note that in this case, we don't have any CI restrictions, so we could
use the official installer instead. However, for security reasons, we
prefer suggesting `pipx` to the users, and of course give them a list of
alternatives.
Note that for Windows and MacOS we leave the command as is, until we
figure out how PEP 668 applies in there.
We can no longer install Poetry via `pip`, since Debian Bookworm now
enforces PEP 668, meaning that both `pip install poetry` and `pip
install --user poetry` cannot work [1]. Since we use the same
installation steps for all of our dev environments, we need to find a
common way to install Poetry.
Poetry's website provides several ways to install Poetry [2]. Moreover,
it also has a special section with CI recommendations [3]. In this
section, it strongly suggests to install Poetry via `pipx`, instead of
the installer script that you download from the Internet.
Follow Poetry's suggestion to install it via `pipx` in CI environments,
with one minor change. Do not use `pipx ensurepath`, as that will
affect the `.bashrc` of the dev environment, which at some point in the
future may be mounted by the dev. Instead, set a PATH environment
variable that includes `~/.local/bin`.
[1]: https://github.com/freedomofpress/dangerzone/issues/351
[2]: https://python-poetry.org/docs/#installation
[3]: https://python-poetry.org/docs/#ci-recommendationsFixes#351
We no longer need to install Poetry via PyPI, since the upstream Debian
issues have been fixed. Moreover, PEP 668 [1] is now enforced in Debian
Bookworm, so we can't install Poetry globally via `pip` in any case.
For these reasons, prefer installing Poetry via APT.
[1]: https://peps.python.org/pep-0668/
Refs #351
When clicking on the "Choose..." button nothing would happen visually
and it would show the error:
Traceback (most recent call last):
File "/home/user/dangerzone/dangerzone/gui/main_window.py", line 614, in select_output_directory
dialog.setFileMode(QtWidgets.QFileDialog.DirectoryOnly)
According to the PySide docs, QFileDialog.DirectoryOnly has been
deprecated in Qt4.6 [1]. This was not an issue probably on PySide2
because it must have used an earlier Qt version.
Fixes#360
[1]: https://doc.qt.io/qtforpython-5/PySide2/QtWidgets/QFileDialog.html#PySide2.QtWidgets.PySide2.QtWidgets.QFileDialog.FileMode
Building the `.msi` on Windows was failing in the `candle.exe` step due
to some files in the PySide6 library being too long (PySide6/examples)
or having illegal character (`+`) in their file names
(PySide6/qml/QtQuick).
Skipping copying these files to the `.msi` fixes the issue. Skipping
`examples/` should be of no impact since they're just examples and
skipping `qml/QtQuick` shouldn't cause issues because we don't use QML.
Reverts commit `bbbf822` and adapts it from PySide2 to PySide6.
pdftoppm raises Syntax issues and Errors on a variety of documents.
But it still produces usable results despite the failures. From the
user's perspective it's best to have a document even if imperfect than
having none at all. For this reason, we ignore non-relevant output.
Some documents were reporting the following error when running them
over pdftoppm:
Syntax Error: Missing language pack for 'Adobe-Japan1' mapping
This did not necessarily make the document fail but it could be
that some fonts were not properly rendered due to the missing package.
Enable installing Podman in Ubuntu Focal, by re-using the instructions
we have in our installation section. This enables us building a dev
environment for Ubuntu Focal, which we couldn't previously.
Provide a fallback for QRegularExpressionValidator specifically for
Ubuntu Focal, because it's not present in PySide2 5.14. Instead,
fallback to QRegExpValidator if it doesn't exist.
Fixes#339
Copy input files in a temporary dir before mounting them, thereby
changing their permissions, without affecting the original files. This
way, we can avoid cases where a file is accessible to the user only due
to a supplemental user group, which does not work for containers.
Fixes#157Fixes#260Fixes#335
Take SELinux labels into account when mounting a file to the Dangerzone
container. Use the `:Z` flag (which is a no-op in non-SELinux systems)
to clear the existing SELinux label for a file, and apply one that
matches the container's.
Refs #335
Do not leave stale temporary directories when conversion fails
unexpectedly. Instead, wrap the conversion operation in a context
manager that wipes the temporary dir afterwards.
Fixes#317
Run each CLI command in a separate config/cache dir, to avoid leaks
between tests. Moreover, this way we are able to check the contents of
the config/cache dirs for a single CLI run.
Do not store temporary directories in the Dangerzone's config directory.
There are two reasons for that:
1. They are ephemeral, and they need a temporary place to be stored,
preferably RAM-backed.
2. We need to set them while running our CI tests.
Allow users to disable timeouts via the CLI, with the
`--disable-timeouts` argument. By default, the timeouts are always
enabled.
This option applies both to the CLI version of Dangerzone, and the GUI
one. For the latter, the user must start the GUI from their CLI (i.e.,
`dangerzone --disable-timeouts ...`)
Introduce proportional timeouts in the container code, where the
conversion logic runs.
Previously, we had a single timeout for each command (120 seconds),
which didn't scale well either with the number of pages in a document,
or with the size of the document.
In this commit, we look into each operation, and we're trying to figure
out the following:
1. What's the number of pages we will operate on?
2. How large is the document?
Knowing the above, we can break down a command into multiple operations,
at least conceptually. Having a number of operations and a sane timeout
value per operation (10 seconds), we can multiply those and reach to a
timeout that fits the command better.
Fixes#306Fixes#314
Refs #327
Add an optional --distro argument to build-deb.py, to specify the Debian
version in the package name, which currently is "1". This option may
prove useful when publishing packages to freedomofpress/apt-tools-prod,
where packages from different distros with the same names but different
contents are not accepted.
While creating a Debian package for Dangerzone, we found out that the
`dangerzone.isolation_provider` submodule was not copied to the final
package. Turns out that it was missing from the packages list that we
define in `setup.py`.
Include this package in the proper section in `setup.py`.
Convert the Dangerzone script that in the container to run commands
asynchronously, via the asyncio module.
The main advantage of this approach is that it's fast, easy, and safe to
consume the command's streams, while the command is running in the
background.
Previously, we had implemented an approach that used non-blocking
sockets, but those are easy to get wrong. For instance, timeouts were
not exact, capturing output was brittle.
Fixes#325
Commit d7be28ec2a assumed that OpenJDK was
required for the PDFtk package, which is no longer installed in the
Dangerzone image, and thus was removed.
Turns out that while LibreOffice does not depend on OpenJDK, it may
produce corrupted PDFs if installed without it, and will not abort the
operation.
Reinstate OpenJDK to fix the issue of corrupted PDFs.
Fixes#315
Drop PySide2 from our dependencies (previously used only on Linux
environments) and use PySide6 in all dev environments. The reason is
that PySide2 (from PyPI) does not support Python 3.11, and the variants
that do (Fedora/Debian packages) need to backport fixes from PySide6.
Our original attempt was to build PySide2 wheels for Python 3.11 but
it was not simple, nor maintainable. So, we were left with two options:
1. Install Python 3.10 in dev environments that have Python 3.11 by
default.
2. Use PySide6 in all of our environments.
In both cases, we break package parity with the user's system, since we
are not testing Dangerzone under the same conditions. However, since
option (2) is forwards-compatible with where we want to move the
project (use Qt6 and PySide6), we chose that one.
Fixes#330
Instead of reinstalling shadow-utils, use the actual fix that the Fedora
devs have suggested (rpm --restore shadow-utils). The previous method
does not seem to work on Fedora 37, and it threw the following error
when building the development environment:
Installed package shadow-utils-2:4.12.3-3.fc37.x86_64 (from koji-override-0) not available.
Error: No packages marked for reinstall.
Error: building at STEP "RUN dnf reinstall -y shadow-utils && dnf clean all": while running runtime: exit status 1
Fix an issue in our PyTest wrapper, that caused this recursion error:
```
File "shibokensupport/signature/loader.py", line 61, in feature_importedgc
File "shibokensupport/feature.py", line 137, in feature_importedgc
File "shibokensupport/feature.py", line 148, in _mod_uses_pysidegc
File "/usr/lib/python3.10/inspect.py", line 1147, in getsourcegc
lines, lnum = getsourcelines(object)gc
File "/usr/lib/python3.10/inspect.py", line 1129, in getsourcelinesgc
lines, lnum = findsource(object)gc
File "/usr/lib/python3.10/inspect.py", line 954, in findsourcegc
lines = linecache.getlines(file, module.__dict__)gc
File "/home/user/.cache/pypoetry/virtualenvs/dangerzone-hQU0mwlP-py3.10/lib/python3.10/site-packages/py/_vendored_packages/apipkg/__init__.py", line 177, in __dict__gc
self.__makeattr(name)gc
File "/home/user/.cache/pypoetry/virtualenvs/dangerzone-hQU0mwlP-py3.10/lib/python3.10/site-packages/py/_vendored_packages/apipkg/__init__.py", line 157, in __makeattrgc
result = importobj(modpath, attrname)gc
File "/home/user/.cache/pypoetry/virtualenvs/dangerzone-hQU0mwlP-py3.10/lib/python3.10/site-packages/py/_vendored_packages/apipkg/__init__.py", line 75, in importobjgc
module = __import__(modpath, None, None, ["__doc__"])gc
File "shibokensupport/signature/loader.py", line 54, in feature_importgc
RecursionError: maximum recursion depth exceededgc
```
This error seems to be related to
https://github.com/pytest-dev/pytest/issues/1794. By not importing
`pytest` in our test wrapper, and instead executing directly, we can
avoid it.
Note that this seems to be triggered only by Shiboken6, which is why we
hadn't previously encountered it.
Add libqt5gui5 as a test dependency in the 'convert-test-docs' step.
This package brings several other Qt and graphics libraries, which are
the ones that we actually require to run the tests *with PySide6*. Else,
we encounter this error:
```
Traceback (most recent call last):
File "/home/circleci/project/dangerzone/gui/__init__.py", line 19, in <module>
from PySide6 import QtCore, QtGui, QtWidgets
ImportError: libEGL.so.1: cannot open shared object file: No such file or directory
```
Note that the same package is not required when importing PySide2.QtGui,
which is why we hadn't encountered this issue before. Also, in the rest
of our environments, we explicitly install libqt5gui5, in order to run
the Dangerzone GUI.
We're not yet adding them to Linux, since PySide6 is not yet available
in Linux distros' packages, whereas with Linux and macOS our packaging
process includes the shipped binaries.
Fixes#211
Replace PySide2-stubs with types-PySide2, both of which are projects
that provide PySide2 typing hints, for the following reasons:
1. types-PySide2 is more complete and allows us to ditch some 'type:
ignore' comments for Mypy.
2. PySide2-stubs also brings PySide2 as a dependency, which cannot be
installed in MacOS M1 machines.
Refs #177
Exceptions raised during the document conversion process would be
silently hidden. This was because ThreadPoolExecuter in logic.py created
various contexts and hid any exceptions raised.
Fixes#309
Adds tests for macOS and Windows with the dummy converter. Tests won't
actually perform the conversion. But it should be enough for us to test
the remainder of the codebase.
Fixes#229
When enabled, the conversion part does nothing but print some simulated
output. This can be useful for testing non-conversion code (e.g. GUI).
Activated with the hidden flag --unsafe-dummy-conversion.
All isolation providers will some similar steps when convert() is
called. For this reason, all the common parts are captured in convert()
and then each isolation provider implements its own specific conversion
process in _convert() (which is called from the convert() method).
Encapsulate container logic into an implementation of
AbstractIsolationProvider. This flexibility will allow for other types
of isolation managers, such as a Dummy one.
default-jre and java dependencies dependencies had been added initially
[1] because of libreoffice-java-common, which is no longer present.
Then, when the image was changed from ubuntu to alpine [2], default-jre
was replaced with openjdk-8.
If java is still a dependency for libreoffice, then it should be pulled
automatically.
[1] 9ecdb9e995
[2] 650ae6eee1
PDFtk actually isn't needed. It was being used for breaking a PDF
into pages but this is something that be replaced by the already present
'pdftoppm'. Furthermore, by removing this dependency we contribute to
reproducible builds and overall supply chain security because it was
obtained from gitlab with no signature verification or version pinning.
The replacement 'pdftoppm' enabled us to do a shortcut:
- before: PDF -> PDF pages -> PNG images -> RGB images
- after: PDF -> PPM images -> RGB images
And this last conversion step is trivial since the RGB format we were
using is just a PPM file without the metadata in its header.
This BUILD.md was merged into main without updating qa.py to reflect it
because our linters were down due to the now-fixed poetry bug (see prev
commit).
The previous way of excluding files under `dev_scripts/envs` does not
seem to work. Ditching the glob and excluding the whole path works, so
we can go with that.
Narrow down the system packages that we install in dev environments. The
rationale is that we get most of the Python dependencies from Poetry, so
we don't need to install them from the system as well.
The packages that we do need to install are non-Python ones, and this
commit adds some that were missing: make, python3-stdeb. Also, we
explicitly install the base Qt5 libraries, in order to get the graphics
and C++ libraries that we can't get from PyPI.
Ignore the `dev_scripts/envs` folder when running tests or linting code,
as it may contain files that are not owned by the current user. In this
case, we've seen that pytest/black etc. fail.
This typically happens when the user has run Dangerzone in a
containerized environment (see #286), and Podman created a directory
with files owned by the user in the nested container.
Add a script that makes the user go through the QA steps for a supported
Dangerzone platform, and may optionally run them automatically, if the
user agrees.
Closes#287
Add a design document for `dev_scripts/env.py`, which is a script that
creates Dangerzone environments for various Linux distros. In this
design document, we explain various architectural decisions that we have
taken for this script, as well as how it works under the hood, what are
its shortcomings, etc.
Refs #286
Introduce `dev_scripts/env.py`, which is a script for building
Dangerzone environments for various Linux distros, and running commands
in them.
Closes#286
Skip the creation of the `share/container.tar` file, since it's not used
anywhere. Instead, pipe our `docker/podman save` invocations to `gzip`
directly, which will compress the tarfile on the fly. This saves both
time and disk space.
In non-development mode, the CLI shows the user information via the INFO
log level. The message is shown directly without [INFO] as a prefix.
Otherwise it would quickly get annoying to the user seeing [INFO] on
every line of a CLI application.
However, if an error happens it's important for the user to recognize
it's an error or a warning. This commit prints the log level in these
cases.
The "open with" dialog on windows was showing the description of
Dangerzone instead of its app name. The issue was that on windows it
shows the description there.
Fixes#283
Instability in the automated tests sometimes would sometimes fail when
running "podman images --format {{.ID}}". It turns out that in versions
prior to podman 4.3.0, podman volumes (stored in
~/.local/share/contaiers) would get corrupted when multiple tests were
run in parallel.
The current solution is to wrap the test command to run sequentially in
versions prior to the fix and in parallel for versions after that.
Fixes#217
Fix the failing convert-test-docs step, by pinning Poetry to version
1.2.2. This way, we avoid a bug in Poetry 1.3 [1], which was recently
released on PyPI.
[1]: https://github.com/python-poetry/poetry/issues/7184Closes#292
Debian has removed the python-all package from its Bookworm repos, which
breaks our CI tests. Looking into why python-all is required in the
first place, we found that it's an artificial stdeb requirement [1],
prior to 0.9.1 versions
The only platform affected by this issue is Ubuntu Focal, so our
solution is to install python-all specifically for that platform.
Finally, we further simplify our build tasks [2] (on Debian-like
distros) by not letting dh-python run tests when building the packages.
Running the tests has some issues after all:
1. It requires installing all the runtime dependencies of Dangerzone,
since it uses `python -m unittest discover` underneath.
2. It doesn't aid in the stability of the package, since unittest cannot
run test cases for PyTest.
[1]: https://github.com/astraw/stdeb/issues/153
[2]: https://github.com/freedomofpress/dangerzone/issues/292#issuecomment-1349967888
Create two separate groups for Poetry dependencies:
1. test: Dependencies required for testing Dangerzone.
2. lint: Dependencies required for linting the code with `make lint`.
Replace references to github.com/firstlookmedia with
github.com/freedomofpress, since the ownership of these repos has been
transferred to the Freedom of the Press Foundation.
Now that safe PDFs can open on Windows right after conversion
(implemented in commit 5b2fefd), we need to save/load the "Open safe
documents after converting" setting.
Cx-Freeze 6.13.0 limited the PATH of the build executables, making it so
Dangerzone couldn't find Docker through shutil.which().
More information on the issue is available at:
https://github.com/marcelotduarte/cx_Freeze/issues/1674
Allowing this would lead to several UI edge-cases related to where the
files would be saved. Avoiding this is the easiest solution at the
moment.
In the future we should consider other options.
It was possible that users would add duplicate documents via 'open with
Dangerzone'. This would lead to unexpected situations and preventing it
both in the CLI and the GUI solves those issues.
Handle the case where a user has already added some documents (either
through 'open with' or via Dangerzone 'select documents' button) and
then they want to add some more via the 'open_with' dialog.
It updates the settings to reflect the newly added documents and blocks
the user from adding them if a conversion is already in progress.
Makes the ContentWidget a choke-point, where we can allow or prevent
adding more documents and where we can ensure that newly selected
documents are added immediately to the DangerzoneGui class.
Logically, the application flow should not change in any way.
Closing windows on macOS would not actually close Dangerzone. Now that
it is a single-window program, it makes sense for it to close
immediately.
Fixes#271
The save group box would get partially trimmed when running in macOS
this appears to be due to differences in rendering fonts and widget
sizes.
Refs #270
Add a QA section in our release notes, which describes the list of
manual checks a developer needs to make before a release, to ensure that
we have no regressions.
Closes#246
Bump the global timeout used for various steps from 1 minute to 2
minutes. The reason is that we've seen several reports of operations
failing due to timeout reasons, that were otherwise legitimately
running.
Also, bump the timeout used for compression, which has been reported as
problematic as well.
Refs #146
Refs #149
If a user has provided an output filename for a document, then we should
no longer accept suffixes. The reason is that we can't do something
meaningful with it, as we can't alter the provided output filename.
The proper behavior is to reject this action with an exception. Note
that this acts more of a safeguard, since (currently) there is no path
where a user may add a suffix to a document that already has an output
filename.
In preparation for adding a limit on how many convert threads exist, we
are simplifying its logic. Getting ocr_lang doesn't seem to belong to
the thread.
Aligns document labels following the design specified in issue #117.
It did not specify how it would change with window resize, so it
currently expands the progress bar / error message width and keeps the
document name fixed in size.
Mypy complains about a line being unreachable. This is probably a false
positive. It must assume the code is not using a framework and thus it
can't when a PySide 'connect()' is being called.
Since everything now happens in a single window, there is no need
to have a way to keep track other windows. They simply won't exist.
But on windows and Linux it will still be possible to start
multiple windows by starting various Dangerzone processes. On MacOS
this doesn't seem to be as easy from the launcher, but it should
not be critical as multiple documents can be converted at the same
time in the one window.
To help debugging and visualizing what was happening, we set all
widgets to be visible at the same time. Now that is no longer needed,
we can hide them.
This keeps the original program flow:
1. select the documents
2. set the settings
3. see the conversion progress
This diverges from the proposed design in issue #117 for simplification
and consistency (with past program flow) purposes.
The user is supposed to only be able to select the safe PDF extension.
In a multi-file scenario, the extension will be the same for all files.
We follow here the design document [1]. To achieve this, we needed a
QLabel right next to a QLineEdit, to give the user the illusion that
it is the same graphical object.
[1]: https://github.com/firstlookmedia/dangerzone/files/6657536/DangerZone_NA02a.pdf
Avoid conversion issues when saving the output file when it is set
wrongly. Inform the user with a red box saying "must end in .pdf"
and prevent the user from clicking "convert" before that is fixed.
Combines the validation logic with the already-existing 'update_ui()'
Set the default directory for saving the file as the one from
the first document. This one will show just the directory name.
If the user changes it by choosing another directory, then show the new
directory name and its full path.
- shows settings again
- removes documents arg from settings widget - this is now stored
under DangerzoneGui instance.
- removes widget 'dangerous_doc_label' - the doc label is already
shown next to each document.
- 'Save as' button now serves the purpose of selecting where all
output files should be saved. Before, it was for selecting where
the file would be saved.
- 'save_lineedit' widget which was read-only and showed the path
where the file would be saved, it now called 'safe_suffix' and is
writable. It is where the user can type the safe file extension
(e.g. '-safe.pdf'). Validation is not yet implemented.
- when 'start_button' is clicked it now changes the output_filename
of all the documents to set their output directory to the one the
user has selected (if 'save_checkbox' enabled) and to set their
new 'safe_suffix'
- change to plural text for selection of multiple documents
These will be needed in for the GUI's settings. This also adds test
cases for these documents. The methods are the following:
- set_output_dir()
For changing the output directory of the safe file
- suffix setter and getter - for changing the suffix of the file
Allows the user to:
a) specify filenames via the terminal (for the GUI)
b) select multiple documents via the GUI
The conversion process can't yet be started since the settings are
broken and disabled (expect mypy complaints).
Allow opening multiple documents at the same time from the terminal
by calling
$ dangerzone document1.pdf document2.pdf
It will open each document in its own window, making use of the
already existing 'multi-document multi-window' parallel conversion
implementation.
plistlib:
- originaly added in commit 3be1d63330
- no longer needed
grp, getpass:
- originally added in commit ae7c919d8e
- used for finding the 'docker' executable. No longer needed.
Checking if files were writeable created files in the process. In the
case where someone adds a list of N files to dangerzone but exits before
converting, they would be left with N 0-byte files for the -safe
version. Now they don't.
Fixes#214
All filename-related exceptions were of class DocumentFilenameException.
This made it difficult to disambiguate them. Specializing them makes it
it easier for tests to detect which exception in particular we want to
verify.
The document's state update is better update in the convert() function.
This is because this function is always called for the conversion
progress regardless of the frontend.
The container output logging logic was in both the CLI and the GUI.
This change moves the core parsing logic to container.py.
Since the code was largely the same, now cli does need to specify
a stdout_callback since all the necessary logging already happens.
The GUI now only adds an stdout_callback to detect if there was an
error during the conversion process.
Initial parallel document conversion: creates a pool of N threads
defined by the setting 'parallel_conversions'. Each thread calls
convert() on a document.
Wildcard arguments like `*` can lead to security vulnerabilities
if files are maliciously named as would-be parameters. In the following
scenario if a file in the current directory was named '--help', running
the following command would show the help.
$ dangerzone-cli *
By checking if parameters also happen to be files, we mitigate this
risk and have a chance to warn the user.
Add extra installations steps for installing Podman in Ubuntu Focal,
since it's not present in the official Ubuntu repos. This is the final
requirement to reinstate Ubuntu Focal support.
Closes#206
Introduce a script for installing Podman in Ubuntu Focal, in
environments that may, or may not, have sudo installed.
Also, update our CircleCI configuration to use this script when
installing Podman.
Report some Linux versions that were recently supported (Debian 12 /
Fedora 37) in the installation instructions. These instructions where
copied from the Dangerzone wiki, which is why the recently supported
versions were missing.
Copy installation instructions from the Dangerzone wiki [1] into the
Dangerzone source. This has several benefits:
1. Devs can update installation instructions as part of a PR.
2. Users can see installation instructions for previous releases.
The last point is important, because we can update our instructions in
the main branch, without affecting the instructions a user follows from
the website (currently pointing to the Dangerzone Wiki).
Refs #240
[1]: https://github.com/freedomofpress/dangerzone/wiki/Installing-Dangerzone
Align build instructions about Python Poetry, which where previously
present only on MacOS and Windows. With this commit we:
1. Add Poetry instructions on Linux.
2. Add missing Poetry instructions on Windows, when running Dangerzone
from source.
Fedora 37 had been removed (commit d7cbe41) due to lack of support by
packagecloud (our package hosting solution at the time). This will no
longer be true and thus we can add this distro to the list of supported.
Rename the `gui.common` module and `gui.common.GuiCommon` class
to `gui.logic` and `gui.logic.DangerzoneGui` respectively. We keep as is
the original names of the variables that hold instances of this class,
since they will change in subsequent commits.
This change is part of the initial refactor to make the DangerzoneGui
class handle the GUI logic of the Dangerzone project.
Rename the `global_common` module and `global_common.GlobalCommon` class
to `logic` and `logic.Dangerzone` respectively. Also rename variables
that hold instances of this class.
This change is part of the initial refactor to make the Dangerzone class
handle the core logic of the Dangerzone project.
Let the Document class suggest the default filename for the safe PDF,
based on the provided input filename, appended with the extension
`-safe.pdf`.
Previously, this logic was copy-pasted throughout the code, which made
it difficult to maintain.
Make the Document class always resolve relative input/output file paths,
which are usually passed as arguments by users.
Previously, resolving relative filepaths was a job left to the
instantiators of the Document class. This was error-prone since this
conversion must happen in all the places where we instantiated the
Document class.
Implement Click's callback interface and create validators for the
input/output filenames, using the logic from the Document class. This
way, we can catch user errors as early as possible.
Factor out the filename validation logic and move it into the Document
class. Previously, the filename validation logic was scattered across
the CLI and GUI code.
Also, introduce a new errors.py module whose purpose is to handle
document-related errors, by providing:
* A special exception for them (DocumentFilenameExcpetion)
* A decorator that handles DocumentFilenameException, logs it and the
underlying cause, and exits the program gracefully.
Avoid setting document's filename via document.filename and instead
do it via object instantiation where possible.
Incidentally this has to change some window logic. When
select_document() is called it no longer checks if there is already an
open window with no document selected yet. The user can open as many
windows with unselected documents as they want.
Rename the `common` module and `common.Common` class to `document` and
`document.Document` respectively. Also, rename the variables that hold
instances of this class.
This change reflects the fact that the class is responsible for tracking
the state of the document. When we add bulk document conversion,
allowing us to keep track of a document's state will be key. This name
change is a step towards that.
Run Mypy static checks against our tests. This brings them inline with
the rest of the codebase, and we have an extra level of certainty that
the tests (and unit tests in particular) will not significantly diverge
from the code they are testing.
Concatenate directories and filenames in a platform-independent way, by
using pathlib.Path. This fixes issues in the tests where the "/" path
separator made the tests fail on Windows.
Add two tests that check if Dangerzone properly handles input and output
filenames with spaces in them. Previously this was not straight-forward
because we didn't tokenize arguments, which lead to Click splitting
filenames with spaces in two.
Pass tokenized arguments (i.e., arguments as lists of strings) to CLI
invocations, else Click will attempt to tokenize them internally. The
problem with leaving tokenization to Click is that it uses
`shlex.split()`, which is Unix-oriented, and may miss some cases in
Windows.
Wrap Click results (`Result`) with a new class (`CLIResult`), which
includes:
1. Assertion statements.
2. Logic for formatting and printing a Click result.
3. Invocation arguments, which are missing from the original `Result`
class.
On a windows system when running `pip install` it fails to install
`cx_Logging-3.0` with the error:
error: Microsoft Visual C++ 14.0 or greater is required. Get it
with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
Installing this dependency solves the issue.
Reduce "global_common" coupling by moving methods that could be
static onto "semantically-closer" py files.
Based on work initially made by @gmarmstrong on PR #166:
- moves container-specific code out of global_common.py and into
container.py
- creates a util.py for static methods used through the whole app
- move banner code from global_common onto cli.py given that it's
only displayed there
- updates tests to reflect these changes
- move ocr_languages from global_common onto its own json file in
share/ocr-languages.json to simplify global_common logic
Container-related methods recently moved to container.py no longer
need to have 'container' in their name as they are within the
container scope already.
Additonally it made it awkward to call from another module:
from .. import container
container.get_container_runtime()
The logic for detecting if we were are running on docker or podman
and identifying its respective binary were scattered across the
codebase. This centralizes it all in container.py
- display_banner() was only displayed in CLI mode so it makes sense
for it to be in the CLI.
- get_version(), was mvoed to util since it is a static function
that is needed in multiple parts of the application.
static methods that are used application-wide should belong to
the utilities python file.
inspired by @gmarmstrong's PR #166 on refactoring global_common
methods to be static and have a dzutil.py
originally PDF files were included for these edge-cases but in
reality all we want to test is the filename itself. So it reduces
repo size if we have them generated dynamically.
The parameterizatin features of pytest over the default unittest
will be useful to reduce test code. Furthermore, pytest is already
used by folks at FPF so there won't be any learning curve if folks
want to work on it.
Updates to the macOS and Windows build scripts and documentation:
- Switched from hardcoding the exact minor release of Python 3.9
to just using Python 3.9
- Switches from 32-bit Windows Python binaries to 64-bit
- Install poetry in Windows using pip, which is much simpler and
less error-prone than the PowerShell way
- Includes instructions for making the Windows release in a
Windows 11 VM, and building the container image on the host
- Updates the fingerprint of the Windows signing key
- Fixes a small bug with the .wxs file used to build the MSI
package
Mypy was returning many errors relating to PySide2, which didn't
make much sense. This is apparently because there are missing type
hinting stubs for PySide2.
The temporary solution is to add this devel dependency.
Upstream issue: (remove dep. when solved)
- https://bugreports.qt.io/browse/PYSIDE-1675
Input_filename and output_filename could be None or Str. This lead
to typing issues where the static analysis type hint tool could not
check that the type colisions would not happen in runtime.
So the logic was replaced by throwing a runtime exception if either
of these valiables is ever used without first having been set.
Originally tied to implment following PEP 589 [1] – TypedDict: Type
Hints for Dictionaries with a Fixed Set of Keys for the Settings
dict.
But this quickly turned out to very challenging without redoing the
code. So we opted instead for using the Any keyword.
[1]: https://peps.python.org/pep-0589/
The following logic was leading to type hint issues:
> inspect.getfile(inspect.currentframe())
But this code is overly complex for what it does is the same as
simply __file__. So we kill two birds with one stone, so to speak.
We added the following check as well:
+ if stdout_callback and p.stdout is not None:
Because, according to the subprocess docs[1]:
> If the stdout argument was not PIPE, this attribute is None.
In this case, it should not need to confirm that p.stdout is not
None in the mypy static analysis. However it still complained. So
we made mypy the favor and confirmed this was the case.
[1]: https://docs.python.org/3/library/subprocess.html#subprocess.Popen.stdout
`subprocess.STARTUPINFO()` only exists in windows systems. Because
of this, in linux-based systems it was raising type hint issues
as it didn't recognize the return function.
There was no code to handle if at this stage the runtime existed.
This caused issues with type hints since `shutil.which()` can
return None, which had not previously been accounted for.
We did not use the opportunity to consolidate all the code for
detecting the runtime, to make this review easier.
The container arguments was duplicated. This could potentially lead
to refactor errors. For example security arg could be added in one
container call but forgotten to be added in a second one.
Moving to /dangerzone was failing with insuficient permissions:
Invalid JSON returned from container: PermissionError: [Errno
13] Permission denied: '/dangerzone/page-3.rgb'
A previous approach was removed in commit 805222. It started with
root at first in a wrapper script and then dropped these
priviledges which running the script.
`--userns=keep-id` solves the mountpoint issues as it maps the user
starting the container is mapped in the container [1].
[1]: https://www.redhat.com/sysadmin/user-flag-rootless-containers
Was calling color spillover to the adjacent text if the banner was
logged instead of printed. Since this is the CLI version, it could
make sense to have this printed.
fixes#144: printing non-ascii characters in a macOS application
opened directly from finder would sometimes lead to an error
message in /var/log/system.log similar to this:
Failed to execute script 'dangerzone' due to unhandled exception:
'ascii' codec can't encode character '\u201c' in position 1:
ordinal not in range(128)
Simplifying the logic for obtaining resource paths by using pathlib
instead inspect.
Co-authored-by: Guthrie McAfee Armstrong <git@gmarmstrong.dev>
Based on commit bbce13d
Hi, and thanks for taking the time to open this bug report.
- type:textarea
id:what-happened
attributes:
label:What happened?
description:What was the expected behaviour, and what was the actual behaviour? Can you specify the steps you followed, so that we can reproduce?
placeholder:"A bug happened!"
validations:
required:true
- type:textarea
id:os-version
attributes:
label:Linux distribution
description:|
What is the name and version of your Linux distribution? You can find it out with `cat /etc/os-release`
placeholder:Ubuntu 22.04.5 LTS
validations:
required:true
- type:textarea
id:dangerzone-version
attributes:
label:Dangerzone version
description:Which version of Dangerzone are you using?
validations:
required:true
- type:textarea
id:podman-info
attributes:
label:Podman info
description:|
Please copy and paste the following commands in your terminal, and provide us with the output:
```shell
podman version
podman info -f 'json'
podman images
podman run hello-world
```
This will be automatically formatted into code, so no need for backticks.
render:shell
- type:textarea
id:logs
attributes:
label:Document conversion logs
description:|
If the bug occurs during document conversion, we'd like some logs from this process. Please copy and paste the following commands in your terminal, and provide us with the output (replace `/path/to/file` with the path to your document):
```bash
dangerzone-cli /path/to/file
```
render:shell
- type:textarea
id:additional-info
attributes:
label:Additional info
description:|
Please provide us with any additional info, such as logs, extra content, that may help us debug this issue.
Hi, and thanks for taking the time to open this bug report.
- type:textarea
id:what-happened
attributes:
label:What happened?
description:What was the expected behaviour, and what was the actual behaviour? Can you specify the steps you followed, so that we can reproduce?
placeholder:"A bug happened!"
validations:
required:true
- type:textarea
id:os-version
attributes:
label:operating system version
description:Which version of MacOS do you use? You can follow [this link](https://support.apple.com/en-us/109033) to find out more.
placeholder:macOS Sequoia 15
validations:
required:true
- type:dropdown
id:proc-architecture
attributes:
label:Processor type
description:|
Which kind of processor do you use?
You can follow [this link](https://support.apple.com/en-us/109033) to find out more.
options:
- Intel
- Apple Silicon
validations:
required:true
- type:textarea
id:dangerzone-version
attributes:
label:Dangerzone version
description:Which version of Dangerzone are you using?
validations:
required:true
- type:textarea
id:docker-info
attributes:
label:Docker info
description:|
Please copy and paste the following commands in your
terminal, and provide us with the output:
```shell
docker version
docker info -f 'json'
docker images
docker run hello-world
```
This will be automatically formatted into code, so no need for backticks.
render:shell
- type:textarea
id:logs
attributes:
label:Document conversion logs
description:|
If the bug occurs during document conversion, we'd like some logs from this process. Please copy and paste the following commands in your terminal, and provide us with the output (replace `/path/to/file` with the path to your document):
Hi, and thanks for taking the time to open this bug report.
- type:textarea
id:what-happened
attributes:
label:What happened?
description:What was the expected behaviour, and what was the actual behaviour? Can you specify the steps you followed, so that we can reproduce?
placeholder:"A bug happened!"
validations:
required:true
- type:textarea
id:os-version
attributes:
label:operating system version
description:|
Which version of Windows do you use? Follow [this link](https://learn.microsoft.com/en-us/windows/client-management/client-tools/windows-version-search) to find out.
validations:
required:true
- type:textarea
id:dangerzone-version
attributes:
label:Dangerzone version
description:Which version of Dangerzone are you using?
validations:
required:true
- type:textarea
id:docker-info
attributes:
label:Docker info
description:|
Please copy and paste the following commands in your
terminal, and provide us with the output:
```shell
docker version
docker info -f 'json'
docker images
docker run hello-world
```
This will be automatically formatted into code, so no need for backticks.
render:shell
- type:textarea
id:logs
attributes:
label:Document conversion logs
description:|
If the bug occurs during document conversion, we'd like some logs from this process. Please copy and paste the following commands in your terminal, and provide us with the output (replace `\path\to\file` with the path to your document):
stale-issue-message:"Marking this issue as stale because it has been open for 30 days with no activity. It will be closed in 14 days if there's no activity, or if the `stale` label is not removed. Does anyone want to add something?"
close-issue-message:"Closing this issue now. Don't hesitate to reopen if you have anything to add :-)"
sudo apt install -y podman dh-python build-essential make libqt6gui6 \
pipx python3 python3-dev
```
You also need docker, either by installing the [Docker snap package](https://snapcraft.io/docker), installing the `docker.io` package, or by installing `docker-ce` by following [these instructions for Ubuntu](https://docs.docker.com/install/linux/docker-ce/ubuntu/) or [for Debian](https://docs.docker.com/install/linux/docker-ce/debian/).
Install Poetry using `pipx` (recommended) and add it to your `$PATH`:
Change to the `dangerzone` folder, and install the poetry dependencies:
> **Note**: due to an issue with [poetry](https://github.com/python-poetry/poetry/issues/1917), if it prompts for your keyring, disable the keyring with `keyring --disable` and run the command again.
You also need docker, either by installing the `docker` package, or by installing `docker-ce` by following [these instructions](https://docs.docker.com/install/linux/docker-ce/fedora/).
Change to the `dangerzone` folder, and install the poetry dependencies:
> **Note**: due to an issue with [poetry](https://github.com/python-poetry/poetry/issues/1917), if it prompts for your keyring, disable the keyring with `keyring --disable` and run the command again.
```
cd dangerzone
poetry install
```
Build the latest container:
```sh
python3 ./install/common/build-image.py
```
Download the OCR language data:
```sh
python3 ./install/common/download-tessdata.py
```
Run from source tree:
```sh
# start a shell in the virtual environment
poetry shell
# run the CLI
./dev_scripts/dangerzone-cli --help
# run the GUI
./dev_scripts/dangerzone
```
> [!NOTE]
> Prefer running the following command in a Fedora development environment,
> created by `./dev_script/env.py`.
Create a .rpm:
```sh
./install/linux/build_rpm.py
./install/linux/build-rpm.py
```
## Qubes OS
> :warning: Native Qubes support is in beta stage, so the instructions below
> require switching between qubes, and are subject to change.
>
> If you want to build Dangerzone on Qubes and use containers instead of disposable
> qubes, please follow the instructions of Fedora / Debian instead.
### Initial Setup
The following steps must be completed once. Make sure you run them in the
2. Follow the Fedora instructions for setting up the development environment.
3. Build a dangerzone `.rpm` for qubes with the command
```sh
./install/linux/build-rpm.py --qubes
```
4. Copy the produced `.rpm` file into `fedora-41-dz`
```sh
qvm-copy dist/*.x86_64.rpm
```
#### In the `fedora-41-dz` template
1. Install the `.rpm` package you just copied
```sh
sudo dnf install ~/QubesIncoming/dz/*.rpm
```
2. Shutdown the `fedora-41-dz` template
### Developing Dangerzone
From here on, developing Dangerzone is similar to Fedora. The only differences
are that you need to set the environment variable `QUBES_CONVERSION=1` when
you wish to test the Qubes conversion, run the following commands on the `dz` development qube:
```sh
# run the CLI
QUBES_CONVERSION=1 poetry run ./dev_scripts/dangerzone-cli --help
# run the GUI
QUBES_CONVERSION=1 poetry run ./dev_scripts/dangerzone
```
And when creating a `.rpm` you'll need to enable the `--qubes` flag.
> [!NOTE]
> Prefer running the following command in a Fedora development environment,
> created by `./dev_script/env.py`.
```sh
./install/linux/build-rpm.py --qubes
```
For changes in the server side components, you can simply edit them locally,
and they will be mirrored to the disposable qube through the `dz.ConvertDev`
RPC call.
The only reason to build a new Qubes RPM and install it in the `fedora-41-dz`
template for development is if:
1. The project requires new server-side components.
2. The code for `qubes/dz.ConvertDev` needs to be updated.
## macOS
Install python@3.9 from Homebrew:
Install [Docker Desktop](https://www.docker.com/products/docker-desktop). Make sure to choose your correct CPU, either Intel Chip or Apple Chip.
Install the latest version of Python 3.12 [from python.org](https://www.python.org/downloads/macos/), and make sure `/Library/Frameworks/Python.framework/Versions/3.12/bin` is in your `PATH`.
To create an app bundle, use the `build_app.py` script:
```sh
poetry run ./install/macos/build_app.py
poetry run ./install/macos/build-app.py
```
If you want to build for distribution, you'll need a codesigning certificate, and then run:
```sh
poetry run ./install/macos/build_app.py --with-codesign
poetry run ./install/macos/build-app.py --with-codesign
```
The output is in the `dist` folder.
## Windows
These instructions include adding folders to the path in Windows. To do this, go to Start and type "advanced system settings", and open "View advanced system settings" in the Control Panel. Click Environment Variables. Under "System variables" double-click on Path. From there you can add and remove folders that are available in the PATH.
Download Python 3.9.0, 32-bit (x86) from https://www.python.org/downloads/release/python-390/. I downloaded python-3.9.0.exe. When installing it, make sure to check the "Add Python 3.9 to PATH" checkbox on the first page of the installer.
Install the latest version of Python 3.12 (64-bit) [from python.org](https://www.python.org/downloads/windows/). Make sure to check the "Add Python 3.12 to PATH" checkbox on the first page of the installer.
Install Microsoft Visual C++ 14.0 or greater. Get it with ["Microsoft C++ Build Tools"](https://visualstudio.microsoft.com/visual-cpp-build-tools/) and make sure to select "Desktop development with C++" when installing.
Install [poetry](https://python-poetry.org/). Open PowerShell, and run:
### If you want the .exe to not get falsely flagged as malicious by anti-virus software
Dangerzone uses PyInstaller to turn the python source code into Windows executable `.exe` file. Apparently, malware developers also use PyInstaller, and some anti-virus vendors have included snippets of PyInstaller code in their virus definitions. To avoid this, you have to compile the Windows PyInstaller bootloader yourself instead of using the pre-compiled one that comes with PyInstaller.
Here's how to compile the PyInstaller bootloader:
Download and install [Microsoft Build Tools for Visual Studio 2019](https://www.visualstudio.com/downloads/#build-tools-for-visual-studio-2019). I downloaded `vs_buildtools__719988613.1603831511.exe`. In the installer, check the box next to "C++ build tools". Click "Individual components", and under "Compilers, build tools and runtimes", check "Windows Universal CRT SDK". Then click install. When installation is done, you may have to reboot your computer.
Then, enable the 32-bit Visual C++ Toolset on the Command Line like this:
```
cd "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build"
vcvars32.bat
```
Change to a folder where you keep source code, and clone the PyInstaller git repo and checkout the `v4.3` tag:
The next step is to compile the bootloader. We should do this all in dangerzone's poetry shell:
```
cd dangerzone
# start a shell in the virtual environment
poetry shell
cd ..\pyinstaller
# run the CLI
.\dev_scripts\dangerzone-cli.bat --help
# run the GUI
.\dev_scripts\dangerzone.bat
```
Then, compile the bootloader:
### If you want to build the Windows installer
```
cd bootloader
python waf distclean all --target-arch=32bit --msvc_targets=x86
cd ..
Install [.NET SDK](https://dotnet.microsoft.com/en-us/download) version 6 or later. Then, open a terminal and install the latest version of [WiX Toolset .NET tool](https://wixtoolset.org/) **v5** with:
```sh
dotnet tool install --global wix --version 5.0.2
```
Finally, install the PyInstaller module into your poetry environment:
Install the WiX UI extension. You may need to open a new terminal in order to use the newly installed `wix` .NET tool:
Now the next time you use PyInstaller to build dangerzone, the `.exe` file should not be flagged as malicious by anti-virus.
### If you want to build the installer
* Go to https://dotnet.microsoft.com/download/dotnet-framework and download and install .NET Framework 3.5 SP1 Runtime. I downloaded `dotnetfx35.exe`.
* Go to https://wixtoolset.org/releases/ and download and install WiX toolset. I downloaded `wix311.exe`.
* Add `C:\Program Files (x86)\WiX Toolset v3.11\bin` to the path.
> [!IMPORTANT]
> To avoid compatibility issues, ensure the WiX UI extension version matches the version of the WiX Toolset.
>
> Run `wix --version` to check the version of WiX Toolset you have installed and replace `5.x.y` with the full version number without the Git revision.
### If you want to sign binaries with Authenticode
* You'll need a code signing certificate. I got an open source code signing certificate from [Certum](https://www.certum.eu/certum/cert,offer_en_open_source_cs.xml).
* Once you get a code signing key and certificate and covert it to a pfx file, import it into your certificate store.
You'll need a code signing certificate.
## To make a .exe
Open a command prompt, cd into the dangerzone directory, and run:
```
poetry run pyinstaller install\pyinstaller\pyinstaller.spec
poetry run python .\setup-windows.py build
```
`dangerzone.exe` and all of their supporting files will get created inside the `dist` folder.
You then must create a symbolic link for dangerzone to run. To do this, open a command prompt _as an administrator_, cd to the dangerzone folder, and run:
```
cd dist\dangerzone
mklink dangerzone-container.exe dangerzone.exe
```
In `build\exe.win32-3.12\` you will find `dangerzone.exe`, `dangerzone-cli.exe`, and all supporting files.
### To build the installer
Note that you must have a codesigning certificate installed in order to use the `install\windows\build.bat` script, because it codesigns `dangerzone.exe` and `Dangerzone.msi`.
Open a command prompt, cd to the dangerzone directory, and run:
Note that you must have a codesigning certificate installed in order to use the `install\windows\build-app.bat` script, because it codesigns `dangerzone.exe`, `dangerzone-cli.exe` and `Dangerzone.msi`.
```
poetry run install\windows\step1-build-exe.bat
```
Open a second command prompt _as an administratror_, cd to the dangerzone directory, and run:
```
install\windows\step2-make-symlink.bat
```
Then back in the first command prompt, run:
```
poetry run install\windows\step3-build-installer.bat
poetry run .\install\windows\build-app.bat
```
When you're done you will have `dist\Dangerzone.msi`.
## Updating the container image
The Dangezone container image is reproducible. This means that every time we
build it, the result will be bit-for-bit the same, with some minor exceptions.
Read more on how you can update it in `docs/developer/reproducibility.md`.
- Document Operating System support [#986](https://github.com/freedomofpress/dangerzone/issues/986)
- Tests: Look for regressions when converting PDFs [#321](https://github.com/freedomofpress/dangerzone/issues/321)
- Ensure container image reproducibilty across different container runtimes and versions ([#1074](https://github.com/freedomofpress/dangerzone/issues/1074))
- Point to the installation instructions that the Tails team maintains for Dangerzone ([announcement](https://tails.net/news/dangerzone/index.en.html))
- Installation and execution errors are now caught and displayed in the interface ([#193](https://github.com/freedomofpress/dangerzone/issues/193))
- Prevent users from using illegal characters in output filename ([#362](https://github.com/freedomofpress/dangerzone/issues/362)). Thanks [@bnewc](https://github.com/bnewc) for the contribution!
- Add support for Fedora 41 ([#947](https://github.com/freedomofpress/dangerzone/issues/947))
- Add support for Ubuntu Oracular (24.10) ([#954](https://github.com/freedomofpress/dangerzone/pull/954))
### Fixed
- Update our macOS entitlements, removing now unneeded privileges ([#638](https://github.com/freedomofpress/dangerzone/issues/638))
- Make Dangerzone work on Linux systems with SELinux in enforcing mode ([#880](https://github.com/freedomofpress/dangerzone/issues/880))
- Process documents with embedded multimedia files without crashing ([#877](https://github.com/freedomofpress/dangerzone/issues/877))
- Search for applications that can read PDF files in a more reliable way on Linux ([#899](https://github.com/freedomofpress/dangerzone/issues/899))
- Handle and report some stray conversion errors ([#776](https://github.com/freedomofpress/dangerzone/issues/776)). Thanks [@amnak613](https://github.com/amnak613) for the contribution!
- Replace occurrences of the word "Docker" in Podman-related error messages in Linux ([#212](https://github.com/freedomofpress/dangerzone/issues/212))
### Changed
- The second phase of the conversion (pixels to PDF) now happens on the host. Instead of first grabbing all of the pixel data from the first container, storing them on disk, and then reconstructing the PDF on a second container, Dangerzone now immediately reconstructs the PDF **on the host**, while the doc to pixels conversion is still running on the first container. The sanitation is no less safe, since the boundaries between the sandbox and the host are still respected ([#625](https://github.com/freedomofpress/dangerzone/issues/625))
- PyMuPDF is now vendorized for Debian packages. This is done because the PyMuPDF package from the Debian repos lacks OCR support ([#940](https://github.com/freedomofpress/dangerzone/pull/940))
- Always use our own seccomp policy as a default ([#908](https://github.com/freedomofpress/dangerzone/issues/908))
- Debian packages are now amd64 only, which removes some warnings in Linux distros with 32-bit repos enabled ([#394](https://github.com/freedomofpress/dangerzone/issues/394))
- Allow choosing installation directory on Windows platforms ([#148](https://github.com/freedomofpress/dangerzone/issues/148)). Thanks [@jkarasti](https://github.com/jkarasti) for the contribution!
- Bumped H2ORestart LibreOffice extension to version 0.6.6 ([#943](https://github.com/freedomofpress/dangerzone/issues/943))
- Platform support: Ubuntu Focal (20.04) is now deprecated, and support will be dropped with the next release ([#965](https://github.com/freedomofpress/dangerzone/issues/965))
### Removed
- Platform support: Drop Ubuntu Mantic (23.10), since it's end-of-life ([#977](https://github.com/freedomofpress/dangerzone/pull/977))
### Development changes
- Build Debian packages with pybuild ([#773](https://github.com/freedomofpress/dangerzone/issues/773))
- Test Dangerzone on Intel macOS machines as well ([#932](https://github.com/freedomofpress/dangerzone/issues/932))
- Switch from CircleCI runners to Github actions ([#674](https://github.com/freedomofpress/dangerzone/issues/674))
- Sign Windows executables and installer with SHA256 rather than SHA1 ([#931](https://github.com/freedomofpress/dangerzone/pull/931)). Thanks [@jkarasti](https://github.com/jkarasti) for the contribution!
- Integrate Dangerzone with gVisor, a memory-safe application kernel, thanks to [@EtiennePerot](https://github.com/EtiennePerot) ([#126](https://github.com/freedomofpress/dangerzone/issues/126)).
As a result of this integration, we have also improved Dangerzone's security in the following ways:
* Prevent attacker from becoming root within the container ([#224](https://github.com/freedomofpress/dangerzone/issues/224))
* Use a restricted seccomp profile ([#225](https://github.com/freedomofpress/dangerzone/issues/225))
* Make use of user namespaces ([#228](https://github.com/freedomofpress/dangerzone/issues/228))
- Files can now be drag-n-dropped to Dangerzone ([issue #409](https://github.com/freedomofpress/dangerzone/issues/409))
### Fixed
- Fix a deprecation warning in PySide6, thanks to [@naglis](https://github.com/naglis) ([issue #595](https://github.com/freedomofpress/dangerzone/issues/595))
- Make update notifications work in systems with PySide2, thanks to [@naglis](https://github.com/naglis) ([issue #788](https://github.com/freedomofpress/dangerzone/issues/788))
- Updated the Dangerzone container image to use Alpine Linux 3.20 ([#812](https://github.com/freedomofpress/dangerzone/pull/812))
- Fix wrong file permissions in Fedora packages ([issue #727](https://github.com/freedomofpress/dangerzone/pull/727))
- Quote commands in installation instructions, making it compatible with `zsh` based shells. (issue [#805](https://github.com/freedomofpress/dangerzone/issues/805))
- Order the list of PDF viewers and return the default application first on Linux, thanks to [@rocodes](https://github.com/rocodes) (issue [#814](https://github.com/freedomofpress/dangerzone/pull/814))
### Removed
- Platform support: Drop Fedora 38, since it's end-of-life ([issue #840](https://github.com/freedomofpress/dangerzone/pull/840))
### Development changes
- Bumped the minimum python version to 3.9, due to Pyside6 dropping support for python 3.8 ([#780](https://github.com/freedomofpress/dangerzone/pull/780))
- Minor amendments to the codebase (in [#811](https://github.com/freedomofpress/dangerzone/pull/811))
- Use the original line ending (usually `LF`) for all content except images ([#838](https://github.com/freedomofpress/dangerzone/pull/838))
- Explained how to create, sign, and verify source tarballs ([#823](https://github.com/freedomofpress/dangerzone/pull/823))
- Added a design doc for the update notifications
- Added a design doc for the gVisor integration ([#815](https://github.com/freedomofpress/dangerzone/pull/815))
- Removed the python shebang from some files
## Dangerzone 0.6.1
### Added
- Platform support: Ubuntu 24.04 and Fedora 40 ([issue #762](https://github.com/freedomofpress/dangerzone/issues/762))
### Fixed
- Handle timeout errors (`"Timeout after 3 seconds"`) more gracefully ([issue #749](https://github.com/freedomofpress/dangerzone/issues/749))
- Make Dangerzone work in macOS versions prior to Ventura (13), thanks to [@maltfield](https://github.com/maltfield) ([issue #471](https://github.com/freedomofpress/dangerzone/issues/471))
- Make OCR work again in Qubes Fedora 38 templates ([issue #737](https://github.com/freedomofpress/dangerzone/issues/737))
- Make .svg / .bmp files selectable when browsing files via the Dangerzone GUI ([#722](https://github.com/freedomofpress/dangerzone/pull/722))
- Linux: Show the proper application name and icon for Dangerzone, in the user's window manager, thanks to [@naglis](https://github.com/naglis) ([issue #402](https://github.com/freedomofpress/dangerzone/issues/402))
- Linux: Allow opening multiple files at once, when selecting them from the user's file manager, thanks to [@naglis](https://github.com/naglis) ([issue #797](https://github.com/freedomofpress/dangerzone/issues/797))
- Linux: Do not include Dangerzone in the list of available PDF viewers, thanks to [@naglis](https://github.com/naglis) ([issue #790](https://github.com/freedomofpress/dangerzone/issues/790))
- Linux: Handle filenames with invalid Unicode characters in the Dangerzone CLI, thanks to [@naglis](https://github.com/naglis) ([issue #768](https://github.com/freedomofpress/dangerzone/issues/768))
### Changed
- Sign our release assets with the Dangerzone signing key, and provide
instructions to end-users ([issue #761](https://github.com/freedomofpress/dangerzone/issues/761)
- Use the newest reimplementation of the PyMuPDF rendering engine (`fitz`) ([issue #700](https://github.com/freedomofpress/dangerzone/issues/700))
- Development: Build Dangerzone using the latest Wix 3.14 release ([#746](https://github.com/freedomofpress/dangerzone/pull/746)
- Add new file formats: epub svg and several image formats (BMP, PNM, BPM, PPM) ([issue #697](https://github.com/freedomofpress/dangerzone/issues/697))
## Fixed
- Fix mismatched between between original document and converted one ([issue #626](https://github.com/freedomofpress/dangerzone/issues/)). This does not affect the quality of the final document.
- Capitalize "dangerzone" on the application as well as on the Linux desktop shortcut, thanks to [@sudwhiwdh](https://github.com/sudwhiwdh) [#676](https://github.com/freedomofpress/dangerzone/pull/676)
- Fedora (Linux): Add missing Dangerzone logo on application launcher ([issue #645](https://github.com/freedomofpress/dangerzone/issues/645))
- Prevent document conversion from failing due to lack of space in the converter. This affected mainly systems with low computing resources such as Qubes OS ([issue #574](https://github.com/freedomofpress/dangerzone/issues/574))
- Add a missing dependency to our Apple Silicon container image, which affected dev environments only, thanks to [@prateekj117](https://github.com/prateekj117) ([#671](https://github.com/freedomofpress/dangerzone/pull/671))
- Development: Add missing check when building container image, thanks to [@EtiennePerot](https://github.com/EtiennePerot) ([#721](https://github.com/freedomofpress/dangerzone/pull/721))
### Changed
- Feature: Add support for HWP/HWPX files (Hancom Office) for macOS Apple Silicon devices ([issue #498](https://github.com/freedomofpress/dangerzone/issues/498), thanks to [@OctopusET](https://github.com/OctopusET))
- Replace Dangerzone document rendering engine from pdftoppm PyMuPDF, essentially replacing a variety of tools (gm / tesseract / pdfunite / ps2pdf) ([issue #658](https://github.com/freedomofpress/dangerzone/issues/658))
- Changed project license from MIT to AGPLv3 (related to [issue #658](https://github.com/freedomofpress/dangerzone/issues/658))
- Containers: stream pages instead of mounting directories. For users in practice this doesn't change much, but it opens up technical possibilities that go from security to usability. ([issue #443](https://github.com/freedomofpress/dangerzone/issues/443))
- Ubuntu Jammy (Linux): add external depedency (provided by the Dangerzone repository) which fixes podman crashing during standar stream I/O ([issue #685](https://github.com/freedomofpress/dangerzone/issues/685))
- Platform support: Drop Ubuntu 23.04 (Lunar Lobster), since it's end-of-life ([issue #705](https://github.com/freedomofpress/dangerzone/issues/705))
## Dangerzone 0.5.1
### Fixed
- Our Qubes RPM package was missing critical dependencies for the conversion of a document from pixels to PDF ([issue #647](https://github.com/freedomofpress/dangerzone/issues/647))
### Changed
- Use more descriptive button labels in update check prompt ([issue #527](https://github.com/freedomofpress/dangerzone/issues/527), thanks to [@garrettr](https://github.com/garrettr))
### Removed
- Platform support: Drop Fedora 37, since it reached end-of-life ([issue #637](https://github.com/freedomofpress/dangerzone/issues/637))
### Security
- [Security advisory 2023-12-07](https://github.com/freedomofpress/dangerzone/blob/main/docs/advisories/2023-12-07.md): Protect our container image against
CVE-2023-43115, by updating GhostScript to version 10.02.0.
- [Security advisory 2023-10-25](https://github.com/freedomofpress/dangerzone/blob/main/docs/advisories/2023-10-25.md): prevent dz-dvm network via dispVMs. This was
officially communicated on the advisory date and is only included here since
this is the first release since it was announced.
## Dangerzone 0.5.0
### Added
- Platform support: Beta integration with Qubes OS ([issue #412](https://github.com/freedomofpress/dangerzone/issues/412))
- Add client-side timeouts in Qubes ([issue #446](https://github.com/freedomofpress/dangerzone/issues/446))
- Add installation instructions for Qubes ([issue #431](https://github.com/freedomofpress/dangerzone/issues/431))
- Development: Add tests that run Dangerzone against a pool of roughly 11K documents ([PR #386](https://github.com/freedomofpress/dangerzone/pull/386))
- Development: Grab the output of commands when in development mode ([issue #319](https://github.com/freedomofpress/dangerzone/issues/319))
### Fixed
- Fix a bug that was introduced in version 0.4.1 and could potentially lead to
excluding the last page of the sanitized document ([issue #560](https://github.com/freedomofpress/dangerzone/issues/560))
- Fix the parsing of a document's page count ([issue #565](https://github.com/freedomofpress/dangerzone/issues/565))
- Make progress reports in Qubes real-time ([issue #557](https://github.com/freedomofpress/dangerzone/issues/557))
- Improve the handling of various runtime errors in Qubes ([issue #430](https://github.com/freedomofpress/dangerzone/issues/430))
- Pass OCR parameters properly in Qubes ([issue #455](https://github.com/freedomofpress/dangerzone/issues/455))
- Fix dark mode support ([issue #550](https://github.com/freedomofpress/dangerzone/issues/550),
thanks to [@garrettr](https://github.com/garrettr))
- MacOS/Windows: Sync "Check for updates" checkbox with the user's choice ([issue #513](https://github.com/freedomofpress/dangerzone/issues/513))
- Fix issue where changing document selection to a file from a different directory would lead to an error ([issue #581](https://github.com/freedomofpress/dangerzone/issues/581))
- Qubes: in the cli version "Safe PDF created" would be shown twice ([issue #555](https://github.com/freedomofpress/dangerzone/issues/555))
- Qubes: in the cli version the percentage is now rounded to the unit ([issue #553](https://github.com/freedomofpress/dangerzone/issues/553))
- Qubes: clean up temporary files ([issue #575](https://github.com/freedomofpress/dangerzone/issues/575))
- Qubes: do not open document if the conversion failed ([issue #581](https://github.com/freedomofpress/dangerzone/issues/581))
- Development: Switch from the deprecated `bdist_rpm` toolchain to the more
modern RPM SPEC files, when building Fedora packages ([issue #298](https://github.com/freedomofpress/dangerzone/issues/298))
- Development: Make our dev scripts properly invoke Docker in MacOS / windows
- Shave off ~300MiB from our container image, using the fast variant of the
Tesseract OCR language models ([issue #545](https://github.com/freedomofpress/dangerzone/issues/545))
- When a user is asked to enable updates, make "Yes" the default option ([issue #507](https://github.com/freedomofpress/dangerzone/issues/507))
- Use Fedora 38 as a template in our Qubes build instructions ([PR #533](https://github.com/freedomofpress/dangerzone/issues/533))
- Improve the installation docs for newcomers ([issue #475](https://github.com/freedomofpress/dangerzone/issues/475))
- Development: Explain how to get the application password from the MacOS keychain ([issue #522](https://github.com/freedomofpress/dangerzone/issues/522))
### Removed
- Remove the `dangerzone-container` executable, since it was not used in practice by any user ([PR #538](https://github.com/freedomofpress/dangerzone/issues/538))
### Security
- Do not allow attackers to show error or log messages to Qubes users ([issue #456](https://github.com/freedomofpress/dangerzone/issues/456))
## Dangerzone 0.4.2
### Added
- Inform about new updates on MacOS/Windows platforms, by periodically checking
- Feature: Add support for HWP/HWPX files (Hancom Office) ([issue #243](https://github.com/freedomofpress/dangerzone/issues/243), thanks to [@OctopusET](https://github.com/OctopusET))
* **NOTE:** This feature is not yet supported on MacOS with Apple Silicon CPU
or Qubes OS ([issue #494](https://github.com/freedomofpress/dangerzone/issues/494),
- Allow users to change their document selection from the UI ([issue #428](https://github.com/freedomofpress/dangerzone/issues/428))
- Add a note in our README for MacOS 11+ users blocked by SIP ([PR #401](https://github.com/freedomofpress/dangerzone/pull/401), thanks to [@keywordnew](https://github.com/keywordnew))
- Platform support: Alpha integration with Qubes OS ([issue #411](https://github.com/freedomofpress/dangerzone/issues/411))
- Development: Use Qt6 in our CI runners and dev environments ([issue #482](https://github.com/freedomofpress/dangerzone/issues/482))
### Removed
- Platform support: Drop Fedora 36, since it's end-of-life ([issue #420](https://github.com/freedomofpress/dangerzone/issues/420))
- Platform support: Drop Ubuntu 22.10 (Kinetic Kudu), since it's end-of-life ([issue #485](https://github.com/freedomofpress/dangerzone/issues/485))
### Fixed
- Add missing language detection (OCR) models ([issue #357](https://github.com/freedomofpress/dangerzone/issues/357))
- Replace deprecated `pipes` module with `shlex` ([issue #373](https://github.com/freedomofpress/dangerzone/issues/373), thanks to [@OctopusET](https://github.com/OctopusET))
- Shrink container image with `--no-cache` option on `apk` ([issue #459](https://github.com/freedomofpress/dangerzone/issues/459), thanks to [@OctopusET](https://github.com/OctopusET))
### Security
- Continuously scan our Python dependencies and container image for
- Development: Add dummy isolation provider for testing non-conversion-related
issues in virtualized Windows and MacOS, where Docker can't run, due to the
lack of nested virtualization ([issue #229](https://github.com/freedomofpress/dangerzone/issues/229))
- Add support for more MIME types that were previously disregarded ([issue #369](https://github.com/freedomofpress/dangerzone/issues/369))
- Platform support: Add support for Fedora 38
### Changed
- Full release under Freedom of the Press Foundation: signing keys have changed from the original developer Micah Lee / First Look Media to FPF's signing keys. Linux packages moved from Packagecloud to FPF's servers
- [Installation instructions updated](https://github.com/freedomofpress/dangerzone/blob/v0.4.1/INSTALL.md) to reflect change in key owership to FPF
- Platform support: MacOS (Apple Silicon) native application with significant
- Feature: Introduce PySide6 / Qt6 support on Windows, MacOS, and Linux (dev-only) ([issue #219](https://github.com/freedomofpress/dangerzone/issues/219))
- Feature: Adjust conversion timeouts based on the document's pages/size, and
allow users to disable them with `--disable-timeouts` (available when you run
the Dangerzone from the terminal) ([issue #327](https://github.com/freedomofpress/dangerzone/issues/327))
- Development: Update Linux instructions for development on Qubes
### Removed
- Platform support: Drop Fedora 35, since it's end-of-life ([issue #308](https://github.com/freedomofpress/dangerzone/issues/308))
- Bug fix: Remove unused PDFtk and sudo libraries from the container image, to
lower its attack surface and reduce its size ([issue #232](https://github.com/freedomofpress/dangerzone/issues/232))
### Fixed
- Feature: Convert documents with non-standard permissions or SELinux labels ([issue #335](https://github.com/freedomofpress/dangerzone/issues/335))
- Bug fix: Report exceptions during conversions ([issue #309](https://github.com/freedomofpress/dangerzone/issues/309))
- Feature: Support bulk conversion to safe PDFs ([issue #77](https://github.com/freedomofpress/dangerzone/issues/77))
- Feature: Option to archive unsafe directories ([issue #255](https://github.com/freedomofpress/dangerzone/pull/255))
- Feature: Support python 3.10
- Feature: When quitting while still converting, confirm if user is sure
- Bug fix: Fix unit tests on Windows
- Bug fix: Do not hardcode "docker" in help messages, now that Podman is also used ([issue #122](https://github.com/freedomofpress/dangerzone/issues/122))
- Bug fix: Failed execution no longer produces an empty "safe" documents ([issue #214](https://github.com/freedomofpress/dangerzone/issues/214))
- Bug fix: Malfunctioning "New window" logic was replaced with multi-doc support ([issue #204](https://github.com/freedomofpress/dangerzone/issues/204))
- Bug fix: re-adds support for 'open with Dangerzone' from finder on macOS ([issue #268](https://github.com/freedomofpress/dangerzone/issues/268))
- Bug fix: (macOS) quit Dangerzone when main window is closed ([issue #271](https://github.com/freedomofpress/dangerzone/issues/271))
## Dangerzone 0.3.2
- Bug fix: some non-ascii characters like “ would prevent Dangerzone from working ([issue #144](https://github.com/freedomofpress/dangerzone/issues/144))
- Bug fix: error where Dangerzone would show "permission denied: '/tmp/input_file'" ([issue #157](https://github.com/freedomofpress/dangerzone/issues/157))
- Bug fix: remove containers after use, enabling Dangerzone to run after 1000+ converted docs ([issue #197](https://github.com/freedomofpress/dangerzone/pull/197))
- Security: limit container capabilities, run in container as non-root and limit privilege escalation ([issue #169](https://github.com/freedomofpress/dangerzone/issues/169))
## Dangerzone 0.3.1
- Bug fix: Allow converting documents on different mounted filesystems than the container volume
- Bug fix: In GUI mode, don't always OCR document
- Bug fix: In macOS, fix "open with" Dangerzone so documents are automatically selected
- Windows: Change packaging to avoid anti-virus false positives
## Dangerzone 0.3
- Removes the need for internet access by shipping the Dangerzone container image directly with the software
| Tails | Only the last release | 🗷 | Last release only |
Notes:
✰ Support for Ubuntu Focal [was dropped](https://github.com/freedomofpress/dangerzone/issues/1018)
✢ Qubes OS support assumes the use of a Fedora template. The supported releases follow our general support for Fedora.
◎ More information about where that points [in the runner-images repository](https://github.com/actions/runner-images/tree/main)
## MacOS
- Download [Dangerzone 0.9.0 for Mac (Apple Silicon CPU)](https://github.com/freedomofpress/dangerzone/releases/download/v0.9.0/Dangerzone-0.9.0-arm64.dmg)
- Download [Dangerzone 0.9.0 for Mac (Intel CPU)](https://github.com/freedomofpress/dangerzone/releases/download/v0.9.0/Dangerzone-0.9.0-i686.dmg)
> [!TIP]
> We support the releases of macOS that are still within Apple's servicing timeline. Apple usually provides security updates for the latest 3 releases, but this isn’t consistently applied and security fixes aren’t guaranteed for the non-latest releases. We are also dependent on [Docker Desktop windows support](https://docs.docker.com/desktop/setup/install/mac-install/)
You can also install Dangerzone for Mac using [Homebrew](https://brew.sh/): `brew install --cask dangerzone`
> **Note**: you will also need to install [Docker Desktop](https://www.docker.com/products/docker-desktop/).
> This program needs to run alongside Dangerzone at all times, since it is what allows Dangerzone to
> create the secure environment.
## Windows
- Download [Dangerzone 0.9.0 for Windows](https://github.com/freedomofpress/dangerzone/releases/download/v0.9.0/Dangerzone-0.9.0.msi)
> **Note**: you will also need to install [Docker Desktop](https://www.docker.com/products/docker-desktop/).
> This program needs to run alongside Dangerzone at all times, since it is what allows Dangerzone to
> create the secure environment.
> [!TIP]
> We generally support Windows releases that are still within [Microsoft’s servicing timeline](https://support.microsoft.com/en-us/help/13853/windows-lifecycle-fact-sheet).
>
> Docker sets the bottom line:
>
> > Docker only supports Docker Desktop on Windows for those versions of Windows that are still within [Microsoft’s servicing timeline](https://support.microsoft.com/en-us/help/13853/windows-lifecycle-fact-sheet). Docker Desktop is not supported on server versions of Windows, such as Windows Server 2019 or Windows Server 2022.
## Linux
On Linux, Dangerzone uses [Podman](https://podman.io/) instead of Docker Desktop for creating
an isolated environment. It will be installed automatically when installing Dangerzone.
> [!TIP]
> We support Ubuntu, Debian, and Fedora releases that are still within
> their respective servicing timelines, with a few twists:
>
> - Ubuntu: We follow upstream support with an extra cutoff date. No support for
> versions prior to the second oldest LTS release.
> - Fedora: We follow upstream support
> - Debian: current stable, oldstable and LTS releases.
Dangerzone is available for:
- Ubuntu 25.04 (plucky)
- Ubuntu 24.10 (oracular)
- Ubuntu 24.04 (noble)
- Ubuntu 22.04 (jammy)
- Debian 13 (trixie)
- Debian 12 (bookworm)
- Debian 11 (bullseye)
- Fedora 42
- Fedora 41
- Fedora 40
- Tails
- Qubes OS (beta support)
### Ubuntu, Debian
<table>
<tr>
<td>
<details>
<summary><i>:information_source: Backport notice for Ubuntu 22.04 (Jammy) users regarding the <code>conmon</code> package</i></summary>
</br>
The `conmon` version that Podman uses and Ubuntu Jammy ships, has a bug
that gets triggered by Dangerzone
(more details in https://github.com/freedomofpress/dangerzone/issues/685).
To fix this, we provide our own `conmon` package through our APT repo, which
was built with the following [instructions](https://github.com/freedomofpress/maint-dangerzone-conmon/tree/ubuntu/jammy/fpf).
This package is essentially a backport of the `conmon` package
[provided](https://packages.debian.org/source/oldstable/conmon) by Debian
Bullseye.
</details>
</td>
</tr>
</table>
First, retrieve the PGP keys. The instructions differ depending on the specific
distribution you are using:
For Debian Trixie and Ubuntu Plucky (25.04), follow these instructions to
<summary>Importing GPG key 0x22604281: ... Is this ok [y/N]:</summary>
</br>
After some minutes of running the above command (depending on your internet speed) you'll be asked to confirm the fingerprint of our signing key. This is to make sure that in the case our servers are compromised your computer stays safe. It should look like this:
> **Note**: If it does not show this fingerprint confirmation or the fingerprint does not match, it is possible that our servers were compromised. Be distrustful and reach out to us.
The `Fingerprint` should be `DE28 AB24 1FA4 8260 FAC9 B8BA A7C9 B385 2260 4281`. For extra security, you should confirm it matches the one at the bottom of our website ([dangerzone.rocks](https://dangerzone.rocks)) and our [Mastodon account](https://fosstodon.org/@dangerzone) bio.
After confirming that it matches, type `y` (for yes) and the installation should proceed.
</details>
</td>
</tr>
</table>
### Qubes OS
> [!WARNING]
> This section is for the beta version of native Qubes support. If you
> want to try out the stable Dangerzone version (which uses containers instead
> of virtual machines for isolation), please follow the Fedora or Debian
> instructions and adapt them as needed.
>
> **If you followed these instructions before October 25, 2023, please read [this security advisory](docs/advisories/2023-10-25.md).**
> This notice will be removed with the 1.0.0 release of Dangerzone.
> [!IMPORTANT]
> This section will install Dangerzone in your **default template**
> (`fedora-41` as of writing this). If you want to install it in a different
> one, make sure to replace `fedora-41` with the template of your choice.
The following steps must be completed once. Make sure you run them in the
- You can download this key [from the keys.openpgp.org keyserver](https://keys.openpgp.org/vks/v1/by-fingerprint/DE28AB241FA48260FAC9B8BAA7C9B38522604281).
_(You can also cross-check this fingerprint with the fingerprint in our
[Mastodon page](https://fosstodon.org/@dangerzone) and the fingerprint in the
footer of our [official site](https://dangerzone.rocks))_
You must have GnuPG installed to verify signatures. For macOS you probably want
[GPGTools](https://gpgtools.org/), and for Windows you probably want
Take potentially dangerous PDFs, office documents, or images and convert them to a safe PDF.

Dangerzone works like this: You give it a document that you don't know if you can trust (for example, an email attachment). Inside of a sandbox, Dangerzone converts the document to a PDF (if it isn't already one), and then converts the PDF into raw pixel data: a huge list of of RGB color values for each page. Then, in a separate sandbox, Dangerzone takes this pixel data and converts it back into a PDF.
_Read more about Dangerzone in the blog post [Dangerzone: Working With Suspicious Documents Without Getting Hacked](https://tech.firstlook.media/dangerzone-working-with-suspicious-documents-without-getting-hacked)._
Dangerzone works like this: You give it a document that you don't know if you can trust (for example, an email attachment). Inside of a sandbox, Dangerzone converts the document to a PDF (if it isn't already one), and then converts the PDF into raw pixel data: a huge list of RGB color values for each page. Then, outside of the sandbox, Dangerzone takes this pixel data and converts it back into a PDF.
_Read more about Dangerzone in the [official site](https://dangerzone.rocks/about/)._
## Getting started
- Download [Dangerzone 0.2 for Mac](https://github.com/firstlookmedia/dangerzone/releases/download/v0.2/Dangerzone-0.2.dmg)
- Download [Dangerzone 0.2 for Windows](https://github.com/firstlookmedia/dangerzone/releases/download/v0.2/Dangerzone-0.2.msi)
- See [installing Dangerzone](https://github.com/firstlookmedia/dangerzone/wiki/Installing-Dangerzone) on the wiki for Linux repositories
Follow the instructions for each platform:
You can also install Dangerzone for Mac using [Homebrew](https://brew.sh/): `brew install --cask dangerzone`
You can read more about our operating system support [here](https://github.com/freedomofpress/dangerzone/blob/v0.9.0/INSTALL.md#operating-system-support).
## Some features
- Sandboxes don't have network access, so if a malicious document can compromise one, it can't phone home
- Sandboxes use [gVisor](https://gvisor.dev/), an application kernel written in Go, that implements a substantial portion of the Linux system call interface.
- Dangerzone can optionally OCR the safe PDFs it creates, so it will have a text layer again
- Dangerzone compresses the safe PDF to reduce file size
- After converting, Dangerzone lets you open the safe PDF in the PDF viewer of your choice, which allows you to open PDFs and office docs in Dangerzone by default so you never accidentally open a dangerous document
@ -33,13 +42,47 @@ Dangerzone can convert these types of document into safe PDFs:
- ODF Spreadsheet (`.ods`)
- ODF Presentation (`.odp`)
- ODF Graphics (`.odg`)
- Hancom HWP (Hangul Word Processor) (`.hwp`, `.hwpx`)
- other image formats (`.bmp`, `.pnm`, `.pbm`, `.ppm`)
Dangerzone was inspired by [Qubes trusted PDF](https://blog.invisiblethings.org/2013/02/21/converting-untrusted-pdfs-into-trusted.html), but it works in non-Qubes operating systems. It uses containers as sandboxes instead of virtual machines (using Docker for macOS, Windows, and Debian/Ubuntu, and [podman](https://podman.io/) for Fedora).
Dangerzone was inspired by [Qubes trusted PDF](https://blog.invisiblethings.org/2013/02/21/converting-untrusted-pdfs-into-trusted.html), but it works in non-Qubes operating systems. It uses containers as sandboxes instead of virtual machines (using Docker for macOS and Windows, and [podman](https://podman.io/) on Linux).
Set up a development environment by following [these instructions](/BUILD.md).
The git repository for the container is called [dangerzone-converter](https://github.com/firstlookmedia/dangerzone-converter).
# License and Copyright
Licensed under the AGPLv3: [https://opensource.org/licenses/agpl-3.0](https://opensource.org/licenses/agpl-3.0)
Copyright (c) 2022-2024 Freedom of the Press Foundation and Dangerzone contributors
Copyright (c) 2020-2021 First Look Media
## See also
* [GIJN Toolbox: Cutting-Edge — and Free — Online Investigative Tools You Can Try Right Now](https://gijn.org/stories/cutting-edge-free-online-investigative-tools/)
* [When security matters: working with Qubes OS at the Guardian](https://www.theguardian.com/info/2024/apr/04/when-security-matters-working-with-qubes-os-at-the-guardian)
## FAQ
### Has Dangerzone received a security audit?
Yes, Dangerzone received its [first security audit](https://freedom.press/news/dangerzone-receives-favorable-audit/) by [Include Security](https://includesecurity.com/) in December 2023. The audit was generally favorable, as it didn't identify any high-risk findings, except for 3 low-risk and 7 informational findings.
### "I'm experiencing an issue while using Dangerzone."
Dangerzone gets updates to improve its features _and_ to fix problems. So, updating may be the simplest path to resolving the issue which brought you here. Here is how to update:
1. Check which version of Dangerzone you are currently using: run Dangerzone, then look for a series of numbers to the right of the logo within the app. The format of the numbers will look similar to `0.4.1`
2. Now find the latest available version of Dangerzone: go to the [download page](https://dangerzone.rocks/#downloads). Look for the version number displayed. The number will be using the same format as in Step 1.
3. Is the version on the Dangerzone download page higher than the version of your installed app? Go ahead and update.
### Can I use Podman Desktop?
Yes! We've introduced [experimental support for Podman Desktop](https://github.com/freedomofpress/dangerzone/blob/main/docs/podman-desktop.md) on Windows and macOS.
This section documents the release process. Unless you're a dangerzone developer making a release, you'll probably never need to follow it.
This section documents how we currently release Dangerzone for the different distributions we support.
## Changelog, version, and signed git tag
## Pre-release
Before making a release, all of these should be complete:
Here is a list of tasks that should be done before issuing the release:
- [ ] Create a new issue named **QA and Release for version \<VERSION\>**, to track the general progress.
You can generate its content with the the `poetry run ./dev_scripts/generate-release-tasks.py` command.
- [ ] [Add new Linux platforms and remove obsolete ones](https://github.com/freedomofpress/dangerzone/blob/main/RELEASE.md#add-new-linux-platforms-and-remove-obsolete-ones)
- [ ] Bump the Python dependencies using `poetry lock`
- [ ] Check for new [WiX releases](https://github.com/wixtoolset/wix/releases) and update it if needed
- [ ] Update `version` in `pyproject.toml`
- [ ] Update `share/version.txt`
- [ ] Update version and download links in `README.md`
- [ ] Update the "Version" field in `install/linux/dangerzone.spec`
- [ ] Bump the Debian version by adding a new changelog entry in `debian/changelog`
- [ ] [Bump the minimum Docker Desktop versions](https://github.com/freedomofpress/dangerzone/blob/main/RELEASE.md#bump-the-minimum-docker-desktop-version) in `isolation_provider/container.py`
- [ ] Bump the dates and versions in the `Dockerfile`
- [ ] Update the download links in our `INSTALL.md` page to point to the new version (the download links will be populated after the release)
- [ ] Update screenshot in `README.md`, if necessary
- [ ] CHANGELOG.md should be updated to include a list of all major changes since the last release
- [ ] Test CircleCI Linux builds: Look in `.circleci/config.yml`, manually try each build in docker, and add new platforms and remove obsolete platforms
- [ ] Create a test build in Windows and make sure it works
- [ ] Create a test build in mcaOS and make sure it works
- [ ] There must be a PGP-signed git tag for the version, e.g. for dangerzone 0.1.0, the tag must be `v0.1.0`
- [ ] A draft release should be created. Copy the release notes text from the template at [`docs/templates/release-notes`](https://github.com/freedomofpress/dangerzone/tree/main/docs/templates/)
- [ ] Send the release notes to editorial for review
- [ ] Do the QA tasks
Before making a release, verify the release git tag:
## Add new Linux platforms and remove obsolete ones
```
git fetch
git tag -v v$VERSION
```
Our currently supported Linux OSes are Debian, Ubuntu, Fedora (we treat Qubes OS
as a special case of Fedora, release-wise). For each of these platforms, we need
to check if a new version has been added, or if an existing one is now EOL
(https://endoflife.date/ is handy for this purpose).
If the tag verifies successfully and check it out:
In case of a new version (beta, RC, or official release):
```
git checkout v$VERSION
```
1. Add it in our CI workflows, to test if that version works.
* See `.circleci/config.yml` and `.github/workflows/ci.yml`, as well as
`dev_scripts/env.py` and `dev_scripts/qa.py`.
2. Do a test of this version locally with `dev_scripts/qa.py`. Focus on the
GUI part, since the basic functionality is already tested by our CI
workflows.
3. Add the new version in our `INSTALL.md` document, and drop a line in our
`CHANGELOG.md`.
4. If that version is a new stable release, update the `RELEASE.md` and
`BUILD.md` files where necessary.
4. Send a PR with the above changes.
## macOS release
In case of the removal of a version:
To make a macOS release, go to macOS build machine:
1. Remove any mention to this version from our repo.
* Consult the previous paragraph, but also `grep` your way around.
2. Add a notice in our `CHANGELOG.md` about the version removal.
## Bump the minimum Docker Desktop version
We embed the minimum docker desktop versions inside Dangerzone, as an incentive for our macOS and Windows users to upgrade to the latests version.
You can find the latest version at the time of the release by looking at [their release notes](https://docs.docker.com/desktop/release-notes/)
## Large Document Testing
Parallel to the QA process, the release candidate should be put through the large document tests in a dedicated machine to run overnight.
Follow the instructions in `docs/developer/TESTING.md` to run the tests.
These tests will identify any regressions or progression in terms of document coverage.
## Release
Once we are confident that the release will be out shortly, and doesn't need any more changes:
- [ ] Create a PGP-signed git tag for the version, e.g., for dangerzone `v0.1.0`:
```bash
git tag -s v0.1.0
git push origin v0.1.0
```
**Note**: release candidates are suffixed by `-rcX`.
> [!IMPORTANT]
> Because we don't have [reproducible builds](https://github.com/freedomofpress/dangerzone/issues/188)
> yet, building the Dangerzone container image in various platforms would lead
> to different container image IDs / hashes, due to different timestamps. To
> avoid this issue, we should build the final container image for x86_64
> architectures on **one** platform, and then copy it to the rest of the
> You can automate these steps from your macOS terminal app with:
>
> ```
> export APPLE_ID=<email>
> make build-macos-intel # for Intel macOS
> make build-macos-arm # for Apple Silicon macOS
> ```
The following needs to happen for both Silicon and Intel chipsets.
#### Initial Setup
- Build machine must have:
- macOS 10.14
- Apple-trusted `Developer ID Application: FIRST LOOK PRODUCTIONS, INC.` and `Developer ID Installer: FIRST LOOK PRODUCTIONS, INC.` code-signing certificates installed
- An app-specific Apple ID password saved in the login keychain called `flockagent-notarize`
- Verify and checkout the git tag for this release
- Run `poetry install`
- Run `poetry run ./install/macos/build_app.py --with-codesign`; this will make `dist/Dangerzone.dmg`
- Wait for it to get approved, check status with: `xcrun altool --notarization-history 0 -u "micah@firstlook.org" -p "@keychain:dangerzone-notarize"`
- (If it gets rejected, you can see why with: `xcrun altool --notarization-info [RequestUUID] -u "micah@firstlook.org" -p "@keychain:dangerzone-notarize"`)
- After it's approved, staple the ticket: `xcrun stapler staple dist/Dangerzone.dmg`
- Apple-trusted `Developer ID Application: Freedom of the Press Foundation (94ZZGGGJ3W)` code-signing certificates installed
- Apple account must have:
- A valid application password for `notarytool` in the Keychain. You can verify
this by running: `xcrun notarytool history --apple-id "<email>" --keychain-profile "dz-notarytool-release-key"`. If you don't find it, you can add it to the Keychain by running
with the respective `email` and `team ID` (the latter can be obtained [here](https://developer.apple.com/help/account/manage-your-team/locate-your-team-id))
- Agreed to any new terms and conditions. You can find those if you visit
https://developer.apple.com and login with the proper Apple ID.
This process ends up with the final file:
#### Releasing and Signing
```
dist/Dangerzone.dmg
```
Here is what you need to do:
Rename `Dangerzone.dmg` to `Dangerzone-$VERSION.dmg`.
- [ ] Verify and install the latest supported Python version from
[python.org](https://www.python.org/downloads/macos/) (do not use the one from
brew as it is known to [cause issues](https://github.com/freedomofpress/dangerzone/issues/471))
## Windows release
- [ ] Checkout the dependencies, and clean your local copy:
To make a Windows release, go to the Windows build machine:
```bash
- Build machine should be running Windows 10, and have the Windows codesigning certificate installed
- Verify and checkout the git tag for this release
- Run `poetry install`
- Run `poetry shell`, then `cd ..\pyinstaller`, `python setup.py install`, `exit`
- Run `poetry run install\windows\step1-build-exe.bat`
- Open a second command prompt _as an administratror_, cd to the dangerzone directory, and run: `install\windows\step2-make-symlink.bat`
- Back in the first command prompt, run: `poetry run install\windows\step3-build-installer.bat`
- When you're done you will have `dist\Dangerzone.msi`
# In case of a new Python installation or minor version upgrade, e.g., from
# 3.11 to 3.12, reinstall Poetry
python3 -m pip install poetry
# You can verify the correct Python version is used
poetry debug info
# Replace with the actual version
export DZ_VERSION=$(cat share/version.txt)
# Verify and checkout the git tag for this release:
git checkout -f v$VERSION
# Clean the git repository
git clean -df
# Clean up the environment
poetry env remove --all
# Install the dependencies
poetry sync
```
- [ ] Build the container image and the OCR language data
The Windows release is performed in a Windows 11 virtual machine (as opposed to a physical one).
#### Initial Setup
- Download a VirtualBox VM image for Windows from here: https://developer.microsoft.com/en-us/windows/downloads/virtual-machines/ and import it into VirtualBox. Also install the Oracle VM VirtualBox Extension Pack.
- Install updates
- Install git for Windows from https://git-scm.com/download/win, and clone the dangerzone repo
- Follow the Windows build instructions in `BUILD.md`, except:
- Don't install Docker Desktop (it won't work without nested virtualization)
- Install the Windows SDK from here: https://developer.microsoft.com/en-us/windows/downloads/windows-sdk/ and add `C:\Program Files (x86)\Microsoft SDKs\ClickOnce\SignTool` to the path (you'll need it for `signtool.exe`)
- You'll also need the Windows codesigning certificate installed on the VM
#### Releasing and Signing
- [ ]Checkout the dependencies, and clean your local copy:
```bash
# In case of a new Python installation or minor version upgrade, e.g., from
# 3.11 to 3.12, reinstall Poetry
python3 -m pip install poetry
# You can verify the correct Python version is used
poetry debug info
# Replace with the actual version
export DZ_VERSION=$(cat share/version.txt)
# Verify and checkout the git tag for this release:
git checkout -f v$VERSION
# Clean the git repository
git clean -df
# Clean up the environment
poetry env remove --all
# Install the dependencies
poetry sync
```
- [ ] Copy the container image into the VM
> [!IMPORTANT]
> Instead of running `python .\install\windows\build-image.py` in the VM, run the build image script on the host (making sure to build for `linux/amd64`). Copy `share/container.tar` and `share/image-id.txt` from the host into the `share` folder in the VM.
- [ ] Run `poetry run .\install\windows\build-app.bat`
- [ ] When you're done you will have `dist\Dangerzone.msi`
Rename `Dangerzone.msi` to `Dangerzone-$VERSION.msi`.
## Linux release
### Linux release
Linux binaries are automatically built and deployed to repositories when a new tag is pushed.
> [!TIP]
> You can automate these steps from any Linux distribution with:
>
> ```
> make build-linux
> ```
>
> You can then add the created artifacts to the appropriate APT/YUM repo.
## Publishing the release
Below we explain how we build packages for each Linux distribution we support.
To publish the release:
#### Debian/Ubuntu
- Create a new release on GitHub, put the changelog in the description of the release, and upload the macOS and Windows installers
- Update the [Installing Dangerzone](https://github.com/firstlookmedia/dangerzone/wiki/Installing-Dangerzone) wiki page
- Update the [Dangerzone website](https://github.com/firstlookmedia/dangerzone.rocks) to link to the new installers
Because the Debian packages do not contain compiled Python code for a specific
Python version, we can create a single Debian package and use it for all of our
Debian-based distros.
Create a Debian Bookworm development environment. You can [follow the
instructions in our build section](https://github.com/freedomofpress/dangerzone/blob/main/BUILD.md#debianubuntu),
or create your own locally with:
```sh
# Create and run debian bookworm development environment
[`freedomofpress/yum-tools-prod`](https://github.com/freedomofpress/yum-tools-prod) repo, by sending a PR. Follow the instructions in that repo on how to do so.
# User is not in the docker group, ask if they prefer typing their password
message="<b>Dangerzone requires Docker</b><br><br>In order to use Docker, your user must be in the 'docker' group or you'll need to type your password each time you run dangerzone.<br><br><b>Adding your user to the 'docker' group is more convenient but less secure</b>, and will require just typing your password once. Which do you prefer?"
return_code=Alert(
self,
self.global_common,
message,
ok_text="I'll type my password each time",
extra_button_text="Add my user to the 'docker' group",
message="Great! Now you must log out of your computer and log back in, and then you can use Dangerzone."
Alert(self,self.global_common,message).launch()
else:
message="Failed to add your user to the 'docker' group, quitting."
Alert(self,self.global_common,message).launch()
returnFalse
else:
# Cancel
returnFalse
returnTrue
defensure_docker_service_is_started(self):
ifnotis_docker_ready(self.global_common):
message="<b>Dangerzone requires Docker</b><br><br>Docker should be installed, but it looks like it's not running in the background.<br><br>Click Ok to try starting the docker service. You will have to type your login password."
if(
Alert(self,self.global_common,message).launch()
==QtWidgets.QDialog.Accepted
):
p=subprocess.run(
[
"/usr/bin/pkexec",
self.global_common.get_resource_path(
"enable_docker_service.sh"
),
]
)
ifp.returncode==0:
# Make sure docker is now ready
ifis_docker_ready(self.global_common):
returnTrue
else:
message="Restarting docker appeared to work, but the service still isn't responding, quitting."
Alert(self,self.global_common,message).launch()
else:
message="Failed to start the docker service, quitting."
Description: Take potentially dangerous PDFs, office documents, or images
Dangerzone is an open source desktop application that takes potentially dangerous PDFs, office documents, or images and converts them to safe PDFs. It uses disposable VMs on Qubes OS, or container technology in other OSes, to convert the documents within a secure sandbox.
For users testing our [new Qubes integration (beta)](https://github.com/freedomofpress/dangerzone/blob/main/INSTALL.md#qubes-os), please note that our instructions were missing a configuration detail for disposable VMs which is necessary to fully harden the configuration.
These instructions apply to users who followed the setup instructions **before October 25, 2023**.
**What you need to do:** run the following command in dom0:
```bash
qvm-prefs dz-dvm default_dispvm ''
```
**Explanation**: In Qubes OS, the default template for disposable VMs is network-connected. For this reason, we instruct users to create their own disposable VM (`dz-dvm`). However, adversaries with the ability to execute commands on `dz-dvm` would also be able open new disposable VMs with the default settings. By setting the default_dispvm to "none" we prevent this bypass.
Dangerzone has some automated testing under `tests/`.
The following assumes that you have already setup the development environment.
## Run tests
Unit / integration tests are run with:
```bash
poetry run make test
```
## Run large tests
We also have a larger set of tests that can take a day or more to run, where we evaluate the completeness of Dangerzone conversions.
```bash
poetry run make test-large
```
### Test report generation
After running the large tests, a report is stored under `tests/test_docs_large/results/junit/` and it is composed of the JUnit XML file describing the pytest run.
This report can be analysed for errors. It is obtained by running:
```bash
cd tests/docs_test_large
make report
```
If you want to run the report on some historical test result, you can call:
The `dev_scripts/env.py` script creates environments where a user can run
Dangerzone, allows the user to run arbitrary commands in these environments, as
well as run Dangerzone (nested containerization).
It supports two types of environments:
1. Dev environment. This environment has developer tools, necessary for
Dangerzone, baked in. Also, it mounts the Dangerzone source under
`/home/user/dangerzone` in the container. The developer can then run
Dangerzone from source, with `poetry run ./dev_scripts/dangerzone`.
2. End-user environment. This environment has only Dangerzone installed in it,
from the .deb/.rpm package that we have created. For convenience, it also has
the Dangerzone source mounted under `/home/user/dangerzone`, but it lacks
Poetry and other build tools. The developer can run Dangerzone there with
`dangerzone`. This environment is the most vanilla Dangerzone environment,
and should be closer to the end user's environment, than the development
environment.
Each environment corresponds to a Dockerfile, which is generated on the fly. The
developer can see this Dockerfile by passing `--show-dockerfile`.
For usage information, run `./dev_scripts/env.py --help`.
## Nested containerization
Since the Dangerzone environments are containers, this means that the Podman
containers that Dangerzone creates have to be nested containers. This has some
challenges that we will highlight below:
1. Containers typically only have a subset of syscalls allowed, and sometimes
only for specific arguments. This happens with the use of
[seccomp filters](https://docs.docker.com/engine/security/seccomp/). For
instance, in Docker, the `clone` syscall is limited in containers and cannot
create new namespaces
(https://docs.docker.com/engine/security/seccomp/#significant-syscalls-blocked-by-the-default-profile). For testing/development purposes, we can get around this limitation
by disabling the seccomp filters for the external container with
`--security-opt seccomp=unconfined`. This has the same effect as developing
Dangerzone locally, so it should probably be sufficient for now.
2. While Linux supports nested namespaces, we need extra handling for nested
user namespaces. By default, the configuration for each user namespace (see
of the processes that run in them. By default, containers do not have mount
capabilities, since it requires `CAP_SYS_ADMIN`, which effectively
[makes the process root](https://lwn.net/Articles/486306/) in the specific
user namespace. In our case, we have to give the Dangerzone environment this
capability, since it will have to mount directories in Podman containers. For
this reason, as well as some extra things we bumped into during development,
we pass `--privileged` when creating the Dangerzone environment, which
includes the `CAP_SYS_ADMIN` capability.
## GUI containerization
Running a GUI app in a container is a tricky subject for multi-platform apps. In
our case, we deal specifically with Linux environments, so we can target just
this platform.
To understand how a GUI app can draw in the user's screen from within a
container, we must first understand how it does so outside the container. In
Unix-like systems, GUI apps act like
[clients to a display server](https://wayland.freedesktop.org/architecture.html).
The most common display server implementation is X11, and the runner-up is
Wayland. Both of these display servers share some common traits, mainly that
they use Unix domain sockets as a way of letting clients communicate with them.
So, this gives us the answer on how one can run a containerized GUI app; they
can simply mount the Unix Domain Socket in the container. In practice this is
more nuanced, for two reasons:
1. Wayland support is not that mature on Linux, so we need to
[set some extra environment variables](https://github.com/mviereck/x11docker/wiki/How-to-provide-Wayland-socket-to-docker-container). To simplify things, we will target
X11 / XWayland hosts, which are the majority of the Linux OSes out there.
2. Sharing the Unix Domain socket does not allow the client to talk to the
display server, for security reasons. In order to allow the client, we need
to mount a magic cookie stored in a file pointed at by the `$XAUTHORITY`
envvar. Else, we can use `xhost`, which is considered slightly more dangerous
for multi-user environments.
## Caching and Reproducibility
In order to build Dangerzone environments, the script uses the following inputs:
* Dev environment:
- Distro name and version. Together, these comprise the base container image.
- `poetry.lock` and `pyproject.toml`. Together, these comprise the build
context.
* End-user environment:
- Distro name and version. Together, these comprise the base container image.
- `.deb` / `.rpm` Dangerzone package, as found under `deb_dist/` or `dist/`
respectively.
Any change in these inputs busts the cache for the corresponding image. In
theory, this means that the Dangerzone environment for each commit can be built
reproducibly. In practice, there are some issues that we haven't covered yet: