Update the build instructions for Ubuntu Jammy regarding conmon, now
that oldstable-proposed-updates no longer offers a patched conmon
package. Propose instead to install conmon from our apt-tools-prod repo.
Instead of installing a patched conmon version from the
oldstable-proposed-updates repo, install it from our apt-tools-prod
repo. This applies to just Ubuntu Jammy, since the rest of the platforms
don't have this problem.
Now that the conmon package with version 2.0.25+ds1-1.1+deb11u1 has been
released [1] for Debian Bullseye, there is no need to install it from
the oldstable-proposed-updates repo any more.
[1]: https://tracker.debian.org/pkg/conmon
PyMuPDF 1.23.9 swapped the new fitz implementation (fitz_new)
with the fitz module. In the new module there are prints in the code
that interfere with our stdout for sending JSON from the container.
Pinning the version seems to have no adverse consequences [1], since
fitz_old hasn't had significant changes and it gives breathing room for
the print-related issue to be tackled in PR [2].
Fixes temporarily #700
[1]: https://github.com/freedomofpress/dangerzone/issues/700#issuecomment-1938357651
[2]: https://github.com/pymupdf/PyMuPDF/pull/3137
The container image does not need the TESSDATA_PREFIX env variable since
its PyMuPDF version is new enough to support `tessdata` as an argument
when calling the PyMuPDF tesseract method.
Switching from mounting files to writing to stdout has introduced some
Podman crashes in specific environments (Ubuntu Jammy / Debian Bullseye)
due to a conmon bug that affects version 2.0.25.
Fixing it for various permutations of the environments we support
requires the following:
1. CI tests: Install conmon from the oldstable-proposed-updates in
our Debian Bullseye / Ubuntu Jammy dev/end-user environments.
2. Developers: Add a line in BUILD.md that suggests users to install
conmon from the oldstable-proposed-updates repo, or some other repo
they prefer.
3. End-user installations: We will build conmon for Ubuntu Jammy, and
wait until the proposed updates repo gets merged in Debian Bullseye.
Fixes#685
Since the progress information is now inferred on host based on the
number of pages obtained, progress-tracking variables should be removed
from the server.
Remove timeouts due to several reasons:
1. Lost purpose: after implementing the containers page streaming the
only subprocess we have left is LibreOffice. So don't have such a
big risk of commands hanging (the original reason for timeouts).
2. Little benefit: predicting execution time is generically unsolvable
computer science problem. Ultimately we were guessing an arbitrary
time based on the number of pages and the document size. As a guess
we made it pretty lax (30s per page or MB). A document hanging for
this long will probably lead to user frustration in any case and the
user may be compelled to abort the conversion.
3. Technical Challenges with non-blocking timeout: there have been
several technical challenges in keeping timeouts that we've made effort
to accommodate. A significant one was having to do non-blocking read to
ensure we could timeout when reading conversion stream (and then used
here)
Fixes#687
This reverts commit fea193e935.
This is part of the purge of timeout-related code since we no longer
need it [1]. Non-blocking reads were introduced in the reverted commit
in order to be able to cut a stream mid-way due to a timeout. This is
no longer needed now that we're getting rid of timeouts.
[1]: https://github.com/freedomofpress/dangerzone/issues/687
If we increased the number of parallel conversions, we'd run into an
issue where the streams were getting mixed together. This was because
the Converter.proc was a single attribute. This breaks it down into a
local variable such that this mixup doesn't happen.
Conversions methods had changed and that was part of the reason why
the tests were failing. Furthermore, due to the `provider.proc`, which
stores the associated qrexec / container process, "server" exceptions
raise a IterruptedConversion error (now ConverterProcException), which
then requires interpretation of the process exit code to obtain the
"real" exception.
Avoids downloading the container image 4 times in the multi-stage build
by first pulling the alpine image once and then building without any
pulls.
Implemented following a suggestion of @apyrgio.
Now that only the second container can send JSON-encoded progress
information, we can the untrusted JSON parsing. The parse_progress was
also renamed to `parse_progress_trusted` to ensure future developers
don't mistake this as a safe method.
The old methods for sending untrusted JSON were repurposed to send the
progress instead to stderr for troubleshooting in development mode.
Fixes#456
If one converted more than one document, since the state of
IsolationProvider.percentage would be stored in the IsolationProvider
instance, it would get reused for the second document. The fix is to
keep it as a local variable, but we can explore having progress stored
on the document itself, for example. Or having one IsolationProvider per
conversion.
Merge Qubes and Containers isolation providers core code into the class
parent IsolationProviders abstract class.
This is done by streaming pages in containers for exclusively in first
conversion process. The commit is rather large due to the multiple
interdependencies of the code, making it difficult to split into various
commits.
The main conversion method (_convert) now in the superclass simply calls
two methods:
- doc_to_pixels()
- pixels_to_pdf()
Critically, doc_to_pixels is implemented in the superclass, diverging
only in a specialized method called "start_doc_to_pixels_proc()". This
method obtains the process responsible that communicates with the
isolation provider (container / disp VM) via `podman/docker` and qrexec
on Containers and Qubes respectively.
Known regressions:
- progress reports stopped working on containers
Fixes#443
Reclaim some storage space in the middle of the CI job that builds and
installs Dangerzone in Fedora. The reason is that previously, we
encountered an issues with CI runners running out of space.
Explain what happens when we bump our `poetry.lock`, and a new
Pyside6 version. Also, have a step-by-step guide on how the maintainer
should create a new PySide6 RPM and update FPF's repo, so that
Dangerzone can be released.
Add some Fedora CI jobs that build RPMs, install them in an end-user
environment, and make a simple conversion and GUI import check. These
are basically smoke tests for Fedora, similar to the ones we have for
Debian.
Now that we can create a Dangerzone RPM that depends on PySide6, we can
officially support Fedora 39 as a platform. Add this platform in our CI
tests, as well as our install/release notes.
Fixes#606
Extend the env.py script to build an end-user, Fedora 39+ environment
with PySide6 installed, as a regular RPM package. Previously, this was
only possible for development environments with PySide6 downloaded from
PyPI.
As a way to simplify builds, the env.py script offers the option to
download the RPM package itself from FPF's RPM repo [1], if the package
has been uploaded.
[1]: https://packages.freedom.press/yum-tools-prod
Fedora 39 ships with Python 3.12 by default, which Dangerzone previously
did not support due to limitations from the PySide6 package. Now that
the PySide6 package has been updated to 6.6.1, and the limitation has
lifted, we should to reflect this in pyproject.toml.
Install a subset of Podman dependencies, so that we don't also install
Systemd. Doing so can introduce some subtle issues of its own, which is
why we prefer cherry-picking the Podman packages we really need.
Fixes#689
Make Poetry include data files only in the source distribution, and not
on our wheels. This mainly makes RPM packaging a bit easier, but does
not solve the problem of how to install files to
`/usr/share/dangerzone`.
Also, include files using globs, which is the way Poetry prefers.
Fixes#678
Refs #677
Our security scans for the released container image have flagged
CVE-2023-7104. Our assessment is that this CVE doesn't affect
Dangerzone, mainly because our understanding is that attackers cannot
embed SQLite dbs within LibreOffice spreadsheets.
Add the following functionality to the build image script:
1. Let the user choose the container runtime of their choice. In some
systems, both Docker and Podman may be available, so we need to let
the user choose which runtime they want.
2. Let users choose if they want to save the image. For non-production
builds, we may want to simply build the container image, without
the time penalty of compression.