If we increased the number of parallel conversions, we'd run into an
issue where the streams were getting mixed together. This was because
the Converter.proc was a single attribute. This breaks it down into a
local variable such that this mixup doesn't happen.
Conversions methods had changed and that was part of the reason why
the tests were failing. Furthermore, due to the `provider.proc`, which
stores the associated qrexec / container process, "server" exceptions
raise a IterruptedConversion error (now ConverterProcException), which
then requires interpretation of the process exit code to obtain the
"real" exception.
Avoids downloading the container image 4 times in the multi-stage build
by first pulling the alpine image once and then building without any
pulls.
Implemented following a suggestion of @apyrgio.
Now that only the second container can send JSON-encoded progress
information, we can the untrusted JSON parsing. The parse_progress was
also renamed to `parse_progress_trusted` to ensure future developers
don't mistake this as a safe method.
The old methods for sending untrusted JSON were repurposed to send the
progress instead to stderr for troubleshooting in development mode.
Fixes#456
If one converted more than one document, since the state of
IsolationProvider.percentage would be stored in the IsolationProvider
instance, it would get reused for the second document. The fix is to
keep it as a local variable, but we can explore having progress stored
on the document itself, for example. Or having one IsolationProvider per
conversion.
Merge Qubes and Containers isolation providers core code into the class
parent IsolationProviders abstract class.
This is done by streaming pages in containers for exclusively in first
conversion process. The commit is rather large due to the multiple
interdependencies of the code, making it difficult to split into various
commits.
The main conversion method (_convert) now in the superclass simply calls
two methods:
- doc_to_pixels()
- pixels_to_pdf()
Critically, doc_to_pixels is implemented in the superclass, diverging
only in a specialized method called "start_doc_to_pixels_proc()". This
method obtains the process responsible that communicates with the
isolation provider (container / disp VM) via `podman/docker` and qrexec
on Containers and Qubes respectively.
Known regressions:
- progress reports stopped working on containers
Fixes#443
Reclaim some storage space in the middle of the CI job that builds and
installs Dangerzone in Fedora. The reason is that previously, we
encountered an issues with CI runners running out of space.
Explain what happens when we bump our `poetry.lock`, and a new
Pyside6 version. Also, have a step-by-step guide on how the maintainer
should create a new PySide6 RPM and update FPF's repo, so that
Dangerzone can be released.
Add some Fedora CI jobs that build RPMs, install them in an end-user
environment, and make a simple conversion and GUI import check. These
are basically smoke tests for Fedora, similar to the ones we have for
Debian.
Now that we can create a Dangerzone RPM that depends on PySide6, we can
officially support Fedora 39 as a platform. Add this platform in our CI
tests, as well as our install/release notes.
Fixes#606
Extend the env.py script to build an end-user, Fedora 39+ environment
with PySide6 installed, as a regular RPM package. Previously, this was
only possible for development environments with PySide6 downloaded from
PyPI.
As a way to simplify builds, the env.py script offers the option to
download the RPM package itself from FPF's RPM repo [1], if the package
has been uploaded.
[1]: https://packages.freedom.press/yum-tools-prod
Fedora 39 ships with Python 3.12 by default, which Dangerzone previously
did not support due to limitations from the PySide6 package. Now that
the PySide6 package has been updated to 6.6.1, and the limitation has
lifted, we should to reflect this in pyproject.toml.
Install a subset of Podman dependencies, so that we don't also install
Systemd. Doing so can introduce some subtle issues of its own, which is
why we prefer cherry-picking the Podman packages we really need.
Fixes#689
Make Poetry include data files only in the source distribution, and not
on our wheels. This mainly makes RPM packaging a bit easier, but does
not solve the problem of how to install files to
`/usr/share/dangerzone`.
Also, include files using globs, which is the way Poetry prefers.
Fixes#678
Refs #677
Our security scans for the released container image have flagged
CVE-2023-7104. Our assessment is that this CVE doesn't affect
Dangerzone, mainly because our understanding is that attackers cannot
embed SQLite dbs within LibreOffice spreadsheets.
Add the following functionality to the build image script:
1. Let the user choose the container runtime of their choice. In some
systems, both Docker and Podman may be available, so we need to let
the user choose which runtime they want.
2. Let users choose if they want to save the image. For non-production
builds, we may want to simply build the container image, without
the time penalty of compression.
PyMuPDF replaced the need for almost all dependencies, which this commit
now removes.
We are also removing tesseract-ocr as a dependency since
(to our surprise) PyMuPDF ships directly with tesseract binaries [1].
However, now that tesseract-ocr is not available directly as a binary
tool, the `test_ocr.py` needed to be changed.
Fixes#658
[1]: https://github.com/freedomofpress/dangerzone/issues/658#issuecomment-1861033149
Some tests [1] lead to the conclusion that ocr_compression does the same
to the file (performance and size-wise) to the file as deflating images
when saving the file. However, both methods active do add a bit of extra
time. For this reason we're disabling the image deflation (default
option).
[1]: https://github.com/freedomofpress/dangerzone/pull/622#discussion_r1434042296
Qubes does on-host pixels-to-pdf whereas the containers version doesn't.
This leads to an issue where on the containers version it tries to load
fitz, which isn't installed there, just because it's trying to check if
it should run the Qubes version.
The error it was showing was something like this:
ImportError while loading conftest '/home/user/dangerzone/tests/conftest.py'.
tests/__init__.py:8: in <module>
from dangerzone.document import SAFE_EXTENSION
dangerzone/__init__.py:16: in <module>
from .gui import gui_main as main
dangerzone/gui/__init__.py:28: in <module>
from ..isolation_provider.qubes import Qubes, is_qubes_native_conversion
dangerzone/isolation_provider/qubes.py:15: in <module>
from ..conversion.pixels_to_pdf import PixelsToPDF
dangerzone/conversion/pixels_to_pdf.py:16: in <module>
import fitz
E ModuleNotFoundError: No module named 'fitz'
For context see discussion in [1].
[1]: https://github.com/freedomofpress/dangerzone/pull/622#issuecomment-1839164885
Breaks down the container build into multiple stages in order to speed
up build times. Building PyMuPDF was taking too long and this way it can
be cached.
The original version was made by @apyrgio
Ensure that when the container image is installing pymupdf (unavailable
in the repos) with verified hashes. To do so, it has the pymupdf
dependency declared in a "container" group in `pyproject.toml`, which
then gets exported into a requirements.txt, which is then used for
hash-verification when building the container.
Because this required modifying the container image build scripts, they
were all merged to avoid duplicate code. This was an overdue change
anyways.
We're intentionally bypassing PEP 668 [1], which prevents the
installation of non-distro python wheels alongside system packages to
avoid incompatibilities at distro-level.
We are intentionally bypassing this since our container image is a
controlled environment (we only ship a version after rigorous testing).
[1]: https://peps.python.org/pep-0668/
The original document was larger in dimensions than the original one due
to a mismatch in DPI settings. When converting documents to pixels we
were setting the DPI to 150 pixels per inch. Then when converting back
into a PDF we were using 70 DPI. This difference would result in an
overall larger document in dimensions (though not necessarily in file
size).
Fixes#626