Make Poetry include data files only in the source distribution, and not
in our wheels. This mainly makes RPM packaging a bit easier, but does
not solve the problem of how to install files to
`/usr/share/dangerzone`.
Also, include files using globs, which is the way Poetry prefers.
Fixes #678
Refs #677
Our security scans for the released container image have flagged
CVE-2023-7104. Our assessment is that this CVE doesn't affect
Dangerzone, mainly because our understanding is that attackers cannot
embed SQLite dbs within LibreOffice spreadsheets.
Add the following functionality to the build image script:
1. Let the user choose the container runtime. On some systems, both
Docker and Podman may be available, so we need to let the user pick
which one to use.
2. Let users choose if they want to save the image. For non-production
builds, we may want to simply build the container image, without
the time penalty of compression.
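The two options above could be exposed roughly as follows. This is a minimal sketch, not the actual script: the flag names `--runtime` and `--no-save`, the `podman` default, and the `parse_args` helper are all assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the (hypothetical) options of a container build script."""
    parser = argparse.ArgumentParser(description="Build the container image")
    # Assumed flag: pick the container runtime explicitly, since both
    # Docker and Podman may be installed on the same system.
    parser.add_argument(
        "--runtime",
        choices=["docker", "podman"],
        default="podman",  # assumed default, not taken from the source
        help="container runtime to build with",
    )
    # Assumed flag: skip saving (and compressing) the image for quick,
    # non-production builds.
    parser.add_argument(
        "--no-save",
        action="store_true",
        help="build the image without saving it",
    )
    return parser.parse_args(argv)
```

Restricting `--runtime` with `choices` makes an unsupported runtime fail at argument parsing rather than midway through a build.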
PyMuPDF replaced the need for almost all dependencies, which this commit
now removes.
We are also removing tesseract-ocr as a dependency since
(to our surprise) PyMuPDF ships directly with tesseract binaries [1].
However, now that tesseract-ocr is not available directly as a binary
tool, the `test_ocr.py` needed to be changed.
Fixes #658
[1]: https://github.com/freedomofpress/dangerzone/issues/658#issuecomment-1861033149
Some tests [1] lead to the conclusion that ocr_compression has the same
effect on the file (performance and size-wise) as deflating images when
saving the file. However, having both methods active adds a bit of extra
time. For this reason we're disabling the image deflation (default
option).
[1]: https://github.com/freedomofpress/dangerzone/pull/622#discussion_r1434042296
Qubes does on-host pixels-to-PDF conversion, whereas the containers
version doesn't. As a result, the containers version tries to import
fitz, which isn't installed there, merely to check whether it should run
the Qubes version.
The error it was showing was something like this:
    ImportError while loading conftest '/home/user/dangerzone/tests/conftest.py'.
    tests/__init__.py:8: in <module>
        from dangerzone.document import SAFE_EXTENSION
    dangerzone/__init__.py:16: in <module>
        from .gui import gui_main as main
    dangerzone/gui/__init__.py:28: in <module>
        from ..isolation_provider.qubes import Qubes, is_qubes_native_conversion
    dangerzone/isolation_provider/qubes.py:15: in <module>
        from ..conversion.pixels_to_pdf import PixelsToPDF
    dangerzone/conversion/pixels_to_pdf.py:16: in <module>
        import fitz
    E   ModuleNotFoundError: No module named 'fitz'
For context see discussion in [1].
[1]: https://github.com/freedomofpress/dangerzone/pull/622#issuecomment-1839164885
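One common way to avoid this kind of import-time failure is to defer the import until the code path that needs it actually runs. The sketch below illustrates the general technique only; it is not necessarily how this commit resolves the issue, and the helper names `load_optional` and `convert_on_host` are hypothetical.

```python
import importlib


def load_optional(module_name):
    """Return the named module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError:
        return None


def convert_on_host(module_name="fitz"):
    """Hypothetical entry point for on-host conversion."""
    # The import happens only when on-host conversion is requested, so
    # environments without the module can still import this file.
    module = load_optional(module_name)
    if module is None:
        raise RuntimeError(f"{module_name} is required for on-host conversion")
    return module
```

With this pattern, merely importing the package never touches the optional dependency; only calling `convert_on_host()` does.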
Breaks down the container build into multiple stages in order to speed
up build times. Building PyMuPDF was taking too long and this way it can
be cached.
The original version was made by @apyrgio
Ensure that the container image installs PyMuPDF (unavailable in the
repos) with verified hashes. To do so, the PyMuPDF dependency is
declared in a "container" group in `pyproject.toml`, which
then gets exported into a requirements.txt, which is then used for
hash-verification when building the container.
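Conceptually, hash verification compares the digest of each downloaded file against the digests pinned in the requirements file. The following is a minimal sketch of that check, assuming SHA-256 digests written as `sha256:<hex>`; the helper `matches_pinned_hash` is hypothetical and is not part of pip or of the build scripts.

```python
import hashlib


def matches_pinned_hash(data, pinned_hashes):
    """Return True if the SHA-256 digest of `data` is among the pinned ones.

    `data` is the raw bytes of a downloaded file, and `pinned_hashes` is a
    collection of strings of the (assumed) form "sha256:<hex digest>".
    """
    digest = hashlib.sha256(data).hexdigest()
    return f"sha256:{digest}" in pinned_hashes
```

A file whose contents differ from what was pinned, even by a single byte, produces a different digest and is rejected.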
Because this required modifying the container image build scripts, they
were all merged to avoid duplicate code. This was an overdue change
anyways.
We're intentionally bypassing PEP 668 [1], which prevents the
installation of non-distro python wheels alongside system packages to
avoid incompatibilities at distro-level.
This is acceptable since our container image is a controlled environment
(we only ship a version after rigorous testing).
[1]: https://peps.python.org/pep-0668/
The converted document was larger in dimensions than the original one due
to a mismatch in DPI settings. When converting documents to pixels we
were setting the DPI to 150 pixels per inch. Then when converting back
into a PDF we were using 70 DPI. This difference would result in an
overall larger document in dimensions (though not necessarily in file
size).
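The effect of the mismatch can be checked with a little arithmetic. The sketch below assumes a US Letter page width of 8.5 inches purely as an example; the helper name `physical_size_inches` is hypothetical.

```python
def physical_size_inches(pixels, dpi):
    """Physical length of `pixels` pixels when laid out at `dpi`."""
    return pixels / dpi


page_width_in = 8.5   # assumed example: US Letter width
render_dpi = 150      # DPI used when converting the document to pixels
old_output_dpi = 70   # DPI previously used when converting back to PDF

# Rendering the page at 150 DPI yields this many pixels across.
pixels = round(page_width_in * render_dpi)

# Interpreting those pixels at 70 DPI inflates the page width.
old_width_in = physical_size_inches(pixels, old_output_dpi)

# Using the same DPI in both directions restores the original width.
fixed_width_in = physical_size_inches(pixels, render_dpi)
```

Here the page is 1275 pixels wide, which at 70 DPI comes out to roughly 18.2 inches instead of 8.5, a scale factor of 150/70.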
Fixes #626
Adding PyMuPDF essentially makes the code much simpler since it can do
everything that we'd need multiple programs for. It also includes
tesseract-OCR integration, which this commit makes use of.
Timeouts can no longer be used since we're not calling a subprocess. We
could still implement them, but it makes more sense to rely on the
yet-to-be-implemented client-side timeouts (in containers).
Use PyMuPDF (AGPL-licensed) within the container conversion to replace
the pdf conversion to RGB. This massively simplifies the code since
PyMuPDF is a native python library.
Many instructions relied on the fact that the developer would have to
copy over the RPC policies and install the dependencies manually on the
template. This is no longer needed since a Qubes-built package ships
the necessary RPC policies and dependencies.
Removing the dependencies installation also helps with documentation
maintenance since it would be yet another place where we would need to
keep the dependency list up to date.
Make the first part of the Dangerzone development instructions cover
only the installation of the Qubes RPC policies. For Poetry install and
other development-related tasks, point to the Fedora part of the
instructions to avoid duplication.
Create a new GitHub Actions workflow which aims to continuously test our
official installation instructions. The way we do it is the following:
1. Create two jobs, one for the Debian-based distros, and one for Fedora
ones.
2. Copy the instructions from INSTALL.md into each job.
3. Create a matrix that runs the installation jobs in parallel, for each
supported distro and version.
The jobs will run only at 00:00 UTC, and not on every PR, since it
wouldn't make sense otherwise.
Fixes #653
Add a script to upload release assets to GitHub. This script can take
either a release ID, a Git tag, or the latest draft release.
Note that while GitHub's official client can upload assets to releases,
it cannot upload them to draft releases [1], which is why we created
this script.
[1]: https://cli.github.com/manual/gh_release_upload
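The three ways of selecting a release could be modeled as mutually exclusive command-line options. This is only a sketch of that idea: the flag names `--id`, `--tag`, and `--latest-draft`, and the `parse_release_selector` helper, are assumptions and may not match the actual script.

```python
import argparse


def parse_release_selector(argv=None):
    """Parse a (hypothetical) CLI that targets exactly one release."""
    parser = argparse.ArgumentParser(description="Upload release assets")
    # Exactly one of the three selectors must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--id", type=int, help="numeric release ID")
    group.add_argument("--tag", help="Git tag of the release")
    group.add_argument(
        "--latest-draft",
        action="store_true",
        help="use the latest draft release",
    )
    # One or more files to upload as release assets.
    parser.add_argument("assets", nargs="+", help="files to upload")
    return parser.parse_args(argv)
```

For example, `parse_release_selector(["--tag", "v1.2.3", "a.rpm"])` selects the release by tag (the tag and file name here are placeholder values), while passing both `--id` and `--tag` is rejected by argparse.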