1077 commits

Author SHA1 Message Date
Alex Pyrgiotis
7e21d5e8c4
ci: Use Docker for building images, instead of Podman 2024-01-03 15:57:49 +00:00
Alex Pyrgiotis
f254575cb4
install: Make build image script more flexible
Add the following functionality to the build image script:

1. Let the user choose the container runtime. On some systems, both
   Docker and Podman may be available, so the user should be able to
   pick which one to use.
2. Let users choose if they want to save the image. For non-production
   builds, we may want to simply build the container image, without
   the time penalty of compression.
2024-01-03 15:57:41 +00:00
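
The two options described in f254575cb4 could be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the flag names (`--runtime`, `--no-save`), the default runtime, the image tag and the output file name are all assumptions made for the example.

```python
import argparse


def parse_args(argv=None):
    """Parse the two hypothetical options: runtime choice and skip-save."""
    parser = argparse.ArgumentParser(description="Build a container image (sketch)")
    parser.add_argument(
        "--runtime",
        choices=["docker", "podman"],
        default="podman",  # assumed default, not taken from the commit
        help="Container runtime used for the build",
    )
    parser.add_argument(
        "--no-save",
        action="store_true",
        help="Only build the image; skip saving it to disk",
    )
    return parser.parse_args(argv)


def build_commands(args, tag="example.local/image:latest"):
    """Return the commands that would run, without executing them."""
    commands = [[args.runtime, "build", "--tag", tag, "."]]
    if not args.no_save:
        # Saving (and any later compression) is the slow part that
        # non-production builds may want to skip.
        commands.append([args.runtime, "save", "-o", "image.tar", tag])
    return commands
```

Returning the command lists instead of running them keeps the sketch easy to inspect; a real script would pass each list to `subprocess.run`.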
deeplow
f1d90c6fa9
Compress per page when not using OCR
Make the compression happen per page when OCR is not enabled [1].

[1]: https://github.com/freedomofpress/dangerzone/pull/622#discussion_r1410986342
2024-01-03 12:58:36 +00:00
deeplow
e2531279c0
FIXUP Revert "Disable image compression when saving PDF"
This reverts commit f074db0beaa50389634203657f9b46307164a353.
2024-01-03 12:58:36 +00:00
deeplow
f676891482
Remove Dockerfile dependencies replaced by PyMuPDF
PyMuPDF replaced the need for almost all dependencies, which this commit
now removes.

We are also removing tesseract-ocr as a dependency since
(to our surprise) PyMuPDF ships directly with tesseract binaries [1].
However, now that tesseract-ocr is not available directly as a binary
tool, the `test_ocr.py` needed to be changed.

Fixes #658

[1]: https://github.com/freedomofpress/dangerzone/issues/658#issuecomment-1861033149
2024-01-03 12:58:36 +00:00
deeplow
ee35e28aa6
Disable image compression when saving PDF
Some tests [1] led to the conclusion that ocr_compression has the same
effect on the file (performance and size-wise) as deflating images
when saving the file. However, having both methods active adds a bit of
extra time. For this reason we're disabling the image deflation (default
option).

[1]: https://github.com/freedomofpress/dangerzone/pull/622#discussion_r1434042296
2024-01-03 12:58:36 +00:00
deeplow
6f61e44502
Solve import errors by lazy-loading fitz module
Qubes does on-host pixels-to-pdf whereas the containers version doesn't.
This leads to an issue where on the containers version it tries to load
fitz, which isn't installed there, just because it's trying to check if
it should run the Qubes version.

The error it was showing was something like this:

    ImportError while loading conftest '/home/user/dangerzone/tests/conftest.py'.
        tests/__init__.py:8: in <module>
            from dangerzone.document import SAFE_EXTENSION
        dangerzone/__init__.py:16: in <module>
            from .gui import gui_main as main
        dangerzone/gui/__init__.py:28: in <module>
            from ..isolation_provider.qubes import Qubes, is_qubes_native_conversion
        dangerzone/isolation_provider/qubes.py:15: in <module>
            from ..conversion.pixels_to_pdf import PixelsToPDF
        dangerzone/conversion/pixels_to_pdf.py:16: in <module>
            import fitz
        E   ModuleNotFoundError: No module named 'fitz'

For context see discussion in [1].

[1]: https://github.com/freedomofpress/dangerzone/pull/622#issuecomment-1839164885
2024-01-03 12:58:36 +00:00
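
The lazy-loading fix described in 6f61e44502 follows a common pattern: move the heavy import from module level into the function that needs it. The sketch below shows that pattern only. The environment variable and the function bodies are hypothetical stand-ins, not the project's real detection or conversion logic.

```python
import os


def is_qubes_native_conversion() -> bool:
    """Cheap check that must work even when PyMuPDF is not installed.

    The environment variable is a hypothetical stand-in for the real check.
    """
    return os.environ.get("EXAMPLE_QUBES_NATIVE") == "1"


def new_pdf_document():
    """Create an empty PDF; only this code path needs PyMuPDF."""
    # Deferred import: `fitz` is loaded the first time this function runs,
    # so merely importing this module never raises ModuleNotFoundError.
    import fitz  # PyMuPDF

    return fitz.open()
```

With this layout, code that only calls `is_qubes_native_conversion()` can import the module on a system without `fitz`, which is the situation the traceback above describes.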
deeplow
773fcfa75b
Add poetry as CI container build dependency
Due to the new build-image.py, which now uses `poetry export`, we need to
explicitly install poetry in the CI before building the container image.
2024-01-03 12:58:36 +00:00
deeplow
80db7bb02e
Remove pre-pymupdf exceptions and detect pymupdf ones 2024-01-03 12:58:35 +00:00
deeplow
e0b092692d
Multi-stage Dockerfile build
Breaks down the container build into multiple stages in order to speed
up build times. Building PyMuPDF was taking too long and this way it can
be cached.

The original version was made by @apyrgio
2024-01-03 12:58:35 +00:00
deeplow
1cd87f73a8
Bump pymupdf to 1.23.8 2024-01-03 12:58:35 +00:00
deeplow
2b082913a0
Bump pymupdf version 1.23.7
The build was failing due to missing kernel libraries. Adding the
linux-headers dependency solves the issue.
2024-01-03 12:58:35 +00:00
deeplow
250d8356cd
Hash-verify container pip install & merge build-image
Ensure that the container image installs pymupdf (unavailable
in the repos) with verified hashes. To do so, it has the pymupdf
dependency declared in a "container" group in `pyproject.toml`, which
then gets exported into a requirements.txt, which is then used for
hash-verification when building the container.

Because this required modifying the container image build scripts, they
were all merged to avoid duplicate code. This was an overdue change
anyways.
2024-01-03 12:58:35 +00:00
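
The export-then-verify flow described in 250d8356cd can be sketched as two commands: one that exports the "container" dependency group to a requirements file, and one that installs from it with hash checking enforced. The helper names and the file name are assumptions for illustration; the real build script may arrange these steps differently.

```python
def export_command(group="container", output="requirements.txt"):
    """Command that exports one Poetry dependency group with pinned hashes."""
    return [
        "poetry", "export",
        "--only", group,
        "--format", "requirements.txt",
        "--output", output,
    ]


def install_command(requirements="requirements.txt"):
    """Command that makes pip reject any package whose hash does not match."""
    return ["pip", "install", "--require-hashes", "--requirement", requirements]
```

The first command would run on the build host and the second inside the container build, so that only the exact artifacts recorded in the lock file can be installed.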
deeplow
7b57cb209e
PIP force --break-system-packages
We're intentionally bypassing PEP 668 [1], which prevents the
installation of non-distro python wheels alongside system packages to
avoid incompatibilities at distro-level.

We are intentionally bypassing this since our container image is a
controlled environment (we only ship a version after rigorous testing).

[1]: https://peps.python.org/pep-0668/
2024-01-03 12:58:35 +00:00
deeplow
b75417bbec
Remove all server-side timeouts from doc to pixels
Now we're using client-side timeouts, so the server-side ones are not
needed. Implemented following the suggestion from @apyrgio [1].

[1]: https://github.com/freedomofpress/dangerzone/pull/622#discussion_r1413906514
2024-01-03 12:58:35 +00:00
deeplow
576cbd3382
Fix DPI mismatch between doc2pixels and pixels2pdf
The converted document was larger in dimensions than the original one due
to a mismatch in DPI settings. When converting documents to pixels we
were setting the DPI to 150 pixels per inch. Then when converting back
into a PDF we were using 70 DPI. This difference would result in an
overall larger document in dimensions (though not necessarily in file
size).

Fixes #626
2024-01-03 12:58:34 +00:00
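
The effect of the DPI mismatch fixed in 576cbd3382 can be worked through with simple arithmetic: a page rendered at 150 DPI and then placed at 70 DPI grows by a factor of 150/70 in each dimension. The US Letter page size used below is an assumption chosen only for illustration.

```python
RENDER_DPI = 150      # DPI used when converting the document to pixels
OLD_OUTPUT_DPI = 70   # DPI previously assumed when building the PDF


def page_size_inches(width_px, height_px, dpi):
    """Physical page size implied by a pixel size at a given DPI."""
    return (width_px / dpi, height_px / dpi)


# A US Letter page (8.5 x 11 inches) rendered at 150 DPI.
width_px = 8.5 * RENDER_DPI
height_px = 11 * RENDER_DPI

# Interpreting those pixels at 70 DPI inflates the page dimensions.
mismatched = page_size_inches(width_px, height_px, OLD_OUTPUT_DPI)

# Using the same DPI on both sides restores the original size.
matched = page_size_inches(width_px, height_px, RENDER_DPI)

scale = RENDER_DPI / OLD_OUTPUT_DPI
```

The pixel data is identical in both cases, which is why the page dimensions change while the file size need not.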
deeplow
e5dbe25abb
Replace 'convert' with PyMuPDF for images
PyMuPDF can also convert images of the types we already support so we
don't need ImageMagick's 'convert'.
2024-01-03 12:58:34 +00:00
deeplow
a3a64882a3
Add PyMuPDF to dev env in Qubes
Since PyMuPDF is now used in Pixels to PDF we needed to add it to the
qubes development environment.
2024-01-03 12:58:32 +00:00
deeplow
77d5ea5940
Add PyMuPDF in pixels_to_pdf replacing old logic
Adding PyMuPDF essentially makes the code much simpler since it can do
everything that we'd need multiple programs for. It also includes
tesseract-OCR integration, which this commit makes use of.
2024-01-03 12:56:33 +00:00
deeplow
ba17016643
Doc_to_pixels: remove unneeded timeout
Timeout can no longer be used since we're not calling a subprocess. We
could still implement it, but it's better to rely on
yet-to-be-implemented client-side timeouts (in containers).
2024-01-03 12:40:45 +00:00
deeplow
317deadbe4
Replace pdfinfo logic (get # pages) with PyMuPDF 2024-01-03 12:40:45 +00:00
deeplow
327ab8791f
Replace pdftoppm logic with PyMuPDF (native python)
Use PyMuPDF (AGPL-licensed) within the container conversion to replace
the pdf conversion to RGB. This massively simplifies the code since
PyMuPDF is a native python library.
2024-01-03 12:40:45 +00:00
deeplow
e923ac0788
Remove whitespace
Remove whitespace accidentally added in [1].

[1]: commit d6c162ea080f0df27f3109bf4aab84788704272c
2024-01-03 10:52:47 +00:00
deeplow
555cd33eb6
Simplify Qubes install instructions
Many instructions relied on the fact that the developer would have to
copy over the RPC policies and install the dependencies manually on the
template. This is no longer needed since a Qubes-built package ships
the necessary RPC policies and dependencies.

Removing the dependencies installation also helps with documentation
maintenance since it would be yet another place where we would need to
keep the dependency list up to date.
2024-01-03 10:52:47 +00:00
deeplow
5849800606
Improve "Developing Dangerzone" docs section
Make it clearer that we are talking about the two main
development-workflow differences when developing on Qubes.
2024-01-03 10:52:46 +00:00
deeplow
d1eb4ec76c
Remove duplicate "cd dangerzone" instruction 2024-01-03 10:52:46 +00:00
deeplow
3f6437cf66
Remove poetry install part from Qubes instructions
Make the first part of the Dangerzone development just to install the
Qubes RPC policies. Poetry install and other development related tasks
should be pointed to in the Fedora part of the instructions to avoid
duplication.
2024-01-03 10:52:46 +00:00
deeplow
6597b57452
Apply 2023-10-25 advisory in BUILD instructions
In the security advisory of 2023-10-25 [1] we updated the instructions
in INSTALL.md, but missed the ones in BUILD.md, leaving developers with
a network path. This is not too critical since it's development but it
should be fixed in any case.

[1]: https://github.com/freedomofpress/dangerzone/blob/5acb968/docs/advisories/2023-10-25.md
2024-01-03 10:52:46 +00:00
deeplow
0ae7f89dea
Add note that Qubes instr. are on dom0 terminal
It was not entirely clear that what we showed should be run in a
terminal.
2024-01-03 10:52:46 +00:00
deeplow
5121b4f702
Qubes: clarify instructions for skipping step 1
Make it clearer that step 1 should be skipped entirely when the user
wants to install it on their default template.
2024-01-03 10:52:46 +00:00
deeplow
cac06caf82
Correct Qubes Instructions: dz-dvm is not disposable
The qube dz-dvm is not a disposable qube but rather a disposable
template qube (aka. app qube).
2024-01-03 10:52:46 +00:00
Alex Pyrgiotis
5bf7549b55
Fix typo 2023-12-29 18:30:48 +02:00
Alex Pyrgiotis
9f713ebb8b
ci: Test official installation instructions
Create a new GitHub Actions workflow which aims to continuously test our
official installation instructions. The way we do it is the following:

1. Create two jobs, one for the Debian-based distros, and one for Fedora
   ones.
2. Copy the instructions from INSTALL.md into each job.
3. Create a matrix that runs the installation jobs in parallel, for each
   supported distro and version.

The jobs will run only at 00:00 UTC, and not on every PR, since it
wouldn't make sense otherwise.

Fix #653
2023-12-21 21:51:07 +02:00
Alex Pyrgiotis
12eda5d73c
dev_scripts: Add missing git dependency
Add missing git dependency, which is required to run the `isort` command
on the development environment.
2023-12-21 21:38:39 +02:00
Alex Pyrgiotis
e137976581
dev_scripts: Upload release assets to GitHub
Add a script to upload release assets to GitHub. This script can take
either a release ID, a Git tag, or the latest draft release.

Note that while GitHub's official client can upload assets to releases,
it cannot upload them to draft releases [1], hence why we created this
script.

[1]: https://cli.github.com/manual/gh_release_upload
2023-12-21 21:38:39 +02:00
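
The three ways of selecting a release mentioned in e137976581 (release ID, Git tag, or latest draft) map naturally onto a mutually exclusive argument group. The sketch below covers only that selection step. The flag names are hypothetical and the actual upload through the GitHub API is left out.

```python
import argparse


def parse_upload_args(argv=None):
    """Parse exactly one release selector plus the asset paths."""
    parser = argparse.ArgumentParser(description="Upload release assets (sketch)")
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--release-id", type=int, help="Numeric release ID")
    target.add_argument("--tag", help="Git tag of the release")
    target.add_argument("--draft", action="store_true", help="Use the latest draft release")
    parser.add_argument("assets", nargs="+", help="Files to upload")
    return parser.parse_args(argv)


def describe_target(args):
    """Human-readable summary of which release was selected."""
    if args.release_id is not None:
        return f"release id {args.release_id}"
    if args.tag:
        return f"tag {args.tag}"
    return "latest draft release"
```

Making the group required means the script fails early when no selector, or more than one, is given.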
deeplow
42228647e0
Fix lint due to inconsistent qa.py and RELEASE.md
Missed during the merge of PR #654 [1].

[1]: https://github.com/freedomofpress/dangerzone/pull/654
2023-12-19 08:10:18 +00:00
deeplow
2c5f04c2c3
Add instructions for adding release tag
Instructions only stated how to verify the release tag but not how
to make it.
2023-12-19 08:06:14 +00:00
deeplow
184abfd5fc
Fix Qubes indentation 2023-12-18 08:19:26 +00:00
deeplow
418e388535
Add note that Windows 11 is in a VM 2023-12-18 08:18:27 +00:00
deeplow
2594dab31d
Simplify initial setup section titles 2023-12-18 08:18:27 +00:00
deeplow
bb653b3425
Right-click (scenario 8) can be tested under Qubes
Fixes #641
2023-12-18 08:18:27 +00:00
deeplow
d0e9eea55c
"Checklist-ize" RELEASE.md 2023-12-18 08:18:27 +00:00
deeplow
24ddda4070
Add point about creating an issue for QA & Release 2023-12-18 08:18:27 +00:00
deeplow
b3fed27178
Move container building notice to release instructions 2023-12-18 08:18:27 +00:00
deeplow
65afdc68cd
Add 'Release' section and indent subsections 2023-12-18 08:18:27 +00:00
deeplow
01b107ced9
Title-case various sections for consistency 2023-12-18 08:18:26 +00:00
deeplow
05b8e59d67
Make RELEASE Windows structure similar to macOS 2023-12-18 08:18:26 +00:00
deeplow
3d21e17e3b
Reorganize macOS release into setup and building 2023-12-18 08:18:26 +00:00
deeplow
a936780266
Move pre-release instructions to top of RELEASE
The instructions to cut a release were after all the scenarios which
made them easy to miss.
2023-12-18 08:18:26 +00:00
Moon Sungjoon
63aea4cb45
Enable HWP conversion on MacOS (Apple silicon CPU)
This PR reverts the patch that disables HWP / HWPX conversion on MacOS M1.
It does not fix conversion on Qubes OS (#494).

Previously, HWP / HWPX conversion didn't work on MacOS (Apple silicon CPU) (#498)
because libreoffice wasn't built with Java support on Alpine Linux for ARM (aarch64).

Thankfully, the Alpine team has enabled Java support on the aarch64
system [1], so we can enable it again for ARM architectures.
This patch is included in Alpine 3.19.

This commit was included in #541 and reverted on #562 due to a stability issue.

Fixes #498

[1]: 74d443f479
2023-12-13 12:57:22 +02:00