Commit graph

1356 commits

Author SHA1 Message Date
Alex Pyrgiotis
eddc06b436
Make Dummy isolation provider more realistic
Make the Dummy isolation provider follow the rest of the isolation
providers and perform the second part of the conversion on the host. The
first part of the conversion is just a dummy script that reads a file
from stdin and prints pixels to stdout.
2024-10-08 19:15:00 +03:00
Alex Pyrgiotis
2c08b5f9c3
Remove dead docs 2024-10-08 19:15:00 +03:00
Alex Pyrgiotis
1ab3aab08e
Remove dead code 2024-10-08 19:15:00 +03:00
Alex Pyrgiotis
62c32673f1
Update the way we get debug logs
Move the logic for grabbing debug logs to a new place, now that we have
merged the two conversion stages (doc to pixels, pixels to PDF).
2024-10-08 19:15:00 +03:00
Alex Pyrgiotis
bae637c974
Perform on-host pixels to PDF conversion
Extend the base isolation provider to immediately convert each page to
a PDF, and optionally use OCR. In contrast with the way we did things
previously, there are no longer two separate stages (document to pixels,
pixels to PDF). We now handle each page individually, for two main
reasons:

1. We don't want to buffer pixel data, either on disk or in memory,
   since they take a lot of space, and can potentially leave traces.
2. We can perform these operations in parallel, saving time. This is
   more evident when OCR is not used, where the times to convert a page
   to pixels and then back to a PDF are comparable.
2024-10-08 19:15:00 +03:00
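The per-page handling described in the commit above can be illustrated with a generator pipeline. This is only a minimal sketch: `convert_pages`, `fake_pixel_source` and `fake_page_to_pdf` are hypothetical names, and the two stand-in functions do not reflect Dangerzone's actual conversion code. The point is that each page's pixels are converted as soon as they arrive, so no more than one page of raw pixel data is held at a time.

```python
from typing import Callable, Iterable, Iterator


def convert_pages(
    pixel_pages: Iterable[bytes],
    page_to_pdf: Callable[[bytes], bytes],
) -> Iterator[bytes]:
    """Yield one PDF page per incoming pixel buffer, one page at a time."""
    for pixels in pixel_pages:
        # Convert immediately; the raw pixels are not stored anywhere else.
        yield page_to_pdf(pixels)


def fake_pixel_source(num_pages: int) -> Iterator[bytes]:
    """Hypothetical stand-in for the sandboxed document-to-pixels stage."""
    for i in range(num_pages):
        yield bytes([i]) * 4


def fake_page_to_pdf(pixels: bytes) -> bytes:
    """Hypothetical stand-in for the host-side pixels-to-PDF step."""
    return b"PDFPAGE:" + pixels.hex().encode()


pdf_pages = list(convert_pages(fake_pixel_source(3), fake_page_to_pdf))
```

Because `convert_pages` is lazy, a caller that writes each yielded page straight to the output file never needs to buffer the full set of pixel buffers on disk or in memory.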
Alex Pyrgiotis
afe8179a51
Update .deb/.rpm dependencies
Update .deb/.rpm specs to include PyMuPDF as a required package.
2024-10-08 19:15:00 +03:00
Alex Pyrgiotis
0f2be58167
Make PyMuPDF a main Dangerzone dependency
The PyMuPDF package was previously mainly used within the Dangerzone
container, as well as on Qubes. With on-host conversion, PyMuPDF will be
used in all supported platforms by default. For this reason, we can
promote it to a main dependency.
2024-10-08 19:14:59 +03:00
Alex Pyrgiotis
b9e5c59520
Add new way to detect tessdata dir
Add a new way to detect where the Tesseract data are stored in a user's
system. On Linux, the Tesseract data should be installed via the package
manager. On macOS and Windows, they should be bundled with the
Dangerzone application.

There is also the exception of running Dangerzone locally, where even
on Linux, we should get the Tesseract data from the Dangerzone share/
folder.
2024-10-08 19:14:59 +03:00
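The lookup rules in the commit above could be sketched roughly as follows. The function name, its parameters, the `tessdata` subfolder name and the Linux candidate paths are all assumptions made for illustration; the commit message does not list the actual paths that Dangerzone checks.

```python
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical candidate locations; real distributions differ.
LINUX_CANDIDATES = (
    Path("/usr/share/tessdata"),
    Path("/usr/share/tesseract/tessdata"),
)


def find_tessdata_dir(
    system: str,
    share_dir: Path,
    running_from_source: bool,
    linux_candidates: Iterable[Path] = LINUX_CANDIDATES,
) -> Optional[Path]:
    """Return the directory expected to hold the Tesseract language data."""
    # Local (development) runs and bundled macOS/Windows builds use the
    # data shipped under the application's share/ folder.
    if running_from_source or system in ("Darwin", "Windows"):
        return share_dir / "tessdata"
    # Packaged Linux installs rely on the distribution's tessdata package.
    for candidate in linux_candidates:
        if candidate.is_dir():
            return candidate
    return None
```

In this sketch, `system` would typically be the value of `platform.system()`.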
Alex Pyrgiotis
c6475ed526
Ignore tesseract data when building DEB/RPM packages 2024-10-08 19:14:59 +03:00
Alex Pyrgiotis
84a4ae7fdd
ci: Add GitHub action for tessdata 2024-10-08 19:14:59 +03:00
Alex Pyrgiotis
23caf9faf7
Update build instructions 2024-10-08 19:14:59 +03:00
Alex Pyrgiotis
0921cc23e7
Add script for downloading Tesseract data
Add a Python script that can run in all supported platforms, and can
download and extract the Tesseract language data from GitHub, while
also:

1. Checking that the expected hash matches.
2. Informing the user if the language data have already been downloaded.
3. Extracting only the subset of language data that Dangerzone needs
2024-10-08 19:10:02 +03:00
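The hash check and the selective extraction listed above might look something like the sketch below. It is not the script from the commit: the function names are hypothetical, the archive is assumed to be a gzip-compressed tarball containing `<lang>.traineddata` files, and the download step itself is left out.

```python
import hashlib
import io
import tarfile
from pathlib import Path
from typing import List, Set

SUFFIX = ".traineddata"


def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Check the downloaded archive against a pinned SHA-256 digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex


def already_downloaded(dest: Path, wanted: Set[str]) -> bool:
    """True if every wanted language file is already present in dest."""
    return all((dest / f"{lang}{SUFFIX}").is_file() for lang in wanted)


def extract_languages(archive: bytes, wanted: Set[str], dest: Path) -> List[str]:
    """Extract only the language files named in `wanted`."""
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        for member in tar.getmembers():
            # Use only the base name, so archive paths cannot escape dest.
            name = Path(member.name).name
            if not (member.isfile() and name.endswith(SUFFIX)):
                continue
            if name[: -len(SUFFIX)] not in wanted:
                continue
            src = tar.extractfile(member)
            (dest / name).write_bytes(src.read())
            extracted.append(name)
    return sorted(extracted)
```

A caller would first test `already_downloaded`, then fetch the archive, refuse to continue unless `sha256_matches` returns true, and only then call `extract_languages`.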
Alex Pyrgiotis
6547998633
Provide sanitized version of output filename 2024-10-08 19:10:02 +03:00
Alex Pyrgiotis
17fa82297e
Better way to collect tests 2024-10-08 19:10:02 +03:00
Alex Pyrgiotis
54dc22c410
ci: Be explicit about the Debian package we install in end-user envs 2024-10-08 19:10:02 +03:00
Alex Pyrgiotis
25ac980b0b
FIXUP: Fix for vendoring PyMuPDF 2024-10-08 19:10:02 +03:00
Alex Pyrgiotis
6fd0f925a8
FIXUP: Fix a lint 2024-10-08 13:34:33 +03:00
Alex Pyrgiotis
30b4f24d77
FIXUP: Use the proper pip argument 2024-10-08 13:34:33 +03:00
Alex Pyrgiotis
e027d853c2
FIXUP: Implement review comments 2024-10-08 13:34:33 +03:00
Alex Pyrgiotis
07921566ba
FIXUP: Make Dockerfile work with latest wheels 2024-10-08 13:34:33 +03:00
Alex Pyrgiotis
eef4e8b548
debian: Vendor PyMuPDF when building Debian package
Install PyMuPDF under ./dangerzone/vendor, right before we build the
.deb package. We vendor PyMuPDF just for Debian, since the provided
versions don't have OCR support enabled.

Currently, we don't use PyMuPDF on the host, but this will change once
we fully implement the on-host conversion feature.

Refs #625
2024-10-08 13:34:32 +03:00
Alex Pyrgiotis
ed55124a8b
Add an import preference for vendored packages
Prefer importing packages from ./dangerzone/vendor, if there is one
there, instead of using the system ones.
2024-10-08 13:34:32 +03:00
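One common way to implement such an import preference is to put the vendor directory at the front of `sys.path`. The sketch below assumes that approach; `prefer_vendored` is a hypothetical helper, and the commit message does not say how Dangerzone actually does it.

```python
import sys
from pathlib import Path


def prefer_vendored(package_dir: Path) -> bool:
    """Put <package_dir>/vendor first on sys.path, if that directory exists."""
    vendor_dir = package_dir / "vendor"
    if not vendor_dir.is_dir():
        return False
    vendor_path = str(vendor_dir)
    # Avoid duplicates, then make sure the vendored copy wins over
    # any system-wide installation of the same package.
    if vendor_path in sys.path:
        sys.path.remove(vendor_path)
    sys.path.insert(0, vendor_path)
    return True
```

The call has to happen before the first import of the vendored package, for example at the top of the package's `__init__.py`, since modules that are already imported are not reloaded.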
Alex Pyrgiotis
f61097e9b3
install: Add script for vendoring PyMuPDF
Add a script that installs PyMuPDF under ./dangerzone/vendor. This will
be useful in subsequent commits, for vendoring PyMuPDF when building
Debian packages.
2024-10-08 13:34:32 +03:00
Alex Pyrgiotis
c22f945614
dev_scripts: Install pip in dev environments
Install pip in dev environments, so that we can use it to vendor
PyMuPDF in subsequent commits.
2024-10-08 13:34:32 +03:00
Alex Pyrgiotis
892dfaf1bc
Bump our Poetry dependencies 2024-10-08 13:34:32 +03:00
Alex Pyrgiotis
00711fa9e2
Add missing .pybuild dir in .gitignore 2024-10-08 13:34:32 +03:00
Alex Pyrgiotis
93b960cd23
Bump H2ORestart to version 0.6.6
Follow Debian's lead [1] and bump this version to 0.6.6. This change
should bring some stability improvements to our CI tests as well.

[1]: https://packages.debian.org/unstable/text/libreoffice-h2orestart
2024-10-07 18:36:06 +03:00
bnewc
752eff02d8
Prevent user from using illegal characters in output filename
Add some checks in the Dangerzone GUI and CLI that will prevent a user
from mistakenly adding illegal characters in the output filename.
2024-10-07 18:04:47 +03:00
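A check of this kind can be sketched as below. The character set is an assumption (the characters that Windows rejects in file names, plus ASCII control characters); the commit message does not state which characters Dangerzone treats as illegal, and `find_illegal_chars` is a hypothetical name.

```python
from typing import List

# Assumed set: characters that Windows does not allow in file names.
ILLEGAL_CHARS = set('\\/:*?"<>|')


def find_illegal_chars(filename: str) -> List[str]:
    """Return the sorted, de-duplicated illegal characters in filename."""
    return sorted(
        {c for c in filename if c in ILLEGAL_CHARS or ord(c) < 32}
    )
```

A GUI or CLI front end could reject the name, or show the offending characters to the user, whenever the returned list is non-empty.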
Alex Pyrgiotis
275189587e
tests: Test termination logic under default conditions
Do not use the `provider_wait` fixture in our termination logic tests,
and switch instead to the `provider` fixture, which instantiates a
typical isolation provider.

The `provider_wait` fixture's goal was to emulate how the process would
behave if it had fully spawned. In practice, this masked some
termination logic issues that became apparent in the WIP on-host
conversion PR. Now that we kill the spawned process via its process
group, we can just use the default isolation provider in our tests.

In practice, in this PR we just do `s/provider_wait/provider`, and
remove some stale code.
2024-10-07 17:37:57 +03:00
Alex Pyrgiotis
b5130b08b6
tests: Improve Dummy provider tests
Add a fixture that returns our stock Dummy provider. Also, explicitly
use a blocking Dummy provider (`DummyWait`) for a specific test case.
This will prove useful when we stop using the `provider_wait` variant of
our isolation providers in the next commits.
2024-10-07 17:37:42 +03:00
Alex Pyrgiotis
dc8a22c8e7
Fix the dummy provider
Make the dummy provider behave a bit more like the other providers, with
a proper function and termination logic. This will be helpful soon in
the tests.
2024-10-07 17:37:42 +03:00
Alex Pyrgiotis
d6410652cb
Kill the process group when conversion terminates
Instead of killing just the invoked Podman/Docker/qrexec process, kill
the whole process group, to make sure that other components that have
been spawned die as well. In the case of Podman, conmon is one of the
processes that lingers, so that's one way to kill it.
2024-10-07 17:37:39 +03:00
Alex Pyrgiotis
b9a3dd63ad
Always start conversion process in new session
Start the conversion process in a new session, so that we can later on
kill the process group, without killing the controlling script (i.e.,
the Dangerzone UI). This should not affect the conversion process in any
other way.
2024-10-07 17:27:38 +03:00
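The two commits above (new session, then process-group kill) fit together roughly as in the following POSIX-only sketch. The helper names are hypothetical and the choice of `SIGKILL` is an assumption; `start_new_session`, `os.getpgid` and `os.killpg` are standard-library facilities.

```python
import os
import signal
import subprocess
import sys
from typing import List


def start_in_new_session(cmd: List[str]) -> subprocess.Popen:
    """Start cmd as the leader of a fresh session and process group."""
    # start_new_session=True makes the child call setsid() before exec,
    # so its process group is distinct from the caller's.
    return subprocess.Popen(cmd, start_new_session=True)


def kill_process_group(proc: subprocess.Popen) -> None:
    """Kill the child and everything else left in its process group."""
    try:
        os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
    except ProcessLookupError:
        pass  # The group is already gone.
    proc.wait()


proc = start_in_new_session(
    [sys.executable, "-c", "import time; time.sleep(60)"]
)
child_has_own_group = os.getpgid(proc.pid) != os.getpgrp()
kill_process_group(proc)
```

Because the child sits in its own process group, signalling that group does not reach the controlling script, which is why the new session has to be in place before the group kill is safe.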
Alex Pyrgiotis
8d856ff4c3
ci: Add Intel macOS runner
GitHub provides an Intel macOS runner as `macos-13`. Add it alongside
our M1 macOS runner (`macos-latest`), in order to cover all of our
target environments.
2024-10-07 12:48:03 +03:00
Alex Pyrgiotis
95660c3ec7
Make dummy tests faster
Remove the unnecessary sleep command in our dummy tests, which made them
run much slower.
2024-10-07 12:48:03 +03:00
Alex Pyrgiotis
58b4659ffd
Improve .gitattributes
It seems that we need to specify that Python files have LF line endings
on Windows environments, else they will get converted to CRLF. If this
happens, then the container image we build in this environment will have
Python files with wrong endings, and tests will break.

Refs #838 for previous attempt.
2024-10-07 12:48:02 +03:00
Alex Pyrgiotis
a001b5497c
Add release note for Debian packages 2024-10-02 16:49:46 +02:00
Alex Pyrgiotis
eb2d114ea7
install: Catch version errors when building DEBs
Make sure that the Debian package we build conforms to the expected
naming scheme; otherwise, it's possible that something is off. A scenario
we've encountered is bumping `share/version.txt`, but not
`debian/changelog`, which would create a Debian package with an older
version.
2024-10-02 16:49:46 +02:00
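A post-build check along these lines could be sketched as follows. The `dangerzone` prefix, the `-1` revision and the `all` architecture are assumptions for illustration, and both function names are hypothetical; only the general `<name>_<version>-<revision>_<arch>.deb` naming convention is standard Debian practice.

```python
from typing import Iterable


def expected_deb_name(
    version: str, revision: str = "1", arch: str = "all"
) -> str:
    """Build the file name we expect the Debian tooling to produce."""
    # The prefix, revision and arch defaults are assumptions.
    return f"dangerzone_{version}-{revision}_{arch}.deb"


def check_built_deb(version_txt: str, built_files: Iterable[str]) -> str:
    """Fail loudly if no built package matches the version in version.txt."""
    expected = expected_deb_name(version_txt.strip())
    if expected not in set(built_files):
        raise RuntimeError(
            f"Expected {expected} among the built packages; "
            "is debian/changelog in sync with share/version.txt?"
        )
    return expected
```

With this check, bumping `share/version.txt` without updating `debian/changelog` produces a package whose name carries the old version, so the lookup fails instead of silently shipping it.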
Alex Pyrgiotis
a32522f6c8
debian: Bump version to 0.7.1
Add a dummy entry in debian/changelog, to signal that the latest
Dangerzone version is 0.7.1.
2024-10-02 16:49:46 +02:00
Alexis Métaireau
025e5dda51
Switch from CircleCI runners to GitHub Actions.
As part of this change, the dev (build) and end-user test image names
changed from `dangerzone.rocks/*` to `ghcr.io`.

A new `--sync` option is provided in the `env.py` command, in order to
retrieve the images from the registry, or build and upload otherwise.
2024-10-02 16:47:58 +02:00
Alexis Métaireau
3e434d08d1
Always use our own seccomp policy as a default.
As per Etienne Perot's comment on #908:

> Then it seems to me like it would be easy to simply apply this seccomp
> profile under all container runtimes (since there's no reason why the
> same image and the same command-line would call different syscalls under
> different container runtimes).
2024-10-02 14:12:48 +02:00
Alexis Métaireau
eb10082a62
Merge branch 'hotfix-0.7.1' into main 2024-10-01 15:16:25 +02:00
Alexis Métaireau
eee405e29e
Update download links to use 0.7.1 2024-10-01 12:58:11 +02:00
Alex Pyrgiotis
2371d1c23c
Add release note for containerd graph driver
Fixes #933
2024-09-30 15:45:15 +03:00
Alexis Métaireau
9117ba5d6c
Bump version to 0.7.1 2024-09-30 12:40:06 +02:00
Alexis Métaireau
fb2f4ce695
Add 0.7.1 to the CHANGELOG 2024-09-30 12:38:41 +02:00
Alex Pyrgiotis
4423fc6232
Handle multiple image IDs in the image-ids.txt file.
Docker Desktop 4.30.0 uses the containerd image store by default, which
generates different IDs for the images, and as a result breaks the logic
we are using when verifying that the image IDs are present.

Now, multiple IDs can be stored in the `image-id.txt` file.

Fixes #933
2024-09-30 12:34:34 +02:00
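Accepting several IDs could be as simple as the sketch below. It assumes the IDs in the file are separated by whitespace, which the commit message does not specify, and both function names are hypothetical.

```python
from typing import List


def parse_image_ids(id_file_text: str) -> List[str]:
    """Split the file contents into individual image IDs."""
    # Assumption: IDs are separated by spaces or newlines.
    return id_file_text.split()


def is_expected_image(found_id: str, id_file_text: str) -> bool:
    """True if the installed image's ID matches any of the stored IDs."""
    return found_id in parse_image_ids(id_file_text)
```

With this shape, the same file can carry both the ID reported by the classic image store and the one reported by the containerd image store.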
Alex Pyrgiotis
bd2dc0ea3c
Pin gVisor to the last working release
Temporarily pin gVisor to the latest working version
(`release-20240826.0`), since the latest one breaks our container image.

Refs #928
2024-09-27 12:55:59 +03:00
Alex Pyrgiotis
27d201a95b
container: Avoid pop-ups on Windows
Avoid window pop-ups on Windows systems, by using the `startupinfo`
argument of `subprocess.run`.
2024-09-27 12:55:46 +03:00
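A minimal sketch of the `startupinfo` approach follows. The helper name is hypothetical; `subprocess.STARTUPINFO` and `subprocess.STARTF_USESHOWWINDOW` exist only on Windows, which is why they are referenced only inside the platform check.

```python
import platform
import subprocess
import sys


def get_subprocess_startupinfo():
    """Return a STARTUPINFO that hides the console window on Windows."""
    if platform.system() == "Windows":
        startupinfo = subprocess.STARTUPINFO()
        # With this flag set, the wShowWindow field is honoured; its
        # default value of 0 (SW_HIDE) keeps the window hidden.
        startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
        return startupinfo
    # Other platforms accept startupinfo=None.
    return None


result = subprocess.run(
    [sys.executable, "-c", "print('ok')"],
    startupinfo=get_subprocess_startupinfo(),
    capture_output=True,
    text=True,
)
```

Returning `None` on non-Windows systems lets every call site pass `startupinfo=` unconditionally.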
JKarasti
791444cd5d
Windows installer: Allow choosing installation directory during install
Switch to using the `WixUI_InstallDir` dialog set in the Windows
installer and add the `WIXUI_INSTALLDIR` property it needs to let the
user choose where Dangerzone is installed.

resolves #148
2024-09-24 15:04:43 +03:00