Add Tesseract models for the 10 most spoken languages as package
requirements for Qubes. For containers, this problem is already solved
since we install all Tesseract models.
If a user is not covered by the installed models, they can install
extras on their own. We will add a note for this in subsequent commits.
Refs #431
Switch to the tessdata-fast Tesseract model, instead of the tessdata
one. The tessdata-fast Tesseract model is much smaller, and a bit faster
than the other one. Also, it's the model that Debian/Fedora ship by
default.
Closes#545
Extend the client-side capabilities of the Qubes isolation provider, by
adding client-side timeout logic.
This implementation brings the same logic that we used server-side to
the client, by taking into account the original file size and the number
of pages that the server returns.
Since the code does not have the exact same insight as the server has,
the calculated timeouts are in two places:
1. The timeout for getting the number of pages. This timeout takes into
account:
* the disposable qube startup time, and
* the time it takes to convert a file type to PDF
2. The total timeout for converting the PDF into pixels, in the same way
that we do it on the server-side.
Besides these changes, we also ensure that partial reads (e.g., due to
EOF) are detected (see exact=... argument)
Some things that are not resolved in this commit are:
* We have both client-side and server-side timeouts for the first phase
of the conversion. Once containers can stream data back to the
application (see #443), these server-side timeouts can be removed.
* We do not show a proper error message when a timeout occurs. This will
be part of the error handling PR (see #430)
Fixes#446
Refs #443
Refs #430
Replace the deprecated `bdist_rpm` method of creating RPMs for
Dangerzone. Instead, update our `install/linux/build-rpm.py` script, to
build Dangerzone RPMs using our SPEC file under
`install/linux/dangerzone.spec`. The script now essentially creates a
source distribution (sdist) using `poetry build`, and then uses
`rpmbuild` to create binary and source RPMs.
Fixes#298
Add an `rpm-build` directory under `install/linux`, which will be used
for building Dangerzone RPMs. For the time being, it only has a
.gitignore file there, but in the future, invoking
`install/linux/build-rpm.py` will populate it.
Update the dependencies required to build RPM packages. More
specifically, remove the older python3-setuptools dependency, and depend
instead on python3-devel and python3-poetry-core.
Note that this commit may break our CI, but it will be resolved in
subsequent commits.
Introduce a SPEC file that can be used to create an RPM from a Python
source distribution. Some notable features of this SPEC file follow:
1. We can use this SPEC file to create both regular RPM packages and
ones targeted for Qubes.
2. It has a post installation script that removes stale .egg-info
directories, which previously caused issues to our users.
3. It automatically creates a changelog from our Git logs, which differs
from the actual CHANGELOG.md.
4. It folloes the latest Fedora guidelines (as of writing this) for
packaging Python projects.
Fixes#514
Update our pyproject.toml file to include some non-Python data files,
e.g., our container image and assets. This way, we can use `poetry
build` to create a source distribution / Python wheel from our source
repository.
Note that this list of data files is already defined in our `setup.py`
script. In that script, one can find some extra goodies:
1. We can conditionally include data files in our Python package. We use
this to include Qubes data only in our Qubes packages.
2. We can specify where will the data files be installed in the end-user
system.
The above are non-goals for Poetry [1], especially (2), because modern
Python wheels are not supposed to install files in arbitrary places
within the user's host, nor should the install invocation use sudo.
Instead, this is a task that's better suited for the .deb / .rpm
packages.
So, why do we bother updating our `pyproject.toml` and not use
`setup.py` instead? Because `setup.py` is deprecated [2,3], and the
latest Python packaging RFCs [4], as well as most recent Fedora
guidelines [5] use `pyproject.toml` as the source of truth, instead of
`setup.py`.
In subsequent commits, we will also use just `pyproject.toml` for RPM
packaging.
[1]: https://github.com/python-poetry/poetry/issues/890
[2]: https://peps.python.org/pep-0517/#source-trees
[3]: https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html
[4]: https://peps.python.org/pep-0517/
[5]: https://docs.fedoraproject.org/en-US/packaging-guidelines/Python/
The dangerzone-container entrypoint, as specified in pyproject.toml, is
stale, for the following reasons:
1. It's not mentioned in the setup.py script, so it was never included
in our Linux distributions.
2. The code in `dangerzone.__init__.py` that decides if it will invoke
the GUI or CLI backend, just takes `dangerzone-cli` into account for
this decision, and does not mention dangerzone-container anywhere.
In order to let isort respect .gitignore, we need to specify this in the
tool.isort entry, in pyproject.toml.
For black, we don't need any extra tweaks. This is weird, since until a
few months ago black did not respect .gitignore. Maybe something has
changed in the meantime but if not, we should revert this change.
Creates exceptions in the server code to be shared with the client via an
identifying exit code. These exceptions are then reconstructed in the
client.
Refs #456 but does not completely fix it. Unexpected exceptions and
progress descriptions are still passed in Containers.
This PR reverts the patch that disables HWP / HWPX conversion on MacOS
M1. It does not fix conversion on Qubes OS (#494)
Previously, HWP / HWPX conversion didn't work on MacOS M1 systems (#498)
because libreoffice wasn't built with Java support on Alpine Linux for
ARM (aarch64).
Gratefully, the Alpine team has enabled Java support on the aarch64
system [1], so we can enable it again for ARM architectures.
Fixes#498
[1]: 74d443f479
The Alpine Linux team has enabled Java support for LibreOffice on ARM
architecture:
74d443f479
This commit is included in 7.5.5.2-r2, so the installed LibreOffice
package should be 7.5.5.2-r2 or higher to fix this issue.
However 3.18 doesn't have the 7.5.5.2-r2 package:
https://pkgs.alpinelinux.org/package/v3.18/community/aarch64/libreoffice
The Dangerzone image uses the alpine:latest image which is 3.18 as of
writing this.
For this reason, we switch to the edge repo of Alpine Linux, which
includes this fix.
Refs #498
Refs #540
Refs #542
qvm-copy-to-vm since a long time doesn't respect the qube name
provided. Instead it is enforced by the dom0 policy prompt. This is
probably a leftover from a command ran in dom0, where this command
actually works.
The "check for updates" button wasn't showing up immediately as checked
as soon as the user is prompted for checking updates. This fixes that.
Fixes#513
Reporting script now parses JunitXML instead of a series of
".container_log" files. The script in in changed submodule.
Additionally it makes failed tests actually fail so that this is
recorded in the JunitXML report.
Adds a large pool of document that can and should be used prior to a
release to understand effects of the new release over a real-world
scenario.
Documents are stored in an external git LFS repo under
`tests/test_docs_large` and currently it's about 11K documents gathered
from multiple PDF readers and office suite's test sets.
Documentation on how to run the tests is under
`docs/developer/TESTING.md`
Certain characters may be abused. Particularly ANSI escape codes.
Solution inspired by Qubes OS's hardening of ther RPC mechanism [1]:
> Terminal control characters are a security issue, which in worst case
> amount to arbitrary command execution. In the simplest case this
> requires two often found codes: terminal title setting (which puts
> arbitrary string in the window title) and title repo reporting (which
> puts that string on the shell's standard input. [sic]
>
> -- qvm-run.rst [2]
[1]: e005836286
[2]: c70da44702/doc/manpages/qvm-run.rst (L126)
Store the conversion log to a file (captured-output.txt) in the
container and when in development mode, have its output displayed on the
terminal output.