Add a sanity check at the end of the conversion from doc to pixels, to
ensure that the resulting document will have the same number of pages as
the original one.
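
A minimal sketch of such a check, assuming the page counts are already
known as integers. The names `PageCountMismatch` and `check_page_count`
are hypothetical and only serve as an illustration:

```python
class PageCountMismatch(ValueError):
    """Hypothetical error for a converted document with a wrong page count."""


def check_page_count(expected: int, converted: int) -> None:
    # Refuse to continue if the conversion produced more or fewer pages
    # than the original document had.
    if expected != converted:
        raise PageCountMismatch(
            f"expected {expected} pages, but the conversion produced {converted}"
        )
```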
Refs #560
Stream page data back to the caller, immediately after we read them from
pdftoppm. This way, we have more accurate progress reports and timeouts.
Fixes #557
Introduce 4 new methods that can be overloaded by the Qubes isolation
provider to stream page data/metadata back to the caller. For the time
being, these methods do what they did before, i.e., write this info in
files within the pixels directory.
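
The idea can be sketched roughly as follows. All class names, method
names, file names and the 2-byte big-endian encoding below are
assumptions made for illustration, not the actual ones in the code:

```python
import io
import tempfile
from pathlib import Path
from typing import BinaryIO


class PixelsFileWriter:
    """Default behavior: write page info in files within a pixels directory."""

    def __init__(self, pixels_dir: Path) -> None:
        self.pixels_dir = pixels_dir

    def write_page_count(self, count: int) -> None:
        (self.pixels_dir / "num_pages").write_text(str(count))

    def write_page_width(self, width: int, page: int) -> None:
        (self.pixels_dir / f"page-{page}.width").write_text(str(width))

    def write_page_height(self, height: int, page: int) -> None:
        (self.pixels_dir / f"page-{page}.height").write_text(str(height))

    def write_page_data(self, data: bytes, page: int) -> None:
        (self.pixels_dir / f"page-{page}.rgb").write_bytes(data)


class PixelsStreamWriter(PixelsFileWriter):
    """Overloads the four methods to stream the same info to the caller."""

    def __init__(self, stream: BinaryIO) -> None:
        self.stream = stream

    def _write_int(self, value: int) -> None:
        # Assumed wire format: 2-byte big-endian unsigned integers.
        self.stream.write(value.to_bytes(2, "big"))

    def write_page_count(self, count: int) -> None:
        self._write_int(count)

    def write_page_width(self, width: int, page: int) -> None:
        self._write_int(width)

    def write_page_height(self, height: int, page: int) -> None:
        self._write_int(height)

    def write_page_data(self, data: bytes, page: int) -> None:
        self.stream.write(data)
```

The conversion code only calls the four `write_*` methods, so it does
not need to know whether the info ends up in files or in a stream.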
Do not read a line from the command output and then check if
we are at EOF, because it's possible that the writer immediately exited
after writing the last line of output. Instead, switch the order of
actions.
This is a very serious bug that can lead to Dangerzone excluding the
last page of the document. It should have bitten us right from the start
(see aeeed411a0), but it seems that the
small period of time it takes the kernel to close the file descriptors
was hiding this bug.
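
The difference between the two orderings can be reproduced with a plain
`asyncio.StreamReader` that already holds all the output plus EOF, which
is the state we end up in when the writer exits immediately. This is
only an illustrative sketch; the function names are hypothetical:

```python
import asyncio
from typing import List


def make_reader(data: bytes) -> asyncio.StreamReader:
    # Must be called from within a running event loop. Simulates a writer
    # that wrote everything and exited before we started reading.
    reader = asyncio.StreamReader()
    reader.feed_data(data)
    reader.feed_eof()
    return reader


async def read_lines_buggy(reader: asyncio.StreamReader) -> List[bytes]:
    lines = []
    while True:
        line = await reader.readline()
        # Bug: after reading the last line, the buffer is empty and EOF is
        # set, so we break before handling the line we just read.
        if reader.at_eof():
            break
        lines.append(line)
    return lines


async def read_lines_fixed(reader: asyncio.StreamReader) -> List[bytes]:
    lines = []
    # Check for EOF first, then read and handle the line.
    while not reader.at_eof():
        line = await reader.readline()
        if line:  # readline() returns b"" if EOF arrives with no data
            lines.append(line)
    return lines


async def demo():
    data = b"page 1\npage 2\n"
    buggy = await read_lines_buggy(make_reader(data))
    fixed = await read_lines_fixed(make_reader(data))
    return buggy, fixed
```

Running `asyncio.run(demo())` shows that the first ordering drops
`page 2`, while the second one returns both lines.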
Fixes #560
This preserves isort's default behavior of ignoring virtualenvs with
common names like `venv` or `.venv`, which is helpful when running
`isort` in a local development environment that uses such a
virtualenv.
To be consistent with Light Mode, the background of the
safe_extension_filename QLabel should match the adjacent QTextField,
but the text should be "grayed out"/disabled to indicate that it's not
supposed to be editable.
Sets the detected OS color mode (dark/light) as a property on the
QApplication so it can be referenced in stylesheets to select style
rules suited to the OS color mode.
In Qubes OS it's often the case that the user doesn't have enough
RAM to start the conversion. In this case it raises BrokenPipeException
and exits with code 126.
It didn't seem possible to distinguish this kind of failure from one
where the user has misconfigured qrexec policies.
NOTE: this approach is not ideal UX-wise. After the first doc fails,
the next one will also be attempted and fail. Upon first failure we
should inform the user that they need to close some programs or qubes.
Because the server also checks the MAX_PAGES limit, the test in base
would hide the fact that the client is not enforcing the limit. This
ensures that's not the case.
When the pages in containers are streamed (#443), then this test should
be in base.py.
Theoretically, the max number of pages would be 65535 (the largest value
of a 2-byte unsigned int). However, this limit is much higher than what
practical documents have, and larger ones can lead to unforeseen
problems, for example RAM limitations.
We thus opted to use a lower limit of 10K. The limit must be
detected client-side, given that the server is distrusted. However
we also check it in the server, just as a fail-early mechanism.
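
A rough sketch of the client-side check, assuming the server sends the
page count as a 2-byte big-endian unsigned integer. The byte order, the
`read_page_count` helper and the exception name are assumptions made for
illustration:

```python
import io
import struct
from typing import BinaryIO

MAX_PAGES = 10_000  # the 10K limit described above


class MaxPagesExceeded(ValueError):
    """Hypothetical error for documents with too many pages."""


def read_page_count(stream: BinaryIO) -> int:
    raw = stream.read(2)
    if len(raw) != 2:
        raise ValueError("truncated page count")
    # ">H" is a big-endian 2-byte unsigned int (byte order is an assumption).
    (num_pages,) = struct.unpack(">H", raw)
    # The server is distrusted, so the client enforces the limit itself.
    if num_pages > MAX_PAGES:
        raise MaxPagesExceeded(f"{num_pages} pages exceed the limit of {MAX_PAGES}")
    return num_pages
```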
Isolation provider tests were done in tests/test_base.py and had
pytest.mark.parametrize() for each isolation provider. This logic
would not work well once we had tests that diverge. We could have marked
each one as compatible with one provider or another, but in the end it
turned out to be better to have the common ones in a base class and
the divergent ones in each provider's own test class.
NOTE: this has a strange side-effect: inherited test classes need to
have imports for all of the fixtures even if they are not explicitly used.
Add an error for interrupted conversions, in order to better
differentiate this scenario from other ValueErrors that may be raised
throughout the code's lifetime.
Store, in an instance attribute, the process that we have started for
the spawned disposable qube. In subsequent commits, we will use it from
other places as well, aside from the `_convert` method.
Note that this commit does not alter the conversion logic, and only does
the following:
1. Renames `p.` to `self.proc.`
2. Adds an `__init__` method to the Qubes isolation provider, and
initializes the `self.proc` attribute to `None`.
3. Adds an assert that `self.proc` is not `None` after it's spawned, to
placate Mypy.
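
The resulting shape is roughly the following. This is a simplified
sketch: the class name is hypothetical, and a trivial Python subprocess
stands in for the disposable qube:

```python
import subprocess
import sys
from typing import Optional


class QubesLikeProvider:
    def __init__(self) -> None:
        # No process exists until a conversion starts.
        self.proc: Optional[subprocess.Popen] = None

    def _spawn(self) -> None:
        # Stand-in for spawning the disposable qube.
        self.proc = subprocess.Popen(
            [sys.executable, "-c", "print('ok')"],
            stdout=subprocess.PIPE,
        )

    def convert(self) -> bytes:
        self._spawn()
        # Mypy cannot tell that _spawn() set the attribute, so narrow
        # Optional[Popen] to Popen explicitly.
        assert self.proc is not None
        out, _ = self.proc.communicate()
        return out
```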
Add instructions for installing Dangerzone on Qubes from our official
repos. These instructions are adapted from the build instructions, but
have been greatly simplified because we don't need some of the qubes
that the development environment needs.
Closes #431
Add Tesseract models for the 10 most spoken languages as package
requirements for Qubes. For containers, this problem is already solved
since we install all Tesseract models.
If a user is not covered by the installed models, they can install
extras on their own. We will add a note for this in subsequent commits.
Refs #431
Switch to the tessdata-fast Tesseract model, instead of the tessdata
one. The tessdata-fast Tesseract model is much smaller, and a bit faster
than the other one. Also, it's the model that Debian/Fedora ship by
default.
Closes #545
Extend the client-side capabilities of the Qubes isolation provider, by
adding client-side timeout logic.
This implementation brings the same logic that we used server-side to
the client, by taking into account the original file size and the number
of pages that the server returns.
Since the client code does not have the exact same insight as the server
has, the timeouts are calculated in two places:
1. The timeout for getting the number of pages. This timeout takes into
account:
* the disposable qube startup time, and
* the time it takes to convert a file type to PDF
2. The total timeout for converting the PDF into pixels, in the same way
that we do it on the server-side.
Besides these changes, we also ensure that partial reads (e.g., due to
EOF) are detected (see the exact=... argument).
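
Detecting partial reads can be sketched as below. The `read_bytes`
helper and its signature are assumptions for illustration, and the
sketch assumes a buffered binary stream, where `read(size)` only returns
fewer bytes when it hits EOF:

```python
import io
from typing import BinaryIO


def read_bytes(stream: BinaryIO, size: int, exact: bool = True) -> bytes:
    """Read `size` bytes; with exact=True, treat a short read as an error."""
    buf = stream.read(size)
    if exact and len(buf) != size:
        raise ValueError(f"expected {size} bytes, but only got {len(buf)}")
    return buf
```

With `exact=False`, the caller accepts whatever was available before
EOF, which is useful for reading trailing data of unknown length.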
Some things that are not resolved in this commit are:
* We have both client-side and server-side timeouts for the first phase
of the conversion. Once containers can stream data back to the
application (see #443), these server-side timeouts can be removed.
* We do not show a proper error message when a timeout occurs. This will
be part of the error handling PR (see #430)
Fixes #446
Refs #443
Refs #430
Replace the deprecated `bdist_rpm` method of creating RPMs for
Dangerzone. Instead, update our `install/linux/build-rpm.py` script, to
build Dangerzone RPMs using our SPEC file under
`install/linux/dangerzone.spec`. The script now essentially creates a
source distribution (sdist) using `poetry build`, and then uses
`rpmbuild` to create binary and source RPMs.
Fixes #298
Add an `rpm-build` directory under `install/linux`, which will be used
for building Dangerzone RPMs. For the time being, it only has a
.gitignore file there, but in the future, invoking
`install/linux/build-rpm.py` will populate it.
Update the dependencies required to build RPM packages. More
specifically, remove the older python3-setuptools dependency, and depend
instead on python3-devel and python3-poetry-core.
Note that this commit may break our CI, but it will be resolved in
subsequent commits.
Introduce a SPEC file that can be used to create an RPM from a Python
source distribution. Some notable features of this SPEC file follow:
1. We can use this SPEC file to create both regular RPM packages and
ones targeted for Qubes.
2. It has a post installation script that removes stale .egg-info
directories, which previously caused issues for our users.
3. It automatically creates a changelog from our Git logs, which differs
from the actual CHANGELOG.md.
4. It follows the latest Fedora guidelines (as of writing this) for
packaging Python projects.
Fixes #514
Update our pyproject.toml file to include some non-Python data files,
e.g., our container image and assets. This way, we can use `poetry
build` to create a source distribution / Python wheel from our source
repository.
Note that this list of data files is already defined in our `setup.py`
script. In that script, one can find some extra goodies:
1. We can conditionally include data files in our Python package. We use
this to include Qubes data only in our Qubes packages.
2. We can specify where the data files will be installed in the end-user
system.
The above are non-goals for Poetry [1], especially (2), because modern
Python wheels are not supposed to install files in arbitrary places
within the user's host, nor should the install invocation use sudo.
Instead, this is a task that's better suited for the .deb / .rpm
packages.
So, why do we bother updating our `pyproject.toml` and not use
`setup.py` instead? Because `setup.py` is deprecated [2,3], and the
latest Python packaging RFCs [4], as well as most recent Fedora
guidelines [5] use `pyproject.toml` as the source of truth, instead of
`setup.py`.
In subsequent commits, we will also use just `pyproject.toml` for RPM
packaging.
[1]: https://github.com/python-poetry/poetry/issues/890
[2]: https://peps.python.org/pep-0517/#source-trees
[3]: https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html
[4]: https://peps.python.org/pep-0517/
[5]: https://docs.fedoraproject.org/en-US/packaging-guidelines/Python/
The dangerzone-container entrypoint, as specified in pyproject.toml, is
stale, for the following reasons:
1. It's not mentioned in the setup.py script, so it was never included
in our Linux distributions.
2. The code in `dangerzone.__init__.py` that decides if it will invoke
the GUI or CLI backend, just takes `dangerzone-cli` into account for
this decision, and does not mention dangerzone-container anywhere.
In order to let isort respect .gitignore, we need to specify this in the
tool.isort entry, in pyproject.toml.
For black, we don't need any extra tweaks. This is weird, since until a
few months ago black did not respect .gitignore. Maybe something has
changed in the meantime but if not, we should revert this change.
Creates exceptions in the server code to be shared with the client via an
identifying exit code. These exceptions are then reconstructed in the
client.
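
The mechanism can be sketched as follows. The exception names and the
exit code values are hypothetical and only illustrate how an exit code
maps back to an exception class:

```python
class ConversionException(Exception):
    """Base class; subclasses are identified by their exit code."""

    error_code = 1  # hypothetical value


class DocFormatUnsupported(ConversionException):
    error_code = 10  # hypothetical value


class MaxPagesExceeded(ConversionException):
    error_code = 11  # hypothetical value


def exception_from_error_code(error_code: int) -> ConversionException:
    """Reconstruct, in the client, the exception that the server raised."""
    for cls in ConversionException.__subclasses__():
        if cls.error_code == error_code:
            return cls()
    # Unknown codes fall back to the generic exception.
    return ConversionException(f"unknown error code: {error_code}")
```

On the server side, the process would exit with the `error_code` of the
exception it caught, and the client would pass that exit code to
`exception_from_error_code()`.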
Refs #456 but does not completely fix it. Unexpected exceptions and
progress descriptions are still passed in Containers.
This PR reverts the patch that disables HWP / HWPX conversion on MacOS
M1. It does not fix conversion on Qubes OS (#494)
Previously, HWP / HWPX conversion didn't work on MacOS M1 systems (#498)
because libreoffice wasn't built with Java support on Alpine Linux for
ARM (aarch64).
Thankfully, the Alpine team has enabled Java support on the aarch64
system [1], so we can enable it again for ARM architectures.
Fixes #498
[1]: 74d443f479
The Alpine Linux team has enabled Java support for LibreOffice on ARM
architecture:
74d443f479
This commit is included in 7.5.5.2-r2, so the installed LibreOffice
package should be 7.5.5.2-r2 or higher to fix this issue.
However 3.18 doesn't have the 7.5.5.2-r2 package:
https://pkgs.alpinelinux.org/package/v3.18/community/aarch64/libreoffice
The Dangerzone image uses the alpine:latest image which is 3.18 as of
writing this.
For this reason, we switch to the edge repo of Alpine Linux, which
includes this fix.
Refs #498
Refs #540
Refs #542