The markdown dependency uses importlib to monkeypatch 'html.parser'
[1]. Due to this approach 'html.parser' is never explicitly stated
as a dependency. This works fine in most cases, since it's part of
the python standard lib. But on Windows the build tool (CxFreeze)
ships in the .exe only the modules needed. And because html.parser
is never mentioned, it fails with an error (see issue #501).
Fixes#501
[1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/htmlparser.py#L29
The HWP / HWPX conversion feature does not work on the following
platforms:
* MacOS with Apple Silicon CPU
* Native Qubes OS
For this reason, we need to:
1. Disable it on the GUI side, by not allowing the user to select these
files.
2. Throw an error on the isolation provider side, in case the user
directly attempts to convert the file (either through CLI or via
"Open With").
Refs #494
Refs #498
Sometimes, LibreOffice returns with status code 0, but in reality, it
fails. It doesn't create a file, and Dangerzone does not detect this.
What happens next is that it fails in the next command, and throws an
unrelated error.
Detect that LibreOffice fails, by checking if the output file exists,
after the PDF conversion.
Always pull the base container image (alpine:latest) before building our
own container image. Else, in an environments that we haven't touched
for a while, an older image may be used.
Update our release instructions in the following ways:
1. Make sure to check the Python dependencies / version before the
release.
2. Make sure to upload the final container.tar.gz image as a release
artifact.
Add extra files and base64 encode externally contributed docs. This
prevents the accidental opening of such documents, since they couldn't
be rebuit by the Dangerzone developers to ensure their safety.
Use the MIME types actually used by the `file` command, which was
recently changed for the detection of the HWPX format [1].
application/hwp+zip -> application/x-hwp+zip
But the HWPX format includes a 'mimetype' file, which contains the
MIME type string "application/hwp+zip", so that was left so because
it may be possible to detect it as "application/hwp+zip".
[1]: ceef7ead3a
HWPX MIME type is recognized as 'application/zip' with current version of file command (file-5.44).
It will be recognized as 'application/hwp+zip' when new version of file is released.
For a temporary fix, when MIME type of file is 'application/zip',
check the file type again (without the MIME option).
And then check if it's 'Zip data (MIME type "application/hwp+zip"?)' or not.
Only load the LibreOffice extension for opening hwp/hwpx when it is
actually needed. Adding an extension to libreoffice may allow for it to
run arbitrary code. This makes it trust more scalable by trusting
LibreOffice extensions only for the filetypes which they target.
Reasoning
---------
Assuming a malicious `.oxt` extension this means that the extension has
arbitrary code execution in the container. While this is not an
existential threat in itself, we should not expose every Dangerzone user
to it. This is achieved by dynamically loading the extension at runtime
only when needed.
This ensures that a compromised extension will in its least malicious
form be able to modify the visual content of any hancom office files but
not *every file*. In the more malicious version, if the code execution
manages to do a container escape, this will only affect users that have
converted a Hancom office file.
H2ORestart is a LibreOffice extension which adds Hancom HWP/HWPX (Hangul Word Processor)
supports for LibreOffice. This format is widely used in South Korea.
Version: v0.5.7
Extension Repository: https://github.com/ebandal/H2Orestart/releases
Force Podman to use the overlay storage driver in our Dangerzone
environments. We have seen that in certain cases, Podman may opt to use
the vfs storage driver instead, which is more space-intensive.
Closes#489
Improve the `parse_progress()` method of the container isolation
provider in the following ways:
1. Make sure that the fields of the progress report have the expected
type.
2. In case of a JSON parsing error, sanitize the invalid string so that
it doesn't contain escape sequences, or the user considers it as
trusted.
Update the common `print_progress()` method in the base
`IsolationProvider` class, with two extra features:
1. Always sanitize the provided text argument.
2. Mark the sanitized text argument as untrusted.
This is default behavior from now on, since this function is commonly
used to parse progress reports from the conversion sandbox.
Sanitize filenames in various places in the code, before we write them
to the user's terminal. Filenames, especially in Linux, can contain
virtually any character except for '\0' and '/', so it's important to
sanitize them.
Make the `error_label` widget always render messages as plain text,
instead of auto discovering if the text is rich. We need this because
the error message may contain input from the sandbox, which we consider
untrusted.
Run our GUI tests on separate processes, because the combination of
Ubuntu Focal, Qt5, PySide6, and pytest-qt somehow leads to segfaults,
probably due to stale global state.
Closes#493
The isort tool is not compatible with Black by default. This leads to a
tug of war between these tools, when we run `make lint-apply` -> `make
lint`. Fix this by forcing isort to be compatible with Black.
Move the "Ok" button in the prompt that asks users if they want to
enable update checks to the right, to further reinforce that this is
the default action.
Fully test the update check logic, by introducing several Qt tests.
Also, improve the `UpdaterThread.get_letest_info()` method, that gets
the latest version and changelog from GitHub, with several checks.
These checks are also tested in our newly added tests.
We want to differentiate between the user clicking on "Cancel" and
clicking on "X", since in the second case, we want to remind them again
on the next run.
Reverse the logic in Qubes to run in containers by default and only
perform the conversion with VMs when explicitly set by the env var
QUBES_CONVERSION=1. This will avoid surprises when someone installs
Dangerzone on Qubes expecting it to work out of the box just like any
other Linux.
Fixes#451
Parallel tests had given us issues in the part [1]. This time, they
weren't playing well with pytest-qt. One hypothesis is that Qt
application components run as singletons and don't play well when there
are two instances.
The symptom we were experiencing was infinite recursion and removing
pytest-xdist solved the issue.
[1]: https://github.com/freedomofpress/dangerzone/issues/217
Now that sample_doc was renamed to sample_pdf it could cause some
confusion the fact that that the TestBase class had an attribute called
sample_doc which referenced the sample PDF.
By removing this attribute and passing the fixture instead we are
following a more pytest-native approach of passing arguments explicitly.
Implements the GUI logic necessary to change the selected document. When
"Change Selection" is clicked, it opens a File Dialog on the directory
of the previously selected files (if any)
Fixes#428