dangerzone

mirror of https://github.com/freedomofpress/dangerzone.git synced 2025-04-28 18:02:38 +02:00

Author	SHA1	Message	Date
Alex Pyrgiotis	ccf4132ea0	conversion: Add sanity check for page count Add a sanity check at the end of the conversion from doc to pixels, to ensure that the resulting document will have the same number of pages as the original one. Refs #560	2023-09-28 22:50:54 +03:00
Alex Pyrgiotis	b4e5cf5be7	qubes: Stream page data in real time Stream page data back to the caller, immediately after we read them from pdftoppm. This way, we have more accurate progress reports and timeouts. Fixes #557	2023-09-28 22:50:54 +03:00
Alex Pyrgiotis	4bb959f220	conversion: Add anchor points for streaming page data/metadata Introduce 4 new methods that can be overloaded by the Qubes isolation provider to stream page data/metadata back to the caller. For the time being, these methods do what they did before, i.e., write this info in files within the pixels directory.	2023-09-28 22:50:53 +03:00
Alex Pyrgiotis	6012cd1491	Improve EOF detection when reading command output Do not read a line from the command output and then check if we are at EOF, because it's possible that the writer immediately exited after writing the last line of output. Instead, switch the order of actions. This is a very serious bug that can lead to Dangerzone excluding the last page of the document. It should have bitten us right from the start (see `aeeed411a0`), but it seems that the small period of time it takes the kernel to close the file descriptors was hiding this bug. Fixes #560	2023-09-28 22:50:53 +03:00
Garrett Robinson	79c1d6db0f	Use extend_skip to avoid overriding isort's skip default This preserves isort's default behavior of ignoring virtualenvs with common names like `venv` or `.venv`, which is helpful when running `isort` in a local development environment that uses such a virtualenv.	2023-09-28 17:21:00 +03:00
Garrett Robinson	eab768f950	Style safe_extension_filename consistently in Dark Mode To be consistent with Light Mode, the background of the safe_extension_filename QLabel should match the adjacent QTextField, but the text should be "grayed out"/disabled to indicate that it's not supposed to be editable.	2023-09-28 17:20:54 +03:00
Garrett Robinson	40b6240097	Only set certain colors in light mode	2023-09-28 17:20:50 +03:00
Garrett Robinson	46f978e6f0	Detect OS color mode and set as property for stylesheets Sets the detected OS color mode (dark/light) as a property on the QApplication so it can be referenced in stylesheets to select style rules suited to the OS color mode.	2023-09-28 17:20:34 +03:00
deeplow	23bee23d81	Disable isolation_provider tests on dummy conversion Windows and macOS runners in CI don't support nested virtualization, and thus Docker, so they aren't really candidates for isolation_provider tests.	2023-09-28 11:08:53 +01:00
deeplow	0a6b33ebed	Qubes: detect qube failing to start (missing RAM) In Qubes OS it's often the case that the user doesn't have enough RAM to start the conversion. In this case it raises BrokenPipeException and exits with code 126. It didn't seem possible to distinguish this kind of failure from one where the user has misconfigured qrexec policies. NOTE: this approach is not ideal UX-wise. After the first doc fails, the next one will also try and fail. Upon first failure we should inform the user that they need to close some programs or qubes.	2023-09-28 11:08:50 +01:00
deeplow	63f03d5bcd	Add limit and test to max width and height of docs	2023-09-28 11:08:47 +01:00
deeplow	6f26fc6303	Qubes: add test if MAX_PAGES is enforced in client Because the server also checks the MAX_PAGES limit, the test in base would hide the fact that the client is not enforcing the limit. This ensures that's not the case. When the pages in containers are streamed (#443), then this test should be in base.py.	2023-09-28 11:06:36 +01:00
deeplow	54b8ffbf96	Add page limit of 10000 Theoretically the max pages would be 65536 (2-byte unsigned int). However this limit is much higher than practical documents have and larger ones can lead to unforeseen problems, for example RAM limitations. We thus opted to use a lower limit of 10K. The limit must be detected client-side, given that the server is distrusted. However we also check it in the server, just as a fail-early mechanism.	2023-09-28 11:01:14 +01:00
deeplow	afba362d22	Tests: split isolation provider tests per provider Isolation provider tests were done in tests/test_base.py and had pytest.mark.parametrize() for each isolation provider. This logic would not work well when we had tests that diverge. We could have marked each one as compatible with one provider or another, but in the end it turned out to be better to have the common ones in a base class and the divergent ones in each. NOTE: this has a strange side-effect: inherited test classes need to have imports for all of the fixtures even if they are not explicitly used	2023-09-28 09:53:29 +01:00
Alex Pyrgiotis	18b73d94b0	qubes: Find out reason of interrupted conversions If a conversion has been interrupted (usually due to an EOF), figure out why this happened by checking the exit code of the spawned process.	2023-09-26 17:35:26 +03:00
Alex Pyrgiotis	30196ff35b	errors: Add error for interrupted conversions Add an error for interrupted conversions, in order to better differentiate this scenario from other ValueErrors that may be raised throughout the code's lifetime.	2023-09-26 17:35:26 +03:00
Alex Pyrgiotis	0273522fb1	qubes: Store the process for the spawned qube Store, in an instance attribute, the process that we have started for the spawned disposable qube. In subsequent commits, we will use it from other places as well, aside from the `_convert` method. Note that this commit does not alter the conversion logic, and only does the following: 1. Renames `p.` to `self.proc.` 2. Adds an `__init__` method to the Qubes isolation provider, and initializes the `self.proc` attribute to `None`. 3. Adds an assert that `self.proc` is not `None` after it's spawned, to placate Mypy.	2023-09-26 17:35:25 +03:00
deeplow	e08b6defc3	Round conversion progress from float to int Fixes #553	2023-09-26 15:20:41 +01:00
deeplow	8d37ff15e0	Remove duplicated Qubes message: "Safe PDF Created" Fixes #555. This is a leftover from when we didn't have progress reports from the second stage conversion (AKA. pixels to PDF) in #429.	2023-09-26 12:16:48 +01:00
Alex Pyrgiotis	a67c080898	Add changelog entry for Qubes beta integration	2023-09-25 12:51:41 +03:00
Alex Pyrgiotis	af7087af65	Update our release/QA instructions for Qubes Update the release/QA instructions for Qubes, so that they take into account the fact that we can now publish a Qubes RPM through our official repos.	2023-09-25 12:51:41 +03:00
Alex Pyrgiotis	c94c8c8ba5	Add installation instructions for Qubes Add instructions for installing Dangerzone on Qubes from our official repos. These instructions are adapted from the build instructions, but have been greatly simplified because we don't need some of the qubes that the development environment needs. Closes #431	2023-09-25 12:51:40 +03:00
Alex Pyrgiotis	22a58d83df	install: Add Tesseract models as package reqs Add Tesseract models for the 10 most spoken languages as package requirements for Qubes. For containers, this problem is already solved since we install all Tesseract models. If a user is not covered by the installed models, they can install extras on their own. We will add a note for this in subsequent commits. Refs #431	2023-09-25 12:51:40 +03:00
Alex Pyrgiotis	215fa8b558	install: Add conflict if Dangerzone is installed Add a "Conflicts:" entry in the RPM spec, in case another version of Dangerzone is already installed.	2023-09-25 12:49:58 +03:00
Alex Pyrgiotis	81b4a8deb5	Minor fixes in Fedora installation section	2023-09-25 12:49:58 +03:00
Alex Pyrgiotis	cbca9110ca	Switch to tessdata-fast Tesseract model Switch to the tessdata-fast Tesseract model, instead of the tessdata one. The tessdata-fast Tesseract model is much smaller, and a bit faster than the other one. Also, it's the model that Debian/Fedora ship by default. Closes #545	2023-09-25 12:48:05 +03:00
Alex Pyrgiotis	e64d1da61f	qubes: Pass OCR parameters properly Pass OCR parameters to conversion functions as arguments, instead of setting environment variables. Fixes #455	2023-09-20 18:04:40 +03:00
Alex Pyrgiotis	8a0c0a4673	Make parameter actually optional	2023-09-20 17:58:39 +03:00
Alex Pyrgiotis	20157bef58	Fix typo	2023-09-20 17:45:44 +03:00
Alex Pyrgiotis	99dd5f5139	qubes: Add client-side timeouts Extend the client-side capabilities of the Qubes isolation provider, by adding client-side timeout logic. This implementation brings the same logic that we used server-side to the client, by taking into account the original file size and the number of pages that the server returns. Since the code does not have the exact same insight as the server has, the calculated timeouts are in two places: 1. The timeout for getting the number of pages. This timeout takes into account: * the disposable qube startup time, and * the time it takes to convert a file type to PDF 2. The total timeout for converting the PDF into pixels, in the same way that we do it on the server-side. Besides these changes, we also ensure that partial reads (e.g., due to EOF) are detected (see exact=... argument) Some things that are not resolved in this commit are: * We have both client-side and server-side timeouts for the first phase of the conversion. Once containers can stream data back to the application (see #443), these server-side timeouts can be removed. * We do not show a proper error message when a timeout occurs. This will be part of the error handling PR (see #430) Fixes #446 Refs #443 Refs #430	2023-09-20 17:32:42 +03:00
Alex Pyrgiotis	55a4491ced	Consolidate import statements	2023-09-20 17:14:24 +03:00
Alex Pyrgiotis	c547ffc3b4	conversion: Factor out calculate_timeout Factor out the logic behind the calculate_timeout() method, used in Dangerzone conversions, so that isolation providers can call it directly.	2023-09-20 17:14:24 +03:00
Alex Pyrgiotis	fea193e935	Add non-blocking read utility Add a function that can read data from non-blocking fds, which we will use later on to read from standard streams with a timeout.	2023-09-20 17:14:24 +03:00
Alex Pyrgiotis	344d6f7bfa	Add Stopwatch implementation Add a simple stopwatch implementation to track the elapsed time since an event, or the remaining time until a timeout.	2023-09-20 17:14:23 +03:00
Alex Pyrgiotis	fbe13bb114	Refer to Qubes in the project's description	2023-09-20 16:48:53 +03:00
Alex Pyrgiotis	a3bb740b19	Remove some stale Qubes refs in setup.py	2023-09-20 16:48:53 +03:00
Alex Pyrgiotis	01d63e4eda	install: Build Dangerzone RPMs using our SPEC file Replace the deprecated `bdist_rpm` method of creating RPMs for Dangerzone. Instead, update our `install/linux/build-rpm.py` script, to build Dangerzone RPMs using our SPEC file under `install/linux/dangerzone.spec`. The script now essentially creates a source distribution (sdist) using `poetry build`, and then uses `rpmbuild` to create binary and source RPMs. Fixes #298	2023-09-20 16:48:53 +03:00
Alex Pyrgiotis	6cc2a953ff	install: Add directory for building Dangerzone RPMs Add an `rpm-build` directory under `install/linux`, which will be used for building Dangerzone RPMs. For the time being, it only has a .gitignore file there, but in the future, invoking `install/linux/build-rpm.py` will populate it.	2023-09-20 16:48:53 +03:00
Alex Pyrgiotis	f5abe0abd0	Update RPM dependencies Update the dependencies required to build RPM packages. More specifically, remove the older python3-setuptools dependency, and depend instead on python3-devel and python3-poetry-core. Note that this commit may break our CI, but it will be resolved in subsequent commits.	2023-09-20 16:48:53 +03:00
Alex Pyrgiotis	33197f26b7	install: Introduce a SPEC file for creating RPMs Introduce a SPEC file that can be used to create an RPM from a Python source distribution. Some notable features of this SPEC file follow: 1. We can use this SPEC file to create both regular RPM packages and ones targeted for Qubes. 2. It has a post installation script that removes stale .egg-info directories, which previously caused issues for our users. 3. It automatically creates a changelog from our Git logs, which differs from the actual CHANGELOG.md. 4. It follows the latest Fedora guidelines (as of writing this) for packaging Python projects. Fixes #514	2023-09-20 16:48:52 +03:00
Alex Pyrgiotis	3dea16bcd2	Include non-Python data files into Python package Update our pyproject.toml file to include some non-Python data files, e.g., our container image and assets. This way, we can use `poetry build` to create a source distribution / Python wheel from our source repository. Note that this list of data files is already defined in our `setup.py` script. In that script, one can find some extra goodies: 1. We can conditionally include data files in our Python package. We use this to include Qubes data only in our Qubes packages. 2. We can specify where the data files will be installed in the end-user system. The above are non-goals for Poetry [1], especially (2), because modern Python wheels are not supposed to install files in arbitrary places within the user's host, nor should the install invocation use sudo. Instead, this is a task that's better suited for the .deb / .rpm packages. So, why do we bother updating our `pyproject.toml` and not use `setup.py` instead? Because `setup.py` is deprecated [2,3], and the latest Python packaging RFCs [4], as well as most recent Fedora guidelines [5] use `pyproject.toml` as the source of truth, instead of `setup.py`. In subsequent commits, we will also use just `pyproject.toml` for RPM packaging. [1]: https://github.com/python-poetry/poetry/issues/890 [2]: https://peps.python.org/pep-0517/#source-trees [3]: https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html [4]: https://peps.python.org/pep-0517/ [5]: https://docs.fedoraproject.org/en-US/packaging-guidelines/Python/	2023-09-20 16:38:55 +03:00
Alex Pyrgiotis	5431e059bf	Update build-system entry in pyproject.toml Update the `build-backend` attribute, in accordance with the Python Poetry docs [1]. Also, bump the minimum required poetry-core version to 1.2.0, since this is the version that introduced the Poetry dependency groups [2], i.e., the [tool.poetry.group] sections in pyproject.toml. [1]: https://python-poetry.org/docs/pyproject/#poetry-and-pep-517 [2]: https://python-poetry.org/docs/managing-dependencies/#dependency-groups	2023-09-20 16:38:55 +03:00
Alex Pyrgiotis	b83d2495eb	Remove stale dangerzone-container entrypoint The dangerzone-container entrypoint, as specified in pyproject.toml, is stale, for the following reasons: 1. It's not mentioned in the setup.py script, so it was never included in our Linux distributions. 2. The code in `dangerzone.__init__.py` that decides if it will invoke the GUI or CLI backend, just takes `dangerzone-cli` into account for this decision, and does not mention dangerzone-container anywhere.	2023-09-20 16:38:55 +03:00
Alex Pyrgiotis	7bc0129f94	Let black and isort respect .gitignore In order to let isort respect .gitignore, we need to specify this in the tool.isort entry, in pyproject.toml. For black, we don't need any extra tweaks. This is weird, since until a few months ago black did not respect .gitignore. Maybe something has changed in the meantime but if not, we should revert this change.	2023-09-20 16:38:55 +03:00
Alex Pyrgiotis	29c0181b4d	Add test_docs_large in our .gitignore	2023-09-20 16:38:54 +03:00
deeplow	94f569cdf5	Add error code for unexpected errors in conversion	2023-09-19 15:52:47 +01:00
deeplow	8e4f04a52e	Shift conversion exit codes by 128 Distinguish from podman or other errors in called binaries by shifting the error codes by 128.	2023-09-19 15:34:00 +01:00
deeplow	b4c3e07d36	Remove attacker-controlled error messages Creates exceptions in the server code to be shared with the client via an identifying exit code. These exceptions are then reconstructed in the client. Refs #456 but does not completely fix it. Unexpected exceptions and progress descriptions are still passed in Containers.	2023-09-19 15:33:20 +01:00
Moon Sungjoon	214ce9720d	Enable HWP conversion on macOS M1 This PR reverts the patch that disables HWP / HWPX conversion on macOS M1. It does not fix conversion on Qubes OS (#494). Previously, HWP / HWPX conversion didn't work on macOS M1 systems (#498) because LibreOffice wasn't built with Java support on Alpine Linux for ARM (aarch64). Thankfully, the Alpine team has enabled Java support on the aarch64 system [1], so we can enable it again for ARM architectures. Fixes #498 [1]: `74d443f479`	2023-09-06 13:10:18 +03:00
Moon Sungjoon	acd615e0e1	Switch to the edge repo of Alpine Linux The Alpine Linux team has enabled Java support for LibreOffice on ARM architecture: `74d443f479` This commit is included in 7.5.5.2-r2, so the installed LibreOffice package should be 7.5.5.2-r2 or higher to fix this issue. However 3.18 doesn't have the 7.5.5.2-r2 package: https://pkgs.alpinelinux.org/package/v3.18/community/aarch64/libreoffice The Dangerzone image uses the alpine:latest image which is 3.18 as of writing this. For this reason, we switch to the edge repo of Alpine Linux, which includes this fix. Refs #498 Refs #540 Refs #542	2023-09-06 13:09:34 +03:00
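
The ordering issue described in commit 6012cd1491 ("Improve EOF detection when reading command output") can be illustrated with a minimal sketch. This is not the code from the repository: the function name `read_lines_until_exit` is hypothetical, and the sketch only assumes a child process whose standard output is read line by line. The point it shows is that the exit status is sampled before each read, so output written just before the writer exits is still drained.

```python
import subprocess
import sys


def read_lines_until_exit(proc):
    """Collect every line a child process writes to stdout.

    Hypothetical helper for illustration only. The exit status is checked
    *before* each read: if the process had already exited, everything it
    wrote is in the pipe, so we keep reading until the pipe is empty.
    """
    lines = []
    while True:
        exited = proc.poll() is not None  # 1. check the exit status first
        line = proc.stdout.readline()     # 2. then read the next line
        if line:
            lines.append(line.rstrip("\n"))
            continue
        if exited:
            # The process was gone before this read and the pipe is empty.
            break
        # stdout reached EOF but the process is still running: wait for it,
        # then loop once more to confirm nothing is left.
        proc.wait()
    return lines


if __name__ == "__main__":
    child = subprocess.Popen(
        [sys.executable, "-c", "print('page 1'); print('page 2')"],
        stdout=subprocess.PIPE,
        text=True,
    )
    print(read_lines_until_exit(child))
```

Reversing steps 1 and 2 (read first, then stop as soon as the process has exited) is the pattern the commit message warns about, since the writer may exit between the read and the check while its last line is still unread.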


1078 commits