dangerzone

mirror of https://github.com/freedomofpress/dangerzone.git synced 2025-05-04 04:31:49 +02:00

Author	SHA1	Message	Date
Alexis Métaireau	ab6dd9c01d	Use `pathlib.Path` to return path locations	2025-03-31 16:20:28 +02:00
Alex Pyrgiotis	51f432be6b	Fix references to container.tar.gz Find all references to the `container.tar.gz` file, and replace them with references to `container.tar`. Moreover, remove the `--no-save` argument of `build-image.py` since we now always save the image. Finally, fix some stale references to Poetry, which are not necessary anymore.	2025-03-20 17:15:15 +02:00
Alex Pyrgiotis	25fba42022	Extend the interface of the isolation provider Add the following two methods in the isolation provider: 1. `.is_available()`: Mainly used for the Container isolation provider, it specifies whether the container runtime is up and running. May be used in the future by other similar providers. 2. `.should_wait_install()`: Whether the isolation provider takes a while to be installed. Should be `True` only for the Container isolation provider, for the time being.	2024-12-10 11:29:00 +02:00
Alex Pyrgiotis	5ed4a048a0	qubes: Do not close stderr Do not close stderr as part of the Qubes termination logic, since we need to read the debug logs. This shouldn't affect typical termination scenarios, since we expect our disposable qube to be either busy reading from stdin, or writing to stdout. If this is not the case, then forcefully killing the `qrexec-client-vm` process should unblock the qube.	2024-10-22 20:33:29 +03:00
Alex Pyrgiotis	7ea7c8a0cc	Remove dead code	2024-10-17 15:50:12 +03:00
Alex Pyrgiotis	b9a3dd63ad	Always start conversion process in new session Start the conversion process in a new session, so that we can later on kill the process group, without killing the controlling script (i.e., the Dangerzone UI). This should not affect the conversion process in any other way.	2024-10-07 17:27:38 +03:00
Alexis Métaireau	65a8827daa	chore: minor linting A few minor changes about when to use `==` and when to use `is`. Basically, this uses `is` for booleans, and `==` for other values. With a few other changes about coding style which was enforced by `ruff`.	2024-06-05 14:19:31 +02:00
Alexis Métaireau	cbbd6afcc1	chore: remove unused code This commit removes code that's not being used: exceptions bound with `as e` where the exception itself is not used, `with` statements in the same situation, and some other parts where code was duplicated.	2024-06-05 14:19:31 +02:00
Alexis Métaireau	5aa4863b52	chore(imports): remove useless imports As detected by [ruff](https://github.com/astral-sh/ruff) Related to #254, although it doesn't provide the command to lint the codebase itself.	2024-06-05 14:19:30 +02:00
Alex Pyrgiotis	171a7eca52	isolation_provider: Terminate doc-to-pixels proc Extend the IsolationProvider class with a `terminate_doc_to_pixels_proc()` method, which must be implemented by the Qubes/Container providers and gracefully terminate a process started for the doc to pixels phase. Refs #563	2024-04-24 14:36:14 +03:00
Alex Pyrgiotis	6850d31edc	isolation_provider: Pass doc when creating doc-to-pixels proc Pass the Document instance that will be converted to the `IsolationProvider.start_doc_to_pixels_proc()` method. Concrete classes can then associate this name with the started process, so that they can later on kill it.	2024-04-24 14:33:33 +03:00
deeplow	0449840ec3	dz.ConvertDev: do not teleport .pyc files On Qubes the conversion in dev mode would fail when converting from a Fedora 38 development qube via a Fedora 39 disposable qube. The reason was that dz.ConvertDev was receiving `.pyc` files, which were compiled for python 3.11 but running on python 3.12. Unfortunately PyZipFile objects cannot send source python files, even though the documentation is a little bit unclear on this [1]. Fixes #723 [1]: https://docs.python.org/3/library/zipfile.html#pyzipfile-objects	2024-03-13 07:13:39 +00:00
deeplow	69c2a02d81	Remove timeouts Remove timeouts for several reasons: 1. Lost purpose: after implementing page streaming in containers, the only subprocess we have left is LibreOffice. So we no longer have such a big risk of commands hanging (the original reason for timeouts). 2. Little benefit: predicting execution time is, in general, an unsolvable computer science problem. Ultimately we were guessing an arbitrary time based on the number of pages and the document size. As a guess we made it pretty lax (30s per page or MB). A document hanging for this long will probably lead to user frustration in any case and the user may be compelled to abort the conversion. 3. Technical challenges with non-blocking timeouts: there have been several technical challenges in keeping timeouts that we've made an effort to accommodate. A significant one was having to do non-blocking reads to ensure we could time out when reading the conversion stream (and then used here) Fixes #687	2024-02-06 20:11:43 +00:00
deeplow	f3032a7142	Make big endian explicit in int to bytes Fix issues in older distros that don't yet support python 3.11 where endianness was not a default argument [1]. This is in response to CI failures [2]. [1]: https://docs.python.org/3/library/stdtypes.html#int.to_bytes [2]: https://app.circleci.com/pipelines/github/freedomofpress/dangerzone/2186/workflows/e340ca21-85ce-42b6-9bc3-09e66f96684a/jobs/27380y	2024-02-06 19:42:41 +00:00
deeplow	1835756b45	Allow each conversion to have its own proc If we increased the number of parallel conversions, we'd run into an issue where the streams were getting mixed together. This was because the Converter.proc was a single attribute. This breaks it down into a local variable such that this mixup doesn't happen.	2024-02-06 19:42:41 +00:00
deeplow	550786adfe	Remove untrusted progress parsing (stderr instead) Now that only the second container can send JSON-encoded progress information, we can remove the untrusted JSON parsing. The parse_progress was also renamed to `parse_progress_trusted` to ensure future developers don't mistake this as a safe method. The old methods for sending untrusted JSON were repurposed to send the progress to stderr instead, for troubleshooting in development mode. Fixes #456	2024-02-06 19:42:40 +00:00
deeplow	0a099540c8	Stream pages in containers: merge isolation providers Merge the core code of the Qubes and Containers isolation providers into the parent IsolationProviders abstract class. This is done by streaming pages in containers, exclusively in the first conversion process. The commit is rather large due to the multiple interdependencies of the code, making it difficult to split into various commits. The main conversion method (_convert), now in the superclass, simply calls two methods: - doc_to_pixels() - pixels_to_pdf() Critically, doc_to_pixels is implemented in the superclass, diverging only in a specialized method called "start_doc_to_pixels_proc()". This method obtains the process that communicates with the isolation provider (container / disp VM) via `podman/docker` and qrexec on Containers and Qubes respectively. Known regressions: - progress reports stopped working on containers Fixes #443	2024-02-06 19:42:33 +00:00
deeplow	dca46d0a6b	Homogenize qubes and containers inner convert method Simple rename of the __convert() method in the Qubes conversion to make the code structurally similar.	2024-02-06 18:54:31 +00:00
Alex Pyrgiotis	edfba0c783	Qubes: Fix progress in first stage of Qubes conversion	2023-10-13 22:44:37 +03:00
Alex Pyrgiotis	bdf3f8babc	qubes: Clean up temporary files Create a temporary dir before the conversion begins, and store every file necessary for the conversion there. We are mostly concerned about the second stage of the conversion, which runs in the host. The first stage runs in a disposable qube and cleanup is implicit. Fixes #575 Fixes #436	2023-10-04 14:05:23 +03:00
Alex Pyrgiotis	b7b76174ab	qubes: Log captured output for the second stage Log the captured command output during the second stage, only in dev environments. This follows what we have already done for the first stage.	2023-10-02 15:41:29 +03:00
Alex Pyrgiotis	16603875d6	qubes: Display all errors in second stage If a command encounters an error or times out during the second stage of the conversion in Qubes, handle it the same way as we would have handled it in the first stage: 1. Get its error message. 2. Throw an UnexpectedConversionError exception, with the original message. Note that, because the second stage takes place locally, users will see the original content of the error. Refs #567 Closes #430	2023-10-02 15:41:17 +03:00
deeplow	0a6b33ebed	Qubes: detect qube failing to start (missing RAM) In Qubes OS it's often the case that the user doesn't have enough RAM to start the conversion. In this case it raises BrokenPipeException and exits with code 126. It didn't seem possible to distinguish this kind of failure from one where the user has misconfigured qrexec policies. NOTE: this approach is not ideal UX-wise. After the first doc fails, the next one will also try and fail. Upon first failure we should inform the user that they need to close some programs or qubes.	2023-09-28 11:08:50 +01:00
deeplow	63f03d5bcd	Add limit and test to max width and height of docs	2023-09-28 11:08:47 +01:00
deeplow	54b8ffbf96	Add page limit of 10000 Theoretically the max pages would be 65536 (2byte unsigned int). However this limit is much higher than practical documents have and larger ones can lead to unforeseen problems, for example RAM limitations. We thus opted to use a lower limit of 10K. The limit must be detected client-side, given that the server is distrusted. However we also check it in the server, just as a fail-early mechanism.	2023-09-28 11:01:14 +01:00
Alex Pyrgiotis	18b73d94b0	qubes: Find out reason of interrupted conversions If a conversion has been interrupted (usually due to an EOF), figure out why this happened by checking the exit code of the spawned process.	2023-09-26 17:35:26 +03:00
Alex Pyrgiotis	30196ff35b	errors: Add error for interrupted conversions Add an error for interrupted conversions, in order to better differentiate this scenario from other ValueErrors that may be raised throughout the code's lifetime.	2023-09-26 17:35:26 +03:00
Alex Pyrgiotis	0273522fb1	qubes: Store the process for the spawned qube Store, in an instance attribute, the process that we have started for the spawned disposable qube. In subsequent commits, we will use it from other places as well, aside from the `_convert` method. Note that this commit does not alter the conversion logic, and only does the following: 1. Renames `p.` to `self.proc.` 2. Adds an `__init__` method to the Qubes isolation provider, and initializes the `self.proc` attribute to `None`. 3. Adds an assert that `self.proc` is not `None` after it's spawned, to placate Mypy.	2023-09-26 17:35:25 +03:00
deeplow	8d37ff15e0	Remove duplicated Qubes message: "Safe PDF Created" Fixes #555. This is a leftover from when we didn't have progress reports from the second stage conversion (AKA. pixels to PDF) in #429.	2023-09-26 12:16:48 +01:00
Alex Pyrgiotis	e64d1da61f	qubes: Pass OCR parameters properly Pass OCR parameters to conversion functions as arguments, instead of setting environment variables. Fixes #455	2023-09-20 18:04:40 +03:00
Alex Pyrgiotis	8a0c0a4673	Make parameter actually optional	2023-09-20 17:58:39 +03:00
Alex Pyrgiotis	99dd5f5139	qubes: Add client-side timeouts Extend the client-side capabilities of the Qubes isolation provider, by adding client-side timeout logic. This implementation brings the same logic that we used server-side to the client, by taking into account the original file size and the number of pages that the server returns. Since the code does not have the exact same insight as the server has, the calculated timeouts are in two places: 1. The timeout for getting the number of pages. This timeout takes into account: * the disposable qube startup time, and * the time it takes to convert a file type to PDF 2. The total timeout for converting the PDF into pixels, in the same way that we do it on the server-side. Besides these changes, we also ensure that partial reads (e.g., due to EOF) are detected (see exact=... argument) Some things that are not resolved in this commit are: * We have both client-side and server-side timeouts for the first phase of the conversion. Once containers can stream data back to the application (see #443), these server-side timeouts can be removed. * We do not show a proper error message when a timeout occurs. This will be part of the error handling PR (see #430) Fixes #446 Refs #443 Refs #430	2023-09-20 17:32:42 +03:00
Alex Pyrgiotis	55a4491ced	Consolidate import statements	2023-09-20 17:14:24 +03:00
deeplow	b4c3e07d36	Remove attacker-controlled error messages Creates exceptions in the server code to be shared with the client via an identifying exit code. These exceptions are then reconstructed in the client. Refs #456 but does not completely fix it. Unexpected exceptions and progress descriptions are still passed in Containers.	2023-09-19 15:33:20 +01:00
deeplow	f41cefde1d	Add "armor" around conversion log Add GPG-styled "armor" around conversion logs -----CONVERSION LOG START----- Creator: Writer Producer: LibreOffice 6.4 [...] -----CONVERSION LOG END-----	2023-08-22 16:11:28 +01:00
deeplow	9f1abe2836	Replace non-printable ascii in conversion log Certain characters may be abused. Particularly ANSI escape codes. Solution inspired by Qubes OS's hardening of their RPC mechanism [1]: > Terminal control characters are a security issue, which in worst case > amount to arbitrary command execution. In the simplest case this > requires two often found codes: terminal title setting (which puts > arbitrary string in the window title) and title repo reporting (which > puts that string on the shell's standard input. [sic] > > -- qvm-run.rst [2] [1]: `e005836286` [2]: `c70da44702/doc/manpages/qvm-run.rst (L126)`	2023-08-22 16:11:27 +01:00
deeplow	95cef8cf0a	Containers: capture conversion logs Store the conversion log to a file (captured-output.txt) in the container and when in development mode, have its output displayed on the terminal output.	2023-08-22 16:11:26 +01:00
deeplow	d6bce4dec5	Qubes: close qrexec stdin and stdout Ensure a server cannot keep the client hanging if more data than necessary is sent. This applies to the container and the Qubes implementation.	2023-08-22 16:11:23 +01:00
deeplow	874b8865e2	Qubes: strategy for capturing conversion logs Use qrexec stdout to send conversion data (pixels) and stderr to send conversion progress at the end of the conversion. This happens regardless of whether the conversion is in developer mode. It's the client that decides whether it reads the debug data from stderr. In this case, it only reads it if developer mode is enabled.	2023-08-22 16:11:20 +01:00
Alex Pyrgiotis	6c374d8a7e	qubes: Mark Dangerzone messages as trusted Mark the messages that Dangerzone creates once a conversion step finishes as trusted, since they do not contain any string not controlled by us.	2023-08-01 14:43:49 +03:00
deeplow	1ab14dbd86	Use containers in Qubes until Beta Reverse the logic in Qubes to run in containers by default and only perform the conversion with VMs when explicitly set by the env var QUBES_CONVERSION=1. This will avoid surprises when someone installs Dangerzone on Qubes expecting it to work out of the box just like any other Linux. Fixes #451	2023-07-26 14:02:06 +01:00
deeplow	ef41cab76e	Add progress reports on Qubes (GUI) Fixes #429	2023-07-13 12:57:23 +01:00
deeplow	bf38c24d99	Merge stdout_callback with print_progress stdout_callback is used to flow progress information from the conversion to some front-end. It was always used in tandem with printing to the terminal (which is kind of a front-end). So it made sense to put them always together.	2023-07-13 12:57:04 +01:00
deeplow	baeab9d7eb	Add Qubes isolation provider Add an isolation provider for Qubes, that performs the document conversion as follows: Document to pixels phase ------------------------ 1. Starts a disposable qube by calling either the dz.Convert or the dz.ConvertDev RPC call, depending on the execution context. 2. Sends the file to disposable qube through its stdin. * If we call the conversion from the development environment, also pass the conversion module as a Python zipfile, before the suspicious document. 3. Reads the number of pages, their dimensions, and the page data. Pixels to PDF phase ------------------- 1. Writes the page data under /tmp/dangerzone, so that the `pixels_to_pdf` module can read them. 2. Pass OCR parameters as envvars. 3. Call the `pixels_to_pdf` main function, as if it was running within a container. Wait until the PDF gets created. 4. Move the resulting PDF to the proper directory. Fixes #414	2023-06-21 11:46:34 +03:00
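
Commits f3032a7142 and 54b8ffbf96 above describe passing the byte order explicitly to `int.to_bytes` and capping documents at 10000 pages, with the page count being a 2-byte unsigned integer. The following is a minimal sketch of how those two ideas might fit together, not the project's actual code: the names `MAX_PAGES`, `encode_page_count`, and `decode_page_count` are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: names below are illustrative, not taken from Dangerzone.
MAX_PAGES = 10000  # the limit named in commit 54b8ffbf96


def encode_page_count(num_pages: int) -> bytes:
    """Encode a page count as a 2-byte unsigned big-endian integer."""
    if not 0 < num_pages <= MAX_PAGES:
        raise ValueError(f"page count out of range: {num_pages}")
    # Pass "big" explicitly: before Python 3.11, `byteorder` had no default.
    return num_pages.to_bytes(2, "big")


def decode_page_count(data: bytes) -> int:
    """Decode a 2-byte page count and re-check the limit on the receiving side."""
    if len(data) != 2:
        raise ValueError("expected exactly 2 bytes")
    num_pages = int.from_bytes(data, "big")
    if not 0 < num_pages <= MAX_PAGES:
        raise ValueError(f"page count out of range: {num_pages}")
    return num_pages
```

Checking the limit in both functions mirrors the commit's reasoning: the receiving side cannot trust the sender, so it validates on its own, while the sending side checks only to fail early.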

44 commits
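
Commits 9f1abe2836 and f41cefde1d describe replacing non-printable characters in the conversion log and wrapping the log in GPG-style "armor" lines. A minimal sketch of that idea follows, under stated assumptions: the function names and the `?` placeholder are hypothetical, and only the two armor strings come from the commit message itself.

```python
# Hypothetical sketch: function names and the "?" placeholder are assumptions.
ARMOR_START = "-----CONVERSION LOG START-----"  # from commit f41cefde1d
ARMOR_END = "-----CONVERSION LOG END-----"      # from commit f41cefde1d


def sanitize_log(untrusted: str, placeholder: str = "?") -> str:
    """Keep printable ASCII and newlines; replace every other character.

    This neutralizes terminal control sequences such as ANSI escape codes.
    Note that it also replaces non-ASCII characters.
    """
    return "".join(
        ch if ch == "\n" or 32 <= ord(ch) <= 126 else placeholder
        for ch in untrusted
    )


def armor_log(untrusted: str) -> str:
    """Sanitize an untrusted log and wrap it between the armor lines."""
    return f"{ARMOR_START}\n{sanitize_log(untrusted)}\n{ARMOR_END}"


if __name__ == "__main__":
    # An escape sequence that would otherwise set the terminal title.
    print(armor_log("\x1b]0;title\x07Creator: Writer"))
```

Sanitizing before printing matters because the log text originates from the untrusted conversion side, and the armor lines make it obvious where that text begins and ends on the terminal.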