dangerzone

mirror of https://github.com/freedomofpress/dangerzone.git synced 2025-05-18 03:01:50 +02:00

Author	SHA1	Message	Date
jkarasti	cecfe63338	Lint: Fix unused-import (F401)	2024-12-17 17:44:32 +01:00
Alex Pyrgiotis	25fba42022	Extend the interface of the isolation provider Add the following two methods in the isolation provider: 1. `.is_available()`: Mainly used for the Container isolation provider, it specifies whether the container runtime is up and running. May be used in the future by other similar providers. 2. `.should_wait_install()`: Whether the isolation provider takes a while to be installed. Should be `True` only for the Container isolation provider, for the time being.	2024-12-10 11:29:00 +02:00
Alex Pyrgiotis	309bd12423	Move container-specific method from base class Move the `is_runtime_available()` method from the base `IsolationProvider` class, and into the `Dummy` provider class. This method was originally defined in the base class, in order to be mocked in our tests for the `Dummy` provider. There's no reason for the `Qubes` class to have it though, so we can just move it to the `Dummy` provider.	2024-12-09 19:19:21 +02:00
Alexis Métaireau	a95b612e78	Catch installation errors and display them. Fixes #193	2024-10-17 16:20:56 +02:00
Alex Pyrgiotis	7ea7c8a0cc	Remove dead code	2024-10-17 15:50:12 +03:00
Alex Pyrgiotis	f42bb23229	Update the way we get debug logs Move the logic for grabbing debug logs to a new place, now that we have merged the two conversion stages (doc to pixels, pixels to PDF).	2024-10-17 15:50:12 +03:00
Alex Pyrgiotis	e34c36f7bc	Perform on-host pixels to PDF conversion Extend the base isolation provider to immediately convert each page to a PDF, and optionally use OCR. In contrast with the way we did things previously, there are no longer two separate stages (document to pixels, pixels to PDF). We now handle each page individually, for two main reasons: 1. We don't want to buffer pixel data, either on disk or in memory, since they take a lot of space, and can potentially leave traces. 2. We can perform these operations in parallel, saving time. This is more evident when OCR is not used, where the times to convert a page to pixels and then back to a PDF are comparable.	2024-10-17 15:50:12 +03:00
Alex Pyrgiotis	d6410652cb	Kill the process group when conversion terminates Instead of killing just the invoked Podman/Docker/qrexec process, kill the whole process group, to make sure that other components that have been spawned die as well. In the case of Podman, conmon is one of the processes that lingers, so that's one way to kill it.	2024-10-07 17:37:39 +03:00
Alex Pyrgiotis	b9a3dd63ad	Always start conversion process in new session Start the conversion process in a new session, so that we can later on kill the process group, without killing the controlling script (i.e., the Dangerzone UI). This should not affect the conversion process in any other way.	2024-10-07 17:27:38 +03:00
Alexis Métaireau	65a8827daa	chore: minor linting A few minor changes about when to use `==` and when to use `is`. Basically, this uses `is` for booleans, and `==` for other values. Also includes a few other coding style changes enforced by `ruff`.	2024-06-05 14:19:31 +02:00
Alexis Métaireau	cbbd6afcc1	chore: remove unused code This commit removes code that's not being used: exceptions bound with `as e` where the exception itself is not used, the same for `with` statements, and some other parts where there was duplicated code.	2024-06-05 14:19:31 +02:00
Alexis Métaireau	5aa4863b52	chore(imports): remove useless imports As detected by [ruff](https://github.com/astral-sh/ruff) Related to #254, although it doesn't provide the command to lint the codebase itself.	2024-06-05 14:19:30 +02:00
Alex Pyrgiotis	37bf9badf4	Remove extraneous log sanitization Remove an extra call to `replace_control_chars()`, as well as an unnecessary method.	2024-05-09 15:57:42 +03:00
Alex Pyrgiotis	0b45360384	Keep newlines when reading debug logs In `d632908a44` we improved our `replace_control_chars()` function, by replacing every control or invalid Unicode character with a placeholder one. This change, however, made our debug logs harder to read, since newlines were not preserved. There are indeed various cases in which replacing newlines is wise (e.g., in filenames), so we should keep this behavior by default. However, specifically for reading debug logs, we add an option to keep newlines to improve readability, at no expense to security.	2024-05-09 15:57:42 +03:00
Alex Pyrgiotis	f57d2f7191	isolation_provider: Always terminate spawned process Previously, we always assumed that the spawned process would quit within 3 seconds. This was an arbitrary call, and did not work in practice. We can improve our standing here by doing the following: 1. Make `Popen.wait()` calls take a generous amount of time (since they are usually on the sad path), and handle any timeout errors that they throw. This way, a slow conversion process cleanup does not take too much of our users' time, nor is it reported as an error. 2. Always make sure that once the conversion of doc to pixels is over, the corresponding process will finish within a reasonable amount of time as well. Fixes #749	2024-04-24 14:39:15 +03:00
Alex Pyrgiotis	cd4cbdb00a	isolation_provider: Get exit code without timing out Get the exit code of the spawned process for the doc-to-pixels phase, without timing out. More specifically, if the spawned process has not finished within a generous amount of time (hardcoded to 15 seconds), return UnexpectedConversionError, with a custom message. This way, the happy path is not affected, and we still do our best to learn the underlying cause of the I/O error.	2024-04-24 14:36:14 +03:00
Alex Pyrgiotis	171a7eca52	isolation_provider: Terminate doc-to-pixels proc Extend the IsolationProvider class with a `terminate_doc_to_pixels_proc()` method, which must be implemented by the Qubes/Container providers and gracefully terminate a process started for the doc to pixels phase. Refs #563	2024-04-24 14:36:14 +03:00
Alex Pyrgiotis	6850d31edc	isolation_provider: Pass doc when creating doc-to-pixels proc Pass the Document instance that will be converted to the `IsolationProvider.start_doc_to_pixels_proc()` method. Concrete classes can then associate this name with the started process, so that they can later on kill it.	2024-04-24 14:33:33 +03:00
Alex Pyrgiotis	634523dac9	Get underlying error when conversion fails When we get an early EOF from the converter process, we should immediately get the exit code of that process, to find out the actual underlying error. Currently, the exception we raise masks the underlying error. Raise a ConverterProcException, which in turn makes our error handling code read the exit code of the spawned process, and convert it to a helpful error message. Fixes #714	2024-02-20 15:55:45 +02:00
Alex Pyrgiotis	6ee1d14c9a	Start conversion process earlier Start the conversion process earlier, so that we have a reference to the Popen object in case of an exception.	2024-02-20 15:55:45 +02:00
deeplow	e4a5dbce46	Don't show 50% duplicated progress info 50% would show twice in the conversion progress due to an overlap in conversion progress values. The doc_to_pixels would be from 0-50% and the pixels_to_pdf from 50%-100%. This commit makes the first part go from 0 to 49% instead. Fixes #715	2024-02-20 13:47:15 +00:00
deeplow	69c2a02d81	Remove timeouts Remove timeouts due to several reasons: 1. Lost purpose: after implementing the containers page streaming the only subprocess we have left is LibreOffice. So we don't have such a big risk of commands hanging (the original reason for timeouts). 2. Little benefit: predicting execution time is a generally unsolvable computer science problem. Ultimately we were guessing an arbitrary time based on the number of pages and the document size. As a guess we made it pretty lax (30s per page or MB). A document hanging for this long will probably lead to user frustration in any case and the user may be compelled to abort the conversion. 3. Technical challenges with non-blocking timeout: there have been several technical challenges in keeping timeouts that we've made an effort to accommodate. A significant one was having to do non-blocking reads to ensure we could time out when reading the conversion stream (and then used here) Fixes #687	2024-02-06 20:11:43 +00:00
deeplow	f3032a7142	Make big endian explicit in int to bytes Fix issues in older distros that don't yet support python 3.11 where endianness was not a default argument [1]. This is in response to CI failures [2]. [1]: https://docs.python.org/3/library/stdtypes.html#int.to_bytes [2]: https://app.circleci.com/pipelines/github/freedomofpress/dangerzone/2186/workflows/e340ca21-85ce-42b6-9bc3-09e66f96684a/jobs/27380y	2024-02-06 19:42:41 +00:00
deeplow	1835756b45	Allow each conversion to have its own proc If we increased the number of parallel conversions, we'd run into an issue where the streams were getting mixed together. This was because the Converter.proc was a single attribute. This breaks it down into a local variable such that this mixup doesn't happen.	2024-02-06 19:42:41 +00:00
deeplow	61e7a3c107	Fix isolation provider tests Conversion methods had changed and that was part of the reason why the tests were failing. Furthermore, due to the `provider.proc`, which stores the associated qrexec / container process, "server" exceptions raise an IterruptedConversion error (now ConverterProcException), which then requires interpretation of the process exit code to obtain the "real" exception.	2024-02-06 19:42:41 +00:00
deeplow	550786adfe	Remove untrusted progress parsing (stderr instead) Now that only the second container can send JSON-encoded progress information, we can remove the untrusted JSON parsing. The parse_progress was also renamed to `parse_progress_trusted` to ensure future developers don't mistake this as a safe method. The old methods for sending untrusted JSON were repurposed to send the progress instead to stderr for troubleshooting in development mode. Fixes #456	2024-02-06 19:42:40 +00:00
deeplow	c991e530d0	Fix IsolationProvider.percentage variable reuse If one converted more than one document, since the state of IsolationProvider.percentage would be stored in the IsolationProvider instance, it would get reused for the second document. The fix is to keep it as a local variable, but we can explore having progress stored on the document itself, for example. Or having one IsolationProvider per conversion.	2024-02-06 19:42:40 +00:00
deeplow	0a099540c8	Stream pages in containers: merge isolation providers Merge the core code of the Qubes and Containers isolation providers into the parent IsolationProviders abstract class. This is done by streaming pages in containers, exclusively in the first conversion process. The commit is rather large due to the multiple interdependencies of the code, making it difficult to split into various commits. The main conversion method (_convert), now in the superclass, simply calls two methods: - doc_to_pixels() - pixels_to_pdf() Critically, doc_to_pixels is implemented in the superclass, diverging only in a specialized method called "start_doc_to_pixels_proc()". This method obtains the process that communicates with the isolation provider (container / disp VM) via `podman/docker` and qrexec on Containers and Qubes respectively. Known regressions: - progress reports stopped working on containers Fixes #443	2024-02-06 19:42:33 +00:00
Alex Pyrgiotis	3daf0e2cb7	Do not show file previews in case of exceptions If a Qubes conversion encounters an exception that is not a subclass of ConversionException, it will still show a preview of a file that does not exist. Send an error progress report in that case, so that the GUI code can detect that an error occurred and not open a file preview Fixes #581	2023-10-05 11:11:42 +03:00
deeplow	e08b6defc3	Round conversion progress from float to int Fixes #553	2023-09-26 15:20:41 +01:00
deeplow	b4c3e07d36	Remove attacker-controlled error messages Creates exceptions in the server code to be shared with the client via an identifying exit code. These exceptions are then reconstructed in the client. Refs #456 but does not completely fix it. Unexpected exceptions and progress descriptions are still passed in Containers.	2023-09-19 15:33:20 +01:00
deeplow	9ec9cc5f87	Replace armor guards that indicate isolated output	2023-08-22 16:11:41 +01:00
deeplow	75369cf621	Adapt code so it works for reporting script Reporting script now parses JunitXML instead of a series of ".container_log" files. The script is in the changed submodule. Additionally it makes failed tests actually fail so that this is recorded in the JunitXML report.	2023-08-22 16:11:36 +01:00
deeplow	f41cefde1d	Add "armor" around conversion log Add GPG-styled "armor" around conversion logs -----CONVERSION LOG START----- Creator: Writer Producer: LibreOffice 6.4 [...] -----CONVERSION LOG END-----	2023-08-22 16:11:28 +01:00
deeplow	95cef8cf0a	Containers: capture conversion logs Store the conversion log to a file (captured-output.txt) in the container and when in development mode, have its output displayed on the terminal output.	2023-08-22 16:11:26 +01:00
Alex Pyrgiotis	9410b68c1d	Sanitize progress reports in a provider-agnostic way Update the common `print_progress()` method in the base `IsolationProvider` class, with two extra features: 1. Always sanitize the provided text argument. 2. Mark the sanitized text argument as untrusted. This is default behavior from now on, since this function is commonly used to parse progress reports from the conversion sandbox.	2023-08-01 14:43:48 +03:00
Alex Pyrgiotis	77f4b8115c	Add missing reset ANSI sequence Do not forget to reset the red text once we print an error string to the terminal	2023-08-01 14:38:32 +03:00
deeplow	bf38c24d99	Merge stdout_callback with print_progress stdout_callback is used to flow progress information from the conversion to some front-end. It was always used in tandem with printing to the terminal (which is kind of a front-end). So it made sense to put them always together.	2023-07-13 12:57:04 +01:00
deeplow	56b5b98f1e	Report exceptions raised in document conversion Exceptions raised during the document conversion process would be silently hidden. This was because ThreadPoolExecutor in logic.py created various contexts and hid any exceptions raised. Fixes #309	2023-01-26 18:53:20 +00:00
deeplow	f5c4847af2	De-duplicate print_progress() logic	2023-01-25 14:53:28 +00:00
deeplow	538df18709	Split isolation providers into their own .py files Provides clearer code organization by having each provider in its own Python file rather than all in a single one.	2023-01-25 14:19:05 +00:00
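
Commits b9a3dd63ad and d6410652cb describe starting the conversion process in a new session and later killing its whole process group, so that lingering helpers die without taking down the controlling script. A minimal sketch of that idea with Python's standard library follows, assuming a POSIX host. The helper names `start_in_new_session()` and `kill_process_group()` are hypothetical and are not taken from the Dangerzone codebase, and the 15-second wait is an arbitrary choice for this sketch.

```python
import os
import signal
import subprocess
import sys


def start_in_new_session(args):
    """Start a child in its own session; it becomes leader of a new process group."""
    # start_new_session=True makes the child call setsid() before exec.
    return subprocess.Popen(args, start_new_session=True)


def kill_process_group(proc, timeout=15):
    """Send SIGKILL to the child's whole process group, then reap the child."""
    try:
        os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
    except ProcessLookupError:
        pass  # The process is already gone.
    return proc.wait(timeout=timeout)


if __name__ == "__main__":
    # Hypothetical stand-in for a long-running conversion process.
    child = start_in_new_session(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    # The child lives in a different process group than this script,
    # so killing its group leaves the caller alive.
    assert os.getpgid(child.pid) != os.getpgid(0)
    print(kill_process_group(child))
```

Because the child leads its own group, any processes it spawns inherit that group by default and receive the same signal.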

41 commits
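
Commit f3032a7142 makes the byte order explicit in integer-to-bytes conversions, because `byteorder` only became an optional argument of `int.to_bytes()` in Python 3.11. The sketch below illustrates the portable form. The helper names and the 2-byte field width are assumptions made for illustration, not details taken from the commit.

```python
def encode_u16(value: int) -> bytes:
    """Encode an unsigned integer as 2 big-endian bytes (width is an assumption)."""
    # Passing byteorder explicitly works on every Python 3 version;
    # omitting it raises TypeError before Python 3.11.
    return value.to_bytes(2, "big")


def decode_u16(data: bytes) -> int:
    """Decode big-endian bytes back into an integer, again with explicit byteorder."""
    return int.from_bytes(data, "big")


if __name__ == "__main__":
    encoded = encode_u16(258)
    print(encoded, decode_u16(encoded))
```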