Commit graph

100 commits

Author SHA1 Message Date
Alex Pyrgiotis
edfba0c783
Qubes: Fix progress in first stage of Qubes conversion 2023-10-13 22:44:37 +03:00
Alex Pyrgiotis
3daf0e2cb7
Do not show file previews in case of exceptions
If a Qubes conversion encounters an exception that is not a subclass of
ConversionException, it will still show a preview of a file that does
not exist.

Send an error progress report in that case, so that the GUI code can
detect that an error occurred and not open a file preview.

Fixes #581
2023-10-05 11:11:42 +03:00
Alex Pyrgiotis
bdf3f8babc
qubes: Clean up temporary files
Create a temporary dir before the conversion begins, and store every
file necessary for the conversion there. We are mostly concerned about
the second stage of the conversion, which runs in the host. The first
stage runs in a disposable qube and cleanup is implicit.

Fixes #575
Fixes #436
2023-10-04 14:05:23 +03:00
Alex Pyrgiotis
6232062146
Add missing newline char 2023-10-02 15:41:29 +03:00
Alex Pyrgiotis
b7b76174ab
qubes: Log captured output for the second stage
Log the captured command output during the second stage, only in dev
environments. This follows what we have already done for the first
stage.
2023-10-02 15:41:29 +03:00
Alex Pyrgiotis
16603875d6
qubes: Display all errors in second stage
If a command encounters an error or times out during the second stage of
the conversion in Qubes, handle it the same way as we would have handled
it in the first stage:

1. Get its error message.
2. Throw an UnexpectedConversionError exception, with the original
   message.

Note that, because the second stage takes place locally, users will see
the original content of the error.

Refs #567
Closes #430
2023-10-02 15:41:17 +03:00
deeplow
0a6b33ebed
Qubes: detect qube failing to start (missing RAM)
In Qubes OS it's often the case that the user doesn't have enough
RAM to start the conversion. In this case it raises BrokenPipeException
and exits with code 126.

It didn't seem possible to distinguish this kind of failure from one
where the user has misconfigured qrexec policies.

NOTE: this approach is not ideal UX-wise. After the first document
fails, the next one will also try and fail. Upon first failure we should
inform the user that they need to close some programs or qubes.
2023-09-28 11:08:50 +01:00
deeplow
63f03d5bcd
Add limit and test to max width and height of docs 2023-09-28 11:08:47 +01:00
deeplow
54b8ffbf96
Add page limit of 10000
Theoretically the max pages would be 65535 (2-byte unsigned int).
However, this limit is much higher than practical documents need,
and larger ones can lead to unforeseen problems, for example RAM
limitations.

We thus opted to use a lower limit of 10K. The limit must be
detected client-side, given that the server is distrusted. However
we also check it in the server, just as a fail-early mechanism.
2023-09-28 11:01:14 +01:00
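A minimal sketch of the client-side check described in the page-limit commit above (`54b8ffbf96`), assuming the page count arrives as a 2-byte big-endian unsigned integer. The names `MAX_PAGES`, `PageLimitExceeded`, and `read_page_count`, as well as the byte order, are hypothetical and not taken from the Dangerzone code base.

```python
import struct

MAX_PAGES = 10_000  # the 10K limit named in the commit message


class PageLimitExceeded(Exception):
    """Hypothetical error raised when the server reports too many pages."""


def read_page_count(stream) -> int:
    """Read a 2-byte unsigned page count and enforce the limit client-side."""
    raw = stream.read(2)
    if len(raw) != 2:
        raise EOFError("stream ended before the page count was read")
    # Big-endian byte order is an assumption made for this sketch.
    (num_pages,) = struct.unpack(">H", raw)
    if num_pages > MAX_PAGES:
        raise PageLimitExceeded(f"untrusted page count: {num_pages}")
    return num_pages
```

Because the value comes from the untrusted side, the check runs before any page data is read, so an oversized count is rejected early.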
Alex Pyrgiotis
18b73d94b0
qubes: Find out reason of interrupted conversions
If a conversion has been interrupted (usually due to an EOF), figure out
why this happened by checking the exit code of the spawned process.
2023-09-26 17:35:26 +03:00
Alex Pyrgiotis
30196ff35b
errors: Add error for interrupted conversions
Add an error for interrupted conversions, in order to better
differentiate this scenario from other ValueErrors that may be raised
throughout the code's lifetime.
2023-09-26 17:35:26 +03:00
Alex Pyrgiotis
0273522fb1
qubes: Store the process for the spawned qube
Store, in an instance attribute, the process that we have started for
the spawned disposable qube. In subsequent commits, we will use it from
other places as well, aside from the `_convert` method.

Note that this commit does not alter the conversion logic, and only does
the following:
1. Renames `p.` to `self.proc.`
2. Adds an `__init__` method to the Qubes isolation provider, and
   initializes the `self.proc` attribute to `None`.
3. Adds an assert that `self.proc` is not `None` after it's spawned, to
   placate Mypy.
2023-09-26 17:35:25 +03:00
deeplow
e08b6defc3
Round conversion progress from float to int
Fixes #553
2023-09-26 15:20:41 +01:00
deeplow
8d37ff15e0
Remove duplicated Qubes message: "Safe PDF Created"
Fixes #555. This is a leftover from when we didn't have progress
reports from the second stage conversion (a.k.a. pixels to PDF) in #429.
2023-09-26 12:16:48 +01:00
Alex Pyrgiotis
e64d1da61f
qubes: Pass OCR parameters properly
Pass OCR parameters to conversion functions as arguments, instead of
setting environment variables.

Fixes #455
2023-09-20 18:04:40 +03:00
Alex Pyrgiotis
8a0c0a4673
Make parameter actually optional 2023-09-20 17:58:39 +03:00
Alex Pyrgiotis
99dd5f5139
qubes: Add client-side timeouts
Extend the client-side capabilities of the Qubes isolation provider, by
adding client-side timeout logic.

This implementation brings the same logic that we used server-side to
the client, by taking into account the original file size and the number
of pages that the server returns.

Since the client-side code does not have the exact same insight as the
server, it calculates timeouts in two places:

1. The timeout for getting the number of pages. This timeout takes into
   account:
   * the disposable qube startup time, and
   * the time it takes to convert a file type to PDF
2. The total timeout for converting the PDF into pixels, in the same way
   that we do it on the server-side.

Besides these changes, we also ensure that partial reads (e.g., due to
EOF) are detected (see exact=... argument).

Some things that are not resolved in this commit are:
* We have both client-side and server-side timeouts for the first phase
  of the conversion. Once containers can stream data back to the
  application (see #443), these server-side timeouts can be removed.
* We do not show a proper error message when a timeout occurs. This will
  be part of the error handling PR (see #430).

Fixes #446
Refs #443
Refs #430
2023-09-20 17:32:42 +03:00
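The two timeouts and the partial-read detection from the client-side timeouts commit above (`99dd5f5139`) can be sketched roughly as follows. All constants and function names here are hypothetical, chosen only for illustration; the commit message does not state the actual values or signatures.

```python
import math
from typing import Optional

# Hypothetical constants, not taken from the Dangerzone code base.
TIMEOUT_PER_MB = 30.0    # seconds per megabyte of input
TIMEOUT_PER_PAGE = 30.0  # seconds per page
STARTUP_TIMEOUT = 60.0   # allowance for disposable qube startup


def calculate_timeout(size_bytes: int, pages: Optional[int] = None) -> float:
    """Scale the timeout with file size, and with page count once known."""
    timeout = math.ceil(size_bytes / 1024**2 * TIMEOUT_PER_MB)
    if pages is not None:
        timeout = max(timeout, pages * TIMEOUT_PER_PAGE)
    return float(timeout)


def page_count_timeout(size_bytes: int) -> float:
    """First timeout: qube startup plus converting the file to PDF."""
    return STARTUP_TIMEOUT + calculate_timeout(size_bytes)


def read_bytes(stream, size: int, exact: bool = True) -> bytes:
    """Read `size` bytes and, if `exact` is set, reject partial reads."""
    data = stream.read(size)
    if exact and len(data) != size:
        raise EOFError(f"expected {size} bytes, got {len(data)}")
    return data
```

The second timeout is `calculate_timeout(size, pages)`, which can only be computed after the server has returned the number of pages.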
Alex Pyrgiotis
55a4491ced
Consolidate import statements 2023-09-20 17:14:24 +03:00
deeplow
b4c3e07d36
Remove attacker-controlled error messages
Creates exceptions in the server code to be shared with the client via an
identifying exit code. These exceptions are then reconstructed in the
client.

Refs #456 but does not completely fix it. Unexpected exceptions and
progress descriptions are still passed in Containers.
2023-09-19 15:33:20 +01:00
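One way to picture the exit-code scheme from the commit above (`b4c3e07d36`): the server exits with a code that identifies the exception, and the client rebuilds the exception with a message that it controls. The class names, codes, and messages below are hypothetical placeholders, not the project's actual values.

```python
class ConversionException(Exception):
    """Base error whose message is fixed client-side, never sent by the server."""

    error_code = 1  # hypothetical exit code
    error_message = "Unspecified error"

    def __init__(self) -> None:
        super().__init__(self.error_message)


class DocFormatUnsupported(ConversionException):
    error_code = 10  # hypothetical exit code
    error_message = "The document format is not supported"


class ConversionTimeout(ConversionException):
    error_code = 20  # hypothetical exit code
    error_message = "The conversion timed out"


def exception_from_error_code(code: int) -> ConversionException:
    """Reconstruct an exception in the client from the server's exit code."""
    for cls in ConversionException.__subclasses__():
        if cls.error_code == code:
            return cls()
    # Unknown codes fall back to the generic error in this sketch.
    return ConversionException()
```

Only the integer crosses the trust boundary, so a compromised server cannot inject arbitrary text into the error shown to the user.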
deeplow
9ec9cc5f87
Replace armor guards that indicate isolated output 2023-08-22 16:11:41 +01:00
deeplow
fa215063ee
Add logging for second container 2023-08-22 16:11:38 +01:00
deeplow
75369cf621
Adapt code so it works for reporting script
Reporting script now parses JunitXML instead of a series of
".container_log" files. The script is in the changed submodule.

Additionally it makes failed tests actually fail so that this is
recorded in the JunitXML report.
2023-08-22 16:11:36 +01:00
deeplow
f41cefde1d
Add "armor" around conversion log
Add GPG-styled "armor" around conversion logs

    -----CONVERSION LOG START-----
    Creator:         Writer
    Producer:        LibreOffice 6.4
    [...]
    -----CONVERSION LOG END-----
2023-08-22 16:11:28 +01:00
deeplow
9f1abe2836
Replace non-printable ascii in conversion log
Certain characters may be abused, particularly ANSI escape codes.
Solution inspired by Qubes OS's hardening of their RPC mechanism [1]:

> Terminal control characters are a security issue, which in worst case
> amount to arbitrary command execution. In the simplest case this
> requires two often found codes: terminal title setting (which puts
> arbitrary string in the window title) and title repo reporting (which
> puts that string on the shell's standard input. [sic]
>
>  -- qvm-run.rst [2]

[1]: e005836286
[2]: c70da44702/doc/manpages/qvm-run.rst (L126)
2023-08-22 16:11:27 +01:00
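A minimal sketch of the kind of replacement the commit above (`9f1abe2836`) describes, assuming that only printable ASCII plus newline and tab are allowed through. The function name `sanitize_log` and the `?` placeholder are hypothetical; the commit message does not say which replacement character is used.

```python
def sanitize_log(untrusted: str, placeholder: str = "?") -> str:
    """Replace every character outside printable ASCII with a placeholder."""
    allowed_whitespace = {"\n", "\t"}
    return "".join(
        ch if (" " <= ch <= "~") or ch in allowed_whitespace else placeholder
        for ch in untrusted
    )
```

With the ESC byte replaced, a sequence such as `\x1b[31m` can no longer be interpreted by the terminal and shows up as inert text instead.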
deeplow
95cef8cf0a
Containers: capture conversion logs
Store the conversion log to a file (captured-output.txt) in the
container and when in development mode, have its output displayed on the
terminal output.
2023-08-22 16:11:26 +01:00
deeplow
d6bce4dec5
Qubes: close qrexec stdin and stdout
Ensure a server cannot keep the client hanging if more data than
necessary is sent. This applies to both the container and the Qubes
implementations.
2023-08-22 16:11:23 +01:00
deeplow
874b8865e2
Qubes: strategy for capturing conversion logs
Use qrexec stdout to send conversion data (pixels) and stderr to send
conversion progress at the end of the conversion. This happens
regardless of whether the conversion is in developer mode.

It's the client that decides if it reads the debug data from stderr or
not. In this case, it only reads it if developer mode is enabled.
2023-08-22 16:11:20 +01:00
Alex Pyrgiotis
6c374d8a7e
qubes: Mark Dangerzone messages as trusted
Mark the messages that Dangerzone creates once a conversion step
finishes as trusted, since they do not contain any string not controlled
by us.
2023-08-01 14:43:49 +03:00
deeplow
72536a05ac
container: Improve parsing of progress reports
Improve the `parse_progress()` method of the container isolation
provider in the following ways:

1. Make sure that the fields of the progress report have the expected
   type.
2. In case of a JSON parsing error, sanitize the invalid string so that
   it doesn't contain escape sequences and the user does not consider it
   trusted.
2023-08-01 14:43:49 +03:00
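The two improvements listed in the `parse_progress()` commit above (`72536a05ac`) might look roughly like this. The field names (`error`, `text`, `percentage`), the `UNTRUSTED>` prefix, and the standalone helper are assumptions made for this sketch, not the provider's actual method.

```python
import json
from typing import Tuple


def sanitize(untrusted: str) -> str:
    """Replace anything outside printable ASCII (hypothetical helper)."""
    return "".join(ch if " " <= ch <= "~" else "?" for ch in untrusted)


def parse_progress(untrusted_line: str) -> Tuple[bool, str, float]:
    """Validate a JSON progress report; fall back to a sanitized error."""
    try:
        report = json.loads(untrusted_line)
        error = report["error"]
        text = report["text"]
        percentage = report["percentage"]
        # 1. Make sure each field has the expected type.
        if not isinstance(error, bool) or not isinstance(text, str):
            raise TypeError("unexpected field type")
        if isinstance(percentage, bool) or not isinstance(percentage, (int, float)):
            raise TypeError("unexpected field type")
    except (ValueError, KeyError, TypeError):
        # 2. Sanitize the invalid string and label it as untrusted.
        message = f"UNTRUSTED> Invalid progress report: {sanitize(untrusted_line)}"
        return True, message, 0.0
    return error, sanitize(text), float(percentage)
```

`json.JSONDecodeError` is a subclass of `ValueError`, so malformed JSON, missing keys, and wrongly typed fields all end up in the same fallback branch.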
Alex Pyrgiotis
9410b68c1d
Sanitize progress reports in a provider-agnostic way
Update the common `print_progress()` method in the base
`IsolationProvider` class, with two extra features:

1. Always sanitize the provided text argument.
2. Mark the sanitized text argument as untrusted.

This is default behavior from now on, since this function is commonly
used to parse progress reports from the conversion sandbox.
2023-08-01 14:43:48 +03:00
Alex Pyrgiotis
77f4b8115c
Add missing reset ANSI sequence
Do not forget to reset the red text once we print an error string to the
terminal
2023-08-01 14:38:32 +03:00
deeplow
1ab14dbd86
Use containers in Qubes until Beta
Reverse the logic in Qubes to run in containers by default and only
perform the conversion with VMs when explicitly set by the env var
QUBES_CONVERSION=1. This will avoid surprises when someone installs
Dangerzone on Qubes expecting it to work out of the box just like any
other Linux.

Fixes #451
2023-07-26 14:02:06 +01:00
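The opt-in described in the commit above (`1ab14dbd86`) boils down to a single environment check. The sketch below assumes that exactly the value `1` enables VM-based conversion; the helper name `use_qubes_vms` is hypothetical.

```python
import os
from typing import Mapping


def use_qubes_vms(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the user explicitly opted in to VM conversion."""
    return environ.get("QUBES_CONVERSION") == "1"
```

With the variable unset, the function returns `False` and the caller would fall back to containers, which matches the default described in the commit message.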
Moon Sungjoon
494f498d17
Remove pipes module and use shlex instead
Thanks: https://github.com/tox-dev/tox/pull/2418/files

Closes #373
2023-07-24 18:13:00 +03:00
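For context on the commit above (`494f498d17`): the `pipes` module was deprecated and later removed from the standard library, and `shlex.quote` is the documented replacement for `pipes.quote`. A small sketch of quoting a command for display, where `format_command` is a hypothetical helper and not a function from the project:

```python
import shlex
from typing import List


def format_command(args: List[str]) -> str:
    """Render an argument list as a shell-safe string for logging."""
    return " ".join(shlex.quote(arg) for arg in args)
```

For example, `format_command(["podman", "run", "my file.pdf"])` quotes only the argument that contains a space.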
deeplow
ef41cab76e
Add progress reports on Qubes (GUI)
Fixes #429
2023-07-13 12:57:23 +01:00
deeplow
bf38c24d99
Merge stdout_callback with print_progress
stdout_callback is used to flow progress information from the conversion
to some front-end. It was always used in tandem with printing to the
terminal (which is kind of a front-end), so it made sense to always put
them together.
2023-07-13 12:57:04 +01:00
deeplow
baeab9d7eb
Add Qubes isolation provider
Add an isolation provider for Qubes, that performs the document
conversion as follows:

Document to pixels phase
------------------------

1. Starts a disposable qube by calling either the dz.Convert or the
   dz.ConvertDev RPC call, depending on the execution context.
2. Sends the file to disposable qube through its stdin.
   * If we call the conversion from the development environment, also
     pass the conversion module as a Python zipfile, before the
     suspicious document.
3. Reads the number of pages, their dimensions, and the page data.

Pixels to PDF phase
-------------------

1. Writes the page data under /tmp/dangerzone, so that the
   `pixels_to_pdf` module can read them.
2. Passes OCR parameters as envvars.
3. Calls the `pixels_to_pdf` main function, as if it was running within a
   container. Waits until the PDF gets created.
4. Moves the resulting PDF to the proper directory.

Fixes #414
2023-06-21 11:46:34 +03:00
deeplow
a0d1a68302
Use /tmp/dangerzone for Qubes compatibility
For using in containers, creating a /dangerzone directory is fine but it
is more standard to do this in /tmp.
2023-06-21 11:44:53 +03:00
deeplow
814d533c3b
Restructure container code
The files in `container/` no longer make sense to have that name since
the "document to pixels" part will run in Qubes OS in its own virtual
machine.

To adapt to this, this PR does the following:
- Moves all the files in `container` to `dangerzone/conversion`
- Splits the old `container/dangerzone.py` into its two components
  `dangerzone/conversion/{doc_to_pixels,pixels_to_pdf}.py` with a
  `common.py` file for shared functions
- Moves the Dockerfile to the project root and adapts it to the new
  container code location
- Updates the CircleCI config to properly cache Docker images.
- Updates our install scripts to properly build Docker images.
- Adds the new conversion module to the container image, so that it can
  be imported as a package.
- Adapts the container isolation provider to use the new way of calling
  the code.

NOTE: We have made zero changes to the conversion code in this commit,
except for necessary imports in order to factor out some common parts.
Any changes necessary for Qubes integration follow in the subsequent
commits.
2023-06-21 11:44:47 +03:00
Alex Pyrgiotis
8b846820d2
Update typing hints for Mypy 1.1.1
Due to a bump in our Python dependencies, we now install Mypy 1.1.1
instead of 0.982. This change triggered the following errors:

* Incompatible default for argument <a> (default has type
  None, argument has type <t>):

  Mypy further explains here that PEP 484 prohibits implicit Optional,
  so we need to make these types explicit Optional.

* Unused "type: ignore" comment, use narrower [method-assign] instead of
  [assignment]:

  Mypy has specialized some of its lints, meaning that we should switch
  to the newer variants.

Also, it detected several other small inconsistencies. We fix all of
these errors in this commit.
2023-03-27 15:19:43 +03:00
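The first Mypy error in the commit above (`8b846820d2`) comes from a `None` default on a parameter whose annotation does not allow `None`. A sketch of the fix, using a hypothetical function that is not part of the project:

```python
from typing import Optional


# Before (rejected by newer Mypy versions, which disable implicit Optional):
#     def output_name(filename: str, suffix: str = None) -> str: ...


def output_name(filename: str, suffix: Optional[str] = None) -> str:
    """Explicit Optional makes the None default type-correct."""
    if suffix is None:
        return filename
    return f"{filename}{suffix}"
```

The runtime behavior is identical in both versions; only the annotation changes.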
Alex Pyrgiotis
1f308e9cc5
Reformat code with Black 23
Due to a bump in our Python dependencies, we now install Black 23
instead of 22, which detects some of our files as badly formatted.
2023-03-27 15:17:23 +03:00
Alex Pyrgiotis
2042591964
container: Copy files before mounting them
Copy input files in a temporary dir before mounting them, thereby
changing their permissions, without affecting the original files. This
way, we can avoid cases where a file is accessible to the user only due
to a supplemental user group, which does not work for containers.

Fixes #157
Fixes #260
Fixes #335
2023-02-17 01:15:08 +02:00
Alex Pyrgiotis
ea73f5d820
container: Take SELinux labels into account
Take SELinux labels into account when mounting a file to the Dangerzone
container. Use the `:Z` flag (which is a no-op in non-SELinux systems)
to clear the existing SELinux label for a file, and apply one that
matches the container's.

Refs #335
2023-02-17 01:15:08 +02:00
Alex Pyrgiotis
d733890ca0
container: Do not leave stale temporary dirs
Do not leave stale temporary directories when conversion fails
unexpectedly. Instead, wrap the conversion operation in a context
manager that wipes the temporary dir afterwards.

Fixes #317
2023-02-17 01:15:08 +02:00
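The context-manager approach from the commit above (`d733890ca0`) can be sketched with the standard library's `tempfile.TemporaryDirectory`. The `run_conversion` wrapper and the `dangerzone-` prefix are hypothetical; the commit message does not say which context manager the project uses.

```python
import tempfile
from typing import Callable


def run_conversion(convert: Callable[[str], None]) -> None:
    """Run `convert` inside a temporary dir that is wiped on exit."""
    with tempfile.TemporaryDirectory(prefix="dangerzone-") as tmpdir:
        # Cleanup happens when the block exits, even if convert() raises.
        convert(tmpdir)
```

Since the cleanup is tied to leaving the `with` block, an unexpected exception in the conversion still propagates to the caller, but no stale directory is left behind.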
Alex Pyrgiotis
44c324f9ac
Separate config dirs from temp dirs
Do not store temporary directories in Dangerzone's config directory.
There are two reasons for that:

1. They are ephemeral, and they need a temporary place to be stored,
   preferably RAM-backed.
2. We need to set them while running our CI tests.
2023-02-17 01:06:44 +02:00
deeplow
9b3d98b20b
Build arm64 docker image for arm-based Macs
Remove --platform args completely so that by default we build natively
on each platform.

Partial fix for #50
2023-02-16 10:59:00 +00:00
Alex Pyrgiotis
93a06d72f0
Allow users to disable timeouts
Allow users to disable timeouts via the CLI, with the
`--disable-timeouts` argument. By default, the timeouts are always
enabled.

This option applies both to the CLI version of Dangerzone, and the GUI
one. For the latter, the user must start the GUI from their CLI (i.e.,
`dangerzone --disable-timeouts ...`)
2023-02-15 23:48:36 +02:00
deeplow
56b5b98f1e
Report exceptions raised in document conversion
Exceptions raised during the document conversion process would be
silently hidden. This was because ThreadPoolExecutor in logic.py created
various contexts and hid any exceptions raised.

Fixes #309
2023-01-26 18:53:20 +00:00
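The behavior behind the commit above (`56b5b98f1e`) is easy to reproduce: an exception raised in a task submitted to `ThreadPoolExecutor` is stored on its `Future` and stays invisible until someone asks for it. The `convert` and `convert_all` functions below are hypothetical stand-ins, not the code in logic.py.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def convert(doc: str) -> str:
    """Stand-in conversion that fails for some inputs."""
    if doc.endswith(".bad"):
        raise ValueError(f"cannot convert {doc}")
    return doc + "-safe.pdf"


def convert_all(docs: List[str]) -> List[Tuple[str, BaseException]]:
    """Run conversions in threads and surface every stored exception."""
    errors = []
    with ThreadPoolExecutor(max_workers=2) as executor:
        futures = {executor.submit(convert, doc): doc for doc in docs}
    # Leaving the `with` block waits for all tasks, so every future is done.
    for future, doc in futures.items():
        exc = future.exception()
        if exc is not None:
            errors.append((doc, exc))
    return errors
```

Without the explicit `future.exception()` (or `future.result()`) call, the `ValueError` would never reach the caller.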
deeplow
724dd2a71f
Make container-specific methods static
Make these methods callable without having to create an instance of the
Container class. This was needed to make pytest-wrapper.py cleaner.
2023-01-25 14:55:43 +00:00
deeplow
f5c4847af2
De-duplicate print_progress() logic 2023-01-25 14:53:28 +00:00
deeplow
538df18709
Split isolation providers into their own .py files
Organize the code more clearly by giving each provider its own
Python file, rather than keeping them all in a single one.
2023-01-25 14:19:05 +00:00