Commit graph

12 commits

Author SHA1 Message Date
deeplow
f7190e3876
PDFunite: fix too many open files
In large (1200+) PDFs the PDFunite command would fail on some systems
(e.g. Qubes), because it would be called with 1024+ files, exceeding
the limit on open files (`ulimit -n`).

This solution splits the merging into batches, accumulating the results
in a single PDF and then merging it with the next batch.
2023-11-02 15:11:50 +00:00
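
The batching described in f7190e3876 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `acc-N.pdf` file naming, and the batch size are assumptions. It relies only on the `pdfunite` convention of taking the input files first and the output file last.

```python
import subprocess
from typing import List, Sequence, Tuple


def plan_batched_merge(
    pages: Sequence[str], batch_size: int, out_prefix: str = "acc"
) -> List[Tuple[List[str], str]]:
    """Return (inputs, output) merge steps, each with at most batch_size + 1 inputs.

    Hypothetical helper: every step merges the accumulated PDF from the
    previous step (if any) with the next batch of single-page PDFs.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    steps: List[Tuple[List[str], str]] = []
    accumulated = None
    for start in range(0, len(pages), batch_size):
        batch = list(pages[start:start + batch_size])
        inputs = ([accumulated] if accumulated else []) + batch
        # Write to a fresh file so that no step reads and writes the same path.
        accumulated = f"{out_prefix}-{len(steps)}.pdf"
        steps.append((inputs, accumulated))
    return steps


def run_plan(steps: List[Tuple[List[str], str]]) -> None:
    """Execute each step with pdfunite (inputs first, output last)."""
    for inputs, output in steps:
        subprocess.run(["pdfunite", *inputs, output], check=True)
```

With a batch size well below the open-file limit, no single `pdfunite` call sees more than `batch_size + 1` input files, and the last step's output is the final document.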
deeplow
3046cb7b8b
Process PDF->RGB in groups of 50 pages
PDFtoPPM was producing RGB files faster than they were getting consumed.
Since the RGB files were only getting removed after they were sent, this
was causing /tmp on the server to fill up.

This solution processes and sends images in chunks of 50 pages. It is
slightly inefficient, since it can't process and send data
simultaneously. That will be solved in a future commit.

Fixes #574
2023-11-02 15:11:39 +00:00
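
The chunking described in 3046cb7b8b can be illustrated with a short sketch. The helper names below are hypothetical and the real implementation may differ. The only external fact it relies on is that `pdftoppm` accepts `-f` and `-l` to select the first and last page to convert.

```python
from typing import Iterator, List, Tuple


def page_chunks(num_pages: int, chunk_size: int = 50) -> Iterator[Tuple[int, int]]:
    """Yield 1-based, inclusive (first, last) page ranges of at most chunk_size pages."""
    for first in range(1, num_pages + 1, chunk_size):
        yield first, min(first + chunk_size - 1, num_pages)


def pdftoppm_command(pdf_path: str, out_root: str, first: int, last: int) -> List[str]:
    """Build a pdftoppm invocation that converts only pages first..last."""
    return ["pdftoppm", "-f", str(first), "-l", str(last), pdf_path, out_root]
```

Converting one range, sending its pages, and deleting them before starting the next range caps the number of RGB files on disk at roughly one chunk. The cost is the one the commit message names: conversion and sending do not overlap.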
Alex Pyrgiotis
4bb959f220
conversion: Add anchor points for streaming page data/metadata
Introduce 4 new methods that can be overloaded by the Qubes isolation
provider to stream page data/metadata back to the caller. For the time
being, these methods do what they did before, i.e., write this info in
files within the pixels directory.
2023-09-28 22:50:53 +03:00
Alex Pyrgiotis
6012cd1491
Improve EOF detection when reading command output
Do not read a line from the command output and then check if
we are at EOF, because it's possible that the writer immediately exited
after writing the last line of output. Instead, switch the order of
actions.

This is a very serious bug that can lead to Dangerzone excluding the
last page of the document. It should have bitten us right from the start
(see aeeed411a0), but it seems that the small period of time it takes
the kernel to close the file descriptors was hiding this bug.

Fixes #560
2023-09-28 22:50:53 +03:00
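
The ordering problem described in 6012cd1491 can be reproduced in isolation. The sketch below assumes the output is read through an `asyncio.StreamReader`. That is an assumption made for illustration, as are the function names. The first reader checks for EOF after reading a line, the second checks before.

```python
import asyncio
from typing import List


async def read_lines_check_after(reader: asyncio.StreamReader) -> List[bytes]:
    """Problematic order: read a line, then check for EOF."""
    lines = []
    while True:
        line = await reader.readline()
        if reader.at_eof():
            # If the writer already exited, the line just read is discarded.
            break
        lines.append(line)
    return lines


async def read_lines_check_before(reader: asyncio.StreamReader) -> List[bytes]:
    """Switched order: check for EOF, then read a line."""
    lines = []
    while not reader.at_eof():
        line = await reader.readline()
        if line:
            lines.append(line)
    return lines


async def demo() -> List[List[bytes]]:
    """Feed both readers a stream whose writer closed right after the last line."""
    results = []
    for read in (read_lines_check_after, read_lines_check_before):
        reader = asyncio.StreamReader()
        reader.feed_data(b"page 1\npage 2\n")
        reader.feed_eof()
        results.append(await read(reader))
    return results
```

Running `asyncio.run(demo())` shows the first variant dropping the final line when EOF has already been signalled, while the second returns every line.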
Alex Pyrgiotis
c547ffc3b4
conversion: Factor out calculate_timeout
Factor out the logic behind the calculate_timeout() method, used in
Dangerzone conversions, so that isolation providers can call it
directly.
2023-09-20 17:14:24 +03:00
deeplow
eb16285790
Replace container output command prefix ">>>"
In the junitxml this prefix would look ugly ("&gt;&gt;&gt;") because any
angle brackets that are not XML tags have to be escaped.
2023-08-22 16:11:35 +01:00
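
The escaping mentioned in eb16285790 is easy to see with the standard library. This is only an illustration of what an XML writer does to the prefix; the sample message is made up.

```python
from xml.sax.saxutils import escape

# A made-up log line carrying the old ">>>" prefix.
raw = ">>> converting page 1"

# XML character data cannot safely carry raw angle brackets, so ">" becomes "&gt;".
escaped = escape(raw)
print(escaped)
```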
deeplow
48b2e7bc3c
Log command to debug log for traceback purposes
Log commands so we can trace back which errors / outputs are from each
command.
2023-08-22 16:11:34 +01:00
deeplow
874b8865e2
Qubes: strategy for capturing conversion logs
Use qrexec stdout to send conversion data (pixels) and stderr to send
conversion progress at the end of the conversion. This happens
regardless of whether the conversion runs in developer mode.

It's the client that decides if it reads the debug data from stderr or
not. In this case, it only reads it if developer mode is enabled.
2023-08-22 16:11:20 +01:00
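
The stdout/stderr split described in 874b8865e2 can be mimicked with an ordinary subprocess. In this sketch, a child Python process stands in for the qrexec call, and the function name and payloads are hypothetical. The child always writes both streams, and the client decides whether to read stderr.

```python
import subprocess
import sys
from typing import Optional, Tuple

# Stand-in for the remote side: data on stdout, log output on stderr.
CHILD = r"""
import sys
sys.stdout.buffer.write(b"\x00\x01\x02")
sys.stderr.write("progress: page 1/1\n")
"""


def run_conversion(developer_mode: bool) -> Tuple[bytes, Optional[str]]:
    """Return (pixel data, debug log); the log is None unless developer_mode is set."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        # The child writes to stderr either way; only the client's choice differs.
        stderr=subprocess.PIPE if developer_mode else subprocess.DEVNULL,
        check=True,
    )
    debug_log = proc.stderr.decode() if developer_mode else None
    return proc.stdout, debug_log
```

Keeping the two streams separate means the pixel data never has to be parsed apart from log text, whichever mode the client runs in.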
deeplow
1ab14dbd86
Use containers in Qubes until Beta
Reverse the logic in Qubes to run in containers by default and only
perform the conversion with VMs when explicitly set by the env var
QUBES_CONVERSION=1. This will avoid surprises when someone installs
Dangerzone on Qubes expecting it to work out of the box, just like on
any other Linux distribution.

Fixes #451
2023-07-26 14:02:06 +01:00
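
The opt-in behaviour described in 1ab14dbd86 amounts to a small decision function. The sketch below is an assumption about its shape: the function name and return values are hypothetical, and only the `QUBES_CONVERSION=1` variable comes from the commit message.

```python
import os
from typing import Mapping, Optional


def choose_isolation(running_on_qubes: bool, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return "vm" only on Qubes with QUBES_CONVERSION=1; otherwise "container"."""
    env = os.environ if environ is None else environ
    # Containers are the default, even on Qubes; VMs must be requested explicitly.
    if running_on_qubes and env.get("QUBES_CONVERSION") == "1":
        return "vm"
    return "container"
```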
deeplow
ef41cab76e
Add progress reports on Qubes (GUI)
Fixes #429
2023-07-13 12:57:23 +01:00
deeplow
9410da762c
Check if conversion code runs on Qubes
Add a way to check if the code runs (or should run) on Qubes.

Refs #451
2023-06-21 11:44:58 +03:00
deeplow
814d533c3b
Restructure container code
The name `container/` no longer makes sense for these files, since
the "document to pixels" part will run in Qubes OS in its own virtual
machine.

To adapt to this, this PR does the following:
- Moves all the files in `container` to `dangerzone/conversion`
- Splits the old `container/dangerzone.py` into its two components
  `dangerzone/conversion/{doc_to_pixels,pixels_to_pdf}.py` with a
  `common.py` file for shared functions
- Moves the Dockerfile to the project root and adapts it to the new
  container code location
- Updates the CircleCI config to properly cache Docker images.
- Updates our install scripts to properly build Docker images.
- Adds the new conversion module to the container image, so that it can
  be imported as a package.
- Adapts the container isolation provider to use the new way of calling
  the code.

NOTE: We have made zero changes to the conversion code in this commit,
except for necessary imports in order to factor out some common parts.
Any changes necessary for Qubes integration follow in the subsequent
commits.
2023-06-21 11:44:47 +03:00