Commit graph

769 commits

Author SHA1 Message Date
Alex Pyrgiotis
bb0de52b01
Bump version to 0.4.1-rc3 2023-04-03 19:36:48 +03:00
Alex Pyrgiotis
7fe01d6470
install/windows: Remove -rc identifiers from version
Remove any -rc identifiers (e.g., 0.4.1-rc3) from the Dangerzone
version, if it includes them. If we don't remove them, then building
the MSI for Windows will fail as follows:

    error CNDL0108: The Product/@Version attribute's value, '0.4.1-rc3',
    is not a valid version. Legal version values should look like
    'x.x.x.x' where x is an integer from 0 to 65534.
2023-04-03 19:35:19 +03:00
Alex Pyrgiotis
6c7c0b615f
dev_scripts: Add missing packages in Dangerzone envs
Install the following packages in Dangerzone envs:

* python3-setuptools: We've seen that this package is necessary to build
  the RPM package for Dangerzone. The error that we encountered was the
  following:

      * Deleting old build and dist
      * Building RPM package
      Traceback (most recent call last):
        File "/home/user/dangerzone/setup.py", line 5, in <module>
          import setuptools
      ModuleNotFoundError: No module named 'setuptools'
      Traceback (most recent call last):
        File "/home/user/./dangerzone/install/linux/build-rpm.py", line 43, in <module>
          main()
        File "/home/user/./dangerzone/install/linux/build-rpm.py", line 30, in main
          subprocess.run(
        File "/usr/lib64/python3.11/subprocess.py", line 571, in run
          raise CalledProcessError(retcode, process.args,
      subprocess.CalledProcessError: Command 'python3 setup.py bdist_rpm --requires='podman,python3-pyside2,python3-appdirs,python3-click,python3-pyxdg,python3-colorama'' returned non-zero exit status 1.

* fuse-overlayfs: In Ubuntu 22.10 (at least), we encountered the
  following error when running Podman:

      ERRO[0000] User-selected graph driver "overlay" overwritten by
      graph driver "vfs" from database - delete libpod local files to
      resolve

  The `vfs` driver is much slower than the `overlayfs` storage driver,
  so we need to fix this. The reason why we encounter this error is
  explained in the Podman docs [1]:

      [...] and is vfs for non-root users when fuse-overlayfs is not
      available.

  Normally, the `fuse-overlayfs` package would have been installed, but
  we don't install it due to the `--no-install-recommends` flag, so we
  install it manually.

[1]: https://docs.podman.io/en/latest/markdown/podman.1.html#storage-driver-value
2023-04-03 18:58:56 +03:00
Alex Pyrgiotis
33f81f5064
tests: Add sample files for extra MIME types
In PR #378 ("container: Allow converting more document formats"), we
added support for the following MIME types:

* application/zip
* application/octet-stream
* application/x-ole-storage
* application/vnd.oasis.opendocument.spreadsheet-template
* application/vnd.oasis.opendocument.text-template

However, we forgot to add some tests for these MIME types in the repo.
In this commit, we add a file for each of these MIME types, to make sure
we have no regressions in the future.
2023-04-03 18:58:56 +03:00
Alex Pyrgiotis
58a8241844
container: Run LibreOffice in safe mode
The main use of safe mode [1] in LibreOffice is to run with a fresh user
profile, in case the default one got borked somehow. This is actually
not a concern of ours, since the user's profile is in the container and
is not persistent.

The main reason we want to preemptively run LibreOffice in safe mode is
to remove hardware acceleration capabilities. Whether hardware
acceleration actually works in a container is another question, but we
want to be extra sure.

[1]: https://help.libreoffice.org/latest/en-US/text/shared/01/profile_safe_mode.html
2023-03-28 14:47:07 +03:00
Alex Pyrgiotis
a1c87a207a
container: Allow converting more document formats
Remove the association between MIME types and export filters, because
LibreOffice is able to auto-detect them on its own. Instead, ask
LibreOffice to simply convert the document to a .pdf.

This association was cumbersome for yet another reason; there are MIME
types that may be associated with more than one file type. That's why
it's better to let LibreOffice decide the proper filter for the
conversion.

Our current understanding is that this change won't widen our attack
surface for the following reasons:

* The output filters for PDF documents are pretty specific, and we don't
  affect the input filters somehow.
* The default behavior of LibreOffice on Alpine Linux is to disable
  macros.

Closes #369
2023-03-28 14:46:47 +03:00
Alex Pyrgiotis
8b846820d2
Update typing hints for Mypy 1.1.1
Due to a bump in our Python dependencies, we now install Mypy 1.1.1
instead of 0.982. This change triggered the following errors:

* Incompatible default for argument <a> (default has type
  None, argument has type <t>):

  Mypy further explains here that PEP 484 prohibits implicit Optional,
  so we need to make these types explicit Optional.

* Unused "type: ignore" comment, use narrower [method-assign] instead of
  [assignment]:

  Mypy has specialized some of its lints, meaning that we should switch
  to the newer variants.

Also, it detected several other small inconsistencies. We fix all of
these errors in this commit.
2023-03-27 15:19:43 +03:00
Alex Pyrgiotis
1f308e9cc5
Reformat code with Black 23
Due to a bump in our Python dependencies, we now install Black 23
instead of 22, which detects some of our files as badly formatted.
2023-03-27 15:17:23 +03:00
Alex Pyrgiotis
b102b2bd49
Update Poetry lock file
Run `poetry lock` and allow updating the existing dependencies. This
fixes a CI regression that was introduced by Poetry 1.4.1, which added
stricter Python wheels validation

Fixes #376
2023-03-27 15:15:26 +03:00
Alex Pyrgiotis
7613941e1f
ci: Do not deploy to PackageCloud
Pave the way for deploying .deb and .rpm packages to
packages.freedom.press. Remove the code that deploys to PackageCloud
once we tag a commit with `v<semver>`.

Refs #291
2023-03-27 13:41:08 +03:00
Alex Pyrgiotis
8a7d52b471
Update Changelog for 0.4.1 2023-03-27 12:32:36 +03:00
deeplow
bc50917362
Sort OCR languages when loading them from json
Because now the ocr-languages.json is sorted by tesseract language arg
name, we'll want to sort the languages the user sees alphabetically.
2023-03-16 14:23:31 +00:00
deeplow
58332fdd6e
tesseract: add new lanaguages and others
Tagalo was replaced with filipino [1] in newer tesseract versions, so it
doesn't make sense for us to use the new name and map it to the old
"tgl" name (Tagalo) under the hood.

Language names obtained from tesseract's man page [2].

[1]: 58f7a72f00
[2]: https://github.com/tesseract-ocr/tesseract/blob/main/doc/tesseract.1.asc
2023-03-16 14:23:30 +00:00
deeplow
d8d83ff036
Remove languages not supported
When the ocr languages list was originally introduced (commit b527776),
the container was running in a ubuntu 18.04 [1]. Later it changed to
alpine linux. Unfortunately it has less languages than in ubuntu.

This commit removes those languages. Fixes #355

[1]: b527776e28 (diff-ec032b25a6c2af24eaf4128c85090c5ce0dcbab64e64eace10be9f4e4683a71bR1)
2023-03-16 14:23:28 +00:00
deeplow
66d3c40163
Sort OCR languages by tesseract arg name
Make it easier to compare the list of languages with the output of
`tesseract --list-langs`.
2023-03-16 14:23:25 +00:00
Alex Pyrgiotis
d768099912
Grab just the image ID
When building the image, grab the image id using `-q`, which removes all
the decorations in the output and just keeps the image ID.
2023-03-09 19:04:59 +02:00
Alex Pyrgiotis
a33dcfbb51
Replace First Look Media references
Update several references to First Look Media in the code, to better
reflect the current status, where Freedom of the Press Foundation has
taken over the stewardship of the project.

Fixes #343
2023-03-08 18:40:55 +02:00
Alex Pyrgiotis
330766665d
Update instructions in qa.py 2023-03-08 17:56:25 +02:00
Alex Pyrgiotis
b74258b6d2
Remove stale QA requirement
Remove a stale QA requirement for running the tests manually in the rest
of our Linux distros. Our CI jobs take care of that, so we don't need to
do it.
2023-03-08 17:40:26 +02:00
Alex Pyrgiotis
4668443be6
install: Use the full image tag
Use the full image tag (dangerzone.rocks/dangerzone:latest) when
building the image. Else, we risk creating a `share/image-id.txt` file
with multiple IDs in it, if we have another
`dangerzone.rocks/dangerzone` image (with a different tag) in our dev
environment.
2023-03-08 17:40:26 +02:00
Alex Pyrgiotis
c719fc4f54
Update our MacOS QA instructions
Update our QA instructions for ARM-based MacOS systems. The main change
in 0.4.1 is that we can build an ARM container image for Dangerzone,
which is different from Intel Macs. So, we need to build and test it
during release.
2023-03-08 17:40:26 +02:00
Alex Pyrgiotis
5a0c4d0a03
Bump timeouts
Perform the following timeout bumps:

1. Increase the minimum timeout per page/MiB by x3. The rationale is that
   10 seconds is a reasonable timeout, but to be on the safe side, it's
   best if we multiply it by a safety factor.
2. Increase the minimum timeout from 10 seconds to 60 seconds. 10
   seconds may be too little if the application runtime (e.g.,
   LibreOffice) is slow to start due to background CPU thrashing.
2023-03-08 17:38:59 +02:00
Alex Pyrgiotis
a2049349b1
ci: Add missing CI tests for Ubuntu Focal / Debian Bullseye 2023-03-08 17:36:42 +02:00
Alex Pyrgiotis
b32f215c7c
dev_scripts: Handle alt name for Ubuntu Focal 2023-03-08 17:36:42 +02:00
Alex Pyrgiotis
aaecfdb63e
dev_scripts: Immitate mkdir -p when creating state dirs
The first time we run the env.py script, we may not have the necessary
dirs under envs. It's best to create them with `parents=True`.
2023-03-08 17:36:42 +02:00
Alex Pyrgiotis
96d8cdef94
Suggest users to install Poetry via pipx
Replace the command to install Poetry globally via `pip` in our build
instructions, with a command that installs Poetry under ~/.local/bin
via `pipx`. The rationale is the same as in the previous commit, i.e.,
PEP 668 does not allow it.

Note that in this case, we don't have any CI restrictions, so we could
use the official installer instead. However, for security reasons, we
prefer suggesting `pipx` to the users, and of course give them a list of
alternatives.

Note that for Windows and MacOS we leave the command as is, until we
figure out how PEP 668 applies in there.
2023-03-08 17:36:42 +02:00
Alex Pyrgiotis
7310977343
dev_scripts: Install Poetry via pipx
We can no longer install Poetry via `pip`, since Debian Bookworm now
enforces PEP 668, meaning that both `pip install poetry` and `pip
install --user poetry` cannot work [1]. Since we use the same
installation steps for all of our dev environments, we need to find a
common way to install Poetry.

Poetry's website provides several ways to install Poetry [2]. Moreover,
it also has a special section with CI recommendations [3]. In this
section, it strongly suggests to install Poetry via `pipx`, instead of
the installer script that you download from the Internet.

Follow Poetry's suggestion to install it via `pipx` in CI environments,
with one minor change. Do not use `pipx ensurepath`, as that will
affect the `.bashrc` of the dev environment, which at some point in the
future may be mounted by the dev. Instead, set a PATH environment
variable that includes `~/.local/bin`.

[1]: https://github.com/freedomofpress/dangerzone/issues/351
[2]: https://python-poetry.org/docs/#installation
[3]: https://python-poetry.org/docs/#ci-recommendations

Fixes #351
2023-03-08 17:36:42 +02:00
Alex Pyrgiotis
7979dbd653
ci: Install Poetry via APT on Debian Bookworm
We no longer need to install Poetry via PyPI, since the upstream Debian
issues have been fixed. Moreover, PEP 668 [1] is now enforced in Debian
Bookworm, so we can't install Poetry globally via `pip` in any case.

For these reasons, prefer installing Poetry via APT.

[1]: https://peps.python.org/pep-0668/

Refs #351
2023-03-08 17:23:06 +02:00
deeplow
e840c7a18c
Fix "Choose..." dialog not opening on Qt6
When clicking on the "Choose..." button nothing would happen visually
and it would show the error:

  Traceback (most recent call last):
    File "/home/user/dangerzone/dangerzone/gui/main_window.py", line 614, in select_output_directory
      dialog.setFileMode(QtWidgets.QFileDialog.DirectoryOnly)

According to the PySide docs, QFileDialog.DirectoryOnly has been
deprecated in Qt4.6 [1]. This was not an issue probably on PySide2
because it must have used an earlier Qt version.

Fixes #360

[1]: https://doc.qt.io/qtforpython-5/PySide2/QtWidgets/QFileDialog.html#PySide2.QtWidgets.PySide2.QtWidgets.QFileDialog.FileMode
2023-03-01 12:49:46 +00:00
Alex Pyrgiotis
56c5d77afd
Build Windows MSI/.exe in GitHub actions
Update our GitHub actions manifest to also build a dummy Windows MSI
installer for Dangerzone, so that we don't find out issues during
release.
2023-02-23 09:12:06 +00:00
deeplow
f307e03215
Windows build: link to adding Wix to PATH 2023-02-23 09:12:04 +00:00
deeplow
fb85421db8
Fix Windows build for PySide6 (illegal file names)
Building the `.msi` on Windows was failing in the `candle.exe` step due
to some files in the PySide6 library being too long (PySide6/examples)
or having illegal character (`+`) in their file names
(PySide6/qml/QtQuick).

Skipping copying these files to the `.msi` fixes the issue. Skipping
`examples/` should be of no impact since they're just examples and
skipping `qml/QtQuick` shouldn't cause issues because we don't use QML.

Reverts commit `bbbf822` and adapts it from PySide2 to PySide6.
2023-02-23 09:12:02 +00:00
deeplow
541fe7f382
Container: ignore non-progress pdftoppm output
pdftoppm raises Syntax issues and Errors on a variety of documents.
But it still produces usable results despite the failures. From the
user's perspective it's best to have a document even if imperfect than
having none at all. For this reason, we ignore non-relevant output.
2023-02-21 19:05:21 +00:00
deeplow
dbd0450542
Add poppler-data package due to missing fonts
Some documents were reporting the following error when running them
over pdftoppm:

    Syntax Error: Missing language pack for 'Adobe-Japan1' mapping

This did not necessarily make the document fail but it could be
that some fonts were not properly rendered due to the missing package.
2023-02-21 18:39:14 +00:00
Alex Pyrgiotis
9bf65bc829
dev_scripts: Add extra distros in QA script
Add some distros in the QA script that were missing from the list of our
supported ones.
2023-02-21 20:20:04 +02:00
Alex Pyrgiotis
ce86c1b126
dev_scripts: Enable building envs on Ubuntu Focal
Enable installing Podman in Ubuntu Focal, by re-using the instructions
we have in our installation section. This enables us building a dev
environment for Ubuntu Focal, which we couldn't previously.
2023-02-21 20:20:04 +02:00
Alex Pyrgiotis
5100e15213
Add missing build dependencies for Ubuntu Focal
Add some missing build dependencies that we encountered for Ubuntu
Focal, but they apply to the rest of the Debian-based distros as well.
2023-02-21 20:20:03 +02:00
Alex Pyrgiotis
79ccd14d5d
Fix PySide2 issue for Ubuntu Focal
Provide a fallback for QRegularExpressionValidator specifically for
Ubuntu Focal, because it's not present in PySide2 5.14. Instead,
fallback to QRegExpValidator if it doesn't exist.

Fixes #339
2023-02-21 20:17:05 +02:00
Alex Pyrgiotis
b94d0712c8
Minor corrections in test code 2023-02-17 01:15:08 +02:00
Alex Pyrgiotis
2042591964
container: Copy files before mounting them
Copy input files in a temporary dir before mounting them, thereby
changing their permissions, without affecting the original files. This
way, we can avoid cases where a file is accessible to the user only due
to a supplemental user group, which does not work for containers.

Fixes #157
Fixes #260
Fixes #335
2023-02-17 01:15:08 +02:00
Alex Pyrgiotis
ea73f5d820
container: Take SELinux labels into account
Take SELinux labels into account when mounting a file to the Dangerzone
container. Use the `:Z` flag (which is a no-op in non-SELinux systems)
to clear the existing SELinux label for a file, and apply one that
matches the container's.

Refs #335
2023-02-17 01:15:08 +02:00
Alex Pyrgiotis
d733890ca0
container: Do not leave stale temporary dirs
Do not leave stale temporary directories when conversion fails
unexpectedly. Instead, wrap the conversion operation in a context
manager that wipes the temporary dir afterwards.

Fixes #317
2023-02-17 01:15:08 +02:00
Alex Pyrgiotis
18bc77332d
tests: Run each test in separate config/cache dirs
Run each CLI command in a separate config/cache dir, to avoid leaks
between tests. Moreover, this way we are able to check the contents of
the config/cache dirs for a single CLI run.
2023-02-17 01:15:07 +02:00
Alex Pyrgiotis
44c324f9ac
Separate config dirs from temp dirs
Do not store temporary directories in the Dangerzone's config directory.
There are two reasons for that:

1. They are ephemeral, and they need a temporary place to be stored,
   preferably RAM-backed.
2. We need to set them while running our CI tests.
2023-02-17 01:06:44 +02:00
deeplow
9b3d98b20b
Build arm64 docker image for arm-based Macs
Remove --patform args completely so that by default we build natively
on each platform.

Partial fix for #50
2023-02-16 10:59:00 +00:00
Alex Pyrgiotis
93a06d72f0
Allow users to disable timeouts
Allow users to disable timeouts via the CLI, with the
`--disable-timeouts` argument. By default, the timeouts are always
enabled.

This option applies both to the CLI version of Dangerzone, and the GUI
one. For the latter, the user must start the GUI from their CLI (i.e.,
`dangerzone --disable-timeouts ...`)
2023-02-15 23:48:36 +02:00
Alex Pyrgiotis
f2a4f29cff
container: Introduce proportional timeouts
Introduce proportional timeouts in the container code, where the
conversion logic runs.

Previously, we had a single timeout for each command (120 seconds),
which didn't scale well either with the number of pages in a document,
or with the size of the document.

In this commit, we look into each operation, and we're trying to figure
out the following:

1. What's the number of pages we will operate on?
2. How large is the document?

Knowing the above, we can break down a command into multiple operations,
at least conceptually. Having a number of operations and a sane timeout
value per operation (10 seconds), we can multiply those and reach to a
timeout that fits the command better.

Fixes #306
Fixes #314
Refs #327
2023-02-15 23:46:53 +02:00
Maeve Andrews
c26326450b
Add a --distro option to build-deb.py
Add an optional --distro argument to build-deb.py, to specify the Debian
version in the package name, which currently is "1". This option may
prove useful when publishing packages to freedomofpress/apt-tools-prod,
where packages from different distros with the same names but different
contents are not accepted.
2023-02-14 15:49:51 +02:00
deeplow
b49d6de6bd
Sample PDFs: rename to include file format in name
Make it so all samples when converted don't map to the same file. This
makes it easier to manually inspect files.
2023-02-09 09:02:33 +00:00
deeplow
275df80484
GUI: exit with 1 when some conversion failed
Fixes: #318
2023-02-08 17:24:55 +00:00