dangerzone

mirror of https://github.com/freedomofpress/dangerzone.git synced 2025-04-28 18:02:38 +02:00

Author	SHA1	Message	Date
Alex Pyrgiotis	4ea0650f42	tests: Skip a test for missing OCR files on Qubes. We have a container-specific test that deals with missing OCR files in the container image. This test _can_ be run under Qubes, and it may fail since it requires Podman. Make the pytest guard stricter and don't allow running this test on Qubes. Also, fix a typo in the word "omission".	2024-06-27 22:11:50 +03:00
Etienne Perot	f03bc71855	Sandbox all Dangerzone document processing within gVisor. This wraps the existing container image inside a gVisor-based sandbox. gVisor is an open-source OCI-compliant container runtime. It is a userspace reimplementation of the Linux kernel in a memory-safe language. It works by creating a sandboxed environment in which regular Linux applications run, but their system calls are intercepted by gVisor. gVisor then redirects these system calls and reinterprets them in its own kernel. This means the host Linux kernel is isolated from the sandboxed application, thereby providing protection against Linux container escape attacks. It also uses `seccomp-bpf` to provide a secondary layer of defense against container escapes. Even if its userspace kernel gets compromised, attackers would have to additionally have a Linux container escape vector, and that exploit would have to fit within the restricted `seccomp-bpf` rules that gVisor adds on itself. Fixes #126 Fixes #224 Fixes #225 Fixes #228	2024-06-12 13:40:04 +03:00
deeplow	69c2a02d81	Remove timeouts. Remove timeouts for several reasons: 1. Lost purpose: after implementing the containers page streaming, the only subprocess we have left is LibreOffice, so we no longer have such a big risk of commands hanging (the original reason for timeouts). 2. Little benefit: predicting execution time is a generally unsolvable computer science problem. Ultimately we were guessing an arbitrary time based on the number of pages and the document size. As a guess we made it pretty lax (30s per page or MB). A document hanging for this long will probably lead to user frustration in any case, and the user may be compelled to abort the conversion. 3. Technical challenges with non-blocking timeouts: there have been several technical challenges in keeping timeouts that we've made an effort to accommodate. A significant one was having to do non-blocking reads to ensure we could time out when reading the conversion stream (and then used here). Fixes #687	2024-02-06 20:11:43 +00:00
deeplow	f676891482	Remove Dockerfile dependencies replaced by PyMuPDF. PyMuPDF replaced the need for almost all dependencies, which this commit now removes. We are also removing tesseract-ocr as a dependency since (to our surprise) PyMuPDF ships directly with tesseract binaries [1]. However, now that tesseract-ocr is not available directly as a binary tool, `test_ocr.py` needed to be changed. Fixes #658 [1]: https://github.com/freedomofpress/dangerzone/issues/658#issuecomment-1861033149	2024-01-03 12:58:36 +00:00
Alex Pyrgiotis	641aa131c9	ci: Add test for OCR languages. Test that the languages that we provide to users for OCR match the languages that are installed in the container image. Fixes #417	2023-05-24 13:43:29 +03:00
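
The check described in commit 641aa131c9 (offered OCR languages must match the ones installed in the container image) can be sketched roughly as a set comparison. This is only an illustration: the function name `compare_ocr_languages` and the sample language data are hypothetical and are not taken from the Dangerzone code base.

```python
def compare_ocr_languages(offered, installed):
    """Compare two collections of OCR language codes.

    Returns a tuple (missing, unused) of sorted lists:
    - missing: codes offered to users but not installed
    - unused: codes installed but not offered to users
    """
    offered_set = set(offered)
    installed_set = set(installed)
    missing = sorted(offered_set - installed_set)
    unused = sorted(installed_set - offered_set)
    return missing, unused


# Hypothetical sample data, for illustration only.
offered_languages = {"English": "eng", "French": "fra", "Greek": "ell"}
installed_languages = ["eng", "fra", "osd"]

missing, unused = compare_ocr_languages(
    offered_languages.values(), installed_languages
)
print(missing)  # ['ell']
print(unused)   # ['osd']
```

A CI test built on this idea would fail whenever either list is non-empty, depending on how strict the match is meant to be.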

5 commits
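
To make the "30s per page or MB" guess from commit 69c2a02d81 concrete, here is a minimal sketch of such a heuristic. The commit message does not say how the two quantities were combined, so taking the larger of the two is an assumption, and the names `estimate_timeout` and `SECONDS_PER_UNIT` are hypothetical.

```python
import math

# "30s per page or MB", as quoted in the commit message.
SECONDS_PER_UNIT = 30


def estimate_timeout(num_pages, size_bytes):
    """Guess a conversion timeout in seconds.

    Assumption: the timeout scales with whichever is larger, the page
    count or the document size in MiB (rounded up), with a floor of
    one unit so the result is never zero.
    """
    size_mb = math.ceil(size_bytes / (1024 * 1024))
    return SECONDS_PER_UNIT * max(num_pages, size_mb, 1)


# A 10-page, 2 MiB document: the page count dominates.
print(estimate_timeout(10, 2 * 1024 * 1024))  # 300
```

Even in this simple form, the result is a guess that is unrelated to how long a given document really takes to convert, which is the "little benefit" argument made in the commit message.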