mirror of https://github.com/freedomofpress/dangerzone.git synced 2025-04-29 02:12:36 +02:00

Take potentially dangerous PDFs, office documents, or images and convert them to safe PDFs

Alex Pyrgiotis a0d6f0d719 container: Grab trained OCR models from GitHub	2023-05-23 16:27:40 +03:00

Grab Tesseract's trained models from GitHub, instead of from the Alpine Linux repos. Over the past few months, the models in the Alpine Linux repos did not remain stable, leading to CI issues.

Since the models are already pre-trained and available through Tesseract's repo on GitHub, we can use the release tarball that they offer to install them in the container image, which is basically what the upstream packages are doing as well.

In order to make sure that we have no regressions, at the time of this commit we ensured that the hashes of the models offered through the Alpine Linux repos and the models offered from the GitHub release are the same. Also, in order to detect future regressions or foul play, we check the downloaded models against a known checksum. Given that these models change every few years, updating the checksum should not be an issue.

Fix #357
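
The checksum check described in this commit can be illustrated with a short sketch. This is not the code from the commit: the choice of SHA-256, the function names, and the file path are all assumptions made for illustration. It only shows the general idea of comparing a downloaded file against a pinned digest and failing loudly on a mismatch.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read fixed-size chunks so large tarballs are never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the file does not match the pinned checksum.

    `expected_sha256` is a hypothetical pinned value; SHA-256 itself is an
    assumption, since the commit message does not name the hash algorithm.
    """
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

A build step would call `verify_checksum()` right after the download and abort if it raises, so a changed or tampered tarball never reaches the container image.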
.circleci	Deprecate Fedora 36 support	2023-05-23 09:22:59 +01:00
.github/workflows	ci: Add security scanning	2023-05-17 20:29:13 +03:00
assets	Update README screenshots for 0.4.0 release	2022-12-02 11:26:21 +00:00
container	container: Grab trained OCR models from GitHub	2023-05-23 16:27:40 +03:00
dangerzone	fix gui typo	2023-05-08 12:53:09 +01:00
dev_scripts	Deprecate Fedora 36 support	2023-05-23 09:22:59 +01:00
install	Appease linter	2023-04-24 11:50:58 +03:00
share	Bump version to 0.4.1	2023-04-18 23:01:00 +03:00
tests	tests: Add sample files for extra MIME types	2023-04-03 18:58:56 +03:00
.gitignore	migrate to pytest & test_docs -> tests/test_docs	2022-09-13 13:07:58 +01:00
.grype.yaml	ci: Ignore two CVEs from our security scans	2023-05-17 20:29:13 +03:00
BUILD.md	Fix typo	2023-05-17 08:52:34 +01:00
CHANGELOG.md	Deprecate Fedora 36 support	2023-05-23 09:22:59 +01:00
INSTALL.md	Update changelog for Fedora 38	2023-05-16 16:20:32 +03:00
LICENSE	Replace First Look Media references	2023-03-08 18:40:55 +02:00
Makefile	Make 'make test' use the Python interpreter	2023-01-25 16:36:31 +00:00
poetry.lock	Update Poetry lock file	2023-03-27 15:15:26 +03:00
pyproject.toml	Bump version to 0.4.1	2023-04-18 23:01:00 +03:00
README.md	Bump version to 0.4.1	2023-04-18 23:01:00 +03:00
RELEASE.md	Add support for Fedora 38 in the QA script	2023-05-16 16:20:32 +03:00
setup-windows.py	Windows: fix "Open with" dialog showing dz description	2023-01-16 11:38:08 +00:00
setup.py	Replace First Look Media references	2023-03-08 18:40:55 +02:00
stdeb.cfg	Replace First Look Media references	2023-03-08 18:40:55 +02:00

README.md

Dangerzone

Take potentially dangerous PDFs, office documents, or images and convert them to a safe PDF.

Dangerzone works like this: You give it a document that you don't know if you can trust (for example, an email attachment). Inside of a sandbox, Dangerzone converts the document to a PDF (if it isn't already one), and then converts the PDF into raw pixel data: a huge list of RGB color values for each page. Then, in a separate sandbox, Dangerzone takes this pixel data and converts it back into a PDF.
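
To make the "raw pixel data" step concrete, here is a minimal sketch of what the second sandbox's input could look like. It is not Dangerzone's implementation: the `PagePixels` type and both functions are hypothetical, and the output is a PPM image rather than a PDF, purely so the example needs only the standard library. The point it illustrates is that the receiving side gets nothing but dimensions and bytes, and can reject anything whose size does not match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PagePixels:
    """One page as raw pixels: 3 bytes (R, G, B) per pixel, row by row."""
    width: int
    height: int
    rgb: bytes


def parse_page(width: int, height: int, rgb: bytes) -> PagePixels:
    """Accept pixel data only if its size matches the declared dimensions."""
    if width <= 0 or height <= 0:
        raise ValueError("page dimensions must be positive")
    if len(rgb) != width * height * 3:
        raise ValueError("pixel buffer does not match page dimensions")
    return PagePixels(width, height, rgb)


def to_ppm(page: PagePixels) -> bytes:
    """Encode the page as a binary PPM (P6) image: a text header plus raw RGB."""
    header = f"P6\n{page.width} {page.height}\n255\n".encode("ascii")
    return header + page.rgb
```

Because the intermediate format is just a flat buffer of color values, there is no document structure left for the second stage to parse.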

Read more about Dangerzone on the official site.

Getting started

Download Dangerzone 0.4.1 for Mac (Apple Silicon CPU)
Download Dangerzone 0.4.1 for Mac (Intel CPU)
Download Dangerzone 0.4.1 for Windows
See installing Dangerzone for Linux repositories

You can also install Dangerzone for Mac using Homebrew: brew install --cask dangerzone

Some features

Sandboxes don't have network access, so if a malicious document can compromise one, it can't phone home
Dangerzone can optionally OCR the safe PDFs it creates, so they will have a text layer again
Dangerzone compresses the safe PDF to reduce file size
After converting, Dangerzone lets you open the safe PDF in the PDF viewer of your choice. This means you can make Dangerzone the default application for PDFs and office docs, so you never accidentally open a dangerous document
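
As a rough illustration of the first feature, the sketch below builds a container command line with networking disabled. `--network none` and `--rm` are real options of both `docker run` and `podman run`, but the function name, the image name, and the command are hypothetical, and this is not necessarily how Dangerzone itself invokes its containers.

```python
from typing import List, Sequence


def sandbox_command(runtime: str, image: str, command: Sequence[str]) -> List[str]:
    """Build an argv list that runs `command` in a container with no network.

    `image` and `command` are placeholders chosen by the caller.
    """
    if runtime not in ("docker", "podman"):
        raise ValueError(f"unsupported runtime: {runtime}")
    # "--network none" leaves the container with only a loopback interface,
    # so code running inside it cannot reach outside hosts.
    return [runtime, "run", "--rm", "--network", "none", image, *command]
```

The resulting list can be passed to `subprocess.run()` directly, which avoids shell quoting issues with untrusted filenames.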

Dangerzone can convert these types of document into safe PDFs:

PDF (.pdf)
Microsoft Word (.docx, .doc)
Microsoft Excel (.xlsx, .xls)
Microsoft PowerPoint (.pptx, .ppt)
ODF Text (.odt)
ODF Spreadsheet (.ods)
ODF Presentation (.odp)
ODF Graphics (.odg)
JPEG (.jpg, .jpeg)
GIF (.gif)
PNG (.png)
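
For scripting around this list, a simple extension check might look like the sketch below. The set mirrors the extensions listed above; the function name is hypothetical, and checking a filename suffix is only a convenience filter. It says nothing about the file's real contents and is an assumption about usage, not a description of how Dangerzone detects file types.

```python
from pathlib import Path

# Extensions copied from the list of supported document types above.
SUPPORTED_EXTENSIONS = frozenset({
    ".pdf",
    ".docx", ".doc",
    ".xlsx", ".xls",
    ".pptx", ".ppt",
    ".odt", ".ods", ".odp", ".odg",
    ".jpg", ".jpeg",
    ".gif",
    ".png",
})


def looks_convertible(filename: str) -> bool:
    """Return True if the filename's extension is in the supported list."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```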

Dangerzone was inspired by Qubes trusted PDF, but it works on non-Qubes operating systems. It uses containers as sandboxes instead of virtual machines (Docker on macOS, Windows, and Debian/Ubuntu, and Podman on Fedora).
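
The platform-to-runtime mapping in the previous paragraph can be written down as a small lookup. This is a sketch under stated assumptions, not Dangerzone's own detection logic: the function name is hypothetical, `system` is assumed to be a value like the one returned by `platform.system()`, and `distro_id` is assumed to be the `ID` field from `/etc/os-release`.

```python
def container_runtime(system: str, distro_id: str = "") -> str:
    """Pick a container runtime name for the given platform.

    system:    e.g. "Darwin", "Windows", or "Linux"
    distro_id: e.g. "fedora", "debian", or "ubuntu" (Linux only)
    """
    if system in ("Darwin", "Windows"):
        return "docker"
    if system == "Linux":
        if distro_id == "fedora":
            return "podman"
        if distro_id in ("debian", "ubuntu"):
            return "docker"
    # Platforms not named in the README are rejected rather than guessed.
    raise ValueError(f"no known runtime for {system} {distro_id}".strip())
```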

Set up a development environment by following these instructions.