Grab Tesseract's trained models from GitHub, instead of from the Alpine Linux repos.

Over the past few months, the models in the Alpine Linux repos did not remain stable, leading to CI issues. Since the models are already pre-trained and available through Tesseract's repo on GitHub, we can use the release tarball that they offer to install them in the container image, which is basically what the upstream packages are doing as well.

In order to make sure that we have no regressions, at the time of this commit we ensured that the hashes of the models offered through the Alpine Linux repos and the models offered from the GitHub release are the same. Also, in order to detect future regressions or foul play, we check the downloaded models against a known checksum. Given that these models change every few years, updating the checksum should not be an issue.

Fixes #357
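The commit note above mentions checking the downloaded models against a known checksum. A minimal sketch of that kind of check follows. It assumes SHA-256 (the note does not name the hash algorithm), and `verify_checksum` is a hypothetical helper, not a function from the Dangerzone code base.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_checksum(data: bytes, expected_hex: str) -> None:
    """Raise ValueError if data does not hash to the pinned checksum.

    Assumption: the pinned value is a SHA-256 hex digest.
    """
    actual = sha256_hex(data)
    # compare_digest avoids leaking where the two strings first differ.
    if not hmac.compare_digest(actual, expected_hex.strip().lower()):
        raise ValueError(
            f"checksum mismatch: expected {expected_hex}, got {actual}"
        )
```

Pinning the digest in the repository means that a changed or tampered tarball fails the build loudly instead of silently altering the container image.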
Dangerzone
Take potentially dangerous PDFs, office documents, or images and convert them to a safe PDF.
Dangerzone works like this: You give it a document that you don't know if you can trust (for example, an email attachment). Inside of a sandbox, Dangerzone converts the document to a PDF (if it isn't already one), and then converts the PDF into raw pixel data: a huge list of RGB color values for each page. Then, in a separate sandbox, Dangerzone takes this pixel data and converts it back into a PDF.
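To make the second half of that pipeline concrete, here is a minimal sketch of turning a flat buffer of RGB values back into a single-page PDF using only the Python standard library. It is an illustration under stated assumptions, not Dangerzone's actual conversion code: the function name `pixels_to_pdf` is hypothetical, and one pixel is mapped to one PDF point for simplicity.

```python
import zlib


def pixels_to_pdf(width: int, height: int, rgb: bytes) -> bytes:
    """Build a one-page PDF whose only content is a raw RGB image."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # The pixel buffer carries no structure: exactly 3 bytes per pixel.
    if len(rgb) != width * height * 3:
        raise ValueError("pixel buffer size does not match dimensions")

    image = zlib.compress(rgb)  # stored with the FlateDecode filter
    # Scale the unit-square image to the full page and draw it.
    content = f"q {width} 0 0 {height} 0 0 cm /Im0 Do Q".encode("ascii")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        (f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {width} {height}] "
         f"/Resources << /XObject << /Im0 4 0 R >> >> "
         f"/Contents 5 0 R >>").encode("ascii"),
        (f"<< /Type /XObject /Subtype /Image /Width {width} "
         f"/Height {height} /ColorSpace /DeviceRGB /BitsPerComponent 8 "
         f"/Filter /FlateDecode /Length {len(image)} >>\nstream\n"
         ).encode("ascii") + image + b"\nendstream",
        f"<< /Length {len(content)} >>\nstream\n".encode("ascii")
        + content + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset for the xref table
        out += f"{number} 0 obj\n".encode("ascii") + body + b"\nendobj\n"

    xref_start = len(out)
    out += f"xref\n0 {len(objects) + 1}\n".encode("ascii")
    out += b"0000000000 65535 f \n"  # each xref entry is exactly 20 bytes
    for offset in offsets:
        out += f"{offset:010d} 00000 n \n".encode("ascii")
    out += (f"trailer\n<< /Size {len(objects) + 1} /Root 1 0 R >>\n"
            f"startxref\n{xref_start}\n%%EOF\n").encode("ascii")
    return bytes(out)
```

The length check is the point of the design: the rebuilt PDF is assembled only from dimensions and color values, so nothing from the original file format can carry over into it.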
Read more about Dangerzone on the official site.
Getting started
- Download Dangerzone 0.4.1 for Mac (Apple Silicon CPU)
- Download Dangerzone 0.4.1 for Mac (Intel CPU)
- Download Dangerzone 0.4.1 for Windows
- See installing Dangerzone for Linux repositories
You can also install Dangerzone for Mac using Homebrew: `brew install --cask dangerzone`
Some features
- Sandboxes don't have network access, so if a malicious document can compromise one, it can't phone home
- Dangerzone can optionally OCR the safe PDFs it creates, so they have a text layer again
- Dangerzone compresses the safe PDF to reduce file size
- After converting, Dangerzone lets you open the safe PDF in the PDF viewer of your choice, so you can make Dangerzone the default application for PDFs and office docs and never accidentally open a dangerous document
Dangerzone can convert these types of document into safe PDFs:
- PDF (`.pdf`)
- Microsoft Word (`.docx`, `.doc`)
- Microsoft Excel (`.xlsx`, `.xls`)
- Microsoft PowerPoint (`.pptx`, `.ppt`)
- ODF Text (`.odt`)
- ODF Spreadsheet (`.ods`)
- ODF Presentation (`.odp`)
- ODF Graphics (`.odg`)
- JPEG (`.jpg`, `.jpeg`)
- GIF (`.gif`)
- PNG (`.png`)
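As a rough illustration, the list above can be expressed as a simple extension lookup. This is a sketch only: `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names, and the real application may identify documents differently (for example by inspecting file contents rather than names).

```python
from pathlib import Path

# Extensions taken from the list of supported document types above.
SUPPORTED_EXTENSIONS = frozenset({
    ".pdf",
    ".docx", ".doc",
    ".xlsx", ".xls",
    ".pptx", ".ppt",
    ".odt", ".ods", ".odp", ".odg",
    ".jpg", ".jpeg", ".gif", ".png",
})


def is_supported(filename: str) -> bool:
    """Return True if the file's final extension is in the supported set."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For instance, `is_supported("Report.DOCX")` is true because the comparison is case-insensitive, while `is_supported("archive.tar.gz")` is false because only the final suffix is considered.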
Dangerzone was inspired by Qubes trusted PDF, but it works on non-Qubes operating systems. It uses containers as sandboxes instead of virtual machines (using Docker for macOS, Windows, and Debian/Ubuntu, and Podman for Fedora).
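The runtime choice described above, combined with the no-network property of the sandboxes, could be sketched as follows. The helper names, the `distro_id` parameter, and the image name in the usage note are all hypothetical. `run`, `--rm`, and `--network none` are standard options in both the Docker and Podman command-line tools, but the exact arguments Dangerzone passes are not stated here.

```python
from typing import List, Sequence


def pick_runtime(system: str, distro_id: str = "") -> str:
    """Choose a container runtime name following the mapping in the text.

    system is a value like platform.system() returns ("Darwin", "Windows",
    "Linux"); distro_id is a lowercase distribution ID such as "fedora".
    """
    if system in ("Darwin", "Windows"):
        return "docker"
    if system == "Linux":
        distro = distro_id.lower()
        if distro == "fedora":
            return "podman"
        if distro in ("debian", "ubuntu"):
            return "docker"
    # The text does not say what happens elsewhere, so refuse to guess.
    raise ValueError(f"no runtime known for {system!r} / {distro_id!r}")


def sandbox_command(runtime: str, image: str, args: Sequence[str]) -> List[str]:
    """Build an argv list that runs image with networking disabled."""
    return [runtime, "run", "--rm", "--network", "none", image, *args]
```

A call such as `sandbox_command(pick_runtime("Linux", "fedora"), "example/converter", ["convert"])` yields a Podman invocation with `--network none`, which is the property that keeps a compromised sandbox from phoning home.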
Set up a development environment by following these instructions.