Commit graph

5 commits

Author SHA1 Message Date
Alex Pyrgiotis
f75d471ec8
Fix OCR bug in Qubes Fedora 38 templates
Provide a fix for an OCR bug that affected Fedora 38 templates of Qubes
OS. In that specific configuration, the PyMuPDF version accepts the
Tesseract data directory only from the `TESSDATA_PREFIX` environment
variable. Our mistake was that we were setting this environment variable
in a dev script, instead of setting it for all configurations.

In this commit, we set an attribute in the fitz.fitz module, so that
both dev scripts and end-user installations can work. This is hacky, but
it targets an old PyMuPDF release after all, so we don't expect things
to break in the long run.

Fixes #737
2024-03-04 16:53:04 +02:00
deeplow
6006beeb03
Fix OCR on Qubes: PyMuPDF required TESSDATA_PREFIX
PyMuPDF versions lower than 1.22.5 pass the tesseract data path as
an argument to `pixmap.pdfocr_tobytes()` [1], but lower versions require
setting instead the TESSDATA_PREFIX environment variable [2].

Because on Qubes the pixels to pdf conversion happens on the host and
Qubes has a lower PyMuPDF package version, we need to pass instead via
environment variable.

NOTE: the TESSDATA_PREFIX env. variable was set in dangerzone-cli
instead of closer to the calling method in `doc_to_pixels.py` since
PyMuPDF reads this variable as soon as the fitz module is imported
[3][4].

[1]: https://pymupdf.readthedocs.io/en/latest/pixmap.html#Pixmap.pdfocr_tobytes
[2]: https://pymupdf.readthedocs.io/en/latest/installation.html#enabling-integrated-ocr-support
[3]: https://github.com/pymupdf/PyMuPDF/discussions/2439
[4]: https://github.com/pymupdf/PyMuPDF/blob/5d6a7db/src/__init__.py#L159

Fixes #682
2024-02-07 13:13:10 +00:00
deeplow
4d8e4c53e3
sort imports with isort linter 2022-08-22 10:15:26 +01:00
Micah Lee
cf367adcfa
This creates a separate script dangerzone-container which is a wrapper for running the container. This lets us run dangerzone as unprivileged, but dangerzone-container as privileged, to avoid adding the user to the dangerzone group. 2020-03-13 16:49:53 -07:00
Micah Lee
0b9823a34e
Initial commit 2020-01-06 14:40:09 -08:00