Implement Click's callback interface and create validators for the
input/output filenames, using the logic from the Document class. This
way, we can catch user errors as early as possible.
Factor out the filename validation logic and move it into the Document
class. Previously, the filename validation logic was scattered across
the CLI and GUI code.
Also, introduce a new errors.py module whose purpose is to handle
document-related errors, by providing:
* A special exception for them (DocumentFilenameException)
* A decorator that handles DocumentFilenameException, logs it and the
underlying cause, and exits the program gracefully.
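A minimal sketch of what such an errors.py could look like. The decorator
name (handle_document_errors), the exit code, and the open_document()
caller are assumptions made for illustration, not the actual module
contents:

```python
import functools
import logging
import sys

log = logging.getLogger(__name__)


class DocumentFilenameException(Exception):
    """Raised when a document's input or output filename is invalid."""


def handle_document_errors(func):
    """Log DocumentFilenameException and its cause, then exit.

    Hypothetical name; the exit status of 1 is also an assumption.
    """

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except DocumentFilenameException as e:
            # __cause__ is set when the exception was raised with "from".
            if e.__cause__ is not None:
                log.error("%s: %s", e, e.__cause__)
            else:
                log.error("%s", e)
            sys.exit(1)

    return wrapper


@handle_document_errors
def open_document(filename):
    # Hypothetical caller: wrap the low-level error so its cause is kept.
    try:
        with open(filename, "rb") as f:
            return f.read()
    except OSError as e:
        raise DocumentFilenameException("Cannot read input file") from e
```

Raising with "from" keeps the original OSError attached as the cause, so
the decorator can log both the user-facing message and the underlying
reason before exiting.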
Avoid setting the document's filename via document.filename; instead,
set it at object instantiation where possible.
As a side effect, this changes some window logic. When
select_document() is called, it no longer checks whether there is
already an open window with no document selected yet. The user can open
as many windows with unselected documents as they want.
Rename the `common` module and `common.Common` class to `document` and
`document.Document` respectively. Also, rename the variables that hold
instances of this class.
This change reflects the fact that the class is responsible for tracking
the state of the document. When we add bulk document conversion,
keeping track of each document's state will be key. This name change is
a step towards that.
Run Mypy static checks against our tests. This brings them in line with
the rest of the codebase and gives us extra certainty that the tests
(and unit tests in particular) will not significantly diverge from the
code they are testing.
Concatenate directories and filenames in a platform-independent way, by
using pathlib.Path. This fixes issues in the tests where the "/" path
separator made the tests fail on Windows.
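As an illustration of the difference, the sketch below joins path
components with pathlib instead of a hardcoded "/". The directory and
file names, and the output_path() helper, are made up for the example:

```python
from pathlib import Path, PurePosixPath, PureWindowsPath


def output_path(directory, filename):
    # The "/" operator on Path objects uses the host OS's separator.
    return Path(directory) / filename


# Pure paths show what each platform would produce, regardless of the
# host the code runs on.
posix = PurePosixPath("tests") / "test_docs" / "sample.pdf"
windows = PureWindowsPath("tests") / "test_docs" / "sample.pdf"

print(posix)    # tests/test_docs/sample.pdf
print(windows)  # tests\test_docs\sample.pdf
```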
Add two tests that check if Dangerzone properly handles input and output
filenames with spaces in them. Previously this was not straightforward
because we didn't tokenize arguments, which led to Click splitting
filenames with spaces in two.
Pass tokenized arguments (i.e., arguments as lists of strings) to CLI
invocations; otherwise Click will attempt to tokenize them internally.
The problem with leaving tokenization to Click is that it uses
`shlex.split()`, which is Unix-oriented and may miss some cases on
Windows.
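One such case, sketched with the standard library only: in its default
POSIX mode, `shlex.split()` treats the backslash as an escape character,
so Windows path separators are dropped. The option name and paths below
are illustrative:

```python
import shlex

# A command line as a single string, containing a Windows-style path.
windows_cmdline = r"--output-filename C:\Users\me\safe.pdf"

# POSIX rules treat "\" as an escape character, so the separators vanish.
tokens = shlex.split(windows_cmdline)
print(tokens)  # ['--output-filename', 'C:Usersmesafe.pdf']

# A pre-tokenized list needs no splitting, so backslashes and spaces
# inside a filename survive intact.
args = ["--output-filename", r"C:\Users\me\my safe.pdf"]
```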
Wrap Click results (`Result`) with a new class (`CLIResult`), which
includes:
1. Assertion statements.
2. Logic for formatting and printing a Click result.
3. Invocation arguments, which are missing from the original `Result`
class.
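A rough sketch of the wrapping idea, not the actual test helper. The
method names are assumptions, and a SimpleNamespace stands in for a real
Click result (only its `exit_code` and `output` attributes are used) so
that the example runs without Click installed:

```python
from types import SimpleNamespace


class CLIResult:
    """Wrap a Click-style result with the arguments that produced it."""

    def __init__(self, result, args):
        self.result = result  # object exposing .exit_code and .output
        self.args = args      # invocation arguments, kept for reporting

    def __str__(self):
        # Formatting logic: show how the CLI was invoked and what it did.
        return (
            f"args: {self.args}\n"
            f"exit code: {self.result.exit_code}\n"
            f"output:\n{self.result.output}"
        )

    def assert_success(self):
        # Hypothetical helper: fail with the formatted result as message.
        assert self.result.exit_code == 0, str(self)

    def assert_failure(self):
        assert self.result.exit_code != 0, str(self)


# Stand-in for a real Click result object.
fake = SimpleNamespace(exit_code=0, output="done\n")
res = CLIResult(fake, ["sample.pdf"])
res.assert_success()
```

Keeping the invocation arguments next to the result means a failing
assertion can report exactly which command line produced the output.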
On Windows, `pip install` fails to install `cx_Logging-3.0` with the
error:
error: Microsoft Visual C++ 14.0 or greater is required. Get it
with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
Installing the Microsoft C++ Build Tools solves the issue.
Reduce "global_common" coupling by moving methods that could be
static into semantically closer Python files.
Based on work initially done by @gmarmstrong in PR #166:
- move container-specific code out of global_common.py and into
container.py
- create a util.py for static methods used throughout the app
- move banner code from global_common into cli.py, given that it's
only displayed there
- update tests to reflect these changes
- move ocr_languages from global_common into its own JSON file,
share/ocr-languages.json, to simplify global_common logic
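Loading the languages from a JSON file could look roughly like the
sketch below. The load_ocr_languages() helper and the name-to-code
layout of the file are assumptions, and the demo writes a throwaway file
instead of reading the real share/ocr-languages.json:

```python
import json
import tempfile
from pathlib import Path


def load_ocr_languages(path):
    """Return the OCR language mapping stored in a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Demo with a temporary file; the two entries are illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "ocr-languages.json"
    json_path.write_text(
        json.dumps({"English": "eng", "French": "fra"}), encoding="utf-8"
    )
    ocr_languages = load_ocr_languages(json_path)
```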
Container-related methods recently moved to container.py no longer
need to have 'container' in their name, as they are already within
the container scope.
Additionally, the prefix made them awkward to call from another module:
from .. import container
container.get_container_runtime()
The logic for detecting whether we are running on Docker or Podman
and for identifying the respective binary was scattered across the
codebase. This centralizes it all in container.py.
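A sketch of what centralized detection might look like. The function
names and the platform rule (Podman on Linux, Docker elsewhere) are
assumptions for illustration, not necessarily what container.py does:

```python
import platform
import shutil


def get_runtime_name(system=None):
    # Assumed rule for this sketch: Podman on Linux, Docker elsewhere.
    system = system or platform.system()
    return "podman" if system == "Linux" else "docker"


def get_runtime():
    """Return the full path of the container runtime binary."""
    name = get_runtime_name()
    path = shutil.which(name)  # None if the binary is not on PATH
    if path is None:
        raise RuntimeError(f"{name} is not installed or not on PATH")
    return path
```

With both decisions in one module, callers only need
`container.get_runtime()` and never have to repeat the platform check.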
- display_banner() was only called in CLI mode, so it makes sense
for it to be in the CLI.
- get_version() was moved to util since it is a static function
that is needed in multiple parts of the application.
Static methods that are used application-wide should belong to
the utilities Python file.
Inspired by @gmarmstrong's PR #166 on refactoring global_common
methods to be static and have a dzutil.py
Originally, PDF files were included for these edge cases, but in
reality all we want to test is the filename itself. Generating them
dynamically reduces the repo size.
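Generating such files on the fly could be sketched as follows. The
filenames and the make_edge_case_files() helper are illustrative, not
the actual edge cases covered by the tests:

```python
import tempfile
from pathlib import Path

# Illustrative edge-case names; only the filename matters, so the files
# can stay empty.
EDGE_CASE_NAMES = [
    "sample with spaces.pdf",
    "sample.with.dots.pdf",
    "sample-ünïcode.pdf",
]


def make_edge_case_files(directory, names=EDGE_CASE_NAMES):
    """Create one empty file per name and return the resulting paths."""
    paths = []
    for name in names:
        path = Path(directory) / name
        path.touch()
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    created = make_edge_case_files(tmp)
    created_names = [p.name for p in created]
    all_exist = all(p.exists() for p in created)
```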
The parametrization features of pytest over the default unittest
will be useful to reduce test code. Furthermore, pytest is already
used by folks at FPF, so there won't be any learning curve if folks
want to work on it.
Updates to the macOS and Windows build scripts and documentation:
- Switch from hardcoding the exact minor release of Python 3.9
to just using Python 3.9
- Switch from 32-bit Windows Python binaries to 64-bit
- Install poetry in Windows using pip, which is much simpler and
less error-prone than the PowerShell way
- Include instructions for making the Windows release in a
Windows 11 VM, and building the container image on the host
- Update the fingerprint of the Windows signing key
- Fix a small bug with the .wxs file used to build the MSI
package