From d91d4ab913b695ea799d2abe182529f141b0b471 Mon Sep 17 00:00:00 2001
From: Moon Sungjoon
Date: Mon, 26 Jun 2023 18:22:25 +0900
Subject: [PATCH] Add CJK fonts

Add the Noto CJK font for better CJK support.
Without these fonts, some characters in documents are rendered as "tofu".
---
 Dockerfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Dockerfile b/Dockerfile
index 787f35a..97155b2 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -15,7 +15,7 @@ RUN apk --no-cache -U upgrade && \
     python3 \
     py3-magic \
     tesseract-ocr \
-    openjdk17-jre-headless
+    font-noto-cjk
 
 # Download the trained models from the latest GitHub release of Tesseract, and
 # store them under /usr/share/tessdata. This is basically what distro packages