
Research Article
High-Precision Character Extraction from Historical Japanese Manuscripts Based on Improved YOLOv8
@INPROCEEDINGS{10.4108/eai.18-12-2025.2365292,
  author={Yuya YOSHIZU and Lin MENG},
  title={High-Precision Character Extraction from Historical Japanese Manuscripts Based on Improved YOLOv8},
  proceedings={Proceedings of the 13th International Conference on Identification, Information and Knowledge in the Internet of Things, IIKI 2025, 18-21 December 2025, Chengdu, China},
  publisher={EAI},
  proceedings_a={IIKI},
  year={2026},
  month={6},
  keywords={Computer Vision, Deep Learning, Object Detection, YOLOX, Historical Documents},
  doi={10.4108/eai.18-12-2025.2365292}
}
Yuya YOSHIZU
Lin MENG
Year: 2026
IIKI
EAI
DOI: 10.4108/eai.18-12-2025.2365292
Abstract
Historical Japanese manuscripts are invaluable cultural assets, yet their characters are often obscured by degradation such as stains, fading, and insect damage. Reliable digital preservation and downstream restoration both depend on highly accurate extraction of text regions. This paper proposes a high-precision character detection framework based on YOLOX. Each manuscript page is padded and divided into overlapping 640 × 640 tiles, and detection is performed independently on each tile. The results are then merged using page-level non-maximum suppression (NMS). To further mitigate the duplicate detections and boundary errors inherent to tiled inference, a central-region selection strategy is employed. Experiments on 11 manuscripts show that the conventional page-level YOLOX approach, which resizes each entire page to a fixed resolution, loses detail and reaches a recall of only 0.798. In contrast, the proposed method, which combines tiled inference with central-region filtering, achieves 0.976 precision, 0.989 recall, and 0.982 average precision (AP). It successfully separates main body characters from annotation characters and degradation-induced artifacts across RGB, grayscale, and monochrome images.
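The merging stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the detector is left out, and the function names, the stride of 480 pixels, the 80-pixel margin, the IoU threshold of 0.5, and the rule "keep a box only if its centre falls in the tile's central region" are all assumptions chosen for illustration. Only the 640 × 640 tile size, the overlap, the central-region idea, and the page-level NMS come from the abstract.

```python
import math

TILE = 640  # tile size stated in the abstract


def tile_origins(width, height, tile=TILE, stride=480):
    """Top-left corners of overlapping tiles covering the (padded) page.

    stride=480 is an assumed value; the abstract only says tiles overlap.
    """
    nx = max(1, math.ceil((width - tile) / stride) + 1)
    ny = max(1, math.ceil((height - tile) / stride) + 1)
    return [(ix * stride, iy * stride) for iy in range(ny) for ix in range(nx)]


def padded_size(origins, tile=TILE):
    """Page size after padding so that every tile is a full tile."""
    return (max(ox for ox, _ in origins) + tile,
            max(oy for _, oy in origins) + tile)


def in_central_region(box, origin, page_size, tile=TILE, margin=80):
    """Assumed rule: keep a box if its centre is at least `margin` pixels
    inside the tile, except along edges that coincide with the page border."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    ox, oy = origin
    pw, ph = page_size
    left = ox if ox == 0 else ox + margin
    top = oy if oy == 0 else oy + margin
    right = pw if ox + tile >= pw else ox + tile - margin
    bottom = ph if oy + tile >= ph else oy + tile - margin
    return left <= cx < right and top <= cy < bottom


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thr=0.5):
    """Greedy NMS over a list of (box, score); iou_thr is an assumed value."""
    kept = []
    for box, score in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) < iou_thr for k in kept):
            kept.append((box, score))
    return kept


def merge_tiles(per_tile, page_size, tile=TILE, margin=80, iou_thr=0.5):
    """per_tile maps a tile origin to a list of (box in tile coords, score)."""
    page_dets = []
    for (ox, oy), dets in per_tile.items():
        for (x1, y1, x2, y2), score in dets:
            box = (x1 + ox, y1 + oy, x2 + ox, y2 + oy)  # shift to page coords
            if in_central_region(box, (ox, oy), page_size, tile, margin):
                page_dets.append((box, score))
    return nms(page_dets, iou_thr)


# Usage: one character seen by two overlapping tiles of a 1000 x 640 page.
origins = tile_origins(1000, 640)
per_tile = {
    origins[0]: [((480, 100, 520, 140), 0.9)],
    origins[1]: [((0, 100, 40, 140), 0.8)],  # same character, cut by the tile edge
}
merged = merge_tiles(per_tile, padded_size(origins))
```

With a margin equal to half the overlap, the central regions of neighbouring tiles meet without overlapping, so most duplicates are dropped before NMS runs. NMS then only has to resolve the remaining overlapping boxes.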


