High-Precision Character Extraction from Historical Japanese Manuscripts Based on Improved YOLOv8

Yuya YOSHIZU; Lin MENG

Proceedings of the 13th International Conference on Identification, Information and Knowledge in the Internet of Things, IIKI 2025, 18-21 December 2025, Chengdu, China

Research Article

Cite (BibTeX):

@INPROCEEDINGS{10.4108/eai.18-12-2025.2365292,
    author={Yuya YOSHIZU and Lin MENG},
    title={High-Precision Character Extraction from Historical Japanese Manuscripts Based on Improved YOLOv8},
    proceedings={Proceedings of the 13th International Conference on Identification, Information and Knowledge in the Internet of Things, IIKI 2025, 18-21 December 2025, Chengdu, China},
    publisher={EAI},
    proceedings_a={IIKI},
    year={2026},
    month={6},
    keywords={Computer Vision, Deep Learning, Object Detection, YOLOX, Historical Documents},
    doi={10.4108/eai.18-12-2025.2365292}
}

Yuya YOSHIZU¹, Lin MENG²,*

1: Graduate School of Science and Engineering, Ritsumeikan University, 1-1-1 Noji-higashi, Kusatsu, Shiga 525-8577, Japan
2: College of Science and Engineering, Ritsumeikan University, 1-1-1 Noji-higashi, Kusatsu, Shiga 525-8577, Japan

*Contact email: menglin@fc.ritsumei.ac.jp

Abstract

Historical Japanese manuscripts are invaluable cultural assets, yet their characters are often obscured by degradation such as stains, fading, and insect damage. To ensure reliable digital preservation and enable downstream restoration, highly accurate extraction of text regions is essential. This paper proposes a high-precision character detection framework based on YOLOX. Each manuscript page is padded and divided into overlapping 640 × 640 tiles, and detection is performed independently on each tile. The per-tile results are then merged using page-level non-maximum suppression (NMS). To further mitigate the duplicate detections and boundary errors inherent to tiled inference, a central-region selection strategy is employed. Experiments on 11 manuscripts show that the conventional page-level YOLOX approach, which resizes each entire page to a fixed resolution, loses fine detail and achieves a recall of only 0.798. In contrast, the proposed method, which combines tiled inference with central-region filtering, achieves 0.976 precision, 0.989 recall, and 0.982 average precision (AP). It successfully separates main-body characters from annotation characters and degradation-induced artifacts across RGB, grayscale, and monochrome images.
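
The tiling, central-region selection, and page-level NMS steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. Only the 640-pixel tile size comes from the abstract; the 80-pixel margin, the resulting 480-pixel stride, the 0.5 IoU threshold, the centre-point rule for deciding whether a detection belongs to a tile's central region, and all function names are assumptions chosen for illustration. The detector itself is omitted: the sketch takes per-tile boxes as input.

```python
import math

TILE = 640                   # tile size stated in the abstract
MARGIN = 80                  # ASSUMED width of the discarded border band
STRIDE = TILE - 2 * MARGIN   # ASSUMED stride, so central regions abut exactly


def tile_origins(length, tile=TILE, stride=STRIDE):
    """Tile start offsets along one axis. The page is assumed to be padded
    so that the last tile fits entirely inside the padded image."""
    if length <= tile:
        return [0]
    n = math.ceil((length - tile) / stride) + 1
    return [i * stride for i in range(n)]


def in_central_region(box, ox, oy, last_x, last_y, tile=TILE, margin=MARGIN):
    """True if the box centre (page coordinates) falls in the central region
    of the tile at (ox, oy). On sides that touch the page border the region
    is extended to the tile edge so that no area is left uncovered."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    x_lo = ox if ox == 0 else ox + margin
    y_lo = oy if oy == 0 else oy + margin
    x_hi = ox + tile if ox == last_x else ox + tile - margin
    y_hi = oy + tile if oy == last_y else oy + tile - margin
    return x_lo <= cx < x_hi and y_lo <= cy < y_hi


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def page_level_nms(dets, iou_thr=0.5):
    """Greedy NMS over (x1, y1, x2, y2, score) tuples in page coordinates."""
    kept = []
    for d in sorted(dets, key=lambda d: d[4], reverse=True):
        if all(iou(d, k) < iou_thr for k in kept):
            kept.append(d)
    return kept


def merge_tile_detections(tile_dets, page_w, page_h):
    """tile_dets maps a tile origin (ox, oy) to a list of
    (x1, y1, x2, y2, score) boxes in tile-local coordinates."""
    last_x = tile_origins(page_w)[-1]
    last_y = tile_origins(page_h)[-1]
    page_dets = []
    for (ox, oy), dets in tile_dets.items():
        for x1, y1, x2, y2, score in dets:
            box = (x1 + ox, y1 + oy, x2 + ox, y2 + oy, score)  # to page coords
            if in_central_region(box, ox, oy, last_x, last_y):
                page_dets.append(box)
    return page_level_nms(page_dets)
```

With these assumed values, a character whose centre lies in the overlap between two neighbouring tiles is retained from exactly one of them, so most duplicates are removed before NMS runs; NMS then handles any remaining overlapping boxes.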

Keywords: Computer Vision, Deep Learning, Object Detection, YOLOX, Historical Documents

Published: 2026-06-17
Publisher: EAI

DOI: http://dx.doi.org/10.4108/eai.18-12-2025.2365292
