Proceedings of the 4th International Conference on Information Technology, Civil Innovation, Science, and Management, ICITSM 2025, 28-29 April 2025, Tiruchengode, Tamil Nadu, India, Part I

Research Article

AI-Enhanced Multi-OCR Framework with NLP Post-processing for Improved Handwritten Text Recognition and Analysis

@INPROCEEDINGS{10.4108/eai.28-4-2025.2357771,
        author={Venkatasivaprasad Ravinuthala and Ranjana P},
        title={AI-Enhanced Multi-OCR Framework with NLP Post-processing for Improved Handwritten Text Recognition and Analysis},
        booktitle={Proceedings of the 4th International Conference on Information Technology, Civil Innovation, Science, and Management, ICITSM 2025, 28-29 April 2025, Tiruchengode, Tamil Nadu, India, Part I},
        publisher={EAI},
        proceedings_a={ICITSM PART I},
        year={2025},
        month={10},
        keywords={optical character recognition (ocr), handwritten text recognition, natural language processing (nlp), paraphrasing, summarization, sentiment analysis, multi-ocr integration, word error rate (wer), gradio interface},
        doi={10.4108/eai.28-4-2025.2357771}
    }
    
Venkatasivaprasad Ravinuthala¹,*, Ranjana P¹
¹ Hindustan Institute of Technology & Science
*Contact email: 23cp0320007@student.hindustanuniv.ac.in

Abstract

Handwritten text recognition remains challenging due to diverse handwriting styles, image quality variations, and limitations inherent in single Optical Character Recognition (OCR) tools. This study introduces a novel AI-enhanced OCR framework that combines multiple OCR engines with advanced Natural Language Processing (NLP) post-processing techniques, including paraphrasing, summarization and sentiment analysis. The multi-OCR approach strategically leverages the strengths of each OCR engine to optimize initial recognition accuracy. Subsequent NLP refinement significantly reduces OCR-induced errors, enhances readability, and provides contextual clarity. Comprehensive evaluations on synthetic and real-world handwritten datasets demonstrate marked improvements, evidenced by reductions in Word Error Rate (WER) and enhancements in precision, recall, and F1-score. Furthermore, an interactive interface developed using Gradio facilitates real-time processing and intuitive visualization of OCR and NLP outcomes, underscoring the practical applicability of the proposed system. This research provides a robust, integrative solution for handwritten text digitization and analysis, addressing critical gaps in existing OCR technologies.
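
For readers unfamiliar with the metric, Word Error Rate is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) between a recognized transcript and its reference, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenization is a plain
    whitespace split, which is an assumption, not the paper's method.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution plus one deletion against a 4-word reference.
    print(word_error_rate("the quick brown fox", "the quack brown"))
```

Under this definition a lower value is better, and the value can exceed 1.0 when the hypothesis contains many inserted words.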

Keywords
optical character recognition (ocr), handwritten text recognition, natural language processing (nlp), paraphrasing, summarization, sentiment analysis, multi-ocr integration, word error rate (wer), gradio interface
Published
2025-10-13
Publisher
EAI
http://dx.doi.org/10.4108/eai.28-4-2025.2357771
Copyright © 2025 EAI