Proceedings of the 4th International Conference on Information Technology, Civil Innovation, Science, and Management, ICITSM 2025, 28-29 April 2025, Tiruchengode, Tamil Nadu, India, Part I

Research Article

A Multi-Model Video Summarization Framework Integrating Feature Extraction, Embedding and Transformer-Based Learning

Cite
BibTeX Plain Text
  • @INPROCEEDINGS{10.4108/eai.28-4-2025.2357767,
        author={Sahaya Sakila V and Chitransh Nishad and Muthangi Shashank and Tarun Prithi Gopinath},
        title={A Multi-Model Video Summarization Framework Integrating Feature Extraction, Embedding and Transformer-Based Learning},
        proceedings={Proceedings of the 4th International Conference on Information Technology, Civil Innovation, Science, and Management, ICITSM 2025, 28-29 April 2025, Tiruchengode, Tamil Nadu, India, Part I},
        publisher={EAI},
        proceedings_a={ICITSM PART I},
        year={2025},
        month={10},
        keywords={video summarization, deep learning, openai whisper, faiss, pytorch, resnet50, semantic embedding, benchmarking},
        doi={10.4108/eai.28-4-2025.2357767}
    }
    
  • Sahaya Sakila V
    Chitransh Nishad
    Muthangi Shashank
    Tarun Prithi Gopinath
    Year: 2025
    A Multi-Model Video Summarization Framework Integrating Feature Extraction, Embedding and Transformer-Based Learning
    ICITSM PART I
    EAI
    DOI: 10.4108/eai.28-4-2025.2357767
Sahaya Sakila V (1, *), Chitransh Nishad (1), Muthangi Shashank (1), Tarun Prithi Gopinath (1)
  • 1: SRM Institute of Science and Technology
*Contact email: sahayasv2@srmist.edu.in

Abstract

Video summarization is important for managing large volumes of video in domains such as media, education, and surveillance. Traditional summarization approaches rely on keyframe selection and clustering, which fail to capture temporal dependencies and semantic context and therefore produce incomplete or redundant summaries. To address these limitations, the proposed method, VidSynape, introduces a multi-model video summarization framework that combines frame-level analysis with insights from the transcript. It uses deep feature embeddings to represent visual content and efficient similarity-based indexing to improve scalability and speed. By combining multiple models, the approach improves summary quality, contextual coherence, and computational efficiency. The system is evaluated on the SumMe and TVSum datasets, with results showing better performance than other methods. It generates high-quality summaries while reducing processing time, making it suitable for real-world video analysis applications.
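The abstract does not spell out how the embeddings and the similarity index are used, so the following is only a minimal sketch of one common pattern: greedily keeping a frame when its embedding is not too similar to any frame already kept. The function names, the cosine-similarity measure, and the 0.9 threshold are assumptions made for illustration, not details from the paper. In a real pipeline the vectors would come from a visual model (the keywords mention resnet50) and the linear scan would be replaced by an approximate-nearest-neighbour index (the keywords mention faiss).

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_keyframes(
    embeddings: Sequence[Sequence[float]],
    threshold: float = 0.9,  # hypothetical value, not taken from the paper
) -> List[int]:
    """Return indices of frames whose embedding is dissimilar to all kept frames.

    Frames are visited in temporal order; a frame is kept only if its cosine
    similarity to every previously kept frame is below `threshold`.
    """
    selected: List[int] = []
    for idx, emb in enumerate(embeddings):
        if all(cosine_similarity(emb, embeddings[j]) < threshold for j in selected):
            selected.append(idx)
    return selected


if __name__ == "__main__":
    # Toy 2-D "embeddings": frames 0/1 and 2/3 are near-duplicates.
    frames = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [0.01, 0.99], [1.0, 1.0]]
    print(select_keyframes(frames))
```

The linear scan costs one comparison per kept frame, which is why a dedicated similarity index becomes attractive as the number of frames grows.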

Keywords
video summarization, deep learning, openai whisper, faiss, pytorch, resnet50, semantic embedding, benchmarking
Published
2025-10-13
Publisher
EAI
http://dx.doi.org/10.4108/eai.28-4-2025.2357767
Copyright © 2025 EAI