Proceedings of the 13th International Conference on Identification, Information and Knowledge in the Internet of Things, IIKI 2025, 18-21 December 2025, Chengdu, China

Research Article

MSDA-Text: Template-Guided Long-Form Text Generation with Multi-Source Data Augmentation

Zheng Dai (1), Yilun Zhang (2), Pengjia Wang (1), Qianpu Jiang (1), Fuguo Liu (3), Yufeng Shi (1, *)
  • 1: Institute for Financial Studies, Shandong University
  • 2: Research Center for Mathematics and Interdisciplinary Sciences, Shandong University
  • 3: School of Mathematics and Data Sciences, Changji University
*Contact email: yfshi@sdu.edu.cn

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities in text generation, yet their outputs depend heavily on pre-training data and often lack the factual depth required for domain-specific long-form writing, such as industrial reports or biographical summaries. To address this limitation, we propose MSDA-Text (Template-Guided Long-Form Text Generation with Multi-Source Data Augmentation), a framework designed to produce accurate and comprehensive long-form texts aligned with user intent. Building on existing long-text generation architectures such as STORM, MSDA-Text introduces two key enhancements: (1) a template-guided outline generation process that incorporates user-provided reference materials into multi-perspective LLM discussions, and (2) multi-source data augmentation that integrates both Internet-based data and local real-time data through Retrieval-Augmented Generation (RAG) and Text-to-SQL techniques. The framework uses the Model Context Protocol (MCP) to unify template parsing across heterogeneous file types, and it features a long-text writing agent that autonomously retrieves and synthesizes content for each outline section. Experimental results show that MSDA-Text generates long-form documents that are more structured, better aligned with user intent, and more factually grounded than those produced by existing LLM-based methods.
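The per-section flow described in the abstract (an outline derived from a template, then evidence gathered from both retrieved text and a local database) might be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: the names `Section`, `outline_from_template`, `retrieve_documents`, `query_local_data`, and `gather_evidence` are hypothetical, the word-overlap retriever stands in for a real RAG retriever, the SQL string is supplied by hand where a Text-to-SQL model would generate it, and the table and passages are made-up demo data.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Section:
    """One outline section plus the evidence collected for it."""
    title: str
    evidence: list[str] = field(default_factory=list)


def outline_from_template(template_headings: list[str]) -> list[Section]:
    # Hypothetical stand-in for template-guided outline generation:
    # the user's template headings are simply reused as the outline.
    return [Section(title=h.strip()) for h in template_headings if h.strip()]


def retrieve_documents(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Toy retriever: rank passages by word overlap with the query.
    # A real RAG pipeline would call a search API or a vector index here.
    terms = set(query.lower().split())
    scored = [(len(terms & set(p.lower().split())), p) for p in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [p for score, p in ranked if score > 0][:k]


def query_local_data(conn: sqlite3.Connection, sql: str) -> list[str]:
    # In a Text-to-SQL setup an LLM would produce `sql` from the section
    # title and the schema; in this sketch it is passed in directly.
    rows = conn.execute(sql).fetchall()
    return [", ".join(str(value) for value in row) for row in rows]


def gather_evidence(outline, corpus, conn, sql_by_section):
    # For each section, merge retrieved passages with local database rows.
    for section in outline:
        section.evidence += retrieve_documents(section.title, corpus)
        sql = sql_by_section.get(section.title)
        if sql:
            section.evidence += query_local_data(conn, sql)
    return outline


# Made-up demo data: an in-memory table and two short passages.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE output (year INTEGER, units INTEGER)")
conn.executemany("INSERT INTO output VALUES (?, ?)", [(2023, 120), (2024, 150)])
corpus = [
    "Market overview of the sensor industry",
    "Unrelated passage about cooking",
]
outline = gather_evidence(
    outline_from_template(["Market overview", "Production figures"]),
    corpus,
    conn,
    {"Production figures": "SELECT year, units FROM output ORDER BY year"},
)
for section in outline:
    print(section.title, "->", section.evidence)
```

In a fuller system, the evidence list for each section would be passed to an LLM writing step; that step is omitted here because the abstract does not specify it.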

Keywords
Large Language Models, Long-Form Text Generation, Multi-Source Data Augmentation, Template-Guided Generation, RAG, Text-to-SQL
Published
2026-06-17
Publisher
EAI
http://dx.doi.org/10.4108/eai.18-12-2025.2365293