airo 24(1):

Research Article

A Survey of Data-Driven 2D Diffusion Models for Generating Images from Text

Cite
BibTeX Plain Text
  • @ARTICLE{10.4108/airo.5453,
        author={Shun Fang},
        title={A Survey of Data-Driven 2D Diffusion Models for Generating Images from Text},
        journal={EAI Endorsed Transactions on AI and Robotics},
        volume={3},
        number={1},
        publisher={EAI},
        journal_a={AIRO},
        year={2024},
        month={4},
        keywords={2D Diffusion Model, DDPM, HighLDM, Imagen},
        doi={10.4108/airo.5453}
    }
    
  • Shun Fang
    Year: 2024
    A Survey of Data-Driven 2D Diffusion Models for Generating Images from Text
    AIRO
    EAI
    DOI: 10.4108/airo.5453
Shun Fang1,*
  • 1: Peking University
*Contact email: fangshun@pku.org.cn

Abstract

This paper surveys recent advances in generative modeling for text-to-image synthesis, focusing on three models: denoising diffusion probabilistic models (DDPMs), HighLDM, and Imagen. DDPMs use denoising score matching and iterative refinement to reverse a diffusion process, which improves likelihood estimation and supports lossless compression. HighLDM targets high-resolution image synthesis by running diffusion in the latent space of an efficient autoencoder; denoising in latent space, combined with cross-attention, lets it adapt to diverse conditioning inputs. Imagen combines pre-trained transformer-based language encoders with high-resolution diffusion models for text-to-image generation, producing highly realistic and semantically coherent images and outperforming competing models on FID scores and in human evaluations on DrawBench and similar benchmarks. The review critically examines each model's methods, contributions, performance, and limitations, and compares their theoretical underpinnings and practical implications. The aim is to inform future generative modeling research across applications.
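To make the "reverse a diffusion process" idea concrete, here is a minimal sketch of the standard DDPM forward (noising) step and its algebraic inversion. It is an illustration under stated assumptions, not code from the surveyed paper: the function names, the linear schedule endpoints (1e-4 to 0.02), the 1000 steps, and the four-value toy "image" are all chosen for the example, and the true noise stands in for the output of a trained denoising network.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (endpoints are assumptions)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in closed form; returns (x_t, eps)."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return x_t, eps


def estimate_x0(x_t, eps_pred, t, alpha_bars):
    """Invert the forward formula given a noise estimate eps_pred."""
    a = alpha_bars[t]
    return [(x - math.sqrt(1.0 - a) * e) / math.sqrt(a)
            for x, e in zip(x_t, eps_pred)]


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

x0 = [0.5, -0.25, 1.0, 0.0]  # toy "image" of four pixel values (hypothetical)
x_t, eps = forward_noise(x0, t=500, alpha_bars=alpha_bars, rng=rng)

# A trained network would predict eps from (x_t, t); the true eps is used here
# only to show that an accurate noise estimate recovers the clean sample.
x0_hat = estimate_x0(x_t, eps, t=500, alpha_bars=alpha_bars)
```

In a real model the noise estimate is imperfect, so sampling proceeds through many small denoising steps rather than the single inversion shown here.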

Keywords
2D Diffusion Model, DDPM, HighLDM, Imagen
Received
2024-03-18
Accepted
2024-04-21
Published
2024-04-22
Publisher
EAI
http://dx.doi.org/10.4108/airo.5453

Copyright © 2024 S. Fang et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.
