
Research Article
Leveraging Synthetic Mammograms to Enhance Deep-Learning Performance for Breast Cancer Classification Using EfficientNetV2L Architecture
@ARTICLE{10.4108/airo.9749,
  author={Raymond Sutjiadi and Siti Sendari and Heru Wahyu Herwanto and Yosi Kristian},
  title={Leveraging Synthetic Mammograms to Enhance Deep-Learning Performance for Breast Cancer Classification Using EfficientNetV2L Architecture},
  journal={EAI Endorsed Transactions on AI and Robotics},
  volume={4},
  number={1},
  publisher={EAI},
  journal_a={AIRO},
  year={2025},
  month={9},
  keywords={Breast Cancer Detection, Synthetic Mammograms, EfficientNetV2L, Denoising Diffusion Probabilistic Models (DDPM), Deep Learning in Medical Imaging},
  doi={10.4108/airo.9749}
}
Authors: Raymond Sutjiadi, Siti Sendari, Heru Wahyu Herwanto, Yosi Kristian
Year: 2025
Journal: EAI Endorsed Transactions on AI and Robotics (AIRO)
Publisher: EAI
DOI: 10.4108/airo.9749
Abstract
INTRODUCTION: To improve survival rates for breast cancer, a leading cause of female mortality globally, early detection is essential. This study presents a deep learning framework for classifying mammogram images as normal or abnormal.
OBJECTIVES: This research aims to enhance the performance of a deep learning model for breast cancer classification by augmenting a real mammogram dataset with synthetic images. The study evaluates the impact of progressively increasing the number of synthetic mammograms on the model's accuracy, precision, recall, and F1-score.
METHODS: The approach utilizes the EfficientNetV2L model for classification. Data augmentation was performed by generating synthetic mammograms using Denoising Diffusion Probabilistic Models (DDPM). A baseline dataset of 410 real mammograms from the INbreast public dataset was augmented with an increasing number of synthetic images across four experimental scenarios.
RESULTS: The model demonstrated substantial performance gains directly linked to the use of synthetic data. The best performance was achieved when 500 synthetic images were used, resulting in all evaluation metrics exceeding a score of 0.90. The results confirm that incorporating more synthetic images is a key factor in achieving both higher classification accuracy and more stable training convergence.
CONCLUSION: These findings highlight the significant potential of synthetic image augmentation to address data scarcity, class imbalance, and model generalisation in medical image analysis. This method provides a scalable and privacy-preserving solution for breast cancer screening systems.
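The four evaluation metrics named in the abstract (accuracy, precision, recall, and F1-score) can be illustrated with a minimal sketch for the binary normal/abnormal setting. This is not the authors' evaluation code: the function name `binary_metrics`, the label encoding (1 = abnormal as the positive class, 0 = normal), and the toy labels are assumptions made purely for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption (not from the article): 1 = abnormal (positive class), 0 = normal.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (hypothetical, not data from the study).
example = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])
```

In a setup like the one the abstract describes, such a function would be applied to the held-out predictions of each experimental scenario so that the metrics can be compared as the number of synthetic images grows.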
Copyright © 2025 R. Sutjiadi et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.