
Research Article
An Explainable AI Based Deep Ensemble Transformer Framework for Gastrointestinal Disease Prediction from Endoscopic Images
@ARTICLE{10.4108/airo.9795,
  author    = {Prof. Dr. Abdul Kadar Muhammad Masum and Abu Kowshir Bitto and Shafiqul Islam Talukder and Md Fokrul Islam Khan and Mohammed Shamsul Alam and Khandaker Mohammad Mohi Uddin},
  title     = {An Explainable AI Based Deep Ensemble Transformer Framework for Gastrointestinal Disease Prediction from Endoscopic Images},
  journal   = {EAI Endorsed Transactions on AI and Robotics},
  volume    = {4},
  number    = {1},
  publisher = {EAI},
  journal_a = {AIRO},
  year      = {2025},
  month     = {8},
  keywords  = {Gastrointestinal Disease, Medical Image Processing, Transformer Models, Ensemble Model, Explainable AI},
  doi       = {10.4108/airo.9795}
}
Prof. Dr. Abdul Kadar Muhammad Masum
Abu Kowshir Bitto
Shafiqul Islam Talukder
Md Fokrul Islam Khan
Mohammed Shamsul Alam
Khandaker Mohammad Mohi Uddin
Year: 2025
Journal: EAI Endorsed Transactions on AI and Robotics (AIRO)
Publisher: EAI
DOI: 10.4108/airo.9795
Abstract
Gastrointestinal diseases such as gastroesophageal reflux disease (GERD) and polyps remain prevalent and challenging to diagnose accurately because of overlapping visual features and inconsistent endoscopic image quality. In this study, we investigate transformer-based deep learning models, namely the Vision Transformer (ViT), the Swin Transformer, and a novel Ensemble Transformer, for classifying endoscopic images into four categories: GERD, GERD Normal, Polyp, and Polyp Normal. The dataset was curated and collected in collaboration with Zainul Haque Sikder Women's Medical College & Hospital, ensuring high-quality clinical annotations. All models were evaluated using precision, recall, F1 score, and overall classification accuracy. Our proposed Ensemble Transformer, which fuses the outputs of ViT and Swin Transformer, achieved superior performance, delivering well-balanced F1 scores across all classes, reducing misclassification, and reaching an overall accuracy of 87%. Furthermore, we incorporated explainable AI (XAI) techniques such as Grad-CAM and Grad-CAM++ to generate visual explanations of the model's predictions, enhancing interpretability for clinical validation. This work demonstrates the potential of combining global and local attention mechanisms with XAI to build reliable, real-time, AI-assisted diagnostic support systems for gastrointestinal disorders, particularly in resource-limited healthcare settings.
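The abstract states that the Ensemble Transformer fuses the outputs of ViT and Swin Transformer but does not give the fusion rule. The sketch below is a minimal illustration assuming soft voting (averaging class probabilities) over two pretrained timm backbones; the backbone names, input size, and fusion rule are assumptions for illustration, not the authors' documented configuration.

```python
# Hedged sketch: soft-voting ensemble of a ViT and a Swin Transformer classifier.
# Backbone names and the averaging fusion are assumptions, not the paper's exact setup.
import torch
import torch.nn as nn
import timm


class EnsembleTransformer(nn.Module):
    """Averages the class probabilities of a ViT branch and a Swin branch."""

    NUM_CLASSES = 4  # GERD, GERD Normal, Polyp, Polyp Normal

    def __init__(self):
        super().__init__()
        self.vit = timm.create_model(
            "vit_base_patch16_224", pretrained=True, num_classes=self.NUM_CLASSES
        )
        self.swin = timm.create_model(
            "swin_base_patch4_window7_224", pretrained=True, num_classes=self.NUM_CLASSES
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Fuse the two branches by averaging their per-class probabilities.
        p_vit = torch.softmax(self.vit(x), dim=1)
        p_swin = torch.softmax(self.swin(x), dim=1)
        return (p_vit + p_swin) / 2


if __name__ == "__main__":
    model = EnsembleTransformer().eval()
    dummy = torch.randn(1, 3, 224, 224)  # one 224x224 RGB endoscopic frame
    with torch.no_grad():
        probs = model(dummy)
    print(probs)  # shape (1, 4): probabilities over the four classes
```

Other fusion strategies (for example, concatenating branch features and training a small classification head) would fit the same interface; the abstract alone does not determine which the authors used. Likewise, the Grad-CAM / Grad-CAM++ explanations mentioned in the abstract could be produced along the lines of the following sketch, which assumes the open-source pytorch-grad-cam package applied to a single ViT branch; the target layer, token-to-grid reshape, and class index are the package's standard recipe for timm ViTs, not the authors' documented setup.

```python
# Hedged sketch: Grad-CAM++ heatmap for a timm ViT using the pytorch-grad-cam package.
# Layer choice, reshape, and class index are illustrative assumptions.
import timm
import torch
from pytorch_grad_cam import GradCAMPlusPlus
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget


def vit_reshape_transform(tokens, height=14, width=14):
    # Drop the class token and restore the 14x14 patch grid as a (B, C, H, W) map.
    spatial = tokens[:, 1:, :].reshape(tokens.size(0), height, width, tokens.size(2))
    return spatial.permute(0, 3, 1, 2)


vit = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=4).eval()
cam = GradCAMPlusPlus(
    model=vit,
    target_layers=[vit.blocks[-1].norm1],
    reshape_transform=vit_reshape_transform,
)

image = torch.randn(1, 3, 224, 224)      # stand-in for a preprocessed endoscopic frame
targets = [ClassifierOutputTarget(2)]    # class index 2 = "Polyp" (assumed ordering)
heatmap = cam(input_tensor=image, targets=targets)
print(heatmap.shape)                     # (1, 224, 224) saliency map to overlay on the input
```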
Copyright © 2025 Abdul Kadar Muhammad Masum et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA 4.0 license, which permits copying, redistributing, remixing, transforming, and building upon the material in any medium, so long as the original work is properly cited.