Artificial Intelligence for Communications and Networks. 4th EAI International Conference, AICON 2022, Hiroshima, Japan, November 30 - December 1, 2022, Proceedings

Research Article

A Study on Effectiveness of BERT Models and Task-Conditioned Reasoning Strategy for Medical Visual Question Answering

Cite
    @INPROCEEDINGS{10.1007/978-3-031-29126-5_5,
        author={Chau Nguyen and Tung Le and Nguyen-Khang Le and Trung-Tin Pham and Le-Minh Nguyen},
        title={A Study on Effectiveness of BERT Models and Task-Conditioned Reasoning Strategy for Medical Visual Question Answering},
        proceedings={Artificial Intelligence for Communications and Networks. 4th EAI International Conference, AICON 2022, Hiroshima, Japan, November 30 - December 1, 2022, Proceedings},
        proceedings_a={AICON},
        year={2023},
        month={3},
        keywords={Medical visual question answering, Visual question answering, Task-conditioned reasoning, Conditional reasoning},
        doi={10.1007/978-3-031-29126-5_5}
    }
Chau Nguyen*, Tung Le, Nguyen-Khang Le, Trung-Tin Pham, Le-Minh Nguyen
    *Contact email: chau.nguyen@jaist.ac.jp

    Abstract

    The medical visual question answering task requires a framework to understand a medical question in natural language and examine the corresponding image to produce the answer to the question. The common framework consists of a language understanding module, a visual understanding module, a signal fusion module, and an answer prediction module. Most existing works employed recurrent neural network-based models for the language understanding module. However, these approaches may not produce robust text representations and are hard to interpret. On the other hand, BERT models are more robust for text representation and can provide a clue for interpretability via the attention weights between words. Furthermore, as the questions consist of closed-answer questions and open-answer questions, the task-conditioned reasoning strategy was proposed to handle each type of question separately while keeping several modules in the framework shared. In this paper, we investigate the effectiveness of pre-trained BERT models and the task-conditioned reasoning strategy for the task of medical visual question answering on the VQA-RAD dataset. Experimental results demonstrate improvements when pre-trained BERT models are combined with the task-conditioned reasoning strategy.
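    The abstract describes a four-module framework (language understanding, visual understanding, signal fusion, answer prediction) with a task-conditioned split between closed-answer and open-answer questions. The sketch below is not the authors' implementation; it is a minimal PyTorch illustration of that structure, in which the backbone choices (bert-base-uncased, ResNet-34), the layer sizes, and the element-wise fusion are all assumptions made for the example.

    import torch
    import torch.nn as nn
    from transformers import BertModel
    from torchvision.models import resnet34

    class TaskConditionedMedVQA(nn.Module):
        """Shared encoders and fusion, with a separate answer head per question type."""
        def __init__(self, n_closed_answers, n_open_answers, hidden=768):
            super().__init__()
            # Language understanding module: pre-trained BERT; the pooled [CLS]
            # vector serves as the question representation.
            self.text_encoder = BertModel.from_pretrained("bert-base-uncased")
            # Visual understanding module: an ImageNet-pretrained ResNet-34 with
            # its classification layer removed (assumed backbone).
            backbone = resnet34(weights="IMAGENET1K_V1")
            self.visual_encoder = nn.Sequential(*list(backbone.children())[:-1])
            self.visual_proj = nn.Linear(512, hidden)
            # Signal fusion module, shared by closed- and open-answer questions.
            self.fusion = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
            # Task-conditioned answer prediction: one classifier per question type.
            self.closed_head = nn.Linear(hidden, n_closed_answers)
            self.open_head = nn.Linear(hidden, n_open_answers)

        def forward(self, input_ids, attention_mask, image):
            q = self.text_encoder(input_ids=input_ids,
                                  attention_mask=attention_mask).pooler_output
            v = self.visual_proj(self.visual_encoder(image).flatten(1))
            joint = self.fusion(q * v)  # element-wise fusion of text and image signals (assumed)
            # Both heads are computed; the training loop applies the closed-answer
            # loss only to closed questions and the open-answer loss only to open
            # questions, so the encoders and the fusion module stay shared.
            return self.closed_head(joint), self.open_head(joint)

    In such a setup, a batch would carry a boolean mask marking closed-answer questions; cross-entropy is then computed on the closed logits of the masked examples and the open logits of the rest, so each head only sees its own question type while gradients from both question types update the shared modules.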

    Keywords
    Medical visual question answering, Visual question answering, Task-conditioned reasoning, Conditional reasoning
    Published
    2023-03-26
    Appears in
    SpringerLink
    http://dx.doi.org/10.1007/978-3-031-29126-5_5
