
Research Article
An Overview of Multimodal Fusion Learning
@INPROCEEDINGS{10.1007/978-3-031-23902-1_20, author={Fan Yang and Bo Ning and Huaiqing Li}, title={An Overview of Multimodal Fusion Learning}, booktitle={Mobile Multimedia Communications. 15th EAI International Conference, MobiMedia 2022, Virtual Event, July 22-24, 2022, Proceedings}, series={MOBIMEDIA}, publisher={Springer}, year={2023}, month={2}, keywords={Multimodal learning, Multimodal fusion, Deep learning}, doi={10.1007/978-3-031-23902-1_20} }
- Fan Yang
- Bo Ning
- Huaiqing Li
Year: 2023
An Overview of Multimodal Fusion Learning
MOBIMEDIA
Springer
DOI: 10.1007/978-3-031-23902-1_20
Abstract
With the rapid development of modern science and technology, information sources have become more widely available and more diverse in form, generating widespread interest in multimodal learning. Because humans draw on many types of information when understanding the world and perceiving objects, a single modality cannot capture everything about a given object or phenomenon. Multimodal fusion learning opens up new avenues for deep learning tasks, enabling approaches to many real-world problems that are more principled and closer to human perception. A central challenge confronting multimodal learning today is how to fuse multimodal features efficiently while preserving the integrity of each modality's information and minimizing information loss. This paper summarizes the definition and development of multimodality; briefly analyzes and discusses the main approaches to multimodal fusion, common models, and current applications; and concludes with future development trends and research directions in the context of existing technologies.