cs 15(2): e5

Research Article

Large Scale Cross-media Data Retrieval based on Hadoop

  • @ARTICLE{10.4108/eai.19-8-2015.2260108,
        author={Wenchen Cheng and Jiang Qian and Zhicheng Zhao and Fei Su},
        title={Large Scale Cross-media Data Retrieval based on Hadoop},
        journal={EAI Endorsed Transactions on Cloud Systems},
        volume={1},
        number={2},
        publisher={EAI},
        journal_a={CS},
        year={2015},
        month={9},
        keywords={cross-media; image retrieval; hadoop; mapreduce},
        doi={10.4108/eai.19-8-2015.2260108}
    }
    
  • Wenchen Cheng
    Jiang Qian
    Zhicheng Zhao
    Fei Su
    Year: 2015
    Large Scale Cross-media Data Retrieval based on Hadoop
    CS
    EAI
    DOI: 10.4108/eai.19-8-2015.2260108
Wenchen Cheng1,*, Jiang Qian1, Zhicheng Zhao1, Fei Su1
  • 1: Beijing University of Posts and Telecommunications
*Contact email: vCheng_dmt@163.com

Abstract

With the rapid development of the Internet and the rapid growth of data volumes, more and more data-intensive applications are emerging, often involving hundreds of megabytes of data. It is therefore important to obtain retrieval results from cross-media data quickly and accurately. In this paper, we propose a large-scale cross-media data retrieval method based on Hadoop to speed up retrieval. We divide cross-media feature extraction and cross-media retrieval into parallel pipelines and implement them with a combination of HDFS, HBase, and the MapReduce framework. To verify the performance of the proposed method, we compare it with a stand-alone mode on image datasets of different sizes. The experimental results show that the proposed method sharply reduces time consumption while keeping the same query precision.
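
The abstract does not give implementation details, so the following is only a minimal sketch of how a retrieval step can be split into map and reduce phases. It runs in plain Python with no Hadoop dependency. The feature vectors, the Euclidean distance, the top-k selection, and every name in it (`FEATURES`, `map_phase`, `reduce_phase`, `retrieve`) are assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict
import heapq
import math

# Hypothetical toy feature store: image id -> feature vector.
# In a Hadoop deployment these records would be read from HDFS or HBase;
# here they are kept in memory so the sketch is self-contained.
FEATURES = {
    "img_a": [1.0, 0.0],
    "img_b": [0.9, 0.1],
    "img_c": [0.0, 1.0],
    "img_d": [0.5, 0.5],
}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def map_phase(split, query):
    """Map: emit (key, (distance, image_id)) for every record in one split."""
    for image_id, vector in split:
        yield "q", (euclidean(vector, query), image_id)


def reduce_phase(pairs, k):
    """Reduce: group emitted pairs by key and keep the k smallest distances."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: heapq.nsmallest(k, values) for key, values in grouped.items()}


def retrieve(query, k=2, num_splits=2):
    """Simulate one job: split the records, map each split, reduce to top-k ids."""
    records = sorted(FEATURES.items())
    splits = [records[i::num_splits] for i in range(num_splits)]
    emitted = [pair for split in splits for pair in map_phase(split, query)]
    return [image_id for _, image_id in reduce_phase(emitted, k)["q"]]
```

Because each split is mapped independently, the ranking does not depend on how many splits are used. This is consistent with the abstract's claim that the distributed version keeps the same query precision as the stand-alone mode.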