Research Article
Large Scale Cross-media Data Retrieval based on Hadoop
@ARTICLE{10.4108/eai.19-8-2015.2260108, author={Wenchen Cheng and Jiang Qian and Zhicheng Zhao and Fei Su}, title={Large Scale Cross-media Data Retrieval based on Hadoop}, journal={EAI Endorsed Transactions on Cloud Systems}, volume={1}, number={2}, publisher={EAI}, journal_a={CS}, year={2015}, month={9}, keywords={cross-media; image retrieval; hadoop; mapreduce}, doi={10.4108/eai.19-8-2015.2260108} }
Authors: Wenchen Cheng, Jiang Qian, Zhicheng Zhao, Fei Su
Year: 2015
Journal: EAI Endorsed Transactions on Cloud Systems (CS)
Publisher: EAI
DOI: 10.4108/eai.19-8-2015.2260108
Abstract
With the rapid development of the Internet and the fast growth of data volumes, more and more applications are data intensive, often involving hundreds of megabytes of data. Retrieval results from cross-media data must therefore be obtained quickly and accurately. In this paper, we propose a large-scale cross-media data retrieval method based on Hadoop to speed up retrieval. We divide cross-media feature extraction and cross-media retrieval into a parallel pipeline and implement it with a combination of HDFS, HBase and the MapReduce framework. To verify the performance of the proposed method, we compare it with a stand-alone mode on image datasets of different sizes. The experimental results show that the proposed method sharply reduces running time while keeping the same query precision.
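The abstract does not describe the implementation in detail, so the following is only a minimal sketch of how a MapReduce-style retrieval step could look, not the authors' method. It assumes features are plain numeric vectors compared by Euclidean distance, and it simulates the map and reduce phases in ordinary Python rather than on Hadoop. The names `map_partition`, `reduce_topk` and `retrieve`, and the sample item ids, are hypothetical.

```python
import heapq
import math
from typing import Dict, List, Tuple

Feature = List[float]
Scored = Tuple[float, str]  # (distance to query, item id)


def euclidean(a: Feature, b: Feature) -> float:
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def map_partition(partition: Dict[str, Feature], query: Feature, k: int) -> List[Scored]:
    """Map step: score every item in one data partition, keep the local top-k."""
    scored = ((euclidean(vec, query), item_id) for item_id, vec in partition.items())
    return heapq.nsmallest(k, scored)


def reduce_topk(partials: List[List[Scored]], k: int) -> List[Scored]:
    """Reduce step: merge the local top-k lists into one global top-k."""
    merged = (pair for part in partials for pair in part)
    return heapq.nsmallest(k, merged)


def retrieve(partitions: List[Dict[str, Feature]], query: Feature, k: int) -> List[str]:
    """Run the map step on each partition (sequentially here), then reduce."""
    partials = [map_partition(p, query, k) for p in partitions]
    return [item_id for _, item_id in reduce_topk(partials, k)]


if __name__ == "__main__":
    # Two partitions stand in for data blocks stored on different nodes.
    partitions = [
        {"img_a": [0.0, 0.0], "img_b": [5.0, 5.0]},
        {"img_c": [1.0, 0.0], "img_d": [9.0, 9.0]},
    ]
    print(retrieve(partitions, [0.0, 0.0], k=2))  # ['img_a', 'img_c']
```

Because each partition returns its own k best candidates, the merged result equals the top-k of a single sequential scan. This is one way a distributed run can keep the same query precision as a stand-alone run while spreading the distance computations across nodes.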
Copyright © 2015 W. Cheng et al., licensed to EAI. This is an open access article distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/3.0/), which permits unlimited use, distribution and reproduction in any medium so long as the original work is properly cited.