11th EAI International Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness

Research Article

A Cloud-based Efficient On-line Analytical Processing System with Inverted Data Model

  • @INPROCEEDINGS{10.4108/eai.19-8-2015.2261409,
        author={Sheng-Wei Huang and Ce-Kuen Shieh and Che-Ching Liao and Chui-Ming Chiu and Ming-Fong Tsai and Lien-Wu Chen},
        title={A Cloud-based Efficient On-line Analytical Processing System with Inverted Data Model},
        proceedings={11th EAI International Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness},
        publisher={IEEE},
        proceedings_a={QSHINE},
        year={2015},
        month={9},
        keywords={cloud-based inverted data model},
        doi={10.4108/eai.19-8-2015.2261409}
    }
    
  • Sheng-Wei Huang, Ce-Kuen Shieh, Che-Ching Liao, Chui-Ming Chiu, Ming-Fong Tsai, Lien-Wu Chen (2015). A Cloud-based Efficient On-line Analytical Processing System with Inverted Data Model. QSHINE. IEEE. DOI: 10.4108/eai.19-8-2015.2261409
Sheng-Wei Huang¹, Ce-Kuen Shieh¹, Che-Ching Liao¹, Chui-Ming Chiu¹, Ming-Fong Tsai²,*, Lien-Wu Chen²
  • 1: Institute of Computer and Communication Engineering, Department of Electrical Engineering, National Cheng Kung University
  • 2: Department of Information Engineering and Computer Science, Feng Chia University
*Contact email: mingfongtsai@gmail.com

Abstract

On-line analytical processing (OLAP) provides analysis of multi-dimensional data stored in a database and has achieved great success in many applications such as sales, marketing, and financial data analysis. OLAP operations are a dominant part of data analysis, especially when handling large amounts of data. With the emergence of the MapReduce paradigm and cloud technology, OLAP operations can be processed on big data that resides in scalable, distributed storage. However, current MapReduce implementations of OLAP operation processing have a major performance drawback caused by an improper processing procedure. This drawback becomes critical when the dimension or dependent attributes are large, which is common in most data warehouses today. To solve this issue, this paper proposes a methodology to accelerate OLAP operation processing on big data. We conducted experiments on the basic algebra of OLAP operations with different data sizes to demonstrate the effectiveness of our system.