3rd International ICST Conference on Scalable Information Systems

Research Article

Dynamic Data Organization for Bitmap Indices

Download327 downloads
  • @INPROCEEDINGS{10.4108/ICST.INFOSCALE2008.3554,
        author={Tan Apaydin and Guadalupe Canahuate and Hakan Ferhatosmanoglu and Ali Saman Tosun},
        title={Dynamic Data Organization for Bitmap Indices},
        proceedings={3rd International ICST Conference on Scalable Information Systems},
        publisher={ICST},
        proceedings_a={INFOSCALE},
        year={2010},
        month={5},
        keywords={},
        doi={10.4108/ICST.INFOSCALE2008.3554}
    }
    
  • Tan Apaydin
    Guadalupe Canahuate
    Hakan Ferhatosmanoglu
    Ali Saman Tosun
    Year: 2010
    Dynamic Data Organization for Bitmap Indices
    INFOSCALE
    ICST
    DOI: 10.4108/ICST.INFOSCALE2008.3554
Tan Apaydin1,*, Guadalupe Canahuate1,*, Hakan Ferhatosmanoglu1,*, Ali Saman Tosun2,*
  • 1: Department of CSE, The Ohio State University
  • 2: Department of CS, University of Texas at San Antonio
*Contact email: apaydin@cse.ohiostate.edu, canahuat@cse.ohiostate.edu, hakan@cse.ohiostate.edu, tosun@cs.utsa.edu

Abstract

Bitmap indices have been successfully used in scientific databases and data warehouses. Run-length encoding is commonly used to generate smaller size bitmaps that do not require explicit decompression for query processing. For static data sets, compression is shown to be greatly improved by data reordering techniques that generate longer and fewer runs. However, these data reorganization methods are not applicable to dynamic and very large data sets because of their significant overhead. In this paper, we present a dynamic data structure and algorithm for organizing bitmap indices for better compression and query processing performance. Our scheme enforces a compression rate close to the optimum for a target ordering of the data which results in fast query response time. For our experiments, we use Gray code ordering as the tuple ordering strategy. However, the proposed scheme efficiently works for any desired ordering strategy. Experimental results show that the proposed framework provides better compression and query execution time than the traditional approaches.