1st International ICST Conference on Scalable Information Systems

Research Article

Performance evaluation of linux file systems for data warehousing workloads

  • @INPROCEEDINGS{10.1145/1146847.1146890,
        author={Peter Wai Yee Wong and Ric Hendrickson and Haider Rizvi and Steve Pratt},
        title={Performance evaluation of linux file systems for data warehousing workloads},
        booktitle={1st International ICST Conference on Scalable Information Systems},
        publisher={ACM},
        proceedings_a={INFOSCALE},
        year={2006},
        month={6},
        keywords={},
        doi={10.1145/1146847.1146890}
    }
    
Peter Wai Yee Wong1,*, Ric Hendrickson1,*, Haider Rizvi2,*, Steve Pratt1,*
  • 1: IBM, 11501 Burnet Road, Austin, TX 78758, USA
  • 2: IBM, 8200 Warden Ave, Markham, ON L6G 1C7, Canada
*Contact email: wpeter@us.ibm.com, richhend@us.ibm.com, haider@ca.ibm.com, slpratt@us.ibm.com

Abstract

Many database users store data on raw or block devices for performance reasons, since doing so bypasses the file system's caching and locking. However, many database users would prefer to use file systems for ease of long-term maintenance. To our knowledge, there has been no major effort to systematically assess the performance of Linux file systems for database workloads. In this paper, we present our initial performance study on data warehousing systems. We first provide a brief introduction to various Linux file systems, namely Ext2, Ext3, ReiserFS, XFS and JFS. We examine the performance impact of asynchronous I/O, direct I/O, file caching, I/O schedulers, file fragmentation, and database storage methods. We then quantify the performance of these Linux file systems using a well-known data warehousing workload. Finally, we recommend system configurations and suggest future work.