Research Article
Performance evaluation of Linux file systems for data warehousing workloads
@INPROCEEDINGS{10.1145/1146847.1146890,
  author={Peter Wai Yee Wong and Ric Hendrickson and Haider Rizvi and Steve Pratt},
  title={Performance evaluation of linux file systems for data warehousing workloads},
  proceedings={1st International ICST Conference on Scalable Information Systems},
  publisher={ACM},
  proceedings_a={INFOSCALE},
  year={2006},
  month={6},
  keywords={},
  doi={10.1145/1146847.1146890}
}
Peter Wai Yee Wong
Ric Hendrickson
Haider Rizvi
Steve Pratt
Year: 2006
INFOSCALE
ACM
DOI: 10.1145/1146847.1146890
Abstract
Many database users store data on raw or block devices for performance reasons, since file caching and file locking by the file system can be bypassed. However, many database users would prefer to use file systems for the ease of long-term maintenance. To our knowledge, there have not been any major efforts to systematically assess the performance of Linux file systems for database workloads. In this paper, we present our initial performance study on data warehousing systems. We first provide a brief introduction to various Linux file systems, namely Ext2, Ext3, ReiserFS, XFS and JFS. We examine the performance impact of asynchronous I/O, direct I/O, file caching, I/O schedulers, file fragmentation, and database storage methods. We then quantify the performance of these Linux file systems utilizing a well-known data warehousing workload. Finally, system configurations are recommended and future work is suggested.