Most people think of a filesystem as “that thing that keeps my files organized in directories, and lets me set permissions on them”. In other words, it's just a tool. But there is also an implicit bargain here: in exchange for trusting the software with our files, we expect a filesystem to store and deliver the data we need quickly, and never to corrupt any of it.
The problems that filesystem developers face are, to be frank, growing a lot harder. Disk sizes keep increasing, while seek times are not keeping pace with the amount of disk that has to be searched. Some filesystem developers also want to detect and recover from hardware errors that corrupt your data, the kind that can happen when a hard disk starts to fail.
The fsck problem
At the 2007 Linux Storage and File Systems Workshop, Valerie Henson reported on the “fsck problem” caused by seek times not keeping pace with disk sizes, leading to inordinate caffeine consumption while waiting for the “fsck” filesystem check utility to run after a crash or power failure. Previously, in a paper for a Usenix conference in 2006, she wrote, “Recently the main server for kernel.org, which hosts several years' worth of Linux kernel archives, suffered file system corruption at the RAID level; running fsck on the (journaling) ext3 file system took over a week, more than the time required to restore the entire file system from backup.”
Brandon Phillips reports at LWN that Val illustrated this with her barely 60% full home partition, a very modest 37GB. Using Seagate's estimates for advances in disk technology, she predicts the current 8-minute fsck time will expand out to 80 minutes for a partition 16 times as large, simply because seek times do not keep pace with bandwidth to and from disk.
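To see why a 16-fold increase in capacity need not mean a 16-fold increase in fsck time, here is a minimal sketch of one way to reason about it. It is a toy model, not the calculation Henson used: it assumes an fsck run splits into a seek-bound part and a transfer-bound part, that the work in both grows with capacity, and that each part speeds up only by its own hardware improvement. The 8-minute baseline and the 16x capacity come from the text above; the function name, the speedup factors and the seek-bound fraction are placeholders chosen for illustration.

```python
def projected_fsck_minutes(base_minutes, capacity_growth, seek_speedup,
                           bandwidth_speedup, seek_bound_fraction):
    """Scale a measured fsck time to a larger, faster disk (toy model)."""
    if not 0.0 <= seek_bound_fraction <= 1.0:
        raise ValueError("seek_bound_fraction must be between 0 and 1")
    # Split the measured time into the part spent seeking and the part
    # spent streaming data off the disk.
    seek_part = base_minutes * seek_bound_fraction
    transfer_part = base_minutes * (1.0 - seek_bound_fraction)
    # The amount of work grows with capacity; each part shrinks only by
    # the improvement in the hardware characteristic that limits it.
    return (seek_part * capacity_growth / seek_speedup
            + transfer_part * capacity_growth / bandwidth_speedup)


# Baseline (8 minutes) and capacity growth (16x) are from the article;
# the two speedup factors below are assumed values, not measurements.
for fraction in (0.0, 0.5, 1.0):
    minutes = projected_fsck_minutes(
        base_minutes=8,
        capacity_growth=16,
        seek_speedup=1.2,
        bandwidth_speedup=5.0,
        seek_bound_fraction=fraction,
    )
    print(f"seek-bound fraction {fraction:.1f}: about {minutes:.0f} minutes")
```

Under these assumed factors, the more of the check that is limited by seeks, the closer the projected time gets to scaling with the full capacity increase, which is the heart of the fsck problem.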
Because of the fsck problem and other issues, new Linux filesystems are being developed, but no single next-generation contender has pulled ahead. The purpose of this article is to examine the performance of these emerging Linux filesystems, so I chose a mix of real work tasks and synthetic benchmarks to test and stress each of them.
If your favorite experimental filesystem isn't tested here, please accept my apologies. I chose the ones that I had already experimented with (ZFS/FUSE, NILFS, btrfs) or that were already getting a fair amount of attention (reiser4, ext4, ChunkFS). I also threw a test of ZFS under OpenSolaris into the mix, and would have included ZFS under FreeBSD 7 if the installer had been able to recognize the PCI ID of the Adaptec RAID card.
All the filesystems examined here use one or more allocation strategies when writing and arranging data on disk.
Files and directories in block-based filesystems are constructed from one or more fixed-size chunks of disk (“blocks”). This means that if an existing file is extended after another file has been written (or after the filesystem has been in use for some time), its blocks can end up scattered across the platter (“fragmentation”), resulting in a performance penalty when reading or writing.
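The fragmentation scenario described above can be sketched with a toy allocator. This is a minimal illustration, not how any of the filesystems tested here actually allocate: the class `ToyBlockDisk`, its methods and the file names are all hypothetical, and the policy (always hand out the lowest-numbered free blocks) is a deliberately naive assumption.

```python
class ToyBlockDisk:
    """A disk of fixed-size blocks, handed out lowest-free-block-first."""

    def __init__(self, total_blocks):
        self.free = list(range(total_blocks))  # sorted free block numbers
        self.files = {}                        # file name -> list of block numbers

    def append(self, name, block_count):
        """Extend a file by block_count blocks, taking the lowest free ones."""
        if block_count > len(self.free):
            raise OSError("no space left on toy disk")
        taken, self.free = self.free[:block_count], self.free[block_count:]
        self.files.setdefault(name, []).extend(taken)

    def extents(self, name):
        """Group a file's blocks into contiguous (start, length) runs."""
        runs = []
        for block in self.files[name]:
            if runs and runs[-1][0] + runs[-1][1] == block:
                # Block directly follows the previous run: grow that run.
                runs[-1] = (runs[-1][0], runs[-1][1] + 1)
            else:
                # Gap on disk: the file continues somewhere else.
                runs.append((block, 1))
        return runs


disk = ToyBlockDisk(total_blocks=16)
disk.append("a.log", 3)   # a.log gets blocks 0-2
disk.append("b.dat", 3)   # b.dat gets blocks 3-5
disk.append("a.log", 2)   # a.log is extended and lands on blocks 6-7
print(disk.extents("a.log"))  # [(0, 3), (6, 2)]
```

The extended file ends up in two separate runs with another file's blocks in between, so reading it from start to finish costs an extra seek. On a real disk that has been in use for some time, the same effect can leave a file in many such pieces.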