Network World
Tuesday, November 10, 2009

Emerging Linux filesystems

Today's abundance of hard drive space looks like a blessing, but it's a looming crisis in disguise. Disks are growing faster than existing filesystems and tools can manage, check, and repair them. We run a battery of benchmarks on the latest inventions from filesystem-land.

Introduction

Most people think of a filesystem as “that thing that keeps my files organized in directories and lets me set permissions on them.” In other words, it's just a tool. But there is an implicit bargain here: in exchange for trusting the software with our files, we expect a filesystem to store and deliver our data quickly, and never to corrupt any of it.

The problems that filesystem developers face are, to be frank, getting much harder. Disks keep growing, while seek times are not keeping pace with the amount of data a drive has to seek across. Some filesystem developers also want to detect and recover from hardware errors that corrupt your data, the kind that can occur when your hard disk starts to fail, for instance.

The fsck problem

At the 2007 Linux Storage and File Systems Workshop, Valerie Henson reported on the “fsck problem”: seek times are not keeping pace with disk sizes, which leads to inordinate caffeine consumption while waiting for the “fsck” filesystem check utility to run after a crash or power failure. Earlier, in a paper for a USENIX conference in 2006, she wrote, “Recently the main server for kernel.org, which hosts several years' worth of Linux kernel archives, suffered file system corruption at the RAID level; running fsck on the (journaling) ext3 file system took over a week, more than the time required to restore the entire file system from backup.”

Brandon Phillips reports at LWN that Henson illustrated this with her own home partition: a very modest 37GB, barely 60% full. Using Seagate's estimates for advances in disk technology, she predicted that the current 8-minute fsck time would grow to 80 minutes for a partition 16 times as large, simply because seek times improve far more slowly than disk capacity or bandwidth to and from disk.
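To see why a larger disk means a disproportionately longer check, here is a rough back-of-the-envelope sketch. It is not Henson's model: the function name, the split of fsck time into a seek-bound part and a transfer-bound part, and the speedup factors in the example call are all illustrative assumptions. Only the 8-minute starting point and the 16-fold growth in partition size come from the figures above.

```python
def projected_fsck_minutes(current_minutes, seek_fraction,
                           capacity_growth, seek_speedup, bandwidth_speedup):
    """Scale a measured fsck time to a larger, newer disk (toy model).

    seek_fraction is the assumed share of today's run spent waiting on
    seeks; the remainder is treated as sequential transfer time.
    """
    seek_part = current_minutes * seek_fraction
    transfer_part = current_minutes * (1.0 - seek_fraction)
    # The amount of work grows with capacity; each component then shrinks
    # only by its own hardware speedup.
    return (seek_part * capacity_growth / seek_speedup
            + transfer_part * capacity_growth / bandwidth_speedup)


if __name__ == "__main__":
    # Hypothetical factors: seeks get 1.2x faster, bandwidth 5x faster.
    for fraction in (0.0, 0.5, 1.0):
        minutes = projected_fsck_minutes(8, fraction, 16, 1.2, 5)
        print(f"seek-bound share {fraction:.0%}: about {minutes:.0f} minutes")
```

Under these assumed factors, the more of the check that is spent seeking, the closer the projected time gets to growing in step with the disk itself, which is the heart of the fsck problem.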

Because of the fsck problem and other issues, new Linux filesystems are emerging, but no single next-generation contender has yet pulled ahead. The purpose of this article is to examine the performance of these emerging Linux filesystems, so I chose a mix of real-world tasks and synthetic benchmarks to test and stress each of the filesystems I worked with.

Copyright 2008 Network World Inc.