Kernel space: the vmsplice() exploit
A recent Linux security hole allows local users to
seize the power of root. Here's how three separate
bugs came together to create one big vulnerability.
By Jonathan Corbet, LinuxWorld.com
February 19, 2008 07:47 PM ET
As this is being written, distributors are working quickly to ship kernel updates fixing the local root vulnerabilities
in the vmsplice() system call. Unlike a number of other recent vulnerabilities that have required special situations (such as the presence
of specific hardware) to exploit, these vulnerabilities are trivially exploited, and the code to do so is circulating on the
net. The author found himself wondering how such a wide hole could find its way into the core kernel code, so he set himself
the task of figuring out just what was going on - a task which took rather longer than he had expected.
The splice() system call, remember, is a mechanism for creating data flow plumbing within the kernel. It can be used to join two file
descriptors; the kernel will then read data from one of those descriptors and write it to the other in the most efficient
way possible. So one can write a trivial file copy program which opens the source and destination files, then splices the
two together. The vmsplice() variant connects a file descriptor (which must be a pipe) to a region of user memory; it is in this system call that the
problems came to be.
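The trivial file copy mentioned above might be sketched as follows. This is a minimal user-space illustration, not code from the kernel or from the exploit: the helper name splice_copy() and the 64KB chunk size are assumptions made for the example. Note that splice() wants a pipe on one side of each call, so the sketch routes the data through an intermediate pipe.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy everything from in_fd to out_fd by splicing
 * through an intermediate pipe. Returns the number of bytes copied,
 * or -1 on error. out_fd must not have been opened with O_APPEND.
 */
static ssize_t splice_copy(int in_fd, int out_fd)
{
    int p[2];
    ssize_t total = 0;

    if (pipe(p) < 0)
        return -1;

    for (;;) {
        /* Source file -> pipe; 65536 is an arbitrary chunk size. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, 65536, SPLICE_F_MOVE);
        if (n < 0) {
            total = -1;
            break;
        }
        if (n == 0)
            break;              /* end of the source file */

        /* Pipe -> destination; loop until this chunk is drained. */
        while (n > 0) {
            ssize_t w = splice(p[0], NULL, out_fd, NULL, (size_t)n,
                               SPLICE_F_MOVE);
            if (w <= 0) {
                total = -1;
                goto out;
            }
            n -= w;
            total += w;
        }
    }
out:
    close(p[0]);
    close(p[1]);
    return total;
}
```

A caller would open the source for reading and the destination for writing, then call splice_copy(in_fd, out_fd) and check for a negative return.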
The first step in understanding this vulnerability is to recognize that it is, in fact, three separate bugs. When word of this problem
first came out, it was thought to affect only 2.6.23 and 2.6.24 kernels. Changes to the vmsplice() code had caused the omission of a couple of important permission checks. In particular, if the application had requested
that vmsplice() move the contents of a pipe into a range of memory, the kernel didn't check whether that application had the right to write
to that memory. So the exploit could simply write a code snippet of its choice into a pipe, then ask the kernel to copy it
into a piece of kernel memory. Think of it as a quick-and-easy rootkit installation mechanism.
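To make the missing check concrete, here is a simplified user-space model of the kind of range test that should stand between a user-supplied destination address and a kernel copy. It is only a sketch in the spirit of the kernel's access_ok() check: the function name model_range_is_user() and the MODEL_USER_LIMIT boundary are hypothetical, and the real kernel logic is architecture-specific.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical boundary between user and kernel addresses (assumption). */
#define MODEL_USER_LIMIT UINT64_C(0x0000800000000000)

/*
 * Return true only if [addr, addr + len) lies entirely below the
 * user/kernel boundary. The comparison is arranged so that addr + len
 * is never computed and therefore cannot wrap around.
 */
static bool model_range_is_user(uint64_t addr, uint64_t len)
{
    return len <= MODEL_USER_LIMIT && addr <= MODEL_USER_LIMIT - len;
}
```

In this model, a destination in the upper, kernel half of the address space is rejected before any data moves; the vulnerable code path described above performed no such test on the write target.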
If the application is, instead, splicing a memory range into a pipe, the kernel must first read in one or more iovec structures describing that memory range. The 2.6.23 vmsplice() changes omitted a check on whether the purported iovec structures were in readable memory. This looks more like an information disclosure vulnerability than anything else - though,
as we will see, it can be hard to tell sometimes.
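For readers who have not used the memory-to-pipe direction, the sketch below shows what a legitimate caller hands to the kernel: an array of iovec structures, each holding a base address and a length. The wrapper name buffer_to_pipe() is an assumption made for this example; vmsplice() and struct iovec are the real user-space interfaces.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: describe one user buffer with a single iovec
 * and splice it into the write end of a pipe. Returns the number of
 * bytes accepted by the pipe, or -1 on error.
 */
static ssize_t buffer_to_pipe(int pipe_wr, void *buf, size_t len)
{
    struct iovec iov = {
        .iov_base = buf,    /* start of the memory range */
        .iov_len  = len,    /* its length in bytes */
    };

    /* The kernel must read this iovec array before it can do anything. */
    return vmsplice(pipe_wr, &iov, 1, 0);
}
```

The kernel has to fetch the iovec array from the caller's address space before touching the buffers it describes, which is why a missing readability check on that array matters.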
These two vulnerabilities (CVE-2008-0009 and CVE-2008-0010) were patched in the 2.6.23.15 and 2.6.24.1 kernel updates, released on February 8.
On February 10, Niki Denev pointed out that the kernel still appeared to be vulnerable after the fix. In fact, the remaining vulnerability was the result of a different problem
- and it is a much worse one, in that kernels all the way back to 2.6.17 are affected. As a result, a large proportion of
running Linux systems are vulnerable. This one has been fixed in the 2.6.22.18, 2.6.23.16, and 2.6.24.2 kernels, also released on the 10th. With luck, all of these bugs have now been firmly stomped - though we still
need to see a lot of distributor updates.
The problem, once again, is in the memory-to-pipe implementation. The function get_iovec_page_array() is charged with finding a set of struct page pointers corresponding to the array of iovec structures passed in by the calling application. Those pointers are stored in this array: