Zero-day emerges
On February 9, zero-day exploit code [1] was posted on the milw0rm site. It exploits
a vulnerability in Linux kernel versions 2.6.17 through 2.6.24.1 that allows
an unprivileged local user to gain root privileges. The vulnerability was
assigned CVE-2008-0600.
There are reports that this exploit is reliable and actively used in the wild.
The inner workings of this exploit are quite interesting from a
technical point of view; let’s have a look.
Details on the vulnerability and methods of exploitation
The vulnerability lies in the get_iovec_page_array function (in fs/splice.c; line numbers are from the 2.6.23.1-42.fc8 kernel), which is reachable from the vmsplice() system call:
    1286:           if (unlikely(!len))    // "len" variable is under user's control
    1287:                   break;
    ...
    1296:           off = (unsigned long) base & ~PAGE_MASK;
    ...
    1306:           npages = (off + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
    1307:           if (npages > PIPE_BUFFERS - buffers)
    1308:                   npages = PIPE_BUFFERS - buffers;
    1309:
    1310:           error = get_user_pages(current, current->mm,
    1311:                                  (unsigned long) base, npages, 0, 0,
    1312:                                  &pages[buffers], NULL);
The get_user_pages function expects its fourth argument (the number of page descriptors to fill, which limits the return value) to be at least 1. The preceding code assumes that npages is at least 1: len must be nonzero, so the off + len + PAGE_SIZE - 1 expression should be greater than or equal to PAGE_SIZE. However, if len is close to UINT32_MAX, the off + len + PAGE_SIZE - 1 computation wraps around, and npages can be zero.
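The wrap is easy to reproduce in user space. The following sketch is an illustration rather than kernel code: it repeats the npages computation in 32-bit unsigned arithmetic (matching unsigned long on a 32-bit kernel) and, for comparison, in 64-bit arithmetic. The 4 KB page size and the helper names npages_32, npages_64, and wrap_demo are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 4 KB pages, as on x86. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint32_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Hypothetical helper: the kernel's npages expression evaluated in
 * 32-bit unsigned arithmetic, which wraps modulo 2^32. */
uint32_t npages_32(uint32_t base, uint32_t len)
{
    uint32_t off = base & ~PAGE_MASK;   /* offset into the first page */
    return (off + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* Hypothetical helper: the same expression in 64-bit arithmetic,
 * wide enough that it cannot wrap for 32-bit inputs. */
uint64_t npages_64(uint32_t base, uint32_t len)
{
    uint64_t off = base & ~PAGE_MASK;
    return (off + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* Checks a few sample inputs; aborts if any expectation fails. */
void wrap_demo(void)
{
    assert(npages_32(0, 1) == 1);                 /* small len: one page */
    assert(npages_32(0, PAGE_SIZE + 1) == 2);     /* just over one page  */
    assert(npages_32(0, UINT32_MAX) == 0);        /* wrapped: zero pages */
    assert(npages_64(0, UINT32_MAX) == 0x100000); /* true count: 2^20    */
}
```

Calling wrap_demo() from a main function should return silently: with len equal to UINT32_MAX, the 32-bit result is 0 even though the requested range spans 2^20 pages, so the "at least 1" assumption no longer holds.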
As a result, get_user_pages may return more than PIPE_BUFFERS entries, and the pages array will overflow. However, the overflow payload is not controlled by the attacker, so it would be difficult to turn this overflow into reliable code execution.
Reliable exploitation is made possible by the subsequent loop:
    1320:           for (i = 0; i < error; i++) {
    1321:                   const int plen = min_t(size_t, len, PAGE_SIZE - off);
    1322:
    1323:                   partial[buffers].offset = off;
    1324:                   partial[buffers].len = plen;
    1325:
    1326:                   off = 0;
    1327:                   len -= plen;
    1328:                   buffers++;
    1329:           }
Here, the partial array, which is also PIPE_BUFFERS elements long, is overflowed with (off=0, plen=0x1000) pairs. Depending on the variable layout chosen by the compiler, various data structures that follow the partial array can be overwritten with zeros. In the most common case, the pages array is located right after the partial array. The pages array contains pointers, so after the preceding loop it will contain NULL pointers.
Normally, when the kernel tries to access a NULL pointer, the result is an exception and the process is terminated. However, the attacker can map memory pages at address zero and store arbitrary data there. In that scenario, when the kernel dereferences pointers from the pages array, it processes attacker-controlled data, which may result in arbitrary code execution in the kernel context. In our case, the convenient technique is to make an entry in the pages array look like a compound page descriptor, which results in a function call to an attacker-controlled address in user space:
    37 static void put_compound_page(struct page *page) /* attacker controls arg */
    38 {
    39         page = (struct page *)page_private(page);
    40         if (put_page_testzero(page)) {
    41                 void (*dtor)(struct page *page);
    42
    43                 dtor = (void (*)(struct page *))page[1].lru.next;
    44                 (*dtor)(page); /* so attacker controls the target of the call */
    45         }
    46 }
To sum up, the exploitation involves:
- integer overflow
- buffer overflow
- mapping the zero address to allow NULL dereference
Workarounds
A kernel upgrade is the preferred solution, but if that is not feasible, there
are workarounds.
A simple kernel module that disables the sys_vmsplice system call has been posted [2].
The exploit we’ve discussed relies heavily on the ability to map memory at address zero. Starting with kernel 2.6.23, there is a mechanism to forbid such mappings via procfs. The echo 65536 > /proc/sys/vm/mmap_min_addr command sets the lowest address at which memory can be mapped to 64K. Note that:
- SELinux must be enabled (in enforcing mode) for this command to take effect.
- Although this setting certainly makes the current exploit fail, there is a nonzero probability that the vulnerability can be exploited without mapping the zero address. I know of no code capable of such exploitation; however, it cannot be ruled out.
- This setting may prevent exploitation of future NULL pointer dereference vulnerabilities. Very few programs make legitimate use of mapping the zero address.
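To check whether the setting actually took effect on a given machine, one can read the value back and then attempt the mapping that the exploit depends on. The sketch below is a diagnostic written under assumptions: the helper names read_mmap_min_addr and can_map_page_zero are hypothetical, and the probe only maps and immediately unmaps a single anonymous page at address zero.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns the value of vm.mmap_min_addr,
 * or -1 if the procfs file is missing or unreadable. */
long read_mmap_min_addr(void)
{
    long value = -1;
    FILE *f = fopen("/proc/sys/vm/mmap_min_addr", "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Hypothetical helper: returns 1 if this process may map address zero,
 * 0 if the kernel refuses the mapping. */
int can_map_page_zero(void)
{
    size_t size = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap((void *)0, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        return 0;
    munmap(p, size);   /* release the probe mapping right away */
    return 1;
}
```

If read_mmap_min_addr() reports 65536 but can_map_page_zero() still returns 1, the limit is not being enforced for that process. Run the probe as the unprivileged user of interest, since privileged processes may be exempt from the limit.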