Domen Puncer Kugler

February 16, 2023

Rustproofing Linux (Part 4/4 Shared Memory)

This is the final post in a four-part blog series that starts with Rustproofing Linux (Part 1/4 Leaking Addresses).

Shared memory is often used to share data without the performance hit of copying. Whenever a shared resource is consumed by one component while being modified by another component, there is potential for Time-Of-Check-Time-Of-Use (TOCTOU) or Double Fetch vulnerabilities. In these examples we focus on the case where double fetching occurs in the kernel and the software changing that data is in userspace, making this an avenue for user-to-kernel privilege escalation. However, note that this same type of vulnerability could exist when accessing memory that is shared between a device driver and a peripheral, two userspace processes, hypervisor and kernel, etc.

As a side note, double fetch vulnerabilities can also arise from compiler-introduced problems.

Our vulnerable example is a bit contrived for the sake of brevity, but it should illustrate a common buggy pattern of shared memory usage:

static int vuln_open(struct inode *ino, struct file *filp)
{
    struct file_state *state;

    state = kzalloc(sizeof(*state), GFP_KERNEL);
    if (!state)
        return -ENOMEM;

    state->page = alloc_pages(GFP_KERNEL | __GFP_ZERO, 0);

A memory page is allocated

static int vuln_mmap(struct file *filp, struct vm_area_struct *vma)
{
    struct file_state *state = filp->private_data;
    int ret = 0;

    ret = vm_map_pages_zero(vma, &state->page, 1);
    return ret;
}

The page is mapped into userspace

static long vuln_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
    struct file_state *state = filp->private_data;
    volatile u32 *sh_buf = page_to_virt(state->page);
    u8 tmp_buf[32];

    switch (cmd) {
    case VULN_PROCESS_BUF:
        if (sh_buf[0] <= sizeof(tmp_buf)) {
            memcpy(tmp_buf, (void *)&sh_buf[1], sh_buf[0]);

Data is read from shared memory

The vulnerability is in reading sh_buf[0] twice. If memory contents change between the reads, this could lead to a buffer overflow of tmp_buf.

We created a PoC that changes the value of sh_buf[0] between the two fetches by repeatedly rewriting the memory contents in one process while calling vuln_ioctl in the other:

    volatile u32 *buf = mmap(NULL, LEN, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    int child = fork() == 0;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(child, &set);
    if (sched_setaffinity(getpid(), sizeof(set), &set) < 0) {
        perror("sched_setaffinity error");
        return -1;
    }

    if (child) {
        while (1) {
            buf[0] = 32;
            buf[0] = 128;
        }
    } else {
        while (1) {
            ioctl(fd, VULN_PROCESS_BUF, 0);
        }
    }

One process changing memory contents, the other calling VULN_PROCESS_BUF

When this PoC is executed, KASAN reports the vulnerability as a 128-byte out-of-bounds write.

Porting to Rust

The code we ported to Rust looks similar, but is guided by mm::virt::Area and pages::Pages abstractions. This starts with the mmap implementation:

fn mmap(state: &Self, _file: &File, vma: &mut mm::virt::Area) -> Result {
    vma.insert_page(vma.start(), &state.mutable.lock().page)?;

    Ok(())
}

mmap() callback implementation in Rust

The mmap method we implement for file::Operations has a vma: &mut mm::virt::Area argument. This struct has only one member, a pointer to C’s struct vm_area_struct, but that member is private, so we need to use the only available method to create a mapping, insert_page().

insert_page() requires a pages::Pages<0> argument, and similarly we don’t get access to the underlying struct page and are limited to provided methods to access the memory contents:

fn ioctl(state: &Self, _file: &File, cmd: &mut IoctlCommand) -> Result<i32> {
    let (cmd, _arg) = cmd.raw();
    match cmd {
        VULN_PROCESS_BUF => {
            let mut tmp_buf = Box::try_new([0u8; 32])?; // on heap

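            let page = &state.mutable.lock().page;

            let mut size = 0u32;
            // First fetch of the length word, used for the bounds check
            unsafe { page.read(&mut size as *mut u32 as _, 0, 4)? };
            // &*tmp_buf measures the 32-byte array rather than the Box pointer
            if size as usize <= core::mem::size_of_val(&*tmp_buf) {
                // Second fetch of the same word: the intentional double fetch
                unsafe { page.read(&mut size as *mut u32 as _, 0, 4)? };
                unsafe { page.read(tmp_buf.as_mut_ptr(), 4, size as usize)? };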

                if tmp_buf[0] == 'A' as u8 {
                    return Ok(0);
                }
            }

ioctl() callback using Pages<0>::read() to read memory

Compare the two page.read() calls that fetch size above with the C version, where the first word of the shared buffer is accessed simply as sh_buf[0]. The two calls are identical, and the second serves no purpose other than to intentionally introduce a TOCTOU vulnerability, so we believe it would be very unusual for a developer to write this. Thus, it seems unlikely that such TOCTOU vulnerabilities would be naively ported from C to Rust.
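For contrast, a non-vulnerable handler would typically copy the length and payload out of the shared page once and then validate only its private copy. The following is a minimal userspace sketch of that idea, not the kernel API: MockPage, its read() method and process_buf are hypothetical stand-ins, loosely modelled on the copying style of Pages::read() shown above.

```rust
/// Hypothetical userspace stand-in for a shared page. Like the kernel
/// abstraction discussed above, it only offers a copying read.
pub struct MockPage {
    pub bytes: [u8; 64],
}

impl MockPage {
    /// Copies `dst.len()` bytes starting at `offset` into `dst`.
    pub fn read(&self, dst: &mut [u8], offset: usize) -> Result<(), &'static str> {
        let end = offset.checked_add(dst.len()).ok_or("offset overflow")?;
        let src = self.bytes.get(offset..end).ok_or("read past end of page")?;
        dst.copy_from_slice(src);
        Ok(())
    }
}

/// Snapshots the 4-byte length and up to 32 payload bytes in a single read,
/// then validates and uses only the private snapshot.
pub fn process_buf(page: &MockPage) -> Result<([u8; 32], usize), &'static str> {
    let mut snapshot = [0u8; 4 + 32];
    page.read(&mut snapshot, 0)?;

    // The length is parsed from the snapshot, so later writes to the shared
    // page cannot change it between the check and the copy.
    let size =
        u32::from_ne_bytes([snapshot[0], snapshot[1], snapshot[2], snapshot[3]]) as usize;

    let mut tmp_buf = [0u8; 32];
    if size > tmp_buf.len() {
        return Err("size exceeds tmp_buf");
    }
    tmp_buf[..size].copy_from_slice(&snapshot[4..4 + size]);
    Ok((tmp_buf, size))
}
```

Because every later decision is made on the snapshot, a concurrent writer can at worst supply stale or inconsistent payload bytes; it cannot enlarge the copy after the bounds check has passed.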

Variant Using Raw Pointers

In the above port, the abstractions prevented us from dereferencing a memory pointer like we did in C. Since Rust is a systems language with unsafe escape hatches, we should be able to bypass the Pages abstraction and directly use the C struct page it contains. In our experiment we created our own copy of Pages, called ExposedPages, and used core::mem::transmute to effectively cast a Pages reference into our new type.

VULN_PROCESS_BUF => {
    let mut tmp_buf = Box::try_new([0u8; 32])?; // on heap

    let page = &state.mutable.lock().page;
    // page.pages is private, page.kmap() is private, tricks required
    let page: &ExposedPages = unsafe { core::mem::transmute(page) };
    let sh_buf: *mut u32 = unsafe { bindings::kmap(page.pages) } as _;

    // XXX assembly shows this will be only one access to *sh_buf
    if unsafe { *sh_buf } as usize <= tmp_buf.len() {
        unsafe { core::ptr::copy(sh_buf.offset(1) as *mut u8, tmp_buf.as_mut_ptr(), *sh_buf as _) };

        if tmp_buf[0] == 'A' as u8 {
            return Ok(0);
        }
    }

Dereferencing a raw pointer to access shared memory

This PoC is closer to the C-language version (sh_buf[0] in C code could also be written as *sh_buf, so that part could be identical), but since we can’t just mark the pointer as volatile, the compiler optimises out the second *sh_buf. For those interested, a full example is provided.

Variant With Volatile Pointer Access

While Rust has no volatile keyword, it offers equivalent per-access semantics through core::ptr::read_volatile() and core::ptr::write_volatile().

Our next variation uses read_volatile instead of pointer dereference:

if unsafe { read_volatile(sh_buf) } as usize <= tmp_buf.len() {
    unsafe { copy(sh_buf.offset(1) as *mut u8, tmp_buf.as_mut_ptr(), read_volatile(sh_buf) as _) };

Using core::ptr::read_volatile

This does trigger the TOCTOU vulnerability, and one could find it plausible for a developer to use read_volatile(sh_buf) twice instead of declaring a temporary variable.
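The temporary-variable alternative mentioned above could look roughly like the sketch below. It is a userspace illustration under stated assumptions rather than the driver code: process_buf is a hypothetical helper, and sh_buf is assumed to point at a length word followed by the payload, as in the examples above.

```rust
use core::ptr::{copy_nonoverlapping, read_volatile};

/// Copies a length-prefixed payload from `sh_buf` into `tmp_buf`.
/// The length word is fetched exactly once; the local copy is used for
/// both the bounds check and the copy length.
///
/// # Safety
/// `sh_buf` must point to at least `4 + tmp_buf.len()` readable bytes.
pub unsafe fn process_buf(sh_buf: *const u32, tmp_buf: &mut [u8; 32]) -> Option<usize> {
    // Single volatile fetch: later changes to shared memory cannot affect `len`.
    let len = unsafe { read_volatile(sh_buf) } as usize;
    if len > tmp_buf.len() {
        return None;
    }
    // The payload starts right after the 4-byte length word.
    unsafe { copy_nonoverlapping(sh_buf.add(1) as *const u8, tmp_buf.as_mut_ptr(), len) };
    Some(len)
}
```

The payload itself may still change while it is being copied, so any checks on its contents should be made against tmp_buf, never against the shared page.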

We have also explored accessing raw contents of mm::virt::Area instead of pages::Pages, but the source code then becomes even more like C, and uses more C bindings.

Takeaways

The ways we have tried to access shared memory in a vulnerable way all felt a bit forced or contrived, and did not feel like idiomatic Rust. Rust abstractions require us to read memory in a way that makes a double fetch more obvious. While the abstractions can be bypassed, even a cursory code inspection should pick up the unsafe block with transmute and later also a read_volatile, making sure that the code would be harshly reviewed, and maybe even removed.

Overall Conclusions

To conclude this four-part blog series (one, two, three, four), we note that Rust brings some very nice features to the table. Writing Linux device drivers in Rust will almost certainly improve the kernel’s overall security posture.

However, the security improvements in the Rust language are not free or completely automatic. Porting C code to Rust is a non-trivial matter that has its own set of unique pitfalls. We believe that Rust is a tool that still requires considerable expertise from its users if they are to avoid shooting themselves in the foot. As we’ve shown, naïve ports from C to Rust may still exhibit vulnerabilities.

While it is easy to spot the unsafe keyword when auditing Rust code, thoroughly inspecting and documenting it requires a deeper understanding of Rust and the driver code. Even with all unsafe blocks removed (or proven to be memory safe) there’s still potential for other vulnerabilities, although those will probably be less severe, since by design they should not be related to memory safety.

In particular, we wish to highlight the MutexGuard usage caveat that we discussed in post #2 – while the automatic unlock at the guard variable’s end of life is very nice, one should be aware of patterns like the demonstrated .lock() method chaining, where we produced a race condition because a mutex was unlocked between two guarded variable accesses.
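The method-chaining caveat can be illustrated in userspace with std::sync::Mutex, used here as a stand-in for the kernel mutex (an assumption; the kernel type's lock() does not return a Result). State, read_chained and read_guarded are hypothetical names for this sketch.

```rust
use std::sync::Mutex;

pub struct State {
    pub len: usize,
    pub data: Vec<u8>,
}

/// Racy pattern: each `.lock()` creates a temporary guard that is dropped at
/// the end of its statement, so the mutex is released between the accesses.
pub fn read_chained(m: &Mutex<State>) -> Option<u8> {
    let len = m.lock().unwrap().len; // guard dropped at the end of this line
    let idx = len.checked_sub(1)?;
    // Another thread may modify `data` here, making `idx` stale.
    let last = m.lock().unwrap().data.get(idx).copied();
    last
}

/// Safer pattern: bind the guard once so both fields are read under the
/// same critical section.
pub fn read_guarded(m: &Mutex<State>) -> Option<u8> {
    let guard = m.lock().unwrap();
    let idx = guard.len.checked_sub(1)?;
    guard.data.get(idx).copied()
}
```

In a single thread both functions return the same value; they differ only when another thread takes the lock in the window that read_chained leaves open.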

From our experimentation, integer overflows as well as shared memory accesses seem to be less likely causes of vulnerabilities, since the programmer needs to go out of their way to introduce a bug.

Finally, leaking kernel addresses seems to be as easy as always. While the benefits of KASLR are questioned by some already, the bypasses probably won’t go away either.

We hope the future is less buggy and software more secure. As Rust gets used more in the Linux kernel, we predict that the security research community will start to discover new manifestations of traditional driver vulnerabilities. Collectively, we probably need more time to discover these new vulnerability patterns, and better tools are likely needed to automatically detect and eliminate them.

Acknowledgements

Thanks to Miguel Ojeda, Alex Gaynor, Gary Guo and other Rust for Linux maintainers for valuable insights.

Special thanks to Jeremy Boone for all his help and suggestions.
