eBPF Adventures: Fiddling with the Linux Kernel and Unix Domain Sockets


tl;dr

eBPF (extended Berkeley Packet Filter) is slowly taking over as a programmatic way for (generally privileged) users to invoke Linux kernel APIs and performantly execute semi-arbitrary code without having to load it from a custom kernel module. eBPF is a general means to load memory-safe, restricted code that reduces the risk of the crashes, deadlocks, and infinite loops inherent to the kernel module alternative.

In this post, we describe how to effectively use eBPF to trace Linux kernel functions. We also discuss how we implemented our eBPF-based tracing tool that can sniff Unix domain sockets across an entire Linux host (this “impossible” task was what got us started with eBPF). You can install it with sudo -H pip install unixdump (or from our repo), but it requires BCC to be installed separately. But, if you just want to see some of what it can do, click here.

If you’re looking to get into building custom eBPF-based Linux kernel tracing tools, we recommend starting with BCC, busting out its reference guide, and pinning a tab to the Linux kernel codebase.

Background

Unix domain sockets[1] are a core OS-provided IPC (inter-process communication) mechanism that enable processes on the same host to communicate through send(2)/sendto(2)– and recv(2)/recvfrom(2)-able file descriptors similarly to network sockets. Unix domain sockets use a file path (or in the case of Linux’s abstract namespace, a key string) as the bind(2)/connect(2) “address;” additionally, “unnamed” Unix domain sockets may be created in connected pairs through the socketpair(2) syscall.

Traditionally, Unix domain sockets are often used when applications or services require bidirectional communication between related processes or communication between unrelated ones that may enforce OS-backed permission checks (though the checks differ between Unixes[2]). Depending on the internal implementation, Unix domain sockets are often significantly more performant than loopback-networking as they do not go through the networking stack.

One downside of this is that Unix domain sockets are extremely difficult to inspect as there is no simple interface or API for intercepting Unix domain socket traffic, such as the pcap(3) APIs for network traffic. Instead, when needing to observe Unix domain socket traffic, one often resorts to interposing a forwarding daemon with a middle Unix domain socket that the connect(2)/sendto(2)-ing peer will actually communicate with. In the case where file descriptors are being passed or peer credentials are being validated, and it may be overly complicated to run such a daemon with the right process information, function hooking (typically LD_PRELOAD-alikes) may be used to interdict libc stubs used to invoke the communication syscalls. Both methods have drawbacks and additionally require more than a cursory understanding of how an application is already using Unix domain sockets in order to intercept them. Additionally, neither method scales across an entire host and may only be used to individually intercept Unix domain socket traffic for single applications and services. Intercepting all Unix domain socket traffic across a host will require dumping the data directly from the kernel; we can do this by tracing the kernel.
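For readers who have not used them directly, the two addressing styles described above (a filesystem path, and an unnamed connected pair) can be sketched with Python's standard socket module. This is only an illustration of the IPC mechanism itself, not of unixdump; the socket file name demo.sock and both function names are made up for the example.

```python
import os
import socket
import tempfile


def echo_once_over_path(payload: bytes) -> bytes:
    """Send payload over a path-bound Unix stream socket; return what the server read."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)   # the filesystem path acts as the "address"
            server.listen(1)
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)        # queued in the backlog until accept()
                conn, _ = server.accept()
                with conn:
                    client.sendall(payload)
                    return conn.recv(len(payload))


def echo_once_over_pair(payload: bytes) -> bytes:
    """Send payload over an unnamed, pre-connected pair (socketpair(2))."""
    a, b = socket.socketpair()  # AF_UNIX by default on Unix platforms
    with a, b:
        a.sendall(payload)
        return b.recv(len(payload))
```

Neither endpoint in this sketch has any way to observe the other applications' sockets on the host, which is the gap the rest of this post is concerned with.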

Kernel Tracing and Instrumentation

Generally, kernel tracing utilities enable one to obtain very basic information about the execution within the OS kernel. Most implementations provide the ability to dump metadata about executing functions, including their arguments and return values. Often, this data is very limiting as the relevant information is embedded within structs and arrays for which only the pointer address is returned. Fewer OS kernels directly provide instrumentation APIs that enable deeper information to be gleaned from executing functions and kernel memory. The gold standard for such functionality is DTrace. DTrace was originally developed for Solaris, and has since been ported to FreeBSD and macOS (where, unfortunately, it has been purposefully weakened to align with other DRM-related changes and will deny attempts to trace binaries from core system directories). DTrace has also been ported to Linux, twice; once as a /proc/kallsyms-based kernel module that implements its own function hooking, and, more recently, by Oracle, who relicensed it under the GPL (for kernel code). Unfortunately, the former is essentially unmaintained and fails on current Linux distributions, and the latter is effectively locked to Oracle Linux at the moment and it is unclear if it will be accepted into upstream.

Due to the long period without DTrace on Linux and the eventual realization of the need to have useful debugging features within the kernel, Linux has played fast and loose with a number of tracing frameworks that mostly constitute dead ends. Chief among the failures is SystemTap, which has a painful installation process requiring installation of kernel debug symbols and uses a frustrating NIH clone of the DTrace scripting syntax to dynamically generate kernel modules. Needless to say, one is probably better off writing their own function hooking kernel modules. Sysdig, on the other hand, is a useful tool (but doing anything fancy might involve Lua scripting…). Unfortunately, it is not capable of drilling down into arbitrary kernel structures in memory as it is based entirely on Linux’s built-in tracepoint definitions (primarily those for syscalls and process scheduling events), which provide only specific pre-defined values that the kernel’s developers felt would be useful for debugging and tracing kernel functionality.

Linux’s Lego Problem

Linux’s in-kernel tracing features are very similar to other facets of modern Linux, specifically containers. Linux slowly gained a number of “namespacing” features that were eventually composed to form the “concept” of “containers,” an ape of purpose-built sandboxing technology such as FreeBSD’s jails and Solaris’ zones. Containers have only recently started to see legitimate security hardening materialize for normal users within the past few years, most notably with the introduction of user namespaces, a feature feared by Linux distros, but which wipes out a number of container escapes by isolating privileges to the container’s namespaces. Jessie Frazelle sums up the “concept” distinction nicely in her infamous blog post comparing them. We will only add that if you decide to make an ocean out of the Death Star set, it will cost $500 and your ocean will be a murky gray.

The Linux kernel has several different inter-related features that support dynamic instrumentation and tracing (for a more thorough introduction, see Julia Evans’ overview on Linux’s tracing systems):

  • kprobes: Kernel API via register_kprobe(struct kprobe*) that can register callback functions to handle a breakpoint injected at an arbitrary memory address (typically the start of a function). The handlers are provided a struct pt_regs* containing the architecture-specific register values.

  • ftrace: A function tracing API provided by the Linux kernel built on lower-level APIs such as kprobes and tracepoints, that provides a filesystem-based userland API (debugfs) to configure and enable various tracing and profiling operations.

  • perf (aka perf_events): Kernel API for hardware performance monitoring counters (e.g. number of instructions executed), timed sampling (e.g. find where in the callstack time is spent), and userspace mapped ring buffers via perf_event_open(2)/mmap(2) syscalls.

  • tracepoints: Kernel API with tracepoints declared through the TRACE_EVENT macro, inserted via calls to the tracepoint “function,” and callbacks registered through tracepoint_probe_register(struct tracepoint*,void*,void*). The TRACE_EVENT macro creates metadata useful for perf and ftrace to instrument by tracepoint name.

Some of these are implemented in terms of each other, and several of their subsystems interact with or support each other. But, up until recently, if you wanted to directly interact with any of these things in a meaningful way, you needed to write a bunch of GNU-flavored C in a kernel module and implement your own scheme for communicating data back to userspace (assuming you want something more performant than printk). This is essentially how every single tracing “framework” is implemented, and one of the common things they seem to share is that they all tend to set up their own ring buffer from scratch that is mmap(2)’d using a file descriptor obtained by open(2)-ing a special path registered by the module (implementations vary heavily, from use of cdev_add to mounted in-memory filesystems).

eBPF: Lego Grout

eBPF (extended Berkeley Packet Filter) is an in-kernel JIT-ing virtual machine that adds extra computational resources (more registers, direct ISA mapping to major CPU architectures, and fast C interop with internal Linux kernel functionality)3 to the classic BPF virtual machine and bytecode instruction set; it is referred to as bpf in the Linux source, APIs, syscalls, etc. eBPF is slowly taking over as a “programmatic” way for users, often privileged ones, to invoke Linux kernel APIs and execute “performant” code in kernel space (which limits context switches) without necessitating the development and loading of a custom kernel module in an unsafe or dangerous language. eBPF is not specifically a tracing or instrumentation feature, but a general means to load memory-safe, restricted code that reduces the risk of the crashes, deadlocks, and infinite loops inherent to the kernel module alternative. Given that eBPF itself has already introduced vulnerabilities (though, being honest, what revolutionary new feature hasn’t had bugs?), it is exacerbating the failings of the Linux capability “model” and raising concerns about the balance between functionality and security. But if the intention behind eBPF is to keep mortals from writing buggy kernel modules, it may well succeed and improve the security status quo in doing so.

eBPF is slowly but surely becoming a framework within the kernel with an ever-increasing menagerie of features, programming capabilities, kernel-backed (and userland-mapped) data structures, helper functions (for which an in-eBPF implementation would otherwise violate the safety restrictions), and pluggable APIs. These include multiple forms of packet and socket filtering and processing, an interesting API for adding encapsulation to packets based on route tables, shenanigans to hook bind(2) to “fix” broken apps that run in containers, and multiple APIs to attach them to all manner of kernel-based tracing and instrumentation mechanisms. The last of these is most relevant to our purposes, but we will nonetheless remind folks to stay safe and make sure to properly account for variable-length headers when processing packets with eBPF.

BCC: C to eBPF By Way of the Long Way Round

BCC (BPF Compiler Collection) is a toolkit for having userland code (generally Python) interact with kernel space eBPF code, and includes an LLVM-based cross-compilation toolchain that compiles C code into eBPF bytecode. At a very high level, this toolchain is based on using Clang’s RecursiveASTVisitor AST traversal library to modify the C code into a “suitable” format and then use LLVM’s eBPF backend to emit the bytecode. These modifications exist primarily to replace external memory accesses with equivalent eBPF memory accessing helper functions and expand other simplified C coding constructs such as BCC library functions and “magic” semantics introduced by BCC that are used to denote the eBPF attach target (e.g. kprobe__funcname to attach the eBPF-compiled function as a kprobe hook on funcname). For a slightly deeper dive into how BCC C code works and how eBPF kprobes are registered, see our talk given recently at the 35C3 conference.

BCC and eBPF Bytecode Validator Hell

To make eBPF “safe,” the Linux kernel validates all eBPF code before loading it. For example, eBPF code is not allowed to “loop” (to prevent infinite loops), so any attempt to run code at an address/offset before the currently running eBPF instruction will be deemed illegal (this restriction has been weakened slightly in newer versions of Linux to reduce code bloat, but is still enforced by default). Additionally, the eBPF validator may detect loops when there are none in the source code; this can happen due to compiler optimizations or because of faulty identification of normal function calling operations as being loop-like. As such, BCC C often uses unrolled loops and inlined functions. The eBPF validator additionally has a number of call-site validations and register taint tracking logic that attempt to ensure that helper functions, such as those used to manipulate memory-mapped tables and access kernel memory, are only passed “safe” argument values. This “logic” is problematic as it is often not thorough enough to properly determine value bounds. This problem is further complicated by the fact that BCC compiles code with -O2; most naïve attempts to make such bounds “more obvious” are likely to be optimized out by Clang. Additionally, updating BCC (and possibly the Linux kernel) may result in slightly different bytecode output that trips the validator. However, this is generally not the case for very simple code, such as that of the tools and examples code within the BCC repo itself. We have also observed errors when using certain Linux/BCC versions where the use of a bool function parameter was not tolerated in certain variants of our code (e.g. different filtering comparisons being applied) and integer types were not tolerated in the others; we originally had to solve this by using #ifdef magic to control the parameter’s type depending on the variant of the code until a BCC update unbroke it.
These issues are so pervasive that the BCC developers themselves appear to believe that certain tolerable code constructs, such as variable-length byte copies, are not possible in eBPF because the idiomatic C code is not accepted by the eBPF validator.
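To make the "no backward jumps" constraint concrete, here is a minimal sketch of one way to avoid loops entirely: generating the unrolled C text from the userland side before handing it to the compiler. The function name unrolled_copy and the emitted statements are purely illustrative assumptions; they are not taken from unixdump or BCC, and whether a given validator version accepts the resulting bytecode is exactly the kind of thing that varies.

```python
def unrolled_copy(dst: str, src: str, length: str, max_len: int) -> str:
    """Emit C statements that copy up to max_len bytes without any loop.

    dst, src and length are the names of C variables in the generated code;
    max_len is the compile-time upper bound on the copy.
    """
    lines = []
    for i in range(max_len):
        # Each index gets its own guarded statement, so the compiled code only
        # ever branches forward; there is no jump back to an earlier offset.
        lines.append(f"if ({i} < {length}) {{ {dst}[{i}] = {src}[{i}]; }}")
    return "\n".join(lines)


generated = unrolled_copy("dst", "src", "len", 3)
```

The obvious cost is code size: the generated text grows linearly with max_len, which is why the bound generally ends up being a tunable rather than a large constant.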

While these issues present challenges when attempting to develop portable BCC/eBPF-based tooling, it is useful to be able to disable these inane validations when simply attempting to quickly trace a kernel function and extract some interesting data. Unfortunately, the current validator implementation suffers from high coupling-low cohesion as the validation routine itself pre-processes the bytecode and configures the internal kernel data structures responsible for running it. As a result, the validation routine cannot be bypassed directly with a NOP or by stomping over its implementation with a return 0. Instead, one has to individually clip the strings of the eBPF validator’s golden fiddle by performing a number of nigh-surgical function hooks that will both bypass lower-level state validation checks and override registers with safe bound values. We have implemented a set of such hooks that have bypassed the more pernicious and maddening errors that we have experienced while writing our Unix domain socket sniffer tool. While we do not recommend using it in production as it can definitely lead to unstable and, more importantly, unsafe eBPF code, our yolo-ebpf kernel module can help in a pinch when attempting to reverse engineer applications on the fly. And if you still happen to hit an incorrect eBPF validator error while using it, please send us an issue. It is disappointing that such tomfoolery is needed in the first place, but eBPF and BCC are both relatively new and these things take time.

unixdump: An eBPF-based Unix Domain Socket Sniffer

unixdump is a full-featured utility for passively capturing Unix domain socket traffic from Linux hosts built on top of eBPF and BCC. It can capture all traffic across a host, including file descriptor transfers and Unix credential passes. unixdump supports fine-grained filtering based on Unix domain socket paths (including abstract namespace keys) and PIDs, and can perform both inclusive and exclusive filtering of PIDs. unixdump additionally supports outputting to readable log files amenable to extracting binary content (We are currently looking into outputting to the pcapng format, which can support ancillary data, but performantly and accurately timestamping events may pose a challenge).

Design and Implementation

As with other BCC-based tools, our userland event handling code is written in Python and our kernel space kprobe hook that generates events is written in C. Essentially, the C code is what performs the important operations; in our case, this is the retrieval and filtering of metadata and content from sockets and other kernel structures. This code then marshals the event data into a struct that is unpacked on the Python side. The Python code then processes the event stream into a more user-friendly data output.

This flow is implemented through the use of two ring buffers, one the perf_event ring buffer, and the other a custom ring buffer built on top of an eBPF map. Events are pushed to userspace through the perf_event ring buffer using perf_submit calls in the C code. The Python userspace code constantly poll(2)s file descriptors associated with these ring buffers to detect event submissions. The Python code then attempts to read the rest of the data from the asynchronously updated custom ring buffer mapped into userspace. Following this, the Python code processes the data and clears the custom ring buffer entry.

In unixdump, we are extracting the data sent over Unix domain sockets at one of the lowest possible levels, from the internal msghdr structs holding them. When the send syscall is invoked on a Unix domain socket, a msghdr parameter, msg, gets passed along. The data in the msghdr struct is contained within another structure, iov_iter, that is embedded into the msghdr as its msg_iter field. iov_iters can wrap several kernel buffer structures, but in our case, it uses the const struct iovec* iov union variant, which is a simple structure that contains a buffer base pointer, iov_base, and a buffer length, iov_len, that together refer to our Unix domain socket message content.
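The iovec layout described above (a base pointer plus a length) is the same shape as the userland struct iovec, so it can be modelled with ctypes for illustration. This is a userland analogue only: the class name IoVec, the helper read_iovec, and the max_len cap are assumptions made for the sketch, and the real extraction happens in kernel space against the kernel's own structures.

```python
import ctypes


class IoVec(ctypes.Structure):
    """Userland mirror of struct iovec: a buffer base pointer and a byte length."""
    _fields_ = [
        ("iov_base", ctypes.c_void_p),
        ("iov_len", ctypes.c_size_t),
    ]


def read_iovec(iov: IoVec, max_len: int) -> bytes:
    """Copy at most max_len bytes out of the buffer the iovec refers to."""
    n = min(iov.iov_len, max_len)  # clamp, much as a fixed-size capture slot would
    return ctypes.string_at(iov.iov_base, n) if n else b""


message = b"hello over AF_UNIX"
backing = ctypes.create_string_buffer(message)     # stand-in for the message buffer
iov = IoVec(ctypes.addressof(backing), len(message))
```

Reading through the pointer with an explicit, clamped length is the important part: the length comes from iov_len, never from scanning the buffer for a terminator.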

We extract this data using the bpf_probe_read() helper function, which acts as a “safe” memcpy enabling eBPF programs to read arbitrary kernel memory into their own memory space. An interesting quirk of how BCC works is that function calls to bpf_* functions, which are part of the kernel’s eBPF API, and other BCC-specific helper functions/methods (yes, “methods”) are rewritten using an LLVM-based code generation pass. This enables the helper functions to be translated into the appropriate bpf_call instructions and is additionally used to translate all kernel memory dereferences into bpf_probe_read() calls.

Unix domain sockets also allow processes to pass file descriptors to one another (SCM_RIGHTS), and authenticate their identity (or act on behalf of another) by passing user credentials (SCM_CREDENTIALS) through the kernel. This “ancillary data” takes the form of several cmsghdr structures and CMSG_DATA payloads lined up within a single byte buffer blob. This blob is pointed to by the void* msg_control field of the msghdr struct and the size_t msg_controllen field specifies the total size. To differentiate and identify the raw contents of the CMSG_DATA payloads, the cmsghdr struct stores metadata about the type and size of the data. For example, if the int cmsg_type field is SCM_RIGHTS, the particular CMSG is being used to pass file descriptors. An interesting quirk of the CMSG system in the kernel is that separate CMSG objects of the same type will be combined into one CMSG observed by the receiver. Like most ad-hoc data structures in the Linux kernel codebase, CMSG blobs are not simple to parse given eBPF’s constraints. In particular, these blobs are typically iterated through by using multiple layers of pointer shifting macros that embed a for-loop construct to iterate the initially unknown number of CMSG objects; it is worth keeping in mind that the msghdr.msg_controllen field refers to the byte length of the whole CMSG blob, and is used to ensure that CMSG objects are not iterated or processed past the end of the buffer allocated to them. To get around the eBPF limitations, we use CLI flag-based tunables to guide (hacky string concat) code generation of C code that statically iterates these blobs, if present, and copies the metadata and typing information into our ring buffer; we feel this was still less painful to implement than it would have been to copy the entire blob to userspace and process it in Python.
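The blob layout described above can be sketched as a small userland parser. It assumes the Linux cmsghdr layout (size_t cmsg_len, int cmsg_level, int cmsg_type, with payloads aligned to sizeof(size_t)) and the Linux values for SOL_SOCKET, SCM_RIGHTS and SCM_CREDENTIALS; parse_cmsgs, build_cmsg and the max_cmsgs bound are names invented for this example. The fixed iteration bound loosely mirrors the statically bounded iteration that eBPF forces, but this is not the code unixdump generates.

```python
import struct

CMSG_HDR = struct.Struct("@Nii")   # size_t cmsg_len; int cmsg_level; int cmsg_type
ALIGN = struct.calcsize("@N")      # payloads are aligned to sizeof(size_t)
SOL_SOCKET, SCM_RIGHTS, SCM_CREDENTIALS = 1, 1, 2  # Linux values (assumed)


def cmsg_align(n: int) -> int:
    """Round n up to the next multiple of the alignment, like CMSG_ALIGN."""
    return (n + ALIGN - 1) & ~(ALIGN - 1)


def build_cmsg(level: int, ctype: int, data: bytes) -> bytes:
    """Build one padded CMSG (header + payload) for testing the parser."""
    hdr_len = cmsg_align(CMSG_HDR.size)
    cmsg_len = hdr_len + len(data)
    raw = CMSG_HDR.pack(cmsg_len, level, ctype).ljust(hdr_len, b"\0") + data
    return raw.ljust(cmsg_align(cmsg_len), b"\0")


def parse_cmsgs(blob: bytes, max_cmsgs: int = 4):
    """Return up to max_cmsgs (level, type, payload) tuples from a CMSG blob."""
    out, off = [], 0
    for _ in range(max_cmsgs):                 # fixed bound instead of "until done"
        if off + CMSG_HDR.size > len(blob):
            break
        cmsg_len, level, ctype = CMSG_HDR.unpack_from(blob, off)
        # Never trust cmsg_len past the end of the blob (msg_controllen).
        if cmsg_len < CMSG_HDR.size or off + cmsg_len > len(blob):
            break
        start = off + cmsg_align(CMSG_HDR.size)
        out.append((level, ctype, bytes(blob[start:off + cmsg_len])))
        off += cmsg_align(cmsg_len)
    return out


# Two file descriptors, then a struct ucred-shaped payload (pid, uid, gid).
blob = (build_cmsg(SOL_SOCKET, SCM_RIGHTS, struct.pack("@2i", 3, 4))
        + build_cmsg(SOL_SOCKET, SCM_CREDENTIALS, struct.pack("@iII", 26525, 0, 0)))
parsed = parse_cmsgs(blob)
```

Note that a blob holding more CMSG objects than max_cmsgs is silently truncated here; a real tool would want to report that, which is one reason to expose the bound as a tunable.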

BCC, providing a userland interface on top of a kernel-only one, enables BCC C code to specify I/O data structures that map to the ones in <linux/bpf.h> through the use of BCC-provided macros. The primary benefit of these structures is that they may allocate much more storage space than is otherwise provided on the eBPF stack and they are considered a valid copy target for reading arbitrary kernel memory (there are some inconsistent “protections” around copying pointer addresses directly onto the eBPF stack). unixdump uses a BPF_PERCPU_ARRAY() to store (potentially large) message content as it enables easier iterating of ring buffer slots. For simpler one-off event notifications, BCC C supports setting and registering a perf_event output ring buffer-based output struct through BPF_PERF_OUTPUT(); the perf_submit() helper function may then be called on the output object declared by the macro. This function call is actually translated into a bpf_perf_event_output() helper function call through BCC’s code generation.

Dividing our message content and event metadata across these two storage mechanisms enables us to better tune memory usage; and detect, mitigate, and report when unixdump is bottlenecking against the system. One major difference between these two I/O mechanisms is that BPF_PERF_OUTPUT()-registered data structures will be automatically parsed/deserialized by BCC, whereas BCC-registered tables/arrays/maps will be provided to registered event handlers as raw byte buffers, necessitating the use of custom Python ctypes parsing logic. One major pain point to be aware of with this behavior of BCC is that in the former case, char[] fields will be parsed as NUL-terminated C strings, and all data after a NUL byte will be lost; it may not be recovered by using ctypes.string_at. The solution is to use uint8_t[] for non-C string data, as it will result in BCC reading all of the data.
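The char[] versus uint8_t[] difference can be reproduced with plain ctypes, independent of BCC. In this sketch the same eight raw bytes are viewed through two hypothetical structures, AsChars and AsBytes; only the field type differs.

```python
import ctypes

RAW = b"ab\x00cd\x00\x00\x00"  # binary payload with an embedded NUL byte


class AsChars(ctypes.Structure):
    # Equivalent of `char data[8];`
    _fields_ = [("data", ctypes.c_char * 8)]


class AsBytes(ctypes.Structure):
    # Equivalent of `uint8_t data[8];`
    _fields_ = [("data", ctypes.c_uint8 * 8)]


as_chars = AsChars.from_buffer_copy(RAW)
as_bytes = AsBytes.from_buffer_copy(RAW)

truncated = as_chars.data        # ctypes reads a char array field as a C string
complete = bytes(as_bytes.data)  # a uint8 array keeps every byte
```

The bytes are present in memory in both cases; it is the field accessor for char arrays that stops at the first NUL, which is why switching the declared type is enough to get the full content back.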

When writing unixdump, we quickly learned that display server traffic (e.g. X11) for terminal applications goes over Unix domain sockets. A naive implementation would quickly result in a feedback loop that would suck up memory and CPU resources. Since we wanted to avoid locking up the system, and since we also wanted to capture tunable amounts of data larger than the eBPF stack size (since eBPF cannot perform dynamic allocations), we went with a CLI-configurable ring buffer for content storage. The current implementation will simply drop events (but notify userland of the drop with additional metadata) if the ring buffer slot is still in use by the time it is needed again. We also do not directly perf_submit the large slots of the ring buffer as this resulted in a large number of kernel-dropped perf events. Instead we perf_submit smaller event metadata, which includes information like PIDs, socket paths, and the index of the ring buffer slot synced to userspace using the bpf_map_* APIs. This results in a slight “race condition” in that the ring buffer slot may not yet be accessible to userspace by the time we attempt to access it. However, this delay is not subject to the ABA problem or any similar use-after-free-like issues as the userland pages will have always been cleared by the userspace code prior to being updated by the kernel. We use a few fallback mechanisms to poll at it depending on whether or not event order preservation is necessary, but will give up after a few tries as we have observed complete losses of the data in some circumstances where the slot never updates. It is currently unclear if this is due to a flaw in BCC or the Linux kernel itself.
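The drop-when-busy behavior described above can be modelled in a few lines of single-threaded Python. This is a toy model under stated assumptions, not unixdump's implementation: the class name SlotRing, the choice to advance the write index even when an event is dropped, and the truncation to slot_size are all guesses made to keep the sketch small.

```python
class SlotRing:
    """Fixed number of fixed-size slots; a busy slot causes a drop, not a wait."""

    def __init__(self, slots: int, slot_size: int):
        self.slots = [None] * slots   # None marks a slot the consumer has cleared
        self.slot_size = slot_size
        self.next_index = 0
        self.dropped = 0

    def produce(self, data: bytes):
        """Producer ("kernel") side: return the slot index used, or None on drop."""
        idx = self.next_index
        self.next_index = (idx + 1) % len(self.slots)  # assumption: always advance
        if self.slots[idx] is not None:
            self.dropped += 1         # slot still in use: drop and count it
            return None
        self.slots[idx] = bytes(data[:self.slot_size])
        return idx

    def consume(self, idx: int):
        """Consumer ("userland") side: read the slot, then clear it for reuse."""
        data = self.slots[idx]
        self.slots[idx] = None
        return data


ring = SlotRing(slots=2, slot_size=4)
```

Because the consumer always clears a slot before the producer may write to it again, a reader can never mistake a stale payload for a fresh one; the cost of that guarantee is that a slow consumer shows up as counted drops.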

For better throughput, we perform various checks to determine whether it is worth it to continue processing. For example, we will bail out early if various validation checks do not pass (e.g. if certain metadata is missing or unexpected). On top of this, we provide a number of in-kernel filters to reduce and refine the amount of processing done in the kernel. Users can filter on specific Unix domain socket paths (or match path prefixes) and PIDs. It is also possible to exclude certain noisy PIDs altogether (e.g. the GUI terminal process rendering the output of its own Unix socket communication with the display server, or the display server itself). Using the filters will reduce the amount of data copied to the fixed-size perf_events ring buffer and therefore help to prevent it from overfilling and dropping (“missing” in the Linux parlance) events that cannot fit. We also support configuring the size of this buffer should stable throughput still be too much for the default size.
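As a rough userland model of the filtering decisions described above, the sketch below combines path-prefix matching with PID inclusion and exclusion. The function name should_capture, its parameters, and the exact precedence (exclusion first, then inclusion, then path prefix) are assumptions for illustration; the real filters run in kernel space and their semantics may differ.

```python
def should_capture(path: bytes, pids, *, path_prefixes=(),
                   include_pids=frozenset(), exclude_pids=frozenset()) -> bool:
    """Decide whether an event between the given PIDs on `path` is kept.

    path is bytes because abstract namespace keys may contain NUL bytes.
    An empty include set or prefix list means "no restriction".
    """
    if any(pid in exclude_pids for pid in pids):
        return False                  # bail out early for noisy processes
    if include_pids and not any(pid in include_pids for pid in pids):
        return False
    if path_prefixes and not any(path.startswith(p) for p in path_prefixes):
        return False
    return True
```

Ordering the cheapest rejections first matters more in the kernel than it does here, since every event that is rejected early is one that never touches the perf_events buffer.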

Case Study: Sniffing Frida C2 Traffic

Frida is a popular “cross-platform dynamic instrumentation toolkit” that injects a JavaScript interpreter into a target process and uses it to run a semi-DSL of JavaScript function hooking code. The impetus for unixdump was part of a greater desire to answer a seemingly simple question: “How does Frida work?” More specifically, we were looking to find out how Frida’s agent communication protocol works. At a high-level, Frida works by attaching to a target process using platform-specific debugging APIs (i.e. ptrace(2) on Linux, task_for_pid()/mach_vm_*() on macOS, and OpenProcess()/VirtualAllocEx()/WriteProcessMemory()/CreateRemoteThread() on Windows), and then uses them to inject an “agent” that runs within the target. This agent is what runs the instrumenting JavaScript code and performs the lower-level operations invoked by it (e.g. function hooking, memory reads/writes, etc.). While Frida does support non-interactively loading a single JavaScript file, its primary mode of operation involves the use of a “client” process that interacts with the agent running within the target. In addition to the protocol used for direct attachment, Frida also supports having the client connect to a frida-server instance that makes direct attachments to targets. While we are interested in the goings-on of the direct attachment/connection case, it is worth noting that a frida-server can be loaded into a given process through frida-gadget, and that the frida-server and client libraries support several protocols to enable remote attachment to hosts over TCP and mobile devices using a TCP-forwarding mechanism to connect to a frida-server on a USB-attached device (e.g. ADB’s TCP forwarding for Android, and usbmuxd TCP forwarding for iOS).

Through strace(1)-ing the client, it became clear early on that the direct attachment communication protocol had to be transported over Unix domain sockets with dynamically-generated paths. The problem was that multiple such Unix domain sockets are created, and it wasn’t clear which ones were being used. Additionally, because the Frida client is still ptrace(2)-ing the target, we cannot simply strace(1) it, as strace(1) uses ptrace(2) and a process can only be ptrace(2)-d by one tracer at a time. While we could have tried to hack up a sniffer by instrumenting the Frida client itself to hook its Unix domain socket I/O, this was not an ideal solution for a number of reasons, and we instead tried to solve the Unix domain socket traffic sniffing problem once and for all (on Linux at least). After we got the MVP version of the eBPF hooks running, it quickly became obvious that Frida uses DBus to serialize custom API calls over Unix domain sockets. In fact, outside of specific protocols to initialize direct connections between Frida clients, servers, and targets, pretty much all of Frida’s communications use the DBus protocol.
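For readers following along with their own captures, two parts of the DBus wire protocol are easy to decode by hand: the hex-encoded tokens in the text-based authentication handshake, and the 16-byte fixed prefix of each binary message. The sketch below is a minimal decoder based on the DBus specification's message layout; the function names are made up, and the sample bytes are copied from the capture shown later in this post.

```python
import struct


def decode_auth_hex(token: str) -> str:
    """Decode a hex-encoded token from a DBus AUTH/OK handshake line."""
    return bytes.fromhex(token).decode("ascii")


def parse_dbus_fixed_header(data: bytes) -> dict:
    """Parse the first 16 bytes of a DBus message."""
    endian = {ord("l"): "<", ord("B"): ">"}[data[0]]  # 'l' little, 'B' big endian
    msg_type, flags, version, body_len, serial, fields_len = struct.unpack_from(
        endian + "BBBIII", data, 1)
    padded_fields = (fields_len + 7) & ~7  # header is padded to an 8-byte boundary
    return {
        "type": msg_type,          # 1 = METHOD_CALL, 2 = METHOD_RETURN
        "flags": flags,
        "version": version,
        "body_length": body_len,
        "serial": serial,
        "fields_length": fields_len,
        "total_length": 16 + padded_fields + body_len,
    }


# First 16 bytes of the 156-byte GetAll message in the capture.
sample = bytes.fromhex("6C010001240000000100000068000000")
header = parse_dbus_fixed_header(sample)
```

Comparing total_length against the length unixdump reports for a message is a quick sanity check that a capture contains exactly one whole DBus message rather than a fragment or several concatenated ones.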

Frida Agent Script Communication Protocol

Using the code in the first example defined in the Frida documentation, we demonstrate unixdump’s ability to intercept Unix domain socket traffic. Knowing (from strace(1)) that Frida’s Unix domain socket paths begin with /tmp/frida, we can instruct unixdump to filter for messages starting with that path name:

sudo unixdump -s '/tmp/frida' -b

We then proceed to start our target binary, hello, and our Frida hook script, hook.py, passing the latter the function address ($FUNCTION_VALUE) taken from the hello process:

./hello &
./hook.py $FUNCTION_VALUE
  1. When Frida starts, it begins authenticating via DBus and the hooked process begins to send Unix credentials to identify itself (in this case, hello was run as root):
Output
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 1
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
ancillary data sent (attempted): 1 CMSG observed
  SCM_CREDENTIALS: pid=26525 uid=0(root) gid=0(root)

----
00000000: 00                                                .
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 6
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 41 55 54 48 0D 0A                                 AUTH..
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 46
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 52 45 4A 45 43 54 45 44  20 45 58 54 45 52 4E 41  REJECTED EXTERNA
00000010: 4C 20 41 4E 4F 4E 59 4D  4F 55 53 20 44 42 55 53  L ANONYMOUS DBUS
00000020: 5F 43 4F 4F 4B 49 45 5F  53 48 41 31 0D 0A        _COOKIE_SHA1..
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 18
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 41 55 54 48 20 45 58 54  45 52 4E 41 4C 20 33 30  AUTH EXTERNAL 30
00000010: 0D 0A                                             ..
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 37
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 4F 4B 20 36 37 36 39 37  34 36 38 37 35 36 32 32  OK 6769746875622
00000010: 65 36 33 36 66 36 64 32  66 36 36 37 32 36 39 36  e636f6d2f6672696
00000020: 34 36 31 0D 0A                                    461..
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 19
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 4E 45 47 4F 54 49 41 54  45 5F 55 4E 49 58 5F 46  NEGOTIATE_UNIX_F
00000010: 44 0D 0A                                          D..
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 15
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 41 47 52 45 45 5F 55 4E  49 58 5F 46 44 0D 0A     AGREE_UNIX_FD..
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 7
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 42 45 47 49 4E 0D 0A                              BEGIN..
  2. Afterwards, Frida performs a GetAll request of the DBus properties:
Output
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 156
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 01 00 01 24 00 00 00  01 00 00 00 68 00 00 00  l...$.......h...
00000010: 08 01 67 00 01 73 00 00  01 01 6F 00 1E 00 00 00  ..g..s....o.....
00000020: 2F 72 65 2F 66 72 69 64  61 2F 41 67 65 6E 74 53  /re/frida/AgentS
00000030: 65 73 73 69 6F 6E 50 72  6F 76 69 64 65 72 00 00  essionProvider..
00000040: 03 01 73 00 06 00 00 00  47 65 74 41 6C 6C 00 00  ..s.....GetAll..
00000050: 02 01 73 00 1F 00 00 00  6F 72 67 2E 66 72 65 65  ..s.....org.free
00000060: 64 65 73 6B 74 6F 70 2E  44 42 75 73 2E 50 72 6F  desktop.DBus.Pro
00000070: 70 65 72 74 69 65 73 00  1F 00 00 00 72 65 2E 66  perties.....re.f
00000080: 72 69 64 61 2E 41 67 65  6E 74 53 65 73 73 69 6F  rida.AgentSessio
00000090: 6E 50 72 6F 76 69 64 65  72 31 32 00              nProvider12.
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 151
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 01 00 01 1F 00 00 00  01 00 00 00 68 00 00 00  l...........h...
00000010: 08 01 67 00 01 73 00 00  01 01 6F 00 19 00 00 00  ..g..s....o.....
00000020: 2F 72 65 2F 66 72 69 64  61 2F 41 67 65 6E 74 43  /re/frida/AgentC
00000030: 6F 6E 74 72 6F 6C 6C 65  72 00 00 00 00 00 00 00  ontroller.......
00000040: 03 01 73 00 06 00 00 00  47 65 74 41 6C 6C 00 00  ..s.....GetAll..
00000050: 02 01 73 00 1F 00 00 00  6F 72 67 2E 66 72 65 65  ..s.....org.free
00000060: 64 65 73 6B 74 6F 70 2E  44 42 75 73 2E 50 72 6F  desktop.DBus.Pro
00000070: 70 65 72 74 69 65 73 00  1A 00 00 00 72 65 2E 66  perties.....re.f
00000080: 72 69 64 61 2E 41 67 65  6E 74 43 6F 6E 74 72 6F  rida.AgentContro
00000090: 6C 6C 65 72 31 32 00                              ller12.
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 48
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 02 01 01 08 00 00 00  02 00 00 00 18 00 00 00  l...............
00000010: 08 01 67 00 05 61 7B 73  76 7D 00 00 00 00 00 00  ..g..a{sv}......
00000020: 05 01 75 00 01 00 00 00  00 00 00 00 00 00 00 00  ..u.............
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 48
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 02 01 01 08 00 00 00  02 00 00 00 18 00 00 00  l...............
00000010: 08 01 67 00 05 61 7B 73  76 7D 00 00 00 00 00 00  ..g..a{sv}......
00000020: 05 01 75 00 01 00 00 00  00 00 00 00 00 00 00 00  ..u.............
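Each message after BEGIN starts with the 16-byte fixed header defined by the DBus wire protocol: an endianness flag (l for little-endian), the message type, flags, protocol version, body length, serial, and the length of the header field array. A rough sketch of decoding it follows, using the first 16 bytes of the first GetAll call above; parse_fixed_header and total_length are hypothetical helper names, not unixdump APIs.

```python
import struct
from typing import NamedTuple

MESSAGE_TYPES = {1: "METHOD_CALL", 2: "METHOD_RETURN", 3: "ERROR", 4: "SIGNAL"}


class FixedHeader(NamedTuple):
    msg_type: str
    flags: int
    version: int
    body_len: int
    serial: int
    fields_len: int


def parse_fixed_header(data: bytes) -> FixedHeader:
    """Decode the 16-byte fixed part of a DBus message header."""
    if len(data) < 16:
        raise ValueError("need at least 16 bytes")
    # Byte 0 selects the byte order: 'l' is little-endian, 'B' is big-endian.
    order = {b"l": "<", b"B": ">"}.get(data[0:1])
    if order is None:
        raise ValueError("not a DBus message")
    msg_type, flags, version, body_len, serial, fields_len = struct.unpack(
        order + "BBBIII", data[1:16]
    )
    name = MESSAGE_TYPES.get(msg_type, "UNKNOWN")
    return FixedHeader(name, flags, version, body_len, serial, fields_len)


def total_length(header: FixedHeader) -> int:
    """Fixed header, plus header fields padded to 8 bytes, plus body."""
    padded_fields = (header.fields_len + 7) // 8 * 8
    return 16 + padded_fields + header.body_len


# First 16 bytes of the 156-byte GetAll call in the capture above.
get_all_call = bytes.fromhex("6C 01 00 01 24 00 00 00 01 00 00 00 68 00 00 00")
header = parse_fixed_header(get_all_call)
print(header, total_length(header))
```

The computed total of 156 bytes matches the length that unixdump reports for that message, which is a handy sanity check when splitting a stream into individual messages.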
  1. Frida then instructs the agent to open a session and waits for a confirmation that the open succeeded:
Output
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 132
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 01 00 01 04 00 00 00  03 00 00 00 70 00 00 00  l...........p...
00000010: 08 01 67 00 03 28 75 29  00 00 00 00 00 00 00 00  ..g..(u)........
00000020: 01 01 6F 00 1E 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 50 72  a/AgentSessionPr
00000040: 6F 76 69 64 65 72 00 00  03 01 73 00 04 00 00 00  ovider....s.....
00000050: 4F 70 65 6E 00 00 00 00  02 01 73 00 1F 00 00 00  Open......s.....
00000060: 72 65 2E 66 72 69 64 61  2E 41 67 65 6E 74 53 65  re.frida.AgentSe
00000070: 73 73 69 6F 6E 50 72 6F  76 69 64 65 72 31 32 00  ssionProvider12.
00000080: 01 00 00 00                                       ....
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 132
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 04 01 01 04 00 00 00  03 00 00 00 70 00 00 00  l...........p...
00000010: 08 01 67 00 03 28 75 29  00 00 00 00 00 00 00 00  ..g..(u)........
00000020: 01 01 6F 00 1E 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 50 72  a/AgentSessionPr
00000040: 6F 76 69 64 65 72 00 00  03 01 73 00 06 00 00 00  ovider....s.....
00000050: 4F 70 65 6E 65 64 00 00  02 01 73 00 1F 00 00 00  Opened....s.....
00000060: 72 65 2E 66 72 69 64 61  2E 41 67 65 6E 74 53 65  re.frida.AgentSe
00000070: 73 73 69 6F 6E 50 72 6F  76 69 64 65 72 31 32 00  ssionProvider12.
00000080: 01 00 00 00                                       ....
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 02 01 01 00 00 00 00  04 00 00 00 10 00 00 00  l...............
00000010: 08 01 67 00 00 00 00 00  05 01 75 00 03 00 00 00  ..g.......u.....
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 148
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 01 00 01 1C 00 00 00  04 00 00 00 68 00 00 00  l...........h...
00000010: 08 01 67 00 01 73 00 00  01 01 6F 00 18 00 00 00  ..g..s....o.....
00000020: 2F 72 65 2F 66 72 69 64  61 2F 41 67 65 6E 74 53  /re/frida/AgentS
00000030: 65 73 73 69 6F 6E 2F 31  00 00 00 00 00 00 00 00  ession/1........
00000040: 03 01 73 00 06 00 00 00  47 65 74 41 6C 6C 00 00  ..s.....GetAll..
00000050: 02 01 73 00 1F 00 00 00  6F 72 67 2E 66 72 65 65  ..s.....org.free
00000060: 64 65 73 6B 74 6F 70 2E  44 42 75 73 2E 50 72 6F  desktop.DBus.Pro
00000070: 70 65 72 74 69 65 73 00  17 00 00 00 72 65 2E 66  perties.....re.f
00000080: 72 69 64 61 2E 41 67 65  6E 74 53 65 73 73 69 6F  rida.AgentSessio
00000090: 6E 31 32 00                                       n12.
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 48
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 02 01 01 08 00 00 00  05 00 00 00 18 00 00 00  l...............
00000010: 08 01 67 00 05 61 7B 73  76 7D 00 00 00 00 00 00  ..g..a{sv}......
00000020: 05 01 75 00 04 00 00 00  00 00 00 00 00 00 00 00  ..u.............
  1. Following this, Frida creates the JavaScript to be injected:
Output
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 251
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 01 00 01 83 00 00 00  05 00 00 00 68 00 00 00  l...........h...
00000010: 08 01 67 00 02 73 73 00  01 01 6F 00 18 00 00 00  ..g..ss...o.....
00000020: 2F 72 65 2F 66 72 69 64  61 2F 41 67 65 6E 74 53  /re/frida/AgentS
00000030: 65 73 73 69 6F 6E 2F 31  00 00 00 00 00 00 00 00  ession/1........
00000040: 03 01 73 00 0C 00 00 00  43 72 65 61 74 65 53 63  ..s.....CreateSc
00000050: 72 69 70 74 00 00 00 00  02 01 73 00 17 00 00 00  ript......s.....
00000060: 72 65 2E 66 72 69 64 61  2E 41 67 65 6E 74 53 65  re.frida.AgentSe
00000070: 73 73 69 6F 6E 31 32 00  00 00 00 00 00 00 00 00  ssion12.........
00000080: 76 00 00 00 0A 49 6E 74  65 72 63 65 70 74 6F 72  v....Interceptor
00000090: 2E 61 74 74 61 63 68 28  70 74 72 28 22 39 34 32  .attach(ptr("942
000000A0: 37 32 36 36 32 37 32 39  30 34 35 22 29 2C 20 7B  72662729045"), {
000000B0: 0A 20 20 20 20 6F 6E 45  6E 74 65 72 3A 20 66 75  .    onEnter: fu
000000C0: 6E 63 74 69 6F 6E 28 61  72 67 73 29 20 7B 0A 20  nction(args) {. 
000000D0: 20 20 20 20 20 20 20 73  65 6E 64 28 61 72 67 73         send(args
000000E0: 5B 30 5D 2E 74 6F 49 6E  74 33 32 28 29 29 3B 0A  [0].toInt32());.
000000F0: 20 20 20 20 7D 0A 7D 29  3B 0A 00                     }.});..
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 44
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 02 01 01 04 00 00 00  06 00 00 00 18 00 00 00  l...............
00000010: 08 01 67 00 03 28 75 29  00 00 00 00 00 00 00 00  ..g..(u)........
00000020: 05 01 75 00 05 00 00 00  01 00 00 00              ..u.........
  1. Frida then signals the agent to load the script:
Output
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 132
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 01 00 01 04 00 00 00  06 00 00 00 70 00 00 00  l...........p...
00000010: 08 01 67 00 03 28 75 29  00 00 00 00 00 00 00 00  ..g..(u)........
00000020: 01 01 6F 00 18 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00  03 01 73 00 0A 00 00 00  ..........s.....
00000050: 4C 6F 61 64 53 63 72 69  70 74 00 00 00 00 00 00  LoadScript......
00000060: 02 01 73 00 17 00 00 00  72 65 2E 66 72 69 64 61  ..s.....re.frida
00000070: 2E 41 67 65 6E 74 53 65  73 73 69 6F 6E 31 32 00  .AgentSession12.
00000080: 01 00 00 00                                       ....
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 02 01 01 00 00 00 00  07 00 00 00 10 00 00 00  l...............
00000010: 08 01 67 00 00 00 00 00  05 01 75 00 06 00 00 00  ..g.......u.....
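The bytes between the fixed header and the body are an array of (code, variant) header fields, each aligned to 8 bytes, which is where the object path, interface, member, and reply serial live. The sketch below decodes only the field types that appear in these captures (o, s, g, and u) and only little-endian messages; parse_header and from_hex are illustrative names of ours, and the two messages are the LoadScript call and its reply from the capture above.

```python
import struct

FIELD_NAMES = {1: "path", 2: "interface", 3: "member", 4: "error_name",
               5: "reply_serial", 6: "destination", 7: "sender", 8: "signature"}


def align(offset: int, boundary: int) -> int:
    return (offset + boundary - 1) // boundary * boundary


def parse_header(msg: bytes):
    """Return (serial, fields) for a little-endian DBus message."""
    if msg[0:1] != b"l":
        raise ValueError("this sketch only handles little-endian messages")
    serial, fields_len = struct.unpack_from("<II", msg, 8)
    offset, end = 16, 16 + fields_len
    fields = {}
    while offset < end:
        offset = align(offset, 8)                 # each (code, variant) struct is 8-aligned
        code, sig_len = msg[offset], msg[offset + 1]
        sig = msg[offset + 2:offset + 2 + sig_len].decode("ascii")
        offset += 2 + sig_len + 1                 # code, signature length, signature, NUL
        if sig in ("s", "o"):                     # uint32 length, bytes, NUL
            offset = align(offset, 4)
            (n,) = struct.unpack_from("<I", msg, offset)
            value = msg[offset + 4:offset + 4 + n].decode("utf-8")
            offset += 4 + n + 1
        elif sig == "g":                          # uint8 length, bytes, NUL
            n = msg[offset]
            value = msg[offset + 1:offset + 1 + n].decode("ascii")
            offset += 1 + n + 1
        elif sig == "u":                          # uint32
            offset = align(offset, 4)
            (value,) = struct.unpack_from("<I", msg, offset)
            offset += 4
        else:
            raise ValueError("unsupported field type: " + sig)
        fields[FIELD_NAMES.get(code, code)] = value
    return serial, fields


def from_hex(text: str) -> bytes:
    return bytes.fromhex("".join(text.split()))


load_script_call = from_hex("""
    6C 01 00 01 04 00 00 00 06 00 00 00 70 00 00 00
    08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00
    01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64
    61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31
    00 00 00 00 00 00 00 00 03 01 73 00 0A 00 00 00
    4C 6F 61 64 53 63 72 69 70 74 00 00 00 00 00 00
    02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61
    2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00
    01 00 00 00
""")
load_script_reply = from_hex("""
    6C 02 01 01 00 00 00 00 07 00 00 00 10 00 00 00
    08 01 67 00 00 00 00 00 05 01 75 00 06 00 00 00
""")

call_serial, call_fields = parse_header(load_script_call)
reply_serial, reply_fields = parse_header(load_script_reply)
print(call_serial, call_fields)
print(reply_serial, reply_fields)
```

The reply's reply_serial field equals the serial of the LoadScript call, which is how the short 32-byte responses throughout this walkthrough can be paired with the requests they acknowledge.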
  1. The injected script, when run, returns the requested data through the socket:
Output
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 180
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 04 01 01 2C 00 00 00  08 00 00 00 78 00 00 00  l...,.......x...
00000010: 08 01 67 00 07 28 75 29  73 62 61 79 00 00 00 00  ..g..(u)sbay....
00000020: 01 01 6F 00 18 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00  03 01 73 00 11 00 00 00  ..........s.....
00000050: 4D 65 73 73 61 67 65 46  72 6F 6D 53 63 72 69 70  MessageFromScrip
00000060: 74 00 00 00 00 00 00 00  02 01 73 00 17 00 00 00  t.........s.....
00000070: 72 65 2E 66 72 69 64 61  2E 41 67 65 6E 74 53 65  re.frida.AgentSe
00000080: 73 73 69 6F 6E 31 32 00  01 00 00 00 1B 00 00 00  ssion12.........
00000090: 7B 22 74 79 70 65 22 3A  22 73 65 6E 64 22 2C 22  {"type":"send","
000000A0: 70 61 79 6C 6F 61 64 22  3A 31 7D 00 00 00 00 00  payload":1}.....
000000B0: 00 00 00 00                                       ....
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 180
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 04 01 01 2C 00 00 00  09 00 00 00 78 00 00 00  l...,.......x...
00000010: 08 01 67 00 07 28 75 29  73 62 61 79 00 00 00 00  ..g..(u)sbay....
00000020: 01 01 6F 00 18 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00  03 01 73 00 11 00 00 00  ..........s.....
00000050: 4D 65 73 73 61 67 65 46  72 6F 6D 53 63 72 69 70  MessageFromScrip
00000060: 74 00 00 00 00 00 00 00  02 01 73 00 17 00 00 00  t.........s.....
00000070: 72 65 2E 66 72 69 64 61  2E 41 67 65 6E 74 53 65  re.frida.AgentSe
00000080: 73 73 69 6F 6E 31 32 00  01 00 00 00 1B 00 00 00  ssion12.........
00000090: 7B 22 74 79 70 65 22 3A  22 73 65 6E 64 22 2C 22  {"type":"send","
000000A0: 70 61 79 6C 6F 61 64 22  3A 32 7D 00 00 00 00 00  payload":2}.....
000000B0: 00 00 00 00                                       ....
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 180
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 04 01 01 2C 00 00 00  0A 00 00 00 78 00 00 00  l...,.......x...
00000010: 08 01 67 00 07 28 75 29  73 62 61 79 00 00 00 00  ..g..(u)sbay....
00000020: 01 01 6F 00 18 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00  03 01 73 00 11 00 00 00  ..........s.....
00000050: 4D 65 73 73 61 67 65 46  72 6F 6D 53 63 72 69 70  MessageFromScrip
00000060: 74 00 00 00 00 00 00 00  02 01 73 00 17 00 00 00  t.........s.....
00000070: 72 65 2E 66 72 69 64 61  2E 41 67 65 6E 74 53 65  re.frida.AgentSe
00000080: 73 73 69 6F 6E 31 32 00  01 00 00 00 1B 00 00 00  ssion12.........
00000090: 7B 22 74 79 70 65 22 3A  22 73 65 6E 64 22 2C 22  {"type":"send","
000000A0: 70 61 79 6C 6F 61 64 22  3A 33 7D 00 00 00 00 00  payload":3}.....
000000B0: 00 00 00 00                                       ....
---snip---
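The interesting part of each MessageFromScript signal is the JSON string in its body. The body signature is (u)sbay, so a hand-rolled decoder only needs a few struct calls. The sketch below works on the 44-byte body of the first signal above (the bytes from offset 0x88 onward); the names script_id, has_data, and data are our guesses at what the four values mean, inferred from the signature rather than taken from Frida's source.

```python
import json
import struct


def parse_message_from_script(body: bytes):
    """Decode a little-endian DBus body with signature (u)sbay."""
    (script_id,) = struct.unpack_from("<I", body, 0)   # (u): struct holding one uint32
    (text_len,) = struct.unpack_from("<I", body, 4)    # s: uint32 length, bytes, NUL
    text = body[8:8 + text_len].decode("utf-8")
    offset = (8 + text_len + 1 + 3) // 4 * 4           # skip the NUL, realign to 4 bytes
    has_data, data_len = struct.unpack_from("<II", body, offset)  # b, then ay length
    data = body[offset + 8:offset + 8 + data_len]
    return script_id, json.loads(text), bool(has_data), data


# Body of the first MessageFromScript signal in the capture above.
body = bytes.fromhex(
    "01 00 00 00 1B 00 00 00 "
    "7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 "
    "70 61 79 6C 6F 61 64 22 3A 31 7D 00 00 00 00 00 "
    "00 00 00 00"
)
script_id, message, has_data, data = parse_message_from_script(body)
print(script_id, message, has_data, data)
```

Because the body starts on an 8-byte boundary within the full message, offsets relative to the start of the body keep the same alignment as offsets within the message.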
  1. This message repeats, with the payload incrementing by 1 each time, as specified in the example code. When the user is done with Frida, it instructs the agent to destroy the injected script:
Output
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 132
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 01 00 01 04 00 00 00  07 00 00 00 70 00 00 00  l...........p...
00000010: 08 01 67 00 03 28 75 29  00 00 00 00 00 00 00 00  ..g..(u)........
00000020: 01 01 6F 00 18 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00  03 01 73 00 0D 00 00 00  ..........s.....
00000050: 44 65 73 74 72 6F 79 53  63 72 69 70 74 00 00 00  DestroyScript...
00000060: 02 01 73 00 17 00 00 00  72 65 2E 66 72 69 64 61  ..s.....re.frida
00000070: 2E 41 67 65 6E 74 53 65  73 73 69 6F 6E 31 32 00  .AgentSession12.
00000080: 01 00 00 00                                       ....
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 02 01 01 00 00 00 00  0B 00 00 00 10 00 00 00  l...............
00000010: 08 01 67 00 00 00 00 00  05 01 75 00 07 00 00 00  ..g.......u.....
  1. Frida then closes the agent session that hosted the injected script:
Output
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 112
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 01 00 01 00 00 00 00  08 00 00 00 60 00 00 00  l...........`...
00000010: 08 01 67 00 00 00 00 00  01 01 6F 00 18 00 00 00  ..g.......o.....
00000020: 2F 72 65 2F 66 72 69 64  61 2F 41 67 65 6E 74 53  /re/frida/AgentS
00000030: 65 73 73 69 6F 6E 2F 31  00 00 00 00 00 00 00 00  ession/1........
00000040: 03 01 73 00 05 00 00 00  43 6C 6F 73 65 00 00 00  ..s.....Close...
00000050: 02 01 73 00 17 00 00 00  72 65 2E 66 72 69 64 61  ..s.....re.frida
00000060: 2E 41 67 65 6E 74 53 65  73 73 69 6F 6E 31 32 00  .AgentSession12.
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 132
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 04 01 01 04 00 00 00  0C 00 00 00 70 00 00 00  l...........p...
00000010: 08 01 67 00 03 28 75 29  00 00 00 00 00 00 00 00  ..g..(u)........
00000020: 01 01 6F 00 1E 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 50 72  a/AgentSessionPr
00000040: 6F 76 69 64 65 72 00 00  03 01 73 00 06 00 00 00  ovider....s.....
00000050: 43 6C 6F 73 65 64 00 00  02 01 73 00 1F 00 00 00  Closed....s.....
00000060: 72 65 2E 66 72 69 64 61  2E 41 67 65 6E 74 53 65  re.frida.AgentSe
00000070: 73 73 69 6F 6E 50 72 6F  76 69 64 65 72 31 32 00  ssionProvider12.
00000080: 01 00 00 00                                       ....
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 02 01 01 00 00 00 00  0D 00 00 00 10 00 00 00  l...............
00000010: 08 01 67 00 00 00 00 00  05 01 75 00 08 00 00 00  ..g.......u.....
  1. And, for the final step, Frida instructs the injected agent to unload:
Output
====
STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 120
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 01 00 01 00 00 00 00  09 00 00 00 68 00 00 00  l...........h...
00000010: 08 01 67 00 00 00 00 00  01 01 6F 00 1E 00 00 00  ..g.......o.....
00000020: 2F 72 65 2F 66 72 69 64  61 2F 41 67 65 6E 74 53  /re/frida/AgentS
00000030: 65 73 73 69 6F 6E 50 72  6F 76 69 64 65 72 00 00  essionProvider..
00000040: 03 01 73 00 06 00 00 00  55 6E 6C 6F 61 64 00 00  ..s.....Unload..
00000050: 02 01 73 00 1F 00 00 00  72 65 2E 66 72 69 64 61  ..s.....re.frida
00000060: 2E 41 67 65 6E 74 53 65  73 73 69 6F 6E 50 72 6F  .AgentSessionPro
00000070: 76 69 64 65 72 31 32 00                           vider12.
====
STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 02 01 01 00 00 00 00  0E 00 00 00 10 00 00 00  l...............
00000010: 08 01 67 00 00 00 00 00  05 01 75 00 09 00 00 00  ..g.......u.....
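To feed any of the captures above into a parser, the hex column first has to be turned back into bytes. The following is a small sketch of that step, assuming the layout seen in this post: an 8-digit hex offset, a colon and a space, then a 48-character hex column followed by the ASCII rendering. The function name hexdump_to_bytes is ours and is not part of unixdump.

```python
def hexdump_to_bytes(dump: str) -> bytes:
    """Rebuild raw bytes from hex-dump lines, ignoring every other line."""
    out = bytearray()
    for line in dump.splitlines():
        offset, sep, rest = line.partition(": ")
        if not sep or len(offset) != 8:
            continue  # STREAM, command[...], ---- and ==== lines
        try:
            int(offset, 16)
        except ValueError:
            continue
        # The hex column is 48 characters wide; short rows are space-padded.
        out += bytes.fromhex(rest[:48])
    return bytes(out)


# The first message of the authentication handshake, rebuilt with the same
# column layout (the padding is generated to avoid miscounting spaces).
sample = "\n".join([
    "STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 18",
    "command[26525]: './hello'",
    "----",
    "00000000: 41 55 54 48 20 45 58 54  45 52 4E 41 4C 20 33 30  AUTH EXTERNAL 30",
    "00000010: " + "0D 0A".ljust(48) + "  ..",
])
payload = hexdump_to_bytes(sample)
print(payload)
```

If the padding in the hex column has been collapsed by copy and paste, the fixed-width slice will pick up ASCII characters and raise a ValueError, so this approach relies on the column layout being preserved.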

Frida CLI Tab Completion Protocol

The Frida command line tool has a tab completion-based prompt that allows quick access to all of its features. We will examine the communications that occur while Frida is performing a tab complete operation.

  1. Frida starts the interaction by sending a PostToScript command to the injected script. The script sent calls Object.getOwnPropertyNames() on the this object:
Output
====
STREAM PID 26851.0xffff8cd6a8a25900 (S) > 26847.0xffff8cd62e7e9100 (C), length 220
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
command[26847]: './hello'
----
00000000: 6C 01 00 01 5C 00 00 00  0B 00 00 00 70 00 00 00  l...\.......p...
00000010: 08 01 67 00 07 28 75 29  73 62 61 79 00 00 00 00  ..g..(u)sbay....
00000020: 01 01 6F 00 18 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00  03 01 73 00 0C 00 00 00  ..........s.....
00000050: 50 6F 73 74 54 6F 53 63  72 69 70 74 00 00 00 00  PostToScript....
00000060: 02 01 73 00 17 00 00 00  72 65 2E 66 72 69 64 61  ..s.....re.frida
00000070: 2E 41 67 65 6E 74 53 65  73 73 69 6F 6E 31 32 00  .AgentSession12.
00000080: 01 00 00 00 4A 00 00 00  5B 22 66 72 69 64 61 3A  ....J...["frida:
00000090: 72 70 63 22 2C 20 35 2C  20 22 63 61 6C 6C 22 2C  rpc", 5, "call",
000000A0: 20 22 65 76 61 6C 75 61  74 65 22 2C 20 5B 22 4F   "evaluate", ["O
000000B0: 62 6A 65 63 74 2E 67 65  74 4F 77 6E 50 72 6F 70  bject.getOwnProp
000000C0: 65 72 74 79 4E 61 6D 65  73 28 74 68 69 73 29 22  ertyNames(this)"
000000D0: 5D 5D 00 00 00 00 00 00  00 00 00 00              ]]..........
====
STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 32
command[26847]: './hello'
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
----
00000000: 6C 02 01 01 00 00 00 00  10 00 00 00 10 00 00 00  l...............
00000010: 08 01 67 00 00 00 00 00  05 01 75 00 0B 00 00 00  ..g.......u.....
  1. This causes the Frida agent within the target process to evaluate the script and return all property names of the this object, which are the possible actions and available commands we are attempting to tab complete:
Output
====
STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 1736
command[26847]: './hello'
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
----
00000000: 6C 04 01 01 40 06 00 00  11 00 00 00 78 00 00 00  l...@.......x...
00000010: 08 01 67 00 07 28 75 29  73 62 61 79 00 00 00 00  ..g..(u)sbay....
00000020: 01 01 6F 00 18 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00  03 01 73 00 11 00 00 00  ..........s.....
00000050: 4D 65 73 73 61 67 65 46  72 6F 6D 53 63 72 69 70  MessageFromScrip
00000060: 74 00 00 00 00 00 00 00  02 01 73 00 17 00 00 00  t.........s.....
00000070: 72 65 2E 66 72 69 64 61  2E 41 67 65 6E 74 53 65  re.frida.AgentSe
00000080: 73 73 69 6F 6E 31 32 00  01 00 00 00 2C 06 00 00  ssion12.....,...
00000090: 7B 22 74 79 70 65 22 3A  22 73 65 6E 64 22 2C 22  {"type":"send","
000000A0: 70 61 79 6C 6F 61 64 22  3A 5B 22 66 72 69 64 61  payload":["frida
000000B0: 3A 72 70 63 22 2C 35 2C  22 6F 6B 22 2C 5B 22 6F  :rpc",5,"ok",["o
000000C0: 62 6A 65 63 74 22 2C 5B  22 4E 61 4E 22 2C 22 49  bject",["NaN","I
000000D0: 6E 66 69 6E 69 74 79 22  2C 22 75 6E 64 65 66 69  nfinity","undefi
000000E0: 6E 65 64 22 2C 22 4F 62  6A 65 63 74 22 2C 22 46  ned","Object","F
000000F0: 75 6E 63 74 69 6F 6E 22  2C 22 41 72 72 61 79 22  unction","Array"
00000100: 2C 22 53 74 72 69 6E 67  22 2C 22 42 6F 6F 6C 65  ,"String","Boole
00000110: 61 6E 22 2C 22 4E 75 6D  62 65 72 22 2C 22 44 61  an","Number","Da
00000120: 74 65 22 2C 22 52 65 67  45 78 70 22 2C 22 45 72  te","RegExp","Er
00000130: 72 6F 72 22 2C 22 45 76  61 6C 45 72 72 6F 72 22  ror","EvalError"
00000140: 2C 22 52 61 6E 67 65 45  72 72 6F 72 22 2C 22 52  ,"RangeError","R
00000150: 65 66 65 72 65 6E 63 65  45 72 72 6F 72 22 2C 22  eferenceError","
00000160: 53 79 6E 74 61 78 45 72  72 6F 72 22 2C 22 54 79  SyntaxError","Ty
00000170: 70 65 45 72 72 6F 72 22  2C 22 55 52 49 45 72 72  peError","URIErr
00000180: 6F 72 22 2C 22 4D 61 74  68 22 2C 22 4A 53 4F 4E  or","Math","JSON
00000190: 22 2C 22 44 75 6B 74 61  70 65 22 2C 22 50 72 6F  ","Duktape","Pro
000001A0: 78 79 22 2C 22 52 65 66  6C 65 63 74 22 2C 22 42  xy","Reflect","B
000001B0: 75 66 66 65 72 22 2C 22  41 72 72 61 79 42 75 66  uffer","ArrayBuf
000001C0: 66 65 72 22 2C 22 44 61  74 61 56 69 65 77 22 2C  fer","DataView",
000001D0: 22 49 6E 74 38 41 72 72  61 79 22 2C 22 55 69 6E  "Int8Array","Uin
000001E0: 74 38 41 72 72 61 79 22  2C 22 55 69 6E 74 38 43  t8Array","Uint8C
000001F0: 6C 61 6D 70 65 64 41 72  72 61 79 22 2C 22 49 6E  lampedArray","In
00000200: 74 31 36 41 72 72 61 79  22 2C 22 55 69 6E 74 31  t16Array","Uint1
00000210: 36 41 72 72 61 79 22 2C  22 49 6E 74 33 32 41 72  6Array","Int32Ar
00000220: 72 61 79 22 2C 22 55 69  6E 74 33 32 41 72 72 61  ray","Uint32Arra
00000230: 79 22 2C 22 46 6C 6F 61  74 33 32 41 72 72 61 79  y","Float32Array
00000240: 22 2C 22 46 6C 6F 61 74  36 34 41 72 72 61 79 22  ","Float64Array"
00000250: 2C 22 70 61 72 73 65 49  6E 74 22 2C 22 70 61 72  ,"parseInt","par
00000260: 73 65 46 6C 6F 61 74 22  2C 22 54 65 78 74 45 6E  seFloat","TextEn
00000270: 63 6F 64 65 72 22 2C 22  54 65 78 74 44 65 63 6F  coder","TextDeco
00000280: 64 65 72 22 2C 22 70 65  72 66 6F 72 6D 61 6E 63  der","performanc
00000290: 65 22 2C 22 65 76 61 6C  22 2C 22 69 73 4E 61 4E  e","eval","isNaN
000002A0: 22 2C 22 69 73 46 69 6E  69 74 65 22 2C 22 64 65  ","isFinite","de
000002B0: 63 6F 64 65 55 52 49 22  2C 22 64 65 63 6F 64 65  codeURI","decode
000002C0: 55 52 49 43 6F 6D 70 6F  6E 65 6E 74 22 2C 22 65  URIComponent","e
000002D0: 6E 63 6F 64 65 55 52 49  22 2C 22 65 6E 63 6F 64  ncodeURI","encod
000002E0: 65 55 52 49 43 6F 6D 70  6F 6E 65 6E 74 22 2C 22  eURIComponent","
000002F0: 65 73 63 61 70 65 22 2C  22 75 6E 65 73 63 61 70  escape","unescap
00000300: 65 22 2C 22 67 6C 6F 62  61 6C 22 2C 22 46 72 69  e","global","Fri
00000310: 64 61 22 2C 22 53 63 72  69 70 74 22 2C 22 57 65  da","Script","We
00000320: 61 6B 52 65 66 22 2C 22  5F 73 65 74 54 69 6D 65  akRef","_setTime
00000330: 6F 75 74 22 2C 22 5F 73  65 74 49 6E 74 65 72 76  out","_setInterv
00000340: 61 6C 22 2C 22 63 6C 65  61 72 54 69 6D 65 6F 75  al","clearTimeou
00000350: 74 22 2C 22 63 6C 65 61  72 49 6E 74 65 72 76 61  t","clearInterva
00000360: 6C 22 2C 22 67 63 22 2C  22 5F 73 65 6E 64 22 2C  l","gc","_send",
00000370: 22 5F 73 65 74 55 6E 68  61 6E 64 6C 65 64 45 78  "_setUnhandledEx
00000380: 63 65 70 74 69 6F 6E 43  61 6C 6C 62 61 63 6B 22  ceptionCallback"
00000390: 2C 22 5F 73 65 74 49 6E  63 6F 6D 69 6E 67 4D 65  ,"_setIncomingMe
000003A0: 73 73 61 67 65 43 61 6C  6C 62 61 63 6B 22 2C 22  ssageCallback","
000003B0: 5F 77 61 69 74 46 6F 72  45 76 65 6E 74 22 2C 22  _waitForEvent","
000003C0: 49 6E 74 36 34 22 2C 22  55 49 6E 74 36 34 22 2C  Int64","UInt64",
000003D0: 22 4E 61 74 69 76 65 50  6F 69 6E 74 65 72 22 2C  "NativePointer",
000003E0: 22 4E 61 74 69 76 65 52  65 73 6F 75 72 63 65 22  "NativeResource"
000003F0: 2C 22 4E 61 74 69 76 65  46 75 6E 63 74 69 6F 6E  ,"NativeFunction
00000400: 22 2C 22 53 79 73 74 65  6D 46 75 6E 63 74 69 6F  ","SystemFunctio
00000410: 6E 22 2C 22 4E 61 74 69  76 65 43 61 6C 6C 62 61  n","NativeCallba
00000420: 63 6B 22 2C 22 43 70 75  43 6F 6E 74 65 78 74 22  ck","CpuContext"
00000430: 2C 22 53 6F 75 72 63 65  4D 61 70 22 2C 22 4B 65  ,"SourceMap","Ke
00000440: 72 6E 65 6C 22 2C 22 4D  65 6D 6F 72 79 22 2C 22  rnel","Memory","
00000450: 4D 65 6D 6F 72 79 41 63  63 65 73 73 4D 6F 6E 69  MemoryAccessMoni
00000460: 74 6F 72 22 2C 22 50 72  6F 63 65 73 73 22 2C 22  tor","Process","
00000470: 54 68 72 65 61 64 22 2C  22 42 61 63 6B 74 72 61  Thread","Backtra
00000480: 63 65 72 22 2C 22 4D 6F  64 75 6C 65 22 2C 22 4D  cer","Module","M
00000490: 6F 64 75 6C 65 4D 61 70  22 2C 22 46 69 6C 65 22  oduleMap","File"
000004A0: 2C 22 49 4F 53 74 72 65  61 6D 22 2C 22 49 6E 70  ,"IOStream","Inp
000004B0: 75 74 53 74 72 65 61 6D  22 2C 22 4F 75 74 70 75  utStream","Outpu
000004C0: 74 53 74 72 65 61 6D 22  2C 22 55 6E 69 78 49 6E  tStream","UnixIn
000004D0: 70 75 74 53 74 72 65 61  6D 22 2C 22 55 6E 69 78  putStream","Unix
000004E0: 4F 75 74 70 75 74 53 74  72 65 61 6D 22 2C 22 53  OutputStream","S
000004F0: 6F 63 6B 65 74 22 2C 22  53 6F 63 6B 65 74 4C 69  ocket","SocketLi
00000500: 73 74 65 6E 65 72 22 2C  22 53 6F 63 6B 65 74 43  stener","SocketC
00000510: 6F 6E 6E 65 63 74 69 6F  6E 22 2C 22 53 71 6C 69  onnection","Sqli
00000520: 74 65 44 61 74 61 62 61  73 65 22 2C 22 53 71 6C  teDatabase","Sql
00000530: 69 74 65 53 74 61 74 65  6D 65 6E 74 22 2C 22 49  iteStatement","I
00000540: 6E 74 65 72 63 65 70 74  6F 72 22 2C 22 49 6E 76  nterceptor","Inv
00000550: 6F 63 61 74 69 6F 6E 4C  69 73 74 65 6E 65 72 22  ocationListener"
00000560: 2C 22 49 6E 76 6F 63 61  74 69 6F 6E 43 6F 6E 74  ,"InvocationCont
00000570: 65 78 74 22 2C 22 49 6E  76 6F 63 61 74 69 6F 6E  ext","Invocation
00000580: 41 72 67 73 22 2C 22 49  6E 76 6F 63 61 74 69 6F  Args","Invocatio
00000590: 6E 52 65 74 75 72 6E 56  61 6C 75 65 22 2C 22 41  nReturnValue","A
000005A0: 70 69 52 65 73 6F 6C 76  65 72 22 2C 22 44 65 62  piResolver","Deb
000005B0: 75 67 53 79 6D 62 6F 6C  22 2C 22 49 6E 73 74 72  ugSymbol","Instr
000005C0: 75 63 74 69 6F 6E 22 2C  22 58 38 36 57 72 69 74  uction","X86Writ
000005D0: 65 72 22 2C 22 58 38 36  52 65 6C 6F 63 61 74 6F  er","X86Relocato
000005E0: 72 22 2C 22 53 74 61 6C  6B 65 72 22 2C 22 53 74  r","Stalker","St
000005F0: 61 6C 6B 65 72 49 74 65  72 61 74 6F 72 22 2C 22  alkerIterator","
00000600: 50 72 6F 62 65 41 72 67  73 22 2C 22 5F 5F 63 6F  ProbeArgs","__co
00000610: 72 65 2D 6A 73 5F 73 68  61 72 65 64 5F 5F 22 2C  re-js_shared__",
00000620: 22 50 72 6F 6D 69 73 65  22 2C 22 72 70 63 22 2C  "Promise","rpc",
00000630: 22 72 65 63 76 22 2C 22  73 65 6E 64 22 2C 22 73  "recv","send","s
00000640: 65 74 54 69 6D 65 6F 75  74 22 2C 22 73 65 74 49  etTimeout","setI
00000650: 6E 74 65 72 76 61 6C 22  2C 22 73 65 74 49 6D 6D  nterval","setImm
00000660: 65 64 69 61 74 65 22 2C  22 63 6C 65 61 72 49 6D  ediate","clearIm
00000670: 6D 65 64 69 61 74 65 22  2C 22 69 6E 74 36 34 22  mediate","int64"
00000680: 2C 22 75 69 6E 74 36 34  22 2C 22 70 74 72 22 2C  ,"uint64","ptr",
00000690: 22 4E 55 4C 4C 22 2C 22  63 6F 6E 73 6F 6C 65 22  "NULL","console"
000006A0: 2C 22 68 65 78 64 75 6D  70 22 2C 22 4F 62 6A 43  ,"hexdump","ObjC
000006B0: 22 2C 22 4A 61 76 61 22  5D 5D 5D 7D 00 00 00 00  ","Java"]]]}....
000006C0: 00 00 00 00 00 00 00 00                           ........
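Judging from these two captures, the tab-completion exchange is a small JSON RPC layered on top of PostToScript and MessageFromScript: the request is a list of the form ["frida:rpc", id, "call", method, args], and the reply echoes the id inside the payload of a send message. The sketch below mirrors that shape; it is inferred from the dumps rather than from Frida's source, and build_rpc_call and parse_rpc_reply are hypothetical helper names.

```python
import json


def build_rpc_call(request_id: int, method: str, args: list) -> str:
    """Serialize a call in the ["frida:rpc", id, "call", method, args] shape."""
    return json.dumps(["frida:rpc", request_id, "call", method, args])


def parse_rpc_reply(text: str):
    """Return (request_id, status, result) from a MessageFromScript JSON string."""
    message = json.loads(text)
    payload = message.get("payload")
    if (message.get("type") != "send" or not isinstance(payload, list)
            or payload[:1] != ["frida:rpc"]):
        raise ValueError("not a frida:rpc reply")
    _, request_id, status, result = payload[:4]
    return request_id, status, result


request = build_rpc_call(5, "evaluate", ["Object.getOwnPropertyNames(this)"])
print(len(request), request)

# Reply shortened to its first three property names; the capture lists many more.
reply = ('{"type":"send","payload":["frida:rpc",5,"ok",'
         '["object",["NaN","Infinity","undefined"]]]}')
request_id, status, (kind, names) = parse_rpc_reply(reply)
print(request_id, status, kind, names)
```

The serialized request is 74 bytes long, which matches the string length 0x4A that precedes the JSON in the PostToScript body above.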
  1. Seeing this list, we begin to type out File. and hit tab to see our options. Object.getOwnPropertyNames() is called again, but this time on File. This returns the following attributes: prototype, length, and name:
Output
====
STREAM PID 26851.0xffff8cd6a8a25900 (S) > 26847.0xffff8cd62e7e9100 (C), length 1220
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
command[26847]: './hello'
----
00000000: 6C 01 00 01 44 04 00 00  0C 00 00 00 70 00 00 00  l...D.......p...
00000010: 08 01 67 00 07 28 75 29  73 62 61 79 00 00 00 00  ..g..(u)sbay....
00000020: 01 01 6F 00 18 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00  03 01 73 00 0C 00 00 00  ..........s.....
00000050: 50 6F 73 74 54 6F 53 63  72 69 70 74 00 00 00 00  PostToScript....
00000060: 02 01 73 00 17 00 00 00  72 65 2E 66 72 69 64 61  ..s.....re.frida
00000070: 2E 41 67 65 6E 74 53 65  73 73 69 6F 6E 31 32 00  .AgentSession12.
00000080: 01 00 00 00 30 04 00 00  5B 22 66 72 69 64 61 3A  ....0...["frida:
00000090: 72 70 63 22 2C 20 36 2C  20 22 63 61 6C 6C 22 2C  rpc", 6, "call",
000000A0: 20 22 65 76 61 6C 75 61  74 65 22 2C 20 5B 22 74   "evaluate", ["t
000000B0: 72 79 20 7B 5C 6E 20 20  20 20 20 20 20 20 20 20  ry {\n          
000000C0: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
000000D0: 20 20 20 20 20 20 20 20  20 20 28 66 75 6E 63 74            (funct
000000E0: 69 6F 6E 20 28 6F 29 20  7B 5C 6E 20 20 20 20 20  ion (o) {\n     
000000F0: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000100: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000110: 20 20 20 5C 22 75 73 65  20 73 74 72 69 63 74 5C     \"use strict\
00000120: 22 3B 5C 6E 20 20 20 20  20 20 20 20 20 20 20 20  ";\n            
00000130: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000140: 20 20 20 20 20 20 20 20  20 20 20 20 76 61 72 20              var 
00000150: 6B 20 3D 20 4F 62 6A 65  63 74 2E 67 65 74 4F 77  k = Object.getOw
00000160: 6E 50 72 6F 70 65 72 74  79 4E 61 6D 65 73 28 6F  nPropertyNames(o
00000170: 29 3B 5C 6E 20 20 20 20  20 20 20 20 20 20 20 20  );\n            
00000180: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000190: 20 20 20 20 20 20 20 20  20 20 20 20 69 66 20 28              if (
000001A0: 6F 20 21 3D 3D 20 6E 75  6C 6C 20 26 26 20 6F 20  o !== null && o 
000001B0: 21 3D 3D 20 75 6E 64 65  66 69 6E 65 64 29 20 7B  !== undefined) {
000001C0: 5C 6E 20 20 20 20 20 20  20 20 20 20 20 20 20 20  \n              
000001D0: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
000001E0: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 76 61                va
000001F0: 72 20 70 3B 5C 6E 20 20  20 20 20 20 20 20 20 20  r p;\n          
00000200: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000210: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000220: 20 20 69 66 20 28 74 79  70 65 6F 66 20 6F 20 21    if (typeof o !
00000230: 3D 3D 20 27 6F 62 6A 65  63 74 27 29 5C 6E 20 20  == 'object')\n  
00000240: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000250: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000260: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 70 20                p 
00000270: 3D 20 6F 2E 5F 5F 70 72  6F 74 6F 5F 5F 3B 5C 6E  = o.__proto__;\n
00000280: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000290: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
000002A0: 20 20 20 20 20 20 20 20  20 20 20 20 65 6C 73 65              else
000002B0: 5C 6E 20 20 20 20 20 20  20 20 20 20 20 20 20 20  \n              
000002C0: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
000002D0: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
000002E0: 20 20 70 20 3D 20 4F 62  6A 65 63 74 2E 67 65 74    p = Object.get
000002F0: 50 72 6F 74 6F 74 79 70  65 4F 66 28 6F 29 3B 5C  PrototypeOf(o);\
00000300: 6E 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  n               
00000310: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000320: 20 20 20 20 20 20 20 20  20 20 20 20 20 69 66 20               if 
00000330: 28 70 20 21 3D 3D 20 6E  75 6C 6C 20 26 26 20 70  (p !== null && p
00000340: 20 21 3D 3D 20 75 6E 64  65 66 69 6E 65 64 29 5C   !== undefined)\
00000350: 6E 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  n               
00000360: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000370: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000380: 20 6B 20 3D 20 6B 2E 63  6F 6E 63 61 74 28 4F 62   k = k.concat(Ob
00000390: 6A 65 63 74 2E 67 65 74  4F 77 6E 50 72 6F 70 65  ject.getOwnPrope
000003A0: 72 74 79 4E 61 6D 65 73  28 70 29 29 3B 5C 6E 20  rtyNames(p));\n 
000003B0: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
000003C0: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
000003D0: 20 20 20 20 20 20 20 7D  5C 6E 20 20 20 20 20 20         }\n      
000003E0: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
000003F0: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000400: 20 20 72 65 74 75 72 6E  20 6B 3B 5C 6E 20 20 20    return k;\n   
00000410: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000420: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000430: 20 7D 29 28 46 69 6C 65  29 3B 5C 6E 20 20 20 20   })(File);\n    
00000440: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000450: 20 20 20 20 20 20 20 20  20 20 20 20 7D 20 63 61              } ca
00000460: 74 63 68 20 28 65 29 20  7B 5C 6E 20 20 20 20 20  tch (e) {\n     
00000470: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
00000480: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 5B                 [
00000490: 5D 3B 5C 6E 20 20 20 20  20 20 20 20 20 20 20 20  ];\n            
000004A0: 20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                  
000004B0: 20 20 20 20 7D 22 5D 5D  00 00 00 00 00 00 00 00      }"]]........
000004C0: 00 00 00 00                                       ....
====
STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 32
command[26847]: './hello'
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
----
00000000: 6C 02 01 01 00 00 00 00  12 00 00 00 10 00 00 00  l...............
00000010: 08 01 67 00 00 00 00 00  05 01 75 00 0C 00 00 00  ..g.......u.....
====
STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 240
command[26847]: './hello'
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
----
00000000: 6C 04 01 01 68 00 00 00  13 00 00 00 78 00 00 00  l...h.......x...
00000010: 08 01 67 00 07 28 75 29  73 62 61 79 00 00 00 00  ..g..(u)sbay....
00000020: 01 01 6F 00 18 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00  03 01 73 00 11 00 00 00  ..........s.....
00000050: 4D 65 73 73 61 67 65 46  72 6F 6D 53 63 72 69 70  MessageFromScrip
00000060: 74 00 00 00 00 00 00 00  02 01 73 00 17 00 00 00  t.........s.....
00000070: 72 65 2E 66 72 69 64 61  2E 41 67 65 6E 74 53 65  re.frida.AgentSe
00000080: 73 73 69 6F 6E 31 32 00  01 00 00 00 57 00 00 00  ssion12.....W...
00000090: 7B 22 74 79 70 65 22 3A  22 73 65 6E 64 22 2C 22  {"type":"send","
000000A0: 70 61 79 6C 6F 61 64 22  3A 5B 22 66 72 69 64 61  payload":["frida
000000B0: 3A 72 70 63 22 2C 36 2C  22 6F 6B 22 2C 5B 22 6F  :rpc",6,"ok",["o
000000C0: 62 6A 65 63 74 22 2C 5B  22 70 72 6F 74 6F 74 79  bject",["prototy
000000D0: 70 65 22 2C 22 6C 65 6E  67 74 68 22 2C 22 6E 61  pe","length","na
000000E0: 6D 65 22 5D 5D 5D 7D 00  00 00 00 00 00 00 00 00  me"]]]}.........
  1. Back in the UI, we tab cycle to the length attribute and hit enter on File.length. This tells the injected script to call evaluate on File.length. The script responds with an array indicating that the type of the evaluated expression is "number" and the value is 2:
Output
====
STREAM PID 26851.0xffff8cd6a8a25900 (S) > 26847.0xffff8cd62e7e9100 (C), length 200
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
command[26847]: './hello'
----
00000000: 6C 01 00 01 48 00 00 00  0D 00 00 00 70 00 00 00  l...H.......p...
00000010: 08 01 67 00 07 28 75 29  73 62 61 79 00 00 00 00  ..g..(u)sbay....
00000020: 01 01 6F 00 18 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00  03 01 73 00 0C 00 00 00  ..........s.....
00000050: 50 6F 73 74 54 6F 53 63  72 69 70 74 00 00 00 00  PostToScript....
00000060: 02 01 73 00 17 00 00 00  72 65 2E 66 72 69 64 61  ..s.....re.frida
00000070: 2E 41 67 65 6E 74 53 65  73 73 69 6F 6E 31 32 00  .AgentSession12.
00000080: 01 00 00 00 35 00 00 00  5B 22 66 72 69 64 61 3A  ....5...["frida:
00000090: 72 70 63 22 2C 20 37 2C  20 22 63 61 6C 6C 22 2C  rpc", 7, "call",
000000A0: 20 22 65 76 61 6C 75 61  74 65 22 2C 20 5B 22 46   "evaluate", ["F
000000B0: 69 6C 65 2E 6C 65 6E 67  74 68 22 5D 5D 00 00 00  ile.length"]]...
000000C0: 00 00 00 00 00 00 00 00                           ........
====
STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 32
command[26847]: './hello'
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
----
00000000: 6C 02 01 01 00 00 00 00  14 00 00 00 10 00 00 00  l...............
00000010: 08 01 67 00 00 00 00 00  05 01 75 00 0D 00 00 00  ..g.......u.....
====
STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 212
command[26847]: './hello'
command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847'
----
00000000: 6C 04 01 01 4C 00 00 00  15 00 00 00 78 00 00 00  l...L.......x...
00000010: 08 01 67 00 07 28 75 29  73 62 61 79 00 00 00 00  ..g..(u)sbay....
00000020: 01 01 6F 00 18 00 00 00  2F 72 65 2F 66 72 69 64  ..o...../re/frid
00000030: 61 2F 41 67 65 6E 74 53  65 73 73 69 6F 6E 2F 31  a/AgentSession/1
00000040: 00 00 00 00 00 00 00 00  03 01 73 00 11 00 00 00  ..........s.....
00000050: 4D 65 73 73 61 67 65 46  72 6F 6D 53 63 72 69 70  MessageFromScrip
00000060: 74 00 00 00 00 00 00 00  02 01 73 00 17 00 00 00  t.........s.....
00000070: 72 65 2E 66 72 69 64 61  2E 41 67 65 6E 74 53 65  re.frida.AgentSe
00000080: 73 73 69 6F 6E 31 32 00  01 00 00 00 3B 00 00 00  ssion12.....;...
00000090: 7B 22 74 79 70 65 22 3A  22 73 65 6E 64 22 2C 22  {"type":"send","
000000A0: 70 61 79 6C 6F 61 64 22  3A 5B 22 66 72 69 64 61  payload":["frida
000000B0: 3A 72 70 63 22 2C 37 2C  22 6F 6B 22 2C 5B 22 6E  :rpc",7,"ok",["n
000000C0: 75 6D 62 65 72 22 2C 32  5D 5D 7D 00 00 00 00 00  umber",2]]}.....
000000D0: 00 00 00 00                                       ....

eBPF Coding Tricks

While writing unixdump, we spent an inordinate amount of time attempting to please the eBPF bytecode validator with code constructs it would accept. Most of the time, it would not like idiomatic code that was correct; this was seemingly due to compiler optimizations used by the BCC toolchain. Regardless, we often had to obscure our code in ways that would enable it to pass inspection, and, as a result, the code likely performs worse than if the validator worked correctly in the first place. Additionally, as some of the data structures we needed to parse are dynamically sized and based on dynamic offsets, we had to write (or generate) inline code to parse them directly without loops or recursion. And then there are the generic eBPF hoops that need to be jumped through on a regular basis.

eBPF Gotchas

No Loops, No Jumper Cables

eBPF doesn’t like loops, that much is clear; but we often still need to perform such operations. Abusing the eBPF memcpy-alike, bpf_probe_read, will only get one so far, especially if one needs to zero out a struct. In practice, short statically-bounded loops will be unrolled by the compiler and work, but longer loops will not be unrolled and won’t work. However, it is simple to unroll loops with statically-known bounds using compiler pragmas:

#pragma unroll
for (size_t i=0; i < 30; i++) {
  arr[i] = arr[i] + 1;
}

This is a fairly useful construct that can be ruthlessly applied to a number of different problems.

Uninitialized Memory

One of the things to be careful about with eBPF is that when attempting to copy data from the eBPF stack elsewhere, if any uninitialized memory would be copied, the validator will error with offending stack offsets that are entirely unhelpful. Usually, this is the result of having padding between fields in your structs. A simple way of handling this is to use an unrolled loop akin to memset-ing zero; where possible, such code will be optimized to use 8-byte writes. However, this is computationally wasteful. Instead, another option is to carefully control field types and ordering to fill in all gaps. Failing this, explicitly declaring settable padding values and padding unions can enable a programmer to manually elide double writes to the struct. And lastly, one can always use a packed struct (e.g. struct __attribute__((__packed__)) foo {...}); this may require more byte shuffling operations to write and read, but can be of help when the limiting factor of the eBPF code is the effective rate/drop limit of perf_submit, by reducing the overall amount of data sent.
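The layout options described above can be compared side by side. The following is only a sketch: the struct and field names are hypothetical rather than taken from unixdump, and the padding sizes in the comments assume a common 64-bit ABI where uint64_t is 8-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event layouts for illustration; not unixdump's structs. */

/* Naive ordering: on common 64-bit ABIs the compiler inserts 7 hidden
 * bytes after `flag` and 4 more after `pid` (24 bytes in total). */
struct event_naive {
  uint8_t  flag;
  uint64_t ts;
  uint32_t pid;
};

/* Largest-first ordering with the leftover gap declared explicitly, so
 * every byte of the struct can be assigned before it is copied out. */
struct event_explicit {
  uint64_t ts;
  uint32_t pid;
  uint8_t  flag;
  uint8_t  pad[3];
};

/* Packed variant: no gaps at all, at the cost of unaligned fields. */
struct __attribute__((__packed__)) event_packed {
  uint8_t  flag;
  uint64_t ts;
  uint32_t pid;
};

/* Compile-time check that the explicit layout has no hidden padding:
 * the struct size must equal the sum of its declared field sizes. */
_Static_assert(sizeof(struct event_explicit) ==
                   sizeof(uint64_t) + sizeof(uint32_t) + sizeof(uint8_t) + 3,
               "event_explicit contains implicit padding");

/* Writes every byte of the struct exactly once, padding included. */
static inline void event_fill(struct event_explicit *e, uint64_t ts,
                              uint32_t pid, uint8_t flag) {
  e->ts = ts;
  e->pid = pid;
  e->flag = flag;
  e->pad[0] = 0;
  e->pad[1] = 0;
  e->pad[2] = 0;
}
```

The explicit layout avoids a separate zeroing pass because no byte is left for the compiler to skip, while the packed layout is the smallest of the three to submit.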

eBPF Chicanery

For unixdump we had a number of operational needs based on correctness or performance goals that required writing a significant amount of non-idiomatic C code and code generation tooling. While none of this is especially groundbreaking, it is worth discussing how to perform common programmatic tasks while under constraints like those imposed by eBPF.

Ratcheting

In addition to managing memory shared between kernel space and userspace, we also needed to maintain state of the current position within the custom ring buffer. This is achieved simply enough by using another per-CPU ring buffer, one that only holds a single value. This provides a separate position value associated with each per-CPU ring buffer. However, the problem with this setup is not in the data itself, but the mechanism by which it is incremented, or, more importantly, wrapped. The eBPF validator was displeased with any ratchet that tried to perform the wrap via a specific switch case. Instead, it only accepts implementations where wrapping is performed solely under the default: label; attempts to wrap the value in the last “valid” case or “guess” the wrapping position will fail, even if the default: code also wraps. For example, the following is an eBPF-valid position counter ratchet implementation:

u32 pos = UINT32_MAX;
int key = 0;
sync = sync_buf.lookup(&key);
if (!sync) {
  return 0;
}

pos = 0;
switch (sync->next) {
  case 0: {
    pos = 0;
    sync->next = 1;
    break;
  };

  case 1: {
    pos = 1;
    sync->next = 2;
    break;
  };

  default: {
    pos = 0;
    sync->next = 1;
  }
}

Dynamic Structure Parsing

While writing unixdump, we got the crazy idea to keep track of all ancillary data (e.g. file descriptors) passing over Unix domain sockets. While this is of great benefit for tracking how processes are passing file handles, sockets, and other descriptors to each other, the “format” into which the data is marshalled is very fluid and poorly specified. For example, similarly to SMS messages, received messages may have a different structure from what was actually sent; in particular, multiple messages of the same type may be coalesced into a single message containing multiple values, regardless of the order in which they were sent.

In unixdump, we use unrolled nested loops to iterate through the CMSG structures containing ancillary data. Where possible, we use the CMSG_* macros to index into the buffer and access fields; however, we reimplemented several of these macros to be compatible with BCC’s pointer dereference instrumentation which was unable to handle all of the CMSG_* macros. To store the data and report it back to userspace, we used a typed union struct that can store both SCM_RIGHTS (file descriptors) and SCM_CREDENTIALS (Unix credentials), which additionally keeps track of the count of the former and whether or not the last element returned to userspace was actually the last element of the in-kernel CMSG structure. Both the max count of copyable CMSGs and slots within a CMSG (for storing SCM_RIGHTS file descriptors) are configurable via the CLI; this also modifies the unrolled loop counts.
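To make the shape of that traversal concrete, here is a userspace sketch of a doubly bounded walk over control messages. It is not unixdump's kernel-side code: MAX_CMSGS, MAX_FDS, struct fd_summary, and collect_fds are hypothetical names, the standard CMSG_* macros are used as-is, and only SCM_RIGHTS is handled. In eBPF C, both loops would additionally be unrolled.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

#define MAX_CMSGS 4 /* hypothetical cap on control messages examined */
#define MAX_FDS   8 /* hypothetical cap on descriptors recorded */

/* Hypothetical summary record reported back to the caller. */
struct fd_summary {
  int           fds[MAX_FDS];
  unsigned int  count;     /* number of valid entries in fds[] */
  unsigned char truncated; /* set when some data was not recorded */
};

/* Walks at most MAX_CMSGS control messages and records at most MAX_FDS
 * descriptors from SCM_RIGHTS messages; both loops have static bounds. */
static void collect_fds(struct msghdr *msg, struct fd_summary *out) {
  struct cmsghdr *c = CMSG_FIRSTHDR(msg);
  int i;

  memset(out, 0, sizeof(*out));
  for (i = 0; i < MAX_CMSGS && c != NULL; i++, c = CMSG_NXTHDR(msg, c)) {
    size_t n, j;

    if (c->cmsg_len < CMSG_LEN(0)) {
      break; /* malformed header: stop, reported as truncated below */
    }
    if (c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS) {
      continue; /* other ancillary types are skipped in this sketch */
    }
    n = (c->cmsg_len - CMSG_LEN(0)) / sizeof(int);
    for (j = 0; j < MAX_FDS && j < n && out->count < MAX_FDS; j++) {
      memcpy(&out->fds[out->count], CMSG_DATA(c) + j * sizeof(int),
             sizeof(int));
      out->count++;
    }
    if (j < n) {
      out->truncated = 1; /* more descriptors than free slots */
    }
  }
  if (c != NULL) {
    out->truncated = 1; /* control messages were left unexamined */
  }
}
```

The truncated flag plays the role described above: it tells the consumer whether the last element it received was really the last element of the in-kernel structure.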

Static Data Structures and Algorithms

To appropriately handle the glut of data caught by unixdump, we needed to performantly filter PIDs (inclusively or exclusively) in eBPF C code, so as to limit the amount of data and number of events sent to userland. Iteratively comparing each one would be extremely costly, so we instead opted to use a binary search tree. As a recursive binary search implementation will trip the loop check, we instead generate the entire static C implementation (dynamically in Python) for the values being filtered. For reference, the implementation can be found here.
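As an illustration of what such generated code might look like, here is a hand-written sketch for a hypothetical filter set of seven PIDs (100 through 700 in steps of 100). The function name pid_in_filter and the exact shape of the output are assumptions; the real generator may emit something different.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of generator output for the hypothetical sorted PID set
 * {100, 200, 300, 400, 500, 600, 700}. The search tree is flattened
 * into nested branches, so there is no loop and no recursion. */
static inline bool pid_in_filter(uint32_t pid) {
  if (pid == 400) {
    return true;
  }
  if (pid < 400) {
    /* Left subtree: {100, 200, 300}. */
    if (pid == 200) {
      return true;
    }
    if (pid < 200) {
      return pid == 100;
    }
    return pid == 300;
  }
  /* Right subtree: {500, 600, 700}. */
  if (pid == 600) {
    return true;
  }
  if (pid < 600) {
    return pid == 500;
  }
  return pid == 700;
}
```

Any lookup passes through at most three levels of the tree, compared with up to seven equality checks for a linear scan, and the depth grows logarithmically with the number of filtered PIDs.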

Dark eBPF Thaumaturgy

Even with all of the above tricks to keep it happy, the eBPF validator’s muse is still a fickle miscreant with a very short attention span. Be it due to changes in the toolchain or the Linux kernel itself, the validator may look upon your overly clever code and decide to smite you where you stand. Sometimes, appeasing the validator requires ever greater sacrifices of idiomaticity.

Dynamic Length Byte Copies

Per the issue mentioned earlier, it is not immediately clear that variable length byte copies from kernel memory are possible with eBPF. Given that the recommended solution is to use a helper function for copying NUL-terminated C strings, this would be a problem when the variable length data is binary content that may contain NUL bytes. However, this is not the case, and such copies can be performed, albeit with some careful sleight of hand. While this is not an issue for socket paths, of which the sun_path field of struct sockaddr_un is guaranteed to be at least UNIX_PATH_MAX (108) bytes long, this is an issue for copying arbitrary socket data. While it is important to ensure that stack-based arrays are fully written to (e.g. write NUL bytes to the remainder of arrays), this limitation does not exist for eBPF map structures as they are zero-initialized by the kernel to prevent information leaks. Instead, the trouble occurs when one attempts to truncate the copy length. Between BCC and the eBPF validator, it is often the case that a byte copy of the length of an array or less is considered unsafe, and therefore rejected. Instead, when tapering off the array length, it was previously necessary to cap the copy length to sizeof(buffer)-1. The odd behavior here is that if the source length is the same as the destination length, it must still be truncated. Additionally, to prevent optimizations that may elide certain comparisons needed to provide the eBPF validator with register bounds, we found that it was possible to simply wrap the desired code in a static inline function to shadow the variables in play. For example, in unixdump we perform this copy and track whether or not the data was truncated in the following code:

inline static
void copy_into_entry_buffer(data_t* entry, size_t const len,
                            char* base, u8 volatile* trunc) {
  int l = (int)len;
  if (l < 0) {
    l = 0;
  }
  if (l >= BUFFER_SIZE) {
    *trunc = 1;
  }
  if (l >= BUFFER_SIZE) {
    l = BUFFER_SIZE - 1;
  }
  bpf_probe_read(entry->buffer, l, base);
}

Note: This behavior has changed a few times between BCC and Linux kernel versions, and when using current versions of both, it is possible to implement the optimal case of copying right up to the end of the array; however, to support older versions we continue to use the less optimal “truncate on equal” version shown above.

Type Juggling

Another spooky behavior we observed with a previous version of BCC (which we have not observed since) was an interesting case where the return type of a function could cause the validator to raise an error. While it may be simple enough to imagine such a situation involving mixing signed and unsigned integers, this instance related to the use of bool as both the return and variable type, which was eventually cast to size_t. In some versions of our code, the validator would raise an error if the return value was bool, but in others it would raise an error if the return value was size_t. For context, unixdump will, based on CLI options for certain features, enable or disable certain kprobe functionality with #if(def)s. As a result, we simply used the same feature detections to set a BOOL_TYPE define used as the return and variable type with either bool or size_t. At the time, we did not bother to triage this issue (sorry!), but it does not affect the current unixdump code when using a current BCC. As for whether or not this is because the current BCC fixed the issue, or our current code is unaffected, it is a mystery.
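A minimal sketch of that kind of switch is shown below. Only the BOOL_TYPE name comes from the description above; the feature macro FEATURE_ANCILLARY and the helper is_truncated are hypothetical, and which feature maps to which type is chosen arbitrarily here.

```c
#include <stdbool.h>
#include <stddef.h>

/* FEATURE_ANCILLARY is a hypothetical feature flag standing in for the
 * CLI-driven #if(def)s; the mapping to bool or size_t is arbitrary. */
#ifdef FEATURE_ANCILLARY
#define BOOL_TYPE size_t
#else
#define BOOL_TYPE bool
#endif

/* Hypothetical helper whose return type follows the selected define. */
static inline BOOL_TYPE is_truncated(size_t len, size_t cap) {
  BOOL_TYPE result = (len >= cap);
  return result;
}
```

Callers only ever test the result for truth, so the source is identical under either definition and only the generated bytecode changes.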

Obfuscation, or: How I Learned to Stop Worrying and Outsmart the Compiler

When writing eBPF code, one’s greatest enemy is often the compiler’s optimizers. eBPF’s most glaring flaw is that the compiler and the validator have no means to communicate other than through the generated code. Try as you might to write your code in a concise way that would otherwise ensure its correctness, the compiler may simply optimize out all of your “unnecessary” data validation checks, leaving the validator to complain that you are not “properly” validating all of the edge cases. While one can sometimes get around such occurrences with the volatile keyword, at other times it will be necessary to rework the code over and over in an attempt to fool both the compiler and eBPF validator. As noted earlier, we have observed that placing code verbatim within an inline static function would result in certain offending code passing validation. This appeared to be due to the fact that certain assumptions on the “parameters” could no longer be made, preventing the compiler from eliding code required by the validator. However, it is worth noting that because of such blunders within the validator, one’s code must sometimes be implemented suboptimally, which will incur unnecessary performance penalties. We still prefer to accept such specific penalties over configuring BCC to compile eBPF C code with -O0.

Conclusion

While it can be a bit tricky to write anything more than the sorts of very simple eBPF kernel tracing tools currently promoted as BCC reference examples focused on basic system profiling, it is very much possible to use eBPF to develop full-featured tracing tools and tooling. Additionally, though the developer experience has a tendency to be extremely perplexing, it does appear to be actively improving over time, given the lessened need for hacky validator appeasement rituals.

We got our feet wet in the world of eBPF-based kernel tracing by attempting to solve a somewhat niche problem, but the outcome seems promising. Our initial test case for eBPF, unixdump, is open source and available on GitHub; check it out here: https://github.com/nccgroup/ebpf/tree/master/unixdump. We plan to continue to add features and filters to unixdump, and would greatly appreciate any contributions. The next features on the roadmap are proper timestamping, and outputting to pcapng so that one can load Unix domain socket traffic dumps into Wireshark/tshark and apply their vast repertoire of protocol dissectors.


  1. Depending on your OS, Unix domain sockets may be described in unix(7), unix(4), or sockaddr(3socket).↩︎

  2. While Linux enforces file path permissions on file path-based Unix domain sockets, this behavior is not consistent across all Unix implementations. However, in general, Unix OSes have similar sets of APIs enabling Unix domain socket peer processes to verify each other’s identity.↩︎

  3. https://www.kernel.org/doc/Documentation/networking/filter.txt↩︎