CVE-2018-8611 Exploiting Windows KTM Part 1/5 – Introduction


TL;DR

In this 5-part blog post series we will discuss the exploitation of CVE-2018-8611, a local privilege escalation vulnerability in the Windows kernel. Originally, this vulnerability was disclosed because Kaspersky discovered a 0day exploit in the wild. No substantial public details about this vulnerability were ever published, nor was a sample of the exploit publicly released. Furthermore, hashes of the exploits discovered by Kaspersky were also never released.

CVE-2018-8611 affects a component of the Windows kernel called the Kernel Transaction Manager (KTM), which has not been explored much by the public security community. CVE-2018-8611 is a kernel race condition, a bug class that is also discussed somewhat less often on Windows (outside of bochspwn). Interestingly, access to KTM is not restricted by current syscall sandboxing filters. This is different from the win32k component, which is generally blocked from sandboxes, e.g. by the win32k syscall filter enforced by Chrome. This means that this vulnerability served as a valuable sandbox escape in client-side exploitation scenarios.

We will take a deep dive into the internals of KTM, show our patch analysis, work out what the underlying vulnerability is (with the help of Kaspersky’s relatively small amount of technical details), and finally discuss the approach we took to develop a fairly reliable exploit. We were able to develop an exploit that works on all versions from Windows Vista through Windows 10 1809, on both x86 and x64 architectures.

This research was done by Aaron Adams and Cedric Halbronn, who work in the Exploit Development Group (EDG) at NCC Group. The research was first presented at POC2019, and the slides for that presentation can be found here.

As with many of our other public writeups we will try to verbosely explain our thinking and methodology when approaching this research in order to help people new to the field. We try to highlight some hurdles, mistakes, and dead ends we ran into. As always, we are happy to receive suggestions, corrections, and feedback.

Getting started

Published Vulnerability Details

We first found out about this vulnerability from reading a public Kaspersky blog published in December 2018. The writeup states that in October 2018 a Kaspersky technology called Automatic Exploit Prevention (AEP) detected exploitation of the vulnerability, after which they reported it to Microsoft. This led to it being fixed by Microsoft in December 2018.

The Kaspersky blog was relatively light on details, but it provided some very useful hints. To be honest, some of the information didn’t make much sense on our first few reads, or we just weren’t connecting the dots. However, as we later solidified our understanding of KTM and the vulnerability itself, the information they did provide about the exploit became increasingly useful.

The entirety of Kaspersky’s public technical description of the vulnerability and exploitation behavior is pasted verbatim in the two paragraphs below, which we will refer back to occasionally when we describe our analysis:

To abuse this vulnerability exploit first creates a named pipe and opens it for
read and write. Then it creates a pair of new transaction manager objects,
resource manager objects, transaction objects and creates a big number of
enlistment objects for what we will call “Transaction #2”. Enlistment is a
special object that is used for association between a transaction and a
resource manager. When the transaction state changes associated resource
manager is notified by the KTM. After that it creates one more enlistment
object only now it does so for “Transaction #1” and commits all the changes
made during this transaction.

After all the initial preparations have been made exploit proceeds to the
second part of vulnerability trigger. It creates multiple threads and binds
them to a single CPU core. One of created threads calls
NtQueryInformationResourceManager in a loop, while second thread tries to
execute NtRecoverResourceManager once. But the vulnerability itself is
triggered in the third thread. This thread uses a trick of execution
NtQueryInformationThread to obtain information on the latest executed syscall
for the second thread. Successful execution of NtRecoverResourceManager will
mean that race condition has occurred and further execution of WriteFile on
previously created named pipe will lead to memory corruption.

There is a lot of information packed into those two paragraphs. Most of it likely won’t make sense to the reader right now, but hopefully it will as you continue reading. It is worth noting that even now that we know how to exploit the vulnerability, not all of it makes sense to us, though this could simply indicate a slightly different exploitation approach.

After exploiting this vulnerability early in 2019 and later preparing the blogs and POC presentation in October 2019, we were made aware that Kaspersky presented additional details about the 0day exploit at BlueHat Shanghai in May 2019. We analyze some of the techniques used by the 0day exploit and compare them with our approach in part 5 of our blog series.

Test Environment

In case you want to replicate some of this work, we did the majority of our research using the following tools:

  • VMWare Workstation: virtual machines

  • IDA Pro disassembler with the x86/x64 HexRays decompiler. Features from version 7.3

  • WinDbg/WinDbg Preview: two versions of the same debugger. To do kernel pool analysis on older systems, you have to use the old WinDbg instead of WinDbg Preview due to some bugs. On newer systems, WinDbg Preview is recommended because it is faster

  • virtualkd: rapid Windows VM kernel debugging

  • IDArling: IDA Pro collaboration plugin used by 2 people

  • ret-sync: IDA Pro / WinDbg syncing plugin for a better exploit developer experience

  • Diaphora: best IDA Pro plugin used for patch diffing

  • HexRaysPyTools: HexRays helper plugin to propagate (new) types recursively

  • draw.io: Diagram creation

A number of online tools/resources were also very helpful:

Thanks to everyone who develops and maintains the tools and resources above!

Starting from Windows 7

We had some back and forth discussion about starting on Windows 7. It was ultimately chosen because, in our experience exploiting win32k, there were sometimes more symbols available on Windows 7, so we thought that could also be the case for KTM. As far as we can tell, it wasn’t actually any different in the end.

Another reason for selecting Windows 7 is that IDA doesn’t work very well for projects containing multiple related files. Starting on Windows 8, KTM was separated out from ntoskrnl.exe into tm.sys. Focusing on tm.sys could appear attractive at first due to containing only KTM functions. However, it would actually make it more tedious to reverse, as KTM uses many functions from ntoskrnl.exe, and so we’d be jumping between two IDA databases.

One shortcoming of targeting Windows 7 first is that we started by building a write primitive that worked on Vista and Windows 7. It then turned out that this primitive failed due to mitigations on Windows 8 and above, forcing us to revisit our approach. There are always tradeoffs to these decisions. Often it’s better to just not over-analyze and get on with it.

Since our reversing was primarily done on Windows 7, almost all the code snippets in this blog post are taken from a decompiled Windows 7 ntoskrnl.exe binary. Where there is some noteworthy difference in behavior in other Windows versions we try to mention it, but as the vulnerability is quite complex it is likely that we overlook some details. So, if you choose to replicate this work, be aware that as you port across Windows Vista through 10 you will come across small differences in code, structure layouts, and offsets. Where these differences exist, they will occasionally need special casing when porting the exploit.

It’s also worth noting that patch diffing KTM bugs on Windows 8 and later is slightly easier, insofar as no other ntoskrnl.exe changes or vulnerabilities will show up in your results while patch diffing tm.sys.

We specifically reversed the Windows 7 x64 version of ntoskrnl.exe version 6.1.7601.24291.

Understanding the Windows Kernel Transaction Manager (KTM)

Documentation

After looking at the patch (which we’ll describe later) and not really understanding anything about what we were looking at, we decided that we first needed to know as much about KTM as possible. We chose to start by reversing most of the primary kernel APIs and system call implementations, and at the same time build up our own set of userland code samples for exploring the APIs. We knew we would end up being able to reuse a lot of our little samples for the exploit, and it allowed us to slowly build up a solid foundation of understanding.

Unlike many other MSDN pages, there are very few public samples of working KTM code on the MSDN KTM portal, so there are not a lot of snippets we can re-use.

One other resource is the MSDN Kernel-Mode KTM portal, which discusses KTM internals from the kernel side. It is very useful but also overwhelming when trying to understand new concepts. It is perfectly fine not to understand everything, but it is really worth reading. Also note that we are only interested in the kernel APIs to better understand the concepts; we won’t be able to call them directly, since our local elevation of privilege exploit can only call userland KTM APIs.

Consequently, most of what we wrote was simply worked out by trial and error, or by reversing how KTM components use APIs. Even if the documentation of KTM on MSDN is quite thorough, it is a bit confusing to sort through all the terminology without working code snippets to demonstrate the correlation between concepts.

We also found a good overview of how KTM worked in general that is not in the official documentation, which was a series of 3 videos posted by Microsoft between 2005 and 2007 as part of a "Going Deep" series about new Windows technologies during the release of Vista:

More recently some malware called "Proton Bot" was found using KTM functions in an attempt to bypass things like API monitoring/hooks. It doesn’t appear that it actually leverages the transactional nature of KTM for any other purpose though.

What is the Kernel Transaction Manager (KTM)?

KTM is a technology added in Windows Vista to introduce the "transactional operation" concept. Windows itself makes use of this component in at least the Windows registry and the NTFS filesystem. The concept is particularly common in the database world (e.g. SQL databases). The idea is that some given operation, called a ‘transaction’, may require a number of pieces of related work to be completed across multiple resources. Such a transaction, at a high level representing some single piece of work, needs to effectively be atomic: if any piece fails, then the whole transaction fails.

This type of ‘transaction’ is especially important for complicated multi-process actions. An Automated Teller Machine (ATM) attempting to reconcile both its own cash and a requestor’s account information is a common example case that might benefit from the concept of atomic transactions. Installing software is another example: upon interruption it needs to roll back all the changes, e.g. because the installation was cancelled by the user.

This means that if any part of the operation eventually fails, it’s possible to roll back all other work that has already been completed related to that transaction, to make it as if it never occurred at all. Only if all of the work related to a transaction succeeds is that transaction actually completed.

KTM has the concept of recovering from some failure and rolling back certain transactions. We will detail them later but for now it is only necessary to note that in the event that some piece of work related to a transaction fails, the system is designed in a way to notify all the other workers who are taking part in that transaction, to ensure they know that something has failed or that they all need to resynchronize to some agreed upon state.

The MSDN KTM portal is the main reference for how Microsoft has gone about providing APIs to allow userland client software to deal with transactions.

We are interested in four major KTM components, which we will describe in more detail. In the following sections we use their long and short form names interchangeably:

  • Resource Manager (RM)

  • Transaction Manager (TM)

  • Transaction (Tx)

  • Enlistment (En)

What is a transaction (Tx)?

It’s already been hinted at a bit above, but a transaction is what everything else in KTM revolves around. The Microsoft docs provide a fairly thorough high level explanation. They also provide a lower level view here.

From our perspective a transaction is a _KTRANSACTION kernel structure that has an association with a transaction manager, and one or more enlistments. It effectively represents some multi-part piece of work that is expected to be done atomically. A transaction is used to track this work when it is about to start, when it is in the process of doing it, or has already been done.

A transaction has three primary operations that can be done on it: creation, committing, and rolling back. Committing a transaction means converting the partial operations into permanent changes. Rolling back a transaction just means reverting all of the partial operations that have occurred so far, prior to the transaction actually being completed. A transaction that has been rolled back cannot be committed and vice versa.
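To make the commit/rollback semantics concrete, here is a minimal toy model in portable C. It is only a sketch: the `toy_tx` type, its states, and the helper names are hypothetical and have nothing to do with the real _KTRANSACTION layout or the KTM APIs. It only illustrates that staged work becomes permanent on commit, is discarded on rollback, and that the two outcomes are mutually exclusive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states for this toy model; not the kernel's _KTRANSACTION_STATE. */
typedef enum { TX_ACTIVE, TX_COMMITTED, TX_ROLLED_BACK } toy_tx_state;

typedef struct {
    toy_tx_state state;
    int *target;   /* the "permanent" value the transaction works on */
    int pending;   /* staged value, invisible to others until commit */
} toy_tx;

/* Creation: the transaction starts active, with nothing staged yet. */
void toy_tx_create(toy_tx *tx, int *target) {
    assert(tx != NULL && target != NULL);
    tx->state = TX_ACTIVE;
    tx->target = target;
    tx->pending = *target;
}

/* Stage a partial operation; the permanent value is left untouched. */
bool toy_tx_add(toy_tx *tx, int delta) {
    if (tx->state != TX_ACTIVE) return false;
    tx->pending += delta;
    return true;
}

/* Commit: convert the partial operations into a permanent change. */
bool toy_tx_commit(toy_tx *tx) {
    if (tx->state != TX_ACTIVE) return false;   /* e.g. already rolled back */
    *tx->target = tx->pending;
    tx->state = TX_COMMITTED;
    return true;
}

/* Rollback: discard the partial operations staged so far. */
bool toy_tx_rollback(toy_tx *tx) {
    if (tx->state != TX_ACTIVE) return false;   /* e.g. already committed */
    tx->pending = *tx->target;
    tx->state = TX_ROLLED_BACK;
    return true;
}
```

Staging changes in a separate field keeps the model tiny; the real KTM instead coordinates resource managers that each track their own part of the work.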

The following is a snippet of the _KTRANSACTION structure:

//0x2d8 bytes (sizeof)
struct _KTRANSACTION
{
    struct _KEVENT OutcomeEvent;                                            //0x0
    ULONG cookie;                                                           //0x18
    struct _KMUTANT Mutex;                                                  //0x20
    [...]
    struct _GUID UOW;                                                       //0xb0
    enum _KTRANSACTION_STATE State;                                         //0xc0
    ULONG Flags;                                                            //0xc4
    struct _LIST_ENTRY EnlistmentHead;                                      //0xc8
    ULONG EnlistmentCount;                                                  //0xd8
    [...]
    union _LARGE_INTEGER Timeout;                                           //0x128
    struct _UNICODE_STRING Description;                                     //0x130
    [...]
    struct _KTM* Tm;                                                        //0x200
    [...]
};

We don’t use the _KTRANSACTION structure much in practice, but it’s interesting to note that it tracks the number of enlistments (EnlistmentCount), a linked list of enlistments (EnlistmentHead) and a pointer to the associated transaction manager (Tm).

We create a transaction simply by calling the CreateTransaction() userland function:

HANDLE hTx = CreateTransaction(
	NULL, // lpTransactionAttributes
	0,	// UOW
	0,	// CreateOptions
	0,	// IsolationLevel
	0,	// IsolationFlags
	0,	// infinite timeout
	L"ExampleTx" // Description
);

It is worth noting that most of the KTM kernel structures have a cookie field, which can be seen in the _KTRANSACTION structure above. These are unique per structure type, so are fairly useful when poking around in the debugger to confirm we are looking at the right _KTM* type and offset. Below we list the cookie values for useful KTM types:

Cookie        Object type
0xb00b0001    _KTRANSACTION
0xb00b0002    _KRESOURCEMANAGER
0xb00b0003    _KENLISTMENT
0xb00b0004    _KTM
0xb00b0005    Protocol Address Info?
0xb00b0006    Propagate Request?
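When working from a raw memory dump rather than a live debugger session, the cookie check can be automated. The following is a minimal sketch in portable C; the function names are hypothetical, the cookie values come from the table above, and the two entries marked with a question mark in the table are labelled as unconfirmed here as well. The caller supplies the offset at which the cookie is expected (for instance 0x18 for _KTRANSACTION or 0x0 for _KENLISTMENT in the Windows 7 x64 layouts shown in this post).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map a KTM cookie value to the object type from the table above. */
const char *ktm_cookie_name(uint32_t cookie) {
    switch (cookie) {
    case 0xb00b0001u: return "_KTRANSACTION";
    case 0xb00b0002u: return "_KRESOURCEMANAGER";
    case 0xb00b0003u: return "_KENLISTMENT";
    case 0xb00b0004u: return "_KTM";
    case 0xb00b0005u: return "Protocol Address Info (unconfirmed)";
    case 0xb00b0006u: return "Propagate Request (unconfirmed)";
    default:          return NULL;   /* not a known KTM cookie */
    }
}

/* Read the 32-bit cookie at `offset` inside a dumped object and name it.
 * The value is read in host byte order, which matches the target on
 * little-endian x86/x64 hosts. Returns NULL when nothing matches. */
const char *ktm_identify(const uint8_t *dump, size_t dump_len, size_t offset) {
    uint32_t cookie;
    assert(dump != NULL);
    if (offset > dump_len || dump_len - offset < sizeof(cookie))
        return NULL;                 /* cookie would fall outside the dump */
    memcpy(&cookie, dump + offset, sizeof(cookie));
    return ktm_cookie_name(cookie);
}
```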

Most KTM* objects also have dedicated pool tags, allowing them to be tracked on the kernel pool. Below we show only the tags for the structures most relevant to the vulnerability we will be describing.

Pool tag    Object type
TmTx        _KTRANSACTION
TmRm        _KRESOURCEMANAGER
TmEn        _KENLISTMENT
TmTm        _KTM
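If you need to search raw memory for these tags, it helps to know their numeric form. The sketch below assumes the usual Windows convention that the four tag characters are stored in memory in reading order, so that the tag reads as a little-endian 32-bit value. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack a 4-character pool tag such as "TmEn" into the 32-bit value it
 * has when its bytes are read as a little-endian integer. */
uint32_t pool_tag_value(const char *tag) {
    assert(tag != NULL && strlen(tag) == 4);
    return  (uint32_t)(unsigned char)tag[0]
         | ((uint32_t)(unsigned char)tag[1] << 8)
         | ((uint32_t)(unsigned char)tag[2] << 16)
         | ((uint32_t)(unsigned char)tag[3] << 24);
}
```

For example, `pool_tag_value("TmEn")` evaluates to 0x6e456d54, since 'T' is 0x54, 'm' is 0x6d, 'E' is 0x45 and 'n' is 0x6e.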

What is a transaction manager (TM)?

A transaction manager is basically an entity which manages transactions in general. It is the highest order piece in the KTM hierarchy. Typically a transaction manager will have one or more resource managers associated with it. In order to actually do any sort of transactions you must first create a transaction manager.

One important transaction manager concept for exploitation purposes is the difference between volatile and durable:

Summarized from MSDN on transaction managers:

  • Durable – regular transaction managers that have a log and can recover their state

  • Volatile – these transaction managers do not have a log and cannot recover their state

When working with a durable transaction manager, everything you do ends up getting recorded in a log file on disk. If you’re doing things like repeatedly trying to win a race condition using thousands of enlistments, it is likely to quickly fill the log and may cause errors. We learned through trial and error that using a volatile transaction manager fixes a lot of problems and this is what we used for exploitation.

The kernel structure describing a transaction manager is _KTM:

//0x3c0 bytes (sizeof)
struct _KTM
{
    ULONG cookie;                                                           //0x0
    struct _KMUTANT Mutex;                                                  //0x8
    enum KTM_STATE State;                                                   //0x40
    [...]
    ULONG Flags;                                                            //0x80
    ULONG VolatileFlags;                                                    //0x84
    struct _UNICODE_STRING LogFileName;                                     //0x88
    struct _FILE_OBJECT* LogFileObject;                                     //0x98
    [...]
    struct _KRESOURCEMANAGER* TmRm;                                         //0x2a8
    [...]
};

Much like the _KTRANSACTION, we don’t need to use this too much in practice during exploitation. A lot of the content is related to logging (LogFileName, LogFileObject), but it also tracks all associated resource managers (TmRm).

The Flags field uses undocumented flags, which we’ve annotated as follows during our reversing:

enum KTM_FLAGS {
    KTM_FLAG_VOLATILE               = 0x01,
    KTM_FLAG_COMMIT_SYSTEM_VOLUME   = 0x02,
    KTM_FLAG_COMMIT_SYSTEM_HIVES    = 0x04,
    KTM_FLAG_COMMIT_LOWEST          = 0x08,
    KTM_FLAG_THAW                   = 0x10,
    KTM_FLAG_NO_IDENTITY            = 0x20,
    KTM_FLAG_CORRUPT_FOR_PROGRESS   = 0x40,
    KTM_FLAG_CORRUPT_FOR_RECOVERY   = 0x80,
    KTM_FLAG_CLUSTERED              = 0x100,
    KTM_FLAG_UNK4000                = 0x4000
};

Typically, creating a transaction manager is the first thing you’ll do, which is done by calling CreateTransactionManager():

HANDLE hTM = CreateTransactionManager(
	NULL, // lpTransactionAttributes
	NULL, // LogFileName
	TRANSACTION_MANAGER_VOLATILE,	// CreateOptions
	0	// CommitStrength
);

What is a resource manager (RM)?

A resource manager is some entity that manages a resource that will be involved in completing some work associated with a transaction. For instance, in order to complete a transaction, perhaps both the filesystem and registry must be involved and do some work. In this case both the filesystem and registry would have their own associated resource manager, and each resource manager would be responsible for managing some part of this transaction’s work. Each resource manager would enlist in the transaction to complete the necessary work, by creating an enlistment associated with that transaction. While waiting for a transaction to complete, you would wait for both of the involved resource managers to do their work before moving on.

From the perspective of exploitation we just create our own resource manager, and enlist in our own transaction, so it doesn’t really reflect most real use cases.

A resource manager is created using the CreateResourceManager() userland function:

HANDLE hRM = CreateResourceManager(
	NULL,                      // lpResourceManagerAttributes
	pRMGuid,                   // ResourceManagerId - GUID can't be NULL
	RESOURCE_MANAGER_VOLATILE, // CreateOptions - No log file
	hTM,                       // TmHandle - Previously created transaction manager handle
	NULL                       // Description
);

The kernel structure for a resource manager is the _KRESOURCEMANAGER:

//0x250 bytes (sizeof)
struct _KRESOURCEMANAGER
{
    struct _KEVENT NotificationAvailable;                                   //0x0
    ULONG cookie;                                                           //0x18
    enum _KRESOURCEMANAGER_STATE State;                                     //0x1c
    ULONG Flags;                                                            //0x20
    struct _KMUTANT Mutex;                                                  //0x28
    [...]
    struct _KQUEUE NotificationQueue;                                       //0x98
    struct _KMUTANT NotificationMutex;                                      //0xd8
    struct _LIST_ENTRY EnlistmentHead;                                      //0x110
    ULONG EnlistmentCount;                                                  //0x120
    LONG (*NotificationRoutine)(struct _KENLISTMENT* arg1, VOID* arg2, VOID* arg3, ULONG arg4, union _LARGE_INTEGER* arg5, ULONG arg6, VOID* arg7); //0x128
    [...]
    struct _KTM* Tm;                                                        //0x168
    struct _UNICODE_STRING Description;                                     //0x170
    [...]
}; 

This is a very important kernel structure for understanding the vulnerability and for exploitation. The most important fields are:

  • Tm – Pointer to the associated transaction manager

  • Description – An (optional) log name unicode string associated with the resource manager

  • EnlistmentHead – The current list of enlistments associated with this resource manager

  • Mutex – Locks the resource manager, meaning other code cannot parse that resource manager’s enlistments list (EnlistmentHead), read the Description field, etc.

  • NotificationQueue – A queue of notification packets that can be queried from userland to retrieve events related to enlistment state changes

The Flags field uses undocumented flags, which we’ve annotated as follows during our reversing:

enum KRESOURCEMANAGER_FLAGS {
    KRESOURCEMANAGER_NOTIFY         = 0x01,
    KRESOURCEMANAGER_UNK02          = 0x02,
    KRESOURCEMANAGER_IS_VOLATILE    = 0x04,
    KRESOURCEMANAGER_COMMUNICATION  = 0x08,
    KRESOURCEMANAGER_COMPLETION     = 0x10
};

We will talk more about how some of these fields are used as we describe the vulnerability and how to exploit it.

What is an enlistment (En)?

The best way to understand an enlistment is to think of it as a commitment by some resource manager to do some piece of work in order to complete a transaction. That resource manager has enlisted in the transaction, which means all other enlistments will need to coordinate states to progress towards completing the transaction. One enlistment can be thought of as one piece of work associated with the transaction. As long as that enlistment doesn’t transition to a read-only enlistment, it will be required to indicate that it has completed a given stage of its work, which eventually allows a transaction to transition to the next state. Whether or not a rollback or recover can occur will also depend on the state of some or all enlistments related to a transaction. We will describe these states in more detail later.

The kernel structure for a _KENLISTMENT is as follows:

//0x1e0 bytes (sizeof)
struct _KENLISTMENT
{
    ULONG cookie;                                                           //0x0
    struct _KTMOBJECT_NAMESPACE_LINK NamespaceLink;                         //0x8
    struct _GUID EnlistmentId;                                              //0x30
    struct _KMUTANT Mutex;                                                  //0x40
    struct _LIST_ENTRY NextSameTx;                                          //0x78
    struct _LIST_ENTRY NextSameRm;                                          //0x88
    struct _KRESOURCEMANAGER* ResourceManager;                              //0x98
    struct _KTRANSACTION* Transaction;                                      //0xa0
    enum _KENLISTMENT_STATE State;                                          //0xa8
    ULONG Flags;                                                            //0xac
    ULONG NotificationMask;                                                 //0xb0
    [...]
}; 

The most important fields are:

  • Transaction – The transaction that the enlistment is actually doing work for

  • Flags – Mask indicating the type of the enlistment and some internal enlistment states (e.g. KENLISTMENT_IS_NOTIFIABLE, is an internal ‘state’ that indicates if this enlistment has already been notified about some transactional state change)

  • Mutex – Locks the enlistment and prevents other code from manipulating it

  • State – The current transactional state of the enlistment in relation to the transaction (e.g. "PrePrepared")

  • NotificationMask – Which notifications should be queued to the resource manager related to this enlistment

  • NextSameRm – A linked list of enlistments associated with the same resource manager. This is the list entry whose head is _KRESOURCEMANAGER.EnlistmentHead
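The relationship between _KRESOURCEMANAGER.EnlistmentHead and _KENLISTMENT.NextSameRm is the classic intrusive doubly linked list pattern: the list links point at the embedded list entry, and the containing structure is recovered by subtracting the field offset (the idea behind the CONTAINING_RECORD macro in the Windows headers). The sketch below uses heavily simplified, hypothetical stand-in structures (`toy_rm`, `toy_enlistment`) and a hypothetical `CONTAINER_OF` macro. It is not the kernel code, only an illustration of how such a list is walked.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the Windows _LIST_ENTRY: forward and backward links. */
typedef struct list_entry { struct list_entry *Flink, *Blink; } list_entry;

/* Simplified stand-ins; the real structures are far larger. */
typedef struct { unsigned cookie; list_entry NextSameRm; int id; } toy_enlistment;
typedef struct { list_entry EnlistmentHead; unsigned EnlistmentCount; } toy_rm;

/* Recover the containing structure from a pointer to its embedded member. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty circular list has a head that points at itself. */
void toy_rm_init(toy_rm *rm) {
    assert(rm != NULL);
    rm->EnlistmentHead.Flink = rm->EnlistmentHead.Blink = &rm->EnlistmentHead;
    rm->EnlistmentCount = 0;
}

/* Link an enlistment at the tail of the resource manager's list. */
void toy_rm_add(toy_rm *rm, toy_enlistment *en) {
    list_entry *head = &rm->EnlistmentHead;
    en->NextSameRm.Flink = head;
    en->NextSameRm.Blink = head->Blink;
    head->Blink->Flink = &en->NextSameRm;
    head->Blink = &en->NextSameRm;
    rm->EnlistmentCount++;
}

/* Walk the list from the head until we arrive back at the head, storing
 * up to `max` enlistment ids. Returns the number of entries visited. */
unsigned toy_rm_walk(const toy_rm *rm, int *ids, unsigned max) {
    unsigned n = 0;
    const list_entry *e;
    for (e = rm->EnlistmentHead.Flink; e != &rm->EnlistmentHead; e = e->Flink) {
        const toy_enlistment *en = CONTAINER_OF(e, toy_enlistment, NextSameRm);
        if (n < max) ids[n] = en->id;
        n++;
    }
    return n;
}
```

Note that the walk only terminates when a link leads back to the head, which is why code walking such a list normally relies on a lock (here, the resource manager Mutex) to keep the links stable.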

The Flags field uses undocumented flags, which we’ve annotated as follows during our reversing:

enum KENLISTMENT_FLAGS {
    KENLISTMENT_SUPERIOR           = 0x01,
    KENLISTMENT_RECOVERABLE        = 0x02,
    KENLISTMENT_FINALIZED          = 0x04,
    KENLISTMENT_FINAL_NOTIFICATION = 0x08,
    KENLISTMENT_OUTCOME_REQUIRED   = 0x10,
    KENLISTMENT_HAS_SUPERIOR_SUB   = 0x20,
    KENLISTMENT_IS_NOTIFIABLE      = 0x80,
    KENLISTMENT_DELETED            = 0x80000000
};
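When inspecting a _KENLISTMENT in the debugger, it is handy to turn the raw Flags value into names. Below is a small sketch in portable C that does this with the flag values from the enum above. The names are our own annotations, not official Microsoft names, and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag bits from the KENLISTMENT_FLAGS annotations above. */
static const struct { uint32_t bit; const char *name; } k_en_flags[] = {
    { 0x00000001u, "SUPERIOR" },
    { 0x00000002u, "RECOVERABLE" },
    { 0x00000004u, "FINALIZED" },
    { 0x00000008u, "FINAL_NOTIFICATION" },
    { 0x00000010u, "OUTCOME_REQUIRED" },
    { 0x00000020u, "HAS_SUPERIOR_SUB" },
    { 0x00000080u, "IS_NOTIFIABLE" },
    { 0x80000000u, "DELETED" },
};

/* Write the names of all set flags into `out`, separated by '|'.
 * Stops early (leaving a valid string) if the buffer is too small. */
void kenlistment_flags_str(uint32_t flags, char *out, size_t out_len) {
    size_t used = 0, i;
    assert(out != NULL && out_len > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof(k_en_flags) / sizeof(k_en_flags[0]); i++) {
        size_t name_len, need;
        if (!(flags & k_en_flags[i].bit)) continue;
        name_len = strlen(k_en_flags[i].name);
        need = name_len + (used ? 1 : 0);        /* name plus separator */
        if (need >= out_len - used) break;       /* keep room for the NUL */
        if (used) out[used++] = '|';
        memcpy(out + used, k_en_flags[i].name, name_len);
        used += name_len;
        out[used] = '\0';
    }
}

/* True when the enlistment is marked as a superior enlistment. */
bool kenlistment_is_superior(uint32_t flags) {
    return (flags & 0x00000001u) != 0;
}
```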

To create an enlistment you call the CreateEnlistment() function. You must specify a handle to a previously created resource manager and transaction.

HANDLE hEn = CreateEnlistment(
	NULL,       // lpEnlistmentAttributes
	hRM,        // ResourceManagerHandle - Existing resource manager handle
	hTx,        // TransactionHandle - Existing transaction handle
	0x39ffff0f, // NotificationMask - Special value to receive all possible notifications
	0,          // CreateOptions
	NULL        // EnlistmentKey
);

We will go on a little side tangent here about something that slowed us down temporarily. The Microsoft documentation specifies TRANSACTION_NOTIFY_MASK = 0x3fffffff as the mask that indicates all valid bits, and we were hoping to just use this to receive all notifications. However, we instead always got an error when calling CreateEnlistment(). It turned out that the combination of flags specified by TRANSACTION_NOTIFY_MASK is itself an invalid combination when passed as the set of all notifications you want to receive. We didn’t want to only specify the documented bits all ORed together either, as there could be undocumented notifications. We reversed the logic and noticed a test done by a function called TmpIsNotificationMaskValid():

char __fastcall TmpIsNotificationMaskValid(enum NOTIFICATION_MASK NotificationMask, char CreateOptionsHasSuperior)
{
  char result;

  result = TRUE;
  if ( CreateOptionsHasSuperior )
  {
    // 0x207=TRANSACTION_NOTIFY_SINGLE_PHASE_COMMIT|TRANSACTION_NOTIFY_COMMIT|
    //       TRANSACTION_NOTIFY_PREPARE|TRANSACTION_NOTIFY_PREPREPARE
    if ( !(NotificationMask & 8) || NotificationMask & 0x207 )
      return FALSE;
  }
  else if ( !(NotificationMask & 8)
         || NotificationMask & 0x60000F0
         || NotificationMask & 2 && !(NotificationMask & 4)
         || _bittest(&NotificationMask, 9u) && !(NotificationMask & 2)// bittest(9) = TRANSACTION_NOTIFY_SINGLE_PHASE_COMMIT
         || !(NotificationMask & 4) && !(NotificationMask & 1) )// 1=TRANSACTION_NOTIFY_PREPREPARE
  {
    return FALSE;
  }
  return result;
}

In our case we aren’t really using "superior" enlistments, so we are interested in the second condition (the else if branch). TRANSACTION_NOTIFY_MASK is 0x3FFFFFFF, but we need to be sure that the bits in 0x60000F0 are not set. The rest of the tests make sure that if one bit is set, some other related bit is set as well. So we just go the ‘set everything we can’ route, which means we only need to avoid the 0x60000F0 bits:

python -c "print('0x%x' % (0x3FFFFFFF & (~0x60000F0)))"
0x39ffff0f

This is why we pass 0x39ffff0f in the earlier example.
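To double check that reasoning, the non-superior branch of the decompiled check can be re-expressed in portable C and run against both masks. This is a sketch based only on the decompiled output shown above, not on the original source. The `NOTIFY_*` macro names and function names are hypothetical shorthands for the corresponding TRANSACTION_NOTIFY_* values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shorthands for the TRANSACTION_NOTIFY_* bits used by the check. */
#define NOTIFY_PREPREPARE          0x00000001u
#define NOTIFY_PREPARE             0x00000002u
#define NOTIFY_COMMIT              0x00000004u
#define NOTIFY_ROLLBACK            0x00000008u
#define NOTIFY_SINGLE_PHASE_COMMIT 0x00000200u
#define NOTIFY_MASK_ALL            0x3FFFFFFFu  /* TRANSACTION_NOTIFY_MASK */
#define NOTIFY_REJECTED_BITS       0x060000F0u  /* bits the check refuses */

/* Re-expression of the non-superior branch of TmpIsNotificationMaskValid(). */
bool mask_valid_non_superior(uint32_t mask) {
    if (!(mask & NOTIFY_ROLLBACK)) return false;        /* rollback is mandatory */
    if (mask & NOTIFY_REJECTED_BITS) return false;      /* forbidden bits */
    if ((mask & NOTIFY_PREPARE) && !(mask & NOTIFY_COMMIT))
        return false;                                   /* prepare needs commit */
    if ((mask & NOTIFY_SINGLE_PHASE_COMMIT) && !(mask & NOTIFY_PREPARE))
        return false;                                   /* single-phase needs prepare */
    if (!(mask & NOTIFY_COMMIT) && !(mask & NOTIFY_PREPREPARE))
        return false;                                   /* need commit or pre-prepare */
    return true;
}

/* The 'set everything we can' mask: all bits minus the rejected ones. */
uint32_t all_accepted_notifications(void) {
    uint32_t mask = NOTIFY_MASK_ALL & ~NOTIFY_REJECTED_BITS;
    assert(mask_valid_non_superior(mask));
    return mask;
}
```

With these definitions, `all_accepted_notifications()` returns 0x39ffff0f, while `mask_valid_non_superior(0x3fffffff)` is false because of the rejected bits.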

Superior enlistments

Above we briefly showed some logic related to a "superior" enlistment, and said we weren’t going to intentionally use them. However, they are worth touching on because, although we don’t explicitly use them for exploitation, we will abuse some of the related logic to help us with debugging.

A superior enlistment is used in a ‘distributed’ transaction scenario where some enlistments are spread across some distributed channel, but one enlistment is in ‘charge’ of the others. The description of them on the MSDN KTM portal is a bit lacking in our opinion, insofar as there isn’t a single place that succinctly explains it. It is more alluded to in passing across various function descriptions.

Most importantly when trying to understand the KTM APIs and select functions to use, it is useful to know that some functions are only meant to be used with superior enlistments and others with non-superior enlistments, and this can be somewhat confusing. Specifically the naming for the superior-only functions often looks like what you think you want to use. The way to tell the difference is any API name ending with Enlistment is typically meant to work on superior enlistments only. However if the API name ends with Complete, then it is for non-superior enlistments. This is technically described in the API documentation but isn’t always immediately clear if you’re new to KTM. For example, this is what it says for something like CommitEnlistment():

Commits the transaction associated with this enlistment handle. This
function is used by communication resource managers (sometimes called superior
transaction managers).

Contrast this with CommitComplete() which says the following:

Indicates that a resource manager (RM) has finished committing a
transaction that was requested by the transaction manager (TM).

Below are two short lists showing common functions for transitioning enlistment states.

Non-superior enlistment    Superior enlistment
PrePrepareComplete()       PrePrepareEnlistment()
PrepareComplete()          PrepareEnlistment()
CommitComplete()           CommitEnlistment()

From the kernel perspective, whether or not an enlistment is superior is indicated by an undocumented KENLISTMENT_SUPERIOR flag being set in the _KENLISTMENT.Flags field (as detailed above). If you’re working with non-superior enlistments to trigger the bug, you will never normally enter conditional statements related to superior enlistments. This is interesting to keep in mind, as we will abuse this fact to assist with our kernel debugging after a successful race win later.

Recovery and rollback

At any moment, all of the enlistments, and thus the associated work, are tracked by kernel structures. At the request of an API, or in the event of some other error, a partially committed transaction can be reverted (referred to as a rollback). In contrast, requesting that an enlistment or all enlistments associated with a resource manager recover to some previous state is referred to as recovery.

The main difference between the two is that rolling back means that the transaction being rolled back is effectively being aborted, whereas recovery is an attempt to come to some known synchronized state for all the enlistments involved in an active transaction so that the transaction can continue. This distinction is enough for the scope of what we will talk about.

In order to recover all enlistments associated with a resource manager we call RecoverResourceManager(). Similarly RecoverTransactionManager() triggers recovery on everything associated with the transaction manager. Alternatively a single enlistment could be recovered using RecoverEnlistment().

To roll back a transaction we use RollbackTransaction(). Alternatively we can roll back a transaction by using an enlistment associated with that transaction, via RollbackEnlistment().

The most important thing to note for understanding CVE-2018-8611 that we noticed during testing is that, in order for a resource manager recovery to occur, this resource manager must already have at least one committed transaction to rollback to.

Notifications

Each resource manager has a set of associated transaction notifications that occur on milestone events, such as an enlistment switching from one state to another. One very relevant example of this is, while recovering a resource manager, notifications are being sent for each enlistment involved in the recovery. If you want to read these notifications from userland, you call GetNotificationResourceManager(). These events are placed into a FIFO queue tracked by the resource manager (_KRESOURCEMANAGER.NotificationQueue).

An example of fetching a notification and printing some information about it from the associated TRANSACTION_NOTIFICATION structure is shown below. This structure is followed by a TRANSACTION_NOTIFICATION_RECOVERY_ARGUMENT, so we use the following structure:

typedef struct TRANSACTION_NOTIFICATION_ALL_ {
	TRANSACTION_NOTIFICATION TransNotif;
	TRANSACTION_NOTIFICATION_RECOVERY_ARGUMENT RecovArg;
} TRANSACTION_NOTIFICATION_ALL;

The following code shows how to read out notifications:

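BOOL bRet;
DWORD ReturnLength = 0;
TRANSACTION_NOTIFICATION_ALL TransNotif;
ZeroMemory(&TransNotif, sizeof(TransNotif));
TransNotif.TransNotif.ArgumentLength = sizeof(TRANSACTION_NOTIFICATION);
bRet = GetNotificationResourceManager(
    hRM,                                    // ResourceManagerHandle
    (PTRANSACTION_NOTIFICATION)&TransNotif, // TransactionNotification buffer
    sizeof(TransNotif),                     // NotificationLength
    1000,                                   // dwMilliseconds
    &ReturnLength                           // ReturnLength
);
if (bRet) {
    // TransactionKey is a pointer, TransactionNotification is a ULONG bitmask
    printf("TransactionKey=%p, TransactionNotification=0x%lx\n",
        TransNotif.TransNotif.TransactionKey,
        TransNotif.TransNotif.TransactionNotification);
    // printf_guid() is a helper (not shown here) that prints a GUID
    printf_guid(&TransNotif.RecovArg.EnlistmentId);
}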

When you request notification information you receive, amongst other information, the GUID of the exact enlistment that the notification relates to.
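The printf_guid() helper used in the snippet above is not part of the Windows API and is not shown in this post. One possible way to format a GUID is sketched below in portable C. The `toy_guid` type is a hypothetical stand-in with the same field layout as the Windows GUID structure, and `format_guid()` is a hypothetical name; on Windows you would pass a real GUID instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in with the same field layout as the Windows GUID structure. */
typedef struct {
    uint32_t Data1;
    uint16_t Data2;
    uint16_t Data3;
    uint8_t  Data4[8];
} toy_guid;

/* Format a GUID as {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx} into `out`.
 * Returns what snprintf() returns: 38 on success when out_len >= 39. */
int format_guid(const toy_guid *g, char *out, size_t out_len) {
    assert(g != NULL && out != NULL);
    return snprintf(out, out_len,
        "{%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x}",
        (unsigned)g->Data1, (unsigned)g->Data2, (unsigned)g->Data3,
        (unsigned)g->Data4[0], (unsigned)g->Data4[1],
        (unsigned)g->Data4[2], (unsigned)g->Data4[3],
        (unsigned)g->Data4[4], (unsigned)g->Data4[5],
        (unsigned)g->Data4[6], (unsigned)g->Data4[7]);
}
```

Formatting into a caller-supplied buffer rather than printing directly makes it easy to compare the enlistment GUID from a notification against GUIDs recorded when the enlistments were created.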

We will touch more on the actual states that enlistments and transactions may be transitioning to at a later time.

Conclusion

In this blog post, we have presented some basic background on how KTM works in the kernel and its associated KTM kernel object structures. We also detailed how to interact with KTM from userland through important KTM-related functions.

Part 2 of our blog will delve into understanding the CVE-2018-8611 vulnerability and patch.

Read all posts in the Exploiting Windows KTM series