Writing Robust Yara Detection Rules for Heartbleed

This post walks through the methodology and process of writing robust Yara rules to detect Heartbleed-vulnerable OpenSSL, whether statically linked into a binary or shipped as a shared library that omits version information. Although Yara is designed for pattern matching and is typically used by malware researchers, we’ll show how it can also be used to detect vulnerable binaries.

One person’s static malware signature is another person’s vulnerable binary detector

What is the difference between a traditional binary anti-virus signature (non-hash based) and a vulnerable binary detector? Nothing. Well, your perspective, but in essence nothing.

The problem

So the problem we were faced with was how to detect binaries which statically link vulnerable versions of OpenSSL.

The Vulnerability

The vulnerability itself is well described by Sean Cassidy in his blog titled Diagnosis of the OpenSSL Heartbleed Bug. In this blog post he calls out this vulnerable code block in dtls1_process_heartbeat in d1_both.c:

/* Enter response type, length and copy payload */
s2n(payload, bp);
memcpy(bp, pl, payload);

If we look at the original vulnerable version this is the complete code block from 1.0.1e:

buffer = OPENSSL_malloc(1 + 2 + payload + padding);
bp = buffer;
/* Enter response type, length and copy payload */
s2n(payload, bp);
memcpy(bp, pl, payload);
bp += payload;
/* Random padding */
RAND_pseudo_bytes(bp, padding);

It is this code block that will form the basis of our signature. However, the same code block is also present in the patched version, so how do we avoid false positives? Well, the patch added a length check a little before this block.

If we look at the patched version from 1.0.1g we see:

/* Read type and payload length first*/
if (1 + 2 + 16 > s->s3->rrec.length)
 return 0; /* silently discard */
hbtype = *p++;
n2s(p, payload);
if (1 + 2 + payload + 16 > s->s3->rrec.length)
 return 0; /* silently discard per RFC 6520 sec. 4 */
pl = p;

It is this block that will form the basis of the ‘and not this’ part of the Yara rule.
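To make the effect of the patch concrete, here is a minimal Python sketch of the same two length checks, assuming the heartbeat record is handed over as a bytes object. The function name and the constant are ours, chosen for illustration, and are not OpenSSL identifiers; the sketch models only the checks, not the rest of dtls1_process_heartbeat.

```python
import struct

PADDING_MIN = 16  # minimum padding length, the "16" in the patched C code


def heartbeat_payload_ok(record: bytes) -> bool:
    """Model the two length checks added in 1.0.1g (hypothetical helper)."""
    # 1 byte type + 2 byte payload length + minimum padding must fit.
    if 1 + 2 + PADDING_MIN > len(record):
        return False  # silently discard
    # Equivalent of n2s(p, payload): read the sender-supplied 16-bit
    # big-endian length that follows the type byte.
    (payload,) = struct.unpack_from(">H", record, 1)
    # The claimed payload plus header and padding must fit in the record.
    if 1 + 2 + payload + PADDING_MIN > len(record):
        return False  # silently discard per RFC 6520 sec. 4
    return True


# A request claiming 0x4000 payload bytes while carrying none is rejected.
malicious = b"\x01\x40\x00" + b"\x00" * 16
# A request whose claimed length matches what it carries passes.
honest = b"\x01\x00\x04" + b"ping" + b"\x00" * 16
print(heartbeat_payload_ok(malicious), heartbeat_payload_ok(honest))
```

Note that 1 + 2 + 16 is 19, or 0x13, which is why a 13 byte shows up in the compiled form of the first check.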

Extracting the signature

Opening d1_both.obj from a compiled version of ssleay32.lib in IDA Pro, we can locate the original vulnerable code block, shown in Figure 1 here:

Vulnerable code block disassembled

Now the trick to a good signature is making it long enough not to be false-positive prone whilst omitting anything that may change between compiles. In this case that primarily means omitting the addresses which follow the assembly call instructions and some CPU register usage. If we highlight the bit of the disassembly which we are going to use as our signature we get:

Vulnerable code block disassembled with signature highlighted

This leads us to having the following Yara signature:

rule HeartBleedWin32
{
 strings:
  $opensslmini = {E8 ?? ?? ?? ?? 8B 4C 24 24 8B E8 8D 7D 01 8B C3 C6 45 00 02 C1 E8 08 53 88 07 88 5F 01 51 83 C7 02 57 E8 ?? ?? ?? ??}
 condition:
  $opensslmini
}
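The step of turning disassembled bytes into a Yara hex string can be sketched in Python as follows. This is an illustration rather than the tooling we used: to_yara_hex is a hypothetical helper, and the call displacement bytes in the example are placeholders. The offsets to mask are passed in explicitly, as read from the disassembly, rather than found by searching for E8, because E8 also occurs as a non-opcode byte (for example in 8B E8 and C1 E8 08 in the signature above).

```python
from typing import Iterable


def to_yara_hex(code: bytes, wildcard_offsets: Iterable[int]) -> str:
    """Render code as a Yara hex string, masking the given byte offsets."""
    masked = set(wildcard_offsets)
    tokens = [
        "??" if i in masked else f"{b:02X}"
        for i, b in enumerate(code)
    ]
    return "{" + " ".join(tokens) + "}"


# Start of the signature: a call followed by two register moves.
# The four displacement bytes (11 22 33 44) are placeholders, since the
# real values change from build to build.
code = bytes.fromhex("E8 11 22 33 44 8B 4C 24 24 8B E8")
# Offsets 1 to 4 hold the call displacement.
print(to_yara_hex(code, {1, 2, 3, 4}))
```

This prints the first eleven tokens of the signature, with the call target masked and the trailing E8 byte kept intact.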

So how do we avoid false positives for patched versions, given that the same signature is present in the non-vulnerable version? Well, we can fingerprint the patch itself, shown in Figure 2 above. If we look at it in IDA (comments are mine) we see:

Patch to Heartbleed in 1.0.1g disassembled in x86

Again if we highlight the bytes we will use for the signature we get (yes we could do more but we don’t as we have found no cause to):

Patch to Heartbleed in 1.0.1g disassembled in x86 with signature highlighted

We can then create the signature as follows:

rule HeartBleedWin32
{
 strings:
  $opensslmini = {E8 ?? ?? ?? ?? 8B 4C 24 24 8B E8 8D 7D 01 8B C3 C6 45 00 02 C1 E8 08 53 88 07 88 5F 01 51 83 C7 02 57 E8 ?? ?? ?? ??}
  $heartbleedpatch = {83 F9 13 73 ?? 5F 33 C0 5E 59 C3 0F ?? ?? ?? 0F ?? ??}
 condition:
  // vulnerable code block present and the 1.0.1g length check absent
  $opensslmini and not $heartbleedpatch
}

The condition should be clear: if a file matches our first signature but not the second, the rule evaluates to true, indicating that the binary is vulnerable to the Heartbleed bug.
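For readers without Yara to hand, the same ‘match the first but not the second’ logic can be approximated in plain Python. This is a rough sketch using the standard re module, not a substitute for Yara: hex_to_regex and looks_vulnerable are names we made up, and the two samples are synthetic byte strings built from the signatures themselves rather than real binaries.

```python
import re


def hex_to_regex(hex_string: str):
    """Compile a Yara-style hex string ('??' = any byte) to a bytes regex."""
    parts = []
    for token in hex_string.strip("{} ").split():
        if token == "??":
            parts.append(b".")  # any single byte (DOTALL includes 0x0A)
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    return re.compile(b"".join(parts), re.DOTALL)


MINI_HEX = (
    "E8 ?? ?? ?? ?? 8B 4C 24 24 8B E8 8D 7D 01 8B C3 C6 45 00 02 C1 E8 08 "
    "53 88 07 88 5F 01 51 83 C7 02 57 E8 ?? ?? ?? ??"
)
PATCH_HEX = "83 F9 13 73 ?? 5F 33 C0 5E 59 C3 0F ?? ?? ?? 0F ?? ??"

OPENSSLMINI = hex_to_regex(MINI_HEX)
HEARTBLEEDPATCH = hex_to_regex(PATCH_HEX)


def looks_vulnerable(data: bytes) -> bool:
    """Mirror the condition '$opensslmini and not $heartbleedpatch'."""
    return bool(OPENSSLMINI.search(data)) and not HEARTBLEEDPATCH.search(data)


# Synthetic samples: fill every wildcard with 0xAA and add some padding.
unpatched = b"\x90" * 8 + bytes.fromhex(MINI_HEX.replace("??", "AA"))
patched = bytes.fromhex(PATCH_HEX.replace("??", "AA")) + unpatched
print(looks_vulnerable(unpatched), looks_vulnerable(patched))
```

The first sample contains only the vulnerable block and is flagged; the second contains both the block and the patch fingerprint and is not.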

The implementation

So Alberto Barbaro, the consultant we had working on this, pulled the above methodology into a whizzy web front end which has signatures for:

  • X86 on Windows – Visual Studio compiled
  • X64 on Windows – Visual Studio compiled
  • X86 on Linux – GCC compiled
  • X64 on Linux – GCC compiled
  • ARM on Linux – GCC compiled

This web front end can be found on our labs site here – https://labs.nccgroup.com/heartbleed/

Example results showing a vulnerable version

Findings from testing against suspected malicious code

Our Cyber Defence Operations team tested these signatures against a suspected malicious code feed we consume. The first striking finding was that there were examples of real malicious code using vulnerable versions of OpenSSL, either statically linked or as DLLs embedded in resource sections, including an interesting sample from Russia. The statistics from the testing showed good results on the whole for the Win32 signature:

  • Eleven correct identifications of vulnerable versions.
  • Three false positives, where the first code block was correctly identified but the patched 1.0.1g version was not detected.
    • One had 1.0.1c and 1.0.1g mashed into the same binary, using different versions of functions for different objects, which is likely a symptom of not doing a full rebuild.

Ironically, OpenSSL more often than not embeds strings which identify the version, so the signatures could be further refined using a simple string matching rule. We added logic to the rule to double check, resulting in the rule shown here:

rule HeartBleedWin32
{
 strings:
  $opensslmini = {E8 ?? ?? ?? ?? 8B 4C 24 24 8B E8 8D 7D 01 8B C3 C6 45 00 02 C1 E8 08 53 88 07 88 5F 01 51 83 C7 02 57 E8 ?? ?? ?? ??}
  $heartbleedpatch = {83 ?? 13 73 ?? 5F 33 C0 5E 59 C3 0F ?? ?? ?? 0F ?? ??}
  $opensslVer = "OpenSSL 1.0.1g"
 condition:
  // vulnerable code block present, with neither the patch nor a 1.0.1g version string
  $opensslmini and not ($heartbleedpatch or $opensslVer)
}
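The version string double check can also be sketched in Python. The rule above looks only for the literal "OpenSSL 1.0.1g"; the sketch below is our own generalisation that pulls out every 1.0.1 patch letter it finds and treats g or later as patched, so both helper names and that wider comparison are assumptions rather than part of the rule.

```python
import re

# "OpenSSL 1.0.1" optionally followed by a single patch letter.
VERSION_RE = re.compile(rb"OpenSSL 1\.0\.1([a-z]?)")


def embedded_versions(data: bytes) -> set:
    """Return the 1.0.1 patch letters found ('' means plain 1.0.1)."""
    return {m.group(1).decode("ascii") for m in VERSION_RE.finditer(data)}


def version_says_patched(data: bytes) -> bool:
    """True if any embedded 1.0.1 version string is 'g' or later."""
    return any(letter >= "g" for letter in embedded_versions(data))


# Synthetic buffers standing in for the strings found inside a binary.
old_build = b"\x00OpenSSL 1.0.1e\x00"
mixed_build = b"\x00OpenSSL 1.0.1c\x00junk\x00OpenSSL 1.0.1g\x00"
print(embedded_versions(old_build), version_says_patched(old_build))
print(sorted(embedded_versions(mixed_build)), version_says_patched(mixed_build))
```

Returning the full set of letters, rather than a single yes or no, makes a mixed build like the 1.0.1c and 1.0.1g case above stand out, since a lone version string match would otherwise hide it.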

The topic of malicious code usage of vulnerable OpenSSL versions will be the subject of a future post.

ARM Signature Brittleness

The ARM signature is the most brittle at the moment, and we’re working on refining it as we see more samples uploaded, to understand the root cause, be it different compiler usage, optimization level or instruction mode.

Further refinements

So if we wanted to refine this further we could also add further sanity checks such as if the binary is in actual fact a PE or an ELF. The reason we didn’t do this was to allow detection in monolithic firmware blobs you might have.
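For completeness, a file type sanity check of that kind might look like the following Python sketch. It relies only on the well-known magic values (the \x7fELF prefix, and the MZ header whose 32-bit field at offset 0x3C points at the PE signature); container_type is a hypothetical helper, and a caller wanting firmware blob coverage would simply treat ‘unknown’ as still worth scanning.

```python
import struct


def container_type(data: bytes) -> str:
    """Classify a buffer as 'ELF', 'PE' or 'unknown' from its magic bytes."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    # A PE file starts with an MZ DOS header; the little-endian 32-bit
    # field at offset 0x3C gives the offset of the 'PE\0\0' signature.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE"
    return "unknown"


# Minimal synthetic headers, just enough to exercise each branch.
fake_pe = bytearray(0x44)
fake_pe[:2] = b"MZ"
struct.pack_into("<I", fake_pe, 0x3C, 0x40)
fake_pe[0x40:0x44] = b"PE\x00\x00"

fake_elf = b"\x7fELF" + b"\x00" * 12
firmware_blob = b"\x00" * 64

for sample in (bytes(fake_pe), fake_elf, firmware_blob):
    print(container_type(sample))
```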

Also it is worth mentioning that the ECX register is not excluded from the signatures in this particular case. The reason is that it is, in theory, used for counting in C, so it is unlikely to be repurposed by the compiler. We do appreciate that in C++ it is used for passing around ‘this’, but in this particular instance it is likely safe to include. If you wanted to err on the side of caution, you could also ?? it in the signature.
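The trade-off can be shown with a small Python sketch. In the patch signature, 83 F9 13 encodes a compare of ECX against 13h; a build that kept the same value in EDX would emit 83 FA 13 instead. The EDX variant here is a hypothetical build we made up for illustration, find_hex is our own helper, and the trailing 05 is a placeholder jump offset.

```python
def find_hex(data: bytes, hex_string: str) -> int:
    """Return the first offset where the hex string matches, or -1."""
    tokens = hex_string.split()
    for start in range(len(data) - len(tokens) + 1):
        if all(t == "??" or data[start + i] == int(t, 16)
               for i, t in enumerate(tokens)):
            return start
    return -1


cmp_ecx = bytes.fromhex("83 F9 13 73 05")  # cmp ecx, 13h ; jnb short
cmp_edx = bytes.fromhex("83 FA 13 73 05")  # cmp edx, 13h ; jnb short

strict = "83 F9 13 73 ??"   # register byte kept in the signature
relaxed = "83 ?? 13 73 ??"  # register byte wildcarded

print(find_hex(cmp_ecx, strict), find_hex(cmp_edx, strict))
print(find_hex(cmp_ecx, relaxed), find_hex(cmp_edx, relaxed))
```

The strict form misses the EDX variant while the relaxed form matches both. The cost is that the wildcarded byte also selects the operation, so the relaxed form would accept other instructions that share the 83 opcode and the 13h operand; the bytes that follow in the full signature are what keep that from mattering much in practice.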

F.L.I.R.T Déjà vu

For the keen-eyed among you, this blog post might seem a little déjà vu if you’ve ever looked at IDA’s F.L.I.R.T (Fast Library Identification and Recognition Technology). Indeed you’d be right: the same base concept of excluding variant bytes, as described in the F.L.I.R.T deep dive, is employed. So in short, the basic premise is not new, just a different application.

The Future

We’re actively working on a way to automatically generate Yara signatures for a particular library symbol or export, so watch this space, as we feel this has significant mileage.


This work was primarily performed by Alberto Barbaro with support from David Cannings and myself.

Published date:  02 June 2014

Written by:  Ollie Whitehouse