Reverse engineering and decrypting CyberArk vault credential files
Author: Jelle Vergeer
This blog will be a technical deep-dive into CyberArk credential files and how the credentials stored in these files are encrypted and decrypted. I discovered it was possible to reverse engineer the encryption and key generation algorithms and decrypt the encrypted vault password. I also provide a python implementation to decrypt the contents of the files.
It was a bit more than a year ago that we did a penetration test for a customer where we came across CyberArk. During the penetration test we tested the implementation of their AD tiering model and they used CyberArk to implement this. During the penetration test we were able to get access to the CyberArk Privileged Session Manager (PSM) server. We found several .cred CyberArk related files on this server. At the time of the assignment I suspected the files were related to accessing the CyberArk Vault. This component stores all passwords used by CyberArk. The software seemed to be able to access the vault using the files with no additional user input necessary. These credential files contain several fields, including an encrypted password and an “AdditionalInformation” field. I immediately suspected I could reverse or break the crypto to recover the password, though the binaries were quite large and complex (C++ classes everywhere).
A few months later during another assignment for another customer we again found CyberArk related credential files, but again, nobody knew how to decrypt them. So during a boring COVID stay-at-home holiday I dove into the CreateCredFile.exe binary, used to create new credential files, and started reverse engineering the logic. Creating a dummy credential file using the CreateCredFile utility looks like to following:
The encryption and key generation algorithms
It appears there are several types of credential files (Password, Token, PKI, Proxy and KeyPair). For this exercise we will look at the password type. The details in the file can be encrypted using several algorithms:
- DPAPI protected machine storage
- DPAPI protected user storage
The default seemed to be the custom one, and after some effort I started to understand the logic how the software encrypts and decrypts the password in the file. The encryption algorithm is roughly the following:
First the software generates 20 random bytes and converts this to a hexadecimal string. This string is stored in the internal CCAGCredFile object for later use. This basically is the “AdditionalInformation” field in the credential files. When the software actually enters the routine to encrypt the password, it will generate a string that will be used to generate the final AES key. I will refer to this string as the base key. This string will consist of the following parts, appended together:
- The Application Type restriction, converted to lower case, hashed with SHA1 and base64 encoded.
- The Executable Path restriction, converted to lower case.
- The Machine IP restriction.
- The Machine Hostname restriction, converted to lower case.
- The OS Username restriction, converted to lower case.
- The 20 random bytes, or AdditionalInformation field.
Note that by default, the software will not apply the additional restrictions, only relying on the additional info field, present in the credential files. After the base key is generated, the software will generate the actual encryption key used for encrypting and decrypting credentials in the credential files. It will start by creating a SHA1 context, and update the context with the base key. Next it will create two copies of the context. The first context is updated with the integer ‘1’, and the second is updated with the integer ‘2’, both in big endian format. The finalized digest of the first context serves as the first part of the key, appended by the first 12 bytes of the finalized second digest. The AES key is thus 32 bytes long.
When encrypting a value, the software generates some random bytes to use as initialization vector (IV) , and stores the IV in the first block of encrypted bytes. Furthermore, when a value is encrypted, the software will encrypt the value itself, combined with the hash of the value. I assume this is done to verify the decryption routine was successful and the data is not corrupted.
Decrypting credential files
Because, by default, the software will only rely on the random bytes as base key, which are included in the credential file, we can generate the correct AES key to decrypt the encrypted contents in the file. I implemented a Python utility to decrypt CyberArk Credential files and it can be downloaded here. The additional verification attributes the software can use to include in the base key can be provided as command line arguments to the decryption tool. Most of these can be either guessed, or easily discovered, as an attacker will most likely already have a foothold in the network, so a hostname or IP address is easily uncovered. In some cases the software even stores these verification attributes in the file as it asks to include the restrictions in the credential file when creating one using the CreateCredFile.exe utility.
How to defend against attackers from decrypting the CyberArk vault password in these credential files? First off, prevent an attacker from gaining access to the credential files in the first place. Protect your credential files and don’t leave them accessible by users or systems that don’t need access to them. Second, when creating credential files using the CreateCredFile utility, prefer the “Use Operating System Protected Storage for credentials file secret” option to protect the credentials with an additional (DPAPI) encryption layer. If this encryption is applied, an attacker will need access to the system on which the credential file was generated in order to decrypt the credential file.
We reported this issue at CyberArk and they released a new version mitigating the decryption of the credential file by changing the crypto implementation and making the DPAPI option the default. We did not have access to the new version to verify these changes.
20-06-2021 – Reported issue at CyberArk.
21/23/27/28-06-2021 – Communication back and forth with questions and explanation.
29-06-2021 – Call with CyberArk. They released a new version which should mitigate the issue.