Exploring Verifiable Random Functions in Code

Verifiable Random Functions (VRFs) have recently seen a strong surge in popularity due to their usefulness in blockchain applications. Earlier I wrote about what VRFs are, where they can be used, and a few dozen things to consider when reviewing them. In this follow-on blog post, I am pleased to introduce actual working code that everyone can easily examine and explore. I’ll cover how to access the code, show it in action, and drop a few hints on what you might do with it.

Get it, Run it

The code is a fully self-contained Python 3 reference implementation matching the ECVRF-EDWARDS25519-SHA512-Elligator2 ciphersuite configuration found in the latest VRF specification at https://tools.ietf.org/pdf/draft-irtf-cfrg-vrf-06.pdf. The code is hosted on NCC Group’s GitHub page at https://github.com/nccgroup/draft-irtf-cfrg-vrf-06 and includes the relevant test cases.

The Internet Engineering Task Force (IETF) Crypto Forum Research Group (CFRG) connects theory and practice by providing a forum to discuss and develop new cryptographic techniques. Often these efforts result in RFCs such as the “Edwards-Curve Digital Signature Algorithm (EdDSA)” in RFC 8032. In this instance, CFRG efforts related to the VRF specification formally started in early 2017, now resulting in the recent draft 06. Think of this as pre-RFC material.

The purpose of the code is to enable easy examination and exploration, so raw simplicity and a rapid start were emphasized. As simpler algorithms are generally less efficient (and can expose timing side-channels), the code is not intended for production. From a Linux shell, clone the project and run the code as follows:

$ git clone https://github.com/nccgroup/draft-irtf-cfrg-vrf-06.git
$ cd draft-irtf-cfrg-vrf-06.git
$ python3 demo.py

VRFs in Action

Here we describe two generic use cases in action: a sealed auction and cryptographic sortition. Both provide a solid foundation for a near-infinite set of refinements and variations.

1. Sealed auction

The easiest and fastest way to ramp-up an understanding of all the moving parts is to first review the prior VRF blog entry. VRFs can be thought of as enabling a hash digest signed by a secret key and verified with a public key. Below is a sealed-auction use case extracted from demo.py.

import ecvrf_edwards25519_sha512_elligator2

# Alice generates a secret and public key pair
secret_key = secrets.token_bytes(nbytes=32)
public_key = ecvrf_edwards25519_sha512_elligator2.get_public_key(secret_key)

# Alice generates a beta_string commitment to share with Bob
alpha_string = b'I bid $100 for the horse named IntegrityChain'
p_status, pi_string = ecvrf_edwards25519_sha512_elligator2.ecvrf_prove(secret_key, alpha_string)
b_status, beta_string = ecvrf_edwards25519_sha512_elligator2.ecvrf_proof_to_hash(pi_string)

# Alice initially shares ONLY the beta_string with Bob

# (Much) Later, Bob validates Alice's subsequently shared public_key, pi_string, and alpha_string
result, beta_string2 = ecvrf_edwards25519_sha512_elligator2.ecvrf_verify(public_key, pi_string, alpha_string)
if p_status == "VALID" and b_status == "VALID" and result == "VALID" and beta_string == beta_string2:
    print("Commitment verified")

Above we can see Alice generating a public_key and secret_key pair. While the public_key is meant to be shared, the ordering of this event depends on the specific use case or protocol. Here we show this is done only at the last stage of the scenario. Similarly, error checking is usually done throughout but we have collected it at the end here for clarity.

Next we see Alice perform a few steps to generate a beta_string which is the ‘random digest’ provided to Bob. The pi_string is used in the later proving step and so is initially held in reserve. Alice submits only the beta_string as her anonymous commitment.

When it comes time to close out the auction (e.g. open the sealed envelopes), Alice provides Bob with the original plaintext alpha_string along with the pi_string and her public_key. Bob takes these and is able to confirm consistency amongst the values. Bob is able to unambiguously connect the original beta_string with Alice’s public_key, which corresponds to Alice’s secret_key. So the mechanics can be considered a sort of public/secret signed hash.

2. Cryptographic Sortition

Now, let’s consider cryptographic sortition which is essentially about securely selecting a small ‘voting committee’ from a much larger set of potential participants. This technique is used in a variety of alternatives to (and improvements upon) the ‘Proof of Work’ consensus system which is commonly considered wasteful in terms of energy and other resources.

In this scenario, the system broadcasts the hash of the most recent consensus block to all potential participants as the common public alpha_string. Each participant privately calculates their beta_string corresponding to their individual secret_key. If the beta_string happens to match some agreed characteristics, such as specific leading digits or a relationship to their currency ownership etc, the participant can consider themselves selected for the voting committee to build consensus on the next block. Here is some pseudocode for this scenario:

0. Keygen in advance: secret key, public key, the latter is well known.

1. System broadcasts hash of latest consensus block as alpha_string 

2. Each participant privately determines their eligibility
   a. Generate pi_string with alpha_string and secret key
   b. Generate beta_string from pi_string
   c. If beta_string meets requirements, participant is (self) selected

3. When it comes time to vote on next consensus block, provide
   vote along with pi_string proof; 'Everybody' can validate that
   the beta_string meets requirements and the secret_key can vote.

This scheme is quite nice as the participants can securely select themselves without any interaction beyond the original alpha_string. An adversary cannot pre-calculate who will be selected (for attack targeting). The beta_string can be related to other criteria such as the account balance corresponding to the secret_key. Voting results can be subsequently validated by anyone with an interest as all but the secret_key can be considered public.

Examine and Explore

A few folks have told me that the VRF specification reads like Python pseudocode. In that case, the repository code simply removes the ‘pseudo’ … leaving executable Python code. In fact, a good chunk of the specification is incorporated right alongside the code statements as comments and makes it read like a step-by-step recipe.

As a result, you can develop a basic scenario of your own and watch it execute step-by-step in a Python IDE debugger. Sometimes the math is tricky, so watching the intermediate calculations evolve can be quite useful. Further, the code is arranged to support changes/evolution/extensions by providing the ability to test detailed intermediate values. If you have an interest in porting VRFs to your language and system of choice, the code provides the ability to generate detailed testing vectors. Check out the test file to see everything in action.

As described in the original blog post, the primary API is no more complicated than what you have already seen above. Specifically:

# Section 5.1. ECVRF Proving
def ecvrf_prove(sk, alpha_string):
        sk - VRF private key (32 bytes)
        alpha_string - input alpha, an octet string
        ("VALID", pi_string) - where pi_string is the VRF proof, octet string of length ptLen+n+qLen
        (80) bytes, or ("INVALID", []) upon failure

# Section 5.2. ECVRF Proof To Hash
def ecvrf_proof_to_hash(pi_string):
        pi_string - VRF proof, octet string of length ptLen+n+qLen (80) bytes
        ("VALID", beta_string) where beta_string is the VRF hash output, octet string
        of length hLen (64) bytes, or ("INVALID", []) upon failure

# Section 5.3. ECVRF Verifying
def ecvrf_verify(y, pi_string, alpha_string):
        y - public key, an EC point as bytes
        pi_string - VRF proof, octet string of length ptLen+n+qLen (80) bytes
        alpha_string - VRF input, octet string
        ("VALID", beta_string), where beta_string is the VRF hash output, octet string
        of length hLen (64) bytes; or ("INVALID", []) upon failure

One interesting aspect of the specification relates to the usability of the last function above. As written, the (successful) caller receives a ("VALID", beta_string) tuple and is expected to confirm the first value is “VALID” and the second value matches the expected beta_string. If the latter step is not done, perhaps due to a mistake or oversight, the scheme falls apart. Many would say that a more foolproof version of the function ought to take in the expected beta_string and respond with a simpler Fail/Success indication.

Health warning: The primary purpose of the code is to enable easy examination and exploration. As the repository README.md states, the code is alpha-quality and not suitable for production. Obvious timing side-channels exist. Specifically, both the algorithms within the code and (the use of) Python’s big integers are clearly not constant time and thus introduce timing side channels.

This post introduced actual working VRF code in action. The code is accessible and easy for everyone to examine and explore. I have shown it in action and dropped a few hints on what you might do with it. Now it is your turn to engage and enjoy!