Pairing over BLS12-381, Part 1: Fields

This is the first of three code-centric blog posts on pairing based cryptography. The series will ultimately conclude with a detailed review of the popular BLS12-381 pairing operations found in a variety of applications such as BLS signatures [1]. Support for these operations in an Ethereum precompiled contract has been proposed [2], and support for a related pairing configuration in precompiled contracts is already in operation [3, 4]. For now, this first post introduces a little Haskell, finite fields, the embedding degree, and the implementation of a 12-degree prime extension field tower.

These blog posts complement existing theoretical resources by placing a strong emphasis on actual working code. Everything will be demonstrated inside of just 200 lines [5] of Haskell accompanied by generous comments and whitespace. The source code is available at https://github.com/nccgroup/pairing-bls12381 to facilitate experimentation. Note that the Filecoin project [6] also has some great examples written in Rust which provide a complementary perspective.

Why Haskell?

Haskell [7] is a general-purpose functional programming language well-suited for teaching and research purposes. Its statically typed expressions and overloaded operators remind many of physics equations involving specific units, which is wonderful given that we will be dealing with a variety of very similar but distinct fields. This prevents inadvertently mixing units, leading to mistakes and bad results such as the Mars orbiter crash. Further, the remarkable conciseness helps to keep our central concepts in sharp focus without verbose distractions. As such, it serves as a great bridge between theory and practice.

Housekeeping

Throughout these three blogs, we will cover each and every (functional) line necessary to reach our end goal of pairing. Let’s get started with some basic housekeeping below. We provide line numbers that match the source code in ./Crypto/Pairing_bls12381.hs [8] where all the action takes place.

We see above that every source file starts off with a module declaration that defines a module name matching the filename and also provides a list of functions to export. This export list will become our pairing API in the end, and the ordering drives the nice documentation produced via stack haddock [9]. Following the module declaration, lines 62-64 import just a few specific functions from the standard library for use later.

Modular Arithmetic and the Fq1 Field

As expected, at the foundation of everything is a set of integers from 0 to a prime number fieldPrime along with two operations (addition and multiplication) satisfying a few properties that result in a finite field:

• An operation on two elements of the set results in another element of the set.
• Associativity and commutativity; Distributivity of multiplication over addition.
• Identities and inverses exist for both addition and multiplication.

We are going to be working with very large numbers. Fortunately, Haskell provides an arbitrary-precision Integer type (that includes support for the modulo operator) such that we will not need to worry about overflow.

As we declare our fieldPrime on line 76 below, we precede it with a type declaration for the compiler to inspect. Our implementation then defines a custom finite Field interface that extends the Num interface on lines 80-82. By first ‘meeting’ the Num interface, which is essentially addition, subtraction, multiplication and a few helper functions, we get access to a lot of standard functionality as well as things like operator evaluation precedence. Our custom Field interface extension then only consists of the modular multiplicative inverse and a special helper function mul_nonres(). Note that long lines in the code snippets are truncated to keep the fonts sensibly sized – see the original code for every last digit of the field prime.

Now we are ready to implement the code for the first single-element finite field. Shown below on lines 88-98 are the necessary functions for the Fq1 type to implement the Num interface – this is essentially overloading the +, -, * and fromInteger() operators. As they are an existing interface, the type declarations already exist and need not be repeated. We are able to take a short-cut on the abs() and signum() functions because we do not rely on any standard functionality that utilizes them (but we do need to implement something). This logic is followed by the two functions implementing our Field interface on lines 103 and 106/107.

Note that operators containing symbols, such as +, - and *, are typically used via infix notation. To specify them via postfix we need to surround them with parenthesis. Postfix functions like mod can be specified via infix notation if surrounded by tick marks. Haskell comments are behind -- tokens. Further, the blog text distinguishes function() from constant with parentheses for clarity, despite Haskell not using parenthesis in actual function calls.

The three symbol operators accept two parameters each of the specified type. When the operators are called, the system attempts to pattern match the supplied arguments with the accepted parameters. For example, first the Fq1 type on line 88 is required on both parameters. Then the a0 and b0 parameters on the left hand side are considered initially empty, so the pattern match process thus fills them with values. On the right side of the equals sign of line 88, we construct a fresh Fq1 type to hold the function return value – the arithmetic operation is performed and the result modulo our field prime is returned. The fromInteger() function on line 94 just constructs a fresh Fq1 instance based on an integer parameter. This pattern matching approach runs throughout all Haskell code.

The above logic is relatively minimal and straightforward. All the parenthesis to the right of the function names are necessary to force correct evaluation precedence. We have implemented what is necessary for Fq1 to attach to the Num class per line 86, and also the Field class per line 101.

Besides mul_nonres() which we will address later, the only detail left to review is the implementation of the multiplicative inverse on lines 106/107. While Fermat’s little theorem [10] would allow us to implement this in a single line via exponentiation for this field and all subsequent ones, a small performance optimization allows an early view into more typical Haskell functional programming.

Below we have adapted the well-known Euclidean algorithm [11] which is typically used for computing the greatest common divisor of two integers. We further modify it to remove the very expensive division operations and instead substitute a right shift by 1 (which is effectively divide by 2). As we saw earlier, the logic is preceded by a type declaration on line 112. When called with the correct parameters, as is done in the inv() function above, the function will return the multiplicative inverse.

Here we see the same pattern matching happening on the function parameters on line 113, but now we have additional guard clauses following that. These clauses can be interpreted as conditional tests that select which single expression on the right of the equal sign is actually evaluated. The first guard clause supports simple input validation, the next two guard clauses are for the recursive terminating conditions and the remaining four do the actual iterative work. Note that in Haskell, variables are write-once which precludes the typical iterative for loops. As can be seen above, recursion is heavily relied upon instead. Detailed insight into this algorithm is presented in the reference noted in the code comment [12].

Embedding Degree

The next blog post will utilize the above finite field to represent point coordinates on the elliptic curve E1 : Y2 = X3 + 4 modulo our field prime, but we will drop some advance hints here leading to needing more fields. The standard generator for this curve yields a group of nearly 2256 which is what gives us nearly 128-bits of security. Below is the precise figure for our group order on line 220, which is also exported in the original module declaration.

The pairing operation requires two elliptic curve points from two different groups having this same order r using the field prime q. Thus, we will have to employ a prime extension field of degree k, where k is the smallest positive integer such that r|(qk − 1) (and a few other conditions) which allows multiple groups with the same order to emerge. In our case, the embedding degree k is 12.

So the curve name BLS12-381 is suggestive of the embedding degree of 12 along with a 381-bit field prime.

Instead of the basic set of integers we used in Fq1, this prime extension field of degree 12 can be described as the set of polynomials with coefficients modulo the field prime and of degree strictly less than 12. So ultimately we need to implement a field consisting of polynomials that are equivalent to this:

Fq^12 : a11x11 + a10x10 + a9x9 + a8x8 + a7x7 + a6x6 + a5x5 + a4x4 + a3x3 + a2x2 + a1x1 + a0

Rather than implementing this field in a flat manner, we will build it in a few hierarchical steps.

Tower Organization of Fq12

Our Fq12 field will be implemented in a hierarchy of sub-fields as shown in the picture below.

The bottom layer contains instances of Fq1 which we have already coded. The next layer up is Fq2 which is implemented as a set of polynomials of degree strictly less than 2 and containing coefficients in Fq1. Similarly, Fq6 and Fq12 are built from the layers below them. Consider each of the upper layers to contain references to instances of those below. The result is known as the tower organization.

The code below shows the data types we will be using to represent the various fields. As Fq1 is essentially an alias for Integer it uses the simpler newtype declaration form on line 68. The Fq2 field data declaration on line 69 contains two instances (for coefficients) of the type Fq1. The upper layers Fq6 and Fq12 follow this exact same strategy. Note the individual coefficients are ordered with the
highest degree first and are marked as t, u, v and w so we can refer to them individually as needed.

As we will see, each layer will have its own modulo reduction scheme. For Fq1 we used just the fieldPrime. The mul_nonres() function will be used for the layers above this.

Implementing Fq2, Fq6 and Fq12

Working from the bottom up and already having Fq1 coded, we implement Fq2 below. This field is essentially identical to what most know as the complex numbers. Addition, subtraction and multiplication should be as expected. Note that Haskell does the right thing when taking mod of a negative number; the result is positive. The mul_nonres() function implementation on line 144 multiplies its single argument by the irreducible polynomial of the field above (which in this case is Fq6 where v3 − ξ = 0 and ξ = u + 1).

The implementation for Fq6 is shown below. In this case we have references to three instances of the Fq2 layer below as can be seen in the leftmost pattern matching targets. The schoolbook multiplication shows how mul_nonres() is used to reduce the higher order terms on lines 164 and 165. The fromInteger() function consistently (here and in the other fields) takes a single integer and places it into the rightmost coefficient. The multiplicative inverse on lines 178-183 is a little more involved but follows standard algorithms presented in the literature.

Finally, we have Fq12 below which follows the same pattern as the prior fields. Consider for a moment how adding two Fq12 instances together results in adding each of their two Fq6 references together. When each of these two Fq6 instances are added together, what is really happening is adding together the three references of Fq2. And of course, adding Fq2 instances involves adding the individual references to Fq1. In each case, the operators are overloaded and easily able to drive the desired behavior. Thankfully, operator precedence minimizes the proliferation of distracting parenthesis. While these fields could likely be rewritten as a parameterized version of a single algorithm, the patterns would then become far less obvious (and thus far more difficult to understand).

What have we learned?

This blog post has introduced some basic Haskell, provided a typical single-element finite field implementation, touched on the embedding degree needed to establish more groups of order r on our target curve, and finally developed an implementation of a degree 12 prime extension field through a series of smaller fields arranged as a tower. This is a large portion of the internal machinery needed to perform operations on the elliptic curves and to support the actual pairing function. Remember, the code is easily accessible to download and run via stack test – please refer to the verbose README.md.

What’s next?

The next post will implement a variety of elliptic curve operations using coordinates constructed from two of the fields implemented above. We will export several of those operations and the group generators, implement some validation checks and describe the sextic twist. Stay tuned!

References

  1. https://datatracker.ietf.org/doc/draft-irtf-cfrg-bls-signature/
  2. https://eips.ethereum.org/EIPS/eip-2537
  3. https://eips.ethereum.org/EIPS/eip-196
  4. https://eips.ethereum.org/EIPS/eip-197
  5. Reported by the cloc tool at https://github.com/AlDanial/cloc
  6. bls-signatures, pairing, group and ff at https://github.com/filecoin-project
  7. https://www.haskell.org/
  8. https://gihub.com/nccgroup/pairing-bls12381/Crypto/Pairing_bls12381.hs
  9. https://docs.haskellstack.org/en/stable/README/
  10. https://en.wikipedia.org/wiki/Fermat%27s_little_theorem
  11. https://en.wikipedia.org/wiki/Euclidean_algorithm
  12. https://link.springer.com/book/10.1007/b97644