To Concatenate or Not Concatenate RDRAND?

By John P Mechalas,

Published:03/07/2014   Last Updated:03/07/2014

At the heart of Intel® Data Protection with Secure Key is the digital random number generator (DRNG), a NIST* SP800-90A compliant pseudorandom number generator which is accessed using the RDRAND instruction. Beginning with Intel CPU's code-named Broadwell, Secure Key will also include an SP800-90B and C compliant true random number generator, called an enhanced nondeterministic random number generator in the NIST specifications, that will be accessible via the RDSEED instruction.

Last year, I wrote a short blog post explaining the difference between RDRAND and RDSEED, and providing some recommendations on which instruction to use and under what circumstances. This blog posting was not without some controversy due to the nature of the DRNG and the numbers returned by the RDRAND instruction. While the DRNG generates 128-bit random numbers, the RDRAND instruction can return, at most, 64-bit random numbers. At issue was how to properly create a 128-bit random number from RDRAND when you can only fetch 64-bits at a time.

(Note that this article does not apply to RDSEED. The values returned by RDSEED can always be safely concatenated, as RDSEED is intended for seeding other pseudorandom number generators.)

How do you generate a 128-bit random number using RDRAND?

As one of our DRNG engineers put it, this is a question "that has two and a half answers, depending on who you talk to". The "two answers" part is that there are really two approaches for creating a 128-bit number from two 64-bit (or four 32-bit) random numbers. The "half answer" is that which method you choose depends in part on what you plan to do with the result. My goal is to present both methods as well as the guidance for selecting the correct one in your applications.

Note that the theoretical discussions below are not specific to the DRNG: they apply to any pseudorandom number generator. I reference RDRAND specifically because that is the focus of this article, but the methodologies presented below are universal.

Method #1: Concatenation

This is the simplest and fastest method: you simply concatenate two values together to form a longer number. Mathematically speaking, you fetch two random numbers, R1 and R2, and form R = R1 || R2 where || represents concatenation.  If you have the two 64-bit random numbers 0x807c837d and 0x58b6df9c, you can assemble the 128-bit number 0x807c837d58b6df9c. The order in which the values are arranged does not matter.

The problem with concatenating values from a pseudorandom number generator is that the operation is additive when it comes to determining the total entropy in the resulting value. Entropy is a measure of what is unknown inside of a system. The amount of work required to brute-force predict a random value that has n bits of entropy is O(2n). If you concatenate two values together, the entropy required to brute force the result becomes only 2n + 2n = 2n+1. By combining two n-bit random values, each with n bits of entropy, in this manner you get a random number with effectively only one additional bit of entropy.

Where this gets confusing with respect to the DRNG is that it produces 128-bit values with 128 bits of entropy, but the RDRAND instruction can return at most 64 bits at a time. From a purely cryptographic standpoint, when you are given an n-bit random number you can have at most n bits of entropy. While it can be argued that the DRNG is in reality just splitting a 128-bit value into two pieces and handing them to you one piece at a time, from a theoretical viewpoint this does not matter. While the original value had 128 bits of entropy, the end result is that you are handed two 64-bit numbers one at a time, each of which only has 64 bits of entropy. Because they come from a pseudorandom number generator, the entropy is additive when the values are concatenated, and the resulting 128-bit number has only 65 bits of entropy.

This is a theoretical argument. In practice an attacker may need many more than 265 computations to brute-force predict the random values that come from the DRNG because of its construction, but in security applications it is best practice to design for the ideal attacker.

Does this mean that concatenation is bad and should never be done? No. It just means that there are scenarios where concatenation is inappropriate.

Method #2: Cryptographic Mixing

This method is more involved than concatenation, but it leads to a more robust random number. What you do is concatenate the two random values together as above, but then apply a cryptographic primitive to the concatenated value, discarding the originals. Mathematically, R = F(R1 || R2), where is F is some cryptographic operation.

The advantage of this method over simple concatenation is that it completely destroys the original two random numbers, leaving no trace of them behind. The final value is a number that does not resemble the originals, cannot be identified as such, and the ideal attacker no longer has knowledge of the pseudorandom number sequence. When values are processed in this manner, the entropy is multiplicative. Returning to the amount of work required to brute-force predict a random value, from cryptographic mixing the result is of order 2n * 2n = 22n. Specifically, given two 64-bit numbers from RDRAND and mixed cryptographically, you end up with a 128-bit value that has 128 bits of entropy.

Which cryptographic primitive should you apply to the intermediate values R1 and R2? To completely eliminate all traces of the original value, you want a one-way function.

AES in CBC-MAC mode

This method is described in the Intel® Digital Random Number Generator (DRNG) Software Implementation Guide. Use of 128-bit AES in CBC-MAC mode is used to create 128-bit random values from two 64-bit samples. One large advantage of this method is that it can be done entirely within the registers, which eliminates memory- and cache-based attacks and greatly streamlines execution. It has one disadvantage, though, and that is that it adds a circular dependency on RDRAND: either the IV or the key has to be randomized (or, ideally, both), and those values must come from somewhere, with the logical choice being RDRAND.

SHA2 with HMAC

This method uses SHA256 with HMAC—sometimes written as HMAC_SHA256—to generate a 256-bit digest from the original two values, returning the first 128 bits (or the last; it does not matter which). The reason for using HMAC_SHA256 instead of just plain SHA-2 is that HMAC is provably better at extracting randomness (See "Randomness Extraction and Key Derivation Using the CBC, Cascade and HMAC Modes", by Dodis, Gennaro, Hastad, Krawczyk, and Tabin).

Unlike using AES, the HMAC_SHA256 method does not introduce a dependency on RDRAND as it can be done using an arbitrary key, but it cannot easily be implemented in a manner that keeps the intermediate values solely within the registers. This does, in theory, make it vulnerable to memory-based attacks.

So when do I concatenate, and when do I mix cryptographically?

Determining which method to use is pretty straight forward, as what we are talking about here is the theoretical time required to brute-force predict random numbers. It stands to reason, then, that a usage with a short life is more forgiving than one with a long life. The concern is that, given sufficient time and resources, an attacker can brute force the random values before their useful life has ended.

Specifically, that means that concatenation is sufficient for generating nonces, ephemeral keys such as session keys, and general random data. Where you really want to use cryptographic mixing is when you are generating static encryption keys. You can expect those keys to be in use for several years, and the data they protect is at risk if they can be exposed.

But isn't this all theoretical?

Well, yes, but that doesn’t mean we should ignore it. For one, it's always a good idea to follow best practices when programming security functions. This is not an area where we want to encourage sloppy programming. The second is that cryptographers are paranoid for a reason, and that reason is that sometimes theory meets practice. Sometimes it does so in unexpected ways.

Consider the following scenario: A developer creates software that generates 128-bit encryption keys using RDRAND. For simplicity, they use simple concatenation instead of cryptographic mixing, working under the assumption that because RDRAND is generating 128-bit random numbers, those two 64-bit values are really just a permutation of the original 128-bit value. They build their code and install it and it runs fine.

At some point in the future, however, that code may be executed on a machine that does not support the RDRAND instruction. Or, maybe the source code gets modified by someone who adds support for some other pseudorandom number generator. Or a coding error causes the program to choose a source other than RDRAND. The exact cause doesn't matter: instead of pulling from RDRAND, they instead pull 64-bit random numbers from some other source and concatenate them together. If that source has fewer than 128-bits of entropy, then they have measurably reduced the security of the system.

Another scenario: another programmer uses the original program as a reference. Maybe they are less versed in security programming, and blindly assume that concatenation is a valid means of concatenating multiple values from a PRNG because that's how it was done in the example they are following. The original programmer has inadvertently encouraged someone to follow a very, very bad practice.

Do things right, and follow the theory, and you protect yourself both in the present and the future.

Product and Performance Information

1

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.