RDRAND: Do I need to check the carry flag, or can I just check for zero?

One question I have been getting a lot lately is whether you have to check the status of the carry flag to see if a valid random number was returned by RDRAND. The question arises from this description of the RDRAND underflow condition, which appears in the DRNG Software Implementation Guide:

After invoking the RDRAND instruction, the caller must examine the carry flag (CF) to determine whether a random value was available at the time the RDRAND instruction was executed. A value of 1 indicates that a random value was available and placed in the destination register provided in the invocation. A value of 0 indicates that a random value was not available. In this case, the destination register will also be zeroed.

It is the final two sentences of that passage that are the source of the inquiry. The logic goes like this: if the RDRAND instruction places a zero in the register in addition to clearing the carry flag (CF), can a developer just check for a value of zero instead?

Strictly speaking, the answer is no. The proper way to determine whether or not RDRAND returned a valid random number is to check the status of CF. If CF=1, the number returned by RDRAND is valid. If CF=0, a random number was not available.
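
A minimal sketch of that check in C, assuming a retry loop around the instruction. The names rdrand_step_fn, rdrand64_retry, and sim_step are hypothetical. The step function is passed in as a parameter so the logic does not depend on a particular compiler or CPU; on real hardware it would be a thin wrapper around an intrinsic such as _rdrand64_step, which reports the carry flag as its return value. The simulated source exists only to exercise the loop.

```c
#include <stdint.h>

/* Hypothetical step signature: returns the carry flag (1 = a valid value
 * was stored in *out, 0 = no value was available). */
typedef int (*rdrand_step_fn)(uint64_t *out);

/* Try up to max_tries times and succeed only when the step reports CF=1.
 * The value itself is never inspected, so a valid 0 is accepted. */
int rdrand64_retry(rdrand_step_fn step, unsigned max_tries, uint64_t *out)
{
    for (unsigned i = 0; i < max_tries; ++i) {
        uint64_t v;
        if (step(&v) == 1) {   /* CF=1: v is a valid random number */
            *out = v;
            return 1;
        }
        /* CF=0: underflow, so try again */
    }
    return 0;                  /* caller must handle the failure */
}

/* Simulated source for demonstration only: underflows twice, then
 * returns a valid zero. */
unsigned sim_calls;
int sim_step(uint64_t *out)
{
    *out = 0;
    return ++sim_calls > 2;    /* CF=0, CF=0, then CF=1 */
}
```

With the simulated source, rdrand64_retry(sim_step, 10, &v) succeeds on the third call and stores 0 in v, while a limit of 2 tries reports failure. The retry limit is the caller's choice.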

For the rest of this discussion, for simplicity's sake, we'll assume you want to obtain a 64-bit random value. Everything below still applies if you ask for 16- or 32-bit values.

Why this matters

Behind this question is the assumption that the destination register is 0 only when the RDRAND instruction cannot return a random number. In other words, that the RDRAND instruction never returns 0 as a random number.

For very early implementations of Intel Data Protection with Secure Key, this was true. The physical hardware, specifically the signaling method used on the bus, did not support an out-of-band method of transmitting an error condition. In order to support error reporting on these early architectures, it was necessary to appropriate one of the possible values in the 64-bit random number space and use that to indicate an error condition. For simplicity, the designers of the DRNG chose the value "zero". On these early architectures, random 64-bit values of zero are discarded, and thus never returned as a valid random number. A zero is only sent when an underflow occurs. This effectively reduces the random number space for RDRAND from 2^64 to 2^64 - 1, since the legal range is [1, 2^64-1] instead of [0, 2^64-1].

Newer architectures, however, do not have this limitation. Future implementations of Secure Key can return a value of 0 as a valid random number. They return values in the full 2^64 space, [0, 2^64-1]. On these architectures, checking for a value of zero instead of for the correct condition of CF=0 throws away valid random numbers. In other words, you'll be issuing an extra RDRAND roughly every 2^64 executions on average.
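
To make the difference concrete, here is a small sketch comparing the two tests. The rdrand_result struct and both function names are hypothetical; the struct simply pairs the destination register value with the carry flag.

```c
#include <stdint.h>

/* Hypothetical pairing of RDRAND's two outputs. */
struct rdrand_result {
    uint64_t value;  /* destination register */
    int      cf;     /* carry flag: 1 = valid, 0 = underflow */
};

/* Correct test: only the carry flag decides validity. */
int valid_by_cf(struct rdrand_result r)
{
    return r.cf == 1;
}

/* Shortcut test: treats any zero value as an underflow. */
int valid_by_zero(struct rdrand_result r)
{
    return r.value != 0;
}
```

The two tests agree on an underflow (value 0, CF=0) and on any non-zero result with CF=1. They disagree only on a valid zero (value 0, CF=1), which the shortcut test rejects.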

What if I just ignore results of zero?

This is a logical question: if I don't care about zero, and don't care that my valid range is [1, 2^64-1], then I can just ignore a zero. Using this as the test for a valid number is still technically incorrect, however. Software should not act upon the secondary effects of an instruction when making decisions. Only the published error checking procedures should be followed, as those are the ones that are guaranteed to be accurate and work both in the present and the future.

What if I need that zero in my range?

Since early architectures cannot return a zero, the recommended way to expand the range to [0, 2^64-1] is to XOR two RDRAND values together. This produces a random number in the full range of [0, 2^64-1], and the result is very close to uniform because XOR'ing a value with a uniformly random value results in a uniformly random value. Any bias towards a particular value will be negligible, though if you are paranoid about bias you can XOR multiple values: each XOR operation will result in an increasingly uniform result.
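
A minimal sketch of the XOR approach, under the same caveat that every draw is validated through the carry flag. The names rdrand_step_fn, rdrand64_xor, and sim_seq_step are hypothetical, and the step function stands in for a wrapper around the real instruction. The simulated source mimics an early architecture that never returns zero.

```c
#include <stdint.h>

/* Hypothetical step signature: returns the carry flag (1 = valid). */
typedef int (*rdrand_step_fn)(uint64_t *out);

/* Draw one value, retrying on CF=0 up to max_tries times. */
static int draw(rdrand_step_fn step, unsigned max_tries, uint64_t *out)
{
    for (unsigned i = 0; i < max_tries; ++i)
        if (step(out) == 1)
            return 1;
    return 0;
}

/* XOR two independently drawn values. Zero becomes reachable even if
 * the source never returns zero, because two equal draws cancel. */
int rdrand64_xor(rdrand_step_fn step, unsigned max_tries, uint64_t *out)
{
    uint64_t a, b;
    if (!draw(step, max_tries, &a) || !draw(step, max_tries, &b))
        return 0;
    *out = a ^ b;
    return 1;
}

/* Simulated zero-free source for demonstration only. */
const uint64_t sim_seq[4] = {7, 7, 9, 12};
unsigned sim_pos;
int sim_seq_step(uint64_t *out)
{
    *out = sim_seq[sim_pos++ % 4];
    return 1;                      /* always CF=1 */
}
```

With the simulated sequence, the first call combines 7 and 7 into 0 and the second combines 9 and 12 into 5, so a zero result appears even though the source never emits one.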





R H.

What about the code below? Will it produce a file with correct RDRAND random numbers?

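#include <stdio.h>
#include <stdlib.h>

#define USE_GCC_INLINE_ASM

#ifndef USE_GCC_INLINE_ASM
#include <immintrin.h>   // _rdrand32_step (build with -mrdrnd)
#endif

unsigned int rdrandfetch(void)
{
    unsigned int r = 0;
    int cf = 0;
    const size_t RDRAND_POLLS = 32;
    size_t i;

    for (i = 0; i != RDRAND_POLLS; ++i)
    {
#ifdef USE_GCC_INLINE_ASM
        // Encoding of rdrand %eax; adcl then adds the carry flag into cf
        asm volatile(".byte 0x0F, 0xC7, 0xF0; adcl $0,%1" :
              "=a" (r), "=r" (cf) : "0" (r), "1" (cf) : "cc");
#else
        cf = _rdrand32_step(&r);
#endif
        if (cf == 1) return r;
    }

    // Every poll came back with CF=0
    printf("\nRdrand failed. Aborted.\n");
    exit(1);
}

#define NDWORDS (1024*256) // 1 MB
unsigned int buf[NDWORDS];

int main(int argc, char **argv)
{
    int i, j, mb;
    FILE *out;

    out = (argc > 1) ? fopen(argv[1], "wb") : NULL;
    if (out == NULL)
    {
        fprintf(stderr, "could not open output file\n");
        return 1;
    }

    // Assumed: the size in MB is the second argument
    if (argc < 3 || sscanf(argv[2], "%d", &mb) != 1)
    {
        fprintf(stderr, "could not read number of MBytes\n");
        return 1;
    }

    printf("\nFetching %d MB..", mb);

    // Fill and write one 1 MB buffer per iteration
    for (i = 0; i < mb; ++i)
    {
        for (j = 0; j < NDWORDS; ++j)
            buf[j] = rdrandfetch();
        fwrite(buf, sizeof(buf[0]), NDWORDS, out);
    }

    fclose(out);
    return 0;
}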





John M. (Intel)

There are two reasons to choose xor over word splicing:

  • xor is much faster
  • It is more correct. Xor'ing moves numbers towards greater uniformity. Slicing up two numbers in this fashion will solve the range problem, but will not make the individual words more uniform than they already were.

One of the engineers responsible for the DRNG is developing a mathematical argument for the general case of using xor as an extractor, and has promised to either come to me with a more formal response to this matter, or reply directly on this thread when that is done.
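
As an informal illustration of the uniformity point (not the formal argument mentioned above), the effect can be counted exhaustively in a scaled-down model. This sketch assumes a 4-bit analogue of an early architecture, where each draw is uniform on [1, 15] and zero is excluded. The names DOMAIN_BITS, DOMAIN_SIZE, and xor_counts are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define DOMAIN_BITS 4
#define DOMAIN_SIZE (1u << DOMAIN_BITS)   /* 16 possible values */

/* counts[c] = number of ways to obtain c as the XOR of `operands`
 * values, each drawn from [1, DOMAIN_SIZE-1] (zero excluded). */
void xor_counts(unsigned operands, uint64_t counts[DOMAIN_SIZE])
{
    uint64_t prev[DOMAIN_SIZE], next[DOMAIN_SIZE];

    /* XOR of no operands is 0, reachable in exactly one way. */
    memset(prev, 0, sizeof prev);
    prev[0] = 1;

    for (unsigned k = 0; k < operands; ++k) {
        memset(next, 0, sizeof next);
        for (unsigned acc = 0; acc < DOMAIN_SIZE; ++acc)
            for (unsigned v = 1; v < DOMAIN_SIZE; ++v)  /* skip zero */
                next[acc ^ v] += prev[acc];
        memcpy(prev, next, sizeof prev);
    }
    memcpy(counts, prev, sizeof prev);
}
```

In this model, XOR'ing two draws yields zero in 15 of the 225 combinations and every other value in 14. XOR'ing three draws yields zero in 210 of the 3375 combinations and every other value in 211, so the relative gap between outcomes shrinks with each additional operand.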

peter-gerdes

The Xor recommendation seems a little weird when it is so easy to do the right thing.

You want a random number distributed uniformly from 0 to 2^64-1? Call RDRAND twice, generating values x and y. Now build z by setting the low 32 bits of z to the low 32 bits of x-1 and the high 32 bits of z to the low 32 bits of y-1.

Since x-1 and y-1 are uniformly distributed on the range [0,2^64-2], and the restriction of a uniform distribution is uniform, the low 32 bits of x-1 and y-1 are uniformly distributed on [0,2^32-1]. Now, as the map (z_1, z_2) -> 2^32*z_1 + z_2 is a bijection between [0,2^32-1]x[0,2^32-1] and [0,2^64-1], the value 2^32*(low 32 bits of y-1) + (low 32 bits of x-1) is uniformly distributed on [0,2^64-1].
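
The construction described in this comment can be sketched as follows, assuming x and y are two non-zero values already obtained from RDRAND with CF=1. The function name splice_low32 is hypothetical.

```c
#include <stdint.h>

/* Build z from two draws: the low 32 bits of (x-1) become the low word
 * and the low 32 bits of (y-1) become the high word. */
uint64_t splice_low32(uint64_t x, uint64_t y)
{
    uint64_t lo = (x - 1) & UINT64_C(0xFFFFFFFF);
    uint64_t hi = (y - 1) & UINT64_C(0xFFFFFFFF);
    return (hi << 32) | lo;
}
```

For example, splice_low32(1, 1) gives 0, and splice_low32(1, 2) gives 2^32, so both ends of the high and low words are reachable from non-zero inputs.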


