One question I have been getting a lot lately is whether you have to check the carry flag to see whether RDRAND returned a valid random number. The question comes up because of this description of an RDRAND underflow condition, which appears in the DRNG Software Implementation Guide:
After invoking the RDRAND instruction, the caller must examine the carry flag (CF) to determine whether a random value was available at the time the RDRAND instruction was executed. A value of 1 indicates that a random value was available and placed in the destination register provided in the invocation. A value of 0 indicates that a random value was not available. In this case, the destination register will also be zeroed.
It is the final two sentences, the ones describing the CF=0 case, that are the source of the inquiry. The logic goes like this: if the RDRAND instruction places a zero in the register in addition to clearing the carry flag (CF), can a developer just check for a value of zero instead?
Strictly speaking, the answer is no. The proper way to determine whether or not RDRAND returned a valid random number is to check the status of CF. If CF=1, the number returned by RDRAND is valid. If CF=0, a random number was not available.
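The check described above can be sketched in C roughly as follows. This is a minimal illustration, not a definitive implementation: the names rdrand_step_fn, hw_rdrand64_step, rdrand64_retry, and the two fake_step_* stand-ins are hypothetical, and the retry limit of 10 is an assumption chosen for illustration. The only real API used is the _rdrand64_step intrinsic from immintrin.h, which returns the carry flag; it is compiled in only when the build enables RDRAND on x86-64 (for example with -mrdrnd).

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__RDRND__) && defined(__x86_64__)
#include <immintrin.h>
#endif

/* Hypothetical callback type: returns 1 when CF=1 (value is valid),
 * 0 when CF=0 (no random value was available). */
typedef int (*rdrand_step_fn)(uint64_t *out);

#if defined(__RDRND__) && defined(__x86_64__)
/* Real hardware step: _rdrand64_step returns the carry flag. */
static inline int hw_rdrand64_step(uint64_t *out)
{
    unsigned long long v = 0;
    int cf = _rdrand64_step(&v);
    *out = (uint64_t)v;
    return cf;
}
#endif

/* Assumed retry limit, chosen for illustration. */
#define RDRAND_RETRIES 10

/* Decides validity from the carry flag only, never from the value. */
static inline bool rdrand64_retry(rdrand_step_fn step, uint64_t *out)
{
    for (int i = 0; i < RDRAND_RETRIES; i++) {
        if (step(out))
            return true;   /* CF=1: *out holds a valid random number */
    }
    return false;          /* CF=0 on every attempt: underflow persisted */
}

/* Demonstration stand-in (not hardware): reports an underflow on its
 * first two calls, then succeeds with a fixed value. */
static int fake_calls = 0;
static inline int fake_step_recovers(uint64_t *out)
{
    if (fake_calls++ < 2) {
        *out = 0;
        return 0;
    }
    *out = UINT64_C(0x0123456789abcdef);
    return 1;
}

/* Demonstration stand-in (not hardware): always reports an underflow. */
static inline int fake_step_always_underflows(uint64_t *out)
{
    *out = 0;
    return 0;
}
```

On a machine with RDRAND, the call would be rdrand64_retry(hw_rdrand64_step, &value). The stand-ins exist only so the retry logic can be exercised without the hardware.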
For the rest of this discussion, for simplicity's sake, we'll assume you want to obtain a 64-bit random value. Everything below still applies if you ask for 16- or 32-bit values.
Why this matters
Behind this question is the assumption that the registers are only 0 when the RDRAND instruction cannot return a random number. In other words, that the RDRAND instruction never returns 0 as a random number.
For very early implementations of Intel Data Protection with Secure Key, this was true. The physical hardware, specifically the signaling method used on the bus, did not support an out-of-band method of transmitting an error condition. In order to support error reporting on these early architectures, it was necessary to appropriate one of the possible values in the 64-bit random number space and use that to indicate an error condition. For simplicity, the designers of the DRNG chose the value "zero". On these early architectures, random 64-bit values of zero are discarded, and thus never returned as a valid random number. A zero is only sent when an underflow occurs. This effectively reduces the random number space for RDRAND from 2^64 to 2^64 - 1, since the legal range is (1, 2^64-1) instead of (0, 2^64-1).
Newer architectures, however, do not have this limitation. Future implementations of Secure Key can return a value of 0 as a valid random number. They return values in the full 2^64 space, (0, 2^64-1). On these architectures, checking for a value of zero instead of for the correct condition of CF=0 throws away valid random numbers. In other words, you'll be issuing an extra RDRAND roughly once every 2^64 executions on average.
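To make the difference between the two tests concrete, here is a small C sketch that classifies a handful of RDRAND outcomes both ways. The struct and function names (rdrand_outcome, valid_by_cf, valid_by_zero_test, wrongly_discarded) are hypothetical and exist only for this illustration; the outcomes are supplied by the caller rather than read from hardware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One observed RDRAND result: the carry flag and the destination register. */
struct rdrand_outcome {
    int cf;
    uint64_t value;
};

/* The documented test: the value is valid exactly when CF=1. */
static inline bool valid_by_cf(struct rdrand_outcome r)
{
    return r.cf == 1;
}

/* The shortcut under discussion: treat any zero as "no value". */
static inline bool valid_by_zero_test(struct rdrand_outcome r)
{
    return r.value != 0;
}

/* Counts outcomes that the zero test rejects even though CF=1,
 * i.e. valid random numbers that the shortcut would throw away. */
static inline size_t wrongly_discarded(const struct rdrand_outcome *r, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (valid_by_cf(r[i]) && !valid_by_zero_test(r[i]))
            count++;
    }
    return count;
}
```

Given the three outcomes {1, 42}, {0, 0}, and {1, 0}, the two tests agree on the first two and disagree only on the last: CF=1 with a value of zero, which is a valid result on hardware that returns the full range.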
What if I just ignore results of zero?
This is a logical question: if I don't care about zero, and don't care that my valid range is (1, 2^64-1), then I can just ignore a zero. Using this as the test for a valid number is still technically incorrect, however. Software should not act upon the secondary effects of an instruction when making decisions. Only the published error checking procedures should be followed, as those are the ones that are guaranteed to be accurate and work both in the present and the future.
What if I need that zero in my range?
Since early architectures cannot return a zero, the recommended way to expand the range to (0, 2^64-1) is to XOR two RDRAND values together. The result covers the full range of (0, 2^64-1), because two equal inputs XOR to zero, and it is very nearly uniform. XOR'ing a value with an independent, exactly uniform value always gives an exactly uniform result; on early architectures the inputs are uniform over 2^64 - 1 values rather than 2^64, so a zero result is very slightly more likely than any other single value. That bias is negligible, though if you are paranoid about it you can XOR multiple values: each XOR operation will result in an increasingly uniform result.
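The effect of the XOR can be checked exhaustively on a scaled-down model. The C sketch below assumes a hypothetical 4-bit generator that, like the early hardware, never returns zero, and counts how often each result appears across all pairs of inputs. The names xor_combine, xor_pair_counts, MODEL_BITS, and MODEL_SIZE are made up for this illustration, and xor_combine assumes both inputs were already obtained with CF=1.

```c
#include <stdint.h>

/* Combines two independently obtained RDRAND values. Both inputs are
 * assumed to have been fetched successfully (CF=1). */
static inline uint64_t xor_combine(uint64_t a, uint64_t b)
{
    return a ^ b;
}

/* Scaled-down model: 4-bit values, so 16 possible results. */
#define MODEL_BITS 4
#define MODEL_SIZE (1u << MODEL_BITS)

/* Fills counts[v] with the number of input pairs (a, b), both nonzero
 * as on the early hardware, whose XOR equals v. */
static inline void xor_pair_counts(unsigned counts[MODEL_SIZE])
{
    for (unsigned v = 0; v < MODEL_SIZE; v++)
        counts[v] = 0;

    for (unsigned a = 1; a < MODEL_SIZE; a++) {
        for (unsigned b = 1; b < MODEL_SIZE; b++)
            counts[a ^ b]++;
    }
}
```

In this model, zero is produced by 15 of the 225 input pairs and every other value by 14, so every value in the full range is reachable and the counts differ by one. At 64 bits the same one-pair difference is spread over roughly 2^128 pairs, which is why the bias is negligible in practice.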