* RNG Design Document

A good PRNG is necessary for secure crypto, and is in many ways a single point
of failure. I feel the code implementing Botan's RNGs (Randpool, ANSI X9.17,
FIPS 186 RNG) is pretty clear, so it should be fairly easy to understand the
algorithms directly. Just in case additional explanation is desired, I wrote
this.

Randpool is designed to provide the maximum possible assurance that the PRNG is
safe. The X9.17 generator (actually a modified X9.17) is significantly faster,
but somewhat more risky for general use. For this reason, Randpool is used for
generating keys, and the X9.17 generator is relegated to generating nonces and
the like. The FIPS 186 generator basically exists because the changes made to
the ANSI X9.17, while making it secure, probably are large enough that the
specific implementation is no longer FIPS compliant. If, in fact, it turns out
that the ANSI X9.17 is acceptable from a FIPS standpoint, the FIPS 186 RNG will
probably be removed. (FIPS approval of any RNGs used is required for a FIPS 140
validation).

If you think of any kind of attack on either RNG design, please contact me so I
can fix it.

* Randpool

Randpool has an internal state called pool, which is 1024 bytes long. This is
where entropy is mixed into and extracted from. There is also a small buffer
(called output) from which the output is taken. It is based around a hash
function (SHA-1) and a block cipher (AES). The intent is that the design will
be secure unless someone can break both the hash and the cipher. Where a
specific size is mentioned, it should be taken as a multiple of the block
cipher size. For example if a 256-bit block cipher were used instead of AES,
all of the sizes internally would double.

Every time some new output is needed, we hash a counter (incremented every
output), the value of a high resolution timer, and the entire pool. The
resulting hash is XORed into the output buffer (wrapping as needed), and the
output buffer is then encrypted with AES, producing 16 bytes of output.

Initially the cipher is keyed with all zeros, but we then generate enough bytes
(which are discarded) that the pool is mixed and the cipher is keyed with
something random (as described below). However the only entropy at this point
is a few timer values, so there probably isn't much actual entropy at this
point, but it will at least look random. Since the pool is mixed (see next
paragraph for details) every time entropy is added, and you have to add entropy
before Randpool will produce output, any actual PRNG output should be encrypted
with a fully random key.

After every 8 blocks (128 bytes) have been produced, we mix the pool. To do
this, we first rekey our AES object using the contents of the current output
buffer (this key is not used directly as output, for obvious reasons). We then
encrypt the entire pool in CBC mode, using the last block of the pool as the
IV. Finally, the output buffer has it's bits inverted, and it is encrypted
using the new AES key (which is actually the contents of the output buffer
before it's bits are flipped).

Every time randomness is added to the PRNG, we remix the pool and add the value
of high resolution timers. This is done at least once regardless of how much
data was added, plus once for every block of 512 bytes (or portion of) that was
added.

The use of the counter and timestamp means that the resulting hash will change
every time (even if the timestamp somehow becomes frozen). Even if, somehow,
both values became frozen, the output would still appear random because it is
XORed against the previous output and then encrypted with a block cipher.

The pool mixing strategy is a bit more questionable. In particular, it gives an
attacker a number of outputs of the form AES_k(~k) for random k. However,
unless AES is much weaker than anyone thinks, it should not be a problem since
assuming the way we generate the output is safe, it is impossible for an
attacker to control the key in any meaningful way. Perhaps a stronger strategy
is needed here. The mixing function itself (AES/CBC) should work well for
distributing changes across the entire state - 16 bytes of information is
pretty decent.

The reason for using the last block of the pool as the IV when encrypting the
pool in CBC mode is because changes in the middle of the pool will affect the
end of the pool in the next mix, so using the end as the IV will affect the
entire pool the next time. We always mix at least twice when adding entropy, so
this should be OK.

When adding entropy, the Randpool object estimates how much entropy is in the
input buffer. If less than 256 bits of (estimated) entropy have been provided
as input, Randpool will refuse to produce output. When mixing in provided
entropy, we XOR the input against at most half of the pool (this is done
repeatedly if needed).  After each XOR against the pool, the generation
algorithm is run, followed by a pool mixing. We follow up with another pool
mixing at the end. While this is rather expensive, no more than a few tens of
kilobytes will ever be provided as input to a Randpool, so it is not much of an
issue. The reason for limiting each mixing pass from affecting no more than
half the pool is to make sure that it is impossible to affect the entire
internal state.

* X9.17 RNG

This is essentially the standard issue X9.17 appendix C PRNG, with the
following changes:

 * AES is used instead of TripleDES
 * The timestamp is encrypted in CBC mode instead of ECB
 * Every 16 iterations, we do a rekeying of the cipher, taking the new key
   from some internal state.

Entropy is added by XOR'ing the input into one of the state buffers, then
generating new output (which is discarded). This is felt to be sufficient,
because in Botan the entropy provided to the X9.17 generator is the output of a
Randpool generator. In addition, the fact that the X9.17 generator is only used
to generate nonces means attacking it is not likely to be terribly fruitful
anyway.

The X9.17 generator will refuse to work unless it believes that at least 96
bits of entropy have been provided. This should be more than sufficient for
nonce generation.

* FIPS 186 RNG

The FIPS 186 RNG is exactly as specified in FIPS 186-2. Internally, it holds a
pointer to a 'real' RNG implementation (Randpool). Whenever entropy is added to
the FIPS RNG, it simply passes than entropy on to the Randpool, then generates
a new XKEY by pulling 20 bytes from the Randpool (if the Randpool is
sufficiently seeded). A new XKEY is also generated each time the randomize()
member function is called -- this seems morally equivalent to generating a new
XKEY each time a new k or x parameter is generated (the FIPS is not
particularly clear about how the RNG is supposed to behave when used as a
general purpose random number generator).

Each time new output is generated, XSEED (the "optional user input" parameter)
is generated by reading 20 bytes from Randpool.

Thus, all entropy estimation, mixing, and so forth are actually performed by
the Randpool implementation. The state of the normal FIPS RNG (with a stateless
XKEY and XSEED) is quite small, and while it's large enough to prevent brute
force search, it might be too small to be secure given the existence of
weaknesses in the mixing function.

* Entropy Estimation

The entropy estimation techniques used are not terribly clever. The estimation
is done in the function entropy_estimate in src/util.cpp

Essentially, if the input is less than 5 bytes long, we say that the input had
no entropy. This is just to simplify the rest of the code, since the algorithm
assumes it has a sequence of bytes to work on. Typically the inputs are 64-256
bytes long, so this isn't a problem.

The entropy is estimated using first, second, and third level deltas, with a
difference metric of XOR. The entropy of the byte is estimated to be the
Hamming weight of the smallest of the three deltas. The sum of each byte's
entropy is taken to be the entropy estimate. We then return (estimate/2), so we
are (hopefully) underestimating the actual entropy. This lowered entropy
estimate means that often a single slow poll (from a single entropy source)
will not collect enough entropy to fully seed the PRNG. This was done
intentionally so that multiple polls (hopefully from multiple sources) will
always been done. The exception is if /dev/urandom is available, as often a
single poll from it is sufficient to seed the PRNG. This is probably fine, and
users who are feeling especially paranoid can always call
Global_RNG::seed(true, 0) to force a slow poll from all available entropy
sources.

We used to do two different estimates, one as above, and another using a
difference metric (additive difference), and then return the smaller of the
two. But in fact the addition-based estimate was always much larger, so we
don't bother with that anymore.

The estimates seem to make some amount of sense. For example, the outputs of
/dev/urandom and EGD will appear to have about the same amount of entropy,
which makes sense as they are both hashed with SHA-1. In contrast, less random
data, like ASCII strings, will get a significantly lower entropy rating. For
example, the typical string which is used to initialize the OpenSSL RNG will
not come close to satisfying the RNGs.
