From mboxrd@z Thu Jan  1 00:00:00 1970
From: Sandy Harris <sandyinchina@gmail.com>
Subject: [PATCH 4/7] Different version of driver using hash from AES-GCM
 Compiled if CONFIG_RANDOM_GCM=y
Date: Sat, 7 Nov 2015 09:30:39 -0500
Message-ID: <1446906642-19372-4-git-send-email-sandyinchina@gmail.com>
References: <1446906642-19372-1-git-send-email-sandyinchina@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org
To: "Theodore Ts'o", Jason Cooper, "H. Peter Anvin", John Denker
Return-path:
In-Reply-To: <1446906642-19372-1-git-send-email-sandyinchina@gmail.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-crypto.vger.kernel.org

Signed-off-by: Sandy Harris <sandyinchina@gmail.com>
---
 drivers/char/random_gcm.c | 3716 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 3716 insertions(+)
 create mode 100644 drivers/char/random_gcm.c

diff --git a/drivers/char/random_gcm.c b/drivers/char/random_gcm.c
new file mode 100644
index 0000000..360fbe3
--- /dev/null
+++ b/drivers/char/random_gcm.c
@@ -0,0 +1,3716 @@
+/*
+ * random.c -- A strong random number generator
+ *
+ * Copyright Matt Mackall, 2003, 2004, 2005
+ *
+ * Copyright Theodore Ts'o, 1994, 1995, 1996, 1997, 1998, 1999.  All
+ * rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, and the entire permission notice in its entirety,
+ *    including the disclaimer of warranties.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. The name of the author may not be used to endorse or promote
+ *    products derived from this software without specific prior
+ *    written permission.
+ *
+ * ALTERNATIVELY, this product may be distributed under the terms of
+ * the GNU General Public License, in which case the provisions of the GPL are
+ * required INSTEAD OF the above restrictions.  (This clause is
+ * necessary due to a potential bad interaction between the GPL and
+ * the restrictions contained in a BSD-style copyright.)
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ALL OF
+ * WHICH ARE HEREBY DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
+ * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
+ * USE OF THIS SOFTWARE, EVEN IF NOT ADVISED OF THE POSSIBILITY OF SUCH
+ * DAMAGE.
+ */
+
+/*
+ * (now, with legal B.S. out of the way.....)
+ *
+ * This routine gathers environmental noise from device drivers, etc.,
+ * and returns good random numbers, suitable for cryptographic use.
+ * Besides the obvious cryptographic uses, these numbers are also good
+ * for seeding TCP sequence numbers, and other places where it is
+ * desirable to have numbers which are not only random, but hard to
+ * predict by an attacker.
+ *
+ * Theory of operation
+ * ===================
+ *
+ * Computers are very predictable devices.  Hence it is extremely hard
+ * to produce truly random numbers on a computer --- as opposed to
+ * pseudo-random numbers, which can easily be generated by using an
+ * algorithm.  Unfortunately, it is very easy for attackers to guess
+ * the sequence of pseudo-random number generators, and for some
+ * applications this is not acceptable.  So instead, we must try to
+ * gather "environmental noise" from the computer's environment, which
+ * must be hard for outside attackers to observe, and use that to
+ * generate random numbers.  In a Unix environment, this is best done
+ * from inside the kernel.
+ *
+ * Sources of randomness from the environment include inter-keyboard
+ * timings, inter-interrupt timings from some interrupts, and other
+ * events which are both (a) non-deterministic and (b) hard for an
+ * outside observer to measure.  Randomness from these sources is
+ * added to an "entropy pool", which is mixed using a CRC-like function.
+ * This is not cryptographically strong, but it is adequate assuming
+ * the randomness is not chosen maliciously, and it is fast enough that
+ * the overhead of doing it on every interrupt is very reasonable.
+ * As random bytes are mixed into the entropy pool, the routines keep
+ * an *estimate* of how many bits of randomness have been stored into
+ * the random number generator's internal state.
+ *
+ * When random bytes are desired, they are obtained by taking the SHA
+ * hash of the contents of the "entropy pool".  The SHA hash avoids
+ * exposing the internal state of the entropy pool.  It is believed to
+ * be computationally infeasible to derive any useful information
+ * about the input of SHA from its output.  Even if it is possible to
+ * analyze SHA in some clever way, as long as the amount of data
+ * returned from the generator is less than the inherent entropy in
+ * the pool, the output data is totally unpredictable.  For this
+ * reason, the routine decreases its internal estimate of how many
+ * bits of "true randomness" are contained in the entropy pool as it
+ * outputs random numbers.
+ *
+ * If this estimate goes to zero, the routine can still generate
+ * random numbers; however, an attacker may (at least in theory) be
+ * able to infer the future output of the generator from prior
+ * outputs.  This requires successful cryptanalysis of SHA, which is
+ * not believed to be feasible, but there is a remote possibility.
+ * Nonetheless, these numbers should be useful for the vast majority
+ * of purposes.
+ *
+ * Exported interfaces ---- output
+ * ===============================
+ *
+ * There are three exported interfaces; the first is one designed to
+ * be used from within the kernel:
+ *
+ *	void get_random_bytes(void *buf, int nbytes);
+ *
+ * This interface will return the requested number of random bytes,
+ * and place them in the requested buffer.
+ *
+ * The two other interfaces are two character devices /dev/random and
+ * /dev/urandom.  /dev/random is suitable for use when very high
+ * quality randomness is desired (for example, for key generation or
+ * one-time pads), as it will only return a maximum of the number of
+ * bits of randomness (as estimated by the random number generator)
+ * contained in the entropy pool.
+ *
+ * The /dev/urandom device does not have this limit, and will return
+ * as many bytes as are requested.
+ * As more and more random bytes are requested without giving time
+ * for the entropy pool to recharge, this will result in random
+ * numbers that are merely cryptographically strong.  For many
+ * applications, however, this is acceptable.
+ *
+ * Exported interfaces ---- input
+ * ==============================
+ *
+ * The current exported interfaces for gathering environmental noise
+ * from the devices are:
+ *
+ *	void add_device_randomness(const void *buf, unsigned int size);
+ *	void add_input_randomness(unsigned int type, unsigned int code,
+ *				  unsigned int value);
+ *	void add_interrupt_randomness(int irq, int irq_flags);
+ *	void add_disk_randomness(struct gendisk *disk);
+ *
+ * add_device_randomness() is for adding data to the random pool that
+ * is likely to differ between two devices (or possibly even per boot).
+ * This would be things like MAC addresses or serial numbers, or the
+ * read-out of the RTC.  This does *not* add any actual entropy to the
+ * pool, but it initializes the pool to different values for devices
+ * that might otherwise be identical and have very little entropy
+ * available to them (particularly common in the embedded world).
+ *
+ * add_input_randomness() uses the input layer interrupt timing, as well as
+ * the event type information from the hardware.
+ *
+ * add_interrupt_randomness() uses the interrupt timing as random
+ * inputs to the entropy pool.  Using the cycle counters and the irq source
+ * as inputs, it feeds the randomness roughly once a second.
+ *
+ * add_disk_randomness() uses what amounts to the seek time of block
+ * layer request events, on a per-disk_devt basis, as input to the
+ * entropy pool.  Note that high-speed solid state drives with very low
+ * seek times do not make for good sources of entropy, as their seek
+ * times are usually fairly consistent.
+ *
+ * All of these routines try to estimate how many bits of randomness a
+ * particular randomness source provides.  They do this by keeping track
+ * of the first and second order deltas of the event timings.
+ *
+ * Ensuring unpredictability at system startup
+ * ============================================
+ *
+ * When any operating system starts up, it will go through a sequence
+ * of actions that are fairly predictable by an adversary, especially
+ * if the start-up does not involve interaction with a human operator.
+ * This reduces the actual number of bits of unpredictability in the
+ * entropy pool below the value in entropy_count.  In order to
+ * counteract this effect, it helps to carry information in the
+ * entropy pool across shut-downs and start-ups.  To do this, put the
+ * following lines in an appropriate script which is run during the boot
+ * sequence:
+ *
+ *	echo "Initializing random number generator..."
+ *	random_seed=/var/run/random-seed
+ *	# Carry a random seed from start-up to start-up
+ *	# Load and then save the whole entropy pool
+ *	if [ -f $random_seed ]; then
+ *		cat $random_seed >/dev/urandom
+ *	else
+ *		touch $random_seed
+ *	fi
+ *	chmod 600 $random_seed
+ *	dd if=/dev/urandom of=$random_seed count=1 bs=512
+ *
+ * and the following lines in an appropriate script which is run as
+ * the system is shut down:
+ *
+ *	# Carry a random seed from shut-down to start-up
+ *	# Save the whole entropy pool
+ *	echo "Saving random seed..."
+ *	random_seed=/var/run/random-seed
+ *	touch $random_seed
+ *	chmod 600 $random_seed
+ *	dd if=/dev/urandom of=$random_seed count=1 bs=512
+ *
+ * For example, on most modern systems using the System V init
+ * scripts, such code fragments would be found in
+ * /etc/rc.d/init.d/random.  On older Linux systems, the correct script
+ * location might be in /etc/rc.d/rc.local or /etc/rc.d/rc.0.
+ *
+ * Effectively, these commands cause the contents of the entropy pool
+ * to be saved at shut-down time and reloaded into the entropy pool at
+ * start-up.  (The 'dd' in the addition to the bootup script is to
+ * make sure that /etc/random-seed is different for every start-up,
+ * even if the system crashes without executing rc.0.)  Even with
+ * complete knowledge of the start-up activities, predicting the state
+ * of the entropy pool requires knowledge of the previous history of
+ * the system.
+ *
+ * Configuring the /dev/random driver under Linux
+ * ==============================================
+ *
+ * The /dev/random driver under Linux uses minor numbers 8 and 9 of
+ * the /dev/mem major number (#1).  So if your system does not have
+ * /dev/random and /dev/urandom created already, they can be created
+ * by using the commands:
+ *
+ *	mknod /dev/random c 1 8
+ *	mknod /dev/urandom c 1 9
+ *
+ * Acknowledgements:
+ * =================
+ *
+ * Ideas for constructing this random number generator were derived
+ * from Pretty Good Privacy's random number generator, and from private
+ * discussions with Phil Karn.  Colin Plumb provided a faster random
+ * number generator, which sped up the mixing function of the entropy
+ * pool, taken from PGPfone.  Dale Worley has also contributed many
+ * useful ideas and suggestions to improve this driver.
+ *
+ * Any flaws in the design are solely my responsibility, and should
+ * not be attributed to Phil, Colin, or any of the authors of PGP.
+ *
+ * Further background information on this topic may be obtained from
+ * RFC 4086, "Randomness Requirements for Security", by Donald
+ * Eastlake, Steve Crocker, and Jeff Schiller.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+
+#define CREATE_TRACE_POINTS
+#include
+
+/* #define ADD_INTERRUPT_BENCH */
+
+#ifndef CONFIG_RANDOM_INIT
+#error This version needs CONFIG_RANDOM_INIT
+#endif
+#ifndef CONFIG_RANDOM_GCM
+#error This version should not be compiled if CONFIG_RANDOM_GCM is not set
+#endif
+
+/*
+ * Configuration information
+ */
+
+#include
+
+#define EXTRACT_SIZE	   16	/* 128-bit GCM hash */
+#define SEC_XFER_SIZE	   512
+#define DEBUG_RANDOM_BOOT  0
+
+#define LONGS(x) (((x) + sizeof(unsigned long) - 1)/sizeof(unsigned long))
+
+/*
+ * To allow fractional bits to be tracked, the entropy_count field is
+ * denominated in units of 1/8th bits.
+ *
+ * 2*(ENTROPY_SHIFT + log2(poolbits)) must be <= 31, or the multiply in
+ * credit_entropy_bits() needs to be 64 bits wide.
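+ *
+ * For example, with the default 128-word (4096-bit) input pool,
+ * log2(poolbits) = 12, so 2*(ENTROPY_SHIFT + 12) = 30 <= 31 and a
+ * 32-bit multiply suffices.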
+ */
+#define ENTROPY_SHIFT 3
+#define ENTROPY_BITS(r) ((r)->entropy_count >> ENTROPY_SHIFT)
+
+/* sanity checks */
+
+#if( (ENTROPY_SHIFT+INPUT_POOL_SHIFT) >= 16)
+#error *_SHIFT values problematic for credit_entropy_bits()
+#endif
+
+#if( (INPUT_POOL_WORDS%16) || (OUTPUT_POOL_WORDS%16) )
+#error Pool size not divisible by 16, which code assumes
+#endif
+
+#if( INPUT_POOL_WORDS < 32 )
+#error Input pool less than a quarter of default size
+#endif
+
+#if( INPUT_POOL_WORDS < OUTPUT_POOL_WORDS )
+#error Strange configuration, input pool smaller than output
+#endif
+
+/*
+ * The minimum number of bits of entropy before we wake up a read on
+ * /dev/random.  Should be enough to do a significant reseed.
+ */
+static int random_read_wakeup_bits = 64;
+
+/*
+ * If the entropy count falls under this number of bits, then we
+ * should wake up processes which are selecting or polling on write
+ * access to /dev/random.
+ */
+static int random_write_wakeup_bits = 28 * OUTPUT_POOL_WORDS;
+
+/*
+ * The minimum number of seconds between urandom pool reseeding.  We
+ * do this to limit the amount of entropy that can be drained from the
+ * input pool even if there are heavy demands on /dev/urandom.
+ */
+static int random_min_urandom_seed = 60;
+
+/*
+ * Originally, we used a primitive polynomial of degree .poolwords
+ * over GF(2).  The taps for various sizes are defined below.  They
+ * were chosen to be evenly spaced except for the last tap, which is 1
+ * to get the twisting happening as fast as possible.
+ *
+ * For the purposes of better mixing, we use the CRC-32 polynomial as
+ * well to make a (modified) twisted Generalized Feedback Shift
+ * Register.  (See M. Matsumoto & Y. Kurita, 1992.  Twisted GFSR
+ * generators.  ACM Transactions on Modeling and Computer Simulation
+ * 2(3):179-194.  Also see M. Matsumoto & Y. Kurita, 1994.  Twisted
+ * GFSR generators II.  ACM Transactions on Modeling and Computer
+ * Simulation 4:254-266)
+ *
+ * Thanks to Colin Plumb for suggesting this.
+ *
+ * The mixing operation is much less sensitive than the output hash,
+ * where we use SHA-1.  All that we want of the mixing operation is
+ * that it be a good non-cryptographic hash; i.e. it not produce
+ * collisions when fed "random" data of the sort we expect to see.
+ * As long as the pool state differs for different inputs, we have
+ * preserved the input entropy and done a good job.  The fact that an
+ * intelligent attacker can construct inputs that will produce
+ * controlled alterations to the pool's state is not important
+ * because we don't consider such inputs to contribute any
+ * randomness.  The only property we need with respect to them is
+ * that the attacker can't increase his/her knowledge of the pool's
+ * state.  Since all additions are reversible (knowing the final
+ * state and the input, you can reconstruct the initial state), if an
+ * attacker has any uncertainty about the initial state, he/she can
+ * only shuffle that uncertainty about, but never cause any
+ * collisions (which would decrease the uncertainty).
+ *
+ * Our mixing functions were analyzed by Lacharme, Roeck, Strubel, and
+ * Videau in their paper, "The Linux Pseudorandom Number Generator
+ * Revisited" (see: http://eprint.iacr.org/2012/251.pdf).  In their
+ * paper, they point out that we are not using a true Twisted GFSR,
+ * since Matsumoto & Kurita used a trinomial feedback polynomial (that
+ * is, with only three taps, instead of the six that we are using).
+ * As a result, the resulting polynomial is neither primitive nor + * irreducible, and hence does not have a maximal period over + * GF(2**32). They suggest a slight change to the generator + * polynomial which improves the resulting TGFSR polynomial to be + * irreducible, which we have made here. + */ +static struct poolinfo { + int poolbitshift, poolwords, poolbytes, poolbits, poolfracbits; +#define S(x) ilog2(x)+5, (x), (x)*4, (x)*32, (x) << (ENTROPY_SHIFT+5) + int tap1, tap2, tap3, tap4, tap5; +} poolinfo_table[] = { + /* was: x^128 + x^103 + x^76 + x^51 +x^25 + x + 1 */ + /* x^128 + x^104 + x^76 + x^51 +x^25 + x + 1 */ + { S(128), 104, 76, 51, 25, 1 }, + /* was: x^32 + x^26 + x^20 + x^14 + x^7 + x + 1 */ + /* x^32 + x^26 + x^19 + x^14 + x^7 + x + 1 */ + { S(32), 26, 19, 14, 7, 1 }, +#if 0 + /* x^2048 + x^1638 + x^1231 + x^819 + x^411 + x + 1 -- 115 */ + { S(2048), 1638, 1231, 819, 411, 1 }, + + /* x^1024 + x^817 + x^615 + x^412 + x^204 + x + 1 -- 290 */ + { S(1024), 817, 615, 412, 204, 1 }, + + /* x^1024 + x^819 + x^616 + x^410 + x^207 + x^2 + 1 -- 115 */ + { S(1024), 819, 616, 410, 207, 2 }, + + /* x^512 + x^411 + x^308 + x^208 + x^104 + x + 1 -- 225 */ + { S(512), 411, 308, 208, 104, 1 }, + + /* x^512 + x^409 + x^307 + x^206 + x^102 + x^2 + 1 -- 95 */ + { S(512), 409, 307, 206, 102, 2 }, + /* x^512 + x^409 + x^309 + x^205 + x^103 + x^2 + 1 -- 95 */ + { S(512), 409, 309, 205, 103, 2 }, + + /* x^256 + x^205 + x^155 + x^101 + x^52 + x + 1 -- 125 */ + { S(256), 205, 155, 101, 52, 1 }, + + /* x^128 + x^103 + x^78 + x^51 + x^27 + x^2 + 1 -- 70 */ + { S(128), 103, 78, 51, 27, 2 }, + + /* x^64 + x^52 + x^39 + x^26 + x^14 + x + 1 -- 15 */ + { S(64), 52, 39, 26, 14, 1 }, +#endif +}; + +/* + * Static global variables + */ +static DECLARE_WAIT_QUEUE_HEAD(random_read_wait); +static DECLARE_WAIT_QUEUE_HEAD(random_write_wait); +static DECLARE_WAIT_QUEUE_HEAD(urandom_init_wait); +static struct fasync_struct *fasync; + +static DEFINE_SPINLOCK(random_ready_list_lock); +static LIST_HEAD(random_ready_list); + +/********************************************************************** + * + * OS independent entropy store. Here are the functions which handle + * storing entropy in an entropy pool. 
+ * + **********************************************************************/ + +struct entropy_store; +struct entropy_store { + /* read-only data: */ + const struct poolinfo *poolinfo; + __u32 *pool; + const char *name; + struct entropy_store *pull; + struct work_struct push_work; + + /* read-write data: */ + unsigned long last_pulled; + spinlock_t lock; + unsigned short add_ptr; + unsigned short input_rotate; + int entropy_count; + int entropy_total; + unsigned int initialized:1; + unsigned int limit:1; + unsigned int last_data_init:1; + __u8 last_data[EXTRACT_SIZE]; + u32 *A, *B, which, count ; + u32 *p, *q, *end, size ; +}; + +static void push_to_pool(struct work_struct *work); + +static struct entropy_store input_pool = { + .poolinfo = &poolinfo_table[0], + .name = "input", + .limit = 1, + .lock = __SPIN_LOCK_UNLOCKED(input_pool.lock), + .pool = pools, + .A = constants, + .B = constants+4, + .which = 0, + .count = 0, + .size = INPUT_POOL_WORDS, + .p = pools, + .q = pools + (INPUT_POOL_WORDS/2), + .end = pools + INPUT_POOL_WORDS +}; + +static struct entropy_store blocking_pool = { + .poolinfo = &poolinfo_table[1], + .name = "blocking", + .limit = 1, + .pull = &input_pool, + .lock = __SPIN_LOCK_UNLOCKED(blocking_pool.lock), + .push_work = __WORK_INITIALIZER(blocking_pool.push_work, + push_to_pool), + .pool = pools + INPUT_POOL_WORDS, + .A = constants+8, + .B = constants+12, + .which = 0, + .count = 0, + .size = OUTPUT_POOL_WORDS, + .p = pools + INPUT_POOL_WORDS, + .q = pools + INPUT_POOL_WORDS + (OUTPUT_POOL_WORDS/2), + .end = pools + INPUT_POOL_WORDS + OUTPUT_POOL_WORDS +}; + +static struct entropy_store nonblocking_pool = { + .poolinfo = &poolinfo_table[1], + .name = "nonblocking", + .pull = &input_pool, + .lock = __SPIN_LOCK_UNLOCKED(nonblocking_pool.lock), + .push_work = __WORK_INITIALIZER(nonblocking_pool.push_work, + push_to_pool), + .pool = pools + INPUT_POOL_WORDS + OUTPUT_POOL_WORDS, + .A = constants+16, + .B = constants+20, + .which = 0, + .count = 0, + .size = OUTPUT_POOL_WORDS, + .p = pools + INPUT_POOL_WORDS + OUTPUT_POOL_WORDS, + .q = pools + INPUT_POOL_WORDS + OUTPUT_POOL_WORDS + (OUTPUT_POOL_WORDS/2), + .end = pools + INPUT_POOL_WORDS + (OUTPUT_POOL_WORDS*2) +}; + +/* no actual pool; just hash the counter */ +static struct entropy_store dummy_pool = { + .poolinfo = &poolinfo_table[1], + .name = "dummy", + .lock = __SPIN_LOCK_UNLOCKED(dummy_pool.lock), + .pool = NULL, + .A = constants+24, + .B = constants+28, + .which = 0, + .count = 0, + /* should never be used */ + .size = 0, + .p = NULL, + .q = NULL, + .end = NULL +}; + +static int got_hw_rng ; + +/***************************************************************** + * forward declarations and a few macros + *****************************************************************/ + +static void init_random(void) ; + +/* fill an output buffer from a pool */ +static void loop_output( struct entropy_store *, u32 *, u32 ) ; + +static void count(void) ; +static void counter_any(void) ; + +/* get 128 bits */ +static int get_or_fail( struct entropy_store *, u32 * ) ; +static void get128( struct entropy_store *, u32 * ) ; +static int get_any( u32 * ) ; + +/* These functions each do a unidirectional mix + * into some data structure. They mix in 128 bits + * at a time to give "catastrophic reseeding", and + * all zero out the input buffer after use. 
+ */ +static void buffer2array( struct entropy_store *, u32 * ) ; +static void buffer2pool( struct entropy_store *, u32 * ) ; +static void buffer2counter( u32 * ) ; + +/* hw rng functions */ +static int get_hw_random( u32 * ) ; +static int load_constants(void) ; +static int load_input(void) ; + +/* mix chunks of data structures in place */ +static void mix_const_p( struct entropy_store * ) ; +static void mix_const_all(void); +static void top_mix(void); +static void big_mix(void); + +static void clear_addmul(void); + +/* rotate a 32-bit word left n bits */ +#define ROTL(v, n) ( ((v) << (n)) | ((v) >> (32 - (n))) ) + +/* common case with 128-bit buffer */ +#define zero128( target ) memzero_explicit( (u8 *) target, 16 ) + +static __u32 const twist_table[8] = { + 0x00000000, 0x3b6e20c8, 0x76dc4190, 0x4db26158, + 0xedb88320, 0xd6d6a3e8, 0x9b64c2b0, 0xa00ae278 }; + +/* + * This function adds bytes into the entropy "pool". It does not + * update the entropy estimate. The caller should call + * credit_entropy_bits if this is appropriate. + * + * The pool is stirred with a primitive polynomial of the appropriate + * degree, and then twisted. We twist by three bits at a time because + * it's cheap to do so and helps slightly in the expected case where + * the entropy is concentrated in the low-order bits. + */ +static void _mix_pool_bytes(struct entropy_store *r, const void *in, + int nbytes) +{ + unsigned long i, tap1, tap2, tap3, tap4, tap5; + int input_rotate; + int wordmask = r->poolinfo->poolwords - 1; + const char *bytes = in; + __u32 w; + + tap1 = r->poolinfo->tap1; + tap2 = r->poolinfo->tap2; + tap3 = r->poolinfo->tap3; + tap4 = r->poolinfo->tap4; + tap5 = r->poolinfo->tap5; + + input_rotate = r->input_rotate; + i = r->add_ptr; + + /* mix one byte at a time to simplify size handling and churn faster */ + while (nbytes--) { + w = rol32(*bytes++, input_rotate); + i = (i - 1) & wordmask; + + /* XOR in the various taps */ + w ^= r->pool[i]; + w ^= r->pool[(i + tap1) & wordmask]; + w ^= r->pool[(i + tap2) & wordmask]; + w ^= r->pool[(i + tap3) & wordmask]; + w ^= r->pool[(i + tap4) & wordmask]; + w ^= r->pool[(i + tap5) & wordmask]; + + /* Mix the result back in with a twist */ + r->pool[i] = (w >> 3) ^ twist_table[w & 7]; + + /* + * Normally, we add 7 bits of rotation to the pool. + * At the beginning of the pool, add an extra 7 bits + * rotation, so that successive passes spread the + * input bits across the pool evenly. + */ + input_rotate = (input_rotate + (i ? 7 : 14)) & 31; + } + + r->input_rotate = input_rotate; + r->add_ptr = i; +} + +static void __mix_pool_bytes(struct entropy_store *r, const void *in, + int nbytes) +{ + trace_mix_pool_bytes_nolock(r->name, nbytes, _RET_IP_); + _mix_pool_bytes(r, in, nbytes); +} + +static void mix_pool_bytes(struct entropy_store *r, const void *in, + int nbytes) +{ + unsigned long flags; + + trace_mix_pool_bytes(r->name, nbytes, _RET_IP_); + spin_lock_irqsave(&r->lock, flags); + _mix_pool_bytes(r, in, nbytes); + spin_unlock_irqrestore(&r->lock, flags); +} + +struct fast_pool { + __u32 pool[4]; + unsigned long last; + unsigned short reg_idx; + unsigned char count; +}; + +/* + * This is a fast mixing routine used by the interrupt randomness + * collector. It's hardcoded for an 128 bit pool and assumes that any + * locks that might be needed are taken by the caller. 
+ */ +static void fast_mix(struct fast_pool *f) +{ + __u32 a = f->pool[0], b = f->pool[1]; + __u32 c = f->pool[2], d = f->pool[3]; + + a += b; c += d; + b = rol32(b, 6); d = rol32(d, 27); + d ^= a; b ^= c; + + a += b; c += d; + b = rol32(b, 16); d = rol32(d, 14); + d ^= a; b ^= c; + + a += b; c += d; + b = rol32(b, 6); d = rol32(d, 27); + d ^= a; b ^= c; + + a += b; c += d; + b = rol32(b, 16); d = rol32(d, 14); + d ^= a; b ^= c; + + f->pool[0] = a; f->pool[1] = b; + f->pool[2] = c; f->pool[3] = d; + f->count++; +} + +static void process_random_ready_list(void) +{ + unsigned long flags; + struct random_ready_callback *rdy, *tmp; + + spin_lock_irqsave(&random_ready_list_lock, flags); + list_for_each_entry_safe(rdy, tmp, &random_ready_list, list) { + struct module *owner = rdy->owner; + + list_del_init(&rdy->list); + rdy->func(rdy); + module_put(owner); + } + spin_unlock_irqrestore(&random_ready_list_lock, flags); +} + +/* + * Credit (or debit) the entropy store with n bits of entropy. + * Use credit_entropy_bits_safe() if the value comes from userspace + * or otherwise should be checked for extreme values. + */ +static void credit_entropy_bits(struct entropy_store *r, int nbits) +{ + int entropy_count, orig; + const int pool_size = r->poolinfo->poolfracbits; + int nfrac = nbits << ENTROPY_SHIFT; + + if (!nbits) + return; + +retry: + entropy_count = orig = ACCESS_ONCE(r->entropy_count); + if (nfrac < 0) { + /* Debit */ + entropy_count += nfrac; + } else { + /* + * Credit: we have to account for the possibility of + * overwriting already present entropy. Even in the + * ideal case of pure Shannon entropy, new contributions + * approach the full value asymptotically: + * + * entropy <- entropy + (pool_size - entropy) * + * (1 - exp(-add_entropy/pool_size)) + * + * For add_entropy <= pool_size/2 then + * (1 - exp(-add_entropy/pool_size)) >= + * (add_entropy/pool_size)*0.7869... + * so we can approximate the exponential with + * 3/4*add_entropy/pool_size and still be on the + * safe side by adding at most pool_size/2 at a time. + * + * The use of pool_size-2 in the while statement is to + * prevent rounding artifacts from making the loop + * arbitrarily long; this limits the loop to log2(pool_size)*2 + * turns no matter how large nbits is. + */ + int pnfrac = nfrac; + const int s = r->poolinfo->poolbitshift + ENTROPY_SHIFT + 2; + /* The +2 corresponds to the /4 in the denominator */ + + do { + unsigned int anfrac = min(pnfrac, pool_size/2); + unsigned int add = + ((pool_size - entropy_count)*anfrac*3) >> s; + + entropy_count += add; + pnfrac -= anfrac; + } while (unlikely(entropy_count < pool_size-2 && pnfrac)); + } + + if (unlikely(entropy_count < 0)) { + pr_warn("random: negative entropy/overflow: pool %s count %d\n", + r->name, entropy_count); + WARN_ON(1); + entropy_count = 0; + } else if (entropy_count > pool_size) + entropy_count = pool_size; + if (cmpxchg(&r->entropy_count, orig, entropy_count) != orig) + goto retry; + + r->entropy_total += nbits; + if (!r->initialized && r->entropy_total > 128) { + r->initialized = 1; + r->entropy_total = 0; + if (r == &nonblocking_pool) { + prandom_reseed_late(); + process_random_ready_list(); + wake_up_all(&urandom_init_wait); + pr_notice("random: %s pool is initialized\n", r->name); + } + } + + trace_credit_entropy_bits(r->name, nbits, + entropy_count >> ENTROPY_SHIFT, + r->entropy_total, _RET_IP_); + + if (r == &input_pool) { + int entropy_bits = entropy_count >> ENTROPY_SHIFT; + + /* should we wake readers? 
*/ + if (entropy_bits >= random_read_wakeup_bits) { + wake_up_interruptible(&random_read_wait); + kill_fasync(&fasync, SIGIO, POLL_IN); + } + /* If the input pool is getting full, send some + * entropy to the two output pools, flipping back and + * forth between them, until the output pools are 75% + * full. + */ + if (entropy_bits > random_write_wakeup_bits && + r->initialized && + r->entropy_total >= 2*random_read_wakeup_bits) { + static struct entropy_store *last = &blocking_pool; + struct entropy_store *other = &blocking_pool; + + if (last == &blocking_pool) + other = &nonblocking_pool; + if (other->entropy_count <= + 3 * other->poolinfo->poolfracbits / 4) + last = other; + if (last->entropy_count <= + 3 * last->poolinfo->poolfracbits / 4) { + schedule_work(&last->push_work); + r->entropy_total = 0; + } + } + } +} + +static void credit_entropy_bits_safe(struct entropy_store *r, int nbits) +{ + const int nbits_max = (int)(~0U >> (ENTROPY_SHIFT + 1)); + + /* Cap the value to avoid overflows */ + nbits = min(nbits, nbits_max); + nbits = max(nbits, -nbits_max); + + credit_entropy_bits(r, nbits); +} + +/********************************************************************* + * + * Entropy input management + * + *********************************************************************/ + +/* There is one of these per entropy source */ +struct timer_rand_state { + cycles_t last_time; + long last_delta, last_delta2; + unsigned dont_count_entropy:1; +}; + +#define INIT_TIMER_RAND_STATE { INITIAL_JIFFIES, }; + +/* + * Add device- or boot-specific data to the input and nonblocking + * pools to help initialize them to unique values. + * + * None of this adds any entropy, it is meant to avoid the + * problem of the nonblocking pool having similar initial state + * across largely identical devices. + */ +void add_device_randomness(const void *buf, unsigned int size) +{ + unsigned long time = random_get_entropy() ^ jiffies; + unsigned long flags; + + trace_add_device_randomness(size, _RET_IP_); + spin_lock_irqsave(&input_pool.lock, flags); + _mix_pool_bytes(&input_pool, buf, size); + _mix_pool_bytes(&input_pool, &time, sizeof(time)); + spin_unlock_irqrestore(&input_pool.lock, flags); + + spin_lock_irqsave(&nonblocking_pool.lock, flags); + _mix_pool_bytes(&nonblocking_pool, buf, size); + _mix_pool_bytes(&nonblocking_pool, &time, sizeof(time)); + spin_unlock_irqrestore(&nonblocking_pool.lock, flags); +} +EXPORT_SYMBOL(add_device_randomness); + +static struct timer_rand_state input_timer_state = INIT_TIMER_RAND_STATE; + +/* + * This function adds entropy to the entropy "pool" by using timing + * delays. It uses the timer_rand_state structure to make an estimate + * of how many bits of entropy this call has added to the pool. + * + * The number "num" is also added to the pool - it should somehow describe + * the type of event which just happened. This is currently 0-255 for + * keyboard scan codes, and 256 upwards for interrupts. + * + */ +static void add_timer_randomness(struct timer_rand_state *state, unsigned num) +{ + struct entropy_store *r; + struct { + long jiffies; + unsigned cycles; + unsigned num; + } sample; + long delta, delta2, delta3; + + preempt_disable(); + + sample.jiffies = jiffies; + sample.cycles = random_get_entropy(); + sample.num = num; + r = nonblocking_pool.initialized ? &input_pool : &nonblocking_pool; + mix_pool_bytes(r, &sample, sizeof(sample)); + + /* + * Calculate number of bits of randomness we probably added. 
+ * We take into account the first, second and third-order deltas
+ * in order to make our estimate.
+ */
+
+	if (!state->dont_count_entropy) {
+		delta = sample.jiffies - state->last_time;
+		state->last_time = sample.jiffies;
+
+		delta2 = delta - state->last_delta;
+		state->last_delta = delta;
+
+		delta3 = delta2 - state->last_delta2;
+		state->last_delta2 = delta2;
+
+		if (delta < 0)
+			delta = -delta;
+		if (delta2 < 0)
+			delta2 = -delta2;
+		if (delta3 < 0)
+			delta3 = -delta3;
+		if (delta > delta2)
+			delta = delta2;
+		if (delta > delta3)
+			delta = delta3;
+
+		/*
+		 * delta is now minimum absolute delta.
+		 * Round down by 1 bit on general principles,
+		 * and limit entropy estimate to 11 bits.
+		 */
+		credit_entropy_bits(r, min_t(int, fls(delta>>1), 11));
+	}
+	preempt_enable();
+}
+
+void add_input_randomness(unsigned int type, unsigned int code,
+			  unsigned int value)
+{
+	static unsigned char last_value;
+
+	/* ignore autorepeat and the like */
+	if (value == last_value)
+		return;
+
+	last_value = value;
+	add_timer_randomness(&input_timer_state,
+			     (type << 4) ^ code ^ (code >> 4) ^ value);
+	trace_add_input_randomness(ENTROPY_BITS(&input_pool));
+}
+EXPORT_SYMBOL_GPL(add_input_randomness);
+
+static DEFINE_PER_CPU(struct fast_pool, irq_randomness);
+
+#ifdef ADD_INTERRUPT_BENCH
+static unsigned long avg_cycles, avg_deviation;
+
+#define AVG_SHIFT 8	/* Exponential average factor k=1/256 */
+#define FIXED_1_2 (1 << (AVG_SHIFT-1))
+
+static void add_interrupt_bench(cycles_t start)
+{
+	long delta = random_get_entropy() - start;
+
+	/* Use a weighted moving average */
+	delta = delta - ((avg_cycles + FIXED_1_2) >> AVG_SHIFT);
+	avg_cycles += delta;
+	/* And average deviation */
+	delta = abs(delta) - ((avg_deviation + FIXED_1_2) >> AVG_SHIFT);
+	avg_deviation += delta;
+}
+#else
+#define add_interrupt_bench(x)
+#endif
+
+static __u32 get_reg(struct fast_pool *f, struct pt_regs *regs)
+{
+	__u32 *ptr = (__u32 *) regs;
+
+	if (regs == NULL)
+		return 0;
+	if (f->reg_idx >= sizeof(struct pt_regs) / sizeof(__u32))
+		f->reg_idx = 0;
+	return *(ptr + f->reg_idx++);
+}
+
+void add_interrupt_randomness(int irq, int irq_flags)
+{
+	struct entropy_store	*r;
+	struct fast_pool	*fast_pool = this_cpu_ptr(&irq_randomness);
+	struct pt_regs		*regs = get_irq_regs();
+	unsigned long		now = jiffies;
+	cycles_t		cycles = random_get_entropy();
+	__u32			c_high, j_high;
+	__u64			ip;
+	unsigned long		seed;
+	int			credit = 0;
+
+	if (cycles == 0)
+		cycles = get_reg(fast_pool, regs);
+	c_high = (sizeof(cycles) > 4) ? cycles >> 32 : 0;
+	j_high = (sizeof(now) > 4) ? now >> 32 : 0;
+	fast_pool->pool[0] ^= cycles ^ j_high ^ irq;
+	fast_pool->pool[1] ^= now ^ c_high;
+	ip = regs ? instruction_pointer(regs) : _RET_IP_;
+	fast_pool->pool[2] ^= ip;
+	fast_pool->pool[3] ^= (sizeof(ip) > 4) ? ip >> 32 :
+		get_reg(fast_pool, regs);
+
+	fast_mix(fast_pool);
+	add_interrupt_bench(cycles);
+
+	if ((fast_pool->count < 64) &&
+	    !time_after(now, fast_pool->last + HZ))
+		return;
+
+	r = nonblocking_pool.initialized ? &input_pool : &nonblocking_pool;
+	if (!spin_trylock(&r->lock))
+		return;
+
+	fast_pool->last = now;
+	__mix_pool_bytes(r, &fast_pool->pool, sizeof(fast_pool->pool));
+
+	/*
+	 * If we have an architectural seed generator, produce a seed and
+	 * add it to the pool.  For the sake of paranoia don't let the
+	 * architectural seed generator dominate the input from the
+	 * interrupt noise.
+ */ + if (arch_get_random_seed_long(&seed)) { + __mix_pool_bytes(r, &seed, sizeof(seed)); + credit = 1; + } + spin_unlock(&r->lock); + + fast_pool->count = 0; + + /* award one bit for the contents of the fast pool */ + credit_entropy_bits(r, credit + 1); +} + +#ifdef CONFIG_BLOCK +void add_disk_randomness(struct gendisk *disk) +{ + if (!disk || !disk->random) + return; + /* first major is 1, so we get >= 0x200 here */ + add_timer_randomness(disk->random, 0x100 + disk_devt(disk)); + trace_add_disk_randomness(disk_devt(disk), ENTROPY_BITS(&input_pool)); +} +EXPORT_SYMBOL_GPL(add_disk_randomness); +#endif + +/********************************************************************* + * + * Entropy extraction routines + * + *********************************************************************/ + +static ssize_t extract_entropy(struct entropy_store *r, void *buf, + size_t nbytes, int min, int rsvd); + +/* + * This utility inline function is responsible for transferring entropy + * from the primary pool to the secondary extraction pool. We make + * sure we pull enough for a 'catastrophic reseed'. + */ +static void _xfer_secondary_pool(struct entropy_store *r, size_t nbytes); +static void xfer_secondary_pool(struct entropy_store *r, size_t nbytes) +{ + if (!r->pull || + r->entropy_count >= (nbytes << (ENTROPY_SHIFT + 3)) || + r->entropy_count > r->poolinfo->poolfracbits) + return; + + if (r->limit == 0 && random_min_urandom_seed) { + unsigned long now = jiffies; + + if (time_before(now, + r->last_pulled + random_min_urandom_seed * HZ)) + return; + r->last_pulled = now; + } + + _xfer_secondary_pool(r, nbytes); +} + +static void _xfer_secondary_pool(struct entropy_store *r, size_t nbytes) +{ + u32 temp[4] ; + int bytes = nbytes; + + /* pull at least as much as a wakeup */ + bytes = max_t(int, bytes, random_read_wakeup_bits / 8); + /* but never more than the pool size */ + bytes = min_t(int, bytes, OUTPUT_POOL_WORDS); + + trace_xfer_secondary_pool(r->name, bytes * 8, nbytes * 8, + ENTROPY_BITS(r), ENTROPY_BITS(r->pull)); + for( ; bytes > 3 ; bytes -= 4 ) { + get128(r->pull, temp ) ; + buffer2pool( r, temp ) ; + } +} + +/* + * Used as a workqueue function so that when the input pool is getting + * full, we can "spill over" some entropy to the output pools. That + * way the output pools can store some of the excess entropy instead + * of letting it go to waste. + */ +static void push_to_pool(struct work_struct *work) +{ + struct entropy_store *r = container_of(work, struct entropy_store, + push_work); + BUG_ON(!r); + _xfer_secondary_pool(r, random_read_wakeup_bits/8); + trace_push_to_pool(r->name, r->entropy_count >> ENTROPY_SHIFT, + r->pull->entropy_count >> ENTROPY_SHIFT); +} + +/* + * This function decides how many bytes to actually take from the + * given pool, and also debits the entropy count accordingly. + */ +static size_t account(struct entropy_store *r, size_t nbytes, int min, + int reserved) +{ + int entropy_count, orig; + size_t ibytes, nfrac; + + BUG_ON(r->entropy_count > r->poolinfo->poolfracbits); + + /* Can we pull enough? 
*/ +retry: + entropy_count = orig = ACCESS_ONCE(r->entropy_count); + ibytes = nbytes; + /* If limited, never pull more than available */ + if (r->limit) { + int have_bytes = entropy_count >> (ENTROPY_SHIFT + 3); + + if ((have_bytes -= reserved) < 0) + have_bytes = 0; + ibytes = min_t(size_t, ibytes, have_bytes); + } + if (ibytes < min) + ibytes = 0; + + if (unlikely(entropy_count < 0)) { + pr_warn("random: negative entropy count: pool %s count %d\n", + r->name, entropy_count); + WARN_ON(1); + entropy_count = 0; + } + nfrac = ibytes << (ENTROPY_SHIFT + 3); + if ((size_t) entropy_count > nfrac) + entropy_count -= nfrac; + else + entropy_count = 0; + + if (cmpxchg(&r->entropy_count, orig, entropy_count) != orig) + goto retry; + + trace_debit_entropy(r->name, 8 * ibytes); + if (ibytes && + (r->entropy_count >> ENTROPY_SHIFT) < random_write_wakeup_bits) { + wake_up_interruptible(&random_write_wait); + kill_fasync(&fasync, SIGIO, POLL_OUT); + } + + return ibytes; +} + +/* + * This function does the actual extraction for extract_entropy and + * extract_entropy_user. + * + * Note: we assume that .poolwords is a multiple of 16 words. + */ +static void extract_buf(struct entropy_store *r, __u8 *out) +{ + get128( r, (u32 *) out ) ; +} + +/* + * This function extracts randomness from the "entropy pool", and + * returns it in a buffer. + * + * The min parameter specifies the minimum amount we can pull before + * failing to avoid races that defeat catastrophic reseeding while the + * reserved parameter indicates how much entropy we must leave in the + * pool after each pull to avoid starving other readers. + */ +static ssize_t extract_entropy(struct entropy_store *r, void *buf, + size_t nbytes, int min, int reserved) +{ + ssize_t ret = 0, i; + __u8 tmp[EXTRACT_SIZE]; + unsigned long flags; + + /* if last_data isn't primed, we need EXTRACT_SIZE extra bytes */ + if (fips_enabled) { + spin_lock_irqsave(&r->lock, flags); + if (!r->last_data_init) { + r->last_data_init = 1; + spin_unlock_irqrestore(&r->lock, flags); + trace_extract_entropy(r->name, EXTRACT_SIZE, + ENTROPY_BITS(r), _RET_IP_); + xfer_secondary_pool(r, EXTRACT_SIZE); + extract_buf(r, tmp); + spin_lock_irqsave(&r->lock, flags); + memcpy(r->last_data, tmp, EXTRACT_SIZE); + } + spin_unlock_irqrestore(&r->lock, flags); + } + + trace_extract_entropy(r->name, nbytes, ENTROPY_BITS(r), _RET_IP_); + xfer_secondary_pool(r, nbytes); + nbytes = account(r, nbytes, min, reserved); + + while (nbytes) { + extract_buf(r, tmp); + + if (fips_enabled) { + spin_lock_irqsave(&r->lock, flags); + if (!memcmp(tmp, r->last_data, EXTRACT_SIZE)) + panic("Hardware RNG duplicated output!\n"); + memcpy(r->last_data, tmp, EXTRACT_SIZE); + spin_unlock_irqrestore(&r->lock, flags); + } + i = min_t(int, nbytes, EXTRACT_SIZE); + memcpy(buf, tmp, i); + nbytes -= i; + buf += i; + ret += i; + } + + /* Wipe data just returned from memory */ + memzero_explicit(tmp, sizeof(tmp)); + + return ret; +} + +/* + * This function extracts randomness from the "entropy pool", and + * returns it in a userspace buffer. 
+ */ +static ssize_t extract_entropy_user(struct entropy_store *r, void __user *buf, + size_t nbytes) +{ + ssize_t ret = 0, i; + __u8 tmp[EXTRACT_SIZE]; + int large_request = (nbytes > 256); + + trace_extract_entropy_user(r->name, nbytes, ENTROPY_BITS(r), _RET_IP_); + xfer_secondary_pool(r, nbytes); + nbytes = account(r, nbytes, 0, 0); + + while (nbytes) { + if (large_request && need_resched()) { + if (signal_pending(current)) { + if (ret == 0) + ret = -ERESTARTSYS; + break; + } + schedule(); + } + + extract_buf(r, tmp); + i = min_t(int, nbytes, EXTRACT_SIZE); + if (copy_to_user(buf, tmp, i)) { + ret = -EFAULT; + break; + } + + nbytes -= i; + buf += i; + ret += i; + } + + /* Wipe data just returned from memory */ + memzero_explicit(tmp, sizeof(tmp)); + + return ret; +} + +/* + * This function is the exported kernel interface. It returns some + * number of good random numbers, suitable for key generation, seeding + * TCP sequence numbers, etc. It does not rely on the hardware random + * number generator. For random bytes direct from the hardware RNG + * (when available), use get_random_bytes_arch(). + */ +void get_random_bytes(void *buf, int nbytes) +{ +#if DEBUG_RANDOM_BOOT > 0 + if (unlikely(nonblocking_pool.initialized == 0)) + printk(KERN_NOTICE "random: %pF get_random_bytes called " + "with %d bits of entropy available\n", + (void *) _RET_IP_, + nonblocking_pool.entropy_total); +#endif + trace_get_random_bytes(nbytes, _RET_IP_); + loop_output(&nonblocking_pool, buf, nbytes); +} +EXPORT_SYMBOL(get_random_bytes); + +/* + * Add a callback function that will be invoked when the nonblocking + * pool is initialised. + * + * returns: 0 if callback is successfully added + * -EALREADY if pool is already initialised (callback not called) + * -ENOENT if module for callback is not alive + */ +int add_random_ready_callback(struct random_ready_callback *rdy) +{ + struct module *owner; + unsigned long flags; + int err = -EALREADY; + + if (likely(nonblocking_pool.initialized)) + return err; + + owner = rdy->owner; + if (!try_module_get(owner)) + return -ENOENT; + + spin_lock_irqsave(&random_ready_list_lock, flags); + if (nonblocking_pool.initialized) + goto out; + + owner = NULL; + + list_add(&rdy->list, &random_ready_list); + err = 0; + +out: + spin_unlock_irqrestore(&random_ready_list_lock, flags); + + module_put(owner); + + return err; +} +EXPORT_SYMBOL(add_random_ready_callback); + +/* + * Delete a previously registered readiness callback function. + */ +void del_random_ready_callback(struct random_ready_callback *rdy) +{ + unsigned long flags; + struct module *owner = NULL; + + spin_lock_irqsave(&random_ready_list_lock, flags); + if (!list_empty(&rdy->list)) { + list_del_init(&rdy->list); + owner = rdy->owner; + } + spin_unlock_irqrestore(&random_ready_list_lock, flags); + + module_put(owner); +} +EXPORT_SYMBOL(del_random_ready_callback); + +/* + * This function will use the architecture-specific hardware random + * number generator if it is available. The arch-specific hw RNG will + * almost certainly be faster than what we can do in software, but it + * is impossible to verify that it is implemented securely (as + * opposed, to, say, the AES encryption of a sequence number using a + * key known by the NSA). So it's useful if we need the speed, but + * only if we're willing to trust the hardware manufacturer not to + * have put in a back door. 
+ */ +void get_random_bytes_arch(void *buf, int nbytes) +{ + char *p = buf; + + trace_get_random_bytes_arch(nbytes, _RET_IP_); + while (nbytes) { + unsigned long v; + int chunk = min(nbytes, (int)sizeof(unsigned long)); + + if (!arch_get_random_long(&v)) + break; + + memcpy(p, &v, chunk); + p += chunk; + nbytes -= chunk; + } + + if (nbytes) + extract_entropy(&nonblocking_pool, p, nbytes, 0, 0); +} +EXPORT_SYMBOL(get_random_bytes_arch); + +/* + * Note that setup_arch() may call add_device_randomness() + * long before we get here. This allows seeding of the pools + * with some platform dependent data very early in the boot + * process. But it limits our options here. We must use + * statically allocated structures that already have all + * initializations complete at compile time. We should also + * take care not to overwrite the precious per platform data + * we were given. + */ +static int rand_initialize(void) +{ + init_random() ; + return 0; +} +early_initcall(rand_initialize); + +#ifdef CONFIG_BLOCK +void rand_initialize_disk(struct gendisk *disk) +{ + struct timer_rand_state *state; + + /* + * If kzalloc returns null, we just won't use that entropy + * source. + */ + state = kzalloc(sizeof(struct timer_rand_state), GFP_KERNEL); + if (state) { + state->last_time = INITIAL_JIFFIES; + disk->random = state; + } +} +#endif + +static ssize_t +_random_read(int nonblock, char __user *buf, size_t nbytes) +{ + ssize_t n; + + if (nbytes == 0) + return 0; + + nbytes = min_t(size_t, nbytes, SEC_XFER_SIZE); + while (1) { + n = extract_entropy_user(&blocking_pool, buf, nbytes); + if (n < 0) + return n; + trace_random_read(n*8, (nbytes-n)*8, + ENTROPY_BITS(&blocking_pool), + ENTROPY_BITS(&input_pool)); + if (n > 0) + return n; + + /* Pool is (near) empty. Maybe wait and retry. 
*/ + if (nonblock) + return -EAGAIN; + + wait_event_interruptible(random_read_wait, + ENTROPY_BITS(&input_pool) >= + random_read_wakeup_bits); + if (signal_pending(current)) + return -ERESTARTSYS; + } +} + +static ssize_t +random_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos) +{ + return _random_read(file->f_flags & O_NONBLOCK, buf, nbytes); +} + +static ssize_t +urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos) +{ + int ret; + + if (unlikely(nonblocking_pool.initialized == 0)) + printk_once(KERN_NOTICE "random: %s urandom read " + "with %d bits of entropy available\n", + current->comm, nonblocking_pool.entropy_total); + + nbytes = min_t(size_t, nbytes, INT_MAX >> (ENTROPY_SHIFT + 3)); + ret = extract_entropy_user(&nonblocking_pool, buf, nbytes); + + trace_urandom_read(8 * nbytes, ENTROPY_BITS(&nonblocking_pool), + ENTROPY_BITS(&input_pool)); + return ret; +} + +static unsigned int +random_poll(struct file *file, poll_table * wait) +{ + unsigned int mask; + + poll_wait(file, &random_read_wait, wait); + poll_wait(file, &random_write_wait, wait); + mask = 0; + if (ENTROPY_BITS(&input_pool) >= random_read_wakeup_bits) + mask |= POLLIN | POLLRDNORM; + if (ENTROPY_BITS(&input_pool) < random_write_wakeup_bits) + mask |= POLLOUT | POLLWRNORM; + return mask; +} + +static int +write_pool(struct entropy_store *r, const char __user *buffer, size_t count) +{ + size_t bytes; + __u32 buf[16]; + const char __user *p = buffer; + + while (count > 0) { + bytes = min(count, sizeof(buf)); + if (copy_from_user(&buf, p, bytes)) + return -EFAULT; + + count -= bytes; + p += bytes; + + mix_pool_bytes(r, buf, bytes); + cond_resched(); + } + + return 0; +} + +static ssize_t random_write(struct file *file, const char __user *buffer, + size_t count, loff_t *ppos) +{ + size_t ret; + + ret = write_pool(&blocking_pool, buffer, count); + if (ret) + return ret; + ret = write_pool(&nonblocking_pool, buffer, count); + if (ret) + return ret; + + return (ssize_t)count; +} + +static long random_ioctl(struct file *f, unsigned int cmd, unsigned long arg) +{ + int size, ent_count; + int __user *p = (int __user *)arg; + int retval; + + switch (cmd) { + case RNDGETENTCNT: + /* inherently racy, no point locking */ + ent_count = ENTROPY_BITS(&input_pool); + if (put_user(ent_count, p)) + return -EFAULT; + return 0; + case RNDADDTOENTCNT: + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + if (get_user(ent_count, p)) + return -EFAULT; + credit_entropy_bits_safe(&input_pool, ent_count); + return 0; + case RNDADDENTROPY: + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + if (get_user(ent_count, p++)) + return -EFAULT; + if (ent_count < 0) + return -EINVAL; + if (get_user(size, p++)) + return -EFAULT; + retval = write_pool(&input_pool, (const char __user *)p, + size); + if (retval < 0) + return retval; + credit_entropy_bits_safe(&input_pool, ent_count); + return 0; + case RNDZAPENTCNT: + case RNDCLEARPOOL: + /* + * Clear the entropy pool counters. We no longer clear + * the entropy pool, as that's silly. 
+ */ + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + input_pool.entropy_count = 0; + nonblocking_pool.entropy_count = 0; + blocking_pool.entropy_count = 0; + return 0; + default: + return -EINVAL; + } +} + +static int random_fasync(int fd, struct file *filp, int on) +{ + return fasync_helper(fd, filp, on, &fasync); +} + +const struct file_operations random_fops = { + .read = random_read, + .write = random_write, + .poll = random_poll, + .unlocked_ioctl = random_ioctl, + .fasync = random_fasync, + .llseek = noop_llseek, +}; + +const struct file_operations urandom_fops = { + .read = urandom_read, + .write = random_write, + .unlocked_ioctl = random_ioctl, + .fasync = random_fasync, + .llseek = noop_llseek, +}; + +SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count, + unsigned int, flags) +{ + if (flags & ~(GRND_NONBLOCK|GRND_RANDOM)) + return -EINVAL; + + if (count > INT_MAX) + count = INT_MAX; + + if (flags & GRND_RANDOM) + return _random_read(flags & GRND_NONBLOCK, buf, count); + + if (unlikely(nonblocking_pool.initialized == 0)) { + if (flags & GRND_NONBLOCK) + return -EAGAIN; + wait_event_interruptible(urandom_init_wait, + nonblocking_pool.initialized); + if (signal_pending(current)) + return -ERESTARTSYS; + } + return urandom_read(NULL, buf, count, NULL); +} + +/*************************************************************** + * Random UUID interface + * + * Used here for a Boot ID, but can be useful for other kernel + * drivers. + ***************************************************************/ + +/* + * Generate random UUID + */ +void generate_random_uuid(unsigned char uuid_out[16]) +{ + get_random_bytes(uuid_out, 16); + /* Set UUID version to 4 --- truly random generation */ + uuid_out[6] = (uuid_out[6] & 0x0F) | 0x40; + /* Set the UUID variant to DCE */ + uuid_out[8] = (uuid_out[8] & 0x3F) | 0x80; +} +EXPORT_SYMBOL(generate_random_uuid); + +/******************************************************************** + * + * Sysctl interface + * + ********************************************************************/ + +#ifdef CONFIG_SYSCTL + +#include + +static int min_read_thresh = 8, min_write_thresh; +static int max_read_thresh = OUTPUT_POOL_WORDS * 32; +static int max_write_thresh = INPUT_POOL_WORDS * 32; +static char sysctl_bootid[16]; + +/* + * This function is used to return both the bootid UUID, and random + * UUID. The difference is in whether table->data is NULL; if it is, + * then a new UUID is generated and returned to the user. + * + * If the user accesses this via the proc interface, the UUID will be + * returned as an ASCII string in the standard UUID format; if via the + * sysctl system call, as 16 bytes of binary data. 
+ */ +static int proc_do_uuid(struct ctl_table *table, int write, + void __user *buffer, size_t *lenp, loff_t *ppos) +{ + struct ctl_table fake_table; + unsigned char buf[64], tmp_uuid[16], *uuid; + + uuid = table->data; + if (!uuid) { + uuid = tmp_uuid; + generate_random_uuid(uuid); + } else { + static DEFINE_SPINLOCK(bootid_spinlock); + + spin_lock(&bootid_spinlock); + if (!uuid[8]) + generate_random_uuid(uuid); + spin_unlock(&bootid_spinlock); + } + + sprintf(buf, "%pU", uuid); + + fake_table.data = buf; + fake_table.maxlen = sizeof(buf); + + return proc_dostring(&fake_table, write, buffer, lenp, ppos); +} + +/* + * Return entropy available scaled to integral bits + */ +static int proc_do_entropy(struct ctl_table *table, int write, + void __user *buffer, size_t *lenp, loff_t *ppos) +{ + struct ctl_table fake_table; + int entropy_count; + + entropy_count = *(int *)table->data >> ENTROPY_SHIFT; + + fake_table.data = &entropy_count; + fake_table.maxlen = sizeof(entropy_count); + + return proc_dointvec(&fake_table, write, buffer, lenp, ppos); +} + +static int sysctl_poolsize = INPUT_POOL_WORDS * 32; +extern struct ctl_table random_table[]; +struct ctl_table random_table[] = { + { + .procname = "poolsize", + .data = &sysctl_poolsize, + .maxlen = sizeof(int), + .mode = 0444, + .proc_handler = proc_dointvec, + }, + { + .procname = "entropy_avail", + .maxlen = sizeof(int), + .mode = 0444, + .proc_handler = proc_do_entropy, + .data = &input_pool.entropy_count, + }, + { + .procname = "read_wakeup_threshold", + .data = &random_read_wakeup_bits, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_minmax, + .extra1 = &min_read_thresh, + .extra2 = &max_read_thresh, + }, + { + .procname = "write_wakeup_threshold", + .data = &random_write_wakeup_bits, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec_minmax, + .extra1 = &min_write_thresh, + .extra2 = &max_write_thresh, + }, + { + .procname = "urandom_min_reseed_secs", + .data = &random_min_urandom_seed, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec, + }, + { + .procname = "boot_id", + .data = &sysctl_bootid, + .maxlen = 16, + .mode = 0444, + .proc_handler = proc_do_uuid, + }, + { + .procname = "uuid", + .maxlen = 16, + .mode = 0444, + .proc_handler = proc_do_uuid, + }, +#ifdef ADD_INTERRUPT_BENCH + { + .procname = "add_interrupt_avg_cycles", + .data = &avg_cycles, + .maxlen = sizeof(avg_cycles), + .mode = 0444, + .proc_handler = proc_doulongvec_minmax, + }, + { + .procname = "add_interrupt_avg_deviation", + .data = &avg_deviation, + .maxlen = sizeof(avg_deviation), + .mode = 0444, + .proc_handler = proc_doulongvec_minmax, + }, +#endif + { } +}; +#endif /* CONFIG_SYSCTL */ + +static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] ____cacheline_aligned; + +int random_int_secret_init(void) +{ + get_random_bytes(random_int_secret, sizeof(random_int_secret)); + return 0; +} + +/* + * Get a random word for internal kernel use only. Similar to urandom but + * with the goal of minimal entropy pool depletion. 
+ * As a result, the random value is not cryptographically secure but
+ * for several uses the cost of depleting entropy is too high.
+ */
+static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash);
+unsigned int get_random_int(void)
+{
+	__u32 *hash;
+	unsigned int ret;
+
+	if (arch_get_random_int(&ret))
+		return ret;
+
+	hash = get_cpu_var(get_random_int_hash);
+
+	hash[0] += current->pid + jiffies + random_get_entropy();
+	md5_transform(hash, random_int_secret);
+	ret = hash[0];
+	put_cpu_var(get_random_int_hash);
+
+	return ret;
+}
+EXPORT_SYMBOL(get_random_int);
+
+/*
+ * randomize_range() returns a start address such that
+ *
+ *    [...... <range> .....]
+ *  start                  end
+ *
+ * a <range> with size "len" starting at the return value is inside
+ * the area defined by [start, end], but is otherwise randomized.
+ */
+unsigned long
+randomize_range(unsigned long start, unsigned long end, unsigned long len)
+{
+	unsigned long range = end - len - start;
+
+	if (end <= start + len)
+		return 0;
+	return PAGE_ALIGN(get_random_int() % range + start);
+}
+
+/* Interface for in-kernel drivers of true hardware RNGs.
+ * Those devices may produce endless random bits and will be throttled
+ * when our pool is full.
+ */
+void add_hwgenerator_randomness(const char *buffer, size_t count,
+				size_t entropy)
+{
+	struct entropy_store *poolp = &input_pool;
+
+	/* Suspend writing if we're above the trickle threshold.
+	 * We'll be woken up again once below random_write_wakeup_bits,
+	 * or when the calling thread is about to terminate.
+	 */
+	wait_event_interruptible(random_write_wait, kthread_should_stop() ||
+			ENTROPY_BITS(&input_pool) <= random_write_wakeup_bits);
+	mix_pool_bytes(poolp, buffer, count);
+	credit_entropy_bits(poolp, entropy);
+}
+EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
+
+/*
+ * Experimental code to replace parts of random.c
+ * Everything from here down is new code.
+ * Sandy Harris, sandyinchina@gmail.com
+ *
+ * Uses the 128-bit hash from AES-GCM instead of the 160-bit
+ * SHA-1.  Changing the hash also allows other changes.
+ *
+ * Goals:
+ *
+ * The main design goal was improved decoupling so that
+ * heavy use of /dev/urandom does not deplete the entropy
+ * pool for /dev/random.  As I see it, this is the only
+ * place where the current random(4) design is visibly
+ * flawed.
+ *
+ * Another goal was simpler mixing in of additional data
+ * in various places.  This may help with the difficult
+ * problem of timely initialisation; there have been
+ * some security failures due to mis-handling of this
+ * issue.  These cannot be completely dealt with in the
+ * driver, but we can do some things.
+ *
+ * I believe this code achieves both goals.
+ *
+ * The GCM hash:
+ *
+ * This sort of hash-like primitive has largely replaced
+ * more complex hashes in IPsec and TLS authentication;
+ * the new methods are often considerably faster and the
+ * code is simpler.  It therefore seemed worth trying such
+ * a hash here.
+ *
+ * I chose the Galois field multiplication from AES-GCM
+ * because it is widely used, well-analysed, and
+ * considered secure.  References are RFCs 4106 and 5288
+ * and NIST standard SP-800-38D.
+ *
+ * Intel and AMD both have instructions designed to
+ * make the GCM calculation faster:
+ * https://en.wikipedia.org/wiki/CLMUL_instruction_set
+ * Those are not used in this proof-of-concept code.
+ *
+ * https://eprint.iacr.org/2013/157.pdf discusses bugs
+ * in the OpenSSL version of this hash.
+ *
+ * Whether GCM is secure for this application needs
+ * analysis.
IPsec generates a 128-bit hash but uses + * only 96 bits, which makes some attacks much harder; + * this application uses all 128 bits. Also, the input + * for IPsec authentication is ciphertext, which is + * highly random with any decent cipher; input here is + * mainly pool data which may be much less random. + * + * Existing random(4) code folds the 160-bit SHA-1 + * output to get an 80-bit final output; I do not + * consider such a transform necessary here, but that + * needs analysis too. + * + * I add complications beyond the basic hash; those need + * analysis as well. + * + * Differences from current driver: + * + * I change nothing on the input side; the whole entropy + * collection and estimation part of existing code, as + * applied to the input pool, are untouched. + * + * The hashing and output routines, though, are completely + * replaced. The management of output pools is also changed; + * they just count how many outputs since the last reseed, + * as a counter-mode block cipher does, rather than trying + * to track entropy. + * + * Mixing: + * + * Much of the mixing uses invertible functions such + * as the pseudo-Hadamard transform or aria_mix(). + * These provably cannot reduce entropy; if they + * did, it would not be possible to invert them. + * + * As in existing code, all operations putting data + * into any pool are unidirectional; they use += or + * ^= to mix in new data so they cannot reduce the + * randomness of the pool, even with bad input data. + * + * I add an array of constants[], two for each pool, + * for use in the hashing, and a counter[] used + * in every output operation. All operations that + * put new data into those are also unidirectional. + * + * Output dependencies + * + * Every output from a normal pool (input, blocking + * or non-blocking) involves a GCM hash of pool + * contents. + * + * As well as pool data, every output depends on: + * + * two-128-bit entries from constants[] used + * in the hashing + * a global counter[] which is also hashed + * + * There is a 4th dummy pool (pool == NULL) + * which only hashes the counter, intended to + * replace the MD5 code in the current driver. + * + * There are three functions to get 128 bits, + * two from a specified pool p + * + * get128( p, out ) may block + * get_or_fail( p, out ) non-blocking + * + * get_any( out ) tries a series of sources, + * never blocks but does not always give a + * high-grade result + * + * Tests: + * + * Various tests here are deliberately more general + * than necessary; this protects against coding + * blunders, against flukes like a cosmic ray changing + * memory, and against misbehaviour from stressed devices + * like an overheated router, whether the stress is just + * natural or is part of an attack. + * + * For example, when a value is confidently expected + * to be either 0 or 1, if(x==0) ... if(x==1) ... + * is the obvious way to test it, but it is slightly + * safer to use if(x==0) ... else ... so unexpected + * cases can be handled. Similarly, end-of-loop tests + * could use x == N but x >= N is slightly safer. + * + * The value of this is arguably negligible and certainly + * minor, but the cost is near-zero and the behaviour + * is identical in all expected cases. I have therefore + * done this everywhere that I noticed it was possible. + * It would also be possible, of course, to detect and + * log unexpected cases, but it is not clear that this + * would be of much value. 
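+ *
+ * A small illustration of that pattern; handle_zero() and
+ * handle_other() are hypothetical, not functions in this file:
+ *
+ *	if( x == 0 )
+ *		handle_zero() ;
+ *	else
+ *		handle_other() ;	x == 1 is expected, but any
+ *					unexpected value lands here too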
+ */ + +static spinlock_t counter_lock ; +static spinlock_t constants_lock ; + +/********************************************************* + * unidirectional mixing operations + * + * both mix 128 bits from source into target + * two ways: xor or additions + ********************************************************/ + +static void xor128(u32 *target, u32 *source) +{ +#ifdef CONFIG_64BIT + u64 *s, *t ; + s = (u64 *) source ; + t = (u64 *) target ; + t[0] ^= s[0] ; + t[1] ^= s[1] ; +#else + target[0] ^= source[0] ; + target[1] ^= source[1] ; + target[2] ^= source[2] ; + target[3] ^= source[3] ; +#endif +} + +/* + * not a 128-bit addition, + * just four 32-bit or two 64-bit + */ +static void add128(u32 *target, u32 *source) +{ +#ifdef CONFIG_64BIT + u64 *s, *t ; + s = (u64 *) source ; + t = (u64 *) target ; + t[0] += s[0] ; + t[1] += s[1] ; +#else + target[0] += source[0] ; + target[1] += source[1] ; + target[2] += source[2] ; + target[3] += source[3] ; +#endif +} + +static void add256(u32 *target, u32 *source) +{ +#ifdef CONFIG_64BIT + u64 *s, *t ; + s = (u64 *) source ; + t = (u64 *) target ; + t[0] += s[0] ; + t[1] += s[1] ; + t[2] += s[2] ; + t[3] += s[3] ; +#else + target[0] += source[0] ; + target[1] += source[1] ; + target[2] += source[2] ; + target[3] += source[3] ; + target[4] += source[4] ; + target[5] += source[5] ; + target[6] += source[6] ; + target[7] += source[7] ; +#endif +} + +/********************************************************************* + * Two ways to mix a 128-bit buffer, one each for 256, 512 and 1024 + * These are generic functions that can mix anything the right size + * None know anything about pools or take any locks + * + * All mix in place, using no external data except buffer contents + * Any temporary storage used is cleared before returning + *********************************************************************/ + +/* + * The Aria block cipher is a Korean standard + * Cipher home page: http://210.104.33.10/ARIA/index-e.html + * See also RFC 5794 + * + * This application uses only the linear transform from + * Aria, not the whole cipher + * + * Mixes a 128-bit object treated as 16 bytes + * Each output byte is the XOR of 7 input bytes + * + * Some caution is needed in applying this since the + * function is its own inverse; using it twice on the + * same data gets you right back where you started + * + * Version here is based on GPL source at: + * http://www.oryx-embedded.com/doc/aria_8c_source.html + */ +static void aria_mix( u8 *x ) +{ + u8 y[16] ; + + y[0] = x[3] ^ x[4] ^ x[6] ^ x[8] ^ x[9] ^ x[13] ^ x[14]; + y[1] = x[2] ^ x[5] ^ x[7] ^ x[8] ^ x[9] ^ x[12] ^ x[15]; + y[2] = x[1] ^ x[4] ^ x[6] ^ x[10] ^ x[11] ^ x[12] ^ x[15]; + y[3] = x[0] ^ x[5] ^ x[7] ^ x[10] ^ x[11] ^ x[13] ^ x[14]; + y[4] = x[0] ^ x[2] ^ x[5] ^ x[8] ^ x[11] ^ x[14] ^ x[15]; + y[5] = x[1] ^ x[3] ^ x[4] ^ x[9] ^ x[10] ^ x[14] ^ x[15]; + y[6] = x[0] ^ x[2] ^ x[7] ^ x[9] ^ x[10] ^ x[12] ^ x[13]; + y[7] = x[1] ^ x[3] ^ x[6] ^ x[8] ^ x[11] ^ x[12] ^ x[13]; + y[8] = x[0] ^ x[1] ^ x[4] ^ x[7] ^ x[10] ^ x[13] ^ x[15]; + y[9] = x[0] ^ x[1] ^ x[5] ^ x[6] ^ x[11] ^ x[12] ^ x[14]; + y[10] = x[2] ^ x[3] ^ x[5] ^ x[6] ^ x[8] ^ x[13] ^ x[15]; + y[11] = x[2] ^ x[3] ^ x[4] ^ x[7] ^ x[9] ^ x[12] ^ x[14]; + y[12] = x[1] ^ x[2] ^ x[6] ^ x[7] ^ x[9] ^ x[11] ^ x[12]; + y[13] = x[0] ^ x[3] ^ x[6] ^ x[7] ^ x[8] ^ x[10] ^ x[13]; + y[14] = x[0] ^ x[3] ^ x[4] ^ x[5] ^ x[9] ^ x[11] ^ x[14]; + y[15] = x[1] ^ x[2] ^ x[4] ^ x[5] ^ x[8] ^ x[10] ^ x[15]; + memcpy( x, y, 16 ) ; + zero128( y ) ; +} + +/* + * The 
pseudo-Hadamard transform (PHT) can be + * applied to any word size and any number of words + * that is a power of two. Here for 4, 8 or 16 + * 32-bit words. + * + * In all cases it is invertible so it provably loses + * no entropy, and it makes every output word depend + * on every input word. + * + * conceptually, a 2-way PHT on a, b is + * x = a + b + * y = a + 2b + * a = x + * b = y + * a better implementation is just + * a += b + * b += a + * + * Larger PHTs use multiple applications of that. + * + * If you have 64-bit operations and aligned + * data structures, then these can be made + * faster. Only pht128() and add128() need to + * change; others just call them. + * + * If 32-bit arithmetic is used, then pht128() + * pht256() and pht512() are exactly the PHT + * on the appropriate number of 32-bit words. + * + * The 64-bit versions are not quite PHTs, but + * the important properties remain. They are still + * invertible & still make all 32-bit output words + * depend on all input words. + */ + +static void pht128( u32 *x ) +{ +#ifndef CONFIG_64BIT + /* + * a 4-way PHT is built from 4 2-way PHTs + * here it is unrolled into 8 += operations + * each line is a two-way PHT + */ + x[0] += x[1] ; x[1] += x[0] ; + x[2] += x[3] ; x[3] += x[2] ; + x[0] += x[2] ; x[2] += x[0] ; + x[1] += x[3] ; x[3] += x[1] ; +#else + /* + * two 2-way 64-bit PHTs (4 += operations) + * and a swap of two 32-bit words + */ + u32 temp ; + u64 *y ; + y = (u64 *) x ; + y[0] += y[1] ; y[1] += y[0] ; + temp = x[1]; x[1] = x[2] ; x[2] = temp ; + y[0] += y[1] ; y[1] += y[0] ; +#endif +} + +static void pht256( u32 *x ) +{ + u32 *y ; + y = x + 4 ; + + pht128(x) ; + pht128(y) ; + add128( x, y ) ; + add128( y, x ) ; +} + +static void pht512( u32 *x ) +{ + u32 *y ; + y = x + 8 ; + + pht256(x) ; + pht256(y) ; + add256( x, y ) ; + add256( y, x ) ; +} + +/* + * cube_mix() is from Daniel Bernstein's Cubehash + * It mixes 1024 bits, treated as an array of 32-bit words. + * + * based on Bernstein's code as distributed at + * http://bench.cr.yp.to/supercop.html + * He labels his code as public domain + * + * He has multiple versions. This is from the file + * cubehash1632/simple where 1632 indicates his main + * proposal (16 rounds and a 32-word state) and simple + * indicates the simplest code. The 1632 directory also + * has four different unrolled versions and over 20 + * versions for specific hardware. There are also + * many other directories, so lots of options for + * eventual optimisations. Here I just use a simple + * one for proof-of-concept testing. + * + * The Cubehash algorithm has three stages: + * + * 1 put some constants into the array + * mix with this transform to get initial state + * 2 for each input block + * mix input into state + * mix with this transform + * 3 mix with a different transform to + * get an output smaller than state + * + * Here there is no stage 1 or 3 since the state we + * mix is already initialised and we want output of + * the same size. Nor is there any input data; we are + * not hashing here. + * + * We just use the central transform to mix a buffer. 
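+ *
+ * In this file the transform is applied to the constants[]
+ * array and, when the output pools are at least 32 words
+ * each, to the pool contents as well; see mix_const_all()
+ * and big_mix() below.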
+ */ + +/* + * This is what Bernstein uses in his main proposal + * Arguably we need more because we lack stages 1 and 3 + * Arguably less since this not a hash; any mixing is OK + */ +#define CUBEHASH_ROUNDS 16 + +static void cube_mix( u32 *x ) +{ + int i; + int r; + u32 y[16]; + + for (r = 0;r < CUBEHASH_ROUNDS;++r) { + for (i = 0;i < 16;++i) x[i + 16] += x[i]; + for (i = 0;i < 16;++i) y[i ^ 8] = x[i]; + for (i = 0;i < 16;++i) x[i] = ROTL(y[i],7); + for (i = 0;i < 16;++i) x[i] ^= x[i + 16]; + for (i = 0;i < 16;++i) y[i ^ 2] = x[i + 16]; + for (i = 0;i < 16;++i) x[i + 16] = y[i]; + for (i = 0;i < 16;++i) x[i + 16] += x[i]; + for (i = 0;i < 16;++i) y[i ^ 4] = x[i]; + for (i = 0;i < 16;++i) x[i] = ROTL(y[i],11); + for (i = 0;i < 16;++i) x[i] ^= x[i + 16]; + for (i = 0;i < 16;++i) y[i ^ 1] = x[i + 16]; + for (i = 0;i < 16;++i) x[i + 16] = y[i]; + } + memzero_explicit(y, 64) ; +} + +/******************************************************************** + * Code to manage the array of two 128-bit "constants" per pool + * These are not really constants; this code changes them + * They are treated as constants in the extract-from-pool code + *********************************************************************/ + +/* + * mix one pool's constants array, two 128-bit rows + * in place mixing, uses no external data + * PHT + a rotation to make it nonlinear + */ +static void mix_const_p( struct entropy_store *r ) +{ + u32 *x ; + unsigned long flags ; + + x = r->A ; + + spin_lock_irqsave( &constants_lock, flags ) ; + *x = ROTL( *x, 5 ) ; + pht256( x ) ; + spin_unlock_irqrestore( &constants_lock, flags ) ; +} + +/* + * Update both constants for a pool. + * Needs no rotations because mix_const_p() has one + * + * Every call to this affects every hash for that pool, + * all future outputs from it, and all future feedback + * into it. + * + * This is the preferred way to rekey a pool, rather than + * buffer2pool() which mixes into the pool contents. + * + * This mixes in 128 bits of new data, so it is what the + * Yarrow paper calls "catastrophic reseeding". It resets + * r->count to indicate the rekeying but does not change + * r->entropy_count. 
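+ *
+ * The typical rekeying sequence, as used in get128() and
+ * init_random() below, is
+ *
+ *	u32 seed[4] ;
+ *	(void) get_any( seed ) ;	128 bits from the best source
+ *	buffer2array( r, seed ) ;	rekey r; seed[] is zeroed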
+ * + * All buffer2*() routines zero the input data after using it + */ +static void buffer2array( struct entropy_store *r, u32 *data ) +{ + u32 *x; + unsigned long flags1, flags2 ; + + x = r->A ; + + spin_lock_irqsave( &r->lock, flags1 ) ; + spin_lock_irqsave( &constants_lock, flags2 ) ; + xor128( x, data ) ; + pht256( x ) ; + spin_unlock_irqrestore( &constants_lock, flags2 ) ; + r->count = 0 ; + spin_unlock_irqrestore( &r->lock, flags1 ) ; + zero128( data ) ; +} + +/* + * mix the eight 128-bit constants[] for all pools + * in place mixing, uses no external data + * + * This uses the 1024-bit transform from Bernstein's Cubehash + * that has XOR, + and rotations so mixing is quite nonlinear + */ +static void mix_const_all( ) +{ + unsigned long flags ; + + spin_lock_irqsave( &constants_lock, flags ) ; + cube_mix( constants ) ; + spin_unlock_irqrestore( &constants_lock, flags ) ; +} + +/* + * mix the constants[] array and both output pools + * all in-place mixing, no external data + */ +static void big_mix() +{ + struct entropy_store *n, *b ; + unsigned long flags, flags2 ; + + n = &nonblocking_pool ; + b = &blocking_pool ; + + (void) mix_const_all() ; + + /* + * mix the output pools if possible + * with the default value for OUTPUT_POOL_WORDS + * the if here always succeeds + * + * for the >32 case, only part of pool is mixed + * but probably enough + */ + if( OUTPUT_POOL_WORDS >= 32 ) { + spin_lock_irqsave( &n->lock, flags ) ; + cube_mix( n->pool ) ; + spin_unlock_irqrestore( &n->lock, flags ) ; + + spin_lock_irqsave( &b->lock, flags ) ; + cube_mix( b->pool ) ; + spin_unlock_irqrestore( &b->lock, flags ) ; + } + /* + * the two pools combined are big enough + * do one mix for both + */ + else if( (OUTPUT_POOL_WORDS >= 16) && (n->pool == b->pool+OUTPUT_POOL_WORDS) ) { + spin_lock_irqsave( &n->lock, flags ) ; + spin_lock_irqsave( &b->lock, flags2 ) ; + cube_mix( b->pool ) ; + spin_unlock_irqrestore( &b->lock, flags2 ) ; + spin_unlock_irqrestore( &n->lock, flags ) ; + } + /* + * this should never be reached + * but put in some code for safety + */ + else if( OUTPUT_POOL_WORDS >= 8 ) { + spin_lock_irqsave( &n->lock, flags ) ; + pht256( n->pool ) ; + spin_unlock_irqrestore( &n->lock, flags ) ; + spin_lock_irqsave( &b->lock, flags ) ; + pht256( b->pool ) ; + spin_unlock_irqrestore( &b->lock, flags ) ; + } + /* This should definitely never be reached */ + else pr_warn("random: strange output pool size %d\n", OUTPUT_POOL_WORDS ) ; +} + +/* + * constants[] array has 10 128-bit rows + * 8 are pool constants, last 2 counter[] + * + * mix the last 4 rows + * 8 words in counter[] + * 8 words of constants[] for dummy_pool + * + * no rotations needed here; count() has enough + */ +static void top_mix() +{ + u32 *x ; + struct entropy_store *d ; + unsigned long flags1, flags2 ; + + d = &dummy_pool ; + x = d->A ; + + spin_lock_irqsave( &d->lock, flags1 ) ; + spin_lock_irqsave( &constants_lock, flags2 ) ; + pht512( x ) ; + spin_unlock_irqrestore( &constants_lock, flags2 ) ; + spin_unlock_irqrestore( &d->lock, flags1 ) ; +} + +/********************************************************************** + * The main hashing routines, based on authenticator code from AES-GCM + * + * GCM is Galois Counter Mode + * All operations are in a Galois field with 128-bit elements + * see http://csrc.nist.gov/publications/nistpubs/800-38D/SP-800-38D.pdf + **********************************************************************/ + +static u8 abits[128], ybits[128], prodbits[256] ; + +/* + * based on Dan Bernstein's AES-GCM 
implementation,
+ * part of CAESAR test code http://competitions.cr.yp.to/caesar.html
+ *
+ * Bernstein's description:
+ *
+ *	a = (a + x) * y in the finite field
+ *	16 bytes in a
+ *	xlen bytes in x; xlen <= 16; x is implicitly 0-padded
+ *	16 bytes in y
+ */
+
+static void addmul(u8 *a, const u8 *x, u32 xlen, const u8 *y)
+{
+	int i, j;
+
+	for (i = 0;i < xlen;++i)
+		a[i] ^= x[i];
+	for (i = 0;i < 128;++i)
+		abits[i] = (a[i / 8] >> (7 - (i % 8))) & 1;
+	for (i = 0;i < 128;++i)
+		ybits[i] = (y[i / 8] >> (7 - (i % 8))) & 1;
+
+	memzero_explicit( prodbits, 256 ) ;
+	for (i = 0;i < 128;++i)
+		for (j = 0;j < 128;++j)
+			prodbits[i + j] ^= abits[i] & ybits[j];
+	for (i = 127;i >= 0;--i) {
+		prodbits[i] ^= prodbits[i + 128];
+		prodbits[i + 1] ^= prodbits[i + 128];
+		prodbits[i + 2] ^= prodbits[i + 128];
+		prodbits[i + 7] ^= prodbits[i + 128];
+		prodbits[i + 128] ^= prodbits[i + 128];
+	}
+
+	zero128( a ) ;
+	for (i = 0;i < 128;++i)
+		a[i / 8] |= (prodbits[i] << (7 - (i % 8)));
+}
+
+/*
+ * Bernstein's code has prodbits[], abits[] and ybits[] as locals.
+ * We make them global so this function can clear them.
+ *
+ * With them as locals we could
+ *	either clear them for every addmul() call (expensive)
+ *	or not clear them at all (a possible, though minor, security risk)
+ * It is better to use globals and clear them at the end of a sequence.
+ */
+static void clear_addmul()
+{
+	memzero_explicit( prodbits, 256 ) ;
+	memzero_explicit( abits, 128 ) ;
+	memzero_explicit( ybits, 128 ) ;
+}
+
+/*
+ * Mix n bytes into an accumulator using addmul()
+ *
+ * This is a keyed hash that takes nbytes of input, a 128-bit initial value
+ * and a 128-bit key (the multiplier for addmul()), and gives a 128-bit output.
+ *
+ * This routine neither initialises the accumulator nor finalises the output.
+ * The expected calling sequence looks like this:
+ *
+ *	initialise accumulator (from some constant)
+ *	call this to mix in data (another constant is multiplier)
+ *	optionally, repeat the call one or more times for other data
+ *	finalise output
+ *
+ * The main use here is against the various pools, replacing the hash
+ * previously used there. This should be faster and as secure, though
+ * speed needs testing & the security claim needs analysis.
+ *
+ * Note that it can be used with any data, and with a sequence of data
+ * chunks. In AES-GCM it is run over unencrypted headers so those can
+ * be authenticated along with the encrypted payload.
+ *
+ * Here it is run over counter[] as well as pool data so that outputs
+ * depend on a global piece of state, not just on one pool.
+ *
+ * It might also be run over any kernel data structure that is expected
+ * to be unpredictable to an enemy, giving extra entropy.
+ *
+ * It can also be run over anything that is expected to be different
+ *
+ *	on each machine (e.g. Ethernet MACs)
+ *	on each boot (clock data)
+ *	or on each read of /dev/urandom (process info for reader)
+ *
+ * Such data cannot be trusted for entropy; it may be unknown to some
+ * attackers, but we cannot rely on it being unknown to all. However it
+ * can still be useful in a role like that of salt in a hash; it makes
+ * brute force or table-driven attacks much harder.
+ */
+static void mix_in( u8 *data, u32 nbytes, u8 *mul, u32 *accum)
+{
+	u32 len, left ;
+	u8 *p ;
+
+	for( p = data, left = nbytes ; left != 0 ; p += len, left -= len) {
+		len = (left >= 16) ? 16 : left ;
+		addmul( (u8 *) accum, p, len, mul ) ;
+	}
+}
+
+/*
+ * Start of every output routine.
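+ *
+ * Every output path pairs the two hashing steps. As used in
+ * get_or_fail() and get128() below, the sequence for a pool r is
+ *
+ *	u32 out[4] ;
+ *	mix_first( r, out ) ;	hash constants, counter & pool data
+ *	mix_last( r, out ) ;	feedback, then a 2nd hash;
+ *				skipped for the dummy pool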
+ *
+ * The Schneier et al Yarrow rng design rekeys a counter mode
+ * block cipher from its own output every 10 blocks, to avoid
+ * giving an enemy a sequence of related values to work on.
+ *
+ * Here we have feedback into any non-dummy pool on every iteration,
+ * changing 8 pool words every time. If the pool is 4K bits, 128 words,
+ * then every word is changed after 16 iterations; in a smaller pool
+ * this happens sooner. That may be all the rekeying we need, but there
+ * is some mixing of the constants here to supplement it.
+ *
+ * The dummy pool (r->pool == NULL) gets no feedback into the pool, so
+ * we mix its constants more often.
+ *
+ * This routine never requests output from any pool to drive rekeying.
+ * That overhead would be excessive in a routine that is called for
+ * every output operation from any pool.
+ *
+ * AES-GCM authentication is:
+ *
+ *	initialise accumulator all-zero
+ *	mix in data with multiplier H
+ *	xor in H before output
+ *
+ * The algorithm here is:
+ *
+ *	maybe mix constants r->A and r->B
+ *	initialise accumulator from r->A
+ *	mix in data with multiplier r->B
+ *		counter[] for any pool
+ *		pool data for non-dummy pools
+ *	xor in r->B
+ *
+ * That finishes the first hash. For the dummy pool, we stop
+ * there and use that output.
+ *
+ * Some constants, both primes from the list at:
+ * https://primes.utm.edu/lists/small/10000.txt
+ *
+ * ADJUST THESE FOR TUNING
+ * To test, I just use the first primes > 10 and > 100
+ *
+ *	FREQUENCY	how often to mix constants for most pools
+ *	FREQDUMMY	for the dummy pool
+ */
+
+#define FREQUENCY 101
+#define FREQDUMMY 11
+
+static void mix_first( struct entropy_store *r, u32 *accum )
+{
+	u32 x ;
+	unsigned long flags ;
+
+	spin_lock_irqsave( &r->lock, flags ) ;
+	x = r->count++ ;
+	spin_unlock_irqrestore( &r->lock, flags ) ;
+
+	/*
+	 * sometimes mix constants before using them
+	 * do not zero the count
+	 * only buffer2array() does that
+	 */
+	if( r->pool != NULL) {
+		if( (x%FREQUENCY) == 0 )
+			mix_const_p( r ) ;
+	}
+	else {
+		if( (x%FREQDUMMY) == 0 )
+			mix_const_p( r ) ;
+	}
+
+	/* initialise the accumulator */
+	memcpy( (u8 *) accum, (u8 *) r->A, 16 ) ;
+
+	/* mix in the counter and update it */
+	addmul( (u8 *) accum, (u8 *) counter, 16, (u8 *) r->B) ;
+	count() ;
+
+	/* for non-dummy pools, mix in pool data */
+	if( r->pool != NULL )
+		mix_in( (u8 *) r->pool, r->size, (u8 *) r->B, accum ) ;
+
+	/*
+	 * finalise the result
+	 * it depends on at least r->A, r->B and counter[]
+	 * for non-dummy pools, on pool contents as well
+	 */
+	xor128( accum, r->B ) ;
+
+	clear_addmul() ;
+}
+
+/*
+ * Last function in the mixing sequence for any of the 3 real pools
+ * Not used for the dummy pool
+ *
+ * No locking is needed in this function.
+ * The caller need not hold locks either, & should not.
+ *
+ * First, put feedback into the pool:
+ *
+ *	save a copy of the 1st hash's result
+ *	feed the result back into the pool
+ *
+ * Then do a 2nd hash to get output different from the feedback:
+ *
+ *	re-initialise accumulator from r->B
+ *	mix in saved data with multiplier r->A
+ *	xor in data to get output
+ *
+ * The constants are used differently in the two hashes. In
+ * mix_first(), A is the initialiser and B the multiplier.
+ * In the second hash here, they swap roles.
+ *
+ * In the first hash, the same constant is used twice, first
+ * as the multiplier in the finite field multiplication, then
+ * in an XOR. This is exactly the way that AES-GCM uses its
+ * constant H.
+ * + * AES-GCM has: hash( data, all-0, H ) xor H + * our 1st hash: hash( data, A, B ) xor B + * our 2nd hash: hash( data, B, A ) xor data + * + * A well-known paper on building hashes from block ciphers, + * pretty much the standard reference on the topic, is: + * Preneel, Govaerts & Vandewalle + * https://www.cosic.esat.kuleuven.be/publications/article-48.ps + * + * It shows that some structures resist backtracking. + * They consider 64 possibilities and show that exactly + * 12 of them are secure. Both hashes here use structures + * from among that 12. + */ + +static void mix_last( struct entropy_store *r, u32 *accum ) +{ + u32 temp[4] ; + + /* + * for the dummy pool, this should not be called + * if it is, there is nothing to do here + */ + if( r->pool == NULL ) { + pr_warn("random: mix_last() called for dummy pool\n" ) ; + return ; + } + + /* + * for any other pool, continue + * save result for use in generating output + */ + memcpy( temp, accum, 16 ) ; + + /* feed intermediate result back into pool */ + buffer2pool( r, accum ) ; + + /* + * Apply another hash step to the saved value in temp[] + * to create an output different from feedback + */ + memcpy( accum, r->B, 16 ) ; + addmul( (u8 *) accum, (u8 *) temp, 16, (u8 *) r->A) ; + xor128( accum, temp ) ; + + clear_addmul() ; + zero128( temp ) ; +} + +/* + * Input pool rekeys from external data and maybe hardware rng + * Blocking pool rekeys from the input pool before every output + * Dummy pool gets its constants changed when top_mix() is used. + * + * In mix_first() all pools sometimes mix their own constants + * and in mix_last() all non-dummy pools get feedback applied + * to their pool data. All pools are affected by the counter[] + * and by mix_const_all(). + * + * The only place where rekeying needs more complex management + * is for the nonblocking pool + * + * The blocking pool generates only one /dev/random output + * each time it is reseeded. It appears safe to generate + * additional outputs to reseed the nonblocking pool; there is + * good mixing there so blocking pool output is not exposed to + * attack by this, except in a remarkably indirect way. + * + * The blocking pool is reseeded whenever /dev/random is + * used, so if it is used often, then the nonblocking pool + * will almost always be able to safely reseed from there. + * + * How many outputs can we safely take from a seeded pool? + * ====================================================== + * + * Too large a value will be insecure, but it is not clear what + * "too large" means here. The question has been well studied + * for counter mode block ciphers, but the analysis does not + * apply directly here; at best it allows sensible guesses. + * + * For n-bit block size the Yarrow paper shows a generic attack + * for any counter mode block cipher after 2^(n/3) output blocks, + * about 2^42 for 128-bit block size, and one NIST document + * suggests an absolute upper limit of 2^48 for AES-CTR. + * + * Real applications generally use a much lower limit. Here I + * think a value for SAFE_OUT around 2^16 is the largest that + * could reasonably be considered, perhaps the prime (2^16)+1. + * + * However, using that seems unnecessary; a much lower value + * is enough to effectively decouple /dev/urandom and /devrandom. + * We want a low enough value that going over it sometimes when + * entropy is low will not be fatal. 
+ * + * Even if /dev/random is not used, the nonblocking pool can reseed + * from the blocking pool SAFE_OUT times before it needs to reseed + * from a hardware rng or the input pool. Since it does SAFE_OUT + * output blocks per reseed, it can produce SAFE_OUT*SAFE_OUT blocks + * before it needs to reseed other than from the blocking pool. + * + * Using primes (just because), some possibilities are: + * + * with SAFE_OUT = 31, almost 1,000 blocks + * with SAFE_OUT = 101, over 10,000 blocks + * with SAFE_OUT = 331, over 100,000 blocks + * with SAFE_OUT = 503, over 250,000 blocks + * with SAFE_OUT = 1009, over 1,000,000 blocks + * with SAFE_OUT = (2^16)+1, over 2^32 blocks + * + * Any sensible value for SAFE_OUT will greatly reduce load on the + * input pool when the nonblocking pool is heavily used. + */ + +#define SAFE_OUT 503 + +/* constants to test input pool entropy level */ +#define E_MINIMUM 1024 +#define E_PLENTY (INPUT_POOL_WORDS*24) + +/* + * try to get 128 bits from a pool + * return 1 for success, 0 for failure + */ +static int get_or_fail( struct entropy_store *r, u32 *out ) +{ + int got ; + u32 temp[4] ; + unsigned long flags ; + + if( r == &input_pool ) { + spin_lock_irqsave( &r->lock, flags ) ; + if( (got = (ENTROPY_BITS(r) > E_MINIMUM)) ) + credit_entropy_bits( r, -128 ) ; + spin_unlock_irqrestore( &r->lock, flags ) ; + if( got ) { + mix_first( r, out ) ; + mix_last( r, out ) ; + return 1 ; + } + else return 0 ; + } + else if( (r == &blocking_pool) || (r == &nonblocking_pool) ) { + /* + * need not lock here + * going slightly over SAFE_OUT is not dangerous + */ + if( r->count < SAFE_OUT ) { + mix_first( r, out ) ; + mix_last( r, out ) ; + return 1 ; + } + else return 0 ; + } + /* + * dummy pool always succeeds + * but may need rekeying first + */ + else if( r == &dummy_pool) { + if( r->count >= SAFE_OUT ) { + get_any( temp ) ; + buffer2array( r, temp ) ; + } + mix_first( r, out ) ; + return 1 ; + } + else { + pr_warn("random: get_or_fail() gets bad pool argument\n" ) ; + return 0 ; + } +} + +/* + * get 128 bits from somewhere + * always succeeds, but may not always give good data + * + * return value indicates data source + * 1 = input, 2 = blocking, 3 = nonblocking + * 4 = dummy, 5 = hw rng + */ +static int get_any( u32 *out ) +{ + int got ; + struct entropy_store *r ; + unsigned long flags ; + + /* + * use the input pool if it has plenty + * of entropy + * + * unlike get_or_fail(), this function + * does not test for > E_MINIMUM + * so it avoids depleting input entropy + * except when there is plenty + */ + r = &input_pool ; + spin_lock_irqsave( &r->lock, flags ) ; + if( (got = (ENTROPY_BITS(r) > E_PLENTY)) ) + credit_entropy_bits( r, -128 ) ; + spin_unlock_irqrestore( &r->lock, flags ) ; + if( got ) { + mix_first( r, out ) ; + mix_last( r, out ) ; + return 1 ; + } + + /* + * this is likely to be the most common case + * & should usually succeed + */ + if( get_or_fail( &blocking_pool, out ) ) + return 2 ; + + /* + * hw rng may not be fully trusted, + * but it is fine as a fallback here + */ + if( get_hw_random( out ) ) { + /* + * if we reach here, hw rng works + * but input pool is not close to full + * so try to refill it + */ + load_input() ; + return 5 ; + } + + /* reaching here should be rare; do what we can */ + if( get_or_fail( &nonblocking_pool, out ) ) + return 3 ; + + /* dummy pool always succeeds */ + get128( &dummy_pool, out ) ; + return 4 ; +} + +/* + * get 128 bits from a pool + * for input or blocking pool, this may block + * for dummy or nonblocking, it will 
not + */ + +static u32 rekey_flip_flop = 0 ; + +static void get128( struct entropy_store *r, u32 *out ) +{ + u32 temp[4] ; + unsigned long flags ; + + /* + * get_or_fail( r, out ) cannot be used here + * pool must be rekeyed before output + */ + if( r == &blocking_pool ) { + /* + * try non-blocking function first + * if it fails, use blocking function + */ + if( !get_or_fail( &input_pool, temp ) ) + get128( &input_pool, temp ) ; + /* + * one way or the other, we have data, so reseed + * r->count is reset in buffer2array() + */ + buffer2array( r, temp ) ; + + /* produce output */ + mix_first( r, out ) ; + mix_last( r, out ) ; + return ; + } + + /* + * for any pool except blocking + * see if pool is ready for output + * dummy pool is always ready + */ + if( get_or_fail( r, out) ) { + return ; + } + + /* + * nonblocking pool not ready + * rekey it, without blocking + */ + if( r == &nonblocking_pool ) { + /* + * First choice is to rekey from blocking pool + * This should very often succeed + * else non-blocking function that always succeeds + */ + if( !get_or_fail(&blocking_pool, temp) ) + (void) get_any( temp ) ; + /* + * one way or the other, we have data, so reseed + * r->count is reset in buffer2array() + */ + buffer2array( r, temp ) ; + /* + * Do some extra mixing + * + * Rekeying is infrequent enough (once + * every SAFE_OUT blocks) that we can + * afford a somewhat expensive mix here + * + * constants[] has 10 128-bits rows + * 8 for pool constants, 2 for counter[] + * + * mix_const_all() mixes first 8 rows + * top_mix() mixes last 4 + * they overlap so all 10 get mixed + * if both are used + */ + if( rekey_flip_flop ) { + /* + * Mix all the pool constants + * so the rekey affects all pools + * This is the only full mix except + * during initialisation + */ + mix_const_all() ; + rekey_flip_flop = 0 ; + } + else { + /* + * mix counter[] + * and constants for dummy pool + */ + top_mix() ; + rekey_flip_flop = 1 ; + } + + /* produce output */ + mix_first( r, out ) ; + mix_last( r, out ) ; + return ; + } + + if( r == &input_pool ) { + /* pool entropy is low, so try hw rng */ + if( !load_input() ) { + /* no hw rng, toss in something */ + (void) get_any( temp ) ; + buffer2pool( r, temp ) ; + } + + /* + * ADD CODE HERE + * adapt code from current driver + * needs to block sometimes + * and deal with entropy_count + */ + spin_lock_irqsave( &r->lock, flags ) ; + credit_entropy_bits( r, -128 ) ; + spin_unlock_irqrestore( &r->lock, flags ) ; + + /* produce output */ + mix_first( r, out ) ; + mix_last( r, out ) ; + return ; + } +} + +/***************************************************************** + * loop to fill an output buffer with data + * for input or blocking pool, this may block + *****************************************************************/ + +static void loop_output( struct entropy_store *r, u32 *out, u32 nbytes ) +{ + u32 temp[4] ; + int n, m ; + u8 *p ; + + /* + * for pools that may block, try to avoid it + * fill input pool from hw rng if available + */ + if( got_hw_rng && ((r == &input_pool) || (r==&blocking_pool)) ) + load_input() ; + + /* + * Ensure that each call to this function will start + * a new output stream which is almost independent + * of previous streams. For a rationale, see the + * Fortuna paper by Schneier et al. + */ + counter_any() ; + + /* + * ADD CODE HERE? 
+ * + * For /dev/urandom accesses, we could mix in process + * info for the reading process, just apply addmul() + * to task_info struct to mix it into counter[] or + * into the constants + * + * This depends on a different aspect of the system than + * anything else in the driver, namely the order in which + * user processes ask for data and the current state of + * those processes. + * + * Except perhaps on simple embedded systems, this should + * be hard to guess. It should be impossible to monitor + * unless the attacker is logged into the system or has + * left a background process running on it. Even then, + * monitoring it would not be easy. + */ + + for( n = nbytes, p = (u8 *) out ; n > 0 ; n -= m, p += m ) { + m = (n >= 16) ? 16 : n ; + get128( r, temp ) ; + memcpy( p, (u8 *) temp, m) ; + } + zero128( temp ) ; +} + +/****************************************************************** + * Mixing into pool data + * + * This routine is used only to mix data into the pool itself, + * for feedback in mix_last() + * + * Output operations from any pool use the hashing parts of + * mix_last(), not this code. + * + * For rekeying, buffer2array() is preferred over this; change a + * constant rather than pool data. The effects are more easily + * analysed, and more general since changing a constant always + * affects the pool but not vice versa. + * + * Use this only for data known to be (or at least appear) + * highly random + * + * hardware RNG data + * hash output + * cipher output (not used here) + * + * Input mixing should NOT use this; existing driver code is far + * better for low-to-medium entropy inputs. Existing code is OK + * for high-entropy inputs as well, though it appears to have been + * designed for the low entropy case. + * + * I added this in hopes it would be faster, and easier to analyze + * in the high-entropy case. Also, using two different mixers gives + * insurance if either has some unknown weakness. + *******************************************************************/ + +/* + * Mix a 128-bit buffer into a pool, changing 8 32-bit pool words + * All buffer2*() routines zero the input data after using it + * + * This does not reset r->count; only buffer2array() does that + * Nor does it change r->entropy_count + * + * Eventually this stirs the entire pool, making every pool word + * depend both on every other pool word and on many external inputs. + * This is the only stirring the output pools get, except during + * initialisation. + */ +static void buffer2pool( struct entropy_store *r, u32 *buff) +{ + u32 *a, *b ; + unsigned long flags ; + + /* normal case, real pool */ + if( r->pool != NULL ) { + spin_lock_irqsave( &r->lock, flags ) ; + a = r->p ; + b = r->q ; + /* mix a[] and add new data */ + a[0] = ROTL( a[0], 5 ) ; + xor128( a, buff ) ; + pht128( a ) ; + /* mix b[] */ + aria_mix( (u8 *) b ) ; + /* PHTs between rows */ + add128( a, b ) ; + add128( b, a ) ; + /* update pointers */ + r->p += 4 ; + if( r->p >= r->end ) + r->p = r->pool ; + r->q += 4 ; + if( r->q >= r->end ) + r->q = r->pool ; + spin_unlock_irqrestore( &r->lock, flags ) ; + zero128( buff ) ; + } + /* + * if called for dummy pool, which should not happen + * there is no pool to mix to + * so mix to constants instead + */ + else { + buffer2array( r, buff ) ; + pr_warn("random: buffer2pool() called for dummy pool\n" ) ; + } +} + +/********************************************************* + * initialise counter & output pools + * + * This should not be done until there is enough (256 bits?) 
+ * entropy in the input pool.
+ *
+ * This code does not deal with that problem!
+ * FIX BEFORE USING
+ ********************************************************/
+
+/* how many 128-bit chunks to mix into a pool */
+#define HOW_MANY 4
+
+static void init_random()
+{
+	u32 temp[4], *x, *y ;
+	int j ;
+	struct entropy_store *i, *b, *n, *d ;
+	ktime_t now ;
+
+	i = &input_pool ;
+	b = &blocking_pool ;
+	n = &nonblocking_pool ;
+	d = &dummy_pool ;
+
+	/* set up the locks used throughout this code */
+	spin_lock_init( &counter_lock ) ;
+	spin_lock_init( &constants_lock ) ;
+
+	/*
+	 * fill input pool from hardware rng if possible
+	 * if that works, mix hw data into constants as well
+	 */
+	if( load_input() )
+		(void) load_constants() ;
+
+	/*
+	 * ADD CODE HERE?
+	 *
+	 * If data from the kernel command line is available,
+	 * mix it into counter[] or the input pool before doing
+	 * anything else. Either way, it will then affect
+	 * all future operations.
+	 *
+	 * Simplest: XOR 256 bits into 8 words of counter[]
+	 * or, with exactly 128, call buffer2counter()
+	 */
+
+	mix_first( i, temp ) ;
+
+	/*
+	 * Existing code to get data for the input pool uses timer
+	 * information. So do programs like my maxwell(8), Stephan
+	 * Mueller's jitter driver or Havege. Most of my code here
+	 * therefore does not use timings, since that entropy is
+	 * already accounted for. There are two exceptions:
+	 *
+	 *	buffer2counter() mixes in jiffies
+	 *	the code just below mixes in ktime_get_real()
+	 *
+	 * Here timer info is added so initialisation is a bit
+	 * different each time. Nowhere near enough entropy
+	 * to make things secure by itself, but better than
+	 * nothing.
+	 */
+	now = ktime_get_real() ;
+	mix_in( (u8 *) &now, sizeof(now), (u8 *) i->B, temp) ;
+
+	mix_in( (u8 *) utsname(), sizeof(*(utsname())), (u8 *) i->B, temp) ;
+
+	/*
+	 * ADD CODE HERE
+	 *
+	 * Mix static info into temp[],
+	 * things that can act as salt.
+	 *
+	 * These need not be unpredictable,
+	 * just different on different systems,
+	 * e.g. ethernet MAC, other hardware info.
+	 *
+	 * Existing code uses utsname(). That and, if
+	 * possible, more should be added here.
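+ *
+ * A sketch of what such mixing might look like, with dev a
+ * hypothetical, already-found struct net_device (the lookup
+ * and error handling are omitted):
+ *
+ *	mix_in( dev->dev_addr, ETH_ALEN, (u8 *) i->B, temp ) ;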
+ */ + + mix_last( i, temp ) ; + + /* + * Use that first result to re-initialise the counter + * This will affect all future outputs from any pool + * + * Provided enough entropy is present before this, + * from any of: + * data in random_init.h + * kernel command line + * input to pool before this runs + * this makes the counter unknowable to an enemy + * + * All future outputs, including the ones that + * rekey pools below, depend on the counter + */ + buffer2counter( temp ) ; + + /* + * mix data into the output pools + * try to get from input pool first + * else from dummy pool which never blocks + * + * don't use get_any() yet; its only advantage + * over just using dummy pool is that it might + * get from output pools, but that is much more + * expensive and output pools are not fully + * initialised yet + */ + for( j = 0, x=n->pool, y=b->pool ; j < HOW_MANY ; j++, x+=4, y+=4 ) { + if( !get_or_fail(i, temp) ) + get128( d, temp) ; + spin_lock( &n->lock) ; + xor128( x, temp ) ; + spin_unlock( &n->lock) ; + spin_lock( &b->lock) ; + add128( y, temp ) ; + spin_unlock( &b->lock) ; + } + /* now get_any() and constants_any() can safely be used */ + + /* + * refill input pool from hardware rng if possible + * if that works, mix hw data into constants as well + */ + if( load_input() ) { + (void) load_constants() ; + } + else { + /* + * update counter[] and constants for dummy pool + * before using them + */ + top_mix() ; + /* + * mix pseudorandom bits into input pool + * use cheap non-blocking source, dummy pool + */ + for( j = 0, x=i->pool ; j < HOW_MANY ; j++, x+=4 ) { + get128( d, temp ) ; + add128( x, temp ) ; + } + /* + * mix random data into constants[] + * use best available data + */ + (void) get_any( temp ) ; + buffer2array( i, temp ); + (void) get_any( temp ) ; + buffer2array( n, temp ); + (void) get_any( temp ) ; + buffer2array( b, temp ); + (void) get_any( temp ) ; + buffer2array( d, temp ); + } + /* Mix constants[] and both output pools */ + big_mix() ; + + /* output should use a different counter[] value */ + counter_any() ; +} + +/***************************************************************** + * 128-bit counter to mix in when hashing + * + * There is only one counter[] and three functions to update it, + * count() to iterate it, buffer2counter() or counter_any() + * to re-initialise it with a new starting value + * + * mix_first() uses counter[] and calls count(), so the count both + * affects and is affected by all output operations on any pool. + * + * Operations on this counter do not affect the per-pool counts + * for any pool, neither the entropy count nor the r->count + * iteration counter. + * + * One reason for including the counter is that it allows fast + * initialisation. The very first output from the input pool is + * used to update the counter. Once that is done, even if the + * pools were all worthless, every output operation would still + * have at least the strength of hash(constants, counter) which + * is very roughly equivalent to a counter mode block cipher + * encrypt(key,counter). + * + * mix_first() mixes in the counter so it affects all output from + * any pool and all feedback into any pool. Every operation on any + * pool changes the counter, so it automatically influences all the + * other pools, albeit in an indirect and quite limited way. + * + * This can contribute to recovery after an rng state compromise. 
+ * Even knowing the counter value at one time an enemy cannot infer + * the future effects unless he can predict the order of future + * output operations, which depends on data requests from all sources. + * Nor can he work backwards to get previous outputs unless he knows + * the order of previous operations. + * + * This may provide almost no protection on a simple embedded system + * or over a very short time span, since in those cases an enemy + * might guess the sequence of operations or search through some + * moderate number of possibilties. However it should be quite + * effective for more complex systems and longer time spans. + ****************************************************************/ + +static u32 iter_count = 0 ; +static u32 loop_count = 0 ; + +/* + * 41 times 251 iterations per loop + * gives about 10,000 outputs before auto-rekey + */ +#define MAX_LOOPS 41 + +/* constant from SHA-1 */ +#define COUNTER_DELTA 0x67452301 + +/* + * Code is based on my own work in the Enchilada cipher: + * https://aezoo.compute.dtu.dk/doku.php?id=enchilada + * That implements a 128-bit counter in 4 32-bit words + * + * Here counter[] is declared as 8 words; the others + * are used only during updates, in buffer2counter() + * + * Add a constant instead of just incrementing, and include some + * other operations, so Hamming weight changes more than for a + * simple counter. Mix +, XOR and rotation so it is nonlinear. + * + * This may not be strictly necessary, but a simple counter can + * be considered safe only if you trust the crypto completely. + * Low Hamming weight differences in inputs do allow some attacks + * on block ciphers or hashes and the high bits of a large counter + * that is only incremented do not change for aeons. + * + * The extra code here is cheap insurance. + * + * For discussion, see mailing list thread starting at: + * http://www.metzdowd.com/pipermail/cryptography/2014-May/021345.html + */ + +static void count(void) +{ + int reseed ; + unsigned long flags ; + + /* + * There should be enough other rekeying that + * this is quite rare. This is just here for + * safety, much as IPsec rekeys after 2^32 + * blocks if no other rekeying is done. 
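+	 *
+	 * With MAX_LOOPS = 41 and 251 iterations per loop, this
+	 * triggers after 41 * 251 = 10,291 calls to count(), the
+	 * "about 10,000 outputs" mentioned above.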
+ */ + spin_lock_irqsave( &counter_lock, flags ) ; + reseed = (loop_count >= MAX_LOOPS) ; + spin_unlock_irqrestore( &counter_lock, flags ) ; + if( reseed ) + counter_any() ; + + spin_lock_irqsave( &counter_lock, flags ) ; + + /* + * Limit the switch to < 256 cases + * should work with any CPU & compiler + * + * Five constants used, all primes + * roughly evenly spaced, around 50, 100, 150, 200, 250 + */ + switch( iter_count ) { + /* + * mix three array elements + * each element is used twice + * once on left, once on right + * pattern is circular + * order chosen for fast mixing + */ + case 47: + counter[1] += counter[3] ; + break ; + case 101: + counter[2] += counter[1] ; + break ; + case 197: + counter[3] += counter[2] ; + break ; + /* + * inject counter[0] into that loop + * the loop and counter[0] use += + * so use ^= here + * + * inject into counter[2] + * so case 197 starts spreading the effect + */ + case 149: + counter[2] ^= counter[0] ; + break ; + /* + * restart loop + * throw in rotations for nonlinearity + */ + case 251: + counter[1] = ROTL( counter[1], 3) ; + counter[2] = ROTL( counter[2], 7) ; + counter[3] = ROTL( counter[3], 13) ; + iter_count = 0 ; + loop_count++ ; + break ; + /* + * for 247 out of every 252 iterations + * the switch does nothing + */ + default: + break ; + } + /* + * counter[0] is purely a counter + * nothing above affects it + * uses += instead of ++ to change Hamming weight more + * + * would repeat after 2^32 iterations + * not a problem since the rest of counter[] changes too + * and 2^32 will not be reached + */ + counter[0] += COUNTER_DELTA ; + iter_count++ ; + + spin_unlock_irqrestore( &counter_lock, flags ) ; +} + +/* + * code to set a new counter value + * + * All buffer2*() routines + * expect 128 bits of input + * zero the input data after using it + */ +static void buffer2counter( u32 *data ) +{ + unsigned long flags ; + + spin_lock_irqsave( &counter_lock, flags ) ; + /* + * timing data is used elsewhere in driver + * and we do not want an expensive operation + * here, so use simplest thing that makes + * every call different + */ + counter[0] ^= jiffies ; + /* + * mix all 8 words in counter[] array + * this and top_mix() are the only things + * that change the high 4 words + */ + pht256( counter ) ; + /* + * input data mixed into low 4 words of counter[] + * which are the actual 128-bit counter + * + * high 4 words are multiplier in GCM mixing + * this is the only place they are used + */ + addmul( (u8 *) counter, (u8 *) data, 16, (u8 *) (counter+4) ) ; + /* + * make the mixing non-invertible + * see reference to Preneel et al. in comment for mix_last() + */ + xor128( counter, data ) ; + + loop_count = 0 ; + iter_count = 0 ; + + spin_unlock_irqrestore( &counter_lock, flags ) ; + + zero128( data ) ; + clear_addmul() ; +} + +static void counter_any( ) +{ + u32 temp[4] ; + (void) get_any( temp ) ; + buffer2counter( temp ) ; +} + +/**************************************************************** + * Code to deal with hardware RNG, if any + * + * get_hw_random() just puts 128 bits from hw rng in a buffer + * + * load_input() makes sure that, if we have a hardware rng, then the + * input pool is well supplied with data + * + * Absent an rng instruction, these functions would be the logical + * place to add data from something else, such as a hardware rng + * accessed via a driver rather than an instruction (Turbid, or an + * on-board or plug-in device) or something using timing data such + * as Havege or Stephan Mueller's jitter. 
There is no code for that
+ * here yet.
+ *
+ * Both get_hw_random() and load_input() set got_hw_rng and return
+ * a value for success/failure. If all arch_get_random_long() calls
+ * succeed, both got_hw_rng and the return are 1; if any call fails,
+ * both are 0.
+ *
+ * Code calling those functions can either check got_hw_rng and
+ * avoid the call if it is 0, or just make the call unconditionally
+ * and let the function set got_hw_rng.
+ ***********************************************************************/
+
+/*
+ * How much do we trust the hardware?
+ * 0-32 for entropy credit per 32-bit word
+ *
+ * arbitrary number here for testing
+ * NEEDS TO BE SET MORE CAREFULLY
+ * may need #ifdef for architecture-specific value
+ */
+#define TRUST32 25
+
+/*
+ * check for out-of-bounds values, allowing only values 1-31
+ * a value of 0 would be senseless
+ * 32 is too trusting for any real device
+ */
+#if (TRUST32 < 1) || (TRUST32 > 31)
+#error Out-of-bounds setting for TRUST32
+#endif
+
+/*
+ * fill a 128-bit buffer with hw rng data
+ * only used by routines in this section
+ * other code calls those, not this, since
+ * the higher-level routines do more
+ */
+static inline int hw2buff( u32 *out )
+{
+	int i ;
+	unsigned long *p ;
+
+	/* fill 16 bytes, whatever the word size */
+	for( i = 0, p = (unsigned long *) out ; i < 16/sizeof(unsigned long) ; i++, p++ )
+		if( !arch_get_random_long( p ) )
+			return 0 ;
+	return 1 ;
+}
+
+/* put 128 bits into a buffer, set got_hw_rng */
+static int get_hw_random( u32 *out )
+{
+	int ret ;
+
+	ret = hw2buff( out ) ;
+	got_hw_rng = ret ;
+	return ret ;
+}
+
+/* (approximately) fill the input pool with hw rng data */
+
+static u32 *next_word = pools ;
+
+static int load_input()
+{
+	struct entropy_store *r ;
+	u32 temp[4], *end_buffer ;
+	int i, n, ret, limit, e_count ;
+	unsigned long x, flags ;
+
+	r = &input_pool ;
+
+	/*
+	 * deliberately somewhat imprecise calculation;
+	 * we need not exactly fill the pool
+	 *
+	 * no lock here; we are just reading values
+	 * and an error will not do real harm
+	 */
+	n = (r->poolinfo->poolbits - ENTROPY_BITS(r)) / 128 ;
+
+	/*
+	 * if pool is not full,
+	 * loop to put data into the pool itself;
+	 * this does need the lock
+	 */
+	if( n > 0 ) {
+		limit = n*4 ;
+		end_buffer = r->pool + INPUT_POOL_WORDS ;
+		spin_lock_irqsave( &r->lock, flags ) ;
+		for( i = e_count = 0, ret = 1 ; ret && (i < limit) ; i++, next_word++ ) {
+			if( next_word >= end_buffer )
+				next_word = r->pool ;
+			if( (ret = arch_get_random_long( &x )) ) {
+				*next_word ^= x ;
+				e_count += TRUST32 ;
+			}
+		}
+		credit_entropy_bits( r, e_count ) ;
+		spin_unlock_irqrestore( &r->lock, flags ) ;
+	}
+	/*
+	 * if pool is near full, change its constants;
+	 * no loop, just do 128 bits
+	 */
+	else if( (ret = hw2buff(temp)) ) {
+		buffer2array( r, temp ) ;
+	}
+	got_hw_rng = ret ;
+	return ret ;
+}
+
+/* update all constants with data from hw rng if possible */
+static int load_constants()
+{
+	int i, ret ;
+	u32 *p ;
+	unsigned long x, flags ;
+
+	spin_lock_irqsave( &constants_lock, flags ) ;
+	for( i = 0, p = constants, ret = 1 ; ret && (i < ARRAY_WORDS) ; i++, p++ ) {
+		if( (ret = arch_get_random_long( &x )) )
+			*p ^= x ;
+	}
+	spin_unlock_irqrestore( &constants_lock, flags ) ;
+	got_hw_rng = ret ;
+	return ret ;
+}
--
2.5.0