From: "George Spelvin" <linux@horizon.com>
To: andi@firstfloor.org, tytso@mit.edu
Cc: ahferroin7@gmail.com, jepler@unpythonic.net,
	linux-kernel@vger.kernel.org, linux@horizon.com,
	linux@rasmusvillemoes.dk
Subject: [RFC PATCH 4/4] random: Make non-blocking mixback non-blocking
Date: 16 Oct 2015 01:34:52 -0400	[thread overview]
Message-ID: <20151016053452.13180.qmail@ns.horizon.com> (raw)
In-Reply-To: <20151016052802.12363.qmail@ns.horizon.com>

Andi Kleen has reported that some workloads cause hideous contention
on the /dev/urandom pool lock.  This patch uses some trickery to
allow extract_buf() to avoid taking the pool lock to do mixback in
case of contention.

The fundamental idea is that, if there are multiple concurrent
readers, only one mixback is necessary for antibacktracking.
What a reader has to be sure of is that *some* mixback happens
after the reader is done hashing the pool.

So if one CPU is holding the mixback_lock, the others just spam their
mixback data into a global buffer (races be damned) and set a flag.
The lock holder promises to check that flag and do the mixback
before returning.  This promise applies the entire time the
mixback_lock is held, so the flag must be checked after dropping it.
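As a rough illustration, the handoff described above can be modeled in
userspace with pthreads and C11 atomics.  This is a sketch only: all the
names (model_mixback, mix_pool, etc.) are hypothetical stand-ins, the
interrupt-disabling and explicit memory barriers of the real kernel code
are elided, and pthread_mutex_trylock() returns 0 on *success*, so the
sense of the lock tests is inverted relative to spin_trylock():

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Userspace model of the mixback handoff; all names are stand-ins. */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mixback_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool mixback_flag;
static unsigned char mixback_buf[20];
static int mixes_done;	/* counts actual pool updates */

/* Stand-in for __mix_pool_bytes(); caller must hold pool_lock. */
static void mix_pool(const void *in, size_t n)
{
	(void)in; (void)n;
	mixes_done++;
}

static void model_mixback(const void *in, size_t n)
{
	if (pthread_mutex_trylock(&pool_lock) != 0) {
		if (pthread_mutex_trylock(&mixback_lock) != 0) {
			/* Both busy: buffer the data and set the flag. */
			if (n > sizeof(mixback_buf))
				n = sizeof(mixback_buf);
			memcpy(mixback_buf, in, n);
			atomic_store(&mixback_flag, true);
			/*
			 * If mixback_lock is still held, its holder is
			 * obliged to notice the flag; we may leave.
			 */
			if (pthread_mutex_trylock(&mixback_lock) != 0)
				return;
		}
		/* We now hold mixback_lock: queue for the pool lock. */
		pthread_mutex_lock(&pool_lock);
		pthread_mutex_unlock(&mixback_lock);
	}
	mix_pool(in, n);
	/* The promise: check the flag after dropping mixback_lock. */
	if (atomic_exchange(&mixback_flag, false))
		mix_pool(mixback_buf, sizeof(mixback_buf));
	pthread_mutex_unlock(&pool_lock);
}
```

A thread that finds both locks busy pays only a memcpy and an atomic
store; the current mixback_lock holder absorbs that data on its way out,
so no reader ever blocks on the pool lock just to do mixback.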

In theory this could be applied to the /dev/random pool as well, but
that's not subject to the same kind of abuse as urandom, so leave it
alone for the sake of conservatism.

The first two spin_trylock() operations in nonblocking_mixback() are
not strictly necessary; if either one succeeds, that shortens the
code path, but things would work if they failed unconditionally.

Another legal change would be to move the release of the mixback_lock
after the first __mix_pool_bytes.  But since there's a shortcut
optimization if the pool lock is available, that would require keeping
a flag to indicate if the mixback_lock is held and needs to be released.
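The patch recovers the wrapper structure in extract_buf() with
container_of(), relying on struct entropy_store being embedded as the
first member of struct entropy_store_plus.  A minimal standalone sketch
of that embedding (the two struct names mirror the patch; the members
shown and the to_plus() helper are illustrative only):

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal re-implementation of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct entropy_store {
	int limit;	/* nonzero for the blocking pool */
};

/* Wrapper carrying the extra nonblocking-mixback state. */
struct entropy_store_plus {
	struct entropy_store store;
	bool mixback_flag;
};

static struct entropy_store_plus nonblocking_plus;

/* Given only the inner pool pointer, recover the wrapper. */
static struct entropy_store_plus *to_plus(struct entropy_store *r)
{
	return container_of(r, struct entropy_store_plus, store);
}
```

Because callers throughout random.c pass plain struct entropy_store
pointers, this lets the nonblocking pool grow private state without
touching any other pool or changing the shared function signatures.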

Signed-off-by: George Spelvin <linux@horizon.com>
---
 drivers/char/random.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 100 insertions(+), 4 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index cf34e83d..54df2982 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -419,7 +419,6 @@ static LIST_HEAD(random_ready_list);
  *
  **********************************************************************/
 
-struct entropy_store;
 struct entropy_store {
 	/* read-only data: */
 	const struct poolinfo *poolinfo;
@@ -465,7 +464,17 @@ static struct entropy_store blocking_pool = {
 					push_to_pool),
 };
 
-static struct entropy_store nonblocking_pool = {
+/* This is a variant with some extra fields for nonblocking mixback */
+struct entropy_store_plus {
+	struct entropy_store store;
+	bool mixback_flag;
+	__u32 mixback[SHA_DIGEST_WORDS];
+	spinlock_t mixback_lock;
+};
+
+#define nonblocking_pool nonblocking_plus.store
+static struct entropy_store_plus nonblocking_plus = {
+    .store = {
 	.poolinfo = &poolinfo_table[1],
 	.name = "nonblocking",
 	.pull = &input_pool,
@@ -473,6 +482,9 @@ static struct entropy_store nonblocking_pool = {
 	.pool = nonblocking_pool_data,
 	.push_work = __WORK_INITIALIZER(nonblocking_pool.push_work,
 					push_to_pool),
+    },
+    .mixback_flag = false,
+    .mixback_lock = __SPIN_LOCK_UNLOCKED(nonblocking_plus.mixback_lock),
 };
 
 static __u32 const twist_table[8] = {
@@ -554,6 +566,83 @@ static void mix_pool_bytes(struct entropy_store *r, const void *in,
 	spin_unlock_irqrestore(&r->lock, flags);
 }
 
+/*
+ * Because the /dev/urandom pool is by far the busiest, large machines
+ * with applications that hit the pool hard can fall off the locking
+ * cliff on the pool lock.  (They *should* use a private CPRNG reseeded
+ * infrequently, but there are lazy programmers in the world.)
+ *
+ * This function avoids contention on the pool's lock, by taking
+ * advantage of the fact that mixback from a given pool state is logically
+ * idempotent: if there are multiple threads wanting to mix back, only
+ * one of them has to succeed.  A non-deterministic mixture of the
+ * competing mixback data is also okay, so we use a common temporary
+ * buffer written without locking.
+ *
+ * It is also not necessary that a thread confirm that mixback has
+ * happened before returning to user space; a backtracking window a few
+ * microseconds long is not a concern, as long as it is strongly bounded.
+ * In this case, because the mixback lock is taken without disabling
+ * interrupts, the window is up to the interrupt latency long, but that's
+ * considered acceptable.
+ *
+ * What is important is that, after a mixback has started, any later
+ * pool readers will schedule an additional mixback.  That's what all
+ * the dancing around with mixback_flag and mixback_lock is about.
+ */
+static void nonblocking_mixback(struct entropy_store_plus *r,
+				const void *in, int nbytes)
+{
+	unsigned long flags;
+
+	if (!spin_trylock_irqsave(&r->store.lock, flags)) {
+		/* Someone's currently writing to the pool.  Anyone waiting? */
+		if (!spin_trylock(&r->mixback_lock)) {
+			/* Leave it for the lock holder to take care of. */
+			nbytes = min_t(int, nbytes, sizeof(r->mixback));
+			memcpy(r->mixback, in, nbytes);
+			smp_store_release(&r->mixback_flag, true);
+			smp_wmb();
+			/*
+			 * If the lock is still held, we are guaranteed that
+			 * the lock holder will promptly do the mixback for us.
+			 */
+			if (!spin_trylock(&r->mixback_lock))
+				return;
+		}
+		/*
+		 * Wait in line to do mixback.  This has to be a separate
+		 * lock because we don't want to oblige interrupt entropy
+		 * sources which also take the pool lock to do mixback.
+		 */
+		spin_lock_irqsave(&r->store.lock, flags);
+		spin_unlock(&r->mixback_lock);
+		smp_mb();	/* Between unlock and read of r->mixback flag */
+	}
+
+	__mix_pool_bytes(&r->store, in, nbytes);
+
+	/*
+	 * Do any buffered mixback.  Holding the mixback_lock obliges
+	 * us to check after dropping it, but we only have to check once.
+	 * (We aren't required to check at all if we never took the
+	 * mixback_lock, but it does no harm to.)
+	 */
+	if (READ_ONCE_CTRL(r->mixback_flag)) {
+		__u32 mixback[ARRAY_SIZE(r->mixback)];
+
+		memcpy(mixback, r->mixback, sizeof mixback);
+		memset(r->mixback, 0, sizeof r->mixback);
+		smp_store_release(&r->mixback_flag, false);
+
+		smp_wmb();	/* Ensure release is seen before mixing */
+		__mix_pool_bytes(&r->store, mixback, sizeof mixback);
+		memzero_explicit(mixback, sizeof mixback);
+	}
+	spin_unlock_irqrestore(&r->store.lock, flags);
+}
+
+
 struct fast_pool {
 	__u32		pool[4];
 	unsigned long	last;
@@ -1173,8 +1262,15 @@ static void extract_buf(struct entropy_store *r, __u8 out[EXTRACT_SIZE],
 	 *
 	 * FIXME: update security analysis in light of reduced mixback.
 	 */
-	if (mixback)
-		mix_pool_bytes(r, hash.w, sizeof(hash.w));
+	if (mixback) {
+		if (r->limit) {
+			mix_pool_bytes(r, hash.w, sizeof(hash.w));
+		} else {
+			struct entropy_store_plus *p =
+			    container_of(r, struct entropy_store_plus, store);
+			nonblocking_mixback(p, hash.w, sizeof(hash.w));
+		}
+	}
 
 	memzero_explicit(workspace, sizeof(workspace));
 
-- 
2.6.1


Thread overview: 29+ messages
2015-10-10 18:45 Updated scalable urandom patchkit George Spelvin
2015-10-11  2:31 ` Theodore Ts'o
2015-10-11  2:53   ` Theodore Ts'o
2015-10-11  4:35   ` George Spelvin
2015-10-11 22:25     ` Theodore Ts'o
2015-10-12  0:16       ` George Spelvin
2015-10-12  4:05         ` Theodore Ts'o
2015-10-12  7:49           ` George Spelvin
2015-10-12 13:54             ` Theodore Ts'o
2015-10-12 20:30               ` George Spelvin
2015-10-12 20:34                 ` George Spelvin
2015-10-13  2:46                 ` Theodore Ts'o
2015-10-13  3:50                   ` Raymond Jennings
2015-10-13  7:50                     ` George Spelvin
2015-10-13  6:24                   ` George Spelvin
2015-10-13 16:20                   ` Andi Kleen
2015-10-13 21:10                     ` George Spelvin
2015-10-14  2:15                       ` Andi Kleen
2015-10-16  5:28                         ` [RFC PATCH 0/4] Alternate scalable urandom patchset George Spelvin
2015-10-16  5:29                           ` [RFC PATCH 1/4] random: Reduce stack usage in _xfer_secondary_pool George Spelvin
2015-10-16  5:30                           ` [RFC PATCH 2/4] random: Remove two unused arguments from extract_entropy() George Spelvin
2015-10-16  5:33                           ` [RFC PATCH 3/4] random: Only do mixback once per read George Spelvin
2015-10-16  6:12                             ` kbuild test robot
2015-10-16  8:11                               ` George Spelvin
2015-10-16  6:23                             ` kbuild test robot
2015-10-16  5:34                           ` George Spelvin [this message]
2015-10-21  8:27                       ` Updated scalable urandom patchkit George Spelvin
2015-10-21 11:47                         ` Andi Kleen
2015-10-21 18:10                           ` George Spelvin
