* [PATCH v3 0/2] random: PREEMPT_RT fixes
@ 2022-02-07 15:39 Jason A. Donenfeld
  2022-02-07 15:39 ` [PATCH v3 1/2] random: remove batched entropy locking Jason A. Donenfeld
  2022-02-07 15:39 ` [PATCH v3 2/2] random: defer fast pool mixing to worker Jason A. Donenfeld
  0 siblings, 2 replies; 8+ messages in thread
From: Jason A. Donenfeld @ 2022-02-07 15:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jason A. Donenfeld, Sebastian Andrzej Siewior, Thomas Gleixner,
	Peter Zijlstra, Theodore Ts'o, Sultan Alsawaf,
	Jonathan Neuschäfer

Here are the two patches we've been looking at thus far in separate
threads, now brought together as one patchset. This doesn't fix _all_ of
the PREEMPT_RT issues, but it does move firmly in the right direction of
less locking in the places where it matters.

Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Sultan Alsawaf <sultan@kerneltoast.com>
Cc: Jonathan Neuschäfer <j.neuschaefer@gmx.net>

Jason A. Donenfeld (2):
  random: remove batched entropy locking
  random: defer fast pool mixing to worker

 drivers/char/random.c         | 101 +++++++++++++++++++---------------
 include/trace/events/random.h |   6 --
 2 files changed, 56 insertions(+), 51 deletions(-)

-- 
2.35.0



* [PATCH v3 1/2] random: remove batched entropy locking
  2022-02-07 15:39 [PATCH v3 0/2] random: PREEMPT_RT fixes Jason A. Donenfeld
@ 2022-02-07 15:39 ` Jason A. Donenfeld
  2022-02-08 23:44   ` Eric Biggers
  2022-02-07 15:39 ` [PATCH v3 2/2] random: defer fast pool mixing to worker Jason A. Donenfeld
  1 sibling, 1 reply; 8+ messages in thread
From: Jason A. Donenfeld @ 2022-02-07 15:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jason A. Donenfeld, Andy Lutomirski, Jonathan Neuschäfer,
	Sebastian Andrzej Siewior, Sultan Alsawaf

Rather than use spinlocks to protect batched entropy, we can instead
disable interrupts locally, since we're dealing with per-cpu data, and
manage resets with a basic generation counter. At the same time, we
can't quite do this on PREEMPT_RT, where we still want spinlocks-as-
mutexes semantics. So we use a local_lock_t, which provides the right
behavior for each. Because this is a per-cpu lock, that generation
counter is still doing the necessary CPU-to-CPU communication.

This should improve performance a bit. It will also fix the linked splat
that Jonathan received with PROVE_RAW_LOCK_NESTING=y.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Reported-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Tested-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Link: https://lore.kernel.org/lkml/YfMa0QgsjCVdRAvJ@latitude/
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 drivers/char/random.c | 51 ++++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 25 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 8ae2bf5ca606..395e76c009d6 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1731,13 +1731,16 @@ static int __init random_sysctls_init(void)
 device_initcall(random_sysctls_init);
 #endif	/* CONFIG_SYSCTL */
 
+static atomic_t batch_generation = ATOMIC_INIT(0);
+
 struct batched_entropy {
 	union {
 		u64 entropy_u64[CHACHA_BLOCK_SIZE / sizeof(u64)];
 		u32 entropy_u32[CHACHA_BLOCK_SIZE / sizeof(u32)];
 	};
+	local_lock_t lock;
 	unsigned int position;
-	spinlock_t batch_lock;
+	int generation;
 };
 
 /*
@@ -1749,7 +1752,7 @@ struct batched_entropy {
  * point prior.
  */
 static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_u64) = {
-	.batch_lock = __SPIN_LOCK_UNLOCKED(batched_entropy_u64.lock),
+	.lock = INIT_LOCAL_LOCK(batched_entropy_u64.lock)
 };
 
 u64 get_random_u64(void)
@@ -1758,41 +1761,54 @@ u64 get_random_u64(void)
 	unsigned long flags;
 	struct batched_entropy *batch;
 	static void *previous;
+	int next_gen;
 
 	warn_unseeded_randomness(&previous);
 
+	local_lock_irqsave(&batched_entropy_u64.lock, flags);
 	batch = raw_cpu_ptr(&batched_entropy_u64);
-	spin_lock_irqsave(&batch->batch_lock, flags);
-	if (batch->position % ARRAY_SIZE(batch->entropy_u64) == 0) {
+
+	next_gen = atomic_read(&batch_generation);
+	if (batch->position % ARRAY_SIZE(batch->entropy_u64) == 0 ||
+	    next_gen != batch->generation) {
 		extract_crng((u8 *)batch->entropy_u64);
 		batch->position = 0;
+		batch->generation = next_gen;
 	}
+
 	ret = batch->entropy_u64[batch->position++];
-	spin_unlock_irqrestore(&batch->batch_lock, flags);
+	local_unlock_irqrestore(&batched_entropy_u64.lock, flags);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_u64);
 
 static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_u32) = {
-	.batch_lock = __SPIN_LOCK_UNLOCKED(batched_entropy_u32.lock),
+	.lock = INIT_LOCAL_LOCK(batched_entropy_u32.lock)
 };
+
 u32 get_random_u32(void)
 {
 	u32 ret;
 	unsigned long flags;
 	struct batched_entropy *batch;
 	static void *previous;
+	int next_gen;
 
 	warn_unseeded_randomness(&previous);
 
+	local_lock_irqsave(&batched_entropy_u32.lock, flags);
 	batch = raw_cpu_ptr(&batched_entropy_u32);
-	spin_lock_irqsave(&batch->batch_lock, flags);
-	if (batch->position % ARRAY_SIZE(batch->entropy_u32) == 0) {
+
+	next_gen = atomic_read(&batch_generation);
+	if (batch->position % ARRAY_SIZE(batch->entropy_u32) == 0 ||
+	    next_gen != batch->generation) {
 		extract_crng((u8 *)batch->entropy_u32);
 		batch->position = 0;
+		batch->generation = next_gen;
 	}
+
 	ret = batch->entropy_u32[batch->position++];
-	spin_unlock_irqrestore(&batch->batch_lock, flags);
+	local_unlock_irqrestore(&batched_entropy_u32.lock, flags);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_u32);
@@ -1803,22 +1819,7 @@ EXPORT_SYMBOL(get_random_u32);
  * next usage. */
 static void invalidate_batched_entropy(void)
 {
-	int cpu;
-	unsigned long flags;
-
-	for_each_possible_cpu(cpu) {
-		struct batched_entropy *batched_entropy;
-
-		batched_entropy = per_cpu_ptr(&batched_entropy_u32, cpu);
-		spin_lock_irqsave(&batched_entropy->batch_lock, flags);
-		batched_entropy->position = 0;
-		spin_unlock(&batched_entropy->batch_lock);
-
-		batched_entropy = per_cpu_ptr(&batched_entropy_u64, cpu);
-		spin_lock(&batched_entropy->batch_lock);
-		batched_entropy->position = 0;
-		spin_unlock_irqrestore(&batched_entropy->batch_lock, flags);
-	}
+	atomic_inc(&batch_generation);
 }
 
 /**
-- 
2.35.0



* [PATCH v3 2/2] random: defer fast pool mixing to worker
  2022-02-07 15:39 [PATCH v3 0/2] random: PREEMPT_RT fixes Jason A. Donenfeld
  2022-02-07 15:39 ` [PATCH v3 1/2] random: remove batched entropy locking Jason A. Donenfeld
@ 2022-02-07 15:39 ` Jason A. Donenfeld
  2022-02-09  0:12   ` Eric Biggers
  1 sibling, 1 reply; 8+ messages in thread
From: Jason A. Donenfeld @ 2022-02-07 15:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jason A. Donenfeld, Sebastian Andrzej Siewior, Thomas Gleixner,
	Peter Zijlstra, Theodore Ts'o, Sultan Alsawaf,
	Jonathan Neuschäfer

On PREEMPT_RT, it's problematic to take spinlocks from hard irq
handlers. We can fix this by deferring to a work queue the dumping of
the fast pool into the input pool.

We accomplish this with some careful rules on fast_pool->count:

  - When it's incremented to >= 64, we schedule the work.
  - If the top bit is set, we never schedule the work, even if >= 64.
  - The worker is responsible for setting it back to 0 when it's done.

In the worst case, an irq handler is mixing a new irq into the pool at
the same time as the worker is dumping it into the input pool. In that
case, we only ever set the count back to 0 _after_ we're done, so
subsequent cycles still require a full 64 interrupts before the next
dump. In other words, this race can only ever mix in a little more
information than normal, never less, and we never credit any extra
entropy for that partial contribution.

Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Sultan Alsawaf <sultan@kerneltoast.com>
Cc: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 drivers/char/random.c         | 50 +++++++++++++++++++++--------------
 include/trace/events/random.h |  6 -----
 2 files changed, 30 insertions(+), 26 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 395e76c009d6..1bfeb4393e39 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -377,12 +377,6 @@ static void _mix_pool_bytes(const void *in, int nbytes)
 	blake2s_update(&input_pool.hash, in, nbytes);
 }
 
-static void __mix_pool_bytes(const void *in, int nbytes)
-{
-	trace_mix_pool_bytes_nolock(nbytes, _RET_IP_);
-	_mix_pool_bytes(in, nbytes);
-}
-
 static void mix_pool_bytes(const void *in, int nbytes)
 {
 	unsigned long flags;
@@ -394,11 +388,13 @@ static void mix_pool_bytes(const void *in, int nbytes)
 }
 
 struct fast_pool {
-	u32 pool[4];
+	struct work_struct mix;
 	unsigned long last;
+	u32 pool[4];
+	unsigned int count;
 	u16 reg_idx;
-	u8 count;
 };
+#define FAST_POOL_MIX_INFLIGHT (1U << 31)
 
 /*
  * This is a fast mixing routine used by the interrupt randomness
@@ -428,7 +424,6 @@ static void fast_mix(struct fast_pool *f)
 
 	f->pool[0] = a;  f->pool[1] = b;
 	f->pool[2] = c;  f->pool[3] = d;
-	f->count++;
 }
 
 static void process_random_ready_list(void)
@@ -977,12 +972,30 @@ static u32 get_reg(struct fast_pool *f, struct pt_regs *regs)
 	return *ptr;
 }
 
+static void mix_interrupt_randomness(struct work_struct *work)
+{
+	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
+
+	/*
+	 * Since this is the result of a trip through the scheduler, xor in
+	 * a cycle counter. It can't hurt, and might help.
+	 */
+	fast_pool->pool[3] ^= random_get_entropy();
+
+	mix_pool_bytes(&fast_pool->pool, sizeof(fast_pool->pool));
+	/* We take care to zero out the count only after we're done reading the pool. */
+	WRITE_ONCE(fast_pool->count, 0);
+	fast_pool->last = jiffies;
+	credit_entropy_bits(1);
+}
+
 void add_interrupt_randomness(int irq)
 {
 	struct fast_pool *fast_pool = this_cpu_ptr(&irq_randomness);
 	struct pt_regs *regs = get_irq_regs();
 	unsigned long now = jiffies;
 	cycles_t cycles = random_get_entropy();
+	unsigned int new_count;
 	u32 c_high, j_high;
 	u64 ip;
 
@@ -999,9 +1012,10 @@ void add_interrupt_randomness(int irq)
 
 	fast_mix(fast_pool);
 	add_interrupt_bench(cycles);
+	new_count = ++fast_pool->count;
 
 	if (unlikely(crng_init == 0)) {
-		if ((fast_pool->count >= 64) &&
+		if (new_count >= 64 &&
 		    crng_fast_load((u8 *)fast_pool->pool, sizeof(fast_pool->pool)) > 0) {
 			fast_pool->count = 0;
 			fast_pool->last = now;
@@ -1009,20 +1023,16 @@ void add_interrupt_randomness(int irq)
 		return;
 	}
 
-	if ((fast_pool->count < 64) && !time_after(now, fast_pool->last + HZ))
+	if (new_count & FAST_POOL_MIX_INFLIGHT)
 		return;
 
-	if (!spin_trylock(&input_pool.lock))
+	if (new_count < 64 && !time_after(now, fast_pool->last + HZ))
 		return;
 
-	fast_pool->last = now;
-	__mix_pool_bytes(&fast_pool->pool, sizeof(fast_pool->pool));
-	spin_unlock(&input_pool.lock);
-
-	fast_pool->count = 0;
-
-	/* award one bit for the contents of the fast pool */
-	credit_entropy_bits(1);
+	if (unlikely(!fast_pool->mix.func))
+		INIT_WORK(&fast_pool->mix, mix_interrupt_randomness);
+	fast_pool->count |= FAST_POOL_MIX_INFLIGHT;
+	queue_work_on(raw_smp_processor_id(), system_highpri_wq, &fast_pool->mix);
 }
 EXPORT_SYMBOL_GPL(add_interrupt_randomness);
 
diff --git a/include/trace/events/random.h b/include/trace/events/random.h
index ad149aeaf42c..833f42afc70f 100644
--- a/include/trace/events/random.h
+++ b/include/trace/events/random.h
@@ -52,12 +52,6 @@ DEFINE_EVENT(random__mix_pool_bytes, mix_pool_bytes,
 	TP_ARGS(bytes, IP)
 );
 
-DEFINE_EVENT(random__mix_pool_bytes, mix_pool_bytes_nolock,
-	TP_PROTO(int bytes, unsigned long IP),
-
-	TP_ARGS(bytes, IP)
-);
-
 TRACE_EVENT(credit_entropy_bits,
 	TP_PROTO(int bits, int entropy_count, unsigned long IP),
 
-- 
2.35.0



* Re: [PATCH v3 1/2] random: remove batched entropy locking
  2022-02-07 15:39 ` [PATCH v3 1/2] random: remove batched entropy locking Jason A. Donenfeld
@ 2022-02-08 23:44   ` Eric Biggers
  2022-02-09  0:24     ` Jason A. Donenfeld
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Biggers @ 2022-02-08 23:44 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: linux-kernel, Andy Lutomirski, Jonathan Neuschäfer,
	Sebastian Andrzej Siewior, Sultan Alsawaf

On Mon, Feb 07, 2022 at 04:39:13PM +0100, Jason A. Donenfeld wrote:
> Rather than use spinlocks to protect batched entropy, we can instead
> disable interrupts locally, since we're dealing with per-cpu data, and
> manage resets with a basic generation counter. At the same time, we
> can't quite do this on PREEMPT_RT, where we still want spinlocks-as-
> mutexes semantics. So we use a local_lock_t, which provides the right
> behavior for each. Because this is a per-cpu lock, that generation
> counter is still doing the necessary CPU-to-CPU communication.
> 
> This should improve performance a bit. It will also fix the linked splat
> that Jonathan received with PROVE_RAW_LOCK_NESTING=y.
> 
> Suggested-by: Andy Lutomirski <luto@kernel.org>
> Reported-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
> Tested-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
> Link: https://lore.kernel.org/lkml/YfMa0QgsjCVdRAvJ@latitude/
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Cc: Sultan Alsawaf <sultan@kerneltoast.com>
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
>  drivers/char/random.c | 51 ++++++++++++++++++++++---------------------
>  1 file changed, 26 insertions(+), 25 deletions(-)

Looks good.  The comment above invalidate_batched_entropy() needs to be updated,
though.

Also, please Cc patches for random.c to linux-crypto.

- Eric


* Re: [PATCH v3 2/2] random: defer fast pool mixing to worker
  2022-02-07 15:39 ` [PATCH v3 2/2] random: defer fast pool mixing to worker Jason A. Donenfeld
@ 2022-02-09  0:12   ` Eric Biggers
  2022-02-09  0:36     ` Jason A. Donenfeld
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Biggers @ 2022-02-09  0:12 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: linux-kernel, Sebastian Andrzej Siewior, Thomas Gleixner,
	Peter Zijlstra, Theodore Ts'o, Sultan Alsawaf,
	Jonathan Neuschäfer

On Mon, Feb 07, 2022 at 04:39:14PM +0100, Jason A. Donenfeld wrote:
> +static void mix_interrupt_randomness(struct work_struct *work)
> +{
> +	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
> +
> +	/*
> +	 * Since this is the result of a trip through the scheduler, xor in
> +	 * a cycle counter. It can't hurt, and might help.
> +	 */
> +	fast_pool->pool[3] ^= random_get_entropy();
> +
> +	mix_pool_bytes(&fast_pool->pool, sizeof(fast_pool->pool));
> +	/* We take care to zero out the count only after we're done reading the pool. */
> +	WRITE_ONCE(fast_pool->count, 0);
> +	fast_pool->last = jiffies;
> +	credit_entropy_bits(1);
> +}

So, add_interrupt_randomness() can execute on the same CPU re-entrantly at any
time this is executing?  That could result in some pretty weird behavior, where
the pool gets changed half-way through being used, so what is used is neither
the old nor the new state of the pool.  Is there a reason why this is okay?

>  void add_interrupt_randomness(int irq)
>  {
>  	struct fast_pool *fast_pool = this_cpu_ptr(&irq_randomness);
>  	struct pt_regs *regs = get_irq_regs();
>  	unsigned long now = jiffies;
>  	cycles_t cycles = random_get_entropy();
> +	unsigned int new_count;
>  	u32 c_high, j_high;
>  	u64 ip;
>  
> @@ -999,9 +1012,10 @@ void add_interrupt_randomness(int irq)
>  
>  	fast_mix(fast_pool);
>  	add_interrupt_bench(cycles);
> +	new_count = ++fast_pool->count;
>  
>  	if (unlikely(crng_init == 0)) {
> -		if ((fast_pool->count >= 64) &&
> +		if (new_count >= 64 &&
>  		    crng_fast_load((u8 *)fast_pool->pool, sizeof(fast_pool->pool)) > 0) {
>  			fast_pool->count = 0;
>  			fast_pool->last = now;
> @@ -1009,20 +1023,16 @@ void add_interrupt_randomness(int irq)
>  		return;
>  	}
>  
> -	if ((fast_pool->count < 64) && !time_after(now, fast_pool->last + HZ))
> +	if (new_count & FAST_POOL_MIX_INFLIGHT)
>  		return;
>  
> -	if (!spin_trylock(&input_pool.lock))
> +	if (new_count < 64 && !time_after(now, fast_pool->last + HZ))
>  		return;
>  
> -	fast_pool->last = now;
> -	__mix_pool_bytes(&fast_pool->pool, sizeof(fast_pool->pool));
> -	spin_unlock(&input_pool.lock);
> -
> -	fast_pool->count = 0;
> -
> -	/* award one bit for the contents of the fast pool */
> -	credit_entropy_bits(1);
> +	if (unlikely(!fast_pool->mix.func))
> +		INIT_WORK(&fast_pool->mix, mix_interrupt_randomness);
> +	fast_pool->count |= FAST_POOL_MIX_INFLIGHT;
> +	queue_work_on(raw_smp_processor_id(), system_highpri_wq, &fast_pool->mix);
>  }

Is there a reason why the FAST_POOL_MIX_INFLIGHT is part of 'count' instead of a
separate boolean?

Also, a high level question.  Now that the call to mix_pool_bytes() would no
longer occur in hard IRQ context, how much reason is there to minimize the
amount of data passed to it?  Would it be feasible to just concatenate the
interrupt data into an array, and pass the whole array to mix_pool_bytes() --
eliminating the homebrew ARX thing entirely?

- Eric


* Re: [PATCH v3 1/2] random: remove batched entropy locking
  2022-02-08 23:44   ` Eric Biggers
@ 2022-02-09  0:24     ` Jason A. Donenfeld
  0 siblings, 0 replies; 8+ messages in thread
From: Jason A. Donenfeld @ 2022-02-09  0:24 UTC (permalink / raw)
  To: Eric Biggers
  Cc: LKML, Andy Lutomirski, Jonathan Neuschäfer,
	Sebastian Andrzej Siewior, Sultan Alsawaf

On Wed, Feb 9, 2022 at 12:44 AM Eric Biggers <ebiggers@kernel.org> wrote:
> Looks good.  The comment above invalidate_batched_entropy() needs to be updated,
> though.

Fixed, thanks.


* Re: [PATCH v3 2/2] random: defer fast pool mixing to worker
  2022-02-09  0:12   ` Eric Biggers
@ 2022-02-09  0:36     ` Jason A. Donenfeld
  2022-02-09 12:49       ` Jason A. Donenfeld
  0 siblings, 1 reply; 8+ messages in thread
From: Jason A. Donenfeld @ 2022-02-09  0:36 UTC (permalink / raw)
  To: Eric Biggers
  Cc: LKML, Sebastian Andrzej Siewior, Thomas Gleixner, Peter Zijlstra,
	Theodore Ts'o, Sultan Alsawaf, Jonathan Neuschäfer

Hi Eric,

On Wed, Feb 9, 2022 at 1:12 AM Eric Biggers <ebiggers@kernel.org> wrote:
> So, add_interrupt_randomness() can execute on the same CPU re-entrantly at any
> time this is executing?  That could result in some pretty weird behavior, where
> the pool gets changed half-way through being used, so what is used is neither
> the old nor the new state of the pool.  Is there a reason why this is okay?

Yes, right, that's the "idea" of this patch, if you could call it such
a thing. The argument is that we set fast_pool->count to zero *after*
mixing in the existing bytes + whatever partial bytes might be mixed
in on an interrupt halfway through the execution of mix_pool_bytes.
Since we set the count to zero after, it means we do not give any
credit to those partial bytes for the following set of 64 interrupts.
What winds up being mixed in will contain at least as much as what was
there had it not been interrupted. And what gets mixed in the next
time will only have more mixed in than it otherwise would have, not
less.
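
To put the ordering in front of us, here is a simplified excerpt of the
two paths from the patch (the crng_init==0 handling and the scheduling
of the worker are elided):

/* hard irq, this CPU */
fast_mix(fast_pool);
new_count = ++fast_pool->count;

/* worker, same CPU, possibly interrupted by the above */
mix_pool_bytes(&fast_pool->pool, sizeof(fast_pool->pool));
WRITE_ONCE(fast_pool->count, 0);	/* strictly after the read */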

> Is there a reason why the FAST_POOL_MIX_INFLIGHT is part of 'count' instead of a
> separate boolean?

So that we can clear it with a single WRITE_ONCE, and to save space in
the per-cpu fast_pool.

> Also, a high level question.  Now that the call to mix_pool_bytes() would no
> longer occur in hard IRQ context, how much reason is there to minimize the
> amount of data passed to it?  Would it be feasible to just concatenate the
> interrupt data into an array, and pass the whole array to mix_pool_bytes() --
> eliminating the homebrew ARX thing entirely?

Indeed I'm working on replacing fast_mix() with something we can
actually reason about. I thought about a big array but I'm not quite
convinced that the memory overhead of this would be worth it. Right
now, each interrupt generates 16 bytes of data, and we ingest that
data after 64ish interrupts -- but sometimes more (and sometimes
less). So that's a whole kilobyte in the best case. That's not /tons/
but it's not nothing either. So while the big array idea is one
possibility, it's not the only one I'm evaluating. Hopefully in the
coming weeks I'll have some ideas ready to discuss on that.

Jason


* Re: [PATCH v3 2/2] random: defer fast pool mixing to worker
  2022-02-09  0:36     ` Jason A. Donenfeld
@ 2022-02-09 12:49       ` Jason A. Donenfeld
  0 siblings, 0 replies; 8+ messages in thread
From: Jason A. Donenfeld @ 2022-02-09 12:49 UTC (permalink / raw)
  To: Eric Biggers
  Cc: LKML, Sebastian Andrzej Siewior, Thomas Gleixner, Peter Zijlstra,
	Theodore Ts'o, Sultan Alsawaf, Jonathan Neuschäfer

On Wed, Feb 9, 2022 at 1:36 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> On Wed, Feb 9, 2022 at 1:12 AM Eric Biggers <ebiggers@kernel.org> wrote:
> > So, add_interrupt_randomness() can execute on the same CPU re-entrantly at any
> > time this is executing?  That could result in some pretty weird behavior, where
> > the pool gets changed half-way through being used, so what is used is neither
> > the old nor the new state of the pool.  Is there a reason why this is okay?
>
> Yes, right, that's the "idea" of this patch, if you could call it such
> a thing. The argument is that we set fast_pool->count to zero *after*
> mixing in the existing bytes + whatever partial bytes might be mixed
> in on an interrupt halfway through the execution of mix_pool_bytes.
> Since we set the count to zero after, it means we do not give any
> credit to those partial bytes for the following set of 64 interrupts.
> What winds up being mixed in will contain at least as much as what was
> there had it not been interrupted. And what gets mixed in the next
> time will only have more mixed in than it otherwise would have, not
> less.

I can actually make it even better by memcpy()ing the fast pool first.
This way any races only affect the fast_mix side -- harmless as
described above -- without affecting blake2s. The latter was _already_
buffering it, but memcpy()ing makes it more clear and doesn't rely on
that behavior. It also means that we get to set count to zero a bit
sooner.
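
Roughly, and untested -- just to illustrate the shape of what I mean,
this is the same function as in the patch with the copy pulled out
front:

static void mix_interrupt_randomness(struct work_struct *work)
{
	struct fast_pool *fast_pool = container_of(work, struct fast_pool, mix);
	u32 pool[4];

	/*
	 * Copy the pool to the stack so that the mixer only ever sees a
	 * consistent snapshot, and zero out the count right away so that
	 * a racing interrupt simply starts accumulating a fresh batch.
	 */
	memcpy(pool, fast_pool->pool, sizeof(pool));
	WRITE_ONCE(fast_pool->count, 0);

	/* Result of a trip through the scheduler: can't hurt, might help. */
	pool[3] ^= random_get_entropy();

	mix_pool_bytes(pool, sizeof(pool));
	fast_pool->last = jiffies;
	credit_entropy_bits(1);
}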


Thread overview: 8+ messages
2022-02-07 15:39 [PATCH v3 0/2] random: PREEMPT_RT fixes Jason A. Donenfeld
2022-02-07 15:39 ` [PATCH v3 1/2] random: remove batched entropy locking Jason A. Donenfeld
2022-02-08 23:44   ` Eric Biggers
2022-02-09  0:24     ` Jason A. Donenfeld
2022-02-07 15:39 ` [PATCH v3 2/2] random: defer fast pool mixing to worker Jason A. Donenfeld
2022-02-09  0:12   ` Eric Biggers
2022-02-09  0:36     ` Jason A. Donenfeld
2022-02-09 12:49       ` Jason A. Donenfeld
