* [PATCH] kfence: buffer random bools in bitmask
@ 2022-10-26 20:40 Jason A. Donenfeld
  2022-10-26 20:45 ` Jason A. Donenfeld
  2022-10-27  0:04 ` Marco Elver
  0 siblings, 2 replies; 5+ messages in thread
From: Jason A. Donenfeld @ 2022-10-26 20:40 UTC (permalink / raw)
  To: kasan-dev, elver, patches; +Cc: Jason A. Donenfeld, Sebastian Andrzej Siewior

Recently kfence got a 4x speed up in calls to the RNG, due to using
get_random_u8() internally instead of get_random_u32() for its random
boolean values. We can extend that speed up by another 8x, to 32x total,
by buffering a long at a time and reading bits from it.

I'd looked into introducing a get_random_bool(), along with the
complexities required for such a function to work in the general case.
But kfence is the only high-speed user of random booleans in a hot
path, so we're better off open-coding this to take advantage of
kfence's particularities.

In particular, we take advantage of the fact that kfence_guarded_alloc()
already disables interrupts for its raw spinlocks, so that we can keep
track of a per-cpu buffered boolean bitmask, without needing to add more
interrupt disabling.

This is slightly complicated by PREEMPT_RT, where we actually need to
take a local_lock instead. But the resulting code in both cases compiles
down to something very compact, and is basically zero cost.
Specifically, on !PREEMPT_RT, this amounts to:

    local_irq_save(flags);
    random boolean stuff;
    raw_spin_lock(&other_thing);
    do the existing stuff;
    raw_spin_unlock_irqrestore(&other_thing, flags);

By using a local_lock in the way this patch does, we now also get this
code on PREEMPT_RT:

    spin_lock(this_cpu_ptr(&local_lock));
    random boolean stuff;
    spin_unlock(this_cpu_ptr(&local_lock));
    raw_spin_lock_irqsave(&other_thing, flags);
    do the existing stuff;
    raw_spin_unlock_irqrestore(&other_thing, flags);

This is also optimal for RT systems. So all in all, this is pretty
good, though it does require some compile-time conditionals to
accomplish.

Cc: Marco Elver <elver@google.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 mm/kfence/core.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 6cbd93f2007b..c212ae0cecba 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -356,21 +356,47 @@ static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t g
 				  unsigned long *stack_entries, size_t num_stack_entries,
 				  u32 alloc_stack_hash)
 {
+	struct random_bools {
+		unsigned long bits;
+		unsigned int len;
+		local_lock_t lock;
+	};
+	static DEFINE_PER_CPU(struct random_bools, pcpu_bools) = {
+		.lock = INIT_LOCAL_LOCK(pcpu_bools.lock)
+	};
+	struct random_bools *bools;
 	struct kfence_metadata *meta = NULL;
 	unsigned long flags;
 	struct slab *slab;
 	void *addr;
-	const bool random_right_allocate = get_random_u32_below(2);
+	bool random_right_allocate;
 	const bool random_fault = CONFIG_KFENCE_STRESS_TEST_FAULTS &&
 				  !get_random_u32_below(CONFIG_KFENCE_STRESS_TEST_FAULTS);
 
+	local_lock_irqsave(&pcpu_bools.lock, flags);
+	bools = raw_cpu_ptr(&pcpu_bools);
+	if (unlikely(!bools->len)) {
+		bools->bits = get_random_long();
+		bools->len = BITS_PER_LONG;
+	}
+	random_right_allocate = bools->bits & 1;
+	bools->bits >>= 1;
+	bools->len--;
+
 	/* Try to obtain a free object. */
-	raw_spin_lock_irqsave(&kfence_freelist_lock, flags);
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		raw_spin_lock_irqsave(&kfence_freelist_lock, flags);
+	else
+		raw_spin_lock(&kfence_freelist_lock);
 	if (!list_empty(&kfence_freelist)) {
 		meta = list_entry(kfence_freelist.next, struct kfence_metadata, list);
 		list_del_init(&meta->list);
 	}
-	raw_spin_unlock_irqrestore(&kfence_freelist_lock, flags);
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		raw_spin_unlock_irqrestore(&kfence_freelist_lock, flags);
+	else
+		raw_spin_unlock(&kfence_freelist_lock);
+	local_unlock_irqrestore(&pcpu_bools.lock, flags);
 	if (!meta) {
 		atomic_long_inc(&counters[KFENCE_COUNTER_SKIP_CAPACITY]);
 		return NULL;
-- 
2.38.1



Thread overview: 5+ messages
2022-10-26 20:40 [PATCH] kfence: buffer random bools in bitmask Jason A. Donenfeld
2022-10-26 20:45 ` Jason A. Donenfeld
2022-10-27  0:04 ` Marco Elver
2022-10-27  0:26   ` Jason A. Donenfeld
2022-10-27  1:40     ` Marco Elver
