* [PATCH 0/4] SLUB: cleanup hook processing
@ 2023-12-04 19:34 Vlastimil Babka
  2023-12-04 19:34 ` [PATCH 1/4] mm/slub: fix bulk alloc and free stats Vlastimil Babka
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Vlastimil Babka @ 2023-12-04 19:34 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev, Vlastimil Babka

This is a spin-off of preparatory patches from the percpu array series
[1], as they are IMHO useful on their own and simple enough to target
6.8, while the percpu array itself is still an RFC.

To avoid non-trivial conflicts, the series is also rebased on top of
the SLAB removal branch [2].

[1] https://lore.kernel.org/all/20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/log/?h=slab/for-6.8/slab-removal

---
Vlastimil Babka (4):
      mm/slub: fix bulk alloc and free stats
      mm/slub: introduce __kmem_cache_free_bulk() without free hooks
      mm/slub: handle bulk and single object freeing separately
      mm/slub: free KFENCE objects in slab_free_hook()

 mm/slub.c | 109 +++++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 73 insertions(+), 36 deletions(-)
---
base-commit: 4a38e93b3a7e6669c44929fed918b1494e902dd7
change-id: 20231204-slub-cleanup-hooks-f5f54132a61c

Best regards,
-- 
Vlastimil Babka <vbabka@suse.cz>



* [PATCH 1/4] mm/slub: fix bulk alloc and free stats
  2023-12-04 19:34 [PATCH 0/4] SLUB: cleanup hook processing Vlastimil Babka
@ 2023-12-04 19:34 ` Vlastimil Babka
  2023-12-05  8:11   ` Chengming Zhou
  2023-12-04 19:34 ` [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks Vlastimil Babka
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Vlastimil Babka @ 2023-12-04 19:34 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev, Vlastimil Babka

The SLUB sysfs stats enabled by CONFIG_SLUB_STATS have two deficiencies
identified wrt bulk alloc/free operations:

- Bulk allocations from cpu freelist are not counted. Add the
  ALLOC_FASTPATH counter there.

- Bulk fastpath freeing will count a list of multiple objects with a
  single FREE_FASTPATH inc. Add a stat_add() variant to count them all.
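
As a rough, standalone illustration of the intended counting semantics
(userspace C, not kernel code; cpu_stat, stat_inc() and stat_add() are
made-up names that merely mirror the SLUB helpers for a single CPU):

#include <stdio.h>

/* Toy single-CPU counters, loosely mirroring the SLUB stat items. */
enum stat_item { ALLOC_FASTPATH, FREE_FASTPATH, NR_STATS };
static unsigned long cpu_stat[NR_STATS];

/* Mirrors stat(): bump a counter by one. */
static void stat_inc(enum stat_item si)
{
	cpu_stat[si]++;
}

/* Mirrors stat_add(): bump a counter by an object count. */
static void stat_add(enum stat_item si, int v)
{
	cpu_stat[si] += v;
}

int main(void)
{
	int i;

	/* Bulk alloc of 8 objects from the cpu freelist: one inc per object. */
	for (i = 0; i < 8; i++)
		stat_inc(ALLOC_FASTPATH);

	/* Fastpath bulk free of a detached freelist of 8 objects: count all 8. */
	stat_add(FREE_FASTPATH, 8);

	printf("alloc_fastpath=%lu free_fastpath=%lu\n",
	       cpu_stat[ALLOC_FASTPATH], cpu_stat[FREE_FASTPATH]);
	return 0;
}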

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index 3f8b95757106..d7b0ca6012e0 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -396,6 +396,14 @@ static inline void stat(const struct kmem_cache *s, enum stat_item si)
 #endif
 }
 
+static inline
+void stat_add(const struct kmem_cache *s, enum stat_item si, int v)
+{
+#ifdef CONFIG_SLUB_STATS
+	raw_cpu_add(s->cpu_slab->stat[si], v);
+#endif
+}
+
 /*
  * The slab lists for all objects.
  */
@@ -4268,7 +4276,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
 
 		local_unlock(&s->cpu_slab->lock);
 	}
-	stat(s, FREE_FASTPATH);
+	stat_add(s, FREE_FASTPATH, cnt);
 }
 #else /* CONFIG_SLUB_TINY */
 static void do_slab_free(struct kmem_cache *s,
@@ -4545,6 +4553,7 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
 		c->freelist = get_freepointer(s, object);
 		p[i] = object;
 		maybe_wipe_obj_freeptr(s, p[i]);
+		stat(s, ALLOC_FASTPATH);
 	}
 	c->tid = next_tid(c->tid);
 	local_unlock_irqrestore(&s->cpu_slab->lock, irqflags);

-- 
2.43.0



* [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks
  2023-12-04 19:34 [PATCH 0/4] SLUB: cleanup hook processing Vlastimil Babka
  2023-12-04 19:34 ` [PATCH 1/4] mm/slub: fix bulk alloc and free stats Vlastimil Babka
@ 2023-12-04 19:34 ` Vlastimil Babka
  2023-12-05  8:19   ` Chengming Zhou
  2023-12-04 19:34 ` [PATCH 3/4] mm/slub: handle bulk and single object freeing separately Vlastimil Babka
  2023-12-04 19:34 ` [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook() Vlastimil Babka
  3 siblings, 1 reply; 17+ messages in thread
From: Vlastimil Babka @ 2023-12-04 19:34 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev, Vlastimil Babka

Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
objects that were allocated before the failure, using
kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
hooks (KASAN etc.) and those expect objects that were processed by the
post alloc hooks, slab_post_alloc_hook() is called before
kmem_cache_free_bulk().

This is wasteful, although not a big concern in practice for the rare
error path. But in order to efficiently handle percpu array batch refill
and free in the near future, we will also need a variant of
kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
and use it for the failure path.

As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
parameter, remove it.
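
To make the pairing constraint concrete, here is a small standalone
sketch (plain userspace C, no kernel APIs; post_alloc_hook(),
free_hook(), alloc_bulk(), free_bulk() and raw_free_bulk() are made-up
names): a post-alloc hook marks an object as tracked, the free hook
insists on that mark, so an allocation failure path has to hand back
the not-yet-hooked objects through a raw free that skips the free hook.

#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct obj {
	bool tracked;	/* set by the post-alloc hook (think KASAN/init state) */
};

static void post_alloc_hook(struct obj *o)
{
	o->tracked = true;
}

static void free_hook(struct obj *o)
{
	assert(o->tracked);	/* the hook only understands set-up objects */
	o->tracked = false;
}

/* Free that runs the hooks, for objects that were handed out to users. */
static void free_bulk(struct obj **p, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		free_hook(p[i]);
		free(p[i]);
	}
}

/* Raw free that skips the hooks, for objects that never left the allocator. */
static void raw_free_bulk(struct obj **p, int n)
{
	int i;

	for (i = 0; i < n; i++)
		free(p[i]);
}

static int alloc_bulk(struct obj **p, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		p[i] = calloc(1, sizeof(**p));
		if (!p[i]) {
			/*
			 * Partial failure: these objects never went through
			 * post_alloc_hook(), so free_hook() must not see them.
			 */
			raw_free_bulk(p, i);
			return 0;
		}
	}

	for (i = 0; i < n; i++)
		post_alloc_hook(p[i]);

	return n;
}

int main(void)
{
	struct obj *p[4];

	if (alloc_bulk(p, 4))
		free_bulk(p, 4);
	return 0;
}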

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 33 ++++++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index d7b0ca6012e0..0742564c4538 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4478,6 +4478,27 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
 	return same;
 }
 
+/*
+ * Internal bulk free of objects that were not initialised by the post alloc
+ * hooks and thus should not be processed by the free hooks
+ */
+static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
+{
+	if (!size)
+		return;
+
+	do {
+		struct detached_freelist df;
+
+		size = build_detached_freelist(s, size, p, &df);
+		if (!df.slab)
+			continue;
+
+		do_slab_free(df.s, df.slab, df.freelist, df.tail, df.cnt,
+			     _RET_IP_);
+	} while (likely(size));
+}
+
 /* Note that interrupts must be enabled when calling this function. */
 void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
 {
@@ -4499,7 +4520,7 @@ EXPORT_SYMBOL(kmem_cache_free_bulk);
 
 #ifndef CONFIG_SLUB_TINY
 static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
-			size_t size, void **p, struct obj_cgroup *objcg)
+					  size_t size, void **p)
 {
 	struct kmem_cache_cpu *c;
 	unsigned long irqflags;
@@ -4563,14 +4584,13 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
 
 error:
 	slub_put_cpu_ptr(s->cpu_slab);
-	slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
-	kmem_cache_free_bulk(s, i, p);
+	__kmem_cache_free_bulk(s, i, p);
 	return 0;
 
 }
 #else /* CONFIG_SLUB_TINY */
 static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
-			size_t size, void **p, struct obj_cgroup *objcg)
+				   size_t size, void **p)
 {
 	int i;
 
@@ -4593,8 +4613,7 @@ static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
 	return i;
 
 error:
-	slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
-	kmem_cache_free_bulk(s, i, p);
+	__kmem_cache_free_bulk(s, i, p);
 	return 0;
 }
 #endif /* CONFIG_SLUB_TINY */
@@ -4614,7 +4633,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 	if (unlikely(!s))
 		return 0;
 
-	i = __kmem_cache_alloc_bulk(s, flags, size, p, objcg);
+	i = __kmem_cache_alloc_bulk(s, flags, size, p);
 
 	/*
 	 * memcg and kmem_cache debug support and memory initialization.

-- 
2.43.0



* [PATCH 3/4] mm/slub: handle bulk and single object freeing separately
  2023-12-04 19:34 [PATCH 0/4] SLUB: cleanup hook processing Vlastimil Babka
  2023-12-04 19:34 ` [PATCH 1/4] mm/slub: fix bulk alloc and free stats Vlastimil Babka
  2023-12-04 19:34 ` [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks Vlastimil Babka
@ 2023-12-04 19:34 ` Vlastimil Babka
  2023-12-05 13:23   ` Chengming Zhou
  2023-12-04 19:34 ` [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook() Vlastimil Babka
  3 siblings, 1 reply; 17+ messages in thread
From: Vlastimil Babka @ 2023-12-04 19:34 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev, Vlastimil Babka

Currently we have a single function slab_free() handling both single
object freeing and bulk freeing with the necessary hooks, the latter
case requiring slab_free_freelist_hook(). It is however better to
distinguish the two use cases for the following reasons:

- code simpler to follow for the single object case

- better code generation - although inlining should eliminate the
  slab_free_freelist_hook() for single object freeing when no debugging
  options are enabled, it seems it's not perfect. When e.g. KASAN is
  enabled, we're imposing additional unnecessary overhead for single
  object freeing.

- preparation to add percpu array caches in near future

Therefore, simplify slab_free() for the single object case by dropping
unnecessary parameters and calling only slab_free_hook() instead of
slab_free_freelist_hook(). Rename the bulk variant to slab_free_bulk()
and adjust callers accordingly.

While at it, flip (and document) slab_free_hook() return value so that
it returns true when the freeing can proceed, which matches the logic of
slab_free_freelist_hook() and is not confusingly the opposite.

Additionally we can simplify a bit by changing the tail parameter of
do_slab_free() when freeing a single object - instead of NULL we can set
it equal to head.
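
For intuition, the fastpath splices the detached chain [head, tail] in
front of the current cpu freelist by pointing tail at the old list
head; with tail set equal to head, a single object degenerates to
exactly the same operation, so no NULL special case is needed. A
standalone sketch of that splice (plain C; struct object, splice() and
cpu_freelist are made-up stand-ins, not the real SLUB code):

#include <stdio.h>

struct object {
	struct object *free;	/* stand-in for the SLUB freelist pointer */
};

static struct object *cpu_freelist;

/* Splice the chain [head, tail] in front of the current freelist. */
static void splice(struct object *head, struct object *tail)
{
	tail->free = cpu_freelist;	/* set_freepointer(s, tail, freelist) */
	cpu_freelist = head;		/* c->freelist = head */
}

int main(void)
{
	struct object o1 = { 0 }, o2 = { 0 }, o3 = { 0 };

	/* Bulk free of a two-object chain: o1 -> o2. */
	o1.free = &o2;
	splice(&o1, &o2);

	/* Single-object free: head == tail == &o3, same code path. */
	splice(&o3, &o3);

	for (struct object *o = cpu_freelist; o; o = o->free)
		printf("%p\n", (void *)o);
	return 0;
}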

bloat-o-meter shows small code reduction with a .config that has KASAN
etc disabled:

add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-118 (-118)
Function                                     old     new   delta
kmem_cache_alloc_bulk                       1203    1196      -7
kmem_cache_free                              861     835     -26
__kmem_cache_free                            741     704     -37
kmem_cache_free_bulk                         911     863     -48

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 59 +++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 35 insertions(+), 24 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 0742564c4538..ed2fa92e914c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2037,9 +2037,12 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
 /*
  * Hooks for other subsystems that check memory allocations. In a typical
  * production configuration these hooks all should produce no code at all.
+ *
+ * Returns true if freeing of the object can proceed, false if its reuse
+ * was delayed by KASAN quarantine.
  */
-static __always_inline bool slab_free_hook(struct kmem_cache *s,
-						void *x, bool init)
+static __always_inline
+bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
 {
 	kmemleak_free_recursive(x, s->flags);
 	kmsan_slab_free(s, x);
@@ -2072,7 +2075,7 @@ static __always_inline bool slab_free_hook(struct kmem_cache *s,
 		       s->size - s->inuse - rsize);
 	}
 	/* KASAN might put x into memory quarantine, delaying its reuse. */
-	return kasan_slab_free(s, x, init);
+	return !kasan_slab_free(s, x, init);
 }
 
 static inline bool slab_free_freelist_hook(struct kmem_cache *s,
@@ -2082,7 +2085,7 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
 
 	void *object;
 	void *next = *head;
-	void *old_tail = *tail ? *tail : *head;
+	void *old_tail = *tail;
 
 	if (is_kfence_address(next)) {
 		slab_free_hook(s, next, false);
@@ -2098,8 +2101,8 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
 		next = get_freepointer(s, object);
 
 		/* If object's reuse doesn't have to be delayed */
-		if (likely(!slab_free_hook(s, object,
-					   slab_want_init_on_free(s)))) {
+		if (likely(slab_free_hook(s, object,
+					  slab_want_init_on_free(s)))) {
 			/* Move object to the new freelist */
 			set_freepointer(s, object, *head);
 			*head = object;
@@ -2114,9 +2117,6 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
 		}
 	} while (object != old_tail);
 
-	if (*head == *tail)
-		*tail = NULL;
-
 	return *head != NULL;
 }
 
@@ -4227,7 +4227,6 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
 				struct slab *slab, void *head, void *tail,
 				int cnt, unsigned long addr)
 {
-	void *tail_obj = tail ? : head;
 	struct kmem_cache_cpu *c;
 	unsigned long tid;
 	void **freelist;
@@ -4246,14 +4245,14 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
 	barrier();
 
 	if (unlikely(slab != c->slab)) {
-		__slab_free(s, slab, head, tail_obj, cnt, addr);
+		__slab_free(s, slab, head, tail, cnt, addr);
 		return;
 	}
 
 	if (USE_LOCKLESS_FAST_PATH()) {
 		freelist = READ_ONCE(c->freelist);
 
-		set_freepointer(s, tail_obj, freelist);
+		set_freepointer(s, tail, freelist);
 
 		if (unlikely(!__update_cpu_freelist_fast(s, freelist, head, tid))) {
 			note_cmpxchg_failure("slab_free", s, tid);
@@ -4270,7 +4269,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
 		tid = c->tid;
 		freelist = c->freelist;
 
-		set_freepointer(s, tail_obj, freelist);
+		set_freepointer(s, tail, freelist);
 		c->freelist = head;
 		c->tid = next_tid(tid);
 
@@ -4283,15 +4282,27 @@ static void do_slab_free(struct kmem_cache *s,
 				struct slab *slab, void *head, void *tail,
 				int cnt, unsigned long addr)
 {
-	void *tail_obj = tail ? : head;
-
-	__slab_free(s, slab, head, tail_obj, cnt, addr);
+	__slab_free(s, slab, head, tail, cnt, addr);
 }
 #endif /* CONFIG_SLUB_TINY */
 
-static __fastpath_inline void slab_free(struct kmem_cache *s, struct slab *slab,
-				      void *head, void *tail, void **p, int cnt,
-				      unsigned long addr)
+static __fastpath_inline
+void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
+	       unsigned long addr)
+{
+	bool init;
+
+	memcg_slab_free_hook(s, slab, &object, 1);
+
+	init = !is_kfence_address(object) && slab_want_init_on_free(s);
+
+	if (likely(slab_free_hook(s, object, init)))
+		do_slab_free(s, slab, object, object, 1, addr);
+}
+
+static __fastpath_inline
+void slab_free_bulk(struct kmem_cache *s, struct slab *slab, void *head,
+		    void *tail, void **p, int cnt, unsigned long addr)
 {
 	memcg_slab_free_hook(s, slab, p, cnt);
 	/*
@@ -4305,7 +4316,7 @@ static __fastpath_inline void slab_free(struct kmem_cache *s, struct slab *slab,
 #ifdef CONFIG_KASAN_GENERIC
 void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr)
 {
-	do_slab_free(cache, virt_to_slab(x), x, NULL, 1, addr);
+	do_slab_free(cache, virt_to_slab(x), x, x, 1, addr);
 }
 #endif
 
@@ -4349,7 +4360,7 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
 	if (!s)
 		return;
 	trace_kmem_cache_free(_RET_IP_, x, s);
-	slab_free(s, virt_to_slab(x), x, NULL, &x, 1, _RET_IP_);
+	slab_free(s, virt_to_slab(x), x, _RET_IP_);
 }
 EXPORT_SYMBOL(kmem_cache_free);
 
@@ -4395,7 +4406,7 @@ void kfree(const void *object)
 
 	slab = folio_slab(folio);
 	s = slab->slab_cache;
-	slab_free(s, slab, x, NULL, &x, 1, _RET_IP_);
+	slab_free(s, slab, x, _RET_IP_);
 }
 EXPORT_SYMBOL(kfree);
 
@@ -4512,8 +4523,8 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
 		if (!df.slab)
 			continue;
 
-		slab_free(df.s, df.slab, df.freelist, df.tail, &p[size], df.cnt,
-			  _RET_IP_);
+		slab_free_bulk(df.s, df.slab, df.freelist, df.tail, &p[size],
+			       df.cnt, _RET_IP_);
 	} while (likely(size));
 }
 EXPORT_SYMBOL(kmem_cache_free_bulk);

-- 
2.43.0



* [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()
  2023-12-04 19:34 [PATCH 0/4] SLUB: cleanup hook processing Vlastimil Babka
                   ` (2 preceding siblings ...)
  2023-12-04 19:34 ` [PATCH 3/4] mm/slub: handle bulk and single object freeing separately Vlastimil Babka
@ 2023-12-04 19:34 ` Vlastimil Babka
  2023-12-05 13:27   ` Chengming Zhou
  3 siblings, 1 reply; 17+ messages in thread
From: Vlastimil Babka @ 2023-12-04 19:34 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev, Vlastimil Babka

When freeing an object that was allocated from KFENCE, we do that in the
slowpath __slab_free(), relying on the fact that a KFENCE "slab" cannot be
the cpu slab, so the fastpath has to fall back to the slowpath.

This optimization doesn't help much though, because is_kfence_address()
is checked earlier anyway during the free hook processing or detached
freelist building. Thus we can simplify the code by making
slab_free_hook() free the KFENCE object immediately, similarly to the
KASAN quarantine.

In slab_free_hook() we can place kfence_free() above the init processing,
as callers have been making sure to set init to false for KFENCE objects.
This simplifies slab_free(). The kfence_free() call also ends up above
kasan_slab_free(), which is OK as that skips KFENCE objects anyway.

While at it also determine the init value in slab_free_freelist_hook()
outside of the loop.

This change will also make introducing per cpu array caches easier.

Tested-by: Marco Elver <elver@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index ed2fa92e914c..e38c2b712f6c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2039,7 +2039,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
  * production configuration these hooks all should produce no code at all.
  *
  * Returns true if freeing of the object can proceed, false if its reuse
- * was delayed by KASAN quarantine.
+ * was delayed by KASAN quarantine, or it was returned to KFENCE.
  */
 static __always_inline
 bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
@@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
 		__kcsan_check_access(x, s->object_size,
 				     KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
 
+	if (kfence_free(kasan_reset_tag(x)))
+		return false;
+
 	/*
 	 * As memory initialization might be integrated into KASAN,
 	 * kasan_slab_free and initialization memset's must be
@@ -2086,23 +2089,25 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
 	void *object;
 	void *next = *head;
 	void *old_tail = *tail;
+	bool init;
 
 	if (is_kfence_address(next)) {
 		slab_free_hook(s, next, false);
-		return true;
+		return false;
 	}
 
 	/* Head and tail of the reconstructed freelist */
 	*head = NULL;
 	*tail = NULL;
 
+	init = slab_want_init_on_free(s);
+
 	do {
 		object = next;
 		next = get_freepointer(s, object);
 
 		/* If object's reuse doesn't have to be delayed */
-		if (likely(slab_free_hook(s, object,
-					  slab_want_init_on_free(s)))) {
+		if (likely(slab_free_hook(s, object, init))) {
 			/* Move object to the new freelist */
 			set_freepointer(s, object, *head);
 			*head = object;
@@ -4103,9 +4108,6 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
 
 	stat(s, FREE_SLOWPATH);
 
-	if (kfence_free(head))
-		return;
-
 	if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
 		free_to_partial_list(s, slab, head, tail, cnt, addr);
 		return;
@@ -4290,13 +4292,9 @@ static __fastpath_inline
 void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
 	       unsigned long addr)
 {
-	bool init;
-
 	memcg_slab_free_hook(s, slab, &object, 1);
 
-	init = !is_kfence_address(object) && slab_want_init_on_free(s);
-
-	if (likely(slab_free_hook(s, object, init)))
+	if (likely(slab_free_hook(s, object, slab_want_init_on_free(s))))
 		do_slab_free(s, slab, object, object, 1, addr);
 }
 

-- 
2.43.0



* Re: [PATCH 1/4] mm/slub: fix bulk alloc and free stats
  2023-12-04 19:34 ` [PATCH 1/4] mm/slub: fix bulk alloc and free stats Vlastimil Babka
@ 2023-12-05  8:11   ` Chengming Zhou
  0 siblings, 0 replies; 17+ messages in thread
From: Chengming Zhou @ 2023-12-05  8:11 UTC (permalink / raw)
  To: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev

On 2023/12/5 03:34, Vlastimil Babka wrote:
> The SLUB sysfs stats enabled CONFIG_SLUB_STATS have two deficiencies
> identified wrt bulk alloc/free operations:
> 
> - Bulk allocations from cpu freelist are not counted. Add the
>   ALLOC_FASTPATH counter there.
> 
> - Bulk fastpath freeing will count a list of multiple objects with a
>   single FREE_FASTPATH inc. Add a stat_add() variant to count them all.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Looks good to me!

Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>

> ---
>  mm/slub.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 3f8b95757106..d7b0ca6012e0 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -396,6 +396,14 @@ static inline void stat(const struct kmem_cache *s, enum stat_item si)
>  #endif
>  }
>  
> +static inline
> +void stat_add(const struct kmem_cache *s, enum stat_item si, int v)
> +{
> +#ifdef CONFIG_SLUB_STATS
> +	raw_cpu_add(s->cpu_slab->stat[si], v);
> +#endif
> +}
> +
>  /*
>   * The slab lists for all objects.
>   */
> @@ -4268,7 +4276,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
>  
>  		local_unlock(&s->cpu_slab->lock);
>  	}
> -	stat(s, FREE_FASTPATH);
> +	stat_add(s, FREE_FASTPATH, cnt);
>  }
>  #else /* CONFIG_SLUB_TINY */
>  static void do_slab_free(struct kmem_cache *s,
> @@ -4545,6 +4553,7 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
>  		c->freelist = get_freepointer(s, object);
>  		p[i] = object;
>  		maybe_wipe_obj_freeptr(s, p[i]);
> +		stat(s, ALLOC_FASTPATH);
>  	}
>  	c->tid = next_tid(c->tid);
>  	local_unlock_irqrestore(&s->cpu_slab->lock, irqflags);
> 


* Re: [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks
  2023-12-04 19:34 ` [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks Vlastimil Babka
@ 2023-12-05  8:19   ` Chengming Zhou
  2023-12-05 19:57     ` Vlastimil Babka
  0 siblings, 1 reply; 17+ messages in thread
From: Chengming Zhou @ 2023-12-05  8:19 UTC (permalink / raw)
  To: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev

On 2023/12/5 03:34, Vlastimil Babka wrote:
> Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
> objects that were allocated before the failure, using
> kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
> hooks (KASAN etc.) and those expect objects that were processed by the
> post alloc hooks, slab_post_alloc_hook() is called before
> kmem_cache_free_bulk().
> 
> This is wasteful, although not a big concern in practice for the rare
> error path. But in order to efficiently handle percpu array batch refill
> and free in the near future, we will also need a variant of
> kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
> and use it for the failure path.
> 
> As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
> parameter, remove it.

The objects may have been charged before, but it seems __kmem_cache_alloc_bulk()
forgets to uncharge them? I can't find "uncharge" in do_slab_free(), or maybe
the bulk interface won't be used on chargeable slabs?

Thanks.

> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/slub.c | 33 ++++++++++++++++++++++++++-------
>  1 file changed, 26 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index d7b0ca6012e0..0742564c4538 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4478,6 +4478,27 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
>  	return same;
>  }
>  
> +/*
> + * Internal bulk free of objects that were not initialised by the post alloc
> + * hooks and thus should not be processed by the free hooks
> + */
> +static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> +{
> +	if (!size)
> +		return;
> +
> +	do {
> +		struct detached_freelist df;
> +
> +		size = build_detached_freelist(s, size, p, &df);
> +		if (!df.slab)
> +			continue;
> +
> +		do_slab_free(df.s, df.slab, df.freelist, df.tail, df.cnt,
> +			     _RET_IP_);
> +	} while (likely(size));
> +}
> +
>  /* Note that interrupts must be enabled when calling this function. */
>  void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
>  {
> @@ -4499,7 +4520,7 @@ EXPORT_SYMBOL(kmem_cache_free_bulk);
>  
>  #ifndef CONFIG_SLUB_TINY
>  static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
> -			size_t size, void **p, struct obj_cgroup *objcg)
> +					  size_t size, void **p)
>  {
>  	struct kmem_cache_cpu *c;
>  	unsigned long irqflags;
> @@ -4563,14 +4584,13 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
>  
>  error:
>  	slub_put_cpu_ptr(s->cpu_slab);
> -	slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
> -	kmem_cache_free_bulk(s, i, p);
> +	__kmem_cache_free_bulk(s, i, p);
>  	return 0;
>  
>  }
>  #else /* CONFIG_SLUB_TINY */
>  static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
> -			size_t size, void **p, struct obj_cgroup *objcg)
> +				   size_t size, void **p)
>  {
>  	int i;
>  
> @@ -4593,8 +4613,7 @@ static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
>  	return i;
>  
>  error:
> -	slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
> -	kmem_cache_free_bulk(s, i, p);
> +	__kmem_cache_free_bulk(s, i, p);
>  	return 0;
>  }
>  #endif /* CONFIG_SLUB_TINY */
> @@ -4614,7 +4633,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
>  	if (unlikely(!s))
>  		return 0;
>  
> -	i = __kmem_cache_alloc_bulk(s, flags, size, p, objcg);
> +	i = __kmem_cache_alloc_bulk(s, flags, size, p);
>  
>  	/*
>  	 * memcg and kmem_cache debug support and memory initialization.
> 


* Re: [PATCH 3/4] mm/slub: handle bulk and single object freeing separately
  2023-12-04 19:34 ` [PATCH 3/4] mm/slub: handle bulk and single object freeing separately Vlastimil Babka
@ 2023-12-05 13:23   ` Chengming Zhou
  0 siblings, 0 replies; 17+ messages in thread
From: Chengming Zhou @ 2023-12-05 13:23 UTC (permalink / raw)
  To: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev

On 2023/12/5 03:34, Vlastimil Babka wrote:
> Currently we have a single function slab_free() handling both single
> object freeing and bulk freeing with necessary hooks, the latter case
> requiring slab_free_freelist_hook(). It should be however better to
> distinguish the two use cases for the following reasons:
> 
> - code simpler to follow for the single object case
> 
> - better code generation - although inlining should eliminate the
>   slab_free_freelist_hook() for single object freeing in case no
>   debugging options are enabled, it seems it's not perfect. When e.g.
>   KASAN is enabled, we're imposing additional unnecessary overhead for
>   single object freeing.
> 
> - preparation to add percpu array caches in near future
> 
> Therefore, simplify slab_free() for the single object case by dropping
> unnecessary parameters and calling only slab_free_hook() instead of
> slab_free_freelist_hook(). Rename the bulk variant to slab_free_bulk()
> and adjust callers accordingly.
> 
> While at it, flip (and document) slab_free_hook() return value so that
> it returns true when the freeing can proceed, which matches the logic of
> slab_free_freelist_hook() and is not confusingly the opposite.
> 
> Additionally we can simplify a bit by changing the tail parameter of
> do_slab_free() when freeing a single object - instead of NULL we can set
> it equal to head.
> 
> bloat-o-meter shows small code reduction with a .config that has KASAN
> etc disabled:
> 
> add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-118 (-118)
> Function                                     old     new   delta
> kmem_cache_alloc_bulk                       1203    1196      -7
> kmem_cache_free                              861     835     -26
> __kmem_cache_free                            741     704     -37
> kmem_cache_free_bulk                         911     863     -48
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Looks good to me.

Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>

Thanks!

> ---
>  mm/slub.c | 59 +++++++++++++++++++++++++++++++++++------------------------
>  1 file changed, 35 insertions(+), 24 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 0742564c4538..ed2fa92e914c 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2037,9 +2037,12 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
>  /*
>   * Hooks for other subsystems that check memory allocations. In a typical
>   * production configuration these hooks all should produce no code at all.
> + *
> + * Returns true if freeing of the object can proceed, false if its reuse
> + * was delayed by KASAN quarantine.
>   */
> -static __always_inline bool slab_free_hook(struct kmem_cache *s,
> -						void *x, bool init)
> +static __always_inline
> +bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>  {
>  	kmemleak_free_recursive(x, s->flags);
>  	kmsan_slab_free(s, x);
> @@ -2072,7 +2075,7 @@ static __always_inline bool slab_free_hook(struct kmem_cache *s,
>  		       s->size - s->inuse - rsize);
>  	}
>  	/* KASAN might put x into memory quarantine, delaying its reuse. */
> -	return kasan_slab_free(s, x, init);
> +	return !kasan_slab_free(s, x, init);
>  }
>  
>  static inline bool slab_free_freelist_hook(struct kmem_cache *s,
> @@ -2082,7 +2085,7 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
>  
>  	void *object;
>  	void *next = *head;
> -	void *old_tail = *tail ? *tail : *head;
> +	void *old_tail = *tail;
>  
>  	if (is_kfence_address(next)) {
>  		slab_free_hook(s, next, false);
> @@ -2098,8 +2101,8 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
>  		next = get_freepointer(s, object);
>  
>  		/* If object's reuse doesn't have to be delayed */
> -		if (likely(!slab_free_hook(s, object,
> -					   slab_want_init_on_free(s)))) {
> +		if (likely(slab_free_hook(s, object,
> +					  slab_want_init_on_free(s)))) {
>  			/* Move object to the new freelist */
>  			set_freepointer(s, object, *head);
>  			*head = object;
> @@ -2114,9 +2117,6 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
>  		}
>  	} while (object != old_tail);
>  
> -	if (*head == *tail)
> -		*tail = NULL;
> -
>  	return *head != NULL;
>  }
>  
> @@ -4227,7 +4227,6 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
>  				struct slab *slab, void *head, void *tail,
>  				int cnt, unsigned long addr)
>  {
> -	void *tail_obj = tail ? : head;
>  	struct kmem_cache_cpu *c;
>  	unsigned long tid;
>  	void **freelist;
> @@ -4246,14 +4245,14 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
>  	barrier();
>  
>  	if (unlikely(slab != c->slab)) {
> -		__slab_free(s, slab, head, tail_obj, cnt, addr);
> +		__slab_free(s, slab, head, tail, cnt, addr);
>  		return;
>  	}
>  
>  	if (USE_LOCKLESS_FAST_PATH()) {
>  		freelist = READ_ONCE(c->freelist);
>  
> -		set_freepointer(s, tail_obj, freelist);
> +		set_freepointer(s, tail, freelist);
>  
>  		if (unlikely(!__update_cpu_freelist_fast(s, freelist, head, tid))) {
>  			note_cmpxchg_failure("slab_free", s, tid);
> @@ -4270,7 +4269,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
>  		tid = c->tid;
>  		freelist = c->freelist;
>  
> -		set_freepointer(s, tail_obj, freelist);
> +		set_freepointer(s, tail, freelist);
>  		c->freelist = head;
>  		c->tid = next_tid(tid);
>  
> @@ -4283,15 +4282,27 @@ static void do_slab_free(struct kmem_cache *s,
>  				struct slab *slab, void *head, void *tail,
>  				int cnt, unsigned long addr)
>  {
> -	void *tail_obj = tail ? : head;
> -
> -	__slab_free(s, slab, head, tail_obj, cnt, addr);
> +	__slab_free(s, slab, head, tail, cnt, addr);
>  }
>  #endif /* CONFIG_SLUB_TINY */
>  
> -static __fastpath_inline void slab_free(struct kmem_cache *s, struct slab *slab,
> -				      void *head, void *tail, void **p, int cnt,
> -				      unsigned long addr)
> +static __fastpath_inline
> +void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
> +	       unsigned long addr)
> +{
> +	bool init;
> +
> +	memcg_slab_free_hook(s, slab, &object, 1);
> +
> +	init = !is_kfence_address(object) && slab_want_init_on_free(s);
> +
> +	if (likely(slab_free_hook(s, object, init)))
> +		do_slab_free(s, slab, object, object, 1, addr);
> +}
> +
> +static __fastpath_inline
> +void slab_free_bulk(struct kmem_cache *s, struct slab *slab, void *head,
> +		    void *tail, void **p, int cnt, unsigned long addr)
>  {
>  	memcg_slab_free_hook(s, slab, p, cnt);
>  	/*
> @@ -4305,7 +4316,7 @@ static __fastpath_inline void slab_free(struct kmem_cache *s, struct slab *slab,
>  #ifdef CONFIG_KASAN_GENERIC
>  void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr)
>  {
> -	do_slab_free(cache, virt_to_slab(x), x, NULL, 1, addr);
> +	do_slab_free(cache, virt_to_slab(x), x, x, 1, addr);
>  }
>  #endif
>  
> @@ -4349,7 +4360,7 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
>  	if (!s)
>  		return;
>  	trace_kmem_cache_free(_RET_IP_, x, s);
> -	slab_free(s, virt_to_slab(x), x, NULL, &x, 1, _RET_IP_);
> +	slab_free(s, virt_to_slab(x), x, _RET_IP_);
>  }
>  EXPORT_SYMBOL(kmem_cache_free);
>  
> @@ -4395,7 +4406,7 @@ void kfree(const void *object)
>  
>  	slab = folio_slab(folio);
>  	s = slab->slab_cache;
> -	slab_free(s, slab, x, NULL, &x, 1, _RET_IP_);
> +	slab_free(s, slab, x, _RET_IP_);
>  }
>  EXPORT_SYMBOL(kfree);
>  
> @@ -4512,8 +4523,8 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
>  		if (!df.slab)
>  			continue;
>  
> -		slab_free(df.s, df.slab, df.freelist, df.tail, &p[size], df.cnt,
> -			  _RET_IP_);
> +		slab_free_bulk(df.s, df.slab, df.freelist, df.tail, &p[size],
> +			       df.cnt, _RET_IP_);
>  	} while (likely(size));
>  }
>  EXPORT_SYMBOL(kmem_cache_free_bulk);
> 


* Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()
  2023-12-04 19:34 ` [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook() Vlastimil Babka
@ 2023-12-05 13:27   ` Chengming Zhou
  2023-12-06  9:58     ` Vlastimil Babka
  0 siblings, 1 reply; 17+ messages in thread
From: Chengming Zhou @ 2023-12-05 13:27 UTC (permalink / raw)
  To: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev

On 2023/12/5 03:34, Vlastimil Babka wrote:
> When freeing an object that was allocated from KFENCE, we do that in the
> slowpath __slab_free(), relying on the fact that KFENCE "slab" cannot be
> the cpu slab, so the fastpath has to fallback to the slowpath.
> 
> This optimization doesn't help much though, because is_kfence_address()
> is checked earlier anyway during the free hook processing or detached
> freelist building. Thus we can simplify the code by making the
> slab_free_hook() free the KFENCE object immediately, similarly to KASAN
> quarantine.
> 
> In slab_free_hook() we can place kfence_free() above init processing, as
> callers have been making sure to set init to false for KFENCE objects.
> This simplifies slab_free(). This places it also above kasan_slab_free()
> which is ok as that skips KFENCE objects anyway.
> 
> While at it also determine the init value in slab_free_freelist_hook()
> outside of the loop.
> 
> This change will also make introducing per cpu array caches easier.
> 
> Tested-by: Marco Elver <elver@google.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/slub.c | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index ed2fa92e914c..e38c2b712f6c 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2039,7 +2039,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
>   * production configuration these hooks all should produce no code at all.
>   *
>   * Returns true if freeing of the object can proceed, false if its reuse
> - * was delayed by KASAN quarantine.
> + * was delayed by KASAN quarantine, or it was returned to KFENCE.
>   */
>  static __always_inline
>  bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
> @@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>  		__kcsan_check_access(x, s->object_size,
>  				     KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
>  
> +	if (kfence_free(kasan_reset_tag(x)))

I'm wondering if "kasan_reset_tag()" is needed here?

The patch looks good to me!

Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>

Thanks.

> +		return false;
> +
>  	/*
>  	 * As memory initialization might be integrated into KASAN,
>  	 * kasan_slab_free and initialization memset's must be
> @@ -2086,23 +2089,25 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
>  	void *object;
>  	void *next = *head;
>  	void *old_tail = *tail;
> +	bool init;
>  
>  	if (is_kfence_address(next)) {
>  		slab_free_hook(s, next, false);
> -		return true;
> +		return false;
>  	}
>  
>  	/* Head and tail of the reconstructed freelist */
>  	*head = NULL;
>  	*tail = NULL;
>  
> +	init = slab_want_init_on_free(s);
> +
>  	do {
>  		object = next;
>  		next = get_freepointer(s, object);
>  
>  		/* If object's reuse doesn't have to be delayed */
> -		if (likely(slab_free_hook(s, object,
> -					  slab_want_init_on_free(s)))) {
> +		if (likely(slab_free_hook(s, object, init))) {
>  			/* Move object to the new freelist */
>  			set_freepointer(s, object, *head);
>  			*head = object;
> @@ -4103,9 +4108,6 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
>  
>  	stat(s, FREE_SLOWPATH);
>  
> -	if (kfence_free(head))
> -		return;
> -
>  	if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
>  		free_to_partial_list(s, slab, head, tail, cnt, addr);
>  		return;
> @@ -4290,13 +4292,9 @@ static __fastpath_inline
>  void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
>  	       unsigned long addr)
>  {
> -	bool init;
> -
>  	memcg_slab_free_hook(s, slab, &object, 1);
>  
> -	init = !is_kfence_address(object) && slab_want_init_on_free(s);
> -
> -	if (likely(slab_free_hook(s, object, init)))
> +	if (likely(slab_free_hook(s, object, slab_want_init_on_free(s))))
>  		do_slab_free(s, slab, object, object, 1, addr);
>  }
>  
> 


* Re: [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks
  2023-12-05  8:19   ` Chengming Zhou
@ 2023-12-05 19:57     ` Vlastimil Babka
  2023-12-06  0:31       ` Chengming Zhou
  0 siblings, 1 reply; 17+ messages in thread
From: Vlastimil Babka @ 2023-12-05 19:57 UTC (permalink / raw)
  To: Chengming Zhou, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev

On 12/5/23 09:19, Chengming Zhou wrote:
> On 2023/12/5 03:34, Vlastimil Babka wrote:
>> Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
>> objects that were allocated before the failure, using
>> kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
>> hooks (KASAN etc.) and those expect objects that were processed by the
>> post alloc hooks, slab_post_alloc_hook() is called before
>> kmem_cache_free_bulk().
>> 
>> This is wasteful, although not a big concern in practice for the rare
>> error path. But in order to efficiently handle percpu array batch refill
>> and free in the near future, we will also need a variant of
>> kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
>> and use it for the failure path.
>> 
>> As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
>> parameter, remove it.
> 
> The objects may have been charged before, but it seems __kmem_cache_alloc_bulk()
> forget to uncharge them? I can't find "uncharge" in do_slab_free(), or maybe
> the bulk interface won't be used on chargeable slab?

You're right! I missed that the memcg_pre_alloc_hook() already does the
charging, so we need to uncharge. How does this look? Thanks for noticing!

----8<----
From 52f8e77fdfeabffffdce6b761ba5508e940df3be Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Thu, 2 Nov 2023 16:34:39 +0100
Subject: [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free
 hooks

Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
objects that were allocated before the failure, using
kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
hooks (KASAN etc.) and those expect objects that were processed by the
post alloc hooks, slab_post_alloc_hook() is called before
kmem_cache_free_bulk().

This is wasteful, although not a big concern in practice for the rare
error path. But in order to efficiently handle percpu array batch refill
and free in the near future, we will also need a variant of
kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
and use it for the failure path.

In case of failure we however still need to perform the memcg uncharge,
so handle that in a new memcg_slab_alloc_error_hook(). Thanks to Chengming
Zhou for noticing the missing uncharge.

As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
parameter, remove it.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 47 insertions(+), 9 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index d7b0ca6012e0..0a9e4bd0dd68 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2003,6 +2003,14 @@ void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
 
 	__memcg_slab_free_hook(s, slab, p, objects, objcgs);
 }
+
+static inline
+void memcg_slab_alloc_error_hook(struct kmem_cache *s, int objects,
+			   struct obj_cgroup *objcg)
+{
+	if (objcg)
+		obj_cgroup_uncharge(objcg, objects * obj_full_size(s));
+}
 #else /* CONFIG_MEMCG_KMEM */
 static inline struct mem_cgroup *memcg_from_slab_obj(void *ptr)
 {
@@ -2032,6 +2040,12 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
 					void **p, int objects)
 {
 }
+
+static inline
+void memcg_slab_alloc_error_hook(struct kmem_cache *s, int objects,
+				 struct obj_cgroup *objcg)
+{
+}
 #endif /* CONFIG_MEMCG_KMEM */
 
 /*
@@ -4478,6 +4492,27 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
 	return same;
 }
 
+/*
+ * Internal bulk free of objects that were not initialised by the post alloc
+ * hooks and thus should not be processed by the free hooks
+ */
+static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
+{
+	if (!size)
+		return;
+
+	do {
+		struct detached_freelist df;
+
+		size = build_detached_freelist(s, size, p, &df);
+		if (!df.slab)
+			continue;
+
+		do_slab_free(df.s, df.slab, df.freelist, df.tail, df.cnt,
+			     _RET_IP_);
+	} while (likely(size));
+}
+
 /* Note that interrupts must be enabled when calling this function. */
 void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
 {
@@ -4498,8 +4533,9 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
 EXPORT_SYMBOL(kmem_cache_free_bulk);
 
 #ifndef CONFIG_SLUB_TINY
-static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
-			size_t size, void **p, struct obj_cgroup *objcg)
+static inline
+int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
+			    void **p)
 {
 	struct kmem_cache_cpu *c;
 	unsigned long irqflags;
@@ -4563,14 +4599,13 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
 
 error:
 	slub_put_cpu_ptr(s->cpu_slab);
-	slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
-	kmem_cache_free_bulk(s, i, p);
+	__kmem_cache_free_bulk(s, i, p);
 	return 0;
 
 }
 #else /* CONFIG_SLUB_TINY */
 static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
-			size_t size, void **p, struct obj_cgroup *objcg)
+				   size_t size, void **p)
 {
 	int i;
 
@@ -4593,8 +4628,7 @@ static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
 	return i;
 
 error:
-	slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
-	kmem_cache_free_bulk(s, i, p);
+	__kmem_cache_free_bulk(s, i, p);
 	return 0;
 }
 #endif /* CONFIG_SLUB_TINY */
@@ -4614,15 +4648,19 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 	if (unlikely(!s))
 		return 0;
 
-	i = __kmem_cache_alloc_bulk(s, flags, size, p, objcg);
+	i = __kmem_cache_alloc_bulk(s, flags, size, p);
 
 	/*
 	 * memcg and kmem_cache debug support and memory initialization.
 	 * Done outside of the IRQ disabled fastpath loop.
 	 */
-	if (i != 0)
+	if (likely(i != 0)) {
 		slab_post_alloc_hook(s, objcg, flags, size, p,
 			slab_want_init_on_alloc(flags, s), s->object_size);
+	} else {
+		memcg_slab_alloc_error_hook(s, size, objcg);
+	}
+
 	return i;
 }
 EXPORT_SYMBOL(kmem_cache_alloc_bulk);
-- 
2.43.0




* Re: [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks
  2023-12-05 19:57     ` Vlastimil Babka
@ 2023-12-06  0:31       ` Chengming Zhou
  0 siblings, 0 replies; 17+ messages in thread
From: Chengming Zhou @ 2023-12-06  0:31 UTC (permalink / raw)
  To: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev

On 2023/12/6 03:57, Vlastimil Babka wrote:
> On 12/5/23 09:19, Chengming Zhou wrote:
>> On 2023/12/5 03:34, Vlastimil Babka wrote:
>>> Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
>>> objects that were allocated before the failure, using
>>> kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
>>> hooks (KASAN etc.) and those expect objects that were processed by the
>>> post alloc hooks, slab_post_alloc_hook() is called before
>>> kmem_cache_free_bulk().
>>>
>>> This is wasteful, although not a big concern in practice for the rare
>>> error path. But in order to efficiently handle percpu array batch refill
>>> and free in the near future, we will also need a variant of
>>> kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
>>> and use it for the failure path.
>>>
>>> As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
>>> parameter, remove it.
>>
>> The objects may have been charged before, but it seems __kmem_cache_alloc_bulk()
>> forget to uncharge them? I can't find "uncharge" in do_slab_free(), or maybe
>> the bulk interface won't be used on chargeable slab?
> 
> You're right! I missed that the memcg_pre_alloc_hook() already does the
> charging, so we need to uncharge. How does this look? Thanks for noticing!
> 
> ----8<----
> From 52f8e77fdfeabffffdce6b761ba5508e940df3be Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Thu, 2 Nov 2023 16:34:39 +0100
> Subject: [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free
>  hooks
> 
> Currently, when __kmem_cache_alloc_bulk() fails, it frees back the
> objects that were allocated before the failure, using
> kmem_cache_free_bulk(). Because kmem_cache_free_bulk() calls the free
> hooks (KASAN etc.) and those expect objects that were processed by the
> post alloc hooks, slab_post_alloc_hook() is called before
> kmem_cache_free_bulk().
> 
> This is wasteful, although not a big concern in practice for the rare
> error path. But in order to efficiently handle percpu array batch refill
> and free in the near future, we will also need a variant of
> kmem_cache_free_bulk() that avoids the free hooks. So introduce it now
> and use it for the failure path.
> 
> In case of failure we however still need to perform memcg uncharge so
> handle that in a new memcg_slab_alloc_error_hook(). Thanks to Chengming
> Zhou for noticing the missing uncharge.
> 
> As a consequence, __kmem_cache_alloc_bulk() no longer needs the objcg
> parameter, remove it.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Looks good to me!

Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>

Thanks!

> ---
>  mm/slub.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 47 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index d7b0ca6012e0..0a9e4bd0dd68 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2003,6 +2003,14 @@ void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
>  
>  	__memcg_slab_free_hook(s, slab, p, objects, objcgs);
>  }
> +
> +static inline
> +void memcg_slab_alloc_error_hook(struct kmem_cache *s, int objects,
> +			   struct obj_cgroup *objcg)
> +{
> +	if (objcg)
> +		obj_cgroup_uncharge(objcg, objects * obj_full_size(s));
> +}
>  #else /* CONFIG_MEMCG_KMEM */
>  static inline struct mem_cgroup *memcg_from_slab_obj(void *ptr)
>  {
> @@ -2032,6 +2040,12 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
>  					void **p, int objects)
>  {
>  }
> +
> +static inline
> +void memcg_slab_alloc_error_hook(struct kmem_cache *s, int objects,
> +				 struct obj_cgroup *objcg)
> +{
> +}
>  #endif /* CONFIG_MEMCG_KMEM */
>  
>  /*
> @@ -4478,6 +4492,27 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
>  	return same;
>  }
>  
> +/*
> + * Internal bulk free of objects that were not initialised by the post alloc
> + * hooks and thus should not be processed by the free hooks
> + */
> +static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> +{
> +	if (!size)
> +		return;
> +
> +	do {
> +		struct detached_freelist df;
> +
> +		size = build_detached_freelist(s, size, p, &df);
> +		if (!df.slab)
> +			continue;
> +
> +		do_slab_free(df.s, df.slab, df.freelist, df.tail, df.cnt,
> +			     _RET_IP_);
> +	} while (likely(size));
> +}
> +
>  /* Note that interrupts must be enabled when calling this function. */
>  void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
>  {
> @@ -4498,8 +4533,9 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
>  EXPORT_SYMBOL(kmem_cache_free_bulk);
>  
>  #ifndef CONFIG_SLUB_TINY
> -static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
> -			size_t size, void **p, struct obj_cgroup *objcg)
> +static inline
> +int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
> +			    void **p)
>  {
>  	struct kmem_cache_cpu *c;
>  	unsigned long irqflags;
> @@ -4563,14 +4599,13 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
>  
>  error:
>  	slub_put_cpu_ptr(s->cpu_slab);
> -	slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
> -	kmem_cache_free_bulk(s, i, p);
> +	__kmem_cache_free_bulk(s, i, p);
>  	return 0;
>  
>  }
>  #else /* CONFIG_SLUB_TINY */
>  static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
> -			size_t size, void **p, struct obj_cgroup *objcg)
> +				   size_t size, void **p)
>  {
>  	int i;
>  
> @@ -4593,8 +4628,7 @@ static int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
>  	return i;
>  
>  error:
> -	slab_post_alloc_hook(s, objcg, flags, i, p, false, s->object_size);
> -	kmem_cache_free_bulk(s, i, p);
> +	__kmem_cache_free_bulk(s, i, p);
>  	return 0;
>  }
>  #endif /* CONFIG_SLUB_TINY */
> @@ -4614,15 +4648,19 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
>  	if (unlikely(!s))
>  		return 0;
>  
> -	i = __kmem_cache_alloc_bulk(s, flags, size, p, objcg);
> +	i = __kmem_cache_alloc_bulk(s, flags, size, p);
>  
>  	/*
>  	 * memcg and kmem_cache debug support and memory initialization.
>  	 * Done outside of the IRQ disabled fastpath loop.
>  	 */
> -	if (i != 0)
> +	if (likely(i != 0)) {
>  		slab_post_alloc_hook(s, objcg, flags, size, p,
>  			slab_want_init_on_alloc(flags, s), s->object_size);
> +	} else {
> +		memcg_slab_alloc_error_hook(s, size, objcg);
> +	}
> +
>  	return i;
>  }
>  EXPORT_SYMBOL(kmem_cache_alloc_bulk);


* Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()
  2023-12-05 13:27   ` Chengming Zhou
@ 2023-12-06  9:58     ` Vlastimil Babka
  2023-12-06 13:01       ` Chengming Zhou
  0 siblings, 1 reply; 17+ messages in thread
From: Vlastimil Babka @ 2023-12-06  9:58 UTC (permalink / raw)
  To: Chengming Zhou, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev

On 12/5/23 14:27, Chengming Zhou wrote:
> On 2023/12/5 03:34, Vlastimil Babka wrote:
>> When freeing an object that was allocated from KFENCE, we do that in the
>> slowpath __slab_free(), relying on the fact that KFENCE "slab" cannot be
>> the cpu slab, so the fastpath has to fallback to the slowpath.
>> 
>> This optimization doesn't help much though, because is_kfence_address()
>> is checked earlier anyway during the free hook processing or detached
>> freelist building. Thus we can simplify the code by making the
>> slab_free_hook() free the KFENCE object immediately, similarly to KASAN
>> quarantine.
>> 
>> In slab_free_hook() we can place kfence_free() above init processing, as
>> callers have been making sure to set init to false for KFENCE objects.
>> This simplifies slab_free(). This places it also above kasan_slab_free()
>> which is ok as that skips KFENCE objects anyway.
>> 
>> While at it also determine the init value in slab_free_freelist_hook()
>> outside of the loop.
>> 
>> This change will also make introducing per cpu array caches easier.
>> 
>> Tested-by: Marco Elver <elver@google.com>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>>  mm/slub.c | 22 ++++++++++------------
>>  1 file changed, 10 insertions(+), 12 deletions(-)
>> 
>> diff --git a/mm/slub.c b/mm/slub.c
>> index ed2fa92e914c..e38c2b712f6c 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2039,7 +2039,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
>>   * production configuration these hooks all should produce no code at all.
>>   *
>>   * Returns true if freeing of the object can proceed, false if its reuse
>> - * was delayed by KASAN quarantine.
>> + * was delayed by KASAN quarantine, or it was returned to KFENCE.
>>   */
>>  static __always_inline
>>  bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>> @@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>>  		__kcsan_check_access(x, s->object_size,
>>  				     KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
>>  
>> +	if (kfence_free(kasan_reset_tag(x)))
> 
> I'm wondering if "kasan_reset_tag()" is needed here?

I think so, because AFAICS the is_kfence_address() check in kfence_free()
could be a false negative otherwise. In fact now I even question some of the
other is_kfence_address() checks in mm/slub.c, mainly
build_detached_freelist() which starts from pointers coming directly from
slab users. Insight from KASAN/KFENCE folks appreciated :)
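
A minimal standalone sketch of the concern, with hypothetical pool constants and simplified
stand-ins for is_kfence_address() and kasan_reset_tag() (not the kernel definitions): a
top-byte tag pushes a pointer outside a plain range check against the pool, so if a tagged
pointer to a genuine KFENCE object could ever reach the check, the tag has to be stripped
first.

#include <stdbool.h>
#include <stdint.h>

#define SKETCH_POOL_BASE 0x1000000000ULL	/* hypothetical pool base */
#define SKETCH_POOL_SIZE 0x200000ULL		/* hypothetical pool size */

/* Simplified stand-in for is_kfence_address(): a range check on the pool. */
static bool sketch_is_kfence_address(uint64_t addr)
{
	return addr - SKETCH_POOL_BASE < SKETCH_POOL_SIZE;
}

/* Simplified stand-in for kasan_reset_tag(): drop the top-byte tag. */
static uint64_t sketch_reset_tag(uint64_t addr)
{
	return addr & ~((uint64_t)0xff << 56);
}

int main(void)
{
	uint64_t obj = SKETCH_POOL_BASE + 0x100;	/* untagged KFENCE object */
	uint64_t tagged = obj | ((uint64_t)0xf2 << 56);	/* same object with a tag */

	/* Exits 0 iff the tagged pointer misses the range check but matches again
	 * once the tag is stripped. */
	return sketch_is_kfence_address(tagged) ||
	       !sketch_is_kfence_address(sketch_reset_tag(tagged)) ||
	       !sketch_is_kfence_address(obj);
}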

> The patch looks good to me!
> 
> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>

Thanks!

> Thanks.
> 
>> +		return false;
>> +
>>  	/*
>>  	 * As memory initialization might be integrated into KASAN,
>>  	 * kasan_slab_free and initialization memset's must be
>> @@ -2086,23 +2089,25 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
>>  	void *object;
>>  	void *next = *head;
>>  	void *old_tail = *tail;
>> +	bool init;
>>  
>>  	if (is_kfence_address(next)) {
>>  		slab_free_hook(s, next, false);
>> -		return true;
>> +		return false;
>>  	}
>>  
>>  	/* Head and tail of the reconstructed freelist */
>>  	*head = NULL;
>>  	*tail = NULL;
>>  
>> +	init = slab_want_init_on_free(s);
>> +
>>  	do {
>>  		object = next;
>>  		next = get_freepointer(s, object);
>>  
>>  		/* If object's reuse doesn't have to be delayed */
>> -		if (likely(slab_free_hook(s, object,
>> -					  slab_want_init_on_free(s)))) {
>> +		if (likely(slab_free_hook(s, object, init))) {
>>  			/* Move object to the new freelist */
>>  			set_freepointer(s, object, *head);
>>  			*head = object;
>> @@ -4103,9 +4108,6 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
>>  
>>  	stat(s, FREE_SLOWPATH);
>>  
>> -	if (kfence_free(head))
>> -		return;
>> -
>>  	if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
>>  		free_to_partial_list(s, slab, head, tail, cnt, addr);
>>  		return;
>> @@ -4290,13 +4292,9 @@ static __fastpath_inline
>>  void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
>>  	       unsigned long addr)
>>  {
>> -	bool init;
>> -
>>  	memcg_slab_free_hook(s, slab, &object, 1);
>>  
>> -	init = !is_kfence_address(object) && slab_want_init_on_free(s);
>> -
>> -	if (likely(slab_free_hook(s, object, init)))
>> +	if (likely(slab_free_hook(s, object, slab_want_init_on_free(s))))
>>  		do_slab_free(s, slab, object, object, 1, addr);
>>  }
>>  
>> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()
  2023-12-06  9:58     ` Vlastimil Babka
@ 2023-12-06 13:01       ` Chengming Zhou
  2023-12-06 14:44         ` Marco Elver
  0 siblings, 1 reply; 17+ messages in thread
From: Chengming Zhou @ 2023-12-06 13:01 UTC (permalink / raw)
  To: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim
  Cc: Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Marco Elver, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev

On 2023/12/6 17:58, Vlastimil Babka wrote:
> On 12/5/23 14:27, Chengming Zhou wrote:
>> On 2023/12/5 03:34, Vlastimil Babka wrote:
>>> When freeing an object that was allocated from KFENCE, we do that in the
>>> slowpath __slab_free(), relying on the fact that KFENCE "slab" cannot be
>>> the cpu slab, so the fastpath has to fall back to the slowpath.
>>>
>>> This optimization doesn't help much though, because is_kfence_address()
>>> is checked earlier anyway during the free hook processing or detached
>>> freelist building. Thus we can simplify the code by making the
>>> slab_free_hook() free the KFENCE object immediately, similarly to KASAN
>>> quarantine.
>>>
>>> In slab_free_hook() we can place kfence_free() above init processing, as
>>> callers have been making sure to set init to false for KFENCE objects.
>>> This simplifies slab_free(). This places it also above kasan_slab_free()
>>> which is ok as that skips KFENCE objects anyway.
>>>
>>> While at it also determine the init value in slab_free_freelist_hook()
>>> outside of the loop.
>>>
>>> This change will also make introducing per cpu array caches easier.
>>>
>>> Tested-by: Marco Elver <elver@google.com>
>>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>>> ---
>>>  mm/slub.c | 22 ++++++++++------------
>>>  1 file changed, 10 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index ed2fa92e914c..e38c2b712f6c 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -2039,7 +2039,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
>>>   * production configuration these hooks all should produce no code at all.
>>>   *
>>>   * Returns true if freeing of the object can proceed, false if its reuse
>>> - * was delayed by KASAN quarantine.
>>> + * was delayed by KASAN quarantine, or it was returned to KFENCE.
>>>   */
>>>  static __always_inline
>>>  bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>>> @@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>>>  		__kcsan_check_access(x, s->object_size,
>>>  				     KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
>>>  
>>> +	if (kfence_free(kasan_reset_tag(x)))
>>
>> I'm wondering if "kasan_reset_tag()" is needed here?
> 
> I think so, because AFAICS the is_kfence_address() check in kfence_free()
> could be a false negative otherwise. In fact now I even question some of the

Ok.

> other is_kfence_address() checks in mm/slub.c, mainly
> build_detached_freelist() which starts from pointers coming directly from
> slab users. Insight from KASAN/KFENCE folks appreciated :)
> 
I know very little about KASAN/KFENCE, so I'm looking forward to their insight. :)

Just saw a check in __kasan_slab_alloc():

	if (is_kfence_address(object))
		return (void *)object;

So it seems that a kfence object would be skipped by KASAN.

Thanks!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()
  2023-12-06 13:01       ` Chengming Zhou
@ 2023-12-06 14:44         ` Marco Elver
  2023-12-11 22:11           ` Andrey Konovalov
  0 siblings, 1 reply; 17+ messages in thread
From: Marco Elver @ 2023-12-06 14:44 UTC (permalink / raw)
  To: Chengming Zhou, Andrey Konovalov
  Cc: Vlastimil Babka, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Dmitry Vyukov, linux-mm, linux-kernel,
	kasan-dev

On Wed, 6 Dec 2023 at 14:02, Chengming Zhou <chengming.zhou@linux.dev> wrote:
>
> On 2023/12/6 17:58, Vlastimil Babka wrote:
> > On 12/5/23 14:27, Chengming Zhou wrote:
> >> On 2023/12/5 03:34, Vlastimil Babka wrote:
> >>> When freeing an object that was allocated from KFENCE, we do that in the
> >>> slowpath __slab_free(), relying on the fact that KFENCE "slab" cannot be
> >>> the cpu slab, so the fastpath has to fall back to the slowpath.
> >>>
> >>> This optimization doesn't help much though, because is_kfence_address()
> >>> is checked earlier anyway during the free hook processing or detached
> >>> freelist building. Thus we can simplify the code by making the
> >>> slab_free_hook() free the KFENCE object immediately, similarly to KASAN
> >>> quarantine.
> >>>
> >>> In slab_free_hook() we can place kfence_free() above init processing, as
> >>> callers have been making sure to set init to false for KFENCE objects.
> >>> This simplifies slab_free(). This places it also above kasan_slab_free()
> >>> which is ok as that skips KFENCE objects anyway.
> >>>
> >>> While at it also determine the init value in slab_free_freelist_hook()
> >>> outside of the loop.
> >>>
> >>> This change will also make introducing per cpu array caches easier.
> >>>
> >>> Tested-by: Marco Elver <elver@google.com>
> >>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> >>> ---
> >>>  mm/slub.c | 22 ++++++++++------------
> >>>  1 file changed, 10 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/mm/slub.c b/mm/slub.c
> >>> index ed2fa92e914c..e38c2b712f6c 100644
> >>> --- a/mm/slub.c
> >>> +++ b/mm/slub.c
> >>> @@ -2039,7 +2039,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
> >>>   * production configuration these hooks all should produce no code at all.
> >>>   *
> >>>   * Returns true if freeing of the object can proceed, false if its reuse
> >>> - * was delayed by KASAN quarantine.
> >>> + * was delayed by KASAN quarantine, or it was returned to KFENCE.
> >>>   */
> >>>  static __always_inline
> >>>  bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
> >>> @@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
> >>>             __kcsan_check_access(x, s->object_size,
> >>>                                  KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
> >>>
> >>> +   if (kfence_free(kasan_reset_tag(x)))
> >>
> >> I'm wondering if "kasan_reset_tag()" is needed here?
> >
> > I think so, because AFAICS the is_kfence_address() check in kfence_free()
> > could be a false negative otherwise. In fact now I even question some of the
>
> Ok.
>
> > other is_kfence_address() checks in mm/slub.c, mainly
> > build_detached_freelist() which starts from pointers coming directly from
> > slab users. Insight from KASAN/KFENCE folks appreciated :)
> >
> I know very little about KASAN/KFENCE, so I'm looking forward to their insight. :)
>
> Just saw a check in __kasan_slab_alloc():
>
>         if (is_kfence_address(object))
>                 return (void *)object;
>
> So it seems that a kfence object would be skipped by KASAN.

The is_kfence_address() implementation tolerates tagged addresses,
i.e. if it receives a tagged non-kfence address, it will never return
true.
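
A tiny standalone check of that property under an assumed pool layout (the constants and
the simplified range check below are not the kernel's): a non-zero top-byte tag moves any
address far outside the pool, so the check cannot return a false positive for a tagged
non-KFENCE pointer.

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_POOL_BASE 0x1000000000ULL	/* hypothetical pool base */
#define SKETCH_POOL_SIZE 0x200000ULL		/* hypothetical pool size */

/* Simplified stand-in for is_kfence_address(): a range check on the pool. */
static bool sketch_is_kfence_address(uint64_t addr)
{
	return addr - SKETCH_POOL_BASE < SKETCH_POOL_SIZE;
}

int main(void)
{
	uint64_t heap_obj = 0xdead0000ULL;			/* non-KFENCE address */
	uint64_t tagged = heap_obj | ((uint64_t)0xf2 << 56);	/* same address, tagged */

	assert(!sketch_is_kfence_address(heap_obj));
	assert(!sketch_is_kfence_address(tagged));
	return 0;
}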

The KASAN_HW_TAGS patches and KFENCE patches were in development
concurrently, and at the time there was some conflict resolution that
happened when both were merged. The
is_kfence_address(kasan_reset_tag(..)) initially came from [1] but was
squashed into 2b8305260fb.

[1] https://lore.kernel.org/all/9dc196006921b191d25d10f6e611316db7da2efc.1611946152.git.andreyknvl@google.com/

Andrey, do you recall what issue you encountered that needed kasan_reset_tag()?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()
  2023-12-06 14:44         ` Marco Elver
@ 2023-12-11 22:11           ` Andrey Konovalov
  2023-12-12 11:42             ` Vlastimil Babka
  0 siblings, 1 reply; 17+ messages in thread
From: Andrey Konovalov @ 2023-12-11 22:11 UTC (permalink / raw)
  To: Marco Elver
  Cc: Chengming Zhou, Vlastimil Babka, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
	Hyeonggon Yoo, Alexander Potapenko, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev

On Wed, Dec 6, 2023 at 3:45 PM Marco Elver <elver@google.com> wrote:
>
> The is_kfence_address() implementation tolerates tagged addresses,
> i.e. if it receives a tagged non-kfence address, it will never return
> true.
>
> The KASAN_HW_TAGS patches and KFENCE patches were in development
> concurrently, and at the time there was some conflict resolution that
> happened when both were merged. The
> is_kfence_address(kasan_reset_tag(..)) initially came from [1] but was
> squashed into 2b8305260fb.
>
> [1] https://lore.kernel.org/all/9dc196006921b191d25d10f6e611316db7da2efc.1611946152.git.andreyknvl@google.com/
>
> Andrey, do you recall what issue you encountered that needed kasan_reset_tag()?

I don't remember at this point, but this could have been just a safety measure.

If is_kfence_address tolerates tagged addresses, we should be able to
drop these kasan_reset_tag calls.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()
  2023-12-11 22:11           ` Andrey Konovalov
@ 2023-12-12 11:42             ` Vlastimil Babka
  2023-12-20 23:44               ` Andrey Konovalov
  0 siblings, 1 reply; 17+ messages in thread
From: Vlastimil Babka @ 2023-12-12 11:42 UTC (permalink / raw)
  To: Andrey Konovalov, Marco Elver
  Cc: Chengming Zhou, Christoph Lameter, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, Roman Gushchin, Hyeonggon Yoo,
	Alexander Potapenko, Dmitry Vyukov, linux-mm, linux-kernel,
	kasan-dev

On 12/11/23 23:11, Andrey Konovalov wrote:
> On Wed, Dec 6, 2023 at 3:45 PM Marco Elver <elver@google.com> wrote:
>>
>> The is_kfence_address() implementation tolerates tagged addresses,
>> i.e. if it receives a tagged non-kfence address, it will never return
>> true.

So just to be sure: it can't happen that a genuine kfence address would
become KASAN-tagged and handed out, so that a later is_kfence_address()
check on it would be a false negative?

>> The KASAN_HW_TAGS patches and KFENCE patches were in development
>> concurrently, and at the time there was some conflict resolution that
>> happened when both were merged. The
>> is_kfence_address(kasan_reset_tag(..)) initially came from [1] but was
>> squashed into 2b8305260fb.
>>
>> [1] https://lore.kernel.org/all/9dc196006921b191d25d10f6e611316db7da2efc.1611946152.git.andreyknvl@google.com/
>>
>> Andrey, do you recall what issue you encountered that needed kasan_reset_tag()?
> 
> I don't remember at this point, but this could have been just a safety measure.
> 
> If is_kfence_address tolerates tagged addresses, we should be able to
> drop these kasan_reset_tag calls.

Will drop it once the above is confirmed. Thanks!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook()
  2023-12-12 11:42             ` Vlastimil Babka
@ 2023-12-20 23:44               ` Andrey Konovalov
  0 siblings, 0 replies; 17+ messages in thread
From: Andrey Konovalov @ 2023-12-20 23:44 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Marco Elver, Chengming Zhou, Christoph Lameter, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin,
	Hyeonggon Yoo, Alexander Potapenko, Dmitry Vyukov, linux-mm,
	linux-kernel, kasan-dev

On Tue, Dec 12, 2023 at 12:42 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> On 12/11/23 23:11, Andrey Konovalov wrote:
> > On Wed, Dec 6, 2023 at 3:45 PM Marco Elver <elver@google.com> wrote:
> >>
> >> The is_kfence_address() implementation tolerates tagged addresses,
> >> i.e. if it receives a tagged non-kfence address, it will never return
> >> true.
>
> So just to be sure: it can't happen that a genuine kfence address would
> become KASAN-tagged and handed out, so that a later is_kfence_address()
> check on it would be a false negative?

No, this should not happen. KFENCE objects never get tags assigned to them.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread

Thread overview: 17+ messages
2023-12-04 19:34 [PATCH 0/4] SLUB: cleanup hook processing Vlastimil Babka
2023-12-04 19:34 ` [PATCH 1/4] mm/slub: fix bulk alloc and free stats Vlastimil Babka
2023-12-05  8:11   ` Chengming Zhou
2023-12-04 19:34 ` [PATCH 2/4] mm/slub: introduce __kmem_cache_free_bulk() without free hooks Vlastimil Babka
2023-12-05  8:19   ` Chengming Zhou
2023-12-05 19:57     ` Vlastimil Babka
2023-12-06  0:31       ` Chengming Zhou
2023-12-04 19:34 ` [PATCH 3/4] mm/slub: handle bulk and single object freeing separately Vlastimil Babka
2023-12-05 13:23   ` Chengming Zhou
2023-12-04 19:34 ` [PATCH 4/4] mm/slub: free KFENCE objects in slab_free_hook() Vlastimil Babka
2023-12-05 13:27   ` Chengming Zhou
2023-12-06  9:58     ` Vlastimil Babka
2023-12-06 13:01       ` Chengming Zhou
2023-12-06 14:44         ` Marco Elver
2023-12-11 22:11           ` Andrey Konovalov
2023-12-12 11:42             ` Vlastimil Babka
2023-12-20 23:44               ` Andrey Konovalov
