* [PATCH 00/10] MM: More bulk API work
@ 2016-01-07 14:03 Jesper Dangaard Brouer
  2016-01-07 14:03 ` [PATCH 01/10] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk Jesper Dangaard Brouer
                   ` (11 more replies)
  0 siblings, 12 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 14:03 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

This series contains three aspects:
 1. cleanup and code sharing between SLUB and SLAB
 2. implementing accelerated bulk API for SLAB allocator
 3. new API kfree_bulk()

Reviewers, please review the changed order of debug calls in the SLAB
allocator; they are reordered to match what the SLUB allocator does.

Patchset is based on top of Linus' tree at commit ee9a7d2cb0cf1.
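
For readers new to the bulk API, here is a minimal caller-side sketch
of the calls this series extends (the cache variable and array size
below are made up for illustration):

   /* Illustrative sketch only: "my_cache" is a made-up example cache,
    * not something introduced by this series.
    */
   void *objs[16];
   int allocated;

   allocated = kmem_cache_alloc_bulk(my_cache, GFP_KERNEL,
                                     ARRAY_SIZE(objs), objs);
   if (!allocated)
           return -ENOMEM;  /* 0 means no objects were allocated */

   /* ... use objs[0] .. objs[allocated - 1] ... */

   kmem_cache_free_bulk(my_cache, allocated, objs);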

---

Jesper Dangaard Brouer (10):
      slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk
      mm/slab: move SLUB alloc hooks to common mm/slab.h
      mm: fault-inject take over bootstrap kmem_cache check
      slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB
      mm: kmemcheck skip object if slab allocation failed
      slab: use slab_post_alloc_hook in SLAB allocator shared with SLUB
      slab: implement bulk alloc in SLAB allocator
      slab: avoid running debug SLAB code with IRQs disabled for alloc_bulk
      slab: implement bulk free in SLAB allocator
      mm: new API kfree_bulk() for SLAB+SLUB allocators


 include/linux/fault-inject.h |    5 +-
 include/linux/slab.h         |    8 +++
 mm/failslab.c                |   11 +++-
 mm/kmemcheck.c               |    3 +
 mm/slab.c                    |  121 +++++++++++++++++++++++++++---------------
 mm/slab.h                    |   62 ++++++++++++++++++++++
 mm/slab_common.c             |    8 ++-
 mm/slub.c                    |   92 +++++++++-----------------------
 8 files changed, 194 insertions(+), 116 deletions(-)

--


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 01/10] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
@ 2016-01-07 14:03 ` Jesper Dangaard Brouer
  2016-01-07 15:54   ` Christoph Lameter
  2016-01-08  2:58   ` Joonsoo Kim
  2016-01-07 14:03 ` [PATCH 02/10] mm/slab: move SLUB alloc hooks to common mm/slab.h Jesper Dangaard Brouer
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 14:03 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

This change is primarily an attempt to make it easier to realize the
optimizations the compiler performs in case CONFIG_MEMCG_KMEM is not
enabled.

Performance-wise, even when CONFIG_MEMCG_KMEM is compiled in, the
overhead is zero.  This is because, as long as no process has enabled
kmem cgroup accounting, the assignment is replaced by asm-NOP
operations.  This is possible because memcg_kmem_enabled() uses a
static_key_false() construct.
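
For readers unfamiliar with the static-key mechanism, the pattern is
roughly the following (a sketch of the idea only, not the exact
memcontrol.h code; memcg_lookup_cache() is a hypothetical name):

   /* static_key_false() compiles to a NOP that is live-patched into a
    * jump only when the key is enabled at runtime, so the memcg lookup
    * costs nothing while kmem accounting is unused.
    */
   static inline struct kmem_cache *
   example_cache_from_obj(struct kmem_cache *s, void *x)
   {
           if (!memcg_kmem_enabled())       /* static_key_false() inside */
                   return s;                /* common case, branch is a NOP */

           return memcg_lookup_cache(s, x); /* hypothetical helper */
   }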

It also helps readability, as it avoids accessing the p[] array as
p[size - 1], which "exposes" that the array is processed backwards
inside the helper function build_detached_freelist().

Lastly, this also makes the code more robust in error cases, such as
NULL pointers in the array, which were handled before commit
033745189b1b ("slub: add missing kmem cgroup support to
kmem_cache_free_bulk").

Fixes: 033745189b1b ("slub: add missing kmem cgroup support to kmem_cache_free_bulk")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slub.c |   20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 46997517406e..0538e45e1964 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2833,8 +2833,9 @@ struct detached_freelist {
  * synchronization primitive.  Look ahead in the array is limited due
  * to performance reasons.
  */
-static int build_detached_freelist(struct kmem_cache *s, size_t size,
-				   void **p, struct detached_freelist *df)
+static inline
+int build_detached_freelist(struct kmem_cache **s, size_t size,
+			    void **p, struct detached_freelist *df)
 {
 	size_t first_skipped_index = 0;
 	int lookahead = 3;
@@ -2850,8 +2851,11 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
 	if (!object)
 		return 0;
 
+	/* Support for memcg, compiler can optimize this out */
+	*s = cache_from_obj(*s, object);
+
 	/* Start new detached freelist */
-	set_freepointer(s, object, NULL);
+	set_freepointer(*s, object, NULL);
 	df->page = virt_to_head_page(object);
 	df->tail = object;
 	df->freelist = object;
@@ -2866,7 +2870,7 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
 		/* df->page is always set at this point */
 		if (df->page == virt_to_head_page(object)) {
 			/* Opportunity build freelist */
-			set_freepointer(s, object, df->freelist);
+			set_freepointer(*s, object, df->freelist);
 			df->freelist = object;
 			df->cnt++;
 			p[size] = NULL; /* mark object processed */
@@ -2885,7 +2889,6 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
 	return first_skipped_index;
 }
 
-
 /* Note that interrupts must be enabled when calling this function. */
 void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
 {
@@ -2894,12 +2897,9 @@ void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
 
 	do {
 		struct detached_freelist df;
-		struct kmem_cache *s;
-
-		/* Support for memcg */
-		s = cache_from_obj(orig_s, p[size - 1]);
+		struct kmem_cache *s = orig_s;
 
-		size = build_detached_freelist(s, size, p, &df);
+		size = build_detached_freelist(&s, size, p, &df);
 		if (unlikely(!df.page))
 			continue;
 


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 02/10] mm/slab: move SLUB alloc hooks to common mm/slab.h
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
  2016-01-07 14:03 ` [PATCH 01/10] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk Jesper Dangaard Brouer
@ 2016-01-07 14:03 ` Jesper Dangaard Brouer
  2016-01-07 14:03 ` [PATCH 03/10] mm: fault-inject take over bootstrap kmem_cache check Jesper Dangaard Brouer
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 14:03 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

First step towards sharing alloc_hooks between the SLUB and SLAB
allocators.  Move the SLUB allocator's *_alloc_hook functions to the
common mm/slab.h used for internal slab definitions.
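
The intended call pattern, once both allocators use the shared hooks,
is roughly the following (a sketch that omits IRQ handling, NUMA
fallback and debug checks; internal_fast_path_alloc() is a placeholder
name):

   static void *example_slab_alloc(struct kmem_cache *s, gfp_t flags)
   {
           void *obj;

           s = slab_pre_alloc_hook(s, flags); /* gfp mask, failslab, memcg */
           if (!s)
                   return NULL;

           obj = internal_fast_path_alloc(s, flags); /* placeholder */

           /* kmemcheck, kmemleak and kasan hooks in one place */
           slab_post_alloc_hook(s, flags, 1, &obj);
           return obj;
   }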

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slab.h |   62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/slub.c |   54 -----------------------------------------------------
 2 files changed, 62 insertions(+), 54 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 7b6087197997..92b10da2c71f 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -38,6 +38,10 @@ struct kmem_cache {
 #endif
 
 #include <linux/memcontrol.h>
+#include <linux/fault-inject.h>
+#include <linux/kmemcheck.h>
+#include <linux/kasan.h>
+#include <linux/kmemleak.h>
 
 /*
  * State of the slab allocator.
@@ -319,6 +323,64 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
 	return s;
 }
 
+static inline size_t slab_ksize(const struct kmem_cache *s)
+{
+#ifndef CONFIG_SLUB
+	return s->object_size;
+
+#else /* CONFIG_SLUB */
+# ifdef CONFIG_SLUB_DEBUG
+	/*
+	 * Debugging requires use of the padding between object
+	 * and whatever may come after it.
+	 */
+	if (s->flags & (SLAB_RED_ZONE | SLAB_POISON))
+		return s->object_size;
+# endif
+	/*
+	 * If we have the need to store the freelist pointer
+	 * back there or track user information then we can
+	 * only use the space before that information.
+	 */
+	if (s->flags & (SLAB_DESTROY_BY_RCU | SLAB_STORE_USER))
+		return s->inuse;
+	/*
+	 * Else we can use all the padding etc for the allocation
+	 */
+	return s->size;
+#endif
+}
+
+static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
+						     gfp_t flags)
+{
+	flags &= gfp_allowed_mask;
+	lockdep_trace_alloc(flags);
+	might_sleep_if(gfpflags_allow_blocking(flags));
+
+	if (should_failslab(s->object_size, flags, s->flags))
+		return NULL;
+
+	return memcg_kmem_get_cache(s, flags);
+}
+
+static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
+					size_t size, void **p)
+{
+	size_t i;
+
+	flags &= gfp_allowed_mask;
+	for (i = 0; i < size; i++) {
+		void *object = p[i];
+
+		kmemcheck_slab_alloc(s, flags, object, slab_ksize(s));
+		kmemleak_alloc_recursive(object, s->object_size, 1,
+					 s->flags, flags);
+		kasan_slab_alloc(s, object);
+	}
+	memcg_kmem_put_cache(s);
+}
+
 #ifndef CONFIG_SLOB
 /*
  * The slab lists for all objects.
diff --git a/mm/slub.c b/mm/slub.c
index 0538e45e1964..3697f216d7c7 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -284,30 +284,6 @@ static inline int slab_index(void *p, struct kmem_cache *s, void *addr)
 	return (p - addr) / s->size;
 }
 
-static inline size_t slab_ksize(const struct kmem_cache *s)
-{
-#ifdef CONFIG_SLUB_DEBUG
-	/*
-	 * Debugging requires use of the padding between object
-	 * and whatever may come after it.
-	 */
-	if (s->flags & (SLAB_RED_ZONE | SLAB_POISON))
-		return s->object_size;
-
-#endif
-	/*
-	 * If we have the need to store the freelist pointer
-	 * back there or track user information then we can
-	 * only use the space before that information.
-	 */
-	if (s->flags & (SLAB_DESTROY_BY_RCU | SLAB_STORE_USER))
-		return s->inuse;
-	/*
-	 * Else we can use all the padding etc for the allocation
-	 */
-	return s->size;
-}
-
 static inline int order_objects(int order, unsigned long size, int reserved)
 {
 	return ((PAGE_SIZE << order) - reserved) / size;
@@ -1279,36 +1255,6 @@ static inline void kfree_hook(const void *x)
 	kasan_kfree_large(x);
 }
 
-static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
-						     gfp_t flags)
-{
-	flags &= gfp_allowed_mask;
-	lockdep_trace_alloc(flags);
-	might_sleep_if(gfpflags_allow_blocking(flags));
-
-	if (should_failslab(s->object_size, flags, s->flags))
-		return NULL;
-
-	return memcg_kmem_get_cache(s, flags);
-}
-
-static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
-					size_t size, void **p)
-{
-	size_t i;
-
-	flags &= gfp_allowed_mask;
-	for (i = 0; i < size; i++) {
-		void *object = p[i];
-
-		kmemcheck_slab_alloc(s, flags, object, slab_ksize(s));
-		kmemleak_alloc_recursive(object, s->object_size, 1,
-					 s->flags, flags);
-		kasan_slab_alloc(s, object);
-	}
-	memcg_kmem_put_cache(s);
-}
-
 static inline void slab_free_hook(struct kmem_cache *s, void *x)
 {
 	kmemleak_free_recursive(x, s->flags);


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 03/10] mm: fault-inject take over bootstrap kmem_cache check
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
  2016-01-07 14:03 ` [PATCH 01/10] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk Jesper Dangaard Brouer
  2016-01-07 14:03 ` [PATCH 02/10] mm/slab: move SLUB alloc hooks to common mm/slab.h Jesper Dangaard Brouer
@ 2016-01-07 14:03 ` Jesper Dangaard Brouer
  2016-01-07 14:03 ` [PATCH 04/10] slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 14:03 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

Remove the SLAB-specific function slab_should_failslab() by moving the
fault-injection check for the bootstrap slab into the shared function
should_failslab() (used by both SLAB and SLUB).

This is a step towards sharing alloc_hooks between SLUB and SLAB.

The bootstrap slab "kmem_cache" is used for allocating struct
kmem_cache objects for the allocator itself.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/linux/fault-inject.h |    5 ++---
 mm/failslab.c                |   11 ++++++++---
 mm/slab.c                    |   12 ++----------
 mm/slab.h                    |    2 +-
 4 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h
index 3159a7dba034..9f4956d8601c 100644
--- a/include/linux/fault-inject.h
+++ b/include/linux/fault-inject.h
@@ -62,10 +62,9 @@ static inline struct dentry *fault_create_debugfs_attr(const char *name,
 #endif /* CONFIG_FAULT_INJECTION */
 
 #ifdef CONFIG_FAILSLAB
-extern bool should_failslab(size_t size, gfp_t gfpflags, unsigned long flags);
+extern bool should_failslab(struct kmem_cache *s, gfp_t gfpflags);
 #else
-static inline bool should_failslab(size_t size, gfp_t gfpflags,
-				unsigned long flags)
+static inline bool should_failslab(struct kmem_cache *s, gfp_t gfpflags)
 {
 	return false;
 }
diff --git a/mm/failslab.c b/mm/failslab.c
index 79171b4a5826..0c5b3f31f310 100644
--- a/mm/failslab.c
+++ b/mm/failslab.c
@@ -1,5 +1,6 @@
 #include <linux/fault-inject.h>
 #include <linux/slab.h>
+#include "slab.h"
 
 static struct {
 	struct fault_attr attr;
@@ -11,18 +12,22 @@ static struct {
 	.cache_filter = false,
 };
 
-bool should_failslab(size_t size, gfp_t gfpflags, unsigned long cache_flags)
+bool should_failslab(struct kmem_cache *s, gfp_t gfpflags)
 {
+	/* No fault-injection for bootstrap cache */
+	if (unlikely(s == kmem_cache))
+		return false;
+
 	if (gfpflags & __GFP_NOFAIL)
 		return false;
 
 	if (failslab.ignore_gfp_reclaim && (gfpflags & __GFP_RECLAIM))
 		return false;
 
-	if (failslab.cache_filter && !(cache_flags & SLAB_FAILSLAB))
+	if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
 		return false;
 
-	return should_fail(&failslab.attr, size);
+	return should_fail(&failslab.attr, s->object_size);
 }
 
 static int __init setup_failslab(char *str)
diff --git a/mm/slab.c b/mm/slab.c
index 4765c97ce690..d5b29e7bee81 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2917,14 +2917,6 @@ static void *cache_alloc_debugcheck_after(struct kmem_cache *cachep,
 #define cache_alloc_debugcheck_after(a,b,objp,d) (objp)
 #endif
 
-static bool slab_should_failslab(struct kmem_cache *cachep, gfp_t flags)
-{
-	if (unlikely(cachep == kmem_cache))
-		return false;
-
-	return should_failslab(cachep->object_size, flags, cachep->flags);
-}
-
 static inline void *____cache_alloc(struct kmem_cache *cachep, gfp_t flags)
 {
 	void *objp;
@@ -3152,7 +3144,7 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
 
 	lockdep_trace_alloc(flags);
 
-	if (slab_should_failslab(cachep, flags))
+	if (should_failslab(cachep, flags))
 		return NULL;
 
 	cachep = memcg_kmem_get_cache(cachep, flags);
@@ -3240,7 +3232,7 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller)
 
 	lockdep_trace_alloc(flags);
 
-	if (slab_should_failslab(cachep, flags))
+	if (should_failslab(cachep, flags))
 		return NULL;
 
 	cachep = memcg_kmem_get_cache(cachep, flags);
diff --git a/mm/slab.h b/mm/slab.h
index 92b10da2c71f..343ee496c53b 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -358,7 +358,7 @@ static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
 	lockdep_trace_alloc(flags);
 	might_sleep_if(gfpflags_allow_blocking(flags));
 
-	if (should_failslab(s->object_size, flags, s->flags))
+	if (should_failslab(s, flags))
 		return NULL;
 
 	return memcg_kmem_get_cache(s, flags);


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 04/10] slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
                   ` (2 preceding siblings ...)
  2016-01-07 14:03 ` [PATCH 03/10] mm: fault-inject take over bootstrap kmem_cache check Jesper Dangaard Brouer
@ 2016-01-07 14:03 ` Jesper Dangaard Brouer
  2016-01-08  3:05   ` Joonsoo Kim
  2016-01-07 14:03 ` [PATCH 05/10] mm: kmemcheck skip object if slab allocation failed Jesper Dangaard Brouer
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 14:03 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

Deduplicate code in the SLAB allocator functions slab_alloc() and
slab_alloc_node() by using the slab_pre_alloc_hook() call, which
is now shared between SLUB and SLAB.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slab.c |   18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index d5b29e7bee81..17fd6268ad41 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3140,15 +3140,10 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
 	void *ptr;
 	int slab_node = numa_mem_id();
 
-	flags &= gfp_allowed_mask;
-
-	lockdep_trace_alloc(flags);
-
-	if (should_failslab(cachep, flags))
+	cachep = slab_pre_alloc_hook(cachep, flags);
+	if (!cachep)
 		return NULL;
 
-	cachep = memcg_kmem_get_cache(cachep, flags);
-
 	cache_alloc_debugcheck_before(cachep, flags);
 	local_irq_save(save_flags);
 
@@ -3228,15 +3223,10 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller)
 	unsigned long save_flags;
 	void *objp;
 
-	flags &= gfp_allowed_mask;
-
-	lockdep_trace_alloc(flags);
-
-	if (should_failslab(cachep, flags))
+	cachep = slab_pre_alloc_hook(cachep, flags);
+	if (!cachep)
 		return NULL;
 
-	cachep = memcg_kmem_get_cache(cachep, flags);
-
 	cache_alloc_debugcheck_before(cachep, flags);
 	local_irq_save(save_flags);
 	objp = __do_cache_alloc(cachep, flags);


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 05/10] mm: kmemcheck skip object if slab allocation failed
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
                   ` (3 preceding siblings ...)
  2016-01-07 14:03 ` [PATCH 04/10] slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
@ 2016-01-07 14:03 ` Jesper Dangaard Brouer
  2016-01-07 14:04 ` [PATCH 06/10] slab: use slab_post_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 14:03 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

In the SLAB allocator, kmemcheck_slab_alloc() is guarded against being
called with a NULL object.  In the SLUB allocator this NULL pointer
invocation can happen, which seems like an oversight.

Move the NULL pointer check into the kmemcheck code
(kmemcheck_slab_alloc), so that the check disappears from the fastpath
when the kernel is not compiled with CONFIG_KMEMCHECK.

This is a step towards sharing post_alloc_hook between SLUB and
SLAB, because slab_post_alloc_hook() does not perform this check
before calling kmemcheck_slab_alloc().

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/kmemcheck.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/kmemcheck.c b/mm/kmemcheck.c
index cab58bb592d8..6f4f424037c0 100644
--- a/mm/kmemcheck.c
+++ b/mm/kmemcheck.c
@@ -60,6 +60,9 @@ void kmemcheck_free_shadow(struct page *page, int order)
 void kmemcheck_slab_alloc(struct kmem_cache *s, gfp_t gfpflags, void *object,
 			  size_t size)
 {
+	if (unlikely(!object)) /* Skip object if allocation failed */
+		return;
+
 	/*
 	 * Has already been memset(), which initializes the shadow for us
 	 * as well.


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 06/10] slab: use slab_post_alloc_hook in SLAB allocator shared with SLUB
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
                   ` (4 preceding siblings ...)
  2016-01-07 14:03 ` [PATCH 05/10] mm: kmemcheck skip object if slab allocation failed Jesper Dangaard Brouer
@ 2016-01-07 14:04 ` Jesper Dangaard Brouer
  2016-01-07 14:04 ` [PATCH 07/10] slab: implement bulk alloc in SLAB allocator Jesper Dangaard Brouer
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 14:04 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

Reviewers should notice that the order of kmemcheck_slab_alloc() and
kmemleak_alloc_recursive() in slab_post_alloc_hook() is swapped
compared to slab.c / the SLAB allocator.

Also notice that the memset now occurs before calling
kmemcheck_slab_alloc() and kmemleak_alloc_recursive().

I assume this reordering of kmemcheck, kmemleak and memset is okay,
because this is the order in which they are used by the SLUB allocator.

This patch completes the sharing of alloc_hooks between SLUB and SLAB.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slab.c |   22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 17fd6268ad41..47e7bcab8c3b 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3172,16 +3172,11 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
   out:
 	local_irq_restore(save_flags);
 	ptr = cache_alloc_debugcheck_after(cachep, flags, ptr, caller);
-	kmemleak_alloc_recursive(ptr, cachep->object_size, 1, cachep->flags,
-				 flags);
 
-	if (likely(ptr)) {
-		kmemcheck_slab_alloc(cachep, flags, ptr, cachep->object_size);
-		if (unlikely(flags & __GFP_ZERO))
-			memset(ptr, 0, cachep->object_size);
-	}
+	if (unlikely(flags & __GFP_ZERO) && ptr)
+		memset(ptr, 0, cachep->object_size);
 
-	memcg_kmem_put_cache(cachep);
+	slab_post_alloc_hook(cachep, flags, 1, &ptr);
 	return ptr;
 }
 
@@ -3232,17 +3227,12 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller)
 	objp = __do_cache_alloc(cachep, flags);
 	local_irq_restore(save_flags);
 	objp = cache_alloc_debugcheck_after(cachep, flags, objp, caller);
-	kmemleak_alloc_recursive(objp, cachep->object_size, 1, cachep->flags,
-				 flags);
 	prefetchw(objp);
 
-	if (likely(objp)) {
-		kmemcheck_slab_alloc(cachep, flags, objp, cachep->object_size);
-		if (unlikely(flags & __GFP_ZERO))
-			memset(objp, 0, cachep->object_size);
-	}
+	if (unlikely(flags & __GFP_ZERO) && objp)
+		memset(objp, 0, cachep->object_size);
 
-	memcg_kmem_put_cache(cachep);
+	slab_post_alloc_hook(cachep, flags, 1, &objp);
 	return objp;
 }
 


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 07/10] slab: implement bulk alloc in SLAB allocator
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
                   ` (5 preceding siblings ...)
  2016-01-07 14:04 ` [PATCH 06/10] slab: use slab_post_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
@ 2016-01-07 14:04 ` Jesper Dangaard Brouer
  2016-01-07 14:04 ` [PATCH 08/10] slab: avoid running debug SLAB code with IRQs disabled for alloc_bulk Jesper Dangaard Brouer
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 14:04 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

This patch implements the alloc side of the bulk API for the SLAB
allocator.

Further optimizations are still possible by changing the call to
__do_cache_alloc() into something that can return multiple objects.
That optimization is left for later, given that the end results
already show a speedup in the area of 80%.
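
For callers, the contract implemented here is that the return value is
either the full size or 0; a hedged sketch of handling it (the cache
name is made up):

   /* On failure kmem_cache_alloc_bulk() frees any partially allocated
    * objects itself and returns 0, so callers only check for zero.
    */
   void *objs[32];

   if (!kmem_cache_alloc_bulk(my_cache, GFP_KERNEL, ARRAY_SIZE(objs), objs))
           pr_warn("bulk alloc failed; fall back to single allocs\n");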

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slab.c |   37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 47e7bcab8c3b..70be9235e083 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3392,9 +3392,42 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
 EXPORT_SYMBOL(kmem_cache_free_bulk);
 
 int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
-								void **p)
+			  void **p)
 {
-	return __kmem_cache_alloc_bulk(s, flags, size, p);
+	size_t i;
+
+	s = slab_pre_alloc_hook(s, flags);
+	if (!s)
+		return 0;
+
+	cache_alloc_debugcheck_before(s, flags);
+
+	local_irq_disable();
+	for (i = 0; i < size; i++) {
+		void *objp = __do_cache_alloc(s, flags);
+
+		/* this call could be done outside IRQ disabled section */
+		objp = cache_alloc_debugcheck_after(s, flags, objp, _RET_IP_);
+
+		if (unlikely(!objp))
+			goto error;
+		p[i] = objp;
+	}
+	local_irq_enable();
+
+	/* Clear memory outside IRQ disabled section */
+	if (unlikely(flags & __GFP_ZERO))
+		for (i = 0; i < size; i++)
+			memset(p[i], 0, s->object_size);
+
+	slab_post_alloc_hook(s, flags, size, p);
+	/* FIXME: Trace call missing. Christoph would like a bulk variant */
+	return size;
+error:
+	local_irq_enable();
+	slab_post_alloc_hook(s, flags, i, p);
+	__kmem_cache_free_bulk(s, i, p);
+	return 0;
 }
 EXPORT_SYMBOL(kmem_cache_alloc_bulk);
 


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 08/10] slab: avoid running debug SLAB code with IRQs disabled for alloc_bulk
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
                   ` (6 preceding siblings ...)
  2016-01-07 14:04 ` [PATCH 07/10] slab: implement bulk alloc in SLAB allocator Jesper Dangaard Brouer
@ 2016-01-07 14:04 ` Jesper Dangaard Brouer
  2016-01-07 14:04 ` [PATCH 09/10] slab: implement bulk free in SLAB allocator Jesper Dangaard Brouer
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 14:04 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

Move the call to cache_alloc_debugcheck_after() outside the
IRQ-disabled section in kmem_cache_alloc_bulk().

When CONFIG_DEBUG_SLAB is disabled, the compiler should remove this
code.
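
This works because, in a non-debug build, the hook is already a
pass-through macro in mm/slab.c, so the new bulk wrapper degenerates
into an empty loop the compiler can drop:

   /* mm/slab.c, !CONFIG_DEBUG_SLAB case */
   #define cache_alloc_debugcheck_after(a, b, objp, d) (objp)

   /* => cache_alloc_debugcheck_after_bulk() reduces to p[i] = p[i] */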

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slab.c |   16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 70be9235e083..33218af6a731 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3391,6 +3391,16 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
 }
 EXPORT_SYMBOL(kmem_cache_free_bulk);
 
+static __always_inline void
+cache_alloc_debugcheck_after_bulk(struct kmem_cache *s, gfp_t flags,
+				  size_t size, void **p, unsigned long caller)
+{
+	size_t i;
+
+	for (i = 0; i < size; i++)
+		p[i] = cache_alloc_debugcheck_after(s, flags, p[i], caller);
+}
+
 int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 			  void **p)
 {
@@ -3406,15 +3416,14 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 	for (i = 0; i < size; i++) {
 		void *objp = __do_cache_alloc(s, flags);
 
-		/* this call could be done outside IRQ disabled section */
-		objp = cache_alloc_debugcheck_after(s, flags, objp, _RET_IP_);
-
 		if (unlikely(!objp))
 			goto error;
 		p[i] = objp;
 	}
 	local_irq_enable();
 
+	cache_alloc_debugcheck_after_bulk(s, flags, size, p, _RET_IP_);
+
 	/* Clear memory outside IRQ disabled section */
 	if (unlikely(flags & __GFP_ZERO))
 		for (i = 0; i < size; i++)
@@ -3425,6 +3434,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 	return size;
 error:
 	local_irq_enable();
+	cache_alloc_debugcheck_after_bulk(s, flags, i, p, _RET_IP_);
 	slab_post_alloc_hook(s, flags, i, p);
 	__kmem_cache_free_bulk(s, i, p);
 	return 0;


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 09/10] slab: implement bulk free in SLAB allocator
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
                   ` (7 preceding siblings ...)
  2016-01-07 14:04 ` [PATCH 08/10] slab: avoid running debug SLAB code with IRQs disabled for alloc_bulk Jesper Dangaard Brouer
@ 2016-01-07 14:04 ` Jesper Dangaard Brouer
  2016-01-07 14:04 ` [PATCH 10/10] mm: new API kfree_bulk() for SLAB+SLUB allocators Jesper Dangaard Brouer
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 14:04 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

This patch implements the free side of the bulk API for the SLAB
allocator, kmem_cache_free_bulk(), and concludes the implementation of
the optimized bulk API for the SLAB allocator.

Benchmarked[1] cost of alloc+free (obj size 256 bytes) on CPU
i7-4790K @ 4.00GHz, with no debug options, no PREEMPT and
CONFIG_MEMCG_KMEM=y but no active user of kmemcg.

SLAB single alloc+free cost: 87 cycles(tsc) 21.814 ns with this
optimized config.

bulk- Current fallback          - optimized SLAB bulk
  1 - 102 cycles(tsc) 25.747 ns - 41 cycles(tsc) 10.490 ns - improved 59.8%
  2 -  94 cycles(tsc) 23.546 ns - 26 cycles(tsc)  6.567 ns - improved 72.3%
  3 -  92 cycles(tsc) 23.127 ns - 20 cycles(tsc)  5.244 ns - improved 78.3%
  4 -  90 cycles(tsc) 22.663 ns - 18 cycles(tsc)  4.588 ns - improved 80.0%
  8 -  88 cycles(tsc) 22.242 ns - 14 cycles(tsc)  3.656 ns - improved 84.1%
 16 -  88 cycles(tsc) 22.010 ns - 13 cycles(tsc)  3.480 ns - improved 85.2%
 30 -  89 cycles(tsc) 22.305 ns - 13 cycles(tsc)  3.303 ns - improved 85.4%
 32 -  89 cycles(tsc) 22.277 ns - 13 cycles(tsc)  3.309 ns - improved 85.4%
 34 -  88 cycles(tsc) 22.246 ns - 13 cycles(tsc)  3.294 ns - improved 85.2%
 48 -  88 cycles(tsc) 22.121 ns - 13 cycles(tsc)  3.492 ns - improved 85.2%
 64 -  88 cycles(tsc) 22.052 ns - 13 cycles(tsc)  3.411 ns - improved 85.2%
128 -  89 cycles(tsc) 22.452 ns - 15 cycles(tsc)  3.841 ns - improved 83.1%
158 -  89 cycles(tsc) 22.403 ns - 14 cycles(tsc)  3.746 ns - improved 84.3%
250 -  91 cycles(tsc) 22.775 ns - 16 cycles(tsc)  4.111 ns - improved 82.4%

Notice that it is not recommended to do very large bulk operations
with this API, because local IRQs are disabled for the whole period.

[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/slab_bulk_test01.c
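
If a caller really does have a very large array to free, a simple
chunking wrapper keeps the IRQs-off window bounded (a sketch; the
chunk size of 64 is an arbitrary illustrative choice):

   static void example_free_many(struct kmem_cache *s, size_t nr, void **p)
   {
           while (nr) {
                   size_t n = min_t(size_t, nr, 64);

                   kmem_cache_free_bulk(s, n, p);
                   p += n;
                   nr -= n;
           }
   }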

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slab.c |   29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 33218af6a731..1358f86c0684 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3385,12 +3385,6 @@ void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
 }
 EXPORT_SYMBOL(kmem_cache_alloc);
 
-void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
-{
-	__kmem_cache_free_bulk(s, size, p);
-}
-EXPORT_SYMBOL(kmem_cache_free_bulk);
-
 static __always_inline void
 cache_alloc_debugcheck_after_bulk(struct kmem_cache *s, gfp_t flags,
 				  size_t size, void **p, unsigned long caller)
@@ -3584,6 +3578,29 @@ void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 }
 EXPORT_SYMBOL(kmem_cache_free);
 
+void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
+{
+	struct kmem_cache *s;
+	size_t i;
+
+	local_irq_disable();
+	for (i = 0; i < size; i++) {
+		void *objp = p[i];
+
+		s = cache_from_obj(orig_s, objp);
+
+		debug_check_no_locks_freed(objp, s->object_size);
+		if (!(s->flags & SLAB_DEBUG_OBJECTS))
+			debug_check_no_obj_freed(objp, s->object_size);
+
+		__cache_free(s, objp, _RET_IP_);
+	}
+	local_irq_enable();
+
+	/* FIXME: add tracing */
+}
+EXPORT_SYMBOL(kmem_cache_free_bulk);
+
 /**
  * kfree - free previously allocated memory
  * @objp: pointer returned by kmalloc.


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 10/10] mm: new API kfree_bulk() for SLAB+SLUB allocators
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
                   ` (8 preceding siblings ...)
  2016-01-07 14:04 ` [PATCH 09/10] slab: implement bulk free in SLAB allocator Jesper Dangaard Brouer
@ 2016-01-07 14:04 ` Jesper Dangaard Brouer
  2016-01-08  3:03   ` Joonsoo Kim
  2016-01-07 18:54 ` [PATCH 00/10] MM: More bulk API work Linus Torvalds
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
  11 siblings, 1 reply; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 14:04 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

This patch introduces a new API call, kfree_bulk(), for bulk freeing
of memory objects not bound to a single kmem_cache.

Christoph pointed out that it is possible to implement freeing of
objects without knowing the kmem_cache pointer, as that information is
available from the object's page->slab_cache, and proposed removing
the kmem_cache argument from the bulk free API.

Jesper demonstrated that these extra steps per object come at a
performance cost.  Only when CONFIG_MEMCG_KMEM is compiled in and
activated at runtime are these steps done anyhow.  The extra cost is
most visible for the SLAB allocator, because the SLUB allocator does
the page lookup (virt_to_head_page()) anyhow.

Thus, the conclusion was to keep the kmem_cache bulk free API with a
kmem_cache pointer, but we can still implement a kfree_bulk() API
fairly easily, simply by handling the case where kmem_cache_free_bulk()
is called with a NULL kmem_cache pointer.

This does increase the code size a bit, but implementing a separate
kfree_bulk() call would likely increase code size even more.
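
A hedged usage sketch of the new call (the object sizes are only
illustrative):

   void *objs[3];
   size_t i, n = 0;

   for (i = 0; i < ARRAY_SIZE(objs); i++) {
           void *obj = kmalloc(32 << i, GFP_KERNEL); /* 32, 64, 128 bytes */

           if (!obj)
                   break;
           objs[n++] = obj;
   }

   /* ... use the objects ... */

   /* One call frees objects from different kmalloc size classes */
   if (n)
           kfree_bulk(n, objs);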

The benchmarks below measure the cost of alloc+free (object size 256
bytes) on a CPU i7-4790K @ 4.00GHz, with no PREEMPT and
CONFIG_MEMCG_KMEM=y.

Code size increase for SLAB:

 add/remove: 0/0 grow/shrink: 1/0 up/down: 74/0 (74)
 function                                     old     new   delta
 kmem_cache_free_bulk                         660     734     +74

SLAB fastpath: 85 cycles(tsc) 21.468 ns (step:0)
  sz - fallback             - kmem_cache_free_bulk - kfree_bulk
   1 - 101 cycles 25.291 ns -  41 cycles 10.499 ns - 130 cycles 32.522 ns
   2 -  95 cycles 23.964 ns -  26 cycles  6.558 ns -  56 cycles 14.134 ns
   3 -  93 cycles 23.281 ns -  20 cycles  5.244 ns -  41 cycles 10.393 ns
   4 -  92 cycles 23.123 ns -  18 cycles  4.589 ns -  26 cycles 6.612 ns
   8 -  90 cycles 22.696 ns -  24 cycles  6.211 ns -  32 cycles 8.175 ns
  16 - 108 cycles 27.175 ns -  13 cycles  3.418 ns -  21 cycles 5.480 ns
  30 -  90 cycles 22.708 ns -  14 cycles  3.667 ns -  20 cycles 5.222 ns
  32 -  90 cycles 22.687 ns -  13 cycles  3.337 ns -  20 cycles 5.170 ns
  34 -  90 cycles 22.699 ns -  14 cycles  3.622 ns -  21 cycles 5.269 ns
  48 -  90 cycles 22.585 ns -  14 cycles  3.525 ns -  21 cycles 5.261 ns
  64 -  90 cycles 22.523 ns -  13 cycles  3.440 ns -  20 cycles 5.190 ns
 128 -  91 cycles 22.962 ns -  15 cycles  3.883 ns -  22 cycles 5.622 ns
 158 -  91 cycles 22.877 ns -  15 cycles  3.770 ns -  22 cycles 5.582 ns
 250 -  93 cycles 23.282 ns -  16 cycles  4.133 ns -  24 cycles 6.047 ns

SLAB when enabling MEMCG_KMEM runtime:
 - kmemcg fastpath: 134 cycles(tsc) 33.514 ns (step:0)
 1 - 146 cycles 36.634 ns -  66 cycles 16.705 ns - 67 cycles 16.793 ns
 2 - 137 cycles 34.452 ns -  41 cycles 10.326 ns - 42 cycles 10.736 ns
 3 - 135 cycles 33.856 ns -  34 cycles 8.545 ns - 36 cycles 9.049 ns
 4 - 132 cycles 33.232 ns -  33 cycles 8.306 ns - 29 cycles 7.368 ns
 8 - 134 cycles 33.612 ns -  24 cycles 6.250 ns - 24 cycles 6.130 ns
 16 - 131 cycles 33.003 ns -  23 cycles 5.981 ns - 22 cycles 5.552 ns
 30 - 131 cycles 32.920 ns -  21 cycles 5.499 ns - 21 cycles 5.397 ns
 32 - 131 cycles 32.861 ns -  21 cycles 5.482 ns - 21 cycles 5.301 ns
 34 - 131 cycles 32.837 ns -  21 cycles 5.461 ns - 20 cycles 5.236 ns
 48 - 130 cycles 32.725 ns -  23 cycles 5.878 ns - 21 cycles 5.367 ns
 64 - 130 cycles 32.625 ns -  21 cycles 5.374 ns - 21 cycles 5.251 ns
 128 - 132 cycles 33.048 ns -  22 cycles 5.725 ns - 22 cycles 5.662 ns
 158 - 132 cycles 33.235 ns -  22 cycles 5.641 ns - 22 cycles 5.579 ns
 250 - 134 cycles 33.557 ns -  24 cycles 6.035 ns - 23 cycles 5.934 ns

Code size increase for SLUB:
 function                                     old     new   delta
 kmem_cache_free_bulk                         717     799     +82

SLUB benchmark:
 SLUB fastpath: 46 cycles(tsc) 11.691 ns (step:0)
  sz - fallback             - kmem_cache_free_bulk - kfree_bulk
   1 -  61 cycles 15.486 ns -  53 cycles 13.364 ns - 57 cycles 14.464 ns
   2 -  54 cycles 13.703 ns -  32 cycles  8.110 ns - 33 cycles 8.482 ns
   3 -  53 cycles 13.272 ns -  25 cycles  6.362 ns - 27 cycles 6.947 ns
   4 -  51 cycles 12.994 ns -  24 cycles  6.087 ns - 24 cycles 6.078 ns
   8 -  50 cycles 12.576 ns -  21 cycles  5.354 ns - 22 cycles 5.513 ns
  16 -  49 cycles 12.368 ns -  20 cycles  5.054 ns - 20 cycles 5.042 ns
  30 -  49 cycles 12.273 ns -  18 cycles  4.748 ns - 19 cycles 4.758 ns
  32 -  49 cycles 12.401 ns -  19 cycles  4.821 ns - 19 cycles 4.810 ns
  34 -  98 cycles 24.519 ns -  24 cycles  6.154 ns - 24 cycles 6.157 ns
  48 -  83 cycles 20.833 ns -  21 cycles  5.446 ns - 21 cycles 5.429 ns
  64 -  75 cycles 18.891 ns -  20 cycles  5.247 ns - 20 cycles 5.238 ns
 128 -  93 cycles 23.271 ns -  27 cycles  6.856 ns - 27 cycles 6.823 ns
 158 - 102 cycles 25.581 ns -  30 cycles  7.714 ns - 30 cycles 7.695 ns
 250 - 107 cycles 26.917 ns -  38 cycles  9.514 ns - 38 cycles 9.506 ns

SLUB when enabling MEMCG_KMEM runtime:
 - kmemcg fastpath: 71 cycles(tsc) 17.897 ns (step:0)
 1 - 85 cycles 21.484 ns -  78 cycles 19.569 ns - 75 cycles 18.938 ns
 2 - 81 cycles 20.363 ns -  45 cycles 11.258 ns - 44 cycles 11.076 ns
 3 - 78 cycles 19.709 ns -  33 cycles 8.354 ns - 32 cycles 8.044 ns
 4 - 77 cycles 19.430 ns -  28 cycles 7.216 ns - 28 cycles 7.003 ns
 8 - 101 cycles 25.288 ns -  23 cycles 5.849 ns - 23 cycles 5.787 ns
 16 - 76 cycles 19.148 ns -  20 cycles 5.162 ns - 20 cycles 5.081 ns
 30 - 76 cycles 19.067 ns -  19 cycles 4.868 ns - 19 cycles 4.821 ns
 32 - 76 cycles 19.052 ns -  19 cycles 4.857 ns - 19 cycles 4.815 ns
 34 - 121 cycles 30.291 ns -  25 cycles 6.333 ns - 25 cycles 6.268 ns
 48 - 108 cycles 27.111 ns -  21 cycles 5.498 ns - 21 cycles 5.458 ns
 64 - 100 cycles 25.164 ns -  20 cycles 5.242 ns - 20 cycles 5.229 ns
 128 - 155 cycles 38.976 ns -  27 cycles 6.886 ns - 27 cycles 6.892 ns
 158 - 132 cycles 33.034 ns -  30 cycles 7.711 ns - 30 cycles 7.728 ns
 250 - 130 cycles 32.612 ns -  38 cycles 9.560 ns - 38 cycles 9.549 ns

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/linux/slab.h |    8 ++++++++
 mm/slab.c            |    5 ++++-
 mm/slab_common.c     |    8 ++++++--
 mm/slub.c            |   22 +++++++++++++++++++---
 4 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 2037a861e367..599b47f02b27 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -318,6 +318,14 @@ void kmem_cache_free(struct kmem_cache *, void *);
 void kmem_cache_free_bulk(struct kmem_cache *, size_t, void **);
 int kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **);
 
+static __always_inline void kfree_bulk(size_t size, void **p)
+{
+	/* Reusing call to kmem_cache_free_bulk() allow kfree_bulk to
+	 * use same code icache
+	 */
+	kmem_cache_free_bulk(NULL, size, p);
+}
+
 #ifdef CONFIG_NUMA
 void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment;
 void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) __assume_slab_alignment;
diff --git a/mm/slab.c b/mm/slab.c
index 1358f86c0684..d4dc4836918f 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3587,7 +3587,10 @@ void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
 	for (i = 0; i < size; i++) {
 		void *objp = p[i];
 
-		s = cache_from_obj(orig_s, objp);
+		if (!orig_s) /* called via kfree_bulk */
+			s = virt_to_cache(objp);
+		else
+			s = cache_from_obj(orig_s, objp);
 
 		debug_check_no_locks_freed(objp, s->object_size);
 		if (!(s->flags & SLAB_DEBUG_OBJECTS))
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 3c6a86b4ec25..963c25589949 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -108,8 +108,12 @@ void __kmem_cache_free_bulk(struct kmem_cache *s, size_t nr, void **p)
 {
 	size_t i;
 
-	for (i = 0; i < nr; i++)
-		kmem_cache_free(s, p[i]);
+	for (i = 0; i < nr; i++) {
+		if (s)
+			kmem_cache_free(s, p[i]);
+		else
+			kfree(p[i]);
+	}
 }
 
 int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t nr,
diff --git a/mm/slub.c b/mm/slub.c
index 3697f216d7c7..c33d2e1f011e 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2786,23 +2786,39 @@ int build_detached_freelist(struct kmem_cache **s, size_t size,
 	size_t first_skipped_index = 0;
 	int lookahead = 3;
 	void *object;
+	struct page *page;
 
 	/* Always re-init detached_freelist */
 	df->page = NULL;
 
 	do {
 		object = p[--size];
+		/* Do we need !ZERO_OR_NULL_PTR(object) here? (for kfree) */
 	} while (!object && size);
 
 	if (!object)
 		return 0;
 
-	/* Support for memcg, compiler can optimize this out */
-	*s = cache_from_obj(*s, object);
+	page = virt_to_head_page(object);
+	if (!*s) {
+		/* Handle kalloc'ed objects */
+		if (unlikely(!PageSlab(page))) {
+			BUG_ON(!PageCompound(page));
+			kfree_hook(object);
+			__free_kmem_pages(page, compound_order(page));
+			p[size] = NULL; /* mark object processed */
+			return size;
+		}
+		/* Derive kmem_cache from object */
+		*s = page->slab_cache;
+	} else {
+		/* Support for memcg, compiler can optimize this out */
+		*s = cache_from_obj(*s, object);
+	}
 
 	/* Start new detached freelist */
+	df->page = page;
 	set_freepointer(*s, object, NULL);
-	df->page = virt_to_head_page(object);
 	df->tail = object;
 	df->freelist = object;
 	p[size] = NULL; /* mark object processed */


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/10] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk
  2016-01-07 14:03 ` [PATCH 01/10] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk Jesper Dangaard Brouer
@ 2016-01-07 15:54   ` Christoph Lameter
  2016-01-07 17:41     ` Jesper Dangaard Brouer
  2016-01-08  2:58   ` Joonsoo Kim
  1 sibling, 1 reply; 31+ messages in thread
From: Christoph Lameter @ 2016-01-07 15:54 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: linux-mm, Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim

On Thu, 7 Jan 2016, Jesper Dangaard Brouer wrote:

> +	/* Support for memcg, compiler can optimize this out */
> +	*s = cache_from_obj(*s, object);
> +

Well the indirection on *s presumably cannot be optimized out. And the
indirection is not needed when cgroups are not active.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/10] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk
  2016-01-07 15:54   ` Christoph Lameter
@ 2016-01-07 17:41     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-07 17:41 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: linux-mm, Vladimir Davydov, Andrew Morton, Linus Torvalds,
	Joonsoo Kim, brouer

On Thu, 7 Jan 2016 09:54:24 -0600 (CST)
Christoph Lameter <cl@linux.com> wrote:

> On Thu, 7 Jan 2016, Jesper Dangaard Brouer wrote:
> 
> > +	/* Support for memcg, compiler can optimize this out */
> > +	*s = cache_from_obj(*s, object);
> > +
> 
> Well the indirection on *s presumably cannot be optimized out. And the
> indirection is not needed when cgroups are not active.

The indirection is optimized out, because build_detached_freelist() is
inlined (and I marked it so for readability, even though GCC was
already inlining it before).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 00/10] MM: More bulk API work
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
                   ` (9 preceding siblings ...)
  2016-01-07 14:04 ` [PATCH 10/10] mm: new API kfree_bulk() for SLAB+SLUB allocators Jesper Dangaard Brouer
@ 2016-01-07 18:54 ` Linus Torvalds
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
  11 siblings, 0 replies; 31+ messages in thread
From: Linus Torvalds @ 2016-01-07 18:54 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: linux-mm, Christoph Lameter, Vladimir Davydov, Andrew Morton,
	Joonsoo Kim

On Thu, Jan 7, 2016 at 6:03 AM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
> This series contain three aspects:
>  1. cleanup and code sharing between SLUB and SLAB
>  2. implementing accelerated bulk API for SLAB allocator
>  3. new API kfree_bulk()

FWIW, looks ok to me from a quick patch read-through. Nothing raises
my hackles like happened with the old slab work.

           Linus


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/10] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk
  2016-01-07 14:03 ` [PATCH 01/10] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk Jesper Dangaard Brouer
  2016-01-07 15:54   ` Christoph Lameter
@ 2016-01-08  2:58   ` Joonsoo Kim
  2016-01-08 11:05     ` Jesper Dangaard Brouer
  1 sibling, 1 reply; 31+ messages in thread
From: Joonsoo Kim @ 2016-01-08  2:58 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: linux-mm, Christoph Lameter, Vladimir Davydov, Andrew Morton,
	Linus Torvalds

On Thu, Jan 07, 2016 at 03:03:38PM +0100, Jesper Dangaard Brouer wrote:
> This change is primarily an attempt to make it easier to realize the
> optimizations the compiler performs in-case CONFIG_MEMCG_KMEM is not
> enabled.
> 
> Performance wise, even when CONFIG_MEMCG_KMEM is compiled in, the
> overhead is zero. This is because, as long as no process have
> enabled kmem cgroups accounting, the assignment is replaced by
> asm-NOP operations.  This is possible because memcg_kmem_enabled()
> uses a static_key_false() construct.
> 
> It also helps readability as it avoid accessing the p[] array like:
> p[size - 1] which "expose" that the array is processed backwards
> inside helper function build_detached_freelist().

That part is cleaned up, but the overall code doesn't look readable to me.
How about the change below?

Thanks.

---------------------->8------------------
 struct detached_freelist {
+       struct kmem_cache *s;
        struct page *page;
        void *tail;
        void *freelist;
@@ -2852,8 +2853,11 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
        if (!object)
                return 0;
 
+       /* Support for memcg */
+       df->s = cache_from_obj(s, object);
+
        /* Start new detached freelist */
-       set_freepointer(s, object, NULL);
+       set_freepointer(df.s, object, NULL);
        df->page = virt_to_head_page(object);
        df->tail = object;
        df->freelist = object;
@@ -2868,7 +2872,7 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
                /* df->page is always set at this point */
                if (df->page == virt_to_head_page(object)) {
                        /* Opportunity build freelist */
-                       set_freepointer(s, object, df->freelist);
+                       set_freepointer(df.s, object, df->freelist);
                        df->freelist = object;
                        df->cnt++;
                        p[size] = NULL; /* mark object processed */
@@ -2889,23 +2893,19 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
 
 
 /* Note that interrupts must be enabled when calling this function. */
-void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
+void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
 {
        if (WARN_ON(!size))
                return;
 
        do {
                struct detached_freelist df;
-               struct kmem_cache *s;
-
-               /* Support for memcg */
-               s = cache_from_obj(orig_s, p[size - 1]);
 
                size = build_detached_freelist(s, size, p, &df);
                if (unlikely(!df.page))
                        continue;
 
-               slab_free(s, df.page, df.freelist, df.tail, df.cnt, _RET_IP_);
+               slab_free(df.s, df.page, df.freelist, df.tail, df.cnt, _RET_IP_);
        } while (likely(size));
 }
 EXPORT_SYMBOL(kmem_cache_free_bulk);



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 10/10] mm: new API kfree_bulk() for SLAB+SLUB allocators
  2016-01-07 14:04 ` [PATCH 10/10] mm: new API kfree_bulk() for SLAB+SLUB allocators Jesper Dangaard Brouer
@ 2016-01-08  3:03   ` Joonsoo Kim
  2016-01-08 11:20     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 31+ messages in thread
From: Joonsoo Kim @ 2016-01-08  3:03 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: linux-mm, Christoph Lameter, Vladimir Davydov, Andrew Morton,
	Linus Torvalds

On Thu, Jan 07, 2016 at 03:04:23PM +0100, Jesper Dangaard Brouer wrote:
> This patch introduce a new API call kfree_bulk() for bulk freeing
> memory objects not bound to a single kmem_cache.
> 
> Christoph pointed out that it is possible to implement freeing of
> objects, without knowing the kmem_cache pointer as that information is
> available from the object's page->slab_cache.  Proposing to remove the
> kmem_cache argument from the bulk free API.
> 
> Jesper demonstrated that these extra steps per object comes at a
> performance cost.  It is only in the case CONFIG_MEMCG_KMEM is
> compiled in and activated runtime that these steps are done anyhow.
> The extra cost is most visible for SLAB allocator, because the SLUB
> allocator does the page lookup (virt_to_head_page()) anyhow.
> 
> Thus, the conclusion was to keep the kmem_cache free bulk API with a
> kmem_cache pointer, but we can still implement a kfree_bulk() API
> fairly easily.  Simply by handling if kmem_cache_free_bulk() gets
> called with a kmem_cache NULL pointer.
> 
> This does increase the code size a bit, but implementing a separate
> kfree_bulk() call would likely increase code size even more.
> 
> Below benchmarks cost of alloc+free (obj size 256 bytes) on
> CPU i7-4790K @ 4.00GHz, no PREEMPT and CONFIG_MEMCG_KMEM=y.
> 
> Code size increase for SLAB:
> 
>  add/remove: 0/0 grow/shrink: 1/0 up/down: 74/0 (74)
>  function                                     old     new   delta
>  kmem_cache_free_bulk                         660     734     +74
> 
> SLAB fastpath: 85 cycles(tsc) 21.468 ns (step:0)
>   sz - fallback             - kmem_cache_free_bulk - kfree_bulk
>    1 - 101 cycles 25.291 ns -  41 cycles 10.499 ns - 130 cycles 32.522 ns

This looks like an experimental error. Why does kfree_bulk() take more
time than the fallback?

Thanks.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/10] slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB
  2016-01-07 14:03 ` [PATCH 04/10] slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
@ 2016-01-08  3:05   ` Joonsoo Kim
  0 siblings, 0 replies; 31+ messages in thread
From: Joonsoo Kim @ 2016-01-08  3:05 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: linux-mm, Christoph Lameter, Vladimir Davydov, Andrew Morton,
	Linus Torvalds

On Thu, Jan 07, 2016 at 03:03:53PM +0100, Jesper Dangaard Brouer wrote:
> Dedublicate code in SLAB allocator functions slab_alloc() and
> slab_alloc_node() by using the slab_pre_alloc_hook() call, which
> is now shared between SLUB and SLAB.
> 
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>  mm/slab.c |   18 ++++--------------
>  1 file changed, 4 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index d5b29e7bee81..17fd6268ad41 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3140,15 +3140,10 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
>  	void *ptr;
>  	int slab_node = numa_mem_id();
>  
> -	flags &= gfp_allowed_mask;
> -
> -	lockdep_trace_alloc(flags);
> -
> -	if (should_failslab(cachep, flags))
> +	cachep = slab_pre_alloc_hook(cachep, flags);
> +	if (!cachep)
>  		return NULL;

How about adding unlikely here?

>  
> -	cachep = memcg_kmem_get_cache(cachep, flags);
> -
>  	cache_alloc_debugcheck_before(cachep, flags);
>  	local_irq_save(save_flags);
>  
> @@ -3228,15 +3223,10 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller)
>  	unsigned long save_flags;
>  	void *objp;
>  
> -	flags &= gfp_allowed_mask;
> -
> -	lockdep_trace_alloc(flags);
> -
> -	if (should_failslab(cachep, flags))
> +	cachep = slab_pre_alloc_hook(cachep, flags);
> +	if (!cachep)
>  		return NULL;

Ditto.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/10] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk
  2016-01-08  2:58   ` Joonsoo Kim
@ 2016-01-08 11:05     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-08 11:05 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: linux-mm, Christoph Lameter, Vladimir Davydov, Andrew Morton,
	Linus Torvalds, brouer

On Fri, 8 Jan 2016 11:58:39 +0900
Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:

> On Thu, Jan 07, 2016 at 03:03:38PM +0100, Jesper Dangaard Brouer wrote:
> > This change is primarily an attempt to make it easier to realize the
> > optimizations the compiler performs in-case CONFIG_MEMCG_KMEM is not
> > enabled.
> > 
> > Performance wise, even when CONFIG_MEMCG_KMEM is compiled in, the
> > overhead is zero. This is because, as long as no process have
> > enabled kmem cgroups accounting, the assignment is replaced by
> > asm-NOP operations.  This is possible because memcg_kmem_enabled()
> > uses a static_key_false() construct.
> > 
> > It also helps readability as it avoid accessing the p[] array like:
> > p[size - 1] which "expose" that the array is processed backwards
> > inside helper function build_detached_freelist().
> 
> That part is cleande up but overall code doesn't looks readable to me.

True, I also don't like my "*s" indirection, even though it gets removed
in the compiled code.

> How about below change?

It looks more readable as C, but I have to verify that the compiler can
still realize the optimization of just using "s" directly when
CONFIG_MEMCG_KMEM is not activated.

> ---------------------->8------------------
>  struct detached_freelist {
> +       struct kmem_cache *s;
>         struct page *page;
>         void *tail;
>         void *freelist;
>         int cnt;
>  }

I'll likely place "s" at another point, as the 16-byte alignment of this
struct can influence performance on Intel CPUs (e.g. freelist+cnt
get updated at almost the same time, and were 16-byte aligned before).
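
As a rough layout sketch (x86_64, offsets worked out by hand for
illustration, not taken from either patch): appending "s" keeps
freelist and cnt in the same 16-byte block, whereas putting "s" first
would push them to offsets 24 and 32, i.e. across a 16-byte boundary.

struct detached_freelist {
	struct page *page;	/* offset  0 */
	void *tail;		/* offset  8 */
	void *freelist;		/* offset 16 \ updated almost together,  */
	int cnt;		/* offset 24 / stay in one 16-byte block  */
	struct kmem_cache *s;	/* offset 32 (after 4 bytes of padding)  */
};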


> @@ -2852,8 +2853,11 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
>         if (!object)
>                 return 0;
>  
> +       /* Support for memcg */
> +       df->s = cache_from_obj(s, object);
> +
>         /* Start new detached freelist */
> -       set_freepointer(s, object, NULL);
> +       set_freepointer(df.s, object, NULL);

Not compile tested ;-) ... df->s

>         df->page = virt_to_head_page(object);
>         df->tail = object;
>         df->freelist = object;
> @@ -2868,7 +2872,7 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
>                 /* df->page is always set at this point */
>                 if (df->page == virt_to_head_page(object)) {
>                         /* Opportunity build freelist */
> -                       set_freepointer(s, object, df->freelist);
> +                       set_freepointer(df.s, object, df->freelist);
>                         df->freelist = object;
>                         df->cnt++;
>                         p[size] = NULL; /* mark object processed */
> @@ -2889,23 +2893,19 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
>  
>  
>  /* Note that interrupts must be enabled when calling this function. */
> -void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
> +void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
>  {
>         if (WARN_ON(!size))
>                 return;
>  
>         do {
>                 struct detached_freelist df;
> -               struct kmem_cache *s;
> -
> -               /* Support for memcg */
> -               s = cache_from_obj(orig_s, p[size - 1]);
>  
>                 size = build_detached_freelist(s, size, p, &df);
>                 if (unlikely(!df.page))
>                         continue;
>  
> -               slab_free(s, df.page, df.freelist, df.tail, df.cnt, _RET_IP_);
> +               slab_free(df.s, df.page, df.freelist, df.tail, df.cnt, _RET_IP_);

Argh... line will be 81 chars wide...

>         } while (likely(size));
>  }
>  EXPORT_SYMBOL(kmem_cache_free_bulk);

I'll try it out. Thanks for your suggestion.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 10/10] mm: new API kfree_bulk() for SLAB+SLUB allocators
  2016-01-08  3:03   ` Joonsoo Kim
@ 2016-01-08 11:20     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-08 11:20 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: linux-mm, Christoph Lameter, Vladimir Davydov, Andrew Morton,
	Linus Torvalds, brouer

On Fri, 8 Jan 2016 12:03:48 +0900
Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:

> On Thu, Jan 07, 2016 at 03:04:23PM +0100, Jesper Dangaard Brouer wrote:
> > This patch introduce a new API call kfree_bulk() for bulk freeing
> > memory objects not bound to a single kmem_cache.
> > 
> > Christoph pointed out that it is possible to implement freeing of
> > objects, without knowing the kmem_cache pointer as that information is
> > available from the object's page->slab_cache.  Proposing to remove the
> > kmem_cache argument from the bulk free API.
> > 
> > Jesper demonstrated that these extra steps per object comes at a
> > performance cost.  It is only in the case CONFIG_MEMCG_KMEM is
> > compiled in and activated runtime that these steps are done anyhow.
> > The extra cost is most visible for SLAB allocator, because the SLUB
> > allocator does the page lookup (virt_to_head_page()) anyhow.
> > 
> > Thus, the conclusion was to keep the kmem_cache free bulk API with a
> > kmem_cache pointer, but we can still implement a kfree_bulk() API
> > fairly easily.  Simply by handling if kmem_cache_free_bulk() gets
> > called with a kmem_cache NULL pointer.
> > 
> > This does increase the code size a bit, but implementing a separate
> > kfree_bulk() call would likely increase code size even more.
> > 
> > Below benchmarks cost of alloc+free (obj size 256 bytes) on
> > CPU i7-4790K @ 4.00GHz, no PREEMPT and CONFIG_MEMCG_KMEM=y.
> > 
> > Code size increase for SLAB:
> > 
> >  add/remove: 0/0 grow/shrink: 1/0 up/down: 74/0 (74)
> >  function                                     old     new   delta
> >  kmem_cache_free_bulk                         660     734     +74
> > 
> > SLAB fastpath: 85 cycles(tsc) 21.468 ns (step:0)
> >   sz - fallback             - kmem_cache_free_bulk - kfree_bulk
> >    1 - 101 cycles 25.291 ns -  41 cycles 10.499 ns - 130 cycles 32.522 ns
> 
> This looks experimental error. Why does kfree_bulk() takes more time
> than fallback?

This does look like an experimental error.  Sometimes instabilities
occur when slab caches get merged, but I tried to counter that by
using the boot param slab_nomerge.

In the single-object SLAB kfree_bulk() case, it can be slower
than the fallback, because it will likely always hit a branch
mispredict for the kfree case (which is okay, as single-object free is
not the case we optimize for).
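
For illustration (a simplified sketch, not the actual mm/slub.c code),
the shared free path ends up picking the kmem_cache per detached
freelist roughly like this, and a single-object kfree_bulk() call only
takes the !s branch once, so the branch predictor gets little help:

static struct kmem_cache *pick_cache(struct kmem_cache *s, void *object)
{
	if (!s)		/* kfree_bulk(): derive the cache from the page */
		return virt_to_head_page(object)->slab_cache;

	return cache_from_obj(s, object);	/* kmem_cache_free_bulk() */
}

(The real code additionally handles large kmalloc'ed pages, where
!PageSlab(page), by freeing the pages directly.)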

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH V2 00/11] MM: More bulk API work
  2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
                   ` (10 preceding siblings ...)
  2016-01-07 18:54 ` [PATCH 00/10] MM: More bulk API work Linus Torvalds
@ 2016-01-12 15:13 ` Jesper Dangaard Brouer
  2016-01-12 15:13   ` [PATCH V2 01/11] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk Jesper Dangaard Brouer
                     ` (10 more replies)
  11 siblings, 11 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:13 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

This series contain three aspects:
 1. cleanup and code sharing between SLUB and SLAB
 2. implementing accelerated bulk API for SLAB allocator
 3. new API kfree_bulk()

Reviewers please review the changed order of debug calls in the SLAB
allocator, as they are changed to do the same as the SLUB allocator.

Patchset is based on top of Linus' tree at commit afd2ff9b7e1b ("Linux 4.4").

A test module for exercising the new kfree_bulk() API is available here:
 https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/slab_bulk_test05_kfree_bulk.c
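
For readers without the module handy, a minimal usage sketch of what
such a test exercises (the helper name, object size and batch size are
made-up examples, not the module's actual code):

#include <linux/slab.h>

#define NR_OBJS 16	/* arbitrary example batch size */

static int example_kfree_bulk_usage(void)
{
	void *objs[NR_OBJS];
	size_t i;

	for (i = 0; i < NR_OBJS; i++) {
		objs[i] = kmalloc(256, GFP_KERNEL);
		if (!objs[i]) {
			if (i)
				kfree_bulk(i, objs); /* free partial batch */
			return -ENOMEM;
		}
	}

	/* ... use the objects ... */

	kfree_bulk(NR_OBJS, objs);	/* no kmem_cache pointer needed */
	return 0;
}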

---

Jesper Dangaard Brouer (11):
      slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk
      mm/slab: move SLUB alloc hooks to common mm/slab.h
      mm: fault-inject take over bootstrap kmem_cache check
      slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB
      mm: kmemcheck skip object if slab allocation failed
      slab: use slab_post_alloc_hook in SLAB allocator shared with SLUB
      slab: implement bulk alloc in SLAB allocator
      slab: avoid running debug SLAB code with IRQs disabled for alloc_bulk
      slab: implement bulk free in SLAB allocator
      mm: new API kfree_bulk() for SLAB+SLUB allocators
      mm: fix some spelling


 include/linux/fault-inject.h |    5 +-
 include/linux/memcontrol.h   |    2 -
 include/linux/slab.h         |   11 +++-
 mm/failslab.c                |   11 +++-
 mm/kmemcheck.c               |    3 +
 mm/slab.c                    |  121 +++++++++++++++++++++++++++---------------
 mm/slab.h                    |   64 ++++++++++++++++++++++
 mm/slab_common.c             |    8 ++-
 mm/slub.c                    |   93 +++++++++-----------------------
 9 files changed, 198 insertions(+), 120 deletions(-)

--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH V2 01/11] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
@ 2016-01-12 15:13   ` Jesper Dangaard Brouer
  2016-01-12 15:13   ` [PATCH V2 02/11] mm/slab: move SLUB alloc hooks to common mm/slab.h Jesper Dangaard Brouer
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:13 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

This change is primarily an attempt to make it easier to realize the
optimizations the compiler performs in-case CONFIG_MEMCG_KMEM is not
enabled.

Performance wise, even when CONFIG_MEMCG_KMEM is compiled in, the
overhead is zero. This is because, as long as no process have
enabled kmem cgroups accounting, the assignment is replaced by
asm-NOP operations.  This is possible because memcg_kmem_enabled()
uses a static_key_false() construct.

It also helps readability as it avoid accessing the p[] array like:
p[size - 1] which "expose" that the array is processed backwards
inside helper function build_detached_freelist().

Lastly, this also makes the code more robust in error cases, like
NULL pointers being passed in the array, which were handled
before commit 033745189b1b ("slub: add missing kmem cgroup
support to kmem_cache_free_bulk").

Fixes: 033745189b1b ("slub: add missing kmem cgroup support to kmem_cache_free_bulk")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

---
V2: used Joonsoo Kim's suggestion to store "s" in struct detached_freelist.
 - Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
 - Verified ASM code generated is still optimal, with different ifdef config's
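
To spell out the static_key point: roughly, cache_from_obj() reduces to
the following (a simplified sketch, not the actual mm/slab.h code), so
with the key off df->s is simply the kmem_cache the caller passed in:

static inline struct kmem_cache *cache_from_obj_sketch(struct kmem_cache *s,
						       void *x)
{
	/* memcg_kmem_enabled() is a static_key_false() construct, patched
	 * to a plain fall-through while no kmem cgroup accounting is active.
	 */
	if (!memcg_kmem_enabled())
		return s;

	/* Only with active kmemcg: derive the per-memcg cache from the page */
	return virt_to_head_page(x)->slab_cache;
}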

 mm/slub.c |   22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 46997517406e..65d5f92d51d2 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2819,6 +2819,7 @@ struct detached_freelist {
 	void *tail;
 	void *freelist;
 	int cnt;
+	struct kmem_cache *s;
 };
 
 /*
@@ -2833,8 +2834,9 @@ struct detached_freelist {
  * synchronization primitive.  Look ahead in the array is limited due
  * to performance reasons.
  */
-static int build_detached_freelist(struct kmem_cache *s, size_t size,
-				   void **p, struct detached_freelist *df)
+static inline
+int build_detached_freelist(struct kmem_cache *s, size_t size,
+			    void **p, struct detached_freelist *df)
 {
 	size_t first_skipped_index = 0;
 	int lookahead = 3;
@@ -2850,8 +2852,11 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
 	if (!object)
 		return 0;
 
+	/* Support for memcg, compiler can optimize this out */
+	df->s = cache_from_obj(s, object);
+
 	/* Start new detached freelist */
-	set_freepointer(s, object, NULL);
+	set_freepointer(df->s, object, NULL);
 	df->page = virt_to_head_page(object);
 	df->tail = object;
 	df->freelist = object;
@@ -2866,7 +2871,7 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
 		/* df->page is always set at this point */
 		if (df->page == virt_to_head_page(object)) {
 			/* Opportunity build freelist */
-			set_freepointer(s, object, df->freelist);
+			set_freepointer(df->s, object, df->freelist);
 			df->freelist = object;
 			df->cnt++;
 			p[size] = NULL; /* mark object processed */
@@ -2885,25 +2890,20 @@ static int build_detached_freelist(struct kmem_cache *s, size_t size,
 	return first_skipped_index;
 }
 
-
 /* Note that interrupts must be enabled when calling this function. */
-void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
+void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
 {
 	if (WARN_ON(!size))
 		return;
 
 	do {
 		struct detached_freelist df;
-		struct kmem_cache *s;
-
-		/* Support for memcg */
-		s = cache_from_obj(orig_s, p[size - 1]);
 
 		size = build_detached_freelist(s, size, p, &df);
 		if (unlikely(!df.page))
 			continue;
 
-		slab_free(s, df.page, df.freelist, df.tail, df.cnt, _RET_IP_);
+		slab_free(df.s, df.page, df.freelist, df.tail, df.cnt,_RET_IP_);
 	} while (likely(size));
 }
 EXPORT_SYMBOL(kmem_cache_free_bulk);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH V2 02/11] mm/slab: move SLUB alloc hooks to common mm/slab.h
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
  2016-01-12 15:13   ` [PATCH V2 01/11] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk Jesper Dangaard Brouer
@ 2016-01-12 15:13   ` Jesper Dangaard Brouer
  2016-01-12 15:14   ` [PATCH V2 03/11] mm: fault-inject take over bootstrap kmem_cache check Jesper Dangaard Brouer
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:13 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

First step towards sharing the alloc hooks between the SLUB and SLAB
allocators.  Move the SLUB allocator's *_alloc_hook functions to the
common mm/slab.h for internal slab definitions.
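
Once the later patches in this series wire it up, both allocators'
allocation fastpaths share the same skeleton; roughly (a sketch, not a
literal quote of either mm/slab.c or mm/slub.c):

static void *alloc_fastpath_skeleton(struct kmem_cache *s, gfp_t flags)
{
	void *object = NULL;

	/* gfp mask, lockdep, might_sleep, failslab, memcg get */
	s = slab_pre_alloc_hook(s, flags);
	if (unlikely(!s))
		return NULL;

	/* ... allocator-specific fastpath work fills "object" ... */

	/* kmemcheck, kmemleak, kasan, memcg put */
	slab_post_alloc_hook(s, flags, 1, &object);
	return object;
}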

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slab.h |   62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/slub.c |   54 -----------------------------------------------------
 2 files changed, 62 insertions(+), 54 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 7b6087197997..92b10da2c71f 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -38,6 +38,10 @@ struct kmem_cache {
 #endif
 
 #include <linux/memcontrol.h>
+#include <linux/fault-inject.h>
+#include <linux/kmemcheck.h>
+#include <linux/kasan.h>
+#include <linux/kmemleak.h>
 
 /*
  * State of the slab allocator.
@@ -319,6 +323,64 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
 	return s;
 }
 
+static inline size_t slab_ksize(const struct kmem_cache *s)
+{
+#ifndef CONFIG_SLUB
+	return s->object_size;
+
+#else /* CONFIG_SLUB */
+# ifdef CONFIG_SLUB_DEBUG
+	/*
+	 * Debugging requires use of the padding between object
+	 * and whatever may come after it.
+	 */
+	if (s->flags & (SLAB_RED_ZONE | SLAB_POISON))
+		return s->object_size;
+# endif
+	/*
+	 * If we have the need to store the freelist pointer
+	 * back there or track user information then we can
+	 * only use the space before that information.
+	 */
+	if (s->flags & (SLAB_DESTROY_BY_RCU | SLAB_STORE_USER))
+		return s->inuse;
+	/*
+	 * Else we can use all the padding etc for the allocation
+	 */
+	return s->size;
+#endif
+}
+
+static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
+						     gfp_t flags)
+{
+	flags &= gfp_allowed_mask;
+	lockdep_trace_alloc(flags);
+	might_sleep_if(gfpflags_allow_blocking(flags));
+
+	if (should_failslab(s->object_size, flags, s->flags))
+		return NULL;
+
+	return memcg_kmem_get_cache(s, flags);
+}
+
+static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
+					size_t size, void **p)
+{
+	size_t i;
+
+	flags &= gfp_allowed_mask;
+	for (i = 0; i < size; i++) {
+		void *object = p[i];
+
+		kmemcheck_slab_alloc(s, flags, object, slab_ksize(s));
+		kmemleak_alloc_recursive(object, s->object_size, 1,
+					 s->flags, flags);
+		kasan_slab_alloc(s, object);
+	}
+	memcg_kmem_put_cache(s);
+}
+
 #ifndef CONFIG_SLOB
 /*
  * The slab lists for all objects.
diff --git a/mm/slub.c b/mm/slub.c
index 65d5f92d51d2..9ef1abc683b2 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -284,30 +284,6 @@ static inline int slab_index(void *p, struct kmem_cache *s, void *addr)
 	return (p - addr) / s->size;
 }
 
-static inline size_t slab_ksize(const struct kmem_cache *s)
-{
-#ifdef CONFIG_SLUB_DEBUG
-	/*
-	 * Debugging requires use of the padding between object
-	 * and whatever may come after it.
-	 */
-	if (s->flags & (SLAB_RED_ZONE | SLAB_POISON))
-		return s->object_size;
-
-#endif
-	/*
-	 * If we have the need to store the freelist pointer
-	 * back there or track user information then we can
-	 * only use the space before that information.
-	 */
-	if (s->flags & (SLAB_DESTROY_BY_RCU | SLAB_STORE_USER))
-		return s->inuse;
-	/*
-	 * Else we can use all the padding etc for the allocation
-	 */
-	return s->size;
-}
-
 static inline int order_objects(int order, unsigned long size, int reserved)
 {
 	return ((PAGE_SIZE << order) - reserved) / size;
@@ -1279,36 +1255,6 @@ static inline void kfree_hook(const void *x)
 	kasan_kfree_large(x);
 }
 
-static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
-						     gfp_t flags)
-{
-	flags &= gfp_allowed_mask;
-	lockdep_trace_alloc(flags);
-	might_sleep_if(gfpflags_allow_blocking(flags));
-
-	if (should_failslab(s->object_size, flags, s->flags))
-		return NULL;
-
-	return memcg_kmem_get_cache(s, flags);
-}
-
-static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
-					size_t size, void **p)
-{
-	size_t i;
-
-	flags &= gfp_allowed_mask;
-	for (i = 0; i < size; i++) {
-		void *object = p[i];
-
-		kmemcheck_slab_alloc(s, flags, object, slab_ksize(s));
-		kmemleak_alloc_recursive(object, s->object_size, 1,
-					 s->flags, flags);
-		kasan_slab_alloc(s, object);
-	}
-	memcg_kmem_put_cache(s);
-}
-
 static inline void slab_free_hook(struct kmem_cache *s, void *x)
 {
 	kmemleak_free_recursive(x, s->flags);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH V2 03/11] mm: fault-inject take over bootstrap kmem_cache check
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
  2016-01-12 15:13   ` [PATCH V2 01/11] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk Jesper Dangaard Brouer
  2016-01-12 15:13   ` [PATCH V2 02/11] mm/slab: move SLUB alloc hooks to common mm/slab.h Jesper Dangaard Brouer
@ 2016-01-12 15:14   ` Jesper Dangaard Brouer
  2016-01-12 15:14   ` [PATCH V2 04/11] slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:14 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

Remove the SLAB-specific function slab_should_failslab() by moving
the fault-injection check for the bootstrap slab into the shared
function should_failslab() (used by both SLAB and SLUB).

This is a step towards sharing alloc_hook's between SLUB and SLAB.

This bootstrap slab, "kmem_cache", is used by the allocator itself for
allocating struct kmem_cache objects.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/linux/fault-inject.h |    5 ++---
 mm/failslab.c                |   11 ++++++++---
 mm/slab.c                    |   12 ++----------
 mm/slab.h                    |    2 +-
 4 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/include/linux/fault-inject.h b/include/linux/fault-inject.h
index 3159a7dba034..9f4956d8601c 100644
--- a/include/linux/fault-inject.h
+++ b/include/linux/fault-inject.h
@@ -62,10 +62,9 @@ static inline struct dentry *fault_create_debugfs_attr(const char *name,
 #endif /* CONFIG_FAULT_INJECTION */
 
 #ifdef CONFIG_FAILSLAB
-extern bool should_failslab(size_t size, gfp_t gfpflags, unsigned long flags);
+extern bool should_failslab(struct kmem_cache *s, gfp_t gfpflags);
 #else
-static inline bool should_failslab(size_t size, gfp_t gfpflags,
-				unsigned long flags)
+static inline bool should_failslab(struct kmem_cache *s, gfp_t gfpflags)
 {
 	return false;
 }
diff --git a/mm/failslab.c b/mm/failslab.c
index 79171b4a5826..0c5b3f31f310 100644
--- a/mm/failslab.c
+++ b/mm/failslab.c
@@ -1,5 +1,6 @@
 #include <linux/fault-inject.h>
 #include <linux/slab.h>
+#include "slab.h"
 
 static struct {
 	struct fault_attr attr;
@@ -11,18 +12,22 @@ static struct {
 	.cache_filter = false,
 };
 
-bool should_failslab(size_t size, gfp_t gfpflags, unsigned long cache_flags)
+bool should_failslab(struct kmem_cache *s, gfp_t gfpflags)
 {
+	/* No fault-injection for bootstrap cache */
+	if (unlikely(s == kmem_cache))
+		return false;
+
 	if (gfpflags & __GFP_NOFAIL)
 		return false;
 
 	if (failslab.ignore_gfp_reclaim && (gfpflags & __GFP_RECLAIM))
 		return false;
 
-	if (failslab.cache_filter && !(cache_flags & SLAB_FAILSLAB))
+	if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
 		return false;
 
-	return should_fail(&failslab.attr, size);
+	return should_fail(&failslab.attr, s->object_size);
 }
 
 static int __init setup_failslab(char *str)
diff --git a/mm/slab.c b/mm/slab.c
index 4765c97ce690..d5b29e7bee81 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2917,14 +2917,6 @@ static void *cache_alloc_debugcheck_after(struct kmem_cache *cachep,
 #define cache_alloc_debugcheck_after(a,b,objp,d) (objp)
 #endif
 
-static bool slab_should_failslab(struct kmem_cache *cachep, gfp_t flags)
-{
-	if (unlikely(cachep == kmem_cache))
-		return false;
-
-	return should_failslab(cachep->object_size, flags, cachep->flags);
-}
-
 static inline void *____cache_alloc(struct kmem_cache *cachep, gfp_t flags)
 {
 	void *objp;
@@ -3152,7 +3144,7 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
 
 	lockdep_trace_alloc(flags);
 
-	if (slab_should_failslab(cachep, flags))
+	if (should_failslab(cachep, flags))
 		return NULL;
 
 	cachep = memcg_kmem_get_cache(cachep, flags);
@@ -3240,7 +3232,7 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller)
 
 	lockdep_trace_alloc(flags);
 
-	if (slab_should_failslab(cachep, flags))
+	if (should_failslab(cachep, flags))
 		return NULL;
 
 	cachep = memcg_kmem_get_cache(cachep, flags);
diff --git a/mm/slab.h b/mm/slab.h
index 92b10da2c71f..343ee496c53b 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -358,7 +358,7 @@ static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
 	lockdep_trace_alloc(flags);
 	might_sleep_if(gfpflags_allow_blocking(flags));
 
-	if (should_failslab(s->object_size, flags, s->flags))
+	if (should_failslab(s, flags))
 		return NULL;
 
 	return memcg_kmem_get_cache(s, flags);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH V2 04/11] slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
                     ` (2 preceding siblings ...)
  2016-01-12 15:14   ` [PATCH V2 03/11] mm: fault-inject take over bootstrap kmem_cache check Jesper Dangaard Brouer
@ 2016-01-12 15:14   ` Jesper Dangaard Brouer
  2016-01-12 15:14   ` [PATCH V2 05/11] mm: kmemcheck skip object if slab allocation failed Jesper Dangaard Brouer
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:14 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

Deduplicate code in the SLAB allocator functions slab_alloc() and
slab_alloc_node() by using the slab_pre_alloc_hook() call, which
is now shared between SLUB and SLAB.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

---
V2: added unlikely() per request of Kim

 mm/slab.c |   18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index d5b29e7bee81..30365be73547 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3140,15 +3140,10 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
 	void *ptr;
 	int slab_node = numa_mem_id();
 
-	flags &= gfp_allowed_mask;
-
-	lockdep_trace_alloc(flags);
-
-	if (should_failslab(cachep, flags))
+	cachep = slab_pre_alloc_hook(cachep, flags);
+	if (unlikely(!cachep))
 		return NULL;
 
-	cachep = memcg_kmem_get_cache(cachep, flags);
-
 	cache_alloc_debugcheck_before(cachep, flags);
 	local_irq_save(save_flags);
 
@@ -3228,15 +3223,10 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller)
 	unsigned long save_flags;
 	void *objp;
 
-	flags &= gfp_allowed_mask;
-
-	lockdep_trace_alloc(flags);
-
-	if (should_failslab(cachep, flags))
+	cachep = slab_pre_alloc_hook(cachep, flags);
+	if (unlikely(!cachep))
 		return NULL;
 
-	cachep = memcg_kmem_get_cache(cachep, flags);
-
 	cache_alloc_debugcheck_before(cachep, flags);
 	local_irq_save(save_flags);
 	objp = __do_cache_alloc(cachep, flags);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH V2 05/11] mm: kmemcheck skip object if slab allocation failed
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
                     ` (3 preceding siblings ...)
  2016-01-12 15:14   ` [PATCH V2 04/11] slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
@ 2016-01-12 15:14   ` Jesper Dangaard Brouer
  2016-01-12 15:14   ` [PATCH V2 06/11] slab: use slab_post_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:14 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

In the SLAB allocator, kmemcheck_slab_alloc() is guarded against
being called in case the object is NULL.  In the SLUB allocator this
NULL pointer invocation can happen, which seems like an oversight.

Move the NULL pointer check into the kmemcheck code (kmemcheck_slab_alloc)
so the check is moved out of the fastpath when the kernel is not
compiled with CONFIG_KMEMCHECK.

This is a step towards sharing post_alloc_hook between SLUB and
SLAB, because slab_post_alloc_hook() does not perform this check
before calling kmemcheck_slab_alloc().
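
The reason the check can disappear is that, when CONFIG_KMEMCHECK is
not set, <linux/kmemcheck.h> provides an empty static inline stub, so
the NULL test compiles away together with the call (a sketch of that
structure, not the exact header):

#ifdef CONFIG_KMEMCHECK
void kmemcheck_slab_alloc(struct kmem_cache *s, gfp_t gfpflags,
			  void *object, size_t size);
#else
static inline void kmemcheck_slab_alloc(struct kmem_cache *s, gfp_t gfpflags,
					void *object, size_t size)
{
}
#endif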

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/kmemcheck.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/kmemcheck.c b/mm/kmemcheck.c
index cab58bb592d8..6f4f424037c0 100644
--- a/mm/kmemcheck.c
+++ b/mm/kmemcheck.c
@@ -60,6 +60,9 @@ void kmemcheck_free_shadow(struct page *page, int order)
 void kmemcheck_slab_alloc(struct kmem_cache *s, gfp_t gfpflags, void *object,
 			  size_t size)
 {
+	if (unlikely(!object)) /* Skip object if allocation failed */
+		return;
+
 	/*
 	 * Has already been memset(), which initializes the shadow for us
 	 * as well.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH V2 06/11] slab: use slab_post_alloc_hook in SLAB allocator shared with SLUB
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
                     ` (4 preceding siblings ...)
  2016-01-12 15:14   ` [PATCH V2 05/11] mm: kmemcheck skip object if slab allocation failed Jesper Dangaard Brouer
@ 2016-01-12 15:14   ` Jesper Dangaard Brouer
  2016-01-12 15:15   ` [PATCH V2 07/11] slab: implement bulk alloc in SLAB allocator Jesper Dangaard Brouer
                     ` (4 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:14 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

Reviewers, notice that the order of kmemcheck_slab_alloc() and
kmemleak_alloc_recursive() in slab_post_alloc_hook() is swapped
compared to slab.c / the SLAB allocator.

Also notice that the memset now occurs before calling
kmemcheck_slab_alloc() and kmemleak_alloc_recursive().

I assume this reordering of kmemcheck, kmemleak and memset is okay
because this is the order in which the SLUB allocator uses them.

This patch completes the sharing of alloc_hook's between SLUB and SLAB.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slab.c |   22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 30365be73547..a05f716031de 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3172,16 +3172,11 @@ slab_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid,
   out:
 	local_irq_restore(save_flags);
 	ptr = cache_alloc_debugcheck_after(cachep, flags, ptr, caller);
-	kmemleak_alloc_recursive(ptr, cachep->object_size, 1, cachep->flags,
-				 flags);
 
-	if (likely(ptr)) {
-		kmemcheck_slab_alloc(cachep, flags, ptr, cachep->object_size);
-		if (unlikely(flags & __GFP_ZERO))
-			memset(ptr, 0, cachep->object_size);
-	}
+	if (unlikely(flags & __GFP_ZERO) && ptr)
+		memset(ptr, 0, cachep->object_size);
 
-	memcg_kmem_put_cache(cachep);
+	slab_post_alloc_hook(cachep, flags, 1, &ptr);
 	return ptr;
 }
 
@@ -3232,17 +3227,12 @@ slab_alloc(struct kmem_cache *cachep, gfp_t flags, unsigned long caller)
 	objp = __do_cache_alloc(cachep, flags);
 	local_irq_restore(save_flags);
 	objp = cache_alloc_debugcheck_after(cachep, flags, objp, caller);
-	kmemleak_alloc_recursive(objp, cachep->object_size, 1, cachep->flags,
-				 flags);
 	prefetchw(objp);
 
-	if (likely(objp)) {
-		kmemcheck_slab_alloc(cachep, flags, objp, cachep->object_size);
-		if (unlikely(flags & __GFP_ZERO))
-			memset(objp, 0, cachep->object_size);
-	}
+	if (unlikely(flags & __GFP_ZERO) && objp)
+		memset(objp, 0, cachep->object_size);
 
-	memcg_kmem_put_cache(cachep);
+	slab_post_alloc_hook(cachep, flags, 1, &objp);
 	return objp;
 }
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH V2 07/11] slab: implement bulk alloc in SLAB allocator
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
                     ` (5 preceding siblings ...)
  2016-01-12 15:14   ` [PATCH V2 06/11] slab: use slab_post_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
@ 2016-01-12 15:15   ` Jesper Dangaard Brouer
  2016-01-12 15:15   ` [PATCH V2 08/11] slab: avoid running debug SLAB code with IRQs disabled for alloc_bulk Jesper Dangaard Brouer
                     ` (3 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:15 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

This patch implements the alloc side of the bulk API for the SLAB
allocator.

Further optimization is still possible by changing the call to
__do_cache_alloc() into something that can return multiple
objects.  This optimization is left for later, given the end results
already show a speedup in the area of 80%.
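
From the caller's side the contract looks like this (a usage sketch;
"my_cache" and the batch size are made-up examples): the call returns
either the full count or 0, and on 0 any partially allocated objects
have already been freed internally.

	void *objs[8];

	/* interrupts must be enabled around the bulk calls */
	if (kmem_cache_alloc_bulk(my_cache, GFP_KERNEL, ARRAY_SIZE(objs),
				  objs) == 0)
		return -ENOMEM;	/* nothing to clean up on failure */

	/* ... use objs[0..7] ... */

	kmem_cache_free_bulk(my_cache, ARRAY_SIZE(objs), objs);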

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slab.c |   37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index a05f716031de..17931ea961d1 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3392,9 +3392,42 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
 EXPORT_SYMBOL(kmem_cache_free_bulk);
 
 int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
-								void **p)
+			  void **p)
 {
-	return __kmem_cache_alloc_bulk(s, flags, size, p);
+	size_t i;
+
+	s = slab_pre_alloc_hook(s, flags);
+	if (!s)
+		return 0;
+
+	cache_alloc_debugcheck_before(s, flags);
+
+	local_irq_disable();
+	for (i = 0; i < size; i++) {
+		void *objp = __do_cache_alloc(s, flags);
+
+		/* this call could be done outside IRQ disabled section */
+		objp = cache_alloc_debugcheck_after(s, flags, objp, _RET_IP_);
+
+		if (unlikely(!objp))
+			goto error;
+		p[i] = objp;
+	}
+	local_irq_enable();
+
+	/* Clear memory outside IRQ disabled section */
+	if (unlikely(flags & __GFP_ZERO))
+		for (i = 0; i < size; i++)
+			memset(p[i], 0, s->object_size);
+
+	slab_post_alloc_hook(s, flags, size, p);
+	/* FIXME: Trace call missing. Christoph would like a bulk variant */
+	return size;
+error:
+	local_irq_enable();
+	slab_post_alloc_hook(s, flags, i, p);
+	__kmem_cache_free_bulk(s, i, p);
+	return 0;
 }
 EXPORT_SYMBOL(kmem_cache_alloc_bulk);
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH V2 08/11] slab: avoid running debug SLAB code with IRQs disabled for alloc_bulk
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
                     ` (6 preceding siblings ...)
  2016-01-12 15:15   ` [PATCH V2 07/11] slab: implement bulk alloc in SLAB allocator Jesper Dangaard Brouer
@ 2016-01-12 15:15   ` Jesper Dangaard Brouer
  2016-01-12 15:15   ` [PATCH V2 09/11] slab: implement bulk free in SLAB allocator Jesper Dangaard Brouer
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:15 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

Move the call to cache_alloc_debugcheck_after() outside the IRQ
disabled section in kmem_cache_alloc_bulk().

When CONFIG_DEBUG_SLAB is disabled the compiler should remove
this code.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slab.c |   16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 17931ea961d1..3f391e200ea2 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3391,6 +3391,16 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
 }
 EXPORT_SYMBOL(kmem_cache_free_bulk);
 
+static __always_inline void
+cache_alloc_debugcheck_after_bulk(struct kmem_cache *s, gfp_t flags,
+				  size_t size, void **p, unsigned long caller)
+{
+	size_t i;
+
+	for (i = 0; i < size; i++)
+		p[i] = cache_alloc_debugcheck_after(s, flags, p[i], caller);
+}
+
 int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 			  void **p)
 {
@@ -3406,15 +3416,14 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 	for (i = 0; i < size; i++) {
 		void *objp = __do_cache_alloc(s, flags);
 
-		/* this call could be done outside IRQ disabled section */
-		objp = cache_alloc_debugcheck_after(s, flags, objp, _RET_IP_);
-
 		if (unlikely(!objp))
 			goto error;
 		p[i] = objp;
 	}
 	local_irq_enable();
 
+	cache_alloc_debugcheck_after_bulk(s, flags, size, p, _RET_IP_);
+
 	/* Clear memory outside IRQ disabled section */
 	if (unlikely(flags & __GFP_ZERO))
 		for (i = 0; i < size; i++)
@@ -3425,6 +3434,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 	return size;
 error:
 	local_irq_enable();
+	cache_alloc_debugcheck_after_bulk(s, flags, i, p, _RET_IP_);
 	slab_post_alloc_hook(s, flags, i, p);
 	__kmem_cache_free_bulk(s, i, p);
 	return 0;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH V2 09/11] slab: implement bulk free in SLAB allocator
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
                     ` (7 preceding siblings ...)
  2016-01-12 15:15   ` [PATCH V2 08/11] slab: avoid running debug SLAB code with IRQs disabled for alloc_bulk Jesper Dangaard Brouer
@ 2016-01-12 15:15   ` Jesper Dangaard Brouer
  2016-01-12 15:16   ` [PATCH V2 10/11] mm: new API kfree_bulk() for SLAB+SLUB allocators Jesper Dangaard Brouer
  2016-01-12 15:16   ` [PATCH V2 11/11] mm: fix some spelling Jesper Dangaard Brouer
  10 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:15 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

This patch implements the free side of the bulk API for the SLAB
allocator, kmem_cache_free_bulk(), and concludes the
implementation of the optimized bulk API for the SLAB allocator.

Benchmarked[1] cost of alloc+free (obj size 256 bytes) on CPU
i7-4790K @ 4.00GHz, with no debug options, no PREEMPT and
CONFIG_MEMCG_KMEM=y but no active user of kmemcg.

SLAB single alloc+free cost: 87 cycles(tsc) 21.814 ns with this
optimized config.

bulk- Current fallback          - optimized SLAB bulk
  1 - 102 cycles(tsc) 25.747 ns - 41 cycles(tsc) 10.490 ns - improved 59.8%
  2 -  94 cycles(tsc) 23.546 ns - 26 cycles(tsc)  6.567 ns - improved 72.3%
  3 -  92 cycles(tsc) 23.127 ns - 20 cycles(tsc)  5.244 ns - improved 78.3%
  4 -  90 cycles(tsc) 22.663 ns - 18 cycles(tsc)  4.588 ns - improved 80.0%
  8 -  88 cycles(tsc) 22.242 ns - 14 cycles(tsc)  3.656 ns - improved 84.1%
 16 -  88 cycles(tsc) 22.010 ns - 13 cycles(tsc)  3.480 ns - improved 85.2%
 30 -  89 cycles(tsc) 22.305 ns - 13 cycles(tsc)  3.303 ns - improved 85.4%
 32 -  89 cycles(tsc) 22.277 ns - 13 cycles(tsc)  3.309 ns - improved 85.4%
 34 -  88 cycles(tsc) 22.246 ns - 13 cycles(tsc)  3.294 ns - improved 85.2%
 48 -  88 cycles(tsc) 22.121 ns - 13 cycles(tsc)  3.492 ns - improved 85.2%
 64 -  88 cycles(tsc) 22.052 ns - 13 cycles(tsc)  3.411 ns - improved 85.2%
128 -  89 cycles(tsc) 22.452 ns - 15 cycles(tsc)  3.841 ns - improved 83.1%
158 -  89 cycles(tsc) 22.403 ns - 14 cycles(tsc)  3.746 ns - improved 84.3%
250 -  91 cycles(tsc) 22.775 ns - 16 cycles(tsc)  4.111 ns - improved 82.4%

Notice that it is not recommended to do very large bulk operations with
this bulk API, because local IRQs are disabled for the whole period.

[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/slab_bulk_test01.c
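
A caller that really has to free a very large batch can bound the
IRQ-disabled window by chunking the free, e.g. (a sketch; the chunk
size is an arbitrary example):

static void free_in_chunks(struct kmem_cache *s, size_t nr, void **p)
{
	const size_t chunk = 64;	/* arbitrary example bound */

	while (nr) {
		size_t n = min_t(size_t, nr, chunk);

		kmem_cache_free_bulk(s, n, p);
		p += n;
		nr -= n;
	}
}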

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 mm/slab.c |   29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 3f391e200ea2..6cc5f99fe2ea 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3385,12 +3385,6 @@ void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
 }
 EXPORT_SYMBOL(kmem_cache_alloc);
 
-void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
-{
-	__kmem_cache_free_bulk(s, size, p);
-}
-EXPORT_SYMBOL(kmem_cache_free_bulk);
-
 static __always_inline void
 cache_alloc_debugcheck_after_bulk(struct kmem_cache *s, gfp_t flags,
 				  size_t size, void **p, unsigned long caller)
@@ -3584,6 +3578,29 @@ void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 }
 EXPORT_SYMBOL(kmem_cache_free);
 
+void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
+{
+	struct kmem_cache *s;
+	size_t i;
+
+	local_irq_disable();
+	for (i = 0; i < size; i++) {
+		void *objp = p[i];
+
+		s = cache_from_obj(orig_s, objp);
+
+		debug_check_no_locks_freed(objp, s->object_size);
+		if (!(s->flags & SLAB_DEBUG_OBJECTS))
+			debug_check_no_obj_freed(objp, s->object_size);
+
+		__cache_free(s, objp, _RET_IP_);
+	}
+	local_irq_enable();
+
+	/* FIXME: add tracing */
+}
+EXPORT_SYMBOL(kmem_cache_free_bulk);
+
 /**
  * kfree - free previously allocated memory
  * @objp: pointer returned by kmalloc.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH V2 10/11] mm: new API kfree_bulk() for SLAB+SLUB allocators
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
                     ` (8 preceding siblings ...)
  2016-01-12 15:15   ` [PATCH V2 09/11] slab: implement bulk free in SLAB allocator Jesper Dangaard Brouer
@ 2016-01-12 15:16   ` Jesper Dangaard Brouer
  2016-01-12 15:16   ` [PATCH V2 11/11] mm: fix some spelling Jesper Dangaard Brouer
  10 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:16 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

This patch introduces a new API call, kfree_bulk(), for bulk freeing
of memory objects not bound to a single kmem_cache.

Christoph pointed out that it is possible to implement freeing of
objects without knowing the kmem_cache pointer, as that information is
available from the object's page->slab_cache.  He proposed removing the
kmem_cache argument from the bulk free API.

Jesper demonstrated that these extra steps per object come at a
performance cost.  They are only done anyhow in the case where
CONFIG_MEMCG_KMEM is compiled in and activated at runtime.
The extra cost is most visible for the SLAB allocator, because the SLUB
allocator does the page lookup (virt_to_head_page()) anyhow.

Thus, the conclusion was to keep the kmem_cache free bulk API with a
kmem_cache pointer, but we can still implement a kfree_bulk() API
fairly easily, simply by handling the case where kmem_cache_free_bulk()
gets called with a NULL kmem_cache pointer.

This does increase the code size a bit, but implementing a separate
kfree_bulk() call would likely increase code size even more.

Below benchmarks cost of alloc+free (obj size 256 bytes) on
CPU i7-4790K @ 4.00GHz, no PREEMPT and CONFIG_MEMCG_KMEM=y.

Code size increase for SLAB:

 add/remove: 0/0 grow/shrink: 1/0 up/down: 74/0 (74)
 function                                     old     new   delta
 kmem_cache_free_bulk                         660     734     +74

SLAB fastpath: 87 cycles(tsc) 21.814 ns
  sz - fallback             - kmem_cache_free_bulk - kfree_bulk
   1 - 103 cycles 25.878 ns -  41 cycles 10.498 ns - 81 cycles 20.312 ns
   2 -  94 cycles 23.673 ns -  26 cycles  6.682 ns - 42 cycles 10.649 ns
   3 -  92 cycles 23.181 ns -  21 cycles  5.325 ns - 39 cycles 9.950 ns
   4 -  90 cycles 22.727 ns -  18 cycles  4.673 ns - 26 cycles 6.693 ns
   8 -  89 cycles 22.270 ns -  14 cycles  3.664 ns - 23 cycles 5.835 ns
  16 -  88 cycles 22.038 ns -  14 cycles  3.503 ns - 22 cycles 5.543 ns
  30 -  89 cycles 22.284 ns -  13 cycles  3.310 ns - 20 cycles 5.197 ns
  32 -  88 cycles 22.249 ns -  13 cycles  3.420 ns - 20 cycles 5.166 ns
  34 -  88 cycles 22.224 ns -  14 cycles  3.643 ns - 20 cycles 5.170 ns
  48 -  88 cycles 22.088 ns -  14 cycles  3.507 ns - 20 cycles 5.203 ns
  64 -  88 cycles 22.063 ns -  13 cycles  3.428 ns - 20 cycles 5.152 ns
 128 -  89 cycles 22.483 ns -  15 cycles  3.891 ns - 23 cycles 5.885 ns
 158 -  89 cycles 22.381 ns -  15 cycles  3.779 ns - 22 cycles 5.548 ns
 250 -  91 cycles 22.798 ns -  16 cycles  4.152 ns - 23 cycles 5.967 ns

SLAB when enabling MEMCG_KMEM runtime:
 - kmemcg fastpath: 130 cycles(tsc) 32.684 ns (step:0)
 1 - 148 cycles 37.220 ns -  66 cycles 16.622 ns - 66 cycles 16.583 ns
 2 - 141 cycles 35.510 ns -  51 cycles 12.820 ns - 58 cycles 14.625 ns
 3 - 140 cycles 35.017 ns -  37 cycles 9.326 ns - 33 cycles 8.474 ns
 4 - 137 cycles 34.507 ns -  31 cycles 7.888 ns - 33 cycles 8.300 ns
 8 - 140 cycles 35.069 ns -  25 cycles 6.461 ns - 25 cycles 6.436 ns
 16 - 138 cycles 34.542 ns -  23 cycles 5.945 ns - 22 cycles 5.670 ns
 30 - 136 cycles 34.227 ns -  22 cycles 5.502 ns - 22 cycles 5.587 ns
 32 - 136 cycles 34.253 ns -  21 cycles 5.475 ns - 21 cycles 5.324 ns
 34 - 136 cycles 34.254 ns -  21 cycles 5.448 ns - 20 cycles 5.194 ns
 48 - 136 cycles 34.075 ns -  21 cycles 5.458 ns - 21 cycles 5.367 ns
 64 - 135 cycles 33.994 ns -  21 cycles 5.350 ns - 21 cycles 5.259 ns
 128 - 137 cycles 34.446 ns -  23 cycles 5.816 ns - 22 cycles 5.688 ns
 158 - 137 cycles 34.379 ns -  22 cycles 5.727 ns - 22 cycles 5.602 ns
 250 - 138 cycles 34.755 ns -  24 cycles 6.093 ns - 23 cycles 5.986 ns

Code size increase for SLUB:
 function                                     old     new   delta
 kmem_cache_free_bulk                         717     799     +82

SLUB benchmark:
 SLUB fastpath: 46 cycles(tsc) 11.691 ns (step:0)
  sz - fallback             - kmem_cache_free_bulk - kfree_bulk
   1 -  61 cycles 15.486 ns -  53 cycles 13.364 ns - 57 cycles 14.464 ns
   2 -  54 cycles 13.703 ns -  32 cycles  8.110 ns - 33 cycles 8.482 ns
   3 -  53 cycles 13.272 ns -  25 cycles  6.362 ns - 27 cycles 6.947 ns
   4 -  51 cycles 12.994 ns -  24 cycles  6.087 ns - 24 cycles 6.078 ns
   8 -  50 cycles 12.576 ns -  21 cycles  5.354 ns - 22 cycles 5.513 ns
  16 -  49 cycles 12.368 ns -  20 cycles  5.054 ns - 20 cycles 5.042 ns
  30 -  49 cycles 12.273 ns -  18 cycles  4.748 ns - 19 cycles 4.758 ns
  32 -  49 cycles 12.401 ns -  19 cycles  4.821 ns - 19 cycles 4.810 ns
  34 -  98 cycles 24.519 ns -  24 cycles  6.154 ns - 24 cycles 6.157 ns
  48 -  83 cycles 20.833 ns -  21 cycles  5.446 ns - 21 cycles 5.429 ns
  64 -  75 cycles 18.891 ns -  20 cycles  5.247 ns - 20 cycles 5.238 ns
 128 -  93 cycles 23.271 ns -  27 cycles  6.856 ns - 27 cycles 6.823 ns
 158 - 102 cycles 25.581 ns -  30 cycles  7.714 ns - 30 cycles 7.695 ns
 250 - 107 cycles 26.917 ns -  38 cycles  9.514 ns - 38 cycles 9.506 ns

SLUB when enabling MEMCG_KMEM runtime:
 - kmemcg fastpath: 71 cycles(tsc) 17.897 ns (step:0)
 1 - 85 cycles 21.484 ns -  78 cycles 19.569 ns - 75 cycles 18.938 ns
 2 - 81 cycles 20.363 ns -  45 cycles 11.258 ns - 44 cycles 11.076 ns
 3 - 78 cycles 19.709 ns -  33 cycles 8.354 ns - 32 cycles 8.044 ns
 4 - 77 cycles 19.430 ns -  28 cycles 7.216 ns - 28 cycles 7.003 ns
 8 - 101 cycles 25.288 ns -  23 cycles 5.849 ns - 23 cycles 5.787 ns
 16 - 76 cycles 19.148 ns -  20 cycles 5.162 ns - 20 cycles 5.081 ns
 30 - 76 cycles 19.067 ns -  19 cycles 4.868 ns - 19 cycles 4.821 ns
 32 - 76 cycles 19.052 ns -  19 cycles 4.857 ns - 19 cycles 4.815 ns
 34 - 121 cycles 30.291 ns -  25 cycles 6.333 ns - 25 cycles 6.268 ns
 48 - 108 cycles 27.111 ns -  21 cycles 5.498 ns - 21 cycles 5.458 ns
 64 - 100 cycles 25.164 ns -  20 cycles 5.242 ns - 20 cycles 5.229 ns
 128 - 155 cycles 38.976 ns -  27 cycles 6.886 ns - 27 cycles 6.892 ns
 158 - 132 cycles 33.034 ns -  30 cycles 7.711 ns - 30 cycles 7.728 ns
 250 - 130 cycles 32.612 ns -  38 cycles 9.560 ns - 38 cycles 9.549 ns

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

----
V2: Re-ran SLAB benchmark, Kim noticed measurement instability
 - Also tested with SLOB allocator

 include/linux/slab.h |    9 +++++++++
 mm/slab.c            |    5 ++++-
 mm/slab_common.c     |    8 ++++++--
 mm/slub.c            |   21 ++++++++++++++++++---
 4 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 2037a861e367..df70e69c8f7f 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -318,6 +318,15 @@ void kmem_cache_free(struct kmem_cache *, void *);
 void kmem_cache_free_bulk(struct kmem_cache *, size_t, void **);
 int kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **);
 
+/*
+ * Caller must not use kfree_bulk() on memory not originally allocated
+ * by kmalloc(), because the SLOB allocator cannot handle this.
+ */
+static __always_inline void kfree_bulk(size_t size, void **p)
+{
+	kmem_cache_free_bulk(NULL, size, p);
+}
+
 #ifdef CONFIG_NUMA
 void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment;
 void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) __assume_slab_alignment;
diff --git a/mm/slab.c b/mm/slab.c
index 6cc5f99fe2ea..68a6c5b53069 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3587,7 +3587,10 @@ void kmem_cache_free_bulk(struct kmem_cache *orig_s, size_t size, void **p)
 	for (i = 0; i < size; i++) {
 		void *objp = p[i];
 
-		s = cache_from_obj(orig_s, objp);
+		if (!orig_s) /* called via kfree_bulk */
+			s = virt_to_cache(objp);
+		else
+			s = cache_from_obj(orig_s, objp);
 
 		debug_check_no_locks_freed(objp, s->object_size);
 		if (!(s->flags & SLAB_DEBUG_OBJECTS))
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 3c6a86b4ec25..963c25589949 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -108,8 +108,12 @@ void __kmem_cache_free_bulk(struct kmem_cache *s, size_t nr, void **p)
 {
 	size_t i;
 
-	for (i = 0; i < nr; i++)
-		kmem_cache_free(s, p[i]);
+	for (i = 0; i < nr; i++) {
+		if (s)
+			kmem_cache_free(s, p[i]);
+		else
+			kfree(p[i]);
+	}
 }
 
 int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t nr,
diff --git a/mm/slub.c b/mm/slub.c
index 9ef1abc683b2..9c5e1bc3b389 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2787,23 +2787,38 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
 	size_t first_skipped_index = 0;
 	int lookahead = 3;
 	void *object;
+	struct page *page;
 
 	/* Always re-init detached_freelist */
 	df->page = NULL;
 
 	do {
 		object = p[--size];
+		/* Do we need !ZERO_OR_NULL_PTR(object) here? (for kfree) */
 	} while (!object && size);
 
 	if (!object)
 		return 0;
 
-	/* Support for memcg, compiler can optimize this out */
-	df->s = cache_from_obj(s, object);
+	page = virt_to_head_page(object);
+	if (!s) {
+		/* Handle kalloc'ed objects */
+		if (unlikely(!PageSlab(page))) {
+			BUG_ON(!PageCompound(page));
+			kfree_hook(object);
+			__free_kmem_pages(page, compound_order(page));
+			p[size] = NULL; /* mark object processed */
+			return size;
+		}
+		/* Derive kmem_cache from object */
+		df->s = page->slab_cache;
+	} else {
+		df->s = cache_from_obj(s, object); /* Support for memcg */
+	}
 
 	/* Start new detached freelist */
+	df->page = page;
 	set_freepointer(df->s, object, NULL);
-	df->page = virt_to_head_page(object);
 	df->tail = object;
 	df->freelist = object;
 	p[size] = NULL; /* mark object processed */

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH V2 11/11] mm: fix some spelling
  2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
                     ` (9 preceding siblings ...)
  2016-01-12 15:16   ` [PATCH V2 10/11] mm: new API kfree_bulk() for SLAB+SLUB allocators Jesper Dangaard Brouer
@ 2016-01-12 15:16   ` Jesper Dangaard Brouer
  10 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2016-01-12 15:16 UTC (permalink / raw)
  To: linux-mm, Christoph Lameter
  Cc: Vladimir Davydov, Andrew Morton, Linus Torvalds, Joonsoo Kim,
	Jesper Dangaard Brouer

Fixup trivial spelling errors, noticed while reading the code.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 include/linux/memcontrol.h |    2 +-
 include/linux/slab.h       |    2 +-
 mm/slab.h                  |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index cd0e2413c358..e88c57e83215 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -757,7 +757,7 @@ int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order);
 void __memcg_kmem_uncharge(struct page *page, int order);
 
 /*
- * helper for acessing a memcg's index. It will be used as an index in the
+ * helper for accessing a memcg's index. It will be used as an index in the
  * child cache array in kmem_cache, and also to derive its name. This function
  * will return -1 when this is not a kmem-limited memcg.
  */
diff --git a/include/linux/slab.h b/include/linux/slab.h
index df70e69c8f7f..5bbde0814235 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -309,7 +309,7 @@ void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags) __assume_slab_alignment
 void kmem_cache_free(struct kmem_cache *, void *);
 
 /*
- * Bulk allocation and freeing operations. These are accellerated in an
+ * Bulk allocation and freeing operations. These are accelerated in an
  * allocator specific way to avoid taking locks repeatedly or building
  * metadata structures unnecessarily.
  *
diff --git a/mm/slab.h b/mm/slab.h
index 343ee496c53b..3357430e29e8 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -170,7 +170,7 @@ ssize_t slabinfo_write(struct file *file, const char __user *buffer,
 /*
  * Generic implementation of bulk operations
  * These are useful for situations in which the allocator cannot
- * perform optimizations. In that case segments of the objecct listed
+ * perform optimizations. In that case segments of the object listed
  * may be allocated or freed using these operations.
  */
 void __kmem_cache_free_bulk(struct kmem_cache *, size_t, void **);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2016-01-12 15:16 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-07 14:03 [PATCH 00/10] MM: More bulk API work Jesper Dangaard Brouer
2016-01-07 14:03 ` [PATCH 01/10] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk Jesper Dangaard Brouer
2016-01-07 15:54   ` Christoph Lameter
2016-01-07 17:41     ` Jesper Dangaard Brouer
2016-01-08  2:58   ` Joonsoo Kim
2016-01-08 11:05     ` Jesper Dangaard Brouer
2016-01-07 14:03 ` [PATCH 02/10] mm/slab: move SLUB alloc hooks to common mm/slab.h Jesper Dangaard Brouer
2016-01-07 14:03 ` [PATCH 03/10] mm: fault-inject take over bootstrap kmem_cache check Jesper Dangaard Brouer
2016-01-07 14:03 ` [PATCH 04/10] slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
2016-01-08  3:05   ` Joonsoo Kim
2016-01-07 14:03 ` [PATCH 05/10] mm: kmemcheck skip object if slab allocation failed Jesper Dangaard Brouer
2016-01-07 14:04 ` [PATCH 06/10] slab: use slab_post_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
2016-01-07 14:04 ` [PATCH 07/10] slab: implement bulk alloc in SLAB allocator Jesper Dangaard Brouer
2016-01-07 14:04 ` [PATCH 08/10] slab: avoid running debug SLAB code with IRQs disabled for alloc_bulk Jesper Dangaard Brouer
2016-01-07 14:04 ` [PATCH 09/10] slab: implement bulk free in SLAB allocator Jesper Dangaard Brouer
2016-01-07 14:04 ` [PATCH 10/10] mm: new API kfree_bulk() for SLAB+SLUB allocators Jesper Dangaard Brouer
2016-01-08  3:03   ` Joonsoo Kim
2016-01-08 11:20     ` Jesper Dangaard Brouer
2016-01-07 18:54 ` [PATCH 00/10] MM: More bulk API work Linus Torvalds
2016-01-12 15:13 ` [PATCH V2 00/11] " Jesper Dangaard Brouer
2016-01-12 15:13   ` [PATCH V2 01/11] slub: cleanup code for kmem cgroup support to kmem_cache_free_bulk Jesper Dangaard Brouer
2016-01-12 15:13   ` [PATCH V2 02/11] mm/slab: move SLUB alloc hooks to common mm/slab.h Jesper Dangaard Brouer
2016-01-12 15:14   ` [PATCH V2 03/11] mm: fault-inject take over bootstrap kmem_cache check Jesper Dangaard Brouer
2016-01-12 15:14   ` [PATCH V2 04/11] slab: use slab_pre_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
2016-01-12 15:14   ` [PATCH V2 05/11] mm: kmemcheck skip object if slab allocation failed Jesper Dangaard Brouer
2016-01-12 15:14   ` [PATCH V2 06/11] slab: use slab_post_alloc_hook in SLAB allocator shared with SLUB Jesper Dangaard Brouer
2016-01-12 15:15   ` [PATCH V2 07/11] slab: implement bulk alloc in SLAB allocator Jesper Dangaard Brouer
2016-01-12 15:15   ` [PATCH V2 08/11] slab: avoid running debug SLAB code with IRQs disabled for alloc_bulk Jesper Dangaard Brouer
2016-01-12 15:15   ` [PATCH V2 09/11] slab: implement bulk free in SLAB allocator Jesper Dangaard Brouer
2016-01-12 15:16   ` [PATCH V2 10/11] mm: new API kfree_bulk() for SLAB+SLUB allocators Jesper Dangaard Brouer
2016-01-12 15:16   ` [PATCH V2 11/11] mm: fix some spelling Jesper Dangaard Brouer
