* [RFC 0/3] Slab allocator array operations
@ 2015-01-23 21:37 Christoph Lameter
  2015-01-23 21:37 ` [RFC 1/3] Slab infrastructure for " Christoph Lameter
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Christoph Lameter @ 2015-01-23 21:37 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, linux-mm, penberg, iamjoonsoo, Jesper Dangaard Brouer

Attached is a series of 3 patches implementing functionality to allocate
arrays of pointers to slab objects. This can be used by the slab
allocators to offer more optimized allocation and free paths.
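
To illustrate the API (a sketch; kmem_cache_alloc_array() returns the
number of objects actually allocated, which may be fewer than requested;
"my_cache" and process_object() stand in for a real user here):

	void *objs[16];
	int nr, i;

	nr = kmem_cache_alloc_array(my_cache, GFP_KERNEL, 16, objs);
	for (i = 0; i < nr; i++)
		process_object(objs[i]);	/* hypothetical consumer */
	kmem_cache_free_array(my_cache, nr, objs);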



* [RFC 1/3] Slab infrastructure for array operations
  2015-01-23 21:37 [RFC 0/3] Slab allocator array operations Christoph Lameter
@ 2015-01-23 21:37 ` Christoph Lameter
  2015-01-27  8:21   ` Joonsoo Kim
  2015-01-23 21:37 ` [RFC 2/3] slub: Support " Christoph Lameter
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Christoph Lameter @ 2015-01-23 21:37 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, linux-mm, penberg, iamjoonsoo, Jesper Dangaard Brouer


This patch adds the basic infrastructure for alloc / free operations
on pointer arrays. It includes a fallback implementation that performs
the array operations using the single-object alloc and free operations
that every slab allocator provides.

Allocators must define _HAVE_SLAB_ALLOCATOR_ARRAY_OPERATIONS in their
header files in order to implement their own fast versions of
these array operations.

Array operations reduce the per-object processing overhead
during allocation and therefore speed up the acquisition of larger
numbers of objects.

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/linux/slab.h
===================================================================
--- linux.orig/include/linux/slab.h
+++ linux/include/linux/slab.h
@@ -123,6 +123,7 @@ struct kmem_cache *memcg_create_kmem_cac
 void kmem_cache_destroy(struct kmem_cache *);
 int kmem_cache_shrink(struct kmem_cache *);
 void kmem_cache_free(struct kmem_cache *, void *);
+void kmem_cache_free_array(struct kmem_cache *, size_t, void **);
 
 /*
  * Please use this macro to create slab caches. Simply specify the
@@ -290,6 +291,39 @@ static __always_inline int kmalloc_index
 void *__kmalloc(size_t size, gfp_t flags);
 void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags);
 
+/*
+ * Additional flags that may be specified in kmem_cache_alloc_array()'s
+ * gfp flags.
+ *
+ * If no flags are specified then kmem_cache_alloc_array() will first exhaust
+ * the partial slab page lists of the local node, then allocate new pages from
+ * the page allocator as long as more objects than fit in one slab page are
+ * still wanted, and fill up the rest from local cached objects. If that is
+ * not enough then the remaining objects will be allocated via kmem_cache_alloc().
+ */
+
+/* Use objects cached for the processor */
+#define GFP_SLAB_ARRAY_LOCAL		((__force gfp_t)0x40000000)
+
+/* Use slabs from this node that have objects available */
+#define GFP_SLAB_ARRAY_PARTIAL		((__force gfp_t)0x20000000)
+
+/* Allocate new slab pages from page allocator */
+#define GFP_SLAB_ARRAY_NEW		((__force gfp_t)0x10000000)
+
+/*
+ * If other measures did not fill up the array to the full count
+ * requested then use kmem_cache_alloc to ensure the number of
+ * objects requested is allocated.
+ * If this flag is not set then the allocation may return
+ * fewer objects than specified if no more objects of the
+ * particular type are available.
+ */
+#define GFP_SLAB_ARRAY_FULL_COUNT	((__force gfp_t)0x08000000)
+
+int kmem_cache_alloc_array(struct kmem_cache *, gfp_t gfpflags,
+				size_t nr, void **);
+
 #ifdef CONFIG_NUMA
 void *__kmalloc_node(size_t size, gfp_t flags, int node);
 void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node);
Index: linux/mm/slab_common.c
===================================================================
--- linux.orig/mm/slab_common.c
+++ linux/mm/slab_common.c
@@ -105,6 +105,92 @@ static inline int kmem_cache_sanity_chec
 }
 #endif
 
+#ifndef _HAVE_SLAB_ALLOCATOR_ARRAY_OPERATIONS
+int kmem_cache_alloc_array(struct kmem_cache *s,
+		gfp_t flags, size_t nr, void **p)
+{
+	int i;
+
+	/*
+	 * Generic code does not support the processing of the
+	 * special allocation flags. So strip them off the mask.
+	 */
+	flags &= __GFP_BITS_MASK;
+
+	for (i = 0; i < nr; i++) {
+		void *x = kmem_cache_alloc(s, flags);
+
+		if (!x)
+			return i;
+		p[i] = x;
+	}
+	return nr;
+}
+EXPORT_SYMBOL(kmem_cache_alloc_array);
+
+void kmem_cache_free_array(struct kmem_cache *s, size_t nr, void **p)
+{
+	int i;
+
+	for (i = 0; i < nr; i++)
+		kmem_cache_free(s, p[i]);
+}
+EXPORT_SYMBOL(kmem_cache_free_array);
+#else
+
+int kmem_cache_alloc_array(struct kmem_cache *s,
+		gfp_t flags, size_t nr, void **p)
+{
+	int i = 0;
+
+	/*
+	 * Setup the default operation mode if no special GFP_SLAB_*
+	 * flags were specified.
+	 */
+	if ((flags & ~__GFP_BITS_MASK) == 0)
+		flags |= GFP_SLAB_ARRAY_PARTIAL |
+			 GFP_SLAB_ARRAY_NEW |
+			 GFP_SLAB_ARRAY_LOCAL |
+			 GFP_SLAB_ARRAY_FULL_COUNT;
+
+	/*
+	 * First extract objects from partial lists in order to
+	 * avoid further fragmentation.
+	 */
+	if (flags & GFP_SLAB_ARRAY_PARTIAL)
+		i += slab_array_alloc_from_partial(s, nr - i, p + i);
+
+	/*
+	 * If there are still a larger number of objects to be allocated
+	 * use the page allocator directly.
+	 */
+	if ((flags & GFP_SLAB_ARRAY_NEW) && nr - i > objects_per_slab_page(s))
+		i += slab_array_alloc_from_page_allocator(s,
+				flags & __GFP_BITS_MASK,
+				nr - i, p + i);
+
+	/* Get per cpu objects that may be available */
+	if (flags & GFP_SLAB_ARRAY_LOCAL)
+		i += slab_array_alloc_from_local(s, nr - i, p + i);
+
+	/*
+	 * If a fully filled array has been requested then fill it
+	 * up if there are objects missing using the regular kmem_cache_alloc()
+	 */
+	if (flags & GFP_SLAB_ARRAY_FULL_COUNT)
+		while (i < nr) {
+			void *x = kmem_cache_alloc(s,
+					flags & __GFP_BITS_MASK);
+			if (!x)
+				return i;
+			p[i++] = x;
+		}
+
+	return i;
+}
+EXPORT_SYMBOL(kmem_cache_alloc_array);
+#endif
+
 #ifdef CONFIG_MEMCG_KMEM
 static int memcg_alloc_cache_params(struct mem_cgroup *memcg,
 		struct kmem_cache *s, struct kmem_cache *root_cache)
Index: linux/mm/slab.h
===================================================================
--- linux.orig/mm/slab.h
+++ linux/mm/slab.h
@@ -69,6 +69,10 @@ extern struct kmem_cache *kmem_cache;
 unsigned long calculate_alignment(unsigned long flags,
 		unsigned long align, unsigned long size);
 
+/* Determine the number of objects per slab page */
+unsigned objects_per_slab_page(struct kmem_cache *);
+
+
 #ifndef CONFIG_SLOB
 /* Kmalloc array related functions */
 void create_kmalloc_caches(unsigned long);
@@ -362,4 +366,10 @@ void *slab_next(struct seq_file *m, void
 void slab_stop(struct seq_file *m, void *p);
 int memcg_slab_show(struct seq_file *m, void *p);
 
+
+int slab_array_alloc_from_partial(struct kmem_cache *s, size_t nr, void **p);
+int slab_array_alloc_from_local(struct kmem_cache *s, size_t nr, void **p);
+int slab_array_alloc_from_page_allocator(struct kmem_cache *s, gfp_t flags,
+					size_t nr, void **p);
+
 #endif /* MM_SLAB_H */
Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c
+++ linux/mm/slub.c
@@ -332,6 +332,11 @@ static inline int oo_objects(struct kmem
 	return x.x & OO_MASK;
 }
 
+unsigned objects_per_slab_page(struct kmem_cache *s)
+{
+	return oo_objects(s->oo);
+}
+
 /*
  * Per slab locking using the pagelock
  */



* [RFC 2/3] slub: Support for array operations
  2015-01-23 21:37 [RFC 0/3] Slab allocator array operations Christoph Lameter
  2015-01-23 21:37 ` [RFC 1/3] Slab infrastructure for " Christoph Lameter
@ 2015-01-23 21:37 ` Christoph Lameter
  2015-01-23 21:37 ` [RFC 3/3] Array alloc test code Christoph Lameter
  2015-01-23 22:57 ` [RFC 0/3] Slab allocator array operations Andrew Morton
  3 siblings, 0 replies; 14+ messages in thread
From: Christoph Lameter @ 2015-01-23 21:37 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, linux-mm, penberg, iamjoonsoo, Jesper Dangaard Brouer


The major portions are there but there is no support yet for
directly allocating per cpu objects. There could also be more
sophisticated code to exploit the batch freeing.

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/include/linux/slub_def.h
===================================================================
--- linux.orig/include/linux/slub_def.h
+++ linux/include/linux/slub_def.h
@@ -110,4 +110,5 @@ static inline void sysfs_slab_remove(str
 }
 #endif
 
+#define _HAVE_SLAB_ALLOCATOR_ARRAY_OPERATIONS
 #endif /* _LINUX_SLUB_DEF_H */
Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c
+++ linux/mm/slub.c
@@ -1379,13 +1379,9 @@ static void setup_object(struct kmem_cac
 		s->ctor(object);
 }
 
-static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
+static struct page *__new_slab(struct kmem_cache *s, gfp_t flags, int node)
 {
 	struct page *page;
-	void *start;
-	void *p;
-	int order;
-	int idx;
 
 	if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
 		pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
@@ -1394,33 +1390,42 @@ static struct page *new_slab(struct kmem
 
 	page = allocate_slab(s,
 		flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
-	if (!page)
-		goto out;
+	if (page) {
+		inc_slabs_node(s, page_to_nid(page), page->objects);
+		page->slab_cache = s;
+		__SetPageSlab(page);
+		if (page->pfmemalloc)
+			SetPageSlabPfmemalloc(page);
+	}
 
-	order = compound_order(page);
-	inc_slabs_node(s, page_to_nid(page), page->objects);
-	page->slab_cache = s;
-	__SetPageSlab(page);
-	if (page->pfmemalloc)
-		SetPageSlabPfmemalloc(page);
-
-	start = page_address(page);
-
-	if (unlikely(s->flags & SLAB_POISON))
-		memset(start, POISON_INUSE, PAGE_SIZE << order);
-
-	for_each_object_idx(p, idx, s, start, page->objects) {
-		setup_object(s, page, p);
-		if (likely(idx < page->objects))
-			set_freepointer(s, p, p + s->size);
-		else
-			set_freepointer(s, p, NULL);
-	}
-
-	page->freelist = start;
-	page->inuse = page->objects;
-	page->frozen = 1;
-out:
+	return page;
+}
+
+static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
+{
+	struct page *page = __new_slab(s, flags, node);
+
+	if (page) {
+		void *p;
+		int idx;
+		void *start = page_address(page);
+
+		if (unlikely(s->flags & SLAB_POISON))
+			memset(start, POISON_INUSE,
+				PAGE_SIZE << compound_order(page));
+
+		for_each_object_idx(p, idx, s, start, page->objects) {
+			setup_object(s, page, p);
+			if (likely(idx < page->objects))
+				set_freepointer(s, p, p + s->size);
+			else
+				set_freepointer(s, p, NULL);
+		}
+
+		page->freelist = start;
+		page->inuse = page->objects;
+		page->frozen = 1;
+	}
 	return page;
 }
 
@@ -2516,8 +2521,83 @@ EXPORT_SYMBOL(kmem_cache_alloc_node_trac
 #endif
 #endif
 
+int slab_array_alloc_from_partial(struct kmem_cache *s,
+			size_t nr, void **p)
+{
+	void **end = p + nr;
+	struct kmem_cache_node *n = get_node(s, numa_mem_id());
+	int allocated = 0;
+	unsigned long flags;
+	struct page *page, *page2;
+
+	if (!n->nr_partial)
+		return 0;
+
+
+	spin_lock_irqsave(&n->list_lock, flags);
+	list_for_each_entry_safe(page, page2, &n->partial, lru) {
+		void *freelist;
+
+		if (page->objects - page->inuse > end - p)
+			/* More objects free in page than we want */
+			break;
+		list_del(&page->lru);
+		slab_lock(page);
+		freelist = page->freelist;
+		page->inuse = page->objects;
+		page->freelist = NULL;
+		slab_unlock(page);
+		/* Grab all available objects */
+		while (freelist) {
+			*p++ = freelist;
+			freelist = get_freepointer(s, freelist);
+			allocated++;
+		}
+	}
+	spin_unlock_irqrestore(&n->list_lock, flags);
+	return allocated;
+}
+
+int slab_array_alloc_from_page_allocator(struct kmem_cache *s,
+		gfp_t flags, size_t nr, void **p)
+{
+	void **end = p + nr;
+	int allocated = 0;
+
+	while (end - p >= oo_objects(s->oo)) {
+		struct page *page = __new_slab(s, flags, NUMA_NO_NODE);
+		void *q;
+		int i;
+
+		/* The page allocator may fail: return what we got so far */
+		if (!page)
+			break;
+		q = page_address(page);
+
+		/* Use all the objects */
+		for (i = 0; i < page->objects; i++) {
+			setup_object(s, page, q);
+			*p++ = q;
+			q += s->size;
+		}
+
+		page->inuse = page->objects;
+		page->freelist = NULL;
+		allocated += page->objects;
+	}
+	return allocated;
+}
+
+int slab_array_alloc_from_local(struct kmem_cache *s,
+		size_t nr, void **p)
+{
+	/* Go for the per cpu partials list first */
+	/* Use the cpu_slab if objects are still needed */
+	return 0;
+}
+
 /*
- * Slow patch handling. This may still be called frequently since objects
+ * Slow path handling. This may still be called frequently since objects
  * have a longer lifetime than the cpu slabs in most processing loads.
  *
  * So we still attempt to reduce cache line usage. Just take the slab
@@ -2637,6 +2712,14 @@ slab_empty:
 	discard_slab(s, page);
 }
 
+void kmem_cache_free_array(struct kmem_cache *s, size_t nr, void **p)
+{
+	void **end = p + nr;
+
+	for ( ; p < end; p++)
+		__slab_free(s, virt_to_head_page(*p), *p, 0);
+}
+
 /*
  * Fastpath with forced inlining to produce a kfree and kmem_cache_free that
  * can perform fastpath freeing without additional function calls.



* [RFC 3/3] Array alloc test code
  2015-01-23 21:37 [RFC 0/3] Slab allocator array operations Christoph Lameter
  2015-01-23 21:37 ` [RFC 1/3] Slab infrastructure for " Christoph Lameter
  2015-01-23 21:37 ` [RFC 2/3] slub: Support " Christoph Lameter
@ 2015-01-23 21:37 ` Christoph Lameter
  2015-01-23 22:57 ` [RFC 0/3] Slab allocator array operations Andrew Morton
  3 siblings, 0 replies; 14+ messages in thread
From: Christoph Lameter @ 2015-01-23 21:37 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, linux-mm, penberg, iamjoonsoo, Jesper Dangaard Brouer



A simple thrown-in test that allocates 100 objects and frees them again.

It spews out complaints about interrupts being disabled since we are in an
initcall, but it shows that the code works.

Signed-off-by: Christoph Lameter <cl@linux.com>

Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c
+++ linux/mm/slub.c
@@ -5308,6 +5308,22 @@ static int __init slab_sysfs_init(void)
 
 	mutex_unlock(&slab_mutex);
 	resiliency_test();
+
+	/* Test array alloc */
+	{
+		void *arr[100];
+		int nr;
+
+		printk(KERN_INFO "Array allocation test\n");
+		printk(KERN_INFO "---------------------\n");
+		printk(KERN_INFO "Allocating 100 objects\n");
+		nr = kmem_cache_alloc_array(kmem_cache_node, GFP_KERNEL, 100, arr);
+		printk(KERN_INFO "Number allocated = %d\n", nr);
+		printk(KERN_INFO "Freeing the objects\n");
+		kmem_cache_free_array(kmem_cache_node, 100, arr);
+		printk(KERN_INFO "Array allocation test done.\n");
+	}
+
 	return 0;
 }
 



* Re: [RFC 0/3] Slab allocator array operations
  2015-01-23 21:37 [RFC 0/3] Slab allocator array operations Christoph Lameter
                   ` (2 preceding siblings ...)
  2015-01-23 21:37 ` [RFC 3/3] Array alloc test code Christoph Lameter
@ 2015-01-23 22:57 ` Andrew Morton
  2015-01-24  0:28   ` Christoph Lameter
  3 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2015-01-23 22:57 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: linux-kernel, linux-mm, penberg, iamjoonsoo, Jesper Dangaard Brouer

On Fri, 23 Jan 2015 15:37:27 -0600 Christoph Lameter <cl@linux.com> wrote:

> Attached is a series of 3 patches implementing functionality to allocate
> arrays of pointers to slab objects. This can be used by the slab
> allocators to offer more optimized allocation and free paths.

What's the driver for this?  The networking people, I think?  If so,
some discussion about that would be useful: who is involved, why they
have this need, who are the people we need to bug to get it tested,
whether this implementation is found adequate, etc.



* Re: [RFC 0/3] Slab allocator array operations
  2015-01-23 22:57 ` [RFC 0/3] Slab allocator array operations Andrew Morton
@ 2015-01-24  0:28   ` Christoph Lameter
  2015-02-03 23:19     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Lameter @ 2015-01-24  0:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, penberg, iamjoonsoo, Jesper Dangaard Brouer

On Fri, 23 Jan 2015, Andrew Morton wrote:

> On Fri, 23 Jan 2015 15:37:27 -0600 Christoph Lameter <cl@linux.com> wrote:
>
> > Attached is a series of 3 patches implementing functionality to allocate
> > arrays of pointers to slab objects. This can be used by the slab
> > allocators to offer more optimized allocation and free paths.
>
> What's the driver for this?  The networking people, I think?  If so,
> some discussion about that would be useful: who is involved, why they
> have this need, who are the people we need to bug to get it tested,
> whether this implementation is found adequate, etc.

Jesper and I gave a talk at LCA about this. LWN has an article on it.



* Re: [RFC 1/3] Slab infrastructure for array operations
  2015-01-23 21:37 ` [RFC 1/3] Slab infrastructure for " Christoph Lameter
@ 2015-01-27  8:21   ` Joonsoo Kim
  2015-01-27 16:57     ` Christoph Lameter
  0 siblings, 1 reply; 14+ messages in thread
From: Joonsoo Kim @ 2015-01-27  8:21 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, linux-kernel, linux-mm, penberg, iamjoonsoo,
	Jesper Dangaard Brouer

On Fri, Jan 23, 2015 at 03:37:28PM -0600, Christoph Lameter wrote:
> This patch adds the basic infrastructure for alloc / free operations
> on pointer arrays. It includes a fallback implementation that performs
> the array operations using the single-object alloc and free operations
> that every slab allocator provides.
> 
> Allocators must define _HAVE_SLAB_ALLOCATOR_ARRAY_OPERATIONS in their
> header files in order to implement their own fast versions of
> these array operations.
> 
> Array operations reduce the per-object processing overhead
> during allocation and therefore speed up the acquisition of larger
> numbers of objects.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>
> 
> Index: linux/include/linux/slab.h
> ===================================================================
> --- linux.orig/include/linux/slab.h
> +++ linux/include/linux/slab.h
> @@ -123,6 +123,7 @@ struct kmem_cache *memcg_create_kmem_cac
>  void kmem_cache_destroy(struct kmem_cache *);
>  int kmem_cache_shrink(struct kmem_cache *);
>  void kmem_cache_free(struct kmem_cache *, void *);
> +void kmem_cache_free_array(struct kmem_cache *, size_t, void **);
>  
>  /*
>   * Please use this macro to create slab caches. Simply specify the
> @@ -290,6 +291,39 @@ static __always_inline int kmalloc_index
>  void *__kmalloc(size_t size, gfp_t flags);
>  void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags);
>  
> +/*
> + * Additional flags that may be specified in kmem_cache_alloc_array()'s
> + * gfp flags.
> + *
> + * If no flags are specified then kmem_cache_alloc_array() will first exhaust
> + * the partial slab page lists of the local node, then allocate new pages from
> + * the page allocator as long as more objects than fit in one slab page are
> + * still wanted, and fill up the rest from local cached objects. If that is
> + * not enough then the remaining objects will be allocated via kmem_cache_alloc().
> + */
> +
> +/* Use objects cached for the processor */
> +#define GFP_SLAB_ARRAY_LOCAL		((__force gfp_t)0x40000000)
> +
> +/* Use slabs from this node that have objects available */
> +#define GFP_SLAB_ARRAY_PARTIAL		((__force gfp_t)0x20000000)
> +
> +/* Allocate new slab pages from page allocator */
> +#define GFP_SLAB_ARRAY_NEW		((__force gfp_t)0x10000000)

Hello, Christoph.

Please correct my e-mail address next time. :)
iamjoonsoo.kim@lge.com or js1304@gmail.com

IMHO, exposing these options is not a good idea. It's really
implementation specific. And, these flags won't show consistent performance
across slab implementations. For example, to get the best
performance, if SLAB is used, GFP_SLAB_ARRAY_LOCAL would be the best option,
but, for the same purpose, if SLUB is used, GFP_SLAB_ARRAY_NEW would
be the best option. And, performance could also depend on the number of
objects and their size.

And, overloading the gfp flags isn't a good idea. Someday gfp could use
these values, and nobody would notice that they are already used in the
slab subsystem with a different meaning.

Thanks.



* Re: [RFC 1/3] Slab infrastructure for array operations
  2015-01-27  8:21   ` Joonsoo Kim
@ 2015-01-27 16:57     ` Christoph Lameter
  2015-01-28  1:33       ` Joonsoo Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Lameter @ 2015-01-27 16:57 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: akpm, linux-kernel, linux-mm, penberg, iamjoonsoo,
	Jesper Dangaard Brouer

On Tue, 27 Jan 2015, Joonsoo Kim wrote:

> IMHO, exposing these options is not a good idea. It's really
> implementation specific. And, these flags won't show consistent performance
> across slab implementations. For example, to get the best
> performance, if SLAB is used, GFP_SLAB_ARRAY_LOCAL would be the best option,
> but, for the same purpose, if SLUB is used, GFP_SLAB_ARRAY_NEW would
> be the best option. And, performance could also depend on the number of
> objects and their size.

Why would slab show a better performance? SLUB can also have partially
allocated pages per cpu and could also get data quite fast if only a
minimal number of objects are desired. SLAB is slightly better because the
number of cachelines touched stays small due to the arrangement of the
freelist on the slab page and the queueing approach that does not involve
linked lists.


GFP_SLAB_ARRAY_NEW is best for large quantities in either allocator since
SLAB also has to construct local metadata structures.

> And, overloading the gfp flags isn't a good idea. Someday gfp could use
> these values, and nobody would notice that they are already used in the
> slab subsystem with a different meaning.

We can put a BUILD_BUG_ON in there to ensure that the GFP flags do not get
too high. The upper portion of the GFP flags is also used elsewhere. And
it is an allocation option so it naturally fits in there.
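
Something along these lines (sketch, e.g. at the top of
kmem_cache_alloc_array()):

	/*
	 * Fail the build if the regular gfp bits ever grow into the
	 * range claimed by the GFP_SLAB_ARRAY_* flags.
	 */
	BUILD_BUG_ON(__GFP_BITS_MASK & (GFP_SLAB_ARRAY_LOCAL |
					GFP_SLAB_ARRAY_PARTIAL |
					GFP_SLAB_ARRAY_NEW |
					GFP_SLAB_ARRAY_FULL_COUNT));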



* Re: [RFC 1/3] Slab infrastructure for array operations
  2015-01-27 16:57     ` Christoph Lameter
@ 2015-01-28  1:33       ` Joonsoo Kim
  2015-01-28 15:30         ` Christoph Lameter
  0 siblings, 1 reply; 14+ messages in thread
From: Joonsoo Kim @ 2015-01-28  1:33 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Joonsoo Kim, akpm, LKML, Linux Memory Management List,
	Pekka Enberg, iamjoonsoo, Jesper Dangaard Brouer

2015-01-28 1:57 GMT+09:00 Christoph Lameter <cl@linux.com>:
> On Tue, 27 Jan 2015, Joonsoo Kim wrote:
>
>> IMHO, exposing these options is not a good idea. It's really
>> implementation specific. And, these flags won't show consistent performance
>> across slab implementations. For example, to get the best
>> performance, if SLAB is used, GFP_SLAB_ARRAY_LOCAL would be the best option,
>> but, for the same purpose, if SLUB is used, GFP_SLAB_ARRAY_NEW would
>> be the best option. And, performance could also depend on the number of
>> objects and their size.
>
> Why would slab show a better performance? SLUB can also have partially
> allocated pages per cpu and could also get data quite fast if only a
> minimal number of objects are desired. SLAB is slightly better because the
> number of cachelines touched stays small due to the arrangement of the
> freelist on the slab page and the queueing approach that does not involve
> linked lists.
>
>
> GFP_SLAB_ARRAY_NEW is best for large quantities in either allocator since
> SLAB also has to construct local metadata structures.

In the case of SLAB, there is just a little more work to construct local
metadata, so GFP_SLAB_ARRAY_NEW would not show better performance than
GFP_SLAB_ARRAY_LOCAL, because it would cause more overhead due to more page
allocations. Because of this characteristic, I said that which option is
best is implementation specific and therefore we should not expose it.

Even if we narrow the problem down to SLUB, choosing the correct option is
difficult enough. The user would need to know how many objects are cached
in the kmem_cache in order to choose the best option, since the relative
quantity makes the performance difference.

And, how many objects are cached in a kmem_cache could change whenever
the implementation changes.

Thanks.


* Re: [RFC 1/3] Slab infrastructure for array operations
  2015-01-28  1:33       ` Joonsoo Kim
@ 2015-01-28 15:30         ` Christoph Lameter
  2015-01-29  7:44           ` Joonsoo Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Lameter @ 2015-01-28 15:30 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Joonsoo Kim, akpm, LKML, Linux Memory Management List,
	Pekka Enberg, iamjoonsoo, Jesper Dangaard Brouer

On Wed, 28 Jan 2015, Joonsoo Kim wrote:

> > GFP_SLAB_ARRAY_NEW is best for large quantities in either allocator since
> > SLAB also has to construct local metadata structures.
>
> In the case of SLAB, there is just a little more work to construct local
> metadata, so GFP_SLAB_ARRAY_NEW would not show better performance than
> GFP_SLAB_ARRAY_LOCAL, because it would cause more overhead due to more page
> allocations. Because of this characteristic, I said that which option is
> best is implementation specific and therefore we should not expose it.

For large numbers of objects (hundreds or higher) GFP_SLAB_ARRAY_LOCAL
will never have enough objects. GFP_SLAB_ARRAY_NEW will go to the page
allocator and bypass free table creation and all the queuing that objects
normally go through in SLAB. AFAICT it's going to be a significant win.

A similar situation is true for the freeing operation. If the freeing
operation results in all objects in a page being freed then we can also
bypass that and put the page directly back into the page allocator (to be
implemented once we agree on an approach).

> Even if we narrow the problem down to SLUB, choosing the correct option is
> difficult enough. The user would need to know how many objects are cached
> in the kmem_cache in order to choose the best option, since the relative
> quantity makes the performance difference.

Ok, we can add a function call to calculate the number of objects cached
per cpu and per node? But then that number is rather fluid and could change
at any moment.
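
Something like this hypothetical helper (a sketch using only what this
series already introduces; the per-cpu side is omitted):

	/*
	 * Upper-bound estimate of objects available on the local node's
	 * partial lists. The number is stale as soon as it is computed.
	 */
	unsigned kmem_cache_cached_objects(struct kmem_cache *s)
	{
		struct kmem_cache_node *n = get_node(s, numa_mem_id());

		return n->nr_partial * objects_per_slab_page(s);
	}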

> And, how many objects are cached in a kmem_cache could change whenever
> the implementation changes.

The default when no options are specified is to first exhaust the node
partial objects, then allocate new slabs as long as more than a slab page's
worth of objects is still needed, and only then satisfy the rest from cpu
local objects. I think that is satisfactory for the majority of the cases.

The detailed control options were requested at the meeting in Auckland at
the LCA. I am fine with dropping those if they do not make sense. Makes
the API and implementation simpler. Jesper, are you ok with this?


* Re: [RFC 1/3] Slab infrastructure for array operations
  2015-01-28 15:30         ` Christoph Lameter
@ 2015-01-29  7:44           ` Joonsoo Kim
  2015-02-03 22:55             ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 14+ messages in thread
From: Joonsoo Kim @ 2015-01-29  7:44 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: akpm, LKML, Linux Memory Management List, Pekka Enberg,
	iamjoonsoo, Jesper Dangaard Brouer

On Wed, Jan 28, 2015 at 09:30:56AM -0600, Christoph Lameter wrote:
> On Wed, 28 Jan 2015, Joonsoo Kim wrote:
> 
> > > GFP_SLAB_ARRAY_NEW is best for large quantities in either allocator since
> > > SLAB also has to construct local metadata structures.
> >
> > In the case of SLAB, there is just a little more work to construct local
> > metadata, so GFP_SLAB_ARRAY_NEW would not show better performance than
> > GFP_SLAB_ARRAY_LOCAL, because it would cause more overhead due to more page
> > allocations. Because of this characteristic, I said that which option is
> > best is implementation specific and therefore we should not expose it.
> 
> For large numbers of objects (hundreds or higher) GFP_SLAB_ARRAY_LOCAL
> will never have enough objects. GFP_SLAB_ARRAY_NEW will go to the page
> allocator and bypass free table creation and all the queuing that objects
> normally go through in SLAB. AFAICT it's going to be a significant win.
> 
> A similar situation is true for the freeing operation. If the freeing
> operation results in all objects in a page being freed then we can also
> bypass that and put the page directly back into the page allocator (to be
> implemented once we agree on an approach).
> 
> > Even if we narrow the problem down to SLUB, choosing the correct option is
> > difficult enough. The user would need to know how many objects are cached
> > in the kmem_cache in order to choose the best option, since the relative
> > quantity makes the performance difference.
> 
> Ok, we can add a function call to calculate the number of objects cached
> per cpu and per node? But then that number is rather fluid and could change
> at any moment.
> 
> > And, how many objects are cached in a kmem_cache could change whenever
> > the implementation changes.
> 
> The default when no options are specified is to first exhaust the node
> partial objects, then allocate new slabs as long as more than a slab page's
> worth of objects is still needed, and only then satisfy the rest from cpu
> local objects. I think that is satisfactory for the majority of the cases.
> 
> The detailed control options were requested at the meeting in Auckland at
> the LCA. I am fine with dropping those if they do not make sense. Makes
> the API and implementation simpler. Jesper, are you ok with this?

IMHO, it'd be better for the slab allocator itself to choose the proper
way of allocating and not to expose these options to the API user. We could
decide the best option according to the current status of the kmem_cache,
the requested number of objects, and the internal implementation.

Is there any obvious example where these options are needed by the user?

Thanks.


* Re: [RFC 1/3] Slab infrastructure for array operations
  2015-01-29  7:44           ` Joonsoo Kim
@ 2015-02-03 22:55             ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 14+ messages in thread
From: Jesper Dangaard Brouer @ 2015-02-03 22:55 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Christoph Lameter, akpm, LKML, Linux Memory Management List,
	Pekka Enberg, brouer

On Thu, 29 Jan 2015 16:44:43 +0900
Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:

> On Wed, Jan 28, 2015 at 09:30:56AM -0600, Christoph Lameter wrote:
> > On Wed, 28 Jan 2015, Joonsoo Kim wrote:
> > 
[...]
> > 
> > The default when no options are specified is to first exhaust the node
> > partial objects, then allocate new slabs as long as more than a slab page's
> > worth of objects is still needed, and only then satisfy the rest from cpu
> > local objects. I think that is satisfactory for the majority of the cases.
> > 
> > The detailed control options were requested at the meeting in Auckland at
> > the LCA. I am fine with dropping those if they do not make sense. Makes
> > the API and implementation simpler. Jesper, are you ok with this?

Yes, I'm okay with dropping the allocation flags.

We might want to keep the "GFP_SLAB_ARRAY_FULL_COUNT" flag so that, when it
is not set, the allocator may return fewer than the requested elements (but
I'm not 100% sure).  The idea behind this is: if the allocator can "see"
that it needs to perform a (relatively) expensive operation, then I would
rather have it return the elements it already has (even if fewer than
requested), as the code using this API is likely very performance sensitive.
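
Note that with the current patch, passing no GFP_SLAB_ARRAY_* flags at all
implies GFP_SLAB_ARRAY_FULL_COUNT (the default ORs in all four flags), so a
caller wanting the cheap partial-fill behavior would have to spell the
flags out, something like (sketch):

	/* take only what is cheap right now; may return fewer than 64 */
	nr = kmem_cache_alloc_array(cache,
			GFP_ATOMIC | GFP_SLAB_ARRAY_LOCAL |
			GFP_SLAB_ARRAY_PARTIAL | GFP_SLAB_ARRAY_NEW,
			64, elems);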


> IMHO, it'd be better for the slab allocator itself to choose the proper
> way of allocating and not to expose these options to the API user. We could
> decide the best option according to the current status of the kmem_cache,
> the requested number of objects, and the internal implementation.
> 
> Is there any obvious example where these options are needed by the user?

The use-cases were for when the subsystem/user knows about its use-case, e.g.
1) needing a large allocation which does not need to be cache hot,
2) needing a smaller (e.g. 8-16 elems) allocation that should be cache hot.
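
In flag terms those would map to something like (sketch):

	/* 1) large, cache cold: go to the page allocator directly */
	nr = kmem_cache_alloc_array(cache, GFP_KERNEL | GFP_SLAB_ARRAY_NEW,
				    512, bulk);

	/* 2) small, cache hot: prefer per-cpu cached objects */
	nr = kmem_cache_alloc_array(cache, GFP_KERNEL | GFP_SLAB_ARRAY_LOCAL,
				    16, hot);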

But, as you argue, I guess it is best to leave this up to the slab
implementation as the status of the kmem_cache is only known to the
allocator itself.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [RFC 0/3] Slab allocator array operations
  2015-01-24  0:28   ` Christoph Lameter
@ 2015-02-03 23:19     ` Jesper Dangaard Brouer
  2015-02-06 18:39       ` Christoph Lameter
  0 siblings, 1 reply; 14+ messages in thread
From: Jesper Dangaard Brouer @ 2015-02-03 23:19 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew Morton, linux-kernel, linux-mm, penberg, brouer, netdev,
	Joonsoo Kim

On Fri, 23 Jan 2015 18:28:00 -0600 (CST)
Christoph Lameter <cl@linux.com> wrote:

> On Fri, 23 Jan 2015, Andrew Morton wrote:
> 
> > On Fri, 23 Jan 2015 15:37:27 -0600 Christoph Lameter <cl@linux.com> wrote:
> >
> > > Attached is a series of 3 patches implementing functionality to allocate
> > > arrays of pointers to slab objects. This can be used by the slab
> > > allocators to offer more optimized allocation and free paths.
> >
> > What's the driver for this?  The networking people, I think?  If so,
> > some discussion about that would be useful: who is involved, why they
> > have this need, who are the people we need to bug to get it tested,
> > whether this implementation is found adequate, etc.

Yes, networking people like me ;-)

I promised Christoph that I will performance benchmark this. I'll start
by writing/performing some micro benchmarks, but it first starts to get
really interesting once we plug it into e.g. the networking stack, as
effects such as instruction-cache misses due to code size start to play
a role.
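
Roughly the first micro benchmark I have in mind (untested sketch; error
handling omitted, assumes the GFP_KERNEL allocations succeed):

	static void __init slab_array_bench(struct kmem_cache *s)
	{
		void *objs[64];
		u64 t0;
		int i, nr;

		/* baseline: 64 single-object alloc/free pairs */
		t0 = ktime_get_ns();
		for (i = 0; i < 64; i++)
			objs[i] = kmem_cache_alloc(s, GFP_KERNEL);
		for (i = 0; i < 64; i++)
			kmem_cache_free(s, objs[i]);
		pr_info("single: %llu ns\n", ktime_get_ns() - t0);

		/* batched: one array alloc + one array free */
		t0 = ktime_get_ns();
		nr = kmem_cache_alloc_array(s, GFP_KERNEL, 64, objs);
		kmem_cache_free_array(s, nr, objs);
		pr_info("array:  %llu ns\n", ktime_get_ns() - t0);
	}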

> 
> Jesper and I gave a talk at LCA about this. LWN has an article on it.

LWN: Improving Linux networking performance
 - http://lwn.net/Articles/629155/
 - YouTube: https://www.youtube.com/watch?v=3XG9-X777Jo

LWN: Toward a more efficient slab allocator
 - http://lwn.net/Articles/629152/
 - YouTube: https://www.youtube.com/watch?v=s0lZzP1jOzI

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: [RFC 0/3] Slab allocator array operations
  2015-02-03 23:19     ` Jesper Dangaard Brouer
@ 2015-02-06 18:39       ` Christoph Lameter
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Lameter @ 2015-02-06 18:39 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Andrew Morton, linux-kernel, linux-mm, penberg, netdev, Joonsoo Kim

On Wed, 4 Feb 2015, Jesper Dangaard Brouer wrote:

> I promised Christoph that I will performance benchmark this. I'll start
> by writing/performing some micro benchmarks, but it first starts to get
> really interesting once we plug it into e.g. the networking stack, as
> effects such as instruction-cache misses due to code size start to play
> a role.

Ok I got a patchset here with the options removed. Just the basic ops.
Should I repost that?


