From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Christoph Lameter <cl@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	linux-mm@kvack.org, brouer@redhat.com
Subject: Re: slub: bulk allocation from per cpu partial pages
Date: Thu, 16 Apr 2015 14:06:38 +0200
Message-ID: <20150416140638.684838a2@redhat.com>
In-Reply-To: <alpine.DEB.2.11.1504091215330.18198@gentwo.org>

On Thu, 9 Apr 2015 12:16:23 -0500 (CDT)
Christoph Lameter <cl@linux.com> wrote:

> Next step: cover all of the per cpu objects available.
> 
> 
> Expand the bulk allocation support to drain the per cpu partial
> pages while interrupts are off.

I have started my micro-benchmarking.

On a CPU E5-2630 @ 2.30GHz, the cost of kmem_cache_alloc +
kmem_cache_free in a tight loop (the most optimal fast-path) is 22ns.
This is with element size 256 bytes, where SLUB chooses 32 objects
per slab.
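
The baseline loop looks roughly like the following sketch (hypothetical
kernel-module code, not from this thread; the cache name, the LOOPS
count and the function name are mine):

#include <linux/slab.h>
#include <linux/ktime.h>

#define LOOPS 1000000

static void bench_fastpath(void)
{
	struct kmem_cache *s;
	ktime_t start, stop;
	int i;

	/* 256 byte objects; SLUB ends up with 32 objects per slab */
	s = kmem_cache_create("bench-256", 256, 0, 0, NULL);
	if (!s)
		return;

	start = ktime_get();
	for (i = 0; i < LOOPS; i++) {
		void *obj = kmem_cache_alloc(s, GFP_KERNEL);

		if (!obj)
			break;
		/* Immediate free keeps the object on the per cpu freelist */
		kmem_cache_free(s, obj);
	}
	stop = ktime_get();

	pr_info("fast-path: %lld ns per alloc+free\n",
		ktime_to_ns(ktime_sub(stop, start)) / LOOPS);

	kmem_cache_destroy(s);
}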

With this patch, testing different bulk sizes, the cost of alloc+free
per element is improved for small bulk sizes (which I guess is the
expected outcome).

To have something to compare against, I also ran the same bulk sizes
through the fallback versions __kmem_cache_alloc_bulk() and
__kmem_cache_free_bulk(), i.e. the non-optimized versions. All numbers
are the cost of alloc+free per element, in nanoseconds.
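
The per-element numbers were produced by a loop along these lines
(again a hypothetical sketch, continuing the one above and reusing
LOOPS; kmem_cache_alloc_bulk()/kmem_cache_free_bulk() are the
interfaces from this series, everything else is illustrative):

#include <linux/kernel.h>	/* ARRAY_SIZE */

static void bench_bulk(struct kmem_cache *s, size_t bulk)
{
	void *objs[64];		/* big enough for the largest bulk tested */
	ktime_t start, stop;
	int i;

	if (bulk > ARRAY_SIZE(objs))
		return;

	start = ktime_get();
	for (i = 0; i < LOOPS; i++) {
		/* Grab 'bulk' objects in one call, then free them all */
		if (!kmem_cache_alloc_bulk(s, GFP_KERNEL, bulk, objs))
			break;
		kmem_cache_free_bulk(s, bulk, objs);
	}
	stop = ktime_get();

	/* Cost per element = total time / (iterations * bulk size) */
	pr_info("bulk %zu: %lld ns per element\n", bulk,
		ktime_to_ns(ktime_sub(stop, start)) / (LOOPS * (s64)bulk));
}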

 size    --  optimized -- fallback
 bulk  8 --  15ns      --  22ns
 bulk 16 --  15ns      --  22ns
 bulk 30 --  44ns      --  48ns
 bulk 32 --  47ns      --  50ns
 bulk 64 --  52ns      --  54ns

For the smaller bulk sizes 8 and 16 this is actually a significant
improvement, from 22ns down to 15ns, especially considering the free
side is not optimized.

Thus, the 7ns per-element improvement must come from the alloc side
only.


> Signed-off-by: Christoph Lameter <cl@linux.com>
> 
> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c
> +++ linux/mm/slub.c
> @@ -2771,15 +2771,45 @@ bool kmem_cache_alloc_bulk(struct kmem_c
>  		while (size) {
>  			void *object = c->freelist;
> 
> -			if (!object)
> -				break;
> +			if (unlikely(!object)) {
> +				/*
> +				 * Check if there are remotely freed
> +				 * objects available in the page.
> +				 */
> +				object = get_freelist(s, c->page);
> +
> +				if (!object) {
> +					/*
> +					 * All objects are in use; check
> +					 * if we have other per cpu
> +					 * partial pages with available
> +					 * objects.
> +					 */
> +					c->page = c->partial;
> +					if (!c->page) {
> +						/* No per cpu objects left */
> +						c->freelist = NULL;
> +						break;
> +					}
> +
> +					/* Next per cpu partial page */
> +					c->partial = c->page->next;
> +					c->freelist = get_freelist(s,
> +							c->page);
> +					continue;
> +				}
> +
> +			}
> +
> 
> -			c->freelist = get_freepointer(s, object);
>  			*p++ = object;
>  			size--;
> 
>  			if (unlikely(flags & __GFP_ZERO))
>  				memset(object, 0, s->object_size);
> +
> +			c->freelist = get_freepointer(s, object);
> +
>  		}
>  		c->tid = next_tid(c->tid);
> 



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
