linux-kernel.vger.kernel.org archive mirror
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux kernel <linux-kernel@vger.kernel.org>,
	netdev@vger.kernel.org, Christoph Lameter <clameter@sgi.com>,
	"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Subject: Re: [PATCH] alloc_percpu() fails to allocate percpu data
Date: Thu, 21 Feb 2008 23:26:05 +0100
Message-ID: <1203632765.6112.20.camel@lappy>
In-Reply-To: <47BDBC23.10605@cosmosbay.com>


On Thu, 2008-02-21 at 19:00 +0100, Eric Dumazet wrote:
> Some oprofile results obtained while using tbench on a 2x2 cpu machine 
> were very surprising.
> 
> For example, the loopback_xmit() function was using a high number of CPU
> cycles to perform the statistics updates, which are supposed to be really
> cheap since they use percpu data:
> 
>         pcpu_lstats = netdev_priv(dev);
>         lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
>         lb_stats->packets++;  /* HERE : serious contention */
>         lb_stats->bytes += skb->len;
> 
> 
> struct pcpu_lstats is a small structure containing two longs. It
> appears that on my 32-bit platform, alloc_percpu(8) allocates a single
> cache line, instead of giving each CPU a separate cache line.
> 
> Using the following patch gave me an impressive boost in various
> benchmarks (6% in tbench) (all percpu_counters hit this bug too).
> 
> A long-term fix (i.e. >= 2.6.26) would be to let each CPU allocate its
> own block of memory, so that we don't need to round up sizes to
> L1_CACHE_BYTES, or to merge the SGI stuff of course...
> 
> Note: SLUB vs SLAB is important here to *show* the improvement, since
> they don't have the same minimum allocation sizes (8 bytes vs 32
> bytes). This could very well explain regressions some guys reported
> when they switched to SLUB.
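
To make the sharing described above concrete, here is a minimal userspace sketch (not kernel code). It assumes a 64-byte cache line and, as a worst case, that the allocator hands out the per-CPU objects back to back, one minimum-sized slot each. LINE_BYTES, line_of_cpu() and cpus_sharing_first_line() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size for illustration; the real value is per-architecture. */
#define LINE_BYTES 64

/*
 * Index of the cache line in which CPU `cpu`'s object starts, under the
 * simplifying assumption that objects are laid out contiguously and each
 * one occupies `slot` bytes (the allocator's minimum object size).
 */
static size_t line_of_cpu(size_t cpu, size_t slot)
{
	assert(slot > 0);
	return (cpu * slot) / LINE_BYTES;
}

/* Number of CPUs (out of `ncpus`) whose object starts in the same line as CPU 0's. */
static size_t cpus_sharing_first_line(size_t ncpus, size_t slot)
{
	size_t cpu, n = 0;

	for (cpu = 0; cpu < ncpus; cpu++)
		if (line_of_cpu(cpu, slot) == 0)
			n++;
	return n;
}
```

Under these assumptions, cpus_sharing_first_line(4, 8) puts all four CPUs of a 2x2 machine on one line, cpus_sharing_first_line(4, 32) puts two on it, and a 64-byte slot leaves CPU 0 alone on its line. A real slab allocator does not guarantee this adjacency, so the sketch only bounds how bad the sharing can get.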

I've complained about this false sharing as well, so until we get the
new and improved percpu allocators,

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> 
>  mm/allocpercpu.c |   15 ++++++++++++++-
>  1 files changed, 14 insertions(+), 1 deletion(-)
> 
> 
> plain text document attachment (percpu_populate.patch)
> diff --git a/mm/allocpercpu.c b/mm/allocpercpu.c
> index 7e58322..b0012e2 100644
> --- a/mm/allocpercpu.c
> +++ b/mm/allocpercpu.c
> @@ -6,6 +6,10 @@
>  #include <linux/mm.h>
>  #include <linux/module.h>
>  
> +#ifndef cache_line_size
> +#define cache_line_size()	L1_CACHE_BYTES
> +#endif
> +
>  /**
>   * percpu_depopulate - depopulate per-cpu data for given cpu
>   * @__pdata: per-cpu data to depopulate
> @@ -52,6 +56,11 @@ void *percpu_populate(void *__pdata, size_t size, gfp_t gfp, int cpu)
>  	struct percpu_data *pdata = __percpu_disguise(__pdata);
>  	int node = cpu_to_node(cpu);
>  
> +	/*
> +	 * We should make sure each CPU gets private memory.
> +	 */
> +	size = roundup(size, cache_line_size());
> +
>  	BUG_ON(pdata->ptrs[cpu]);
>  	if (node_online(node))
>  		pdata->ptrs[cpu] = kmalloc_node(size, gfp|__GFP_ZERO, node);
> @@ -98,7 +107,11 @@ EXPORT_SYMBOL_GPL(__percpu_populate_mask);
>   */
>  void *__percpu_alloc_mask(size_t size, gfp_t gfp, cpumask_t *mask)
>  {
> -	void *pdata = kzalloc(nr_cpu_ids * sizeof(void *), gfp);
> +	/*
> +	 * We allocate whole cache lines to avoid false sharing
> +	 */
> +	size_t sz = roundup(nr_cpu_ids * sizeof(void *), cache_line_size());
> +	void *pdata = kzalloc(sz, gfp);
>  	void *__pdata = __percpu_disguise(pdata);
>  
>  	if (unlikely(!pdata))
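
For readers who want to check the size arithmetic in the patch, the rounding step can be sketched in userspace as follows. ROUNDUP is a stand-in written for this example with the usual "smallest multiple of y that is >= x" meaning, and padded_percpu_size() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for roundup(): smallest multiple of y that is >= x. */
#define ROUNDUP(x, y) ((((x) + ((y) - 1)) / (y)) * (y))

/*
 * Hypothetical helper: the size that would be requested for one CPU's
 * object once it is padded to a whole number of cache lines of `line` bytes.
 */
static size_t padded_percpu_size(size_t size, size_t line)
{
	assert(line > 0);
	return ROUNDUP(size, line);
}
```

With a 64-byte line, an 8-byte request becomes 64 bytes and a 65-byte request becomes 128 bytes. Padding the size separates objects only if the allocator also places objects of that size on cache line boundaries, which this sketch assumes rather than demonstrates.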


Thread overview: 17+ messages
2008-02-21 18:00 [PATCH] alloc_percpu() fails to allocate percpu data Eric Dumazet
2008-02-21 22:26 ` Peter Zijlstra [this message]
2008-02-23  9:23   ` Nick Piggin
2008-02-27 19:44     ` Christoph Lameter
2008-03-03  3:14       ` Nick Piggin
2008-03-03  7:48         ` Eric Dumazet
2008-03-03  9:41           ` Nick Piggin
2008-03-03 19:30         ` Christoph Lameter
2008-02-23  8:04 ` Andrew Morton
2008-02-27 19:59 ` Christoph Lameter
2008-02-27 20:24   ` Andrew Morton
2008-02-27 21:56     ` Christoph Lameter
2008-03-01 13:53     ` Eric Dumazet
2008-03-11 18:15 ` Mike Snitzer
2008-03-11 18:41   ` Eric Dumazet
2008-03-11 19:39     ` Mike Snitzer
2008-03-12  0:18       ` [stable] " Chris Wright