All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: paulmck@linux.vnet.ibm.com
Cc: linux-kernel@vger.kernel.org, cl@linux-foundation.org,
	mingo@elte.hu, manfred@colorfullife.com, dipankar@in.ibm.com,
	josht@linux.vnet.ibm.com, schamp@sgi.com, niv@us.ibm.com,
	dvhltc@us.ibm.com, ego@in.ibm.com, laijs@cn.fujitsu.com,
	rostedt@goodmis.org, peterz@infradead.org,
	penberg@cs.helsinki.fi, andi@firstfloor.org
Subject: Re: [PATCH, RFC] v4 scalable classic RCU implementation
Date: Fri, 5 Sep 2008 16:52:35 -0700	[thread overview]
Message-ID: <20080905165235.c086581e.akpm@linux-foundation.org> (raw)
In-Reply-To: <20080905230411.GC6737@linux.vnet.ibm.com>

On Fri, 5 Sep 2008 16:04:11 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>
> ...
>
> > > +#if (NR_CPUS) <= RCU_FANOUT
> > > +#  define NUM_RCU_LVLS	      1
> > > +#  define NUM_RCU_LVL_0	      1
> > > +#  define NUM_RCU_LVL_1	      (NR_CPUS)
> > > +#  define NUM_RCU_LVL_2	      0
> > > +#  define NUM_RCU_LVL_3	      0
> > > +#elif (NR_CPUS) <= RCU_FANOUT_SQ
> > > +#  define NUM_RCU_LVLS	      2
> > > +#  define NUM_RCU_LVL_0	      1
> > > +#  define NUM_RCU_LVL_1	      (((NR_CPUS) + RCU_FANOUT - 1) / RCU_FANOUT)
> > > +#  define NUM_RCU_LVL_2	      (NR_CPUS)
> > > +#  define NUM_RCU_LVL_3	      0
> > > +#elif (NR_CPUS) <= RCU_FANOUT_CUBE
> > > +#  define NUM_RCU_LVLS	      3
> > > +#  define NUM_RCU_LVL_0	      1
> > > +#  define NUM_RCU_LVL_1	      (((NR_CPUS) + RCU_FANOUT_SQ - 1) / RCU_FANOUT_SQ)
> > > +#  define NUM_RCU_LVL_2	      (((NR_CPUS) + (RCU_FANOUT) - 1) / (RCU_FANOUT))
> > > +#  define NUM_RCU_LVL_3	      NR_CPUS
> > > +#else
> > > +# error "CONFIG_RCU_FANOUT insufficient for NR_CPUS"
> > > +#endif /* #if (NR_CPUS) <= RCU_FANOUT */
> > 
> > Using NR_CPUS for anything at all is grossly, grossly inaccurate. 
> > Vendors can and will ship kernels with NR_CPUS=1024 and their customers
> > can and will run those kernels on 4-cpu machines.  Lots of customers.
> > 
> > That's a two-and-a-half-order-of-magnitude inaccuracy.  It makes all
> > your above work meaningless.
> > 
> > To be useful, these decisions should be made at runtime.
> 
> I agree in principle, but this case is an exception.
> 
> Suppose that we have NR_CPUS=1024 on a 4-CPU 64-bit machine.  Since 64^2
> is greater than 1024, we end up with a two-level hierarchy, with one
> rcu_node structure at the root and 16 rcu_node leaf structures, each
> of which takes up a single 128-byte cache line.  There will be two such
> structures in the system, one for rcu and one for rcu_bh.
> 
> So I do not believe that this will be a problem.
> 
> One laptops, this is even less of an issue -- NR_CPUS=8 on my laptop,
> which would reduce to a pair rcu_node structures, one for rcu, the other
> for rcu_bh.

Is it likely that anyone will ship kernels with NR_CPUS=8?  What are
distros presently using, and what will they be using 1-2 years hence?

> Making the decision at runtime would bloat the code by much more than the
> extra data consumed.  And introduce yet more races between online/offline
> and everything else.  Besides, this way I am being bloat-compatible
> with DEFINE_PER_CPU().  ;-)
> 
> > > +#define RCU_SUM (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 + NUM_RCU_LVL_3)
> > > +#define NUM_RCU_NODES (RCU_SUM - NR_CPUS)
> > > +
> > > +/*
> > > + * Definition for node within the RCU grace-period-detection hierarchy.
> > > + */
> > > +struct rcu_node {
> > > +	spinlock_t lock;
> > > +	unsigned long qsmask;	/* CPUs or groups that need to switch in */
> > > +				/*  order for current grace period to proceed.*/
> > > +	unsigned long qsmaskinit;
> > > +				/* Per-GP initialization for qsmask. */
> > > +	unsigned long grpmask;	/* Mask to apply to parent qsmask. */
> > > +	int	grplo;		/* lowest-numbered CPU or group here. */
> > > +	int	grphi;		/* highest-numbered CPU or group here. */
> > > +	u8	grpnum;		/* CPU/group number for next level up. */
> > > +	u8	level;		/* root is at level 0. */
> > > +	struct rcu_node *parent;
> > > +} ____cacheline_internodealigned_in_smp;
> > 
> > So this is a 4096-byte structure on some setups.
> 
> You lost me on this one.  On a 64-bit system, I have 8 bytes each for
> lock, qsmask, qsmaskinit, grpmask, and parent, four bytes each for grplo
> and grphi, and another byte each for grpnum and level, for a total of
> 50 bytes for each struct rcu_node, which comes to a single cache line
> for most large system.  Given the default CONFIG_RCU_FANOUT=64 and
> NR_CPUS=4096, we have a two-level hierarchy with one root rcu_node
> structure and 64 leaf rcu_node structures.  This gives a total of
> 65 cache lines.

____cacheline_internodealigned_in_smp will expand this structure to
4096 bytes on CONFIG_X86_VSMP=y.

> > It's a pretty big pill to swallow.  Nice performance testing results
> > will help it to slide down.
> 
> Well, the read side is easy -- exactly the same code sequence as for
> Classic RCU.  On the update side, this is more of a bug fix for large
> numbers of CPUs, where unadorned Classic RCU is said to suffer terminal
> lock contention.  I will see what I can come up with, but at the end of
> the day, this will need some testing on machines larger than the 128-CPU
> systems that I have access to.
> 

OK, thanks.  As it's effectively a bugfix, a full description of the
bug would grease some wheels.


  reply	other threads:[~2008-09-05 23:54 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-21 23:43 [PATCH, RFC, tip/core/rcu] scalable classic RCU implementation Paul E. McKenney
2008-08-22  4:37 ` Ingo Molnar
2008-08-22 13:47   ` Paul E. McKenney
2008-08-22 17:22     ` Paul E. McKenney
2008-08-22 18:16       ` Josh Triplett
2008-08-23 16:07       ` Ingo Molnar
2008-08-24  2:44         ` Paul E. McKenney
2008-08-22 23:29 ` Josh Triplett
2008-08-23  1:53   ` Paul E. McKenney
2008-08-25 22:02     ` Josh Triplett
2008-08-26 16:05       ` Paul E. McKenney
2008-08-27  0:38         ` Josh Triplett
2008-08-27 18:34           ` Paul E. McKenney
2008-08-27 20:23             ` Josh Triplett
2008-08-27 20:41               ` Paul E. McKenney
2008-08-25 10:34   ` Peter Zijlstra
2008-08-25 15:16     ` Paul E. McKenney
2008-08-25 15:26       ` Peter Zijlstra
2008-08-27 18:28         ` Paul E. McKenney
2008-08-24  8:08 ` Manfred Spraul
2008-08-24 16:32   ` Paul E. McKenney
2008-08-24 18:25     ` Manfred Spraul
2008-08-24 21:19       ` Paul E. McKenney
2008-08-25  0:07 ` [PATCH, RFC, tip/core/rcu] v2 " Paul E. McKenney
2008-08-30  0:49   ` [PATCH, RFC, tip/core/rcu] v3 " Paul E. McKenney
2008-08-30  9:33     ` Peter Zijlstra
2008-08-30 14:10       ` Paul E. McKenney
2008-08-30 15:40         ` Peter Zijlstra
2008-08-30 19:38           ` Paul E. McKenney
2008-09-02 13:26           ` Mathieu Desnoyers
2008-09-02 13:41             ` Peter Zijlstra
2008-09-02 14:55               ` Paul E. McKenney
2008-08-30  9:58     ` Lai Jiangshan
2008-08-30 13:32       ` Manfred Spraul
2008-08-30 14:34         ` Paul E. McKenney
2008-08-31 10:58           ` Manfred Spraul
2008-08-31 17:20             ` Paul E. McKenney
2008-08-31 17:45               ` Manfred Spraul
2008-08-31 17:55                 ` Paul E. McKenney
2008-08-31 18:18                   ` Manfred Spraul
2008-08-31 19:23                     ` Paul E. McKenney
2008-08-30 14:29       ` Paul E. McKenney
2008-09-01  9:38     ` Andi Kleen
2008-09-02  1:05       ` Paul E. McKenney
2008-09-02  6:18         ` Andi Kleen
2008-09-05 15:29     ` [PATCH, RFC] v4 " Paul E. McKenney
2008-09-05 19:33       ` Andrew Morton
2008-09-05 23:04         ` Paul E. McKenney
2008-09-05 23:52           ` Andrew Morton [this message]
2008-09-06  4:16             ` Paul E. McKenney
2008-09-06 16:37       ` Manfred Spraul
2008-09-07 17:25         ` Paul E. McKenney
2008-09-07 10:18       ` [RFC, PATCH] Add a CPU_STARTING notifier (was: Re: [PATCH, RFC] v4 scalable classic RCU implementation) Manfred Spraul
2008-09-07 11:07         ` Andi Kleen
2008-09-07 19:46         ` Paul E. McKenney
2008-09-15 16:02       ` [PATCH, RFC] v4 scalable classic RCU implementation Paul E. McKenney
2008-09-16 16:52         ` Manfred Spraul
2008-09-16 17:30           ` Paul E. McKenney
2008-09-16 17:48             ` Manfred Spraul
2008-09-16 18:22               ` Paul E. McKenney
2008-09-21 11:09               ` Manfred Spraul
2008-09-21 21:14                 ` Paul E. McKenney
2008-09-23 23:53         ` [PATCH, RFC] v6 " Paul E. McKenney
2008-09-25  7:26           ` Ingo Molnar
2008-09-25 14:05             ` Paul E. McKenney
2008-09-25  7:29           ` Ingo Molnar
2008-09-25 14:18             ` Paul E. McKenney
2008-10-10 16:09           ` [PATCH, RFC] v7 " Paul E. McKenney
2008-10-12 15:52             ` Manfred Spraul
2008-10-12 22:46               ` Paul E. McKenney
2008-10-13 18:03                 ` Manfred Spraul
2008-10-15  1:11                   ` Paul E. McKenney
2008-10-15  8:13                     ` Manfred Spraul
2008-10-15 15:26                       ` Paul E. McKenney
2008-10-22 18:41                         ` Manfred Spraul
2008-10-22 21:02                           ` Paul E. McKenney
2008-10-22 21:24                             ` Manfred Spraul
2008-10-27 16:45                               ` Paul E. McKenney
2008-10-27 19:48                                 ` Manfred Spraul
2008-10-27 23:52                                   ` Paul E. McKenney
2008-10-28  5:30                                     ` Manfred Spraul
2008-10-28 15:17                                       ` Paul E. McKenney
2008-10-28 17:21                                         ` Manfred Spraul
2008-10-28 17:35                                           ` Paul E. McKenney
2008-10-17  8:34             ` Gautham R Shenoy
2008-10-17 15:35               ` Gautham R Shenoy
2008-10-17 15:46                 ` Paul E. McKenney
2008-10-17 15:43               ` Paul E. McKenney
2008-12-08 18:42               ` Paul E. McKenney
2008-11-02 20:10             ` Manfred Spraul
2008-11-03 20:33               ` Paul E. McKenney
2008-11-05 19:48                 ` Manfred Spraul
2008-11-05 21:27                   ` Paul E. McKenney
2008-11-15 23:20             ` [PATCH, RFC] v8 " Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080905165235.c086581e.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=cl@linux-foundation.org \
    --cc=dipankar@in.ibm.com \
    --cc=dvhltc@us.ibm.com \
    --cc=ego@in.ibm.com \
    --cc=josht@linux.vnet.ibm.com \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manfred@colorfullife.com \
    --cc=mingo@elte.hu \
    --cc=niv@us.ibm.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=penberg@cs.helsinki.fi \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=schamp@sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.