From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: linux-kernel@vger.kernel.org, cl@linux-foundation.org,
	mingo@elte.hu, akpm@linux-foundation.org,
	manfred@colorfullife.com, dipankar@in.ibm.com,
	josht@linux.vnet.ibm.com, schamp@sgi.com, niv@us.ibm.com,
	dvhltc@us.ibm.com, ego@in.ibm.com, rostedt@goodmis.org,
	peterz@infradead.org
Subject: Re: [PATCH, RFC, tip/core/rcu] v3 scalable classic RCU implementation
Date: Sat, 30 Aug 2008 07:29:35 -0700
Message-ID: <20080830142935.GE7107@linux.vnet.ibm.com>
In-Reply-To: <48B919C2.1040809@cn.fujitsu.com>

On Sat, Aug 30, 2008 at 05:58:26PM +0800, Lai Jiangshan wrote:
> I just had a quick review, so my comments are nothing but cleanup.

Thank you for looking it over!!!

>           Thanks, Lai.
> 
> Paul E. McKenney wrote:
> > Hello!
> 
> > +rcu_start_gp(struct rcu_state *rsp, unsigned long iflg)
> > +	__releases(rsp->rda[smp_processor_id()]->lock)
> > +{
> > +	unsigned long flags = iflg;
> > +	struct rcu_data *rdp = rsp->rda[smp_processor_id()];
> > +	struct rcu_node *rnp = rcu_get_root(rsp);
> > +	struct rcu_node *rnp_cur;
> > +	struct rcu_node *rnp_end;
> > +
> > +	if (!cpu_needs_another_gp(rsp, rdp)) {
> >  
> >  		/*
> > -		 * Accessing nohz_cpu_mask before incrementing rcp->cur needs a
> > -		 * Barrier  Otherwise it can cause tickless idle CPUs to be
> > -		 * included in rcp->cpumask, which will extend graceperiods
> > -		 * unnecessarily.
> > +		 * Either there is no need to detect any more grace periods
> > +		 * at the moment, or we are already in the process of
> > +		 * detecting one.  Either way, we should not start a new
> > +		 * RCU grace period, so drop the lock and return.
> >  		 */
> > -		smp_mb();
> > -		cpus_andnot(rcp->cpumask, cpu_online_map, nohz_cpu_mask);
> > +		spin_unlock_irqrestore(&rnp->lock, flags);
> > +		return;
> > +	}
> > +
> > +	/* Advance to a new grace period and initialize state. */
> > +
> > +	rsp->gpnum++;
> > +	rsp->signaled = RCU_SIGNAL_INIT;
> > +	rsp->jiffies_force_qs = jiffies + RCU_JIFFIES_TILL_FORCE_QS;
> > +	record_gp_stall_check_time();
> > +	dyntick_save_completed(rsp, rsp->completed - 1);
> > +	note_new_gpnum(rsp, rdp);
> > +
> > +	/*
> > +	 * Because we are first, we know that all our callbacks will
> > +	 * be covered by this upcoming grace period, even the ones
> > +	 * that were registered arbitrarily recently.
> > +	 */
> > +
> > +	rcu_next_callbacks_are_ready(rdp);
> > +	rdp->nxttail[RCU_WAIT_TAIL] = rdp->nxttail[RCU_NEXT_TAIL];
> >  
> > -		rcp->signaled = 0;
> > +	/* Special-case the common single-level case. */
> > +
> > +	if (NUM_RCU_NODES == 1) {
> > +		rnp->qsmask = rnp->qsmaskinit;
> 
> I tried a mask like qsmaskinit before. The system deadlocked
> when I onlined/offlined CPUs.

And I did need to address this.

> I didn't find out why, but I suspected these two problems:
> 
> problem 1:
> ----race condition 1:
> <cpu_down>
> synchronize_rcu <called from offline handler in other subsystem>
> rcu_offline_cpu
> 
> 
> -----race condition 2:
> rcu_online_cpu
> synchronize_rcu <called from online handler in other subsystem>
> <cpu_up>
> 
> In these two conditions, synchronize_rcu() is blocked forever, because
> synchronize_rcu() has to wait for a CPU in rnp->qsmask, but this
> CPU doesn't run.

First, only one of these race conditions can happen at a time, since
only one online/offline action can be happening at a time.

What I did to solve it was to make force_quiescent_state() check
to see if the CPU currently blocking the grace period is offline.
(The actual checking for offline is in rcu_implicit_offline_qs(),
which is called indirectly from force_quiescent_state().)

So when this race occurs, it is taken care of within three jiffies.
This happened -many- times during my most recent test ("of=" in the
rcudata trace).
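
For reference, the offline check amounts to roughly the following
sketch (the field and counter names here are illustrative rather than
the exact ones in the patch):

	static int rcu_implicit_offline_qs(struct rcu_data *rdp)
	{
		/*
		 * If the CPU blocking the current grace period has gone
		 * offline, report a quiescent state on its behalf so that
		 * force_quiescent_state() can make progress.  Something
		 * like the offline_fqs counter below is what feeds the
		 * "of=" column in the rcudata trace.
		 */
		if (cpu_is_offline(rdp->cpu)) {
			rdp->offline_fqs++;
			return 1;	/* Offline CPU counts as quiescent. */
		}
		return 0;	/* Still online -- keep waiting for it. */
	}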

> problem 2:
> we need to call rcu_offline_cpu() in these two cases in rcu_cpu_notify(),
> since qsmaskinit has already been changed by rcu_online_cpu():
> 
> 	case CPU_UP_CANCELED:
> 	case CPU_UP_CANCELED_FROZEN:

Good catch!!!  Fixed.  The current code would work in this case, but grace
periods would be unnecessarily extended until force_quiescent_state()
got a chance to clean things up.  So very good to fix this one.
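
The notifier now treats a cancelled bring-up the same way as a dead
CPU, along these lines (a sketch of the intended shape only, not the
exact patch text):

	static int __cpuinit rcu_cpu_notify(struct notifier_block *self,
					    unsigned long action, void *hcpu)
	{
		long cpu = (long)hcpu;

		switch (action) {
		case CPU_UP_PREPARE:
		case CPU_UP_PREPARE_FROZEN:
			rcu_online_cpu(cpu);
			break;
		case CPU_UP_CANCELED:		/* Bring-up failed: undo */
		case CPU_UP_CANCELED_FROZEN:	/* rcu_online_cpu()... */
		case CPU_DEAD:			/* ...just as for a dead CPU. */
		case CPU_DEAD_FROZEN:
			rcu_offline_cpu(cpu);
			break;
		default:
			break;
		}
		return NOTIFY_OK;
	}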

> > +static void
> > +cpu_quiet(int cpu, struct rcu_state *rsp, struct rcu_data *rdp, long *lastcomp)
> >  {
> >  	unsigned long flags;
> > +	long mask;
> 
> long mask -> unsigned long mask

Good eyes!  Fixed.

> > +static void __rcu_offline_cpu(int cpu, struct rcu_state *rsp)
> >  {
> > -	if (list) {
> > -		local_irq_disable();
> > -		this_rdp->batch = batch;
> > -		*this_rdp->nxttail[2] = list;
> > -		this_rdp->nxttail[2] = tail;
> > -		local_irq_enable();
> > +	int i;
> > +	unsigned long flags;
> > +	long mask;
> 
> long mask -> unsigned long mask

Here too!

> > + * Queue an RCU callback for invocation after a grace period.
> > + */
> > +void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
> > +{
> > +	unsigned long flags;
> > +
> > +	head->func = func;
> > +	head->next = NULL;
> > +	local_irq_save(flags);
> > +	__call_rcu(head, &rcu_state, &__get_cpu_var(rcu_data));
> > +	local_irq_restore(flags);
> > +}
> 
> struct rcu_state has a field: struct rcu_data *rda[NR_CPUS],
> so we can move these lines surrounding the __call_rcu() call
> into __call_rcu() itself.
> 
> __call_rcu(struct rcu_head *head, struct rcu_state *rsp)
> {
> 	local_irq_save(flags);
> 	struct rcu_data *rdp = rsp->rda[smp_processor_id()];
> .....
> 	local_irq_restore(flags);
> }

Very good point!!!  And then call_rcu() and call_rcu_bh() become
one-liners.  ;-)
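
Something like the following, assuming __call_rcu() also takes the
callback function and does the head initialization, the per-CPU
rcu_data lookup, and the irq disabling itself (rcu_bh_state here is
the bottom-half counterpart, named by analogy):

	void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
	{
		__call_rcu(head, func, &rcu_state);
	}

	void call_rcu_bh(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
	{
		__call_rcu(head, func, &rcu_bh_state);
	}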

> > +static void
> > +rcu_init_percpu_data(int cpu, struct rcu_state *rsp)
> >  {
> > -	if (user ||
> > -	    (idle_cpu(cpu) && !in_softirq() &&
> > -				hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
> > -
> > -		/*
> > -		 * Get here if this CPU took its interrupt from user
> > -		 * mode or from the idle loop, and if this is not a
> > -		 * nested interrupt.  In this case, the CPU is in
> > -		 * a quiescent state, so count it.
> > -		 *
> > -		 * Also do a memory barrier.  This is needed to handle
> > -		 * the case where writes from a preempt-disable section
> > -		 * of code get reordered into schedule() by this CPU's
> > -		 * write buffer.  The memory barrier makes sure that
> > -		 * the rcu_qsctr_inc() and rcu_bh_qsctr_inc() are see
> > -		 * by other CPUs to happen after any such write.
> > -		 */
> > +	unsigned long flags;
> > +	int i;
> > +	long mask;
> 
> long mask -> unsigned long mask

And again!  ;-)  Very good eyes!!!

> > +
> > +/*
> > + * Helper function for rcu_init() that initializes one rcu_state structure.
> > + */
> > +static void __init rcu_init_one(struct rcu_state *rsp)
> > +{
> > +	int i;
> > +	int j;
> > +	struct rcu_node *rnp;
> > +
> > +	/* Initialize the level-tracking arrays. */
> > +
> > +	for (i = 1; i < NUM_RCU_LVLS; i++) {
> > +		rsp->level[i] = rsp->level[i - 1] + rsp->levelcnt[i - 1];
> > +	}
> > +	rcu_init_levelspread(rsp);
> > +
> > +	/* Initialize the elements themselves, starting from the leaves. */
> > +
> > +	for (i = NUM_RCU_LVLS - 1; i >= 0; i--) {
> > +		rnp = rsp->level[i];
> > +		for (j = 0; j < rsp->levelcnt[i]; j++, rnp++) {
> > +			spin_lock_init(&rnp->lock);
> > +			rnp->qsmask = 0;
> > +			rnp->grplo = j * rsp->levelspread[i];
> > +			rnp->grphi = (j + 1) * rsp->levelspread[i] - 1;
> > +			if (rnp->grphi >= rsp->levelcnt[i + 1])
> > +				rnp->grphi = rsp->levelcnt[i + 1] - 1;
> > +			rnp->qsmaskinit = 0;
> 
> If there is no other reason, I would initialize the fields in the order in which they are declared.

Good point, moved.

> > +			if (i != NUM_RCU_LVLS - 1)
> > +				rnp->grplo = rnp->grphi = 0;
> > +			if (i == 0) {
> > +				rnp->grpnum = 0;
> > +				rnp->parent = NULL;
> > +			} else {
> > +				rnp->grpnum = j % rsp->levelspread[i - 1];
> > +				rnp->parent = rsp->level[i - 1] + 
> > +					      j / rsp->levelspread[i - 1];
> > +			}
> > +			rnp->level = i;
> > +		}
> > +	}
> > +}
> > +
> > +/*
> > + * Helper macro for rcu_init().  To be used nowhere else!
> 
> rcu_init -> __rcu_init

Good catch, fixed.

> > + * Assigns leaf node pointers into each CPU's rcu_data structure.
> > + */
> > +#define RCU_DATA_PTR_INIT(rsp, rcu_data) \
> > +do { \
> > +	rnp = (rsp)->level[NUM_RCU_LVLS - 1]; \
> > +	j = 0; \
> > +	for_each_possible_cpu(i) { \
> > +		if (i > rnp[j].grphi) \
> > +			j++; \
> > +		per_cpu(rcu_data, i).mynode = &rnp[j]; \
> > +		(rsp)->rda[i] = &per_cpu(rcu_data, i); \
> > +	} \
> > +} while (0)
> > +
> >  static struct notifier_block __cpuinitdata rcu_nb = {
> >  	.notifier_call	= rcu_cpu_notify,
> >  };
> >  
> 

Thread overview: 94+ messages
2008-08-21 23:43 [PATCH, RFC, tip/core/rcu] scalable classic RCU implementation Paul E. McKenney
2008-08-22  4:37 ` Ingo Molnar
2008-08-22 13:47   ` Paul E. McKenney
2008-08-22 17:22     ` Paul E. McKenney
2008-08-22 18:16       ` Josh Triplett
2008-08-23 16:07       ` Ingo Molnar
2008-08-24  2:44         ` Paul E. McKenney
2008-08-22 23:29 ` Josh Triplett
2008-08-23  1:53   ` Paul E. McKenney
2008-08-25 22:02     ` Josh Triplett
2008-08-26 16:05       ` Paul E. McKenney
2008-08-27  0:38         ` Josh Triplett
2008-08-27 18:34           ` Paul E. McKenney
2008-08-27 20:23             ` Josh Triplett
2008-08-27 20:41               ` Paul E. McKenney
2008-08-25 10:34   ` Peter Zijlstra
2008-08-25 15:16     ` Paul E. McKenney
2008-08-25 15:26       ` Peter Zijlstra
2008-08-27 18:28         ` Paul E. McKenney
2008-08-24  8:08 ` Manfred Spraul
2008-08-24 16:32   ` Paul E. McKenney
2008-08-24 18:25     ` Manfred Spraul
2008-08-24 21:19       ` Paul E. McKenney
2008-08-25  0:07 ` [PATCH, RFC, tip/core/rcu] v2 " Paul E. McKenney
2008-08-30  0:49   ` [PATCH, RFC, tip/core/rcu] v3 " Paul E. McKenney
2008-08-30  9:33     ` Peter Zijlstra
2008-08-30 14:10       ` Paul E. McKenney
2008-08-30 15:40         ` Peter Zijlstra
2008-08-30 19:38           ` Paul E. McKenney
2008-09-02 13:26           ` Mathieu Desnoyers
2008-09-02 13:41             ` Peter Zijlstra
2008-09-02 14:55               ` Paul E. McKenney
2008-08-30  9:58     ` Lai Jiangshan
2008-08-30 13:32       ` Manfred Spraul
2008-08-30 14:34         ` Paul E. McKenney
2008-08-31 10:58           ` Manfred Spraul
2008-08-31 17:20             ` Paul E. McKenney
2008-08-31 17:45               ` Manfred Spraul
2008-08-31 17:55                 ` Paul E. McKenney
2008-08-31 18:18                   ` Manfred Spraul
2008-08-31 19:23                     ` Paul E. McKenney
2008-08-30 14:29       ` Paul E. McKenney [this message]
2008-09-01  9:38     ` Andi Kleen
2008-09-02  1:05       ` Paul E. McKenney
2008-09-02  6:18         ` Andi Kleen
2008-09-05 15:29     ` [PATCH, RFC] v4 " Paul E. McKenney
2008-09-05 19:33       ` Andrew Morton
2008-09-05 23:04         ` Paul E. McKenney
2008-09-05 23:52           ` Andrew Morton
2008-09-06  4:16             ` Paul E. McKenney
2008-09-06 16:37       ` Manfred Spraul
2008-09-07 17:25         ` Paul E. McKenney
2008-09-07 10:18       ` [RFC, PATCH] Add a CPU_STARTING notifier (was: Re: [PATCH, RFC] v4 scalable classic RCU implementation) Manfred Spraul
2008-09-07 11:07         ` Andi Kleen
2008-09-07 19:46         ` Paul E. McKenney
2008-09-15 16:02       ` [PATCH, RFC] v4 scalable classic RCU implementation Paul E. McKenney
2008-09-16 16:52         ` Manfred Spraul
2008-09-16 17:30           ` Paul E. McKenney
2008-09-16 17:48             ` Manfred Spraul
2008-09-16 18:22               ` Paul E. McKenney
2008-09-21 11:09               ` Manfred Spraul
2008-09-21 21:14                 ` Paul E. McKenney
2008-09-23 23:53         ` [PATCH, RFC] v6 " Paul E. McKenney
2008-09-25  7:26           ` Ingo Molnar
2008-09-25 14:05             ` Paul E. McKenney
2008-09-25  7:29           ` Ingo Molnar
2008-09-25 14:18             ` Paul E. McKenney
2008-10-10 16:09           ` [PATCH, RFC] v7 " Paul E. McKenney
2008-10-12 15:52             ` Manfred Spraul
2008-10-12 22:46               ` Paul E. McKenney
2008-10-13 18:03                 ` Manfred Spraul
2008-10-15  1:11                   ` Paul E. McKenney
2008-10-15  8:13                     ` Manfred Spraul
2008-10-15 15:26                       ` Paul E. McKenney
2008-10-22 18:41                         ` Manfred Spraul
2008-10-22 21:02                           ` Paul E. McKenney
2008-10-22 21:24                             ` Manfred Spraul
2008-10-27 16:45                               ` Paul E. McKenney
2008-10-27 19:48                                 ` Manfred Spraul
2008-10-27 23:52                                   ` Paul E. McKenney
2008-10-28  5:30                                     ` Manfred Spraul
2008-10-28 15:17                                       ` Paul E. McKenney
2008-10-28 17:21                                         ` Manfred Spraul
2008-10-28 17:35                                           ` Paul E. McKenney
2008-10-17  8:34             ` Gautham R Shenoy
2008-10-17 15:35               ` Gautham R Shenoy
2008-10-17 15:46                 ` Paul E. McKenney
2008-10-17 15:43               ` Paul E. McKenney
2008-12-08 18:42               ` Paul E. McKenney
2008-11-02 20:10             ` Manfred Spraul
2008-11-03 20:33               ` Paul E. McKenney
2008-11-05 19:48                 ` Manfred Spraul
2008-11-05 21:27                   ` Paul E. McKenney
2008-11-15 23:20             ` [PATCH, RFC] v8 " Paul E. McKenney
