linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-kernel@vger.kernel.org,
	Josh Triplett <josh@joshtriplett.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	tglx@linutronix.de, Mike Galbraith <efault@gmx.de>
Subject: Re: [PATCH v3] rcu: Allow to eliminate softirq processing from rcutree
Date: Fri, 22 Mar 2019 11:50:49 -0400	[thread overview]
Message-ID: <20190322155049.GA86662@google.com> (raw)
In-Reply-To: <20190322145823.GM4102@linux.ibm.com>

On Fri, Mar 22, 2019 at 07:58:23AM -0700, Paul E. McKenney wrote:
[snip]
> > > >  #ifdef CONFIG_RCU_NOCB_CPU
> > > >  static cpumask_var_t rcu_nocb_mask; /* CPUs to have callbacks offloaded. */
> > > > @@ -94,6 +72,8 @@ static void __init rcu_bootup_announce_oddness(void)
> > > >  		pr_info("\tRCU debug GP init slowdown %d jiffies.\n", gp_init_delay);
> > > >  	if (gp_cleanup_delay)
> > > >  		pr_info("\tRCU debug GP init slowdown %d jiffies.\n", gp_cleanup_delay);
> > > > +	if (!use_softirq)
> > > > +		pr_info("\tRCU_SOFTIRQ processing moved to rcuc kthreads.\n");
> > > >  	if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG))
> > > >  		pr_info("\tRCU debug extended QS entry/exit.\n");
> > > >  	rcupdate_announce_bootup_oddness();
> > > > @@ -629,7 +609,10 @@ static void rcu_read_unlock_special(struct task_struct *t)
> > > >  		/* Need to defer quiescent state until everything is enabled. */
> > > >  		if (irqs_were_disabled) {
> > > >  			/* Enabling irqs does not reschedule, so... */
> > > > -			raise_softirq_irqoff(RCU_SOFTIRQ);
> > > > +			if (!use_softirq)
> > > > +				raise_softirq_irqoff(RCU_SOFTIRQ);
> > > > +			else
> > > > +				invoke_rcu_core();
> > > 
> > > This can result in deadlock.  This happens when the scheduler invokes
> > > rcu_read_unlock() with one of the rq or pi locks held, which means that
> > > interrupts are disabled.  And it also means that the wakeup done in
> > > invoke_rcu_core() could go after the same rq or pi lock.
> > > 
> > > What we really need here is some way to make soemthing happen on this
> > > CPU just after interrupts are re-enabled.  Here are the options I see:
> > > 
> > > 1.	Do set_tsk_need_resched() and set_preempt_need_resched(),
> > > 	just like in the "else" clause below.  This sort of works, but
> > > 	relies on some later interrupt or similar to get things started.
> > > 	This is just fine for normal grace periods, but not so much for
> > > 	expedited grace periods.
> > > 
> > > 2.	IPI some other CPU and have it IPI us back.  Not such a good plan
> > > 	when running an SMP kernel on a single CPU.
> > > 
> > > 3.	Have a "stub" RCU_SOFTIRQ that contains only the following:
> > > 
> > > 	/* Report any deferred quiescent states if preemption enabled. */
> > > 	if (!(preempt_count() & PREEMPT_MASK)) {
> > > 		rcu_preempt_deferred_qs(current);
> > > 	} else if (rcu_preempt_need_deferred_qs(current)) {
> > > 		set_tsk_need_resched(current);
> > > 		set_preempt_need_resched();
> > > 	}
> > > 
> > > 4.	Except that raise_softirq_irqoff() could potentially have this
> > > 	same problem if rcu_read_unlock() is invoked at process level
> > > 	from the scheduler with either rq or pi locks held.  :-/
> > > 
> > > 	Which raises the question "why aren't I seeing hangs and
> > > 	lockdep splats?"
> > 
> > Interesting, could it be you're not seeing a hang in the regular case,
> > because enqueuing ksoftirqd on the same CPU as where the rcu_read_unlock is
> > happening is a rare event? First, ksoftirqd has to even be awakened in the
> > first place. On the other hand, with the new code the thread is always awaked
> > and is more likely to run into the issue you found?
> 
> No, in many cases, including the self-deadlock that showed up last night,
> raise_softirq_irqoff() will simply set a bit in a per-CPU variable.
> One case where this happens is when called from an interrupt handler.

I think we are saying the same thing, in some cases ksoftirqd will be
awakened and some case it will not. I will go through all scenarios to
convince myself it is safe, if I find some issue I will let you know.

> > The lockdep splats should be a more common occurence though IMO. If you could
> > let me know which RCU config is hanging, I can try to debug this at my end as
> > well.
> 
> TREE01, TREE02, TREE03, and TREE09.  I would guess that TREE08 would also
> do the same thing, given that it also sets PREEMPT=y and tests Tree RCU.
> 
> Please see the patch I posted and tested overnight.  I suspect that there
> is a better fix, but this does at least seem to suppress the error.

Ok, will do.

> > > Also, having lots of non-migratable timers might be considered unfriendly,
> > > though they shouldn't be -that- heavily utilized.  Yet, anyway...
> > > I could try adding logic to local_irq_enable() and local_irq_restore(),
> > > but that probably wouldn't go over all that well.  Besides, sometimes
> > > interrupt enabling happens in assembly language.
> > > 
> > > It is quite likely that delays to expedited grace periods wouldn't
> > > happen all that often.  First, the grace period has to start while
> > > the CPU itself (not some blocked task) is in an RCU read-side critical
> > > section, second, that critical section cannot be preempted, and third
> > > the rcu_read_unlock() must run with interrupts disabled.
> > > 
> > > Ah, but that sequence of events is not supposed to happen with the
> > > scheduler lock!
> > > 
> > > From Documentation/RCU/Design/Requirements/Requirements.html:
> > > 
> > > 	It is forbidden to hold any of scheduler's runqueue or
> > > 	priority-inheritance spinlocks across an rcu_read_unlock()
> > > 	unless interrupts have been disabled across the entire RCU
> > > 	read-side critical section, that is, up to and including the
> > > 	matching rcu_read_lock().
> > > 
> > > Here are the reasons we even get to rcu_read_unlock_special():
> > > 
> > > 1.	The just-ended RCU read-side critical section was preempted.
> > > 	This clearly cannot happen if interrupts are disabled across
> > > 	the entire critical section.
> > > 
> > > 2.	The scheduling-clock interrupt noticed that this critical
> > > 	section has been taking a long time.  But scheduling-clock
> > > 	interrupts also cannot happen while interrupts are disabled.
> > > 
> > > 3.	An expedited grace periods started during this critical
> > > 	section.  But if that happened, the corresponding IPI would
> > > 	have waited until this CPU enabled interrupts, so this
> > > 	cannot happen either.
> > > 
> > > So the call to invoke_rcu_core() should be OK after all.
> > > 
> > > Which is a bit of a disappointment, given that I am still seeing hangs!
> > 
> > Oh ok, discount whatever I just said then ;-) Indeed I remember this
> > requirement too now. Your neat documentation skills are indeed life saving :D
> 
> No, this did turn out to be the problem area.  Or at least one of the
> problem areas.  Again, see my earlier email.

Ok. Too many emails so I got confused :-D. I also forgot which version of the
patch are we testing since I don't think an updated one was posted. But I
will refer to your last night diff dig out the base patch from your git tree,
no problem.

> > > I might replace this invoke_rcu_core() with set_tsk_need_resched() and
> > > set_preempt_need_resched() to see if that gets rid of the hangs, but
> > > first...
> > 
> > Could we use the NMI watchdog to dump the stack at the time of the hang? May
> > be a deadlock will present on the stack (I think its config is called
> > HARDLOCKUP_DETECTOR or something).
> 
> Another approach would be to instrument the locking code that notices
> the recursive acquisition.  Or to run lockdep...  Because none of the
> failing scenarios enable lockdep!  ;-)

I was wondering why lockdep is not always turned on in your testing. Is it
due to performance concerns?

thanks,

 - Joel

  reply	other threads:[~2019-03-22 15:50 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-15 11:11 [PATCH] rcu: Allow to eliminate softirq processing from rcutree Sebastian Andrzej Siewior
2019-03-15 13:35 ` Steven Rostedt
2019-03-15 13:57   ` Sebastian Andrzej Siewior
2019-03-18  2:24 ` Paul E. McKenney
2019-03-19 11:44   ` [PATCH v2] " Sebastian Andrzej Siewior
2019-03-19 15:59     ` Paul E. McKenney
2019-03-19 16:24       ` Sebastian Andrzej Siewior
2019-03-19 16:50         ` Paul E. McKenney
2019-03-19 17:02           ` Sebastian Andrzej Siewior
2019-03-20 11:32     ` Sebastian Andrzej Siewior
2019-03-20 15:21       ` Paul E. McKenney
2019-03-20 15:44         ` Paul E. McKenney
2019-03-20 16:05           ` Sebastian Andrzej Siewior
2019-03-20 16:15             ` Paul E. McKenney
2019-03-20 16:35               ` Sebastian Andrzej Siewior
2019-03-20 17:30                 ` Paul E. McKenney
2019-03-20 17:59                   ` Sebastian Andrzej Siewior
2019-03-20 18:12                     ` Paul E. McKenney
2019-03-20 18:14                       ` Sebastian Andrzej Siewior
2019-03-20 21:13                         ` [PATCH v3] " Sebastian Andrzej Siewior
2019-03-20 23:46                           ` Paul E. McKenney
2019-03-21  8:27                             ` Sebastian Andrzej Siewior
2019-03-21 13:26                               ` Paul E. McKenney
2019-03-21 23:32                             ` Paul E. McKenney
2019-03-22  7:35                               ` Paul E. McKenney
2019-03-22 12:43                                 ` Paul E. McKenney
2019-03-22 13:42                               ` Joel Fernandes
2019-03-22 14:58                                 ` Paul E. McKenney
2019-03-22 15:50                                   ` Joel Fernandes [this message]
2019-03-22 16:26                                     ` Paul E. McKenney
2019-03-22 18:07                                       ` Paul E. McKenney
2019-03-22 23:48                           ` Joel Fernandes
2019-03-23  0:25                             ` Paul E. McKenney
2019-03-23  1:04                               ` Joel Fernandes
2019-03-23 16:10                               ` Paul E. McKenney
2019-03-24 23:42                                 ` Paul E. McKenney
2019-03-25 13:41                                   ` Joel Fernandes
2019-03-25 15:08                                     ` Paul E. McKenney
2019-03-25 15:52                                       ` Paul E. McKenney
2019-03-20  0:26 ` [PATCH] " Joel Fernandes
2019-03-20 11:28   ` Sebastian Andrzej Siewior
2019-03-21 12:06     ` Joel Fernandes
2019-03-21 13:52       ` Paul E. McKenney
2019-03-20 15:24   ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190322155049.GA86662@google.com \
    --to=joel@joelfernandes.org \
    --cc=bigeasy@linutronix.de \
    --cc=efault@gmx.de \
    --cc=jiangshanlai@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=paulmck@linux.ibm.com \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).