From: Steven Rostedt <rostedt@goodmis.org>
To: paulmck@linux.vnet.ibm.com
Cc: Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, mingo@elte.hu,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
	eric.dumazet@gmail.com, darren@dvhart.com, fweisbec@gmail.com,
	sbw@mit.edu, patches@linaro.org,
	"Paul E. McKenney" <paul.mckenney@linaro.org>
Subject: Re: [PATCH tip/core/rcu 11/15] rcu: Avoid spurious RCU CPU stall warnings
Date: Thu, 06 Sep 2012 17:41:01 -0400
Message-ID: <1346967661.1680.52.camel@gandalf.local.home>
In-Reply-To: <20120906210354.GC2448@linux.vnet.ibm.com>

On Thu, 2012-09-06 at 14:03 -0700, Paul E. McKenney wrote:

> Here are a few other ways that stalls can happen:
> 
> o	A CPU looping in an RCU read-side critical section.

For a minute? That's a bug.

> 	
> o	A CPU looping with interrupts disabled.  This condition can
> 	result in RCU-sched and RCU-bh stalls.

Also a bug.

> 
> o	A CPU looping with preemption disabled.  This condition can
> 	result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh
> 	stalls.

Bug as well.

> 
> o	A CPU looping with bottom halves disabled.  This condition can
> 	result in RCU-sched and RCU-bh stalls.

Bug too.

> 
> o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
> 	without invoking schedule().

Another bug.

> 
> o	A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
> 	happen to preempt a low-priority task in the middle of an RCU
> 	read-side critical section.   This is especially damaging if
> 	that low-priority task is not permitted to run on any other CPU,
> 	in which case the next RCU grace period can never complete, which
> 	will eventually cause the system to run out of memory and hang.
> 	While the system is in the process of running itself out of
> 	memory, you might see stall-warning messages.

Buggy system.

> 
> o	A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
> 	is running at a higher priority than the RCU softirq threads.
> 	This will prevent RCU callbacks from ever being invoked,
> 	and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent
> 	RCU grace periods from ever completing.  Either way, the
> 	system will eventually run out of memory and hang.  In the
> 	CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning
> 	messages.

Not really a bug, but the developers need a spanking.

> 
> o	A hardware or software issue shuts off the scheduler-clock
> 	interrupt on a CPU that is not in dyntick-idle mode.  This
> 	problem really has happened, and seems to be most likely to
> 	result in RCU CPU stall warnings for CONFIG_NO_HZ=n kernels.

Driving the bug.

> 
> o	A bug in the RCU implementation.

Bug in the name.

> 
> o	A hardware failure.  This is quite unlikely, but has occurred
> 	at least once in real life.  A CPU failed in a running system,
> 	becoming unresponsive, but not causing an immediate crash.
> 	This resulted in a series of RCU CPU stall warnings, eventually
> 	leading to the realization that the CPU had failed.

Hardware bug.

So, where's the "spurious RCU CPU stall warnings"?

All these cases deserve a warning.

-- Steve


