All of lore.kernel.org
 help / color / mirror / Atom feed
* rcu_sched stall detected, but no state dump
@ 2014-12-10 12:52 Miroslav Benes
  2014-12-10 16:28 ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Miroslav Benes @ 2014-12-10 12:52 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Linux Kernel Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1906 bytes --]


Hi,

today I came across RCU stall which was correctly detected, but there is 
no state dump. This is a bit suspicious, I think. 

This is the output in serial console:

[  105.727003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  105.727003]  (detected by 0, t=21002 jiffies, g=3269, c=3268, q=138)
[  105.727003] INFO: Stall ended before state dump start
[  168.732006] INFO: rcu_sched detected stalls on CPUs/tasks:
[  168.732006]  (detected by 0, t=84007 jiffies, g=3269, c=3268, q=270)
[  168.732006] INFO: Stall ended before state dump start
[  231.737003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  231.737003]  (detected by 0, t=147012 jiffies, g=3269, c=3268, q=388)
[  231.737003] INFO: Stall ended before state dump start
[  294.742003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  294.742003]  (detected by 0, t=210017 jiffies, g=3269, c=3268, q=539)
[  294.742003] INFO: Stall ended before state dump start
[  357.747003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  357.747003]  (detected by 0, t=273022 jiffies, g=3269, c=3268, q=693)
[  357.747003] INFO: Stall ended before state dump start
[  420.752003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  420.752003]  (detected by 0, t=336027 jiffies, g=3269, c=3268, q=806)
[  420.752003] INFO: Stall ended before state dump start
...

It can be reproduced by trivial code attached to this mail (infinite 
loop in kernel thread created in kernel module). I have CONFIG_PREEMPT=n. 
The kernel thread is scheduled on the same CPU which causes soft lockup 
(reliably detected when lockup detector is on). There is certainly RCU 
stall, but I would expect a state dump. Is this an expected behaviour? 
Maybe I overlooked some config option, don't know.

I tested 3.18 and also next-20141210. If it is improper behaviour I could 
try to find a good kernel release and bisect it.

Best regards,
--
Miroslav Benes
SUSE Labs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: TEXT/x-c++src; name=kthread_mod.c, Size: 604 bytes --]

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kthread.h>
#include <linux/delay.h>

static struct task_struct *test_thread;

static int test_thread_fn(void *data)
{
	while (1) {
		if (kthread_should_stop())
			break;

//		msleep(1000);
	}

	return 0;
}

static int __init test_module_init(void)
{
	test_thread = kthread_run(test_thread_fn, NULL, "test_thread");

	return 0;
}

static void __exit test_module_cleanup(void)
{
	kthread_stop(test_thread);
}

module_init(test_module_init);
module_exit(test_module_cleanup);

MODULE_LICENSE("GPL");



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-10 12:52 rcu_sched stall detected, but no state dump Miroslav Benes
@ 2014-12-10 16:28 ` Paul E. McKenney
  2014-12-11  9:35   ` Miroslav Benes
  0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2014-12-10 16:28 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: Linux Kernel Mailing List

On Wed, Dec 10, 2014 at 01:52:02PM +0100, Miroslav Benes wrote:
> 
> Hi,
> 
> today I came across RCU stall which was correctly detected, but there is 
> no state dump. This is a bit suspicious, I think. 
> 
> This is the output in serial console:
> 
> [  105.727003] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  105.727003]  (detected by 0, t=21002 jiffies, g=3269, c=3268, q=138)
> [  105.727003] INFO: Stall ended before state dump start
> [  168.732006] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  168.732006]  (detected by 0, t=84007 jiffies, g=3269, c=3268, q=270)
> [  168.732006] INFO: Stall ended before state dump start
> [  231.737003] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  231.737003]  (detected by 0, t=147012 jiffies, g=3269, c=3268, q=388)
> [  231.737003] INFO: Stall ended before state dump start
> [  294.742003] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  294.742003]  (detected by 0, t=210017 jiffies, g=3269, c=3268, q=539)
> [  294.742003] INFO: Stall ended before state dump start
> [  357.747003] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  357.747003]  (detected by 0, t=273022 jiffies, g=3269, c=3268, q=693)
> [  357.747003] INFO: Stall ended before state dump start
> [  420.752003] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  420.752003]  (detected by 0, t=336027 jiffies, g=3269, c=3268, q=806)
> [  420.752003] INFO: Stall ended before state dump start
> ...
> 
> It can be reproduced by trivial code attached to this mail (infinite 
> loop in kernel thread created in kernel module). I have CONFIG_PREEMPT=n. 
> The kernel thread is scheduled on the same CPU which causes soft lockup 
> (reliably detected when lockup detector is on). There is certainly RCU 
> stall, but I would expect a state dump. Is this an expected behaviour? 
> Maybe I overlooked some config option, don't know.

Definitely not expected behavior!  Unless you have only one CPU, but in
that case you should be running tiny RCU, not tree RCU.

> I tested 3.18 and also next-20141210. If it is improper behaviour I could 
> try to find a good kernel release and bisect it.

Please!  Could you also please try the (untested) diagnostic patch below
on either 3.18 or -next?  It should print messages covering all your
CPUs, and the CPU that your kernel module's kthread is running on should
show up as a one bit in the corresponding "mask" printout.

Could you also please check what CPU the rcu_sched kthread is running on?
One possibility is that this kthread is for some reason pinned on the
same CPU that is running your kthread.

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 884e0ff020f1..d4018c025ac6 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1129,6 +1129,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
 	print_cpu_stall_info_begin();
 	rcu_for_each_leaf_node(rsp, rnp) {
 		raw_spin_lock_irqsave(&rnp->lock, flags);
+		pr_err("[ CPUs %d-%d mask %#lx ]\n", rnp->grplo, rnp->grphi, rnp->qsmask);
 		ndetected += rcu_print_task_stall(rnp);
 		if (rnp->qsmask != 0) {
 			for (cpu = 0; cpu <= rnp->grphi - rnp->grplo; cpu++)


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-10 16:28 ` Paul E. McKenney
@ 2014-12-11  9:35   ` Miroslav Benes
  2014-12-11 16:50     ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Miroslav Benes @ 2014-12-11  9:35 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Linux Kernel Mailing List

On Wed, 10 Dec 2014, Paul E. McKenney wrote:

> On Wed, Dec 10, 2014 at 01:52:02PM +0100, Miroslav Benes wrote:
> > 
> > Hi,
> > 
> > today I came across RCU stall which was correctly detected, but there is 
> > no state dump. This is a bit suspicious, I think. 
> > 
> > This is the output in serial console:
> > 
> > [  105.727003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  105.727003]  (detected by 0, t=21002 jiffies, g=3269, c=3268, q=138)
> > [  105.727003] INFO: Stall ended before state dump start
> > [  168.732006] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  168.732006]  (detected by 0, t=84007 jiffies, g=3269, c=3268, q=270)
> > [  168.732006] INFO: Stall ended before state dump start
> > [  231.737003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  231.737003]  (detected by 0, t=147012 jiffies, g=3269, c=3268, q=388)
> > [  231.737003] INFO: Stall ended before state dump start
> > [  294.742003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  294.742003]  (detected by 0, t=210017 jiffies, g=3269, c=3268, q=539)
> > [  294.742003] INFO: Stall ended before state dump start
> > [  357.747003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  357.747003]  (detected by 0, t=273022 jiffies, g=3269, c=3268, q=693)
> > [  357.747003] INFO: Stall ended before state dump start
> > [  420.752003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  420.752003]  (detected by 0, t=336027 jiffies, g=3269, c=3268, q=806)
> > [  420.752003] INFO: Stall ended before state dump start
> > ...
> > 
> > It can be reproduced by trivial code attached to this mail (infinite 
> > loop in kernel thread created in kernel module). I have CONFIG_PREEMPT=n. 
> > The kernel thread is scheduled on the same CPU which causes soft lockup 
> > (reliably detected when lockup detector is on). There is certainly RCU 
> > stall, but I would expect a state dump. Is this an expected behaviour? 
> > Maybe I overlooked some config option, don't know.
> 
> Definitely not expected behavior!  Unless you have only one CPU, but in
> that case you should be running tiny RCU, not tree RCU.

So indeed I messed up my configs somehow and run the code on uniprocessor 
with SMP=y and tree RCU. With more processors RCU stall is detected and 
correct state is dumped. On uniprocessor with SMP=n and tiny RCU 
softlockup is detected, but no RCU stall in the log (is this correct?). So 
I'm really sorry for the noise.

Anyway I still think that running SMP kernel with tree RCU on 
uniprocessor is possible option (albeit suboptimal and maybe improbable). 
Should I proceed with your patch below and bisection or am I mistaken 
completely and we can leave it because there is no problem?

Thanks,
Miroslav

> > I tested 3.18 and also next-20141210. If it is improper behaviour I could 
> > try to find a good kernel release and bisect it.
> 
> Please!  Could you also please try the (untested) diagnostic patch below
> on either 3.18 or -next?  It should print messages covering all your
> CPUs, and the CPU that your kernel module's kthread is running on should
> show up as a one bit in the corresponding "mask" printout.
> 
> Could you also please check what CPU the rcu_sched kthread is running on?
> One possibility is that this kthread is for some reason pinned on the
> same CPU that is running your kthread.
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 884e0ff020f1..d4018c025ac6 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1129,6 +1129,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
>  	print_cpu_stall_info_begin();
>  	rcu_for_each_leaf_node(rsp, rnp) {
>  		raw_spin_lock_irqsave(&rnp->lock, flags);
> +		pr_err("[ CPUs %d-%d mask %#lx ]\n", rnp->grplo, rnp->grphi, rnp->qsmask);
>  		ndetected += rcu_print_task_stall(rnp);
>  		if (rnp->qsmask != 0) {
>  			for (cpu = 0; cpu <= rnp->grphi - rnp->grplo; cpu++)
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-11  9:35   ` Miroslav Benes
@ 2014-12-11 16:50     ` Paul E. McKenney
  2014-12-11 19:09       ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2014-12-11 16:50 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: Linux Kernel Mailing List

On Thu, Dec 11, 2014 at 10:35:15AM +0100, Miroslav Benes wrote:
> On Wed, 10 Dec 2014, Paul E. McKenney wrote:
> 
> > On Wed, Dec 10, 2014 at 01:52:02PM +0100, Miroslav Benes wrote:
> > > 
> > > Hi,
> > > 
> > > today I came across RCU stall which was correctly detected, but there is 
> > > no state dump. This is a bit suspicious, I think. 
> > > 
> > > This is the output in serial console:
> > > 
> > > [  105.727003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > [  105.727003]  (detected by 0, t=21002 jiffies, g=3269, c=3268, q=138)
> > > [  105.727003] INFO: Stall ended before state dump start
> > > [  168.732006] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > [  168.732006]  (detected by 0, t=84007 jiffies, g=3269, c=3268, q=270)
> > > [  168.732006] INFO: Stall ended before state dump start
> > > [  231.737003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > [  231.737003]  (detected by 0, t=147012 jiffies, g=3269, c=3268, q=388)
> > > [  231.737003] INFO: Stall ended before state dump start
> > > [  294.742003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > [  294.742003]  (detected by 0, t=210017 jiffies, g=3269, c=3268, q=539)
> > > [  294.742003] INFO: Stall ended before state dump start
> > > [  357.747003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > [  357.747003]  (detected by 0, t=273022 jiffies, g=3269, c=3268, q=693)
> > > [  357.747003] INFO: Stall ended before state dump start
> > > [  420.752003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > [  420.752003]  (detected by 0, t=336027 jiffies, g=3269, c=3268, q=806)
> > > [  420.752003] INFO: Stall ended before state dump start
> > > ...
> > > 
> > > It can be reproduced by trivial code attached to this mail (infinite 
> > > loop in kernel thread created in kernel module). I have CONFIG_PREEMPT=n. 
> > > The kernel thread is scheduled on the same CPU which causes soft lockup 
> > > (reliably detected when lockup detector is on). There is certainly RCU 
> > > stall, but I would expect a state dump. Is this an expected behaviour? 
> > > Maybe I overlooked some config option, don't know.
> > 
> > Definitely not expected behavior!  Unless you have only one CPU, but in
> > that case you should be running tiny RCU, not tree RCU.
> 
> So indeed I messed up my configs somehow and run the code on uniprocessor 
> with SMP=y and tree RCU. With more processors RCU stall is detected and 
> correct state is dumped. On uniprocessor with SMP=n and tiny RCU 
> softlockup is detected, but no RCU stall in the log (is this correct?). So 
> I'm really sorry for the noise.
> 
> Anyway I still think that running SMP kernel with tree RCU on 
> uniprocessor is possible option (albeit suboptimal and maybe improbable). 
> Should I proceed with your patch below and bisection or am I mistaken 
> completely and we can leave it because there is no problem?

Not a problem, there have been some interesting RCU CPU stall warnings
recently, and your data did add some insight.

So the combination SMP=n PREEMPT=y can happen straightforwardly via
kbuild.  The combination SMP=n PREEMPT=n can happen (somewhat less)
straightforwardly by running an SMP=y PREEMPT=n kernel on a single-CPU
system.  In both cases, what can happen is that RCU's grace-period
kthreads are starved, which can result in those reports.

And these reports are confusing.  I am considering attempting to improve
the diagnostics.  If I do, would you be willing to test the resulting
patches?

							Thanx, Paul

> Thanks,
> Miroslav
> 
> > > I tested 3.18 and also next-20141210. If it is improper behaviour I could 
> > > try to find a good kernel release and bisect it.
> > 
> > Please!  Could you also please try the (untested) diagnostic patch below
> > on either 3.18 or -next?  It should print messages covering all your
> > CPUs, and the CPU that your kernel module's kthread is running on should
> > show up as a one bit in the corresponding "mask" printout.
> > 
> > Could you also please check what CPU the rcu_sched kthread is running on?
> > One possibility is that this kthread is for some reason pinned on the
> > same CPU that is running your kthread.
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 884e0ff020f1..d4018c025ac6 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1129,6 +1129,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
> >  	print_cpu_stall_info_begin();
> >  	rcu_for_each_leaf_node(rsp, rnp) {
> >  		raw_spin_lock_irqsave(&rnp->lock, flags);
> > +		pr_err("[ CPUs %d-%d mask %#lx ]\n", rnp->grplo, rnp->grphi, rnp->qsmask);
> >  		ndetected += rcu_print_task_stall(rnp);
> >  		if (rnp->qsmask != 0) {
> >  			for (cpu = 0; cpu <= rnp->grphi - rnp->grplo; cpu++)
> > 
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-11 16:50     ` Paul E. McKenney
@ 2014-12-11 19:09       ` Paul E. McKenney
  2014-12-12 14:06         ` Miroslav Benes
  0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2014-12-11 19:09 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: Linux Kernel Mailing List

On Thu, Dec 11, 2014 at 08:50:10AM -0800, Paul E. McKenney wrote:
> On Thu, Dec 11, 2014 at 10:35:15AM +0100, Miroslav Benes wrote:
> > On Wed, 10 Dec 2014, Paul E. McKenney wrote:
> > 
> > > On Wed, Dec 10, 2014 at 01:52:02PM +0100, Miroslav Benes wrote:
> > > > 
> > > > Hi,
> > > > 
> > > > today I came across RCU stall which was correctly detected, but there is 
> > > > no state dump. This is a bit suspicious, I think. 
> > > > 
> > > > This is the output in serial console:
> > > > 
> > > > [  105.727003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > [  105.727003]  (detected by 0, t=21002 jiffies, g=3269, c=3268, q=138)
> > > > [  105.727003] INFO: Stall ended before state dump start
> > > > [  168.732006] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > [  168.732006]  (detected by 0, t=84007 jiffies, g=3269, c=3268, q=270)
> > > > [  168.732006] INFO: Stall ended before state dump start
> > > > [  231.737003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > [  231.737003]  (detected by 0, t=147012 jiffies, g=3269, c=3268, q=388)
> > > > [  231.737003] INFO: Stall ended before state dump start
> > > > [  294.742003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > [  294.742003]  (detected by 0, t=210017 jiffies, g=3269, c=3268, q=539)
> > > > [  294.742003] INFO: Stall ended before state dump start
> > > > [  357.747003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > [  357.747003]  (detected by 0, t=273022 jiffies, g=3269, c=3268, q=693)
> > > > [  357.747003] INFO: Stall ended before state dump start
> > > > [  420.752003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > [  420.752003]  (detected by 0, t=336027 jiffies, g=3269, c=3268, q=806)
> > > > [  420.752003] INFO: Stall ended before state dump start
> > > > ...
> > > > 
> > > > It can be reproduced by trivial code attached to this mail (infinite 
> > > > loop in kernel thread created in kernel module). I have CONFIG_PREEMPT=n. 
> > > > The kernel thread is scheduled on the same CPU which causes soft lockup 
> > > > (reliably detected when lockup detector is on). There is certainly RCU 
> > > > stall, but I would expect a state dump. Is this an expected behaviour? 
> > > > Maybe I overlooked some config option, don't know.
> > > 
> > > Definitely not expected behavior!  Unless you have only one CPU, but in
> > > that case you should be running tiny RCU, not tree RCU.
> > 
> > So indeed I messed up my configs somehow and run the code on uniprocessor 
> > with SMP=y and tree RCU. With more processors RCU stall is detected and 
> > correct state is dumped. On uniprocessor with SMP=n and tiny RCU 
> > softlockup is detected, but no RCU stall in the log (is this correct?). So 
> > I'm really sorry for the noise.
> > 
> > Anyway I still think that running SMP kernel with tree RCU on 
> > uniprocessor is possible option (albeit suboptimal and maybe improbable). 
> > Should I proceed with your patch below and bisection or am I mistaken 
> > completely and we can leave it because there is no problem?
> 
> Not a problem, there have been some interesting RCU CPU stall warnings
> recently, and your data did add some insight.
> 
> So the combination SMP=n PREEMPT=y can happen straightforwardly via
> kbuild.  The combination SMP=n PREEMPT=n can happen (somewhat less)
> straightforwardly by running an SMP=y PREEMPT=n kernel on a single-CPU
> system.  In both cases, what can happen is that RCU's grace-period
> kthreads are starved, which can result in those reports.
> 
> And these reports are confusing.  I am considering attempting to improve
> the diagnostics.  If I do, would you be willing to test the resulting
> patches?

Like this one, for example.  ;-)

							Thanx, Paul

------------------------------------------------------------------------

rcu: Improve diagnostics for spurious RCU CPU stall warnings

The current RCU CPU stall warning code will print "Stall ended before
state dump start" any time that the stall-warning code is triggered on
a CPU that has already reported a quiescent state for the current grace
period and if all quiescent states have been reported for the current
grace period.  However, a true stall can result in these symptoms, for
example, by preventing RCU's grace-period kthreads from ever running

This commit therefore checks for this condition, reporting the end of
the stall only if one of the grace-period counters has actually advanced.
Otherwise, it reports the last time that the grace-period kthread made
meaningful progress.  (In normal situations, the grace-period kthread
should make meaningful progress at least every jiffies_till_next_fqs
jiffies.)

Reported-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 4f8e33952b88..8085bc133791 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -187,6 +187,11 @@ o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the
 	behavior, you might need to replace some of the cond_resched()
 	calls with calls to cond_resched_rcu_qs().
 
+o	Anything that prevents RCU's grace-period kthreads from running.
+	This can result in the "All QSes seen" console-log message.
+	This message will include information on when the kthread last
+	ran and how often it should be expected to run.
+
 o	A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
 	happen to preempt a low-priority task in the middle of an RCU
 	read-side critical section.   This is especially damaging if
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 7497dfe6ff3f..6a43836b61b0 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1096,7 +1096,7 @@ static void rcu_dump_cpu_stacks(struct rcu_state *rsp)
 	}
 }
 
-static void print_other_cpu_stall(struct rcu_state *rsp)
+static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
 {
 	int cpu;
 	long delta;
@@ -1144,10 +1144,18 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
 	pr_cont("(detected by %d, t=%ld jiffies, g=%ld, c=%ld, q=%lu)\n",
 	       smp_processor_id(), (long)(jiffies - rsp->gp_start),
 	       (long)rsp->gpnum, (long)rsp->completed, totqlen);
-	if (ndetected == 0)
-		pr_err("INFO: Stall ended before state dump start\n");
-	else
+	if (ndetected) {
 		rcu_dump_cpu_stacks(rsp);
+	} else {
+		if (ACCESS_ONCE(rsp->gpnum) != gpnum ||
+		    ACCESS_ONCE(rsp->completed) == gpnum)
+			pr_err("INFO: Stall ended before state dump start\n");
+		else
+			pr_err("All QSes seen, last %s kthread activity %ld/%ld, jiffies_till_next_fqs=%ld\n",
+			       rsp->name,
+			       ACCESS_ONCE(rsp->gp_activity), jiffies,
+			       jiffies_till_next_fqs);
+	}
 
 	/* Complain about tasks blocking the grace period. */
 
@@ -1247,7 +1255,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 		   ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) {
 
 		/* They had a few time units to dump stack, so complain. */
-		print_other_cpu_stall(rsp);
+		print_other_cpu_stall(rsp, gpnum);
 	}
 }
 
@@ -1643,6 +1651,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
 	struct rcu_data *rdp;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	rcu_bind_gp_kthread();
 	raw_spin_lock_irq(&rnp->lock);
 	smp_mb__after_unlock_lock();
@@ -1703,6 +1712,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
 					    rnp->grphi, rnp->qsmask);
 		raw_spin_unlock_irq(&rnp->lock);
 		cond_resched_rcu_qs();
+		ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	}
 
 	mutex_unlock(&rsp->onoff_mutex);
@@ -1719,6 +1729,7 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 	unsigned long maxj;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	rsp->n_force_qs++;
 	if (fqs_state == RCU_SAVE_DYNTICK) {
 		/* Collect dyntick-idle snapshots. */
@@ -1757,6 +1768,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 	struct rcu_data *rdp;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	raw_spin_lock_irq(&rnp->lock);
 	smp_mb__after_unlock_lock();
 	gp_duration = jiffies - rsp->gp_start;
@@ -1793,6 +1805,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 		nocb += rcu_future_gp_cleanup(rsp, rnp);
 		raw_spin_unlock_irq(&rnp->lock);
 		cond_resched_rcu_qs();
+		ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	}
 	rnp = rcu_get_root(rsp);
 	raw_spin_lock_irq(&rnp->lock);
@@ -1842,6 +1855,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 			if (rcu_gp_init(rsp))
 				break;
 			cond_resched_rcu_qs();
+			ACCESS_ONCE(rsp->gp_activity) = jiffies;
 			WARN_ON(signal_pending(current));
 			trace_rcu_grace_period(rsp->name,
 					       ACCESS_ONCE(rsp->gpnum),
@@ -1885,9 +1899,11 @@ static int __noreturn rcu_gp_kthread(void *arg)
 						       ACCESS_ONCE(rsp->gpnum),
 						       TPS("fqsend"));
 				cond_resched_rcu_qs();
+				ACCESS_ONCE(rsp->gp_activity) = jiffies;
 			} else {
 				/* Deal with stray signal. */
 				cond_resched_rcu_qs();
+				ACCESS_ONCE(rsp->gp_activity) = jiffies;
 				WARN_ON(signal_pending(current));
 				trace_rcu_grace_period(rsp->name,
 						       ACCESS_ONCE(rsp->gpnum),
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 44e9ccfe6570..856518c8dcfa 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -483,6 +483,8 @@ struct rcu_state {
 						/*  due to no GP active. */
 	unsigned long gp_start;			/* Time at which GP started, */
 						/*  but in jiffies. */
+	unsigned long gp_activity;		/* Time of last GP kthread */
+						/*  activity in jiffies. */
 	unsigned long jiffies_stall;		/* Time at which to check */
 						/*  for CPU stalls. */
 	unsigned long jiffies_resched;		/* Time at which to resched */


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-11 19:09       ` Paul E. McKenney
@ 2014-12-12 14:06         ` Miroslav Benes
  2014-12-12 16:58           ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Miroslav Benes @ 2014-12-12 14:06 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Linux Kernel Mailing List

On Thu, 11 Dec 2014, Paul E. McKenney wrote:

> On Thu, Dec 11, 2014 at 08:50:10AM -0800, Paul E. McKenney wrote:
> > On Thu, Dec 11, 2014 at 10:35:15AM +0100, Miroslav Benes wrote:
> > > On Wed, 10 Dec 2014, Paul E. McKenney wrote:
> > > 
> > > > On Wed, Dec 10, 2014 at 01:52:02PM +0100, Miroslav Benes wrote:
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > today I came across RCU stall which was correctly detected, but there is 
> > > > > no state dump. This is a bit suspicious, I think. 
> > > > > 
> > > > > This is the output in serial console:
> > > > > 
> > > > > [  105.727003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > [  105.727003]  (detected by 0, t=21002 jiffies, g=3269, c=3268, q=138)
> > > > > [  105.727003] INFO: Stall ended before state dump start
> > > > > [  168.732006] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > [  168.732006]  (detected by 0, t=84007 jiffies, g=3269, c=3268, q=270)
> > > > > [  168.732006] INFO: Stall ended before state dump start
> > > > > [  231.737003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > [  231.737003]  (detected by 0, t=147012 jiffies, g=3269, c=3268, q=388)
> > > > > [  231.737003] INFO: Stall ended before state dump start
> > > > > [  294.742003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > [  294.742003]  (detected by 0, t=210017 jiffies, g=3269, c=3268, q=539)
> > > > > [  294.742003] INFO: Stall ended before state dump start
> > > > > [  357.747003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > [  357.747003]  (detected by 0, t=273022 jiffies, g=3269, c=3268, q=693)
> > > > > [  357.747003] INFO: Stall ended before state dump start
> > > > > [  420.752003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > [  420.752003]  (detected by 0, t=336027 jiffies, g=3269, c=3268, q=806)
> > > > > [  420.752003] INFO: Stall ended before state dump start
> > > > > ...
> > > > > 
> > > > > It can be reproduced by trivial code attached to this mail (infinite 
> > > > > loop in kernel thread created in kernel module). I have CONFIG_PREEMPT=n. 
> > > > > The kernel thread is scheduled on the same CPU which causes soft lockup 
> > > > > (reliably detected when lockup detector is on). There is certainly RCU 
> > > > > stall, but I would expect a state dump. Is this an expected behaviour? 
> > > > > Maybe I overlooked some config option, don't know.
> > > > 
> > > > Definitely not expected behavior!  Unless you have only one CPU, but in
> > > > that case you should be running tiny RCU, not tree RCU.
> > > 
> > > So indeed I messed up my configs somehow and run the code on uniprocessor 
> > > with SMP=y and tree RCU. With more processors RCU stall is detected and 
> > > correct state is dumped. On uniprocessor with SMP=n and tiny RCU 
> > > softlockup is detected, but no RCU stall in the log (is this correct?). So 
> > > I'm really sorry for the noise.
> > > 
> > > Anyway I still think that running SMP kernel with tree RCU on 
> > > uniprocessor is possible option (albeit suboptimal and maybe improbable). 
> > > Should I proceed with your patch below and bisection or am I mistaken 
> > > completely and we can leave it because there is no problem?
> > 
> > Not a problem, there have been some interesting RCU CPU stall warnings
> > recently, and your data did add some insight.
> > 
> > So the combination SMP=n PREEMPT=y can happen straightforwardly via
> > kbuild.  The combination SMP=n PREEMPT=n can happen (somewhat less)
> > straightforwardly by running an SMP=y PREEMPT=n kernel on a single-CPU
> > system.  In both cases, what can happen is that RCU's grace-period
> > kthreads are starved, which can result in those reports.
> > 
> > And these reports are confusing.  I am considering attempting to improve
> > the diagnostics.  If I do, would you be willing to test the resulting
> > patches?
> 
> Like this one, for example.  ;-)

Ok, with next-20141212, where this patch is included, and SMP=y PREEMPT=n 
I get the following

...
[  206.949003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  206.949003]  (detected by 0, t=21002 jiffies, g=3384, c=3383, q=247)
[  206.949003] All QSes seen, last rcu_sched kthread activity 4294853243/4294874245, jiffies_till_next_fqs=3
[  269.954004] INFO: rcu_sched detected stalls on CPUs/tasks:
[  269.954004]  (detected by 0, t=84007 jiffies, g=3384, c=3383, q=449)
[  269.954004] All QSes seen, last rcu_sched kthread activity 4294853243/4294937250, jiffies_till_next_fqs=3
[  332.959004] INFO: rcu_sched detected stalls on CPUs/tasks:
[  332.959004]  (detected by 0, t=147012 jiffies, g=3384, c=3383, q=651)
[  332.959004] All QSes seen, last rcu_sched kthread activity 4294853243/4295000255, jiffies_till_next_fqs=3
[  395.964003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  395.964003]  (detected by 0, t=210017 jiffies, g=3384, c=3383, q=858)
[  395.964003] All QSes seen, last rcu_sched kthread activity 4294853243/4295063260, jiffies_till_next_fqs=3
...

So it seems to work and it is not confusing anymore. Would it be possible 
to dump the stack also in this case? The stall is there, the condition is 
correctly detected, so I guess it could be possible. It would give the 
information what caused the stall. 

In SMP=n PREEMPT=y case there is no stall with my test. I guess it is 
because module inclusion does not disable preemption. However it is 
possible in other cases as you wrote above.

One last thing. Is there a reason that there is not any similar INFO in 
the console for tiny implementation? I enabled RCU_TRACE and still got 
nothing. I have only shallow knowledge of RCU, though.

I'll test more patches if you have some...

Thanks a lot
Miroslav

> ------------------------------------------------------------------------
> 
> rcu: Improve diagnostics for spurious RCU CPU stall warnings
> 
> The current RCU CPU stall warning code will print "Stall ended before
> state dump start" any time that the stall-warning code is triggered on
> a CPU that has already reported a quiescent state for the current grace
> period and if all quiescent states have been reported for the current
> grace period.  However, a true stall can result in these symptoms, for
> example, by preventing RCU's grace-period kthreads from ever running
> 
> This commit therefore checks for this condition, reporting the end of
> the stall only if one of the grace-period counters has actually advanced.
> Otherwise, it reports the last time that the grace-period kthread made
> meaningful progress.  (In normal situations, the grace-period kthread
> should make meaningful progress at least every jiffies_till_next_fqs
> jiffies.)
> 
> Reported-by: Miroslav Benes <mbenes@suse.cz>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
> index 4f8e33952b88..8085bc133791 100644
> --- a/Documentation/RCU/stallwarn.txt
> +++ b/Documentation/RCU/stallwarn.txt
> @@ -187,6 +187,11 @@ o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the
>  	behavior, you might need to replace some of the cond_resched()
>  	calls with calls to cond_resched_rcu_qs().
>  
> +o	Anything that prevents RCU's grace-period kthreads from running.
> +	This can result in the "All QSes seen" console-log message.
> +	This message will include information on when the kthread last
> +	ran and how often it should be expected to run.
> +
>  o	A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
>  	happen to preempt a low-priority task in the middle of an RCU
>  	read-side critical section.   This is especially damaging if
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 7497dfe6ff3f..6a43836b61b0 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1096,7 +1096,7 @@ static void rcu_dump_cpu_stacks(struct rcu_state *rsp)
>  	}
>  }
>  
> -static void print_other_cpu_stall(struct rcu_state *rsp)
> +static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
>  {
>  	int cpu;
>  	long delta;
> @@ -1144,10 +1144,18 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
>  	pr_cont("(detected by %d, t=%ld jiffies, g=%ld, c=%ld, q=%lu)\n",
>  	       smp_processor_id(), (long)(jiffies - rsp->gp_start),
>  	       (long)rsp->gpnum, (long)rsp->completed, totqlen);
> -	if (ndetected == 0)
> -		pr_err("INFO: Stall ended before state dump start\n");
> -	else
> +	if (ndetected) {
>  		rcu_dump_cpu_stacks(rsp);
> +	} else {
> +		if (ACCESS_ONCE(rsp->gpnum) != gpnum ||
> +		    ACCESS_ONCE(rsp->completed) == gpnum)
> +			pr_err("INFO: Stall ended before state dump start\n");
> +		else
> +			pr_err("All QSes seen, last %s kthread activity %ld/%ld, jiffies_till_next_fqs=%ld\n",
> +			       rsp->name,
> +			       ACCESS_ONCE(rsp->gp_activity), jiffies,
> +			       jiffies_till_next_fqs);
> +	}
>  
>  	/* Complain about tasks blocking the grace period. */
>  
> @@ -1247,7 +1255,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
>  		   ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) {
>  
>  		/* They had a few time units to dump stack, so complain. */
> -		print_other_cpu_stall(rsp);
> +		print_other_cpu_stall(rsp, gpnum);
>  	}
>  }
>  
> @@ -1643,6 +1651,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
>  	struct rcu_data *rdp;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
>  
> +	ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  	rcu_bind_gp_kthread();
>  	raw_spin_lock_irq(&rnp->lock);
>  	smp_mb__after_unlock_lock();
> @@ -1703,6 +1712,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
>  					    rnp->grphi, rnp->qsmask);
>  		raw_spin_unlock_irq(&rnp->lock);
>  		cond_resched_rcu_qs();
> +		ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  	}
>  
>  	mutex_unlock(&rsp->onoff_mutex);
> @@ -1719,6 +1729,7 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
>  	unsigned long maxj;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
>  
> +	ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  	rsp->n_force_qs++;
>  	if (fqs_state == RCU_SAVE_DYNTICK) {
>  		/* Collect dyntick-idle snapshots. */
> @@ -1757,6 +1768,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
>  	struct rcu_data *rdp;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
>  
> +	ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  	raw_spin_lock_irq(&rnp->lock);
>  	smp_mb__after_unlock_lock();
>  	gp_duration = jiffies - rsp->gp_start;
> @@ -1793,6 +1805,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
>  		nocb += rcu_future_gp_cleanup(rsp, rnp);
>  		raw_spin_unlock_irq(&rnp->lock);
>  		cond_resched_rcu_qs();
> +		ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  	}
>  	rnp = rcu_get_root(rsp);
>  	raw_spin_lock_irq(&rnp->lock);
> @@ -1842,6 +1855,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  			if (rcu_gp_init(rsp))
>  				break;
>  			cond_resched_rcu_qs();
> +			ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  			WARN_ON(signal_pending(current));
>  			trace_rcu_grace_period(rsp->name,
>  					       ACCESS_ONCE(rsp->gpnum),
> @@ -1885,9 +1899,11 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  						       ACCESS_ONCE(rsp->gpnum),
>  						       TPS("fqsend"));
>  				cond_resched_rcu_qs();
> +				ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  			} else {
>  				/* Deal with stray signal. */
>  				cond_resched_rcu_qs();
> +				ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  				WARN_ON(signal_pending(current));
>  				trace_rcu_grace_period(rsp->name,
>  						       ACCESS_ONCE(rsp->gpnum),
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 44e9ccfe6570..856518c8dcfa 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -483,6 +483,8 @@ struct rcu_state {
>  						/*  due to no GP active. */
>  	unsigned long gp_start;			/* Time at which GP started, */
>  						/*  but in jiffies. */
> +	unsigned long gp_activity;		/* Time of last GP kthread */
> +						/*  activity in jiffies. */
>  	unsigned long jiffies_stall;		/* Time at which to check */
>  						/*  for CPU stalls. */
>  	unsigned long jiffies_resched;		/* Time at which to resched */
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-12 14:06         ` Miroslav Benes
@ 2014-12-12 16:58           ` Paul E. McKenney
  2014-12-15 13:26             ` Miroslav Benes
  0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2014-12-12 16:58 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: Linux Kernel Mailing List

On Fri, Dec 12, 2014 at 03:06:20PM +0100, Miroslav Benes wrote:
> On Thu, 11 Dec 2014, Paul E. McKenney wrote:
> 
> > On Thu, Dec 11, 2014 at 08:50:10AM -0800, Paul E. McKenney wrote:
> > > On Thu, Dec 11, 2014 at 10:35:15AM +0100, Miroslav Benes wrote:
> > > > On Wed, 10 Dec 2014, Paul E. McKenney wrote:
> > > > 
> > > > > On Wed, Dec 10, 2014 at 01:52:02PM +0100, Miroslav Benes wrote:
> > > > > > 
> > > > > > Hi,
> > > > > > 
> > > > > > today I came across RCU stall which was correctly detected, but there is 
> > > > > > no state dump. This is a bit suspicious, I think. 
> > > > > > 
> > > > > > This is the output in serial console:
> > > > > > 
> > > > > > [  105.727003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > [  105.727003]  (detected by 0, t=21002 jiffies, g=3269, c=3268, q=138)
> > > > > > [  105.727003] INFO: Stall ended before state dump start
> > > > > > [  168.732006] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > [  168.732006]  (detected by 0, t=84007 jiffies, g=3269, c=3268, q=270)
> > > > > > [  168.732006] INFO: Stall ended before state dump start
> > > > > > [  231.737003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > [  231.737003]  (detected by 0, t=147012 jiffies, g=3269, c=3268, q=388)
> > > > > > [  231.737003] INFO: Stall ended before state dump start
> > > > > > [  294.742003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > [  294.742003]  (detected by 0, t=210017 jiffies, g=3269, c=3268, q=539)
> > > > > > [  294.742003] INFO: Stall ended before state dump start
> > > > > > [  357.747003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > [  357.747003]  (detected by 0, t=273022 jiffies, g=3269, c=3268, q=693)
> > > > > > [  357.747003] INFO: Stall ended before state dump start
> > > > > > [  420.752003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > [  420.752003]  (detected by 0, t=336027 jiffies, g=3269, c=3268, q=806)
> > > > > > [  420.752003] INFO: Stall ended before state dump start
> > > > > > ...
> > > > > > 
> > > > > > It can be reproduced by trivial code attached to this mail (infinite 
> > > > > > loop in kernel thread created in kernel module). I have CONFIG_PREEMPT=n. 
> > > > > > The kernel thread is scheduled on the same CPU which causes soft lockup 
> > > > > > (reliably detected when lockup detector is on). There is certainly RCU 
> > > > > > stall, but I would expect a state dump. Is this an expected behaviour? 
> > > > > > Maybe I overlooked some config option, don't know.
> > > > > 
> > > > > Definitely not expected behavior!  Unless you have only one CPU, but in
> > > > > that case you should be running tiny RCU, not tree RCU.
> > > > 
> > > > So indeed I messed up my configs somehow and run the code on uniprocessor 
> > > > with SMP=y and tree RCU. With more processors RCU stall is detected and 
> > > > correct state is dumped. On uniprocessor with SMP=n and tiny RCU 
> > > > softlockup is detected, but no RCU stall in the log (is this correct?). So 
> > > > I'm really sorry for the noise.
> > > > 
> > > > Anyway I still think that running SMP kernel with tree RCU on 
> > > > uniprocessor is possible option (albeit suboptimal and maybe improbable). 
> > > > Should I proceed with your patch below and bisection or am I mistaken 
> > > > completely and we can leave it because there is no problem?
> > > 
> > > Not a problem, there have been some interesting RCU CPU stall warnings
> > > recently, and your data did add some insight.
> > > 
> > > So the combination SMP=n PREEMPT=y can happen straightforwardly via
> > > kbuild.  The combination SMP=n PREEMPT=n can happen (somewhat less)
> > > straightforwardly by running an SMP=y PREEMPT=n kernel on a single-CPU
> > > system.  In both cases, what can happen is that RCU's grace-period
> > > kthreads are starved, which can result in those reports.
> > > 
> > > And these reports are confusing.  I am considering attempting to improve
> > > the diagnostics.  If I do, would you be willing to test the resulting
> > > patches?
> > 
> > Like this one, for example.  ;-)
> 
> Ok, with next-20141212, where this patch is included, and SMP=y PREEMPT=n 
> I get the following
> 
> ...
> [  206.949003] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  206.949003]  (detected by 0, t=21002 jiffies, g=3384, c=3383, q=247)
> [  206.949003] All QSes seen, last rcu_sched kthread activity 4294853243/4294874245, jiffies_till_next_fqs=3
> [  269.954004] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  269.954004]  (detected by 0, t=84007 jiffies, g=3384, c=3383, q=449)
> [  269.954004] All QSes seen, last rcu_sched kthread activity 4294853243/4294937250, jiffies_till_next_fqs=3
> [  332.959004] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  332.959004]  (detected by 0, t=147012 jiffies, g=3384, c=3383, q=651)
> [  332.959004] All QSes seen, last rcu_sched kthread activity 4294853243/4295000255, jiffies_till_next_fqs=3
> [  395.964003] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  395.964003]  (detected by 0, t=210017 jiffies, g=3384, c=3383, q=858)
> [  395.964003] All QSes seen, last rcu_sched kthread activity 4294853243/4295063260, jiffies_till_next_fqs=3
> ...
> 
> So it seems to work and it is not confusing anymore. Would it be possible 
> to dump the stack also in this case? The stall is there, the condition is 
> correctly detected, so I guess it could be possible. It would give the 
> information what caused the stall. 

Ah, good point!  In this case, the stall is self-detected, but RCU
incorrectly includes that it is some other CPU's fault because RCU is
not waiting on the current CPU.  Please see below for updated patch.

> In SMP=n PREEMPT=y case there is no stall with my test. I guess it is 
> because module inclusion does not disable preemption. However it is 
> possible in other cases as you wrote above.

OK, that matches expected behavior.

> One last thing. Is there a reason that there is not any similar INFO in 
> the console for tiny implementation? I enabled RCU_TRACE and still got 
> nothing. I have only shallow knowledge of RCU, though.

It is supposed to work given CONFIG_RCU_TRACE=y, and it looks like
it should, at least as long as scheduling-clock interrupts are being
delivered.  One thing to try for diagnosis is to have every (say) 1000th
call to check_cpu_stall() (in kernel/rcu/tiny_plugin.h) do a printk()
and same for rcu_check_callbacks() (in kernel/rcu/tiny.c).

> I'll test more patches if you have some...

Please see below for update that should print current CPU's stack for
the "All QSes seen" case.  This replaces the earlier patch.

							Thanx, Paul

> Thanks a lot
> Miroslav

------------------------------------------------------------------------

rcu: Improve diagnostics for spurious RCU CPU stall warnings

The current RCU CPU stall warning code will print "Stall ended before
state dump start" any time that the stall-warning code is triggered on
a CPU that has already reported a quiescent state for the current grace
period and if all quiescent states have been reported for the current
grace period.  However, a true stall can result in these symptoms, for
example, by preventing RCU's grace-period kthreads from ever running

This commit therefore checks for this condition, reporting the end of
the stall only if one of the grace-period counters has actually advanced.
Otherwise, it reports the last time that the grace-period kthread made
meaningful progress.  (In normal situations, the grace-period kthread
should make meaningful progress at least every jiffies_till_next_fqs
jiffies.)

Reported-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 4f8e33952b88..8085bc133791 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -187,6 +187,11 @@ o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the
 	behavior, you might need to replace some of the cond_resched()
 	calls with calls to cond_resched_rcu_qs().
 
+o	Anything that prevents RCU's grace-period kthreads from running.
+	This can result in the "All QSes seen" console-log message.
+	This message will include information on when the kthread last
+	ran and how often it should be expected to run.
+
 o	A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
 	happen to preempt a low-priority task in the middle of an RCU
 	read-side critical section.   This is especially damaging if
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 7497dfe6ff3f..614dfaa94dd2 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1096,11 +1096,13 @@ static void rcu_dump_cpu_stacks(struct rcu_state *rsp)
 	}
 }
 
-static void print_other_cpu_stall(struct rcu_state *rsp)
+static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
 {
 	int cpu;
 	long delta;
 	unsigned long flags;
+	unsigned long gpa;
+	unsigned long j;
 	int ndetected = 0;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 	long totqlen = 0;
@@ -1144,10 +1146,22 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
 	pr_cont("(detected by %d, t=%ld jiffies, g=%ld, c=%ld, q=%lu)\n",
 	       smp_processor_id(), (long)(jiffies - rsp->gp_start),
 	       (long)rsp->gpnum, (long)rsp->completed, totqlen);
-	if (ndetected == 0)
-		pr_err("INFO: Stall ended before state dump start\n");
-	else
+	if (ndetected) {
 		rcu_dump_cpu_stacks(rsp);
+	} else {
+		if (ACCESS_ONCE(rsp->gpnum) != gpnum ||
+		    ACCESS_ONCE(rsp->completed) == gpnum) {
+			pr_err("INFO: Stall ended before state dump start\n");
+		} else {
+			j = jiffies;
+			gpa = ACCESS_ONCE(rsp->gp_activity);
+			pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld\n",
+			       rsp->name, j - gpa, j, gpa,
+			       jiffies_till_next_fqs);
+			/* In this case, the current CPU might be at fault. */
+			sched_show_task(current);
+		}
+	}
 
 	/* Complain about tasks blocking the grace period. */
 
@@ -1247,7 +1261,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 		   ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) {
 
 		/* They had a few time units to dump stack, so complain. */
-		print_other_cpu_stall(rsp);
+		print_other_cpu_stall(rsp, gpnum);
 	}
 }
 
@@ -1643,6 +1657,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
 	struct rcu_data *rdp;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	rcu_bind_gp_kthread();
 	raw_spin_lock_irq(&rnp->lock);
 	smp_mb__after_unlock_lock();
@@ -1703,6 +1718,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
 					    rnp->grphi, rnp->qsmask);
 		raw_spin_unlock_irq(&rnp->lock);
 		cond_resched_rcu_qs();
+		ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	}
 
 	mutex_unlock(&rsp->onoff_mutex);
@@ -1719,6 +1735,7 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 	unsigned long maxj;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	rsp->n_force_qs++;
 	if (fqs_state == RCU_SAVE_DYNTICK) {
 		/* Collect dyntick-idle snapshots. */
@@ -1757,6 +1774,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 	struct rcu_data *rdp;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
+	ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	raw_spin_lock_irq(&rnp->lock);
 	smp_mb__after_unlock_lock();
 	gp_duration = jiffies - rsp->gp_start;
@@ -1793,6 +1811,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 		nocb += rcu_future_gp_cleanup(rsp, rnp);
 		raw_spin_unlock_irq(&rnp->lock);
 		cond_resched_rcu_qs();
+		ACCESS_ONCE(rsp->gp_activity) = jiffies;
 	}
 	rnp = rcu_get_root(rsp);
 	raw_spin_lock_irq(&rnp->lock);
@@ -1842,6 +1861,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 			if (rcu_gp_init(rsp))
 				break;
 			cond_resched_rcu_qs();
+			ACCESS_ONCE(rsp->gp_activity) = jiffies;
 			WARN_ON(signal_pending(current));
 			trace_rcu_grace_period(rsp->name,
 					       ACCESS_ONCE(rsp->gpnum),
@@ -1885,9 +1905,11 @@ static int __noreturn rcu_gp_kthread(void *arg)
 						       ACCESS_ONCE(rsp->gpnum),
 						       TPS("fqsend"));
 				cond_resched_rcu_qs();
+				ACCESS_ONCE(rsp->gp_activity) = jiffies;
 			} else {
 				/* Deal with stray signal. */
 				cond_resched_rcu_qs();
+				ACCESS_ONCE(rsp->gp_activity) = jiffies;
 				WARN_ON(signal_pending(current));
 				trace_rcu_grace_period(rsp->name,
 						       ACCESS_ONCE(rsp->gpnum),
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 44e9ccfe6570..856518c8dcfa 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -483,6 +483,8 @@ struct rcu_state {
 						/*  due to no GP active. */
 	unsigned long gp_start;			/* Time at which GP started, */
 						/*  but in jiffies. */
+	unsigned long gp_activity;		/* Time of last GP kthread */
+						/*  activity in jiffies. */
 	unsigned long jiffies_stall;		/* Time at which to check */
 						/*  for CPU stalls. */
 	unsigned long jiffies_resched;		/* Time at which to resched */


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-12 16:58           ` Paul E. McKenney
@ 2014-12-15 13:26             ` Miroslav Benes
  2014-12-16 18:42               ` Paul E. McKenney
  2014-12-19 11:09               ` Miroslav Benes
  0 siblings, 2 replies; 14+ messages in thread
From: Miroslav Benes @ 2014-12-15 13:26 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Linux Kernel Mailing List

On Fri, 12 Dec 2014, Paul E. McKenney wrote:

> On Fri, Dec 12, 2014 at 03:06:20PM +0100, Miroslav Benes wrote:
> > On Thu, 11 Dec 2014, Paul E. McKenney wrote:
> > 
> > > On Thu, Dec 11, 2014 at 08:50:10AM -0800, Paul E. McKenney wrote:
> > > > On Thu, Dec 11, 2014 at 10:35:15AM +0100, Miroslav Benes wrote:
> > > > > On Wed, 10 Dec 2014, Paul E. McKenney wrote:
> > > > > 
> > > > > > On Wed, Dec 10, 2014 at 01:52:02PM +0100, Miroslav Benes wrote:
> > > > > > > 
> > > > > > > Hi,
> > > > > > > 
> > > > > > > today I came across RCU stall which was correctly detected, but there is 
> > > > > > > no state dump. This is a bit suspicious, I think. 
> > > > > > > 
> > > > > > > This is the output in serial console:
> > > > > > > 
> > > > > > > [  105.727003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > [  105.727003]  (detected by 0, t=21002 jiffies, g=3269, c=3268, q=138)
> > > > > > > [  105.727003] INFO: Stall ended before state dump start
> > > > > > > [  168.732006] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > [  168.732006]  (detected by 0, t=84007 jiffies, g=3269, c=3268, q=270)
> > > > > > > [  168.732006] INFO: Stall ended before state dump start
> > > > > > > [  231.737003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > [  231.737003]  (detected by 0, t=147012 jiffies, g=3269, c=3268, q=388)
> > > > > > > [  231.737003] INFO: Stall ended before state dump start
> > > > > > > [  294.742003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > [  294.742003]  (detected by 0, t=210017 jiffies, g=3269, c=3268, q=539)
> > > > > > > [  294.742003] INFO: Stall ended before state dump start
> > > > > > > [  357.747003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > [  357.747003]  (detected by 0, t=273022 jiffies, g=3269, c=3268, q=693)
> > > > > > > [  357.747003] INFO: Stall ended before state dump start
> > > > > > > [  420.752003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > [  420.752003]  (detected by 0, t=336027 jiffies, g=3269, c=3268, q=806)
> > > > > > > [  420.752003] INFO: Stall ended before state dump start
> > > > > > > ...
> > > > > > > 
> > > > > > > It can be reproduced by trivial code attached to this mail (infinite 
> > > > > > > loop in kernel thread created in kernel module). I have CONFIG_PREEMPT=n. 
> > > > > > > The kernel thread is scheduled on the same CPU which causes soft lockup 
> > > > > > > (reliably detected when lockup detector is on). There is certainly RCU 
> > > > > > > stall, but I would expect a state dump. Is this an expected behaviour? 
> > > > > > > Maybe I overlooked some config option, don't know.
> > > > > > 
> > > > > > Definitely not expected behavior!  Unless you have only one CPU, but in
> > > > > > that case you should be running tiny RCU, not tree RCU.
> > > > > 
> > > > > So indeed I messed up my configs somehow and run the code on uniprocessor 
> > > > > with SMP=y and tree RCU. With more processors RCU stall is detected and 
> > > > > correct state is dumped. On uniprocessor with SMP=n and tiny RCU 
> > > > > softlockup is detected, but no RCU stall in the log (is this correct?). So 
> > > > > I'm really sorry for the noise.
> > > > > 
> > > > > Anyway I still think that running SMP kernel with tree RCU on 
> > > > > uniprocessor is possible option (albeit suboptimal and maybe improbable). 
> > > > > Should I proceed with your patch below and bisection or am I mistaken 
> > > > > completely and we can leave it because there is no problem?
> > > > 
> > > > Not a problem, there have been some interesting RCU CPU stall warnings
> > > > recently, and your data did add some insight.
> > > > 
> > > > So the combination SMP=n PREEMPT=y can happen straightforwardly via
> > > > kbuild.  The combination SMP=n PREEMPT=n can happen (somewhat less)
> > > > straightforwardly by running an SMP=y PREEMPT=n kernel on a single-CPU
> > > > system.  In both cases, what can happen is that RCU's grace-period
> > > > kthreads are starved, which can result in those reports.
> > > > 
> > > > And these reports are confusing.  I am considering attempting to improve
> > > > the diagnostics.  If I do, would you be willing to test the resulting
> > > > patches?
> > > 
> > > Like this one, for example.  ;-)
> > 
> > Ok, with next-20141212, where this patch is included, and SMP=y PREEMPT=n 
> > I get the following
> > 
> > ...
> > [  206.949003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  206.949003]  (detected by 0, t=21002 jiffies, g=3384, c=3383, q=247)
> > [  206.949003] All QSes seen, last rcu_sched kthread activity 4294853243/4294874245, jiffies_till_next_fqs=3
> > [  269.954004] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  269.954004]  (detected by 0, t=84007 jiffies, g=3384, c=3383, q=449)
> > [  269.954004] All QSes seen, last rcu_sched kthread activity 4294853243/4294937250, jiffies_till_next_fqs=3
> > [  332.959004] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  332.959004]  (detected by 0, t=147012 jiffies, g=3384, c=3383, q=651)
> > [  332.959004] All QSes seen, last rcu_sched kthread activity 4294853243/4295000255, jiffies_till_next_fqs=3
> > [  395.964003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  395.964003]  (detected by 0, t=210017 jiffies, g=3384, c=3383, q=858)
> > [  395.964003] All QSes seen, last rcu_sched kthread activity 4294853243/4295063260, jiffies_till_next_fqs=3
> > ...
> > 
> > So it seems to work and it is not confusing anymore. Would it be possible 
> > to dump the stack also in this case? The stall is there, the condition is 
> > correctly detected, so I guess it could be possible. It would give the 
> > information what caused the stall. 
> 
> Ah, good point!  In this case, the stall is self-detected, but RCU
> incorrectly includes that it is some other CPU's fault because RCU is
> not waiting on the current CPU.  Please see below for updated patch.
> 
> > In SMP=n PREEMPT=y case there is no stall with my test. I guess it is 
> > because module inclusion does not disable preemption. However it is 
> > possible in other cases as you wrote above.
> 
> OK, that matches expected behavior.
> 
> > One last thing. Is there a reason that there is not any similar INFO in 
> > the console for tiny implementation? I enabled RCU_TRACE and still got 
> > nothing. I have only shallow knowledge of RCU, though.
> 
> It is supposed to work given CONFIG_RCU_TRACE=y, and it looks like
> it should, at least as long as scheduling-clock interrupts are being
> delivered.  One thing to try for diagnosis is to have every (say) 1000th
> call to check_cpu_stall() (in kernel/rcu/tiny_plugin.h) do a printk()
> and same for rcu_check_callbacks() (in kernel/rcu/tiny.c).

Unfortunately nothing is there. I'll look into it when I have more time 
later this week and let you know.

> 
> > I'll test more patches if you have some...
> 
> Please see below for update that should print current CPU's stack for
> the "All QSes seen" case.  This replaces the earlier patch.

This gives

...
[  149.320003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  149.320003]  (detected by 0, t=21002 jiffies, g=2823, c=2822, q=209)
[  149.320003] All QSes seen, last rcu_sched kthread activity 21002 (4294816616-4294795614), jiffies_till_next_fqs=3
[  149.320003] test_thread     R  running task    14832  4015      2 0x00000008
[  149.320003]  00000000000039f0 ffff88013fc03da8 ffffffff8107874c ffffffff810786b2
[  149.320003]  ffffffff81e391c0 000000000000520a ffff88013fdcc4a0 ffffffff81e38cc0
[  149.320003]  ffffffff81e38cc0 ffff88013fc03e28 ffffffff810b1049 0000000000000002
[  149.320003] Call Trace:
[  149.320003]  <IRQ>  [<ffffffff8107874c>] sched_show_task+0x11c/0x190
[  149.320003]  [<ffffffff810786b2>] ? sched_show_task+0x82/0x190
[  149.320003]  [<ffffffff810b1049>] rcu_check_callbacks+0x839/0x850
[  149.320003]  [<ffffffff810b627b>] update_process_times+0x4b/0x80
[  149.320003]  [<ffffffff810c6865>] ? tick_sched_do_timer+0x35/0x40
[  149.320003]  [<ffffffff810c60d6>] tick_sched_handle.isra.19+0x36/0x50
[  149.320003]  [<ffffffff810c68b5>] tick_sched_timer+0x45/0x80
[  149.320003]  [<ffffffff810b7022>] __run_hrtimer+0xb2/0x250
[  149.320003]  [<ffffffff810b73cc>] ? hrtimer_interrupt+0x7c/0x250
[  149.320003]  [<ffffffff810c6870>] ? tick_sched_do_timer+0x40/0x40
[  149.320003]  [<ffffffff810b7447>] hrtimer_interrupt+0xf7/0x250
[  149.320003]  [<ffffffffa0000000>] ? 0xffffffffa0000000
[  149.320003]  [<ffffffff810356eb>] local_apic_timer_interrupt+0x3b/0x70
[  149.320003]  [<ffffffff818a9845>] smp_apic_timer_interrupt+0x45/0x60
[  149.320003]  [<ffffffff818a7c2f>] apic_timer_interrupt+0x6f/0x80
[  149.320003]  <EOI>  [<ffffffff818a7960>] ? retint_restore_args+0xe/0xe
[  149.320003]  [<ffffffff8106d004>] ? kthread_should_stop+0x24/0x30
[  149.320003]  [<ffffffffa0000015>] ? test_thread_fn+0x15/0x20 [kthread_mod]
[  149.320003]  [<ffffffff8106d460>] kthread+0xf0/0x110
[  149.320003]  [<ffffffff8106d370>] ? kthread_create_on_node+0x240/0x240
[  149.320003]  [<ffffffff818a6cec>] ret_from_fork+0x7c/0xb0
[  149.320003]  [<ffffffff8106d370>] ? kthread_create_on_node+0x240/0x240
...

So it works as expected which is great. You can add my Tested-by if you 
feel it is worth it.

Thanks
Miroslav


> ------------------------------------------------------------------------
> 
> rcu: Improve diagnostics for spurious RCU CPU stall warnings
> 
> The current RCU CPU stall warning code will print "Stall ended before
> state dump start" any time that the stall-warning code is triggered on
> a CPU that has already reported a quiescent state for the current grace
> period and if all quiescent states have been reported for the current
> grace period.  However, a true stall can result in these symptoms, for
> example, by preventing RCU's grace-period kthreads from ever running
> 
> This commit therefore checks for this condition, reporting the end of
> the stall only if one of the grace-period counters has actually advanced.
> Otherwise, it reports the last time that the grace-period kthread made
> meaningful progress.  (In normal situations, the grace-period kthread
> should make meaningful progress at least every jiffies_till_next_fqs
> jiffies.)
> 
> Reported-by: Miroslav Benes <mbenes@suse.cz>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
> index 4f8e33952b88..8085bc133791 100644
> --- a/Documentation/RCU/stallwarn.txt
> +++ b/Documentation/RCU/stallwarn.txt
> @@ -187,6 +187,11 @@ o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the
>  	behavior, you might need to replace some of the cond_resched()
>  	calls with calls to cond_resched_rcu_qs().
>  
> +o	Anything that prevents RCU's grace-period kthreads from running.
> +	This can result in the "All QSes seen" console-log message.
> +	This message will include information on when the kthread last
> +	ran and how often it should be expected to run.
> +
>  o	A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
>  	happen to preempt a low-priority task in the middle of an RCU
>  	read-side critical section.   This is especially damaging if
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 7497dfe6ff3f..614dfaa94dd2 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1096,11 +1096,13 @@ static void rcu_dump_cpu_stacks(struct rcu_state *rsp)
>  	}
>  }
>  
> -static void print_other_cpu_stall(struct rcu_state *rsp)
> +static void print_other_cpu_stall(struct rcu_state *rsp, unsigned long gpnum)
>  {
>  	int cpu;
>  	long delta;
>  	unsigned long flags;
> +	unsigned long gpa;
> +	unsigned long j;
>  	int ndetected = 0;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
>  	long totqlen = 0;
> @@ -1144,10 +1146,22 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
>  	pr_cont("(detected by %d, t=%ld jiffies, g=%ld, c=%ld, q=%lu)\n",
>  	       smp_processor_id(), (long)(jiffies - rsp->gp_start),
>  	       (long)rsp->gpnum, (long)rsp->completed, totqlen);
> -	if (ndetected == 0)
> -		pr_err("INFO: Stall ended before state dump start\n");
> -	else
> +	if (ndetected) {
>  		rcu_dump_cpu_stacks(rsp);
> +	} else {
> +		if (ACCESS_ONCE(rsp->gpnum) != gpnum ||
> +		    ACCESS_ONCE(rsp->completed) == gpnum) {
> +			pr_err("INFO: Stall ended before state dump start\n");
> +		} else {
> +			j = jiffies;
> +			gpa = ACCESS_ONCE(rsp->gp_activity);
> +			pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld\n",
> +			       rsp->name, j - gpa, j, gpa,
> +			       jiffies_till_next_fqs);
> +			/* In this case, the current CPU might be at fault. */
> +			sched_show_task(current);
> +		}
> +	}
>  
>  	/* Complain about tasks blocking the grace period. */
>  
> @@ -1247,7 +1261,7 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
>  		   ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY)) {
>  
>  		/* They had a few time units to dump stack, so complain. */
> -		print_other_cpu_stall(rsp);
> +		print_other_cpu_stall(rsp, gpnum);
>  	}
>  }
>  
> @@ -1643,6 +1657,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
>  	struct rcu_data *rdp;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
>  
> +	ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  	rcu_bind_gp_kthread();
>  	raw_spin_lock_irq(&rnp->lock);
>  	smp_mb__after_unlock_lock();
> @@ -1703,6 +1718,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
>  					    rnp->grphi, rnp->qsmask);
>  		raw_spin_unlock_irq(&rnp->lock);
>  		cond_resched_rcu_qs();
> +		ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  	}
>  
>  	mutex_unlock(&rsp->onoff_mutex);
> @@ -1719,6 +1735,7 @@ static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
>  	unsigned long maxj;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
>  
> +	ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  	rsp->n_force_qs++;
>  	if (fqs_state == RCU_SAVE_DYNTICK) {
>  		/* Collect dyntick-idle snapshots. */
> @@ -1757,6 +1774,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
>  	struct rcu_data *rdp;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
>  
> +	ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  	raw_spin_lock_irq(&rnp->lock);
>  	smp_mb__after_unlock_lock();
>  	gp_duration = jiffies - rsp->gp_start;
> @@ -1793,6 +1811,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
>  		nocb += rcu_future_gp_cleanup(rsp, rnp);
>  		raw_spin_unlock_irq(&rnp->lock);
>  		cond_resched_rcu_qs();
> +		ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  	}
>  	rnp = rcu_get_root(rsp);
>  	raw_spin_lock_irq(&rnp->lock);
> @@ -1842,6 +1861,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  			if (rcu_gp_init(rsp))
>  				break;
>  			cond_resched_rcu_qs();
> +			ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  			WARN_ON(signal_pending(current));
>  			trace_rcu_grace_period(rsp->name,
>  					       ACCESS_ONCE(rsp->gpnum),
> @@ -1885,9 +1905,11 @@ static int __noreturn rcu_gp_kthread(void *arg)
>  						       ACCESS_ONCE(rsp->gpnum),
>  						       TPS("fqsend"));
>  				cond_resched_rcu_qs();
> +				ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  			} else {
>  				/* Deal with stray signal. */
>  				cond_resched_rcu_qs();
> +				ACCESS_ONCE(rsp->gp_activity) = jiffies;
>  				WARN_ON(signal_pending(current));
>  				trace_rcu_grace_period(rsp->name,
>  						       ACCESS_ONCE(rsp->gpnum),
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 44e9ccfe6570..856518c8dcfa 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -483,6 +483,8 @@ struct rcu_state {
>  						/*  due to no GP active. */
>  	unsigned long gp_start;			/* Time at which GP started, */
>  						/*  but in jiffies. */
> +	unsigned long gp_activity;		/* Time of last GP kthread */
> +						/*  activity in jiffies. */
>  	unsigned long jiffies_stall;		/* Time at which to check */
>  						/*  for CPU stalls. */
>  	unsigned long jiffies_resched;		/* Time at which to resched */
> 

--
Miroslav Benes
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-15 13:26             ` Miroslav Benes
@ 2014-12-16 18:42               ` Paul E. McKenney
  2014-12-19 11:09               ` Miroslav Benes
  1 sibling, 0 replies; 14+ messages in thread
From: Paul E. McKenney @ 2014-12-16 18:42 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: Linux Kernel Mailing List

On Mon, Dec 15, 2014 at 02:26:30PM +0100, Miroslav Benes wrote:
> On Fri, 12 Dec 2014, Paul E. McKenney wrote:
> 
> > On Fri, Dec 12, 2014 at 03:06:20PM +0100, Miroslav Benes wrote:
> > > On Thu, 11 Dec 2014, Paul E. McKenney wrote:
> > > 
> > > > On Thu, Dec 11, 2014 at 08:50:10AM -0800, Paul E. McKenney wrote:
> > > > > On Thu, Dec 11, 2014 at 10:35:15AM +0100, Miroslav Benes wrote:
> > > > > > On Wed, 10 Dec 2014, Paul E. McKenney wrote:
> > > > > > 
> > > > > > > On Wed, Dec 10, 2014 at 01:52:02PM +0100, Miroslav Benes wrote:
> > > > > > > > 
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > today I came across RCU stall which was correctly detected, but there is 
> > > > > > > > no state dump. This is a bit suspicious, I think. 
> > > > > > > > 
> > > > > > > > This is the output in serial console:
> > > > > > > > 
> > > > > > > > [  105.727003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > > [  105.727003]  (detected by 0, t=21002 jiffies, g=3269, c=3268, q=138)
> > > > > > > > [  105.727003] INFO: Stall ended before state dump start
> > > > > > > > [  168.732006] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > > [  168.732006]  (detected by 0, t=84007 jiffies, g=3269, c=3268, q=270)
> > > > > > > > [  168.732006] INFO: Stall ended before state dump start
> > > > > > > > [  231.737003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > > [  231.737003]  (detected by 0, t=147012 jiffies, g=3269, c=3268, q=388)
> > > > > > > > [  231.737003] INFO: Stall ended before state dump start
> > > > > > > > [  294.742003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > > [  294.742003]  (detected by 0, t=210017 jiffies, g=3269, c=3268, q=539)
> > > > > > > > [  294.742003] INFO: Stall ended before state dump start
> > > > > > > > [  357.747003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > > [  357.747003]  (detected by 0, t=273022 jiffies, g=3269, c=3268, q=693)
> > > > > > > > [  357.747003] INFO: Stall ended before state dump start
> > > > > > > > [  420.752003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > > > > > > [  420.752003]  (detected by 0, t=336027 jiffies, g=3269, c=3268, q=806)
> > > > > > > > [  420.752003] INFO: Stall ended before state dump start
> > > > > > > > ...
> > > > > > > > 
> > > > > > > > It can be reproduced by trivial code attached to this mail (infinite 
> > > > > > > > loop in kernel thread created in kernel module). I have CONFIG_PREEMPT=n. 
> > > > > > > > The kernel thread is scheduled on the same CPU which causes soft lockup 
> > > > > > > > (reliably detected when lockup detector is on). There is certainly RCU 
> > > > > > > > stall, but I would expect a state dump. Is this an expected behaviour? 
> > > > > > > > Maybe I overlooked some config option, don't know.
> > > > > > > 
> > > > > > > Definitely not expected behavior!  Unless you have only one CPU, but in
> > > > > > > that case you should be running tiny RCU, not tree RCU.
> > > > > > 
> > > > > > So indeed I messed up my configs somehow and run the code on uniprocessor 
> > > > > > with SMP=y and tree RCU. With more processors RCU stall is detected and 
> > > > > > correct state is dumped. On uniprocessor with SMP=n and tiny RCU 
> > > > > > softlockup is detected, but no RCU stall in the log (is this correct?). So 
> > > > > > I'm really sorry for the noise.
> > > > > > 
> > > > > > Anyway I still think that running SMP kernel with tree RCU on 
> > > > > > uniprocessor is possible option (albeit suboptimal and maybe improbable). 
> > > > > > Should I proceed with your patch below and bisection or am I mistaken 
> > > > > > completely and we can leave it because there is no problem?
> > > > > 
> > > > > Not a problem, there have been some interesting RCU CPU stall warnings
> > > > > recently, and your data did add some insight.
> > > > > 
> > > > > So the combination SMP=n PREEMPT=y can happen straightforwardly via
> > > > > kbuild.  The combination SMP=n PREEMPT=n can happen (somewhat less)
> > > > > straightforwardly by running an SMP=y PREEMPT=n kernel on a single-CPU
> > > > > system.  In both cases, what can happen is that RCU's grace-period
> > > > > kthreads are starved, which can result in those reports.
> > > > > 
> > > > > And these reports are confusing.  I am considering attempting to improve
> > > > > the diagnostics.  If I do, would you be willing to test the resulting
> > > > > patches?
> > > > 
> > > > Like this one, for example.  ;-)
> > > 
> > > Ok, with next-20141212, where this patch is included, and SMP=y PREEMPT=n 
> > > I get the following
> > > 
> > > ...
> > > [  206.949003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > [  206.949003]  (detected by 0, t=21002 jiffies, g=3384, c=3383, q=247)
> > > [  206.949003] All QSes seen, last rcu_sched kthread activity 4294853243/4294874245, jiffies_till_next_fqs=3
> > > [  269.954004] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > [  269.954004]  (detected by 0, t=84007 jiffies, g=3384, c=3383, q=449)
> > > [  269.954004] All QSes seen, last rcu_sched kthread activity 4294853243/4294937250, jiffies_till_next_fqs=3
> > > [  332.959004] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > [  332.959004]  (detected by 0, t=147012 jiffies, g=3384, c=3383, q=651)
> > > [  332.959004] All QSes seen, last rcu_sched kthread activity 4294853243/4295000255, jiffies_till_next_fqs=3
> > > [  395.964003] INFO: rcu_sched detected stalls on CPUs/tasks:
> > > [  395.964003]  (detected by 0, t=210017 jiffies, g=3384, c=3383, q=858)
> > > [  395.964003] All QSes seen, last rcu_sched kthread activity 4294853243/4295063260, jiffies_till_next_fqs=3
> > > ...
> > > 
> > > So it seems to work and it is not confusing anymore. Would it be possible 
> > > to dump the stack also in this case? The stall is there, the condition is 
> > > correctly detected, so I guess it could be possible. It would give the 
> > > information what caused the stall. 
> > 
> > Ah, good point!  In this case, the stall is self-detected, but RCU
> > incorrectly includes that it is some other CPU's fault because RCU is
> > not waiting on the current CPU.  Please see below for updated patch.
> > 
> > > In SMP=n PREEMPT=y case there is no stall with my test. I guess it is 
> > > because module inclusion does not disable preemption. However it is 
> > > possible in other cases as you wrote above.
> > 
> > OK, that matches expected behavior.
> > 
> > > One last thing. Is there a reason that there is not any similar INFO in 
> > > the console for tiny implementation? I enabled RCU_TRACE and still got 
> > > nothing. I have only shallow knowledge of RCU, though.
> > 
> > It is supposed to work given CONFIG_RCU_TRACE=y, and it looks like
> > it should, at least as long as scheduling-clock interrupts are being
> > delivered.  One thing to try for diagnosis is to have every (say) 1000th
> > call to check_cpu_stall() (in kernel/rcu/tiny_plugin.h) do a printk()
> > and same for rcu_check_callbacks() (in kernel/rcu/tiny.c).
> 
> Unfortunately nothing is there. I'll look into it when I have more time 
> later this week and let you know.
> 
> > 
> > > I'll test more patches if you have some...
> > 
> > Please see below for update that should print current CPU's stack for
> > the "All QSes seen" case.  This replaces the earlier patch.
> 
> This gives
> 
> ...
> [  149.320003] INFO: rcu_sched detected stalls on CPUs/tasks:
> [  149.320003]  (detected by 0, t=21002 jiffies, g=2823, c=2822, q=209)
> [  149.320003] All QSes seen, last rcu_sched kthread activity 21002 (4294816616-4294795614), jiffies_till_next_fqs=3
> [  149.320003] test_thread     R  running task    14832  4015      2 0x00000008
> [  149.320003]  00000000000039f0 ffff88013fc03da8 ffffffff8107874c ffffffff810786b2
> [  149.320003]  ffffffff81e391c0 000000000000520a ffff88013fdcc4a0 ffffffff81e38cc0
> [  149.320003]  ffffffff81e38cc0 ffff88013fc03e28 ffffffff810b1049 0000000000000002
> [  149.320003] Call Trace:
> [  149.320003]  <IRQ>  [<ffffffff8107874c>] sched_show_task+0x11c/0x190
> [  149.320003]  [<ffffffff810786b2>] ? sched_show_task+0x82/0x190
> [  149.320003]  [<ffffffff810b1049>] rcu_check_callbacks+0x839/0x850
> [  149.320003]  [<ffffffff810b627b>] update_process_times+0x4b/0x80
> [  149.320003]  [<ffffffff810c6865>] ? tick_sched_do_timer+0x35/0x40
> [  149.320003]  [<ffffffff810c60d6>] tick_sched_handle.isra.19+0x36/0x50
> [  149.320003]  [<ffffffff810c68b5>] tick_sched_timer+0x45/0x80
> [  149.320003]  [<ffffffff810b7022>] __run_hrtimer+0xb2/0x250
> [  149.320003]  [<ffffffff810b73cc>] ? hrtimer_interrupt+0x7c/0x250
> [  149.320003]  [<ffffffff810c6870>] ? tick_sched_do_timer+0x40/0x40
> [  149.320003]  [<ffffffff810b7447>] hrtimer_interrupt+0xf7/0x250
> [  149.320003]  [<ffffffffa0000000>] ? 0xffffffffa0000000
> [  149.320003]  [<ffffffff810356eb>] local_apic_timer_interrupt+0x3b/0x70
> [  149.320003]  [<ffffffff818a9845>] smp_apic_timer_interrupt+0x45/0x60
> [  149.320003]  [<ffffffff818a7c2f>] apic_timer_interrupt+0x6f/0x80
> [  149.320003]  <EOI>  [<ffffffff818a7960>] ? retint_restore_args+0xe/0xe
> [  149.320003]  [<ffffffff8106d004>] ? kthread_should_stop+0x24/0x30
> [  149.320003]  [<ffffffffa0000015>] ? test_thread_fn+0x15/0x20 [kthread_mod]
> [  149.320003]  [<ffffffff8106d460>] kthread+0xf0/0x110
> [  149.320003]  [<ffffffff8106d370>] ? kthread_create_on_node+0x240/0x240
> [  149.320003]  [<ffffffff818a6cec>] ret_from_fork+0x7c/0xb0
> [  149.320003]  [<ffffffff8106d370>] ? kthread_create_on_node+0x240/0x240
> ...
> 
> So it works as expected which is great. You can add my Tested-by if you 
> feel it is worth it.

Got it, thank you!

							Thanx, Paul


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-15 13:26             ` Miroslav Benes
  2014-12-16 18:42               ` Paul E. McKenney
@ 2014-12-19 11:09               ` Miroslav Benes
  2014-12-19 15:32                 ` Paul E. McKenney
  1 sibling, 1 reply; 14+ messages in thread
From: Miroslav Benes @ 2014-12-19 11:09 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Linux Kernel Mailing List

On Mon, 15 Dec 2014, Miroslav Benes wrote:

> On Fri, 12 Dec 2014, Paul E. McKenney wrote:
> 
> > On Fri, Dec 12, 2014 at 03:06:20PM +0100, Miroslav Benes wrote:
> > 

[...]

> > > One last thing. Is there a reason that there is not any similar INFO in 
> > > the console for tiny implementation? I enabled RCU_TRACE and still got 
> > > nothing. I have only shallow knowledge of RCU, though.
> > 
> > It is supposed to work given CONFIG_RCU_TRACE=y, and it looks like
> > it should, at least as long as scheduling-clock interrupts are being
> > delivered.  One thing to try for diagnosis is to have every (say) 1000th
> > call to check_cpu_stall() (in kernel/rcu/tiny_plugin.h) do a printk()
> > and same for rcu_check_callbacks() (in kernel/rcu/tiny.c).
> 
> Unfortunately nothing is there. I'll look into it when I have more time 
> later this week and let you know.

So I looked into it. Scheduling-clock interrupts are delivered and 
check_cpu_stall() in tiny_plugin.h is called twice each tick, once for 
rcu_bh and once for rcu_sched. What happens is the following (part of the 
printk output in check_cpu_stall)...

...
[  276.049053] name: rcu_bh, jiffies: 4294943345, jiffies_stall: 4294964342, rcp->curtail:           (null)
[  276.050046] name: rcu_sched, jiffies: 4294943345, jiffies_stall: 4294943357, rcp->curtail:           (null)
[  276.051312] name: rcu_bh, jiffies: 4294943347, jiffies_stall: 4294964345, rcp->curtail:           (null)
[  276.052305] name: rcu_sched, jiffies: 4294943347, jiffies_stall: 4294943357, rcp->curtail:           (null)
[  276.054634] name: rcu_bh, jiffies: 4294943356, jiffies_stall: 4294964347, rcp->curtail:           (null)
[  276.054634] name: rcu_sched, jiffies: 4294943356, jiffies_stall: 4294943357, rcp->curtail:           (null)
[  276.062989] name: rcu_bh, jiffies: 4294943358, jiffies_stall: 4294964356, rcp->curtail:           (null)
[  276.063982] name: rcu_sched, jiffies: 4294943358, jiffies_stall: 4294943357, rcp->curtail:           (null)
[  276.065285] name: rcu_bh, jiffies: 4294943361, jiffies_stall: 4294964358, rcp->curtail:           (null)
[  276.066278] name: rcu_sched, jiffies: 4294943361, jiffies_stall: 4294964358, rcp->curtail:           (null)
[  276.068566] name: rcu_bh, jiffies: 4294943370, jiffies_stall: 4294964361, rcp->curtail:           (null)
[  276.068566] name: rcu_sched, jiffies: 4294943370, jiffies_stall: 4294964358, rcp->curtail:           (null)
[  276.076867] name: rcu_bh, jiffies: 4294943372, jiffies_stall: 4294964370, rcp->curtail:           (null)
[  276.077861] name: rcu_sched, jiffies: 4294943372, jiffies_stall: 4294964358, rcp->curtail:           (null)
[  276.079199] name: rcu_bh, jiffies: 4294943375, jiffies_stall: 4294964372, rcp->curtail:           (null)
[  276.080193] name: rcu_sched, jiffies: 4294943375, jiffies_stall: 4294964358, rcp->curtail:           (null)
[  276.082565] name: rcu_bh, jiffies: 4294943384, jiffies_stall: 4294964375, rcp->curtail:           (null)
[  276.082565] name: rcu_sched, jiffies: 4294943384, jiffies_stall: 4294964358, rcp->curtail:           (null)
[  276.090984] name: rcu_bh, jiffies: 4294943386, jiffies_stall: 4294964384, rcp->curtail:           (null)
[  276.091976] name: rcu_sched, jiffies: 4294943386, jiffies_stall: 4294964358, rcp->curtail:           (null)
[  276.093296] name: rcu_bh, jiffies: 4294943389, jiffies_stall: 4294964386, rcp->curtail:           (null)
[  276.094289] name: rcu_sched, jiffies: 4294943389, jiffies_stall: 4294964358, rcp->curtail:           (null)
...

In case of rcu_bh jiffies_stall is regularly updated and no stall is thus 
detected (if I understand the code correctly). In case of rcu_sched the 
situation is different. jiffies_stall is updated every 21 seconds (my 
RCU_CPU_STALL_TIMEOUT). So I would say the stall should be detected there. 
Unfortunately it is bound to *rcp->curtail being not NULL, and it is in my 
case (in printk output there is rcp->curtail, but it is *rcp->curtail in 
fact). The patch below did the trick for me and cpu stall was detected and 
info printed.

...
[  138.534004] INFO: rcu_sched stall on CPU (41980 ticks this GP) idle=140000000000001 (t=42000 jiffies q=64)
[  138.534004] CPU: 0 PID: 4140 Comm: test_thread Tainted: G           O   3.18.0-next-20141218-rcu_tiny_patched+ #1
[  138.534004] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20141015_183547-build31 04/01/2014
[  138.534004]  0000000000000000 ffffffff81e2ce28 ffffffff81881b77 ffffffff81e2ce48
[  138.534004]  ffffffff810925c1 ffffffff81e2ce68 0000000000000000 ffffffff81e2ce68
[  138.534004]  ffffffff810932c2 0000000000000000 ffff8801395bc650 ffffffff81e2ce88
[  138.534004] Call Trace:
[  138.534004]  <IRQ>  [<ffffffff81881b77>] dump_stack+0x19/0x1b
[  138.534004]  [<ffffffff810925c1>] check_cpu_stall.part.6+0x61/0xb0
[  138.534004]  [<ffffffff810932c2>] rcu_check_callbacks+0x82/0x90
[  138.534004]  [<ffffffff81097258>] update_process_times+0x38/0x60
[  138.534004]  [<ffffffff810a6635>] tick_sched_handle.isra.22+0x35/0x50
[  138.534004]  [<ffffffff810a6ab3>] tick_sched_timer+0x43/0x70
[  138.534004]  [<ffffffff81097e1a>] __run_hrtimer+0xaa/0x240
[  138.534004]  [<ffffffff810981b6>] ? hrtimer_interrupt+0x66/0x250
[  138.534004]  [<ffffffff810a6a70>] ? tick_sched_do_timer+0x30/0x30
[  138.534004]  [<ffffffff8109822f>] hrtimer_interrupt+0xdf/0x250
[  138.534004]  [<ffffffffa0000000>] ? 0xffffffffa0000000
[  138.534004]  [<ffffffff8102d986>] local_apic_timer_interrupt+0x26/0x50
[  138.534004]  [<ffffffff8189026f>] smp_apic_timer_interrupt+0x3f/0x60
[  138.534004]  [<ffffffff8188e91d>] apic_timer_interrupt+0x6d/0x80
[  138.534004]  <EOI>  [<ffffffff8188e75c>] ? retint_restore_args+0xe/0xe
[  138.534004]  [<ffffffff81062c9d>] ? kthread_should_stop+0x1d/0x30
[  138.534004]  [<ffffffffa0000015>] ? test_thread_fn+0x15/0x20 [kthread_mod]
[  138.534004]  [<ffffffff81063139>] kthread+0xf9/0x110
[  138.534004]  [<ffffffff81063040>] ? kthread_create_on_node+0x230/0x230
[  138.534004]  [<ffffffff8188da2a>] ret_from_fork+0x7a/0xb0
[  138.534004]  [<ffffffff81063040>] ? kthread_create_on_node+0x230/0x230
...

But I'm not sure whether it is possible that the list of pending callback 
is empty. Maybe the bug is elsewhere. My understanding of RCU is limited 
but I'll try to look around. Also the commit log might be imprecise :/

Best regards,
Miroslav

-- >8 --
>From be3495519513356daa366517bb458bc4f05ecf9b Mon Sep 17 00:00:00 2001
From: Miroslav Benes <mbenes@suse.cz>
Date: Fri, 19 Dec 2014 11:11:04 +0100
Subject: [PATCH] rcu: fix rcu stall detection in tiny implementation

The current tiny RCU stall detection depends on presence of pending callbacks in
the list (*rcp->curtail et al.). But they are none even in case of real CPU
stalls prohibiting the detection. When the check for callback presence is
removed, INFO about the stall is correctly printed.

Signed-off-by: Miroslav Benes <mbenes@suse.cz>
---
 kernel/rcu/tiny_plugin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tiny_plugin.h b/kernel/rcu/tiny_plugin.h
index 858c565..3f743fa 100644
--- a/kernel/rcu/tiny_plugin.h
+++ b/kernel/rcu/tiny_plugin.h
@@ -145,7 +145,7 @@ static void check_cpu_stall(struct rcu_ctrlblk *rcp)
 	rcp->ticks_this_gp++;
 	j = jiffies;
 	js = ACCESS_ONCE(rcp->jiffies_stall);
-	if (*rcp->curtail && ULONG_CMP_GE(j, js)) {
+	if (ULONG_CMP_GE(j, js)) {
 		pr_err("INFO: %s stall on CPU (%lu ticks this GP) idle=%llx (t=%lu jiffies q=%ld)\n",
 		       rcp->name, rcp->ticks_this_gp, rcu_dynticks_nesting,
 		       jiffies - rcp->gp_start, rcp->qlen);
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-19 11:09               ` Miroslav Benes
@ 2014-12-19 15:32                 ` Paul E. McKenney
  2014-12-21 17:46                   ` Paul E. McKenney
  2014-12-22 12:42                   ` Miroslav Benes
  0 siblings, 2 replies; 14+ messages in thread
From: Paul E. McKenney @ 2014-12-19 15:32 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: Linux Kernel Mailing List

On Fri, Dec 19, 2014 at 12:09:48PM +0100, Miroslav Benes wrote:
> On Mon, 15 Dec 2014, Miroslav Benes wrote:
> 
> > On Fri, 12 Dec 2014, Paul E. McKenney wrote:
> > 
> > > On Fri, Dec 12, 2014 at 03:06:20PM +0100, Miroslav Benes wrote:
> > > 
> 
> [...]
> 
> > > > One last thing. Is there a reason that there is not any similar INFO in 
> > > > the console for tiny implementation? I enabled RCU_TRACE and still got 
> > > > nothing. I have only shallow knowledge of RCU, though.
> > > 
> > > It is supposed to work given CONFIG_RCU_TRACE=y, and it looks like
> > > it should, at least as long as scheduling-clock interrupts are being
> > > delivered.  One thing to try for diagnosis is to have every (say) 1000th
> > > call to check_cpu_stall() (in kernel/rcu/tiny_plugin.h) do a printk()
> > > and same for rcu_check_callbacks() (in kernel/rcu/tiny.c).
> > 
> > Unfortunately nothing is there. I'll look into it when I have more time 
> > later this week and let you know.
> 
> So I looked into it. Scheduling-clock interrupts are delivered and 
> check_cpu_stall() in tiny_plugin.h is called twice each tick, once for 
> rcu_bh and once for rcu_sched. What happens is the following (part of the 
> printk output in check_cpu_stall)...
> 
> ...
> [  276.049053] name: rcu_bh, jiffies: 4294943345, jiffies_stall: 4294964342, rcp->curtail:           (null)
> [  276.050046] name: rcu_sched, jiffies: 4294943345, jiffies_stall: 4294943357, rcp->curtail:           (null)
> [  276.051312] name: rcu_bh, jiffies: 4294943347, jiffies_stall: 4294964345, rcp->curtail:           (null)
> [  276.052305] name: rcu_sched, jiffies: 4294943347, jiffies_stall: 4294943357, rcp->curtail:           (null)
> [  276.054634] name: rcu_bh, jiffies: 4294943356, jiffies_stall: 4294964347, rcp->curtail:           (null)
> [  276.054634] name: rcu_sched, jiffies: 4294943356, jiffies_stall: 4294943357, rcp->curtail:           (null)
> [  276.062989] name: rcu_bh, jiffies: 4294943358, jiffies_stall: 4294964356, rcp->curtail:           (null)
> [  276.063982] name: rcu_sched, jiffies: 4294943358, jiffies_stall: 4294943357, rcp->curtail:           (null)
> [  276.065285] name: rcu_bh, jiffies: 4294943361, jiffies_stall: 4294964358, rcp->curtail:           (null)
> [  276.066278] name: rcu_sched, jiffies: 4294943361, jiffies_stall: 4294964358, rcp->curtail:           (null)
> [  276.068566] name: rcu_bh, jiffies: 4294943370, jiffies_stall: 4294964361, rcp->curtail:           (null)
> [  276.068566] name: rcu_sched, jiffies: 4294943370, jiffies_stall: 4294964358, rcp->curtail:           (null)
> [  276.076867] name: rcu_bh, jiffies: 4294943372, jiffies_stall: 4294964370, rcp->curtail:           (null)
> [  276.077861] name: rcu_sched, jiffies: 4294943372, jiffies_stall: 4294964358, rcp->curtail:           (null)
> [  276.079199] name: rcu_bh, jiffies: 4294943375, jiffies_stall: 4294964372, rcp->curtail:           (null)
> [  276.080193] name: rcu_sched, jiffies: 4294943375, jiffies_stall: 4294964358, rcp->curtail:           (null)
> [  276.082565] name: rcu_bh, jiffies: 4294943384, jiffies_stall: 4294964375, rcp->curtail:           (null)
> [  276.082565] name: rcu_sched, jiffies: 4294943384, jiffies_stall: 4294964358, rcp->curtail:           (null)
> [  276.090984] name: rcu_bh, jiffies: 4294943386, jiffies_stall: 4294964384, rcp->curtail:           (null)
> [  276.091976] name: rcu_sched, jiffies: 4294943386, jiffies_stall: 4294964358, rcp->curtail:           (null)
> [  276.093296] name: rcu_bh, jiffies: 4294943389, jiffies_stall: 4294964386, rcp->curtail:           (null)
> [  276.094289] name: rcu_sched, jiffies: 4294943389, jiffies_stall: 4294964358, rcp->curtail:           (null)
> ...
> 
> In case of rcu_bh jiffies_stall is regularly updated and no stall is thus 
> detected (if I understand the code correctly). In case of rcu_sched the 
> situation is different. jiffies_stall is updated every 21 seconds (my 
> RCU_CPU_STALL_TIMEOUT). So I would say the stall should be detected there. 
> Unfortunately it is bound to *rcp->curtail being not NULL, and it is in my 
> case (in printk output there is rcp->curtail, but it is *rcp->curtail in 
> fact). The patch below did the trick for me and cpu stall was detected and 
> info printed.
> 
> ...
> [  138.534004] INFO: rcu_sched stall on CPU (41980 ticks this GP) idle=140000000000001 (t=42000 jiffies q=64)
> [  138.534004] CPU: 0 PID: 4140 Comm: test_thread Tainted: G           O   3.18.0-next-20141218-rcu_tiny_patched+ #1
> [  138.534004] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20141015_183547-build31 04/01/2014
> [  138.534004]  0000000000000000 ffffffff81e2ce28 ffffffff81881b77 ffffffff81e2ce48
> [  138.534004]  ffffffff810925c1 ffffffff81e2ce68 0000000000000000 ffffffff81e2ce68
> [  138.534004]  ffffffff810932c2 0000000000000000 ffff8801395bc650 ffffffff81e2ce88
> [  138.534004] Call Trace:
> [  138.534004]  <IRQ>  [<ffffffff81881b77>] dump_stack+0x19/0x1b
> [  138.534004]  [<ffffffff810925c1>] check_cpu_stall.part.6+0x61/0xb0
> [  138.534004]  [<ffffffff810932c2>] rcu_check_callbacks+0x82/0x90
> [  138.534004]  [<ffffffff81097258>] update_process_times+0x38/0x60
> [  138.534004]  [<ffffffff810a6635>] tick_sched_handle.isra.22+0x35/0x50
> [  138.534004]  [<ffffffff810a6ab3>] tick_sched_timer+0x43/0x70
> [  138.534004]  [<ffffffff81097e1a>] __run_hrtimer+0xaa/0x240
> [  138.534004]  [<ffffffff810981b6>] ? hrtimer_interrupt+0x66/0x250
> [  138.534004]  [<ffffffff810a6a70>] ? tick_sched_do_timer+0x30/0x30
> [  138.534004]  [<ffffffff8109822f>] hrtimer_interrupt+0xdf/0x250
> [  138.534004]  [<ffffffffa0000000>] ? 0xffffffffa0000000
> [  138.534004]  [<ffffffff8102d986>] local_apic_timer_interrupt+0x26/0x50
> [  138.534004]  [<ffffffff8189026f>] smp_apic_timer_interrupt+0x3f/0x60
> [  138.534004]  [<ffffffff8188e91d>] apic_timer_interrupt+0x6d/0x80
> [  138.534004]  <EOI>  [<ffffffff8188e75c>] ? retint_restore_args+0xe/0xe
> [  138.534004]  [<ffffffff81062c9d>] ? kthread_should_stop+0x1d/0x30
> [  138.534004]  [<ffffffffa0000015>] ? test_thread_fn+0x15/0x20 [kthread_mod]
> [  138.534004]  [<ffffffff81063139>] kthread+0xf9/0x110
> [  138.534004]  [<ffffffff81063040>] ? kthread_create_on_node+0x230/0x230
> [  138.534004]  [<ffffffff8188da2a>] ret_from_fork+0x7a/0xb0
> [  138.534004]  [<ffffffff81063040>] ? kthread_create_on_node+0x230/0x230
> ...
> 
> But I'm not sure whether it is possible that the list of pending callback 
> is empty. Maybe the bug is elsewhere. My understanding of RCU is limited 
> but I'll try to look around. Also the commit log might be imprecise :/

You found the problem, good show!  However, we do need a slightly
different fix, please see below.

> Best regards,
> Miroslav
> 
> -- >8 --
> >From be3495519513356daa366517bb458bc4f05ecf9b Mon Sep 17 00:00:00 2001
> From: Miroslav Benes <mbenes@suse.cz>
> Date: Fri, 19 Dec 2014 11:11:04 +0100
> Subject: [PATCH] rcu: fix rcu stall detection in tiny implementation

The first word following a ":" needs to be capitalized.

> The current tiny RCU stall detection depends on presence of pending callbacks in
> the list (*rcp->curtail et al.). But they are none even in case of real CPU
> stalls prohibiting the detection. When the check for callback presence is
> removed, INFO about the stall is correctly printed.

As you say, this needs to be reworked, but first please see below.

> Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> ---
>  kernel/rcu/tiny_plugin.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/tiny_plugin.h b/kernel/rcu/tiny_plugin.h
> index 858c565..3f743fa 100644
> --- a/kernel/rcu/tiny_plugin.h
> +++ b/kernel/rcu/tiny_plugin.h
> @@ -145,7 +145,7 @@ static void check_cpu_stall(struct rcu_ctrlblk *rcp)
>  	rcp->ticks_this_gp++;
>  	j = jiffies;
>  	js = ACCESS_ONCE(rcp->jiffies_stall);
> -	if (*rcp->curtail && ULONG_CMP_GE(j, js)) {
> +	if (ULONG_CMP_GE(j, js)) {

The problem is that ->curtail is a tail pointer, and thus always points
to a NULL pointer, so the expression is always false.  However, we don't
want to give RCU CPU stall warnings if there are no callbacks, because
this can result in spurious stall warnings after long idle periods.

So we need to check the header pointer as follows:

+	if (rcp->rcucblist && ULONG_CMP_GE(j, js)) {

But what do you do if your test module happens to start spinning when
there is no callback?

One approach is to do a call_rcu() just after starting your loop.
Alternatively, do several just before starting your loop.

							Thanx, Paul

>  		pr_err("INFO: %s stall on CPU (%lu ticks this GP) idle=%llx (t=%lu jiffies q=%ld)\n",
>  		       rcp->name, rcp->ticks_this_gp, rcu_dynticks_nesting,
>  		       jiffies - rcp->gp_start, rcp->qlen);
> -- 
> 2.1.2
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-19 15:32                 ` Paul E. McKenney
@ 2014-12-21 17:46                   ` Paul E. McKenney
  2014-12-22 12:42                   ` Miroslav Benes
  1 sibling, 0 replies; 14+ messages in thread
From: Paul E. McKenney @ 2014-12-21 17:46 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: Linux Kernel Mailing List

On Fri, Dec 19, 2014 at 07:32:20AM -0800, Paul E. McKenney wrote:
> On Fri, Dec 19, 2014 at 12:09:48PM +0100, Miroslav Benes wrote:
> > On Mon, 15 Dec 2014, Miroslav Benes wrote:

[ . . . ]

> > But I'm not sure whether it is possible that the list of pending callback 
> > is empty. Maybe the bug is elsewhere. My understanding of RCU is limited 
> > but I'll try to look around. Also the commit log might be imprecise :/
> 
> You found the problem, good show!  However, we do need a slightly
> different fix, please see below.

And, as I should have said the first time, please update your patch
as mentioned below so that I can add it to the -rcu tree.

								Thanx, Paul

> > Best regards,
> > Miroslav
> > 
> > -- >8 --
> > >From be3495519513356daa366517bb458bc4f05ecf9b Mon Sep 17 00:00:00 2001
> > From: Miroslav Benes <mbenes@suse.cz>
> > Date: Fri, 19 Dec 2014 11:11:04 +0100
> > Subject: [PATCH] rcu: fix rcu stall detection in tiny implementation
> 
> The first word following a ":" needs to be capitalized.
> 
> > The current tiny RCU stall detection depends on presence of pending callbacks in
> > the list (*rcp->curtail et al.). But they are none even in case of real CPU
> > stalls prohibiting the detection. When the check for callback presence is
> > removed, INFO about the stall is correctly printed.
> 
> As you say, this needs to be reworked, but first please see below.
> 
> > Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> > ---
> >  kernel/rcu/tiny_plugin.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/tiny_plugin.h b/kernel/rcu/tiny_plugin.h
> > index 858c565..3f743fa 100644
> > --- a/kernel/rcu/tiny_plugin.h
> > +++ b/kernel/rcu/tiny_plugin.h
> > @@ -145,7 +145,7 @@ static void check_cpu_stall(struct rcu_ctrlblk *rcp)
> >  	rcp->ticks_this_gp++;
> >  	j = jiffies;
> >  	js = ACCESS_ONCE(rcp->jiffies_stall);
> > -	if (*rcp->curtail && ULONG_CMP_GE(j, js)) {
> > +	if (ULONG_CMP_GE(j, js)) {
> 
> The problem is that ->curtail is a tail pointer, and thus always points
> to a NULL pointer, so the expression is always false.  However, we don't
> want to give RCU CPU stall warnings if there are no callbacks, because
> this can result in spurious stall warnings after long idle periods.
> 
> So we need to check the header pointer as follows:
> 
> +	if (rcp->rcucblist && ULONG_CMP_GE(j, js)) {
> 
> But what do you do if your test module happens to start spinning when
> there is no callback?
> 
> One approach is to do a call_rcu() just after starting your loop.
> Alternatively, do several just before starting your loop.
> 
> 							Thanx, Paul
> 
> >  		pr_err("INFO: %s stall on CPU (%lu ticks this GP) idle=%llx (t=%lu jiffies q=%ld)\n",
> >  		       rcp->name, rcp->ticks_this_gp, rcu_dynticks_nesting,
> >  		       jiffies - rcp->gp_start, rcp->qlen);
> > -- 
> > 2.1.2
> > 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-19 15:32                 ` Paul E. McKenney
  2014-12-21 17:46                   ` Paul E. McKenney
@ 2014-12-22 12:42                   ` Miroslav Benes
  2014-12-22 19:40                     ` Paul E. McKenney
  1 sibling, 1 reply; 14+ messages in thread
From: Miroslav Benes @ 2014-12-22 12:42 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: Linux Kernel Mailing List

On Fri, 19 Dec 2014, Paul E. McKenney wrote:

> On Fri, Dec 19, 2014 at 12:09:48PM +0100, Miroslav Benes wrote:
> > On Mon, 15 Dec 2014, Miroslav Benes wrote:
> > 
> > > On Fri, 12 Dec 2014, Paul E. McKenney wrote:
> > > 
> > > > On Fri, Dec 12, 2014 at 03:06:20PM +0100, Miroslav Benes wrote:
> > > > 
> > 
> > [...]
> > 
> > > > > One last thing. Is there a reason that there is not any similar INFO in 
> > > > > the console for tiny implementation? I enabled RCU_TRACE and still got 
> > > > > nothing. I have only shallow knowledge of RCU, though.
> > > > 
> > > > It is supposed to work given CONFIG_RCU_TRACE=y, and it looks like
> > > > it should, at least as long as scheduling-clock interrupts are being
> > > > delivered.  One thing to try for diagnosis is to have every (say) 1000th
> > > > call to check_cpu_stall() (in kernel/rcu/tiny_plugin.h) do a printk()
> > > > and same for rcu_check_callbacks() (in kernel/rcu/tiny.c).
> > > 
> > > Unfortunately nothing is there. I'll look into it when I have more time 
> > > later this week and let you know.
> > 
> > So I looked into it. Scheduling-clock interrupts are delivered and 
> > check_cpu_stall() in tiny_plugin.h is called twice each tick, once for 
> > rcu_bh and once for rcu_sched. What happens is the following (part of the 
> > printk output in check_cpu_stall)...
> > 
> > ...
> > [  276.049053] name: rcu_bh, jiffies: 4294943345, jiffies_stall: 4294964342, rcp->curtail:           (null)
> > [  276.050046] name: rcu_sched, jiffies: 4294943345, jiffies_stall: 4294943357, rcp->curtail:           (null)
> > [  276.051312] name: rcu_bh, jiffies: 4294943347, jiffies_stall: 4294964345, rcp->curtail:           (null)
> > [  276.052305] name: rcu_sched, jiffies: 4294943347, jiffies_stall: 4294943357, rcp->curtail:           (null)
> > [  276.054634] name: rcu_bh, jiffies: 4294943356, jiffies_stall: 4294964347, rcp->curtail:           (null)
> > [  276.054634] name: rcu_sched, jiffies: 4294943356, jiffies_stall: 4294943357, rcp->curtail:           (null)
> > [  276.062989] name: rcu_bh, jiffies: 4294943358, jiffies_stall: 4294964356, rcp->curtail:           (null)
> > [  276.063982] name: rcu_sched, jiffies: 4294943358, jiffies_stall: 4294943357, rcp->curtail:           (null)
> > [  276.065285] name: rcu_bh, jiffies: 4294943361, jiffies_stall: 4294964358, rcp->curtail:           (null)
> > [  276.066278] name: rcu_sched, jiffies: 4294943361, jiffies_stall: 4294964358, rcp->curtail:           (null)
> > [  276.068566] name: rcu_bh, jiffies: 4294943370, jiffies_stall: 4294964361, rcp->curtail:           (null)
> > [  276.068566] name: rcu_sched, jiffies: 4294943370, jiffies_stall: 4294964358, rcp->curtail:           (null)
> > [  276.076867] name: rcu_bh, jiffies: 4294943372, jiffies_stall: 4294964370, rcp->curtail:           (null)
> > [  276.077861] name: rcu_sched, jiffies: 4294943372, jiffies_stall: 4294964358, rcp->curtail:           (null)
> > [  276.079199] name: rcu_bh, jiffies: 4294943375, jiffies_stall: 4294964372, rcp->curtail:           (null)
> > [  276.080193] name: rcu_sched, jiffies: 4294943375, jiffies_stall: 4294964358, rcp->curtail:           (null)
> > [  276.082565] name: rcu_bh, jiffies: 4294943384, jiffies_stall: 4294964375, rcp->curtail:           (null)
> > [  276.082565] name: rcu_sched, jiffies: 4294943384, jiffies_stall: 4294964358, rcp->curtail:           (null)
> > [  276.090984] name: rcu_bh, jiffies: 4294943386, jiffies_stall: 4294964384, rcp->curtail:           (null)
> > [  276.091976] name: rcu_sched, jiffies: 4294943386, jiffies_stall: 4294964358, rcp->curtail:           (null)
> > [  276.093296] name: rcu_bh, jiffies: 4294943389, jiffies_stall: 4294964386, rcp->curtail:           (null)
> > [  276.094289] name: rcu_sched, jiffies: 4294943389, jiffies_stall: 4294964358, rcp->curtail:           (null)
> > ...
> > 
> > In case of rcu_bh jiffies_stall is regularly updated and no stall is thus 
> > detected (if I understand the code correctly). In case of rcu_sched the 
> > situation is different. jiffies_stall is updated every 21 seconds (my 
> > RCU_CPU_STALL_TIMEOUT). So I would say the stall should be detected there. 
> > Unfortunately it is bound to *rcp->curtail being not NULL, and it is in my 
> > case (in printk output there is rcp->curtail, but it is *rcp->curtail in 
> > fact). The patch below did the trick for me and cpu stall was detected and 
> > info printed.
> > 
> > ...
> > [  138.534004] INFO: rcu_sched stall on CPU (41980 ticks this GP) idle=140000000000001 (t=42000 jiffies q=64)
> > [  138.534004] CPU: 0 PID: 4140 Comm: test_thread Tainted: G           O   3.18.0-next-20141218-rcu_tiny_patched+ #1
> > [  138.534004] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20141015_183547-build31 04/01/2014
> > [  138.534004]  0000000000000000 ffffffff81e2ce28 ffffffff81881b77 ffffffff81e2ce48
> > [  138.534004]  ffffffff810925c1 ffffffff81e2ce68 0000000000000000 ffffffff81e2ce68
> > [  138.534004]  ffffffff810932c2 0000000000000000 ffff8801395bc650 ffffffff81e2ce88
> > [  138.534004] Call Trace:
> > [  138.534004]  <IRQ>  [<ffffffff81881b77>] dump_stack+0x19/0x1b
> > [  138.534004]  [<ffffffff810925c1>] check_cpu_stall.part.6+0x61/0xb0
> > [  138.534004]  [<ffffffff810932c2>] rcu_check_callbacks+0x82/0x90
> > [  138.534004]  [<ffffffff81097258>] update_process_times+0x38/0x60
> > [  138.534004]  [<ffffffff810a6635>] tick_sched_handle.isra.22+0x35/0x50
> > [  138.534004]  [<ffffffff810a6ab3>] tick_sched_timer+0x43/0x70
> > [  138.534004]  [<ffffffff81097e1a>] __run_hrtimer+0xaa/0x240
> > [  138.534004]  [<ffffffff810981b6>] ? hrtimer_interrupt+0x66/0x250
> > [  138.534004]  [<ffffffff810a6a70>] ? tick_sched_do_timer+0x30/0x30
> > [  138.534004]  [<ffffffff8109822f>] hrtimer_interrupt+0xdf/0x250
> > [  138.534004]  [<ffffffffa0000000>] ? 0xffffffffa0000000
> > [  138.534004]  [<ffffffff8102d986>] local_apic_timer_interrupt+0x26/0x50
> > [  138.534004]  [<ffffffff8189026f>] smp_apic_timer_interrupt+0x3f/0x60
> > [  138.534004]  [<ffffffff8188e91d>] apic_timer_interrupt+0x6d/0x80
> > [  138.534004]  <EOI>  [<ffffffff8188e75c>] ? retint_restore_args+0xe/0xe
> > [  138.534004]  [<ffffffff81062c9d>] ? kthread_should_stop+0x1d/0x30
> > [  138.534004]  [<ffffffffa0000015>] ? test_thread_fn+0x15/0x20 [kthread_mod]
> > [  138.534004]  [<ffffffff81063139>] kthread+0xf9/0x110
> > [  138.534004]  [<ffffffff81063040>] ? kthread_create_on_node+0x230/0x230
> > [  138.534004]  [<ffffffff8188da2a>] ret_from_fork+0x7a/0xb0
> > [  138.534004]  [<ffffffff81063040>] ? kthread_create_on_node+0x230/0x230
> > ...
> > 
> > But I'm not sure whether it is possible that the list of pending callback 
> > is empty. Maybe the bug is elsewhere. My understanding of RCU is limited 
> > but I'll try to look around. Also the commit log might be imprecise :/
> 
> You found the problem, good show!  However, we do need a slightly
> different fix, please see below.
> 
> > Best regards,
> > Miroslav
> > 
> > -- >8 --
> > >From be3495519513356daa366517bb458bc4f05ecf9b Mon Sep 17 00:00:00 2001
> > From: Miroslav Benes <mbenes@suse.cz>
> > Date: Fri, 19 Dec 2014 11:11:04 +0100
> > Subject: [PATCH] rcu: fix rcu stall detection in tiny implementation
> 
> The first word following a ":" needs to be capitalized.
> 
> > The current tiny RCU stall detection depends on presence of pending callbacks in
> > the list (*rcp->curtail et al.). But they are none even in case of real CPU
> > stalls prohibiting the detection. When the check for callback presence is
> > removed, INFO about the stall is correctly printed.
> 
> As you say, this needs to be reworked, but first please see below.
> 
> > Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> > ---
> >  kernel/rcu/tiny_plugin.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/tiny_plugin.h b/kernel/rcu/tiny_plugin.h
> > index 858c565..3f743fa 100644
> > --- a/kernel/rcu/tiny_plugin.h
> > +++ b/kernel/rcu/tiny_plugin.h
> > @@ -145,7 +145,7 @@ static void check_cpu_stall(struct rcu_ctrlblk *rcp)
> >  	rcp->ticks_this_gp++;
> >  	j = jiffies;
> >  	js = ACCESS_ONCE(rcp->jiffies_stall);
> > -	if (*rcp->curtail && ULONG_CMP_GE(j, js)) {
> > +	if (ULONG_CMP_GE(j, js)) {
> 
> The problem is that ->curtail is a tail pointer, and thus always points
> to a NULL pointer, so the expression is always false.  However, we don't
> want to give RCU CPU stall warnings if there are no callbacks, because
> this can result in spurious stall warnings after long idle periods.

Ah, I should have spotted that. It makes sense now.

> So we need to check the header pointer as follows:
> 
> +	if (rcp->rcucblist && ULONG_CMP_GE(j, js)) {

Ok, thank you.

> But what do you do if your test module happens to start spinning when
> there is no callback?
> 
> One approach is to do a call_rcu() just after starting your loop.
> Alternatively, do several just before starting your loop.

Yes, that would be the way if I understand it correctly. Fortunately there 
is always some pending RCU callback during my test. I wonder where it 
comes from (probably module handling. I'll try to find out just to be 
sure.). 

So, is the following patch correct? I tested it and it seems ok. And sorry 
for the delay, I took the weekend off. Thank you for your patience.

Miroslav

-- >8 --
>From 0dccc8e322e1a38016422be8f4af110eb7a405fc Mon Sep 17 00:00:00 2001
From: Miroslav Benes <mbenes@suse.cz>
Date: Mon, 22 Dec 2014 13:24:44 +0100
Subject: [PATCH] rcu: Fix RCU CPU stall detection in tiny implementation

The tiny RCU CPU stall detection depends on *rcp->curtail not being
NULL. It is however a tail pointer and thus NULL by definition. Instead we
should check rcp->rcucblist for the presence of pending callbacks which need to
be processed. With this fix INFO about the stall is printed and jiffies_stall
(jiffies at next stall) correctly updated.

Note that the check for pending callback is necessary to avoid spurious warnings
if there are no pendings callbacks.

Signed-off-by: Miroslav Benes <mbenes@suse.cz>
---
 kernel/rcu/tiny_plugin.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tiny_plugin.h b/kernel/rcu/tiny_plugin.h
index 858c565..868e132 100644
--- a/kernel/rcu/tiny_plugin.h
+++ b/kernel/rcu/tiny_plugin.h
@@ -145,13 +145,13 @@ static void check_cpu_stall(struct rcu_ctrlblk *rcp)
 	rcp->ticks_this_gp++;
 	j = jiffies;
 	js = ACCESS_ONCE(rcp->jiffies_stall);
-	if (*rcp->curtail && ULONG_CMP_GE(j, js)) {
+	if (rcp->rcucblist && ULONG_CMP_GE(j, js)) {
 		pr_err("INFO: %s stall on CPU (%lu ticks this GP) idle=%llx (t=%lu jiffies q=%ld)\n",
 		       rcp->name, rcp->ticks_this_gp, rcu_dynticks_nesting,
 		       jiffies - rcp->gp_start, rcp->qlen);
 		dump_stack();
 	}
-	if (*rcp->curtail && ULONG_CMP_GE(j, js))
+	if (rcp->rcucblist && ULONG_CMP_GE(j, js))
 		ACCESS_ONCE(rcp->jiffies_stall) = jiffies +
 			3 * rcu_jiffies_till_stall_check() + 3;
 	else if (ULONG_CMP_GE(j, js))
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: rcu_sched stall detected, but no state dump
  2014-12-22 12:42                   ` Miroslav Benes
@ 2014-12-22 19:40                     ` Paul E. McKenney
  0 siblings, 0 replies; 14+ messages in thread
From: Paul E. McKenney @ 2014-12-22 19:40 UTC (permalink / raw)
  To: Miroslav Benes; +Cc: Linux Kernel Mailing List

On Mon, Dec 22, 2014 at 01:42:14PM +0100, Miroslav Benes wrote:
> On Fri, 19 Dec 2014, Paul E. McKenney wrote:
> > On Fri, Dec 19, 2014 at 12:09:48PM +0100, Miroslav Benes wrote:

[ . . . ]

> > But what do you do if your test module happens to start spinning when
> > there is no callback?
> > 
> > One approach is to do a call_rcu() just after starting your loop.
> > Alternatively, do several just before starting your loop.
> 
> Yes, that would be the way if I understand it correctly. Fortunately there 
> is always some pending RCU callback during my test. I wonder where it 
> comes from (probably module handling. I'll try to find out just to be 
> sure.). 
> 
> So, is the following patch correct? I tested it and it seems ok. And sorry 
> for the delay, I took the weekend off. Thank you for your patience.

Good!  I forward ported it to -rcu, merged the two "if" statements,
queued it, and have started testing.  Thank you!!!

							Thanx, Paul

> Miroslav
> 
> -- >8 --
> >From 0dccc8e322e1a38016422be8f4af110eb7a405fc Mon Sep 17 00:00:00 2001
> From: Miroslav Benes <mbenes@suse.cz>
> Date: Mon, 22 Dec 2014 13:24:44 +0100
> Subject: [PATCH] rcu: Fix RCU CPU stall detection in tiny implementation
> 
> The tiny RCU CPU stall detection depends on *rcp->curtail not being
> NULL. It is however a tail pointer and thus NULL by definition. Instead we
> should check rcp->rcucblist for the presence of pending callbacks which need to
> be processed. With this fix INFO about the stall is printed and jiffies_stall
> (jiffies at next stall) correctly updated.
> 
> Note that the check for pending callback is necessary to avoid spurious warnings
> if there are no pendings callbacks.
> 
> Signed-off-by: Miroslav Benes <mbenes@suse.cz>
> ---
>  kernel/rcu/tiny_plugin.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/tiny_plugin.h b/kernel/rcu/tiny_plugin.h
> index 858c565..868e132 100644
> --- a/kernel/rcu/tiny_plugin.h
> +++ b/kernel/rcu/tiny_plugin.h
> @@ -145,13 +145,13 @@ static void check_cpu_stall(struct rcu_ctrlblk *rcp)
>  	rcp->ticks_this_gp++;
>  	j = jiffies;
>  	js = ACCESS_ONCE(rcp->jiffies_stall);
> -	if (*rcp->curtail && ULONG_CMP_GE(j, js)) {
> +	if (rcp->rcucblist && ULONG_CMP_GE(j, js)) {
>  		pr_err("INFO: %s stall on CPU (%lu ticks this GP) idle=%llx (t=%lu jiffies q=%ld)\n",
>  		       rcp->name, rcp->ticks_this_gp, rcu_dynticks_nesting,
>  		       jiffies - rcp->gp_start, rcp->qlen);
>  		dump_stack();
>  	}
> -	if (*rcp->curtail && ULONG_CMP_GE(j, js))
> +	if (rcp->rcucblist && ULONG_CMP_GE(j, js))
>  		ACCESS_ONCE(rcp->jiffies_stall) = jiffies +
>  			3 * rcu_jiffies_till_stall_check() + 3;
>  	else if (ULONG_CMP_GE(j, js))
> -- 
> 2.1.2
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-12-22 19:40 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-10 12:52 rcu_sched stall detected, but no state dump Miroslav Benes
2014-12-10 16:28 ` Paul E. McKenney
2014-12-11  9:35   ` Miroslav Benes
2014-12-11 16:50     ` Paul E. McKenney
2014-12-11 19:09       ` Paul E. McKenney
2014-12-12 14:06         ` Miroslav Benes
2014-12-12 16:58           ` Paul E. McKenney
2014-12-15 13:26             ` Miroslav Benes
2014-12-16 18:42               ` Paul E. McKenney
2014-12-19 11:09               ` Miroslav Benes
2014-12-19 15:32                 ` Paul E. McKenney
2014-12-21 17:46                   ` Paul E. McKenney
2014-12-22 12:42                   ` Miroslav Benes
2014-12-22 19:40                     ` Paul E. McKenney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.