* rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
@ 2015-01-18 14:17 Sasha Levin
2015-01-18 23:22 ` Paul E. McKenney
0 siblings, 1 reply; 30+ messages in thread
From: Sasha Levin @ 2015-01-18 14:17 UTC (permalink / raw)
To: Paul E. McKenney, Lai Jiangshan
Cc: Ingo Molnar, Peter Zijlstra, LKML, Dave Jones
Hi Paul, Lai,
While fuzzing with trinity inside a KVM tools guest running the latest -next
kernel, I've stumbled on the following spew:
[ 598.188036] WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
[ 598.188036] Modules linked in:
[ 598.188036] CPU: 30 PID: 23771 Comm: trinity-c118 Not tainted 3.19.0-rc4-next-20150116-sasha-00055-gb8e1507-dirty #1745
[ 598.188036] ffffffff926e52dc ffff8801b4403c38 ffffffff91439fb2 0000000000000000
[ 598.188036] 0000000000000000 ffff8801b4403c78 ffffffff8e159e1a ffff8801b4403c58
[ 598.188036] 0000000000000000 ffff88002f093000 00000000ffffffff ffff8801b23b9290
[ 598.188036] Call Trace:
[ 598.188036] <IRQ> dump_stack (lib/dump_stack.c:52)
[ 598.212152] warn_slowpath_common (kernel/panic.c:447)
[ 598.212152] warn_slowpath_null (kernel/panic.c:481)
[ 598.212152] rcu_read_unlock_special (kernel/rcu/tree_plugin.h:337 (discriminator 9))
[ 598.212152] ? select_task_rq_fair (include/linux/rcupdate.h:889 kernel/sched/fair.c:4740)
[ 598.212152] __rcu_read_unlock (kernel/rcu/update.c:97)
[ 598.212152] select_task_rq_fair (kernel/sched/fair.c:4805)
[ 598.212152] ? select_task_rq_fair (include/linux/rcupdate.h:889 kernel/sched/fair.c:4740)
[ 598.212152] ? try_to_wake_up (kernel/sched/core.c:1701)
[ 598.212152] try_to_wake_up (kernel/sched/core.c:1415 kernel/sched/core.c:1729)
[ 598.212152] wake_up_process (kernel/sched/core.c:1797 (discriminator 3))
[ 598.212152] hrtimer_wakeup (kernel/time/hrtimer.c:1490)
[ 598.212152] __run_hrtimer (kernel/time/hrtimer.c:1218 (discriminator 3))
[ 598.212152] ? hrtimer_interrupt (kernel/time/hrtimer.c:622 kernel/time/hrtimer.c:1254)
[ 598.212152] ? hrtimer_get_res (kernel/time/hrtimer.c:1480)
[ 598.212152] hrtimer_interrupt (kernel/time/hrtimer.c:1307)
[ 598.212152] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:921)
[ 598.212152] smp_apic_timer_interrupt (./arch/x86/include/asm/apic.h:660 arch/x86/kernel/apic/apic.c:945)
[ 598.212152] apic_timer_interrupt (arch/x86/kernel/entry_64.S:983)
[ 598.212152] <EOI> ? context_tracking_user_enter (./arch/x86/include/asm/paravirt.h:809 kernel/context_tracking.c:106)
[ 598.212152] syscall_trace_leave (arch/x86/kernel/ptrace.c:1640)
[ 598.212152] int_check_syscall_exit_work (arch/x86/kernel/entry_64.S:577)
Thanks,
Sasha
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-18 14:17 rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550() Sasha Levin
@ 2015-01-18 23:22 ` Paul E. McKenney
2015-01-20 15:39 ` Sasha Levin
0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-18 23:22 UTC (permalink / raw)
To: Sasha Levin; +Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML, Dave Jones
On Sun, Jan 18, 2015 at 09:17:40AM -0500, Sasha Levin wrote:
> Hi Paul, Lai,
>
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel, I've stumbled on the following spew:
>
> [ 598.188036] WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
> [ 598.188036] Modules linked in:
> [ 598.188036] CPU: 30 PID: 23771 Comm: trinity-c118 Not tainted 3.19.0-rc4-next-20150116-sasha-00055-gb8e1507-dirty #1745
> [ 598.188036] ffffffff926e52dc ffff8801b4403c38 ffffffff91439fb2 0000000000000000
> [ 598.188036] 0000000000000000 ffff8801b4403c78 ffffffff8e159e1a ffff8801b4403c58
> [ 598.188036] 0000000000000000 ffff88002f093000 00000000ffffffff ffff8801b23b9290
> [ 598.188036] Call Trace:
> [ 598.188036] <IRQ> dump_stack (lib/dump_stack.c:52)
> [ 598.212152] warn_slowpath_common (kernel/panic.c:447)
> [ 598.212152] warn_slowpath_null (kernel/panic.c:481)
> [ 598.212152] rcu_read_unlock_special (kernel/rcu/tree_plugin.h:337 (discriminator 9))
> [ 598.212152] ? select_task_rq_fair (include/linux/rcupdate.h:889 kernel/sched/fair.c:4740)
> [ 598.212152] __rcu_read_unlock (kernel/rcu/update.c:97)
> [ 598.212152] select_task_rq_fair (kernel/sched/fair.c:4805)
> [ 598.212152] ? select_task_rq_fair (include/linux/rcupdate.h:889 kernel/sched/fair.c:4740)
> [ 598.212152] ? try_to_wake_up (kernel/sched/core.c:1701)
> [ 598.212152] try_to_wake_up (kernel/sched/core.c:1415 kernel/sched/core.c:1729)
> [ 598.212152] wake_up_process (kernel/sched/core.c:1797 (discriminator 3))
> [ 598.212152] hrtimer_wakeup (kernel/time/hrtimer.c:1490)
> [ 598.212152] __run_hrtimer (kernel/time/hrtimer.c:1218 (discriminator 3))
> [ 598.212152] ? hrtimer_interrupt (kernel/time/hrtimer.c:622 kernel/time/hrtimer.c:1254)
> [ 598.212152] ? hrtimer_get_res (kernel/time/hrtimer.c:1480)
> [ 598.212152] hrtimer_interrupt (kernel/time/hrtimer.c:1307)
> [ 598.212152] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:921)
> [ 598.212152] smp_apic_timer_interrupt (./arch/x86/include/asm/apic.h:660 arch/x86/kernel/apic/apic.c:945)
> [ 598.212152] apic_timer_interrupt (arch/x86/kernel/entry_64.S:983)
> [ 598.212152] <EOI> ? context_tracking_user_enter (./arch/x86/include/asm/paravirt.h:809 kernel/context_tracking.c:106)
> [ 598.212152] syscall_trace_leave (arch/x86/kernel/ptrace.c:1640)
> [ 598.212152] int_check_syscall_exit_work (arch/x86/kernel/entry_64.S:577)
So RCU believes that an RCU read-side critical section that ended within
an interrupt handler (in this case, an hrtimer) somehow got preempted.
Which is not supposed to happen.
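For reference, the check that fires lives in rcu_read_unlock_special()
(kernel/rcu/tree_plugin.h, around line 337 in this tree); a paraphrased
excerpt, with the surrounding context omitted:

	/* Hardware IRQ handlers cannot block, complain if they get here. */
	if (WARN_ON_ONCE(in_irq() || in_serving_softirq())) {
		local_irq_restore(flags);
		return;
	}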
Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
and retry?
Thanx, Paul
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-18 23:22 ` Paul E. McKenney
@ 2015-01-20 15:39 ` Sasha Levin
2015-01-21 2:57 ` Paul E. McKenney
0 siblings, 1 reply; 30+ messages in thread
From: Sasha Levin @ 2015-01-20 15:39 UTC (permalink / raw)
To: paulmck; +Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML, Dave Jones
On 01/18/2015 06:22 PM, Paul E. McKenney wrote:
> On Sun, Jan 18, 2015 at 09:17:40AM -0500, Sasha Levin wrote:
>> > Hi Paul, Lai,
>> >
>> > While fuzzing with trinity inside a KVM tools guest running the latest -next
>> > kernel, I've stumbled on the following spew:
>> >
>> > [ 598.188036] WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
>> > [ 598.188036] Modules linked in:
>> > [ 598.188036] CPU: 30 PID: 23771 Comm: trinity-c118 Not tainted 3.19.0-rc4-next-20150116-sasha-00055-gb8e1507-dirty #1745
>> > [ 598.188036] ffffffff926e52dc ffff8801b4403c38 ffffffff91439fb2 0000000000000000
>> > [ 598.188036] 0000000000000000 ffff8801b4403c78 ffffffff8e159e1a ffff8801b4403c58
>> > [ 598.188036] 0000000000000000 ffff88002f093000 00000000ffffffff ffff8801b23b9290
>> > [ 598.188036] Call Trace:
>> > [ 598.188036] <IRQ> dump_stack (lib/dump_stack.c:52)
>> > [ 598.212152] warn_slowpath_common (kernel/panic.c:447)
>> > [ 598.212152] warn_slowpath_null (kernel/panic.c:481)
>> > [ 598.212152] rcu_read_unlock_special (kernel/rcu/tree_plugin.h:337 (discriminator 9))
>> > [ 598.212152] ? select_task_rq_fair (include/linux/rcupdate.h:889 kernel/sched/fair.c:4740)
>> > [ 598.212152] __rcu_read_unlock (kernel/rcu/update.c:97)
>> > [ 598.212152] select_task_rq_fair (kernel/sched/fair.c:4805)
>> > [ 598.212152] ? select_task_rq_fair (include/linux/rcupdate.h:889 kernel/sched/fair.c:4740)
>> > [ 598.212152] ? try_to_wake_up (kernel/sched/core.c:1701)
>> > [ 598.212152] try_to_wake_up (kernel/sched/core.c:1415 kernel/sched/core.c:1729)
>> > [ 598.212152] wake_up_process (kernel/sched/core.c:1797 (discriminator 3))
>> > [ 598.212152] hrtimer_wakeup (kernel/time/hrtimer.c:1490)
>> > [ 598.212152] __run_hrtimer (kernel/time/hrtimer.c:1218 (discriminator 3))
>> > [ 598.212152] ? hrtimer_interrupt (kernel/time/hrtimer.c:622 kernel/time/hrtimer.c:1254)
>> > [ 598.212152] ? hrtimer_get_res (kernel/time/hrtimer.c:1480)
>> > [ 598.212152] hrtimer_interrupt (kernel/time/hrtimer.c:1307)
>> > [ 598.212152] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:921)
>> > [ 598.212152] smp_apic_timer_interrupt (./arch/x86/include/asm/apic.h:660 arch/x86/kernel/apic/apic.c:945)
>> > [ 598.212152] apic_timer_interrupt (arch/x86/kernel/entry_64.S:983)
>> > [ 598.212152] <EOI> ? context_tracking_user_enter (./arch/x86/include/asm/paravirt.h:809 kernel/context_tracking.c:106)
>> > [ 598.212152] syscall_trace_leave (arch/x86/kernel/ptrace.c:1640)
>> > [ 598.212152] int_check_syscall_exit_work (arch/x86/kernel/entry_64.S:577)
> So RCU believes that an RCU read-side critical section that ended within
> an interrupt handler (in this case, an hrtimer) somehow got preempted.
> Which is not supposed to happen.
>
> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
> and retry?
I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
Thanks,
Sasha
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-20 15:39 ` Sasha Levin
@ 2015-01-21 2:57 ` Paul E. McKenney
2015-01-21 15:44 ` Sasha Levin
0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-21 2:57 UTC (permalink / raw)
To: Sasha Levin; +Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML, Dave Jones
On Tue, Jan 20, 2015 at 10:39:37AM -0500, Sasha Levin wrote:
> On 01/18/2015 06:22 PM, Paul E. McKenney wrote:
> > On Sun, Jan 18, 2015 at 09:17:40AM -0500, Sasha Levin wrote:
> >> > Hi Paul, Lai,
> >> >
> >> > While fuzzing with trinity inside a KVM tools guest running the latest -next
> >> > kernel, I've stumbled on the following spew:
> >> >
> >> > [ 598.188036] WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
> >> > [ 598.188036] Modules linked in:
> >> > [ 598.188036] CPU: 30 PID: 23771 Comm: trinity-c118 Not tainted 3.19.0-rc4-next-20150116-sasha-00055-gb8e1507-dirty #1745
> >> > [ 598.188036] ffffffff926e52dc ffff8801b4403c38 ffffffff91439fb2 0000000000000000
> >> > [ 598.188036] 0000000000000000 ffff8801b4403c78 ffffffff8e159e1a ffff8801b4403c58
> >> > [ 598.188036] 0000000000000000 ffff88002f093000 00000000ffffffff ffff8801b23b9290
> >> > [ 598.188036] Call Trace:
> >> > [ 598.188036] <IRQ> dump_stack (lib/dump_stack.c:52)
> >> > [ 598.212152] warn_slowpath_common (kernel/panic.c:447)
> >> > [ 598.212152] warn_slowpath_null (kernel/panic.c:481)
> >> > [ 598.212152] rcu_read_unlock_special (kernel/rcu/tree_plugin.h:337 (discriminator 9))
> >> > [ 598.212152] ? select_task_rq_fair (include/linux/rcupdate.h:889 kernel/sched/fair.c:4740)
> >> > [ 598.212152] __rcu_read_unlock (kernel/rcu/update.c:97)
> >> > [ 598.212152] select_task_rq_fair (kernel/sched/fair.c:4805)
> >> > [ 598.212152] ? select_task_rq_fair (include/linux/rcupdate.h:889 kernel/sched/fair.c:4740)
> >> > [ 598.212152] ? try_to_wake_up (kernel/sched/core.c:1701)
> >> > [ 598.212152] try_to_wake_up (kernel/sched/core.c:1415 kernel/sched/core.c:1729)
> >> > [ 598.212152] wake_up_process (kernel/sched/core.c:1797 (discriminator 3))
> >> > [ 598.212152] hrtimer_wakeup (kernel/time/hrtimer.c:1490)
> >> > [ 598.212152] __run_hrtimer (kernel/time/hrtimer.c:1218 (discriminator 3))
> >> > [ 598.212152] ? hrtimer_interrupt (kernel/time/hrtimer.c:622 kernel/time/hrtimer.c:1254)
> >> > [ 598.212152] ? hrtimer_get_res (kernel/time/hrtimer.c:1480)
> >> > [ 598.212152] hrtimer_interrupt (kernel/time/hrtimer.c:1307)
> >> > [ 598.212152] local_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:921)
> >> > [ 598.212152] smp_apic_timer_interrupt (./arch/x86/include/asm/apic.h:660 arch/x86/kernel/apic/apic.c:945)
> >> > [ 598.212152] apic_timer_interrupt (arch/x86/kernel/entry_64.S:983)
> >> > [ 598.212152] <EOI> ? context_tracking_user_enter (./arch/x86/include/asm/paravirt.h:809 kernel/context_tracking.c:106)
> >> > [ 598.212152] syscall_trace_leave (arch/x86/kernel/ptrace.c:1640)
> >> > [ 598.212152] int_check_syscall_exit_work (arch/x86/kernel/entry_64.S:577)
> > So RCU believes that an RCU read-side critical section that ended within
> > an interrupt handler (in this case, an hrtimer) somehow got preempted.
> > Which is not supposed to happen.
> >
> > Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
> > and retry?
>
> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
These handle the problems that Dave Jones, yourself, and a few others
located this past December. Could you please give them a spin?
Thanx, Paul
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-21 2:57 ` Paul E. McKenney
@ 2015-01-21 15:44 ` Sasha Levin
2015-01-22 0:43 ` Paul E. McKenney
0 siblings, 1 reply; 30+ messages in thread
From: Sasha Levin @ 2015-01-21 15:44 UTC (permalink / raw)
To: paulmck
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
Dave Jones <davej@codemonkey.org.uk>
On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
>>> So RCU believes that an RCU read-side critical section that ended within
>>> > > an interrupt handler (in this case, an hrtimer) somehow got preempted.
>>> > > Which is not supposed to happen.
>>> > >
>>> > > Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
>>> > > and retry?
>> >
>> > I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>
> These handle the problems that Dave Jones, yourself, and a few others
> located this past December. Could you please give them a spin?
They seem to be a part of -next already, so this testing already includes them.
I seem to be getting them about once a day, anything I can add to debug it?
Thanks,
Sasha
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-21 15:44 ` Sasha Levin
@ 2015-01-22 0:43 ` Paul E. McKenney
2015-01-23 3:29 ` Sasha Levin
0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-22 0:43 UTC (permalink / raw)
To: Sasha Levin
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
Dave Jones <davej@codemonkey.org.uk>
On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
> >>> So RCU believes that an RCU read-side critical section that ended within
> >>> > > an interrupt handler (in this case, an hrtimer) somehow got preempted.
> >>> > > Which is not supposed to happen.
> >>> > >
> >>> > > Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
> >>> > > and retry?
> >> >
> >> > I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
> > OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
> > their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> >
> > These handle the problems that Dave Jones, yourself, and a few others
> > located this past December. Could you please give them a spin?
>
> They seem to be a part of -next already, so this testing already includes them.
>
> I seem to be getting them about once a day, anything I can add to debug it?
Could you please try reproducing with the following patch?
Thanx, Paul
------------------------------------------------------------------------
rcu: Improve diagnostics for blocked critical sections in irq
If an RCU read-side critical section occurs within an interrupt handler
or a softirq handler, it cannot have been preempted. Therefore, there is
a check in rcu_read_unlock_special() checking for this error. However,
when this check triggers, it lacks diagnostic information. This commit
therefore moves rcu_read_unlock()'s lockdep annotation to follow the
call to __rcu_read_unlock() and changes rcu_read_unlock_special()'s
WARN_ON_ONCE() to a lockdep_rcu_suspicious() in order to locate where
the offending RCU read-side critical section began. In addition, the
value of the ->rcu_read_unlock_special field is printed.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 3e6afed51051..70b896e16f19 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -942,9 +942,9 @@ static inline void rcu_read_unlock(void)
{
rcu_lockdep_assert(rcu_is_watching(),
"rcu_read_unlock() used illegally while idle");
- rcu_lock_release(&rcu_lock_map);
__release(RCU);
__rcu_read_unlock();
+ rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */
}
/**
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index bb947ef1a2a4..8d2b497b52e9 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -334,7 +334,13 @@ void rcu_read_unlock_special(struct task_struct *t)
}
/* Hardware IRQ handlers cannot block, complain if they get here. */
- if (WARN_ON_ONCE(in_irq() || in_serving_softirq())) {
+ if (in_irq() || in_serving_softirq()) {
+ lockdep_rcu_suspicious(__FILE__, __LINE__,
+ "rcu_read_unlock() from irq or softirq with blocking in critical section!!!\n");
+ pr_alert("->rcu_read_unlock_special: %#x (b: %d, nq: %d)\n",
+ t->rcu_read_unlock_special.s,
+ t->rcu_read_unlock_special.b.blocked,
+ t->rcu_read_unlock_special.b.need_qs);
local_irq_restore(flags);
return;
}
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-22 0:43 ` Paul E. McKenney
@ 2015-01-23 3:29 ` Sasha Levin
2015-01-23 3:51 ` Paul E. McKenney
0 siblings, 1 reply; 30+ messages in thread
From: Sasha Levin @ 2015-01-23 3:29 UTC (permalink / raw)
To: paulmck
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
Dave Jones <davej@codemonkey.org.uk>
On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
> On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
>> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
>>>>> So RCU believes that an RCU read-side critical section that ended within
>>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
>>>>>>> Which is not supposed to happen.
>>>>>>>
>>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
>>>>>>> and retry?
>>>>>
>>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
>>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
>>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>>>
>>> These handle the problems that Dave Jones, yourself, and a few others
>>> located this past December. Could you please give them a spin?
>>
>> They seem to be a part of -next already, so this testing already includes them.
>>
>> I seem to be getting them about once a day, anything I can add to debug it?
>
> Could you please try reproducing with the following patch?
Yes, and I've got mixed results. It reproduced, and all I got was:
[ 717.645572] ===============================
[ 717.645572] [ INFO: suspicious RCU usage. ]
[ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
[ 717.645572] -------------------------------
[ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
[ 717.645572] !
[ 717.645572]
[ 717.645572] other info that might help us debug this:
[ 717.645572]
[ 717.645572]
[ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
[ 717.645572] 3 locks held by trinity-c29/16497:
[ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
[ 717.645572] #1:
[hang]
So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
should follow that.
I've removed the lockdep call and will re-run it.
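For concreteness, the hunk being re-run is roughly the one from the patch
above with the lockdep_rcu_suspicious() call dropped and the pr_alert() kept:

	/* Hardware IRQ handlers cannot block, complain if they get here. */
	if (in_irq() || in_serving_softirq()) {
		pr_alert("->rcu_read_unlock_special: %#x (b: %d, nq: %d)\n",
			 t->rcu_read_unlock_special.s,
			 t->rcu_read_unlock_special.b.blocked,
			 t->rcu_read_unlock_special.b.need_qs);
		local_irq_restore(flags);
		return;
	}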
Thanks,
Sasha
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 3:29 ` Sasha Levin
@ 2015-01-23 3:51 ` Paul E. McKenney
2015-01-23 4:02 ` Sasha Levin
0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-23 3:51 UTC (permalink / raw)
To: Sasha Levin
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
Dave Jones <davej@codemonkey.org.uk>
On Thu, Jan 22, 2015 at 10:29:01PM -0500, Sasha Levin wrote:
> On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
> > On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
> >> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
> >>>>> So RCU believes that an RCU read-side critical section that ended within
> >>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
> >>>>>>> Which is not supposed to happen.
> >>>>>>>
> >>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
> >>>>>>> and retry?
> >>>>>
> >>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
> >>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
> >>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
> >>>
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> >>>
> >>> These handle the problems that Dave Jones, yourself, and a few others
> >>> located this past December. Could you please give them a spin?
> >>
> >> They seem to be a part of -next already, so this testing already includes them.
> >>
> >> I seem to be getting them about once a day, anything I can add to debug it?
> >
> > Could you please try reproducing with the following patch?
>
> Yes, and I've got mixed results. It reproduced, and all I got was:
>
> [ 717.645572] ===============================
> [ 717.645572] [ INFO: suspicious RCU usage. ]
> [ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
> [ 717.645572] -------------------------------
> [ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
> [ 717.645572] !
> [ 717.645572]
> [ 717.645572] other info that might help us debug this:
> [ 717.645572]
> [ 717.645572]
> [ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
> [ 717.645572] 3 locks held by trinity-c29/16497:
> [ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
> [ 717.645572] #1:
> [hang]
>
> So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
> should follow that.
>
> I've removed the lockdep call and will re-run it.
Thank you! You are keeping the pr_alert(), correct?
Thanx, Paul
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 3:51 ` Paul E. McKenney
@ 2015-01-23 4:02 ` Sasha Levin
2015-01-23 4:05 ` Sasha Levin
0 siblings, 1 reply; 30+ messages in thread
From: Sasha Levin @ 2015-01-23 4:02 UTC (permalink / raw)
To: paulmck
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
Dave Jones <davej@codemonkey.org.uk>
On 01/22/2015 10:51 PM, Paul E. McKenney wrote:
> On Thu, Jan 22, 2015 at 10:29:01PM -0500, Sasha Levin wrote:
>> > On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
>>> > > On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
>>>> > >> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
>>>>>>> > >>>>> So RCU believes that an RCU read-side critical section that ended within
>>>>>>>>> > >>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
>>>>>>>>> > >>>>>>> Which is not supposed to happen.
>>>>>>>>> > >>>>>>>
>>>>>>>>> > >>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
>>>>>>>>> > >>>>>>> and retry?
>>>>>>> > >>>>>
>>>>>>> > >>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
>>>>> > >>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
>>>>> > >>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
>>>>> > >>>
>>>>> > >>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>>>>> > >>>
>>>>> > >>> These handle the problems that Dave Jones, yourself, and a few others
>>>>> > >>> located this past December. Could you please give them a spin?
>>>> > >>
>>>> > >> They seem to be a part of -next already, so this testing already includes them.
>>>> > >>
>>>> > >> I seem to be getting them about once a day, anything I can add to debug it?
>>> > >
>>> > > Could you please try reproducing with the following patch?
>> >
>> > Yes, and I've got mixed results. It reproduced, and all I got was:
>> >
>> > [ 717.645572] ===============================
>> > [ 717.645572] [ INFO: suspicious RCU usage. ]
>> > [ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
>> > [ 717.645572] -------------------------------
>> > [ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
>> > [ 717.645572] !
>> > [ 717.645572]
>> > [ 717.645572] other info that might help us debug this:
>> > [ 717.645572]
>> > [ 717.645572]
>> > [ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
>> > [ 717.645572] 3 locks held by trinity-c29/16497:
>> > [ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
>> > [ 717.645572] #1:
>> > [hang]
>> >
>> > So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
>> > should follow that.
>> >
>> > I've removed the lockdep call and will re-run it.
> Thank you! You are keeping the pr_alert(), correct?
Yup, just the lockdep call goes away.
Thanks,
Sasha
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 4:02 ` Sasha Levin
@ 2015-01-23 4:05 ` Sasha Levin
2015-01-23 6:55 ` Paul E. McKenney
0 siblings, 1 reply; 30+ messages in thread
From: Sasha Levin @ 2015-01-23 4:05 UTC (permalink / raw)
To: paulmck
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
Dave Jones <davej@codemonkey.org.uk>
On 01/22/2015 11:02 PM, Sasha Levin wrote:
> On 01/22/2015 10:51 PM, Paul E. McKenney wrote:
>> On Thu, Jan 22, 2015 at 10:29:01PM -0500, Sasha Levin wrote:
>>>> On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
>>>>>> On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
>>>>>>>> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
>>>>>>>>>>>>>> So RCU believes that an RCU read-side critical section that ended within
>>>>>>>>>>>>>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
>>>>>>>>>>>>>>>>>> Which is not supposed to happen.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
>>>>>>>>>>>>>>>>>> and retry?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
>>>>>>>>>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
>>>>>>>>>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
>>>>>>>>>>
>>>>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>>>>>>>>>>
>>>>>>>>>> These handle the problems that Dave Jones, yourself, and a few others
>>>>>>>>>> located this past December. Could you please give them a spin?
>>>>>>>>
>>>>>>>> They seem to be a part of -next already, so this testing already includes them.
>>>>>>>>
>>>>>>>> I seem to be getting them about once a day, anything I can add to debug it?
>>>>>>
>>>>>> Could you please try reproducing with the following patch?
>>>>
>>>> Yes, and I've got mixed results. It reproduced, and all I got was:
>>>>
>>>> [ 717.645572] ===============================
>>>> [ 717.645572] [ INFO: suspicious RCU usage. ]
>>>> [ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
>>>> [ 717.645572] -------------------------------
>>>> [ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
>>>> [ 717.645572] !
>>>> [ 717.645572]
>>>> [ 717.645572] other info that might help us debug this:
>>>> [ 717.645572]
>>>> [ 717.645572]
>>>> [ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
>>>> [ 717.645572] 3 locks held by trinity-c29/16497:
>>>> [ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
>>>> [ 717.645572] #1:
>>>> [hang]
>>>>
>>>> So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
>>>> should follow that.
>>>>
>>>> I've removed the lockdep call and will re-run it.
>> Thank you! You are keeping the pr_alert(), correct?
>
> Yup, just the lockdep call goes away.
Okay, this reproduced faster than I anticipated:
[ 786.160131] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
[ 786.239513] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
[ 786.240503] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
[ 786.242575] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
[ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
[ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
[ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
[ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
[ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
It seems like the WARN_ON_ONCE was hiding the fact that it actually got hit a couple
of times in a very short interval. Maybe that would also explain lockdep crapping
itself.
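For reference, ->rcu_read_unlock_special is a union of a flags struct and a
scalar; in this era it is defined in include/linux/sched.h roughly as below
(a paraphrased sketch, details may differ slightly), so 0x100 is just the
need_qs byte set with blocked clear, matching the (b: 0, nq: 1) decode above:

	union rcu_special {
		struct {
			bool blocked;
			bool need_qs;
		} b;
		short s;
	};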
Thanks,
Sasha
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 4:05 ` Sasha Levin
@ 2015-01-23 6:55 ` Paul E. McKenney
2015-01-23 8:41 ` Lai Jiangshan
` (2 more replies)
0 siblings, 3 replies; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-23 6:55 UTC (permalink / raw)
To: Sasha Levin
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
Dave Jones <davej@codemonkey.org.uk>
On Thu, Jan 22, 2015 at 11:05:45PM -0500, Sasha Levin wrote:
> On 01/22/2015 11:02 PM, Sasha Levin wrote:
> > On 01/22/2015 10:51 PM, Paul E. McKenney wrote:
> >> On Thu, Jan 22, 2015 at 10:29:01PM -0500, Sasha Levin wrote:
> >>>> On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
> >>>>>> On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
> >>>>>>>> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
> >>>>>>>>>>>>>> So RCU believes that an RCU read-side critical section that ended within
> >>>>>>>>>>>>>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
> >>>>>>>>>>>>>>>>>> Which is not supposed to happen.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
> >>>>>>>>>>>>>>>>>> and retry?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
> >>>>>>>>>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
> >>>>>>>>>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
> >>>>>>>>>>
> >>>>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> >>>>>>>>>>
> >>>>>>>>>> These handle the problems that Dave Jones, yourself, and a few others
> >>>>>>>>>> located this past December. Could you please give them a spin?
> >>>>>>>>
> >>>>>>>> They seem to be a part of -next already, so this testing already includes them.
> >>>>>>>>
> >>>>>>>> I seem to be getting them about once a day, anything I can add to debug it?
> >>>>>>
> >>>>>> Could you please try reproducing with the following patch?
> >>>>
> >>>> Yes, and I've got mixed results. It reproduced, and all I got was:
> >>>>
> >>>> [ 717.645572] ===============================
> >>>> [ 717.645572] [ INFO: suspicious RCU usage. ]
> >>>> [ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
> >>>> [ 717.645572] -------------------------------
> >>>> [ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
> >>>> [ 717.645572] !
> >>>> [ 717.645572]
> >>>> [ 717.645572] other info that might help us debug this:
> >>>> [ 717.645572]
> >>>> [ 717.645572]
> >>>> [ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
> >>>> [ 717.645572] 3 locks held by trinity-c29/16497:
> >>>> [ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
> >>>> [ 717.645572] #1:
> >>>> [hang]
> >>>>
> >>>> So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
> >>>> should follow that.
> >>>>
> >>>> I've removed the lockdep call and will re-run it.
> >> Thank you! You are keeping the pr_alert(), correct?
> >
> > Yup, just the lockdep call goes away.
>
> Okay, this reproduced faster than I anticipated:
>
> [ 786.160131] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> [ 786.239513] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> [ 786.240503] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> [ 786.242575] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>
> It seems like the WARN_ON_ONCE was hiding the fact it actually got hit couple
> of times in a very short interval. Maybe that would also explain lockdep crapping
> itself.
OK, that was what I thought was the situation. I have not yet fully
worked out how RCU gets into that state, but in the meantime, here
is a patch that should prevent the splats. (It requires a subtle
interaction of quiescent-state detection and the scheduling-clock
interrupt.)
Thanx, Paul
------------------------------------------------------------------------
rcu: Clear need_qs flag to prevent splat
If the scheduling-clock interrupt sets the current task's need_qs flag,
but if the current CPU passes through a quiescent state in the meantime,
then rcu_preempt_qs() will fail to clear the need_qs flag, which can fool
RCU into thinking that additional rcu_read_unlock_special() processing
is needed. This commit therefore clears the need_qs flag before checking
for additional processing.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 8669de884445..ec99dc16aa38 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
special = t->rcu_read_unlock_special;
if (special.b.need_qs) {
rcu_preempt_qs();
+ t->rcu_read_unlock_special.b.need_qs = false;
if (!t->rcu_read_unlock_special.s) {
local_irq_restore(flags);
return;
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 6:55 ` Paul E. McKenney
@ 2015-01-23 8:41 ` Lai Jiangshan
2015-01-23 9:38 ` Paul E. McKenney
2015-01-23 9:16 ` Lai Jiangshan
2015-01-23 9:36 ` Paul E. McKenney
2 siblings, 1 reply; 30+ messages in thread
From: Lai Jiangshan @ 2015-01-23 8:41 UTC (permalink / raw)
To: paulmck, Sasha Levin
Cc: Ingo Molnar, Peter Zijlstra, LKML,
Dave Jones <davej@codemonkey.org.uk>
On 01/23/2015 02:55 PM, Paul E. McKenney wrote:
> On Thu, Jan 22, 2015 at 11:05:45PM -0500, Sasha Levin wrote:
>> On 01/22/2015 11:02 PM, Sasha Levin wrote:
>>> On 01/22/2015 10:51 PM, Paul E. McKenney wrote:
>>>> On Thu, Jan 22, 2015 at 10:29:01PM -0500, Sasha Levin wrote:
>>>>>> On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
>>>>>>>> On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
>>>>>>>>>> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
>>>>>>>>>>>>>>>> So RCU believes that an RCU read-side critical section that ended within
>>>>>>>>>>>>>>>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
>>>>>>>>>>>>>>>>>>>> Which is not supposed to happen.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
>>>>>>>>>>>>>>>>>>>> and retry?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
>>>>>>>>>>>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
>>>>>>>>>>>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
>>>>>>>>>>>>
>>>>>>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>>>>>>>>>>>>
>>>>>>>>>>>> These handle the problems that Dave Jones, yourself, and a few others
>>>>>>>>>>>> located this past December. Could you please give them a spin?
>>>>>>>>>>
>>>>>>>>>> They seem to be a part of -next already, so this testing already includes them.
>>>>>>>>>>
>>>>>>>>>> I seem to be getting them about once a day, anything I can add to debug it?
>>>>>>>>
>>>>>>>> Could you please try reproducing with the following patch?
>>>>>>
>>>>>> Yes, and I've got mixed results. It reproduced, and all I got was:
>>>>>>
>>>>>> [ 717.645572] ===============================
>>>>>> [ 717.645572] [ INFO: suspicious RCU usage. ]
>>>>>> [ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
>>>>>> [ 717.645572] -------------------------------
>>>>>> [ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
>>>>>> [ 717.645572] !
>>>>>> [ 717.645572]
>>>>>> [ 717.645572] other info that might help us debug this:
>>>>>> [ 717.645572]
>>>>>> [ 717.645572]
>>>>>> [ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
>>>>>> [ 717.645572] 3 locks held by trinity-c29/16497:
>>>>>> [ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
>>>>>> [ 717.645572] #1:
>>>>>> [hang]
>>>>>>
>>>>>> So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
>>>>>> should follow that.
>>>>>>
>>>>>> I've removed the lockdep call and will re-run it.
>>>> Thank you! You are keeping the pr_alert(), correct?
>>>
>>> Yup, just the lockdep call goes away.
>>
>> Okay, this reproduced faster than I anticipated:
>>
>> [ 786.160131] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.239513] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.240503] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.242575] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>>
>> It seems like the WARN_ON_ONCE was hiding the fact it actually got hit couple
>> of times in a very short interval. Maybe that would also explain lockdep crapping
>> itself.
>
> OK, that was what I thought was the situation. I have not yet fully
> worked out how RCU gets into that state, but in the meantime, here
> is a patch that should prevent the splats. (It requires a subtle
> interaction of quiescent-state detection and the scheduling-clock
> interrupt.)
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Clear need_qs flag to prevent splat
>
> If the scheduling-clock interrupt sets the current task's need_qs flag,
> but if the current CPU passes through a quiescent state in the meantime,
How do the above things happen?
"The scheduling-clock interrupt" runs with local irqs disabled.
> then rcu_preempt_qs() will fail to clear the need_qs flag, which can fool
> RCU into thinking that additional rcu_read_unlock_special() processing
> is needed. This commit therefore clears the need_qs flag before checking
> for additional processing.
>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 8669de884445..ec99dc16aa38 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
> special = t->rcu_read_unlock_special;
> if (special.b.need_qs) {
> rcu_preempt_qs();
> + t->rcu_read_unlock_special.b.need_qs = false;
> if (!t->rcu_read_unlock_special.s) {
> local_irq_restore(flags);
> return;
>
> .
>
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 6:55 ` Paul E. McKenney
2015-01-23 8:41 ` Lai Jiangshan
@ 2015-01-23 9:16 ` Lai Jiangshan
2015-01-23 9:48 ` Paul E. McKenney
2015-01-23 9:36 ` Paul E. McKenney
2 siblings, 1 reply; 30+ messages in thread
From: Lai Jiangshan @ 2015-01-23 9:16 UTC (permalink / raw)
To: paulmck, Sasha Levin
Cc: Ingo Molnar, Peter Zijlstra, LKML,
Dave Jones <davej@codemonkey.org.uk>
On 01/23/2015 02:55 PM, Paul E. McKenney wrote:
> On Thu, Jan 22, 2015 at 11:05:45PM -0500, Sasha Levin wrote:
>> On 01/22/2015 11:02 PM, Sasha Levin wrote:
>>> On 01/22/2015 10:51 PM, Paul E. McKenney wrote:
>>>> On Thu, Jan 22, 2015 at 10:29:01PM -0500, Sasha Levin wrote:
>>>>>> On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
>>>>>>>> On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
>>>>>>>>>> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
>>>>>>>>>>>>>>>> So RCU believes that an RCU read-side critical section that ended within
>>>>>>>>>>>>>>>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
>>>>>>>>>>>>>>>>>>>> Which is not supposed to happen.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
>>>>>>>>>>>>>>>>>>>> and retry?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
>>>>>>>>>>>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
>>>>>>>>>>>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
>>>>>>>>>>>>
>>>>>>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>>>>>>>>>>>>
>>>>>>>>>>>> These handle the problems that Dave Jones, yourself, and a few others
>>>>>>>>>>>> located this past December. Could you please give them a spin?
>>>>>>>>>>
>>>>>>>>>> They seem to be a part of -next already, so this testing already includes them.
>>>>>>>>>>
>>>>>>>>>> I seem to be getting them about once a day, anything I can add to debug it?
>>>>>>>>
>>>>>>>> Could you please try reproducing with the following patch?
>>>>>>
>>>>>> Yes, and I've got mixed results. It reproduced, and all I got was:
>>>>>>
>>>>>> [ 717.645572] ===============================
>>>>>> [ 717.645572] [ INFO: suspicious RCU usage. ]
>>>>>> [ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
>>>>>> [ 717.645572] -------------------------------
>>>>>> [ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
>>>>>> [ 717.645572] !
>>>>>> [ 717.645572]
>>>>>> [ 717.645572] other info that might help us debug this:
>>>>>> [ 717.645572]
>>>>>> [ 717.645572]
>>>>>> [ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
>>>>>> [ 717.645572] 3 locks held by trinity-c29/16497:
>>>>>> [ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
>>>>>> [ 717.645572] #1:
>>>>>> [hang]
>>>>>>
>>>>>> So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
>>>>>> should follow that.
>>>>>>
>>>>>> I've removed the lockdep call and will re-run it.
>>>> Thank you! You are keeping the pr_alert(), correct?
>>>
>>> Yup, just the lockdep call goes away.
>>
>> Okay, this reproduced faster than I anticipated:
>>
>> [ 786.160131] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.239513] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.240503] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.242575] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>>
>> It seems like the WARN_ON_ONCE was hiding the fact it actually got hit couple
>> of times in a very short interval. Maybe that would also explain lockdep crapping
>> itself.
>
> OK, that was what I thought was the situation. I have not yet fully
> worked out how RCU gets into that state, but in the meantime, here
> is a patch that should prevent the splats. (It requires a subtle
> interaction of quiescent-state detection and the scheduling-clock
> interrupt.)
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Clear need_qs flag to prevent splat
>
> If the scheduling-clock interrupt sets the current task's need_qs flag,
> but if the current CPU passes through a quiescent state in the meantime,
> then rcu_preempt_qs() will fail to clear the need_qs flag, which can fool
> RCU into thinking that additional rcu_read_unlock_special() processing
> is needed. This commit therefore clears the need_qs flag before checking
> for additional processing.
>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 8669de884445..ec99dc16aa38 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
> special = t->rcu_read_unlock_special;
> if (special.b.need_qs) {
> rcu_preempt_qs();
> + t->rcu_read_unlock_special.b.need_qs = false;
> if (!t->rcu_read_unlock_special.s) {
> local_irq_restore(flags);
> return;
>
> .
rcu_preempt_qs() can be called from rcu_preempt_note_context_switch()
without irqs disabled. I think it is dangerous, since it touches need_qs and
passed_quiesce directly and touches rcu_read_unlock_special.b.blocked and
qs_pending indirectly. At least it adds to the burden of understanding them all.
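For context, a paraphrased sketch of that call path (details elided); the
scheduler invokes rcu_note_context_switch() before taking the runqueue lock,
so interrupts can indeed still be enabled here:

	static void rcu_preempt_note_context_switch(void)
	{
		struct task_struct *t = current;

		/* ... if t was preempted inside an RCU read-side critical
		   section, queue it on rnp->blkd_tasks under rnp->lock ... */

		rcu_preempt_qs();	/* irqs may well be enabled here */
	}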
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 6:55 ` Paul E. McKenney
2015-01-23 8:41 ` Lai Jiangshan
2015-01-23 9:16 ` Lai Jiangshan
@ 2015-01-23 9:36 ` Paul E. McKenney
2015-01-23 21:51 ` Sasha Levin
2015-01-24 2:18 ` Lai Jiangshan
2 siblings, 2 replies; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-23 9:36 UTC (permalink / raw)
To: Sasha Levin
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
Dave Jones <davej@codemonkey.org.uk>
On Thu, Jan 22, 2015 at 10:55:42PM -0800, Paul E. McKenney wrote:
> On Thu, Jan 22, 2015 at 11:05:45PM -0500, Sasha Levin wrote:
> > On 01/22/2015 11:02 PM, Sasha Levin wrote:
> > > On 01/22/2015 10:51 PM, Paul E. McKenney wrote:
> > >> On Thu, Jan 22, 2015 at 10:29:01PM -0500, Sasha Levin wrote:
> > >>>> On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
> > >>>>>> On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
> > >>>>>>>> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
> > >>>>>>>>>>>>>> So RCU believes that an RCU read-side critical section that ended within
> > >>>>>>>>>>>>>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
> > >>>>>>>>>>>>>>>>>> Which is not supposed to happen.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
> > >>>>>>>>>>>>>>>>>> and retry?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
> > >>>>>>>>>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
> > >>>>>>>>>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
> > >>>>>>>>>>
> > >>>>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> > >>>>>>>>>>
> > >>>>>>>>>> These handle the problems that Dave Jones, yourself, and a few others
> > >>>>>>>>>> located this past December. Could you please give them a spin?
> > >>>>>>>>
> > >>>>>>>> They seem to be a part of -next already, so this testing already includes them.
> > >>>>>>>>
> > >>>>>>>> I seem to be getting them about once a day, anything I can add to debug it?
> > >>>>>>
> > >>>>>> Could you please try reproducing with the following patch?
> > >>>>
> > >>>> Yes, and I've got mixed results. It reproduced, and all I got was:
> > >>>>
> > >>>> [ 717.645572] ===============================
> > >>>> [ 717.645572] [ INFO: suspicious RCU usage. ]
> > >>>> [ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
> > >>>> [ 717.645572] -------------------------------
> > >>>> [ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
> > >>>> [ 717.645572] !
> > >>>> [ 717.645572]
> > >>>> [ 717.645572] other info that might help us debug this:
> > >>>> [ 717.645572]
> > >>>> [ 717.645572]
> > >>>> [ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
> > >>>> [ 717.645572] 3 locks held by trinity-c29/16497:
> > >>>> [ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
> > >>>> [ 717.645572] #1:
> > >>>> [hang]
> > >>>>
> > >>>> So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
> > >>>> should follow that.
> > >>>>
> > >>>> I've removed the lockdep call and will re-run it.
> > >> Thank you! You are keeping the pr_alert(), correct?
> > >
> > > Yup, just the lockdep call goes away.
> >
> > Okay, this reproduced faster than I anticipated:
> >
> > [ 786.160131] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> > [ 786.239513] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> > [ 786.240503] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> > [ 786.242575] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> > [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> > [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> > [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> > [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> > [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >
> > It seems like the WARN_ON_ONCE was hiding the fact it actually got hit couple
> > of times in a very short interval. Maybe that would also explain lockdep crapping
> > itself.
>
> OK, that was what I thought was the situation. I have not yet fully
> worked out how RCU gets into that state, but in the meantime, here
> is a patch that should prevent the splats. (It requires a subtle
> interaction of quiescent-state detection and the scheduling-clock
> interrupt.)
And I did finally figure out how this can happen. Please see below
for an updated patch with this information recorded in the commit log.
Sasha, I am impressed -- your testing not only located a true RCU bug,
but an RCU bug that can happen on a uniprocessor! ;-)
As far as I know, the bug is harmless apart from the splat, but still...
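For reference, a rough sketch of rcu_preempt_qs() as it looks in this tree
(paraphrased, tracing omitted, details may differ slightly); the clearing of
->rcu_read_unlock_special.b.need_qs sits inside the "if", so it is skipped
once passed_quiesce is already true, which is step 7 in the log below:

	static void rcu_preempt_qs(void)
	{
		if (!__this_cpu_read(rcu_preempt_data.passed_quiesce)) {
			/* Record the quiescent state for this CPU... */
			__this_cpu_write(rcu_preempt_data.passed_quiesce, 1);
			barrier();
			/* ...and only then clear the need_qs flag. */
			current->rcu_read_unlock_special.b.need_qs = false;
		}
	}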
Thanx, Paul
------------------------------------------------------------------------
rcu: Clear need_qs flag to prevent splat
If the scheduling-clock interrupt sets the current task's need_qs flag,
but if the current CPU passes through a quiescent state in the meantime,
then rcu_preempt_qs() will fail to clear the need_qs flag, which can fool
RCU into thinking that additional rcu_read_unlock_special() processing
is needed. This commit therefore clears the need_qs flag before checking
for additional processing.
For this problem to occur, we need rcu_preempt_data.passed_quiesce equal
to true and current->rcu_read_unlock_special.b.need_qs also equal to true.
This condition can occur as follows:
1. CPU 0 is aware of the current preemptible RCU grace period,
but has not yet passed through a quiescent state. Among other
things, this means that rcu_preempt_data.passed_quiesce is false.
2. Task A running on CPU 0 enters a preemptible RCU read-side
critical section.
3. CPU 0 takes a scheduling-clock interrupt, which notices the
RCU read-side critical section and the need for a quiescent state,
and thus sets current->rcu_read_unlock_special.b.need_qs to true.
4. Task A is preempted, enters the scheduler, eventually invoking
rcu_preempt_note_context_switch() which in turn invokes
rcu_preempt_qs().
Because rcu_preempt_data.passed_quiesce is false,
control enters the body of the "if" statement, which sets
rcu_preempt_data.passed_quiesce to true.
5. At this point, CPU 0 takes an interrupt. The interrupt
handler contains an RCU read-side critical section, and
the rcu_read_unlock() notes that current->rcu_read_unlock_special
is nonzero, and thus invokes rcu_read_unlock_special().
6. Once in rcu_read_unlock_special(), the fact that
current->rcu_read_unlock_special.b.need_qs is true becomes
apparent, so rcu_read_unlock_special() invokes rcu_preempt_qs().
Recursively, given that we interrupted out of that same
function in the preceding step.
7. Because rcu_preempt_data.passed_quiesce is now true,
rcu_preempt_qs() does nothing, and simply returns.
8. Upon return to rcu_read_unlock_special(), it is noted that
current->rcu_read_unlock_special is still nonzero (because
the interrupted rcu_preempt_qs() had not yet gotten around
to clearing current->rcu_read_unlock_special.b.need_qs).
9. Execution proceeds to the WARN_ON_ONCE(), which notes that
we are in an interrupt handler and thus duly splats.
The solution, as noted above, is to make rcu_read_unlock_special()
clear out current->rcu_read_unlock_special.b.need_qs after calling
rcu_preempt_qs(). The interrupted rcu_preempt_qs() will clear it again,
but this is harmless. The worst that happens is that we clobber another
attempt to set this field, but this is not a problem because we just
got done reporting a quiescent state.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 8669de884445..ec99dc16aa38 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
special = t->rcu_read_unlock_special;
if (special.b.need_qs) {
rcu_preempt_qs();
+ t->rcu_read_unlock_special.b.need_qs = false;
if (!t->rcu_read_unlock_special.s) {
local_irq_restore(flags);
return;
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 8:41 ` Lai Jiangshan
@ 2015-01-23 9:38 ` Paul E. McKenney
0 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-23 9:38 UTC (permalink / raw)
To: Lai Jiangshan
Cc: Sasha Levin, Ingo Molnar, Peter Zijlstra, LKML,
Dave Jones <davej@codemonkey.org.uk>
On Fri, Jan 23, 2015 at 04:41:12PM +0800, Lai Jiangshan wrote:
> On 01/23/2015 02:55 PM, Paul E. McKenney wrote:
> > On Thu, Jan 22, 2015 at 11:05:45PM -0500, Sasha Levin wrote:
> >> On 01/22/2015 11:02 PM, Sasha Levin wrote:
> >>> On 01/22/2015 10:51 PM, Paul E. McKenney wrote:
> >>>> On Thu, Jan 22, 2015 at 10:29:01PM -0500, Sasha Levin wrote:
> >>>>>> On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
> >>>>>>>> On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
> >>>>>>>>>> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
> >>>>>>>>>>>>>>>> So RCU believes that an RCU read-side critical section that ended within
> >>>>>>>>>>>>>>>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
> >>>>>>>>>>>>>>>>>>>> Which is not supposed to happen.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
> >>>>>>>>>>>>>>>>>>>> and retry?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
> >>>>>>>>>>>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
> >>>>>>>>>>>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
> >>>>>>>>>>>>
> >>>>>>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> >>>>>>>>>>>>
> >>>>>>>>>>>> These handle the problems that Dave Jones, yourself, and a few others
> >>>>>>>>>>>> located this past December. Could you please give them a spin?
> >>>>>>>>>>
> >>>>>>>>>> They seem to be a part of -next already, so this testing already includes them.
> >>>>>>>>>>
> >>>>>>>>>> I seem to be getting them about once a day, anything I can add to debug it?
> >>>>>>>>
> >>>>>>>> Could you please try reproducing with the following patch?
> >>>>>>
> >>>>>> Yes, and I've got mixed results. It reproduced, and all I got was:
> >>>>>>
> >>>>>> [ 717.645572] ===============================
> >>>>>> [ 717.645572] [ INFO: suspicious RCU usage. ]
> >>>>>> [ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
> >>>>>> [ 717.645572] -------------------------------
> >>>>>> [ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
> >>>>>> [ 717.645572] !
> >>>>>> [ 717.645572]
> >>>>>> [ 717.645572] other info that might help us debug this:
> >>>>>> [ 717.645572]
> >>>>>> [ 717.645572]
> >>>>>> [ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
> >>>>>> [ 717.645572] 3 locks held by trinity-c29/16497:
> >>>>>> [ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
> >>>>>> [ 717.645572] #1:
> >>>>>> [hang]
> >>>>>>
> >>>>>> So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
> >>>>>> should follow that.
> >>>>>>
> >>>>>> I've removed the lockdep call and will re-run it.
> >>>> Thank you! You are keeping the pr_alert(), correct?
> >>>
> >>> Yup, just the lockdep call goes away.
> >>
> >> Okay, this reproduced faster than I anticipated:
> >>
> >> [ 786.160131] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.239513] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.240503] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.242575] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >>
> >> It seems like the WARN_ON_ONCE was hiding the fact it actually got hit couple
> >> of times in a very short interval. Maybe that would also explain lockdep crapping
> >> itself.
> >
> > OK, that was what I thought was the situation. I have not yet fully
> > worked out how RCU gets into that state, but in the meantime, here
> > is a patch that should prevent the splats. (It requires a subtle
> > interaction of quiescent-state detection and the scheduling-clock
> > interrupt.)
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > rcu: Clear need_qs flag to prevent splat
> >
> > If the scheduling-clock interrupt sets the current tasks need_qs flag,
> > but if the current CPU passes through a quiescent state in the meantime,
>
> How does the above things happen?
> "the scheduling-clock interrupt" is local-irq-disabled.
The scheduling-clock interrupt just sets things up. The remainder of the
failure scenario happens after this interrupt returns. See my earlier
email for the whole sad story. ;-)
Thanx, Paul
> > then rcu_preempt_qs() will fail to clear the need_qs flag, which can fool
> > RCU into thinking that additional rcu_read_unlock_special() processing
> > is needed. This commit therefore clears the need_qs flag before checking
> > for additional processing.
> >
> > Reported-by: Sasha Levin <sasha.levin@oracle.com>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 8669de884445..ec99dc16aa38 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
> > special = t->rcu_read_unlock_special;
> > if (special.b.need_qs) {
> > rcu_preempt_qs();
> > + t->rcu_read_unlock_special.need_qs = false;
> > if (!t->rcu_read_unlock_special.s) {
> > local_irq_restore(flags);
> > return;
> >
> > .
> >
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 9:16 ` Lai Jiangshan
@ 2015-01-23 9:48 ` Paul E. McKenney
0 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-23 9:48 UTC (permalink / raw)
To: Lai Jiangshan
Cc: Sasha Levin, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On Fri, Jan 23, 2015 at 05:16:46PM +0800, Lai Jiangshan wrote:
> On 01/23/2015 02:55 PM, Paul E. McKenney wrote:
> > On Thu, Jan 22, 2015 at 11:05:45PM -0500, Sasha Levin wrote:
> >> On 01/22/2015 11:02 PM, Sasha Levin wrote:
> >>> On 01/22/2015 10:51 PM, Paul E. McKenney wrote:
> >>>> On Thu, Jan 22, 2015 at 10:29:01PM -0500, Sasha Levin wrote:
> >>>>>> On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
> >>>>>>>> On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
> >>>>>>>>>> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
> >>>>>>>>>>>>>>>> So RCU believes that an RCU read-side critical section that ended within
> >>>>>>>>>>>>>>>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
> >>>>>>>>>>>>>>>>>>>> Which is not supposed to happen.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
> >>>>>>>>>>>>>>>>>>>> and retry?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
> >>>>>>>>>>>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
> >>>>>>>>>>>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
> >>>>>>>>>>>>
> >>>>>>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> >>>>>>>>>>>>
> >>>>>>>>>>>> These handle the problems that Dave Jones, yourself, and a few others
> >>>>>>>>>>>> located this past December. Could you please give them a spin?
> >>>>>>>>>>
> >>>>>>>>>> They seem to be a part of -next already, so this testing already includes them.
> >>>>>>>>>>
> >>>>>>>>>> I seem to be getting them about once a day, anything I can add to debug it?
> >>>>>>>>
> >>>>>>>> Could you please try reproducing with the following patch?
> >>>>>>
> >>>>>> Yes, and I've got mixed results. It reproduced, and all I got was:
> >>>>>>
> >>>>>> [ 717.645572] ===============================
> >>>>>> [ 717.645572] [ INFO: suspicious RCU usage. ]
> >>>>>> [ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
> >>>>>> [ 717.645572] -------------------------------
> >>>>>> [ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
> >>>>>> [ 717.645572] !
> >>>>>> [ 717.645572]
> >>>>>> [ 717.645572] other info that might help us debug this:
> >>>>>> [ 717.645572]
> >>>>>> [ 717.645572]
> >>>>>> [ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
> >>>>>> [ 717.645572] 3 locks held by trinity-c29/16497:
> >>>>>> [ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
> >>>>>> [ 717.645572] #1:
> >>>>>> [hang]
> >>>>>>
> >>>>>> So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
> >>>>>> should follow that.
> >>>>>>
> >>>>>> I've removed the lockdep call and will re-run it.
> >>>> Thank you! You are keeping the pr_alert(), correct?
> >>>
> >>> Yup, just the lockdep call goes away.
> >>
> >> Okay, this reproduced faster than I anticipated:
> >>
> >> [ 786.160131] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.239513] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.240503] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.242575] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >>
> >> It seems like the WARN_ON_ONCE was hiding the fact it actually got hit couple
> >> of times in a very short interval. Maybe that would also explain lockdep crapping
> >> itself.
> >
> > OK, that was what I thought was the situation. I have not yet fully
> > worked out how RCU gets into that state, but in the meantime, here
> > is a patch that should prevent the splats. (It requires a subtle
> > interaction of quiescent-state detection and the scheduling-clock
> > interrupt.)
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > rcu: Clear need_qs flag to prevent splat
> >
> > If the scheduling-clock interrupt sets the current tasks need_qs flag,
> > but if the current CPU passes through a quiescent state in the meantime,
> > then rcu_preempt_qs() will fail to clear the need_qs flag, which can fool
> > RCU into thinking that additional rcu_read_unlock_special() processing
> > is needed. This commit therefore clears the need_qs flag before checking
> > for additional processing.
> >
> > Reported-by: Sasha Levin <sasha.levin@oracle.com>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 8669de884445..ec99dc16aa38 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
> > special = t->rcu_read_unlock_special;
> > if (special.b.need_qs) {
> > rcu_preempt_qs();
> > + t->rcu_read_unlock_special.need_qs = false;
> > if (!t->rcu_read_unlock_special.s) {
> > local_irq_restore(flags);
> > return;
> >
> > .
>
>
> rcu_preempt_qs() can be called from rcu_preempt_note_context_switch()
> without irq-disabled. I think it is dangerous, since it touches need_qs and
> passed_quiesce directly and touches rcu_read_unlock_special.b.blocked and
> qs_pending indirectly. At least it addes burden for me to understand them all.
Yep, disabling interrupts across the call to rcu_preempt_qs() in
rcu_preempt_note_context_switch() would be another way to fix this.
And in fact the code used to disable interrupts across this call.
But I was informed that interrupt disabling in the scheduler fastpath
was socially irresponsible, so I removed it, obviously failing to think
it through. So the fix above should cover things without re-introducing
the overhead on the scheduler fastpath.
Make sense, or is there some other problem with this?
Here are some that I considered and dismissed as non-problems:
1. We could take an interrupt just before recording the quiescent
state, do RCU_SOFTIRQ on return from that interrupt, notice a new
grace period, and upon return from interrupt to rcu_preempt_qs()
record a quiescent state.
But that is OK, because we are in rcu_preempt_qs(), and thus
are in a quiescent state both before and after the interrupt,
regardless of what the grace period number might be.
2. As #1 above, but we get interrupted just after recording
the quiescent state instead of just before.
This is also OK. When the new grace period is noticed, the
quiescent-state information is cleared. The current quiescent
state is ignored, but there will be another quiescent state
along at some point in the near future.
3. As #1 above, but after clearing need_qs. Same result as #2.
Any that I missed?
Thanx, Paul
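A sketch of the alternative fix referred to above, re-introducing interrupt
disabling around the quiescent-state report on the context-switch path, is
shown below. This is purely illustrative and not a patch against any
particular tree; the body of rcu_preempt_note_context_switch() is elided.

    static void rcu_preempt_note_context_switch(void)
    {
            unsigned long flags;

            /* ... enqueue the task on the blocked list if it was preempted
             *     inside a read-side critical section, etc. (elided) ... */

            local_irq_save(flags);          /* the scheduler-fastpath overhead
                                             * that was previously removed */
            rcu_preempt_qs();               /* report the quiescent state */
            local_irq_restore(flags);
    }

The posted one-liner aims for the same end without putting the irq
disable/enable pair back on the scheduler fastpath.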
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 9:36 ` Paul E. McKenney
@ 2015-01-23 21:51 ` Sasha Levin
2015-01-23 22:54 ` Paul E. McKenney
2015-01-24 2:18 ` Lai Jiangshan
1 sibling, 1 reply; 30+ messages in thread
From: Sasha Levin @ 2015-01-23 21:51 UTC (permalink / raw)
To: paulmck
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On 01/23/2015 04:36 AM, Paul E. McKenney wrote:
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 8669de884445..ec99dc16aa38 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
> special = t->rcu_read_unlock_special;
> if (special.b.need_qs) {
> rcu_preempt_qs();
> + t->rcu_read_unlock_special.need_qs = false;
It didn't compile, I've used:
t->rcu_read_unlock_special.b.need_qs = false;
Thanks,
Sasha
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 21:51 ` Sasha Levin
@ 2015-01-23 22:54 ` Paul E. McKenney
2015-01-23 23:05 ` Sasha Levin
0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-23 22:54 UTC (permalink / raw)
To: Sasha Levin
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On Fri, Jan 23, 2015 at 04:51:52PM -0500, Sasha Levin wrote:
> On 01/23/2015 04:36 AM, Paul E. McKenney wrote:
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 8669de884445..ec99dc16aa38 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
> > special = t->rcu_read_unlock_special;
> > if (special.b.need_qs) {
> > rcu_preempt_qs();
> > + t->rcu_read_unlock_special.need_qs = false;
>
> It didn't compile, I've used:
>
> t->rcu_read_unlock_special.b.need_qs = false;
Apologies, I should have marked it "untested". Good show on finding
the correct fix.
But does your fixed version help? ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 22:54 ` Paul E. McKenney
@ 2015-01-23 23:05 ` Sasha Levin
2015-01-23 23:16 ` Paul E. McKenney
0 siblings, 1 reply; 30+ messages in thread
From: Sasha Levin @ 2015-01-23 23:05 UTC (permalink / raw)
To: paulmck
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On 01/23/2015 05:54 PM, Paul E. McKenney wrote:
> On Fri, Jan 23, 2015 at 04:51:52PM -0500, Sasha Levin wrote:
>> > On 01/23/2015 04:36 AM, Paul E. McKenney wrote:
>>> > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
>>> > > index 8669de884445..ec99dc16aa38 100644
>>> > > --- a/kernel/rcu/tree_plugin.h
>>> > > +++ b/kernel/rcu/tree_plugin.h
>>> > > @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
>>> > > special = t->rcu_read_unlock_special;
>>> > > if (special.b.need_qs) {
>>> > > rcu_preempt_qs();
>>> > > + t->rcu_read_unlock_special.need_qs = false;
>> >
>> > It didn't compile, I've used:
>> >
>> > t->rcu_read_unlock_special.b.need_qs = false;
> Apologies, I should have marked it "untested". Good show on finding
> the correct fix.
>
> But does your fixed version help? ;-)
I haven't seen it happening again, but you won't get a "yes" out of me
until tomorrow evening :)
Thanks,
Sasha
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 23:05 ` Sasha Levin
@ 2015-01-23 23:16 ` Paul E. McKenney
0 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-23 23:16 UTC (permalink / raw)
To: Sasha Levin
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On Fri, Jan 23, 2015 at 06:05:25PM -0500, Sasha Levin wrote:
> On 01/23/2015 05:54 PM, Paul E. McKenney wrote:
> > On Fri, Jan 23, 2015 at 04:51:52PM -0500, Sasha Levin wrote:
> >> > On 01/23/2015 04:36 AM, Paul E. McKenney wrote:
> >>> > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> >>> > > index 8669de884445..ec99dc16aa38 100644
> >>> > > --- a/kernel/rcu/tree_plugin.h
> >>> > > +++ b/kernel/rcu/tree_plugin.h
> >>> > > @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
> >>> > > special = t->rcu_read_unlock_special;
> >>> > > if (special.b.need_qs) {
> >>> > > rcu_preempt_qs();
> >>> > > + t->rcu_read_unlock_special.need_qs = false;
> >> >
> >> > It didn't compile, I've used:
> >> >
> >> > t->rcu_read_unlock_special.b.need_qs = false;
> > Apologies, I should have marked it "untested". Good show on finding
> > the correct fix.
> >
> > But does your fixed version help? ;-)
>
> I haven't seen it happening it again, but you won't get a "yes" out of me
> until tomorrow evening :)
Fair enough! ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-23 9:36 ` Paul E. McKenney
2015-01-23 21:51 ` Sasha Levin
@ 2015-01-24 2:18 ` Lai Jiangshan
2015-01-24 21:18 ` Paul E. McKenney
1 sibling, 1 reply; 30+ messages in thread
From: Lai Jiangshan @ 2015-01-24 2:18 UTC (permalink / raw)
To: paulmck, Sasha Levin
Cc: Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On 01/23/2015 05:36 PM, Paul E. McKenney wrote:
> On Thu, Jan 22, 2015 at 10:55:42PM -0800, Paul E. McKenney wrote:
>> On Thu, Jan 22, 2015 at 11:05:45PM -0500, Sasha Levin wrote:
>>> On 01/22/2015 11:02 PM, Sasha Levin wrote:
>>>> On 01/22/2015 10:51 PM, Paul E. McKenney wrote:
>>>>> On Thu, Jan 22, 2015 at 10:29:01PM -0500, Sasha Levin wrote:
>>>>>>> On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
>>>>>>>>> On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
>>>>>>>>>>> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
>>>>>>>>>>>>>>>>> So RCU believes that an RCU read-side critical section that ended within
>>>>>>>>>>>>>>>>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
>>>>>>>>>>>>>>>>>>>>> Which is not supposed to happen.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
>>>>>>>>>>>>>>>>>>>>> and retry?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
>>>>>>>>>>>>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
>>>>>>>>>>>>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
>>>>>>>>>>>>>
>>>>>>>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>>>>>>>>>>>>>
>>>>>>>>>>>>> These handle the problems that Dave Jones, yourself, and a few others
>>>>>>>>>>>>> located this past December. Could you please give them a spin?
>>>>>>>>>>>
>>>>>>>>>>> They seem to be a part of -next already, so this testing already includes them.
>>>>>>>>>>>
>>>>>>>>>>> I seem to be getting them about once a day, anything I can add to debug it?
>>>>>>>>>
>>>>>>>>> Could you please try reproducing with the following patch?
>>>>>>>
>>>>>>> Yes, and I've got mixed results. It reproduced, and all I got was:
>>>>>>>
>>>>>>> [ 717.645572] ===============================
>>>>>>> [ 717.645572] [ INFO: suspicious RCU usage. ]
>>>>>>> [ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
>>>>>>> [ 717.645572] -------------------------------
>>>>>>> [ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
>>>>>>> [ 717.645572] !
>>>>>>> [ 717.645572]
>>>>>>> [ 717.645572] other info that might help us debug this:
>>>>>>> [ 717.645572]
>>>>>>> [ 717.645572]
>>>>>>> [ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
>>>>>>> [ 717.645572] 3 locks held by trinity-c29/16497:
>>>>>>> [ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
>>>>>>> [ 717.645572] #1:
>>>>>>> [hang]
>>>>>>>
>>>>>>> So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
>>>>>>> should follow that.
>>>>>>>
>>>>>>> I've removed the lockdep call and will re-run it.
>>>>> Thank you! You are keeping the pr_alert(), correct?
>>>>
>>>> Yup, just the lockdep call goes away.
>>>
>>> Okay, this reproduced faster than I anticipated:
>>>
>>> [ 786.160131] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>>> [ 786.239513] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>>> [ 786.240503] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>>> [ 786.242575] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
>>>
>>> It seems like the WARN_ON_ONCE was hiding the fact it actually got hit couple
>>> of times in a very short interval. Maybe that would also explain lockdep crapping
>>> itself.
>>
>> OK, that was what I thought was the situation. I have not yet fully
>> worked out how RCU gets into that state, but in the meantime, here
>> is a patch that should prevent the splats. (It requires a subtle
>> interaction of quiescent-state detection and the scheduling-clock
>> interrupt.)
>
> And I did finally figure out how this can happen. Please see below
> for an updated patch with this information recorded in the commit log.
> Sasha, I am impressed -- your testing not only located a true RCU bug,
> but an RCU bug that can happen on a uniprocessor! ;-)
>
> As far as I know, the bug is harmless apart from the splat, but still...
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Clear need_qs flag to prevent splat
>
> If the scheduling-clock interrupt sets the current tasks need_qs flag,
> but if the current CPU passes through a quiescent state in the meantime,
> then rcu_preempt_qs() will fail to clear the need_qs flag, which can fool
> RCU into thinking that additional rcu_read_unlock_special() processing
> is needed. This commit therefore clears the need_qs flag before checking
> for additional processing.
>
> For this problem to occur, we need rcu_preempt_data.passed_quiesce equal
> to true and current->rcu_read_unlock_special.b.need_qs also equal to true.
> This condition can occur as follows:
Hi, Paul
I still can't draw the sequence map.
>
> 1. CPU 0 is aware of the current preemptible RCU grace period,
> but has not yet passed through a quiescent state. Among other
> things, this means that rcu_preempt_data.passed_quiesce is false.
>
> 2. Task A running on CPU 0 enters a preemptible RCU read-side
> critical section.
>
> 3. CPU 0 takes a scheduling-clock interrupt, which notices the
> RCU read-side critical section and the need for a quiescent state,
> and thus sets current->rcu_read_unlock_special.b.need_qs to true.
So, Task A is still in the preemptible RCU read-side critical section here.
>
> 4. Task A is preempted, enters the scheduler, eventually invoking
> rcu_preempt_note_context_switch() which in turn invokes
> rcu_preempt_qs().
>
> Because rcu_preempt_data.passed_quiesce is false,
> control enters the body of the "if" statement, which sets
> rcu_preempt_data.passed_quiesce to true.
>
> 5. At this point, CPU 0 takes an interrupt. The interrupt
> handler contains an RCU read-side critical section, and
> the rcu_read_unlock() notes that current->rcu_read_unlock_special
> is nonzero, and thus invokes rcu_read_unlock_special().
If the previous critical section has not ended, this new critical section
is nested, and its rcu_read_unlock() will not call rcu_read_unlock_special().
If the previous critical section has ended, then the conditions have already
changed between steps #3, #4, and #5, and step #6 and the following steps can't happen.
Thanks,
Lai
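The nesting argument refers to the outermost-unlock check in
__rcu_read_unlock(). Roughly (paraphrased from kernel/rcu/update.c of that
vintage, so treat this as a sketch rather than the exact source), only the
outermost rcu_read_unlock() ever looks at ->rcu_read_unlock_special:

    void __rcu_read_unlock(void)
    {
            struct task_struct *t = current;

            if (t->rcu_read_lock_nesting != 1) {
                    --t->rcu_read_lock_nesting;     /* nested: just count down */
            } else {
                    barrier();      /* critical section before exit code */
                    t->rcu_read_lock_nesting = INT_MIN;
                    barrier();      /* assign before ->rcu_read_unlock_special load */
                    if (unlikely(ACCESS_ONCE(t->rcu_read_unlock_special.s)))
                            rcu_read_unlock_special(t);     /* outermost only */
                    barrier();      /* load before final assign */
                    t->rcu_read_lock_nesting = 0;
            }
    }

So for step #5 of the scenario to reach rcu_read_unlock_special(), the
interrupt handler's critical section would have to be the outermost one,
which is the crux of the objection above.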
>
> 6. Once in rcu_read_unlock_special(), the fact that
> current->rcu_read_unlock_special.b.need_qs is true becomes
> apparent, so rcu_read_unlock_special() invokes rcu_preempt_qs().
> Recursively, given that we interrupted out of that same
> function in the preceding step.
>
> 7. Because rcu_preempt_data.passed_quiesce is now true,
> rcu_preempt_qs() does nothing, and simply returns.
>
> 8. Upon return to rcu_read_unlock_special(), it is noted that
> current->rcu_read_unlock_special is still nonzero (because
> the interrupted rcu_preempt_qs() had not yet gotten around
> to clearing current->rcu_read_unlock_special.b.need_qs).
>
> 9. Execution proceeds to the WARN_ON_ONCE(), which notes that
> we are in an interrupt handler and thus duly splats.
>
> The solution, as noted above, is to make rcu_read_unlock_special()
> clear out current->rcu_read_unlock_special.b.need_qs after calling
> rcu_preempt_qs(). The interrupted rcu_preempt_qs() will clear it again,
> but this is harmless. The worst that happens is that we clobber another
> attempt to set this field, but this is not a problem because we just
> got done reporting a quiescent state.
>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 8669de884445..ec99dc16aa38 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
> special = t->rcu_read_unlock_special;
> if (special.b.need_qs) {
> rcu_preempt_qs();
> + t->rcu_read_unlock_special.need_qs = false;
> if (!t->rcu_read_unlock_special.s) {
> local_irq_restore(flags);
> return;
>
> .
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-24 2:18 ` Lai Jiangshan
@ 2015-01-24 21:18 ` Paul E. McKenney
2015-01-26 2:08 ` Lai Jiangshan
0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-24 21:18 UTC (permalink / raw)
To: Lai Jiangshan
Cc: Sasha Levin, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On Sat, Jan 24, 2015 at 10:18:05AM +0800, Lai Jiangshan wrote:
> On 01/23/2015 05:36 PM, Paul E. McKenney wrote:
> > On Thu, Jan 22, 2015 at 10:55:42PM -0800, Paul E. McKenney wrote:
> >> On Thu, Jan 22, 2015 at 11:05:45PM -0500, Sasha Levin wrote:
> >>> On 01/22/2015 11:02 PM, Sasha Levin wrote:
> >>>> On 01/22/2015 10:51 PM, Paul E. McKenney wrote:
> >>>>> On Thu, Jan 22, 2015 at 10:29:01PM -0500, Sasha Levin wrote:
> >>>>>>> On 01/21/2015 07:43 PM, Paul E. McKenney wrote:
> >>>>>>>>> On Wed, Jan 21, 2015 at 10:44:57AM -0500, Sasha Levin wrote:
> >>>>>>>>>>> On 01/20/2015 09:57 PM, Paul E. McKenney wrote:
> >>>>>>>>>>>>>>>>> So RCU believes that an RCU read-side critical section that ended within
> >>>>>>>>>>>>>>>>>>>>> an interrupt handler (in this case, an hrtimer) somehow got preempted.
> >>>>>>>>>>>>>>>>>>>>> Which is not supposed to happen.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Do you have CONFIG_PROVE_RCU enabled? If not, could you please enable it
> >>>>>>>>>>>>>>>>>>>>> and retry?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I did have CONFIG_PROVE_RCU, and didn't see anything else besides what I pasted here.
> >>>>>>>>>>>>> OK, fair enough. I do have a stack of RCU CPU stall-warning changes on
> >>>>>>>>>>>>> their way in, please see v3.19-rc1..630181c4a915 in -rcu, which is at:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> These handle the problems that Dave Jones, yourself, and a few others
> >>>>>>>>>>>>> located this past December. Could you please give them a spin?
> >>>>>>>>>>>
> >>>>>>>>>>> They seem to be a part of -next already, so this testing already includes them.
> >>>>>>>>>>>
> >>>>>>>>>>> I seem to be getting them about once a day, anything I can add to debug it?
> >>>>>>>>>
> >>>>>>>>> Could you please try reproducing with the following patch?
> >>>>>>>
> >>>>>>> Yes, and I've got mixed results. It reproduced, and all I got was:
> >>>>>>>
> >>>>>>> [ 717.645572] ===============================
> >>>>>>> [ 717.645572] [ INFO: suspicious RCU usage. ]
> >>>>>>> [ 717.645572] 3.19.0-rc5-next-20150121-sasha-00064-g3c37e35-dirty #1809 Tainted: G W
> >>>>>>> [ 717.645572] -------------------------------
> >>>>>>> [ 717.645572] kernel/rcu/tree_plugin.h:337 rcu_read_unlock() from irq or softirq with blocking in critical section!!!
> >>>>>>> [ 717.645572] !
> >>>>>>> [ 717.645572]
> >>>>>>> [ 717.645572] other info that might help us debug this:
> >>>>>>> [ 717.645572]
> >>>>>>> [ 717.645572]
> >>>>>>> [ 717.645572] rcu_scheduler_active = 1, debug_locks = 1
> >>>>>>> [ 717.645572] 3 locks held by trinity-c29/16497:
> >>>>>>> [ 717.645572] #0: (&sb->s_type->i_mutex_key){+.+.+.}, at: [<ffffffff81bec373>] lookup_slow+0xd3/0x420
> >>>>>>> [ 717.645572] #1:
> >>>>>>> [hang]
> >>>>>>>
> >>>>>>> So the rest of the locks/stack trace didn't get printed, nor the pr_alert() which
> >>>>>>> should follow that.
> >>>>>>>
> >>>>>>> I've removed the lockdep call and will re-run it.
> >>>>> Thank you! You are keeping the pr_alert(), correct?
> >>>>
> >>>> Yup, just the lockdep call goes away.
> >>>
> >>> Okay, this reproduced faster than I anticipated:
> >>>
> >>> [ 786.160131] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >>> [ 786.239513] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >>> [ 786.240503] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >>> [ 786.242575] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >>> [ 786.243565] ->rcu_read_unlock_special: 0x100 (b: 0, nq: 1)
> >>>
> >>> It seems like the WARN_ON_ONCE was hiding the fact it actually got hit couple
> >>> of times in a very short interval. Maybe that would also explain lockdep crapping
> >>> itself.
> >>
> >> OK, that was what I thought was the situation. I have not yet fully
> >> worked out how RCU gets into that state, but in the meantime, here
> >> is a patch that should prevent the splats. (It requires a subtle
> >> interaction of quiescent-state detection and the scheduling-clock
> >> interrupt.)
> >
> > And I did finally figure out how this can happen. Please see below
> > for an updated patch with this information recorded in the commit log.
> > Sasha, I am impressed -- your testing not only located a true RCU bug,
> > but an RCU bug that can happen on a uniprocessor! ;-)
> >
> > As far as I know, the bug is harmless apart from the splat, but still...
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > rcu: Clear need_qs flag to prevent splat
> >
> > If the scheduling-clock interrupt sets the current tasks need_qs flag,
> > but if the current CPU passes through a quiescent state in the meantime,
> > then rcu_preempt_qs() will fail to clear the need_qs flag, which can fool
> > RCU into thinking that additional rcu_read_unlock_special() processing
> > is needed. This commit therefore clears the need_qs flag before checking
> > for additional processing.
> >
> > For this problem to occur, we need rcu_preempt_data.passed_quiesce equal
> > to true and current->rcu_read_unlock_special.b.need_qs also equal to true.
> > This condition can occur as follows:
>
>
> Hi, Paul
> I still can't draw the sequence map.
>
> >
> > 1. CPU 0 is aware of the current preemptible RCU grace period,
> > but has not yet passed through a quiescent state. Among other
> > things, this means that rcu_preempt_data.passed_quiesce is false.
> >
> > 2. Task A running on CPU 0 enters a preemptible RCU read-side
> > critical section.
> >
> > 3. CPU 0 takes a scheduling-clock interrupt, which notices the
> > RCU read-side critical section and the need for a quiescent state,
> > and thus sets current->rcu_read_unlock_special.b.need_qs to true.
>
>
> So, Task A is still in the preemptible RCU read-side critical section here.
>
> >
> > 4. Task A is preempted, enters the scheduler, eventually invoking
> > rcu_preempt_note_context_switch() which in turn invokes
> > rcu_preempt_qs().
> >
> > Because rcu_preempt_data.passed_quiesce is false,
> > control enters the body of the "if" statement, which sets
> > rcu_preempt_data.passed_quiesce to true.
> >
> > 5. At this point, CPU 0 takes an interrupt. The interrupt
> > handler contains an RCU read-side critical section, and
> > the rcu_read_unlock() notes that current->rcu_read_unlock_special
> > is nonzero, and thus invokes rcu_read_unlock_special().
>
> If the previous critical section is not ended, this new critical section
> is nested, and this new critical section will not call rcu_read_unlock_special().
>
> If the previous critical section is ended, the conditions were changed between
> step#3,#4,#5, and the following #6... can't happen.
Good point! In my scenario, CPU 0 would not yet have switched away from
Task A. Hmmm... Yet Sasha really does see this failure. Will give it
some more thought.
Any ideas?
Thanx, Paul
> Thanks,
> Lai
>
> >
> > 6. Once in rcu_read_unlock_special(), the fact that
> > current->rcu_read_unlock_special.b.need_qs is true becomes
> > apparent, so rcu_read_unlock_special() invokes rcu_preempt_qs().
> > Recursively, given that we interrupted out of that same
> > function in the preceding step.
> >
> > 7. Because rcu_preempt_data.passed_quiesce is now true,
> > rcu_preempt_qs() does nothing, and simply returns.
> >
> > 8. Upon return to rcu_read_unlock_special(), it is noted that
> > current->rcu_read_unlock_special is still nonzero (because
> > the interrupted rcu_preempt_qs() had not yet gotten around
> > to clearing current->rcu_read_unlock_special.b.need_qs).
> >
> > 9. Execution proceeds to the WARN_ON_ONCE(), which notes that
> > we are in an interrupt handler and thus duly splats.
> >
> > The solution, as noted above, is to make rcu_read_unlock_special()
> > clear out current->rcu_read_unlock_special.b.need_qs after calling
> > rcu_preempt_qs(). The interrupted rcu_preempt_qs() will clear it again,
> > but this is harmless. The worst that happens is that we clobber another
> > attempt to set this field, but this is not a problem because we just
> > got done reporting a quiescent state.
> >
> > Reported-by: Sasha Levin <sasha.levin@oracle.com>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 8669de884445..ec99dc16aa38 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
> > special = t->rcu_read_unlock_special;
> > if (special.b.need_qs) {
> > rcu_preempt_qs();
> > + t->rcu_read_unlock_special.need_qs = false;
> > if (!t->rcu_read_unlock_special.s) {
> > local_irq_restore(flags);
> > return;
> >
> > .
> >
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-24 21:18 ` Paul E. McKenney
@ 2015-01-26 2:08 ` Lai Jiangshan
2015-01-27 22:03 ` Paul E. McKenney
0 siblings, 1 reply; 30+ messages in thread
From: Lai Jiangshan @ 2015-01-26 2:08 UTC (permalink / raw)
To: paulmck
Cc: Sasha Levin, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On 01/25/2015 05:18 AM, Paul E. McKenney wrote:
>
> Good point! In my scenario, CPU 0 would not yet have switched away from
> Task A. Hmmm... Yet Sasha really does see this failure. Will give it
> some more thought.
>
> Any ideas?
I don't know which commit was merged from the rcu git tree in Sasha's test.
I will try to review it.
We can fall back to git-bisect if the review fails.
Thanks,
Lai
>
> Thanx, Paul
>
>> Thanks,
>> Lai
>>
>>>
>>> 6. Once in rcu_read_unlock_special(), the fact that
>>> current->rcu_read_unlock_special.b.need_qs is true becomes
>>> apparent, so rcu_read_unlock_special() invokes rcu_preempt_qs().
>>> Recursively, given that we interrupted out of that same
>>> function in the preceding step.
>>>
>>> 7. Because rcu_preempt_data.passed_quiesce is now true,
>>> rcu_preempt_qs() does nothing, and simply returns.
>>>
>>> 8. Upon return to rcu_read_unlock_special(), it is noted that
>>> current->rcu_read_unlock_special is still nonzero (because
>>> the interrupted rcu_preempt_qs() had not yet gotten around
>>> to clearing current->rcu_read_unlock_special.b.need_qs).
>>>
>>> 9. Execution proceeds to the WARN_ON_ONCE(), which notes that
>>> we are in an interrupt handler and thus duly splats.
>>>
>>> The solution, as noted above, is to make rcu_read_unlock_special()
>>> clear out current->rcu_read_unlock_special.b.need_qs after calling
>>> rcu_preempt_qs(). The interrupted rcu_preempt_qs() will clear it again,
>>> but this is harmless. The worst that happens is that we clobber another
>>> attempt to set this field, but this is not a problem because we just
>>> got done reporting a quiescent state.
>>>
>>> Reported-by: Sasha Levin <sasha.levin@oracle.com>
>>> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>>>
>>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
>>> index 8669de884445..ec99dc16aa38 100644
>>> --- a/kernel/rcu/tree_plugin.h
>>> +++ b/kernel/rcu/tree_plugin.h
>>> @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
>>> special = t->rcu_read_unlock_special;
>>> if (special.b.need_qs) {
>>> rcu_preempt_qs();
>>> + t->rcu_read_unlock_special.need_qs = false;
>>> if (!t->rcu_read_unlock_special.s) {
>>> local_irq_restore(flags);
>>> return;
>>>
>>> .
>>>
>>
>
> .
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-26 2:08 ` Lai Jiangshan
@ 2015-01-27 22:03 ` Paul E. McKenney
2015-01-27 22:08 ` Sasha Levin
0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-27 22:03 UTC (permalink / raw)
To: Lai Jiangshan
Cc: Sasha Levin, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On Mon, Jan 26, 2015 at 10:08:04AM +0800, Lai Jiangshan wrote:
> On 01/25/2015 05:18 AM, Paul E. McKenney wrote:
>
> >
> > Good point! In my scenario, CPU 0 would not yet have switched away from
> > Task A. Hmmm... Yet Sasha really does see this failure. Will give it
> > some more thought.
> >
> > Any ideas?
>
> I don't known which commit was merged from the rcu-git-tree in Sasha's test
> I try to review it.
If I had to guess, it would be 1d082fd06188 (Remove local_irq_disable()
in rcu_preempt_note_context_switch()), though his finding this might be
more directly related to increases in trinity's levels of stress.
> We can fallback to git-bitsect if the reviews fails.
One (very unlikely) possibility is that Sasha's compiler is ignoring the
barrier() in rcu_preempt_qs().
Thanx, Paul
> Thanks,
> Lai
>
> >
> > Thanx, Paul
> >
> >> Thanks,
> >> Lai
> >>
> >>>
> >>> 6. Once in rcu_read_unlock_special(), the fact that
> >>> current->rcu_read_unlock_special.b.need_qs is true becomes
> >>> apparent, so rcu_read_unlock_special() invokes rcu_preempt_qs().
> >>> Recursively, given that we interrupted out of that same
> >>> function in the preceding step.
> >>>
> >>> 7. Because rcu_preempt_data.passed_quiesce is now true,
> >>> rcu_preempt_qs() does nothing, and simply returns.
> >>>
> >>> 8. Upon return to rcu_read_unlock_special(), it is noted that
> >>> current->rcu_read_unlock_special is still nonzero (because
> >>> the interrupted rcu_preempt_qs() had not yet gotten around
> >>> to clearing current->rcu_read_unlock_special.b.need_qs).
> >>>
> >>> 9. Execution proceeds to the WARN_ON_ONCE(), which notes that
> >>> we are in an interrupt handler and thus duly splats.
> >>>
> >>> The solution, as noted above, is to make rcu_read_unlock_special()
> >>> clear out current->rcu_read_unlock_special.b.need_qs after calling
> >>> rcu_preempt_qs(). The interrupted rcu_preempt_qs() will clear it again,
> >>> but this is harmless. The worst that happens is that we clobber another
> >>> attempt to set this field, but this is not a problem because we just
> >>> got done reporting a quiescent state.
> >>>
> >>> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> >>> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >>>
> >>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> >>> index 8669de884445..ec99dc16aa38 100644
> >>> --- a/kernel/rcu/tree_plugin.h
> >>> +++ b/kernel/rcu/tree_plugin.h
> >>> @@ -322,6 +322,7 @@ void rcu_read_unlock_special(struct task_struct *t)
> >>> special = t->rcu_read_unlock_special;
> >>> if (special.b.need_qs) {
> >>> rcu_preempt_qs();
> >>> + t->rcu_read_unlock_special.need_qs = false;
> >>> if (!t->rcu_read_unlock_special.s) {
> >>> local_irq_restore(flags);
> >>> return;
> >>>
> >>> .
> >>>
> >>
> >
> > .
> >
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-27 22:03 ` Paul E. McKenney
@ 2015-01-27 22:08 ` Sasha Levin
2015-01-27 23:16 ` Paul E. McKenney
0 siblings, 1 reply; 30+ messages in thread
From: Sasha Levin @ 2015-01-27 22:08 UTC (permalink / raw)
To: paulmck, Lai Jiangshan
Cc: Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On 01/27/2015 05:03 PM, Paul E. McKenney wrote:
> On Mon, Jan 26, 2015 at 10:08:04AM +0800, Lai Jiangshan wrote:
>> > On 01/25/2015 05:18 AM, Paul E. McKenney wrote:
>> >
>>> > >
>>> > > Good point! In my scenario, CPU 0 would not yet have switched away from
>>> > > Task A. Hmmm... Yet Sasha really does see this failure. Will give it
>>> > > some more thought.
>>> > >
>>> > > Any ideas?
>> >
>> > I don't known which commit was merged from the rcu-git-tree in Sasha's test
>> > I try to review it.
> If I had to guess, it would be 1d082fd06188 (Remove local_irq_disable()
> in rcu_preempt_note_context_switch()), though his finding this might be
> more directly related to increases in trinity's levels of stress.
Quick update from my end: I've stopped seeing this warning, but I've also stopped
seeing warnings for the other RCU issue I've reported (https://lkml.org/lkml/2015/1/22/676)
so I'm slightly unhappy about that.
>> > We can fallback to git-bitsect if the reviews fails.
> One (very unlikely) possibility is that Sasha's compiler is ignoring the
> barrier() in rcu_preempt_qs().
I'm actually running the latest gcc (trunk) as well, so it's very possible that it was
doing something stupid.
Thanks,
Sasha
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-27 22:08 ` Sasha Levin
@ 2015-01-27 23:16 ` Paul E. McKenney
2015-01-30 19:57 ` Sasha Levin
0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-27 23:16 UTC (permalink / raw)
To: Sasha Levin
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On Tue, Jan 27, 2015 at 05:08:21PM -0500, Sasha Levin wrote:
> On 01/27/2015 05:03 PM, Paul E. McKenney wrote:
> > On Mon, Jan 26, 2015 at 10:08:04AM +0800, Lai Jiangshan wrote:
> >> > On 01/25/2015 05:18 AM, Paul E. McKenney wrote:
> >> >
> >>> > >
> >>> > > Good point! In my scenario, CPU 0 would not yet have switched away from
> >>> > > Task A. Hmmm... Yet Sasha really does see this failure. Will give it
> >>> > > some more thought.
> >>> > >
> >>> > > Any ideas?
> >> >
> >> > I don't known which commit was merged from the rcu-git-tree in Sasha's test
> >> > I try to review it.
> > If I had to guess, it would be 1d082fd06188 (Remove local_irq_disable()
> > in rcu_preempt_note_context_switch()), though his finding this might be
> > more directly related to increases in trinity's levels of stress.
>
> Quick update from my end: I've stopped seeing this warning, but I've also stopped
> seeing warnings for the other RCU issue I've reported (https://lkml.org/lkml/2015/1/22/676)
> so I'm slightly unhappy about that.
Another approach would be to remove that patch and then revert 1d082fd06188.
Either way, may I have your Tested-by?
> >> > We can fallback to git-bitsect if the reviews fails.
> > One (very unlikely) possibility is that Sasha's compiler is ignoring the
> > barrier() in rcu_preempt_qs().
>
> I'm actually running the latest gcc (trunk) as well, so it's very possible that it was
> doing something stupid.
Hmmmm... Could you please send along the assembly output for rcu_preempt_qs()?
Thanx, Paul
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-27 23:16 ` Paul E. McKenney
@ 2015-01-30 19:57 ` Sasha Levin
2015-01-30 20:33 ` Paul E. McKenney
0 siblings, 1 reply; 30+ messages in thread
From: Sasha Levin @ 2015-01-30 19:57 UTC (permalink / raw)
To: paulmck
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On 01/27/2015 06:16 PM, Paul E. McKenney wrote:
> On Tue, Jan 27, 2015 at 05:08:21PM -0500, Sasha Levin wrote:
>> On 01/27/2015 05:03 PM, Paul E. McKenney wrote:
>>> On Mon, Jan 26, 2015 at 10:08:04AM +0800, Lai Jiangshan wrote:
>>>>> On 01/25/2015 05:18 AM, Paul E. McKenney wrote:
>>>>>
>>>>>>>
>>>>>>> Good point! In my scenario, CPU 0 would not yet have switched away from
>>>>>>> Task A. Hmmm... Yet Sasha really does see this failure. Will give it
>>>>>>> some more thought.
>>>>>>>
>>>>>>> Any ideas?
>>>>>
>>>>> I don't known which commit was merged from the rcu-git-tree in Sasha's test
>>>>> I try to review it.
>>> If I had to guess, it would be 1d082fd06188 (Remove local_irq_disable()
>>> in rcu_preempt_note_context_switch()), though his finding this might be
>>> more directly related to increases in trinity's levels of stress.
>>
>> Quick update from my end: I've stopped seeing this warning, but I've also stopped
>> seeing warnings for the other RCU issue I've reported (https://lkml.org/lkml/2015/1/22/676)
>> so I'm slightly unhappy about that.
>
> Another approach would be to remove that patch and then revert 1d082fd06188.
>
> Either way, may I have your Tested-by?
Yup, I haven't seen it so far.
>>>>> We can fallback to git-bitsect if the reviews fails.
>>> One (very unlikely) possibility is that Sasha's compiler is ignoring the
>>> barrier() in rcu_preempt_qs().
>>
>> I'm actually running the latest gcc (trunk) as well, so it's very possible that it was
>> doing something stupid.
>
> Hmmmm... Could you please send along the assembly output for rcu_preempt_qs()?
0000000000002b20 <rcu_preempt_qs>:
2b20: e8 00 00 00 00 callq 2b25 <rcu_preempt_qs+0x5>
2b21: R_X86_64_PC32 __fentry__-0x4
2b25: 55 push %rbp
2b26: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
2b29: R_X86_64_32S .rodata
2b2d: 48 89 e5 mov %rsp,%rbp
2b30: 53 push %rbx
2b31: 48 83 ec 08 sub $0x8,%rsp
2b35: e8 00 00 00 00 callq 2b3a <rcu_preempt_qs+0x1a>
2b36: R_X86_64_PC32 __this_cpu_preempt_check-0x4
2b3a: 65 8a 05 00 00 00 00 mov %gs:0x0(%rip),%al # 2b41 <rcu_preempt_qs+0x21>
2b3d: R_X86_64_PC32 rcu_preempt_data+0x14
2b41: 84 c0 test %al,%al
2b43: 74 0b je 2b50 <rcu_preempt_qs+0x30>
2b45: 48 83 c4 08 add $0x8,%rsp
2b49: 5b pop %rbx
2b4a: 5d pop %rbp
2b4b: c3 retq
2b4c: 0f 1f 40 00 nopl 0x0(%rax)
2b50: 48 8b 1d 00 00 00 00 mov 0x0(%rip),%rbx # 2b57 <rcu_preempt_qs+0x37>
2b53: R_X86_64_PC32 __tracepoint_str+0xb4
2b57: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
2b5a: R_X86_64_32S .rodata
2b5e: e8 00 00 00 00 callq 2b63 <rcu_preempt_qs+0x43>
2b5f: R_X86_64_PC32 __this_cpu_preempt_check-0x4
2b63: 48 8b 3d 00 00 00 00 mov 0x0(%rip),%rdi # 2b6a <rcu_preempt_qs+0x4a>
2b66: R_X86_64_PC32 __tracepoint_str+0xbc
2b6a: 65 48 8b 35 00 00 00 mov %gs:0x0(%rip),%rsi # 2b72 <rcu_preempt_qs+0x52>
2b71: 00
2b6e: R_X86_64_PC32 rcu_preempt_data+0x4
2b72: 48 89 da mov %rbx,%rdx
2b75: e8 66 fd ff ff callq 28e0 <trace_rcu_grace_period>
2b7a: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
2b7d: R_X86_64_32S .rodata+0x260
2b81: e8 00 00 00 00 callq 2b86 <rcu_preempt_qs+0x66>
2b82: R_X86_64_PC32 __this_cpu_preempt_check-0x4
2b86: 65 c6 05 00 00 00 00 movb $0x1,%gs:0x0(%rip) # 2b8e <rcu_preempt_qs+0x6e>
2b8d: 01
2b89: R_X86_64_PC32 rcu_preempt_data+0x13
2b8e: 65 48 8b 1c 25 00 00 mov %gs:0x0,%rbx
2b95: 00 00
2b93: R_X86_64_32S current_task
2b97: 48 8d bb 15 07 00 00 lea 0x715(%rbx),%rdi
2b9e: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
2ba5: fc ff df
2ba8: 48 89 fa mov %rdi,%rdx
2bab: 48 c1 ea 03 shr $0x3,%rdx
2baf: 0f b6 04 02 movzbl (%rdx,%rax,1),%eax
2bb3: 48 89 fa mov %rdi,%rdx
2bb6: 83 e2 07 and $0x7,%edx
2bb9: 38 d0 cmp %dl,%al
2bbb: 7f 04 jg 2bc1 <rcu_preempt_qs+0xa1>
2bbd: 84 c0 test %al,%al
2bbf: 75 0e jne 2bcf <rcu_preempt_qs+0xaf>
2bc1: c6 83 15 07 00 00 00 movb $0x0,0x715(%rbx)
2bc8: 48 83 c4 08 add $0x8,%rsp
2bcc: 5b pop %rbx
2bcd: 5d pop %rbp
2bce: c3 retq
2bcf: e8 00 00 00 00 callq 2bd4 <rcu_preempt_qs+0xb4>
2bd0: R_X86_64_PC32 __asan_report_store1_noabort-0x4
2bd4: eb eb jmp 2bc1 <rcu_preempt_qs+0xa1>
2bd6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
2bdd: 00 00 00
Thanks,
Sasha
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-30 19:57 ` Sasha Levin
@ 2015-01-30 20:33 ` Paul E. McKenney
2015-02-11 23:17 ` Sasha Levin
0 siblings, 1 reply; 30+ messages in thread
From: Paul E. McKenney @ 2015-01-30 20:33 UTC (permalink / raw)
To: Sasha Levin
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On Fri, Jan 30, 2015 at 02:57:00PM -0500, Sasha Levin wrote:
> On 01/27/2015 06:16 PM, Paul E. McKenney wrote:
> > On Tue, Jan 27, 2015 at 05:08:21PM -0500, Sasha Levin wrote:
> >> On 01/27/2015 05:03 PM, Paul E. McKenney wrote:
> >>> On Mon, Jan 26, 2015 at 10:08:04AM +0800, Lai Jiangshan wrote:
> >>>>> On 01/25/2015 05:18 AM, Paul E. McKenney wrote:
> >>>>>
> >>>>>>>
> >>>>>>> Good point! In my scenario, CPU 0 would not yet have switched away from
> >>>>>>> Task A. Hmmm... Yet Sasha really does see this failure. Will give it
> >>>>>>> some more thought.
> >>>>>>>
> >>>>>>> Any ideas?
> >>>>>
> >>>>> I don't known which commit was merged from the rcu-git-tree in Sasha's test
> >>>>> I try to review it.
> >>> If I had to guess, it would be 1d082fd06188 (Remove local_irq_disable()
> >>> in rcu_preempt_note_context_switch()), though his finding this might be
> >>> more directly related to increases in trinity's levels of stress.
> >>
> >> Quick update from my end: I've stopped seeing this warning, but I've also stopped
> >> seeing warnings for the other RCU issue I've reported (https://lkml.org/lkml/2015/1/22/676)
> >> so I'm slightly unhappy about that.
> >
> > Another approach would be to remove that patch and then revert 1d082fd06188.
> >
> > Either way, may I have your Tested-by?
>
> Yup, I haven't seen it so far.
Thank you!
> >>>>> We can fallback to git-bitsect if the reviews fails.
> >>> One (very unlikely) possibility is that Sasha's compiler is ignoring the
> >>> barrier() in rcu_preempt_qs().
> >>
> >> I'm actually running the latest gcc (trunk) as well, so it's very possible that it was
> >> doing something stupid.
> >
> > Hmmmm... Could you please send along the assembly output for rcu_preempt_qs()?
>
>
> 0000000000002b20 <rcu_preempt_qs>:
> 2b20: e8 00 00 00 00 callq 2b25 <rcu_preempt_qs+0x5>
> 2b21: R_X86_64_PC32 __fentry__-0x4
> 2b25: 55 push %rbp
> 2b26: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
> 2b29: R_X86_64_32S .rodata
> 2b2d: 48 89 e5 mov %rsp,%rbp
> 2b30: 53 push %rbx
> 2b31: 48 83 ec 08 sub $0x8,%rsp
> 2b35: e8 00 00 00 00 callq 2b3a <rcu_preempt_qs+0x1a>
> 2b36: R_X86_64_PC32 __this_cpu_preempt_check-0x4
> 2b3a: 65 8a 05 00 00 00 00 mov %gs:0x0(%rip),%al # 2b41 <rcu_preempt_qs+0x21>
> 2b3d: R_X86_64_PC32 rcu_preempt_data+0x14
> 2b41: 84 c0 test %al,%al
> 2b43: 74 0b je 2b50 <rcu_preempt_qs+0x30>
> 2b45: 48 83 c4 08 add $0x8,%rsp
> 2b49: 5b pop %rbx
> 2b4a: 5d pop %rbp
> 2b4b: c3 retq
> 2b4c: 0f 1f 40 00 nopl 0x0(%rax)
> 2b50: 48 8b 1d 00 00 00 00 mov 0x0(%rip),%rbx # 2b57 <rcu_preempt_qs+0x37>
> 2b53: R_X86_64_PC32 __tracepoint_str+0xb4
> 2b57: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
> 2b5a: R_X86_64_32S .rodata
> 2b5e: e8 00 00 00 00 callq 2b63 <rcu_preempt_qs+0x43>
> 2b5f: R_X86_64_PC32 __this_cpu_preempt_check-0x4
> 2b63: 48 8b 3d 00 00 00 00 mov 0x0(%rip),%rdi # 2b6a <rcu_preempt_qs+0x4a>
> 2b66: R_X86_64_PC32 __tracepoint_str+0xbc
> 2b6a: 65 48 8b 35 00 00 00 mov %gs:0x0(%rip),%rsi # 2b72 <rcu_preempt_qs+0x52>
> 2b71: 00
> 2b6e: R_X86_64_PC32 rcu_preempt_data+0x4
> 2b72: 48 89 da mov %rbx,%rdx
> 2b75: e8 66 fd ff ff callq 28e0 <trace_rcu_grace_period>
> 2b7a: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
> 2b7d: R_X86_64_32S .rodata+0x260
> 2b81: e8 00 00 00 00 callq 2b86 <rcu_preempt_qs+0x66>
> 2b82: R_X86_64_PC32 __this_cpu_preempt_check-0x4
> 2b86: 65 c6 05 00 00 00 00 movb $0x1,%gs:0x0(%rip) # 2b8e <rcu_preempt_qs+0x6e>
> 2b8d: 01
> 2b89: R_X86_64_PC32 rcu_preempt_data+0x13
> 2b8e: 65 48 8b 1c 25 00 00 mov %gs:0x0,%rbx
> 2b95: 00 00
> 2b93: R_X86_64_32S current_task
> 2b97: 48 8d bb 15 07 00 00 lea 0x715(%rbx),%rdi
> 2b9e: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
> 2ba5: fc ff df
> 2ba8: 48 89 fa mov %rdi,%rdx
> 2bab: 48 c1 ea 03 shr $0x3,%rdx
> 2baf: 0f b6 04 02 movzbl (%rdx,%rax,1),%eax
> 2bb3: 48 89 fa mov %rdi,%rdx
> 2bb6: 83 e2 07 and $0x7,%edx
> 2bb9: 38 d0 cmp %dl,%al
> 2bbb: 7f 04 jg 2bc1 <rcu_preempt_qs+0xa1>
> 2bbd: 84 c0 test %al,%al
> 2bbf: 75 0e jne 2bcf <rcu_preempt_qs+0xaf>
> 2bc1: c6 83 15 07 00 00 00 movb $0x0,0x715(%rbx)
> 2bc8: 48 83 c4 08 add $0x8,%rsp
> 2bcc: 5b pop %rbx
> 2bcd: 5d pop %rbp
> 2bce: c3 retq
> 2bcf: e8 00 00 00 00 callq 2bd4 <rcu_preempt_qs+0xb4>
> 2bd0: R_X86_64_PC32 __asan_report_store1_noabort-0x4
> 2bd4: eb eb jmp 2bc1 <rcu_preempt_qs+0xa1>
> 2bd6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
> 2bdd: 00 00 00
Looks to me like your version of gcc is respecting barrier(). Though
I confess myself amazed by the branches at 2bbb, 2bbf, 2bcf, and 2bd4...
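For reference, the barrier() in question is the usual GCC compiler barrier,
defined in essentially this form in include/linux/compiler-gcc.h. It emits
no instructions of its own; its only visible effect in a listing like the
one above is that memory accesses are not merged or reordered across it.

    #define barrier() __asm__ __volatile__("" : : : "memory")

The extra branches noted above (2bbb, 2bbf, 2bcf) appear to be KASAN's
instrumentation of the one-byte store to ->rcu_read_unlock_special, judging
by the __asan_report_store1_noabort call at 2bcf and the shadow-memory
arithmetic just before it.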
Thanx, Paul
> Thanks,
> Sasha
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-01-30 20:33 ` Paul E. McKenney
@ 2015-02-11 23:17 ` Sasha Levin
2015-02-12 0:42 ` Paul E. McKenney
0 siblings, 1 reply; 30+ messages in thread
From: Sasha Levin @ 2015-02-11 23:17 UTC (permalink / raw)
To: paulmck
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On 01/30/2015 03:33 PM, Paul E. McKenney wrote:
>>> Either way, may I have your Tested-by?
>> >
>> > Yup, I haven't seen it so far.
> Thank you!
I'm still seeing this issue in -next, and I see that that commit wasn't
applied yet.
Anything it's waiting on?
Thanks,
Sasha
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550()
2015-02-11 23:17 ` Sasha Levin
@ 2015-02-12 0:42 ` Paul E. McKenney
0 siblings, 0 replies; 30+ messages in thread
From: Paul E. McKenney @ 2015-02-12 0:42 UTC (permalink / raw)
To: Sasha Levin
Cc: Lai Jiangshan, Ingo Molnar, Peter Zijlstra, LKML,
davej@codemonkey.org.uk >> Dave Jones
On Wed, Feb 11, 2015 at 06:17:28PM -0500, Sasha Levin wrote:
> On 01/30/2015 03:33 PM, Paul E. McKenney wrote:
> >>> Either way, may I have your Tested-by?
> >> >
> >> > Yup, I haven't seen it so far.
> > Thank you!
>
> I'm still seeing this issue in -next, and I see that that commit wasn't
> applied yet.
>
> Anything it's waiting on?
Me to get my act together and move it from my for-v3.21 bucket into my
for-v3.20 bucket. I have rebased and have started testing, should hit
-next soon. If all goes well, I will send a pull request.
Thanx, Paul
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread
Thread overview: 30+ messages
2015-01-18 14:17 rcu, sched: WARNING: CPU: 30 PID: 23771 at kernel/rcu/tree_plugin.h:337 rcu_read_unlock_special+0x369/0x550() Sasha Levin
2015-01-18 23:22 ` Paul E. McKenney
2015-01-20 15:39 ` Sasha Levin
2015-01-21 2:57 ` Paul E. McKenney
2015-01-21 15:44 ` Sasha Levin
2015-01-22 0:43 ` Paul E. McKenney
2015-01-23 3:29 ` Sasha Levin
2015-01-23 3:51 ` Paul E. McKenney
2015-01-23 4:02 ` Sasha Levin
2015-01-23 4:05 ` Sasha Levin
2015-01-23 6:55 ` Paul E. McKenney
2015-01-23 8:41 ` Lai Jiangshan
2015-01-23 9:38 ` Paul E. McKenney
2015-01-23 9:16 ` Lai Jiangshan
2015-01-23 9:48 ` Paul E. McKenney
2015-01-23 9:36 ` Paul E. McKenney
2015-01-23 21:51 ` Sasha Levin
2015-01-23 22:54 ` Paul E. McKenney
2015-01-23 23:05 ` Sasha Levin
2015-01-23 23:16 ` Paul E. McKenney
2015-01-24 2:18 ` Lai Jiangshan
2015-01-24 21:18 ` Paul E. McKenney
2015-01-26 2:08 ` Lai Jiangshan
2015-01-27 22:03 ` Paul E. McKenney
2015-01-27 22:08 ` Sasha Levin
2015-01-27 23:16 ` Paul E. McKenney
2015-01-30 19:57 ` Sasha Levin
2015-01-30 20:33 ` Paul E. McKenney
2015-02-11 23:17 ` Sasha Levin
2015-02-12 0:42 ` Paul E. McKenney