linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* Re: Regression in RCU subsystem in latest mainline kernel
       [not found]   ` <1645938.As0LR1yeVd@pcimr>
@ 2013-06-14 21:06     ` Steven Rostedt
  2013-06-15  2:02       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2013-06-14 21:06 UTC (permalink / raw)
  To: Rojhalat Ibrahim; +Cc: paulmck, linuxppc-dev, linux-kernel

On Fri, 2013-06-14 at 14:46 +0200, Rojhalat Ibrahim wrote:
> On Friday 14 June 2013 05:28:00 Paul E. McKenney wrote:
> > On Fri, Jun 14, 2013 at 01:47:00PM +0200, Rojhalat Ibrahim wrote:
> > > Hi,
> > > 
> > > the current mainline kernel from git reproducibly hangs on my Freescale
> > > PowerPC P5020DS development system. Most of the time the hang occurs
> > > during
> > > boot (kernel just stops booting, system hangs), sometimes shortly after
> > > booting (system hangs, no message). Very often booting stops at the
> > > initialization of the Intel PRO 1000 PCIe network device (e1000e).

I was pretty much able to reproduce this on my PA Semi PPC box. Funny
thing is, when I type on the console, it makes progress. Anyway, it
seems that powerpc has an issue with irq_work(). I'll try to get some
time either tonight or next week to figure it out.

Thanks!

-- Steve

> > > 
> > > I used git bisect and arrived at this commit:
> > > 
> > > 016a8d5be6ddcc72ef0432d82d9f6fa34f61b907
> > > rcu: Don't call wakeup() with rcu_node structure ->lock held
> > > 
> > > With versions before that commit the system runs normally.
> > 
> > Odd, given that this fix was intended to prevent hangs.  Could you
> > please send your .config?
> > 
> > 							Thanx, Paul

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-14 21:06     ` Regression in RCU subsystem in latest mainline kernel Steven Rostedt
@ 2013-06-15  2:02       ` Benjamin Herrenschmidt
  2013-06-15  2:17         ` Steven Rostedt
  2013-06-17  7:42         ` Michael Ellerman
  0 siblings, 2 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2013-06-15  2:02 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linuxppc-dev, paulmck, Rojhalat Ibrahim, linux-kernel

On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> thing is, when I type on the console, it makes progress. Anyway, it
> seems that powerpc has an issue with irq_work(). I'll try to get some
> time either tonight or next week to figure it out.

Does this help ?

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 5cbcf4d..ea185e0 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -162,7 +162,7 @@ notrace unsigned int __check_irq_replay(void)
 	 * in case we also had a rollover while hard disabled
 	 */
 	local_paca->irq_happened &= ~PACA_IRQ_DEC;
-	if (decrementer_check_overflow())
+	if ((happened & PACA_IRQ_DEC) || decrementer_check_overflow())
 		return 0x900;
 
 	/* Finally check if an external interrupt happened */

Cheers,
Ben.

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-15  2:02       ` Benjamin Herrenschmidt
@ 2013-06-15  2:17         ` Steven Rostedt
  2013-06-15  2:21           ` Benjamin Herrenschmidt
  2013-06-17 13:21           ` Rojhalat Ibrahim
  2013-06-17  7:42         ` Michael Ellerman
  1 sibling, 2 replies; 16+ messages in thread
From: Steven Rostedt @ 2013-06-15  2:17 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: linuxppc-dev, paulmck, Rojhalat Ibrahim, linux-kernel

On Sat, 2013-06-15 at 12:02 +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > thing is, when I type on the console, it makes progress. Anyway, it
> > seems that powerpc has an issue with irq_work(). I'll try to get some
> > time either tonight or next week to figure it out.
> 
> Does this help ?

It did for me. Rojhalat, did this fix your issue too?

-- Steve

> 
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 5cbcf4d..ea185e0 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -162,7 +162,7 @@ notrace unsigned int __check_irq_replay(void)
>  	 * in case we also had a rollover while hard disabled
>  	 */
>  	local_paca->irq_happened &= ~PACA_IRQ_DEC;
> -	if (decrementer_check_overflow())
> +	if ((happened & PACA_IRQ_DEC) || decrementer_check_overflow())
>  		return 0x900;
>  
>  	/* Finally check if an external interrupt happened */
> 
> Cheers,
> Ben.
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-15  2:17         ` Steven Rostedt
@ 2013-06-15  2:21           ` Benjamin Herrenschmidt
  2013-06-15  2:31             ` Steven Rostedt
  2013-06-17 13:21           ` Rojhalat Ibrahim
  1 sibling, 1 reply; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2013-06-15  2:21 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linuxppc-dev, paulmck, Rojhalat Ibrahim, linux-kernel

On Fri, 2013-06-14 at 22:17 -0400, Steven Rostedt wrote:
> On Sat, 2013-06-15 at 12:02 +1000, Benjamin Herrenschmidt wrote:
> > On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > > thing is, when I type on the console, it makes progress. Anyway, it
> > > seems that powerpc has an issue with irq_work(). I'll try to get some
> > > time either tonight or next week to figure it out.
> > 
> > Does this help ?
> 
> It did for me. Rojhalat, did this fix your issue too?

Ok, adding that to what I'm about to send to Linus.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-15  2:21           ` Benjamin Herrenschmidt
@ 2013-06-15  2:31             ` Steven Rostedt
  2013-06-15  2:51               ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2013-06-15  2:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: linuxppc-dev, paulmck, Rojhalat Ibrahim, linux-kernel

On Sat, 2013-06-15 at 12:21 +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2013-06-14 at 22:17 -0400, Steven Rostedt wrote:
> > On Sat, 2013-06-15 at 12:02 +1000, Benjamin Herrenschmidt wrote:
> > > On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > > > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > > > thing is, when I type on the console, it makes progress. Anyway, it
> > > > seems that powerpc has an issue with irq_work(). I'll try to get some
> > > > time either tonight or next week to figure it out.
> > > 
> > > Does this help ?
> > 
> > It did for me. Rojhalat, did this fix your issue too?
> 
> Ok, adding that to what I'm about to send to Linus.
> 

Feel free to add: Tested-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-15  2:31             ` Steven Rostedt
@ 2013-06-15  2:51               ` Paul E. McKenney
  0 siblings, 0 replies; 16+ messages in thread
From: Paul E. McKenney @ 2013-06-15  2:51 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linuxppc-dev, Rojhalat Ibrahim, linux-kernel

On Fri, Jun 14, 2013 at 10:31:12PM -0400, Steven Rostedt wrote:
> On Sat, 2013-06-15 at 12:21 +1000, Benjamin Herrenschmidt wrote:
> > On Fri, 2013-06-14 at 22:17 -0400, Steven Rostedt wrote:
> > > On Sat, 2013-06-15 at 12:02 +1000, Benjamin Herrenschmidt wrote:
> > > > On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > > > > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > > > > thing is, when I type on the console, it makes progress. Anyway, it
> > > > > seems that powerpc has an issue with irq_work(). I'll try to get some
> > > > > time either tonight or next week to figure it out.
> > > > 
> > > > Does this help ?
> > > 
> > > It did for me. Rojhalat, did this fix your issue too?
> > 
> > Ok, adding that to what I'm about to send to Linus.
> 
> Feel free to add: Tested-by: Steven Rostedt <rostedt@goodmis.org>

Thank you both!!!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-15  2:02       ` Benjamin Herrenschmidt
  2013-06-15  2:17         ` Steven Rostedt
@ 2013-06-17  7:42         ` Michael Ellerman
  2013-06-19  4:09           ` Paul E. McKenney
  1 sibling, 1 reply; 16+ messages in thread
From: Michael Ellerman @ 2013-06-17  7:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Rojhalat Ibrahim, paulmck, linuxppc-dev, linux-kernel, Steven Rostedt

On Sat, Jun 15, 2013 at 12:02:21PM +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > thing is, when I type on the console, it makes progress. Anyway, it
> > seems that powerpc has an issue with irq_work(). I'll try to get some
> > time either tonight or next week to figure it out.
> 
> Does this help ?
> 
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 5cbcf4d..ea185e0 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -162,7 +162,7 @@ notrace unsigned int __check_irq_replay(void)
>  	 * in case we also had a rollover while hard disabled
>  	 */
>  	local_paca->irq_happened &= ~PACA_IRQ_DEC;
> -	if (decrementer_check_overflow())
> +	if ((happened & PACA_IRQ_DEC) || decrementer_check_overflow())
>  		return 0x900;
>  
>  	/* Finally check if an external interrupt happened */
> 

This seems to help, but doesn't elminate the RCU stall warnings I am
seeing. I now see them less often, but not never.

Stack trace is something like:

  INFO: rcu_sched detected stalls on CPUs/tasks: { 32} (detected by 12, t=21372 jiffies, g=18446744073709551503, c=18446744073709551502, q=1018)
  Task dump for CPU 32:
  power8-events   R  running task     4960  2009   1988 0x00000004
  Call Trace:
  [c000000fb0e3f910] [c000000fb0e3f9d0] 0xc000000fb0e3f9d0 (unreliable)
  
  [c000000fb0e3edc0] [c0000000000b2894] .__run_hrtimer+0xa4/0x2a0
  [c000000fb0e3ee70] [c0000000000b36d8] .hrtimer_interrupt+0x148/0x320
  [c000000fb0e3ef80] [c00000000001c754] .timer_interrupt+0x134/0x320
  [c000000fb0e3f040] [c00000000000a4f4] restore_check_irq_replay+0x68/0xa8
  --- Exception: 901 at .arch_local_irq_restore+0x24/0x90
      LR = .__do_softirq+0x100/0x3a0
  [c000000fb0e3f330] [c0000000000c4784] .vtime_account_irq_enter+0x34/0x70 (unreliable)
  [c000000fb0e3f3a0] [c000000000089680] .__do_softirq+0x100/0x3a0
  [c000000fb0e3f4c0] [c000000000089b38] .irq_exit+0xc8/0x110
  [c000000fb0e3f540] [c00000000001c788] .timer_interrupt+0x168/0x320
  [c000000fb0e3f600] [c0000000000025cc] decrementer_common+0x14c/0x180
  --- Exception: 901 at .arch_local_irq_restore+0x74/0x90
      LR = .arch_local_irq_restore+0x74/0x90
  [c000000fb0e3f8f0] [c000000fb0e3f970] 0xc000000fb0e3f970 (unreliable)
  [c000000fb0e3f960] [c0000000000e4ae0] .smp_call_function_single+0x1d0/0x1e0
  [c000000fb0e3fa10] [c000000000147aa4] .task_function_call+0x54/0x70
  [c000000fb0e3fab0] [c000000000147bc4] .perf_event_enable+0x104/0x1c0
  [c000000fb0e3fb60] [c000000000146800] .perf_event_for_each_child+0x60/0x110
  [c000000fb0e3fbf0] [c00000000014a528] .perf_ioctl+0x108/0x3f0
  [c000000fb0e3fca0] [c0000000001d7138] .do_vfs_ioctl+0xb8/0x730
  [c000000fb0e3fd80] [c0000000001d780c] .SyS_ioctl+0x5c/0xb0
  [c000000fb0e3fe30] [c000000000009d54] syscall_exit+0x0/0x98


cheers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-15  2:17         ` Steven Rostedt
  2013-06-15  2:21           ` Benjamin Herrenschmidt
@ 2013-06-17 13:21           ` Rojhalat Ibrahim
  2013-06-17 13:51             ` Steven Rostedt
  1 sibling, 1 reply; 16+ messages in thread
From: Rojhalat Ibrahim @ 2013-06-17 13:21 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: paulmck, linuxppc-dev, linux-kernel

On Friday 14 June 2013 22:17:34 Steven Rostedt wrote:
> On Sat, 2013-06-15 at 12:02 +1000, Benjamin Herrenschmidt wrote:
> > On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > > thing is, when I type on the console, it makes progress. Anyway, it
> > > seems that powerpc has an issue with irq_work(). I'll try to get some
> > > time either tonight or next week to figure it out.
> > 
> > Does this help ?
> 
> It did for me. Rojhalat, did this fix your issue too?
> 

FWIW, since the fix is already applied, but it fixes my problem too.

   Rojhalat


> -- Steve
> 
> > diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> > index 5cbcf4d..ea185e0 100644
> > --- a/arch/powerpc/kernel/irq.c
> > +++ b/arch/powerpc/kernel/irq.c
> > @@ -162,7 +162,7 @@ notrace unsigned int __check_irq_replay(void)
> > 
> >  	 * in case we also had a rollover while hard disabled
> >  	 */
> >  	
> >  	local_paca->irq_happened &= ~PACA_IRQ_DEC;
> > 
> > -	if (decrementer_check_overflow())
> > +	if ((happened & PACA_IRQ_DEC) || decrementer_check_overflow())
> > 
> >  		return 0x900;
> >  	
> >  	/* Finally check if an external interrupt happened */
> > 
> > Cheers,
> > Ben.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-17 13:21           ` Rojhalat Ibrahim
@ 2013-06-17 13:51             ` Steven Rostedt
  0 siblings, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2013-06-17 13:51 UTC (permalink / raw)
  To: Rojhalat Ibrahim; +Cc: paulmck, linuxppc-dev, linux-kernel

On Mon, 2013-06-17 at 15:21 +0200, Rojhalat Ibrahim wrote:
> On Friday 14 June 2013 22:17:34 Steven Rostedt wrote:
> > On Sat, 2013-06-15 at 12:02 +1000, Benjamin Herrenschmidt wrote:
> > > On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > > > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > > > thing is, when I type on the console, it makes progress. Anyway, it
> > > > seems that powerpc has an issue with irq_work(). I'll try to get some
> > > > time either tonight or next week to figure it out.
> > > 
> > > Does this help ?
> > 
> > It did for me. Rojhalat, did this fix your issue too?
> > 
> 
> FWIW, since the fix is already applied, but it fixes my problem too.
> 

Thanks for the update. Yeah, Ben was about to send a queue to Linus, and
as this proved to fix it for me, he felt that it needed to go in that
queue instead of waiting for an update.

But I do appreciate the update as I wasn't 100% that this would fix the
problem for you too.

-- Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-17  7:42         ` Michael Ellerman
@ 2013-06-19  4:09           ` Paul E. McKenney
  2013-06-25  7:19             ` Michael Ellerman
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2013-06-19  4:09 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, Rojhalat Ibrahim, Steven Rostedt, linux-kernel

On Mon, Jun 17, 2013 at 05:42:13PM +1000, Michael Ellerman wrote:
> On Sat, Jun 15, 2013 at 12:02:21PM +1000, Benjamin Herrenschmidt wrote:
> > On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > > thing is, when I type on the console, it makes progress. Anyway, it
> > > seems that powerpc has an issue with irq_work(). I'll try to get some
> > > time either tonight or next week to figure it out.
> > 
> > Does this help ?
> > 
> > diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> > index 5cbcf4d..ea185e0 100644
> > --- a/arch/powerpc/kernel/irq.c
> > +++ b/arch/powerpc/kernel/irq.c
> > @@ -162,7 +162,7 @@ notrace unsigned int __check_irq_replay(void)
> >  	 * in case we also had a rollover while hard disabled
> >  	 */
> >  	local_paca->irq_happened &= ~PACA_IRQ_DEC;
> > -	if (decrementer_check_overflow())
> > +	if ((happened & PACA_IRQ_DEC) || decrementer_check_overflow())
> >  		return 0x900;
> >  
> >  	/* Finally check if an external interrupt happened */
> > 
> 
> This seems to help, but doesn't elminate the RCU stall warnings I am
> seeing. I now see them less often, but not never.
> 
> Stack trace is something like:

Hmmm...  How many CPUs are on your system?  And how much work is
perf_event_for_each_child() having to do here?

If the amount of work is large and your kernel is built with
CONFIG_PREEMPT=n, the RCU CPU stall warning would be expected behavior.
If so, we might need a preemption point in perf_event_for_each_child().

							Thanx, Paul

>   INFO: rcu_sched detected stalls on CPUs/tasks: { 32} (detected by 12, t=21372 jiffies, g=18446744073709551503, c=18446744073709551502, q=1018)
>   Task dump for CPU 32:
>   power8-events   R  running task     4960  2009   1988 0x00000004
>   Call Trace:
>   [c000000fb0e3f910] [c000000fb0e3f9d0] 0xc000000fb0e3f9d0 (unreliable)
>   
>   [c000000fb0e3edc0] [c0000000000b2894] .__run_hrtimer+0xa4/0x2a0
>   [c000000fb0e3ee70] [c0000000000b36d8] .hrtimer_interrupt+0x148/0x320
>   [c000000fb0e3ef80] [c00000000001c754] .timer_interrupt+0x134/0x320
>   [c000000fb0e3f040] [c00000000000a4f4] restore_check_irq_replay+0x68/0xa8
>   --- Exception: 901 at .arch_local_irq_restore+0x24/0x90
>       LR = .__do_softirq+0x100/0x3a0
>   [c000000fb0e3f330] [c0000000000c4784] .vtime_account_irq_enter+0x34/0x70 (unreliable)
>   [c000000fb0e3f3a0] [c000000000089680] .__do_softirq+0x100/0x3a0
>   [c000000fb0e3f4c0] [c000000000089b38] .irq_exit+0xc8/0x110
>   [c000000fb0e3f540] [c00000000001c788] .timer_interrupt+0x168/0x320
>   [c000000fb0e3f600] [c0000000000025cc] decrementer_common+0x14c/0x180
>   --- Exception: 901 at .arch_local_irq_restore+0x74/0x90
>       LR = .arch_local_irq_restore+0x74/0x90
>   [c000000fb0e3f8f0] [c000000fb0e3f970] 0xc000000fb0e3f970 (unreliable)
>   [c000000fb0e3f960] [c0000000000e4ae0] .smp_call_function_single+0x1d0/0x1e0
>   [c000000fb0e3fa10] [c000000000147aa4] .task_function_call+0x54/0x70
>   [c000000fb0e3fab0] [c000000000147bc4] .perf_event_enable+0x104/0x1c0
>   [c000000fb0e3fb60] [c000000000146800] .perf_event_for_each_child+0x60/0x110
>   [c000000fb0e3fbf0] [c00000000014a528] .perf_ioctl+0x108/0x3f0
>   [c000000fb0e3fca0] [c0000000001d7138] .do_vfs_ioctl+0xb8/0x730
>   [c000000fb0e3fd80] [c0000000001d780c] .SyS_ioctl+0x5c/0xb0
>   [c000000fb0e3fe30] [c000000000009d54] syscall_exit+0x0/0x98
> 
> 
> cheers
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-19  4:09           ` Paul E. McKenney
@ 2013-06-25  7:19             ` Michael Ellerman
  2013-06-25  7:36               ` Benjamin Herrenschmidt
  2013-06-25  7:44               ` Michael Ellerman
  0 siblings, 2 replies; 16+ messages in thread
From: Michael Ellerman @ 2013-06-25  7:19 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linuxppc-dev, Rojhalat Ibrahim, Steven Rostedt, linux-kernel

On Tue, Jun 18, 2013 at 09:09:06PM -0700, Paul E. McKenney wrote:
> On Mon, Jun 17, 2013 at 05:42:13PM +1000, Michael Ellerman wrote:
> > On Sat, Jun 15, 2013 at 12:02:21PM +1000, Benjamin Herrenschmidt wrote:
> > > On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > > > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > > > thing is, when I type on the console, it makes progress. Anyway, it
> > > > seems that powerpc has an issue with irq_work(). I'll try to get some
> > > > time either tonight or next week to figure it out.
> > > 
> > > Does this help ?
> > > 
> > > diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> > > index 5cbcf4d..ea185e0 100644
> > > --- a/arch/powerpc/kernel/irq.c
> > > +++ b/arch/powerpc/kernel/irq.c
> > > @@ -162,7 +162,7 @@ notrace unsigned int __check_irq_replay(void)
> > >  	 * in case we also had a rollover while hard disabled
> > >  	 */
> > >  	local_paca->irq_happened &= ~PACA_IRQ_DEC;
> > > -	if (decrementer_check_overflow())
> > > +	if ((happened & PACA_IRQ_DEC) || decrementer_check_overflow())
> > >  		return 0x900;
> > >  
> > >  	/* Finally check if an external interrupt happened */
> > > 
> > 
> > This seems to help, but doesn't elminate the RCU stall warnings I am
> > seeing. I now see them less often, but not never.
> > 
> > Stack trace is something like:
 
Hi Paul,

Sorry I've been distracted with other stuff the last week.

> Hmmm...  How many CPUs are on your system?  And how much work is
> perf_event_for_each_child() having to do here?

I'm not 100% sure which system this trace is from. But it would have
~100-128 cpus.

I don't think perf_event_for_each_child() is doing much, there should
only be a single event and the smp_call_function_single() should be
degrading to a local function call.

> If the amount of work is large and your kernel is built with
> CONFIG_PREEMPT=n, the RCU CPU stall warning would be expected behavior.
> If so, we might need a preemption point in perf_event_for_each_child().

I'm using CONFIG_PREEMPT_NONE=y, which I think is what you mean.

Here's another trace from 3.10-rc7 plus a few local patches.

We suspect that the perf enable could be causing a flood of interrupts, but why
that's clogging things up so badly who knows.

INFO: rcu_sched self-detected stall on CPU { 38}  (t=2600 jiffies g=1 c=0 q=9)
cpu 0x26: Vector: 0  at [c0000007ed952b60]
    pc: c00000000014f500: .rcu_check_callbacks+0x400/0x8e0
    lr: c00000000014f500: .rcu_check_callbacks+0x400/0x8e0
    sp: c0000007ed952cd0
   msr: 9000000000009032
  current = 0xc0000007ed8b4a80
  paca    = 0xc00000000fdcab00	 softe: 0	 irq_happened: 0x00
    pid   = 2492, comm = power8-events
enter ? for help
[c0000007ed952e00] c0000000000a3e88 .update_process_times+0x48/0xa0
[c0000007ed952e90] c0000000000fd600 .tick_sched_handle.isra.13+0x40/0xd0
[c0000007ed952f20] c0000000000fd8b4 .tick_sched_timer+0x64/0xa0
[c0000007ed952fc0] c0000000000ca074 .__run_hrtimer+0x94/0x250
[c0000007ed953060] c0000000000cb0f8 .hrtimer_interrupt+0x138/0x3a0
[c0000007ed953150] c00000000001ef54 .timer_interrupt+0x124/0x2f0
[c0000007ed953200] c00000000000a5fc restore_check_irq_replay+0x68/0xa8
--- Exception: 901 (Decrementer) at c0000000000105ec .arch_local_irq_restore+0xc/0x10
[link register   ] c000000000096dac .__do_softirq+0x13c/0x380
[c0000007ed9534f0] c000000000096da0 .__do_softirq+0x130/0x380 (unreliable)
[c0000007ed953610] c000000000097228 .irq_exit+0xd8/0x120
[c0000007ed953690] c00000000001ef88 .timer_interrupt+0x158/0x2f0
[c0000007ed953740] c00000000000a5fc restore_check_irq_replay+0x68/0xa8
--- Exception: 901 (Decrementer) at c00000000010e16c .smp_call_function_single+0x13c/0x230
[c0000007ed953a30] c000000000189c64 .task_function_call+0x54/0x70 (unreliable)
[c0000007ed953ad0] c000000000189d4c .perf_event_enable+0xcc/0x150
[c0000007ed953b70] c000000000187ea0 .perf_event_for_each_child+0x60/0x100
[c0000007ed953c00] c00000000018c5e8 .perf_ioctl+0x108/0x3c0
[c0000007ed953ca0] c000000000226e94 .do_vfs_ioctl+0xc4/0x740
[c0000007ed953d90] c000000000227570 .SyS_ioctl+0x60/0xb0
[c0000007ed953e30] c000000000009e60 syscall_exit+0x0/0x98
--- Exception: c01 (System Call) at 00001fffffee03d0
SP (3fffdf0d2700) is in userspace


cheers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-25  7:19             ` Michael Ellerman
@ 2013-06-25  7:36               ` Benjamin Herrenschmidt
  2013-06-25  7:44               ` Michael Ellerman
  1 sibling, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2013-06-25  7:36 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Rojhalat Ibrahim, Paul E. McKenney, linuxppc-dev, linux-kernel,
	Steven Rostedt

On Tue, 2013-06-25 at 17:19 +1000, Michael Ellerman wrote:
> Here's another trace from 3.10-rc7 plus a few local patches.
> 
> We suspect that the perf enable could be causing a flood of
> interrupts, but why
> that's clogging things up so badly who knows.

Additionally, perf being potentially NMIs , we might be hitting a bad
case of reentrance in RCU ... hard to tell.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-25  7:19             ` Michael Ellerman
  2013-06-25  7:36               ` Benjamin Herrenschmidt
@ 2013-06-25  7:44               ` Michael Ellerman
  2013-06-25 16:03                 ` Paul E. McKenney
  1 sibling, 1 reply; 16+ messages in thread
From: Michael Ellerman @ 2013-06-25  7:44 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Rojhalat Ibrahim, linuxppc-dev, linux-kernel, Steven Rostedt

On Tue, Jun 25, 2013 at 05:19:14PM +1000, Michael Ellerman wrote:
> 
> Here's another trace from 3.10-rc7 plus a few local patches.

And here's another with CONFIG_RCU_CPU_STALL_INFO=y in case that's useful:

PASS running test_pmc5_6_overuse()
INFO: rcu_sched self-detected stall on CPU
	8: (1 GPs behind) idle=8eb/140000000000002/0 softirq=215/220 
	 (t=2100 jiffies g=18446744073709551583 c=18446744073709551582 q=13)
cpu 0x8: Vector: 0  at [c0000003ea03eae0]
    pc: c00000000011d9b0: .rcu_check_callbacks+0x450/0x910
    lr: c00000000011d9b0: .rcu_check_callbacks+0x450/0x910
    sp: c0000003ea03ec40
   msr: 9000000000009032
  current = 0xc0000003ebf9f4a0
  paca    = 0xc00000000fdc2400	 softe: 0	 irq_happened: 0x00
    pid   = 2444, comm = power8-events
enter ? for help
[c0000003ea03ed70] c000000000094cd0 .update_process_times+0x40/0x90
[c0000003ea03ee00] c0000000000df050 .tick_sched_handle.isra.13+0x20/0xa0
[c0000003ea03ee80] c0000000000df2bc .tick_sched_timer+0x5c/0xa0
[c0000003ea03ef20] c0000000000b3728 .__run_hrtimer+0x98/0x260
[c0000003ea03efc0] c0000000000b4738 .hrtimer_interrupt+0x138/0x3c0
[c0000003ea03f0d0] c00000000001cd34 .timer_interrupt+0x124/0x2f0
[c0000003ea03f180] c00000000000a4f4 restore_check_irq_replay+0x68/0xa8
--- Exception: 901 (Decrementer) at c000000000093ad4 .run_timer_softirq+0x74/0x360
[c0000003ea03f580] c000000000089ac4 .__do_softirq+0x174/0x350
[c0000003ea03f6a0] c000000000089ea8 .irq_exit+0xb8/0x100
[c0000003ea03f720] c00000000001cd68 .timer_interrupt+0x158/0x2f0
[c0000003ea03f7d0] c00000000000a4f4 restore_check_irq_replay+0x68/0xa8
--- Exception: 901 (Decrementer) at c00000000014a520 .task_function_call+0x60/0x70
[c0000003ea03fac0] c00000000014a634 .perf_event_enable+0x104/0x1c0 (unreliable)
[c0000003ea03fb70] c0000000001495ec .perf_event_for_each_child+0x5c/0xf0
[c0000003ea03fc00] c00000000014cd78 .perf_ioctl+0x108/0x400
[c0000003ea03fca0] c0000000001d9aa0 .do_vfs_ioctl+0xb0/0x740
[c0000003ea03fd80] c0000000001da188 .SyS_ioctl+0x58/0xb0
[c0000003ea03fe30] c000000000009d54 syscall_exit+0x0/0x98
--- Exception: c01 (System Call) at 00001fffffee03d0
SP (3ffff5e7cc90) is in userspace


cheers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-25  7:44               ` Michael Ellerman
@ 2013-06-25 16:03                 ` Paul E. McKenney
  2013-06-26  8:10                   ` Michael Ellerman
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2013-06-25 16:03 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Rojhalat Ibrahim, linuxppc-dev, linux-kernel, Steven Rostedt

On Tue, Jun 25, 2013 at 05:44:23PM +1000, Michael Ellerman wrote:
> On Tue, Jun 25, 2013 at 05:19:14PM +1000, Michael Ellerman wrote:
> > 
> > Here's another trace from 3.10-rc7 plus a few local patches.
> 
> And here's another with CONFIG_RCU_CPU_STALL_INFO=y in case that's useful:
> 
> PASS running test_pmc5_6_overuse()
> INFO: rcu_sched self-detected stall on CPU
> 	8: (1 GPs behind) idle=8eb/140000000000002/0 softirq=215/220 

So this CPU has been out of action since before the beginning of the
current grace period ("1 GPs behind").  It is not idle, having taken
a pair of nested interrupts from process context (matching the stack
below).  This CPU has take five softirqs since the last grace period
that it noticed, which makes it likely that the loop is within the
softirq handler.

> 	 (t=2100 jiffies g=18446744073709551583 c=18446744073709551582 q=13)

Assuming HZ=100, this stall has been going on  for 21 seconds.  There
is a grace period in progress according to RCU's global state (which
this CPU is not yet aware of).  There are a total of 13 RCU callbacks
queued across the entire system.

If the system is at all responsive, I suggest using ftrace (either from
the boot command line or at runtime) to trace __do_softirq() and
hrtimer_interrupt().

							Thanx, Paul

> cpu 0x8: Vector: 0  at [c0000003ea03eae0]
>     pc: c00000000011d9b0: .rcu_check_callbacks+0x450/0x910
>     lr: c00000000011d9b0: .rcu_check_callbacks+0x450/0x910
>     sp: c0000003ea03ec40
>    msr: 9000000000009032
>   current = 0xc0000003ebf9f4a0
>   paca    = 0xc00000000fdc2400	 softe: 0	 irq_happened: 0x00
>     pid   = 2444, comm = power8-events
> enter ? for help
> [c0000003ea03ed70] c000000000094cd0 .update_process_times+0x40/0x90
> [c0000003ea03ee00] c0000000000df050 .tick_sched_handle.isra.13+0x20/0xa0
> [c0000003ea03ee80] c0000000000df2bc .tick_sched_timer+0x5c/0xa0
> [c0000003ea03ef20] c0000000000b3728 .__run_hrtimer+0x98/0x260
> [c0000003ea03efc0] c0000000000b4738 .hrtimer_interrupt+0x138/0x3c0
> [c0000003ea03f0d0] c00000000001cd34 .timer_interrupt+0x124/0x2f0
> [c0000003ea03f180] c00000000000a4f4 restore_check_irq_replay+0x68/0xa8
> --- Exception: 901 (Decrementer) at c000000000093ad4 .run_timer_softirq+0x74/0x360
> [c0000003ea03f580] c000000000089ac4 .__do_softirq+0x174/0x350
> [c0000003ea03f6a0] c000000000089ea8 .irq_exit+0xb8/0x100
> [c0000003ea03f720] c00000000001cd68 .timer_interrupt+0x158/0x2f0
> [c0000003ea03f7d0] c00000000000a4f4 restore_check_irq_replay+0x68/0xa8
> --- Exception: 901 (Decrementer) at c00000000014a520 .task_function_call+0x60/0x70
> [c0000003ea03fac0] c00000000014a634 .perf_event_enable+0x104/0x1c0 (unreliable)
> [c0000003ea03fb70] c0000000001495ec .perf_event_for_each_child+0x5c/0xf0
> [c0000003ea03fc00] c00000000014cd78 .perf_ioctl+0x108/0x400
> [c0000003ea03fca0] c0000000001d9aa0 .do_vfs_ioctl+0xb0/0x740
> [c0000003ea03fd80] c0000000001da188 .SyS_ioctl+0x58/0xb0
> [c0000003ea03fe30] c000000000009d54 syscall_exit+0x0/0x98
> --- Exception: c01 (System Call) at 00001fffffee03d0
> SP (3ffff5e7cc90) is in userspace
> 
> 
> cheers
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-25 16:03                 ` Paul E. McKenney
@ 2013-06-26  8:10                   ` Michael Ellerman
  2013-06-26 14:16                     ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Ellerman @ 2013-06-26  8:10 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Rojhalat Ibrahim, linuxppc-dev, linux-kernel, Steven Rostedt

On Tue, Jun 25, 2013 at 09:03:32AM -0700, Paul E. McKenney wrote:
> On Tue, Jun 25, 2013 at 05:44:23PM +1000, Michael Ellerman wrote:
> > On Tue, Jun 25, 2013 at 05:19:14PM +1000, Michael Ellerman wrote:
> > > 
> > > Here's another trace from 3.10-rc7 plus a few local patches.
> > 
> > And here's another with CONFIG_RCU_CPU_STALL_INFO=y in case that's useful:
> > 
> > PASS running test_pmc5_6_overuse()
> > INFO: rcu_sched self-detected stall on CPU
> > 	8: (1 GPs behind) idle=8eb/140000000000002/0 softirq=215/220 
> 
> So this CPU has been out of action since before the beginning of the
> current grace period ("1 GPs behind").  It is not idle, having taken
> a pair of nested interrupts from process context (matching the stack
> below).  This CPU has take five softirqs since the last grace period
> that it noticed, which makes it likely that the loop is within the
> softirq handler.
> 
> > 	 (t=2100 jiffies g=18446744073709551583 c=18446744073709551582 q=13)
> 
> Assuming HZ=100, this stall has been going on  for 21 seconds.  There
> is a grace period in progress according to RCU's global state (which
> this CPU is not yet aware of).  There are a total of 13 RCU callbacks
> queued across the entire system.
> 
> If the system is at all responsive, I suggest using ftrace (either from
> the boot command line or at runtime) to trace __do_softirq() and
> hrtimer_interrupt().

Thanks for decoding it Paul.

I've narrowed down the test case and I think this is probably just a
case of too many perf interrupts. If I reduce the sampling period by
half the test runs fine.

There is logic in perf to detect an interrupt storm, but for some reason
it's not saving us. I'll dig in there, but I don't think it's an RCU
problem.

cheers

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Regression in RCU subsystem in latest mainline kernel
  2013-06-26  8:10                   ` Michael Ellerman
@ 2013-06-26 14:16                     ` Paul E. McKenney
  0 siblings, 0 replies; 16+ messages in thread
From: Paul E. McKenney @ 2013-06-26 14:16 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Rojhalat Ibrahim, linuxppc-dev, linux-kernel, Steven Rostedt

On Wed, Jun 26, 2013 at 06:10:58PM +1000, Michael Ellerman wrote:
> On Tue, Jun 25, 2013 at 09:03:32AM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 25, 2013 at 05:44:23PM +1000, Michael Ellerman wrote:
> > > On Tue, Jun 25, 2013 at 05:19:14PM +1000, Michael Ellerman wrote:
> > > > 
> > > > Here's another trace from 3.10-rc7 plus a few local patches.
> > > 
> > > And here's another with CONFIG_RCU_CPU_STALL_INFO=y in case that's useful:
> > > 
> > > PASS running test_pmc5_6_overuse()
> > > INFO: rcu_sched self-detected stall on CPU
> > > 	8: (1 GPs behind) idle=8eb/140000000000002/0 softirq=215/220 
> > 
> > So this CPU has been out of action since before the beginning of the
> > current grace period ("1 GPs behind").  It is not idle, having taken
> > a pair of nested interrupts from process context (matching the stack
> > below).  This CPU has take five softirqs since the last grace period
> > that it noticed, which makes it likely that the loop is within the
> > softirq handler.
> > 
> > > 	 (t=2100 jiffies g=18446744073709551583 c=18446744073709551582 q=13)
> > 
> > Assuming HZ=100, this stall has been going on  for 21 seconds.  There
> > is a grace period in progress according to RCU's global state (which
> > this CPU is not yet aware of).  There are a total of 13 RCU callbacks
> > queued across the entire system.
> > 
> > If the system is at all responsive, I suggest using ftrace (either from
> > the boot command line or at runtime) to trace __do_softirq() and
> > hrtimer_interrupt().
> 
> Thanks for decoding it Paul.
> 
> I've narrowed down the test case and I think this is probably just a
> case of too many perf interrupts. If I reduce the sampling period by
> half the test runs fine.
> 
> There is logic in perf to detect an interrupt storm, but for some reason
> it's not saving us. I'll dig in there, but I don't think it's an RCU
> problem.

Whew!  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-06-26 14:22 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1626500.7WAVXjfS9F@pcimr>
     [not found] ` <20130614122800.GL5146@linux.vnet.ibm.com>
     [not found]   ` <1645938.As0LR1yeVd@pcimr>
2013-06-14 21:06     ` Regression in RCU subsystem in latest mainline kernel Steven Rostedt
2013-06-15  2:02       ` Benjamin Herrenschmidt
2013-06-15  2:17         ` Steven Rostedt
2013-06-15  2:21           ` Benjamin Herrenschmidt
2013-06-15  2:31             ` Steven Rostedt
2013-06-15  2:51               ` Paul E. McKenney
2013-06-17 13:21           ` Rojhalat Ibrahim
2013-06-17 13:51             ` Steven Rostedt
2013-06-17  7:42         ` Michael Ellerman
2013-06-19  4:09           ` Paul E. McKenney
2013-06-25  7:19             ` Michael Ellerman
2013-06-25  7:36               ` Benjamin Herrenschmidt
2013-06-25  7:44               ` Michael Ellerman
2013-06-25 16:03                 ` Paul E. McKenney
2013-06-26  8:10                   ` Michael Ellerman
2013-06-26 14:16                     ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).