linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* linux-next: boot failure after merge of the final tree (tip related)
@ 2010-04-15  6:12 Stephen Rothwell
  2010-04-15  6:48 ` Benjamin Herrenschmidt
  2010-04-15  6:49 ` linux-next: PowerPC WARN_ON_ONCE() " Ingo Molnar
  0 siblings, 2 replies; 19+ messages in thread
From: Stephen Rothwell @ 2010-04-15  6:12 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Peter Zijlstra
  Cc: linux-next, linux-kernel, Frederic Weisbecker, ppc-dev

[-- Attachment #1: Type: text/plain, Size: 2554 bytes --]

Hi all,

Yesterday's (and today's) linux-next boot (PowerPC) failed like this:

------------[ cut here ]------------
Badness at kernel/lockdep.c:2301
NIP: c0000000000a35c8 LR: c0000000000084c4 CTR: 0000000000000000
REGS: c000000000bf77e0 TRAP: 0700   Not tainted  (2.6.34-rc4-autokern1)
MSR: 8000000000021032 <ME,CE,IR,DR>  CR: 24000044  XER: 00000004
TASK = c000000000aa3d30[0] 'swapper' THREAD: c000000000bf4000 CPU: 0
GPR00: 0000000000000001 c000000000bf7a60 c000000000bf32f0 c0000000000084c4 
GPR04: 0000000000000000 0000000000000a00 0000000000000000 0000000000000068 
GPR08: 0000000000000008 c000000000c4fabe 0000000000000000 7265677368657265 
GPR12: 8000000000009032 c000000007691000 0000000001c00000 c000000000770bf8 
GPR16: c00000000076f390 0000000000000000 0000000000430000 00000000024876f0 
GPR20: c000000000887480 0000000002487480 c0000000008876f0 0000000001b5f8d0 
GPR24: c000000000770478 0000000003300000 c000000000c1f1c8 c000000000884610 
GPR28: c000000000c1b290 c0000000000084c4 c000000000b45068 c000000000aa3d30 
NIP [c0000000000a35c8] .trace_hardirqs_on_caller+0xb0/0x224
LR [c0000000000084c4] system_call_common+0xc4/0x114
Call Trace:
[c000000000bf7a60] [c000000000bf7ba0] init_thread_union+0x3ba0/0x4000 (unreliable)
[c000000000bf7af0] [c0000000000084c4] system_call_common+0xc4/0x114
--- Exception: c01 at .kernel_thread+0x28/0x70
    LR = .rest_init+0x34/0xf8
[c000000000bf7de0] [c00000000086916c] .proc_sys_init+0x20/0x64 (unreliable)
[c000000000bf7e50] [c0000000000099c0] .rest_init+0x20/0xf8
[c000000000bf7ee0] [c000000000848af0] .start_kernel+0x484/0x4a8
[c000000000bf7f90] [c0000000000083c0] .start_here_common+0x1c/0x5c
Instruction dump:
409e0188 0fe00000 48000180 801f08d8 2f800000 41be0050 880d01da 2fa00000 
41be0028 e93e8538 88090000 68000001 <0b000000> 2fa00000 41be0010 e93e8538 
------------[ cut here ]------------

Caused by commit bd6d29c25bb1a24a4c160ec5de43e0004e01f72b ("lockstat:
Make lockstat counting per cpu").  This added a WARN_ON_ONCE to
debug_atomic_inc() which is called from trace_hardirqs_on_caller() with
irqs enabled.

Line 2301 is:

        if (unlikely(curr->hardirqs_enabled)) {
                debug_atomic_inc(redundant_hardirqs_on);   <--- 2301
                return;
        }

This is especially bad since on PowerPC, WARN_ON is a TRAP and the return
path from the TRAP also calls trace_hardirqs_on_caller(), so the TRAP
recurses ...

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: boot failure after merge of the final tree (tip related)
  2010-04-15  6:12 linux-next: boot failure after merge of the final tree (tip related) Stephen Rothwell
@ 2010-04-15  6:48 ` Benjamin Herrenschmidt
  2010-04-15  6:49 ` linux-next: PowerPC WARN_ON_ONCE() " Ingo Molnar
  1 sibling, 0 replies; 19+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-15  6:48 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Peter Zijlstra,
	Frederic Weisbecker, linux-next, ppc-dev, linux-kernel

On Thu, 2010-04-15 at 16:12 +1000, Stephen Rothwell wrote:
> 
> Caused by commit bd6d29c25bb1a24a4c160ec5de43e0004e01f72b ("lockstat:
> Make lockstat counting per cpu").  This added a WARN_ON_ONCE to
> debug_atomic_inc() which is called from trace_hardirqs_on_caller()
> with
> irqs enabled.
> 
> Line 2301 is:
> 
>         if (unlikely(curr->hardirqs_enabled)) {
>                 debug_atomic_inc(redundant_hardirqs_on);   <--- 2301
>                 return;
>         }
> 
> This is especially bad since on PowerPC, WARN_ON is a TRAP and the
> return
> path from the TRAP also calls trace_hardirqs_on_caller(), so the TRAP
> recurses ... 

I think this is because our syscall entry pretty much force-enable irqs.

I remember deciding back then that getting lockdep balanced in that area
was tricky and I didn't do it to avoid adding more overhead to the
syscall path but I suppose I could revisit if necessary.

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15  6:12 linux-next: boot failure after merge of the final tree (tip related) Stephen Rothwell
  2010-04-15  6:48 ` Benjamin Herrenschmidt
@ 2010-04-15  6:49 ` Ingo Molnar
  2010-04-15  6:55   ` David Miller
                     ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: Ingo Molnar @ 2010-04-15  6:49 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Thomas Gleixner, H. Peter Anvin, Peter Zijlstra, linux-next,
	linux-kernel, Frederic Weisbecker, ppc-dev


* Stephen Rothwell <sfr@canb.auug.org.au> wrote:

> Hi all,
> 
> Yesterday's (and today's) linux-next boot (PowerPC) failed like this:
> 
> ------------[ cut here ]------------
> Badness at kernel/lockdep.c:2301
> NIP: c0000000000a35c8 LR: c0000000000084c4 CTR: 0000000000000000
> REGS: c000000000bf77e0 TRAP: 0700   Not tainted  (2.6.34-rc4-autokern1)
> MSR: 8000000000021032 <ME,CE,IR,DR>  CR: 24000044  XER: 00000004
> TASK = c000000000aa3d30[0] 'swapper' THREAD: c000000000bf4000 CPU: 0
> GPR00: 0000000000000001 c000000000bf7a60 c000000000bf32f0 c0000000000084c4 
> GPR04: 0000000000000000 0000000000000a00 0000000000000000 0000000000000068 
> GPR08: 0000000000000008 c000000000c4fabe 0000000000000000 7265677368657265 
> GPR12: 8000000000009032 c000000007691000 0000000001c00000 c000000000770bf8 
> GPR16: c00000000076f390 0000000000000000 0000000000430000 00000000024876f0 
> GPR20: c000000000887480 0000000002487480 c0000000008876f0 0000000001b5f8d0 
> GPR24: c000000000770478 0000000003300000 c000000000c1f1c8 c000000000884610 
> GPR28: c000000000c1b290 c0000000000084c4 c000000000b45068 c000000000aa3d30 
> NIP [c0000000000a35c8] .trace_hardirqs_on_caller+0xb0/0x224
> LR [c0000000000084c4] system_call_common+0xc4/0x114
> Call Trace:
> [c000000000bf7a60] [c000000000bf7ba0] init_thread_union+0x3ba0/0x4000 (unreliable)
> [c000000000bf7af0] [c0000000000084c4] system_call_common+0xc4/0x114
> --- Exception: c01 at .kernel_thread+0x28/0x70
>     LR = .rest_init+0x34/0xf8
> [c000000000bf7de0] [c00000000086916c] .proc_sys_init+0x20/0x64 (unreliable)
> [c000000000bf7e50] [c0000000000099c0] .rest_init+0x20/0xf8
> [c000000000bf7ee0] [c000000000848af0] .start_kernel+0x484/0x4a8
> [c000000000bf7f90] [c0000000000083c0] .start_here_common+0x1c/0x5c
> Instruction dump:
> 409e0188 0fe00000 48000180 801f08d8 2f800000 41be0050 880d01da 2fa00000 
> 41be0028 e93e8538 88090000 68000001 <0b000000> 2fa00000 41be0010 e93e8538 
> ------------[ cut here ]------------
> 
> Caused by commit bd6d29c25bb1a24a4c160ec5de43e0004e01f72b ("lockstat:
> Make lockstat counting per cpu").  This added a WARN_ON_ONCE to
> debug_atomic_inc() which is called from trace_hardirqs_on_caller() with
> irqs enabled.
> 
> Line 2301 is:
> 
>         if (unlikely(curr->hardirqs_enabled)) {
>                 debug_atomic_inc(redundant_hardirqs_on);   <--- 2301
>                 return;
>         }
> 
> This is especially bad since on PowerPC, WARN_ON is a TRAP and the return
> path from the TRAP also calls trace_hardirqs_on_caller(), so the TRAP
> recurses ...

Ok, we'll fix the warning.

Btw., WARN_ON trapping on PowerPC is clearly a PowerPC bug - there's a good 
reason we have WARN_ON versus BUG_ON - it should be fixed.

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15  6:49 ` linux-next: PowerPC WARN_ON_ONCE() " Ingo Molnar
@ 2010-04-15  6:55   ` David Miller
  2010-04-15  7:32     ` Ingo Molnar
  2010-04-16  1:56     ` Benjamin Herrenschmidt
  2010-04-15 10:04   ` Stephen Rothwell
  2010-04-15 13:00   ` linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related) Frederic Weisbecker
  2 siblings, 2 replies; 19+ messages in thread
From: David Miller @ 2010-04-15  6:55 UTC (permalink / raw)
  To: mingo
  Cc: sfr, tglx, hpa, peterz, linux-next, linux-kernel, fweisbec, linuxppc-dev

From: Ingo Molnar <mingo@elte.hu>
Date: Thu, 15 Apr 2010 08:49:40 +0200

> Btw., WARN_ON trapping on PowerPC is clearly a PowerPC bug - there's a good 
> reason we have WARN_ON versus BUG_ON - it should be fixed.

I disagree, an implementation should be allowed to use the most
efficient implementation possible for both interfaces.

I would be using traps for both on sparc64 if that were really
feasible on sparc64 (and actually with gcc-4.5's "asm goto" it might
actually be now)

The WARN and BUG macros, when implemented without traps, have serious
implications for overall code size and register pressure.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15  6:55   ` David Miller
@ 2010-04-15  7:32     ` Ingo Molnar
  2010-04-16  1:51       ` Benjamin Herrenschmidt
  2010-04-16  1:56     ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 19+ messages in thread
From: Ingo Molnar @ 2010-04-15  7:32 UTC (permalink / raw)
  To: David Miller
  Cc: sfr, tglx, hpa, peterz, linux-next, linux-kernel, fweisbec, linuxppc-dev


* David Miller <davem@davemloft.net> wrote:

> From: Ingo Molnar <mingo@elte.hu>
> Date: Thu, 15 Apr 2010 08:49:40 +0200
> 
> > Btw., WARN_ON trapping on PowerPC is clearly a PowerPC bug - there's a good 
> > reason we have WARN_ON versus BUG_ON - it should be fixed.
> 
> I disagree, an implementation should be allowed to use the most
> efficient implementation possible for both interfaces.

It trades robustness for slightly better space/code efficiency.

Such a trap based mechanism exists on x86 as well and we use it for BUG_ON(). 
We intentionally dont use it to generate warnings and dont override __WARN(), 
because it would blow up way too often when a warning triggers in some 
sensitive codepath that cannot take a trap.

Anyway, the warning obviously has to be fixed - but the boot crash itself is 
PowerPC's own doing.

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15  6:49 ` linux-next: PowerPC WARN_ON_ONCE() " Ingo Molnar
  2010-04-15  6:55   ` David Miller
@ 2010-04-15 10:04   ` Stephen Rothwell
  2010-04-15 13:37     ` [PATCH] lockdep: Fix redundant_hardirqs_on incremented with irqs enabled Frederic Weisbecker
  2010-04-15 13:00   ` linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related) Frederic Weisbecker
  2 siblings, 1 reply; 19+ messages in thread
From: Stephen Rothwell @ 2010-04-15 10:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, H. Peter Anvin, Peter Zijlstra, linux-next,
	linux-kernel, Frederic Weisbecker, ppc-dev

[-- Attachment #1: Type: text/plain, Size: 236 bytes --]

Hi Ingo,

On Thu, 15 Apr 2010 08:49:40 +0200 Ingo Molnar <mingo@elte.hu> wrote:
>
> Ok, we'll fix the warning.

Thanks.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15  6:49 ` linux-next: PowerPC WARN_ON_ONCE() " Ingo Molnar
  2010-04-15  6:55   ` David Miller
  2010-04-15 10:04   ` Stephen Rothwell
@ 2010-04-15 13:00   ` Frederic Weisbecker
  2010-04-15 14:03     ` Ingo Molnar
  2 siblings, 1 reply; 19+ messages in thread
From: Frederic Weisbecker @ 2010-04-15 13:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Stephen Rothwell, Thomas Gleixner, H. Peter Anvin,
	Peter Zijlstra, linux-next, linux-kernel, ppc-dev

On Thu, Apr 15, 2010 at 08:49:40AM +0200, Ingo Molnar wrote:
> 
> * Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> 
> > Hi all,
> > 
> > Yesterday's (and today's) linux-next boot (PowerPC) failed like this:
> > 
> > ------------[ cut here ]------------
> > Badness at kernel/lockdep.c:2301
> > NIP: c0000000000a35c8 LR: c0000000000084c4 CTR: 0000000000000000
> > REGS: c000000000bf77e0 TRAP: 0700   Not tainted  (2.6.34-rc4-autokern1)
> > MSR: 8000000000021032 <ME,CE,IR,DR>  CR: 24000044  XER: 00000004
> > TASK = c000000000aa3d30[0] 'swapper' THREAD: c000000000bf4000 CPU: 0
> > GPR00: 0000000000000001 c000000000bf7a60 c000000000bf32f0 c0000000000084c4 
> > GPR04: 0000000000000000 0000000000000a00 0000000000000000 0000000000000068 
> > GPR08: 0000000000000008 c000000000c4fabe 0000000000000000 7265677368657265 
> > GPR12: 8000000000009032 c000000007691000 0000000001c00000 c000000000770bf8 
> > GPR16: c00000000076f390 0000000000000000 0000000000430000 00000000024876f0 
> > GPR20: c000000000887480 0000000002487480 c0000000008876f0 0000000001b5f8d0 
> > GPR24: c000000000770478 0000000003300000 c000000000c1f1c8 c000000000884610 
> > GPR28: c000000000c1b290 c0000000000084c4 c000000000b45068 c000000000aa3d30 
> > NIP [c0000000000a35c8] .trace_hardirqs_on_caller+0xb0/0x224
> > LR [c0000000000084c4] system_call_common+0xc4/0x114
> > Call Trace:
> > [c000000000bf7a60] [c000000000bf7ba0] init_thread_union+0x3ba0/0x4000 (unreliable)
> > [c000000000bf7af0] [c0000000000084c4] system_call_common+0xc4/0x114
> > --- Exception: c01 at .kernel_thread+0x28/0x70
> >     LR = .rest_init+0x34/0xf8
> > [c000000000bf7de0] [c00000000086916c] .proc_sys_init+0x20/0x64 (unreliable)
> > [c000000000bf7e50] [c0000000000099c0] .rest_init+0x20/0xf8
> > [c000000000bf7ee0] [c000000000848af0] .start_kernel+0x484/0x4a8
> > [c000000000bf7f90] [c0000000000083c0] .start_here_common+0x1c/0x5c
> > Instruction dump:
> > 409e0188 0fe00000 48000180 801f08d8 2f800000 41be0050 880d01da 2fa00000 
> > 41be0028 e93e8538 88090000 68000001 <0b000000> 2fa00000 41be0010 e93e8538 
> > ------------[ cut here ]------------
> > 
> > Caused by commit bd6d29c25bb1a24a4c160ec5de43e0004e01f72b ("lockstat:
> > Make lockstat counting per cpu").  This added a WARN_ON_ONCE to
> > debug_atomic_inc() which is called from trace_hardirqs_on_caller() with
> > irqs enabled.
> > 
> > Line 2301 is:
> > 
> >         if (unlikely(curr->hardirqs_enabled)) {
> >                 debug_atomic_inc(redundant_hardirqs_on);   <--- 2301
> >                 return;
> >         }
> > 
> > This is especially bad since on PowerPC, WARN_ON is a TRAP and the return
> > path from the TRAP also calls trace_hardirqs_on_caller(), so the TRAP
> > recurses ...
> 
> Ok, we'll fix the warning.
> 
> Btw., WARN_ON trapping on PowerPC is clearly a PowerPC bug - there's a good 
> reason we have WARN_ON versus BUG_ON - it should be fixed.


In this case, I guess the following fix should be sufficient?
I'm going to test it and provide a sane changelog.


diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 78325f8..65d4336 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -2298,7 +2298,11 @@ void trace_hardirqs_on_caller(unsigned long ip)
 		return;
 
 	if (unlikely(curr->hardirqs_enabled)) {
+		unsigned long flags;
+
+		raw_local_irq_save(flags);
 		debug_atomic_inc(redundant_hardirqs_on);
+		raw_local_irq_restore(flags);
 		return;
 	}
 	/* we'll do an OFF -> ON transition: */


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH] lockdep: Fix redundant_hardirqs_on incremented with irqs enabled
  2010-04-15 10:04   ` Stephen Rothwell
@ 2010-04-15 13:37     ` Frederic Weisbecker
  2010-04-27  7:32       ` Stephen Rothwell
  2010-04-28  7:12       ` Stephen Rothwell
  0 siblings, 2 replies; 19+ messages in thread
From: Frederic Weisbecker @ 2010-04-15 13:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Stephen Rothwell, Peter Zijlstra,
	David Miller, Benjamin Herrenschmidt

When a path restore the flags while irqs are already enabled, we
update the per cpu var redundant_hardirqs_on in a racy fashion
and debug_atomic_inc() warns about this situation.

In this particular case, we need to explictly disable the irqs before
updating this stat var in order to update it safely.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 kernel/lockdep.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index 78325f8..65d4336 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -2298,7 +2298,11 @@ void trace_hardirqs_on_caller(unsigned long ip)
 		return;
 
 	if (unlikely(curr->hardirqs_enabled)) {
+		unsigned long flags;
+
+		raw_local_irq_save(flags);
 		debug_atomic_inc(redundant_hardirqs_on);
+		raw_local_irq_restore(flags);
 		return;
 	}
 	/* we'll do an OFF -> ON transition: */
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15 13:00   ` linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related) Frederic Weisbecker
@ 2010-04-15 14:03     ` Ingo Molnar
  2010-04-15 17:15       ` Frederic Weisbecker
  2010-04-15 17:24       ` Frederic Weisbecker
  0 siblings, 2 replies; 19+ messages in thread
From: Ingo Molnar @ 2010-04-15 14:03 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Stephen Rothwell, Thomas Gleixner, H. Peter Anvin,
	Peter Zijlstra, linux-next, linux-kernel, ppc-dev


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Thu, Apr 15, 2010 at 08:49:40AM +0200, Ingo Molnar wrote:
> > 
> > * Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> > 
> > > Hi all,
> > > 
> > > Yesterday's (and today's) linux-next boot (PowerPC) failed like this:
> > > 
> > > ------------[ cut here ]------------
> > > Badness at kernel/lockdep.c:2301
> > > NIP: c0000000000a35c8 LR: c0000000000084c4 CTR: 0000000000000000
> > > REGS: c000000000bf77e0 TRAP: 0700   Not tainted  (2.6.34-rc4-autokern1)
> > > MSR: 8000000000021032 <ME,CE,IR,DR>  CR: 24000044  XER: 00000004
> > > TASK = c000000000aa3d30[0] 'swapper' THREAD: c000000000bf4000 CPU: 0
> > > GPR00: 0000000000000001 c000000000bf7a60 c000000000bf32f0 c0000000000084c4 
> > > GPR04: 0000000000000000 0000000000000a00 0000000000000000 0000000000000068 
> > > GPR08: 0000000000000008 c000000000c4fabe 0000000000000000 7265677368657265 
> > > GPR12: 8000000000009032 c000000007691000 0000000001c00000 c000000000770bf8 
> > > GPR16: c00000000076f390 0000000000000000 0000000000430000 00000000024876f0 
> > > GPR20: c000000000887480 0000000002487480 c0000000008876f0 0000000001b5f8d0 
> > > GPR24: c000000000770478 0000000003300000 c000000000c1f1c8 c000000000884610 
> > > GPR28: c000000000c1b290 c0000000000084c4 c000000000b45068 c000000000aa3d30 
> > > NIP [c0000000000a35c8] .trace_hardirqs_on_caller+0xb0/0x224
> > > LR [c0000000000084c4] system_call_common+0xc4/0x114
> > > Call Trace:
> > > [c000000000bf7a60] [c000000000bf7ba0] init_thread_union+0x3ba0/0x4000 (unreliable)
> > > [c000000000bf7af0] [c0000000000084c4] system_call_common+0xc4/0x114
> > > --- Exception: c01 at .kernel_thread+0x28/0x70
> > >     LR = .rest_init+0x34/0xf8
> > > [c000000000bf7de0] [c00000000086916c] .proc_sys_init+0x20/0x64 (unreliable)
> > > [c000000000bf7e50] [c0000000000099c0] .rest_init+0x20/0xf8
> > > [c000000000bf7ee0] [c000000000848af0] .start_kernel+0x484/0x4a8
> > > [c000000000bf7f90] [c0000000000083c0] .start_here_common+0x1c/0x5c
> > > Instruction dump:
> > > 409e0188 0fe00000 48000180 801f08d8 2f800000 41be0050 880d01da 2fa00000 
> > > 41be0028 e93e8538 88090000 68000001 <0b000000> 2fa00000 41be0010 e93e8538 
> > > ------------[ cut here ]------------
> > > 
> > > Caused by commit bd6d29c25bb1a24a4c160ec5de43e0004e01f72b ("lockstat:
> > > Make lockstat counting per cpu").  This added a WARN_ON_ONCE to
> > > debug_atomic_inc() which is called from trace_hardirqs_on_caller() with
> > > irqs enabled.
> > > 
> > > Line 2301 is:
> > > 
> > >         if (unlikely(curr->hardirqs_enabled)) {
> > >                 debug_atomic_inc(redundant_hardirqs_on);   <--- 2301
> > >                 return;
> > >         }
> > > 
> > > This is especially bad since on PowerPC, WARN_ON is a TRAP and the return
> > > path from the TRAP also calls trace_hardirqs_on_caller(), so the TRAP
> > > recurses ...
> > 
> > Ok, we'll fix the warning.
> > 
> > Btw., WARN_ON trapping on PowerPC is clearly a PowerPC bug - there's a good 
> > reason we have WARN_ON versus BUG_ON - it should be fixed.
> 
> 
> In this case, I guess the following fix should be sufficient?
> I'm going to test it and provide a sane changelog.
> 
> 
> diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> index 78325f8..65d4336 100644
> --- a/kernel/lockdep.c
> +++ b/kernel/lockdep.c
> @@ -2298,7 +2298,11 @@ void trace_hardirqs_on_caller(unsigned long ip)
>  		return;
>  
>  	if (unlikely(curr->hardirqs_enabled)) {
> +		unsigned long flags;
> +
> +		raw_local_irq_save(flags);
>  		debug_atomic_inc(redundant_hardirqs_on);
> +		raw_local_irq_restore(flags);
>  		return;
>  	}
>  	/* we'll do an OFF -> ON transition: */

that looks rather ugly. Why not do a raw:

	this_cpu_inc(lockdep_stats.redundant_hardirqs_on);

which basically open-codes debug_atomic_inc(), but without the warning?

Btw., using the this_cpu() methods might result in faster code for all the 
debug_atomic_inc() macros as well?

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15 14:03     ` Ingo Molnar
@ 2010-04-15 17:15       ` Frederic Weisbecker
  2010-04-16 10:38         ` Peter Zijlstra
  2010-04-15 17:24       ` Frederic Weisbecker
  1 sibling, 1 reply; 19+ messages in thread
From: Frederic Weisbecker @ 2010-04-15 17:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Stephen Rothwell, Thomas Gleixner, H. Peter Anvin,
	Peter Zijlstra, linux-next, linux-kernel, ppc-dev

On Thu, Apr 15, 2010 at 04:03:58PM +0200, Ingo Molnar wrote:
> > In this case, I guess the following fix should be sufficient?
> > I'm going to test it and provide a sane changelog.
> > 
> > 
> > diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> > index 78325f8..65d4336 100644
> > --- a/kernel/lockdep.c
> > +++ b/kernel/lockdep.c
> > @@ -2298,7 +2298,11 @@ void trace_hardirqs_on_caller(unsigned long ip)
> >  		return;
> >  
> >  	if (unlikely(curr->hardirqs_enabled)) {
> > +		unsigned long flags;
> > +
> > +		raw_local_irq_save(flags);
> >  		debug_atomic_inc(redundant_hardirqs_on);
> > +		raw_local_irq_restore(flags);
> >  		return;
> >  	}
> >  	/* we'll do an OFF -> ON transition: */
> 
> that looks rather ugly. Why not do a raw:
> 
> 	this_cpu_inc(lockdep_stats.redundant_hardirqs_on);
> 
> which basically open-codes debug_atomic_inc(), but without the warning?


Because that would open a race against interrupts that might
touch lockdep_stats.redundant_hardirqs_on too.

If you think it's not very important (this race must be pretty rare I guess),
I can use your solution.



> 
> Btw., using the this_cpu() methods might result in faster code for all the 
> debug_atomic_inc() macros as well?


Indeed, will change that too.

Thanks.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15 14:03     ` Ingo Molnar
  2010-04-15 17:15       ` Frederic Weisbecker
@ 2010-04-15 17:24       ` Frederic Weisbecker
  2010-04-15 17:39         ` Ingo Molnar
  1 sibling, 1 reply; 19+ messages in thread
From: Frederic Weisbecker @ 2010-04-15 17:24 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Stephen Rothwell, Thomas Gleixner, H. Peter Anvin,
	Peter Zijlstra, linux-next, linux-kernel, ppc-dev

On Thu, Apr 15, 2010 at 04:03:58PM +0200, Ingo Molnar wrote:
> > 
> > 
> > diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> > index 78325f8..65d4336 100644
> > --- a/kernel/lockdep.c
> > +++ b/kernel/lockdep.c
> > @@ -2298,7 +2298,11 @@ void trace_hardirqs_on_caller(unsigned long ip)
> >  		return;
> >  
> >  	if (unlikely(curr->hardirqs_enabled)) {
> > +		unsigned long flags;
> > +
> > +		raw_local_irq_save(flags);
> >  		debug_atomic_inc(redundant_hardirqs_on);
> > +		raw_local_irq_restore(flags);
> >  		return;
> >  	}
> >  	/* we'll do an OFF -> ON transition: */
> 
> that looks rather ugly. Why not do a raw:
> 
> 	this_cpu_inc(lockdep_stats.redundant_hardirqs_on);
> 
> which basically open-codes debug_atomic_inc(), but without the warning?


There is also no guarantee we are in a non-preemptable section. We can then
also race against another cpu.

I'm not sure what to do.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15 17:24       ` Frederic Weisbecker
@ 2010-04-15 17:39         ` Ingo Molnar
  0 siblings, 0 replies; 19+ messages in thread
From: Ingo Molnar @ 2010-04-15 17:39 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Stephen Rothwell, Thomas Gleixner, H. Peter Anvin,
	Peter Zijlstra, linux-next, linux-kernel, ppc-dev


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Thu, Apr 15, 2010 at 04:03:58PM +0200, Ingo Molnar wrote:
> > > 
> > > 
> > > diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> > > index 78325f8..65d4336 100644
> > > --- a/kernel/lockdep.c
> > > +++ b/kernel/lockdep.c
> > > @@ -2298,7 +2298,11 @@ void trace_hardirqs_on_caller(unsigned long ip)
> > >  		return;
> > >  
> > >  	if (unlikely(curr->hardirqs_enabled)) {
> > > +		unsigned long flags;
> > > +
> > > +		raw_local_irq_save(flags);
> > >  		debug_atomic_inc(redundant_hardirqs_on);
> > > +		raw_local_irq_restore(flags);
> > >  		return;
> > >  	}
> > >  	/* we'll do an OFF -> ON transition: */
> > 
> > that looks rather ugly. Why not do a raw:
> > 
> > 	this_cpu_inc(lockdep_stats.redundant_hardirqs_on);
> > 
> > which basically open-codes debug_atomic_inc(), but without the warning?
> 
> 
> There is also no guarantee we are in a non-preemptable section. We can then
> also race against another cpu.
> 
> I'm not sure what to do.

it's a statistics counter so worst-case we lose a count. It's not a real issue 
- but might be worth adding a comment.

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15  7:32     ` Ingo Molnar
@ 2010-04-16  1:51       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 19+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-16  1:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David Miller, sfr, peterz, fweisbec, linux-kernel, linux-next,
	hpa, tglx, linuxppc-dev

On Thu, 2010-04-15 at 09:32 +0200, Ingo Molnar wrote:
> It trades robustness for slightly better space/code efficiency.
> 
> Such a trap based mechanism exists on x86 as well and we use it for BUG_ON(). 
> We intentionally dont use it to generate warnings and dont override __WARN(), 
> because it would blow up way too often when a warning triggers in some 
> sensitive codepath that cannot take a trap.
> 
> Anyway, the warning obviously has to be fixed - but the boot crash itself is 
> PowerPC's own doing. 

Well, yes and no, as I explained in a separate branch of that thread. We
indeed can't cope with a WARN in that spot because it goes recursive. 

Now the reason we have this double-enable is due afaik to the way I
implemented IRQ trace, because things like syscalls basically
force-enable IRQs on powerpc and I don't necessarily have tracking
informations in the exception return path of what the "old" value was.

I need to double check what the exact scenario here is and whether I can
fix it but it's one of those cases where what lockdep is warning about
isn't actually an error I believe.

Cheers,
Ben. 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15  6:55   ` David Miller
  2010-04-15  7:32     ` Ingo Molnar
@ 2010-04-16  1:56     ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 19+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-16  1:56 UTC (permalink / raw)
  To: David Miller
  Cc: mingo, sfr, peterz, fweisbec, linux-kernel, linux-next, hpa,
	tglx, linuxppc-dev

On Wed, 2010-04-14 at 23:55 -0700, David Miller wrote:
> From: Ingo Molnar <mingo@elte.hu>
> Date: Thu, 15 Apr 2010 08:49:40 +0200
> 
> > Btw., WARN_ON trapping on PowerPC is clearly a PowerPC bug - there's a good 
> > reason we have WARN_ON versus BUG_ON - it should be fixed.
> 
> I disagree, an implementation should be allowed to use the most
> efficient implementation possible for both interfaces.

Right, and I don't think the reason why we have WARN_ON vs. BUG_ON ever
had anything to do with whether it's implemented with a trap or not :-) 

It's purely related to whether it's supposed to be fatal or not. Now,
there is indeed the potential problem you mention of WARN_ON being
called in places where such a trap is unsafe, but so far, this is the
first time I can remember we hit that problem so we've been getting away
with it for quite a while :-)

Now, whether the trap is or is not more efficient than an explicit test
is something that is still being debated on powerpc. It has the
advantage of not un-leafing functions (and thus not creating stack
frames, adding register reloads, etc... when not needed), basically
putting the burden of saving/restoring registers to the (hopefully) rare
path of actually taking the WARN/BUG.

We could probably manufacture something similar with careful use of
inline asm and an out of line asm trampoline.

The benefit of the trap instruction vs. conditional branches per-se is
probably nil. It's really more about the codegen impact, register
clobber due to the added function call, etc.. at least for us.

Cheers,
Ben.

> I would be using traps for both on sparc64 if that were really
> feasible on sparc64 (and actually with gcc-4.5's "asm goto" it might
> actually be now)
> 
> The WARN and BUG macros, when implemented without traps, have serious
> implications for overall code size and register pressure.
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-15 17:15       ` Frederic Weisbecker
@ 2010-04-16 10:38         ` Peter Zijlstra
  2010-04-16 12:32           ` Frederic Weisbecker
  0 siblings, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2010-04-16 10:38 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Stephen Rothwell, Thomas Gleixner, H. Peter Anvin,
	linux-next, linux-kernel, ppc-dev

On Thu, 2010-04-15 at 19:15 +0200, Frederic Weisbecker wrote:
> > that looks rather ugly. Why not do a raw:
> > 
> >       this_cpu_inc(lockdep_stats.redundant_hardirqs_on);
> > 
> > which basically open-codes debug_atomic_inc(), but without the warning?
> 
> 
> Because that would open a race against interrupts that might
> touch lockdep_stats.redundant_hardirqs_on too.


How so, its a pure per-cpu variable right? so either the increment
happens before the interrupts hits, or after, in either case there
should not be a race with interrupts.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related)
  2010-04-16 10:38         ` Peter Zijlstra
@ 2010-04-16 12:32           ` Frederic Weisbecker
  0 siblings, 0 replies; 19+ messages in thread
From: Frederic Weisbecker @ 2010-04-16 12:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Stephen Rothwell, Thomas Gleixner, H. Peter Anvin,
	linux-next, linux-kernel, ppc-dev

On Fri, Apr 16, 2010 at 12:38:43PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-04-15 at 19:15 +0200, Frederic Weisbecker wrote:
> > > that looks rather ugly. Why not do a raw:
> > > 
> > >       this_cpu_inc(lockdep_stats.redundant_hardirqs_on);
> > > 
> > > which basically open-codes debug_atomic_inc(), but without the warning?
> > 
> > 
> > Because that would open a race against interrupts that might
> > touch lockdep_stats.redundant_hardirqs_on too.
> 
> 
> How so, its a pure per-cpu variable right? so either the increment
> happens before the interrupts hits, or after, in either case there
> should not be a race with interrupts.


In x86 yeah, I guess the compiler simply loads the address
and does an inc directly, which is atomic wrt interrupts.

But what about another arch that would need an intermediate
load of the value:

load val, reg
add reg, 1
		<---interrupt here
store reg, val


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] lockdep: Fix redundant_hardirqs_on incremented with irqs enabled
  2010-04-15 13:37     ` [PATCH] lockdep: Fix redundant_hardirqs_on incremented with irqs enabled Frederic Weisbecker
@ 2010-04-27  7:32       ` Stephen Rothwell
  2010-04-27 18:07         ` Frederic Weisbecker
  2010-04-28  7:12       ` Stephen Rothwell
  1 sibling, 1 reply; 19+ messages in thread
From: Stephen Rothwell @ 2010-04-27  7:32 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Peter Zijlstra, David Miller, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 921 bytes --]

Hi Frederic,

On Thu, 15 Apr 2010 15:37:50 +0200 Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
> When a path restore the flags while irqs are already enabled, we
> update the per cpu var redundant_hardirqs_on in a racy fashion
> and debug_atomic_inc() warns about this situation.
> 
> In this particular case, we need to explictly disable the irqs before
> updating this stat var in order to update it safely.
> 
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: David Miller <davem@davemloft.net>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>  kernel/lockdep.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)

Ping?  Any movement on this?

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] lockdep: Fix redundant_hardirqs_on incremented with irqs enabled
  2010-04-27  7:32       ` Stephen Rothwell
@ 2010-04-27 18:07         ` Frederic Weisbecker
  0 siblings, 0 replies; 19+ messages in thread
From: Frederic Weisbecker @ 2010-04-27 18:07 UTC (permalink / raw)
  To: Stephen Rothwell, Ingo Molnar
  Cc: LKML, Peter Zijlstra, David Miller, Benjamin Herrenschmidt

On Tue, Apr 27, 2010 at 05:32:27PM +1000, Stephen Rothwell wrote:
> Hi Frederic,
> 
> On Thu, 15 Apr 2010 15:37:50 +0200 Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >
> > When a path restore the flags while irqs are already enabled, we
> > update the per cpu var redundant_hardirqs_on in a racy fashion
> > and debug_atomic_inc() warns about this situation.
> > 
> > In this particular case, we need to explictly disable the irqs before
> > updating this stat var in order to update it safely.
> > 
> > Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: David Miller <davem@davemloft.net>
> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > ---
> >  kernel/lockdep.c |    4 ++++
> >  1 files changed, 4 insertions(+), 0 deletions(-)
> 
> Ping?  Any movement on this?


Ingo, if you have no problems with these two patches, could you please
apply them?

Thanks.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] lockdep: Fix redundant_hardirqs_on incremented with irqs enabled
  2010-04-15 13:37     ` [PATCH] lockdep: Fix redundant_hardirqs_on incremented with irqs enabled Frederic Weisbecker
  2010-04-27  7:32       ` Stephen Rothwell
@ 2010-04-28  7:12       ` Stephen Rothwell
  1 sibling, 0 replies; 19+ messages in thread
From: Stephen Rothwell @ 2010-04-28  7:12 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Peter Zijlstra, David Miller, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 1579 bytes --]

Hi Frederic,

On Thu, 15 Apr 2010 15:37:50 +0200 Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
> When a path restore the flags while irqs are already enabled, we
> update the per cpu var redundant_hardirqs_on in a racy fashion
> and debug_atomic_inc() warns about this situation.
> 
> In this particular case, we need to explictly disable the irqs before
> updating this stat var in order to update it safely.
> 
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: David Miller <davem@davemloft.net>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>  kernel/lockdep.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> index 78325f8..65d4336 100644
> --- a/kernel/lockdep.c
> +++ b/kernel/lockdep.c
> @@ -2298,7 +2298,11 @@ void trace_hardirqs_on_caller(unsigned long ip)
>  		return;
>  
>  	if (unlikely(curr->hardirqs_enabled)) {
> +		unsigned long flags;
> +
> +		raw_local_irq_save(flags);
>  		debug_atomic_inc(redundant_hardirqs_on);
> +		raw_local_irq_restore(flags);
>  		return;
>  	}
>  	/* we'll do an OFF -> ON transition: */

I applied this patch to linux-next today and it fixes my PowerPC boot
problems, so I will keep it in linux-next until it gets applied to the
tip tree or something better comes along.

Thanks.
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2010-04-28  7:12 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-04-15  6:12 linux-next: boot failure after merge of the final tree (tip related) Stephen Rothwell
2010-04-15  6:48 ` Benjamin Herrenschmidt
2010-04-15  6:49 ` linux-next: PowerPC WARN_ON_ONCE() " Ingo Molnar
2010-04-15  6:55   ` David Miller
2010-04-15  7:32     ` Ingo Molnar
2010-04-16  1:51       ` Benjamin Herrenschmidt
2010-04-16  1:56     ` Benjamin Herrenschmidt
2010-04-15 10:04   ` Stephen Rothwell
2010-04-15 13:37     ` [PATCH] lockdep: Fix redundant_hardirqs_on incremented with irqs enabled Frederic Weisbecker
2010-04-27  7:32       ` Stephen Rothwell
2010-04-27 18:07         ` Frederic Weisbecker
2010-04-28  7:12       ` Stephen Rothwell
2010-04-15 13:00   ` linux-next: PowerPC WARN_ON_ONCE() after merge of the final tree (tip related) Frederic Weisbecker
2010-04-15 14:03     ` Ingo Molnar
2010-04-15 17:15       ` Frederic Weisbecker
2010-04-16 10:38         ` Peter Zijlstra
2010-04-16 12:32           ` Frederic Weisbecker
2010-04-15 17:24       ` Frederic Weisbecker
2010-04-15 17:39         ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).