All of lore.kernel.org
 help / color / mirror / Atom feed
* perf: perf_fuzzer triggers instant reboot
@ 2014-09-08 17:47 Vince Weaver
  2014-09-08 18:51 ` Peter Zijlstra
  0 siblings, 1 reply; 26+ messages in thread
From: Vince Weaver @ 2014-09-08 17:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo

Hello

so I finally had time to run my perf_fuzzer again and it has rapidly 
turned up an alarming crash that instant-reboots my core2 test machine.

This is on 3.17-rc4.  It is reproducible.

The first time all that appeared on the serial console was:

[ 2616.535995] Kernel panic - not syncing: Lost physical address for unconsumed uncorrectable error

The second time this happened instead:

[  756.009271] traps: perf_fuzzer[4236] trap invalid opcode ip:4044c0 sp:7fffb0ed6f90 error:0
[  756.017590] BUG: unable to handle kernel paging request

I'll work on trying to narrow this down a bit.

Vince

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-08 17:47 perf: perf_fuzzer triggers instant reboot Vince Weaver
@ 2014-09-08 18:51 ` Peter Zijlstra
  2014-09-08 19:08   ` Vince Weaver
  2014-09-09 16:06   ` Vince Weaver
  0 siblings, 2 replies; 26+ messages in thread
From: Peter Zijlstra @ 2014-09-08 18:51 UTC (permalink / raw)
  To: Vince Weaver
  Cc: linux-kernel, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo

[-- Attachment #1: Type: text/plain, Size: 250 bytes --]

On Mon, Sep 08, 2014 at 01:47:11PM -0400, Vince Weaver wrote:
> Hello
> 
> so I finally had time to run my perf_fuzzer again and it has rapidly 
> turned up an alarming crash that instant-reboots my core2 test machine.

Urgh, of course :/



[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-08 18:51 ` Peter Zijlstra
@ 2014-09-08 19:08   ` Vince Weaver
  2014-09-09 16:06   ` Vince Weaver
  1 sibling, 0 replies; 26+ messages in thread
From: Vince Weaver @ 2014-09-08 19:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo

On Mon, 8 Sep 2014, Peter Zijlstra wrote:

> On Mon, Sep 08, 2014 at 01:47:11PM -0400, Vince Weaver wrote:
> > Hello
> > 
> > so I finally had time to run my perf_fuzzer again and it has rapidly 
> > turned up an alarming crash that instant-reboots my core2 test machine.
> 
> Urgh, of course :/

I can trigger it with 3.17-rc4 and 3.17-rc3 but not 3.16 so I'm attempting 
a bisection, so hopefully it turns out to be a real issue and not just 
some sort of compilation artifact.

It takes a while to compile on this machine though so it might be a day or 
so until I have the bisection results.

Vince

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-08 18:51 ` Peter Zijlstra
  2014-09-08 19:08   ` Vince Weaver
@ 2014-09-09 16:06   ` Vince Weaver
  2014-09-09 17:20     ` Vince Weaver
  1 sibling, 1 reply; 26+ messages in thread
From: Vince Weaver @ 2014-09-09 16:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo

On Mon, 8 Sep 2014, Peter Zijlstra wrote:

> On Mon, Sep 08, 2014 at 01:47:11PM -0400, Vince Weaver wrote:
> > Hello
> > 
> > so I finally had time to run my perf_fuzzer again and it has rapidly 
> > turned up an alarming crash that instant-reboots my core2 test machine.
> 
> Urgh, of course :/

So possibly was a waste of time, as it bisects to:

	commit 5e467e27b7c7a11ef47068138248003e873674a1
	Author: Tony Luck <tony.luck@intel.com>
	Date:   Wed Jul 30 14:05:15 2014 -0700

	    [IA64] Wire up getrandom() system call

Which makes no sense at all :(

The times the testcase fails give some pretty spectacular error messages:


[   78.018145] NOHZ: local_softirq_pending 100
[   83.012661] BUG: Bad page map in process perf_fuzzer  pte:ffff8800cafc5190 pmd:119d00067
[   83.020863] addr:000000000120c000 vm_flags:00100073 anon_vma:ffff8800cae19dc0 mapping:          (null) index:120c

[  316.040001] general protection fault: 0000 [#1] 

[  178.024220] perf_fuzzer[2788]: segfault at ffff88011fa7bef8 ip 00007f8c0e088c2f sp 00007fff003315e0 error 5

[ 2616.535995] Kernel panic - not syncing: Lost physical address for unconsumed uncorrectable error


[  751.656861] NOHZ: local_softirq_pending 100
[  756.009271] traps: perf_fuzzer[4236] trap invalid opcode ip:4044c0 sp:7fffb0ed6f90 error:0
[  756.017590] BUG: unable to handle kernel paging request

among others.

Vince

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-09 16:06   ` Vince Weaver
@ 2014-09-09 17:20     ` Vince Weaver
  2014-09-09 17:53       ` Vince Weaver
  0 siblings, 1 reply; 26+ messages in thread
From: Vince Weaver @ 2014-09-09 17:20 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Peter Zijlstra, linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo

On Tue, 9 Sep 2014, Vince Weaver wrote:
> 
> [  751.656861] NOHZ: local_softirq_pending 100
> [  756.009271] traps: perf_fuzzer[4236] trap invalid opcode ip:4044c0 sp:7fffb0ed6f90 error:0
> [  756.017590] BUG: unable to handle kernel paging request

so it turns out that while it seems reproducible, it must be some sort of 
race condition and only happens like 75% of the time on a 3.17-rc4 kernel.

Almost managed to get an oops that last time:

[  222.457213] NOHZ: local_softirq_pending 100
[  222.712448] NOHZ: local_softirq_pending 100
[  222.947573] NOHZ: local_softirq_pending 100
[  229.168047] BUG: unable to handle kernel NULL pointer dereference at 0000000000000087
[  229.172003] IP: [<ffffffff81062cde>] scheduler_tick+0x56/0xbb
[  229.172003] PGD c4b72067 PUD c52df067 PMD 0 
[  229.172003] Thread overran stack, or stack corrupted
[  229.172003] Oops: 0000 [#1] SMP 
[  229.172003] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative f71882fg mcs7830 usbnet coretemp evdev psmouse pcspkr ohci_pci ohci_hcd  
^P<C0><FF><FF><E8><FF><FF><A8>A!^A<88><FF><FF>hA!^A<88><FF>
<FF>^P}<CB>7(OCE) 

any advice on tracking this down?  I can fire up ftrace but it's unclear 
to me I will get anything useful as the system insta-reboots pretty 
quickly.

Vince

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-09 17:20     ` Vince Weaver
@ 2014-09-09 17:53       ` Vince Weaver
  2014-09-10  8:31         ` Peter Zijlstra
  0 siblings, 1 reply; 26+ messages in thread
From: Vince Weaver @ 2014-09-09 17:53 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Peter Zijlstra, linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Steven Rostedt


OK so trying to use ftrace to track this issue, and this happens (on 
core2, 3.17-rc4)

[  295.992012] PANIC: double fault, error_code: 0x0
[  295.992012] CPU: 1 PID: 2916 Comm: trace-cmd Not tainted 3.17.0-rc4+ #82
[  295.992012] Hardware name: AOpen   DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015  10/19/2012
[  295.992012] task: ffff8800cb20a800 ti: ffff8800ca30c000 task.ti: ffff8800ca30c000
[  295.992012] RIP: 0010:[<ffffffff81037b92>]  [<ffffffff81037b92>] perf_trace_x86_exceptions+0x2d/0xb7
[  295.992012] RSP: 0018:ffff8800bfffff58  EFLAGS: 00010082
[  295.992012] RAX: 0000000000000001 RBX: ffff880037c9f490 RCX: 0000000000000000
[  295.992012] RDX: ffff8800c0000098 RSI: ffffe8ffffc81010 RDI: ffffffff81a26600
[  295.992012] RBP: ffff8800c0000058 R08: 0000000000044ea5 R09: ffffe8ffffc81010
[  295.992012] R10: 0000000000000000 R11: ffff8800ca30fc48 R12: ffff8800c0000098
[  295.992012] R13: ffffe8ffffc81010 R14: 0000000000000000 R15: ffffffff81a26600
[  295.992012] FS:  00007f4217f82700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[  295.992012] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  295.992012] CR2: ffff8800bfffff48 CR3: 00000000ca0b9000 CR4: 00000000000407e0
[  295.992012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  295.992012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[  295.992012] Stack:
[  295.992012] PANIC: double fault, error_code: 0x0

and the system reboots.

Vince

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-09 17:53       ` Vince Weaver
@ 2014-09-10  8:31         ` Peter Zijlstra
  2014-09-10 13:18           ` Vince Weaver
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Zijlstra @ 2014-09-10  8:31 UTC (permalink / raw)
  To: Vince Weaver
  Cc: linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Steven Rostedt

[-- Attachment #1: Type: text/plain, Size: 1735 bytes --]

On Tue, Sep 09, 2014 at 01:53:50PM -0400, Vince Weaver wrote:
> 
> OK so trying to use ftrace to track this issue, and this happens (on 
> core2, 3.17-rc4)
> 
> [  295.992012] PANIC: double fault, error_code: 0x0
> [  295.992012] CPU: 1 PID: 2916 Comm: trace-cmd Not tainted 3.17.0-rc4+ #82
> [  295.992012] Hardware name: AOpen   DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015  10/19/2012
> [  295.992012] task: ffff8800cb20a800 ti: ffff8800ca30c000 task.ti: ffff8800ca30c000
> [  295.992012] RIP: 0010:[<ffffffff81037b92>]  [<ffffffff81037b92>] perf_trace_x86_exceptions+0x2d/0xb7
> [  295.992012] RSP: 0018:ffff8800bfffff58  EFLAGS: 00010082
> [  295.992012] RAX: 0000000000000001 RBX: ffff880037c9f490 RCX: 0000000000000000
> [  295.992012] RDX: ffff8800c0000098 RSI: ffffe8ffffc81010 RDI: ffffffff81a26600
> [  295.992012] RBP: ffff8800c0000058 R08: 0000000000044ea5 R09: ffffe8ffffc81010
> [  295.992012] R10: 0000000000000000 R11: ffff8800ca30fc48 R12: ffff8800c0000098
> [  295.992012] R13: ffffe8ffffc81010 R14: 0000000000000000 R15: ffffffff81a26600
> [  295.992012] FS:  00007f4217f82700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
> [  295.992012] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  295.992012] CR2: ffff8800bfffff48 CR3: 00000000ca0b9000 CR4: 00000000000407e0
> [  295.992012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  295.992012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> [  295.992012] Stack:
> [  295.992012] PANIC: double fault, error_code: 0x0
> 

Sasha reported something from his KVM based fuzzing, maybe that's the
same. But that x86_exceptions thing is interesting, lemme go look at
that first.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-10  8:31         ` Peter Zijlstra
@ 2014-09-10 13:18           ` Vince Weaver
  2014-09-10 13:28             ` Peter Zijlstra
  2014-09-10 14:01             ` Sasha Levin
  0 siblings, 2 replies; 26+ messages in thread
From: Vince Weaver @ 2014-09-10 13:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Steven Rostedt

On Wed, 10 Sep 2014, Peter Zijlstra wrote:
> 
> Sasha reported something from his KVM based fuzzing, maybe that's the
> same. But that x86_exceptions thing is interesting, lemme go look at
> that first.

that's what got me looking at things again, the trinity reports.  Though I 
think those involve CPU hotplugging which my fuzzer shouldn't trigger.

I do think this is the same memory corruption/reboot bug that I reported 
back in February (the thread is "perf_fuzzer compiled for x32 causes 
reboot" but I wasn't able to isolate the problem then either.

Somehow something is stomping over memory with a forking workload (likely 
an improper free with RCU like we've seen before) but the fact that it 
causes a reboot immediately makes it *really* hard to debug this.


I'm not sure about the tracing issue, possibly it's just a side effect of 
memory being corrupted by the same bug.  In case it's relevant here's the 
trace-cmd I was trying to use:

   trace-cmd record -e raw_syscalls -p function -l '*perf*' -n 'perf_event_task_tick' -l 'x86*' -l 'intel*' -P 2807

Vince

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-10 13:18           ` Vince Weaver
@ 2014-09-10 13:28             ` Peter Zijlstra
  2014-09-10 14:01             ` Sasha Levin
  1 sibling, 0 replies; 26+ messages in thread
From: Peter Zijlstra @ 2014-09-10 13:28 UTC (permalink / raw)
  To: Vince Weaver
  Cc: linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Steven Rostedt

On Wed, Sep 10, 2014 at 09:18:35AM -0400, Vince Weaver wrote:

> Somehow something is stomping over memory with a forking workload (likely 
> an improper free with RCU like we've seen before) but the fact that it 
> causes a reboot immediately makes it *really* hard to debug this.

Yes, the insta reboot thing is a total pain. Too bad Steve is out for a
spell; the only thing I can think of is trying to 'preserve' the trace
buffer over the reboot; its a warm reboot and memory contents should be
'stable'. So if we can get the new boot to agree with the old kernel's
idea of trace buffers we might retain enough.

Another approach would be using the firewire debug facility to read the
trace buffer post-mortem. Of course, that requires you have FW in at
least two boxes and an appropriate cable (not something I've actually
ever done due to lack of FW hardware).

Maybe the EHCI debug port (USB) might provide similar capabilities --
again, significant lack of experience due to not actually having
hardware for that.

I think I've once managed to hit the triple fault reboot in qemu/kvm,
which makes inspecting the dead state tons easier, if you can manage to
reproduce in a virt environment you've got a chance (of course, the
problem at that time was not perf and so a lot less sensitive to
hardware).


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-10 13:18           ` Vince Weaver
  2014-09-10 13:28             ` Peter Zijlstra
@ 2014-09-10 14:01             ` Sasha Levin
  2014-09-10 14:30               ` Vince Weaver
  1 sibling, 1 reply; 26+ messages in thread
From: Sasha Levin @ 2014-09-10 14:01 UTC (permalink / raw)
  To: Vince Weaver, Peter Zijlstra
  Cc: linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Steven Rostedt

On 09/10/2014 09:18 AM, Vince Weaver wrote:
> that's what got me looking at things again, the trinity reports.  Though I 
> think those involve CPU hotplugging which my fuzzer shouldn't trigger.
> 
> I do think this is the same memory corruption/reboot bug that I reported 
> back in February (the thread is "perf_fuzzer compiled for x32 causes 
> reboot" but I wasn't able to isolate the problem then either.
> 
> Somehow something is stomping over memory with a forking workload (likely 
> an improper free with RCU like we've seen before) but the fact that it 
> causes a reboot immediately makes it *really* hard to debug this.

Could this be http://permalink.gmane.org/gmane.linux.kernel/1779436 which
I saw couple days ago?


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-10 14:01             ` Sasha Levin
@ 2014-09-10 14:30               ` Vince Weaver
  2014-09-10 14:33                 ` Peter Zijlstra
  0 siblings, 1 reply; 26+ messages in thread
From: Vince Weaver @ 2014-09-10 14:30 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Vince Weaver, Peter Zijlstra, linux-kernel, Paul Mackerras,
	Ingo Molnar, Arnaldo Carvalho de Melo, Steven Rostedt

On Wed, 10 Sep 2014, Sasha Levin wrote:

> On 09/10/2014 09:18 AM, Vince Weaver wrote:
> > that's what got me looking at things again, the trinity reports.  Though I 
> > think those involve CPU hotplugging which my fuzzer shouldn't trigger.
> > 
> > I do think this is the same memory corruption/reboot bug that I reported 
> > back in February (the thread is "perf_fuzzer compiled for x32 causes 
> > reboot" but I wasn't able to isolate the problem then either.
> > 
> > Somehow something is stomping over memory with a forking workload (likely 
> > an improper free with RCU like we've seen before) but the fact that it 
> > causes a reboot immediately makes it *really* hard to debug this.
> 
> Could this be http://permalink.gmane.org/gmane.linux.kernel/1779436 which
> I saw couple days ago?

It could be.

I have about 10 open bugs with similar symptoms found with my perf_fuzzer 
here (and a few more that are possibly the same memory corruption bug that 
"fixes" went into the kernel but it's unclear if it actually fixed things
or just altered the locking enough to make it harder to hit).

	http://web.eece.maine.edu/~vweaver/projects/perf_events/fuzzer/bugs_found.html

I've been trying for months now to make progress on these but this type of 
bug is really hard to debug.

Vince

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-10 14:30               ` Vince Weaver
@ 2014-09-10 14:33                 ` Peter Zijlstra
  2014-09-11 13:27                   ` Vince Weaver
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Zijlstra @ 2014-09-10 14:33 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Sasha Levin, linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Steven Rostedt

On Wed, Sep 10, 2014 at 10:30:31AM -0400, Vince Weaver wrote:
> On Wed, 10 Sep 2014, Sasha Levin wrote:
> 
> > On 09/10/2014 09:18 AM, Vince Weaver wrote:
> > > that's what got me looking at things again, the trinity reports.  Though I 
> > > think those involve CPU hotplugging which my fuzzer shouldn't trigger.
> > > 
> > > I do think this is the same memory corruption/reboot bug that I reported 
> > > back in February (the thread is "perf_fuzzer compiled for x32 causes 
> > > reboot" but I wasn't able to isolate the problem then either.
> > > 
> > > Somehow something is stomping over memory with a forking workload (likely 
> > > an improper free with RCU like we've seen before) but the fact that it 
> > > causes a reboot immediately makes it *really* hard to debug this.
> > 
> > Could this be http://permalink.gmane.org/gmane.linux.kernel/1779436 which
> > I saw couple days ago?
> 
> It could be.
> 
> I have about 10 open bugs with similar symptoms found with my perf_fuzzer 
> here (and a few more that are possibly the same memory corruption bug that 
> "fixes" went into the kernel but it's unclear if it actually fixed things
> or just altered the locking enough to make it harder to hit).
> 
> 	http://web.eece.maine.edu/~vweaver/projects/perf_events/fuzzer/bugs_found.html
> 
> I've been trying for months now to make progress on these but this type of 
> bug is really hard to debug.

Did we actually fix some at least? I had the idea we did get a few
sorted. But yes, this is tedious and hard going :/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-10 14:33                 ` Peter Zijlstra
@ 2014-09-11 13:27                   ` Vince Weaver
  2014-09-25  4:59                     ` Vince Weaver
  0 siblings, 1 reply; 26+ messages in thread
From: Vince Weaver @ 2014-09-11 13:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, Sasha Levin, linux-kernel, Paul Mackerras,
	Ingo Molnar, Arnaldo Carvalho de Melo, Steven Rostedt

On Wed, 10 Sep 2014, Peter Zijlstra wrote:

> > I've been trying for months now to make progress on these but this type of 
> > bug is really hard to debug.
> 
> Did we actually fix some at least? I had the idea we did get a few
> sorted. But yes, this is tedious and hard going :/

we did fix some of them.

Others it's a bit unclear.  Fixes went in that made my particular testcase 
stop triggering things, but more fuzzing would crash in similar ways.  So 
it's unclear if the actual problem was fixed or if the fix just shook 
things up enough to make the original bug (probably a race condition of 
some sort) harder to trigger.

Vince

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-11 13:27                   ` Vince Weaver
@ 2014-09-25  4:59                     ` Vince Weaver
  2014-09-25 16:38                       ` Cong Wang
  0 siblings, 1 reply; 26+ messages in thread
From: Vince Weaver @ 2014-09-25  4:59 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Peter Zijlstra, Sasha Levin, linux-kernel, Paul Mackerras,
	Ingo Molnar, Arnaldo Carvalho de Melo, Cong Wang


So I noticed Cong Wang's patch (3577af70a2ce4853d58e57d832e687d739281479)
	perf: Fix a race condition in perf_remove_from_context()

and that sounds a lot like the weird fork()/memory-corruption bug that the 
fuzzer has been triggering.

So I applied that patch alone on top of the 3.17-rc4 kernel that I could 
reproducibly reboot... and with the patch I can't trigger the problem 
anymore.

Now that just might mean the patch pushed the code around enough so my 
test doesn't trigger, but there is hope that maybe this fixes things.

Cong Wang, do you have more info on how you came across this bug?  And how 
you tracked down the problem?

Thanks,

Vince

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-25  4:59                     ` Vince Weaver
@ 2014-09-25 16:38                       ` Cong Wang
  2014-09-28  4:09                         ` Sasha Levin
  2014-09-29  5:21                         ` perf: perf_fuzzer triggers instant reboot Vince Weaver
  0 siblings, 2 replies; 26+ messages in thread
From: Cong Wang @ 2014-09-25 16:38 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Peter Zijlstra, Sasha Levin, linux-kernel, Paul Mackerras,
	Ingo Molnar, Arnaldo Carvalho de Melo

On Wed, Sep 24, 2014 at 9:59 PM, Vince Weaver <vincent.weaver@maine.edu> wrote:
>
> So I noticed Cong Wang's patch (3577af70a2ce4853d58e57d832e687d739281479)
>         perf: Fix a race condition in perf_remove_from_context()
>
> and that sounds a lot like the weird fork()/memory-corruption bug that the
> fuzzer has been triggering.
>
> So I applied that patch alone on top of the 3.17-rc4 kernel that I could
> reproducibly reboot... and with the patch I can't trigger the problem
> anymore.
>
> Now that just might mean the patch pushed the code around enough so my
> test doesn't trigger, but there is hope that maybe this fixes things.

I read this as it fixes your crash as well?

>
> Cong Wang, do you have more info on how you came across this bug?  And how
> you tracked down the problem?

Sure, as I said in the changelog, it is a soft lockup which was triggered on
dozens of machines here, it is actually pretty straightforward:


[5108912.562963] BUG: soft lockup - CPU#7 stuck for 22s! [perf:13856]
[5108912.563173] Modules linked in: netconsole configfs ipv6 bonding
dm_multipath video sbs sbshc hed acpi_pad acpi_memhotplug acpi_ipmi
parport_pc lp parport tcp_diag inet_diag ipmi_si ipmi_devintf
ipmi_msghandler dell_rbu igb dcdbas shpchp i2c_i801 i2c_core iTCO_wdt
i7core_edac edac_core iTCO_vendor_support ioatdma dca microcode
[5108912.563198] CPU 7
[5108912.563199] Modules linked in: netconsole configfs ipv6 bonding
dm_multipath video sbs sbshc hed acpi_pad acpi_memhotplug acpi_ipmi
parport_pc lp parport tcp_diag inet_diag ipmi_si ipmi_devintf
ipmi_msghandler dell_rbu igb dcdbas shpchp i2c_i801 i2c_core iTCO_wdt
i7core_edac edac_core iTCO_vendor_support ioatdma dca microcode
[5108912.563216]
[5108912.563219] Pid: 13856, comm: perf Not tainted 3.4.78 #1 Dell
Inc.  C6100           /0D61XP
[5108912.563222] RIP: 0010:[<ffffffff810d9a6a>]  [<ffffffff810d9a6a>]
perf_remove_from_context+0x8d/0xb4
[5108912.563233] RSP: 0018:ffff8809ea39bd48  EFLAGS: 00000202
[5108912.563235] RAX: 000000000000006d RBX: ffffffff810d6dcc RCX:
0000000000000000
[5108912.563237] RDX: ffff88123fc8006d RSI: ffffffff810d6dcc RDI:
ffff8808f8541c0c
[5108912.563239] RBP: ffff8809ea39bd88 R08: 0000000000000001 R09:
0000000000000000
[5108912.563241] R10: ffff8809abf95610 R11: ffff880a3a6c331c R12:
0000000000000000
[5108912.563243] R13: 00000000000000ef R14: 0000000000000001 R15:
0000000000000000
[5108912.563245] FS:  0000000000000000(0000) GS:ffff88123fc20000(0000)
knlGS:0000000000000000
[5108912.563248] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[5108912.563250] CR2: 00007f692e787180 CR3: 0000000001a0b000 CR4:
00000000000007e0
[5108912.563252] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[5108912.563254] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[5108912.563256] Process perf (pid: 13856, threadinfo
ffff8809ea39a000, task ffff880a2167c470)
[5108912.563258] Stack:
[5108912.563260]  ffff880a3a6c32c0 ffff88048c7c3820 ffff8809ea39bd88
ffff88048c7c3800
[5108912.563265]  ffff88048c7c3800 ffff8808f8541c00 ffff8808f8541c10
ffff88091b56c000
[5108912.563269]  ffff8809ea39bdb8 ffffffff810d9bbe ffff8809ea39bdb8
ffff880a2167c470
[5108912.563273] Call Trace:
[5108912.563278]  [<ffffffff810d9bbe>] perf_event_release_kernel+0x77/0x91
[5108912.563282]  [<ffffffff810d9c56>] put_event+0x7e/0x86
[5108912.563285]  [<ffffffff810d9dc0>] perf_release+0x10/0x14
[5108912.563291]  [<ffffffff8112dfc0>] __fput+0xfe/0x1f6
[5108912.563294]  [<ffffffff8112e0d2>] fput+0x1a/0x1c
[5108912.563297]  [<ffffffff8112ad8d>] filp_close+0x72/0x7d
[5108912.563303]  [<ffffffff8103efbf>] put_files_struct+0x6c/0xc3
[5108912.563306]  [<ffffffff8103f057>] exit_files+0x41/0x46
[5108912.563309]  [<ffffffff81040971>] do_exit+0x292/0x3b6
[5108912.563312]  [<ffffffff81040b12>] do_group_exit+0x7d/0xa5
[5108912.563315]  [<ffffffff81040b51>] sys_exit_group+0x17/0x1b
[5108912.563320]  [<ffffffff814c2469>] system_call_fastpath+0x16/0x1b

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-25 16:38                       ` Cong Wang
@ 2014-09-28  4:09                         ` Sasha Levin
  2014-09-29 11:11                           ` Peter Zijlstra
  2014-09-30 17:23                           ` Peter Zijlstra
  2014-09-29  5:21                         ` perf: perf_fuzzer triggers instant reboot Vince Weaver
  1 sibling, 2 replies; 26+ messages in thread
From: Sasha Levin @ 2014-09-28  4:09 UTC (permalink / raw)
  To: Cong Wang, Vince Weaver
  Cc: Peter Zijlstra, linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo

On 09/25/2014 12:38 PM, Cong Wang wrote:
> On Wed, Sep 24, 2014 at 9:59 PM, Vince Weaver <vincent.weaver@maine.edu> wrote:
>> >
>> > So I noticed Cong Wang's patch (3577af70a2ce4853d58e57d832e687d739281479)
>> >         perf: Fix a race condition in perf_remove_from_context()
>> >
>> > and that sounds a lot like the weird fork()/memory-corruption bug that the
>> > fuzzer has been triggering.
>> >
>> > So I applied that patch alone on top of the 3.17-rc4 kernel that I could
>> > reproducibly reboot... and with the patch I can't trigger the problem
>> > anymore.
>> >
>> > Now that just might mean the patch pushed the code around enough so my
>> > test doesn't trigger, but there is hope that maybe this fixes things.
> I read this as it fixes your crash as well?

Cong, I *suspect* that that commit also triggers the following lockdep warning.

I haven't confirmed that, but hopefully it'll help:

[  690.800861] ======================================================
[  690.800864] [ INFO: possible circular locking dependency detected ]
[  690.800877] 3.17.0-rc6-next-20140926-sasha-00051-g9253dff-dirty #1242 Not tainted
[  690.800881] -------------------------------------------------------
[  690.800887] trinity-c95/17888 is trying to acquire lock:
[  690.800925] (&(&pool->lock)->rlock){..-.-.}, at: __queue_work (kernel/workqueue.c:1325)
[  690.800929]
[  690.800929] but task is already holding lock:
[  690.800955] (&ctx->lock){-.-...}, at: perf_lock_task_context (kernel/events/core.c:988)
[  690.800958]
[  690.800958] which lock already depends on the new lock.
[  690.800958]
[  690.800960]
[  690.800960] the existing dependency chain (in reverse order) is:
[  690.800971]
[  690.800971] -> #3 (&ctx->lock){-.-...}:
[  690.800990] lock_acquire (kernel/locking/lockdep.c:3610)
[  690.801006] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[  690.801023] __perf_event_task_sched_out (kernel/events/core.c:2419 kernel/events/core.c:2445)
[  690.801040] perf_event_task_sched_out (include/linux/perf_event.h:714)
[  690.801051] __schedule (kernel/sched/core.c:2178 kernel/sched/core.c:2216 kernel/sched/core.c:2336 kernel/sched/core.c:2858)
[  690.801061] preempt_schedule_irq (./arch/x86/include/asm/paravirt.h:814 kernel/sched/core.c:2975)
[  690.801075] retint_kernel (arch/x86/kernel/entry_64.S:920)
[  690.801086] perf_swevent_init (kernel/events/core.c:5963 kernel/events/core.c:5983 kernel/events/core.c:6043)
[  690.801100] perf_init_event (kernel/events/core.c:6841)
[  690.801110] perf_event_alloc (kernel/events/core.c:6996)
[  690.801124] SYSC_perf_event_open (kernel/events/core.c:7291)
[  690.801136] SyS_perf_event_open (kernel/events/core.c:7210)
[  690.801149] tracesys_phase2 (arch/x86/kernel/entry_64.S:529)
[  690.801163]
[  690.801163] -> #2 (&rq->lock){-.-.-.}:
[  690.801185] lock_acquire (kernel/locking/lockdep.c:3610)
[  690.801194] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[  690.801206] wake_up_new_task (include/linux/sched.h:2932 kernel/sched/core.c:320 kernel/sched/core.c:2128)
[  690.801220] do_fork (kernel/fork.c:1690)
[  690.801233] kernel_thread (kernel/fork.c:1712)
[  690.801250] rest_init (init/main.c:404)
[  690.801265] start_kernel (init/main.c:682)
[  690.801280] x86_64_start_reservations (arch/x86/kernel/head64.c:199)
[  690.801297] x86_64_start_kernel (arch/x86/kernel/head64.c:188)
[  690.801315]
[  690.801315] -> #1 (&p->pi_lock){-.-.-.}:
[  690.801326] lock_acquire (kernel/locking/lockdep.c:3610)
[  690.801340] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:117 kernel/locking/spinlock.c:159)
[  690.801350] try_to_wake_up (kernel/sched/core.c:1692)
[  690.801364] wake_up_process (kernel/sched/core.c:1787 (discriminator 3))
[  690.801377] create_worker (include/linux/spinlock.h:359 kernel/workqueue.c:1713)
[  690.801401] init_workqueues (kernel/workqueue.c:4861)
[  690.801415] do_one_initcall (init/main.c:792)
[  690.801427] kernel_init_freeable (init/main.c:893 init/main.c:999)
[  690.801436] kernel_init (init/main.c:937)
[  690.801457] ret_from_fork (arch/x86/kernel/entry_64.S:348)
[  690.801474]
[  690.801474] -> #0 (&(&pool->lock)->rlock){..-.-.}:
[  690.801488] __lock_acquire (kernel/locking/lockdep.c:1842 kernel/locking/lockdep.c:1947 kernel/locking/lockdep.c:2133 kernel/locking/lockdep.c:3184)
[  690.801499] lock_acquire (kernel/locking/lockdep.c:3610)
[  690.801507] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[  690.801517] __queue_work (kernel/workqueue.c:1325)
[  690.801525] queue_work_on (kernel/workqueue.c:1403)
[  690.801542] free_object (lib/debugobjects.c:209)
[  690.801552] __debug_check_no_obj_freed (lib/debugobjects.c:718)
[  690.801561] debug_check_no_obj_freed (lib/debugobjects.c:727)
[  690.801574] kmem_cache_free (mm/slub.c:2687 mm/slub.c:2715)
[  690.801583] free_task (kernel/fork.c:221)
[  690.801594] __put_task_struct (kernel/fork.c:251)
[  690.801609] put_ctx (include/linux/sched.h:1864 kernel/events/core.c:904)
[  690.801619] find_get_context (kernel/events/core.c:913 kernel/events/core.c:3222)
[  690.801630] SYSC_perf_event_open (kernel/events/core.c:7347)
[  690.801638] SyS_perf_event_open (kernel/events/core.c:7210)
[  690.801650] tracesys_phase2 (arch/x86/kernel/entry_64.S:529)
[  690.801653]
[  690.801653] other info that might help us debug this:
[  690.801653]
[  690.801669] Chain exists of:
[  690.801669]   &(&pool->lock)->rlock --> &rq->lock --> &ctx->lock
[  690.801669]
[  690.801679]  Possible unsafe locking scenario:
[  690.801679]
[  690.801684]        CPU0                    CPU1
[  690.801686]        ----                    ----
[  690.801693]   lock(&ctx->lock);
[  690.801703]                                lock(&rq->lock);
[  690.801708]                                lock(&ctx->lock);
[  690.801714]   lock(&(&pool->lock)->rlock);
[  690.801717]
[  690.801717]  *** DEADLOCK ***
[  690.801717]
[  690.801720] 2 locks held by trinity-c95/17888:
[  690.801738] #0: (cpu_hotplug.lock){++++++}, at: get_online_cpus (kernel/cpu.c:92)
[  690.801754] #1: (&ctx->lock){-.-...}, at: perf_lock_task_context (kernel/events/core.c:988)
[  690.801758]
[  690.801758] stack backtrace:
[  690.801766] CPU: 21 PID: 17888 Comm: trinity-c95 Not tainted 3.17.0-rc6-next-20140926-sasha-00051-g9253dff-dirty #1242
[  690.801779]  ffffffff92b7f320 0000000000000000 ffffffff92afbee0 ffff8804078179c8
[  690.801798]  ffffffff8ef0070f 0000000000000011 ffffffff92ab6aa0 ffff880407817a18
[  690.801813]  ffffffff8a24ec2c ffff880407817aa8 ffff880409c00000 ffff880407817a18
[  690.801818] Call Trace:
[  690.801836] dump_stack (lib/dump_stack.c:52)
[  690.801845] print_circular_bug (kernel/locking/lockdep.c:1217)
[  690.801856] __lock_acquire (kernel/locking/lockdep.c:1842 kernel/locking/lockdep.c:1947 kernel/locking/lockdep.c:2133 kernel/locking/lockdep.c:3184)
[  690.801872] lock_acquire (kernel/locking/lockdep.c:3610)
[  690.801883] ? __queue_work (kernel/workqueue.c:1325)
[  690.801892] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[  690.801902] ? __queue_work (kernel/workqueue.c:1325)
[  690.801912] ? get_work_pool (include/linux/idr.h:120 kernel/workqueue.c:674)
[  690.801921] __queue_work (kernel/workqueue.c:1325)
[  690.801932] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[  690.801943] queue_work_on (kernel/workqueue.c:1403)
[  690.801956] free_object (lib/debugobjects.c:209)
[  690.801967] __debug_check_no_obj_freed (lib/debugobjects.c:718)
[  690.801983] debug_check_no_obj_freed (lib/debugobjects.c:727)
[  690.801995] kmem_cache_free (mm/slub.c:2687 mm/slub.c:2715)
[  690.802005] ? free_task (kernel/fork.c:221)
[  690.802016] free_task (kernel/fork.c:221)
[  690.802026] __put_task_struct (kernel/fork.c:251)
[  690.802037] put_ctx (include/linux/sched.h:1864 kernel/events/core.c:904)
[  690.802049] find_get_context (kernel/events/core.c:913 kernel/events/core.c:3222)
[  690.802063] ? perf_event_alloc (kernel/events/core.c:7005)
[  690.802078] SYSC_perf_event_open (kernel/events/core.c:7347)
[  690.802087] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[  690.802097] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2602)
[  690.802111] SyS_perf_event_open (kernel/events/core.c:7210)
[  690.802120] tracesys_phase2 (arch/x86/kernel/entry_64.S:529)


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-25 16:38                       ` Cong Wang
  2014-09-28  4:09                         ` Sasha Levin
@ 2014-09-29  5:21                         ` Vince Weaver
  2014-09-30 17:58                           ` Cong Wang
  1 sibling, 1 reply; 26+ messages in thread
From: Vince Weaver @ 2014-09-29  5:21 UTC (permalink / raw)
  To: Cong Wang
  Cc: Vince Weaver, Peter Zijlstra, Sasha Levin, linux-kernel,
	Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo

On Thu, 25 Sep 2014, Cong Wang wrote:

> On Wed, Sep 24, 2014 at 9:59 PM, Vince Weaver <vincent.weaver@maine.edu> wrote:

> > Now that just might mean the patch pushed the code around enough so my
> > test doesn't trigger, but there is hope that maybe this fixes things.
> 
> I read this as it fixes your crash as well?

It makes a testcase I have stop triggering an instant reboot, but there 
have been various attempts to fix similar problems over the past few 
months and so far the general family of lockups/reboots still remains.

I'll have to start fuzzing with 3.17-rc7 and see if things are better now,
but it can take days to trigger this class of bugs.

> Sure, as I said in the changelog, it is a soft lockup which was triggered on
> dozens of machines here, it is actually pretty straightforward:

so you managed to trigger it with the perf utility?
Do you have the exact command line that triggered it?

Thanks,

Vince



> [5108912.562963] BUG: soft lockup - CPU#7 stuck for 22s! [perf:13856]
> [5108912.563173] Modules linked in: netconsole configfs ipv6 bonding
> dm_multipath video sbs sbshc hed acpi_pad acpi_memhotplug acpi_ipmi
> parport_pc lp parport tcp_diag inet_diag ipmi_si ipmi_devintf
> ipmi_msghandler dell_rbu igb dcdbas shpchp i2c_i801 i2c_core iTCO_wdt
> i7core_edac edac_core iTCO_vendor_support ioatdma dca microcode
> [5108912.563198] CPU 7
> [5108912.563199] Modules linked in: netconsole configfs ipv6 bonding
> dm_multipath video sbs sbshc hed acpi_pad acpi_memhotplug acpi_ipmi
> parport_pc lp parport tcp_diag inet_diag ipmi_si ipmi_devintf
> ipmi_msghandler dell_rbu igb dcdbas shpchp i2c_i801 i2c_core iTCO_wdt
> i7core_edac edac_core iTCO_vendor_support ioatdma dca microcode
> [5108912.563216]
> [5108912.563219] Pid: 13856, comm: perf Not tainted 3.4.78 #1 Dell
> Inc.  C6100           /0D61XP
> [5108912.563222] RIP: 0010:[<ffffffff810d9a6a>]  [<ffffffff810d9a6a>]
> perf_remove_from_context+0x8d/0xb4
> [5108912.563233] RSP: 0018:ffff8809ea39bd48  EFLAGS: 00000202
> [5108912.563235] RAX: 000000000000006d RBX: ffffffff810d6dcc RCX:
> 0000000000000000
> [5108912.563237] RDX: ffff88123fc8006d RSI: ffffffff810d6dcc RDI:
> ffff8808f8541c0c
> [5108912.563239] RBP: ffff8809ea39bd88 R08: 0000000000000001 R09:
> 0000000000000000
> [5108912.563241] R10: ffff8809abf95610 R11: ffff880a3a6c331c R12:
> 0000000000000000
> [5108912.563243] R13: 00000000000000ef R14: 0000000000000001 R15:
> 0000000000000000
> [5108912.563245] FS:  0000000000000000(0000) GS:ffff88123fc20000(0000)
> knlGS:0000000000000000
> [5108912.563248] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [5108912.563250] CR2: 00007f692e787180 CR3: 0000000001a0b000 CR4:
> 00000000000007e0
> [5108912.563252] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [5108912.563254] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [5108912.563256] Process perf (pid: 13856, threadinfo
> ffff8809ea39a000, task ffff880a2167c470)
> [5108912.563258] Stack:
> [5108912.563260]  ffff880a3a6c32c0 ffff88048c7c3820 ffff8809ea39bd88
> ffff88048c7c3800
> [5108912.563265]  ffff88048c7c3800 ffff8808f8541c00 ffff8808f8541c10
> ffff88091b56c000
> [5108912.563269]  ffff8809ea39bdb8 ffffffff810d9bbe ffff8809ea39bdb8
> ffff880a2167c470
> [5108912.563273] Call Trace:
> [5108912.563278]  [<ffffffff810d9bbe>] perf_event_release_kernel+0x77/0x91
> [5108912.563282]  [<ffffffff810d9c56>] put_event+0x7e/0x86
> [5108912.563285]  [<ffffffff810d9dc0>] perf_release+0x10/0x14
> [5108912.563291]  [<ffffffff8112dfc0>] __fput+0xfe/0x1f6
> [5108912.563294]  [<ffffffff8112e0d2>] fput+0x1a/0x1c
> [5108912.563297]  [<ffffffff8112ad8d>] filp_close+0x72/0x7d
> [5108912.563303]  [<ffffffff8103efbf>] put_files_struct+0x6c/0xc3
> [5108912.563306]  [<ffffffff8103f057>] exit_files+0x41/0x46
> [5108912.563309]  [<ffffffff81040971>] do_exit+0x292/0x3b6
> [5108912.563312]  [<ffffffff81040b12>] do_group_exit+0x7d/0xa5
> [5108912.563315]  [<ffffffff81040b51>] sys_exit_group+0x17/0x1b
> [5108912.563320]  [<ffffffff814c2469>] system_call_fastpath+0x16/0x1b

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-28  4:09                         ` Sasha Levin
@ 2014-09-29 11:11                           ` Peter Zijlstra
  2014-09-29 17:01                             ` Sasha Levin
  2014-09-30 17:23                           ` Peter Zijlstra
  1 sibling, 1 reply; 26+ messages in thread
From: Peter Zijlstra @ 2014-09-29 11:11 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Cong Wang, Vince Weaver, linux-kernel, Paul Mackerras,
	Ingo Molnar, Arnaldo Carvalho de Melo

On Sun, Sep 28, 2014 at 12:09:09AM -0400, Sasha Levin wrote:

> [  690.801720] 2 locks held by trinity-c95/17888:
> [  690.801738] #0: (cpu_hotplug.lock){++++++}, at: get_online_cpus (kernel/cpu.c:92)
> [  690.801754] #1: (&ctx->lock){-.-...}, at: perf_lock_task_context (kernel/events/core.c:988)
> [  690.801758]
> [  690.801758] stack backtrace:
> [  690.801766] CPU: 21 PID: 17888 Comm: trinity-c95 Not tainted 3.17.0-rc6-next-20140926-sasha-00051-g9253dff-dirty #1242
> [  690.801779]  ffffffff92b7f320 0000000000000000 ffffffff92afbee0 ffff8804078179c8
> [  690.801798]  ffffffff8ef0070f 0000000000000011 ffffffff92ab6aa0 ffff880407817a18
> [  690.801813]  ffffffff8a24ec2c ffff880407817aa8 ffff880409c00000 ffff880407817a18
> [  690.801818] Call Trace:
> [  690.801836] dump_stack (lib/dump_stack.c:52)
> [  690.801845] print_circular_bug (kernel/locking/lockdep.c:1217)
> [  690.801856] __lock_acquire (kernel/locking/lockdep.c:1842 kernel/locking/lockdep.c:1947 kernel/locking/lockdep.c:2133 kernel/locking/lockdep.c:3184)
> [  690.801872] lock_acquire (kernel/locking/lockdep.c:3610)
> [  690.801892] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
> [  690.801921] __queue_work (kernel/workqueue.c:1325)
> [  690.801943] queue_work_on (kernel/workqueue.c:1403)
> [  690.801956] free_object (lib/debugobjects.c:209)
> [  690.801967] __debug_check_no_obj_freed (lib/debugobjects.c:718)
> [  690.801983] debug_check_no_obj_freed (lib/debugobjects.c:727)
> [  690.801995] kmem_cache_free (mm/slub.c:2687 mm/slub.c:2715)
> [  690.802016] free_task (kernel/fork.c:221)
> [  690.802026] __put_task_struct (kernel/fork.c:251)
> [  690.802037] put_ctx (include/linux/sched.h:1864 kernel/events/core.c:904)
> [  690.802049] find_get_context (kernel/events/core.c:913 kernel/events/core.c:3222)
> [  690.802078] SYSC_perf_event_open (kernel/events/core.c:7347)
> [  690.802111] SyS_perf_event_open (kernel/events/core.c:7210)
> [  690.802120] tracesys_phase2 (arch/x86/kernel/entry_64.S:529)

This doesn't make sense; perf_lock_task_context() isn't supposed to
return with ctx->lock held and therefore it should not still be held in
find_get_context() when calling put_ctx().

Now, the only put_ctx() call in find_get_context() is in the !ctx path
of the perf_lock_task_context() call, furthermore there is a
mutex_lock() - which implies a might_sleep() - before that, so we can't
still be holding a spinlock().

Most puzzled.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-29 11:11                           ` Peter Zijlstra
@ 2014-09-29 17:01                             ` Sasha Levin
  2014-09-30  8:54                               ` Peter Zijlstra
  0 siblings, 1 reply; 26+ messages in thread
From: Sasha Levin @ 2014-09-29 17:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Cong Wang, Vince Weaver, linux-kernel, Paul Mackerras,
	Ingo Molnar, Arnaldo Carvalho de Melo

On 09/29/2014 07:11 AM, Peter Zijlstra wrote:
> On Sun, Sep 28, 2014 at 12:09:09AM -0400, Sasha Levin wrote:
> 
>> > [  690.801720] 2 locks held by trinity-c95/17888:
>> > [  690.801738] #0: (cpu_hotplug.lock){++++++}, at: get_online_cpus (kernel/cpu.c:92)
>> > [  690.801754] #1: (&ctx->lock){-.-...}, at: perf_lock_task_context (kernel/events/core.c:988)
>> > [  690.801758]
>> > [  690.801758] stack backtrace:
>> > [  690.801766] CPU: 21 PID: 17888 Comm: trinity-c95 Not tainted 3.17.0-rc6-next-20140926-sasha-00051-g9253dff-dirty #1242
>> > [  690.801779]  ffffffff92b7f320 0000000000000000 ffffffff92afbee0 ffff8804078179c8
>> > [  690.801798]  ffffffff8ef0070f 0000000000000011 ffffffff92ab6aa0 ffff880407817a18
>> > [  690.801813]  ffffffff8a24ec2c ffff880407817aa8 ffff880409c00000 ffff880407817a18
>> > [  690.801818] Call Trace:
>> > [  690.801836] dump_stack (lib/dump_stack.c:52)
>> > [  690.801845] print_circular_bug (kernel/locking/lockdep.c:1217)
>> > [  690.801856] __lock_acquire (kernel/locking/lockdep.c:1842 kernel/locking/lockdep.c:1947 kernel/locking/lockdep.c:2133 kernel/locking/lockdep.c:3184)
>> > [  690.801872] lock_acquire (kernel/locking/lockdep.c:3610)
>> > [  690.801892] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
>> > [  690.801921] __queue_work (kernel/workqueue.c:1325)
>> > [  690.801943] queue_work_on (kernel/workqueue.c:1403)
>> > [  690.801956] free_object (lib/debugobjects.c:209)
>> > [  690.801967] __debug_check_no_obj_freed (lib/debugobjects.c:718)
>> > [  690.801983] debug_check_no_obj_freed (lib/debugobjects.c:727)
>> > [  690.801995] kmem_cache_free (mm/slub.c:2687 mm/slub.c:2715)
>> > [  690.802016] free_task (kernel/fork.c:221)
>> > [  690.802026] __put_task_struct (kernel/fork.c:251)
>> > [  690.802037] put_ctx (include/linux/sched.h:1864 kernel/events/core.c:904)
>> > [  690.802049] find_get_context (kernel/events/core.c:913 kernel/events/core.c:3222)
>> > [  690.802078] SYSC_perf_event_open (kernel/events/core.c:7347)
>> > [  690.802111] SyS_perf_event_open (kernel/events/core.c:7210)
>> > [  690.802120] tracesys_phase2 (arch/x86/kernel/entry_64.S:529)
> This doesn't make sense; perf_lock_task_context() isn't supposed to
> return with ctx->lock held and therefore it should not still be held in
> find_get_context() when calling put_ctx().
> 
> Now, the only put_ctx() call in find_get_context() is in the !ctx path
> of the perf_lock_task_context() call, furthermore there is a
> mutex_lock() - which implies a might_sleep() - before that, so we can't
> still be holding a spinlock().

I think you missed the put_ctx() call in the other branch in find_get_context(),
which is the call described by the trace above:

	find_get_context()
		unclone_ctx()
			put_ctx()


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-29 17:01                             ` Sasha Levin
@ 2014-09-30  8:54                               ` Peter Zijlstra
  0 siblings, 0 replies; 26+ messages in thread
From: Peter Zijlstra @ 2014-09-30  8:54 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Cong Wang, Vince Weaver, linux-kernel, Paul Mackerras,
	Ingo Molnar, Arnaldo Carvalho de Melo

On Mon, Sep 29, 2014 at 01:01:33PM -0400, Sasha Levin wrote:
> On 09/29/2014 07:11 AM, Peter Zijlstra wrote:
> > On Sun, Sep 28, 2014 at 12:09:09AM -0400, Sasha Levin wrote:
> > 
> >> > [  690.801720] 2 locks held by trinity-c95/17888:
> >> > [  690.801738] #0: (cpu_hotplug.lock){++++++}, at: get_online_cpus (kernel/cpu.c:92)
> >> > [  690.801754] #1: (&ctx->lock){-.-...}, at: perf_lock_task_context (kernel/events/core.c:988)
> >> > [  690.801758]
> >> > [  690.801758] stack backtrace:
> >> > [  690.801766] CPU: 21 PID: 17888 Comm: trinity-c95 Not tainted 3.17.0-rc6-next-20140926-sasha-00051-g9253dff-dirty #1242
> >> > [  690.801779]  ffffffff92b7f320 0000000000000000 ffffffff92afbee0 ffff8804078179c8
> >> > [  690.801798]  ffffffff8ef0070f 0000000000000011 ffffffff92ab6aa0 ffff880407817a18
> >> > [  690.801813]  ffffffff8a24ec2c ffff880407817aa8 ffff880409c00000 ffff880407817a18
> >> > [  690.801818] Call Trace:
> >> > [  690.801836] dump_stack (lib/dump_stack.c:52)
> >> > [  690.801845] print_circular_bug (kernel/locking/lockdep.c:1217)
> >> > [  690.801856] __lock_acquire (kernel/locking/lockdep.c:1842 kernel/locking/lockdep.c:1947 kernel/locking/lockdep.c:2133 kernel/locking/lockdep.c:3184)
> >> > [  690.801872] lock_acquire (kernel/locking/lockdep.c:3610)
> >> > [  690.801892] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
> >> > [  690.801921] __queue_work (kernel/workqueue.c:1325)
> >> > [  690.801943] queue_work_on (kernel/workqueue.c:1403)
> >> > [  690.801956] free_object (lib/debugobjects.c:209)
> >> > [  690.801967] __debug_check_no_obj_freed (lib/debugobjects.c:718)
> >> > [  690.801983] debug_check_no_obj_freed (lib/debugobjects.c:727)
> >> > [  690.801995] kmem_cache_free (mm/slub.c:2687 mm/slub.c:2715)
> >> > [  690.802016] free_task (kernel/fork.c:221)
> >> > [  690.802026] __put_task_struct (kernel/fork.c:251)
> >> > [  690.802037] put_ctx (include/linux/sched.h:1864 kernel/events/core.c:904)
> >> > [  690.802049] find_get_context (kernel/events/core.c:913 kernel/events/core.c:3222)
> >> > [  690.802078] SYSC_perf_event_open (kernel/events/core.c:7347)
> >> > [  690.802111] SyS_perf_event_open (kernel/events/core.c:7210)
> >> > [  690.802120] tracesys_phase2 (arch/x86/kernel/entry_64.S:529)
> > This doesn't make sense; perf_lock_task_context() isn't supposed to
> > return with ctx->lock held and therefore it should not still be held in
> > find_get_context() when calling put_ctx().
> > 
> > Now, the only put_ctx() call in find_get_context() is in the !ctx path
> > of the perf_lock_task_context() call, furthermore there is a
> > mutex_lock() - which implies a might_sleep() - before that, so we can't
> > still be holding a spinlock().
> 
> I think you missed the put_ctx() call in the other branch in find_get_context(),
> which is the call described by the trace above:
> 
> 	find_get_context()
> 		unclone_ctx()
> 			put_ctx()
> 

Yes indeed. Bah. Lemme see what I can make of that.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-28  4:09                         ` Sasha Levin
  2014-09-29 11:11                           ` Peter Zijlstra
@ 2014-09-30 17:23                           ` Peter Zijlstra
  2014-10-01 11:16                             ` Sasha Levin
  2014-10-03  5:27                             ` [tip:perf/urgent] perf: Fix unclone_ctx() vs. locking tip-bot for Peter Zijlstra
  1 sibling, 2 replies; 26+ messages in thread
From: Peter Zijlstra @ 2014-09-30 17:23 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Cong Wang, Vince Weaver, linux-kernel, Paul Mackerras,
	Ingo Molnar, Arnaldo Carvalho de Melo


How about this then?

---
Subject: perf: Fix unclone_ctx() vs locking

The idiot who did 4a1c0f262f88 forgot to pay attention and fix all
similar cases. Do so now.

In particular, unclone_ctx() must be called while holding ctx->lock,
therefore all such sites are broken for the same reason. Pull the
put_ctx() call out from under ctx->lock.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Fixes: 4a1c0f262f88 ("perf: Fix lockdep warning on process exit")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c | 54 ++++++++++++++++++++++++++++++----------------------
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index a232b4006026..769d8587cf31 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -911,13 +911,23 @@ static void put_ctx(struct perf_event_context *ctx)
 	}
 }
 
-static void unclone_ctx(struct perf_event_context *ctx)
+/*
+ * This must be done under the ctx->lock, such as to serialize against
+ * context_equiv(), therefore we cannot call put_ctx() since that might end up
+ * calling scheduler related locks and ctx->lock nests inside those.
+ */
+static __must_check struct perf_event_context *
+unclone_ctx(struct perf_event_context *ctx)
 {
-	if (ctx->parent_ctx) {
-		put_ctx(ctx->parent_ctx);
+	struct perf_event_context *parent_ctx = ctx->parent_ctx;
+
+	lockdep_assert_held(&ctx->lock);
+
+	if (parent_ctx)
 		ctx->parent_ctx = NULL;
-	}
 	ctx->generation++;
+
+	return parent_ctx;
 }
 
 static u32 perf_event_pid(struct perf_event *event, struct task_struct *p)
@@ -2264,6 +2274,9 @@ static void ctx_sched_out(struct perf_event_context *ctx,
 static int context_equiv(struct perf_event_context *ctx1,
 			 struct perf_event_context *ctx2)
 {
+	lockdep_assert_held(&ctx1->lock);
+	lockdep_assert_held(&ctx2->lock);
+
 	/* Pinning disables the swap optimization */
 	if (ctx1->pin_count || ctx2->pin_count)
 		return 0;
@@ -2997,6 +3010,7 @@ static int event_enable_on_exec(struct perf_event *event,
  */
 static void perf_event_enable_on_exec(struct perf_event_context *ctx)
 {
+	struct perf_event_context *clone_ctx = NULL;
 	struct perf_event *event;
 	unsigned long flags;
 	int enabled = 0;
@@ -3028,7 +3042,7 @@ static void perf_event_enable_on_exec(struct perf_event_context *ctx)
 	 * Unclone this context if we enabled any event.
 	 */
 	if (enabled)
-		unclone_ctx(ctx);
+		clone_ctx = unclone_ctx(ctx);
 
 	raw_spin_unlock(&ctx->lock);
 
@@ -3038,6 +3052,9 @@ static void perf_event_enable_on_exec(struct perf_event_context *ctx)
 	perf_event_context_sched_in(ctx, ctx->task);
 out:
 	local_irq_restore(flags);
+
+	if (clone_ctx)
+		put_ctx(clone_ctx);
 }
 
 void perf_event_exec(void)
@@ -3190,7 +3207,7 @@ find_lively_task_by_vpid(pid_t vpid)
 static struct perf_event_context *
 find_get_context(struct pmu *pmu, struct task_struct *task, int cpu)
 {
-	struct perf_event_context *ctx;
+	struct perf_event_context *ctx, *clone_ctx = NULL;
 	struct perf_cpu_context *cpuctx;
 	unsigned long flags;
 	int ctxn, err;
@@ -3224,9 +3241,12 @@ find_get_context(struct pmu *pmu, struct task_struct *task, int cpu)
 retry:
 	ctx = perf_lock_task_context(task, ctxn, &flags);
 	if (ctx) {
-		unclone_ctx(ctx);
+		clone_ctx = unclone_ctx(ctx);
 		++ctx->pin_count;
 		raw_spin_unlock_irqrestore(&ctx->lock, flags);
+
+		if (clone_ctx)
+			put_ctx(clone_ctx);
 	} else {
 		ctx = alloc_perf_context(pmu, task);
 		err = -ENOMEM;
@@ -7651,7 +7671,7 @@ __perf_event_exit_task(struct perf_event *child_event,
 static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 {
 	struct perf_event *child_event, *next;
-	struct perf_event_context *child_ctx, *parent_ctx;
+	struct perf_event_context *child_ctx, *clone_ctx = NULL;
 	unsigned long flags;
 
 	if (likely(!child->perf_event_ctxp[ctxn])) {
@@ -7678,28 +7698,16 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 	child->perf_event_ctxp[ctxn] = NULL;
 
 	/*
-	 * In order to avoid freeing: child_ctx->parent_ctx->task
-	 * under perf_event_context::lock, grab another reference.
-	 */
-	parent_ctx = child_ctx->parent_ctx;
-	if (parent_ctx)
-		get_ctx(parent_ctx);
-
-	/*
 	 * If this context is a clone; unclone it so it can't get
 	 * swapped to another process while we're removing all
 	 * the events from it.
 	 */
-	unclone_ctx(child_ctx);
+	clone_ctx = unclone_ctx(child_ctx);
 	update_context_time(child_ctx);
 	raw_spin_unlock_irqrestore(&child_ctx->lock, flags);
 
-	/*
-	 * Now that we no longer hold perf_event_context::lock, drop
-	 * our extra child_ctx->parent_ctx reference.
-	 */
-	if (parent_ctx)
-		put_ctx(parent_ctx);
+	if (clone_ctx)
+		put_ctx(clone_ctx);
 
 	/*
 	 * Report the task dead after unscheduling the events so that we

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-29  5:21                         ` perf: perf_fuzzer triggers instant reboot Vince Weaver
@ 2014-09-30 17:58                           ` Cong Wang
  0 siblings, 0 replies; 26+ messages in thread
From: Cong Wang @ 2014-09-30 17:58 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Peter Zijlstra, Sasha Levin, linux-kernel, Paul Mackerras,
	Ingo Molnar, Arnaldo Carvalho de Melo

On Sun, Sep 28, 2014 at 10:21 PM, Vince Weaver <vincent.weaver@maine.edu> wrote:
> On Thu, 25 Sep 2014, Cong Wang wrote:
>
>> On Wed, Sep 24, 2014 at 9:59 PM, Vince Weaver <vincent.weaver@maine.edu> wrote:
>
>> > Now that just might mean the patch pushed the code around enough so my
>> > test doesn't trigger, but there is hope that maybe this fixes things.
>>
>> I read this as it fixes your crash as well?
>
> It makes a testcase I have stop triggering an instant reboot, but there
> have been various attempts to fix similar problems over the past few
> months and so far the general family of lockups/reboots still remains.
>
> I'll have to start fuzzing with 3.17-rc7 and see if things are better now,
> but it can take days to trigger this class of bugs.

Great! Thanks for testing!

>
>> Sure, as I said in the changelog, it is a soft lockup which was triggered on
>> dozens of machines here, it is actually pretty straightforward:
>
> so you managed to trigger it with the perf utility?


Yes, it should be some job in our data center running perf.

> Do you have the exact command line that triggered it?
>

No, I didn't check /proc/<pid>/cmdline when I was debugging this.
What's more important, it was triggered by some job in data center,
hard to reproduce manually.

Hope this helps.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-09-30 17:23                           ` Peter Zijlstra
@ 2014-10-01 11:16                             ` Sasha Levin
  2014-10-02 15:06                               ` Vince Weaver
  2014-10-03  5:27                             ` [tip:perf/urgent] perf: Fix unclone_ctx() vs. locking tip-bot for Peter Zijlstra
  1 sibling, 1 reply; 26+ messages in thread
From: Sasha Levin @ 2014-10-01 11:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Cong Wang, Vince Weaver, linux-kernel, Paul Mackerras,
	Ingo Molnar, Arnaldo Carvalho de Melo

On 09/30/2014 01:23 PM, Peter Zijlstra wrote:
> How about this then?
> 
> ---
> Subject: perf: Fix unclone_ctx() vs locking
> 
> The idiot who did 4a1c0f262f88 forgot to pay attention and fix all
> similar cases. Do so now.
> 
> In particular, unclone_ctx() must be called while holding ctx->lock,
> therefore all such sites are broken for the same reason. Pull the
> put_ctx() call out from under ctx->lock.
> 
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Fixes: 4a1c0f262f88 ("perf: Fix lockdep warning on process exit")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Looks good! The issue didn't reproduce anymore.


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-10-01 11:16                             ` Sasha Levin
@ 2014-10-02 15:06                               ` Vince Weaver
  2014-10-02 16:06                                 ` Vince Weaver
  0 siblings, 1 reply; 26+ messages in thread
From: Vince Weaver @ 2014-10-02 15:06 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Peter Zijlstra, Cong Wang, Vince Weaver, linux-kernel,
	Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo

On Wed, 1 Oct 2014, Sasha Levin wrote:

> On 09/30/2014 01:23 PM, Peter Zijlstra wrote:
> > How about this then?
> > 
> > ---
> > Subject: perf: Fix unclone_ctx() vs locking
> > 
> > The idiot who did 4a1c0f262f88 forgot to pay attention and fix all
> > similar cases. Do so now.
> > 
> > In particular, unclone_ctx() must be called while holding ctx->lock,
> > therefore all such sites are broken for the same reason. Pull the
> > put_ctx() call out from under ctx->lock.
> > 
> > Reported-by: Sasha Levin <sasha.levin@oracle.com>
> > Fixes: 4a1c0f262f88 ("perf: Fix lockdep warning on process exit")
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> 
> Looks good! The issue didn't reproduce anymore.

So I left my core2 machine fuzzing (on 3.17-rc7) overnight and in the 
morning the fuzzer was unkillable, stuck in the following.  Does this look 
like the same problem?  

It looks like this is easily reproducible (just wedged the machine again)
so let me check back after testing the patch.

Vince

[152447.120375] SysRq : Show backtrace of all active CPUs
[152447.124005] sending NMI to all CPUs:
[152447.124005] NMI backtrace for cpu 0
[152447.124005] CPU: 0 PID: 10004 Comm: perf_fuzzer Tainted: G        W      3.17.0-rc7+ #84
[152447.124005] Hardware name: AOpen   DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015  10/19/2012
[152447.124005] task: ffff88009d1b9000 ti: ffff88009c2ec000 task.ti: ffff88009c2ec000
[152447.124005] RIP: 0010:[<ffffffff8129a837>]  [<ffffffff8129a837>] delay_tsc+0x1b/0x4e
[152447.124005] RSP: 0018:ffff88011fc03d78  EFLAGS: 00000046
[152447.124005] RAX: 0000000000000000 RBX: 0000000000002710 RCX: 000000003a2a18ef
[152447.124005] RDX: 000000000000005f RSI: 0000000000000000 RDI: 0000000000265906
[152447.124005] RBP: ffff88011fc03d78 R08: 000000003a2a194e R09: 0000000000000000
[152447.124005] R10: ffffffff81673c90 R11: 0000000000000000 R12: 0000000000000007
[152447.124005] R13: 000000000000006c R14: 0000000000000001 R15: 0000000000000046
[152447.124005] FS:  00007fb8311eb700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[152447.124005] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[152447.124005] CR2: 00007fff79ac1648 CR3: 000000009d7f9000 CR4: 00000000000407f0
[152447.124005] DR0: 0000000001b3f000 DR1: 0000000001937000 DR2: 0000000000000000
[152447.124005] DR3: 0000000001b2e000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[152447.124005] Stack:
[152447.124005]  ffff88011fc03d88 ffffffff8129a7c9 ffff88011fc03d98 ffffffff8129a7ef
[152447.124005]  ffff88011fc03db0 ffffffff8102b4b4 ffffffff81a6e610 ffff88011fc03dc0
[152447.124005]  ffffffff8131cd31 ffff88011fc03df0 ffffffff8131d2ea ffffffff81c76110
[152447.124005] Call Trace:
[152447.124005]  <IRQ> 
[152447.124005]  [<ffffffff8129a7c9>] __delay+0xf/0x11
[152447.124005]  [<ffffffff8129a7ef>] __const_udelay+0x24/0x26
[152447.124005]  [<ffffffff8102b4b4>] arch_trigger_all_cpu_backtrace+0xc5/0xd1
[152447.124005]  [<ffffffff8131cd31>] sysrq_handle_showallcpus+0x13/0x15
[152447.124005]  [<ffffffff8131d2ea>] __handle_sysrq+0x94/0x121
[152447.124005]  [<ffffffff8131d39a>] handle_sysrq+0x23/0x25
[152447.124005]  [<ffffffff8132dd3f>] serial8250_rx_chars+0x14b/0x1b8
[152447.124005]  [<ffffffff8132de22>] serial8250_handle_irq+0x76/0xb4
[152447.124005]  [<ffffffff8132de81>] serial8250_default_handle_irq+0x21/0x24
[152447.124005]  [<ffffffff8132d132>] serial8250_interrupt+0x3d/0xb2
[152447.124005]  [<ffffffff81075f34>] handle_irq_event_percpu+0x43/0x16e
[152447.124005]  [<ffffffff8108b382>] ? clockevents_program_event+0x9d/0xb9
[152447.124005]  [<ffffffff8107609b>] handle_irq_event+0x3c/0x57
[152447.124005]  [<ffffffff81078a0d>] handle_edge_irq+0xb1/0xcb
[152447.124005]  [<ffffffff810046ce>] handle_irq+0x21/0x2a
[152447.124005]  [<ffffffff81004146>] do_IRQ+0x4e/0xc3
[152447.124005]  [<ffffffff815244aa>] common_interrupt+0x6a/0x6a
[152447.124005]  <EOI> 
[152447.124005]  [<ffffffff8107b2ce>] ? synchronize_srcu_expedited+0x15/0x15
[152447.124005]  [<ffffffff8110dffd>] ? kmem_cache_alloc_trace+0xcb/0xda
[152447.124005]  [<ffffffff8107d7f6>] ? __call_rcu.constprop.63+0x55/0x1c8
[152447.124005]  [<ffffffff8107d983>] kfree_call_rcu+0x1a/0x1c
[152447.124005]  [<ffffffff810ca0dc>] put_ctx+0x50/0x53
[152447.124005]  [<ffffffff810cb9f7>] find_get_context+0x13f/0x170
[152447.124005]  [<ffffffff810cff70>] SYSC_perf_event_open+0x47b/0x7f5
[152447.124005]  [<ffffffff810d064a>] SyS_perf_event_open+0xe/0x10
[152447.124005]  [<ffffffff81523916>] system_call_fastpath+0x1a/0x1f
[152447.124005] Code: 90 55 48 8d 3c bf 48 89 e5 e8 b1 ff ff ff 5d c3 66 66 66 66 90 55 48 89 e5 65 8b 34 25 d4 b0 00 00 66 66 90 0f ae e8 0f 31 89 c1 <66> 66 90 0f ae e8 0f 31 48 c1 e2 20 89 c0 48 09 c2 41 89 d0 29 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: perf: perf_fuzzer triggers instant reboot
  2014-10-02 15:06                               ` Vince Weaver
@ 2014-10-02 16:06                                 ` Vince Weaver
  0 siblings, 0 replies; 26+ messages in thread
From: Vince Weaver @ 2014-10-02 16:06 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Sasha Levin, Peter Zijlstra, Cong Wang, linux-kernel,
	Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo

On Thu, 2 Oct 2014, Vince Weaver wrote:

> It looks like this is easily reproducible (just wedged the machine again)
> so let me check back after testing the patch.

no, can still wedge the machine even with this patch applied.

Will try messing with ftrace to see if I can figure out what's going on.

Vince


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [tip:perf/urgent] perf: Fix unclone_ctx() vs. locking
  2014-09-30 17:23                           ` Peter Zijlstra
  2014-10-01 11:16                             ` Sasha Levin
@ 2014-10-03  5:27                             ` tip-bot for Peter Zijlstra
  1 sibling, 0 replies; 26+ messages in thread
From: tip-bot for Peter Zijlstra @ 2014-10-03  5:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, paulus, sasha.levin, hpa, mingo, torvalds, peterz,
	acme, cwang, vincent.weaver, tglx

Commit-ID:  211de6eba8960521e2be450a7d07db85fba4604c
Gitweb:     http://git.kernel.org/tip/211de6eba8960521e2be450a7d07db85fba4604c
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Tue, 30 Sep 2014 19:23:08 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 3 Oct 2014 05:41:06 +0200

perf: Fix unclone_ctx() vs. locking

The idiot who did 4a1c0f262f88 ("perf: Fix lockdep warning on process exit")
forgot to pay attention and fix all similar cases. Do so now.

In particular, unclone_ctx() must be called while holding ctx->lock,
therefore all such sites are broken for the same reason. Pull the
put_ctx() call out from under ctx->lock.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Probably-also-reported-by: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 4a1c0f262f88 ("perf: Fix lockdep warning on process exit")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Cong Wang <cwang@twopensource.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140930172308.GI4241@worktop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 54 ++++++++++++++++++++++++++++++----------------------
 1 file changed, 31 insertions(+), 23 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index d640a8b..afdd9e1 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -902,13 +902,23 @@ static void put_ctx(struct perf_event_context *ctx)
 	}
 }
 
-static void unclone_ctx(struct perf_event_context *ctx)
+/*
+ * This must be done under the ctx->lock, such as to serialize against
+ * context_equiv(), therefore we cannot call put_ctx() since that might end up
+ * calling scheduler related locks and ctx->lock nests inside those.
+ */
+static __must_check struct perf_event_context *
+unclone_ctx(struct perf_event_context *ctx)
 {
-	if (ctx->parent_ctx) {
-		put_ctx(ctx->parent_ctx);
+	struct perf_event_context *parent_ctx = ctx->parent_ctx;
+
+	lockdep_assert_held(&ctx->lock);
+
+	if (parent_ctx)
 		ctx->parent_ctx = NULL;
-	}
 	ctx->generation++;
+
+	return parent_ctx;
 }
 
 static u32 perf_event_pid(struct perf_event *event, struct task_struct *p)
@@ -2210,6 +2220,9 @@ static void ctx_sched_out(struct perf_event_context *ctx,
 static int context_equiv(struct perf_event_context *ctx1,
 			 struct perf_event_context *ctx2)
 {
+	lockdep_assert_held(&ctx1->lock);
+	lockdep_assert_held(&ctx2->lock);
+
 	/* Pinning disables the swap optimization */
 	if (ctx1->pin_count || ctx2->pin_count)
 		return 0;
@@ -2943,6 +2956,7 @@ static int event_enable_on_exec(struct perf_event *event,
  */
 static void perf_event_enable_on_exec(struct perf_event_context *ctx)
 {
+	struct perf_event_context *clone_ctx = NULL;
 	struct perf_event *event;
 	unsigned long flags;
 	int enabled = 0;
@@ -2974,7 +2988,7 @@ static void perf_event_enable_on_exec(struct perf_event_context *ctx)
 	 * Unclone this context if we enabled any event.
 	 */
 	if (enabled)
-		unclone_ctx(ctx);
+		clone_ctx = unclone_ctx(ctx);
 
 	raw_spin_unlock(&ctx->lock);
 
@@ -2984,6 +2998,9 @@ static void perf_event_enable_on_exec(struct perf_event_context *ctx)
 	perf_event_context_sched_in(ctx, ctx->task);
 out:
 	local_irq_restore(flags);
+
+	if (clone_ctx)
+		put_ctx(clone_ctx);
 }
 
 void perf_event_exec(void)
@@ -3135,7 +3152,7 @@ errout:
 static struct perf_event_context *
 find_get_context(struct pmu *pmu, struct task_struct *task, int cpu)
 {
-	struct perf_event_context *ctx;
+	struct perf_event_context *ctx, *clone_ctx = NULL;
 	struct perf_cpu_context *cpuctx;
 	unsigned long flags;
 	int ctxn, err;
@@ -3169,9 +3186,12 @@ find_get_context(struct pmu *pmu, struct task_struct *task, int cpu)
 retry:
 	ctx = perf_lock_task_context(task, ctxn, &flags);
 	if (ctx) {
-		unclone_ctx(ctx);
+		clone_ctx = unclone_ctx(ctx);
 		++ctx->pin_count;
 		raw_spin_unlock_irqrestore(&ctx->lock, flags);
+
+		if (clone_ctx)
+			put_ctx(clone_ctx);
 	} else {
 		ctx = alloc_perf_context(pmu, task);
 		err = -ENOMEM;
@@ -7523,7 +7543,7 @@ __perf_event_exit_task(struct perf_event *child_event,
 static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 {
 	struct perf_event *child_event, *next;
-	struct perf_event_context *child_ctx, *parent_ctx;
+	struct perf_event_context *child_ctx, *clone_ctx = NULL;
 	unsigned long flags;
 
 	if (likely(!child->perf_event_ctxp[ctxn])) {
@@ -7550,28 +7570,16 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 	child->perf_event_ctxp[ctxn] = NULL;
 
 	/*
-	 * In order to avoid freeing: child_ctx->parent_ctx->task
-	 * under perf_event_context::lock, grab another reference.
-	 */
-	parent_ctx = child_ctx->parent_ctx;
-	if (parent_ctx)
-		get_ctx(parent_ctx);
-
-	/*
 	 * If this context is a clone; unclone it so it can't get
 	 * swapped to another process while we're removing all
 	 * the events from it.
 	 */
-	unclone_ctx(child_ctx);
+	clone_ctx = unclone_ctx(child_ctx);
 	update_context_time(child_ctx);
 	raw_spin_unlock_irqrestore(&child_ctx->lock, flags);
 
-	/*
-	 * Now that we no longer hold perf_event_context::lock, drop
-	 * our extra child_ctx->parent_ctx reference.
-	 */
-	if (parent_ctx)
-		put_ctx(parent_ctx);
+	if (clone_ctx)
+		put_ctx(clone_ctx);
 
 	/*
 	 * Report the task dead after unscheduling the events so that we

^ permalink raw reply related	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2014-10-03  5:28 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-08 17:47 perf: perf_fuzzer triggers instant reboot Vince Weaver
2014-09-08 18:51 ` Peter Zijlstra
2014-09-08 19:08   ` Vince Weaver
2014-09-09 16:06   ` Vince Weaver
2014-09-09 17:20     ` Vince Weaver
2014-09-09 17:53       ` Vince Weaver
2014-09-10  8:31         ` Peter Zijlstra
2014-09-10 13:18           ` Vince Weaver
2014-09-10 13:28             ` Peter Zijlstra
2014-09-10 14:01             ` Sasha Levin
2014-09-10 14:30               ` Vince Weaver
2014-09-10 14:33                 ` Peter Zijlstra
2014-09-11 13:27                   ` Vince Weaver
2014-09-25  4:59                     ` Vince Weaver
2014-09-25 16:38                       ` Cong Wang
2014-09-28  4:09                         ` Sasha Levin
2014-09-29 11:11                           ` Peter Zijlstra
2014-09-29 17:01                             ` Sasha Levin
2014-09-30  8:54                               ` Peter Zijlstra
2014-09-30 17:23                           ` Peter Zijlstra
2014-10-01 11:16                             ` Sasha Levin
2014-10-02 15:06                               ` Vince Weaver
2014-10-02 16:06                                 ` Vince Weaver
2014-10-03  5:27                             ` [tip:perf/urgent] perf: Fix unclone_ctx() vs. locking tip-bot for Peter Zijlstra
2014-09-29  5:21                         ` perf: perf_fuzzer triggers instant reboot Vince Weaver
2014-09-30 17:58                           ` Cong Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.