* perf/tracepoint: another fuzzer generated lockup
@ 2013-11-08 19:45 Vince Weaver
  2013-11-08 20:06 ` Vince Weaver
  0 siblings, 1 reply; 38+ messages in thread
From: Vince Weaver @ 2013-11-08 19:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Dave Jones, Frederic Weisbecker


The perf_fuzzer hits this bug pretty reliably within about an hour, but 
unfortunately only after a large number of syscalls, so it is not easy to isolate.

This is against stock 3.12 with the no-ftrace-function-tracing-unless-root
patch applied.

I can trigger this as a normal user; no root involved.

[106657.412003] ------------[ cut here ]------------
[106657.412003] WARNING: CPU: 1 PID: 4131 at kernel/watchdog.c:245 watchdog_ove)
[106657.412003] Watchdog detected hard LOCKUP on cpu 1
[106657.412003] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powed
[106657.412003] CPU: 1 PID: 4131 Comm: perf_fuzzer Not tainted 3.12.0 #2
[106657.412003] Hardware name: AOpen   DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, B2
[106657.412003]  00000000000000f5 ffff88011fc87c48 ffffffff8151aad6 ffff88011fc8
[106657.412003]  ffffffff8103cd68 0000000100000000 ffffffff810a101c ffff88011b30
[106657.412003]  ffff88011fc87ef8 ffff88011fc87d58 0000000000000000 ffff88011fc8
[106657.412003] Call Trace:
[106657.412003]  <NMI>  [<ffffffff8151aad6>] ? dump_stack+0x41/0x53
[106657.412003]  [<ffffffff8103cd68>] ? warn_slowpath_common+0x81/0x9c
[106657.412003]  [<ffffffff810a101c>] ? watchdog_overflow_callback+0x9b/0xa6
[106657.412003]  [<ffffffff8103ce25>] ? warn_slowpath_fmt+0x46/0x48
[106657.412003]  [<ffffffff810a101c>] ? watchdog_overflow_callback+0x9b/0xa6
[106657.412003]  [<ffffffff810ca4a2>] ? __perf_event_overflow+0x12e/0x1ab
[106657.412003]  [<ffffffff810167d3>] ? intel_pmu_lbr_read+0x253/0x349
[106657.412003]  [<ffffffff81018c95>] ? intel_pmu_handle_irq+0x2b8/0x34d
[106657.412003]  [<ffffffff8151edfc>] ? nmi_handle+0xa8/0x13a
[106657.412003]  [<ffffffff8151f473>] ? perf_event_nmi_handler+0x24/0x3d
[106657.412003]  [<ffffffff8151edb2>] ? nmi_handle+0x5e/0x13a
[106657.412003]  [<ffffffff8151ef35>] ? do_nmi+0xa7/0x2bd
[106657.412003]  [<ffffffff8151e537>] ? end_repeat_nmi+0x1e/0x2e
[106657.412003]  [<ffffffff8104b09c>] ? do_send_sig_info+0x73/0x75
[106657.412003]  [<ffffffff8104b09c>] ? do_send_sig_info+0x73/0x75
[106657.412003]  [<ffffffff8104b09c>] ? do_send_sig_info+0x73/0x75
[106657.412003]  <<EOE>>  <IRQ>  [<ffffffff8111f683>] ? send_sigio_to_task+0xeb5
[106657.412003]  [<ffffffff8111f737>] ? send_sigio+0xaa/0xdd
[106657.412003]  [<ffffffff8111f7d3>] ? kill_fasync+0x69/0x86
[106657.412003]  [<ffffffff810cabf9>] ? perf_event_wakeup+0x8c/0xa1
[106657.412003]  [<ffffffff810c494d>] ? __irq_work_run+0x5f/0x7b
[106657.412003]  [<ffffffff81005f82>] ? smp_trace_irq_work_interrupt+0x4b/0x7e
[106657.412003]  [<ffffffff815261ca>] ? trace_irq_work_interrupt+0x6a/0x70
[106657.412003]  [<ffffffff8151da07>] ? _raw_spin_unlock_irqrestore+0xf/0x11
[106657.412003]  [<ffffffff810a3a89>] ? rcu_process_callbacks+0x4a4/0x4b8
[106657.412003]  [<ffffffff81041027>] ? __do_softirq+0x102/0x208
[106657.412003]  [<ffffffff815263dc>] ? call_softirq+0x1c/0x30
[106657.412003]  [<ffffffff810041b9>] ? do_softirq+0x38/0x6d
[106657.412003]  [<ffffffff81040d1c>] ? irq_exit+0x45/0xa1
[106657.412003]  [<ffffffff81526a14>] ? smp_trace_apic_timer_interrupt+0x83/0x91
[106657.412003]  [<ffffffff8152597a>] ? trace_apic_timer_interrupt+0x6a/0x70
[106657.412003]  <EOI> 
[106657.412003] ---[ end trace ff266ee4a75ddbbf ]---
[106657.412003] perf samples too long (6605838 > 10000), lowering kernel.perf_e0
[106657.412003] INFO: NMI handler (perf_event_nmi_handler) took too long to runs
[106678.192001] perf samples too long (6554240 > 20000), lowering kernel.perf_e0
[106679.508002] perf samples too long (6503044 > 40000), lowering kernel.perf_e0
[106680.852001] perf samples too long (6452247 > 76923), lowering kernel.perf_e0
[106682.192001] perf samples too long (6401848 > 142857), lowering kernel.perf_0
[106683.536001] perf samples too long (6351842 > 250000), lowering kernel.perf_0
[106684.876001] perf samples too long (6302227 > 500000), lowering kernel.perf_0



* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-08 20:06 ` Vince Weaver
@ 2013-11-08 20:02   ` Frederic Weisbecker
  2013-11-08 20:23     ` Vince Weaver
  0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-08 20:02 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Steven Rostedt, LKML, Ingo Molnar, Peter Zijlstra, Dave Jones

On Fri, Nov 08, 2013 at 03:06:11PM -0500, Vince Weaver wrote:
> 
> and again, this time after 600k successful syscalls or so.
> 
> This is on a core2 machine.
> 
> [ 1020.396002] ------------[ cut here ]------------
> [ 1020.396002] WARNING: CPU: 1 PID: 3036 at kernel/watchdog.c:245 watchdog_over)
> [ 1020.396002] Watchdog detected hard LOCKUP on cpu 1
> [ 1020.396002] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powern
> [ 1020.396002] CPU: 1 PID: 3036 Comm: perf_fuzzer Not tainted 3.12.0 #2
> [ 1020.396002] Hardware name: AOpen   DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BI2
> [ 1020.396002]  00000000000000f5 ffff88011fc87c48 ffffffff8151aad6 ffff88011fc88
> [ 1020.396002]  ffffffff8103cd68 0000000100000000 ffffffff810a101c ffff88011b380
> [ 1020.396002]  ffff88011fc87ef8 ffff88011fc87d58 0000000000000000 ffff88011fc88
> [ 1020.396002] Call Trace:
> [ 1020.396002]  <NMI>  [<ffffffff8151aad6>] ? dump_stack+0x41/0x53
> [ 1020.396002]  [<ffffffff8103cd68>] ? warn_slowpath_common+0x81/0x9c
> [ 1020.396002]  [<ffffffff810a101c>] ? watchdog_overflow_callback+0x9b/0xa6
> [ 1020.396002]  [<ffffffff8103ce25>] ? warn_slowpath_fmt+0x46/0x48
> [ 1020.396002]  [<ffffffff810a101c>] ? watchdog_overflow_callback+0x9b/0xa6
> [ 1020.396002]  [<ffffffff810ca4a2>] ? __perf_event_overflow+0x12e/0x1ab
> [ 1020.396002]  [<ffffffff81018cf5>] ? intel_pmu_handle_irq+0x318/0x34d
> [ 1020.396002]  [<ffffffff81018cfc>] ? intel_pmu_handle_irq+0x31f/0x34d
> [ 1020.396002]  [<ffffffff81018c95>] ? intel_pmu_handle_irq+0x2b8/0x34d
> [ 1020.396002]  [<ffffffff81018671>] ? intel_pmu_enable_all+0x55/0xa7
> [ 1020.396002]  [<ffffffff8151edfc>] ? nmi_handle+0xa8/0x13a
> [ 1020.396002]  [<ffffffff8151f473>] ? perf_event_nmi_handler+0x24/0x3d
> [ 1020.396002]  [<ffffffff8151edb2>] ? nmi_handle+0x5e/0x13a
> [ 1020.396002]  [<ffffffff8151ef35>] ? do_nmi+0xa7/0x2bd
> [ 1020.396002]  [<ffffffff8151e537>] ? end_repeat_nmi+0x1e/0x2e
> [ 1020.396002]  [<ffffffff8102bcc8>] ? apic_send_IPI_self+0x33/0x35
> [ 1020.396002]  [<ffffffff8102bcc8>] ? apic_send_IPI_self+0x33/0x35
> [ 1020.396002]  [<ffffffff8102bcc8>] ? apic_send_IPI_self+0x33/0x35
> [ 1020.396002]  <<EOE>>  <IRQ>  [<ffffffff81005ee2>] ? arch_irq_work_raise+0x27a
> [ 1020.396002]  [<ffffffff810c4ada>] ? irq_work_queue+0x70/0x72
> [ 1020.396002]  [<ffffffff810ca510>] ? __perf_event_overflow+0x19c/0x1ab
> [ 1020.396002]  [<ffffffff8104ae16>] ? __send_signal+0x244/0x26f
> [ 1020.396002]  [<ffffffff810ca68b>] ? perf_swevent_overflow+0x3f/0x58
> [ 1020.396002]  [<ffffffff810ca8a5>] ? perf_tp_event+0xe9/0x1f6
> [ 1020.396002]  [<ffffffff81005f6b>] ? smp_trace_irq_work_interrupt+0x34/0x7e
> [ 1020.396002]  [<ffffffff81040954>] ? perf_trace_softirq+0x8a/0x96
> [ 1020.396002]  [<ffffffff8100366f>] ? perf_trace_x86_irq_vector+0xc0/0xc9
> [ 1020.396002]  [<ffffffff81005f6b>] ? smp_trace_irq_work_interrupt+0x34/0x7e
> [ 1020.396002]  [<ffffffff81005f6b>] ? smp_trace_irq_work_interrupt+0x34/0x7e
> [ 1020.396002]  [<ffffffff815261ca>] ? trace_irq_work_interrupt+0x6a/0x70
> [ 1020.396002]  [<ffffffff81040fba>] ? __do_softirq+0x95/0x208
> [ 1020.396002]  [<ffffffff815263dc>] ? call_softirq+0x1c/0x30
> [ 1020.396002]  [<ffffffff810041b9>] ? do_softirq+0x38/0x6d
> [ 1020.396002]  [<ffffffff81040d1c>] ? irq_exit+0x45/0xa1
> [ 1020.396002]  [<ffffffff81526a14>] ? smp_trace_apic_timer_interrupt+0x83/0x91
> [ 1020.396002]  [<ffffffff8152597a>] ? trace_apic_timer_interrupt+0x6a/0x70
> [ 1020.396002]  <EOI>  [<ffffffff810c7c20>] ? __perf_event_enable+0x118/0x173
> [ 1020.396002]  [<ffffffff8104c694>] ? get_signal_to_deliver+0x47d/0x48a
> [ 1020.396002]  [<ffffffff810024da>] ? do_signal+0x46/0x5f5
> [ 1020.396002]  [<ffffffff8100b9f5>] ? perf_trace_sys_exit+0x3e/0xa7
> [ 1020.396002]  [<ffffffff81002ab5>] ? do_notify_resume+0x2c/0x6a
> [ 1020.396002]  [<ffffffff81524ea0>] ? int_signal+0x12/0x17
> [ 1020.396002] ---[ end trace 2b385d729ba32e92 ]---
> [ 1020.396002] perf samples too long (8101340 > 10000), lowering kernel.perf_ev0
> [ 1020.396002] INFO: NMI handler (perf_event_nmi_handler) took too long to run:s
> [ 1033.301811] perf samples too long (8038058 > 20000), lowering kernel.perf_ev0
> [ 1034.125811] perf samples too long (7975268 > 40000), lowering kernel.perf_ev0
> [ 1034.973813] perf samples too long (7912968 > 76923), lowering kernel.perf_ev0
> [ 1035.821813] perf samples too long (7851154 > 142857), lowering kernel.perf_e0
> [ 1036.669813] perf samples too long (7789824 > 250000), lowering kernel.perf_e0
> [ 1037.517812] perf samples too long (7728973 > 500000), lowering kernel.perf_e0
> 

There seems to be a loop that takes too long in intel_pmu_handle_irq(). Your two
previous reports seemed to suggest that LBR is involved, but not this one.

I'm going to look further into this; unfortunately my testbox is an AMD...


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-08 19:45 perf/tracepoint: another fuzzer generated lockup Vince Weaver
@ 2013-11-08 20:06 ` Vince Weaver
  2013-11-08 20:02   ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Vince Weaver @ 2013-11-08 20:06 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Steven Rostedt, LKML, Ingo Molnar, Peter Zijlstra, Dave Jones,
	Frederic Weisbecker


and again, this time after 600k successful syscalls or so.

This is on a core2 machine.

[ 1020.396002] ------------[ cut here ]------------
[ 1020.396002] WARNING: CPU: 1 PID: 3036 at kernel/watchdog.c:245 watchdog_over)
[ 1020.396002] Watchdog detected hard LOCKUP on cpu 1
[ 1020.396002] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powern
[ 1020.396002] CPU: 1 PID: 3036 Comm: perf_fuzzer Not tainted 3.12.0 #2
[ 1020.396002] Hardware name: AOpen   DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BI2
[ 1020.396002]  00000000000000f5 ffff88011fc87c48 ffffffff8151aad6 ffff88011fc88
[ 1020.396002]  ffffffff8103cd68 0000000100000000 ffffffff810a101c ffff88011b380
[ 1020.396002]  ffff88011fc87ef8 ffff88011fc87d58 0000000000000000 ffff88011fc88
[ 1020.396002] Call Trace:
[ 1020.396002]  <NMI>  [<ffffffff8151aad6>] ? dump_stack+0x41/0x53
[ 1020.396002]  [<ffffffff8103cd68>] ? warn_slowpath_common+0x81/0x9c
[ 1020.396002]  [<ffffffff810a101c>] ? watchdog_overflow_callback+0x9b/0xa6
[ 1020.396002]  [<ffffffff8103ce25>] ? warn_slowpath_fmt+0x46/0x48
[ 1020.396002]  [<ffffffff810a101c>] ? watchdog_overflow_callback+0x9b/0xa6
[ 1020.396002]  [<ffffffff810ca4a2>] ? __perf_event_overflow+0x12e/0x1ab
[ 1020.396002]  [<ffffffff81018cf5>] ? intel_pmu_handle_irq+0x318/0x34d
[ 1020.396002]  [<ffffffff81018cfc>] ? intel_pmu_handle_irq+0x31f/0x34d
[ 1020.396002]  [<ffffffff81018c95>] ? intel_pmu_handle_irq+0x2b8/0x34d
[ 1020.396002]  [<ffffffff81018671>] ? intel_pmu_enable_all+0x55/0xa7
[ 1020.396002]  [<ffffffff8151edfc>] ? nmi_handle+0xa8/0x13a
[ 1020.396002]  [<ffffffff8151f473>] ? perf_event_nmi_handler+0x24/0x3d
[ 1020.396002]  [<ffffffff8151edb2>] ? nmi_handle+0x5e/0x13a
[ 1020.396002]  [<ffffffff8151ef35>] ? do_nmi+0xa7/0x2bd
[ 1020.396002]  [<ffffffff8151e537>] ? end_repeat_nmi+0x1e/0x2e
[ 1020.396002]  [<ffffffff8102bcc8>] ? apic_send_IPI_self+0x33/0x35
[ 1020.396002]  [<ffffffff8102bcc8>] ? apic_send_IPI_self+0x33/0x35
[ 1020.396002]  [<ffffffff8102bcc8>] ? apic_send_IPI_self+0x33/0x35
[ 1020.396002]  <<EOE>>  <IRQ>  [<ffffffff81005ee2>] ? arch_irq_work_raise+0x27a
[ 1020.396002]  [<ffffffff810c4ada>] ? irq_work_queue+0x70/0x72
[ 1020.396002]  [<ffffffff810ca510>] ? __perf_event_overflow+0x19c/0x1ab
[ 1020.396002]  [<ffffffff8104ae16>] ? __send_signal+0x244/0x26f
[ 1020.396002]  [<ffffffff810ca68b>] ? perf_swevent_overflow+0x3f/0x58
[ 1020.396002]  [<ffffffff810ca8a5>] ? perf_tp_event+0xe9/0x1f6
[ 1020.396002]  [<ffffffff81005f6b>] ? smp_trace_irq_work_interrupt+0x34/0x7e
[ 1020.396002]  [<ffffffff81040954>] ? perf_trace_softirq+0x8a/0x96
[ 1020.396002]  [<ffffffff8100366f>] ? perf_trace_x86_irq_vector+0xc0/0xc9
[ 1020.396002]  [<ffffffff81005f6b>] ? smp_trace_irq_work_interrupt+0x34/0x7e
[ 1020.396002]  [<ffffffff81005f6b>] ? smp_trace_irq_work_interrupt+0x34/0x7e
[ 1020.396002]  [<ffffffff815261ca>] ? trace_irq_work_interrupt+0x6a/0x70
[ 1020.396002]  [<ffffffff81040fba>] ? __do_softirq+0x95/0x208
[ 1020.396002]  [<ffffffff815263dc>] ? call_softirq+0x1c/0x30
[ 1020.396002]  [<ffffffff810041b9>] ? do_softirq+0x38/0x6d
[ 1020.396002]  [<ffffffff81040d1c>] ? irq_exit+0x45/0xa1
[ 1020.396002]  [<ffffffff81526a14>] ? smp_trace_apic_timer_interrupt+0x83/0x91
[ 1020.396002]  [<ffffffff8152597a>] ? trace_apic_timer_interrupt+0x6a/0x70
[ 1020.396002]  <EOI>  [<ffffffff810c7c20>] ? __perf_event_enable+0x118/0x173
[ 1020.396002]  [<ffffffff8104c694>] ? get_signal_to_deliver+0x47d/0x48a
[ 1020.396002]  [<ffffffff810024da>] ? do_signal+0x46/0x5f5
[ 1020.396002]  [<ffffffff8100b9f5>] ? perf_trace_sys_exit+0x3e/0xa7
[ 1020.396002]  [<ffffffff81002ab5>] ? do_notify_resume+0x2c/0x6a
[ 1020.396002]  [<ffffffff81524ea0>] ? int_signal+0x12/0x17
[ 1020.396002] ---[ end trace 2b385d729ba32e92 ]---
[ 1020.396002] perf samples too long (8101340 > 10000), lowering kernel.perf_ev0
[ 1020.396002] INFO: NMI handler (perf_event_nmi_handler) took too long to run:s
[ 1033.301811] perf samples too long (8038058 > 20000), lowering kernel.perf_ev0
[ 1034.125811] perf samples too long (7975268 > 40000), lowering kernel.perf_ev0
[ 1034.973813] perf samples too long (7912968 > 76923), lowering kernel.perf_ev0
[ 1035.821813] perf samples too long (7851154 > 142857), lowering kernel.perf_e0
[ 1036.669813] perf samples too long (7789824 > 250000), lowering kernel.perf_e0
[ 1037.517812] perf samples too long (7728973 > 500000), lowering kernel.perf_e0



* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-08 20:02   ` Frederic Weisbecker
@ 2013-11-08 20:23     ` Vince Weaver
  2013-11-08 20:48       ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Vince Weaver @ 2013-11-08 20:23 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Peter Zijlstra,
	Dave Jones

On Fri, 8 Nov 2013, Frederic Weisbecker wrote:
> 
> There seems to be a loop that takes too long in intel_pmu_handle_irq(). Your two
> previous reports seemed to suggest that LBR is involved, but not this one.

I may be wrong but I think everything between <NMI> and <EOE> is just
noise from the NMI perf-event watchdog timer kicking in.

Why that code would be reading the LBR registers I don't know.

Vince




* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-08 20:23     ` Vince Weaver
@ 2013-11-08 20:48       ` Frederic Weisbecker
  2013-11-08 21:15         ` Vince Weaver
  0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-08 20:48 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Steven Rostedt, LKML, Ingo Molnar, Peter Zijlstra, Dave Jones

On Fri, Nov 08, 2013 at 03:23:07PM -0500, Vince Weaver wrote:
> On Fri, 8 Nov 2013, Frederic Weisbecker wrote:
> > 
> > There seems to be a loop that takes too long in intel_pmu_handle_irq(). Your two
> > previous reports seemed to suggest that LBR is involved, but not this one.
> 
> I may be wrong but I think everything between <NMI> and <EOE> is just
> noise from the NMI perf-event watchdog timer kicking in.

Ah good point.

So the pattern seems to be that irq work/perf_event_wakeup is involved, maybe
interrupting a tracepoint event or so.

It would be nice if you enabled CONFIG_FRAME_POINTER for your future reports
so that we get more precise traces. Or maybe you actually enabled it and it
doesn't work?
 
> Why that code would be reading the LBR registers I don't know.

Yeah, that's just an unconditional call that checks whether there are LBR branches to record.

> Vince
> 
> 


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-08 20:48       ` Frederic Weisbecker
@ 2013-11-08 21:15         ` Vince Weaver
  2013-11-08 22:24           ` Frederic Weisbecker
                             ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Vince Weaver @ 2013-11-08 21:15 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Peter Zijlstra,
	Dave Jones

[-- Attachment #1: Type: TEXT/PLAIN, Size: 993 bytes --]

On Fri, 8 Nov 2013, Frederic Weisbecker wrote:

> On Fri, Nov 08, 2013 at 03:23:07PM -0500, Vince Weaver wrote:
> > On Fri, 8 Nov 2013, Frederic Weisbecker wrote:
> > > 
> > > There seems to be a loop that takes too long in intel_pmu_handle_irq(). Your two
> > > previous reports seemed to suggest that LBR is involved, but not this one.
> > 
> > I may be wrong but I think everything between <NMI> and <EOE> is just
> > noise from the NMI perf-event watchdog timer kicking in.
> 
> Ah good point.
> 
> So the pattern seems to be that irq work/perf_event_wakeup is involved, maybe
> interrupting a tracepoint event or so.

I managed to construct a reproducible test case, which is attached.  I 
sometimes have to run it 5-10 times before it triggers.

> It would be nice if you enabled CONFIG_FRAME_POINTER for your future reports
> so that we get more precise traces. Or maybe you actually enabled it and it
> doesn't work?

No, it's not enabled; I can try setting that in the future.

Vince



[-- Attachment #2: Type: TEXT/x-csrc; name=out.c, Size: 2950 bytes --]

/* log_to_code output from ./new.trace */
/* by Vince Weaver <vincent.weaver _at_ maine.edu> */

#define _GNU_SOURCE 1
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
#include <sys/prctl.h>
#include <poll.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>

int fd[1024];
struct perf_event_attr pe[1024];
char *mmap_result[1024];
#define MAX_READ_SIZE 65536
static long long data[MAX_READ_SIZE];

#define MAX_POLL_FDS 128
struct pollfd pollfds[MAX_POLL_FDS];

long long id;

int forked_pid;

struct sigaction sa;
static int overflows=0;
static int sigios=0;

static void our_handler(int signum, siginfo_t *info, void *uc) {
	int fd = info->si_fd;
	int ret;

	overflows++;
	ioctl(fd,PERF_EVENT_IOC_DISABLE,0);
	if (sigios) return;
	ret=ioctl(fd, PERF_EVENT_IOC_REFRESH,1);
}
int perf_event_open(struct perf_event_attr *hw_event_uptr,
	pid_t pid, int cpu, int group_fd, unsigned long flags) {

	return syscall(__NR_perf_event_open,hw_event_uptr, pid, cpu,
		group_fd, flags);
}

int main(int argc, char **argv) {
/* 1 */
/* fd = 82 */

	memset(&pe[82],0,sizeof(struct perf_event_attr));
	pe[82].type=PERF_TYPE_TRACEPOINT;
	pe[82].size=80;
	pe[82].config=0x18;
	pe[82].sample_period=0xdd95cac7b947b610;
	pe[82].sample_type=PERF_SAMPLE_IP|PERF_SAMPLE_TIME|PERF_SAMPLE_READ|PERF_SAMPLE_CALLCHAIN|PERF_SAMPLE_PERIOD|PERF_SAMPLE_DATA_SRC;
//|PERF_SAMPLE_IDENTIFIER; /* 18135 */
	pe[82].read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP; /* f */
	pe[82].exclude_hv=1;
	pe[82].mmap=1;
	pe[82].comm=1;
	pe[82].watermark=1;
	pe[82].precise_ip=1; /* constant skid */
	pe[82].mmap_data=1;
	pe[82].exclude_host=1;
	pe[82].exclude_guest=1;
	pe[82].exclude_callchain_user=1;
	pe[82].wakeup_watermark=232;
	pe[82].bp_type=HW_BREAKPOINT_EMPTY;
	pe[82].config1=0x9;
	pe[82].branch_sample_type=PERF_SAMPLE_BRANCH_USER|PERF_SAMPLE_BRANCH_KERNEL|PERF_SAMPLE_BRANCH_HV|PERF_SAMPLE_BRANCH_ANY|PERF_SAMPLE_BRANCH_ANY_CALL|PERF_SAMPLE_BRANCH_ANY_RETURN|PERF_SAMPLE_BRANCH_IND_CALL;
//|PERF_SAMPLE_BRANCH_ABORT_TX;
	pe[82].sample_regs_user=39;
	pe[82].sample_stack_user=37552;

	fd[82]=perf_event_open(&pe[82],
				0, /* current thread */
				-1, /* all cpus */
				-1, /* New Group Leader */
				PERF_FLAG_FD_OUTPUT /*2*/ );


/* 2 */
	mmap_result[82]=mmap(NULL, 20480,PROT_READ|PROT_WRITE, MAP_SHARED,fd[82], 0);
/* 3 */
	memset(&sa, 0, sizeof(struct sigaction));
	sa.sa_sigaction = our_handler;
	sa.sa_flags = SA_SIGINFO;
	if (sigaction( SIGRTMIN+2, &sa, NULL) < 0) {
		printf("Error setting up signal handler\n");
	}
	fcntl(fd[82], F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
	fcntl(fd[82], F_SETSIG, SIGRTMIN+2);
	fcntl(fd[82], F_SETOWN,getpid());
	/* Replayed 3 syscalls */
	return 0;
}


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-08 21:15         ` Vince Weaver
@ 2013-11-08 22:24           ` Frederic Weisbecker
  2013-11-08 22:36           ` Frederic Weisbecker
  2013-11-09  0:25           ` perf/tracepoint: another fuzzer generated lockup Frederic Weisbecker
  2 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-08 22:24 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Steven Rostedt, LKML, Ingo Molnar, Peter Zijlstra, Dave Jones

On Fri, Nov 08, 2013 at 04:15:21PM -0500, Vince Weaver wrote:
> On Fri, 8 Nov 2013, Frederic Weisbecker wrote:
> 
> > On Fri, Nov 08, 2013 at 03:23:07PM -0500, Vince Weaver wrote:
> > > On Fri, 8 Nov 2013, Frederic Weisbecker wrote:
> > > > 
> > > > There seems to be a loop that takes too long in intel_pmu_handle_irq(). Your two
> > > > previous reports seemed to suggest that LBR is involved, but not this one.
> > > 
> > > I may be wrong but I think everything between <NMI> and <EOE> is just
> > > noise from the NMI perf-event watchdog timer kicking in.
> > 
> > Ah good point.
> > 
> > So the pattern seems to be that irq work/perf_event_wakeup is involved, maybe
> > interrupting a tracepoint event or so.
> 
> I managed to construct a reproducible test case, which is attached.  I 
> sometimes have to run it 5-10 times before it triggers.
> 

Ah, with this I can reproduce. I need to run it in a loop for a few seconds to trigger it:

[   91.750943] ------------[ cut here ]------------
[   91.755440] WARNING: CPU: 1 PID: 968 at kernel/watchdog.c:246 watchdog_overflow_callback+0x9a/0xc0()
[   91.764530] Watchdog detected hard LOCKUP on cpu 1
[   91.769121] Modules linked in:
[   91.772328] CPU: 1 PID: 968 Comm: out Not tainted 3.12.0+ #47
[   91.778042] Hardware name: FUJITSU SIEMENS AMD690VM-FMH/AMD690VM-FMH, BIOS V5.13 03/14/2008
[   91.786358]  00000000000000f6 ffff880107c87bd8 ffffffff815bac1a ffff8800cd0bafa0
[   91.793714]  ffff880107c87c28 ffff880107c87c18 ffffffff8104dffc ffff880107c87e08
[   91.801076]  ffff880103d20000 0000000000000000 ffff880107c87d38 0000000000000000
[   91.808444] Call Trace:
[   91.810869]  <NMI>  [<ffffffff815bac1a>] dump_stack+0x4f/0x7c
[   91.816589]  [<ffffffff8104dffc>] warn_slowpath_common+0x8c/0xc0
[   91.822564]  [<ffffffff8104e0e6>] warn_slowpath_fmt+0x46/0x50
[   91.828281]  [<ffffffff810ef2ea>] watchdog_overflow_callback+0x9a/0xc0
[   91.834777]  [<ffffffff8111bc78>] __perf_event_overflow+0x98/0x310
[   91.840928]  [<ffffffff81118a40>] ? perf_event_task_disable+0x90/0x90
[   91.847336]  [<ffffffff8111cb44>] perf_event_overflow+0x14/0x20
[   91.853227]  [<ffffffff81019eda>] x86_pmu_handle_irq+0x12a/0x180
[   91.859198]  [<ffffffff81018ad4>] perf_event_nmi_handler+0x34/0x60
[   91.865353]  [<ffffffff81006f76>] nmi_handle.isra.3+0xc6/0x3e0
[   91.871154]  [<ffffffff81006eb5>] ? nmi_handle.isra.3+0x5/0x3e0
[   91.877045]  [<ffffffff8102fd60>] ? perf_ibs_handle_irq+0x420/0x420
[   91.883275]  [<ffffffff810073a0>] do_nmi+0x110/0x390
[   91.888219]  [<ffffffff815c9631>] end_repeat_nmi+0x1e/0x2e
[   91.893676]  [<ffffffff812ef38d>] ? delay_tsc+0x6d/0xc0
[   91.898871]  [<ffffffff812ef38d>] ? delay_tsc+0x6d/0xc0
[   91.904068]  [<ffffffff812ef38d>] ? delay_tsc+0x6d/0xc0
[   91.909258]  <<EOE>>  <IRQ>  [<ffffffff812ef27f>] __delay+0xf/0x20
[   91.915415]  [<ffffffff812ef2b7>] __const_udelay+0x27/0x30
[   91.920873]  [<ffffffff810ba31c>] __rcu_read_unlock+0x5c/0xa0
[   91.926589]  [<ffffffff811815b3>] kill_fasync+0x163/0x2a0
[   91.931958]  [<ffffffff81181476>] ? kill_fasync+0x26/0x2a0
[   91.937414]  [<ffffffff812ef2b7>] ? __const_udelay+0x27/0x30
[   91.943045]  [<ffffffff8111988e>] perf_event_wakeup+0x15e/0x290
[   91.948935]  [<ffffffff81119730>] ? __perf_event_task_sched_out+0x540/0x540
[   91.955864]  [<ffffffff811199f7>] perf_pending_event+0x37/0x60
[   91.961667]  [<ffffffff81112bb7>] __irq_work_run+0x77/0xa0
[   91.967122]  [<ffffffff81112c58>] irq_work_run+0x18/0x30
[   91.972404]  [<ffffffff810080ad>] smp_trace_irq_work_interrupt+0x3d/0x2a0
[   91.979163]  [<ffffffff815cb132>] trace_irq_work_interrupt+0x72/0x80
[   91.985486]  [<ffffffff815c8fb7>] ? retint_restore_args+0x13/0x13
[   91.991548]  [<ffffffff815c877a>] ? _raw_spin_unlock_irqrestore+0x7a/0x90
[   91.998305]  [<ffffffff810c101b>] rcu_process_callbacks+0x1db/0x530
[   92.004541]  [<ffffffff8105381d>] __do_softirq+0xdd/0x490
[   92.009911]  [<ffffffff81053fe6>] irq_exit+0x96/0xc0
[   92.014849]  [<ffffffff815cbc3a>] smp_trace_apic_timer_interrupt+0x5a/0x2b4
[   92.021777]  [<ffffffff815ca7b2>] trace_apic_timer_interrupt+0x72/0x80
[   92.028264]  <EOI>  [<ffffffff81180a95>] ? f_setown+0x5/0x150
[   92.033990]  [<ffffffff81180d3a>] ? SyS_fcntl+0xda/0x640
[   92.039267]  [<ffffffff812f06ae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   92.045683]  [<ffffffff815c9aa9>] system_call_fastpath+0x16/0x1b
[   92.051657] ---[ end trace cd18ca1175c8c7fd ]---
[   92.056246] perf samples too long (2385259 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[   92.065604] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 314.659 msecs
[  101.744824] perf samples too long (2366626 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[  111.738707] perf samples too long (2348139 > 10000), lowering kernel.perf_event_max_sample_rate to 13000
[  121.732589] perf samples too long (2329797 > 19230), lowering kernel.perf_event_max_sample_rate to 7000
[  131.726473] perf samples too long (2311632 > 35714), lowering kernel.perf_event_max_sample_rate to 4000
[  141.720356] perf samples too long (2293575 > 62500), lowering kernel.perf_event_max_sample_rate to 2000
[  151.714239] perf samples too long (2275693 > 125000), lowering kernel.perf_event_max_sample_rate to 1000


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-08 21:15         ` Vince Weaver
  2013-11-08 22:24           ` Frederic Weisbecker
@ 2013-11-08 22:36           ` Frederic Weisbecker
  2013-11-09  1:09             ` Steven Rostedt
                               ` (2 more replies)
  2013-11-09  0:25           ` perf/tracepoint: another fuzzer generated lockup Frederic Weisbecker
  2 siblings, 3 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-08 22:36 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Steven Rostedt, LKML, Ingo Molnar, Peter Zijlstra, Dave Jones

On Fri, Nov 08, 2013 at 04:15:21PM -0500, Vince Weaver wrote:
> On Fri, 8 Nov 2013, Frederic Weisbecker wrote:
> 
> > On Fri, Nov 08, 2013 at 03:23:07PM -0500, Vince Weaver wrote:
> > > On Fri, 8 Nov 2013, Frederic Weisbecker wrote:
> > > > 
> > > > There seems to be a loop that takes too long in intel_pmu_handle_irq(). Your two
> > > > previous reports seemed to suggest that LBR is involved, but not this one.
> > > 
> > > I may be wrong but I think everything between <NMI> and <EOE> is just
> > > noise from the NMI perf-event watchdog timer kicking in.
> > 
> > Ah good point.
> > 
> > So the pattern seems to be that irq work/perf_event_wakeup is involved, maybe
> > interrupting a tracepoint event or so.
> 
> I managed to construct a reproducible test case, which is attached.  I 
> sometimes have to run it 5-10 times before it triggers.
> 

And another one. I wonder if the problem is actually in the perf NMI handler
itself; look below:

[  237.192886] ------------[ cut here ]------------
[  237.197385] WARNING: CPU: 2 PID: 1016 at kernel/watchdog.c:246 watchdog_overflow_callback+0x9a/0xc0()
[  237.206562] Watchdog detected hard LOCKUP on cpu 2
[  237.211152] Modules linked in:
[  237.214361] CPU: 2 PID: 1016 Comm: out Not tainted 3.12.0+ #47
[  237.220160] Hardware name: FUJITSU SIEMENS AMD690VM-FMH/AMD690VM-FMH, BIOS V5.13 03/14/2008
[  237.228475]  00000000000000f6 ffff880107d07bd8 ffffffff815bac1a ffff8800cda34f60
[  237.235834]  ffff880107d07c28 ffff880107d07c18 ffffffff8104dffc ffff880107d07e08
[  237.243192]  ffff880103d58000 0000000000000000 ffff880107d07d38 0000000000000000
[  237.250556] Call Trace:
[  237.252984]  <NMI>  [<ffffffff815bac1a>] dump_stack+0x4f/0x7c
[  237.258706]  [<ffffffff8104dffc>] warn_slowpath_common+0x8c/0xc0
[  237.264682]  [<ffffffff8104e0e6>] warn_slowpath_fmt+0x46/0x50
[  237.270397]  [<ffffffff810ef2ea>] watchdog_overflow_callback+0x9a/0xc0
[  237.276895]  [<ffffffff8111bc78>] __perf_event_overflow+0x98/0x310
[  237.283045]  [<ffffffff81118a40>] ? perf_event_task_disable+0x90/0x90
[  237.289453]  [<ffffffff8111cb44>] perf_event_overflow+0x14/0x20
[  237.295344]  [<ffffffff81019eda>] x86_pmu_handle_irq+0x12a/0x180
[  237.301319]  [<ffffffff81018ad4>] perf_event_nmi_handler+0x34/0x60
[  237.307464]  [<ffffffff81006f76>] nmi_handle.isra.3+0xc6/0x3e0
[  237.313271]  [<ffffffff81006eb5>] ? nmi_handle.isra.3+0x5/0x3e0
[  237.319163]  [<ffffffff8102fd60>] ? perf_ibs_handle_irq+0x420/0x420
[  237.325397]  [<ffffffff810073a0>] do_nmi+0x110/0x390
[  237.330336]  [<ffffffff815c9631>] end_repeat_nmi+0x1e/0x2e
[  237.335792]  [<ffffffff81005045>] ? dump_trace+0x75/0x2f0
[  237.341162]  [<ffffffff81005045>] ? dump_trace+0x75/0x2f0
[  237.346532]  [<ffffffff81005045>] ? dump_trace+0x75/0x2f0
[  237.351900]  <<EOE>>  <IRQ>  [<ffffffff8105ec15>] ? __sigqueue_alloc+0x5/0x280
[  237.359091]  [<ffffffff8101a4d1>] perf_callchain_kernel+0x51/0x70
[  237.365155]  [<ffffffff8111fec6>] perf_callchain+0x256/0x2c0
[  237.370783]  [<ffffffff8111bb5b>] perf_prepare_sample+0x27b/0x300
[  237.376849]  [<ffffffff810bc1ea>] ? __rcu_is_watching+0x1a/0x30
[  237.382736]  [<ffffffff8111bd2c>] __perf_event_overflow+0x14c/0x310
[  237.388973]  [<ffffffff8111bcd9>] ? __perf_event_overflow+0xf9/0x310
[  237.395291]  [<ffffffff8109aa6d>] ? trace_hardirqs_off+0xd/0x10
[  237.401186]  [<ffffffff815c8753>] ? _raw_spin_unlock_irqrestore+0x53/0x90
[  237.407941]  [<ffffffff81061b46>] ? do_send_sig_info+0x66/0x90
[  237.413744]  [<ffffffff8111c0f9>] perf_swevent_overflow+0xa9/0xc0
[  237.419808]  [<ffffffff8111c16f>] perf_swevent_event+0x5f/0x80
[  237.425610]  [<ffffffff8111c2b8>] perf_tp_event+0x128/0x420
[  237.431154]  [<ffffffff81008108>] ? smp_trace_irq_work_interrupt+0x98/0x2a0
[  237.438085]  [<ffffffff815c83b5>] ? _raw_read_unlock+0x35/0x60
[  237.443887]  [<ffffffff81003fe7>] perf_trace_x86_irq_vector+0xc7/0xe0
[  237.450295]  [<ffffffff81008108>] ? smp_trace_irq_work_interrupt+0x98/0x2a0
[  237.457226]  [<ffffffff81008108>] smp_trace_irq_work_interrupt+0x98/0x2a0
[  237.463983]  [<ffffffff815cb132>] trace_irq_work_interrupt+0x72/0x80
[  237.470303]  [<ffffffff815c8fb7>] ? retint_restore_args+0x13/0x13
[  237.476366]  [<ffffffff815c877a>] ? _raw_spin_unlock_irqrestore+0x7a/0x90
[  237.483117]  [<ffffffff810c101b>] rcu_process_callbacks+0x1db/0x530
[  237.489360]  [<ffffffff8105381d>] __do_softirq+0xdd/0x490
[  237.494728]  [<ffffffff81053fe6>] irq_exit+0x96/0xc0
[  237.499668]  [<ffffffff815cbc3a>] smp_trace_apic_timer_interrupt+0x5a/0x2b4
[  237.506596]  [<ffffffff815ca7b2>] trace_apic_timer_interrupt+0x72/0x80
[  237.513083]  <EOI>  [<ffffffff812ef38a>] ? delay_tsc+0x6a/0xc0
[  237.518895]  [<ffffffff812ef382>] ? delay_tsc+0x62/0xc0
[  237.524091]  [<ffffffff812ef27f>] __delay+0xf/0x20
[  237.528856]  [<ffffffff812ef2b7>] __const_udelay+0x27/0x30
[  237.534313]  [<ffffffff810ba31c>] __rcu_read_unlock+0x5c/0xa0
[  237.540029]  [<ffffffff81122255>] find_get_page+0xf5/0x220
[  237.545485]  [<ffffffff81122165>] ? find_get_page+0x5/0x220
[  237.551031]  [<ffffffff8112497b>] filemap_fault+0x8b/0x500
[  237.556487]  [<ffffffff81148459>] __do_fault+0x69/0x4f0
[  237.561682]  [<ffffffff8114c6dc>] handle_mm_fault+0x18c/0x940
[  237.567400]  [<ffffffff81042c71>] ? __do_page_fault+0x111/0x4e0
[  237.573289]  [<ffffffff8109af14>] ? __lock_is_held+0x54/0x70
[  237.578913]  [<ffffffff81042cde>] __do_page_fault+0x17e/0x4e0
[  237.584636]  [<ffffffff812ef3a8>] ? delay_tsc+0x88/0xc0
[  237.589831]  [<ffffffff812ef27f>] ? __delay+0xf/0x20
[  237.594769]  [<ffffffff812ef2b7>] ? __const_udelay+0x27/0x30
[  237.600399]  [<ffffffff810ba31c>] ? __rcu_read_unlock+0x5c/0xa0
[  237.606290]  [<ffffffff812f06ed>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[  237.612785]  [<ffffffff8104307e>] do_page_fault+0xe/0x10
[  237.618069]  [<ffffffff815c9228>] page_fault+0x28/0x30
[  237.623178] ---[ end trace 40cda30d05d0ffa6 ]---
[  237.627769] perf samples too long (3397569 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[  237.637124] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 444.233 msecs

444 msecs is huge.

[  247.186562] perf samples too long (3371028 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[  257.180241] perf samples too long (3344694 > 10000), lowering kernel.perf_event_max_sample_rate to 13000
[  267.173920] perf samples too long (3318566 > 19230), lowering kernel.perf_event_max_sample_rate to 7000
[  277.167598] perf samples too long (3292642 > 35714), lowering kernel.perf_event_max_sample_rate to 4000
[  287.161277] perf samples too long (3266920 > 62500), lowering kernel.perf_event_max_sample_rate to 2000
[  297.154956] perf samples too long (3241400 > 125000), lowering kernel.perf_event_max_sample_rate to 1000

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-08 21:15         ` Vince Weaver
  2013-11-08 22:24           ` Frederic Weisbecker
  2013-11-08 22:36           ` Frederic Weisbecker
@ 2013-11-09  0:25           ` Frederic Weisbecker
  2 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-09  0:25 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Steven Rostedt, LKML, Ingo Molnar, Peter Zijlstra, Dave Jones

On Fri, Nov 08, 2013 at 04:15:21PM -0500, Vince Weaver wrote:
> 
> int main(int argc, char **argv) {
> /* 1 */
> /* fd = 82 */
> 
> 	memset(&pe[82],0,sizeof(struct perf_event_attr));
> 	pe[82].type=PERF_TYPE_TRACEPOINT;
> 	pe[82].size=80;
> 	pe[82].config=0x18;

I did some more testing and 0x18 on my setup is the irq_work_exit event. I tried a few
other tracepoints (tried the timer interrupt for example) but they did not trigger the issue.

And since the irq work handles the pending disable and wakeup, maybe that fiddles with some
perf state and makes the NMI handler run into some buggy loop.

Perhaps that's triggered with some help from the disable ioctl.

I don't know, my bed is calling so I'll test that further tomorrow or so.

Thanks.

> 	pe[82].sample_period=0xdd95cac7b947b610;
> 	pe[82].sample_type=PERF_SAMPLE_IP|PERF_SAMPLE_TIME|PERF_SAMPLE_READ|PERF_SAMPLE_CALLCHAIN|PERF_SAMPLE_PERIOD|PERF_SAMPLE_DATA_SRC;
> //|PERF_SAMPLE_IDENTIFIER; /* 18135 */
> 	pe[82].read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP; /* f */
> 	pe[82].exclude_hv=1;
> 	pe[82].mmap=1;
> 	pe[82].comm=1;
> 	pe[82].watermark=1;
> 	pe[82].precise_ip=1; /* constant skid */
> 	pe[82].mmap_data=1;
> 	pe[82].exclude_host=1;
> 	pe[82].exclude_guest=1;
> 	pe[82].exclude_callchain_user=1;
> 	pe[82].wakeup_watermark=232;
> 	pe[82].bp_type=HW_BREAKPOINT_EMPTY;
> 	pe[82].config1=0x9;
> 	pe[82].branch_sample_type=PERF_SAMPLE_BRANCH_USER|PERF_SAMPLE_BRANCH_KERNEL|PERF_SAMPLE_BRANCH_HV|PERF_SAMPLE_BRANCH_ANY|PERF_SAMPLE_BRANCH_ANY_CALL|PERF_SAMPLE_BRANCH_ANY_RETURN|PERF_SAMPLE_BRANCH_IND_CALL;
> //|PERF_SAMPLE_BRANCH_ABORT_TX;
> 	pe[82].sample_regs_user=39;
> 	pe[82].sample_stack_user=37552;
> 
> 	fd[82]=perf_event_open(&pe[82],
> 				0, /* current thread */
> 				-1, /* all cpus */
> 				-1, /* New Group Leader */
> 				PERF_FLAG_FD_OUTPUT /*2*/ );
> 
> 
> /* 2 */
> 	mmap_result[82]=mmap(NULL, 20480,PROT_READ|PROT_WRITE, MAP_SHARED,fd[82], 0);
> /* 3 */
> 	memset(&sa, 0, sizeof(struct sigaction));
> 	sa.sa_sigaction = our_handler;
> 	sa.sa_flags = SA_SIGINFO;
> 	if (sigaction( SIGRTMIN+2, &sa, NULL) < 0) {
> 		printf("Error setting up signal handler\n");
> 	}
> 	fcntl(fd[82], F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
> 	fcntl(fd[82], F_SETSIG, SIGRTMIN+2);
> 	fcntl(fd[82], F_SETOWN,getpid());
> 	/* Replayed 3 syscalls */
> 	return 0;
> }



* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-08 22:36           ` Frederic Weisbecker
@ 2013-11-09  1:09             ` Steven Rostedt
  2013-11-09 14:10             ` Peter Zijlstra
  2013-11-09 15:11             ` Peter Zijlstra
  2 siblings, 0 replies; 38+ messages in thread
From: Steven Rostedt @ 2013-11-09  1:09 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Vince Weaver, LKML, Ingo Molnar, Peter Zijlstra, Dave Jones

On Fri, 8 Nov 2013 23:36:58 +0100
Frederic Weisbecker <fweisbec@gmail.com> wrote:


> [  237.623178] ---[ end trace 40cda30d05d0ffa6 ]---
> [  237.627769] perf samples too long (3397569 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> [  237.637124] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 444.233 msecs
> 
> 444 msecs is huge.

I don't think that's the issue you are looking for. Peter told me
there's a timer race with overflows or something (I don't remember the
details, Peter can clarify it), that causes the second timestamp to be
bogus compared to the first, and this makes the code think that the NMI
handler was running much longer than it actually was.

-- Steve

> 
> [  247.186562] perf samples too long (3371028 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
> [  257.180241] perf samples too long (3344694 > 10000), lowering kernel.perf_event_max_sample_rate to 13000
> [  267.173920] perf samples too long (3318566 > 19230), lowering kernel.perf_event_max_sample_rate to 7000
> [  277.167598] perf samples too long (3292642 > 35714), lowering kernel.perf_event_max_sample_rate to 4000
> [  287.161277] perf samples too long (3266920 > 62500), lowering kernel.perf_event_max_sample_rate to 2000
> [  297.154956] perf samples too long (3241400 > 125000), lowering kernel.perf_event_max_sample_rate to 1000



* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-08 22:36           ` Frederic Weisbecker
  2013-11-09  1:09             ` Steven Rostedt
@ 2013-11-09 14:10             ` Peter Zijlstra
  2013-11-09 14:20               ` Frederic Weisbecker
  2013-11-09 14:52               ` Frederic Weisbecker
  2013-11-09 15:11             ` Peter Zijlstra
  2 siblings, 2 replies; 38+ messages in thread
From: Peter Zijlstra @ 2013-11-09 14:10 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Fri, Nov 08, 2013 at 11:36:58PM +0100, Frederic Weisbecker wrote:
> [  237.627769] perf samples too long (3397569 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> [  237.637124] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 444.233 msecs
> 
> 444 msecs is huge.

Be glad your system lived to tell about it ;-) Calling printk() from NMI
context is Russian roulette; I'm still waiting for the first report it
actually locked up :-)

That said, I'm not sure what kernel you're running, but there were some
issues with time-keeping hereabouts, but more importantly that second
timing includes the printk() call of the first -- so that's always going
to be fucked.


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-09 14:10             ` Peter Zijlstra
@ 2013-11-09 14:20               ` Frederic Weisbecker
  2013-11-11 12:44                 ` Ingo Molnar
  2013-11-09 14:52               ` Frederic Weisbecker
  1 sibling, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-09 14:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Sat, Nov 09, 2013 at 03:10:39PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 08, 2013 at 11:36:58PM +0100, Frederic Weisbecker wrote:
> > [  237.627769] perf samples too long (3397569 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> > [  237.637124] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 444.233 msecs
> > 
> > 444 msecs is huge.
> 
> Be glad your system lived to tell about it ;-) Calling printk() from NMI
> context is Russian roulette; I'm still waiting for the first report it
> actually locked up :-)

Haha, right. I dump with earlyprintk but that doesn't change the fact that it passes through the
printk machinery. Fortunately I haven't yet been burdened by that. Although... maybe printk plays
a role in the issue here...

> 
> That said, I'm not sure what kernel you're running, but there were some
> issues with time-keeping hereabouts, but more importantly that second
> timing includes the printk() call of the first -- so that's always going
> to be fucked.

It's a recent tip:master. So the delta debug printout is certainly buggy; meanwhile
these lockups only happen with Vince's selftests, and they trigger a lot of these NMI-too-long
issues, or maybe that's the other way round :)...

I'm trying to narrow down the issue, let's hope the lockup is not actually due to printk
itself.


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-09 14:10             ` Peter Zijlstra
  2013-11-09 14:20               ` Frederic Weisbecker
@ 2013-11-09 14:52               ` Frederic Weisbecker
  2013-11-09 15:13                 ` Peter Zijlstra
  1 sibling, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-09 14:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Sat, Nov 09, 2013 at 03:10:39PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 08, 2013 at 11:36:58PM +0100, Frederic Weisbecker wrote:
> > [  237.627769] perf samples too long (3397569 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> > [  237.637124] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 444.233 msecs
> > 
> > 444 msecs is huge.
> 
> Be glad your system lived to tell about it ;-) Calling printk() from NMI
> context is Russian roulette; I'm still waiting for the first report it
> actually locked up :-)
> 
> That said, I'm not sure what kernel you're running, but there were some
> issues with time-keeping hereabouts, but more importantly that second
> timing includes the printk() call of the first -- so that's always going
> to be fucked.

So, an idea of what may be happening: an event overflows while the FASYNC flag is set, so it triggers an irq work
to send the signal (kill_fasync).
After the irq work triggers, it generates an irq_work_exit event, which in turn overflows and,
if it has FASYNC, triggers a new irq work. That irq work triggers and generates an irq_work_exit event which
has the FASYNC flag, etc...

Looks like a nice way to deadlock with an infinite loop of irq work.


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-08 22:36           ` Frederic Weisbecker
  2013-11-09  1:09             ` Steven Rostedt
  2013-11-09 14:10             ` Peter Zijlstra
@ 2013-11-09 15:11             ` Peter Zijlstra
  2013-11-09 15:22               ` Frederic Weisbecker
  2013-11-14 15:23               ` Peter Zijlstra
  2 siblings, 2 replies; 38+ messages in thread
From: Peter Zijlstra @ 2013-11-09 15:11 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Fri, Nov 08, 2013 at 11:36:58PM +0100, Frederic Weisbecker wrote:
> [  237.359091]  [<ffffffff8101a4d1>] perf_callchain_kernel+0x51/0x70
> [  237.365155]  [<ffffffff8111fec6>] perf_callchain+0x256/0x2c0
> [  237.370783]  [<ffffffff8111bb5b>] perf_prepare_sample+0x27b/0x300
> [  237.376849]  [<ffffffff810bc1ea>] ? __rcu_is_watching+0x1a/0x30
> [  237.382736]  [<ffffffff8111bd2c>] __perf_event_overflow+0x14c/0x310
> [  237.388973]  [<ffffffff8111bcd9>] ? __perf_event_overflow+0xf9/0x310
> [  237.395291]  [<ffffffff8109aa6d>] ? trace_hardirqs_off+0xd/0x10
> [  237.401186]  [<ffffffff815c8753>] ? _raw_spin_unlock_irqrestore+0x53/0x90
> [  237.407941]  [<ffffffff81061b46>] ? do_send_sig_info+0x66/0x90
> [  237.413744]  [<ffffffff8111c0f9>] perf_swevent_overflow+0xa9/0xc0
> [  237.419808]  [<ffffffff8111c16f>] perf_swevent_event+0x5f/0x80
> [  237.425610]  [<ffffffff8111c2b8>] perf_tp_event+0x128/0x420
> [  237.431154]  [<ffffffff81008108>] ? smp_trace_irq_work_interrupt+0x98/0x2a0
> [  237.438085]  [<ffffffff815c83b5>] ? _raw_read_unlock+0x35/0x60
> [  237.443887]  [<ffffffff81003fe7>] perf_trace_x86_irq_vector+0xc7/0xe0
> [  237.450295]  [<ffffffff81008108>] ? smp_trace_irq_work_interrupt+0x98/0x2a0
> [  237.457226]  [<ffffffff81008108>] smp_trace_irq_work_interrupt+0x98/0x2a0
> [  237.463983]  [<ffffffff815cb132>] trace_irq_work_interrupt+0x72/0x80
> [  237.470303]  [<ffffffff815c8fb7>] ? retint_restore_args+0x13/0x13
> [  237.476366]  [<ffffffff815c877a>] ? _raw_spin_unlock_irqrestore+0x7a/0x90
> [  237.483117]  [<ffffffff810c101b>] rcu_process_callbacks+0x1db/0x530
> [  237.489360]  [<ffffffff8105381d>] __do_softirq+0xdd/0x490
> [  237.494728]  [<ffffffff81053fe6>] irq_exit+0x96/0xc0
> [  237.499668]  [<ffffffff815cbc3a>] smp_trace_apic_timer_interrupt+0x5a/0x2b4
> [  237.506596]  [<ffffffff815ca7b2>] trace_apic_timer_interrupt+0x72/0x80

Cute.. so what appears to happen is that:

1) we trace irq_work_exit
2) we generate event
3) event needs to deliver signal
4) we queue irq_work to send signal
5) goto 1

Does something like this solve it?

---
 kernel/events/core.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4dc078d18929..a3ad40f347c4 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5289,6 +5289,16 @@ static void perf_log_throttle(struct perf_event *event, int enable)
 	perf_output_end(&handle);
 }
 
+static inline void perf_pending(struct perf_event *event)
+{
+	if (in_nmi()) {
+		irq_work_pending(&event->pending);
+		return;
+	}
+
+	perf_pending_event(&event->pending);
+}
+
 /*
  * Generic event overflow handling, sampling.
  */
@@ -5345,7 +5355,7 @@ static int __perf_event_overflow(struct perf_event *event,
 		ret = 1;
 		event->pending_kill = POLL_HUP;
 		event->pending_disable = 1;
-		irq_work_queue(&event->pending);
+		perf_pending(event);
 	}
 
 	if (event->overflow_handler)
@@ -5355,7 +5365,7 @@ static int __perf_event_overflow(struct perf_event *event,
 
 	if (event->fasync && event->pending_kill) {
 		event->pending_wakeup = 1;
-		irq_work_queue(&event->pending);
+		perf_pending(event);
 	}
 
 	return ret;


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-09 14:52               ` Frederic Weisbecker
@ 2013-11-09 15:13                 ` Peter Zijlstra
  2013-11-09 15:27                   ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2013-11-09 15:13 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Sat, Nov 09, 2013 at 03:52:59PM +0100, Frederic Weisbecker wrote:
> So, an idea of what may be happening: an event overflows while the FASYNC flag is set, so it triggers an irq work
> to send the signal (kill_fasync).
> After the irq work triggers, it generates an irq_work_exit event, which in turn overflows and,
> if it has FASYNC, triggers a new irq work. That irq work triggers and generates an irq_work_exit event which
> has the FASYNC flag, etc...
> 
> Looks like a nice way to deadlock with an infinite loop of irq work.


Yep, exactly, see the email I just send.


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-09 15:11             ` Peter Zijlstra
@ 2013-11-09 15:22               ` Frederic Weisbecker
  2013-11-09 15:30                 ` Peter Zijlstra
  2013-11-14 15:23               ` Peter Zijlstra
  1 sibling, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-09 15:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Sat, Nov 09, 2013 at 04:11:01PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 08, 2013 at 11:36:58PM +0100, Frederic Weisbecker wrote:
> > [  237.359091]  [<ffffffff8101a4d1>] perf_callchain_kernel+0x51/0x70
> > [  237.365155]  [<ffffffff8111fec6>] perf_callchain+0x256/0x2c0
> > [  237.370783]  [<ffffffff8111bb5b>] perf_prepare_sample+0x27b/0x300
> > [  237.376849]  [<ffffffff810bc1ea>] ? __rcu_is_watching+0x1a/0x30
> > [  237.382736]  [<ffffffff8111bd2c>] __perf_event_overflow+0x14c/0x310
> > [  237.388973]  [<ffffffff8111bcd9>] ? __perf_event_overflow+0xf9/0x310
> > [  237.395291]  [<ffffffff8109aa6d>] ? trace_hardirqs_off+0xd/0x10
> > [  237.401186]  [<ffffffff815c8753>] ? _raw_spin_unlock_irqrestore+0x53/0x90
> > [  237.407941]  [<ffffffff81061b46>] ? do_send_sig_info+0x66/0x90
> > [  237.413744]  [<ffffffff8111c0f9>] perf_swevent_overflow+0xa9/0xc0
> > [  237.419808]  [<ffffffff8111c16f>] perf_swevent_event+0x5f/0x80
> > [  237.425610]  [<ffffffff8111c2b8>] perf_tp_event+0x128/0x420
> > [  237.431154]  [<ffffffff81008108>] ? smp_trace_irq_work_interrupt+0x98/0x2a0
> > [  237.438085]  [<ffffffff815c83b5>] ? _raw_read_unlock+0x35/0x60
> > [  237.443887]  [<ffffffff81003fe7>] perf_trace_x86_irq_vector+0xc7/0xe0
> > [  237.450295]  [<ffffffff81008108>] ? smp_trace_irq_work_interrupt+0x98/0x2a0
> > [  237.457226]  [<ffffffff81008108>] smp_trace_irq_work_interrupt+0x98/0x2a0
> > [  237.463983]  [<ffffffff815cb132>] trace_irq_work_interrupt+0x72/0x80
> > [  237.470303]  [<ffffffff815c8fb7>] ? retint_restore_args+0x13/0x13
> > [  237.476366]  [<ffffffff815c877a>] ? _raw_spin_unlock_irqrestore+0x7a/0x90
> > [  237.483117]  [<ffffffff810c101b>] rcu_process_callbacks+0x1db/0x530
> > [  237.489360]  [<ffffffff8105381d>] __do_softirq+0xdd/0x490
> > [  237.494728]  [<ffffffff81053fe6>] irq_exit+0x96/0xc0
> > [  237.499668]  [<ffffffff815cbc3a>] smp_trace_apic_timer_interrupt+0x5a/0x2b4
> > [  237.506596]  [<ffffffff815ca7b2>] trace_apic_timer_interrupt+0x72/0x80
> 
> Cute.. so what appears to happen is that:
> 
> 1) we trace irq_work_exit
> 2) we generate event
> 3) event needs to deliver signal
> 4) we queue irq_work to send signal
> 5) goto 1
> 
> Does something like this solve it?
> 
> ---
>  kernel/events/core.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4dc078d18929..a3ad40f347c4 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5289,6 +5289,16 @@ static void perf_log_throttle(struct perf_event *event, int enable)
>  	perf_output_end(&handle);
>  }
>  
> +static inline void perf_pending(struct perf_event *event)
> +{
> +	if (in_nmi()) {
> +		irq_work_pending(&event->pending);

I guess you mean irq_work_queue()?

But there are many more reasons than just being in NMI to do wakeups, signal sending, etc. asynchronously...
The fact that an event can happen anywhere (rq lock acquired or whatever) makes perf events fragile
enough to always require irq work for these.

Probably what we need is rather some limit. Maybe we can't seriously apply recursion checks here
but perhaps the simple fact that we raise an irq work from an irq work should trigger an alarm
of some sort.


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-09 15:13                 ` Peter Zijlstra
@ 2013-11-09 15:27                   ` Frederic Weisbecker
  2013-11-09 15:59                     ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-09 15:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Sat, Nov 09, 2013 at 04:13:56PM +0100, Peter Zijlstra wrote:
> On Sat, Nov 09, 2013 at 03:52:59PM +0100, Frederic Weisbecker wrote:
> > So, an idea of what may be happening: an event overflows while the FASYNC flag is set, so it triggers an irq work
> > to send the signal (kill_fasync).
> > After the irq work triggers, it generates an irq_work_exit event, which in turn overflows and,
> > if it has FASYNC, triggers a new irq work. That irq work triggers and generates an irq_work_exit event which
> > has the FASYNC flag, etc...
> > 
> > Looks like a nice way to deadlock with an infinite loop of irq work.
> 
> 
> Yep, exactly, see the email I just send.

In fact, raising an irq work from an irq work should simply be prohibited. That's not a sane
behaviour.

It's natural for async mechanisms that have reasonable delays between each pass to allow re-enqueueing,
like workqueues or RCU callbacks, or timers. But with irq work that doesn't look right, except for
lazy irq works. So let's just not allow it at all :)


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-09 15:22               ` Frederic Weisbecker
@ 2013-11-09 15:30                 ` Peter Zijlstra
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2013-11-09 15:30 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Sat, Nov 09, 2013 at 04:22:57PM +0100, Frederic Weisbecker wrote:
> > ---
> >  kernel/events/core.c | 14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 4dc078d18929..a3ad40f347c4 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -5289,6 +5289,16 @@ static void perf_log_throttle(struct perf_event *event, int enable)
> >  	perf_output_end(&handle);
> >  }
> >  
> > +static inline void perf_pending(struct perf_event *event)
> > +{
> > +	if (in_nmi()) {
> > +		irq_work_pending(&event->pending);
> 
> I guess you mean irq_work_queue()?

Uhm yah

> But there are many more reasons than just being in NMI to do
> wakeups, signal sending, etc. asynchronously...  The fact that an event can happen
> anywhere (rq lock acquired or whatever) makes perf events fragile
> enough to always require irq work for these.

Fair enough :/

> Probably what we need is rather some limit. Maybe we can't seriously
> apply recursion checks here but perhaps the simple fact that we raise
> an irq work from an irq work should trigger an alarm of some sort.

I think irq_work was designed explicitly to allow this -- Oleg had some
usecase for this.

So my initial approach was trying to detect if there was a fasync signal
pending and break out of the loop in that case; but fasync gives me a
bloody headache.

It looks like you cannot even determine the signum you need to test
pending without acquiring locks, let alone find all the tasks it would
raise it against.




* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-09 15:27                   ` Frederic Weisbecker
@ 2013-11-09 15:59                     ` Peter Zijlstra
  2013-11-09 16:08                       ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2013-11-09 15:59 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Sat, Nov 09, 2013 at 04:27:01PM +0100, Frederic Weisbecker wrote:
> In fact, raising an irq work from an irq work should simply be prohibited. That's not a sane
> behaviour.

Well, it is because as you raised on IRC we could be holding locks and
trying to avoid deadlocks. This is the very reason irq_work gets used in
printk.

And it's not a recursive run()->work()->queue() either, because as you said this
tracepoint is in arch code _after_ work_run completes.

All in all an exceedingly vexing issue.


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-09 15:59                     ` Peter Zijlstra
@ 2013-11-09 16:08                       ` Frederic Weisbecker
  0 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-09 16:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Sat, Nov 09, 2013 at 04:59:38PM +0100, Peter Zijlstra wrote:
> On Sat, Nov 09, 2013 at 04:27:01PM +0100, Frederic Weisbecker wrote:
> > In fact, raising an irq work from an irq work should simply be prohibited. That's not a sane
> > behaviour.
> 
> Well, it is because as you raised on IRC we could be holding locks and
> trying to avoid deadlocks. This is the very reason irq_work gets used in
> printk.
> 
> And its not a recursive run()->work()->queue() either, because as you said this
> tracepoint is in arch code _after_ work_run completes.

Yeah, indeed :(

> 
> All in all an exceedingly vexing issue.

Yep, the only sane way to solve this now is to check whether that signal is pending; unfortunately,
as you said, it's not a given that we can even check this locklessly...


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-09 14:20               ` Frederic Weisbecker
@ 2013-11-11 12:44                 ` Ingo Molnar
  2013-11-11 15:53                   ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Ingo Molnar @ 2013-11-11 12:44 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Vince Weaver, Steven Rostedt, LKML, Dave Jones


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> > That said, I'm not sure what kernel you're running, but there were 
> > some issues with time-keeping hereabouts, but more importantly that 
> > second timing includes the printk() call of the first -- so that's 
> > always going to be fucked.
> 
> It's a recent tip:master. So the delta debug printout is certainly 
> buggy; meanwhile these lockups only happen with Vince's selftests, and they 
> trigger a lot of these NMI-too-long issues, or maybe that's the other 
> way round :)...
> 
> I'm trying to narrow down the issue, let's hope the lockup is not 
> actually due to printk itself.

I'd _very_ strongly suggest to not include the printk() overhead in the 
execution time delta! What that function wants to report is pure NMI 
execution overhead, not problem reporting overhead.

That way any large number reported there is always a bug somewhere, 
somehow.

Thanks,

	Ingo


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-11 12:44                 ` Ingo Molnar
@ 2013-11-11 15:53                   ` Peter Zijlstra
  2013-11-11 21:13                     ` Ingo Molnar
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2013-11-11 15:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, Vince Weaver, Steven Rostedt, LKML, Dave Jones

On Mon, Nov 11, 2013 at 01:44:19PM +0100, Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > > That said, I'm not sure what kernel you're running, but there were 
> > > some issues with time-keeping hereabouts, but more importantly that 
> > > second timing includes the printk() call of the first -- so that's 
> > > always going to be fucked.
> > 
> > It's a recent tip:master. So the delta debug printout is certainly 
> > buggy; meanwhile these lockups only happen with Vince's selftests, and they 
> > trigger a lot of these NMI-too-long issues, or maybe that's the other 
> > way round :)...
> > 
> > I'm trying to narrow down the issue, let's hope the lockup is not 
> > actually due to printk itself.
> 
> I'd _very_ strongly suggest to not include the printk() overhead in the 
> execution time delta! What that function wants to report is pure NMI 
> execution overhead, not problem reporting overhead.
> 
> That way any large number reported there is always a bug somewhere, 
> somehow.

-ENOPATCH :-)

You'll find that there's two levels of measuring NMI latency and the
outer will invariably include the reporting of the inner one; fixing
that is going to be hideously ugly.

That said, I would very strongly suggest tearing that printk() out of the
NMI path, it's just waiting to wreck someone's machine :-)


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-11 15:53                   ` Peter Zijlstra
@ 2013-11-11 21:13                     ` Ingo Molnar
  0 siblings, 0 replies; 38+ messages in thread
From: Ingo Molnar @ 2013-11-11 21:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Vince Weaver, Steven Rostedt, LKML, Dave Jones


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Nov 11, 2013 at 01:44:19PM +0100, Ingo Molnar wrote:
> > 
> > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > 
> > > > That said, I'm not sure what kernel you're running, but there were 
> > > > some issues with time-keeping hereabouts, but more importantly that 
> > > > second timing includes the printk() call of the first -- so that's 
> > > > always going to be fucked.
> > > 
> > > It's a recent tip:master. So the delta debug printout is certainly 
> > > buggy; meanwhile these lockups only happen with Vince's selftests, and they 
> > > trigger a lot of these NMI-too-long issues, or maybe that's the other 
> > > way round :)...
> > > 
> > > I'm trying to narrow down the issue, let's hope the lockup is not 
> > > actually due to printk itself.
> > 
> > I'd _very_ strongly suggest to not include the printk() overhead in the 
> > execution time delta! What that function wants to report is pure NMI 
> > execution overhead, not problem reporting overhead.
> > 
> > That way any large number reported there is always a bug somewhere, 
> > somehow.
> 
> -ENOPATCH :-)
> 
> You'll find that there's two levels of measuring NMI latency and the 
> outer will invariably include the reporting of the inner one; fixing 
> that is going to be hideously ugly.
> 
> That said, I would very strongly suggest to tear that printk() from the 
> NMI path, its just waiting to wreck someone's machine :-)

So why not just write the value somewhere and printk once at the end of 
the NMI sequence, once everything is said and done?

Thanks,

	Ingo


* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-09 15:11             ` Peter Zijlstra
  2013-11-09 15:22               ` Frederic Weisbecker
@ 2013-11-14 15:23               ` Peter Zijlstra
  2013-11-14 15:33                 ` Peter Zijlstra
                                   ` (3 more replies)
  1 sibling, 4 replies; 38+ messages in thread
From: Peter Zijlstra @ 2013-11-14 15:23 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Sat, Nov 09, 2013 at 04:10:14PM +0100, Peter Zijlstra wrote:
> Cute.. so what appears to happen is that:
> 
> 1) we trace irq_work_exit
> 2) we generate event
> 3) event needs to deliver signal
> 4) we queue irq_work to send signal
> 5) goto 1

---
 arch/x86/include/asm/trace/irq_vectors.h | 11 +++++++++++
 include/linux/ftrace_event.h             | 16 ++++++++++++++++
 include/linux/tracepoint.h               |  4 ++++
 include/trace/ftrace.h                   |  7 +++++++
 kernel/trace/trace_event_perf.c          |  6 ++++++
 5 files changed, 44 insertions(+)

diff --git a/arch/x86/include/asm/trace/irq_vectors.h b/arch/x86/include/asm/trace/irq_vectors.h
index 2874df24e7a4..4cab890007a7 100644
--- a/arch/x86/include/asm/trace/irq_vectors.h
+++ b/arch/x86/include/asm/trace/irq_vectors.h
@@ -72,6 +72,17 @@ DEFINE_IRQ_VECTOR_EVENT(x86_platform_ipi);
 DEFINE_IRQ_VECTOR_EVENT(irq_work);
 
 /*
+ * We must dis-allow sampling irq_work_exit() because perf event sampling
+ * itself can cause irq_work, which would lead to an infinite loop;
+ *
+ *  1) irq_work_exit happens
+ *  2) generates perf sample
+ *  3) generates irq_work
+ *  4) goto 1
+ */
+TRACE_EVENT_PERF_PERM(irq_work_exit, is_sampling_event(p_event) ? -EPERM : 0);
+
+/*
  * call_function - called when entering/exiting a call function interrupt
  * vector handler
  */
diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 5eaa746735ff..3e4d05566022 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -244,6 +244,9 @@ struct ftrace_event_call {
 #ifdef CONFIG_PERF_EVENTS
 	int				perf_refcount;
 	struct hlist_head __percpu	*perf_events;
+
+	int	(*perf_perm)(struct ftrace_event_call *,
+			     struct perf_event *);
 #endif
 };
 
@@ -306,6 +309,19 @@ struct ftrace_event_file {
 	}								\
 	early_initcall(trace_init_flags_##name);
 
+#define __TRACE_EVENT_PERF_PERM(name, expr)				\
+	static int perf_perm_##name(struct ftrace_event_call *tp_event, \
+				    struct perf_event *p_event)		\
+	{								\
+		return expr;						\
+	}								\
+	static int __init trace_init_flags_##name(void)			\
+	{								\
+		event_##name.perf_perm = &perf_perm_##name;		\
+		return 0;						\
+	}								\
+	early_initcall(trace_init_flags_##name);
+
 #define PERF_MAX_TRACE_SIZE	2048
 
 #define MAX_FILTER_STR_VAL	256	/* Should handle KSYM_SYMBOL_LEN */
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index ebeab360d851..dd5cb1dca207 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -267,6 +267,8 @@ static inline void tracepoint_synchronize_unregister(void)
 
 #define TRACE_EVENT_FLAGS(event, flag)
 
+#define TRACE_EVENT_PERF_PERM(event, expr)
+
 #endif /* DECLARE_TRACE */
 
 #ifndef TRACE_EVENT
@@ -399,4 +401,6 @@ static inline void tracepoint_synchronize_unregister(void)
 
 #define TRACE_EVENT_FLAGS(event, flag)
 
+#define TRACE_EVENT_PERF_PERM(event, expr)
+
 #endif /* ifdef TRACE_EVENT (see note above) */
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 5c7ab17cbb02..7bdda40fc406 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -90,6 +90,10 @@
 #define TRACE_EVENT_FLAGS(name, value)					\
 	__TRACE_EVENT_FLAGS(name, value)
 
+#undef TRACE_EVENT_PERF_PERM
+#define TRACE_EVENT_PERF_PERM(name, expr)				\
+	__TRACE_EVENT_PERF_PERM(name, expr)
+
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
 
@@ -140,6 +144,9 @@
 #undef TRACE_EVENT_FLAGS
 #define TRACE_EVENT_FLAGS(event, flag)
 
+#undef TRACE_EVENT_PERF_PERM
+#define TRACE_EVENT_PERF_PERM(event, expr)
+
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
 /*
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 78e27e3b52ac..630889f68b1d 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -24,6 +24,12 @@ static int	total_ref_count;
 static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
 				 struct perf_event *p_event)
 {
+	if (tp_event->perf_perm) {
+		int ret = tp_event->perf_perm(tp_event, p_event);
+		if (ret)
+			return ret;
+	}
+
 	/* The ftrace function trace is allowed only for root. */
 	if (ftrace_event_is_function(tp_event) &&
 	    perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-14 15:23               ` Peter Zijlstra
@ 2013-11-14 15:33                 ` Peter Zijlstra
  2013-11-14 15:35                   ` Frederic Weisbecker
  2013-11-15  1:16                   ` Masami Hiramatsu
  2013-11-14 16:03                 ` Frederic Weisbecker
                                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 38+ messages in thread
From: Peter Zijlstra @ 2013-11-14 15:33 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones,
	masami.hiramatsu.pt

On Thu, Nov 14, 2013 at 04:23:04PM +0100, Peter Zijlstra wrote:
>  /*
> + * We must dis-allow sampling irq_work_exit() because perf event sampling
> + * itself can cause irq_work, which would lead to an infinite loop;
> + *
> + *  1) irq_work_exit happens
> + *  2) generates perf sample
> + *  3) generates irq_work
> + *  4) goto 1
> + */
> +TRACE_EVENT_PERF_PERM(irq_work_exit, is_sampling_event(p_event) ? -EPERM : 0);

And the only reason this doesn't feed fwd itself into oblivion for
irq_work_enter() is because the irq_work_list must not be empty when the
interrupt is raised, and queueing further work does not re-raise the
IPI.


Also, we should probably do something 'smart' for kprobes, as all of
irq_work.c and plenty of perf itself are not __kprobes marked, so you're
free to insert kprobes in the middle of perf and then attach perf to
such a thing.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-14 15:33                 ` Peter Zijlstra
@ 2013-11-14 15:35                   ` Frederic Weisbecker
  2013-11-15  1:16                   ` Masami Hiramatsu
  1 sibling, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-14 15:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones,
	masami.hiramatsu.pt

On Thu, Nov 14, 2013 at 04:33:01PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 14, 2013 at 04:23:04PM +0100, Peter Zijlstra wrote:
> >  /*
> > + * We must dis-allow sampling irq_work_exit() because perf event sampling
> > + * itself can cause irq_work, which would lead to an infinite loop;
> > + *
> > + *  1) irq_work_exit happens
> > + *  2) generates perf sample
> > + *  3) generates irq_work
> > + *  4) goto 1
> > + */
> > +TRACE_EVENT_PERF_PERM(irq_work_exit, is_sampling_event(p_event) ? -EPERM : 0);
> 
> And the only reason this doesn't feed fwd itself into oblivion for
> irq_work_enter() is because the irq_work_list must not be empty when the
> interrupt is raised, and queueing further work does not re-raise the
> IPI.

Right.

> 
> 
> Also, we should probably do something 'smart' for kprobes, as all of
> irq_work.c and plenty of perf itself is not __kprobe marked so you're
> all free to insert kprobes in the middle of perf and then attach perf to
> such a thing.
> 

True, ok I'm going to comment about that on your patch.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-14 15:23               ` Peter Zijlstra
  2013-11-14 15:33                 ` Peter Zijlstra
@ 2013-11-14 16:03                 ` Frederic Weisbecker
  2013-11-14 17:20                 ` Vince Weaver
  2013-11-19 19:18                 ` [tip:perf/urgent] ftrace, perf: Avoid infinite event generation loop tip-bot for Peter Zijlstra
  3 siblings, 0 replies; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-14 16:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Thu, Nov 14, 2013 at 04:23:04PM +0100, Peter Zijlstra wrote:
> On Sat, Nov 09, 2013 at 04:10:14PM +0100, Peter Zijlstra wrote:
> > Cute.. so what appears to happen is that:
> > 
> > 1) we trace irq_work_exit
> > 2) we generate event
> > 3) event needs to deliver signal
> > 4) we queue irq_work to send signal
> > 5) goto 1
> 
> ---
>  arch/x86/include/asm/trace/irq_vectors.h | 11 +++++++++++
>  include/linux/ftrace_event.h             | 16 ++++++++++++++++
>  include/linux/tracepoint.h               |  4 ++++
>  include/trace/ftrace.h                   |  7 +++++++
>  kernel/trace/trace_event_perf.c          |  6 ++++++
>  5 files changed, 44 insertions(+)
> 
> diff --git a/arch/x86/include/asm/trace/irq_vectors.h b/arch/x86/include/asm/trace/irq_vectors.h
> index 2874df24e7a4..4cab890007a7 100644
> --- a/arch/x86/include/asm/trace/irq_vectors.h
> +++ b/arch/x86/include/asm/trace/irq_vectors.h
> @@ -72,6 +72,17 @@ DEFINE_IRQ_VECTOR_EVENT(x86_platform_ipi);
>  DEFINE_IRQ_VECTOR_EVENT(irq_work);
>  
>  /*
> + * We must dis-allow sampling irq_work_exit() because perf event sampling
> + * itself can cause irq_work, which would lead to an infinite loop;
> + *
> + *  1) irq_work_exit happens
> + *  2) generates perf sample
> + *  3) generates irq_work
> + *  4) goto 1
> + */
> +TRACE_EVENT_PERF_PERM(irq_work_exit, is_sampling_event(p_event) ? -EPERM : 0);
> +
> +/*
>   * call_function - called when entering/exiting a call function interrupt
>   * vector handler
>   */
> diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
> index 5eaa746735ff..3e4d05566022 100644
> --- a/include/linux/ftrace_event.h
> +++ b/include/linux/ftrace_event.h
> @@ -244,6 +244,9 @@ struct ftrace_event_call {
>  #ifdef CONFIG_PERF_EVENTS
>  	int				perf_refcount;
>  	struct hlist_head __percpu	*perf_events;
> +
> +	int	(*perf_perm)(struct ftrace_event_call *,
> +			     struct perf_event *);
>  #endif
>  };
>  
> @@ -306,6 +309,19 @@ struct ftrace_event_file {
>  	}								\
>  	early_initcall(trace_init_flags_##name);
>  
> +#define __TRACE_EVENT_PERF_PERM(name, expr)				\
> +	static int perf_perm_##name(struct ftrace_event_call *tp_event, \
> +				    struct perf_event *p_event)		\
> +	{								\
> +		return expr;						\
> +	}								\
> +	static int __init trace_init_flags_##name(void)			\
> +	{								\
> +		event_##name.perf_perm = &perf_perm_##name;		\
> +		return 0;						\
> +	}								\
> +	early_initcall(trace_init_flags_##name);
> +
>  #define PERF_MAX_TRACE_SIZE	2048
>  
>  #define MAX_FILTER_STR_VAL	256	/* Should handle KSYM_SYMBOL_LEN */
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index ebeab360d851..dd5cb1dca207 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -267,6 +267,8 @@ static inline void tracepoint_synchronize_unregister(void)
>  
>  #define TRACE_EVENT_FLAGS(event, flag)
>  
> +#define TRACE_EVENT_PERF_PERM(event, expr)
> +
>  #endif /* DECLARE_TRACE */
>  
>  #ifndef TRACE_EVENT
> @@ -399,4 +401,6 @@ static inline void tracepoint_synchronize_unregister(void)
>  
>  #define TRACE_EVENT_FLAGS(event, flag)
>  
> +#define TRACE_EVENT_PERF_PERM(event, expr)
> +
>  #endif /* ifdef TRACE_EVENT (see note above) */
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index 5c7ab17cbb02..7bdda40fc406 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -90,6 +90,10 @@
>  #define TRACE_EVENT_FLAGS(name, value)					\
>  	__TRACE_EVENT_FLAGS(name, value)
>  
> +#undef TRACE_EVENT_PERF_PERM
> +#define TRACE_EVENT_PERF_PERM(name, expr)				\
> +	__TRACE_EVENT_PERF_PERM(name, expr)
> +
>  #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
>  
>  
> @@ -140,6 +144,9 @@
>  #undef TRACE_EVENT_FLAGS
>  #define TRACE_EVENT_FLAGS(event, flag)
>  
> +#undef TRACE_EVENT_PERF_PERM
> +#define TRACE_EVENT_PERF_PERM(event, expr)
> +
>  #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
>  
>  /*
> diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
> index 78e27e3b52ac..630889f68b1d 100644
> --- a/kernel/trace/trace_event_perf.c
> +++ b/kernel/trace/trace_event_perf.c
> @@ -24,6 +24,12 @@ static int	total_ref_count;
>  static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
>  				 struct perf_event *p_event)
>  {
> +	if (tp_event->perf_perm) {
> +		int ret = tp_event->perf_perm(tp_event, p_event);
> +		if (ret)
> +			return ret;
> +	}
> +
>  	/* The ftrace function trace is allowed only for root. */
>  	if (ftrace_event_is_function(tp_event) &&
>  	    perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))

Looks good, I can't think of a more general solution.

Now thinking about kprobes, which can be used to reproduce this same kind of
scenario when put in the right place, perhaps we need to do something in
perf_swevent_overflow() that rejects events once they cross some threshold,
arbitrarily defined above what looks like a storm that could trigger a lockup.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-14 17:20                 ` Vince Weaver
@ 2013-11-14 17:14                   ` Peter Zijlstra
  2013-11-14 17:41                     ` Steven Rostedt
  2013-11-14 19:18                     ` Vince Weaver
  0 siblings, 2 replies; 38+ messages in thread
From: Peter Zijlstra @ 2013-11-14 17:14 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Frederic Weisbecker, Steven Rostedt, LKML, Ingo Molnar, Dave Jones

On Thu, Nov 14, 2013 at 12:20:35PM -0500, Vince Weaver wrote:
> 
> possibly not an issue with your patch, but found in nearby code:
> 
> in perf_trace_init:
> 
>         int event_id = p_event->attr.config;
> 
> 	...
> 
>                 if (tp_event->event.type == event_id &&
> 
> 
> note that "attr.config" is 64-bit but "event_id" is only 32-bit,
> so things like 0xffffffff00000018 will map to irq_work_exit.

I will have a look.

> I'm not sure if your PERM code properly handles this case, I don't have 
> time to work through the macro-magic in your patch.

Can't blame you; took me a fair chunk of the day to come up with this.
That trace event stuff is horrid.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-14 15:23               ` Peter Zijlstra
  2013-11-14 15:33                 ` Peter Zijlstra
  2013-11-14 16:03                 ` Frederic Weisbecker
@ 2013-11-14 17:20                 ` Vince Weaver
  2013-11-14 17:14                   ` Peter Zijlstra
  2013-11-19 19:18                 ` [tip:perf/urgent] ftrace, perf: Avoid infinite event generation loop tip-bot for Peter Zijlstra
  3 siblings, 1 reply; 38+ messages in thread
From: Vince Weaver @ 2013-11-14 17:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Vince Weaver, Steven Rostedt, LKML,
	Ingo Molnar, Dave Jones


possibly not an issue with your patch, but found in nearby code:

in perf_trace_init:

        int event_id = p_event->attr.config;

	...

                if (tp_event->event.type == event_id &&


note that "attr.config" is 64-bit but "event_id" is only 32-bit,
so things like 0xffffffff00000018 will map to irq_work_exit.

I'm not sure if your PERM code properly handles this case, I don't have 
time to work through the macro-magic in your patch.

Vince



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-14 17:14                   ` Peter Zijlstra
@ 2013-11-14 17:41                     ` Steven Rostedt
  2013-11-14 19:18                     ` Vince Weaver
  1 sibling, 0 replies; 38+ messages in thread
From: Steven Rostedt @ 2013-11-14 17:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, Frederic Weisbecker, LKML, Ingo Molnar, Dave Jones

On Thu, 14 Nov 2013 18:14:05 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

 
> Can't blame you; took me a fair chunk of the day to come up with this.
> That trace event stuff is horrid.

I feel honored ;-)

-- Steve

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-14 17:14                   ` Peter Zijlstra
  2013-11-14 17:41                     ` Steven Rostedt
@ 2013-11-14 19:18                     ` Vince Weaver
  1 sibling, 0 replies; 38+ messages in thread
From: Vince Weaver @ 2013-11-14 19:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, Frederic Weisbecker, Steven Rostedt, LKML,
	Ingo Molnar, Dave Jones

On Thu, 14 Nov 2013, Peter Zijlstra wrote:

> On Thu, Nov 14, 2013 at 12:20:35PM -0500, Vince Weaver wrote:
> > 
> > possibly not an issue with your patch, but found in nearby code:
> > 
> > in perf_trace_init:
> > 
> >         int event_id = p_event->attr.config;
> > 
> > 	...
> > 
> >                 if (tp_event->event.type == event_id &&
> > 
> > 
> > note that "attr.config" is 64-bit but "event_id" is only 32-bit,
> > so things like 0xffffffff00000018 will map to irq_work_exit.
>
> I will have a look.

Assuming this is a real issue, I'll submit a patch to fix this if I'm not
beaten to it by someone else.

> > I'm not sure if your PERM code properly handles this case, I don't have 
> > time to work through the macro-magic in your patch.
> 
> Can't blame you; took me a fair chunk of the day to come up with this.
> That trace event stuff is horrid.

I did some tests both with the high bits set and unset and wasn't able 
to trigger the bug.  Also ran the fuzzer for a while, and while it keeps 
finding other bugs, it hasn't tripped over any that were overtly 
tracepoint-related. So you can add my

Tested-by: Vince Weaver <vincent.weaver@maine.edu>

Vince

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-14 15:33                 ` Peter Zijlstra
  2013-11-14 15:35                   ` Frederic Weisbecker
@ 2013-11-15  1:16                   ` Masami Hiramatsu
  2013-11-15 12:28                     ` Peter Zijlstra
  1 sibling, 1 reply; 38+ messages in thread
From: Masami Hiramatsu @ 2013-11-15  1:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Vince Weaver, Steven Rostedt, LKML,
	Ingo Molnar, Dave Jones

(2013/11/15 0:33), Peter Zijlstra wrote:
> On Thu, Nov 14, 2013 at 04:23:04PM +0100, Peter Zijlstra wrote:
>>  /*
>> + * We must dis-allow sampling irq_work_exit() because perf event sampling
>> + * itself can cause irq_work, which would lead to an infinite loop;
>> + *
>> + *  1) irq_work_exit happens
>> + *  2) generates perf sample
>> + *  3) generates irq_work
>> + *  4) goto 1
>> + */
>> +TRACE_EVENT_PERF_PERM(irq_work_exit, is_sampling_event(p_event) ? -EPERM : 0);
> 
> And the only reason this doesn't feed fwd itself into oblivion for
> irq_work_enter() is because the irq_work_list must not be empty when the
> interrupt is raised, and queueing further work does not re-raise the
> IPI.
> 
> 
> Also, we should probably do something 'smart' for kprobes, as all of
> irq_work.c and plenty of perf itself is not __kprobe marked so you're
> all free to insert kprobes in the middle of perf and then attach perf to
> such a thing.

Kprobes itself can detect nested calls by using a per-cpu current-running
kprobe pointer, and if a call is nested, it just skips calling the handlers.
Anyway, I don't recommend probing inside the handlers, but yes,
you can trace the perf handlers with ftrace B). I actually traced a kprobes
bug with the kprobe tracer last night; that was amazing :)

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-15  1:16                   ` Masami Hiramatsu
@ 2013-11-15 12:28                     ` Peter Zijlstra
  2013-11-15 14:15                       ` Steven Rostedt
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2013-11-15 12:28 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Frederic Weisbecker, Vince Weaver, Steven Rostedt, LKML,
	Ingo Molnar, Dave Jones

On Fri, Nov 15, 2013 at 10:16:18AM +0900, Masami Hiramatsu wrote:
> Kprobes itself can detect nested call by using per-cpu current-running
> kprobe pointer. And if it is nested, it just skips calling handlers.
> Anyway, I don't recommend to probe inside the handlers, but yes,
> you can trace perf-handler by ftrace B). I actually traced a kprobe-bug
> by kprobe-tracer last night, that was amazing :)

Ah, ok, so that would avoid the worst problems. Good. Should we still
mark the entire perf swevent path as __kprobes just to be sure?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-15 12:28                     ` Peter Zijlstra
@ 2013-11-15 14:15                       ` Steven Rostedt
  2013-11-15 14:28                         ` Frederic Weisbecker
  0 siblings, 1 reply; 38+ messages in thread
From: Steven Rostedt @ 2013-11-15 14:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Masami Hiramatsu, Frederic Weisbecker, Vince Weaver, LKML,
	Ingo Molnar, Dave Jones

On Fri, 15 Nov 2013 13:28:33 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Nov 15, 2013 at 10:16:18AM +0900, Masami Hiramatsu wrote:
> > Kprobes itself can detect nested call by using per-cpu current-running
> > kprobe pointer. And if it is nested, it just skips calling handlers.
> > Anyway, I don't recommend to probe inside the handlers, but yes,
> > you can trace perf-handler by ftrace B). I actually traced a kprobe-bug
> > by kprobe-tracer last night, that was amazing :)
> 
> Ah, ok, so that would avoid the worst problems. Good. Should we still
> mark the entire perf swevent path as __kprobes just to be sure?

I wouldn't unless we can prove that it breaks. It's sometimes nice to
be able to debug the debugging facilities with the debugging
facilities ;-)

-- Steve

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-15 14:15                       ` Steven Rostedt
@ 2013-11-15 14:28                         ` Frederic Weisbecker
  2013-11-17  7:53                           ` Masami Hiramatsu
  0 siblings, 1 reply; 38+ messages in thread
From: Frederic Weisbecker @ 2013-11-15 14:28 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Masami Hiramatsu, Vince Weaver, LKML,
	Ingo Molnar, Dave Jones

On Fri, Nov 15, 2013 at 09:15:21AM -0500, Steven Rostedt wrote:
> On Fri, 15 Nov 2013 13:28:33 +0100
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Fri, Nov 15, 2013 at 10:16:18AM +0900, Masami Hiramatsu wrote:
> > > Kprobes itself can detect nested call by using per-cpu current-running
> > > kprobe pointer. And if it is nested, it just skips calling handlers.
> > > Anyway, I don't recommend to probe inside the handlers, but yes,
> > > you can trace perf-handler by ftrace B). I actually traced a kprobe-bug
> > > by kprobe-tracer last night, that was amazing :)
> > 
> > Ah, ok, so that would avoid the worst problems. Good. Should we still
> > mark the entire perf swevent path as __kprobes just to be sure?
> 
> I wouldn't unless we can prove that it breaks. It's sometimes nice to
> be able to debug the debugging facilities with the debugging
> facilities ;-)

Even with the existing __kprobes annotations, I'm sure we can find many ways to
break the kernel.

We can reproduce that bug with irq_work recursion by setting a kprobe at
the end of the arch low-level irq_work() handler, for example, or simply
somewhere in irq_exit().

I think we could do dangerous things with breakpoints as well, such as setting
breakpoints in do_debug() or the like.

So keeping up with __kprobes annotations for every potentially dangerous site
is not viable IMHO. It's important to maintain basic sanity by tagging the sites
that are used by kprobes itself, but we can't really prevent every issue.

At some point it's up to the user to know what he's doing and avoid recursion,
as long as kprobes can be set only by root.

OTOH it would be nice to detect this kind of behaviour (thinking about irq_work,
for example) and warn when something wrong is suspected.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-15 14:28                         ` Frederic Weisbecker
@ 2013-11-17  7:53                           ` Masami Hiramatsu
  2013-11-17  9:43                             ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Masami Hiramatsu @ 2013-11-17  7:53 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Steven Rostedt, Peter Zijlstra, Vince Weaver, LKML, Ingo Molnar,
	Dave Jones

(2013/11/15 23:28), Frederic Weisbecker wrote:
> On Fri, Nov 15, 2013 at 09:15:21AM -0500, Steven Rostedt wrote:
>> On Fri, 15 Nov 2013 13:28:33 +0100
>> Peter Zijlstra <peterz@infradead.org> wrote:
>>
>>> On Fri, Nov 15, 2013 at 10:16:18AM +0900, Masami Hiramatsu wrote:
>>>> Kprobes itself can detect nested call by using per-cpu current-running
>>>> kprobe pointer. And if it is nested, it just skips calling handlers.
>>>> Anyway, I don't recommend to probe inside the handlers, but yes,
>>>> you can trace perf-handler by ftrace B). I actually traced a kprobe-bug
>>>> by kprobe-tracer last night, that was amazing :)
>>>
>>> Ah, ok, so that would avoid the worst problems. Good. Should we still
>>> mark the entire perf swevent path as __kprobes just to be sure?
>>
>> I wouldn't unless we can prove that it breaks. It's sometimes nice to
>> be able to debug the debugging facilities with the debugging
>> facilities ;-)
> 
> Even with the existing __kprobes annotations, I'm sure we can find many ways to
> break the kernel.
> 
> We can reproduce that bug with irq_work recursion with setting a kprobe in
> the end of the irq_work() arch low level handler for example. Or simply
> somewhere in irq_exit().
>
> I think we could do dangerous things with breakpoints as well. Setting breakpoints
> in do_debug() or stuffs like that.
> 
> So keeping up with __kprobes annotations to every potential dangerous site
> is not viable IMHO. It's important to maintain basic sanity with tagging sites
> that are used by kprobes itself but we can't really prevent from any issue.
> 
> At some point it's up to the user to know what he's doing and avoid recursion.
> As long as kprobes can be set only by root.

Hmm, it would be better to add some documentation on how users can avoid
such things.

> OTOH it would be nice to detect these kind of behaviour (thinking about irq_work for
> example) and warn when something wrong is suspected.

Agreed.
FYI, kprobes has a recursion detection counter and it is reported via
debugfs/tracing/kprobe_profile :)

Thank you,

-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: perf/tracepoint: another fuzzer generated lockup
  2013-11-17  7:53                           ` Masami Hiramatsu
@ 2013-11-17  9:43                             ` Peter Zijlstra
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2013-11-17  9:43 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Frederic Weisbecker, Steven Rostedt, Vince Weaver, LKML,
	Ingo Molnar, Dave Jones

On Sun, Nov 17, 2013 at 04:53:59PM +0900, Masami Hiramatsu wrote:
> Agreed.
> FYI, kprobes has a recursion detection counter and it is reported via
> debugfs/tracing/kprobe_profile :)

Perf also has a recursion counter (4 actually, not reported to
userspace), but the problem with the irq_work_exit thing is that it's a
non-recursive event loop, so recursion checks don't actually work to
catch it.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [tip:perf/urgent] ftrace, perf: Avoid infinite event generation loop
  2013-11-14 15:23               ` Peter Zijlstra
                                   ` (2 preceding siblings ...)
  2013-11-14 17:20                 ` Vince Weaver
@ 2013-11-19 19:18                 ` tip-bot for Peter Zijlstra
  3 siblings, 0 replies; 38+ messages in thread
From: tip-bot for Peter Zijlstra @ 2013-11-19 19:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, peterz, davej, vincent.weaver,
	fweisbec, rostedt, tglx

Commit-ID:  d5b5f391d434c5cc8bcb1ab2d759738797b85f52
Gitweb:     http://git.kernel.org/tip/d5b5f391d434c5cc8bcb1ab2d759738797b85f52
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Thu, 14 Nov 2013 16:23:04 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 19 Nov 2013 16:57:40 +0100

ftrace, perf: Avoid infinite event generation loop

Vince's perf-trinity fuzzer found yet another 'interesting' problem.

When we sample the irq_work_exit tracepoint with period==1 (or
PERF_SAMPLE_PERIOD) and we add an fasync SIGNAL handler we create an
infinite event generation loop:

  ,-> <IPI>
  |     irq_work_exit() ->
  |       trace_irq_work_exit() ->
  |         ...
  |           __perf_event_overflow() -> (due to fasync)
  |             irq_work_queue() -> (irq_work_list must be empty)
  '---------      arch_irq_work_raise()

Similar things can happen due to regular poll() wakeups if we exceed
the ring-buffer wakeup watermark, or have an event_limit.

To avoid this, dis-allow sampling this particular tracepoint.

In order to achieve this, create a special perf_perm function pointer
for each event and call this (when set) on trying to create a
tracepoint perf event.

[ roasted: use expr... to allow for ',' in your expression ]

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131114152304.GC5364@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/trace/irq_vectors.h | 11 +++++++++++
 include/linux/ftrace_event.h             | 16 ++++++++++++++++
 include/linux/tracepoint.h               |  4 ++++
 include/trace/ftrace.h                   |  7 +++++++
 kernel/trace/trace_event_perf.c          |  6 ++++++
 5 files changed, 44 insertions(+)

diff --git a/arch/x86/include/asm/trace/irq_vectors.h b/arch/x86/include/asm/trace/irq_vectors.h
index 2874df2..4cab890 100644
--- a/arch/x86/include/asm/trace/irq_vectors.h
+++ b/arch/x86/include/asm/trace/irq_vectors.h
@@ -72,6 +72,17 @@ DEFINE_IRQ_VECTOR_EVENT(x86_platform_ipi);
 DEFINE_IRQ_VECTOR_EVENT(irq_work);
 
 /*
+ * We must dis-allow sampling irq_work_exit() because perf event sampling
+ * itself can cause irq_work, which would lead to an infinite loop;
+ *
+ *  1) irq_work_exit happens
+ *  2) generates perf sample
+ *  3) generates irq_work
+ *  4) goto 1
+ */
+TRACE_EVENT_PERF_PERM(irq_work_exit, is_sampling_event(p_event) ? -EPERM : 0);
+
+/*
  * call_function - called when entering/exiting a call function interrupt
  * vector handler
  */
diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 9abbe63..8c9b7a1 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -248,6 +248,9 @@ struct ftrace_event_call {
 #ifdef CONFIG_PERF_EVENTS
 	int				perf_refcount;
 	struct hlist_head __percpu	*perf_events;
+
+	int	(*perf_perm)(struct ftrace_event_call *,
+			     struct perf_event *);
 #endif
 };
 
@@ -317,6 +320,19 @@ struct ftrace_event_file {
 	}								\
 	early_initcall(trace_init_flags_##name);
 
+#define __TRACE_EVENT_PERF_PERM(name, expr...)				\
+	static int perf_perm_##name(struct ftrace_event_call *tp_event, \
+				    struct perf_event *p_event)		\
+	{								\
+		return ({ expr; });					\
+	}								\
+	static int __init trace_init_perf_perm_##name(void)		\
+	{								\
+		event_##name.perf_perm = &perf_perm_##name;		\
+		return 0;						\
+	}								\
+	early_initcall(trace_init_perf_perm_##name);
+
 #define PERF_MAX_TRACE_SIZE	2048
 
 #define MAX_FILTER_STR_VAL	256	/* Should handle KSYM_SYMBOL_LEN */
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index ebeab36..f16dc0a 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -267,6 +267,8 @@ static inline void tracepoint_synchronize_unregister(void)
 
 #define TRACE_EVENT_FLAGS(event, flag)
 
+#define TRACE_EVENT_PERF_PERM(event, expr...)
+
 #endif /* DECLARE_TRACE */
 
 #ifndef TRACE_EVENT
@@ -399,4 +401,6 @@ static inline void tracepoint_synchronize_unregister(void)
 
 #define TRACE_EVENT_FLAGS(event, flag)
 
+#define TRACE_EVENT_PERF_PERM(event, expr...)
+
 #endif /* ifdef TRACE_EVENT (see note above) */
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 52594b2..6b852f6 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -90,6 +90,10 @@
 #define TRACE_EVENT_FLAGS(name, value)					\
 	__TRACE_EVENT_FLAGS(name, value)
 
+#undef TRACE_EVENT_PERF_PERM
+#define TRACE_EVENT_PERF_PERM(name, expr...)				\
+	__TRACE_EVENT_PERF_PERM(name, expr)
+
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
 
@@ -140,6 +144,9 @@
 #undef TRACE_EVENT_FLAGS
 #define TRACE_EVENT_FLAGS(event, flag)
 
+#undef TRACE_EVENT_PERF_PERM
+#define TRACE_EVENT_PERF_PERM(event, expr...)
+
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
 /*
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 78e27e3..630889f 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -24,6 +24,12 @@ static int	total_ref_count;
 static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
 				 struct perf_event *p_event)
 {
+	if (tp_event->perf_perm) {
+		int ret = tp_event->perf_perm(tp_event, p_event);
+		if (ret)
+			return ret;
+	}
+
 	/* The ftrace function trace is allowed only for root. */
 	if (ftrace_event_is_function(tp_event) &&
 	    perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))



Thread overview: 38+ messages
2013-11-08 19:45 perf/tracepoint: another fuzzer generated lockup Vince Weaver
2013-11-08 20:06 ` Vince Weaver
2013-11-08 20:02   ` Frederic Weisbecker
2013-11-08 20:23     ` Vince Weaver
2013-11-08 20:48       ` Frederic Weisbecker
2013-11-08 21:15         ` Vince Weaver
2013-11-08 22:24           ` Frederic Weisbecker
2013-11-08 22:36           ` Frederic Weisbecker
2013-11-09  1:09             ` Steven Rostedt
2013-11-09 14:10             ` Peter Zijlstra
2013-11-09 14:20               ` Frederic Weisbecker
2013-11-11 12:44                 ` Ingo Molnar
2013-11-11 15:53                   ` Peter Zijlstra
2013-11-11 21:13                     ` Ingo Molnar
2013-11-09 14:52               ` Frederic Weisbecker
2013-11-09 15:13                 ` Peter Zijlstra
2013-11-09 15:27                   ` Frederic Weisbecker
2013-11-09 15:59                     ` Peter Zijlstra
2013-11-09 16:08                       ` Frederic Weisbecker
2013-11-09 15:11             ` Peter Zijlstra
2013-11-09 15:22               ` Frederic Weisbecker
2013-11-09 15:30                 ` Peter Zijlstra
2013-11-14 15:23               ` Peter Zijlstra
2013-11-14 15:33                 ` Peter Zijlstra
2013-11-14 15:35                   ` Frederic Weisbecker
2013-11-15  1:16                   ` Masami Hiramatsu
2013-11-15 12:28                     ` Peter Zijlstra
2013-11-15 14:15                       ` Steven Rostedt
2013-11-15 14:28                         ` Frederic Weisbecker
2013-11-17  7:53                           ` Masami Hiramatsu
2013-11-17  9:43                             ` Peter Zijlstra
2013-11-14 16:03                 ` Frederic Weisbecker
2013-11-14 17:20                 ` Vince Weaver
2013-11-14 17:14                   ` Peter Zijlstra
2013-11-14 17:41                     ` Steven Rostedt
2013-11-14 19:18                     ` Vince Weaver
2013-11-19 19:18                 ` [tip:perf/urgent] ftrace, perf: Avoid infinite event generation loop tip-bot for Peter Zijlstra
2013-11-09  0:25           ` perf/tracepoint: another fuzzer generated lockup Frederic Weisbecker
