linux-kernel.vger.kernel.org archive mirror
* [BUG] Core2 cpu triggers hard lockup with perf test
@ 2016-02-27 12:37 Jiri Olsa
  2016-02-27 14:48 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Jiri Olsa @ 2016-02-27 12:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Andi Kleen, Stephane Eranian, Wang Nan, zheng.z.yan, Kan Liang
  Cc: LKML

hi,
we are getting hard lockups on Core2 cpus (model 23)
just by running 'perf test'

PID: 10425  TASK: ffff880068562e00  CPU: 3   COMMAND: "perf"
 #0 [ffff88007d985a08] machine_kexec at ffffffff8105521b
 #1 [ffff88007d985a68] crash_kexec at ffffffff810f7412
 #2 [ffff88007d985b38] panic at ffffffff8163c031
 #3 [ffff88007d985bb8] watchdog_overflow_callback at ffffffff81120472
 #4 [ffff88007d985bc8] __perf_event_overflow at ffffffff81164e0e
 #5 [ffff88007d985c00] perf_event_overflow at ffffffff81165a44
 #6 [ffff88007d985c10] intel_pmu_handle_irq at ffffffff81033198
 #7 [ffff88007d985e60] perf_event_nmi_handler at ffffffff8164be8b
 #8 [ffff88007d985e80] nmi_handle at ffffffff8164b5d9
 #9 [ffff88007d985ec8] do_nmi at ffffffff8164b789
#10 [ffff88007d985ef0] end_repeat_nmi at ffffffff8164aa13
    [exception RIP: intel_pmu_enable_all+17]
    RIP: ffffffff81032301  RSP: ffff88005e917c98  RFLAGS: 00000046
    RAX: ffff88007d98cd20  RBX: ffff88005e991000  RCX: 000000000000038f
    RDX: 0000000000000007  RSI: 0000000000000003  RDI: 0000000000000000
    RBP: ffff88005e917cd8   R8: ffffffffffffff85   R9: 000000ffffffffff
    R10: ffff88007d98c100  R11: ffff88005e9179e0  R12: ffff88007d98bd10
    R13: ffff88007d98b9e0  R14: ffff88007d98bc08  R15: 0000000000000002
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
#11 [ffff88005e917c98] intel_pmu_enable_all at ffffffff81032301
#12 [ffff88005e917c98] x86_pmu_enable at ffffffff8102ba24
#13 [ffff88005e917ce0] perf_pmu_enable at ffffffff81160457
#14 [ffff88005e917cf0] perf_event_context_sched_in at ffffffff81161930
#15 [ffff88005e917d20] perf_event_exec at ffffffff811621db
#16 [ffff88005e917d68] setup_new_exec at ffffffff811edffd
#17 [ffff88005e917d88] load_elf_binary at ffffffff81240ed9
#18 [ffff88005e917e58] search_binary_handler at ffffffff811ec89d
#19 [ffff88005e917ea0] do_execve_common at ffffffff811ede04
#20 [ffff88005e917f30] sys_execve at ffffffff811ee199
#21 [ffff88005e917f50] stub_execve at ffffffff816531a9

the reproducer seems to be a hw event with a very small
period, like (thanks Arnaldo ;-):
  perf record -e cycles -c 123 kill

I bisected it down to the:
  156174999dd1 perf/intel/x86: Enlarge the PEBS buffer

Looks like the bigger PEBS buffer, together with the event being
marked as PERF_X86_EVENT_FREERUNNING, blocks the CPU right after
the event is enabled, before it can reach local_irq_enable, and
that triggers the NMI watchdog.

I can't find what's special about Core2 CPU PEBS setup,
it seems that other CPUs are ok (tried on ivb/snb/hsw).

reverting the 156174999dd1 fixed the issue for me

ideas? thanks,
jirka


* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-02-27 12:37 [BUG] Core2 cpu triggers hard lockup with perf test Jiri Olsa
@ 2016-02-27 14:48 ` Peter Zijlstra
  2016-02-27 15:46 ` Andi Kleen
  2016-02-29 22:12 ` Liang, Kan
  2 siblings, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2016-02-27 14:48 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Andi Kleen,
	Stephane Eranian, Wang Nan, zheng.z.yan, Kan Liang, LKML

On Sat, Feb 27, 2016 at 01:37:01PM +0100, Jiri Olsa wrote:
> we are getting hard lockups on Core2 cpus (model 23)

> I can't find what's special about Core2 CPU PEBS setup,
> it seems that other CPUs are ok (tried on ivb/snb/hsw).
> 
> reverting the 156174999dd1 fixed the issue for me
> 
> ideas? thanks,

The obvious difference between Core2 and later chips is that Core2 only
has PEBS on a single counter (cnt0).

I'll try and have a closer look on Monday, I should still have a Core2
class machine around the house somewhere.


* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-02-27 12:37 [BUG] Core2 cpu triggers hard lockup with perf test Jiri Olsa
  2016-02-27 14:48 ` Peter Zijlstra
@ 2016-02-27 15:46 ` Andi Kleen
  2016-02-29 22:12 ` Liang, Kan
  2 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2016-02-27 15:46 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Andi Kleen, Stephane Eranian, Wang Nan, zheng.z.yan, Kan Liang,
	LKML

> I can't find what's special about Core2 CPU PEBS setup,
> it seems that other CPUs are ok (tried on ivb/snb/hsw).
> 
> reverting the 156174999dd1 fixed the issue for me

Ok, multi-record PEBS was never tested on Core 2. I suppose we
can enable it only on Nehalem+

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.


* RE: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-02-27 12:37 [BUG] Core2 cpu triggers hard lockup with perf test Jiri Olsa
  2016-02-27 14:48 ` Peter Zijlstra
  2016-02-27 15:46 ` Andi Kleen
@ 2016-02-29 22:12 ` Liang, Kan
  2016-03-01  6:55   ` Jiri Olsa
  2016-03-01  9:17   ` Peter Zijlstra
  2 siblings, 2 replies; 18+ messages in thread
From: Liang, Kan @ 2016-02-29 22:12 UTC (permalink / raw)
  To: Jiri Olsa, Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Andi Kleen, Stephane Eranian, Wang Nan, zheng.z.yan
  Cc: LKML



> 
> I can't find what's special about Core2 CPU PEBS setup, it seems that other
> CPUs are ok (tried on ivb/snb/hsw).
> 
> reverting the 156174999dd1 fixed the issue for me
> 
> ideas? thanks,

I think we could just disable the multiple PEBS support for Core2,
as in the patch below.

SDM "18.4.4.4 Re-configuring PEBS Facilities" mentions that
a quiescent period is needed between stopping the prior event counting and
setting up a new PEBS event when software needs to reconfigure PEBS facilities.
The quiescent period is to allow any latent residual PEBS records to complete
their capture at their previously specified buffer address.
That requirement can only be found for the Core microarchitecture.

I think it may imply that there is some observed delay in writing the PEBS buffer.
So if perf records a precise hw event with a very small period, the slow PEBS writes
may lock up the CPU. If so, I think disabling the multiple PEBS support should be a
good way to go.


---
 arch/x86/events/intel/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 8fddb02..a56230f 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2577,8 +2577,8 @@ static int intel_pmu_hw_config(struct perf_event *event)
 	if (event->attr.precise_ip) {
 		if (!event->attr.freq) {
 			event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD;
-			if (!(event->attr.sample_type &
-			      ~intel_pmu_free_running_flags(event)))
+			if ((x86_pmu.intel_cap.pebs_format > 0) &&
+			    !(event->attr.sample_type & ~intel_pmu_free_running_flags(event)))
 				event->hw.flags |= PERF_X86_EVENT_FREERUNNING;
 		}
 		if (x86_pmu.pebs_aliases)
-- 
2.5.0

Thanks,
Kan


* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-02-29 22:12 ` Liang, Kan
@ 2016-03-01  6:55   ` Jiri Olsa
  2016-03-01  9:17   ` Peter Zijlstra
  1 sibling, 0 replies; 18+ messages in thread
From: Jiri Olsa @ 2016-03-01  6:55 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, Peter Zijlstra,
	Andi Kleen, Stephane Eranian, Wang Nan, zheng.z.yan, LKML

On Mon, Feb 29, 2016 at 10:12:08PM +0000, Liang, Kan wrote:
> 
> 
> > 
> > I can't find what's special about Core2 CPU PEBS setup, it seems that other
> > CPUs are ok (tried on ivb/snb/hsw).
> > 
> > reverting the 156174999dd1 fixed the issue for me
> > 
> > ideas? thanks,
> 
> I think we could just disable the multiple PEBS support for Core2,
> as in the patch below.
> 
> SDM "18.4.4.4 Re-configuring PEBS Facilities" mentions that
> a quiescent period is needed between stopping the prior event counting and
> setting up a new PEBS event when software needs to reconfigure PEBS facilities.
> The quiescent period is to allow any latent residual PEBS records to complete
> their capture at their previously specified buffer address.
> That requirement can only be found for the Core microarchitecture.
> 
> I think it may imply that there is some observed delay in writing the PEBS buffer.
> So if perf records a precise hw event with a very small period, the slow PEBS writes
> may lock up the CPU. If so, I think disabling the multiple PEBS support should be a
> good way to go.
> 
> 

hi,
got the same lockup with the patch:


[  167.486514] Kernel panic - not syncing: Hard LOCKUP
[  167.486514] CPU: 3 PID: 10656 Comm: perf Not tainted 4.5.0-rc4+ #7
[  167.486514] Hardware name: System Manufacturer To Be Filled By O.E.M. Product Name To Be Filled By O.E.M./BB Name To be filled by O.E.M., BIOS CGELIA55.86
[  167.486514]  0000000000000086 0000000084986595 ffff88007d985b28 ffffffff8133983f
[  167.486514]  ffffffff8191b723 0000000000000000 ffff88007d985ba8 ffffffff811872d1
[  167.486514]  ffff880000000008 ffff88007d985bb8 ffff88007d985b58 0000000084986595
[  167.486514] Call Trace:
[  167.486514]  <NMI>  [<ffffffff8133983f>] dump_stack+0x63/0x84
[  167.486514]  [<ffffffff811872d1>] panic+0xe2/0x229
[  167.486514]  [<ffffffff8113dc30>] watchdog_overflow_callback+0x100/0x100
[  167.486514]  [<ffffffff8117ee18>] __perf_event_overflow+0x88/0x1c0
[  167.486514]  [<ffffffff8117f994>] perf_event_overflow+0x14/0x20
[  167.486514]  [<ffffffff8100c42f>] intel_pmu_handle_irq+0x1df/0x460
[  167.486514]  [<ffffffff81052e3f>] ? native_apic_wait_icr_idle+0x1f/0x30
[  167.486514]  [<ffffffff81032cc5>] ? arch_irq_work_raise+0x35/0x40
[  167.486514]  [<ffffffff8100563d>] perf_event_nmi_handler+0x2d/0x50
[  167.486514]  [<ffffffff810313a2>] nmi_handle+0x62/0xf0
[  167.486514]  [<ffffffff81031a06>] default_do_nmi+0xf6/0x120
[  167.486514]  [<ffffffff81031b11>] do_nmi+0xe1/0x150
[  167.486514]  [<ffffffff816ad5f1>] end_repeat_nmi+0x1a/0x1e
[  167.486514]  [<ffffffff81063a16>] ? native_write_msr_safe+0x6/0x30
[  167.486514]  [<ffffffff81063a16>] ? native_write_msr_safe+0x6/0x30
[  167.486514]  [<ffffffff81063a16>] ? native_write_msr_safe+0x6/0x30
[  167.486514]  <<EOE>>  [<ffffffff8100b5cd>] ? __intel_pmu_enable_all.isra.12+0x4d/0xb0
[  167.486514]  [<ffffffff8100b640>] intel_pmu_enable_all+0x10/0x20
[  167.486514]  [<ffffffff810072c3>] x86_pmu_enable+0x263/0x2f0
[  167.486514]  [<ffffffff81179a72>] perf_pmu_enable+0x22/0x30
[  167.486514]  [<ffffffff8117a721>] ctx_resched+0x51/0x60
[  167.486514]  [<ffffffff8117b2ff>] perf_event_exec+0x10f/0x140
[  167.486514]  [<ffffffff8121949d>] setup_new_exec+0x6d/0x1a0
[  167.486514]  [<ffffffff8126b58a>] load_elf_binary+0x37a/0x10e0
[  167.486514]  [<ffffffff811b77f2>] ? get_user_pages+0x52/0x60
[  167.486514]  [<ffffffff8121779e>] search_binary_handler+0x9e/0x1e0
[  167.486514]  [<ffffffff812191f4>] do_execveat_common.isra.34+0x554/0x6e0
[  167.486514]  [<ffffffff8121960a>] SyS_execve+0x3a/0x50
[  167.486514]  [<ffffffff816ab195>] stub_execve+0x5/0x5
[  167.486514]  [<ffffffff816aaeee>] ? entry_SYSCALL_64_fastpath+0x12/0x71


jirka


* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-02-29 22:12 ` Liang, Kan
  2016-03-01  6:55   ` Jiri Olsa
@ 2016-03-01  9:17   ` Peter Zijlstra
  2016-03-01 11:06     ` Jiri Olsa
  1 sibling, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2016-03-01  9:17 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, Ingo Molnar, Andi Kleen,
	Stephane Eranian, Wang Nan, zheng.z.yan, LKML

On Mon, Feb 29, 2016 at 10:12:08PM +0000, Liang, Kan wrote:

> SDM "18.4.4.4 Re-configuring PEBS Facilities" mentions that
> a quiescent period is needed between stopping the prior event counting and
> setting up a new PEBS event when software needs to reconfigure PEBS facilities.
> The quiescent period is to allow any latent residual PEBS records to complete
> their capture at their previously specified buffer address.

> That requirement can only be found for the Core microarchitecture.

But that should apply to all (PEBS) event scheduling, not just the
multi thing.

Also very convenient that quiescent period is so well defined. How long
should we wait, a day?

> I think it may imply that there is some observed delay in writing the PEBS buffer.

Doesn't it explicitly state just that?

> So if perf records a precise hw event with a very small period, the slow PEBS writes
> may lock up the CPU.

And I still don't see how this would explain a lockup in the MSR writes.

[ Jiri, can you disable that stupid panic on hard lockup and let it run
for a while, see if all the lockup msgs hit the same IP? Also, can you
look where exactly that IP lives in the code? ]

So I suspect it actually just did the PERF_GLOBAL_CTRL write, how else
would the hardware watchdog trigger on that same CPU.

After that, there's only BTS muck, which you're not using, so WTH is it
actually stuck on?

> If so, I think disabling the multiple PEBS support should be a good way to go.

As said, this should affect any and all PEBS event scheduling, not just
the multi stuff.


* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-03-01  9:17   ` Peter Zijlstra
@ 2016-03-01 11:06     ` Jiri Olsa
  2016-03-01 11:20       ` Peter Zijlstra
  2016-03-01 14:51       ` Andi Kleen
  0 siblings, 2 replies; 18+ messages in thread
From: Jiri Olsa @ 2016-03-01 11:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Liang, Kan, Arnaldo Carvalho de Melo, Ingo Molnar, Andi Kleen,
	Stephane Eranian, Wang Nan, zheng.z.yan, LKML

On Tue, Mar 01, 2016 at 10:17:03AM +0100, Peter Zijlstra wrote:
> On Mon, Feb 29, 2016 at 10:12:08PM +0000, Liang, Kan wrote:
> 
> > SDM "18.4.4.4 Re-configuring PEBS Facilities" mentions that
> > a quiescent period is needed between stopping the prior event counting and
> > setting up a new PEBS event when software needs to reconfigure PEBS facilities.
> > The quiescent period is to allow any latent residual PEBS records to complete
> > their capture at their previously specified buffer address.
> 
> > That requirement can only be found for the Core microarchitecture.
> 
> But that should apply to all (PEBS) event scheduling, not just the
> multi thing.
> 
> Also very convenient that quiescent period is so well defined. How long
> should we wait, a day?
> 
> > I think it may imply that there is some observed delay in writing the PEBS buffer.
> 
> Doesn't it explicitly state just that?
> 
> > So if perf records a precise hw event with a very small period, the slow PEBS writes
> > may lock up the CPU.
> 
> And I still don't see how this would explain a lockup in the MSR writes.
> 
> [ Jiri, can you disable that stupid panic on hard lockup and let it run
> for a while, see if all the lockup msgs hit the same IP? Also, can you
> look where exactly that IP lives in the code? ]

I'm on it.. also, the patch that makes this happen just
enlarges the buffer for PEBS:

  156174999dd1 perf/intel/x86: Enlarge the PEBS buffer

but I did not find any PEBS buffer length limitations in the SDM

jirka

> 
> So I suspect it actually just did the PERF_GLOBAL_CTRL write, how else
> would the hardware watchdog trigger on that same CPU.
> 
> After that, there's only BTS muck, which you're not using, so WTH is it
> actually stuck on?
> 
> > If so, I think disabling the multiple PEBS support should be a good way to go.
> 
> As said, this should affect any and all PEBS event scheduling, not just
> the multi stuff.


* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-03-01 11:06     ` Jiri Olsa
@ 2016-03-01 11:20       ` Peter Zijlstra
  2016-03-01 14:51       ` Andi Kleen
  1 sibling, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2016-03-01 11:20 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Liang, Kan, Arnaldo Carvalho de Melo, Ingo Molnar, Andi Kleen,
	Stephane Eranian, Wang Nan, zheng.z.yan, LKML

On Tue, Mar 01, 2016 at 12:06:51PM +0100, Jiri Olsa wrote:
> > 
> > [ Jiri, can you disable that stupid panic on hard lockup and let it run
> > for a while, see if all the lockup msgs hit the same IP? Also, can you
> > look where exactly that IP lives in the code? ]
> 
> I'm on it..

Thanks!

> also, the patch that makes this happen just
> enlarges the buffer for PEBS:
> 
>   156174999dd1 perf/intel/x86: Enlarge the PEBS buffer
> 
> but I did not find any PEBS buffer length limitations in the SDM

Probably just makes it easier to tickle.


* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-03-01 11:06     ` Jiri Olsa
  2016-03-01 11:20       ` Peter Zijlstra
@ 2016-03-01 14:51       ` Andi Kleen
  2016-03-01 14:59         ` Peter Zijlstra
  1 sibling, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2016-03-01 14:51 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Liang, Kan, Arnaldo Carvalho de Melo,
	Ingo Molnar, Andi Kleen, Stephane Eranian, Wang Nan, zheng.z.yan,
	LKML

> I'm on it.. also, the patch that makes this happen just
> enlarges the buffer for PEBS:
> 
>   156174999dd1 perf/intel/x86: Enlarge the PEBS buffer
> 
> but I did not find any PEBS buffer length limitations in the SDM

Maybe the easiest would be to just keep the old buffer size
on Merom.

-Andi


* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-03-01 14:51       ` Andi Kleen
@ 2016-03-01 14:59         ` Peter Zijlstra
  2016-03-01 17:17           ` Jiri Olsa
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2016-03-01 14:59 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jiri Olsa, Liang, Kan, Arnaldo Carvalho de Melo, Ingo Molnar,
	Stephane Eranian, Wang Nan, zheng.z.yan, LKML

On Tue, Mar 01, 2016 at 03:51:05PM +0100, Andi Kleen wrote:
> > I'm on it.. also, the patch that makes this happen just
> > enlarges the buffer for PEBS:
> > 
> >   156174999dd1 perf/intel/x86: Enlarge the PEBS buffer
> > 
> > but I did not find any PEBS buffer length limitations in the SDM
> 
> Maybe the easiest would be to just keep the old buffer size
> on Merom.

I would still like to know why it breaks though. Just changing code
about because it makes fail go away is never the right answer.


* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-03-01 14:59         ` Peter Zijlstra
@ 2016-03-01 17:17           ` Jiri Olsa
  2016-03-01 17:32             ` Andi Kleen
                               ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Jiri Olsa @ 2016-03-01 17:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andi Kleen, Liang, Kan, Arnaldo Carvalho de Melo, Ingo Molnar,
	Stephane Eranian, Wang Nan, zheng.z.yan, LKML

On Tue, Mar 01, 2016 at 03:59:09PM +0100, Peter Zijlstra wrote:
> On Tue, Mar 01, 2016 at 03:51:05PM +0100, Andi Kleen wrote:
> > > I'm on it.. also, the patch that makes this happen just
> > > enlarges the buffer for PEBS:
> > > 
> > >   156174999dd1 perf/intel/x86: Enlarge the PEBS buffer
> > > 
> > > but I did not find any PEBS buffer length limitations in the SDM
> > 
> > Maybe the easiest would be to just keep the old buffer size
> > on Merom.
> 
> I would still like to know why it breaks though. Just changing code
> about because it makes fail go away is never the right answer.

I had to go through several config switch-offs,
however now it's pretty clear where it hangs

I got one (just one occurrence):

[  125.982977] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
[  125.982977] Modules linked in: rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ocrdma ib_core snd_hda_codec_idt snd_hda_codec_generic ib_addr snd_hda_intel snd_hda_codec snd_hda_core coretemp kvm_intel snd_hwdep snd_seq kvm snd_seq_device snd_pcm snd_timer snd nfsd soundcore shpchp pcspkr auth_rpcgss ppdev nfs_acl sg irqbypass parport_serial parport_pc lockd i2c_i801 parport acpi_cpufreq grace sunrpc ip_tables xfs libcrc32c sr_mod sd_mod cdrom i915 video i2c_algo_bit ahci drm_kms_helper libahci syscopyarea sysfillrect sysimgblt fb_sys_fops e1000e drm ptp pps_core i2c_core lpfc libata scsi_transport_fc be2net vxlan ip6_udp_tunnel udp_tunnel fjes dm_mirror dm_region_hash dm_log dm_mod
[  125.982977] CPU: 1 PID: 5586 Comm: perf Not tainted 4.5.0-rc4fix+ #10
[  125.982977] Hardware name: System Manufacturer To Be Filled By O.E.M. Product Name To Be Filled By O.E.M./BB Name To be filled by O.E.M., BIOS CGELIA55.86
[  125.982977]  0000000000000086 00000000522d7515 ffff88007d885bb0 ffffffff8131dd11
[  125.982977]  0000000000000000 0000000000000000 ffff88007d885bc8 ffffffff811279cb
[  125.982977]  ffff88007a548000 ffff88007d885c00 ffffffff81168618 0000000000000002
[  125.982977] Call Trace:
[  125.982977]  <NMI>  [<ffffffff8131dd11>] dump_stack+0x4d/0x6c
[  125.982977]  [<ffffffff811279cb>] watchdog_overflow_callback+0xeb/0x100
[  125.982977]  [<ffffffff81168618>] __perf_event_overflow+0x88/0x1c0
[  125.982977]  [<ffffffff81169194>] perf_event_overflow+0x14/0x20
[  125.982977]  [<ffffffff8100b656>] intel_pmu_handle_irq+0x1e6/0x4a0
[  125.982977]  [<ffffffff810413bf>] ? native_apic_wait_icr_idle+0x1f/0x30
[  125.982977]  [<ffffffff81021d15>] ? arch_irq_work_raise+0x35/0x40
[  125.982977]  [<ffffffff8115c54e>] ? irq_work_queue+0x6e/0x80
[  125.982977]  [<ffffffff8100466d>] perf_event_nmi_handler+0x2d/0x50
[  125.982977]  [<ffffffff81020382>] nmi_handle+0x62/0xf0
[  125.982977]  [<ffffffff810209d6>] default_do_nmi+0xe6/0x110
[  125.982977]  [<ffffffff81020ad7>] do_nmi+0xd7/0x140
[  125.982977]  [<ffffffff8167af37>] end_repeat_nmi+0x1a/0x1e
[  125.982977]  [<ffffffff8100ae7b>] ? __intel_pmu_enable_all.isra.11+0x4b/0xd0
[  125.982977]  [<ffffffff8100ae7b>] ? __intel_pmu_enable_all.isra.11+0x4b/0xd0
[  125.982977]  [<ffffffff8100ae7b>] ? __intel_pmu_enable_all.isra.11+0x4b/0xd0
[  125.982977]  <<EOE>>  [<ffffffff8100af10>] intel_pmu_enable_all+0x10/0x20
[  125.982977]  [<ffffffff81006283>] x86_pmu_enable+0x263/0x2f0
[  125.982977]  [<ffffffff811632d2>] perf_pmu_enable+0x22/0x30
[  125.982977]  [<ffffffff81163f51>] ctx_resched+0x51/0x60
[  125.982977]  [<ffffffff81164b09>] perf_event_exec+0x109/0x150
[  125.982977]  [<ffffffff811fff7d>] setup_new_exec+0x6d/0x1a0
[  125.982977]  [<ffffffff8125104a>] load_elf_binary+0x37a/0x10e0
[  125.982977]  [<ffffffff811a06c2>] ? get_user_pages+0x52/0x60
[  125.982977]  [<ffffffff811fe32e>] search_binary_handler+0x9e/0x1e0
[  125.982977]  [<ffffffff811ffccd>] do_execveat_common.isra.37+0x54d/0x6e0
[  125.982977]  [<ffffffff812000ea>] SyS_execve+0x3a/0x50
[  125.982977]  [<ffffffff81679065>] stub_execve+0x5/0x5
[  125.982977]  [<ffffffff81678dd7>] ? entry_SYSCALL_64_fastpath+0x12/0x6a


and several rcu stalls.. I guess that's because the above path did rcu_read_lock:

[  166.667008] INFO: rcu_sched detected stalls on CPUs/tasks:
[  166.668003]  1-...: (16 GPs behind) idle=b13/140000000000000/0 softirq=6921/6923 fqs=19790
[  166.774020]  (detected by 2, t=60168 jiffies, g=6522, c=6521, q=0)
[  166.774020] Task dump for CPU 1:
[  166.774020] perf            R  running task        0  5586   5585 0x00000088
[  166.774020]  0000000000000001 ffff88007144fd70 ffffffff81164b09 ffff88007d899808
[  166.774020]  0000000000000286 ffff88007d899808 ffff880076f8e7c0 ffff880079734200
[  166.774020]  ffff880071450000 0000000000000000 ffff880078dfc500 ffff88007144fd90
[  166.774020] Call Trace:
[  166.774020]  [<ffffffff81164b09>] ? perf_event_exec+0x109/0x150
[  166.774020]  [<ffffffff811fff7d>] ? setup_new_exec+0x6d/0x1a0
[  166.774020]  [<ffffffff8125104a>] ? load_elf_binary+0x37a/0x10e0
[  166.774020]  [<ffffffff811a06c2>] ? get_user_pages+0x52/0x60
[  166.774020]  [<ffffffff811fe32e>] ? search_binary_handler+0x9e/0x1e0
[  166.774020]  [<ffffffff811ffccd>] ? do_execveat_common.isra.37+0x54d/0x6e0
[  166.774020]  [<ffffffff812000ea>] ? SyS_execve+0x3a/0x50
[  166.774020]  [<ffffffff81679065>] ? stub_execve+0x5/0x5
[  166.774020]  [<ffffffff81678dd7>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
[  346.672008] INFO: rcu_sched detected stalls on CPUs/tasks:


the exception addr is on wrmsr:

ffffffff8100ae30 <__intel_pmu_enable_all.isra.11>:
ffffffff8100ae30:       e8 bb 02 67 00          callq  ffffffff8167b0f0 <__fentry__>
ffffffff8100ae35:       55                      push   %rbp
ffffffff8100ae36:       48 89 e5                mov    %rsp,%rbp
ffffffff8100ae39:       41 54                   push   %r12
ffffffff8100ae3b:       41 89 fc                mov    %edi,%r12d
ffffffff8100ae3e:       53                      push   %rbx
ffffffff8100ae3f:       48 c7 c3 80 a3 00 00    mov    $0xa380,%rbx
ffffffff8100ae46:       65 48 03 1d d2 f2 ff    add    %gs:0x7efff2d2(%rip),%rbx        # a120 <this_cpu_off>
ffffffff8100ae4d:       7e
ffffffff8100ae4e:       e8 6d 49 00 00          callq  ffffffff8100f7c0 <intel_pmu_pebs_enable_all>
ffffffff8100ae53:       41 0f b6 fc             movzbl %r12b,%edi
ffffffff8100ae57:       e8 94 58 00 00          callq  ffffffff810106f0 <intel_pmu_lbr_enable_all>
ffffffff8100ae5c:       48 8b 83 68 0c 00 00    mov    0xc68(%rbx),%rax
ffffffff8100ae63:       b9 8f 03 00 00          mov    $0x38f,%ecx
ffffffff8100ae68:       48 f7 d0                not    %rax
ffffffff8100ae6b:       48 23 05 26 80 ad 00    and    0xad8026(%rip),%rax        # ffffffff81ae2e98 <x86_pmu+0x138>
ffffffff8100ae72:       48 89 c2                mov    %rax,%rdx
ffffffff8100ae75:       48 c1 ea 20             shr    $0x20,%rdx
ffffffff8100ae79:       0f 30                   wrmsr


I tried what Andi suggested below (not sure what he meant by Merom,
I took PEBS format0 instead), works for me

jirka


---
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c8a243d6fc82..c4a1a769bae7 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -269,7 +269,7 @@ static int alloc_pebs_buffer(int cpu)
 	if (!x86_pmu.pebs)
 		return 0;
 
-	buffer = kzalloc_node(PEBS_BUFFER_SIZE, GFP_KERNEL, node);
+	buffer = kzalloc_node(x86_pmu.pebs_buffer_size, GFP_KERNEL, node);
 	if (unlikely(!buffer))
 		return -ENOMEM;
 
@@ -286,7 +286,7 @@ static int alloc_pebs_buffer(int cpu)
 		per_cpu(insn_buffer, cpu) = ibuffer;
 	}
 
-	max = PEBS_BUFFER_SIZE / x86_pmu.pebs_record_size;
+	max = x86_pmu.pebs_buffer_size / x86_pmu.pebs_record_size;
 
 	ds->pebs_buffer_base = (u64)(unsigned long)buffer;
 	ds->pebs_index = ds->pebs_buffer_base;
@@ -1319,6 +1319,7 @@ void __init intel_ds_init(void)
 
 	x86_pmu.bts  = boot_cpu_has(X86_FEATURE_BTS);
 	x86_pmu.pebs = boot_cpu_has(X86_FEATURE_PEBS);
+	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
 	if (x86_pmu.pebs) {
 		char pebs_type = x86_pmu.intel_cap.pebs_trap ?  '+' : '-';
 		int format = x86_pmu.intel_cap.pebs_format;
@@ -1327,6 +1328,7 @@ void __init intel_ds_init(void)
 		case 0:
 			pr_cont("PEBS fmt0%c, ", pebs_type);
 			x86_pmu.pebs_record_size = sizeof(struct pebs_record_core);
+			x86_pmu.pebs_buffer_size = PAGE_SIZE;
 			x86_pmu.drain_pebs = intel_pmu_drain_pebs_core;
 			break;
 
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 7bb61e32fb29..1ab6279fed1d 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -586,6 +586,7 @@ struct x86_pmu {
 			pebs_broken	:1,
 			pebs_prec_dist	:1;
 	int		pebs_record_size;
+	int		pebs_buffer_size;
 	void		(*drain_pebs)(struct pt_regs *regs);
 	struct event_constraint *pebs_constraints;
 	void		(*pebs_aliases)(struct perf_event *event);


* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-03-01 17:17           ` Jiri Olsa
@ 2016-03-01 17:32             ` Andi Kleen
  2016-03-01 17:49             ` Peter Zijlstra
  2016-03-01 18:12             ` Peter Zijlstra
  2 siblings, 0 replies; 18+ messages in thread
From: Andi Kleen @ 2016-03-01 17:32 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Andi Kleen, Liang, Kan, Arnaldo Carvalho de Melo,
	Ingo Molnar, Stephane Eranian, Wang Nan, zheng.z.yan, LKML

> I tried what Andi suggested below (not sure what he meant by Merom,
> I took PEBS format0 instead), works for me

Thanks Jiri. Patch looks good to me.

Reviewed-by: Andi Kleen <ak@linux.intel.com>

We may want to make this buffer size configurable anyway because
with multi-record PEBS it can make sense to use much larger buffers
to reduce overhead.  But that could be done separately.

-Andi


* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-03-01 17:17           ` Jiri Olsa
  2016-03-01 17:32             ` Andi Kleen
@ 2016-03-01 17:49             ` Peter Zijlstra
  2016-03-01 18:04               ` Jiri Olsa
  2016-03-01 18:12             ` Peter Zijlstra
  2 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2016-03-01 17:49 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andi Kleen, Liang, Kan, Arnaldo Carvalho de Melo, Ingo Molnar,
	Stephane Eranian, Wang Nan, zheng.z.yan, LKML

On Tue, Mar 01, 2016 at 06:17:22PM +0100, Jiri Olsa wrote:

> [  125.982977]  [<ffffffff8100ae7b>] ? __intel_pmu_enable_all.isra.11+0x4b/0xd0
> [  125.982977]  [<ffffffff8100ae7b>] ? __intel_pmu_enable_all.isra.11+0x4b/0xd0
> [  125.982977]  [<ffffffff8100ae7b>] ? __intel_pmu_enable_all.isra.11+0x4b/0xd0
> [  125.982977]  <<EOE>>  [<ffffffff8100af10>] intel_pmu_enable_all+0x10/0x20
> [  125.982977]  [<ffffffff81006283>] x86_pmu_enable+0x263/0x2f0
> [  125.982977]  [<ffffffff811632d2>] perf_pmu_enable+0x22/0x30
> [  125.982977]  [<ffffffff81163f51>] ctx_resched+0x51/0x60
> [  125.982977]  [<ffffffff81164b09>] perf_event_exec+0x109/0x150
> [  125.982977]  [<ffffffff811fff7d>] setup_new_exec+0x6d/0x1a0
> [  125.982977]  [<ffffffff8125104a>] load_elf_binary+0x37a/0x10e0
> [  125.982977]  [<ffffffff811a06c2>] ? get_user_pages+0x52/0x60
> [  125.982977]  [<ffffffff811fe32e>] search_binary_handler+0x9e/0x1e0
> [  125.982977]  [<ffffffff811ffccd>] do_execveat_common.isra.37+0x54d/0x6e0
> [  125.982977]  [<ffffffff812000ea>] SyS_execve+0x3a/0x50
> [  125.982977]  [<ffffffff81679065>] stub_execve+0x5/0x5
> [  125.982977]  [<ffffffff81678dd7>] ? entry_SYSCALL_64_fastpath+0x12/0x6a

> the exception addr is on wrmsr:
> 
> ffffffff8100ae30 <__intel_pmu_enable_all.isra.11>:
> ffffffff8100ae30:       e8 bb 02 67 00          callq  ffffffff8167b0f0 <__fentry__>
> ffffffff8100ae35:       55                      push   %rbp
> ffffffff8100ae36:       48 89 e5                mov    %rsp,%rbp
> ffffffff8100ae39:       41 54                   push   %r12
> ffffffff8100ae3b:       41 89 fc                mov    %edi,%r12d
> ffffffff8100ae3e:       53                      push   %rbx
> ffffffff8100ae3f:       48 c7 c3 80 a3 00 00    mov    $0xa380,%rbx
> ffffffff8100ae46:       65 48 03 1d d2 f2 ff    add    %gs:0x7efff2d2(%rip),%rbx        # a120 <this_cpu_off>
> ffffffff8100ae4d:       7e
> ffffffff8100ae4e:       e8 6d 49 00 00          callq  ffffffff8100f7c0 <intel_pmu_pebs_enable_all>
> ffffffff8100ae53:       41 0f b6 fc             movzbl %r12b,%edi
> ffffffff8100ae57:       e8 94 58 00 00          callq  ffffffff810106f0 <intel_pmu_lbr_enable_all>
> ffffffff8100ae5c:       48 8b 83 68 0c 00 00    mov    0xc68(%rbx),%rax
> ffffffff8100ae63:       b9 8f 03 00 00          mov    $0x38f,%ecx
> ffffffff8100ae68:       48 f7 d0                not    %rax
> ffffffff8100ae6b:       48 23 05 26 80 ad 00    and    0xad8026(%rip),%rax        # ffffffff81ae2e98 <x86_pmu+0x138>
> ffffffff8100ae72:       48 89 c2                mov    %rax,%rdx
> ffffffff8100ae75:       48 c1 ea 20             shr    $0x20,%rdx
> ffffffff8100ae79:       0f 30                   wrmsr
> 

That's the PERF_GLOBAL_CTRL, right? But it must have succeeded,
otherwise the NMI watchdog would never have fired.

Something is hosed alright.

I think I've seen my IVB-EP do something similar. But mostly that
machine gets stuck in intel_bts_enable_local().

* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-03-01 17:49             ` Peter Zijlstra
@ 2016-03-01 18:04               ` Jiri Olsa
  2016-03-01 18:14                 ` Peter Zijlstra
  0 siblings, 1 reply; 18+ messages in thread
From: Jiri Olsa @ 2016-03-01 18:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andi Kleen, Liang, Kan, Arnaldo Carvalho de Melo, Ingo Molnar,
	Stephane Eranian, Wang Nan, zheng.z.yan, LKML

On Tue, Mar 01, 2016 at 06:49:03PM +0100, Peter Zijlstra wrote:
> On Tue, Mar 01, 2016 at 06:17:22PM +0100, Jiri Olsa wrote:
> 
> > [  125.982977]  [<ffffffff8100ae7b>] ? __intel_pmu_enable_all.isra.11+0x4b/0xd0
> > [  125.982977]  [<ffffffff8100ae7b>] ? __intel_pmu_enable_all.isra.11+0x4b/0xd0
> > [  125.982977]  [<ffffffff8100ae7b>] ? __intel_pmu_enable_all.isra.11+0x4b/0xd0
> > [  125.982977]  <<EOE>>  [<ffffffff8100af10>] intel_pmu_enable_all+0x10/0x20
> > [  125.982977]  [<ffffffff81006283>] x86_pmu_enable+0x263/0x2f0
> > [  125.982977]  [<ffffffff811632d2>] perf_pmu_enable+0x22/0x30
> > [  125.982977]  [<ffffffff81163f51>] ctx_resched+0x51/0x60
> > [  125.982977]  [<ffffffff81164b09>] perf_event_exec+0x109/0x150
> > [  125.982977]  [<ffffffff811fff7d>] setup_new_exec+0x6d/0x1a0
> > [  125.982977]  [<ffffffff8125104a>] load_elf_binary+0x37a/0x10e0
> > [  125.982977]  [<ffffffff811a06c2>] ? get_user_pages+0x52/0x60
> > [  125.982977]  [<ffffffff811fe32e>] search_binary_handler+0x9e/0x1e0
> > [  125.982977]  [<ffffffff811ffccd>] do_execveat_common.isra.37+0x54d/0x6e0
> > [  125.982977]  [<ffffffff812000ea>] SyS_execve+0x3a/0x50
> > [  125.982977]  [<ffffffff81679065>] stub_execve+0x5/0x5
> > [  125.982977]  [<ffffffff81678dd7>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
> 
> > the exception addr is on wrmsr:
> > 
> > ffffffff8100ae30 <__intel_pmu_enable_all.isra.11>:
> > ffffffff8100ae30:       e8 bb 02 67 00          callq  ffffffff8167b0f0 <__fentry__>
> > ffffffff8100ae35:       55                      push   %rbp
> > ffffffff8100ae36:       48 89 e5                mov    %rsp,%rbp
> > ffffffff8100ae39:       41 54                   push   %r12
> > ffffffff8100ae3b:       41 89 fc                mov    %edi,%r12d
> > ffffffff8100ae3e:       53                      push   %rbx
> > ffffffff8100ae3f:       48 c7 c3 80 a3 00 00    mov    $0xa380,%rbx
> > ffffffff8100ae46:       65 48 03 1d d2 f2 ff    add    %gs:0x7efff2d2(%rip),%rbx        # a120 <this_cpu_off>
> > ffffffff8100ae4d:       7e
> > ffffffff8100ae4e:       e8 6d 49 00 00          callq  ffffffff8100f7c0 <intel_pmu_pebs_enable_all>
> > ffffffff8100ae53:       41 0f b6 fc             movzbl %r12b,%edi
> > ffffffff8100ae57:       e8 94 58 00 00          callq  ffffffff810106f0 <intel_pmu_lbr_enable_all>
> > ffffffff8100ae5c:       48 8b 83 68 0c 00 00    mov    0xc68(%rbx),%rax
> > ffffffff8100ae63:       b9 8f 03 00 00          mov    $0x38f,%ecx
> > ffffffff8100ae68:       48 f7 d0                not    %rax
> > ffffffff8100ae6b:       48 23 05 26 80 ad 00    and    0xad8026(%rip),%rax        # ffffffff81ae2e98 <x86_pmu+0x138>
> > ffffffff8100ae72:       48 89 c2                mov    %rax,%rdx
> > ffffffff8100ae75:       48 c1 ea 20             shr    $0x20,%rdx
> > ffffffff8100ae79:       0f 30                   wrmsr
> > 
> 
> That's the PERF_GLOBAL_CTRL, right? But it must have succeeded,

yep, should be this one:

static void __intel_pmu_enable_all(int added, bool pmi)
{
        struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);

        intel_pmu_pebs_enable_all();
        intel_pmu_lbr_enable_all(pmi);
 >>>    wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL,
                        x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask);
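
For reference, a minimal user-space sketch of what the quoted disassembly
sets up just before the wrmsr: ECX carries the MSR index (0x38f), and
EDX:EAX carry the high and low halves of
x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask. The struct and helper
names below are hypothetical and only model the register values; nothing
here executes a real WRMSR.

```c
#include <stdint.h>

/* MSR index loaded by "mov $0x38f,%ecx" in the disassembly. */
#define MSR_CORE_PERF_GLOBAL_CTRL 0x38f

/* Hypothetical container for the register operands WRMSR consumes. */
struct wrmsr_operands {
	uint32_t ecx;	/* MSR index */
	uint32_t eax;	/* low 32 bits of the value */
	uint32_t edx;	/* high 32 bits of the value */
};

/*
 * Hypothetical helper mirroring the instruction sequence:
 *   not %rax; and <x86_pmu+0x138>,%rax   -> intel_ctrl & ~guest_mask
 *   mov %rax,%rdx; shr $0x20,%rdx        -> high half into EDX
 */
static struct wrmsr_operands global_ctrl_operands(uint64_t intel_ctrl,
						  uint64_t guest_mask)
{
	uint64_t val = intel_ctrl & ~guest_mask;
	struct wrmsr_operands ops = {
		.ecx = MSR_CORE_PERF_GLOBAL_CTRL,
		.eax = (uint32_t)val,
		.edx = (uint32_t)(val >> 32),
	};

	return ops;
}
```

With an assumed intel_ctrl of 0x700000003 (two general-purpose and three
fixed counters) and an empty guest mask, this gives EAX = 3 and EDX = 7.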


> otherwise the NMI watchdog would never have fired.

so NMI wouldn't trigger if CPU is inside wrmsr?

jirka

> 
> Something is hosed alright.
> 
> I think I've seen my IVB-EP do something similar. But mostly that
> machine gets stuck in intel_bts_enable_local().
> 
> 

* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-03-01 17:17           ` Jiri Olsa
  2016-03-01 17:32             ` Andi Kleen
  2016-03-01 17:49             ` Peter Zijlstra
@ 2016-03-01 18:12             ` Peter Zijlstra
  2016-03-01 19:03               ` [PATCH] perf x86: Use PAGE_SIZE for PEBS buffer size on Core2 Jiri Olsa
  2 siblings, 1 reply; 18+ messages in thread
From: Peter Zijlstra @ 2016-03-01 18:12 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andi Kleen, Liang, Kan, Arnaldo Carvalho de Melo, Ingo Molnar,
	Stephane Eranian, Wang Nan, zheng.z.yan, LKML

On Tue, Mar 01, 2016 at 06:17:22PM +0100, Jiri Olsa wrote:
> I tried what Andi suggested below (not sure what he meant by Merom,
> I took PEBS format0 instead), works for me

Model 15, see intel_pmu_init(). But you're actually running on a Penryn
I suspect, since we disabled PEBS for Merom.

There's also a bunch of Atoms that use PEBS format 0; no idea if
they're affected too.
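
As a rough illustration of the model numbers mentioned here: the model
that intel_pmu_init() switches on is derived from the CPUID leaf 1 EAX
signature, with the extended model field folded in for family 6. The
sketch below assumes the standard Intel decoding rules; the helper names
are hypothetical, and the example signatures (0x6f6 for a Merom part,
0x10676 for a Penryn part) are assumed typical values, not taken from
this thread.

```c
#include <stdint.h>

/* Hypothetical helper: display family from a CPUID.1:EAX signature. */
static unsigned int cpuid_family(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;

	/* The extended family field only applies when the base family is 0xf. */
	if (family == 0xf)
		family += (eax >> 20) & 0xff;
	return family;
}

/* Hypothetical helper: display model from a CPUID.1:EAX signature. */
static unsigned int cpuid_model(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;
	unsigned int model = (eax >> 4) & 0xf;

	/* For families 6 and 0xf the extended model supplies the upper nibble. */
	if (family == 0x6 || family == 0xf)
		model |= ((eax >> 16) & 0xf) << 4;
	return model;
}
```

Under these assumptions 0x6f6 decodes to family 6, model 15 (Merom) and
0x10676 decodes to family 6, model 23 (Penryn), which matches the
"model 23" in the original report.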

> ---
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index c8a243d6fc82..c4a1a769bae7 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -269,7 +269,7 @@ static int alloc_pebs_buffer(int cpu)
>  	if (!x86_pmu.pebs)
>  		return 0;
>  
> -	buffer = kzalloc_node(PEBS_BUFFER_SIZE, GFP_KERNEL, node);
> +	buffer = kzalloc_node(x86_pmu.pebs_buffer_size, GFP_KERNEL, node);
>  	if (unlikely(!buffer))
>  		return -ENOMEM;
>  
> @@ -286,7 +286,7 @@ static int alloc_pebs_buffer(int cpu)
>  		per_cpu(insn_buffer, cpu) = ibuffer;
>  	}
>  
> -	max = PEBS_BUFFER_SIZE / x86_pmu.pebs_record_size;
> +	max = x86_pmu.pebs_buffer_size / x86_pmu.pebs_record_size;
>  
>  	ds->pebs_buffer_base = (u64)(unsigned long)buffer;
>  	ds->pebs_index = ds->pebs_buffer_base;
> @@ -1319,6 +1319,7 @@ void __init intel_ds_init(void)
>  
>  	x86_pmu.bts  = boot_cpu_has(X86_FEATURE_BTS);
>  	x86_pmu.pebs = boot_cpu_has(X86_FEATURE_PEBS);
> +	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
>  	if (x86_pmu.pebs) {
>  		char pebs_type = x86_pmu.intel_cap.pebs_trap ?  '+' : '-';
>  		int format = x86_pmu.intel_cap.pebs_format;
> @@ -1327,6 +1328,7 @@ void __init intel_ds_init(void)
>  		case 0:
>  			pr_cont("PEBS fmt0%c, ", pebs_type);
>  			x86_pmu.pebs_record_size = sizeof(struct pebs_record_core);

At the very least this wants a comment along the lines of:

			/*
			 * Using >PAGE_SIZE buffers makes the WRMSR to
			 * PERF_GLOBAL_CTRL in intel_pmu_enable_all()
			 * mysteriously hang on Core2.
			 *
			 * As a workaround, we don't do this.
			 */

> +			x86_pmu.pebs_buffer_size = PAGE_SIZE;
>  			x86_pmu.drain_pebs = intel_pmu_drain_pebs_core;
>  			break;
>  
> diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
> index 7bb61e32fb29..1ab6279fed1d 100644
> --- a/arch/x86/events/perf_event.h
> +++ b/arch/x86/events/perf_event.h
> @@ -586,6 +586,7 @@ struct x86_pmu {
>  			pebs_broken	:1,
>  			pebs_prec_dist	:1;
>  	int		pebs_record_size;
> +	int		pebs_buffer_size;
>  	void		(*drain_pebs)(struct pt_regs *regs);
>  	struct event_constraint *pebs_constraints;
>  	void		(*pebs_aliases)(struct perf_event *event);

* Re: [BUG] Core2 cpu triggers hard lockup with perf test
  2016-03-01 18:04               ` Jiri Olsa
@ 2016-03-01 18:14                 ` Peter Zijlstra
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2016-03-01 18:14 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andi Kleen, Liang, Kan, Arnaldo Carvalho de Melo, Ingo Molnar,
	Stephane Eranian, Wang Nan, zheng.z.yan, LKML

On Tue, Mar 01, 2016 at 07:04:40PM +0100, Jiri Olsa wrote:

> > That's the PERF_GLOBAL_CTRL, right? But it must have succeeded,
> 
> yep, should be this one:
> 
> static void __intel_pmu_enable_all(int added, bool pmi)
> {
>         struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> 
>         intel_pmu_pebs_enable_all();
>         intel_pmu_lbr_enable_all(pmi);
>  >>>    wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL,
>                         x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask);
> 
> 
> > otherwise the NMI watchdog would never have fired.
> 
> so NMI wouldn't trigger if CPU is inside wrmsr?

Well, anything goes with MSR writes, that's all a magic heap of
micro-code.

But at the very least it did actually enable the counters, otherwise the
counter used for the NMI watchdog could not fire, it too would still be
disabled.
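
The argument above can be sketched as follows: a counter only counts
while its enable bit in PERF_GLOBAL_CTRL is set, so an NMI raised by the
watchdog's counter implies the write took effect. This is a minimal
model that assumes the architectural layout (general-purpose counter
enables from bit 0, fixed counter enables from bit 32); the helper names
are hypothetical and the per-counter enable bits in the event select
registers are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: fixed-counter enable bits start at bit 32 of PERF_GLOBAL_CTRL. */
#define GLOBAL_CTRL_FIXED_SHIFT 32

/* Hypothetical helper: is general-purpose counter idx globally enabled? */
static bool gp_counter_enabled(uint64_t global_ctrl, unsigned int idx)
{
	return (global_ctrl >> idx) & 1;
}

/* Hypothetical helper: is fixed counter idx globally enabled? */
static bool fixed_counter_enabled(uint64_t global_ctrl, unsigned int idx)
{
	return (global_ctrl >> (GLOBAL_CTRL_FIXED_SHIFT + idx)) & 1;
}
```

With PERF_GLOBAL_CTRL still at 0 neither helper reports any counter as
enabled, so no counter, including the one backing the NMI watchdog,
could overflow and raise a PMI.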

* [PATCH] perf x86: Use PAGE_SIZE for PEBS buffer size on Core2
  2016-03-01 18:12             ` Peter Zijlstra
@ 2016-03-01 19:03               ` Jiri Olsa
  2016-03-08 13:15                 ` [tip:perf/core] perf/x86/intel: " tip-bot for Jiri Olsa
  0 siblings, 1 reply; 18+ messages in thread
From: Jiri Olsa @ 2016-03-01 19:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andi Kleen, Liang, Kan, Arnaldo Carvalho de Melo, Ingo Molnar,
	Stephane Eranian, Wang Nan, LKML

On Tue, Mar 01, 2016 at 07:12:07PM +0100, Peter Zijlstra wrote:

SNIP

> > @@ -1327,6 +1328,7 @@ void __init intel_ds_init(void)
> >  		case 0:
> >  			pr_cont("PEBS fmt0%c, ", pebs_type);
> >  			x86_pmu.pebs_record_size = sizeof(struct pebs_record_core);
> 
> At the very least this wants a comment along the lines of:
> 
> 			/*
> 			 * Using >PAGE_SIZE buffers makes the WRMSR to
> 			 * PERF_GLOBAL_CTRL in intel_pmu_enable_all()
> 			 * mysteriously hang on Core2.
> 			 *
> 			 * As a workaround, we don't do this.
> 			 */
> 
> > +			x86_pmu.pebs_buffer_size = PAGE_SIZE;
> >  			x86_pmu.drain_pebs = intel_pmu_drain_pebs_core;
> >  			break;
> >  
> > diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
> > index 7bb61e32fb29..1ab6279fed1d 100644
> > --- a/arch/x86/events/perf_event.h
> > +++ b/arch/x86/events/perf_event.h
> > @@ -586,6 +586,7 @@ struct x86_pmu {
> >  			pebs_broken	:1,
> >  			pebs_prec_dist	:1;
> >  	int		pebs_record_size;
> > +	int		pebs_buffer_size;
> >  	void		(*drain_pebs)(struct pt_regs *regs);
> >  	struct event_constraint *pebs_constraints;
> >  	void		(*pebs_aliases)(struct perf_event *event);

sending updated patch
jirka


---
Using buffers larger than PAGE_SIZE makes the WRMSR to
PERF_GLOBAL_CTRL in intel_pmu_enable_all()
mysteriously hang on Core2. As a workaround,
limit the PEBS buffer to PAGE_SIZE there.

The hard lockup is easily triggered by running
'perf test attr' repeatedly. Most of the time
it gets stuck on a sample session with small periods.

  # perf test attr -vv
  14: struct perf_event_attr setup                             :
  --- start ---
  ...
    'PERF_TEST_ATTR=/tmp/tmpuEKz3B /usr/bin/perf record -o /tmp/tmpuEKz3B/perf.data -c 123 kill >/dev/null 2>&1' ret 1

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/events/intel/ds.c   | 13 +++++++++++--
 arch/x86/events/perf_event.h |  1 +
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c8a243d6fc82..b8420c364c5d 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -269,7 +269,7 @@ static int alloc_pebs_buffer(int cpu)
 	if (!x86_pmu.pebs)
 		return 0;
 
-	buffer = kzalloc_node(PEBS_BUFFER_SIZE, GFP_KERNEL, node);
+	buffer = kzalloc_node(x86_pmu.pebs_buffer_size, GFP_KERNEL, node);
 	if (unlikely(!buffer))
 		return -ENOMEM;
 
@@ -286,7 +286,7 @@ static int alloc_pebs_buffer(int cpu)
 		per_cpu(insn_buffer, cpu) = ibuffer;
 	}
 
-	max = PEBS_BUFFER_SIZE / x86_pmu.pebs_record_size;
+	max = x86_pmu.pebs_buffer_size / x86_pmu.pebs_record_size;
 
 	ds->pebs_buffer_base = (u64)(unsigned long)buffer;
 	ds->pebs_index = ds->pebs_buffer_base;
@@ -1319,6 +1319,7 @@ void __init intel_ds_init(void)
 
 	x86_pmu.bts  = boot_cpu_has(X86_FEATURE_BTS);
 	x86_pmu.pebs = boot_cpu_has(X86_FEATURE_PEBS);
+	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
 	if (x86_pmu.pebs) {
 		char pebs_type = x86_pmu.intel_cap.pebs_trap ?  '+' : '-';
 		int format = x86_pmu.intel_cap.pebs_format;
@@ -1327,6 +1328,14 @@ void __init intel_ds_init(void)
 		case 0:
 			pr_cont("PEBS fmt0%c, ", pebs_type);
 			x86_pmu.pebs_record_size = sizeof(struct pebs_record_core);
+			/*
+			 * Using >PAGE_SIZE buffers makes the WRMSR to
+			 * PERF_GLOBAL_CTRL in intel_pmu_enable_all()
+			 * mysteriously hang on Core2.
+			 *
+			 * As a workaround, we don't do this.
+			 */
+			x86_pmu.pebs_buffer_size = PAGE_SIZE;
 			x86_pmu.drain_pebs = intel_pmu_drain_pebs_core;
 			break;
 
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 7bb61e32fb29..1ab6279fed1d 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -586,6 +586,7 @@ struct x86_pmu {
 			pebs_broken	:1,
 			pebs_prec_dist	:1;
 	int		pebs_record_size;
+	int		pebs_buffer_size;
 	void		(*drain_pebs)(struct pt_regs *regs);
 	struct event_constraint *pebs_constraints;
 	void		(*pebs_aliases)(struct perf_event *event);
-- 
2.4.3
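
To give a feel for what the smaller buffer means for
"max = x86_pmu.pebs_buffer_size / x86_pmu.pebs_record_size", here is a
minimal user-space sketch. It assumes 4 KiB pages, a default
PEBS_BUFFER_SIZE of PAGE_SIZE << 4, and a format 0 record made of 18
64-bit fields (flags, ip and 16 general-purpose registers); these values
and all names with a SKETCH_/_sketch marker are assumptions for
illustration, not taken from this patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u			/* assumption: 4 KiB pages */
#define SKETCH_PEBS_BUFFER_SIZE	(SKETCH_PAGE_SIZE << 4)	/* assumption: 64 KiB default */

/* Assumed layout of a PEBS format 0 record: 18 * 8 = 144 bytes. */
struct pebs_record_core_sketch {
	uint64_t flags, ip;
	uint64_t ax, bx, cx, dx;
	uint64_t si, di, bp, sp;
	uint64_t r8, r9, r10, r11;
	uint64_t r12, r13, r14, r15;
};

/* Same integer division the patch uses to compute "max". */
static size_t pebs_max_records(size_t buffer_size, size_t record_size)
{
	return buffer_size / record_size;
}
```

Under these assumptions the default buffer holds 455 records, while a
PAGE_SIZE buffer holds 28.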

* [tip:perf/core] perf/x86/intel: Use PAGE_SIZE for PEBS buffer size on Core2
  2016-03-01 19:03               ` [PATCH] perf x86: Use PAGE_SIZE for PEBS buffer size on Core2 Jiri Olsa
@ 2016-03-08 13:15                 ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot for Jiri Olsa @ 2016-03-08 13:15 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: eranian, vincent.weaver, stable, alexander.shishkin,
	linux-kernel, ak, hpa, acme, mingo, peterz, torvalds, wangnan0,
	tglx, jolsa, kan.liang, jolsa

Commit-ID:  e72daf3f4d764c47fb71c9bdc7f9c54a503825b1
Gitweb:     http://git.kernel.org/tip/e72daf3f4d764c47fb71c9bdc7f9c54a503825b1
Author:     Jiri Olsa <jolsa@redhat.com>
AuthorDate: Tue, 1 Mar 2016 20:03:52 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 8 Mar 2016 12:18:32 +0100

perf/x86/intel: Use PAGE_SIZE for PEBS buffer size on Core2

Using buffers larger than PAGE_SIZE makes the WRMSR to PERF_GLOBAL_CTRL
in intel_pmu_enable_all() mysteriously hang on Core2. As a workaround,
limit the PEBS buffer to PAGE_SIZE there.

The hard lockup is easily triggered by running 'perf test attr'
repeatedly. Most of the time it gets stuck on a sample session with
small periods.

  # perf test attr -vv
  14: struct perf_event_attr setup                             :
  --- start ---
  ...
    'PERF_TEST_ATTR=/tmp/tmpuEKz3B /usr/bin/perf record -o /tmp/tmpuEKz3B/perf.data -c 123 kill >/dev/null 2>&1' ret 1

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20160301190352.GA8355@krava.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/ds.c   | 13 +++++++++++--
 arch/x86/events/perf_event.h |  1 +
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c8a243d..22ece02 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -269,7 +269,7 @@ static int alloc_pebs_buffer(int cpu)
 	if (!x86_pmu.pebs)
 		return 0;
 
-	buffer = kzalloc_node(PEBS_BUFFER_SIZE, GFP_KERNEL, node);
+	buffer = kzalloc_node(x86_pmu.pebs_buffer_size, GFP_KERNEL, node);
 	if (unlikely(!buffer))
 		return -ENOMEM;
 
@@ -286,7 +286,7 @@ static int alloc_pebs_buffer(int cpu)
 		per_cpu(insn_buffer, cpu) = ibuffer;
 	}
 
-	max = PEBS_BUFFER_SIZE / x86_pmu.pebs_record_size;
+	max = x86_pmu.pebs_buffer_size / x86_pmu.pebs_record_size;
 
 	ds->pebs_buffer_base = (u64)(unsigned long)buffer;
 	ds->pebs_index = ds->pebs_buffer_base;
@@ -1319,6 +1319,7 @@ void __init intel_ds_init(void)
 
 	x86_pmu.bts  = boot_cpu_has(X86_FEATURE_BTS);
 	x86_pmu.pebs = boot_cpu_has(X86_FEATURE_PEBS);
+	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
 	if (x86_pmu.pebs) {
 		char pebs_type = x86_pmu.intel_cap.pebs_trap ?  '+' : '-';
 		int format = x86_pmu.intel_cap.pebs_format;
@@ -1327,6 +1328,14 @@ void __init intel_ds_init(void)
 		case 0:
 			pr_cont("PEBS fmt0%c, ", pebs_type);
 			x86_pmu.pebs_record_size = sizeof(struct pebs_record_core);
+			/*
+			 * Using >PAGE_SIZE buffers makes the WRMSR to
+			 * PERF_GLOBAL_CTRL in intel_pmu_enable_all()
+			 * mysteriously hang on Core2.
+			 *
+			 * As a workaround, we don't do this.
+			 */
+			x86_pmu.pebs_buffer_size = PAGE_SIZE;
 			x86_pmu.drain_pebs = intel_pmu_drain_pebs_core;
 			break;
 
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 7bb61e3..1ab6279 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -586,6 +586,7 @@ struct x86_pmu {
 			pebs_broken	:1,
 			pebs_prec_dist	:1;
 	int		pebs_record_size;
+	int		pebs_buffer_size;
 	void		(*drain_pebs)(struct pt_regs *regs);
 	struct event_constraint *pebs_constraints;
 	void		(*pebs_aliases)(struct perf_event *event);

Thread overview: 18+ messages
2016-02-27 12:37 [BUG] Core2 cpu triggers hard lockup with perf test Jiri Olsa
2016-02-27 14:48 ` Peter Zijlstra
2016-02-27 15:46 ` Andi Kleen
2016-02-29 22:12 ` Liang, Kan
2016-03-01  6:55   ` Jiri Olsa
2016-03-01  9:17   ` Peter Zijlstra
2016-03-01 11:06     ` Jiri Olsa
2016-03-01 11:20       ` Peter Zijlstra
2016-03-01 14:51       ` Andi Kleen
2016-03-01 14:59         ` Peter Zijlstra
2016-03-01 17:17           ` Jiri Olsa
2016-03-01 17:32             ` Andi Kleen
2016-03-01 17:49             ` Peter Zijlstra
2016-03-01 18:04               ` Jiri Olsa
2016-03-01 18:14                 ` Peter Zijlstra
2016-03-01 18:12             ` Peter Zijlstra
2016-03-01 19:03               ` [PATCH] perf x86: Use PAGE_SIZE for PEBS buffer size on Core2 Jiri Olsa
2016-03-08 13:15                 ` [tip:perf/core] perf/x86/intel: " tip-bot for Jiri Olsa
