linux-kernel.vger.kernel.org archive mirror
* perf: WARNING perfevents: irq loop stuck!
@ 2015-04-30 21:04 Vince Weaver
  2015-05-01  7:02 ` Ingo Molnar
  0 siblings, 1 reply; 7+ messages in thread
From: Vince Weaver @ 2015-04-30 21:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar,
	Paul Mackerras


So this is just a warning, and I've reported it before, but the 
perf_fuzzer triggers this fairly regularly on my Haswell system.

It looks like fixed counter 0 (retired instructions) being set to
0000fffffffffffe occasionally causes an irq loop storm and gets stuck 
until the PMU state is cleared.

[ 8224.179407] ------------[ cut here ]------------
[ 8224.184368] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/perf_event_intel.c:1602 intel_pmu_handle_irq+0x2bc/0x450()
[ 8224.195686] perfevents: irq loop stuck!
[ 8224.199835] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul snd_hda_codec_hdmi cr$
[ 8224.273188] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.0.0+ #136
[ 8224.281012] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 8224.288988]  ffffffff81a093a8 ffff88011ea05b50 ffffffff816d5639 0000000000000000
[ 8224.297058]  ffff88011ea05ba0 ffff88011ea05b90 ffffffff81072e4a ffff880119b87090
[ 8224.306453]  0000000000000064 ffff88011ea0bd80 ffff8800c38c7800 0000000000000040
[ 8224.315894] Call Trace:
[ 8224.319886]  <NMI>  [<ffffffff816d5639>] dump_stack+0x45/0x57
[ 8224.327439]  [<ffffffff81072e4a>] warn_slowpath_common+0x8a/0xc0
[ 8224.335238]  [<ffffffff81072ec6>] warn_slowpath_fmt+0x46/0x50
[ 8224.342812]  [<ffffffff8103399c>] intel_pmu_handle_irq+0x2bc/0x450
[ 8224.350830]  [<ffffffff81164215>] ? perf_event_task_tick+0x1d5/0x2b0
[ 8224.359025]  [<ffffffff8102a40b>] perf_event_nmi_handler+0x2b/0x50
[ 8224.367069]  [<ffffffff810185c0>] nmi_handle+0xa0/0x150
[ 8224.374067]  [<ffffffff81018525>] ? nmi_handle+0x5/0x150
[ 8224.381169]  [<ffffffff8101886a>] default_do_nmi+0x4a/0x140
[ 8224.388487]  [<ffffffff810189f8>] do_nmi+0x98/0xe0
[ 8224.394886]  [<ffffffff816df8af>] end_repeat_nmi+0x1e/0x2e
[ 8224.402082]  [<ffffffff8105bffa>] ? native_write_msr_safe+0xa/0x10
[ 8224.409958]  [<ffffffff8105bffa>] ? native_write_msr_safe+0xa/0x10
[ 8224.417785]  [<ffffffff8105bffa>] ? native_write_msr_safe+0xa/0x10
[ 8224.425536]  <<EOE>>  <IRQ>  [<ffffffff8103356e>] intel_pmu_enable_event+0x20e/0x230
[ 8224.435001]  [<ffffffff8102b721>] x86_pmu_start+0x81/0x120
[ 8224.442007]  [<ffffffff8102bf25>] x86_pmu_enable+0x295/0x310
[ 8224.449150]  [<ffffffff811621aa>] perf_pmu_enable+0x2a/0x30
[ 8224.456173]  [<ffffffff81029fe8>] x86_pmu_commit_txn+0x78/0xa0
[ 8224.463561]  [<ffffffff81427f6b>] ? debug_object_activate+0x14b/0x1e0
[ 8224.471591]  [<ffffffff810bdde9>] ? __lock_acquire.isra.31+0x3b9/0x1000
[ 8224.479715]  [<ffffffff81427f6b>] ? debug_object_activate+0x14b/0x1e0
[ 8224.487676]  [<ffffffff810bcba4>] ? __lock_is_held+0x54/0x80
[ 8224.494794]  [<ffffffff81162d50>] ? event_sched_in.isra.85+0x190/0x330
[ 8224.502894]  [<ffffffff811630a8>] group_sched_in+0x1b8/0x1e0
[ 8224.510082]  [<ffffffff8101df1a>] ? native_sched_clock+0x2a/0x90
[ 8224.517646]  [<ffffffff811637bc>] __perf_event_enable+0x25c/0x2a0
[ 8224.525297]  [<ffffffff810f3e69>] ? tick_nohz_irq_exit+0x29/0x30
[ 8224.532776]  [<ffffffff8115ef30>] ? cpu_clock_event_start+0x40/0x40
[ 8224.540595]  [<ffffffff8115ef80>] remote_function+0x50/0x60
[ 8224.547668]  [<ffffffff810f8cd1>] flush_smp_call_function_queue+0x81/0x180
[ 8224.556126]  [<ffffffff810f9763>] generic_smp_call_function_single_interrupt+0x13/0x60
[ 8224.565749]  [<ffffffff8104b5c8>] smp_trace_call_function_single_interrupt+0x38/0xc0
[ 8224.575132]  [<ffffffff816de9be>] trace_call_function_single_interrupt+0x6e/0x80
[ 8224.584168]  <EOI>  [<ffffffff8156e8b5>] ? cpuidle_enter_state+0x65/0x160
[ 8224.592646]  [<ffffffff8156e8a1>] ? cpuidle_enter_state+0x51/0x160
[ 8224.600408]  [<ffffffff8156e9e7>] cpuidle_enter+0x17/0x20
[ 8224.607285]  [<ffffffff810b6e39>] cpu_startup_entry+0x399/0x440
[ 8224.614712]  [<ffffffff816c9ddb>] rest_init+0xbb/0xd0
[ 8224.621283]  [<ffffffff81d46f74>] start_kernel+0x44e/0x45b
[ 8224.628212]  [<ffffffff81d46120>] ? early_idt_handlers+0x120/0x120
[ 8224.635980]  [<ffffffff81d464d7>] x86_64_start_reservations+0x2a/0x2c
[ 8224.643999]  [<ffffffff81d46614>] x86_64_start_kernel+0x13b/0x14a
[ 8224.651691] ---[ end trace 5679ca0875946dba ]---
[ 8224.657764]
[ 8224.660447] CPU#0: ctrl:       0000000000000000
[ 8224.666448] CPU#0: status:     0000000000000000
[ 8224.672414] CPU#0: overflow:   0000000000000000
[ 8224.678373] CPU#0: fixed:      00000000000000b9
[ 8224.684338] CPU#0: pebs:       0000000000000000
[ 8224.690322] CPU#0: debugctl:   0000000000000000
[ 8224.696253] CPU#0: active:     0000000300000000
[ 8224.702232] CPU#0:   gen-PMC0 ctrl:  00000000108001c2
[ 8224.708823] CPU#0:   gen-PMC0 count: 0000000000000001
[ 8224.715371] CPU#0:   gen-PMC0 left:  0000ffffffffffff
[ 8224.721914] CPU#0:   gen-PMC1 ctrl:  00000000001382d0
[ 8224.728402] CPU#0:   gen-PMC1 count: 0000000000ff3792
[ 8224.734849] CPU#0:   gen-PMC1 left:  0000ffffff012119
[ 8224.741271] CPU#0:   gen-PMC2 ctrl:  00000003ff96764b
[ 8224.747600] CPU#0:   gen-PMC2 count: 0000000000000001
[ 8224.753884] CPU#0:   gen-PMC2 left:  0000ffffffffffff
[ 8224.760100] CPU#0:   gen-PMC3 ctrl:  0000000000131acd
[ 8224.766320] CPU#0:   gen-PMC3 count: 000000000000288a
[ 8224.772585] CPU#0:   gen-PMC3 left:  0000ffffffffffff
[ 8224.778769] CPU#0: fixed-PMC0 count: 0000fffffffffffe
[ 8224.784973] CPU#0: fixed-PMC1 count: 0000fffda1877268
[ 8224.791196] CPU#0: fixed-PMC2 count: 0000000000f222eb
[ 8224.797450] perf_event_intel: clearing PMU state on CPU#0
[ 8224.804063] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 624.630 msecs
[ 8224.813789] perf interrupt took too long (4880480 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 8224.825400] perf interrupt took too long (4842512 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[ 8225.007411] perf interrupt took too long (4804688 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
[ 8225.244832] perf interrupt took too long (4767159 > 20000), lowering kernel.perf_event_max_sample_rate to 6250
[ 8225.295535] perf interrupt took too long (4729923 > 38461), lowering kernel.perf_event_max_sample_rate to 3250
[ 8225.747314] perf interrupt took too long (4692980 > 71428), lowering kernel.perf_event_max_sample_rate to 1750
...
[202220.612133] perf_event_intel: clearing PMU state on CPU#6
[202220.665533] perf_event_intel: clearing PMU state on CPU#6
[202220.674658] perf_event_intel: clearing PMU state on CPU#6
[202220.734200] perf_event_intel: clearing PMU state on CPU#6
[202220.966526] perf_event_intel: clearing PMU state on CPU#6
[202220.976874] perf_event_intel: clearing PMU state on CPU#6
[202222.950479] perf_event_intel: clearing PMU state on CPU#6
[202223.183925] perf_event_intel: clearing PMU state on CPU#6
[212335.610297] perf_event_intel: clearing PMU state on CPU#2
[212335.922425] perf_event_intel: clearing PMU state on CPU#2
[212336.014458] perf_event_intel: clearing PMU state on CPU#2
[212336.258568] perf_event_intel: clearing PMU state on CPU#2
[226783.599841] perf_event_intel: clearing PMU state on CPU#0
[269662.113390] perf_event_intel: clearing PMU state on CPU#4
[401640.647353] perf_event_intel: clearing PMU state on CPU#3
[409463.976440] perf_event_intel: clearing PMU state on CPU#6
[409464.172527] perf_event_intel: clearing PMU state on CPU#6
[409464.692722] perf_event_intel: clearing PMU state on CPU#6
[422018.413829] perf_event_intel: clearing PMU state on CPU#1
[422018.458279] perf_event_intel: clearing PMU state on CPU#1
[422018.496802] perf_event_intel: clearing PMU state on CPU#1
[422018.510058] perf_event_intel: clearing PMU state on CPU#1
[422018.523668] perf_event_intel: clearing PMU state on CPU#1
[422018.549935] perf_event_intel: clearing PMU state on CPU#1
[422019.274748] perf_event_intel: clearing PMU state on CPU#1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: perf: WARNING perfevents: irq loop stuck!
  2015-04-30 21:04 perf: WARNING perfevents: irq loop stuck! Vince Weaver
@ 2015-05-01  7:02 ` Ingo Molnar
  2015-05-01 17:20   ` Vince Weaver
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Ingo Molnar @ 2015-05-01  7:02 UTC (permalink / raw)
  To: Vince Weaver
  Cc: linux-kernel, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Jiri Olsa, Ingo Molnar, Paul Mackerras


* Vince Weaver <vincent.weaver@maine.edu> wrote:

> So this is just a warning, and I've reported it before, but the 
> perf_fuzzer triggers this fairly regularly on my Haswell system.
> 
> It looks like fixed counter 0 (retired instructions) being set to 
> 0000fffffffffffe occasionally causes an irq loop storm and gets 
> stuck until the PMU state is cleared.

So 0000fffffffffffe corresponds to 2 events left until overflow, 
right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we 
allow these super short periods.

Maybe like on Broadwell we need a quirk on Nehalem/Haswell as well, 
one similar to bdw_limit_period()? Something like the patch below?

Totally untested and such. I picked 128 because of Broadwell, but 
lower values might work as well. You could try to increase it to 3 and 
upwards and see which one stops triggering stuck NMI loops?

Thanks,

	Ingo

Signed-off-by: Ingo Molnar <mingo@kernel.org>

---
 arch/x86/kernel/cpu/perf_event_intel.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 960e85de13fb..26b13ea8299c 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2479,6 +2479,15 @@ hsw_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
 
 	return c;
 }
+/*
+ * Really short periods might create infinite PMC NMI loops on Haswell,
+ * so limit them to 128. There's no official erratum for this AFAIK.
+ */
+static unsigned int hsw_limit_period(struct perf_event *event, unsigned int left)
+{
+	return max(left, 128U);
+}
+
 
 /*
  * Broadwell:
@@ -2495,7 +2504,7 @@ hsw_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
  * Therefore the effective (average) period matches the requested period,
  * despite coarser hardware granularity.
  */
-static unsigned bdw_limit_period(struct perf_event *event, unsigned left)
+static unsigned int bdw_limit_period(struct perf_event *event, unsigned left)
 {
 	if ((event->hw.config & INTEL_ARCH_EVENT_MASK) ==
 			X86_CONFIG(.event=0xc0, .umask=0x01)) {
@@ -3265,6 +3274,7 @@ __init int intel_pmu_init(void)
 		x86_pmu.hw_config = hsw_hw_config;
 		x86_pmu.get_event_constraints = hsw_get_event_constraints;
 		x86_pmu.cpu_events = hsw_events_attrs;
+		x86_pmu.limit_period = hsw_limit_period;
 		x86_pmu.lbr_double_abort = true;
 		pr_cont("Haswell events, ");
 		break;


* Re: perf: WARNING perfevents: irq loop stuck!
  2015-05-01  7:02 ` Ingo Molnar
@ 2015-05-01 17:20   ` Vince Weaver
  2015-05-08  4:22   ` Vince Weaver
  2015-05-08  7:55   ` Ingo Molnar
  2 siblings, 0 replies; 7+ messages in thread
From: Vince Weaver @ 2015-05-01 17:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Vince Weaver, linux-kernel, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Paul Mackerras

On Fri, 1 May 2015, Ingo Molnar wrote:

> 
> * Vince Weaver <vincent.weaver@maine.edu> wrote:
> 
> > So this is just a warning, and I've reported it before, but the 
> > perf_fuzzer triggers this fairly regularly on my Haswell system.
> > 
> > It looks like fixed counter 0 (retired instructions) being set to 
> > 0000fffffffffffe occasionally causes an irq loop storm and gets 
> > stuck until the PMU state is cleared.
> 
> So 0000fffffffffffe corresponds to 2 events left until overflow, 
> right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we 
> allow these super short periods.
> 
> Maybe like on Broadwell we need a quirk on Nehalem/Haswell as well, 
> one similar to bdw_limit_period()? Something like the patch below?

I spent the morning trying to get a reproducer for this.  It turns out to 
be complex.  It seems in addition to fixed counter 0 being set to -2, at 
least one other non-fixed counter must be about to overflow.

For example, in this case gen-PMC2 is also poised to overflow at the same 
time.

CPU#0:   gen-PMC2 ctrl:  00000003ff96764b
CPU#0:   gen-PMC2 count: 0000000000000001
CPU#0:   gen-PMC2 left:  0000ffffffffffff
...
[ 2408.612442] CPU#0: fixed-PMC0 count: 0000fffffffffffe


It's not always PMC2 but in the warnings there's at least one other 
gen-PMC about to overflow at the exact same time as the fixed one.

Vince


* Re: perf: WARNING perfevents: irq loop stuck!
  2015-05-01  7:02 ` Ingo Molnar
  2015-05-01 17:20   ` Vince Weaver
@ 2015-05-08  4:22   ` Vince Weaver
  2015-05-08  7:53     ` Ingo Molnar
  2015-05-08  7:55   ` Ingo Molnar
  2 siblings, 1 reply; 7+ messages in thread
From: Vince Weaver @ 2015-05-08  4:22 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Vince Weaver, linux-kernel, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Paul Mackerras

On Fri, 1 May 2015, Ingo Molnar wrote:

> So 0000fffffffffffe corresponds to 2 events left until overflow, 
> right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we 
> allow these super short periods.
> 
> Maybe like on Broadwell we need a quirk on Nehalem/Haswell as well, 
> one similar to bdw_limit_period()? Something like the patch below?
> 
> Totally untested and such. I picked 128 because of Broadwell, but 
> lower values might work as well. You could try to increase it to 3 and 
> upwards and see which one stops triggering stuck NMI loops?

I spent a lot of time trying to come up with a test case that triggered 
this more reliably but failed.

It definitely is an issue with PMC0 being -2 causing the PMC0 bit in the 
status register to get stuck and never clear.  Often there is also a PEBS 
event active at the same time, but that might be coincidence.

With your patch applied I can't trigger the issue. I haven't tried 
narrowing down the exact value yet.

Vince


* Re: perf: WARNING perfevents: irq loop stuck!
  2015-05-08  4:22   ` Vince Weaver
@ 2015-05-08  7:53     ` Ingo Molnar
  0 siblings, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2015-05-08  7:53 UTC (permalink / raw)
  To: Vince Weaver
  Cc: linux-kernel, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Jiri Olsa, Ingo Molnar, Paul Mackerras


* Vince Weaver <vincent.weaver@maine.edu> wrote:

> On Fri, 1 May 2015, Ingo Molnar wrote:
> 
> > So 0000fffffffffffe corresponds to 2 events left until overflow, 
> > right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we 
> > allow these super short periods.
> > 
> > Maybe like on Broadwell we need a quirk on Nehalem/Haswell as well, 
> > one similar to bdw_limit_period()? Something like the patch below?
> > 
> > Totally untested and such. I picked 128 because of Broadwell, but 
> > lower values might work as well. You could try to increase it to 3 and 
> > upwards and see which one stops triggering stuck NMI loops?
> 
> I spent a lot of time trying to come up with a test case that 
> triggered this more reliably but failed.
> 
> It definitely is an issue with PMC0 being -2 causing the PMC0 bit in 
> the status register to get stuck and never clear.  Often there is 
> also a PEBS event active at the same time, but that might be 
> coincidence.
> 
> With your patch applied I can't trigger the issue. I haven't tried 
> narrowing down the exact value yet.

So how about I change it from 128U to 2U and apply it upstream?

I.e. use the minimal threshold that we have observed to cause 
problems. That way, should it ever trigger in different circumstances, 
we'll eventually see it ourselves or hear about it.

Thanks,

	Ingo


* Re: perf: WARNING perfevents: irq loop stuck!
  2015-05-01  7:02 ` Ingo Molnar
  2015-05-01 17:20   ` Vince Weaver
  2015-05-08  4:22   ` Vince Weaver
@ 2015-05-08  7:55   ` Ingo Molnar
  2015-05-18 17:48     ` Vince Weaver
  2 siblings, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2015-05-08  7:55 UTC (permalink / raw)
  To: Vince Weaver
  Cc: linux-kernel, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Jiri Olsa, Ingo Molnar, Paul Mackerras


* Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Vince Weaver <vincent.weaver@maine.edu> wrote:
> 
> > So this is just a warning, and I've reported it before, but the 
> > perf_fuzzer triggers this fairly regularly on my Haswell system.
> > 
> > It looks like fixed counter 0 (retired instructions) being set to 
> > 0000fffffffffffe occasionally causes an irq loop storm and gets 
> > stuck until the PMU state is cleared.
> 
> So 0000fffffffffffe corresponds to 2 events left until overflow, 
> right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we 
> allow these super short periods.
> 
> Maybe like on Broadwell we need a quirk on Nehalem/Haswell as well, 
> one similar to bdw_limit_period()? Something like the patch below?
> 
> Totally untested and such. I picked 128 because of Broadwell, but 
> lower values might work as well. You could try to increase it to 3 and 
> upwards and see which one stops triggering stuck NMI loops?
> 
> Thanks,
> 
> 	Ingo
> 
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> 
> ---
>  arch/x86/kernel/cpu/perf_event_intel.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 960e85de13fb..26b13ea8299c 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -2479,6 +2479,15 @@ hsw_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>  
>  	return c;
>  }
> +/*
> + * Really short periods might create infinite PMC NMI loops on Haswell,
> + * so limit them to 128. There's no official erratum for this AFAIK.
> + */
> +static unsigned int hsw_limit_period(struct perf_event *event, unsigned int left)
> +{
> +	return max(left, 128U);
> +}
> +
>  
>  /*
>   * Broadwell:
> @@ -2495,7 +2504,7 @@ hsw_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>   * Therefore the effective (average) period matches the requested period,
>   * despite coarser hardware granularity.
>   */
> -static unsigned bdw_limit_period(struct perf_event *event, unsigned left)
> +static unsigned int bdw_limit_period(struct perf_event *event, unsigned left)
>  {
>  	if ((event->hw.config & INTEL_ARCH_EVENT_MASK) ==
>  			X86_CONFIG(.event=0xc0, .umask=0x01)) {
> @@ -3265,6 +3274,7 @@ __init int intel_pmu_init(void)
>  		x86_pmu.hw_config = hsw_hw_config;
>  		x86_pmu.get_event_constraints = hsw_get_event_constraints;
>  		x86_pmu.cpu_events = hsw_events_attrs;
> +		x86_pmu.limit_period = hsw_limit_period;
>  		x86_pmu.lbr_double_abort = true;
>  		pr_cont("Haswell events, ");
>  		break;

Also, I'd apply the quirk not just to Haswell but to Nehalem, Westmere 
and Ivy Bridge as well; I have seen it as early as on a Nehalem 
prototype box.

Thanks,

	Ingo


* Re: perf: WARNING perfevents: irq loop stuck!
  2015-05-08  7:55   ` Ingo Molnar
@ 2015-05-18 17:48     ` Vince Weaver
  0 siblings, 0 replies; 7+ messages in thread
From: Vince Weaver @ 2015-05-18 17:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Vince Weaver, linux-kernel, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Paul Mackerras

On Fri, 8 May 2015, Ingo Molnar wrote:

> 
> * Ingo Molnar <mingo@kernel.org> wrote:
> 
> > 
> > * Vince Weaver <vincent.weaver@maine.edu> wrote:
> > 
> > > So this is just a warning, and I've reported it before, but the 
> > > perf_fuzzer triggers this fairly regularly on my Haswell system.
> > > 
> > > It looks like fixed counter 0 (retired instructions) being set to 
> > > 0000fffffffffffe occasionally causes an irq loop storm and gets 
> > > stuck until the PMU state is cleared.
> > 
> > So 0000fffffffffffe corresponds to 2 events left until overflow, 
> > right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we 
> > allow these super short periods.
> > 
> > Maybe like on Broadwell we need a quirk on Nehalem/Haswell as well, 
> > one similar to bdw_limit_period()? Something like the patch below?
> > 
> > Totally untested and such. I picked 128 because of Broadwell, but 
> > lower values might work as well. You could try to increase it to 3 and 
> > upwards and see which one stops triggering stuck NMI loops?
> > 
> > Thanks,
> > 
> > 	Ingo
> > 
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > 
> > ---
> >  arch/x86/kernel/cpu/perf_event_intel.c | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> > index 960e85de13fb..26b13ea8299c 100644
> > --- a/arch/x86/kernel/cpu/perf_event_intel.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> > @@ -2479,6 +2479,15 @@ hsw_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
> >  
> >  	return c;
> >  }
> > +/*
> > + * Really short periods might create infinite PMC NMI loops on Haswell,
> > + * so limit them to 128. There's no official erratum for this AFAIK.
> > + */
> > +static unsigned int hsw_limit_period(struct perf_event *event, unsigned int left)
> > +{
> > +	return max(left, 128U);
> > +}
> > +
> >  
> >  /*
> >   * Broadwell:
> > @@ -2495,7 +2504,7 @@ hsw_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
> >   * Therefore the effective (average) period matches the requested period,
> >   * despite coarser hardware granularity.
> >   */
> > -static unsigned bdw_limit_period(struct perf_event *event, unsigned left)
> > +static unsigned int bdw_limit_period(struct perf_event *event, unsigned left)
> >  {
> >  	if ((event->hw.config & INTEL_ARCH_EVENT_MASK) ==
> >  			X86_CONFIG(.event=0xc0, .umask=0x01)) {
> > @@ -3265,6 +3274,7 @@ __init int intel_pmu_init(void)
> >  		x86_pmu.hw_config = hsw_hw_config;
> >  		x86_pmu.get_event_constraints = hsw_get_event_constraints;
> >  		x86_pmu.cpu_events = hsw_events_attrs;
> > +		x86_pmu.limit_period = hsw_limit_period;
> >  		x86_pmu.lbr_double_abort = true;
> >  		pr_cont("Haswell events, ");
> >  		break;
> 
> Also, I'd apply the quirk not just to Haswell, but Nehalem, Westmere 
> and Ivy Bridge as well, I have seen it as early as on a Nehalem 
> prototype box.

So at the suggestion of Andi Kleen I did some tests to see whether this 
was related to Haswell erratum HSD143:
	Fixed-Function Performance Counter May Over Count Instructions
	Retired by 32 When Intel Hyper-Threading Technology is Enabled

and indeed the problem seemed to go away if I disabled hyperthreading.

However, a patch implementing the Intel-suggested workaround for that 
erratum, programming the FIXED_CTR_CTRL_MSR only after the 
GLOBAL_CTRL_MSR is set, did not fix the issue (once I re-enabled 
hyperthreading on the machine).

Vince


end of thread, other threads:[~2015-05-18 17:42 UTC | newest]

Thread overview: 7+ messages
2015-04-30 21:04 perf: WARNING perfevents: irq loop stuck! Vince Weaver
2015-05-01  7:02 ` Ingo Molnar
2015-05-01 17:20   ` Vince Weaver
2015-05-08  4:22   ` Vince Weaver
2015-05-08  7:53     ` Ingo Molnar
2015-05-08  7:55   ` Ingo Molnar
2015-05-18 17:48     ` Vince Weaver
