linux-kernel.vger.kernel.org archive mirror
* [PATCH] perf/x86: fix spurious NMI with PEBS Load Latency event
@ 2017-04-04 17:52 kan.liang
  2017-04-04 18:23 ` Stephane Eranian
  0 siblings, 1 reply; 2+ messages in thread
From: kan.liang @ 2017-04-04 17:52 UTC (permalink / raw)
  To: mingo, peterz, linux-kernel; +Cc: eranian, ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

Spurious NMIs are observed when running the following command:
   while true ; do sudo perf record -b -a -e
   "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp,cpu/umask=0x03,event=0x0/,
    cpu/umask=0x02,event=0x0/,cycles,branches,cache-misses,
    cache-references" -- sleep 10 ; done

The issue was introduced by
commit 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status
on HSW+")

The previous patch clears the status bits for the counters used for
PEBS events by masking with the whole 64-bit pebs_enabled.
However, only the lower 32 bits of both status and pebs_enabled are
reserved for PEBS-able counters.
For status, the first three bits of the upper 32 bits are the fixed
counter overflow bits.
For pebs_enabled, the first three bits of the upper 32 bits are for the
PEBS Load Latency event.
In the test case, the PEBS Load Latency event and a fixed counter event
can overflow at the same time. The fixed counter overflow bit is then
cleared by mistake. Once it is cleared, the fixed counter overflow is
never processed, which finally triggers a spurious NMI.

Correct the PEBS enabled mask by ignoring the non-PEBS bits.

Fixes: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL
status on HSW+")
Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 arch/x86/events/intel/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
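
The masking problem described in the commit message can be illustrated
outside the kernel. The following is only a user-space sketch, not kernel
code: the value of MAX_PEBS_EVENTS is assumed for illustration, and
mask_status_old(), mask_status_new() and check_fixed_overflow_survives()
are hypothetical names standing in for the single line changed below.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the kernel defines MAX_PEBS_EVENTS itself. */
#define MAX_PEBS_EVENTS 8

/* Pre-patch masking: every bit set in pebs_enabled is dropped from status. */
static uint64_t mask_status_old(uint64_t status, uint64_t pebs_enabled)
{
	return status & ~pebs_enabled;
}

/* Patched masking: only the low MAX_PEBS_EVENTS counter bits are dropped. */
static uint64_t mask_status_new(uint64_t status, uint64_t pebs_enabled)
{
	return status & ~(pebs_enabled & ((1ULL << MAX_PEBS_EVENTS) - 1));
}

/* Hypothetical self-check: counter 0 and fixed counter 0 overflow together. */
static void check_fixed_overflow_survives(void)
{
	/* PEBS enabled on counter 0, plus its load latency enable bit (bit 32). */
	uint64_t pebs_enabled = (1ULL << 0) | (1ULL << 32);
	/* Overflow of counter 0 and of fixed counter 0 (bit 32 of status). */
	uint64_t status = (1ULL << 0) | (1ULL << 32);

	/* Old code: the fixed counter overflow bit is lost with the PEBS bit. */
	assert(mask_status_old(status, pebs_enabled) == 0);
	/* New code: only the PEBS counter bit is removed, bit 32 is kept. */
	assert(mask_status_new(status, pebs_enabled) == (1ULL << 32));
}
```

With the old masking the fixed counter overflow disappears from status, so
the handler never acknowledges it; with the patched masking bit 32 is kept
and is handled in the normal overflow path.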

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 319da60..5b69787 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2151,7 +2151,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
 	 * counters from the GLOBAL_STATUS mask and we always process PEBS
 	 * events via drain_pebs().
 	 */
-	status &= ~cpuc->pebs_enabled;
+	status &= ~(cpuc->pebs_enabled & ((1ULL << MAX_PEBS_EVENTS) - 1));
 
 	/*
 	 * PEBS overflow sets bit 62 in the global status register
-- 
2.4.3


* Re: [PATCH] perf/x86: fix spurious NMI with PEBS Load Latency event
  2017-04-04 17:52 [PATCH] perf/x86: fix spurious NMI with PEBS Load Latency event kan.liang
@ 2017-04-04 18:23 ` Stephane Eranian
  0 siblings, 0 replies; 2+ messages in thread
From: Stephane Eranian @ 2017-04-04 18:23 UTC (permalink / raw)
  To: Liang, Kan; +Cc: mingo, Peter Zijlstra, LKML, ak

On Tue, Apr 4, 2017 at 10:52 AM,  <kan.liang@intel.com> wrote:
> From: Kan Liang <kan.liang@intel.com>
>
> Spurious NMIs are observed when running the following command:
>    while true ; do sudo perf record -b -a -e
>    "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp,cpu/umask=0x03,event=0x0/,
>     cpu/umask=0x02,event=0x0/,cycles,branches,cache-misses,
>     cache-references" -- sleep 10 ; done
>
> The issue was introduced by
> commit 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL status
> on HSW+")
>
> The previous patch clears the status bits for the counters used for
> PEBS events by masking with the whole 64-bit pebs_enabled.
> However, only the lower 32 bits of both status and pebs_enabled are
> reserved for PEBS-able counters.
> For status, the first three bits of the upper 32 bits are the fixed
> counter overflow bits.
> For pebs_enabled, the first three bits of the upper 32 bits are for the
> PEBS Load Latency event.
> In the test case, the PEBS Load Latency event and a fixed counter event
> can overflow at the same time. The fixed counter overflow bit is then
> cleared by mistake. Once it is cleared, the fixed counter overflow is
> never processed, which finally triggers a spurious NMI.
>
> Correct the PEBS enabled mask by ignoring the non-PEBS bits.
>
Yes, the patch is correct: bits 32-35 are used for load latency and bit 63
for precise stores. So yes, the clearing must only look at the bottom 4 bits.

I would suggest using a macro for ((1ULL << MAX_PEBS_EVENTS) - 1),
as it is used in at least 2 other places. That would make this cleaner and
more explicit: PEBS_COUNTER_MASK, for instance.
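
Something along these lines, as a rough sketch only: the value of
MAX_PEBS_EVENTS below is assumed for illustration, and
drop_pebs_counters() is a hypothetical helper standing in for the line in
intel_pmu_handle_irq(), not existing kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the kernel supplies its own definition. */
#define MAX_PEBS_EVENTS 8

/* Suggested name: one bit per PEBS-capable general-purpose counter. */
#define PEBS_COUNTER_MASK ((1ULL << MAX_PEBS_EVENTS) - 1)

/* The mask must cover exactly the low MAX_PEBS_EVENTS bits. */
static_assert(PEBS_COUNTER_MASK == 0xffULL,
	      "PEBS_COUNTER_MASK does not match MAX_PEBS_EVENTS");

/* Hypothetical helper: remove only PEBS counter bits from the status word. */
static inline uint64_t drop_pebs_counters(uint64_t status, uint64_t pebs_enabled)
{
	return status & ~(pebs_enabled & PEBS_COUNTER_MASK);
}
```

The other places that open-code the same expression could then use
PEBS_COUNTER_MASK as well, so the width of the PEBS counter field is
defined once.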

> Fixes: 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL
> status on HSW+")
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> ---
>  arch/x86/events/intel/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 319da60..5b69787 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2151,7 +2151,7 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
>          * counters from the GLOBAL_STATUS mask and we always process PEBS
>          * events via drain_pebs().
>          */
> -       status &= ~cpuc->pebs_enabled;
> +       status &= ~(cpuc->pebs_enabled & ((1ULL << MAX_PEBS_EVENTS) - 1));
>
>         /*
>          * PEBS overflow sets bit 62 in the global status register
> --
> 2.4.3
>
