[PATCH] perf/x86: Fix overlap counter scheduling bug

From: Jiri Olsa @ 2016-11-01 15:44 UTC
  To: Peter Zijlstra
  Cc: Vince Weaver, Robert Richter, Yan Zheng, lkml, Ingo Molnar, Andi Kleen

My fuzzer testing hits the following warning in the counter scheduling code:

  WARNING: CPU: 0 PID: 0 at arch/x86/events/core.c:718 perf_assign_events+0x2ae/0x2c0
  Call Trace:
   <IRQ>
   dump_stack+0x68/0x93
   __warn+0xcb/0xf0
   warn_slowpath_null+0x1d/0x20
   perf_assign_events+0x2ae/0x2c0
   uncore_assign_events+0x1a7/0x250 [intel_uncore]
   uncore_pmu_event_add+0x7a/0x3c0 [intel_uncore]
   event_sched_in.isra.104+0xf6/0x2e0
   group_sched_in+0x6e/0x190
   ...

The reason is that the counter scheduling code assumes that
overlap constraints have a mask weight <= SCHED_STATES_MAX.

This assumption is broken by the uncore cbox constraint
added for snbep in:
  3b19e4c98c03 perf/x86: Fix event constraint for SandyBridge-EP C-Box
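
As a rough illustration of that assumption, here is a toy model (not
the kernel code) of a fixed-size stack of saved scheduler states. It
assumes that each event carrying an overlap constraint saves one state
while it is being placed; count_dropped_saves() and its parameters are
hypothetical names made up for this sketch.

```c
/*
 * Toy model, not kernel code: count how many state saves would not fit
 * into a stack of states_max slots, assuming every overlap-constrained
 * event saves exactly one state.
 */
static int count_dropped_saves(int nr_overlap_events, int states_max)
{
	int saved_states = 0;
	int dropped = 0;
	int i;

	for (i = 0; i < nr_overlap_events; i++) {
		if (saved_states >= states_max)
			dropped++;	/* assumed point where the warning fires */
		else
			saved_states++;
	}

	return dropped;
}
```

Under this model, count_dropped_saves(3, 2) returns 1 (three overlap
events against the old limit of 2), while count_dropped_saves(3, 3)
returns 0.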

It's also easily triggered by running the following perf command
on an snbep box:
   # perf stat -e 'uncore_cbox_0/event=0x1f/,uncore_cbox_0/event=0x1f/,uncore_cbox_0/event=0x1f/' -a

Fix this by increasing SCHED_STATES_MAX to 3 and adding
a build-time check to the EVENT_CONSTRAINT_OVERLAP
macro.
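
The build check relies on the negative-array-size trick, which can be
tried outside the kernel with the minimal sketch below. HWEIGHT() here
is a local 8-bit stand-in written for this example, not the kernel's
macro, and the masks 0x0e and 0x1e are arbitrary values picked for
illustration.

```c
#define SCHED_STATES_MAX 3

/* Stand-in for the kernel's HWEIGHT(): set bits in the low 8 bits. */
#define HWEIGHT(w)							\
	(!!((w) & 0x01) + !!((w) & 0x02) + !!((w) & 0x04) +		\
	 !!((w) & 0x08) + !!((w) & 0x10) + !!((w) & 0x20) +		\
	 !!((w) & 0x40) + !!((w) & 0x80))

/* char[1] when the weight fits, char[-1] (a compile error) otherwise. */
#define OVERLAP_CHECK(n)   (!!sizeof(char[1 - 2*!!(HWEIGHT(n) > SCHED_STATES_MAX)]))
/* The check evaluates to 1, so the product is just the weight. */
#define OVERLAP_HWEIGHT(n) (OVERLAP_CHECK(n)*HWEIGHT(n))

/* Three bits set: compiles, and the constant evaluates to 3. */
enum { WEIGHT_OK = OVERLAP_HWEIGHT(0x0e) };

/*
 * Four bits set: uncommenting this line should break the build,
 * because the array size becomes negative.
 *
 * enum { WEIGHT_TOO_BIG = OVERLAP_HWEIGHT(0x1e) };
 */
```

Because the check sits inside a constant expression, an oversized mask
is rejected when the constraint table is compiled instead of showing
up later as a runtime warning.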

Cc: Vince Weaver <vince@deater.net>
Cc: Robert Richter <rric@kernel.org>
Cc: Yan Zheng <zheng.z.yan@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/events/core.c       |  3 ---
 arch/x86/events/perf_event.h | 10 +++++++++-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index d31735f37ed7..c725f8854216 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -676,9 +676,6 @@ struct sched_state {
 	unsigned long used[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
 };
 
-/* Total max is X86_PMC_IDX_MAX, but we are O(n!) limited */
-#define	SCHED_STATES_MAX	2
-
 struct perf_sched {
 	int			max_weight;
 	int			max_events;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 5874d8de1f8d..4553275b37d4 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -277,8 +277,16 @@ struct cpu_hw_events {
  * dramatically.  The number of such EVENT_CONSTRAINT_OVERLAP() macros
  * and its counter masks must be kept at a minimum.
  */
+
+/* Total max is X86_PMC_IDX_MAX, but we are O(n!) limited */
+#define SCHED_STATES_MAX 3
+
+/* Check we don't overlap beyond the states max. */
+#define OVERLAP_CHECK(n)   (!!sizeof(char[1 - 2*!!(HWEIGHT(n) > SCHED_STATES_MAX)]))
+#define OVERLAP_HWEIGHT(n) (OVERLAP_CHECK(n)*HWEIGHT(n))
+
 #define EVENT_CONSTRAINT_OVERLAP(c, n, m)	\
-	__EVENT_CONSTRAINT(c, n, m, HWEIGHT(n), 1, 0)
+	__EVENT_CONSTRAINT(c, n, m, OVERLAP_HWEIGHT(n), 1, 0)
 
 /*
  * Constraint on the Event code.
-- 
2.7.4
