* [PATCH 00/10] Various x86 pmu scheduling patches
@ 2015-05-21 11:17 Peter Zijlstra
  2015-05-21 11:17 ` [PATCH 01/10] perf,x86: Fix event/group validation Peter Zijlstra
                   ` (10 more replies)
  0 siblings, 11 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 11:17 UTC (permalink / raw)
  To: mingo, peterz; +Cc: vincent.weaver, eranian, jolsa, linux-kernel

Unless I messed things up again, patches 1 and 2 are for perf/urgent and the
rest can wait.

I would still like to relax the HT scheduling constraint in case there are no
funny events at all, but I've not yet found a nice way to do that.



* [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
@ 2015-05-21 11:17 ` Peter Zijlstra
  2015-05-21 12:35   ` Stephane Eranian
                     ` (2 more replies)
  2015-05-21 11:17 ` [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint Peter Zijlstra
                   ` (9 subsequent siblings)
  10 siblings, 3 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 11:17 UTC (permalink / raw)
  To: mingo, peterz
  Cc: vincent.weaver, eranian, jolsa, kan.liang, linux-kernel,
	Andrew Hunter, Maria Dimakopoulou

[-- Attachment #1: peterz-pmu-sched-1.patch --]
[-- Type: text/plain, Size: 10322 bytes --]

Commit 43b4578071c0 ("perf/x86: Reduce stack usage of
x86_schedule_events()") violated the rule that 'fake' scheduling, as
used for event/group validation, should not change the event state.

This went mostly unnoticed because repeated calls of
x86_pmu::get_event_constraints() would give the same result, and
x86_pmu::put_event_constraints() would mostly not do anything.

Things could still go wrong, especially for shared_regs, because in
the cpuc->is_fake case we can get a different constraint, and then the
put_event_constraint will not match up.

Commit e979121b1b15 ("perf/x86/intel: Implement cross-HT corruption
bug workaround") made the situation much worse by actually setting the
event->hw.constraint value to NULL, so when validation and actual
scheduling interact we get NULL ptr derefs.

Fix it by removing the constraint pointer from the event and moving it
back to an array, this time in cpuc instead of on the stack.
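
To illustrate the idea, a simplified userspace sketch (not the kernel
code; struct sched_ctx, get_constraint() and friends are made up for
illustration): keeping the scheduling-time constraint in a per-context
array indexed by event slot means a validation run with its own (fake)
context can never clobber what the real scheduler sees on the shared
event.

#include <assert.h>

struct event { int config; };                 /* shared object; must stay untouched */
struct constraint { unsigned long idxmsk; };

#define MAX_EVENTS 8

struct sched_ctx {                            /* analogous to cpu_hw_events */
    struct event *event_list[MAX_EVENTS];
    struct constraint *event_constraint[MAX_EVENTS]; /* per context, not per event */
    int n_events;
};

static struct constraint *get_constraint(struct event *e)
{
    static struct constraint c = { .idxmsk = 0xf };
    (void)e;
    return &c;
}

static void schedule(struct sched_ctx *ctx)
{
    int i;

    for (i = 0; i < ctx->n_events; i++)
        ctx->event_constraint[i] = get_constraint(ctx->event_list[i]);
    /* ... assign counters using ctx->event_constraint[] ... */
}

int main(void)
{
    struct event ev = { .config = 0x3c };
    struct sched_ctx real = { .event_list = { &ev }, .n_events = 1 };
    struct sched_ctx fake = { .event_list = { &ev }, .n_events = 1 };

    schedule(&fake);    /* validation run */
    schedule(&real);    /* real run: unaffected, nothing was stored in 'ev' */
    assert(real.event_constraint[0] && fake.event_constraint[0]);
    return 0;
}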

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 43b4578071c0 ("perf/x86: Reduce stack usage of x86_schedule_events()")
Fixes: e979121b1b15 ("perf/x86/intel: Implement cross-HT corruption bug workaround")
Cc: Andrew Hunter <ahh@google.com>
Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/perf_event.c              |   32 +++++++++++++-------------
 arch/x86/kernel/cpu/perf_event.h              |    9 ++++---
 arch/x86/kernel/cpu/perf_event_intel.c        |   15 +++---------
 arch/x86/kernel/cpu/perf_event_intel_ds.c     |    4 +--
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |    7 ++---
 arch/x86/kernel/cpu/perf_event_intel_uncore.h |    1 
 6 files changed, 32 insertions(+), 36 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -620,7 +620,7 @@ struct sched_state {
 struct perf_sched {
 	int			max_weight;
 	int			max_events;
-	struct perf_event	**events;
+	struct event_constraint	**constraints;
 	struct sched_state	state;
 	int			saved_states;
 	struct sched_state	saved[SCHED_STATES_MAX];
@@ -629,7 +629,7 @@ struct perf_sched {
 /*
  * Initialize interator that runs through all events and counters.
  */
-static void perf_sched_init(struct perf_sched *sched, struct perf_event **events,
+static void perf_sched_init(struct perf_sched *sched, struct event_constraint **constraints,
 			    int num, int wmin, int wmax)
 {
 	int idx;
@@ -637,10 +637,10 @@ static void perf_sched_init(struct perf_
 	memset(sched, 0, sizeof(*sched));
 	sched->max_events	= num;
 	sched->max_weight	= wmax;
-	sched->events		= events;
+	sched->constraints	= constraints;
 
 	for (idx = 0; idx < num; idx++) {
-		if (events[idx]->hw.constraint->weight == wmin)
+		if (constraints[idx]->weight == wmin)
 			break;
 	}
 
@@ -687,7 +687,7 @@ static bool __perf_sched_find_counter(st
 	if (sched->state.event >= sched->max_events)
 		return false;
 
-	c = sched->events[sched->state.event]->hw.constraint;
+	c = sched->constraints[sched->state.event];
 	/* Prefer fixed purpose counters */
 	if (c->idxmsk64 & (~0ULL << INTEL_PMC_IDX_FIXED)) {
 		idx = INTEL_PMC_IDX_FIXED;
@@ -745,7 +745,7 @@ static bool perf_sched_next_event(struct
 			if (sched->state.weight > sched->max_weight)
 				return false;
 		}
-		c = sched->events[sched->state.event]->hw.constraint;
+		c = sched->constraints[sched->state.event];
 	} while (c->weight != sched->state.weight);
 
 	sched->state.counter = 0;	/* start with first counter */
@@ -756,12 +756,12 @@ static bool perf_sched_next_event(struct
 /*
  * Assign a counter for each event.
  */
-int perf_assign_events(struct perf_event **events, int n,
+int perf_assign_events(struct event_constraint **constraints, int n,
 			int wmin, int wmax, int *assign)
 {
 	struct perf_sched sched;
 
-	perf_sched_init(&sched, events, n, wmin, wmax);
+	perf_sched_init(&sched, constraints, n, wmin, wmax);
 
 	do {
 		if (!perf_sched_find_counter(&sched))
@@ -788,9 +788,8 @@ int x86_schedule_events(struct cpu_hw_ev
 		x86_pmu.start_scheduling(cpuc);
 
 	for (i = 0, wmin = X86_PMC_IDX_MAX, wmax = 0; i < n; i++) {
-		hwc = &cpuc->event_list[i]->hw;
 		c = x86_pmu.get_event_constraints(cpuc, i, cpuc->event_list[i]);
-		hwc->constraint = c;
+		cpuc->event_constraint[i] = c;
 
 		wmin = min(wmin, c->weight);
 		wmax = max(wmax, c->weight);
@@ -801,7 +800,7 @@ int x86_schedule_events(struct cpu_hw_ev
 	 */
 	for (i = 0; i < n; i++) {
 		hwc = &cpuc->event_list[i]->hw;
-		c = hwc->constraint;
+		c = cpuc->event_constraint[i];
 
 		/* never assigned */
 		if (hwc->idx == -1)
@@ -821,9 +820,10 @@ int x86_schedule_events(struct cpu_hw_ev
 	}
 
 	/* slow path */
-	if (i != n)
-		unsched = perf_assign_events(cpuc->event_list, n, wmin,
+	if (i != n) {
+		unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
 					     wmax, assign);
+	}
 
 	/*
 	 * In case of success (unsched = 0), mark events as committed,
@@ -840,7 +840,7 @@ int x86_schedule_events(struct cpu_hw_ev
 			e = cpuc->event_list[i];
 			e->hw.flags |= PERF_X86_EVENT_COMMITTED;
 			if (x86_pmu.commit_scheduling)
-				x86_pmu.commit_scheduling(cpuc, e, assign[i]);
+				x86_pmu.commit_scheduling(cpuc, i, assign[i]);
 		}
 	}
 
@@ -1292,8 +1292,10 @@ static void x86_pmu_del(struct perf_even
 		x86_pmu.put_event_constraints(cpuc, event);
 
 	/* Delete the array entry. */
-	while (++i < cpuc->n_events)
+	while (++i < cpuc->n_events) {
 		cpuc->event_list[i-1] = cpuc->event_list[i];
+		cpuc->event_constraint[i-1] = cpuc->event_constraint[i];
+	}
 	--cpuc->n_events;
 
 	perf_event_update_userpage(event);
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -172,7 +172,10 @@ struct cpu_hw_events {
 					     added in the current transaction */
 	int			assign[X86_PMC_IDX_MAX]; /* event to counter assignment */
 	u64			tags[X86_PMC_IDX_MAX];
+
 	struct perf_event	*event_list[X86_PMC_IDX_MAX]; /* in enabled order */
+	struct event_constraint	*event_constraint[X86_PMC_IDX_MAX];
+
 
 	unsigned int		group_flag;
 	int			is_fake;
@@ -519,9 +522,7 @@ struct x86_pmu {
 	void		(*put_event_constraints)(struct cpu_hw_events *cpuc,
 						 struct perf_event *event);
 
-	void		(*commit_scheduling)(struct cpu_hw_events *cpuc,
-					     struct perf_event *event,
-					     int cntr);
+	void		(*commit_scheduling)(struct cpu_hw_events *cpuc, int idx, int cntr);
 
 	void		(*start_scheduling)(struct cpu_hw_events *cpuc);
 
@@ -717,7 +718,7 @@ static inline void __x86_pmu_enable_even
 
 void x86_pmu_enable_all(int added);
 
-int perf_assign_events(struct perf_event **events, int n,
+int perf_assign_events(struct event_constraint **constraints, int n,
 			int wmin, int wmax, int *assign);
 int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign);
 
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2106,7 +2106,7 @@ static struct event_constraint *
 intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
 			    struct perf_event *event)
 {
-	struct event_constraint *c1 = event->hw.constraint;
+	struct event_constraint *c1 = cpuc->event_constraint[idx];
 	struct event_constraint *c2;
 
 	/*
@@ -2188,8 +2188,6 @@ intel_put_shared_regs_event_constraints(
 static void intel_put_event_constraints(struct cpu_hw_events *cpuc,
 					struct perf_event *event)
 {
-	struct event_constraint *c = event->hw.constraint;
-
 	intel_put_shared_regs_event_constraints(cpuc, event);
 
 	/*
@@ -2197,19 +2195,14 @@ static void intel_put_event_constraints(
 	 * all events are subject to and must call the
 	 * put_excl_constraints() routine
 	 */
-	if (c && cpuc->excl_cntrs)
+	if (cpuc->excl_cntrs)
 		intel_put_excl_constraints(cpuc, event);
-
-	/* cleanup dynamic constraint */
-	if (c && (c->flags & PERF_X86_EVENT_DYNAMIC))
-		event->hw.constraint = NULL;
 }
 
-static void intel_commit_scheduling(struct cpu_hw_events *cpuc,
-				    struct perf_event *event, int cntr)
+static void intel_commit_scheduling(struct cpu_hw_events *cpuc, int idx, int cntr)
 {
 	struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
-	struct event_constraint *c = event->hw.constraint;
+	struct event_constraint *c = cpuc->event_constraint[idx];
 	struct intel_excl_states *xlo, *xl;
 	int tid = cpuc->excl_thread_id;
 	int o_tid = 1 - tid;
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -706,9 +706,9 @@ void intel_pmu_pebs_disable(struct perf_
 
 	cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
 
-	if (event->hw.constraint->flags & PERF_X86_EVENT_PEBS_LDLAT)
+	if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
 		cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
-	else if (event->hw.constraint->flags & PERF_X86_EVENT_PEBS_ST)
+	else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
 		cpuc->pebs_enabled &= ~(1ULL << 63);
 
 	if (cpuc->enabled)
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -364,9 +364,8 @@ static int uncore_assign_events(struct i
 	bitmap_zero(used_mask, UNCORE_PMC_IDX_MAX);
 
 	for (i = 0, wmin = UNCORE_PMC_IDX_MAX, wmax = 0; i < n; i++) {
-		hwc = &box->event_list[i]->hw;
 		c = uncore_get_event_constraint(box, box->event_list[i]);
-		hwc->constraint = c;
+		box->event_constraint[i] = c;
 		wmin = min(wmin, c->weight);
 		wmax = max(wmax, c->weight);
 	}
@@ -374,7 +373,7 @@ static int uncore_assign_events(struct i
 	/* fastpath, try to reuse previous register */
 	for (i = 0; i < n; i++) {
 		hwc = &box->event_list[i]->hw;
-		c = hwc->constraint;
+		c = box->event_constraint[i];
 
 		/* never assigned */
 		if (hwc->idx == -1)
@@ -394,7 +393,7 @@ static int uncore_assign_events(struct i
 	}
 	/* slow path */
 	if (i != n)
-		ret = perf_assign_events(box->event_list, n,
+		ret = perf_assign_events(box->event_constraint, n,
 					 wmin, wmax, assign);
 
 	if (!assign || ret) {
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -97,6 +97,7 @@ struct intel_uncore_box {
 	atomic_t refcnt;
 	struct perf_event *events[UNCORE_PMC_IDX_MAX];
 	struct perf_event *event_list[UNCORE_PMC_IDX_MAX];
+	struct event_constraint *event_constraint[UNCORE_PMC_IDX_MAX];
 	unsigned long active_mask[BITS_TO_LONGS(UNCORE_PMC_IDX_MAX)];
 	u64 tags[UNCORE_PMC_IDX_MAX];
 	struct pci_dev *pci_dev;




* [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
  2015-05-21 11:17 ` [PATCH 01/10] perf,x86: Fix event/group validation Peter Zijlstra
@ 2015-05-21 11:17 ` Peter Zijlstra
  2015-05-22 10:04   ` Stephane Eranian
  2015-05-21 11:17 ` [PATCH 03/10] perf/x86: Correct local vs remote sibling state Peter Zijlstra
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 11:17 UTC (permalink / raw)
  To: mingo, peterz; +Cc: vincent.weaver, eranian, jolsa, linux-kernel

[-- Attachment #1: peterz-pmu-sched-2.patch --]
[-- Type: text/plain, Size: 3759 bytes --]

The (SNB/IVB/HSW) HT bug only affects events that can be programmed
onto GP counters, therefore we should only limit the number of GP
counters that can be used per CPU -- in other words, we should not
constrain the FP counters.

This patch changes the code such that only GP counters are counted
against the limit.
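
Roughly, the intent is the following (a userspace approximation;
NUM_GP_COUNTERS, find_counter() and the boolean toggle stand in for the
real is_ht_workaround_enabled() / x86_pmu.num_counters machinery):
fixed-purpose assignments bypass the limit entirely and only generic
counter assignments are counted against the N/2 cap.

#include <stdbool.h>
#include <stdio.h>

#define NUM_GP_COUNTERS 4

static bool ht_workaround_enabled = true;     /* assumed toggle */

struct sched_state { int nr_gp_counters; };

/* returns a counter index, or -1 if the event cannot be scheduled */
static int find_counter(struct sched_state *st, bool fits_fixed)
{
    if (fits_fixed)
        return 32;                            /* fixed counters: no GP limit applies */

    /* only generic counters count against the half-of-N cap */
    if (ht_workaround_enabled &&
        st->nr_gp_counters++ >= NUM_GP_COUNTERS / 2)
        return -1;

    return st->nr_gp_counters - 1;            /* pretend the next GP counter is free */
}

int main(void)
{
    struct sched_state st = { 0 };

    printf("fixed: %d\n", find_counter(&st, true));   /* not limited */
    printf("gp0:   %d\n", find_counter(&st, false));  /* 0 */
    printf("gp1:   %d\n", find_counter(&st, false));  /* 1 */
    printf("gp2:   %d\n", find_counter(&st, false));  /* -1: over half */
    return 0;
}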

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/perf_event.c       |   15 +++++++++++++++
 arch/x86/kernel/cpu/perf_event.h       |    2 --
 arch/x86/kernel/cpu/perf_event_intel.c |   20 --------------------
 3 files changed, 15 insertions(+), 22 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -611,6 +611,7 @@ struct sched_state {
 	int	event;		/* event index */
 	int	counter;	/* counter index */
 	int	unassigned;	/* number of events to be assigned left */
+	int	nr_gp_counters;	/* number of GP counters used */
 	unsigned long used[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
 };
 
@@ -696,6 +697,20 @@ static bool __perf_sched_find_counter(st
 				goto done;
 		}
 	}
+
+	/*
+	 * Do not allow scheduling of more than half the available generic
+	 * counters.
+	 *
+	 * This helps avoid counter starvation of the sibling thread by ensuring
+	 * that at most half the counters can be in exclusive mode. There are no
+	 * designated counters for the limit; any N/2 counters can be used.
+	 * This helps with events that have specific counter constraints.
+	 */
+	if (is_ht_workaround_enabled() &&
+	    sched->state.nr_gp_counters++ >= x86_pmu.num_counters / 2)
+		return false;
+
 	/* Grab the first unused counter starting with idx */
 	idx = sched->state.counter;
 	for_each_set_bit_from(idx, c->idxmsk, INTEL_PMC_IDX_FIXED) {
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -134,8 +134,6 @@ enum intel_excl_state_type {
 struct intel_excl_states {
 	enum intel_excl_state_type init_state[X86_PMC_IDX_MAX];
 	enum intel_excl_state_type state[X86_PMC_IDX_MAX];
-	int  num_alloc_cntrs;/* #counters allocated */
-	int  max_alloc_cntrs;/* max #counters allowed */
 	bool sched_started; /* true if scheduling has started */
 };
 
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1923,7 +1923,6 @@ intel_start_scheduling(struct cpu_hw_eve
 	xl = &excl_cntrs->states[tid];
 
 	xl->sched_started = true;
-	xl->num_alloc_cntrs = 0;
 	/*
 	 * lock shared state until we are done scheduling
 	 * in stop_event_scheduling()
@@ -2008,18 +2007,6 @@ intel_get_excl_constraints(struct cpu_hw
 	xl = &excl_cntrs->states[tid];
 	xlo = &excl_cntrs->states[o_tid];
 
-	/*
-	 * do not allow scheduling of more than max_alloc_cntrs
-	 * which is set to half the available generic counters.
-	 * this helps avoid counter starvation of sibling thread
-	 * by ensuring at most half the counters cannot be in
-	 * exclusive mode. There is not designated counters for the
-	 * limits. Any N/2 counters can be used. This helps with
-	 * events with specifix counter constraints
-	 */
-	if (xl->num_alloc_cntrs++ == xl->max_alloc_cntrs)
-		return &emptyconstraint;
-
 	cx = c;
 
 	/*
@@ -2632,8 +2619,6 @@ static void intel_pmu_cpu_starting(int c
 		cpuc->lbr_sel = &cpuc->shared_regs->regs[EXTRA_REG_LBR];
 
 	if (x86_pmu.flags & PMU_FL_EXCL_CNTRS) {
-		int h = x86_pmu.num_counters >> 1;
-
 		for_each_cpu(i, topology_thread_cpumask(cpu)) {
 			struct intel_excl_cntrs *c;
 
@@ -2647,11 +2632,6 @@ static void intel_pmu_cpu_starting(int c
 		}
 		cpuc->excl_cntrs->core_id = core_id;
 		cpuc->excl_cntrs->refcnt++;
-		/*
-		 * set hard limit to half the number of generic counters
-		 */
-		cpuc->excl_cntrs->states[0].max_alloc_cntrs = h;
-		cpuc->excl_cntrs->states[1].max_alloc_cntrs = h;
 	}
 }
 




* [PATCH 03/10] perf/x86: Correct local vs remote sibling state
  2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
  2015-05-21 11:17 ` [PATCH 01/10] perf,x86: Fix event/group validation Peter Zijlstra
  2015-05-21 11:17 ` [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint Peter Zijlstra
@ 2015-05-21 11:17 ` Peter Zijlstra
  2015-05-21 13:31   ` Stephane Eranian
  2015-05-21 11:17 ` [PATCH 04/10] perf/x86: Use lockdep Peter Zijlstra
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 11:17 UTC (permalink / raw)
  To: mingo, peterz; +Cc: vincent.weaver, eranian, jolsa, linux-kernel

[-- Attachment #1: peterz-pmu-sched-3.patch --]
[-- Type: text/plain, Size: 5762 bytes --]

For some obscure reason the current code accounts the current SMT
thread's state on the remote thread and reads the remote's state on
the local SMT thread.

While internally consistent, and 'correct', it's pointless confusion we
can do without.

Flip them the right way around.
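
A minimal sketch of the indexing convention after this patch (the names
are illustrative only): writes account our own thread's state under
tid, and only reads about the sibling use tid ^ 1.

#include <stdio.h>

enum excl_state { UNUSED, SHARED, EXCLUSIVE };

struct ht_pair {
    enum excl_state state[2][4];              /* [thread][counter] */
};

/* writes go to our own slot ... */
static void commit(struct ht_pair *p, int tid, int cntr, enum excl_state s)
{
    p->state[tid][cntr] = s;
}

/* ... reads of the sibling's constraints use tid ^ 1 */
static int counter_blocked(struct ht_pair *p, int tid, int cntr)
{
    return p->state[tid ^ 1][cntr] == EXCLUSIVE;
}

int main(void)
{
    struct ht_pair p = { { { UNUSED } } };

    commit(&p, 0, 2, EXCLUSIVE);                /* thread 0 takes counter 2 */
    printf("%d\n", counter_blocked(&p, 1, 2));  /* thread 1 sees it blocked: 1 */
    printf("%d\n", counter_blocked(&p, 0, 2));  /* thread 0 itself does not: 0 */
    return 0;
}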

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/perf_event_intel.c |   69 +++++++++++++--------------------
 1 file changed, 28 insertions(+), 41 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1903,9 +1903,8 @@ static void
 intel_start_scheduling(struct cpu_hw_events *cpuc)
 {
 	struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
-	struct intel_excl_states *xl, *xlo;
+	struct intel_excl_states *xl;
 	int tid = cpuc->excl_thread_id;
-	int o_tid = 1 - tid; /* sibling thread */
 
 	/*
 	 * nothing needed if in group validation mode
@@ -1919,7 +1918,6 @@ intel_start_scheduling(struct cpu_hw_eve
 	if (!excl_cntrs)
 		return;
 
-	xlo = &excl_cntrs->states[o_tid];
 	xl = &excl_cntrs->states[tid];
 
 	xl->sched_started = true;
@@ -1932,18 +1930,17 @@ intel_start_scheduling(struct cpu_hw_eve
 	raw_spin_lock(&excl_cntrs->lock);
 
 	/*
-	 * save initial state of sibling thread
+	 * Save a copy of our state to work on.
 	 */
-	memcpy(xlo->init_state, xlo->state, sizeof(xlo->init_state));
+	memcpy(xl->init_state, xl->state, sizeof(xl->init_state));
 }
 
 static void
 intel_stop_scheduling(struct cpu_hw_events *cpuc)
 {
 	struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
-	struct intel_excl_states *xl, *xlo;
+	struct intel_excl_states *xl;
 	int tid = cpuc->excl_thread_id;
-	int o_tid = 1 - tid; /* sibling thread */
 
 	/*
 	 * nothing needed if in group validation mode
@@ -1956,13 +1953,12 @@ intel_stop_scheduling(struct cpu_hw_even
 	if (!excl_cntrs)
 		return;
 
-	xlo = &excl_cntrs->states[o_tid];
 	xl = &excl_cntrs->states[tid];
 
 	/*
-	 * make new sibling thread state visible
+	 * Commit the working state.
 	 */
-	memcpy(xlo->state, xlo->init_state, sizeof(xlo->state));
+	memcpy(xl->state, xl->init_state, sizeof(xl->state));
 
 	xl->sched_started = false;
 	/*
@@ -1977,10 +1973,9 @@ intel_get_excl_constraints(struct cpu_hw
 {
 	struct event_constraint *cx;
 	struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
-	struct intel_excl_states *xl, *xlo;
-	int is_excl, i;
+	struct intel_excl_states *xlo;
 	int tid = cpuc->excl_thread_id;
-	int o_tid = 1 - tid; /* alternate */
+	int is_excl, i;
 
 	/*
 	 * validating a group does not require
@@ -1994,18 +1989,6 @@ intel_get_excl_constraints(struct cpu_hw
 	 */
 	if (!excl_cntrs)
 		return c;
-	/*
-	 * event requires exclusive counter access
-	 * across HT threads
-	 */
-	is_excl = c->flags & PERF_X86_EVENT_EXCL;
-
-	/*
-	 * xl = state of current HT
-	 * xlo = state of sibling HT
-	 */
-	xl = &excl_cntrs->states[tid];
-	xlo = &excl_cntrs->states[o_tid];
 
 	cx = c;
 
@@ -2049,6 +2032,17 @@ intel_get_excl_constraints(struct cpu_hw
 	 */
 
 	/*
+	 * state of sibling HT
+	 */
+	xlo = &excl_cntrs->states[tid ^ 1];
+
+	/*
+	 * event requires exclusive counter access
+	 * across HT threads
+	 */
+	is_excl = c->flags & PERF_X86_EVENT_EXCL;
+
+	/*
 	 * Modify static constraint with current dynamic
 	 * state of thread
 	 *
@@ -2062,14 +2056,14 @@ intel_get_excl_constraints(struct cpu_hw
 		 * our corresponding counter cannot be used
 		 * regardless of our event
 		 */
-		if (xl->state[i] == INTEL_EXCL_EXCLUSIVE)
+		if (xlo->state[i] == INTEL_EXCL_EXCLUSIVE)
 			__clear_bit(i, cx->idxmsk);
 		/*
 		 * if measuring an exclusive event, sibling
 		 * measuring non-exclusive, then counter cannot
 		 * be used
 		 */
-		if (is_excl && xl->state[i] == INTEL_EXCL_SHARED)
+		if (is_excl && xlo->state[i] == INTEL_EXCL_SHARED)
 			__clear_bit(i, cx->idxmsk);
 	}
 
@@ -2119,10 +2113,9 @@ static void intel_put_excl_constraints(s
 {
 	struct hw_perf_event *hwc = &event->hw;
 	struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
-	struct intel_excl_states *xlo, *xl;
-	unsigned long flags = 0; /* keep compiler happy */
 	int tid = cpuc->excl_thread_id;
-	int o_tid = 1 - tid;
+	struct intel_excl_states *xl;
+	unsigned long flags = 0; /* keep compiler happy */
 
 	/*
 	 * nothing needed if in group validation mode
@@ -2136,7 +2129,6 @@ static void intel_put_excl_constraints(s
 		return;
 
 	xl = &excl_cntrs->states[tid];
-	xlo = &excl_cntrs->states[o_tid];
 
 	/*
 	 * put_constraint may be called from x86_schedule_events()
@@ -2151,7 +2143,7 @@ static void intel_put_excl_constraints(s
 	 * counter state as unused now
 	 */
 	if (hwc->idx >= 0)
-		xlo->state[hwc->idx] = INTEL_EXCL_UNUSED;
+		xl->state[hwc->idx] = INTEL_EXCL_UNUSED;
 
 	if (!xl->sched_started)
 		raw_spin_unlock_irqrestore(&excl_cntrs->lock, flags);
@@ -2190,16 +2182,12 @@ static void intel_commit_scheduling(stru
 {
 	struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
 	struct event_constraint *c = cpuc->event_constraint[idx];
-	struct intel_excl_states *xlo, *xl;
+	struct intel_excl_states *xl;
 	int tid = cpuc->excl_thread_id;
-	int o_tid = 1 - tid;
-	int is_excl;
 
 	if (cpuc->is_fake || !c)
 		return;
 
-	is_excl = c->flags & PERF_X86_EVENT_EXCL;
-
 	if (!(c->flags & PERF_X86_EVENT_DYNAMIC))
 		return;
 
@@ -2209,15 +2197,14 @@ static void intel_commit_scheduling(stru
 		return;
 
 	xl = &excl_cntrs->states[tid];
-	xlo = &excl_cntrs->states[o_tid];
 
 	WARN_ON_ONCE(!raw_spin_is_locked(&excl_cntrs->lock));
 
 	if (cntr >= 0) {
-		if (is_excl)
-			xlo->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
+		if (c->flags & PERF_X86_EVENT_EXCL)
+			xl->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
 		else
-			xlo->init_state[cntr] = INTEL_EXCL_SHARED;
+			xl->init_state[cntr] = INTEL_EXCL_SHARED;
 	}
 }
 




* [PATCH 04/10] perf/x86: Use lockdep
  2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
                   ` (2 preceding siblings ...)
  2015-05-21 11:17 ` [PATCH 03/10] perf/x86: Correct local vs remote sibling state Peter Zijlstra
@ 2015-05-21 11:17 ` Peter Zijlstra
  2015-05-21 11:17 ` [PATCH 05/10] perf/x86: Simplify dynamic constraint code somewhat Peter Zijlstra
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 11:17 UTC (permalink / raw)
  To: mingo, peterz; +Cc: vincent.weaver, eranian, jolsa, linux-kernel

[-- Attachment #1: peterz-pmu-sched-3a.patch --]
[-- Type: text/plain, Size: 935 bytes --]

Lockdep is very good at finding incorrect IRQ state while locking and
is far better at telling us if we hold a lock than the _is_locked()
API. It also generates less code for the !DEBUG kernels.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/perf_event_intel.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1926,7 +1926,6 @@ intel_start_scheduling(struct cpu_hw_eve
 	 * in stop_event_scheduling()
 	 * makes scheduling appear as a transaction
 	 */
-	WARN_ON_ONCE(!irqs_disabled());
 	raw_spin_lock(&excl_cntrs->lock);
 
 	/*
@@ -2198,7 +2197,7 @@ static void intel_commit_scheduling(stru
 
 	xl = &excl_cntrs->states[tid];
 
-	WARN_ON_ONCE(!raw_spin_is_locked(&excl_cntrs->lock));
+	lockdep_assert_held(&excl_cntrs->lock);
 
 	if (cntr >= 0) {
 		if (c->flags & PERF_X86_EVENT_EXCL)




* [PATCH 05/10] perf/x86: Simplify dynamic constraint code somewhat
  2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
                   ` (3 preceding siblings ...)
  2015-05-21 11:17 ` [PATCH 04/10] perf/x86: Use lockdep Peter Zijlstra
@ 2015-05-21 11:17 ` Peter Zijlstra
  2015-05-21 11:17 ` [PATCH 06/10] perf/x86: Make WARNs consistent Peter Zijlstra
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 11:17 UTC (permalink / raw)
  To: mingo, peterz; +Cc: vincent.weaver, eranian, jolsa, linux-kernel

[-- Attachment #1: peterz-pmu-sched-4.patch --]
[-- Type: text/plain, Size: 2788 bytes --]

Instead of using cx after the dynamic allocation, put all uses of cx
inside the dynamic allocation block and use c outside of it.

Also use direct assignment to copy the structure; let the compiler
figure it out.
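
For reference, a trivial standalone example of the copy style the patch
switches to; plain structure assignment, which the compiler turns into
the same copy the explicit memcpy() did:

#include <assert.h>
#include <string.h>

struct event_constraint { unsigned long long idxmsk64; int weight; int flags; };

int main(void)
{
    struct event_constraint c = { .idxmsk64 = 0xf, .weight = 4, .flags = 1 };
    struct event_constraint cx, cy;

    cx = c;                          /* direct assignment ... */
    memcpy(&cy, &c, sizeof(cy));     /* ... equivalent to the old memcpy() */

    assert(cx.idxmsk64 == cy.idxmsk64 && cx.weight == cy.weight);
    return 0;
}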

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/perf_event_intel.c |   21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1970,7 +1970,6 @@ static struct event_constraint *
 intel_get_excl_constraints(struct cpu_hw_events *cpuc, struct perf_event *event,
 			   int idx, struct event_constraint *c)
 {
-	struct event_constraint *cx;
 	struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
 	struct intel_excl_states *xlo;
 	int tid = cpuc->excl_thread_id;
@@ -1989,8 +1988,6 @@ intel_get_excl_constraints(struct cpu_hw
 	if (!excl_cntrs)
 		return c;
 
-	cx = c;
-
 	/*
 	 * because we modify the constraint, we need
 	 * to make a copy. Static constraints come
@@ -2000,6 +1997,7 @@ intel_get_excl_constraints(struct cpu_hw
 	 * been cloned (marked dynamic)
 	 */
 	if (!(c->flags & PERF_X86_EVENT_DYNAMIC)) {
+		struct event_constraint *cx;
 
 		/* sanity check */
 		if (idx < 0)
@@ -2014,13 +2012,14 @@ intel_get_excl_constraints(struct cpu_hw
 		 * initialize dynamic constraint
 		 * with static constraint
 		 */
-		memcpy(cx, c, sizeof(*cx));
+		*cx = *c;
 
 		/*
 		 * mark constraint as dynamic, so we
 		 * can free it later on
 		 */
 		cx->flags |= PERF_X86_EVENT_DYNAMIC;
+		c = cx;
 	}
 
 	/*
@@ -2049,37 +2048,37 @@ intel_get_excl_constraints(struct cpu_hw
 	 * SHARED   : sibling counter measuring non-exclusive event
 	 * UNUSED   : sibling counter unused
 	 */
-	for_each_set_bit(i, cx->idxmsk, X86_PMC_IDX_MAX) {
+	for_each_set_bit(i, c->idxmsk, X86_PMC_IDX_MAX) {
 		/*
 		 * exclusive event in sibling counter
 		 * our corresponding counter cannot be used
 		 * regardless of our event
 		 */
 		if (xlo->state[i] == INTEL_EXCL_EXCLUSIVE)
-			__clear_bit(i, cx->idxmsk);
+			__clear_bit(i, c->idxmsk);
 		/*
 		 * if measuring an exclusive event, sibling
 		 * measuring non-exclusive, then counter cannot
 		 * be used
 		 */
 		if (is_excl && xlo->state[i] == INTEL_EXCL_SHARED)
-			__clear_bit(i, cx->idxmsk);
+			__clear_bit(i, c->idxmsk);
 	}
 
 	/*
 	 * recompute actual bit weight for scheduling algorithm
 	 */
-	cx->weight = hweight64(cx->idxmsk64);
+	c->weight = hweight64(c->idxmsk64);
 
 	/*
 	 * if we return an empty mask, then switch
 	 * back to static empty constraint to avoid
 	 * the cost of freeing later on
 	 */
-	if (cx->weight == 0)
-		cx = &emptyconstraint;
+	if (c->weight == 0)
+		c = &emptyconstraint;
 
-	return cx;
+	return c;
 }
 
 static struct event_constraint *




* [PATCH 06/10] perf/x86: Make WARNs consistent
  2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
                   ` (4 preceding siblings ...)
  2015-05-21 11:17 ` [PATCH 05/10] perf/x86: Simplify dynamic constraint code somewhat Peter Zijlstra
@ 2015-05-21 11:17 ` Peter Zijlstra
  2015-05-21 11:17 ` [PATCH 07/10] perf/x86: Move intel_commit_scheduling() Peter Zijlstra
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 11:17 UTC (permalink / raw)
  To: mingo, peterz; +Cc: vincent.weaver, eranian, jolsa, linux-kernel

[-- Attachment #1: peterz-pmu-sched-5.patch --]
[-- Type: text/plain, Size: 2282 bytes --]

The commit_scheduling hook is pointlessly different from the start and
stop scheduling hooks.

Furthermore, the constraint should never be NULL, so remove that test.
Even though we'll never get called when !is_ht_workaround_enabled()
(because we NULL the callbacks in that case), add that test anyway.

Collapse the (pointless) WARN_ON_ONCE and bail on !cpuc->excl_cntrs --
this is doubly pointless, because it's the same condition as
is_ht_workaround_enabled(), which was already pointless because the
whole method won't ever be called in that case.

Furthermore, make all the !excl_cntrs tests WARN_ON_ONCE; they're all
pointless because, as above, either the functions
({get,put}_excl_constraints) are already predicated on it existing or
the is_ht_workaround_enabled() check is the same test.
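
The collapsed form relies on WARN_ON_ONCE() returning the value of its
condition. A toy userspace stand-in (the real macro lives in the
kernel's bug.h headers; this one only mimics the return-the-condition
and warn-once behaviour, using a GNU C statement expression like the
kernel does) shows why the separate WARN + test can be folded into one
line:

#include <stdio.h>

/* toy stand-in: warn (once per call site) and return the condition */
#define WARN_ON_ONCE(cond) ({ \
    static int __warned; \
    int __c = !!(cond); \
    if (__c && !__warned) { \
        __warned = 1; \
        fprintf(stderr, "WARNING at %s:%d\n", __FILE__, __LINE__); \
    } \
    __c; \
})

struct excl_cntrs { int dummy; };

static void put_excl_constraints(struct excl_cntrs *excl_cntrs)
{
    /* collapses the separate WARN_ON_ONCE() + if (!excl_cntrs) pair */
    if (WARN_ON_ONCE(!excl_cntrs))
        return;

    /* ... real work ... */
}

int main(void)
{
    put_excl_constraints(NULL);    /* warns once and bails */
    put_excl_constraints(NULL);    /* bails silently the second time */
    return 0;
}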

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/perf_event_intel.c |   18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1915,7 +1915,7 @@ intel_start_scheduling(struct cpu_hw_eve
 	/*
 	 * no exclusion needed
 	 */
-	if (!excl_cntrs)
+	if (WARN_ON_ONCE(!excl_cntrs))
 		return;
 
 	xl = &excl_cntrs->states[tid];
@@ -1949,7 +1949,7 @@ intel_stop_scheduling(struct cpu_hw_even
 	/*
 	 * no exclusion needed
 	 */
-	if (!excl_cntrs)
+	if (WARN_ON_ONCE(!excl_cntrs))
 		return;
 
 	xl = &excl_cntrs->states[tid];
@@ -1985,7 +1985,7 @@ intel_get_excl_constraints(struct cpu_hw
 	/*
 	 * no exclusion needed
 	 */
-	if (!excl_cntrs)
+	if (WARN_ON_ONCE(!excl_cntrs))
 		return c;
 
 	/*
@@ -2121,9 +2121,7 @@ static void intel_put_excl_constraints(s
 	if (cpuc->is_fake)
 		return;
 
-	WARN_ON_ONCE(!excl_cntrs);
-
-	if (!excl_cntrs)
+	if (WARN_ON_ONCE(!excl_cntrs))
 		return;
 
 	xl = &excl_cntrs->states[tid];
@@ -2190,15 +2188,13 @@ static void intel_commit_scheduling(stru
 	struct intel_excl_states *xl;
 	int tid = cpuc->excl_thread_id;
 
-	if (cpuc->is_fake || !c)
+	if (cpuc->is_fake || !is_ht_workaround_enabled())
 		return;
 
-	if (!(c->flags & PERF_X86_EVENT_DYNAMIC))
+	if (WARN_ON_ONCE(!excl_cntrs))
 		return;
 
-	WARN_ON_ONCE(!excl_cntrs);
-
-	if (!excl_cntrs)
+	if (!(c->flags & PERF_X86_EVENT_DYNAMIC))
 		return;
 
 	xl = &excl_cntrs->states[tid];




* [PATCH 07/10] perf/x86: Move intel_commit_scheduling()
  2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
                   ` (5 preceding siblings ...)
  2015-05-21 11:17 ` [PATCH 06/10] perf/x86: Make WARNs consistent Peter Zijlstra
@ 2015-05-21 11:17 ` Peter Zijlstra
  2015-05-21 11:17 ` [PATCH 08/10] perf/x86: Remove pointless tests Peter Zijlstra
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 11:17 UTC (permalink / raw)
  To: mingo, peterz; +Cc: vincent.weaver, eranian, jolsa, linux-kernel

[-- Attachment #1: peterz-pmu-sched-6.patch --]
[-- Type: text/plain, Size: 3424 bytes --]

Move intel_commit_scheduling() to its logical place, which is in
between the start and stop scheduling hooks.
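
For context, the hooks are invoked in start -> commit -> stop order, so
declaring them in that order in the ops structure simply mirrors the
call sequence. A toy sketch (struct sched_ops and the function names
are made up; only the ordering matters):

#include <stdio.h>

struct ctx { int dummy; };

struct sched_ops {                 /* mirrors the x86_pmu hook ordering */
    void (*start_scheduling)(struct ctx *);
    void (*commit_scheduling)(struct ctx *, int idx, int cntr);
    void (*stop_scheduling)(struct ctx *);
};

static void start(struct ctx *c)  { (void)c; puts("start"); }
static void commit(struct ctx *c, int idx, int cntr)
{
    (void)c;
    printf("commit event %d -> counter %d\n", idx, cntr);
}
static void stop(struct ctx *c)   { (void)c; puts("stop"); }

int main(void)
{
    struct sched_ops ops = { start, commit, stop };
    struct ctx c;

    ops.start_scheduling(&c);          /* begin the scheduling transaction */
    ops.commit_scheduling(&c, 0, 0);   /* commit each assignment */
    ops.stop_scheduling(&c);           /* end the transaction */
    return 0;
}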

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/perf_event.h       |    4 +-
 arch/x86/kernel/cpu/perf_event_intel.c |   60 ++++++++++++++++-----------------
 2 files changed, 32 insertions(+), 32 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -520,10 +520,10 @@ struct x86_pmu {
 	void		(*put_event_constraints)(struct cpu_hw_events *cpuc,
 						 struct perf_event *event);
 
-	void		(*commit_scheduling)(struct cpu_hw_events *cpuc, int idx, int cntr);
-
 	void		(*start_scheduling)(struct cpu_hw_events *cpuc);
 
+	void		(*commit_scheduling)(struct cpu_hw_events *cpuc, int idx, int cntr);
+
 	void		(*stop_scheduling)(struct cpu_hw_events *cpuc);
 
 	struct event_constraint *event_constraints;
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1934,6 +1934,34 @@ intel_start_scheduling(struct cpu_hw_eve
 	memcpy(xl->init_state, xl->state, sizeof(xl->init_state));
 }
 
+static void intel_commit_scheduling(struct cpu_hw_events *cpuc, int idx, int cntr)
+{
+	struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
+	struct event_constraint *c = cpuc->event_constraint[idx];
+	struct intel_excl_states *xl;
+	int tid = cpuc->excl_thread_id;
+
+	if (cpuc->is_fake || !is_ht_workaround_enabled())
+		return;
+
+	if (WARN_ON_ONCE(!excl_cntrs))
+		return;
+
+	if (!(c->flags & PERF_X86_EVENT_DYNAMIC))
+		return;
+
+	xl = &excl_cntrs->states[tid];
+
+	lockdep_assert_held(&excl_cntrs->lock);
+
+	if (cntr >= 0) {
+		if (c->flags & PERF_X86_EVENT_EXCL)
+			xl->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
+		else
+			xl->init_state[cntr] = INTEL_EXCL_SHARED;
+	}
+}
+
 static void
 intel_stop_scheduling(struct cpu_hw_events *cpuc)
 {
@@ -2174,34 +2202,6 @@ static void intel_put_event_constraints(
 		intel_put_excl_constraints(cpuc, event);
 }
 
-static void intel_commit_scheduling(struct cpu_hw_events *cpuc, int idx, int cntr)
-{
-	struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
-	struct event_constraint *c = cpuc->event_constraint[idx];
-	struct intel_excl_states *xl;
-	int tid = cpuc->excl_thread_id;
-
-	if (cpuc->is_fake || !is_ht_workaround_enabled())
-		return;
-
-	if (WARN_ON_ONCE(!excl_cntrs))
-		return;
-
-	if (!(c->flags & PERF_X86_EVENT_DYNAMIC))
-		return;
-
-	xl = &excl_cntrs->states[tid];
-
-	lockdep_assert_held(&excl_cntrs->lock);
-
-	if (cntr >= 0) {
-		if (c->flags & PERF_X86_EVENT_EXCL)
-			xl->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
-		else
-			xl->init_state[cntr] = INTEL_EXCL_SHARED;
-	}
-}
-
 static void intel_pebs_aliases_core2(struct perf_event *event)
 {
 	if ((event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
@@ -2910,8 +2910,8 @@ static __init void intel_ht_bug(void)
 {
 	x86_pmu.flags |= PMU_FL_EXCL_CNTRS | PMU_FL_EXCL_ENABLED;
 
-	x86_pmu.commit_scheduling = intel_commit_scheduling;
 	x86_pmu.start_scheduling = intel_start_scheduling;
+	x86_pmu.commit_scheduling = intel_commit_scheduling;
 	x86_pmu.stop_scheduling = intel_stop_scheduling;
 }
 
@@ -3367,8 +3367,8 @@ static __init int fixup_ht_bug(void)
 
 	x86_pmu.flags &= ~(PMU_FL_EXCL_CNTRS | PMU_FL_EXCL_ENABLED);
 
-	x86_pmu.commit_scheduling = NULL;
 	x86_pmu.start_scheduling = NULL;
+	x86_pmu.commit_scheduling = NULL;
 	x86_pmu.stop_scheduling = NULL;
 
 	watchdog_nmi_enable_all();




* [PATCH 08/10] perf/x86: Remove pointless tests
  2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
                   ` (6 preceding siblings ...)
  2015-05-21 11:17 ` [PATCH 07/10] perf/x86: Move intel_commit_scheduling() Peter Zijlstra
@ 2015-05-21 11:17 ` Peter Zijlstra
  2015-05-21 13:24   ` Stephane Eranian
  2015-05-21 11:17 ` [PATCH 09/10] perf/x86: Remove intel_excl_states::init_state Peter Zijlstra
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 11:17 UTC (permalink / raw)
  To: mingo, peterz; +Cc: vincent.weaver, eranian, jolsa, linux-kernel

[-- Attachment #1: peterz-pmu-sched-7.patch --]
[-- Type: text/plain, Size: 1236 bytes --]

Both intel_commit_scheduling() and intel_get_excl_constraints() test
for a negative index (cntr < 0 resp. idx < 0).

The only way that can happen (aside from a bug) is through
validate_event(); however, that case is already captured by the
cpuc->is_fake test.

So remove these tests and simplify the code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/perf_event_intel.c |   14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1955,12 +1955,10 @@ static void intel_commit_scheduling(stru
 
 	lockdep_assert_held(&excl_cntrs->lock);
 
-	if (cntr >= 0) {
-		if (c->flags & PERF_X86_EVENT_EXCL)
-			xl->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
-		else
-			xl->init_state[cntr] = INTEL_EXCL_SHARED;
-	}
+	if (c->flags & PERF_X86_EVENT_EXCL)
+		xl->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
+	else
+		xl->init_state[cntr] = INTEL_EXCL_SHARED;
 }
 
 static void
@@ -2028,10 +2026,6 @@ intel_get_excl_constraints(struct cpu_hw
 	if (!(c->flags & PERF_X86_EVENT_DYNAMIC)) {
 		struct event_constraint *cx;
 
-		/* sanity check */
-		if (idx < 0)
-			return &emptyconstraint;
-
 		/*
 		 * grab pre-allocated constraint entry
 		 */




* [PATCH 09/10] perf/x86: Remove intel_excl_states::init_state
  2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
                   ` (7 preceding siblings ...)
  2015-05-21 11:17 ` [PATCH 08/10] perf/x86: Remove pointless tests Peter Zijlstra
@ 2015-05-21 11:17 ` Peter Zijlstra
  2015-05-21 13:39   ` Stephane Eranian
  2015-05-21 11:17 ` [PATCH 10/10] perf,x86: Simplify logic Peter Zijlstra
  2015-05-21 11:48 ` [PATCH 00/10] Various x86 pmu scheduling patches Stephane Eranian
  10 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 11:17 UTC (permalink / raw)
  To: mingo, peterz; +Cc: vincent.weaver, eranian, jolsa, linux-kernel

[-- Attachment #1: peterz-pmu-sched-8.patch --]
[-- Type: text/plain, Size: 3085 bytes --]

For some obscure reason intel_{start,stop}_scheduling() copy the HT
state to an intermediate array. This would make sense if we ever were
to make changes to it which we'd have to discard.

Except we don't. By the time we call intel_commit_scheduling() we're,
as the name implies, committed to them. We'll never back out.

So the intermediate array is pointless; modify the state in place and
kill the extra array.

And remove the pointless array initialization: INTEL_EXCL_UNUSED == 0.

Note: all of this is serialized by intel_excl_cntrs::lock.
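
The 'remove the pointless array initialization' part relies on the
first enumerator being zero, so a zeroed allocation already leaves
every slot in INTEL_EXCL_UNUSED. A small standalone illustration
(calloc() standing in for kzalloc_node(); the array size and the order
of the other enumerators are illustrative, only INTEL_EXCL_UNUSED == 0
matters):

#include <assert.h>
#include <stdlib.h>

enum intel_excl_state_type {
    INTEL_EXCL_UNUSED = 0,      /* counter is unused */
    INTEL_EXCL_SHARED,
    INTEL_EXCL_EXCLUSIVE,
};

struct intel_excl_states {
    enum intel_excl_state_type state[48];
};

int main(void)
{
    struct intel_excl_states *xl = calloc(1, sizeof(*xl));
    int i;

    assert(xl);
    /* no explicit init loop needed; zeroed memory is INTEL_EXCL_UNUSED */
    for (i = 0; i < 48; i++)
        assert(xl->state[i] == INTEL_EXCL_UNUSED);

    free(xl);
    return 0;
}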

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/perf_event.c       |    1 -
 arch/x86/kernel/cpu/perf_event.h       |    1 -
 arch/x86/kernel/cpu/perf_event_intel.c |   22 ++--------------------
 3 files changed, 2 insertions(+), 22 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -861,7 +861,6 @@ int x86_schedule_events(struct cpu_hw_ev
 	}
 
 	if (!assign || unsched) {
-
 		for (i = 0; i < n; i++) {
 			e = cpuc->event_list[i];
 			/*
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -132,7 +132,6 @@ enum intel_excl_state_type {
 };
 
 struct intel_excl_states {
-	enum intel_excl_state_type init_state[X86_PMC_IDX_MAX];
 	enum intel_excl_state_type state[X86_PMC_IDX_MAX];
 	bool sched_started; /* true if scheduling has started */
 };
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1927,11 +1927,6 @@ intel_start_scheduling(struct cpu_hw_eve
 	 * makes scheduling appear as a transaction
 	 */
 	raw_spin_lock(&excl_cntrs->lock);
-
-	/*
-	 * Save a copy of our state to work on.
-	 */
-	memcpy(xl->init_state, xl->state, sizeof(xl->init_state));
 }
 
 static void intel_commit_scheduling(struct cpu_hw_events *cpuc, int idx, int cntr)
@@ -1955,9 +1950,9 @@ static void intel_commit_scheduling(stru
 	lockdep_assert_held(&excl_cntrs->lock);
 
 	if (c->flags & PERF_X86_EVENT_EXCL)
-		xl->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
+		xl->state[cntr] = INTEL_EXCL_EXCLUSIVE;
 	else
-		xl->init_state[cntr] = INTEL_EXCL_SHARED;
+		xl->state[cntr] = INTEL_EXCL_SHARED;
 }
 
 static void
@@ -1980,11 +1975,6 @@ intel_stop_scheduling(struct cpu_hw_even
 
 	xl = &excl_cntrs->states[tid];
 
-	/*
-	 * Commit the working state.
-	 */
-	memcpy(xl->state, xl->init_state, sizeof(xl->state));
-
 	xl->sched_started = false;
 	/*
 	 * release shared state lock (acquired in intel_start_scheduling())
@@ -2509,19 +2499,11 @@ struct intel_shared_regs *allocate_share
 static struct intel_excl_cntrs *allocate_excl_cntrs(int cpu)
 {
 	struct intel_excl_cntrs *c;
-	int i;
 
 	c = kzalloc_node(sizeof(struct intel_excl_cntrs),
 			 GFP_KERNEL, cpu_to_node(cpu));
 	if (c) {
 		raw_spin_lock_init(&c->lock);
-		for (i = 0; i < X86_PMC_IDX_MAX; i++) {
-			c->states[0].state[i] = INTEL_EXCL_UNUSED;
-			c->states[0].init_state[i] = INTEL_EXCL_UNUSED;
-
-			c->states[1].state[i] = INTEL_EXCL_UNUSED;
-			c->states[1].init_state[i] = INTEL_EXCL_UNUSED;
-		}
 		c->core_id = -1;
 	}
 	return c;




* [PATCH 10/10] perf,x86: Simplify logic
  2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
                   ` (8 preceding siblings ...)
  2015-05-21 11:17 ` [PATCH 09/10] perf/x86: Remove intel_excl_states::init_state Peter Zijlstra
@ 2015-05-21 11:17 ` Peter Zijlstra
  2015-05-21 11:48 ` [PATCH 00/10] Various x86 pmu scheduling patches Stephane Eranian
  10 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 11:17 UTC (permalink / raw)
  To: mingo, peterz; +Cc: vincent.weaver, eranian, jolsa, linux-kernel

[-- Attachment #1: peterz-pmu-sched-9.patch --]
[-- Type: text/plain, Size: 528 bytes --]

	(!x && y) == !(x || !y)
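
The identity is just De Morgan; it can be checked exhaustively over the
four boolean combinations (presumably x stands for unsched and y for
assign in x86_schedule_events()):

#include <assert.h>

int main(void)
{
    int x, y;

    for (x = 0; x <= 1; x++)
        for (y = 0; y <= 1; y++)
            assert((!x && y) == !(x || !y));
    return 0;
}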

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/perf_event.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -858,9 +858,7 @@ int x86_schedule_events(struct cpu_hw_ev
 			if (x86_pmu.commit_scheduling)
 				x86_pmu.commit_scheduling(cpuc, i, assign[i]);
 		}
-	}
-
-	if (!assign || unsched) {
+	} else {
 		for (i = 0; i < n; i++) {
 			e = cpuc->event_list[i];
 			/*




* Re: [PATCH 00/10] Various x86 pmu scheduling patches
  2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
                   ` (9 preceding siblings ...)
  2015-05-21 11:17 ` [PATCH 10/10] perf,x86: Simplify logic Peter Zijlstra
@ 2015-05-21 11:48 ` Stephane Eranian
  2015-05-21 12:53   ` Peter Zijlstra
  10 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-21 11:48 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

Peter,

On Thu, May 21, 2015 at 4:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
> Unless I messed things up again, patch 1 and 2 are for perf/urgent and the rest
> can wait.
>
> I would still like to relax the HT scheduling constraint in case there are no
> funny events at all, but I've not yet found a nice way to do that.
>
Are you talking about the HT bug workaround? If there is no
corrupting event across HT threads, then scheduling is not impacted.


* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 11:17 ` [PATCH 01/10] perf,x86: Fix event/group validation Peter Zijlstra
@ 2015-05-21 12:35   ` Stephane Eranian
  2015-05-21 12:56     ` Peter Zijlstra
  2015-05-21 14:53   ` Peter Zijlstra
  2015-08-21 20:31   ` Sasha Levin
  2 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-21 12:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

Peter,

On Thu, May 21, 2015 at 4:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> Commit 43b4578071c0 ("perf/x86: Reduce stack usage of
> x86_schedule_events()") violated the rule that 'fake' scheduling; as
> used for event/group validation; should not change the event state.
>
> This went mostly un-noticed because repeated calls of
> x86_pmu::get_event_constraints() would give the same result. And
> x86_pmu::put_event_constraints() would mostly not do anything.
>
> Things could still go wrong esp. for shared_regs, because
> cpuc->is_fake can return a different constraint and then the
> put_event_constraint will not match up.
>
I don't follow this here. What do you mean by 'match up'?

> Commit e979121b1b15 ("perf/x86/intel: Implement cross-HT corruption
> bug workaround") made the situation much worse by actually setting the
> event->hw.constraint value to NULL, so when validation and actual
> scheduling interact we get NULL ptr derefs.
>

But  x86_schedule_events() does reset the hw.constraint for each invocation:

           c = x86_pmu.get_event_constraints(cpuc, i, cpuc->event_list[i]);
           hwc->constraint = c;

^ permalink raw reply	[flat|nested] 56+ messages in thread
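
For readers following the diff above, the core of the fix is that the cached
constraint moves out of the shared perf_event and into an array owned by each
scheduling context (real or fake), indexed by the event's slot in that
context's event_list[]. A minimal user-space C sketch of that layout, using
simplified, hypothetical types rather than the kernel's:

        #include <stdio.h>

        #define MAX_EVENTS 8

        struct event_constraint { unsigned long idxmsk; };
        struct perf_event { int id; };  /* shared by real and fake scheduling passes */

        /*
         * Each scheduling context (real cpuc or fake_cpuc) owns its own
         * constraint cache, indexed by the event's position in event_list[],
         * so a validation pass can no longer clobber the real pass's cache.
         */
        struct cpu_hw_events {
                struct perf_event *event_list[MAX_EVENTS];
                struct event_constraint *event_constraint[MAX_EVENTS];
        };

        static struct event_constraint some_constraint = { .idxmsk = 0x3 };

        static void schedule_events(struct cpu_hw_events *cpuc, int n)
        {
                for (int i = 0; i < n; i++)
                        cpuc->event_constraint[i] = &some_constraint;
        }

        int main(void)
        {
                struct perf_event ev = { .id = 1 };
                struct cpu_hw_events real = { .event_list = { &ev } };
                struct cpu_hw_events fake = { .event_list = { &ev } }; /* same event */

                schedule_events(&real, 1);
                schedule_events(&fake, 1);

                /* The two caches are distinct storage even though the event is shared. */
                printf("caches distinct: %d\n",
                       &real.event_constraint[0] != &fake.event_constraint[0]);
                return 0;
        }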

* Re: [PATCH 00/10] Various x86 pmu scheduling patches
  2015-05-21 11:48 ` [PATCH 00/10] Various x86 pmu scheduling patches Stephane Eranian
@ 2015-05-21 12:53   ` Peter Zijlstra
  0 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 12:53 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Thu, May 21, 2015 at 04:48:16AM -0700, Stephane Eranian wrote:
> Peter,
> 
> On Thu, May 21, 2015 at 4:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > Unless I messed things up again, patch 1 and 2 are for perf/urgent and the rest
> > can wait.
> >
> > I would still like to relax the HT scheduling constraint in case there are no
> > funny events at all, but I've not yet found a nice way to do that.
> >
> Are you talking about the HT bug workaround? If there is no
> corrupting event across HT threads, then scheduling is not impacted.

It is, commit c02cdbf60b51 ("perf/x86/intel: Limit to half counters when
the HT workaround is enabled, to avoid exclusive mode starvation") is
active irrespective of any active events.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 12:35   ` Stephane Eranian
@ 2015-05-21 12:56     ` Peter Zijlstra
  2015-05-21 13:07       ` Stephane Eranian
  0 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 12:56 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, May 21, 2015 at 05:35:02AM -0700, Stephane Eranian wrote:
> Peter,
> 
> On Thu, May 21, 2015 at 4:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > Commit 43b4578071c0 ("perf/x86: Reduce stack usage of
> > x86_schedule_events()") violated the rule that 'fake' scheduling; as
> > used for event/group validation; should not change the event state.
> >
> > This went mostly un-noticed because repeated calls of
> > x86_pmu::get_event_constraints() would give the same result. And
> > x86_pmu::put_event_constraints() would mostly not do anything.
> >
> > Things could still go wrong esp. for shared_regs, because
> > cpuc->is_fake can return a different constraint and then the
> > put_event_constraint will not match up.
> >
> I don't follow this here. What do you mean by 'match up'?

Ah, I wrote that Changelog for a prior patch, which I found to be faulty
while writing the changelog.

But I then forgot to update the Changelog.

I was under the impression put_event_constraints() would actually take
the constraint as an argument, and with the below example, it would not
put the same constraint it got.

> > Commit e979121b1b15 ("perf/x86/intel: Implement cross-HT corruption
> > bug workaround") made the situation much worse by actually setting the
> > event->hw.constraint value to NULL, so when validation and actual
> > scheduling interact we get NULL ptr derefs.
> >
> 
> But  x86_schedule_events() does reset the hw.constraint for each invocation:
> 
>            c = x86_pmu.get_event_constraints(cpuc, i, cpuc->event_list[i]);
>            hwc->constraint = c;

Yes, so if you have:

	validate_group()

		hwc->constraint = c;

	<context switch>

		c = hwc->constraint;

The second c might not be the first.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 12:56     ` Peter Zijlstra
@ 2015-05-21 13:07       ` Stephane Eranian
  2015-05-21 13:09         ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-21 13:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, May 21, 2015 at 5:56 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, May 21, 2015 at 05:35:02AM -0700, Stephane Eranian wrote:
>> > Commit e979121b1b15 ("perf/x86/intel: Implement cross-HT corruption
>> > bug workaround") made the situation much worse by actually setting the
>> > event->hw.constraint value to NULL, so when validation and actual
>> > scheduling interact we get NULL ptr derefs.
>> >
>>
>> But  x86_schedule_events() does reset the hw.constraint for each invocation:
>>
>>            c = x86_pmu.get_event_constraints(cpuc, i, cpuc->event_list[i]);
>>            hwc->constraint = c;
>
> Yes, so if you have:
>
>         validate_group()
>
>                 hwc->constraint = c;
>
Ok, you get that because validate_group() invokes x86_schedule_events() but
on the fake_cpuc. This operates on fake_cpuc->event_list[]->hwc.

>         <context switch>
>
>                 c = hwc->constraint;
>
> The second c might not be the first.
And where does this assignment come from?
For actual scheduling, we are using the actual cpuc, not fake_cpuc.
Validate_group() does not modify global cpuc state. Or am I missing something?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 13:07       ` Stephane Eranian
@ 2015-05-21 13:09         ` Peter Zijlstra
  2015-05-21 13:18           ` Stephane Eranian
  0 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 13:09 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, May 21, 2015 at 06:07:20AM -0700, Stephane Eranian wrote:
> On Thu, May 21, 2015 at 5:56 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Thu, May 21, 2015 at 05:35:02AM -0700, Stephane Eranian wrote:
> >> > Commit e979121b1b15 ("perf/x86/intel: Implement cross-HT corruption
> >> > bug workaround") made the situation much worse by actually setting the
> >> > event->hw.constraint value to NULL, so when validation and actual
> >> > scheduling interact we get NULL ptr derefs.
> >> >
> >>
> >> But  x86_schedule_events() does reset the hw.constraint for each invocation:
> >>
> >>            c = x86_pmu.get_event_constraints(cpuc, i, cpuc->event_list[i]);
> >>            hwc->constraint = c;
> >
> > Yes, so if you have:
> >
> >         validate_group()
> >
> >                 hwc->constraint = c;
> >
> Ok, you get that because validate_group() invokes x6_schedule_events() but
> on the fake_cpuc. This on fake_cpuc->event_list[]->hwc.
> 
> >         <context switch>
> >
> >                 c = hwc->constraint;
> >
> > The second c might not be the first.
> And where does this assignment come from?

That's a read. The <context switch> can include a call to
x86_schedule_events().

> For actual scheduling, we are using the actual cpuc, not fake_cpuc.
> Validate_group() does not modify global cpuc state. Or am I missing
> something?

No, but x86_schedule_events() can modify event state, which is the failure.



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 13:09         ` Peter Zijlstra
@ 2015-05-21 13:18           ` Stephane Eranian
  2015-05-21 13:20             ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-21 13:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, May 21, 2015 at 6:09 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, May 21, 2015 at 06:07:20AM -0700, Stephane Eranian wrote:
>> On Thu, May 21, 2015 at 5:56 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > On Thu, May 21, 2015 at 05:35:02AM -0700, Stephane Eranian wrote:
>> >> > Commit e979121b1b15 ("perf/x86/intel: Implement cross-HT corruption
>> >> > bug workaround") made the situation much worse by actually setting the
>> >> > event->hw.constraint value to NULL, so when validation and actual
>> >> > scheduling interact we get NULL ptr derefs.
>> >> >
>> >>
>> >> But  x86_schedule_events() does reset the hw.constraint for each invocation:
>> >>
>> >>            c = x86_pmu.get_event_constraints(cpuc, i, cpuc->event_list[i]);
>> >>            hwc->constraint = c;
>> >
>> > Yes, so if you have:
>> >
>> >         validate_group()
>> >
>> >                 hwc->constraint = c;
>> >
>> Ok, you get that because validate_group() invokes x6_schedule_events() but
>> on the fake_cpuc. This on fake_cpuc->event_list[]->hwc.
>>
>> >         <context switch>
>> >
>> >                 c = hwc->constraint;
>> >
>> > The second c might not be the first.
>> And where does this assignment come from?
>
> That's a read. The <context switch> can include a call to
> x86_schedule_events().
Yes, but x86_schedule_events() never reads the constraint
without setting it again first.

>
>> For actual scheduling, we are using the actual cpuc, not fake_cpuc.
>> Validate_group() does not modify global cpuc state. Or am I missing
>> something?
>
> No, but x86_schedule_event() can modify event state, which is the fail.
>
Yes, it does modify the cpuc->event_list[]->hwc, because it is used as a
cache for *EACH* invocation of the function. It is irrelevant outside the
function.

>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 13:18           ` Stephane Eranian
@ 2015-05-21 13:20             ` Peter Zijlstra
  2015-05-21 13:27               ` Stephane Eranian
  0 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 13:20 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, May 21, 2015 at 06:18:15AM -0700, Stephane Eranian wrote:
> Yes, it does modify the cpuc->event_list[]->hwc, because it is used as a
> cache for *EACH* invocation of the function. It is irrelevant outside the
> function.

Yes, but the problem is that they _nest_.

So you get to use the one from the nested call in the outer call.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 08/10] perf/x86: Remove pointless tests
  2015-05-21 11:17 ` [PATCH 08/10] perf/x86: Remove pointless tests Peter Zijlstra
@ 2015-05-21 13:24   ` Stephane Eranian
  0 siblings, 0 replies; 56+ messages in thread
From: Stephane Eranian @ 2015-05-21 13:24 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Thu, May 21, 2015 at 4:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> Both intel_commit_scheduling() and intel_get_excl_constraints() test
> for cntr < 0.
>
> The only way that can happen (aside from a bug) is through
> validate_event(), however that is already captured by the
> cpuc->is_fake test.
>
You are saying that we can never get called with assign[i] = -1.
I believe this is correct. The test was left over from debugging.

> So remove these tests and simplify the code.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/kernel/cpu/perf_event_intel.c |   14 ++++----------
>  1 file changed, 4 insertions(+), 10 deletions(-)
>
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1955,12 +1955,10 @@ static void intel_commit_scheduling(stru
>
>         lockdep_assert_held(&excl_cntrs->lock);
>
> -       if (cntr >= 0) {
> -               if (c->flags & PERF_X86_EVENT_EXCL)
> -                       xl->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
> -               else
> -                       xl->init_state[cntr] = INTEL_EXCL_SHARED;
> -       }
> +       if (c->flags & PERF_X86_EVENT_EXCL)
> +               xl->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
> +       else
> +               xl->init_state[cntr] = INTEL_EXCL_SHARED;
>  }
>
>  static void
> @@ -2028,10 +2026,6 @@ intel_get_excl_constraints(struct cpu_hw
>         if (!(c->flags & PERF_X86_EVENT_DYNAMIC)) {
>                 struct event_constraint *cx;
>
> -               /* sanity check */
> -               if (idx < 0)
> -                       return &emptyconstraint;
> -
>                 /*
>                  * grab pre-allocated constraint entry
>                  */
>
>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 13:20             ` Peter Zijlstra
@ 2015-05-21 13:27               ` Stephane Eranian
  2015-05-21 13:29                 ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-21 13:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, May 21, 2015 at 6:20 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, May 21, 2015 at 06:18:15AM -0700, Stephane Eranian wrote:
>> Yes, it does modify the cpuc->event_list[]->hwc, because it is used as a
>> cache for *EACH* invocation of the function. It is irrelevant outside the
>> function.
>
> Yes, but the problem is that they _nest_.
>
> So you get to use the one from the nested call in the outer call.

You mean x86_schedule_events() calls x86_schedule_events()?
Or are you talking about a preemption while executing x86_schedule_events()?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 13:27               ` Stephane Eranian
@ 2015-05-21 13:29                 ` Peter Zijlstra
  2015-05-21 13:36                   ` Stephane Eranian
  0 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 13:29 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, 2015-05-21 at 06:27 -0700, Stephane Eranian wrote:
> Or are you talking about a preemption while executing x86_schedule_events()?

That.

And we can of course cure that with an earlier patch I sent; but I find it
a much simpler rule to just never allow modifying global state for
validation.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 03/10] perf/x86: Correct local vs remote sibling state
  2015-05-21 11:17 ` [PATCH 03/10] perf/x86: Correct local vs remote sibling state Peter Zijlstra
@ 2015-05-21 13:31   ` Stephane Eranian
  2015-05-21 14:10     ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-21 13:31 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Thu, May 21, 2015 at 4:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> For some obscure reason the current code accounts the current SMT
> thread's state on the remote thread and reads the remote's state on
> the local SMT thread.
>
> While internally consistent, and 'correct', it's pointless confusion we
> can do without.
>
> Flip them the right way around.
>
So you are changing the logic here from:

  * EXCLUSIVE: sibling counter measuring exclusive event
  * SHARED   : sibling counter measuring non-exclusive event
  * UNUSED   : sibling counter unused

to:

   * EXCLUSIVE: current thread is using an exclusive event
   * SHARED: current thread is using a non-exclusive event
   * UNUSED: current thread is not using this counter

I am okay with this just need to make sure there were no
assumptions made about that. I will look.

> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/kernel/cpu/perf_event_intel.c |   69 +++++++++++++--------------------
>  1 file changed, 28 insertions(+), 41 deletions(-)
>
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1903,9 +1903,8 @@ static void
>  intel_start_scheduling(struct cpu_hw_events *cpuc)
>  {
>         struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
> -       struct intel_excl_states *xl, *xlo;
> +       struct intel_excl_states *xl;
>         int tid = cpuc->excl_thread_id;
> -       int o_tid = 1 - tid; /* sibling thread */
>
>         /*
>          * nothing needed if in group validation mode
> @@ -1919,7 +1918,6 @@ intel_start_scheduling(struct cpu_hw_eve
>         if (!excl_cntrs)
>                 return;
>
> -       xlo = &excl_cntrs->states[o_tid];
>         xl = &excl_cntrs->states[tid];
>
>         xl->sched_started = true;
> @@ -1932,18 +1930,17 @@ intel_start_scheduling(struct cpu_hw_eve
>         raw_spin_lock(&excl_cntrs->lock);
>
>         /*
> -        * save initial state of sibling thread
> +        * Save a copy of our state to work on.
>          */
> -       memcpy(xlo->init_state, xlo->state, sizeof(xlo->init_state));
> +       memcpy(xl->init_state, xl->state, sizeof(xl->init_state));
>  }
>
>  static void
>  intel_stop_scheduling(struct cpu_hw_events *cpuc)
>  {
>         struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
> -       struct intel_excl_states *xl, *xlo;
> +       struct intel_excl_states *xl;
>         int tid = cpuc->excl_thread_id;
> -       int o_tid = 1 - tid; /* sibling thread */
>
>         /*
>          * nothing needed if in group validation mode
> @@ -1956,13 +1953,12 @@ intel_stop_scheduling(struct cpu_hw_even
>         if (!excl_cntrs)
>                 return;
>
> -       xlo = &excl_cntrs->states[o_tid];
>         xl = &excl_cntrs->states[tid];
>
>         /*
> -        * make new sibling thread state visible
> +        * Commit the working state.
>          */
> -       memcpy(xlo->state, xlo->init_state, sizeof(xlo->state));
> +       memcpy(xl->state, xl->init_state, sizeof(xl->state));
>
>         xl->sched_started = false;
>         /*
> @@ -1977,10 +1973,9 @@ intel_get_excl_constraints(struct cpu_hw
>  {
>         struct event_constraint *cx;
>         struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
> -       struct intel_excl_states *xl, *xlo;
> -       int is_excl, i;
> +       struct intel_excl_states *xlo;
>         int tid = cpuc->excl_thread_id;
> -       int o_tid = 1 - tid; /* alternate */
> +       int is_excl, i;
>
>         /*
>          * validating a group does not require
> @@ -1994,18 +1989,6 @@ intel_get_excl_constraints(struct cpu_hw
>          */
>         if (!excl_cntrs)
>                 return c;
> -       /*
> -        * event requires exclusive counter access
> -        * across HT threads
> -        */
> -       is_excl = c->flags & PERF_X86_EVENT_EXCL;
> -
> -       /*
> -        * xl = state of current HT
> -        * xlo = state of sibling HT
> -        */
> -       xl = &excl_cntrs->states[tid];
> -       xlo = &excl_cntrs->states[o_tid];
>
>         cx = c;
>
> @@ -2049,6 +2032,17 @@ intel_get_excl_constraints(struct cpu_hw
>          */
>
>         /*
> +        * state of sibling HT
> +        */
> +       xlo = &excl_cntrs->states[tid ^ 1];
> +
> +       /*
> +        * event requires exclusive counter access
> +        * across HT threads
> +        */
> +       is_excl = c->flags & PERF_X86_EVENT_EXCL;
> +
> +       /*
>          * Modify static constraint with current dynamic
>          * state of thread
>          *
> @@ -2062,14 +2056,14 @@ intel_get_excl_constraints(struct cpu_hw
>                  * our corresponding counter cannot be used
>                  * regardless of our event
>                  */
> -               if (xl->state[i] == INTEL_EXCL_EXCLUSIVE)
> +               if (xlo->state[i] == INTEL_EXCL_EXCLUSIVE)
>                         __clear_bit(i, cx->idxmsk);
>                 /*
>                  * if measuring an exclusive event, sibling
>                  * measuring non-exclusive, then counter cannot
>                  * be used
>                  */
> -               if (is_excl && xl->state[i] == INTEL_EXCL_SHARED)
> +               if (is_excl && xlo->state[i] == INTEL_EXCL_SHARED)
>                         __clear_bit(i, cx->idxmsk);
>         }
>
> @@ -2119,10 +2113,9 @@ static void intel_put_excl_constraints(s
>  {
>         struct hw_perf_event *hwc = &event->hw;
>         struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
> -       struct intel_excl_states *xlo, *xl;
> -       unsigned long flags = 0; /* keep compiler happy */
>         int tid = cpuc->excl_thread_id;
> -       int o_tid = 1 - tid;
> +       struct intel_excl_states *xl;
> +       unsigned long flags = 0; /* keep compiler happy */
>
>         /*
>          * nothing needed if in group validation mode
> @@ -2136,7 +2129,6 @@ static void intel_put_excl_constraints(s
>                 return;
>
>         xl = &excl_cntrs->states[tid];
> -       xlo = &excl_cntrs->states[o_tid];
>
>         /*
>          * put_constraint may be called from x86_schedule_events()
> @@ -2151,7 +2143,7 @@ static void intel_put_excl_constraints(s
>          * counter state as unused now
>          */
>         if (hwc->idx >= 0)
> -               xlo->state[hwc->idx] = INTEL_EXCL_UNUSED;
> +               xl->state[hwc->idx] = INTEL_EXCL_UNUSED;
>
>         if (!xl->sched_started)
>                 raw_spin_unlock_irqrestore(&excl_cntrs->lock, flags);
> @@ -2190,16 +2182,12 @@ static void intel_commit_scheduling(stru
>  {
>         struct intel_excl_cntrs *excl_cntrs = cpuc->excl_cntrs;
>         struct event_constraint *c = cpuc->event_constraint[idx];
> -       struct intel_excl_states *xlo, *xl;
> +       struct intel_excl_states *xl;
>         int tid = cpuc->excl_thread_id;
> -       int o_tid = 1 - tid;
> -       int is_excl;
>
>         if (cpuc->is_fake || !c)
>                 return;
>
> -       is_excl = c->flags & PERF_X86_EVENT_EXCL;
> -
>         if (!(c->flags & PERF_X86_EVENT_DYNAMIC))
>                 return;
>
> @@ -2209,15 +2197,14 @@ static void intel_commit_scheduling(stru
>                 return;
>
>         xl = &excl_cntrs->states[tid];
> -       xlo = &excl_cntrs->states[o_tid];
>
>         WARN_ON_ONCE(!raw_spin_is_locked(&excl_cntrs->lock));
>
>         if (cntr >= 0) {
> -               if (is_excl)
> -                       xlo->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
> +               if (c->flags & PERF_X86_EVENT_EXCL)
> +                       xl->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
>                 else
> -                       xlo->init_state[cntr] = INTEL_EXCL_SHARED;
> +                       xl->init_state[cntr] = INTEL_EXCL_SHARED;
>         }
>  }
>
>
>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 13:29                 ` Peter Zijlstra
@ 2015-05-21 13:36                   ` Stephane Eranian
  2015-05-21 14:03                     ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-21 13:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, May 21, 2015 at 6:29 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, 2015-05-21 at 06:27 -0700, Stephane Eranian wrote:
>> Or are you talking about a preemption while executing x86_schedule_events()?
>
> That.
>
> And we can of course cure that by an earlier patch I send; but I find it
> a much simpler rule to just never allow modifying global state for
> validation.

I can see  validation being preempted, but not the context switch code path.
Is that what you are talking about?

You are saying validate_group() is in the middle of x86_schedule_events()
using fake_cpuc, when it gets preempted. The context switch code when it loads
the new thread's PMU state calls x86_schedule_events() which modifies the
cpuc->event_list[]->hwc. But this is cpuc vs. fake_cpuc again. So yes, the calls
nest but they do not touch the same state. And when you eventually come back
to validate_group() you are back to using the fake_cpuc. So I am still not clear
on how the corruption can happen.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 09/10] perf/x86: Remove intel_excl_states::init_state
  2015-05-21 11:17 ` [PATCH 09/10] perf/x86: Remove intel_excl_states::init_state Peter Zijlstra
@ 2015-05-21 13:39   ` Stephane Eranian
  2015-05-21 14:12     ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-21 13:39 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Thu, May 21, 2015 at 4:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> For some obscure reason intel_{start,stop}_scheduling() copy the HT
> state to an intermediate array. This would make sense if we ever were
> to make changes to it which we'd have to discard.
>
> Except we don't. By the time we call intel_commit_scheduling() we're;
> as the name implies; committed to them. We'll never back out.
>
Well, I remember there was a reason why this state was introduced. I thought
we could do without it initially but had to add it to solve a problem.
We do backtrack in the scheduling algorithm because of the incremental
transaction between the generic layer and the x86 layer. I'll have to
remember the exact case where this happens.

Now, it may be that because you changed the logic of the xl vs. xlo, this saved
state is not needed anymore. I'll have to look at that some more.


> So the intermediate array is pointless, modify the state in place and
> kill the extra array.
>
> And remove the pointless array initialization: INTEL_EXCL_UNUSED == 0.
>
> Note; all is serialized by intel_excl_cntr::lock.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/kernel/cpu/perf_event.c       |    1 -
>  arch/x86/kernel/cpu/perf_event.h       |    1 -
>  arch/x86/kernel/cpu/perf_event_intel.c |   22 ++--------------------
>  3 files changed, 2 insertions(+), 22 deletions(-)
>
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -861,7 +861,6 @@ int x86_schedule_events(struct cpu_hw_ev
>         }
>
>         if (!assign || unsched) {
> -
>                 for (i = 0; i < n; i++) {
>                         e = cpuc->event_list[i];
>                         /*
> --- a/arch/x86/kernel/cpu/perf_event.h
> +++ b/arch/x86/kernel/cpu/perf_event.h
> @@ -132,7 +132,6 @@ enum intel_excl_state_type {
>  };
>
>  struct intel_excl_states {
> -       enum intel_excl_state_type init_state[X86_PMC_IDX_MAX];
>         enum intel_excl_state_type state[X86_PMC_IDX_MAX];
>         bool sched_started; /* true if scheduling has started */
>  };
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1927,11 +1927,6 @@ intel_start_scheduling(struct cpu_hw_eve
>          * makes scheduling appear as a transaction
>          */
>         raw_spin_lock(&excl_cntrs->lock);
> -
> -       /*
> -        * Save a copy of our state to work on.
> -        */
> -       memcpy(xl->init_state, xl->state, sizeof(xl->init_state));
>  }
>
>  static void intel_commit_scheduling(struct cpu_hw_events *cpuc, int idx, int cntr)
> @@ -1955,9 +1950,9 @@ static void intel_commit_scheduling(stru
>         lockdep_assert_held(&excl_cntrs->lock);
>
>         if (c->flags & PERF_X86_EVENT_EXCL)
> -               xl->init_state[cntr] = INTEL_EXCL_EXCLUSIVE;
> +               xl->state[cntr] = INTEL_EXCL_EXCLUSIVE;
>         else
> -               xl->init_state[cntr] = INTEL_EXCL_SHARED;
> +               xl->state[cntr] = INTEL_EXCL_SHARED;
>  }
>
>  static void
> @@ -1980,11 +1975,6 @@ intel_stop_scheduling(struct cpu_hw_even
>
>         xl = &excl_cntrs->states[tid];
>
> -       /*
> -        * Commit the working state.
> -        */
> -       memcpy(xl->state, xl->init_state, sizeof(xl->state));
> -
>         xl->sched_started = false;
>         /*
>          * release shared state lock (acquired in intel_start_scheduling())
> @@ -2509,19 +2499,11 @@ struct intel_shared_regs *allocate_share
>  static struct intel_excl_cntrs *allocate_excl_cntrs(int cpu)
>  {
>         struct intel_excl_cntrs *c;
> -       int i;
>
>         c = kzalloc_node(sizeof(struct intel_excl_cntrs),
>                          GFP_KERNEL, cpu_to_node(cpu));
>         if (c) {
>                 raw_spin_lock_init(&c->lock);
> -               for (i = 0; i < X86_PMC_IDX_MAX; i++) {
> -                       c->states[0].state[i] = INTEL_EXCL_UNUSED;
> -                       c->states[0].init_state[i] = INTEL_EXCL_UNUSED;
> -
> -                       c->states[1].state[i] = INTEL_EXCL_UNUSED;
> -                       c->states[1].init_state[i] = INTEL_EXCL_UNUSED;
> -               }
>                 c->core_id = -1;
>         }
>         return c;
>
>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 13:36                   ` Stephane Eranian
@ 2015-05-21 14:03                     ` Peter Zijlstra
  2015-05-21 15:11                       ` Stephane Eranian
  0 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 14:03 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, 2015-05-21 at 06:36 -0700, Stephane Eranian wrote:
> On Thu, May 21, 2015 at 6:29 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Thu, 2015-05-21 at 06:27 -0700, Stephane Eranian wrote:
> >> Or are you talking about a preemption while executing x86_schedule_events()?
> >
> > That.
> >
> > And we can of course cure that by an earlier patch I send; but I find it
> > a much simpler rule to just never allow modifying global state for
> > validation.
> 
> I can see  validation being preempted, but not the context switch code path.
> Is that what you are talking about?
> 
> You are saying validate_group() is in the middle of x86_schedule_events()
> using fake_cpuc, when it gets preempted. The context switch code when it loads
> the new thread's PMU state calls x86_schedule_events() which modifies the
> cpuc->event_list[]->hwc. But this is cpuc vs. fake_cpuc again. So yes, the calls
> nest but they do not touch the same state.

They both touch event->hw.constraint.

>  And when you eventually come back
> to validate_group() you are back to using the fake_cpuc. So I am still not clear
> on how the corruption can happen.

validate_group()
  x86_schedule_events()
    event->hw.constraint = c; # store

     <context switch>
       perf_task_event_sched_in()
         ...
           x86_schedule_events();
             event->hw.constraint = c2; # store

             ...

             put_event_constraints(event); # assume failure to schedule
               intel_put_event_constraints()
                 event->hw.constraint = NULL;

      <context switch end>

    c = event->hw.constraint; # read -> NULL

    if (!test_bit(hwc->idx, c->idxmsk)) # <- *BOOM* NULL deref


This in particular is possible when the event in question is a cpu-wide
event and group-leader, where the validate_group() tries to add an event
to the group.
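
A deterministic user-space sketch of that interleaving may help; the types and
names below are simplified stand-ins, and the preemption is simulated by a
direct call rather than a real context switch:

        #include <stdio.h>

        struct event_constraint { unsigned long idxmsk; };
        struct hw_perf_event { struct event_constraint *constraint; }; /* shared field */

        static struct event_constraint c1 = { .idxmsk = 0x0f };
        static struct event_constraint c2 = { .idxmsk = 0xf0 };

        /* Stand-in for the context-switch path: it schedules the same event,
         * fails, and puts the constraint back, leaving the shared field NULL. */
        static void sched_in_and_fail(struct hw_perf_event *hw)
        {
                hw->constraint = &c2;   /* get_event_constraints(): store     */
                hw->constraint = NULL;  /* put_event_constraints() on failure */
        }

        /* Stand-in for validate_group(): caches a constraint in the shared
         * field, gets "preempted", then reads the field back. */
        static void validate_group(struct hw_perf_event *hw)
        {
                struct event_constraint *c;

                hw->constraint = &c1;   /* store */
                sched_in_and_fail(hw);  /* <context switch> */
                c = hw->constraint;     /* read -> NULL */

                if (!c)
                        printf("would dereference NULL here\n");
                else
                        printf("idxmsk = %#lx\n", c->idxmsk);
        }

        int main(void)
        {
                struct hw_perf_event hw = { .constraint = NULL };

                validate_group(&hw);
                return 0;
        }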


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 03/10] perf/x86: Correct local vs remote sibling state
  2015-05-21 13:31   ` Stephane Eranian
@ 2015-05-21 14:10     ` Peter Zijlstra
  0 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 14:10 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Thu, May 21, 2015 at 06:31:25AM -0700, Stephane Eranian wrote:
> On Thu, May 21, 2015 at 4:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > For some obscure reason the current code accounts the current SMT
> > thread's state on the remote thread and reads the remote's state on
> > the local SMT thread.
> >
> > While internally consistent, and 'correct' its pointless confusion we
> > can do without.
> >
> > Flip them the right way around.
> >
> So you are changing the logic here from:
> 
>   * EXCLUSIVE: sibling counter measuring exclusive event
>   * SHARED   : sibling counter measuring non-exclusive event
>   * UNUSED   : sibling counter unused
> 
> to:
> 
>    * EXCLUSIVE: current thread is using an exclusive event
>    * SHARED: current thread is using a non-exclusive event
>    * UNUSED: current thread is not using this counter
> 
> I am okay with this just need to make sure there were no
> assumptions made about that. I will look.

Right; and when we construct the constraint mask we look at the other
one too. So both on the update and the read side I flipped things
around.

And that is really the only thing that matters, that you look at the
other sibling's thread state when constructing that mask. And that's
kept invariant.
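
In other words, after the flip the local thread only ever writes its own slot,
and the constraint mask is built purely from the sibling's slot, selected with
tid ^ 1. A small sketch of that indexing, with simplified types and
hypothetical helper names:

        #include <stdio.h>

        #define NCOUNTERS 4

        enum excl_state { INTEL_EXCL_UNUSED, INTEL_EXCL_SHARED, INTEL_EXCL_EXCLUSIVE };

        struct excl_states { enum excl_state state[NCOUNTERS]; };
        struct excl_cntrs  { struct excl_states states[2]; };  /* one slot per HT thread */

        /* The local thread records what it put on a counter in its own slot ... */
        static void commit(struct excl_cntrs *c, int tid, int cntr, int is_excl)
        {
                c->states[tid].state[cntr] = is_excl ? INTEL_EXCL_EXCLUSIVE
                                                     : INTEL_EXCL_SHARED;
        }

        /* ... and only the sibling's slot matters when building the constraint mask. */
        static int counter_usable(struct excl_cntrs *c, int tid, int cntr, int want_excl)
        {
                enum excl_state sib = c->states[tid ^ 1].state[cntr];

                if (sib == INTEL_EXCL_EXCLUSIVE)
                        return 0;               /* sibling holds it exclusively */
                if (want_excl && sib == INTEL_EXCL_SHARED)
                        return 0;               /* we would corrupt the sibling */
                return 1;
        }

        int main(void)
        {
                struct excl_cntrs c = { 0 };

                commit(&c, 0, 2, 1);    /* thread 0 takes counter 2 exclusively */

                printf("thread 1, counter 2 usable: %d\n", counter_usable(&c, 1, 2, 0));
                printf("thread 1, counter 3 usable: %d\n", counter_usable(&c, 1, 3, 0));
                return 0;
        }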

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 09/10] perf/x86: Remove intel_excl_states::init_state
  2015-05-21 13:39   ` Stephane Eranian
@ 2015-05-21 14:12     ` Peter Zijlstra
  0 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 14:12 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Thu, May 21, 2015 at 06:39:29AM -0700, Stephane Eranian wrote:
> On Thu, May 21, 2015 at 4:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > For some obscure reason intel_{start,stop}_scheduling() copy the HT
> > state to an intermediate array. This would make sense if we ever were
> > to make changes to it which we'd have to discard.
> >
> > Except we don't. By the time we call intel_commit_scheduling() we're;
> > as the name implies; committed to them. We'll never back out.
> >
> Well, I remember there was a reason why this state was introduced. I thought
> we could do it without initially but had to add it to solve a problem.
> We do backtrack
> in the scheduling algorithm because of the tincremental transaction between the
> generic layer and the x86 layer. I'll have to remember the exact case where this
> happens.
> 
> Now, it may be that because you changed the logic of the xl vs. xlo, this saved
> state is not needed anymore. I'll have to look at that some more.

Another hint that it's pointless is that stop does an unconditional copy back.

So no matter what we did, all changes are always published.
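
Put differently: under the lock, copying state into a scratch buffer, modifying
the scratch, and unconditionally copying it back is equivalent to modifying
state in place, which is all the patch does. A tiny sketch of that equivalence
(illustrative only, simplified types):

        #include <stdio.h>
        #include <string.h>

        #define N 4

        static void modify(int *s)              /* stands in for commit_scheduling() */
        {
                s[2] = 1;
        }

        int main(void)
        {
                int direct[N] = { 0 };
                int staged[N] = { 0 };
                int scratch[N];

                /* start: copy in; commit: modify scratch; stop: copy back unconditionally */
                memcpy(scratch, staged, sizeof(scratch));
                modify(scratch);
                memcpy(staged, scratch, sizeof(staged));

                /* same effect as modifying in place */
                modify(direct);

                printf("identical: %d\n", memcmp(direct, staged, sizeof(direct)) == 0);
                return 0;
        }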

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 11:17 ` [PATCH 01/10] perf,x86: Fix event/group validation Peter Zijlstra
  2015-05-21 12:35   ` Stephane Eranian
@ 2015-05-21 14:53   ` Peter Zijlstra
  2015-05-21 15:42     ` Stephane Eranian
  2015-08-21 20:31   ` Sasha Levin
  2 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-21 14:53 UTC (permalink / raw)
  To: mingo
  Cc: vincent.weaver, eranian, jolsa, kan.liang, linux-kernel,
	Andrew Hunter, Maria Dimakopoulou

On Thu, May 21, 2015 at 01:17:11PM +0200, Peter Zijlstra wrote:
> --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> @@ -706,9 +706,9 @@ void intel_pmu_pebs_disable(struct perf_
>  
>  	cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
>  
> -	if (event->hw.constraint->flags & PERF_X86_EVENT_PEBS_LDLAT)
> +	if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
>  		cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
> -	else if (event->hw.constraint->flags & PERF_X86_EVENT_PEBS_ST)
> +	else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
>  		cpuc->pebs_enabled &= ~(1ULL << 63);
>  
>  	if (cpuc->enabled)

This btw look like an independent bug.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 14:03                     ` Peter Zijlstra
@ 2015-05-21 15:11                       ` Stephane Eranian
  2015-05-22  6:49                         ` Ingo Molnar
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-21 15:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, May 21, 2015 at 7:03 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, 2015-05-21 at 06:36 -0700, Stephane Eranian wrote:
>> On Thu, May 21, 2015 at 6:29 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > On Thu, 2015-05-21 at 06:27 -0700, Stephane Eranian wrote:
>> >> Or are you talking about a preemption while executing x86_schedule_events()?
>> >
>> > That.
>> >
>> > And we can of course cure that by an earlier patch I send; but I find it
>> > a much simpler rule to just never allow modifying global state for
>> > validation.
>>
>> I can see  validation being preempted, but not the context switch code path.
>> Is that what you are talking about?
>>
>> You are saying validate_group() is in the middle of x86_schedule_events()
>> using fake_cpuc, when it gets preempted. The context switch code when it loads
>> the new thread's PMU state calls x86_schedule_events() which modifies the
>> cpuc->event_list[]->hwc. But this is cpuc vs. fake_cpuc again. So yes, the calls
>> nest but they do not touch the same state.
>
> They both touch event->hw->constraint.
>
>>  And when you eventually come back
>> to validate_group() you are back to using the fake_cpuc. So I am still not clear
>> on how the corruption can happen.
>
> validate_group()
>   x86_schedule_events()
>     event->hw.constraint = c; # store
>
>      <context switch>
>        perf_task_event_sched_in()
>          ...
>            x86_schedule_events();
>              event->hw.constraint = c2; # store
>
>              ...
>
>              put_event_constraints(event); # assume failure to schedule
>                intel_put_event_constraints()
>                  event->hw.constraint = NULL;
>
>       <context switch end>
>
>     c = event->hw.constraint; # read -> NULL
>
>     if (!test_bit(hwc->idx, c->idxmsk)) # <- *BOOM* NULL deref
>
>
> This in particular is possible when the event in question is a cpu-wide
> event and group-leader, where the validate_group() tries to add an event
> to the group.
>
Ok, I think I get it now. It is not related to fake_cpuc vs. cpuc, it is related
to the fact that the constraint is cached in the event struct itself and that
one is shared between validate_group() and x86_schedule_events() because
cpu_hw_events->event_list[] is an array of pointers to events and not an array
of events.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 14:53   ` Peter Zijlstra
@ 2015-05-21 15:42     ` Stephane Eranian
  0 siblings, 0 replies; 56+ messages in thread
From: Stephane Eranian @ 2015-05-21 15:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, May 21, 2015 at 7:53 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, May 21, 2015 at 01:17:11PM +0200, Peter Zijlstra wrote:
>> --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
>> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
>> @@ -706,9 +706,9 @@ void intel_pmu_pebs_disable(struct perf_
>>
>>       cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
>>
>> -     if (event->hw.constraint->flags & PERF_X86_EVENT_PEBS_LDLAT)
>> +     if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
>>               cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
>> -     else if (event->hw.constraint->flags & PERF_X86_EVENT_PEBS_ST)
>> +     else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
>>               cpuc->pebs_enabled &= ~(1ULL << 63);
>>
>>       if (cpuc->enabled)
>
> This btw look like an independent bug.

You mean, referring to event->hw.constraint?

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 15:11                       ` Stephane Eranian
@ 2015-05-22  6:49                         ` Ingo Molnar
  2015-05-22  9:26                           ` Stephane Eranian
  0 siblings, 1 reply; 56+ messages in thread
From: Ingo Molnar @ 2015-05-22  6:49 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Peter Zijlstra, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou


* Stephane Eranian <eranian@google.com> wrote:

> On Thu, May 21, 2015 at 7:03 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Thu, 2015-05-21 at 06:36 -0700, Stephane Eranian wrote:
> >> On Thu, May 21, 2015 at 6:29 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >> > On Thu, 2015-05-21 at 06:27 -0700, Stephane Eranian wrote:
> >> >> Or are you talking about a preemption while executing x86_schedule_events()?
> >> >
> >> > That.
> >> >
> >> > And we can of course cure that by an earlier patch I send; but I find it
> >> > a much simpler rule to just never allow modifying global state for
> >> > validation.
> >>
> >> I can see  validation being preempted, but not the context switch code path.
> >> Is that what you are talking about?
> >>
> >> You are saying validate_group() is in the middle of x86_schedule_events()
> >> using fake_cpuc, when it gets preempted. The context switch code when it loads
> >> the new thread's PMU state calls x86_schedule_events() which modifies the
> >> cpuc->event_list[]->hwc. But this is cpuc vs. fake_cpuc again. So yes, the calls
> >> nest but they do not touch the same state.
> >
> > They both touch event->hw->constraint.
> >
> >>  And when you eventually come back
> >> to validate_group() you are back to using the fake_cpuc. So I am still not clear
> >> on how the corruption can happen.
> >
> > validate_group()
> >   x86_schedule_events()
> >     event->hw.constraint = c; # store
> >
> >      <context switch>
> >        perf_task_event_sched_in()
> >          ...
> >            x86_schedule_events();
> >              event->hw.constraint = c2; # store
> >
> >              ...
> >
> >              put_event_constraints(event); # assume failure to schedule
> >                intel_put_event_constraints()
> >                  event->hw.constraint = NULL;
> >
> >       <context switch end>
> >
> >     c = event->hw.constraint; # read -> NULL
> >
> >     if (!test_bit(hwc->idx, c->idxmsk)) # <- *BOOM* NULL deref
> >
> >
> > This in particular is possible when the event in question is a cpu-wide
> > event and group-leader, where the validate_group() tries to add an event
> > to the group.
>
> Ok, I think I get it now. It is not related to fake_cpuc vs. cpuc, 
> it is related to the fact that the constraint is cached in the event 
> struct itself and that one is shared between validate_group() and 
> x86_schedule_events() because cpu_hw_event->event_list[] is an array 
> of pointers to events and not an array of events.

Btw., comments and the code structure should be greatly enhanced to 
make all that very clear and hard to mess up.

A month ago perf became fuzzing-proof, and now that's down the drain 
again...

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-22  6:49                         ` Ingo Molnar
@ 2015-05-22  9:26                           ` Stephane Eranian
  2015-05-22  9:46                             ` Ingo Molnar
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-22  9:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou

On Thu, May 21, 2015 at 11:49 PM, Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Stephane Eranian <eranian@google.com> wrote:
>
> > On Thu, May 21, 2015 at 7:03 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > > On Thu, 2015-05-21 at 06:36 -0700, Stephane Eranian wrote:
> > >> On Thu, May 21, 2015 at 6:29 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > >> > On Thu, 2015-05-21 at 06:27 -0700, Stephane Eranian wrote:
> > >> >> Or are you talking about a preemption while executing x86_schedule_events()?
> > >> >
> > >> > That.
> > >> >
> > >> > And we can of course cure that by an earlier patch I send; but I find it
> > >> > a much simpler rule to just never allow modifying global state for
> > >> > validation.
> > >>
> > >> I can see  validation being preempted, but not the context switch code path.
> > >> Is that what you are talking about?
> > >>
> > >> You are saying validate_group() is in the middle of x86_schedule_events()
> > >> using fake_cpuc, when it gets preempted. The context switch code when it loads
> > >> the new thread's PMU state calls x86_schedule_events() which modifies the
> > >> cpuc->event_list[]->hwc. But this is cpuc vs. fake_cpuc again. So yes, the calls
> > >> nest but they do not touch the same state.
> > >
> > > They both touch event->hw->constraint.
> > >
> > >>  And when you eventually come back
> > >> to validate_group() you are back to using the fake_cpuc. So I am still not clear
> > >> on how the corruption can happen.
> > >
> > > validate_group()
> > >   x86_schedule_events()
> > >     event->hw.constraint = c; # store
> > >
> > >      <context switch>
> > >        perf_task_event_sched_in()
> > >          ...
> > >            x86_schedule_events();
> > >              event->hw.constraint = c2; # store
> > >
> > >              ...
> > >
> > >              put_event_constraints(event); # assume failure to schedule
> > >                intel_put_event_constraints()
> > >                  event->hw.constraint = NULL;
> > >
> > >       <context switch end>
> > >
> > >     c = event->hw.constraint; # read -> NULL
> > >
> > >     if (!test_bit(hwc->idx, c->idxmsk)) # <- *BOOM* NULL deref
> > >
> > >
> > > This in particular is possible when the event in question is a cpu-wide
> > > event and group-leader, where the validate_group() tries to add an event
> > > to the group.
> >
> > Ok, I think I get it now. It is not related to fake_cpuc vs. cpuc,
> > it is related to the fact that the constraint is cached in the event
> > struct itself and that one is shared between validate_group() and
> > x86_schedule_events() because cpu_hw_event->event_list[] is an array
> > of pointers to events and not an array of events.
>
> Btw., comments and the code structure should be greatly enhanced to
> make all that very clear and hard to mess up.
>
Peter and I will clean this up.

>
> A month ago perf became fuzzing-proof, and now that's down the drain
> again...
>
It will be fixed this week.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-22  9:26                           ` Stephane Eranian
@ 2015-05-22  9:46                             ` Ingo Molnar
  0 siblings, 0 replies; 56+ messages in thread
From: Ingo Molnar @ 2015-05-22  9:46 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Peter Zijlstra, Vince Weaver, Jiri Olsa, Liang, Kan, LKML,
	Andrew Hunter, Maria Dimakopoulou


* Stephane Eranian <eranian@google.com> wrote:

> On Thu, May 21, 2015 at 11:49 PM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Stephane Eranian <eranian@google.com> wrote:
> >
> > > On Thu, May 21, 2015 at 7:03 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > > > On Thu, 2015-05-21 at 06:36 -0700, Stephane Eranian wrote:
> > > >> On Thu, May 21, 2015 at 6:29 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > > >> > On Thu, 2015-05-21 at 06:27 -0700, Stephane Eranian wrote:
> > > >> >> Or are you talking about a preemption while executing x86_schedule_events()?
> > > >> >
> > > >> > That.
> > > >> >
> > > >> > And we can of course cure that by an earlier patch I send; but I find it
> > > >> > a much simpler rule to just never allow modifying global state for
> > > >> > validation.
> > > >>
> > > >> I can see  validation being preempted, but not the context switch code path.
> > > >> Is that what you are talking about?
> > > >>
> > > >> You are saying validate_group() is in the middle of x86_schedule_events()
> > > >> using fake_cpuc, when it gets preempted. The context switch code when it loads
> > > >> the new thread's PMU state calls x86_schedule_events() which modifies the
> > > >> cpuc->event_list[]->hwc. But this is cpuc vs. fake_cpuc again. So yes, the calls
> > > >> nest but they do not touch the same state.
> > > >
> > > > They both touch event->hw->constraint.
> > > >
> > > >>  And when you eventually come back
> > > >> to validate_group() you are back to using the fake_cpuc. So I am still not clear
> > > >> on how the corruption can happen.
> > > >
> > > > validate_group()
> > > >   x86_schedule_events()
> > > >     event->hw.constraint = c; # store
> > > >
> > > >      <context switch>
> > > >        perf_task_event_sched_in()
> > > >          ...
> > > >            x86_schedule_events();
> > > >              event->hw.constraint = c2; # store
> > > >
> > > >              ...
> > > >
> > > >              put_event_constraints(event); # assume failure to schedule
> > > >                intel_put_event_constraints()
> > > >                  event->hw.constraint = NULL;
> > > >
> > > >       <context switch end>
> > > >
> > > >     c = event->hw.constraint; # read -> NULL
> > > >
> > > >     if (!test_bit(hwc->idx, c->idxmsk)) # <- *BOOM* NULL deref
> > > >
> > > >
> > > > This in particular is possible when the event in question is a cpu-wide
> > > > event and group-leader, where the validate_group() tries to add an event
> > > > to the group.
> > >
> > > Ok, I think I get it now. It is not related to fake_cpuc vs. cpuc,
> > > it is related to the fact that the constraint is cached in the event
> > > struct itself and that one is shared between validate_group() and
> > > x86_schedule_events() because cpu_hw_event->event_list[] is an array
> > > of pointers to events and not an array of events.
> >
> > Btw., comments and the code structure should be greatly enhanced 
> > to make all that very clear and hard to mess up.
> >
> Peter and I will clean this up.

Great, thanks!

	Ingo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-21 11:17 ` [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint Peter Zijlstra
@ 2015-05-22 10:04   ` Stephane Eranian
  2015-05-22 11:21     ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-22 10:04 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Thu, May 21, 2015 at 4:17 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
> The (SNB/IVB/HSW) HT bug only affects events that can be programmed
> onto GP counters, therefore we should only limit the number of GP
> counters that can be used per cpu -- iow we should not constrain the
> FP counters.
>
> This patch changes the code such that we only count GP counters.
>
> Reported-by: Vince Weaver <vincent.weaver@maine.edu>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/kernel/cpu/perf_event.c       |   15 +++++++++++++++
>  arch/x86/kernel/cpu/perf_event.h       |    2 --
>  arch/x86/kernel/cpu/perf_event_intel.c |   20 --------------------
>  3 files changed, 15 insertions(+), 22 deletions(-)
>
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -611,6 +611,7 @@ struct sched_state {
>         int     event;          /* event index */
>         int     counter;        /* counter index */
>         int     unassigned;     /* number of events to be assigned left */
> +       int     nr_gp_counters; /* number of GP counters used */
>         unsigned long used[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
>  };
>
> @@ -696,6 +697,20 @@ static bool __perf_sched_find_counter(st
>                                 goto done;
>                 }
>         }
> +
> +       /*
> +        * Do not allow scheduling of more than half the available generic
> +        * counters.
> +        *
> +        * This helps avoid counter starvation of sibling thread by ensuring at
> +        * most half the counters cannot be in exclusive mode. There is no
> +        * designated counters for the limits. Any N/2 counters can be used.
> +        * This helps with events with specific counter constraints.
> +        */
> +       if (is_ht_workaround_enabled() &&
> +           sched->state.nr_gp_counters++ >= x86_pmu.num_counters / 2)
> +               return false;
> +

Has to be > and not >= otherwise:

$ perf stat -a -C 0 -e r81d0,r81d0 -I 1000 sleep 100

Gives you 50% multiplexing when it should be 0% based on my testing.

>         /* Grab the first unused counter starting with idx */
>         idx = sched->state.counter;
>         for_each_set_bit_from(idx, c->idxmsk, INTEL_PMC_IDX_FIXED) {
> --- a/arch/x86/kernel/cpu/perf_event.h
> +++ b/arch/x86/kernel/cpu/perf_event.h
> @@ -134,8 +134,6 @@ enum intel_excl_state_type {
>  struct intel_excl_states {
>         enum intel_excl_state_type init_state[X86_PMC_IDX_MAX];
>         enum intel_excl_state_type state[X86_PMC_IDX_MAX];
> -       int  num_alloc_cntrs;/* #counters allocated */
> -       int  max_alloc_cntrs;/* max #counters allowed */
>         bool sched_started; /* true if scheduling has started */
>  };
>
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1923,7 +1923,6 @@ intel_start_scheduling(struct cpu_hw_eve
>         xl = &excl_cntrs->states[tid];
>
>         xl->sched_started = true;
> -       xl->num_alloc_cntrs = 0;
>         /*
>          * lock shared state until we are done scheduling
>          * in stop_event_scheduling()
> @@ -2008,18 +2007,6 @@ intel_get_excl_constraints(struct cpu_hw
>         xl = &excl_cntrs->states[tid];
>         xlo = &excl_cntrs->states[o_tid];
>
> -       /*
> -        * do not allow scheduling of more than max_alloc_cntrs
> -        * which is set to half the available generic counters.
> -        * this helps avoid counter starvation of sibling thread
> -        * by ensuring at most half the counters cannot be in
> -        * exclusive mode. There is not designated counters for the
> -        * limits. Any N/2 counters can be used. This helps with
> -        * events with specifix counter constraints
> -        */
> -       if (xl->num_alloc_cntrs++ == xl->max_alloc_cntrs)
> -               return &emptyconstraint;
> -
>         cx = c;
>
>         /*
> @@ -2632,8 +2619,6 @@ static void intel_pmu_cpu_starting(int c
>                 cpuc->lbr_sel = &cpuc->shared_regs->regs[EXTRA_REG_LBR];
>
>         if (x86_pmu.flags & PMU_FL_EXCL_CNTRS) {
> -               int h = x86_pmu.num_counters >> 1;
> -
>                 for_each_cpu(i, topology_thread_cpumask(cpu)) {
>                         struct intel_excl_cntrs *c;
>
> @@ -2647,11 +2632,6 @@ static void intel_pmu_cpu_starting(int c
>                 }
>                 cpuc->excl_cntrs->core_id = core_id;
>                 cpuc->excl_cntrs->refcnt++;
> -               /*
> -                * set hard limit to half the number of generic counters
> -                */
> -               cpuc->excl_cntrs->states[0].max_alloc_cntrs = h;
> -               cpuc->excl_cntrs->states[1].max_alloc_cntrs = h;
>         }
>  }
>
>
>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 10:04   ` Stephane Eranian
@ 2015-05-22 11:21     ` Peter Zijlstra
  2015-05-22 11:24       ` Stephane Eranian
  2015-05-22 11:28       ` Peter Zijlstra
  0 siblings, 2 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-22 11:21 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 03:04:45AM -0700, Stephane Eranian wrote:
> > +       if (is_ht_workaround_enabled() &&
> > +           sched->state.nr_gp_counters++ >= x86_pmu.num_counters / 2)
> > +               return false;
> > +
> 
> Has to be > and not >= otherwise:

but it's a post inc, so we'll return: 0, 1, 2, ... With > we'll match
after 3 gp events.

I'll agree it's not working right though.
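
The off-by-one can be seen in isolation with a small counting sketch
(illustrative only; it ignores the scheduler's backtracking and state resets,
so it only shows the raw comparison, not the multiplexing behaviour seen above):

        #include <stdio.h>

        /* How many attempts pass the limit check before it trips? */
        static int admitted(int limit, int strict)
        {
                int nr = 0, ok = 0, attempt;

                for (attempt = 0; attempt < 16; attempt++) {
                        int reject = strict ? (nr++ > limit) : (nr++ >= limit);

                        if (reject)
                                break;
                        ok++;
                }
                return ok;
        }

        int main(void)
        {
                int limit = 2;  /* e.g. num_counters / 2 on a hypothetical 4-counter PMU */

                printf("nr++ >= %d admits %d counters\n", limit, admitted(limit, 0));
                printf("nr++ >  %d admits %d counters\n", limit, admitted(limit, 1));
                return 0;
        }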

FWIW, I currently have the below; which also isn't working right.

It should not enforce the limit when there are no exclusive events being
scheduled.

It also doesn't break uncore scheduling.

---
 arch/x86/kernel/cpu/perf_event.c              |   31 ++++++++++++++++++++++----
 arch/x86/kernel/cpu/perf_event.h              |   10 +++++---
 arch/x86/kernel/cpu/perf_event_intel.c        |   28 ++++++-----------------
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |    2 -
 4 files changed, 43 insertions(+), 28 deletions(-)

--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -611,6 +611,7 @@ struct sched_state {
 	int	event;		/* event index */
 	int	counter;	/* counter index */
 	int	unassigned;	/* number of events to be assigned left */
+	int	nr_gp;		/* number of GP counters used */
 	unsigned long used[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
 };
 
@@ -620,6 +621,7 @@ struct sched_state {
 struct perf_sched {
 	int			max_weight;
 	int			max_events;
+	int			max_gp;
 	struct event_constraint	**constraints;
 	struct sched_state	state;
 	int			saved_states;
@@ -630,13 +632,14 @@ struct perf_sched {
  * Initialize interator that runs through all events and counters.
  */
 static void perf_sched_init(struct perf_sched *sched, struct event_constraint **constraints,
-			    int num, int wmin, int wmax)
+			    int num, int wmin, int wmax, int gpmax)
 {
 	int idx;
 
 	memset(sched, 0, sizeof(*sched));
 	sched->max_events	= num;
 	sched->max_weight	= wmax;
+	sched->max_gp		= gpmax;
 	sched->constraints	= constraints;
 
 	for (idx = 0; idx < num; idx++) {
@@ -696,6 +699,10 @@ static bool __perf_sched_find_counter(st
 				goto done;
 		}
 	}
+
+	if (sched->state.nr_gp++ >= sched->max_gp)
+		return false;
+
 	/* Grab the first unused counter starting with idx */
 	idx = sched->state.counter;
 	for_each_set_bit_from(idx, c->idxmsk, INTEL_PMC_IDX_FIXED) {
@@ -757,11 +764,11 @@ static bool perf_sched_next_event(struct
  * Assign a counter for each event.
  */
 int perf_assign_events(struct event_constraint **constraints, int n,
-			int wmin, int wmax, int *assign)
+			int wmin, int wmax, int gpmax, int *assign)
 {
 	struct perf_sched sched;
 
-	perf_sched_init(&sched, constraints, n, wmin, wmax);
+	perf_sched_init(&sched, constraints, n, wmin, wmax, gpmax);
 
 	do {
 		if (!perf_sched_find_counter(&sched))
@@ -821,8 +828,24 @@ int x86_schedule_events(struct cpu_hw_ev
 
 	/* slow path */
 	if (i != n) {
+		int gpmax = x86_pmu.num_counters / 2;
+
+		/*
+		 * Do not allow scheduling of more than half the available
+		 * generic counters.
+		 *
+		 * This helps avoid counter starvation of sibling thread by
+		 * ensuring at most half the counters cannot be in exclusive
+		 * mode. There is no designated counters for the limits. Any
+		 * N/2 counters can be used. This helps with events with
+		 * specific counter constraints.
+		 */
+		if (is_ht_workaround_enabled() && !cpuc->is_fake &&
+		    READ_ONCE(cpuc->excl_cntrs->exclusive_present))
+			gpmax /= 2;
+
 		unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
-					     wmax, assign);
+					     wmax, gpmax, assign);
 	}
 
 	/*
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -134,8 +134,6 @@ enum intel_excl_state_type {
 struct intel_excl_states {
 	enum intel_excl_state_type init_state[X86_PMC_IDX_MAX];
 	enum intel_excl_state_type state[X86_PMC_IDX_MAX];
-	int  num_alloc_cntrs;/* #counters allocated */
-	int  max_alloc_cntrs;/* max #counters allowed */
 	bool sched_started; /* true if scheduling has started */
 };
 
@@ -144,6 +142,11 @@ struct intel_excl_cntrs {
 
 	struct intel_excl_states states[2];
 
+	union {
+		u16	has_exclusive[2];
+		u32	exclusive_present;
+	};
+
 	int		refcnt;		/* per-core: #HT threads */
 	unsigned	core_id;	/* per-core: core id */
 };
@@ -176,6 +179,7 @@ struct cpu_hw_events {
 	struct perf_event	*event_list[X86_PMC_IDX_MAX]; /* in enabled order */
 	struct event_constraint	*event_constraint[X86_PMC_IDX_MAX];
 
+	int			n_excl; /* the number of exclusive events */
 
 	unsigned int		group_flag;
 	int			is_fake;
@@ -719,7 +723,7 @@ static inline void __x86_pmu_enable_even
 void x86_pmu_enable_all(int added);
 
 int perf_assign_events(struct event_constraint **constraints, int n,
-			int wmin, int wmax, int *assign);
+			int wmin, int wmax, int gpmax, int *assign);
 int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign);
 
 void x86_pmu_stop(struct perf_event *event, int flags);
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1923,7 +1923,6 @@ intel_start_scheduling(struct cpu_hw_eve
 	xl = &excl_cntrs->states[tid];
 
 	xl->sched_started = true;
-	xl->num_alloc_cntrs = 0;
 	/*
 	 * lock shared state until we are done scheduling
 	 * in stop_event_scheduling()
@@ -2000,6 +1999,10 @@ intel_get_excl_constraints(struct cpu_hw
 	 * across HT threads
 	 */
 	is_excl = c->flags & PERF_X86_EVENT_EXCL;
+	if (is_excl) {
+		if (!cpuc->n_excl++)
+			WRITE_ONCE(excl_cntrs->has_exclusive[tid], 1);
+	}
 
 	/*
 	 * xl = state of current HT
@@ -2008,18 +2011,6 @@ intel_get_excl_constraints(struct cpu_hw
 	xl = &excl_cntrs->states[tid];
 	xlo = &excl_cntrs->states[o_tid];
 
-	/*
-	 * do not allow scheduling of more than max_alloc_cntrs
-	 * which is set to half the available generic counters.
-	 * this helps avoid counter starvation of sibling thread
-	 * by ensuring at most half the counters cannot be in
-	 * exclusive mode. There is not designated counters for the
-	 * limits. Any N/2 counters can be used. This helps with
-	 * events with specifix counter constraints
-	 */
-	if (xl->num_alloc_cntrs++ == xl->max_alloc_cntrs)
-		return &emptyconstraint;
-
 	cx = c;
 
 	/*
@@ -2150,6 +2141,10 @@ static void intel_put_excl_constraints(s
 
 	xl = &excl_cntrs->states[tid];
 	xlo = &excl_cntrs->states[o_tid];
+	if (hwc->flags & PERF_X86_EVENT_EXCL) {
+		if (!--cpuc->n_excl)
+			WRITE_ONCE(excl_cntrs->has_exclusive[tid], 0);
+	}
 
 	/*
 	 * put_constraint may be called from x86_schedule_events()
@@ -2632,8 +2627,6 @@ static void intel_pmu_cpu_starting(int c
 		cpuc->lbr_sel = &cpuc->shared_regs->regs[EXTRA_REG_LBR];
 
 	if (x86_pmu.flags & PMU_FL_EXCL_CNTRS) {
-		int h = x86_pmu.num_counters >> 1;
-
 		for_each_cpu(i, topology_thread_cpumask(cpu)) {
 			struct intel_excl_cntrs *c;
 
@@ -2647,11 +2640,6 @@ static void intel_pmu_cpu_starting(int c
 		}
 		cpuc->excl_cntrs->core_id = core_id;
 		cpuc->excl_cntrs->refcnt++;
-		/*
-		 * set hard limit to half the number of generic counters
-		 */
-		cpuc->excl_cntrs->states[0].max_alloc_cntrs = h;
-		cpuc->excl_cntrs->states[1].max_alloc_cntrs = h;
 	}
 }
 
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -394,7 +394,7 @@ static int uncore_assign_events(struct i
 	/* slow path */
 	if (i != n)
 		ret = perf_assign_events(box->event_constraint, n,
-					 wmin, wmax, assign);
+					 wmin, wmax, n, assign);
 
 	if (!assign || ret) {
 		for (i = 0; i < n; i++)


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 11:21     ` Peter Zijlstra
@ 2015-05-22 11:24       ` Stephane Eranian
  2015-05-22 11:28       ` Peter Zijlstra
  1 sibling, 0 replies; 56+ messages in thread
From: Stephane Eranian @ 2015-05-22 11:24 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 4:21 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, May 22, 2015 at 03:04:45AM -0700, Stephane Eranian wrote:
>> > +       if (is_ht_workaround_enabled() &&
>> > +           sched->state.nr_gp_counters++ >= x86_pmu.num_counters / 2)
>> > +               return false;
>> > +
>>
>> Has to be > and not >= otherwise:
>
> but its a post inc, so we'll return: 0, 1, 2, ... With > we'll match
> after 3 gp events.
>
> I'll agree its not working right though.
>
> FWIW, I currently have the below; which also isn't working right.
>
> It should not enforce the limit when there's no exclusive events being
> scheduled.
>
> It also doesn't break uncore scheduling.
>
> ---
>  arch/x86/kernel/cpu/perf_event.c              |   31 ++++++++++++++++++++++----
>  arch/x86/kernel/cpu/perf_event.h              |   10 +++++---
>  arch/x86/kernel/cpu/perf_event_intel.c        |   28 ++++++-----------------
>  arch/x86/kernel/cpu/perf_event_intel_uncore.c |    2 -
>  4 files changed, 43 insertions(+), 28 deletions(-)
>
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -611,6 +611,7 @@ struct sched_state {
>         int     event;          /* event index */
>         int     counter;        /* counter index */
>         int     unassigned;     /* number of events to be assigned left */
> +       int     nr_gp;          /* number of GP counters used */
>         unsigned long used[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
>  };
>
> @@ -620,6 +621,7 @@ struct sched_state {
>  struct perf_sched {
>         int                     max_weight;
>         int                     max_events;
> +       int                     max_gp;
>         struct event_constraint **constraints;
>         struct sched_state      state;
>         int                     saved_states;
> @@ -630,13 +632,14 @@ struct perf_sched {
>   * Initialize interator that runs through all events and counters.
>   */
>  static void perf_sched_init(struct perf_sched *sched, struct event_constraint **constraints,
> -                           int num, int wmin, int wmax)
> +                           int num, int wmin, int wmax, int gpmax)
>  {
>         int idx;
>
>         memset(sched, 0, sizeof(*sched));
>         sched->max_events       = num;
>         sched->max_weight       = wmax;
> +       sched->max_gp           = gpmax;
>         sched->constraints      = constraints;
>
>         for (idx = 0; idx < num; idx++) {
> @@ -696,6 +699,10 @@ static bool __perf_sched_find_counter(st
>                                 goto done;
>                 }
>         }
> +
> +       if (sched->state.nr_gp++ >= sched->max_gp)
> +               return false;
> +
>         /* Grab the first unused counter starting with idx */
>         idx = sched->state.counter;
>         for_each_set_bit_from(idx, c->idxmsk, INTEL_PMC_IDX_FIXED) {
> @@ -757,11 +764,11 @@ static bool perf_sched_next_event(struct
>   * Assign a counter for each event.
>   */
>  int perf_assign_events(struct event_constraint **constraints, int n,
> -                       int wmin, int wmax, int *assign)
> +                       int wmin, int wmax, int gpmax, int *assign)
>  {
>         struct perf_sched sched;
>
> -       perf_sched_init(&sched, constraints, n, wmin, wmax);
> +       perf_sched_init(&sched, constraints, n, wmin, wmax, gpmax);
>
>         do {
>                 if (!perf_sched_find_counter(&sched))
> @@ -821,8 +828,24 @@ int x86_schedule_events(struct cpu_hw_ev
>
>         /* slow path */
>         if (i != n) {
> +               int gpmax = x86_pmu.num_counters / 2;
> +
> +               /*
> +                * Do not allow scheduling of more than half the available
> +                * generic counters.
> +                *
> +                * This helps avoid counter starvation of sibling thread by
> +                * ensuring at most half the counters cannot be in exclusive
> +                * mode. There is no designated counters for the limits. Any
> +                * N/2 counters can be used. This helps with events with
> +                * specific counter constraints.
> +                */
> +               if (is_ht_workaround_enabled() && !cpuc->is_fake &&
> +                   READ_ONCE(cpuc->excl_cntrs->exclusive_present))
> +                       gpmax /= 2;
> +
That's num_counters / 4!
I think you meant: int gpmax = x86_pmu.num_counters;

>                 unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
> -                                            wmax, assign);
> +                                            wmax, gpmax, assign);
>         }
>
>         /*
> --- a/arch/x86/kernel/cpu/perf_event.h
> +++ b/arch/x86/kernel/cpu/perf_event.h
> @@ -134,8 +134,6 @@ enum intel_excl_state_type {
>  struct intel_excl_states {
>         enum intel_excl_state_type init_state[X86_PMC_IDX_MAX];
>         enum intel_excl_state_type state[X86_PMC_IDX_MAX];
> -       int  num_alloc_cntrs;/* #counters allocated */
> -       int  max_alloc_cntrs;/* max #counters allowed */
>         bool sched_started; /* true if scheduling has started */
>  };
>
> @@ -144,6 +142,11 @@ struct intel_excl_cntrs {
>
>         struct intel_excl_states states[2];
>
> +       union {
> +               u16     has_exclusive[2];
> +               u32     exclusive_present;
> +       };
> +
>         int             refcnt;         /* per-core: #HT threads */
>         unsigned        core_id;        /* per-core: core id */
>  };
> @@ -176,6 +179,7 @@ struct cpu_hw_events {
>         struct perf_event       *event_list[X86_PMC_IDX_MAX]; /* in enabled order */
>         struct event_constraint *event_constraint[X86_PMC_IDX_MAX];
>
> +       int                     n_excl; /* the number of exclusive events */
>
>         unsigned int            group_flag;
>         int                     is_fake;
> @@ -719,7 +723,7 @@ static inline void __x86_pmu_enable_even
>  void x86_pmu_enable_all(int added);
>
>  int perf_assign_events(struct event_constraint **constraints, int n,
> -                       int wmin, int wmax, int *assign);
> +                       int wmin, int wmax, int gpmax, int *assign);
>  int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign);
>
>  void x86_pmu_stop(struct perf_event *event, int flags);
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1923,7 +1923,6 @@ intel_start_scheduling(struct cpu_hw_eve
>         xl = &excl_cntrs->states[tid];
>
>         xl->sched_started = true;
> -       xl->num_alloc_cntrs = 0;
>         /*
>          * lock shared state until we are done scheduling
>          * in stop_event_scheduling()
> @@ -2000,6 +1999,10 @@ intel_get_excl_constraints(struct cpu_hw
>          * across HT threads
>          */
>         is_excl = c->flags & PERF_X86_EVENT_EXCL;
> +       if (is_excl) {
> +               if (!cpuc->n_excl++)
> +                       WRITE_ONCE(excl_cntrs->has_exclusive[tid], 1);
> +       }
>
>         /*
>          * xl = state of current HT
> @@ -2008,18 +2011,6 @@ intel_get_excl_constraints(struct cpu_hw
>         xl = &excl_cntrs->states[tid];
>         xlo = &excl_cntrs->states[o_tid];
>
> -       /*
> -        * do not allow scheduling of more than max_alloc_cntrs
> -        * which is set to half the available generic counters.
> -        * this helps avoid counter starvation of sibling thread
> -        * by ensuring at most half the counters cannot be in
> -        * exclusive mode. There is not designated counters for the
> -        * limits. Any N/2 counters can be used. This helps with
> -        * events with specifix counter constraints
> -        */
> -       if (xl->num_alloc_cntrs++ == xl->max_alloc_cntrs)
> -               return &emptyconstraint;
> -
>         cx = c;
>
>         /*
> @@ -2150,6 +2141,10 @@ static void intel_put_excl_constraints(s
>
>         xl = &excl_cntrs->states[tid];
>         xlo = &excl_cntrs->states[o_tid];
> +       if (hwc->flags & PERF_X86_EVENT_EXCL) {
> +               if (!--cpuc->n_excl)
> +                       WRITE_ONCE(excl_cntrs->has_exclusive[tid], 0);
> +       }
>
>         /*
>          * put_constraint may be called from x86_schedule_events()
> @@ -2632,8 +2627,6 @@ static void intel_pmu_cpu_starting(int c
>                 cpuc->lbr_sel = &cpuc->shared_regs->regs[EXTRA_REG_LBR];
>
>         if (x86_pmu.flags & PMU_FL_EXCL_CNTRS) {
> -               int h = x86_pmu.num_counters >> 1;
> -
>                 for_each_cpu(i, topology_thread_cpumask(cpu)) {
>                         struct intel_excl_cntrs *c;
>
> @@ -2647,11 +2640,6 @@ static void intel_pmu_cpu_starting(int c
>                 }
>                 cpuc->excl_cntrs->core_id = core_id;
>                 cpuc->excl_cntrs->refcnt++;
> -               /*
> -                * set hard limit to half the number of generic counters
> -                */
> -               cpuc->excl_cntrs->states[0].max_alloc_cntrs = h;
> -               cpuc->excl_cntrs->states[1].max_alloc_cntrs = h;
>         }
>  }
>
> --- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> @@ -394,7 +394,7 @@ static int uncore_assign_events(struct i
>         /* slow path */
>         if (i != n)
>                 ret = perf_assign_events(box->event_constraint, n,
> -                                        wmin, wmax, assign);
> +                                        wmin, wmax, n, assign);
>
>         if (!assign || ret) {
>                 for (i = 0; i < n; i++)


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 11:21     ` Peter Zijlstra
  2015-05-22 11:24       ` Stephane Eranian
@ 2015-05-22 11:28       ` Peter Zijlstra
  2015-05-22 12:35         ` Stephane Eranian
  1 sibling, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-22 11:28 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 01:21:46PM +0200, Peter Zijlstra wrote:
> @@ -821,8 +828,24 @@ int x86_schedule_events(struct cpu_hw_ev
>  
>  	/* slow path */
>  	if (i != n) {
> +		int gpmax = x86_pmu.num_counters / 2;
> +
> +		/*
> +		 * Do not allow scheduling of more than half the available
> +		 * generic counters.
> +		 *
> +		 * This helps avoid counter starvation of sibling thread by
> +		 * ensuring at most half the counters cannot be in exclusive
> +		 * mode. There is no designated counters for the limits. Any
> +		 * N/2 counters can be used. This helps with events with
> +		 * specific counter constraints.
> +		 */
> +		if (is_ht_workaround_enabled() && !cpuc->is_fake &&
> +		    READ_ONCE(cpuc->excl_cntrs->exclusive_present))
> +			gpmax /= 2;
> +
>  		unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
> -					     wmax, assign);
> +					     wmax, gpmax, assign);
>  	}
>  

Hmm, I divide by 2 twice.. no wonder it doesn't quite work as expected.
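
Spelled out with a made-up counter count (nothing below is a real PMU
width, it is just the arithmetic): starting gpmax at num_counters / 2 and
then halving it again when an exclusive event is present caps scheduling
at N/4 instead of the intended N/2.

#include <stdio.h>

int main(void)
{
	int num_counters = 8;		/* hypothetical number of generic counters */
	int gpmax = num_counters / 2;	/* first halving: N/2 = 4 */

	gpmax /= 2;			/* second halving: N/4 = 2 */
	printf("effective cap: %d of %d generic counters\n", gpmax, num_counters);
	return 0;
}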


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 11:28       ` Peter Zijlstra
@ 2015-05-22 12:35         ` Stephane Eranian
  2015-05-22 12:53           ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-22 12:35 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 4:28 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, May 22, 2015 at 01:21:46PM +0200, Peter Zijlstra wrote:
>> @@ -821,8 +828,24 @@ int x86_schedule_events(struct cpu_hw_ev
>>
>>       /* slow path */
>>       if (i != n) {
>> +             int gpmax = x86_pmu.num_counters / 2;
>> +
>> +             /*
>> +              * Do not allow scheduling of more than half the available
>> +              * generic counters.
>> +              *
>> +              * This helps avoid counter starvation of sibling thread by
>> +              * ensuring at most half the counters cannot be in exclusive
>> +              * mode. There is no designated counters for the limits. Any
>> +              * N/2 counters can be used. This helps with events with
>> +              * specific counter constraints.
>> +              */
>> +             if (is_ht_workaround_enabled() && !cpuc->is_fake &&
>> +                 READ_ONCE(cpuc->excl_cntrs->exclusive_present))
>> +                     gpmax /= 2;
>> +
>>               unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
>> -                                          wmax, assign);
>> +                                          wmax, gpmax, assign);
>>       }
>>
>
> Hmm, I divide by 2 twice.. no wonder it doesn't quite work as expected.

Yes, that's what I said. Other problem is, with no watchdog, measuring
a non-corrupting event is still multiplexing when more than 2 instances
are passed:
 $ perf stat -a -C 0 -e r20cc,r20cc,r20cc,r20cc -I 1000 sleep 100

I get 50% scheduling, only 2 out of 4 events scheduled at any time.

There is nothing running on the sibling thread, so it should let me run with 4
instances as per your patch.


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 12:35         ` Stephane Eranian
@ 2015-05-22 12:53           ` Peter Zijlstra
  2015-05-22 12:55             ` Stephane Eranian
  0 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-22 12:53 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 05:35:14AM -0700, Stephane Eranian wrote:
> On Fri, May 22, 2015 at 4:28 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, May 22, 2015 at 01:21:46PM +0200, Peter Zijlstra wrote:
> >> @@ -821,8 +828,24 @@ int x86_schedule_events(struct cpu_hw_ev
> >>
> >>       /* slow path */
> >>       if (i != n) {
> >> +             int gpmax = x86_pmu.num_counters / 2;
> >> +
> >> +             /*
> >> +              * Do not allow scheduling of more than half the available
> >> +              * generic counters.
> >> +              *
> >> +              * This helps avoid counter starvation of sibling thread by
> >> +              * ensuring at most half the counters cannot be in exclusive
> >> +              * mode. There is no designated counters for the limits. Any
> >> +              * N/2 counters can be used. This helps with events with
> >> +              * specific counter constraints.
> >> +              */
> >> +             if (is_ht_workaround_enabled() && !cpuc->is_fake &&
> >> +                 READ_ONCE(cpuc->excl_cntrs->exclusive_present))
> >> +                     gpmax /= 2;
> >> +
> >>               unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
> >> -                                          wmax, assign);
> >> +                                          wmax, gpmax, assign);
> >>       }
> >>
> >
> > Hmm, I divide by 2 twice.. no wonder it doesn't quite work as expected.
> 
> Yes, that's what I said. Other problem is, with no watchdog, measuring
> a non-corrupting event is still multiplexing when more than 2 instances
> are passed:
>  $ perf stat -a -C 0 -e r20cc,r20cc,r20cc,r20cc -I 1000 sleep 100
> 
> I get 50% scheduling, only 2 out of 4 events scheduled at any time.
> 
> There is nothing running on the sibling thread, so it should let me run with 4
> instances as per your patch.

Ah, I limited it to n/2 if either of the siblings has an exclusive event
on.

In any case, there's at least two more bugs in that patch that I know
of, one of which I've fixed, the other I know how to fix.

Will re-post in a bit once I got it tested.


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 12:53           ` Peter Zijlstra
@ 2015-05-22 12:55             ` Stephane Eranian
  2015-05-22 12:59               ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-22 12:55 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 5:53 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, May 22, 2015 at 05:35:14AM -0700, Stephane Eranian wrote:
>> On Fri, May 22, 2015 at 4:28 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > On Fri, May 22, 2015 at 01:21:46PM +0200, Peter Zijlstra wrote:
>> >> @@ -821,8 +828,24 @@ int x86_schedule_events(struct cpu_hw_ev
>> >>
>> >>       /* slow path */
>> >>       if (i != n) {
>> >> +             int gpmax = x86_pmu.num_counters / 2;
>> >> +
>> >> +             /*
>> >> +              * Do not allow scheduling of more than half the available
>> >> +              * generic counters.
>> >> +              *
>> >> +              * This helps avoid counter starvation of sibling thread by
>> >> +              * ensuring at most half the counters cannot be in exclusive
>> >> +              * mode. There is no designated counters for the limits. Any
>> >> +              * N/2 counters can be used. This helps with events with
>> >> +              * specific counter constraints.
>> >> +              */
>> >> +             if (is_ht_workaround_enabled() && !cpuc->is_fake &&
>> >> +                 READ_ONCE(cpuc->excl_cntrs->exclusive_present))
>> >> +                     gpmax /= 2;
>> >> +
>> >>               unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
>> >> -                                          wmax, assign);
>> >> +                                          wmax, gpmax, assign);
>> >>       }
>> >>
>> >
>> > Hmm, I divide by 2 twice.. no wonder it doesn't quite work as expected.
>>
>> Yes, that's what I said. Other problem is, with no watchdog, measuring
>> a non-corrupting event is still multiplexing when more than 2 instances
>> are passed:
>>  $ perf stat -a -C 0 -e r20cc,r20cc,r20cc,r20cc -I 1000 sleep 100
>>
>> I get 50% scheduling, only 2 out of 4 events scheduled at any time.
>>
>> There is nothing running on the sibling thread, so it should let me run with 4
>> instances as per your patch.
>
> Ah, I limited it to n/2 if either of the siblings has an exclusive event
> on.
>
But in my test case above, there was no exclusive event at all on either
sibling and yet it limited the non-excl to 2.

> In any case, there's at least two more bugs in that patch that I know
> of, one of which I've fixed, the other I know how to fix.
>
> Will re-post in a bit once I got it tested.


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 12:55             ` Stephane Eranian
@ 2015-05-22 12:59               ` Peter Zijlstra
  2015-05-22 13:05                 ` Stephane Eranian
  2015-05-22 13:10                 ` Peter Zijlstra
  0 siblings, 2 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-22 12:59 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 05:55:32AM -0700, Stephane Eranian wrote:
> On Fri, May 22, 2015 at 5:53 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, May 22, 2015 at 05:35:14AM -0700, Stephane Eranian wrote:
> >> On Fri, May 22, 2015 at 4:28 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >> > On Fri, May 22, 2015 at 01:21:46PM +0200, Peter Zijlstra wrote:
> >> >> @@ -821,8 +828,24 @@ int x86_schedule_events(struct cpu_hw_ev
> >> >>
> >> >>       /* slow path */
> >> >>       if (i != n) {
> >> >> +             int gpmax = x86_pmu.num_counters / 2;
> >> >> +
> >> >> +             /*
> >> >> +              * Do not allow scheduling of more than half the available
> >> >> +              * generic counters.
> >> >> +              *
> >> >> +              * This helps avoid counter starvation of sibling thread by
> >> >> +              * ensuring at most half the counters cannot be in exclusive
> >> >> +              * mode. There is no designated counters for the limits. Any
> >> >> +              * N/2 counters can be used. This helps with events with
> >> >> +              * specific counter constraints.
> >> >> +              */
> >> >> +             if (is_ht_workaround_enabled() && !cpuc->is_fake &&
> >> >> +                 READ_ONCE(cpuc->excl_cntrs->exclusive_present))
> >> >> +                     gpmax /= 2;
> >> >> +
> >> >>               unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
> >> >> -                                          wmax, assign);
> >> >> +                                          wmax, gpmax, assign);
> >> >>       }
> >> >>
> >> >
> >> > Hmm, I divide by 2 twice.. no wonder it doesn't quite work as expected.
> >>
> >> Yes, that's what I said. Other problem is, with no watchdog, measuring
> >> a non-corrupting event is still multiplexing when more than 2 instances
> >> are passed:
> >>  $ perf stat -a -C 0 -e r20cc,r20cc,r20cc,r20cc -I 1000 sleep 100
> >>
> >> I get 50% scheduling, only 2 out of 4 events scheduled at any time.
> >>
> >> There is nothing running on the sibling thread, so it should let me run with 4
> >> instances as per your patch.
> >
> > Ah, I limited it to n/2 if either of the siblings has an exclusive event
> > on.
> >
> But in my test case above, there was no exclusive event at all on either
> sibling and yet it limited the non-excl to 2.

I bet you tested the exclusive events earlier :-) Its one of the bugs,
the n_excl accounting is leaking up. Once !0 it stays !0.


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 12:59               ` Peter Zijlstra
@ 2015-05-22 13:05                 ` Stephane Eranian
  2015-05-22 13:07                   ` Stephane Eranian
  2015-05-22 13:25                   ` Peter Zijlstra
  2015-05-22 13:10                 ` Peter Zijlstra
  1 sibling, 2 replies; 56+ messages in thread
From: Stephane Eranian @ 2015-05-22 13:05 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 5:59 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, May 22, 2015 at 05:55:32AM -0700, Stephane Eranian wrote:
>> On Fri, May 22, 2015 at 5:53 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > On Fri, May 22, 2015 at 05:35:14AM -0700, Stephane Eranian wrote:
>> >> On Fri, May 22, 2015 at 4:28 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> >> > On Fri, May 22, 2015 at 01:21:46PM +0200, Peter Zijlstra wrote:
>> >> >> @@ -821,8 +828,24 @@ int x86_schedule_events(struct cpu_hw_ev
>> >> >>
>> >> >>       /* slow path */
>> >> >>       if (i != n) {
>> >> >> +             int gpmax = x86_pmu.num_counters / 2;
>> >> >> +
>> >> >> +             /*
>> >> >> +              * Do not allow scheduling of more than half the available
>> >> >> +              * generic counters.
>> >> >> +              *
>> >> >> +              * This helps avoid counter starvation of sibling thread by
>> >> >> +              * ensuring at most half the counters cannot be in exclusive
>> >> >> +              * mode. There is no designated counters for the limits. Any
>> >> >> +              * N/2 counters can be used. This helps with events with
>> >> >> +              * specific counter constraints.
>> >> >> +              */
>> >> >> +             if (is_ht_workaround_enabled() && !cpuc->is_fake &&
>> >> >> +                 READ_ONCE(cpuc->excl_cntrs->exclusive_present))
>> >> >> +                     gpmax /= 2;
>> >> >> +
>> >> >>               unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
>> >> >> -                                          wmax, assign);
>> >> >> +                                          wmax, gpmax, assign);
>> >> >>       }
>> >> >>
>> >> >
>> >> > Hmm, I divide by 2 twice.. no wonder it doesn't quite work as expected.
>> >>
>> >> Yes, that's what I said. Other problem is, with no watchdog, measuring
>> >> a non-corrupting event is still multiplexing when more than 2 instances
>> >> are passed:
>> >>  $ perf stat -a -C 0 -e r20cc,r20cc,r20cc,r20cc -I 1000 sleep 100
>> >>
>> >> I get 50% scheduling, only 2 out of 4 events scheduled at any time.
>> >>
>> >> There is nothing running on the sibling thread, so it should let me run with 4
>> >> instances as per your patch.
>> >
>> > Ah, I limited it to n/2 if either of the siblings has an exclusive event
>> > on.
>> >
>> But in my test case above, there was no exclusive event at all on either
>> sibling and yet it limited the non-excl to 2.
>
> I bet you tested the exclusive events earlier :-) Its one of the bugs,
> the n_excl accounting is leaking up. Once !0 it stays !0.

So you're saying intel_put_excl_constraint() does not do the --n_excl?
Could it be that the flags is not showing PERF_X86_EVENT_EXCL?
Cannot be related to cpuc_fake because you have it in both the ++
and -- functions.


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 13:05                 ` Stephane Eranian
@ 2015-05-22 13:07                   ` Stephane Eranian
  2015-05-22 13:25                     ` Peter Zijlstra
  2015-05-22 13:25                   ` Peter Zijlstra
  1 sibling, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-22 13:07 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 6:05 AM, Stephane Eranian <eranian@google.com> wrote:
> On Fri, May 22, 2015 at 5:59 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> On Fri, May 22, 2015 at 05:55:32AM -0700, Stephane Eranian wrote:
>>> On Fri, May 22, 2015 at 5:53 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>>> > On Fri, May 22, 2015 at 05:35:14AM -0700, Stephane Eranian wrote:
>>> >> On Fri, May 22, 2015 at 4:28 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>>> >> > On Fri, May 22, 2015 at 01:21:46PM +0200, Peter Zijlstra wrote:
>>> >> >> @@ -821,8 +828,24 @@ int x86_schedule_events(struct cpu_hw_ev
>>> >> >>
>>> >> >>       /* slow path */
>>> >> >>       if (i != n) {
>>> >> >> +             int gpmax = x86_pmu.num_counters / 2;
>>> >> >> +
>>> >> >> +             /*
>>> >> >> +              * Do not allow scheduling of more than half the available
>>> >> >> +              * generic counters.
>>> >> >> +              *
>>> >> >> +              * This helps avoid counter starvation of sibling thread by
>>> >> >> +              * ensuring at most half the counters cannot be in exclusive
>>> >> >> +              * mode. There is no designated counters for the limits. Any
>>> >> >> +              * N/2 counters can be used. This helps with events with
>>> >> >> +              * specific counter constraints.
>>> >> >> +              */
>>> >> >> +             if (is_ht_workaround_enabled() && !cpuc->is_fake &&
>>> >> >> +                 READ_ONCE(cpuc->excl_cntrs->exclusive_present))
>>> >> >> +                     gpmax /= 2;
>>> >> >> +
>>> >> >>               unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
>>> >> >> -                                          wmax, assign);
>>> >> >> +                                          wmax, gpmax, assign);
>>> >> >>       }
>>> >> >>
>>> >> >
>>> >> > Hmm, I divide by 2 twice.. no wonder it doesn't quite work as expected.
>>> >>
>>> >> Yes, that's what I said. Other problem is, with no watchdog, measuring
>>> >> a non-corrupting event is still multiplexing when more than 2 instances
>>> >> are passed:
>>> >>  $ perf stat -a -C 0 -e r20cc,r20cc,r20cc,r20cc -I 1000 sleep 100
>>> >>
>>> >> I get 50% scheduling, only 2 out of 4 events scheduled at any time.
>>> >>
>>> >> There is nothing running on the sibling thread, so it should let me run with 4
>>> >> instances as per your patch.
>>> >
>>> > Ah, I limited it to n/2 if either of the siblings has an exclusive event
>>> > on.
>>> >
>>> But in my test case above, there was no exclusive event at all on either
>>> sibling and yet it limited the non-excl to 2.
>>
>> I bet you tested the exclusive events earlier :-) Its one of the bugs,
>> the n_excl accounting is leaking up. Once !0 it stays !0.
>
> So you're saying intel_put_excl_constraint() does not do the --n_excl?
> Could it be that the flags is not showing PERF_X86_EVENT_EXCL?
> Cannot be related to cpuc_fake because you have it in both the ++
> and -- functions.

One other thing I noticed is that the --n_excl needs to be protected by the
excl_cntrs->lock in put_excl_constraints().


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 12:59               ` Peter Zijlstra
  2015-05-22 13:05                 ` Stephane Eranian
@ 2015-05-22 13:10                 ` Peter Zijlstra
  1 sibling, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-22 13:10 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 02:59:08PM +0200, Peter Zijlstra wrote:
> 
> I bet you tested the exclusive events earlier :-) Its one of the bugs,
> the n_excl accounting is leaking up. Once !0 it stays !0.

Ha!, also caused by my first patch.

It looks to be generally working. Let me update the Changelogs on the
lot and send out the stack again.


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 13:05                 ` Stephane Eranian
  2015-05-22 13:07                   ` Stephane Eranian
@ 2015-05-22 13:25                   ` Peter Zijlstra
  1 sibling, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-22 13:25 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 06:05:49AM -0700, Stephane Eranian wrote:
> > I bet you tested the exclusive events earlier :-) Its one of the bugs,
> > the n_excl accounting is leaking up. Once !0 it stays !0.
> 
> So you're saying intel_put_excl_constraint() does not do the --n_excl?

No, it does, but we call get_events_constraints() every time we do
x86_schedule_events(), and put_events_constraints() only on
x86_pmu_del(). This means we call get() much more than we put().

Therefore leak up.
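
One way to picture the imbalance (purely illustrative, with made-up names;
the real paths are the ones named above): the ++ runs on every scheduling
pass while the -- runs only on deletion, so the count can only drift up.
Accounting once per event, e.g. behind a per-event flag, keeps it balanced:

#include <stdio.h>
#include <stdbool.h>

struct fake_event { bool accounted; };

static int n_excl;

static void get_constraints(struct fake_event *e)
{
	if (!e->accounted) {		/* account only the first time */
		e->accounted = true;
		n_excl++;
	}
}

static void put_constraints(struct fake_event *e)
{
	if (e->accounted) {
		e->accounted = false;
		n_excl--;
	}
}

int main(void)
{
	struct fake_event ev = { false };

	for (int i = 0; i < 5; i++)	/* rescheduled five times ... */
		get_constraints(&ev);
	put_constraints(&ev);		/* ... deleted once */

	printf("n_excl = %d\n", n_excl);	/* stays balanced: prints 0 */
	return 0;
}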


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 13:07                   ` Stephane Eranian
@ 2015-05-22 13:25                     ` Peter Zijlstra
  2015-05-22 13:29                       ` Stephane Eranian
  0 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-22 13:25 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 06:07:00AM -0700, Stephane Eranian wrote:
> 
> One other thing I noticed is that the --n_excl needs to be protected by the
> excl_cntrs->lock in put_excl_constraints().

Nah, its strictly per cpu.


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 13:25                     ` Peter Zijlstra
@ 2015-05-22 13:29                       ` Stephane Eranian
  2015-05-22 13:36                         ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-22 13:29 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 6:25 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, May 22, 2015 at 06:07:00AM -0700, Stephane Eranian wrote:
>>
>> One other thing I noticed is that the --n_excl needs to be protected by the
>> excl_cntrs->lock in put_excl_constraints().
>
> Nah, its strictly per cpu.

No. the excl_cntrs struct is pointed to by cpuc but it is shared between the
sibling HT. Otherwise this would not work!


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 13:29                       ` Stephane Eranian
@ 2015-05-22 13:36                         ` Peter Zijlstra
  2015-05-22 13:40                           ` Stephane Eranian
  0 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-22 13:36 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 06:29:47AM -0700, Stephane Eranian wrote:
> On Fri, May 22, 2015 at 6:25 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, May 22, 2015 at 06:07:00AM -0700, Stephane Eranian wrote:
> >>
> >> One other thing I noticed is that the --n_excl needs to be protected by the
> >> excl_cntrs->lock in put_excl_constraints().
> >
> > Nah, its strictly per cpu.
> 
> No. the excl_cntrs struct is pointed to by cpuc but it is shared between the
> sibling HT. Otherwise this would not work!

n_excl is per cpuc, see the trickery with has_exclusive vs
exclusive_present on how I avoid the lock.
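
Reduced to a standalone sketch, the overlay works like this (plain C with
an illustrative type name; in the patch the two fields live in the per-core
intel_excl_cntrs that both siblings point to, and the accesses are wrapped
in WRITE_ONCE()/READ_ONCE()):

#include <stdint.h>
#include <stdio.h>

union excl_present {
	uint16_t has_exclusive[2];	/* one slot per HT sibling (tid 0/1) */
	uint32_t exclusive_present;	/* both slots observed in a single load */
};

int main(void)
{
	union excl_present x = { .exclusive_present = 0 };

	x.has_exclusive[1] = 1;		/* sibling 1 schedules an exclusive event */

	if (x.exclusive_present)	/* non-zero if either sibling has one */
		printf("apply the N/2 generic counter limit\n");
	return 0;
}

Each sibling stores only to its own 16-bit slot, so the flag itself needs
no lock; the wider load just answers "does either side have an exclusive
event".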


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 13:36                         ` Peter Zijlstra
@ 2015-05-22 13:40                           ` Stephane Eranian
  2015-05-22 13:48                             ` Peter Zijlstra
  0 siblings, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-05-22 13:40 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 6:36 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, May 22, 2015 at 06:29:47AM -0700, Stephane Eranian wrote:
>> On Fri, May 22, 2015 at 6:25 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > On Fri, May 22, 2015 at 06:07:00AM -0700, Stephane Eranian wrote:
>> >>
>> >> One other thing I noticed is that the --n_excl needs to be protected by the
>> >> excl_cntrs->lock in put_excl_constraints().
>> >
>> > Nah, its strictly per cpu.
>>
>> No. the excl_cntrs struct is pointed to by cpuc but it is shared between the
>> sibling HT. Otherwise this would not work!
>
> n_excl is per cpuc, see the trickery with has_exclusive vs
> exclusive_present on how I avoid the lock.

Yes, but I believe  you create a store forward penalty with this.
You store 16bits and you load 32 bits on the same cache line.


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 13:40                           ` Stephane Eranian
@ 2015-05-22 13:48                             ` Peter Zijlstra
  2015-05-23  8:26                               ` Ingo Molnar
  0 siblings, 1 reply; 56+ messages in thread
From: Peter Zijlstra @ 2015-05-22 13:48 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, LKML

On Fri, May 22, 2015 at 06:40:49AM -0700, Stephane Eranian wrote:
> On Fri, May 22, 2015 at 6:36 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, May 22, 2015 at 06:29:47AM -0700, Stephane Eranian wrote:
> >> On Fri, May 22, 2015 at 6:25 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >> > On Fri, May 22, 2015 at 06:07:00AM -0700, Stephane Eranian wrote:
> >> >>
> >> >> One other thing I noticed is that the --n_excl needs to be protected by the
> >> >> excl_cntrs->lock in put_excl_constraints().
> >> >
> >> > Nah, its strictly per cpu.
> >>
> >> No. the excl_cntrs struct is pointed to by cpuc but it is shared between the
> >> sibling HT. Otherwise this would not work!
> >
> > n_excl is per cpuc, see the trickery with has_exclusive vs
> > exclusive_present on how I avoid the lock.
> 
> Yes, but I believe  you create a store forward penalty with this.
> You store 16bits and you load 32 bits on the same cache line.

The store and load are fairly well spaced -- the entire scheduling fast
path is in between.

And such a penalty is still cheap compared to locking, no?


* Re: [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint
  2015-05-22 13:48                             ` Peter Zijlstra
@ 2015-05-23  8:26                               ` Ingo Molnar
  0 siblings, 0 replies; 56+ messages in thread
From: Ingo Molnar @ 2015-05-23  8:26 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Stephane Eranian, Vince Weaver, Jiri Olsa, LKML


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, May 22, 2015 at 06:40:49AM -0700, Stephane Eranian wrote:
> > On Fri, May 22, 2015 at 6:36 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > > On Fri, May 22, 2015 at 06:29:47AM -0700, Stephane Eranian wrote:
> > >> On Fri, May 22, 2015 at 6:25 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > >> > On Fri, May 22, 2015 at 06:07:00AM -0700, Stephane Eranian wrote:
> > >> >>
> > >> >> One other thing I noticed is that the --n_excl needs to be protected by the
> > >> >> excl_cntrs->lock in put_excl_constraints().
> > >> >
> > >> > Nah, its strictly per cpu.
> > >>
> > >> No. the excl_cntrs struct is pointed to by cpuc but it is shared between the
> > >> sibling HT. Otherwise this would not work!
> > >
> > > n_excl is per cpuc, see the trickery with has_exclusive vs
> > > exclusive_present on how I avoid the lock.
> > 
> > Yes, but I believe  you create a store forward penalty with this.
> > You store 16bits and you load 32 bits on the same cache line.

Access within the same cacheline does not by itself incur such a penalty:
it only arises if the partial access is to the same word.

> The store and load are fairly well spaced -- the entire scheduling 
> fast path is in between.
> 
> And such a penalty is still cheap compared to locking, no?

The 'penalty' is essentially just a delay in the execution of the 
load, if the store has not finished yet: typically less than 10 
cycles, around 3 cycles on recent uarchs.

So it should not be a big issue if there's indeed so much code between 
them - probably it's not even causing any delay anywhere.

Thanks,

	Ingo


* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-05-21 11:17 ` [PATCH 01/10] perf,x86: Fix event/group validation Peter Zijlstra
  2015-05-21 12:35   ` Stephane Eranian
  2015-05-21 14:53   ` Peter Zijlstra
@ 2015-08-21 20:31   ` Sasha Levin
  2015-09-10  4:48     ` Sasha Levin
  2015-09-10  8:54     ` Stephane Eranian
  2 siblings, 2 replies; 56+ messages in thread
From: Sasha Levin @ 2015-08-21 20:31 UTC (permalink / raw)
  To: Peter Zijlstra, mingo
  Cc: vincent.weaver, eranian, jolsa, kan.liang, linux-kernel,
	Andrew Hunter, Maria Dimakopoulou

On 05/21/2015 07:17 AM, Peter Zijlstra wrote:
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -2106,7 +2106,7 @@ static struct event_constraint *
>  intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>  			    struct perf_event *event)
>  {
> -	struct event_constraint *c1 = event->hw.constraint;
> +	struct event_constraint *c1 = cpuc->event_constraint[idx];
>  	struct event_constraint *c2;

Hey Peter,

I was chasing a memory corruption in this area and I think I found
a possible culprit:

After this patch, In the code above, we'd access "cpuc->event_constraint[idx]"
and read/change memory.

The problem is that a valid value for idx is also -1, which isn't checked
here, so we end up accessing and possibly corrupting memory that isn't ours.


Thanks,
Sasha


* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-08-21 20:31   ` Sasha Levin
@ 2015-09-10  4:48     ` Sasha Levin
  2015-09-10  8:54     ` Stephane Eranian
  1 sibling, 0 replies; 56+ messages in thread
From: Sasha Levin @ 2015-09-10  4:48 UTC (permalink / raw)
  To: Peter Zijlstra, mingo
  Cc: vincent.weaver, eranian, jolsa, kan.liang, linux-kernel,
	Andrew Hunter, Maria Dimakopoulou

Ping?

On 08/21/2015 04:31 PM, Sasha Levin wrote:
> On 05/21/2015 07:17 AM, Peter Zijlstra wrote:
>> --- a/arch/x86/kernel/cpu/perf_event_intel.c
>> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
>> @@ -2106,7 +2106,7 @@ static struct event_constraint *
>>  intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>>  			    struct perf_event *event)
>>  {
>> -	struct event_constraint *c1 = event->hw.constraint;
>> +	struct event_constraint *c1 = cpuc->event_constraint[idx];
>>  	struct event_constraint *c2;
> 
> Hey Peter,
> 
> I was chasing a memory corruption in this area and I think I found
> a possible culprit:
> 
> After this patch, In the code above, we'd access "cpuc->event_constraint[idx]"
> and read/change memory.
> 
> The problem is that a valid value for idx is also -1, which isn't checked
> here, so we end up accessing and possibly corrupting memory that isn't ours.
> 
> 
> Thanks,
> Sasha
> 



* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-08-21 20:31   ` Sasha Levin
  2015-09-10  4:48     ` Sasha Levin
@ 2015-09-10  8:54     ` Stephane Eranian
  2015-09-10 10:01       ` Peter Zijlstra
  1 sibling, 1 reply; 56+ messages in thread
From: Stephane Eranian @ 2015-09-10  8:54 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Peter Zijlstra, Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan,
	LKML, Andrew Hunter, Maria Dimakopoulou

On Fri, Aug 21, 2015 at 1:31 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
>
> On 05/21/2015 07:17 AM, Peter Zijlstra wrote:
> > --- a/arch/x86/kernel/cpu/perf_event_intel.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> > @@ -2106,7 +2106,7 @@ static struct event_constraint *
> >  intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
> >                           struct perf_event *event)
> >  {
> > -     struct event_constraint *c1 = event->hw.constraint;
> > +     struct event_constraint *c1 = cpuc->event_constraint[idx];
> >       struct event_constraint *c2;
>
> Hey Peter,
>
> I was chasing a memory corruption in this area and I think I found
> a possible culprit:
>
> After this patch, In the code above, we'd access "cpuc->event_constraint[idx]"
> and read/change memory.
>
> The problem is that a valid value for idx is also -1, which isn't checked
> here, so we end up accessing and possibly corrupting memory that isn't ours.
>
>
I believe your analysis is correct, the following path will create the problem:

   validate_group()
         validate_event()
             x86_pmu.get_event_constraints(fake_cpuc, -1, event)
             intel_get_event_constraints(cpuc, idx, event)
                 struct event_constraints *c1  = cpuc->event_constraints[idx];

here idx = -1, and the kernel is accessing an invalid memory location.

I think the code could be changed to:

   struct event_constraint *c1 = NULL;
   if (idx > -1)
       c1 = cpuc->event_constraints[idx];

idx is not used in the __intel_get_event_constraints() path if I read
the code correctly.

Peter?



>
> Thanks,
> Sasha


* Re: [PATCH 01/10] perf,x86: Fix event/group validation
  2015-09-10  8:54     ` Stephane Eranian
@ 2015-09-10 10:01       ` Peter Zijlstra
  0 siblings, 0 replies; 56+ messages in thread
From: Peter Zijlstra @ 2015-09-10 10:01 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Sasha Levin, Ingo Molnar, Vince Weaver, Jiri Olsa, Liang, Kan,
	LKML, Andrew Hunter, Maria Dimakopoulou

On Thu, Sep 10, 2015 at 01:54:18AM -0700, Stephane Eranian wrote:
> On Fri, Aug 21, 2015 at 1:31 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
> >
> > On 05/21/2015 07:17 AM, Peter Zijlstra wrote:
> > > --- a/arch/x86/kernel/cpu/perf_event_intel.c
> > > +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> > > @@ -2106,7 +2106,7 @@ static struct event_constraint *
> > >  intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
> > >                           struct perf_event *event)
> > >  {
> > > -     struct event_constraint *c1 = event->hw.constraint;
> > > +     struct event_constraint *c1 = cpuc->event_constraint[idx];
> > >       struct event_constraint *c2;
> >
> > Hey Peter,
> >
> > I was chasing a memory corruption in this area and I think I found
> > a possible culprit:
> >
> > After this patch, In the code above, we'd access "cpuc->event_constraint[idx]"
> > and read/change memory.
> >
> > The problem is that a valid value for idx is also -1, which isn't checked
> > here, so we end up accessing and possibly corrupting memory that isn't ours.
> >
> >
> I believe your analysis is correct, the following path will create the problem:
> 
>    validate_group()
>          validate_event()
>              x86_pmu.get_event_constraints(fake_cpuc, -1, event)
>              intel_get_event_constraints(cpuc, idx, event)
>                  struct event_constraints *c1  = cpuc->event_constraints[idx];
> 
> here idx = -1, and the kernel is accessing an invalid memory location.
> 
> I think the code could be changed to:
> 
>    struct event_constraint *c1 = NULL;
>    if (idx > -1)
>        c1 = cpuc->event_constraints[idx];
> 
> idx is not used in the __intel_get_event_constraints() path if I read
> the code correctly.

I prefer >= 0, but yes that looks about right. I still want to rework
all this fake stuff some time, but we should fix this asap.

Something like so then?

---
Subject: perf, intel: Fix out-of-bound
From: Peter Zijlstra <peterz@infradead.org>
Date: Thu Sep 10 11:58:27 CEST 2015

Sasha reported that we can get here with .idx==-1, and
cpuc->event_constraints unallocated.

Cc: stable@vger.kernel.org
Fixes: b371b5943178 ("perf/x86: Fix event/group validation")
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Suggested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/perf_event_intel.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2316,9 +2316,12 @@ static struct event_constraint *
 intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
 			    struct perf_event *event)
 {
-	struct event_constraint *c1 = cpuc->event_constraint[idx];
+	struct event_constraint *c1 = NULL;
 	struct event_constraint *c2;
 
+	if (idx >= 0) /* fake does < 0 */
+		c1 = cpuc->event_constraint[idx];
+
 	/*
 	 * first time only
 	 * - static constraint: no change across incremental scheduling calls



Thread overview: 56+ messages
2015-05-21 11:17 [PATCH 00/10] Various x86 pmu scheduling patches Peter Zijlstra
2015-05-21 11:17 ` [PATCH 01/10] perf,x86: Fix event/group validation Peter Zijlstra
2015-05-21 12:35   ` Stephane Eranian
2015-05-21 12:56     ` Peter Zijlstra
2015-05-21 13:07       ` Stephane Eranian
2015-05-21 13:09         ` Peter Zijlstra
2015-05-21 13:18           ` Stephane Eranian
2015-05-21 13:20             ` Peter Zijlstra
2015-05-21 13:27               ` Stephane Eranian
2015-05-21 13:29                 ` Peter Zijlstra
2015-05-21 13:36                   ` Stephane Eranian
2015-05-21 14:03                     ` Peter Zijlstra
2015-05-21 15:11                       ` Stephane Eranian
2015-05-22  6:49                         ` Ingo Molnar
2015-05-22  9:26                           ` Stephane Eranian
2015-05-22  9:46                             ` Ingo Molnar
2015-05-21 14:53   ` Peter Zijlstra
2015-05-21 15:42     ` Stephane Eranian
2015-08-21 20:31   ` Sasha Levin
2015-09-10  4:48     ` Sasha Levin
2015-09-10  8:54     ` Stephane Eranian
2015-09-10 10:01       ` Peter Zijlstra
2015-05-21 11:17 ` [PATCH 02/10] perf/x86: Improve HT workaround GP counter constraint Peter Zijlstra
2015-05-22 10:04   ` Stephane Eranian
2015-05-22 11:21     ` Peter Zijlstra
2015-05-22 11:24       ` Stephane Eranian
2015-05-22 11:28       ` Peter Zijlstra
2015-05-22 12:35         ` Stephane Eranian
2015-05-22 12:53           ` Peter Zijlstra
2015-05-22 12:55             ` Stephane Eranian
2015-05-22 12:59               ` Peter Zijlstra
2015-05-22 13:05                 ` Stephane Eranian
2015-05-22 13:07                   ` Stephane Eranian
2015-05-22 13:25                     ` Peter Zijlstra
2015-05-22 13:29                       ` Stephane Eranian
2015-05-22 13:36                         ` Peter Zijlstra
2015-05-22 13:40                           ` Stephane Eranian
2015-05-22 13:48                             ` Peter Zijlstra
2015-05-23  8:26                               ` Ingo Molnar
2015-05-22 13:25                   ` Peter Zijlstra
2015-05-22 13:10                 ` Peter Zijlstra
2015-05-21 11:17 ` [PATCH 03/10] perf/x86: Correct local vs remote sibling state Peter Zijlstra
2015-05-21 13:31   ` Stephane Eranian
2015-05-21 14:10     ` Peter Zijlstra
2015-05-21 11:17 ` [PATCH 04/10] perf/x86: Use lockdep Peter Zijlstra
2015-05-21 11:17 ` [PATCH 05/10] perf/x86: Simplify dynamic constraint code somewhat Peter Zijlstra
2015-05-21 11:17 ` [PATCH 06/10] perf/x86: Make WARNs consistent Peter Zijlstra
2015-05-21 11:17 ` [PATCH 07/10] perf/x86: Move intel_commit_scheduling() Peter Zijlstra
2015-05-21 11:17 ` [PATCH 08/10] perf/x86: Remove pointless tests Peter Zijlstra
2015-05-21 13:24   ` Stephane Eranian
2015-05-21 11:17 ` [PATCH 09/10] perf/x86: Remove intel_excl_states::init_state Peter Zijlstra
2015-05-21 13:39   ` Stephane Eranian
2015-05-21 14:12     ` Peter Zijlstra
2015-05-21 11:17 ` [PATCH 10/10] perf,x86: Simplify logic Peter Zijlstra
2015-05-21 11:48 ` [PATCH 00/10] Various x86 pmu scheduling patches Stephane Eranian
2015-05-21 12:53   ` Peter Zijlstra
