* [RFC][PATCH 0/3] perf fixes
@ 2015-01-23 12:51 Peter Zijlstra
  2015-01-23 12:52 ` [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition Peter Zijlstra
                   ` (3 more replies)
  0 siblings, 4 replies; 29+ messages in thread
From: Peter Zijlstra @ 2015-01-23 12:51 UTC (permalink / raw)
  To: mingo, linux-kernel
  Cc: vincent.weaver, eranian, jolsa, mark.rutland, torvalds, tglx, peterz

Hi,

Here are a few perf patches that (hopefully) fix a number of reported
(and at least one unreported afaik) issues as triggered by Vince's
fuzzer.

After a few days of staring at that event->ctx mess and writing
increasingly horrible patches, I came up with the approach from patch 3.
It still isn't pretty or straightforward though; please give it careful
consideration.

Jiri reported some success on a slightly older stack.



* [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition
  2015-01-23 12:51 [RFC][PATCH 0/3] perf fixes Peter Zijlstra
@ 2015-01-23 12:52 ` Peter Zijlstra
  2015-01-23 15:02   ` Mark Rutland
  2015-01-28 14:30   ` [tip:perf/urgent] " tip-bot for Peter Zijlstra
  2015-01-23 12:52 ` [RFC][PATCH 2/3] perf: Add a bit of paranoia Peter Zijlstra
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 29+ messages in thread
From: Peter Zijlstra @ 2015-01-23 12:52 UTC (permalink / raw)
  To: mingo, linux-kernel
  Cc: vincent.weaver, eranian, jolsa, mark.rutland, torvalds, tglx, peterz

[-- Attachment #1: peterz-perf-fix-grouping.patch --]
[-- Type: text/plain, Size: 2413 bytes --]

The fix from 9fc81d87420d ("perf: Fix events installation during
moving group") was incomplete in that it failed to recognise that
creating a group with events for different CPUs is semantically
broken -- they cannot be co-scheduled.

Furthermore, it leads to real breakage where, when we create an event
for CPU Y and then migrate it to form a group on CPU X, the code gets
confused about where the counter is programmed -- triggered by the fuzzer.

Fix this by tightening the rules for creating groups. Only allow
grouping of counters that can be co-scheduled in the same context.
This means for the same task and/or the same cpu.

Fixes: 9fc81d87420d ("perf: Fix events installation during moving group")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/perf_event.h |    6 ------
 kernel/events/core.c       |   15 +++++++++++++--
 2 files changed, 13 insertions(+), 8 deletions(-)

--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -450,11 +450,6 @@ struct perf_event {
 #endif /* CONFIG_PERF_EVENTS */
 };
 
-enum perf_event_context_type {
-	task_context,
-	cpu_context,
-};
-
 /**
  * struct perf_event_context - event context structure
  *
@@ -462,7 +457,6 @@ enum perf_event_context_type {
  */
 struct perf_event_context {
 	struct pmu			*pmu;
-	enum perf_event_context_type	type;
 	/*
 	 * Protect the states of the events in the list,
 	 * nr_active, and the list:
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6846,7 +6846,6 @@ int perf_pmu_register(struct pmu *pmu, c
 		__perf_event_init_context(&cpuctx->ctx);
 		lockdep_set_class(&cpuctx->ctx.mutex, &cpuctx_mutex);
 		lockdep_set_class(&cpuctx->ctx.lock, &cpuctx_lock);
-		cpuctx->ctx.type = cpu_context;
 		cpuctx->ctx.pmu = pmu;
 
 		__perf_cpu_hrtimer_init(cpuctx, cpu);
@@ -7493,7 +7492,19 @@ SYSCALL_DEFINE5(perf_event_open,
 		 * task or CPU context:
 		 */
 		if (move_group) {
-			if (group_leader->ctx->type != ctx->type)
+			/*
+			 * Make sure we're both on the same task, or both
+			 * per-cpu events.
+			 */
+			if (group_leader->ctx->task != ctx->task)
+				goto err_context;
+
+			/*
+			 * Make sure we're both events for the same CPU;
+			 * grouping events for different CPUs is broken; since
+			 * you can never concurrently schedule them anyhow.
+			 */
+			if (group_leader->cpu != event->cpu)
 				goto err_context;
 		} else {
 			if (group_leader->ctx != ctx)



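The effect of the two new checks can be modelled in isolation. The following
standalone C sketch uses stub types and hypothetical names (may_group() and
friends are illustrations, not the kernel's own code):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real types. */
struct task { int id; };

struct ctx {
	struct task *task;	/* NULL for a per-CPU context */
};

struct event {
	struct ctx *ctx;
	int cpu;		/* -1 when not bound to a CPU */
};

/*
 * Model of the tightened move_group condition: the group leader and the
 * new event may only be grouped when their contexts target the same task
 * (or are both per-CPU, task == NULL) and they name the same CPU.
 */
static int may_group(const struct event *leader, const struct event *event,
		     const struct ctx *dst_ctx)
{
	if (leader->ctx->task != dst_ctx->task)
		return 0;	/* task vs. per-CPU, or different tasks */
	if (leader->cpu != event->cpu)
		return 0;	/* different CPUs cannot be co-scheduled */
	return 1;
}
```

With this rule, an event opened for CPU 1 can no longer be moved into a group
led by a CPU 0 event -- the fuzzer-triggered case described above.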

* [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-01-23 12:51 [RFC][PATCH 0/3] perf fixes Peter Zijlstra
  2015-01-23 12:52 ` [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition Peter Zijlstra
@ 2015-01-23 12:52 ` Peter Zijlstra
  2015-01-26 16:26   ` Peter Zijlstra
  2015-02-04 14:39   ` [tip:perf/core] perf: Add a bit of paranoia tip-bot for Peter Zijlstra
  2015-01-23 12:52 ` [RFC][PATCH 3/3] perf: Fix event->ctx locking Peter Zijlstra
  2015-01-26 14:30 ` [RFC][PATCH 0/3] perf fixes Vince Weaver
  3 siblings, 2 replies; 29+ messages in thread
From: Peter Zijlstra @ 2015-01-23 12:52 UTC (permalink / raw)
  To: mingo, linux-kernel
  Cc: vincent.weaver, eranian, jolsa, mark.rutland, torvalds, tglx, peterz

[-- Attachment #1: peterz-perf-ctx-paranioa.patch --]
[-- Type: text/plain, Size: 2029 bytes --]

Add a few WARNs to catch things that should never happen.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c |   19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1275,6 +1275,8 @@ static void perf_group_attach(struct per
 	if (group_leader == event)
 		return;
 
+	WARN_ON_ONCE(group_leader->ctx != event->ctx);
+
 	if (group_leader->group_flags & PERF_GROUP_SOFTWARE &&
 			!is_software_event(event))
 		group_leader->group_flags &= ~PERF_GROUP_SOFTWARE;
@@ -1296,6 +1298,10 @@ static void
 list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 {
 	struct perf_cpu_context *cpuctx;
+
+	WARN_ON_ONCE(event->ctx != ctx);
+	lockdep_assert_held(&ctx->lock);
+
 	/*
 	 * We can have double detach due to exit/hot-unplug + close.
 	 */
@@ -1380,6 +1386,8 @@ static void perf_group_detach(struct per
 
 		/* Inherit group flags from the previous leader */
 		sibling->group_flags = event->group_flags;
+
+		WARN_ON_ONCE(sibling->ctx != event->ctx);
 	}
 
 out:
@@ -1442,6 +1450,10 @@ event_sched_out(struct perf_event *event
 {
 	u64 tstamp = perf_event_time(event);
 	u64 delta;
+
+	WARN_ON_ONCE(event->ctx != ctx);
+	lockdep_assert_held(&ctx->lock);
+
 	/*
 	 * An event which could not be activated because of
 	 * filter mismatch still needs to have its timings
@@ -7822,14 +7834,19 @@ static void perf_free_event(struct perf_
 
 	put_event(parent);
 
+	raw_spin_lock_irq(&ctx->lock);
 	perf_group_detach(event);
 	list_del_event(event, ctx);
+	raw_spin_unlock_irq(&ctx->lock);
 	free_event(event);
 }
 
 /*
- * free an unexposed, unused context as created by inheritance by
+ * Free an unexposed, unused context as created by inheritance by
  * perf_event_init_task below, used by fork() in case of fail.
+ *
+ * Not all locks are strictly required, but take them anyway to be nice and
+ * help out with the lockdep assertions.
  */
 void perf_event_free_task(struct task_struct *task)
 {



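The shape of these checks can be sketched in userspace. In this illustrative
model (made-up types), assert() stands in for lockdep_assert_held() and a
sticky flag stands in for WARN_ON_ONCE():

```c
#include <assert.h>

/* Hypothetical userspace stand-ins; not the kernel types. */
struct ctx { int lock_held; };
struct event { struct ctx *ctx; int attached; };

static int warned;	/* sticks at 1, like WARN_ON_ONCE() firing */

static void warn_on_once(int cond)
{
	if (cond)
		warned = 1;
}

/*
 * Mirrors the pattern the patch adds to list_del_event() and
 * event_sched_out(): before touching the context, check that the event
 * really belongs to it and that its lock is held, then tolerate a
 * double detach (exit/hot-unplug + close).
 */
static void detach_event(struct event *event, struct ctx *ctx)
{
	warn_on_once(event->ctx != ctx);
	assert(ctx->lock_held);		/* lockdep_assert_held() analogue */

	if (!event->attached)
		return;			/* double detach is allowed */
	event->attached = 0;
}
```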

* [RFC][PATCH 3/3] perf: Fix event->ctx locking
  2015-01-23 12:51 [RFC][PATCH 0/3] perf fixes Peter Zijlstra
  2015-01-23 12:52 ` [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition Peter Zijlstra
  2015-01-23 12:52 ` [RFC][PATCH 2/3] perf: Add a bit of paranoia Peter Zijlstra
@ 2015-01-23 12:52 ` Peter Zijlstra
  2015-02-04 14:39   ` [tip:perf/core] " tip-bot for Peter Zijlstra
  2015-01-26 14:30 ` [RFC][PATCH 0/3] perf fixes Vince Weaver
  3 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2015-01-23 12:52 UTC (permalink / raw)
  To: mingo, linux-kernel
  Cc: vincent.weaver, eranian, jolsa, mark.rutland, torvalds, tglx, peterz

[-- Attachment #1: peterz-perf-ctx-double-lock.patch --]
[-- Type: text/plain, Size: 14529 bytes --]

There have been a few reported issues wrt. the lack of locking around
changing event->ctx. This patch tries to address those.

It avoids the whole rwsem thing, and while it appears to work, please
give it some thought in review.

What I did fail at is sensible runtime checks on the use of
event->ctx; the RCU use makes it very hard.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c |  244 +++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 207 insertions(+), 37 deletions(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -907,6 +907,77 @@ static void put_ctx(struct perf_event_co
 }
 
 /*
+ * Because of perf_event::ctx migration in sys_perf_event_open::move_group and
+ * perf_pmu_migrate_context() we need some magic.
+ *
+ * Those places that change perf_event::ctx will hold both
+ * perf_event_ctx::mutex of the 'old' and 'new' ctx value.
+ *
+ * Lock ordering is by mutex address. There is one other site where
+ * perf_event_context::mutex nests and that is put_event(). But remember that
+ * that is a parent<->child context relation, and migration does not affect
+ * children, therefore these two orderings should not interact.
+ *
+ * The change in perf_event::ctx does not affect children (as claimed above)
+ * because the sys_perf_event_open() case will install a new event and break
+ * the ctx parent<->child relation, and perf_pmu_migrate_context() is only
+ * concerned with cpuctx and that doesn't have children.
+ *
+ * The places that change perf_event::ctx will issue:
+ *
+ *   perf_remove_from_context();
+ *   synchronize_rcu();
+ *   perf_install_in_context();
+ *
+ * to affect the change. The remove_from_context() + synchronize_rcu() should
+ * quiesce the event, after which we can install it in the new location. This
+ * means that only external vectors (perf_fops, prctl) can perturb the event
+ * while in transit. Therefore all such accessors should also acquire
+ * perf_event_context::mutex to serialize against this.
+ *
+ * However; because event->ctx can change while we're waiting to acquire
+ * ctx->mutex we must be careful and use the below perf_event_ctx_lock()
+ * function.
+ *
+ * Lock order:
+ *	task_struct::perf_event_mutex
+ *	  perf_event_context::mutex
+ *	    perf_event_context::lock
+ *	    perf_event::child_mutex;
+ *	    perf_event::mmap_mutex
+ *	    mmap_sem
+ */
+static struct perf_event_context *perf_event_ctx_lock(struct perf_event *event)
+{
+	struct perf_event_context *ctx;
+
+again:
+	rcu_read_lock();
+	ctx = ACCESS_ONCE(event->ctx);
+	if (!atomic_inc_not_zero(&ctx->refcount)) {
+		rcu_read_unlock();
+		goto again;
+	}
+	rcu_read_unlock();
+
+	mutex_lock(&ctx->mutex);
+	if (event->ctx != ctx) {
+		mutex_unlock(&ctx->mutex);
+		put_ctx(ctx);
+		goto again;
+	}
+
+	return ctx;
+}
+
+static void perf_event_ctx_unlock(struct perf_event *event,
+				  struct perf_event_context *ctx)
+{
+	mutex_unlock(&ctx->mutex);
+	put_ctx(ctx);
+}
+
+/*
  * This must be done under the ctx->lock, such as to serialize against
  * context_equiv(), therefore we cannot call put_ctx() since that might end up
  * calling scheduler related locks and ctx->lock nests inside those.
@@ -1666,7 +1737,7 @@ int __perf_event_disable(void *info)
  * is the current context on this CPU and preemption is disabled,
  * hence we can't get into perf_event_task_sched_out for this context.
  */
-void perf_event_disable(struct perf_event *event)
+static void _perf_event_disable(struct perf_event *event)
 {
 	struct perf_event_context *ctx = event->ctx;
 	struct task_struct *task = ctx->task;
@@ -1707,6 +1778,19 @@ void perf_event_disable(struct perf_even
 	}
 	raw_spin_unlock_irq(&ctx->lock);
 }
+
+/*
+ * Strictly speaking kernel users cannot create groups and therefore this
+ * interface does not need the perf_event_ctx_lock() magic.
+ */
+void perf_event_disable(struct perf_event *event)
+{
+	struct perf_event_context *ctx;
+
+	ctx = perf_event_ctx_lock(event);
+	_perf_event_disable(event);
+	perf_event_ctx_unlock(event, ctx);
+}
 EXPORT_SYMBOL_GPL(perf_event_disable);
 
 static void perf_set_shadow_time(struct perf_event *event,
@@ -2170,7 +2254,7 @@ static int __perf_event_enable(void *inf
  * perf_event_for_each_child or perf_event_for_each as described
  * for perf_event_disable.
  */
-void perf_event_enable(struct perf_event *event)
+static void _perf_event_enable(struct perf_event *event)
 {
 	struct perf_event_context *ctx = event->ctx;
 	struct task_struct *task = ctx->task;
@@ -2226,9 +2310,21 @@ void perf_event_enable(struct perf_event
 out:
 	raw_spin_unlock_irq(&ctx->lock);
 }
+
+/*
+ * See perf_event_disable();
+ */
+void perf_event_enable(struct perf_event *event)
+{
+	struct perf_event_context *ctx;
+
+	ctx = perf_event_ctx_lock(event);
+	_perf_event_enable(event);
+	perf_event_ctx_unlock(event, ctx);
+}
 EXPORT_SYMBOL_GPL(perf_event_enable);
 
-int perf_event_refresh(struct perf_event *event, int refresh)
+static int _perf_event_refresh(struct perf_event *event, int refresh)
 {
 	/*
 	 * not supported on inherited events
@@ -2237,10 +2333,25 @@ int perf_event_refresh(struct perf_event
 		return -EINVAL;
 
 	atomic_add(refresh, &event->event_limit);
-	perf_event_enable(event);
+	_perf_event_enable(event);
 
 	return 0;
 }
+
+/*
+ * See perf_event_disable()
+ */
+int perf_event_refresh(struct perf_event *event, int refresh)
+{
+	struct perf_event_context *ctx;
+	int ret;
+
+	ctx = perf_event_ctx_lock(event);
+	ret = _perf_event_refresh(event, refresh);
+	perf_event_ctx_unlock(event, ctx);
+
+	return ret;
+}
 EXPORT_SYMBOL_GPL(perf_event_refresh);
 
 static void ctx_sched_out(struct perf_event_context *ctx,
@@ -3433,7 +3544,16 @@ static void perf_remove_from_owner(struc
 	rcu_read_unlock();
 
 	if (owner) {
-		mutex_lock(&owner->perf_event_mutex);
+		/*
+		 * If we're here through perf_event_exit_task() we're already
+		 * holding ctx->mutex which would be an inversion wrt. the
+		 * normal lock order.
+		 *
+		 * However we can safely take this lock because its the child
+		 * ctx->mutex.
+		 */
+		mutex_lock_nested(&owner->perf_event_mutex, SINGLE_DEPTH_NESTING);
+
 		/*
 		 * We have to re-check the event->owner field, if it is cleared
 		 * we raced with perf_event_exit_task(), acquiring the mutex
@@ -3559,12 +3679,13 @@ static int perf_event_read_group(struct
 				   u64 read_format, char __user *buf)
 {
 	struct perf_event *leader = event->group_leader, *sub;
-	int n = 0, size = 0, ret = -EFAULT;
 	struct perf_event_context *ctx = leader->ctx;
-	u64 values[5];
+	int n = 0, size = 0, ret;
 	u64 count, enabled, running;
+	u64 values[5];
+
+	lockdep_assert_held(&ctx->mutex);
 
-	mutex_lock(&ctx->mutex);
 	count = perf_event_read_value(leader, &enabled, &running);
 
 	values[n++] = 1 + leader->nr_siblings;
@@ -3579,7 +3700,7 @@ static int perf_event_read_group(struct
 	size = n * sizeof(u64);
 
 	if (copy_to_user(buf, values, size))
-		goto unlock;
+		return -EFAULT;
 
 	ret = size;
 
@@ -3593,14 +3714,11 @@ static int perf_event_read_group(struct
 		size = n * sizeof(u64);
 
 		if (copy_to_user(buf + ret, values, size)) {
-			ret = -EFAULT;
-			goto unlock;
+			return -EFAULT;
 		}
 
 		ret += size;
 	}
-unlock:
-	mutex_unlock(&ctx->mutex);
 
 	return ret;
 }
@@ -3672,8 +3790,14 @@ static ssize_t
 perf_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 {
 	struct perf_event *event = file->private_data;
+	struct perf_event_context *ctx;
+	int ret;
 
-	return perf_read_hw(event, buf, count);
+	ctx = perf_event_ctx_lock(event);
+	ret = perf_read_hw(event, buf, count);
+	perf_event_ctx_unlock(event, ctx);
+
+	return ret;
 }
 
 static unsigned int perf_poll(struct file *file, poll_table *wait)
@@ -3699,7 +3823,7 @@ static unsigned int perf_poll(struct fil
 	return events;
 }
 
-static void perf_event_reset(struct perf_event *event)
+static void _perf_event_reset(struct perf_event *event)
 {
 	(void)perf_event_read(event);
 	local64_set(&event->count, 0);
@@ -3718,6 +3842,7 @@ static void perf_event_for_each_child(st
 	struct perf_event *child;
 
 	WARN_ON_ONCE(event->ctx->parent_ctx);
+
 	mutex_lock(&event->child_mutex);
 	func(event);
 	list_for_each_entry(child, &event->child_list, child_list)
@@ -3731,14 +3856,13 @@ static void perf_event_for_each(struct p
 	struct perf_event_context *ctx = event->ctx;
 	struct perf_event *sibling;
 
-	WARN_ON_ONCE(ctx->parent_ctx);
-	mutex_lock(&ctx->mutex);
+	lockdep_assert_held(&ctx->mutex);
+
 	event = event->group_leader;
 
 	perf_event_for_each_child(event, func);
 	list_for_each_entry(sibling, &event->sibling_list, group_entry)
 		perf_event_for_each_child(sibling, func);
-	mutex_unlock(&ctx->mutex);
 }
 
 static int perf_event_period(struct perf_event *event, u64 __user *arg)
@@ -3808,25 +3932,24 @@ static int perf_event_set_output(struct
 				 struct perf_event *output_event);
 static int perf_event_set_filter(struct perf_event *event, void __user *arg);
 
-static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned long arg)
 {
-	struct perf_event *event = file->private_data;
 	void (*func)(struct perf_event *);
 	u32 flags = arg;
 
 	switch (cmd) {
 	case PERF_EVENT_IOC_ENABLE:
-		func = perf_event_enable;
+		func = _perf_event_enable;
 		break;
 	case PERF_EVENT_IOC_DISABLE:
-		func = perf_event_disable;
+		func = _perf_event_disable;
 		break;
 	case PERF_EVENT_IOC_RESET:
-		func = perf_event_reset;
+		func = _perf_event_reset;
 		break;
 
 	case PERF_EVENT_IOC_REFRESH:
-		return perf_event_refresh(event, arg);
+		return _perf_event_refresh(event, arg);
 
 	case PERF_EVENT_IOC_PERIOD:
 		return perf_event_period(event, (u64 __user *)arg);
@@ -3873,6 +3996,19 @@ static long perf_ioctl(struct file *file
 	return 0;
 }
 
+static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	struct perf_event *event = file->private_data;
+	struct perf_event_context *ctx;
+	long ret;
+
+	ctx = perf_event_ctx_lock(event);
+	ret = _perf_ioctl(event, cmd, arg);
+	perf_event_ctx_unlock(event, ctx);
+
+	return ret;
+}
+
 #ifdef CONFIG_COMPAT
 static long perf_compat_ioctl(struct file *file, unsigned int cmd,
 				unsigned long arg)
@@ -3895,11 +4031,15 @@ static long perf_compat_ioctl(struct fil
 
 int perf_event_task_enable(void)
 {
+	struct perf_event_context *ctx;
 	struct perf_event *event;
 
 	mutex_lock(&current->perf_event_mutex);
-	list_for_each_entry(event, &current->perf_event_list, owner_entry)
-		perf_event_for_each_child(event, perf_event_enable);
+	list_for_each_entry(event, &current->perf_event_list, owner_entry) {
+		ctx = perf_event_ctx_lock(event);
+		perf_event_for_each_child(event, _perf_event_enable);
+		perf_event_ctx_unlock(event, ctx);
+	}
 	mutex_unlock(&current->perf_event_mutex);
 
 	return 0;
@@ -3907,11 +4047,15 @@ int perf_event_task_enable(void)
 
 int perf_event_task_disable(void)
 {
+	struct perf_event_context *ctx;
 	struct perf_event *event;
 
 	mutex_lock(&current->perf_event_mutex);
-	list_for_each_entry(event, &current->perf_event_list, owner_entry)
-		perf_event_for_each_child(event, perf_event_disable);
+	list_for_each_entry(event, &current->perf_event_list, owner_entry) {
+		ctx = perf_event_ctx_lock(event);
+		perf_event_for_each_child(event, _perf_event_disable);
+		perf_event_ctx_unlock(event, ctx);
+	}
 	mutex_unlock(&current->perf_event_mutex);
 
 	return 0;
@@ -7269,6 +7413,15 @@ perf_event_set_output(struct perf_event
 	return ret;
 }
 
+static void mutex_lock_double(struct mutex *a, struct mutex *b)
+{
+	if (b < a)
+		swap(a, b);
+
+	mutex_lock(a);
+	mutex_lock_nested(b, SINGLE_DEPTH_NESTING);
+}
+
 /**
  * sys_perf_event_open - open a performance event, associate it to a task/cpu
  *
@@ -7284,7 +7437,7 @@ SYSCALL_DEFINE5(perf_event_open,
 	struct perf_event *group_leader = NULL, *output_event = NULL;
 	struct perf_event *event, *sibling;
 	struct perf_event_attr attr;
-	struct perf_event_context *ctx;
+	struct perf_event_context *ctx, *uninitialized_var(gctx);
 	struct file *event_file = NULL;
 	struct fd group = {NULL, 0};
 	struct task_struct *task = NULL;
@@ -7482,9 +7635,14 @@ SYSCALL_DEFINE5(perf_event_open,
 	}
 
 	if (move_group) {
-		struct perf_event_context *gctx = group_leader->ctx;
+		gctx = group_leader->ctx;
+
+		/*
+		 * See perf_event_ctx_lock() for comments on the details
+		 * of swizzling perf_event::ctx.
+		 */
+		mutex_lock_double(&gctx->mutex, &ctx->mutex);
 
-		mutex_lock(&gctx->mutex);
 		perf_remove_from_context(group_leader, false);
 
 		/*
@@ -7499,15 +7657,19 @@ SYSCALL_DEFINE5(perf_event_open,
 			perf_event__state_init(sibling);
 			put_ctx(gctx);
 		}
-		mutex_unlock(&gctx->mutex);
-		put_ctx(gctx);
+	} else {
+		mutex_lock(&ctx->mutex);
 	}
 
 	WARN_ON_ONCE(ctx->parent_ctx);
-	mutex_lock(&ctx->mutex);
 
 	if (move_group) {
+		/*
+		 * Wait for everybody to stop referencing the events through
+		 * the old lists, before installing it on new lists.
+		 */
 		synchronize_rcu();
+
 		perf_install_in_context(ctx, group_leader, group_leader->cpu);
 		get_ctx(ctx);
 		list_for_each_entry(sibling, &group_leader->sibling_list,
@@ -7519,6 +7681,11 @@ SYSCALL_DEFINE5(perf_event_open,
 
 	perf_install_in_context(ctx, event, event->cpu);
 	perf_unpin_context(ctx);
+
+	if (move_group) {
+		mutex_unlock(&gctx->mutex);
+		put_ctx(gctx);
+	}
 	mutex_unlock(&ctx->mutex);
 
 	put_online_cpus();
@@ -7626,7 +7793,11 @@ void perf_pmu_migrate_context(struct pmu
 	src_ctx = &per_cpu_ptr(pmu->pmu_cpu_context, src_cpu)->ctx;
 	dst_ctx = &per_cpu_ptr(pmu->pmu_cpu_context, dst_cpu)->ctx;
 
-	mutex_lock(&src_ctx->mutex);
+	/*
+	 * See perf_event_ctx_lock() for comments on the details
+	 * of swizzling perf_event::ctx.
+	 */
+	mutex_lock_double(&src_ctx->mutex, &dst_ctx->mutex);
 	list_for_each_entry_safe(event, tmp, &src_ctx->event_list,
 				 event_entry) {
 		perf_remove_from_context(event, false);
@@ -7634,11 +7805,9 @@ void perf_pmu_migrate_context(struct pmu
 		put_ctx(src_ctx);
 		list_add(&event->migrate_entry, &events);
 	}
-	mutex_unlock(&src_ctx->mutex);
 
 	synchronize_rcu();
 
-	mutex_lock(&dst_ctx->mutex);
 	list_for_each_entry_safe(event, tmp, &events, migrate_entry) {
 		list_del(&event->migrate_entry);
 		if (event->state >= PERF_EVENT_STATE_OFF)
@@ -7648,6 +7817,7 @@ void perf_pmu_migrate_context(struct pmu
 		get_ctx(dst_ctx);
 	}
 	mutex_unlock(&dst_ctx->mutex);
+	mutex_unlock(&src_ctx->mutex);
 }
 EXPORT_SYMBOL_GPL(perf_pmu_migrate_context);
 


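The mutex_lock_double() helper above is plain address-ordered locking. A
standalone pthread sketch of the same idea (hypothetical userspace names;
comparing unrelated pointers is formally undefined in ISO C, although the
kernel relies on it being well defined on its supported targets):

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Sketch of mutex_lock_double(): when two locks must be held at once,
 * always take the one at the lower address first, so any two threads
 * locking the same pair agree on the order and cannot deadlock.
 */
static void lock_double(pthread_mutex_t *a, pthread_mutex_t *b)
{
	if (b < a) {			/* order by address */
		pthread_mutex_t *tmp = a;
		a = b;
		b = tmp;
	}
	pthread_mutex_lock(a);
	pthread_mutex_lock(b);		/* nested, in the fixed order */
}
```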

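The heart of perf_event_ctx_lock() is the lock-and-revalidate idiom. A
userspace sketch under stated assumptions (C11 atomics, no lifetime
management -- the kernel version additionally pins the context with
atomic_inc_not_zero() under RCU before taking the mutex):

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace model of event->ctx; not the kernel types. */
struct ctx {
	pthread_mutex_t mutex;
};

struct event {
	_Atomic(struct ctx *) ctx;	/* may be swizzled concurrently */
};

/*
 * Lock-and-revalidate: sample event->ctx, lock it, then re-check that
 * the event still points at the same context. If a concurrent move
 * swizzled event->ctx while we slept on the mutex, drop the stale lock
 * and retry.
 */
static struct ctx *event_ctx_lock(struct event *event)
{
	struct ctx *ctx;

	for (;;) {
		ctx = atomic_load(&event->ctx);
		pthread_mutex_lock(&ctx->mutex);
		if (atomic_load(&event->ctx) == ctx)
			return ctx;	/* still current, and now locked */
		pthread_mutex_unlock(&ctx->mutex);	/* raced; retry */
	}
}
```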

* Re: [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition
  2015-01-23 12:52 ` [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition Peter Zijlstra
@ 2015-01-23 15:02   ` Mark Rutland
  2015-01-23 15:07     ` Peter Zijlstra
  2015-01-28 14:30   ` [tip:perf/urgent] " tip-bot for Peter Zijlstra
  1 sibling, 1 reply; 29+ messages in thread
From: Mark Rutland @ 2015-01-23 15:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, linux-kernel, vincent.weaver, eranian, jolsa, torvalds, tglx

On Fri, Jan 23, 2015 at 12:52:00PM +0000, Peter Zijlstra wrote:
> The fix from 9fc81d87420d ("perf: Fix events installation during
> moving group") was incomplete in that it failed to recognise that
> creating a group with events for different CPUs is semantically
> broken -- they cannot be co-scheduled.
> 
> Furthermore, it leads to real breakage where, when we create an event
> for CPU Y and then migrate it to form a group on CPU X, the code gets
> confused about where the counter is programmed -- triggered by the fuzzer.
> 
> Fix this by tightening the rules for creating groups. Only allow
> grouping of counters that can be co-scheduled in the same context.
> This means for the same task and/or the same cpu.

It seems this would still allow you to group CPU-affine software and
uncore events, which also doesn't make sense: the software events will
count on a single CPU while the uncore events aren't really CPU-affine.

Which isn't anything against this patch, but probably something we
should tighten up too.

> 
> Fixes: 9fc81d87420d ("perf: Fix events installation during moving group")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  include/linux/perf_event.h |    6 ------
>  kernel/events/core.c       |   15 +++++++++++++--
>  2 files changed, 13 insertions(+), 8 deletions(-)
> 
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -450,11 +450,6 @@ struct perf_event {
>  #endif /* CONFIG_PERF_EVENTS */
>  };
>  
> -enum perf_event_context_type {
> -	task_context,
> -	cpu_context,
> -};
> -
>  /**
>   * struct perf_event_context - event context structure
>   *
> @@ -462,7 +457,6 @@ enum perf_event_context_type {
>   */
>  struct perf_event_context {
>  	struct pmu			*pmu;
> -	enum perf_event_context_type	type;
>  	/*
>  	 * Protect the states of the events in the list,
>  	 * nr_active, and the list:
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6846,7 +6846,6 @@ int perf_pmu_register(struct pmu *pmu, c
>  		__perf_event_init_context(&cpuctx->ctx);
>  		lockdep_set_class(&cpuctx->ctx.mutex, &cpuctx_mutex);
>  		lockdep_set_class(&cpuctx->ctx.lock, &cpuctx_lock);
> -		cpuctx->ctx.type = cpu_context;
>  		cpuctx->ctx.pmu = pmu;
>  
>  		__perf_cpu_hrtimer_init(cpuctx, cpu);
> @@ -7493,7 +7492,19 @@ SYSCALL_DEFINE5(perf_event_open,
>  		 * task or CPU context:
>  		 */
>  		if (move_group) {
> -			if (group_leader->ctx->type != ctx->type)
> +			/*
> +			 * Make sure we're both on the same task, or both
> +			 * per-cpu events.
> +			 */
> +			if (group_leader->ctx->task != ctx->task)
> +				goto err_context;
> +

Up to this point, this looks very similar to what I tried previously
[1], where we eventually figured out [2] that this raced with the
context switch optimisation. I never got around to fixing that race.

I'll try and get my head around that again. I'm not sure if that's still
a problem, and from a quick look at this series it's not clear that it
would be fixed if it is a problem.

Thanks,
Mark.

[1] https://lkml.org/lkml/2014/2/10/937
[2] https://lkml.org/lkml/2014/2/27/834


* Re: [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition
  2015-01-23 15:02   ` Mark Rutland
@ 2015-01-23 15:07     ` Peter Zijlstra
  2015-01-23 15:22       ` Mark Rutland
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2015-01-23 15:07 UTC (permalink / raw)
  To: Mark Rutland
  Cc: mingo, linux-kernel, vincent.weaver, eranian, jolsa, torvalds, tglx

On Fri, Jan 23, 2015 at 03:02:12PM +0000, Mark Rutland wrote:
> On Fri, Jan 23, 2015 at 12:52:00PM +0000, Peter Zijlstra wrote:
> > The fix from 9fc81d87420d ("perf: Fix events installation during
> > moving group") was incomplete in that it failed to recognise that
> > creating a group with events for different CPUs is semantically
> > broken -- they cannot be co-scheduled.
> > 
> > Furthermore, it leads to real breakage where, when we create an event
> > for CPU Y and then migrate it to form a group on CPU X, the code gets
> > confused about where the counter is programmed -- triggered by the fuzzer.
> > 
> > Fix this by tightening the rules for creating groups. Only allow
> > grouping of counters that can be co-scheduled in the same context.
> > This means for the same task and/or the same cpu.
> 
> It seems this would still allow you to group CPU-affine software and
> uncore events, which also doesn't make sense: the software events will
> count on a single CPU while the uncore events aren't really CPU-affine.
> 
> Which isn't anything against this patch, but probably something we
> should tighten up too.

Indeed, that would need a wee bit of extra infrastructure though, as we
cannot currently distinguish between regular cpuctx and uncore-like
things.

Good spotting though.


* Re: [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition
  2015-01-23 15:07     ` Peter Zijlstra
@ 2015-01-23 15:22       ` Mark Rutland
  2015-02-04 12:59         ` Peter Zijlstra
  0 siblings, 1 reply; 29+ messages in thread
From: Mark Rutland @ 2015-01-23 15:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, linux-kernel, vincent.weaver, eranian, jolsa, torvalds, tglx

On Fri, Jan 23, 2015 at 03:07:16PM +0000, Peter Zijlstra wrote:
> On Fri, Jan 23, 2015 at 03:02:12PM +0000, Mark Rutland wrote:
> > On Fri, Jan 23, 2015 at 12:52:00PM +0000, Peter Zijlstra wrote:
> > > The fix from 9fc81d87420d ("perf: Fix events installation during
> > > moving group") was incomplete in that it failed to recognise that
> > > creating a group with events for different CPUs is semantically
> > > broken -- they cannot be co-scheduled.
> > > 
> > > Furthermore, it leads to real breakage where, when we create an event
> > > for CPU Y and then migrate it to form a group on CPU X, the code gets
> > > confused about where the counter is programmed -- triggered by the fuzzer.
> > > 
> > > Fix this by tightening the rules for creating groups. Only allow
> > > grouping of counters that can be co-scheduled in the same context.
> > > This means for the same task and/or the same cpu.
> > 
> > It seems this would still allow you to group CPU-affine software and
> > uncore events, which also doesn't make sense: the software events will
> > count on a single CPU while the uncore events aren't really CPU-affine.
> > 
> > Which isn't anything against this patch, but probably something we
> > should tighten up too.
> 
> Indeed, that would need a wee bit of extra infrastructure though; as we
> cannot currently distinguish between regular cpuctx and uncore like
> things.

Isn't the event->pmu->task_ctx_nr sufficient, as with how we identify
software events?

Or am I making some completely bogus assumptions in the diff below?

Mark.

---->8----
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 664de5a..7b945d5 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -657,6 +657,15 @@ static inline int is_software_event(struct perf_event *event)
 	return event->pmu->task_ctx_nr == perf_sw_context;
 }
 
+/*
+ * Return 1 for an event which is associated with neither a particular
+ * CPU nor a particular task.
+ */
+static inline int is_system_event(struct perf_event *event)
+{
+	return event->pmu->task_ctx_nr == perf_invalid_context;
+}
+
 extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX];
 
 extern void __perf_sw_event(u32, u64, struct pt_regs *, u64);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2cb857d..50c42b6 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7525,6 +7525,18 @@ SYSCALL_DEFINE5(perf_event_open,
 	account_event(event);
 
 	/*
+	 * System-wide (A.K.A. "uncore") events cannot be associated with a
+	 * particular CPU, and hence cannot be associated with a particular
+	 * task either. It's non-sensical to group them with other event types,
+	 * which are CPU or task bound.
+	 */
+	if (group_leader &&
+		(is_system_event(event) != is_system_event(group_leader))) {
+		err = -EINVAL;
+		goto err_alloc;
+	}
+
+	/*
 	 * Special case software events and allow them to be part of
 	 * any hardware group.
 	 */



* Re: [RFC][PATCH 0/3] perf fixes
  2015-01-23 12:51 [RFC][PATCH 0/3] perf fixes Peter Zijlstra
                   ` (2 preceding siblings ...)
  2015-01-23 12:52 ` [RFC][PATCH 3/3] perf: Fix event->ctx locking Peter Zijlstra
@ 2015-01-26 14:30 ` Vince Weaver
  2015-01-26 18:03   ` Vince Weaver
  3 siblings, 1 reply; 29+ messages in thread
From: Vince Weaver @ 2015-01-26 14:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, linux-kernel, vincent.weaver, eranian, jolsa,
	mark.rutland, torvalds, tglx

On Fri, 23 Jan 2015, Peter Zijlstra wrote:

> Here are a few perf patches that (hopefully) fix a number of reported
> (and at least one unreported afaik) issues as triggered by Vince's
> fuzzer.
> 
> After a few days of staring at that event->ctx mess and writing
> increasingly horrible patches, I came up with the approach from patch 3.
> It still isn't pretty or straight forward though; please give it careful
> consideration.
> 
> Jiri reported some success on an slightly older stack.

I applied this to (just before) 3.19-rc6 and let it run on my 
troublemaking Haswell machine overnight.  It is looking good!

There was some typical fuzzing noise (see below), but none of the hangs or
ctx-related spews this machine usually quickly generates.


[  746.172668] NOHZ: local_softirq_pending 100
[ 1738.020456] perf interrupt took too long (2941 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 3502.758359] NOHZ: local_softirq_pending 100
[ 4500.691858] NOHZ: local_softirq_pending 100
[ 5541.657788] NOHZ: local_softirq_pending 100
[ 5541.673445] NOHZ: local_softirq_pending 100
[ 5541.705029] NOHZ: local_softirq_pending 100
[ 6338.750033] NOHZ: local_softirq_pending 100
[ 6572.046890] NOHZ: local_softirq_pending 100
[ 6727.222796] NOHZ: local_softirq_pending 100
[ 8377.455432] NOHZ: local_softirq_pending 100
[16142.319122] ------------[ cut here ]------------
[16142.324098] WARNING: CPU: 0 PID: 1694 at arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0x10a/0x120()
[16142.335511] Can't find any breakpoint slot
[16142.339765] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel crct10dif_pclmul snd_hda_controller crc32_pclmul ppdev ghash_clmulni_intel iTCO_wdt iTCO_vendor_support aesni_intel aes_x86_64 lpc_ich mfd_core evdev lrw mei_me i915 psmouse snd_hda_codec mei xhci_pci gf128mul i2c_i801 serio_raw xhci_hcd glue_helper snd_hwdep snd_pcm ablk_helper cryptd snd_timer drm_kms_helper parport_pc parport tpm_tis pcspkr snd tpm drm battery video soundcore wmi i2c_algo_bit button processor sg sr_mod sd_mod cdrom ehci_pci ehci_hcd ahci libahci e1000e libata ptp usbcore scsi_mod usb_common pps_core crc32c_intel thermal fan thermal_sys
[16142.411408] CPU: 0 PID: 1694 Comm: perf_fuzzer Not tainted 3.19.0-rc5+ #125
[16142.418883] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[16142.426826]  ffffffff81a253f0 ffff8800c8c57af8 ffffffff816b6741 0000000000000000
[16142.434825]  ffff8800c8c57b48 ffff8800c8c57b38 ffffffff8106dcda 0000000000000000
[16142.442854]  00000000fffffff0 ffff8800ce5b7800 ffff88011ea189ec ffff880036d00400
[16142.450891] Call Trace:
[16142.453502]  [<ffffffff816b6741>] dump_stack+0x45/0x57
[16142.458992]  [<ffffffff8106dcda>] warn_slowpath_common+0x8a/0xc0
[16142.465411]  [<ffffffff8106dd56>] warn_slowpath_fmt+0x46/0x50
[16142.471599]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[16142.478607]  [<ffffffff8101cdda>] arch_install_hw_breakpoint+0x10a/0x120
[16142.485805]  [<ffffffff8115ff58>] hw_breakpoint_add+0x48/0x50
[16142.491991]  [<ffffffff81159336>] event_sched_in.isra.74+0xa6/0x280
[16142.498719]  [<ffffffff8115957f>] group_sched_in+0x6f/0x1e0
[16142.504659]  [<ffffffff8101d90a>] ? native_sched_clock+0x2a/0x90
[16142.511084]  [<ffffffff811598c0>] ctx_sched_in+0x1d0/0x3f0
[16142.516953]  [<ffffffff81159b49>] perf_event_sched_in+0x69/0xa0
[16142.523290]  [<ffffffff8115a53e>] __perf_install_in_context+0x12e/0x1e0
[16142.530362]  [<ffffffff81154570>] remote_function+0x50/0x60
[16142.536356]  [<ffffffff810ef53e>] generic_exec_single+0x12e/0x190
[16142.542878]  [<ffffffff81154520>] ? task_clock_event_add+0x40/0x40
[16142.549540]  [<ffffffff810ef607>] smp_call_function_single+0x67/0xa0
[16142.556324]  [<ffffffff81153419>] task_function_call+0x49/0x60
[16142.562569]  [<ffffffff8115a410>] ? perf_cpu_hrtimer_handler+0x270/0x270
[16142.569777]  [<ffffffff811549fb>] perf_install_in_context+0x8b/0x100
[16142.576592]  [<ffffffff8115e330>] SYSC_perf_event_open+0xb40/0xc60
[16142.583255]  [<ffffffff812037d4>] ? mntput+0x24/0x40
[16142.588597]  [<ffffffff8115e89e>] SyS_perf_event_open+0xe/0x10
[16142.594865]  [<ffffffff816be6ad>] system_call_fastpath+0x16/0x1b
[16142.601342] ---[ end trace 963de0bbaca247a6 ]---
[17937.065232] ------------[ cut here ]------------
[17937.070232] WARNING: CPU: 6 PID: 0 at arch/x86/kernel/cpu/perf_event_intel.c:1372 intel_pmu_handle_irq+0x282/0x3a0()
[17937.081580] perfevents: irq loop stuck!
[17937.085694] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel crct10dif_pclmul snd_hda_controller crc32_pclmul ppdev ghash_clmulni_intel iTCO_wdt iTCO_vendor_support aesni_intel aes_x86_64 lpc_ich mfd_core evdev lrw mei_me i915 psmouse snd_hda_codec mei xhci_pci gf128mul i2c_i801 serio_raw xhci_hcd glue_helper snd_hwdep snd_pcm ablk_helper cryptd snd_timer drm_kms_helper parport_pc parport tpm_tis pcspkr snd tpm drm battery video soundcore wmi i2c_algo_bit button processor sg sr_mod sd_mod cdrom ehci_pci ehci_hcd ahci libahci e1000e libata ptp usbcore scsi_mod usb_common pps_core crc32c_intel thermal fan thermal_sys
[17937.157306] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G        W      3.19.0-rc5+ #125
[17937.166846] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[17937.176142]  ffffffff81a266b0 ffff88011eb85b50 ffffffff816b6741 0000000000000000
[17937.185583]  ffff88011eb85ba0 ffff88011eb85b90 ffffffff8106dcda ffff880119b8a090
[17937.194981]  0000000000000064 ffff88011eb8bd40 ffff8800c8198800 0000000000000040
[17937.204343] Call Trace:
[17937.208301]  <NMI>  [<ffffffff816b6741>] dump_stack+0x45/0x57
[17937.215913]  [<ffffffff8106dcda>] warn_slowpath_common+0x8a/0xc0
[17937.223740]  [<ffffffff8106dd56>] warn_slowpath_fmt+0x46/0x50
[17937.231279]  [<ffffffff81032732>] intel_pmu_handle_irq+0x282/0x3a0
[17937.239313]  [<ffffffff810ea108>] ? __tick_nohz_idle_enter+0x198/0x4b0
[17937.247716]  [<ffffffff8102a04b>] perf_event_nmi_handler+0x2b/0x50
[17937.255714]  [<ffffffff81018610>] nmi_handle+0xa0/0x150
[17937.262676]  [<ffffffff81018575>] ? nmi_handle+0x5/0x150
[17937.269741]  [<ffffffff810188ba>] default_do_nmi+0x4a/0x140
[17937.277090]  [<ffffffff81018a48>] do_nmi+0x98/0xe0
[17937.283503]  [<ffffffff816c0bb1>] end_repeat_nmi+0x1e/0x2e
[17937.290754]  [<ffffffff81056e9a>] ? native_write_msr_safe+0xa/0x10
[17937.298744]  [<ffffffff81056e9a>] ? native_write_msr_safe+0xa/0x10
[17937.306769]  [<ffffffff81056e9a>] ? native_write_msr_safe+0xa/0x10
[17937.314746]  <<EOE>>  <IRQ>  [<ffffffff81032346>] intel_pmu_enable_event+0x236/0x250
[17937.324514]  [<ffffffff8102b281>] x86_pmu_start+0x81/0x120
[17937.331809]  [<ffffffff8102ba55>] x86_pmu_enable+0x295/0x310
[17937.339270]  [<ffffffff8115858a>] perf_pmu_enable+0x2a/0x30
[17937.346625]  [<ffffffff81029e38>] x86_pmu_commit_txn+0x78/0xa0
[17937.354222]  [<ffffffff8115aec4>] ? perf_event_update_userpage+0xd4/0x160
[17937.362774]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[17937.371080]  [<ffffffff810b47c8>] ? __lock_is_held+0x58/0x80
[17937.378334]  [<ffffffff81159410>] ? event_sched_in.isra.74+0x180/0x280
[17937.386508]  [<ffffffff811596c8>] group_sched_in+0x1b8/0x1e0
[17937.393755]  [<ffffffff8101d90a>] ? native_sched_clock+0x2a/0x90
[17937.401364]  [<ffffffff81159ddc>] __perf_event_enable+0x25c/0x2a0
[17937.409031]  [<ffffffff810ea649>] ? tick_nohz_irq_exit+0x29/0x30
[17937.416612]  [<ffffffff81154570>] remote_function+0x50/0x60
[17937.423719]  [<ffffffff810ef822>] flush_smp_call_function_queue+0x62/0x140
[17937.432229]  [<ffffffff810efe43>] generic_smp_call_function_single_interrupt+0x13/0x60
[17937.441849]  [<ffffffff81046e07>] smp_call_function_single_interrupt+0x27/0x40
[17937.450770]  [<ffffffff816bfc3d>] call_function_single_interrupt+0x6d/0x80
[17937.459289]  <EOI>  [<ffffffff81554165>] ? cpuidle_enter_state+0x65/0x160
[17937.467209]  [<ffffffff81554151>] ? cpuidle_enter_state+0x51/0x160
[17937.474492]  [<ffffffff81554347>] cpuidle_enter+0x17/0x20
[17937.480917]  [<ffffffff810aec31>] cpu_startup_entry+0x311/0x3c0
[17937.487903]  [<ffffffff810476e0>] start_secondary+0x140/0x150
[17937.494721] ---[ end trace 963de0bbaca247a7 ]---
[17937.500322] 
[17937.502585] CPU#6: ctrl:       0000000000000000
[17937.508088] CPU#6: status:     0000000000000000
[17937.513564] CPU#6: overflow:   0000000000000000
[17937.519041] CPU#6: fixed:      00000000000000ba
[17937.524565] CPU#6: pebs:       0000000000000000
[17937.530065] CPU#6: active:     0000000300000001
[17937.535553] CPU#6:   gen-PMC0 ctrl:  0000000000534f2e
[17937.541595] CPU#6:   gen-PMC0 count: 000000000000ea46
[17937.547629] CPU#6:   gen-PMC0 left:  0000ffffffff15ea
[17937.553664] CPU#6:   gen-PMC1 ctrl:  00000000001301b7
[17937.559697] CPU#6:   gen-PMC1 count: 0000000000000001
[17937.565701] CPU#6:   gen-PMC1 left:  0000ffffffffffff
[17937.571695] CPU#6:   gen-PMC2 ctrl:  00000000001301b7
[17937.577704] CPU#6:   gen-PMC2 count: 0000000000000001
[17937.583686] CPU#6:   gen-PMC2 left:  0000ffffffffffff
[17937.589696] CPU#6:   gen-PMC3 ctrl:  00000000001301b7
[17937.595684] CPU#6:   gen-PMC3 count: 0000000000000001
[17937.601689] CPU#6:   gen-PMC3 left:  0000ffffffffffff
[17937.607620] CPU#6: fixed-PMC0 count: 0000fffffffffffe
[17937.613544] CPU#6: fixed-PMC1 count: 0000fffe43fef8ca
[17937.619436] CPU#6: fixed-PMC2 count: 0000000000000001
[17937.625294] perf_event_intel: clearing PMU state on CPU#6
[17937.631553] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 566.313 msecs
[17937.640872] perf interrupt took too long (4425128 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[17937.654379] perf_event_intel: clearing PMU state on CPU#6
[17937.660988] perf interrupt took too long (4442384 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
[18041.352813] perf interrupt took too long (4407687 > 20000), lowering kernel.perf_event_max_sample_rate to 6250
[18103.018508] perf interrupt took too long (4373265 > 38461), lowering kernel.perf_event_max_sample_rate to 3250
[18183.918881] perf interrupt took too long (4339109 > 71428), lowering kernel.perf_event_max_sample_rate to 1750
[18297.804225] perf interrupt took too long (4305222 > 125000), lowering kernel.perf_event_max_sample_rate to 1000
[18362.131963] perf interrupt took too long (4271596 > 250000), lowering kernel.perf_event_max_sample_rate to 500
[18411.994909] perf interrupt took too long (4238237 > 500000), lowering kernel.perf_event_max_sample_rate to 250


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-01-23 12:52 ` [RFC][PATCH 2/3] perf: Add a bit of paranoia Peter Zijlstra
@ 2015-01-26 16:26   ` Peter Zijlstra
  2015-01-27 21:28     ` Vince Weaver
                       ` (2 more replies)
  2015-02-04 14:39   ` [tip:perf/core] perf: Add a bit of paranoia tip-bot for Peter Zijlstra
  1 sibling, 3 replies; 29+ messages in thread
From: Peter Zijlstra @ 2015-01-26 16:26 UTC (permalink / raw)
  To: mingo, linux-kernel
  Cc: vincent.weaver, eranian, jolsa, mark.rutland, torvalds, tglx

On Fri, Jan 23, 2015 at 01:52:01PM +0100, Peter Zijlstra wrote:
> @@ -1442,6 +1450,10 @@ event_sched_out(struct perf_event *event
>  {
>  	u64 tstamp = perf_event_time(event);
>  	u64 delta;
> +
> +	WARN_ON_ONCE(event->ctx != ctx);
> +	lockdep_assert_held(&ctx->lock);
> +
>  	/*
>  	 * An event which could not be activated because of
>  	 * filter mismatch still needs to have its timings

Jiri reported triggering that WARN_ON_ONCE over the weekend:

 event_sched_out.isra.79+0x2b9/0x2d0
 group_sched_out+0x69/0xc0
 ctx_sched_out+0x106/0x130
 task_ctx_sched_out+0x37/0x70
 __perf_install_in_context+0x70/0x1a0
 remote_function+0x48/0x60
 generic_exec_single+0x15b/0x1d0
 smp_call_function_single+0x67/0xa0
 task_function_call+0x53/0x80
 perf_install_in_context+0x8b/0x110


I think the below should cure this; if we install a group leader it will
iterate the (still intact) group list, find its siblings, and try to
install those too -- even though those still have the old event->ctx --
in the new ctx.

Upon installing the first group sibling we'd try to schedule out the
group and trigger the above warning.

Fix this by installing the group leader last; installing the siblings
first has no effect, since they are not reachable through the group
lists and therefore we do not schedule them.

Also delay resetting the state until we're absolutely sure the events
are quiescent -- which raises the question: should perf_pmu_migrate()
not also have perf_event__state_init() calls in it?

---
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7645,16 +7645,9 @@ SYSCALL_DEFINE5(perf_event_open,
 
 		perf_remove_from_context(group_leader, false);
 
-		/*
-		 * Removing from the context ends up with disabled
-		 * event. What we want here is event in the initial
-		 * startup state, ready to be add into new context.
-		 */
-		perf_event__state_init(group_leader);
 		list_for_each_entry(sibling, &group_leader->sibling_list,
 				    group_entry) {
 			perf_remove_from_context(sibling, false);
-			perf_event__state_init(sibling);
 			put_ctx(gctx);
 		}
 	} else {
@@ -7670,13 +7663,31 @@ SYSCALL_DEFINE5(perf_event_open,
 		 */
 		synchronize_rcu();
 
-		perf_install_in_context(ctx, group_leader, group_leader->cpu);
-		get_ctx(ctx);
+		/*
+		 * Install the group siblings before the group leader.
+		 *
+	 * Because a group leader will try to install the entire group
+	 * (through the sibling list, which is still intact), we can
+		 * end up with siblings installed in the wrong context.
+		 *
+		 * By installing siblings first we NO-OP because they're not
+		 * reachable through the group lists.
+		 */
 		list_for_each_entry(sibling, &group_leader->sibling_list,
 				    group_entry) {
+			perf_event__state_init(sibling);
 			perf_install_in_context(ctx, sibling, sibling->cpu);
 			get_ctx(ctx);
 		}
+
+		/*
+		 * Removing from the context ends up with disabled
+		 * event. What we want here is event in the initial
+	 * startup state, ready to be added into the new context.
+		 */
+		perf_event__state_init(group_leader);
+		perf_install_in_context(ctx, group_leader, group_leader->cpu);
+		get_ctx(ctx);
 	}
 
 	perf_install_in_context(ctx, event, event->cpu);


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH 0/3] perf fixes
  2015-01-26 14:30 ` [RFC][PATCH 0/3] perf fixes Vince Weaver
@ 2015-01-26 18:03   ` Vince Weaver
  2015-01-26 18:34     ` Jiri Olsa
  0 siblings, 1 reply; 29+ messages in thread
From: Vince Weaver @ 2015-01-26 18:03 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Peter Zijlstra, mingo, linux-kernel, eranian, jolsa,
	mark.rutland, torvalds, tglx

On Mon, 26 Jan 2015, Vince Weaver wrote:

> I applied this to (just before) 3.19-rc6 and let it run on my 
> troublemaking Haswell machine overnight.  It is looking good!

spoke too soon, managed to trigger one of the new warnings you added.

This is
        WARN_ON_ONCE(event->ctx != ctx);
in event_sched_out()

[42437.159176] ------------[ cut here ]------------
[42437.164613] WARNING: CPU: 1 PID: 15433 at kernel/events/core.c:1525 event_sched_out.isra.72+0x25e/0x2c0()
[42437.175338] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel crct10dif_pclmul snd_hda_controller crc32_pclmul ppdev ghash_clmulni_intel iTCO_wdt iTCO_vendor_support aesni_intel aes_x86_64 lpc_ich mfd_core evdev lrw mei_me i915 psmouse snd_hda_codec mei xhci_pci gf128mul i2c_i801 serio_raw xhci_hcd glue_helper snd_hwdep snd_pcm ablk_helper cryptd snd_timer drm_kms_helper parport_pc parport tpm_tis pcspkr snd tpm drm battery video soundcore wmi i2c_algo_bit button processor sg sr_mod sd_mod cdrom ehci_pci ehci_hcd ahci libahci e1000e libata ptp usbcore scsi_mod usb_common pps_core crc32c_intel thermal fan thermal_sys
[42437.248890] CPU: 1 PID: 15433 Comm: perf_fuzzer Tainted: G        W      3.19.0-rc5+ #125
[42437.258294] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[42437.266772]  ffffffff81a3f194 ffff8800c83bfbc8 ffffffff816b6741 0000000000000000
[42437.275427]  0000000000000000 ffff8800c83bfc08 ffffffff8106dcda 01926893541204d2
[42437.284054]  ffff8800c69a4800 ffff88011ea5ced0 ffff88011ea5ced4 ffff880118ce4000
[42437.292660] Call Trace:
[42437.295915]  [<ffffffff816b6741>] dump_stack+0x45/0x57
[42437.302023]  [<ffffffff8106dcda>] warn_slowpath_common+0x8a/0xc0
[42437.309080]  [<ffffffff8106ddca>] warn_slowpath_null+0x1a/0x20
[42437.315950]  [<ffffffff811587ee>] event_sched_out.isra.72+0x25e/0x2c0
[42437.323445]  [<ffffffff811588c1>] group_sched_out+0x71/0xd0
[42437.330045]  [<ffffffff8101d90a>] ? native_sched_clock+0x2a/0x90
[42437.337116]  [<ffffffff81158d2e>] ctx_sched_out+0xee/0x150
[42437.343610]  [<ffffffff81158dcb>] task_ctx_sched_out+0x3b/0x80
[42437.350461]  [<ffffffff8115a489>] __perf_install_in_context+0x79/0x1e0
[42437.358073]  [<ffffffff81154570>] remote_function+0x50/0x60
[42437.364702]  [<ffffffff810ef53e>] generic_exec_single+0x12e/0x190
[42437.371828]  [<ffffffff81154520>] ? task_clock_event_add+0x40/0x40
[42437.379083]  [<ffffffff810ef607>] smp_call_function_single+0x67/0xa0
[42437.386491]  [<ffffffff81153419>] task_function_call+0x49/0x60
[42437.393364]  [<ffffffff8115a410>] ? perf_cpu_hrtimer_handler+0x270/0x270
[42437.401167]  [<ffffffff811549fb>] perf_install_in_context+0x8b/0x100
[42437.408611]  [<ffffffff8115e082>] SYSC_perf_event_open+0x892/0xc60
[42437.415841]  [<ffffffff8115e89e>] SyS_perf_event_open+0xe/0x10
[42437.422680]  [<ffffffff816be6ad>] system_call_fastpath+0x16/0x1b
[42437.429728] ---[ end trace 963de0bbaca247a8 ]---


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH 0/3] perf fixes
  2015-01-26 18:03   ` Vince Weaver
@ 2015-01-26 18:34     ` Jiri Olsa
  2015-01-26 18:52       ` Vince Weaver
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Olsa @ 2015-01-26 18:34 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Peter Zijlstra, mingo, linux-kernel, eranian, mark.rutland,
	torvalds, tglx

On Mon, Jan 26, 2015 at 01:03:47PM -0500, Vince Weaver wrote:
> On Mon, 26 Jan 2015, Vince Weaver wrote:
> 
> > I applied this to (just before) 3.19-rc6 and let it run on my 
> > troublemaking Haswell machine overnight.  It is looking good!
> 
> spoke too soon, managed to trigger one of the new warnings you added.
> 
> This is
>         WARN_ON_ONCE(event->ctx != ctx);
> in event_sched_out()

I triggered the same one; it should be addressed by the fix Peter
sent in an earlier email

jirka

> 
> [42437.159176] ------------[ cut here ]------------
> [42437.164613] WARNING: CPU: 1 PID: 15433 at kernel/events/core.c:1525 event_sched_out.isra.72+0x25e/0x2c0()
> [42437.175338] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel crct10dif_pclmul snd_hda_controller crc32_pclmul ppdev ghash_clmulni_intel iTCO_wdt iTCO_vendor_support aesni_intel aes_x86_64 lpc_ich mfd_core evdev lrw mei_me i915 psmouse snd_hda_codec mei xhci_pci gf128mul i2c_i801 serio_raw xhci_hcd glue_helper snd_hwdep snd_pcm ablk_helper cryptd snd_timer drm_kms_helper parport_pc parport tpm_tis pcspkr snd tpm drm battery video soundcore wmi i2c_algo_bit button processor sg sr_mod sd_mod cdrom ehci_pci ehci_hcd ahci libahci e1000e libata ptp usbcore scsi_mod usb_common pps_core crc32c_intel thermal fan thermal_sys
> [42437.248890] CPU: 1 PID: 15433 Comm: perf_fuzzer Tainted: G        W      3.19.0-rc5+ #125
> [42437.258294] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> [42437.266772]  ffffffff81a3f194 ffff8800c83bfbc8 ffffffff816b6741 0000000000000000
> [42437.275427]  0000000000000000 ffff8800c83bfc08 ffffffff8106dcda 01926893541204d2
> [42437.284054]  ffff8800c69a4800 ffff88011ea5ced0 ffff88011ea5ced4 ffff880118ce4000
> [42437.292660] Call Trace:
> [42437.295915]  [<ffffffff816b6741>] dump_stack+0x45/0x57
> [42437.302023]  [<ffffffff8106dcda>] warn_slowpath_common+0x8a/0xc0
> [42437.309080]  [<ffffffff8106ddca>] warn_slowpath_null+0x1a/0x20
> [42437.315950]  [<ffffffff811587ee>] event_sched_out.isra.72+0x25e/0x2c0
> [42437.323445]  [<ffffffff811588c1>] group_sched_out+0x71/0xd0
> [42437.330045]  [<ffffffff8101d90a>] ? native_sched_clock+0x2a/0x90
> [42437.337116]  [<ffffffff81158d2e>] ctx_sched_out+0xee/0x150
> [42437.343610]  [<ffffffff81158dcb>] task_ctx_sched_out+0x3b/0x80
> [42437.350461]  [<ffffffff8115a489>] __perf_install_in_context+0x79/0x1e0
> [42437.358073]  [<ffffffff81154570>] remote_function+0x50/0x60
> [42437.364702]  [<ffffffff810ef53e>] generic_exec_single+0x12e/0x190
> [42437.371828]  [<ffffffff81154520>] ? task_clock_event_add+0x40/0x40
> [42437.379083]  [<ffffffff810ef607>] smp_call_function_single+0x67/0xa0
> [42437.386491]  [<ffffffff81153419>] task_function_call+0x49/0x60
> [42437.393364]  [<ffffffff8115a410>] ? perf_cpu_hrtimer_handler+0x270/0x270
> [42437.401167]  [<ffffffff811549fb>] perf_install_in_context+0x8b/0x100
> [42437.408611]  [<ffffffff8115e082>] SYSC_perf_event_open+0x892/0xc60
> [42437.415841]  [<ffffffff8115e89e>] SyS_perf_event_open+0xe/0x10
> [42437.422680]  [<ffffffff816be6ad>] system_call_fastpath+0x16/0x1b
> [42437.429728] ---[ end trace 963de0bbaca247a8 ]---
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH 0/3] perf fixes
  2015-01-26 18:34     ` Jiri Olsa
@ 2015-01-26 18:52       ` Vince Weaver
  0 siblings, 0 replies; 29+ messages in thread
From: Vince Weaver @ 2015-01-26 18:52 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Vince Weaver, Peter Zijlstra, mingo, linux-kernel, eranian,
	mark.rutland, torvalds, tglx

On Mon, 26 Jan 2015, Jiri Olsa wrote:

> I trigered the same one, should be addressed by the fix Peter
> sent in earlier email

ahh, yes.  gmail (which runs my university's mail servers) is for some 
reason sending all e-mails from PeterZ to the spam folder.  I'll try out 
that patch.

Vince




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-01-26 16:26   ` Peter Zijlstra
@ 2015-01-27 21:28     ` Vince Weaver
  2015-01-29  2:16       ` Vince Weaver
  2015-01-29 14:47     ` [RFC][PATCH 2/3] perf: Add a bit of paranoia Peter Zijlstra
  2015-02-04 14:39     ` [tip:perf/core] perf: Fix move_group() order tip-bot for Peter Zijlstra (Intel)
  2 siblings, 1 reply; 29+ messages in thread
From: Vince Weaver @ 2015-01-27 21:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, linux-kernel, vincent.weaver, eranian, jolsa,
	mark.rutland, torvalds, tglx

On Mon, 26 Jan 2015, Peter Zijlstra wrote:

> I think the below should cure this; if we install a group leader it will
> iterate the (still intact) group list and find its siblings and try and
> install those too -- even though those still have the old event->ctx --
> in the new ctx.

I've been fuzzing for 24 hours (despite the blizzard) with this patch
plus the original series on top of 3.19-rc6 on the Haswell machine, and it
is looking good.

Still getting the "perfevents: irq loop stuck!" warnings seen below, but 
that's probably a problem for a different day.  

(as an aside, when this warning is triggered, almost always the fixed 
counter 0 has a count of 0000fffffffffffe, not sure if that's relevant)

[27788.169271] ------------[ cut here ]------------
[27788.174226] WARNING: CPU: 2 PID: 0 at arch/x86/kernel/cpu/perf_event_intel.c:1372 intel_pmu_handle_irq+0x282/0x3a0()
[27788.185501] perfevents: irq loop stuck!
[27788.189640] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_controller aesni_intel snd_hda_codec aes_x86_64 snd_hwdep lrw gf128mul snd_pcm ppdev glue_helper xhci_pci mei_me iTCO_wdt iTCO_vendor_support i915 snd_timer drm_kms_helper snd drm ablk_helper lpc_ich mfd_core evdev pcspkr parport_pc psmouse cryptd soundcore i2c_i801 serio_raw parport xhci_hcd mei wmi tpm_tis tpm video battery button processor i2c_algo_bit sg sr_mod sd_mod cdrom ahci libahci e1000e ehci_pci libata ptp ehci_hcd crc32c_intel usbcore scsi_mod usb_common pps_core thermal fan thermal_sys
[27788.261403] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.19.0-rc6+ #126
[27788.268409] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[27788.276362]  ffffffff81a266b0 ffff88011ea85b50 ffffffff816b6761 0000000000000000
[27788.284445]  ffff88011ea85ba0 ffff88011ea85b90 ffffffff8106dcda ffffffff8115c439
[27788.292492]  0000000000000064 ffff88011ea8bd40 ffff8800366d6000 0000000000000040
[27788.300600] Call Trace:
[27788.303225]  <NMI>  [<ffffffff816b6761>] dump_stack+0x45/0x57
[27788.309459]  [<ffffffff8106dcda>] warn_slowpath_common+0x8a/0xc0
[27788.315855]  [<ffffffff8115c439>] ? __perf_event_overflow+0xd9/0x270
[27788.322724]  [<ffffffff8106dd56>] warn_slowpath_fmt+0x46/0x50
[27788.328943]  [<ffffffff81032732>] intel_pmu_handle_irq+0x282/0x3a0
[27788.335605]  [<ffffffff810b405c>] ? check_chain_key+0x12c/0x1e0
[27788.341986]  [<ffffffff8102a04b>] perf_event_nmi_handler+0x2b/0x50
[27788.348620]  [<ffffffff81018610>] nmi_handle+0xa0/0x150
[27788.354261]  [<ffffffff81018575>] ? nmi_handle+0x5/0x150
[27788.359898]  [<ffffffff810188ba>] default_do_nmi+0x4a/0x140
[27788.365938]  [<ffffffff81018a48>] do_nmi+0x98/0xe0
[27788.371107]  [<ffffffff816c0bf1>] end_repeat_nmi+0x1e/0x2e
[27788.377003]  [<ffffffff81056e9a>] ? native_write_msr_safe+0xa/0x10
[27788.383629]  [<ffffffff81056e9a>] ? native_write_msr_safe+0xa/0x10
[27788.390296]  [<ffffffff81056e9a>] ? native_write_msr_safe+0xa/0x10
[27788.396953]  <<EOE>>  <IRQ>  [<ffffffff81032346>] intel_pmu_enable_event+0x236/0x250
[27788.405309]  [<ffffffff8102b281>] x86_pmu_start+0x81/0x120
[27788.411284]  [<ffffffff8102ba55>] x86_pmu_enable+0x295/0x310
[27788.417339]  [<ffffffff8115858a>] perf_pmu_enable+0x2a/0x30
[27788.423344]  [<ffffffff81029e38>] x86_pmu_commit_txn+0x78/0xa0
[27788.429615]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[27788.436600]  [<ffffffff8115aec4>] ? perf_event_update_userpage+0xd4/0x160
[27788.443927]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[27788.450880]  [<ffffffff810b47c8>] ? __lock_is_held+0x58/0x80
[27788.456951]  [<ffffffff81159410>] ? event_sched_in.isra.74+0x180/0x280
[27788.463962]  [<ffffffff811596c8>] group_sched_in+0x1b8/0x1e0
[27788.470071]  [<ffffffff8101d90a>] ? native_sched_clock+0x2a/0x90
[27788.476575]  [<ffffffff81159ddc>] __perf_event_enable+0x25c/0x2a0
[27788.483139]  [<ffffffff810ea649>] ? tick_nohz_irq_exit+0x29/0x30
[27788.489614]  [<ffffffff81154570>] remote_function+0x50/0x60
[27788.495572]  [<ffffffff810ef822>] flush_smp_call_function_queue+0x62/0x140
[27788.502919]  [<ffffffff810efe43>] generic_smp_call_function_single_interrupt+0x13/0x60
[27788.511384]  [<ffffffff81046e07>] smp_call_function_single_interrupt+0x27/0x40
[27788.519150]  [<ffffffff816bfc7d>] call_function_single_interrupt+0x6d/0x80
[27788.527654]  <EOI>  [<ffffffff81554185>] ? cpuidle_enter_state+0x65/0x160
[27788.536161]  [<ffffffff81554171>] ? cpuidle_enter_state+0x51/0x160
[27788.543978]  [<ffffffff81554367>] cpuidle_enter+0x17/0x20
[27788.550939]  [<ffffffff810aec31>] cpu_startup_entry+0x311/0x3c0
[27788.558382]  [<ffffffff810476e0>] start_secondary+0x140/0x150
[27788.565763] ---[ end trace 55752a03ec8ab977 ]---
[27788.574572] CPU#2: ctrl:       0000000000000000
[27788.580619] CPU#2: status:     0000000000000000
[27788.586609] CPU#2: overflow:   0000000000000000
[27788.592580] CPU#2: fixed:      00000000000000ba
[27788.598551] CPU#2: pebs:       0000000000000000
[27788.604522] CPU#2: active:     0000000300000000
[27788.610460] CPU#2:   gen-PMC0 ctrl:  00000000a8938001
[27788.617003] CPU#2:   gen-PMC0 count: 0000000000014b9c
[27788.623545] CPU#2:   gen-PMC0 left:  0000ffffffffffff
[27788.630032] CPU#2:   gen-PMC1 ctrl:  000000001833ca08
[27788.636563] CPU#2:   gen-PMC1 count: 0000000000000001
[27788.643123] CPU#2:   gen-PMC1 left:  0000ffffffffffff
[27788.649666] CPU#2:   gen-PMC2 ctrl:  0000000300130000
[27788.656189] CPU#2:   gen-PMC2 count: 0000000000000001
[27788.662622] CPU#2:   gen-PMC2 left:  0000ffffffffffff
[27788.669091] CPU#2:   gen-PMC3 ctrl:  00000000801300e2
[27788.675519] CPU#2:   gen-PMC3 count: 0000000000000001
[27788.681882] CPU#2:   gen-PMC3 left:  0000ffffffffffff
[27788.688207] CPU#2: fixed-PMC0 count: 0000fffffffffffe
[27788.694448] CPU#2: fixed-PMC1 count: 0000fff958ec3d6f
[27788.700703] CPU#2: fixed-PMC2 count: 00000000002b3c77
[27788.706922] perf_event_intel: clearing PMU state on CPU#2
[27788.713598] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 544.301 msecs
[27788.723308] perf interrupt took too long (4252836 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[27922.376291] perf interrupt took too long (4219620 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
[28039.869794] perf interrupt took too long (4186668 > 20000), lowering kernel.perf_event_max_sample_rate to 6250
[28123.044927] perf interrupt took too long (4153967 > 38461), lowering kernel.perf_event_max_sample_rate to 3250
[28128.807297] perf interrupt took too long (4121532 > 71428), lowering kernel.perf_event_max_sample_rate to 1750
[28128.818494] perf interrupt took too long (4089339 > 125000), lowering kernel.perf_event_max_sample_rate to 1000
[28128.829828] perf interrupt took too long (4057399 > 250000), lowering kernel.perf_event_max_sample_rate to 500
[28128.840903] perf interrupt took too long (4025786 > 500000), lowering kernel.perf_event_max_sample_rate to 250
[46336.872295] perf_event_intel: clearing PMU state on CPU#2
[69001.500107] perf_event_intel: clearing PMU state on CPU#2
[69001.561872] perf_event_intel: clearing PMU state on CPU#2





^ permalink raw reply	[flat|nested] 29+ messages in thread

* [tip:perf/urgent] perf: Tighten (and fix) the grouping condition
  2015-01-23 12:52 ` [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition Peter Zijlstra
  2015-01-23 15:02   ` Mark Rutland
@ 2015-01-28 14:30   ` tip-bot for Peter Zijlstra
  1 sibling, 0 replies; 29+ messages in thread
From: tip-bot for Peter Zijlstra @ 2015-01-28 14:30 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, tglx, hpa, jolsa, linux-kernel, peterz, mingo, torvalds

Commit-ID:  c3c87e770458aa004bd7ed3f29945ff436fd6511
Gitweb:     http://git.kernel.org/tip/c3c87e770458aa004bd7ed3f29945ff436fd6511
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Fri, 23 Jan 2015 11:19:48 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 28 Jan 2015 13:17:35 +0100

perf: Tighten (and fix) the grouping condition

The fix from 9fc81d87420d ("perf: Fix events installation during
moving group") was incomplete in that it failed to recognise that
creating a group with events for different CPUs is semantically
broken -- they cannot be co-scheduled.

Furthermore, it leads to real breakage where, when we create an event
for CPU Y and then migrate it to form a group on CPU X, the code gets
confused where the counter is programmed -- triggered in practice
as well by me via the perf fuzzer.

Fix this by tightening the rules for creating groups. Only allow
grouping of counters that can be co-scheduled in the same context.
This means for the same task and/or the same cpu.

Fixes: 9fc81d87420d ("perf: Fix events installation during moving group")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150123125834.090683288@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/perf_event.h |  6 ------
 kernel/events/core.c       | 15 +++++++++++++--
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 4f7a61c..664de5a 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -450,11 +450,6 @@ struct perf_event {
 #endif /* CONFIG_PERF_EVENTS */
 };
 
-enum perf_event_context_type {
-	task_context,
-	cpu_context,
-};
-
 /**
  * struct perf_event_context - event context structure
  *
@@ -462,7 +457,6 @@ enum perf_event_context_type {
  */
 struct perf_event_context {
 	struct pmu			*pmu;
-	enum perf_event_context_type	type;
 	/*
 	 * Protect the states of the events in the list,
 	 * nr_active, and the list:
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 882f835..19efcf1 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6776,7 +6776,6 @@ skip_type:
 		__perf_event_init_context(&cpuctx->ctx);
 		lockdep_set_class(&cpuctx->ctx.mutex, &cpuctx_mutex);
 		lockdep_set_class(&cpuctx->ctx.lock, &cpuctx_lock);
-		cpuctx->ctx.type = cpu_context;
 		cpuctx->ctx.pmu = pmu;
 
 		__perf_cpu_hrtimer_init(cpuctx, cpu);
@@ -7420,7 +7419,19 @@ SYSCALL_DEFINE5(perf_event_open,
 		 * task or CPU context:
 		 */
 		if (move_group) {
-			if (group_leader->ctx->type != ctx->type)
+			/*
+			 * Make sure we're both on the same task, or both
+			 * per-cpu events.
+			 */
+			if (group_leader->ctx->task != ctx->task)
+				goto err_context;
+
+			/*
+			 * Make sure we're both events for the same CPU;
+			 * grouping events for different CPUs is broken; since
+			 * you can never concurrently schedule them anyhow.
+			 */
+			if (group_leader->cpu != event->cpu)
 				goto err_context;
 		} else {
 			if (group_leader->ctx != ctx)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-01-27 21:28     ` Vince Weaver
@ 2015-01-29  2:16       ` Vince Weaver
  2015-01-29  7:51         ` Jiri Olsa
  0 siblings, 1 reply; 29+ messages in thread
From: Vince Weaver @ 2015-01-29  2:16 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Peter Zijlstra, mingo, linux-kernel, Stephane Eranian, jolsa,
	mark.rutland, torvalds, tglx

On Tue, 27 Jan 2015, Vince Weaver wrote:

> On Mon, 26 Jan 2015, Peter Zijlstra wrote:
> 
> > I think the below should cure this; if we install a group leader it will
> > iterate the (still intact) group list and find its siblings and try and
> > install those too -- even though those still have the old event->ctx --
> > in the new ctx.
> 
> I've been fuzzing 24 hours (despite the blizzard) with this patch
> plus the original series on top of 3.19-rc6 on the Haswell machine and it 
> is looking good.

OK it took about 48 hours but I managed to generate this:

[162118.235829] ------------[ cut here ]------------
[162118.241388] WARNING: CPU: 5 PID: 13571 at kernel/events/core.c:1644 perf_remove_from_context+0xf5/0x120()
[162118.252183] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_controller aesni_intel snd_hda_codec aes_x86_64 snd_hwdep lrw gf128mul snd_pcm ppdev glue_helper xhci_pci mei_me iTCO_wdt iTCO_vendor_support i915 snd_timer drm_kms_helper snd drm ablk_helper lpc_ich mfd_core evdev pcspkr parport_pc psmouse cryptd soundcore i2c_i801 serio_raw parport xhci_hcd mei wmi tpm_tis tpm video battery button processor i2c_algo_bit sg sr_mod sd_mod cdrom ahci libahci e1000e ehci_pci libata ptp ehci_hcd crc32c_intel usbcore scsi_mod usb_common pps_core thermal fan thermal_sys
[162118.325574] CPU: 5 PID: 13571 Comm: perf_fuzzer Tainted: G        W      3.19.0-rc6+ #126
[162118.334984] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[162118.343581]  ffffffff81a3f194 ffff8800c8463b48 ffffffff816b6761 0000000000000000
[162118.352277]  0000000000000000 ffff8800c8463b88 ffffffff8106dcda ffff8800c9286c40
[162118.360962]  ffff8800c8c01800 ffff8800c8c3a090 ffff8800c8c3a090 ffff8800c71b4800
[162118.369669] Call Trace:
[162118.372984]  [<ffffffff816b6761>] dump_stack+0x45/0x57
[162118.379170]  [<ffffffff8106dcda>] warn_slowpath_common+0x8a/0xc0
[162118.386267]  [<ffffffff8106ddca>] warn_slowpath_null+0x1a/0x20
[162118.393160]  [<ffffffff811554b5>] perf_remove_from_context+0xf5/0x120
[162118.400706]  [<ffffffff811571c5>] put_event+0x115/0x170
[162118.407004]  [<ffffffff81157101>] ? put_event+0x51/0x170
[162118.413340]  [<ffffffff816bc06e>] ? mutex_unlock+0xe/0x10
[162118.419792]  [<ffffffff81157255>] perf_release+0x15/0x20
[162118.426144]  [<ffffffff811e234f>] __fput+0xdf/0x1f0
[162118.432009]  [<ffffffff811e24ae>] ____fput+0xe/0x10
[162118.437895]  [<ffffffff8108bbc7>] task_work_run+0xa7/0xe0
[162118.444377]  [<ffffffff81070799>] do_exit+0x319/0xac0
[162118.450443]  [<ffffffff8107cc99>] ? get_signal+0xe9/0x770
[162118.456906]  [<ffffffff8107cf09>] ? get_signal+0x359/0x770
[162118.463427]  [<ffffffff81070fe4>] do_group_exit+0x54/0xe0
[162118.469887]  [<ffffffff8107ce46>] get_signal+0x296/0x770
[162118.476237]  [<ffffffff8105ded2>] ? __do_page_fault+0x1f2/0x580
[162118.483251]  [<ffffffff81013578>] do_signal+0x28/0xbb0
[162118.489392]  [<ffffffff8105e282>] ? do_page_fault+0x22/0x30
[162118.496055]  [<ffffffff81014170>] do_notify_resume+0x70/0x90
[162118.502811]  [<ffffffff816bf662>] retint_signal+0x48/0x86
[162118.509272] ---[ end trace 55752a03ec8ab978 ]---



* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-01-29  2:16       ` Vince Weaver
@ 2015-01-29  7:51         ` Jiri Olsa
  2015-01-29 13:44           ` Peter Zijlstra
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Olsa @ 2015-01-29  7:51 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Peter Zijlstra, mingo, linux-kernel, Stephane Eranian,
	mark.rutland, torvalds, tglx

On Wed, Jan 28, 2015 at 09:16:49PM -0500, Vince Weaver wrote:
> On Tue, 27 Jan 2015, Vince Weaver wrote:
> 
> > On Mon, 26 Jan 2015, Peter Zijlstra wrote:
> > 
> > > I think the below should cure this; if we install a group leader it will
> > > iterate the (still intact) group list and find its siblings and try and
> > > install those too -- even though those still have the old event->ctx --
> > > in the new ctx.
> > 
> > I've been fuzzing 24 hours (despite the blizzard) with this patch
> > plus the original series on top of 3.19-rc6 on the Haswell machine and it 
> > is looking good.
> 
> OK it took about 48 hours but I managed to generate this:
> 
> [162118.235829] ------------[ cut here ]------------
> [162118.241388] WARNING: CPU: 5 PID: 13571 at kernel/events/core.c:1644 perf_remove_from_context+0xf5/0x120()
> [162118.252183] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_controller aesni_intel snd_hda_codec aes_x86_64 snd_hwdep lrw gf128mul snd_pcm ppdev glue_helper xhci_pci mei_me iTCO_wdt iTCO_vendor_support i915 snd_timer drm_kms_helper snd drm ablk_helper lpc_ich mfd_core evdev pcspkr parport_pc psmouse cryptd soundcore i2c_i801 serio_raw parport xhci_hcd mei wmi tpm_tis tpm video battery button processor i2c_algo_bit sg sr_mod sd_mod cdrom ahci libahci e1000e ehci_pci libata ptp ehci_hcd crc32c_intel usbcore scsi_mod usb_common pps_core thermal fan thermal_sys
> [162118.325574] CPU: 5 PID: 13571 Comm: perf_fuzzer Tainted: G        W      3.19.0-rc6+ #126
> [162118.334984] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> [162118.343581]  ffffffff81a3f194 ffff8800c8463b48 ffffffff816b6761 0000000000000000
> [162118.352277]  0000000000000000 ffff8800c8463b88 ffffffff8106dcda ffff8800c9286c40
> [162118.360962]  ffff8800c8c01800 ffff8800c8c3a090 ffff8800c8c3a090 ffff8800c71b4800
> [162118.369669] Call Trace:
> [162118.372984]  [<ffffffff816b6761>] dump_stack+0x45/0x57
> [162118.379170]  [<ffffffff8106dcda>] warn_slowpath_common+0x8a/0xc0
> [162118.386267]  [<ffffffff8106ddca>] warn_slowpath_null+0x1a/0x20
> [162118.393160]  [<ffffffff811554b5>] perf_remove_from_context+0xf5/0x120
> [162118.400706]  [<ffffffff811571c5>] put_event+0x115/0x170
> [162118.407004]  [<ffffffff81157101>] ? put_event+0x51/0x170
> [162118.413340]  [<ffffffff816bc06e>] ? mutex_unlock+0xe/0x10
> [162118.419792]  [<ffffffff81157255>] perf_release+0x15/0x20
> [162118.426144]  [<ffffffff811e234f>] __fput+0xdf/0x1f0
> [162118.432009]  [<ffffffff811e24ae>] ____fput+0xe/0x10
> [162118.437895]  [<ffffffff8108bbc7>] task_work_run+0xa7/0xe0
> [162118.444377]  [<ffffffff81070799>] do_exit+0x319/0xac0
> [162118.450443]  [<ffffffff8107cc99>] ? get_signal+0xe9/0x770
> [162118.456906]  [<ffffffff8107cf09>] ? get_signal+0x359/0x770
> [162118.463427]  [<ffffffff81070fe4>] do_group_exit+0x54/0xe0
> [162118.469887]  [<ffffffff8107ce46>] get_signal+0x296/0x770
> [162118.476237]  [<ffffffff8105ded2>] ? __do_page_fault+0x1f2/0x580
> [162118.483251]  [<ffffffff81013578>] do_signal+0x28/0xbb0
> [162118.489392]  [<ffffffff8105e282>] ? do_page_fault+0x22/0x30
> [162118.496055]  [<ffffffff81014170>] do_notify_resume+0x70/0x90
> [162118.502811]  [<ffffffff816bf662>] retint_signal+0x48/0x86
> [162118.509272] ---[ end trace 55752a03ec8ab978 ]---
> 


hum, I don't see how the event->ctx could be changed once the task is
already in exit, but should we read event->ctx only after the refcount check?

jirka


---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 66c1da1ddc2c..0507e3dd6955 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3572,7 +3572,7 @@ static void perf_remove_from_owner(struct perf_event *event)
  */
 static void put_event(struct perf_event *event)
 {
-	struct perf_event_context *ctx = event->ctx;
+	struct perf_event_context *ctx;
 
 	if (!atomic_long_dec_and_test(&event->refcount))
 		return;
@@ -3580,6 +3580,7 @@ static void put_event(struct perf_event *event)
 	if (!is_kernel_event(event))
 		perf_remove_from_owner(event);
 
+	ctx = event->ctx;
 	WARN_ON_ONCE(ctx->parent_ctx);
 	/*
 	 * There are two ways this annotation is useful:


* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-01-29  7:51         ` Jiri Olsa
@ 2015-01-29 13:44           ` Peter Zijlstra
  2015-02-04 14:39             ` [tip:perf/core] perf: Fix put_event() ctx lock tip-bot for Peter Zijlstra
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2015-01-29 13:44 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Vince Weaver, mingo, linux-kernel, Stephane Eranian,
	mark.rutland, torvalds, tglx

On Thu, Jan 29, 2015 at 08:51:26AM +0100, Jiri Olsa wrote:
> > [162118.235829] ------------[ cut here ]------------
> > [162118.241388] WARNING: CPU: 5 PID: 13571 at kernel/events/core.c:1644 perf_remove_from_context+0xf5/0x120()
> > [162118.252183] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_controller aesni_intel snd_hda_codec aes_x86_64 snd_hwdep lrw gf128mul snd_pcm ppdev glue_helper xhci_pci mei_me iTCO_wdt iTCO_vendor_support i915 snd_timer drm_kms_helper snd drm ablk_helper lpc_ich mfd_core evdev pcspkr parport_pc psmouse cryptd soundcore i2c_i801 serio_raw parport xhci_hcd mei wmi tpm_tis tpm video battery button processor i2c_algo_bit sg sr_mod sd_mod cdrom ahci libahci e1000e ehci_pci libata ptp ehci_hcd crc32c_intel usbcore scsi_mod usb_common pps_core thermal fan thermal_sys
> > [162118.325574] CPU: 5 PID: 13571 Comm: perf_fuzzer Tainted: G        W      3.19.0-rc6+ #126
> > [162118.334984] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> > [162118.343581]  ffffffff81a3f194 ffff8800c8463b48 ffffffff816b6761 0000000000000000
> > [162118.352277]  0000000000000000 ffff8800c8463b88 ffffffff8106dcda ffff8800c9286c40
> > [162118.360962]  ffff8800c8c01800 ffff8800c8c3a090 ffff8800c8c3a090 ffff8800c71b4800
> > [162118.369669] Call Trace:
> > [162118.372984]  [<ffffffff816b6761>] dump_stack+0x45/0x57
> > [162118.379170]  [<ffffffff8106dcda>] warn_slowpath_common+0x8a/0xc0
> > [162118.386267]  [<ffffffff8106ddca>] warn_slowpath_null+0x1a/0x20
> > [162118.393160]  [<ffffffff811554b5>] perf_remove_from_context+0xf5/0x120
> > [162118.400706]  [<ffffffff811571c5>] put_event+0x115/0x170
> > [162118.407004]  [<ffffffff81157101>] ? put_event+0x51/0x170
> > [162118.413340]  [<ffffffff816bc06e>] ? mutex_unlock+0xe/0x10
> > [162118.419792]  [<ffffffff81157255>] perf_release+0x15/0x20
> > [162118.426144]  [<ffffffff811e234f>] __fput+0xdf/0x1f0
> > [162118.432009]  [<ffffffff811e24ae>] ____fput+0xe/0x10
> > [162118.437895]  [<ffffffff8108bbc7>] task_work_run+0xa7/0xe0
> > [162118.444377]  [<ffffffff81070799>] do_exit+0x319/0xac0
> > [162118.450443]  [<ffffffff8107cc99>] ? get_signal+0xe9/0x770
> > [162118.456906]  [<ffffffff8107cf09>] ? get_signal+0x359/0x770
> > [162118.463427]  [<ffffffff81070fe4>] do_group_exit+0x54/0xe0
> > [162118.469887]  [<ffffffff8107ce46>] get_signal+0x296/0x770
> > [162118.476237]  [<ffffffff8105ded2>] ? __do_page_fault+0x1f2/0x580
> > [162118.483251]  [<ffffffff81013578>] do_signal+0x28/0xbb0
> > [162118.489392]  [<ffffffff8105e282>] ? do_page_fault+0x22/0x30
> > [162118.496055]  [<ffffffff81014170>] do_notify_resume+0x70/0x90
> > [162118.502811]  [<ffffffff816bf662>] retint_signal+0x48/0x86
> > [162118.509272] ---[ end trace 55752a03ec8ab978 ]---
> > 
> 
> 
> hum, I don't see how the event->ctx could be changed once the task is
> already in exit, but should we read event->ctx only after the refcount check?
> 

So what I suspect (though I'm in zombie mode today, it seems) is that
while I initially thought it was impossible for ctx to change once the
refcount dropped to 0, I now suspect it's possible.

Note that until perf_remove_from_context() the event is still active and
visible on the lists. So a concurrent sys_perf_event_open() from another
task into this task can race.

In this particular case the window seems very narrow, since the syscall
has to start before the task goes into shutdown mode; otherwise
find_get_context()'s PF_EXITING test would have bailed.


---
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -947,7 +947,8 @@ static void put_ctx(struct perf_event_co
  *	    perf_event::mmap_mutex
  *	    mmap_sem
  */
-static struct perf_event_context *perf_event_ctx_lock(struct perf_event *event)
+static struct perf_event_context *
+perf_event_ctx_lock_nested(struct perf_event *event, int nesting)
 {
 	struct perf_event_context *ctx;
 
@@ -960,7 +961,7 @@ static struct perf_event_context *perf_e
 	}
 	rcu_read_unlock();
 
-	mutex_lock(&ctx->mutex);
+	mutex_lock_nested(&ctx->mutex, nesting);
 	if (event->ctx != ctx) {
 		mutex_unlock(&ctx->mutex);
 		put_ctx(ctx);
@@ -970,6 +971,12 @@ static struct perf_event_context *perf_e
 	return ctx;
 }
 
+static inline struct perf_event_context *
+perf_event_ctx_lock(struct perf_event *event)
+{
+	return perf_event_ctx_lock_nested(event, 0);
+}
+
 static void perf_event_ctx_unlock(struct perf_event *event,
 				  struct perf_event_context *ctx)
 {
@@ -3572,7 +3579,7 @@ static void perf_remove_from_owner(struc
  */
 static void put_event(struct perf_event *event)
 {
-	struct perf_event_context *ctx = event->ctx;
+	struct perf_event_context *ctx;
 
 	if (!atomic_long_dec_and_test(&event->refcount))
 		return;
@@ -3580,7 +3587,6 @@ static void put_event(struct perf_event
 	if (!is_kernel_event(event))
 		perf_remove_from_owner(event);
 
-	WARN_ON_ONCE(ctx->parent_ctx);
 	/*
 	 * There are two ways this annotation is useful:
 	 *
@@ -3593,7 +3599,8 @@ static void put_event(struct perf_event
 	 *     the last filedesc died, so there is no possibility
 	 *     to trigger the AB-BA case.
 	 */
-	mutex_lock_nested(&ctx->mutex, SINGLE_DEPTH_NESTING);
+	ctx = perf_event_ctx_lock_nested(event, SINGLE_DEPTH_NESTING);
+	WARN_ON_ONCE(ctx->parent_ctx);
 	perf_remove_from_context(event, true);
 	mutex_unlock(&ctx->mutex);
 


* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-01-26 16:26   ` Peter Zijlstra
  2015-01-27 21:28     ` Vince Weaver
@ 2015-01-29 14:47     ` Peter Zijlstra
  2015-02-02  6:33       ` Vince Weaver
  2015-02-04 14:39     ` [tip:perf/core] perf: Fix move_group() order tip-bot for Peter Zijlstra (Intel)
  2 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2015-01-29 14:47 UTC (permalink / raw)
  To: mingo, linux-kernel
  Cc: vincent.weaver, eranian, jolsa, mark.rutland, torvalds, tglx

On Mon, Jan 26, 2015 at 05:26:39PM +0100, Peter Zijlstra wrote:
> On Fri, Jan 23, 2015 at 01:52:01PM +0100, Peter Zijlstra wrote:
> Jiri reported triggering that WARN_ON_ONCE over the weekend:
> 
>  event_sched_out.isra.79+0x2b9/0x2d0
>  group_sched_out+0x69/0xc0
>  ctx_sched_out+0x106/0x130
>  task_ctx_sched_out+0x37/0x70
>  __perf_install_in_context+0x70/0x1a0
>  remote_function+0x48/0x60
>  generic_exec_single+0x15b/0x1d0
>  smp_call_function_single+0x67/0xa0
>  task_function_call+0x53/0x80
>  perf_install_in_context+0x8b/0x110
> 
> 
> I think the below should cure this; if we install a group leader it will
> iterate the (still intact) group list and find its siblings and try and
> install those too -- even though those still have the old event->ctx --
> in the new ctx.
> 
> Upon installing the first group sibling we'd try and schedule out the
> group and trigger the above warn.
> 
> Fix this by installing the group leader last, installing siblings would
> have no effect, they're not reachable through the group lists and
> therefore we don't schedule them.
> 
> Also delay resetting the state until we're absolutely sure the events
> are quiescent -- which raises the question; should perf_pmu_migrate()
> not also have perf_event__state_init() calls in?

So perf_pmu_migrate_context() does an explicit OFF->INACTIVE transition on
the install side, which seems good enough.

That said, it does need that siblings-first, leaders-later install
order too. So I've put the below on top.

---
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7817,8 +7817,35 @@ void perf_pmu_migrate_context(struct pmu
 		list_add(&event->migrate_entry, &events);
 	}
 
+	/*
+	 * Wait for the events to quiesce before re-instating them.
+	 */
 	synchronize_rcu();
 
+	/*
+	 * Re-instate events in 2 passes.
+	 *
+	 * Skip over group leaders and only install siblings on this first
+	 * pass, siblings will not get enabled without a leader, however a
+	 * leader will enable its siblings, even if those are still on the old
+	 * context.
+	 */
+	list_for_each_entry_safe(event, tmp, &events, migrate_entry) {
+		if (event->group_leader == event)
+			continue;
+
+		list_del(&event->migrate_entry);
+		if (event->state >= PERF_EVENT_STATE_OFF)
+			event->state = PERF_EVENT_STATE_INACTIVE;
+		account_event_cpu(event, dst_cpu);
+		perf_install_in_context(dst_ctx, event, dst_cpu);
+		get_ctx(dst_ctx);
+	}
+
+	/*
+	 * Once all the siblings are setup properly, install the group leaders
+	 * to make it go.
+	 */
 	list_for_each_entry_safe(event, tmp, &events, migrate_entry) {
 		list_del(&event->migrate_entry);
 		if (event->state >= PERF_EVENT_STATE_OFF)


* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-01-29 14:47     ` [RFC][PATCH 2/3] perf: Add a bit of paranoia Peter Zijlstra
@ 2015-02-02  6:33       ` Vince Weaver
  2015-02-02 15:42         ` Peter Zijlstra
  0 siblings, 1 reply; 29+ messages in thread
From: Vince Weaver @ 2015-02-02  6:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, linux-kernel, vincent.weaver, eranian, jolsa,
	mark.rutland, torvalds, tglx

On Thu, 29 Jan 2015, Peter Zijlstra wrote:

> That said, it does need that siblings-first, leaders-later install
> order too. So I've put the below on top.

so I've lost track of exactly which patches I should be running (do I need 
to apply both of the additional patches?)

Meanwhile my Haswell was still fuzzing away (without those two
updated patches) until it triggered the below and crashed.

[407484.309136] ------------[ cut here ]------------
[407484.314590] WARNING: CPU: 3 PID: 27209 at kernel/watchdog.c:290 watchdog_overflow_callback+0x92/0xc0()
[407484.325090] Watchdog detected hard LOCKUP on cpu 3
[407484.330093] Modules linked in: btrfs xor raid6_pq ntfs vfat msdos fat dm_mod fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_controller aesni_intel snd_hda_codec aes_x86_64 snd_hwdep lrw gf128mul snd_pcm ppdev glue_helper xhci_pci mei_me iTCO_wdt iTCO_vendor_support i915 snd_timer drm_kms_helper snd drm ablk_helper lpc_ich mfd_core evdev pcspkr parport_pc psmouse cryptd soundcore i2c_i801 serio_raw parport xhci_hcd mei wmi tpm_tis tpm video battery button processor i2c_algo_bit sg sr_mod sd_mod cdrom ahci libahci e1000e ehci_pci libata ptp ehci_hcd crc32c_intel usbcore scsi_mod usb_common pps_core thermal fan thermal_sys
[407484.408496] CPU: 3 PID: 27209 Comm: perf_fuzzer Tainted: G        W      3.19.0-rc6+ #126
[407484.417914] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[407484.426497]  ffffffff81a3d3da ffff88011eac5aa0 ffffffff816b6761 0000000000000000
[407484.435161]  ffff88011eac5af0 ffff88011eac5ae0 ffffffff8106dcda ffff88011eac5b00
[407484.443900]  ffff8801195f9800 0000000000000001 ffff88011eac5c40 ffff88011eac5ef8
[407484.452588] Call Trace:
[407484.455862]  <NMI>  [<ffffffff816b6761>] dump_stack+0x45/0x57
[407484.462741]  [<ffffffff8106dcda>] warn_slowpath_common+0x8a/0xc0
[407484.469851]  [<ffffffff8106dd56>] warn_slowpath_fmt+0x46/0x50
[407484.476743]  [<ffffffff8101d90a>] ? native_sched_clock+0x2a/0x90
[407484.483888]  [<ffffffff8101d90a>] ? native_sched_clock+0x2a/0x90
[407484.490999]  [<ffffffff8111b922>] watchdog_overflow_callback+0x92/0xc0
[407484.498672]  [<ffffffff8115c3f1>] __perf_event_overflow+0x91/0x270
[407484.505984]  [<ffffffff8102b15a>] ? x86_perf_event_set_period+0xca/0x170
[407484.513834]  [<ffffffff8115ced9>] perf_event_overflow+0x19/0x20
[407484.520812]  [<ffffffff8103266a>] intel_pmu_handle_irq+0x1ba/0x3a0
[407484.528119]  [<ffffffff8102a04b>] perf_event_nmi_handler+0x2b/0x50
[407484.535402]  [<ffffffff81018610>] nmi_handle+0xa0/0x150
[407484.541701]  [<ffffffff81018575>] ? nmi_handle+0x5/0x150
[407484.548069]  [<ffffffff810188ba>] default_do_nmi+0x4a/0x140
[407484.554692]  [<ffffffff81018a48>] do_nmi+0x98/0xe0
[407484.560517]  [<ffffffff816c0bf1>] end_repeat_nmi+0x1e/0x2e
[407484.567054]  [<ffffffff81058bff>] ? perf_get_regs_user+0xbf/0x190
[407484.574256]  [<ffffffff81058bff>] ? perf_get_regs_user+0xbf/0x190
[407484.581431]  [<ffffffff81058bff>] ? perf_get_regs_user+0xbf/0x190
[407484.588602]  <<EOE>>  <IRQ>  [<ffffffff8115c28c>] perf_prepare_sample+0x2ec/0x3c0
[407484.597358]  [<ffffffff8115c46e>] __perf_event_overflow+0x10e/0x270
[407484.604708]  [<ffffffff8115c439>] ? __perf_event_overflow+0xd9/0x270
[407484.612215]  [<ffffffff8115c924>] ? perf_tp_event+0xc4/0x210
[407484.619000]  [<ffffffff8115cfe2>] ? __perf_sw_event+0x72/0x1f0
[407484.625937]  [<ffffffff8115c799>] ? perf_swevent_overflow+0xa9/0xe0
[407484.633287]  [<ffffffff8115c799>] perf_swevent_overflow+0xa9/0xe0
[407484.640467]  [<ffffffff8115c837>] perf_swevent_event+0x67/0x90
[407484.647343]  [<ffffffff8115c924>] perf_tp_event+0xc4/0x210
[407484.653923]  [<ffffffff810b6fa9>] ? lock_acquire+0x119/0x130
[407484.660606]  [<ffffffff810b3cf6>] ? perf_trace_lock_acquire+0x146/0x180
[407484.668332]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[407484.675962]  [<ffffffff810b3cf6>] perf_trace_lock_acquire+0x146/0x180
[407484.683490]  [<ffffffff810b6fa9>] ? lock_acquire+0x119/0x130
[407484.690211]  [<ffffffff810b6fa9>] lock_acquire+0x119/0x130
[407484.696750]  [<ffffffff8115b7f5>] ? perf_event_wakeup+0x5/0xf0
[407484.703640]  [<ffffffff811f50ef>] ? kill_fasync+0xf/0xf0
[407484.710008]  [<ffffffff8115b828>] perf_event_wakeup+0x38/0xf0
[407484.716798]  [<ffffffff8115b7f5>] ? perf_event_wakeup+0x5/0xf0
[407484.723696]  [<ffffffff8115b913>] perf_pending_event+0x33/0x60
[407484.730570]  [<ffffffff8114cc7c>] irq_work_run_list+0x4c/0x80
[407484.737392]  [<ffffffff8114ccc8>] irq_work_run+0x18/0x40
[407484.743765]  [<ffffffff8101955f>] smp_trace_irq_work_interrupt+0x3f/0xc0
[407484.751579]  [<ffffffff816c01fd>] trace_irq_work_interrupt+0x6d/0x80
[407484.759046]  <EOI>  [<ffffffff810b6f4d>] ? lock_acquire+0xbd/0x130
[407484.766380]  [<ffffffff811f4f42>] ? SyS_fcntl+0x5b2/0x650
[407484.772786]  [<ffffffff816bdb31>] _raw_spin_lock+0x31/0x40
[407484.779321]  [<ffffffff811f4f42>] ? SyS_fcntl+0x5b2/0x650
[407484.785813]  [<ffffffff811f4f42>] SyS_fcntl+0x5b2/0x650
[407484.792109]  [<ffffffff816be6ed>] system_call_fastpath+0x16/0x1b
[407484.799195] ---[ end trace 55752a03ec8ab979 ]---
[407490.307881] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 4, t=5252 jiffies, g=36308737, c=36308736, q=207)
[407490.320908] Task dump for CPU 3:
[407490.325104] perf_fuzzer     R  running task        0 27209   2304 0x00000008
[407490.333439]  ffffffff8115c837 ffff8800c4293cc8 ffffffff8115c924 ffff880000000024
[407490.342163]  0000000000000001 ffff8800c4293b80 ffff8800c4293d90 ffff8800ceb81000
[407490.350899]  ffffe8ffffcca880 ffff8800c4293b10 ffffffff8115c799 ffff8800ceb81000
[407490.359619] Call Trace:
[407490.362932]  [<ffffffff8115c837>] ? perf_swevent_event+0x67/0x90
[407490.370033]  [<ffffffff8115c924>] ? perf_tp_event+0xc4/0x210
[407490.376770]  [<ffffffff8115c799>] ? perf_swevent_overflow+0xa9/0xe0
[407490.384068]  [<ffffffff8115c837>] ? perf_swevent_event+0x67/0x90
[407490.391073]  [<ffffffff8115c924>] ? perf_tp_event+0xc4/0x210
[407490.397696]  [<ffffffff810b6fa9>] ? lock_acquire+0x119/0x130
[407490.404328]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[407490.411921]  [<ffffffff810b3cf6>] ? perf_trace_lock_acquire+0x146/0x180
[407490.419567]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[407490.427133]  [<ffffffff810b6f4d>] ? lock_acquire+0xbd/0x130
[407490.433668]  [<ffffffff811f4f42>] ? SyS_fcntl+0x5b2/0x650
[407490.440062]  [<ffffffff816bdb31>] ? _raw_spin_lock+0x31/0x40
[407490.446600]  [<ffffffff811f4f42>] ? SyS_fcntl+0x5b2/0x650
[407490.452913]  [<ffffffff811f4f42>] ? SyS_fcntl+0x5b2/0x650
[407490.459253]  [<ffffffff816be6ed>] ? system_call_fastpath+0x16/0x1b
(repeat forever filling up the disk with these messages)



* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-02-02  6:33       ` Vince Weaver
@ 2015-02-02 15:42         ` Peter Zijlstra
  2015-02-02 17:32           ` Peter Zijlstra
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2015-02-02 15:42 UTC (permalink / raw)
  To: Vince Weaver
  Cc: mingo, linux-kernel, eranian, jolsa, mark.rutland, torvalds, tglx

On Mon, Feb 02, 2015 at 01:33:14AM -0500, Vince Weaver wrote:
> On Thu, 29 Jan 2015, Peter Zijlstra wrote:
> 
> > That said, it does need that siblings-first, leaders-later install
> > order too. So I've put the below on top.
> 
> so I've lost track of exactly which patches I should be running (do I need 
> to apply both of the additional patches?)

Probably, lemme try and get all of the current stuff in tip/master to
make for easier testing.

> Meanwhile my Haswell was still fuzzing away (without those two 
> updated patches) until it triggered the below and crashed.
> 
> [407484.309136] ------------[ cut here ]------------
> [407484.314590] WARNING: CPU: 3 PID: 27209 at kernel/watchdog.c:290 watchdog_overflow_callback+0x92/0xc0()
> [407484.325090] Watchdog detected hard LOCKUP on cpu 3
> [407484.330093] Modules linked in: btrfs xor raid6_pq ntfs vfat msdos fat dm_mod fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_controller aesni_intel snd_hda_codec aes_x86_64 snd_hwdep lrw gf128mul snd_pcm ppdev glue_helper xhci_pci mei_me iTCO_wdt iTCO_vendor_support i915 snd_timer drm_kms_helper snd drm ablk_helper lpc_ich mfd_core evdev pcspkr parport_pc psmouse cryptd soundcore i2c_i801 serio_raw parport xhci_hcd mei wmi tpm_tis tpm video battery button processor i2c_algo_bit sg sr_mod sd_mod cdrom ahci libahci e1000e ehci_pci libata ptp ehci_hcd crc32c_intel usbcore scsi_mod usb_common pps_core thermal fan thermal_sys
> [407484.408496] CPU: 3 PID: 27209 Comm: perf_fuzzer Tainted: G        W      3.19.0-rc6+ #126
> [407484.417914] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> [407484.426497]  ffffffff81a3d3da ffff88011eac5aa0 ffffffff816b6761 0000000000000000
> [407484.435161]  ffff88011eac5af0 ffff88011eac5ae0 ffffffff8106dcda ffff88011eac5b00
> [407484.443900]  ffff8801195f9800 0000000000000001 ffff88011eac5c40 ffff88011eac5ef8
> [407484.452588] Call Trace:
> [407484.455862]  <NMI>  [<ffffffff816b6761>] dump_stack+0x45/0x57
> [407484.462741]  [<ffffffff8106dcda>] warn_slowpath_common+0x8a/0xc0
> [407484.469851]  [<ffffffff8106dd56>] warn_slowpath_fmt+0x46/0x50
> [407484.476743]  [<ffffffff8101d90a>] ? native_sched_clock+0x2a/0x90
> [407484.483888]  [<ffffffff8101d90a>] ? native_sched_clock+0x2a/0x90
> [407484.490999]  [<ffffffff8111b922>] watchdog_overflow_callback+0x92/0xc0
> [407484.498672]  [<ffffffff8115c3f1>] __perf_event_overflow+0x91/0x270
> [407484.505984]  [<ffffffff8102b15a>] ? x86_perf_event_set_period+0xca/0x170
> [407484.513834]  [<ffffffff8115ced9>] perf_event_overflow+0x19/0x20
> [407484.520812]  [<ffffffff8103266a>] intel_pmu_handle_irq+0x1ba/0x3a0
> [407484.528119]  [<ffffffff8102a04b>] perf_event_nmi_handler+0x2b/0x50
> [407484.535402]  [<ffffffff81018610>] nmi_handle+0xa0/0x150
> [407484.541701]  [<ffffffff81018575>] ? nmi_handle+0x5/0x150
> [407484.548069]  [<ffffffff810188ba>] default_do_nmi+0x4a/0x140
> [407484.554692]  [<ffffffff81018a48>] do_nmi+0x98/0xe0
> [407484.560517]  [<ffffffff816c0bf1>] end_repeat_nmi+0x1e/0x2e
> [407484.567054]  [<ffffffff81058bff>] ? perf_get_regs_user+0xbf/0x190
> [407484.574256]  [<ffffffff81058bff>] ? perf_get_regs_user+0xbf/0x190
> [407484.581431]  [<ffffffff81058bff>] ? perf_get_regs_user+0xbf/0x190
> [407484.588602]  <<EOE>>  <IRQ>  [<ffffffff8115c28c>] perf_prepare_sample+0x2ec/0x3c0
> [407484.597358]  [<ffffffff8115c46e>] __perf_event_overflow+0x10e/0x270
> [407484.604708]  [<ffffffff8115c439>] ? __perf_event_overflow+0xd9/0x270
> [407484.612215]  [<ffffffff8115c924>] ? perf_tp_event+0xc4/0x210
> [407484.619000]  [<ffffffff8115cfe2>] ? __perf_sw_event+0x72/0x1f0
> [407484.625937]  [<ffffffff8115c799>] ? perf_swevent_overflow+0xa9/0xe0
> [407484.633287]  [<ffffffff8115c799>] perf_swevent_overflow+0xa9/0xe0
> [407484.640467]  [<ffffffff8115c837>] perf_swevent_event+0x67/0x90
> [407484.647343]  [<ffffffff8115c924>] perf_tp_event+0xc4/0x210
> [407484.653923]  [<ffffffff810b6fa9>] ? lock_acquire+0x119/0x130
> [407484.660606]  [<ffffffff810b3cf6>] ? perf_trace_lock_acquire+0x146/0x180
> [407484.668332]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
> [407484.675962]  [<ffffffff810b3cf6>] perf_trace_lock_acquire+0x146/0x180
> [407484.683490]  [<ffffffff810b6fa9>] ? lock_acquire+0x119/0x130
> [407484.690211]  [<ffffffff810b6fa9>] lock_acquire+0x119/0x130
> [407484.696750]  [<ffffffff8115b7f5>] ? perf_event_wakeup+0x5/0xf0
> [407484.703640]  [<ffffffff811f50ef>] ? kill_fasync+0xf/0xf0
> [407484.710008]  [<ffffffff8115b828>] perf_event_wakeup+0x38/0xf0
> [407484.716798]  [<ffffffff8115b7f5>] ? perf_event_wakeup+0x5/0xf0
> [407484.723696]  [<ffffffff8115b913>] perf_pending_event+0x33/0x60
> [407484.730570]  [<ffffffff8114cc7c>] irq_work_run_list+0x4c/0x80
> [407484.737392]  [<ffffffff8114ccc8>] irq_work_run+0x18/0x40
> [407484.743765]  [<ffffffff8101955f>] smp_trace_irq_work_interrupt+0x3f/0xc0
> [407484.751579]  [<ffffffff816c01fd>] trace_irq_work_interrupt+0x6d/0x80
> [407484.759046]  <EOI>  [<ffffffff810b6f4d>] ? lock_acquire+0xbd/0x130
> [407484.766380]  [<ffffffff811f4f42>] ? SyS_fcntl+0x5b2/0x650
> [407484.772786]  [<ffffffff816bdb31>] _raw_spin_lock+0x31/0x40
> [407484.779321]  [<ffffffff811f4f42>] ? SyS_fcntl+0x5b2/0x650
> [407484.785813]  [<ffffffff811f4f42>] SyS_fcntl+0x5b2/0x650
> [407484.792109]  [<ffffffff816be6ed>] system_call_fastpath+0x16/0x1b
> [407484.799195] ---[ end trace 55752a03ec8ab979 ]---

That looks like tail recursive fun! An irq work that raises an irq work
ad infinitum. Lemme see if I can squash that.. didn't we have something
like this before... /me goes look.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-02-02 15:42         ` Peter Zijlstra
@ 2015-02-02 17:32           ` Peter Zijlstra
  2015-02-04 14:51             ` Jiri Olsa
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2015-02-02 17:32 UTC (permalink / raw)
  To: Vince Weaver
  Cc: mingo, linux-kernel, eranian, jolsa, mark.rutland, torvalds, tglx

On Mon, Feb 02, 2015 at 04:42:40PM +0100, Peter Zijlstra wrote:
> On Mon, Feb 02, 2015 at 01:33:14AM -0500, Vince Weaver wrote:
> > [407484.309136] ------------[ cut here ]------------

> > [407484.588602]  <<EOE>>  <IRQ>  [<ffffffff8115c28c>] perf_prepare_sample+0x2ec/0x3c0
> > [407484.597358]  [<ffffffff8115c46e>] __perf_event_overflow+0x10e/0x270
> > [407484.604708]  [<ffffffff8115c439>] ? __perf_event_overflow+0xd9/0x270
> > [407484.612215]  [<ffffffff8115c924>] ? perf_tp_event+0xc4/0x210
> > [407484.619000]  [<ffffffff8115cfe2>] ? __perf_sw_event+0x72/0x1f0
> > [407484.625937]  [<ffffffff8115c799>] ? perf_swevent_overflow+0xa9/0xe0
> > [407484.633287]  [<ffffffff8115c799>] perf_swevent_overflow+0xa9/0xe0
> > [407484.640467]  [<ffffffff8115c837>] perf_swevent_event+0x67/0x90
> > [407484.647343]  [<ffffffff8115c924>] perf_tp_event+0xc4/0x210
> > [407484.653923]  [<ffffffff810b6fa9>] ? lock_acquire+0x119/0x130
> > [407484.660606]  [<ffffffff810b3cf6>] ? perf_trace_lock_acquire+0x146/0x180
> > [407484.668332]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
> > [407484.675962]  [<ffffffff810b3cf6>] perf_trace_lock_acquire+0x146/0x180
> > [407484.683490]  [<ffffffff810b6fa9>] ? lock_acquire+0x119/0x130
> > [407484.690211]  [<ffffffff810b6fa9>] lock_acquire+0x119/0x130
> > [407484.696750]  [<ffffffff8115b7f5>] ? perf_event_wakeup+0x5/0xf0
> > [407484.703640]  [<ffffffff811f50ef>] ? kill_fasync+0xf/0xf0
> > [407484.710008]  [<ffffffff8115b828>] perf_event_wakeup+0x38/0xf0
> > [407484.716798]  [<ffffffff8115b7f5>] ? perf_event_wakeup+0x5/0xf0
> > [407484.723696]  [<ffffffff8115b913>] perf_pending_event+0x33/0x60
> > [407484.730570]  [<ffffffff8114cc7c>] irq_work_run_list+0x4c/0x80
> > [407484.737392]  [<ffffffff8114ccc8>] irq_work_run+0x18/0x40
> > [407484.743765]  [<ffffffff8101955f>] smp_trace_irq_work_interrupt+0x3f/0xc0
> > [407484.751579]  [<ffffffff816c01fd>] trace_irq_work_interrupt+0x6d/0x80

> > [407484.799195] ---[ end trace 55752a03ec8ab979 ]---
> 
> That looks like tail recursive fun! An irq work that raises an irq work
> ad infinitum. Lemme see if I can squash that.. didn't we have something
> like this before... /me goes look.


Does this make it go away?

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4413,6 +4413,8 @@ static void perf_pending_event(struct ir
 	struct perf_event *event = container_of(entry,
 			struct perf_event, pending);
 
+	int rctx = perf_swevent_get_recursion_context();
+
 	if (event->pending_disable) {
 		event->pending_disable = 0;
 		__perf_event_disable(event);
@@ -4422,6 +4424,8 @@ static void perf_pending_event(struct ir
 		event->pending_wakeup = 0;
 		perf_event_wakeup(event);
 	}
+
+	perf_swevent_put_recursion_context(rctx);
 }
 
 /*


* Re: [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition
  2015-01-23 15:22       ` Mark Rutland
@ 2015-02-04 12:59         ` Peter Zijlstra
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2015-02-04 12:59 UTC (permalink / raw)
  To: Mark Rutland
  Cc: mingo, linux-kernel, vincent.weaver, eranian, jolsa, torvalds, tglx

On Fri, Jan 23, 2015 at 03:22:57PM +0000, Mark Rutland wrote:
> > > It seems this would still allow you to group CPU-affine software and
> > > uncore events, which also doesn't make sense: the software events will
> > > count on a single CPU while the uncore events aren't really CPU-affine.
> > > 
> > > Which isn't anything against this patch, but probably something we
> > > should tighten up too.
> > 
> > Indeed, that would need a wee bit of extra infrastructure though; as we
> > cannot currently distinguish between regular cpuctx and uncore like
> > things.
> 
> Isn't the event->pmu->task_ctx_nr sufficient, as with how we identify
> software events?
> 
> Or am I making some completely bogus assumptions in the diff below?

>  	/*
> +	 * System-wide (A.K.A. "uncore") events cannot be associated with a
> +	 * particular CPU, and hence cannot be associated with a particular
> +	 * task either. It's non-sensical to group them with other event types,
> +	 * which are CPU or task bound.
> +	 */

So I think we want to allow grouping software events with, say, uncore
events; if you start them both out on the same 'cpu',
perf_pmu_migrate_context() would move the software event along with it.

The use case is for non-sampling uncores, where having a software
leader still gets you periodic samples. Clearly looking at task
state or the like is pointless, but PERF_SAMPLE_READ is useful to record
values at regular intervals into the buffer.

But yes, I think ctx_nr might just do.


* [tip:perf/core] perf: Add a bit of paranoia
  2015-01-23 12:52 ` [RFC][PATCH 2/3] perf: Add a bit of paranoia Peter Zijlstra
  2015-01-26 16:26   ` Peter Zijlstra
@ 2015-02-04 14:39   ` tip-bot for Peter Zijlstra
  1 sibling, 0 replies; 29+ messages in thread
From: tip-bot for Peter Zijlstra @ 2015-02-04 14:39 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: acme, mingo, linux-kernel, torvalds, peterz, tglx, hpa

Commit-ID:  652884fe0c7bd57f534c5fe68d6def0dc8c4b7ed
Gitweb:     http://git.kernel.org/tip/652884fe0c7bd57f534c5fe68d6def0dc8c4b7ed
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Fri, 23 Jan 2015 11:20:10 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 4 Feb 2015 08:07:09 +0100

perf: Add a bit of paranoia

Add a few WARN()s to catch things that should never happen.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150123125834.150481799@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index b4a696c..b358cb3 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1275,6 +1275,8 @@ static void perf_group_attach(struct perf_event *event)
 	if (group_leader == event)
 		return;
 
+	WARN_ON_ONCE(group_leader->ctx != event->ctx);
+
 	if (group_leader->group_flags & PERF_GROUP_SOFTWARE &&
 			!is_software_event(event))
 		group_leader->group_flags &= ~PERF_GROUP_SOFTWARE;
@@ -1296,6 +1298,10 @@ static void
 list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 {
 	struct perf_cpu_context *cpuctx;
+
+	WARN_ON_ONCE(event->ctx != ctx);
+	lockdep_assert_held(&ctx->lock);
+
 	/*
 	 * We can have double detach due to exit/hot-unplug + close.
 	 */
@@ -1380,6 +1386,8 @@ static void perf_group_detach(struct perf_event *event)
 
 		/* Inherit group flags from the previous leader */
 		sibling->group_flags = event->group_flags;
+
+		WARN_ON_ONCE(sibling->ctx != event->ctx);
 	}
 
 out:
@@ -1442,6 +1450,10 @@ event_sched_out(struct perf_event *event,
 {
 	u64 tstamp = perf_event_time(event);
 	u64 delta;
+
+	WARN_ON_ONCE(event->ctx != ctx);
+	lockdep_assert_held(&ctx->lock);
+
 	/*
 	 * An event which could not be activated because of
 	 * filter mismatch still needs to have its timings
@@ -7822,14 +7834,19 @@ static void perf_free_event(struct perf_event *event,
 
 	put_event(parent);
 
+	raw_spin_lock_irq(&ctx->lock);
 	perf_group_detach(event);
 	list_del_event(event, ctx);
+	raw_spin_unlock_irq(&ctx->lock);
 	free_event(event);
 }
 
 /*
- * free an unexposed, unused context as created by inheritance by
+ * Free an unexposed, unused context as created by inheritance by
  * perf_event_init_task below, used by fork() in case of fail.
+ *
+ * Not all locks are strictly required, but take them anyway to be nice and
+ * help out with the lockdep assertions.
  */
 void perf_event_free_task(struct task_struct *task)
 {


* [tip:perf/core] perf: Fix event->ctx locking
  2015-01-23 12:52 ` [RFC][PATCH 3/3] perf: Fix event->ctx locking Peter Zijlstra
@ 2015-02-04 14:39   ` tip-bot for Peter Zijlstra
  0 siblings, 0 replies; 29+ messages in thread
From: tip-bot for Peter Zijlstra @ 2015-02-04 14:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, linux-kernel, acme, jolsa, tglx, torvalds, mingo, peterz, paulmck

Commit-ID:  f63a8daa5812afef4f06c962351687e1ff9ccb2b
Gitweb:     http://git.kernel.org/tip/f63a8daa5812afef4f06c962351687e1ff9ccb2b
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Fri, 23 Jan 2015 12:24:14 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 4 Feb 2015 08:07:10 +0100

perf: Fix event->ctx locking

There have been a few reported issues wrt. the lack of locking around
changing event->ctx. This patch tries to address those.

It avoids the whole rwsem thing; and while it appears to work, please
give it some thought in review.

What I did fail at is sensible runtime checks on the use of
event->ctx; the RCU use makes it very hard.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150123125834.209535886@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 244 +++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 207 insertions(+), 37 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index b358cb3..417a96b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -907,6 +907,77 @@ static void put_ctx(struct perf_event_context *ctx)
 }
 
 /*
+ * Because of perf_event::ctx migration in sys_perf_event_open::move_group and
+ * perf_pmu_migrate_context() we need some magic.
+ *
+ * Those places that change perf_event::ctx will hold both
+ * perf_event_ctx::mutex of the 'old' and 'new' ctx value.
+ *
+ * Lock ordering is by mutex address. There is one other site where
+ * perf_event_context::mutex nests and that is put_event(). But remember that
+ * that is a parent<->child context relation, and migration does not affect
+ * children, therefore these two orderings should not interact.
+ *
+ * The change in perf_event::ctx does not affect children (as claimed above)
+ * because the sys_perf_event_open() case will install a new event and break
+ * the ctx parent<->child relation, and perf_pmu_migrate_context() is only
+ * concerned with cpuctx and that doesn't have children.
+ *
+ * The places that change perf_event::ctx will issue:
+ *
+ *   perf_remove_from_context();
+ *   synchronize_rcu();
+ *   perf_install_in_context();
+ *
+ * to affect the change. The remove_from_context() + synchronize_rcu() should
+ * quiesce the event, after which we can install it in the new location. This
+ * means that only external vectors (perf_fops, prctl) can perturb the event
+ * while in transit. Therefore all such accessors should also acquire
+ * perf_event_context::mutex to serialize against this.
+ *
+ * However; because event->ctx can change while we're waiting to acquire
+ * ctx->mutex we must be careful and use the below perf_event_ctx_lock()
+ * function.
+ *
+ * Lock order:
+ *	task_struct::perf_event_mutex
+ *	  perf_event_context::mutex
+ *	    perf_event_context::lock
+ *	    perf_event::child_mutex;
+ *	    perf_event::mmap_mutex
+ *	    mmap_sem
+ */
+static struct perf_event_context *perf_event_ctx_lock(struct perf_event *event)
+{
+	struct perf_event_context *ctx;
+
+again:
+	rcu_read_lock();
+	ctx = ACCESS_ONCE(event->ctx);
+	if (!atomic_inc_not_zero(&ctx->refcount)) {
+		rcu_read_unlock();
+		goto again;
+	}
+	rcu_read_unlock();
+
+	mutex_lock(&ctx->mutex);
+	if (event->ctx != ctx) {
+		mutex_unlock(&ctx->mutex);
+		put_ctx(ctx);
+		goto again;
+	}
+
+	return ctx;
+}
+
+static void perf_event_ctx_unlock(struct perf_event *event,
+				  struct perf_event_context *ctx)
+{
+	mutex_unlock(&ctx->mutex);
+	put_ctx(ctx);
+}
+
+/*
  * This must be done under the ctx->lock, such as to serialize against
  * context_equiv(), therefore we cannot call put_ctx() since that might end up
  * calling scheduler related locks and ctx->lock nests inside those.
@@ -1666,7 +1737,7 @@ int __perf_event_disable(void *info)
  * is the current context on this CPU and preemption is disabled,
  * hence we can't get into perf_event_task_sched_out for this context.
  */
-void perf_event_disable(struct perf_event *event)
+static void _perf_event_disable(struct perf_event *event)
 {
 	struct perf_event_context *ctx = event->ctx;
 	struct task_struct *task = ctx->task;
@@ -1707,6 +1778,19 @@ retry:
 	}
 	raw_spin_unlock_irq(&ctx->lock);
 }
+
+/*
+ * Strictly speaking kernel users cannot create groups and therefore this
+ * interface does not need the perf_event_ctx_lock() magic.
+ */
+void perf_event_disable(struct perf_event *event)
+{
+	struct perf_event_context *ctx;
+
+	ctx = perf_event_ctx_lock(event);
+	_perf_event_disable(event);
+	perf_event_ctx_unlock(event, ctx);
+}
 EXPORT_SYMBOL_GPL(perf_event_disable);
 
 static void perf_set_shadow_time(struct perf_event *event,
@@ -2170,7 +2254,7 @@ unlock:
  * perf_event_for_each_child or perf_event_for_each as described
  * for perf_event_disable.
  */
-void perf_event_enable(struct perf_event *event)
+static void _perf_event_enable(struct perf_event *event)
 {
 	struct perf_event_context *ctx = event->ctx;
 	struct task_struct *task = ctx->task;
@@ -2226,9 +2310,21 @@ retry:
 out:
 	raw_spin_unlock_irq(&ctx->lock);
 }
+
+/*
+ * See perf_event_disable();
+ */
+void perf_event_enable(struct perf_event *event)
+{
+	struct perf_event_context *ctx;
+
+	ctx = perf_event_ctx_lock(event);
+	_perf_event_enable(event);
+	perf_event_ctx_unlock(event, ctx);
+}
 EXPORT_SYMBOL_GPL(perf_event_enable);
 
-int perf_event_refresh(struct perf_event *event, int refresh)
+static int _perf_event_refresh(struct perf_event *event, int refresh)
 {
 	/*
 	 * not supported on inherited events
@@ -2237,10 +2333,25 @@ int perf_event_refresh(struct perf_event *event, int refresh)
 		return -EINVAL;
 
 	atomic_add(refresh, &event->event_limit);
-	perf_event_enable(event);
+	_perf_event_enable(event);
 
 	return 0;
 }
+
+/*
+ * See perf_event_disable()
+ */
+int perf_event_refresh(struct perf_event *event, int refresh)
+{
+	struct perf_event_context *ctx;
+	int ret;
+
+	ctx = perf_event_ctx_lock(event);
+	ret = _perf_event_refresh(event, refresh);
+	perf_event_ctx_unlock(event, ctx);
+
+	return ret;
+}
 EXPORT_SYMBOL_GPL(perf_event_refresh);
 
 static void ctx_sched_out(struct perf_event_context *ctx,
@@ -3433,7 +3544,16 @@ static void perf_remove_from_owner(struct perf_event *event)
 	rcu_read_unlock();
 
 	if (owner) {
-		mutex_lock(&owner->perf_event_mutex);
+		/*
+		 * If we're here through perf_event_exit_task() we're already
+		 * holding ctx->mutex which would be an inversion wrt. the
+		 * normal lock order.
+		 *
+		 * However we can safely take this lock because its the child
+		 * ctx->mutex.
+		 */
+		mutex_lock_nested(&owner->perf_event_mutex, SINGLE_DEPTH_NESTING);
+
 		/*
 		 * We have to re-check the event->owner field, if it is cleared
 		 * we raced with perf_event_exit_task(), acquiring the mutex
@@ -3559,12 +3679,13 @@ static int perf_event_read_group(struct perf_event *event,
 				   u64 read_format, char __user *buf)
 {
 	struct perf_event *leader = event->group_leader, *sub;
-	int n = 0, size = 0, ret = -EFAULT;
 	struct perf_event_context *ctx = leader->ctx;
-	u64 values[5];
+	int n = 0, size = 0, ret;
 	u64 count, enabled, running;
+	u64 values[5];
+
+	lockdep_assert_held(&ctx->mutex);
 
-	mutex_lock(&ctx->mutex);
 	count = perf_event_read_value(leader, &enabled, &running);
 
 	values[n++] = 1 + leader->nr_siblings;
@@ -3579,7 +3700,7 @@ static int perf_event_read_group(struct perf_event *event,
 	size = n * sizeof(u64);
 
 	if (copy_to_user(buf, values, size))
-		goto unlock;
+		return -EFAULT;
 
 	ret = size;
 
@@ -3593,14 +3714,11 @@ static int perf_event_read_group(struct perf_event *event,
 		size = n * sizeof(u64);
 
 		if (copy_to_user(buf + ret, values, size)) {
-			ret = -EFAULT;
-			goto unlock;
+			return -EFAULT;
 		}
 
 		ret += size;
 	}
-unlock:
-	mutex_unlock(&ctx->mutex);
 
 	return ret;
 }
@@ -3672,8 +3790,14 @@ static ssize_t
 perf_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 {
 	struct perf_event *event = file->private_data;
+	struct perf_event_context *ctx;
+	int ret;
 
-	return perf_read_hw(event, buf, count);
+	ctx = perf_event_ctx_lock(event);
+	ret = perf_read_hw(event, buf, count);
+	perf_event_ctx_unlock(event, ctx);
+
+	return ret;
 }
 
 static unsigned int perf_poll(struct file *file, poll_table *wait)
@@ -3699,7 +3823,7 @@ static unsigned int perf_poll(struct file *file, poll_table *wait)
 	return events;
 }
 
-static void perf_event_reset(struct perf_event *event)
+static void _perf_event_reset(struct perf_event *event)
 {
 	(void)perf_event_read(event);
 	local64_set(&event->count, 0);
@@ -3718,6 +3842,7 @@ static void perf_event_for_each_child(struct perf_event *event,
 	struct perf_event *child;
 
 	WARN_ON_ONCE(event->ctx->parent_ctx);
+
 	mutex_lock(&event->child_mutex);
 	func(event);
 	list_for_each_entry(child, &event->child_list, child_list)
@@ -3731,14 +3856,13 @@ static void perf_event_for_each(struct perf_event *event,
 	struct perf_event_context *ctx = event->ctx;
 	struct perf_event *sibling;
 
-	WARN_ON_ONCE(ctx->parent_ctx);
-	mutex_lock(&ctx->mutex);
+	lockdep_assert_held(&ctx->mutex);
+
 	event = event->group_leader;
 
 	perf_event_for_each_child(event, func);
 	list_for_each_entry(sibling, &event->sibling_list, group_entry)
 		perf_event_for_each_child(sibling, func);
-	mutex_unlock(&ctx->mutex);
 }
 
 static int perf_event_period(struct perf_event *event, u64 __user *arg)
@@ -3808,25 +3932,24 @@ static int perf_event_set_output(struct perf_event *event,
 				 struct perf_event *output_event);
 static int perf_event_set_filter(struct perf_event *event, void __user *arg);
 
-static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+static long _perf_ioctl(struct perf_event *event, unsigned int cmd, unsigned long arg)
 {
-	struct perf_event *event = file->private_data;
 	void (*func)(struct perf_event *);
 	u32 flags = arg;
 
 	switch (cmd) {
 	case PERF_EVENT_IOC_ENABLE:
-		func = perf_event_enable;
+		func = _perf_event_enable;
 		break;
 	case PERF_EVENT_IOC_DISABLE:
-		func = perf_event_disable;
+		func = _perf_event_disable;
 		break;
 	case PERF_EVENT_IOC_RESET:
-		func = perf_event_reset;
+		func = _perf_event_reset;
 		break;
 
 	case PERF_EVENT_IOC_REFRESH:
-		return perf_event_refresh(event, arg);
+		return _perf_event_refresh(event, arg);
 
 	case PERF_EVENT_IOC_PERIOD:
 		return perf_event_period(event, (u64 __user *)arg);
@@ -3873,6 +3996,19 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	return 0;
 }
 
+static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	struct perf_event *event = file->private_data;
+	struct perf_event_context *ctx;
+	long ret;
+
+	ctx = perf_event_ctx_lock(event);
+	ret = _perf_ioctl(event, cmd, arg);
+	perf_event_ctx_unlock(event, ctx);
+
+	return ret;
+}
+
 #ifdef CONFIG_COMPAT
 static long perf_compat_ioctl(struct file *file, unsigned int cmd,
 				unsigned long arg)
@@ -3895,11 +4031,15 @@ static long perf_compat_ioctl(struct file *file, unsigned int cmd,
 
 int perf_event_task_enable(void)
 {
+	struct perf_event_context *ctx;
 	struct perf_event *event;
 
 	mutex_lock(&current->perf_event_mutex);
-	list_for_each_entry(event, &current->perf_event_list, owner_entry)
-		perf_event_for_each_child(event, perf_event_enable);
+	list_for_each_entry(event, &current->perf_event_list, owner_entry) {
+		ctx = perf_event_ctx_lock(event);
+		perf_event_for_each_child(event, _perf_event_enable);
+		perf_event_ctx_unlock(event, ctx);
+	}
 	mutex_unlock(&current->perf_event_mutex);
 
 	return 0;
@@ -3907,11 +4047,15 @@ int perf_event_task_enable(void)
 
 int perf_event_task_disable(void)
 {
+	struct perf_event_context *ctx;
 	struct perf_event *event;
 
 	mutex_lock(&current->perf_event_mutex);
-	list_for_each_entry(event, &current->perf_event_list, owner_entry)
-		perf_event_for_each_child(event, perf_event_disable);
+	list_for_each_entry(event, &current->perf_event_list, owner_entry) {
+		ctx = perf_event_ctx_lock(event);
+		perf_event_for_each_child(event, _perf_event_disable);
+		perf_event_ctx_unlock(event, ctx);
+	}
 	mutex_unlock(&current->perf_event_mutex);
 
 	return 0;
@@ -7269,6 +7413,15 @@ out:
 	return ret;
 }
 
+static void mutex_lock_double(struct mutex *a, struct mutex *b)
+{
+	if (b < a)
+		swap(a, b);
+
+	mutex_lock(a);
+	mutex_lock_nested(b, SINGLE_DEPTH_NESTING);
+}
+
 /**
  * sys_perf_event_open - open a performance event, associate it to a task/cpu
  *
@@ -7284,7 +7437,7 @@ SYSCALL_DEFINE5(perf_event_open,
 	struct perf_event *group_leader = NULL, *output_event = NULL;
 	struct perf_event *event, *sibling;
 	struct perf_event_attr attr;
-	struct perf_event_context *ctx;
+	struct perf_event_context *ctx, *uninitialized_var(gctx);
 	struct file *event_file = NULL;
 	struct fd group = {NULL, 0};
 	struct task_struct *task = NULL;
@@ -7482,9 +7635,14 @@ SYSCALL_DEFINE5(perf_event_open,
 	}
 
 	if (move_group) {
-		struct perf_event_context *gctx = group_leader->ctx;
+		gctx = group_leader->ctx;
+
+		/*
+		 * See perf_event_ctx_lock() for comments on the details
+		 * of swizzling perf_event::ctx.
+		 */
+		mutex_lock_double(&gctx->mutex, &ctx->mutex);
 
-		mutex_lock(&gctx->mutex);
 		perf_remove_from_context(group_leader, false);
 
 		/*
@@ -7499,15 +7657,19 @@ SYSCALL_DEFINE5(perf_event_open,
 			perf_event__state_init(sibling);
 			put_ctx(gctx);
 		}
-		mutex_unlock(&gctx->mutex);
-		put_ctx(gctx);
+	} else {
+		mutex_lock(&ctx->mutex);
 	}
 
 	WARN_ON_ONCE(ctx->parent_ctx);
-	mutex_lock(&ctx->mutex);
 
 	if (move_group) {
+		/*
+		 * Wait for everybody to stop referencing the events through
+		 * the old lists, before installing it on new lists.
+		 */
 		synchronize_rcu();
+
 		perf_install_in_context(ctx, group_leader, group_leader->cpu);
 		get_ctx(ctx);
 		list_for_each_entry(sibling, &group_leader->sibling_list,
@@ -7519,6 +7681,11 @@ SYSCALL_DEFINE5(perf_event_open,
 
 	perf_install_in_context(ctx, event, event->cpu);
 	perf_unpin_context(ctx);
+
+	if (move_group) {
+		mutex_unlock(&gctx->mutex);
+		put_ctx(gctx);
+	}
 	mutex_unlock(&ctx->mutex);
 
 	put_online_cpus();
@@ -7626,7 +7793,11 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
 	src_ctx = &per_cpu_ptr(pmu->pmu_cpu_context, src_cpu)->ctx;
 	dst_ctx = &per_cpu_ptr(pmu->pmu_cpu_context, dst_cpu)->ctx;
 
-	mutex_lock(&src_ctx->mutex);
+	/*
+	 * See perf_event_ctx_lock() for comments on the details
+	 * of swizzling perf_event::ctx.
+	 */
+	mutex_lock_double(&src_ctx->mutex, &dst_ctx->mutex);
 	list_for_each_entry_safe(event, tmp, &src_ctx->event_list,
 				 event_entry) {
 		perf_remove_from_context(event, false);
@@ -7634,11 +7805,9 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
 		put_ctx(src_ctx);
 		list_add(&event->migrate_entry, &events);
 	}
-	mutex_unlock(&src_ctx->mutex);
 
 	synchronize_rcu();
 
-	mutex_lock(&dst_ctx->mutex);
 	list_for_each_entry_safe(event, tmp, &events, migrate_entry) {
 		list_del(&event->migrate_entry);
 		if (event->state >= PERF_EVENT_STATE_OFF)
@@ -7648,6 +7817,7 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
 		get_ctx(dst_ctx);
 	}
 	mutex_unlock(&dst_ctx->mutex);
+	mutex_unlock(&src_ctx->mutex);
 }
 EXPORT_SYMBOL_GPL(perf_pmu_migrate_context);
 


* [tip:perf/core] perf: Fix move_group() order
  2015-01-26 16:26   ` Peter Zijlstra
  2015-01-27 21:28     ` Vince Weaver
  2015-01-29 14:47     ` [RFC][PATCH 2/3] perf: Add a bit of paranoia Peter Zijlstra
@ 2015-02-04 14:39     ` tip-bot for Peter Zijlstra (Intel)
  2 siblings, 0 replies; 29+ messages in thread
From: tip-bot for Peter Zijlstra (Intel) @ 2015-02-04 14:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, tglx, hpa, jolsa, peterz, torvalds, mingo

Commit-ID:  8f95b435b62522aed3381aaea920de8d09ccabf3
Gitweb:     http://git.kernel.org/tip/8f95b435b62522aed3381aaea920de8d09ccabf3
Author:     Peter Zijlstra (Intel) <peterz@infradead.org>
AuthorDate: Tue, 27 Jan 2015 11:53:12 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 4 Feb 2015 08:07:11 +0100

perf: Fix move_group() order

Jiri reported triggering the new WARN_ON_ONCE in event_sched_out over
the weekend:

  event_sched_out.isra.79+0x2b9/0x2d0
  group_sched_out+0x69/0xc0
  ctx_sched_out+0x106/0x130
  task_ctx_sched_out+0x37/0x70
  __perf_install_in_context+0x70/0x1a0
  remote_function+0x48/0x60
  generic_exec_single+0x15b/0x1d0
  smp_call_function_single+0x67/0xa0
  task_function_call+0x53/0x80
  perf_install_in_context+0x8b/0x110

I think the below should cure this; if we install a group leader it
will iterate the (still intact) group list and find its siblings and
try and install those too -- even though those still have the old
event->ctx -- in the new ctx.

Upon installing the first group sibling we'd try and schedule out the
group and trigger the above warn.

Fix this by installing the group leader last; installing the siblings
first has no effect, since they're not reachable through the group
lists and therefore we don't schedule them.

Also delay resetting the state until we're absolutely sure the events
are quiescent.

Reported-by: Jiri Olsa <jolsa@redhat.com>
Reported-by: vincent.weaver@maine.edu
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150126162639.GA21418@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 56 +++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 47 insertions(+), 9 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 417a96b..142dbabc 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7645,16 +7645,9 @@ SYSCALL_DEFINE5(perf_event_open,
 
 		perf_remove_from_context(group_leader, false);
 
-		/*
-		 * Removing from the context ends up with disabled
-		 * event. What we want here is event in the initial
-		 * startup state, ready to be add into new context.
-		 */
-		perf_event__state_init(group_leader);
 		list_for_each_entry(sibling, &group_leader->sibling_list,
 				    group_entry) {
 			perf_remove_from_context(sibling, false);
-			perf_event__state_init(sibling);
 			put_ctx(gctx);
 		}
 	} else {
@@ -7670,13 +7663,31 @@ SYSCALL_DEFINE5(perf_event_open,
 		 */
 		synchronize_rcu();
 
-		perf_install_in_context(ctx, group_leader, group_leader->cpu);
-		get_ctx(ctx);
+		/*
+		 * Install the group siblings before the group leader.
+		 *
+		 * Because a group leader will try and install the entire group
+		 * (through the sibling list, which is still in-tact), we can
+		 * end up with siblings installed in the wrong context.
+		 *
+		 * By installing siblings first we NO-OP because they're not
+		 * reachable through the group lists.
+		 */
 		list_for_each_entry(sibling, &group_leader->sibling_list,
 				    group_entry) {
+			perf_event__state_init(sibling);
 			perf_install_in_context(ctx, sibling, sibling->cpu);
 			get_ctx(ctx);
 		}
+
+		/*
+		 * Removing from the context ends up with disabled
+		 * event. What we want here is event in the initial
+		 * startup state, ready to be add into new context.
+		 */
+		perf_event__state_init(group_leader);
+		perf_install_in_context(ctx, group_leader, group_leader->cpu);
+		get_ctx(ctx);
 	}
 
 	perf_install_in_context(ctx, event, event->cpu);
@@ -7806,8 +7817,35 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
 		list_add(&event->migrate_entry, &events);
 	}
 
+	/*
+	 * Wait for the events to quiesce before re-instating them.
+	 */
 	synchronize_rcu();
 
+	/*
+	 * Re-instate events in 2 passes.
+	 *
+	 * Skip over group leaders and only install siblings on this first
+	 * pass, siblings will not get enabled without a leader, however a
+	 * leader will enable its siblings, even if those are still on the old
+	 * context.
+	 */
+	list_for_each_entry_safe(event, tmp, &events, migrate_entry) {
+		if (event->group_leader == event)
+			continue;
+
+		list_del(&event->migrate_entry);
+		if (event->state >= PERF_EVENT_STATE_OFF)
+			event->state = PERF_EVENT_STATE_INACTIVE;
+		account_event_cpu(event, dst_cpu);
+		perf_install_in_context(dst_ctx, event, dst_cpu);
+		get_ctx(dst_ctx);
+	}
+
+	/*
+	 * Once all the siblings are setup properly, install the group leaders
+	 * to make it go.
+	 */
 	list_for_each_entry_safe(event, tmp, &events, migrate_entry) {
 		list_del(&event->migrate_entry);
 		if (event->state >= PERF_EVENT_STATE_OFF)


* [tip:perf/core] perf: Fix put_event() ctx lock
  2015-01-29 13:44           ` Peter Zijlstra
@ 2015-02-04 14:39             ` tip-bot for Peter Zijlstra
  0 siblings, 0 replies; 29+ messages in thread
From: tip-bot for Peter Zijlstra @ 2015-02-04 14:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, peterz, jolsa, mingo, torvalds, eranian, hpa, tglx,
	linux-kernel, vincent.weaver

Commit-ID:  a83fe28e2e45392464858a96745db26ac73670c8
Gitweb:     http://git.kernel.org/tip/a83fe28e2e45392464858a96745db26ac73670c8
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Thu, 29 Jan 2015 14:44:34 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 4 Feb 2015 08:07:12 +0100

perf: Fix put_event() ctx lock

So what I suspect (but I'm in zombie mode today, it seems) is that
while I initially thought it was impossible for ctx to change when the
refcount dropped to 0, I now suspect it's possible.

Note that until perf_remove_from_context() the event is still active and
visible on the lists. So a concurrent sys_perf_event_open() from another
task into this task can race.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: mark.rutland@arm.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150129134434.GB26304@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 142dbabc..f773fa1 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -947,7 +947,8 @@ static void put_ctx(struct perf_event_context *ctx)
  *	    perf_event::mmap_mutex
  *	    mmap_sem
  */
-static struct perf_event_context *perf_event_ctx_lock(struct perf_event *event)
+static struct perf_event_context *
+perf_event_ctx_lock_nested(struct perf_event *event, int nesting)
 {
 	struct perf_event_context *ctx;
 
@@ -960,7 +961,7 @@ again:
 	}
 	rcu_read_unlock();
 
-	mutex_lock(&ctx->mutex);
+	mutex_lock_nested(&ctx->mutex, nesting);
 	if (event->ctx != ctx) {
 		mutex_unlock(&ctx->mutex);
 		put_ctx(ctx);
@@ -970,6 +971,12 @@ again:
 	return ctx;
 }
 
+static inline struct perf_event_context *
+perf_event_ctx_lock(struct perf_event *event)
+{
+	return perf_event_ctx_lock_nested(event, 0);
+}
+
 static void perf_event_ctx_unlock(struct perf_event *event,
 				  struct perf_event_context *ctx)
 {
@@ -3572,7 +3579,7 @@ static void perf_remove_from_owner(struct perf_event *event)
  */
 static void put_event(struct perf_event *event)
 {
-	struct perf_event_context *ctx = event->ctx;
+	struct perf_event_context *ctx;
 
 	if (!atomic_long_dec_and_test(&event->refcount))
 		return;
@@ -3580,7 +3587,6 @@ static void put_event(struct perf_event *event)
 	if (!is_kernel_event(event))
 		perf_remove_from_owner(event);
 
-	WARN_ON_ONCE(ctx->parent_ctx);
 	/*
 	 * There are two ways this annotation is useful:
 	 *
@@ -3593,7 +3599,8 @@ static void put_event(struct perf_event *event)
 	 *     the last filedesc died, so there is no possibility
 	 *     to trigger the AB-BA case.
 	 */
-	mutex_lock_nested(&ctx->mutex, SINGLE_DEPTH_NESTING);
+	ctx = perf_event_ctx_lock_nested(event, SINGLE_DEPTH_NESTING);
+	WARN_ON_ONCE(ctx->parent_ctx);
 	perf_remove_from_context(event, true);
 	mutex_unlock(&ctx->mutex);
 

* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-02-02 17:32           ` Peter Zijlstra
@ 2015-02-04 14:51             ` Jiri Olsa
  2015-02-04 16:33               ` Peter Zijlstra
  0 siblings, 1 reply; 29+ messages in thread
From: Jiri Olsa @ 2015-02-04 14:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, mingo, linux-kernel, eranian, mark.rutland, torvalds, tglx

On Mon, Feb 02, 2015 at 06:32:32PM +0100, Peter Zijlstra wrote:
> On Mon, Feb 02, 2015 at 04:42:40PM +0100, Peter Zijlstra wrote:
> > On Mon, Feb 02, 2015 at 01:33:14AM -0500, Vince Weaver wrote:
> > > [407484.309136] ------------[ cut here ]------------
> 
> > > [407484.588602]  <<EOE>>  <IRQ>  [<ffffffff8115c28c>] perf_prepare_sample+0x2ec/0x3c0
> > > [407484.597358]  [<ffffffff8115c46e>] __perf_event_overflow+0x10e/0x270
> > > [407484.604708]  [<ffffffff8115c439>] ? __perf_event_overflow+0xd9/0x270
> > > [407484.612215]  [<ffffffff8115c924>] ? perf_tp_event+0xc4/0x210
> > > [407484.619000]  [<ffffffff8115cfe2>] ? __perf_sw_event+0x72/0x1f0
> > > [407484.625937]  [<ffffffff8115c799>] ? perf_swevent_overflow+0xa9/0xe0
> > > [407484.633287]  [<ffffffff8115c799>] perf_swevent_overflow+0xa9/0xe0
> > > [407484.640467]  [<ffffffff8115c837>] perf_swevent_event+0x67/0x90
> > > [407484.647343]  [<ffffffff8115c924>] perf_tp_event+0xc4/0x210
> > > [407484.653923]  [<ffffffff810b6fa9>] ? lock_acquire+0x119/0x130
> > > [407484.660606]  [<ffffffff810b3cf6>] ? perf_trace_lock_acquire+0x146/0x180
> > > [407484.668332]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
> > > [407484.675962]  [<ffffffff810b3cf6>] perf_trace_lock_acquire+0x146/0x180
> > > [407484.683490]  [<ffffffff810b6fa9>] ? lock_acquire+0x119/0x130
> > > [407484.690211]  [<ffffffff810b6fa9>] lock_acquire+0x119/0x130
> > > [407484.696750]  [<ffffffff8115b7f5>] ? perf_event_wakeup+0x5/0xf0
> > > [407484.703640]  [<ffffffff811f50ef>] ? kill_fasync+0xf/0xf0
> > > [407484.710008]  [<ffffffff8115b828>] perf_event_wakeup+0x38/0xf0
> > > [407484.716798]  [<ffffffff8115b7f5>] ? perf_event_wakeup+0x5/0xf0
> > > [407484.723696]  [<ffffffff8115b913>] perf_pending_event+0x33/0x60
> > > [407484.730570]  [<ffffffff8114cc7c>] irq_work_run_list+0x4c/0x80
> > > [407484.737392]  [<ffffffff8114ccc8>] irq_work_run+0x18/0x40
> > > [407484.743765]  [<ffffffff8101955f>] smp_trace_irq_work_interrupt+0x3f/0xc0
> > > [407484.751579]  [<ffffffff816c01fd>] trace_irq_work_interrupt+0x6d/0x80
> 
> > > [407484.799195] ---[ end trace 55752a03ec8ab979 ]---
> > 
> > That looks like tail recursive fun! An irq work that raises an irq work
> > ad infinitum. Lemme see if I can squash that.. didn't we have something
> > like this before... /me goes look.
> 
> 
> Does this make it go away?
> 
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4413,6 +4413,8 @@ static void perf_pending_event(struct ir
>  	struct perf_event *event = container_of(entry,
>  			struct perf_event, pending);
>  
> +	int rctx = perf_swevent_get_recursion_context();
> +

hum, you should check the rctx

	if (rctx == -1)
		return;

also this recursion is bound to swevent_htable, should we rather add
separate ctx data for irq_work to limit the clashing with SW events?

jirka

>  	if (event->pending_disable) {
>  		event->pending_disable = 0;
>  		__perf_event_disable(event);
> @@ -4422,6 +4424,8 @@ static void perf_pending_event(struct ir
>  		event->pending_wakeup = 0;
>  		perf_event_wakeup(event);
>  	}
> +
> +	perf_swevent_put_recursion_context(rctx);
>  }
>  
>  /*

* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-02-04 14:51             ` Jiri Olsa
@ 2015-02-04 16:33               ` Peter Zijlstra
  2015-02-04 16:43                 ` Peter Zijlstra
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2015-02-04 16:33 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Vince Weaver, mingo, linux-kernel, eranian, mark.rutland, torvalds, tglx

On Wed, Feb 04, 2015 at 03:51:36PM +0100, Jiri Olsa wrote:
> On Mon, Feb 02, 2015 at 06:32:32PM +0100, Peter Zijlstra wrote:
> > > That looks like tail recursive fun! An irq work that raises an irq work
> > > ad infinitum. Lemme see if I can squash that.. didn't we have something
> > > like this before... /me goes look.
> > 
> > 
> > Does this make it go away?
> > 
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -4413,6 +4413,8 @@ static void perf_pending_event(struct ir
> >  	struct perf_event *event = container_of(entry,
> >  			struct perf_event, pending);
> >  
> > +	int rctx = perf_swevent_get_recursion_context();
> > +
> 
> hum, you should check the rctx
> 
> 	if (rctx == -1)
> 		return;

D'uh, yes.

> also this recursion is bound to swevent_htable, should we rather add
> separate ctx data for irq_work to limit the clashing with SW events?

No, we explicitly want to disable software events while handling the
irq_work. The problem as reported looks like irq_work triggering a
swevent (tp actually, but that's classed the same) generates a new
irq_work, and we get stuck in an endless cycle of that.

So by effectively disabling swevents while processing the irq_work we
should break the cycle.

* Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
  2015-02-04 16:33               ` Peter Zijlstra
@ 2015-02-04 16:43                 ` Peter Zijlstra
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2015-02-04 16:43 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Vince Weaver, mingo, linux-kernel, eranian, mark.rutland, torvalds, tglx

On Wed, Feb 04, 2015 at 05:33:36PM +0100, Peter Zijlstra wrote:
> On Wed, Feb 04, 2015 at 03:51:36PM +0100, Jiri Olsa wrote:
> > On Mon, Feb 02, 2015 at 06:32:32PM +0100, Peter Zijlstra wrote:
> > > > That looks like tail recursive fun! An irq work that raises an irq work
> > > > ad infinitum. Lemme see if I can squash that.. didn't we have something
> > > > like this before... /me goes look.
> > > 
> > > 
> > > Does this make it go away?
> > > 
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -4413,6 +4413,8 @@ static void perf_pending_event(struct ir
> > >  	struct perf_event *event = container_of(entry,
> > >  			struct perf_event, pending);
> > >  
> > > +	int rctx = perf_swevent_get_recursion_context();
> > > +
> > 
> > hum, you should check the rctx
> > 
> > 	if (rctx == -1)
> > 		return;
> 
> D'uh, yes.

Hmm, that's not actually so: we still need to process the irq_work, we
just want no further nested swevents.

We cannot simply skip the irq_work; so I think what we want is a
conditional put: if we fail the get, the recursion counter is already
raised and nested events won't happen anyway.

Now in practice it's very unlikely to ever happen; you need nested IRQs
for this and most of those have been killed; some legacy drivers might
still use them, but I forgot.

end of thread, other threads:[~2015-02-04 16:43 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
2015-01-23 12:51 [RFC][PATCH 0/3] perf fixes Peter Zijlstra
2015-01-23 12:52 ` [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition Peter Zijlstra
2015-01-23 15:02   ` Mark Rutland
2015-01-23 15:07     ` Peter Zijlstra
2015-01-23 15:22       ` Mark Rutland
2015-02-04 12:59         ` Peter Zijlstra
2015-01-28 14:30   ` [tip:perf/urgent] " tip-bot for Peter Zijlstra
2015-01-23 12:52 ` [RFC][PATCH 2/3] perf: Add a bit of paranoia Peter Zijlstra
2015-01-26 16:26   ` Peter Zijlstra
2015-01-27 21:28     ` Vince Weaver
2015-01-29  2:16       ` Vince Weaver
2015-01-29  7:51         ` Jiri Olsa
2015-01-29 13:44           ` Peter Zijlstra
2015-02-04 14:39             ` [tip:perf/core] perf: Fix put_event() ctx lock tip-bot for Peter Zijlstra
2015-01-29 14:47     ` [RFC][PATCH 2/3] perf: Add a bit of paranoia Peter Zijlstra
2015-02-02  6:33       ` Vince Weaver
2015-02-02 15:42         ` Peter Zijlstra
2015-02-02 17:32           ` Peter Zijlstra
2015-02-04 14:51             ` Jiri Olsa
2015-02-04 16:33               ` Peter Zijlstra
2015-02-04 16:43                 ` Peter Zijlstra
2015-02-04 14:39     ` [tip:perf/core] perf: Fix move_group() order tip-bot for Peter Zijlstra (Intel)
2015-02-04 14:39   ` [tip:perf/core] perf: Add a bit of paranoia tip-bot for Peter Zijlstra
2015-01-23 12:52 ` [RFC][PATCH 3/3] perf: Fix event->ctx locking Peter Zijlstra
2015-02-04 14:39   ` [tip:perf/core] " tip-bot for Peter Zijlstra
2015-01-26 14:30 ` [RFC][PATCH 0/3] perf fixes Vince Weaver
2015-01-26 18:03   ` Vince Weaver
2015-01-26 18:34     ` Jiri Olsa
2015-01-26 18:52       ` Vince Weaver
