linux-kernel.vger.kernel.org archive mirror
* Re: [tip:perf/urgent] perf: Fix perf_event_exit_task() race
       [not found] <tip-63b6da39bb38e8f1a1ef3180d32a39d6baf9da84@git.kernel.org>
@ 2016-01-25 13:08 ` Peter Zijlstra
  2016-01-25 13:09 ` [PATCH] perf: Fix race in perf_event_exit_task_context Peter Zijlstra
  1 sibling, 0 replies; 3+ messages in thread
From: Peter Zijlstra @ 2016-01-25 13:08 UTC
  To: linux-kernel, mingo, torvalds, eranian, tglx, hpa, acme, dsahern,
	namhyung, vincent.weaver, jolsa
  Cc: alexander.shishkin


---

Subject: perf: Fix orphan hole
From: Peter Zijlstra <peterz@infradead.org>
Date: Fri Jan 22 22:13:41 CET 2016

We should set event->owner before we install the event; otherwise
there is a hole where the target task can fork() and the inherit code
will not inherit the event, because it thinks the event is orphaned.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8489,6 +8489,8 @@ SYSCALL_DEFINE5(perf_event_open,
 	perf_event__header_size(event);
 	perf_event__id_header_size(event);
 
+	event->owner = current;
+
 	perf_install_in_context(ctx, event, event->cpu);
 	perf_unpin_context(ctx);
 
@@ -8498,8 +8500,6 @@ SYSCALL_DEFINE5(perf_event_open,
 
 	put_online_cpus();
 
-	event->owner = current;
-
 	mutex_lock(&current->perf_event_mutex);
 	list_add_tail(&event->owner_entry, &current->perf_event_list);
 	mutex_unlock(&current->perf_event_mutex);
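
To illustrate the window described in the changelog, here is a minimal
user-space sketch. It is not kernel code: struct task, struct event,
fork_inherits() and open_then_fork() are hypothetical stand-ins, and the
sketch simply assumes that the fork-time inherit path skips any event
whose owner is still NULL. The racing fork() is modelled as a call made
directly after the install step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; illustration only. */
struct task { int id; };

struct event {
	const struct task *owner;	/* NULL is treated as "orphaned" */
	bool installed;			/* visible to the target task */
};

static bool is_orphaned(const struct event *e)
{
	return e->owner == NULL;
}

/* Assumed inherit rule: only installed, non-orphaned events are copied. */
static bool fork_inherits(const struct event *e)
{
	return e->installed && !is_orphaned(e);
}

/*
 * Model the open path with a fork() in the target task landing right
 * after the install step. Returns whether the child inherited the event.
 */
static bool open_then_fork(bool owner_before_install)
{
	struct task opener = { 1 };
	struct event e = { NULL, false };
	bool inherited;

	if (owner_before_install)
		e.owner = &opener;	/* ordering after the patch */

	e.installed = true;		/* stands in for perf_install_in_context() */
	inherited = fork_inherits(&e);	/* the racing fork() */

	if (!owner_before_install)
		e.owner = &opener;	/* ordering before the patch: too late */

	assert(e.owner == &opener);	/* both orderings end up owned */
	return inherited;
}
```

With the old ordering, open_then_fork(false) reports that the child
missed the event; with the owner set first, open_then_fork(true)
reports that it was inherited.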


* [PATCH] perf: Fix race in perf_event_exit_task_context
       [not found] <tip-63b6da39bb38e8f1a1ef3180d32a39d6baf9da84@git.kernel.org>
  2016-01-25 13:08 ` [tip:perf/urgent] perf: Fix perf_event_exit_task() race Peter Zijlstra
@ 2016-01-25 13:09 ` Peter Zijlstra
  2016-01-28 19:10   ` [tip:perf/urgent] perf: Fix race in perf_event_exit_task_context() tip-bot for Peter Zijlstra
  1 sibling, 1 reply; 3+ messages in thread
From: Peter Zijlstra @ 2016-01-25 13:09 UTC
  To: linux-kernel, mingo, torvalds, eranian, tglx, hpa, acme, dsahern,
	namhyung, vincent.weaver, jolsa
  Cc: alexander.shishkin


Subject: perf: Fix race in perf_event_exit_task_context
From: Peter Zijlstra <peterz@infradead.org>
Date: Mon Jan 25 13:03:18 CET 2016

There is a race between perf_event_exit_task_context() and
orphans_remove_work() which results in a use-after-free.

We mark ctx->task with TASK_TOMBSTONE, under ctx->lock, to indicate
that a context is 'dead'. After that point, event_function_call() on
any event of that context will NOP.

A concurrent orphans_remove_work() will only hold ctx->mutex for the
list iteration and not serialize against this. Therefore it's possible
that orphans_remove_work()'s perf_remove_from_context() call will
fail, but we'll continue to free the event, with the result that
freed memory is still linked on the event lists.

Once perf_event_exit_task_context() gets around to acquiring
ctx->mutex, it too will iterate the event list, encounter the already
freed event and proceed to free it _again_. This fails with the WARN
in free_event().

Plug the race by having perf_event_exit_task_context() hold ctx::mutex
over the whole tear-down, thereby 'naturally' serializing against all
other sites, including the orphan work.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c |   50 +++++++++++++++++++++++++++++---------------------
 1 file changed, 29 insertions(+), 21 deletions(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8748,14 +8748,40 @@ static void perf_event_exit_task_context
 {
 	struct perf_event_context *child_ctx, *clone_ctx = NULL;
 	struct perf_event *child_event, *next;
-	unsigned long flags;
 
 	WARN_ON_ONCE(child != current);
 
-	child_ctx = perf_lock_task_context(child, ctxn, &flags);
+	child_ctx = perf_pin_task_context(child, ctxn);
 	if (!child_ctx)
 		return;
 
+	/*
+	 * In order to reduce the amount of tricky in ctx tear-down, we hold
+	 * ctx::mutex over the entire thing. This serializes against almost
+	 * everything that wants to access the ctx.
+	 *
+	 * The exception is sys_perf_event_open() /
+	 * perf_event_create_kernel_count() which does find_get_context()
+	 * without ctx::mutex (it cannot because of the move_group double mutex
+	 * lock thing). See the comments in perf_install_in_context().
+	 *
+	 * We can recurse on the same lock type through:
+	 *
+	 *   __perf_event_exit_task()
+	 *     sync_child_event()
+	 *       put_event()
+	 *         mutex_lock(&ctx->mutex)
+	 *
+	 * But since its the parent context it won't be the same instance.
+	 */
+	mutex_lock(&child_ctx->mutex);
+
+	/*
+	 * In a single ctx::lock section, de-schedule the events and detach the
+	 * context from the task such that we cannot ever get it scheduled back
+	 * in.
+	 */
+	raw_spin_lock_irq(&child_ctx->lock);
 	task_ctx_sched_out(__get_cpu_context(child_ctx), child_ctx);
 
 	/*
@@ -8767,14 +8793,8 @@ static void perf_event_exit_task_context
 	WRITE_ONCE(child_ctx->task, TASK_TOMBSTONE);
 	put_task_struct(current); /* cannot be last */
 
-	/*
-	 * If this context is a clone; unclone it so it can't get
-	 * swapped to another process while we're removing all
-	 * the events from it.
-	 */
 	clone_ctx = unclone_ctx(child_ctx);
-	update_context_time(child_ctx);
-	raw_spin_unlock_irqrestore(&child_ctx->lock, flags);
+	raw_spin_unlock_irq(&child_ctx->lock);
 
 	if (clone_ctx)
 		put_ctx(clone_ctx);
@@ -8786,18 +8806,6 @@ static void perf_event_exit_task_context
 	 */
 	perf_event_task(child, child_ctx, 0);
 
-	/*
-	 * We can recurse on the same lock type through:
-	 *
-	 *   __perf_event_exit_task()
-	 *     sync_child_event()
-	 *       put_event()
-	 *         mutex_lock(&ctx->mutex)
-	 *
-	 * But since its the parent context it won't be the same instance.
-	 */
-	mutex_lock(&child_ctx->mutex);
-
 	list_for_each_entry_safe(child_event, next, &child_ctx->event_list, event_entry)
 		__perf_event_exit_task(child_event, child_ctx, child);
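
The interleaving from the changelog can be replayed deterministically in
user space. The sketch below is not the kernel implementation: struct
ctx, orphan_work(), exit_mark_dead(), exit_free_events() and run() are
hypothetical names, a single event stands in for the whole event list,
and holding ctx::mutex over the tear-down is modelled simply by only
letting the orphan work run entirely before or entirely after it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-event model of a perf context; illustration only. */
struct ctx {
	bool dead;	/* stands in for ctx->task == TASK_TOMBSTONE */
	bool on_list;	/* the single event is still on the event list */
	int frees;	/* how many times the event has been freed */
};

/* Like event_function_call(): a NOP once the context is dead. */
static void remove_from_context(struct ctx *c)
{
	if (!c->dead)
		c->on_list = false;
}

/* Orphan work: try to remove, then free whether or not removal worked. */
static void orphan_work(struct ctx *c)
{
	if (!c->on_list)
		return;
	remove_from_context(c);
	c->frees++;
}

/* First half of the exit path: mark the context dead. */
static void exit_mark_dead(struct ctx *c)
{
	c->dead = true;
}

/* Second half of the exit path: walk the list and free what is on it. */
static void exit_free_events(struct ctx *c)
{
	if (c->on_list) {
		c->on_list = false;
		c->frees++;
	}
}

/* Replay one schedule and return how often the event was freed. */
static int run(bool mutex_over_teardown, bool work_first)
{
	struct ctx c = { false, true, 0 };

	if (mutex_over_teardown) {
		/* Tear-down is atomic with respect to the orphan work. */
		if (work_first)
			orphan_work(&c);
		exit_mark_dead(&c);
		exit_free_events(&c);
		if (!work_first)
			orphan_work(&c);
	} else {
		/* Old behaviour: the work slips in between the two halves. */
		exit_mark_dead(&c);
		orphan_work(&c);
		exit_free_events(&c);
	}

	assert(c.frees >= 1);	/* the event is always freed at least once */
	return c.frees;
}
```

In this model the unserialized schedule frees the event twice, while
both serialized orderings free it exactly once.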
 


* [tip:perf/urgent] perf: Fix race in perf_event_exit_task_context()
  2016-01-25 13:09 ` [PATCH] perf: Fix race in perf_event_exit_task_context Peter Zijlstra
@ 2016-01-28 19:10   ` tip-bot for Peter Zijlstra
  0 siblings, 0 replies; 3+ messages in thread
From: tip-bot for Peter Zijlstra @ 2016-01-28 19:10 UTC
  To: linux-tip-commits
  Cc: linux-kernel, jolsa, vincent.weaver, peterz, eranian, acme, tglx,
	hpa, torvalds, mingo

Commit-ID:  6a3351b612b72c558910c88a43e2ef6d7d68bc97
Gitweb:     http://git.kernel.org/tip/6a3351b612b72c558910c88a43e2ef6d7d68bc97
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Mon, 25 Jan 2016 14:09:54 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 28 Jan 2016 20:06:36 +0100

perf: Fix race in perf_event_exit_task_context()

There is a race between perf_event_exit_task_context() and
orphans_remove_work() which results in a use-after-free.

We mark ctx->task with TASK_TOMBSTONE, under ctx->lock, to
indicate that a context is 'dead'. After that point,
event_function_call() on any event of that context will NOP.

A concurrent orphans_remove_work() will only hold ctx->mutex for
the list iteration and not serialize against this. Therefore it's
possible that orphans_remove_work()'s perf_remove_from_context()
call will fail, but we'll continue to free the event, with the
result that freed memory is still linked on the event lists.

Once perf_event_exit_task_context() gets around to acquiring
ctx->mutex, it too will iterate the event list, encounter the
already freed event and proceed to free it _again_. This fails
with the WARN in free_event().

Plug the race by having perf_event_exit_task_context() hold
ctx::mutex over the whole tear-down, thereby 'naturally'
serializing against all other sites, including the orphan work.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: alexander.shishkin@linux.intel.com
Cc: dsahern@gmail.com
Cc: namhyung@kernel.org
Link: http://lkml.kernel.org/r/20160125130954.GY6357@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 50 +++++++++++++++++++++++++++++---------------------
 1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6759f2a..1d243fa 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8748,14 +8748,40 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 {
 	struct perf_event_context *child_ctx, *clone_ctx = NULL;
 	struct perf_event *child_event, *next;
-	unsigned long flags;
 
 	WARN_ON_ONCE(child != current);
 
-	child_ctx = perf_lock_task_context(child, ctxn, &flags);
+	child_ctx = perf_pin_task_context(child, ctxn);
 	if (!child_ctx)
 		return;
 
+	/*
+	 * In order to reduce the amount of tricky in ctx tear-down, we hold
+	 * ctx::mutex over the entire thing. This serializes against almost
+	 * everything that wants to access the ctx.
+	 *
+	 * The exception is sys_perf_event_open() /
+	 * perf_event_create_kernel_count() which does find_get_context()
+	 * without ctx::mutex (it cannot because of the move_group double mutex
+	 * lock thing). See the comments in perf_install_in_context().
+	 *
+	 * We can recurse on the same lock type through:
+	 *
+	 *   __perf_event_exit_task()
+	 *     sync_child_event()
+	 *       put_event()
+	 *         mutex_lock(&ctx->mutex)
+	 *
+	 * But since its the parent context it won't be the same instance.
+	 */
+	mutex_lock(&child_ctx->mutex);
+
+	/*
+	 * In a single ctx::lock section, de-schedule the events and detach the
+	 * context from the task such that we cannot ever get it scheduled back
+	 * in.
+	 */
+	raw_spin_lock_irq(&child_ctx->lock);
 	task_ctx_sched_out(__get_cpu_context(child_ctx), child_ctx);
 
 	/*
@@ -8767,14 +8793,8 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 	WRITE_ONCE(child_ctx->task, TASK_TOMBSTONE);
 	put_task_struct(current); /* cannot be last */
 
-	/*
-	 * If this context is a clone; unclone it so it can't get
-	 * swapped to another process while we're removing all
-	 * the events from it.
-	 */
 	clone_ctx = unclone_ctx(child_ctx);
-	update_context_time(child_ctx);
-	raw_spin_unlock_irqrestore(&child_ctx->lock, flags);
+	raw_spin_unlock_irq(&child_ctx->lock);
 
 	if (clone_ctx)
 		put_ctx(clone_ctx);
@@ -8786,18 +8806,6 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 	 */
 	perf_event_task(child, child_ctx, 0);
 
-	/*
-	 * We can recurse on the same lock type through:
-	 *
-	 *   __perf_event_exit_task()
-	 *     sync_child_event()
-	 *       put_event()
-	 *         mutex_lock(&ctx->mutex)
-	 *
-	 * But since its the parent context it won't be the same instance.
-	 */
-	mutex_lock(&child_ctx->mutex);
-
 	list_for_each_entry_safe(child_event, next, &child_ctx->event_list, event_entry)
 		__perf_event_exit_task(child_event, child_ctx, child);
 

