* Re: [tip:perf/urgent] perf: Fix perf_event_exit_task() race
       [not found] <tip-63b6da39bb38e8f1a1ef3180d32a39d6baf9da84@git.kernel.org>
@ 2016-01-25 13:08 ` Peter Zijlstra
  2016-01-25 13:09 ` [PATCH] perf: Fix race in perf_event_exit_task_context Peter Zijlstra
  1 sibling, 0 replies; 3+ messages in thread
From: Peter Zijlstra @ 2016-01-25 13:08 UTC
  To: linux-kernel, mingo, torvalds, eranian, tglx, hpa, acme, dsahern, namhyung, vincent.weaver, jolsa
  Cc: alexander.shishkin

---
Subject: perf: Fix orphan hole
From: Peter Zijlstra <peterz@infradead.org>
Date: Fri Jan 22 22:13:41 CET 2016

We should set event->owner before we install the event, otherwise
there is a hole where the target task can fork() and we'll not inherit
the event because it thinks the event is orphaned.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8489,6 +8489,8 @@ SYSCALL_DEFINE5(perf_event_open,
 	perf_event__header_size(event);
 	perf_event__id_header_size(event);

+	event->owner = current;
+
 	perf_install_in_context(ctx, event, event->cpu);
 	perf_unpin_context(ctx);

@@ -8498,8 +8500,6 @@ SYSCALL_DEFINE5(perf_event_open,

 	put_online_cpus();

-	event->owner = current;
-
 	mutex_lock(&current->perf_event_mutex);
 	list_add_tail(&event->owner_entry, &current->perf_event_list);
 	mutex_unlock(&current->perf_event_mutex);

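The ordering problem described in the "Fix orphan hole" patch can be modelled in user space. The following is only a single-threaded sketch of the interleaving, not kernel code: `struct toy_event`, `toy_fork_inherits()` and the two `toy_*_order_inherits()` helpers are hypothetical names invented for illustration, and the fork is simulated by a plain function call placed where the hole would be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a perf event; not the kernel's struct. */
struct toy_event {
	const void *owner;	/* NULL is read as "orphaned" by the fork path */
	bool installed;		/* event is visible in the target's context */
};

/* Token standing in for the opening task ("current"). */
static const int toy_opener;

/* Model of the fork-time decision: orphaned events are not inherited. */
bool toy_fork_inherits(const struct toy_event *e)
{
	assert(e != NULL);
	return e->installed && e->owner != NULL;
}

/* Old ordering: install first, set the owner later. */
bool toy_old_order_inherits(void)
{
	struct toy_event e = { NULL, false };
	bool inherited;

	e.installed = true;			/* event becomes visible */
	inherited = toy_fork_inherits(&e);	/* target forks in the hole */
	e.owner = &toy_opener;			/* too late for that fork */
	return inherited;
}

/* New ordering: set the owner before the event becomes visible. */
bool toy_new_order_inherits(void)
{
	struct toy_event e = { NULL, false };

	e.owner = &toy_opener;
	e.installed = true;
	return toy_fork_inherits(&e);		/* fork sees a valid owner */
}
```

In this model the old ordering loses the event for a fork that lands between installation and the owner assignment, while the new ordering leaves no such window.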
* [PATCH] perf: Fix race in perf_event_exit_task_context
       [not found] <tip-63b6da39bb38e8f1a1ef3180d32a39d6baf9da84@git.kernel.org>
  2016-01-25 13:08 ` [tip:perf/urgent] perf: Fix perf_event_exit_task() race Peter Zijlstra
@ 2016-01-25 13:09 ` Peter Zijlstra
  2016-01-28 19:10   ` [tip:perf/urgent] perf: Fix race in perf_event_exit_task_context( ) tip-bot for Peter Zijlstra
  1 sibling, 1 reply; 3+ messages in thread
From: Peter Zijlstra @ 2016-01-25 13:09 UTC
  To: linux-kernel, mingo, torvalds, eranian, tglx, hpa, acme, dsahern, namhyung, vincent.weaver, jolsa
  Cc: alexander.shishkin

Subject: perf: Fix race in perf_event_exit_task_context
From: Peter Zijlstra <peterz@infradead.org>
Date: Mon Jan 25 13:03:18 CET 2016

There is a race between perf_event_exit_task_context() and
orphans_remove_work() which results in a use-after-free.

We mark ctx->task with TASK_TOMBSTONE to indicate a context is 'dead',
under ctx->lock. After which point event_function_call() on any event
of that context will NOP.

A concurrent orphans_remove_work() will only hold ctx->mutex for the
list iteration and not serialize against this. Therefore it's possible
that orphans_remove_work()'s perf_remove_from_context() call will
fail, but we'll continue to free the event, with the result of free'd
memory still being on lists and everything.

Once perf_event_exit_task_context() gets around to acquiring
ctx->mutex it too will iterate the event list, encounter the already
free'd event and proceed to free it _again_. This fails with the WARN
in free_event().

Plug the race by having perf_event_exit_task_context() hold ctx::mutex
over the whole tear-down, thereby 'naturally' serializing against all
other sites, including the orphan work.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/events/core.c | 50 +++++++++++++++++++++++++++++---------------------
 1 file changed, 29 insertions(+), 21 deletions(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8748,14 +8748,40 @@ static void perf_event_exit_task_context
 {
 	struct perf_event_context *child_ctx, *clone_ctx = NULL;
 	struct perf_event *child_event, *next;
-	unsigned long flags;

 	WARN_ON_ONCE(child != current);

-	child_ctx = perf_lock_task_context(child, ctxn, &flags);
+	child_ctx = perf_pin_task_context(child, ctxn);
 	if (!child_ctx)
 		return;

+	/*
+	 * In order to reduce the amount of tricky in ctx tear-down, we hold
+	 * ctx::mutex over the entire thing. This serializes against almost
+	 * everything that wants to access the ctx.
+	 *
+	 * The exception is sys_perf_event_open() /
+	 * perf_event_create_kernel_count() which does find_get_context()
+	 * without ctx::mutex (it cannot because of the move_group double mutex
+	 * lock thing). See the comments in perf_install_in_context().
+	 *
+	 * We can recurse on the same lock type through:
+	 *
+	 *   __perf_event_exit_task()
+	 *     sync_child_event()
+	 *       put_event()
+	 *         mutex_lock(&ctx->mutex)
+	 *
+	 * But since its the parent context it won't be the same instance.
+	 */
+	mutex_lock(&child_ctx->mutex);
+
+	/*
+	 * In a single ctx::lock section, de-schedule the events and detach the
+	 * context from the task such that we cannot ever get it scheduled back
+	 * in.
+	 */
+	raw_spin_lock_irq(&child_ctx->lock);
 	task_ctx_sched_out(__get_cpu_context(child_ctx), child_ctx);

 	/*
@@ -8767,14 +8793,8 @@ static void perf_event_exit_task_context
 	WRITE_ONCE(child_ctx->task, TASK_TOMBSTONE);
 	put_task_struct(current); /* cannot be last */

-	/*
-	 * If this context is a clone; unclone it so it can't get
-	 * swapped to another process while we're removing all
-	 * the events from it.
-	 */
 	clone_ctx = unclone_ctx(child_ctx);
-	update_context_time(child_ctx);
-	raw_spin_unlock_irqrestore(&child_ctx->lock, flags);
+	raw_spin_unlock_irq(&child_ctx->lock);

 	if (clone_ctx)
 		put_ctx(clone_ctx);
@@ -8786,18 +8806,6 @@ static void perf_event_exit_task_context
 	 */
 	perf_event_task(child, child_ctx, 0);

-	/*
-	 * We can recurse on the same lock type through:
-	 *
-	 *   __perf_event_exit_task()
-	 *     sync_child_event()
-	 *       put_event()
-	 *         mutex_lock(&ctx->mutex)
-	 *
-	 * But since its the parent context it won't be the same instance.
-	 */
-	mutex_lock(&child_ctx->mutex);
-
 	list_for_each_entry_safe(child_event, next, &child_ctx->event_list, event_entry)
 		__perf_event_exit_task(child_event, child_ctx, child);

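The double free described in the changelog above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel implementation: every `toy_*` name is hypothetical, a context holds a single event, "freeing" is just a counter, and the concurrency is replaced by three fixed interleavings. The first interleaving models the old locking, where the orphan work can run between the tombstone store and the list walk; the other two model the only orders left once the exit path holds the mutex over the whole tear-down.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a context with one event. */
struct toy_ctx {
	bool dead;	/* models ctx->task == TASK_TOMBSTONE */
	bool on_list;	/* the event is still on the context's event list */
	int free_count;	/* how many times the event has been "freed" */
};

/* Models the removal call: it does nothing once the context is dead. */
void toy_remove(struct toy_ctx *c)
{
	assert(c != 0);
	if (!c->dead)
		c->on_list = false;
}

/* Models the orphan work: remove the event, then free it regardless. */
void toy_orphan_work(struct toy_ctx *c)
{
	if (!c->on_list)
		return;
	toy_remove(c);		/* may silently fail on a dead context */
	c->free_count++;
}

/* Exit path, step 1: mark the context dead. */
void toy_exit_mark_dead(struct toy_ctx *c)
{
	c->dead = true;
}

/* Exit path, step 2: walk the list and free whatever is still on it. */
void toy_exit_drain(struct toy_ctx *c)
{
	if (c->on_list) {
		c->on_list = false;
		c->free_count++;
	}
}

/* Old locking: the orphan work lands between the two exit steps. */
int toy_frees_without_outer_mutex(void)
{
	struct toy_ctx c = { false, true, 0 };

	toy_exit_mark_dead(&c);
	toy_orphan_work(&c);	/* removal fails, event freed anyway */
	toy_exit_drain(&c);	/* event still listed, freed again */
	return c.free_count;
}

/* Mutex over the whole tear-down: the work runs entirely before ... */
int toy_frees_work_before_exit(void)
{
	struct toy_ctx c = { false, true, 0 };

	toy_orphan_work(&c);
	toy_exit_mark_dead(&c);
	toy_exit_drain(&c);
	return c.free_count;
}

/* ... or entirely after the exit path. */
int toy_frees_work_after_exit(void)
{
	struct toy_ctx c = { false, true, 0 };

	toy_exit_mark_dead(&c);
	toy_exit_drain(&c);
	toy_orphan_work(&c);
	return c.free_count;
}
```

In this model only the interleaved order frees the event twice; both serialized orders free it exactly once, which is the property the patch obtains by taking ctx::mutex before the ctx::lock section.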
* [tip:perf/urgent] perf: Fix race in perf_event_exit_task_context( )
  2016-01-25 13:09 ` [PATCH] perf: Fix race in perf_event_exit_task_context Peter Zijlstra
@ 2016-01-28 19:10   ` tip-bot for Peter Zijlstra
  0 siblings, 0 replies; 3+ messages in thread
From: tip-bot for Peter Zijlstra @ 2016-01-28 19:10 UTC
  To: linux-tip-commits
  Cc: linux-kernel, jolsa, vincent.weaver, peterz, eranian, acme, tglx, hpa, torvalds, mingo

Commit-ID:  6a3351b612b72c558910c88a43e2ef6d7d68bc97
Gitweb:     http://git.kernel.org/tip/6a3351b612b72c558910c88a43e2ef6d7d68bc97
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Mon, 25 Jan 2016 14:09:54 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 28 Jan 2016 20:06:36 +0100

perf: Fix race in perf_event_exit_task_context()

There is a race between perf_event_exit_task_context() and
orphans_remove_work() which results in a use-after-free.

We mark ctx->task with TASK_TOMBSTONE to indicate a context is 'dead',
under ctx->lock. After which point event_function_call() on any event
of that context will NOP.

A concurrent orphans_remove_work() will only hold ctx->mutex for the
list iteration and not serialize against this. Therefore it's possible
that orphans_remove_work()'s perf_remove_from_context() call will
fail, but we'll continue to free the event, with the result of free'd
memory still being on lists and everything.

Once perf_event_exit_task_context() gets around to acquiring
ctx->mutex it too will iterate the event list, encounter the already
free'd event and proceed to free it _again_. This fails with the WARN
in free_event().

Plug the race by having perf_event_exit_task_context() hold ctx::mutex
over the whole tear-down, thereby 'naturally' serializing against all
other sites, including the orphan work.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: alexander.shishkin@linux.intel.com
Cc: dsahern@gmail.com
Cc: namhyung@kernel.org
Link: http://lkml.kernel.org/r/20160125130954.GY6357@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/events/core.c | 50 +++++++++++++++++++++++++++++---------------------
 1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6759f2a..1d243fa 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8748,14 +8748,40 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 {
 	struct perf_event_context *child_ctx, *clone_ctx = NULL;
 	struct perf_event *child_event, *next;
-	unsigned long flags;

 	WARN_ON_ONCE(child != current);

-	child_ctx = perf_lock_task_context(child, ctxn, &flags);
+	child_ctx = perf_pin_task_context(child, ctxn);
 	if (!child_ctx)
 		return;

+	/*
+	 * In order to reduce the amount of tricky in ctx tear-down, we hold
+	 * ctx::mutex over the entire thing. This serializes against almost
+	 * everything that wants to access the ctx.
+	 *
+	 * The exception is sys_perf_event_open() /
+	 * perf_event_create_kernel_count() which does find_get_context()
+	 * without ctx::mutex (it cannot because of the move_group double mutex
+	 * lock thing). See the comments in perf_install_in_context().
+	 *
+	 * We can recurse on the same lock type through:
+	 *
+	 *   __perf_event_exit_task()
+	 *     sync_child_event()
+	 *       put_event()
+	 *         mutex_lock(&ctx->mutex)
+	 *
+	 * But since its the parent context it won't be the same instance.
+	 */
+	mutex_lock(&child_ctx->mutex);
+
+	/*
+	 * In a single ctx::lock section, de-schedule the events and detach the
+	 * context from the task such that we cannot ever get it scheduled back
+	 * in.
+	 */
+	raw_spin_lock_irq(&child_ctx->lock);
 	task_ctx_sched_out(__get_cpu_context(child_ctx), child_ctx);

 	/*
@@ -8767,14 +8793,8 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 	WRITE_ONCE(child_ctx->task, TASK_TOMBSTONE);
 	put_task_struct(current); /* cannot be last */

-	/*
-	 * If this context is a clone; unclone it so it can't get
-	 * swapped to another process while we're removing all
-	 * the events from it.
-	 */
 	clone_ctx = unclone_ctx(child_ctx);
-	update_context_time(child_ctx);
-	raw_spin_unlock_irqrestore(&child_ctx->lock, flags);
+	raw_spin_unlock_irq(&child_ctx->lock);

 	if (clone_ctx)
 		put_ctx(clone_ctx);
@@ -8786,18 +8806,6 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn)
 	 */
 	perf_event_task(child, child_ctx, 0);

-	/*
-	 * We can recurse on the same lock type through:
-	 *
-	 *   __perf_event_exit_task()
-	 *     sync_child_event()
-	 *       put_event()
-	 *         mutex_lock(&ctx->mutex)
-	 *
-	 * But since its the parent context it won't be the same instance.
-	 */
-	mutex_lock(&child_ctx->mutex);
-
 	list_for_each_entry_safe(child_event, next, &child_ctx->event_list, event_entry)
 		__perf_event_exit_task(child_event, child_ctx, child);
