From: Will Deacon <will.deacon@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Mark Rutland <mark.rutland@arm.com>,
linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
jeremy.linton@arm.com
Subject: Re: Perf hotplug lockup in v4.9-rc8
Date: Mon, 12 Dec 2016 11:46:40 +0000
Message-ID: <20161212114640.GD21248@arm.com>
In-Reply-To: <20161209135900.GU3174@twins.programming.kicks-ass.net>

On Fri, Dec 09, 2016 at 02:59:00PM +0100, Peter Zijlstra wrote:
> On Wed, Dec 07, 2016 at 07:34:55PM +0100, Peter Zijlstra wrote:
>
> > @@ -2352,6 +2357,28 @@ perf_install_in_context(struct perf_event_context *ctx,
> > return;
> > }
> > raw_spin_unlock_irq(&ctx->lock);
> > +
> > + raw_spin_lock_irq(&task->pi_lock);
> > + if (!(task->state == TASK_RUNNING || task->state == TASK_WAKING)) {
> > + /*
> > + * XXX horrific hack...
> > + */
> > + raw_spin_lock(&ctx->lock);
> > + if (task != ctx->task) {
> > + raw_spin_unlock(&ctx->lock);
> > + raw_spin_unlock_irq(&task->pi_lock);
> > + goto again;
> > + }
> > +
> > + add_event_to_ctx(event, ctx);
> > + raw_spin_unlock(&ctx->lock);
> > + raw_spin_unlock_irq(&task->pi_lock);
> > + return;
> > + }
> > + raw_spin_unlock_irq(&task->pi_lock);
> > +
> > + cond_resched();
> > +
> > /*
> > * Since !ctx->is_active doesn't mean anything, we must IPI
> > * unconditionally.
>
> So while I went back and forth trying to make that less ugly, I figured
> there was another problem.
>
> Imagine the cpu_function_call() hitting the 'right' cpu, but not finding
> the task current. It will then continue to install the event in the
> context. However, that doesn't stop another CPU from pulling the task in
> question from our rq and scheduling it elsewhere.
>
> This all led me to the below patch. Now it has a rather large comment,
> and while it represents my current thinking on the matter, I'm not at
> all sure it's entirely correct. I got my brain in a fair twist while
> writing it.
>
> Please think about it carefully.
>
> ---
> kernel/events/core.c | 70 +++++++++++++++++++++++++++++++++++-----------------
> 1 file changed, 48 insertions(+), 22 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 6ee1febdf6ff..7d9ae461c535 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2252,7 +2252,7 @@ static int __perf_install_in_context(void *info)
> struct perf_event_context *ctx = event->ctx;
> struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
> struct perf_event_context *task_ctx = cpuctx->task_ctx;
> - bool activate = true;
> + bool reprogram = true;
> int ret = 0;
>
> raw_spin_lock(&cpuctx->ctx.lock);
> @@ -2260,27 +2260,26 @@ static int __perf_install_in_context(void *info)
> raw_spin_lock(&ctx->lock);
> task_ctx = ctx;
>
> - /* If we're on the wrong CPU, try again */
> - if (task_cpu(ctx->task) != smp_processor_id()) {
> - ret = -ESRCH;
> - goto unlock;
> - }
> + reprogram = (ctx->task == current);
>
> /*
> - * If we're on the right CPU, see if the task we target is
> - * current, if not we don't have to activate the ctx, a future
> - * context switch will do that for us.
> + * If the task is running, it must be running on this CPU,
> + * otherwise we cannot reprogram things.
> + *
> + * If it's not running, we don't care, ctx->lock will
> + * serialize against it becoming runnable.
> */
> - if (ctx->task != current)
> - activate = false;
> - else
> - WARN_ON_ONCE(cpuctx->task_ctx && cpuctx->task_ctx != ctx);
> + if (task_curr(ctx->task) && !reprogram) {
> + ret = -ESRCH;
> + goto unlock;
> + }
>
> + WARN_ON_ONCE(reprogram && cpuctx->task_ctx && cpuctx->task_ctx != ctx);
> } else if (task_ctx) {
> raw_spin_lock(&task_ctx->lock);
> }
>
> - if (activate) {
> + if (reprogram) {
> ctx_sched_out(ctx, cpuctx, EVENT_TIME);
> add_event_to_ctx(event, ctx);
> ctx_resched(cpuctx, task_ctx);
> @@ -2331,13 +2330,36 @@ perf_install_in_context(struct perf_event_context *ctx,
> /*
> * Installing events is tricky because we cannot rely on ctx->is_active
> * to be set in case this is the nr_events 0 -> 1 transition.
> + *
> + * Instead we use task_curr(), which tells us if the task is running.
> + * However, since we use task_curr() outside of rq::lock, we can race
> + * against the actual state. This means the result can be wrong.
> + *
> + * If we get a false positive, we retry, this is harmless.
> + *
> + * If we get a false negative, things are complicated. If we are after
> + * perf_event_context_sched_in() ctx::lock will serialize us, and the
> + * value must be correct. If we're before, it doesn't matter since
> + * perf_event_context_sched_in() will program the counter.
> + *
> + * However, this hinges on the remote context switch having observed
> + * our task->perf_event_ctxp[] store, such that it will in fact take
> + * ctx::lock in perf_event_context_sched_in().
> + *
> + * We do this by task_function_call(), if the IPI fails to hit the task
> + * we know any future context switch of task must see the
> + * perf_event_ctxp[] store.
> */
> -again:
> +
> /*
> - * Cannot use task_function_call() because we need to run on the task's
> - * CPU regardless of whether its current or not.
> + * This smp_mb() orders the task->perf_event_ctxp[] store with the
> + * task_cpu() load, such that if the IPI then does not find the task
> + * running, a future context switch of that task must observe the
> + * store.
> */
> - if (!cpu_function_call(task_cpu(task), __perf_install_in_context, event))
> + smp_mb();
> +again:
> + if (!task_function_call(task, __perf_install_in_context, event))
> return;

I'm trying to figure out whether or not the barriers implied by the IPI
are sufficient here, or whether we really need the explicit smp_mb().
Certainly, arch_send_call_function_single_ipi has to order the publishing
of the remote work before the signalling of the interrupt, but the comment
above refers to "the task_cpu() load" and I can't see that after your
diff.

What are you trying to order here?

Will

>
> raw_spin_lock_irq(&ctx->lock);
> @@ -2351,12 +2373,16 @@ perf_install_in_context(struct perf_event_context *ctx,
> raw_spin_unlock_irq(&ctx->lock);
> return;
> }
> - raw_spin_unlock_irq(&ctx->lock);
> /*
> - * Since !ctx->is_active doesn't mean anything, we must IPI
> - * unconditionally.
> + * If the task is not running, ctx->lock will avoid it becoming so,
> + * thus we can safely install the event.
> */
> - goto again;
> + if (task_curr(task)) {
> + raw_spin_unlock_irq(&ctx->lock);
> + goto again;
> + }
> + add_event_to_ctx(event, ctx);
> + raw_spin_unlock_irq(&ctx->lock);
> }
>
> /*
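
The ordering discussed above (a store to task->perf_event_ctxp[], a full
barrier, then a load of the task's running state) has the shape of the
classic store-buffering pattern. The following is only a userspace C11
sketch of that pattern, not kernel code. The names ctxp_published,
task_running, installer(), switcher() and run_sb_litmus() are hypothetical
stand-ins, atomic_thread_fence(memory_order_seq_cst) is used as a rough
analogue of smp_mb(), and the matching full barrier on the context-switch
side is an assumption of the sketch rather than something stated in the
thread.

```c
#include <pthread.h>
#include <stdatomic.h>

/*
 * Hypothetical stand-ins, not kernel symbols:
 *   ctxp_published models the task->perf_event_ctxp[] store,
 *   task_running   models the "is the task running" state.
 */
static atomic_int ctxp_published;
static atomic_int task_running;
static int installer_saw_running;
static int switcher_saw_ctxp;

/* Installer side: publish the context, full fence, then check the task. */
static void *installer(void *arg)
{
	(void)arg;
	atomic_store_explicit(&ctxp_published, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* analogue of smp_mb() */
	installer_saw_running =
		atomic_load_explicit(&task_running, memory_order_relaxed);
	return NULL;
}

/*
 * Context-switch side: mark the task running, full fence (assumed here),
 * then look for a published context.
 */
static void *switcher(void *arg)
{
	(void)arg;
	atomic_store_explicit(&task_running, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	switcher_saw_ctxp =
		atomic_load_explicit(&ctxp_published, memory_order_relaxed);
	return NULL;
}

/*
 * Run the two sides concurrently 'iterations' times and count how often
 * both miss each other's store. Returns -1 if a thread cannot be created.
 */
int run_sb_litmus(int iterations)
{
	int forbidden = 0;

	for (int i = 0; i < iterations; i++) {
		pthread_t a, b;

		atomic_store(&ctxp_published, 0);
		atomic_store(&task_running, 0);

		if (pthread_create(&a, NULL, installer, NULL) != 0)
			return -1;
		if (pthread_create(&b, NULL, switcher, NULL) != 0) {
			pthread_join(a, NULL);
			return -1;
		}
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		/* Both sides missing the other's store is the outcome to rule out. */
		if (!installer_saw_running && !switcher_saw_ctxp)
			forbidden++;
	}
	return forbidden;
}
```

With both fences in place, the C11 memory model forbids the outcome where
neither side observes the other's store, so run_sb_litmus() should return 0.
Dropping either fence makes that outcome permitted, although a given run may
still not exhibit it. A passing run illustrates the pattern and does not
prove anything about the kernel patch itself.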