From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932599AbcLGSfD (ORCPT );
	Wed, 7 Dec 2016 13:35:03 -0500
Received: from bombadil.infradead.org ([198.137.202.9]:55055 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753274AbcLGSfA (ORCPT );
	Wed, 7 Dec 2016 13:35:00 -0500
Date: Wed, 7 Dec 2016 19:34:55 +0100
From: Peter Zijlstra
To: Mark Rutland
Cc: linux-kernel@vger.kernel.org, Ingo Molnar,
	Arnaldo Carvalho de Melo, Thomas Gleixner,
	Sebastian Andrzej Siewior, jeremy.linton@arm.com
Subject: Re: Perf hotplug lockup in v4.9-rc8
Message-ID: <20161207183455.GQ3124@twins.programming.kicks-ass.net>
References: <20161207135217.GA25605@leverpostej>
	<20161207175347.GB13840@leverpostej>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161207175347.GB13840@leverpostej>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 07, 2016 at 05:53:47PM +0000, Mark Rutland wrote:
> On Wed, Dec 07, 2016 at 01:52:17PM +0000, Mark Rutland wrote:
> > Hi all,
> >
> > Jeremy noticed a kernel lockup on arm64 when the perf tool was used in
> > parallel with hotplug, which I've reproduced on arm64 and x86(-64) with
> > v4.9-rc8. In both cases I'm using defconfig; I've tried enabling lockdep,
> > but it was silent for arm64 and x86.
>
> It looks like we're trying to install a task-bound event into a context
> where task_cpu(ctx->task) is dead, and thus the cpu_function_call() in
> perf_install_in_context() fails. We retry repeatedly.
>
> On !PREEMPT (as with x86 defconfig), we manage to prevent the hotplug
> machinery from making progress, and this turns into a livelock.
>
> On PREEMPT (as with arm64 defconfig), I'm somewhat lost.

So the problem is that even with PREEMPT we can hit a blocked task that
has a 'dead' CPU.
We'll spin until either the task wakes up or the CPU does; either can
take a very long time. How exactly your test case triggers this is
still a mystery: all it executes is 'true', and that really shouldn't
block much.

In any case, the below cures things, but it is, as the comment says,
horrific. I have yet to come up with a better patch that doesn't
violate scheduler internals.

---
 kernel/events/core.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6ee1febdf6ff..6faf4b03396e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1203,6 +1203,11 @@ static void put_ctx(struct perf_event_context *ctx)
  *	perf_event_context::lock
  *	perf_event::mmap_mutex
  *	mmap_sem
+ *
+ *	task_struct::pi_lock
+ *	  rq::lock
+ *	    perf_event_context::lock
+ *
  */
 static struct perf_event_context *
 perf_event_ctx_lock_nested(struct perf_event *event, int nesting)
@@ -2352,6 +2357,28 @@ perf_install_in_context(struct perf_event_context *ctx,
 		return;
 	}
 	raw_spin_unlock_irq(&ctx->lock);
+
+	raw_spin_lock_irq(&task->pi_lock);
+	if (!(task->state == TASK_RUNNING || task->state == TASK_WAKING)) {
+		/*
+		 * XXX horrific hack...
+		 */
+		raw_spin_lock(&ctx->lock);
+		if (task != ctx->task) {
+			raw_spin_unlock(&ctx->lock);
+			raw_spin_unlock_irq(&task->pi_lock);
+			goto again;
+		}
+
+		add_event_to_ctx(event, ctx);
+		raw_spin_unlock(&ctx->lock);
+		raw_spin_unlock_irq(&task->pi_lock);
+		return;
+	}
+	raw_spin_unlock_irq(&task->pi_lock);
+
+	cond_resched();
+
 	/*
 	 * Since !ctx->is_active doesn't mean anything, we must IPI
 	 * unconditionally.