From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932599AbcLGSfD (ORCPT );
	Wed, 7 Dec 2016 13:35:03 -0500
Received: from bombadil.infradead.org ([198.137.202.9]:55055 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753274AbcLGSfA (ORCPT );
	Wed, 7 Dec 2016 13:35:00 -0500
Date: Wed, 7 Dec 2016 19:34:55 +0100
From: Peter Zijlstra
To: Mark Rutland
Cc: linux-kernel@vger.kernel.org, Ingo Molnar,
	Arnaldo Carvalho de Melo, Thomas Gleixner,
	Sebastian Andrzej Siewior, jeremy.linton@arm.com
Subject: Re: Perf hotplug lockup in v4.9-rc8
Message-ID: <20161207183455.GQ3124@twins.programming.kicks-ass.net>
References: <20161207135217.GA25605@leverpostej>
	<20161207175347.GB13840@leverpostej>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161207175347.GB13840@leverpostej>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 07, 2016 at 05:53:47PM +0000, Mark Rutland wrote:
> On Wed, Dec 07, 2016 at 01:52:17PM +0000, Mark Rutland wrote:
> > Hi all,
> >
> > Jeremy noticed a kernel lockup on arm64 when the perf tool was used in
> > parallel with hotplug, which I've reproduced on arm64 and x86(-64) with
> > v4.9-rc8. In both cases I'm using defconfig; I've tried enabling lockdep,
> > but it was silent for arm64 and x86.
>
> It looks like we're trying to install a task-bound event into a context
> where task_cpu(ctx->task) is dead, and thus the cpu_function_call() in
> perf_install_in_context() fails. We retry repeatedly.
>
> On !PREEMPT (as with x86 defconfig), we manage to prevent the hotplug
> machinery from making progress, and this turns into a livelock.
>
> On PREEMPT (as with arm64 defconfig), I'm somewhat lost.

So the problem is that even with PREEMPT we can hit a blocked task that
has a 'dead' CPU.
We'll spin until either the task wakes up or the CPU does; either can
take a very long time. How exactly your test case triggers this is
still a mystery: all it executes is 'true', and that really shouldn't
block much.

In any case, the below cures things, but it is, as the comment says,
horrific. I have yet to come up with a better patch that doesn't
violate scheduler internals.

---
 kernel/events/core.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6ee1febdf6ff..6faf4b03396e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1203,6 +1203,11 @@ static void put_ctx(struct perf_event_context *ctx)
  *	perf_event_context::lock
  *	perf_event::mmap_mutex
  *	mmap_sem
+ *
+ *	task_struct::pi_lock
+ *	  rq::lock
+ *	    perf_event_context::lock
+ *
  */
 static struct perf_event_context *
 perf_event_ctx_lock_nested(struct perf_event *event, int nesting)
@@ -2352,6 +2357,28 @@ perf_install_in_context(struct perf_event_context *ctx,
 		return;
 	}
 	raw_spin_unlock_irq(&ctx->lock);
+
+	raw_spin_lock_irq(&task->pi_lock);
+	if (!(task->state == TASK_RUNNING || task->state == TASK_WAKING)) {
+		/*
+		 * XXX horrific hack...
+		 */
+		raw_spin_lock(&ctx->lock);
+		if (task != ctx->task) {
+			raw_spin_unlock(&ctx->lock);
+			raw_spin_unlock_irq(&task->pi_lock);
+			goto again;
+		}
+
+		add_event_to_ctx(event, ctx);
+		raw_spin_unlock(&ctx->lock);
+		raw_spin_unlock_irq(&task->pi_lock);
+		return;
+	}
+	raw_spin_unlock_irq(&task->pi_lock);
+
+	cond_resched();
+
 	/*
 	 * Since !ctx->is_active doesn't mean anything, we must IPI
 	 * unconditionally.