Date: Wed, 4 Nov 2015 15:20:58 +0100
From: Peter Zijlstra
To: "Paul E. McKenney"
Cc: Dave Jones, Linux Kernel, Ingo Molnar, Stephane Eranian, Andi Kleen
Subject: Re: perf related lockdep bug
Message-ID: <20151104142058.GX3604@twins.programming.kicks-ass.net>
References: <20151104051717.GA6098@codemonkey.org.uk>
 <20151104102151.GG17308@twins.programming.kicks-ass.net>
 <20151104102800.GZ11639@twins.programming.kicks-ass.net>
 <20151104105010.GA11639@twins.programming.kicks-ass.net>
 <20151104134838.GR29027@linux.vnet.ibm.com>
In-Reply-To: <20151104134838.GR29027@linux.vnet.ibm.com>

On Wed, Nov 04, 2015 at 05:48:38AM -0800, Paul E. McKenney wrote:
> Ouch!!!  Thank you for the analysis, though I am very surprised that
> my testing did not find this.

Yeah, not sure how that ended up not triggering earlier.

I'm thinking of adding a might_wake(), much like we have might_fault(),
and adding that to printk().

> But pulling all printk()s out from under
> rnp->lock is going to re-introduce some stall-warning bugs.

figures :/

> So what other options do I have?

Kill printk() :-) It's unreliable garbage anyway ;-)

> o	I could do raise_softirq(), then report the quiescent state in
> 	the core RCU code, but I bet that raise_softirq()'s wakeup gets
> 	me into just as much trouble.

Yep..

> o	Ditto for workqueues, I suspect.

Yep..

> o	I cannot send an IPI because interrupts are disabled, and that
> 	would be rather annoying from a real-time perspective in any
> 	case.

Indeed.

> So this hit the code in perf_lock_task_context() that disables preemption
> across an RCU read-side critical section, which previously sufficed to
> prevent this scenario.  What happened this time is as follows:
>
> o	CPU 0 entered perf_lock_task_context(), disabled preemption,
> 	and entered its RCU read-side critical section.  Of course,
> 	the whole point of disabling preemption is to prevent the
> 	matching rcu_read_unlock() from grabbing locks.
>
> o	CPU 1 started an expedited grace period.  It checked CPU
> 	state, saw that CPU 0 was running in the kernel, and therefore
> 	IPIed it.
>
> o	The IPI handler running on CPU 0 saw that there was an
> 	RCU read-side critical section in effect, so it set the
> 	->exp_need_qs flag.
>
> o	When the matching rcu_read_unlock() executed, it noted that
> 	->exp_need_qs was set, and therefore grabbed the locks that it
> 	shouldn't, hence lockdep's complaints about deadlock.
>
> This problem is caused by the IPI handler interrupting the RCU read-side
> critical section.  One way to prevent the IPI from doing this is to
> disable interrupts across the RCU read-side critical section instead
> of merely disabling preemption.  This is a reasonable approach given
> that acquiring the scheduler locks is going to disable interrupts
> in any case.
>
> The (untested) patch below takes this approach.
>
> Thoughts?

Yes, this should work, but now I worry I need to go audit all of perf
and sched for this :/
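
For concreteness, a minimal sketch of what such a might_wake() could
look like, modeled on the might_fault()/might_sleep() pattern.  The
dummy lockdep map and all names here are illustrative assumptions, not
code from the tree:

	#include <linux/lockdep.h>

	/*
	 * Illustrative only: "acquire" and release a dummy lockdep map
	 * so lockdep connects every annotated call site (e.g. printk())
	 * with the locks taken on a real wakeup path, and reports the
	 * inversion even when no wakeup actually happens.  For this to
	 * catch anything, try_to_wake_up() would have to take the same
	 * map around its lock acquisitions.
	 */
	static struct lockdep_map might_wake_map =
		STATIC_LOCKDEP_MAP_INIT("might_wake", &might_wake_map);

	static inline void might_wake(void)
	{
		lock_map_acquire(&might_wake_map);
		lock_map_release(&might_wake_map);
	}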
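
And for reference, the shape of the change Paul describes above (his
actual patch was trimmed from the quote): a sketch only, against a
simplified perf_lock_task_context(), with the refcount and error
handling elided:

	static struct perf_event_context *
	perf_lock_task_context(struct task_struct *task, int ctxn,
			       unsigned long *flags)
	{
		struct perf_event_context *ctx;

	retry:
		/*
		 * Disable interrupts, not just preemption, across the
		 * RCU read-side critical section: the expedited-GP IPI
		 * then cannot run inside it and set ->exp_need_qs, so
		 * the matching rcu_read_unlock() never grabs scheduler
		 * locks while we hold ctx->lock.
		 */
		local_irq_save(*flags);		/* was: preempt_disable() */
		rcu_read_lock();
		ctx = rcu_dereference(task->perf_event_ctxp[ctxn]);
		if (ctx) {
			raw_spin_lock(&ctx->lock);	/* irqs already off */
			if (ctx != rcu_dereference(task->perf_event_ctxp[ctxn])) {
				/* ctx changed under us; drop everything and retry */
				raw_spin_unlock(&ctx->lock);
				rcu_read_unlock();
				local_irq_restore(*flags);
				goto retry;
			}
		}
		rcu_read_unlock();
		if (!ctx)
			local_irq_restore(*flags);	/* was: preempt_enable() */
		/* On success, irqs stay off; caller restores via *flags. */
		return ctx;
	}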