Date: Wed, 4 Nov 2015 07:34:54 -0800
From: "Paul E. McKenney"
Reply-To: paulmck@linux.vnet.ibm.com
To: Peter Zijlstra
Cc: Dave Jones, Linux Kernel, Ingo Molnar, Stephane Eranian, Andi Kleen
Subject: Re: perf related lockdep bug
Message-ID: <20151104153454.GU29027@linux.vnet.ibm.com>
In-Reply-To: <20151104142058.GX3604@twins.programming.kicks-ass.net>
References: <20151104051717.GA6098@codemonkey.org.uk>
 <20151104102151.GG17308@twins.programming.kicks-ass.net>
 <20151104102800.GZ11639@twins.programming.kicks-ass.net>
 <20151104105010.GA11639@twins.programming.kicks-ass.net>
 <20151104134838.GR29027@linux.vnet.ibm.com>
 <20151104142058.GX3604@twins.programming.kicks-ass.net>

On Wed, Nov 04, 2015 at 03:20:58PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 04, 2015 at 05:48:38AM -0800, Paul E. McKenney wrote:
> > Ouch!!!  Thank you for the analysis, though I am very surprised that
> > my testing did not find this.
>
> Yeah, not sure how that ended up not triggering earlier.
>
> I'm thinking of adding a might_wake(), much like we have might_fault()
> and add that to printk().

The idea being that might_wake() complains if a scheduler lock is held?
Sounds like a good idea to me.

> > But pulling all printk()s out from under
> > rnp->lock is going to re-introduce some stall-warning bugs.
>
> figures :/
>
> > So what other options do I have?
>
> Kill printk() :-) Its unreliable garbage anyway ;-)

;-) ;-) ;-)

> > o   I could do raise_softirq(), then report the quiescent state in
> >     the core RCU code, but I bet that raise_softirq()'s wakeup gets
> >     me into just as much trouble.
>
> Yep..
>
> > o   Ditto for workqueues, I suspect.
>
> Yep..
>
> > o   I cannot send an IPI because interrupts are disabled, and that
> >     would be rather annoying from a real-time perspective in any
> >     case.
>
> Indeed.
>
> > So this hit the code in perf_lock_task_context() that disables preemption
> > across an RCU read-side critical section, which previously sufficed to
> > prevent this scenario.  What happened this time is as follows:
> >
> > o   CPU 0 entered perf_lock_task_context(), disabled preemption,
> >     and entered its RCU read-side critical section.  Of course,
> >     the whole point of disabling preemption is to prevent the
> >     matching rcu_read_unlock() from grabbing locks.
> >
> > o   CPU 1 started an expedited grace period.  It checked CPU
> >     state, saw that CPU 0 was running in the kernel, and therefore
> >     IPIed it.
> >
> > o   The IPI handler running on CPU 0 saw that there was an
> >     RCU read-side critical section in effect, so it set the
> >     ->exp_need_qs flag.
> >
> > o   When the matching rcu_read_unlock() executes, it notes that
> >     ->exp_need_qs is set, and therefore grabs the locks that it
> >     shouldn't, hence lockdep's complaints about deadlock.
> >
> > This problem is caused by the IPI handler interrupting the RCU read-side
> > critical section.  One way to prevent the IPI from doing this is to
> > disable interrupts across the RCU read-side critical section instead
> > of merely disabling preemption.  This is a reasonable approach given
> > that acquiring the scheduler locks is going to disable interrupts
> > in any case.
> >
> > The (untested) patch below takes this approach.
> >
> > Thoughts?
>
> Yes, this should work, but now I worry I need to go audit all of perf
> and sched for this :/

Could lockdep be convinced to do the auditing for you?

                                                        Thanx, Paul
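
[Editorial sketch, not part of the original message.] The four-step scenario
quoted above can be illustrated with a toy, single-threaded userspace model.
This is not kernel code and not the patch under discussion: every name below
other than exp_need_qs is hypothetical, the "slow path" counter merely stands
in for rcu_read_unlock() acquiring rnp->lock, and the model assumes that a
handler which finds no reader in progress can report the quiescent state
itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of CPU 0. All names except exp_need_qs are hypothetical. */
struct toy_cpu {
    int  rcu_nesting;     /* depth of RCU read-side critical sections */
    bool irqs_disabled;   /* true while "interrupts" are masked */
    bool ipi_pending;     /* an IPI arrived while interrupts were masked */
    bool exp_need_qs;     /* set by the IPI handler, as in the email */
    int  slow_unlocks;    /* unlocks that took the lock-acquiring path */
    int  qs_from_handler; /* quiescent states reported by the handler */
};

/* Stand-in for the expedited-grace-period IPI handler on CPU 0. */
static void toy_ipi_handler(struct toy_cpu *c)
{
    if (c->rcu_nesting > 0)
        c->exp_need_qs = true;  /* reader running: defer to the unlock */
    else
        c->qs_from_handler++;   /* assumption: no reader, report directly */
}

/* CPU 1 "sends" the IPI; it is held back while interrupts are masked. */
static void toy_send_ipi(struct toy_cpu *c)
{
    if (c->irqs_disabled)
        c->ipi_pending = true;
    else
        toy_ipi_handler(c);
}

static void toy_irq_disable(struct toy_cpu *c)
{
    c->irqs_disabled = true;
}

/* Re-enabling interrupts delivers any IPI that was held back. */
static void toy_irq_enable(struct toy_cpu *c)
{
    c->irqs_disabled = false;
    if (c->ipi_pending) {
        c->ipi_pending = false;
        toy_ipi_handler(c);
    }
}

static void toy_rcu_read_lock(struct toy_cpu *c)
{
    c->rcu_nesting++;
}

static void toy_rcu_read_unlock(struct toy_cpu *c)
{
    assert(c->rcu_nesting > 0);
    if (--c->rcu_nesting == 0 && c->exp_need_qs) {
        c->exp_need_qs = false;
        c->slow_unlocks++;      /* stands in for grabbing rnp->lock */
    }
}

/*
 * One reader on CPU 0, with an expedited grace period starting on CPU 1
 * in the middle of the critical section. Returns how many unlocks took
 * the lock-acquiring slow path.
 */
static int toy_run_reader(bool disable_irqs)
{
    struct toy_cpu c = {0};

    if (disable_irqs)
        toy_irq_disable(&c);
    toy_rcu_read_lock(&c);
    toy_send_ipi(&c);           /* CPU 1 starts the expedited GP here */
    toy_rcu_read_unlock(&c);
    if (disable_irqs)
        toy_irq_enable(&c);
    return c.slow_unlocks;
}
```

In this model, toy_run_reader(false) corresponds to the preemption-only case:
the handler runs inside the critical section and the unlock takes the slow
path. toy_run_reader(true) corresponds to the proposed approach: the IPI is
held back until interrupts are re-enabled, by which time the reader has
finished, so the unlock never takes the slow path.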