Message-ID: <1492522832.18845.1.camel@sipsolutions.net>
Subject: Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances
From: Johannes Berg
To: paulmck@linux.vnet.ibm.com
Cc: Nicolai Stange, Greg Kroah-Hartman, linux-kernel@vger.kernel.org
Date: Tue, 18 Apr 2017 15:40:32 +0200
In-Reply-To: <20170418133136.GS3956@linux.vnet.ibm.com>

On Tue, 2017-04-18 at 06:31 -0700, Paul E. McKenney wrote:
> On Tue, Apr 18, 2017 at 11:39:27AM +0200, Johannes Berg wrote:
> > On Mon, 2017-04-17 at 09:01 -0700, Paul E. McKenney wrote:
> > >
> > > If you have not already done so, please run this with debug
> > > enabled, especially CONFIG_PROVE_LOCKING=y (which implies
> > > CONFIG_PROVE_RCU=y). This is important because there are
> > > configurations for which the deadlocks you saw with SRCU turn
> > > into silent failure, including memory corruption.
> > > CONFIG_PROVE_RCU=y will catch many of those situations.
> >
> > Can you elaborate on that? I think we may have had CONFIG_PROVE_RCU
> > enabled in the builds where we saw the problem, but I'm not sure.
>
> CONFIG_PROVE_RCU=y will reliably catch things like this:
>
> 1.	rcu_read_lock();
> 	synchronize_rcu();
> 	rcu_read_unlock();

Ok, that's not something that happens here either.

> 2.	rcu_read_lock();
> 	schedule_timeout_interruptible(HZ);
> 	rcu_read_unlock();

Neither is this happening.

> There are more, but this should get you the flavor of the types
> of bugs CONFIG_PROVE_RCU=y can locate for you.

Makes sense.

However, the issue at hand is what we (you and I) discussed earlier
wrt. lockdep -- from SRCU's point of view everything is actually OK,
except that the one thread is waiting for something and we can never
finish the grace period, and thus synchronize_srcu() will never return.
But there's no real SRCU bug here.

> > Nicolai probably never even ran into this problem, though it should
> > be easy to reproduce.
>
> I am just worried that the situation resulting in the earlier SRCU
> deadlocks might be hiding behind CONFIG_PROVE_RCU=n,
> CONFIG_PREEMPT=n, and CONFIG_PREEMPT_COUNT=n.  Or some other bug
> hiding behind some other set of Kconfig options.

There's no SRCU deadlock though. I know exactly why it happens, in my
case, which is the following:

Thread 1:
	userspace: read(debugfs_file_1)
		srcu_read_lock(&debugfs_srcu);		// in debugfs bowels
			wait_event_interruptible(...);	// in my driver's debugfs read method

Thread 2:
	debugfs_remove(debugfs_file_2);
		synchronize_srcu(&debugfs_srcu);	// in debugfs bowels

This is the live-lock. The deadlock is something I posited but never
ran into:

CPU 1                                   CPU 2

srcu_read_lock(&debugfs_srcu);
                                        rtnl_lock();
rtnl_lock();
                                        synchronize_srcu(&debugfs_srcu);

Again, no (S)RCU abuse here, just an ABBA deadlock.

johannes
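
The live-lock above can be illustrated with a small userspace model. This is only a sketch: `struct toy_srcu` and the `toy_*` functions are hypothetical names invented for illustration, not kernel APIs, and the single reader counter is a deliberate simplification of how SRCU tracks readers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for one SRCU domain; NOT the kernel's struct srcu_struct. */
struct toy_srcu {
	int active_readers;	/* read-side critical sections currently open */
};

/* Models srcu_read_lock(): a reader enters the domain. */
void toy_read_lock(struct toy_srcu *sp)
{
	sp->active_readers++;
}

/* Models srcu_read_unlock(): a reader leaves the domain. */
void toy_read_unlock(struct toy_srcu *sp)
{
	assert(sp->active_readers > 0);
	sp->active_readers--;
}

/*
 * Models the condition synchronize_srcu() waits for. The toy version is
 * stricter than real SRCU: it requires no open readers at all, whereas
 * real SRCU only waits for readers that began before the call.
 */
bool toy_grace_period_can_end(const struct toy_srcu *sp)
{
	return sp->active_readers == 0;
}
```

In this model, thread 1's read of debugfs_file_1 keeps the count at one for as long as it sleeps in its read method, so the check made on behalf of thread 2's removal of debugfs_file_2 stays false, because both files share the one domain.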
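
The posited ABBA deadlock can be sketched as a lock-order check in plain C. This is a toy model under stated assumptions: the `toy_*` names and the two lock classes are hypothetical, treating the wait in synchronize_srcu() as if it acquired a lock class is a modeling choice made here, and none of this is how lockdep is implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes for the scenario; names are illustrative. */
enum toy_class { TOY_DEBUGFS_SRCU, TOY_RTNL, TOY_NR_CLASSES };

/* order[a][b] is true once some context took b while holding a. */
struct toy_lock_graph {
	bool order[TOY_NR_CLASSES][TOY_NR_CLASSES];
};

/* Record that `next` was acquired (or waited on) while `held` was held. */
void toy_record(struct toy_lock_graph *g, int held, int next)
{
	assert(held >= 0 && held < TOY_NR_CLASSES);
	assert(next >= 0 && next < TOY_NR_CLASSES);
	g->order[held][next] = true;
}

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static bool toy_reaches(const struct toy_lock_graph *g, int from, int to,
			bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int i = 0; i < TOY_NR_CLASSES; i++)
		if (g->order[from][i] && !seen[i] &&
		    toy_reaches(g, i, to, seen))
			return true;
	return false;
}

/* An inversion exists if some edge a->b is matched by a path b->...->a. */
bool toy_has_inversion(const struct toy_lock_graph *g)
{
	for (int a = 0; a < TOY_NR_CLASSES; a++)
		for (int b = 0; b < TOY_NR_CLASSES; b++) {
			bool seen[TOY_NR_CLASSES] = { false };

			if (a != b && g->order[a][b] &&
			    toy_reaches(g, b, a, seen))
				return true;
		}
	return false;
}
```

Recording CPU 1's order (TOY_DEBUGFS_SRCU, then TOY_RTNL) alone reports no inversion; adding CPU 2's order (TOY_RTNL, then TOY_DEBUGFS_SRCU) closes the cycle, which is the ABBA pattern from the diagram.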