From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933390AbdC3Hzx (ORCPT ); Thu, 30 Mar 2017 03:55:53 -0400 Received: from s3.sipsolutions.net ([5.9.151.49]:58220 "EHLO sipsolutions.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932790AbdC3Hzv (ORCPT ); Thu, 30 Mar 2017 03:55:51 -0400 Message-ID: <1490860547.32041.1.camel@sipsolutions.net> Subject: Re: deadlock in synchronize_srcu() in debugfs? From: Johannes Berg To: Nicolai Stange Cc: linux-kernel , "Paul E.McKenney" , gregkh Date: Thu, 30 Mar 2017 09:55:47 +0200 In-Reply-To: <87lgrnmd8b.fsf@gmail.com> (sfid-20170330_093222_584439_BA8321D0) References: <1490280886.2766.4.camel@sipsolutions.net> <87o9ws6m4s.fsf@gmail.com> <1490614617.3393.4.camel@sipsolutions.net> <87lgrnmd8b.fsf@gmail.com> (sfid-20170330_093222_584439_BA8321D0) Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.22.4-1 Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 2017-03-30 at 09:32 +0200, Nicolai Stange wrote: > > I wonder if holding the RTNL during the debugfs file removal is > really needed. I'll try to have a look in the next couple of days. Yes, I'm pretty much convinced that it is. I considered doing a deferred debugfs_remove() by holding the object around, but then I can't be sure that I can later re-add a new object with the same directory name, so I have much more complexity - I'm not even sure that can be solved at all, *perhaps* by renaming in debugfs first, but that's major new complexity. Enough complexity that I'm considering just removing debugfs usage entirely and invent new mechanisms, or use sysfs, or something else instead. > Summarizing, the problem is the call to the indefinitely blocking > srcu_synchronize() while having a lock held? I'll see whether I can > ask lockdep if any lock is held and spit out a warning then. Half the thread here was about that - it's not easily doable because you'd have to teach lockdep about the special SRCU semantics first. Since it doesn't even seem to do read/write locks properly that's probably a massive undertaking. I also doubt that it's useful, because even if we did flag this sort of situation, it can occur across very different drivers - for example the netronome driver using rtnl_lock() inside its debugfs files, and mac80211 removing a completely unrelated debugfs file within rtnl_lock(). > > > Similarly, nobody should be blocking in debugfs files, like we did > > in ours, but also smsdvb_stats_read(), crtc_crc_open() look like > > they could block for quite a while. > > Blocking in the debugfs files' fops shall be fine by itself, that's > why SRCU is used for the removal stuff. No, it isn't fine at all now! If I have a debugfs file - like the one I had - that could block for external events (firmware, in my case), then *any* other debugfs_remove() in the whole system would also block indefinitely. That's a major problem! The two other files listed above can also block waiting for external events, afaict. I'm told that there are more files in other wireless drivers too. Basically, even with SRCU being used, you cannot have blocking files. You have to treat everything as O_NONBLOCK, because if you don't a completely unrelated debugfs_remove() will block until the file produces data. IMHO that's completely unacceptable. > Yes, there's only one global srcu_struct for debugfs. So far this > hasn't been a problem and if I understand things correctly, it's also > not the problem at hand? If it really becomes an issue, we can very > well introduce per directory srcu_structs as you suggested. No, this is exactly the problem. If I have one blocking file, and remove any completely unrelated file elsewhere in the system, I need to wait for the blocking file to have produced data. That just doesn't scale. Using a separate SRCU subsystem per directory will go some way, but it would still be useful to have lockdep annotations there. Ultimately, I'm not sure I see why one couldn't just have a reader/writer lock per *file*, which would be the ultimate granularity to solve this. Obviously, a blocking file has to be aborted before being removed itself, but there's nothing that says that you can't remove any other file - even from the same directory - while this one is in a blocking read. > Let me have a look > - whether holding the RTNL lock while removing the debugfs files is >   actually needed and > - whether there is an easy way to spot similar scenarios and emit >   a warning for them. > > If this doesn't solve the problem, I'll have to think of a different > way to fix this... It solves - imho with unnecessary hardship on the caller of debugfs_remove() - only half of the problem, namely the real deadlock. It does nothing for the "blocking debugfs file" live-lock where forward progress can be made as soon as the file has some data available - which in practice in my scenario never happened as the data producer was completely idle. > > (*) before removing first first we'd obviously wake up and thereby > > more or less terminate the readers first > > With the current implementation, I can't see an easy way to identify > the tasks blocking on a particular debugfs file. But maybe this is > resolvable and the way to go here... No, this is unrelated. If I write a blocking debugfs file, then *of course* I need to take care that before remove it I wake up all the readers. But that's easy because I control how that file produces data, so I can wake up the waitq and tell my own code inside the debugfs read to return 0 (EOF). This would be entirely inappropriate as a general solution because you'd have to kill the userspace process or something... johannes