From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.5 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_PASS,USER_AGENT_MUTT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A51BC43381 for ; Fri, 22 Mar 2019 20:27:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5F4F821841 for ; Fri, 22 Mar 2019 20:27:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727519AbfCVU1w (ORCPT ); Fri, 22 Mar 2019 16:27:52 -0400 Received: from fieldses.org ([173.255.197.46]:52760 "EHLO fieldses.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727373AbfCVU1w (ORCPT ); Fri, 22 Mar 2019 16:27:52 -0400 Received: by fieldses.org (Postfix, from userid 2815) id 4E43C1CE6; Fri, 22 Mar 2019 16:27:51 -0400 (EDT) Date: Fri, 22 Mar 2019 16:27:51 -0400 From: "J. Bruce Fields" To: NeilBrown Cc: Jeff Layton , linux-fsdevel@vger.kernel.org, asn@redhat.com Subject: Re: [PATCH] locks: ignore same lock in blocked_lock_hash Message-ID: <20190322202751.GB10961@fieldses.org> References: <20190321112744.7832-1-jlayton@kernel.org> <20190322004233.GA28329@fieldses.org> <87sgvf6560.fsf@notabene.neil.brown.name> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87sgvf6560.fsf@notabene.neil.brown.name> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org On Fri, Mar 22, 2019 at 12:58:47PM +1100, NeilBrown wrote: > On Thu, Mar 21 2019, J. Bruce Fields wrote: > > > On Thu, Mar 21, 2019 at 07:27:44AM -0400, Jeff Layton wrote: > >> Andreas reported that he was seeing the tdbtorture test fail in > >> some cases with -EDEADLCK when it wasn't before. Some debugging > >> showed that deadlock detection was sometimes discovering the > >> caller's lock request itself in a dependency chain. > >> > >> If posix_locks_deadlock() fails to find a deadlock, the caller_fl > >> will be passed to __locks_insert_block(), and this wakes up all > >> locks that are blocked on caller_fl, clearing the fl_blocker link. > >> > >> So if posix_locks_deadlock() finds caller_fl while searching for > >> a deadlock, > > > > I'm feeling dense. Could you step me through the scenario in a little > > more detail? > > Three processes: A, B, C. One byte in one file: X > > A gets a read lock on X. Doesn't block, so nothing goes in the hash > table. > > B gets a read lock on X. Doesn't block. > > C tries to get a write lock on X. It cannot so it can block on either A > or B. It chooses B. It check if this might cause a deadlock, but the > hash is empty so it doesn't. It gets added to the hash. "C -> B" > > A tries to get a write lock on X. It cannot because B has a read lock, > so it needs to block on B, or some child. C is a child and A's > request conflicts with C's request, so A blocks on C. > It first checks: Does A->C cause a loop in the hash-table? No. So > it proceeds to block and adds "A->C" to the hash table. > > B drops its read lock. This wakes up the first child: C. > > C tries (again) to get a write lock on X. It cannot because A still > holds a read lock, so it will have to block on A. First it must > check if "C->A" will cause a loop in the hash-table ... Yes it will > because "A->C" is already there - error! > > This is a false-positive because the very next thing to be done after > posix_locks_deadlock() (while still holding all spinlocks), is > __locks_insert_block(C->A) which calls __locks_wake_up_blocks(C) which > calls __locks_delete_block() for any pending block ??->C and so will > remove A->C from the hash. > > Maybe we could call __locks_wake_up_block() *before* calling > posix_locks_deadlock(). That *might* make more sense. > > > > > > Also, how do we know this catches every such case? > > The only links in the hash table that are about to be removed are those > that record that something is blocked by C (the lock we are trying to > get). So these are the only ones that it is reasonable to special-case > in posix_locks_deadlock(). > > > > > And why aren't we unhashing blocks when we wake them up? > > We do - that is exactly when we unhash them. The problem is they we > wake them up *after* we check for errors, rather than before. Thanks! OK, somehow I overlooked __locks_wake_up_blocks->__locks_delete_block->locks_delete_global_block. And I missed that all this was happening while still under the blocked_lock_lock, so we don't have to worry about other tasks leaving locks in this state. It's possible I'm still being a little dense, but the new check could really use a careful comment (maybe even as detailed as the above). --b.