From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail.kernel.org ([198.145.29.99]:49852 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1727267AbeHKOa0 (ORCPT ); Sat, 11 Aug 2018 10:30:26 -0400
Message-ID: <0f198c62b057ab7d796746144d458835a6c7433e.camel@kernel.org>
Subject: Re: [PATCH 0/5 - V2] locks: avoid thundering-herd wake-ups
From: Jeff Layton
To: "J. Bruce Fields", NeilBrown
Cc: Alexander Viro, Martin Wilck, linux-fsdevel@vger.kernel.org,
        Frank Filz, linux-kernel@vger.kernel.org
Date: Sat, 11 Aug 2018 07:56:25 -0400
In-Reply-To: <20180810154742.GE7906@fieldses.org>
References: <153378012255.1220.6754153662007899557.stgit@noble>
        <20180809173245.GM23873@fieldses.org>
        <87lg9frxyc.fsf@notabene.neil.brown.name>
        <20180810002922.GA3915@fieldses.org>
        <871sb7rnul.fsf@notabene.neil.brown.name>
        <20180810025251.GO23873@fieldses.org>
        <87y3derjut.fsf@notabene.neil.brown.name>
        <20180810154742.GE7906@fieldses.org>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Fri, 2018-08-10 at 11:47 -0400, J. Bruce Fields wrote:
> On Fri, Aug 10, 2018 at 01:17:14PM +1000, NeilBrown wrote:
> > On Thu, Aug 09 2018, J. Bruce Fields wrote:
> >
> > > On Fri, Aug 10, 2018 at 11:50:58AM +1000, NeilBrown wrote:
> > > > You're good at this game!
> > >
> > > Everybody's got to have a hobby, mine is pathological posix locking
> > > cases....
> > >
> > > > So, because a locker with the same "owner" gets a free pass, you can
> > > > *never* say that any lock which conflicts with A also conflicts with B,
> > > > as a lock with the same owner as B will never conflict with B, even
> > > > though it conflicts with A.
> > > >
> > > > I think there is still value in having the tree, but when a waiter is
> > > > attached under a new blocker, we need to walk the whole tree beneath the
> > > > waiter and detach/wake anything that is not blocked by the new blocker.
> > >
> > > If you're walking the whole tree every time then it might as well be a
> > > flat list, I think?
> >
> > The advantage of a tree is that it keeps over-lapping locks closer
> > together.
> > For it to make a difference you would need a load where lots of threads
> > were locking several different small ranges, and other threads were
> > locking large ranges that cover all the small ranges.
>
> OK, I'm not sure I understand, but I'll give another look at the next
> version....
>
> > I doubt this is common, but it doesn't seem as strange as other things
> > I've seen in the wild.
> > The other advantage, of course, is that I've already written the code,
> > and I like it.
> >
> > Maybe I'll do a simple-list version, then a patch to convert that to the
> > clever-tree version, and we can then have something concrete to assess.
>
> That might help, thanks.
>

FWIW, I did a bit of testing with lockperf tests that I had written on
an earlier rework of this code:

    https://git.samba.org/jlayton/linux.git/?p=jlayton/lockperf.git;a=summary

The posix01 and flock01 tests in there show about a 10x speedup with
this set in place. I think something closer to Neil's design will end up
being what we want here.

Consider the relatively common case where you have a whole-file POSIX
write lock held with a bunch of different waiters blocked on it (all
whole file write locks with different owners):

With Neil's patches, you will just wake up a single waiter when the
blocked lock is released, as they would all be in a long chain of
waiters.

If you keep all the locks in a single list, you'll either have to:

a) wake up all the waiters on the list when the lock comes free: no lock
is held at that point so none of them will conflict.

...or...

b) keep track of what waiters have already been awoken, and compare any
further candidate for waking against the current set of held locks and
any lock requests by waiters that you just woke.

b seems more expensive as you have to walk over a larger set of locks on
every change.
-- 
Jeff Layton
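
As a rough illustration of the comparison above, here is a toy Python
model that counts wake-ups for the whole-file write lock case. It is only
a sketch under stated assumptions, not the kernel's implementation and not
the patch set itself: the names Lock, conflicts, attach, try_lock,
wakeups_flat_list, wakeups_tree and whole_file_writers are all
hypothetical, "flat list" models only option (a), option (b) is not
modelled, and the same-owner re-attachment problem discussed earlier in
the thread is ignored.

```python
from dataclasses import dataclass, field

OFFSET_MAX = 2**63 - 1  # assumption: stands in for "end of file"


@dataclass
class Lock:
    owner: int
    start: int
    end: int  # inclusive
    write: bool
    # Requests blocked directly under this lock (used by the tree model only).
    waiters: list = field(default_factory=list)


def conflicts(a, b):
    """POSIX-style conflict: different owners, overlapping ranges, one writer."""
    if a.owner == b.owner:
        return False  # same owner always gets a free pass
    if a.end < b.start or b.end < a.start:
        return False
    return a.write or b.write


def wakeups_flat_list(requests):
    """Option (a): each release wakes every waiter on a single flat list."""
    held, waiting, wakeups = [], [], 0
    for r in requests:
        (waiting if any(conflicts(r, h) for h in held) else held).append(r)
    while waiting:
        held.pop(0)  # the oldest holder releases its lock
        woken, waiting = waiting, []
        wakeups += len(woken)
        for r in woken:  # every woken waiter retries; losers block again
            (waiting if any(conflicts(r, h) for h in held) else held).append(r)
    return wakeups


def attach(blocker, req):
    """Hang req under blocker, descending into a waiter that also conflicts."""
    for w in blocker.waiters:
        if conflicts(w, req):
            return attach(w, req)
    blocker.waiters.append(req)


def try_lock(held, req):
    for h in held:
        if conflicts(h, req):
            attach(h, req)
            return False
    held.append(req)
    return True


def wakeups_tree(requests):
    """Tree of waiters: a release wakes only the direct children of that lock."""
    held, wakeups = [], 0
    for r in requests:
        try_lock(held, r)
    while any(h.waiters for h in held):
        released = held.pop(0)
        woken, released.waiters = released.waiters, []
        wakeups += len(woken)
        for r in woken:  # a woken request keeps its own subtree of waiters
            try_lock(held, r)
    return wakeups


def whole_file_writers(count):
    """count whole-file write lock requests, each from a different owner."""
    return [Lock(owner=i, start=0, end=OFFSET_MAX, write=True) for i in range(count)]


if __name__ == "__main__":
    # One holder plus four waiters: flat list 4+3+2+1, tree 1+1+1+1.
    print(wakeups_flat_list(whole_file_writers(5)), wakeups_tree(whole_file_writers(5)))
    # prints "10 4"
```

In this model, n waiters on one whole-file write lock cost n(n+1)/2
wake-ups with the flat list and n with the chain, because each release
wakes exactly one waiter in the chain.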