From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from fieldses.org ([173.255.197.46]:52446 "EHLO fieldses.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1727268AbeHKPJa (ORCPT ); Sat, 11 Aug 2018 11:09:30 -0400
Date: Sat, 11 Aug 2018 08:35:26 -0400
From: "J. Bruce Fields"
To: Jeff Layton
Cc: NeilBrown, Alexander Viro, Martin Wilck,
	linux-fsdevel@vger.kernel.org, Frank Filz,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/5 - V2] locks: avoid thundering-herd wake-ups
Message-ID: <20180811123526.GB15848@fieldses.org>
References: <153378012255.1220.6754153662007899557.stgit@noble>
	<20180809173245.GM23873@fieldses.org>
	<87lg9frxyc.fsf@notabene.neil.brown.name>
	<20180810002922.GA3915@fieldses.org>
	<871sb7rnul.fsf@notabene.neil.brown.name>
	<20180810025251.GO23873@fieldses.org>
	<87y3derjut.fsf@notabene.neil.brown.name>
	<20180810154742.GE7906@fieldses.org>
	<0f198c62b057ab7d796746144d458835a6c7433e.camel@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <0f198c62b057ab7d796746144d458835a6c7433e.camel@kernel.org>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Sat, Aug 11, 2018 at 07:56:25AM -0400, Jeff Layton wrote:
> FWIW, I did a bit of testing with lockperf tests that I had written on
> an earlier rework of this code:
>
> https://git.samba.org/jlayton/linux.git/?p=jlayton/lockperf.git;a=summary
>
>
> The posix01 and flock01 tests in there show about a 10x speedup with
> this set in place.
>
> I think something closer to Neil's design will end up being what we want
> here. Consider the relatively common case where you have a whole-file
> POSIX write lock held with a bunch of different waiters blocked on it
> (all whole file write locks with different owners):
>
> With Neil's patches, you will just wake up a single waiter when the
> blocked lock is released, as they would all be in a long chain of
> waiters.
Right, but you still need to walk the whole tree to make sure that it's
the only one you need to wake.  The tree structure means that you know
all the other locks have non-overlapping ranges, but it doesn't tell you
the lock owners.

Maybe there's some reasonable way to rule out the shared-lockowner case
more quickly too.  I haven't thought about that much.

> If you keep all the locks in a single list, you'll either have to:
>
> a) wake up all the waiters on the list when the lock comes free: no lock
> is held at that point so none of them will conflict.
>
> ...or...
>
> b) keep track of what waiters have already been awoken, and compare any
> further candidate for waking against the current set of held locks and
> any lock requests by waiters that you just woke.

Instead of keeping track of *every* waiter that you've woken, you could
keep track of some subset.  Worst case that just means waking more
processes than you need to, which is wasteful but correct.

In the common case that you give, that subset could just be "the first
waiter you wake".  You'd get the same result.

The every-waiter-a-whole-file-write-lock case is pretty easy.  To
benefit from the tree you need a case where some of the waiters overlap
and some don't.  Might be worth it, sure.

--b.

> b seems more expensive as you have to walk over a larger set of locks
> on every change.
> --
> Jeff Layton
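
The "track only a subset of woken waiters" idea from the reply above can be sketched as a toy model in Python. This is an illustration only, not the kernel's fs/locks.c code and not anything from the patch set: `LockRequest`, `conflicts`, `pick_waiters_to_wake` and the `track_limit` parameter are all hypothetical names, and the conflict rule is a simplified assumption (different owners, overlapping inclusive ranges, at least one write lock). A waiter that is woken unnecessarily is assumed to retry and block again, which is what makes over-waking wasteful but still correct.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LockRequest:
    """Hypothetical stand-in for a byte-range lock request."""
    owner: str
    start: int
    end: int      # inclusive end of the range
    write: bool   # True for a write (exclusive) lock, False for a read lock


def conflicts(a: LockRequest, b: LockRequest) -> bool:
    """Simplified conflict rule assumed for this sketch."""
    if a.owner == b.owner:
        return False                      # same owner never conflicts
    if a.end < b.start or b.end < a.start:
        return False                      # ranges do not overlap
    return a.write or b.write             # read/read is compatible


def pick_waiters_to_wake(waiters: List[LockRequest],
                         held: List[LockRequest],
                         track_limit: Optional[int] = 1) -> List[LockRequest]:
    """Walk a single FIFO list of waiters and choose which ones to wake.

    Only the first `track_limit` woken waiters are remembered and compared
    against later candidates. `track_limit=None` remembers every woken
    waiter (option b in the quoted text); `track_limit=1` remembers just
    "the first waiter you wake".
    """
    tracked: List[LockRequest] = []
    woken: List[LockRequest] = []
    for w in waiters:
        if any(conflicts(w, h) for h in held):
            continue                      # still blocked by a held lock
        if any(conflicts(w, t) for t in tracked):
            continue                      # would lose to a tracked waiter
        woken.append(w)
        if track_limit is None or len(tracked) < track_limit:
            tracked.append(w)
    return woken


if __name__ == "__main__":
    # Every waiter wants a whole-file write lock, each with its own owner.
    whole_file = [LockRequest(o, 0, 2**63 - 1, True) for o in ("a", "b", "c")]
    print(len(pick_waiters_to_wake(whole_file, held=[])))

    # Mixed case: "b" and "c" overlap each other but not "a".
    mixed = [LockRequest("a", 0, 9, True),
             LockRequest("b", 20, 29, True),
             LockRequest("c", 20, 29, True)]
    print(len(pick_waiters_to_wake(mixed, held=[], track_limit=1)))
    print(len(pick_waiters_to_wake(mixed, held=[], track_limit=None)))
```

In the whole-file case, remembering only the first woken waiter already limits the wake-up to one process. In the mixed case it also wakes "c", which will then find "b" in its way; remembering every woken waiter avoids that at the cost of more comparisons per candidate, which is the trade-off discussed in the thread.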