From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from zeniv.linux.org.uk ([195.92.253.2]:49062 "EHLO
	ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750997AbcBGUBF (ORCPT );
	Sun, 7 Feb 2016 15:01:05 -0500
Date: Sun, 7 Feb 2016 20:01:00 +0000
From: Al Viro
To: Mike Marshall
Cc: Linus Torvalds, linux-fsdevel, Stephen Rothwell
Subject: [RFC] bufmap-related wait logics (Re: Orangefs ABI documentation)
Message-ID: <20160207200100.GB17997@ZenIV.linux.org.uk>
References: <20160124001615.GT17997@ZenIV.linux.org.uk>
	<20160124040529.GX17997@ZenIV.linux.org.uk>
	<20160130173413.GE17997@ZenIV.linux.org.uk>
	<20160130182731.GF17997@ZenIV.linux.org.uk>
	<20160206194210.GX17997@ZenIV.linux.org.uk>
	<20160207013835.GY17997@ZenIV.linux.org.uk>
	<20160207035331.GZ17997@ZenIV.linux.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160207035331.GZ17997@ZenIV.linux.org.uk>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Sun, Feb 07, 2016 at 03:53:31AM +0000, Al Viro wrote:

> BTW, could you try to reproduce that WARN_ON with these two patches added
> and with bufmap debugging turned on?  Both double-free and lack of rewinding
> are real; I can see scenarios where they would trigger, and I'm pretty sure
> that the latter is triggering in your reproducer.  Moreover, I'm absolutely
> sure that spurious dropping of bufmap references is happening there; what I'm
> not sure is whether it was on this double-free or on something else...

AFAICS, with bufmap we have 6 kinds of events -
	1) daemon installs a bufmap
	2) daemon shuts down
	3) wait_for_direct_io() requests a read/write slot
	4) orangefs_readdir() requests a readdir slot
	5) wait_for_direct_io() releases a slot
	6) orangefs_readdir() releases a slot
and the whole thing can be described via two counters and two waitqueues.

Rules:
	Initially C1 = C2 = -1
(1)
	if C1 >= 0
		sod off, we'd already installed that thing
	else
		C1 = number of read/write slots
		wake up that many of those who wait on Q1
		C2 = number of readdir slots
		wake up that many of those who wait on Q2
(2)
	C1 -= number of read/write slots + 1
	C2 -= number of readdir slots + 1
	wait on Q1 for C1 == -1
	wait on Q2 for C2 == -1
(3)
	if C1 <= 0
		end = now + 15 minutes
		while true
			if C1 < 0
				interruptibly wait on Q1 for (C1 > 0) up to
					min(end - now, 30s)
				if C1 < 0
					return -ETIMEDOUT
			else
				interruptibly wait on Q1 for (C1 > 0) up to
					end - now, exclusive
			if C1 > 0
				break
			if signal arrived
				return -EINTR
			if now after end
				return -ETIMEDOUT
	C1--, and grab a slot in read/write slots bitmap
(5)
	release a slot in bitmap; C1++; wake up Q1
(4,6)
	same as (3,5) with s/C1/C2/, s/Q1/Q2/

I'd probably use Q1.lock for serializing C1 and Q2.lock for C2; the only
obstacle is the lack of timeout versions of
wait_event_interruptible{,_exclusive}_locked() (and obscene identifier length
of such beasts, of course).

The really annoying thing is that it's very similar to a couple of counting
semaphores; a home-grown wait primitive is almost always a Bad Idea(tm) and
if somebody sees a sane way to cobble that out of higher-level ones, I'd very
much prefer that.  Suggestions?
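
For illustration only, here is a rough userspace model of rules (1), (2), (3) and (5) for a single counter/queue pair, with a pthread mutex and condition variable standing in for Q.lock and the waitqueue. It is a sketch under assumptions, not the orangefs code: the names slot_counter, slots_install(), slots_get(), slots_put() and slots_shutdown() are hypothetical, -EBUSY for the "already installed" case is an assumed error code, the 15 minute and 30s limits are passed in as millisecond parameters, and the slot bitmap, signal handling (-EINTR) and exclusive wakeups are left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* One counter/waitqueue pair (C1/Q1 or C2/Q2); all names are hypothetical. */
struct slot_counter {
	pthread_mutex_t lock;	/* stands in for Q.lock */
	pthread_cond_t q;	/* stands in for the waitqueue Q */
	int c;			/* -1: no bufmap installed */
};
#define SLOT_COUNTER_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, -1 }

static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_REALTIME, &ts);
	return ts.tv_sec * 1000LL + ts.tv_nsec / 1000000;
}

/* Caller holds s->lock.  Sleep until c > 0 or the absolute deadline passes. */
static void wait_positive_until(struct slot_counter *s, long long deadline)
{
	struct timespec ts = {
		.tv_sec = (time_t)(deadline / 1000),
		.tv_nsec = (long)(deadline % 1000) * 1000000L,
	};

	while (s->c <= 0)
		if (pthread_cond_timedwait(&s->q, &s->lock, &ts) == ETIMEDOUT)
			break;
}

/* Rule (1): install, unless something is already installed. */
static int slots_install(struct slot_counter *s, int nslots)
{
	int ret = 0;

	assert(nslots > 0);
	pthread_mutex_lock(&s->lock);
	if (s->c >= 0) {
		ret = -EBUSY;	/* assumed error code */
	} else {
		s->c = nslots;
		pthread_cond_broadcast(&s->q);
	}
	pthread_mutex_unlock(&s->lock);
	return ret;
}

/* Rule (2): take the slots away and wait for the users to drain. */
static void slots_shutdown(struct slot_counter *s, int nslots)
{
	pthread_mutex_lock(&s->lock);
	s->c -= nslots + 1;
	while (s->c != -1)
		pthread_cond_wait(&s->q, &s->lock);
	pthread_mutex_unlock(&s->lock);
}

/* Rule (3), minus signals and the bitmap; the timeouts are parameters here. */
static int slots_get(struct slot_counter *s, long long total_ms,
		     long long nodaemon_ms)
{
	int ret = 0;

	pthread_mutex_lock(&s->lock);
	if (s->c <= 0) {
		long long end = now_ms() + total_ms;

		for (;;) {
			if (s->c < 0) {		/* no daemon: short wait */
				long long t = now_ms() + nodaemon_ms;

				wait_positive_until(s, t < end ? t : end);
				if (s->c < 0) {
					ret = -ETIMEDOUT;
					break;
				}
			} else {		/* daemon up, slots all busy */
				wait_positive_until(s, end);
			}
			if (s->c > 0)
				break;
			if (now_ms() >= end) {
				ret = -ETIMEDOUT;
				break;
			}
		}
	}
	if (!ret)
		s->c--;
	pthread_mutex_unlock(&s->lock);
	return ret;
}

/* Rule (5): give the slot back and wake the waiters. */
static void slots_put(struct slot_counter *s)
{
	pthread_mutex_lock(&s->lock);
	s->c++;
	pthread_cond_broadcast(&s->q);
	pthread_mutex_unlock(&s->lock);
}
```

The model broadcasts on every release because slot requesters and a daemon in shutdown sleep on the same condition variable; a kernel version would presumably be pickier about whom it wakes. A caller would use slots_get(&s, 15 * 60 * 1000, 30 * 1000) to match the limits in rule (3).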