From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.5 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_PASS,USER_AGENT_MUTT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C2C8C43381 for ; Sun, 17 Mar 2019 00:49:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3D66F218E0 for ; Sun, 17 Mar 2019 00:49:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726726AbfCQAtY (ORCPT ); Sat, 16 Mar 2019 20:49:24 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:45686 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726571AbfCQAtY (ORCPT ); Sat, 16 Mar 2019 20:49:24 -0400 Received: from pps.filterd (m0098414.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x2H0mbLZ058267 for ; Sat, 16 Mar 2019 20:49:22 -0400 Received: from e15.ny.us.ibm.com (e15.ny.us.ibm.com [129.33.205.205]) by mx0b-001b2d01.pphosted.com with ESMTP id 2r9afwtgjx-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Sat, 16 Mar 2019 20:49:22 -0400 Received: from localhost by e15.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Sun, 17 Mar 2019 00:49:21 -0000 Received: from b01cxnp22035.gho.pok.ibm.com (9.57.198.25) by e15.ny.us.ibm.com (146.89.104.202) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Sun, 17 Mar 2019 00:49:19 -0000 Received: from b01ledav003.gho.pok.ibm.com (b01ledav003.gho.pok.ibm.com [9.57.199.108]) by b01cxnp22035.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id x2H0nIY313566168 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Sun, 17 Mar 2019 00:49:18 GMT Received: from b01ledav003.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 422ECB2067; Sun, 17 Mar 2019 00:49:18 +0000 (GMT) Received: from b01ledav003.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 15DE7B2066; Sun, 17 Mar 2019 00:49:18 +0000 (GMT) Received: from paulmck-ThinkPad-W541 (unknown [9.70.82.188]) by b01ledav003.gho.pok.ibm.com (Postfix) with ESMTP; Sun, 17 Mar 2019 00:49:18 +0000 (GMT) Received: by paulmck-ThinkPad-W541 (Postfix, from userid 1000) id 767DF16C3407; Sat, 16 Mar 2019 17:50:05 -0700 (PDT) Date: Sat, 16 Mar 2019 17:50:05 -0700 From: "Paul E. McKenney" To: Al Viro Cc: Eric Biggers , "Tobin C. Harding" , linux-fsdevel@vger.kernel.org Subject: Re: dcache locking question Reply-To: paulmck@linux.ibm.com References: <20190314225632.GB15813@eros.localdomain> <20190314231939.GA17269@eros.localdomain> <20190315015021.GU2217@ZenIV.linux.org.uk> <20190315173819.GB77949@gmail.com> <20190315185455.GA2217@ZenIV.linux.org.uk> <20190316223128.GV4102@linux.ibm.com> <20190317001840.GF2217@ZenIV.linux.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190317001840.GF2217@ZenIV.linux.org.uk> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 x-cbid: 19031700-0068-0000-0000-000003A6D25C X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00010770; HX=3.00000242; KW=3.00000007; PH=3.00000004; SC=3.00000281; SDB=6.01175416; UDB=6.00614716; IPR=6.00956089; MB=3.00026012; MTD=3.00000008; XFM=3.00000015; UTC=2019-03-17 00:49:20 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 19031700-0069-0000-0000-000047D7DD3D Message-Id: <20190317005005.GY4102@linux.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-03-17_01:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=761 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1903170005 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org On Sun, Mar 17, 2019 at 12:18:40AM +0000, Al Viro wrote: > On Sat, Mar 16, 2019 at 03:31:28PM -0700, Paul E. McKenney wrote: > > > Paul, could you comment on that one? The function in question is > > > this: > > > static struct dentry *__lock_parent(struct dentry *dentry) > > > { > > > struct dentry *parent; > > > rcu_read_lock(); > > > spin_unlock(&dentry->d_lock); > > > again: > > > parent = READ_ONCE(dentry->d_parent); > > > spin_lock(&parent->d_lock); > > > /* > > > * We can't blindly lock dentry until we are sure > > > * that we won't violate the locking order. > > > * Any changes of dentry->d_parent must have > > > * been done with parent->d_lock held, so > > > * spin_lock() above is enough of a barrier > > > * for checking if it's still our child. > > > */ > > > if (unlikely(parent != dentry->d_parent)) { > > > > We are talking about the above access to dentry->d_parent, correct? > > > > Given that dentry->d_parent isn't updated except when holding > > dentry->d_lock, and given that locks do what they are supposed to, > > both in terms of memory ordering and mutual exclusion, the value of > > dentry->d_parent must be constant throughout the current critical section. > > It therefore doesn't matter what strange code the compiler generates, > > it will get the same value regardless. > > > > So, no, there is absolutely no point in doing this: > > > > if (unlikely(parent != READ_ONCE(dentry->d_parent))) { > > > > Or did I miss the point of this discussion? > > dentry->d_parent is *NOT* guaranteed to be stable here - we are not > holding dentry->d_lock. So it can change; what it can't do is to > change to or from the pointer we have in 'parent'. Color me blind to "dentry" vs. "parent". :-/ > IOW, the value of dentry->d_parent isn't stable; the result of > comparison, OTOH, is. > > I also think that READ_ONCE would be useless here - reordering > can't happen due to compiler barrier in spin_lock() and memory > barriers should be provided by ACQUIRE/RELEASE on parent->d_lock. > > We are observing a result of some store. If we fetch the value > equal to 'parent', that store had been under parent->d_lock *AND* > any subsequent store changing the value of dentry->d_parent away > from 'parent' would also have been under parent->d_lock. So > either it hasn't happened yet, or we would've observed its > results, since we are holding parent->lock ourselves. > > If we fetch the value not equal to 'parent', any subsequent store > making dentry->d_parent equal to parent would've been done under > parent->d_lock, so again, it either has not happened yet, or we > would've observed its results. > > FWIW, the whole point here is that we can't grab the lock sufficient > to stabilize dentry->d_parent here - not until we'd verified that > dentry still has the same parent. We could play with trylock, but > AFAICS the trick above is sufficient (and gets fewer retries). I can come up with sequences of values that could mimic the value of 'parent', but that assumes load tearing. I -have- seen stores of constant values be torn, but not stores of runtime-variable values and not loads. Still, such tearing is permitted, and including the READ_ONCE() is making it easier for things like thread sanitizers. In addition, the READ_ONCE() makes it clear that the value being loaded is unstable, which can be useful documentation. And yes, I am a bit paranoid about this sort of thing. Something about talking to a lot of compiler writers and standards committee members... But at the end of the day, you are the maintainer of this code. > What I'm not certain about is whether we need READ_ONCE() on the > other fetch in there. Suppose that gets replaced with plain > assignment; what problems could that cause? It obviously can't > get reordered past the spin_lock() - the address of spinlock > is derived from the value we fetch. And in case of retry... > wouldn't spin_unlock(parent->d_lock) we do there supply a compiler > barrier? Plus all callers to a spin_trylock() before invoking this function, which does prevent the compiler from repeating the read (no reason to in this case) or fusing with some other read. Of course, I still like the documentation and sanitizer-tooling benefits. > I realize that READ_ONCE() gives smp_read_barrier_depends(), but > do we need it here? We need smp_read_barrier_depends() if we dereference the loaded pointer using some operation that did not imply memory ordering, which does happen, for example, in ecryptfs_symlink(). But before the caller has a chance to do that, there is that spin_lock_nested(), which provides the needed ordering. Thanx, Paul