* dcache locking question
@ 2019-03-14 22:56 Tobin C. Harding
  2019-03-14 23:09 ` Matthew Wilcox
  2019-03-14 23:19 ` Tobin C. Harding

From: Tobin C. Harding @ 2019-03-14 22:56 UTC (permalink / raw)
To: linux-fsdevel

Hi,

I'm not able to understand the locking order in dcache.  dcache has been
around for a while, so clearly it's right and I'm wrong.

Could someone please explain to me how the locking order commented at
the top of the file is not violated in the following?

From the top of fs/dcache.c:

 * If there is an ancestor relationship:
 * dentry->d_parent->...->d_parent->d_lock
 *   ...
 * dentry->d_parent->d_lock
 * dentry->d_lock

dentry_kill() appears to require the caller to hold dentry->d_lock, yet
it locks the parent with spin_trylock(&parent->d_lock); if this fails it
calls __lock_parent(), which releases dentry->d_lock before locking the
parent and re-acquiring [1] dentry->d_lock.  Is this not locking in two
different orders?  The same logic exists in lock_parent().

thanks,
Tobin.

[1] I do not fully understand the spin_lock_nested() macro.

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: dcache locking question

From: Matthew Wilcox @ 2019-03-14 23:09 UTC (permalink / raw)
To: Tobin C. Harding; +Cc: linux-fsdevel

On Fri, Mar 15, 2019 at 09:56:32AM +1100, Tobin C. Harding wrote:
> Could someone please explain to me how the locking order commented at
> the top of the file is not violated in the following:
>
> From top of fs/dcache.c
>
>  * If there is an ancestor relationship:
>  * dentry->d_parent->...->d_parent->d_lock
>  *   ...
>  * dentry->d_parent->d_lock
>  * dentry->d_lock
>
> dentry_kill() appears to require caller to hold the dentry->d_lock yet
> it locks the parent with spin_trylock(&parent->d_lock), if this
> fails it calls __lock_parent() which releases the dentry->d_lock before
> locking the parent and re-acquiring [1] the dentry->d_lock.  Is this not
> locking in two different orders?

What you're describing here is how we work around having to lock in the
wrong order.  There are "a few" places in the kernel where we do this.
Calling spin_trylock() won't deadlock -- it'll just return failure.  At
that point, we drop the child, spin waiting for the parent, then lock
the child.

> [1] I do not fully understand the spin_lock_nested() macro.

It describes to lockdep that, while it looks like we're acquiring a lock
that we already own, there's actually a hierarchy of locks that are in
the same lock class; we're attempting to acquire a lock in "subclass N".
N is allowed to be between 0 and 7, inclusive.
* Re: dcache locking question

From: Tobin C. Harding @ 2019-03-15  1:38 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: linux-fsdevel

On Thu, Mar 14, 2019 at 04:09:05PM -0700, Matthew Wilcox wrote:
[...]
> It describes to lockdep that, while it looks like we're acquiring a lock
> that we already own, there's actually a hierarchy of locks that are in
> the same lock class; we're attempting to acquire a lock in "subclass N".
> N is allowed to be between 0-7, inclusive.

ok, cool.  Thanks for taking the time to explain this.

Tobin
* Re: dcache locking question

From: Tobin C. Harding @ 2019-03-14 23:19 UTC (permalink / raw)
To: linux-fsdevel

On Fri, Mar 15, 2019 at 09:56:32AM +1100, Tobin C. Harding wrote:
> Could someone please explain to me how the locking order commented at
> the top of the file is not violated in the following:
[...]

This is a simpler example:

static struct dentry *dentry_kill(struct dentry *dentry)
	__releases(dentry->d_lock)
{
	...

slow_positive:
	spin_unlock(&dentry->d_lock);
	spin_lock(&inode->i_lock);
	spin_lock(&dentry->d_lock);
	parent = lock_parent(dentry);
	...

thanks,
Tobin.
* Re: dcache locking question

From: Al Viro @ 2019-03-15  1:50 UTC (permalink / raw)
To: Tobin C. Harding; +Cc: linux-fsdevel

On Fri, Mar 15, 2019 at 10:19:39AM +1100, Tobin C. Harding wrote:
> This is a simpler example:
>
> static struct dentry *dentry_kill(struct dentry *dentry)
> 	__releases(dentry->d_lock)
> {
> 	...
>
> slow_positive:
> 	spin_unlock(&dentry->d_lock);
> 	spin_lock(&inode->i_lock);
> 	spin_lock(&dentry->d_lock);
> 	parent = lock_parent(dentry);

Yes?  Why is that a problem?

We start with ->d_lock held.  We drop it.  At that point we are holding
a reference to dentry and no locks whatsoever.  We know that inode is
not going away (we are holding a reference to a positive dentry, so its
->d_inode can't change under us, and it contributes to the inode
refcount).  So we can safely grab its ->i_lock.  Then we grab (in the
normal order) ->d_lock.

Then, in lock_parent() we check if ->d_parent points to ourselves.
->d_parent is stabilized by ->d_lock, so that check is fine.  If it does
*not* point to ourselves, we do a trylock on its ->d_parent.  Also
fine - ->d_parent isn't going anywhere and trylock can't lead to
deadlocks; that's what the "try" in the name is about.

If it succeeds, we have
	* inode->i_lock held
	* dentry->d_lock held
	* parent->d_lock held
	* inode == dentry->d_inode
	* parent == dentry->d_parent
and we are fine, except that the reference we are holding might not be
the only existing one anymore (it used to be, but we'd dropped ->d_lock,
so additional ones might've appeared, and we need to repeat the
retain_dentry()-related checks).

If it fails, we call __lock_parent().  Which
	* grabs the RCU lock
	* drops ->d_lock (now we are not holding ->d_lock on anything)
	* fetches ->d_parent.  Note the READ_ONCE() there - it's *NOT*
	  stable (no ->d_lock held).  We can't expect that ->d_parent
	  won't change or that the reference it used to contribute to
	  the parent's refcount is there anymore; as a matter of fact,
	  the only thing that prevents outright _freeing_ of the object
	  'parent' points to is rcu_read_lock() and the RCU delay
	  between dropping the last reference and actual freeing of the
	  sucker.  rcu_read_lock() is there, though, which makes it safe
	  to grab ->d_lock on 'parent'.

That 'parent' might very well have nothing to do with our dentry by now.
We can check if it's equal to its ->d_parent, though.  dentry->d_parent
is *NOT* stable at that point.  It might be changing right now.

However, the first store to dentry->d_parent making it not equal to
parent would have been done under parent->d_lock.  And since we are
holding parent->d_lock, we won't miss that store.  We might miss
subsequent ones, but if we observe dentry->d_parent == parent, we know
that it's stable.  And if we see dentry->d_parent != parent, we know
that dentry has moved around and we need to retry anyway.

If we do *not* need to retry, we have
	* inode->i_lock held
	* parent->d_lock held
	* dentry->d_parent == parent
	* dentry->d_inode == inode
At that point we can drop rcu_read_lock() - parent can't be freed under
us without dentry->d_parent being switched from parent to something
else, and anyone trying to do that will spin on parent->d_lock.

And now we can grab dentry->d_lock - ordering is guaranteed.

The only minor twist is that once upon a time __lock_parent() used to
have to cope with the possibility of dentry becoming detached while we
were retrying; in that case we obviously don't want to grab
dentry->d_lock - we are holding it already (as parent->d_lock).  These
days (since "make non-exchanging __d_move() copy ->d_parent rather than
swap them") if IS_ROOT(dentry) is false at some point, it *never* will
become true, so this check could've been dropped, along with the checks
for the __lock_parent() return value possibly being NULL.

I'm not saying it's not subtle and nasty - any time you see READ_ONCE()
(or rcu_read_lock(), for that matter), expect headache.  But the trylock
is an absolutely innocuous part here; we are not even retrying it once.
* Re: dcache locking question

From: Eric Biggers @ 2019-03-15 17:38 UTC (permalink / raw)
To: Al Viro; +Cc: Tobin C. Harding, linux-fsdevel

On Fri, Mar 15, 2019 at 01:50:21AM +0000, Al Viro wrote:
[...]
> However, the first store to dentry->d_parent making it not equal to
> parent would have been done under parent->d_lock.  And since we are
> holding parent->d_lock, we won't miss that store.  We might miss
> subsequent ones, but if we observe dentry->d_parent == parent, we know
> that it's stable.  And if we see dentry->d_parent != parent, we know
> that dentry has moved around and we need to retry anyway.

Why isn't it necessary to use READ_ONCE(dentry->d_parent) here?

	if (unlikely(parent != dentry->d_parent)) {

Suppose 'parent' is 0xAAAABBBB, and 'dentry->d_parent' is 0xAAAAAAAA and
is concurrently changed to 0xBBBBBBBB.  d_parent could be read in two
parts, 0xAAAA then 0xBBBB, resulting in it appearing that
d_parent == 0xAAAABBBB == parent.

Yes, it won't really be compiled like that in practice, but I thought
the point of READ_ONCE() is to *guarantee* it's really done right...

- Eric
* Re: dcache locking question

From: Al Viro @ 2019-03-15 18:54 UTC (permalink / raw)
To: Eric Biggers; +Cc: Tobin C. Harding, linux-fsdevel, Paul E. McKenney

On Fri, Mar 15, 2019 at 10:38:23AM -0700, Eric Biggers wrote:
> Why isn't it necessary to use READ_ONCE(dentry->d_parent) here?
>
> 	if (unlikely(parent != dentry->d_parent)) {
>
> Suppose 'parent' is 0xAAAABBBB, and 'dentry->d_parent' is 0xAAAAAAAA
> and is concurrently changed to 0xBBBBBBBB.
>
> d_parent could be read in two parts, 0xAAAA then 0xBBBB, resulting in
> it appearing that d_parent == 0xAAAABBBB == parent.
>
> Yes it won't really be compiled as that in practice, but I thought the
> point of READ_ONCE() is to *guarantee* it's really done right...

READ_ONCE does not add any extra warranties of atomicity.  Fetches and
stores of pointers are atomic, period; if that ever breaks, we are in
very deep trouble all over the place.

What's more, spin_lock acts as a compiler barrier and, on SMP, is an
ACQUIRE operation.  So that second fetch of ->d_parent will happen after
we grab parent->d_lock, from everyone's POV.  Critical areas for the
same spinlock are ordered wrt each other.  So we have observed

	FETCH dentry->d_parent => parent
	LOCK parent->d_lock
	FETCH dentry->d_parent => parent

All stores to dentry->d_parent are done with ->d_lock held on dentry,
the old value of dentry->d_parent *and* the new value of
dentry->d_parent.  So the second fetch is ordered wrt all stores making
dentry->d_parent change to parent and all stores making it change *from*
parent.  We might miss some stores changing it from one value other than
parent to another such, but the predicate itself is fine and will stay
fine until we drop parent->d_lock.

Paul, could you comment on that one?  The function in question is this:

static struct dentry *__lock_parent(struct dentry *dentry)
{
	struct dentry *parent;
	rcu_read_lock();
	spin_unlock(&dentry->d_lock);
again:
	parent = READ_ONCE(dentry->d_parent);
	spin_lock(&parent->d_lock);
	/*
	 * We can't blindly lock dentry until we are sure
	 * that we won't violate the locking order.
	 * Any changes of dentry->d_parent must have
	 * been done with parent->d_lock held, so
	 * spin_lock() above is enough of a barrier
	 * for checking if it's still our child.
	 */
	if (unlikely(parent != dentry->d_parent)) {
		spin_unlock(&parent->d_lock);
		goto again;
	}
	rcu_read_unlock();
	if (parent != dentry)
		spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
	else
		parent = NULL;
	return parent;
}

(in fs/dcache.c) and all stores to ->d_parent are guaranteed to be done
under ->d_lock on dentry itself and ->d_lock on both the old and new
values of ->d_parent.
* Re: dcache locking question

From: Paul E. McKenney @ 2019-03-16 22:31 UTC (permalink / raw)
To: Al Viro; +Cc: Eric Biggers, Tobin C. Harding, linux-fsdevel

On Fri, Mar 15, 2019 at 06:54:55PM +0000, Al Viro wrote:
[...]
> READ_ONCE does not add any extra warranties of atomicity.  Fetches and
> stores of pointers are atomic, period; if that ever breaks, we are in
> very deep trouble all over the place.

Stepping back from the specifics of this particular function for a
moment, it would be unusual for a compiler to tear a pointer load or
store, with non-tearing of pointer stores assuming values of pointers
aren't known at compile time.  Compilers can and do fuse both loads and
stores, and they can and do repeat loads.  Having gotten that off my
chest...  ;-)

> Paul, could you comment on that one?  The function in question is this:
> static struct dentry *__lock_parent(struct dentry *dentry)
> {
[...]
> 	if (unlikely(parent != dentry->d_parent)) {

We are talking about the above access to dentry->d_parent, correct?

Given that dentry->d_parent isn't updated except when holding
dentry->d_lock, and given that locks do what they are supposed to, both
in terms of memory ordering and mutual exclusion, the value of
dentry->d_parent must be constant throughout the current critical
section.  It therefore doesn't matter what strange code the compiler
generates, it will get the same value regardless.

So, no, there is absolutely no point in doing this:

	if (unlikely(parent != READ_ONCE(dentry->d_parent))) {

Or did I miss the point of this discussion?

							Thanx, Paul
* Re: dcache locking question

From: Al Viro @ 2019-03-17  0:18 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: Eric Biggers, Tobin C. Harding, linux-fsdevel

On Sat, Mar 16, 2019 at 03:31:28PM -0700, Paul E. McKenney wrote:
[...]
> Given that dentry->d_parent isn't updated except when holding
> dentry->d_lock, and given that locks do what they are supposed to,
> both in terms of memory ordering and mutual exclusion, the value of
> dentry->d_parent must be constant throughout the current critical
> section.
[...]
> Or did I miss the point of this discussion?

dentry->d_parent is *NOT* guaranteed to be stable here - we are not
holding dentry->d_lock.  So it can change; what it can't do is change to
or from the pointer we have in 'parent'.  IOW, the value of
dentry->d_parent isn't stable; the result of the comparison, OTOH, is.

I also think that READ_ONCE would be useless here - reordering can't
happen due to the compiler barrier in spin_lock(), and memory barriers
should be provided by ACQUIRE/RELEASE on parent->d_lock.

We are observing the result of some store.  If we fetch the value equal
to 'parent', that store had been done under parent->d_lock *AND* any
subsequent store changing the value of dentry->d_parent away from
'parent' would also have been under parent->d_lock.  So either it hasn't
happened yet, or we would've observed its results, since we are holding
parent->d_lock ourselves.

If we fetch the value not equal to 'parent', any subsequent store making
dentry->d_parent equal to parent would've been done under
parent->d_lock, so again, it either has not happened yet, or we would've
observed its results.

FWIW, the whole point here is that we can't grab a lock sufficient to
stabilize dentry->d_parent here - not until we've verified that dentry
still has the same parent.  We could play with trylock, but AFAICS the
trick above is sufficient (and gets fewer retries).

What I'm not certain about is whether we need READ_ONCE() on the other
fetch in there.  Suppose that gets replaced with a plain assignment;
what problems could that cause?  It obviously can't get reordered past
the spin_lock() - the address of the spinlock is derived from the value
we fetch.  And in case of a retry... wouldn't the
spin_unlock(parent->d_lock) we do there supply a compiler barrier?

I realize that READ_ONCE() gives smp_read_barrier_depends(), but do we
need it here?
* Re: dcache locking question

From: Paul E. McKenney @ 2019-03-17  0:50 UTC (permalink / raw)
To: Al Viro; +Cc: Eric Biggers, Tobin C. Harding, linux-fsdevel

On Sun, Mar 17, 2019 at 12:18:40AM +0000, Al Viro wrote:
> dentry->d_parent is *NOT* guaranteed to be stable here - we are not
> holding dentry->d_lock.  So it can change; what it can't do is change
> to or from the pointer we have in 'parent'.

Color me blind to "dentry" vs. "parent".  :-/

[...]
> FWIW, the whole point here is that we can't grab a lock sufficient to
> stabilize dentry->d_parent here - not until we've verified that dentry
> still has the same parent.  We could play with trylock, but AFAICS the
> trick above is sufficient (and gets fewer retries).

I can come up with sequences of values that could mimic the value of
'parent', but that assumes load tearing.  I -have- seen stores of
constant values be torn, but not stores of runtime-variable values, and
not loads.  Still, such tearing is permitted, and including the
READ_ONCE() makes things easier for tools like thread sanitizers.  In
addition, the READ_ONCE() makes it clear that the value being loaded is
unstable, which can be useful documentation.

And yes, I am a bit paranoid about this sort of thing.  Something about
talking to a lot of compiler writers and standards committee members...
But at the end of the day, you are the maintainer of this code.

> What I'm not certain about is whether we need READ_ONCE() on the other
> fetch in there.  Suppose that gets replaced with a plain assignment;
> what problems could that cause?  It obviously can't get reordered past
> the spin_lock() - the address of the spinlock is derived from the
> value we fetch.  And in case of a retry... wouldn't the
> spin_unlock(parent->d_lock) we do there supply a compiler barrier?

Plus all callers do a spin_trylock() before invoking this function,
which does prevent the compiler from repeating the read (no reason to in
this case) or fusing it with some other read.  Of course, I still like
the documentation and sanitizer-tooling benefits.

> I realize that READ_ONCE() gives smp_read_barrier_depends(), but do we
> need it here?

We need smp_read_barrier_depends() if we dereference the loaded pointer
using some operation that does not imply memory ordering, which does
happen, for example, in ecryptfs_symlink().  But before the caller has a
chance to do that, there is that spin_lock_nested(), which provides the
needed ordering.

							Thanx, Paul
* Re: dcache locking question

From: James Bottomley @ 2019-03-17  2:20 UTC (permalink / raw)
To: paulmck, Al Viro; +Cc: Eric Biggers, Tobin C. Harding, linux-fsdevel

On Sat, 2019-03-16 at 17:50 -0700, Paul E. McKenney wrote:
[...]
> I -have- seen stores of constant values be torn, but not stores of
> runtime-variable values, and not loads.  Still, such tearing is
> permitted, and including the READ_ONCE() makes things easier for tools
> like thread sanitizers.  In addition, the READ_ONCE() makes it clear
> that the value being loaded is unstable, which can be useful
> documentation.

Um, just so I'm clear, because this assumption permeates all our code:
load or store tearing can never occur if we're doing a load or store of
a 32 bit value which is naturally aligned.  Where "naturally aligned" is
within the gift of the CPU to determine, but which the compiler or
kernel will always ensure for us unless we pack the structure or
deliberately misalign the allocation.

James
* Re: dcache locking question

From: Al Viro @ 2019-03-17  3:06 UTC (permalink / raw)
To: James Bottomley; +Cc: paulmck, Eric Biggers, Tobin C. Harding, linux-fsdevel

On Sat, Mar 16, 2019 at 07:20:20PM -0700, James Bottomley wrote:
> Um, just so I'm clear, because this assumption permeates all our code:
> load or store tearing can never occur if we're doing a load or store of
> a 32 bit value which is naturally aligned.

Wait a sec; are there any 64bit architectures where the same is not
guaranteed for dereferencing a properly aligned void **?  If that's the
case, I can think of quite a few places that are rather dubious, and I
don't see how READ_ONCE() could help in those - e.g. if an architecture
only has 32bit loads, rcu list traversals are not going to be doable
without one hell of an extra headache.
* Re: dcache locking question 2019-03-17 3:06 ` Al Viro @ 2019-03-17 4:23 ` James Bottomley 2019-03-18 0:35 ` Paul E. McKenney 0 siblings, 1 reply; 17+ messages in thread From: James Bottomley @ 2019-03-17 4:23 UTC (permalink / raw) To: Al Viro; +Cc: paulmck, Eric Biggers, Tobin C. Harding, linux-fsdevel On Sun, 2019-03-17 at 03:06 +0000, Al Viro wrote: > On Sat, Mar 16, 2019 at 07:20:20PM -0700, James Bottomley wrote: > > On Sat, 2019-03-16 at 17:50 -0700, Paul E. McKenney wrote: > > [...] > > > I -have- seen stores of constant values be torn, but not stores > > > of runtime-variable values and not loads. Still, such tearing is > > > permitted, and including the READ_ONCE() is making it easier for > > > things like thread sanitizers. In addition, the READ_ONCE() > > > makes it clear that the value being loaded is unstable, which can > > > be useful documentation. > > > > Um, just so I'm clear, because this assumption permeates all our > > code: load or store tearing can never occur if we're doing load or > > store of a 32 bit value which is naturally aligned. Where > > naturally aligned is within the gift of the CPU to determine but > > which the compiler or kernel will always ensure for us unless we > > pack the structure or deliberately misalign the allocation. > > Wait a sec; are there any 64bit architectures where the same is not > guaranteed for dereferencing properly aligned void **? Yes, naturally aligned void * dereference shouldn't tear either. I was just using 32 bit as my example because 64 bit accesses will tear on 32 bit architectures but 64 bit naturally aligned accesses shouldn't tear on 64 bit architectures. However, since we can't guarantee the 64 bitness of the architecture 32 bit or void * is our gold standard for not tearing. James > If that's the case, I can think of quite a few places that are rather > dubious, and I don't see how READ_ONCE() could help in those - e.g. 

> if an architecture only has 32bit loads, rcu list traversals are > not going to be doable without one hell of an extra headache. > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: dcache locking question 2019-03-17 4:23 ` James Bottomley @ 2019-03-18 0:35 ` Paul E. McKenney 2019-03-18 16:26 ` James Bottomley 0 siblings, 1 reply; 17+ messages in thread From: Paul E. McKenney @ 2019-03-18 0:35 UTC (permalink / raw) To: James Bottomley; +Cc: Al Viro, Eric Biggers, Tobin C. Harding, linux-fsdevel On Sat, Mar 16, 2019 at 09:23:16PM -0700, James Bottomley wrote: > On Sun, 2019-03-17 at 03:06 +0000, Al Viro wrote: > > On Sat, Mar 16, 2019 at 07:20:20PM -0700, James Bottomley wrote: > > > On Sat, 2019-03-16 at 17:50 -0700, Paul E. McKenney wrote: > > > [...] > > > > I -have- seen stores of constant values be torn, but not stores > > > > of runtime-variable values and not loads. Still, such tearing is > > > > permitted, and including the READ_ONCE() is making it easier for > > > > things like thread sanitizers. In addition, the READ_ONCE() > > > > makes it clear that the value being loaded is unstable, which can > > > > be useful documentation. > > > > > > Um, just so I'm clear, because this assumption permeates all our > > > code: load or store tearing can never occur if we're doing load or > > > store of a 32 bit value which is naturally aligned. Where > > > naturally aligned is within the gift of the CPU to determine but > > > which the compiler or kernel will always ensure for us unless we > > > pack the structure or deliberately misalign the allocation. A non-volatile store of certain 32-bit constants can and does tear on some architectures. These architectures would be the ones with a store-immediate instruction with a small immediate field, and where the 32-bit constant is such that a pair of 16-bit immediate store instructions can store that value. There was a bug in an old version of GCC where even volatile 32-bit stores of these constants would tear. They did fix the bug, but it took some time to find a GCC person who understood that this was in fact a bug. 
Hence my preference for READ_ONCE() and WRITE_ONCE() for data-racing loads and stores. > > Wait a sec; are there any 64bit architectures where the same is not > > guaranteed for dereferencing properly aligned void **? > > Yes, naturally aligned void * dereference shouldn't tear either. I > was just using 32 bit as my example because 64 bit accesses will tear > on 32 bit architectures but 64 bit naturally aligned accesses shouldn't > tear on 64 bit architectures. However, since we can't guarantee the 64 > bitness of the architecture 32 bit or void * is our gold standard for > not tearing. For stores of quantities not known at compile time, agreed. But that same store-immediate situation could happen on 64-bit systems. > James > > > > If that's the case, I can think of quite a few places that are rather > > dubious, and I don't see how READ_ONCE() could help in those - e.g. > > if an architecture only has 32bit loads, rcu list traversals are > > not going to be doable without one hell of an extra headache. All the 64-bit systems that run the Linux kernel do have 64-bit load instructions and rcu_dereference() uses READ_ONCE() internally, so we should be fine with RCU list traversals. Thanx, Paul ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: dcache locking question 2019-03-18 0:35 ` Paul E. McKenney @ 2019-03-18 16:26 ` James Bottomley 2019-03-18 17:11 ` Paul E. McKenney 0 siblings, 1 reply; 17+ messages in thread From: James Bottomley @ 2019-03-18 16:26 UTC (permalink / raw) To: paulmck; +Cc: Al Viro, Eric Biggers, Tobin C. Harding, linux-fsdevel On Sun, 2019-03-17 at 17:35 -0700, Paul E. McKenney wrote: > On Sat, Mar 16, 2019 at 09:23:16PM -0700, James Bottomley wrote: > > On Sun, 2019-03-17 at 03:06 +0000, Al Viro wrote: > > > On Sat, Mar 16, 2019 at 07:20:20PM -0700, James Bottomley wrote: > > > > On Sat, 2019-03-16 at 17:50 -0700, Paul E. McKenney wrote: > > > > [...] > > > > > I -have- seen stores of constant values be torn, but not > > > > > stores of runtime-variable values and not loads. Still, such > > > > > tearing is permitted, and including the READ_ONCE() is making > > > > > it easier for things like thread sanitizers. In addition, > > > > > the READ_ONCE() makes it clear that the value being loaded is > > > > > unstable, which can be useful documentation. > > > > > > > > Um, just so I'm clear, because this assumption permeates all > > > > our code: load or store tearing can never occur if we're doing > > > > load or store of a 32 bit value which is naturally > > > > aligned. Where naturally aligned is within the gift of the CPU > > > > to determine but which the compiler or kernel will always > > > > ensure for us unless we pack the structure or deliberately > > > > misalign the allocation. > > A non-volatile store of certain 32-bit constants can and does tear > on some architectures. These architectures would be the ones with a > store-immediate instruction with a small immediate field, and where > the 32-bit constant is such that a pair of 16-bit immediate store > instructions can store that value. 
Understood: PA-RISC is one such architecture: our ldil (load immediate long) can only take 21 bits of immediate data and you have to use a second instruction (ldo) to get the remaining 11 bits. However, the compiler guarantees no tearing in memory visibility for PA by doing the ldil/ldo sequence on a register and then writing the register to memory which I believe is an architectural guarantee. > There was a bug in an old version of GCC where even volatile 32-bit > stores of these constants would tear. They did fix the bug, but it > took some time to find a GCC person who understood that this was in > fact a bug. > > Hence my preference for READ_ONCE() and WRITE_ONCE() for data-racing > loads and stores. OK, but didn't everyone eventually agree this was a compiler bug? > > > Wait a sec; are there any 64bit architectures where the same is > > > not guaranteed for dereferencing properly aligned void **? > > > > Yes, naturally aligned void * dereference shouldn't tear > > either. I was just using 32 bit as my example because 64 bit > > accesses will tear on 32 bit architectures but 64 bit naturally > > aligned accesses shouldn't tear on 64 bit architectures. However, > > since we can't guarantee the 64 bitness of the architecture 32 bit > > or void * is our gold standard for not tearing. > > For stores of quantities not known at compile time, agreed. But > that same store-immediate situation could happen on 64-bit systems. > > > James > > > > > > If that's the case, I can think of quite a few places that are > > > rather dubious, and I don't see how READ_ONCE() could help in > > > those - e.g. if an architecture only has 32bit loads, rcu list > > > traversals are not going to be doable without one hell of an > > > extra headache. > > All the 64-bit systems that run the Linux kernel do have 64-bit load > instructions and rcu_dereference() uses READ_ONCE() internally, so we > should be fine with RCU list traversals. 
I really don't think it's possible to get the same immediate constant tearing bug on 64 bit. If you look at PA, we have no 64 bit equivalent of the ldil/ldo pair so all 64 bit immediate stores come straight from the global data table via a register, so no tearing. I bet every 64 bit architecture has a similar approach because 64 bit immediate data just requires too many bits to stuff into an instruction pair. James ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: dcache locking question 2019-03-18 16:26 ` James Bottomley @ 2019-03-18 17:11 ` Paul E. McKenney 2019-03-19 15:45 ` Paul E. McKenney 0 siblings, 1 reply; 17+ messages in thread From: Paul E. McKenney @ 2019-03-18 17:11 UTC (permalink / raw) To: James Bottomley; +Cc: Al Viro, Eric Biggers, Tobin C. Harding, linux-fsdevel On Mon, Mar 18, 2019 at 09:26:18AM -0700, James Bottomley wrote: > On Sun, 2019-03-17 at 17:35 -0700, Paul E. McKenney wrote: > > On Sat, Mar 16, 2019 at 09:23:16PM -0700, James Bottomley wrote: > > > On Sun, 2019-03-17 at 03:06 +0000, Al Viro wrote: > > > > On Sat, Mar 16, 2019 at 07:20:20PM -0700, James Bottomley wrote: > > > > > On Sat, 2019-03-16 at 17:50 -0700, Paul E. McKenney wrote: > > > > > [...] > > > > > > I -have- seen stores of constant values be torn, but not > > > > > > stores of runtime-variable values and not loads. Still, such > > > > > > tearing is permitted, and including the READ_ONCE() is making > > > > > > it easier for things like thread sanitizers. In addition, > > > > > > the READ_ONCE() makes it clear that the value being loaded is > > > > > > unstable, which can be useful documentation. > > > > > > > > > > Um, just so I'm clear, because this assumption permeates all > > > > > our code: load or store tearing can never occur if we're doing > > > > > load or store of a 32 bit value which is naturally > > > > > aligned. Where naturally aligned is within the gift of the CPU > > > > > to determine but which the compiler or kernel will always > > > > > ensure for us unless we pack the structure or deliberately > > > > > misalign the allocation. > > > > A non-volatile store of certain 32-bit constants can and does tear > > on some architectures. These architectures would be the ones with a > > store-immediate instruction with a small immediate field, and where > > the 32-bit constant is such that a pair of 16-bit immediate store > > instructions can store that value. 
> > Understood: PA-RISC is one such architecture: our ldil (load immediate > long) can only take 21 bits of immediate data and you have to use a > second instruction (ldo) to get the remaining 11 bits. However, the > compiler guarantees no tearing in memory visibility for PA by doing the > ldil/ldo sequence on a register and then writing the register to memory > which I believe is an architectural guarantee. Good to know, thank you! > > There was a bug in an old version of GCC where even volatile 32-bit > > stores of these constants would tear. They did fix the bug, but it > > took some time to find a GCC person who understood that this was in > > fact a bug. > > > > Hence my preference for READ_ONCE() and WRITE_ONCE() for data-racing > > loads and stores. > > OK, but didn't everyone eventually agree this was a compiler bug? They did agree, but only in the case where the store was volatile, as in WRITE_ONCE(), and -not- in the case of a plain store. At least the kernel doesn't make general use of vector instructions. If it did, I would not be surprised to see compilers use three 32-bit vector stores to store to a 32-bit int adjacent to a 64-bit pointer. :-/ Thanx, Paul > > > > Wait a sec; are there any 64bit architectures where the same is > > > > not guaranteed for dereferencing properly aligned void **? > > > > > > Yes, naturally aligned void * dereference shouldn't tear > > > either. I was just using 32 bit as my example because 64 bit > > > accesses will tear on 32 bit architectures but 64 bit naturally > > > aligned accesses shouldn't tear on 64 bit architectures. However, > > > since we can't guarantee the 64 bitness of the architecture 32 bit > > > or void * is our gold standard for not tearing. > > > > For stores of quantities not known at compile time, agreed. But > > that same store-immediate situation could happen on 64-bit systems. 
> > > > > James > > > > > > > > > > If that's the case, I can think of quite a few places that are > > > > rather dubious, and I don't see how READ_ONCE() could help in > > > > those - e.g. if an architecture only has 32bit loads, rcu list > > > > traversals are not going to be doable without one hell of an > > > > extra headache. > > > > All the 64-bit systems that run the Linux kernel do have 64-bit load > > instructions and rcu_dereference() uses READ_ONCE() internally, so we > > should be fine with RCU list traversals. > > I really don't think it's possible to get the same immediate constant > tearing bug on 64 bit. If you look at PA, we have no 64 bit > equivalent of the ldil/ldo pair so all 64 bit immediate stores come > straight from the global data table via a register, so no tearing. I > bet every 64 bit architecture has a similar approach because 64 bit > immediate data just requires too many bits to stuff into an instruction > pair. > > James > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: dcache locking question 2019-03-18 17:11 ` Paul E. McKenney @ 2019-03-19 15:45 ` Paul E. McKenney 0 siblings, 0 replies; 17+ messages in thread From: Paul E. McKenney @ 2019-03-19 15:45 UTC (permalink / raw) To: James Bottomley; +Cc: Al Viro, Eric Biggers, Tobin C. Harding, linux-fsdevel On Mon, Mar 18, 2019 at 10:11:06AM -0700, Paul E. McKenney wrote: > On Mon, Mar 18, 2019 at 09:26:18AM -0700, James Bottomley wrote: > > On Sun, 2019-03-17 at 17:35 -0700, Paul E. McKenney wrote: > > > On Sat, Mar 16, 2019 at 09:23:16PM -0700, James Bottomley wrote: > > > > On Sun, 2019-03-17 at 03:06 +0000, Al Viro wrote: > > > > > On Sat, Mar 16, 2019 at 07:20:20PM -0700, James Bottomley wrote: > > > > > > On Sat, 2019-03-16 at 17:50 -0700, Paul E. McKenney wrote: > > > > > > [...] > > > > > > > I -have- seen stores of constant values be torn, but not > > > > > > > stores of runtime-variable values and not loads. Still, such > > > > > > > tearing is permitted, and including the READ_ONCE() is making > > > > > > > it easier for things like thread sanitizers. In addition, > > > > > > > the READ_ONCE() makes it clear that the value being loaded is > > > > > > > unstable, which can be useful documentation. > > > > > > > > > > > > Um, just so I'm clear, because this assumption permeates all > > > > > > our code: load or store tearing can never occur if we're doing > > > > > > load or store of a 32 bit value which is naturally > > > > > > aligned. Where naturally aligned is within the gift of the CPU > > > > > > to determine but which the compiler or kernel will always > > > > > > ensure for us unless we pack the structure or deliberately > > > > > > misalign the allocation. > > > > > > A non-volatile store of certain 32-bit constants can and does tear > > > on some architectures. 
These architectures would be the ones with a > > > store-immediate instruction with a small immediate field, and where > > > the 32-bit constant is such that a pair of 16-bit immediate store > > > instructions can store that value. > > > > Understood: PA-RISC is one such architecture: our ldil (load immediate > > long) can only take 21 bits of immediate data and you have to use a > > second instruction (ldo) to get the remaining 11 bits. However, the > > compiler guarantees no tearing in memory visibility for PA by doing the > > ldil/ldo sequence on a register and then writing the register to memory > > which I believe is an architectural guarantee. > > Good to know, thank you! > > > > There was a bug in an old version of GCC where even volatile 32-bit > > > stores of these constants would tear. They did fix the bug, but it > > > took some time to find a GCC person who understood that this was in > > > fact a bug. > > > > > > Hence my preference for READ_ONCE() and WRITE_ONCE() for data-racing > > > loads and stores. > > > > OK, but didn't everyone eventually agree this was a compiler bug? > > They did agree, but only in the case where the store was volatile, > as in WRITE_ONCE(), and -not- in the case of a plain store. > > At least the kernel doesn't make general use of vector instructions. > If it did, I would not be surprised to see compilers use three 32-bit > vector stores to store to a 32-bit int adjacent to a 64-bit pointer. :-/ And it turns out that the CPU architecture in question was x86-64, for whatever that is worth. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55981 (There is also a later bug report dealing strictly with volatile, but my search-engine skills are failing me this morning.) Thanx, Paul > > > > > Wait a sec; are there any 64bit architectures where the same is > > > > > not guaranteed for dereferencing properly aligned void **? > > > > > > > > Yes, naturally aligned void * dereference shouldn't tear > > > > either. 
I was just using 32 bit as my example because 64 bit > > > > accesses will tear on 32 bit architectures but 64 bit naturally > > > > aligned accesses shouldn't tear on 64 bit architectures. However, > > > > since we can't guarantee the 64 bitness of the architecture 32 bit > > > > or void * is our gold standard for not tearing. > > > > > > For stores of quantities not known at compile time, agreed. But > > > that same store-immediate situation could happen on 64-bit systems. > > > > > > > James > > > > > > > > > > > > > If that's the case, I can think of quite a few places that are > > > > > rather dubious, and I don't see how READ_ONCE() could help in > > > > > those - e.g. if an architecture only has 32bit loads, rcu list > > > > > traversals are not going to be doable without one hell of an > > > > > extra headache. > > > > > > All the 64-bit systems that run the Linux kernel do have 64-bit load > > > instructions and rcu_dereference() uses READ_ONCE() internally, so we > > > should be fine with RCU list traversals. > > > > I really don't think it's possible to get the same immediate constant > > tearing bug on 64 bit. If you look at PA, we have no 64 bit > > equivalent of the ldil/ldo pair so all 64 bit immediate stores come > > straight from the global data table via a register, so no tearing. I > > bet every 64 bit architecture has a similar approach because 64 bit > > immediate data just requires too many bits to stuff into an instruction > > pair. > > > > James > > ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2019-03-19 15:45 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-03-14 22:56 dcache locking question Tobin C. Harding 2019-03-14 23:09 ` Matthew Wilcox 2019-03-15 1:38 ` Tobin C. Harding 2019-03-14 23:19 ` Tobin C. Harding 2019-03-15 1:50 ` Al Viro 2019-03-15 17:38 ` Eric Biggers 2019-03-15 18:54 ` Al Viro 2019-03-16 22:31 ` Paul E. McKenney 2019-03-17 0:18 ` Al Viro 2019-03-17 0:50 ` Paul E. McKenney 2019-03-17 2:20 ` James Bottomley 2019-03-17 3:06 ` Al Viro 2019-03-17 4:23 ` James Bottomley 2019-03-18 0:35 ` Paul E. McKenney 2019-03-18 16:26 ` James Bottomley 2019-03-18 17:11 ` Paul E. McKenney 2019-03-19 15:45 ` Paul E. McKenney