On Tue, Nov 15, 2016 at 02:01:54PM +0100, Peter Zijlstra wrote: > On Tue, Nov 15, 2016 at 08:33:37PM +0800, Boqun Feng wrote: > > Hi Peter, > > > > On Mon, Nov 14, 2016 at 06:39:53PM +0100, Peter Zijlstra wrote: > > [...] > > > +/* > > > + * Similar to atomic_dec_and_test(), it will BUG on underflow and fail to > > > + * decrement when saturated at UINT_MAX. > > > + * > > > + * Provides release memory ordering, such that prior loads and stores are done > > > + * before a subsequent free. > > > > I'm not sure this is correct, the RELEASE semantics is for the STORE > > part of cmpxchg, and semantically it will guarantee that memory > > operations after cmpxchg won't be reordered upwards, for example, on > > ARM64, the following code: > > > > WRITE_ONCE(x, 1) > > > > atomic_cmpxchg_release(&a, 1, 2); > > r1 = ll(&a) > > if (r1 == 1) { > > sc_release(&a, 2); > > } > > > > free() > > > > could be reordered as, I think: > > > > atomic_cmpxchg_release(&a, 1, 2); > > r1 = ll(&a) > > if (r1 == 1) { > > free() > > WRITE_ONCE(x, 1) > > sc_release(&a, 2); > > } > > > > Of course, we need to wait for Will to confirm about this. But if this > > could happen, we'd better to use a smp_mb()+atomic_cmpxchg_relaxed() > > here and for other refcount_dec_and_*(). > > Can't happen I think because of the control dependency between > dec_and_test() and free(). > > That is, the cmpxchg_release() must complete to determine if it was > successful or it needs a retry. The success, combined with the state of > the variable will then determine if we call free(). > The thing is that determination of the variable's state(i.e. store_release() succeeds) and the actual writeback to memory are two separate events. So yes, free() won't execute before store_release() commits successfully, but there is no barrier here to order the memory effects of store_release() and free(). See a similar example: https://marc.info/?l=linux-s390&m=146604339321723&w=2 But as I said, we actually only need the pairing of orderings: 1) load part of cmpxchg -> free() 2) object accesses -> store part of cmpxchg Ordering #1 can be achieved via control dependency as you pointed out that free()s very much includes stores. And ordering #2 can be achieved with RELEASE. So the code is right, I just thought the comment may be misleading. The reason we use cmpxchg_release() is just for achieving ordering #2, and not to order "prior loads and stores" with "a subsequent free". Am I missing some subtle orderings here? Regards, Boqun > So I don't think we can get free() (which very much includes stores) to > happen before the store-release.