From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755705AbdKBQCu (ORCPT ); Thu, 2 Nov 2017 12:02:50 -0400 Received: from bombadil.infradead.org ([65.50.211.133]:56420 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750858AbdKBQCt (ORCPT ); Thu, 2 Nov 2017 12:02:49 -0400 Date: Thu, 2 Nov 2017 17:02:37 +0100 From: Peter Zijlstra To: Alan Stern Cc: "Reshetova, Elena" , "linux-kernel@vger.kernel.org" , "gregkh@linuxfoundation.org" , "keescook@chromium.org" , "tglx@linutronix.de" , "mingo@redhat.com" , "ishkamiel@gmail.com" , Will Deacon , Paul McKenney , parri.andrea@gmail.com, boqun.feng@gmail.com, dhowells@redhat.com, david@fromorbit.com Subject: Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t Message-ID: <20171102160237.t2xkryg6joskf77y@hirez.programming.kicks-ass.net> References: <20171102135742.7o4urtltgvhr6dku@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: NeoMutt/20170609 (1.8.3) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Nov 02, 2017 at 11:40:35AM -0400, Alan Stern wrote: > On Thu, 2 Nov 2017, Peter Zijlstra wrote: > > > > Lock functions such as refcount_dec_and_lock() & > > > refcount_dec_and_mutex_lock() Provide exactly the same guarantees as > > > they atomic counterparts. > > > > Nope. The atomic_dec_and_lock() provides smp_mb() while > > refcount_dec_and_lock() merely orders all prior load/store's against all > > later load/store's. > > In fact there is no guaranteed ordering when refcount_dec_and_lock() > returns false; It should provide a release: - if !=1, dec_not_one will provide release - if ==1, dec_not_one will no-op, but then we'll acquire the lock and dec_and_test will provide the release, even if the test fails and we unlock again it should still dec. The one exception is when the counter is saturated, but in that case we'll never free the object and the ordering is moot in any case. > it provides ordering only if the return value is true. > In which case it provides acquire ordering (thanks to the spin_lock), > and both release ordering and a control dependency (thanks to the > refcount_dec_and_test). > > > The difference is subtle and involves at least 3 CPUs. I can't seem to > > write up anything simple, keeps turning into monsters :/ Will, Paul, > > have you got anything simple around? > > The combination of acquire + release is not the same as smp_mb, because acquire+release is nothing, its release+acquire that I meant which should order things locally, but now that you've got me looking at it again, we don't in fact do that. So refcount_dec_and_lock() will provide a release, irrespective of the return value (assuming we're not saturated). If it returns true, it also does an acquire for the lock. But combined they're acquire+release, which is unfortunate.. it means the lock section and the refcount stuff overlaps, but I don't suppose that's actually a problem. Need to consider more.