From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751936AbcEUHiA (ORCPT ); Sat, 21 May 2016 03:38:00 -0400 Received: from merlin.infradead.org ([205.233.59.134]:40691 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751639AbcEUHh7 (ORCPT ); Sat, 21 May 2016 03:37:59 -0400 Date: Sat, 21 May 2016 09:37:39 +0200 From: Peter Zijlstra To: Davidlohr Bueso Cc: Linus Torvalds , Boqun Feng , Manfred Spraul , Waiman Long , Ingo Molnar , ggherdovich@suse.com, Mel Gorman , Linux Kernel Mailing List , Paul McKenney , Will Deacon Subject: Re: sem_lock() vs qspinlocks Message-ID: <20160521073739.GB15728@worktop.ger.corp.intel.com> References: <20160520053926.GC31084@linux-uzut.site> <20160520115819.GF3193@twins.programming.kicks-ass.net> <20160520140533.GA20726@insomnia> <20160520152149.GH3193@twins.programming.kicks-ass.net> <20160520160436.GQ3205@twins.programming.kicks-ass.net> <20160520210618.GK3193@twins.programming.kicks-ass.net> <20160521004839.GA28231@linux-uzut.site> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160521004839.GA28231@linux-uzut.site> User-Agent: Mutt/1.5.22.1 (2013-10-16) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, May 20, 2016 at 05:48:39PM -0700, Davidlohr Bueso wrote: > On Fri, 20 May 2016, Linus Torvalds wrote: > > > >Oh, I definitely agree on the stable part, and yes, the "splt things > >up" model should come later if people agree that it's a good thing. > > The backporting part is quite nice, yes, but ultimately I think I prefer > Linus' suggestion making things explicit, as opposed to consulting the spinlock > implying barriers. I also hate to have an smp_mb() (particularly for spin_is_locked) > given that we are not optimizing for the common case (regular mutual excl). I'm confused; we _are_ optimizing for the common case. spin_is_locked() is very unlikely to be used. And arguably should be used less in favour of lockdep_assert_held(). > As opposed to spin_is_locked(), spin_unlock_wait() is perhaps more tempting > to use for locking correctness. For example, taking a look at nf_conntrack_all_lock(), > it too likes to get smart with spin_unlock_wait() -- also for finer graining purposes. > While not identical to sems, it goes like: > > nf_conntrack_all_lock(): nf_conntrack_lock(): > spin_lock(B); spin_lock(A); > > if (bar) { // false > bar = 1; ... > } > [loop ctrl-barrier] > spin_unlock_wait(A); > foo(); foo(); > > If the spin_unlock_wait() doesn't yet see the store that makes A visibly locked, > we could end up with both threads in foo(), no?. (Although I'm unsure about that > ctrl barrier and archs could fall into it. The point was to see in-tree examples > of creative thinking with locking). I'm tempted to put that trailing smp_rmb() in spin_unlock_wait() too; because I suspect the netfilter code is broken without it. And it seems intuitive to assume that if we return from unlock_wait() we can indeed observe the critical section we waited on. Something a little like so; but then for _all_ implementations. --- include/asm-generic/qspinlock.h | 6 ++++++ include/linux/compiler.h | 13 ++++++++----- 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h index 6bd05700d8c9..2f2eddd3e1f9 100644 --- a/include/asm-generic/qspinlock.h +++ b/include/asm-generic/qspinlock.h @@ -135,6 +135,12 @@ static inline void queued_spin_unlock_wait(struct qspinlock *lock) smp_mb(); while (atomic_read(&lock->val) & _Q_LOCKED_MASK) cpu_relax(); + + /* + * Match the RELEASE of the spin_unlock() we just observed. Thereby + * ensuring we observe the whole critical section that ended. + */ + smp_acquire__after_ctrl_dep(); } #ifndef virt_spin_lock diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 793c0829e3a3..3c4bc8160947 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -304,21 +304,24 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s __u.__val; \ }) +/* + * A control dependency provides a LOAD->STORE order, the additional RMB + * provides LOAD->LOAD order, together they provide LOAD->{LOAD,STORE} order, + * aka. ACQUIRE. + */ +#define smp_acquire__after_ctrl_dep() smp_rmb() + /** * smp_cond_acquire() - Spin wait for cond with ACQUIRE ordering * @cond: boolean expression to wait for * * Equivalent to using smp_load_acquire() on the condition variable but employs * the control dependency of the wait to reduce the barrier on many platforms. - * - * The control dependency provides a LOAD->STORE order, the additional RMB - * provides LOAD->LOAD order, together they provide LOAD->{LOAD,STORE} order, - * aka. ACQUIRE. */ #define smp_cond_acquire(cond) do { \ while (!(cond)) \ cpu_relax(); \ - smp_rmb(); /* ctrl + rmb := acquire */ \ + smp_acquire__after_ctrl_dep(); \ } while (0) #endif /* __KERNEL__ */