From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1162463AbbKTKKA (ORCPT ); Fri, 20 Nov 2015 05:10:00 -0500
Received: from bombadil.infradead.org ([198.137.202.9]:35270 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751454AbbKTKJ4 (ORCPT );
	Fri, 20 Nov 2015 05:09:56 -0500
Date: Fri, 20 Nov 2015 11:09:35 +0100
From: Peter Zijlstra
To: Will Deacon
Cc: "Paul E. McKenney" , Linus Torvalds , Boqun Feng ,
	Oleg Nesterov , Ingo Molnar , Linux Kernel Mailing List ,
	Jonathan Corbet , Michal Hocko , David Howells ,
	Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras
Subject: Re: [PATCH 4/4] locking: Introduce smp_cond_acquire()
Message-ID: <20151120100935.GB17308@twins.programming.kicks-ass.net>
References: <20151112070915.GC6314@fixme-laptop.cn.ibm.com>
	<20151116155658.GW17308@twins.programming.kicks-ass.net>
	<20151116160445.GK11639@twins.programming.kicks-ass.net>
	<20151116162452.GD1999@arm.com>
	<20151117115109.GD28649@arm.com>
	<20151117210109.GY5184@linux.vnet.ibm.com>
	<20151118112514.GC1588@arm.com>
	<20151119180151.GF1616@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151119180151.GF1616@arm.com>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 19, 2015 at 06:01:52PM +0000, Will Deacon wrote:
> For completeness, here's what I've currently got. I've failed to measure
> any performance impact on my 8-core systems, but that's not surprising.

> +static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
> +{
> +	unsigned int tmp;
> +	arch_spinlock_t lockval;
> 
> +	asm volatile(
> +"	sevl\n"
> +"1:	wfe\n"

Using WFE here would lower the cacheline bouncing pressure a bit, I imagine.
Sure, we still pull it over into S(hared) after every store, but we don't keep
banging on it, making the initial e(X)clusive grab hard.

> +"2:	ldaxr	%w0, %2\n"
> +"	eor	%w1, %w0, %w0, ror #16\n"
> +"	cbnz	%w1, 1b\n"
> +	ARM64_LSE_ATOMIC_INSN(
> +	/* LL/SC */
> +"	stxr	%w1, %w0, %2\n"
> +	/* Serialise against any concurrent lockers */
> +"	cbnz	%w1, 2b\n",
> +	/* LSE atomics */
> +"	nop\n"
> +"	nop\n")

I find these ARM64_LSE macro thingies aren't always easy to read; it's fairly
easy to overlook the ',' separating the v8 and v8.1 parts, especially if you
have further interleaving comments like in the above.

> +	: "=&r" (lockval), "=&r" (tmp), "+Q" (*lock)
> +	:
> +	: "memory");
> +}
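For anyone following along: the eor/ror #16 test above checks the arm64 ticket
lock encoding, where the owner and next tickets live in the two half-words of
the lock word and the lock is free exactly when they are equal. A rough,
portable C11 model of just that test (the model_* names are mine, the wfe/sevl
waiting and the stxr store-back are deliberately elided):

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdatomic.h>

/* Hypothetical model of the arm64 ticket lock word: owner in the low
 * 16 bits, next in the high 16 bits. */
typedef struct { _Atomic uint32_t val; } model_spinlock_t;

static inline bool model_lock_is_free(uint32_t v)
{
	/* "eor %w1, %w0, %w0, ror #16": XOR the word with itself
	 * rotated by 16; the result is zero iff owner == next. */
	uint32_t rot = (v >> 16) | (v << 16);
	return (v ^ rot) == 0;
}

static void model_spin_unlock_wait(model_spinlock_t *lock)
{
	uint32_t v;

	do {
		/* ldaxr: acquire load of the lock word (the WFE-based
		 * low-power wait is elided in this model). */
		v = atomic_load_explicit(&lock->val, memory_order_acquire);
	} while (!model_lock_is_free(v));
}
```

Note the real routine also stores the loaded value back (stxr, or relies on the
LSE lockers' behaviour) to serialise against in-flight lockers; this sketch
only models the free/held test.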
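To make the comma pitfall concrete, here is a compile-time stand-in (my
hypothetical LSE_ATOMIC_INSN/PICK_LSE names; the real ARM64_LSE_ATOMIC_INSN
selects between the two sequences with runtime ALTERNATIVE() patching, not the
preprocessor). The only thing separating the LL/SC and LSE variants is the ','
buried among the string literals and comments:

```c
/* Pick the LL/SC argument when 0, the LSE argument when 1. */
#define PICK_LSE 0

#if PICK_LSE
#define LSE_ATOMIC_INSN(llsc, lse)	lse
#else
#define LSE_ATOMIC_INSN(llsc, lse)	llsc
#endif

static const char *insn_seq =
	"2:	ldaxr	%w0, %2\n"
	"	eor	%w1, %w0, %w0, ror #16\n"
	"	cbnz	%w1, 1b\n"
	LSE_ATOMIC_INSN(
	/* LL/SC */
	"	stxr	%w1, %w0, %2\n"
	"	cbnz	%w1, 2b\n",	/* <- this ',' splits the two variants */
	/* LSE atomics */
	"	nop\n"
	"	nop\n");
```

With PICK_LSE at 0 the assembled string contains the stxr/cbnz tail and no
nops; flipping it to 1 swaps them, and it is easy to mis-scan which comment
belongs to which argument.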