From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933652AbcJQJWu (ORCPT ); Mon, 17 Oct 2016 05:22:50 -0400
Received: from bombadil.infradead.org ([198.137.202.9]:45857 "EHLO
	bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932949AbcJQJWo (ORCPT );
	Mon, 17 Oct 2016 05:22:44 -0400
Date: Mon, 17 Oct 2016 11:22:26 +0200
From: Peter Zijlstra
To: Will Deacon
Cc: Linus Torvalds, Waiman Long, Jason Low, Ding Tianhong,
	Thomas Gleixner, Ingo Molnar, Imre Deak,
	Linux Kernel Mailing List, Davidlohr Bueso, Tim Chen, Terry Rudd,
	"Paul E. McKenney", Jason Low, Chris Wilson, Daniel Vetter
Subject: Re: [PATCH -v4 5/8] locking/mutex: Add lock handoff to avoid starvation
Message-ID: <20161017092226.GM3117@twins.programming.kicks-ass.net>
References: <20161007145243.361481786@infradead.org>
	<20161007150211.196801561@infradead.org>
	<20161013151447.GA13138@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161013151447.GA13138@arm.com>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 13, 2016 at 04:14:47PM +0100, Will Deacon wrote:
> > +	if (__owner_task(owner)) {
> > +		if (handoff && unlikely(__owner_task(owner) == current)) {
> > +			/*
> > +			 * Provide ACQUIRE semantics for the lock-handoff.
> > +			 *
> > +			 * We cannot easily use load-acquire here, since
> > +			 * the actual load is a failed cmpxchg, which
> > +			 * doesn't imply any barriers.
> > +			 *
> > +			 * Also, this is a fairly unlikely scenario, and
> > +			 * this contains the cost.
> > +			 */
> > +			smp_mb(); /* ACQUIRE */
>
> As we discussed on another thread recently, a failed cmpxchg_acquire
> will always give you ACQUIRE semantics in practice. Maybe we should update
> the documentation to allow this? The only special case is the full-barrier
> version.

So on PPC we do:

static __always_inline unsigned long
__cmpxchg_u32_acquire(u32 *p, unsigned long old, unsigned long new)
{
	unsigned long prev;

	__asm__ __volatile__ (
"1:	lwarx	%0,0,%2		# __cmpxchg_u32_acquire\n"
"	cmpw	0,%0,%3\n"
"	bne-	2f\n"
	PPC405_ERR77(0, %2)
"	stwcx.	%4,0,%2\n"
"	bne-	1b\n"
	PPC_ACQUIRE_BARRIER
	"\n"
"2:"
	: "=&r" (prev), "+m" (*p)
	: "r" (p), "r" (old), "r" (new)
	: "cc", "memory");

	return prev;
}

which I read as skipping the ACQUIRE_BARRIER on failure: when the compare
fails, the "bne- 2f" branches straight past it.

Similarly, we _could_ make the generic version skip the barrier entirely
(it seems we currently do not).

And while I agree that it makes semantic sense (we defined ACQUIRE to
apply to the LOAD only, and we always issue the LOAD, so we should also
always provide ACQUIRE semantics), I'm not entirely convinced we should
go there just yet. It would make failed cmpxchg_acquire() calls more
expensive, and this really is the only place we care about those.

So I would propose that for now we keep these explicit barriers, both
here and in the other place you mentioned, but keep this in mind.

Also, I don't feel we need more complexity in this patch set just now.
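
For illustration only, the pattern can be sketched in userspace C11 rather
than kernel code. Everything named toy_* below is hypothetical and not part
of the patch. The sketch assumes a failed compare-exchange with relaxed
failure ordering as a stand-in for a failed cmpxchg_acquire() that skips
the barrier, and uses an acquire fence where the patch uses the stronger
smp_mb():

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy lock: owner == 0 means unlocked, else the owner's id. */
struct toy_lock {
	_Atomic uintptr_t owner;
};

/* Returns true if 'me' (a nonzero id) owns the lock on return. */
static bool toy_trylock(struct toy_lock *lock, uintptr_t me)
{
	uintptr_t seen = 0;

	/* ACQUIRE on success; the failure path promises no ordering. */
	if (atomic_compare_exchange_strong_explicit(&lock->owner, &seen, me,
						    memory_order_acquire,
						    memory_order_relaxed))
		return true;

	/* The exchange failed; 'seen' holds the value that was loaded. */
	if (seen == me) {
		/*
		 * The lock was handed to us. The failed load above was
		 * relaxed, so order it explicitly before the critical
		 * section. This pairs with the release store in
		 * toy_handoff().
		 */
		atomic_thread_fence(memory_order_acquire);
		return true;
	}

	return false;
}

/* The current owner passes the lock directly to 'next'. */
static void toy_handoff(struct toy_lock *lock, uintptr_t next)
{
	atomic_store_explicit(&lock->owner, next, memory_order_release);
}

/* Single-threaded walk through the three cases. */
static void toy_demo(void)
{
	struct toy_lock lock = { 0 };

	assert(toy_trylock(&lock, 0x10));	/* uncontended acquire */
	assert(!toy_trylock(&lock, 0x20));	/* held by someone else */
	toy_handoff(&lock, 0x20);
	assert(toy_trylock(&lock, 0x20));	/* handoff path, with fence */
}
```

The fence sits only on the handoff branch, so an ordinary failed trylock
stays cheap, which is the cost argument made above.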