Date: Thu, 6 Jul 2017 18:54:26 +0200
From: Peter Zijlstra
To: Alan Stern
Cc: "Paul E. McKenney", David Laight, "linux-kernel@vger.kernel.org",
    "netfilter-devel@vger.kernel.org", "netdev@vger.kernel.org",
    "oleg@redhat.com", "akpm@linux-foundation.org", "mingo@redhat.com",
    "dave@stgolabs.net", "manfred@colorfullife.com", "tj@kernel.org",
    "arnd@arndb.de", "linux-arch@vger.kernel.org", "will.deacon@arm.com",
    "parri.andrea@gmail.com", "torvalds@linux-foundation.org"
Subject: Re: [PATCH v2 0/9] Remove spin_unlock_wait()
Message-ID: <20170706165426.lcewgpuluythinhz@hirez.programming.kicks-ass.net>
References: <20170706162412.GE2393@linux.vnet.ibm.com>

On Thu, Jul 06, 2017 at 12:49:12PM -0400, Alan Stern wrote:
> On Thu, 6 Jul 2017, Paul E. McKenney wrote:
>
> > On Thu, Jul 06, 2017 at 06:10:47PM +0200, Peter Zijlstra wrote:
> > > On Thu, Jul 06, 2017 at 08:21:10AM -0700, Paul E. McKenney wrote:
> > > > And yes, there are architecture-specific optimizations for an
> > > > empty spin_lock()/spin_unlock() critical section, and the current
> > > > arch_spin_unlock_wait() implementations show some of these optimizations.
> > > > But I expect that performance benefits would need to be demonstrated at
> > > > the system level.
> > >
> > > I do in fact contended there are any optimizations for the exact
> > > lock+unlock semantics.
> >
> > You lost me on this one.
> >
> > > The current spin_unlock_wait() is weaker. Most notably it will not (with
> > > exception of ARM64/PPC for other reasons) cause waits on other CPUs.
> >
> > Agreed, weaker semantics allow more optimizations. So use cases needing
> > only the weaker semantics should more readily show performance benefits.
> > But either way, we need compelling use cases, and I do not believe that
> > any of the existing spin_unlock_wait() calls are compelling. Perhaps I
> > am confused, but I am not seeing it for any of them.
>
> If somebody really wants the full spin_unlock_wait semantics and
> doesn't want to interfere with other CPUs, wouldn't synchronize_sched()
> or something similar do the job? It wouldn't be as efficient as
> lock+unlock, but it also wouldn't affect other CPUs.

So please don't do that. That'll create massive pain for RT. Also I
don't think it works. The whole point was that spin_unlock_wait() is
_cheaper_ than lock()+unlock(). If it gets to be more expensive there is
absolutely no point in using it.
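
To make the cost argument concrete, here is a minimal userspace sketch
using C11 atomics. It is not the kernel implementation: the toy_* names
and the test-and-set lock are hypothetical stand-ins, and the sketch
makes no attempt to model the memory-ordering guarantees under
discussion in this thread. It only shows why a read-only wait can be
cheaper than lock()+unlock(): the wait never writes the lock word, so it
never makes another CPU spin, whereas lock()+unlock() briefly owns the
lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy test-and-set spinlock; a stand-in, not kernel code. */
typedef struct {
    atomic_bool locked;
} toy_spinlock_t;

static void toy_spin_lock(toy_spinlock_t *l)
{
    /* Atomic read-modify-write: while we hold the lock, other lockers spin. */
    while (atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
        ;
}

static void toy_spin_unlock(toy_spinlock_t *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}

/*
 * Weaker primitive: spin until the lock is observed free.
 * It only loads the lock word, so it never delays another CPU.
 */
static void toy_spin_unlock_wait(toy_spinlock_t *l)
{
    while (atomic_load_explicit(&l->locked, memory_order_acquire))
        ;
}

/*
 * Stronger replacement: an empty critical section.
 * It writes the lock word twice and can make concurrent lockers wait.
 */
static void toy_lock_unlock(toy_spinlock_t *l)
{
    toy_spin_lock(l);
    toy_spin_unlock(l);
}
```

On an uncontended lock both toy_spin_unlock_wait() and toy_lock_unlock()
return immediately and leave the lock free; the difference is only in
what they do to the lock word, and therefore to other CPUs, on the way.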