From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753305AbdF2Sru (ORCPT <rfc822;w@1wt.eu>);
        Thu, 29 Jun 2017 14:47:50 -0400
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:34976 "EHLO
        mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1753054AbdF2Srn (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 29 Jun 2017 14:47:43 -0400
Date: Thu, 29 Jun 2017 11:47:35 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
        Alan Stern <stern@rowland.harvard.edu>,
        Andrea Parri <parri.andrea@gmail.com>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        priyalee.kushwaha@intel.com,
        =?utf-8?Q?Stanis=C5=82aw?= Drozd <drozdziak1@gmail.com>,
        Arnd Bergmann <arnd@arndb.de>, ldr709@gmail.com,
        Thomas Gleixner <tglx@linutronix.de>,
        Peter Zijlstra <peterz@infradead.org>,
        Josh Triplett <josh@joshtriplett.org>, Nicolas Pitre <nico@linaro.org>,
        Krister Johansen <kjlx@templeofstupid.com>,
        Vegard Nossum <vegard.nossum@oracle.com>, dcb314@hotmail.com,
        Wu Fengguang <fengguang.wu@intel.com>,
        Frederic Weisbecker <fweisbec@gmail.com>,
        Rik van Riel <riel@redhat.com>, Steven Rostedt <rostedt@goodmis.org>,
        Ingo Molnar <mingo@kernel.org>, Luc Maranget <luc.maranget@inria.fr>,
        Jade Alglave <j.alglave@ucl.ac.uk>
Subject: Re: [GIT PULL rcu/next] RCU commits for 4.13
Reply-To: paulmck@linux.vnet.ibm.com
References: <20170628170321.GQ3721@linux.vnet.ibm.com>
 <Pine.LNX.4.44L0.1706281547270.27696-100000@netrider.rowland.org>
 <20170628235412.GB3721@linux.vnet.ibm.com>
 <CA+55aFwLq5oPvY5HwpKq9LkCuQ5No7O5g=+ij1T68ONu-gOm_Q@mail.gmail.com>
 <20170629004556.GD3721@linux.vnet.ibm.com>
 <20170629031726.pb5dhjnxxiif25ma@tardis>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170629031726.pb5dhjnxxiif25ma@tardis>
User-Agent: Mutt/1.5.21 (2010-09-15)
Message-Id: <20170629184735.GC2393@linux.vnet.ibm.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 29, 2017 at 11:17:26AM +0800, Boqun Feng wrote:
> On Wed, Jun 28, 2017 at 05:45:56PM -0700, Paul E. McKenney wrote:
> > On Wed, Jun 28, 2017 at 05:05:46PM -0700, Linus Torvalds wrote:
> > > On Wed, Jun 28, 2017 at 4:54 PM, Paul E. McKenney
> > > <paulmck@linux.vnet.ibm.com> wrote:
> > > >
> > > > Linus, are you dead-set against defining spin_unlock_wait() to be
> > > > spin_lock + spin_unlock?  For example, is the current x86 implementation
> > > > of spin_unlock_wait() really a non-negotiable hard requirement?  Or
> > > > would you be willing to live with the spin_lock + spin_unlock semantics?
> > > 
> > > So I think the "same as spin_lock + spin_unlock" semantics are kind of insane.
> > > 
> > > One of the issues is that the same as "spin_lock + spin_unlock" is
> > > basically now architecture-dependent. Is it really the
> > > architecture-dependent ordering you want to define this as?
> > > 
> > > So I just think it's a *bad* definition. If somebody wants something
> > > that is exactly equivalent to spin_lock+spin_unlock, then dammit, just
> > > do *THAT*. It's completely pointless to me to define
> > > spin_unlock_wait() in those terms.
> > > 
> > > And if it's not equivalent to the *architecture* behavior of
> > > > spin_lock+spin_unlock, then I think it should be described in terms
> > > > that aren't about the architecture implementation (so you shouldn't
> > > > describe it as "spin_lock+spin_unlock", you should describe it in
> > > > terms of memory barrier semantics).
> > > 
> > > > And if we really have to use the spin_lock+spin_unlock semantics for
> > > this, then what is the advantage of spin_unlock_wait at all, if it
> > > doesn't fundamentally avoid some locking overhead of just taking the
> > > spinlock in the first place?
> > > 
> > > And if we can't use a cheaper model, maybe we should just get rid of
> > > it entirely?
> > > 
> > > Finally: if the memory barrier semantics are exactly the same, and
> > > it's purely about avoiding some nasty contention case, I think the
> > > concept is broken - contention is almost never an actual issue, and if
> > > it is, the problem is much deeper than spin_unlock_wait().
> > 
> > All good points!
> > 
> > I must confess that your sentence about getting rid of spin_unlock_wait()
> > entirely does resonate with me, especially given the repeated bouts of
> > "but what -exactly- is it -supposed- to do?" over the past 18 months
> > or so.  ;-)
> > 
> > Just for completeness, here is a list of the definitions that have been
> > put forward, just in case it inspires someone to come up with something
> > better:
> > 
> > 1.	spin_unlock_wait() provides only acquire semantics.  Code
> > 	placed after the spin_unlock_wait() will see the effects of
> > 	all previous critical sections, but there is no guarantees for
> > 	subsequent critical sections.  The x86 implementation provides
> > 	this.  I -think- that the ARM and PowerPC implementations could
> > 	get rid of a memory-barrier instruction and still provide this.
> > 
> 
> Yes, except we still need a smp_lwsync() in powerpc's
> spin_unlock_wait().
> 
> And FWIW, the two smp_mb()s in spin_unlock_wait() on PowerPC exist there
> just because when Peter worked on commit 726328d92a42, we decided to let
> the fix for spin_unlock_wait() on PowerPC (i.e. commit 6262db7c088bb) go
> into the tree first to avoid some possible conflicts.  And... I forgot to
> do the clean-up for an acquire-semantics spin_unlock_wait() later. ;-)
> 
> I could send out the necessary fix once we have a conclusion for the
> semantics part.

If we end up still having spin_unlock_wait(), I will be happy to take
you up on that.

							Thanx, Paul

> Regards,
> Boqun
> 
> > 2.	As #1 above, but a "smp_mb();spin_unlock_wait();" provides the
> > 	additional guarantee that code placed before this construct is
> > 	seen by all subsequent critical sections.  The x86 implementation
> > 	provides this, as do ARM and PowerPC, but it is not clear that all
> > 	architectures do.  As Alan noted, this is an extremely unnatural
> > 	definition for the current memory model.
> > 
> > 3.	[ Just for completeness, yes, this is off the table! ]  The
> > 	spin_unlock_wait() has the same semantics as a spin_lock()
> > 	followed immediately by a spin_unlock().
> > 
> > 4.	spin_unlock_wait() is analogous to synchronize_rcu(), where
> > 	spin_unlock_wait()'s "read-side critical sections" are the lock's
> > 	normal critical sections.  This was the first definition I heard
> > 	that made any sense to me, but it turns out to be equivalent
> > 	to #3.	Thus, also off the table.
> > 
> > Does anyone know of any other possible definitions?
> > 
> > 							Thanx, Paul
> >
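
For anyone following along, definitions #1 through #3 above can be
sketched in userspace C11 atomics.  This is only an illustration under
loud assumptions: the sketch_* names are hypothetical, the lock is a
trivial test-and-set lock rather than any architecture's spinlock, and
atomic_thread_fence(memory_order_seq_cst) is merely a stand-in for
smp_mb().  C11 ordering is not the Linux-kernel memory model, so this
shows the shape of each definition, not its exact guarantees.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace test-and-set lock; NOT the kernel's spinlock_t. */
typedef struct {
	atomic_bool locked;
} sketch_lock_t;

static inline void sketch_lock(sketch_lock_t *l)
{
	/* Acquire: accesses in the critical section stay after the lock. */
	while (atomic_exchange_explicit(&l->locked, true,
					memory_order_acquire))
		;	/* spin until the previous value was "unlocked" */
}

static inline void sketch_unlock(sketch_lock_t *l)
{
	/* Unlocking a lock that is not held is a caller bug. */
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
	/* Release: the critical section is visible to the next acquirer. */
	atomic_store_explicit(&l->locked, false, memory_order_release);
}

/*
 * Definition #1 (assumed shape): wait until the lock is observed free,
 * with acquire semantics only.  The lock word is never written.
 */
static inline void sketch_unlock_wait_acquire(sketch_lock_t *l)
{
	while (atomic_load_explicit(&l->locked, memory_order_acquire))
		;	/* spin while someone holds the lock */
}

/*
 * Definition #2 (assumed shape): a full fence, then definition #1.
 * The seq_cst fence stands in for smp_mb().
 */
static inline void sketch_unlock_wait_fenced(sketch_lock_t *l)
{
	atomic_thread_fence(memory_order_seq_cst);
	sketch_unlock_wait_acquire(l);
}

/*
 * Definition #3: literally take the lock and drop it again, which
 * writes the lock word twice.
 */
static inline void sketch_unlock_wait_lock_unlock(sketch_lock_t *l)
{
	sketch_lock(l);
	sketch_unlock(l);
}
```

Note that the first variant only ever reads the lock word, whereas the
third has to take the lock and so contends with real lock holders.  That
is the overhead question raised above: if only #3's semantics will do,
the caller might as well write spin_lock() followed by spin_unlock().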