From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753684Ab3KCWqB (ORCPT ); Sun, 3 Nov 2013 17:46:01 -0500
Received: from e35.co.us.ibm.com ([32.97.110.153]:39450 "EHLO e35.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751629Ab3KCWqA (ORCPT ); Sun, 3 Nov 2013 17:46:00 -0500
Date: Sun, 3 Nov 2013 14:42:42 -0800
From: "Paul E. McKenney"
To: Peter Zijlstra
Cc: Linus Torvalds, Victor Kaplansky, Oleg Nesterov, Anton Blanchard, Benjamin Herrenschmidt, Frederic Weisbecker, LKML, Linux PPC dev, Mathieu Desnoyers, Michael Ellerman, Michael Neuling
Subject: Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
Message-ID: <20131103224242.GF3947@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <20131030092725.GL4126@linux.vnet.ibm.com>
 <20131030112526.GI16117@laptop.programming.kicks-ass.net>
 <20131031064015.GV4126@linux.vnet.ibm.com>
 <20131101145634.GH19466@laptop.lan>
 <20131102173239.GB3947@linux.vnet.ibm.com>
 <20131103144017.GA25118@linux.vnet.ibm.com>
 <20131103151704.GJ19466@laptop.lan>
 <20131103200124.GK19466@laptop.lan>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20131103200124.GK19466@laptop.lan>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-MML: No
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 13110322-6688-0000-0000-000003234CE2
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Nov 03, 2013 at 09:01:24PM +0100, Peter Zijlstra wrote:
> On Sun, Nov 03, 2013 at 10:08:14AM -0800, Linus Torvalds wrote:
> > On Sun, Nov 3, 2013 at 7:17 AM, Peter Zijlstra wrote:
> > > On Sun, Nov 03, 2013 at 06:40:17AM -0800, Paul E. McKenney wrote:
> > >> If there was an smp_tmb(), I would likely use it in rcu_assign_pointer().
> > >
> > > Well, I'm obviously all for introducing this new barrier, for it will
> > > reduce a full mfence on x86 to a compiler barrier. And ppc can use
> > > lwsync as opposed to sync afaict. Not sure ARM can do better.
> > >
> > > ---
> > > Subject: arch: Introduce new TSO memory barrier smp_tmb()
> >
> > This is specialized enough that I would *really* like the name to be
> > more descriptive. Compare to the special "smp_read_barrier_depends()"
> > macro: it's unusual, and it has very specific semantics, so it gets a
> > long and descriptive name.
> >
> > Memory ordering is subtle enough without then using names that are
> > subtle in themselves. mb/rmb/wmb are conceptually pretty simple
> > operations, and very basic when talking about memory ordering.
> > "acquire" and "release" are less simple, but have descriptive names
> > and have very specific uses in locking.
> >
> > In contrast "smp_tmb()" is a *horrible* name, because TSO is a
> > description of the memory ordering, not of a particular barrier. It's
> > also not even clear that you can have a "tso barrier", since the
> > ordering (like acquire/release) presumably is really about one
> > particular *store*, not about some kind of barrier between different
> > operations.
> >
> > So please describe exactly what the semantics that barrier has, and
> > then name the barrier that way.
> >
> > I assume that in this particular case, the semantics RCU wants is
> > "write barrier, and no preceding reads can move past this point".

Its semantics order prior reads against subsequent reads, prior reads
against subsequent writes, and prior writes against subsequent writes.
It does -not- order prior writes against subsequent reads.

> > Calling that "smp_tmb()" is f*cking insane, imnsho.
>
> Fair enough; from what I could gather the proposed semantics are
> RELEASE+WMB, such that neither reads nor writes can cross over, writes
> can't cross back, but reads could.
>
> Since both RELEASE and WMB are trivial under TSO the entire thing
> collapses.

And here are some candidate names, with no attempt to sort sanity from
insanity:

smp_storebuffer_mb() -- A barrier that enforces those orderings
    that do not invalidate the hardware store-buffer optimization.

smp_not_w_r_mb() -- A barrier that orders everything except prior
    writes against subsequent reads.

smp_acqrel_mb() -- A barrier that combines C/C++ acquire and release
    semantics. (C/C++ "acquire" orders a specific load against
    subsequent loads and stores, while C/C++ "release" orders
    a specific store against prior loads and stores.)

Others?

> Now I'm currently completely confused as to what C/C++ wrecks vs actual
> proper memory order issues; let alone fully comprehend the case that
> started all this.

Each can result in similar wreckage. In either case, it is about failing
to guarantee needed orderings.

Thanx, Paul
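
For readers who want to see the ordering described above in action, here
is a minimal userspace sketch in C11, not kernel code. It assumes that a
C11 atomic_thread_fence(memory_order_acq_rel) is a reasonable analogue of
the proposed barrier: it orders prior reads against later reads and
writes, and prior writes against later writes, but not prior writes
against later reads. The macro name is borrowed from the candidate list
purely for illustration; payload, ready, writer(), reader() and
run_message_passing() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the proposed barrier (NOT a kernel API).
 * Assumption: a C11 acq_rel fence orders read->read, read->write and
 * write->write, but leaves write->read unordered. */
#define smp_not_w_r_mb() atomic_thread_fence(memory_order_acq_rel)

static int payload;       /* plain data published by the writer */
static atomic_int ready;  /* flag announcing that payload is valid */

static void *writer(void *arg)
{
    (void)arg;
    payload = 42;         /* prior write */
    smp_not_w_r_mb();     /* keeps the payload write before the flag write */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

static void *reader(void *arg)
{
    int *seen = arg;

    /* Spin until the writer has set the flag. */
    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;
    smp_not_w_r_mb();     /* keeps the flag read before the payload read */
    *seen = payload;
    return NULL;
}

/* Runs one writer and one reader; returns the payload the reader saw. */
int run_message_passing(void)
{
    pthread_t w, r;
    int seen = -1;
    int rc;

    payload = 0;
    atomic_store(&ready, 0);

    rc = pthread_create(&r, NULL, reader, &seen);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);

    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

Calling run_message_passing() from a test program (built with -pthread
where the toolchain needs it) should return 42, since the two fences pair
up as a release fence and an acquire fence around the relaxed flag
accesses. The sketch only exercises the write->write and read->read
orderings; it says nothing about the write->read case, which this class
of barrier deliberately leaves unordered.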