From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Randy Dunlap <rdunlap@infradead.org>
Subject: Re: [RFC][PATCH v3]: documentation,atomic: Add new documents
Date: Wed, 2 Aug 2017 09:17:56 -0700
Message-ID: <20170802161756.GX3730@linux.vnet.ibm.com>
In-Reply-To: <20170802094531.GA15748@arm.com>

On Wed, Aug 02, 2017 at 10:45:32AM +0100, Will Deacon wrote:
> Hi Paul,
> 
> On Tue, Aug 01, 2017 at 09:14:12AM -0700, Paul E. McKenney wrote:
> > On Tue, Aug 01, 2017 at 01:17:13PM +0100, Will Deacon wrote:
> > > On Tue, Aug 01, 2017 at 01:47:44PM +0200, Peter Zijlstra wrote:
> > > > On Tue, Aug 01, 2017 at 11:19:00AM +0100, Will Deacon wrote:
> > > > > On Tue, Aug 01, 2017 at 11:01:21AM +0200, Peter Zijlstra wrote:
> > > > > > On Mon, Jul 31, 2017 at 10:43:45AM -0700, Paul E. McKenney wrote:
> > > > > > > So if I have something like this, the assertion really can trigger?
> > > > > > > 
> > > > > > > 	WRITE_ONCE(x, 1);		atomic_inc(&y);
> > > > > > > 	r0 = xchg_release(&y, 5);	smp_mb__after_atomic();
> > > > > > > 					r1 = READ_ONCE(x);
> > > > > > > 
> > > > > > > 
> > > > > > > 	WARN_ON(r0 == 0 && r1 == 0);
> > > > > > > 
> > > > > > > I must confess that I am not seeing why we would want to allow this
> > > > > > > outcome.
> > > > > > 
> > > > > > No you are indeed quite right. I just wasn't creative enough. Thanks for
> > > > > > the inspiration.
> > > > > 
> > > > > Just to close this out, we agree that an smp_rmb() instead of
> > > > > smp_mb__after_atomic() would *not* forbid this outcome, right?
> > > > 
> > > > So that really hurts my brain. Per the normal rules that smp_rmb() would
> > > > order the read of @x against the last ll of @y and per ll/sc ordering
> > > > you then still don't get to make the WARN happen.
> > > > 
> > > > On IRC you explained that your 8.1 LSE instructions are not in fact
> > > > ordered by a smp_rmb, only by smp_wmb, which is 'surprising' since you
> > > > really need to load the old value to compute the new value.
> > > 
> > > To be clear, it's only the ST* variants of the LSE instructions that are
> > > treated as a write for the purposes of memory ordering, so these are the
> > > non-*_return variants. It's not unlikely that other architectures will
> > > exhibit the same behaviour (e.g. Power, RISC-V), because the CPU can
> > > treat non-return atomics as "fire-and-forget" and have them handled
> > > elsewhere in the memory subsystem, causing them to be treated similarly
> > > to posted writes.
> > > 
> > > For the code snippet above, the second thread has no idea about the value
> > > of y and so smp_rmb() is the wrong thing to be using imo. It really cares
> > > about ordering the store to y before the read of x, so needs a full mb (i.e.
> > > the test is more like 'R' than 'MP').
> > > 
> > > Also, wouldn't this problem also arise if your atomics were built using a
> > > spinlock where unlock had release semantics?
> > 
> > The current Linux kernel memory model forbids this outcome with smp_rmb(),
> > though I did have to work around the current lack of atomic_inc() using
> > xchg_relaxed(), so please review my litmus tests carefully.
> 
> It's worth noting that we don't have the problem with any value-returning
> atomics, so all flavours of xchg in this test would be forbidden on arm64
> too.

Plus after upgrading to the latest and greatest version of herd,
atomic_inc() worked just fine.  (Hey, I -try- to keep up!)  The updated
litmus test is here:

https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-WillDeacon-MP%2Bo-r%2Bai-rmb-o.litmus

Same outcome: herd still says "Never" for the smp_rmb() version.  Alan
Stern is looking into what might be adjusted in the model.  Of course,
there is no guarantee that this will turn out to be reasonable or for
that matter acceptable to the usual suspects, but if feasible we should
at least see what this does to the model.
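
As a rough sanity check on the state list that herd prints for this test
(quoted below), the atomic_inc() version can be run through a tiny
brute-force enumerator.  This is only a sketch under a strong assumption:
it explores sequentially consistent interleavings, so every barrier and
ordering annotation is trivially honored.  It therefore says nothing about
what weaker hardware or a relaxed model may allow, and it is no substitute
for herd.  All function and variable names are made up for illustration.

```python
def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's program order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


# Each step updates shared memory `m` and the register file `r`.
def p0_write_x(m, r):
    m["x"] = 1                      # WRITE_ONCE(x, 1);

def p0_xchg_y(m, r):
    r["r0"], m["y"] = m["y"], 5     # r0 = xchg_release(&y, 5);

def p1_inc_y(m, r):
    m["y"] += 1                     # atomic_inc(&y);

def p1_read_x(m, r):
    r["r1"] = m["x"]                # smp_rmb(); r1 = READ_ONCE(x);


P0 = [p0_write_x, p0_xchg_y]
P1 = [p1_inc_y, p1_read_x]


def final_states():
    """Run every interleaving from x == y == 0 and collect (r0, r1)."""
    states = set()
    for schedule in interleavings(P0, P1):
        m = {"x": 0, "y": 0}
        r = {}
        for step in schedule:
            step(m, r)
        states.add((r["r0"], r["r1"]))
    return states


states = final_states()
for r0, r1 in sorted(states):
    print(f"0:r0={r0}; 1:r1={r1};")
print("WARN_ON reachable:", (0, 0) in states)
```

Under that assumption the enumerator reaches the same three final states
that herd reports, and never r0 == 0 && r1 == 0.  The open question in this
thread is whether hardware that treats a non-value-returning atomic as a
plain write can reach the fourth state, which an interleaving model cannot
answer.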

> > 	C C-WillDeacon-MP+o-r+ai-rmb-o.litmus
> > 
> > 	(*
> > 	 * Expected result: Never.
> > 	 *
> > 	 * Desired litmus test, with atomic_inc() emulated by xchg_relaxed():
> > 	 *
> > 	 *     WRITE_ONCE(x, 1);               atomic_inc(&y);
> > 	 *     r0 = xchg_release(&y, 5);       smp_rmb();
> > 	 *                                     r1 = READ_ONCE(x);
> > 	 *
> > 	 *
> > 	 *     WARN_ON(r0 == 0 && r1 == 0);
> > 	 *)
> > 
> > 	{
> > 	}
> > 
> > 	P0(int *x, int *y)
> > 	{
> > 		WRITE_ONCE(*x, 1);
> > 		r0 = xchg_release(y, 5);
> > 	}
> > 
> > 	P1(int *x, int *y)
> > 	{
> > 		r2 = xchg_relaxed(y, 1);
> > 		smp_rmb();
> > 		r1 = READ_ONCE(*x);
> > 	}
> > 
> > 	exists
> > 	(0:r0=0 /\ 1:r1=0)
> > 
> > Here is what herd thinks:
> > 
> > 	$ herd7 -bell strong-kernel.bell -cat weak-kernel.cat -macros linux.def ../litmus/manual/kernel/C-WillDeacon-MP+o-r+ai-rmb-o.litmus
> > 	Test C-WillDeacon-MP+o-r+ai-rmb-o Allowed
> > 	States 3
> > 	0:r0=0; 1:r1=1;
> > 	0:r0=1; 1:r1=0;
> > 	0:r0=1; 1:r1=1;
> > 	No
> > 	Witnesses
> > 	Positive: 0 Negative: 3
> > 	Condition exists (0:r0=0 /\ 1:r1=0)
> > 	Observation C-WillDeacon-MP+o-r+ai-rmb-o Never 0 3
> > 	Hash=0c3e25a94b38708a2c5ea11ff52c8077
> > 
> > I get the same answer from strong-kernel.cat (which is our best-guess
> > envelope over hardware guarantees), weak-kernel.cat (which is simplified
> > based on what people actually use), and proposal.cat (which is a candidate
> > model with further simplifications).
> > 
> > I converted this (possibly incorrectly) to PowerPC assembly:
> > 
> > 	PPC w-RMWl-r+w-RMWl-r.litmus
> > 	""
> > 	(*
> > 	 * Does 3.0 Linux-kernel Power atomic_add_return() provide local 
> > 	 * barrier that orders prior stores against subsequent loads?
> > 	 * Use the atomic_add_return() in both threads, but to different variables.
> > 	 * And use the trailing-lwsync variant of atomic_add_return().
> > 	 *)
> > 	(* 24-Aug-2011: ppcmem says "Sometimes" *)
> > 	{
> > 	0:r1=1; 0:r2=x; 0:r3=5; 0:r4=y;   0:r10=0 ; 0:r11=0;
> > 	1:r1=1; 1:r2=x; 1:r3=5; 1:r4=y;   1:r10=0 ; 1:r11=0;
> > 	}
> > 	 P0                | P1                ;
> > 	 stw r1,0(r2)      | lwarx  r11,r10,r4 ;
> > 	 lwsync            | stwcx. r1,r10,r4  ;
> > 	 lwarx  r11,r10,r4 | bne Fail1         ;
> > 	 stwcx. r3,r10,r4  | lwsync            ;
> > 	 bne Fail0         | lwz r3,0(r2)      ;
> > 	 li r3,42          | Fail1:            ;
> > 	 Fail0:            |                   ;
> > 
> > 
> > 	exists
> > 	(0:r11=0 /\ 0:r3=42 /\ 1:r3=0)
> > 
> > And ppcmem agrees with the linux-kernel memory model:
> > 
> > 	[ . . . ]
> > 
> > 	Found     82 : Prune count= 13946  seen_succs=  7453   7454 states 
> > 	Found     83 : Prune count= 13997  seen_succs=  7490   7491 states 
> > 	Found     84 : Prune count= 14007  seen_succs=  7506   7507 states 
> > 	Found     85 : Prune count= 17229  seen_succs=  8889   8890 states 
> > 	Found     86 : Prune count= 17235  seen_succs=  8897   8898 states 
> > 	Test w-RMWl-r+w-RMWl-r Allowed
> > 	States 9
> > 	0:r3=5; 0:r11=0; 1:r3=0;
> > 	0:r3=5; 0:r11=0; 1:r3=1;
> > 	0:r3=5; 0:r11=0; 1:r3=5;
> > 	0:r3=5; 0:r11=1; 1:r3=0;
> > 	0:r3=5; 0:r11=1; 1:r3=1;
> > 	0:r3=42; 0:r11=0; 1:r3=1;
> > 	0:r3=42; 0:r11=0; 1:r3=5;
> > 	0:r3=42; 0:r11=1; 1:r3=0;
> > 	0:r3=42; 0:r11=1; 1:r3=1;
> > 	No (allowed not found)
> > 	Condition exists (0:r11=0 /\ 0:r3=42 /\ 1:r3=0)
> > 	Hash=58fb07516ac5697580c33e06a354f667
> > 	Observation w-RMWl-r+w-RMWl-r Never 0 9 
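
To make the "No (allowed not found)" line concrete, the exists clause can
be evaluated by hand against the nine final states listed above.  The
sketch below does exactly that and nothing more: the tuples are copied
from the ppcmem output, and the helper name is hypothetical.  On one
reading of the test, 0:r3=42 means P0's stwcx. succeeded, 0:r11=0 means
P0's lwarx saw the initial value of y, and 1:r3=0 means P1's lwz saw the
initial value of x.

```python
# Final states copied from the ppcmem output above, as (0:r3, 0:r11, 1:r3).
states = [
    (5, 0, 0), (5, 0, 1), (5, 0, 5),
    (5, 1, 0), (5, 1, 1),
    (42, 0, 1), (42, 0, 5),
    (42, 1, 0), (42, 1, 1),
]


def exists_clause(p0_r3, p0_r11, p1_r3):
    """The litmus condition: 0:r11=0 /\\ 0:r3=42 /\\ 1:r3=0."""
    return p0_r11 == 0 and p0_r3 == 42 and p1_r3 == 0


witnesses = [s for s in states if exists_clause(*s)]
print(len(states), "states,", len(witnesses), "witnesses")
```

No listed state satisfies all three conjuncts, which is what the
"Never 0 9" observation summarizes.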
> > 
> > So if ARM really needs the litmus test with smp_rmb() to be allowed,
> > we need to adjust the Linux-kernel memory model appropriately.  Which
> > means that one of us needs to reach out to the usual suspects.  Would
> > you like to do that, or would you like me to?
> 
> If you don't mind doing it, then that would be great, thanks. Do shout if
> you need me to help with anything, though!

You will be copied, just to cut out the timezone delays if nothing else.

							Thanx, Paul
