* Does Itanium permit speculative stores?
@ 2013-11-11 17:13 Paul E. McKenney
  2013-11-12 18:00 ` Luck, Tony
  2013-11-27  4:55 ` Jon Masters
  0 siblings, 2 replies; 11+ messages in thread
From: Paul E. McKenney @ 2013-11-11 17:13 UTC
  To: tony.luck; +Cc: peterz, linux-kernel

Hello, Tony,

Does Itanium permit speculative stores?  For example, on Itanium what are
the permitted outcomes of the following litmus test, where both x and y
are initially zero?

	CPU 0				CPU 1

	r1 = ACCESS_ONCE(x);		r2 = ACCESS_ONCE(y);
	if (r1)				if (r2)
		ACCESS_ONCE(y) = 1;		ACCESS_ONCE(x) = 1;

In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium
given this litmus test?

							Thanx, Paul



* RE: Does Itanium permit speculative stores?
  2013-11-11 17:13 Does Itanium permit speculative stores? Paul E. McKenney
@ 2013-11-12 18:00 ` Luck, Tony
  2013-11-12 18:26   ` Peter Zijlstra
                     ` (2 more replies)
  2013-11-27  4:55 ` Jon Masters
  1 sibling, 3 replies; 11+ messages in thread
From: Luck, Tony @ 2013-11-12 18:00 UTC
  To: paulmck; +Cc: peterz, linux-kernel

> Does Itanium permit speculative stores?  For example, on Itanium what are
> the permitted outcomes of the following litmus test, where both x and y
> are initially zero?

We have a compiler-visible speculative read via the "ld.s" and "chk" instructions. But
there is no speculative write ("st.s") instruction.  I think you are asking "can out of order
writes become visible in this scenario?"

	CPU 0				CPU 1

	r1 = ACCESS_ONCE(x);		r2 = ACCESS_ONCE(y);
	if (r1)				if (r2)
		ACCESS_ONCE(y) = 1;		ACCESS_ONCE(x) = 1;

> In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium
> given this litmus test?

The "ACCESS_ONCE" macro casts to volatile - which will make gcc generate
ordered "ld.acq" and "st.rel" instructions for your code snippets. So I think
you should be fine.
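
A minimal sketch of the mechanism described above, assuming GNU C (for `__typeof__`). The macro has the same shape as the kernel's ACCESS_ONCE(); `litmus_cpu0()` is a hypothetical helper, not something from this thread, that wraps CPU 0's half of the litmus test. Whether the volatile accesses turn into "ld.acq" and "st.rel" depends on the compiler and target; the C below only forces one load and at most one store.

```c
/*
 * Same shape as the kernel macro: perform the access through a
 * volatile-qualified lvalue so the compiler emits it exactly once.
 * __typeof__ is a GNU C extension.
 */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical helper: CPU 0's half of the litmus test. */
static int litmus_cpu0(int *x, int *y)
{
	int r1 = ACCESS_ONCE(*x);	/* one volatile load of x */

	if (r1)
		ACCESS_ONCE(*y) = 1;	/* one volatile store to y, only if r1 != 0 */
	return r1;
}
```

CPU 1's half would be the same helper called with the arguments swapped, i.e. litmus_cpu0(&y, &x).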

-Tony


* Re: Does Itanium permit speculative stores?
  2013-11-12 18:00 ` Luck, Tony
@ 2013-11-12 18:26   ` Peter Zijlstra
  2013-11-12 18:46     ` Luck, Tony
  2013-11-12 21:30     ` Paul E. McKenney
  2013-11-12 18:31   ` Paul E. McKenney
  2013-11-12 18:34   ` Peter Zijlstra
  2 siblings, 2 replies; 11+ messages in thread
From: Peter Zijlstra @ 2013-11-12 18:26 UTC
  To: Luck, Tony; +Cc: paulmck, linux-kernel

On Tue, Nov 12, 2013 at 06:00:26PM +0000, Luck, Tony wrote:
> > Does Itanium permit speculative stores?  For example, on Itanium what are
> > the permitted outcomes of the following litmus test, where both x and y
> > are initially zero?
> 
> We have a compiler-visible speculative read via the "ld.s" and "chk" instructions. But
> there is no speculative write ("st.s") instruction.  I think you are asking "can out of order
> writes become visible in this scenario?"
> 
> 	CPU 0				CPU 1
> 
> 	r1 = ACCESS_ONCE(x);		r2 = ACCESS_ONCE(y);
> 	if (r1)				if (r2)
> 		ACCESS_ONCE(y) = 1;		ACCESS_ONCE(x) = 1;
> 
> > In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium
> > given this litmus test?
> 
> The "ACCESS_ONCE" macro casts to volatile - which will make gcc generate
> ordered "ld.acq" and "st.rel" instructions for your code snippets. So I think
> you should be fine.

Cute that volatile generates barrier instructions.

But no; I think Paul accidentally formulated his question in C (since we
all speak C) but meant to ask an architectural question.

So the point we're having a discussion on is if any architecture has
visible speculative STORES and if there's an architecture that doesn't
have control dependencies.

On the visible speculative STORES: if in the above example we have
regular loads/stores:

  LOAD r1, x			LOAD r2, y
  IF (r1)			IF (r2)
	STORE y, 1			STORE x, 1

can we observe r1==1 && r2==1?

In order for that to be true; we must be able to observe the stores
before the loads are complete -- and therefore before the branches are a
certainty.

Typically if an architecture speculates on branches the result doesn't
become visible/committed until the branch is a certainty -- ie. linear
branch history.
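
The argument above can be checked with a small model. The sketch below is not from the thread and does not model Itanium or any real hardware. It enumerates the six interleavings of the two-CPU test under the assumption that each store happens only after its own CPU's branch is resolved, and then shows one hypothetical execution in which CPU 0's store becomes visible early. All function names are made up for illustration.

```c
/* Outcome of one execution of the litmus test. */
struct outcome { int r1, r2; };

/*
 * Run one interleaving. order[i] names the CPU that takes step i.
 * Each CPU has two steps in program order: load, then conditional store.
 * A store happens only after that CPU's own load has completed.
 */
static struct outcome run_in_order(const int order[4])
{
	int x = 0, y = 0;
	int r[2] = { 0, 0 };
	int step[2] = { 0, 0 };	/* next step of each CPU */

	for (int i = 0; i < 4; i++) {
		int cpu = order[i];

		if (step[cpu] == 0) {
			r[cpu] = (cpu == 0) ? x : y;	/* LOAD */
		} else if (r[cpu]) {
			if (cpu == 0)			/* IF (r) STORE */
				y = 1;
			else
				x = 1;
		}
		step[cpu]++;
	}
	return (struct outcome){ r[0], r[1] };
}

/* Count the interleavings (out of 6) that end with r1 == 1 && r2 == 1. */
static int count_both_one_in_order(void)
{
	int count = 0;

	/* Bit i of mask set means CPU 1 takes step i; each CPU takes two. */
	for (int mask = 0; mask < 16; mask++) {
		int order[4], ones = 0;

		for (int i = 0; i < 4; i++) {
			order[i] = (mask >> i) & 1;
			ones += order[i];
		}
		if (ones != 2)
			continue;

		struct outcome o = run_in_order(order);
		if (o.r1 == 1 && o.r2 == 1)
			count++;
	}
	return count;
}

/*
 * Hypothetical machine on which CPU 0's store is visible before its
 * load completes. This is the behaviour being asked about.
 */
static struct outcome run_speculative(void)
{
	int x = 0, y = 0;

	y = 1;			/* CPU 0: speculative STORE y visible early */
	int r2 = y;		/* CPU 1: LOAD y */
	if (r2)
		x = 1;		/* CPU 1: STORE x */
	int r1 = x;		/* CPU 0: LOAD x completes, "confirming" the branch */

	return (struct outcome){ r1, r2 };
}
```

In this model count_both_one_in_order() finds no interleaving with both registers equal to 1, while run_speculative() produces exactly that outcome, which is why the question comes down to whether stores can become visible before the branch is certain.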

Alternatively:

	x:=0

	IF (cond)			LOAD r1,x
		STORE x,1
	STORE x,2

Can r1 ever be 1 if we know 'cond' will never be true (a runtime
constraint, not a compile-time one, so the branch cannot be omitted)?
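
As a rough illustration of this second test, the sketch below (an assumption-laden model, not from the thread) lists the values CPU 1 can read when CPU 0's stores become visible in program order and only on the path actually taken. The name possible_r1() is hypothetical.

```c
#include <stdbool.h>

/*
 * Mark in seen[v] every value v that CPU 1's load of x can return when
 * CPU 0's stores become visible in program order and only on the taken
 * path (no visible speculative stores).
 */
static void possible_r1(bool cond, bool seen[3])
{
	seen[0] = seen[1] = seen[2] = false;

	/* pos = number of CPU 0 steps that ran before CPU 1's load. */
	for (int pos = 0; pos <= 2; pos++) {
		int x = 0;			/* x := 0 */

		if (pos >= 1 && cond)
			x = 1;			/* IF (cond) STORE x,1 */
		if (pos >= 2)
			x = 2;			/* STORE x,2 */
		seen[x] = true;			/* LOAD r1,x */
	}
}
```

With cond false the model yields only 0 and 2 for r1. Seeing 1 would require the store on the untaken path to have become visible.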


* Re: Does Itanium permit speculative stores?
  2013-11-12 18:00 ` Luck, Tony
  2013-11-12 18:26   ` Peter Zijlstra
@ 2013-11-12 18:31   ` Paul E. McKenney
  2013-11-12 18:34   ` Peter Zijlstra
  2 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2013-11-12 18:31 UTC
  To: Luck, Tony; +Cc: peterz, linux-kernel

On Tue, Nov 12, 2013 at 06:00:26PM +0000, Luck, Tony wrote:
> > Does Itanium permit speculative stores?  For example, on Itanium what are
> > the permitted outcomes of the following litmus test, where both x and y
> > are initially zero?
> 
> We have a compiler-visible speculative read via the "ld.s" and "chk" instructions. But
> there is no speculative write ("st.s") instruction.  I think you are asking "can out of order
> writes become visible in this scenario?"
> 
> 	CPU 0				CPU 1
> 
> 	r1 = ACCESS_ONCE(x);		r2 = ACCESS_ONCE(y);
> 	if (r1)				if (r2)
> 		ACCESS_ONCE(y) = 1;		ACCESS_ONCE(x) = 1;
> 
> > In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium
> > given this litmus test?
> 
> The "ACCESS_ONCE" macro casts to volatile - which will make gcc generate
> ordered "ld.acq" and "st.rel" instructions for your code snippets. So I think
> you should be fine.

Excellent!!!  Thank you for the information!

If I understand you correctly, this underscores the importance of
using ACCESS_ONCE() -- if you omit it in the above scenario, perhaps
you can see out-of-order stores becoming visible?

Also, this resolves our earlier IRC discussion about Itanium's lack of
read-read cache coherence.  If you use ACCESS_ONCE properly, then on
Itanium the reads will become ld.acq instructions, ensuring the expected
cache coherence.

Very nice!

							Thanx, Paul



* Re: Does Itanium permit speculative stores?
  2013-11-12 18:00 ` Luck, Tony
  2013-11-12 18:26   ` Peter Zijlstra
  2013-11-12 18:31   ` Paul E. McKenney
@ 2013-11-12 18:34   ` Peter Zijlstra
  2 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2013-11-12 18:34 UTC
  To: Luck, Tony; +Cc: paulmck, linux-kernel

On Tue, Nov 12, 2013 at 06:00:26PM +0000, Luck, Tony wrote:
> The "ACCESS_ONCE" macro casts to volatile - which will make gcc generate
> ordered "ld.acq" and "st.rel" instructions for your code snippets. So I think
> you should be fine.

Hurm.. so:

+#define smp_store_release(p, v)						\
+do {									\
+	compiletime_assert_atomic_type(*p);				\
+	switch (sizeof(*p)) {						\
+	case 4:								\
+		asm volatile ("st4.rel [%0]=%1"				\
+				: : "r" (p), "r" (v) : "memory");	\
+		break;							\
+	case 8:								\
+		asm volatile ("st8.rel [%0]=%1"				\
+				: : "r" (p), "r" (v) : "memory");	\
+		break;							\
+	}								\
+} while (0)
+
+#define smp_load_acquire(p)						\
+({									\
+	typeof(*p) ___p1;						\
+	compiletime_assert_atomic_type(*p);				\
+	switch (sizeof(*p)) {						\
+	case 4:								\
+		asm volatile ("ld4.acq %0=[%1]"				\
+				: "=r" (___p1) : "r" (p) : "memory");	\
+		break;							\
+	case 8:								\
+		asm volatile ("ld8.acq %0=[%1]"				\
+				: "=r" (___p1) : "r" (p) : "memory");	\
+		break;							\
+	}								\
+	___p1;								\
+})

That all can be written as:

+#define smp_store_release(p, v)						\
+do {									\
+	compiletime_assert_atomic_type(*p);				\
+	ACCESS_ONCE(*p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+({									\
+	typeof(*p) ___p1 = ACCESS_ONCE(*p);				\
+	compiletime_assert_atomic_type(*p);				\
+	___p1;								\
+})

On ia64? Totally much simpler!
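
For comparison outside the kernel, the same release/acquire pairing can be sketched with C11 <stdatomic.h>. This swaps in the standard C11 atomics rather than the macros above, and the names payload, ready, publish() and consume() are hypothetical.

```c
#include <stdatomic.h>

static int payload;		/* plain data handed over by the flag */
static atomic_int ready;	/* zero-initialized: nothing published yet */

/* Writer: fill in the data, then publish it with a release store. */
static void publish(int v)
{
	payload = v;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/*
 * Reader: an acquire load of the flag. If it sees 1, the earlier write
 * to payload is guaranteed to be visible. Returns 1 on success.
 */
static int consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return 0;
	*out = payload;
	return 1;
}
```

The release store plays the role of smp_store_release() and the acquire load that of smp_load_acquire(). How a given compiler lowers them on a given target is not something this sketch asserts.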


* RE: Does Itanium permit speculative stores?
  2013-11-12 18:26   ` Peter Zijlstra
@ 2013-11-12 18:46     ` Luck, Tony
  2013-11-12 18:49       ` Peter Zijlstra
  2013-11-12 21:29       ` Paul E. McKenney
  2013-11-12 21:30     ` Paul E. McKenney
  1 sibling, 2 replies; 11+ messages in thread
From: Luck, Tony @ 2013-11-12 18:46 UTC
  To: Peter Zijlstra; +Cc: paulmck, linux-kernel

> So the point we're having a discussion on is if any architecture has
> visible speculative STORES and if there's an architecture that doesn't
> have control dependencies.
>
> On the visible speculative STORES; can, if in the above example we have
> regular loads/stores:
>
>  LOAD r1, x			LOAD r2, y
>  IF (r1)			IF (r2)
>	STORE y, 1			STORE x, 1
>
> we observe: r1==1 && r2==1
>
> In order for that to be true; we must be able to observe the stores
> before the loads are complete -- and therefore before the branches are a
> certainty.

Even without the ".acq" and ".rel" this code is still safe.

Quoting ia64 SDM vol 1, Section 4.4.7 "Memory Access Ordering":
"In addition, memory writes and flushes must observe control dependencies"

which I take to mean that the STORE can't be visible until we are certain of
the outcome of the conditional.

-Tony


* Re: Does Itanium permit speculative stores?
  2013-11-12 18:46     ` Luck, Tony
@ 2013-11-12 18:49       ` Peter Zijlstra
  2013-11-12 21:29       ` Paul E. McKenney
  1 sibling, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2013-11-12 18:49 UTC
  To: Luck, Tony; +Cc: paulmck, linux-kernel

On Tue, Nov 12, 2013 at 06:46:20PM +0000, Luck, Tony wrote:
> > So the point we're having a discussion on is if any architecture has
> > visible speculative STORES and if there's an architecture that doesn't
> > have control dependencies.
> >
> > On the visible speculative STORES; can, if in the above example we have
> > regular loads/stores:
> >
> >  LOAD r1, x			LOAD r2, y
> >  IF (r1)			IF (r2)
> >	STORE y, 1			STORE x, 1
> >
> > we observe: r1==1 && r2==1
> >
> > In order for that to be true; we must be able to observe the stores
> > before the loads are complete -- and therefore before the branches are a
> > certainty.
> 
> Even without the ".acq" and ".rel" this code is still safe.
> 
> Quoting ia64 SDM vol 1, Section 4.4.7 "Memory Access Ordering":
> "In addition, memory writes and flushes must observe control dependencies"
> 
> which I take to mean that the STORE can't be visible until we are certain of
> the outcome of the conditional.

Awesome!


* Re: Does Itanium permit speculative stores?
  2013-11-12 18:46     ` Luck, Tony
  2013-11-12 18:49       ` Peter Zijlstra
@ 2013-11-12 21:29       ` Paul E. McKenney
  1 sibling, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2013-11-12 21:29 UTC
  To: Luck, Tony; +Cc: Peter Zijlstra, linux-kernel

On Tue, Nov 12, 2013 at 06:46:20PM +0000, Luck, Tony wrote:
> > So the point we're having a discussion on is if any architecture has
> > visible speculative STORES and if there's an architecture that doesn't
> > have control dependencies.
> >
> > On the visible speculative STORES; can, if in the above example we have
> > regular loads/stores:
> >
> >  LOAD r1, x			LOAD r2, y
> >  IF (r1)			IF (r2)
> >	STORE y, 1			STORE x, 1
> >
> > we observe: r1==1 && r2==1
> >
> > In order for that to be true; we must be able to observe the stores
> > before the loads are complete -- and therefore before the branches are a
> > certainty.
> 
> Even without the ".acq" and ".rel" this code is still safe.
> 
> Quoting ia64 SDM vol 1, Section 4.4.7 "Memory Access Ordering":
> "In addition, memory writes and flushes must observe control dependencies"
> 
> which I take to mean that the STORE can't be visible until we are certain of
> the outcome of the conditional.

Even better!

							Thanx, Paul



* Re: Does Itanium permit speculative stores?
  2013-11-12 18:26   ` Peter Zijlstra
  2013-11-12 18:46     ` Luck, Tony
@ 2013-11-12 21:30     ` Paul E. McKenney
  1 sibling, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2013-11-12 21:30 UTC
  To: Peter Zijlstra; +Cc: Luck, Tony, linux-kernel

On Tue, Nov 12, 2013 at 07:26:17PM +0100, Peter Zijlstra wrote:
> On Tue, Nov 12, 2013 at 06:00:26PM +0000, Luck, Tony wrote:
> > > Does Itanium permit speculative stores?  For example, on Itanium what are
> > > the permitted outcomes of the following litmus test, where both x and y
> > > are initially zero?
> > 
> > We have a compiler-visible speculative read via the "ld.s" and "chk" instructions. But
> > there is no speculative write ("st.s") instruction.  I think you are asking "can out of order
> > writes become visible in this scenario?"
> > 
> > 	CPU 0				CPU 1
> > 
> > 	r1 = ACCESS_ONCE(x);		r2 = ACCESS_ONCE(y);
> > 	if (r1)				if (r2)
> > 		ACCESS_ONCE(y) = 1;		ACCESS_ONCE(x) = 1;
> > 
> > > In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium
> > > given this litmus test?
> > 
> > The "ACCESS_ONCE" macro casts to volatile - which will make gcc generate
> > ordered "ld.acq" and "st.rel" instructions for your code snippets. So I think
> > you should be fine.
> 
> Cute that volatile generates barrier instructions.
> 
> But no; I think Paul accidentally formulated his question in C (since we
> all speak C) but meant to ask an architectural question.

I got both answers, so I am good.  ;-)

> So the point we're having a discussion on is if any architecture has
> visible speculative STORES and if there's an architecture that doesn't
> have control dependencies.
> 
> On the visible speculative STORES; can, if in the above example we have
> regular loads/stores:
> 
>   LOAD r1, x			LOAD r2, y
>   IF (r1)			IF (r2)
> 	STORE y, 1			STORE x, 1
> 
> we observe: r1==1 && r2==1
> 
> In order for that to be true; we must be able to observe the stores
> before the loads are complete -- and therefore before the branches are a
> certainty.
> 
> Typically if an architecture speculates on branches the result doesn't
> become visible/committed until the branch is a certainty -- ie. linear
> branch history.
> 
> Alternatively:
> 
> 	x:=0
> 
> 	IF (cond)			LOAD r1,x
> 		STORE x,1
> 	STORE x,2
> 
> Can r1 ever be 1 if we know 'cond' will never be true (a runtime
> constraint, not a compile-time one, so the branch cannot be omitted)?

I would have been OK mandating use of ACCESS_ONCE() to prevent speculative
stores, but it is even nicer that it is not necessary.

							Thanx, Paul



* Re: Does Itanium permit speculative stores?
  2013-11-11 17:13 Does Itanium permit speculative stores? Paul E. McKenney
  2013-11-12 18:00 ` Luck, Tony
@ 2013-11-27  4:55 ` Jon Masters
  2013-11-27 17:19   ` Paul E. McKenney
  1 sibling, 1 reply; 11+ messages in thread
From: Jon Masters @ 2013-11-27  4:55 UTC
  To: paulmck; +Cc: tony.luck, peterz, linux-kernel

On 11/11/2013 12:13 PM, Paul E. McKenney wrote:
> Hello, Tony,
> 
> Does Itanium permit speculative stores?  For example, on Itanium what are
> the permitted outcomes of the following litmus test, where both x and y
> are initially zero?
> 
> 	CPU 0				CPU 1
> 
> 	r1 = ACCESS_ONCE(x);		r2 = ACCESS_ONCE(y);
> 	if (r1)				if (r2)
> 		ACCESS_ONCE(y) = 1;		ACCESS_ONCE(x) = 1;
> 
> In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium
> given this litmus test?
> 
> 							Thanx, Paul

Btw, I was reading through some UEFI docs and noticed a reference to "A
Formal Specification of Intel Itanium Processor Family Memory Ordering",
then remembered this thread. In case it's of use:

http://www.intel.com/design/itanium/downloads/251429.htm

Jon.



* Re: Does Itanium permit speculative stores?
  2013-11-27  4:55 ` Jon Masters
@ 2013-11-27 17:19   ` Paul E. McKenney
  0 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2013-11-27 17:19 UTC
  To: Jon Masters; +Cc: tony.luck, peterz, linux-kernel

On Tue, Nov 26, 2013 at 11:55:58PM -0500, Jon Masters wrote:
> On 11/11/2013 12:13 PM, Paul E. McKenney wrote:
> > Hello, Tony,
> > 
> > Does Itanium permit speculative stores?  For example, on Itanium what are
> > the permitted outcomes of the following litmus test, where both x and y
> > are initially zero?
> > 
> > 	CPU 0				CPU 1
> > 
> > 	r1 = ACCESS_ONCE(x);		r2 = ACCESS_ONCE(y);
> > 	if (r1)				if (r2)
> > 		ACCESS_ONCE(y) = 1;		ACCESS_ONCE(x) = 1;
> > 
> > In particular, is the outcome (r1 == 1 && r2 == 1) possible on Itanium
> > given this litmus test?
> > 
> > 							Thanx, Paul
> 
> Btw, I was reading through some UEFI docs and noticed a reference to "A
> Formal Specification of Intel Itanium Processor Family Memory Ordering",
> then remembered this thread. In case it's of use:
> 
> http://www.intel.com/design/itanium/downloads/251429.htm

I have seen this, but there have been too many times when I have fooled
myself about what the words mean (with DEC Alpha back in the late 90s
being the most impressive example).  So while I do learn what I can from
them, they are unfortunately not a substitute for asking.  ;-)

Besides, some of the Itanium locking code uses instructions that the
above manual is silent about.

							Thanx, Paul


