* how to understand cpu in-order commit
@ 2019-11-19 14:11 laokz
  2019-11-19 14:44 ` Paul E. McKenney
  0 siblings, 1 reply; 8+ messages in thread
From: laokz @ 2019-11-19 14:11 UTC (permalink / raw)
  To: paulmck; +Cc: perfbook mailing list

Hello Paul,

With the help of your perfbook, I can now understand basic memory-model
issues, such as the one in
tools/memory-model/litmus-tests/LB+fencembonceonce+ctrlonceonce.litmus:

P0(int *x, int *y)
{
	int r0;

	r0 = READ_ONCE(*x);
	if (r0)
		WRITE_ONCE(*y, 1);
}

P1(int *x, int *y)
{
	int r0;

	r0 = READ_ONCE(*y);
	smp_mb();
	WRITE_ONCE(*x, 1);
}

exists (0:r0=1 /\ 1:r0=1)

But I wonder about CPU in-order commit. So I removed smp_mb(), and the litmus
test showed that the result can exist! This confused me. In my understanding,
CPU in-order commit means that the results of out-of-order execution are
committed to registers/memory in compiled program order. That is, CPU P1 must
retire r0=y first and then x=1, and only then can P0 see P1's update of x. So
P1's r0 should never be 1.

Is this caused by LKMM's compatibility with out-of-order commit
architectures? Or what's wrong with me?

I would appreciate your kind help.
laokz





* Re: how to understand cpu in-order commit
  2019-11-19 14:11 how to understand cpu in-order commit laokz
@ 2019-11-19 14:44 ` Paul E. McKenney
  2019-11-19 15:45   ` laokz
  0 siblings, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2019-11-19 14:44 UTC (permalink / raw)
  To: laokz; +Cc: perfbook mailing list

On Tue, Nov 19, 2019 at 10:11:48PM +0800, laokz wrote:
> Hello Paul,
> 
> With the help of your perfbook, I can now understand basic memory-model
> issues, such as the one in
> tools/memory-model/litmus-tests/LB+fencembonceonce+ctrlonceonce.litmus:
> 
> P0(int *x, int *y)
> {
> 	int r0;
> 
> 	r0 = READ_ONCE(*x);
> 	if (r0)
> 		WRITE_ONCE(*y, 1);
> }
> 
> P1(int *x, int *y)
> {
> 	int r0;
> 
> 	r0 = READ_ONCE(*y);
> 	smp_mb();
> 	WRITE_ONCE(*x, 1);
> }
> 
> exists (0:r0=1 /\ 1:r0=1)
> 
> But I wonder about CPU in-order commit. So I removed smp_mb(), and the
> litmus test showed that the result can exist! This confused me. In my
> understanding, CPU in-order commit means that the results of out-of-order
> execution are committed to registers/memory in compiled program order.
> That is, CPU P1 must retire r0=y first and then x=1, and only then can P0
> see P1's update of x. So P1's r0 should never be 1.
> 
> Is this caused by LKMM's compatibility with out-of-order commit
> architectures? Or what's wrong with me?

Nothing is wrong with you.  You are just going through a common phase
in learning about memory models.  ;-)

So you modified P1() as follows, correct?

	P1(int *x, int *y)
	{
		int r0;

		r0 = READ_ONCE(*y);
		WRITE_ONCE(*x, 1);
	}

The compiler is free to rearrange this code as follows:

	P1(int *x, int *y)
	{
		int r0;

		WRITE_ONCE(*x, 1);
		r0 = READ_ONCE(*y);
	}

This can clearly satisfy the exists clause:  P1() does its write,
P0() does its read and its write, and finally P1() does its read.

But suppose we prevented the compiler from moving the code:

	P1(int *x, int *y)
	{
		int r0;

		r0 = READ_ONCE(*y);
		barrier();
		WRITE_ONCE(*x, 1);
	}

Then, as you say, weakly ordered CPUs might still reorder P1()'s
read and write.  So LKMM must still say that the exists clause is
satisfied.
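
The interleaving argument above can be checked mechanically.  What follows
is a minimal sketch in plain C, not a litmus test and not a description of
how herd7 or LKMM work.  It assumes a sequentially consistent shared memory
and enumerates every interleaving of the two threads, once with P1() in
program order and once with P1()'s store performed before its load.  The
function name lb_outcome_reachable() and its parameter are invented for
this illustration.

```c
#include <stdbool.h>

/*
 * Toy check of the LB litmus test (illustration only, not LKMM).
 * Returns true if some interleaving of P0 and P1 ends with
 * 0:r0=1 /\ 1:r0=1.  If p1_store_first is true, P1's store to x is
 * performed before its load from y, modeling compiler or CPU reordering.
 */
static bool lb_outcome_reachable(bool p1_store_first)
{
	/* Bit i of sched selects the thread run in slot i: 0 = P0, 1 = P1. */
	for (unsigned int sched = 0; sched < 16; sched++) {
		int x = 0, y = 0;
		int p0_r0 = 0, p1_r0 = 0;
		int p0_step = 0, p1_step = 0;

		for (int slot = 0; slot < 4; slot++) {
			bool run_p1 = (sched >> slot) & 1;

			if (!run_p1 && p0_step < 2) {
				if (p0_step == 0)
					p0_r0 = x;	/* r0 = READ_ONCE(*x); */
				else if (p0_r0)
					y = 1;		/* if (r0) WRITE_ONCE(*y, 1); */
				p0_step++;
			} else if (run_p1 && p1_step < 2) {
				/* The store is P1's first step only when reordered. */
				bool do_store = (p1_step == 0) == p1_store_first;

				if (do_store)
					x = 1;		/* WRITE_ONCE(*x, 1); */
				else
					p1_r0 = y;	/* r0 = READ_ONCE(*y); */
				p1_step++;
			}
		}

		/* Ignore schedules that did not run both threads to completion. */
		if (p0_step == 2 && p1_step == 2 && p0_r0 == 1 && p1_r0 == 1)
			return true;
	}
	return false;
}
```

Under these assumptions, lb_outcome_reachable(false) returns false and
lb_outcome_reachable(true) returns true: the exists clause needs P1()'s
store to take effect before its load, whatever the cause of that reordering.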

							Thanx, Paul


* Re: how to understand cpu in-order commit
  2019-11-19 14:44 ` Paul E. McKenney
@ 2019-11-19 15:45   ` laokz
  2019-11-19 21:11     ` Paul E. McKenney
  0 siblings, 1 reply; 8+ messages in thread
From: laokz @ 2019-11-19 15:45 UTC (permalink / raw)
  To: paulmck; +Cc: perfbook mailing list

Hello Paul,

On Tue, 2019-11-19 at 06:44 -0800, Paul E. McKenney wrote:
> On Tue, Nov 19, 2019 at 10:11:48PM +0800, laokz wrote:
> > But I wonder about CPU in-order commit. So I removed smp_mb(), and the
> > litmus test showed that the result can exist! This confused me. In my
> > understanding, CPU in-order commit means that the results of
> > out-of-order execution are committed to registers/memory in compiled
> > program order. That is, CPU P1 must retire r0=y first and then x=1, and
> > only then can P0 see P1's update of x. So P1's r0 should never be 1.
> > 
> > Is this caused by LKMM's compatibility with out-of-order commit
> > architectures? Or what's wrong with me?
> 
> Nothing is wrong with you.  You are just going through a common phase
> in learning about memory models.  ;-)
> 
> So you modified P1() as follows, correct?
> 
> 	P1(int *x, int *y)
> 	{
> 		int r0;
> 
> 		r0 = READ_ONCE(*y);
> 		WRITE_ONCE(*x, 1);
> 	}
> 
> The compiler is free to rearrange this code as follows:
> 
> 	P1(int *x, int *y)
> 	{
> 		int r0;
> 
> 		WRITE_ONCE(*x, 1);
> 		r0 = READ_ONCE(*y);
> 	}
> 
> This can clearly satisfy the exists clause:  P1() does its write,
> P0() does its read and its write, and finally P1() does its read.

The compiler is one big issue. It seems that compilers seldom obey the C11
standard strictly. "Accesses to volatile objects are evaluated strictly
according to the rules of the abstract machine" at least means that they
should not reorder such accesses across sequence points.

> But suppose we prevented the compiler from moving the code:
> 
> 	P1(int *x, int *y)
> 	{
> 		int r0;
> 
> 		r0 = READ_ONCE(*y);
> 		barrier();
> 		WRITE_ONCE(*x, 1);
> 	}
> 
> Then, as you say, weakly ordered CPUs might still reorder P1()'s
> read and write.  So LKMM must still say that the exists clause is
> satisfied.

Legacy resource is another big deal.

Thanks for your quick reply. It really cleared my head.

Best regards,
laokz





* Re: how to understand cpu in-order commit
  2019-11-19 15:45   ` laokz
@ 2019-11-19 21:11     ` Paul E. McKenney
  2019-11-20  3:03       ` laokz
  0 siblings, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2019-11-19 21:11 UTC (permalink / raw)
  To: laokz; +Cc: perfbook mailing list

On Tue, Nov 19, 2019 at 11:45:03PM +0800, laokz wrote:
> Hello Paul,
> 
> On Tue, 2019-11-19 at 06:44 -0800, Paul E. McKenney wrote:
> > On Tue, Nov 19, 2019 at 10:11:48PM +0800, laokz wrote:
> > > But I wonder about CPU in-order commit. So I removed smp_mb(), and the
> > > litmus test showed that the result can exist! This confused me. In my
> > > understanding, CPU in-order commit means that the results of
> > > out-of-order execution are committed to registers/memory in compiled
> > > program order. That is, CPU P1 must retire r0=y first and then x=1, and
> > > only then can P0 see P1's update of x. So P1's r0 should never be 1.
> > > 
> > > Is this caused by LKMM's compatibility with out-of-order commit
> > > architectures? Or what's wrong with me?
> > 
> > Nothing is wrong with you.  You are just going through a common phase
> > in learning about memory models.  ;-)
> > 
> > So you modified P1() as follows, correct?
> > 
> > 	P1(int *x, int *y)
> > 	{
> > 		int r0;
> > 
> > 		r0 = READ_ONCE(*y);
> > 		WRITE_ONCE(*x, 1);
> > 	}
> > 
> > The compiler is free to rearrange this code as follows:
> > 
> > 	P1(int *x, int *y)
> > 	{
> > 		int r0;
> > 
> > 		WRITE_ONCE(*x, 1);
> > 		r0 = READ_ONCE(*y);
> > 	}
> > 
> > This can clearly satisfy the exists clause:  P1() does its write,
> > P0() does its read and its write, and finally P1() does its read.
> 
> The compiler is one big issue. It seems that compilers seldom obey the C11
> standard strictly. "Accesses to volatile objects are evaluated strictly
> according to the rules of the abstract machine" at least means that they
> should not reorder such accesses across sequence points.

Yes, you are right, the volatile nature of READ_ONCE() and WRITE_ONCE()
would prohibit the reordering above.  On the other hand, the C++11
standard really does allow relaxed atomic loads and stores to be
reordered.  And since I was at a C++ standards committee meeting a few
weeks ago, I had relaxed atomics on my brain.  Apologies for my confusion.
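
The distinction can be sketched in userspace C11.  This is only an
illustration under stated assumptions: the READ_ONCE() and WRITE_ONCE()
definitions below are simplified stand-ins built from volatile casts (the
kernel's real macros are more elaborate), they rely on the GNU __typeof__
extension, and the function names p1_volatile() and p1_relaxed() are
invented for this example.

```c
#include <stdatomic.h>

/* Simplified stand-ins for the kernel macros (assumption, GNU C only). */
#define READ_ONCE(v)		(*(const volatile __typeof__(v) *)&(v))
#define WRITE_ONCE(v, val)	(*(volatile __typeof__(v) *)&(v) = (val))

/*
 * Volatile accesses: the compiler must emit the load from *y before the
 * store to *x, in program order.  The CPU may still reorder them.
 */
static int p1_volatile(int *x, int *y)
{
	int r0 = READ_ONCE(*y);

	WRITE_ONCE(*x, 1);
	return r0;
}

/*
 * Relaxed atomics: the two accesses are to different objects, so the
 * language permits them to be reordered, by the compiler or by the CPU.
 */
static int p1_relaxed(atomic_int *x, atomic_int *y)
{
	int r0 = atomic_load_explicit(y, memory_order_relaxed);

	atomic_store_explicit(x, 1, memory_order_relaxed);
	return r0;
}
```

Called from a single thread, the two functions behave identically; the
difference only shows in which orderings a concurrent observer is
permitted to see.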

> > But suppose we prevented the compiler from moving the code:
> > 
> > 	P1(int *x, int *y)
> > 	{
> > 		int r0;
> > 
> > 		r0 = READ_ONCE(*y);
> > 		barrier();
> > 		WRITE_ONCE(*x, 1);
> > 	}
> > 
> > Then, as you say, weakly ordered CPUs might still reorder P1()'s
> > read and write.  So LKMM must still say that the exists clause is
> > satisfied.
> 
> Legacy resource is another big deal.

I will let you argue the "Legacy resource" point with the vendors still
selling weakly ordered CPUs.  ;-)

> Thanks for your quick reply. It really cleared my head.

;-) ;-) ;-)

							Thanx, Paul


* Re: how to understand cpu in-order commit
  2019-11-19 21:11     ` Paul E. McKenney
@ 2019-11-20  3:03       ` laokz
  0 siblings, 0 replies; 8+ messages in thread
From: laokz @ 2019-11-20  3:03 UTC (permalink / raw)
  To: paulmck; +Cc: perfbook mailing list

Hello Paul,

Thank you so much for all the information.

Best regards,
laokz


* Re: how to understand cpu in-order commit
       [not found]   ` <20200530151057.8E808206A1@mail.kernel.org>
@ 2020-05-31 17:01     ` Paul E. McKenney
  0 siblings, 0 replies; 8+ messages in thread
From: Paul E. McKenney @ 2020-05-31 17:01 UTC (permalink / raw)
  To: laokz; +Cc: perfbook

On Sat, May 30, 2020 at 11:09:42PM +0800, laokz wrote:
> Hi Paul,
> 
> Many thanks for shedding light on this!
> 
> On 2020-05-30 Sat 05:43 -0700,Paul E. McKenney wrote:
> > On Sat, May 30, 2020 at 06:36:37PM +0800, laokz wrote:
> > > Hello Paul,
> > > 
> > > > This is a somewhat longer story. I am still searching and stuck in
> > > > the mist. :-) I hope to get some light from you. Thanks!
> > > > 
> > > > I commented out smp_mb() in
> > > > tools/memory-model/litmus-tests/LB+fencembonceonce+ctrlonceonce.litmus.
> > > 
> > > P0(int *x, int *y)
> > > {
> > > 	int r0;
> > > 
> > > 	r0 = READ_ONCE(*x);
> > > 	if (r0)
> > > 		WRITE_ONCE(*y, 1);
> > > }
> > > 
> > > P1(int *x, int *y)
> > > {
> > > 	int r0;
> > > 
> > > 	r0 = READ_ONCE(*y);
> > > //	smp_mb();
> > > 	WRITE_ONCE(*x, 1);
> > > }
> > > 
> > > > And I am confused that LKMM says P0:r0=1 /\ P1:r0=1 can exist.
> > > > 
> > > > I want to clear up these questions:
> > > > 
> > > > 1. Is there any 'out-of-order commit/retirement' CPU among the
> > > > Linux-supported architectures? If yes, which one? In that case, the
> > > > following question is trivial.
> > 
> > The powerpc architecture allows prior reads to be reordered with
> > subsequent writes.  To see this, point your browser here:
> > 
> > 	https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html#PPC
> > 
> > And "Select POWER Test" LB -> ctrl+po.  You will then have this:
> > 
> > 	PC LB+ctrl+po
> > 	"DpCtrldW Rfe PodRW Rfe"
> > 	Cycle=Rfe PodRW Rfe DpCtrldW
> > 	{
> > 	0:r2=x; 0:r4=y;
> > 	1:r2=y; 1:r4=x;
> > 	}
> > 	 P0           | P1           ;
> > 	 lwz r1,0(r2) | lwz r1,0(r2) ;
> > 	 cmpw r1,r1   | li r3,1      ;
> > 	 beq  LC00    | stw r3,0(r4) ;
> > 	 LC00:        |              ;
> > 	 li r3,1      |              ;
> > 	 stw r3,0(r4) |              ;
> > 	exists
> > 	(0:r1=1 /\ 1:r1=1)
> > 
> > You then should be able to easily force the P0:r0=1 /\ P1:r0=1 after
> > clicking on the "Interactive" button.  (Hint: First commit Thread 1's
> > "li" instruction, then its "stw" instruction, then all of Thread 0's
> > instructions, and then Thread 1's remaining "lwz" instruction.)
> 
> I followed your pointer. Yes, it showed the same result as the litmus test
> I was asking about.
> 
> > > > 2. READ_ONCE() and WRITE_ONCE() ensure that the compiler respects
> > > > program order. If P0:r0=1, then P0 must have observed P1's write to x
> > > > (which is earlier in wall-clock time than P0's read). If P1's write to
> > > > x has happened (committed, and so visible to the outside), then P1's
> > > > read from y must have happened before it, because of the CPU's
> > > > in-order commit/retirement restriction (so it is earlier in wall-clock
> > > > time than P1's write). Then how can P1:r0, the earliest access of all,
> > > > get the value 1?
> > 
> > On powerpc architecture, it can.  But don't take my word for it, try
> > it out on the website listed above.  ;-)
> 
> On this website, I found
> https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/pldi105-sarkar.pdf. It gave
> me a clue in section 8 on page 11:
> 
>     "Specifically: the model allows instructions to **commit out of
>     program order**, which permits the LB and LB+rs test outcomes (not
>     observed in practice);..."
> 
> That sounds reasonable to me. Let me try to conclude: if a CPU implements
> in-order commit, then the result of the test I asked about (after
> commenting out P1's smp_mb()) is forbidden. Is that right?

The CPUs are quite a bit more complicated than that, and there are a lot
of ways that things can get out of order.  One mechanism is, as you say,
instruction commit order.  Another is the store buffer.  Yet another is
invalidation queues.  A fourth is the cache coherence protocol.

Appendix C of "Is Parallel Programming Hard, And, If So, What Can You
Do About It?" gives more details.

https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html
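
As a rough illustration of one of these mechanisms, here is a toy model of
a per-CPU store buffer in plain C.  It is a sketch under strong simplifying
assumptions: one buffered value per variable rather than a FIFO queue, no
caches, and an explicit flush.  All names (toy_cpu, toy_store(),
toy_load(), toy_flush()) are invented for this example.  It shows only
that a store can be visible to the CPU that issued it before other CPUs
can see it; by itself this does not produce the load-buffering outcome
discussed in this thread.

```c
#include <stdbool.h>

#define TOY_NVARS 2	/* variable 0 plays the role of x, variable 1 of y */

static int toy_memory[TOY_NVARS];	/* shared memory, initially zero */

struct toy_cpu {
	int  sb_val[TOY_NVARS];		/* buffered value for each variable */
	bool sb_valid[TOY_NVARS];	/* true if a store is still pending */
};

/* A store goes into the CPU's own buffer, not yet into shared memory. */
static void toy_store(struct toy_cpu *cpu, int var, int val)
{
	cpu->sb_val[var] = val;
	cpu->sb_valid[var] = true;
}

/* A load sees the CPU's own pending store first, else shared memory. */
static int toy_load(const struct toy_cpu *cpu, int var)
{
	return cpu->sb_valid[var] ? cpu->sb_val[var] : toy_memory[var];
}

/* Draining the buffer makes the pending stores visible to all CPUs. */
static void toy_flush(struct toy_cpu *cpu)
{
	for (int var = 0; var < TOY_NVARS; var++) {
		if (cpu->sb_valid[var]) {
			toy_memory[var] = cpu->sb_val[var];
			cpu->sb_valid[var] = false;
		}
	}
}
```

With two zero-initialized toy_cpu structures, a toy_store() on the first
is seen by that CPU's own toy_load() right away, but by the second CPU's
toy_load() only after the first CPU's toy_flush().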

							Thanx, Paul


* Re: how to understand cpu in-order commit
  2020-05-30 10:36 laokz
@ 2020-05-30 12:43 ` Paul E. McKenney
       [not found]   ` <20200530151057.8E808206A1@mail.kernel.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2020-05-30 12:43 UTC (permalink / raw)
  To: laokz; +Cc: perfbook

On Sat, May 30, 2020 at 06:36:37PM +0800, laokz wrote:
> Hello Paul,
> 
> This is a somewhat longer story. I am still searching and stuck in the
> mist. :-) I hope to get some light from you. Thanks!
> 
> I commented out smp_mb() in
> tools/memory-model/litmus-tests/LB+fencembonceonce+ctrlonceonce.litmus.
> 
> P0(int *x, int *y)
> {
> 	int r0;
> 
> 	r0 = READ_ONCE(*x);
> 	if (r0)
> 		WRITE_ONCE(*y, 1);
> }
> 
> P1(int *x, int *y)
> {
> 	int r0;
> 
> 	r0 = READ_ONCE(*y);
> //	smp_mb();
> 	WRITE_ONCE(*x, 1);
> }
> 
> And I am confused that LKMM says P0:r0=1 /\ P1:r0=1 can exist.
> 
> I want to clear up these questions:
> 
> 1. Is there any 'out-of-order commit/retirement' CPU among the
> Linux-supported architectures? If yes, which one? In that case, the
> following question is trivial.

The powerpc architecture allows prior reads to be reordered with
subsequent writes.  To see this, point your browser here:

	https://www.cl.cam.ac.uk/~pes20/ppcmem/index.html#PPC

And "Select POWER Test" LB -> ctrl+po.  You will then have this:

	PC LB+ctrl+po
	"DpCtrldW Rfe PodRW Rfe"
	Cycle=Rfe PodRW Rfe DpCtrldW
	{
	0:r2=x; 0:r4=y;
	1:r2=y; 1:r4=x;
	}
	 P0           | P1           ;
	 lwz r1,0(r2) | lwz r1,0(r2) ;
	 cmpw r1,r1   | li r3,1      ;
	 beq  LC00    | stw r3,0(r4) ;
	 LC00:        |              ;
	 li r3,1      |              ;
	 stw r3,0(r4) |              ;
	exists
	(0:r1=1 /\ 1:r1=1)

You then should be able to easily force the P0:r0=1 /\ P1:r0=1 after
clicking on the "Interactive" button.  (Hint: First commit Thread 1's
"li" instruction, then its "stw" instruction, then all of Thread 0's
instructions, and then Thread 1's remaining "lwz" instruction.)

> 2. READ_ONCE() and WRITE_ONCE() ensure that the compiler respects program
> order. If P0:r0=1, then P0 must have observed P1's write to x (which is
> earlier in wall-clock time than P0's read). If P1's write to x has
> happened (committed, and so visible to the outside), then P1's read from
> y must have happened before it, because of the CPU's in-order
> commit/retirement restriction (so it is earlier in wall-clock time than
> P1's write). Then how can P1:r0, the earliest access of all, get the
> value 1?

On powerpc architecture, it can.  But don't take my word for it, try
it out on the website listed above.  ;-)

							Thanx, Paul

> Thanks again,
> laokz
> 

* Re: how to understand cpu in-order commit
@ 2020-05-30 10:36 laokz
  2020-05-30 12:43 ` Paul E. McKenney
  0 siblings, 1 reply; 8+ messages in thread
From: laokz @ 2020-05-30 10:36 UTC (permalink / raw)
  To: paulmck; +Cc: perfbook

Hello Paul,

This is a somewhat longer story. I am still searching and stuck in the
mist. :-) I hope to get some light from you. Thanks!

I commented out smp_mb() in
tools/memory-model/litmus-tests/LB+fencembonceonce+ctrlonceonce.litmus.

P0(int *x, int *y)
{
	int r0;

	r0 = READ_ONCE(*x);
	if (r0)
		WRITE_ONCE(*y, 1);
}

P1(int *x, int *y)
{
	int r0;

	r0 = READ_ONCE(*y);
//	smp_mb();
	WRITE_ONCE(*x, 1);
}

And I am confused that LKMM says P0:r0=1 /\ P1:r0=1 can exist.

I want to clear up these questions:

1. Is there any 'out-of-order commit/retirement' CPU among the
Linux-supported architectures? If yes, which one? In that case, the
following question is trivial.

2. READ_ONCE() and WRITE_ONCE() ensure that the compiler respects program
order. If P0:r0=1, then P0 must have observed P1's write to x (which is
earlier in wall-clock time than P0's read). If P1's write to x has happened
(committed, and so visible to the outside), then P1's read from y must have
happened before it, because of the CPU's in-order commit/retirement
restriction (so it is earlier in wall-clock time than P1's write). Then how
can P1:r0, the earliest access of all, get the value 1?

Thanks again,
laokz

