All of lore.kernel.org
 help / color / mirror / Atom feed
* Is WRITE_ONCE() enough to prevent invention of stores?
@ 2017-09-16 11:01 Akira Yokosawa
  2017-09-17  1:07 ` Paul E. McKenney
  0 siblings, 1 reply; 15+ messages in thread
From: Akira Yokosawa @ 2017-09-16 11:01 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

Hi Paul,

I'm a bit disturbed by the description in Section 14.3.1 "Memory-Reference
Restrictions" quoted below:

> Oddly enough, the compiler is within its rights to use a variable
> as temporary storage just before a store to that variable, thus
> inventing stores to that variable.
> Fortunately, most compilers avoid this sort of thing, at least outside
> of stack variables.
> Nevertheless, using WRITE_ONCE() (or declaring the variable
> volatile) should prevent this sort of thing.
> But take care: If you have a translation unit that uses that variable,
> and never makes a volatile access to it, the compiler has no way of
> knowing that it needs to be careful.

I'm wondering if using WRITE_ONCE() in a translation unit is really
enough to prevent invention of stores.

Accessing via a volatile-cast pointer guarantees the access is not
optimized out (and hopefully the referenced value is respected).

But I suspect that it has any effect in preventing invention of extra
loads/stores.

Isn't declaring the variable volatile necessary for the guarantee?

In practice, as is described in the above quote: "Fortunately, most
compilers avoid this sort of thing, at least outside of stack variables",
we can assume non-volatile shared variables are not spilled out to
the variables themselves as far as GCC/LLVM is concerned.
But this is compiler dependent, I suppose.

      Thanks, Akira


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Is WRITE_ONCE() enough to prevent invention of stores?
  2017-09-16 11:01 Is WRITE_ONCE() enough to prevent invention of stores? Akira Yokosawa
@ 2017-09-17  1:07 ` Paul E. McKenney
  2017-09-17 11:04   ` Akira Yokosawa
  0 siblings, 1 reply; 15+ messages in thread
From: Paul E. McKenney @ 2017-09-17  1:07 UTC (permalink / raw)
  To: Akira Yokosawa; +Cc: perfbook

On Sat, Sep 16, 2017 at 08:01:45PM +0900, Akira Yokosawa wrote:
> Hi Paul,
> 
> I'm a bit disturbed by the description in Section 14.3.1 "Memory-Reference
> Restrictions" quoted below:
> 
> > Oddly enough, the compiler is within its rights to use a variable
> > as temporary storage just before a store to that variable, thus
> > inventing stores to that variable.
> > Fortunately, most compilers avoid this sort of thing, at least outside
> > of stack variables.
> > Nevertheless, using WRITE_ONCE() (or declaring the variable
> > volatile) should prevent this sort of thing.
> > But take care: If you have a translation unit that uses that variable,
> > and never makes a volatile access to it, the compiler has no way of
> > knowing that it needs to be careful.
> 
> I'm wondering if using WRITE_ONCE() in a translation unit is really
> enough to prevent invention of stores.
> 
> Accessing via a volatile-cast pointer guarantees the access is not
> optimized out (and hopefully the referenced value is respected).
> 
> But I suspect that it has any effect in preventing invention of extra
> loads/stores.
> 
> Isn't declaring the variable volatile necessary for the guarantee?
> 
> In practice, as is described in the above quote: "Fortunately, most
> compilers avoid this sort of thing, at least outside of stack variables",
> we can assume non-volatile shared variables are not spilled out to
> the variables themselves as far as GCC/LLVM is concerned.
> But this is compiler dependent, I suppose.

I suspect that it will turn out to be impossible for the compiler to
actually invent these stores in the general case.  For example, it might
be that there is some lock held or other synchronization mechanism unknown
to the compiler that prevents this behavior.  But I haven't fully worked
this out yet.

But I do know that if you just do plain stores, the compiler is fully
within its rights to invent stores preceding any given plain store.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Is WRITE_ONCE() enough to prevent invention of stores?
  2017-09-17  1:07 ` Paul E. McKenney
@ 2017-09-17 11:04   ` Akira Yokosawa
  2017-09-17 21:55     ` Paul E. McKenney
  0 siblings, 1 reply; 15+ messages in thread
From: Akira Yokosawa @ 2017-09-17 11:04 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

On 2017/09/16 18:07:30 -0700, Paul E. McKenney wrote:
> On Sat, Sep 16, 2017 at 08:01:45PM +0900, Akira Yokosawa wrote:
>> Hi Paul,
>>
>> I'm a bit disturbed by the description in Section 14.3.1 "Memory-Reference
>> Restrictions" quoted below:
>>
>>> Oddly enough, the compiler is within its rights to use a variable
>>> as temporary storage just before a store to that variable, thus
>>> inventing stores to that variable.
>>> Fortunately, most compilers avoid this sort of thing, at least outside
>>> of stack variables.
>>> Nevertheless, using WRITE_ONCE() (or declaring the variable
>>> volatile) should prevent this sort of thing.
>>> But take care: If you have a translation unit that uses that variable,
>>> and never makes a volatile access to it, the compiler has no way of
>>> knowing that it needs to be careful.
>>
>> I'm wondering if using WRITE_ONCE() in a translation unit is really
>> enough to prevent invention of stores.
>>
>> Accessing via a volatile-cast pointer guarantees the access is not
>> optimized out (and hopefully the referenced value is respected).
>>
>> But I suspect that it has any effect in preventing invention of extra
>> loads/stores.
>>
>> Isn't declaring the variable volatile necessary for the guarantee?
>>
>> In practice, as is described in the above quote: "Fortunately, most
>> compilers avoid this sort of thing, at least outside of stack variables",
>> we can assume non-volatile shared variables are not spilled out to
>> the variables themselves as far as GCC/LLVM is concerned.
>> But this is compiler dependent, I suppose.
> 
> I suspect that it will turn out to be impossible for the compiler to
> actually invent these stores in the general case.  For example, it might
> be that there is some lock held or other synchronization mechanism unknown
> to the compiler that prevents this behavior.  But I haven't fully worked
> this out yet.

You mean the invented stores wouldn't be visible from other threads anyway?
In a meaningful parallel code, that can be the case.

> 
> But I do know that if you just do plain stores, the compiler is fully
> within its rights to invent stores preceding any given plain store.

So, the rules to use WRITE_ONCE() is something like the following?

---
1) Declare the variable without volatile.
2) READ_ONCE() and plain loads can be mixed. A plain load will see
   a value at least newer than or equal to the one obtained at the
   program-order most recent READ_ONCE().
3) WRITE_ONCE() should not be mixed with plain stores when invention
   of stores is to be avoided.

Invention of stores is the opposite of fusing stores.
Suppose you don't want to update progress in the while loop:

	while (!am_done()) {
	  do_something(p);
	  tmp++;
	}
	progress = tmp;

The compiler might transform this to

	while (!am_done()) {
	  do_something(p);
	  progress++;
	}

if it wants to avoid allocation of a register/stack to tmp for whatever
reason. WRITE_ONCE() prevents the unintended accesses of progress:

	while (!am_done()) {
	  do_something(p);
	  tmp++;
	}
	WRITE_ONCE(progress, tmp);

---
Adding this example in the text might be too verbose.
Would a Quick Quiz be reasonable?

        Thanks, Akira

> 
> 							Thanx, Paul
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Is WRITE_ONCE() enough to prevent invention of stores?
  2017-09-17 11:04   ` Akira Yokosawa
@ 2017-09-17 21:55     ` Paul E. McKenney
  2017-09-17 22:51       ` Akira Yokosawa
  0 siblings, 1 reply; 15+ messages in thread
From: Paul E. McKenney @ 2017-09-17 21:55 UTC (permalink / raw)
  To: Akira Yokosawa; +Cc: perfbook

On Sun, Sep 17, 2017 at 08:04:21PM +0900, Akira Yokosawa wrote:
> On 2017/09/16 18:07:30 -0700, Paul E. McKenney wrote:
> > On Sat, Sep 16, 2017 at 08:01:45PM +0900, Akira Yokosawa wrote:
> >> Hi Paul,
> >>
> >> I'm a bit disturbed by the description in Section 14.3.1 "Memory-Reference
> >> Restrictions" quoted below:
> >>
> >>> Oddly enough, the compiler is within its rights to use a variable
> >>> as temporary storage just before a store to that variable, thus
> >>> inventing stores to that variable.
> >>> Fortunately, most compilers avoid this sort of thing, at least outside
> >>> of stack variables.
> >>> Nevertheless, using WRITE_ONCE() (or declaring the variable
> >>> volatile) should prevent this sort of thing.
> >>> But take care: If you have a translation unit that uses that variable,
> >>> and never makes a volatile access to it, the compiler has no way of
> >>> knowing that it needs to be careful.
> >>
> >> I'm wondering if using WRITE_ONCE() in a translation unit is really
> >> enough to prevent invention of stores.
> >>
> >> Accessing via a volatile-cast pointer guarantees the access is not
> >> optimized out (and hopefully the referenced value is respected).
> >>
> >> But I suspect that it has any effect in preventing invention of extra
> >> loads/stores.
> >>
> >> Isn't declaring the variable volatile necessary for the guarantee?
> >>
> >> In practice, as is described in the above quote: "Fortunately, most
> >> compilers avoid this sort of thing, at least outside of stack variables",
> >> we can assume non-volatile shared variables are not spilled out to
> >> the variables themselves as far as GCC/LLVM is concerned.
> >> But this is compiler dependent, I suppose.
> > 
> > I suspect that it will turn out to be impossible for the compiler to
> > actually invent these stores in the general case.  For example, it might
> > be that there is some lock held or other synchronization mechanism unknown
> > to the compiler that prevents this behavior.  But I haven't fully worked
> > this out yet.
> 
> You mean the invented stores wouldn't be visible from other threads anyway?
> In a meaningful parallel code, that can be the case.

I mean that it is very hard to prove that inventing a store isn't introducing
a data race, which would be a violation of the standard.  The one case I know
of where the compiler can be sure that it is within its rights to invent the
store is before a normal store to a variable.

Otherwise, it might be (for example) that one must hold a lock to legally
update a given variable, and that lock might or might not be held at a given
point in the code.  But if the compiler sees a plain store, the compiler
knows that it is OK to update at that point.  So the compiler can invent
a store prior to the existing store, as long as there is no memory barrier,
compiler barrier, lock acquisition/release, atomic operation, etc., between
the original store and the compiler's invented store.

> > But I do know that if you just do plain stores, the compiler is fully
> > within its rights to invent stores preceding any given plain store.
> 
> So, the rules to use WRITE_ONCE() is something like the following?
> 
> ---
> 1) Declare the variable without volatile.

Agreed.

> 2) READ_ONCE() and plain loads can be mixed. A plain load will see
>    a value at least newer than or equal to the one obtained at the
>    program-order most recent READ_ONCE().

I am not entirely sure of this one.  But if there is a barrier() or
stronger between the READ_ONCE() and the plain load, then yes.

> 3) WRITE_ONCE() should not be mixed with plain stores when invention
>    of stores is to be avoided.

Agreed.

> Invention of stores is the opposite of fusing stores.
> Suppose you don't want to update progress in the while loop:
> 
> 	while (!am_done()) {
> 	  do_something(p);
> 	  tmp++;
> 	}
> 	progress = tmp;
> 
> The compiler might transform this to
> 
> 	while (!am_done()) {
> 	  do_something(p);
> 	  progress++;
> 	}

But only as long as the compiler knows that do_something() doesn't
contain any ordering directives.

> if it wants to avoid allocation of a register/stack to tmp for whatever
> reason. WRITE_ONCE() prevents the unintended accesses of progress:
> 
> 	while (!am_done()) {
> 	  do_something(p);
> 	  tmp++;
> 	}
> 	WRITE_ONCE(progress, tmp);

Agreed, this would prevent the update to "progress" from being pulled
into the loop.

> ---
> Adding this example in the text might be too verbose.
> Would a Quick Quiz be reasonable?

Might be good in the section on protecting memory references, and putting
it into a quick quiz or two makes a lot of sense.

							Thanx, Paul

>         Thanks, Akira
> 
> > 
> > 							Thanx, Paul
> > 
> > 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Is WRITE_ONCE() enough to prevent invention of stores?
  2017-09-17 21:55     ` Paul E. McKenney
@ 2017-09-17 22:51       ` Akira Yokosawa
  2017-10-30 18:14         ` Paul E. McKenney
  0 siblings, 1 reply; 15+ messages in thread
From: Akira Yokosawa @ 2017-09-17 22:51 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

On 2017/09/17 14:55:08 -0700, Paul E. McKenney wrote:
> On Sun, Sep 17, 2017 at 08:04:21PM +0900, Akira Yokosawa wrote:
>> On 2017/09/16 18:07:30 -0700, Paul E. McKenney wrote:
>>> On Sat, Sep 16, 2017 at 08:01:45PM +0900, Akira Yokosawa wrote:
>>>> Hi Paul,
>>>>
>>>> I'm a bit disturbed by the description in Section 14.3.1 "Memory-Reference
>>>> Restrictions" quoted below:
>>>>
>>>>> Oddly enough, the compiler is within its rights to use a variable
>>>>> as temporary storage just before a store to that variable, thus
>>>>> inventing stores to that variable.
>>>>> Fortunately, most compilers avoid this sort of thing, at least outside
>>>>> of stack variables.
>>>>> Nevertheless, using WRITE_ONCE() (or declaring the variable
>>>>> volatile) should prevent this sort of thing.
>>>>> But take care: If you have a translation unit that uses that variable,
>>>>> and never makes a volatile access to it, the compiler has no way of
>>>>> knowing that it needs to be careful.
>>>>
>>>> I'm wondering if using WRITE_ONCE() in a translation unit is really
>>>> enough to prevent invention of stores.
>>>>
>>>> Accessing via a volatile-cast pointer guarantees the access is not
>>>> optimized out (and hopefully the referenced value is respected).
>>>>
>>>> But I suspect that it has any effect in preventing invention of extra
>>>> loads/stores.
>>>>
>>>> Isn't declaring the variable volatile necessary for the guarantee?
>>>>
>>>> In practice, as is described in the above quote: "Fortunately, most
>>>> compilers avoid this sort of thing, at least outside of stack variables",
>>>> we can assume non-volatile shared variables are not spilled out to
>>>> the variables themselves as far as GCC/LLVM is concerned.
>>>> But this is compiler dependent, I suppose.
>>>
>>> I suspect that it will turn out to be impossible for the compiler to
>>> actually invent these stores in the general case.  For example, it might
>>> be that there is some lock held or other synchronization mechanism unknown
>>> to the compiler that prevents this behavior.  But I haven't fully worked
>>> this out yet.
>>
>> You mean the invented stores wouldn't be visible from other threads anyway?
>> In a meaningful parallel code, that can be the case.
> 
> I mean that it is very hard to prove that inventing a store isn't introducing
> a data race, which would be a violation of the standard.  The one case I know
> of where the compiler can be sure that it is within its rights to invent the
> store is before a normal store to a variable.
> 
> Otherwise, it might be (for example) that one must hold a lock to legally
> update a given variable, and that lock might or might not be held at a given
> point in the code.  But if the compiler sees a plain store, the compiler
> knows that it is OK to update at that point.  So the compiler can invent
> a store prior to the existing store, as long as there is no memory barrier,
> compiler barrier, lock acquisition/release, atomic operation, etc., between
> the original store and the compiler's invented store.

I think I understand.

> 
>>> But I do know that if you just do plain stores, the compiler is fully
>>> within its rights to invent stores preceding any given plain store.
>>
>> So, the rules to use WRITE_ONCE() is something like the following?
>>
>> ---
>> 1) Declare the variable without volatile.
> 
> Agreed.
> 
>> 2) READ_ONCE() and plain loads can be mixed. A plain load will see
>>    a value at least newer than or equal to the one obtained at the
>>    program-order most recent READ_ONCE().
> 
> I am not entirely sure of this one.  But if there is a barrier() or
> stronger between the READ_ONCE() and the plain load, then yes.

Ah, the compiler can reorder plain loads before READ_ONCE()...

I did a litmus test of a plain load after READ_ONCE(), but
such a reordering is not covered by herd7's litmus test, is it?

> 
>> 3) WRITE_ONCE() should not be mixed with plain stores when invention
>>    of stores is to be avoided.
> 
> Agreed.
> 
>> Invention of stores is the opposite of fusing stores.
>> Suppose you don't want to update progress in the while loop:
>>
>> 	while (!am_done()) {
>> 	  do_something(p);
>> 	  tmp++;
>> 	}
>> 	progress = tmp;
>>
>> The compiler might transform this to
>>
>> 	while (!am_done()) {
>> 	  do_something(p);
>> 	  progress++;
>> 	}
> 
> But only as long as the compiler knows that do_something() doesn't
> contain any ordering directives.

Yes. I borrowed the fusing example in the text and it should have
the same assumption.

> 
>> if it wants to avoid allocation of a register/stack to tmp for whatever
>> reason. WRITE_ONCE() prevents the unintended accesses of progress:
>>
>> 	while (!am_done()) {
>> 	  do_something(p);
>> 	  tmp++;
>> 	}
>> 	WRITE_ONCE(progress, tmp);
> 
> Agreed, this would prevent the update to "progress" from being pulled
> into the loop.
> 
>> ---
>> Adding this example in the text might be too verbose.
>> Would a Quick Quiz be reasonable?
> 
> Might be good in the section on protecting memory references, and putting
> it into a quick quiz or two makes a lot of sense.

It's up to you where to put it.

And I now realize using READ_ONCE() and WRITE_ONCE() is quite tricky.
Missing one might not cause a problem today, but a smarter compiler
can expose the bug in the future...

This is scary.

         Thanks, Akira

> 
> 							Thanx, Paul
> 
>>         Thanks, Akira
>>
>>>
>>> 							Thanx, Paul
>>>
>>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Is WRITE_ONCE() enough to prevent invention of stores?
  2017-09-17 22:51       ` Akira Yokosawa
@ 2017-10-30 18:14         ` Paul E. McKenney
  2017-10-31  3:03           ` Yubin Ruan
  2017-10-31 15:36           ` Akira Yokosawa
  0 siblings, 2 replies; 15+ messages in thread
From: Paul E. McKenney @ 2017-10-30 18:14 UTC (permalink / raw)
  To: Akira Yokosawa; +Cc: perfbook

On Mon, Sep 18, 2017 at 07:51:29AM +0900, Akira Yokosawa wrote:
> On 2017/09/17 14:55:08 -0700, Paul E. McKenney wrote:
> > On Sun, Sep 17, 2017 at 08:04:21PM +0900, Akira Yokosawa wrote:
> >> On 2017/09/16 18:07:30 -0700, Paul E. McKenney wrote:
> >>> On Sat, Sep 16, 2017 at 08:01:45PM +0900, Akira Yokosawa wrote:
> >>>> Hi Paul,
> >>>>
> >>>> I'm a bit disturbed by the description in Section 14.3.1 "Memory-Reference
> >>>> Restrictions" quoted below:
> >>>>
> >>>>> Oddly enough, the compiler is within its rights to use a variable
> >>>>> as temporary storage just before a store to that variable, thus
> >>>>> inventing stores to that variable.
> >>>>> Fortunately, most compilers avoid this sort of thing, at least outside
> >>>>> of stack variables.
> >>>>> Nevertheless, using WRITE_ONCE() (or declaring the variable
> >>>>> volatile) should prevent this sort of thing.
> >>>>> But take care: If you have a translation unit that uses that variable,
> >>>>> and never makes a volatile access to it, the compiler has no way of
> >>>>> knowing that it needs to be careful.
> >>>>
> >>>> I'm wondering if using WRITE_ONCE() in a translation unit is really
> >>>> enough to prevent invention of stores.
> >>>>
> >>>> Accessing via a volatile-cast pointer guarantees the access is not
> >>>> optimized out (and hopefully the referenced value is respected).
> >>>>
> >>>> But I suspect that it has any effect in preventing invention of extra
> >>>> loads/stores.
> >>>>
> >>>> Isn't declaring the variable volatile necessary for the guarantee?
> >>>>
> >>>> In practice, as is described in the above quote: "Fortunately, most
> >>>> compilers avoid this sort of thing, at least outside of stack variables",
> >>>> we can assume non-volatile shared variables are not spilled out to
> >>>> the variables themselves as far as GCC/LLVM is concerned.
> >>>> But this is compiler dependent, I suppose.
> >>>
> >>> I suspect that it will turn out to be impossible for the compiler to
> >>> actually invent these stores in the general case.  For example, it might
> >>> be that there is some lock held or other synchronization mechanism unknown
> >>> to the compiler that prevents this behavior.  But I haven't fully worked
> >>> this out yet.
> >>
> >> You mean the invented stores wouldn't be visible from other threads anyway?
> >> In a meaningful parallel code, that can be the case.
> > 
> > I mean that it is very hard to prove that inventing a store isn't introducing
> > a data race, which would be a violation of the standard.  The one case I know
> > of where the compiler can be sure that it is within its rights to invent the
> > store is before a normal store to a variable.
> > 
> > Otherwise, it might be (for example) that one must hold a lock to legally
> > update a given variable, and that lock might or might not be held at a given
> > point in the code.  But if the compiler sees a plain store, the compiler
> > knows that it is OK to update at that point.  So the compiler can invent
> > a store prior to the existing store, as long as there is no memory barrier,
> > compiler barrier, lock acquisition/release, atomic operation, etc., between
> > the original store and the compiler's invented store.
> 
> I think I understand.
> 
> > 
> >>> But I do know that if you just do plain stores, the compiler is fully
> >>> within its rights to invent stores preceding any given plain store.
> >>
> >> So, the rules to use WRITE_ONCE() is something like the following?
> >>
> >> ---
> >> 1) Declare the variable without volatile.
> > 
> > Agreed.
> > 
> >> 2) READ_ONCE() and plain loads can be mixed. A plain load will see
> >>    a value at least newer than or equal to the one obtained at the
> >>    program-order most recent READ_ONCE().
> > 
> > I am not entirely sure of this one.  But if there is a barrier() or
> > stronger between the READ_ONCE() and the plain load, then yes.
> 
> Ah, the compiler can reorder plain loads before READ_ONCE()...
> 
> I did a litmus test of a plain load after READ_ONCE(), but
> such a reordering is not covered by herd7's litmus test, is it?
> 
> > 
> >> 3) WRITE_ONCE() should not be mixed with plain stores when invention
> >>    of stores is to be avoided.
> > 
> > Agreed.
> > 
> >> Invention of stores is the opposite of fusing stores.
> >> Suppose you don't want to update progress in the while loop:
> >>
> >> 	while (!am_done()) {
> >> 	  do_something(p);
> >> 	  tmp++;
> >> 	}
> >> 	progress = tmp;
> >>
> >> The compiler might transform this to
> >>
> >> 	while (!am_done()) {
> >> 	  do_something(p);
> >> 	  progress++;
> >> 	}
> > 
> > But only as long as the compiler knows that do_something() doesn't
> > contain any ordering directives.
> 
> Yes. I borrowed the fusing example in the text and it should have
> the same assumption.
> 
> > 
> >> if it wants to avoid allocation of a register/stack to tmp for whatever
> >> reason. WRITE_ONCE() prevents the unintended accesses of progress:
> >>
> >> 	while (!am_done()) {
> >> 	  do_something(p);
> >> 	  tmp++;
> >> 	}
> >> 	WRITE_ONCE(progress, tmp);
> > 
> > Agreed, this would prevent the update to "progress" from being pulled
> > into the loop.
> > 
> >> ---
> >> Adding this example in the text might be too verbose.
> >> Would a Quick Quiz be reasonable?
> > 
> > Might be good in the section on protecting memory references, and putting
> > it into a quick quiz or two makes a lot of sense.
> 
> It's up to you where to put it.
> 
> And I now realize using READ_ONCE() and WRITE_ONCE() is quite tricky.
> Missing one might not cause a problem today, but a smarter compiler
> can expose the bug in the future...
> 
> This is scary.

Section 15.3.1 is supposed to cover READ_ONCE() and WRITE_ONCE().
There is one final paragraph added just now, but if you get a chance,
please let me know what you think.

And if you are scared, you might actually have a good understanding
of the true situation.  ;-)

							Thanx, Paul

>          Thanks, Akira
> 
> > 
> > 							Thanx, Paul
> > 
> >>         Thanks, Akira
> >>
> >>>
> >>> 							Thanx, Paul
> >>>
> >>>
> >>
> > 
> > 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Is WRITE_ONCE() enough to prevent invention of stores?
  2017-10-30 18:14         ` Paul E. McKenney
@ 2017-10-31  3:03           ` Yubin Ruan
  2017-10-31  3:14             ` [PATCH] memorder: Add one solution for one snippet Yubin Ruan
  2017-10-31  3:45             ` Is WRITE_ONCE() enough to prevent invention of stores? Paul E. McKenney
  2017-10-31 15:36           ` Akira Yokosawa
  1 sibling, 2 replies; 15+ messages in thread
From: Yubin Ruan @ 2017-10-31  3:03 UTC (permalink / raw)
  To: paulmck, Akira Yokosawa; +Cc: perfbook

On 10/31/2017 02:14 AM, Paul E. McKenney wrote:
> On Mon, Sep 18, 2017 at 07:51:29AM +0900, Akira Yokosawa wrote:
>> On 2017/09/17 14:55:08 -0700, Paul E. McKenney wrote:
>>> On Sun, Sep 17, 2017 at 08:04:21PM +0900, Akira Yokosawa wrote:
>>>> On 2017/09/16 18:07:30 -0700, Paul E. McKenney wrote:
>>>>> On Sat, Sep 16, 2017 at 08:01:45PM +0900, Akira Yokosawa wrote:
>>>>>> Hi Paul,
>>>>>>
>>>>>> I'm a bit disturbed by the description in Section 14.3.1 "Memory-Reference
>>>>>> Restrictions" quoted below:
>>>>>>
>>>>>>> Oddly enough, the compiler is within its rights to use a variable
>>>>>>> as temporary storage just before a store to that variable, thus
>>>>>>> inventing stores to that variable.
>>>>>>> Fortunately, most compilers avoid this sort of thing, at least outside
>>>>>>> of stack variables.
>>>>>>> Nevertheless, using WRITE_ONCE() (or declaring the variable
>>>>>>> volatile) should prevent this sort of thing.
>>>>>>> But take care: If you have a translation unit that uses that variable,
>>>>>>> and never makes a volatile access to it, the compiler has no way of
>>>>>>> knowing that it needs to be careful.
>>>>>>
>>>>>> I'm wondering if using WRITE_ONCE() in a translation unit is really
>>>>>> enough to prevent invention of stores.
>>>>>>
>>>>>> Accessing via a volatile-cast pointer guarantees the access is not
>>>>>> optimized out (and hopefully the referenced value is respected).
>>>>>>
>>>>>> But I suspect that it has any effect in preventing invention of extra
>>>>>> loads/stores.
>>>>>>
>>>>>> Isn't declaring the variable volatile necessary for the guarantee?
>>>>>>
>>>>>> In practice, as is described in the above quote: "Fortunately, most
>>>>>> compilers avoid this sort of thing, at least outside of stack variables",
>>>>>> we can assume non-volatile shared variables are not spilled out to
>>>>>> the variables themselves as far as GCC/LLVM is concerned.
>>>>>> But this is compiler dependent, I suppose.
>>>>>
>>>>> I suspect that it will turn out to be impossible for the compiler to
>>>>> actually invent these stores in the general case.  For example, it might
>>>>> be that there is some lock held or other synchronization mechanism unknown
>>>>> to the compiler that prevents this behavior.  But I haven't fully worked
>>>>> this out yet.
>>>>
>>>> You mean the invented stores wouldn't be visible from other threads anyway?
>>>> In a meaningful parallel code, that can be the case.
>>>
>>> I mean that it is very hard to prove that inventing a store isn't introducing
>>> a data race, which would be a violation of the standard.  The one case I know
>>> of where the compiler can be sure that it is within its rights to invent the
>>> store is before a normal store to a variable.
>>>
>>> Otherwise, it might be (for example) that one must hold a lock to legally
>>> update a given variable, and that lock might or might not be held at a given
>>> point in the code.  But if the compiler sees a plain store, the compiler
>>> knows that it is OK to update at that point.  So the compiler can invent
>>> a store prior to the existing store, as long as there is no memory barrier,
>>> compiler barrier, lock acquisition/release, atomic operation, etc., between
>>> the original store and the compiler's invented store.
>>
>> I think I understand.
>>
>>>
>>>>> But I do know that if you just do plain stores, the compiler is fully
>>>>> within its rights to invent stores preceding any given plain store.
>>>>
>>>> So, the rules to use WRITE_ONCE() is something like the following?
>>>>
>>>> ---
>>>> 1) Declare the variable without volatile.
>>>
>>> Agreed.
>>>
>>>> 2) READ_ONCE() and plain loads can be mixed. A plain load will see
>>>>    a value at least newer than or equal to the one obtained at the
>>>>    program-order most recent READ_ONCE().
>>>
>>> I am not entirely sure of this one.  But if there is a barrier() or
>>> stronger between the READ_ONCE() and the plain load, then yes.
>>
>> Ah, the compiler can reorder plain loads before READ_ONCE()...
>>
>> I did a litmus test of a plain load after READ_ONCE(), but
>> such a reordering is not covered by herd7's litmus test, is it?
>>
>>>
>>>> 3) WRITE_ONCE() should not be mixed with plain stores when invention
>>>>    of stores is to be avoided.
>>>
>>> Agreed.
>>>
>>>> Invention of stores is the opposite of fusing stores.
>>>> Suppose you don't want to update progress in the while loop:
>>>>
>>>> 	while (!am_done()) {
>>>> 	  do_something(p);
>>>> 	  tmp++;
>>>> 	}
>>>> 	progress = tmp;
>>>>
>>>> The compiler might transform this to
>>>>
>>>> 	while (!am_done()) {
>>>> 	  do_something(p);
>>>> 	  progress++;
>>>> 	}
>>>
>>> But only as long as the compiler knows that do_something() doesn't
>>> contain any ordering directives.
>>
>> Yes. I borrowed the fusing example in the text and it should have
>> the same assumption.
>>
>>>
>>>> if it wants to avoid allocation of a register/stack to tmp for whatever
>>>> reason. WRITE_ONCE() prevents the unintended accesses of progress:
>>>>
>>>> 	while (!am_done()) {
>>>> 	  do_something(p);
>>>> 	  tmp++;
>>>> 	}
>>>> 	WRITE_ONCE(progress, tmp);
>>>
>>> Agreed, this would prevent the update to "progress" from being pulled
>>> into the loop.
>>>
>>>> ---
>>>> Adding this example in the text might be too verbose.
>>>> Would a Quick Quiz be reasonable?
>>>
>>> Might be good in the section on protecting memory references, and putting
>>> it into a quick quiz or two makes a lot of sense.
>>
>> It's up to you where to put it.
>>
>> And I now realize using READ_ONCE() and WRITE_ONCE() is quite tricky.
>> Missing one might not cause a problem today, but a smarter compiler
>> can expose the bug in the future...
>>
>> This is scary.
> 
> Section 15.3.1 is supposed to cover READ_ONCE() and WRITE_ONCE().
> There is one final paragraph added just now, but if you get a chance,
> please let me know what you think.
> 
> And if you are scared, you might actually have a good understanding
> of the true situation.  ;-)

Hi Akira and Paul,

Interesting discussion! I read the new material that Paul just pushed and will
like to know how to use volatile cast to this all-too-real[1] code:

    1 tmp = p;
    2 if (tmp != NULL && tmp <= q)
    3     do_something(tmp);

It seems that in this case declaring tmp as volatile or make it an atomic
variable will be sufficient. But if I want to use a volatile cast, that is,
READ_ONCE()/WRITE_ONCE(), how should I do that cast? Should I do
    
    if (READ_ONCE(tmp) != NULL && READ_ONCE(tmp) <= q)
        do_something(tmp);

Thoughts?

Yubin


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] memorder: Add one solution for one snippet
  2017-10-31  3:03           ` Yubin Ruan
@ 2017-10-31  3:14             ` Yubin Ruan
  2017-10-31  3:50               ` Paul E. McKenney
  2017-10-31  3:45             ` Is WRITE_ONCE() enough to prevent invention of stores? Paul E. McKenney
  1 sibling, 1 reply; 15+ messages in thread
From: Yubin Ruan @ 2017-10-31  3:14 UTC (permalink / raw)
  To: paulmck, Akira Yokosawa; +Cc: perfbook

From 256adc5ccd239288c3f38cd193072d6666ab1e82 Mon Sep 17 00:00:00 2001
From: Yubin Ruan <ablacktshirt@gmail.com>
Date: Tue, 31 Oct 2017 11:12:03 +0800
Subject: [PATCH] memorder: Add one solution for one snippet

Signed-off-by: Yubin Ruan <ablacktshirt@gmail.com>
---
 memorder/memorder.tex | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/memorder/memorder.tex b/memorder/memorder.tex
index f8886b4..a1c96df 100644
--- a/memorder/memorder.tex
+++ b/memorder/memorder.tex
@@ -3055,6 +3055,8 @@ surprise to \co{do_something()}.\footnote{
 	Tracking down the bug consumed a holiday weekend not just
 	for your editor, but also for several of his colleagues.
 	In short, this is not a new problem.}
+In this case, \co{tmp} should be declared as \co{volatile} to prevent
+the transformation by the compiler.

 Compilers can also fuse stores.
 The most infamous example is probably the progress-bar example
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: Is WRITE_ONCE() enough to prevent invention of stores?
  2017-10-31  3:03           ` Yubin Ruan
  2017-10-31  3:14             ` [PATCH] memorder: Add one solution for one snippet Yubin Ruan
@ 2017-10-31  3:45             ` Paul E. McKenney
  1 sibling, 0 replies; 15+ messages in thread
From: Paul E. McKenney @ 2017-10-31  3:45 UTC (permalink / raw)
  To: Yubin Ruan; +Cc: Akira Yokosawa, perfbook

On Tue, Oct 31, 2017 at 11:03:10AM +0800, Yubin Ruan wrote:
> On 10/31/2017 02:14 AM, Paul E. McKenney wrote:
> > On Mon, Sep 18, 2017 at 07:51:29AM +0900, Akira Yokosawa wrote:
> >> On 2017/09/17 14:55:08 -0700, Paul E. McKenney wrote:
> >>> On Sun, Sep 17, 2017 at 08:04:21PM +0900, Akira Yokosawa wrote:
> >>>> On 2017/09/16 18:07:30 -0700, Paul E. McKenney wrote:
> >>>>> On Sat, Sep 16, 2017 at 08:01:45PM +0900, Akira Yokosawa wrote:
> >>>>>> Hi Paul,
> >>>>>>
> >>>>>> I'm a bit disturbed by the description in Section 14.3.1 "Memory-Reference
> >>>>>> Restrictions" quoted below:
> >>>>>>
> >>>>>>> Oddly enough, the compiler is within its rights to use a variable
> >>>>>>> as temporary storage just before a store to that variable, thus
> >>>>>>> inventing stores to that variable.
> >>>>>>> Fortunately, most compilers avoid this sort of thing, at least outside
> >>>>>>> of stack variables.
> >>>>>>> Nevertheless, using WRITE_ONCE() (or declaring the variable
> >>>>>>> volatile) should prevent this sort of thing.
> >>>>>>> But take care: If you have a translation unit that uses that variable,
> >>>>>>> and never makes a volatile access to it, the compiler has no way of
> >>>>>>> knowing that it needs to be careful.
> >>>>>>
> >>>>>> I'm wondering if using WRITE_ONCE() in a translation unit is really
> >>>>>> enough to prevent invention of stores.
> >>>>>>
> >>>>>> Accessing via a volatile-cast pointer guarantees the access is not
> >>>>>> optimized out (and hopefully the referenced value is respected).
> >>>>>>
> >>>>>> But I suspect that it has any effect in preventing invention of extra
> >>>>>> loads/stores.
> >>>>>>
> >>>>>> Isn't declaring the variable volatile necessary for the guarantee?
> >>>>>>
> >>>>>> In practice, as is described in the above quote: "Fortunately, most
> >>>>>> compilers avoid this sort of thing, at least outside of stack variables",
> >>>>>> we can assume non-volatile shared variables are not spilled out to
> >>>>>> the variables themselves as far as GCC/LLVM is concerned.
> >>>>>> But this is compiler dependent, I suppose.
> >>>>>
> >>>>> I suspect that it will turn out to be impossible for the compiler to
> >>>>> actually invent these stores in the general case.  For example, it might
> >>>>> be that there is some lock held or other synchronization mechanism unknown
> >>>>> to the compiler that prevents this behavior.  But I haven't fully worked
> >>>>> this out yet.
> >>>>
> >>>> You mean the invented stores wouldn't be visible from other threads anyway?
> >>>> In a meaningful parallel code, that can be the case.
> >>>
> >>> I mean that it is very hard to prove that inventing a store isn't introducing
> >>> a data race, which would be a violation of the standard.  The one case I know
> >>> of where the compiler can be sure that it is within its rights to invent the
> >>> store is before a normal store to a variable.
> >>>
> >>> Otherwise, it might be (for example) that one must hold a lock to legally
> >>> update a given variable, and that lock might or might not be held at a given
> >>> point in the code.  But if the compiler sees a plain store, the compiler
> >>> knows that it is OK to update at that point.  So the compiler can invent
> >>> a store prior to the existing store, as long as there is no memory barrier,
> >>> compiler barrier, lock acquisition/release, atomic operation, etc., between
> >>> the original store and the compiler's invented store.
> >>
> >> I think I understand.
> >>
> >>>
> >>>>> But I do know that if you just do plain stores, the compiler is fully
> >>>>> within its rights to invent stores preceding any given plain store.
> >>>>
> >>>> So, the rules to use WRITE_ONCE() is something like the following?
> >>>>
> >>>> ---
> >>>> 1) Declare the variable without volatile.
> >>>
> >>> Agreed.
> >>>
> >>>> 2) READ_ONCE() and plain loads can be mixed. A plain load will see
> >>>>    a value at least newer than or equal to the one obtained at the
> >>>>    program-order most recent READ_ONCE().
> >>>
> >>> I am not entirely sure of this one.  But if there is a barrier() or
> >>> stronger between the READ_ONCE() and the plain load, then yes.
> >>
> >> Ah, the compiler can reorder plain loads before READ_ONCE()...
> >>
> >> I did a litmus test of a plain load after READ_ONCE(), but
> >> such a reordering is not covered by herd7's litmus test, is it?
> >>
> >>>
> >>>> 3) WRITE_ONCE() should not be mixed with plain stores when invention
> >>>>    of stores is to be avoided.
> >>>
> >>> Agreed.
> >>>
> >>>> Invention of stores is the opposite of fusing stores.
> >>>> Suppose you don't want to update progress in the while loop:
> >>>>
> >>>> 	while (!am_done()) {
> >>>> 	  do_something(p);
> >>>> 	  tmp++;
> >>>> 	}
> >>>> 	progress = tmp;
> >>>>
> >>>> The compiler might transform this to
> >>>>
> >>>> 	while (!am_done()) {
> >>>> 	  do_something(p);
> >>>> 	  progress++;
> >>>> 	}
> >>>
> >>> But only as long as the compiler knows that do_something() doesn't
> >>> contain any ordering directives.
> >>
> >> Yes. I borrowed the fusing example in the text and it should have
> >> the same assumption.
> >>
> >>>
> >>>> if it wants to avoid allocation of a register/stack to tmp for whatever
> >>>> reason. WRITE_ONCE() prevents the unintended accesses of progress:
> >>>>
> >>>> 	while (!am_done()) {
> >>>> 	  do_something(p);
> >>>> 	  tmp++;
> >>>> 	}
> >>>> 	WRITE_ONCE(progress, tmp);
> >>>
> >>> Agreed, this would prevent the update to "progress" from being pulled
> >>> into the loop.
> >>>
> >>>> ---
> >>>> Adding this example in the text might be too verbose.
> >>>> Would a Quick Quiz be reasonable?
> >>>
> >>> Might be good in the section on protecting memory references, and putting
> >>> it into a quick quiz or two makes a lot of sense.
> >>
> >> It's up to you where to put it.
> >>
> >> And I now realize using READ_ONCE() and WRITE_ONCE() is quite tricky.
> >> Missing one might not cause a problem today, but a smarter compiler
> >> can expose the bug in the future...
> >>
> >> This is scary.
> > 
> > Section 15.3.1 is supposed to cover READ_ONCE() and WRITE_ONCE().
> > There is one final paragraph added just now, but if you get a chance,
> > please let me know what you think.
> > 
> > And if you are scared, you might actually have a good understanding
> > of the true situation.  ;-)
> 
> Hi Akira and Paul,
> 
> Interesting discussion! I read the new material that Paul just pushed and will
> like to know how to use volatile cast to this all-too-real[1] code:
> 
>     1 tmp = p;
>     2 if (tmp != NULL && tmp <= q)
>     3     do_something(tmp);
> 
> It seems that in this case declaring tmp as volatile or make it an atomic
> variable will be sufficient. But if I want to use a volatile cast, that is,
> READ_ONCE()/WRITE_ONCE(), how should I do that cast? Should I do
>     
>     if (READ_ONCE(tmp) != NULL && READ_ONCE(tmp) <= q)
>         do_something(tmp);

Like this:

	1 tmp = READ_ONCE(p);
	2 if (tmp != NULL && tmp <= q)
	3     do_something(tmp);

							Thanx, Paul

> Thoughts?
> 
> Yubin
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] memorder: Add one solution for one snippet
  2017-10-31  3:14             ` [PATCH] memorder: Add one solution for one snippet Yubin Ruan
@ 2017-10-31  3:50               ` Paul E. McKenney
  2017-10-31  5:04                 ` Yubin Ruan
  0 siblings, 1 reply; 15+ messages in thread
From: Paul E. McKenney @ 2017-10-31  3:50 UTC (permalink / raw)
  To: Yubin Ruan; +Cc: Akira Yokosawa, perfbook

On Tue, Oct 31, 2017 at 11:14:47AM +0800, Yubin Ruan wrote:
> >From 256adc5ccd239288c3f38cd193072d6666ab1e82 Mon Sep 17 00:00:00 2001
> From: Yubin Ruan <ablacktshirt@gmail.com>
> Date: Tue, 31 Oct 2017 11:12:03 +0800
> Subject: [PATCH] memorder: Add one solution for one snippet
> 
> Signed-off-by: Yubin Ruan <ablacktshirt@gmail.com>
> ---
>  memorder/memorder.tex | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/memorder/memorder.tex b/memorder/memorder.tex
> index f8886b4..a1c96df 100644
> --- a/memorder/memorder.tex
> +++ b/memorder/memorder.tex
> @@ -3055,6 +3055,8 @@ surprise to \co{do_something()}.\footnote{
>  	Tracking down the bug consumed a holiday weekend not just
>  	for your editor, but also for several of his colleagues.
>  	In short, this is not a new problem.}
> +In this case, \co{tmp} should be declared as \co{volatile} to prevent
> +the transformation by the compiler.
> 
>  Compilers can also fuse stores.
>  The most infamous example is probably the progress-bar example

Declaring tmp as volatile could in some sense solve the problem, but at
the expense of preventing the compiler from caching tmp in a register.
It is better to use READ_ONCE() on the initial read from p, or failing
that, to declare p (not tmp) volatile.

							Thanx, Paul

------------------------------------------------------------------------

commit d1f08db32120f079da699c9deebde22af96a202a
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Oct 30 20:44:33 2017 -0700

    memorder: Show how to use READ_ONCE() to prevent load replication
    
    Reported-by: Yubin Ruan <ablacktshirt@gmail.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/memorder/memorder.tex b/memorder/memorder.tex
index f8886b42468c..a3b8f7c86946 100644
--- a/memorder/memorder.tex
+++ b/memorder/memorder.tex
@@ -3055,6 +3055,21 @@ surprise to \co{do_something()}.\footnote{
 	Tracking down the bug consumed a holiday weekend not just
 	for your editor, but also for several of his colleagues.
 	In short, this is not a new problem.}
+To prevent the compiler from replicating the load, use \co{READ_ONCE()},
+for example as follows:
+
+\vspace{5pt}
+\begin{minipage}[t]{\columnwidth}
+\scriptsize
+\begin{verbatim}
+  1 tmp = READ_ONCE(p);
+  2 if (tmp != NULL && tmp <= q)
+  3   do_something(tmp);
+\end{verbatim}
+\end{minipage}
+\vspace{5pt}
+
+Alternatively, the variable \co{p} could be declared \co{volatile}.

 Compilers can also fuse stores.
 The most infamous example is probably the progress-bar example


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] memorder: Add one solution for one snippet
  2017-10-31  3:50               ` Paul E. McKenney
@ 2017-10-31  5:04                 ` Yubin Ruan
  0 siblings, 0 replies; 15+ messages in thread
From: Yubin Ruan @ 2017-10-31  5:04 UTC (permalink / raw)
  To: paulmck; +Cc: Akira Yokosawa, perfbook

Thanks Paul,

On 10/31/2017 11:50 AM, Paul E. McKenney wrote:
> On Tue, Oct 31, 2017 at 11:14:47AM +0800, Yubin Ruan wrote:
>> >From 256adc5ccd239288c3f38cd193072d6666ab1e82 Mon Sep 17 00:00:00 2001
>> From: Yubin Ruan <ablacktshirt@gmail.com>
>> Date: Tue, 31 Oct 2017 11:12:03 +0800
>> Subject: [PATCH] memorder: Add one solution for one snippet
>>
>> Signed-off-by: Yubin Ruan <ablacktshirt@gmail.com>
>> ---
>>  memorder/memorder.tex | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/memorder/memorder.tex b/memorder/memorder.tex
>> index f8886b4..a1c96df 100644
>> --- a/memorder/memorder.tex
>> +++ b/memorder/memorder.tex
>> @@ -3055,6 +3055,8 @@ surprise to \co{do_something()}.\footnote{
>>  	Tracking down the bug consumed a holiday weekend not just
>>  	for your editor, but also for several of his colleagues.
>>  	In short, this is not a new problem.}
>> +In this case, \co{tmp} should be declared as \co{volatile} to prevent
>> +the transformation by the compiler.
>>
>>  Compilers can also fuse stores.
>>  The most infamous example is probably the progress-bar example
> 
> Declaring tmp as volatile could in some sense solve the problem, but at
> the expense of preventing the compiler from caching tmp in a register.
> It is better to use READ_ONCE() on the initial read from p, or failing
> that, to declare p (not tmp) volatile.
> 
> 							Thanx, Paul

Acked-by: Yubin Ruan <ablacktshirt@gmail.com>

> ------------------------------------------------------------------------
> 
> commit d1f08db32120f079da699c9deebde22af96a202a
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Mon Oct 30 20:44:33 2017 -0700
> 
>     memorder: Show how to use READ_ONCE() to prevent load replication
>     
>     Reported-by: Yubin Ruan <ablacktshirt@gmail.com>
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/memorder/memorder.tex b/memorder/memorder.tex
> index f8886b42468c..a3b8f7c86946 100644
> --- a/memorder/memorder.tex
> +++ b/memorder/memorder.tex
> @@ -3055,6 +3055,21 @@ surprise to \co{do_something()}.\footnote{
>  	Tracking down the bug consumed a holiday weekend not just
>  	for your editor, but also for several of his colleagues.
>  	In short, this is not a new problem.}
> +To prevent the compiler from replicating the load, use \co{READ_ONCE()},
> +for example as follows:
> +
> +\vspace{5pt}
> +\begin{minipage}[t]{\columnwidth}
> +\scriptsize
> +\begin{verbatim}
> +  1 tmp = READ_ONCE(p);
> +  2 if (tmp != NULL && tmp <= q)
> +  3   do_something(tmp);
> +\end{verbatim}
> +\end{minipage}
> +\vspace{5pt}
> +
> +Alternatively, the variable \co{p} could be declared \co{volatile}.
>  
>  Compilers can also fuse stores.
>  The most infamous example is probably the progress-bar example
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Is WRITE_ONCE() enough to prevent invention of stores?
  2017-10-30 18:14         ` Paul E. McKenney
  2017-10-31  3:03           ` Yubin Ruan
@ 2017-10-31 15:36           ` Akira Yokosawa
  2017-10-31 16:27             ` Paul E. McKenney
  1 sibling, 1 reply; 15+ messages in thread
From: Akira Yokosawa @ 2017-10-31 15:36 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

On 2017/10/30 11:14:23 -0700, Paul E. McKenney wrote:
> On Mon, Sep 18, 2017 at 07:51:29AM +0900, Akira Yokosawa wrote:
>> On 2017/09/17 14:55:08 -0700, Paul E. McKenney wrote:
>>> On Sun, Sep 17, 2017 at 08:04:21PM +0900, Akira Yokosawa wrote:
>>>> On 2017/09/16 18:07:30 -0700, Paul E. McKenney wrote:
>>>>> On Sat, Sep 16, 2017 at 08:01:45PM +0900, Akira Yokosawa wrote:
>>>>>> Hi Paul,
>>>>>>
>>>>>> I'm a bit disturbed by the description in Section 14.3.1 "Memory-Reference
>>>>>> Restrictions" quoted below:
>>>>>>
>>>>>>> Oddly enough, the compiler is within its rights to use a variable
>>>>>>> as temporary storage just before a store to that variable, thus
>>>>>>> inventing stores to that variable.
>>>>>>> Fortunately, most compilers avoid this sort of thing, at least outside
>>>>>>> of stack variables.
>>>>>>> Nevertheless, using WRITE_ONCE() (or declaring the variable
>>>>>>> volatile) should prevent this sort of thing.
>>>>>>> But take care: If you have a translation unit that uses that variable,
>>>>>>> and never makes a volatile access to it, the compiler has no way of
>>>>>>> knowing that it needs to be careful.
>>>>>>
>>>>>> I'm wondering if using WRITE_ONCE() in a translation unit is really
>>>>>> enough to prevent invention of stores.
>>>>>>
>>>>>> Accessing via a volatile-cast pointer guarantees the access is not
>>>>>> optimized out (and hopefully the referenced value is respected).
>>>>>>
>>>>>> But I suspect that it has any effect in preventing invention of extra
>>>>>> loads/stores.
>>>>>>
>>>>>> Isn't declaring the variable volatile necessary for the guarantee?
>>>>>>
>>>>>> In practice, as is described in the above quote: "Fortunately, most
>>>>>> compilers avoid this sort of thing, at least outside of stack variables",
>>>>>> we can assume non-volatile shared variables are not spilled out to
>>>>>> the variables themselves as far as GCC/LLVM is concerned.
>>>>>> But this is compiler dependent, I suppose.
>>>>>
>>>>> I suspect that it will turn out to be impossible for the compiler to
>>>>> actually invent these stores in the general case.  For example, it might
>>>>> be that there is some lock held or other synchronization mechanism unknown
>>>>> to the compiler that prevents this behavior.  But I haven't fully worked
>>>>> this out yet.
>>>>
>>>> You mean the invented stores wouldn't be visible from other threads anyway?
>>>> In a meaningful parallel code, that can be the case.
>>>
>>> I mean that it is very hard to prove that inventing a store isn't introducing
>>> a data race, which would be a violation of the standard.  The one case I know
>>> of where the compiler can be sure that it is within its rights to invent the
>>> store is before a normal store to a variable.
>>>
>>> Otherwise, it might be (for example) that one must hold a lock to legally
>>> update a given variable, and that lock might or might not be held at a given
>>> point in the code.  But if the compiler sees a plain store, the compiler
>>> knows that it is OK to update at that point.  So the compiler can invent
>>> a store prior to the existing store, as long as there is no memory barrier,
>>> compiler barrier, lock acquisition/release, atomic operation, etc., between
>>> the original store and the compiler's invented store.
>>
>> I think I understand.
>>
>>>
>>>>> But I do know that if you just do plain stores, the compiler is fully
>>>>> within its rights to invent stores preceding any given plain store.
>>>>
>>>> So, the rules to use WRITE_ONCE() is something like the following?
>>>>
>>>> ---
>>>> 1) Declare the variable without volatile.
>>>
>>> Agreed.
>>>
>>>> 2) READ_ONCE() and plain loads can be mixed. A plain load will see
>>>>    a value at least newer than or equal to the one obtained at the
>>>>    program-order most recent READ_ONCE().
>>>
>>> I am not entirely sure of this one.  But if there is a barrier() or
>>> stronger between the READ_ONCE() and the plain load, then yes.
>>
>> Ah, the compiler can reorder plain loads before READ_ONCE()...
>>
>> I did a litmus test of a plain load after READ_ONCE(), but
>> such a reordering is not covered by herd7's litmus test, is it?
>>
>>>
>>>> 3) WRITE_ONCE() should not be mixed with plain stores when invention
>>>>    of stores is to be avoided.
>>>
>>> Agreed.
>>>
>>>> Invention of stores is the opposite of fusing stores.
>>>> Suppose you don't want to update progress in the while loop:
>>>>
>>>> 	while (!am_done()) {
>>>> 	  do_something(p);
>>>> 	  tmp++;
>>>> 	}
>>>> 	progress = tmp;
>>>>
>>>> The compiler might transform this to
>>>>
>>>> 	while (!am_done()) {
>>>> 	  do_something(p);
>>>> 	  progress++;
>>>> 	}
>>>
>>> But only as long as the compiler knows that do_something() doesn't
>>> contain any ordering directives.
>>
>> Yes. I borrowed the fusing example in the text and it should have
>> the same assumption.
>>
>>>
>>>> if it wants to avoid allocation of a register/stack to tmp for whatever
>>>> reason. WRITE_ONCE() prevents the unintended accesses of progress:
>>>>
>>>> 	while (!am_done()) {
>>>> 	  do_something(p);
>>>> 	  tmp++;
>>>> 	}
>>>> 	WRITE_ONCE(progress, tmp);
>>>
>>> Agreed, this would prevent the update to "progress" from being pulled
>>> into the loop.
>>>
>>>> ---
>>>> Adding this example in the text might be too verbose.
>>>> Would a Quick Quiz be reasonable?
>>>
>>> Might be good in the section on protecting memory references, and putting
>>> it into a quick quiz or two makes a lot of sense.
>>
>> It's up to you where to put it.
>>
>> And I now realize using READ_ONCE() and WRITE_ONCE() is quite tricky.
>> Missing one might not cause a problem today, but a smarter compiler
>> can expose the bug in the future...
>>
>> This is scary.
> 
> Section 15.3.1 is supposed to cover READ_ONCE() and WRITE_ONCE().
> There is one final paragraph added just now, but if you get a chance,
> please let me know what you think.

It looks OK to me. Some minor nitpicks.

* On exact exceptions and exact interrupts:

IIRC, I've heard from you that there is an exceptional architecture
which requires explicit memory barriers in exception/interrupt handlers.
Was it Itanium?

* On the final paragraph of Section 15.3.1:

"Nevertheless" appears twice in this paragraph. You might want to
reword one of them.

> 
> And if you are scared, you might actually have a good understanding
> of the true situation.  ;-)

Whew!!!

     Thanks, Akira

> 
> 							Thanx, Paul
> 
>>          Thanks, Akira
>>
>>>
>>> 							Thanx, Paul
>>>
>>>>         Thanks, Akira
>>>>
>>>>>
>>>>> 							Thanx, Paul
>>>>>
>>>>>
>>>>
>>>
>>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Is WRITE_ONCE() enough to prevent invention of stores?
  2017-10-31 15:36           ` Akira Yokosawa
@ 2017-10-31 16:27             ` Paul E. McKenney
  2017-10-31 22:25               ` Akira Yokosawa
  0 siblings, 1 reply; 15+ messages in thread
From: Paul E. McKenney @ 2017-10-31 16:27 UTC (permalink / raw)
  To: Akira Yokosawa; +Cc: perfbook

On Wed, Nov 01, 2017 at 12:36:13AM +0900, Akira Yokosawa wrote:
> On 2017/10/30 11:14:23 -0700, Paul E. McKenney wrote:
> > On Mon, Sep 18, 2017 at 07:51:29AM +0900, Akira Yokosawa wrote:
> >> On 2017/09/17 14:55:08 -0700, Paul E. McKenney wrote:
> >>> On Sun, Sep 17, 2017 at 08:04:21PM +0900, Akira Yokosawa wrote:
> >>>> On 2017/09/16 18:07:30 -0700, Paul E. McKenney wrote:
> >>>>> On Sat, Sep 16, 2017 at 08:01:45PM +0900, Akira Yokosawa wrote:

[ . . . ]

> >>>> Adding this example in the text might be too verbose.
> >>>> Would a Quick Quiz be reasonable?
> >>>
> >>> Might be good in the section on protecting memory references, and putting
> >>> it into a quick quiz or two makes a lot of sense.
> >>
> >> It's up to you where to put it.
> >>
> >> And I now realize using READ_ONCE() and WRITE_ONCE() is quite tricky.
> >> Missing one might not cause a problem today, but a smarter compiler
> >> can expose the bug in the future...
> >>
> >> This is scary.
> > 
> > Section 15.3.1 is supposed to cover READ_ONCE() and WRITE_ONCE().
> > There is one final paragraph added just now, but if you get a chance,
> > please let me know what you think.
> 
> It looks OK to me. Some minor nitpicks.
> 
> * On exact exceptions and exact interrupts:
> 
> IIRC, I've heard from you that there is an exceptional architecture
> which requires explicit memory barriers in exception/interrupt handlers.
> Was it Itanium?

For interrupts from kernel code, it is sometimes the case that the
kernel needs other CPUs to see the accesses in the code before the
interrupt handler, the code within the interrupt handler, and the code
after the interrupt handler to have happened in order.  In that case,
the interrupt entry/exit code might well need full memory barriers.

There are other ways to organize this that do not require interrupt
entry/exit memory barriers, for example, the convention could be that
locking be used.

> * On the final paragraph of Section 15.3.1:
> 
> "Nevertheless" appears twice in this paragraph. You might want to
> reword one of them.

Good point, how about the following?

	Nevertheless, it is quite possible to overdo use of READ_ONCE()
	and WRITE_ONCE(). For example, if you have prevented a given
	variable from changing (perhaps by holding the lock guarding
	all updates to that variable), there is no point in using
	READ_ONCE(). Similarly, if you have prevented any other CPUs or
	threads from reading a given variable (perhaps because you are
	initializing that variable before any other CPU or thread has
	access to it), there is no point in using WRITE_ONCE(). However,
	my experience is that developers need to use things like
	READ_ONCE() and WRITE_ONCE() more often than they think that
	they do, the overhead of unnecessary uses is quite low, and the
	penalty for failing to use them when needed is quite high.

							Thanx, Paul

> > And if you are scared, you might actually have a good understanding
> > of the true situation.  ;-)
> 
> Whew!!!
> 
>      Thanks, Akira
> 
> > 
> > 							Thanx, Paul
> > 
> >>          Thanks, Akira
> >>
> >>>
> >>> 							Thanx, Paul
> >>>
> >>>>         Thanks, Akira
> >>>>
> >>>>>
> >>>>> 							Thanx, Paul
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> > 
> > 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Is WRITE_ONCE() enough to prevent invention of stores?
  2017-10-31 16:27             ` Paul E. McKenney
@ 2017-10-31 22:25               ` Akira Yokosawa
  2017-11-01 20:15                 ` Paul E. McKenney
  0 siblings, 1 reply; 15+ messages in thread
From: Akira Yokosawa @ 2017-10-31 22:25 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: perfbook, Akira Yokosawa

On 2017/11/01 1:27, Paul E. McKenney wrote:
> On Wed, Nov 01, 2017 at 12:36:13AM +0900, Akira Yokosawa wrote:
>> On 2017/10/30 11:14:23 -0700, Paul E. McKenney wrote:
>>> On Mon, Sep 18, 2017 at 07:51:29AM +0900, Akira Yokosawa wrote:
>>>> On 2017/09/17 14:55:08 -0700, Paul E. McKenney wrote:
>>>>> On Sun, Sep 17, 2017 at 08:04:21PM +0900, Akira Yokosawa wrote:
>>>>>> On 2017/09/16 18:07:30 -0700, Paul E. McKenney wrote:
>>>>>>> On Sat, Sep 16, 2017 at 08:01:45PM +0900, Akira Yokosawa wrote:
> 
> [ . . . ]
> 
>>>>>> Adding this example in the text might be too verbose.
>>>>>> Would a Quick Quiz be reasonable?
>>>>>
>>>>> Might be good in the section on protecting memory references, and putting
>>>>> it into a quick quiz or two makes a lot of sense.
>>>>
>>>> It's up to you where to put it.
>>>>
>>>> And I now realize using READ_ONCE() and WRITE_ONCE() is quite tricky.
>>>> Missing one might not cause a problem today, but a smarter compiler
>>>> can expose the bug in the future...
>>>>
>>>> This is scary.
>>>
>>> Section 15.3.1 is supposed to cover READ_ONCE() and WRITE_ONCE().
>>> There is one final paragraph added just now, but if you get a chance,
>>> please let me know what you think.
>>
>> It looks OK to me. Some minor nitpicks.
>>
>> * On exact exceptions and exact interrupts:
>>
>> IIRC, I've heard from you that there is an exceptional architecture
>> which requires explicit memory barriers in exception/interrupt handlers.
>> Was it Itanium?
> 
> For interrupts from kernel code, it is sometimes the case that the
> kernel needs other CPUs to see the accesses in the code before the
> interrupt handler, the code within the interrupt handler, and the code
> after the interrupt handler to have happened in order.  In that case,
> the interrupt entry/exit code might well need full memory barriers.
> 
> There are other ways to organize this that do not require interrupt
> entry/exit memory barriers, for example, the convention could be that
> locking be used.

Makes sense.

And I was somewhat confused here.
By "exact exceptions and exact interrupts", you are talking about
_instruction stream_ reordering.
Now I recall that my question at the time was "Do interrupts/
exceptions behave as memory barriers?".
So they are distinct point of view. If an architecture didn't
provide "exact exceptions and exact interrupts", it should be
quite tricky to return from exceptions/interrupts.

Thanks for the clarification.

> 
>> * On the final paragraph of Section 15.3.1:
>>
>> "Nevertheless" appears twice in this paragraph. You might want to
>> reword one of them.
> 
> Good point, how about the following?
> 
> 	Nevertheless, it is quite possible to overdo use of READ_ONCE()
> 	and WRITE_ONCE(). For example, if you have prevented a given
> 	variable from changing (perhaps by holding the lock guarding
> 	all updates to that variable), there is no point in using
> 	READ_ONCE(). Similarly, if you have prevented any other CPUs or
> 	threads from reading a given variable (perhaps because you are
> 	initializing that variable before any other CPU or thread has
> 	access to it), there is no point in using WRITE_ONCE(). However,
> 	my experience is that developers need to use things like
> 	READ_ONCE() and WRITE_ONCE() more often than they think that
> 	they do, the overhead of unnecessary uses is quite low, and the
> 	penalty for failing to use them when needed is quite high.

The clause added to the last sentence makes the point more obvious.
This looks more than enough.

      Thanks, Akira

> 
> 							Thanx, Paul
> 
>>> And if you are scared, you might actually have a good understanding
>>> of the true situation.  ;-)
>>
>> Whew!!!
>>
>>      Thanks, Akira
>>
>>>
>>> 							Thanx, Paul
>>>
>>>>          Thanks, Akira
>>>>
>>>>>
>>>>> 							Thanx, Paul
>>>>>
>>>>>>         Thanks, Akira
>>>>>>
>>>>>>>
>>>>>>> 							Thanx, Paul
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Is WRITE_ONCE() enough to prevent invention of stores?
  2017-10-31 22:25               ` Akira Yokosawa
@ 2017-11-01 20:15                 ` Paul E. McKenney
  0 siblings, 0 replies; 15+ messages in thread
From: Paul E. McKenney @ 2017-11-01 20:15 UTC (permalink / raw)
  To: Akira Yokosawa; +Cc: perfbook

On Wed, Nov 01, 2017 at 07:25:29AM +0900, Akira Yokosawa wrote:
> On 2017/11/01 1:27, Paul E. McKenney wrote:
> > On Wed, Nov 01, 2017 at 12:36:13AM +0900, Akira Yokosawa wrote:
> >> On 2017/10/30 11:14:23 -0700, Paul E. McKenney wrote:
> >>> On Mon, Sep 18, 2017 at 07:51:29AM +0900, Akira Yokosawa wrote:
> >>>> On 2017/09/17 14:55:08 -0700, Paul E. McKenney wrote:
> >>>>> On Sun, Sep 17, 2017 at 08:04:21PM +0900, Akira Yokosawa wrote:
> >>>>>> On 2017/09/16 18:07:30 -0700, Paul E. McKenney wrote:
> >>>>>>> On Sat, Sep 16, 2017 at 08:01:45PM +0900, Akira Yokosawa wrote:
> > 
> > [ . . . ]
> > 
> >>>>>> Adding this example in the text might be too verbose.
> >>>>>> Would a Quick Quiz be reasonable?
> >>>>>
> >>>>> Might be good in the section on protecting memory references, and putting
> >>>>> it into a quick quiz or two makes a lot of sense.
> >>>>
> >>>> It's up to you where to put it.
> >>>>
> >>>> And I now realize using READ_ONCE() and WRITE_ONCE() is quite tricky.
> >>>> Missing one might not cause a problem today, but a smarter compiler
> >>>> can expose the bug in the future...
> >>>>
> >>>> This is scary.
> >>>
> >>> Section 15.3.1 is supposed to cover READ_ONCE() and WRITE_ONCE().
> >>> There is one final paragraph added just now, but if you get a chance,
> >>> please let me know what you think.
> >>
> >> It looks OK to me. Some minor nitpicks.
> >>
> >> * On exact exceptions and exact interrupts:
> >>
> >> IIRC, I've heard from you that there is an exceptional architecture
> >> which requires explicit memory barriers in exception/interrupt handlers.
> >> Was it Itanium?
> > 
> > For interrupts from kernel code, it is sometimes the case that the
> > kernel needs other CPUs to see the accesses in the code before the
> > interrupt handler, the code within the interrupt handler, and the code
> > after the interrupt handler to have happened in order.  In that case,
> > the interrupt entry/exit code might well need full memory barriers.
> > 
> > There are other ways to organize this that do not require interrupt
> > entry/exit memory barriers, for example, the convention could be that
> > locking be used.
> 
> Makes sense.
> 
> And I was somewhat confused here.
> By "exact exceptions and exact interrupts", you are talking about
> _instruction stream_ reordering.
> Now I recall that my question at the time was "Do interrupts/
> exceptions behave as memory barriers?".
> So they are distinct point of view. If an architecture didn't
> provide "exact exceptions and exact interrupts", it should be
> quite tricky to return from exceptions/interrupts.

Back in the old days, there were architectures with inexact interrupts
that dumped hardware state onto the stack.  The supposed benefit was
the ability to make progress without requiring that all pages used by
a particular instruction be mapped simultaneously.  But OS writers
really hated it, especially since the size of hardware state varied
based on all sorts of things.  ;-)

> Thanks for the clarification.
> 
> > 
> >> * On the final paragraph of Section 15.3.1:
> >>
> >> "Nevertheless" appears twice in this paragraph. You might want to
> >> reword one of them.
> > 
> > Good point, how about the following?
> > 
> > 	Nevertheless, it is quite possible to overdo use of READ_ONCE()
> > 	and WRITE_ONCE(). For example, if you have prevented a given
> > 	variable from changing (perhaps by holding the lock guarding
> > 	all updates to that variable), there is no point in using
> > 	READ_ONCE(). Similarly, if you have prevented any other CPUs or
> > 	threads from reading a given variable (perhaps because you are
> > 	initializing that variable before any other CPU or thread has
> > 	access to it), there is no point in using WRITE_ONCE(). However,
> > 	my experience is that developers need to use things like
> > 	READ_ONCE() and WRITE_ONCE() more often than they think that
> > 	they do, the overhead of unnecessary uses is quite low, and the
> > 	penalty for failing to use them when needed is quite high.
> 
> The clause added to the last sentence makes the point more obvious.
> This looks more than enough.

Glad you like it!

							Thanx, Paul

>       Thanks, Akira
> 
> > 
> > 							Thanx, Paul
> > 
> >>> And if you are scared, you might actually have a good understanding
> >>> of the true situation.  ;-)
> >>
> >> Whew!!!
> >>
> >>      Thanks, Akira
> >>
> >>>
> >>> 							Thanx, Paul
> >>>
> >>>>          Thanks, Akira
> >>>>
> >>>>>
> >>>>> 							Thanx, Paul
> >>>>>
> >>>>>>         Thanks, Akira
> >>>>>>
> >>>>>>>
> >>>>>>> 							Thanx, Paul
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> > 
> > 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-11-01 20:15 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-16 11:01 Is WRITE_ONCE() enough to prevent invention of stores? Akira Yokosawa
2017-09-17  1:07 ` Paul E. McKenney
2017-09-17 11:04   ` Akira Yokosawa
2017-09-17 21:55     ` Paul E. McKenney
2017-09-17 22:51       ` Akira Yokosawa
2017-10-30 18:14         ` Paul E. McKenney
2017-10-31  3:03           ` Yubin Ruan
2017-10-31  3:14             ` [PATCH] memorder: Add one solution for one snippet Yubin Ruan
2017-10-31  3:50               ` Paul E. McKenney
2017-10-31  5:04                 ` Yubin Ruan
2017-10-31  3:45             ` Is WRITE_ONCE() enough to prevent invention of stores? Paul E. McKenney
2017-10-31 15:36           ` Akira Yokosawa
2017-10-31 16:27             ` Paul E. McKenney
2017-10-31 22:25               ` Akira Yokosawa
2017-11-01 20:15                 ` Paul E. McKenney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.