linux-kernel.vger.kernel.org archive mirror
* Documentation/memory-barriers.txt: How can READ_ONCE() and WRITE_ONCE() provide cache coherence?
From: Sergey Fedorov @ 2016-02-26 21:14 UTC (permalink / raw)
  To: linux-kernel; +Cc: Paul E. McKenney

Hi,

I can't understand how compiler-barrier macros of this kind can
provide any form of cache coherence. Sure, this kind of compiler barrier
is necessary to "reliably" access a variable from multiple CPUs. But why
is it stated that these macros *provide* cache coherence?

From Documentation/memory-barriers.txt:
> The READ_ONCE() and WRITE_ONCE() functions can prevent any number of
> optimizations that, while perfectly safe in single-threaded code, can
> be fatal in concurrent code.  Here are some examples of these sorts
> of optimizations:
>
>  (*) The compiler is within its rights to reorder loads and stores
>      to the same variable, and in some cases, the CPU is within its
>      rights to reorder loads to the same variable.  This means that
>      the following code:
>
>     a[0] = x;
>     a[1] = x;
>
>      Might result in an older value of x stored in a[1] than in a[0].
>      Prevent both the compiler and the CPU from doing this as follows:
>
>     a[0] = READ_ONCE(x);
>     a[1] = READ_ONCE(x);
>
>      In short, READ_ONCE() and WRITE_ONCE() provide cache coherence for
>      accesses from multiple CPUs to a single variable.

Thanks,
Sergey


* Re: Documentation/memory-barriers.txt: How can READ_ONCE() and WRITE_ONCE() provide cache coherence?
From: Paul E. McKenney @ 2016-02-26 21:31 UTC (permalink / raw)
  To: Sergey Fedorov; +Cc: linux-kernel

On Sat, Feb 27, 2016 at 12:14:21AM +0300, Sergey Fedorov wrote:
> Hi,
> 
> I can't understand how compiler-barrier macros of this kind can
> provide any form of cache coherence. Sure, this kind of compiler
> barrier is necessary to "reliably" access a variable from multiple
> CPUs. But why is it stated that these macros *provide* cache
> coherence?

Without READ_ONCE(), common sub-expression elimination optimizations
can cause later reads of a given variable to see an older value than
earlier reads did.  For a (silly) example:

	a = complicated_pure_function(x);
	b = x;
	c = complicated_pure_function(x);

The compiler is within its rights to transform this into the following:

	a = complicated_pure_function(x);
	b = x;
	c = a;

In this case, the assignment to b might see a newer value of x than did
the later assignment to c.  This violates cache coherence, which states
that all reads from a given variable must agree on the order of values
taken on by that variable.

Using READ_ONCE() prevents this violation of cache coherence, albeit
at the price of evaluating complicated_pure_function() twice rather
than once:

	a = complicated_pure_function(READ_ONCE(x));
	b = READ_ONCE(x);
	c = complicated_pure_function(READ_ONCE(x));

Similar examples exist for WRITE_ONCE().

You -want- the compiler to violate cache coherence for normal accesses
to unshared variables, so you have to tell it when cache coherence is
important.
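
As a rough userspace illustration of the above (a sketch, not the
kernel's actual definitions, which are more elaborate), READ_ONCE() and
WRITE_ONCE() can be modelled as volatile casts.  The __typeof__
extension assumes GCC or Clang, the body of complicated_pure_function()
is made up, and struct snapshot and take_snapshot() are hypothetical
names introduced only for this example:

```c
/* Simplified stand-ins for the kernel macros (assumption: GCC/Clang). */
#define READ_ONCE(v)       (*(volatile __typeof__(v) *)&(v))
#define WRITE_ONCE(v, val) (*(volatile __typeof__(v) *)&(v) = (val))

/* Shared variable; imagine another CPU updating it concurrently. */
static int x;

/* Hypothetical pure function: the result depends only on its argument. */
static int complicated_pure_function(int v)
{
	return v * v + 1;
}

struct snapshot {
	int a, b, c;
};

/*
 * Each READ_ONCE() is a volatile access, so the compiler has to emit
 * three separate loads of x, in program order.  It may not reuse the
 * value computed for "a" when computing "c".
 */
static struct snapshot take_snapshot(void)
{
	struct snapshot s;

	s.a = complicated_pure_function(READ_ONCE(x));
	s.b = READ_ONCE(x);
	s.c = complicated_pure_function(READ_ONCE(x));
	return s;
}
```

With plain accesses to x, the compiler would be free to compute
complicated_pure_function(x) once and reuse the result, which is the
transformation shown earlier.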

							Thanx, Paul



* Re: Documentation/memory-barriers.txt: How can READ_ONCE() and WRITE_ONCE() provide cache coherence?
From: Sergey Fedorov @ 2016-02-27 20:13 UTC (permalink / raw)
  To: paulmck; +Cc: linux-kernel

On 27.02.2016 00:31, Paul E. McKenney wrote:
> Without READ_ONCE(), common sub-expression elimination optimizations
> can cause later reads of a given variable to see an older value than
> earlier reads did.  For a (silly) example:
>
> 	a = complicated_pure_function(x);
> 	b = x;
> 	c = complicated_pure_function(x);
>
> The compiler is within its rights to transform this into the following:
>
> 	a = complicated_pure_function(x);
> 	b = x;
> 	c = a;
>
> In this case, the assignment to b might see a newer value of x than did
> the later assignment to c.  This violates cache coherence, which states
> that all reads from a given variable must agree on the order of values
> taken on by that variable.

I see how READ_ONCE() and WRITE_ONCE() can prevent the compiler from
speculating on variable values and optimizing memory accesses. But
concerning cache coherency itself, my understanding is that software can
really ensure hardware cache coherency by using one of the following
methods:
  - by not using the caches
  - by using some sort of cache maintenance instructions
  - by using hardware cache coherency mechanisms (which is what is
    normally used)

What kind of "cache coherency" do you mean?

Thanks,
Sergey


* Re: Documentation/memory-barriers.txt: How can READ_ONCE() and WRITE_ONCE() provide cache coherence?
From: Paul E. McKenney @ 2016-02-27 22:53 UTC (permalink / raw)
  To: Sergey Fedorov; +Cc: linux-kernel

On Sat, Feb 27, 2016 at 11:13:00PM +0300, Sergey Fedorov wrote:
> On 27.02.2016 00:31, Paul E. McKenney wrote:
> >Without READ_ONCE(), common sub-expression elimination optimizations
> >can cause later reads of a given variable to see an older value than
> >earlier reads did.  For a (silly) example:
> >
> >	a = complicated_pure_function(x);
> >	b = x;
> >	c = complicated_pure_function(x);
> >
> >The compiler is within its rights to transform this into the following:
> >
> >	a = complicated_pure_function(x);
> >	b = x;
> >	c = a;
> >
> >In this case, the assignment to b might see a newer value of x than did
> >the later assignment to c.  This violates cache coherence, which states
> >that all reads from a given variable must agree on the order of values
> >taken on by that variable.
> 
> I see how READ_ONCE() and WRITE_ONCE() can prevent the compiler from
> speculating on variable values and optimizing memory accesses. But
> concerning cache coherency itself, my understanding is that software
> can really ensure hardware cache coherency by using one of the
> following methods:
>  - by not using the caches
>  - by using some sort of cache maintenance instructions
>  - by using hardware cache coherency mechanisms (which is what is
>    normally used)
> 
> What kind of "cache coherency" do you mean?

All current systems supporting Linux guarantee that volatile accesses
to a given single variable will be seen in order, even when caches are
active, and without using any cache-coherence instructions.  Note "a
given single variable".  If there is more than one variable in play,
explicit memory ordering is required.  The "volatile" is also important,
because the compiler (and in a few cases, the hardware) can reorder
non-volatile accesses.
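
To make the "more than one variable" case concrete, here is a minimal
userspace sketch using C11 atomics as an analogue of the kernel's
smp_store_release() and smp_load_acquire().  The names payload, ready,
publish() and try_consume() are hypothetical, chosen only for this
example:

```c
#include <stdatomic.h>

static int payload;       /* plain data, published through "ready" */
static atomic_int ready;  /* flag; static storage starts out as zero */

/* Writer side: store the data, then set the flag with release ordering. */
static void publish(int value)
{
	payload = value;
	/* Release: the store to payload is ordered before the flag store. */
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/*
 * Reader side: returns 1 and fills *out if the flag has been seen,
 * otherwise returns 0 and leaves *out untouched.
 */
static int try_consume(int *out)
{
	/* Acquire: if the flag is seen, the payload store is visible too. */
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return 0;
	*out = payload;
	return 1;
}
```

If try_consume() running on another thread returns 1, the
release/acquire pair guarantees that it also sees the payload written
before the flag.  Per-variable coherence alone says nothing about how
accesses to payload and ready are ordered relative to each other.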

							Thanx, Paul


* Re: Documentation/memory-barriers.txt: How can READ_ONCE() and WRITE_ONCE() provide cache coherence?
From: Sergey Fedorov @ 2016-02-29 19:07 UTC (permalink / raw)
  To: paulmck; +Cc: linux-kernel

On 28.02.2016 01:53, Paul E. McKenney wrote:
> On Sat, Feb 27, 2016 at 11:13:00PM +0300, Sergey Fedorov wrote:
>> On 27.02.2016 00:31, Paul E. McKenney wrote:
>>> Without READ_ONCE(), common sub-expression elimination optimizations
>>> can cause later reads of a given variable to see an older value than
>>> earlier reads did.  For a (silly) example:
>>>
>>> 	a = complicated_pure_function(x);
>>> 	b = x;
>>> 	c = complicated_pure_function(x);
>>>
>>> The compiler is within its rights to transform this into the following:
>>>
>>> 	a = complicated_pure_function(x);
>>> 	b = x;
>>> 	c = a;
>>>
>>> In this case, the assignment to b might see a newer value of x than did
>>> the later assignment to c.  This violates cache coherence, which states
>>> that all reads from a given variable must agree on the order of values
>>> taken on by that variable.
>> I see how READ_ONCE() and WRITE_ONCE() can prevent the compiler from
>> speculating on variable values and optimizing memory accesses. But
>> concerning cache coherency itself, my understanding is that software
>> can really ensure hardware cache coherency by using one of the
>> following methods:
>>   - by not using the caches
>>   - by using some sort of cache maintenance instructions
>>   - by using hardware cache coherency mechanisms (which is what is
>>     normally used)
>>
>> What kind of "cache coherency" do you mean?
> All current systems supporting Linux guarantee that volatile accesses
> to a given single variable will be seen in order, even when caches are
> active, and without using any cache-coherence instructions.  Note "a
> given single variable".  If there is more than one variable in play,
> explicit memory ordering is required.  The "volatile" is also important,
> because the compiler (and in a few cases, the hardware) can reorder
> non-volatile accesses.

Thank you for the clarification. I think this was a bit confusing for me
because I always think of cache coherence independently of high-level C
objects like variables. For me, cache coherence is the behavior of the
system in response to CPU(s) making load/store operations to the same
memory location.

Thanks,
Sergey
