linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] tools/memory-model: Clarify syntactic and semantic dependencies
@ 2022-01-25 17:28 Paul Heidekrüger
  2022-01-25 21:00 ` Alan Stern
  0 siblings, 1 reply; 12+ messages in thread
From: Paul Heidekrüger @ 2022-01-25 17:28 UTC (permalink / raw)
  To: Alan Stern, Andrea Parri, Will Deacon, Peter Zijlstra,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Paul E. McKenney, Akira Yokosawa, Daniel Lustig,
	Joel Fernandes, Paul Heidekrüger, Björn Töpel,
	linux-kernel, linux-arch
  Cc: Marco Elver, Charalampos Mainas, Pramod Bhatotia

Dependencies which are purely syntactic, i.e. not semantic, might imply
ordering at first glance. However, since they do not affect defined
behavior, compilers are within their rights to remove such dependencies
when optimizing code.

Since syntactic dependencies are not related to any kind of dependency
in particular, explicitly distinguish syntactic and semantic
dependencies as part of the 'A WARNING' section in explanation.txt,
which gives examples of how compilers might affect the LKMM's dependency
orderings in general.

Link: https://lore.kernel.org/all/20211102190138.GA1497378@rowland.harvard.edu/
Signed-off-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de>
Cc: Marco Elver <elver@google.com>
Cc: Charalampos Mainas <charalampos.mainas@gmail.com>
Cc: Pramod Bhatotia <pramod.bhatotia@in.tum.de>
---
 .../Documentation/explanation.txt             | 25 +++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt
index 5d72f3112e56..6d679e5ebdf9 100644
--- a/tools/memory-model/Documentation/explanation.txt
+++ b/tools/memory-model/Documentation/explanation.txt
@@ -411,6 +411,31 @@ Given this version of the code, the LKMM would predict that the load
 from x could be executed after the store to y.  Thus, the memory
 model's original prediction could be invalidated by the compiler.
 
+Caution is also advised when dependencies are purely syntactic, i.e.
+not semantic.  A dependency between two marked accesses is purely
+syntactic iff the defined behavior of the second access is unaffected
+by its dependency.
+
+Compilers are aware of syntactic dependencies and are within their
+rights to remove them as part of optimizations, thereby breaking any
+guarantees of ordering.
+
+Notable cases are dependencies eliminated through constant propagation
+or those where only one value leads to defined behavior as in the
+following example:
+
+	int a[1];
+	int i;
+
+	r1 = READ_ONCE(i);
+	r2 = READ_ONCE(a[r1]);
+
+The formal LKMM is unaware of syntactic dependencies and therefore
+predicts ordering.  However, since any other value than 0 for r1 would
+result in an out-of-bounds access, which is undefined behavior, r2 is
+not affected by its dependency to r1, making the above a purely
+syntactic dependency.
+
 Another issue arises from the fact that in C, arguments to many
 operators and function calls can be evaluated in any order.  For
 example:
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] tools/memory-model: Clarify syntactic and semantic dependencies
  2022-01-25 17:28 [PATCH] tools/memory-model: Clarify syntactic and semantic dependencies Paul Heidekrüger
@ 2022-01-25 21:00 ` Alan Stern
  2022-01-27 11:00   ` Paul Heidekrüger
  0 siblings, 1 reply; 12+ messages in thread
From: Alan Stern @ 2022-01-25 21:00 UTC (permalink / raw)
  To: Paul Heidekrüger
  Cc: Andrea Parri, Will Deacon, Peter Zijlstra, Boqun Feng,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	Björn Töpel, linux-kernel, linux-arch, Marco Elver,
	Charalampos Mainas, Pramod Bhatotia

On Tue, Jan 25, 2022 at 05:28:19PM +0000, Paul Heidekrüger wrote:
> Dependencies which are purely syntactic, i.e. not semantic, might imply
> ordering at first glance. However, since they do not affect defined
> behavior, compilers are within their rights to remove such dependencies
> when optimizing code.
> 
> Since syntactic dependencies are not related to any kind of dependency
> in particular, explicitly distinguish syntactic and semantic
> dependencies as part of the 'A WARNING' section in explanation.txt,
> which gives examples of how compilers might affect the LKMM's dependency
> orderings in general.

The "A WARNING" section is a bad place to put this material, because it 
comes before dependencies have been introduced.  It would be better to 
put this at the end of the "DEPENDENCY RELATIONS: data, addr, and ctrl" 
section.

> Link: https://lore.kernel.org/all/20211102190138.GA1497378@rowland.harvard.edu/
> Signed-off-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de>
> Cc: Marco Elver <elver@google.com>
> Cc: Charalampos Mainas <charalampos.mainas@gmail.com>
> Cc: Pramod Bhatotia <pramod.bhatotia@in.tum.de>
> ---
>  .../Documentation/explanation.txt             | 25 +++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt
> index 5d72f3112e56..6d679e5ebdf9 100644
> --- a/tools/memory-model/Documentation/explanation.txt
> +++ b/tools/memory-model/Documentation/explanation.txt
> @@ -411,6 +411,31 @@ Given this version of the code, the LKMM would predict that the load
>  from x could be executed after the store to y.  Thus, the memory
>  model's original prediction could be invalidated by the compiler.
>  
> +Caution is also advised when dependencies are purely syntactic, i.e.
> +not semantic.  A dependency between two marked accesses is purely
> +syntactic iff the defined behavior of the second access is unaffected
> +by its dependency.

That's a very abstract way of describing the situation; it doesn't do a 
good job of getting the real idea across.  It also mixes up two separate 
ideas: behaviors being unaffected by a syntactic dependency and 
behaviors being undefined.  They should be described separately.  I 
would prefer something along these lines...

----------------------------------------

Here's a trick question: When is a dependency not a dependency? Answer: 
When it is purely syntactic rather than semantic.  We say a dependency 
between two accesses is purely syntactic if the second access doesn't 
actually depend on the result of the first.  Here is a trivial example:

	r1 = READ_ONCE(x);
	WRITE_ONCE(y, r1 * 0);

There appears to be a data dependency from the load of x to the store of 
y, since the value to be stored is computed from the value that was 
loaded.  But in fact, the value stored does not really depend on 
anything since it will always be 0.  Thus the data dependency is only 
syntactic (it appears to exist in the code) but not semantic (the second 
access will always be the same, regardless of the value of the first 
access).  Given code like this, a compiler could simply eliminate the 
load from x, which would certainly destroy any dependency.

(It's natural to object that no one in their right mind would write code 
like the above.  However, macro expansions can easily give rise to this 
sort of thing, in ways that generally are not apparent to the 
programmer.)

Another mechanism that can give rise to purely syntactic dependencies is 
related to the notion of "undefined behavior". Certain program behaviors 
are called "undefined" in the C language specification, which means that 
when they occur there are no guarantees at all about the outcome.  
Consider the following example:

	int a[1];
	int i;

	r1 = READ_ONCE(i);
	r2 = READ_ONCE(a[r1]);

Access beyond the end or before the beginning of an array is one kind of 
undefined behavior.  Therefore the compiler doesn't have to worry about 
what will happen if r1 is nonzero, and it can assume that r1 will always 
be zero without actually loading anything from i.  (If the assumption 
turns out to be wrong, the resulting behavior will be undefined anyway 
so the compiler doesn't care!)  Thus the load from i can be eliminated, 
breaking the address dependency.

The LKMM is unaware that purely syntactic dependencies are different 
from semantic dependencies and therefore mistakenly predicts that the 
accesses in the two examples above will be ordered.  This is another 
example of how the compiler can undermine the memory model.  Be warned.

----------------------------------------

Alan

> +Compilers are aware of syntactic dependencies and are within their
> +rights to remove them as part of optimizations, thereby breaking any
> +guarantees of ordering.
> +
> +Notable cases are dependencies eliminated through constant propagation
> +or those where only one value leads to defined behavior as in the
> +following example:
> +
> +	int a[1];
> +	int i;
> +
> +	r1 = READ_ONCE(i);
> +	r2 = READ_ONCE(a[r1]);
> +
> +The formal LKMM is unaware of syntactic dependencies and therefore
> +predicts ordering.  However, since any other value than 0 for r1 would
> +result in an out-of-bounds access, which is undefined behavior, r2 is
> +not affected by its dependency to r1, making the above a purely
> +syntactic dependency.
> +
>  Another issue arises from the fact that in C, arguments to many
>  operators and function calls can be evaluated in any order.  For
>  example:
> -- 
> 2.33.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tools/memory-model: Clarify syntactic and semantic dependencies
  2022-01-25 21:00 ` Alan Stern
@ 2022-01-27 11:00   ` Paul Heidekrüger
  2022-01-27 17:04     ` Alan Stern
  0 siblings, 1 reply; 12+ messages in thread
From: Paul Heidekrüger @ 2022-01-27 11:00 UTC (permalink / raw)
  To: Alan Stern
  Cc: Andrea Parri, Will Deacon, Peter Zijlstra, Boqun Feng,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	Björn Töpel, linux-kernel, linux-arch, Marco Elver,
	Charalampos Mainas, Pramod Bhatotia

On Tue, Jan 25, 2022 at 04:00:11PM -0500, Alan Stern wrote:
> On Tue, Jan 25, 2022 at 05:28:19PM +0000, Paul Heidekrüger wrote:
> > Dependencies which are purely syntactic, i.e. not semantic, might imply
> > ordering at first glance. However, since they do not affect defined
> > behavior, compilers are within their rights to remove such dependencies
> > when optimizing code.
> > 
> > Since syntactic dependencies are not related to any kind of dependency
> > in particular, explicitly distinguish syntactic and semantic
> > dependencies as part of the 'A WARNING' section in explanation.txt,
> > which gives examples of how compilers might affect the LKMM's dependency
> > orderings in general.
> 
> The "A WARNING" section is a bad place to put this material, because it 
> comes before dependencies have been introduced.  It would be better to 
> put this at the end of the "DEPENDENCY RELATIONS: data, addr, and ctrl" 
> section.
> 
> > Link: https://lore.kernel.org/all/20211102190138.GA1497378@rowland.harvard.edu/
> > Signed-off-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de>
> > Cc: Marco Elver <elver@google.com>
> > Cc: Charalampos Mainas <charalampos.mainas@gmail.com>
> > Cc: Pramod Bhatotia <pramod.bhatotia@in.tum.de>
> > ---
> >  .../Documentation/explanation.txt             | 25 +++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> > 
> > diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt
> > index 5d72f3112e56..6d679e5ebdf9 100644
> > --- a/tools/memory-model/Documentation/explanation.txt
> > +++ b/tools/memory-model/Documentation/explanation.txt
> > @@ -411,6 +411,31 @@ Given this version of the code, the LKMM would predict that the load
> >  from x could be executed after the store to y.  Thus, the memory
> >  model's original prediction could be invalidated by the compiler.
> >  
> > +Caution is also advised when dependencies are purely syntactic, i.e.
> > +not semantic.  A dependency between two marked accesses is purely
> > +syntactic iff the defined behavior of the second access is unaffected
> > +by its dependency.
> 
> That's a very abstract way of describing the situation; it doesn't do a 
> good job of getting the real idea across.  It also mixes up two separate 
> ideas: behaviors being unaffected by a syntactic dependency and 
> behaviors being undefined.  They should be described separately. 

Many thanks for the feedback! I agree, the explanation works a lot
better once readers have been introduced to data, addr and ctrl
relations. 

> I would prefer something along these lines...

Shall I resubmit the patch with you as co-developer, or, given that it's
arguably your work now, would you like to submit the patch yourself?

Many thanks,
Paul

> ----------------------------------------
> 
> Here's a trick question: When is a dependency not a dependency? Answer: 
> When it is purely syntactic rather than semantic.  We say a dependency 
> between two accesses is purely syntactic if the second access doesn't 
> actually depend on the result of the first.  Here is a trivial example:
> 
> 	r1 = READ_ONCE(x);
> 	WRITE_ONCE(y, r1 * 0);
> 
> There appears to be a data dependency from the load of x to the store of 
> y, since the value to be stored is computed from the value that was 
> loaded.  But in fact, the value stored does not really depend on 
> anything since it will always be 0.  Thus the data dependency is only 
> syntactic (it appears to exist in the code) but not semantic (the second 
> access will always be the same, regardless of the value of the first 
> access).  Given code like this, a compiler could simply eliminate the 
> load from x, which would certainly destroy any dependency.
> 
> (It's natural to object that no one in their right mind would write code 
> like the above.  However, macro expansions can easily give rise to this 
> sort of thing, in ways that generally are not apparent to the 
> programmer.)
> 
> Another mechanism that can give rise to purely syntactic dependencies is 
> related to the notion of "undefined behavior". Certain program behaviors 
> are called "undefined" in the C language specification, which means that 
> when they occur there are no guarantees at all about the outcome.  
> Consider the following example:
> 
> 	int a[1];
> 	int i;
> 
> 	r1 = READ_ONCE(i);
> 	r2 = READ_ONCE(a[r1]);
> 
> Access beyond the end or before the beginning of an array is one kind of 
> undefined behavior.  Therefore the compiler doesn't have to worry about 
> what will happen if r1 is nonzero, and it can assume that r1 will always 
> be zero without actually loading anything from i.  (If the assumption 
> turns out to be wrong, the resulting behavior will be undefined anyway 
> so the compiler doesn't care!)  Thus the load from i can be eliminated, 
> breaking the address dependency.
> 
> The LKMM is unaware that purely syntactic dependencies are different 
> from semantic dependencies and therefore mistakenly predicts that the 
> accesses in the two examples above will be ordered.  This is another 
> example of how the compiler can undermine the memory model.  Be warned.
> 
> ----------------------------------------
> 
> Alan
> 
> > +Compilers are aware of syntactic dependencies and are within their
> > +rights to remove them as part of optimizations, thereby breaking any
> > +guarantees of ordering.
> > +
> > +Notable cases are dependencies eliminated through constant propagation
> > +or those where only one value leads to defined behavior as in the
> > +following example:
> > +
> > +	int a[1];
> > +	int i;
> > +
> > +	r1 = READ_ONCE(i);
> > +	r2 = READ_ONCE(a[r1]);
> > +
> > +The formal LKMM is unaware of syntactic dependencies and therefore
> > +predicts ordering.  However, since any other value than 0 for r1 would
> > +result in an out-of-bounds access, which is undefined behavior, r2 is
> > +not affected by its dependency to r1, making the above a purely
> > +syntactic dependency.
> > +
> >  Another issue arises from the fact that in C, arguments to many
> >  operators and function calls can be evaluated in any order.  For
> >  example:
> > -- 
> > 2.33.1
> > 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tools/memory-model: Clarify syntactic and semantic dependencies
  2022-01-27 11:00   ` Paul Heidekrüger
@ 2022-01-27 17:04     ` Alan Stern
  2022-01-27 20:49       ` Paul Heidekrüger
  0 siblings, 1 reply; 12+ messages in thread
From: Alan Stern @ 2022-01-27 17:04 UTC (permalink / raw)
  To: Paul Heidekrüger
  Cc: Andrea Parri, Will Deacon, Peter Zijlstra, Boqun Feng,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	Björn Töpel, linux-kernel, linux-arch, Marco Elver,
	Charalampos Mainas, Pramod Bhatotia

On Thu, Jan 27, 2022 at 12:00:22PM +0100, Paul Heidekrüger wrote:
> On Tue, Jan 25, 2022 at 04:00:11PM -0500, Alan Stern wrote:
> > That's a very abstract way of describing the situation; it doesn't do a 
> > good job of getting the real idea across.  It also mixes up two separate 
> > ideas: behaviors being unaffected by a syntactic dependency and 
> > behaviors being undefined.  They should be described separately. 
> 
> Many thanks for the feedback! I agree, the explanation works a lot
> better once readers have been introduced to data, addr and ctrl
> relations. 
> 
> > I would prefer something along these lines...
> 
> Shall I resubmit the patch with you as co-developer, or, given that it's
> arguably your work now, would you like to submit the patch yourself?

I'll submit it myself, with Suggested-by: you.  How does that sound?

Alan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tools/memory-model: Clarify syntactic and semantic dependencies
  2022-01-27 17:04     ` Alan Stern
@ 2022-01-27 20:49       ` Paul Heidekrüger
  2022-01-27 21:11         ` [PATCH] tools/memory-model: Explain " Alan Stern
  0 siblings, 1 reply; 12+ messages in thread
From: Paul Heidekrüger @ 2022-01-27 20:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: Andrea Parri, Will Deacon, Peter Zijlstra, Boqun Feng,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	Björn Töpel, linux-kernel, linux-arch, Marco Elver,
	Charalampos Mainas, Pramod Bhatotia

On Thu, Jan 27, 2022 at 12:04:26PM -0500, Alan Stern wrote:
> On Thu, Jan 27, 2022 at 12:00:22PM +0100, Paul Heidekrüger wrote:
> > On Tue, Jan 25, 2022 at 04:00:11PM -0500, Alan Stern wrote:
> > > That's a very abstract way of describing the situation; it doesn't do a 
> > > good job of getting the real idea across.  It also mixes up two separate 
> > > ideas: behaviors being unaffected by a syntactic dependency and 
> > > behaviors being undefined.  They should be described separately. 
> > 
> > Many thanks for the feedback! I agree, the explanation works a lot
> > better once readers have been introduced to data, addr and ctrl
> > relations. 
> > 
> > > I would prefer something along these lines...
> > 
> > Shall I resubmit the patch with you as co-developer, or, given that it's
> > arguably your work now, would you like to submit the patch yourself?
> 
> I'll submit it myself, with Suggested-by: you.  How does that sound?

Sounds good! 

Many thanks,
Paul

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] tools/memory-model: Explain syntactic and semantic dependencies
  2022-01-27 20:49       ` Paul Heidekrüger
@ 2022-01-27 21:11         ` Alan Stern
  2022-02-01  1:42           ` Akira Yokosawa
                             ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Alan Stern @ 2022-01-27 21:11 UTC (permalink / raw)
  To: Paul Heidekrüger
  Cc: Andrea Parri, Will Deacon, Peter Zijlstra, Boqun Feng,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	Björn Töpel, linux-kernel, linux-arch, Marco Elver,
	Charalampos Mainas, Pramod Bhatotia

Paul Heidekrüger pointed out that the Linux Kernel Memory Model
documentation doesn't mention the distinction between syntactic and
semantic dependencies.  This is an important difference, because the
compiler can easily break dependencies that are only syntactic, not
semantic.

This patch adds a few paragraphs to the LKMM documentation explaining
these issues and illustrating how they can matter.

Suggested-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

---


[as1970]


 tools/memory-model/Documentation/explanation.txt |   47 +++++++++++++++++++++++
 1 file changed, 47 insertions(+)

Index: usb-devel/tools/memory-model/Documentation/explanation.txt
===================================================================
--- usb-devel.orig/tools/memory-model/Documentation/explanation.txt
+++ usb-devel/tools/memory-model/Documentation/explanation.txt
@@ -485,6 +485,53 @@ have R ->po X.  It wouldn't make sense f
 somehow on a value that doesn't get loaded from shared memory until
 later in the code!
 
+Here's a trick question: When is a dependency not a dependency?  Answer:
+When it is purely syntactic rather than semantic.  We say a dependency
+between two accesses is purely syntactic if the second access doesn't
+actually depend on the result of the first.  Here is a trivial example:
+
+	r1 = READ_ONCE(x);
+	WRITE_ONCE(y, r1 * 0);
+
+There appears to be a data dependency from the load of x to the store of
+y, since the value to be stored is computed from the value that was
+loaded.  But in fact, the value stored does not really depend on
+anything since it will always be 0.  Thus the data dependency is only
+syntactic (it appears to exist in the code) but not semantic (the second
+access will always be the same, regardless of the value of the first
+access).  Given code like this, a compiler could simply eliminate the
+load from x, which would certainly destroy any dependency.
+
+(It's natural to object that no one in their right mind would write code
+like the above.  However, macro expansions can easily give rise to this
+sort of thing, in ways that generally are not apparent to the
+programmer.)
+
+Another mechanism that can give rise to purely syntactic dependencies is
+related to the notion of "undefined behavior".  Certain program behaviors
+are called "undefined" in the C language specification, which means that
+when they occur there are no guarantees at all about the outcome.
+Consider the following example:
+
+	int a[1];
+	int i;
+
+	r1 = READ_ONCE(i);
+	r2 = READ_ONCE(a[r1]);
+
+Access beyond the end or before the beginning of an array is one kind of
+undefined behavior.  Therefore the compiler doesn't have to worry about
+what will happen if r1 is nonzero, and it can assume that r1 will always
+be zero without actually loading anything from i.  (If the assumption
+turns out to be wrong, the resulting behavior will be undefined anyway
+so the compiler doesn't care!)  Thus the load from i can be eliminated,
+breaking the address dependency.
+
+The LKMM is unaware that purely syntactic dependencies are different
+from semantic dependencies and therefore mistakenly predicts that the
+accesses in the two examples above will be ordered.  This is another
+example of how the compiler can undermine the memory model.  Be warned.
+
 
 THE READS-FROM RELATION: rf, rfi, and rfe
 -----------------------------------------

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tools/memory-model: Explain syntactic and semantic dependencies
  2022-01-27 21:11         ` [PATCH] tools/memory-model: Explain " Alan Stern
@ 2022-02-01  1:42           ` Akira Yokosawa
  2022-02-01 18:02           ` Paul E. McKenney
  2022-02-01 19:00           ` [PATCH v2] " Alan Stern
  2 siblings, 0 replies; 12+ messages in thread
From: Akira Yokosawa @ 2022-02-01  1:42 UTC (permalink / raw)
  To: Alan Stern, Paul Heidekrüger
  Cc: Andrea Parri, Will Deacon, Peter Zijlstra, Boqun Feng,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Daniel Lustig, Joel Fernandes,
	Björn Töpel, linux-kernel, linux-arch, Marco Elver,
	Charalampos Mainas, Pramod Bhatotia

Hi,

On Thu, 27 Jan 2022 16:11:48 -0500,
Alan Stern wrote:
> Paul Heidekrüger pointed out that the Linux Kernel Memory Model
> documentation doesn't mention the distinction between syntactic and
> semantic dependencies.  This is an important difference, because the
> compiler can easily break dependencies that are only syntactic, not
> semantic.
> 
> This patch adds a few paragraphs to the LKMM documentation explaining
> these issues and illustrating how they can matter.
> 
> Suggested-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de>
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> 
> ---
> 
> 
> [as1970]
> 
> 
>  tools/memory-model/Documentation/explanation.txt |   47 +++++++++++++++++++++++
>  1 file changed, 47 insertions(+)
> 
> Index: usb-devel/tools/memory-model/Documentation/explanation.txt
> ===================================================================
> --- usb-devel.orig/tools/memory-model/Documentation/explanation.txt
> +++ usb-devel/tools/memory-model/Documentation/explanation.txt
> @@ -485,6 +485,53 @@ have R ->po X.  It wouldn't make sense f
>  somehow on a value that doesn't get loaded from shared memory until
>  later in the code!
>  
> +Here's a trick question: When is a dependency not a dependency?  Answer:
> +When it is purely syntactic rather than semantic.  We say a dependency
> +between two accesses is purely syntactic if the second access doesn't
> +actually depend on the result of the first.  Here is a trivial example:
> +
> +	r1 = READ_ONCE(x);
> +	WRITE_ONCE(y, r1 * 0);
> +
> +There appears to be a data dependency from the load of x to the store of
> +y, since the value to be stored is computed from the value that was
> +loaded.  But in fact, the value stored does not really depend on
> +anything since it will always be 0.  Thus the data dependency is only
> +syntactic (it appears to exist in the code) but not semantic (the second
> +access will always be the same, regardless of the value of the first
> +access).  Given code like this, a compiler could simply eliminate the
> +load from x, which would certainly destroy any dependency.
> +
> +(It's natural to object that no one in their right mind would write code
> +like the above.  However, macro expansions can easily give rise to this
> +sort of thing, in ways that generally are not apparent to the
> +programmer.)
> +
> +Another mechanism that can give rise to purely syntactic dependencies is
> +related to the notion of "undefined behavior".  Certain program behaviors
> +are called "undefined" in the C language specification, which means that
> +when they occur there are no guarantees at all about the outcome.
> +Consider the following example:
> +
> +	int a[1];
> +	int i;
> +
> +	r1 = READ_ONCE(i);
> +	r2 = READ_ONCE(a[r1]);
> +
> +Access beyond the end or before the beginning of an array is one kind of
> +undefined behavior.  Therefore the compiler doesn't have to worry about
> +what will happen if r1 is nonzero, and it can assume that r1 will always
> +be zero without actually loading anything from i.  (If the assumption
> +turns out to be wrong, the resulting behavior will be undefined anyway
> +so the compiler doesn't care!)  Thus the load from i can be eliminated,
> +breaking the address dependency.
> +
> +The LKMM is unaware that purely syntactic dependencies are different
> +from semantic dependencies and therefore mistakenly predicts that the
> +accesses in the two examples above will be ordered.  This is another
> +example of how the compiler can undermine the memory model.  Be warned.
> +
>  
>  THE READS-FROM RELATION: rf, rfi, and rfe
>  -----------------------------------------

FWIW,

Reviewed-by: Akira Yokosawa <akiyks@gmail.com>

        Thanks, Akira


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tools/memory-model: Explain syntactic and semantic dependencies
  2022-01-27 21:11         ` [PATCH] tools/memory-model: Explain " Alan Stern
  2022-02-01  1:42           ` Akira Yokosawa
@ 2022-02-01 18:02           ` Paul E. McKenney
  2022-02-01 18:53             ` Alan Stern
  2022-02-01 19:00           ` [PATCH v2] " Alan Stern
  2 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2022-02-01 18:02 UTC (permalink / raw)
  To: Alan Stern
  Cc: Paul Heidekrüger, Andrea Parri, Will Deacon, Peter Zijlstra,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	Björn Töpel, linux-kernel, linux-arch, Marco Elver,
	Charalampos Mainas, Pramod Bhatotia

On Thu, Jan 27, 2022 at 04:11:48PM -0500, Alan Stern wrote:
> Paul Heidekrüger pointed out that the Linux Kernel Memory Model
> documentation doesn't mention the distinction between syntactic and
> semantic dependencies.  This is an important difference, because the
> compiler can easily break dependencies that are only syntactic, not
> semantic.
> 
> This patch adds a few paragraphs to the LKMM documentation explaining
> these issues and illustrating how they can matter.
> 
> Suggested-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de>
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> 
> ---
> 
> 
> [as1970]
> 
> 
>  tools/memory-model/Documentation/explanation.txt |   47 +++++++++++++++++++++++
>  1 file changed, 47 insertions(+)
> 
> Index: usb-devel/tools/memory-model/Documentation/explanation.txt
> ===================================================================
> --- usb-devel.orig/tools/memory-model/Documentation/explanation.txt
> +++ usb-devel/tools/memory-model/Documentation/explanation.txt
> @@ -485,6 +485,53 @@ have R ->po X.  It wouldn't make sense f
>  somehow on a value that doesn't get loaded from shared memory until
>  later in the code!
>  
> +Here's a trick question: When is a dependency not a dependency?  Answer:
> +When it is purely syntactic rather than semantic.  We say a dependency
> +between two accesses is purely syntactic if the second access doesn't
> +actually depend on the result of the first.  Here is a trivial example:
> +
> +	r1 = READ_ONCE(x);
> +	WRITE_ONCE(y, r1 * 0);
> +
> +There appears to be a data dependency from the load of x to the store of
> +y, since the value to be stored is computed from the value that was
> +loaded.  But in fact, the value stored does not really depend on
> +anything since it will always be 0.  Thus the data dependency is only
> +syntactic (it appears to exist in the code) but not semantic (the second
> +access will always be the same, regardless of the value of the first
> +access).  Given code like this, a compiler could simply eliminate the
> +load from x, which would certainly destroy any dependency.

Are you OK with that last sentence reading as follows?

	Given code like this, a compiler could simply discard the value
	return by the load from x, which would certainly destroy any
	dependency.

My concern with the original is that it might mislead people into thinking
that compilers can omit volatile loads.

> +
> +(It's natural to object that no one in their right mind would write code
> +like the above.  However, macro expansions can easily give rise to this
> +sort of thing, in ways that generally are not apparent to the
> +programmer.)
> +
> +Another mechanism that can give rise to purely syntactic dependencies is
> +related to the notion of "undefined behavior".  Certain program behaviors
> +are called "undefined" in the C language specification, which means that
> +when they occur there are no guarantees at all about the outcome.
> +Consider the following example:
> +
> +	int a[1];
> +	int i;
> +
> +	r1 = READ_ONCE(i);
> +	r2 = READ_ONCE(a[r1]);
> +
> +Access beyond the end or before the beginning of an array is one kind of
> +undefined behavior.  Therefore the compiler doesn't have to worry about
> +what will happen if r1 is nonzero, and it can assume that r1 will always
> +be zero without actually loading anything from i.

And similarly here:

	... and it can assume that r1 will always be zero regardless of
	the value actually loaded from i.

> +                                                   (If the assumption
> +turns out to be wrong, the resulting behavior will be undefined anyway
> +so the compiler doesn't care!)  Thus the load from i can be eliminated,
> +breaking the address dependency.
> +
> +The LKMM is unaware that purely syntactic dependencies are different
> +from semantic dependencies and therefore mistakenly predicts that the
> +accesses in the two examples above will be ordered.  This is another
> +example of how the compiler can undermine the memory model.  Be warned.
> +
>  
>  THE READS-FROM RELATION: rf, rfi, and rfe
>  -----------------------------------------

Looks great otherwise, and thank you all for your work on this!

Alan, would you like me to pull this in making those two changes and
applying Akira's Reviewed-by, or would you prefer to send another version?

For that matter, am I off base in my suggested changes.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tools/memory-model: Explain syntactic and semantic dependencies
  2022-02-01 18:02           ` Paul E. McKenney
@ 2022-02-01 18:53             ` Alan Stern
  2022-02-01 19:02               ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Alan Stern @ 2022-02-01 18:53 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Paul Heidekrüger, Andrea Parri, Will Deacon, Peter Zijlstra,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	Björn Töpel, linux-kernel, linux-arch, Marco Elver,
	Charalampos Mainas, Pramod Bhatotia

On Tue, Feb 01, 2022 at 10:02:39AM -0800, Paul E. McKenney wrote:
> On Thu, Jan 27, 2022 at 04:11:48PM -0500, Alan Stern wrote:
> > Paul Heidekrüger pointed out that the Linux Kernel Memory Model
> > documentation doesn't mention the distinction between syntactic and
> > semantic dependencies.  This is an important difference, because the
> > compiler can easily break dependencies that are only syntactic, not
> > semantic.
> > 
> > This patch adds a few paragraphs to the LKMM documentation explaining
> > these issues and illustrating how they can matter.
> > 
> > Suggested-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de>
> > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > 
> > ---
> > 
> > 
> > [as1970]
> > 
> > 
> >  tools/memory-model/Documentation/explanation.txt |   47 +++++++++++++++++++++++
> >  1 file changed, 47 insertions(+)
> > 
> > Index: usb-devel/tools/memory-model/Documentation/explanation.txt
> > ===================================================================
> > --- usb-devel.orig/tools/memory-model/Documentation/explanation.txt
> > +++ usb-devel/tools/memory-model/Documentation/explanation.txt
> > @@ -485,6 +485,53 @@ have R ->po X.  It wouldn't make sense f
> >  somehow on a value that doesn't get loaded from shared memory until
> >  later in the code!
> >  
> > +Here's a trick question: When is a dependency not a dependency?  Answer:
> > +When it is purely syntactic rather than semantic.  We say a dependency
> > +between two accesses is purely syntactic if the second access doesn't
> > +actually depend on the result of the first.  Here is a trivial example:
> > +
> > +	r1 = READ_ONCE(x);
> > +	WRITE_ONCE(y, r1 * 0);
> > +
> > +There appears to be a data dependency from the load of x to the store of
> > +y, since the value to be stored is computed from the value that was
> > +loaded.  But in fact, the value stored does not really depend on
> > +anything since it will always be 0.  Thus the data dependency is only
> > +syntactic (it appears to exist in the code) but not semantic (the second
> > +access will always be the same, regardless of the value of the first
> > +access).  Given code like this, a compiler could simply eliminate the
> > +load from x, which would certainly destroy any dependency.
> 
> Are you OK with that last sentence reading as follows?
> 
> 	Given code like this, a compiler could simply discard the value
> 	return by the load from x, which would certainly destroy any

s/return/returned/

> 	dependency.
> 
> My concern with the original is that it might mislead people into thinking
> that compilers can omit volatile loads.

Yes, good point.  Should we also tack on something like this?

	(The compiler is not permitted to eliminate entirely the load 
	generated for a READ_ONCE() -- that's one of the nice properties 
	of READ_ONCE() -- but it is allowed to ignore the load's value.)

> > +
> > +(It's natural to object that no one in their right mind would write code
> > +like the above.  However, macro expansions can easily give rise to this
> > +sort of thing, in ways that generally are not apparent to the
> > +programmer.)
> > +
> > +Another mechanism that can give rise to purely syntactic dependencies is
> > +related to the notion of "undefined behavior".  Certain program behaviors
> > +are called "undefined" in the C language specification, which means that
> > +when they occur there are no guarantees at all about the outcome.
> > +Consider the following example:
> > +
> > +	int a[1];
> > +	int i;
> > +
> > +	r1 = READ_ONCE(i);
> > +	r2 = READ_ONCE(a[r1]);
> > +
> > +Access beyond the end or before the beginning of an array is one kind of
> > +undefined behavior.  Therefore the compiler doesn't have to worry about
> > +what will happen if r1 is nonzero, and it can assume that r1 will always
> > +be zero without actually loading anything from i.
> 
> And similarly here:
> 
> 	... and it can assume that r1 will always be zero regardless of
> 	the value actually loaded from i.

Right.

> > +                                                   (If the assumption
> > +turns out to be wrong, the resulting behavior will be undefined anyway
> > +so the compiler doesn't care!)  Thus the load from i can be eliminated,
> > +breaking the address dependency.

This also should be changed:

	Thus the value from the load can be discarded, breaking the 
	address dependency.

> > +
> > +The LKMM is unaware that purely syntactic dependencies are different
> > +from semantic dependencies and therefore mistakenly predicts that the
> > +accesses in the two examples above will be ordered.  This is another
> > +example of how the compiler can undermine the memory model.  Be warned.
> > +
> >  
> >  THE READS-FROM RELATION: rf, rfi, and rfe
> >  -----------------------------------------
> 
> Looks great otherwise, and thank you all for your work on this!
> 
> Alan, would you like me to pull this in making those two changes and
> applying Akira's Reviewed-by, or would you prefer to send another version?

I'll send a new version.

> For that matter, am I off base in my suggested changes.

Not at all.  Thanks.

Alan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2] tools/memory-model: Explain syntactic and semantic dependencies
  2022-01-27 21:11         ` [PATCH] tools/memory-model: Explain " Alan Stern
  2022-02-01  1:42           ` Akira Yokosawa
  2022-02-01 18:02           ` Paul E. McKenney
@ 2022-02-01 19:00           ` Alan Stern
  2022-02-01 19:09             ` Paul E. McKenney
  2 siblings, 1 reply; 12+ messages in thread
From: Alan Stern @ 2022-02-01 19:00 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Paul Heidekrüger, Andrea Parri, Will Deacon, Peter Zijlstra,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	Björn Töpel, linux-kernel, linux-arch, Marco Elver,
	Charalampos Mainas, Pramod Bhatotia

Paul Heidekrüger pointed out that the Linux Kernel Memory Model
documentation doesn't mention the distinction between syntactic and
semantic dependencies.  This is an important difference, because the
compiler can easily break dependencies that are only syntactic, not
semantic.

This patch adds a few paragraphs to the LKMM documentation explaining
these issues and illustrating how they can matter.

Suggested-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

---

v2:	Incorporate changes suggested by Paul McKenney, along with a few
	other minor edits.
 

[as1970b]


 tools/memory-model/Documentation/explanation.txt |   51 +++++++++++++++++++++++
 1 file changed, 51 insertions(+)

Index: usb-devel/tools/memory-model/Documentation/explanation.txt
===================================================================
--- usb-devel.orig/tools/memory-model/Documentation/explanation.txt
+++ usb-devel/tools/memory-model/Documentation/explanation.txt
@@ -485,6 +485,57 @@ have R ->po X.  It wouldn't make sense f
 somehow on a value that doesn't get loaded from shared memory until
 later in the code!
 
+Here's a trick question: When is a dependency not a dependency?  Answer:
+When it is purely syntactic rather than semantic.  We say a dependency
+between two accesses is purely syntactic if the second access doesn't
+actually depend on the result of the first.  Here is a trivial example:
+
+	r1 = READ_ONCE(x);
+	WRITE_ONCE(y, r1 * 0);
+
+There appears to be a data dependency from the load of x to the store
+of y, since the value to be stored is computed from the value that was
+loaded.  But in fact, the value stored does not really depend on
+anything since it will always be 0.  Thus the data dependency is only
+syntactic (it appears to exist in the code) but not semantic (the
+second access will always be the same, regardless of the value of the
+first access).  Given code like this, a compiler could simply discard
+the value returned by the load from x, which would certainly destroy
+any dependency.  (The compiler is not permitted to eliminate entirely
+the load generated for a READ_ONCE() -- that's one of the nice
+properties of READ_ONCE() -- but it is allowed to ignore the load's
+value.)
+
+It's natural to object that no one in their right mind would write
+code like the above.  However, macro expansions can easily give rise
+to this sort of thing, in ways that often are not apparent to the
+programmer.
+
+Another mechanism that can lead to purely syntactic dependencies is
+related to the notion of "undefined behavior".  Certain program
+behaviors are called "undefined" in the C language specification,
+which means that when they occur there are no guarantees at all about
+the outcome.  Consider the following example:
+
+	int a[1];
+	int i;
+
+	r1 = READ_ONCE(i);
+	r2 = READ_ONCE(a[r1]);
+
+Access beyond the end or before the beginning of an array is one kind
+of undefined behavior.  Therefore the compiler doesn't have to worry
+about what will happen if r1 is nonzero, and it can assume that r1
+will always be zero regardless of the value actually loaded from i.
+(If the assumption turns out to be wrong the resulting behavior will
+be undefined anyway, so the compiler doesn't care!)  Thus the value
+from the load can be discarded, breaking the address dependency.
+
+The LKMM is unaware that purely syntactic dependencies are different
+from semantic dependencies and therefore mistakenly predicts that the
+accesses in the two examples above will be ordered.  This is another
+example of how the compiler can undermine the memory model.  Be warned.
+
 
 THE READS-FROM RELATION: rf, rfi, and rfe
 -----------------------------------------

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] tools/memory-model: Explain syntactic and semantic dependencies
  2022-02-01 18:53             ` Alan Stern
@ 2022-02-01 19:02               ` Paul E. McKenney
  0 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2022-02-01 19:02 UTC (permalink / raw)
  To: Alan Stern
  Cc: Paul Heidekrüger, Andrea Parri, Will Deacon, Peter Zijlstra,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	Björn Töpel, linux-kernel, linux-arch, Marco Elver,
	Charalampos Mainas, Pramod Bhatotia

On Tue, Feb 01, 2022 at 01:53:14PM -0500, Alan Stern wrote:
> On Tue, Feb 01, 2022 at 10:02:39AM -0800, Paul E. McKenney wrote:
> > On Thu, Jan 27, 2022 at 04:11:48PM -0500, Alan Stern wrote:
> > > Paul Heidekrüger pointed out that the Linux Kernel Memory Model
> > > documentation doesn't mention the distinction between syntactic and
> > > semantic dependencies.  This is an important difference, because the
> > > compiler can easily break dependencies that are only syntactic, not
> > > semantic.
> > > 
> > > This patch adds a few paragraphs to the LKMM documentation explaining
> > > these issues and illustrating how they can matter.
> > > 
> > > Suggested-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de>
> > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > > 
> > > ---
> > > 
> > > 
> > > [as1970]
> > > 
> > > 
> > >  tools/memory-model/Documentation/explanation.txt |   47 +++++++++++++++++++++++
> > >  1 file changed, 47 insertions(+)
> > > 
> > > Index: usb-devel/tools/memory-model/Documentation/explanation.txt
> > > ===================================================================
> > > --- usb-devel.orig/tools/memory-model/Documentation/explanation.txt
> > > +++ usb-devel/tools/memory-model/Documentation/explanation.txt
> > > @@ -485,6 +485,53 @@ have R ->po X.  It wouldn't make sense f
> > >  somehow on a value that doesn't get loaded from shared memory until
> > >  later in the code!
> > >  
> > > +Here's a trick question: When is a dependency not a dependency?  Answer:
> > > +When it is purely syntactic rather than semantic.  We say a dependency
> > > +between two accesses is purely syntactic if the second access doesn't
> > > +actually depend on the result of the first.  Here is a trivial example:
> > > +
> > > +	r1 = READ_ONCE(x);
> > > +	WRITE_ONCE(y, r1 * 0);
> > > +
> > > +There appears to be a data dependency from the load of x to the store of
> > > +y, since the value to be stored is computed from the value that was
> > > +loaded.  But in fact, the value stored does not really depend on
> > > +anything since it will always be 0.  Thus the data dependency is only
> > > +syntactic (it appears to exist in the code) but not semantic (the second
> > > +access will always be the same, regardless of the value of the first
> > > +access).  Given code like this, a compiler could simply eliminate the
> > > +load from x, which would certainly destroy any dependency.
> > 
> > Are you OK with that last sentence reading as follows?
> > 
> > 	Given code like this, a compiler could simply discard the value
> > 	return by the load from x, which would certainly destroy any
> 
> s/return/returned/

Good eyes!

> > 	dependency.
> > 
> > My concern with the original is that it might mislead people into thinking
> > that compilers can omit volatile loads.
> 
> Yes, good point.  Should we also tack on something like this?
> 
> 	(The compiler is not permitted to eliminate entirely the load 
> 	generated for a READ_ONCE() -- that's one of the nice properties 
> 	of READ_ONCE() -- but it is allowed to ignore the load's value.)

Please!

> > > +
> > > +(It's natural to object that no one in their right mind would write code
> > > +like the above.  However, macro expansions can easily give rise to this
> > > +sort of thing, in ways that generally are not apparent to the
> > > +programmer.)
> > > +
> > > +Another mechanism that can give rise to purely syntactic dependencies is
> > > +related to the notion of "undefined behavior".  Certain program behaviors
> > > +are called "undefined" in the C language specification, which means that
> > > +when they occur there are no guarantees at all about the outcome.
> > > +Consider the following example:
> > > +
> > > +	int a[1];
> > > +	int i;
> > > +
> > > +	r1 = READ_ONCE(i);
> > > +	r2 = READ_ONCE(a[r1]);
> > > +
> > > +Access beyond the end or before the beginning of an array is one kind of
> > > +undefined behavior.  Therefore the compiler doesn't have to worry about
> > > +what will happen if r1 is nonzero, and it can assume that r1 will always
> > > +be zero without actually loading anything from i.
> > 
> > And similarly here:
> > 
> > 	... and it can assume that r1 will always be zero regardless of
> > 	the value actually loaded from i.
> 
> Right.
> 
> > > +                                                   (If the assumption
> > > +turns out to be wrong, the resulting behavior will be undefined anyway
> > > +so the compiler doesn't care!)  Thus the load from i can be eliminated,
> > > +breaking the address dependency.
> 
> This also should be changed:
> 
> 	Thus the value from the load can be discarded, breaking the 
> 	address dependency.

Again, good eyes!

> > > +
> > > +The LKMM is unaware that purely syntactic dependencies are different
> > > +from semantic dependencies and therefore mistakenly predicts that the
> > > +accesses in the two examples above will be ordered.  This is another
> > > +example of how the compiler can undermine the memory model.  Be warned.
> > > +
> > >  
> > >  THE READS-FROM RELATION: rf, rfi, and rfe
> > >  -----------------------------------------
> > 
> > Looks great otherwise, and thank you all for your work on this!
> > 
> > Alan, would you like me to pull this in making those two changes and
> > applying Akira's Reviewed-by, or would you prefer to send another version?
> 
> I'll send a new version.

Very good, looking forward to it!

							Thanx, Paul

> > For that matter, am I off base in my suggested changes.
> 
> Not at all.  Thanks.
> 
> Alan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2] tools/memory-model: Explain syntactic and semantic dependencies
  2022-02-01 19:00           ` [PATCH v2] " Alan Stern
@ 2022-02-01 19:09             ` Paul E. McKenney
  0 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2022-02-01 19:09 UTC (permalink / raw)
  To: Alan Stern
  Cc: Paul Heidekrüger, Andrea Parri, Will Deacon, Peter Zijlstra,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	Björn Töpel, linux-kernel, linux-arch, Marco Elver,
	Charalampos Mainas, Pramod Bhatotia

On Tue, Feb 01, 2022 at 02:00:08PM -0500, Alan Stern wrote:
> Paul Heidekrüger pointed out that the Linux Kernel Memory Model
> documentation doesn't mention the distinction between syntactic and
> semantic dependencies.  This is an important difference, because the
> compiler can easily break dependencies that are only syntactic, not
> semantic.
> 
> This patch adds a few paragraphs to the LKMM documentation explaining
> these issues and illustrating how they can matter.
> 
> Suggested-by: Paul Heidekrüger <paul.heidekrueger@in.tum.de>
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

Very good, queued and pushed with Akira's Reviewed-by.

Thank you all!

							Thanx, Paul

> ---
> 
> v2:	Incorporate changes suggested by Paul McKenney, along with a few
> 	other minor edits.
>  
> 
> [as1970b]
> 
> 
>  tools/memory-model/Documentation/explanation.txt |   51 +++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
> 
> Index: usb-devel/tools/memory-model/Documentation/explanation.txt
> ===================================================================
> --- usb-devel.orig/tools/memory-model/Documentation/explanation.txt
> +++ usb-devel/tools/memory-model/Documentation/explanation.txt
> @@ -485,6 +485,57 @@ have R ->po X.  It wouldn't make sense f
>  somehow on a value that doesn't get loaded from shared memory until
>  later in the code!
>  
> +Here's a trick question: When is a dependency not a dependency?  Answer:
> +When it is purely syntactic rather than semantic.  We say a dependency
> +between two accesses is purely syntactic if the second access doesn't
> +actually depend on the result of the first.  Here is a trivial example:
> +
> +	r1 = READ_ONCE(x);
> +	WRITE_ONCE(y, r1 * 0);
> +
> +There appears to be a data dependency from the load of x to the store
> +of y, since the value to be stored is computed from the value that was
> +loaded.  But in fact, the value stored does not really depend on
> +anything since it will always be 0.  Thus the data dependency is only
> +syntactic (it appears to exist in the code) but not semantic (the
> +second access will always be the same, regardless of the value of the
> +first access).  Given code like this, a compiler could simply discard
> +the value returned by the load from x, which would certainly destroy
> +any dependency.  (The compiler is not permitted to eliminate entirely
> +the load generated for a READ_ONCE() -- that's one of the nice
> +properties of READ_ONCE() -- but it is allowed to ignore the load's
> +value.)
> +
> +It's natural to object that no one in their right mind would write
> +code like the above.  However, macro expansions can easily give rise
> +to this sort of thing, in ways that often are not apparent to the
> +programmer.
> +
> +Another mechanism that can lead to purely syntactic dependencies is
> +related to the notion of "undefined behavior".  Certain program
> +behaviors are called "undefined" in the C language specification,
> +which means that when they occur there are no guarantees at all about
> +the outcome.  Consider the following example:
> +
> +	int a[1];
> +	int i;
> +
> +	r1 = READ_ONCE(i);
> +	r2 = READ_ONCE(a[r1]);
> +
> +Access beyond the end or before the beginning of an array is one kind
> +of undefined behavior.  Therefore the compiler doesn't have to worry
> +about what will happen if r1 is nonzero, and it can assume that r1
> +will always be zero regardless of the value actually loaded from i.
> +(If the assumption turns out to be wrong the resulting behavior will
> +be undefined anyway, so the compiler doesn't care!)  Thus the value
> +from the load can be discarded, breaking the address dependency.
> +
> +The LKMM is unaware that purely syntactic dependencies are different
> +from semantic dependencies and therefore mistakenly predicts that the
> +accesses in the two examples above will be ordered.  This is another
> +example of how the compiler can undermine the memory model.  Be warned.
> +
>  
>  THE READS-FROM RELATION: rf, rfi, and rfe
>  -----------------------------------------

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-02-01 19:09 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-25 17:28 [PATCH] tools/memory-model: Clarify syntactic and semantic dependencies Paul Heidekrüger
2022-01-25 21:00 ` Alan Stern
2022-01-27 11:00   ` Paul Heidekrüger
2022-01-27 17:04     ` Alan Stern
2022-01-27 20:49       ` Paul Heidekrüger
2022-01-27 21:11         ` [PATCH] tools/memory-model: Explain " Alan Stern
2022-02-01  1:42           ` Akira Yokosawa
2022-02-01 18:02           ` Paul E. McKenney
2022-02-01 18:53             ` Alan Stern
2022-02-01 19:02               ` Paul E. McKenney
2022-02-01 19:00           ` [PATCH v2] " Alan Stern
2022-02-01 19:09             ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).