linux-kernel.vger.kernel.org archive mirror
* [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
@ 2018-02-28 20:13 Alan Stern
  2018-03-01  1:55 ` Boqun Feng
  2018-03-13 13:56 ` Andrea Parri
  0 siblings, 2 replies; 13+ messages in thread
From: Alan Stern @ 2018-02-28 20:13 UTC
  To: LKMM Maintainers -- Akira Yokosawa, Andrea Parri, Boqun Feng,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Paul E. McKenney, Peter Zijlstra, Will Deacon
  Cc: Kernel development list

This patch reorganizes the definition of rb in the Linux Kernel Memory
Consistency Model.  The relation is now expressed in terms of
rcu-fence, which consists of a sequence of gp and rscs links separated
by rcu-link links, in which the number of occurrences of gp is >= the
number of occurrences of rscs.

Arguments similar to those published in
http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an
inter-CPU strong fence.  Furthermore, the definition of rb in terms of
rcu-fence is highly analogous to the definition of pb in terms of
strong-fence, which can help explain why rb expresses a form of
temporal ordering.

This change should not affect the semantics of the memory model, just
its internal organization.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

---

v2: Rebase on top of the preceding patch which renames "link" to
"rcu-link" and "rcu-path" to "rb".  Add back the missing "rec" keyword
in the definition of rcu-fence.  Minor editing improvements in
explanation.txt.
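
For readers who want to experiment with the recursive definition of
rcu-fence without running herd, the following is a minimal sketch in
Python; it is not part of the patch.  It treats each relation as a set
of (event, event) pairs and computes the least fixed point of the six
clauses in the new definition.  The helper names (compose, rcu_fence)
and the hand-written gp, rscs, and rcu-link pairs are assumptions made
for illustration; they encode only the two example chains discussed in
explanation.txt, not a real litmus test.

```python
def compose(*rels):
    """Relational composition r1 ; r2 ; ... over sets of (a, b) pairs."""
    result = rels[0]
    for nxt in rels[1:]:
        result = {(a, d) for (a, b) in result for (c, d) in nxt if b == c}
    return result


def rcu_fence(gp, rscs, link):
    """Least fixed point of the recursive rcu-fence definition."""
    fence = set()
    while True:
        new = (
            gp
            | compose(gp, link, rscs)
            | compose(rscs, link, gp)
            | compose(gp, link, fence, link, rscs)
            | compose(rscs, link, fence, link, gp)
            | compose(fence, link, fence)
        )
        if new == fence:
            return fence
        fence = new


# X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V  (two gp, one rscs)
link = {("Y", "Z"), ("T", "U")}
fence1 = rcu_fence(gp={("X", "Y"), ("U", "V")}, rscs={("Z", "T")}, link=link)

# X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V  (one gp, two rscs)
fence2 = rcu_fence(gp={("U", "V")}, rscs={("X", "Y"), ("Z", "T")}, link=link)

print(("X", "V") in fence1)  # first chain: X ->rcu-fence V holds
print(("X", "V") in fence2)  # second chain: it does not
```

The iteration starts from the empty relation and re-applies the
clauses until nothing changes, which is one straightforward way to
evaluate a "let rec" definition over a finite set of events.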

Index: usb-4.x/tools/memory-model/linux-kernel.cat
===================================================================
--- usb-4.x.orig/tools/memory-model/linux-kernel.cat
+++ usb-4.x/tools/memory-model/linux-kernel.cat
@@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po?
  *)
 let rcu-link = hb* ; pb* ; prop
 
-(* Chains that affect the RCU grace-period guarantee *)
-let gp-link = gp ; rcu-link
-let rscs-link = rscs ; rcu-link
-
 (*
- * A cycle containing at least as many grace periods as RCU read-side
- * critical sections is forbidden.
+ * Any sequence containing at least as many grace periods as RCU read-side
+ * critical sections (joined by rcu-link) acts as a generalized strong fence.
  *)
-let rec rb =
-	gp-link |
-	(gp-link ; rscs-link) |
-	(rscs-link ; gp-link) |
-	(rb ; rb) |
-	(gp-link ; rb ; rscs-link) |
-	(rscs-link ; rb ; gp-link)
+let rec rcu-fence = gp |
+	(gp ; rcu-link ; rscs) |
+	(rscs ; rcu-link ; gp) |
+	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
+	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
+	(rcu-fence ; rcu-link ; rcu-fence)
+
+(* rb orders instructions just as pb does *)
+let rb = prop ; rcu-fence ; hb* ; pb*
 
 irreflexive rb as rcu
+
+(*
+ * The happens-before, propagation, and rcu constraints are all
+ * expressions of temporal ordering.  They could be replaced by
+ * a single constraint on an "executes-before" relation, xb:
+ *
+ * let xb = hb | pb | rb
+ * acyclic xb as executes-before
+ *)
Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
===================================================================
--- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
+++ usb-4.x/tools/memory-model/Documentation/explanation.txt
@@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory C
   19. AND THEN THERE WAS ALPHA
   20. THE HAPPENS-BEFORE RELATION: hb
   21. THE PROPAGATES-BEFORE RELATION: pb
-  22. RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
+  22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
   23. ODDS AND ENDS
 
 
@@ -1451,8 +1451,8 @@ they execute means that it cannot have c
 the content of the LKMM's "propagation" axiom.
 
 
-RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
----------------------------------------------------
+RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
+----------------------------------------------------
 
 RCU (Read-Copy-Update) is a powerful synchronization mechanism.  It
 rests on two concepts: grace periods and read-side critical sections.
@@ -1537,49 +1537,100 @@ relation, and the details don't matter u
 a somewhat lengthy formal proof.  Pretty much all you need to know
 about rcu-link is the information in the preceding paragraph.
 
-The LKMM goes on to define the gp-link and rscs-link relations.  They
-bring grace periods and read-side critical sections into the picture,
-in the following way:
-
-	E ->gp-link F means there is a synchronize_rcu() fence event S
-	and an event X such that E ->po S, either S ->po X or S = X,
-	and X ->rcu-link F.  In other words, E and F are linked by a
-	grace period followed by an instance of rcu-link.
-
-	E ->rscs-link F means there is a critical section delimited by
-	an rcu_read_lock() fence L and an rcu_read_unlock() fence U,
-	and an event X such that E ->po U, either L ->po X or L = X,
-	and X ->rcu-link F.  Roughly speaking, this says that some
-	event in the same critical section as E is linked by rcu-link
-	to F.
+The LKMM also defines the gp and rscs relations.  They bring grace
+periods and read-side critical sections into the picture, in the
+following way:
+
+	E ->gp F means there is a synchronize_rcu() fence event S such
+	that E ->po S and either S ->po F or S = F.  In simple terms,
+	there is a grace period po-between E and F.
+
+	E ->rscs F means there is a critical section delimited by an
+	rcu_read_lock() fence L and an rcu_read_unlock() fence U, such
+	that E ->po U and either L ->po F or L = F.  You can think of
+	this as saying that E and F are in the same critical section
+	(in fact, it also allows E to be po-before the start of the
+	critical section and F to be po-after the end).
 
 If we think of the rcu-link relation as standing for an extended
-"before", then E ->gp-link F says that E executes before a grace
-period which ends before F executes.  (In fact it covers more than
-this, because it also includes cases where E executes before a grace
-period and some store propagates to F's CPU before F executes and
-doesn't propagate to some other CPU until after the grace period
-ends.)  Similarly, E ->rscs-link F says that E is part of (or before
-the start of) a critical section which starts before F executes.
+"before", then X ->gp Y ->rcu-link Z says that X executes before a
+grace period which ends before Z executes.  (In fact it covers more
+than this, because it also includes cases where X executes before a
+grace period and some store propagates to Z's CPU before Z executes
+but doesn't propagate to some other CPU until after the grace period
+ends.)  Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or
+before the start of) a critical section which starts before Z
+executes.
+
+The LKMM goes on to define the rcu-fence relation as a sequence of gp
+and rscs links separated by rcu-link links, in which the number of gp
+links is >= the number of rscs links.  For example:
+
+	X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
+
+would imply that X ->rcu-fence V, because this sequence contains two
+gp links and only one rscs link.  (It also implies that X ->rcu-fence T
+and Z ->rcu-fence V.)  On the other hand:
+
+	X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
+
+does not imply X ->rcu-fence V, because the sequence contains only
+one gp link but two rscs links.
+
+The rcu-fence relation is important because the Grace Period Guarantee
+means that rcu-fence acts kind of like a strong fence.  In particular,
+if W is a write and we have W ->rcu-fence Z, the Guarantee says that W
+will propagate to every CPU before Z executes.
+
+To prove this in full generality requires some intellectual effort.
+We'll consider just a very simple case:
+
+	W ->gp X ->rcu-link Y ->rscs Z.
+
+This formula means that there is a grace period G and a critical
+section C such that:
+
+	1. W is po-before G;
+
+	2. X is equal to or po-after G;
+
+	3. X comes "before" Y in some sense;
+
+	4. Y is po-before the end of C;
+
+	5. Z is equal to or po-after the start of C.
+
+From 2 - 4 we deduce that the grace period G ends before the critical
+section C.  Then the second part of the Grace Period Guarantee says
+not only that G starts before C does, but also that W (which executes
+on G's CPU before G starts) must propagate to every CPU before C
+starts.  In particular, W propagates to every CPU before Z executes
+(or finishes executing, in the case where Z is equal to the
+rcu_read_lock() fence event which starts C.)  This sort of reasoning
+can be expanded to handle all the situations covered by rcu-fence.
+
+Finally, the LKMM defines the RCU-before (rb) relation in terms of
+rcu-fence.  This is done in essentially the same way as the pb
+relation was defined in terms of strong-fence.  We will omit the
+details; the end result is that E ->rb F implies E must execute before
+F, just as E ->pb F does (and for much the same reasons).
 
 Putting this all together, the LKMM expresses the Grace Period
-Guarantee by requiring that there are no cycles consisting of gp-link
-and rscs-link links in which the number of gp-link instances is >= the
-number of rscs-link instances.  It does this by defining the rb
-relation to link events E and F whenever it is possible to pass from E
-to F by a sequence of gp-link and rscs-link links with at least as
-many of the former as the latter.  The LKMM's "rcu" axiom then says
-that there are no events E with E ->rb E.
-
-Justifying this axiom takes some intellectual effort, but it is in
-fact a valid formalization of the Grace Period Guarantee.  We won't
-attempt to go through the detailed argument, but the following
-analysis gives a taste of what is involved.  Suppose we have a
-violation of the first part of the Guarantee: A critical section
-starts before a grace period, and some store propagates to the
-critical section's CPU before the end of the critical section but
-doesn't propagate to some other CPU until after the end of the grace
-period.
+Guarantee by requiring that the rb relation does not contain a cycle.
+Equivalently, this "rcu" axiom requires that there are no events E and
+F with E ->rcu-link F ->rcu-fence E.  Or to put it a third way, the
+axiom requires that there are no cycles consisting of gp and rscs
+alternating with rcu-link, where the number of gp links is >= the
+number of rscs links.
+
+Justifying the axiom isn't easy, but it is in fact a valid
+formalization of the Grace Period Guarantee.  We won't attempt to go
+through the detailed argument, but the following analysis gives a
+taste of what is involved.  Suppose we have a violation of the first
+part of the Guarantee: A critical section starts before a grace
+period, and some store propagates to the critical section's CPU before
+the end of the critical section but doesn't propagate to some other
+CPU until after the end of the grace period.
 
 Putting symbols to these ideas, let L and U be the rcu_read_lock() and
 rcu_read_unlock() fence events delimiting the critical section in
@@ -1606,11 +1657,14 @@ by rcu-link, yielding:
 
 	S ->po X ->rcu-link Z ->po U.
 
-The formulas say that S is po-between F and X, hence F ->gp-link Z
-via X.  They also say that Z comes before the end of the critical
-section and E comes after its start, hence Z ->rscs-link F via E.  But
-now we have a forbidden cycle: F ->gp-link Z ->rscs-link F.  Thus the
-"rcu" axiom rules out this violation of the Grace Period Guarantee.
+The formulas say that S is po-between F and X, hence F ->gp X.  They
+also say that Z comes before the end of the critical section and E
+comes after its start, hence Z ->rscs E.  From all this we obtain:
+
+	F ->gp X ->rcu-link Z ->rscs E ->rcu-link F,
+
+a forbidden cycle.  Thus the "rcu" axiom rules out this violation of
+the Grace Period Guarantee.
 
 For something a little more down-to-earth, let's see how the axiom
 works out in practice.  Consider the RCU code example from above, this
@@ -1639,15 +1693,15 @@ time with statement labels added to the
 If r2 = 0 at the end then P0's store at X overwrites the value that
 P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X.
 In addition, there is a synchronize_rcu() between Y and Z, so therefore
-we have Y ->gp-link X.
+we have Y ->gp Z.
 
 If r1 = 1 at the end then P1's load at Y reads from P0's store at W,
 so we have W ->rcu-link Y.  In addition, W and X are in the same critical
-section, so therefore we have X ->rscs-link Y.
+section, so therefore we have X ->rscs W.
 
-This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link
-and one rscs-link, violating the "rcu" axiom.  Hence the outcome is
-not allowed by the LKMM, as we would expect.
+Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle,
+violating the "rcu" axiom.  Hence the outcome is not allowed by the
+LKMM, as we would expect.
 
 For contrast, let's see what can happen in a more complicated example:
 
@@ -1683,15 +1737,11 @@ For contrast, let's see what can happen
 	}
 
 If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows
-that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W
-via V.  And just as before, this gives a cycle:
-
-	W ->rscs-link Y ->gp-link U ->rscs-link W.
-
-However, this cycle has fewer gp-link instances than rscs-link
-instances, and consequently the outcome is not forbidden by the LKMM.
-The following instruction timing diagram shows how it might actually
-occur:
+that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W.
+However this cycle is not forbidden, because the sequence of relations
+contains fewer instances of gp (one) than of rscs (two).  Consequently
+the outcome is allowed by the LKMM.  The following instruction timing
+diagram shows how it might actually occur:
 
 P0			P1			P2
 --------------------	--------------------	--------------------

* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-02-28 20:13 [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence Alan Stern
@ 2018-03-01  1:55 ` Boqun Feng
  2018-03-01  4:49   ` Paul E. McKenney
  2018-03-01 15:49   ` Alan Stern
  2018-03-13 13:56 ` Andrea Parri
  1 sibling, 2 replies; 13+ messages in thread
From: Boqun Feng @ 2018-03-01  1:55 UTC
  To: Alan Stern
  Cc: LKMM Maintainers -- Akira Yokosawa, Andrea Parri, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Paul E. McKenney,
	Peter Zijlstra, Will Deacon, Kernel development list

On Wed, Feb 28, 2018 at 03:13:54PM -0500, Alan Stern wrote:
> This patch reorganizes the definition of rb in the Linux Kernel Memory
> Consistency Model.  The relation is now expressed in terms of
> rcu-fence, which consists of a sequence of gp and rscs links separated
> by rcu-link links, in which the number of occurrences of gp is >= the
> number of occurrences of rscs.
> 
> Arguments similar to those published in
> http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an
> inter-CPU strong fence.  Furthermore, the definition of rb in terms of
> rcu-fence is highly analogous to the definition of pb in terms of
> strong-fence, which can help explain why rb expresses a form of
> temporal ordering.
> 
> This change should not affect the semantics of the memory model, just
> its internal organization.
> 
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> 
> ---
> 
> v2: Rebase on top of the preceding patch which renames "link" to
> "rcu-link" and "rcu-path" to "rb".  Add back the missing "rec" keyword
> in the definition of rcu-fence.  Minor editing improvements in
> explanation.txt.
> 
> Index: usb-4.x/tools/memory-model/linux-kernel.cat
> ===================================================================
> --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> +++ usb-4.x/tools/memory-model/linux-kernel.cat
> @@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po?
>   *)
>  let rcu-link = hb* ; pb* ; prop
>  
> -(* Chains that affect the RCU grace-period guarantee *)
> -let gp-link = gp ; rcu-link
> -let rscs-link = rscs ; rcu-link
> -
>  (*
> - * A cycle containing at least as many grace periods as RCU read-side
> - * critical sections is forbidden.
> + * Any sequence containing at least as many grace periods as RCU read-side
> + * critical sections (joined by rcu-link) acts as a generalized strong fence.
>   *)
> -let rec rb =
> -	gp-link |
> -	(gp-link ; rscs-link) |
> -	(rscs-link ; gp-link) |
> -	(rb ; rb) |
> -	(gp-link ; rb ; rscs-link) |
> -	(rscs-link ; rb ; gp-link)
> +let rec rcu-fence = gp |
> +	(gp ; rcu-link ; rscs) |
> +	(rscs ; rcu-link ; gp) |
> +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> +	(rcu-fence ; rcu-link ; rcu-fence)
> +
> +(* rb orders instructions just as pb does *)
> +let rb = prop ; rcu-fence ; hb* ; pb*
>  
>  irreflexive rb as rcu

I wonder whether we can simplify things as:

	let rec rcu-fence =
	    (gp; rcu-link; rscs) |
	    (rscs; rcu-link; gp) |
	    (gp; rcu-link; rcu-fence; rcu-link; rscs) |
	    (rscs; rcu-link; rcu-fence; rcu-link; gp)
	
	(* gp and rcu-fence; rcu-link; rcu-fence removed *)
	
	let rb = prop; rcu-fence; hb*; pb*

	acyclic rb as rcu

In this way, "rcu-fence" is defined as "any sequence containing as many
grace periods as RCU read-side critical sections (joined by rcu-link)."
Note that "rcu-link" contains "gp", so we don't miss the case where
there are more grace periods. And since we use "acyclic" now, we don't
need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.

I prefer this because we already treat "gp" as "strong-fence", which
already is an "rcu-link". Also, recursively extending rcu-fence with
itself is exactly calculating the transitive closure, which we can avoid
by using an "acyclic" rule. Besides, it looks more consistent with hb and
pb.
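
To spell out the step behind the closure argument: since rcu-link is
hb* ; pb* ; prop, composing rb with itself gives

	rb ; rb = prop ; rcu-fence ; rcu-link ; rcu-fence ; hb* ; pb*

so chains of rcu-fence joined by rcu-link show up as paths in rb, and
an acyclicity check on rb covers them.  The following is a minimal
sketch in Python (not herd's implementation) of why the check has to
change from irreflexive to acyclic once the self-composition clause is
dropped.  The helper names and the two toy relations are assumptions
made for illustration.

```python
def transitive_closure(rel):
    """Smallest transitive relation containing rel (a set of pairs)."""
    closure = set(rel)
    while True:
        step = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if step <= closure:
            return closure
        closure |= step


def irreflexive(rel):
    """True if no event is related to itself."""
    return all(a != b for (a, b) in rel)


def acyclic(rel):
    """True if the transitive closure of rel is irreflexive."""
    return irreflexive(transitive_closure(rel))


# A two-step cycle has no self-loop, so a bare irreflexivity check
# misses it; the acyclicity check catches it.
two_cycle = {("A", "B"), ("B", "A")}
chain = {("A", "B"), ("B", "C")}

print(irreflexive(two_cycle), acyclic(two_cycle))
print(irreflexive(chain), acyclic(chain))
```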

Thoughts?

Regards,
Boqun



* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-01  1:55 ` Boqun Feng
@ 2018-03-01  4:49   ` Paul E. McKenney
  2018-03-01  8:39     ` Boqun Feng
  2018-03-01 15:49   ` Alan Stern
  1 sibling, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2018-03-01  4:49 UTC
  To: Boqun Feng
  Cc: Alan Stern, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Peter Zijlstra, Will Deacon, Kernel development list

On Thu, Mar 01, 2018 at 09:55:31AM +0800, Boqun Feng wrote:
> On Wed, Feb 28, 2018 at 03:13:54PM -0500, Alan Stern wrote:
> > This patch reorganizes the definition of rb in the Linux Kernel Memory
> > Consistency Model.  The relation is now expressed in terms of
> > rcu-fence, which consists of a sequence of gp and rscs links separated
> > by rcu-link links, in which the number of occurrences of gp is >= the
> > number of occurrences of rscs.
> > 
> > Arguments similar to those published in
> > http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an
> > inter-CPU strong fence.  Furthermore, the definition of rb in terms of
> > rcu-fence is highly analogous to the definition of pb in terms of
> > strong-fence, which can help explain why rb expresses a form of
> > temporal ordering.
> > 
> > This change should not affect the semantics of the memory model, just
> > its internal organization.
> > 
> > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > 
> > ---
> > 
> > v2: Rebase on top of the preceding patch which renames "link" to
> > "rcu-link" and "rcu-path" to "rb".  Add back the missing "rec" keyword
> > in the definition of rcu-fence.  Minor editing improvements in
> > explanation.txt.
> > 
> > Index: usb-4.x/tools/memory-model/linux-kernel.cat
> > ===================================================================
> > --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> > +++ usb-4.x/tools/memory-model/linux-kernel.cat
> > @@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po?
> >   *)
> >  let rcu-link = hb* ; pb* ; prop
> >  
> > -(* Chains that affect the RCU grace-period guarantee *)
> > -let gp-link = gp ; rcu-link
> > -let rscs-link = rscs ; rcu-link
> > -
> >  (*
> > - * A cycle containing at least as many grace periods as RCU read-side
> > - * critical sections is forbidden.
> > + * Any sequence containing at least as many grace periods as RCU read-side
> > + * critical sections (joined by rcu-link) acts as a generalized strong fence.
> >   *)
> > -let rec rb =
> > -	gp-link |
> > -	(gp-link ; rscs-link) |
> > -	(rscs-link ; gp-link) |
> > -	(rb ; rb) |
> > -	(gp-link ; rb ; rscs-link) |
> > -	(rscs-link ; rb ; gp-link)
> > +let rec rcu-fence = gp |
> > +	(gp ; rcu-link ; rscs) |
> > +	(rscs ; rcu-link ; gp) |
> > +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > +	(rcu-fence ; rcu-link ; rcu-fence)
> > +
> > +(* rb orders instructions just as pb does *)
> > +let rb = prop ; rcu-fence ; hb* ; pb*
> >  
> >  irreflexive rb as rcu
> 
> I wonder whether we can simplify things as:
> 
> 	let rec rcu-fence =
> 	    (gp; rcu-link; rscs) |
> 	    (rscs; rcu-link; gp) |
> 	    (gp; rcu-link; rcu-fence; rcu-link; rscs) |
> 	    (rscs; rcu-link; rcu-fence; rcu-link; gp)
> 	
> 	(* gp and rcu-fence; rcu-link; rcu-fence removed *)
> 	
> 	let rb = prop; rcu-fence; hb*; pb*
> 
> 	acycle rb as rcu
> 
> In this way, "rcu-fence" is defined as "any sequence containing as many
> grace periods as RCU read-side critical sections (joined by rcu-link)."
> Note that "rcu-link" contains "gp", so we don't miss the case where
> there are more grace periods. And since we use "acycle" now, so we don't
> need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.
> 
> I prefer this because we already treat "gp" as "strong-fence", which
> already is a "rcu-link". Also, recurisively extending rcu-fence with
> itself is exactly calculating the transitive closure, which we can avoid
> by using a "acycle" rule. Besides, it looks more consistent with hb and
> pb.

I don't have any opinion from an aesthetic viewpoint, but this change
does correctly handle the automatically generated tests.  I do not see
any performance penalty; if anything, there is about a 10% improvement,
based on this 11-process RCU litmus test:

auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-G.litmus

With the change, about 10.4 seconds; without, about 11.4 seconds.
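
For reference, a rough sketch of why both versions should agree on the
outcome of this test.  It assumes (this is not stated in the thread)
that each "-G" process in the auto-generated name contains a
synchronize_rcu(), that each "-R" process contains an RCU read-side
critical section, and that the cycle passes through every process once.
It applies only the counting rule from the patch description: a cycle
is forbidden when the number of gp links is >= the number of rscs
links.  The helper names are hypothetical.

```python
def count_links(litmus_name):
    """Return (processes, gp count, rscs count) for an auto-generated test name.

    Assumption: a process named "...-G" contributes one gp link and a
    process named "...-R" contributes one rscs link.
    """
    stem = litmus_name.rsplit("/", 1)[-1]       # drop the directory part
    if stem.startswith("C-"):
        stem = stem[2:]
    if stem.endswith(".litmus"):
        stem = stem[:-len(".litmus")]
    procs = stem.split("+")
    gp = sum(1 for p in procs if p.endswith("-G"))
    rscs = sum(1 for p in procs if p.endswith("-R"))
    return len(procs), gp, rscs


def cycle_forbidden(litmus_name):
    """Counting rule from the patch description: forbidden iff gp >= rscs."""
    _, gp, rscs = count_links(litmus_name)
    return gp >= rscs


SMALL = "auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-G.litmus"

print(count_links(SMALL), cycle_forbidden(SMALL))
```

Under those assumptions the test has 11 processes, 7 of them with a
grace period and 4 with a critical section, so the cycle would be
forbidden by either formulation.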

I am not patient enough to try one of the really large ones, like this one:

auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-R+RW-G.litmus

However, it is in my "litmus" GitHub archive, so please feel free to
try it out, though I would suggest working up from those of intermediate
length.

							Thanx, Paul

> Thoughts?
> 
> Regards,
> Boqun
> 
> 
> > +
> > +(*
> > + * The happens-before, propagation, and rcu constraints are all
> > + * expressions of temporal ordering.  They could be replaced by
> > + * a single constraint on an "executes-before" relation, xb:
> > + *
> > + * let xb = hb | pb | rb
> > + * acyclic xb as executes-before
> > + *)
> > Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
> > ===================================================================
> > --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
> > +++ usb-4.x/tools/memory-model/Documentation/explanation.txt
> > @@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory C
> >    19. AND THEN THERE WAS ALPHA
> >    20. THE HAPPENS-BEFORE RELATION: hb
> >    21. THE PROPAGATES-BEFORE RELATION: pb
> > -  22. RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> > +  22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
> >    23. ODDS AND ENDS
> >  
> >  
> > @@ -1451,8 +1451,8 @@ they execute means that it cannot have c
> >  the content of the LKMM's "propagation" axiom.
> >  
> >  
> > -RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> > ----------------------------------------------------
> > +RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
> > +----------------------------------------------------
> >  
> >  RCU (Read-Copy-Update) is a powerful synchronization mechanism.  It
> >  rests on two concepts: grace periods and read-side critical sections.
> > @@ -1537,49 +1537,100 @@ relation, and the details don't matter u
> >  a somewhat lengthy formal proof.  Pretty much all you need to know
> >  about rcu-link is the information in the preceding paragraph.
> >  
> > -The LKMM goes on to define the gp-link and rscs-link relations.  They
> > -bring grace periods and read-side critical sections into the picture,
> > -in the following way:
> > -
> > -	E ->gp-link F means there is a synchronize_rcu() fence event S
> > -	and an event X such that E ->po S, either S ->po X or S = X,
> > -	and X ->rcu-link F.  In other words, E and F are linked by a
> > -	grace period followed by an instance of rcu-link.
> > -
> > -	E ->rscs-link F means there is a critical section delimited by
> > -	an rcu_read_lock() fence L and an rcu_read_unlock() fence U,
> > -	and an event X such that E ->po U, either L ->po X or L = X,
> > -	and X ->rcu-link F.  Roughly speaking, this says that some
> > -	event in the same critical section as E is linked by rcu-link
> > -	to F.
> > +The LKMM also defines the gp and rscs relations.  They bring grace
> > +periods and read-side critical sections into the picture, in the
> > +following way:
> > +
> > +	E ->gp F means there is a synchronize_rcu() fence event S such
> > +	that E ->po S and either S ->po F or S = F.  In simple terms,
> > +	there is a grace period po-between E and F.
> > +
> > +	E ->rscs F means there is a critical section delimited by an
> > +	rcu_read_lock() fence L and an rcu_read_unlock() fence U, such
> > +	that E ->po U and either L ->po F or L = F.  You can think of
> > +	this as saying that E and F are in the same critical section
> > +	(in fact, it also allows E to be po-before the start of the
> > +	critical section and F to be po-after the end).
> >  
> >  If we think of the rcu-link relation as standing for an extended
> > -"before", then E ->gp-link F says that E executes before a grace
> > -period which ends before F executes.  (In fact it covers more than
> > -this, because it also includes cases where E executes before a grace
> > -period and some store propagates to F's CPU before F executes and
> > -doesn't propagate to some other CPU until after the grace period
> > -ends.)  Similarly, E ->rscs-link F says that E is part of (or before
> > -the start of) a critical section which starts before F executes.
> > +"before", then X ->gp Y ->rcu-link Z says that X executes before a
> > +grace period which ends before Z executes.  (In fact it covers more
> > +than this, because it also includes cases where X executes before a
> > +grace period and some store propagates to Z's CPU before Z executes
> > +but doesn't propagate to some other CPU until after the grace period
> > +ends.)  Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or
> > +before the start of) a critical section which starts before Z
> > +executes.
> > +
> > +The LKMM goes on to define the rcu-fence relation as a sequence of gp
> > +and rscs links separated by rcu-link links, in which the number of gp
> > +links is >= the number of rscs links.  For example:
> > +
> > +	X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
> > +
> > +would imply that X ->rcu-fence V, because this sequence contains two
> > +gp links and only one rscs link.  (It also implies that X ->rcu-fence T
> > +and Z ->rcu-fence V.)  On the other hand:
> > +
> > +	X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
> > +
> > +does not imply X ->rcu-fence V, because the sequence contains only
> > +one gp link but two rscs links.
> > +
> > +The rcu-fence relation is important because the Grace Period Guarantee
> > +means that rcu-fence acts kind of like a strong fence.  In particular,
> > +if W is a write and we have W ->rcu-fence Z, the Guarantee says that W
> > +will propagate to every CPU before Z executes.
> > +
> > +To prove this in full generality requires some intellectual effort.
> > +We'll consider just a very simple case:
> > +
> > +	W ->gp X ->rcu-link Y ->rscs Z.
> > +
> > +This formula means that there is a grace period G and a critical
> > +section C such that:
> > +
> > +	1. W is po-before G;
> > +
> > +	2. X is equal to or po-after G;
> > +
> > +	3. X comes "before" Y in some sense;
> > +
> > +	4. Y is po-before the end of C;
> > +
> > +	5. Z is equal to or po-after the start of C.
> > +
> > +From 2 - 4 we deduce that the grace period G ends before the critical
> > +section C.  Then the second part of the Grace Period Guarantee says
> > +not only that G starts before C does, but also that W (which executes
> > +on G's CPU before G starts) must propagate to every CPU before C
> > +starts.  In particular, W propagates to every CPU before Z executes
> > +(or finishes executing, in the case where Z is equal to the
> > +rcu_read_lock() fence event which starts C.)  This sort of reasoning
> > +can be expanded to handle all the situations covered by rcu-fence.
> > +
> > +Finally, the LKMM defines the RCU-before (rb) relation in terms of
> > +rcu-fence.  This is done in essentially the same way as the pb
> > +relation was defined in terms of strong-fence.  We will omit the
> > +details; the end result is that E ->rb F implies E must execute before
> > +F, just as E ->pb F does (and for much the same reasons).
> >  
> >  Putting this all together, the LKMM expresses the Grace Period
> > -Guarantee by requiring that there are no cycles consisting of gp-link
> > -and rscs-link links in which the number of gp-link instances is >= the
> > -number of rscs-link instances.  It does this by defining the rb
> > -relation to link events E and F whenever it is possible to pass from E
> > -to F by a sequence of gp-link and rscs-link links with at least as
> > -many of the former as the latter.  The LKMM's "rcu" axiom then says
> > -that there are no events E with E ->rb E.
> > -
> > -Justifying this axiom takes some intellectual effort, but it is in
> > -fact a valid formalization of the Grace Period Guarantee.  We won't
> > -attempt to go through the detailed argument, but the following
> > -analysis gives a taste of what is involved.  Suppose we have a
> > -violation of the first part of the Guarantee: A critical section
> > -starts before a grace period, and some store propagates to the
> > -critical section's CPU before the end of the critical section but
> > -doesn't propagate to some other CPU until after the end of the grace
> > -period.
> > +Guarantee by requiring that the rb relation does not contain a cycle.
> > +Equivalently, this "rcu" axiom requires that there are no events E and
> > +F with E ->rcu-link F ->rcu-fence E.  Or to put it a third way, the
> > +axiom requires that there are no cycles consisting of gp and rscs
> > +alternating with rcu-link, where the number of gp links is >= the
> > +number of rscs links.
> > +
> > +Justifying the axiom isn't easy, but it is in fact a valid
> > +formalization of the Grace Period Guarantee.  We won't attempt to go
> > +through the detailed argument, but the following analysis gives a
> > +taste of what is involved.  Suppose we have a violation of the first
> > +part of the Guarantee: A critical section starts before a grace
> > +period, and some store propagates to the critical section's CPU before
> > +the end of the critical section but doesn't propagate to some other
> > +CPU until after the end of the grace period.
> >  
> >  Putting symbols to these ideas, let L and U be the rcu_read_lock() and
> >  rcu_read_unlock() fence events delimiting the critical section in
> > @@ -1606,11 +1657,14 @@ by rcu-link, yielding:
> >  
> >  	S ->po X ->rcu-link Z ->po U.
> >  
> > -The formulas say that S is po-between F and X, hence F ->gp-link Z
> > -via X.  They also say that Z comes before the end of the critical
> > -section and E comes after its start, hence Z ->rscs-link F via E.  But
> > -now we have a forbidden cycle: F ->gp-link Z ->rscs-link F.  Thus the
> > -"rcu" axiom rules out this violation of the Grace Period Guarantee.
> > +The formulas say that S is po-between F and X, hence F ->gp X.  They
> > +also say that Z comes before the end of the critical section and E
> > +comes after its start, hence Z ->rscs E.  From all this we obtain:
> > +
> > +	F ->gp X ->rcu-link Z ->rscs E ->rcu-link F,
> > +
> > +a forbidden cycle.  Thus the "rcu" axiom rules out this violation of
> > +the Grace Period Guarantee.
> >  
> >  For something a little more down-to-earth, let's see how the axiom
> >  works out in practice.  Consider the RCU code example from above, this
> > @@ -1639,15 +1693,15 @@ time with statement labels added to the
> >  If r2 = 0 at the end then P0's store at X overwrites the value that
> >  P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X.
> >  In addition, there is a synchronize_rcu() between Y and Z, so therefore
> > -we have Y ->gp-link X.
> > +we have Y ->gp Z.
> >  
> >  If r1 = 1 at the end then P1's load at Y reads from P0's store at W,
> >  so we have W ->rcu-link Y.  In addition, W and X are in the same critical
> > -section, so therefore we have X ->rscs-link Y.
> > +section, so therefore we have X ->rscs W.
> >  
> > -This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link
> > -and one rscs-link, violating the "rcu" axiom.  Hence the outcome is
> > -not allowed by the LKMM, as we would expect.
> > +Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle,
> > +violating the "rcu" axiom.  Hence the outcome is not allowed by the
> > +LKMM, as we would expect.
> >  
> >  For contrast, let's see what can happen in a more complicated example:
> >  
> > @@ -1683,15 +1737,11 @@ For contrast, let's see what can happen
> >  	}
> >  
> >  If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows
> > -that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W
> > -via V.  And just as before, this gives a cycle:
> > -
> > -	W ->rscs-link Y ->gp-link U ->rscs-link W.
> > -
> > -However, this cycle has fewer gp-link instances than rscs-link
> > -instances, and consequently the outcome is not forbidden by the LKMM.
> > -The following instruction timing diagram shows how it might actually
> > -occur:
> > +that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W.
> > +However this cycle is not forbidden, because the sequence of relations
> > +contains fewer instances of gp (one) than of rscs (two).  Consequently
> > +the outcome is allowed by the LKMM.  The following instruction timing
> > +diagram shows how it might actually occur:
> >  
> >  P0			P1			P2
> >  --------------------	--------------------	--------------------
> > 
> > 


* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-01  4:49   ` Paul E. McKenney
@ 2018-03-01  8:39     ` Boqun Feng
  2018-03-01 14:28       ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Boqun Feng @ 2018-03-01  8:39 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Alan Stern, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Peter Zijlstra, Will Deacon, Kernel development list


On Wed, Feb 28, 2018 at 08:49:37PM -0800, Paul E. McKenney wrote:
> On Thu, Mar 01, 2018 at 09:55:31AM +0800, Boqun Feng wrote:
> > On Wed, Feb 28, 2018 at 03:13:54PM -0500, Alan Stern wrote:
> > > This patch reorganizes the definition of rb in the Linux Kernel Memory
> > > Consistency Model.  The relation is now expressed in terms of
> > > rcu-fence, which consists of a sequence of gp and rscs links separated
> > > by rcu-link links, in which the number of occurrences of gp is >= the
> > > number of occurrences of rscs.
> > > 
> > > Arguments similar to those published in
> > > http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an
> > > inter-CPU strong fence.  Furthermore, the definition of rb in terms of
> > > rcu-fence is highly analogous to the definition of pb in terms of
> > > strong-fence, which can help explain why rcu-path expresses a form of
> > > temporal ordering.
> > > 
> > > This change should not affect the semantics of the memory model, just
> > > its internal organization.
> > > 
> > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > > 
> > > ---
> > > 
> > > v2: Rebase on top of the preceding patch which renames "link" to
> > > "rcu-link" and "rcu-path" to "rb".  Add back the missing "rec" keyword
> > > in the definition of rcu-fence.  Minor editing improvements in
> > > explanation.txt.
> > > 
> > > Index: usb-4.x/tools/memory-model/linux-kernel.cat
> > > ===================================================================
> > > --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> > > +++ usb-4.x/tools/memory-model/linux-kernel.cat
> > > @@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po?
> > >   *)
> > >  let rcu-link = hb* ; pb* ; prop
> > >  
> > > -(* Chains that affect the RCU grace-period guarantee *)
> > > -let gp-link = gp ; rcu-link
> > > -let rscs-link = rscs ; rcu-link
> > > -
> > >  (*
> > > - * A cycle containing at least as many grace periods as RCU read-side
> > > - * critical sections is forbidden.
> > > + * Any sequence containing at least as many grace periods as RCU read-side
> > > + * critical sections (joined by rcu-link) acts as a generalized strong fence.
> > >   *)
> > > -let rec rb =
> > > -	gp-link |
> > > -	(gp-link ; rscs-link) |
> > > -	(rscs-link ; gp-link) |
> > > -	(rb ; rb) |
> > > -	(gp-link ; rb ; rscs-link) |
> > > -	(rscs-link ; rb ; gp-link)
> > > +let rec rcu-fence = gp |
> > > +	(gp ; rcu-link ; rscs) |
> > > +	(rscs ; rcu-link ; gp) |
> > > +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > > +	(rcu-fence ; rcu-link ; rcu-fence)
> > > +
> > > +(* rb orders instructions just as pb does *)
> > > +let rb = prop ; rcu-fence ; hb* ; pb*
> > >  
> > >  irreflexive rb as rcu
> > 
> > I wonder whether we can simplify things as:
> > 
> > 	let rec rcu-fence =
> > 	    (gp; rcu-link; rscs) |
> > 	    (rscs; rcu-link; gp) |
> > 	    (gp; rcu-link; rcu-fence; rcu-link; rscs) |
> > 	    (rscs; rcu-link; rcu-fence; rcu-link; gp)
> > 	
> > 	(* gp and rcu-fence; rcu-link; rcu-fence removed *)
> > 	
> > 	let rb = prop; rcu-fence; hb*; pb*
> > 
> > 	acycle rb as rcu

Note this one should be "acyclic rb as rcu"...
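
A rough illustration of why the keyword matters, using plain Python
sets rather than herd's cat language (all helper names are made up for
this sketch).  A relation with no self-loops can still contain a
cycle, so "irreflexive" is only enough when the relation is already
closed under composition with itself.  In the original patch that
closure came, roughly speaking, from the recursive
"rcu-fence ; rcu-link ; rcu-fence" case; without it, the check has to
be "acyclic", i.e. irreflexivity of the transitive closure.

```python
def compose(r, s):
    """Relational composition r ; s on sets of (a, b) pairs."""
    return {(a, d) for (a, b) in r for (c, d) in s if b == c}


def transitive_closure(r):
    """Smallest transitive relation containing r (r+ in cat notation)."""
    closure = set(r)
    while True:
        bigger = closure | compose(closure, closure)
        if bigger == closure:
            return closure
        closure = bigger


def irreflexive(r):
    """True if no event is related to itself."""
    return all(a != b for (a, b) in r)


def acyclic(r):
    """True if the transitive closure of r is irreflexive."""
    return irreflexive(transitive_closure(r))


# A two-step cycle E -> F -> E: neither edge is a self-loop.
rb = {("E", "F"), ("F", "E")}

print(irreflexive(rb))                      # misses the cycle
print(acyclic(rb))                          # catches it
print(irreflexive(transitive_closure(rb)))  # same answer as acyclic
```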

> > 
> > In this way, "rcu-fence" is defined as "any sequence containing as many
> > grace periods as RCU read-side critical sections (joined by rcu-link)."
> > Note that "rcu-link" contains "gp", so we don't miss the case where
> > there are more grace periods. And since we use "acycle" now, so we don't
> > need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.
> > 
> > I prefer this because we already treat "gp" as "strong-fence", which
> > already is a "rcu-link". Also, recurisively extending rcu-fence with
> > itself is exactly calculating the transitive closure, which we can avoid
> > by using a "acycle" rule. Besides, it looks more consistent with hb and
> > pb.
> 
> I don't have any opinions from an aesthetics viewpoint, but this change
> does correctly handle the automatically generated tests.  I do not see
> any performance impact, if anything, about a 10% improvement based on
> this 11-process RCU litmus test:
> 
> auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-G.litmus
> 
> With the change, about 10.4 seconds, without, about 11.4 seconds.
> 

I got 12.0 seconds (my version) vs 13.59 seconds (Alan's version). So
clearly you have a faster computer than I do ;-)
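
As a quick sanity check on the "about 10%" figure, the two pairs of
timings reported in this thread can be compared directly.  This is
only arithmetic on the numbers quoted above, not a new measurement;
the function name is made up for the sketch.

```python
def relative_saving(new_seconds, old_seconds):
    """Fraction of run time saved by the new version relative to the old one."""
    return (old_seconds - new_seconds) / old_seconds


paul = relative_saving(10.4, 11.4)     # Paul's machine
boqun = relative_saving(12.0, 13.59)   # Boqun's machine

print(f"Paul:  {paul:.1%}")
print(f"Boqun: {boqun:.1%}")
```

Both come out in the neighborhood of 10%, roughly 9% and 12%
respectively.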

> I am not patient enough to try one of the really large ones, like this one:
> 
> auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-R+RW-G.litmus
> 

I'm trying to run this on my laptop, but it seems it will take forever
to run (it has now been running for an hour and a half with Alan's
version).  I will post the result if it finishes some time later.

Regards,
Boqun

> However, it is in my "litmus" github archive, so please feel free to
> try it out.  Though I would suggest working up from those of intermediate
> length.
> 
> 							Thanx, Paul
> 
> > Thoughts?
> > 
> > Regards,
> > Boqun
> > 
> > 
> > > +
> > > +(*
> > > + * The happens-before, propagation, and rcu constraints are all
> > > + * expressions of temporal ordering.  They could be replaced by
> > > + * a single constraint on an "executes-before" relation, xb:
> > > + *
> > > + * let xb = hb | pb | rb
> > > + * acyclic xb as executes-before
> > > + *)
> > > Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
> > > ===================================================================
> > > --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
> > > +++ usb-4.x/tools/memory-model/Documentation/explanation.txt
> > > @@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory C
> > >    19. AND THEN THERE WAS ALPHA
> > >    20. THE HAPPENS-BEFORE RELATION: hb
> > >    21. THE PROPAGATES-BEFORE RELATION: pb
> > > -  22. RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> > > +  22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
> > >    23. ODDS AND ENDS
> > >  
> > >  
> > > @@ -1451,8 +1451,8 @@ they execute means that it cannot have c
> > >  the content of the LKMM's "propagation" axiom.
> > >  
> > >  
> > > -RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> > > ----------------------------------------------------
> > > +RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
> > > +----------------------------------------------------
> > >  
> > >  RCU (Read-Copy-Update) is a powerful synchronization mechanism.  It
> > >  rests on two concepts: grace periods and read-side critical sections.
> > > @@ -1537,49 +1537,100 @@ relation, and the details don't matter u
> > >  a somewhat lengthy formal proof.  Pretty much all you need to know
> > >  about rcu-link is the information in the preceding paragraph.
> > >  
> > > -The LKMM goes on to define the gp-link and rscs-link relations.  They
> > > -bring grace periods and read-side critical sections into the picture,
> > > -in the following way:
> > > -
> > > -	E ->gp-link F means there is a synchronize_rcu() fence event S
> > > -	and an event X such that E ->po S, either S ->po X or S = X,
> > > -	and X ->rcu-link F.  In other words, E and F are linked by a
> > > -	grace period followed by an instance of rcu-link.
> > > -
> > > -	E ->rscs-link F means there is a critical section delimited by
> > > -	an rcu_read_lock() fence L and an rcu_read_unlock() fence U,
> > > -	and an event X such that E ->po U, either L ->po X or L = X,
> > > -	and X ->rcu-link F.  Roughly speaking, this says that some
> > > -	event in the same critical section as E is linked by rcu-link
> > > -	to F.
> > > +The LKMM also defines the gp and rscs relations.  They bring grace
> > > +periods and read-side critical sections into the picture, in the
> > > +following way:
> > > +
> > > +	E ->gp F means there is a synchronize_rcu() fence event S such
> > > +	that E ->po S and either S ->po F or S = F.  In simple terms,
> > > +	there is a grace period po-between E and F.
> > > +
> > > +	E ->rscs F means there is a critical section delimited by an
> > > +	rcu_read_lock() fence L and an rcu_read_unlock() fence U, such
> > > +	that E ->po U and either L ->po F or L = F.  You can think of
> > > +	this as saying that E and F are in the same critical section
> > > +	(in fact, it also allows E to be po-before the start of the
> > > +	critical section and F to be po-after the end).
> > >  
> > >  If we think of the rcu-link relation as standing for an extended
> > > -"before", then E ->gp-link F says that E executes before a grace
> > > -period which ends before F executes.  (In fact it covers more than
> > > -this, because it also includes cases where E executes before a grace
> > > -period and some store propagates to F's CPU before F executes and
> > > -doesn't propagate to some other CPU until after the grace period
> > > -ends.)  Similarly, E ->rscs-link F says that E is part of (or before
> > > -the start of) a critical section which starts before F executes.
> > > +"before", then X ->gp Y ->rcu-link Z says that X executes before a
> > > +grace period which ends before Z executes.  (In fact it covers more
> > > +than this, because it also includes cases where X executes before a
> > > +grace period and some store propagates to Z's CPU before Z executes
> > > +but doesn't propagate to some other CPU until after the grace period
> > > +ends.)  Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or
> > > +before the start of) a critical section which starts before Z
> > > +executes.
> > > +
> > > +The LKMM goes on to define the rcu-fence relation as a sequence of gp
> > > +and rscs links separated by rcu-link links, in which the number of gp
> > > +links is >= the number of rscs links.  For example:
> > > +
> > > +	X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
> > > +
> > > +would imply that X ->rcu-fence V, because this sequence contains two
> > > +gp links and only one rscs link.  (It also implies that X ->rcu-fence T
> > > +and Z ->rcu-fence V.)  On the other hand:
> > > +
> > > +	X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
> > > +
> > > +does not imply X ->rcu-fence V, because the sequence contains only
> > > +one gp link but two rscs links.
> > > +
> > > +The rcu-fence relation is important because the Grace Period Guarantee
> > > +means that rcu-fence acts kind of like a strong fence.  In particular,
> > > +if W is a write and we have W ->rcu-fence Z, the Guarantee says that W
> > > +will propagate to every CPU before Z executes.
> > > +
> > > +To prove this in full generality requires some intellectual effort.
> > > +We'll consider just a very simple case:
> > > +
> > > +	W ->gp X ->rcu-link Y ->rscs Z.
> > > +
> > > +This formula means that there is a grace period G and a critical
> > > +section C such that:
> > > +
> > > +	1. W is po-before G;
> > > +
> > > +	2. X is equal to or po-after G;
> > > +
> > > +	3. X comes "before" Y in some sense;
> > > +
> > > +	4. Y is po-before the end of C;
> > > +
> > > +	5. Z is equal to or po-after the start of C.
> > > +
> > > +From 2 - 4 we deduce that the grace period G ends before the critical
> > > +section C.  Then the second part of the Grace Period Guarantee says
> > > +not only that G starts before C does, but also that W (which executes
> > > +on G's CPU before G starts) must propagate to every CPU before C
> > > +starts.  In particular, W propagates to every CPU before Z executes
> > > +(or finishes executing, in the case where Z is equal to the
> > > +rcu_read_lock() fence event which starts C.)  This sort of reasoning
> > > +can be expanded to handle all the situations covered by rcu-fence.
> > > +
> > > +Finally, the LKMM defines the RCU-before (rb) relation in terms of
> > > +rcu-fence.  This is done in essentially the same way as the pb
> > > +relation was defined in terms of strong-fence.  We will omit the
> > > +details; the end result is that E ->rb F implies E must execute before
> > > +F, just as E ->pb F does (and for much the same reasons).
> > >  
> > >  Putting this all together, the LKMM expresses the Grace Period
> > > -Guarantee by requiring that there are no cycles consisting of gp-link
> > > -and rscs-link links in which the number of gp-link instances is >= the
> > > -number of rscs-link instances.  It does this by defining the rb
> > > -relation to link events E and F whenever it is possible to pass from E
> > > -to F by a sequence of gp-link and rscs-link links with at least as
> > > -many of the former as the latter.  The LKMM's "rcu" axiom then says
> > > -that there are no events E with E ->rb E.
> > > -
> > > -Justifying this axiom takes some intellectual effort, but it is in
> > > -fact a valid formalization of the Grace Period Guarantee.  We won't
> > > -attempt to go through the detailed argument, but the following
> > > -analysis gives a taste of what is involved.  Suppose we have a
> > > -violation of the first part of the Guarantee: A critical section
> > > -starts before a grace period, and some store propagates to the
> > > -critical section's CPU before the end of the critical section but
> > > -doesn't propagate to some other CPU until after the end of the grace
> > > -period.
> > > +Guarantee by requiring that the rb relation does not contain a cycle.
> > > +Equivalently, this "rcu" axiom requires that there are no events E and
> > > +F with E ->rcu-link F ->rcu-fence E.  Or to put it a third way, the
> > > +axiom requires that there are no cycles consisting of gp and rscs
> > > +alternating with rcu-link, where the number of gp links is >= the
> > > +number of rscs links.
> > > +
> > > +Justifying the axiom isn't easy, but it is in fact a valid
> > > +formalization of the Grace Period Guarantee.  We won't attempt to go
> > > +through the detailed argument, but the following analysis gives a
> > > +taste of what is involved.  Suppose we have a violation of the first
> > > +part of the Guarantee: A critical section starts before a grace
> > > +period, and some store propagates to the critical section's CPU before
> > > +the end of the critical section but doesn't propagate to some other
> > > +CPU until after the end of the grace period.
> > >  
> > >  Putting symbols to these ideas, let L and U be the rcu_read_lock() and
> > >  rcu_read_unlock() fence events delimiting the critical section in
> > > @@ -1606,11 +1657,14 @@ by rcu-link, yielding:
> > >  
> > >  	S ->po X ->rcu-link Z ->po U.
> > >  
> > > -The formulas say that S is po-between F and X, hence F ->gp-link Z
> > > -via X.  They also say that Z comes before the end of the critical
> > > -section and E comes after its start, hence Z ->rscs-link F via E.  But
> > > -now we have a forbidden cycle: F ->gp-link Z ->rscs-link F.  Thus the
> > > -"rcu" axiom rules out this violation of the Grace Period Guarantee.
> > > +The formulas say that S is po-between F and X, hence F ->gp X.  They
> > > +also say that Z comes before the end of the critical section and E
> > > +comes after its start, hence Z ->rscs E.  From all this we obtain:
> > > +
> > > +	F ->gp X ->rcu-link Z ->rscs E ->rcu-link F,
> > > +
> > > +a forbidden cycle.  Thus the "rcu" axiom rules out this violation of
> > > +the Grace Period Guarantee.
> > >  
> > >  For something a little more down-to-earth, let's see how the axiom
> > >  works out in practice.  Consider the RCU code example from above, this
> > > @@ -1639,15 +1693,15 @@ time with statement labels added to the
> > >  If r2 = 0 at the end then P0's store at X overwrites the value that
> > >  P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X.
> > >  In addition, there is a synchronize_rcu() between Y and Z, so therefore
> > > -we have Y ->gp-link X.
> > > +we have Y ->gp Z.
> > >  
> > >  If r1 = 1 at the end then P1's load at Y reads from P0's store at W,
> > >  so we have W ->rcu-link Y.  In addition, W and X are in the same critical
> > > -section, so therefore we have X ->rscs-link Y.
> > > +section, so therefore we have X ->rscs W.
> > >  
> > > -This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link
> > > -and one rscs-link, violating the "rcu" axiom.  Hence the outcome is
> > > -not allowed by the LKMM, as we would expect.
> > > +Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle,
> > > +violating the "rcu" axiom.  Hence the outcome is not allowed by the
> > > +LKMM, as we would expect.
> > >  
> > >  For contrast, let's see what can happen in a more complicated example:
> > >  
> > > @@ -1683,15 +1737,11 @@ For contrast, let's see what can happen
> > >  	}
> > >  
> > >  If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows
> > > -that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W
> > > -via V.  And just as before, this gives a cycle:
> > > -
> > > -	W ->rscs-link Y ->gp-link U ->rscs-link W.
> > > -
> > > -However, this cycle has fewer gp-link instances than rscs-link
> > > -instances, and consequently the outcome is not forbidden by the LKMM.
> > > -The following instruction timing diagram shows how it might actually
> > > -occur:
> > > +that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W.
> > > +However this cycle is not forbidden, because the sequence of relations
> > > +contains fewer instances of gp (one) than of rscs (two).  Consequently
> > > +the outcome is allowed by the LKMM.  The following instruction timing
> > > +diagram shows how it might actually occur:
> > >  
> > >  P0			P1			P2
> > >  --------------------	--------------------	--------------------
> > > 
> > > 
> 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-01  8:39     ` Boqun Feng
@ 2018-03-01 14:28       ` Paul E. McKenney
  0 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2018-03-01 14:28 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Alan Stern, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Peter Zijlstra, Will Deacon, Kernel development list

On Thu, Mar 01, 2018 at 04:39:06PM +0800, Boqun Feng wrote:
> On Wed, Feb 28, 2018 at 08:49:37PM -0800, Paul E. McKenney wrote:
> > On Thu, Mar 01, 2018 at 09:55:31AM +0800, Boqun Feng wrote:
> > > On Wed, Feb 28, 2018 at 03:13:54PM -0500, Alan Stern wrote:
> > > > This patch reorganizes the definition of rb in the Linux Kernel Memory
> > > > Consistency Model.  The relation is now expressed in terms of
> > > > rcu-fence, which consists of a sequence of gp and rscs links separated
> > > > by rcu-link links, in which the number of occurrences of gp is >= the
> > > > number of occurrences of rscs.
> > > > 
> > > > Arguments similar to those published in
> > > > http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an
> > > > inter-CPU strong fence.  Furthermore, the definition of rb in terms of
> > > > rcu-fence is highly analogous to the definition of pb in terms of
> > > > strong-fence, which can help explain why rcu-path expresses a form of
> > > > temporal ordering.
> > > > 
> > > > This change should not affect the semantics of the memory model, just
> > > > its internal organization.
> > > > 
> > > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > > > 
> > > > ---
> > > > 
> > > > v2: Rebase on top of the preceding patch which renames "link" to
> > > > "rcu-link" and "rcu-path" to "rb".  Add back the missing "rec" keyword
> > > > in the definition of rcu-fence.  Minor editing improvements in
> > > > explanation.txt.
> > > > 
> > > > Index: usb-4.x/tools/memory-model/linux-kernel.cat
> > > > ===================================================================
> > > > --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> > > > +++ usb-4.x/tools/memory-model/linux-kernel.cat
> > > > @@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po?
> > > >   *)
> > > >  let rcu-link = hb* ; pb* ; prop
> > > >  
> > > > -(* Chains that affect the RCU grace-period guarantee *)
> > > > -let gp-link = gp ; rcu-link
> > > > -let rscs-link = rscs ; rcu-link
> > > > -
> > > >  (*
> > > > - * A cycle containing at least as many grace periods as RCU read-side
> > > > - * critical sections is forbidden.
> > > > + * Any sequence containing at least as many grace periods as RCU read-side
> > > > + * critical sections (joined by rcu-link) acts as a generalized strong fence.
> > > >   *)
> > > > -let rec rb =
> > > > -	gp-link |
> > > > -	(gp-link ; rscs-link) |
> > > > -	(rscs-link ; gp-link) |
> > > > -	(rb ; rb) |
> > > > -	(gp-link ; rb ; rscs-link) |
> > > > -	(rscs-link ; rb ; gp-link)
> > > > +let rec rcu-fence = gp |
> > > > +	(gp ; rcu-link ; rscs) |
> > > > +	(rscs ; rcu-link ; gp) |
> > > > +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > > > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > > > +	(rcu-fence ; rcu-link ; rcu-fence)
> > > > +
> > > > +(* rb orders instructions just as pb does *)
> > > > +let rb = prop ; rcu-fence ; hb* ; pb*
> > > >  
> > > >  irreflexive rb as rcu
> > > 
> > > I wonder whether we can simplify things as:
> > > 
> > > 	let rec rcu-fence =
> > > 	    (gp; rcu-link; rscs) |
> > > 	    (rscs; rcu-link; gp) |
> > > 	    (gp; rcu-link; rcu-fence; rcu-link; rscs) |
> > > 	    (rscs; rcu-link; rcu-fence; rcu-link; gp)
> > > 	
> > > 	(* gp and rcu-fence; rcu-link; rcu-fence removed *)
> > > 	
> > > 	let rb = prop; rcu-fence; hb*; pb*
> > > 
> > > 	acycle rb as rcu
> 
> Note this one should be "acyclic rb as rcu"...

I applied the change by hand, and didn't notice the "acycle", so in
my tests it was indeed "acyclic".  (I left that line alone.)

> > > In this way, "rcu-fence" is defined as "any sequence containing as many
> > > grace periods as RCU read-side critical sections (joined by rcu-link)."
> > > Note that "rcu-link" contains "gp", so we don't miss the case where
> > > there are more grace periods. And since we use "acycle" now, so we don't
> > > need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.
> > > 
> > > I prefer this because we already treat "gp" as "strong-fence", which
> > > already is a "rcu-link". Also, recursively extending rcu-fence with
> > > itself is exactly calculating the transitive closure, which we can avoid
> > > by using a "acycle" rule. Besides, it looks more consistent with hb and
> > > pb.
> > 
> > I don't have any opinions from an aesthetics viewpoint, but this change
> > does correctly handle the automatically generated tests.  I do not see
> > any performance impact, if anything, about a 10% improvement based on
> > this 11-process RCU litmus test:
> > 
> > auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-G.litmus
> > 
> > With the change, about 10.4 seconds, without, about 11.4 seconds.
> 
> I got 12.0 seconds (my version) vs 13.59 seconds (Alan's version). So
> clearly you have a faster computer than I ;-)

OK, it might be consistent.

> > I am not patient enough to try one of the really large ones, like this one:
> > 
> > auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-R+RW-G.litmus
> > 
> 
> I'm trying to run this on my laptop, but it seems it will take forever to
> run (it has now been running for an hour and a half with Alan's version).
> I will update with the result if it finishes some time later.

Yes, that one will take some time.  I don't recall exactly how long,
but a great many hours, so...

> Regards,
> Boqun
> 
> > However, it is in my "litmus" github archive, so please feel free to
> > try it out.  Though I would suggest working up from those of intermediate
> > length.

... I reiterate my suggestion that you start with the shorter ones.
But your choice.  ;-)

							Thanx, Paul

> > > Thoughts?
> > > 
> > > Regards,
> > > Boqun
> > > 
> > > 
> > > > +
> > > > +(*
> > > > + * The happens-before, propagation, and rcu constraints are all
> > > > + * expressions of temporal ordering.  They could be replaced by
> > > > + * a single constraint on an "executes-before" relation, xb:
> > > > + *
> > > > + * let xb = hb | pb | rb
> > > > + * acyclic xb as executes-before
> > > > + *)
> > > > Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
> > > > ===================================================================
> > > > --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
> > > > +++ usb-4.x/tools/memory-model/Documentation/explanation.txt
> > > > @@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory C
> > > >    19. AND THEN THERE WAS ALPHA
> > > >    20. THE HAPPENS-BEFORE RELATION: hb
> > > >    21. THE PROPAGATES-BEFORE RELATION: pb
> > > > -  22. RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> > > > +  22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
> > > >    23. ODDS AND ENDS
> > > >  
> > > >  
> > > > @@ -1451,8 +1451,8 @@ they execute means that it cannot have c
> > > >  the content of the LKMM's "propagation" axiom.
> > > >  
> > > >  
> > > > -RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> > > > ----------------------------------------------------
> > > > +RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
> > > > +----------------------------------------------------
> > > >  
> > > >  RCU (Read-Copy-Update) is a powerful synchronization mechanism.  It
> > > >  rests on two concepts: grace periods and read-side critical sections.
> > > > @@ -1537,49 +1537,100 @@ relation, and the details don't matter u
> > > >  a somewhat lengthy formal proof.  Pretty much all you need to know
> > > >  about rcu-link is the information in the preceding paragraph.
> > > >  
> > > > -The LKMM goes on to define the gp-link and rscs-link relations.  They
> > > > -bring grace periods and read-side critical sections into the picture,
> > > > -in the following way:
> > > > -
> > > > -	E ->gp-link F means there is a synchronize_rcu() fence event S
> > > > -	and an event X such that E ->po S, either S ->po X or S = X,
> > > > -	and X ->rcu-link F.  In other words, E and F are linked by a
> > > > -	grace period followed by an instance of rcu-link.
> > > > -
> > > > -	E ->rscs-link F means there is a critical section delimited by
> > > > -	an rcu_read_lock() fence L and an rcu_read_unlock() fence U,
> > > > -	and an event X such that E ->po U, either L ->po X or L = X,
> > > > -	and X ->rcu-link F.  Roughly speaking, this says that some
> > > > -	event in the same critical section as E is linked by rcu-link
> > > > -	to F.
> > > > +The LKMM also defines the gp and rscs relations.  They bring grace
> > > > +periods and read-side critical sections into the picture, in the
> > > > +following way:
> > > > +
> > > > +	E ->gp F means there is a synchronize_rcu() fence event S such
> > > > +	that E ->po S and either S ->po F or S = F.  In simple terms,
> > > > +	there is a grace period po-between E and F.
> > > > +
> > > > +	E ->rscs F means there is a critical section delimited by an
> > > > +	rcu_read_lock() fence L and an rcu_read_unlock() fence U, such
> > > > +	that E ->po U and either L ->po F or L = F.  You can think of
> > > > +	this as saying that E and F are in the same critical section
> > > > +	(in fact, it also allows E to be po-before the start of the
> > > > +	critical section and F to be po-after the end).
> > > >  
> > > >  If we think of the rcu-link relation as standing for an extended
> > > > -"before", then E ->gp-link F says that E executes before a grace
> > > > -period which ends before F executes.  (In fact it covers more than
> > > > -this, because it also includes cases where E executes before a grace
> > > > -period and some store propagates to F's CPU before F executes and
> > > > -doesn't propagate to some other CPU until after the grace period
> > > > -ends.)  Similarly, E ->rscs-link F says that E is part of (or before
> > > > -the start of) a critical section which starts before F executes.
> > > > +"before", then X ->gp Y ->rcu-link Z says that X executes before a
> > > > +grace period which ends before Z executes.  (In fact it covers more
> > > > +than this, because it also includes cases where X executes before a
> > > > +grace period and some store propagates to Z's CPU before Z executes
> > > > +but doesn't propagate to some other CPU until after the grace period
> > > > +ends.)  Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or
> > > > +before the start of) a critical section which starts before Z
> > > > +executes.
> > > > +
> > > > +The LKMM goes on to define the rcu-fence relation as a sequence of gp
> > > > +and rscs links separated by rcu-link links, in which the number of gp
> > > > +links is >= the number of rscs links.  For example:
> > > > +
> > > > +	X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
> > > > +
> > > > +would imply that X ->rcu-fence V, because this sequence contains two
> > > > +gp links and only one rscs link.  (It also implies that X ->rcu-fence T
> > > > +and Z ->rcu-fence V.)  On the other hand:
> > > > +
> > > > +	X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
> > > > +
> > > > +does not imply X ->rcu-fence V, because the sequence contains only
> > > > +one gp link but two rscs links.
> > > > +
> > > > +The rcu-fence relation is important because the Grace Period Guarantee
> > > > +means that rcu-fence acts kind of like a strong fence.  In particular,
> > > > +if W is a write and we have W ->rcu-fence Z, the Guarantee says that W
> > > > +will propagate to every CPU before Z executes.
> > > > +
> > > > +To prove this in full generality requires some intellectual effort.
> > > > +We'll consider just a very simple case:
> > > > +
> > > > +	W ->gp X ->rcu-link Y ->rscs Z.
> > > > +
> > > > +This formula means that there is a grace period G and a critical
> > > > +section C such that:
> > > > +
> > > > +	1. W is po-before G;
> > > > +
> > > > +	2. X is equal to or po-after G;
> > > > +
> > > > +	3. X comes "before" Y in some sense;
> > > > +
> > > > +	4. Y is po-before the end of C;
> > > > +
> > > > +	5. Z is equal to or po-after the start of C.
> > > > +
> > > > +From 2 - 4 we deduce that the grace period G ends before the critical
> > > > +section C.  Then the second part of the Grace Period Guarantee says
> > > > +not only that G starts before C does, but also that W (which executes
> > > > +on G's CPU before G starts) must propagate to every CPU before C
> > > > +starts.  In particular, W propagates to every CPU before Z executes
> > > > +(or finishes executing, in the case where Z is equal to the
> > > > +rcu_read_lock() fence event which starts C.)  This sort of reasoning
> > > > +can be expanded to handle all the situations covered by rcu-fence.
> > > > +
> > > > +Finally, the LKMM defines the RCU-before (rb) relation in terms of
> > > > +rcu-fence.  This is done in essentially the same way as the pb
> > > > +relation was defined in terms of strong-fence.  We will omit the
> > > > +details; the end result is that E ->rb F implies E must execute before
> > > > +F, just as E ->pb F does (and for much the same reasons).
> > > >  
> > > >  Putting this all together, the LKMM expresses the Grace Period
> > > > -Guarantee by requiring that there are no cycles consisting of gp-link
> > > > -and rscs-link links in which the number of gp-link instances is >= the
> > > > -number of rscs-link instances.  It does this by defining the rb
> > > > -relation to link events E and F whenever it is possible to pass from E
> > > > -to F by a sequence of gp-link and rscs-link links with at least as
> > > > -many of the former as the latter.  The LKMM's "rcu" axiom then says
> > > > -that there are no events E with E ->rb E.
> > > > -
> > > > -Justifying this axiom takes some intellectual effort, but it is in
> > > > -fact a valid formalization of the Grace Period Guarantee.  We won't
> > > > -attempt to go through the detailed argument, but the following
> > > > -analysis gives a taste of what is involved.  Suppose we have a
> > > > -violation of the first part of the Guarantee: A critical section
> > > > -starts before a grace period, and some store propagates to the
> > > > -critical section's CPU before the end of the critical section but
> > > > -doesn't propagate to some other CPU until after the end of the grace
> > > > -period.
> > > > +Guarantee by requiring that the rb relation does not contain a cycle.
> > > > +Equivalently, this "rcu" axiom requires that there are no events E and
> > > > +F with E ->rcu-link F ->rcu-fence E.  Or to put it a third way, the
> > > > +axiom requires that there are no cycles consisting of gp and rscs
> > > > +alternating with rcu-link, where the number of gp links is >= the
> > > > +number of rscs links.
> > > > +
> > > > +Justifying the axiom isn't easy, but it is in fact a valid
> > > > +formalization of the Grace Period Guarantee.  We won't attempt to go
> > > > +through the detailed argument, but the following analysis gives a
> > > > +taste of what is involved.  Suppose we have a violation of the first
> > > > +part of the Guarantee: A critical section starts before a grace
> > > > +period, and some store propagates to the critical section's CPU before
> > > > +the end of the critical section but doesn't propagate to some other
> > > > +CPU until after the end of the grace period.
> > > >  
> > > >  Putting symbols to these ideas, let L and U be the rcu_read_lock() and
> > > >  rcu_read_unlock() fence events delimiting the critical section in
> > > > @@ -1606,11 +1657,14 @@ by rcu-link, yielding:
> > > >  
> > > >  	S ->po X ->rcu-link Z ->po U.
> > > >  
> > > > -The formulas say that S is po-between F and X, hence F ->gp-link Z
> > > > -via X.  They also say that Z comes before the end of the critical
> > > > -section and E comes after its start, hence Z ->rscs-link F via E.  But
> > > > -now we have a forbidden cycle: F ->gp-link Z ->rscs-link F.  Thus the
> > > > -"rcu" axiom rules out this violation of the Grace Period Guarantee.
> > > > +The formulas say that S is po-between F and X, hence F ->gp X.  They
> > > > +also say that Z comes before the end of the critical section and E
> > > > +comes after its start, hence Z ->rscs E.  From all this we obtain:
> > > > +
> > > > +	F ->gp X ->rcu-link Z ->rscs E ->rcu-link F,
> > > > +
> > > > +a forbidden cycle.  Thus the "rcu" axiom rules out this violation of
> > > > +the Grace Period Guarantee.
> > > >  
> > > >  For something a little more down-to-earth, let's see how the axiom
> > > >  works out in practice.  Consider the RCU code example from above, this
> > > > @@ -1639,15 +1693,15 @@ time with statement labels added to the
> > > >  If r2 = 0 at the end then P0's store at X overwrites the value that
> > > >  P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X.
> > > >  In addition, there is a synchronize_rcu() between Y and Z, so therefore
> > > > -we have Y ->gp-link X.
> > > > +we have Y ->gp Z.
> > > >  
> > > >  If r1 = 1 at the end then P1's load at Y reads from P0's store at W,
> > > >  so we have W ->rcu-link Y.  In addition, W and X are in the same critical
> > > > -section, so therefore we have X ->rscs-link Y.
> > > > +section, so therefore we have X ->rscs W.
> > > >  
> > > > -This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link
> > > > -and one rscs-link, violating the "rcu" axiom.  Hence the outcome is
> > > > -not allowed by the LKMM, as we would expect.
> > > > +Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle,
> > > > +violating the "rcu" axiom.  Hence the outcome is not allowed by the
> > > > +LKMM, as we would expect.
> > > >  
> > > >  For contrast, let's see what can happen in a more complicated example:
> > > >  
> > > > @@ -1683,15 +1737,11 @@ For contrast, let's see what can happen
> > > >  	}
> > > >  
> > > >  If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows
> > > > -that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W
> > > > -via V.  And just as before, this gives a cycle:
> > > > -
> > > > -	W ->rscs-link Y ->gp-link U ->rscs-link W.
> > > > -
> > > > -However, this cycle has fewer gp-link instances than rscs-link
> > > > -instances, and consequently the outcome is not forbidden by the LKMM.
> > > > -The following instruction timing diagram shows how it might actually
> > > > -occur:
> > > > +that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W.
> > > > +However this cycle is not forbidden, because the sequence of relations
> > > > +contains fewer instances of gp (one) than of rscs (two).  Consequently
> > > > +the outcome is allowed by the LKMM.  The following instruction timing
> > > > +diagram shows how it might actually occur:
> > > >  
> > > >  P0			P1			P2
> > > >  --------------------	--------------------	--------------------
> > > > 
> > > > 
> > 
> > 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-01  1:55 ` Boqun Feng
  2018-03-01  4:49   ` Paul E. McKenney
@ 2018-03-01 15:49   ` Alan Stern
  2018-03-01 17:49     ` Paul E. McKenney
  1 sibling, 1 reply; 13+ messages in thread
From: Alan Stern @ 2018-03-01 15:49 UTC (permalink / raw)
  To: Boqun Feng
  Cc: LKMM Maintainers -- Akira Yokosawa, Andrea Parri, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Paul E. McKenney,
	Peter Zijlstra, Will Deacon, Kernel development list

On Thu, 1 Mar 2018, Boqun Feng wrote:

> > +let rec rcu-fence = gp |
> > +	(gp ; rcu-link ; rscs) |
> > +	(rscs ; rcu-link ; gp) |
> > +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > +	(rcu-fence ; rcu-link ; rcu-fence)
> > +
> > +(* rb orders instructions just as pb does *)
> > +let rb = prop ; rcu-fence ; hb* ; pb*
> >  
> >  irreflexive rb as rcu
> 
> I wonder whether we can simplify things as:
> 
> 	let rec rcu-fence =
> 	    (gp; rcu-link; rscs) |
> 	    (rscs; rcu-link; gp) |
> 	    (gp; rcu-link; rcu-fence; rcu-link; rscs) |
> 	    (rscs; rcu-link; rcu-fence; rcu-link; gp)
> 	
> 	(* gp and rcu-fence; rcu-link; rcu-fence removed *)
> 	
> 	let rb = prop; rcu-fence; hb*; pb*
> 
> 	acycle rb as rcu
> 
> In this way, "rcu-fence" is defined as "any sequence containing as many
> grace periods as RCU read-side critical sections (joined by rcu-link)."
> Note that "rcu-link" contains "gp", so we don't miss the case where
> there are more grace periods. And since we use "acycle" now, so we don't
> need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.

Would this definition of rcu-fence work for a sequence such as (leaving
out the intermediate rcu-link parts):

	gp gp gp rscs rscs gp rscs rscs

?  I don't think it would.  Yes, if you had a cycle of that form then 
your "rcu" axiom would detect it, but at some point we might want to 
use rcu-fence for some other purpose, one that doesn't involve cycles.
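
To make the distinction concrete, here is a toy Python sketch (not part
of the patch or of herd).  It assumes each gp link is abbreviated to "G"
and each rscs link to "R", with the intermediate rcu-link parts left out
as above; all function names are made up for illustration.  It checks
whether a linear sequence is derivable from each set of recursive rules,
and whether a cycle of that shape would be caught by an irreflexive or an
acyclic constraint.

```python
from functools import lru_cache


def alan_fence(s):
    """Toy model of the patch's rules: G | GR | RG | G F R | R F G | F F."""
    @lru_cache(maxsize=None)
    def f(i, j):  # is the slice s[i:j] an rcu-fence?
        t = s[i:j]
        if t in ("G", "GR", "RG"):
            return True
        # G F R or R F G: the ends differ and the inside is itself a fence
        if len(t) >= 3 and {t[0], t[-1]} == {"G", "R"} and f(i + 1, j - 1):
            return True
        # F F: some split point yields two fences
        return any(f(i, k) and f(k, j) for k in range(i + 1, j))
    return f(0, len(s))


def boqun_fence(s):
    """Toy model of the simplified rules: GR | RG | G F R | R F G."""
    if len(s) < 2 or {s[0], s[-1]} != {"G", "R"}:
        return False
    return len(s) == 2 or boqun_fence(s[1:-1])


def rotations(cycle):
    return [cycle[k:] + cycle[:k] for k in range(len(cycle))]


def concat_of(fence, s):
    """True if s splits into one or more consecutive fences."""
    n = len(s)
    ok = [True] + [False] * n
    for j in range(1, n + 1):
        ok[j] = any(ok[i] and fence(s[i:j]) for i in range(j))
    return n > 0 and ok[n]


def irreflexive_forbids(fence, cycle):
    # A single fence must span the whole cycle from some starting point.
    return any(fence(r) for r in rotations(cycle))


def acyclic_forbids(fence, cycle):
    # A chain of fences (transitive closure) may span the cycle.
    return any(concat_of(fence, r) for r in rotations(cycle))


SEQ = "GGGRRGRR"  # gp gp gp rscs rscs gp rscs rscs
```

Under these assumptions, alan_fence(SEQ) holds while boqun_fence(SEQ)
does not, yet acyclic_forbids(boqun_fence, SEQ) still holds, because the
rotation GGRR GR RG splits into three smaller fences.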

> I prefer this because we already treat "gp" as "strong-fence", which
> already is a "rcu-link".

That's a good point; it had not occurred to me.

>  Also, recursively extending rcu-fence with
> itself is exactly calculating the transitive closure, which we can avoid
> by using a "acycle" rule. Besides, it looks more consistent with hb and
> pb.

That _had_ occurred to me.  But I couldn't see any way to do it while 
still defining rcu-fence correctly.

Alan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-01 15:49   ` Alan Stern
@ 2018-03-01 17:49     ` Paul E. McKenney
  2018-03-01 18:37       ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2018-03-01 17:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: Boqun Feng, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Peter Zijlstra, Will Deacon, Kernel development list

On Thu, Mar 01, 2018 at 10:49:05AM -0500, Alan Stern wrote:
> On Thu, 1 Mar 2018, Boqun Feng wrote:
> 
> > > +let rec rcu-fence = gp |
> > > +	(gp ; rcu-link ; rscs) |
> > > +	(rscs ; rcu-link ; gp) |
> > > +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > > +	(rcu-fence ; rcu-link ; rcu-fence)
> > > +
> > > +(* rb orders instructions just as pb does *)
> > > +let rb = prop ; rcu-fence ; hb* ; pb*
> > >  
> > >  irreflexive rb as rcu
> > 
> > I wonder whether we can simplify things as:
> > 
> > 	let rec rcu-fence =
> > 	    (gp; rcu-link; rscs) |
> > 	    (rscs; rcu-link; gp) |
> > 	    (gp; rcu-link; rcu-fence; rcu-link; rscs) |
> > 	    (rscs; rcu-link; rcu-fence; rcu-link; gp)
> > 	
> > 	(* gp and rcu-fence; rcu-link; rcu-fence removed *)
> > 	
> > 	let rb = prop; rcu-fence; hb*; pb*
> > 
> > 	acycle rb as rcu
> > 
> > In this way, "rcu-fence" is defined as "any sequence containing as many
> > grace periods as RCU read-side critical sections (joined by rcu-link)."
> > Note that "rcu-link" contains "gp", so we don't miss the case where
> > there are more grace periods. And since we use "acycle" now, so we don't
> > need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.
> 
> Would this definition of rcu-fence work for a sequence such as (leaving
> out the intermediate rcu-link parts):
> 
> 	gp gp gp rscs rscs gp rscs rscs
> 
> ?  I don't think it would.  Yes, if you had a cycle of that form then 
> your "rcu" axiom would detect it, but at some point we might want to 
> use rcu-fence for some other purpose, one that doesn't involve cycles.

Let's see, that would map to this:

auto/RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus

And no, there is no such automatically generated litmus test.  Let's
try reversing the "gp" and "rscs", which should have the same effect
courtesy of symmetry:

auto/RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus

And that one doesn't exist, either.  So much for random test generation!  :-/

Clearly time to add them.  And here is what herd has to say about them:

$ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
Herd options: -conf linux-kernel.cfg
Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Sometimes 1 255
 ^^^ Unexpected non-Never verification
 0inputs+32outputs (0major+2605minor)pagefaults 0swaps
$ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
Herd options: -conf linux-kernel.cfg
Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Sometimes 1 255
 ^^^ Unexpected non-Never verification
 0inputs+32outputs (0major+2620minor)pagefaults 0swaps

In other words, they are in fact misclassified as "Sometimes" when they
should be "Never".  I have my diffs below in case I misapplied Boqun's
change.

With Alan's original formulation, these two litmus tests are correctly
handled:

$ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
Herd options: -conf linux-kernel.cfg
Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Never 0 255
1.61user 0.00system 0:01.63elapsed 98%CPU (0avgtext+0avgdata 9572maxresident)k
$ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
Herd options: -conf linux-kernel.cfg
Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Never 0 255
1.84user 0.01system 0:01.92elapsed 96%CPU (0avgtext+0avgdata 10112maxresident)k

> > I prefer this because we already treat "gp" as "strong-fence", which
> > already is a "rcu-link".
> 
> That's a good point; it had not occurred to me.

And if I remove the "gp" but leave the last line, it does properly
classify the two new litmus tests.

							Thanx, Paul

> >  Also, recursively extending rcu-fence with
> > itself is exactly calculating the transitive closure, which we can avoid
> > by using a "acycle" rule. Besides, it looks more consistent with hb and
> > pb.
> 
> That _had_ occurred to me.  But I couldn't see any way to do it while 
> still defining rcu-fence correctly.

------------------------------------------------------------------------

diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
index 1e5c4653dd12..75d3c225146c 100644
--- a/tools/memory-model/linux-kernel.cat
+++ b/tools/memory-model/linux-kernel.cat
@@ -106,12 +106,11 @@ let rcu-link = hb* ; pb* ; prop
  * Any sequence containing at least as many grace periods as RCU read-side
  * critical sections (joined by rcu-link) acts as a generalized strong fence.
  *)
-let rec rcu-fence = gp |
+let rec rcu-fence =
 	(gp ; rcu-link ; rscs) |
 	(rscs ; rcu-link ; gp) |
 	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
-	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
-	(rcu-fence ; rcu-link ; rcu-fence)
+	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp)
 
 (* rb orders instructions just as pb does *)
 let rb = prop ; rcu-fence ; hb* ; pb*

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-01 17:49     ` Paul E. McKenney
@ 2018-03-01 18:37       ` Paul E. McKenney
  2018-03-02  4:31         ` Boqun Feng
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2018-03-01 18:37 UTC (permalink / raw)
  To: Alan Stern
  Cc: Boqun Feng, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Peter Zijlstra, Will Deacon, Kernel development list

On Thu, Mar 01, 2018 at 09:49:06AM -0800, Paul E. McKenney wrote:
> On Thu, Mar 01, 2018 at 10:49:05AM -0500, Alan Stern wrote:
> > On Thu, 1 Mar 2018, Boqun Feng wrote:
> > 
> > > > +let rec rcu-fence = gp |
> > > > +	(gp ; rcu-link ; rscs) |
> > > > +	(rscs ; rcu-link ; gp) |
> > > > +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > > > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > > > +	(rcu-fence ; rcu-link ; rcu-fence)
> > > > +
> > > > +(* rb orders instructions just as pb does *)
> > > > +let rb = prop ; rcu-fence ; hb* ; pb*
> > > >  
> > > >  irreflexive rb as rcu
> > > 
> > > I wonder whether we can simplify things as:
> > > 
> > > 	let rec rcu-fence =
> > > 	    (gp; rcu-link; rscs) |
> > > 	    (rscs; rcu-link; gp) |
> > > 	    (gp; rcu-link; rcu-fence; rcu-link; rscs) |
> > > 	    (rscs; rcu-link; rcu-fence; rcu-link; gp)
> > > 	
> > > 	(* gp and rcu-fence; rcu-link; rcu-fence removed *)
> > > 	
> > > 	let rb = prop; rcu-fence; hb*; pb*
> > > 
> > > 	acycle rb as rcu
> > > 
> > > In this way, "rcu-fence" is defined as "any sequence containing as many
> > > grace periods as RCU read-side critical sections (joined by rcu-link)."
> > > Note that "rcu-link" contains "gp", so we don't miss the case where
> > > there are more grace periods. And since we use "acycle" now, so we don't
> > > need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.
> > 
> > Would this definition of rcu-fence work for a sequence such as (leaving
> > out the intermediate rcu-link parts):
> > 
> > 	gp gp gp rscs rscs gp rscs rscs
> > 
> > ?  I don't think it would.  Yes, if you had a cycle of that form then 
> > your "rcu" axiom would detect it, but at some point we might want to 
> > use rcu-fence for some other purpose, one that doesn't involve cycles.
> 
> Let's see, that would map to this:
> 
> auto/RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
> 
> And no, there is no such automatically generated litmus test.  Let's
> try reversing the "gp" and "rscs", which should have the same effect
> courtesy of symmetry:
> 
> auto/RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
> 
> And that one doesn't exist, either.  So much for random test generation!  :-/
> 
> Clearly time to add them.  And here is what herd has to say about them:
> 
> $ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
> Herd options: -conf linux-kernel.cfg
> Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Sometimes 1 255
>  ^^^ Unexpected non-Never verification
>  0inputs+32outputs (0major+2605minor)pagefaults 0swaps
> $ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
> Herd options: -conf linux-kernel.cfg
> Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Sometimes 1 255
>  ^^^ Unexpected non-Never verification
>  0inputs+32outputs (0major+2620minor)pagefaults 0swaps
> 
> In other words, they are in fact misclassified as "Sometimes" when they
> should be "Never".  I have my diffs below in case I misapplied Boqun's
> change.
> 
> With Alan's original formulation, these two litmus tests are correctly
> handled:
> 
> $ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
> Herd options: -conf linux-kernel.cfg
> Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Never 0 255
> 1.61user 0.00system 0:01.63elapsed 98%CPU (0avgtext+0avgdata 9572maxresident)k
> $ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
> Herd options: -conf linux-kernel.cfg
> Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Never 0 255
> 1.84user 0.01system 0:01.92elapsed 96%CPU (0avgtext+0avgdata 10112maxresident)k

And as Andrea pointed out off-list, I did indeed mess up Boqun's change.
I forgot to change the "irreflexive" into "acyclic".  Applying that change
makes everything work.
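
For anyone wondering why that one keyword matters, here is a rough sketch in
plain Python (not herd; the two-event relation and the helper names are made
up for illustration).  With the reduced rcu-fence, rb is no longer
transitively closed, so a cycle can be built from several rb links without
any single event being rb-related to itself:

```python
def is_irreflexive(rel):
    """True if no event is related to itself in a single step."""
    return all(a != b for a, b in rel)


def transitive_closure(rel):
    """Smallest transitive relation containing rel (naive fixpoint)."""
    closure = set(rel)
    while True:
        extra = {(a, d) for a, b in closure for c, d in closure if b == c}
        if extra <= closure:
            return closure
        closure |= extra


def is_acyclic(rel):
    """Acyclic means the transitive closure is irreflexive."""
    return is_irreflexive(transitive_closure(rel))


# Hypothetical rb relation: a two-link cycle E -> F -> E.
rb = {("E", "F"), ("F", "E")}

print(is_irreflexive(rb))  # the one-step check does not see the cycle
print(is_acyclic(rb))      # the closure-based check does
```

An "irreflexive" check on a relation that is not transitively closed can miss
such a cycle, which would be consistent with the "Sometimes" results above.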

Please accept my apologies for my confusion!

							Thanx, Paul

> > > I prefer this because we already treat "gp" as "strong-fence", which
> > > already is a "rcu-link".
> > 
> > That's a good point; it had not occurred to me.
> 
> And if I remove the "gp" but leave the last line, it does properly
> classify the two new litmus tests.
> 
> 							Thanx, Paul
> 
> > >  Also, recursively extending rcu-fence with
> > > itself is exactly calculating the transitive closure, which we can avoid
> > > by using an "acyclic" rule. Besides, it looks more consistent with hb and
> > > pb.
> > 
> > That _had_ occurred to me.  But I couldn't see any way to do it while 
> > still defining rcu-fence correctly.
> 
> ------------------------------------------------------------------------
> 
> diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
> index 1e5c4653dd12..75d3c225146c 100644
> --- a/tools/memory-model/linux-kernel.cat
> +++ b/tools/memory-model/linux-kernel.cat
> @@ -106,12 +106,11 @@ let rcu-link = hb* ; pb* ; prop
>   * Any sequence containing at least as many grace periods as RCU read-side
>   * critical sections (joined by rcu-link) acts as a generalized strong fence.
>   *)
> -let rec rcu-fence = gp |
> +let rec rcu-fence =
>  	(gp ; rcu-link ; rscs) |
>  	(rscs ; rcu-link ; gp) |
>  	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> -	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> -	(rcu-fence ; rcu-link ; rcu-fence)
> +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp)
>  
>  (* rb orders instructions just as pb does *)
>  let rb = prop ; rcu-fence ; hb* ; pb*

* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-01 18:37       ` Paul E. McKenney
@ 2018-03-02  4:31         ` Boqun Feng
  2018-03-02  4:50           ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Boqun Feng @ 2018-03-02  4:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Alan Stern, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Peter Zijlstra, Will Deacon, Kernel development list

On Thu, Mar 01, 2018 at 10:37:58AM -0800, Paul E. McKenney wrote:
> On Thu, Mar 01, 2018 at 09:49:06AM -0800, Paul E. McKenney wrote:
> > On Thu, Mar 01, 2018 at 10:49:05AM -0500, Alan Stern wrote:
> > > On Thu, 1 Mar 2018, Boqun Feng wrote:
> > > 
> > > > > +let rec rcu-fence = gp |
> > > > > +	(gp ; rcu-link ; rscs) |
> > > > > +	(rscs ; rcu-link ; gp) |
> > > > > +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > > > > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > > > > +	(rcu-fence ; rcu-link ; rcu-fence)
> > > > > +
> > > > > +(* rb orders instructions just as pb does *)
> > > > > +let rb = prop ; rcu-fence ; hb* ; pb*
> > > > >  
> > > > >  irreflexive rb as rcu
> > > > 
> > > > I wonder whether we can simplify things as:
> > > > 
> > > > 	let rec rcu-fence =
> > > > 	    (gp; rcu-link; rscs) |
> > > > 	    (rscs; rcu-link; gp) |
> > > > 	    (gp; rcu-link; rcu-fence; rcu-link; rscs) |
> > > > 	    (rscs; rcu-link; rcu-fence; rcu-link; gp)
> > > > 	
> > > > 	(* gp and rcu-fence; rcu-link; rcu-fence removed *)
> > > > 	
> > > > 	let rb = prop; rcu-fence; hb*; pb*
> > > > 
> > > > 	acyclic rb as rcu
> > > > 
> > > > In this way, "rcu-fence" is defined as "any sequence containing as many
> > > > grace periods as RCU read-side critical sections (joined by rcu-link)."
> > > > Note that "rcu-link" contains "gp", so we don't miss the case where
> > > > there are more grace periods. And since we use "acyclic" now, we don't
> > > > need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.
> > > 
> > > Would this definition of rcu-fence work for a sequence such as (leaving
> > > out the intermediate rcu-link parts):
> > > 
> > > 	gp gp gp rscs rscs gp rscs rscs
> > > 
> > > ?  I don't think it would.  Yes, if you had a cycle of that form then 

Right. 

> > > your "rcu" axiom would detect it, but at some point we might want to 
> > > use rcu-fence for some other purpose, one that doesn't involve cycles.

OK, and I've not yet found another simple way to express rcu-fence for
purposes other than cycle-checking. So I'm OK with leaving it as it is,
except for removing the redundant "gp" from the rcu-fence definition.

But I will continue to search for an easier and sufficient way to define
these things ;-) 
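
To make the counterexample concrete, here is a rough sketch (plain Python,
not herd; the rcu-link steps are dropped, "G" stands for gp and "R" for rscs,
and the function names full() and reduced() are made up for illustration)
that checks which gp/rscs strings each version of the rcu-fence definition
can derive:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def full(seq):
    """Six-alternative rcu-fence definition from the patch, rcu-link elided."""
    if seq in ("G", "GR", "RG"):
        return True
    # gp ; rcu-fence ; rscs   or   rscs ; rcu-fence ; gp
    if len(seq) >= 3 and seq[0] + seq[-1] in ("GR", "RG") and full(seq[1:-1]):
        return True
    # rcu-fence ; rcu-fence
    return any(full(seq[:i]) and full(seq[i:]) for i in range(1, len(seq)))


@lru_cache(maxsize=None)
def reduced(seq):
    """Four-alternative variant: no bare gp, no rcu-fence ; rcu-fence."""
    if seq in ("GR", "RG"):
        return True
    return len(seq) >= 3 and seq[0] + seq[-1] in ("GR", "RG") and reduced(seq[1:-1])


alan = "GGGRRGRR"  # gp gp gp rscs rscs gp rscs rscs
print(full(alan), reduced(alan))
```

Under these assumptions the sequence is derivable from the six-line
definition but not from the four-line one.  This only concerns rcu-fence as
a standalone relation; it says nothing about whether a cycle of that shape
is caught by the "rcu" check.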

> > 
> > Let's see, that would map to this:
> > 
> > auto/RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
> > 
> > And no, there is no such automatically generated litmus test.  Let's
> > try reversing the "gp" and "rscs", which should have the same effect
> > courtesy of symmetry:
> > 
> > auto/RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
> > 
> > And that one doesn't exist, either.  So much for random test generation!  :-/
> > 
> > Clearly time to add them.  And here is what herd has to say about them:
> > 
> > $ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
> > Herd options: -conf linux-kernel.cfg
> > Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Sometimes 1 255
> >  ^^^ Unexpected non-Never verification
> >  0inputs+32outputs (0major+2605minor)pagefaults 0swaps
> > $ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
> > Herd options: -conf linux-kernel.cfg
> > Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Sometimes 1 255
> >  ^^^ Unexpected non-Never verification
> >  0inputs+32outputs (0major+2620minor)pagefaults 0swaps
> > 
> > In other words, they are in fact misclassified as "Sometimes" when they
> > should be "Never".  I have my diffs below in case I misapplied Boqun's
> > change.
> > 
> > With Alan's original formulation, these two litmus tests are correctly
> > handled:
> > 
> > $ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
> > Herd options: -conf linux-kernel.cfg
> > Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Never 0 255
> > 1.61user 0.00system 0:01.63elapsed 98%CPU (0avgtext+0avgdata 9572maxresident)k
> > $ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
> > Herd options: -conf linux-kernel.cfg
> > Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Never 0 255
> > 1.84user 0.01system 0:01.92elapsed 96%CPU (0avgtext+0avgdata 10112maxresident)k
> 
> And as Andrea pointed out off-list, I did indeed mess up Boqun's change.
> I forgot to change the "irreflexive" into "acyclic".  Applying that change
> makes everything work.
> 
> Please accept my apologies for my confusion!
> 

np, also I should have provided a proper patch for your testing.

For Alan's patch, feel free to add:

Reviewed-by: Boqun Feng <boqun.feng@gmail.com>

Regards,
Boqun

> 							Thanx, Paul
> 
> > > > I prefer this because we already treat "gp" as "strong-fence", which
> > > > already is a "rcu-link".
> > > 
> > > That's a good point; it had not occurred to me.
> > 
> > And if I remove the "gp" but leave the last line, it does properly
> > classify the two new litmus tests.
> > 
> > 							Thanx, Paul
> > 
> > > >  Also, recursively extending rcu-fence with
> > > > itself is exactly calculating the transitive closure, which we can avoid
> > > > by using an "acyclic" rule. Besides, it looks more consistent with hb and
> > > > pb.
> > > 
> > > That _had_ occurred to me.  But I couldn't see any way to do it while 
> > > still defining rcu-fence correctly.
> > 
> > ------------------------------------------------------------------------
> > 
> > diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
> > index 1e5c4653dd12..75d3c225146c 100644
> > --- a/tools/memory-model/linux-kernel.cat
> > +++ b/tools/memory-model/linux-kernel.cat
> > @@ -106,12 +106,11 @@ let rcu-link = hb* ; pb* ; prop
> >   * Any sequence containing at least as many grace periods as RCU read-side
> >   * critical sections (joined by rcu-link) acts as a generalized strong fence.
> >   *)
> > -let rec rcu-fence = gp |
> > +let rec rcu-fence =
> >  	(gp ; rcu-link ; rscs) |
> >  	(rscs ; rcu-link ; gp) |
> >  	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > -	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > -	(rcu-fence ; rcu-link ; rcu-fence)
> > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp)
> >  
> >  (* rb orders instructions just as pb does *)
> >  let rb = prop ; rcu-fence ; hb* ; pb*
> 

* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-02  4:31         ` Boqun Feng
@ 2018-03-02  4:50           ` Paul E. McKenney
  2018-03-02 15:17             ` Alan Stern
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2018-03-02  4:50 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Alan Stern, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Peter Zijlstra, Will Deacon, Kernel development list

On Fri, Mar 02, 2018 at 12:31:41PM +0800, Boqun Feng wrote:
> On Thu, Mar 01, 2018 at 10:37:58AM -0800, Paul E. McKenney wrote:
> > On Thu, Mar 01, 2018 at 09:49:06AM -0800, Paul E. McKenney wrote:

[ . . . ]

> > And as Andrea pointed out off-list, I did indeed mess up Boqun's change.
> > I forgot to change the "irreflexive" into "acyclic".  Applying that change
> > makes everything work.
> > 
> > Please accept my apologies for my confusion!
> > 
> 
> np, also I should have provided a proper patch for your testing.
> 
> For Alan's patch, feel free to add:
> 
> Reviewed-by: Boqun Feng <boqun.feng@gmail.com>

Alan's last submission was still RFC, so I have not yet queued it.
So this ball is still in Alan's court.

							Thanx, Paul

> Regards,
> Boqun
> 
> > 							Thanx, Paul
> > 
> > > > > I prefer this because we already treat "gp" as "strong-fence", which
> > > > > already is a "rcu-link".
> > > > 
> > > > That's a good point; it had not occurred to me.
> > > 
> > > And if I remove the "gp" but leave the last line, it does properly
> > > classify the two new litmus tests.
> > > 
> > > 							Thanx, Paul
> > > 
> > > > >  Also, recursively extending rcu-fence with
> > > > > itself is exactly calculating the transitive closure, which we can avoid
> > > > > by using an "acyclic" rule. Besides, it looks more consistent with hb and
> > > > > pb.
> > > > 
> > > > That _had_ occurred to me.  But I couldn't see any way to do it while 
> > > > still defining rcu-fence correctly.
> > > 
> > > ------------------------------------------------------------------------
> > > 
> > > diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
> > > index 1e5c4653dd12..75d3c225146c 100644
> > > --- a/tools/memory-model/linux-kernel.cat
> > > +++ b/tools/memory-model/linux-kernel.cat
> > > @@ -106,12 +106,11 @@ let rcu-link = hb* ; pb* ; prop
> > >   * Any sequence containing at least as many grace periods as RCU read-side
> > >   * critical sections (joined by rcu-link) acts as a generalized strong fence.
> > >   *)
> > > -let rec rcu-fence = gp |
> > > +let rec rcu-fence =
> > >  	(gp ; rcu-link ; rscs) |
> > >  	(rscs ; rcu-link ; gp) |
> > >  	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > > -	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > > -	(rcu-fence ; rcu-link ; rcu-fence)
> > > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp)
> > >  
> > >  (* rb orders instructions just as pb does *)
> > >  let rb = prop ; rcu-fence ; hb* ; pb*
> > 

* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-02  4:50           ` Paul E. McKenney
@ 2018-03-02 15:17             ` Alan Stern
  2018-03-02 17:38               ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Alan Stern @ 2018-03-02 15:17 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Boqun Feng, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Peter Zijlstra, Will Deacon, Kernel development list

On Thu, 1 Mar 2018, Paul E. McKenney wrote:

> On Fri, Mar 02, 2018 at 12:31:41PM +0800, Boqun Feng wrote:
> > On Thu, Mar 01, 2018 at 10:37:58AM -0800, Paul E. McKenney wrote:
> > > On Thu, Mar 01, 2018 at 09:49:06AM -0800, Paul E. McKenney wrote:
> 
> [ . . . ]
> 
> > > And as Andrea pointed out off-list, I did indeed mess up Boqun's change.
> > > I forgot to change the "irreflexive" into "acyclic".  Applying that change
> > > makes everything work.
> > > 
> > > Please accept my apologies for my confusion!
> > > 
> > 
> > np, also I should have provided a proper patch for your testing.
> > 
> > For Alan's patch, feel free to add:
> > 
> > Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
> 
> Alan's last submission was still RFC, so I have not yet queued it.
> So this ball is still in Alan's court.

I'll wait a few more days to see if there are any other comments and 
then submit it officially.

Alan

* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-02 15:17             ` Alan Stern
@ 2018-03-02 17:38               ` Paul E. McKenney
  0 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2018-03-02 17:38 UTC (permalink / raw)
  To: Alan Stern
  Cc: Boqun Feng, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	Peter Zijlstra, Will Deacon, Kernel development list

On Fri, Mar 02, 2018 at 10:17:55AM -0500, Alan Stern wrote:
> On Thu, 1 Mar 2018, Paul E. McKenney wrote:
> 
> > On Fri, Mar 02, 2018 at 12:31:41PM +0800, Boqun Feng wrote:
> > > On Thu, Mar 01, 2018 at 10:37:58AM -0800, Paul E. McKenney wrote:
> > > > On Thu, Mar 01, 2018 at 09:49:06AM -0800, Paul E. McKenney wrote:
> > 
> > [ . . . ]
> > 
> > > > And as Andrea pointed out off-list, I did indeed mess up Boqun's change.
> > > > I forgot to change the "irreflexive" into "acyclic".  Applying that change
> > > > makes everything work.
> > > > 
> > > > Please accept my apologies for my confusion!
> > > > 
> > > 
> > > np, also I should have provided a proper patch for your testing.
> > > 
> > > For Alan's patch, feel free to add:
> > > 
> > > Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
> > 
> > Alan's last submission was still RFC, so I have not yet queued it.
> > So this ball is still in Alan's court.
> 
> I'll wait a few more days to see if there are any other comments and 
> then submit it officially.

Will do!

							Thanx, Paul

* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-02-28 20:13 [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence Alan Stern
  2018-03-01  1:55 ` Boqun Feng
@ 2018-03-13 13:56 ` Andrea Parri
  1 sibling, 0 replies; 13+ messages in thread
From: Andrea Parri @ 2018-03-13 13:56 UTC (permalink / raw)
  To: Alan Stern
  Cc: LKMM Maintainers -- Akira Yokosawa, Boqun Feng, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Paul E. McKenney,
	Peter Zijlstra, Will Deacon, Kernel development list

On Wed, Feb 28, 2018 at 03:13:54PM -0500, Alan Stern wrote:
> This patch reorganizes the definition of rb in the Linux Kernel Memory
> Consistency Model.  The relation is now expressed in terms of
> rcu-fence, which consists of a sequence of gp and rscs links separated
> by rcu-link links, in which the number of occurrences of gp is >= the
> number of occurrences of rscs.
> 
> Arguments similar to those published in
> http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an
> inter-CPU strong fence.  Furthermore, the definition of rb in terms of
> rcu-fence is highly analogous to the definition of pb in terms of
> strong-fence, which can help explain why rcu-path expresses a form of
> temporal ordering.
> 
> This change should not affect the semantics of the memory model, just
> its internal organization.
> 
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

I like Boqun's suggestion of "reducing rcu-fence" and using "acyclic".

IIRC, some time ago we discussed "enlarging" hb and pb by defining them to
be transitively closed (and using "irreflexive" everywhere); however, this
resulted in slightly longer simulation times...

For this patch,

Reviewed-by: Andrea Parri <parri.andrea@gmail.com>

  Andrea


> 
> ---
> 
> v2: Rebase on top of the preceding patch which renames "link" to
> "rcu-link" and "rcu-path" to "rb".  Add back the missing "rec" keyword
> in the definition of rcu-fence.  Minor editing improvements in
> explanation.txt.
> 
> Index: usb-4.x/tools/memory-model/linux-kernel.cat
> ===================================================================
> --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> +++ usb-4.x/tools/memory-model/linux-kernel.cat
> @@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po?
>   *)
>  let rcu-link = hb* ; pb* ; prop
>  
> -(* Chains that affect the RCU grace-period guarantee *)
> -let gp-link = gp ; rcu-link
> -let rscs-link = rscs ; rcu-link
> -
>  (*
> - * A cycle containing at least as many grace periods as RCU read-side
> - * critical sections is forbidden.
> + * Any sequence containing at least as many grace periods as RCU read-side
> + * critical sections (joined by rcu-link) acts as a generalized strong fence.
>   *)
> -let rec rb =
> -	gp-link |
> -	(gp-link ; rscs-link) |
> -	(rscs-link ; gp-link) |
> -	(rb ; rb) |
> -	(gp-link ; rb ; rscs-link) |
> -	(rscs-link ; rb ; gp-link)
> +let rec rcu-fence = gp |
> +	(gp ; rcu-link ; rscs) |
> +	(rscs ; rcu-link ; gp) |
> +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> +	(rcu-fence ; rcu-link ; rcu-fence)
> +
> +(* rb orders instructions just as pb does *)
> +let rb = prop ; rcu-fence ; hb* ; pb*
>  
>  irreflexive rb as rcu
> +
> +(*
> + * The happens-before, propagation, and rcu constraints are all
> + * expressions of temporal ordering.  They could be replaced by
> + * a single constraint on an "executes-before" relation, xb:
> + *
> + * let xb = hb | pb | rb
> + * acyclic xb as executes-before
> + *)
> Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
> ===================================================================
> --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
> +++ usb-4.x/tools/memory-model/Documentation/explanation.txt
> @@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory C
>    19. AND THEN THERE WAS ALPHA
>    20. THE HAPPENS-BEFORE RELATION: hb
>    21. THE PROPAGATES-BEFORE RELATION: pb
> -  22. RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> +  22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
>    23. ODDS AND ENDS
>  
>  
> @@ -1451,8 +1451,8 @@ they execute means that it cannot have c
>  the content of the LKMM's "propagation" axiom.
>  
>  
> -RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> ----------------------------------------------------
> +RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
> +----------------------------------------------------
>  
>  RCU (Read-Copy-Update) is a powerful synchronization mechanism.  It
>  rests on two concepts: grace periods and read-side critical sections.
> @@ -1537,49 +1537,100 @@ relation, and the details don't matter u
>  a somewhat lengthy formal proof.  Pretty much all you need to know
>  about rcu-link is the information in the preceding paragraph.
>  
> -The LKMM goes on to define the gp-link and rscs-link relations.  They
> -bring grace periods and read-side critical sections into the picture,
> -in the following way:
> -
> -	E ->gp-link F means there is a synchronize_rcu() fence event S
> -	and an event X such that E ->po S, either S ->po X or S = X,
> -	and X ->rcu-link F.  In other words, E and F are linked by a
> -	grace period followed by an instance of rcu-link.
> -
> -	E ->rscs-link F means there is a critical section delimited by
> -	an rcu_read_lock() fence L and an rcu_read_unlock() fence U,
> -	and an event X such that E ->po U, either L ->po X or L = X,
> -	and X ->rcu-link F.  Roughly speaking, this says that some
> -	event in the same critical section as E is linked by rcu-link
> -	to F.
> +The LKMM also defines the gp and rscs relations.  They bring grace
> +periods and read-side critical sections into the picture, in the
> +following way:
> +
> +	E ->gp F means there is a synchronize_rcu() fence event S such
> +	that E ->po S and either S ->po F or S = F.  In simple terms,
> +	there is a grace period po-between E and F.
> +
> +	E ->rscs F means there is a critical section delimited by an
> +	rcu_read_lock() fence L and an rcu_read_unlock() fence U, such
> +	that E ->po U and either L ->po F or L = F.  You can think of
> +	this as saying that E and F are in the same critical section
> +	(in fact, it also allows E to be po-before the start of the
> +	critical section and F to be po-after the end).
>  
>  If we think of the rcu-link relation as standing for an extended
> -"before", then E ->gp-link F says that E executes before a grace
> -period which ends before F executes.  (In fact it covers more than
> -this, because it also includes cases where E executes before a grace
> -period and some store propagates to F's CPU before F executes and
> -doesn't propagate to some other CPU until after the grace period
> -ends.)  Similarly, E ->rscs-link F says that E is part of (or before
> -the start of) a critical section which starts before F executes.
> +"before", then X ->gp Y ->rcu-link Z says that X executes before a
> +grace period which ends before Z executes.  (In fact it covers more
> +than this, because it also includes cases where X executes before a
> +grace period and some store propagates to Z's CPU before Z executes
> +but doesn't propagate to some other CPU until after the grace period
> +ends.)  Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or
> +before the start of) a critical section which starts before Z
> +executes.
> +
> +The LKMM goes on to define the rcu-fence relation as a sequence of gp
> +and rscs links separated by rcu-link links, in which the number of gp
> +links is >= the number of rscs links.  For example:
> +
> +	X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
> +
> +would imply that X ->rcu-fence V, because this sequence contains two
> +gp links and only one rscs link.  (It also implies that X ->rcu-fence T
> +and Z ->rcu-fence V.)  On the other hand:
> +
> +	X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
> +
> +does not imply X ->rcu-fence V, because the sequence contains only
> +one gp link but two rscs links.
> +
> +The rcu-fence relation is important because the Grace Period Guarantee
> +means that rcu-fence acts kind of like a strong fence.  In particular,
> +if W is a write and we have W ->rcu-fence Z, the Guarantee says that W
> +will propagate to every CPU before Z executes.
> +
> +To prove this in full generality requires some intellectual effort.
> +We'll consider just a very simple case:
> +
> +	W ->gp X ->rcu-link Y ->rscs Z.
> +
> +This formula means that there is a grace period G and a critical
> +section C such that:
> +
> +	1. W is po-before G;
> +
> +	2. X is equal to or po-after G;
> +
> +	3. X comes "before" Y in some sense;
> +
> +	4. Y is po-before the end of C;
> +
> +	5. Z is equal to or po-after the start of C.
> +
> +From 2 - 4 we deduce that the grace period G ends before the critical
> +section C does.  Then the second part of the Grace Period Guarantee says
> +not only that G starts before C does, but also that W (which executes
> +on G's CPU before G starts) must propagate to every CPU before C
> +starts.  In particular, W propagates to every CPU before Z executes
> +(or finishes executing, in the case where Z is equal to the
> +rcu_read_lock() fence event which starts C).  This sort of reasoning
> +can be expanded to handle all the situations covered by rcu-fence.
> +
> +Finally, the LKMM defines the RCU-before (rb) relation in terms of
> +rcu-fence.  This is done in essentially the same way as the pb
> +relation was defined in terms of strong-fence.  We will omit the
> +details; the end result is that E ->rb F implies E must execute before
> +F, just as E ->pb F does (and for much the same reasons).
>  
>  Putting this all together, the LKMM expresses the Grace Period
> -Guarantee by requiring that there are no cycles consisting of gp-link
> -and rscs-link links in which the number of gp-link instances is >= the
> -number of rscs-link instances.  It does this by defining the rb
> -relation to link events E and F whenever it is possible to pass from E
> -to F by a sequence of gp-link and rscs-link links with at least as
> -many of the former as the latter.  The LKMM's "rcu" axiom then says
> -that there are no events E with E ->rb E.
> -
> -Justifying this axiom takes some intellectual effort, but it is in
> -fact a valid formalization of the Grace Period Guarantee.  We won't
> -attempt to go through the detailed argument, but the following
> -analysis gives a taste of what is involved.  Suppose we have a
> -violation of the first part of the Guarantee: A critical section
> -starts before a grace period, and some store propagates to the
> -critical section's CPU before the end of the critical section but
> -doesn't propagate to some other CPU until after the end of the grace
> -period.
> +Guarantee by requiring that the rb relation does not contain a cycle.
> +Equivalently, this "rcu" axiom requires that there are no events E and
> +F with E ->rcu-link F ->rcu-fence E.  Or to put it a third way, the
> +axiom requires that there are no cycles consisting of gp and rscs
> +alternating with rcu-link, where the number of gp links is >= the
> +number of rscs links.
> +
> +Justifying the axiom isn't easy, but it is in fact a valid
> +formalization of the Grace Period Guarantee.  We won't attempt to go
> +through the detailed argument, but the following analysis gives a
> +taste of what is involved.  Suppose we have a violation of the first
> +part of the Guarantee: A critical section starts before a grace
> +period, and some store propagates to the critical section's CPU before
> +the end of the critical section but doesn't propagate to some other
> +CPU until after the end of the grace period.
>  
>  Putting symbols to these ideas, let L and U be the rcu_read_lock() and
>  rcu_read_unlock() fence events delimiting the critical section in
> @@ -1606,11 +1657,14 @@ by rcu-link, yielding:
>  
>  	S ->po X ->rcu-link Z ->po U.
>  
> -The formulas say that S is po-between F and X, hence F ->gp-link Z
> -via X.  They also say that Z comes before the end of the critical
> -section and E comes after its start, hence Z ->rscs-link F via E.  But
> -now we have a forbidden cycle: F ->gp-link Z ->rscs-link F.  Thus the
> -"rcu" axiom rules out this violation of the Grace Period Guarantee.
> +The formulas say that S is po-between F and X, hence F ->gp X.  They
> +also say that Z comes before the end of the critical section and E
> +comes after its start, hence Z ->rscs E.  From all this we obtain:
> +
> +	F ->gp X ->rcu-link Z ->rscs E ->rcu-link F,
> +
> +a forbidden cycle.  Thus the "rcu" axiom rules out this violation of
> +the Grace Period Guarantee.
>  
>  For something a little more down-to-earth, let's see how the axiom
>  works out in practice.  Consider the RCU code example from above, this
> @@ -1639,15 +1693,15 @@ time with statement labels added to the
>  If r2 = 0 at the end then P0's store at X overwrites the value that
>  P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X.
>  In addition, there is a synchronize_rcu() between Y and Z, so therefore
> -we have Y ->gp-link X.
> +we have Y ->gp Z.
>  
>  If r1 = 1 at the end then P1's load at Y reads from P0's store at W,
>  so we have W ->rcu-link Y.  In addition, W and X are in the same critical
> -section, so therefore we have X ->rscs-link Y.
> +section, so therefore we have X ->rscs W.
>  
> -This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link
> -and one rscs-link, violating the "rcu" axiom.  Hence the outcome is
> -not allowed by the LKMM, as we would expect.
> +Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle,
> +violating the "rcu" axiom.  Hence the outcome is not allowed by the
> +LKMM, as we would expect.
>  
>  For contrast, let's see what can happen in a more complicated example:
>  
> @@ -1683,15 +1737,11 @@ For contrast, let's see what can happen
>  	}
>  
>  If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows
> -that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W
> -via V.  And just as before, this gives a cycle:
> -
> -	W ->rscs-link Y ->gp-link U ->rscs-link W.
> -
> -However, this cycle has fewer gp-link instances than rscs-link
> -instances, and consequently the outcome is not forbidden by the LKMM.
> -The following instruction timing diagram shows how it might actually
> -occur:
> +that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W.
> +However this cycle is not forbidden, because the sequence of relations
> +contains fewer instances of gp (one) than of rscs (two).  Consequently
> +the outcome is allowed by the LKMM.  The following instruction timing
> +diagram shows how it might actually occur:
>  
>  P0			P1			P2
>  --------------------	--------------------	--------------------
> 
> 

Thread overview: 13+ messages
2018-02-28 20:13 [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence Alan Stern
2018-03-01  1:55 ` Boqun Feng
2018-03-01  4:49   ` Paul E. McKenney
2018-03-01  8:39     ` Boqun Feng
2018-03-01 14:28       ` Paul E. McKenney
2018-03-01 15:49   ` Alan Stern
2018-03-01 17:49     ` Paul E. McKenney
2018-03-01 18:37       ` Paul E. McKenney
2018-03-02  4:31         ` Boqun Feng
2018-03-02  4:50           ` Paul E. McKenney
2018-03-02 15:17             ` Alan Stern
2018-03-02 17:38               ` Paul E. McKenney
2018-03-13 13:56 ` Andrea Parri
