* [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence @ 2018-02-28 20:13 Alan Stern 2018-03-01 1:55 ` Boqun Feng 2018-03-13 13:56 ` Andrea Parri 0 siblings, 2 replies; 13+ messages in thread From: Alan Stern @ 2018-02-28 20:13 UTC (permalink / raw) To: LKMM Maintainers -- Akira Yokosawa, Andrea Parri, Boqun Feng, David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin, Paul E. McKenney, Peter Zijlstra, Will Deacon Cc: Kernel development list This patch reorganizes the definition of rb in the Linux Kernel Memory Consistency Model. The relation is now expressed in terms of rcu-fence, which consists of a sequence of gp and rscs links separated by rcu-link links, in which the number of occurrences of gp is >= the number of occurrences of rscs. Arguments similar to those published in http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an inter-CPU strong fence. Furthermore, the definition of rb in terms of rcu-fence is highly analogous to the definition of pb in terms of strong-fence, which can help explain why rcu-path expresses a form of temporal ordering. This change should not affect the semantics of the memory model, just its internal organization. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> --- v2: Rebase on top of the preceding patch which renames "link" to "rcu-link" and "rcu-path" to "rb". Add back the missing "rec" keyword in the definition of rcu-fence. Minor editing improvements in explanation.txt. Index: usb-4.x/tools/memory-model/linux-kernel.cat =================================================================== --- usb-4.x.orig/tools/memory-model/linux-kernel.cat +++ usb-4.x/tools/memory-model/linux-kernel.cat @@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po? 
*) let rcu-link = hb* ; pb* ; prop -(* Chains that affect the RCU grace-period guarantee *) -let gp-link = gp ; rcu-link -let rscs-link = rscs ; rcu-link - (* - * A cycle containing at least as many grace periods as RCU read-side - * critical sections is forbidden. + * Any sequence containing at least as many grace periods as RCU read-side + * critical sections (joined by rcu-link) acts as a generalized strong fence. *) -let rec rb = - gp-link | - (gp-link ; rscs-link) | - (rscs-link ; gp-link) | - (rb ; rb) | - (gp-link ; rb ; rscs-link) | - (rscs-link ; rb ; gp-link) +let rec rcu-fence = gp | + (gp ; rcu-link ; rscs) | + (rscs ; rcu-link ; gp) | + (gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) | + (rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) | + (rcu-fence ; rcu-link ; rcu-fence) + +(* rb orders instructions just as pb does *) +let rb = prop ; rcu-fence ; hb* ; pb* irreflexive rb as rcu + +(* + * The happens-before, propagation, and rcu constraints are all + * expressions of temporal ordering. They could be replaced by + * a single constraint on an "executes-before" relation, xb: + * + * let xb = hb | pb | rb + * acyclic xb as executes-before + *) Index: usb-4.x/tools/memory-model/Documentation/explanation.txt =================================================================== --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt +++ usb-4.x/tools/memory-model/Documentation/explanation.txt @@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory C 19. AND THEN THERE WAS ALPHA 20. THE HAPPENS-BEFORE RELATION: hb 21. THE PROPAGATES-BEFORE RELATION: pb - 22. RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb + 22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb 23. ODDS AND ENDS @@ -1451,8 +1451,8 @@ they execute means that it cannot have c the content of the LKMM's "propagation" axiom. 
-RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb ---------------------------------------------------- +RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb +---------------------------------------------------- RCU (Read-Copy-Update) is a powerful synchronization mechanism. It rests on two concepts: grace periods and read-side critical sections. @@ -1537,49 +1537,100 @@ relation, and the details don't matter u a somewhat lengthy formal proof. Pretty much all you need to know about rcu-link is the information in the preceding paragraph. -The LKMM goes on to define the gp-link and rscs-link relations. They -bring grace periods and read-side critical sections into the picture, -in the following way: - - E ->gp-link F means there is a synchronize_rcu() fence event S - and an event X such that E ->po S, either S ->po X or S = X, - and X ->rcu-link F. In other words, E and F are linked by a - grace period followed by an instance of rcu-link. - - E ->rscs-link F means there is a critical section delimited by - an rcu_read_lock() fence L and an rcu_read_unlock() fence U, - and an event X such that E ->po U, either L ->po X or L = X, - and X ->rcu-link F. Roughly speaking, this says that some - event in the same critical section as E is linked by rcu-link - to F. +The LKMM also defines the gp and rscs relations. They bring grace +periods and read-side critical sections into the picture, in the +following way: + + E ->gp F means there is a synchronize_rcu() fence event S such + that E ->po S and either S ->po F or S = F. In simple terms, + there is a grace period po-between E and F. + + E ->rscs F means there is a critical section delimited by an + rcu_read_lock() fence L and an rcu_read_unlock() fence U, such + that E ->po U and either L ->po F or L = F. You can think of + this as saying that E and F are in the same critical section + (in fact, it also allows E to be po-before the start of the + critical section and F to be po-after the end). 
If we think of the rcu-link relation as standing for an extended -"before", then E ->gp-link F says that E executes before a grace -period which ends before F executes. (In fact it covers more than -this, because it also includes cases where E executes before a grace -period and some store propagates to F's CPU before F executes and -doesn't propagate to some other CPU until after the grace period -ends.) Similarly, E ->rscs-link F says that E is part of (or before -the start of) a critical section which starts before F executes. +"before", then X ->gp Y ->rcu-link Z says that X executes before a +grace period which ends before Z executes. (In fact it covers more +than this, because it also includes cases where X executes before a +grace period and some store propagates to Z's CPU before Z executes +but doesn't propagate to some other CPU until after the grace period +ends.) Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or +before the start of) a critical section which starts before Z +executes. + +The LKMM goes on to define the rcu-fence relation as a sequence of gp +and rscs links separated by rcu-link links, in which the number of gp +links is >= the number of rscs links. For example: + + X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V + +would imply that X ->rcu-fence V, because this sequence contains two +gp links and only one rscs link. (It also implies that X ->rcu-fence T +and Z ->rcu-fence V.) On the other hand: + + X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V + +does not imply X ->rcu-fence V, because the sequence contains only +one gp link but two rscs links. + +The rcu-fence relation is important because the Grace Period Guarantee +means that rcu-fence acts kind of like a strong fence. In particular, +if W is a write and we have W ->rcu-fence Z, the Guarantee says that W +will propagate to every CPU before Z executes. + +To prove this in full generality requires some intellectual effort. 
+We'll consider just a very simple case: + + W ->gp X ->rcu-link Y ->rscs Z. + +This formula means that there is a grace period G and a critical +section C such that: + + 1. W is po-before G; + + 2. X is equal to or po-after G; + + 3. X comes "before" Y in some sense; + + 4. Y is po-before the end of C; + + 5. Z is equal to or po-after the start of C. + +From 2 - 4 we deduce that the grace period G ends before the critical +section C. Then the second part of the Grace Period Guarantee says +not only that G starts before C does, but also that W (which executes +on G's CPU before G starts) must propagate to every CPU before C +starts. In particular, W propagates to every CPU before Z executes +(or finishes executing, in the case where Z is equal to the +rcu_read_lock() fence event which starts C.) This sort of reasoning +can be expanded to handle all the situations covered by rcu-fence. + +Finally, the LKMM defines the RCU-before (rb) relation in terms of +rcu-fence. This is done in essentially the same way as the pb +relation was defined in terms of strong-fence. We will omit the +details; the end result is that E ->rb F implies E must execute before +F, just as E ->pb F does (and for much the same reasons). Putting this all together, the LKMM expresses the Grace Period -Guarantee by requiring that there are no cycles consisting of gp-link -and rscs-link links in which the number of gp-link instances is >= the -number of rscs-link instances. It does this by defining the rb -relation to link events E and F whenever it is possible to pass from E -to F by a sequence of gp-link and rscs-link links with at least as -many of the former as the latter. The LKMM's "rcu" axiom then says -that there are no events E with E ->rb E. - -Justifying this axiom takes some intellectual effort, but it is in -fact a valid formalization of the Grace Period Guarantee. We won't -attempt to go through the detailed argument, but the following -analysis gives a taste of what is involved. 
Suppose we have a -violation of the first part of the Guarantee: A critical section -starts before a grace period, and some store propagates to the -critical section's CPU before the end of the critical section but -doesn't propagate to some other CPU until after the end of the grace -period. +Guarantee by requiring that the rb relation does not contain a cycle. +Equivalently, this "rcu" axiom requires that there are no events E and +F with E ->rcu-link F ->rcu-fence E. Or to put it a third way, the +axiom requires that there are no cycles consisting of gp and rscs +alternating with rcu-link, where the number of gp links is >= the +number of rscs links. + +Justifying the axiom isn't easy, but it is in fact a valid +formalization of the Grace Period Guarantee. We won't attempt to go +through the detailed argument, but the following analysis gives a +taste of what is involved. Suppose we have a violation of the first +part of the Guarantee: A critical section starts before a grace +period, and some store propagates to the critical section's CPU before +the end of the critical section but doesn't propagate to some other +CPU until after the end of the grace period. Putting symbols to these ideas, let L and U be the rcu_read_lock() and rcu_read_unlock() fence events delimiting the critical section in @@ -1606,11 +1657,14 @@ by rcu-link, yielding: S ->po X ->rcu-link Z ->po U. -The formulas say that S is po-between F and X, hence F ->gp-link Z -via X. They also say that Z comes before the end of the critical -section and E comes after its start, hence Z ->rscs-link F via E. But -now we have a forbidden cycle: F ->gp-link Z ->rscs-link F. Thus the -"rcu" axiom rules out this violation of the Grace Period Guarantee. +The formulas say that S is po-between F and X, hence F ->gp X. They +also say that Z comes before the end of the critical section and E +comes after its start, hence Z ->rscs E. 
From all this we obtain: + + F ->gp X ->rcu-link Z ->rscs E ->rcu-link F, + +a forbidden cycle. Thus the "rcu" axiom rules out this violation of +the Grace Period Guarantee. For something a little more down-to-earth, let's see how the axiom works out in practice. Consider the RCU code example from above, this @@ -1639,15 +1693,15 @@ time with statement labels added to the If r2 = 0 at the end then P0's store at X overwrites the value that P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X. In addition, there is a synchronize_rcu() between Y and Z, so therefore -we have Y ->gp-link X. +we have Y ->gp Z. If r1 = 1 at the end then P1's load at Y reads from P0's store at W, so we have W ->rcu-link Y. In addition, W and X are in the same critical -section, so therefore we have X ->rscs-link Y. +section, so therefore we have X ->rscs W. -This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link -and one rscs-link, violating the "rcu" axiom. Hence the outcome is -not allowed by the LKMM, as we would expect. +Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle, +violating the "rcu" axiom. Hence the outcome is not allowed by the +LKMM, as we would expect. For contrast, let's see what can happen in a more complicated example: @@ -1683,15 +1737,11 @@ For contrast, let's see what can happen } If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows -that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W -via V. And just as before, this gives a cycle: - - W ->rscs-link Y ->gp-link U ->rscs-link W. - -However, this cycle has fewer gp-link instances than rscs-link -instances, and consequently the outcome is not forbidden by the LKMM. -The following instruction timing diagram shows how it might actually -occur: +that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W. 
+However this cycle is not forbidden, because the sequence of relations +contains fewer instances of gp (one) than of rscs (two). Consequently +the outcome is allowed by the LKMM. The following instruction timing +diagram shows how it might actually occur: P0 P1 P2 -------------------- -------------------- -------------------- ^ permalink raw reply [flat|nested] 13+ messages in thread
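As an illustration only (not part of the patch): the counting rule that explanation.txt gives for rcu-fence, "a sequence of gp and rscs links separated by rcu-link links, in which the number of gp links is >= the number of rscs links", can be sketched in Python. The function names below are hypothetical. The sketch looks only at the labels of the links; it does not model events, rcu-link itself, or anything that herd computes. The second function is a label-level transcription of the six alternatives in the recursive definition of rcu-fence, so that the two descriptions can be compared on short sequences.

```python
from functools import lru_cache
from itertools import product

LABELS = ("gp", "rscs")


def counts_as_rcu_fence(links):
    """Counting rule: at least as many gp links as rscs links.

    `links` lists only the gp/rscs labels; the rcu-link steps joining
    consecutive labels are implied and not modeled here.
    """
    links = tuple(links)
    if not links or any(label not in LABELS for label in links):
        raise ValueError("expected a non-empty sequence of 'gp'/'rscs'")
    return links.count("gp") >= links.count("rscs")


def matches_recursive_definition(links):
    """Label-level transcription of the six alternatives of rcu-fence."""
    links = tuple(links)

    @lru_cache(maxsize=None)
    def fence(i, j):
        seg = links[i:j]
        if seg == ("gp",):
            return True  # gp
        if seg in (("gp", "rscs"), ("rscs", "gp")):
            return True  # gp ; rcu-link ; rscs, and its mirror image
        if len(seg) >= 3 and {seg[0], seg[-1]} == set(LABELS) and fence(i + 1, j - 1):
            return True  # gp ; rcu-link ; rcu-fence ; rcu-link ; rscs, and mirror
        # rcu-fence ; rcu-link ; rcu-fence, trying every split point
        return any(fence(i, k) and fence(k, j) for k in range(i + 1, j))

    return fence(0, len(links))


def cross_check(max_len):
    """Compare both functions on every label sequence up to max_len."""
    return all(
        counts_as_rcu_fence(seq) == matches_recursive_definition(seq)
        for n in range(1, max_len + 1)
        for seq in product(LABELS, repeat=n)
    )


if __name__ == "__main__":
    # X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
    print(counts_as_rcu_fence(["gp", "rscs", "gp"]))
    # X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
    print(counts_as_rcu_fence(["rscs", "rscs", "gp"]))
    print(cross_check(8))
```

Applied to the two litmus-test cycles discussed in explanation.txt, the labels ["rscs", "gp"] satisfy the rule (one gp, one rscs), matching the forbidden outcome, while ["rscs", "gp", "rscs"] do not (one gp, two rscs), matching the allowed outcome.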
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence 2018-02-28 20:13 [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence Alan Stern @ 2018-03-01 1:55 ` Boqun Feng 2018-03-01 4:49 ` Paul E. McKenney 2018-03-01 15:49 ` Alan Stern 2018-03-13 13:56 ` Andrea Parri 1 sibling, 2 replies; 13+ messages in thread From: Boqun Feng @ 2018-03-01 1:55 UTC (permalink / raw) To: Alan Stern Cc: LKMM Maintainers -- Akira Yokosawa, Andrea Parri, David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin, Paul E. McKenney, Peter Zijlstra, Will Deacon, Kernel development list [-- Attachment #1: Type: text/plain, Size: 15299 bytes --] On Wed, Feb 28, 2018 at 03:13:54PM -0500, Alan Stern wrote: > This patch reorganizes the definition of rb in the Linux Kernel Memory > Consistency Model. The relation is now expressed in terms of > rcu-fence, which consists of a sequence of gp and rscs links separated > by rcu-link links, in which the number of occurrences of gp is >= the > number of occurrences of rscs. > > Arguments similar to those published in > http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an > inter-CPU strong fence. Furthermore, the definition of rb in terms of > rcu-fence is highly analogous to the definition of pb in terms of > strong-fence, which can help explain why rcu-path expresses a form of > temporal ordering. > > This change should not affect the semantics of the memory model, just > its internal organization. > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu> > > --- > > v2: Rebase on top of the preceding patch which renames "link" to > "rcu-link" and "rcu-path" to "rb". Add back the missing "rec" keyword > in the definition of rcu-fence. Minor editing improvements in > explanation.txt. 
>
> Index: usb-4.x/tools/memory-model/linux-kernel.cat
> ===================================================================
> --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> +++ usb-4.x/tools/memory-model/linux-kernel.cat
> @@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po?
>   *)
>  let rcu-link = hb* ; pb* ; prop
>
> -(* Chains that affect the RCU grace-period guarantee *)
> -let gp-link = gp ; rcu-link
> -let rscs-link = rscs ; rcu-link
> -
>  (*
> - * A cycle containing at least as many grace periods as RCU read-side
> - * critical sections is forbidden.
> + * Any sequence containing at least as many grace periods as RCU read-side
> + * critical sections (joined by rcu-link) acts as a generalized strong fence.
>   *)
> -let rec rb =
> -	gp-link |
> -	(gp-link ; rscs-link) |
> -	(rscs-link ; gp-link) |
> -	(rb ; rb) |
> -	(gp-link ; rb ; rscs-link) |
> -	(rscs-link ; rb ; gp-link)
> +let rec rcu-fence = gp |
> +	(gp ; rcu-link ; rscs) |
> +	(rscs ; rcu-link ; gp) |
> +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> +	(rcu-fence ; rcu-link ; rcu-fence)
> +
> +(* rb orders instructions just as pb does *)
> +let rb = prop ; rcu-fence ; hb* ; pb*
>
>  irreflexive rb as rcu

I wonder whether we can simplify things as:

	let rec rcu-fence =
		(gp; rcu-link; rscs) |
		(rscs; rcu-link; gp) |
		(gp; rcu-link; rcu-fence; rcu-link; rscs) |
		(rscs; rcu-link; rcu-fence; rcu-link; gp)

	(* gp and rcu-fence; rcu-link; rcu-fence removed *)

	let rb = prop; rcu-fence; hb*; pb*

	acyclic rb as rcu

In this way, "rcu-fence" is defined as "any sequence containing as many
grace periods as RCU read-side critical sections (joined by rcu-link)."
Note that "rcu-link" contains "gp", so we don't miss the case where
there are more grace periods. And since we use "acyclic" now, we don't
need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.

I prefer this because we already treat "gp" as "strong-fence", which
already is an "rcu-link".
Also, recursively extending rcu-fence with
itself is exactly calculating the transitive closure, which we can avoid
by using an "acyclic" rule. Besides, it looks more consistent with hb and
pb.

Thoughts?

Regards,
Boqun

> +
> +(*
> + * The happens-before, propagation, and rcu constraints are all
> + * expressions of temporal ordering.  They could be replaced by
> + * a single constraint on an "executes-before" relation, xb:
> + *
> + *	let xb = hb | pb | rb
> + *	acyclic xb as executes-before
> + *)
> Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
> ===================================================================
> --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
> +++ usb-4.x/tools/memory-model/Documentation/explanation.txt
> @@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory C
>    19. AND THEN THERE WAS ALPHA
>    20. THE HAPPENS-BEFORE RELATION: hb
>    21. THE PROPAGATES-BEFORE RELATION: pb
> -  22. RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> +  22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
>    23. ODDS AND ENDS
>
>
> @@ -1451,8 +1451,8 @@ they execute means that it cannot have c
>  the content of the LKMM's "propagation" axiom.
>
>
> -RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> ----------------------------------------------------
> +RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
> +----------------------------------------------------
>
>  RCU (Read-Copy-Update) is a powerful synchronization mechanism.  It
>  rests on two concepts: grace periods and read-side critical sections.
> @@ -1537,49 +1537,100 @@ relation, and the details don't matter u
>  a somewhat lengthy formal proof.  Pretty much all you need to know
>  about rcu-link is the information in the preceding paragraph.
>
> -The LKMM goes on to define the gp-link and rscs-link relations.
They > -bring grace periods and read-side critical sections into the picture, > -in the following way: > - > - E ->gp-link F means there is a synchronize_rcu() fence event S > - and an event X such that E ->po S, either S ->po X or S = X, > - and X ->rcu-link F. In other words, E and F are linked by a > - grace period followed by an instance of rcu-link. > - > - E ->rscs-link F means there is a critical section delimited by > - an rcu_read_lock() fence L and an rcu_read_unlock() fence U, > - and an event X such that E ->po U, either L ->po X or L = X, > - and X ->rcu-link F. Roughly speaking, this says that some > - event in the same critical section as E is linked by rcu-link > - to F. > +The LKMM also defines the gp and rscs relations. They bring grace > +periods and read-side critical sections into the picture, in the > +following way: > + > + E ->gp F means there is a synchronize_rcu() fence event S such > + that E ->po S and either S ->po F or S = F. In simple terms, > + there is a grace period po-between E and F. > + > + E ->rscs F means there is a critical section delimited by an > + rcu_read_lock() fence L and an rcu_read_unlock() fence U, such > + that E ->po U and either L ->po F or L = F. You can think of > + this as saying that E and F are in the same critical section > + (in fact, it also allows E to be po-before the start of the > + critical section and F to be po-after the end). > > If we think of the rcu-link relation as standing for an extended > -"before", then E ->gp-link F says that E executes before a grace > -period which ends before F executes. (In fact it covers more than > -this, because it also includes cases where E executes before a grace > -period and some store propagates to F's CPU before F executes and > -doesn't propagate to some other CPU until after the grace period > -ends.) Similarly, E ->rscs-link F says that E is part of (or before > -the start of) a critical section which starts before F executes. 
> +"before", then X ->gp Y ->rcu-link Z says that X executes before a > +grace period which ends before Z executes. (In fact it covers more > +than this, because it also includes cases where X executes before a > +grace period and some store propagates to Z's CPU before Z executes > +but doesn't propagate to some other CPU until after the grace period > +ends.) Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or > +before the start of) a critical section which starts before Z > +executes. > + > +The LKMM goes on to define the rcu-fence relation as a sequence of gp > +and rscs links separated by rcu-link links, in which the number of gp > +links is >= the number of rscs links. For example: > + > + X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V > + > +would imply that X ->rcu-fence V, because this sequence contains two > +gp links and only one rscs link. (It also implies that X ->rcu-fence T > +and Z ->rcu-fence V.) On the other hand: > + > + X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V > + > +does not imply X ->rcu-fence V, because the sequence contains only > +one gp link but two rscs links. > + > +The rcu-fence relation is important because the Grace Period Guarantee > +means that rcu-fence acts kind of like a strong fence. In particular, > +if W is a write and we have W ->rcu-fence Z, the Guarantee says that W > +will propagate to every CPU before Z executes. > + > +To prove this in full generality requires some intellectual effort. > +We'll consider just a very simple case: > + > + W ->gp X ->rcu-link Y ->rscs Z. > + > +This formula means that there is a grace period G and a critical > +section C such that: > + > + 1. W is po-before G; > + > + 2. X is equal to or po-after G; > + > + 3. X comes "before" Y in some sense; > + > + 4. Y is po-before the end of C; > + > + 5. Z is equal to or po-after the start of C. > + > +From 2 - 4 we deduce that the grace period G ends before the critical > +section C. 
Then the second part of the Grace Period Guarantee says > +not only that G starts before C does, but also that W (which executes > +on G's CPU before G starts) must propagate to every CPU before C > +starts. In particular, W propagates to every CPU before Z executes > +(or finishes executing, in the case where Z is equal to the > +rcu_read_lock() fence event which starts C.) This sort of reasoning > +can be expanded to handle all the situations covered by rcu-fence. > + > +Finally, the LKMM defines the RCU-before (rb) relation in terms of > +rcu-fence. This is done in essentially the same way as the pb > +relation was defined in terms of strong-fence. We will omit the > +details; the end result is that E ->rb F implies E must execute before > +F, just as E ->pb F does (and for much the same reasons). > > Putting this all together, the LKMM expresses the Grace Period > -Guarantee by requiring that there are no cycles consisting of gp-link > -and rscs-link links in which the number of gp-link instances is >= the > -number of rscs-link instances. It does this by defining the rb > -relation to link events E and F whenever it is possible to pass from E > -to F by a sequence of gp-link and rscs-link links with at least as > -many of the former as the latter. The LKMM's "rcu" axiom then says > -that there are no events E with E ->rb E. > - > -Justifying this axiom takes some intellectual effort, but it is in > -fact a valid formalization of the Grace Period Guarantee. We won't > -attempt to go through the detailed argument, but the following > -analysis gives a taste of what is involved. Suppose we have a > -violation of the first part of the Guarantee: A critical section > -starts before a grace period, and some store propagates to the > -critical section's CPU before the end of the critical section but > -doesn't propagate to some other CPU until after the end of the grace > -period. > +Guarantee by requiring that the rb relation does not contain a cycle. 
> +Equivalently, this "rcu" axiom requires that there are no events E and > +F with E ->rcu-link F ->rcu-fence E. Or to put it a third way, the > +axiom requires that there are no cycles consisting of gp and rscs > +alternating with rcu-link, where the number of gp links is >= the > +number of rscs links. > + > +Justifying the axiom isn't easy, but it is in fact a valid > +formalization of the Grace Period Guarantee. We won't attempt to go > +through the detailed argument, but the following analysis gives a > +taste of what is involved. Suppose we have a violation of the first > +part of the Guarantee: A critical section starts before a grace > +period, and some store propagates to the critical section's CPU before > +the end of the critical section but doesn't propagate to some other > +CPU until after the end of the grace period. > > Putting symbols to these ideas, let L and U be the rcu_read_lock() and > rcu_read_unlock() fence events delimiting the critical section in > @@ -1606,11 +1657,14 @@ by rcu-link, yielding: > > S ->po X ->rcu-link Z ->po U. > > -The formulas say that S is po-between F and X, hence F ->gp-link Z > -via X. They also say that Z comes before the end of the critical > -section and E comes after its start, hence Z ->rscs-link F via E. But > -now we have a forbidden cycle: F ->gp-link Z ->rscs-link F. Thus the > -"rcu" axiom rules out this violation of the Grace Period Guarantee. > +The formulas say that S is po-between F and X, hence F ->gp X. They > +also say that Z comes before the end of the critical section and E > +comes after its start, hence Z ->rscs E. From all this we obtain: > + > + F ->gp X ->rcu-link Z ->rscs E ->rcu-link F, > + > +a forbidden cycle. Thus the "rcu" axiom rules out this violation of > +the Grace Period Guarantee. > > For something a little more down-to-earth, let's see how the axiom > works out in practice. 
Consider the RCU code example from above, this > @@ -1639,15 +1693,15 @@ time with statement labels added to the > If r2 = 0 at the end then P0's store at X overwrites the value that > P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X. > In addition, there is a synchronize_rcu() between Y and Z, so therefore > -we have Y ->gp-link X. > +we have Y ->gp Z. > > If r1 = 1 at the end then P1's load at Y reads from P0's store at W, > so we have W ->rcu-link Y. In addition, W and X are in the same critical > -section, so therefore we have X ->rscs-link Y. > +section, so therefore we have X ->rscs W. > > -This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link > -and one rscs-link, violating the "rcu" axiom. Hence the outcome is > -not allowed by the LKMM, as we would expect. > +Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle, > +violating the "rcu" axiom. Hence the outcome is not allowed by the > +LKMM, as we would expect. > > For contrast, let's see what can happen in a more complicated example: > > @@ -1683,15 +1737,11 @@ For contrast, let's see what can happen > } > > If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows > -that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W > -via V. And just as before, this gives a cycle: > - > - W ->rscs-link Y ->gp-link U ->rscs-link W. > - > -However, this cycle has fewer gp-link instances than rscs-link > -instances, and consequently the outcome is not forbidden by the LKMM. > -The following instruction timing diagram shows how it might actually > -occur: > +that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W. > +However this cycle is not forbidden, because the sequence of relations > +contains fewer instances of gp (one) than of rscs (two). Consequently > +the outcome is allowed by the LKMM. 
The following instruction timing > +diagram shows how it might actually occur: > > P0 P1 P2 > -------------------- -------------------- -------------------- > > [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence 2018-03-01 1:55 ` Boqun Feng @ 2018-03-01 4:49 ` Paul E. McKenney 2018-03-01 8:39 ` Boqun Feng 2018-03-01 15:49 ` Alan Stern 1 sibling, 1 reply; 13+ messages in thread From: Paul E. McKenney @ 2018-03-01 4:49 UTC (permalink / raw) To: Boqun Feng Cc: Alan Stern, LKMM Maintainers -- Akira Yokosawa, Andrea Parri, David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon, Kernel development list On Thu, Mar 01, 2018 at 09:55:31AM +0800, Boqun Feng wrote: > On Wed, Feb 28, 2018 at 03:13:54PM -0500, Alan Stern wrote: > > This patch reorganizes the definition of rb in the Linux Kernel Memory > > Consistency Model. The relation is now expressed in terms of > > rcu-fence, which consists of a sequence of gp and rscs links separated > > by rcu-link links, in which the number of occurrences of gp is >= the > > number of occurrences of rscs. > > > > Arguments similar to those published in > > http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an > > inter-CPU strong fence. Furthermore, the definition of rb in terms of > > rcu-fence is highly analogous to the definition of pb in terms of > > strong-fence, which can help explain why rcu-path expresses a form of > > temporal ordering. > > > > This change should not affect the semantics of the memory model, just > > its internal organization. > > > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu> > > > > --- > > > > v2: Rebase on top of the preceding patch which renames "link" to > > "rcu-link" and "rcu-path" to "rb". Add back the missing "rec" keyword > > in the definition of rcu-fence. Minor editing improvements in > > explanation.txt. 
> >
> > Index: usb-4.x/tools/memory-model/linux-kernel.cat
> > ===================================================================
> > --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> > +++ usb-4.x/tools/memory-model/linux-kernel.cat
> > @@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po?
> >   *)
> >  let rcu-link = hb* ; pb* ; prop
> >
> > -(* Chains that affect the RCU grace-period guarantee *)
> > -let gp-link = gp ; rcu-link
> > -let rscs-link = rscs ; rcu-link
> > -
> >  (*
> > - * A cycle containing at least as many grace periods as RCU read-side
> > - * critical sections is forbidden.
> > + * Any sequence containing at least as many grace periods as RCU read-side
> > + * critical sections (joined by rcu-link) acts as a generalized strong fence.
> >   *)
> > -let rec rb =
> > -	gp-link |
> > -	(gp-link ; rscs-link) |
> > -	(rscs-link ; gp-link) |
> > -	(rb ; rb) |
> > -	(gp-link ; rb ; rscs-link) |
> > -	(rscs-link ; rb ; gp-link)
> > +let rec rcu-fence = gp |
> > +	(gp ; rcu-link ; rscs) |
> > +	(rscs ; rcu-link ; gp) |
> > +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > +	(rcu-fence ; rcu-link ; rcu-fence)
> > +
> > +(* rb orders instructions just as pb does *)
> > +let rb = prop ; rcu-fence ; hb* ; pb*
> >
> >  irreflexive rb as rcu
>
> I wonder whether we can simplify things as:
>
> 	let rec rcu-fence =
> 		(gp; rcu-link; rscs) |
> 		(rscs; rcu-link; gp) |
> 		(gp; rcu-link; rcu-fence; rcu-link; rscs) |
> 		(rscs; rcu-link; rcu-fence; rcu-link; gp)
>
> 	(* gp and rcu-fence; rcu-link; rcu-fence removed *)
>
> 	let rb = prop; rcu-fence; hb*; pb*
>
> 	acyclic rb as rcu
>
> In this way, "rcu-fence" is defined as "any sequence containing as many
> grace periods as RCU read-side critical sections (joined by rcu-link)."
> Note that "rcu-link" contains "gp", so we don't miss the case where
> there are more grace periods.
And since we use "acycle" now, so we don't
> need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.
>
> I prefer this because we already treat "gp" as "strong-fence", which
> already is a "rcu-link". Also, recursively extending rcu-fence with
> itself is exactly calculating the transitive closure, which we can avoid
> by using a "acycle" rule. Besides, it looks more consistent with hb and
> pb.

I don't have any opinions from an aesthetics viewpoint, but this change
does correctly handle the automatically generated tests. I do not see
any performance impact, if anything, about a 10% improvement based on
this 11-process RCU litmus test:

	auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-G.litmus

With the change, about 10.4 seconds, without, about 11.4 seconds.

I am not patient enough to try one of the really large ones, like this one:

	auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-R+RW-G.litmus

However, it is in my "litmus" github archive, so please feel free to
try it out. Though I would suggest working up from those of intermediate
length.

							Thanx, Paul

> Thoughts?
>
> Regards,
> Boqun
>
>
> > +
> > +(*
> > + * The happens-before, propagation, and rcu constraints are all
> > + * expressions of temporal ordering. They could be replaced by
> > + * a single constraint on an "executes-before" relation, xb:
> > + *
> > + * let xb = hb | pb | rb
> > + * acyclic xb as executes-before
> > + *)
> > Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
> > ===================================================================
> > --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
> > +++ usb-4.x/tools/memory-model/Documentation/explanation.txt
> > @@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory C
> >  19. AND THEN THERE WAS ALPHA
> >  20. THE HAPPENS-BEFORE RELATION: hb
> >  21. THE PROPAGATES-BEFORE RELATION: pb
> > - 22.
RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb > > + 22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb > > 23. ODDS AND ENDS > > > > > > @@ -1451,8 +1451,8 @@ they execute means that it cannot have c > > the content of the LKMM's "propagation" axiom. > > > > > > -RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb > > ---------------------------------------------------- > > +RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb > > +---------------------------------------------------- > > > > RCU (Read-Copy-Update) is a powerful synchronization mechanism. It > > rests on two concepts: grace periods and read-side critical sections. > > @@ -1537,49 +1537,100 @@ relation, and the details don't matter u > > a somewhat lengthy formal proof. Pretty much all you need to know > > about rcu-link is the information in the preceding paragraph. > > > > -The LKMM goes on to define the gp-link and rscs-link relations. They > > -bring grace periods and read-side critical sections into the picture, > > -in the following way: > > - > > - E ->gp-link F means there is a synchronize_rcu() fence event S > > - and an event X such that E ->po S, either S ->po X or S = X, > > - and X ->rcu-link F. In other words, E and F are linked by a > > - grace period followed by an instance of rcu-link. > > - > > - E ->rscs-link F means there is a critical section delimited by > > - an rcu_read_lock() fence L and an rcu_read_unlock() fence U, > > - and an event X such that E ->po U, either L ->po X or L = X, > > - and X ->rcu-link F. Roughly speaking, this says that some > > - event in the same critical section as E is linked by rcu-link > > - to F. > > +The LKMM also defines the gp and rscs relations. They bring grace > > +periods and read-side critical sections into the picture, in the > > +following way: > > + > > + E ->gp F means there is a synchronize_rcu() fence event S such > > + that E ->po S and either S ->po F or S = F. 
In simple terms, > > + there is a grace period po-between E and F. > > + > > + E ->rscs F means there is a critical section delimited by an > > + rcu_read_lock() fence L and an rcu_read_unlock() fence U, such > > + that E ->po U and either L ->po F or L = F. You can think of > > + this as saying that E and F are in the same critical section > > + (in fact, it also allows E to be po-before the start of the > > + critical section and F to be po-after the end). > > > > If we think of the rcu-link relation as standing for an extended > > -"before", then E ->gp-link F says that E executes before a grace > > -period which ends before F executes. (In fact it covers more than > > -this, because it also includes cases where E executes before a grace > > -period and some store propagates to F's CPU before F executes and > > -doesn't propagate to some other CPU until after the grace period > > -ends.) Similarly, E ->rscs-link F says that E is part of (or before > > -the start of) a critical section which starts before F executes. > > +"before", then X ->gp Y ->rcu-link Z says that X executes before a > > +grace period which ends before Z executes. (In fact it covers more > > +than this, because it also includes cases where X executes before a > > +grace period and some store propagates to Z's CPU before Z executes > > +but doesn't propagate to some other CPU until after the grace period > > +ends.) Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or > > +before the start of) a critical section which starts before Z > > +executes. > > + > > +The LKMM goes on to define the rcu-fence relation as a sequence of gp > > +and rscs links separated by rcu-link links, in which the number of gp > > +links is >= the number of rscs links. For example: > > + > > + X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V > > + > > +would imply that X ->rcu-fence V, because this sequence contains two > > +gp links and only one rscs link. 
(It also implies that X ->rcu-fence T > > +and Z ->rcu-fence V.) On the other hand: > > + > > + X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V > > + > > +does not imply X ->rcu-fence V, because the sequence contains only > > +one gp link but two rscs links. > > + > > +The rcu-fence relation is important because the Grace Period Guarantee > > +means that rcu-fence acts kind of like a strong fence. In particular, > > +if W is a write and we have W ->rcu-fence Z, the Guarantee says that W > > +will propagate to every CPU before Z executes. > > + > > +To prove this in full generality requires some intellectual effort. > > +We'll consider just a very simple case: > > + > > + W ->gp X ->rcu-link Y ->rscs Z. > > + > > +This formula means that there is a grace period G and a critical > > +section C such that: > > + > > + 1. W is po-before G; > > + > > + 2. X is equal to or po-after G; > > + > > + 3. X comes "before" Y in some sense; > > + > > + 4. Y is po-before the end of C; > > + > > + 5. Z is equal to or po-after the start of C. > > + > > +From 2 - 4 we deduce that the grace period G ends before the critical > > +section C. Then the second part of the Grace Period Guarantee says > > +not only that G starts before C does, but also that W (which executes > > +on G's CPU before G starts) must propagate to every CPU before C > > +starts. In particular, W propagates to every CPU before Z executes > > +(or finishes executing, in the case where Z is equal to the > > +rcu_read_lock() fence event which starts C.) This sort of reasoning > > +can be expanded to handle all the situations covered by rcu-fence. > > + > > +Finally, the LKMM defines the RCU-before (rb) relation in terms of > > +rcu-fence. This is done in essentially the same way as the pb > > +relation was defined in terms of strong-fence. We will omit the > > +details; the end result is that E ->rb F implies E must execute before > > +F, just as E ->pb F does (and for much the same reasons). 
> > > > Putting this all together, the LKMM expresses the Grace Period > > -Guarantee by requiring that there are no cycles consisting of gp-link > > -and rscs-link links in which the number of gp-link instances is >= the > > -number of rscs-link instances. It does this by defining the rb > > -relation to link events E and F whenever it is possible to pass from E > > -to F by a sequence of gp-link and rscs-link links with at least as > > -many of the former as the latter. The LKMM's "rcu" axiom then says > > -that there are no events E with E ->rb E. > > - > > -Justifying this axiom takes some intellectual effort, but it is in > > -fact a valid formalization of the Grace Period Guarantee. We won't > > -attempt to go through the detailed argument, but the following > > -analysis gives a taste of what is involved. Suppose we have a > > -violation of the first part of the Guarantee: A critical section > > -starts before a grace period, and some store propagates to the > > -critical section's CPU before the end of the critical section but > > -doesn't propagate to some other CPU until after the end of the grace > > -period. > > +Guarantee by requiring that the rb relation does not contain a cycle. > > +Equivalently, this "rcu" axiom requires that there are no events E and > > +F with E ->rcu-link F ->rcu-fence E. Or to put it a third way, the > > +axiom requires that there are no cycles consisting of gp and rscs > > +alternating with rcu-link, where the number of gp links is >= the > > +number of rscs links. > > + > > +Justifying the axiom isn't easy, but it is in fact a valid > > +formalization of the Grace Period Guarantee. We won't attempt to go > > +through the detailed argument, but the following analysis gives a > > +taste of what is involved. 
Suppose we have a violation of the first > > +part of the Guarantee: A critical section starts before a grace > > +period, and some store propagates to the critical section's CPU before > > +the end of the critical section but doesn't propagate to some other > > +CPU until after the end of the grace period. > > > > Putting symbols to these ideas, let L and U be the rcu_read_lock() and > > rcu_read_unlock() fence events delimiting the critical section in > > @@ -1606,11 +1657,14 @@ by rcu-link, yielding: > > > > S ->po X ->rcu-link Z ->po U. > > > > -The formulas say that S is po-between F and X, hence F ->gp-link Z > > -via X. They also say that Z comes before the end of the critical > > -section and E comes after its start, hence Z ->rscs-link F via E. But > > -now we have a forbidden cycle: F ->gp-link Z ->rscs-link F. Thus the > > -"rcu" axiom rules out this violation of the Grace Period Guarantee. > > +The formulas say that S is po-between F and X, hence F ->gp X. They > > +also say that Z comes before the end of the critical section and E > > +comes after its start, hence Z ->rscs E. From all this we obtain: > > + > > + F ->gp X ->rcu-link Z ->rscs E ->rcu-link F, > > + > > +a forbidden cycle. Thus the "rcu" axiom rules out this violation of > > +the Grace Period Guarantee. > > > > For something a little more down-to-earth, let's see how the axiom > > works out in practice. Consider the RCU code example from above, this > > @@ -1639,15 +1693,15 @@ time with statement labels added to the > > If r2 = 0 at the end then P0's store at X overwrites the value that > > P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X. > > In addition, there is a synchronize_rcu() between Y and Z, so therefore > > -we have Y ->gp-link X. > > +we have Y ->gp Z. > > > > If r1 = 1 at the end then P1's load at Y reads from P0's store at W, > > so we have W ->rcu-link Y. In addition, W and X are in the same critical > > -section, so therefore we have X ->rscs-link Y. 
> > +section, so therefore we have X ->rscs W.
> >
> > -This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link
> > -and one rscs-link, violating the "rcu" axiom. Hence the outcome is
> > -not allowed by the LKMM, as we would expect.
> > +Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle,
> > +violating the "rcu" axiom. Hence the outcome is not allowed by the
> > +LKMM, as we would expect.
> >
> >  For contrast, let's see what can happen in a more complicated example:
> >
> > @@ -1683,15 +1737,11 @@ For contrast, let's see what can happen
> >  }
> >
> >  If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows
> > -that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W
> > -via V. And just as before, this gives a cycle:
> > -
> > -	W ->rscs-link Y ->gp-link U ->rscs-link W.
> > -
> > -However, this cycle has fewer gp-link instances than rscs-link
> > -instances, and consequently the outcome is not forbidden by the LKMM.
> > -The following instruction timing diagram shows how it might actually
> > -occur:
> > +that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W.
> > +However this cycle is not forbidden, because the sequence of relations
> > +contains fewer instances of gp (one) than of rscs (two). Consequently
> > +the outcome is allowed by the LKMM. The following instruction timing
> > +diagram shows how it might actually occur:
> >
> >  P0			P1			P2
> >  --------------------	--------------------	--------------------
> >
> >
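For readers following the two worked examples in the quoted explanation.txt text, here is a minimal Python sketch (not part of the patch) that evaluates the recursive rcu-fence definition as a least fixed point over small hand-written relations. It assumes the gp, rscs, and rcu-link pairs are exactly the ones named in the examples, whereas in the real model they are derived from po, hb, pb, and prop. It checks the "rcu" axiom in the equivalent form given in the documentation text: no events E and F with E ->rcu-link F ->rcu-fence E. The helper names (compose, seq, rcu_fence, violates_rcu_axiom) are invented for illustration.

```python
def compose(r, s):
    """Relational composition r ; s on sets of (source, target) pairs."""
    return {(a, d) for (a, b) in r for (c, d) in s if b == c}


def seq(*rels):
    """Compose several relations from left to right."""
    out = rels[0]
    for r in rels[1:]:
        out = compose(out, r)
    return out


def rcu_fence(gp, rscs, link):
    """Least fixed point of the recursive rcu-fence definition in the patch."""
    fence = set()
    while True:
        new = (set(gp)
               | seq(gp, link, rscs)
               | seq(rscs, link, gp)
               | seq(gp, link, fence, link, rscs)
               | seq(rscs, link, fence, link, gp)
               | seq(fence, link, fence))
        if new == fence:
            return fence
        fence = new


def violates_rcu_axiom(gp, rscs, link):
    """True if some E, F satisfy E ->rcu-link F ->rcu-fence E."""
    fence = rcu_fence(gp, rscs, link)
    return any(a == b for (a, b) in compose(link, fence))


# X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X (one gp, one rscs)
two_thread = dict(gp={("Y", "Z")},
                  rscs={("X", "W")},
                  link={("W", "Y"), ("Z", "X")})

# W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W
# (one gp, two rscs)
three_thread = dict(gp={("Y", "Z")},
                    rscs={("W", "X"), ("U", "V")},
                    link={("X", "Y"), ("Z", "U"), ("V", "W")})

print(violates_rcu_axiom(**two_thread))    # cycle with gp count >= rscs count
print(violates_rcu_axiom(**three_thread))  # cycle with fewer gp than rscs
```

The sketch only exercises the counting behaviour of rcu-fence on hand-picked pairs; herd7 remains the tool to use for real litmus tests.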
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence 2018-03-01 4:49 ` Paul E. McKenney @ 2018-03-01 8:39 ` Boqun Feng 2018-03-01 14:28 ` Paul E. McKenney 0 siblings, 1 reply; 13+ messages in thread From: Boqun Feng @ 2018-03-01 8:39 UTC (permalink / raw) To: Paul E. McKenney Cc: Alan Stern, LKMM Maintainers -- Akira Yokosawa, Andrea Parri, David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon, Kernel development list [-- Attachment #1: Type: text/plain, Size: 17966 bytes --] On Wed, Feb 28, 2018 at 08:49:37PM -0800, Paul E. McKenney wrote: > On Thu, Mar 01, 2018 at 09:55:31AM +0800, Boqun Feng wrote: > > On Wed, Feb 28, 2018 at 03:13:54PM -0500, Alan Stern wrote: > > > This patch reorganizes the definition of rb in the Linux Kernel Memory > > > Consistency Model. The relation is now expressed in terms of > > > rcu-fence, which consists of a sequence of gp and rscs links separated > > > by rcu-link links, in which the number of occurrences of gp is >= the > > > number of occurrences of rscs. > > > > > > Arguments similar to those published in > > > http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an > > > inter-CPU strong fence. Furthermore, the definition of rb in terms of > > > rcu-fence is highly analogous to the definition of pb in terms of > > > strong-fence, which can help explain why rcu-path expresses a form of > > > temporal ordering. > > > > > > This change should not affect the semantics of the memory model, just > > > its internal organization. > > > > > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu> > > > > > > --- > > > > > > v2: Rebase on top of the preceding patch which renames "link" to > > > "rcu-link" and "rcu-path" to "rb". Add back the missing "rec" keyword > > > in the definition of rcu-fence. Minor editing improvements in > > > explanation.txt. 
> > >
> > > Index: usb-4.x/tools/memory-model/linux-kernel.cat
> > > ===================================================================
> > > --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> > > +++ usb-4.x/tools/memory-model/linux-kernel.cat
> > > @@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po?
> > >  *)
> > >  let rcu-link = hb* ; pb* ; prop
> > >
> > > -(* Chains that affect the RCU grace-period guarantee *)
> > > -let gp-link = gp ; rcu-link
> > > -let rscs-link = rscs ; rcu-link
> > > -
> > >  (*
> > > - * A cycle containing at least as many grace periods as RCU read-side
> > > - * critical sections is forbidden.
> > > + * Any sequence containing at least as many grace periods as RCU read-side
> > > + * critical sections (joined by rcu-link) acts as a generalized strong fence.
> > >  *)
> > > -let rec rb =
> > > -	gp-link |
> > > -	(gp-link ; rscs-link) |
> > > -	(rscs-link ; gp-link) |
> > > -	(rb ; rb) |
> > > -	(gp-link ; rb ; rscs-link) |
> > > -	(rscs-link ; rb ; gp-link)
> > > +let rec rcu-fence = gp |
> > > +	(gp ; rcu-link ; rscs) |
> > > +	(rscs ; rcu-link ; gp) |
> > > +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > > +	(rcu-fence ; rcu-link ; rcu-fence)
> > > +
> > > +(* rb orders instructions just as pb does *)
> > > +let rb = prop ; rcu-fence ; hb* ; pb*
> > >
> > >  irreflexive rb as rcu
> >
> > I wonder whether we can simplify things as:
> >
> > 	let rec rcu-fence =
> > 		(gp; rcu-link; rscs) |
> > 		(rscs; rcu-link; gp) |
> > 		(gp; rcu-link; rcu-fence; rcu-link; rscs) |
> > 		(rscs; rcu-link; rcu-fence; rcu-link; gp)
> >
> > 	(* gp and rcu-fence; rcu-link; rcu-fence removed *)
> >
> > 	let rb = prop; rcu-fence; hb*; pb*
> >
> > 	acycle rb as rcu

Note this one should be "acyclic rb as rcu"...

> >
> > In this way, "rcu-fence" is defined as "any sequence containing as many
> > grace periods as RCU read-side critical sections (joined by rcu-link)."
> > Note that "rcu-link" contains "gp", so we don't miss the case where
> > there are more grace periods. And since we use "acycle" now, so we don't
> > need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.
> >
> > I prefer this because we already treat "gp" as "strong-fence", which
> > already is a "rcu-link". Also, recursively extending rcu-fence with
> > itself is exactly calculating the transitive closure, which we can avoid
> > by using a "acycle" rule. Besides, it looks more consistent with hb and
> > pb.
>
> I don't have any opinions from an aesthetics viewpoint, but this change
> does correctly handle the automatically generated tests. I do not see
> any performance impact, if anything, about a 10% improvement based on
> this 11-process RCU litmus test:
>
> 	auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-G.litmus
>
> With the change, about 10.4 seconds, without, about 11.4 seconds.
>

I got 12.0 seconds (my version) vs 13.59 seconds (Alan's version). So
clearly you have a faster computer than I ;-)

> I am not patient enough to try one of the really large ones, like this one:
>
> 	auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-R+RW-G.litmus
>

I'm trying to run this on my laptop, but it seems it will take forever to
run (it has now been running for an hour and a half with Alan's version).
I will post the result if it finishes some time later.

Regards,
Boqun

> However, it is in my "litmus" github archive, so please feel free to
> try it out. Though I would suggest working up from those of intermediate
> length.
>
> 							Thanx, Paul
>
> > Thoughts?
> >
> > Regards,
> > Boqun
> >
> >
> > > +
> > > +(*
> > > + * The happens-before, propagation, and rcu constraints are all
> > > + * expressions of temporal ordering.
They could be replaced by > > > + * a single constraint on an "executes-before" relation, xb: > > > + * > > > + * let xb = hb | pb | rb > > > + * acyclic xb as executes-before > > > + *) > > > Index: usb-4.x/tools/memory-model/Documentation/explanation.txt > > > =================================================================== > > > --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt > > > +++ usb-4.x/tools/memory-model/Documentation/explanation.txt > > > @@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory C > > > 19. AND THEN THERE WAS ALPHA > > > 20. THE HAPPENS-BEFORE RELATION: hb > > > 21. THE PROPAGATES-BEFORE RELATION: pb > > > - 22. RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb > > > + 22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb > > > 23. ODDS AND ENDS > > > > > > > > > @@ -1451,8 +1451,8 @@ they execute means that it cannot have c > > > the content of the LKMM's "propagation" axiom. > > > > > > > > > -RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb > > > ---------------------------------------------------- > > > +RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb > > > +---------------------------------------------------- > > > > > > RCU (Read-Copy-Update) is a powerful synchronization mechanism. It > > > rests on two concepts: grace periods and read-side critical sections. > > > @@ -1537,49 +1537,100 @@ relation, and the details don't matter u > > > a somewhat lengthy formal proof. Pretty much all you need to know > > > about rcu-link is the information in the preceding paragraph. > > > > > > -The LKMM goes on to define the gp-link and rscs-link relations. They > > > -bring grace periods and read-side critical sections into the picture, > > > -in the following way: > > > - > > > - E ->gp-link F means there is a synchronize_rcu() fence event S > > > - and an event X such that E ->po S, either S ->po X or S = X, > > > - and X ->rcu-link F. 
In other words, E and F are linked by a > > > - grace period followed by an instance of rcu-link. > > > - > > > - E ->rscs-link F means there is a critical section delimited by > > > - an rcu_read_lock() fence L and an rcu_read_unlock() fence U, > > > - and an event X such that E ->po U, either L ->po X or L = X, > > > - and X ->rcu-link F. Roughly speaking, this says that some > > > - event in the same critical section as E is linked by rcu-link > > > - to F. > > > +The LKMM also defines the gp and rscs relations. They bring grace > > > +periods and read-side critical sections into the picture, in the > > > +following way: > > > + > > > + E ->gp F means there is a synchronize_rcu() fence event S such > > > + that E ->po S and either S ->po F or S = F. In simple terms, > > > + there is a grace period po-between E and F. > > > + > > > + E ->rscs F means there is a critical section delimited by an > > > + rcu_read_lock() fence L and an rcu_read_unlock() fence U, such > > > + that E ->po U and either L ->po F or L = F. You can think of > > > + this as saying that E and F are in the same critical section > > > + (in fact, it also allows E to be po-before the start of the > > > + critical section and F to be po-after the end). > > > > > > If we think of the rcu-link relation as standing for an extended > > > -"before", then E ->gp-link F says that E executes before a grace > > > -period which ends before F executes. (In fact it covers more than > > > -this, because it also includes cases where E executes before a grace > > > -period and some store propagates to F's CPU before F executes and > > > -doesn't propagate to some other CPU until after the grace period > > > -ends.) Similarly, E ->rscs-link F says that E is part of (or before > > > -the start of) a critical section which starts before F executes. > > > +"before", then X ->gp Y ->rcu-link Z says that X executes before a > > > +grace period which ends before Z executes. 
(In fact it covers more > > > +than this, because it also includes cases where X executes before a > > > +grace period and some store propagates to Z's CPU before Z executes > > > +but doesn't propagate to some other CPU until after the grace period > > > +ends.) Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or > > > +before the start of) a critical section which starts before Z > > > +executes. > > > + > > > +The LKMM goes on to define the rcu-fence relation as a sequence of gp > > > +and rscs links separated by rcu-link links, in which the number of gp > > > +links is >= the number of rscs links. For example: > > > + > > > + X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V > > > + > > > +would imply that X ->rcu-fence V, because this sequence contains two > > > +gp links and only one rscs link. (It also implies that X ->rcu-fence T > > > +and Z ->rcu-fence V.) On the other hand: > > > + > > > + X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V > > > + > > > +does not imply X ->rcu-fence V, because the sequence contains only > > > +one gp link but two rscs links. > > > + > > > +The rcu-fence relation is important because the Grace Period Guarantee > > > +means that rcu-fence acts kind of like a strong fence. In particular, > > > +if W is a write and we have W ->rcu-fence Z, the Guarantee says that W > > > +will propagate to every CPU before Z executes. > > > + > > > +To prove this in full generality requires some intellectual effort. > > > +We'll consider just a very simple case: > > > + > > > + W ->gp X ->rcu-link Y ->rscs Z. > > > + > > > +This formula means that there is a grace period G and a critical > > > +section C such that: > > > + > > > + 1. W is po-before G; > > > + > > > + 2. X is equal to or po-after G; > > > + > > > + 3. X comes "before" Y in some sense; > > > + > > > + 4. Y is po-before the end of C; > > > + > > > + 5. Z is equal to or po-after the start of C. 
> > > + > > > +From 2 - 4 we deduce that the grace period G ends before the critical > > > +section C. Then the second part of the Grace Period Guarantee says > > > +not only that G starts before C does, but also that W (which executes > > > +on G's CPU before G starts) must propagate to every CPU before C > > > +starts. In particular, W propagates to every CPU before Z executes > > > +(or finishes executing, in the case where Z is equal to the > > > +rcu_read_lock() fence event which starts C.) This sort of reasoning > > > +can be expanded to handle all the situations covered by rcu-fence. > > > + > > > +Finally, the LKMM defines the RCU-before (rb) relation in terms of > > > +rcu-fence. This is done in essentially the same way as the pb > > > +relation was defined in terms of strong-fence. We will omit the > > > +details; the end result is that E ->rb F implies E must execute before > > > +F, just as E ->pb F does (and for much the same reasons). > > > > > > Putting this all together, the LKMM expresses the Grace Period > > > -Guarantee by requiring that there are no cycles consisting of gp-link > > > -and rscs-link links in which the number of gp-link instances is >= the > > > -number of rscs-link instances. It does this by defining the rb > > > -relation to link events E and F whenever it is possible to pass from E > > > -to F by a sequence of gp-link and rscs-link links with at least as > > > -many of the former as the latter. The LKMM's "rcu" axiom then says > > > -that there are no events E with E ->rb E. > > > - > > > -Justifying this axiom takes some intellectual effort, but it is in > > > -fact a valid formalization of the Grace Period Guarantee. We won't > > > -attempt to go through the detailed argument, but the following > > > -analysis gives a taste of what is involved. 
Suppose we have a > > > -violation of the first part of the Guarantee: A critical section > > > -starts before a grace period, and some store propagates to the > > > -critical section's CPU before the end of the critical section but > > > -doesn't propagate to some other CPU until after the end of the grace > > > -period. > > > +Guarantee by requiring that the rb relation does not contain a cycle. > > > +Equivalently, this "rcu" axiom requires that there are no events E and > > > +F with E ->rcu-link F ->rcu-fence E. Or to put it a third way, the > > > +axiom requires that there are no cycles consisting of gp and rscs > > > +alternating with rcu-link, where the number of gp links is >= the > > > +number of rscs links. > > > + > > > +Justifying the axiom isn't easy, but it is in fact a valid > > > +formalization of the Grace Period Guarantee. We won't attempt to go > > > +through the detailed argument, but the following analysis gives a > > > +taste of what is involved. Suppose we have a violation of the first > > > +part of the Guarantee: A critical section starts before a grace > > > +period, and some store propagates to the critical section's CPU before > > > +the end of the critical section but doesn't propagate to some other > > > +CPU until after the end of the grace period. > > > > > > Putting symbols to these ideas, let L and U be the rcu_read_lock() and > > > rcu_read_unlock() fence events delimiting the critical section in > > > @@ -1606,11 +1657,14 @@ by rcu-link, yielding: > > > > > > S ->po X ->rcu-link Z ->po U. > > > > > > -The formulas say that S is po-between F and X, hence F ->gp-link Z > > > -via X. They also say that Z comes before the end of the critical > > > -section and E comes after its start, hence Z ->rscs-link F via E. But > > > -now we have a forbidden cycle: F ->gp-link Z ->rscs-link F. Thus the > > > -"rcu" axiom rules out this violation of the Grace Period Guarantee. 
> > > +The formulas say that S is po-between F and X, hence F ->gp X. They > > > +also say that Z comes before the end of the critical section and E > > > +comes after its start, hence Z ->rscs E. From all this we obtain: > > > + > > > + F ->gp X ->rcu-link Z ->rscs E ->rcu-link F, > > > + > > > +a forbidden cycle. Thus the "rcu" axiom rules out this violation of > > > +the Grace Period Guarantee. > > > > > > For something a little more down-to-earth, let's see how the axiom > > > works out in practice. Consider the RCU code example from above, this > > > @@ -1639,15 +1693,15 @@ time with statement labels added to the > > > If r2 = 0 at the end then P0's store at X overwrites the value that > > > P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X. > > > In addition, there is a synchronize_rcu() between Y and Z, so therefore > > > -we have Y ->gp-link X. > > > +we have Y ->gp Z. > > > > > > If r1 = 1 at the end then P1's load at Y reads from P0's store at W, > > > so we have W ->rcu-link Y. In addition, W and X are in the same critical > > > -section, so therefore we have X ->rscs-link Y. > > > +section, so therefore we have X ->rscs W. > > > > > > -This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link > > > -and one rscs-link, violating the "rcu" axiom. Hence the outcome is > > > -not allowed by the LKMM, as we would expect. > > > +Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle, > > > +violating the "rcu" axiom. Hence the outcome is not allowed by the > > > +LKMM, as we would expect. > > > > > > For contrast, let's see what can happen in a more complicated example: > > > > > > @@ -1683,15 +1737,11 @@ For contrast, let's see what can happen > > > } > > > > > > If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows > > > -that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W > > > -via V. 
And just as before, this gives a cycle:
> > > -
> > > -	W ->rscs-link Y ->gp-link U ->rscs-link W.
> > > -
> > > -However, this cycle has fewer gp-link instances than rscs-link
> > > -instances, and consequently the outcome is not forbidden by the LKMM.
> > > -The following instruction timing diagram shows how it might actually
> > > -occur:
> > > +that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W.
> > > +However this cycle is not forbidden, because the sequence of relations
> > > +contains fewer instances of gp (one) than of rscs (two). Consequently
> > > +the outcome is allowed by the LKMM. The following instruction timing
> > > +diagram shows how it might actually occur:
> > >
> > >  P0			P1			P2
> > >  --------------------	--------------------	--------------------
> > >
> > >
> >
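As a rough way to experiment with the simplified definition proposed in this message, the following Python sketch (not from the thread) drops the bare gp case and the rcu-fence ; rcu-link ; rcu-fence case, and replaces the irreflexivity check with an acyclicity check. Two assumptions are baked in. First, acyclicity of rcu-link ; rcu-fence is used as a stand-in for "acyclic rb", on the grounds that rotating a cycle of rb = prop ; rcu-fence ; hb* ; pb* yields a cycle of rcu-link ; rcu-fence. Second, the toy rcu-link is a hand-written set of pairs and does not contain gp the way the message says the real rcu-link does. The sketch therefore only tries the two documentation examples and says nothing about whether the two definitions agree in general. All helper names are invented for illustration.

```python
def compose(r, s):
    """Relational composition r ; s on sets of (source, target) pairs."""
    return {(a, d) for (a, b) in r for (c, d) in s if b == c}


def seq(*rels):
    """Compose several relations from left to right."""
    out = rels[0]
    for r in rels[1:]:
        out = compose(out, r)
    return out


def simplified_rcu_fence(gp, rscs, link):
    """Least fixed point of the four-case definition proposed in the message."""
    fence = set()
    while True:
        new = (seq(gp, link, rscs)
               | seq(rscs, link, gp)
               | seq(gp, link, fence, link, rscs)
               | seq(rscs, link, fence, link, gp))
        if new == fence:
            return fence
        fence = new


def is_acyclic(rel):
    """True if the transitive closure of rel has no (x, x) pair."""
    closure = set(rel)
    while True:
        bigger = closure | compose(closure, closure)
        if bigger == closure:
            break
        closure = bigger
    return all(a != b for (a, b) in closure)


def forbidden_by_simplified_rule(gp, rscs, link):
    """Stand-in for 'acyclic rb': is rcu-link ; rcu-fence cyclic?"""
    fence = simplified_rcu_fence(gp, rscs, link)
    return not is_acyclic(compose(link, fence))


# X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X (one gp, one rscs)
two_thread = dict(gp={("Y", "Z")},
                  rscs={("X", "W")},
                  link={("W", "Y"), ("Z", "X")})

# W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W
# (one gp, two rscs)
three_thread = dict(gp={("Y", "Z")},
                    rscs={("W", "X"), ("U", "V")},
                    link={("X", "Y"), ("Z", "U"), ("V", "W")})

print(forbidden_by_simplified_rule(**two_thread))
print(forbidden_by_simplified_rule(**three_thread))
```

On these two inputs the simplified rule forbids the first cycle and allows the second, matching the verdicts stated in the quoted documentation; cycles with strictly more grace periods than critical sections depend on gp being reachable through rcu-link, which this toy setup does not model.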
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence 2018-03-01 8:39 ` Boqun Feng @ 2018-03-01 14:28 ` Paul E. McKenney 0 siblings, 0 replies; 13+ messages in thread From: Paul E. McKenney @ 2018-03-01 14:28 UTC (permalink / raw) To: Boqun Feng Cc: Alan Stern, LKMM Maintainers -- Akira Yokosawa, Andrea Parri, David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon, Kernel development list On Thu, Mar 01, 2018 at 04:39:06PM +0800, Boqun Feng wrote: > On Wed, Feb 28, 2018 at 08:49:37PM -0800, Paul E. McKenney wrote: > > On Thu, Mar 01, 2018 at 09:55:31AM +0800, Boqun Feng wrote: > > > On Wed, Feb 28, 2018 at 03:13:54PM -0500, Alan Stern wrote: > > > > This patch reorganizes the definition of rb in the Linux Kernel Memory > > > > Consistency Model. The relation is now expressed in terms of > > > > rcu-fence, which consists of a sequence of gp and rscs links separated > > > > by rcu-link links, in which the number of occurrences of gp is >= the > > > > number of occurrences of rscs. > > > > > > > > Arguments similar to those published in > > > > http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an > > > > inter-CPU strong fence. Furthermore, the definition of rb in terms of > > > > rcu-fence is highly analogous to the definition of pb in terms of > > > > strong-fence, which can help explain why rcu-path expresses a form of > > > > temporal ordering. > > > > > > > > This change should not affect the semantics of the memory model, just > > > > its internal organization. > > > > > > > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu> > > > > > > > > --- > > > > > > > > v2: Rebase on top of the preceding patch which renames "link" to > > > > "rcu-link" and "rcu-path" to "rb". Add back the missing "rec" keyword > > > > in the definition of rcu-fence. Minor editing improvements in > > > > explanation.txt. 
> > > > > > > > Index: usb-4.x/tools/memory-model/linux-kernel.cat > > > > =================================================================== > > > > --- usb-4.x.orig/tools/memory-model/linux-kernel.cat > > > > +++ usb-4.x/tools/memory-model/linux-kernel.cat > > > > @@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po? > > > > *) > > > > let rcu-link = hb* ; pb* ; prop > > > > > > > > -(* Chains that affect the RCU grace-period guarantee *) > > > > -let gp-link = gp ; rcu-link > > > > -let rscs-link = rscs ; rcu-link > > > > - > > > > (* > > > > - * A cycle containing at least as many grace periods as RCU read-side > > > > - * critical sections is forbidden. > > > > + * Any sequence containing at least as many grace periods as RCU read-side > > > > + * critical sections (joined by rcu-link) acts as a generalized strong fence. > > > > *) > > > > -let rec rb = > > > > - gp-link | > > > > - (gp-link ; rscs-link) | > > > > - (rscs-link ; gp-link) | > > > > - (rb ; rb) | > > > > - (gp-link ; rb ; rscs-link) | > > > > - (rscs-link ; rb ; gp-link) > > > > +let rec rcu-fence = gp | > > > > + (gp ; rcu-link ; rscs) | > > > > + (rscs ; rcu-link ; gp) | > > > > + (gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) | > > > > + (rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) | > > > > + (rcu-fence ; rcu-link ; rcu-fence) > > > > + > > > > +(* rb orders instructions just as pb does *) > > > > +let rb = prop ; rcu-fence ; hb* ; pb* > > > > > > > > irreflexive rb as rcu > > > > > > I wonder whether we can simplify things as: > > > > > > let rec rcu-fence = > > > (gp; rcu-link; rscs) | > > > (rscs; rcu-link; gp) | > > > (gp; rcu-link; rcu-fence; rcu-link; rscs) | > > > (rscs; rcu-link; rcu-fence; rcu-link; gp) > > > > > > (* gp and rcu-fence; rcu-link; rcu-fence removed *) > > > > > > let rb = prop; rcu-fence; hb*; pb* > > > > > > acycle rb as rcu > > Note this one should be "acyclic rb as rcu"... 
I applied the change by hand, and didn't notice the "acycle", so in my tests it was indeed "acyclic". (I left that line alone.) > > > In this way, "rcu-fence" is defined as "any sequence containing as many > > > grace periods as RCU read-side critical sections (joined by rcu-link)." > > > Note that "rcu-link" contains "gp", so we don't miss the case where > > > there are more grace periods. And since we use "acycle" now, so we don't > > > need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively. > > > > > > I prefer this because we already treat "gp" as "strong-fence", which > > > already is a "rcu-link". Also, recurisively extending rcu-fence with > > > itself is exactly calculating the transitive closure, which we can avoid > > > by using a "acycle" rule. Besides, it looks more consistent with hb and > > > pb. > > > > I don't have any opinions from an aesthetics viewpoint, but this change > > does correctly handle the automatically generated tests. I do not see > > any performance impact, if anything, about a 10% improvement based on > > this 11-process RCU litmus test: > > > > auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-G.litmus > > > > With the change, about 10.4 seconds, without, about 11.4 seconds. > > I got 12.0 seconds(my version) vs 13.59 seconds (Alan's version). So > clearly you have a faster computer than I ;-) OK, it might be consistent. > > I am not patient enough to try one of the really large ones, like this one: > > > > auto/C-RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-R+RW-G.litmus > > > > I'm trying to run this on my laptop, but seems it will take forever to > run(now it has been running for 1 hour and a half with Alan's version). > I will update the result if it got finished some time later. Yes, that one will take some time. I don't recall exactly how long, but a great many hours, so... 
> Regards, > Boqun > > > However, it is in my "litmus" github archive, so please feel free to > > try it out. Though I would suggest working up from those of intermediate > > length. ... I reiterate my suggestion that you start with the shorter ones. But your choice. ;-) Thanx, Paul > > > Thoughts? > > > > > > Regards, > > > Boqun > > > > > > > > > > + > > > > +(* > > > > + * The happens-before, propagation, and rcu constraints are all > > > > + * expressions of temporal ordering. They could be replaced by > > > > + * a single constraint on an "executes-before" relation, xb: > > > > + * > > > > + * let xb = hb | pb | rb > > > > + * acyclic xb as executes-before > > > > + *) > > > > Index: usb-4.x/tools/memory-model/Documentation/explanation.txt > > > > =================================================================== > > > > --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt > > > > +++ usb-4.x/tools/memory-model/Documentation/explanation.txt > > > > @@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory C > > > > 19. AND THEN THERE WAS ALPHA > > > > 20. THE HAPPENS-BEFORE RELATION: hb > > > > 21. THE PROPAGATES-BEFORE RELATION: pb > > > > - 22. RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb > > > > + 22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb > > > > 23. ODDS AND ENDS > > > > > > > > > > > > @@ -1451,8 +1451,8 @@ they execute means that it cannot have c > > > > the content of the LKMM's "propagation" axiom. > > > > > > > > > > > > -RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb > > > > ---------------------------------------------------- > > > > +RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb > > > > +---------------------------------------------------- > > > > > > > > RCU (Read-Copy-Update) is a powerful synchronization mechanism. It > > > > rests on two concepts: grace periods and read-side critical sections. 
> > > > @@ -1537,49 +1537,100 @@ relation, and the details don't matter u > > > > a somewhat lengthy formal proof. Pretty much all you need to know > > > > about rcu-link is the information in the preceding paragraph. > > > > > > > > -The LKMM goes on to define the gp-link and rscs-link relations. They > > > > -bring grace periods and read-side critical sections into the picture, > > > > -in the following way: > > > > - > > > > - E ->gp-link F means there is a synchronize_rcu() fence event S > > > > - and an event X such that E ->po S, either S ->po X or S = X, > > > > - and X ->rcu-link F. In other words, E and F are linked by a > > > > - grace period followed by an instance of rcu-link. > > > > - > > > > - E ->rscs-link F means there is a critical section delimited by > > > > - an rcu_read_lock() fence L and an rcu_read_unlock() fence U, > > > > - and an event X such that E ->po U, either L ->po X or L = X, > > > > - and X ->rcu-link F. Roughly speaking, this says that some > > > > - event in the same critical section as E is linked by rcu-link > > > > - to F. > > > > +The LKMM also defines the gp and rscs relations. They bring grace > > > > +periods and read-side critical sections into the picture, in the > > > > +following way: > > > > + > > > > + E ->gp F means there is a synchronize_rcu() fence event S such > > > > + that E ->po S and either S ->po F or S = F. In simple terms, > > > > + there is a grace period po-between E and F. > > > > + > > > > + E ->rscs F means there is a critical section delimited by an > > > > + rcu_read_lock() fence L and an rcu_read_unlock() fence U, such > > > > + that E ->po U and either L ->po F or L = F. You can think of > > > > + this as saying that E and F are in the same critical section > > > > + (in fact, it also allows E to be po-before the start of the > > > > + critical section and F to be po-after the end). 
> > > > > > > > If we think of the rcu-link relation as standing for an extended > > > > -"before", then E ->gp-link F says that E executes before a grace > > > > -period which ends before F executes. (In fact it covers more than > > > > -this, because it also includes cases where E executes before a grace > > > > -period and some store propagates to F's CPU before F executes and > > > > -doesn't propagate to some other CPU until after the grace period > > > > -ends.) Similarly, E ->rscs-link F says that E is part of (or before > > > > -the start of) a critical section which starts before F executes. > > > > +"before", then X ->gp Y ->rcu-link Z says that X executes before a > > > > +grace period which ends before Z executes. (In fact it covers more > > > > +than this, because it also includes cases where X executes before a > > > > +grace period and some store propagates to Z's CPU before Z executes > > > > +but doesn't propagate to some other CPU until after the grace period > > > > +ends.) Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or > > > > +before the start of) a critical section which starts before Z > > > > +executes. > > > > + > > > > +The LKMM goes on to define the rcu-fence relation as a sequence of gp > > > > +and rscs links separated by rcu-link links, in which the number of gp > > > > +links is >= the number of rscs links. For example: > > > > + > > > > + X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V > > > > + > > > > +would imply that X ->rcu-fence V, because this sequence contains two > > > > +gp links and only one rscs link. (It also implies that X ->rcu-fence T > > > > +and Z ->rcu-fence V.) On the other hand: > > > > + > > > > + X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V > > > > + > > > > +does not imply X ->rcu-fence V, because the sequence contains only > > > > +one gp link but two rscs links. 
> > > > + > > > > +The rcu-fence relation is important because the Grace Period Guarantee > > > > +means that rcu-fence acts kind of like a strong fence. In particular, > > > > +if W is a write and we have W ->rcu-fence Z, the Guarantee says that W > > > > +will propagate to every CPU before Z executes. > > > > + > > > > +To prove this in full generality requires some intellectual effort. > > > > +We'll consider just a very simple case: > > > > + > > > > + W ->gp X ->rcu-link Y ->rscs Z. > > > > + > > > > +This formula means that there is a grace period G and a critical > > > > +section C such that: > > > > + > > > > + 1. W is po-before G; > > > > + > > > > + 2. X is equal to or po-after G; > > > > + > > > > + 3. X comes "before" Y in some sense; > > > > + > > > > + 4. Y is po-before the end of C; > > > > + > > > > + 5. Z is equal to or po-after the start of C. > > > > + > > > > +From 2 - 4 we deduce that the grace period G ends before the critical > > > > +section C. Then the second part of the Grace Period Guarantee says > > > > +not only that G starts before C does, but also that W (which executes > > > > +on G's CPU before G starts) must propagate to every CPU before C > > > > +starts. In particular, W propagates to every CPU before Z executes > > > > +(or finishes executing, in the case where Z is equal to the > > > > +rcu_read_lock() fence event which starts C.) This sort of reasoning > > > > +can be expanded to handle all the situations covered by rcu-fence. > > > > + > > > > +Finally, the LKMM defines the RCU-before (rb) relation in terms of > > > > +rcu-fence. This is done in essentially the same way as the pb > > > > +relation was defined in terms of strong-fence. We will omit the > > > > +details; the end result is that E ->rb F implies E must execute before > > > > +F, just as E ->pb F does (and for much the same reasons). 
> > > > > > > > Putting this all together, the LKMM expresses the Grace Period > > > > -Guarantee by requiring that there are no cycles consisting of gp-link > > > > -and rscs-link links in which the number of gp-link instances is >= the > > > > -number of rscs-link instances. It does this by defining the rb > > > > -relation to link events E and F whenever it is possible to pass from E > > > > -to F by a sequence of gp-link and rscs-link links with at least as > > > > -many of the former as the latter. The LKMM's "rcu" axiom then says > > > > -that there are no events E with E ->rb E. > > > > - > > > > -Justifying this axiom takes some intellectual effort, but it is in > > > > -fact a valid formalization of the Grace Period Guarantee. We won't > > > > -attempt to go through the detailed argument, but the following > > > > -analysis gives a taste of what is involved. Suppose we have a > > > > -violation of the first part of the Guarantee: A critical section > > > > -starts before a grace period, and some store propagates to the > > > > -critical section's CPU before the end of the critical section but > > > > -doesn't propagate to some other CPU until after the end of the grace > > > > -period. > > > > +Guarantee by requiring that the rb relation does not contain a cycle. > > > > +Equivalently, this "rcu" axiom requires that there are no events E and > > > > +F with E ->rcu-link F ->rcu-fence E. Or to put it a third way, the > > > > +axiom requires that there are no cycles consisting of gp and rscs > > > > +alternating with rcu-link, where the number of gp links is >= the > > > > +number of rscs links. > > > > + > > > > +Justifying the axiom isn't easy, but it is in fact a valid > > > > +formalization of the Grace Period Guarantee. We won't attempt to go > > > > +through the detailed argument, but the following analysis gives a > > > > +taste of what is involved. 
Suppose we have a violation of the first > > > > +part of the Guarantee: A critical section starts before a grace > > > > +period, and some store propagates to the critical section's CPU before > > > > +the end of the critical section but doesn't propagate to some other > > > > +CPU until after the end of the grace period. > > > > > > > > Putting symbols to these ideas, let L and U be the rcu_read_lock() and > > > > rcu_read_unlock() fence events delimiting the critical section in > > > > @@ -1606,11 +1657,14 @@ by rcu-link, yielding: > > > > > > > > S ->po X ->rcu-link Z ->po U. > > > > > > > > -The formulas say that S is po-between F and X, hence F ->gp-link Z > > > > -via X. They also say that Z comes before the end of the critical > > > > -section and E comes after its start, hence Z ->rscs-link F via E. But > > > > -now we have a forbidden cycle: F ->gp-link Z ->rscs-link F. Thus the > > > > -"rcu" axiom rules out this violation of the Grace Period Guarantee. > > > > +The formulas say that S is po-between F and X, hence F ->gp X. They > > > > +also say that Z comes before the end of the critical section and E > > > > +comes after its start, hence Z ->rscs E. From all this we obtain: > > > > + > > > > + F ->gp X ->rcu-link Z ->rscs E ->rcu-link F, > > > > + > > > > +a forbidden cycle. Thus the "rcu" axiom rules out this violation of > > > > +the Grace Period Guarantee. > > > > > > > > For something a little more down-to-earth, let's see how the axiom > > > > works out in practice. Consider the RCU code example from above, this > > > > @@ -1639,15 +1693,15 @@ time with statement labels added to the > > > > If r2 = 0 at the end then P0's store at X overwrites the value that > > > > P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X. > > > > In addition, there is a synchronize_rcu() between Y and Z, so therefore > > > > -we have Y ->gp-link X. > > > > +we have Y ->gp Z. 
> > > > > > > > If r1 = 1 at the end then P1's load at Y reads from P0's store at W, > > > > so we have W ->rcu-link Y. In addition, W and X are in the same critical > > > > -section, so therefore we have X ->rscs-link Y. > > > > +section, so therefore we have X ->rscs W. > > > > > > > > -This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link > > > > -and one rscs-link, violating the "rcu" axiom. Hence the outcome is > > > > -not allowed by the LKMM, as we would expect. > > > > +Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle, > > > > +violating the "rcu" axiom. Hence the outcome is not allowed by the > > > > +LKMM, as we would expect. > > > > > > > > For contrast, let's see what can happen in a more complicated example: > > > > > > > > @@ -1683,15 +1737,11 @@ For contrast, let's see what can happen > > > > } > > > > > > > > If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows > > > > -that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W > > > > -via V. And just as before, this gives a cycle: > > > > - > > > > - W ->rscs-link Y ->gp-link U ->rscs-link W. > > > > - > > > > -However, this cycle has fewer gp-link instances than rscs-link > > > > -instances, and consequently the outcome is not forbidden by the LKMM. > > > > -The following instruction timing diagram shows how it might actually > > > > -occur: > > > > +that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W. > > > > +However this cycle is not forbidden, because the sequence of relations > > > > +contains fewer instances of gp (one) than of rscs (two). Consequently > > > > +the outcome is allowed by the LKMM. The following instruction timing > > > > +diagram shows how it might actually occur: > > > > > > > > P0 P1 P2 > > > > -------------------- -------------------- -------------------- > > > > > > > > > > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
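The counting rule in the quoted explanation.txt text (a cycle of gp and rscs links alternating with rcu-link is forbidden when the number of gp links is >= the number of rscs links) can be illustrated with a few lines of Python. This is only a toy sketch and not part of the patch or of herd: the function name cycle_forbidden and the list-of-strings encoding are assumptions made for illustration, and the rcu-link steps between links are taken for granted rather than modeled.

```python
def cycle_forbidden(links):
    """Toy check of the counting rule behind the LKMM "rcu" axiom.

    `links` lists the gp/rscs links of a cycle in order.  The rcu-link
    steps joining them are assumed to exist and are not modeled here.
    """
    if not links or any(link not in ("gp", "rscs") for link in links):
        raise ValueError("expected a non-empty list of 'gp'/'rscs' links")
    # Forbidden when there are at least as many gp links as rscs links.
    return links.count("gp") >= links.count("rscs")


# X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X   (two-process example)
two_process = ["rscs", "gp"]
# W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W
three_process = ["rscs", "gp", "rscs"]

print("two-process cycle forbidden:  ", cycle_forbidden(two_process))
print("three-process cycle forbidden:", cycle_forbidden(three_process))
```

The two calls correspond to the two litmus-test discussions in the quoted text: one gp against one rscs (forbidden), and one gp against two rscs (allowed).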
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-01  1:55 ` Boqun Feng
  2018-03-01  4:49   ` Paul E. McKenney
@ 2018-03-01 15:49   ` Alan Stern
  2018-03-01 17:49     ` Paul E. McKenney
  1 sibling, 1 reply; 13+ messages in thread
From: Alan Stern @ 2018-03-01 15:49 UTC (permalink / raw)
  To: Boqun Feng
  Cc: LKMM Maintainers -- Akira Yokosawa, Andrea Parri, David Howells,
	Jade Alglave, Luc Maranget, Nicholas Piggin, Paul E. McKenney,
	Peter Zijlstra, Will Deacon, Kernel development list

On Thu, 1 Mar 2018, Boqun Feng wrote:

> > +let rec rcu-fence = gp |
> > +	(gp ; rcu-link ; rscs) |
> > +	(rscs ; rcu-link ; gp) |
> > +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > +	(rcu-fence ; rcu-link ; rcu-fence)
> > +
> > +(* rb orders instructions just as pb does *)
> > +let rb = prop ; rcu-fence ; hb* ; pb*
> >
> >  irreflexive rb as rcu
>
> I wonder whether we can simplify things as:
>
> 	let rec rcu-fence =
> 	      (gp; rcu-link; rscs) |
> 	      (rscs; rcu-link; gp) |
> 	      (gp; rcu-link; rcu-fence; rcu-link; rscs) |
> 	      (rscs; rcu-link; rcu-fence; rcu-link; gp)
>
> 	(* gp and rcu-fence; rcu-link; rcu-fence removed *)
>
> 	let rb = prop; rcu-fence; hb*; pb*
>
> 	acycle rb as rcu
>
> In this way, "rcu-fence" is defined as "any sequence containing as many
> grace periods as RCU read-side critical sections (joined by rcu-link)."
> Note that "rcu-link" contains "gp", so we don't miss the case where
> there are more grace periods. And since we use "acycle" now, so we don't
> need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.

Would this definition of rcu-fence work for a sequence such as (leaving
out the intermediate rcu-link parts):

	gp gp gp rscs rscs gp rscs rscs

?  I don't think it would.  Yes, if you had a cycle of that form then
your "rcu" axiom would detect it, but at some point we might want to
use rcu-fence for some other purpose, one that doesn't involve cycles.

> I prefer this because we already treat "gp" as "strong-fence", which
> already is a "rcu-link".

That's a good point; it had not occurred to me.

> Also, recursively extending rcu-fence with
> itself is exactly calculating the transitive closure, which we can avoid
> by using a "acycle" rule. Besides, it looks more consistent with hb and
> pb.

That _had_ occurred to me.  But I couldn't see any way to do it while
still defining rcu-fence correctly.

Alan
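The question above (does the simplified definition still cover gp gp gp rscs rscs gp rscs rscs?) can be explored with a small Python sketch that treats the two definitions as grammars over strings of "G" (gp) and "R" (rscs), leaving out the rcu-link parts just as the example sequence does. This is a toy model under stated assumptions, not herd and not the cat semantics: the name in_rcu_fence and its two flags are made up for illustration, and the model cannot express Boqun's observation that rcu-link itself may contain a gp.

```python
from functools import lru_cache
from itertools import product


def in_rcu_fence(seq, allow_gp=True, allow_concat=True):
    """Toy membership test for rcu-fence over a string of 'G'/'R'.

    allow_gp      keeps the bare "gp" alternative.
    allow_concat  keeps "rcu-fence ; rcu-link ; rcu-fence".
    Both True models the definition in the patch; both False models
    the simplified definition proposed in the reply.
    """
    if set(seq) - {"G", "R"}:
        raise ValueError("sequence must contain only 'G' and 'R'")

    @lru_cache(maxsize=None)
    def derives(i, j):
        word = seq[i:j]
        if allow_gp and word == "G":
            return True
        if word in ("GR", "RG"):
            return True
        # gp ; rcu-fence ; rscs   or   rscs ; rcu-fence ; gp
        if j - i >= 3 and seq[i] != seq[j - 1] and derives(i + 1, j - 1):
            return True
        if allow_concat:
            return any(derives(i, k) and derives(k, j)
                       for k in range(i + 1, j))
        return False

    return derives(0, len(seq))


def alan(seq):
    return in_rcu_fence(seq, allow_gp=True, allow_concat=True)


def boqun(seq):
    return in_rcu_fence(seq, allow_gp=False, allow_concat=False)


seq = "GGGRRGRR"  # gp gp gp rscs rscs gp rscs rscs
print("patch definition accepts     :", alan(seq))
print("simplified definition accepts:", boqun(seq))

# Cross-check of the toy model: with both alternatives kept, membership
# matches the ">= as many gp as rscs" description for all short words.
for n in range(1, 9):
    for letters in product("GR", repeat=n):
        word = "".join(letters)
        assert alan(word) == (word.count("G") >= word.count("R"))
```

In this model the example sequence is accepted by the patch's definition but not by the simplified one, which is the distinction being drawn for uses of rcu-fence that do not involve cycles.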
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence 2018-03-01 15:49 ` Alan Stern @ 2018-03-01 17:49 ` Paul E. McKenney 2018-03-01 18:37 ` Paul E. McKenney 0 siblings, 1 reply; 13+ messages in thread From: Paul E. McKenney @ 2018-03-01 17:49 UTC (permalink / raw) To: Alan Stern Cc: Boqun Feng, LKMM Maintainers -- Akira Yokosawa, Andrea Parri, David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon, Kernel development list On Thu, Mar 01, 2018 at 10:49:05AM -0500, Alan Stern wrote: > On Thu, 1 Mar 2018, Boqun Feng wrote: > > > > +let rec rcu-fence = gp | > > > + (gp ; rcu-link ; rscs) | > > > + (rscs ; rcu-link ; gp) | > > > + (gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) | > > > + (rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) | > > > + (rcu-fence ; rcu-link ; rcu-fence) > > > + > > > +(* rb orders instructions just as pb does *) > > > +let rb = prop ; rcu-fence ; hb* ; pb* > > > > > > irreflexive rb as rcu > > > > I wonder whether we can simplify things as: > > > > let rec rcu-fence = > > (gp; rcu-link; rscs) | > > (rscs; rcu-link; gp) | > > (gp; rcu-link; rcu-fence; rcu-link; rscs) | > > (rscs; rcu-link; rcu-fence; rcu-link; gp) > > > > (* gp and rcu-fence; rcu-link; rcu-fence removed *) > > > > let rb = prop; rcu-fence; hb*; pb* > > > > acycle rb as rcu > > > > In this way, "rcu-fence" is defined as "any sequence containing as many > > grace periods as RCU read-side critical sections (joined by rcu-link)." > > Note that "rcu-link" contains "gp", so we don't miss the case where > > there are more grace periods. And since we use "acycle" now, so we don't > > need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively. > > Would this definition of rcu-fence work for a sequence such as (leaving > out the intermediate rcu-link parts): > > gp gp gp rscs rscs gp rscs rscs > > ? I don't think it would. 
Yes, if you had a cycle of that form then
> your "rcu" axiom would detect it, but at some point we might want to
> use rcu-fence for some other purpose, one that doesn't involve cycles.

Let's see, that would map to this:

	auto/RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus

And no, there is no such automatically generated litmus test.  Let's
try reversing the "gp" and "rscs", which should have the same effect
courtesy of symmetry:

	auto/RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus

And that one doesn't exist, either.  So much for random test generation!  :-/

Clearly time to add them.  And here is what herd has to say about them:

$ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
Herd options: -conf linux-kernel.cfg
Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Sometimes 1 255
^^^ Unexpected non-Never verification
0inputs+32outputs (0major+2605minor)pagefaults 0swaps
$ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
Herd options: -conf linux-kernel.cfg
Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Sometimes 1 255
^^^ Unexpected non-Never verification
0inputs+32outputs (0major+2620minor)pagefaults 0swaps

In other words, they are in fact misclassified as "Sometimes" when they
should be "Never".  I have my diffs below in case I misapplied Boqun's
change.

With Alan's original formulation, these two litmus tests are correctly
handled:

$ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
Herd options: -conf linux-kernel.cfg
Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Never 0 255
1.61user 0.00system 0:01.63elapsed 98%CPU (0avgtext+0avgdata 9572maxresident)k
$ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
Herd options: -conf linux-kernel.cfg
Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Never 0 255
1.84user 0.01system 0:01.92elapsed 96%CPU (0avgtext+0avgdata 10112maxresident)k

> > I prefer this because we already treat "gp" as "strong-fence", which
> > already is a "rcu-link".
>
> That's a good point; it had not occurred to me.

And if I remove the "gp" but leave the last line, it does properly
classify the two new litmus tests.

							Thanx, Paul

> > Also, recursively extending rcu-fence with
> > itself is exactly calculating the transitive closure, which we can avoid
> > by using a "acycle" rule. Besides, it looks more consistent with hb and
> > pb.
>
> That _had_ occurred to me.  But I couldn't see any way to do it while
> still defining rcu-fence correctly.

------------------------------------------------------------------------

diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
index 1e5c4653dd12..75d3c225146c 100644
--- a/tools/memory-model/linux-kernel.cat
+++ b/tools/memory-model/linux-kernel.cat
@@ -106,12 +106,11 @@ let rcu-link = hb* ; pb* ; prop
  * Any sequence containing at least as many grace periods as RCU read-side
  * critical sections (joined by rcu-link) acts as a generalized strong fence.
  *)
-let rec rcu-fence = gp |
+let rec rcu-fence =
 	(gp ; rcu-link ; rscs) |
 	(rscs ; rcu-link ; gp) |
 	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
-	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
-	(rcu-fence ; rcu-link ; rcu-fence)
+	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp)
 
 (* rb orders instructions just as pb does *)
 let rb = prop ; rcu-fence ; hb* ; pb*
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence 2018-03-01 17:49 ` Paul E. McKenney @ 2018-03-01 18:37 ` Paul E. McKenney 2018-03-02 4:31 ` Boqun Feng 0 siblings, 1 reply; 13+ messages in thread From: Paul E. McKenney @ 2018-03-01 18:37 UTC (permalink / raw) To: Alan Stern Cc: Boqun Feng, LKMM Maintainers -- Akira Yokosawa, Andrea Parri, David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin, Peter Zijlstra, Will Deacon, Kernel development list On Thu, Mar 01, 2018 at 09:49:06AM -0800, Paul E. McKenney wrote: > On Thu, Mar 01, 2018 at 10:49:05AM -0500, Alan Stern wrote: > > On Thu, 1 Mar 2018, Boqun Feng wrote: > > > > > > +let rec rcu-fence = gp | > > > > + (gp ; rcu-link ; rscs) | > > > > + (rscs ; rcu-link ; gp) | > > > > + (gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) | > > > > + (rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) | > > > > + (rcu-fence ; rcu-link ; rcu-fence) > > > > + > > > > +(* rb orders instructions just as pb does *) > > > > +let rb = prop ; rcu-fence ; hb* ; pb* > > > > > > > > irreflexive rb as rcu > > > > > > I wonder whether we can simplify things as: > > > > > > let rec rcu-fence = > > > (gp; rcu-link; rscs) | > > > (rscs; rcu-link; gp) | > > > (gp; rcu-link; rcu-fence; rcu-link; rscs) | > > > (rscs; rcu-link; rcu-fence; rcu-link; gp) > > > > > > (* gp and rcu-fence; rcu-link; rcu-fence removed *) > > > > > > let rb = prop; rcu-fence; hb*; pb* > > > > > > acycle rb as rcu > > > > > > In this way, "rcu-fence" is defined as "any sequence containing as many > > > grace periods as RCU read-side critical sections (joined by rcu-link)." > > > Note that "rcu-link" contains "gp", so we don't miss the case where > > > there are more grace periods. And since we use "acycle" now, so we don't > > > need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively. 
> > > > Would this definition of rcu-fence work for a sequence such as (leaving > > out the intermediate rcu-link parts): > > > > gp gp gp rscs rscs gp rscs rscs > > > > ? I don't think it would. Yes, if you had a cycle of that form then > > your "rcu" axiom would detect it, but at some point we might want to > > use rcu-fence for some other purpose, one that doesn't involve cycles. > > Let's see, that would map to this: > > auto/RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus > > And no, there is no such automatically generated litmus test. Let's > try reversing the "gp" and "rscs", which should have the same effect > courtesy of symmetry: > > auto/RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus > > And that one doesn't exist, either. So much for random test generation! :-/ > > Clearly time to add them. And here is what herd has to say about them: > > l$ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus > Herd options: -conf linux-kernel.cfg > Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Sometimes 1 255 > ^^^ Unexpected non-Never verification > 0inputs+32outputs (0major+2605minor)pagefaults 0swaps > $ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus > Herd options: -conf linux-kernel.cfg > Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Sometimes 1 255 > ^^^ Unexpected non-Never verification > 0inputs+32outputs (0major+2620minor)pagefaults 0swaps > > In other words, they are in fact misclassified as "Sometimes" when they > should be "Never". I have my diffs below in case I misapplied Boqun's > change. 
> > With Alan's original formulation, these two litmus tests are correctly > handled: > > $ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus > Herd options: -conf linux-kernel.cfg > Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Never 0 255 > 1.61user 0.00system 0:01.63elapsed 98%CPU (0avgtext+0avgdata 9572maxresident)k > $ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus > Herd options: -conf linux-kernel.cfg > Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Never 0 255 > 1.84user 0.01system 0:01.92elapsed 96%CPU (0avgtext+0avgdata 10112maxresident)k And as Andrea pointed out off-list, I did indeed mess up Boqun's change. I forgot to change the "irreflexive" into "acyclic". Applying that change makes everything work. Please accept my apologies for my confusion! Thanx, Paul > > > I prefer this because we already treat "gp" as "strong-fence", which > > > already is a "rcu-link". > > > > That's a good point; it had not occurred to me. > > And if I remove the "gp" but leave the last line, it does properly > classify the two new litmus tests. > > Thanx, Paul > > > > Also, recurisively extending rcu-fence with > > > itself is exactly calculating the transitive closure, which we can avoid > > > by using a "acycle" rule. Besides, it looks more consistent with hb and > > > pb. > > > > That _had_ occurred to me. But I couldn't see any way to do it while > > still defining rcu-fence correctly. 
> > ------------------------------------------------------------------------ > > diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat > index 1e5c4653dd12..75d3c225146c 100644 > --- a/tools/memory-model/linux-kernel.cat > +++ b/tools/memory-model/linux-kernel.cat > @@ -106,12 +106,11 @@ let rcu-link = hb* ; pb* ; prop > * Any sequence containing at least as many grace periods as RCU read-side > * critical sections (joined by rcu-link) acts as a generalized strong fence. > *) > -let rec rcu-fence = gp | > +let rec rcu-fence = > (gp ; rcu-link ; rscs) | > (rscs ; rcu-link ; gp) | > (gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) | > - (rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) | > - (rcu-fence ; rcu-link ; rcu-fence) > + (rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) > > (* rb orders instructions just as pb does *) > let rb = prop ; rcu-fence ; hb* ; pb* ^ permalink raw reply [flat|nested] 13+ messages in thread
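The difference between checking "irreflexive rb" and "acyclic rb" against the simplified rcu-fence can be sketched in Python with a toy model. It is an illustration under stated assumptions only: it does not run herd, it encodes a cycle as a string of "G" (gp) and "R" (rscs) with the rcu-link steps left implicit, and the helper names (make_fence, irreflexive_catches, acyclic_catches) are hypothetical. In the model, "irreflexive" means a single rcu-fence must span the whole cycle, while "acyclic" means the cycle may be cut into one or more consecutive rcu-fences.

```python
from functools import lru_cache


def make_fence(allow_gp, allow_concat):
    """Build a toy rcu-fence membership test over strings of 'G'/'R'."""
    @lru_cache(maxsize=None)
    def fence(word):
        if allow_gp and word == "G":
            return True
        if word in ("GR", "RG"):
            return True
        # gp ; rcu-fence ; rscs   or   rscs ; rcu-fence ; gp
        if len(word) >= 3 and word[0] != word[-1] and fence(word[1:-1]):
            return True
        if allow_concat:
            return any(fence(word[:k]) and fence(word[k:])
                       for k in range(1, len(word)))
        return False
    return fence


def rotations(cycle):
    return [cycle[i:] + cycle[:i] for i in range(len(cycle))]


def irreflexive_catches(cycle, fence):
    # One rb link must close the cycle: some rotation is a single fence.
    return any(fence(rot) for rot in rotations(cycle))


def acyclic_catches(cycle, fence):
    # A chain of rb links must close the cycle: some rotation splits
    # into one or more consecutive fences.
    for rot in rotations(cycle):
        ok = [True] + [False] * len(rot)
        for j in range(1, len(rot) + 1):
            ok[j] = any(ok[i] and fence(rot[i:j]) for i in range(j))
        if ok[-1]:
            return True
    return False


patch_def = make_fence(allow_gp=True, allow_concat=True)
simplified = make_fence(allow_gp=False, allow_concat=False)
no_gp_only = make_fence(allow_gp=False, allow_concat=True)

cycle = "GGGRRGRR"  # models RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R
print("patch definition, irreflexive     :", irreflexive_catches(cycle, patch_def))
print("simplified, irreflexive           :", irreflexive_catches(cycle, simplified))
print("simplified, acyclic               :", acyclic_catches(cycle, simplified))
print("gp removed, last line kept, irrefl:", irreflexive_catches(cycle, no_gp_only))
```

Within this model, only the combination of the simplified definition with the irreflexive check fails to flag the cycle, which is consistent with the "Sometimes" results reported above before "irreflexive" was changed to "acyclic".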
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-01 18:37 ` Paul E. McKenney
@ 2018-03-02 4:31 ` Boqun Feng
  2018-03-02 4:50 ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Boqun Feng @ 2018-03-02 4:31 UTC
To: Paul E. McKenney
Cc: Alan Stern, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
    David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
    Peter Zijlstra, Will Deacon, Kernel development list

On Thu, Mar 01, 2018 at 10:37:58AM -0800, Paul E. McKenney wrote:
> On Thu, Mar 01, 2018 at 09:49:06AM -0800, Paul E. McKenney wrote:
> > On Thu, Mar 01, 2018 at 10:49:05AM -0500, Alan Stern wrote:
> > > On Thu, 1 Mar 2018, Boqun Feng wrote:
> > >
> > > > > +let rec rcu-fence = gp |
> > > > > +	(gp ; rcu-link ; rscs) |
> > > > > +	(rscs ; rcu-link ; gp) |
> > > > > +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > > > > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > > > > +	(rcu-fence ; rcu-link ; rcu-fence)
> > > > > +
> > > > > +(* rb orders instructions just as pb does *)
> > > > > +let rb = prop ; rcu-fence ; hb* ; pb*
> > > > >
> > > > >  irreflexive rb as rcu
> > > >
> > > > I wonder whether we can simplify things as:
> > > >
> > > > let rec rcu-fence =
> > > > 	(gp; rcu-link; rscs) |
> > > > 	(rscs; rcu-link; gp) |
> > > > 	(gp; rcu-link; rcu-fence; rcu-link; rscs) |
> > > > 	(rscs; rcu-link; rcu-fence; rcu-link; gp)
> > > >
> > > > (* gp and rcu-fence; rcu-link; rcu-fence removed *)
> > > >
> > > > let rb = prop; rcu-fence; hb*; pb*
> > > >
> > > > acycle rb as rcu
> > > >
> > > > In this way, "rcu-fence" is defined as "any sequence containing as many
> > > > grace periods as RCU read-side critical sections (joined by rcu-link)."
> > > > Note that "rcu-link" contains "gp", so we don't miss the case where
> > > > there are more grace periods. And since we use "acycle" now, so we don't
> > > > need "rcu-fence; rcu-link; rcu-fence" to build "rcu-fence" recursively.
> > >
> > > Would this definition of rcu-fence work for a sequence such as (leaving
> > > out the intermediate rcu-link parts):
> > >
> > > 	gp gp gp rscs rscs gp rscs rscs
> > >
> > > ? I don't think it would. Yes, if you had a cycle of that form then

Right.

> > > your "rcu" axiom would detect it, but at some point we might want to
> > > use rcu-fence for some other purpose, one that doesn't involve cycles.

OK, and I've not yet found another simple way to express rcu-fence for
purposes other than cycle-checking. So I'm OK to leave it as it is
except removing the redundant "gp" in the rcu-fence definition. But I will
continue to search for an easier and sufficient way to define these
things ;-)

> >
> > Let's see, that would map to this:
> >
> > 	auto/RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
> >
> > And no, there is no such automatically generated litmus test. Let's
> > try reversing the "gp" and "rscs", which should have the same effect
> > courtesy of symmetry:
> >
> > 	auto/RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
> >
> > And that one doesn't exist, either. So much for random test generation! :-/
> >
> > Clearly time to add them. And here is what herd has to say about them:
> >
> > l$ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
> > Herd options: -conf linux-kernel.cfg
> > Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Sometimes 1 255
> > ^^^ Unexpected non-Never verification
> > 0inputs+32outputs (0major+2605minor)pagefaults 0swaps
> > $ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
> > Herd options: -conf linux-kernel.cfg
> > Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Sometimes 1 255
> > ^^^ Unexpected non-Never verification
> > 0inputs+32outputs (0major+2620minor)pagefaults 0swaps
> >
> > In other words, they are in fact misclassified as "Sometimes" when they
> > should be "Never". I have my diffs below in case I misapplied Boqun's
> > change.
> >
> > With Alan's original formulation, these two litmus tests are correctly
> > handled:
> >
> > $ sh scripts/checklitmus.sh /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R.litmus
> > Herd options: -conf linux-kernel.cfg
> > Observation /tmp/auto/C-RW-G+RW-G+RW-G+RW-R+RW-R+RW-G+RW-R+RW-R Never 0 255
> > 1.61user 0.00system 0:01.63elapsed 98%CPU (0avgtext+0avgdata 9572maxresident)k
> > $ sh scripts/checklitmus.sh /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G.litmus
> > Herd options: -conf linux-kernel.cfg
> > Observation /tmp/auto/C-RW-R+RW-R+RW-R+RW-G+RW-G+RW-R+RW-G+RW-G Never 0 255
> > 1.84user 0.01system 0:01.92elapsed 96%CPU (0avgtext+0avgdata 10112maxresident)k
>
> And as Andrea pointed out off-list, I did indeed mess up Boqun's change.
> I forgot to change the "irreflexive" into "acyclic". Applying that change
> makes everything work.
>
> Please accept my apologies for my confusion!
>

np, also I should have provided a proper patch for your testing.

For this Alan's patch, feel free to add:

Reviewed-by: Boqun Feng <boqun.feng@gmail.com>

Regards,
Boqun

> 							Thanx, Paul
>
> > > > I prefer this because we already treat "gp" as "strong-fence", which
> > > > already is a "rcu-link".
> > >
> > > That's a good point; it had not occurred to me.
> >
> > And if I remove the "gp" but leave the last line, it does properly
> > classify the two new litmus tests.
> >
> > 							Thanx, Paul
> >
> > > > Also, recurisively extending rcu-fence with
> > > > itself is exactly calculating the transitive closure, which we can avoid
> > > > by using a "acycle" rule. Besides, it looks more consistent with hb and
> > > > pb.
> > >
> > > That _had_ occurred to me. But I couldn't see any way to do it while
> > > still defining rcu-fence correctly.
> >
> > ------------------------------------------------------------------------
> >
> > diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
> > index 1e5c4653dd12..75d3c225146c 100644
> > --- a/tools/memory-model/linux-kernel.cat
> > +++ b/tools/memory-model/linux-kernel.cat
> > @@ -106,12 +106,11 @@ let rcu-link = hb* ; pb* ; prop
> >   * Any sequence containing at least as many grace periods as RCU read-side
> >   * critical sections (joined by rcu-link) acts as a generalized strong fence.
> >   *)
> > -let rec rcu-fence = gp |
> > +let rec rcu-fence =
> >  	(gp ; rcu-link ; rscs) |
> >  	(rscs ; rcu-link ; gp) |
> >  	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > -	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > -	(rcu-fence ; rcu-link ; rcu-fence)
> > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp)
> >  
> >  (* rb orders instructions just as pb does *)
> >  let rb = prop ; rcu-fence ; hb* ; pb*
>
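Alan's question about the sequence "gp gp gp rscs rscs gp rscs rscs" can be checked by hand, but a small sketch may make it easier to follow. The Python below is not part of the thread and is not how herd evaluates the model. It abstracts each gp link to the letter G and each rscs link to the letter R, drops the rcu-link separators (as Alan does in his example), and asks whether a string can be built by the recursive cases of each rcu-fence definition. The names make_recognizer, alan, and boqun are hypothetical. The abstraction ignores Boqun's observation that rcu-link can itself contain gp, so it only speaks to sequences in which the gp and rscs counts are compared directly.

```python
from functools import lru_cache


def make_recognizer(allow_bare_gp, allow_concat):
    """Build a recognizer for an abstracted rcu-fence definition.

    Strings use 'G' for a gp link and 'R' for an rscs link; the rcu-link
    separators are left out.  This is an illustrative abstraction only.
    """

    @lru_cache(maxsize=None)
    def derives(s):
        # Case: rcu-fence = gp (present in Alan's version only).
        if allow_bare_gp and s == "G":
            return True
        # Cases: (gp ; rcu-link ; rscs) and (rscs ; rcu-link ; gp).
        if s in ("GR", "RG"):
            return True
        # Cases: gp ; ... rcu-fence ... ; rscs  and  rscs ; ... rcu-fence ... ; gp.
        if len(s) >= 3 and {s[0], s[-1]} == {"G", "R"} and derives(s[1:-1]):
            return True
        # Case: (rcu-fence ; rcu-link ; rcu-fence), Alan's version only.
        if allow_concat:
            return any(derives(s[:i]) and derives(s[i:]) for i in range(1, len(s)))
        return False

    return derives


# Alan's definition from the patch, and the reduced one proposed in review.
alan = make_recognizer(allow_bare_gp=True, allow_concat=True)
boqun = make_recognizer(allow_bare_gp=False, allow_concat=False)

for seq in ("GGGRRGRR", "RRRGGRGG"):
    print(seq, "alan:", alan(seq), "reduced:", boqun(seq))
```

Under this abstraction, Alan's definition derives both sequences (for example G (G (GR RG) R) R), while the reduced definition gets stuck on the inner "GRRG", which starts and ends with the same letter. This only concerns rcu-fence as a standalone relation; as the thread notes, the cycle check itself still works with the reduced definition once "irreflexive" is replaced by "acyclic".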
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-02 4:31 ` Boqun Feng
@ 2018-03-02 4:50 ` Paul E. McKenney
  2018-03-02 15:17 ` Alan Stern
  0 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2018-03-02 4:50 UTC
To: Boqun Feng
Cc: Alan Stern, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
    David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
    Peter Zijlstra, Will Deacon, Kernel development list

On Fri, Mar 02, 2018 at 12:31:41PM +0800, Boqun Feng wrote:
> On Thu, Mar 01, 2018 at 10:37:58AM -0800, Paul E. McKenney wrote:
> > On Thu, Mar 01, 2018 at 09:49:06AM -0800, Paul E. McKenney wrote:

[ . . . ]

> > And as Andrea pointed out off-list, I did indeed mess up Boqun's change.
> > I forgot to change the "irreflexive" into "acyclic". Applying that change
> > makes everything work.
> >
> > Please accept my apologies for my confusion!
> >
>
> np, also I should have provided a proper patch for your testing.
>
> For this Alan's patch, feel free to add:
>
> Reviewed-by: Boqun Feng <boqun.feng@gmail.com>

Alan's last submission was still RFC, so I have not yet queued it.
So this ball is still in Alan's court.

							Thanx, Paul

> Regards,
> Boqun
>
> > 							Thanx, Paul
> >
> > > > > I prefer this because we already treat "gp" as "strong-fence", which
> > > > > already is a "rcu-link".
> > > >
> > > > That's a good point; it had not occurred to me.
> > >
> > > And if I remove the "gp" but leave the last line, it does properly
> > > classify the two new litmus tests.
> > >
> > > 							Thanx, Paul
> > >
> > > > > Also, recurisively extending rcu-fence with
> > > > > itself is exactly calculating the transitive closure, which we can avoid
> > > > > by using a "acycle" rule. Besides, it looks more consistent with hb and
> > > > > pb.
> > > >
> > > > That _had_ occurred to me. But I couldn't see any way to do it while
> > > > still defining rcu-fence correctly.
> > >
> > > ------------------------------------------------------------------------
> > >
> > > diff --git a/tools/memory-model/linux-kernel.cat b/tools/memory-model/linux-kernel.cat
> > > index 1e5c4653dd12..75d3c225146c 100644
> > > --- a/tools/memory-model/linux-kernel.cat
> > > +++ b/tools/memory-model/linux-kernel.cat
> > > @@ -106,12 +106,11 @@ let rcu-link = hb* ; pb* ; prop
> > >   * Any sequence containing at least as many grace periods as RCU read-side
> > >   * critical sections (joined by rcu-link) acts as a generalized strong fence.
> > >   *)
> > > -let rec rcu-fence = gp |
> > > +let rec rcu-fence =
> > >  	(gp ; rcu-link ; rscs) |
> > >  	(rscs ; rcu-link ; gp) |
> > >  	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> > > -	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> > > -	(rcu-fence ; rcu-link ; rcu-fence)
> > > +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp)
> > >  
> > >  (* rb orders instructions just as pb does *)
> > >  let rb = prop ; rcu-fence ; hb* ; pb*
> >
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-02 4:50 ` Paul E. McKenney
@ 2018-03-02 15:17 ` Alan Stern
  2018-03-02 17:38 ` Paul E. McKenney
  0 siblings, 1 reply; 13+ messages in thread
From: Alan Stern @ 2018-03-02 15:17 UTC
To: Paul E. McKenney
Cc: Boqun Feng, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
    David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
    Peter Zijlstra, Will Deacon, Kernel development list

On Thu, 1 Mar 2018, Paul E. McKenney wrote:

> On Fri, Mar 02, 2018 at 12:31:41PM +0800, Boqun Feng wrote:
> > On Thu, Mar 01, 2018 at 10:37:58AM -0800, Paul E. McKenney wrote:
> > > On Thu, Mar 01, 2018 at 09:49:06AM -0800, Paul E. McKenney wrote:
>
> [ . . . ]
>
> > > And as Andrea pointed out off-list, I did indeed mess up Boqun's change.
> > > I forgot to change the "irreflexive" into "acyclic". Applying that change
> > > makes everything work.
> > >
> > > Please accept my apologies for my confusion!
> > >
> >
> > np, also I should have provided a proper patch for your testing.
> >
> > For this Alan's patch, feel free to add:
> >
> > Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
>
> Alan's last submission was still RFC, so I have not yet queued it.
> So this ball is still in Alan's court.

I'll wait a few more days to see if there are any other comments and
then submit it officially.

Alan
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-03-02 15:17 ` Alan Stern
@ 2018-03-02 17:38 ` Paul E. McKenney
  0 siblings, 0 replies; 13+ messages in thread
From: Paul E. McKenney @ 2018-03-02 17:38 UTC
To: Alan Stern
Cc: Boqun Feng, LKMM Maintainers -- Akira Yokosawa, Andrea Parri,
    David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
    Peter Zijlstra, Will Deacon, Kernel development list

On Fri, Mar 02, 2018 at 10:17:55AM -0500, Alan Stern wrote:
> On Thu, 1 Mar 2018, Paul E. McKenney wrote:
>
> > On Fri, Mar 02, 2018 at 12:31:41PM +0800, Boqun Feng wrote:
> > > On Thu, Mar 01, 2018 at 10:37:58AM -0800, Paul E. McKenney wrote:
> > > > On Thu, Mar 01, 2018 at 09:49:06AM -0800, Paul E. McKenney wrote:
> >
> > [ . . . ]
> >
> > > > And as Andrea pointed out off-list, I did indeed mess up Boqun's change.
> > > > I forgot to change the "irreflexive" into "acyclic". Applying that change
> > > > makes everything work.
> > > >
> > > > Please accept my apologies for my confusion!
> > > >
> > >
> > > np, also I should have provided a proper patch for your testing.
> > >
> > > For this Alan's patch, feel free to add:
> > >
> > > Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
> >
> > Alan's last submission was still RFC, so I have not yet queued it.
> > So this ball is still in Alan's court.
>
> I'll wait a few more days to see if there are any other comments and
> then submit it officially.

Will do!

							Thanx, Paul
* Re: [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence
  2018-02-28 20:13 [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence Alan Stern
  2018-03-01 1:55 ` Boqun Feng
@ 2018-03-13 13:56 ` Andrea Parri
  1 sibling, 0 replies; 13+ messages in thread
From: Andrea Parri @ 2018-03-13 13:56 UTC
To: Alan Stern
Cc: LKMM Maintainers -- Akira Yokosawa, Boqun Feng, David Howells,
    Jade Alglave, Luc Maranget, Nicholas Piggin, Paul E. McKenney,
    Peter Zijlstra, Will Deacon, Kernel development list

On Wed, Feb 28, 2018 at 03:13:54PM -0500, Alan Stern wrote:
> This patch reorganizes the definition of rb in the Linux Kernel Memory
> Consistency Model. The relation is now expressed in terms of
> rcu-fence, which consists of a sequence of gp and rscs links separated
> by rcu-link links, in which the number of occurrences of gp is >= the
> number of occurrences of rscs.
>
> Arguments similar to those published in
> http://diy.inria.fr/linux/long.pdf show that rcu-fence behaves like an
> inter-CPU strong fence. Furthermore, the definition of rb in terms of
> rcu-fence is highly analogous to the definition of pb in terms of
> strong-fence, which can help explain why rcu-path expresses a form of
> temporal ordering.
>
> This change should not affect the semantics of the memory model, just
> its internal organization.
>
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

I like Boqun's suggestion of "reducing rcu-fence" and using "acyclic".
IIRC, some time ago we discussed "enlarging" hb, pb by defining them to be
transitively closed (and using "irreflexive" everywhere); however, this
resulted in slightly longer simulation times...

For this patch,

Reviewed-by: Andrea Parri <parri.andrea@gmail.com>

  Andrea

>
> ---
>
> v2: Rebase on top of the preceding patch which renames "link" to
> "rcu-link" and "rcu-path" to "rb". Add back the missing "rec" keyword
> in the definition of rcu-fence. Minor editing improvements in
> explanation.txt.
>
> Index: usb-4.x/tools/memory-model/linux-kernel.cat
> ===================================================================
> --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> +++ usb-4.x/tools/memory-model/linux-kernel.cat
> @@ -102,20 +102,27 @@ let rscs = po ; crit^-1 ; po?
>   *)
>  let rcu-link = hb* ; pb* ; prop
>  
> -(* Chains that affect the RCU grace-period guarantee *)
> -let gp-link = gp ; rcu-link
> -let rscs-link = rscs ; rcu-link
> -
>  (*
> - * A cycle containing at least as many grace periods as RCU read-side
> - * critical sections is forbidden.
> + * Any sequence containing at least as many grace periods as RCU read-side
> + * critical sections (joined by rcu-link) acts as a generalized strong fence.
>   *)
> -let rec rb =
> -	gp-link |
> -	(gp-link ; rscs-link) |
> -	(rscs-link ; gp-link) |
> -	(rb ; rb) |
> -	(gp-link ; rb ; rscs-link) |
> -	(rscs-link ; rb ; gp-link)
> +let rec rcu-fence = gp |
> +	(gp ; rcu-link ; rscs) |
> +	(rscs ; rcu-link ; gp) |
> +	(gp ; rcu-link ; rcu-fence ; rcu-link ; rscs) |
> +	(rscs ; rcu-link ; rcu-fence ; rcu-link ; gp) |
> +	(rcu-fence ; rcu-link ; rcu-fence)
> +
> +(* rb orders instructions just as pb does *)
> +let rb = prop ; rcu-fence ; hb* ; pb*
>  
>  irreflexive rb as rcu
> +
> +(*
> + * The happens-before, propagation, and rcu constraints are all
> + * expressions of temporal ordering. They could be replaced by
> + * a single constraint on an "executes-before" relation, xb:
> + *
> + * let xb = hb | pb | rb
> + * acyclic xb as executes-before
> + *)
> Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
> ===================================================================
> --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
> +++ usb-4.x/tools/memory-model/Documentation/explanation.txt
> @@ -27,7 +27,7 @@ Explanation of the Linux-Kernel Memory C
>    19. AND THEN THERE WAS ALPHA
>    20. THE HAPPENS-BEFORE RELATION: hb
>    21. THE PROPAGATES-BEFORE RELATION: pb
> -  22. RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> +  22. RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
>    23. ODDS AND ENDS
>  
>  
> @@ -1451,8 +1451,8 @@ they execute means that it cannot have c
>  the content of the LKMM's "propagation" axiom.
>  
>  
> -RCU RELATIONS: rcu-link, gp-link, rscs-link, and rb
> ----------------------------------------------------
> +RCU RELATIONS: rcu-link, gp, rscs, rcu-fence, and rb
> +----------------------------------------------------
>  
>  RCU (Read-Copy-Update) is a powerful synchronization mechanism. It
>  rests on two concepts: grace periods and read-side critical sections.
> @@ -1537,49 +1537,100 @@ relation, and the details don't matter u
>  a somewhat lengthy formal proof. Pretty much all you need to know
>  about rcu-link is the information in the preceding paragraph.
>  
> -The LKMM goes on to define the gp-link and rscs-link relations. They
> -bring grace periods and read-side critical sections into the picture,
> -in the following way:
> -
> -	E ->gp-link F means there is a synchronize_rcu() fence event S
> -	and an event X such that E ->po S, either S ->po X or S = X,
> -	and X ->rcu-link F. In other words, E and F are linked by a
> -	grace period followed by an instance of rcu-link.
> -
> -	E ->rscs-link F means there is a critical section delimited by
> -	an rcu_read_lock() fence L and an rcu_read_unlock() fence U,
> -	and an event X such that E ->po U, either L ->po X or L = X,
> -	and X ->rcu-link F. Roughly speaking, this says that some
> -	event in the same critical section as E is linked by rcu-link
> -	to F.
> +The LKMM also defines the gp and rscs relations. They bring grace
> +periods and read-side critical sections into the picture, in the
> +following way:
> +
> +	E ->gp F means there is a synchronize_rcu() fence event S such
> +	that E ->po S and either S ->po F or S = F. In simple terms,
> +	there is a grace period po-between E and F.
> +
> +	E ->rscs F means there is a critical section delimited by an
> +	rcu_read_lock() fence L and an rcu_read_unlock() fence U, such
> +	that E ->po U and either L ->po F or L = F. You can think of
> +	this as saying that E and F are in the same critical section
> +	(in fact, it also allows E to be po-before the start of the
> +	critical section and F to be po-after the end).
>  
>  If we think of the rcu-link relation as standing for an extended
> -"before", then E ->gp-link F says that E executes before a grace
> -period which ends before F executes. (In fact it covers more than
> -this, because it also includes cases where E executes before a grace
> -period and some store propagates to F's CPU before F executes and
> -doesn't propagate to some other CPU until after the grace period
> -ends.) Similarly, E ->rscs-link F says that E is part of (or before
> -the start of) a critical section which starts before F executes.
> +"before", then X ->gp Y ->rcu-link Z says that X executes before a
> +grace period which ends before Z executes. (In fact it covers more
> +than this, because it also includes cases where X executes before a
> +grace period and some store propagates to Z's CPU before Z executes
> +but doesn't propagate to some other CPU until after the grace period
> +ends.) Similarly, X ->rscs Y ->rcu-link Z says that X is part of (or
> +before the start of) a critical section which starts before Z
> +executes.
> +
> +The LKMM goes on to define the rcu-fence relation as a sequence of gp
> +and rscs links separated by rcu-link links, in which the number of gp
> +links is >= the number of rscs links. For example:
> +
> +	X ->gp Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
> +
> +would imply that X ->rcu-fence V, because this sequence contains two
> +gp links and only one rscs link. (It also implies that X ->rcu-fence T
> +and Z ->rcu-fence V.) On the other hand:
> +
> +	X ->rscs Y ->rcu-link Z ->rscs T ->rcu-link U ->gp V
> +
> +does not imply X ->rcu-fence V, because the sequence contains only
> +one gp link but two rscs links.
> +
> +The rcu-fence relation is important because the Grace Period Guarantee
> +means that rcu-fence acts kind of like a strong fence. In particular,
> +if W is a write and we have W ->rcu-fence Z, the Guarantee says that W
> +will propagate to every CPU before Z executes.
> +
> +To prove this in full generality requires some intellectual effort.
> +We'll consider just a very simple case:
> +
> +	W ->gp X ->rcu-link Y ->rscs Z.
> +
> +This formula means that there is a grace period G and a critical
> +section C such that:
> +
> +	1. W is po-before G;
> +
> +	2. X is equal to or po-after G;
> +
> +	3. X comes "before" Y in some sense;
> +
> +	4. Y is po-before the end of C;
> +
> +	5. Z is equal to or po-after the start of C.
> +
> +From 2 - 4 we deduce that the grace period G ends before the critical
> +section C. Then the second part of the Grace Period Guarantee says
> +not only that G starts before C does, but also that W (which executes
> +on G's CPU before G starts) must propagate to every CPU before C
> +starts. In particular, W propagates to every CPU before Z executes
> +(or finishes executing, in the case where Z is equal to the
> +rcu_read_lock() fence event which starts C.) This sort of reasoning
> +can be expanded to handle all the situations covered by rcu-fence.
> +
> +Finally, the LKMM defines the RCU-before (rb) relation in terms of
> +rcu-fence. This is done in essentially the same way as the pb
> +relation was defined in terms of strong-fence. We will omit the
> +details; the end result is that E ->rb F implies E must execute before
> +F, just as E ->pb F does (and for much the same reasons).
>  
>  Putting this all together, the LKMM expresses the Grace Period
> -Guarantee by requiring that there are no cycles consisting of gp-link
> -and rscs-link links in which the number of gp-link instances is >= the
> -number of rscs-link instances. It does this by defining the rb
> -relation to link events E and F whenever it is possible to pass from E
> -to F by a sequence of gp-link and rscs-link links with at least as
> -many of the former as the latter. The LKMM's "rcu" axiom then says
> -that there are no events E with E ->rb E.
> -
> -Justifying this axiom takes some intellectual effort, but it is in
> -fact a valid formalization of the Grace Period Guarantee. We won't
> -attempt to go through the detailed argument, but the following
> -analysis gives a taste of what is involved. Suppose we have a
> -violation of the first part of the Guarantee: A critical section
> -starts before a grace period, and some store propagates to the
> -critical section's CPU before the end of the critical section but
> -doesn't propagate to some other CPU until after the end of the grace
> -period.
> +Guarantee by requiring that the rb relation does not contain a cycle.
> +Equivalently, this "rcu" axiom requires that there are no events E and
> +F with E ->rcu-link F ->rcu-fence E. Or to put it a third way, the
> +axiom requires that there are no cycles consisting of gp and rscs
> +alternating with rcu-link, where the number of gp links is >= the
> +number of rscs links.
> +
> +Justifying the axiom isn't easy, but it is in fact a valid
> +formalization of the Grace Period Guarantee. We won't attempt to go
> +through the detailed argument, but the following analysis gives a
> +taste of what is involved. Suppose we have a violation of the first
> +part of the Guarantee: A critical section starts before a grace
> +period, and some store propagates to the critical section's CPU before
> +the end of the critical section but doesn't propagate to some other
> +CPU until after the end of the grace period.
>  
>  Putting symbols to these ideas, let L and U be the rcu_read_lock() and
>  rcu_read_unlock() fence events delimiting the critical section in
> @@ -1606,11 +1657,14 @@ by rcu-link, yielding:
>  
>  	S ->po X ->rcu-link Z ->po U.
>  
> -The formulas say that S is po-between F and X, hence F ->gp-link Z
> -via X. They also say that Z comes before the end of the critical
> -section and E comes after its start, hence Z ->rscs-link F via E. But
> -now we have a forbidden cycle: F ->gp-link Z ->rscs-link F. Thus the
> -"rcu" axiom rules out this violation of the Grace Period Guarantee.
> +The formulas say that S is po-between F and X, hence F ->gp X. They
> +also say that Z comes before the end of the critical section and E
> +comes after its start, hence Z ->rscs E. From all this we obtain:
> +
> +	F ->gp X ->rcu-link Z ->rscs E ->rcu-link F,
> +
> +a forbidden cycle. Thus the "rcu" axiom rules out this violation of
> +the Grace Period Guarantee.
>  
>  For something a little more down-to-earth, let's see how the axiom
>  works out in practice. Consider the RCU code example from above, this
> @@ -1639,15 +1693,15 @@ time with statement labels added to the
>  
>  If r2 = 0 at the end then P0's store at X overwrites the value that
>  P1's load at Z reads from, so we have Z ->fre X and thus Z ->rcu-link X.
>  In addition, there is a synchronize_rcu() between Y and Z, so therefore
> -we have Y ->gp-link X.
> +we have Y ->gp Z.
>  
>  If r1 = 1 at the end then P1's load at Y reads from P0's store at W,
>  so we have W ->rcu-link Y. In addition, W and X are in the same critical
> -section, so therefore we have X ->rscs-link Y.
> +section, so therefore we have X ->rscs W.
>  
> -This gives us a cycle, Y ->gp-link X ->rscs-link Y, with one gp-link
> -and one rscs-link, violating the "rcu" axiom. Hence the outcome is
> -not allowed by the LKMM, as we would expect.
> +Then X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X is a forbidden cycle,
> +violating the "rcu" axiom. Hence the outcome is not allowed by the
> +LKMM, as we would expect.
>  
>  For contrast, let's see what can happen in a more complicated example:
>  
> @@ -1683,15 +1737,11 @@ For contrast, let's see what can happen
>  	}
>  
>  If r0 = r1 = r2 = 1 at the end, then similar reasoning to before shows
> -that W ->rscs-link Y via X, Y ->gp-link U via Z, and U ->rscs-link W
> -via V. And just as before, this gives a cycle:
> -
> -	W ->rscs-link Y ->gp-link U ->rscs-link W.
> -
> -However, this cycle has fewer gp-link instances than rscs-link
> -instances, and consequently the outcome is not forbidden by the LKMM.
> -The following instruction timing diagram shows how it might actually
> -occur:
> +that W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W.
> +However this cycle is not forbidden, because the sequence of relations
> +contains fewer instances of gp (one) than of rscs (two). Consequently
> +the outcome is allowed by the LKMM. The following instruction timing
> +diagram shows how it might actually occur:
>  
>  P0			P1			P2
>  --------------------	--------------------	--------------------
>
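The counting rule in the quoted explanation.txt text (a cycle of gp and rscs links alternating with rcu-link is forbidden when the number of gp links is >= the number of rscs links) can be illustrated with a short sketch. The Python below is not part of the patch and is not how herd evaluates the "rcu" axiom. It only parses chains written in the "A ->rel B" notation used above and compares the two counts. The function names split_chain and forbidden_by_counting_rule are hypothetical, and the sketch assumes the chain is already known to alternate gp/rscs links with rcu-link links.

```python
def split_chain(chain):
    """Split 'A ->rel B ->rel C' into event names and relation names."""
    tokens = chain.split()
    events = tokens[0::2]
    links = [tok[2:] for tok in tokens[1::2]]  # drop the leading '->'
    return events, links


def forbidden_by_counting_rule(chain):
    """Apply the gp >= rscs counting rule to a cycle written as a chain.

    Assumption: the chain alternates gp/rscs links with rcu-link links,
    as in the explanation.txt examples; this is not checked here.
    """
    events, links = split_chain(chain)
    if events[0] != events[-1]:
        raise ValueError("chain does not return to its starting event")
    unknown = [rel for rel in links if rel not in ("gp", "rscs", "rcu-link")]
    if unknown:
        raise ValueError("unexpected relation(s): %s" % unknown)
    gp = links.count("gp")
    rscs = links.count("rscs")
    # Forbidden when at least one gp is present and gp links are not
    # outnumbered by rscs links.
    return gp >= 1 and gp >= rscs


# The two cycles discussed in the quoted explanation.txt text.
two_cpu = "X ->rscs W ->rcu-link Y ->gp Z ->rcu-link X"
three_cpu = "W ->rscs X ->rcu-link Y ->gp Z ->rcu-link U ->rscs V ->rcu-link W"

print("two-CPU cycle forbidden:", forbidden_by_counting_rule(two_cpu))
print("three-CPU cycle forbidden:", forbidden_by_counting_rule(three_cpu))
```

The first cycle has one gp link and one rscs link, so the rule forbids it; the second has one gp link and two rscs links, so this rule does not forbid it, which agrees with the text's conclusion that the outcome is allowed.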
end of thread, other threads:[~2018-03-13 13:56 UTC]

Thread overview: 13+ messages
2018-02-28 20:13 [PATCH 2/2 v2 RFC] tools/memory-model: redefine rb in terms of rcu-fence Alan Stern
2018-03-01 1:55 ` Boqun Feng
2018-03-01 4:49 ` Paul E. McKenney
2018-03-01 8:39 ` Boqun Feng
2018-03-01 14:28 ` Paul E. McKenney
2018-03-01 15:49 ` Alan Stern
2018-03-01 17:49 ` Paul E. McKenney
2018-03-01 18:37 ` Paul E. McKenney
2018-03-02 4:31 ` Boqun Feng
2018-03-02 4:50 ` Paul E. McKenney
2018-03-02 15:17 ` Alan Stern
2018-03-02 17:38 ` Paul E. McKenney
2018-03-13 13:56 ` Andrea Parri