From: Alan Stern <stern@rowland.harvard.edu>
To: Jonas Oberhauser <jonas.oberhauser@huawei.com>
Cc: "paulmck@kernel.org" <paulmck@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
"parri.andrea" <parri.andrea@gmail.com>, will <will@kernel.org>,
"boqun.feng" <boqun.feng@gmail.com>, npiggin <npiggin@gmail.com>,
dhowells <dhowells@redhat.com>, "j.alglave" <j.alglave@ucl.ac.uk>,
"luc.maranget" <luc.maranget@inria.fr>, akiyks <akiyks@gmail.com>,
dlustig <dlustig@nvidia.com>, joel <joel@joelfernandes.org>,
urezki <urezki@gmail.com>,
quic_neeraju <quic_neeraju@quicinc.com>,
frederic <frederic@kernel.org>,
Kernel development list <linux-kernel@vger.kernel.org>
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)
Date: Thu, 19 Jan 2023 11:41:01 -0500
Message-ID: <Y8lynRI35cFeuqb5@rowland.harvard.edu>
In-Reply-To: <a2b89243-ddd7-c114-541f-0aff7806d217@huaweicloud.com>
On Thu, Jan 19, 2023 at 12:22:50PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/19/2023 3:28 AM, Alan Stern wrote:
> > > This is a permanent error; I've given up. Sorry it didn't
> > > work out.
>
> [It seems the e-mail still reached me through the mailing list]
[For everyone else, Jonas is referring to the fact that the last two
emails I sent to his huaweicloud.com address could not be delivered, so
I copied them off-list to his huawei.com address.]
> > > I consider that a hack though and don't like it.
> > It _is_ a bit of a hack, but not a huge one. srcu_read_lock() really
> > is a lot like a load, in that it returns a value obtained by reading
> > something from memory (along with some other operations, though, so it
> > isn't a simple straightforward read -- perhaps more like an
> > atomic_inc_return_relaxed).
> The issue I have with this is that it might create accidental ordering. How
> does it behave when you throw fences in the mix?
I think this isn't going to be a problem. Certainly any real
implementation of srcu_read_lock() is going to involve some actual load
operations, so any unintentional ordering caused by fences will also
apply to real executions. Likewise for srcu_read_unlock() and store
operations.
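To make that concrete, here is a minimal C sketch of the reader-side
entry, loosely patterned on the kernel's __srcu_read_lock(); take the
field names as illustrative assumptions rather than the exact kernel
layout:

	/* Sketch only: a load of the active index plus a per-CPU increment. */
	int srcu_read_lock_sketch(struct srcu_struct *ssp)
	{
		int idx;

		/* An actual load: fetch the currently active index. */
		idx = READ_ONCE(ssp->srcu_idx) & 0x1;
		/* Count this reader on the active side. */
		this_cpu_inc(ssp->sda->srcu_lock_count[idx]);
		smp_mb();	/* Keep the critical section from leaking out. */
		return idx;
	}

Any fence placed near a call like this orders the READ_ONCE() just as it
orders any other load, which is why accidental ordering in the model
would also show up in real executions.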
> It really does not work like an increment at all; I think srcu_read_lock()
> only reads the currently active index, and the index is changed by
> synchronize_srcu(). But even that is an implementation detail of sorts. I think the
> best way to think of it would be for srcu_read_lock to just return an
> arbitrary value.
I think I'll stick with having it always return the initial value. Paul
said that would be okay.
> The user cannot rely on any kind of "accidental" rfe edges between these
> events for ordering.
>
> Perhaps if you flag any use of these values in address or control
> dependencies, as well as any event which depends on more than one of these
> values, you could prove that it's impossible to constrain the behavior
> through these rfe (and/or co) edges, because you can anyway never inspect the
> value returned by the operation (except to pass it into srcu_read_unlock()).
>
> Or you might be able to explicitly eliminate the events everywhere, just
> like you have done for carry-dep in your patch.
On second thought, I'll make it impossible to read from the
srcu_read_unlock events by removing them from the rf (and rfi/rfe)
relation. Then it won't be necessary to change carry-dep or anything
else.
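In cat terms the effect would look something like the sketch below; the
Srcu-unlock event class is an assumption about what herd exposes, and
the auxiliary name srf is mine:

	(* Sketch only: drop every reads-from edge whose source is an
	   srcu_read_unlock() event, so the value it "stores" can never
	   be observed by any other event. *)
	let srf = rf \ ([Srcu-unlock] ; rf)

(In practice the erasure would happen inside herd itself, which is why
nothing in the .cat file would need to change.)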
> But it looks so brittle.
Maybe not so much after this change?
> > srcu_read_unlock() is somewhat less like a store, but it does have one
> > notable similarity: It takes an input value and therefore can be the
> > target of a data dependency. The biggest difference is that an
> > srcu_read_unlock() can't really be read-from. It would be nice if herd
> > had an event type that behaved this way.
> Or if you could declare your own : )
> Obviously, you will have accidental rf edges going from srcu_read_unlock()
> to srcu_read_lock() if you model them this way.
Not when those edges are erased.
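For example, in a fragment like this sketch (ordinary LKMM litmus
syntax), the only sanctioned use of r0 is to hand it back to the unlock,
and with the unlock's rf edges erased no other event can read from it:

	P0(int *x, struct srcu_struct *s)
	{
		int r0;
		int r1;

		r0 = srcu_read_lock(s);
		r1 = READ_ONCE(*x);
		srcu_read_unlock(s, r0);	/* data dependency from the lock */
	}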
> > Also, herd doesn't have any way of saying that the value passed to a
> > store is an unmodified copy of the value obtained by a load. In our
> > case that doesn't matter much -- nobody should be writing litmus tests
> > in which the value returned by srcu_read_lock() is incremented and then
> > decremented again before being passed to srcu_read_unlock()!
>
> It would be nice if herd allowed declaring structs that can be used for such
> purposes.
> (Anyway, I am not sure if Luc is still following everything in this deeply
> nested thread that started somewhere completely different. But maybe if Paul
> or you open a feature request, let me know so that I can give my 2ct.)
I thought you were against adding into herd features that were specific
to the Linux kernel?
> > > Note that if your ordering relies on actually using gp twice in a row, then
> > > these must come from strong-fence, but you should be able to just take the
> > > shortcut by merging them into a single gp.
> > > po;rcu-gp;po;rcu-gp;po <= gp <= strong-fence <= hb & strong-order
> > I don't know what you mean by this. The example above does rely on
> > having two synchronize_rcu() calls; with only one it would be allowed.
>
> I mean that if you have a cycle that is formed by having two adjacent actual
> `gp` edges, like .... ; gp;gp ; .... with gp = po ; rcu-gp ; po?,
> (not like your example, where the cycle uses two *rcu*-gp but no gp edges)
Don't forget that I had in mind a version of the model where rcu-gp did
not exist.
> and assume we define gp' = po ; rcu-gp ; po and hb' and pb' to use gp'
> instead of gp,
> then there are two cases for how that cycle came to be, either 1) as
> ... ; hb;hb ; ....
> but then you can refactor as
> ... ; po;rcu-gp;po;rcu-gp;po ; ...
> ... ; po;rcu-gp; po ; ...
> ... ; gp' ; ...
> ... ; hb' ; ...
> which again creates a cycle, or 2) as
> ... ; pb ; hb ; ...
> coming from
> ... ; prop ; gp ; gp ; ....
> which you can similarly refactor as
> ... ; prop ; po;rcu-gp;po ; ....
> ... ; prop ; gp' ; ....
> and again get a cycle with
> ... ; pb' ; ....
> Therefore, gp = po;rcu-gp;po should be equivalent.
The point is that in P1, we have Write ->(gp;gp) Read, but we do not
have Write ->(gp';gp') Read. Only Write ->gp' Read. So if you're using
gp' instead of gp, you'll analyze the litmus test as if it had only one
grace period but two critical sections, getting a wrong answer.
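To make this concrete, here is a sketch of the sort of test I have in
mind (reconstructed for illustration, not copied from earlier in the
thread):

	C two-gps-two-rscs

	{}

	P0(int *x, int *y)
	{
		int r0;

		rcu_read_lock();
		r0 = READ_ONCE(*x);
		WRITE_ONCE(*y, 1);
		rcu_read_unlock();
	}

	P1(int *x, int *z)
	{
		int r1;

		WRITE_ONCE(*x, 1);
		synchronize_rcu();
		synchronize_rcu();
		r1 = READ_ONCE(*z);
	}

	P2(int *y, int *z)
	{
		int r2;

		rcu_read_lock();
		WRITE_ONCE(*z, 1);
		r2 = READ_ONCE(*y);
		rcu_read_unlock();
	}

	exists (0:r0=0 /\ 1:r1=0 /\ 2:r2=0)

The cycle here contains two grace periods and two critical sections, so
the outcome should be forbidden; remove either synchronize_rcu() and it
becomes allowed. Analyzing it with gp' in place of gp would wrongly
credit P1 with only one grace period.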
Here's a totally different way of thinking about these things, which may
prove enlightening. These thoughts originally occurred to me years ago,
and I had forgotten about them until last night.
If G is a grace period, let's write t1(G) for the time when G starts and
t2(G) for the time when G ends.
Likewise, if C is a read-side critical section, let's write t2(C) for
the time when C starts (or when the lock executes, if you prefer) and
t1(C) for the time when C ends (or when the unlock executes). This terminology
reflects the "backward" role that critical sections play in the memory
model.
Now we can characterize rcu-order and rcu-link in operational terms.
Let A and B each be either a grace period or a read-side critical
section. Then:
A ->rcu-order B means t1(A) < t2(B), and
A ->rcu-link B means t2(A) <= t1(B).
(Of course, we always have t1(X) < t2(X) for any grace period or
critical section X.)
This explains quite a lot. For example, we can justify including
C ->rcu-link G
into rcu-order as follows. From C ->rcu-link G we get that t2(C) <=
t1(G), in other words, C starts when or before G starts. Then the
Fundamental Law of RCU says that C must end before G ends, since
otherwise C would span all of G. Thus t1(C) < t2(G), which is C
->rcu-order G.
The case of G ->rcu-link C is similar.
This also explains why rcu-link can be extended by appending (rcu-order
; rcu-link)*. From X ->rcu-order Y ->rcu-link Z we get that t1(X) <
t2(Y) <= t1(Z) and thus t1(X) <= t1(Z). So if
A ->rcu-link B ->(rcu-order ; rcu-link)* C
then t2(A) <= t1(B) <= t1(C), which justifies A ->rcu-link C.
The same sort of argument shows that rcu-order should be extendable by
appending (rcu-link ; rcu-order)* -- but not (rcu-order ; rcu-link)*.
This also justifies why a lone gp belongs in rcu-order: G ->rcu-order G
holds because t1(G) < t2(G). But for critical sections we have t2(C) <
t1(C) and so C ->rcu-order C does not hold.
Assuming ordinary memory accesses occur in a single instant, you can see why
it makes sense to consider (po ; rcu-order ; po) an ordering. But when
you're comparing grace periods or critical sections to each other,
things get a little ambiguous. Should G1 be considered to come before
G2 when t1(G1) < t1(G2), when t2(G1) < t2(G2), or when t2(G1) < t1(G2)?
Springing for (po ; rcu-order ; po?) amounts to choosing the second
alternative.
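For reference, that is exactly how the model spells the final step
(quoting linux-kernel.cat from memory, so treat this as a paraphrase):

	let rcu-fence = po ; rcu-order ; po?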
Alan