From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
mingo@kernel.org, linux-kernel@vger.kernel.org, tj@kernel.org,
john.stultz@linaro.org, dimitrysh@google.com, romlem@google.com,
ccross@google.com, tkjos@google.com
Subject: Re: [PATCH] rcu_sync: simplify the state machine, introduce __rcu_sync_enter()
Date: Mon, 18 Jul 2016 13:54:27 +0200 [thread overview]
Message-ID: <20160718115427.GB6862@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <20160716171007.GA30000@redhat.com>
On Sat, Jul 16, 2016 at 07:10:07PM +0200, Oleg Nesterov wrote:
> Peter, Paul, could you review? Do you see any hole?
>
> Why. Firstly, note that the state machine was greatly simplified, and
> rsp->cb_state has gone, we have a single "state" variable, gp_state.
> Note also the ->sync() op has gone (and actually ->wait() too, see the
> "TODO" comment).
Yes, that seems like a nice simplification.
> GP_IDLE - owned by __rcu_sync_enter() which can only move this state to
>
> GP_ENTER - owned by rcu-callback which moves it to
>
> GP_PASSED - owned by the last rcu_sync_exit() which moves it to
>
> GP_EXIT - owned by rcu-callback which moves it back to GP_IDLE.
>
> Yes, this is a bit simplified, we also have GP_REPLAY, but hopefully
> clear.
>
> And, there is another possible transition, GP_ENTER -> GP_IDLE, because
> not it is possible to call __rcu_sync_enter() and rcu_sync_exit() in any
> state (except obviously they should be balanced), and they do not block.
>
> The only blocking call is __rcu_sync_wait() which actually waits for GP.
> Obviously should only be called if gp_count != 0, iow after __rcu_sync_enter.
So I'm a complete moron today, to aid the failing brain I draw pictures.
I think I ended up with this:
.----> GP_IDLE <--------------.
| | |
| | __rcu_sync_enter() | <GP> / rcu_sync_exit()
| v |
| GP_ENTER --------------'
| |
| <GP> |
| v
| GP_PASSED <---------.
| | |
| | rcu_sync_exit() | <GP> / __rcu_sync_enter()
| v |
`----- GP_EXIT ------------'
^
<GP> | __rcu_sync_enter() + rcu_sync_exit()
v
GP_RETRY
> enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY };
>
> static void rcu_sync_func(struct rcu_head *rcu);
>
> static void rcu_sync_call(struct rcu_sync *rsp)
> {
> // TODO: THIS IS SUBOPTIMAL. We want to call it directly
> // if rcu_blocking_is_gp() == T, but it has might_sleep().
Not sure I get that comment..
> gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
> }
>
> static void rcu_sync_func(struct rcu_head *rcu)
> {
> struct rcu_sync *rsp = container_of(rcu, struct rcu_sync, cb_head);
> unsigned long flags;
>
Right, those are 'stable' states and must be advanced through explicit
calls to __rcu_sync_enter()/rcu_sync_exit() respectively.
> BUG_ON(rsp->gp_state == GP_IDLE);
> BUG_ON(rsp->gp_state == GP_PASSED);
>
> spin_lock_irqsave(&rsp->rss_lock, flags);
> if (rsp->gp_count) {
> /*
> * We're at least a GP after the first __rcu_sync_enter().
> */
> rsp->gp_state = GP_PASSED;
So we can end up here in two ways afaict.
The simple way: someone called __rcu_sync_enter(), we go IDLE -> ENTER
with a raised count. Once the GP passes, we get here, observe the raised
count and advance to PASSED.
The more involved way: we were EXIT and someone calls __rcu_sync_enter()
to raise the count again. The callback from rcu_sync_exit() was still
pending and once we get here we observe the raised count and voila.
Now, since state != IDLE, I suppose this is valid, but it does hurt my
brain.
Else, !count:
> } else if (rsp->gp_state == GP_REPLAY) {
Fairly straight forward: during EXIT someone did __rcu_sync_enter +
rcu_sync_exit() and we need to wait longer still.
> /*
> * A new rcu_sync_exit() has happened; requeue the callback
> * to catch a later GP.
> */
> rsp->gp_state = GP_EXIT;
> rcu_sync_call(rsp);
> } else {
Again two ways here, simple: we were EXIT, GP passed, we let em rip.
But its also possible, as you noted, that someone called rcu_sync_exit()
during ENTER and we ended up with !count which gets us back to IDLE.
> /*
> * We're at least a GP after the last rcu_sync_exit();
> * eveybody will now have observed the write side critical
> * section. Let 'em rip!.
> *
> * OR. ->gp_state can be still GP_ENTER if __rcu_sync_wait()
> * wasn't called after __rcu_sync_enter(), abort.
> */
> rsp->gp_state = GP_IDLE;
> }
> spin_unlock_irqrestore(&rsp->rss_lock, flags);
> }
Agreed on the rest.
So I think its solid, but you failed to mention one state transition,
which seems ok in any case.
next prev parent reply other threads:[~2016-07-18 11:54 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-14 18:25 [PATCH 0/2] locking/percpu-rwsem: Optimizations/tweaks Peter Zijlstra
2016-07-14 18:25 ` [PATCH 1/2] locking/percpu-rwsem: Optimize readers and reduce global impact Peter Zijlstra
2016-07-15 16:30 ` Oleg Nesterov
2016-07-15 19:47 ` Peter Zijlstra
2016-07-18 18:23 ` kbuild test robot
2016-07-18 22:51 ` kbuild test robot
2016-07-14 18:25 ` [PATCH 2/2] locking/percpu-rwsem: Introduce bias knob Peter Zijlstra
2016-07-14 18:37 ` Peter Zijlstra
2016-07-14 18:43 ` Oleg Nesterov
2016-07-14 18:56 ` Peter Zijlstra
2016-07-14 19:20 ` Peter Zijlstra
2016-07-14 19:29 ` Paul E. McKenney
2016-07-14 19:38 ` Peter Zijlstra
2016-07-14 19:54 ` Paul E. McKenney
2016-07-15 13:27 ` Oleg Nesterov
2016-07-15 13:39 ` Paul E. McKenney
2016-07-15 13:45 ` Oleg Nesterov
2016-07-15 15:38 ` Paul E. McKenney
2016-07-15 16:49 ` Oleg Nesterov
2016-07-15 18:01 ` Paul E. McKenney
2016-07-16 17:10 ` [PATCH] rcu_sync: simplify the state machine, introduce __rcu_sync_enter() Oleg Nesterov
2016-07-16 18:40 ` Oleg Nesterov
2016-07-18 11:54 ` Peter Zijlstra [this message]
2016-07-18 13:44 ` Oleg Nesterov
2016-07-19 20:50 ` Paul E. McKenney
2016-07-20 15:13 ` Oleg Nesterov
2016-07-20 20:58 ` Paul E. McKenney
2016-07-21 17:34 ` Oleg Nesterov
2016-07-20 17:16 ` Oleg Nesterov
2016-07-20 21:31 ` Paul E. McKenney
2016-07-21 17:34 ` Oleg Nesterov
2016-07-22 3:26 ` Paul E. McKenney
2016-07-25 17:01 ` Oleg Nesterov
2016-07-25 17:05 ` John Stultz
2016-07-25 17:26 ` Oleg Nesterov
2016-08-09 8:48 ` Peter Zijlstra
2016-07-25 17:49 ` Paul E. McKenney
2016-07-15 13:42 ` [PATCH 2/2] locking/percpu-rwsem: Introduce bias knob Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160718115427.GB6862@twins.programming.kicks-ass.net \
--to=peterz@infradead.org \
--cc=ccross@google.com \
--cc=dimitrysh@google.com \
--cc=john.stultz@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=romlem@google.com \
--cc=tj@kernel.org \
--cc=tkjos@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).