From: Joel Fernandes <joel@joelfernandes.org>
To: linux-kernel@vger.kernel.org
Cc: bristot@redhat.com, peterz@infradead.org, oleg@redhat.com,
paulmck@kernel.org, rcu@vger.kernel.org,
Josh Triplett <josh@joshtriplett.org>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
"Paul E. McKenney" <paulmck@linux.ibm.com>,
Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] Remove GP_REPLAY state from rcu_sync
Date: Fri, 4 Oct 2019 10:59:12 -0400 [thread overview]
Message-ID: <20191004145912.GA118626@google.com> (raw)
In-Reply-To: <20191004145741.118292-1-joel@joelfernandes.org>
On Fri, Oct 04, 2019 at 10:57:41AM -0400, Joel Fernandes (Google) wrote:
> From: Joel Fernandes <joel@joelfernandes.org>
>
> Please consider this is an RFC for discussion only. Just want to discuss
> why the GP_REPLAY state is needed at all.
And I messed up the subject prefix, but this is *really* RFC and for
discussion purposes :)
thanks,
- Joel
> Here's the intention AFAICS:
> When rcu_sync_exit() has happened, the gp_state changes to GP_EXIT while
> we wait for a grace period before transitioning to GP_IDLE. In the
> meanwhile, if we receive another rcu_sync_exit(), then we want to wait
> for another GP to account for that.
>
> Drawing some timing diagrams, it looks like this:
>
> Legend:
> rse = rcu_sync_enter
> rsx = rcu_sync_exit
> i = GP_IDLE
> x = GP_EXIT
> r = GP_REPLAY
> e = GP_ENTER
> p = GP_PASSED
> rx = GP_REPLAY changes to GP_EXIT
>
> GP num = The GP we are one.
>
> note: A GP passes between the states:
> e and p
> x and i
> x and rx
> rx and i
>
> In a simple case, the timing and states look like:
> time
> ---------------------->
> GP num 1111111 2222222
> GP state i e p x i
> CPU0 : rse rsx
>
> However we can enter the replay state like this:
> time
> ---------------------->
> GP num 1111111 2222222222222222222223333333
> GP state i e p x r rx i
> CPU0 : rse rsx
> CPU1 : rse rsx
>
> Due to the second rse + rsx, we had to wait for another GP.
>
> I believe the rationale is, if another rsx happens, another GP has to
> happen.
>
> But this is not always true if you consider the following events:
>
> time
> ---------------------->
> GP num 111111 22222222222222222222222222222222233333333
> GP state i e p x r rx i
> CPU0 : rse rsx
> CPU1 : rse rsx
> CPU2 : rse rsx
>
> Here, we had 3 grace periods that elapsed, 1 for the rcu_sync_enter(),
> and 2 for the rcu_sync_exit(s).
>
> However, we had 3 rcu_sync_exit()s, not 2. In other words, the
> rcu_sync_exit() got batched.
>
> So my point here is, rcu_sync_exit() does not really always cause a new
> GP to happen and we can end up having the rcu_sync_exit()s as batched
> and sharing the same grace period.
>
> Then what is the point of the GP_REPLAY state at all if it does not
> always wait for a new GP? Taking a step back, why did we intend to have
> to wait for a new GP if another rcu_sync_exit() comes while one is still
> in progress?
>
> Cc: bristot@redhat.com
> Cc: peterz@infradead.org
> Cc: oleg@redhat.com
> Cc: paulmck@kernel.org
> Cc: rcu@vger.kernel.org
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
> kernel/rcu/sync.c | 14 ++------------
> 1 file changed, 2 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
> index d4558ab7a07d..4f3aad67992c 100644
> --- a/kernel/rcu/sync.c
> +++ b/kernel/rcu/sync.c
> @@ -10,7 +10,7 @@
> #include <linux/rcu_sync.h>
> #include <linux/sched.h>
>
> -enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY };
> +enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT };
>
> #define rss_lock gp_wait.lock
>
> @@ -85,13 +85,6 @@ static void rcu_sync_func(struct rcu_head *rhp)
> */
> WRITE_ONCE(rsp->gp_state, GP_PASSED);
> wake_up_locked(&rsp->gp_wait);
> - } else if (rsp->gp_state == GP_REPLAY) {
> - /*
> - * A new rcu_sync_exit() has happened; requeue the callback to
> - * catch a later GP.
> - */
> - WRITE_ONCE(rsp->gp_state, GP_EXIT);
> - rcu_sync_call(rsp);
> } else {
> /*
> * We're at least a GP after the last rcu_sync_exit(); eveybody
> @@ -167,16 +160,13 @@ void rcu_sync_enter(struct rcu_sync *rsp)
> */
> void rcu_sync_exit(struct rcu_sync *rsp)
> {
> - WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
> - WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
> + WARN_ON_ONCE(READ_ONCE(rsp->gp_state) < GP_PASSED);
>
> spin_lock_irq(&rsp->rss_lock);
> if (!--rsp->gp_count) {
> if (rsp->gp_state == GP_PASSED) {
> WRITE_ONCE(rsp->gp_state, GP_EXIT);
> rcu_sync_call(rsp);
> - } else if (rsp->gp_state == GP_EXIT) {
> - WRITE_ONCE(rsp->gp_state, GP_REPLAY);
> }
> }
> spin_unlock_irq(&rsp->rss_lock);
> --
> 2.23.0.581.g78d2f28ef7-goog
>
next prev parent reply other threads:[~2019-10-04 14:59 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-04 14:57 [PATCH] Remove GP_REPLAY state from rcu_sync Joel Fernandes (Google)
2019-10-04 14:59 ` Joel Fernandes [this message]
2019-10-04 15:41 ` Oleg Nesterov
2019-10-04 16:37 ` Joel Fernandes
2019-10-07 14:09 ` Oleg Nesterov
2020-01-16 21:57 ` Joel Fernandes
2019-10-04 19:25 ` kbuild test robot
2019-10-04 22:03 ` kbuild test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191004145912.GA118626@google.com \
--to=joel@joelfernandes.org \
--cc=bristot@redhat.com \
--cc=jiangshanlai@gmail.com \
--cc=josh@joshtriplett.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=oleg@redhat.com \
--cc=paulmck@kernel.org \
--cc=paulmck@linux.ibm.com \
--cc=peterz@infradead.org \
--cc=rcu@vger.kernel.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).