All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	mingo@kernel.org, linux-kernel@vger.kernel.org, tj@kernel.org,
	john.stultz@linaro.org, dimitrysh@google.com, romlem@google.com,
	ccross@google.com, tkjos@google.com
Subject: Re: [PATCH] rcu_sync: simplify the state machine, introduce __rcu_sync_enter()
Date: Mon, 18 Jul 2016 13:54:27 +0200	[thread overview]
Message-ID: <20160718115427.GB6862@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <20160716171007.GA30000@redhat.com>

On Sat, Jul 16, 2016 at 07:10:07PM +0200, Oleg Nesterov wrote:
> Peter, Paul, could you review? Do you see any hole?
> 
> Why. Firstly, note that the state machine was greatly simplified, and
> rsp->cb_state has gone, we have a single "state" variable, gp_state.
> Note also the ->sync() op has gone (and actually ->wait() too, see the
> "TODO" comment).

Yes, that seems like a nice simplification.

> GP_IDLE	- owned by __rcu_sync_enter() which can only move this state to
> 
> GP_ENTER - owned by rcu-callback which moves it to
> 
> GP_PASSED - owned by the last rcu_sync_exit() which moves it to
> 
> GP_EXIT - owned by rcu-callback which moves it back to GP_IDLE.
> 
> Yes, this is a bit simplified, we also have GP_REPLAY, but hopefully
> clear.
> 
> And, there is another possible transition, GP_ENTER -> GP_IDLE, because
> not it is possible to call __rcu_sync_enter() and rcu_sync_exit() in any
> state (except obviously they should be balanced), and they do not block.
> 
> The only blocking call is __rcu_sync_wait() which actually waits for GP.
> Obviously should only be called if gp_count != 0, iow after __rcu_sync_enter.

So I'm a complete moron today, to aid the failing brain I draw pictures.

I think I ended up with this:

 .---->	GP_IDLE <--------------.
 |	  |                    |
 |	  | __rcu_sync_enter() | <GP> / rcu_sync_exit()
 |	  v                    |
 |	GP_ENTER --------------'
 |	  |
 |  <GP>  |
 |        v
 |	GP_PASSED <---------.
 |	  |                 |
 |	  | rcu_sync_exit() | <GP> / __rcu_sync_enter()
 |	  v                 |
 `----- GP_EXIT ------------'
	  ^
    <GP>  | __rcu_sync_enter() + rcu_sync_exit()
	  v
	GP_RETRY


> enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY };
> 
> static void rcu_sync_func(struct rcu_head *rcu);
> 
> static void rcu_sync_call(struct rcu_sync *rsp)
> {
> 	// TODO: THIS IS SUBOPTIMAL. We want to call it directly
> 	// if rcu_blocking_is_gp() == T, but it has might_sleep().

Not sure I get that comment..

> 	gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
> }
> 
> static void rcu_sync_func(struct rcu_head *rcu)
> {
> 	struct rcu_sync *rsp = container_of(rcu, struct rcu_sync, cb_head);
> 	unsigned long flags;
> 

Right, those are 'stable' states and must be advanced through explicit
calls to __rcu_sync_enter()/rcu_sync_exit() respectively.

> 	BUG_ON(rsp->gp_state == GP_IDLE);
> 	BUG_ON(rsp->gp_state == GP_PASSED);
> 
> 	spin_lock_irqsave(&rsp->rss_lock, flags);
> 	if (rsp->gp_count) {
> 		/*
> 		 * We're at least a GP after the first __rcu_sync_enter().
> 		 */
> 		rsp->gp_state = GP_PASSED;

So we can end up here in two ways afaict.

The simple way: someone called __rcu_sync_enter(), we go IDLE -> ENTER
with a raised count. Once the GP passes, we get here, observe the raised
count and advance to PASSED.

The more involved way: we were EXIT and someone calls __rcu_sync_enter()
to raise the count again. The callback from rcu_sync_exit() was still
pending and once we get here we observe the raised count and voila.

Now, since state != IDLE, I suppose this is valid, but it does hurt my
brain.

Else, !count:

> 	} else if (rsp->gp_state == GP_REPLAY) {

Fairly straight forward: during EXIT someone did __rcu_sync_enter +
rcu_sync_exit() and we need to wait longer still.

> 		/*
> 		 * A new rcu_sync_exit() has happened; requeue the callback
> 		 * to catch a later GP.
> 		 */
> 		rsp->gp_state = GP_EXIT;
> 		rcu_sync_call(rsp);
> 	} else {

Again two ways here, simple: we were EXIT, GP passed, we let em rip.

But its also possible, as you noted, that someone called rcu_sync_exit()
during ENTER and we ended up with !count which gets us back to IDLE.

> 		/*
> 		 * We're at least a GP after the last rcu_sync_exit();
> 		 * eveybody will now have observed the write side critical
> 		 * section. Let 'em rip!.
> 		 *
> 		 * OR. ->gp_state can be still GP_ENTER if __rcu_sync_wait()
> 		 * wasn't called after __rcu_sync_enter(), abort.
> 		 */
> 		rsp->gp_state = GP_IDLE;
> 	}
> 	spin_unlock_irqrestore(&rsp->rss_lock, flags);
> }

Agreed on the rest.

So I think its solid, but you failed to mention one state transition,
which seems ok in any case.

  parent reply	other threads:[~2016-07-18 11:54 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-14 18:25 [PATCH 0/2] locking/percpu-rwsem: Optimizations/tweaks Peter Zijlstra
2016-07-14 18:25 ` [PATCH 1/2] locking/percpu-rwsem: Optimize readers and reduce global impact Peter Zijlstra
2016-07-15 16:30   ` Oleg Nesterov
2016-07-15 19:47     ` Peter Zijlstra
2016-07-18 18:23   ` kbuild test robot
2016-07-18 22:51   ` kbuild test robot
2016-07-14 18:25 ` [PATCH 2/2] locking/percpu-rwsem: Introduce bias knob Peter Zijlstra
2016-07-14 18:37   ` Peter Zijlstra
2016-07-14 18:43   ` Oleg Nesterov
2016-07-14 18:56     ` Peter Zijlstra
2016-07-14 19:20     ` Peter Zijlstra
2016-07-14 19:29       ` Paul E. McKenney
2016-07-14 19:38         ` Peter Zijlstra
2016-07-14 19:54           ` Paul E. McKenney
2016-07-15 13:27       ` Oleg Nesterov
2016-07-15 13:39         ` Paul E. McKenney
2016-07-15 13:45           ` Oleg Nesterov
2016-07-15 15:38             ` Paul E. McKenney
2016-07-15 16:49               ` Oleg Nesterov
2016-07-15 18:01                 ` Paul E. McKenney
2016-07-16 17:10                   ` [PATCH] rcu_sync: simplify the state machine, introduce __rcu_sync_enter() Oleg Nesterov
2016-07-16 18:40                     ` Oleg Nesterov
2016-07-18 11:54                     ` Peter Zijlstra [this message]
2016-07-18 13:44                       ` Oleg Nesterov
2016-07-19 20:50                     ` Paul E. McKenney
2016-07-20 15:13                       ` Oleg Nesterov
2016-07-20 20:58                         ` Paul E. McKenney
2016-07-21 17:34                           ` Oleg Nesterov
2016-07-20 17:16                       ` Oleg Nesterov
2016-07-20 21:31                         ` Paul E. McKenney
2016-07-21 17:34                           ` Oleg Nesterov
2016-07-22  3:26                             ` Paul E. McKenney
2016-07-25 17:01                               ` Oleg Nesterov
2016-07-25 17:05                                 ` John Stultz
2016-07-25 17:26                                   ` Oleg Nesterov
2016-08-09  8:48                                     ` Peter Zijlstra
2016-07-25 17:49                                 ` Paul E. McKenney
2016-07-15 13:42       ` [PATCH 2/2] locking/percpu-rwsem: Introduce bias knob Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160718115427.GB6862@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=ccross@google.com \
    --cc=dimitrysh@google.com \
    --cc=john.stultz@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=romlem@google.com \
    --cc=tj@kernel.org \
    --cc=tkjos@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.