From: Oleg Nesterov <oleg@redhat.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
mingo@kernel.org, linux-kernel@vger.kernel.org, tj@kernel.org,
john.stultz@linaro.org, dimitrysh@google.com, romlem@google.com,
ccross@google.com, tkjos@google.com
Subject: Re: [PATCH] rcu_sync: simplify the state machine, introduce __rcu_sync_enter()
Date: Wed, 20 Jul 2016 19:16:03 +0200 [thread overview]
Message-ID: <20160720171602.GA5059@redhat.com> (raw)
In-Reply-To: <20160719205018.GB7094@linux.vnet.ibm.com>
Paul, I had to switch to internal bugzillas after the first email, and now
I feel I can't read... I'll try to answer one question right now, tomorrow
I'll reread your email, probably I need to answer something else...
On 07/19, Paul E. McKenney wrote:
>
> On Sat, Jul 16, 2016 at 07:10:07PM +0200, Oleg Nesterov wrote:
>
> > And, there is another possible transition, GP_ENTER -> GP_IDLE, because
> > not it is possible to call __rcu_sync_enter() and rcu_sync_exit() in any
> > state (except obviously they should be balanced), and they do not block.
...
> If you feel strongly about allowing rcu_sync_exit() in GP_ENTER state,
> could you please tell me your use case? Or am I confused?
See below,
> And I think __rcu_sync_enter() can have more users. Let's look at
> freeze_super(). It calls percpu_down_write() 3 times, and waits for 3 GP's
> sequentally.
>
> > Now we can add 3 __rcu_sync_enter's at the start and 3 rcu_sync_exit's at
> > the end (actually we can do better, just to simplify). And again, note
> > that rcu_sync_exit() will work correctly even if we (say) return -EBUSY,
> > so rcu_sync_wait and/or percpu_down_write() was not called in between,
> > and in this case we won't block waiting for GP.
> >
>
> I am not going to claim to understand freeze_super(), but it does seem
> to have a fair amount of waiting.
>
> But yes, you could put rcu_sync_enter()
^^^^^^^^^^^^^^
__rcu_sync_enter, see below,
> and rcu_sync_exit() before and
> after a series of write-side enter/exit pairs in order to force things
> to stay in writer mode, if that is what you are suggesting.
No, no, this is not what I am trying to suggest.
The problem is that freeze_super() takes 3 semaphores for writing in row,
this means that it needs to wait for 3 GP's sequentally, and it does this
with sb->s_umount held. This is just ugly.
OK, lets suppose it simply does
freeze_super(sb)
{
down_write(&sb->s_umount);
if (NEED_TO_FREEZE) {
percpu_down_write(SEM1);
percpu_down_write(SEM2);
percpu_down_write(SEM3);
}
up_write(&sb->s_umount);
}
and every percpu_down_write() waits for GP.
Now, suppose we add the additional enter/exit's:
freeze_super(sb)
{
// this doesn't block
__rcu_sync_enter(SEM3);
__rcu_sync_enter(SEM2);
__rcu_sync_enter(SEM1);
down_write(&sb->s_umount);
if (NEED_TO_FREEZE) {
percpu_down_write(SEM1);
percpu_down_write(SEM2);
percpu_down_write(SEM3);
}
up_write(&sb->s_umount);
rcu_sync_exit(SEM1);
rcu_sync_exit(SEM2);
rcu_sync_exit(SEM3);
}
Again, actually we can do better, just to simplify.
Now. the fisrt percpu_down_write(SEM1) can block waiting for GP or not,
this depends on how many time it spends in down_write().
But the 2nd and the 3rd percpu_down_write() most likely won't block, so
in the likely case freeze_super() will need a single GP pass.
And note that NEED_TO_FREEZE can be false, in this case rcu_sync_exit()
will be called in GP_ENTER state.
To some degree, this is like get_state_synchronize_rcu/cond_synchronize_rcu.
But obviously percpu_down_write() can not use these helpers, and in this
particular case __rcu_sync_enter() is better because it forces the start
of GP pass.
-----------------------------------------------------------------------
As for cgroups, we want to switch cgroup_threadgroup_rwsem into the
slow mode, at least for now.
We could add the additional hooks/hacks into rcu/sync.c but why? We can
do this without any changes outside of cgroup.c right now, just add
rcu_sync_enter() into cgroup_init().
But we do not want to add a pointless synchronize_sched() at boot time,
__rcu_sync_enter() looks much better.
------------------------------------------------------------------------
And even __cgroup_procs_write() could use __rcu_sync_enter(). But lets
ignore this for now.
Oleg.
next prev parent reply other threads:[~2016-07-20 17:15 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-14 18:25 [PATCH 0/2] locking/percpu-rwsem: Optimizations/tweaks Peter Zijlstra
2016-07-14 18:25 ` [PATCH 1/2] locking/percpu-rwsem: Optimize readers and reduce global impact Peter Zijlstra
2016-07-15 16:30 ` Oleg Nesterov
2016-07-15 19:47 ` Peter Zijlstra
2016-07-18 18:23 ` kbuild test robot
2016-07-18 22:51 ` kbuild test robot
2016-07-14 18:25 ` [PATCH 2/2] locking/percpu-rwsem: Introduce bias knob Peter Zijlstra
2016-07-14 18:37 ` Peter Zijlstra
2016-07-14 18:43 ` Oleg Nesterov
2016-07-14 18:56 ` Peter Zijlstra
2016-07-14 19:20 ` Peter Zijlstra
2016-07-14 19:29 ` Paul E. McKenney
2016-07-14 19:38 ` Peter Zijlstra
2016-07-14 19:54 ` Paul E. McKenney
2016-07-15 13:27 ` Oleg Nesterov
2016-07-15 13:39 ` Paul E. McKenney
2016-07-15 13:45 ` Oleg Nesterov
2016-07-15 15:38 ` Paul E. McKenney
2016-07-15 16:49 ` Oleg Nesterov
2016-07-15 18:01 ` Paul E. McKenney
2016-07-16 17:10 ` [PATCH] rcu_sync: simplify the state machine, introduce __rcu_sync_enter() Oleg Nesterov
2016-07-16 18:40 ` Oleg Nesterov
2016-07-18 11:54 ` Peter Zijlstra
2016-07-18 13:44 ` Oleg Nesterov
2016-07-19 20:50 ` Paul E. McKenney
2016-07-20 15:13 ` Oleg Nesterov
2016-07-20 20:58 ` Paul E. McKenney
2016-07-21 17:34 ` Oleg Nesterov
2016-07-20 17:16 ` Oleg Nesterov [this message]
2016-07-20 21:31 ` Paul E. McKenney
2016-07-21 17:34 ` Oleg Nesterov
2016-07-22 3:26 ` Paul E. McKenney
2016-07-25 17:01 ` Oleg Nesterov
2016-07-25 17:05 ` John Stultz
2016-07-25 17:26 ` Oleg Nesterov
2016-08-09 8:48 ` Peter Zijlstra
2016-07-25 17:49 ` Paul E. McKenney
2016-07-15 13:42 ` [PATCH 2/2] locking/percpu-rwsem: Introduce bias knob Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160720171602.GA5059@redhat.com \
--to=oleg@redhat.com \
--cc=ccross@google.com \
--cc=dimitrysh@google.com \
--cc=john.stultz@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=romlem@google.com \
--cc=tj@kernel.org \
--cc=tkjos@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).