From: "Paul E. McKenney" <paulmck@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: gor@linux.ibm.com, jpoimboe@redhat.com, jikos@kernel.org,
mbenes@suse.cz, pmladek@suse.com, mingo@kernel.org,
linux-kernel@vger.kernel.org, joe.lawrence@redhat.com,
fweisbec@gmail.com, tglx@linutronix.de, hca@linux.ibm.com,
svens@linux.ibm.com, sumanthk@linux.ibm.com,
live-patching@vger.kernel.org
Subject: Re: [RFC][PATCH 6/7] context_tracking: Provide SMP ordering using RCU
Date: Wed, 22 Sep 2021 08:17:21 -0700 [thread overview]
Message-ID: <20210922151721.GZ880162@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <20210922110836.244770922@infradead.org>
On Wed, Sep 22, 2021 at 01:05:12PM +0200, Peter Zijlstra wrote:
> Use rcu_user_{enter,exit}() calls to provide SMP ordering on context
> tracking state stores:
>
> __context_tracking_exit()
> __this_cpu_write(context_tracking.state, CONTEXT_KERNEL)
> rcu_user_exit()
> rcu_eqs_exit()
> rcu_dynticks_eqs_eit()
> rcu_dynticks_inc()
> atomic_add_return() /* smp_mb */
>
> __context_tracking_enter()
> rcu_user_enter()
> rcu_eqs_enter()
> rcu_dynticks_eqs_enter()
> rcu_dynticks_inc()
> atomic_add_return() /* smp_mb */
> __this_cpu_write(context_tracking.state, state)
>
> This separates USER/KERNEL state with an smp_mb() on each side,
> therefore, a user of context_tracking_state_cpu() can say the CPU must
> pass through an smp_mb() before changing.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
For the transformation to negative errno return value and name change
from an RCU perspective:
Acked-by: Paul E. McKenney <paulmck@kernel.org>
For the sampling of nohz_full userspace state:
Another approach is for the rcu_data structure's ->dynticks variable to
use the lower two bits to differentiate between idle, nohz_full userspace
and kernel. In theory, inlining should make this zero cost for idle
transition, and should allow you to safely sample nohz_full userspace
state with a load and a couple of memory barriers instead of an IPI.
To make this work nicely, the low-order bits have to be 00 for kernel,
and (say) 01 for idle and 10 for nohz_full userspace. 11 would be an
error.
The trick would be for rcu_user_enter() and rcu_user_exit() to atomically
increment ->dynticks by 2, for rcu_nmi_exit() to increment by 1 and
rcu_nmi_enter() to increment by 3. The state sampling would need to
change accordingly.
Does this make sense, or am I missing something?
Thanx, Paul
> ---
> include/linux/context_tracking_state.h | 12 ++++++++++++
> kernel/context_tracking.c | 7 ++++---
> 2 files changed, 16 insertions(+), 3 deletions(-)
>
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -45,11 +45,23 @@ static __always_inline bool context_trac
> {
> return __this_cpu_read(context_tracking.state) == CONTEXT_USER;
> }
> +
> +static __always_inline bool context_tracking_state_cpu(int cpu)
> +{
> + struct context_tracking *ct = per_cpu_ptr(&context_tracking);
> +
> + if (!context_tracking_enabled() || !ct->active)
> + return CONTEXT_DISABLED;
> +
> + return ct->state;
> +}
> +
> #else
> static inline bool context_tracking_in_user(void) { return false; }
> static inline bool context_tracking_enabled(void) { return false; }
> static inline bool context_tracking_enabled_cpu(int cpu) { return false; }
> static inline bool context_tracking_enabled_this_cpu(void) { return false; }
> +static inline bool context_tracking_state_cpu(int cpu) { return CONTEXT_DISABLED; }
> #endif /* CONFIG_CONTEXT_TRACKING */
>
> #endif
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -82,7 +82,7 @@ void noinstr __context_tracking_enter(en
> vtime_user_enter(current);
> instrumentation_end();
> }
> - rcu_user_enter();
> + rcu_user_enter(); /* smp_mb */
> }
> /*
> * Even if context tracking is disabled on this CPU, because it's outside
> @@ -149,12 +149,14 @@ void noinstr __context_tracking_exit(enu
> return;
>
> if (__this_cpu_read(context_tracking.state) == state) {
> + __this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
> +
> if (__this_cpu_read(context_tracking.active)) {
> /*
> * We are going to run code that may use RCU. Inform
> * RCU core about that (ie: we may need the tick again).
> */
> - rcu_user_exit();
> + rcu_user_exit(); /* smp_mb */
> if (state == CONTEXT_USER) {
> instrumentation_begin();
> vtime_user_exit(current);
> @@ -162,7 +164,6 @@ void noinstr __context_tracking_exit(enu
> instrumentation_end();
> }
> }
> - __this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
> }
> context_tracking_recursion_exit();
> }
>
>
next prev parent reply other threads:[~2021-09-22 15:17 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-22 11:05 [RFC][PATCH 0/7] sched,rcu,context_tracking,livepatch: Improve livepatch task transitions for idle and NOHZ_FULL cpus Peter Zijlstra
2021-09-22 11:05 ` [RFC][PATCH 1/7] sched,rcu: Rework try_invoke_on_locked_down_task() Peter Zijlstra
2021-09-22 11:05 ` [RFC][PATCH 2/7] sched: Fix task_try_func() Peter Zijlstra
2021-09-22 11:05 ` [RFC][PATCH 3/7] sched,livepatch: Use task_try_func() Peter Zijlstra
2021-09-23 12:05 ` Petr Mladek
2021-09-23 13:17 ` Peter Zijlstra
2021-09-23 13:47 ` Peter Zijlstra
2021-09-22 11:05 ` [RFC][PATCH 4/7] sched: Simplify wake_up_*idle*() Peter Zijlstra
2021-09-22 11:05 ` [RFC][PATCH 5/7] sched,livepatch: Use wake_up_if_idle() Peter Zijlstra
2021-09-22 13:05 ` Miroslav Benes
2021-09-23 12:19 ` Petr Mladek
2021-09-22 11:05 ` [RFC][PATCH 6/7] context_tracking: Provide SMP ordering using RCU Peter Zijlstra
2021-09-22 15:17 ` Paul E. McKenney [this message]
2021-09-22 19:33 ` Peter Zijlstra
2021-09-22 19:47 ` Peter Zijlstra
2021-09-22 19:59 ` Paul E. McKenney
2021-09-22 19:53 ` Paul E. McKenney
2021-09-22 20:02 ` Peter Zijlstra
2021-09-23 12:10 ` Petr Mladek
2021-09-24 18:57 ` Joel Savitz
2021-09-27 14:33 ` Petr Mladek
2021-09-22 11:05 ` [RFC][PATCH 7/7] livepatch,context_tracking: Avoid disturbing NOHZ_FULL tasks Peter Zijlstra
2021-09-23 13:14 ` Petr Mladek
2021-09-23 13:28 ` Peter Zijlstra
2021-09-24 7:33 ` Petr Mladek
2021-09-23 13:22 ` [RFC][PATCH 0/7] sched,rcu,context_tracking,livepatch: Improve livepatch task transitions for idle and NOHZ_FULL cpus Petr Mladek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210922151721.GZ880162@paulmck-ThinkPad-P17-Gen-1 \
--to=paulmck@kernel.org \
--cc=fweisbec@gmail.com \
--cc=gor@linux.ibm.com \
--cc=hca@linux.ibm.com \
--cc=jikos@kernel.org \
--cc=joe.lawrence@redhat.com \
--cc=jpoimboe@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=live-patching@vger.kernel.org \
--cc=mbenes@suse.cz \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=sumanthk@linux.ibm.com \
--cc=svens@linux.ibm.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).