live-patching.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: gor@linux.ibm.com, jpoimboe@redhat.com, jikos@kernel.org,
	mbenes@suse.cz, pmladek@suse.com, mingo@kernel.org,
	linux-kernel@vger.kernel.org, joe.lawrence@redhat.com,
	fweisbec@gmail.com, tglx@linutronix.de, hca@linux.ibm.com,
	svens@linux.ibm.com, sumanthk@linux.ibm.com,
	live-patching@vger.kernel.org
Subject: Re: [RFC][PATCH 6/7] context_tracking: Provide SMP ordering using RCU
Date: Wed, 22 Sep 2021 21:33:43 +0200	[thread overview]
Message-ID: <YUuFF8+H2PE9m4wy@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20210922151721.GZ880162@paulmck-ThinkPad-P17-Gen-1>

On Wed, Sep 22, 2021 at 08:17:21AM -0700, Paul E. McKenney wrote:
> On Wed, Sep 22, 2021 at 01:05:12PM +0200, Peter Zijlstra wrote:
> > Use rcu_user_{enter,exit}() calls to provide SMP ordering on context
> > tracking state stores:
> > 
> > __context_tracking_exit()
> >   __this_cpu_write(context_tracking.state, CONTEXT_KERNEL)
> >   rcu_user_exit()
> >     rcu_eqs_exit()
> >       rcu_dynticks_eqs_eit()
> >         rcu_dynticks_inc()
> >           atomic_add_return() /* smp_mb */
> > 
> > __context_tracking_enter()
> >   rcu_user_enter()
> >     rcu_eqs_enter()
> >       rcu_dynticks_eqs_enter()
> >         rcu_dynticks_inc()
> > 	  atomic_add_return() /* smp_mb */
> >   __this_cpu_write(context_tracking.state, state)
> > 
> > This separates USER/KERNEL state with an smp_mb() on each side,
> > therefore, a user of context_tracking_state_cpu() can say the CPU must
> > pass through an smp_mb() before changing.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> 
> For the transformation to negative errno return value and name change
> from an RCU perspective:
> 
> Acked-by: Paul E. McKenney <paulmck@kernel.org>

Thanks!

> For the sampling of nohz_full userspace state:
> 
> Another approach is for the rcu_data structure's ->dynticks variable to
> use the lower two bits to differentiate between idle, nohz_full userspace
> and kernel.  In theory, inlining should make this zero cost for idle
> transition, and should allow you to safely sample nohz_full userspace
> state with a load and a couple of memory barriers instead of an IPI.

That's what I do now, it's like:

  <user code>

  state = KERNEL
  smp_mb()

  <kernel code>

  smp_mb()
  state = USER

  <user core>

vs

  <patch kernel code>
  smp_mb()
  if (state == USER)
    // then we're guaranteed any subsequent kernel code execution
    // will see the modified kernel code

more-or-less

> To make this work nicely, the low-order bits have to be 00 for kernel,
> and (say) 01 for idle and 10 for nohz_full userspace.  11 would be an
> error.
> 
> The trick would be for rcu_user_enter() and rcu_user_exit() to atomically
> increment ->dynticks by 2, for rcu_nmi_exit() to increment by 1 and
> rcu_nmi_enter() to increment by 3.  The state sampling would need to
> change accordingly.
> 
> Does this make sense, or am I missing something?

Why doesn't the proposed patch work? Also, ISTR sampling of remote
context state coming up before. And as is, it's a weird mix between
context_tracking and rcu.

AFAICT there is very little useful in context_tracking as is, but it's
also very weird to have to ask RCU about this. Is there any way to slice
this this code differently? Perhaps move some of the state RCU now keeps
into context_tracking ?

Anyway, lemme see if I get your proposal; lets say the counter starts at
0 and is in kernel space.

 0x00(0) - kernel
 0x02(2) - user
 0x04(0) - kernel

So far so simple, then NMI on top of that goes:

 0x00(0) - kernel
 0x03(3) - kernel + nmi
 0x04(0) - kernel
 0x06(2) - user
 0x09(1) - user + nmi
 0x0a(2) - user

Which then gives us:

 (0) := kernel
 (1) := nmi-from-user
 (2) := user
 (3) := nmi-from-kernel

Which should work I suppose. But like I said above, I'd be happier if
this counter would live in context_tracking rather than RCU.

  reply	other threads:[~2021-09-22 19:35 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-22 11:05 [RFC][PATCH 0/7] sched,rcu,context_tracking,livepatch: Improve livepatch task transitions for idle and NOHZ_FULL cpus Peter Zijlstra
2021-09-22 11:05 ` [RFC][PATCH 1/7] sched,rcu: Rework try_invoke_on_locked_down_task() Peter Zijlstra
2021-09-22 11:05 ` [RFC][PATCH 2/7] sched: Fix task_try_func() Peter Zijlstra
2021-09-22 11:05 ` [RFC][PATCH 3/7] sched,livepatch: Use task_try_func() Peter Zijlstra
2021-09-23 12:05   ` Petr Mladek
2021-09-23 13:17     ` Peter Zijlstra
2021-09-23 13:47       ` Peter Zijlstra
2021-09-22 11:05 ` [RFC][PATCH 4/7] sched: Simplify wake_up_*idle*() Peter Zijlstra
2021-09-22 11:05 ` [RFC][PATCH 5/7] sched,livepatch: Use wake_up_if_idle() Peter Zijlstra
2021-09-22 13:05   ` Miroslav Benes
2021-09-23 12:19     ` Petr Mladek
2021-09-22 11:05 ` [RFC][PATCH 6/7] context_tracking: Provide SMP ordering using RCU Peter Zijlstra
2021-09-22 15:17   ` Paul E. McKenney
2021-09-22 19:33     ` Peter Zijlstra [this message]
2021-09-22 19:47       ` Peter Zijlstra
2021-09-22 19:59         ` Paul E. McKenney
2021-09-22 19:53       ` Paul E. McKenney
2021-09-22 20:02         ` Peter Zijlstra
2021-09-23 12:10   ` Petr Mladek
2021-09-24 18:57   ` Joel Savitz
2021-09-27 14:33     ` Petr Mladek
2021-09-22 11:05 ` [RFC][PATCH 7/7] livepatch,context_tracking: Avoid disturbing NOHZ_FULL tasks Peter Zijlstra
2021-09-23 13:14   ` Petr Mladek
2021-09-23 13:28     ` Peter Zijlstra
2021-09-24  7:33       ` Petr Mladek
2021-09-23 13:22 ` [RFC][PATCH 0/7] sched,rcu,context_tracking,livepatch: Improve livepatch task transitions for idle and NOHZ_FULL cpus Petr Mladek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YUuFF8+H2PE9m4wy@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=fweisbec@gmail.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=jikos@kernel.org \
    --cc=joe.lawrence@redhat.com \
    --cc=jpoimboe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=mbenes@suse.cz \
    --cc=mingo@kernel.org \
    --cc=paulmck@kernel.org \
    --cc=pmladek@suse.com \
    --cc=sumanthk@linux.ibm.com \
    --cc=svens@linux.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).