All of lore.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Neeraj Upadhyay <neeraju@codeaurora.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Josh Triplett <josh@joshtriplett.org>,
	rcu@vger.kernel.org
Subject: Re: [PATCH 00/19] rcu/nocb: De-offload and re-offload support v4
Date: Tue, 29 Dec 2020 15:21:34 +0100	[thread overview]
Message-ID: <20201229142134.GB21613@lothringen> (raw)
In-Reply-To: <20201216165930.GJ2657@paulmck-ThinkPad-P72>

On Wed, Dec 16, 2020 at 08:59:30AM -0800, Paul E. McKenney wrote:
> On Fri, Nov 13, 2020 at 01:13:15PM +0100, Frederic Weisbecker wrote:
> > 
> > Frederic Weisbecker (19):
> >       rcu/nocb: Turn enabled/offload states into a common flag
> >       rcu/nocb: Provide basic callback offloading state machine bits
> >       rcu/nocb: Always init segcblist on CPU up
> >       rcu/nocb: De-offloading CB kthread
> >       rcu/nocb: Don't deoffload an offline CPU with pending work
> >       rcu/nocb: De-offloading GP kthread
> >       rcu/nocb: Re-offload support
> >       rcu/nocb: Shutdown nocb timer on de-offloading
> >       rcu: Flush bypass before setting SEGCBLIST_SOFTIRQ_ONLY
> >       rcu/nocb: Set SEGCBLIST_SOFTIRQ_ONLY at the very last stage of de-offloading
> >       rcu/nocb: Only cond_resched() from actual offloaded batch processing
> >       rcu/nocb: Process batch locally as long as offloading isn't complete
> >       rcu/nocb: Locally accelerate callbacks as long as offloading isn't complete
> >       tools/rcutorture: Support nocb toggle in TREE01
> 
> I applied the above, with the usual commit-log wordsmithing.
> 
> >       rcutorture: Remove weak nocb declarations
> >       rcutorture: Export nocb (de)offloading functions
> 
> These I folded into the rcutorture commit, as you suggested.

Good!

> 
> >       cpu/hotplug: Add lockdep_is_cpus_held()
> >       timer: Add timer_curr_running()
> 
> I applied these two.
> 
> >       rcu/nocb: Detect unsafe checks for offloaded rdp
> 
> This one didn't apply, probably due to recent changes in -rcu.  Could
> you please take a look?

Ok I'm going to rebase it.

> I believe that the following enhancements will be needed:
> 
> o	Forbid toggling of offlined CPUs.  This is a simple rule that
> 	results in deterministic success or failure.  The current setup
> 	(which I freely admit that I suggested) will fail very rarely,
> 	only if a newly offlined offloaded CPU is toggled before its
> 	remaining callbacks are invoked.  The current state is therefore
> 	an accident waiting to happen.
> 
> 	This will entail fixing a few comments as well.

Sure, I'm always glad to remove lines!

> 
> o	Drain the bypass early in the de-offloading process and prohibit
> 	queuing onto the bypass anywhere in the toggling process.
> 	Then the only CPUs permitted to use the bypass are those that
> 	are fully offloaded.  This approach allows rcu_core() and the
> 	rcuog kthreads to safely manipulate callbacks and grace periods
> 	concurrently.  (Famous last words!)  This might address the
> 	rcutorture writer stall warnings I am seeing.
> 
> 	This will entail fixing a few comments as well.

Ok, I'm checking that.

> 
> o	Avoid double write to rdp->nocb_cb_sleep in nocb_cb_wait().
> 	Unless there is some reason why this is absolutely required.
> 	It will cause confusion as it is.

Ok.

> 
> o	What bad thing happens if we de-offload the CPU corresponding to
> 	the nocb GP kthread?  It does work fine when that CPU is offlined.
> 
> 	Coincidence or not, all but one of the rcutorture writer stalls
> 	is followed by a message refusing to de-offload this CPU.

It probably started with a brainfart that I elaborated a bit too far. Let me check
that again.

> 
> o	The nocb_gp_update_state() function's return value seems backwards.
> 	What am I missing here?

Indeed I should rename it to nocb_gp_check_rdp() or something.

> 
> o	__rcu_nocb_rdp_deoffload(): Why move rcu_nocb_lock_irqsave()
> 	across a comment?

Because the comment only concerns the locking?

> 
> o	We need better debug.  For example, "Can't deoffload an rdp GP
> 	leader (yet)".	Well, which CPU?  For another example, we need
> 	deoffload state in rcutorture writer-stall dumps.

Ok.

> 
> I intend to do the following, and, if feasible, fold them into the
> original commits:
> 
> o	Make rcu_segcblist_is_offloaded() use "&&" instead of a
> 	nested "if" statement.
> 
> o	nocb_cb_wait() needs the "^^^ Ensure CB invocation follows
> 	_sleep test" to be adjusted so that it is properly attached to
> 	its smp_load_acquire().
> 
> o	nocb_cb_wait() -- alphabetical order for local variables, please!
> 	Yeah, I know, others like inverted tree for reasons that escape me
> 	completely.

Ah, ok noted.

Thanks!

> 							Thanx, Paul
> 
> >  include/linux/cpu.h                                |   1 +
> >  include/linux/rcu_segcblist.h                      | 119 +++++-
> >  include/linux/rcupdate.h                           |   4 +
> >  include/linux/timer.h                              |   2 +
> >  kernel/cpu.c                                       |   7 +
> >  kernel/rcu/rcu_segcblist.c                         |  13 +-
> >  kernel/rcu/rcu_segcblist.h                         |  45 ++-
> >  kernel/rcu/rcutorture.c                            |   3 -
> >  kernel/rcu/tree.c                                  |  49 ++-
> >  kernel/rcu/tree.h                                  |   2 +
> >  kernel/rcu/tree_plugin.h                           | 416 +++++++++++++++++++--
> >  kernel/time/timer.c                                |  13 +
> >  .../selftests/rcutorture/configs/rcu/TREE01.boot   |   4 +-
> >  13 files changed, 614 insertions(+), 64 deletions(-)

      reply	other threads:[~2020-12-29 14:22 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-13 12:13 [PATCH 00/19] rcu/nocb: De-offload and re-offload support v4 Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 01/19] rcu/nocb: Turn enabled/offload states into a common flag Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 02/19] rcu/nocb: Provide basic callback offloading state machine bits Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 03/19] rcu/nocb: Always init segcblist on CPU up Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 04/19] rcu/nocb: De-offloading CB kthread Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 05/19] rcu/nocb: Don't deoffload an offline CPU with pending work Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 06/19] rcu/nocb: De-offloading GP kthread Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 07/19] rcu/nocb: Re-offload support Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 08/19] rcu/nocb: Shutdown nocb timer on de-offloading Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 09/19] rcu: Flush bypass before setting SEGCBLIST_SOFTIRQ_ONLY Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 10/19] rcu/nocb: Set SEGCBLIST_SOFTIRQ_ONLY at the very last stage of de-offloading Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 11/19] rcu/nocb: Only cond_resched() from actual offloaded batch processing Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 12/19] rcu/nocb: Process batch locally as long as offloading isn't complete Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 13/19] rcu/nocb: Locally accelerate callbacks " Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 14/19] tools/rcutorture: Support nocb toggle in TREE01 Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 15/19] rcutorture: Remove weak nocb declarations Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 16/19] rcutorture: Export nocb (de)offloading functions Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 17/19] cpu/hotplug: Add lockdep_is_cpus_held() Frederic Weisbecker
2020-12-15 23:49   ` Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 18/19] timer: Add timer_curr_running() Frederic Weisbecker
2020-11-13 12:13 ` [PATCH 19/19] rcu/nocb: Detect unsafe checks for offloaded rdp Frederic Weisbecker
2020-11-13 17:35   ` kernel test robot
2020-12-08  2:41 ` [PATCH 00/19] rcu/nocb: De-offload and re-offload support v4 Boqun Feng
2020-12-08 12:51   ` Frederic Weisbecker
2020-12-10  1:21     ` Paul E. McKenney
2020-12-10  3:08       ` Boqun Feng
2020-12-16 16:59 ` Paul E. McKenney
2020-12-29 14:21   ` Frederic Weisbecker [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201229142134.GB21613@lothringen \
    --to=frederic@kernel.org \
    --cc=boqun.feng@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=neeraju@codeaurora.org \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.