linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	rushikesh.s.kadam@intel.com, urezki@gmail.com,
	neeraj.iitr10@gmail.com, paulmck@kernel.org, rostedt@goodmis.org,
	vineeth@bitbyteword.org, boqun.feng@gmail.com
Subject: Re: [PATCH v5 06/18] rcu: Introduce call_rcu_lazy() API implementation
Date: Tue, 6 Sep 2022 12:15:19 -0400	[thread overview]
Message-ID: <cde6586e-ae61-5e85-3c9a-1ce7dd2464ed@joelfernandes.org> (raw)
In-Reply-To: <20220906151757.GA183806@lothringen>



On 9/6/2022 11:17 AM, Frederic Weisbecker wrote:
> On Tue, Sep 06, 2022 at 03:05:46AM +0000, Joel Fernandes wrote:
>> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
>> index 4dc86274b3e8..b201606f7c4f 100644
>> --- a/kernel/rcu/tree_nocb.h
>> +++ b/kernel/rcu/tree_nocb.h
>> @@ -256,6 +256,31 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
>>  	return __wake_nocb_gp(rdp_gp, rdp, force, flags);
>>  }
>>  
>> +/*
>> + * LAZY_FLUSH_JIFFIES decides the maximum amount of time that
>> + * can elapse before lazy callbacks are flushed. Lazy callbacks
>> + * could be flushed much earlier for a number of other reasons
>> + * however, LAZY_FLUSH_JIFFIES will ensure no lazy callbacks are
>> + * left unsubmitted to RCU after those many jiffies.
>> + */
>> +#define LAZY_FLUSH_JIFFIES (10 * HZ)
>> +unsigned long jiffies_till_flush = LAZY_FLUSH_JIFFIES;
> 
> Still not static.

Oops will fix.

>> @@ -293,12 +322,16 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
>>   * proves to be initially empty, just return false because the no-CB GP
>>   * kthread may need to be awakened in this case.
>>   *
>> + * Return true if there was something to be flushed and it succeeded, otherwise
>> + * false.
>> + *
> 
> This kind of contradict the comment that follows. Not sure you need to add
> that line because the existing comment seem to cover it.
> 
>>   * Note that this function always returns true if rhp is NULL.
> 
>>   */
>>  static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>> -				     unsigned long j)
>> +				     unsigned long j, unsigned long flush_flags)
>>  {
>>  	struct rcu_cblist rcl;
>> +	bool lazy = flush_flags & FLUSH_BP_LAZY;
>>  
>>  	WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp));
>>  	rcu_lockdep_assert_cblist_protected(rdp);
>> @@ -326,13 +372,20 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>>   * Note that this function always returns true if rhp is NULL.
>>   */
>>  static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>> -				  unsigned long j)
>> +				  unsigned long j, unsigned long flush_flags)
>>  {
>> +	bool ret;
>> +
>>  	if (!rcu_rdp_is_offloaded(rdp))
>>  		return true;
>>  	rcu_lockdep_assert_cblist_protected(rdp);
>>  	rcu_nocb_bypass_lock(rdp);
>> -	return rcu_nocb_do_flush_bypass(rdp, rhp, j);
>> +	ret = rcu_nocb_do_flush_bypass(rdp, rhp, j, flush_flags);
>> +
>> +	if (flush_flags & FLUSH_BP_WAKE)
>> +		wake_nocb_gp(rdp, true);
> 
> Why the true above?
> 
> Also should we check if the wake up is really necessary (otherwise it means we
> force a wake up for all rdp's from rcu_barrier())?

Good point. I need to look into your suggested optimization here more, it is
possible the wake up is not needed some of the times, but considering that
rcu_barrier() is a slow path, I will probably aim for robustness here over
mathematically correct code. And the cost of a missed wake up here can be
serious as I saw by the RCU_SCALE test I added. Also the wake up pf the rcuog
thread is square root of CPUs because of the rdp grouping right?

Still, it'd be good to not do something unnecessary, so your point is well
taken. I'll spend some time on this and get back to you.

>        was_alldone = rcu_segcblist_pend_cbs(&rdp->cblist);
>        ret = rcu_nocb_do_flush_bypass(rdp, rhp, j, flush_flags);
>        if (was_alldone && rcu_segcblist_pend_cbs(&rdp->cblist))
>        	  wake_nocb_gp(rdp, false);
> 
> 
>> @@ -461,16 +521,29 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>>  	// We need to use the bypass.
>>  	rcu_nocb_wait_contended(rdp);
>>  	rcu_nocb_bypass_lock(rdp);
>> +
>>  	ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
>>  	rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */
>>  	rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
>> +
>> +	if (IS_ENABLED(CONFIG_RCU_LAZY) && lazy)
>> +		WRITE_ONCE(rdp->lazy_len, rdp->lazy_len + 1);
>> +
>>  	if (!ncbs) {
>>  		WRITE_ONCE(rdp->nocb_bypass_first, j);
>>  		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ"));
>>  	}
>> +
>>  	rcu_nocb_bypass_unlock(rdp);
>>  	smp_mb(); /* Order enqueue before wake. */
>> -	if (ncbs) {
>> +
>> +	// We had CBs in the bypass list before. There is nothing else to do if:
>> +	// There were only non-lazy CBs before, in this case, the bypass timer
> 
> Kind of misleading. I would replace "There were only non-lazy CBs before" with
> "There was at least one non-lazy CBs before".

I really mean "There were only non-lazy CBs ever queued in the bypass list
before". That's the bypass_is_lazy variable. So I did not fully understand your
suggested comment change.

>> +	// or GP-thread will handle the CBs including any new lazy ones.
>> +	// Or, the new CB is lazy and the old bypass-CBs were also lazy. In this
>> +	// case the old lazy timer would have been setup. When that expires,
>> +	// the new lazy one will be handled.
>> +	if (ncbs && (!bypass_is_lazy || lazy)) {
>>  		local_irq_restore(flags);
>>  	} else {
>>  		// No-CBs GP kthread might be indefinitely asleep, if so, wake.
>> @@ -479,6 +552,10 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>>  			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>>  					    TPS("FirstBQwake"));
>>  			__call_rcu_nocb_wake(rdp, true, flags);
>> +		} else if (bypass_is_lazy && !lazy) {
>> +			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>> +					    TPS("FirstBQwakeLazy2Non"));
>> +			__call_rcu_nocb_wake(rdp, true, flags);
> 
> Not sure we need this chunk. Since there are pending callbacks anyway,
> nocb_gp_wait() should be handling them and it will set the appropriate
> timer on the next loop.

We do because those pending callbacks could be because of a bypass list flush
and not because there were pending CBs before, right? I do recall missed wake
ups of non-lazy CBs, and them having to wait for the full lazy timer duration
and slowing down synchronize_rcu() which is on the ChromeOS boot critical path!

Thanks,

 - Joel




  reply	other threads:[~2022-09-06 16:40 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-01 22:17 [PATCH v5 00/18] Implement call_rcu_lazy() and miscellaneous fixes Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 01/18] mm/slub: perform free consistency checks before call_rcu Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 02/18] mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head Joel Fernandes (Google)
2022-09-02  9:26   ` Vlastimil Babka
2022-09-02  9:30     ` Vlastimil Babka
2022-09-02 15:09       ` Joel Fernandes
2022-09-03 13:53         ` Paul E. McKenney
2022-09-01 22:17 ` [PATCH v5 03/18] rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask Joel Fernandes (Google)
2022-09-06 22:26   ` Boqun Feng
2022-09-06 22:31     ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 04/18] rcu: Fix late wakeup when flush of bypass cblist happens Joel Fernandes (Google)
2022-09-02 11:35   ` Frederic Weisbecker
2022-09-02 23:58     ` Joel Fernandes
2022-09-03 15:10       ` Paul E. McKenney
2022-09-04 21:13       ` Frederic Weisbecker
2022-09-03 14:04   ` Paul E. McKenney
2022-09-03 14:05     ` Joel Fernandes
2022-09-06  3:07   ` Joel Fernandes
2022-09-06  9:48     ` Frederic Weisbecker
2022-09-07  2:43       ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 05/18] rcu: Move trace_rcu_callback() before bypassing Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 06/18] rcu: Introduce call_rcu_lazy() API implementation Joel Fernandes (Google)
2022-09-02 15:21   ` Frederic Weisbecker
2022-09-02 23:09     ` Joel Fernandes
2022-09-05 12:59       ` Frederic Weisbecker
2022-09-05 20:18         ` Joel Fernandes
2022-09-05 20:32         ` Joel Fernandes
2022-09-06  8:55           ` Paul E. McKenney
2022-09-06 16:16             ` Joel Fernandes
2022-09-06 17:05               ` Paul E. McKenney
2022-09-03 22:00     ` Joel Fernandes
2022-09-04 21:01       ` Frederic Weisbecker
2022-09-05 20:20         ` Joel Fernandes
2022-09-06  3:05     ` Joel Fernandes
2022-09-06 15:17       ` Frederic Weisbecker
2022-09-06 16:15         ` Joel Fernandes [this message]
2022-09-06 16:31           ` Joel Fernandes
2022-09-06 16:38             ` Joel Fernandes
2022-09-06 16:43               ` Joel Fernandes
2022-09-06 19:11                 ` Frederic Weisbecker
2022-09-07  2:56                   ` Joel Fernandes
2022-09-07  9:56                     ` Frederic Weisbecker
2022-09-07 10:03           ` Frederic Weisbecker
2022-09-07 14:01             ` Joel Fernandes
2022-09-07  0:06         ` Joel Fernandes
2022-09-07  9:40           ` Frederic Weisbecker
2022-09-07 13:44             ` Joel Fernandes
2022-09-07 15:38               ` Frederic Weisbecker
2022-09-07 15:39                 ` Joel Fernandes
2022-09-21 23:52             ` Joel Fernandes
2022-09-06 18:16       ` Uladzislau Rezki
2022-09-06 18:21         ` Joel Fernandes
2022-09-07  8:52           ` Uladzislau Rezki
2022-09-07 15:23             ` Joel Fernandes
2022-09-03 14:03   ` Paul E. McKenney
2022-09-03 14:05     ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 07/18] rcu: shrinker for lazy rcu Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 08/18] rcu: Add per-CB tracing for queuing, flush and invocation Joel Fernandes (Google)
2022-09-02 16:48   ` kernel test robot
2022-09-03 12:39     ` Paul E. McKenney
2022-09-03 14:07       ` Joel Fernandes
2022-09-02 19:01   ` kernel test robot
2022-09-01 22:17 ` [PATCH v5 09/18] rcuscale: Add laziness and kfree tests Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 10/18] rcutorture: Add test code for call_rcu_lazy() Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 11/18] fs: Move call_rcu() to call_rcu_lazy() in some paths Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 12/18] cred: Move call_rcu() to call_rcu_lazy() Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 13/18] security: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 14/18] net/core: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 15/18] kernel: Move various core kernel usages " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 16/18] lib: Move call_rcu() " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 17/18] i915: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 18/18] fork: Move thread_stack_free_rcu() " Joel Fernandes (Google)
2022-09-03 15:22 ` [PATCH v5 00/18] Implement call_rcu_lazy() and miscellaneous fixes Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cde6586e-ae61-5e85-3c9a-1ce7dd2464ed@joelfernandes.org \
    --to=joel@joelfernandes.org \
    --cc=boqun.feng@gmail.com \
    --cc=frederic@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neeraj.iitr10@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=rushikesh.s.kadam@intel.com \
    --cc=urezki@gmail.com \
    --cc=vineeth@bitbyteword.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).