All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
	rushikesh.s.kadam@intel.com, urezki@gmail.com,
	neeraj.iitr10@gmail.com, paulmck@kernel.org, rostedt@goodmis.org,
	vineeth@bitbyteword.org, boqun.feng@gmail.com
Subject: Re: [PATCH v5 06/18] rcu: Introduce call_rcu_lazy() API implementation
Date: Tue, 6 Sep 2022 12:43:52 -0400	[thread overview]
Message-ID: <4f6061f0-0de7-2916-dc6e-9f5af9b944c0@joelfernandes.org> (raw)
In-Reply-To: <da45d265-52f9-6314-7fcd-ea71e2bf4cec@joelfernandes.org>



On 9/6/2022 12:38 PM, Joel Fernandes wrote:
> 
> 
> On 9/6/2022 12:31 PM, Joel Fernandes wrote:
>>
>>
>> On 9/6/2022 12:15 PM, Joel Fernandes wrote:
>>>>> @@ -461,16 +521,29 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>>>>>  	// We need to use the bypass.
>>>>>  	rcu_nocb_wait_contended(rdp);
>>>>>  	rcu_nocb_bypass_lock(rdp);
>>>>> +
>>>>>  	ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
>>>>>  	rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */
>>>>>  	rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
>>>>> +
>>>>> +	if (IS_ENABLED(CONFIG_RCU_LAZY) && lazy)
>>>>> +		WRITE_ONCE(rdp->lazy_len, rdp->lazy_len + 1);
>>>>> +
>>>>>  	if (!ncbs) {
>>>>>  		WRITE_ONCE(rdp->nocb_bypass_first, j);
>>>>>  		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ"));
>>>>>  	}
>>>>> +
>>>>>  	rcu_nocb_bypass_unlock(rdp);
>>>>>  	smp_mb(); /* Order enqueue before wake. */
>>>>> -	if (ncbs) {
>>>>> +
>>>>> +	// We had CBs in the bypass list before. There is nothing else to do if:
>>>>> +	// There were only non-lazy CBs before, in this case, the bypass timer
>>>> Kind of misleading. I would replace "There were only non-lazy CBs before" with
>>>> "There was at least one non-lazy CBs before".
>>> I really mean "There were only non-lazy CBs ever queued in the bypass list
>>> before". That's the bypass_is_lazy variable. So I did not fully understand your
>>> suggested comment change.
>>>
>>>>> +	// or GP-thread will handle the CBs including any new lazy ones.
>>>>> +	// Or, the new CB is lazy and the old bypass-CBs were also lazy. In this
>>>>> +	// case the old lazy timer would have been setup. When that expires,
>>>>> +	// the new lazy one will be handled.
>>>>> +	if (ncbs && (!bypass_is_lazy || lazy)) {
>>>>>  		local_irq_restore(flags);
>>>>>  	} else {
>>>>>  		// No-CBs GP kthread might be indefinitely asleep, if so, wake.
>>>>> @@ -479,6 +552,10 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>>>>>  			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>>>>>  					    TPS("FirstBQwake"));
>>>>>  			__call_rcu_nocb_wake(rdp, true, flags);
>>>>> +		} else if (bypass_is_lazy && !lazy) {
>>>>> +			trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
>>>>> +					    TPS("FirstBQwakeLazy2Non"));
>>>>> +			__call_rcu_nocb_wake(rdp, true, flags);
>>>>
>>>> Not sure we need this chunk. Since there are pending callbacks anyway,
>>>> nocb_gp_wait() should be handling them and it will set the appropriate
>>>> timer on the next loop.
>>>
>>> We do because those pending callbacks could be because of a bypass list flush
>>> and not because there were pending CBs before, right? I do recall missed wake
>>> ups of non-lazy CBs, and them having to wait for the full lazy timer duration
>>> and slowing down synchronize_rcu() which is on the ChromeOS boot critical path!
>>>
>>
>> Just to add more details, consider the series of events:
>>
>> 1. Only lazy CBs are ever queued. Timer is armed for multiple seconds.
>> rcu_segcblist_pend_cbs remains false.
>>
>> 2. First non-lazy CB triggers to code that does the bypyass rate-limit thing.
>>
>> 3. By pass list is flushed because it is non-lazy CB and we need to start GP
>> processing soon.
> 
> Correcting the events, #3 does not happen if we got here.
> 
>>
>> 4. Due to flush, rcu_segcblist_pend_cbs() is now true.
> 
> So rcu_segcblist_pend_cbs() cannot be true.
> 
>> 5. We reach this "else if" clause because bypass_is_lazy means only lazy CBs
>> were ever buffered. We need to reprogram the timer or do an immediate wake up.
>> That's the intention of __call_rcu_nocb_wake().
>>
>> I really saw #1 and #2 trigger during boot up itself and cause a multi-second
>> boot regression.
> 
> So may be this hunk is needed not needed any more and the boot regression is
> fine. I can try to drop this hunk and run the tests again...

Ah, now I know why I got confused. I *used* to flush the bypass list before when
!lazy CBs showed up. Paul suggested this is overkill. In this old overkill
method, I was missing a wake up which was likely causing the boot regression.
Forcing a wake up fixed that. Now in v5 I make it such that I don't do the flush
on a !lazy rate-limit.

I am sorry for the confusion. Either way, in my defense this is just an extra
bit of code that I have to delete. This code is hard. I have mostly relied on a
test-driven development. But now thanks to this review and I am learning the
code more and more...

Thanks,

 - Joel




  reply	other threads:[~2022-09-06 16:57 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-01 22:17 [PATCH v5 00/18] Implement call_rcu_lazy() and miscellaneous fixes Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 01/18] mm/slub: perform free consistency checks before call_rcu Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 02/18] mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head Joel Fernandes (Google)
2022-09-02  9:26   ` Vlastimil Babka
2022-09-02  9:30     ` Vlastimil Babka
2022-09-02 15:09       ` Joel Fernandes
2022-09-03 13:53         ` Paul E. McKenney
2022-09-01 22:17 ` [PATCH v5 03/18] rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask Joel Fernandes (Google)
2022-09-06 22:26   ` Boqun Feng
2022-09-06 22:31     ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 04/18] rcu: Fix late wakeup when flush of bypass cblist happens Joel Fernandes (Google)
2022-09-02 11:35   ` Frederic Weisbecker
2022-09-02 23:58     ` Joel Fernandes
2022-09-03 15:10       ` Paul E. McKenney
2022-09-04 21:13       ` Frederic Weisbecker
2022-09-03 14:04   ` Paul E. McKenney
2022-09-03 14:05     ` Joel Fernandes
2022-09-06  3:07   ` Joel Fernandes
2022-09-06  9:48     ` Frederic Weisbecker
2022-09-07  2:43       ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 05/18] rcu: Move trace_rcu_callback() before bypassing Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 06/18] rcu: Introduce call_rcu_lazy() API implementation Joel Fernandes (Google)
2022-09-02 15:21   ` Frederic Weisbecker
2022-09-02 23:09     ` Joel Fernandes
2022-09-05 12:59       ` Frederic Weisbecker
2022-09-05 20:18         ` Joel Fernandes
2022-09-05 20:32         ` Joel Fernandes
2022-09-06  8:55           ` Paul E. McKenney
2022-09-06 16:16             ` Joel Fernandes
2022-09-06 17:05               ` Paul E. McKenney
2022-09-03 22:00     ` Joel Fernandes
2022-09-04 21:01       ` Frederic Weisbecker
2022-09-05 20:20         ` Joel Fernandes
2022-09-06  3:05     ` Joel Fernandes
2022-09-06 15:17       ` Frederic Weisbecker
2022-09-06 16:15         ` Joel Fernandes
2022-09-06 16:31           ` Joel Fernandes
2022-09-06 16:38             ` Joel Fernandes
2022-09-06 16:43               ` Joel Fernandes [this message]
2022-09-06 19:11                 ` Frederic Weisbecker
2022-09-07  2:56                   ` Joel Fernandes
2022-09-07  9:56                     ` Frederic Weisbecker
2022-09-07 10:03           ` Frederic Weisbecker
2022-09-07 14:01             ` Joel Fernandes
2022-09-07  0:06         ` Joel Fernandes
2022-09-07  9:40           ` Frederic Weisbecker
2022-09-07 13:44             ` Joel Fernandes
2022-09-07 15:38               ` Frederic Weisbecker
2022-09-07 15:39                 ` Joel Fernandes
2022-09-21 23:52             ` Joel Fernandes
2022-09-22 14:39               ` Joel Fernandes
2022-09-06 18:16       ` Uladzislau Rezki
2022-09-06 18:21         ` Joel Fernandes
2022-09-07  8:52           ` Uladzislau Rezki
2022-09-07 15:23             ` Joel Fernandes
2022-09-03 14:03   ` Paul E. McKenney
2022-09-03 14:05     ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 07/18] rcu: shrinker for lazy rcu Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 08/18] rcu: Add per-CB tracing for queuing, flush and invocation Joel Fernandes (Google)
2022-09-02 16:48   ` kernel test robot
2022-09-03 12:39     ` Paul E. McKenney
2022-09-03 12:39       ` Paul E. McKenney
2022-09-03 14:07       ` Joel Fernandes
2022-09-03 14:07         ` Joel Fernandes
2022-09-02 19:01   ` kernel test robot
2022-09-01 22:17 ` [PATCH v5 09/18] rcuscale: Add laziness and kfree tests Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 10/18] rcutorture: Add test code for call_rcu_lazy() Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 11/18] fs: Move call_rcu() to call_rcu_lazy() in some paths Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 12/18] cred: Move call_rcu() to call_rcu_lazy() Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 13/18] security: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 14/18] net/core: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 15/18] kernel: Move various core kernel usages " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 16/18] lib: Move call_rcu() " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 17/18] i915: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 18/18] fork: Move thread_stack_free_rcu() " Joel Fernandes (Google)
2022-09-03 15:22 ` [PATCH v5 00/18] Implement call_rcu_lazy() and miscellaneous fixes Paul E. McKenney
     [not found]   ` <CAEXW_YTxRWvUN3YuwdJ9WPmHH-v+Auozfc_OugpV_hpxC5K+4Q@mail.gmail.com>
     [not found]     ` <20220914093423.GS246308@paulmck-ThinkPad-P17-Gen-1>
     [not found]       ` <20220914102319.GA1936@lothringen>
     [not found]         ` <CA+KHdyUb122cCaeLutzK8Ti-dk4D=iXCtQau=ZXkPRO-1cnD1A@mail.gmail.com>
     [not found]           ` <20220914112251.GV246308@paulmck-ThinkPad-P17-Gen-1>
     [not found]             ` <YygjfJkoUzxy9xd/@pc636>
     [not found]               ` <20220919131152.GF4196@paulmck-ThinkPad-P17-Gen-1>
     [not found]                 ` <CAEXW_YQGPwkTQjxAb5p4OOhJkMA-L93JNTsa-WaXs0Ts7=Oakw@mail.gmail.com>
     [not found]                   ` <20220919223211.GJ4196@paulmck-ThinkPad-P17-Gen-1>
     [not found]                     ` <f061b798-e6f6-1523-52c3-12fe7966f809@joelfernandes.org>
     [not found]                       ` <a2882e80-1723-9004-f67a-cd8332aa701d@joelfernandes.org>
     [not found]                         ` <d72808e1-fc27-f158-37b7-ef23c23e74c1@joelfernandes.org>
2022-09-20 11:40                           ` Joel Fernandes
2022-09-20 12:32                             ` Paul E. McKenney
2022-09-20 17:55                               ` Joel Fernandes
2022-09-20 19:03                                 ` Paul E. McKenney
2022-09-20 22:27                                   ` Joel Fernandes
2022-09-20 22:35                                     ` Paul E. McKenney
2022-09-20 22:40                                       ` Joel Fernandes
2022-09-21  2:21                                         ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4f6061f0-0de7-2916-dc6e-9f5af9b944c0@joelfernandes.org \
    --to=joel@joelfernandes.org \
    --cc=boqun.feng@gmail.com \
    --cc=frederic@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neeraj.iitr10@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=rushikesh.s.kadam@intel.com \
    --cc=urezki@gmail.com \
    --cc=vineeth@bitbyteword.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.