From: "zhangfei.gao@foxmail.com" <zhangfei.gao@foxmail.com>
To: Neeraj Upadhyay <quic_neeraju@quicinc.com>,
paulmck@kernel.org, zhangfei <zhangfei.gao@linaro.org>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
Paolo Bonzini <pbonzini@redhat.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"rcu@vger.kernel.org" <rcu@vger.kernel.org>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Josh Triplett <josh@joshtriplett.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Matthew Wilcox <willy@infradead.org>,
"mtosatti@redhat.com" <mtosatti@redhat.com>,
Auger Eric <eric.auger@redhat.com>,
"chenxiang (M)" <chenxiang66@hisilicon.com>
Subject: Re: Commit 282d8998e997 (srcu: Prevent expedited GPs and blocking readers from consuming CPU) cause qemu boot slow
Date: Sat, 18 Jun 2022 11:07:44 +0800 [thread overview]
Message-ID: <tencent_B1290D581DF1837711F4DB176ACB4D4F6908@qq.com> (raw)
In-Reply-To: <042142db-aab2-fc4a-c1a5-371223c80440@quicinc.com>
On 2022/6/15 下午6:40, Neeraj Upadhyay wrote:
> Hi,
>
> On 6/15/2022 2:33 PM, zhangfei.gao@foxmail.com wrote:
>>
>>
>> On 2022/6/14 下午10:17, Paul E. McKenney wrote:
>>> On Tue, Jun 14, 2022 at 10:03:35PM +0800, zhangfei.gao@foxmail.com
>>> wrote:
>>>>
>>>> On 2022/6/14 下午8:19, Neeraj Upadhyay wrote:
>>>>>> 5.18-rc4 based ~8sec
>>>>>>
>>>>>> 5.19-rc1 ~2m43sec
>>>>>>
>>>>>> 5.19-rc1+fix1 ~19sec
>>>>>>
>>>>>> 5.19-rc1-fix2 ~19sec
>>>>>>
>>>>> If you try below diff on top of either 5.19-rc1+fix1 or
>>>>> 5.19-rc1-fix2 ;
>>>>> does it show any difference in boot time?
>>>>>
>>>>> --- a/kernel/rcu/srcutree.c
>>>>> +++ b/kernel/rcu/srcutree.c
>>>>> @@ -706,7 +706,7 @@ static void srcu_schedule_cbs_snp(struct
>>>>> srcu_struct
>>>>> *ssp, struct srcu_node *snp
>>>>> */
>>>>> static void srcu_gp_end(struct srcu_struct *ssp)
>>>>> {
>>>>> - unsigned long cbdelay;
>>>>> + unsigned long cbdelay = 1;
>>>>> bool cbs;
>>>>> bool last_lvl;
>>>>> int cpu;
>>>>> @@ -726,7 +726,9 @@ static void srcu_gp_end(struct srcu_struct *ssp)
>>>>> spin_lock_irq_rcu_node(ssp);
>>>>> idx = rcu_seq_state(ssp->srcu_gp_seq);
>>>>> WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
>>>>> - cbdelay = !!srcu_get_delay(ssp);
>>>>> + if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq),
>>>>> READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
>>>>> + cbdelay = 0;
>>>>> +
>>>>> WRITE_ONCE(ssp->srcu_last_gp_end, ktime_get_mono_fast_ns());
>>> Thank you both for the testing and the proposed fix!
>>>
>>>> Test here:
>>>> qemu: https://github.com/qemu/qemu/tree/stable-6.1
>>>> kernel:
>>>> https://github.com/Linaro/linux-kernel-uadk/tree/uacce-devel-5.19-srcu-test
>>>>
>>>> (in case test patch not clear, push in git tree)
>>>>
>>>> Hardware: aarch64
>>>>
>>>> 1. 5.18-rc6
>>>> real 0m8.402s
>>>> user 0m3.015s
>>>> sys 0m1.102s
>>>>
>>>> 2. 5.19-rc1
>>>> real 2m41.433s
>>>> user 0m3.097s
>>>> sys 0m1.177s
>>>>
>>>> 3. 5.19-rc1 + fix1 from Paul
>>>> real 2m43.404s
>>>> user 0m2.880s
>>>> sys 0m1.214s
>>>>
>>>> 4. 5.19-rc1 + fix2: fix1 + Remove "if (!jbase)" block
>>>> real 0m15.262s
>>>> user 0m3.003s
>>>> sys 0m1.033s
>>>>
>>>> When build kernel in the meantime, load time become longer.
>>>>
>>>> 5. 5.19-rc1 + fix3: fix1 + SRCU_MAX_NODELAY_PHASE 1000000
>>>> real 0m15.215s
>>>> user 0m2.942s
>>>> sys 0m1.172s
>>>>
>>>> 6. 5.19-rc1 + fix4: fix1 + Neeraj's change of srcu_gp_end
>>>> real 1m23.936s
>>>> user 0m2.969s
>>>> sys 0m1.181s
>>> And thank you for the testing!
>>>
>>> Could you please try fix3 + Neeraj's change of srcu_gp_end?
>>>
>>> That is, fix1 + SRCU_MAX_NODELAY_PHASE 1000000 + Neeraj's change of
>>> srcu_gp_end.
>>>
>>> Also, at what value of SRCU_MAX_NODELAY_PHASE do the boot
>>> times start rising? This is probably best done by starting with
>>> SRCU_MAX_NODELAY_PHASE=100000 and dividing by (say) ten on each run
>>> until boot time becomes slow, followed by a binary search between the
>>> last two values. (The idea is to bias the search so that fast boot
>>> times are the common case.)
>>
>> SRCU_MAX_NODELAY_PHASE 100 becomes slower.
>>
>>
>> 8. 5.19-rc1 + fix6: fix4 + SRCU_MAX_NODELAY_PHASE 1000000
>>
>> real 0m11.154s ~12s
>>
>> user 0m2.919s
>>
>> sys 0m1.064s
>>
>>
>>
>> 9. 5.19-rc1 + fix7: fix4 + SRCU_MAX_NODELAY_PHASE 10000
>>
>> real 0m11.258s
>>
>> user 0m3.113s
>>
>> sys 0m1.073s
>>
>>
>>
>> 10. 5.19-rc1 + fix8: fix4 + SRCU_MAX_NODELAY_PHASE 100
>>
>> real 0m30.053s ~ 32s
>>
>> user 0m2.827s
>>
>> sys 0m1.161s
>>
>>
>>
>> By the way, if build kernel on the board in the meantime (using
>> memory), time become much longer.
>>
>> real 1m2.763s
>>
>>
>>
>> 11. 5.19-rc1 + fix9: fix4 + SRCU_MAX_NODELAY_PHASE 1000
>>
>> real 0m11.443s
>>
>> user 0m3.022s
>>
>> sys 0m1.052s
>>
>>
>
> This is useful data, thanks! Did you get chance to check between 100
> and 1000, to narrow down further, from which point (does need to be
> exact value) between 100 and 1000, you start seeing degradation at,
> for ex. 250, 500 , ...?
>
> Is it also possible to try experiment 10 and 11 with below diff.
> What I have done in below diff is, call srcu_get_delay() only once
> in try_check_zero() (and not for every loop iteration); also
> retry with a different delay for the extra iteration which is done
> when srcu_get_delay(ssp) returns 0.
>
> Once we have this data, can you also try by changing
> SRCU_RETRY_CHECK_LONG_DELAY to 100, on top of below diff.
>
> #define SRCU_RETRY_CHECK_LONG_DELAY 100
>
> -------------------------------------------------------------------------
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 6a354368ac1d..3aff2f3e99ab 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -620,6 +620,7 @@ EXPORT_SYMBOL_GPL(__srcu_read_unlock);
> * we repeatedly block for 1-millisecond time periods.
> */
> #define SRCU_RETRY_CHECK_DELAY 5
> +#define SRCU_RETRY_CHECK_LONG_DELAY 5
>
> /*
> * Start an SRCU grace period.
> @@ -927,12 +928,17 @@ static void srcu_funnel_gp_start(struct
> srcu_struct *ssp, struct srcu_data *sdp,
> */
> static bool try_check_zero(struct srcu_struct *ssp, int idx, int
> trycount)
> {
> + unsigned long curdelay;
> + curdelay = !srcu_get_delay(ssp);
> for (;;) {
> if (srcu_readers_active_idx_check(ssp, idx))
> return true;
> - if (--trycount + !srcu_get_delay(ssp) <= 0)
> + if (--trycount + curdelay <= 0)
> return false;
> - udelay(SRCU_RETRY_CHECK_DELAY);
> + if (trycount)
> + udelay(SRCU_RETRY_CHECK_DELAY);
> + else
> + udelay(SRCU_RETRY_CHECK_LONG_DELAY);
> }
> }
>
11. 5.19-rc1 + fix9: fix4 + SRCU_MAX_NODELAY_PHASE 1000
real 0m11.443
s user 0m3.022
s sys 0m1.052s
fix10: fix4 + SRCU_MAX_NODELAY_PHASE 500
real 0m11.401s
user 0m2.798s
sys 0m1.328s
fix11: fix4 + SRCU_MAX_NODELAY_PHASE 250
real 0m15.748s
user 0m2.781s
sys 0m1.294s
fix12: fix4 + SRCU_MAX_NODELAY_PHASE 200
real 0m20.704s 21
user 0m2.954s
sys 0m1.226s
fix13: fix4 + SRCU_MAX_NODELAY_PHASE 150
real 0m25.151s
user 0m2.980s
sys 0m1.256s
fix8: fix4 + SRCU_MAX_NODELAY_PHASE 100
real 0m30.053s ~ 32s
user 0m2.827s
sys 0m1.161s
fix14: fix4 + SRCU_MAX_NODELAY_PHASE 100 + SRCU_RETRY_CHECK_LONG_DELAY 5
real 0m19.263s
user 0m3.018s
sys 0m1.211s
fix15: fix4 + SRCU_MAX_NODELAY_PHASE 100 +
SRCU_RETRY_CHECK_LONG_DELAY 100
real 0m9.347s
user 0m3.132s
sys 0m1.041s
And Shameer suggests this method, to decrease region_add/del time from
6000+ to 200+, also works on 5.19-rc1
Make the EFI flash image file
$ dd if=/dev/zero of=flash0.img bs=1M count=64
$ dd if=./QEMU_EFI-2022.fd of=flash0.img conv=notrunc
$ dd if=/dev/zero of=flash1.img bs=1M count=64
Include the below line instead of "-bios QEMU_EFI.fd" in Qemu cmd line.
-pflash flash0.img -pflash flash1.img \
Thanks
>
> Thanks
> Neeraj
>
>> Thanks
>>
>>>
>>>> More test details: https://docs.qq.com/doc/DRXdKalFPTVlUbFN5
>>> And thank you for these details.
>>>
>>> Thanx, Paul
>>
next prev parent reply other threads:[~2022-06-18 3:09 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-11 16:32 Commit 282d8998e997 (srcu: Prevent expedited GPs and blocking readers from consuming CPU) cause qemu boot slow Zhangfei Gao
2022-06-11 16:59 ` Paul E. McKenney
2022-06-12 7:40 ` zhangfei.gao
2022-06-12 13:36 ` Paul E. McKenney
2022-06-12 14:59 ` zhangfei.gao
2022-06-12 16:20 ` Paul E. McKenney
2022-06-12 16:40 ` Paul E. McKenney
2022-06-12 17:29 ` Paolo Bonzini
2022-06-12 17:47 ` Paolo Bonzini
2022-06-12 18:51 ` Paul E. McKenney
2022-06-12 18:49 ` Paul E. McKenney
2022-06-12 19:23 ` Paolo Bonzini
2022-06-12 20:09 ` Paul E. McKenney
2022-06-13 3:04 ` zhangfei.gao
2022-06-13 3:57 ` Paul E. McKenney
2022-06-13 4:16 ` Paul E. McKenney
2022-06-13 6:55 ` zhangfei.gao
2022-06-13 12:18 ` Paul E. McKenney
2022-06-13 13:23 ` zhangfei.gao
2022-06-13 14:59 ` Paul E. McKenney
2022-06-13 20:55 ` Shameerali Kolothum Thodi
2022-06-14 12:19 ` Neeraj Upadhyay
2022-06-14 14:03 ` zhangfei.gao
2022-06-14 14:14 ` Neeraj Upadhyay
2022-06-14 14:57 ` zhangfei.gao
2022-06-14 14:17 ` Paul E. McKenney
2022-06-15 9:03 ` zhangfei.gao
2022-06-15 10:40 ` Neeraj Upadhyay
2022-06-15 10:50 ` Paolo Bonzini
2022-06-15 11:04 ` Neeraj Upadhyay
2022-06-18 3:07 ` zhangfei.gao [this message]
2022-06-20 7:50 ` Neeraj Upadhyay
2022-06-24 15:30 ` zhangfei.gao
2022-06-13 15:02 ` Shameerali Kolothum Thodi
2022-06-15 8:38 ` Marc Zyngier
2022-06-15 8:29 ` Marc Zyngier
2022-06-14 1:53 ` chenxiang (M)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=tencent_B1290D581DF1837711F4DB176ACB4D4F6908@qq.com \
--to=zhangfei.gao@foxmail.com \
--cc=chenxiang66@hisilicon.com \
--cc=eric.auger@redhat.com \
--cc=jiangshanlai@gmail.com \
--cc=josh@joshtriplett.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mtosatti@redhat.com \
--cc=paulmck@kernel.org \
--cc=pbonzini@redhat.com \
--cc=quic_neeraju@quicinc.com \
--cc=rcu@vger.kernel.org \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=willy@infradead.org \
--cc=zhangfei.gao@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.