From: Abel Wu <wuyun.abel@bytedance.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
Mel Gorman <mgorman@suse.de>,
Vincent Guittot <vincent.guittot@linaro.org>,
Josh Don <joshdon@google.com>, Chen Yu <yu.c.chen@intel.com>,
linux-kernel@vger.kernel.org
Subject: Re: Re: [PATCH 1/5] sched/fair: ignore SIS_UTIL when has idle core
Date: Wed, 7 Sep 2022 15:52:32 +0800
Message-ID: <fa6a22d3-6dbf-0ffc-088b-68b3c241f2d0@bytedance.com>
In-Reply-To: <20220906095717.maao4qtel4fhbmfq@techsingularity.net>

On 9/6/22 5:57 PM, Mel Gorman wrote:
> On Mon, Sep 05, 2022 at 10:40:00PM +0800, Abel Wu wrote:
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 6089251a4720..59b27a2ef465 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -6427,21 +6427,36 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
>>>  		if (sd_share) {
>>>  			/* because !--nr is the condition to stop scan */
>>>  			nr = READ_ONCE(sd_share->nr_idle_scan) + 1;
>>> -			/* overloaded LLC is unlikely to have idle cpu/core */
>>> -			if (nr == 1)
>>> -				return -1;
>>> +
>>> +			/*
>>> +			 * Non-overloaded case: Scan full domain if there is
>>> +			 * an idle core. Otherwise, scan for an idle
>>> +			 * CPU based on nr_idle_scan
>>> +			 * Overloaded case: Unlikely to have an idle CPU but
>>> +			 * conduct a limited scan if there is potentially
>>> +			 * an idle core.
>>> +			 */
>>> +			if (nr > 1) {
>>> +				if (has_idle_core)
>>> +					nr = sd->span_weight;
>>> +			} else {
>>> +				if (!has_idle_core)
>>> +					return -1;
>>> +				nr = 2;
>>> +			}
>>>  		}
>>>  	}
>>>  	for_each_cpu_wrap(cpu, cpus, target + 1) {
>>> +		if (!--nr)
>>> +			break;
>>> +
>>>  		if (has_idle_core) {
>>>  			i = select_idle_core(p, cpu, cpus, &idle_cpu);
>>>  			if ((unsigned int)i < nr_cpumask_bits)
>>>  				return i;
>>>  		} else {
>>> -			if (!--nr)
>>> -				return -1;
>>>  			idle_cpu = __select_idle_cpu(cpu, p);
>>>  			if ((unsigned int)idle_cpu < nr_cpumask_bits)
>>>  				break;
>>
>> I spent the last few days testing this, with 3 variations (assuming
>> has_idle_core):
>>
>> a) full or limited (2 cores) scan when !nr_idle_scan
>> b) whether to clear sds->has_idle_core when a partial scan fails
>> c) whether or not to scale scan depth with load
>>
>> Some observations:
>>
>> 1) It always seems bad not to clear sds->has_idle_core when a
>> partial scan fails: the LLC keeps being partially scanned
>> without ever finding an idle core. (The following ones are
>> based on clearing has_idle_core even in partial scans.)
>>
>
> Ok, that's rational. There will be corner cases where there was no idle
> CPU near the target when there is an idle core far away but it would be
> counter to the purpose of SIS_UTIL to care about that corner case.
Yes, and this corner case (which may actually become the normal case at
times) will persist as long as the scan is linear and its depth is
limited (SIS_UTIL/SIS_PROP/...), especially now that LLCs keep getting
larger.
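To make the corner case concrete, here is a minimal userspace sketch (not
kernel code). The names limited_linear_scan() and NR_CPUS and the 16-CPU
layout are assumptions made up for illustration; only the "!--nr" stop
condition is taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 16	/* hypothetical LLC size, for illustration only */

/*
 * Hypothetical model of a depth-limited linear scan: walk the CPUs
 * starting at target + 1, wrapping around, and give up after `depth`
 * CPUs have been visited. Returns the first idle CPU found, or -1.
 */
static int limited_linear_scan(const bool idle[NR_CPUS], int target, int depth)
{
	int nr = depth + 1;	/* because !--nr is the condition to stop scan */

	assert(target >= 0 && target < NR_CPUS);

	for (int off = 1; off <= NR_CPUS; off++) {
		int cpu = (target + off) % NR_CPUS;

		if (!--nr)
			break;
		if (idle[cpu])
			return cpu;
	}
	return -1;
}
```

With only CPU 12 idle and target 0, a depth of 4 visits CPUs 1-4 and
returns -1, while a depth of 16 reaches CPU 12. The larger the LLC, the
more often the shallow scan stops short of the idle CPU.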
>
>> 2) An unconditional full scan when has_idle_core is set is not
>> good for netperf_{udp,tcp} and tbench4. It is probably because
>> the SIS success rate of these workloads is already high
>> enough (netperf ~= 100%, tbench4 ~= 50%, compared to
>> hackbench ~= 3.5%), which negates a lot of the benefit a full
>> scan brings.
>>
>
> That's also rational. For a single client/server on netperf, it's expected
> that the SIS success rate is high and scanning is minimal. As the client
> and server are sharing data on localhost and somewhat synchronous, it may
> even partially benefit from SMT sharing.
>
> So the basic approach would be "always clear sds->has_idle_core" + "limit
> scan even when has_idle_core is true", right?
Yes, exactly what you suggested in the first place.
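A rough userspace sketch of that combination, just to pin down the
intended behaviour. It is not the kernel implementation: struct llc_state,
core_is_idle(), scan_for_idle_core() and the 2-way SMT layout (CPUs 2n and
2n+1 are siblings) are all hypothetical, and the scan advances in
whole-core steps for simplicity.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8	/* hypothetical: 4 cores x 2 SMT siblings */

/* Hypothetical stand-in for the shared LLC state (sds) discussed here. */
struct llc_state {
	bool has_idle_core;
};

/* A core counts as idle only if both of its SMT siblings are idle. */
static bool core_is_idle(const bool idle[NR_CPUS], int cpu)
{
	int first = cpu & ~1;

	return idle[first] && idle[first + 1];
}

/*
 * Scan at most max_cores cores, starting with the core after target's
 * core and wrapping around. Returns the first CPU of an idle core, or
 * -1. On failure the has_idle_core hint is cleared even though the
 * scan may have been partial.
 */
static int scan_for_idle_core(struct llc_state *sds, const bool idle[NR_CPUS],
			      int target, int max_cores)
{
	assert(target >= 0 && target < NR_CPUS);

	if (!sds->has_idle_core)
		return -1;

	int first = target & ~1;

	for (int n = 1; n <= NR_CPUS / 2 && n <= max_cores; n++) {
		int cpu = (first + 2 * n) % NR_CPUS;

		if (core_is_idle(idle, cpu))
			return cpu;
	}

	sds->has_idle_core = false;	/* always clear, partial scan or not */
	return -1;
}
```

If the only idle core is CPUs 6/7 and the target is CPU 0, a 2-core scan
fails and clears the hint, so later wakeups skip the core scan until the
hint is set again; a 3-core scan returns CPU 6 and leaves the hint set.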
>
> If so, stick a changelog on it and resend!
I will include this in the SIS_FILTER patchset.
>
>> 3) Scaling scan depth with load seems good for the hackbench
>> socket tests, and neutral in the pipe tests. And I think this
>> is just the case you mentioned before: under fast wake-up
>> workloads has_idle_core becomes less reliable,
>> so a full scan won't always win.
>>
>
> My recommendation is to leave this out for now but, assuming the rest of
> the patches get picked up, consider posting it for the next major kernel
> version (i.e. separate the basic and clever approaches by one major kernel
> version). By separating them, there is less chance of a false positive
> bisection pointing to the wrong patch. Any regression will not be perfectly
> reproducible so the chance of a false positive bisection is relatively high.
Makes sense. I will just include the basic part first.
Thanks,
Abel