linux-kernel.vger.kernel.org archive mirror
From: Abel Wu <wuyun.abel@bytedance.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Mel Gorman <mgorman@suse.de>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Josh Don <joshdon@google.com>, Chen Yu <yu.c.chen@intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: Re: [PATCH 1/5] sched/fair: ignore SIS_UTIL when has idle core
Date: Wed, 7 Sep 2022 15:52:32 +0800
Message-ID: <fa6a22d3-6dbf-0ffc-088b-68b3c241f2d0@bytedance.com>
In-Reply-To: <20220906095717.maao4qtel4fhbmfq@techsingularity.net>

On 9/6/22 5:57 PM, Mel Gorman wrote:
> On Mon, Sep 05, 2022 at 10:40:00PM +0800, Abel Wu wrote:
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 6089251a4720..59b27a2ef465 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -6427,21 +6427,36 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
>>>    		if (sd_share) {
>>>    			/* because !--nr is the condition to stop scan */
>>>    			nr = READ_ONCE(sd_share->nr_idle_scan) + 1;
>>> -			/* overloaded LLC is unlikely to have idle cpu/core */
>>> -			if (nr == 1)
>>> -				return -1;
>>> +
>>> +			/*
>>> +			 * Non-overloaded case: Scan full domain if there is
>>> +			 * 	an idle core. Otherwise, scan for an idle
>>> +			 * 	CPU based on nr_idle_scan
>>> +			 * Overloaded case: Unlikely to have an idle CPU but
>>> +			 * 	conduct a limited scan if there is potentially
>>> +			 * 	an idle core.
>>> +			 */
>>> +			if (nr > 1) {
>>> +				if (has_idle_core)
>>> +					nr = sd->span_weight;
>>> +			} else {
>>> +				if (!has_idle_core)
>>> +					return -1;
>>> +				nr = 2;
>>> +			}
>>>    		}
>>>    	}
>>>    	for_each_cpu_wrap(cpu, cpus, target + 1) {
>>> +		if (!--nr)
>>> +			break;
>>> +
>>>    		if (has_idle_core) {
>>>    			i = select_idle_core(p, cpu, cpus, &idle_cpu);
>>>    			if ((unsigned int)i < nr_cpumask_bits)
>>>    				return i;
>>>    		} else {
>>> -			if (!--nr)
>>> -				return -1;
>>>    			idle_cpu = __select_idle_cpu(cpu, p);
>>>    			if ((unsigned int)idle_cpu < nr_cpumask_bits)
>>>    				break;
>>
>> I spent the last few days testing this, with 3 variations (assuming
>> has_idle_core):
>>
>>   a) full or limited (2 cores) scan when !nr_idle_scan
>>   b) whether to clear sds->has_idle_core when a partial scan fails
>>   c) whether to scale scan depth with load
>>
>> some observations:
>>
>>   1) It always seems bad not to clear sds->has_idle_core when a
>>      partial scan fails, because partial scans then keep being
>>      repeated without finding an idle core. (The following ones
>>      are based on clearing has_idle_core even in partial scans.)
>>
> 
> Ok, that's rational. There will be corner cases where there was no idle
> CPU near the target when there is an idle core far away but it would be
> counter to the purpose of SIS_UTIL to care about that corner case.

Yes, and this corner case (which may actually become the normal case
at times) will continue to exist as long as the scan is linear and its
depth is limited (SIS_UTIL/SIS_PROP/...), especially now that LLCs are
getting larger.
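
As a rough illustration of that corner case, here is a userspace model
(not kernel code). The function name limited_linear_scan() and the
idle[] array are made up for this sketch; it only mimics the order in
which for_each_cpu_wrap(cpu, cpus, target + 1) visits CPUs, with a cap
on how many are visited:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model, not kernel code.
 *
 * idle[] marks which of the nr_cpus CPUs in an LLC are idle. Scan
 * linearly starting at target + 1, wrapping around, and visit at most
 * depth CPUs. Returns the first idle CPU found, or -1.
 */
static int limited_linear_scan(const bool *idle, int nr_cpus,
			       int target, int depth)
{
	assert(nr_cpus > 0 && target >= 0 && target < nr_cpus);

	for (int n = 1; n <= nr_cpus && depth > 0; n++, depth--) {
		int cpu = (target + n) % nr_cpus;

		if (idle[cpu])
			return cpu;
	}

	/* Depth exhausted (or whole LLC visited) without a hit. */
	return -1;
}
```

With 16 CPUs, only CPU 12 idle and target 0, a depth of 4 misses the
idle CPU while a full-depth scan finds it. The larger the LLC, the more
targets there are from which a shallow scan cannot reach it.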

> 
>>   2) An unconditional full scan when has_idle_core is not good
>>      for netperf_{udp,tcp} and tbench4. It is probably because
>>      the SIS success rate of these workloads is already high
>>      enough (netperf ~= 100%, tbench4 ~= 50%, compared to
>>      hackbench ~= 3.5%), which negates a lot of the benefit a
>>      full scan brings.
>>
> 
> That's also rational. For a single client/server on netperf, it's expected
> that the SIS success rate is high and scanning is minimal. As the client
> and server are sharing data on localhost and somewhat synchronous, it may
> even partially benefit from SMT sharing.
> 
> So basic approach would be "always clear sds->has_idle_core" + "limit
> scan even when has_idle_core is true", right?

Yes, exactly the same as what you suggested in the first place.
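
To make the combination concrete, a minimal userspace sketch follows.
It is not the kernel implementation: struct llc_model, SMT_WIDTH,
core_is_idle() and scan_for_idle_core() are all hypothetical names, and
unlike the real scan it does not drop already-checked SMT siblings from
the candidate mask. It only shows the two rules together: the scan
depth nr is honoured even when has_idle_core is set, and the hint is
cleared whenever the scan fails, partial or not.

```c
#include <assert.h>
#include <stdbool.h>

#define SMT_WIDTH 2	/* assumption: two SMT siblings per core */

/* Hypothetical model of an LLC domain, not a kernel structure. */
struct llc_model {
	int nr_cpus;		/* must be a multiple of SMT_WIDTH */
	bool has_idle_core;	/* hint, may be stale */
	const bool *idle;	/* idle[cpu] is true if cpu is idle */
};

/* A core is idle only if all of its SMT siblings are idle. */
static bool core_is_idle(const struct llc_model *llc, int cpu)
{
	int first = cpu - cpu % SMT_WIDTH;

	for (int i = 0; i < SMT_WIDTH; i++)
		if (!llc->idle[first + i])
			return false;
	return true;
}

/*
 * Visit at most nr CPUs after target, wrapping around. Returns a CPU
 * on an idle core, or -1. On failure the hint is cleared even if only
 * part of the LLC was scanned.
 */
static int scan_for_idle_core(struct llc_model *llc, int target, int nr)
{
	assert(llc->nr_cpus > 0 && llc->nr_cpus % SMT_WIDTH == 0);

	if (!llc->has_idle_core)
		return -1;

	for (int n = 1; n <= llc->nr_cpus && nr > 0; n++, nr--) {
		int cpu = (target + n) % llc->nr_cpus;

		if (core_is_idle(llc, cpu))
			return cpu;
	}

	llc->has_idle_core = false;
	return -1;
}
```

In this model a failed shallow scan turns later wakeups into cheap
early returns until something sets the hint again, which is the
behaviour observation 1) favours.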

> 
> If so, stick a changelog on it and resend!

I will include this in the SIS_FILTER patchset.

> 
>>   3) Scaling scan depth with load seems good for the hackbench
>>      socket tests, and neutral in the pipe tests. I think this
>>      is just the case you mentioned before: under fast wake-up
>>      workloads has_idle_core becomes less reliable, so a full
>>      scan won't always win.
>>
> 
> My recommendation is to leave this out for now but assuming the rest of
> the patches get picked up, consider posting it for the next major kernel
> version (i.e. separate the basic and clever approaches by one major kernel
> version). By separating them, there is less chance of a false positive
> bisection pointing to the wrong patch. Any regression will not be perfectly
> reproducible so the chances of a false positive bisection are relatively high.

Makes sense. I will just include the basic part first.

Thanks,
Abel
