linux-kernel.vger.kernel.org archive mirror
From: Abel Wu <wuyun.abel@bytedance.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Mel Gorman <mgorman@suse.de>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Josh Don <joshdon@google.com>, Chen Yu <yu.c.chen@intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: Re: [PATCH 1/5] sched/fair: ignore SIS_UTIL when has idle core
Date: Mon, 5 Sep 2022 22:40:00 +0800
Message-ID: <1fc40679-b7c3-24f2-aa27-f1edab71228e@bytedance.com>
In-Reply-To: <20220902102528.keooutttg3hq3sy5@techsingularity.net>

On 9/2/22 6:25 PM, Mel Gorman wrote:
> For the simple case, I was expecting the static depth to *not* match load
> because it's unclear what the scaling should be for load or if it had a
> benefit. If investigating scaling the scan depth to load, it would still
> make sense to compare it to a static depth. The depth of 2 cores was to
> partially match the old SIS_PROP behaviour of the minimum depth to scan.
> 
>                  if (span_avg > 4*avg_cost)
>                          nr = div_u64(span_avg, avg_cost);
>                  else
>                          nr = 4;
> 
> nr is not proportional to cores although it could be
> https://lore.kernel.org/all/20210726102247.21437-7-mgorman@techsingularity.net/
> 
> This is not tested or properly checked for correctness but for
> illustrative purposes something like this should conduct a limited scan when
> overloaded. It has a side-effect that the has_idle_cores hint gets cleared
> for a partial scan for idle cores but the hint is probably wrong anyway.
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 6089251a4720..59b27a2ef465 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6427,21 +6427,36 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
>   		if (sd_share) {
>   			/* because !--nr is the condition to stop scan */
>   			nr = READ_ONCE(sd_share->nr_idle_scan) + 1;
> -			/* overloaded LLC is unlikely to have idle cpu/core */
> -			if (nr == 1)
> -				return -1;
> +
> +			/*
> +			 * Non-overloaded case: Scan full domain if there is
> +			 * 	an idle core. Otherwise, scan for an idle
> +			 * 	CPU based on nr_idle_scan
> +			 * Overloaded case: Unlikely to have an idle CPU but
> +			 * 	conduct a limited scan if there is potentially
> +			 * 	an idle core.
> +			 */
> +			if (nr > 1) {
> +				if (has_idle_core)
> +					nr = sd->span_weight;
> +			} else {
> +				if (!has_idle_core)
> +					return -1;
> +				nr = 2;
> +			}
>   		}
>   	}
>   
>   	for_each_cpu_wrap(cpu, cpus, target + 1) {
> +		if (!--nr)
> +			break;
> +
>   		if (has_idle_core) {
>   			i = select_idle_core(p, cpu, cpus, &idle_cpu);
>   			if ((unsigned int)i < nr_cpumask_bits)
>   				return i;
>   
>   		} else {
> -			if (!--nr)
> -				return -1;
>   			idle_cpu = __select_idle_cpu(cpu, p);
>   			if ((unsigned int)idle_cpu < nr_cpumask_bits)
>   				break;

I spent the last few days testing this with three variations
(all assuming has_idle_core):

  a) full or limited (2 cores) scan when !nr_idle_scan
  b) whether to clear sds->has_idle_core when a partial scan fails
  c) whether to scale scan depth with load

Some observations:

  1) Not clearing sds->has_idle_core when a partial scan fails
     always seems bad. With the hint left set, partial scans keep
     being repeated but still cannot find an idle core. (The
     following observations are based on clearing has_idle_core
     even after partial scans.)

  2) An unconditional full scan when has_idle_core is set is not
     good for netperf_{udp,tcp} and tbench4. This is probably
     because the SIS success rate of these workloads is already
     high (netperf ~= 100%, tbench4 ~= 50%, compared with
     hackbench ~= 3.5%), which negates much of the benefit a
     full scan brings.

  3) Scaling scan depth with load seems good for the hackbench
     socket tests and neutral for the pipe tests. I think this
     is just the case you mentioned before: under fast wake-up
     workloads has_idle_core becomes less reliable, so a full
     scan won't always win.
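
For reference, the way variations a) and c) choose the scan depth can
be modelled in user space roughly as below. This is only a sketch under
stated assumptions, not kernel code: struct sis_variant,
sis_scan_depth() and limited_depth are hypothetical names introduced
for illustration, the depth is a plain count of scan steps, and the
"+ 1" adjustment for the !--nr loop condition in the quoted patch is
ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical knobs describing one tested variation; not kernel API. */
struct sis_variant {
	bool full_scan_when_overloaded;	/* a) full vs. limited scan when !nr_idle_scan */
	bool scale_with_load;		/* c) follow nr_idle_scan instead of a full scan */
	int limited_depth;		/* assumed depth of the "limited" scan */
};

/*
 * Model of the depth decision. Returns the number of scan steps,
 * or -1 when the scan is skipped entirely.
 *
 * nr_idle_scan stands in for sd_share->nr_idle_scan (0 means the LLC
 * is considered overloaded) and span_weight for sd->span_weight.
 */
static int sis_scan_depth(const struct sis_variant *v, bool has_idle_core,
			  unsigned int nr_idle_scan, unsigned int span_weight)
{
	assert(v != NULL);

	if (nr_idle_scan == 0) {
		/* Overloaded: only scan if the idle-core hint is set. */
		if (!has_idle_core)
			return -1;
		return v->full_scan_when_overloaded ? (int)span_weight
						    : v->limited_depth;
	}

	/* Not overloaded: full scan on the hint unless depth follows load. */
	if (has_idle_core && !v->scale_with_load)
		return (int)span_weight;

	return (int)nr_idle_scan;
}
```

With full_scan_when_overloaded and scale_with_load both false, this
follows the structure of the quoted patch: skip the scan when
overloaded without the hint, do a limited scan when overloaded with
the hint, and scan the full domain otherwise when the hint is set.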

Best Regards,
Abel

