All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Gautham R. Shenoy" <gautham.shenoy@amd.com>
To: Abel Wu <wuyun.abel@bytedance.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Mel Gorman <mgorman@suse.de>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Josh Don <joshdon@google.com>, Chen Yu <yu.c.chen@intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 5/7] sched/fair: skip SIS domain search if fully busy
Date: Wed, 20 Jul 2022 21:04:08 +0530	[thread overview]
Message-ID: <YtggcHGKVSbprXr3@BLR-5CG11610CF.amd.com> (raw)
In-Reply-To: <20220619120451.95251-6-wuyun.abel@bytedance.com>

Hello Abel,


On Sun, Jun 19, 2022 at 08:04:49PM +0800, Abel Wu wrote:
> If a full scan on SIS domain failed, then no unoccupied cpus available
> and the LLC is fully busy. In this case we'd better spend the time on
> something more useful, rather than wasting it trying to find an idle
> cpu that probably not exist.
> 
> The fully busy status will be re-evaluated when any core of this LLC
> domain enters load balancing, and cleared once idle cpus found.
> 
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> ---

[..snip..]

> @@ -6197,24 +6201,44 @@ static inline int __select_idle_cpu(int cpu, struct task_struct *p)
>  DEFINE_STATIC_KEY_FALSE(sched_smt_present);
>  EXPORT_SYMBOL_GPL(sched_smt_present);
>  
> -static inline void set_idle_cores(int cpu, int val)
> +static inline void sd_set_state(int cpu, enum sd_state state)

Nit: We are setting the state of only the LLC domain and not any other
domain via this function. So should we name it as
set_llc_state()/get_llc_state() for better readability ?


>  {
>  	struct sched_domain_shared *sds;
>  
>  	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
>  	if (sds)
> -		WRITE_ONCE(sds->has_idle_cores, val);
> +		WRITE_ONCE(sds->state, state);
>  }
>  
> -static inline bool test_idle_cores(int cpu)
> +static inline enum sd_state sd_get_state(int cpu)
>  {
>  	struct sched_domain_shared *sds;
>  
>  	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
>  	if (sds)
> -		return READ_ONCE(sds->has_idle_cores);
> +		return READ_ONCE(sds->state);
>  
> -	return false;
> +	return sd_has_icpus;
> +}
> +
> +static inline void set_idle_cores(int cpu, int idle)
                                                  ^^^^^
I agree with Josh. We can use core_idle instead of idle here.

> +{
> +	sd_set_state(cpu, idle ? sd_has_icores : sd_has_icpus);
> +}
> +
> +static inline bool test_idle_cores(int cpu)
> +{
> +	return sd_get_state(cpu) == sd_has_icores;
> +}
> +
> +static inline void set_idle_cpus(int cpu, int idle)
> +{
> +	sd_set_state(cpu, idle ? sd_has_icpus : sd_is_busy);
> +}
> +
> +static inline bool test_idle_cpus(int cpu)
> +{
> +	return sd_get_state(cpu) != sd_is_busy;
>  }
>  
>  /*

[...]


> @@ -8661,6 +8702,12 @@ sched_asym(struct lb_env *env, struct sd_lb_stats *sds,  struct sg_lb_stats *sgs
>  	return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
>  }
>  
> +static inline void sd_classify(struct sd_lb_stats *sds, struct rq *rq)
> +{
> +	if (sds->sd_state != sd_has_icpus && unoccupied_rq(rq))

Nit: sds->sd_state can either be sd_has_icpus or sd_is_busy. So for
better readability, we can just use the positive check

        if (sds->sd_state == sd_is_busy && unoccupied_rq(rq))
	       sds->sd_state  = sd_has_icpus;


> +		sds->sd_state = sd_has_icpus;
> +}
> +
>  /**
>   * update_sg_lb_stats - Update sched_group's statistics for load balancing.
>   * @env: The load balancing environment.
> @@ -8675,11 +8722,12 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  				      struct sg_lb_stats *sgs,
>  				      int *sg_status)
>  {
> -	int i, nr_running, local_group;
> +	int i, nr_running, local_group, update_core;
>  
>  	memset(sgs, 0, sizeof(*sgs));
>  
>  	local_group = group == sds->local;
> +	update_core = env->sd->flags & SD_SHARE_CPUCAPACITY;
>  
>  	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
>  		struct rq *rq = cpu_rq(i);
> @@ -8692,6 +8740,9 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  		nr_running = rq->nr_running;
>  		sgs->sum_nr_running += nr_running;
>  
> +		if (update_core)
> +			sd_classify(sds, rq);
> +
>  		if (nr_running > 1)
>  			*sg_status |= SG_OVERLOAD;
>  
> @@ -9220,6 +9271,12 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
>  	return idlest;
>  }
>  
> +static void sd_update_state(struct lb_env *env, struct sd_lb_stats *sds)
> +{
> +	if (sds->sd_state == sd_has_icpus && !test_idle_cpus(env->dst_cpu))
> +		set_idle_cpus(env->dst_cpu, true);

We could enter this if condition when env->dst_cpu is the only idle
CPU in the SMT domain (which is likely to be the case every time we do
a NEW_IDLE balance). By the end of this load-balancing round, the
env->dst_cpu can pull a task from some other CPU and thereby no longer
remain idle but the LLC state would still be sd_has_icpus.

That would mean that some CPU on this LLC would do a full scan during
the wakeup only to find no idle CPU and reset the state to
sd_is_busy. Have you seen instances where this false-positive pattern
can result in wasteful scan thereby cause a performance degradation?
Ideally it should not be worse that what we currently have.

Apart from this, patch looks good to me.

I would be worth the while to explore if the LLC state can be used
early on in select_task_rq_fair() to determine if we need to do a
wake-affine or allow the task to stick to its previous LLC depending
on which among the previous LLC and the waker's LLC have an idle CPU.

--
Thanks and Regards
gautham.

  parent reply	other threads:[~2022-07-20 15:34 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-19 12:04 [PATCH v4 0/7] sched/fair: improve scan efficiency of SIS Abel Wu
2022-06-19 12:04 ` [PATCH v4 1/7] sched/fair: default to false in test_idle_cores Abel Wu
2022-06-27 22:53   ` Josh Don
2022-06-28  3:39     ` Abel Wu
2022-06-19 12:04 ` [PATCH v4 2/7] sched/fair: remove redundant check in select_idle_smt Abel Wu
2022-06-27 23:17   ` Josh Don
2022-06-19 12:04 ` [PATCH v4 3/7] sched/fair: avoid double search on same cpu Abel Wu
2022-06-27 23:24   ` Josh Don
2022-06-19 12:04 ` [PATCH v4 4/7] sched/fair: remove useless check in select_idle_core Abel Wu
2022-06-27 23:42   ` Josh Don
2022-06-28  3:51     ` Abel Wu
2022-06-29  0:41       ` Josh Don
2022-06-19 12:04 ` [PATCH v4 5/7] sched/fair: skip SIS domain search if fully busy Abel Wu
2022-06-28  0:28   ` Josh Don
2022-06-28  6:53     ` Abel Wu
2022-06-29  1:11       ` Josh Don
2022-06-29  7:05         ` Abel Wu
2022-07-20 15:34   ` Gautham R. Shenoy [this message]
2022-08-15  9:49     ` Abel Wu
2022-06-19 12:04 ` [PATCH v4 6/7] sched/fair: skip busy cores in SIS search Abel Wu
2022-06-21 18:14   ` Chen Yu
2022-06-22  3:52     ` [External] " Abel Wu
2022-06-24  3:30       ` Chen Yu
2022-06-27 10:13         ` Abel Wu
2022-06-28  7:58           ` Abel Wu
2022-06-30  4:16             ` Chen Yu
2022-06-30 10:46               ` Abel Wu
2022-07-09  8:55                 ` Chen Yu
2022-07-09 15:56                   ` Abel Wu
2022-07-11 12:02                     ` Chen Yu
2022-07-13 10:25                       ` Abel Wu
2022-06-22  4:03     ` Abel Wu
2022-07-20 16:16   ` Gautham R. Shenoy
2022-08-15  9:49     ` Abel Wu
2022-06-19 12:04 ` [PATCH v4 7/7] sched/fair: de-entropy for SIS filter Abel Wu
2022-06-21 18:23   ` Chen Yu
2022-06-22  4:01     ` Abel Wu
2022-06-30  7:46   ` Abel Wu
2022-07-09 14:42   ` [sched/fair] 32fe13cd7a: phoronix-test-suite.fio.SequentialWrite.IO_uring.Yes.No.4KB.DefaultTestDirectory.mb_s -11.7% regression kernel test robot
2022-07-09 14:42     ` kernel test robot
2022-07-09 16:14     ` Abel Wu
2022-07-09 16:14       ` Abel Wu
2022-07-20 17:08   ` [PATCH v4 7/7] sched/fair: de-entropy for SIS filter Gautham R. Shenoy
2022-08-15  9:49     ` Abel Wu
2022-07-06  9:51 ` [PATCH v4 0/7] sched/fair: improve scan efficiency of SIS Abel Wu
2022-07-18 11:00 ` K Prateek Nayak
2022-08-15 13:59   ` Abel Wu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YtggcHGKVSbprXr3@BLR-5CG11610CF.amd.com \
    --to=gautham.shenoy@amd.com \
    --cc=joshdon@google.com \
    --cc=kprateek.nayak@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=peterz@infradead.org \
    --cc=tim.c.chen@linux.intel.com \
    --cc=vincent.guittot@linaro.org \
    --cc=wuyun.abel@bytedance.com \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.