From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>,
	Mel Gorman <mgorman@suse.de>,
	Len Brown <len.brown@intel.com>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Aubrey Li <aubrey.li@linux.intel.com>,
	"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
	Ricardo Neri <ricardo.neri@intel.com>,
	Quentin Perret <qperret@google.com>,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	linuxppc-dev@lists.ozlabs.org,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Aubrey Li <aubrey.li@intel.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>
Subject: Re: [PATCH v5 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance
Date: Thu, 16 Sep 2021 18:00:44 -0700
Message-ID: <20210917010044.GA23727@ranerica-svr.sc.intel.com>
In-Reply-To: <CAKfTPtBcDP3Yp54sd4+1kP=o=4e_1HEmOf=eMXydag_J38CEng@mail.gmail.com>

On Wed, Sep 15, 2021 at 05:43:44PM +0200, Vincent Guittot wrote:
> On Sat, 11 Sept 2021 at 03:19, Ricardo Neri
> <ricardo.neri-calderon@linux.intel.com> wrote:
> >
> > When deciding to pull tasks in ASYM_PACKING, it is necessary not only to
> > check for the idle state of the destination CPU, dst_cpu, but also of
> > its SMT siblings.
> >
> > If dst_cpu is idle but its SMT siblings are busy, performance suffers
> > if it pulls tasks from a medium priority CPU that does not have SMT
> > siblings.
> >
> > Implement asym_smt_can_pull_tasks() to inspect the state of the SMT
> > siblings of both dst_cpu and the CPUs in the candidate busiest group.
> >
> > Cc: Aubrey Li <aubrey.li@intel.com>
> > Cc: Ben Segall <bsegall@google.com>
> > Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
> > Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> > Cc: Mel Gorman <mgorman@suse.de>
> > Cc: Quentin Perret <qperret@google.com>
> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Tim Chen <tim.c.chen@linux.intel.com>
> > Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > Reviewed-by: Len Brown <len.brown@intel.com>
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > ---
> > Changes since v4:
> >   * Use sg_lb_stats::sum_nr_running the idle state of a scheduling group.
> >     (Vincent, Peter)
> >   * Do not even idle CPUs in asym_smt_can_pull_tasks(). (Vincent)
> >   * Updated function documentation and corrected a typo.
> >
> > Changes since v3:
> >   * Removed the arch_asym_check_smt_siblings() hook. Discussions with the
> >     powerpc folks showed that this patch should not impact them. Also, more
> >     recent powerpc processor no longer use asym_packing. (PeterZ)
> >   * Removed unnecessary local variable in asym_can_pull_tasks(). (Dietmar)
> >   * Removed unnecessary check for local CPUs when the local group has zero
> >     utilization. (Joel)
> >   * Renamed asym_can_pull_tasks() as asym_smt_can_pull_tasks() to reflect
> >     the fact that it deals with SMT cases.
> >   * Made asym_smt_can_pull_tasks() return false for !CONFIG_SCHED_SMT so
> >     that callers can deal with non-SMT cases.
> >
> > Changes since v2:
> >   * Reworded the commit message to reflect updates in code.
> >   * Corrected misrepresentation of dst_cpu as the CPU doing the load
> >     balancing. (PeterZ)
> >   * Removed call to arch_asym_check_smt_siblings() as it is now called in
> >     sched_asym().
> >
> > Changes since v1:
> >   * Don't bailout in update_sd_pick_busiest() if dst_cpu cannot pull
> >     tasks. Instead, reclassify the candidate busiest group, as it
> >     may still be selected. (PeterZ)
> >   * Avoid an expensive and unnecessary call to cpumask_weight() when
> >     determining if a sched_group is comprised of SMT siblings.
> >     (PeterZ).
> > ---
> >  kernel/sched/fair.c | 94 +++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 94 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 26db017c14a3..8d763dd0174b 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8597,10 +8597,98 @@ group_type group_classify(unsigned int imbalance_pct,
> >  	return group_has_spare;
> >  }
> >
> > +/**
> > + * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> > + * @dst_cpu:	Destination CPU of the load balancing
> > + * @sds:	Load-balancing data with statistics of the local group
> > + * @sgs:	Load-balancing statistics of the candidate busiest group
> > + * @sg:	The candidate busiest group
> > + *
> > + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> > + * if @dst_cpu can pull tasks.
> > + *
> > + * If @dst_cpu does not have SMT siblings, it can pull tasks if two or more of
> > + * the SMT siblings of @sg are busy. If only one CPU in @sg is busy, pull tasks
> > + * only if @dst_cpu has higher priority.
> > + *
> > + * If both @dst_cpu and @sg have SMT siblings, and @sg has exactly one more
> > + * busy CPU than @sds::local, let @dst_cpu pull tasks if it has higher priority.
> > + * Bigger imbalances in the number of busy CPUs will be dealt with in
> > + * update_sd_pick_busiest().
> > + *
> > + * If @sg does not have SMT siblings, only pull tasks if all of the SMT siblings
> > + * of @dst_cpu are idle and @sg has lower priority.
> > + */
> > +static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> > +				    struct sg_lb_stats *sgs,
> > +				    struct sched_group *sg)
> > +{
> > +#ifdef CONFIG_SCHED_SMT
> > +	bool local_is_smt, sg_is_smt;
> > +	int sg_busy_cpus;
> > +
> > +	local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> > +	sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
> > +
> > +	sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
> > +
> > +	if (!local_is_smt) {
> > +		/*
> > +		 * If we are here, @dst_cpu is idle and does not have SMT
> > +		 * siblings. Pull tasks if candidate group has two or more
> > +		 * busy CPUs.
> > +		 */
> > +		if (sg_is_smt && sg_busy_cpus >= 2)
>
> Do you really need to test sg_is_smt ? if sg_busy_cpus >= 2 then
> sd_is_smt must be true ?

Thank you very much for your feedback Vincent!

Yes, it is true that sg_busy_cpus >=2 is only true if @sg is SMT. I will
remove this check.

>
> Also, This is the default behavior where we want to even the number of
> busy cpu. Shouldn't you return false and fall back to the default
> behavior ?

This is also true.

>
> That being said, the default behavior tries to even the number of idle
> cpus which is easier to compute and is equal to even the number of
> busy cpus in "normal" system with the same number of cpus in groups
> but this is not the case here. It could be good to change the default
> behavior to even the number of busy cpus and that you use the default
> behavior here. Additional condition will be used to select the busiest
> group like more busy cpu or more number of running tasks

That is a very good observation. Checking the number of idle CPUs assumes
that both groups have the same number of CPUs. I'll look into modifying the
default behavior.

>
> > +			return true;
> > +
> > +		/*
> > +		 * @dst_cpu does not have SMT siblings. @sg may have SMT
> > +		 * siblings and only one is busy. In such case, @dst_cpu
> > +		 * can help if it has higher priority and is idle (i.e.,
> > +		 * it has no running tasks).
>
> The previous comment above assume that "@dst_cpu is idle" but now you
> need to check that sds->local_stat.sum_nr_running == 0

But we already know that, right? We are here because in update_sg_lb_stats()
we determine that dst CPU is idle (env->idle != CPU_NOT_IDLE).

Thanks and BR,
Ricardo
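
The observation above, that comparing idle CPU counts only matches comparing
busy CPU counts when both groups have the same number of CPUs, can be
illustrated with a small standalone sketch. This is not kernel code and not
the change proposed in the thread: struct group_stats and the helpers
busy_cpus(), busier_by_idle_cpus() and busier_by_busy_cpus() are hypothetical
names used only for illustration. They mirror the group_weight and idle_cpus
fields and the sg_busy_cpus computation from the quoted patch.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for the per-group statistics
 * (only the two fields used by sg_busy_cpus in the quoted patch).
 */
struct group_stats {
	unsigned int group_weight;	/* number of CPUs in the group */
	unsigned int idle_cpus;		/* number of idle CPUs in the group */
};

/* Same arithmetic as sg_busy_cpus in the quoted patch. */
static unsigned int busy_cpus(const struct group_stats *gs)
{
	return gs->group_weight - gs->idle_cpus;
}

/* Idle-based view: @sg looks busier if it has fewer idle CPUs than @local. */
static bool busier_by_idle_cpus(const struct group_stats *local,
				const struct group_stats *sg)
{
	return sg->idle_cpus < local->idle_cpus;
}

/* Busy-based view: @sg looks busier if it has more busy CPUs than @local. */
static bool busier_by_busy_cpus(const struct group_stats *local,
				const struct group_stats *sg)
{
	return busy_cpus(sg) > busy_cpus(local);
}
```

Under these assumptions, take an idle local group with a single CPU (no SMT
siblings) and a candidate group with two SMT siblings, one of them busy. Both
groups have one idle CPU, so the idle-based comparison sees no difference,
while the busy-based comparison sees one busy CPU against zero. With two
groups of equal weight the two comparisons agree.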