* [PATCH v2 0/4] sched/fair: Fix load balancing of SMT siblings with ASYM_PACKING
@ 2021-04-14  2:04 Ricardo Neri
  2021-04-14  2:04 ` [PATCH v2 1/4] sched/fair: Optimize checking for group_asym_packing Ricardo Neri
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Ricardo Neri @ 2021-04-14  2:04 UTC (permalink / raw)
  To: Peter Zijlstra (Intel), Ingo Molnar, Juri Lelli, Vincent Guittot
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Len Brown, Srinivas Pandruvada, Tim Chen, Aubrey Li,
	Ravi V. Shankar, Ricardo Neri, Quentin Perret,
	Joel Fernandes (Google),
	linux-kernel, Ricardo Neri

=== Problem statement ===
On SMT-enabled hardware, ASYM_PACKING can cause the load balancer to choose
low priority CPUs over medium priority CPUs.

When balancing load in scheduling domains with the SD_ASYM_PACKING flag,
idle CPUs of higher priority pull tasks from CPUs of lower priority. This
balancing is done by comparing pairs of scheduling groups. There may also
be scheduling groups composed of CPUs that are SMT siblings.

When using SD_ASYM_PACKING on x86 with Intel Turbo Boost Max Technology 3.0
(ITMT), the priorities of a scheduling group of CPUs that has N SMT
siblings are assigned as N*P, N*P/2, N*P/3, ..., P, where P is the
priority assigned by the hardware to the physical core and N is the number
of SMT siblings.

Systems such as Intel Comet Lake can have some cores that support SMT, while
others do not. As a result, it is possible to have medium non-SMT
priorities, Q, such that N*P > Q > P.
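
For illustration, a minimal, self-contained sketch of the priority layout
described above (not code from this series; P, N and Q are made-up values):

#include <stdio.h>

int main(void)
{
	const int P = 60, N = 2, Q = 80;	/* hypothetical hw priorities */
	int i;

	/* The N siblings of one SMT core get N*P, N*P/2, ..., P. */
	for (i = 1; i <= N; i++)
		printf("SMT sibling %d of the P-core: priority %d\n", i, N * P / i);

	/* A non-SMT core with priority Q then satisfies N*P > Q > P. */
	printf("non-SMT core: priority %d\n", Q);

	return 0;
}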

When comparing groups for load balancing, the priority of the CPU doing the
load balancing is only compared with the preferred CPU of the candidate
busiest group (N*P vs Q in the example above). Thus, scheduling groups
with a preferred CPU with priority N*P always pull tasks from the
scheduling group with priority Q and then such tasks are spread across the
“SMT” domain. Conversely, since N*P > Q, CPUs with priority Q cannot
pull tasks from a group with a preferred CPU with priority N*P, even
though Q > P.

Doing load balancing based on load (i.e. if the busiest group is of type
group_overloaded) will not help if the system is not fully busy, since the
involved groups will have only one task each and load balancing will not be
deemed necessary.

The behavior described above results in leaving CPUs with medium priority
idle, while CPUs with lower priority are busy. Furthermore, such CPUs of
lower priority are SMT siblings of high priority CPUs, which are also busy.

This patchset fixes this behavior by also checking the idle state of the SMT
siblings of the CPU doing the load balancing and of the CPUs in the busiest
candidate group.
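
In short, the rules implemented by patch 3 can be summarized as in the sketch
below (simplified from the asym_can_pull_tasks() kernel-doc; the function and
parameter names are illustrative placeholders, not scheduler API):

#include <stdbool.h>

/* Simplified decision sketch; see asym_can_pull_tasks() in patch 3. */
bool smt_aware_can_pull(bool dst_is_smt, int dst_busy_siblings,
			bool sg_is_smt, int sg_busy_cpus,
			bool dst_has_higher_prio)
{
	if (!dst_is_smt) {
		/* Non-SMT dst: pull if the candidate group has 2+ busy CPUs... */
		if (sg_is_smt && sg_busy_cpus >= 2)
			return true;
		/* ...or only one busy CPU but dst has higher priority. */
		return dst_has_higher_prio;
	}

	if (sg_is_smt) {
		/* Both have SMT siblings: even out the number of busy CPUs. */
		int delta = sg_busy_cpus - dst_busy_siblings;

		if (delta >= 2)
			return true;
		return delta == 1 && dst_has_higher_prio;
	}

	/* SMT dst, non-SMT candidate: pull only if all dst siblings are idle. */
	return !dst_busy_siblings && dst_has_higher_prio;
}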

I ran a few benchmarks with and without this version of the patchset on
an Intel(R) Core(TM) i9-7900X CPU. I kept online both SMT siblings of two
high priority cores. I offlined the lower priority SMT siblings of four
low priority cores. I offlined the rest of the cores. The resulting
priority configuration meets the N*P > Q > P condition described above.

The baseline for the results is an unmodified v5.12-rc3 kernel. Results
show a comparative percentage of improvement (positive) or degradation
(negative). Each test case is repeated five times, and the standard
deviation among repetitions is also documented.
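
To make the tables easier to read: my reading (the normalization is not stated
explicitly in this posting) is that the baseline mean is reported as 1.00,
compare% is the relative change of the patched mean, and std% is the standard
deviation expressed as a percentage of the respective mean. A hypothetical
calculation with made-up samples:

#include <math.h>
#include <stdio.h>

static double mean(const double *v, int n)
{
	double m = 0.0;
	int i;

	for (i = 0; i < n; i++)
		m += v[i] / n;
	return m;
}

static double std_pct(const double *v, int n)
{
	double m = mean(v, n), var = 0.0;
	int i;

	for (i = 0; i < n; i++)
		var += (v[i] - m) * (v[i] - m) / n;
	return 100.0 * sqrt(var) / m;	/* stddev as a percentage of the mean */
}

int main(void)
{
	/* Made-up throughput samples from five runs each. */
	const double base[5] = {  98.0, 101.0, 100.0, 102.0,  99.0 };
	const double test[5] = { 104.0, 106.0, 105.0, 103.0, 107.0 };

	printf("baseline  1.00 (%5.2f)  compare%% %+5.2f (%5.2f)\n",
	       std_pct(base, 5),
	       100.0 * (mean(test, 5) / mean(base, 5) - 1.0),
	       std_pct(test, 5));
	return 0;
}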

In order to judge only the improvements this patchset provides, Table 1
shows the results when setting the CPU frequency to 1000MHz. It can be
observed that the patches bring an overall positive impact for tests
that exhibit low variance. Some of the test cases, however, show high
variance, which makes their impact difficult to assess.

Table 2 shows the results when using hardware-controlled performance
states (HWP), a common use case. In this case, the impact of the patches
is also overall positive. schbench exhibits a small degradation, but it
also exhibits a large variance.

v1 patches and test results can be found in [1].

Thanks and BR,
Ricardo

========
Changes since v1:
  * Don't bailout in update_sd_pick_busiest() if dst_cpu cannot pull
    tasks. Instead, reclassify the candidate busiest group, as it
    may still be selected. (PeterZ)
  * Avoid an expensive and unnecessary call to cpumask_weight() when
    determining if a sched_group is comprised of SMT siblings.
    (PeterZ).
  * Updated test results using the v2 patches.

========      Table 1. Test results of patches at 1000MHz      ========
=======================================================================
hackbench
=========
case                    load            baseline(std%)  compare%( std%)
process-pipe            group-1          1.00 (  4.01)   +6.98 (  3.05)
process-pipe            group-2          1.00 (  4.17)   +1.20 (  2.21)
process-pipe            group-4          1.00 (  5.26)   +4.54 (  2.67)
process-pipe            group-8          1.00 ( 12.57)   -2.49 ( 11.70)
process-sockets         group-1          1.00 (  0.65)   +3.77 (  0.87)
process-sockets         group-2          1.00 (  1.04)   +0.45 (  0.49)
process-sockets         group-4          1.00 (  0.57)   +0.16 (  1.63)
process-sockets         group-8          1.00 (  6.39)   +0.58 (  6.16)
threads-pipe            group-1          1.00 (  2.82)   +1.04 (  3.60)
threads-pipe            group-2          1.00 (  2.50)   +3.53 (  1.44)
threads-pipe            group-4          1.00 (  5.55)   -0.02 (  1.05)
threads-pipe            group-8          1.00 ( 12.85)   -1.00 ( 14.05)
threads-sockets         group-1          1.00 (  1.26)   +4.54 (  1.92)
threads-sockets         group-2          1.00 (  2.11)   +2.80 (  0.40)
threads-sockets         group-4          1.00 (  0.70)   +0.61 (  0.88)
threads-sockets         group-8          1.00 (  1.20)   +0.29 (  2.20)

netperf
=======
case                    load            baseline(std%)  compare%( std%)
TCP_RR                  thread-2         1.00 (  0.72)   +3.48 (  0.40)
TCP_RR                  thread-4         1.00 (  0.38)   +4.11 (  0.35)
TCP_RR                  thread-8         1.00 (  9.29)   -0.67 ( 14.12)
TCP_RR                  thread-16        1.00 ( 21.18)   -0.10 ( 20.92)
UDP_RR                  thread-2         1.00 (  0.23)   +4.08 (  0.16)
UDP_RR                  thread-4         1.00 (  0.19)   +3.53 (  0.27)
UDP_RR                  thread-8         1.00 (  6.83)   -1.63 ( 11.09)
UDP_RR                  thread-16        1.00 ( 20.85)   -1.02 ( 21.20)

tbench
======
case                    load            baseline(std%)  compare%( std%)
loopback                thread-2         1.00 (  0.15)   +4.49 (  0.20)
loopback                thread-4         1.00 (  0.25)   +3.99 (  0.32)
loopback                thread-8         1.00 (  0.16)   +0.49 (  0.30)
loopback                thread-16        1.00 (  0.32)   +0.49 (  0.21)

schbench
========
case                    load            baseline(std%)  compare%( std%)
normal                  mthread-1        1.00 ( 15.87)  -17.76 (  8.38)
normal                  mthread-2        1.00 (  0.10)   -0.05 (  0.00)
normal                  mthread-4        1.00 (  0.00)   +0.00 (  0.00)
normal                  mthread-8        1.00 (  4.81)   +5.57 (  2.61)


========      Table 2. Test results of patches with HWP        ========
=======================================================================
hackbench
=========
case                    load            baseline(std%)  compare%( std%)
process-pipe            group-1          1.00 (  0.76)   +0.14 (  1.11)
process-pipe            group-2          1.00 (  2.59)   +0.88 (  3.16)
process-pipe            group-4          1.00 (  4.05)   +1.13 (  4.86)
process-pipe            group-8          1.00 (  7.37)   -2.34 ( 14.43)
process-sockets         group-1          1.00 ( 15.98)   +7.77 (  1.44)
process-sockets         group-2          1.00 (  1.64)   +0.00 (  1.59)
process-sockets         group-4          1.00 (  0.95)   -1.54 (  0.85)
process-sockets         group-8          1.00 (  1.84)   -3.27 (  4.86)
threads-pipe            group-1          1.00 (  3.27)   +0.64 (  2.91)
threads-pipe            group-2          1.00 (  3.02)   -0.09 (  1.50)
threads-pipe            group-4          1.00 (  5.39)   -6.34 (  3.11)
threads-pipe            group-8          1.00 (  5.56)   +4.61 ( 14.91)
threads-sockets         group-1          1.00 (  2.76)   +4.70 (  0.94)
threads-sockets         group-2          1.00 (  1.10)   +3.56 (  1.41)
threads-sockets         group-4          1.00 (  0.45)   +2.11 (  1.32)
threads-sockets         group-8          1.00 (  3.56)   +3.62 (  2.43)

netperf
=======
case                    load            baseline(std%)  compare%( std%)
TCP_RR                  thread-2         1.00 (  0.36)   +9.85 (  2.09)
TCP_RR                  thread-4         1.00 (  0.31)   +1.30 (  0.53)
TCP_RR                  thread-8         1.00 ( 11.70)   -0.42 ( 13.31)
TCP_RR                  thread-16        1.00 ( 23.49)   -0.55 ( 21.79)
UDP_RR                  thread-2         1.00 (  0.19)  +13.11 (  7.48)
UDP_RR                  thread-4         1.00 (  0.13)   +2.69 (  0.26)
UDP_RR                  thread-8         1.00 (  8.95)   -1.39 ( 12.39)
UDP_RR                  thread-16        1.00 ( 21.54)   -0.77 ( 20.97)

tbench
======
case                    load            baseline(std%)  compare%( std%)
loopback                thread-2         1.00 (  0.22)   +5.26 (  0.46)
loopback                thread-4         1.00 (  2.56)  +52.11 (  0.73)
loopback                thread-8         1.00 (  0.41)   +0.53 (  0.42)
loopback                thread-16        1.00 (  0.60)   +1.39 (  0.33)

schbench
========
case                    load            baseline(std%)  compare%( std%)
normal                  mthread-1        1.00 ( 12.01)   -1.72 (  2.08)
normal                  mthread-2        1.00 (  0.00)   +0.00 (  0.00)
normal                  mthread-4        1.00 (  0.00)   +0.00 (  0.00)
normal                  mthread-8        1.00 (  0.00)   +0.00 (  0.00)

========
[1]. https://lore.kernel.org/lkml/20210406041108.7416-1-ricardo.neri-calderon@linux.intel.com/

Ricardo Neri (4):
  sched/fair: Optimize checking for group_asym_packing
  sched/fair: Introduce arch_sched_asym_prefer_early()
  sched/fair: Consider SMT in ASYM_PACKING load balance
  x86/sched: Enable checks of the state of SMT siblings in load
    balancing

 arch/x86/kernel/itmt.c         |  15 ++++
 include/linux/sched/topology.h |   2 +
 kernel/sched/fair.c            | 140 ++++++++++++++++++++++++++++++++-
 3 files changed, 155 insertions(+), 2 deletions(-)

-- 
2.17.1



* [PATCH v2 1/4] sched/fair: Optimize checking for group_asym_packing
  2021-04-14  2:04 [PATCH v2 0/4] sched/fair: Fix load balancing of SMT siblings with ASYM_PACKING Ricardo Neri
@ 2021-04-14  2:04 ` Ricardo Neri
  2021-05-03  9:24   ` Peter Zijlstra
  2021-04-14  2:04 ` [PATCH v2 2/4] sched/fair: Introduce arch_sched_asym_prefer_early() Ricardo Neri
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Ricardo Neri @ 2021-04-14  2:04 UTC (permalink / raw)
  To: Peter Zijlstra (Intel), Ingo Molnar, Juri Lelli, Vincent Guittot
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Len Brown, Srinivas Pandruvada, Tim Chen, Aubrey Li,
	Ravi V. Shankar, Ricardo Neri, Quentin Perret,
	Joel Fernandes (Google),
	linux-kernel, Ricardo Neri, Aubrey Li, Ben Segall,
	Daniel Bristot de Oliveira

By checking local_group, we can avoid additional checks and invoking
sched_asym_prefer() when it is not needed.

Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Ben Segall <bsegall@google.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Quentin Perret <qperret@google.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
Changes since v1:
  * None
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 04a3ce20da67..4ef3fa0d5e8d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8455,7 +8455,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 	}
 
 	/* Check if dst CPU is idle and preferred to this group */
-	if (env->sd->flags & SD_ASYM_PACKING &&
+	if (!local_group && env->sd->flags & SD_ASYM_PACKING &&
 	    env->idle != CPU_NOT_IDLE &&
 	    sgs->sum_h_nr_running &&
 	    sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu)) {
-- 
2.17.1



* [PATCH v2 2/4] sched/fair: Introduce arch_sched_asym_prefer_early()
  2021-04-14  2:04 [PATCH v2 0/4] sched/fair: Fix load balancing of SMT siblings with ASYM_PACKING Ricardo Neri
  2021-04-14  2:04 ` [PATCH v2 1/4] sched/fair: Optimize checking for group_asym_packing Ricardo Neri
@ 2021-04-14  2:04 ` Ricardo Neri
  2021-04-14  2:04 ` [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance Ricardo Neri
  2021-04-14  2:04 ` [PATCH v2 4/4] x86/sched: Enable checks of the state of SMT siblings in load balancing Ricardo Neri
  3 siblings, 0 replies; 16+ messages in thread
From: Ricardo Neri @ 2021-04-14  2:04 UTC (permalink / raw)
  To: Peter Zijlstra (Intel), Ingo Molnar, Juri Lelli, Vincent Guittot
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Len Brown, Srinivas Pandruvada, Tim Chen, Aubrey Li,
	Ravi V. Shankar, Ricardo Neri, Quentin Perret,
	Joel Fernandes (Google),
	linux-kernel, Ricardo Neri, Aubrey Li, Ben Segall,
	Daniel Bristot de Oliveira

Introduce arch_sched_asym_prefer_early() so that architectures with SMT
can delay the decision to label a candidate busiest group as
group_asym_packing.

When using asymmetric packing, high priority idle CPUs pull tasks from
scheduling groups with low priority CPUs. The decision on using asymmetric
packing for load balancing is done after collecting the statistics of a
candidate busiest group. However, this decision needs to consider the
state of SMT siblings of dst_cpu.

Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Ben Segall <bsegall@google.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Quentin Perret <qperret@google.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
Changes since v1:
  * None
---
 include/linux/sched/topology.h |  1 +
 kernel/sched/fair.c            | 11 ++++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 8f0f778b7c91..663b98959305 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -57,6 +57,7 @@ static inline int cpu_numa_flags(void)
 #endif
 
 extern int arch_asym_cpu_priority(int cpu);
+extern bool arch_sched_asym_prefer_early(int a, int b);
 
 struct sched_domain_attr {
 	int relax_domain_level;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4ef3fa0d5e8d..e74da853b046 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -106,6 +106,15 @@ int __weak arch_asym_cpu_priority(int cpu)
 	return -cpu;
 }
 
+/*
+ * For asym packing, early check if CPUs with higher priority should be
+ * preferred. On some architectures, more data is needed to make a decision.
+ */
+bool __weak arch_sched_asym_prefer_early(int a, int b)
+{
+	return sched_asym_prefer(a, b);
+}
+
 /*
  * The margin used when comparing utilization with CPU capacity.
  *
@@ -8458,7 +8467,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 	if (!local_group && env->sd->flags & SD_ASYM_PACKING &&
 	    env->idle != CPU_NOT_IDLE &&
 	    sgs->sum_h_nr_running &&
-	    sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu)) {
+	    arch_sched_asym_prefer_early(env->dst_cpu, group->asym_prefer_cpu)) {
 		sgs->group_asym_packing = 1;
 	}
 
-- 
2.17.1



* [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance
  2021-04-14  2:04 [PATCH v2 0/4] sched/fair: Fix load balancing of SMT siblings with ASYM_PACKING Ricardo Neri
  2021-04-14  2:04 ` [PATCH v2 1/4] sched/fair: Optimize checking for group_asym_packing Ricardo Neri
  2021-04-14  2:04 ` [PATCH v2 2/4] sched/fair: Introduce arch_sched_asym_prefer_early() Ricardo Neri
@ 2021-04-14  2:04 ` Ricardo Neri
  2021-05-03  9:52   ` Peter Zijlstra
                     ` (2 more replies)
  2021-04-14  2:04 ` [PATCH v2 4/4] x86/sched: Enable checks of the state of SMT siblings in load balancing Ricardo Neri
  3 siblings, 3 replies; 16+ messages in thread
From: Ricardo Neri @ 2021-04-14  2:04 UTC (permalink / raw)
  To: Peter Zijlstra (Intel), Ingo Molnar, Juri Lelli, Vincent Guittot
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Len Brown, Srinivas Pandruvada, Tim Chen, Aubrey Li,
	Ravi V. Shankar, Ricardo Neri, Quentin Perret,
	Joel Fernandes (Google),
	linux-kernel, Ricardo Neri, Aubrey Li, Ben Segall,
	Daniel Bristot de Oliveira

When deciding to pull tasks in ASYM_PACKING, it is necessary not only to
check for the idle state of the CPU doing the load balancing, but also of
its SMT siblings.

If the CPU doing the balancing is idle but its SMT siblings are busy,
performance suffers if it pulls tasks from a medium priority CPU that does
not have SMT siblings. The decision to label a group for asymmetric packing
balancing is done in update_sg_lb_stats(). However, for SMT, that code does
not account for idle SMT siblings.

Implement asym_can_pull_tasks() to revisit the early decision on whether
the CPU doing the balance can pull tasks once the needed information is
available. arch_sched_asym_prefer_early() and
arch_asym_check_smt_siblings() are used to conserve the legacy behavior.

Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Ben Segall <bsegall@google.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Quentin Perret <qperret@google.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
Changes since v1:
 * Don't bailout in update_sd_pick_busiest() if dst_cpu cannot pull
   tasks. Instead, reclassify the candidate busiest group, as it
   may still be selected. (PeterZ)
 * Avoid an expensive and unnecessary call to cpumask_weight() when
   determining if a sched_group is comprised of SMT siblings.
   (PeterZ).
---
 include/linux/sched/topology.h |   1 +
 kernel/sched/fair.c            | 127 +++++++++++++++++++++++++++++++++
 2 files changed, 128 insertions(+)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 663b98959305..6487953b24e8 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -58,6 +58,7 @@ static inline int cpu_numa_flags(void)
 
 extern int arch_asym_cpu_priority(int cpu);
 extern bool arch_sched_asym_prefer_early(int a, int b);
+extern bool arch_asym_check_smt_siblings(void);
 
 struct sched_domain_attr {
 	int relax_domain_level;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e74da853b046..2a33f93646b2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -115,6 +115,14 @@ bool __weak arch_sched_asym_prefer_early(int a, int b)
 	return sched_asym_prefer(a, b);
 }
 
+/*
+ * For asym packing, first check the state of SMT siblings before deciding to
+ * pull tasks.
+ */
+bool __weak arch_asym_check_smt_siblings(void)
+{
+	return false;
+}
 /*
  * The margin used when comparing utilization with CPU capacity.
  *
@@ -8483,6 +8491,107 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 				sgs->group_capacity;
 }
 
+static bool cpu_group_is_smt(int cpu, struct sched_group *sg)
+{
+#ifdef CONFIG_SCHED_SMT
+	if (!static_branch_likely(&sched_smt_present))
+		return false;
+
+	if (sg->group_weight == 1)
+		return false;
+
+	return cpumask_equal(sched_group_span(sg), cpu_smt_mask(cpu));
+#else
+	return false;
+#endif
+}
+
+/**
+ * asym_can_pull_tasks - Check whether the load balancing CPU can pull tasks
+ * @dst_cpu:	CPU doing the load balancing
+ * @sds:	Load-balancing data with statistics of the local group
+ * @sgs:	Load-balancing statistics of the candidate busiest group
+ * @sg:		The candidate busiest group
+ *
+ * Check the state of the SMT siblings of both @sds::local and @sg and decide
+ * if @dst_cpu can pull tasks. If @dst_cpu does not have SMT siblings, it can
+ * pull tasks if two or more of the SMT siblings of @sg are busy. If only one
+ * CPU in @sg is busy, pull tasks only if @dst_cpu has higher priority.
+ *
+ * If both @dst_cpu and @sg have SMT siblings, even out the number of idle
+ * CPUs between @sds::local and @sg. Thus, pull tasks from @sg if the
+ * difference between the numbers of busy CPUs is 2 or more. If the difference
+ * is 1, only pull if @dst_cpu has higher priority. If @sg does not have SMT
+ * siblings, only pull tasks if all of the SMT siblings of @dst_cpu are idle
+ * and @sg has lower priority.
+ */
+static bool asym_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
+				struct sg_lb_stats *sgs, struct sched_group *sg)
+{
+#ifdef CONFIG_SCHED_SMT
+	int cpu, local_busy_cpus, sg_busy_cpus;
+	bool local_is_smt, sg_is_smt;
+
+	if (!arch_asym_check_smt_siblings())
+		return true;
+
+	cpu = group_first_cpu(sg);
+	local_is_smt = cpu_group_is_smt(dst_cpu, sds->local);
+	sg_is_smt = cpu_group_is_smt(cpu, sg);
+
+	sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
+
+	if (!local_is_smt) {
+		/*
+		 * If we are here, @dst_cpu is idle and does not have SMT
+		 * siblings. Pull tasks if candidate group has two or more
+		 * busy CPUs.
+		 */
+		if (sg_is_smt && sg_busy_cpus >= 2)
+			return true;
+
+		/*
+		 * @dst_cpu does not have SMT siblings. @sg may have SMT
+		 * siblings and only one is busy. In such case, @dst_cpu
+		 * can help if it has higher priority and is idle.
+		 */
+		return !sds->local_stat.group_util &&
+		       sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
+	}
+
+	/* @dst_cpu has SMT siblings. */
+
+	local_busy_cpus = sds->local->group_weight - sds->local_stat.idle_cpus;
+
+	if (sg_is_smt) {
+		int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
+
+		/* Local can always help to even out the number of busy CPUs. */
+		if (busy_cpus_delta >= 2)
+			return true;
+
+		if (busy_cpus_delta == 1)
+			return sched_asym_prefer(dst_cpu,
+						 sg->asym_prefer_cpu);
+
+		return false;
+	}
+
+	/*
+	 * @sg does not have SMT siblings. Ensure that @sds::local does not end
+	 * up with more than one busy SMT sibling and only pull tasks if there
+	 * are no busy CPUs. As CPUs move in and out of idle state frequently,
+	 * also check the group utilization to smooth the decision.
+	 */
+	if (!local_busy_cpus && !sds->local_stat.group_util)
+		return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
+
+	return false;
+#else
+	return true;
+#endif
+}
+
 /**
  * update_sd_pick_busiest - return 1 on busiest group
  * @env: The load balancing environment.
@@ -8507,6 +8616,18 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 	if (!sgs->sum_h_nr_running)
 		return false;
 
+	/*
+	 * @sg may have been tentatively classified as group_asym_packing.
+	 * Now that we have sufficient information about @sds.local, reassess
+	 * if asym packing migration can be done. Reclassify @sg. The only
+	 * possible results are group_has_spare and group_fully_busy.
+	 */
+	if (sgs->group_type == group_asym_packing &&
+	    !asym_can_pull_tasks(env->dst_cpu, sds, sgs, sg)) {
+		sgs->group_asym_packing = 0;
+		sgs->group_type = group_classify(env->sd->imbalance_pct, sg, sgs);
+	}
+
 	/*
 	 * Don't try to pull misfit tasks we can't help.
 	 * We can use max_capacity here as reduction in capacity on some
@@ -9412,6 +9533,12 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 		    nr_running == 1)
 			continue;
 
+		/* Make sure we only pull tasks from a CPU of lower priority */
+		if ((env->sd->flags & SD_ASYM_PACKING) &&
+		    sched_asym_prefer(i, env->dst_cpu) &&
+		    nr_running == 1)
+			continue;
+
 		switch (env->migration_type) {
 		case migrate_load:
 			/*
-- 
2.17.1



* [PATCH v2 4/4] x86/sched: Enable checks of the state of SMT siblings in load balancing
  2021-04-14  2:04 [PATCH v2 0/4] sched/fair: Fix load balancing of SMT siblings with ASYM_PACKING Ricardo Neri
                   ` (2 preceding siblings ...)
  2021-04-14  2:04 ` [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance Ricardo Neri
@ 2021-04-14  2:04 ` Ricardo Neri
  3 siblings, 0 replies; 16+ messages in thread
From: Ricardo Neri @ 2021-04-14  2:04 UTC (permalink / raw)
  To: Peter Zijlstra (Intel), Ingo Molnar, Juri Lelli, Vincent Guittot
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Len Brown, Srinivas Pandruvada, Tim Chen, Aubrey Li,
	Ravi V. Shankar, Ricardo Neri, Quentin Perret,
	Joel Fernandes (Google),
	linux-kernel, Ricardo Neri, Aubrey Li, Ben Segall,
	Daniel Bristot de Oliveira

ITMT relies on asymmetric packing of tasks to ensure CPUs are populated in
priority order. When balancing load, the scheduler compares scheduling
groups in pairs, and compares only the priority of the highest-priority CPU
of each group. This may result in CPUs with medium priority being
overlooked. A recent change introduced logic to also consider the idle
state of the SMT siblings of the CPU doing the load balance. Enable those
checks for x86 when using ITMT.

Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Ben Segall <bsegall@google.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Quentin Perret <qperret@google.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
Changes since v1:
  * None
---
 arch/x86/kernel/itmt.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/kernel/itmt.c b/arch/x86/kernel/itmt.c
index 1afbdd1dd777..1407120af82d 100644
--- a/arch/x86/kernel/itmt.c
+++ b/arch/x86/kernel/itmt.c
@@ -28,6 +28,8 @@ DEFINE_PER_CPU_READ_MOSTLY(int, sched_core_priority);
 
 /* Boolean to track if system has ITMT capabilities */
 static bool __read_mostly sched_itmt_capable;
+/* Boolean to activate checks on the state of SMT siblings */
+static bool __read_mostly sched_itmt_smt_checks;
 
 /*
  * Boolean to control whether we want to move processes to cpu capable
@@ -124,6 +126,8 @@ int sched_set_itmt_support(void)
 
 	sysctl_sched_itmt_enabled = 1;
 
+	sched_itmt_smt_checks = true;
+
 	x86_topology_update = true;
 	rebuild_sched_domains();
 
@@ -160,6 +164,7 @@ void sched_clear_itmt_support(void)
 	if (sysctl_sched_itmt_enabled) {
 		/* disable sched_itmt if we are no longer ITMT capable */
 		sysctl_sched_itmt_enabled = 0;
+		sched_itmt_smt_checks = false;
 		x86_topology_update = true;
 		rebuild_sched_domains();
 	}
@@ -167,6 +172,16 @@ void sched_clear_itmt_support(void)
 	mutex_unlock(&itmt_update_mutex);
 }
 
+bool arch_asym_check_smt_siblings(void)
+{
+	return sched_itmt_smt_checks;
+}
+
+bool arch_sched_asym_prefer_early(int a, int b)
+{
+	return sched_itmt_smt_checks;
+}
+
 int arch_asym_cpu_priority(int cpu)
 {
 	return per_cpu(sched_core_priority, cpu);
-- 
2.17.1



* Re: [PATCH v2 1/4] sched/fair: Optimize checking for group_asym_packing
  2021-04-14  2:04 ` [PATCH v2 1/4] sched/fair: Optimize checking for group_asym_packing Ricardo Neri
@ 2021-05-03  9:24   ` Peter Zijlstra
  2021-05-04  3:08     ` Ricardo Neri
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2021-05-03  9:24 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Len Brown,
	Srinivas Pandruvada, Tim Chen, Aubrey Li, Ravi V. Shankar,
	Ricardo Neri, Quentin Perret, Joel Fernandes (Google),
	linux-kernel, Aubrey Li, Ben Segall, Daniel Bristot de Oliveira

On Tue, Apr 13, 2021 at 07:04:33PM -0700, Ricardo Neri wrote:
> By checking local_group, we can avoid additional checks and invoking
> sched_asym_prefer() when it is not needed.

This really could do with a few words on *why* that is correct. ISTR
thinking this made sense when I last looked at it, but today, after
having the brain reset by not looking at a computer for 4 days, it's not
immediately obvious anymore.

> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> ---
>  kernel/sched/fair.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 04a3ce20da67..4ef3fa0d5e8d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8455,7 +8455,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  	}
>  
>  	/* Check if dst CPU is idle and preferred to this group */
> -	if (env->sd->flags & SD_ASYM_PACKING &&
> +	if (!local_group && env->sd->flags & SD_ASYM_PACKING &&
>  	    env->idle != CPU_NOT_IDLE &&
>  	    sgs->sum_h_nr_running &&
>  	    sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu)) {
> -- 
> 2.17.1
> 


* Re: [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance
  2021-04-14  2:04 ` [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance Ricardo Neri
@ 2021-05-03  9:52   ` Peter Zijlstra
  2021-05-06  4:28     ` Ricardo Neri
  2021-05-03  9:54   ` Peter Zijlstra
  2021-05-03 10:02   ` Peter Zijlstra
  2 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2021-05-03  9:52 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Len Brown,
	Srinivas Pandruvada, Tim Chen, Aubrey Li, Ravi V. Shankar,
	Ricardo Neri, Quentin Perret, Joel Fernandes (Google),
	linux-kernel, Aubrey Li, Ben Segall, Daniel Bristot de Oliveira

On Tue, Apr 13, 2021 at 07:04:35PM -0700, Ricardo Neri wrote:
> +static bool cpu_group_is_smt(int cpu, struct sched_group *sg)
> +{
> +#ifdef CONFIG_SCHED_SMT
> +	if (!static_branch_likely(&sched_smt_present))
> +		return false;
> +
> +	if (sg->group_weight == 1)
> +		return false;
> +
> +	return cpumask_equal(sched_group_span(sg), cpu_smt_mask(cpu));
> +#else
> +	return false;
> +#endif
> +}
> +
> +/**
> + * asym_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> + * @dst_cpu:	CPU doing the load balancing
> + * @sds:	Load-balancing data with statistics of the local group
> + * @sgs:	Load-balancing statistics of the candidate busiest group
> > + * @sg:		The candidate busiest group
> + *
> + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> + * if @dst_cpu can pull tasks. If @dst_cpu does not have SMT siblings, it can
> + * pull tasks if two or more of the SMT siblings of @sg are busy. If only one
> + * CPU in @sg is busy, pull tasks only if @dst_cpu has higher priority.
> + *
> + * If both @dst_cpu and @sg have SMT siblings. Even the number of idle CPUs
> + * between @sds::local and @sg. Thus, pull tasks from @sg if the difference
> + * between the number of busy CPUs is 2 or more. If the difference is of 1,
> + * only pull if @dst_cpu has higher priority. If @sg does not have SMT siblings
> + * only pull tasks if all of the SMT siblings of @dst_cpu are idle and @sg
> + * has lower priority.
> + */
> +static bool asym_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> +				struct sg_lb_stats *sgs, struct sched_group *sg)
> +{
> +#ifdef CONFIG_SCHED_SMT
> +	int cpu, local_busy_cpus, sg_busy_cpus;
> +	bool local_is_smt, sg_is_smt;
> +
> +	if (!arch_asym_check_smt_siblings())
> +		return true;
> +
> +	cpu = group_first_cpu(sg);
> +	local_is_smt = cpu_group_is_smt(dst_cpu, sds->local);
> +	sg_is_smt = cpu_group_is_smt(cpu, sg);

Would something like this make sense?

---
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8533,21 +8533,6 @@ static inline void update_sg_lb_stats(st
 				sgs->group_capacity;
 }
 
-static bool cpu_group_is_smt(int cpu, struct sched_group *sg)
-{
-#ifdef CONFIG_SCHED_SMT
-	if (!static_branch_likely(&sched_smt_present))
-		return false;
-
-	if (sg->group_weight == 1)
-		return false;
-
-	return cpumask_equal(sched_group_span(sg), cpu_smt_mask(cpu));
-#else
-	return false;
-#endif
-}
-
 /**
  * asym_can_pull_tasks - Check whether the load balancing CPU can pull tasks
  * @dst_cpu:	CPU doing the load balancing
@@ -8578,8 +8563,8 @@ static bool asym_can_pull_tasks(int dst_
 		return true;
 
 	cpu = group_first_cpu(sg);
-	local_is_smt = cpu_group_is_smt(dst_cpu, sds->local);
-	sg_is_smt = cpu_group_is_smt(cpu, sg);
+	local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
+	sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
 
 	sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
 
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1795,6 +1795,7 @@ struct sched_group {
 	unsigned int		group_weight;
 	struct sched_group_capacity *sgc;
 	int			asym_prefer_cpu;	/* CPU of highest priority in group */
+	int			flags;
 
 	/*
 	 * The CPUs this group covers.
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -916,10 +916,12 @@ build_group_from_child_sched_domain(stru
 		return NULL;
 
 	sg_span = sched_group_span(sg);
-	if (sd->child)
+	if (sd->child) {
 		cpumask_copy(sg_span, sched_domain_span(sd->child));
-	else
+		sg->flags = sd->child->flags;
+	} else {
 		cpumask_copy(sg_span, sched_domain_span(sd));
+	}
 
 	atomic_inc(&sg->ref);
 	return sg;
@@ -1169,6 +1171,7 @@ static struct sched_group *get_group(int
 	if (child) {
 		cpumask_copy(sched_group_span(sg), sched_domain_span(child));
 		cpumask_copy(group_balance_mask(sg), sched_group_span(sg));
+		sg->flags = child->flags;
 	} else {
 		cpumask_set_cpu(cpu, sched_group_span(sg));
 		cpumask_set_cpu(cpu, group_balance_mask(sg));


* Re: [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance
  2021-04-14  2:04 ` [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance Ricardo Neri
  2021-05-03  9:52   ` Peter Zijlstra
@ 2021-05-03  9:54   ` Peter Zijlstra
  2021-05-03 10:02   ` Peter Zijlstra
  2 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2021-05-03  9:54 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Len Brown,
	Srinivas Pandruvada, Tim Chen, Aubrey Li, Ravi V. Shankar,
	Ricardo Neri, Quentin Perret, Joel Fernandes (Google),
	linux-kernel, Aubrey Li, Ben Segall, Daniel Bristot de Oliveira

On Tue, Apr 13, 2021 at 07:04:35PM -0700, Ricardo Neri wrote:
> +/**
> + * asym_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> + * @dst_cpu:	CPU doing the load balancing

FWIW, that description isn't correct. dst_cpu does not need to be the
CPU doing the actual balancing. It mostly is, but no guarantees.



* Re: [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance
  2021-04-14  2:04 ` [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance Ricardo Neri
  2021-05-03  9:52   ` Peter Zijlstra
  2021-05-03  9:54   ` Peter Zijlstra
@ 2021-05-03 10:02   ` Peter Zijlstra
  2021-05-03 10:23     ` Peter Zijlstra
  2 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2021-05-03 10:02 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Len Brown,
	Srinivas Pandruvada, Tim Chen, Aubrey Li, Ravi V. Shankar,
	Ricardo Neri, Quentin Perret, Joel Fernandes (Google),
	linux-kernel, Aubrey Li, Ben Segall, Daniel Bristot de Oliveira

On Tue, Apr 13, 2021 at 07:04:35PM -0700, Ricardo Neri wrote:
> @@ -8507,6 +8616,18 @@ static bool update_sd_pick_busiest(struct lb_env *env,
>  	if (!sgs->sum_h_nr_running)
>  		return false;
>  
> +	/*
> +	 * @sg may have been tentatively classified as group_asym_packing.
> +	 * Now that we have sufficient information about @sds.local, reassess
> +	 * if asym packing migration can be done. Reclassify @sg. The only
> +	 * possible results are group_has_spare and group_fully_busy.
> +	 */
> +	if (sgs->group_type == group_asym_packing &&
> +	    !asym_can_pull_tasks(env->dst_cpu, sds, sgs, sg)) {
> +		sgs->group_asym_packing = 0;
> +		sgs->group_type = group_classify(env->sd->imbalance_pct, sg, sgs);
> +	}

So if this really is all about not having sds.local in
update_sd_lb_stats(), then that seems fixable. Let me haz a try.


* Re: [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance
  2021-05-03 10:02   ` Peter Zijlstra
@ 2021-05-03 10:23     ` Peter Zijlstra
  2021-05-04  3:09       ` Ricardo Neri
  2021-05-06  4:26       ` Ricardo Neri
  0 siblings, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2021-05-03 10:23 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Len Brown,
	Srinivas Pandruvada, Tim Chen, Aubrey Li, Ravi V. Shankar,
	Ricardo Neri, Quentin Perret, Joel Fernandes (Google),
	linux-kernel, Aubrey Li, Ben Segall, Daniel Bristot de Oliveira

On Mon, May 03, 2021 at 12:02:49PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 13, 2021 at 07:04:35PM -0700, Ricardo Neri wrote:
> > @@ -8507,6 +8616,18 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> >  	if (!sgs->sum_h_nr_running)
> >  		return false;
> >  
> > +	/*
> > +	 * @sg may have been tentatively classified as group_asym_packing.
> > +	 * Now that we have sufficient information about @sds.local, reassess
> > +	 * if asym packing migration can be done. Reclassify @sg. The only
> > +	 * possible results are group_has_spare and group_fully_busy.
> > +	 */
> > +	if (sgs->group_type == group_asym_packing &&
> > +	    !asym_can_pull_tasks(env->dst_cpu, sds, sgs, sg)) {
> > +		sgs->group_asym_packing = 0;
> > +		sgs->group_type = group_classify(env->sd->imbalance_pct, sg, sgs);
> > +	}
> 
> So if this really is all about not having sds.local in
> update_sd_lb_stats(), then that seems fixable. Let me haz a try.

How's this then?

---
 kernel/sched/fair.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3bdc41f22909..e9dcbee5b3d9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8437,6 +8437,21 @@ group_type group_classify(unsigned int imbalance_pct,
 	return group_has_spare;
 }
 
+static inline bool
+sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sched_group *group)
+{
+	/*
+	 * Because sd->groups starts with the local group, anything that isn't
+	 * the local group will have access to the local state.
+	 */
+	if (group == sds->local)
+		return false;
+
+	/* XXX do magic here */
+
+	return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
+}
+
 /**
  * update_sg_lb_stats - Update sched_group's statistics for load balancing.
  * @env: The load balancing environment.
@@ -8445,6 +8460,7 @@ group_type group_classify(unsigned int imbalance_pct,
  * @sg_status: Holds flag indicating the status of the sched_group
  */
 static inline void update_sg_lb_stats(struct lb_env *env,
+				      struct sd_lb_stats *sds,
 				      struct sched_group *group,
 				      struct sg_lb_stats *sgs,
 				      int *sg_status)
@@ -8453,7 +8469,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 
 	memset(sgs, 0, sizeof(*sgs));
 
-	local_group = cpumask_test_cpu(env->dst_cpu, sched_group_span(group));
+	local_group = group == sds->local;
 
 	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
 		struct rq *rq = cpu_rq(i);
@@ -8498,9 +8514,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 
 	/* Check if dst CPU is idle and preferred to this group */
 	if (env->sd->flags & SD_ASYM_PACKING &&
-	    env->idle != CPU_NOT_IDLE &&
-	    sgs->sum_h_nr_running &&
-	    sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu)) {
+	    env->idle != CPU_NOT_IDLE && sgs->sum_h_nr_running &&
+	    sched_asym(env, sds, group)) {
 		sgs->group_asym_packing = 1;
 	}
 
@@ -9016,7 +9031,7 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 				update_group_capacity(env->sd, env->dst_cpu);
 		}
 
-		update_sg_lb_stats(env, sg, sgs, &sg_status);
+		update_sg_lb_stats(env, sds, sg, sgs, &sg_status);
 
 		if (local_group)
 			goto next_group;


* Re: [PATCH v2 1/4] sched/fair: Optimize checking for group_asym_packing
  2021-05-03  9:24   ` Peter Zijlstra
@ 2021-05-04  3:08     ` Ricardo Neri
  0 siblings, 0 replies; 16+ messages in thread
From: Ricardo Neri @ 2021-05-04  3:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Len Brown,
	Srinivas Pandruvada, Tim Chen, Aubrey Li, Ravi V. Shankar,
	Ricardo Neri, Quentin Perret, Joel Fernandes (Google),
	linux-kernel, Aubrey Li, Ben Segall, Daniel Bristot de Oliveira

On Mon, May 03, 2021 at 11:24:25AM +0200, Peter Zijlstra wrote:
> On Tue, Apr 13, 2021 at 07:04:33PM -0700, Ricardo Neri wrote:
> > By checking local_group, we can avoid additional checks and invoking
> > sched_asym_prefer() when it is not needed.
> 
> This really could do with a few words on *why* that is correct. ISTR
> thinking this made sense when I last looked at it, but today, after
> having the brain reset by not looking at a computer for 4 days its not
> immediate obvious anymore.

Thanks for your feedback Peter! I will add a comment explaining why this
is correct: when we are collecting statistics of the local group,
@env->dst_cpu belongs to @group. @env->dst_cpu may or may not be
@group->asym_prefer_cpu. If @env->dst_cpu is @group->asym_prefer_cpu,
sched_asym_prefer() must return false because it would be checking for

    arch_asym_cpu_priority(dst_cpu) > arch_asym_cpu_priority(dst_cpu)

which is false. If @env->dst_cpu is not @group->asym_prefer_cpu, it
implies that the former has lower priority than the latter and
sched_asym_prefer() will also return false.
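
For reference, sched_asym_prefer() amounts to a strict priority comparison,
roughly as below (a paraphrase, not the exact kernel source), which is why
comparing dst_cpu against itself can never return true:

/* Paraphrase of the helper discussed above; a strict "greater than" check. */
extern int arch_asym_cpu_priority(int cpu);

static inline bool sched_asym_prefer(int a, int b)
{
	return arch_asym_cpu_priority(a) > arch_asym_cpu_priority(b);
}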

Thanks and BR,
Ricardo

> 
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > ---
> >  kernel/sched/fair.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 04a3ce20da67..4ef3fa0d5e8d 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -8455,7 +8455,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> >  	}
> >  
> >  	/* Check if dst CPU is idle and preferred to this group */
> > -	if (env->sd->flags & SD_ASYM_PACKING &&
> > +	if (!local_group && env->sd->flags & SD_ASYM_PACKING &&
> >  	    env->idle != CPU_NOT_IDLE &&
> >  	    sgs->sum_h_nr_running &&
> >  	    sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu)) {
> > -- 
> > 2.17.1
> > 


* Re: [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance
  2021-05-03 10:23     ` Peter Zijlstra
@ 2021-05-04  3:09       ` Ricardo Neri
  2021-05-06  4:26       ` Ricardo Neri
  1 sibling, 0 replies; 16+ messages in thread
From: Ricardo Neri @ 2021-05-04  3:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Len Brown,
	Srinivas Pandruvada, Tim Chen, Aubrey Li, Ravi V. Shankar,
	Ricardo Neri, Quentin Perret, Joel Fernandes (Google),
	linux-kernel, Aubrey Li, Ben Segall, Daniel Bristot de Oliveira

On Mon, May 03, 2021 at 12:23:36PM +0200, Peter Zijlstra wrote:
> On Mon, May 03, 2021 at 12:02:49PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 13, 2021 at 07:04:35PM -0700, Ricardo Neri wrote:
> > > @@ -8507,6 +8616,18 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> > >  	if (!sgs->sum_h_nr_running)
> > >  		return false;
> > >  
> > > +	/*
> > > +	 * @sg may have been tentatively classified as group_asym_packing.
> > > +	 * Now that we have sufficient information about @sds.local, reassess
> > > +	 * if asym packing migration can be done. Reclassify @sg. The only
> > > +	 * possible results are group_has_spare and group_fully_busy.
> > > +	 */
> > > +	if (sgs->group_type == group_asym_packing &&
> > > +	    !asym_can_pull_tasks(env->dst_cpu, sds, sgs, sg)) {
> > > +		sgs->group_asym_packing = 0;
> > > +		sgs->group_type = group_classify(env->sd->imbalance_pct, sg, sgs);
> > > +	}
> > 
> > So if this really is all about not having sds.local in
> > update_sd_lb_stats(), then that seems fixable. Let me haz a try.
> 
> How's this then?
> 
> ---
>  kernel/sched/fair.c | 25 ++++++++++++++++++++-----
>  1 file changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3bdc41f22909..e9dcbee5b3d9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8437,6 +8437,21 @@ group_type group_classify(unsigned int imbalance_pct,
>  	return group_has_spare;
>  }
>  
> +static inline bool
> +sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sched_group *group)
> +{
> +	/*
> +	 * Because sd->groups starts with the local group, anything that isn't
> +	 * the local group will have access to the local state.
> +	 */
> +	if (group == sds->local)
> +		return false;
> +
> +	/* XXX do magic here */
> +
> +	return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
> +}
> +
>  /**
>   * update_sg_lb_stats - Update sched_group's statistics for load balancing.
>   * @env: The load balancing environment.
> @@ -8445,6 +8460,7 @@ group_type group_classify(unsigned int imbalance_pct,
>   * @sg_status: Holds flag indicating the status of the sched_group
>   */
>  static inline void update_sg_lb_stats(struct lb_env *env,
> +				      struct sd_lb_stats *sds,
>  				      struct sched_group *group,
>  				      struct sg_lb_stats *sgs,
>  				      int *sg_status)
> @@ -8453,7 +8469,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  
>  	memset(sgs, 0, sizeof(*sgs));
>  
> -	local_group = cpumask_test_cpu(env->dst_cpu, sched_group_span(group));
> +	local_group = group == sds->local;
>  
>  	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
>  		struct rq *rq = cpu_rq(i);
> @@ -8498,9 +8514,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  
>  	/* Check if dst CPU is idle and preferred to this group */
>  	if (env->sd->flags & SD_ASYM_PACKING &&
> -	    env->idle != CPU_NOT_IDLE &&
> -	    sgs->sum_h_nr_running &&
> -	    sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu)) {
> +	    env->idle != CPU_NOT_IDLE && sgs->sum_h_nr_running &&
> +	    sched_asym(env, sds, group)) {
>  		sgs->group_asym_packing = 1;
>  	}
>  
> @@ -9016,7 +9031,7 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
>  				update_group_capacity(env->sd, env->dst_cpu);
>  		}
>  
> -		update_sg_lb_stats(env, sg, sgs, &sg_status);
> +		update_sg_lb_stats(env, sds, sg, sgs, &sg_status);
>  
>  		if (local_group)
>  			goto next_group;

Thanks for the code Peter! Let me give this a try and I will report back
to you.

BR,
Ricardo


* Re: [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance
  2021-05-03 10:23     ` Peter Zijlstra
  2021-05-04  3:09       ` Ricardo Neri
@ 2021-05-06  4:26       ` Ricardo Neri
  2021-05-06  9:18         ` Peter Zijlstra
  1 sibling, 1 reply; 16+ messages in thread
From: Ricardo Neri @ 2021-05-06  4:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Len Brown,
	Srinivas Pandruvada, Tim Chen, Aubrey Li, Ravi V. Shankar,
	Ricardo Neri, Quentin Perret, Joel Fernandes (Google),
	linux-kernel, Aubrey Li, Ben Segall, Daniel Bristot de Oliveira

On Mon, May 03, 2021 at 12:23:36PM +0200, Peter Zijlstra wrote:
> On Mon, May 03, 2021 at 12:02:49PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 13, 2021 at 07:04:35PM -0700, Ricardo Neri wrote:
> > > @@ -8507,6 +8616,18 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> > >  	if (!sgs->sum_h_nr_running)
> > >  		return false;
> > >  
> > > +	/*
> > > +	 * @sg may have been tentatively classified as group_asym_packing.
> > > +	 * Now that we have sufficient information about @sds.local, reassess
> > > +	 * if asym packing migration can be done. Reclassify @sg. The only
> > > +	 * possible results are group_has_spare and group_fully_busy.
> > > +	 */
> > > +	if (sgs->group_type == group_asym_packing &&
> > > +	    !asym_can_pull_tasks(env->dst_cpu, sds, sgs, sg)) {
> > > +		sgs->group_asym_packing = 0;
> > > +		sgs->group_type = group_classify(env->sd->imbalance_pct, sg, sgs);
> > > +	}
> > 
> > So if this really is all about not having sds.local in
> > update_sd_lb_stats(), then that seems fixable. Let me haz a try.
> 
> How's this then?
> 
> ---
>  kernel/sched/fair.c | 25 ++++++++++++++++++++-----
>  1 file changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3bdc41f22909..e9dcbee5b3d9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8437,6 +8437,21 @@ group_type group_classify(unsigned int imbalance_pct,
>  	return group_has_spare;
>  }
>  
> +static inline bool
> +sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sched_group *group)

Thank you Peter for this code! It worked well. I had to make a couple of
tweaks. My proposed asym_can_pull_tasks() also needs the statistics of
@group. Thus, I added them to the arguments of sched_asym(). Also...

> +{
> +	/*
> +	 * Because sd->groups starts with the local group, anything that isn't
> +	 * the local group will have access to the local state.
> +	 */
> +	if (group == sds->local)
> +		return false;
> +
> +	/* XXX do magic here */
> +
> +	return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
> +}
> +
>  /**
>   * update_sg_lb_stats - Update sched_group's statistics for load balancing.
>   * @env: The load balancing environment.
> @@ -8445,6 +8460,7 @@ group_type group_classify(unsigned int imbalance_pct,
>   * @sg_status: Holds flag indicating the status of the sched_group
>   */
>  static inline void update_sg_lb_stats(struct lb_env *env,
> +				      struct sd_lb_stats *sds,
>  				      struct sched_group *group,
>  				      struct sg_lb_stats *sgs,
>  				      int *sg_status)
> @@ -8453,7 +8469,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  
>  	memset(sgs, 0, sizeof(*sgs));
>  
> -	local_group = cpumask_test_cpu(env->dst_cpu, sched_group_span(group));
> +	local_group = group == sds->local;
>  
>  	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
>  		struct rq *rq = cpu_rq(i);
> @@ -8498,9 +8514,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  
>  	/* Check if dst CPU is idle and preferred to this group */
>  	if (env->sd->flags & SD_ASYM_PACKING &&
> -	    env->idle != CPU_NOT_IDLE &&
> -	    sgs->sum_h_nr_running &&
> -	    sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu)) {
> +	    env->idle != CPU_NOT_IDLE && sgs->sum_h_nr_running &&
> +	    sched_asym(env, sds, group)) {
>  		sgs->group_asym_packing = 1;

... I moved this code to be executed after computing sgs->weight as
asym_can_pull_tasks() needs this datum as well.

May I add your Co-developed-by and Signed-off-by tags to a patch with these
changes in my v3 posting?

BR,
Ricardo


* Re: [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance
  2021-05-03  9:52   ` Peter Zijlstra
@ 2021-05-06  4:28     ` Ricardo Neri
  2021-05-06  9:17       ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Ricardo Neri @ 2021-05-06  4:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Len Brown,
	Srinivas Pandruvada, Tim Chen, Aubrey Li, Ravi V. Shankar,
	Ricardo Neri, Quentin Perret, Joel Fernandes (Google),
	linux-kernel, Aubrey Li, Ben Segall, Daniel Bristot de Oliveira

On Mon, May 03, 2021 at 11:52:25AM +0200, Peter Zijlstra wrote:
> On Tue, Apr 13, 2021 at 07:04:35PM -0700, Ricardo Neri wrote:
> > +static bool cpu_group_is_smt(int cpu, struct sched_group *sg)
> > +{
> > +#ifdef CONFIG_SCHED_SMT
> > +	if (!static_branch_likely(&sched_smt_present))
> > +		return false;
> > +
> > +	if (sg->group_weight == 1)
> > +		return false;
> > +
> > +	return cpumask_equal(sched_group_span(sg), cpu_smt_mask(cpu));
> > +#else
> > +	return false;
> > +#endif
> > +}
> > +
> > +/**
> > + * asym_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> > + * @dst_cpu:	CPU doing the load balancing
> > + * @sds:	Load-balancing data with statistics of the local group
> > + * @sgs:	Load-balancing statistics of the candidate busiest group
> > > + * @sg:		The candidate busiest group
> > + *
> > + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> > + * if @dst_cpu can pull tasks. If @dst_cpu does not have SMT siblings, it can
> > + * pull tasks if two or more of the SMT siblings of @sg are busy. If only one
> > + * CPU in @sg is busy, pull tasks only if @dst_cpu has higher priority.
> > + *
> > + * If both @dst_cpu and @sg have SMT siblings. Even the number of idle CPUs
> > + * between @sds::local and @sg. Thus, pull tasks from @sg if the difference
> > + * between the number of busy CPUs is 2 or more. If the difference is of 1,
> > + * only pull if @dst_cpu has higher priority. If @sg does not have SMT siblings
> > + * only pull tasks if all of the SMT siblings of @dst_cpu are idle and @sg
> > + * has lower priority.
> > + */
> > +static bool asym_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> > +				struct sg_lb_stats *sgs, struct sched_group *sg)
> > +{
> > +#ifdef CONFIG_SCHED_SMT
> > +	int cpu, local_busy_cpus, sg_busy_cpus;
> > +	bool local_is_smt, sg_is_smt;
> > +
> > +	if (!arch_asym_check_smt_siblings())
> > +		return true;
> > +
> > +	cpu = group_first_cpu(sg);
> > +	local_is_smt = cpu_group_is_smt(dst_cpu, sds->local);
> > +	sg_is_smt = cpu_group_is_smt(cpu, sg);
> 
> Would something like this make sense?
> 
> ---
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8533,21 +8533,6 @@ static inline void update_sg_lb_stats(st
>  				sgs->group_capacity;
>  }
>  
> -static bool cpu_group_is_smt(int cpu, struct sched_group *sg)
> -{
> -#ifdef CONFIG_SCHED_SMT
> -	if (!static_branch_likely(&sched_smt_present))
> -		return false;
> -
> -	if (sg->group_weight == 1)
> -		return false;
> -
> -	return cpumask_equal(sched_group_span(sg), cpu_smt_mask(cpu));
> -#else
> -	return false;
> -#endif
> -}
> -
>  /**
>   * asym_can_pull_tasks - Check whether the load balancing CPU can pull tasks
>   * @dst_cpu:	CPU doing the load balancing
> @@ -8578,8 +8563,8 @@ static bool asym_can_pull_tasks(int dst_
>  		return true;
>  
>  	cpu = group_first_cpu(sg);
> -	local_is_smt = cpu_group_is_smt(dst_cpu, sds->local);
> -	sg_is_smt = cpu_group_is_smt(cpu, sg);
> +	local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> +	sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
>  
>  	sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
>  
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1795,6 +1795,7 @@ struct sched_group {
>  	unsigned int		group_weight;
>  	struct sched_group_capacity *sgc;
>  	int			asym_prefer_cpu;	/* CPU of highest priority in group */
> +	int			flags;
>  
>  	/*
>  	 * The CPUs this group covers.
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -916,10 +916,12 @@ build_group_from_child_sched_domain(stru
>  		return NULL;
>  
>  	sg_span = sched_group_span(sg);
> -	if (sd->child)
> +	if (sd->child) {
>  		cpumask_copy(sg_span, sched_domain_span(sd->child));
> -	else
> +		sg->flags = sd->child->flags;
> +	} else {
>  		cpumask_copy(sg_span, sched_domain_span(sd));
> +	}
>  
>  	atomic_inc(&sg->ref);
>  	return sg;
> @@ -1169,6 +1171,7 @@ static struct sched_group *get_group(int
>  	if (child) {
>  		cpumask_copy(sched_group_span(sg), sched_domain_span(child));
>  		cpumask_copy(group_balance_mask(sg), sched_group_span(sg));
> +		sg->flags = child->flags;
>  	} else {
>  		cpumask_set_cpu(cpu, sched_group_span(sg));
>  		cpumask_set_cpu(cpu, group_balance_mask(sg));

Thank you Peter! This code worked well and it looks better than what I
proposed. May I add your Originally-by: and Signed-off-by: tags in a
patch when I post v3?

BR,
Ricardo


* Re: [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance
  2021-05-06  4:28     ` Ricardo Neri
@ 2021-05-06  9:17       ` Peter Zijlstra
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2021-05-06  9:17 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Len Brown,
	Srinivas Pandruvada, Tim Chen, Aubrey Li, Ravi V. Shankar,
	Ricardo Neri, Quentin Perret, Joel Fernandes (Google),
	linux-kernel, Aubrey Li, Ben Segall, Daniel Bristot de Oliveira

On Wed, May 05, 2021 at 09:28:44PM -0700, Ricardo Neri wrote:
> Thank you Peter! This code worked well and it looks better than what I
> proposed. May I add your Originally-by: and Signed-off-by: tags in a
> patch when I post v3?

Sure


* Re: [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance
  2021-05-06  4:26       ` Ricardo Neri
@ 2021-05-06  9:18         ` Peter Zijlstra
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2021-05-06  9:18 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Len Brown,
	Srinivas Pandruvada, Tim Chen, Aubrey Li, Ravi V. Shankar,
	Ricardo Neri, Quentin Perret, Joel Fernandes (Google),
	linux-kernel, Aubrey Li, Ben Segall, Daniel Bristot de Oliveira

On Wed, May 05, 2021 at 09:26:27PM -0700, Ricardo Neri wrote:
> May I add your Co-developed-by and Signed-off-by tags to a patch with these
> changes in my v3 posting?

Sure


end of thread

Thread overview: 16 messages
2021-04-14  2:04 [PATCH v2 0/4] sched/fair: Fix load balancing of SMT siblings with ASYM_PACKING Ricardo Neri
2021-04-14  2:04 ` [PATCH v2 1/4] sched/fair: Optimize checking for group_asym_packing Ricardo Neri
2021-05-03  9:24   ` Peter Zijlstra
2021-05-04  3:08     ` Ricardo Neri
2021-04-14  2:04 ` [PATCH v2 2/4] sched/fair: Introduce arch_sched_asym_prefer_early() Ricardo Neri
2021-04-14  2:04 ` [PATCH v2 3/4] sched/fair: Consider SMT in ASYM_PACKING load balance Ricardo Neri
2021-05-03  9:52   ` Peter Zijlstra
2021-05-06  4:28     ` Ricardo Neri
2021-05-06  9:17       ` Peter Zijlstra
2021-05-03  9:54   ` Peter Zijlstra
2021-05-03 10:02   ` Peter Zijlstra
2021-05-03 10:23     ` Peter Zijlstra
2021-05-04  3:09       ` Ricardo Neri
2021-05-06  4:26       ` Ricardo Neri
2021-05-06  9:18         ` Peter Zijlstra
2021-04-14  2:04 ` [PATCH v2 4/4] x86/sched: Enable checks of the state of SMT siblings in load balancing Ricardo Neri
