* [PATCH v4 0/4] Scan for an idle sibling in a single pass
@ 2021-01-25  8:59 Mel Gorman
  2021-01-25  8:59 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Mel Gorman @ 2021-01-25  8:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman

Changelog since v3
o Drop scanning based on cores, SMT4 results showed problems

Changelog since v2
o Remove unnecessary parameters
o Update nr during scan only when scanning for cpus

Changelog since v1
o Move extern declaration to header for coding style
o Remove unnecessary parameter from __select_idle_cpu

This series of four patches reposts three patches from Peter entitled
"select_idle_sibling() wreckage". The end result is that the runqueues
are scanned only once when searching for an idle sibling.
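
At a high level, the wakeup fast path changes roughly as follows (an
illustrative sketch of the call structure, not the exact code):

	/* Before: up to three separate scans of the LLC */
	select_idle_sibling()
		-> select_idle_core()	/* look for a fully idle core */
		-> select_idle_cpu()	/* look for any idle CPU */
		-> select_idle_smt()	/* look at the target's SMT siblings */

	/* After: a single scan */
	select_idle_sibling()
		-> select_idle_cpu()	/* scans a core at a time while idle
					   cores may exist, remembering an
					   idle CPU as a fallback */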

Three patches from Peter were dropped. The first patch altered how scan
depth was calculated. Scan depth selection is effectively a random number
generator with two major limitations. First, the avg_idle time is based
on the time between a CPU going idle and being woken up, clamped
approximately by 2*sysctl_sched_migration_cost; this is difficult to
compare in a sensible fashion to avg_scan_cost. Second, only the
avg_scan_cost of scan failures is recorded and it does not decay. Fixing
this requires deeper surgery that would justify a patch on its own,
although Peter notes that
https://lkml.kernel.org/r/20180530143105.977759909@infradead.org is
potentially useful for an alternative avg_idle metric.
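
For reference, avg_idle is maintained at wakeup time roughly as follows
(simplified from ttwu_do_wakeup() and update_avg(); the clamp is against
rq->max_idle_balance_cost, which is initialised from
sysctl_sched_migration_cost, hence "approximately" above):

	if (rq->idle_stamp) {
		u64 delta = rq_clock(rq) - rq->idle_stamp;
		u64 max = 2*rq->max_idle_balance_cost;

		update_avg(&rq->avg_idle, delta);	/* *avg += (delta - *avg) / 8 */
		if (rq->avg_idle > max)
			rq->avg_idle = max;
		rq->idle_stamp = 0;
	}

Because of the clamp, avg_idle saturates quickly on anything but a very
idle machine, which is part of why it is hard to compare against
avg_scan_cost in a meaningful way.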

The second patch that was dropped scanned based on cores instead of CPUs,
rationalising the difference between core scanning and CPU scanning.
Unfortunately, Vincent reported problems with SMT4, so it is dropped
for now until depth searching can be fixed.

The third patch that was dropped converted the idle core scan throttling
mechanism to SIS_PROP. While this would unify the throttling of core and
CPU scanning, it was not free of regressions, and has_idle_cores is a
fairly effective throttling mechanism, with the caveat that it can have
a lot of false positives for workloads like hackbench.
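
For reference, has_idle_cores is a single flag in the LLC's shared
sched_domain state; the helpers look roughly like this (paraphrased from
kernel/sched/fair.c):

	static inline void set_idle_cores(int cpu, int val)
	{
		struct sched_domain_shared *sds;

		sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
		if (sds)
			WRITE_ONCE(sds->has_idle_cores, val);
	}

	static inline bool test_idle_cores(int cpu, bool def)
	{
		struct sched_domain_shared *sds;

		sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
		if (sds)
			return READ_ONCE(sds->has_idle_cores);

		return def;
	}

The false positives arise because the flag is only cleared when a full
scan fails to find an idle core, so a rapidly context-switching workload
can leave it set long after the last idle core has disappeared.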

Peter's series tried to solve three problems at once; this subset
addresses one of them. Across a range of workloads and machines, the
series had some losses but won more than it lost.
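
For context, the SIS_PROP depth calculation that remains after this
series is roughly nr = span_weight * (avg_idle / 512) / (avg_scan_cost + 1)
with a floor of 4 (see patch 1). With illustrative, made-up numbers on a
40-CPU LLC with avg_scan_cost of 400ns: an avg_idle of 200us gives
span_avg = 40 * 390 = 15600, so nr = 15600 / 401 ~= 38 and almost the
whole domain may be scanned; an avg_idle of 10us gives span_avg = 760,
which is below 4 * avg_cost = 1604, so the scan is throttled to nr = 4.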

 kernel/sched/fair.c     | 153 +++++++++++++++++++---------------------
 kernel/sched/features.h |   1 -
 2 files changed, 72 insertions(+), 82 deletions(-)

-- 
2.26.2



* [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU
  2021-01-25  8:59 [PATCH v4 0/4] Scan for an idle sibling in a single pass Mel Gorman
@ 2021-01-25  8:59 ` Mel Gorman
  2021-01-25  8:59 ` [PATCH 2/4] sched/fair: Move avg_scan_cost calculations under SIS_PROP Mel Gorman
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Mel Gorman @ 2021-01-25  8:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman

SIS_AVG_CPU was introduced as a means of avoiding a search when the
average search cost indicated that the search would likely fail. It was
a blunt instrument and disabled by commit 4c77b18cf8b7 ("sched/fair: Make
select_idle_cpu() more aggressive") and later replaced with a proportional
search depth by commit 1ad3aaf3fcd2 ("sched/core: Implement new approach
to scale select_idle_cpu()").

While there are corner cases where SIS_AVG_CPU is better, it has now been
disabled for almost three years. As the intent of SIS_PROP is to reduce
the time complexity of select_idle_cpu(), let's drop SIS_AVG_CPU and focus
on SIS_PROP as a throttling mechanism.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c     | 20 +++++++++-----------
 kernel/sched/features.h |  1 -
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 04a3ce20da67..9f5682aeda2e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6145,7 +6145,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 {
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
 	struct sched_domain *this_sd;
-	u64 avg_cost, avg_idle;
 	u64 time;
 	int this = smp_processor_id();
 	int cpu, nr = INT_MAX;
@@ -6154,18 +6153,17 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	if (!this_sd)
 		return -1;
 
-	/*
-	 * Due to large variance we need a large fuzz factor; hackbench in
-	 * particularly is sensitive here.
-	 */
-	avg_idle = this_rq()->avg_idle / 512;
-	avg_cost = this_sd->avg_scan_cost + 1;
+	if (sched_feat(SIS_PROP)) {
+		u64 avg_cost, avg_idle, span_avg;
 
-	if (sched_feat(SIS_AVG_CPU) && avg_idle < avg_cost)
-		return -1;
+		/*
+		 * Due to large variance we need a large fuzz factor;
+		 * hackbench in particularly is sensitive here.
+		 */
+		avg_idle = this_rq()->avg_idle / 512;
+		avg_cost = this_sd->avg_scan_cost + 1;
 
-	if (sched_feat(SIS_PROP)) {
-		u64 span_avg = sd->span_weight * avg_idle;
+		span_avg = sd->span_weight * avg_idle;
 		if (span_avg > 4*avg_cost)
 			nr = div_u64(span_avg, avg_cost);
 		else
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index 68d369cba9e4..e875eabb6600 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -54,7 +54,6 @@ SCHED_FEAT(TTWU_QUEUE, true)
 /*
  * When doing wakeups, attempt to limit superfluous scans of the LLC domain.
  */
-SCHED_FEAT(SIS_AVG_CPU, false)
 SCHED_FEAT(SIS_PROP, true)
 
 /*
-- 
2.26.2



* [PATCH 2/4] sched/fair: Move avg_scan_cost calculations under SIS_PROP
  2021-01-25  8:59 [PATCH v4 0/4] Scan for an idle sibling in a single pass Mel Gorman
  2021-01-25  8:59 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman
@ 2021-01-25  8:59 ` Mel Gorman
  2021-01-27 10:39   ` Vincent Guittot
  2021-01-25  8:59 ` [PATCH 3/4] sched/fair: Remove select_idle_smt() Mel Gorman
  2021-01-25  8:59 ` [PATCH 4/4] sched/fair: Merge select_idle_core/cpu() Mel Gorman
  3 siblings, 1 reply; 14+ messages in thread
From: Mel Gorman @ 2021-01-25  8:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman

As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
check and while we are at it, exclude the cost of initialising the CPU
mask from the average scan cost.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9f5682aeda2e..c8d8e185cf3b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6153,6 +6153,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	if (!this_sd)
 		return -1;
 
+	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+
 	if (sched_feat(SIS_PROP)) {
 		u64 avg_cost, avg_idle, span_avg;
 
@@ -6168,11 +6170,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 			nr = div_u64(span_avg, avg_cost);
 		else
 			nr = 4;
-	}
-
-	time = cpu_clock(this);
 
-	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+		time = cpu_clock(this);
+	}
 
 	for_each_cpu_wrap(cpu, cpus, target) {
 		if (!--nr)
@@ -6181,8 +6181,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 			break;
 	}
 
-	time = cpu_clock(this) - time;
-	update_avg(&this_sd->avg_scan_cost, time);
+	if (sched_feat(SIS_PROP)) {
+		time = cpu_clock(this) - time;
+		update_avg(&this_sd->avg_scan_cost, time);
+	}
 
 	return cpu;
 }
-- 
2.26.2



* [PATCH 3/4] sched/fair: Remove select_idle_smt()
  2021-01-25  8:59 [PATCH v4 0/4] Scan for an idle sibling in a single pass Mel Gorman
  2021-01-25  8:59 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman
  2021-01-25  8:59 ` [PATCH 2/4] sched/fair: Move avg_scan_cost calculations under SIS_PROP Mel Gorman
@ 2021-01-25  8:59 ` Mel Gorman
  2021-01-27 10:39   ` Vincent Guittot
  2021-02-17 13:17   ` [tip: sched/core] " tip-bot2 for Mel Gorman
  2021-01-25  8:59 ` [PATCH 4/4] sched/fair: Merge select_idle_core/cpu() Mel Gorman
  3 siblings, 2 replies; 14+ messages in thread
From: Mel Gorman @ 2021-01-25  8:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman

From: Peter Zijlstra (Intel) <peterz@infradead.org>

In order to make the next patch more readable, and to quantify the
actual effectiveness of this pass, start by removing it.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c | 30 ------------------------------
 1 file changed, 30 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c8d8e185cf3b..fe587350ea14 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6101,27 +6101,6 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
 	return -1;
 }
 
-/*
- * Scan the local SMT mask for idle CPUs.
- */
-static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
-{
-	int cpu;
-
-	if (!static_branch_likely(&sched_smt_present))
-		return -1;
-
-	for_each_cpu(cpu, cpu_smt_mask(target)) {
-		if (!cpumask_test_cpu(cpu, p->cpus_ptr) ||
-		    !cpumask_test_cpu(cpu, sched_domain_span(sd)))
-			continue;
-		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
-			return cpu;
-	}
-
-	return -1;
-}
-
 #else /* CONFIG_SCHED_SMT */
 
 static inline int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
@@ -6129,11 +6108,6 @@ static inline int select_idle_core(struct task_struct *p, struct sched_domain *s
 	return -1;
 }
 
-static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
-{
-	return -1;
-}
-
 #endif /* CONFIG_SCHED_SMT */
 
 /*
@@ -6323,10 +6297,6 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;
 
-	i = select_idle_smt(p, sd, target);
-	if ((unsigned)i < nr_cpumask_bits)
-		return i;
-
 	return target;
 }
 
-- 
2.26.2



* [PATCH 4/4] sched/fair: Merge select_idle_core/cpu()
  2021-01-25  8:59 [PATCH v4 0/4] Scan for an idle sibling in a single pass Mel Gorman
                   ` (2 preceding siblings ...)
  2021-01-25  8:59 ` [PATCH 3/4] sched/fair: Remove select_idle_smt() Mel Gorman
@ 2021-01-25  8:59 ` Mel Gorman
  2021-01-27 10:43   ` Vincent Guittot
  3 siblings, 1 reply; 14+ messages in thread
From: Mel Gorman @ 2021-01-25  8:59 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman

From: Peter Zijlstra (Intel) <peterz@infradead.org>

Both select_idle_core() and select_idle_cpu() do a loop over the same
cpumask. Observe that by clearing the already visited CPUs, we can
fold the iteration and iterate a core at a time.

All we need to do is remember any non-idle CPU we encountered while
scanning for an idle core. This way we'll only iterate every CPU once.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 kernel/sched/fair.c | 101 ++++++++++++++++++++++++++------------------
 1 file changed, 61 insertions(+), 40 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fe587350ea14..52a650aa2108 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6006,6 +6006,14 @@ static inline int find_idlest_cpu(struct sched_domain *sd, struct task_struct *p
 	return new_cpu;
 }
 
+static inline int __select_idle_cpu(int cpu)
+{
+	if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
+		return cpu;
+
+	return -1;
+}
+
 #ifdef CONFIG_SCHED_SMT
 DEFINE_STATIC_KEY_FALSE(sched_smt_present);
 EXPORT_SYMBOL_GPL(sched_smt_present);
@@ -6064,48 +6072,51 @@ void __update_idle_core(struct rq *rq)
  * there are no idle cores left in the system; tracked through
  * sd_llc->shared->has_idle_cores and enabled through update_idle_core() above.
  */
-static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
+static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpus, int *idle_cpu)
 {
-	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
-	int core, cpu;
+	bool idle = true;
+	int cpu;
 
 	if (!static_branch_likely(&sched_smt_present))
-		return -1;
-
-	if (!test_idle_cores(target, false))
-		return -1;
-
-	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+		return __select_idle_cpu(core);
 
-	for_each_cpu_wrap(core, cpus, target) {
-		bool idle = true;
-
-		for_each_cpu(cpu, cpu_smt_mask(core)) {
-			if (!available_idle_cpu(cpu)) {
-				idle = false;
-				break;
+	for_each_cpu(cpu, cpu_smt_mask(core)) {
+		if (!available_idle_cpu(cpu)) {
+			idle = false;
+			if (*idle_cpu == -1) {
+				if (sched_idle_cpu(cpu) && cpumask_test_cpu(cpu, p->cpus_ptr)) {
+					*idle_cpu = cpu;
+					break;
+				}
+				continue;
 			}
+			break;
 		}
-
-		if (idle)
-			return core;
-
-		cpumask_andnot(cpus, cpus, cpu_smt_mask(core));
+		if (*idle_cpu == -1 && cpumask_test_cpu(cpu, p->cpus_ptr))
+			*idle_cpu = cpu;
 	}
 
-	/*
-	 * Failed to find an idle core; stop looking for one.
-	 */
-	set_idle_cores(target, 0);
+	if (idle)
+		return core;
 
+	cpumask_andnot(cpus, cpus, cpu_smt_mask(core));
 	return -1;
 }
 
 #else /* CONFIG_SCHED_SMT */
 
-static inline int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
+static inline void set_idle_cores(int cpu, int val)
 {
-	return -1;
+}
+
+static inline bool test_idle_cores(int cpu, bool def)
+{
+	return def;
+}
+
+static inline int select_idle_core(struct task_struct *p, int core, struct cpumask *cpus, int *idle_cpu)
+{
+	return __select_idle_cpu(core);
 }
 
 #endif /* CONFIG_SCHED_SMT */
@@ -6118,10 +6129,11 @@ static inline int select_idle_core(struct task_struct *p, struct sched_domain *s
 static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
 {
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
+	int i, cpu, idle_cpu = -1, nr = INT_MAX;
+	bool smt = test_idle_cores(target, false);
+	int this = smp_processor_id();
 	struct sched_domain *this_sd;
 	u64 time;
-	int this = smp_processor_id();
-	int cpu, nr = INT_MAX;
 
 	this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
 	if (!this_sd)
@@ -6129,7 +6141,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 
 	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
 
-	if (sched_feat(SIS_PROP)) {
+	if (sched_feat(SIS_PROP) && !smt) {
 		u64 avg_cost, avg_idle, span_avg;
 
 		/*
@@ -6149,18 +6161,31 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	}
 
 	for_each_cpu_wrap(cpu, cpus, target) {
-		if (!--nr)
-			return -1;
-		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
-			break;
+		if (smt) {
+			i = select_idle_core(p, cpu, cpus, &idle_cpu);
+			if ((unsigned int)i < nr_cpumask_bits)
+				return i;
+
+		} else {
+			if (!--nr)
+				return -1;
+			i = __select_idle_cpu(cpu);
+			if ((unsigned int)i < nr_cpumask_bits) {
+				idle_cpu = i;
+				break;
+			}
+		}
 	}
 
-	if (sched_feat(SIS_PROP)) {
+	if (smt)
+		set_idle_cores(this, false);
+
+	if (sched_feat(SIS_PROP) && !smt) {
 		time = cpu_clock(this) - time;
 		update_avg(&this_sd->avg_scan_cost, time);
 	}
 
-	return cpu;
+	return idle_cpu;
 }
 
 /*
@@ -6289,10 +6314,6 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	if (!sd)
 		return target;
 
-	i = select_idle_core(p, sd, target);
-	if ((unsigned)i < nr_cpumask_bits)
-		return i;
-
 	i = select_idle_cpu(p, sd, target);
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;
-- 
2.26.2



* Re: [PATCH 2/4] sched/fair: Move avg_scan_cost calculations under SIS_PROP
  2021-01-25  8:59 ` [PATCH 2/4] sched/fair: Move avg_scan_cost calculations under SIS_PROP Mel Gorman
@ 2021-01-27 10:39   ` Vincent Guittot
  0 siblings, 0 replies; 14+ messages in thread
From: Vincent Guittot @ 2021-01-27 10:39 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Peter Zijlstra, Ingo Molnar, Li Aubrey, Qais Yousef, LKML

On Mon, 25 Jan 2021 at 09:59, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> As noted by Vincent Guittot, avg_scan_costs are calculated for SIS_PROP
> even if SIS_PROP is disabled. Move the time calculations under a SIS_PROP
> check and while we are at it, exclude the cost of initialising the CPU
> mask from the average scan cost.
>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---
>  kernel/sched/fair.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 9f5682aeda2e..c8d8e185cf3b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6153,6 +6153,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>         if (!this_sd)
>                 return -1;
>
> +       cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +
>         if (sched_feat(SIS_PROP)) {
>                 u64 avg_cost, avg_idle, span_avg;
>
> @@ -6168,11 +6170,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>                         nr = div_u64(span_avg, avg_cost);
>                 else
>                         nr = 4;
> -       }
> -
> -       time = cpu_clock(this);
>
> -       cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +               time = cpu_clock(this);
> +       }
>
>         for_each_cpu_wrap(cpu, cpus, target) {
>                 if (!--nr)
> @@ -6181,8 +6181,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>                         break;
>         }
>
> -       time = cpu_clock(this) - time;
> -       update_avg(&this_sd->avg_scan_cost, time);
> +       if (sched_feat(SIS_PROP)) {
> +               time = cpu_clock(this) - time;
> +               update_avg(&this_sd->avg_scan_cost, time);
> +       }
>
>         return cpu;
>  }
> --
> 2.26.2
>


* Re: [PATCH 3/4] sched/fair: Remove select_idle_smt()
  2021-01-25  8:59 ` [PATCH 3/4] sched/fair: Remove select_idle_smt() Mel Gorman
@ 2021-01-27 10:39   ` Vincent Guittot
  2021-02-17 13:17   ` [tip: sched/core] " tip-bot2 for Mel Gorman
  1 sibling, 0 replies; 14+ messages in thread
From: Vincent Guittot @ 2021-01-27 10:39 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Peter Zijlstra, Ingo Molnar, Li Aubrey, Qais Yousef, LKML

On Mon, 25 Jan 2021 at 09:59, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> From: Peter Zijlstra (Intel) <peterz@infradead.org>
>
> In order to make the next patch more readable, and to quantify the
> actual effectiveness of this pass, start by removing it.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---
>  kernel/sched/fair.c | 30 ------------------------------
>  1 file changed, 30 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c8d8e185cf3b..fe587350ea14 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6101,27 +6101,6 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
>         return -1;
>  }
>
> -/*
> - * Scan the local SMT mask for idle CPUs.
> - */
> -static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
> -{
> -       int cpu;
> -
> -       if (!static_branch_likely(&sched_smt_present))
> -               return -1;
> -
> -       for_each_cpu(cpu, cpu_smt_mask(target)) {
> -               if (!cpumask_test_cpu(cpu, p->cpus_ptr) ||
> -                   !cpumask_test_cpu(cpu, sched_domain_span(sd)))
> -                       continue;
> -               if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
> -                       return cpu;
> -       }
> -
> -       return -1;
> -}
> -
>  #else /* CONFIG_SCHED_SMT */
>
>  static inline int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
> @@ -6129,11 +6108,6 @@ static inline int select_idle_core(struct task_struct *p, struct sched_domain *s
>         return -1;
>  }
>
> -static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
> -{
> -       return -1;
> -}
> -
>  #endif /* CONFIG_SCHED_SMT */
>
>  /*
> @@ -6323,10 +6297,6 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>         if ((unsigned)i < nr_cpumask_bits)
>                 return i;
>
> -       i = select_idle_smt(p, sd, target);
> -       if ((unsigned)i < nr_cpumask_bits)
> -               return i;
> -
>         return target;
>  }
>
> --
> 2.26.2
>


* Re: [PATCH 4/4] sched/fair: Merge select_idle_core/cpu()
  2021-01-25  8:59 ` [PATCH 4/4] sched/fair: Merge select_idle_core/cpu() Mel Gorman
@ 2021-01-27 10:43   ` Vincent Guittot
  2021-01-27 12:02     ` Mel Gorman
  0 siblings, 1 reply; 14+ messages in thread
From: Vincent Guittot @ 2021-01-27 10:43 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Peter Zijlstra, Ingo Molnar, Li Aubrey, Qais Yousef, LKML

On Mon, 25 Jan 2021 at 09:59, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> From: Peter Zijlstra (Intel) <peterz@infradead.org>
>
> Both select_idle_core() and select_idle_cpu() do a loop over the same
> cpumask. Observe that by clearing the already visited CPUs, we can
> fold the iteration and iterate a core at a time.
>
> All we need to do is remember any non-idle CPU we encountered while
> scanning for an idle core. This way we'll only iterate every CPU once.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  kernel/sched/fair.c | 101 ++++++++++++++++++++++++++------------------
>  1 file changed, 61 insertions(+), 40 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index fe587350ea14..52a650aa2108 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6006,6 +6006,14 @@ static inline int find_idlest_cpu(struct sched_domain *sd, struct task_struct *p
>         return new_cpu;
>  }
>
> +static inline int __select_idle_cpu(int cpu)
> +{
> +       if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
> +               return cpu;
> +
> +       return -1;
> +}
> +
>  #ifdef CONFIG_SCHED_SMT
>  DEFINE_STATIC_KEY_FALSE(sched_smt_present);
>  EXPORT_SYMBOL_GPL(sched_smt_present);
> @@ -6064,48 +6072,51 @@ void __update_idle_core(struct rq *rq)
>   * there are no idle cores left in the system; tracked through
>   * sd_llc->shared->has_idle_cores and enabled through update_idle_core() above.
>   */
> -static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
> +static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpus, int *idle_cpu)
>  {
> -       struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> -       int core, cpu;
> +       bool idle = true;
> +       int cpu;
>
>         if (!static_branch_likely(&sched_smt_present))
> -               return -1;
> -
> -       if (!test_idle_cores(target, false))
> -               return -1;
> -
> -       cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +               return __select_idle_cpu(core);
>
> -       for_each_cpu_wrap(core, cpus, target) {
> -               bool idle = true;
> -
> -               for_each_cpu(cpu, cpu_smt_mask(core)) {
> -                       if (!available_idle_cpu(cpu)) {
> -                               idle = false;
> -                               break;
> +       for_each_cpu(cpu, cpu_smt_mask(core)) {
> +               if (!available_idle_cpu(cpu)) {
> +                       idle = false;
> +                       if (*idle_cpu == -1) {
> +                               if (sched_idle_cpu(cpu) && cpumask_test_cpu(cpu, p->cpus_ptr)) {
> +                                       *idle_cpu = cpu;
> +                                       break;
> +                               }
> +                               continue;
>                         }
> +                       break;
>                 }
> -
> -               if (idle)
> -                       return core;
> -
> -               cpumask_andnot(cpus, cpus, cpu_smt_mask(core));
> +               if (*idle_cpu == -1 && cpumask_test_cpu(cpu, p->cpus_ptr))
> +                       *idle_cpu = cpu;
>         }
>
> -       /*
> -        * Failed to find an idle core; stop looking for one.
> -        */
> -       set_idle_cores(target, 0);
> +       if (idle)
> +               return core;
>
> +       cpumask_andnot(cpus, cpus, cpu_smt_mask(core));
>         return -1;
>  }
>
>  #else /* CONFIG_SCHED_SMT */
>
> -static inline int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
> +static inline void set_idle_cores(int cpu, int val)
>  {
> -       return -1;
> +}
> +
> +static inline bool test_idle_cores(int cpu, bool def)
> +{
> +       return def;
> +}
> +
> +static inline int select_idle_core(struct task_struct *p, int core, struct cpumask *cpus, int *idle_cpu)
> +{
> +       return __select_idle_cpu(core);
>  }
>
>  #endif /* CONFIG_SCHED_SMT */
> @@ -6118,10 +6129,11 @@ static inline int select_idle_core(struct task_struct *p, struct sched_domain *s
>  static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
>  {
>         struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> +       int i, cpu, idle_cpu = -1, nr = INT_MAX;
> +       bool smt = test_idle_cores(target, false);
> +       int this = smp_processor_id();
>         struct sched_domain *this_sd;
>         u64 time;
> -       int this = smp_processor_id();
> -       int cpu, nr = INT_MAX;
>
>         this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
>         if (!this_sd)
> @@ -6129,7 +6141,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>
>         cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>
> -       if (sched_feat(SIS_PROP)) {
> +       if (sched_feat(SIS_PROP) && !smt) {
>                 u64 avg_cost, avg_idle, span_avg;
>
>                 /*
> @@ -6149,18 +6161,31 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>         }
>
>         for_each_cpu_wrap(cpu, cpus, target) {
> -               if (!--nr)
> -                       return -1;
> -               if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
> -                       break;
> +               if (smt) {
> +                       i = select_idle_core(p, cpu, cpus, &idle_cpu);
> +                       if ((unsigned int)i < nr_cpumask_bits)
> +                               return i;
> +
> +               } else {
> +                       if (!--nr)
> +                               return -1;
> +                       i = __select_idle_cpu(cpu);

you should use idle_cpu directly instead of this intermediate i variable

+                       idle_cpu = __select_idle_cpu(cpu);
+                       if ((unsigned int)idle_cpu < nr_cpumask_bits)
+                               break;

Apart from this small comment above, the patch looks good to me and I
don't see any performance regression anymore


> +                       if ((unsigned int)i < nr_cpumask_bits) {
> +                               idle_cpu = i;
> +                               break;
> +                       }
> +               }
>         }
>
> -       if (sched_feat(SIS_PROP)) {
> +       if (smt)
> +               set_idle_cores(this, false);
> +
> +       if (sched_feat(SIS_PROP) && !smt) {
>                 time = cpu_clock(this) - time;
>                 update_avg(&this_sd->avg_scan_cost, time);
>         }
>
> -       return cpu;
> +       return idle_cpu;
>  }
>
>  /*
> @@ -6289,10 +6314,6 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>         if (!sd)
>                 return target;
>
> -       i = select_idle_core(p, sd, target);
> -       if ((unsigned)i < nr_cpumask_bits)
> -               return i;
> -
>         i = select_idle_cpu(p, sd, target);
>         if ((unsigned)i < nr_cpumask_bits)
>                 return i;
> --
> 2.26.2
>


* Re: [PATCH 4/4] sched/fair: Merge select_idle_core/cpu()
  2021-01-27 10:43   ` Vincent Guittot
@ 2021-01-27 12:02     ` Mel Gorman
  2021-01-27 13:07       ` Vincent Guittot
  0 siblings, 1 reply; 14+ messages in thread
From: Mel Gorman @ 2021-01-27 12:02 UTC (permalink / raw)
  To: Vincent Guittot; +Cc: Peter Zijlstra, Ingo Molnar, Li Aubrey, Qais Yousef, LKML

On Wed, Jan 27, 2021 at 11:43:22AM +0100, Vincent Guittot wrote:
> > @@ -6149,18 +6161,31 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> >         }
> >
> >         for_each_cpu_wrap(cpu, cpus, target) {
> > -               if (!--nr)
> > -                       return -1;
> > -               if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
> > -                       break;
> > +               if (smt) {
> > +                       i = select_idle_core(p, cpu, cpus, &idle_cpu);
> > +                       if ((unsigned int)i < nr_cpumask_bits)
> > +                               return i;
> > +
> > +               } else {
> > +                       if (!--nr)
> > +                               return -1;
> > +                       i = __select_idle_cpu(cpu);
> 
> you should use idle_cpu directly instead of this intermediate i variable
> 
> +                       idle_cpu = __select_idle_cpu(cpu);
> +                       if ((unsigned int)idle_cpu < nr_cpumask_bits)
> +                               break;
> 
> Apart ths small comment above, the patch looks good to me and I
> haven't any performance regression anymore
> 

It's matching the code sequence in the SMT block. If we are going to make
that change, then go the full way with this?

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 52a650aa2108..01e40e36c386 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6129,7 +6129,7 @@ static inline int select_idle_core(struct task_struct *p, int core, struct cpuma
 static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
 {
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
-	int i, cpu, idle_cpu = -1, nr = INT_MAX;
+	int cpu, idle_cpu = -1, nr = INT_MAX;
 	bool smt = test_idle_cores(target, false);
 	int this = smp_processor_id();
 	struct sched_domain *this_sd;
@@ -6162,18 +6162,16 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 
 	for_each_cpu_wrap(cpu, cpus, target) {
 		if (smt) {
-			i = select_idle_core(p, cpu, cpus, &idle_cpu);
-			if ((unsigned int)i < nr_cpumask_bits)
-				return i;
+			idle_cpu = select_idle_core(p, cpu, cpus, &idle_cpu);
+			if ((unsigned int)idle_cpu < nr_cpumask_bits)
+				return idle_cpu;
 
 		} else {
 			if (!--nr)
 				return -1;
-			i = __select_idle_cpu(cpu);
-			if ((unsigned int)i < nr_cpumask_bits) {
-				idle_cpu = i;
+			idle_cpu = __select_idle_cpu(cpu);
+			if ((unsigned int)idle_cpu < nr_cpumask_bits)
 				break;
-			}
 		}
 	}
 

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH 4/4] sched/fair: Merge select_idle_core/cpu()
  2021-01-27 12:02     ` Mel Gorman
@ 2021-01-27 13:07       ` Vincent Guittot
  2021-01-27 13:21         ` Mel Gorman
  0 siblings, 1 reply; 14+ messages in thread
From: Vincent Guittot @ 2021-01-27 13:07 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Peter Zijlstra, Ingo Molnar, Li Aubrey, Qais Yousef, LKML

On Wed, 27 Jan 2021 at 13:02, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Wed, Jan 27, 2021 at 11:43:22AM +0100, Vincent Guittot wrote:
> > > @@ -6149,18 +6161,31 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> > >         }
> > >
> > >         for_each_cpu_wrap(cpu, cpus, target) {
> > > -               if (!--nr)
> > > -                       return -1;
> > > -               if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
> > > -                       break;
> > > +               if (smt) {
> > > +                       i = select_idle_core(p, cpu, cpus, &idle_cpu);
> > > +                       if ((unsigned int)i < nr_cpumask_bits)
> > > +                               return i;
> > > +
> > > +               } else {
> > > +                       if (!--nr)
> > > +                               return -1;
> > > +                       i = __select_idle_cpu(cpu);
> >
> > you should use idle_cpu directly instead of this intermediate i variable
> >
> > +                       idle_cpu = __select_idle_cpu(cpu);
> > +                       if ((unsigned int)idle_cpu < nr_cpumask_bits)
> > +                               break;
> >
> > Apart ths small comment above, the patch looks good to me and I
> > haven't any performance regression anymore
> >
>
> It's matching the code sequence in the SMT block. If we are going to make
> that change, then go the full way with this?
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 52a650aa2108..01e40e36c386 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6129,7 +6129,7 @@ static inline int select_idle_core(struct task_struct *p, int core, struct cpuma
>  static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
>  {
>         struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> -       int i, cpu, idle_cpu = -1, nr = INT_MAX;
> +       int cpu, idle_cpu = -1, nr = INT_MAX;
>         bool smt = test_idle_cores(target, false);
>         int this = smp_processor_id();
>         struct sched_domain *this_sd;
> @@ -6162,18 +6162,16 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>
>         for_each_cpu_wrap(cpu, cpus, target) {
>                 if (smt) {
> -                       i = select_idle_core(p, cpu, cpus, &idle_cpu);
> -                       if ((unsigned int)i < nr_cpumask_bits)
> -                               return i;
> +                       idle_cpu = select_idle_core(p, cpu, cpus, &idle_cpu);

but how do you differentiate an idle core (the return value) from an
idle cpu in the core that is recorded in &idle_cpu?

You will return as soon as a cpu is idle, before all of the cores have
been tested.
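
To illustrate the problem with folding both roles into one variable
(restating the hunk above):

	idle_cpu = select_idle_core(p, cpu, cpus, &idle_cpu);
	/*
	 * select_idle_core() returns the core only when the whole core is
	 * idle and -1 otherwise; an idle sibling of a busy core is merely
	 * recorded through &idle_cpu as a fallback. Assigning the return
	 * value to idle_cpu overwrites that fallback, so the distinction
	 * between "idle core found" and "idle CPU remembered" is lost.
	 */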

> +                       if ((unsigned int)idle_cpu < nr_cpumask_bits)
> +                               return idle_cpu;
>
>                 } else {
>                         if (!--nr)
>                                 return -1;
> -                       i = __select_idle_cpu(cpu);
> -                       if ((unsigned int)i < nr_cpumask_bits) {
> -                               idle_cpu = i;
> +                       idle_cpu = __select_idle_cpu(cpu);
> +                       if ((unsigned int)idle_cpu < nr_cpumask_bits)
>                                 break;
> -                       }
>                 }
>         }
>
>
> --
> Mel Gorman
> SUSE Labs


* Re: [PATCH 4/4] sched/fair: Merge select_idle_core/cpu()
  2021-01-27 13:07       ` Vincent Guittot
@ 2021-01-27 13:21         ` Mel Gorman
  2021-01-27 13:26           ` Vincent Guittot
  0 siblings, 1 reply; 14+ messages in thread
From: Mel Gorman @ 2021-01-27 13:21 UTC (permalink / raw)
  To: Vincent Guittot; +Cc: Peter Zijlstra, Ingo Molnar, Li Aubrey, Qais Yousef, LKML

On Wed, Jan 27, 2021 at 02:07:50PM +0100, Vincent Guittot wrote:
> > @@ -6162,18 +6162,16 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> >
> >         for_each_cpu_wrap(cpu, cpus, target) {
> >                 if (smt) {
> > -                       i = select_idle_core(p, cpu, cpus, &idle_cpu);
> > -                       if ((unsigned int)i < nr_cpumask_bits)
> > -                               return i;
> > +                       idle_cpu = select_idle_core(p, cpu, cpus, &idle_cpu);
> 
> but how do you differentiate idle core (return value) and an idle cpu
> in the core set in &idle_cpu
> 
> You will return as soon as a cpu is idle and before testing all cores
> 

Bah, I'm sorry, I was context switching between multiple tasks and failed
to engage brain. I'll apply your hunk and resend. I don't think this
merits retesting as the saving from avoiding the intermediate is marginal.

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH 4/4] sched/fair: Merge select_idle_core/cpu()
  2021-01-27 13:21         ` Mel Gorman
@ 2021-01-27 13:26           ` Vincent Guittot
  0 siblings, 0 replies; 14+ messages in thread
From: Vincent Guittot @ 2021-01-27 13:26 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Peter Zijlstra, Ingo Molnar, Li Aubrey, Qais Yousef, LKML

On Wed, 27 Jan 2021 at 14:21, Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Wed, Jan 27, 2021 at 02:07:50PM +0100, Vincent Guittot wrote:
> > > @@ -6162,18 +6162,16 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> > >
> > >         for_each_cpu_wrap(cpu, cpus, target) {
> > >                 if (smt) {
> > > -                       i = select_idle_core(p, cpu, cpus, &idle_cpu);
> > > -                       if ((unsigned int)i < nr_cpumask_bits)
> > > -                               return i;
> > > +                       idle_cpu = select_idle_core(p, cpu, cpus, &idle_cpu);
> >
> > but how do you differentiate idle core (return value) and an idle cpu
> > in the core set in &idle_cpu
> >
> > You will return as soon as a cpu is idle and before testing all cores
> >
>
> Bah, I'm sorry, I was context switching between multiple tasks and failed
> to engage brain. I'll apply your hunk and resend. I don't think this

That also happens to me

> merits retesting as the saving from avoiding the intermediate is marginal.

I agree. You can add my Reviewed-by

>
> --
> Mel Gorman
> SUSE Labs


* [tip: sched/core] sched/fair: Remove select_idle_smt()
  2021-01-25  8:59 ` [PATCH 3/4] sched/fair: Remove select_idle_smt() Mel Gorman
  2021-01-27 10:39   ` Vincent Guittot
@ 2021-02-17 13:17   ` tip-bot2 for Mel Gorman
  1 sibling, 0 replies; 14+ messages in thread
From: tip-bot2 for Mel Gorman @ 2021-02-17 13:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Mel Gorman, Peter Zijlstra (Intel),
	Ingo Molnar, Vincent Guittot, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     6cd56ef1df399a004f90ecb682427f9964969fc9
Gitweb:        https://git.kernel.org/tip/6cd56ef1df399a004f90ecb682427f9964969fc9
Author:        Mel Gorman <mgorman@techsingularity.net>
AuthorDate:    Mon, 25 Jan 2021 08:59:08 
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 17 Feb 2021 14:06:59 +01:00

sched/fair: Remove select_idle_smt()

In order to make the next patch more readable, and to quantify the
actual effectiveness of this pass, start by removing it.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210125085909.4600-4-mgorman@techsingularity.net
---
 kernel/sched/fair.c | 30 ------------------------------
 1 file changed, 30 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4c18ef6..6a0fc8a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6114,27 +6114,6 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int 
 	return -1;
 }
 
-/*
- * Scan the local SMT mask for idle CPUs.
- */
-static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
-{
-	int cpu;
-
-	if (!static_branch_likely(&sched_smt_present))
-		return -1;
-
-	for_each_cpu(cpu, cpu_smt_mask(target)) {
-		if (!cpumask_test_cpu(cpu, p->cpus_ptr) ||
-		    !cpumask_test_cpu(cpu, sched_domain_span(sd)))
-			continue;
-		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
-			return cpu;
-	}
-
-	return -1;
-}
-
 #else /* CONFIG_SCHED_SMT */
 
 static inline int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
@@ -6142,11 +6121,6 @@ static inline int select_idle_core(struct task_struct *p, struct sched_domain *s
 	return -1;
 }
 
-static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
-{
-	return -1;
-}
-
 #endif /* CONFIG_SCHED_SMT */
 
 /*
@@ -6336,10 +6310,6 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;
 
-	i = select_idle_smt(p, sd, target);
-	if ((unsigned)i < nr_cpumask_bits)
-		return i;
-
 	return target;
 }
 


* [PATCH 4/4] sched/fair: Merge select_idle_core/cpu()
  2021-01-27 13:51 [PATCH v5 0/4] Scan for an idle sibling in a single pass Mel Gorman
@ 2021-01-27 13:52 ` Mel Gorman
  0 siblings, 0 replies; 14+ messages in thread
From: Mel Gorman @ 2021-01-27 13:52 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Vincent Guittot, Li Aubrey, Qais Yousef, LKML, Mel Gorman

From: Peter Zijlstra (Intel) <peterz@infradead.org>

Both select_idle_core() and select_idle_cpu() do a loop over the same
cpumask. Observe that by clearing the already visited CPUs, we can
fold the iteration and iterate a core at a time.

All we need to do is remember any non-idle CPU we encountered while
scanning for an idle core. This way we'll only iterate every CPU once.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 99 +++++++++++++++++++++++++++------------------
 1 file changed, 59 insertions(+), 40 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fe587350ea14..01591e24a95f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6006,6 +6006,14 @@ static inline int find_idlest_cpu(struct sched_domain *sd, struct task_struct *p
 	return new_cpu;
 }
 
+static inline int __select_idle_cpu(int cpu)
+{
+	if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
+		return cpu;
+
+	return -1;
+}
+
 #ifdef CONFIG_SCHED_SMT
 DEFINE_STATIC_KEY_FALSE(sched_smt_present);
 EXPORT_SYMBOL_GPL(sched_smt_present);
@@ -6064,48 +6072,51 @@ void __update_idle_core(struct rq *rq)
  * there are no idle cores left in the system; tracked through
  * sd_llc->shared->has_idle_cores and enabled through update_idle_core() above.
  */
-static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
+static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpus, int *idle_cpu)
 {
-	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
-	int core, cpu;
+	bool idle = true;
+	int cpu;
 
 	if (!static_branch_likely(&sched_smt_present))
-		return -1;
-
-	if (!test_idle_cores(target, false))
-		return -1;
-
-	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+		return __select_idle_cpu(core);
 
-	for_each_cpu_wrap(core, cpus, target) {
-		bool idle = true;
-
-		for_each_cpu(cpu, cpu_smt_mask(core)) {
-			if (!available_idle_cpu(cpu)) {
-				idle = false;
-				break;
+	for_each_cpu(cpu, cpu_smt_mask(core)) {
+		if (!available_idle_cpu(cpu)) {
+			idle = false;
+			if (*idle_cpu == -1) {
+				if (sched_idle_cpu(cpu) && cpumask_test_cpu(cpu, p->cpus_ptr)) {
+					*idle_cpu = cpu;
+					break;
+				}
+				continue;
 			}
+			break;
 		}
-
-		if (idle)
-			return core;
-
-		cpumask_andnot(cpus, cpus, cpu_smt_mask(core));
+		if (*idle_cpu == -1 && cpumask_test_cpu(cpu, p->cpus_ptr))
+			*idle_cpu = cpu;
 	}
 
-	/*
-	 * Failed to find an idle core; stop looking for one.
-	 */
-	set_idle_cores(target, 0);
+	if (idle)
+		return core;
 
+	cpumask_andnot(cpus, cpus, cpu_smt_mask(core));
 	return -1;
 }
 
 #else /* CONFIG_SCHED_SMT */
 
-static inline int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
+static inline void set_idle_cores(int cpu, int val)
 {
-	return -1;
+}
+
+static inline bool test_idle_cores(int cpu, bool def)
+{
+	return def;
+}
+
+static inline int select_idle_core(struct task_struct *p, int core, struct cpumask *cpus, int *idle_cpu)
+{
+	return __select_idle_cpu(core);
 }
 
 #endif /* CONFIG_SCHED_SMT */
@@ -6118,10 +6129,11 @@ static inline int select_idle_core(struct task_struct *p, struct sched_domain *s
 static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
 {
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
+	int i, cpu, idle_cpu = -1, nr = INT_MAX;
+	bool smt = test_idle_cores(target, false);
+	int this = smp_processor_id();
 	struct sched_domain *this_sd;
 	u64 time;
-	int this = smp_processor_id();
-	int cpu, nr = INT_MAX;
 
 	this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
 	if (!this_sd)
@@ -6129,7 +6141,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 
 	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
 
-	if (sched_feat(SIS_PROP)) {
+	if (sched_feat(SIS_PROP) && !smt) {
 		u64 avg_cost, avg_idle, span_avg;
 
 		/*
@@ -6149,18 +6161,29 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	}
 
 	for_each_cpu_wrap(cpu, cpus, target) {
-		if (!--nr)
-			return -1;
-		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
-			break;
+		if (smt) {
+			i = select_idle_core(p, cpu, cpus, &idle_cpu);
+			if ((unsigned int)i < nr_cpumask_bits)
+				return i;
+
+		} else {
+			if (!--nr)
+				return -1;
+			idle_cpu = __select_idle_cpu(cpu);
+			if ((unsigned int)idle_cpu < nr_cpumask_bits)
+				break;
+		}
 	}
 
-	if (sched_feat(SIS_PROP)) {
+	if (smt)
+		set_idle_cores(this, false);
+
+	if (sched_feat(SIS_PROP) && !smt) {
 		time = cpu_clock(this) - time;
 		update_avg(&this_sd->avg_scan_cost, time);
 	}
 
-	return cpu;
+	return idle_cpu;
 }
 
 /*
@@ -6289,10 +6312,6 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	if (!sd)
 		return target;
 
-	i = select_idle_core(p, sd, target);
-	if ((unsigned)i < nr_cpumask_bits)
-		return i;
-
 	i = select_idle_cpu(p, sd, target);
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;
-- 
2.26.2



Thread overview: 14+ messages
2021-01-25  8:59 [PATCH v4 0/4] Scan for an idle sibling in a single pass Mel Gorman
2021-01-25  8:59 ` [PATCH 1/4] sched/fair: Remove SIS_AVG_CPU Mel Gorman
2021-01-25  8:59 ` [PATCH 2/4] sched/fair: Move avg_scan_cost calculations under SIS_PROP Mel Gorman
2021-01-27 10:39   ` Vincent Guittot
2021-01-25  8:59 ` [PATCH 3/4] sched/fair: Remove select_idle_smt() Mel Gorman
2021-01-27 10:39   ` Vincent Guittot
2021-02-17 13:17   ` [tip: sched/core] " tip-bot2 for Mel Gorman
2021-01-25  8:59 ` [PATCH 4/4] sched/fair: Merge select_idle_core/cpu() Mel Gorman
2021-01-27 10:43   ` Vincent Guittot
2021-01-27 12:02     ` Mel Gorman
2021-01-27 13:07       ` Vincent Guittot
2021-01-27 13:21         ` Mel Gorman
2021-01-27 13:26           ` Vincent Guittot
2021-01-27 13:51 [PATCH v5 0/4] Scan for an idle sibling in a single pass Mel Gorman
2021-01-27 13:52 ` [PATCH 4/4] sched/fair: Merge select_idle_core/cpu() Mel Gorman
