* [PATCH 0/5] sched/fair: NOHZ cleanups and misfit improvement
@ 2019-01-17 15:34 Valentin Schneider
  2019-01-17 15:34 ` [PATCH 1/5] sched/fair: Use for_each_cpu_and for asym-packing nohz kicks Valentin Schneider
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Valentin Schneider @ 2019-01-17 15:34 UTC
  To: linux-kernel
  Cc: mingo, peterz, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

In

  commit 5fbdfae5221a ("sched/fair: Kick nohz balance if rq->misfit_task_load")

a trigger for nohz kicks was added; it is required to offload misfit tasks
from LITTLE to big CPUs. However, those kicks can be issued far more
frequently than is strictly needed.
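
For reference, the trigger boils down to this check at the top of
nohz_balancer_kick() - a trimmed sketch of the pre-series code (the full
context is visible in patch 4):

	/*
	 * Kick the idle load balancer if this rq is overloaded or if it
	 * currently runs a misfit task, regardless of whether anything
	 * can actually be done about that misfit task.
	 */
	if (rq->nr_running >= 2 || rq->misfit_task_load) {
		flags = NOHZ_KICK_MASK;
		goto out;
	}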

This patch-set brings some standalone nohz code cleanups and tunes down
the unneeded nohz kicks.

- Patches [1-3] do some cleaning up of the current nohz code
- Patches [4-5] tweak the nohz kick conditions for asymmetric systems
  
* Testing
** kick_ilb() hits
  The patch-set causes a large reduction in calls to kick_ilb() (and thus
  in subsequent rescheduling interrupts & useless nohz balance calls) in
  most scenarios.

  The "best case" one is running NR_BIG_CPUS big tasks, which I tested with
  4 50% periodic tasks running for 5 seconds on my HiKey960 (4x4 big.LITTLE):

  | CPU | hits (baseline) | hits (patchset) |
  |-----+-----------------+-----------------|
  |   0 |              31 |              41 |
  |   1 |              21 |               3 |
  |   2 |              35 |               2 |
  |   3 |               9 |               4 |
  |-----+-----------------+-----------------|
  |   4 |             170 |               4 |
  |   5 |             573 |               4 |
  |   6 |             544 |               4 |
  |   7 |             579 |               4 |


  A less idealized scenario with NR_CPUS-1 big tasks still shows some
  improvement (7 100% tasks running for 5 seconds on my HiKey960):

  | CPU | hits (baseline) | hits (patchset) |
  |-----+-----------------+-----------------|
  |   0 |              14 |             122 |
  |   1 |              47 |             162 |
  |   2 |              11 |             156 |
  |   3 |               9 |               3 |
  |-----+-----------------+-----------------|
  |   4 |              53 |               6 |
  |   5 |             276 |              13 |
  |   6 |             312 |               7 |
  |   7 |             250 |              11 |

  I was surprised to see such an increase in calls to kick_ilb() from LITTLE
  CPUs ([0-3]), but after a bit of investigation it turns out that the big
  CPUs would always run nohz_balancer_kick() a jiffy before the LITTLEs, so
  the LITTLEs would always bail out because nohz.next_balance had just been
  updated before they called nohz_balancer_kick(). IOW,

      time_before(now, nohz.next_balance)

  would always be true on CPUs [0-3] during my workload. Quieting the kicks
  issued by the big CPUs allowed the LITTLEs to execute nohz_balancer_kick()
  past that condition, explaining the higher number of kicks issued from LITTLE
  CPUs.
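
  For context, the gate in question sits near the top of
  nohz_balancer_kick() and looks roughly like this (trimmed sketch; the
  real function also handles blocked-load stats kicks):

	unsigned long now = jiffies;

	/*
	 * nohz.next_balance is pushed forward whenever a nohz balance
	 * runs, so a CPU getting here shortly after one bails out
	 * straight away.
	 */
	if (time_before(now, nohz.next_balance))
		goto out;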
  
** misfit behaviour
  For good measure I also ran the usual misfit tests [1], which showed no
  particular change.

* Notes

  Since patch 05 introduces yet another label in nohz_balancer_kick(),
  I benchmarked splitting its "slowpath" (all of the rcu derefs) into
  its own function and didn't find any real overhead. If folks don't
  hate the idea I can add it to a v2 or post it separately.
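
  A rough sketch of what that split could look like - the helper name and
  exact shape are illustrative, not from any posted patch:

	/*
	 * Illustrative only: hoist the rcu_dereference() cascade out of
	 * nohz_balancer_kick() into a helper that computes the kick flags.
	 * Caller must hold rcu_read_lock().
	 */
	static unsigned int nohz_kick_slowpath(struct rq *rq, int cpu)
	{
		struct sched_domain_shared *sds;
		struct sched_domain *sd;
		int i;

		sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
		if (sds && atomic_read(&sds->nr_busy_cpus) > 1)
			return NOHZ_KICK_MASK;

		sd = rcu_dereference(rq->sd);
		if (sd && rq->cfs.h_nr_running >= 1 &&
		    check_cpu_capacity(rq, sd))
			return NOHZ_KICK_MASK;

		sd = rcu_dereference(per_cpu(sd_asym_packing, cpu));
		if (sd) {
			for_each_cpu_and(i, sched_domain_span(sd),
					 nohz.idle_cpus_mask) {
				if (sched_asym_prefer(i, cpu))
					return NOHZ_KICK_MASK;
			}
		}

		return 0;
	}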

[1]: https://github.com/ARM-software/lisa/blob/next/lisa/tests/kernel/scheduler/misfit.py

Valentin Schneider (5):
  sched/fair: Use for_each_cpu_and for asym-packing nohz kicks
  sched/fair: Explain LLC nohz kick condition
  sched/fair: Prune nohz_balancer_kick() comment block
  sched/fair: Tune down misfit nohz kicks
  sched/fair: Skip LLC nohz logic for asymmetric systems

 kernel/sched/fair.c | 62 ++++++++++++++++++++++++++++++++-------------
 1 file changed, 44 insertions(+), 18 deletions(-)

--
2.20.1



* [PATCH 1/5] sched/fair: Use for_each_cpu_and for asym-packing nohz kicks
  2019-01-17 15:34 [PATCH 0/5] sched/fair: NOHZ cleanups and misfit improvement Valentin Schneider
@ 2019-01-17 15:34 ` Valentin Schneider
  2019-02-11 10:50   ` [tip:sched/core] sched/fair: Simplify nohz_balancer_kick() tip-bot for Valentin Schneider
  2019-01-17 15:34 ` [PATCH 2/5] sched/fair: Explain LLC nohz kick condition Valentin Schneider
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2019-01-17 15:34 UTC
  To: linux-kernel
  Cc: mingo, peterz, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

Calling 'nohz_balance_exit_idle(rq)' will always clear 'rq->cpu' from
'nohz.idle_cpus_mask' if it is set. Since it is called at the top of
'nohz_balancer_kick()', 'rq->cpu' will never be set in
'nohz.idle_cpus_mask' if it is accessed in the rest of the function.

Combine the 'sched_domain_span()' with 'nohz.idle_cpus_mask' and drop the
'(i == cpu)' check since 'rq->cpu' will never be iterated over.

While at it, clean up a condition alignment.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/sched/fair.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ca469646ebe1..380a72e42678 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9557,7 +9557,7 @@ static void nohz_balancer_kick(struct rq *rq)
 	sd = rcu_dereference(rq->sd);
 	if (sd) {
 		if ((rq->cfs.h_nr_running >= 1) &&
-				check_cpu_capacity(rq, sd)) {
+		    check_cpu_capacity(rq, sd)) {
 			flags = NOHZ_KICK_MASK;
 			goto unlock;
 		}
@@ -9565,11 +9565,7 @@ static void nohz_balancer_kick(struct rq *rq)
 
 	sd = rcu_dereference(per_cpu(sd_asym_packing, cpu));
 	if (sd) {
-		for_each_cpu(i, sched_domain_span(sd)) {
-			if (i == cpu ||
-			    !cpumask_test_cpu(i, nohz.idle_cpus_mask))
-				continue;
-
+		for_each_cpu_and(i, sched_domain_span(sd), nohz.idle_cpus_mask) {
 			if (sched_asym_prefer(i, cpu)) {
 				flags = NOHZ_KICK_MASK;
 				goto unlock;
-- 
2.20.1



* [PATCH 2/5] sched/fair: Explain LLC nohz kick condition
  2019-01-17 15:34 [PATCH 0/5] sched/fair: NOHZ cleanups and misfit improvement Valentin Schneider
  2019-01-17 15:34 ` [PATCH 1/5] sched/fair: Use for_each_cpu_and for asym-packing nohz kicks Valentin Schneider
@ 2019-01-17 15:34 ` Valentin Schneider
  2019-02-11 10:51   ` [tip:sched/core] " tip-bot for Valentin Schneider
  2019-01-17 15:34 ` [PATCH 3/5] sched/fair: Prune nohz_balancer_kick() comment block Valentin Schneider
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2019-01-17 15:34 UTC
  To: linux-kernel
  Cc: mingo, peterz, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

Provide a comment explaining the LLC related nohz kick in
nohz_balancer_kick().

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/sched/fair.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 380a72e42678..62f6aead7e73 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9543,8 +9543,13 @@ static void nohz_balancer_kick(struct rq *rq)
 	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
 	if (sds) {
 		/*
-		 * XXX: write a coherent comment on why we do this.
-		 * See also: http://lkml.kernel.org/r/20111202010832.602203411@sbsiddha-desk.sc.intel.com
+		 * If there is an imbalance between LLC domains (IOW we could
+		 * increase the overall cache use), we need some less-loaded LLC
+		 * domain to pull some load. Likewise, we may need to spread
+		 * load within the current LLC domain (e.g. packed SMT cores but
+		 * other CPUs are idle). We can't really know from here how busy
+		 * the others are - so just get a nohz balance going if it looks
+		 * like this LLC domain has tasks we could move.
 		 */
 		nr_busy = atomic_read(&sds->nr_busy_cpus);
 		if (nr_busy > 1) {
-- 
2.20.1



* [PATCH 3/5] sched/fair: Prune nohz_balancer_kick() comment block
  2019-01-17 15:34 [PATCH 0/5] sched/fair: NOHZ cleanups and misfit improvement Valentin Schneider
  2019-01-17 15:34 ` [PATCH 1/5] sched/fair: Use for_each_cpu_and for asym-packing nohz kicks Valentin Schneider
  2019-01-17 15:34 ` [PATCH 2/5] sched/fair: Explain LLC nohz kick condition Valentin Schneider
@ 2019-01-17 15:34 ` Valentin Schneider
  2019-02-11 10:51   ` [tip:sched/core] sched/fair: Prune, fix and simplify the " tip-bot for Valentin Schneider
  2019-01-17 15:34 ` [PATCH 4/5] sched/fair: Tune down misfit nohz kicks Valentin Schneider
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2019-01-17 15:34 UTC
  To: linux-kernel
  Cc: mingo, peterz, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

The comment block for that function lists the heuristics for
triggering a nohz kick, but the most recent ones (blocked load
updates, misfit) aren't included, and some of them (LLC nohz logic,
asym packing) are no longer in sync with the code.

The conditions are either simple enough or properly commented, so get
rid of that list instead of letting it grow.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/sched/fair.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 62f6aead7e73..7658f9d76a9a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9493,15 +9493,8 @@ static void kick_ilb(unsigned int flags)
 }
 
 /*
- * Current heuristic for kicking the idle load balancer in the presence
- * of an idle cpu in the system.
- *   - This rq has more than one task.
- *   - This rq has at least one CFS task and the capacity of the CPU is
- *     significantly reduced because of RT tasks or IRQs.
- *   - At parent of LLC scheduler domain level, this cpu's scheduler group has
- *     multiple busy cpu.
- *   - For SD_ASYM_PACKING, if the lower numbered cpu's in the scheduler
- *     domain span are idle.
+ * Current decision point for kicking the idle load balancer in the presence
+ * of idle CPUs in the system.
  */
 static void nohz_balancer_kick(struct rq *rq)
 {
-- 
2.20.1



* [PATCH 4/5] sched/fair: Tune down misfit nohz kicks
  2019-01-17 15:34 [PATCH 0/5] sched/fair: NOHZ cleanups and misfit improvement Valentin Schneider
                   ` (2 preceding siblings ...)
  2019-01-17 15:34 ` [PATCH 3/5] sched/fair: Prune nohz_balancer_kick() comment block Valentin Schneider
@ 2019-01-17 15:34 ` Valentin Schneider
  2019-02-06 16:04   ` Peter Zijlstra
  2019-01-17 15:34 ` [PATCH 5/5] sched/fair: Skip LLC nohz logic for asymmetric systems Valentin Schneider
  2019-02-06 14:33 ` [PATCH 0/5] sched/fair: NOHZ cleanups and misfit improvement Valentin Schneider
  5 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2019-01-17 15:34 UTC
  To: linux-kernel
  Cc: mingo, peterz, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

In

  commit 3b1baa6496e6 ("sched/fair: Add 'group_misfit_task' load-balance type")

we set rq->misfit_task_load whenever the current running task has a
utilization greater than 80% of rq->cpu_capacity. A non-zero value in
this field enables misfit load balancing.

However, if the task being looked at is already running on a CPU of
highest capacity, there's nothing more we can do for it. We can
currently spot this in update_sd_pick_busiest(), which prevents us
from selecting a sched_group of group_type == group_misfit_task as the
busiest group, but we don't do any of that in nohz_balancer_kick().

This means that we could repeatedly kick nohz CPUs when there is no
load-balance improvement to be had.

Introduce a check_misfit_status() helper that returns true iff there
is a CPU in the system that could give more CPU capacity to a rq's
misfit task - IOW, there exists a CPU of higher capacity_orig or the
rq's CPU is severely pressured by rt/IRQ.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/sched/fair.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7658f9d76a9a..291dfdb0183f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8012,6 +8012,18 @@ check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
 				(rq->cpu_capacity_orig * 100));
 }
 
+/*
+ * Check whether a rq has a misfit task and if it looks like we can actually
+ * help that task: we can migrate the task to a CPU of higher capacity, or
+ * the task's current CPU is heavily pressured.
+ */
+static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
+{
+	return rq->misfit_task_load &&
+		(rq->cpu_capacity_orig < rq->rd->max_cpu_capacity ||
+		 check_cpu_capacity(rq, sd));
+}
+
 /*
  * Group imbalance indicates (and tries to solve) the problem where balancing
  * groups is inadequate due to ->cpus_allowed constraints.
@@ -9527,7 +9539,7 @@ static void nohz_balancer_kick(struct rq *rq)
 	if (time_before(now, nohz.next_balance))
 		goto out;
 
-	if (rq->nr_running >= 2 || rq->misfit_task_load) {
+	if (rq->nr_running >= 2) {
 		flags = NOHZ_KICK_MASK;
 		goto out;
 	}
@@ -9561,6 +9573,14 @@ static void nohz_balancer_kick(struct rq *rq)
 		}
 	}
 
+	sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, cpu));
+	if (sd) {
+		if (check_misfit_status(rq, sd)) {
+			flags = NOHZ_KICK_MASK;
+			goto unlock;
+		}
+	}
+
 	sd = rcu_dereference(per_cpu(sd_asym_packing, cpu));
 	if (sd) {
 		for_each_cpu_and(i, sched_domain_span(sd), nohz.idle_cpus_mask) {
-- 
2.20.1



* [PATCH 5/5] sched/fair: Skip LLC nohz logic for asymmetric systems
  2019-01-17 15:34 [PATCH 0/5] sched/fair: NOHZ cleanups and misfit improvement Valentin Schneider
                   ` (3 preceding siblings ...)
  2019-01-17 15:34 ` [PATCH 4/5] sched/fair: Tune down misfit nohz kicks Valentin Schneider
@ 2019-01-17 15:34 ` Valentin Schneider
  2019-02-06 16:14   ` Peter Zijlstra
  2019-02-06 14:33 ` [PATCH 0/5] sched/fair: NOHZ cleanups and misfit improvement Valentin Schneider
  5 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2019-01-17 15:34 UTC
  To: linux-kernel
  Cc: mingo, peterz, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

The LLC nohz condition will become true as soon as >=2 CPUs in a
single LLC domain are busy. On big.LITTLE systems, this translates to
two or more CPUs of a "cluster" (big or LITTLE) being busy.

Issuing a nohz kick in these conditions isn't desired for asymmetric
systems: if the busy CPUs can provide enough compute capacity to the
running tasks, then we can leave the nohz CPUs in peace.

Skip the LLC nohz condition for asymmetric systems, and rely on
nr_running & capacity checks to trigger nohz kicks when the system
actually needs them.

Suggested-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/sched/fair.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 291dfdb0183f..ff27bf56e8d4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9545,6 +9545,17 @@ static void nohz_balancer_kick(struct rq *rq)
 	}
 
 	rcu_read_lock();
+
+	if (static_branch_unlikely(&sched_asym_cpucapacity))
+		/*
+		 * For asymmetric systems, we do not want to nicely balance
+		 * cache use, instead we want to embrace asymmetry and only
+		 * ensure tasks have enough CPU capacity.
+		 *
+		 * Skip the LLC logic because it's not relevant in that case.
+		 */
+		goto check_capacity;
+
 	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
 	if (sds) {
 		/*
@@ -9564,6 +9575,7 @@ static void nohz_balancer_kick(struct rq *rq)
 
 	}
 
+check_capacity:
 	sd = rcu_dereference(rq->sd);
 	if (sd) {
 		if ((rq->cfs.h_nr_running >= 1) &&
-- 
2.20.1



* Re: [PATCH 0/5] sched/fair: NOHZ cleanups and misfit improvement
  2019-01-17 15:34 [PATCH 0/5] sched/fair: NOHZ cleanups and misfit improvement Valentin Schneider
                   ` (4 preceding siblings ...)
  2019-01-17 15:34 ` [PATCH 5/5] sched/fair: Skip LLC nohz logic for asymmetric systems Valentin Schneider
@ 2019-02-06 14:33 ` Valentin Schneider
  5 siblings, 0 replies; 17+ messages in thread
From: Valentin Schneider @ 2019-02-06 14:33 UTC
  To: linux-kernel
  Cc: mingo, peterz, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

Hi,

On 17/01/2019 15:34, Valentin Schneider wrote:
[...]
> Valentin Schneider (5):
>   sched/fair: Use for_each_cpu_and for asym-packing nohz kicks
>   sched/fair: Explain LLC nohz kick condition
>   sched/fair: Prune nohz_balancer_kick() comment block
>   sched/fair: Tune down misfit nohz kicks
>   sched/fair: Skip LLC nohz logic for asymmetric systems
> 
>  kernel/sched/fair.c | 62 ++++++++++++++++++++++++++++++++-------------
>  1 file changed, 44 insertions(+), 18 deletions(-)
> 

Gentle ping

> --
> 2.20.1
> 


* Re: [PATCH 4/5] sched/fair: Tune down misfit nohz kicks
  2019-01-17 15:34 ` [PATCH 4/5] sched/fair: Tune down misfit nohz kicks Valentin Schneider
@ 2019-02-06 16:04   ` Peter Zijlstra
  2019-02-06 17:25     ` Valentin Schneider
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2019-02-06 16:04 UTC
  To: Valentin Schneider
  Cc: linux-kernel, mingo, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

On Thu, Jan 17, 2019 at 03:34:10PM +0000, Valentin Schneider wrote:
> In
> 
>   commit 3b1baa6496e6 ("sched/fair: Add 'group_misfit_task' load-balance type")
> 
> we set rq->misfit_task_load whenever the current running task has a
> utilization greater than 80% of rq->cpu_capacity. A non-zero value in
> this field enables misfit load balancing.
> 
> However, if the task being looked at is already running on a CPU of
> highest capacity, there's nothing more we can do for it. We can
> currently spot this in update_sd_pick_busiest(), which prevents us
> from selecting a sched_group of group_type == group_misfit_task as the
> busiest group, but we don't do any of that in nohz_balancer_kick().
> 
> This means that we could repeatedly kick nohz CPUs when there is no
> load-balance improvement to be had.
> 
> Introduce a check_misfit_status() helper that returns true iff there
> is a CPU in the system that could give more CPU capacity to a rq's
> misfit task - IOW, there exists a CPU of higher capacity_orig or the
> rq's CPU is severely pressured by rt/IRQ.
> 
> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>

> +static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
> +{
> +	return rq->misfit_task_load &&
> +		(rq->cpu_capacity_orig < rq->rd->max_cpu_capacity ||
> +		 check_cpu_capacity(rq, sd));
> +}


> @@ -9527,7 +9539,7 @@ static void nohz_balancer_kick(struct rq *rq)
>  	if (time_before(now, nohz.next_balance))
>  		goto out;
>  
> -	if (rq->nr_running >= 2 || rq->misfit_task_load) {
> +	if (rq->nr_running >= 2) {
>  		flags = NOHZ_KICK_MASK;
>  		goto out;
>  	}
> @@ -9561,6 +9573,14 @@ static void nohz_balancer_kick(struct rq *rq)

	sd = rcu_dereference(rq->sd);
	if (sd) {
		if ((rq->cfs.h_nr_running >= 1) &&
		    check_cpu_capacity(rq, sd)) {
			flags = NOHZ_KICK_MASK;
			goto unlock;
>  		}
>  	}
>  
> +	sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, cpu));
> +	if (sd) {
> +		if (check_misfit_status(rq, sd)) {
> +			flags = NOHZ_KICK_MASK;
> +			goto unlock;
> +		}
> +	}

So while the exact @sd to use for check_cpu_capacity() likely doesn't
matter; this is an 'implicit' test for actually having asym_capacity.

Fair enough I suppose. However, now that you wrote such a nice comment
for the sd_llc_shared case, these other two cases are sad to not have a
comment.

So how about you add something like:

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9589,8 +9589,12 @@ static void nohz_balancer_kick(struct rq
 
 	sd = rcu_dereference(rq->sd);
 	if (sd) {
-		if ((rq->cfs.h_nr_running >= 1) &&
-		    check_cpu_capacity(rq, sd)) {
+		/*
+		 * If there's a CFS task and the current CPU has reduced
+		 * capacity; kick the ILB to see if there's a better CPU to run
+		 * on.
+		 */
+		if (rq->cfs.h_nr_running >= 1 && check_cpu_capacity(rq, sd)) {
 			flags = NOHZ_KICK_MASK;
 			goto unlock;
 		}
@@ -9598,6 +9602,10 @@ static void nohz_balancer_kick(struct rq
 
 	sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, cpu));
 	if (sd) {
+		/*
+		 * When ASYM_CAPACITY; see if there's a higher capacity CPU to
+		 * run the misfit task on.
+		 */
 		if (check_misfit_status(rq, sd)) {
 			flags = NOHZ_KICK_MASK;
 			goto unlock;
@@ -9606,6 +9614,10 @@ static void nohz_balancer_kick(struct rq
 
 	sd = rcu_dereference(per_cpu(sd_asym_packing, cpu));
 	if (sd) {
+		/*
+		 * When ASYM_PACKING; see if there's a more preferred CPU going
+		 * idle; in which case, kick the ILB to move tasks around.
+		 */
 		for_each_cpu_and(i, sched_domain_span(sd), nohz.idle_cpus_mask) {
 			if (sched_asym_prefer(i, cpu)) {
 				flags = NOHZ_KICK_MASK;


* Re: [PATCH 5/5] sched/fair: Skip LLC nohz logic for asymmetric systems
  2019-01-17 15:34 ` [PATCH 5/5] sched/fair: Skip LLC nohz logic for asymmetric systems Valentin Schneider
@ 2019-02-06 16:14   ` Peter Zijlstra
  2019-02-06 17:26     ` Valentin Schneider
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2019-02-06 16:14 UTC
  To: Valentin Schneider
  Cc: linux-kernel, mingo, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

On Thu, Jan 17, 2019 at 03:34:11PM +0000, Valentin Schneider wrote:
> The LLC nohz condition will become true as soon as >=2 CPUs in a
> single LLC domain are busy. On big.LITTLE systems, this translates to
> two or more CPUs of a "cluster" (big or LITTLE) being busy.
> 
> Issuing a nohz kick in these conditions isn't desired for asymmetric
> systems: if the busy CPUs can provide enough compute capacity to the
> running tasks, then we can leave the nohz CPUs in peace.
> 
> Skip the LLC nohz condition for asymmetric systems, and rely on
> nr_running & capacity checks to trigger nohz kicks when the system
> actually needs them.
> 
> Suggested-by: Morten Rasmussen <morten.rasmussen@arm.com>
> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
> ---
>  kernel/sched/fair.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 291dfdb0183f..ff27bf56e8d4 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9545,6 +9545,17 @@ static void nohz_balancer_kick(struct rq *rq)
>  	}
>  
>  	rcu_read_lock();
> +
> +	if (static_branch_unlikely(&sched_asym_cpucapacity))
> +		/*
> +		 * For asymmetric systems, we do not want to nicely balance
> +		 * cache use, instead we want to embrace asymmetry and only
> +		 * ensure tasks have enough CPU capacity.
> +		 *
> +		 * Skip the LLC logic because it's not relevant in that case.
> +		 */
> +		goto check_capacity;
> +
>  	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
>  	if (sds) {
>  		/*

Since (before this) the actual order of the various tests doesn't
matter, it's a logical cascade of conditions for which to KICK_MASK.

We can easily reorder and short-circuit the cascade like so, no?

The only concern is if sd_llc_shared < sd_asym_capacity; in which case
we just lost a balance opportunity. Not sure how to best retain that
though.

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9568,25 +9568,6 @@ static void nohz_balancer_kick(struct rq
 	}
 
 	rcu_read_lock();
-	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
-	if (sds) {
-		/*
-		 * If there is an imbalance between LLC domains (IOW we could
-		 * increase the overall cache use), we need some less-loaded LLC
-		 * domain to pull some load. Likewise, we may need to spread
-		 * load within the current LLC domain (e.g. packed SMT cores but
-		 * other CPUs are idle). We can't really know from here how busy
-		 * the others are - so just get a nohz balance going if it looks
-		 * like this LLC domain has tasks we could move.
-		 */
-		nr_busy = atomic_read(&sds->nr_busy_cpus);
-		if (nr_busy > 1) {
-			flags = NOHZ_KICK_MASK;
-			goto unlock;
-		}
-
-	}
-
 	sd = rcu_dereference(rq->sd);
 	if (sd) {
 		/*
@@ -9600,6 +9581,20 @@ static void nohz_balancer_kick(struct rq
 		}
 	}
 
+	sd = rcu_dereference(per_cpu(sd_asym_packing, cpu));
+	if (sd) {
+		/*
+		 * When ASYM_PACKING; see if there's a more preferred CPU going
+		 * idle; in which case, kick the ILB to move tasks around.
+		 */
+		for_each_cpu_and(i, sched_domain_span(sd), nohz.idle_cpus_mask) {
+			if (sched_asym_prefer(i, cpu)) {
+				flags = NOHZ_KICK_MASK;
+				goto unlock;
+			}
+		}
+	}
+
 	sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, cpu));
 	if (sd) {
 		/*
@@ -9610,21 +9605,36 @@ static void nohz_balancer_kick(struct rq
 			flags = NOHZ_KICK_MASK;
 			goto unlock;
 		}
+
+		/*
+		 * For asymmetric systems, we do not want to nicely balance
+		 * cache use, instead we want to embrace asymmetry and only
+		 * ensure tasks have enough CPU capacity.
+		 *
+		 * Skip the LLC logic because it's not relevant in that case.
+		 */
+		goto unlock;
 	}
 
-	sd = rcu_dereference(per_cpu(sd_asym_packing, cpu));
-	if (sd) {
+	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
+	if (sds) {
 		/*
-		 * When ASYM_PACKING; see if there's a more preferred CPU going
-		 * idle; in which case, kick the ILB to move tasks around.
+		 * If there is an imbalance between LLC domains (IOW we could
+		 * increase the overall cache use), we need some less-loaded LLC
+		 * domain to pull some load. Likewise, we may need to spread
+		 * load within the current LLC domain (e.g. packed SMT cores but
+		 * other CPUs are idle). We can't really know from here how busy
+		 * the others are - so just get a nohz balance going if it looks
+		 * like this LLC domain has tasks we could move.
 		 */
-		for_each_cpu_and(i, sched_domain_span(sd), nohz.idle_cpus_mask) {
-			if (sched_asym_prefer(i, cpu)) {
-				flags = NOHZ_KICK_MASK;
-				goto unlock;
-			}
+		nr_busy = atomic_read(&sds->nr_busy_cpus);
+		if (nr_busy > 1) {
+			flags = NOHZ_KICK_MASK;
+			goto unlock;
 		}
+
 	}
+
 unlock:
 	rcu_read_unlock();
 out:


* Re: [PATCH 4/5] sched/fair: Tune down misfit nohz kicks
  2019-02-06 16:04   ` Peter Zijlstra
@ 2019-02-06 17:25     ` Valentin Schneider
  2019-02-07  9:57       ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2019-02-06 17:25 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

Hi,

On 06/02/2019 16:04, Peter Zijlstra wrote:
[...]
>> @@ -9561,6 +9573,14 @@ static void nohz_balancer_kick(struct rq *rq)
> 
> 	sd = rcu_dereference(rq->sd);
> 	if (sd) {
> 		if ((rq->cfs.h_nr_running >= 1) &&
> 		    check_cpu_capacity(rq, sd)) {
> 			flags = NOHZ_KICK_MASK;
> 			goto unlock;
>>  		}
>>  	}
>>  
>> +	sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, cpu));
>> +	if (sd) {
>> +		if (check_misfit_status(rq, sd)) {
>> +			flags = NOHZ_KICK_MASK;
>> +			goto unlock;
>> +		}
>> +	}
> 
> So while the exact @sd to use for check_cpu_capacity() likely doesn't
> matter; this is an 'implicit' test for actually having asym_capacity.
> 

I did feel compelled to use the "right" @sd to have a coherent
imbalance_pct being used - on big.LITTLE, the @sd above this hunk would be
MC (117 imbalance_pct) whereas the sd_asym_cpucapacity one would be
DIE (125 imbalance_pct).

> Fair enough I suppose. However, now that you wrote such a nice comment
> for the sd_llc_shared case, these other two cases are sad to not have a
> comment.
> 
> So how about you add something like:
> 
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9589,8 +9589,12 @@ static void nohz_balancer_kick(struct rq
>  
>  	sd = rcu_dereference(rq->sd);
>  	if (sd) {
> -		if ((rq->cfs.h_nr_running >= 1) &&
> -		    check_cpu_capacity(rq, sd)) {
> +		/*
> +		 * If there's a CFS task and the current CPU has reduced
> +		 * capacity; kick the ILB to see if there's a better CPU to run
> +		 * on.
> +		 */
> +		if (rq->cfs.h_nr_running >= 1 && check_cpu_capacity(rq, sd)) {
>  			flags = NOHZ_KICK_MASK;
>  			goto unlock;
>  		}
> @@ -9598,6 +9602,10 @@ static void nohz_balancer_kick(struct rq
>  
>  	sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, cpu));
>  	if (sd) {
> +		/*
> +		 * When ASYM_CAPACITY; see if there's a higher capacity CPU to
> +		 * run the misfit task on.
> +		 */
>  		if (check_misfit_status(rq, sd)) {
>  			flags = NOHZ_KICK_MASK;
>  			goto unlock;
> @@ -9606,6 +9614,10 @@ static void nohz_balancer_kick(struct rq
>  
>  	sd = rcu_dereference(per_cpu(sd_asym_packing, cpu));
>  	if (sd) {
> +		/*
> +		 * When ASYM_PACKING; see if there's a more preferred CPU going
> +		 * idle; in which case, kick the ILB to move tasks around.
> +		 */

s/going idle/currently idle/, no?

>  		for_each_cpu_and(i, sched_domain_span(sd), nohz.idle_cpus_mask) {
>  			if (sched_asym_prefer(i, cpu)) {
>  				flags = NOHZ_KICK_MASK;
> 


* Re: [PATCH 5/5] sched/fair: Skip LLC nohz logic for asymmetric systems
  2019-02-06 16:14   ` Peter Zijlstra
@ 2019-02-06 17:26     ` Valentin Schneider
  2019-02-07  9:56       ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2019-02-06 17:26 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

Hi,

On 06/02/2019 16:14, Peter Zijlstra wrote:
[...]
>> @@ -9545,6 +9545,17 @@ static void nohz_balancer_kick(struct rq *rq)
>>  	}
>>  
>>  	rcu_read_lock();
>> +
>> +	if (static_branch_unlikely(&sched_asym_cpucapacity))
>> +		/*
>> +		 * For asymmetric systems, we do not want to nicely balance
>> +		 * cache use, instead we want to embrace asymmetry and only
>> +		 * ensure tasks have enough CPU capacity.
>> +		 *
>> +		 * Skip the LLC logic because it's not relevant in that case.
>> +		 */
>> +		goto check_capacity;
>> +
>>  	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
>>  	if (sds) {
>>  		/*
> 
> Since (before this) the actual order of the various tests doesn't
> matter, it's a logical cascade of conditions for which to KICK_MASK.
> 

Ah, I assumed the order did matter somewhat with the "cheaper" LLC check
first and the more costly loops further down in case we are still looking
for a reason to do a kick.

> We can easily reorder and short-circuit the cascade like so, no?
> 
> The only concern is if sd_llc_shared < sd_asym_capacity; in which case
> we just lost a balance opportunity. Not sure how to best retain that
> though.
> 

I'm afraid I don't follow - we don't lose a balance opportunity with the
below change (compared to the original patch), do we?

> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9568,25 +9568,6 @@ static void nohz_balancer_kick(struct rq
>  	}
>  
>  	rcu_read_lock();
> -	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> -	if (sds) {
> -		/*
> -		 * If there is an imbalance between LLC domains (IOW we could
> -		 * increase the overall cache use), we need some less-loaded LLC
> -		 * domain to pull some load. Likewise, we may need to spread
> -		 * load within the current LLC domain (e.g. packed SMT cores but
> -		 * other CPUs are idle). We can't really know from here how busy
> -		 * the others are - so just get a nohz balance going if it looks
> -		 * like this LLC domain has tasks we could move.
> -		 */
> -		nr_busy = atomic_read(&sds->nr_busy_cpus);
> -		if (nr_busy > 1) {
> -			flags = NOHZ_KICK_MASK;
> -			goto unlock;
> -		}
> -
> -	}
> -
>  	sd = rcu_dereference(rq->sd);
>  	if (sd) {
>  		/*
> @@ -9600,6 +9581,20 @@ static void nohz_balancer_kick(struct rq
>  		}
>  	}
>  
> +	sd = rcu_dereference(per_cpu(sd_asym_packing, cpu));
> +	if (sd) {
> +		/*
> +		 * When ASYM_PACKING; see if there's a more preferred CPU going
> +		 * idle; in which case, kick the ILB to move tasks around.
> +		 */
> +		for_each_cpu_and(i, sched_domain_span(sd), nohz.idle_cpus_mask) {
> +			if (sched_asym_prefer(i, cpu)) {
> +				flags = NOHZ_KICK_MASK;
> +				goto unlock;
> +			}
> +		}
> +	}
> +
>  	sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, cpu));
>  	if (sd) {
>  		/*
> @@ -9610,21 +9605,36 @@ static void nohz_balancer_kick(struct rq
>  			flags = NOHZ_KICK_MASK;
>  			goto unlock;
>  		}
> +
> +		/*
> +		 * For asymmetric systems, we do not want to nicely balance
> +		 * cache use, instead we want to embrace asymmetry and only
> +		 * ensure tasks have enough CPU capacity.
> +		 *
> +		 * Skip the LLC logic because it's not relevant in that case.
> +		 */
> +		goto unlock;
>  	}
>  
> -	sd = rcu_dereference(per_cpu(sd_asym_packing, cpu));
> -	if (sd) {
> +	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> +	if (sds) {
>  		/*
> -		 * When ASYM_PACKING; see if there's a more preferred CPU going
> -		 * idle; in which case, kick the ILB to move tasks around.
> +		 * If there is an imbalance between LLC domains (IOW we could
> +		 * increase the overall cache use), we need some less-loaded LLC
> +		 * domain to pull some load. Likewise, we may need to spread
> +		 * load within the current LLC domain (e.g. packed SMT cores but
> +		 * other CPUs are idle). We can't really know from here how busy
> +		 * the others are - so just get a nohz balance going if it looks
> +		 * like this LLC domain has tasks we could move.
>  		 */
> -		for_each_cpu_and(i, sched_domain_span(sd), nohz.idle_cpus_mask) {
> -			if (sched_asym_prefer(i, cpu)) {
> -				flags = NOHZ_KICK_MASK;
> -				goto unlock;
> -			}
> +		nr_busy = atomic_read(&sds->nr_busy_cpus);
> +		if (nr_busy > 1) {
> +			flags = NOHZ_KICK_MASK;
> +			goto unlock;
>  		}
> +
>  	}
> +
>  unlock:
>  	rcu_read_unlock();
>  out:
> 


* Re: [PATCH 5/5] sched/fair: Skip LLC nohz logic for asymmetric systems
  2019-02-06 17:26     ` Valentin Schneider
@ 2019-02-07  9:56       ` Peter Zijlstra
  2019-02-07 11:31         ` Valentin Schneider
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2019-02-07  9:56 UTC
  To: Valentin Schneider
  Cc: linux-kernel, mingo, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

On Wed, Feb 06, 2019 at 05:26:06PM +0000, Valentin Schneider wrote:
> Hi,
> 
> On 06/02/2019 16:14, Peter Zijlstra wrote:
> [...]
> >> @@ -9545,6 +9545,17 @@ static void nohz_balancer_kick(struct rq *rq)
> >>  	}
> >>  
> >>  	rcu_read_lock();
> >> +
> >> +	if (static_branch_unlikely(&sched_asym_cpucapacity))
> >> +		/*
> >> +		 * For asymmetric systems, we do not want to nicely balance
> >> +		 * cache use, instead we want to embrace asymmetry and only
> >> +		 * ensure tasks have enough CPU capacity.
> >> +		 *
> >> +		 * Skip the LLC logic because it's not relevant in that case.
> >> +		 */
> >> +		goto check_capacity;
> >> +
> >>  	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> >>  	if (sds) {
> >>  		/*
> > 
> > Since (before this) the actual order of the various tests doesn't
> > matter, it's a logical cascade of conditions for which to KICK_MASK.
> > 
> 
> Ah, I assumed the order did matter somewhat with the "cheaper" LLC check
> first and the more costly loops further down in case we are still looking
> for a reason to do a kick.

I did not in fact consider that; I only looked at the logical structure
of the thing. You might want to double check :-)

> > We can easily reorder and short-circuit the cascade like so, no?
> > 
> > The only concern is if sd_llc_shared < sd_asym_capacity; in which case
> > we just lost a balance opportunity. Not sure how to best retain that
> > though.
> > 
> 
> I'm afraid I don't follow - we don't lose a balance opportunity with the
> below change (compared to the original patch), do we?

What if each big/little cluster had multiple cache domains? Would we not
want to spread the cache usage inside the big/little clusters, respectively?


* Re: [PATCH 4/5] sched/fair: Tune down misfit nohz kicks
  2019-02-06 17:25     ` Valentin Schneider
@ 2019-02-07  9:57       ` Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2019-02-07  9:57 UTC
  To: Valentin Schneider
  Cc: linux-kernel, mingo, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

On Wed, Feb 06, 2019 at 05:25:31PM +0000, Valentin Schneider wrote:
> Hi,
> 
> On 06/02/2019 16:04, Peter Zijlstra wrote:
> [...]
> >> @@ -9561,6 +9573,14 @@ static void nohz_balancer_kick(struct rq *rq)
> > 
> > 	sd = rcu_dereference(rq->sd);
> > 	if (sd) {
> > 		if ((rq->cfs.h_nr_running >= 1) &&
> > 		    check_cpu_capacity(rq, sd)) {
> > 			flags = NOHZ_KICK_MASK;
> > 			goto unlock;
> >>  		}
> >>  	}
> >>  
> >> +	sd = rcu_dereference(per_cpu(sd_asym_cpucapacity, cpu));
> >> +	if (sd) {
> >> +		if (check_misfit_status(rq, sd)) {
> >> +			flags = NOHZ_KICK_MASK;
> >> +			goto unlock;
> >> +		}
> >> +	}
> > 
> > So while the exact @sd to use for check_cpu_capacity() likely doesn't
> > matter; this is an 'implicit' test for actually having asym_capacity.
> > 
> 
> I did feel compelled to use the "right" @sd to have a coherent
> imbalance_pct being used - on big.LITTLE, the @sd above this hunk would be
> MC (117 imbalance_pct) whereas the sd_asym_cpucapacity one would be
> DIE (125 imbalance_pct).

Fair enough.

> > @@ -9606,6 +9614,10 @@ static void nohz_balancer_kick(struct rq
> >  
> >  	sd = rcu_dereference(per_cpu(sd_asym_packing, cpu));
> >  	if (sd) {
> > +		/*
> > +		 * When ASYM_PACKING; see if there's a more preferred CPU going
> > +		 * idle; in which case, kick the ILB to move tasks around.
> > +		 */
> 
> s/going idle/currently idle/, no?

Yah.. D'0h..

> >  		for_each_cpu_and(i, sched_domain_span(sd), nohz.idle_cpus_mask) {
> >  			if (sched_asym_prefer(i, cpu)) {
> >  				flags = NOHZ_KICK_MASK;
> > 


* Re: [PATCH 5/5] sched/fair: Skip LLC nohz logic for asymmetric systems
  2019-02-07  9:56       ` Peter Zijlstra
@ 2019-02-07 11:31         ` Valentin Schneider
  0 siblings, 0 replies; 17+ messages in thread
From: Valentin Schneider @ 2019-02-07 11:31 UTC
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, vincent.guittot, morten.rasmussen, Dietmar.Eggemann

On 07/02/2019 09:56, Peter Zijlstra wrote:
[...]
>> I'm afraid I don't follow - we don't lose a balance opportunity with the
>> below change (compared to the original patch), do we?
> 
> What if each big/little cluster had multiple cache domains? Would we not
> want to spread the cache usage inside the big/little clusters, respectively?
> 

Ah I see - somewhat, yes. And we kind of already hit that issue - on
regular big.LITTLE each "cluster" (big & LITTLE) will have its own L2,
but there's no L3 to share. That means

- sd_llc == MC
- sd_asym_cpucapacity == DIE

(which I believe is what you meant by sd_llc_shared < sd_asym_capacity)

However, the LLC nohz kick condition is way too broad and will keep kicking
if we just have 2 busy LITTLEs with 1 task each, which could be a correct
placement if they were just running small background tasks.

I'd argue that since we don't have SMT, <=1 task per CPU means we can't do
anything better to spread cache use within the "cluster", and we have to
rely on the other conditions (mostly rq->nr_running and misfit) to move
things around when we really need to.

The extra kicks are there on non-asymmetric topologies as well, so we could
imagine an additional gate for the LLC kick that would look like:

  /*
   * If SMT, <=1 task per CPU can be suboptimal.
   * If we have a parent (and thus a sibling), we may need to move some
   * tasks to our sibling.
   * If we have asymmetry, asymmetric placement can be fine, so just rely
   * on the other kick conditions.
   */
  if (static_branch_likely(&sched_smt_present) ||
      (!static_branch_unlikely(&sched_asym_cpucapacity) && sd->parent)) {
	  /* LLC kick stuff here */
  }

but I'd argue the extra nohz kicks have more impact on asymmetric systems.


* [tip:sched/core] sched/fair: Simplify nohz_balancer_kick()
  2019-01-17 15:34 ` [PATCH 1/5] sched/fair: Use for_each_cpu_and for asym-packing nohz kicks Valentin Schneider
@ 2019-02-11 10:50   ` tip-bot for Valentin Schneider
  0 siblings, 0 replies; 17+ messages in thread
From: tip-bot for Valentin Schneider @ 2019-02-11 10:50 UTC
  To: linux-tip-commits
  Cc: mingo, peterz, linux-kernel, torvalds, hpa, tglx, valentin.schneider

Commit-ID:  7edab78d7400ea0997f8e2e971004d824b5bb511
Gitweb:     https://git.kernel.org/tip/7edab78d7400ea0997f8e2e971004d824b5bb511
Author:     Valentin Schneider <valentin.schneider@arm.com>
AuthorDate: Thu, 17 Jan 2019 15:34:07 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 11 Feb 2019 08:02:16 +0100

sched/fair: Simplify nohz_balancer_kick()

Calling 'nohz_balance_exit_idle(rq)' will always clear 'rq->cpu' from
'nohz.idle_cpus_mask' if it is set. Since it is called at the top of
'nohz_balancer_kick()', 'rq->cpu' will never be set in
'nohz.idle_cpus_mask' if it is accessed in the rest of the function.

Combine the 'sched_domain_span()' with 'nohz.idle_cpus_mask' and drop the
'(i == cpu)' check since 'rq->cpu' will never be iterated over.

While at it, clean up a condition alignment.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar.Eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: morten.rasmussen@arm.com
Cc: vincent.guittot@linaro.org
Link: https://lkml.kernel.org/r/20190117153411.2390-2-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 58edbbdeb661..0692c8ff6ff6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9615,7 +9615,7 @@ static void nohz_balancer_kick(struct rq *rq)
 	sd = rcu_dereference(rq->sd);
 	if (sd) {
 		if ((rq->cfs.h_nr_running >= 1) &&
-				check_cpu_capacity(rq, sd)) {
+		    check_cpu_capacity(rq, sd)) {
 			flags = NOHZ_KICK_MASK;
 			goto unlock;
 		}
@@ -9623,11 +9623,7 @@ static void nohz_balancer_kick(struct rq *rq)
 
 	sd = rcu_dereference(per_cpu(sd_asym_packing, cpu));
 	if (sd) {
-		for_each_cpu(i, sched_domain_span(sd)) {
-			if (i == cpu ||
-			    !cpumask_test_cpu(i, nohz.idle_cpus_mask))
-				continue;
-
+		for_each_cpu_and(i, sched_domain_span(sd), nohz.idle_cpus_mask) {
 			if (sched_asym_prefer(i, cpu)) {
 				flags = NOHZ_KICK_MASK;
 				goto unlock;


* [tip:sched/core] sched/fair: Explain LLC nohz kick condition
  2019-01-17 15:34 ` [PATCH 2/5] sched/fair: Explain LLC nohz kick condition Valentin Schneider
@ 2019-02-11 10:51   ` tip-bot for Valentin Schneider
  0 siblings, 0 replies; 17+ messages in thread
From: tip-bot for Valentin Schneider @ 2019-02-11 10:51 UTC
  To: linux-tip-commits
  Cc: linux-kernel, tglx, mingo, hpa, peterz, torvalds, valentin.schneider

Commit-ID:  892d59c22208be820a5463b5f74eb7f0b7f2b03a
Gitweb:     https://git.kernel.org/tip/892d59c22208be820a5463b5f74eb7f0b7f2b03a
Author:     Valentin Schneider <valentin.schneider@arm.com>
AuthorDate: Thu, 17 Jan 2019 15:34:08 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 11 Feb 2019 08:02:17 +0100

sched/fair: Explain LLC nohz kick condition

Provide a comment explaining the LLC related nohz kick in
nohz_balancer_kick().

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar.Eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: morten.rasmussen@arm.com
Cc: vincent.guittot@linaro.org
Link: https://lkml.kernel.org/r/20190117153411.2390-3-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0692c8ff6ff6..ac6b52d8c79e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9601,8 +9601,13 @@ static void nohz_balancer_kick(struct rq *rq)
 	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
 	if (sds) {
 		/*
-		 * XXX: write a coherent comment on why we do this.
-		 * See also: http://lkml.kernel.org/r/20111202010832.602203411@sbsiddha-desk.sc.intel.com
+		 * If there is an imbalance between LLC domains (IOW we could
+		 * increase the overall cache use), we need some less-loaded LLC
+		 * domain to pull some load. Likewise, we may need to spread
+		 * load within the current LLC domain (e.g. packed SMT cores but
+		 * other CPUs are idle). We can't really know from here how busy
+		 * the others are - so just get a nohz balance going if it looks
+		 * like this LLC domain has tasks we could move.
 		 */
 		nr_busy = atomic_read(&sds->nr_busy_cpus);
 		if (nr_busy > 1) {


* [tip:sched/core] sched/fair: Prune, fix and simplify the nohz_balancer_kick() comment block
  2019-01-17 15:34 ` [PATCH 3/5] sched/fair: Prune nohz_balancer_kick() comment block Valentin Schneider
@ 2019-02-11 10:51   ` tip-bot for Valentin Schneider
  0 siblings, 0 replies; 17+ messages in thread
From: tip-bot for Valentin Schneider @ 2019-02-11 10:51 UTC
  To: linux-tip-commits
  Cc: hpa, tglx, mingo, linux-kernel, peterz, valentin.schneider, torvalds

Commit-ID:  9f132742d5c4146397fef0c5b09fe220576a5bb2
Gitweb:     https://git.kernel.org/tip/9f132742d5c4146397fef0c5b09fe220576a5bb2
Author:     Valentin Schneider <valentin.schneider@arm.com>
AuthorDate: Thu, 17 Jan 2019 15:34:09 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 11 Feb 2019 08:02:18 +0100

sched/fair: Prune, fix and simplify the nohz_balancer_kick() comment block

The comment block for that function lists the heuristics for
triggering a nohz kick, but the most recent ones (blocked load
updates, misfit) aren't included, and some of them (LLC nohz logic,
asym packing) are no longer in sync with the code.

The conditions are either simple enough or properly commented, so get
rid of that list instead of letting it grow.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar.Eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: morten.rasmussen@arm.com
Cc: vincent.guittot@linaro.org
Link: https://lkml.kernel.org/r/20190117153411.2390-4-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/fair.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ac6b52d8c79e..f44da9b491ff 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9551,15 +9551,8 @@ static void kick_ilb(unsigned int flags)
 }
 
 /*
- * Current heuristic for kicking the idle load balancer in the presence
- * of an idle cpu in the system.
- *   - This rq has more than one task.
- *   - This rq has at least one CFS task and the capacity of the CPU is
- *     significantly reduced because of RT tasks or IRQs.
- *   - At parent of LLC scheduler domain level, this cpu's scheduler group has
- *     multiple busy cpu.
- *   - For SD_ASYM_PACKING, if the lower numbered cpu's in the scheduler
- *     domain span are idle.
+ * Current decision point for kicking the idle load balancer in the presence
+ * of idle CPUs in the system.
  */
 static void nohz_balancer_kick(struct rq *rq)
 {

