linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] Reduce number of active LB
@ 2021-01-06 13:34 Vincent Guittot
  2021-01-06 13:34 ` [PATCH 1/3] sched/fair: skip idle cfs_rq Vincent Guittot
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Vincent Guittot @ 2021-01-06 13:34 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, linux-kernel
  Cc: Vincent Guittot

A few improvements related to active load balancing (LB) and the
increase of the LB interval.

I haven't seen any performance impact on various benchmarks except for:
  - stress-ng mmapfork: +4.54% on my octo-core arm64
But this was somewhat expected as the changes mainly impact corner cases.

Vincent Guittot (3):
  sched/fair: skip idle cfs_rq
  sched/fair: don't set LBF_ALL_PINNED unnecessarily
  sched/fair: reduce cases for active balance

 kernel/sched/fair.c | 51 ++++++++++++++++++++++++---------------------
 1 file changed, 27 insertions(+), 24 deletions(-)

-- 
2.17.1



* [PATCH 1/3] sched/fair: skip idle cfs_rq
  2021-01-06 13:34 [PATCH 0/3] Reduce number of active LB Vincent Guittot
@ 2021-01-06 13:34 ` Vincent Guittot
  2021-01-06 15:13   ` Valentin Schneider
  2021-01-06 13:34 ` [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily Vincent Guittot
  2021-01-06 13:34 ` [PATCH 3/3] sched/fair: reduce cases for active balance Vincent Guittot
  2 siblings, 1 reply; 14+ messages in thread
From: Vincent Guittot @ 2021-01-06 13:34 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, linux-kernel
  Cc: Vincent Guittot

Don't waste time checking whether an idle cfs_rq could be the busiest
queue. Furthermore, in the migrate_load case this can end up selecting a
cfs_rq that still carries a high load but is in fact idle.
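
For context, with this change the selection loop ends up looking roughly
like below (simplified sketch, not the exact code). Without the early
h_nr_running check, an idle CPU whose cfs_rq still carries decaying
blocked load can report a high cpu_load() and win the comparison even
though it has nothing to pull:

	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
		struct rq *rq = cpu_rq(i);

		nr_running = rq->cfs.h_nr_running;
		if (!nr_running)	/* nothing to pull, skip it */
			continue;

		capacity = capacity_of(i);

		switch (env->migration_type) {
		case migrate_load:
			load = cpu_load(rq);
			/* highest load relative to capacity wins */
			if (load * busiest_capacity > busiest_load * capacity) {
				busiest_load = load;
				busiest_capacity = capacity;
				busiest = rq;
			}
			break;
		default:
			break;
		}
	}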

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 04a3ce20da67..5428b8723e61 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9389,8 +9389,11 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 		if (rt > env->fbq_type)
 			continue;
 
-		capacity = capacity_of(i);
 		nr_running = rq->cfs.h_nr_running;
+		if (!nr_running)
+			continue;
+
+		capacity = capacity_of(i);
 
 		/*
 		 * For ASYM_CPUCAPACITY domains, don't pick a CPU that could
-- 
2.17.1



* [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily
  2021-01-06 13:34 [PATCH 0/3] Reduce number of active LB Vincent Guittot
  2021-01-06 13:34 ` [PATCH 1/3] sched/fair: skip idle cfs_rq Vincent Guittot
@ 2021-01-06 13:34 ` Vincent Guittot
  2021-01-06 15:07   ` Peter Zijlstra
  2021-01-06 15:13   ` Valentin Schneider
  2021-01-06 13:34 ` [PATCH 3/3] sched/fair: reduce cases for active balance Vincent Guittot
  2 siblings, 2 replies; 14+ messages in thread
From: Vincent Guittot @ 2021-01-06 13:34 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, linux-kernel
  Cc: Vincent Guittot

Setting LBF_ALL_PINNED during active load balance is only valid when there
is only 1 running task on the rq otherwise this ends up increasing the
balance interval whereas other tasks could migrate after the next interval
once they become cache-cold as an example.
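
For context, the intended LBF_ALL_PINNED protocol looks roughly like this
(simplified sketch of load_balance(), not the exact code):

	if (busiest->nr_running > 1) {
		/*
		 * Assume everything is pinned, then let detach_tasks() /
		 * can_migrate_task() clear the flag as soon as one task
		 * that may run on dst_cpu is found.
		 */
		env.flags |= LBF_ALL_PINNED;
		detach_tasks(&env);
	}

	/*
	 * If the flag is still set once we give up, the busiest rq has
	 * nothing we are allowed to pull and the balance interval is
	 * backed off instead of retrying at the same rate.
	 */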

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5428b8723e61..69a455113b10 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9759,7 +9759,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 			if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
 				raw_spin_unlock_irqrestore(&busiest->lock,
 							    flags);
-				env.flags |= LBF_ALL_PINNED;
+				if (busiest->nr_running == 1)
+					env.flags |= LBF_ALL_PINNED;
 				goto out_one_pinned;
 			}
 
-- 
2.17.1



* [PATCH 3/3] sched/fair: reduce cases for active balance
  2021-01-06 13:34 [PATCH 0/3] Reduce number of active LB Vincent Guittot
  2021-01-06 13:34 ` [PATCH 1/3] sched/fair: skip idle cfs_rq Vincent Guittot
  2021-01-06 13:34 ` [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily Vincent Guittot
@ 2021-01-06 13:34 ` Vincent Guittot
  2021-01-06 15:13   ` Peter Zijlstra
  2 siblings, 1 reply; 14+ messages in thread
From: Vincent Guittot @ 2021-01-06 13:34 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, linux-kernel
  Cc: Vincent Guittot

Active balance is triggered for a number of voluntary case like misfit or
pinned tasks cases but also after that a number of load balance failed to
migrate a task. Remove the active load balance case for overloaded group
as an overloaded state means that there is at least one waiting tasks. The
threshold on the upper limit of the task's load will decrease with the
number of failed LB until the task has migrated.
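
The check in detach_tasks() referred to above looks roughly like this
(simplified sketch, migrate_load case only):

	case migrate_load:
		load = max_t(unsigned long, task_h_load(p), 1);

		/*
		 * The effective load is scaled down by the number of
		 * failed attempts, so a waiting task that is initially
		 * too big for the imbalance is eventually detached
		 * without resorting to active balance.
		 */
		if ((load >> env->sd->nr_balance_failed) > env->imbalance)
			goto next;

		env->imbalance -= load;
		break;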

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 43 +++++++++++++++++++++----------------------
 1 file changed, 21 insertions(+), 22 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 69a455113b10..ee87fd6f7359 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9499,13 +9499,30 @@ asym_active_balance(struct lb_env *env)
 }
 
 static inline bool
-voluntary_active_balance(struct lb_env *env)
+imbalanced_active_balance(struct lb_env *env)
+{
+	struct sched_domain *sd = env->sd;
+
+	/* The imbalanced case includes the case of pinned tasks preventing a fair
+	 * distribution of the load on the system but also the even distribution of the
+	 * threads on a system with spare capacity
+	 */
+	if ((env->migration_type == migrate_task) &&
+		(sd->nr_balance_failed > sd->cache_nice_tries+2))
+		return 1;
+
+	return 0;
+}
+
+static int need_active_balance(struct lb_env *env)
 {
 	struct sched_domain *sd = env->sd;
 
 	if (asym_active_balance(env))
 		return 1;
 
+	if (imbalanced_active_balance(env))
+		return 1;
 	/*
 	 * The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
 	 * It's worth migrating the task if the src_cpu's capacity is reduced
@@ -9525,16 +9542,6 @@ voluntary_active_balance(struct lb_env *env)
 	return 0;
 }
 
-static int need_active_balance(struct lb_env *env)
-{
-	struct sched_domain *sd = env->sd;
-
-	if (voluntary_active_balance(env))
-		return 1;
-
-	return unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2);
-}
-
 static int active_load_balance_cpu_stop(void *data);
 
 static int should_we_balance(struct lb_env *env)
@@ -9785,21 +9792,13 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 			/* We've kicked active balancing, force task migration. */
 			sd->nr_balance_failed = sd->cache_nice_tries+1;
 		}
-	} else
+	} else {
 		sd->nr_balance_failed = 0;
+	}
 
-	if (likely(!active_balance) || voluntary_active_balance(&env)) {
+	if (likely(!active_balance) || need_active_balance(&env)) {
 		/* We were unbalanced, so reset the balancing interval */
 		sd->balance_interval = sd->min_interval;
-	} else {
-		/*
-		 * If we've begun active balancing, start to back off. This
-		 * case may not be covered by the all_pinned logic if there
-		 * is only 1 task on the busy runqueue (because we don't call
-		 * detach_tasks).
-		 */
-		if (sd->balance_interval < sd->max_interval)
-			sd->balance_interval *= 2;
 	}
 
 	goto out;
-- 
2.17.1



* Re: [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily
  2021-01-06 13:34 ` [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily Vincent Guittot
@ 2021-01-06 15:07   ` Peter Zijlstra
  2021-01-06 15:20     ` Vincent Guittot
  2021-01-06 15:13   ` Valentin Schneider
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2021-01-06 15:07 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, linux-kernel

On Wed, Jan 06, 2021 at 02:34:18PM +0100, Vincent Guittot wrote:
> Setting LBF_ALL_PINNED during active load balance is only valid when there
> is only 1 running task on the rq otherwise this ends up increasing the
> balance interval whereas other tasks could migrate after the next interval
> once they become cache-cold as an example.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5428b8723e61..69a455113b10 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9759,7 +9759,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  			if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
>  				raw_spin_unlock_irqrestore(&busiest->lock,
>  							    flags);
> -				env.flags |= LBF_ALL_PINNED;
> +				if (busiest->nr_running == 1)
> +					env.flags |= LBF_ALL_PINNED;
>  				goto out_one_pinned;
>  			}

Hmm.. that wasn't the intention. It's entirely possible to have multiple
tasks pinned.

ISTR we set all-pinned and then clear it in can_migrate_task() when we
actually find a task that can be moved to the destination CPU. That's a
construct that works perfectly fine with multiple tasks.
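
Roughly, from memory (simplified sketch of the relevant bit of
can_migrate_task(), not the full function):

	if (!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)) {
		/* p can't run on dst_cpu; ALL_PINNED stays set */
		...
		return 0;
	}

	/* Record that we found at least one task that could run on dst_cpu */
	env->flags &= ~LBF_ALL_PINNED;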




* Re: [PATCH 3/3] sched/fair: reduce cases for active balance
  2021-01-06 13:34 ` [PATCH 3/3] sched/fair: reduce cases for active balance Vincent Guittot
@ 2021-01-06 15:13   ` Peter Zijlstra
  2021-01-06 15:41     ` Vincent Guittot
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2021-01-06 15:13 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, linux-kernel

On Wed, Jan 06, 2021 at 02:34:19PM +0100, Vincent Guittot wrote:
> Active balance is triggered for a number of voluntary case like misfit or
							cases
> pinned tasks cases but also after that a number of load balance failed to
								 ^attempts
> migrate a task. Remove the active load balance case for overloaded group
							 ^an ?
> as an overloaded state means that there is at least one waiting tasks. The
								  task
> threshold on the upper limit of the task's load will decrease with the
> number of failed LB until the task has migrated.

And I'm not sure I follow that last part, irrespective of spelling nits,
help?

> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 43 +++++++++++++++++++++----------------------
>  1 file changed, 21 insertions(+), 22 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 69a455113b10..ee87fd6f7359 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9499,13 +9499,30 @@ asym_active_balance(struct lb_env *env)
>  }
>  
>  static inline bool
> -voluntary_active_balance(struct lb_env *env)
> +imbalanced_active_balance(struct lb_env *env)
> +{
> +	struct sched_domain *sd = env->sd;
> +
> +	/* The imbalanced case includes the case of pinned tasks preventing a fair
> +	 * distribution of the load on the system but also the even distribution of the
> +	 * threads on a system with spare capacity
> +	 */

comment style fail

> +	if ((env->migration_type == migrate_task) &&
> +		(sd->nr_balance_failed > sd->cache_nice_tries+2))

indent fail; try: set cino=(0:0

> +		return 1;
> +
> +	return 0;
> +}
> +
> +static int need_active_balance(struct lb_env *env)
>  {
>  	struct sched_domain *sd = env->sd;
>  
>  	if (asym_active_balance(env))
>  		return 1;
>  
> +	if (imbalanced_active_balance(env))
> +		return 1;

+ whitespace

>  	/*
>  	 * The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
>  	 * It's worth migrating the task if the src_cpu's capacity is reduced


* Re: [PATCH 1/3] sched/fair: skip idle cfs_rq
  2021-01-06 13:34 ` [PATCH 1/3] sched/fair: skip idle cfs_rq Vincent Guittot
@ 2021-01-06 15:13   ` Valentin Schneider
  0 siblings, 0 replies; 14+ messages in thread
From: Valentin Schneider @ 2021-01-06 15:13 UTC (permalink / raw)
  To: Vincent Guittot, mingo, peterz, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, linux-kernel
  Cc: Vincent Guittot

On 06/01/21 14:34, Vincent Guittot wrote:
> Don't waste time checking whether an idle cfs_rq could be the busiest
> queue. Furthermore, in the migrate_load case this can end up selecting a
> cfs_rq that still carries a high load but is in fact idle.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

Makes sense to me.

Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>

> ---
>  kernel/sched/fair.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 04a3ce20da67..5428b8723e61 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9389,8 +9389,11 @@ static struct rq *find_busiest_queue(struct lb_env *env,
>  		if (rt > env->fbq_type)
>  			continue;
>  
> -		capacity = capacity_of(i);
>  		nr_running = rq->cfs.h_nr_running;
> +		if (!nr_running)
> +			continue;
> +
> +		capacity = capacity_of(i);
>  
>  		/*
>  		 * For ASYM_CPUCAPACITY domains, don't pick a CPU that could
> -- 
> 2.17.1


* Re: [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily
  2021-01-06 13:34 ` [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily Vincent Guittot
  2021-01-06 15:07   ` Peter Zijlstra
@ 2021-01-06 15:13   ` Valentin Schneider
  2021-01-06 16:04     ` Vincent Guittot
  1 sibling, 1 reply; 14+ messages in thread
From: Valentin Schneider @ 2021-01-06 15:13 UTC (permalink / raw)
  To: Vincent Guittot, mingo, peterz, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, linux-kernel
  Cc: Vincent Guittot

On 06/01/21 14:34, Vincent Guittot wrote:
> Setting LBF_ALL_PINNED during active load balance is only valid when there
> is only 1 running task on the rq otherwise this ends up increasing the
> balance interval whereas other tasks could migrate after the next interval
> once they become cache-cold as an example.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5428b8723e61..69a455113b10 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9759,7 +9759,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>                       if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
>                               raw_spin_unlock_irqrestore(&busiest->lock,
>                                                           flags);
> -				env.flags |= LBF_ALL_PINNED;
> +				if (busiest->nr_running == 1)
> +					env.flags |= LBF_ALL_PINNED;

So LBF_ALL_PINNED *can* be set if busiest->nr_running > 1, because
before we get there we have:

  if (nr_running > 1) {
      env.flags |= LBF_ALL_PINNED;
      detach_tasks(&env); // Removes LBF_ALL_PINNED if > 0 tasks can be pulled
      ...
  }

What about following the logic used by detach_tasks() and only clear the
flag? Say something like the below. if nr_running > 1, then we'll have
gone through detach_tasks() and will have cleared the flag (if
possible).
---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 04a3ce20da67..211c86ba3f5b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9623,6 +9623,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 	env.src_rq = busiest;
 
 	ld_moved = 0;
+	/* Clear this as soon as we find a single pullable task */
+	env.flags |= LBF_ALL_PINNED;
 	if (busiest->nr_running > 1) {
 		/*
 		 * Attempt to move tasks. If find_busiest_group has found
@@ -9630,7 +9632,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		 * still unbalanced. ld_moved simply stays zero, so it is
 		 * correctly treated as an imbalance.
 		 */
-		env.flags |= LBF_ALL_PINNED;
 		env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);
 
 more_balance:
@@ -9756,10 +9757,11 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 			if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
 				raw_spin_unlock_irqrestore(&busiest->lock,
 							    flags);
-				env.flags |= LBF_ALL_PINNED;
 				goto out_one_pinned;
 			}
 
+			env.flags &= ~LBF_ALL_PINNED;
+
 			/*
 			 * ->active_balance synchronizes accesses to
 			 * ->active_balance_work.  Once set, it's cleared
---


* Re: [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily
  2021-01-06 15:07   ` Peter Zijlstra
@ 2021-01-06 15:20     ` Vincent Guittot
  2021-01-06 15:32       ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Vincent Guittot @ 2021-01-06 15:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, linux-kernel

On Wed, 6 Jan 2021 at 16:10, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Jan 06, 2021 at 02:34:18PM +0100, Vincent Guittot wrote:
> > Setting LBF_ALL_PINNED during active load balance is only valid when there
> > is only 1 running task on the rq otherwise this ends up increasing the
> > balance interval whereas other tasks could migrate after the next interval
> > once they become cache-cold as an example.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> >  kernel/sched/fair.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 5428b8723e61..69a455113b10 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9759,7 +9759,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> >                       if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
> >                               raw_spin_unlock_irqrestore(&busiest->lock,
> >                                                           flags);
> > -                             env.flags |= LBF_ALL_PINNED;
> > +                             if (busiest->nr_running == 1)
> > +                                     env.flags |= LBF_ALL_PINNED;
> >                               goto out_one_pinned;
> >                       }
>
> Hmm.. that wasn't the intention. It's entirely possible to have multiple
> tasks pinned.

But LBF_ALL_PINNED is already set in this case

>
> ISTR we set all-pinned and then clear it in can_migrate_task() when we
> actually find a task that can be moved to the destination CPU. That's a
> construct that works perfectly fine with multiple tasks.

I agree with you.

This case here is :
we have 2 tasks TA and TB on the rq.
The waiting one TB can't migrate for a reason other than the pinned case.
We decide to start the active migration on the running  TA task but TA
is pinned.
In this case we are not in the all pinned case.

We are in the all pinned case, only if the running task TA is the only
runnable task of the rq

>
>


* Re: [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily
  2021-01-06 15:20     ` Vincent Guittot
@ 2021-01-06 15:32       ` Peter Zijlstra
  2021-01-06 15:45         ` Vincent Guittot
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2021-01-06 15:32 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Ingo Molnar, Juri Lelli, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, linux-kernel

On Wed, Jan 06, 2021 at 04:20:55PM +0100, Vincent Guittot wrote:

> This case here is :
> we have 2 tasks TA and TB on the rq.
> The waiting one TB can't migrate for a reason other than the pinned case.
> We decide to start the active migration on the running  TA task but TA
> is pinned.
> In this case we are not in the all pinned case.

But then can_migrate_task(TB) should clear ALL_PINNED, no?


* Re: [PATCH 3/3] sched/fair: reduce cases for active balance
  2021-01-06 15:13   ` Peter Zijlstra
@ 2021-01-06 15:41     ` Vincent Guittot
  2021-01-06 16:11       ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Vincent Guittot @ 2021-01-06 15:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, linux-kernel

On Wed, 6 Jan 2021 at 16:13, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Jan 06, 2021 at 02:34:19PM +0100, Vincent Guittot wrote:
> > Active balance is triggered for a number of voluntary case like misfit or
>                                                         cases
> > pinned tasks cases but also after that a number of load balance failed to
>                                                                  ^attempts
> > migrate a task. Remove the active load balance case for overloaded group
>                                                          ^an ?
> > as an overloaded state means that there is at least one waiting tasks. The
>                                                                   task
> > threshold on the upper limit of the task's load will decrease with the
> > number of failed LB until the task has migrated.
>
> And I'm not sure I follow that last part, irrespective of spelling nits,
> help?

Argh, coming back to work is difficult for me

Let me try again:

Active balance is triggered for a number of voluntary cases, like the
misfit or pinned task cases, but also after a number of load balance
attempts have failed to migrate a task. There is no need to use active
load balance when the group is overloaded, because an overloaded state
means that there is at least one waiting task. Nevertheless, the
waiting task is not selected and detached until the threshold becomes
higher than its load. This threshold increases with the number of
failed LB attempts (see the condition
if ((load >> env->sd->nr_balance_failed) > env->imbalance) in
detach_tasks()) and the waiting task will end up being selected after a
number of attempts.
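
For example, with made-up numbers, task_h_load(p) = 1024 and
env->imbalance = 300:

	nr_balance_failed = 0: 1024 >> 0 = 1024 > 300 -> task skipped
	nr_balance_failed = 1: 1024 >> 1 =  512 > 300 -> task skipped
	nr_balance_failed = 2: 1024 >> 2 =  256 <= 300 -> task detached

So the waiting task gets pulled after a couple of failed attempts without
resorting to active balance.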

>
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> >  kernel/sched/fair.c | 43 +++++++++++++++++++++----------------------
> >  1 file changed, 21 insertions(+), 22 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 69a455113b10..ee87fd6f7359 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9499,13 +9499,30 @@ asym_active_balance(struct lb_env *env)
> >  }
> >
> >  static inline bool
> > -voluntary_active_balance(struct lb_env *env)
> > +imbalanced_active_balance(struct lb_env *env)
> > +{
> > +     struct sched_domain *sd = env->sd;
> > +
> > +     /* The imbalanced case includes the case of pinned tasks preventing a fair
> > +      * distribution of the load on the system but also the even distribution of the
> > +      * threads on a system with spare capacity
> > +      */
>
> comment style fail
>
> > +     if ((env->migration_type == migrate_task) &&
> > +             (sd->nr_balance_failed > sd->cache_nice_tries+2))
>
> indent fail; try: set cino=(0:0
>
> > +             return 1;
> > +
> > +     return 0;
> > +}
> > +
> > +static int need_active_balance(struct lb_env *env)
> >  {
> >       struct sched_domain *sd = env->sd;
> >
> >       if (asym_active_balance(env))
> >               return 1;
> >
> > +     if (imbalanced_active_balance(env))
> > +             return 1;
>
> + whitespace
>
> >       /*
> >        * The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
> >        * It's worth migrating the task if the src_cpu's capacity is reduced


* Re: [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily
  2021-01-06 15:32       ` Peter Zijlstra
@ 2021-01-06 15:45         ` Vincent Guittot
  0 siblings, 0 replies; 14+ messages in thread
From: Vincent Guittot @ 2021-01-06 15:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Juri Lelli, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, linux-kernel

On Wed, 6 Jan 2021 at 16:32, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, Jan 06, 2021 at 04:20:55PM +0100, Vincent Guittot wrote:
>
> > This case here is :
> > we have 2 tasks TA and TB on the rq.
> > The waiting one TB can't migrate for a reason other than the pinned case.
> > We decide to start the active migration on the running  TA task but TA
> > is pinned.
> > In this case we are not in the all pinned case.
>
> But then can_migrate_task(TB) should clear ALL_PINNED, no?

Yes, but the current code sets ALL_PINNED when the active migration fails
because the running task is pinned, even though there are non-pinned
waiting tasks that have not been selected


* Re: [PATCH 2/3] sched/fair: don't set LBF_ALL_PINNED unnecessarily
  2021-01-06 15:13   ` Valentin Schneider
@ 2021-01-06 16:04     ` Vincent Guittot
  0 siblings, 0 replies; 14+ messages in thread
From: Vincent Guittot @ 2021-01-06 16:04 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel

On Wed, 6 Jan 2021 at 16:13, Valentin Schneider
<valentin.schneider@arm.com> wrote:
>
> On 06/01/21 14:34, Vincent Guittot wrote:
> > Setting LBF_ALL_PINNED during active load balance is only valid when there
> > is only 1 running task on the rq otherwise this ends up increasing the
> > balance interval whereas other tasks could migrate after the next interval
> > once they become cache-cold as an example.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> >  kernel/sched/fair.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 5428b8723e61..69a455113b10 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9759,7 +9759,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> >                       if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
> >                               raw_spin_unlock_irqrestore(&busiest->lock,
> >                                                           flags);
> > -                             env.flags |= LBF_ALL_PINNED;
> > +                             if (busiest->nr_running == 1)
> > +                                     env.flags |= LBF_ALL_PINNED;
>
> So LBF_ALL_PINNED *can* be set if busiest->nr_running > 1, because
> before we get there we have:
>
>   if (nr_running > 1) {
>       env.flags |= LBF_ALL_PINNED;
>       detach_tasks(&env); // Removes LBF_ALL_PINNED if > 0 tasks can be pulled
>       ...
>   }
>
> What about following the logic used by detach_tasks() and only clear the
> flag? Say something like the below. if nr_running > 1, then we'll have
> gone through detach_tasks() and will have cleared the flag (if
> possible).
> ---
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 04a3ce20da67..211c86ba3f5b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9623,6 +9623,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>         env.src_rq = busiest;
>
>         ld_moved = 0;
> +       /* Clear this as soon as we find a single pullable task */
> +       env.flags |= LBF_ALL_PINNED;
>         if (busiest->nr_running > 1) {
>                 /*
>                  * Attempt to move tasks. If find_busiest_group has found
> @@ -9630,7 +9632,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>                  * still unbalanced. ld_moved simply stays zero, so it is
>                  * correctly treated as an imbalance.
>                  */
> -               env.flags |= LBF_ALL_PINNED;
>                 env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);
>
>  more_balance:
> @@ -9756,10 +9757,11 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>                         if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
>                                 raw_spin_unlock_irqrestore(&busiest->lock,
>                                                             flags);
> -                               env.flags |= LBF_ALL_PINNED;
>                                 goto out_one_pinned;
>                         }
>
> +                       env.flags &= ~LBF_ALL_PINNED;

Yes, looks easier to read.
will do the change in the next version

> +
>                         /*
>                          * ->active_balance synchronizes accesses to
>                          * ->active_balance_work.  Once set, it's cleared
> ---


* Re: [PATCH 3/3] sched/fair: reduce cases for active balance
  2021-01-06 15:41     ` Vincent Guittot
@ 2021-01-06 16:11       ` Peter Zijlstra
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2021-01-06 16:11 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Ingo Molnar, Juri Lelli, Dietmar Eggemann, Steven Rostedt,
	Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, linux-kernel

On Wed, Jan 06, 2021 at 04:41:04PM +0100, Vincent Guittot wrote:

> Let me try again:
> 
> Active balance is triggered for a number of voluntary cases, like the
> misfit or pinned task cases, but also after a number of load balance
> attempts have failed to migrate a task. There is no need to use active
> load balance when the group is overloaded, because an overloaded state
> means that there is at least one waiting task. Nevertheless, the
> waiting task is not selected and detached until the threshold becomes
> higher than its load. This threshold increases with the number of
> failed LB attempts (see the condition
> if ((load >> env->sd->nr_balance_failed) > env->imbalance) in
> detach_tasks()) and the waiting task will end up being selected after a
> number of attempts.

Ah, makes sense now. Thanks!

