* [PATCH 0/3 v2] Reduce number of active LB
@ 2021-01-07 10:33 Vincent Guittot
  2021-01-07 10:33 ` [PATCH 1/3 v2] sched/fair: skip idle cfs_rq Vincent Guittot
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Vincent Guittot @ 2021-01-07 10:33 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, valentin.schneider, linux-kernel
  Cc: Vincent Guittot

A few improvements related to active load balancing (LB) and the increase of
the LB interval.
I haven't seen any performance impact on various benchmarks except for:
  - stress-ng mmapfork: +4.54% on my octo-core arm64
But this was somewhat expected, as the changes mainly impact corner cases.

Changes since v1:
- patch 2: changed how LBF_ALL_PINNED is managed, as proposed by Valentin
- patch 3: updated comment and fixed typos
  
Vincent Guittot (3):
  sched/fair: skip idle cfs_rq
  sched/fair: don't set LBF_ALL_PINNED unnecessarily
  sched/fair: reduce cases for active balance

 kernel/sched/fair.c | 57 +++++++++++++++++++++++++--------------------
 1 file changed, 32 insertions(+), 25 deletions(-)

-- 
2.17.1



* [PATCH 1/3 v2] sched/fair: skip idle cfs_rq
  2021-01-07 10:33 [PATCH 0/3 v2] Reduce number of active LB Vincent Guittot
@ 2021-01-07 10:33 ` Vincent Guittot
  2021-01-11 14:46   ` Mel Gorman
  2021-01-14 11:29   ` [tip: sched/core] sched/fair: Skip " tip-bot2 for Vincent Guittot
  2021-01-07 10:33 ` [PATCH 2/3 v2] sched/fair: don't set LBF_ALL_PINNED unnecessarily Vincent Guittot
  2021-01-07 10:33 ` [PATCH 3/3 v2] sched/fair: reduce cases for active balance Vincent Guittot
  2 siblings, 2 replies; 17+ messages in thread
From: Vincent Guittot @ 2021-01-07 10:33 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, valentin.schneider, linux-kernel
  Cc: Vincent Guittot

Don't waste time checking whether an idle cfs_rq could be the busiest
queue. Furthermore, in the migrate_load case, doing so can end up
selecting a cfs_rq that has a high load but is idle.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/sched/fair.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 04a3ce20da67..5428b8723e61 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9389,8 +9389,11 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 		if (rt > env->fbq_type)
 			continue;
 
-		capacity = capacity_of(i);
 		nr_running = rq->cfs.h_nr_running;
+		if (!nr_running)
+			continue;
+
+		capacity = capacity_of(i);
 
 		/*
 		 * For ASYM_CPUCAPACITY domains, don't pick a CPU that could
-- 
2.17.1



* [PATCH 2/3 v2] sched/fair: don't set LBF_ALL_PINNED unnecessarily
  2021-01-07 10:33 [PATCH 0/3 v2] Reduce number of active LB Vincent Guittot
  2021-01-07 10:33 ` [PATCH 1/3 v2] sched/fair: skip idle cfs_rq Vincent Guittot
@ 2021-01-07 10:33 ` Vincent Guittot
  2021-01-07 11:26   ` Valentin Schneider
                     ` (3 more replies)
  2021-01-07 10:33 ` [PATCH 3/3 v2] sched/fair: reduce cases for active balance Vincent Guittot
  2 siblings, 4 replies; 17+ messages in thread
From: Vincent Guittot @ 2021-01-07 10:33 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, valentin.schneider, linux-kernel
  Cc: Vincent Guittot

Setting LBF_ALL_PINNED during active load balance is only valid when there
is only 1 running task on the rq; otherwise this ends up increasing the
balance interval even though other tasks could migrate after the next
interval, for example once they become cache-cold.

The LBF_ALL_PINNED flag is now always set by default. It is then cleared
as soon as we find one task that can be pulled, either when calling
detach_tasks() or during active migration.
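
As a standalone sketch of that approach (hypothetical helper and variable
names, not the actual load_balance() code): start from the pessimistic
assumption and let every path that finds a movable task clear the flag.

#include <stdbool.h>
#include <stdio.h>

#define LBF_ALL_PINNED	0x01	/* illustrative value only */

/*
 * Hypothetical model of the flag handling: LBF_ALL_PINNED is set up front
 * and cleared by whichever path finds a task that can move. In
 * load_balance() the active-migration check only runs when no task was
 * pulled; both paths are combined here for brevity.
 */
static unsigned int run_balance_paths(const bool *task_can_move, int nr_tasks,
				      bool curr_can_run_on_dst)
{
	unsigned int flags = LBF_ALL_PINNED;	/* pessimistic default */
	int i;

	/* detach_tasks()-like scan: clear the flag on the first movable task */
	for (i = 0; i < nr_tasks; i++) {
		if (task_can_move[i]) {
			flags &= ~LBF_ALL_PINNED;
			break;
		}
	}

	/* active-migration-like check: the running task can move to dst */
	if (curr_can_run_on_dst)
		flags &= ~LBF_ALL_PINNED;

	return flags;
}

int main(void)
{
	bool tasks[] = { false, false, true };

	printf("flags=%#x\n", run_balance_paths(tasks, 3, false));
	return 0;
}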

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5428b8723e61..a3515dea1afc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9626,6 +9626,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 	env.src_rq = busiest;
 
 	ld_moved = 0;
+	/* Clear this flag as soon as we find a pullable task */
+	env.flags |= LBF_ALL_PINNED;
 	if (busiest->nr_running > 1) {
 		/*
 		 * Attempt to move tasks. If find_busiest_group has found
@@ -9633,7 +9635,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		 * still unbalanced. ld_moved simply stays zero, so it is
 		 * correctly treated as an imbalance.
 		 */
-		env.flags |= LBF_ALL_PINNED;
 		env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);
 
 more_balance:
@@ -9759,10 +9760,12 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 			if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
 				raw_spin_unlock_irqrestore(&busiest->lock,
 							    flags);
-				env.flags |= LBF_ALL_PINNED;
 				goto out_one_pinned;
 			}
 
+			/* Record that we found atleast one task that could run on this_cpu */
+			env.flags &= ~LBF_ALL_PINNED;
+
 			/*
 			 * ->active_balance synchronizes accesses to
 			 * ->active_balance_work.  Once set, it's cleared
-- 
2.17.1



* [PATCH 3/3 v2] sched/fair: reduce cases for active balance
  2021-01-07 10:33 [PATCH 0/3 v2] Reduce number of active LB Vincent Guittot
  2021-01-07 10:33 ` [PATCH 1/3 v2] sched/fair: skip idle cfs_rq Vincent Guittot
  2021-01-07 10:33 ` [PATCH 2/3 v2] sched/fair: don't set LBF_ALL_PINNED unnecessarily Vincent Guittot
@ 2021-01-07 10:33 ` Vincent Guittot
  2021-01-07 11:26   ` Valentin Schneider
                     ` (2 more replies)
  2 siblings, 3 replies; 17+ messages in thread
From: Vincent Guittot @ 2021-01-07 10:33 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, valentin.schneider, linux-kernel
  Cc: Vincent Guittot

Active balance is triggered for a number of voluntary cases, like the misfit
or pinned tasks cases, but also after a number of load balance attempts have
failed to migrate a task. There is no need to use active load balance when
the group is overloaded, because an overloaded state means that there is at
least one waiting task. Nevertheless, the waiting task is not selected and
detached until the threshold becomes higher than its load. This threshold
increases with the number of failed load balance attempts (see the condition
if ((load >> env->sd->nr_balance_failed) > env->imbalance) in
detach_tasks()), so the waiting task will end up being selected after a
number of attempts.
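
As a standalone illustration of that condition (with made-up load and
imbalance values, not kernel code), the shift means the waiting task stops
being skipped once enough balance attempts have failed:

#include <stdio.h>

int main(void)
{
	unsigned long load = 2048;	/* hypothetical task load */
	unsigned long imbalance = 300;	/* hypothetical env->imbalance */
	int nr_failed;

	for (nr_failed = 0; nr_failed < 5; nr_failed++) {
		int skipped = (load >> nr_failed) > imbalance;

		printf("nr_balance_failed=%d shifted load=%lu -> %s\n",
		       nr_failed, load >> nr_failed,
		       skipped ? "task skipped" : "task can be detached");
	}
	return 0;
}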

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 45 +++++++++++++++++++++++----------------------
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a3515dea1afc..00ec5b901188 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9499,13 +9499,32 @@ asym_active_balance(struct lb_env *env)
 }
 
 static inline bool
-voluntary_active_balance(struct lb_env *env)
+imbalanced_active_balance(struct lb_env *env)
+{
+	struct sched_domain *sd = env->sd;
+
+	/*
+	 * The imbalanced case includes the case of pinned tasks preventing a fair
+	 * distribution of the load on the system but also the even distribution of the
+	 * threads on a system with spare capacity
+	 */
+	if ((env->migration_type == migrate_task) &&
+	    (sd->nr_balance_failed > sd->cache_nice_tries+2))
+		return 1;
+
+	return 0;
+}
+
+static int need_active_balance(struct lb_env *env)
 {
 	struct sched_domain *sd = env->sd;
 
 	if (asym_active_balance(env))
 		return 1;
 
+	if (imbalanced_active_balance(env))
+		return 1;
+
 	/*
 	 * The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
 	 * It's worth migrating the task if the src_cpu's capacity is reduced
@@ -9525,16 +9544,6 @@ voluntary_active_balance(struct lb_env *env)
 	return 0;
 }
 
-static int need_active_balance(struct lb_env *env)
-{
-	struct sched_domain *sd = env->sd;
-
-	if (voluntary_active_balance(env))
-		return 1;
-
-	return unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2);
-}
-
 static int active_load_balance_cpu_stop(void *data);
 
 static int should_we_balance(struct lb_env *env)
@@ -9787,21 +9796,13 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 			/* We've kicked active balancing, force task migration. */
 			sd->nr_balance_failed = sd->cache_nice_tries+1;
 		}
-	} else
+	} else {
 		sd->nr_balance_failed = 0;
+	}
 
-	if (likely(!active_balance) || voluntary_active_balance(&env)) {
+	if (likely(!active_balance) || need_active_balance(&env)) {
 		/* We were unbalanced, so reset the balancing interval */
 		sd->balance_interval = sd->min_interval;
-	} else {
-		/*
-		 * If we've begun active balancing, start to back off. This
-		 * case may not be covered by the all_pinned logic if there
-		 * is only 1 task on the busy runqueue (because we don't call
-		 * detach_tasks).
-		 */
-		if (sd->balance_interval < sd->max_interval)
-			sd->balance_interval *= 2;
 	}
 
 	goto out;
-- 
2.17.1



* Re: [PATCH 2/3 v2] sched/fair: don't set LBF_ALL_PINNED unnecessarily
  2021-01-07 10:33 ` [PATCH 2/3 v2] sched/fair: don't set LBF_ALL_PINNED unnecessarily Vincent Guittot
@ 2021-01-07 11:26   ` Valentin Schneider
       [not found]   ` <BN8PR12MB2978EC9CFBAF529C527D05919AAF0@BN8PR12MB2978.namprd12.prod.outlook.com>
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Valentin Schneider @ 2021-01-07 11:26 UTC (permalink / raw)
  To: Vincent Guittot, mingo, peterz, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, linux-kernel
  Cc: Vincent Guittot

On 07/01/21 11:33, Vincent Guittot wrote:
> Setting LBF_ALL_PINNED during active load balance is only valid when there
> is only 1 running task on the rq otherwise this ends up increasing the
> balance interval whereas other tasks could migrate after the next interval
> once they become cache-cold as an example.
>
> LBF_ALL_PINNED flag is now always set it by default. It is then cleared
> when we find one task that can be pulled when calling detach_tasks() or
> during active migration.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>


* Re: [PATCH 3/3 v2] sched/fair: reduce cases for active balance
  2021-01-07 10:33 ` [PATCH 3/3 v2] sched/fair: reduce cases for active balance Vincent Guittot
@ 2021-01-07 11:26   ` Valentin Schneider
  2021-01-07 12:20     ` Vincent Guittot
  2021-01-12  9:16   ` Mel Gorman
  2021-01-14 11:29   ` [tip: sched/core] sched/fair: Reduce " tip-bot2 for Vincent Guittot
  2 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2021-01-07 11:26 UTC (permalink / raw)
  To: Vincent Guittot, mingo, peterz, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, linux-kernel
  Cc: Vincent Guittot

On 07/01/21 11:33, Vincent Guittot wrote:
> Active balance is triggered for a number of voluntary cases like misfit
> or pinned tasks cases but also after that a number of load balance
> attempts failed to migrate a task. There is no need to use active load
> balance when the group is overloaded because an overloaded state means
> that there is at least one waiting task. Nevertheless, the waiting task
> is not selected and detached until the threshold becomes higher than its
> load. This threshold increases with the number of failed lb (see the
> condition if ((load >> env->sd->nr_balance_failed) > env->imbalance) in
> detach_tasks()) and the waiting task will end up to be selected after a
> number of attempts.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 45 +++++++++++++++++++++++----------------------
>  1 file changed, 23 insertions(+), 22 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index a3515dea1afc..00ec5b901188 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9499,13 +9499,32 @@ asym_active_balance(struct lb_env *env)
>  }
>
>  static inline bool
> -voluntary_active_balance(struct lb_env *env)
> +imbalanced_active_balance(struct lb_env *env)
> +{
> +	struct sched_domain *sd = env->sd;
> +
> +	/*
> +	 * The imbalanced case includes the case of pinned tasks preventing a fair
> +	 * distribution of the load on the system but also the even distribution of the
> +	 * threads on a system with spare capacity
> +	 */

Do you mean s/imbalanced/migrate_task/? This part here will affect
group_imbalanced, group_asym_packing, and some others.

> +	if ((env->migration_type == migrate_task) &&
> +	    (sd->nr_balance_failed > sd->cache_nice_tries+2))
> +		return 1;
> +
> +	return 0;
> +}
> +


* Re: [PATCH 3/3 v2] sched/fair: reduce cases for active balance
  2021-01-07 11:26   ` Valentin Schneider
@ 2021-01-07 12:20     ` Vincent Guittot
  2021-01-07 17:40       ` Valentin Schneider
  0 siblings, 1 reply; 17+ messages in thread
From: Vincent Guittot @ 2021-01-07 12:20 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel

On Thu, 7 Jan 2021 at 12:26, Valentin Schneider
<valentin.schneider@arm.com> wrote:
>
> On 07/01/21 11:33, Vincent Guittot wrote:
> > Active balance is triggered for a number of voluntary cases like misfit
> > or pinned tasks cases but also after that a number of load balance
> > attempts failed to migrate a task. There is no need to use active load
> > balance when the group is overloaded because an overloaded state means
> > that there is at least one waiting task. Nevertheless, the waiting task
> > is not selected and detached until the threshold becomes higher than its
> > load. This threshold increases with the number of failed lb (see the
> > condition if ((load >> env->sd->nr_balance_failed) > env->imbalance) in
> > detach_tasks()) and the waiting task will end up to be selected after a
> > number of attempts.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> >  kernel/sched/fair.c | 45 +++++++++++++++++++++++----------------------
> >  1 file changed, 23 insertions(+), 22 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index a3515dea1afc..00ec5b901188 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9499,13 +9499,32 @@ asym_active_balance(struct lb_env *env)
> >  }
> >
> >  static inline bool
> > -voluntary_active_balance(struct lb_env *env)
> > +imbalanced_active_balance(struct lb_env *env)
> > +{
> > +     struct sched_domain *sd = env->sd;
> > +
> > +     /*
> > +      * The imbalanced case includes the case of pinned tasks preventing a fair
> > +      * distribution of the load on the system but also the even distribution of the
> > +      * threads on a system with spare capacity
> > +      */
>
> Do you mean s/imbalanced/migrate_task/? This part here will affect
> group_imbalanced, group_asym_packing, and some others.

I really mean the imbalanced case, which refers to the function name
and includes:
- the pinned tasks case, aka group_imbalanced, which is the primary
target of this function (and explains its name)
- but also the case where we want to evenly spread tasks on a system
with spare capacity and remove this imbalance

>
> > +     if ((env->migration_type == migrate_task) &&
> > +         (sd->nr_balance_failed > sd->cache_nice_tries+2))
> > +             return 1;
> > +
> > +     return 0;
> > +}
> > +


* Re: [PATCH 2/3 v2] sched/fair: don't set LBF_ALL_PINNED unnecessarily
       [not found]   ` <BN8PR12MB2978EC9CFBAF529C527D05919AAF0@BN8PR12MB2978.namprd12.prod.outlook.com>
@ 2021-01-07 16:00     ` Vincent Guittot
  0 siblings, 0 replies; 17+ messages in thread
From: Vincent Guittot @ 2021-01-07 16:00 UTC (permalink / raw)
  To: Tao Zhou
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, linux-kernel

On Thu, 7 Jan 2021 at 16:08, Tao Zhou <ouwen210@hotmail.com> wrote:
>
> Hi Vincent,
>
> On Thu, Jan 07, 2021 at 11:33:24AM +0100, Vincent Guittot wrote:
> > Setting LBF_ALL_PINNED during active load balance is only valid when there
> > is only 1 running task on the rq otherwise this ends up increasing the
> > balance interval whereas other tasks could migrate after the next interval
> > once they become cache-cold as an example.
> >
> > LBF_ALL_PINNED flag is now always set it by default. It is then cleared
> > when we find one task that can be pulled when calling detach_tasks() or
> > during active migration.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> >  kernel/sched/fair.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 5428b8723e61..a3515dea1afc 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9626,6 +9626,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> >       env.src_rq = busiest;
> >
> >       ld_moved = 0;
> > +     /* Clear this flag as soon as we find a pullable task */
> > +     env.flags |= LBF_ALL_PINNED;
> >       if (busiest->nr_running > 1) {
> >               /*
> >                * Attempt to move tasks. If find_busiest_group has found
> > @@ -9633,7 +9635,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> >                * still unbalanced. ld_moved simply stays zero, so it is
> >                * correctly treated as an imbalance.
> >                */
> > -             env.flags |= LBF_ALL_PINNED;
> >               env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);
> >
> >  more_balance:
> > @@ -9759,10 +9760,12 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> >                       if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
> >                               raw_spin_unlock_irqrestore(&busiest->lock,
> >                                                           flags);
> > -                             env.flags |= LBF_ALL_PINNED;
>
> With busiest->nr_running > 1, LBF_ALL_PINNED may be cleared but !ld_moved,
> and we get here. This is not consistent with the tip sched code, because
> the original code on this path unconditionally set LBF_ALL_PINNED. But is
> this intentional, so as not to increase the balance interval and to allow
> other tasks to migrate without waiting for the next balance interval?
>
> In v1, there was a condition here so that only one task running on the rq
> could set LBF_ALL_PINNED. But in v2, when busiest->nr_running > 1 and
> !ld_moved, LBF_ALL_PINNED is not cleared and we can get here, increasing
> the balance interval. That is not consistent with v1. If I am not wrong, a
> condition like the following is needed:
>
>   if (busiest->nr_running != 1 /* && env.flags & LBF_ALL_PINNED */)
>       env.flags &= ~LBF_ALL_PINNED;

If (nr_running > 1), then LBF_ALL_PINNED can't still be set when we reach
the active migration path (the if (!ld_moved) { ... block), because we go
to out_all_pinned if LBF_ALL_PINNED is set and we have tried all the CPUs
of the sched_group.
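
A simplified standalone sketch of that flow (hypothetical names, not the
actual load_balance() code) to make the point explicit:

#include <stdbool.h>
#include <stdio.h>

/*
 * Simplified model of the flow described above: with nr_running > 1, a
 * still-set "all pinned" flag leads to out_all_pinned before the active
 * migration code runs, so active migration only sees the flag set when
 * busiest has a single running task.
 */
static const char *active_migration_state(int nr_running,
					  bool found_pullable_task)
{
	bool all_pinned = true;		/* pessimistic default */

	if (nr_running > 1) {
		if (found_pullable_task)
			all_pinned = false;	/* cleared in the detach_tasks() pass */
		else
			return "goto out_all_pinned, active migration never reached";
	}

	return all_pinned ? "active migration with flag still set (nr_running == 1)"
			  : "active migration with flag already cleared";
}

int main(void)
{
	printf("%s\n", active_migration_state(2, false));
	printf("%s\n", active_migration_state(2, true));
	printf("%s\n", active_migration_state(1, false));
	return 0;
}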

>
> I hope this is not noise in this new thread.
>
> Thanks,
> Tao
>
> > once they become cache-cold as an example
>
> >                               goto out_one_pinned;
> >                       }
> >
> > +                     /* Record that we found atleast one task that could run on this_cpu */
> > +                     env.flags &= ~LBF_ALL_PINNED;
> > +
> >                       /*
> >                        * ->active_balance synchronizes accesses to
> >                        * ->active_balance_work.  Once set, it's cleared
> > --
> > 2.17.1
> >


* Re: [PATCH 3/3 v2] sched/fair: reduce cases for active balance
  2021-01-07 12:20     ` Vincent Guittot
@ 2021-01-07 17:40       ` Valentin Schneider
  2021-01-08  8:11         ` Vincent Guittot
  0 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2021-01-07 17:40 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel

On 07/01/21 13:20, Vincent Guittot wrote:
> On Thu, 7 Jan 2021 at 12:26, Valentin Schneider
> <valentin.schneider@arm.com> wrote:
>> > @@ -9499,13 +9499,32 @@ asym_active_balance(struct lb_env *env)
>> >  }
>> >
>> >  static inline bool
>> > -voluntary_active_balance(struct lb_env *env)
>> > +imbalanced_active_balance(struct lb_env *env)
>> > +{
>> > +     struct sched_domain *sd = env->sd;
>> > +
>> > +     /*
>> > +      * The imbalanced case includes the case of pinned tasks preventing a fair
>> > +      * distribution of the load on the system but also the even distribution of the
>> > +      * threads on a system with spare capacity
>> > +      */
>>
>> Do you mean s/imbalanced/migrate_task/? This part here will affect
>> group_imbalanced, group_asym_packing, and some others.
>
> I really mean the imbalanced case which refers to the function name
> and includes:
> - the pinned tasks case aka group_imbalanced and which is the primary
> target of this function ( which explains its name)
> - but also the case where we want to evenly spread tasks on system
> with spare capacity and removed this imbalance
>

But can't this also affect other group_types? calculate_imbalance() can
set

  env->migration_type = migrate_task;

for

  busiest->group_type > group_fully_busy


* Re: [PATCH 3/3 v2] sched/fair: reduce cases for active balance
  2021-01-07 17:40       ` Valentin Schneider
@ 2021-01-08  8:11         ` Vincent Guittot
  2021-01-08 14:36           ` Valentin Schneider
  0 siblings, 1 reply; 17+ messages in thread
From: Vincent Guittot @ 2021-01-08  8:11 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel

On Thu, 7 Jan 2021 at 18:40, Valentin Schneider
<valentin.schneider@arm.com> wrote:
>
> On 07/01/21 13:20, Vincent Guittot wrote:
> > On Thu, 7 Jan 2021 at 12:26, Valentin Schneider
> > <valentin.schneider@arm.com> wrote:
> >> > @@ -9499,13 +9499,32 @@ asym_active_balance(struct lb_env *env)
> >> >  }
> >> >
> >> >  static inline bool
> >> > -voluntary_active_balance(struct lb_env *env)
> >> > +imbalanced_active_balance(struct lb_env *env)
> >> > +{
> >> > +     struct sched_domain *sd = env->sd;
> >> > +
> >> > +     /*
> >> > +      * The imbalanced case includes the case of pinned tasks preventing a fair
> >> > +      * distribution of the load on the system but also the even distribution of the
> >> > +      * threads on a system with spare capacity
> >> > +      */
> >>
> >> Do you mean s/imbalanced/migrate_task/? This part here will affect
> >> group_imbalanced, group_asym_packing, and some others.
> >
> > I really mean the imbalanced case which refers to the function name
> > and includes:
> > - the pinned tasks case aka group_imbalanced and which is the primary
> > target of this function ( which explains its name)
> > - but also the case where we want to evenly spread tasks on system
> > with spare capacity and removed this imbalance
> >
>
> But can't this also affect other group_types? calculate_imbalance() can
> set
>
>   env->migration_type = migrate_task;
>
> for
>
>   busiest->group_type > group_fully_busy

Yes, but we are still in the case of evenly spreading tasks on a system
with spare capacity. Also, this is already the behavior for such cases.


* Re: [PATCH 3/3 v2] sched/fair: reduce cases for active balance
  2021-01-08  8:11         ` Vincent Guittot
@ 2021-01-08 14:36           ` Valentin Schneider
  0 siblings, 0 replies; 17+ messages in thread
From: Valentin Schneider @ 2021-01-08 14:36 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel

On 08/01/21 09:11, Vincent Guittot wrote:
> On Thu, 7 Jan 2021 at 18:40, Valentin Schneider
> <valentin.schneider@arm.com> wrote:
>>
>> On 07/01/21 13:20, Vincent Guittot wrote:
>> > On Thu, 7 Jan 2021 at 12:26, Valentin Schneider
>> > <valentin.schneider@arm.com> wrote:
>> >> > @@ -9499,13 +9499,32 @@ asym_active_balance(struct lb_env *env)
>> >> >  }
>> >> >
>> >> >  static inline bool
>> >> > -voluntary_active_balance(struct lb_env *env)
>> >> > +imbalanced_active_balance(struct lb_env *env)
>> >> > +{
>> >> > +     struct sched_domain *sd = env->sd;
>> >> > +
>> >> > +     /*
>> >> > +      * The imbalanced case includes the case of pinned tasks preventing a fair
>> >> > +      * distribution of the load on the system but also the even distribution of the
>> >> > +      * threads on a system with spare capacity
>> >> > +      */
>> >>
>> >> Do you mean s/imbalanced/migrate_task/? This part here will affect
>> >> group_imbalanced, group_asym_packing, and some others.
>> >
>> > I really mean the imbalanced case which refers to the function name
>> > and includes:
>> > - the pinned tasks case aka group_imbalanced and which is the primary
>> > target of this function ( which explains its name)
>> > - but also the case where we want to evenly spread tasks on system
>> > with spare capacity and removed this imbalance
>> >
>>
>> But can't this also affect other group_types? calculate_imbalance() can
>> set
>>
>>   env->migration_type = migrate_task;
>>
>> for
>>
>>   busiest->group_type > group_fully_busy
>
> yes but we are still in the case of evenly spread tasks on system with
> spare capacity. Also, this is already the behavior for such cases.

Ah, I was parsing 'imbalance' as 'group_imbalance' and didn't read
your 'evenly spread tasks' description as accounting for this case.

Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>


* Re: [PATCH 1/3 v2] sched/fair: skip idle cfs_rq
  2021-01-07 10:33 ` [PATCH 1/3 v2] sched/fair: skip idle cfs_rq Vincent Guittot
@ 2021-01-11 14:46   ` Mel Gorman
  2021-01-14 11:29   ` [tip: sched/core] sched/fair: Skip " tip-bot2 for Vincent Guittot
  1 sibling, 0 replies; 17+ messages in thread
From: Mel Gorman @ 2021-01-11 14:46 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	bristot, valentin.schneider, linux-kernel

On Thu, Jan 07, 2021 at 11:33:23AM +0100, Vincent Guittot wrote:
> Don't waste time checking whether an idle cfs_rq could be the busiest
> queue. Furthermore, this can end up selecting a cfs_rq with a high load
> but being idle in case of migrate_load.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH 2/3 v2] sched/fair: don't set LBF_ALL_PINNED unnecessarily
  2021-01-07 10:33 ` [PATCH 2/3 v2] sched/fair: don't set LBF_ALL_PINNED unnecessarily Vincent Guittot
  2021-01-07 11:26   ` Valentin Schneider
       [not found]   ` <BN8PR12MB2978EC9CFBAF529C527D05919AAF0@BN8PR12MB2978.namprd12.prod.outlook.com>
@ 2021-01-11 15:42   ` Mel Gorman
  2021-01-14 11:29   ` [tip: sched/core] sched/fair: Don't " tip-bot2 for Vincent Guittot
  3 siblings, 0 replies; 17+ messages in thread
From: Mel Gorman @ 2021-01-11 15:42 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	bristot, valentin.schneider, linux-kernel

On Thu, Jan 07, 2021 at 11:33:24AM +0100, Vincent Guittot wrote:
> Setting LBF_ALL_PINNED during active load balance is only valid when there
> is only 1 running task on the rq otherwise this ends up increasing the
> balance interval whereas other tasks could migrate after the next interval
> once they become cache-cold as an example.
> 
> LBF_ALL_PINNED flag is now always set it by default. It is then cleared
> when we find one task that can be pulled when calling detach_tasks() or
> during active migration.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

When reviewing this, I found it curious that sched_nr_migrate_break is
a const instead of a #define. It's not clear why, as it only appears to
exist for the case where sysctl_sched_nr_migrate is set higher than 32,
to prevent rq lock starvation.

With your patch, s/atleast/at least/ in the comments if there is another
version.

Otherwise, I didn't spot a real problem, so:

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH 3/3 v2] sched/fair: reduce cases for active balance
  2021-01-07 10:33 ` [PATCH 3/3 v2] sched/fair: reduce cases for active balance Vincent Guittot
  2021-01-07 11:26   ` Valentin Schneider
@ 2021-01-12  9:16   ` Mel Gorman
  2021-01-14 11:29   ` [tip: sched/core] sched/fair: Reduce " tip-bot2 for Vincent Guittot
  2 siblings, 0 replies; 17+ messages in thread
From: Mel Gorman @ 2021-01-12  9:16 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	bristot, valentin.schneider, linux-kernel

On Thu, Jan 07, 2021 at 11:33:25AM +0100, Vincent Guittot wrote:
> Active balance is triggered for a number of voluntary cases like misfit
> or pinned tasks cases but also after that a number of load balance
> attempts failed to migrate a task. There is no need to use active load
> balance when the group is overloaded because an overloaded state means
> that there is at least one waiting task. Nevertheless, the waiting task
> is not selected and detached until the threshold becomes higher than its
> load. This threshold increases with the number of failed lb (see the
> condition if ((load >> env->sd->nr_balance_failed) > env->imbalance) in
> detach_tasks()) and the waiting task will end up to be selected after a
> number of attempts.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

I didn't see major problems, so:

Acked-by: Mel Gorman <mgorman@suse.de>

Like you, I do not see significant performance differences, either
positive or negative (minor gains and losses that are borderline).
However, I didn't track the stats necessary to see what impact it had on
the alb_* stats, which is an oversight, but schedstat tracking interferes
with results on its own. It would have been nice to see how often this
path is even hit in the overloaded case.

-- 
Mel Gorman
SUSE Labs


* [tip: sched/core] sched/fair: Don't set LBF_ALL_PINNED unnecessarily
  2021-01-07 10:33 ` [PATCH 2/3 v2] sched/fair: don't set LBF_ALL_PINNED unnecessarily Vincent Guittot
                     ` (2 preceding siblings ...)
  2021-01-11 15:42   ` Mel Gorman
@ 2021-01-14 11:29   ` tip-bot2 for Vincent Guittot
  3 siblings, 0 replies; 17+ messages in thread
From: tip-bot2 for Vincent Guittot @ 2021-01-14 11:29 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Vincent Guittot, Peter Zijlstra (Intel),
	Valentin Schneider, Mel Gorman, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     8a41dfcda7a32ed4435c00d98a9dc7156b08b671
Gitweb:        https://git.kernel.org/tip/8a41dfcda7a32ed4435c00d98a9dc7156b08b671
Author:        Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate:    Thu, 07 Jan 2021 11:33:24 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 14 Jan 2021 11:20:11 +01:00

sched/fair: Don't set LBF_ALL_PINNED unnecessarily

Setting LBF_ALL_PINNED during active load balance is only valid when there
is only 1 running task on the rq; otherwise this ends up increasing the
balance interval even though other tasks could migrate after the next
interval, for example once they become cache-cold.

The LBF_ALL_PINNED flag is now always set by default. It is then cleared
as soon as we find one task that can be pulled, either when calling
detach_tasks() or during active migration.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20210107103325.30851-3-vincent.guittot@linaro.org
---
 kernel/sched/fair.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 13de7ae..48f99c8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9639,6 +9639,8 @@ redo:
 	env.src_rq = busiest;
 
 	ld_moved = 0;
+	/* Clear this flag as soon as we find a pullable task */
+	env.flags |= LBF_ALL_PINNED;
 	if (busiest->nr_running > 1) {
 		/*
 		 * Attempt to move tasks. If find_busiest_group has found
@@ -9646,7 +9648,6 @@ redo:
 		 * still unbalanced. ld_moved simply stays zero, so it is
 		 * correctly treated as an imbalance.
 		 */
-		env.flags |= LBF_ALL_PINNED;
 		env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);
 
 more_balance:
@@ -9772,10 +9773,12 @@ more_balance:
 			if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
 				raw_spin_unlock_irqrestore(&busiest->lock,
 							    flags);
-				env.flags |= LBF_ALL_PINNED;
 				goto out_one_pinned;
 			}
 
+			/* Record that we found at least one task that could run on this_cpu */
+			env.flags &= ~LBF_ALL_PINNED;
+
 			/*
 			 * ->active_balance synchronizes accesses to
 			 * ->active_balance_work.  Once set, it's cleared


* [tip: sched/core] sched/fair: Reduce cases for active balance
  2021-01-07 10:33 ` [PATCH 3/3 v2] sched/fair: reduce cases for active balance Vincent Guittot
  2021-01-07 11:26   ` Valentin Schneider
  2021-01-12  9:16   ` Mel Gorman
@ 2021-01-14 11:29   ` tip-bot2 for Vincent Guittot
  2 siblings, 0 replies; 17+ messages in thread
From: tip-bot2 for Vincent Guittot @ 2021-01-14 11:29 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Vincent Guittot, Peter Zijlstra (Intel),
	Valentin Schneider, Mel Gorman, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     e9b9734b74656abb585a7f6fabf1d30ce00e51ea
Gitweb:        https://git.kernel.org/tip/e9b9734b74656abb585a7f6fabf1d30ce00e51ea
Author:        Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate:    Thu, 07 Jan 2021 11:33:25 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 14 Jan 2021 11:20:11 +01:00

sched/fair: Reduce cases for active balance

Active balance is triggered for a number of voluntary cases, like the misfit
or pinned tasks cases, but also after a number of load balance attempts have
failed to migrate a task. There is no need to use active load balance when
the group is overloaded, because an overloaded state means that there is at
least one waiting task. Nevertheless, the waiting task is not selected and
detached until the threshold becomes higher than its load. This threshold
increases with the number of failed load balance attempts (see the condition
if ((load >> env->sd->nr_balance_failed) > env->imbalance) in
detach_tasks()), so the waiting task will end up being selected after a
number of attempts.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20210107103325.30851-4-vincent.guittot@linaro.org
---
 kernel/sched/fair.c | 45 ++++++++++++++++++++++----------------------
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 48f99c8..53802b7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9512,13 +9512,32 @@ asym_active_balance(struct lb_env *env)
 }
 
 static inline bool
-voluntary_active_balance(struct lb_env *env)
+imbalanced_active_balance(struct lb_env *env)
+{
+	struct sched_domain *sd = env->sd;
+
+	/*
+	 * The imbalanced case includes the case of pinned tasks preventing a fair
+	 * distribution of the load on the system but also the even distribution of the
+	 * threads on a system with spare capacity
+	 */
+	if ((env->migration_type == migrate_task) &&
+	    (sd->nr_balance_failed > sd->cache_nice_tries+2))
+		return 1;
+
+	return 0;
+}
+
+static int need_active_balance(struct lb_env *env)
 {
 	struct sched_domain *sd = env->sd;
 
 	if (asym_active_balance(env))
 		return 1;
 
+	if (imbalanced_active_balance(env))
+		return 1;
+
 	/*
 	 * The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
 	 * It's worth migrating the task if the src_cpu's capacity is reduced
@@ -9538,16 +9557,6 @@ voluntary_active_balance(struct lb_env *env)
 	return 0;
 }
 
-static int need_active_balance(struct lb_env *env)
-{
-	struct sched_domain *sd = env->sd;
-
-	if (voluntary_active_balance(env))
-		return 1;
-
-	return unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2);
-}
-
 static int active_load_balance_cpu_stop(void *data);
 
 static int should_we_balance(struct lb_env *env)
@@ -9800,21 +9809,13 @@ more_balance:
 			/* We've kicked active balancing, force task migration. */
 			sd->nr_balance_failed = sd->cache_nice_tries+1;
 		}
-	} else
+	} else {
 		sd->nr_balance_failed = 0;
+	}
 
-	if (likely(!active_balance) || voluntary_active_balance(&env)) {
+	if (likely(!active_balance) || need_active_balance(&env)) {
 		/* We were unbalanced, so reset the balancing interval */
 		sd->balance_interval = sd->min_interval;
-	} else {
-		/*
-		 * If we've begun active balancing, start to back off. This
-		 * case may not be covered by the all_pinned logic if there
-		 * is only 1 task on the busy runqueue (because we don't call
-		 * detach_tasks).
-		 */
-		if (sd->balance_interval < sd->max_interval)
-			sd->balance_interval *= 2;
 	}
 
 	goto out;


* [tip: sched/core] sched/fair: Skip idle cfs_rq
  2021-01-07 10:33 ` [PATCH 1/3 v2] sched/fair: skip idle cfs_rq Vincent Guittot
  2021-01-11 14:46   ` Mel Gorman
@ 2021-01-14 11:29   ` tip-bot2 for Vincent Guittot
  1 sibling, 0 replies; 17+ messages in thread
From: tip-bot2 for Vincent Guittot @ 2021-01-14 11:29 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Vincent Guittot, Peter Zijlstra (Intel),
	Valentin Schneider, Mel Gorman, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     fc488ffd4297f661b3e9d7450dcdb9089a53df7c
Gitweb:        https://git.kernel.org/tip/fc488ffd4297f661b3e9d7450dcdb9089a53df7c
Author:        Vincent Guittot <vincent.guittot@linaro.org>
AuthorDate:    Thu, 07 Jan 2021 11:33:23 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 14 Jan 2021 11:20:10 +01:00

sched/fair: Skip idle cfs_rq

Don't waste time checking whether an idle cfs_rq could be the busiest
queue. Furthermore, in the migrate_load case, doing so can end up
selecting a cfs_rq that has a high load but is idle.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20210107103325.30851-2-vincent.guittot@linaro.org
---
 kernel/sched/fair.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 40d3ebf..13de7ae 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9402,8 +9402,11 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 		if (rt > env->fbq_type)
 			continue;
 
-		capacity = capacity_of(i);
 		nr_running = rq->cfs.h_nr_running;
+		if (!nr_running)
+			continue;
+
+		capacity = capacity_of(i);
 
 		/*
 		 * For ASYM_CPUCAPACITY domains, don't pick a CPU that could

