linux-kernel.vger.kernel.org archive mirror
* [PATCH] sched/fair: Check a task has a fitting cpu when updating misfit
@ 2023-08-20 20:34 Qais Yousef
  2023-08-29 14:10 ` Vincent Guittot
  0 siblings, 1 reply; 5+ messages in thread
From: Qais Yousef @ 2023-08-20 20:34 UTC
  To: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann
  Cc: linux-kernel, Qais Yousef, Qais Yousef

From: Qais Yousef <qais.yousef@arm.com>

If a misfit task is affined to a subset of the possible cpus, we need to
verify that one of these cpus can fit it. Otherwise the load balancer
code will continuously trigger needlessly, causing balance_interval to
grow in response, and we eventually end up in a situation where real
imbalances take a long time to address because of this impossible
imbalance.

This can happen in the Android world, where it's common for background
tasks to be restricted to little cores.

Similarly, if the task doesn't fit even on the biggest core, triggering
misfit is pointless as that is the best it can ever get on this system.

To speed the search up, don't call task_fits_cpu(), which would
repeatedly call uclamp_eff_value() for the same task. Call
util_fits_cpu() instead, and only do so when we see a cpu with a higher
capacity level than that of cpu_of(rq).

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
---
 kernel/sched/fair.c | 50 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0b7445cd5af9..f08c5f3bf895 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4853,17 +4853,50 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
 
 static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
 {
+	unsigned long uclamp_min, uclamp_max;
+	unsigned long util, cap_level;
+	bool has_fitting_cpu = false;
+	int cpu = cpu_of(rq);
+
 	if (!sched_asym_cpucap_active())
 		return;
 
-	if (!p || p->nr_cpus_allowed == 1) {
-		rq->misfit_task_load = 0;
-		return;
-	}
+	if (!p || p->nr_cpus_allowed == 1)
+		goto out;
 
-	if (task_fits_cpu(p, cpu_of(rq))) {
-		rq->misfit_task_load = 0;
-		return;
+	uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
+	uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
+	util = task_util_est(p);
+
+	if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
+		goto out;
+
+	cap_level = capacity_orig_of(cpu);
+
+	/* If we can't fit the biggest CPU, that's the best we can ever get. */
+	if (cap_level == SCHED_CAPACITY_SCALE)
+		goto out;
+
+	/*
+	 * If the task affinity is not set to default, make sure it is not
+	 * restricted to a subset where no CPU can ever fit it. Triggering
+	 * misfit in this case is pointless as it has nowhere better to move
+	 * to. And it can cause balance_interval to grow too high as we'll
+	 * continuously fail to move it anywhere.
+	 */
+	if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
+		for_each_cpu(cpu, p->cpus_ptr) {
+			if (cap_level < capacity_orig_of(cpu)) {
+				cap_level = capacity_orig_of(cpu);
+				if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0) {
+					has_fitting_cpu = true;
+					break;
+				}
+			}
+		}
+
+		if (!has_fitting_cpu)
+			goto out;
 	}
 
 	/*
@@ -4871,6 +4904,9 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
 	 * task_h_load() returns 0.
 	 */
 	rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
+	return;
+out:
+	rq->misfit_task_load = 0;
 }
 
 #else /* CONFIG_SMP */
-- 
2.34.1



* Re: [PATCH] sched/fair: Check a task has a fitting cpu when updating misfit
  2023-08-20 20:34 [PATCH] sched/fair: Check a task has a fitting cpu when updating misfit Qais Yousef
@ 2023-08-29 14:10 ` Vincent Guittot
  2023-08-29 15:35   ` Qais Yousef
  0 siblings, 1 reply; 5+ messages in thread
From: Vincent Guittot @ 2023-08-29 14:10 UTC
  To: Qais Yousef
  Cc: Ingo Molnar, Peter Zijlstra, Dietmar Eggemann, linux-kernel, Qais Yousef

On Sun, 20 Aug 2023 at 22:34, Qais Yousef <qyousef@layalina.io> wrote:
>
> From: Qais Yousef <qais.yousef@arm.com>
>
> If a misfit task is affined to a subset of the possible cpus, we need to
> verify that one of these cpus can fit it. Otherwise the load balancer
> code will continuously trigger needlessly, causing balance_interval to
> grow in response, and we eventually end up in a situation where real
> imbalances take a long time to address because of this impossible
> imbalance.
>
> This can happen in the Android world, where it's common for background
> tasks to be restricted to little cores.
>
> Similarly, if the task doesn't fit even on the biggest core, triggering
> misfit is pointless as that is the best it can ever get on this system.
>
> To speed the search up, don't call task_fits_cpu(), which would
> repeatedly call uclamp_eff_value() for the same task. Call
> util_fits_cpu() instead, and only do so when we see a cpu with a higher
> capacity level than that of cpu_of(rq).
>
> Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
> ---
>  kernel/sched/fair.c | 50 ++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 43 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0b7445cd5af9..f08c5f3bf895 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4853,17 +4853,50 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
>
>  static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
>  {
> +       unsigned long uclamp_min, uclamp_max;
> +       unsigned long util, cap_level;
> +       bool has_fitting_cpu = false;
> +       int cpu = cpu_of(rq);
> +
>         if (!sched_asym_cpucap_active())
>                 return;
>
> -       if (!p || p->nr_cpus_allowed == 1) {
> -               rq->misfit_task_load = 0;
> -               return;
> -       }
> +       if (!p || p->nr_cpus_allowed == 1)
> +               goto out;
>
> -       if (task_fits_cpu(p, cpu_of(rq))) {
> -               rq->misfit_task_load = 0;
> -               return;
> +       uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
> +       uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
> +       util = task_util_est(p);
> +
> +       if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
> +               goto out;
> +
> +       cap_level = capacity_orig_of(cpu);
> +
> +       /* If we can't fit the biggest CPU, that's the best we can ever get. */
> +       if (cap_level == SCHED_CAPACITY_SCALE)
> +               goto out;
> +
> +       /*
> +        * If the task affinity is not set to default, make sure it is not
> +        * restricted to a subset where no CPU can ever fit it. Triggering
> +        * misfit in this case is pointless as it has nowhere better to move
> +        * to. And it can cause balance_interval to grow too high as we'll
> +        * continuously fail to move it anywhere.
> +        */
> +       if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
> +               for_each_cpu(cpu, p->cpus_ptr) {

I haven't looked at the problem in detail or at other possibilities so
far, but for_each_cpu() doesn't scale, and update_misfit_status() is
called from pick_next_task_fair(), so you must find another way to
detect this.
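
[ A minimal sketch of one such alternative, not taken from this thread:
  precompute the biggest capacity reachable through p's affinity at
  affinity-change time (which is rare), leaving an O(1) comparison on
  the pick path. max_allowed_capacity is a hypothetical task_struct
  field, not an existing one. ]

static void update_max_allowed_capacity(struct task_struct *p)
{
	unsigned long max_cap = 0;
	int cpu;

	/* Affinity changes are rare; an O(nr_cpus) walk is fine here. */
	for_each_cpu(cpu, p->cpus_ptr)
		max_cap = max(max_cap, arch_scale_cpu_capacity(cpu));

	p->max_allowed_capacity = max_cap;	/* hypothetical field */
}

/* update_misfit_status() could then detect the hopeless case in O(1): */
if (capacity_orig_of(cpu_of(rq)) == p->max_allowed_capacity)
	goto out;	/* nowhere better to move to */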


> +                       if (cap_level < capacity_orig_of(cpu)) {
> +                               cap_level = capacity_orig_of(cpu);
> +                               if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0) {
> +                                       has_fitting_cpu = true;
> +                                       break;
> +                               }
> +                       }
> +               }
> +
> +               if (!has_fitting_cpu)
> +                       goto out;
>         }
>
>         /*
> @@ -4871,6 +4904,9 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
>          * task_h_load() returns 0.
>          */
>         rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
> +       return;
> +out:
> +       rq->misfit_task_load = 0;
>  }
>
>  #else /* CONFIG_SMP */
> --
> 2.34.1
>


* Re: [PATCH] sched/fair: Check a task has a fitting cpu when updating misfit
  2023-08-29 14:10 ` Vincent Guittot
@ 2023-08-29 15:35   ` Qais Yousef
  2023-09-04 13:18     ` Dietmar Eggemann
  0 siblings, 1 reply; 5+ messages in thread
From: Qais Yousef @ 2023-08-29 15:35 UTC
  To: Vincent Guittot
  Cc: Ingo Molnar, Peter Zijlstra, Dietmar Eggemann, linux-kernel, Qais Yousef

On 08/29/23 16:10, Vincent Guittot wrote:
> On Sun, 20 Aug 2023 at 22:34, Qais Yousef <qyousef@layalina.io> wrote:
> >
> > From: Qais Yousef <qais.yousef@arm.com>
> >
> > If a misfit task is affined to a subset of the possible cpus, we need to
> > verify that one of these cpus can fit it. Otherwise the load balancer
> > code will continuously trigger needlessly, causing balance_interval to
> > grow in response, and we eventually end up in a situation where real
> > imbalances take a long time to address because of this impossible
> > imbalance.
> >
> > This can happen in the Android world, where it's common for background
> > tasks to be restricted to little cores.
> >
> > Similarly, if the task doesn't fit even on the biggest core, triggering
> > misfit is pointless as that is the best it can ever get on this system.
> >
> > To speed the search up, don't call task_fits_cpu(), which would
> > repeatedly call uclamp_eff_value() for the same task. Call
> > util_fits_cpu() instead, and only do so when we see a cpu with a higher
> > capacity level than that of cpu_of(rq).
> >
> > Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> > Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
> > ---
> >  kernel/sched/fair.c | 50 ++++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 43 insertions(+), 7 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 0b7445cd5af9..f08c5f3bf895 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -4853,17 +4853,50 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
> >
> >  static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> >  {
> > +       unsigned long uclamp_min, uclamp_max;
> > +       unsigned long util, cap_level;
> > +       bool has_fitting_cpu = false;
> > +       int cpu = cpu_of(rq);
> > +
> >         if (!sched_asym_cpucap_active())
> >                 return;
> >
> > -       if (!p || p->nr_cpus_allowed == 1) {
> > -               rq->misfit_task_load = 0;
> > -               return;
> > -       }
> > +       if (!p || p->nr_cpus_allowed == 1)
> > +               goto out;
> >
> > -       if (task_fits_cpu(p, cpu_of(rq))) {
> > -               rq->misfit_task_load = 0;
> > -               return;
> > +       uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
> > +       uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
> > +       util = task_util_est(p);
> > +
> > +       if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
> > +               goto out;
> > +
> > +       cap_level = capacity_orig_of(cpu);
> > +
> > +       /* If we can't fit the biggest CPU, that's the best we can ever get. */
> > +       if (cap_level == SCHED_CAPACITY_SCALE)
> > +               goto out;
> > +
> > +       /*
> > +        * If the task affinity is not set to default, make sure it is not
> > +        * restricted to a subset where no CPU can ever fit it. Triggering
> > +        * misfit in this case is pointless as it has nowhere better to move
> > +        * to. And it can cause balance_interval to grow too high as we'll
> > +        * continuously fail to move it anywhere.
> > +        */
> > +       if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
> > +               for_each_cpu(cpu, p->cpus_ptr) {
> 
> I haven't looked at the problem in detail or at other possibilities so
> far, but for_each_cpu() doesn't scale, and update_misfit_status() is
> called from pick_next_task_fair(), so you must find another way to
> detect this.

Okay, will do.


Thanks

--
Qais Yousef

> 
> 
> > +                       if (cap_level < capacity_orig_of(cpu)) {
> > +                               cap_level = capacity_orig_of(cpu);
> > +                               if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0) {
> > +                                       has_fitting_cpu = true;
> > +                                       break;
> > +                               }
> > +                       }
> > +               }
> > +
> > +               if (!has_fitting_cpu)
> > +                       goto out;
> >         }
> >
> >         /*
> > @@ -4871,6 +4904,9 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> >          * task_h_load() returns 0.
> >          */
> >         rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
> > +       return;
> > +out:
> > +       rq->misfit_task_load = 0;
> >  }
> >
> >  #else /* CONFIG_SMP */
> > --
> > 2.34.1
> >


* Re: [PATCH] sched/fair: Check a task has a fitting cpu when updating misfit
  2023-08-29 15:35   ` Qais Yousef
@ 2023-09-04 13:18     ` Dietmar Eggemann
  2023-09-04 19:38       ` Qais Yousef
  0 siblings, 1 reply; 5+ messages in thread
From: Dietmar Eggemann @ 2023-09-04 13:18 UTC
  To: Qais Yousef, Vincent Guittot
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Qais Yousef

On 29/08/2023 17:35, Qais Yousef wrote:
> On 08/29/23 16:10, Vincent Guittot wrote:
>> On Sun, 20 Aug 2023 at 22:34, Qais Yousef <qyousef@layalina.io> wrote:

[...]

>>>  kernel/sched/fair.c | 50 ++++++++++++++++++++++++++++++++++++++-------
>>>  1 file changed, 43 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 0b7445cd5af9..f08c5f3bf895 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -4853,17 +4853,50 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
>>>
>>>  static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
>>>  {
>>> +       unsigned long uclamp_min, uclamp_max;
>>> +       unsigned long util, cap_level;
>>> +       bool has_fitting_cpu = false;
>>> +       int cpu = cpu_of(rq);
>>> +
>>>         if (!sched_asym_cpucap_active())
>>>                 return;
>>>
>>> -       if (!p || p->nr_cpus_allowed == 1) {
>>> -               rq->misfit_task_load = 0;
>>> -               return;
>>> -       }
>>> +       if (!p || p->nr_cpus_allowed == 1)
>>> +               goto out;
>>>
>>> -       if (task_fits_cpu(p, cpu_of(rq))) {
>>> -               rq->misfit_task_load = 0;
>>> -               return;
>>> +       uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
>>> +       uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
>>> +       util = task_util_est(p);
>>> +
>>> +       if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
>>> +               goto out;

util_fits_cpu() checks fits_capacity(util, capacity_of(cpu)) but the
capacity pressure could change between update_misfit_status() and CFS lb?

>>> +
>>> +       cap_level = capacity_orig_of(cpu);
>>> +
>>> +       /* If we can't fit the biggest CPU, that's the best we can ever get. */
>>> +       if (cap_level == SCHED_CAPACITY_SCALE)
>>> +               goto out;
>>> +
>>> +       /*
>>> +        * If the task affinity is not set to default, make sure it is not
>>> +        * restricted to a subset where no CPU can ever fit it. Triggering
>>> +        * misfit in this case is pointless as it has no where better to move
>>> +        * to. And it can lead to balance_interval to grow too high as we'll
>>> +        * continuously fail to move it anywhere.
>>> +        */
>>> +       if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
>>> +               for_each_cpu(cpu, p->cpus_ptr) {
>>
>> I haven't looked at the problem in detail or at other possibilities so
>> far, but for_each_cpu() doesn't scale, and update_misfit_status() is
>> called from pick_next_task_fair(), so you must find another way to
>> detect this.
> 
> Okay, will do.

We have LIST_HEAD(asym_cap_list) (list of cpumasks according to
cpu_capacity_orig CPU groups) in kernel/sched/topology.c to set
SD_ASYM_CPUCAPACITY{,_FULL} for asymmetric CPU capacity systems.
Maybe this could be made usable in fair.c as well?

But checking via util_fits_cpu() wouldn't work then since it's per-CPU.
The check of p's CPU affinity, its uclamped util_avg value and the
cpu_capacity_orig is sufficient here. Then using those cpumasks could work.
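
[ To illustrate: the asym_cap_data layout below matches what
  kernel/sched/topology.c builds for asym_cap_list; the helper itself
  is a hypothetical sketch of how fair.c might consume it, iterating
  once per capacity level rather than once per CPU. ]

/* As defined in kernel/sched/topology.c: */
struct asym_cap_data {
	struct list_head link;
	unsigned long capacity;
	unsigned long cpus[];
};

/* Sketch: does p's affinity reach any capacity level above 'cap'? */
static bool has_bigger_fitting_cpu(struct task_struct *p, unsigned long cap)
{
	struct asym_cap_data *entry;

	list_for_each_entry(entry, &asym_cap_list, link) {
		if (entry->capacity <= cap)
			continue;
		if (cpumask_intersects(p->cpus_ptr, to_cpumask(entry->cpus)))
			return true;
	}
	return false;
}

[ Note the list is rebuilt on topology changes, which is exactly the
  RCU concern raised in the reply below. ]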


* Re: [PATCH] sched/fair: Check a task has a fitting cpu when updating misfit
  2023-09-04 13:18     ` Dietmar Eggemann
@ 2023-09-04 19:38       ` Qais Yousef
  0 siblings, 0 replies; 5+ messages in thread
From: Qais Yousef @ 2023-09-04 19:38 UTC
  To: Dietmar Eggemann
  Cc: Vincent Guittot, Ingo Molnar, Peter Zijlstra, linux-kernel, Qais Yousef

On 09/04/23 15:18, Dietmar Eggemann wrote:
> On 29/08/2023 17:35, Qais Yousef wrote:
> > On 08/29/23 16:10, Vincent Guittot wrote:
> >> On Sun, 20 Aug 2023 at 22:34, Qais Yousef <qyousef@layalina.io> wrote:
> 
> [...]
> 
> >>>  kernel/sched/fair.c | 50 ++++++++++++++++++++++++++++++++++++++-------
> >>>  1 file changed, 43 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>> index 0b7445cd5af9..f08c5f3bf895 100644
> >>> --- a/kernel/sched/fair.c
> >>> +++ b/kernel/sched/fair.c
> >>> @@ -4853,17 +4853,50 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
> >>>
> >>>  static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> >>>  {
> >>> +       unsigned long uclamp_min, uclamp_max;
> >>> +       unsigned long util, cap_level;
> >>> +       bool has_fitting_cpu = false;
> >>> +       int cpu = cpu_of(rq);
> >>> +
> >>>         if (!sched_asym_cpucap_active())
> >>>                 return;
> >>>
> >>> -       if (!p || p->nr_cpus_allowed == 1) {
> >>> -               rq->misfit_task_load = 0;
> >>> -               return;
> >>> -       }
> >>> +       if (!p || p->nr_cpus_allowed == 1)
> >>> +               goto out;
> >>>
> >>> -       if (task_fits_cpu(p, cpu_of(rq))) {
> >>> -               rq->misfit_task_load = 0;
> >>> -               return;
> >>> +       uclamp_min = uclamp_eff_value(p, UCLAMP_MIN);
> >>> +       uclamp_max = uclamp_eff_value(p, UCLAMP_MAX);
> >>> +       util = task_util_est(p);
> >>> +
> >>> +       if (util_fits_cpu(util, uclamp_min, uclamp_max, cpu) > 0)
> >>> +               goto out;
> 
> util_fits_cpu() checks fits_capacity(util, capacity_of(cpu)) but the
> capacity pressure could change between update_misfit_status() and CFS lb?

Do we need to be precise here? I think the race is not a problem as long as
we're not reading garbage values, which I don't think we do.

FWIW, task_fits_cpu() also used util_fits_cpu(), so if this is a problem it's
not something introduced by this patch at least.

The trade-off is to be lockless but live with potentially slightly
outdated values, or to be precise and hold some locks to force
serialization.

> 
> >>> +
> >>> +       cap_level = capacity_orig_of(cpu);
> >>> +
> >>> +       /* If we can't fit the biggest CPU, that's the best we can ever get. */
> >>> +       if (cap_level == SCHED_CAPACITY_SCALE)
> >>> +               goto out;
> >>> +
> >>> +       /*
> >>> +        * If the task affinity is not set to default, make sure it is not
> >>> +        * restricted to a subset where no CPU can ever fit it. Triggering
> >>> +        * misfit in this case is pointless as it has nowhere better to move
> >>> +        * to. And it can cause balance_interval to grow too high as we'll
> >>> +        * continuously fail to move it anywhere.
> >>> +        */
> >>> +       if (!cpumask_equal(p->cpus_ptr, cpu_possible_mask)) {
> >>> +               for_each_cpu(cpu, p->cpus_ptr) {
> >>
> >> I haven't looked at the problem in detail or at other possibilities so
> >> far, but for_each_cpu() doesn't scale, and update_misfit_status() is
> >> called from pick_next_task_fair(), so you must find another way to
> >> detect this.
> > 
> > Okay, will do.
> 
> We have LIST_HEAD(asym_cap_list) (list of cpumasks according to
> cpu_capacity_orig CPU groups) in kernel/sched/topology.c to set
> SD_ASYM_CPUCAPACITY{,_FULL} for asymmetric CPU capacity systems.
> Maybe this could be made usable in fair.c as well?

Yeah, it could help to implement a for_each_cap_level() iterator that
can be safely used from anywhere.

I remember looking at the topology code in the past, but the issue I
found then is that we need something RCU protected to truly allow it to
be used concurrently without overhead (like we do for pd). So a bit of
rework is required.
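
[ For illustration, the concurrent traversal described above would
  follow the usual RCU pattern, assuming the topology code is reworked
  so that list updates become RCU-safe (e.g. removal via kfree_rcu());
  that rework is hypothetical here. ]

rcu_read_lock();
list_for_each_entry_rcu(entry, &asym_cap_list, link) {
	/* entry->capacity and entry->cpus can be read safely here. */
}
rcu_read_unlock();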

> 
> But checking via util_fits_cpu() wouldn't work then since it's per-CPU.
> The check of p's CPU affinity, its uclamped util_avg value and the
> cpu_capacity_orig is sufficient here. Then using those cpumasks could work.

Thanks for the suggestions. Yes, we can skip util_fits_cpu() and just
compare against capacity_orig directly, as that is all we truly care
about here.
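
[ Combined with the sketch earlier in the thread, the affinity check
  would then reduce to something like the following (hypothetical): ]

/* No allowed CPU has more capacity than where the task already is? */
if (!has_bigger_fitting_cpu(p, capacity_orig_of(cpu)))
	goto out;	/* nowhere better to move to: not misfit */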


Thanks!

--
Qais Yousef

