* [PATCH] sched/fair: prefer prev cpu in asymmetric wakeup path
@ 2020-10-22 13:43 Vincent Guittot
  2020-10-22 14:53 ` Valentin Schneider
  0 siblings, 1 reply; 7+ messages in thread
From: Vincent Guittot @ 2020-10-22 13:43 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, linux-kernel, valentin.schneider, morten.rasmussen
  Cc: Vincent Guittot

During the fast wakeup path, the scheduler always checks whether the local
or prev CPUs are good candidates for the task before looking for other CPUs
in the domain. With
  commit b7a331615d25 ("sched/fair: Add asymmetric CPU capacity wakeup scan")
heterogeneous systems gained a dedicated path that doesn't try to keep
reusing the prev CPU whenever possible. If the previous CPU is idle and
belongs to the asymmetric domain, we should check it first before looking
for another CPU, because it remains one of the best candidates and it
stabilizes task placement on the system.

This change aligns the asymmetric path behavior with the symmetric one and
reduces the cases where the task migrates across all CPUs of the
sd_asym_cpucapacity domains at wakeup.

This change does not impact the normal EAS mode; it only affects the
overloaded case or systems where EAS is not used.

On hikey960 with the performance governor (EAS disabled):

./perf bench sched pipe -T -l 150000
             mainline           w/ patch
# migrations   299811                  3
ops/sec        154535(+/-0.13%)   181754(+/-0.29%)  +17%

Fixes: b7a331615d25 ("sched/fair: Add asymmetric CPU capacity wakeup scan")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index aa4c6227cd6d..f39638fe6b94 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6170,7 +6170,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
  * maximize capacity.
  */
 static int
-select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
+select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int prev, int target)
 {
 	unsigned long best_cap = 0;
 	int cpu, best_cpu = -1;
@@ -6178,9 +6178,22 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
 
 	sync_entity_load_avg(&p->se);
 
+	if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
+	    task_fits_capacity(p, capacity_of(target)))
+		return target;
+
 	cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
 	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
 
+	/*
+	 * If the previous CPU belongs to this asymmetric domain and is idle,
+	 * check it 1st as it's the best candidate.
+	 */
+	if (prev != target && cpumask_test_cpu(prev, cpus) &&
+	    (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
+	    task_fits_capacity(p, capacity_of(prev)))
+		return prev;
+
 	for_each_cpu_wrap(cpu, cpus, target) {
 		unsigned long cpu_cap = capacity_of(cpu);
 
@@ -6223,7 +6236,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 		if (!sd)
 			goto symmetric;
 
-		i = select_idle_capacity(p, sd, target);
+		i = select_idle_capacity(p, sd, prev, target);
 		return ((unsigned)i < nr_cpumask_bits) ? i : target;
 	}
 
-- 
2.17.1



* Re: [PATCH] sched/fair: prefer prev cpu in asymmetric wakeup path
  2020-10-22 13:43 [PATCH] sched/fair: prefer prev cpu in asymmetric wakeup path Vincent Guittot
@ 2020-10-22 14:53 ` Valentin Schneider
  2020-10-22 15:33   ` Vincent Guittot
  0 siblings, 1 reply; 7+ messages in thread
From: Valentin Schneider @ 2020-10-22 14:53 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, linux-kernel, morten.rasmussen


Hi Vincent,

On 22/10/20 14:43, Vincent Guittot wrote:
> During the fast wakeup path, the scheduler always checks whether the local
> or prev CPUs are good candidates for the task before looking for other CPUs
> in the domain. With
>   commit b7a331615d25 ("sched/fair: Add asymmetric CPU capacity wakeup scan")
> heterogeneous systems gained a dedicated path that doesn't try to keep
> reusing the prev CPU whenever possible. If the previous CPU is idle and
> belongs to the asymmetric domain, we should check it first before looking
> for another CPU, because it remains one of the best candidates and it
> stabilizes task placement on the system.
>
> This change aligns the asymmetric path behavior with the symmetric one and
> reduces the cases where the task migrates across all CPUs of the
> sd_asym_cpucapacity domains at wakeup.
>
> This change does not impact the normal EAS mode; it only affects the
> overloaded case or systems where EAS is not used.
>
> On hikey960 with the performance governor (EAS disabled):
>
> ./perf bench sched pipe -T -l 150000
>              mainline           w/ patch
> # migrations   299811                  3

Colour me impressed!

Now AFAICT the only thing that makes new_cpu != prev_cpu in
select_task_rq_fair() is the WAKE_AFFINE stuff, and the likelihood of that
happening increases when WF_SYNC is set (which the Android binder uses, at
least on a mainline tree). I had severely underestimated how often that
thing picks this_cpu.
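
(For reference, the WF_SYNC shortcut in question lives in wake_affine_idle();
the sketch below only approximates its shape in mainline around that time and
is not quoted from this thread:)

  static int
  wake_affine_idle(int this_cpu, int prev_cpu, int sync)
  {
          /* Waker idle and cache affine with prev: prefer an idle prev */
          if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
                  return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;

          /* WF_SYNC: the waker is about to sleep, so this_cpu looks attractive */
          if (sync && cpu_rq(this_cpu)->nr_running == 1)
                  return this_cpu;

          return nr_cpumask_bits;
  }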

> ops/sec        154535(+/-0.13%)   181754(+/-0.29%)  +17%
>
> Fixes: b7a331615d25 ("sched/fair: Add asymmetric CPU capacity wakeup scan")
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index aa4c6227cd6d..f39638fe6b94 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6170,7 +6170,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>   * maximize capacity.
>   */
>  static int
> -select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> +select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int prev, int target)
>  {
>       unsigned long best_cap = 0;
>       int cpu, best_cpu = -1;
> @@ -6178,9 +6178,22 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
>
>       sync_entity_load_avg(&p->se);
>
> +	if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
> +	    task_fits_capacity(p, capacity_of(target)))
> +		return target;
> +

I think we still need to check for CPU affinity here.

>       cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>       cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>
> +	/*
> +	 * If the previous CPU belongs to this asymmetric domain and is idle,
> +	 * check it 1st as it's the best candidate.
> +	 */
> +	if (prev != target && cpumask_test_cpu(prev, cpus) &&
> +	    (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> +	    task_fits_capacity(p, capacity_of(prev)))
> +		return prev;
> +
>       for_each_cpu_wrap(cpu, cpus, target) {

So we prioritize target over prev, like the rest of the
select_idle_sibling() family. Here however we apply the same acceptability
function to target, prev and the loop body, so perhaps we could simplify
this to:

  if (accept(target))
      return target;

  ...

  for_each_cpu_wrap(cpu, cpus, prev) {
      ...
  }

That way we evaluate target twice only if it isn't a direct candidate
(but might be a fallback one).
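
A minimal sketch of what that could look like (the asym_cpu_ok() name is
invented here for illustration, and the loop body is reconstructed from
mainline rather than from the quoted hunk; building 'cpus' before the target
check also covers the affinity concern above):

  static inline bool asym_cpu_ok(struct task_struct *p, int cpu)
  {
          return (available_idle_cpu(cpu) || sched_idle_cpu(cpu)) &&
                 task_fits_capacity(p, capacity_of(cpu));
  }

  static int
  select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int prev, int target)
  {
          unsigned long best_cap = 0;
          int cpu, best_cpu = -1;
          struct cpumask *cpus;

          sync_entity_load_avg(&p->se);

          cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
          cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);

          /* target is checked once, with the same criteria as the scan below */
          if (cpumask_test_cpu(target, cpus) && asym_cpu_ok(p, target))
                  return target;

          /* Start the scan at prev so it is the first fallback evaluated */
          for_each_cpu_wrap(cpu, cpus, prev) {
                  unsigned long cpu_cap = capacity_of(cpu);

                  if (!available_idle_cpu(cpu) && !sched_idle_cpu(cpu))
                          continue;
                  if (task_fits_capacity(p, cpu_cap))
                          return cpu;

                  /* Track the biggest idle CPU as a fallback */
                  if (cpu_cap > best_cap) {
                          best_cap = cpu_cap;
                          best_cpu = cpu;
                  }
          }

          return best_cpu;
  }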

>               unsigned long cpu_cap = capacity_of(cpu);
>
> @@ -6223,7 +6236,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>               if (!sd)
>                       goto symmetric;
>
> -		i = select_idle_capacity(p, sd, target);
> +		i = select_idle_capacity(p, sd, prev, target);
>               return ((unsigned)i < nr_cpumask_bits) ? i : target;
>       }


* Re: [PATCH] sched/fair: prefer prev cpu in asymmetric wakeup path
  2020-10-22 14:53 ` Valentin Schneider
@ 2020-10-22 15:33   ` Vincent Guittot
  2020-10-22 17:45     ` Valentin Schneider
  2020-10-23 17:14     ` Dietmar Eggemann
  0 siblings, 2 replies; 7+ messages in thread
From: Vincent Guittot @ 2020-10-22 15:33 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, linux-kernel,
	Morten Rasmussen

On Thu, 22 Oct 2020 at 16:53, Valentin Schneider
<valentin.schneider@arm.com> wrote:
>
>
> Hi Vincent,
>
> On 22/10/20 14:43, Vincent Guittot wrote:
> > During the fast wakeup path, the scheduler always checks whether the local
> > or prev CPUs are good candidates for the task before looking for other CPUs
> > in the domain. With
> >   commit b7a331615d25 ("sched/fair: Add asymmetric CPU capacity wakeup scan")
> > heterogeneous systems gained a dedicated path that doesn't try to keep
> > reusing the prev CPU whenever possible. If the previous CPU is idle and
> > belongs to the asymmetric domain, we should check it first before looking
> > for another CPU, because it remains one of the best candidates and it
> > stabilizes task placement on the system.
> >
> > This change aligns the asymmetric path behavior with the symmetric one and
> > reduces the cases where the task migrates across all CPUs of the
> > sd_asym_cpucapacity domains at wakeup.
> >
> > This change does not impact the normal EAS mode; it only affects the
> > overloaded case or systems where EAS is not used.
> >
> > On hikey960 with the performance governor (EAS disabled):
> >
> > ./perf bench sched pipe -T -l 150000
> >              mainline           w/ patch
> > # migrations   299811                  3
>
> Colour me impressed!
>
> Now AFAICT the only thing that makes new_cpu != prev_cpu in
> select_task_rq_fair() is the WAKE_AFFINE stuff, and the likelihood of that
> happening increases when WF_SYNC is set (which the Android binder uses, at
> least on a mainline tree). I had severely underestimated how often that
> thing picks this_cpu.
>
> > ops/sec        154535(+/-0.13%)   181754(+/-0.29%)  +17%
> >
> > Fixes: b7a331615d25 ("sched/fair: Add asymmetric CPU capacity wakeup scan")
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> >  kernel/sched/fair.c | 17 +++++++++++++++--
> >  1 file changed, 15 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index aa4c6227cd6d..f39638fe6b94 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6170,7 +6170,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> >   * maximize capacity.
> >   */
> >  static int
> > -select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> > +select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int prev, int target)
> >  {
> >       unsigned long best_cap = 0;
> >       int cpu, best_cpu = -1;
> > @@ -6178,9 +6178,22 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> >
> >       sync_entity_load_avg(&p->se);
> >
> > +     if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
> > +         task_fits_capacity(p, capacity_of(target)))
> > +             return target;
> > +
>
> I think we still need to check for CPU affinity here.

yes good point

>
> >       cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> >       cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> >
> > +     /*
> > +      * If the previous CPU belongs to this asymmetric domain and is idle,
> > +      * check it 1st as it's the best candidate.
> > +      */
> > +     if (prev != target && cpumask_test_cpu(prev, cpus) &&
> > +         (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> > +         task_fits_capacity(p, capacity_of(prev)))
> > +             return prev;
> > +
> >       for_each_cpu_wrap(cpu, cpus, target) {
>
> So we prioritize target over prev, like the rest of the
> select_idle_sibling() family. Here however we apply the same acceptability
> function to target, prev and the loop body, so perhaps we could simplify
> this to:

My 1st implementation was similar to your proposal below, but I finally
decided to strictly follow the same sequence as the symmetric path, which:
- checks target
- then the prev CPU
- and finally uses target as the starting point of the loop when
looking for another CPU

Using prev as the starting point of the loop will change which CPU
will be selected, but I don't have a strong opinion on whether this will
make a real difference in the end, because bit position doesn't imply any
relation with other CPUs.

So I'm fine to go with your proposal below

Also, I wonder if I should also add the test of p->recent_used_cpu and
the per-CPU kthread optimization, which benefits XFS IIRC.
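
(For reference, the two symmetric-path shortcuts being referred to look
roughly like this in mainline at the time; approximated, not quoted from
this thread:)

  /* Allow a per-CPU kthread to stack with the wakee it just woke */
  if (is_per_cpu_kthread(current) &&
      prev == smp_processor_id() &&
      this_rq()->nr_running <= 1)
          return prev;

  /* Check a recently used CPU as a potential idle candidate */
  recent_used_cpu = p->recent_used_cpu;
  if (recent_used_cpu != prev &&
      recent_used_cpu != target &&
      cpus_share_cache(recent_used_cpu, target) &&
      (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
      cpumask_test_cpu(recent_used_cpu, p->cpus_ptr)) {
          /* Swap in prev as the candidate for the next wakeup */
          p->recent_used_cpu = prev;
          return recent_used_cpu;
  }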

>
>   if (accept(target))
>       return target;
>
>   ...
>
>   for_each_cpu_wrap(cpu, cpus, prev) {
>       ...
>   }
>
> That way we evaluate target twice only if it isn't a direct candidate
> (but might be a fallback one).
>
> >               unsigned long cpu_cap = capacity_of(cpu);
> >
> > @@ -6223,7 +6236,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >               if (!sd)
> >                       goto symmetric;
> >
> > -             i = select_idle_capacity(p, sd, target);
> > +             i = select_idle_capacity(p, sd, prev, target);
> >               return ((unsigned)i < nr_cpumask_bits) ? i : target;
> >       }


* Re: [PATCH] sched/fair: prefer prev cpu in asymmetric wakeup path
  2020-10-22 15:33   ` Vincent Guittot
@ 2020-10-22 17:45     ` Valentin Schneider
  2020-10-23  7:15       ` Vincent Guittot
  2020-10-23 17:14     ` Dietmar Eggemann
  1 sibling, 1 reply; 7+ messages in thread
From: Valentin Schneider @ 2020-10-22 17:45 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, linux-kernel,
	Morten Rasmussen


On 22/10/20 16:33, Vincent Guittot wrote:
> On Thu, 22 Oct 2020 at 16:53, Valentin Schneider
> <valentin.schneider@arm.com> wrote:
>> > @@ -6170,7 +6170,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>> >   * maximize capacity.
>> >   */
>> >  static int
>> > -select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
>> > +select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int prev, int target)
>> >  {
>> >       unsigned long best_cap = 0;
>> >       int cpu, best_cpu = -1;
>> > @@ -6178,9 +6178,22 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
>> >
>> >       sync_entity_load_avg(&p->se);
>> >
>> > +     if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
>> > +         task_fits_capacity(p, capacity_of(target)))
>> > +             return target;
>> > +
>>
>> I think we still need to check for CPU affinity here.
>
> yes good point
>
>>
>> >       cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>> >       cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>> >
>> > +     /*
>> > +      * If the previous CPU belongs to this asymmetric domain and is idle,
>> > +      * check it 1st as it's the best candidate.
>> > +      */
>> > +     if (prev != target && cpumask_test_cpu(prev, cpus) &&
>> > +         (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
>> > +         task_fits_capacity(p, capacity_of(prev)))
>> > +             return prev;
>> > +
>> >       for_each_cpu_wrap(cpu, cpus, target) {
>>
>> So we prioritize target over prev, like the rest of the
>> select_idle_sibling() family. Here however we apply the same acceptability
>> function to target, prev and the loop body, so perhaps we could simplify
>> this to:
>
> My 1st implementation was similar to your proposal below, but I finally
> decided to strictly follow the same sequence as the symmetric path, which:
> - checks target
> - then the prev CPU
> - and finally uses target as the starting point of the loop when
> looking for another CPU
>
> Using prev as the starting point of the loop will change which CPU
> will be selected, but I don't have a strong opinion on whether this will
> make a real difference in the end, because bit position doesn't imply any
> relation with other CPUs.
>

Yep, also one difference with the symmetric path here is that the first
checks against target & prev use exactly the same criteria as the loop
body, so we shouldn't feel shy about doing this here.

> So I'm fine to go with your proposal below
>
> Also, I wonder if I should also add the test of p->recent_used_cpu and
> the per-CPU kthread optimization, which benefits XFS IIRC.
>

If we head down that route it would be nice to reuse the existing
conditions (rather than copy and tweak them) and move the asymmetric loop
further down. Maybe with something like (with a better name though):

  static inline bool asym_task_fits_capacity(struct task_struct *p, int cpu)
  {
          if (!static_branch_unlikely(&sched_asym_cpucapacity))
                  return true;

          return task_fits_capacity(p, capacity_of(cpu));
  }

and we could && that to the existing cases. Food for thought.
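
A rough sketch of how that helper could be &&-ed into the existing
select_idle_sibling() shortcuts (conditions approximated from mainline at
the time, not taken verbatim from this thread):

  if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
      asym_task_fits_capacity(p, target))
          return target;

  /* If the previous CPU is cache affine and idle, don't be stupid: */
  if (prev != target && cpus_share_cache(prev, target) &&
      (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
      asym_task_fits_capacity(p, prev))
          return prev;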

>>
>>   if (accept(target))
>>       return target;
>>
>>   ...
>>
>>   for_each_cpu_wrap(cpu, cpus, prev) {
>>       ...
>>   }
>>
>> That way we evaluate target twice only if it isn't a direct candidate
>> (but might be a fallback one).
>>
>> >               unsigned long cpu_cap = capacity_of(cpu);
>> >
>> > @@ -6223,7 +6236,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>> >               if (!sd)
>> >                       goto symmetric;
>> >
>> > -             i = select_idle_capacity(p, sd, target);
>> > +             i = select_idle_capacity(p, sd, prev, target);
>> >               return ((unsigned)i < nr_cpumask_bits) ? i : target;
>> >       }


* Re: [PATCH] sched/fair: prefer prev cpu in asymmetric wakeup path
  2020-10-22 17:45     ` Valentin Schneider
@ 2020-10-23  7:15       ` Vincent Guittot
  0 siblings, 0 replies; 7+ messages in thread
From: Vincent Guittot @ 2020-10-23  7:15 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, linux-kernel,
	Morten Rasmussen

On Thu, 22 Oct 2020 at 19:46, Valentin Schneider
<valentin.schneider@arm.com> wrote:
>
>
> On 22/10/20 16:33, Vincent Guittot wrote:
> > On Thu, 22 Oct 2020 at 16:53, Valentin Schneider
> > <valentin.schneider@arm.com> wrote:
> >> > @@ -6170,7 +6170,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> >> >   * maximize capacity.
> >> >   */
> >> >  static int
> >> > -select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> >> > +select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int prev, int target)
> >> >  {
> >> >       unsigned long best_cap = 0;
> >> >       int cpu, best_cpu = -1;
> >> > @@ -6178,9 +6178,22 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> >> >
> >> >       sync_entity_load_avg(&p->se);
> >> >
> >> > +     if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
> >> > +         task_fits_capacity(p, capacity_of(target)))
> >> > +             return target;
> >> > +
> >>
> >> I think we still need to check for CPU affinity here.
> >
> > yes good point
> >
> >>
> >> >       cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> >> >       cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> >> >
> >> > +     /*
> >> > +      * If the previous CPU belongs to this asymmetric domain and is idle,
> >> > +      * check it 1st as it's the best candidate.
> >> > +      */
> >> > +     if (prev != target && cpumask_test_cpu(prev, cpus) &&
> >> > +         (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> >> > +         task_fits_capacity(p, capacity_of(prev)))
> >> > +             return prev;
> >> > +
> >> >       for_each_cpu_wrap(cpu, cpus, target) {
> >>
> >> So we prioritize target over prev, like the rest of the
> >> select_idle_sibling() family. Here however we apply the same acceptability
> >> function to target, prev and the loop body, so perhaps we could simplify
> >> this to:
> >
> > My 1st implementation was similar to your proposal below, but I finally
> > decided to strictly follow the same sequence as the symmetric path, which:
> > - checks target
> > - then the prev CPU
> > - and finally uses target as the starting point of the loop when
> > looking for another CPU
> >
> > Using prev as the starting point of the loop will change which CPU
> > will be selected, but I don't have a strong opinion on whether this will
> > make a real difference in the end, because bit position doesn't imply any
> > relation with other CPUs.
> >
>
> Yep, also one difference with the symmetric path here is that the first
> checks against target & prev use exactly the same criteria as the loop
> body, so we shouldn't feel shy about doing this here.
>
> > So I'm fine to go with your proposal below
> >
> > Also, I wonder if I should also add the test of p->recent_used_cpu and
> > the per-CPU kthread optimization, which benefits XFS IIRC.
> >
>
> If we head down that route it would be nice to reuse the existing
> conditions (rather than copy and tweak them) and move the asymmetric loop
> further down. Maybe with something like (with a better name though):

Yes, that would ensure that the asymmetric path stays aligned with the
symmetric one.

I'm going to look at this for the next version.

>
>   static inline bool asym_task_fits_capacity(struct task_struct *p, int cpu)
>   {
>           if (!static_branch_unlikely(&sched_asym_cpucapacity))
>                   return true;
>
>           return task_fits_capacity(p, capacity_of(cpu));
>   }
>
> and we could && that to the existing cases. Food for thought.
>
> >>
> >>   if (accept(target))
> >>       return target;
> >>
> >>   ...
> >>
> >>   for_each_cpu_wrap(cpu, cpus, prev) {
> >>       ...
> >>   }
> >>
> >> That way we evaluate target twice only if it isn't a direct candidate
> >> (but might be a fallback one).
> >>
> >> >               unsigned long cpu_cap = capacity_of(cpu);
> >> >
> >> > @@ -6223,7 +6236,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >> >               if (!sd)
> >> >                       goto symmetric;
> >> >
> >> > -             i = select_idle_capacity(p, sd, target);
> >> > +             i = select_idle_capacity(p, sd, prev, target);
> >> >               return ((unsigned)i < nr_cpumask_bits) ? i : target;
> >> >       }


* Re: [PATCH] sched/fair: prefer prev cpu in asymmetric wakeup path
  2020-10-22 15:33   ` Vincent Guittot
  2020-10-22 17:45     ` Valentin Schneider
@ 2020-10-23 17:14     ` Dietmar Eggemann
  2020-10-26  8:27       ` Vincent Guittot
  1 sibling, 1 reply; 7+ messages in thread
From: Dietmar Eggemann @ 2020-10-23 17:14 UTC (permalink / raw)
  To: Vincent Guittot, Valentin Schneider
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Steven Rostedt,
	Ben Segall, Mel Gorman, linux-kernel, Morten Rasmussen

On 22/10/2020 17:33, Vincent Guittot wrote:
> On Thu, 22 Oct 2020 at 16:53, Valentin Schneider
> <valentin.schneider@arm.com> wrote:
>>
>>
>> Hi Vincent,
>>
>> On 22/10/20 14:43, Vincent Guittot wrote:

[...]

>>>  static int
>>> -select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
>>> +select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int prev, int target)
>>>  {
>>>       unsigned long best_cap = 0;
>>>       int cpu, best_cpu = -1;
>>> @@ -6178,9 +6178,22 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
>>>
>>>       sync_entity_load_avg(&p->se);
>>>
>>> +     if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
>>> +         task_fits_capacity(p, capacity_of(target)))
>>> +             return target;
>>> +
>>
>> I think we still need to check for CPU affinity here.
> 
> yes good point

We don't check CPU affinity on target and prev in the symmetric case.

I always thought that since we:

(1) check 'want_affine = ... && cpumask_test_cpu(cpu, p->cpus_ptr);' in
    select_task_rq_fair() and

(2) have the select_fallback_rq() in select_task_rq() for prev,

this would be sufficient?
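
(For reference, the two safety nets above look roughly like this in mainline
around that time; paraphrased, not quoted from the thread:)

  /* (1) in select_task_rq_fair(): only consider the waking CPU if allowed */
  want_affine = !wake_wide(p) && cpumask_test_cpu(cpu, p->cpus_ptr);

  /* (2) in select_task_rq(): a disallowed pick (e.g. prev) is corrected */
  if (unlikely(!is_cpu_allowed(p, cpu)))
          cpu = select_fallback_rq(task_cpu(p), p);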

[...]


* Re: [PATCH] sched/fair: prefer prev cpu in asymmetric wakeup path
  2020-10-23 17:14     ` Dietmar Eggemann
@ 2020-10-26  8:27       ` Vincent Guittot
  0 siblings, 0 replies; 7+ messages in thread
From: Vincent Guittot @ 2020-10-26  8:27 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Valentin Schneider, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Steven Rostedt, Ben Segall, Mel Gorman, linux-kernel,
	Morten Rasmussen

On Fri, 23 Oct 2020 at 19:14, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>
> On 22/10/2020 17:33, Vincent Guittot wrote:
> > On Thu, 22 Oct 2020 at 16:53, Valentin Schneider
> > <valentin.schneider@arm.com> wrote:
> >>
> >>
> >> Hi Vincent,
> >>
> >> On 22/10/20 14:43, Vincent Guittot wrote:
>
> [...]
>
> >>>  static int
> >>> -select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> >>> +select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int prev, int target)
> >>>  {
> >>>       unsigned long best_cap = 0;
> >>>       int cpu, best_cpu = -1;
> >>> @@ -6178,9 +6178,22 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> >>>
> >>>       sync_entity_load_avg(&p->se);
> >>>
> >>> +     if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
> >>> +         task_fits_capacity(p, capacity_of(target)))
> >>> +             return target;
> >>> +
> >>
> >> I think we still need to check for CPU affinity here.
> >
> > yes good point
>
> We don't check CPU affinity on target and prev in the symmetric case.

Yes, that's what I noticed while reworking the patch to merge the asym
and symmetric paths.
>
> I always thought that since we:
>
> (1) check 'want_affine = ... && cpumask_test_cpu(cpu, p->cpus_ptr);' in
>     select_task_rq_fair() and
>
> (2) have the select_fallback_rq() in select_task_rq() for prev,
>
> this would be sufficient?
>
> [...]

