* [PATCH v2] sched/fair: check for idle core
@ 2020-10-22 13:15 Julia Lawall
  2020-10-23  8:40 ` Mel Gorman
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Julia Lawall @ 2020-10-22 13:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel

In the case of a thread wakeup, wake_affine determines whether a core
will be chosen for the thread on the socket where the thread ran
previously or on the socket of the waker.  This is done primarily by
comparing the load of the core where the thread ran previously (prev)
and the load of the waker's core (this).
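
The relevant decision flow is roughly the following; this is a
simplified sketch from memory of kernel/sched/fair.c around v5.9, not
the exact code, and the wake_affine_weight() helper and the
WA_IDLE/WA_WEIGHT feature bits are recalled here rather than quoted
from this thread:

static int wake_affine(struct sched_domain *sd, struct task_struct *p,
		       int this_cpu, int prev_cpu, int sync)
{
	int target = nr_cpumask_bits;

	/* Cheap idle-based heuristic first. */
	if (sched_feat(WA_IDLE))
		target = wake_affine_idle(this_cpu, prev_cpu, sync);

	/* No answer from the idle check: compare the loads of prev and this. */
	if (sched_feat(WA_WEIGHT) && target == nr_cpumask_bits)
		target = wake_affine_weight(sd, p, this_cpu, prev_cpu, sync);

	/* Still no answer: stay on prev. */
	return target == nr_cpumask_bits ? prev_cpu : target;
}

The load comparison is only reached when wake_affine_idle() gives no
answer, which is why a check added to wake_affine_idle() is enough to
short-circuit it.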

commit 11f10e5420f6 ("sched/fair: Use load instead of runnable load
in wakeup path") changed the load computation from the runnable load
to the load average, where the latter includes the load of threads
that have already blocked on the core.
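
To see why blocked load matters here, the following toy program (a
standalone illustration, not the kernel's PELT code; the decay factor
is only approximate) shows how the contribution of a task that has
just blocked disappears immediately from the runnable load but only
decays gradually out of the load average:

#include <stdio.h>

int main(void)
{
	/* Contribution of a daemon that just ran on prev and then blocked. */
	double load_avg = 1024.0;
	double runnable_load = 0.0;	/* nothing is runnable any more */
	double decay = 0.9786;		/* roughly PELT's per-ms decay (~32ms half-life) */

	for (int ms = 0; ms <= 32; ms += 8) {
		printf("t=%2dms  runnable_load=%6.1f  load_avg=%6.1f\n",
		       ms, runnable_load, load_avg);
		for (int i = 0; i < 8; i++)
			load_avg *= decay;	/* prev stays idle, its load decays */
	}
	return 0;
}

So for several tens of milliseconds an idle prev can still report a
sizeable load and lose the comparison against this.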

When a short-running daemon process happens to run on prev, this
change means that prev can appear to have a greater load than this,
even when prev is actually idle.  When prev and this are on the same
socket, the idle prev is detected later, in select_idle_sibling.  But
if that does not hold, prev is completely ignored, causing the waking
thread to move to the socket of the waker.  In the case of N mostly
active threads on N cores, this triggers other migrations and hurts
performance.

In contrast, before commit 11f10e5420f6, the load on an idle core
was 0, and in the case of a non-idle waker core, the effect of
wake_affine was to select prev as the target for searching for a core
for the waking thread.

To avoid unnecessary migrations, extend wake_affine_idle to check
whether the core where the thread previously ran is currently idle,
and if so simply return that core as the target.

[1] commit 11f10e5420f6ce ("sched/fair: Use load instead of runnable
load in wakeup path")

This particularly has an impact when using the ondemand power manager,
where kworkers run every 0.004 seconds on all cores, increasing the
likelihood that an idle core will be considered to have a load.

The following numbers were obtained with the benchmarking tool
hyperfine (https://github.com/sharkdp/hyperfine) on the NAS parallel
benchmarks (https://www.nas.nasa.gov/publications/npb.html).  The
tests were run on an 80-core Intel(R) Xeon(R) CPU E7-8870 v4 @
2.10GHz.  Active (intel_pstate) and passive (intel_cpufreq with the
ondemand governor) power management were used.  Times are in seconds.
All experiments use all
160 hardware threads.

	v5.9/intel-pstate	v5.9+patch/intel-pstate
bt.C.c	24.725724+-0.962340	23.349608+-1.607214
lu.C.x	29.105952+-4.804203	25.249052+-5.561617
sp.C.x	31.220696+-1.831335	30.227760+-2.429792
ua.C.x	26.606118+-1.767384	25.778367+-1.263850

	v5.9/ondemand		v5.9+patch/ondemand
bt.C.c	25.330360+-1.028316	23.544036+-1.020189
lu.C.x	35.872659+-4.872090	23.719295+-3.883848
sp.C.x	32.141310+-2.289541	29.125363+-0.872300
ua.C.x	29.024597+-1.667049	25.728888+-1.539772

On the smaller data sets (A and B) and on the other NAS benchmarks
there is no impact on performance.

This also has a major impact on the splash2x.volrend benchmark of the
PARSEC benchmark suite, which goes from 1m25 without this patch to
0m45 with it, in active (intel_pstate) mode.

Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

---
v2: rewrite the log message, add volrend information

 kernel/sched/fair.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index aa4c6227cd6d..9b23dad883ee 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5804,6 +5804,9 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
 	if (sync && cpu_rq(this_cpu)->nr_running == 1)
 		return this_cpu;
 
+	if (available_idle_cpu(prev_cpu))
+		return prev_cpu;
+
 	return nr_cpumask_bits;
 }
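
With this hunk applied, the whole function reads roughly as follows
(paraphrased rather than copied from fair.c, so the comments are a
summary and not the kernel's own):

static int wake_affine_idle(int this_cpu, int prev_cpu, int sync)
{
	/*
	 * Wakeup from interrupt context on an idle CPU: only pull the
	 * task if the waking CPU shares cache with prev, and still
	 * prefer an idle prev.
	 */
	if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
		return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;

	/* Synchronous wakeup and the waker is the only task on this_cpu. */
	if (sync && cpu_rq(this_cpu)->nr_running == 1)
		return this_cpu;

	/* New with this patch: an idle prev wins before any load comparison. */
	if (available_idle_cpu(prev_cpu))
		return prev_cpu;

	return nr_cpumask_bits;	/* no decision; fall back to wake_affine_weight() */
}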
 



* Re: [PATCH v2] sched/fair: check for idle core
  2020-10-22 13:15 [PATCH v2] sched/fair: check for idle core Julia Lawall
@ 2020-10-23  8:40 ` Mel Gorman
  2020-10-23  9:21   ` Julia Lawall
  2020-10-23 16:28 ` Chen Yu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Mel Gorman @ 2020-10-23  8:40 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Daniel Bristot de Oliveira, linux-kernel

On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> In the case of a thread wakeup, wake_affine determines whether a core
> will be chosen for the thread on the socket where the thread ran
> previously or on the socket of the waker.  This is done primarily by
> comparing the load of the core where th thread ran previously (prev)
> and the load of the waker (this).
> 
> commit 11f10e5420f6 ("sched/fair: Use load instead of runnable load
> in wakeup path") changed the load computation from the runnable load
> to the load average, where the latter includes the load of threads
> that have already blocked on the core.
> 
> When a short-running daemon processes happens to run on prev, this
> change raised the situation that prev could appear to have a greater
> load than this, even when prev is actually idle.  When prev and this
> are on the same socket, the idle prev is detected later, in
> select_idle_sibling.  But if that does not hold, prev is completely
> ignored, causing the waking thread to move to the socket of the waker.
> In the case of N mostly active threads on N cores, this triggers other
> migrations and hurts performance.
> 
> In contrast, before commit 11f10e5420f6, the load on an idle core
> was 0, and in the case of a non-idle waker core, the effect of
> wake_affine was to select prev as the target for searching for a core
> for the waking thread.
> 
> To avoid unnecessary migrations, extend wake_affine_idle to check
> whether the core where the thread previously ran is currently idle,
> and if so simply return that core as the target.
> 
> [1] commit 11f10e5420f6ce ("sched/fair: Use load instead of runnable
> load in wakeup path")
> 
> This particularly has an impact when using the ondemand power manager,
> where kworkers run every 0.004 seconds on all cores, increasing the
> likelihood that an idle core will be considered to have a load.
> 
> The following numbers were obtained with the benchmarking tool
> hyperfine (https://github.com/sharkdp/hyperfine) on the NAS parallel
> benchmarks (https://www.nas.nasa.gov/publications/npb.html).  The
> tests were run on an 80-core Intel(R) Xeon(R) CPU E7-8870 v4 @
> 2.10GHz.  Active (intel_pstate) and passive (intel_cpufreq) power
> management were used.  Times are in seconds.  All experiments use all
> 160 hardware threads.
> 
> 	v5.9/intel-pstate	v5.9+patch/intel-pstate
> bt.C.c	24.725724+-0.962340	23.349608+-1.607214
> lu.C.x	29.105952+-4.804203	25.249052+-5.561617
> sp.C.x	31.220696+-1.831335	30.227760+-2.429792
> ua.C.x	26.606118+-1.767384	25.778367+-1.263850
> 
> 	v5.9/ondemand		v5.9+patch/ondemand
> bt.C.c	25.330360+-1.028316	23.544036+-1.020189
> lu.C.x	35.872659+-4.872090	23.719295+-3.883848
> sp.C.x	32.141310+-2.289541	29.125363+-0.872300
> ua.C.x	29.024597+-1.667049	25.728888+-1.539772
> 
> On the smaller data sets (A and B) and on the other NAS benchmarks
> there is no impact on performance.
> 
> This also has a major impact on the splash2x.volrend benchmark of the
> parsec benchmark suite that goes from 1m25 without this patch to 0m45,
> in active (intel_pstate) mode.
> 
> Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> 

In principle, I think the patch is ok after the recent discussion. I'm
holding off an ack until a battery of tests on loads with different
levels of utilisation and wakeup patterns makes its way through a test
grid. The testing is based on Linus's tree mid-merge window, which
includes what is in the scheduler pipeline.

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH v2] sched/fair: check for idle core
  2020-10-23  8:40 ` Mel Gorman
@ 2020-10-23  9:21   ` Julia Lawall
  2020-10-23 10:05     ` Mel Gorman
  0 siblings, 1 reply; 15+ messages in thread
From: Julia Lawall @ 2020-10-23  9:21 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Daniel Bristot de Oliveira, linux-kernel



On Fri, 23 Oct 2020, Mel Gorman wrote:

> On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> > In the case of a thread wakeup, wake_affine determines whether a core
> > will be chosen for the thread on the socket where the thread ran
> > previously or on the socket of the waker.  This is done primarily by
> > comparing the load of the core where th thread ran previously (prev)
> > and the load of the waker (this).
> >
> > commit 11f10e5420f6 ("sched/fair: Use load instead of runnable load
> > in wakeup path") changed the load computation from the runnable load
> > to the load average, where the latter includes the load of threads
> > that have already blocked on the core.
> >
> > When a short-running daemon processes happens to run on prev, this
> > change raised the situation that prev could appear to have a greater
> > load than this, even when prev is actually idle.  When prev and this
> > are on the same socket, the idle prev is detected later, in
> > select_idle_sibling.  But if that does not hold, prev is completely
> > ignored, causing the waking thread to move to the socket of the waker.
> > In the case of N mostly active threads on N cores, this triggers other
> > migrations and hurts performance.
> >
> > In contrast, before commit 11f10e5420f6, the load on an idle core
> > was 0, and in the case of a non-idle waker core, the effect of
> > wake_affine was to select prev as the target for searching for a core
> > for the waking thread.
> >
> > To avoid unnecessary migrations, extend wake_affine_idle to check
> > whether the core where the thread previously ran is currently idle,
> > and if so simply return that core as the target.
> >
> > [1] commit 11f10e5420f6ce ("sched/fair: Use load instead of runnable
> > load in wakeup path")
> >
> > This particularly has an impact when using the ondemand power manager,
> > where kworkers run every 0.004 seconds on all cores, increasing the
> > likelihood that an idle core will be considered to have a load.
> >
> > The following numbers were obtained with the benchmarking tool
> > hyperfine (https://github.com/sharkdp/hyperfine) on the NAS parallel
> > benchmarks (https://www.nas.nasa.gov/publications/npb.html).  The
> > tests were run on an 80-core Intel(R) Xeon(R) CPU E7-8870 v4 @
> > 2.10GHz.  Active (intel_pstate) and passive (intel_cpufreq) power
> > management were used.  Times are in seconds.  All experiments use all
> > 160 hardware threads.
> >
> > 	v5.9/intel-pstate	v5.9+patch/intel-pstate
> > bt.C.c	24.725724+-0.962340	23.349608+-1.607214
> > lu.C.x	29.105952+-4.804203	25.249052+-5.561617
> > sp.C.x	31.220696+-1.831335	30.227760+-2.429792
> > ua.C.x	26.606118+-1.767384	25.778367+-1.263850
> >
> > 	v5.9/ondemand		v5.9+patch/ondemand
> > bt.C.c	25.330360+-1.028316	23.544036+-1.020189
> > lu.C.x	35.872659+-4.872090	23.719295+-3.883848
> > sp.C.x	32.141310+-2.289541	29.125363+-0.872300
> > ua.C.x	29.024597+-1.667049	25.728888+-1.539772
> >
> > On the smaller data sets (A and B) and on the other NAS benchmarks
> > there is no impact on performance.
> >
> > This also has a major impact on the splash2x.volrend benchmark of the
> > parsec benchmark suite that goes from 1m25 without this patch to 0m45,
> > in active (intel_pstate) mode.
> >
> > Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> > Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> > Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> >
>
> In principal, I think the patch is ok after the recent discussion. I'm
> holding off an ack until a battery of tests on loads with different
> levels of utilisation and wakeup patterns makes its way through a test
> grid. It's based on Linus's tree mid-merge window that includes what is
> in the scheduler pipeline

OK, if it doesn't work out, it would be interesting to know what goes
badly.

thanks,
julia


* Re: [PATCH v2] sched/fair: check for idle core
  2020-10-23  9:21   ` Julia Lawall
@ 2020-10-23 10:05     ` Mel Gorman
  0 siblings, 0 replies; 15+ messages in thread
From: Mel Gorman @ 2020-10-23 10:05 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Daniel Bristot de Oliveira, linux-kernel

On Fri, Oct 23, 2020 at 11:21:50AM +0200, Julia Lawall wrote:
> 
> 
> On Fri, 23 Oct 2020, Mel Gorman wrote:
> 
> > On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> > > In the case of a thread wakeup, wake_affine determines whether a core
> > > will be chosen for the thread on the socket where the thread ran
> > > previously or on the socket of the waker.  This is done primarily by
> > > comparing the load of the core where th thread ran previously (prev)
> > > and the load of the waker (this).
> > >
> > > commit 11f10e5420f6 ("sched/fair: Use load instead of runnable load
> > > in wakeup path") changed the load computation from the runnable load
> > > to the load average, where the latter includes the load of threads
> > > that have already blocked on the core.
> > >
> > > When a short-running daemon processes happens to run on prev, this
> > > change raised the situation that prev could appear to have a greater
> > > load than this, even when prev is actually idle.  When prev and this
> > > are on the same socket, the idle prev is detected later, in
> > > select_idle_sibling.  But if that does not hold, prev is completely
> > > ignored, causing the waking thread to move to the socket of the waker.
> > > In the case of N mostly active threads on N cores, this triggers other
> > > migrations and hurts performance.
> > >
> > > In contrast, before commit 11f10e5420f6, the load on an idle core
> > > was 0, and in the case of a non-idle waker core, the effect of
> > > wake_affine was to select prev as the target for searching for a core
> > > for the waking thread.
> > >
> > > To avoid unnecessary migrations, extend wake_affine_idle to check
> > > whether the core where the thread previously ran is currently idle,
> > > and if so simply return that core as the target.
> > >
> > > [1] commit 11f10e5420f6ce ("sched/fair: Use load instead of runnable
> > > load in wakeup path")
> > >
> > > This particularly has an impact when using the ondemand power manager,
> > > where kworkers run every 0.004 seconds on all cores, increasing the
> > > likelihood that an idle core will be considered to have a load.
> > >
> > > The following numbers were obtained with the benchmarking tool
> > > hyperfine (https://github.com/sharkdp/hyperfine) on the NAS parallel
> > > benchmarks (https://www.nas.nasa.gov/publications/npb.html).  The
> > > tests were run on an 80-core Intel(R) Xeon(R) CPU E7-8870 v4 @
> > > 2.10GHz.  Active (intel_pstate) and passive (intel_cpufreq) power
> > > management were used.  Times are in seconds.  All experiments use all
> > > 160 hardware threads.
> > >
> > > 	v5.9/intel-pstate	v5.9+patch/intel-pstate
> > > bt.C.c	24.725724+-0.962340	23.349608+-1.607214
> > > lu.C.x	29.105952+-4.804203	25.249052+-5.561617
> > > sp.C.x	31.220696+-1.831335	30.227760+-2.429792
> > > ua.C.x	26.606118+-1.767384	25.778367+-1.263850
> > >
> > > 	v5.9/ondemand		v5.9+patch/ondemand
> > > bt.C.c	25.330360+-1.028316	23.544036+-1.020189
> > > lu.C.x	35.872659+-4.872090	23.719295+-3.883848
> > > sp.C.x	32.141310+-2.289541	29.125363+-0.872300
> > > ua.C.x	29.024597+-1.667049	25.728888+-1.539772
> > >
> > > On the smaller data sets (A and B) and on the other NAS benchmarks
> > > there is no impact on performance.
> > >
> > > This also has a major impact on the splash2x.volrend benchmark of the
> > > parsec benchmark suite that goes from 1m25 without this patch to 0m45,
> > > in active (intel_pstate) mode.
> > >
> > > Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> > > Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> > > Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> > >
> >
> > In principal, I think the patch is ok after the recent discussion. I'm
> > holding off an ack until a battery of tests on loads with different
> > levels of utilisation and wakeup patterns makes its way through a test
> > grid. It's based on Linus's tree mid-merge window that includes what is
> > in the scheduler pipeline
> 
> OK, if it doesn't work out, it would be interesting to know what goes
> badly.
> 

Yep, if something goes wrong, I'll make the full logs available, details
on reproducing it etc.

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH v2] sched/fair: check for idle core
  2020-10-22 13:15 [PATCH v2] sched/fair: check for idle core Julia Lawall
  2020-10-23  8:40 ` Mel Gorman
@ 2020-10-23 16:28 ` Chen Yu
  2020-10-23 16:52   ` Julia Lawall
  2020-10-27  9:19 ` Mel Gorman
  2020-10-29 10:51 ` [tip: sched/core] sched/fair: Check for idle core in wake_affine tip-bot2 for Julia Lawall
  3 siblings, 1 reply; 15+ messages in thread
From: Chen Yu @ 2020-10-23 16:28 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Linux Kernel Mailing List

On Fri, Oct 23, 2020 at 1:32 AM Julia Lawall <Julia.Lawall@inria.fr> wrote:
>
[cut]
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index aa4c6227cd6d..9b23dad883ee 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5804,6 +5804,9 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
>         if (sync && cpu_rq(this_cpu)->nr_running == 1)
>                 return this_cpu;
>
> +       if (available_idle_cpu(prev_cpu))
How about also taking  sched_idle_cpu(prev_cpu) into consideration?
if (available_idle_cpu(prev_cpu) || sched_idle_cpu(prev_cpu))

Thanks,
Chenyu
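
For context, the two predicates check different things; roughly, and
paraphrased from memory of the kernel sources of that era rather than
quoted, available_idle_cpu() is true only when the CPU has nothing to
run at all, while sched_idle_cpu() is true when the CPU is busy but
running only SCHED_IDLE tasks, which a waking CFS task would preempt
immediately:

/* Sketch of available_idle_cpu(), kernel/sched/core.c (paraphrased). */
int available_idle_cpu(int cpu)
{
	if (!idle_cpu(cpu))
		return 0;

	if (vcpu_is_preempted(cpu))	/* an "idle" vCPU may not actually run */
		return 0;

	return 1;
}

/* Sketch of sched_idle_cpu(), kernel/sched/fair.c (paraphrased). */
static int sched_idle_cpu(int cpu)
{
	struct rq *rq = cpu_rq(cpu);

	/* Non-empty runqueue, but everything on it is SCHED_IDLE. */
	return rq->nr_running && rq->nr_running == rq->cfs.idle_h_nr_running;
}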


* Re: [PATCH v2] sched/fair: check for idle core
  2020-10-23 16:28 ` Chen Yu
@ 2020-10-23 16:52   ` Julia Lawall
  0 siblings, 0 replies; 15+ messages in thread
From: Julia Lawall @ 2020-10-23 16:52 UTC (permalink / raw)
  To: Chen Yu
  Cc: Julia Lawall, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira,
	Linux Kernel Mailing List



On Sat, 24 Oct 2020, Chen Yu wrote:

> On Fri, Oct 23, 2020 at 1:32 AM Julia Lawall <Julia.Lawall@inria.fr> wrote:
> >
> [cut]
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index aa4c6227cd6d..9b23dad883ee 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -5804,6 +5804,9 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
> >         if (sync && cpu_rq(this_cpu)->nr_running == 1)
> >                 return this_cpu;
> >
> > +       if (available_idle_cpu(prev_cpu))
> How about also taking  sched_idle_cpu(prev_cpu) into consideration?
> if (available_idle_cpu(prev_cpu) || sched_idle_cpu(prev_cpu))

I have no opinion about this, because it wasn't relevant to the
application I was looking at.  The existing available_idle check on prev
in wake_affine_idle doesn't check sched_idle_cpu, so I didn't add it
here.

julia


* Re: [PATCH v2] sched/fair: check for idle core
  2020-10-22 13:15 [PATCH v2] sched/fair: check for idle core Julia Lawall
  2020-10-23  8:40 ` Mel Gorman
  2020-10-23 16:28 ` Chen Yu
@ 2020-10-27  9:19 ` Mel Gorman
  2020-10-27 10:00   ` Julia Lawall
  2021-01-24 20:38   ` Julia Lawall
  2020-10-29 10:51 ` [tip: sched/core] sched/fair: Check for idle core in wake_affine tip-bot2 for Julia Lawall
  3 siblings, 2 replies; 15+ messages in thread
From: Mel Gorman @ 2020-10-27  9:19 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Daniel Bristot de Oliveira, linux-kernel

On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> 

While not a universal win, it was mostly a win or neutral. Of the few
cases where there was a problem, one was a benchmark I'm a bit
suspicious of generally, as it occasionally generates bad results for
unknown and unpredictable reasons. In another, the regression was very
machine specific and the differences were small in absolute time rather
than relative time. Other tests on the same machine were fine, so
overall:

Acked-by: Mel Gorman <mgorman@suse.de>

Thanks.

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH v2] sched/fair: check for idle core
  2020-10-27  9:19 ` Mel Gorman
@ 2020-10-27 10:00   ` Julia Lawall
  2021-01-24 20:38   ` Julia Lawall
  1 sibling, 0 replies; 15+ messages in thread
From: Julia Lawall @ 2020-10-27 10:00 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Daniel Bristot de Oliveira, linux-kernel



On Tue, 27 Oct 2020, Mel Gorman wrote:

> On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> > Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> > Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> > Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> >
>
> While not a universal win, it was mostly a win or neutral. In few cases
> where there was a problem, one benchmark I'm a bit suspicious of generally
> as occasionally it generates bad results for unknown and unpredictable
> reasons. In another, it was very machine specific and the differences
> were small in absolte time rather than relative time. Other tests on the
> same machine were fine so overall;

Thanks for testing!

julia

>
> Acked-by: Mel Gorman <mgorman@suse.de>
>
> Thanks.
>
> --
> Mel Gorman
> SUSE Labs
>


* [tip: sched/core] sched/fair: Check for idle core in wake_affine
  2020-10-22 13:15 [PATCH v2] sched/fair: check for idle core Julia Lawall
                   ` (2 preceding siblings ...)
  2020-10-27  9:19 ` Mel Gorman
@ 2020-10-29 10:51 ` tip-bot2 for Julia Lawall
  3 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for Julia Lawall @ 2020-10-29 10:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Julia Lawall, Peter Zijlstra (Intel), Mel Gorman, x86, LKML

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     d8fcb81f1acf651a0e50eacecca43d0524984f87
Gitweb:        https://git.kernel.org/tip/d8fcb81f1acf651a0e50eacecca43d0524984f87
Author:        Julia Lawall <Julia.Lawall@inria.fr>
AuthorDate:    Thu, 22 Oct 2020 15:15:50 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 29 Oct 2020 11:00:32 +01:00

sched/fair: Check for idle core in wake_affine

In the case of a thread wakeup, wake_affine determines whether a core
will be chosen for the thread on the socket where the thread ran
previously or on the socket of the waker.  This is done primarily by
comparing the load of the core where the thread ran previously (prev)
and the load of the waker's core (this).

commit 11f10e5420f6 ("sched/fair: Use load instead of runnable load
in wakeup path") changed the load computation from the runnable load
to the load average, where the latter includes the load of threads
that have already blocked on the core.

When a short-running daemon process happens to run on prev, this
change means that prev can appear to have a greater load than this,
even when prev is actually idle.  When prev and this are on the same
socket, the idle prev is detected later, in select_idle_sibling.  But
if that does not hold, prev is completely ignored, causing the waking
thread to move to the socket of the waker.  In the case of N mostly
active threads on N cores, this triggers other migrations and hurts
performance.

In contrast, before commit 11f10e5420f6, the load on an idle core
was 0, and in the case of a non-idle waker core, the effect of
wake_affine was to select prev as the target for searching for a core
for the waking thread.

To avoid unnecessary migrations, extend wake_affine_idle to check
whether the core where the thread previously ran is currently idle,
and if so simply return that core as the target.

[1] commit 11f10e5420f6ce ("sched/fair: Use load instead of runnable
load in wakeup path")

This particularly has an impact when using the ondemand power manager,
where kworkers run every 0.004 seconds on all cores, increasing the
likelihood that an idle core will be considered to have a load.

The following numbers were obtained with the benchmarking tool
hyperfine (https://github.com/sharkdp/hyperfine) on the NAS parallel
benchmarks (https://www.nas.nasa.gov/publications/npb.html).  The
tests were run on an 80-core Intel(R) Xeon(R) CPU E7-8870 v4 @
2.10GHz.  Active (intel_pstate) and passive (intel_cpufreq with the
ondemand governor) power management were used.  Times are in seconds.
All experiments use all
160 hardware threads.

	v5.9/intel-pstate	v5.9+patch/intel-pstate
bt.C.c	24.725724+-0.962340	23.349608+-1.607214
lu.C.x	29.105952+-4.804203	25.249052+-5.561617
sp.C.x	31.220696+-1.831335	30.227760+-2.429792
ua.C.x	26.606118+-1.767384	25.778367+-1.263850

	v5.9/ondemand		v5.9+patch/ondemand
bt.C.c	25.330360+-1.028316	23.544036+-1.020189
lu.C.x	35.872659+-4.872090	23.719295+-3.883848
sp.C.x	32.141310+-2.289541	29.125363+-0.872300
ua.C.x	29.024597+-1.667049	25.728888+-1.539772

On the smaller data sets (A and B) and on the other NAS benchmarks
there is no impact on performance.

This also has a major impact on the splash2x.volrend benchmark of the
PARSEC benchmark suite, which goes from 1m25 without this patch to
0m45 with it, in active (intel_pstate) mode.

Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/1603372550-14680-1-git-send-email-Julia.Lawall@inria.fr
---
 kernel/sched/fair.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f30d35a..52cacfc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5813,6 +5813,9 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
 	if (sync && cpu_rq(this_cpu)->nr_running == 1)
 		return this_cpu;
 
+	if (available_idle_cpu(prev_cpu))
+		return prev_cpu;
+
 	return nr_cpumask_bits;
 }
 


* Re: [PATCH v2] sched/fair: check for idle core
  2020-10-27  9:19 ` Mel Gorman
  2020-10-27 10:00   ` Julia Lawall
@ 2021-01-24 20:38   ` Julia Lawall
  2021-01-25  9:12     ` Mel Gorman
  1 sibling, 1 reply; 15+ messages in thread
From: Julia Lawall @ 2021-01-24 20:38 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Daniel Bristot de Oliveira, linux-kernel



On Tue, 27 Oct 2020, Mel Gorman wrote:

> On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> > Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> > Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> > Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> >
>
> While not a universal win, it was mostly a win or neutral. In few cases
> where there was a problem, one benchmark I'm a bit suspicious of generally
> as occasionally it generates bad results for unknown and unpredictable
> reasons. In another, it was very machine specific and the differences
> were small in absolte time rather than relative time. Other tests on the
> same machine were fine so overall;
>
> Acked-by: Mel Gorman <mgorman@suse.de>

Recently, we have been testing the Phoronix multicore benchmarks.  On
v5.9 with this patch, the Phoronix preparation time slows down from ~23
seconds to ~28 seconds.  In v5.11-rc4, we see 29 seconds.  It's not yet
clear what causes the problem, but perhaps the patch should be removed
from v5.11 until the problem is understood.

commit d8fcb81f1acf651a0e50eacecca43d0524984f87

julia



>
> Thanks.
>
> --
> Mel Gorman
> SUSE Labs
>


* Re: [PATCH v2] sched/fair: check for idle core
  2021-01-24 20:38   ` Julia Lawall
@ 2021-01-25  9:12     ` Mel Gorman
  2021-01-25  9:20       ` Julia Lawall
  0 siblings, 1 reply; 15+ messages in thread
From: Mel Gorman @ 2021-01-25  9:12 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Daniel Bristot de Oliveira, linux-kernel

On Sun, Jan 24, 2021 at 09:38:14PM +0100, Julia Lawall wrote:
> 
> 
> On Tue, 27 Oct 2020, Mel Gorman wrote:
> 
> > On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> > > Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> > > Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> > > Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> > >
> >
> > While not a universal win, it was mostly a win or neutral. In few cases
> > where there was a problem, one benchmark I'm a bit suspicious of generally
> > as occasionally it generates bad results for unknown and unpredictable
> > reasons. In another, it was very machine specific and the differences
> > were small in absolte time rather than relative time. Other tests on the
> > same machine were fine so overall;
> >
> > Acked-by: Mel Gorman <mgorman@suse.de>
> 
> Recently, we have been testing the phoronix multicore benchmarks.  On v5.9
> with this patch, the preparation time of phoronix slows down, from ~23
> seconds to ~28 seconds.  In v5.11-rc4, we see 29 seconds.  It's not yet
> clear what causes the problem.  But perhaps the patch should be removed
> from v5.11, until the problem is understood.
> 
> commit d8fcb81f1acf651a0e50eacecca43d0524984f87
> 

I'm not 100% convinced given that it was a mix of wins and losses. In
the wakeup path in general, universal wins almost never happen. It's not
100% clear from your mail what happens during the preparation phase. If
it includes time to download and install the benchmarks, then it would
be inherently variable due to network time (if downloading) or cache
hotness (if installing/compiling). While preparation time can be
interesting -- for example, if preparation involves reading a lot of
files from disk -- it's not universally interesting when it's not the
critical phase of a benchmark.

I think it would be better to wait until the problem is fully understood
to see if it's a timing artifact (e.g. a race between when prev_cpu is
observed to be idle and when it becomes busy).

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH v2] sched/fair: check for idle core
  2021-01-25  9:12     ` Mel Gorman
@ 2021-01-25  9:20       ` Julia Lawall
  2021-01-25  9:25         ` Vincent Guittot
  0 siblings, 1 reply; 15+ messages in thread
From: Julia Lawall @ 2021-01-25  9:20 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Daniel Bristot de Oliveira, linux-kernel



On Mon, 25 Jan 2021, Mel Gorman wrote:

> On Sun, Jan 24, 2021 at 09:38:14PM +0100, Julia Lawall wrote:
> >
> >
> > On Tue, 27 Oct 2020, Mel Gorman wrote:
> >
> > > On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> > > > Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> > > > Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> > > > Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> > > >
> > >
> > > While not a universal win, it was mostly a win or neutral. In few cases
> > > where there was a problem, one benchmark I'm a bit suspicious of generally
> > > as occasionally it generates bad results for unknown and unpredictable
> > > reasons. In another, it was very machine specific and the differences
> > > were small in absolte time rather than relative time. Other tests on the
> > > same machine were fine so overall;
> > >
> > > Acked-by: Mel Gorman <mgorman@suse.de>
> >
> > Recently, we have been testing the phoronix multicore benchmarks.  On v5.9
> > with this patch, the preparation time of phoronix slows down, from ~23
> > seconds to ~28 seconds.  In v5.11-rc4, we see 29 seconds.  It's not yet
> > clear what causes the problem.  But perhaps the patch should be removed
> > from v5.11, until the problem is understood.
> >
> > commit d8fcb81f1acf651a0e50eacecca43d0524984f87
> >
>
> I'm not 100% convinved given that it was a mix of wins and losses. In
> the wakup path in general, universal wins almost never happen. It's not
> 100% clear from your mail what happens during the preparation patch. If
> it included time to download the benchmarks and install then it would be
> inherently variable due to network time (if download) or cache hotness
> (if installing/compiling). While preparation time can be interesting --
> for example, if preparation involves reading a lot of files from disk,
> it's not universally interesting when it's not the critical phase of a
> benchmark.

The benchmark is completely downloaded prior to the runs.  There seems to
be some perturbation to the activation of containerd.  Normally it is
even:  *   *   *   *

and with the patch it becomes more like: *     **     **

That is, every other one is on time and every other one is late.

But I don't know why this happens.

julia

>
> I think it would be better to wait until the problem is fully understood
> to see if it's a timing artifact (e.g. a race between when prev_cpu is
> observed to be idle and when it is busy).
>
> --
> Mel Gorman
> SUSE Labs
>


* Re: [PATCH v2] sched/fair: check for idle core
  2021-01-25  9:20       ` Julia Lawall
@ 2021-01-25  9:25         ` Vincent Guittot
  2021-01-25 13:38           ` Julia Lawall
  2021-02-06 17:20           ` Julia Lawall
  0 siblings, 2 replies; 15+ messages in thread
From: Vincent Guittot @ 2021-01-25  9:25 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Mel Gorman, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Daniel Bristot de Oliveira, linux-kernel

On Mon, 25 Jan 2021 at 10:20, Julia Lawall <julia.lawall@inria.fr> wrote:
>
>
>
> On Mon, 25 Jan 2021, Mel Gorman wrote:
>
> > On Sun, Jan 24, 2021 at 09:38:14PM +0100, Julia Lawall wrote:
> > >
> > >
> > > On Tue, 27 Oct 2020, Mel Gorman wrote:
> > >
> > > > On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> > > > > Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> > > > > Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> > > > > Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> > > > >
> > > >
> > > > While not a universal win, it was mostly a win or neutral. In few cases
> > > > where there was a problem, one benchmark I'm a bit suspicious of generally
> > > > as occasionally it generates bad results for unknown and unpredictable
> > > > reasons. In another, it was very machine specific and the differences
> > > > were small in absolte time rather than relative time. Other tests on the
> > > > same machine were fine so overall;
> > > >
> > > > Acked-by: Mel Gorman <mgorman@suse.de>
> > >
> > > Recently, we have been testing the phoronix multicore benchmarks.  On v5.9
> > > with this patch, the preparation time of phoronix slows down, from ~23
> > > seconds to ~28 seconds.  In v5.11-rc4, we see 29 seconds.  It's not yet
> > > clear what causes the problem.  But perhaps the patch should be removed
> > > from v5.11, until the problem is understood.
> > >
> > > commit d8fcb81f1acf651a0e50eacecca43d0524984f87
> > >
> >
> > I'm not 100% convinved given that it was a mix of wins and losses. In
> > the wakup path in general, universal wins almost never happen. It's not
> > 100% clear from your mail what happens during the preparation patch. If
> > it included time to download the benchmarks and install then it would be
> > inherently variable due to network time (if download) or cache hotness
> > (if installing/compiling). While preparation time can be interesting --
> > for example, if preparation involves reading a lot of files from disk,
> > it's not universally interesting when it's not the critical phase of a
> > benchmark.
>
> The benchmark is completely downloaded prior to the runs.  There seems to
> be some perturbation to the activation of containerd.  Normally it is
> even:  *   *   *   *

Does it impact the benchmark results too, or only the preparation prior
to running the benchmark?

>
> and with the patch it becomes more like: *     **     **
>
> That is every other one is on time, and every other one is late.
>
> But I don't know why this happens.
>
> julia
>
> >
> > I think it would be better to wait until the problem is fully understood
> > to see if it's a timing artifact (e.g. a race between when prev_cpu is
> > observed to be idle and when it is busy).

I agree that a better understanding of what is happening is necessary
before any changes are made.

> >
> > --
> > Mel Gorman
> > SUSE Labs
> >


* Re: [PATCH v2] sched/fair: check for idle core
  2021-01-25  9:25         ` Vincent Guittot
@ 2021-01-25 13:38           ` Julia Lawall
  2021-02-06 17:20           ` Julia Lawall
  1 sibling, 0 replies; 15+ messages in thread
From: Julia Lawall @ 2021-01-25 13:38 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Mel Gorman, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Daniel Bristot de Oliveira, linux-kernel



On Mon, 25 Jan 2021, Vincent Guittot wrote:

> On Mon, 25 Jan 2021 at 10:20, Julia Lawall <julia.lawall@inria.fr> wrote:
> >
> >
> >
> > On Mon, 25 Jan 2021, Mel Gorman wrote:
> >
> > > On Sun, Jan 24, 2021 at 09:38:14PM +0100, Julia Lawall wrote:
> > > >
> > > >
> > > > On Tue, 27 Oct 2020, Mel Gorman wrote:
> > > >
> > > > > On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> > > > > > Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> > > > > > Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> > > > > > Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> > > > > >
> > > > >
> > > > > While not a universal win, it was mostly a win or neutral. In few cases
> > > > > where there was a problem, one benchmark I'm a bit suspicious of generally
> > > > > as occasionally it generates bad results for unknown and unpredictable
> > > > > reasons. In another, it was very machine specific and the differences
> > > > > were small in absolte time rather than relative time. Other tests on the
> > > > > same machine were fine so overall;
> > > > >
> > > > > Acked-by: Mel Gorman <mgorman@suse.de>
> > > >
> > > > Recently, we have been testing the phoronix multicore benchmarks.  On v5.9
> > > > with this patch, the preparation time of phoronix slows down, from ~23
> > > > seconds to ~28 seconds.  In v5.11-rc4, we see 29 seconds.  It's not yet
> > > > clear what causes the problem.  But perhaps the patch should be removed
> > > > from v5.11, until the problem is understood.
> > > >
> > > > commit d8fcb81f1acf651a0e50eacecca43d0524984f87
> > > >
> > >
> > > I'm not 100% convinved given that it was a mix of wins and losses. In
> > > the wakup path in general, universal wins almost never happen. It's not
> > > 100% clear from your mail what happens during the preparation patch. If
> > > it included time to download the benchmarks and install then it would be
> > > inherently variable due to network time (if download) or cache hotness
> > > (if installing/compiling). While preparation time can be interesting --
> > > for example, if preparation involves reading a lot of files from disk,
> > > it's not universally interesting when it's not the critical phase of a
> > > benchmark.
> >
> > The benchmark is completely downloaded prior to the runs.  There seems to
> > be some perturbation to the activation of containerd.  Normally it is
> > even:  *   *   *   *
>
> Does it impact the benchmark results too or only the preparation prior
> to running the benchmark ?

Looking at a few of the benchmarks, there is no clear pattern as to
which is better.  But there is no big degradation like the one from 23
to 28 seconds in the preparation time.  I will report back when we find
out more.

julia

>
> >
> > and with the patch it becomes more like: *     **     **
> >
> > That is every other one is on time, and every other one is late.
> >
> > But I don't know why this happens.
> >
> > julia
> >
> > >
> > > I think it would be better to wait until the problem is fully understood
> > > to see if it's a timing artifact (e.g. a race between when prev_cpu is
> > > observed to be idle and when it is busy).
>
> I agree that a better understanding of what is happening is necessary
> before any changes
>
> > >
> > > --
> > > Mel Gorman
> > > SUSE Labs
> > >
>


* Re: [PATCH v2] sched/fair: check for idle core
  2021-01-25  9:25         ` Vincent Guittot
  2021-01-25 13:38           ` Julia Lawall
@ 2021-02-06 17:20           ` Julia Lawall
  1 sibling, 0 replies; 15+ messages in thread
From: Julia Lawall @ 2021-02-06 17:20 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Mel Gorman, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Daniel Bristot de Oliveira, linux-kernel



On Mon, 25 Jan 2021, Vincent Guittot wrote:

> On Mon, 25 Jan 2021 at 10:20, Julia Lawall <julia.lawall@inria.fr> wrote:
> >
> >
> >
> > On Mon, 25 Jan 2021, Mel Gorman wrote:
> >
> > > On Sun, Jan 24, 2021 at 09:38:14PM +0100, Julia Lawall wrote:
> > > >
> > > >
> > > > On Tue, 27 Oct 2020, Mel Gorman wrote:
> > > >
> > > > > On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> > > > > > Fixes: 11f10e5420f6 ("sched/fair: Use load instead of runnable load in wakeup path")
> > > > > > Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
> > > > > > Reviewed-by Vincent Guittot <vincent.guittot@linaro.org>
> > > > > >
> > > > >
> > > > > While not a universal win, it was mostly a win or neutral. In few cases
> > > > > where there was a problem, one benchmark I'm a bit suspicious of generally
> > > > > as occasionally it generates bad results for unknown and unpredictable
> > > > > reasons. In another, it was very machine specific and the differences
> > > > > were small in absolte time rather than relative time. Other tests on the
> > > > > same machine were fine so overall;
> > > > >
> > > > > Acked-by: Mel Gorman <mgorman@suse.de>
> > > >
> > > > Recently, we have been testing the phoronix multicore benchmarks.  On v5.9
> > > > with this patch, the preparation time of phoronix slows down, from ~23
> > > > seconds to ~28 seconds.  In v5.11-rc4, we see 29 seconds.  It's not yet
> > > > clear what causes the problem.  But perhaps the patch should be removed
> > > > from v5.11, until the problem is understood.
> > > >
> > > > commit d8fcb81f1acf651a0e50eacecca43d0524984f87
> > > >
> > >
> > > I'm not 100% convinved given that it was a mix of wins and losses. In
> > > the wakup path in general, universal wins almost never happen. It's not
> > > 100% clear from your mail what happens during the preparation patch. If
> > > it included time to download the benchmarks and install then it would be
> > > inherently variable due to network time (if download) or cache hotness
> > > (if installing/compiling). While preparation time can be interesting --
> > > for example, if preparation involves reading a lot of files from disk,
> > > it's not universally interesting when it's not the critical phase of a
> > > benchmark.
> >
> > The benchmark is completely downloaded prior to the runs.  There seems to
> > be some perturbation to the activation of containerd.  Normally it is
> > even:  *   *   *   *
>
> Does it impact the benchmark results too or only the preparation prior
> to running the benchmark ?
>
> >
> > and with the patch it becomes more like: *     **     **
> >
> > That is every other one is on time, and every other one is late.
> >
> > But I don't know why this happens.
> >
> > julia
> >
> > >
> > > I think it would be better to wait until the problem is fully understood
> > > to see if it's a timing artifact (e.g. a race between when prev_cpu is
> > > observed to be idle and when it is busy).
>
> I agree that a better understanding of what is happening is necessary
> before any changes

The tests were incorrect.  The faster runs without the patch were using
schedutil.  If we use powersave, with or without the patch, we get the
same setup time and comparable values for the metrics of the actual
benchmarks (some of which vary a lot, though).

So there is no evidence of any problem with the patch.

julia


Thread overview: 15+ messages
2020-10-22 13:15 [PATCH v2] sched/fair: check for idle core Julia Lawall
2020-10-23  8:40 ` Mel Gorman
2020-10-23  9:21   ` Julia Lawall
2020-10-23 10:05     ` Mel Gorman
2020-10-23 16:28 ` Chen Yu
2020-10-23 16:52   ` Julia Lawall
2020-10-27  9:19 ` Mel Gorman
2020-10-27 10:00   ` Julia Lawall
2021-01-24 20:38   ` Julia Lawall
2021-01-25  9:12     ` Mel Gorman
2021-01-25  9:20       ` Julia Lawall
2021-01-25  9:25         ` Vincent Guittot
2021-01-25 13:38           ` Julia Lawall
2021-02-06 17:20           ` Julia Lawall
2020-10-29 10:51 ` [tip: sched/core] sched/fair: Check for idle core in wake_affine tip-bot2 for Julia Lawall
