From: Julia Lawall <Julia.Lawall@inria.fr>
To: Ingo Molnar <mingo@redhat.com>
Cc: kernel-janitors@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	linux-kernel@vger.kernel.org,
	Valentin Schneider <valentin.schneider@arm.com>,
	Gilles.Muller@inria.fr
Subject: [PATCH] sched/fair: check for idle core
Date: Tue, 20 Oct 2020 18:37:59 +0200
Message-ID: <1603211879-1064-1-git-send-email-Julia.Lawall@inria.fr>

On a thread wakeup, candidate cores are now compared by load average
rather than by runnable load average [1].  As a result, recent
short-running daemons can make the core where the thread previously
ran appear more loaded than the core performing the wakeup, even when
the previous core is currently idle.  This can cause the thread to
migrate, taking the place of some other thread that is about to wake
up, and so on.  To avoid such unnecessary migrations, extend
wake_affine_idle() to check whether the core where the thread
previously ran is currently idle, and if so return that core as the
target.

[1] commit 11f10e5420f6ce ("sched/fair: Use load instead of runnable
load in wakeup path")

The effect is particularly visible with passive (intel_cpufreq) power
management, where kworkers run every 0.004 seconds on all cores,
making it more likely that an idle core is considered to have a load.

The following numbers were obtained with the benchmarking tool
hyperfine (https://github.com/sharkdp/hyperfine) on the NAS parallel
benchmarks (https://www.nas.nasa.gov/publications/npb.html).  The
tests were run on an 80-core Intel(R) Xeon(R) CPU E7-8870 v4 @
2.10GHz.  Active (intel_pstate) and passive (intel_cpufreq) power
management were used.  Times are in seconds.  All experiments use all
160 hardware threads.

	v5.9/active		v5.9+patch/active
bt.C.x	24.725724+-0.962340	23.349608+-1.607214
lu.C.x	29.105952+-4.804203	25.249052+-5.561617
sp.C.x	31.220696+-1.831335	30.227760+-2.429792
ua.C.x	26.606118+-1.767384	25.778367+-1.263850

	v5.9/passive		v5.9+patch/passive
bt.C.x	25.330360+-1.028316	23.544036+-1.020189
lu.C.x	35.872659+-4.872090	23.719295+-3.883848
sp.C.x	32.141310+-2.289541	29.125363+-0.872300
ua.C.x	29.024597+-1.667049	25.728888+-1.539772

On the smaller data sets (A and B) and on the other NAS benchmarks
there is no impact on performance.
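
To put the means above on a common scale, the relative change in mean
run time can be computed as (after - before) / before.  The following
is a minimal sketch using the passive-mode means from the table.  The
struct and function names are hypothetical, and the reported deviations
are ignored, so it says nothing about statistical significance.

```c
#include <assert.h>

/* Hypothetical record: mean run time in seconds without and with the patch. */
struct npb_result {
	const char *name;
	double before;   /* v5.9/passive mean */
	double after;    /* v5.9+patch/passive mean */
};

/* Passive (intel_cpufreq) means copied from the table above. */
static const struct npb_result passive_results[] = {
	{ "bt", 25.330360, 23.544036 },
	{ "lu", 35.872659, 23.719295 },
	{ "sp", 32.141310, 29.125363 },
	{ "ua", 29.024597, 25.728888 },
};

/* Relative change of the mean in percent; negative means faster. */
static double relative_change_percent(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

For lu in passive mode this gives roughly -34%, i.e. about a third
less run time with the patch applied.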

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

---
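A minimal user-space sketch of the decision order at the end of
wake_affine_idle() with this patch applied, covering only the lines
visible in the hunk below.  Everything prefixed toy_ or TOY_ is
hypothetical and stands in for kernel state: toy_cpus[] replaces
cpu_rq() and the bookkeeping behind available_idle_cpu(), and
TOY_NO_TARGET plays the role of nr_cpumask_bits.  The part of the
function above the hunk is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS   4            /* hypothetical CPU count for the model */
#define TOY_NO_TARGET TOY_NR_CPUS  /* plays the role of nr_cpumask_bits */

/* Hypothetical per-CPU state; the kernel keeps this in struct rq. */
struct toy_cpu {
	unsigned int nr_running;   /* runnable tasks queued on this CPU */
	bool idle;                 /* stands in for available_idle_cpu() */
};

static struct toy_cpu toy_cpus[TOY_NR_CPUS];

static bool toy_available_idle_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return toy_cpus[cpu].idle;
}

/* Models only the checks visible in the hunk, in the same order. */
static int toy_wake_affine_idle_tail(int this_cpu, int prev_cpu, int sync)
{
	assert(this_cpu >= 0 && this_cpu < TOY_NR_CPUS);

	/* Sync wakeup and the waker is alone on its CPU: stay here. */
	if (sync && toy_cpus[this_cpu].nr_running == 1)
		return this_cpu;

	/* Added by the patch: prefer the previous CPU if it is idle now. */
	if (toy_available_idle_cpu(prev_cpu))
		return prev_cpu;

	/* No decision; signals the caller that no target was chosen. */
	return TOY_NO_TARGET;
}
```

With toy_cpus[1].idle set and sync == 0, the call
toy_wake_affine_idle_tail(0, 1, 0) returns 1, so in this model the
woken thread stays on its previous CPU without any load comparison.
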
 kernel/sched/fair.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index aa4c6227cd6d..9b23dad883ee 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5804,6 +5804,9 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
 	if (sync && cpu_rq(this_cpu)->nr_running == 1)
 		return this_cpu;
 
+	if (available_idle_cpu(prev_cpu))
+		return prev_cpu;
+
 	return nr_cpumask_bits;
 }
 

