From: Julia Lawall <Julia.Lawall@inria.fr> To: Ingo Molnar <mingo@redhat.com> Cc: kernel-janitors@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Steven Rostedt <rostedt@goodmis.org>, Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>, Daniel Bristot de Oliveira <bristot@redhat.com>, linux-kernel@vger.kernel.org, Valentin Schneider <valentin.schneider@arm.com>, Gilles.Muller@inria.fr Subject: [PATCH] sched/fair: check for idle core Date: Tue, 20 Oct 2020 18:37:59 +0200 [thread overview] Message-ID: <1603211879-1064-1-git-send-email-Julia.Lawall@inria.fr> (raw) On a thread wakeup, the change [1] from runnable load average to load average for comparing candidate cores means that recent short-running daemons on the core where a thread ran previously can be considered to have a higher load than the core performing the wakeup, even when the core where the thread ran previously is currently idle. This can cause a thread to migrate, taking the place of some other thread that is about to wake up, and so on. To avoid unnecessary migrations, extend wake_affine_idle to check whether the core where the thread previously ran is currently idle, and if so return that core as the target. [1] commit 11f10e5420f6ce ("sched/fair: Use load instead of runnable load in wakeup path") This particularly has an impact when using passive (intel_cpufreq) power management, where kworkers run every 0.004 seconds on all cores, increasing the likelihood that an idle core will be considered to have a load. The following numbers were obtained with the benchmarking tool hyperfine (https://github.com/sharkdp/hyperfine) on the NAS parallel benchmarks (https://www.nas.nasa.gov/publications/npb.html). The tests were run on an 80-core Intel(R) Xeon(R) CPU E7-8870 v4 @ 2.10GHz. Active (intel_pstate) and passive (intel_cpufreq) power management were used. Times are in seconds. All experiments use all 160 hardware threads. v5.9/active v5.9+patch/active bt.C.c 24.725724+-0.962340 23.349608+-1.607214 lu.C.x 29.105952+-4.804203 25.249052+-5.561617 sp.C.x 31.220696+-1.831335 30.227760+-2.429792 ua.C.x 26.606118+-1.767384 25.778367+-1.263850 v5.9/passive v5.9+patch/passive bt.C.c 25.330360+-1.028316 23.544036+-1.020189 lu.C.x 35.872659+-4.872090 23.719295+-3.883848 sp.C.x 32.141310+-2.289541 29.125363+-0.872300 ua.C.x 29.024597+-1.667049 25.728888+-1.539772 On the smaller data sets (A and B) and on the other NAS benchmarks there is no impact on performance. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> --- kernel/sched/fair.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index aa4c6227cd6d..9b23dad883ee 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5804,6 +5804,9 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync) if (sync && cpu_rq(this_cpu)->nr_running == 1) return this_cpu; + if (available_idle_cpu(prev_cpu)) + return prev_cpu; + return nr_cpumask_bits; }
WARNING: multiple messages have this Message-ID (diff)
From: Julia Lawall <Julia.Lawall@inria.fr> To: Ingo Molnar <mingo@redhat.com> Cc: kernel-janitors@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Steven Rostedt <rostedt@goodmis.org>, Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>, Daniel Bristot de Oliveira <bristot@redhat.com>, linux-kernel@vger.kernel.org, Valentin Schneider <valentin.schneider@arm.com>, Gilles.Muller@inria.fr Subject: [PATCH] sched/fair: check for idle core Date: Tue, 20 Oct 2020 16:37:59 +0000 [thread overview] Message-ID: <1603211879-1064-1-git-send-email-Julia.Lawall@inria.fr> (raw) On a thread wakeup, the change [1] from runnable load average to load average for comparing candidate cores means that recent short-running daemons on the core where a thread ran previously can be considered to have a higher load than the core performing the wakeup, even when the core where the thread ran previously is currently idle. This can cause a thread to migrate, taking the place of some other thread that is about to wake up, and so on. To avoid unnecessary migrations, extend wake_affine_idle to check whether the core where the thread previously ran is currently idle, and if so return that core as the target. [1] commit 11f10e5420f6ce ("sched/fair: Use load instead of runnable load in wakeup path") This particularly has an impact when using passive (intel_cpufreq) power management, where kworkers run every 0.004 seconds on all cores, increasing the likelihood that an idle core will be considered to have a load. The following numbers were obtained with the benchmarking tool hyperfine (https://github.com/sharkdp/hyperfine) on the NAS parallel benchmarks (https://www.nas.nasa.gov/publications/npb.html). The tests were run on an 80-core Intel(R) Xeon(R) CPU E7-8870 v4 @ 2.10GHz. Active (intel_pstate) and passive (intel_cpufreq) power management were used. Times are in seconds. All experiments use all 160 hardware threads. v5.9/active v5.9+patch/active bt.C.c 24.725724+-0.962340 23.349608+-1.607214 lu.C.x 29.105952+-4.804203 25.249052+-5.561617 sp.C.x 31.220696+-1.831335 30.227760+-2.429792 ua.C.x 26.606118+-1.767384 25.778367+-1.263850 v5.9/passive v5.9+patch/passive bt.C.c 25.330360+-1.028316 23.544036+-1.020189 lu.C.x 35.872659+-4.872090 23.719295+-3.883848 sp.C.x 32.141310+-2.289541 29.125363+-0.872300 ua.C.x 29.024597+-1.667049 25.728888+-1.539772 On the smaller data sets (A and B) and on the other NAS benchmarks there is no impact on performance. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> --- kernel/sched/fair.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index aa4c6227cd6d..9b23dad883ee 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5804,6 +5804,9 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync) if (sync && cpu_rq(this_cpu)->nr_running = 1) return this_cpu; + if (available_idle_cpu(prev_cpu)) + return prev_cpu; + return nr_cpumask_bits; }
next reply other threads:[~2020-10-20 17:21 UTC|newest] Thread overview: 134+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-10-20 16:37 Julia Lawall [this message] 2020-10-20 16:37 ` [PATCH] sched/fair: check for idle core Julia Lawall 2020-10-21 7:29 ` Vincent Guittot 2020-10-21 7:29 ` Vincent Guittot 2020-10-21 11:13 ` Peter Zijlstra 2020-10-21 11:13 ` Peter Zijlstra 2020-10-21 12:27 ` Vincent Guittot 2020-10-21 12:27 ` Vincent Guittot 2020-10-21 11:20 ` Mel Gorman 2020-10-21 11:20 ` Mel Gorman 2020-10-21 11:56 ` Julia Lawall 2020-10-21 11:56 ` Julia Lawall 2020-10-21 12:19 ` Peter Zijlstra 2020-10-21 12:19 ` Peter Zijlstra 2020-10-21 12:42 ` Julia Lawall 2020-10-21 12:42 ` Julia Lawall 2020-10-21 12:52 ` Peter Zijlstra 2020-10-21 12:52 ` Peter Zijlstra 2020-10-21 13:43 ` Julia Lawall 2020-10-21 18:18 ` Rafael J. Wysocki 2020-10-21 18:18 ` Rafael J. Wysocki 2020-10-21 18:15 ` Rafael J. Wysocki 2020-10-21 18:15 ` Rafael J. Wysocki 2020-10-21 19:47 ` Julia Lawall 2020-10-21 19:47 ` Julia Lawall 2020-10-21 20:25 ` Rafael J. Wysocki 2020-10-21 20:25 ` Rafael J. Wysocki 2020-10-21 13:10 ` Peter Zijlstra 2020-10-21 13:10 ` Peter Zijlstra 2020-10-21 18:11 ` Rafael J. Wysocki 2020-10-21 18:11 ` Rafael J. Wysocki 2020-10-22 4:41 ` Viresh Kumar 2020-10-22 4:53 ` Viresh Kumar 2020-10-22 7:11 ` Peter Zijlstra 2020-10-22 7:11 ` Peter Zijlstra 2020-10-22 10:47 ` Viresh Kumar 2020-10-22 10:59 ` Viresh Kumar 2020-10-22 11:45 ` Rafael J. Wysocki 2020-10-22 11:45 ` Rafael J. Wysocki 2020-10-22 12:02 ` default cpufreq gov, was: " Peter Zijlstra 2020-10-22 12:02 ` Peter Zijlstra 2020-10-22 12:19 ` Rafael J. Wysocki 2020-10-22 12:19 ` Rafael J. Wysocki 2020-10-22 12:29 ` Peter Zijlstra 2020-10-22 12:29 ` Peter Zijlstra 2020-10-22 14:52 ` Mel Gorman 2020-10-22 14:52 ` Mel Gorman 2020-10-22 14:58 ` Colin Ian King 2020-10-22 14:58 ` Colin Ian King 2020-10-22 15:12 ` Phil Auld 2020-10-22 15:12 ` Phil Auld 2020-10-22 16:35 ` Mel Gorman 2020-10-22 16:35 ` Mel Gorman 2020-10-22 17:59 ` Rafael J. Wysocki 2020-10-22 17:59 ` Rafael J. Wysocki 2020-10-22 20:32 ` Mel Gorman 2020-10-22 20:32 ` Mel Gorman 2020-10-22 20:39 ` Phil Auld 2020-10-22 20:39 ` Phil Auld 2020-10-22 15:25 ` Peter Zijlstra 2020-10-22 15:25 ` Peter Zijlstra 2020-10-22 15:55 ` Rafael J. Wysocki 2020-10-22 15:55 ` Rafael J. Wysocki 2020-10-22 16:29 ` Mel Gorman 2020-10-22 16:29 ` Mel Gorman 2020-10-22 20:10 ` Giovanni Gherdovich 2020-10-22 20:10 ` Giovanni Gherdovich 2020-10-22 20:16 ` Giovanni Gherdovich 2020-10-22 20:16 ` Giovanni Gherdovich 2020-10-23 7:03 ` Peter Zijlstra 2020-10-23 7:03 ` Peter Zijlstra 2020-10-23 17:46 ` Tom Lendacky 2020-10-23 17:46 ` Tom Lendacky 2020-10-26 19:52 ` Fontenot, Nathan 2020-10-26 19:52 ` Fontenot, Nathan 2020-10-22 15:45 ` A L 2020-10-22 15:45 ` A L 2020-10-22 15:55 ` Vincent Guittot 2020-10-22 15:55 ` Vincent Guittot 2020-10-23 5:11 ` Viresh Kumar 2020-10-23 5:23 ` Viresh Kumar 2020-10-22 16:23 ` [PATCH] cpufreq: Avoid configuring old governors as default with intel_pstate Rafael J. Wysocki 2020-10-22 16:23 ` Rafael J. Wysocki 2020-10-23 6:17 ` Viresh Kumar 2020-10-23 6:29 ` Viresh Kumar 2020-10-23 11:59 ` Rafael J. Wysocki 2020-10-23 11:59 ` Rafael J. Wysocki 2020-10-23 15:15 ` [PATCH v2] " Rafael J. Wysocki 2020-10-23 15:15 ` Rafael J. Wysocki 2020-10-27 3:01 ` Viresh Kumar 2020-10-27 3:13 ` Viresh Kumar 2020-10-27 11:11 ` default cpufreq gov, was: [PATCH] sched/fair: check for idle core Qais Yousef 2020-10-27 11:26 ` Valentin Schneider 2020-10-27 11:42 ` Qais Yousef 2020-10-27 11:48 ` Viresh Kumar 2020-10-27 11:48 ` Viresh Kumar 2020-10-23 6:12 ` Viresh Kumar 2020-10-23 6:24 ` Viresh Kumar 2020-10-23 15:06 ` Rafael J. Wysocki 2020-10-23 15:06 ` Rafael J. Wysocki 2020-10-27 3:01 ` Viresh Kumar 2020-10-27 3:13 ` Viresh Kumar 2020-10-22 11:21 ` AW: " Walter Harms 2020-10-22 11:21 ` Walter Harms 2020-10-21 12:28 ` Mel Gorman 2020-10-21 12:28 ` Mel Gorman 2020-10-21 12:25 ` Vincent Guittot 2020-10-21 12:25 ` Vincent Guittot 2020-10-21 12:47 ` Mel Gorman 2020-10-21 12:47 ` Mel Gorman 2020-10-21 12:56 ` Julia Lawall 2020-10-21 12:56 ` Julia Lawall 2020-10-21 13:18 ` Mel Gorman 2020-10-21 13:18 ` Mel Gorman 2020-10-21 13:24 ` Julia Lawall 2020-10-21 13:24 ` Julia Lawall 2020-10-21 15:08 ` Mel Gorman 2020-10-21 15:08 ` Mel Gorman 2020-10-21 15:18 ` Julia Lawall 2020-10-21 15:18 ` Julia Lawall 2020-10-21 15:23 ` Vincent Guittot 2020-10-21 15:23 ` Vincent Guittot 2020-10-21 15:33 ` Julia Lawall 2020-10-21 15:33 ` Julia Lawall 2020-10-21 15:19 ` Vincent Guittot 2020-10-21 15:19 ` Vincent Guittot 2020-10-21 17:00 ` Mel Gorman 2020-10-21 17:00 ` Mel Gorman 2020-10-21 17:39 ` Julia Lawall 2020-10-21 17:39 ` Julia Lawall 2020-10-21 13:48 ` Julia Lawall 2020-10-21 13:48 ` Julia Lawall 2020-10-21 15:26 ` Mel Gorman 2020-10-21 15:26 ` Mel Gorman
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=1603211879-1064-1-git-send-email-Julia.Lawall@inria.fr \ --to=julia.lawall@inria.fr \ --cc=Gilles.Muller@inria.fr \ --cc=bristot@redhat.com \ --cc=bsegall@google.com \ --cc=dietmar.eggemann@arm.com \ --cc=juri.lelli@redhat.com \ --cc=kernel-janitors@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=mgorman@suse.de \ --cc=mingo@redhat.com \ --cc=peterz@infradead.org \ --cc=rostedt@goodmis.org \ --cc=valentin.schneider@arm.com \ --cc=vincent.guittot@linaro.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.