From: Giovanni Gherdovich <ggherdovich@suse.com>
To: Peter Zijlstra <peterz@infradead.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Mel Gorman <mgorman@suse.de>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Julia Lawall <julia.lawall@inria.fr>,
	Ingo Molnar <mingo@redhat.com>,
	kernel-janitors@vger.kernel.org,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	linux-kernel@vger.kernel.org,
	Valentin Schneider <valentin.schneider@arm.com>,
	Gilles Muller <Gilles.Muller@inria.fr>,
	srinivas.pandruvada@linux.intel.com,
	Linux PM <linux-pm@vger.kernel.org>,
	Len Brown <len.brown@intel.com>
Subject: Re: default cpufreq gov, was: [PATCH] sched/fair: check for idle core
Date: Thu, 22 Oct 2020 20:10:35 +0000
Message-ID: <1603397435.16275.45.camel@suse.com>
In-Reply-To: <20201022152514.GJ2611@hirez.programming.kicks-ass.net>

Hello Peter, Rafael,

back in August I tested a v5.8 kernel with Rafael's patches from v5.9 that
make schedutil and HWP work together, i.e. f6ebbcf08f37 ("cpufreq: intel_pstate:
Implement passive mode with HWP enabled").

The main point I took from the exercise is that tbench (network benchmark
on localhost) is problematic for schedutil: only with HWP (thanks to
Rafael's patch above) does it reach the throughput of the other governors.
When HWP isn't available, the penalty is 5-10% and I need to understand if
the cause is something that can affect other applications too (or is just a
quirk of this test).

I ran this campaign this summer when Rafael CC'ed me on f6ebbcf08f37
("cpufreq: intel_pstate: Implement passive mode with HWP enabled").
I didn't reply as the patch was a win anyway (my bad, I should have posted
the positive results). The regression of tbench with schedutil w/o HWP,
which had gone unnoticed for a long time, got most of my attention.

Other remarks

* on gitsource (running the git unit test suite, measures elapsed time)
  schedutil is a lot better than Intel's powersave but not as good as the
  performance governor.

* for the AMD EPYC machines we haven't yet implemented frequency invariant
  accounting, which might explain why schedutil loses to ondemand on all
  the benchmarks.

* on dbench (filesystem, measures latency) and kernbench (kernel compilation),
  sugov is as good as the Intel performance governor. You can add or remove
  HWP (to either sugov or perfgov), it doesn't make a difference. Intel's
  powersave in general trails behind.

* generally my main concern is performance, not power efficiency, but I was
  a little disappointed to see schedutil being just as efficient as
  perfgov (the performance-per-watt ratios): there are even a few cases
  where (on tbench) the performance governor is both faster and more
  efficient. From previous conversations with Rafael I recall that
  switching frequency has an energy cost, so it could be that schedutil
  switches too often to amortize it. I haven't checked.

To read the tables:

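Tilde (~) means the result is the same as baseline (i.e. the ratio is close
to 1). The double asterisk (**) is a visual aid and marks the cells where the
baseline (the first column) does worse than the governor in that column
(whose ratio is then higher or lower than 1, depending on the "better if"
direction).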
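
For illustration only, the cell convention can be sketched in Python. The
helper name and the 2% tolerance for "~" are assumptions of this sketch (the
actual threshold isn't stated here), and the "**" rule is as read off the
tables in this message: it flags cells where the other governor beats the
baseline column.

```python
def ratio_cell(baseline, value, higher_is_better, tolerance=0.02):
    """Format one table cell as the ratio of `value` to `baseline`.

    `tolerance` is a hypothetical cut-off for printing "~"; the real
    threshold used for the tables is not known.
    """
    ratio = value / baseline
    if abs(ratio - 1.0) <= tolerance:
        return "~"
    # "**" marks the cells where the compared governor wins, i.e. the
    # baseline is the worse of the two in the "better if" direction.
    other_wins = ratio > 1.0 if higher_is_better else ratio < 1.0
    return f"{ratio:.2f}" + ("**" if other_wins else "")


if __name__ == "__main__":
    # Made-up raw numbers, only to exercise the formatting.
    print(ratio_cell(100.0, 103.0, higher_is_better=True))   # throughput-like
    print(ratio_cell(100.0, 226.0, higher_is_better=False))  # elapsed-time-like
```

The same function would cover both the performance and the
performance-per-watt blocks, since only the "better if" direction changes.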

For an overview of the possible configurations (intel_pstate passive,
active, HWP on/off etc.) I made the diagram at
https://beta.suse.com/private/ggherdovich/cpufreq/x86-cpufreq.png
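
As a rough aid for telling which of those configurations a machine is in,
here is a small Python sketch that reads the standard cpufreq sysfs
attributes. The function names are made up for this example; it looks only
at one CPU's policy and does not detect whether HWP is enabled. With
intel_pstate, scaling_driver reads "intel_pstate" in active mode and
"intel_cpufreq" in passive mode.

```python
from pathlib import Path


def read_attr(path):
    """Return the stripped content of a sysfs attribute, or None if absent."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def cpufreq_config(sysfs_cpu="/sys/devices/system/cpu", cpu=0):
    """Collect driver, governor and intel_pstate status for one CPU.

    `sysfs_cpu` is a parameter only so the function can be pointed at a
    fake directory tree; on a real system the default applies.
    """
    root = Path(sysfs_cpu)
    policy = root / f"cpu{cpu}" / "cpufreq"
    return {
        "driver": read_attr(policy / "scaling_driver"),
        "governor": read_attr(policy / "scaling_governor"),
        # Present only when the intel_pstate driver is in use.
        "intel_pstate_status": read_attr(root / "intel_pstate" / "status"),
    }


if __name__ == "__main__":
    for key, value in cpufreq_config().items():
        print(f"{key}: {value}")
```

On machines without intel_pstate (e.g. the AMD ones) the last entry simply
comes back as None.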

1) INTEL, HWP-CAPABLE MACHINES
2) INTEL, NON-HWP-CAPABLE MACHINES
3) AMD EPYC

1) INTEL, HWP-CAPABLE MACHINES:

64x_SKYLAKE_NUMA: Intel Skylake SP, 32 cores / 64 threads, NUMA, SATA SSD storage
------------------------------------------------------------------------------
            sugov-HWP   sugov-no-HWP   powersave-HWP   perfgov-HWP   better if
------------------------------------------------------------------------------
                                  PERFORMANCE RATIOS
tbench        1.00        0.68           ~               1.03**        higher
dbench        1.00        ~              1.03            ~             lower
kernbench     1.00        ~              1.11            ~             lower
gitsource     1.00        1.03           2.26            0.82**        lower
------------------------------------------------------------------------------
                             PERFORMANCE-PER-WATT RATIOS
tbench        1.00        0.74           ~               ~             higher
dbench        1.00        ~              ~               ~             higher
kernbench     1.00        ~              0.96            ~             higher
gitsource     1.00        0.96           0.45            1.15**        higher


8x_SKYLAKE_UMA: Intel Skylake (client), 4 cores / 8 threads, UMA, SATA SSD storage
------------------------------------------------------------------------------
            sugov-HWP   sugov-no-HWP   powersave-HWP   perfgov-HWP   better if
------------------------------------------------------------------------------
                                  PERFORMANCE RATIOS
tbench        1.00        0.91           ~               ~             higher
dbench        1.00        ~              ~               ~             lower
kernbench     1.00        ~              ~               ~             lower
gitsource     1.00        1.04           1.77            ~             lower
------------------------------------------------------------------------------
                             PERFORMANCE-PER-WATT RATIOS
tbench        1.00        0.95           ~               ~             higher
dbench        1.00        ~              ~               ~             higher
kernbench     1.00        ~              ~               ~             higher
gitsource     1.00        ~              0.74            ~             higher


8x_COFFEELAKE_UMA: Intel Coffee Lake, 4 cores / 8 threads, UMA, NVMe SSD storage
---------------------------------------------------------------
            sugov-HWP   powersave-HWP   perfgov-HWP   better if
---------------------------------------------------------------
                        PERFORMANCE RATIOS
tbench        1.00        ~               ~             higher
dbench        1.00        1.12            ~             lower
kernbench     1.00        ~               ~             lower
gitsource     1.00        2.05            ~             lower
---------------------------------------------------------------
                    PERFORMANCE-PER-WATT RATIOS
tbench        1.00        ~               ~             higher
dbench        1.00        1.80**          ~             higher
kernbench     1.00        ~               ~             higher
gitsource     1.00        1.52**          ~             higher


2) INTEL, NON-HWP-CAPABLE MACHINES:

80x_BROADWELL_NUMA: Intel Broadwell EP, 40 cores / 80 threads, NUMA, SATA SSD storage
---------------------------------------------------------------
              sugov     powersave       perfgov       better if
---------------------------------------------------------------
                        PERFORMANCE RATIOS
tbench        1.00        1.11**          1.10**        higher
dbench        1.00        1.10            ~             lower
kernbench     1.00        1.10            ~             lower
gitsource     1.00        2.27            0.95**        lower
---------------------------------------------------------------
                    PERFORMANCE-PER-WATT RATIOS
tbench        1.00         1.05**         1.04**        higher
dbench        1.00         1.24**         0.95          higher
kernbench     1.00         ~              ~             higher
gitsource     1.00         0.86           1.04**        higher


48x_HASWELL_NUMA: Intel Haswell EP, 24 cores / 48 threads, NUMA, HDD storage
---------------------------------------------------------------
              sugov     powersave       perfgov       better if
---------------------------------------------------------------
                        PERFORMANCE RATIOS
tbench        1.00         1.25**         1.27**        higher
dbench        1.00         1.17           ~             lower
kernbench     1.00         1.04           ~             lower
gitsource     1.00         1.54           0.79**        lower
---------------------------------------------------------------
                    PERFORMANCE-PER-WATT RATIOS
tbench        1.00         1.18**         1.11**        higher
dbench        1.00         1.25**         ~             higher
kernbench     1.00         1.04**         0.97          higher
gitsource     1.00         0.77           ~             higher


3) AMD EPYC:

256x_ROME_NUMA: AMD Rome, 128 cores / 256 threads, NUMA, SATA SSD storage
---------------------------------------------------------------
              sugov      ondemand       perfgov       better if
---------------------------------------------------------------
                        PERFORMANCE RATIOS
tbench        1.00         1.11**       1.58**          higher
dbench        1.00         0.44**       0.40**          lower
kernbench     1.00         ~            0.91**          lower
gitsource     1.00         0.96**       0.65**          lower


128x_NAPLES_NUMA: AMD Naples, 64 cores / 128 threads, NUMA, SATA SSD storage
---------------------------------------------------------------
              sugov      ondemand       perfgov       better if
---------------------------------------------------------------
                        PERFORMANCE RATIOS
tbench        1.00         1.10**       1.19**          higher
dbench        1.00         1.05         0.95**          lower
kernbench     1.00         ~            0.95**          lower
gitsource     1.00         0.93**       0.55**          lower


Giovanni
