From: Giovanni Gherdovich <ggherdovich@suse.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Linux PM <linux-pm@vger.kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Doug Smythies <dsmythies@telus.net>
Subject: Re: [PATCH v2 0/3] cpufreq: Allow drivers to receive more information from the governor
Date: Fri, 18 Dec 2020 17:11:45 +0100
Message-ID: <1608307905.26567.46.camel@suse.com>
In-Reply-To: <3827230.0GnL3RTcl1@kreacher>

On Mon, 2020-12-14 at 21:01 +0100, Rafael J. Wysocki wrote:
> Hi,
> 
> The timing of this is not perfect (sorry about that), but here's a refresh
> of this series.
> 
> The majority of the previous cover letter still applies:
> [...]

Hello,

I tested the series using

-> tbench (packet processing with loopback networking, measures throughput)
-> dbench (filesystem operations, measures average latency)
-> kernbench (kernel compilation, elapsed time)
-> and gitsource (long-running shell script, elapsed time)

These were chosen because none of them is compute-bound and all are
sensitive to frequency scaling decisions. The machines are a Cascade Lake
based server, a client Skylake and a Coffee Lake laptop.

What's being compared:

sugov-HWP.desired : the present series;  intel_pstate=passive,  governor=schedutil
sugov-HWP.min     : mainline;            intel_pstate=passive,  governor=schedutil
powersave-HWP     : mainline;            intel_pstate=active,   governor=powersave
perfgov-HWP       : mainline;            intel_pstate=active,   governor=performance
sugov-no-HWP      : HWP disabled;        intel_pstate=passive,  governor=schedutil
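
To double-check which of these setups a machine is actually running, the
driver mode and governor can be read back from sysfs. The following is only
a minimal sketch: read_cpufreq_setup() is a hypothetical helper, not part of
the test harness used here. The paths are the usual cpufreq / intel_pstate
sysfs files. The HWP.min vs HWP.desired difference lives in the kernel code
and is not visible this way, and whether HWP is disabled (for instance with
the intel_pstate=no_hwp boot parameter, which is an assumption about how
sugov-no-HWP was set up) is not reported either.

```python
from pathlib import Path


def read_cpufreq_setup(sysfs_root="/sys", cpu=0):
    """Return the intel_pstate mode, scaling driver and governor for one CPU.

    Hypothetical helper for illustration. Files that are missing or
    unreadable (e.g. on a non-Intel machine) are reported as None.
    """
    cpu_dir = Path(sysfs_root) / "devices/system/cpu"

    def read(path):
        try:
            return path.read_text().strip()
        except OSError:
            return None

    return {
        # "active", "passive" or "off"
        "intel_pstate_status": read(cpu_dir / "intel_pstate/status"),
        "scaling_driver": read(cpu_dir / f"cpu{cpu}/cpufreq/scaling_driver"),
        # e.g. "schedutil", "powersave" or "performance"
        "scaling_governor": read(cpu_dir / f"cpu{cpu}/cpufreq/scaling_governor"),
    }


if __name__ == "__main__":
    for key, value in read_cpufreq_setup().items():
        print(f"{key}: {value}")
```

The sysfs_root parameter exists only so the helper can be pointed at a
scratch directory when trying it out.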

Dbench and Kernbench have neutral results, but on Tbench sugov-HWP.desired
loses in both performance and performance-per-watt, while Gitsource shows the
series as faster in raw performance but again worse than the competition in
efficiency.

1. SUMMARY BY BENCHMARK
   1.1. TBENCH
   1.2. DBENCH
   1.3. KERNBENCH
   1.4. GITSOURCE
2. SUMMARY BY USER PROFILE
   2.1. PERFORMANCE USER: what if I switch perfgov -> schedutil?
   2.2. DEFAULT USER: what if I switch powersave -> schedutil?
   2.3. DEVELOPER: what if I switch sugov-HWP.min -> sugov-HWP.desired?
3. RESULTS TABLES
   PERFORMANCE RATIOS
   PERFORMANCE-PER-WATT RATIOS


1. SUMMARY BY BENCHMARK
~~~~~~~~~~~~~~~~~~~~~~~

Tbench: sugov-HWP.desired has the worst performance on all three
    machines. sugov-HWP.min is between 20% and 90% better. The baseline
    sugov-HWP.desired offers a lower throughput, but does it increase
    efficiency? It actually doesn't: on two out of three machines the
    incumbent code (current sugov, or intel_pstate=active) has 10% to 35%
    better efficiency. In other words, the status quo is both faster and more
    efficient than the proposed series on this benchmark.
    The absolute power consumption of the series is lower, but the delivered
    performance is "even more lower", and that's why performance-per-watt
    shows a net loss.

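The power side of the tbench result can be backed out of the two ratios in
the Cascade Lake table. This is a sketch only, under the assumption that
performance-per-watt is throughput divided by average power, so that the
power ratio equals the performance ratio divided by the performance-per-watt
ratio. implied_power_ratio() is a made-up name for illustration.

```python
def implied_power_ratio(perf_ratio, perf_per_watt_ratio):
    """Power ratio implied by a performance and a performance-per-watt ratio.

    Assumes perf-per-watt = throughput / average power, hence
    power_ratio = perf_ratio / perf_per_watt_ratio.
    """
    return perf_ratio / perf_per_watt_ratio


# tbench on 80x_CASCADELAKE_NUMA: sugov-HWP.min relative to sugov-HWP.desired
perf = 1.89           # PERFORMANCE RATIOS row
perf_per_watt = 1.36  # PERFORMANCE-PER-WATT RATIOS row

power = implied_power_ratio(perf, perf_per_watt)
print(f"sugov-HWP.min power vs sugov-HWP.desired:      {power:.2f}x")
print(f"sugov-HWP.desired throughput vs sugov-HWP.min: {1 / perf:.2f}x")
print(f"sugov-HWP.desired power vs sugov-HWP.min:      {1 / power:.2f}x")
```

Under that assumption the series draws roughly 0.72x the power of
sugov-HWP.min but delivers only about 0.53x of its throughput.
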
Dbench: generally neutral, in both performance and efficiency. powersave-HWP
    is occasionally behind the pack in performance, by 5% to 15%. Its 15%
    performance loss on the Coffee Lake is compensated by an 80% improvement
    in efficiency. Note that on the same Coffee Lake sugov-no-HWP is 20%
    ahead of the pack in efficiency.

Kernbench: neutral, in both performance and efficiency. powersave-HWP loses
    14% to the pack in performance on the Cascade Lake.

Gitsource: this test shows the most compelling case against the
    sugov-HWP.desired series: on the Cascade Lake sugov-HWP.desired is 10%
    faster than sugov-HWP.min (it was expected to be slower!) and 35% less
    efficient (we expected more performance-per-watt, not less).
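
For reference, this is how the gitsource figures relate to the Cascade Lake
table entries for sugov-HWP.min (1.11 elapsed-time ratio, 1.36
performance-per-watt ratio, both relative to sugov-HWP.desired = 1.00). A
small sketch; pct_below() is a hypothetical helper name.

```python
def pct_below(ratio):
    """Percent by which the baseline (1.00) sits below a value `ratio` times as large."""
    return (1 - 1 / ratio) * 100


# gitsource on 80x_CASCADELAKE_NUMA, sugov-HWP.min column
time_ratio = 1.11  # elapsed time, lower is better
ppw_ratio = 1.36   # performance-per-watt, higher is better

print(f"sugov-HWP.desired needs {pct_below(time_ratio):.1f}% less time")
print(f"sugov-HWP.desired has {pct_below(ppw_ratio):.1f}% lower perf-per-watt")
print(f"sugov-HWP.min has {(ppw_ratio - 1) * 100:.0f}% higher perf-per-watt")
```

Read from the sugov-HWP.min side the efficiency gap is 36%, which is roughly
the 35% quoted above; read from the sugov-HWP.desired side it is about 26%.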


2. SUMMARY BY USER PROFILE
~~~~~~~~~~~~~~~~~~~~~~~~~~

If I were a perfgov-HWP user, I would be 20%-90% faster than with other governors
on tbench and gitsource. This speed gap comes with an unexpected efficiency
bonus on both tests. Since dbench and kernbench have a flat profile across the
board, there is no incentive to try another governor.

If I were a powersave-HWP user, I'd be the slowest of the bunch. The lost
performance is not, in general, balanced by better efficiency. This only
happens on Coffee Lake, which is a CPU for the mobile market where HWP
possibly has efficiency-oriented tuning. Any flavor of schedutil would be an
improvement.

From a developer's perspective, the obstacles to moving from HWP.min to
HWP.desired are tbench, where HWP.desired is worse than having no HWP support
at all, and gitsource, where HWP.desired has the opposite properties to
those advertised (it's actually faster but less efficient).


3. RESULTS TABLES
~~~~~~~~~~~~~~~~~

Tilde (~) means the result is the same as baseline (or, the ratio is close to 1).
The double asterisk (**) is a visual aid and means the result is better than
baseline (higher or lower depending on the case).


| 80x_CASCADELAKE_NUMA: Intel Cascade Lake, 40 cores / 80 threads, NUMA, SATA SSD storage
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|            sugov-HWP.des  sugov-HWP.min  powersave-HWP  perfgov-HWP  sugov-no-HWP   better if
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|                                         PERFORMANCE RATIOS
| tbench         1.00           1.89**         1.88**        1.89**        1.17**       higher
| dbench         1.00           ~              1.06          ~             ~            lower 
| kernbench      1.00           ~              1.14          ~             ~            lower 
| gitsource      1.00           1.11           2.70          0.80**        ~            lower 
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|                                    PERFORMANCE-PER-WATT RATIOS
| tbench         1.00           1.36**         1.38**        1.33**        1.04**       higher
| dbench         1.00           ~              ~             ~             ~            higher
| kernbench      1.00           ~              ~             ~             ~            higher
| gitsource      1.00           1.36**         0.63          1.22**        1.02**       higher


| 8x_COFFEELAKE_UMA: Intel Coffee Lake, 4 cores / 8 threads, UMA, NVMe SSD storage
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|            sugov-HWP.des  sugov-HWP.min  powersave-HWP  perfgov-HWP  sugov-no-HWP   better if
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|                                         PERFORMANCE RATIOS
| tbench         1.00           1.27**         1.30**        1.30**        1.31**       higher
| dbench         1.00           ~              1.15          ~             ~            lower 
| kernbench      1.00           ~              ~             ~             ~            lower 
| gitsource      1.00           ~              2.09          ~             ~            lower 
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|                                    PERFORMANCE-PER-WATT RATIOS
| tbench         1.00           ~              ~             ~             ~            higher
| dbench         1.00           ~              1.82**        ~             1.22**       higher
| kernbench      1.00           ~              ~             ~             ~            higher
| gitsource      1.00           ~              1.56**        ~             1.17**       higher


| 8x_SKYLAKE_UMA: Intel Skylake (client), 4 cores / 8 threads, UMA, SATA SSD storage
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|            sugov-HWP.des  sugov-HWP.min  powersave-HWP  perfgov-HWP  sugov-no-HWP   better if
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|                                         PERFORMANCE RATIOS
| tbench         1.00           1.21**         1.22**        1.20**        1.06**       higher
| dbench         1.00           ~              ~             ~             ~            lower 
| kernbench      1.00           ~              ~             ~             ~            lower 
| gitsource      1.00           ~              1.71          0.96**        ~            lower 
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|                                    PERFORMANCE-PER-WATT RATIOS
| tbench         1.00           1.11**         1.12**        1.10**        1.03**       higher
| dbench         1.00           ~              ~             ~             ~            higher
| kernbench      1.00           ~              ~             ~             ~            higher
| gitsource      1.00           ~              0.75          ~             ~            higher



Giovanni

