linux-kernel.vger.kernel.org archive mirror
From: "Doug Smythies" <dsmythies@telus.net>
To: "'Rafael J. Wysocki'" <rjw@rjwysocki.net>
Cc: "'LKML'" <linux-kernel@vger.kernel.org>,
	"'Linux PM'" <linux-pm@vger.kernel.org>,
	"Doug Smythies" <dsmythies@telus.net>
Subject: RE: [PATCH v1 0/5] cpuidle: teo: Rework the idle state selection logic
Date: Sun, 4 Jul 2021 14:01:03 -0700
Message-ID: <007101d77117$b3b837a0$1b28a6e0$@telus.net>
In-Reply-To: <1867445.PYKUYFuaPT@kreacher>

Hi Rafael,

On 2021.06.02 11:14 Rafael J. Wysocki wrote:

> Hi All,
>
> This series of patches addresses some theoretical shortcoming in the
> TEO (Timer Events Oriented) cpuidle governor by reworking its idle
> state selection logic to some extent.
>
> Patches [1-2/5] are introductory cleanups and the substantial changes are
> made in patches [3-4/5] (please refer to the changelogs of these two
> patches for details).  The last patch only deals with documentation.
>
> Even though this work is mostly based on theoretical considerations, it
> shows a measurable reduction of the number of cases in which the shallowest
> idle state is selected while it would be more beneficial to select a deeper
> one or the deepest idle state is selected while it would be more beneficial
> to select a shallower one, which should be a noticeable improvement.

Do you have any test results to share? Or test methods that I can try?
I have done a few tests, and generally don't notice much difference.
Perhaps an increase in the idle state 2 "below" (was too shallow) numbers.
I am searching for some results that would offset the below:

The difficulty I am having with this patch set is the additional overhead
which becomes significant at the extremes, where idle state 0 is dominant.
Throughout the history of teo, I have used multiple one core pipe-tests
for this particular test. Some results:

CPU: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
HWP: disabled
CPU frequency scaling driver: intel_pstate, active, powersave
Pipe-tests are run forever, printing average loop time for the
last 2.5 million loops. 1021 of those averages are then averaged again.
Total = 2.5525e9 loops
The power and idle data are sampled for 100 minutes.
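
For readers unfamiliar with this kind of test, here is a minimal sketch of a
pipe ping-pong loop. It is not the pipe-test program used for the numbers in
this message; the function name pipe_ping_pong and the Python implementation
are assumptions for illustration only, and interpreter overhead makes its
absolute loop times much higher than the figures reported here.

```python
import os
import time

# Figures quoted in the test description above.
LOOPS_PER_AVERAGE = 2_500_000
AVERAGES = 1021
TOTAL_LOOPS = LOOPS_PER_AVERAGE * AVERAGES  # 2.5525e9 loops


def pipe_ping_pong(loops: int) -> float:
    """Return the average round-trip time in microseconds per loop."""
    # Two unidirectional pipes: parent -> child and child -> parent.
    p2c_r, p2c_w = os.pipe()
    c2p_r, c2p_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: echo every token straight back, then exit immediately
        # without running any of the parent's cleanup code.
        try:
            os.close(p2c_w)
            os.close(c2p_r)
            for _ in range(loops):
                token = os.read(p2c_r, 1)
                os.write(c2p_w, token)
        finally:
            os._exit(0)

    # Parent: send a token, block until it comes back, repeat.
    os.close(p2c_r)
    os.close(c2p_w)
    start = time.perf_counter()
    for _ in range(loops):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
    elapsed = time.perf_counter() - start

    os.close(p2c_w)
    os.close(c2p_r)
    os.waitpid(pid, 0)
    return elapsed / loops * 1e6


if __name__ == "__main__":
    print(f"{pipe_ping_pong(10_000):.3f} uSec/loop")
```

Each blocking read gives the CPU a very short idle opportunity, which is why
this style of test exercises the shallowest idle state so heavily.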

Note 1: Other tests were also done, including with passive/schedutil,
but the scaling governor isn't relevant for this test because the
CPU frequency stays pinned at maximum.

Note 2: I use TCC offset for thermal throttling, but I disabled it
for these tests, because the temperature needed to go higher
than my normal throttling point.

Idle configuration 1: As a COMETLAKE processor, with 4 idle states.
Kernel 5.13-rc4.

Before patch set average:
2.8014 uSec/loop
113.9 watts
Idle state 0 residency: 9.450%
Idle state 0 entries per minute: 256,812,896.6

After patch set average:
2.8264 uSec/loop, 0.89% slower
114.0 watts
Idle state 0 residency: 8.677% 
Idle state 0 entries per minute: 254,560,049.9

Menu governor:
2.8051 uSec/loop, 0.13% slower
113.9 watts
Idle state 0 residency: 8.437% 
Idle state 0 entries per minute: 256,436,417.2

O.K., perhaps not so bad, but also not many idle states.

Idle configuration 2: As a SKYLAKE processor, with 9 idle states.
i.e.:
/drivers/idle/intel_idle.c
static const struct x86_cpu_id intel_idle_ids[] __initconst
...
   X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &idle_cpu_skx),
+ X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE, &idle_cpu_skl),

Purpose: To demonstrate increasing overhead as a function of number
of idle states.
Kernel 5.13.
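
To confirm how many idle states a given configuration exposes, the standard
cpuidle sysfs directory can be listed. A minimal sketch, assuming CPU 0 is
representative; the helper names state_index and list_idle_states are
hypothetical:

```python
import re
from pathlib import Path

# Standard cpuidle sysfs location for CPU 0 (assumed representative).
CPUIDLE_DIR = Path("/sys/devices/system/cpu/cpu0/cpuidle")


def state_index(dirname: str) -> int:
    """Extract N from a directory name of the form 'stateN'."""
    match = re.fullmatch(r"state(\d+)", dirname)
    if match is None:
        raise ValueError(f"not a cpuidle state directory: {dirname}")
    return int(match.group(1))


def list_idle_states(base: Path = CPUIDLE_DIR) -> list:
    """Return (index, name) pairs for every idle state under base."""
    if not base.is_dir():
        return []
    dirs = sorted(
        (d for d in base.glob("state[0-9]*") if d.is_dir()),
        key=lambda d: state_index(d.name),  # numeric, so state10 follows state9
    )
    return [(state_index(d.name), (d / "name").read_text().strip()) for d in dirs]


if __name__ == "__main__":
    for index, name in list_idle_states():
        print(index, name)
```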

Before patch set average:
2.8394 uSec/loop
114.2 watts
Idle state 0 residency: 7.212%
Idle state 0 entries per minute: 253,391,954.3

After patch set average:
2.9103 uSec/loop, 2.5% slower
114.4 watts, 0.18% more
Idle state 0 residency: 6.152%, 14.7% less.
Idle state 0 entries per minute: 244,024,752.1

Menu governor:
2.8141 uSec/loop, 0.89% faster
113.9 watts,  0.26% less
Idle state 0 residency: 7.167%, 0.6% less
Idle state 0 entries per minute: 255,650,610.7
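
The relative changes quoted for idle configuration 2 can be rechecked from
the raw averages. A small sketch; the function name percent_change and the
dictionary keys are hypothetical, the numbers are the ones listed above:

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Idle configuration 2 averages, copied from the results above.
before = {"usec_per_loop": 2.8394, "watts": 114.2, "state0_residency_pct": 7.212}
after = {"usec_per_loop": 2.9103, "watts": 114.4, "state0_residency_pct": 6.152}
menu = {"usec_per_loop": 2.8141, "watts": 113.9, "state0_residency_pct": 7.167}

if __name__ == "__main__":
    for label, run in (("after", after), ("menu", menu)):
        for key, base in before.items():
            print(f"{label:5s} {key:22s} {percent_change(base, run[key]):+.2f}%")
```

A positive change in uSec/loop means slower, since more time is spent per
loop.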

Another potentially interesting test was the ebizzy test:
Records per second, averaged over many tests, varying
threads and intervals:

passive, schedutil:
Before: 6771.977
After: 5502.643, -18.7%
Menu: 10728.89, +58.4%

active, powersave:
Before: 8361.82
After: 8463.31, +1.2%
Menu: 8225.58, -1.6%

I think the differences have more to do with the CPU frequency scaling
governors than with this patch set, so:

performance:
Before: 12137.33
After: 12083.26, -0.4%
Menu: 11983.73, -1.3%

These and other test results available here:
(encoded to prevent a barrage of bots)

double u double u double u dot smythies dot com
/~doug/linux/idle/teo-2021-06/

... a day later ...

I might have an answer to my own question.
By switching to cross core pipe-tests, and only loading down one
CPU per core, I was able to get a lot more activity in other idle states.
The test runs for 100 minutes, and the results change with time, but
I'll leave that investigation for another day (there is no throttling):

1st 50 tests:
Before: 3.888 uSec/loop
After: 3.764 uSec/loop
Menu: 3.464 uSec/loop

Tests 50 to 100:
Before: 4.329 uSec/loop
After: 3.919 uSec/loop
Menu: 3.514 uSec/loop

Tests 200 to 250:
Before: 5.089 uSec/loop
After: 4.364 uSec/loop
Menu: 4.619 uSec/loop
 
Tests 280 to 330:
Before: 5.142 uSec/loop
After: 4.464 uSec/loop
Menu: 4.619 uSec/loop

Notice that the "after" case (with this patch set applied) eventually
does better than the menu governor. Its processor package power
always remains lower than with the menu governor.

The results can be viewed graphically at the above link, but the
most dramatic results are:

Idle state 3 above % goes from 70% to 5%.
Idle state 2 below % goes from 13% to less than 1%.
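
For anyone wanting to reproduce such numbers, the kernel exposes per-state
"usage", "above" and "below" counters in the cpuidle sysfs directory. A
minimal sketch, assuming the percentages are computed as the above (or below)
count divided by the usage count over the sampling interval; that formula and
the helper names read_counters and above_below_percent are assumptions, not
taken from this message:

```python
from pathlib import Path


def read_counters(state_dir: Path) -> dict:
    """Read the cumulative usage/above/below counters for one idle state."""
    return {
        name: int((state_dir / name).read_text())
        for name in ("usage", "above", "below")
    }


def above_below_percent(start: dict, end: dict) -> tuple:
    """Return (above %, below %) of state entries between two samples."""
    usage = end["usage"] - start["usage"]
    if usage <= 0:
        # The state was never entered during the interval.
        return (0.0, 0.0)
    above = end["above"] - start["above"]
    below = end["below"] - start["below"]
    return (100.0 * above / usage, 100.0 * below / usage)
```

Typical use would be to call read_counters() on, for example,
/sys/devices/system/cpu/cpu0/cpuidle/state3 before and after a test run and
pass both samples to above_below_percent().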
 
... Doug

