From: Thomas Ilsche <thomas.ilsche@tu-dresden.de>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Peter Zijlstra <peterz@infradead.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	"Frederic Weisbecker" <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>,
	Doug Smythies <dsmythies@telus.net>,
	"Rik van Riel" <riel@surriel.com>,
	Aubrey Li <aubrey.li@linux.intel.com>,
	"Mike Galbraith" <mgalbraith@suse.de>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFT][PATCH v5 0/7] sched/cpuidle: Idle loop rework
Date: Sat, 17 Mar 2018 13:42:19 +0100
Message-ID: <dfddfd651256472fa1b7c9db2a4dcb54@MSX-L104.msx.ad.zih.tu-dresden.de>
In-Reply-To: <2142751.3U6XgWyF8u@aspire.rjw.lan>

Over the last week I tested v4+pollv2 and now v5+pollv3. With v5, I
observe a particular idle behavior that I have not seen before with
v4. On a dual-socket Skylake system the idle power increases from
74.1 W (system total) to 85.5 W with a 300 Hz build and even to
138.3 W with a 1000 Hz build. A similar Haswell-EP system is also
affected.

There are phases during which one core keeps switching to the
highest C-state without disabling the sched tick. On every 4th sched
tick, a kworker is briefly scheduled on that core. Every wakeup of a
single core from C6 more than doubles the package power consumption of
*both* sockets for ~500 us, resulting in the significantly increased
sustained power consumption.
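
As a rough cross-check of the numbers above, the fraction of time spent
in the elevated-power window can be estimated from the tick rate. This
is only a back-of-the-envelope sketch: the function name and parameters
are made up for illustration, it assumes exactly one wakeup every 4th
tick on a single core and a fixed 500 us window, and it is not meant to
reproduce the measured system totals.

```c
#include <assert.h>

/*
 * Hypothetical helper (not kernel code): fraction of wall-clock time
 * spent in the elevated-power window, assuming one wakeup every
 * `ticks_per_wakeup` ticks and a fixed window of `window_us` per wakeup.
 */
double wakeup_duty_cycle(double hz, double ticks_per_wakeup, double window_us)
{
	double wakeups_per_sec;

	assert(hz > 0.0 && ticks_per_wakeup > 0.0 && window_us >= 0.0);

	/* e.g. 300 ticks/s with a wakeup on every 4th tick = 75 wakeups/s */
	wakeups_per_sec = hz / ticks_per_wakeup;

	/* each wakeup contributes window_us microseconds per second */
	return wakeups_per_sec * window_us / 1e6;
}
```

Under these assumptions, wakeup_duty_cycle(300.0, 4.0, 500.0) is 0.0375
(3.75 % of the time) and wakeup_duty_cycle(1000.0, 4.0, 500.0) is 0.125
(12.5 %), so the window is open more than three times as often on the
1000 Hz build.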

This is illustrated in [1]. For a comparison with a "normal" phase
(same kernel), see [2]. For a global view of the effect on a 1000 Hz
build, see [3].

I have not yet found any particular triggers or the specific
interaction between the sched tick and the kworker. I'm not sure how
this was introduced in v5. I would guess it could be a feedback loop
that I was concerned about initially.

I have more findings from v4, but this seems much more impactful.

[1] https://wwwpub.zih.tu-dresden.de/~tilsche/powernightmares/rjwv5_idle_300Hz.png
[2] https://wwwpub.zih.tu-dresden.de/~tilsche/powernightmares/rjwv5_idle_300Hz_ok.png
[3] https://wwwpub.zih.tu-dresden.de/~tilsche/powernightmares/rjwv5_idle_1000Hz.png

On 2018-03-15 22:59, Rafael J. Wysocki wrote:
> Hi All,
> 
> Thanks a lot for the feedback so far!
> 
> One more respin after the last batch of comments from Peter and Frederic.
> 
> The previous summary that still applies:
> 
> On Sunday, March 4, 2018 11:21:30 PM CET Rafael J. Wysocki wrote:
>>
>> The problem is that if we stop the sched tick in
>> tick_nohz_idle_enter() and then the idle governor predicts short idle
>> duration, we lose regardless of whether or not it is right.
>>
>> If it is right, we've lost already, because we stopped the tick
>> unnecessarily.  If it is not right, we'll lose going forward, because
>> the idle state selected by the governor is going to be too shallow and
>> we'll draw too much power (that has been reported recently to actually
>> happen often enough for people to care).
>>
>> This patch series is an attempt to improve the situation and the idea
>> here is to make the decision whether or not to stop the tick deeper in
>> the idle loop and in particular after running the idle state selection
>> in the path where the idle governor is invoked.  This way the problem
>> can be avoided, because the idle duration predicted by the idle governor
>> can be used to decide whether or not to stop the tick so that the tick
>> is only stopped if that value is large enough (and, consequently, the
>> idle state selected by the governor is deep enough).
>>
>> The series tries to avoid adding too much new code and instead reorders
>> the existing code and makes it more fine-grained.
>>
>> Patch 1 prepares the tick-sched code for the subsequent modifications and it
>> doesn't change the code's functionality (at least not intentionally).
>>
>> Patch 2 starts pushing the tick stopping decision deeper into the idle
>> loop, but that is limited to do_idle() and tick_nohz_irq_exit().
>>
>> Patch 3 makes cpuidle_idle_call() decide whether or not to stop the tick
>> and sets the stage for the subsequent changes.
>>
>> Patch 4 adds a bool pointer argument to cpuidle_select() and the ->select
>> governor callback allowing them to return a "nohz" hint on whether or not to
>> stop the tick to the caller.  It also adds code to decide what value to
>> return as "nohz" to the menu governor.
>>
>> Patch 5 reorders the idle state selection with respect to the stopping of
>> the tick and causes the additional "nohz" hint from cpuidle_select() to be
>> used for deciding whether or not to stop the tick.
>>
>> Patch 6 causes the menu governor to refine the state selection in case the
>> tick is not going to be stopped and the already selected state may not fit
>> before the next tick time.
>>
>> Patch 7 deals with the situation in which the tick was stopped previously,
>> but the idle governor still predicts short idle.
> 
> This series is complementary to the poll_idle() patch at
> 
> https://patchwork.kernel.org/patch/10282237/
> 
> Thanks,
> Rafael
> 
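
For readers following along, the ordering described in the quoted
summary (select the idle state first, then decide about the tick from
the predicted idle duration) could be sketched roughly as below. This
is a simplified user-space illustration, not the code from the series:
the struct, the function name, the residency table, and the choice of
the tick period as the threshold are all assumptions made for the
example, and the refinement from patch 6 is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the governor's result; not a kernel type. */
struct idle_choice {
	int state_index;	/* selected idle state, higher = deeper */
	bool stop_tick;		/* the "nohz" hint returned to the caller */
};

/*
 * Pick the deepest state whose target residency fits the predicted idle
 * duration, then derive the tick decision from the same prediction.
 * Assumes target_residency_us[] is sorted in ascending order.
 */
struct idle_choice select_idle(unsigned int predicted_us,
			       unsigned int tick_period_us,
			       const unsigned int *target_residency_us,
			       int nr_states)
{
	struct idle_choice c = { .state_index = 0, .stop_tick = false };
	int i;

	assert(nr_states > 0 && tick_period_us > 0);

	for (i = 0; i < nr_states; i++) {
		if (target_residency_us[i] <= predicted_us)
			c.state_index = i;
	}

	/* Assumed threshold: stop the tick only for at least one tick period. */
	c.stop_tick = predicted_us >= tick_period_us;

	return c;
}
```

With a made-up residency table of { 0, 20, 600 } us and a tick period
of 3333 us (300 Hz), a prediction of 100 us selects state 1 and keeps
the tick running, while a prediction of 5000 us selects state 2 and
stops the tick.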

-- 
Dipl. Inf. Thomas Ilsche
Computer Scientist
Highly Adaptive Energy-Efficient Computing
CRC 912 HAEC: http://tu-dresden.de/sfb912
Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
01062 Dresden, Germany

Phone: +49 351 463-42168
Fax: +49 351 463-37773
E-Mail: thomas.ilsche@tu-dresden.de

