* RE: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
@ 2018-03-07 17:04 Doug Smythies
  2018-03-07 22:11 ` Rafael J. Wysocki
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Doug Smythies @ 2018-03-07 17:04 UTC (permalink / raw)
  To: 'Rafael J. Wysocki'
  Cc: 'Thomas Gleixner', 'Frederic Weisbecker',
	'Paul McKenney', 'Thomas Ilsche',
	'Rik van Riel', 'Aubrey Li',
	'Mike Galbraith', 'LKML', 'Linux PM',
	'Peter Zijlstra'

On 2018.03.06 12:57 Rafael J. Wysocki wrote:

...[snip]...

> And the two paragraphs below still apply:

>> I have tested these patches on a couple of machines, including the very laptop
>> I'm sending them from, without any obvious issues, but please give them a go
>> if you can, especially if you have an easy way to reproduce the problem they
>> are targeting.  The patches are on top of 4.16-rc3 (if you need a git branch
>> with them for easier testing, please let me know).

Hi,

I am still having some boot troubles with V2. However, because my system
did eventually boot, seemingly O.K., I didn't re-boot a bunch of times for
further testing.

I ran my 100% load on one CPU test, which is for idle state 0 issues, on
my otherwise extremely idle test server. I never did have very good ways
to test issues with the other idle states (Thomas Ilsche's specialty).

During the test I got some messages (I also got some with the V1 patch set):

[16246.655148] rcu_preempt kthread starved for 60005 jiffies! g10557 c10556
			f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=5
[19556.565007] rcu_preempt kthread starved for 60003 jiffies! g12126 c12125
			f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=5
[20223.066251] clocksource: timekeeping watchdog on CPU4: Marking clocksource
			'tsc' as unstable because the skew is too large:
[20223.066260] clocksource:                       'hpet' wd_now: 6b02e6a0
			wd_last: c70685ef mask: ffffffff
[20223.066262] clocksource:                       'tsc' cs_now: 3ed0d6f109f5
			cs_last: 3e383b5c058d mask: ffffffffffffffff
[20223.066264] tsc: Marking TSC unstable due to clocksource watchdog
[26720.509156] rcu_preempt kthread starved for 60003 jiffies! g16640
			c16639 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=5
[29058.215330] rcu_preempt kthread starved for 60004 jiffies! g17522
			c17521 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=4
...

The other observation is that sometimes the number of IRQs (per turbostat)
jumps a lot. This did not occur with the V1 patch set. An increase in IRQs
is expected, but not that much.
Note: I am unable to show a correlation between the above log entries
and the jumps in IRQs.

Test results (some previous results restated):
Observed 24.44 watts average processor package power, with 100% load on CPU 7.
Test duration: 10 hours and 9 minutes. Peak power: 26.88 watts.
Reference: K4.16-rc3 + rjw V1 patch set: 24.77 watts. Peak power: 32.8 watts.
Reference: K4.16-rc3: 26.41 watts (short test, 3.53 hours).
Reference: K4.15-rc1: 27.34 watts.
Reference: K4.15-rc1, idle states 0-3 disabled: 23.92 watts.
Reference: K4.16-rc3 + rjw V1 patch set, idle states 0-3 disabled: ~23.65 watts.

References (Graphs):
http://fast.smythies.com/rjw416rc3v2_irq.png
http://fast.smythies.com/rjw416rc3v2_pwr.png

... Doug


* Re: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
  2018-03-07 17:04 [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework Doug Smythies
@ 2018-03-07 22:11 ` Rafael J. Wysocki
  2018-03-08  1:28 ` Doug Smythies
  2018-03-08 15:18 ` Doug Smythies
  2 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2018-03-07 22:11 UTC (permalink / raw)
  To: Doug Smythies
  Cc: Rafael J. Wysocki, Thomas Gleixner, Frederic Weisbecker,
	Paul McKenney, Thomas Ilsche, Rik van Riel, Aubrey Li,
	Mike Galbraith, LKML, Linux PM, Peter Zijlstra

On Wed, Mar 7, 2018 at 6:04 PM, Doug Smythies <dsmythies@telus.net> wrote:
> On 2018.03.06 12:57 Rafael J. Wysocki wrote:
>
> ...[snip]...
>
>> And the two paragraphs below still apply:
>
>>> I have tested these patches on a couple of machines, including the very laptop
>>> I'm sending them from, without any obvious issues, but please give them a go
>>> if you can, especially if you have an easy way to reproduce the problem they
>>> are targeting.  The patches are on top of 4.16-rc3 (if you need a git branch
>>> with them for easier testing, please let me know).
>
> Hi,
>
> I am still having some boot troubles with V2. However, because my system
> did eventually boot, seemingly O.K., I didn't re-boot a bunch of times for
> further testing.

OK, thanks for letting me know.

> I ran my 100% load on one CPU test, which is for idle state 0 issues, on
> my otherwise extremely idle test server. I never did have very good ways
> to test issues with the other idle states (Thomas Ilsche's specialty).
>
> During the test I got some messages (I also got some with the V1 patch set):
>
> [16246.655148] rcu_preempt kthread starved for 60005 jiffies! g10557 c10556
>                         f0x0 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=5
> [19556.565007] rcu_preempt kthread starved for 60003 jiffies! g12126 c12125
>                         f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=5
> [20223.066251] clocksource: timekeeping watchdog on CPU4: Marking clocksource
>                         'tsc' as unstable because the skew is too large:
> [20223.066260] clocksource:                       'hpet' wd_now: 6b02e6a0
>                         wd_last: c70685ef mask: ffffffff
> [20223.066262] clocksource:                       'tsc' cs_now: 3ed0d6f109f5
>                         cs_last: 3e383b5c058d mask: ffffffffffffffff
> [20223.066264] tsc: Marking TSC unstable due to clocksource watchdog
> [26720.509156] rcu_preempt kthread starved for 60003 jiffies! g16640
>                         c16639 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=5
> [29058.215330] rcu_preempt kthread starved for 60004 jiffies! g17522
>                         c17521 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=4
> ...

Can you please check if that's reproducible with just the first three
patches in the series applied?

> The other observation is that sometimes the number of IRQs (per turbostat)
> jumps a lot. This did not occur with the V1 patch set. An increase in IRQs
> is expected, but not that much.

Right.

> Note: I am unable to show a correlation between the above log entries
> and the jumps in IRQs.

Thanks,
Rafael


* RE: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
  2018-03-07 17:04 [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework Doug Smythies
  2018-03-07 22:11 ` Rafael J. Wysocki
@ 2018-03-08  1:28 ` Doug Smythies
  2018-03-08 15:18 ` Doug Smythies
  2 siblings, 0 replies; 12+ messages in thread
From: Doug Smythies @ 2018-03-08  1:28 UTC (permalink / raw)
  To: 'Rafael J. Wysocki'
  Cc: 'Rafael J. Wysocki', 'Thomas Gleixner',
	'Frederic Weisbecker', 'Paul McKenney',
	'Thomas Ilsche', 'Rik van Riel',
	'Aubrey Li', 'Mike Galbraith', 'LKML',
	'Linux PM', 'Peter Zijlstra',
	Doug Smythies

On 2018.03.07 14:12 Rafael J. Wysocki wrote:
> On Wed, Mar 7, 2018 at 6:04 PM, Doug Smythies <dsmythies@telus.net> wrote:
>> On 2018.03.06 12:57 Rafael J. Wysocki wrote:
>
>> During the test I got some messages (I also got some with the V1 patch set):

> Can you please check if that's reproducible with just the first three
> patches in the series applied?

I will.

>> The other observation is that sometimes the number of IRQs (per turbostat)
>> jumps a lot. This did not occur with the V1 patch set. An increase in IRQs
>> is expected, but not that much.

> Right.

I did a trace for 1 hour, and got about 12 occurrences of very high IRQs.
I was able to correlate the trace results with high IRQs per turbostat
sampling time (1 minute).

The extreme jumps in IRQs are due to CPUs wanting to be in idle state 4
(the deepest for my older i7 processor, C6) while the tick is not stopping
(or so I am guessing): they exit idle state 4 every millisecond
(I am using a 1000 Hz kernel), and then pretty much immediately go
back into idle state 4.
When this occurs, it seems to persist for a very long time, but it does
seem to eventually sort itself out.

Example:
       5   2.0 [006] d..1   780.447005: cpu_idle: state=4 cpu_id=6
     993   2.0 [006] d..1   780.447999: cpu_idle: state=4294967295 cpu_id=6
       6   2.0 [006] d..1   780.448006: cpu_idle: state=4 cpu_id=6
     992   2.0 [006] d..1   780.448999: cpu_idle: state=4294967295 cpu_id=6
       6   2.0 [006] d..1   780.449005: cpu_idle: state=4 cpu_id=6

Where:
column 1 is the time difference in microseconds from the previous sample;
column 2 is the elapsed time of the test in minutes (for ease of correlating
	with the other once-per-minute data).
Other columns are unmodified from the raw trace data, but this file contains
only CPU 6 and only idle entry/exit events.

And the same area from the raw trace file:

<idle>-0     [006] d..1   780.447005: cpu_idle: state=4 cpu_id=6
 test1-2664  [007] d.h.   780.447999: local_timer_entry: vector=237
<idle>-0     [006] d..1   780.447999: cpu_idle: state=4294967295 cpu_id=6
 test1-2664  [007] d.h.   780.447999: hrtimer_expire_entry: hrtimer=00000000ea612c0e function=tick_sched_timer now=780435000915
<idle>-0     [006] d.h1   780.448000: local_timer_entry: vector=237
<idle>-0     [006] d.h1   780.448001: hrtimer_expire_entry: hrtimer=000000004b84020f function=tick_sched_timer now=780435002540
<idle>-0     [006] d.h1   780.448002: hrtimer_expire_exit: hrtimer=000000004b84020f
 test1-2664  [007] d.h.   780.448003: hrtimer_expire_exit: hrtimer=00000000ea612c0e
<idle>-0     [006] d.h1   780.448003: local_timer_exit: vector=237
 test1-2664  [007] d.h.   780.448003: local_timer_exit: vector=237
<idle>-0     [006] d..1   780.448006: cpu_idle: state=4 cpu_id=6
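
A minimal, self-contained sketch of a filter for spotting this pattern in
a trace like the one above (the expected line format and the ~1000 us gap
threshold are assumptions taken from this message, not from any posted tool):

#include <stdio.h>
#include <string.h>

/*
 * Scan cpu_idle trace events (one per line, single CPU assumed, as in
 * the filtered file above) and flag idle exits that arrive at roughly
 * the tick period (~1000 us on a 1000 Hz kernel), which suggests the
 * tick was left running while the CPU tried to stay in a deep state.
 */
int main(void)
{
	char line[256];
	char *p;
	double t, prev_exit = -1.0;
	unsigned int state, cpu;

	while (fgets(line, sizeof(line), stdin)) {
		p = strstr(line, "cpu_idle: state=");
		if (!p || sscanf(p, "cpu_idle: state=%u cpu_id=%u", &state, &cpu) != 2)
			continue;
		/* the timestamp is the number right before the first ':' */
		p = strchr(line, ']');
		if (!p || sscanf(p + 1, " %*s %lf:", &t) != 1)
			continue;
		if (state == 4294967295u) {	/* idle exit marker */
			if (prev_exit >= 0.0) {
				double gap_us = (t - prev_exit) * 1e6;

				if (gap_us > 900.0 && gap_us < 1100.0)
					printf("tick-period exit at %.6f (gap %.0f us)\n",
					       t, gap_us);
			}
			prev_exit = t;
		}
	}
	return 0;
}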

... Doug


* RE: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
  2018-03-07 17:04 [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework Doug Smythies
  2018-03-07 22:11 ` Rafael J. Wysocki
  2018-03-08  1:28 ` Doug Smythies
@ 2018-03-08 15:18 ` Doug Smythies
  2018-03-08 16:16   ` Rik van Riel
  2018-03-08 16:37   ` Rafael J. Wysocki
  2 siblings, 2 replies; 12+ messages in thread
From: Doug Smythies @ 2018-03-08 15:18 UTC (permalink / raw)
  To: 'Rafael J. Wysocki'
  Cc: 'Rafael J. Wysocki', 'Thomas Gleixner',
	'Frederic Weisbecker', 'Paul McKenney',
	'Thomas Ilsche', 'Rik van Riel',
	'Aubrey Li', 'Mike Galbraith', 'LKML',
	'Linux PM', 'Peter Zijlstra',
	'Doug Smythies'

On 2018.03.07 17:29 Doug wrote:
> On 2018.03.07 14:12 Rafael J. Wysocki wrote:
>> On Wed, Mar 7, 2018 at 6:04 PM, Doug Smythies <dsmythies@telus.net> wrote:
>>> On 2018.03.06 12:57 Rafael J. Wysocki wrote:
>>
>>> During the test I got some messages (I also got some with the V1 patch set):
>
>> Can you please check if that's reproducible with just the first three
>> patches in the series applied?
>
> I will.

No issues. Test duration: 7 hours and 26 minutes.
Boot seems normal. Re-booted 4 times.

... Doug


* Re: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
  2018-03-08 15:18 ` Doug Smythies
@ 2018-03-08 16:16   ` Rik van Riel
  2018-03-08 16:36     ` Rafael J. Wysocki
  2018-03-08 16:37   ` Rafael J. Wysocki
  1 sibling, 1 reply; 12+ messages in thread
From: Rik van Riel @ 2018-03-08 16:16 UTC (permalink / raw)
  To: Doug Smythies, 'Rafael J. Wysocki'
  Cc: 'Rafael J. Wysocki', 'Thomas Gleixner',
	'Frederic Weisbecker', 'Paul McKenney',
	'Thomas Ilsche', 'Aubrey Li',
	'Mike Galbraith', 'LKML', 'Linux PM',
	'Peter Zijlstra'


On Thu, 2018-03-08 at 07:18 -0800, Doug Smythies wrote:
> On 2018.03.07 17:29 Doug wrote:
> > On 2018.03.07 14:12 Rafael J. Wysocki wrote:
> > > On Wed, Mar 7, 2018 at 6:04 PM, Doug Smythies <dsmythies@telus.net> wrote:
> > > > On 2018.03.06 12:57 Rafael J. Wysocki wrote:
> > > > During the test I got some messages (I also got some with the
> > > > V1 patch set):
> > > Can you please check if that's reproducible with just the first
> > > three patches in the series applied?
> > 
> > I will.
> 
> No issues. Test duration: 7 hours and 26 minutes.
> Boot seems normal. Re-booted 4 times.

I am observing the same issues here.

I'll comb through the code to see if I can
spot the issue :)

-- 
All Rights Reversed.



* Re: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
  2018-03-08 16:16   ` Rik van Riel
@ 2018-03-08 16:36     ` Rafael J. Wysocki
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2018-03-08 16:36 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Doug Smythies, Rafael J. Wysocki, Rafael J. Wysocki,
	Thomas Gleixner, Frederic Weisbecker, Paul McKenney,
	Thomas Ilsche, Aubrey Li, Mike Galbraith, LKML, Linux PM,
	Peter Zijlstra

On Thu, Mar 8, 2018 at 5:16 PM, Rik van Riel <riel@surriel.com> wrote:
> On Thu, 2018-03-08 at 07:18 -0800, Doug Smythies wrote:
>> On 2018.03.07 17:29 Doug wrote:
>> > On 2018.03.07 14:12 Rafael J. Wysocki wrote:
>> > > On Wed, Mar 7, 2018 at 6:04 PM, Doug Smythies <dsmythies@telus.net> wrote:
>> > > > On 2018.03.06 12:57 Rafael J. Wysocki wrote:
>> > > > During the test I got some messages (I also got some with the
>> > > > V1 patch set):
>> > > Can you please check if that's reproducible with just the first
>> > > three patches in the series applied?
>> >
>> > I will.
>>
>> No issues. Test duration: 7 hours and 26 minutes.
>> Boot seems normal. Re-booted 4 times.
>
> I am observing the same issues here.
>
> I'll comb through the code to see if I can
> spot the issue :)

I found a problem that may be causing this.

You can very well wait for a respin of the series; it should be out tomorrow. :-)


* Re: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
  2018-03-08 15:18 ` Doug Smythies
  2018-03-08 16:16   ` Rik van Riel
@ 2018-03-08 16:37   ` Rafael J. Wysocki
  1 sibling, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2018-03-08 16:37 UTC (permalink / raw)
  To: Doug Smythies
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Thomas Gleixner,
	Frederic Weisbecker, Paul McKenney, Thomas Ilsche, Rik van Riel,
	Aubrey Li, Mike Galbraith, LKML, Linux PM, Peter Zijlstra

On Thu, Mar 8, 2018 at 4:18 PM, Doug Smythies <dsmythies@telus.net> wrote:
> On 2018.03.07 17:29 Doug wrote:
>> On 2018.03.07 14:12 Rafael J. Wysocki wrote:
>>> On Wed, Mar 7, 2018 at 6:04 PM, Doug Smythies <dsmythies@telus.net> wrote:
>>>> On 2018.03.06 12:57 Rafael J. Wysocki wrote:
>>>
>>>> During the test I got some messages (I also got some with the V1 patch set):
>>
>>> Can you please check if that's reporoducible with just the first three
>>> patches in the series applied?
>>
>> I will.
>
> No issues. Test duration: 7 hours and 26 minutes.
> Boot seems normal. Re-booted 4 times.

Cool, thanks!

This means that the other patches introduce the problem, most likely.


* Re: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
  2018-03-08 13:40     ` Mike Galbraith
@ 2018-03-09  9:58       ` Rafael J. Wysocki
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2018-03-09  9:58 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Peter Zijlstra, Linux PM,
	Thomas Gleixner, Frederic Weisbecker, Paul McKenney,
	Thomas Ilsche, Doug Smythies, Rik van Riel, Aubrey Li, LKML

On Thu, Mar 8, 2018 at 2:40 PM, Mike Galbraith <mgalbraith@suse.de> wrote:
> On Thu, 2018-03-08 at 12:10 +0100, Rafael J. Wysocki wrote:
>> On Thu, Mar 8, 2018 at 11:31 AM, Mike Galbraith <mgalbraith@suse.de> wrote:
>> >                               1     2     3
>> > 4.16.0.g1b88acc-master     6.95  7.03  6.91 (virgin)
>> > 4.16.0.g1b88acc-master     7.20  7.25  7.26 (+v2)
>> > 4.16.0.g1b88acc-master     6.90  7.06  6.95 (+local)
>> >
>> > Why would v2 charge the light firefox load a small but consistent fee?
>>
>> Two effects may come into play here I think.
>>
>> One is that allowing the tick to run biases the menu governor's
>> predictions towards the lower end, so we may use shallow states more
>> as a result then (Peter was talking about that).
>
> Hm, I'd expect that to show up in +local as well then, as it keeps the
> tick running when avg_idle < sched_migration_cost (convenient magic
> number), but the firefox load runs at the same wattage as virgin.  I'm
> also doing this...
>
> --- a/drivers/cpuidle/governors/menu.c
> +++ b/drivers/cpuidle/governors/menu.c
> @@ -335,7 +335,7 @@ static int menu_select(struct cpuidle_dr
>                  * C1's exit latency exceeds the user configured limit.
>                  */
>                 polling_threshold = max_t(unsigned int, 20, s->target_residency);
> -               if (data->next_timer_us > polling_threshold &&
> +               if (expected_interval > polling_threshold &&
>                     latency_req > s->exit_latency && !s->disabled &&
>                     !dev->states_usage[1].disable)
>                         first_idx = 1;
>
> ...to help out high frequency cross core throughput, but the firefox
> load apparently doesn't tickle that, as significant polling would
> surely show in the wattage.

OK, so the second reason sounds more likely to me.

Anyway, please retest with the v3 I've just posted.  The previous
iteration had a rather serious issue that might very well influence
the results (it was using stale values sometimes).


* Re: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
  2018-03-08 11:10   ` Rafael J. Wysocki
@ 2018-03-08 13:40     ` Mike Galbraith
  2018-03-09  9:58       ` Rafael J. Wysocki
  0 siblings, 1 reply; 12+ messages in thread
From: Mike Galbraith @ 2018-03-08 13:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Peter Zijlstra, Linux PM, Thomas Gleixner,
	Frederic Weisbecker, Paul McKenney, Thomas Ilsche, Doug Smythies,
	Rik van Riel, Aubrey Li, LKML

On Thu, 2018-03-08 at 12:10 +0100, Rafael J. Wysocki wrote:
> On Thu, Mar 8, 2018 at 11:31 AM, Mike Galbraith <mgalbraith@suse.de> wrote:
> >                               1     2     3
> > 4.16.0.g1b88acc-master     6.95  7.03  6.91 (virgin)
> > 4.16.0.g1b88acc-master     7.20  7.25  7.26 (+v2)
> > 4.16.0.g1b88acc-master     6.90  7.06  6.95 (+local)
> >
> > Why would v2 charge the light firefox load a small but consistent fee?
> 
> Two effects may come into play here I think.
> 
> One is that allowing the tick to run biases the menu governor's
> predictions towards the lower end, so we may use shallow states more
> as a result then (Peter was talking about that).

Hm, I'd expect that to show up in +local as well then, as it keeps the
tick running when avg_idle < sched_migration_cost (convenient magic
number), but the firefox load runs at the same wattage as virgin.  I'm
also doing this...

--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -335,7 +335,7 @@ static int menu_select(struct cpuidle_dr
                 * C1's exit latency exceeds the user configured limit.
                 */
                polling_threshold = max_t(unsigned int, 20, s->target_residency);
-               if (data->next_timer_us > polling_threshold &&
+               if (expected_interval > polling_threshold &&
                    latency_req > s->exit_latency && !s->disabled &&
                    !dev->states_usage[1].disable)
                        first_idx = 1;

...to help out high frequency cross core throughput, but the firefox
load apparently doesn't tickle that, as significant polling would
surely show in the wattage.
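
For reference, the "+local" gate mentioned above amounts to roughly the
kernel-style pseudocode below. This is a paraphrase of the description,
not the actual local patch, and the function name is only illustrative.

/*
 * Keep the tick running when the CPU has been waking up frequently,
 * i.e. its average idle time is below the sched_migration_cost
 * "convenient magic number", on the theory that another wakeup is
 * imminent anyway and stopping the tick would be wasted work.
 */
static bool local_keep_tick(struct rq *rq)
{
	return rq->avg_idle < sysctl_sched_migration_cost;
}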

	-Mike


* Re: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
  2018-03-08 10:31 ` Mike Galbraith
@ 2018-03-08 11:10   ` Rafael J. Wysocki
  2018-03-08 13:40     ` Mike Galbraith
  0 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2018-03-08 11:10 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Rafael J. Wysocki, Peter Zijlstra, Linux PM, Thomas Gleixner,
	Frederic Weisbecker, Paul McKenney, Thomas Ilsche, Doug Smythies,
	Rik van Riel, Aubrey Li, LKML

On Thu, Mar 8, 2018 at 11:31 AM, Mike Galbraith <mgalbraith@suse.de> wrote:
> On Tue, 2018-03-06 at 09:57 +0100, Rafael J. Wysocki wrote:
>> Hi All,
>
> Greetings,

Hi,

>> Thanks a lot for the discussion so far!
>>
>> Here's a new version of the series addressing some comments from the
>> discussion and (most importantly) replacing patches 4 and 5 with another
>> (simpler) patch.
>
> Oddity: these patches seemingly manage to cost a bit of power when
> lightly loaded.  (but didn't cut cross core nohz cost much.. darn)
>
> i4790 booted nopti nospectre_v2
>
> 30 sec tbench
> 4.16.0.g1b88acc-master (virgin)
> Throughput 559.279 MB/sec  1 clients  1 procs  max_latency=0.046 ms
> Throughput 997.119 MB/sec  2 clients  2 procs  max_latency=0.246 ms
> Throughput 1693.04 MB/sec  4 clients  4 procs  max_latency=4.309 ms
> Throughput 3597.2 MB/sec  8 clients  8 procs  max_latency=6.760 ms
> Throughput 3474.55 MB/sec  16 clients  16 procs  max_latency=6.743 ms
>
> 4.16.0.g1b88acc-master (+v2)
> Throughput 588.929 MB/sec  1 clients  1 procs  max_latency=0.291 ms
> Throughput 1080.93 MB/sec  2 clients  2 procs  max_latency=0.639 ms
> Throughput 1826.3 MB/sec  4 clients  4 procs  max_latency=0.647 ms
> Throughput 3561.01 MB/sec  8 clients  8 procs  max_latency=1.279 ms
> Throughput 3382.98 MB/sec  16 clients  16 procs  max_latency=4.817 ms

max_latency is much lower here for >2 clients/procs, but at the same
time it is much higher for <=2 clients/procs (which then may be
related to the somewhat higher throughput).  Interesting.

> 4.16.0.g1b88acc-master (+local nohz mitigation etc for reference [1])
> Throughput 722.559 MB/sec  1 clients  1 procs  max_latency=0.087 ms
> Throughput 1208.59 MB/sec  2 clients  2 procs  max_latency=0.289 ms
> Throughput 2071.94 MB/sec  4 clients  4 procs  max_latency=0.654 ms
> Throughput 3784.91 MB/sec  8 clients  8 procs  max_latency=0.974 ms
> Throughput 3644.4 MB/sec  16 clients  16 procs  max_latency=5.620 ms
>
> turbostat -q -- firefox /root/tmp/video/BigBuckBunny-DivXPlusHD.mkv & sleep 300;killall firefox
>
>                         PkgWatt
>                               1     2     3
> 4.16.0.g1b88acc-master     6.95  7.03  6.91 (virgin)
> 4.16.0.g1b88acc-master     7.20  7.25  7.26 (+v2)
> 4.16.0.g1b88acc-master     6.90  7.06  6.95 (+local)
>
> Why would v2 charge the light firefox load a small but consistent fee?

Two effects may come into play here I think.

One is that allowing the tick to run biases the menu governor's
predictions towards the lower end, so we may use shallow states more
as a result then (Peter was talking about that).

The second one may be that intermediate states are used quite a bit
"by nature" in this workload (that should be quite straightforward to
verify) and stopping the tick for them saves some energy on idle
entry/exit.
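
A toy, runnable illustration of that first effect (the numbers are
invented): with the tick left running, no observed idle interval can
exceed one tick, so the data feeding the governor's prediction, and
hence the prediction itself, is pulled down toward the tick length.

#include <stdio.h>

#define TICK_US	1000	/* 1000 Hz kernel */

int main(void)
{
	/* hypothetical true wakeup gaps (us) on a lightly loaded CPU */
	unsigned int gaps[] = { 5000, 20000, 3000, 50000, 8000 };
	unsigned int i, n = sizeof(gaps) / sizeof(gaps[0]);
	unsigned long sum_true = 0, sum_seen = 0;

	for (i = 0; i < n; i++) {
		sum_true += gaps[i];
		/* with the tick running, each observed interval is clipped */
		sum_seen += gaps[i] < TICK_US ? gaps[i] : TICK_US;
	}
	printf("mean true gap: %lu us, mean observed gap: %lu us\n",
	       sum_true / n, sum_seen / n);
	return 0;
}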

Thanks!


* Re: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
  2018-03-06  8:57 Rafael J. Wysocki
@ 2018-03-08 10:31 ` Mike Galbraith
  2018-03-08 11:10   ` Rafael J. Wysocki
  0 siblings, 1 reply; 12+ messages in thread
From: Mike Galbraith @ 2018-03-08 10:31 UTC (permalink / raw)
  To: Rafael J. Wysocki, Peter Zijlstra, Linux PM
  Cc: Thomas Gleixner, Frederic Weisbecker, Paul McKenney,
	Thomas Ilsche, Doug Smythies, Rik van Riel, Aubrey Li, LKML

On Tue, 2018-03-06 at 09:57 +0100, Rafael J. Wysocki wrote:
> Hi All,

Greetings,

> Thanks a lot for the discussion so far!
> 
> Here's a new version of the series addressing some comments from the
> discussion and (most importantly) replacing patches 4 and 5 with another
> (simpler) patch.

Oddity: these patches seemingly manage to cost a bit of power when
lightly loaded.  (but didn't cut cross core nohz cost much.. darn)

i4790 booted nopti nospectre_v2

30 sec tbench
4.16.0.g1b88acc-master (virgin)
Throughput 559.279 MB/sec  1 clients  1 procs  max_latency=0.046 ms
Throughput 997.119 MB/sec  2 clients  2 procs  max_latency=0.246 ms
Throughput 1693.04 MB/sec  4 clients  4 procs  max_latency=4.309 ms
Throughput 3597.2 MB/sec  8 clients  8 procs  max_latency=6.760 ms
Throughput 3474.55 MB/sec  16 clients  16 procs  max_latency=6.743 ms

4.16.0.g1b88acc-master (+v2)
Throughput 588.929 MB/sec  1 clients  1 procs  max_latency=0.291 ms
Throughput 1080.93 MB/sec  2 clients  2 procs  max_latency=0.639 ms
Throughput 1826.3 MB/sec  4 clients  4 procs  max_latency=0.647 ms
Throughput 3561.01 MB/sec  8 clients  8 procs  max_latency=1.279 ms
Throughput 3382.98 MB/sec  16 clients  16 procs  max_latency=4.817 ms

4.16.0.g1b88acc-master (+local nohz mitigation etc for reference [1])
Throughput 722.559 MB/sec  1 clients  1 procs  max_latency=0.087 ms
Throughput 1208.59 MB/sec  2 clients  2 procs  max_latency=0.289 ms
Throughput 2071.94 MB/sec  4 clients  4 procs  max_latency=0.654 ms
Throughput 3784.91 MB/sec  8 clients  8 procs  max_latency=0.974 ms
Throughput 3644.4 MB/sec  16 clients  16 procs  max_latency=5.620 ms

turbostat -q -- firefox /root/tmp/video/BigBuckBunny-DivXPlusHD.mkv & sleep 300;killall firefox

                        PkgWatt
                              1     2     3
4.16.0.g1b88acc-master     6.95  7.03  6.91 (virgin)
4.16.0.g1b88acc-master     7.20  7.25  7.26 (+v2)
4.16.0.g1b88acc-master     6.90  7.06  6.95 (+local)

Why would v2 charge the light firefox load a small but consistent fee?

	-Mike

1. see low end, that's largely due to nohz throttling


* [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework
@ 2018-03-06  8:57 Rafael J. Wysocki
  2018-03-08 10:31 ` Mike Galbraith
  0 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2018-03-06  8:57 UTC (permalink / raw)
  To: Peter Zijlstra, Linux PM
  Cc: Thomas Gleixner, Frederic Weisbecker, Paul McKenney,
	Thomas Ilsche, Doug Smythies, Rik van Riel, Aubrey Li,
	Mike Galbraith, LKML

Hi All,

Thanks a lot for the discussion so far!

Here's a new version of the series addressing some comments from the
discussion and (most importantly) replacing patches 4 and 5 with another
(simpler) patch.

The summary below still applies:

On Sunday, March 4, 2018 11:21:30 PM CET Rafael J. Wysocki wrote:
> 
> The problem is that if we stop the sched tick in
> tick_nohz_idle_enter() and then the idle governor predicts short idle
> duration, we lose regardless of whether or not it is right.
> 
> If it is right, we've lost already, because we stopped the tick
> unnecessarily.  If it is not right, we'll lose going forward, because
> the idle state selected by the governor is going to be too shallow and
> we'll draw too much power (that has been reported recently to actually
> happen often enough for people to care).
> 
> This patch series is an attempt to improve the situation and the idea
> here is to make the decision whether or not to stop the tick deeper in
> the idle loop and in particular after running the idle state selection
> in the path where the idle governor is invoked.  This way the problem
> can be avoided, because the idle duration predicted by the idle governor
> can be used to decide whether or not to stop the tick so that the tick
> is only stopped if that value is large enough (and, consequently, the
> idle state selected by the governor is deep enough).
> 
> The series tries to avoid adding too much new code, preferring instead to
> reorder the existing code and make it more fine-grained.
> 
> Patch 1 prepares the tick-sched code for the subsequent modifications and it
> doesn't change the code's functionality (at least not intentionally).
> 
> Patch 2 starts pushing the tick stopping decision deeper into the idle
> loop, but it is limited to do_idle() and tick_nohz_irq_exit().
> 
> Patch 3 makes cpuidle_idle_call() decide whether or not to stop the tick
> and sets the stage for the changes in patch 6.

Patch 4 adds a bool pointer argument to cpuidle_select() and the ->select
governor callback allowing them to return a "nohz" hint on whether or not to
stop the tick to the caller.

Patch 5 reorders the idle state selection with respect to the stopping of the
tick and causes the additional "nohz" hint from cpuidle_select() to be used
for deciding whether or not to stop the tick.

Patch 6 cleans up the code to avoid running one piece of it twice in a row
in some cases.
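
Putting patches 3-5 together, the reordered flow amounts to roughly the
sketch below. This is a paraphrase of the cover letter, not the actual
kernel code: the bool pointer on cpuidle_select() follows the patch 4
description above, while the tick helpers are placeholder names.

/*
 * Illustrative sketch of the reordered idle path.
 */
static void idle_path_sketch(struct cpuidle_driver *drv,
			     struct cpuidle_device *dev)
{
	bool stop_tick = true;
	int state;

	/* Run the governor BEFORE deciding what to do about the tick. */
	state = cpuidle_select(drv, dev, &stop_tick);	/* "nohz" hint */

	if (stop_tick)
		stop_the_tick();	/* predicted idle duration is long enough */
	else
		keep_the_tick();	/* shallow state: the tick stays as a backstop */

	cpuidle_enter(drv, dev, state);
}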

And the two paragraphs below still apply:

> I have tested these patches on a couple of machines, including the very laptop
> I'm sending them from, without any obvious issues, but please give them a go
> if you can, especially if you have an easy way to reproduce the problem they
> are targeting.  The patches are on top of 4.16-rc3 (if you need a git branch
> with them for easier testing, please let me know).
> 
> The above said, this is just RFC, so no pets close to the machines running it,
> please, and I'm kind of expecting Peter and Thomas to tear it into pieces. :-)

Thanks,
Rafael


Thread overview: 12+ messages
2018-03-07 17:04 [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework Doug Smythies
2018-03-07 22:11 ` Rafael J. Wysocki
2018-03-08  1:28 ` Doug Smythies
2018-03-08 15:18 ` Doug Smythies
2018-03-08 16:16   ` Rik van Riel
2018-03-08 16:36     ` Rafael J. Wysocki
2018-03-08 16:37   ` Rafael J. Wysocki
  -- strict thread matches above, loose matches on Subject: below --
2018-03-06  8:57 Rafael J. Wysocki
2018-03-08 10:31 ` Mike Galbraith
2018-03-08 11:10   ` Rafael J. Wysocki
2018-03-08 13:40     ` Mike Galbraith
2018-03-09  9:58       ` Rafael J. Wysocki
