linux-kernel.vger.kernel.org archive mirror
* RE: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
@ 2018-03-25  0:28 Doug Smythies
  2018-03-25 11:53 ` Rafael J. Wysocki
  2018-03-26  6:01 ` Doug Smythies
  0 siblings, 2 replies; 22+ messages in thread
From: Doug Smythies @ 2018-03-25  0:28 UTC (permalink / raw)
  To: 'Rafael J. Wysocki'
  Cc: 'Peter Zijlstra', 'Frederic Weisbecker',
	'Thomas Gleixner', 'Paul McKenney',
	'Thomas Ilsche', 'Rik van Riel',
	'Aubrey Li', 'Mike Galbraith', 'LKML',
	'Linux PM',
	Doug Smythies

On 2018.03.14 07:04 Rafael J. Wysocki wrote:

> If poll_idle() is allowed to spin until need_resched() returns 'true',
> it may actually spin for a much longer time than expected by the idle
> governor, since set_tsk_need_resched() is not always called by the
> timer interrupt handler.  If that happens, the CPU may spend much
> more time than anticipated in the "polling" state.
>
> To prevent that from happening, limit the time of the spinning loop
> in poll_idle().

...[snip]...

> +#define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)

The other ongoing threads on this aside, potentially, there might
be another issue.

What if the next available idle state, after 0, has a residency
that is greater than TICK_NSEC / 16? Meaning these numbers, for example:

/sys/devices/system/cpu/cpu0/cpuidle/state*/residency

The suggestion is that upon a timeout exit from idle state 0,
the measured_us should maybe be rejected, because the statistics
are being biased and it doesn't seem to correct itself.

Up to 1300% (<- not a typo) extra power consumption has been observed.

Supporting experimental data:

My processor: 
/sys/devices/system/cpu/cpu0/cpuidle/state0/residency:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/residency:2
/sys/devices/system/cpu/cpu0/cpuidle/state2/residency:20
/sys/devices/system/cpu/cpu0/cpuidle/state3/residency:211 <<< Important
/sys/devices/system/cpu/cpu0/cpuidle/state4/residency:345

A 1000 Hz kernel (TICK_NSEC/16) = 62.5 nsec; idle system:

Idle state 0 time: Typically 0 uSec.
Processor package power: 3.7 watts (steady)

Now, disable idle states 1 and 2:

Idle state 0 time (all 8 CPUs): ~~ 430 Seconds / minute
Processor package power: ~52 watts (1300% more power, 14X)

A 250 Hz kernel (TICK_NSEC/16) = 250 nSec; idle system:

Idle state 0 time: Typically < 1 mSec / minute
Processor package power: 3.7 to 3.8 watts

Now, disable idle states 1 and 2:

Idle state 0 time (all 8 CPUs): Typically 0 to 70 mSecs / minute
Processor package power: 3.7 to 3.8 watts

A 1000 Hz kernel with:

+#define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 4)

Note: Just for a test. I am not suggesting this should change.

instead. i.e. (TICK_NSEC/4) = 250 nSec.

Idle state 0 time: Typically 0 uSec.
Processor package power: 3.7 watts (steady)

Now, disable idle states 1 and 2:

Idle state 0 time (all 8 CPUs): Typically 0 to 70 mSecs / minute
Processor package power: ~3.8 watts

Note 1: My example is contrived via disabling idle states, so
I don't know if it actually needs to be worried about.

Note 2: I do not know if there is some processor where
cpuidle/state1/residency is > 62.5 nSec.

Note 3: I am trying to figure out a way to test rejecting
measured_us upon timeout exit, but haven't made much progress.
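
For the record, the kind of change I have in mind is roughly the
following (a hypothetical sketch only, not actual menu governor code;
struct governor_data and record_interval() are made-up names):

/*
 * Hypothetical sketch, not the menu governor's actual code: on a
 * timeout exit from idle state 0 the measured interval says nothing
 * about when the next wakeup would really have arrived, so skip the
 * statistics update for that sample (at the cost of losing the
 * information that the CPU could have stayed idle longer).
 */
static void update_idle_stats(struct governor_data *data,
			      unsigned int measured_us, bool timed_out)
{
	if (timed_out)
		return;		/* reject the biased sample */

	record_interval(data, measured_us);	/* assumed helper */
}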

... Doug


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-25  0:28 [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle() Doug Smythies
@ 2018-03-25 11:53 ` Rafael J. Wysocki
  2018-03-25 21:41   ` Rafael J. Wysocki
  2018-03-26  6:01 ` Doug Smythies
  1 sibling, 1 reply; 22+ messages in thread
From: Rafael J. Wysocki @ 2018-03-25 11:53 UTC (permalink / raw)
  To: Doug Smythies
  Cc: Rafael J. Wysocki, Peter Zijlstra, Frederic Weisbecker,
	Thomas Gleixner, Paul McKenney, Thomas Ilsche, Rik van Riel,
	Aubrey Li, Mike Galbraith, LKML, Linux PM

On Sun, Mar 25, 2018 at 1:28 AM, Doug Smythies <dsmythies@telus.net> wrote:
> On 2018.03.14 07:04 Rafael J. Wysocki wrote:
>
>> If poll_idle() is allowed to spin until need_resched() returns 'true',
>> it may actually spin for a much longer time than expected by the idle
>> governor, since set_tsk_need_resched() is not always called by the
>> timer interrupt handler.  If that happens, the CPU may spend much
>> more time than anticipated in the "polling" state.
>>
>> To prevent that from happening, limit the time of the spinning loop
>> in poll_idle().
>
> ...[snip]...
>
>> +#define POLL_IDLE_TIME_LIMIT (TICK_NSEC / 16)
>
> The other ongoing threads on this aside, potentially, there might
> be another issue.
>
> What if the next available idle state, after 0, has a residency
> that is greater than TICK_NSEC / 16? Meaning these numbers, for example:
>
> /sys/devices/system/cpu/cpu0/cpuidle/state*/residency
>
> The suggestion is that upon a timeout exit from idle state 0,
> the measured_us should maybe be rejected, because the statistics
> are being biased and it doesn't seem to correct itself.

OK

> Up to 1300% (<- not a typo) extra power consumption has been observed.
>
> Supporting experimental data:
>
> My processor:
> /sys/devices/system/cpu/cpu0/cpuidle/state0/residency:0
> /sys/devices/system/cpu/cpu0/cpuidle/state1/residency:2
> /sys/devices/system/cpu/cpu0/cpuidle/state2/residency:20
> /sys/devices/system/cpu/cpu0/cpuidle/state3/residency:211 <<< Important
> /sys/devices/system/cpu/cpu0/cpuidle/state4/residency:345
>
> A 1000 Hz kernel (TICK_NSEC/16) = 62.5 nsec; idle system:

nsec or usec?

> Idle state 0 time: Typically 0 uSec.
> Processor package power: 3.7 watts (steady)
>
> Now, disable idle states 1 and 2:
>
> Idle state 0 time (all 8 CPUs): ~~ 430 Seconds / minute
> Processor package power: ~52 watts (1300% more power, 14X)

But that's because you have disabled states 1 and 2, isn't it?

> A 250 Hz kernel (TICK_NSEC/16) = 250 nSec; idle system:
>
> Idle state 0 time: Typically < 1 mSec / minute
> Processor package power: 3.7 to 3.8 watts
>
> Now, disable idle states 1 and 2:
>
> Idle state 0 time (all 8 CPUs): Typically 0 to 70 mSecs / minute
> Processor package power: 3.7 to 3.8 watts
>
> A 1000 Hz kernel with:
>
> +#define POLL_IDLE_TIME_LIMIT   (TICK_NSEC / 4)
>
> Note: Just for a test. I am not suggesting this should change.
>
> instead. i.e. (TICK_NSEC/4) = 250 nSec.
>
> Idle state 0 time: Typically 0 uSec.
> Processor package power: 3.7 watts (steady)
>
> Now, disable idle states 1 and 2:
>
> Idle state 0 time (all 8 CPUs): Typically 0 to 70 mSecs / minute
> Processor package power: ~3.8 watts
>
> Note 1: My example is contrived via disabling idle states, so
> I don't know if it actually needs to be worried about.
>
> Note 2: I do not know if there is some processor where
> cpuidle/state1/residency is > 62.5 nSec.

If that's usec, I would be quite surprised if there were any. :-)

> Note 3: I am trying to figure out a way to test rejecting
> measured_us upon timeout exit, but haven't made much progress.

Rejecting it has side effects too, because it basically means lost information.

Reaching the time limit means that the CPU could have been idle much
longer, even though we can't say how much.  That needs to be recorded
in the data used for the next-time prediction or that is going to be
inaccurate too.

Of course, what number to record is a good question. :-)
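
One hypothetical option, just to illustrate (a sketch, not something
any posted patch does): treat the limit as a known lower bound
instead of discarding the sample, along the lines of

	/*
	 * Sketch only: on a timeout exit the real idle interval is at
	 * least the poll limit, so clamp the sample up rather than
	 * dropping it (timed_out is an assumed flag).
	 */
	if (timed_out)
		measured_us = max_t(unsigned int, measured_us,
				    POLL_IDLE_TIME_LIMIT / NSEC_PER_USEC);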


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-25 11:53 ` Rafael J. Wysocki
@ 2018-03-25 21:41   ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2018-03-25 21:41 UTC (permalink / raw)
  To: Doug Smythies
  Cc: Rafael J. Wysocki, Peter Zijlstra, Frederic Weisbecker,
	Thomas Gleixner, Paul McKenney, Thomas Ilsche, Rik van Riel,
	Aubrey Li, Mike Galbraith, LKML, Linux PM

On Sunday, March 25, 2018 1:53:42 PM CEST Rafael J. Wysocki wrote:
> On Sun, Mar 25, 2018 at 1:28 AM, Doug Smythies <dsmythies@telus.net> wrote:
> > On 2018.03.14 07:04 Rafael J. Wysocki wrote:
> >
> >> If poll_idle() is allowed to spin until need_resched() returns 'true',
> >> it may actually spin for a much longer time than expected by the idle
> >> governor, since set_tsk_need_resched() is not always called by the
> >> timer interrupt handler.  If that happens, the CPU may spend much
> >> more time than anticipated in the "polling" state.
> >>
> >> To prevent that from happening, limit the time of the spinning loop
> >> in poll_idle().
> >
> > ...[snip]...
> >
> >> +#define POLL_IDLE_TIME_LIMIT (TICK_NSEC / 16)
> >
> > The other ongoing threads on this aside, potentially, there might
> > be another issue.
> >
> > What if the next available idle state, after 0, has a residency
> > that is greater than TICK_NSEC / 16? Meaning these numbers, for example:
> >
> > /sys/devices/system/cpu/cpu0/cpuidle/state*/residency
> >
> > The suggestion is that upon a timeout exit from idle state 0,
> > the measured_us should maybe be rejected, because the statistics
> > are being biased and it doesn't seem to correct itself.
> 
> OK
> 
> > Up to 1300% (<- not a typo) extra power consumption has been observed.
> >
> > Supporting experimental data:
> >
> > My processor:
> > /sys/devices/system/cpu/cpu0/cpuidle/state0/residency:0
> > /sys/devices/system/cpu/cpu0/cpuidle/state1/residency:2
> > /sys/devices/system/cpu/cpu0/cpuidle/state2/residency:20
> > /sys/devices/system/cpu/cpu0/cpuidle/state3/residency:211 <<< Important
> > /sys/devices/system/cpu/cpu0/cpuidle/state4/residency:345
> >
> > A 1000 Hz kernel (TICK_NSEC/16) = 62.5 nsec; idle system:
> 
> nsec or usec?
> 
> > Idle state 0 time: Typically 0 uSec.
> > Processor package power: 3.7 watts (steady)
> >
> > Now, disable idle states 1 and 2:
> >
> > Idle state 0 time (all 8 CPUs): ~~ 430 Seconds / minute
> > Processor package power: ~52 watts (1300% more power, 14X)
> 
> But that's because you have disabled states 1 and 2, isn't it?
> 
> > A 250 Hz kernel (TICK_NSEC/16) = 250 nSec; idle system:
> >
> > Idle state 0 time: Typically < 1 mSec / minute
> > Processor package power: 3.7 to 3.8 watts
> >
> > Now, disable idle states 1 and 2:
> >
> > Idle state 0 time (all 8 CPUs): Typically 0 to 70 mSecs / minute
> > Processor package power: 3.7 to 3.8 watts
> >
> > A 1000 Hz kernel with:
> >
> > +#define POLL_IDLE_TIME_LIMIT   (TICK_NSEC / 4)
> >
> > Note: Just for a test. I am not suggesting this should change.
> >
> > instead. i.e. (TICK_NSEC/4) = 250 nSec.
> >
> > Idle state 0 time: Typically 0 uSec.
> > Processor package power: 3.7 watts (steady)
> >
> > Now, disable idle states 1 and 2:
> >
> > Idle state 0 time (all 8 CPUs): Typically 0 to 70 mSecs / minute
> > Processor package power: ~3.8 watts
> >
> > Note 1: My example is contrived via disabling idle states, so
> > I don't know if it actually needs to be worried about.
> >
> > Note 2: I do not know if there is some processor where
> > cpuidle/state1/residency is > 62.5 nSec.
> 
> If that's usec, I would be quite surprised if there were any. :-)
> 
> > Note 3: I am trying to figure out a way to test rejecting
> > measured_us upon timeout exit, but haven't made much progress.
> 
> Rejecting it has side effects too, because it basically means lost information.
> 
> Reaching the time limit means that the CPU could have been idle much
> longer, even though we can't say how much.  That needs to be recorded
> in the data used for the next-time prediction or that is going to be
> inaccurate too.
> 
> Of course, what number to record is a good question. :-)

Chances are that the appended patch will help.

Instead of trying to figure out what to do with measured_us after the fact it
tries to make the target residency of the next available state sit within the
time limit that's going to be used.

I'm not sure if "tick period / 16" is a sufficient margin, though.  If not, it
may be necessary to take 2 * target_residency + POLL_IDLE_TIME_LIMIT as the
limit or similar.
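
In code, that more conservative margin would look roughly like this
(a sketch of the alternative only, relative to the loop in the patch
below):

		/* Hypothetical alternative to the "limit +=" line below: */
		limit = 2 * (u64)s->target_residency * NSEC_PER_USEC +
			POLL_IDLE_TIME_LIMIT;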

---
 drivers/cpuidle/poll_state.c |   18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/cpuidle/poll_state.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/poll_state.c
+++ linux-pm/drivers/cpuidle/poll_state.c
@@ -20,7 +20,23 @@ static int __cpuidle poll_idle(struct cp
 	u64 time_start = local_clock();
 	local_irq_enable();
 	if (!current_set_polling_and_test()) {
+		u64 limit = POLL_IDLE_TIME_LIMIT;
 		unsigned int loop_count = 0;
+		int i;
+
+		/*
+		 * Make sure that the next available state is within the limit
+		 * we are going to use to make it more likely to be selected
+		 * next time.
+		 */
+		for (i = index + 1; i < drv->state_count; i++) {
+			struct cpuidle_state *s = &drv->states[i];
+
+			if (!s->disabled && !dev->states_usage[i].disable) {
+				limit += (u64)s->target_residency * NSEC_PER_USEC;
+				break;
+			}
+		}
 
 		while (!need_resched()) {
 			cpu_relax();
@@ -34,7 +50,7 @@ static int __cpuidle poll_idle(struct cp
 				continue;
 
 			loop_count = 0;
-			if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
+			if (local_clock() - time_start > limit)
 				break;
 		}
 	}


* RE: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-25  0:28 [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle() Doug Smythies
  2018-03-25 11:53 ` Rafael J. Wysocki
@ 2018-03-26  6:01 ` Doug Smythies
  1 sibling, 0 replies; 22+ messages in thread
From: Doug Smythies @ 2018-03-26  6:01 UTC (permalink / raw)
  To: 'Rafael J. Wysocki'
  Cc: 'Rafael J. Wysocki', 'Peter Zijlstra',
	'Frederic Weisbecker', 'Thomas Gleixner',
	'Paul McKenney', 'Thomas Ilsche',
	'Rik van Riel', 'Aubrey Li',
	'Mike Galbraith', 'LKML', 'Linux PM',
	Doug Smythies

On 2018.03.25 04:54 Rafael J. Wysocki wrote:
> On Sun, Mar 25, 2018 at 1:28 AM, Doug Smythies <dsmythies@telus.net> wrote:
>> On 2018.03.14 07:04 Rafael J. Wysocki wrote:
>>
>>> If poll_idle() is allowed to spin until need_resched() returns 'true',
>>> it may actually spin for a much longer time than expected by the idle
>>> governor, since set_tsk_need_resched() is not always called by the
>>> timer interrupt handler.  If that happens, the CPU may spend much
>>> more time than anticipated in the "polling" state.
>>>
>>> To prevent that from happening, limit the time of the spinning loop
>>> in poll_idle().
>>
>> ...[snip]...
>>
>>> +#define POLL_IDLE_TIME_LIMIT (TICK_NSEC / 16)
>>
>> The other ongoing threads on this aside, potentially, there might
>> be another issue.
>>
>> What if the next available idle state, after 0, has a residency
>> that is greater than TICK_NSEC / 16? Meaning these numbers, for example:
>>
>> /sys/devices/system/cpu/cpu0/cpuidle/state*/residency
>>
>> The suggestion is that upon a timeout exit from idle state 0,
>> the measured_us should maybe be rejected, because the statistics
>> are being biased and it doesn't seem to correct itself.
>
> OK
>
>> Up to 1300% (<- not a typo) extra power consumption has been observed.
>>
>> Supporting experimental data:
>>
>> My processor:
>> /sys/devices/system/cpu/cpu0/cpuidle/state0/residency:0
>> /sys/devices/system/cpu/cpu0/cpuidle/state1/residency:2
>> /sys/devices/system/cpu/cpu0/cpuidle/state2/residency:20
>> /sys/devices/system/cpu/cpu0/cpuidle/state3/residency:211 <<< Important
>> /sys/devices/system/cpu/cpu0/cpuidle/state4/residency:345
>>
>> A 1000 Hz kernel (TICK_NSEC/16) = 62.5 nsec; idle system:
>
> nsec or usec?

Right, uSeconds.
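
As a cross-check, a minimal userspace sketch of the arithmetic (not
kernel code; the HZ values and the state3 residency are taken from my
numbers above):

#include <stdio.h>

int main(void)
{
	unsigned long long hz[] = { 1000, 250 };
	unsigned long long state3_residency_us = 211;	/* from sysfs above */

	for (int i = 0; i < 2; i++) {
		unsigned long long tick_nsec = 1000000000ULL / hz[i];
		unsigned long long limit_ns = tick_nsec / 16;

		/* HZ=1000: 62500 ns = 62.5 uSec; HZ=250: 250000 ns = 250 uSec */
		printf("HZ=%llu: TICK_NSEC/16 = %llu ns (%.1f uSec), state3 = %llu uSec\n",
		       hz[i], limit_ns, limit_ns / 1000.0, state3_residency_us);
	}
	return 0;
}

So at 1000 Hz the limit (62.5 uSec) sits well below the 211 uSec
residency of the next enabled state in my contrived case, while at
250 Hz (250 uSec) it sits just above it, which matches the package
power observations.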

>> Idle state 0 time: Typically 0 uSec.
>> Processor package power: 3.7 watts (steady)
>>
>> Now, disable idle states 1 and 2:
>>
>> Idle state 0 time (all 8 CPUs): ~~ 430 Seconds / minute
>> Processor package power: ~52 watts (1300% more power, 14X)
>
> But that's because you have disabled states 1 and 2, isn't it?

Yes, and perhaps the conclusion here is that we don't care if
the user has disabled intermediate idle states, forcing these
conditions.

>> A 250 Hz kernel (TICK_NSEC/16) = 250 nSec; idle system:
250 uSec.
>>
>> Idle state 0 time: Typically < 1 mSec / minute
>> Processor package power: 3.7 to 3.8 watts
>>
>> Now, disable idle states 1 and 2:
>>
>> Idle state 0 time (all 8 CPUs): Typically 0 to 70 mSecs / minute
>> Processor package power: 3.7 to 3.8 watts
>>
>> A 1000 Hz kernel with:
>>
>> +#define POLL_IDLE_TIME_LIMIT   (TICK_NSEC / 4)
>>
>> Note: Just for a test. I am not suggesting this should change.
>>
>> instead. i.e. (TICK_NSEC/4) = 250 nSec.
250 uSec.
>>
>> Idle state 0 time: Typically 0 uSec.
>> Processor package power: 3.7 watts (steady)
>>
>> Now, disable idle states 1 and 2:
>>
>> Idle state 0 time (all 8 CPUs): Typically 0 to 70 mSecs / minute
>> Processor package power: ~3.8 watts
>>
>> Note 1: My example is contrived via disabling idle states, so
>> I don't know if it actually needs to be worried about.
>>
>> Note 2: I do not know if there is some processor where
>> cpuidle/state1/residency is > 62.5 nSec.
62.5 uSec.
> If that's usec, I would be quite surprised if there were any. :-)

O.K.

>> Note 3: I am trying to figure out a way to test rejecting
>> measured_us upon timeout exit, but haven't made much progress.
>
> Rejecting it has side effects too, because it basically means lost information.
>
> Reaching the time limit means that the CPU could have been idle much
> longer, even though we can't say how much.  That needs to be recorded
> in the data used for the next-time prediction or that is going to be
> inaccurate too.
>
> Of course, what number to record is a good question. :-)

Maybe it would be O.K. to be aware of this and simply move on.

By the way, I forgot to mention that the above work was done
with kernels based on 4.16-rc6 and only these poll_idle patches.
If I then add the V7.3 idle loop rework patch set, the issue
becomes partially mitigated (151 minute averages):
Idle state 0 time (all 8 CPUs): ~~ 304 Seconds / minute
Processor package power: ~47.7 watts
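(That is roughly 30% less idle state 0 time, since 304/430 ~= 0.71,
and about 8% less package power, since 47.7/52 ~= 0.92, compared
with the numbers above.)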

It'll be tomorrow before I can try the suggestion from the other e-mail.

... Doug


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-27 17:59     ` Rik van Riel
@ 2018-03-27 21:06       ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2018-03-27 21:06 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Rafael J. Wysocki, Linux PM, Peter Zijlstra, Frederic Weisbecker,
	Thomas Gleixner, Paul McKenney, Thomas Ilsche, Doug Smythies,
	Aubrey Li, Mike Galbraith, LKML

On Tue, Mar 27, 2018 at 7:59 PM, Rik van Riel <riel@surriel.com> wrote:
> On Thu, 2018-03-22 at 18:09 +0100, Rafael J. Wysocki wrote:
>
>> Does it improve if you do something like the below on top of it?
>
> First off, apologies for doing my testing wrong.
> There was an error, which was cleverly hidden by
> scripts, and things were not running the way they
> should.

No worries.

I'm glad that what you really see appears to match my understanding of
things. :-)

> After running the tests right, I have seen some
> results. My own test patch of disabling polling,
> and always dropping into at least C1, resulted in
> a CPU use increase of 0.4%.
>
> Your patch below, with a single cpu_relax(), and
> the local variable to not read the TSC too often,
> seems to give the best performance here, with a
> 0.2% reduction in CPU use.

Cool.

> Again, apologies for messing up the testing.
>
> I will kick off proper tests for that other patch
> series right now.

Thanks!

>> > Let me go test the nohz idle series by itself,
>> > without this patch.
>>
>> OK
>>
>> ---
>>  drivers/cpuidle/poll_state.c |    6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> Index: linux-pm/drivers/cpuidle/poll_state.c
>> ===================================================================
>> --- linux-pm.orig/drivers/cpuidle/poll_state.c
>> +++ linux-pm/drivers/cpuidle/poll_state.c
>> @@ -10,6 +10,7 @@
>>  #include <linux/sched/idle.h>
>>
>>  #define POLL_IDLE_TIME_LIMIT (TICK_NSEC / 16)
>> +#define POLL_IDLE_COUNT              1000
>>
>>  static int __cpuidle poll_idle(struct cpuidle_device *dev,
>>                              struct cpuidle_driver *drv, int index)
>> @@ -18,9 +19,14 @@ static int __cpuidle poll_idle(struct cp
>>
>>       local_irq_enable();
>>       if (!current_set_polling_and_test()) {
>> +             unsigned int loop_count = 0;
>> +
>>               while (!need_resched()) {
>>                       cpu_relax();
>> +                     if (loop_count++ < POLL_IDLE_COUNT)
>> +                             continue;
>>
>> +                     loop_count = 0;
>>                      if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
>>                               break;
>>               }
>>
>>
> --
> All Rights Reversed.


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-22 17:09   ` Rafael J. Wysocki
  2018-03-22 17:19     ` Rik van Riel
  2018-03-25 20:15     ` Rik van Riel
@ 2018-03-27 17:59     ` Rik van Riel
  2018-03-27 21:06       ` Rafael J. Wysocki
  2 siblings, 1 reply; 22+ messages in thread
From: Rik van Riel @ 2018-03-27 17:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Paul McKenney, Thomas Ilsche, Doug Smythies, Aubrey Li,
	Mike Galbraith, LKML

On Thu, 2018-03-22 at 18:09 +0100, Rafael J. Wysocki wrote:

> Does it improve if you do something like the below on top of it?

First off, apologies for doing my testing wrong.
There was an error, which was cleverly hidden by
scripts, and things were not running the way they
should.

After running the tests right, I have seen some
results. My own test patch of disabling polling,
and always dropping into at least C1, resulted in
a CPU use increase of 0.4%.

Your patch below, with a single cpu_relax(), and
the local variable to not read the TSC too often,
seems to give the best performance here, with a
0.2% reduction in CPU use.

Again, apologies for messing up the testing.

I will kick off proper tests for that other patch
series right now.

> > Let me go test the nohz idle series by itself,
> > without this patch.
> 
> OK
> 
> ---
>  drivers/cpuidle/poll_state.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> Index: linux-pm/drivers/cpuidle/poll_state.c
> ===================================================================
> --- linux-pm.orig/drivers/cpuidle/poll_state.c
> +++ linux-pm/drivers/cpuidle/poll_state.c
> @@ -10,6 +10,7 @@
>  #include <linux/sched/idle.h>
>  
>  #define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)
> +#define POLL_IDLE_COUNT		1000
>  
>  static int __cpuidle poll_idle(struct cpuidle_device *dev,
>  			       struct cpuidle_driver *drv, int
> index)
> @@ -18,9 +19,14 @@ static int __cpuidle poll_idle(struct cp
>  
>  	local_irq_enable();
>  	if (!current_set_polling_and_test()) {
> +		unsigned int loop_count = 0;
> +
>  		while (!need_resched()) {
>  			cpu_relax();
> +			if (loop_count++ < POLL_IDLE_COUNT)
> +				continue;
>  
> +			loop_count = 0;
>  			if (local_clock() - time_start >
> POLL_IDLE_TIME_LIMIT)
>  				break;
>  		}
> 
> 
-- 
All Rights Reversed.


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-26 21:44           ` Rafael J. Wysocki
@ 2018-03-26 21:48             ` Rik van Riel
  0 siblings, 0 replies; 22+ messages in thread
From: Rik van Riel @ 2018-03-26 21:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Linux PM, Peter Zijlstra, Frederic Weisbecker,
	Thomas Gleixner, Paul McKenney, Thomas Ilsche, Doug Smythies,
	Aubrey Li, Mike Galbraith, LKML

On Mon, 2018-03-26 at 23:44 +0200, Rafael J. Wysocki wrote:
> On Mon, Mar 26, 2018 at 6:32 PM, Rik van Riel <riel@surriel.com>
> wrote:
> > On Sun, 2018-03-25 at 23:34 +0200, Rafael J. Wysocki wrote:
> > 
> > I tried that, as well, and some other variations.
> > 
> > Every single change to poll_idle() that I tried
> > seems to result in a 9-10% relative increase in
> > CPU use during the peak load of the test.
> > 
> > During the busiest parts of the load, every CPU
> > sees on the order of 20k context switches a second.

... and I did something wrong in the tests :/

> Hmm.  Basically, you are saying that
> 
> while (something)
>     cpu_relax();
> 
> is measurably less overhead than
> 
> while (something) {
>     cpu_relax();
>     check something else;
>     cpu_relax();
> }
> 
> which honestly makes me wonder how this is possible at all.

Part of the mystery is solved. Apparently the
control kernel was not the one I thought it
was, or the one that was printed in the 
output messages :(

I am re-running the tests now with a few
variations, and seeing what works best.

My apologies for wasting everybody's time.

I should have valid results later tonight.
Fingers crossed.

-- 
All Rights Reversed.


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-26 16:32         ` Rik van Riel
@ 2018-03-26 21:44           ` Rafael J. Wysocki
  2018-03-26 21:48             ` Rik van Riel
  0 siblings, 1 reply; 22+ messages in thread
From: Rafael J. Wysocki @ 2018-03-26 21:44 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Rafael J. Wysocki, Linux PM, Peter Zijlstra, Frederic Weisbecker,
	Thomas Gleixner, Paul McKenney, Thomas Ilsche, Doug Smythies,
	Aubrey Li, Mike Galbraith, LKML

On Mon, Mar 26, 2018 at 6:32 PM, Rik van Riel <riel@surriel.com> wrote:
> On Sun, 2018-03-25 at 23:34 +0200, Rafael J. Wysocki wrote:
>> On Sunday, March 25, 2018 10:15:52 PM CEST Rik van Riel wrote:
>> >
>> > I plan to try two more things:
>> >
>> > 1) Disable polling on SMT systems, with
>> >    the idea that putting one thread to
>> >    sleep with monitor/mwait in C1 will
>> >    allow the other thread to run faster.
>>
>> Sounds plausible.
>
> Plausible, but wrong. Tests showed that CPU use
> during the peak load of this test increased from
> about 71% to about 78% with this change, or just
> under 10% increase relative to the baseline.
>
> Coincidentally, that is the same CPU use increase
> I have seen with the poll_idle() changes. Not sure
> if that means anything...
>
>> > 2) Insert more cpu_relax() calls into the
>> >    main loop, so the CPU core spends more
>> >    of its time in cpu_relax() and less
>> >    time doing other things:
>>
>> Well, maybe it's a matter of doing cpu_relax() between any other bits
>> of
>> significant computation in there:
>
> I tried that, as well, and some other variations.
>
> Every single change to poll_idle() that I tried
> seems to result in a 9-10% relative increase in
> CPU use during the peak load of the test.
>
> During the busiest parts of the load, every CPU
> sees on the order of 20k context switches a second.

Hmm.  Basically, you are saying that

while (something)
    cpu_relax();

is measurably less overhead than

while (something) {
    cpu_relax();
    check something else;
    cpu_relax();
}

which honestly makes me wonder how this is possible at all.

Unless, that is, Doug's theory actually matches the reality and
the difference is related to the fact that the "something else" causes
the loop to terminate after fewer steps.

Which in turn causes poll_idle() to return to the outer loop in
do_idle() and *that* is enough overhead to explain what you are
seeing.

So, it looks like after the change causing the poll_idle() loop to
terminate in fewer steps (which is generally good for power and also
for performance in some workloads) you are measuring the overhead of
the outer loop in do_idle() instead of the overhead of the loop in
poll_idle() itself.

With that in mind (and I'm going to assume that this is the case until
I see convincing evidence to the contrary), it seems to me that your
workload will be unhappy if *any* changes are made to poll_idle(), but
we simply have to change it, because today it is kind of a design
disaster.


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-25 21:34       ` Rafael J. Wysocki
                           ` (2 preceding siblings ...)
  2018-03-26  7:13         ` Doug Smythies
@ 2018-03-26 16:32         ` Rik van Riel
  2018-03-26 21:44           ` Rafael J. Wysocki
  3 siblings, 1 reply; 22+ messages in thread
From: Rik van Riel @ 2018-03-26 16:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Paul McKenney, Thomas Ilsche, Doug Smythies, Aubrey Li,
	Mike Galbraith, LKML

On Sun, 2018-03-25 at 23:34 +0200, Rafael J. Wysocki wrote:
> On Sunday, March 25, 2018 10:15:52 PM CEST Rik van Riel wrote:
> > 
> > I plan to try two more things:
> > 
> > 1) Disable polling on SMT systems, with
> >    the idea that putting one thread to
> >    sleep with monitor/mwait in C1 will
> >    allow the other thread to run faster.
> 
> Sounds plausible.

Plausible, but wrong. Tests showed that CPU use
during the peak load of this test increased from
about 71% to about 78% with this change, or just
under 10% increase relative to the baseline.

Coincidentally, that is the same CPU use increase
I have seen with the poll_idle() changes. Not sure
if that means anything...

> > 2) Insert more cpu_relax() calls into the
> >    main loop, so the CPU core spends more
> >    of its time in cpu_relax() and less
> >    time doing other things:
> 
> Well, maybe it's a matter of doing cpu_relax() between any other bits
> of
> significant computation in there:

I tried that, as well, and some other variations.

Every single change to poll_idle() that I tried
seems to result in a 9-10% relative increase in
CPU use during the peak load of the test.

During the busiest parts of the load, every CPU
sees on the order of 20k context switches a second.

kind regards,

Rik
-- 
All Rights Reversed.


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-26  7:13         ` Doug Smythies
@ 2018-03-26  9:35           ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2018-03-26  9:35 UTC (permalink / raw)
  To: Doug Smythies, Rik van Riel
  Cc: Rafael J. Wysocki, Linux PM, Peter Zijlstra, Frederic Weisbecker,
	Thomas Gleixner, Paul McKenney, Thomas Ilsche, Aubrey Li,
	Mike Galbraith, LKML

On Mon, Mar 26, 2018 at 9:13 AM, Doug Smythies <dsmythies@telus.net> wrote:
> On 2018.03.25 23:00 Doug Smythies wrote:
>> On 2018.03.25 14:25 Rik van Riel wrote:
>> On Sun, 2018-03-25 at 23:34 +0200, Rafael J. Wysocki wrote:
>>> On Sunday, March 25, 2018 10:15:52 PM CEST Rik van Riel wrote:
>
> ...[snip]...
>
>>>>>
>>>>> OK, I am still seeing a performance
>>>>> degradation with the above, though
>>>>> not throughout the entire workload.
>>>>>
>>>>> It appears that making the idle loop
>>>>> do anything besides cpu_relax() for
>>>>> a significant amount of time slows
>>>>> things down.
>>>>
>>>> I see.
>>
>> I have no proof, but I do not see that as
>> the problem.
>>
>> I think the issue is the overall exiting
>> and then re-entering idle state 0 much
>> more often, and the related overheads, where
>> interrupts are disabled for short periods.

That is a very good point.

So far we have assumed that the performance impact was due to what
happened in poll_idle(), but it very well may be due to how often
poll_idle() is called and returns over a unit of time.

There may be workloads in which wakeups occur so often that idle
states above 1 are (almost) never selected and in these workloads
there will be increased overhead related to entering and exiting state
0 (polling) with the timeout.  That is inevitable.

Rik, if your workload is one of these, you will see performance
impact.  However, you also should see a difference in power (see
below).

>> My jury rigged way of trying to create similar
>> conditions seems to always have the ISR return with
>> the need_resched() flag set, so there is no difference
>> in idle state 0 entries per unit time between kernel
>> 4.16-rc6 and one with the poll fixes added.
>>
>> i.e. the difference between these numbers over some time:
>>
>> cat /sys/devices/system/cpu/cpu*/cpuidle/state0/usage
>>
>> Rik, I wonder if you see a difference with your real
>> workflow?
>
> Using iperf, I was able to show a difference on my computer.
> Another computer was used as the server, and my test computer
> was the client. (the other way around didn't show a difference)
>
> With Kernel 4.16-rc6 I got about ~2000 idle state 0 entries
> per minute and ~155 seconds residency. ~32 watts package power.
>
> With the poll stuff included I got ~46000 idle state 0 entries
> per minute and ~53 seconds residency. ~20 watts package power.

OK, so that's 30%+ of a difference in power.
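
For scale, those numbers also imply an average residency per state 0
entry of roughly 155 s / 2000 ~= 78 ms without the limit, versus
53 s / 46000 ~= 1.2 ms with it: far more frequent, far shorter
polling episodes.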


* RE: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-25 21:34       ` Rafael J. Wysocki
  2018-03-25 21:45         ` Rik van Riel
  2018-03-26  5:59         ` Doug Smythies
@ 2018-03-26  7:13         ` Doug Smythies
  2018-03-26  9:35           ` Rafael J. Wysocki
  2018-03-26 16:32         ` Rik van Riel
  3 siblings, 1 reply; 22+ messages in thread
From: Doug Smythies @ 2018-03-26  7:13 UTC (permalink / raw)
  To: 'Rik van Riel', 'Rafael J. Wysocki'
  Cc: 'Linux PM', 'Peter Zijlstra',
	'Frederic Weisbecker', 'Thomas Gleixner',
	'Paul McKenney', 'Thomas Ilsche',
	'Aubrey Li', 'Mike Galbraith', 'LKML',
	Doug Smythies

On 2018.03.25 23:00 Doug Smythies wrote: 
> On 2018.03.25 14:25 Rik van Riel wrote:
> On Sun, 2018-03-25 at 23:34 +0200, Rafael J. Wysocki wrote:
>> On Sunday, March 25, 2018 10:15:52 PM CEST Rik van Riel wrote:

...[snip]...

>>>> 
>>>> OK, I am still seeing a performance
>>>> degradation with the above, though
>>>> not throughout the entire workload.
>>>> 
>>>> It appears that making the idle loop
>>>> do anything besides cpu_relax() for
>>>> a significant amount of time slows
>>>> things down.
>>> 
>>> I see. 
>
> I have no proof, but I do not see that as
> the problem.
>
> I think the issue is the overall exiting
> and then re-entering idle state 0 much
> more often, and the related overheads, where
> interrupts are disabled for short periods.
>
> My jury rigged way of trying to create similar
> conditions seems to always have the ISR return with
> the need_resched() flag set, so there is no difference
> in idle state 0 entries per unit time between kernel
> 4.16-rc6 and one with the poll fixes added.
>
> i.e. the difference between these numbers over some time:
>
> cat /sys/devices/system/cpu/cpu*/cpuidle/state0/usage
>
> Rik, I wonder if you see a difference with your real
> workflow?

Using iperf, I was able to show a difference on my computer.
Another computer was used as the server, and my test computer
was the client. (the other way around didn't show a difference)

With Kernel 4.16-rc6 I got about ~2000 idle state 0 entries
per minute and ~155 seconds residency. ~32 watts package power.

With the poll stuff included I got ~46000 idle state 0 entries
per minute and ~53 seconds residency. ~20 watts package power.  

... Doug


* RE: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-25 21:34       ` Rafael J. Wysocki
  2018-03-25 21:45         ` Rik van Riel
@ 2018-03-26  5:59         ` Doug Smythies
  2018-03-26  7:13         ` Doug Smythies
  2018-03-26 16:32         ` Rik van Riel
  3 siblings, 0 replies; 22+ messages in thread
From: Doug Smythies @ 2018-03-26  5:59 UTC (permalink / raw)
  To: 'Rik van Riel', 'Rafael J. Wysocki'
  Cc: 'Linux PM', 'Peter Zijlstra',
	'Frederic Weisbecker', 'Thomas Gleixner',
	'Paul McKenney', 'Thomas Ilsche',
	'Aubrey Li', 'Mike Galbraith', 'LKML',
	Doug Smythies

On 2018.03.25 14:25 Rik van Riel wrote:
> On Sun, 2018-03-25 at 23:34 +0200, Rafael J. Wysocki wrote:
>> On Sunday, March 25, 2018 10:15:52 PM CEST Rik van Riel wrote:
>>> 
>>> On Thu, 2018-03-22 at 18:09 +0100, Rafael J. Wysocki wrote:
>>>> 
>>>> +++ linux-pm/drivers/cpuidle/poll_state.c
>>>> @@ -10,6 +10,7 @@
>>>>  #include <linux/sched/idle.h>
>>>> 
>>>>  #define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)
>>>> +#define POLL_IDLE_COUNT		1000
>>>> 
>>>>  static int __cpuidle poll_idle(struct cpuidle_device *dev,
>>>>  			       struct cpuidle_driver *drv, int index)
>>>> @@ -18,9 +19,14 @@ static int __cpuidle poll_idle(struct cp
>>>>
>>>>  	local_irq_enable();
>>>>  	if (!current_set_polling_and_test()) {
>>>> +		unsigned int loop_count = 0;
>>>> +
>>>>  		while (!need_resched()) {
>>>>  			cpu_relax();
>>>> +			if (loop_count++ < POLL_IDLE_COUNT)
>>>> +				continue;
>>>>
>>>> +			loop_count = 0;
>>>>  			if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
>>>>  				break;
>>>>  		}
>>> 
>>> OK, I am still seeing a performance
>>> degradation with the above, though
>>> not throughout the entire workload.
>>> 
>>> It appears that making the idle loop
>>> do anything besides cpu_relax() for
>>> a significant amount of time slows
>>> things down.
>> 
>> I see.
>> 

I have no proof, but I do not see that as
the problem.

I think the issue is the overall exiting
and then re-entering idle state 0 much
more often, and the related overheads, where
interrupts are disabled for short periods.

My jury rigged way of trying to create similar
conditions seems to always have the ISR return with
the need_resched() flag set, so there is no difference
in idle state 0 entries per unit time between kernel
4.16-rc6 and one with the poll fixes added.

i.e. the difference between these numbers over some time:

cat /sys/devices/system/cpu/cpu*/cpuidle/state0/usage

Rik, I wonder if you see a difference with your real
workflow?

... Doug


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-25 21:34       ` Rafael J. Wysocki
@ 2018-03-25 21:45         ` Rik van Riel
  2018-03-26  5:59         ` Doug Smythies
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: Rik van Riel @ 2018-03-25 21:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Paul McKenney, Thomas Ilsche, Doug Smythies, Aubrey Li,
	Mike Galbraith, LKML

On Sun, 2018-03-25 at 23:34 +0200, Rafael J. Wysocki wrote:
> On Sunday, March 25, 2018 10:15:52 PM CEST Rik van Riel wrote:
> > 
> > On Thu, 2018-03-22 at 18:09 +0100, Rafael J. Wysocki wrote:
> > > 
> > > +++ linux-pm/drivers/cpuidle/poll_state.c
> > > @@ -10,6 +10,7 @@
> > >  #include <linux/sched/idle.h>
> > > 
> > >  #define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)
> > > +#define POLL_IDLE_COUNT		1000
> > > 
> > >  static int __cpuidle poll_idle(struct cpuidle_device *dev,
> > >  			       struct cpuidle_driver *drv, int index)
> > > @@ -18,9 +19,14 @@ static int __cpuidle poll_idle(struct cp
> > > 
> > >  	local_irq_enable();
> > >  	if (!current_set_polling_and_test()) {
> > > +		unsigned int loop_count = 0;
> > > +
> > >  		while (!need_resched()) {
> > >  			cpu_relax();
> > > +			if (loop_count++ < POLL_IDLE_COUNT)
> > > +				continue;
> > > 
> > > +			loop_count = 0;
> > >  			if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
> > >  				break;
> > >  		}
> > 
> > OK, I am still seeing a performance
> > degradation with the above, though
> > not throughout the entire workload.
> > 
> > It appears that making the idle loop
> > do anything besides cpu_relax() for
> > a significant amount of time slows
> > things down.
> 
> I see.
> 
> > I plan to try two more things:
> > 
> > 1) Disable polling on SMT systems, with
> >    the idea that putting one thread to
> >    sleep with monitor/mwait in C1 will
> >    allow the other thread to run faster.
> 
> Sounds plausible.
> 
> > 2) Insert more cpu_relax() calls into the
> >    main loop, so the CPU core spends more
> >    of its time in cpu_relax() and less
> >    time doing other things:
> 
> Well, maybe it's a matter of doing cpu_relax() between any other bits
> of
> significant computation in there:

That sounds like a plausible thing to try.
Let me kick off a test with that variant, too.

>  drivers/cpuidle/poll_state.c |   13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/cpuidle/poll_state.c
> ===================================================================
> --- linux-pm.orig/drivers/cpuidle/poll_state.c
> +++ linux-pm/drivers/cpuidle/poll_state.c
> @@ -10,6 +10,7 @@
>  #include <linux/sched/idle.h>
>  
>  #define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)
> +#define POLL_IDLE_COUNT		200
>  
>  static int __cpuidle poll_idle(struct cpuidle_device *dev,
>  			       struct cpuidle_driver *drv, int
> index)
> @@ -18,11 +19,21 @@ static int __cpuidle poll_idle(struct cp
>  
>  	local_irq_enable();
>  	if (!current_set_polling_and_test()) {
> +		unsigned int loop_count = 0;
> +
>  		while (!need_resched()) {
>  			cpu_relax();
> -
> +			if (loop_count++ < POLL_IDLE_COUNT) {
> +				cpu_relax();
> +				continue;
> +			}
> +			cpu_relax();
> +			loop_count = 0;
> +			cpu_relax();
>  			if (local_clock() - time_start >
> POLL_IDLE_TIME_LIMIT)
>  				break;
> +
> +			cpu_relax();
>  		}
>  	}
>  	current_clr_polling();
> 
> 
-- 
All Rights Reversed.


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-25 20:15     ` Rik van Riel
@ 2018-03-25 21:34       ` Rafael J. Wysocki
  2018-03-25 21:45         ` Rik van Riel
                           ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2018-03-25 21:34 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Linux PM, Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Paul McKenney, Thomas Ilsche, Doug Smythies, Aubrey Li,
	Mike Galbraith, LKML

On Sunday, March 25, 2018 10:15:52 PM CEST Rik van Riel wrote:
> 
> On Thu, 2018-03-22 at 18:09 +0100, Rafael J. Wysocki wrote:
> > 
> > +++ linux-pm/drivers/cpuidle/poll_state.c
> > @@ -10,6 +10,7 @@
> >  #include <linux/sched/idle.h>
> > 
> >  #define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)
> > +#define POLL_IDLE_COUNT		1000
> > 
> >  static int __cpuidle poll_idle(struct cpuidle_device *dev,
> >  			       struct cpuidle_driver *drv, int index)
> > @@ -18,9 +19,14 @@ static int __cpuidle poll_idle(struct cp
> > 
> >  	local_irq_enable();
> >  	if (!current_set_polling_and_test()) {
> > +		unsigned int loop_count = 0;
> > +
> >  		while (!need_resched()) {
> >  			cpu_relax();
> > +			if (loop_count++ < POLL_IDLE_COUNT)
> > +				continue;
> > 
> > +			loop_count = 0;
> >  			if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
> >  				break;
> >  		}
> 
> OK, I am still seeing a performance
> degradation with the above, though
> not throughout the entire workload.
> 
> It appears that making the idle loop
> do anything besides cpu_relax() for
> a significant amount of time slows
> things down.

I see.

> I plan to try two more things:
> 
> 1) Disable polling on SMT systems, with
>    the idea that putting one thread to
>    sleep with monitor/mwait in C1 will
>    allow the other thread to run faster.

Sounds plausible.

> 2) Insert more cpu_relax() calls into the
>    main loop, so the CPU core spends more
>    of its time in cpu_relax() and less
>    time doing other things:

Well, maybe it's a matter of doing cpu_relax() between any other bits of
significant computation in there:


---
 drivers/cpuidle/poll_state.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/cpuidle/poll_state.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/poll_state.c
+++ linux-pm/drivers/cpuidle/poll_state.c
@@ -10,6 +10,7 @@
 #include <linux/sched/idle.h>
 
 #define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)
+#define POLL_IDLE_COUNT		200
 
 static int __cpuidle poll_idle(struct cpuidle_device *dev,
 			       struct cpuidle_driver *drv, int index)
@@ -18,11 +19,21 @@ static int __cpuidle poll_idle(struct cp
 
 	local_irq_enable();
 	if (!current_set_polling_and_test()) {
+		unsigned int loop_count = 0;
+
 		while (!need_resched()) {
 			cpu_relax();
-
+			if (loop_count++ < POLL_IDLE_COUNT) {
+				cpu_relax();
+				continue;
+			}
+			cpu_relax();
+			loop_count = 0;
+			cpu_relax();
 			if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
 				break;
+
+			cpu_relax();
 		}
 	}
 	current_clr_polling();


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-22 17:09   ` Rafael J. Wysocki
  2018-03-22 17:19     ` Rik van Riel
@ 2018-03-25 20:15     ` Rik van Riel
  2018-03-25 21:34       ` Rafael J. Wysocki
  2018-03-27 17:59     ` Rik van Riel
  2 siblings, 1 reply; 22+ messages in thread
From: Rik van Riel @ 2018-03-25 20:15 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Paul McKenney, Thomas Ilsche, Doug Smythies, Aubrey Li,
	Mike Galbraith, LKML

On Thu, 2018-03-22 at 18:09 +0100, Rafael J. Wysocki wrote:
> 
> +++ linux-pm/drivers/cpuidle/poll_state.c
> @@ -10,6 +10,7 @@
>  #include <linux/sched/idle.h>
>  
>  #define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)
> +#define POLL_IDLE_COUNT		1000
>  
>  static int __cpuidle poll_idle(struct cpuidle_device *dev,
>  			       struct cpuidle_driver *drv, int
> index)
> @@ -18,9 +19,14 @@ static int __cpuidle poll_idle(struct cp
>  
>  	local_irq_enable();
>  	if (!current_set_polling_and_test()) {
> +		unsigned int loop_count = 0;
> +
>  		while (!need_resched()) {
>  			cpu_relax();
> +			if (loop_count++ < POLL_IDLE_COUNT)
> +				continue;
>  
> +			loop_count = 0;
>  			if (local_clock() - time_start >
> POLL_IDLE_TIME_LIMIT)
>  				break;
>  		}

OK, I am still seeing a performance
degradation with the above, though
not throughout the entire workload.

It appears that making the idle loop
do anything besides cpu_relax() for
a significant amount of time slows
things down.

I plan to try two more things:

1) Disable polling on SMT systems, with
   the idea that putting one thread to
   sleep with monitor/mwait in C1 will
   allow the other thread to run faster.

2) Insert more cpu_relax() calls into the
   main loop, so the CPU core spends more
   of its time in cpu_relax() and less
   time doing other things:

static int __cpuidle poll_idle(struct cpuidle_device *dev,
             struct cpuidle_driver *drv, int index)
{
  u64 time_start = local_clock();

  local_irq_enable();
  if (!current_set_polling_and_test()) {
    unsigned int loop_count = 0;

    while (!need_resched()) {
      cpu_relax();
      cpu_relax();
      cpu_relax();
      cpu_relax();
      cpu_relax();
      cpu_relax();
      cpu_relax();
      cpu_relax();
      if (loop_count++ < POLL_IDLE_COUNT)
        continue;

      loop_count = 0;
      if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
        break;
    }
  }
  current_clr_polling();
                                         
  return index;
}

I will let you know how they perform.

-- 
All Rights Reversed.


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-22 17:19     ` Rik van Riel
@ 2018-03-22 17:24       ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2018-03-22 17:24 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Linux PM, Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Paul McKenney, Thomas Ilsche, Doug Smythies, Aubrey Li,
	Mike Galbraith, LKML

On Thursday, March 22, 2018 6:19:16 PM CET Rik van Riel wrote:
> On Thu, 2018-03-22 at 18:09 +0100, Rafael J. Wysocki wrote:
> > On Thursday, March 22, 2018 5:32:23 PM CET Rik van Riel wrote:
> > > On Wed, 2018-03-14 at 15:08 +0100, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > 
> > > > If poll_idle() is allowed to spin until need_resched() returns
> > > > 'true',
> > > > it may actually spin for a much longer time than expected by the
> > > > idle
> > > > governor, since set_tsk_need_resched() is not always called by
> > > > the
> > > > timer interrupt handler.  If that happens, the CPU may spend much
> > > > more time than anticipated in the "polling" state.
> > > > 
> > > > To prevent that from happening, limit the time of the spinning
> > > > loop
> > > > in poll_idle().
> > > > 
> > > > Suggested-by: Peter Zijlstra <peterz@infradead.org>
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > So ... about bisecting that other patch series...
> > > 
> > > It turned out I had this patch, which looks so
> > > obviously correct, as patch #1 in my series.
> > > 
> > > It also turned out that this patch is responsible
> > > for the entire 5-10% increase in CPU use for the
> > > memcache style workload.
> > > 
> > > I wonder if keeping an idle HT thread much busier
> > > than before slows down its sibling, or something
> > > like that.
> > 
> > Uhm, sorry about this.
> 
> No worries, this is why we do patch reviews and
> tests in the first place.
> 
> > Does it improve if you do something like the below on top of it?
> 
> That was my next thing to try, after testing just
> the idle nohz series by itself :)
> 
> I'll push both into the test systems, and will
> get back to you when I have answers.

Thanks!


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-22 17:09   ` Rafael J. Wysocki
@ 2018-03-22 17:19     ` Rik van Riel
  2018-03-22 17:24       ` Rafael J. Wysocki
  2018-03-25 20:15     ` Rik van Riel
  2018-03-27 17:59     ` Rik van Riel
  2 siblings, 1 reply; 22+ messages in thread
From: Rik van Riel @ 2018-03-22 17:19 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Paul McKenney, Thomas Ilsche, Doug Smythies, Aubrey Li,
	Mike Galbraith, LKML

On Thu, 2018-03-22 at 18:09 +0100, Rafael J. Wysocki wrote:
> On Thursday, March 22, 2018 5:32:23 PM CET Rik van Riel wrote:
> > On Wed, 2018-03-14 at 15:08 +0100, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > If poll_idle() is allowed to spin until need_resched() returns
> > > 'true',
> > > it may actually spin for a much longer time than expected by the
> > > idle
> > > governor, since set_tsk_need_resched() is not always called by
> > > the
> > > timer interrupt handler.  If that happens, the CPU may spend much
> > > more time than anticipated in the "polling" state.
> > > 
> > > To prevent that from happening, limit the time of the spinning
> > > loop
> > > in poll_idle().
> > > 
> > > Suggested-by: Peter Zijlstra <peterz@infradead.org>
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > So ... about bisecting that other patch series...
> > 
> > It turned out I had this patch, which looks so
> > obviously correct, as patch #1 in my series.
> > 
> > It also turned out that this patch is responsible
> > for the entire 5-10% increase in CPU use for the
> > memcache style workload.
> > 
> > I wonder if keeping an idle HT thread much busier
> > than before slows down its sibling, or something
> > like that.
> 
> Uhm, sorry about this.

No worries, this is why we do patch reviews and
tests in the first place.

> Does it improve if you do something like the below on top of it?

That was my next thing to try, after testing just
the idle nohz series by itself :)

I'll push both into the test systems, and will
get back to you when I have answers.

> > Let me go test the nohz idle series by itself,
> > without this patch.
> 
> OK
> 
> ---
>  drivers/cpuidle/poll_state.c |    6 ++++++
>  1 file changed, 6 insertions(+)
> 
> Index: linux-pm/drivers/cpuidle/poll_state.c
> ===================================================================
> --- linux-pm.orig/drivers/cpuidle/poll_state.c
> +++ linux-pm/drivers/cpuidle/poll_state.c
> @@ -10,6 +10,7 @@
>  #include <linux/sched/idle.h>
>  
>  #define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)
> +#define POLL_IDLE_COUNT		1000
>  
>  static int __cpuidle poll_idle(struct cpuidle_device *dev,
>  			       struct cpuidle_driver *drv, int index)
> @@ -18,9 +19,14 @@ static int __cpuidle poll_idle(struct cp
>  
>  	local_irq_enable();
>  	if (!current_set_polling_and_test()) {
> +		unsigned int loop_count = 0;
> +
>  		while (!need_resched()) {
>  			cpu_relax();
> +			if (loop_count++ < POLL_IDLE_COUNT)
> +				continue;
>  
> +			loop_count = 0;
>  			if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
>  				break;
>  		}
> 
> 
-- 
All Rights Reversed.


* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-22 16:32 ` Rik van Riel
@ 2018-03-22 17:09   ` Rafael J. Wysocki
  2018-03-22 17:19     ` Rik van Riel
                       ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2018-03-22 17:09 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Linux PM, Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Paul McKenney, Thomas Ilsche, Doug Smythies, Aubrey Li,
	Mike Galbraith, LKML

On Thursday, March 22, 2018 5:32:23 PM CET Rik van Riel wrote:
> On Wed, 2018-03-14 at 15:08 +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > If poll_idle() is allowed to spin until need_resched() returns
> > 'true',
> > it may actually spin for a much longer time than expected by the idle
> > governor, since set_tsk_need_resched() is not always called by the
> > timer interrupt handler.  If that happens, the CPU may spend much
> > more time than anticipated in the "polling" state.
> > 
> > To prevent that from happening, limit the time of the spinning loop
> > in poll_idle().
> > 
> > Suggested-by: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> So ... about bisecting that other patch series...
> 
> It turned out I had this patch, which looks so
> obviously correct, as patch #1 in my series.
> 
> It also turned out that this patch is responsible
> for the entire 5-10% increase in CPU use for the
> memcache style workload.
> 
> I wonder if keeping an idle HT thread much busier
> than before slows down its sibling, or something
> like that.

Uhm, sorry about this.

Does it improve if you do something like the below on top of it?

> Let me go test the nohz idle series by itself,
> without this patch.

OK

---
 drivers/cpuidle/poll_state.c |    6 ++++++
 1 file changed, 6 insertions(+)

Index: linux-pm/drivers/cpuidle/poll_state.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/poll_state.c
+++ linux-pm/drivers/cpuidle/poll_state.c
@@ -10,6 +10,7 @@
 #include <linux/sched/idle.h>
 
 #define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)
+#define POLL_IDLE_COUNT		1000
 
 static int __cpuidle poll_idle(struct cpuidle_device *dev,
 			       struct cpuidle_driver *drv, int index)
@@ -18,9 +19,14 @@ static int __cpuidle poll_idle(struct cp
 
 	local_irq_enable();
 	if (!current_set_polling_and_test()) {
+		unsigned int loop_count = 0;
+
 		while (!need_resched()) {
 			cpu_relax();
+			if (loop_count++ < POLL_IDLE_COUNT)
+				continue;
 
+			loop_count = 0;
 			if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
 				break;
 		}

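[Editor's note: the patch above rate-limits the time check rather than
removing it — cpu_relax() still runs on every pass, but local_clock()
is only read about once per POLL_IDLE_COUNT iterations, so the
comparatively expensive clock read no longer dominates the polling
loop. A minimal user-space C sketch of the same pattern follows;
now_ns(), stop_requested(), poll_with_budget() and the constants are
illustrative stand-ins, not the kernel interfaces.

#include <stdint.h>
#include <time.h>

#define TIME_LIMIT_NS	62500ULL	/* stand-in for POLL_IDLE_TIME_LIMIT */
#define CHECK_INTERVAL	1000U		/* stand-in for POLL_IDLE_COUNT */

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

/*
 * Spin until stop_requested() fires or the time budget runs out,
 * reading the clock only once every CHECK_INTERVAL iterations.
 */
static void poll_with_budget(int (*stop_requested)(void))
{
	uint64_t time_start = now_ns();
	unsigned int loop_count = 0;

	while (!stop_requested()) {
		/* the kernel loop does cpu_relax() here */

		if (loop_count++ < CHECK_INTERVAL)
			continue;

		loop_count = 0;
		if (now_ns() - time_start > TIME_LIMIT_NS)
			break;
	}
}

The trade-off is a bounded overshoot: the loop can run up to
CHECK_INTERVAL extra iterations past the limit before noticing, in
exchange for a roughly thousandfold reduction in clock reads.]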
^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-14 14:08 Rafael J. Wysocki
@ 2018-03-22 16:32 ` Rik van Riel
  2018-03-22 17:09   ` Rafael J. Wysocki
  0 siblings, 1 reply; 22+ messages in thread
From: Rik van Riel @ 2018-03-22 16:32 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM
  Cc: Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Paul McKenney, Thomas Ilsche, Doug Smythies, Aubrey Li,
	Mike Galbraith, LKML

On Wed, 2018-03-14 at 15:08 +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> If poll_idle() is allowed to spin until need_resched() returns
> 'true',
> it may actually spin for a much longer time than expected by the idle
> governor, since set_tsk_need_resched() is not always called by the
> timer interrupt handler.  If that happens, the CPU may spend much
> more time than anticipated in the "polling" state.
> 
> To prevent that from happening, limit the time of the spinning loop
> in poll_idle().
> 
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

So ... about bisecting that other patch series...

It turned out I had this patch, which looks so
obviously correct, as patch #1 in my series.

It also turned out that this patch is responsible
for the entire 5-10% increase in CPU use for the
memcache style workload.

I wonder if keeping an idle HT thread much busier
than before slows down its sibling, or something
like that.

Let me go test the nohz idle series by itself,
without this patch.

-- 
All Rights Reversed.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
  2018-03-14 15:00 Doug Smythies
@ 2018-03-20 10:52 ` Rafael J. Wysocki
  0 siblings, 0 replies; 22+ messages in thread
From: Rafael J. Wysocki @ 2018-03-20 10:52 UTC (permalink / raw)
  To: Doug Smythies
  Cc: 'Peter Zijlstra', 'Frederic Weisbecker',
	'Thomas Gleixner', 'Paul McKenney',
	'Thomas Ilsche', 'Rik van Riel',
	'Aubrey Li', 'Mike Galbraith', 'LKML',
	'Linux PM'

On Wednesday, March 14, 2018 4:00:44 PM CET Doug Smythies wrote:
> On 2018.03.14 07:09 Rafael J. Wysocki wrote:
> 
> ... [snip]...
> 
> > v2 -> v3: Use local_clock() for time measurements and drop the
> >          counter, since that should be lightweight enough (as
> >          suggested by Peter).
> 
> I have been testing the latest of everything for a couple of days
> now, and everything continues to be great.

Due to the lack of any further comments, I have queued up this patch
with a Tested-by from you.

Thanks!

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
@ 2018-03-14 15:00 Doug Smythies
  2018-03-20 10:52 ` Rafael J. Wysocki
  0 siblings, 1 reply; 22+ messages in thread
From: Doug Smythies @ 2018-03-14 15:00 UTC (permalink / raw)
  To: 'Rafael J. Wysocki'
  Cc: 'Peter Zijlstra', 'Frederic Weisbecker',
	'Thomas Gleixner', 'Paul McKenney',
	'Thomas Ilsche', 'Rik van Riel',
	'Aubrey Li', 'Mike Galbraith', 'LKML',
	Doug Smythies, 'Linux PM'

On 2018.03.14 07:09 Rafael J. Wysocki wrote:

... [snip]...

> v2 -> v3: Use local_clock() for time measurements and drop the
>          counter, since that should be lightweight enough (as
>          suggested by Peter).

I have been testing the latest of everything for a couple of days
now, and everything continues to be great.

Note that I was using a POLL_IDLE_TIME_CHECK_COUNT of 1 anyhow, because
I specifically wanted to test the worst-case time through the loop;
i.e., I wanted any potential issue to be 1000 times more likely to be
found. My problem is that I don't know of a good test for this
specifically.

I'll switch to this V3, along with V4 of the "sched/cpuidle: Idle loop
rework" 7-patch series.

As for energy savings from this patch alone, I would refer readers to
my previous test results from late November [1]; I haven't re-done
those Phoronix tests yet, but I don't expect the results to differ
much.

[1] https://marc.info/?l=linux-pm&m=151154499710125&w=2

... Doug

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle()
@ 2018-03-14 14:08 Rafael J. Wysocki
  2018-03-22 16:32 ` Rik van Riel
  0 siblings, 1 reply; 22+ messages in thread
From: Rafael J. Wysocki @ 2018-03-14 14:08 UTC (permalink / raw)
  To: Linux PM
  Cc: Peter Zijlstra, Frederic Weisbecker, Thomas Gleixner,
	Paul McKenney, Thomas Ilsche, Doug Smythies, Rik van Riel,
	Aubrey Li, Mike Galbraith, LKML

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

If poll_idle() is allowed to spin until need_resched() returns 'true',
it may actually spin for a much longer time than expected by the idle
governor, since set_tsk_need_resched() is not always called by the
timer interrupt handler.  If that happens, the CPU may spend much
more time than anticipated in the "polling" state.

To prevent that from happening, limit the time of the spinning loop
in poll_idle().

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v2 -> v3: Use local_clock() for time measurements and drop the
          counter, since that should be lightweight enough (as
          suggested by Peter).

---
 drivers/cpuidle/poll_state.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/cpuidle/poll_state.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/poll_state.c
+++ linux-pm/drivers/cpuidle/poll_state.c
@@ -6,15 +6,24 @@
 
 #include <linux/cpuidle.h>
 #include <linux/sched.h>
+#include <linux/sched/clock.h>
 #include <linux/sched/idle.h>
 
+#define POLL_IDLE_TIME_LIMIT	(TICK_NSEC / 16)
+
 static int __cpuidle poll_idle(struct cpuidle_device *dev,
 			       struct cpuidle_driver *drv, int index)
 {
+	u64 time_start = local_clock();
+
 	local_irq_enable();
 	if (!current_set_polling_and_test()) {
-		while (!need_resched())
+		while (!need_resched()) {
 			cpu_relax();
+
+			if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
+				break;
+		}
 	}
 	current_clr_polling();
 

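[Editor's note: to make the limit concrete — TICK_NSEC is the tick
period in nanoseconds, roughly 10^9 / HZ, so the worked arithmetic is:

  HZ = 1000: TICK_NSEC = 1,000,000 ns -> limit = 1,000,000 / 16 =  62,500 ns (~62.5 us)
  HZ =  250: TICK_NSEC = 4,000,000 ns -> limit = 4,000,000 / 16 = 250,000 ns (~250 us)

local_clock() is presumably chosen here because it is a fast, per-CPU
sched_clock()-based timestamp; since both readings are taken on the
same CPU, cross-CPU clock drift is not a concern.]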
^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread

Thread overview: 22+ messages
2018-03-25  0:28 [PATCH v3] cpuidle: poll_state: Add time limit to poll_idle() Doug Smythies
2018-03-25 11:53 ` Rafael J. Wysocki
2018-03-25 21:41   ` Rafael J. Wysocki
2018-03-26  6:01 ` Doug Smythies
  -- strict thread matches above, loose matches on Subject: below --
2018-03-14 15:00 Doug Smythies
2018-03-20 10:52 ` Rafael J. Wysocki
2018-03-14 14:08 Rafael J. Wysocki
2018-03-22 16:32 ` Rik van Riel
2018-03-22 17:09   ` Rafael J. Wysocki
2018-03-22 17:19     ` Rik van Riel
2018-03-22 17:24       ` Rafael J. Wysocki
2018-03-25 20:15     ` Rik van Riel
2018-03-25 21:34       ` Rafael J. Wysocki
2018-03-25 21:45         ` Rik van Riel
2018-03-26  5:59         ` Doug Smythies
2018-03-26  7:13         ` Doug Smythies
2018-03-26  9:35           ` Rafael J. Wysocki
2018-03-26 16:32         ` Rik van Riel
2018-03-26 21:44           ` Rafael J. Wysocki
2018-03-26 21:48             ` Rik van Riel
2018-03-27 17:59     ` Rik van Riel
2018-03-27 21:06       ` Rafael J. Wysocki
