linux-kernel.vger.kernel.org archive mirror
* [PATCH v1 0/2] Optimization CPU idle state impacted by tick
@ 2018-08-07 14:27 Leo Yan
  2018-08-07 14:27 ` [PATCH v1 1/2] cpuidle: menu: Correct the criteria for stopping tick Leo Yan
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Leo Yan @ 2018-08-07 14:27 UTC
  To: Rafael J. Wysocki, Peter Zijlstra (Intel),
	Ramesh Thomas, Daniel Lezcano, Vincent Guittot, linux-kernel
  Cc: Leo Yan

Rafael's patch series 'sched/cpuidle: Idle loop rework' has been merged
into the mainline kernel and resolves the Powernightmares issue [1] by
not stopping the tick during the idle loop.  We verified the series on
an Arm platform (96boards Hikey620 with eight Cortex-A53 CPUs), using
the rt-app [2] program to generate workloads: a single task with a 5ms
period and a duty cycle of 1%/3%/5%/10%/20%/30%/40%.

After running these test cases, we found that the CPU cannot stay in the
deepest idle state as expected; the issues are essentially related to
the sched tick:

The prominent issue is the criterion for deciding whether to stop the
tick.  The current criterion checks whether the expected interval is
less than TICK_USEC, but it does not take the idle state parameters
into account, so we can observe that the CPU has enough sleeping time
yet cannot enter the deepest idle state; this is very serious for some
specific duty cycle cases.

Another issue is that when the tick keeps running in the idle state, it
can heavily impact the 'menu' governor metrics; in particular it
introduces a lot of noise into the correction factors for the next event.

This patch series tries to fix these two issues.  Patch 0001 defines a
time point that decides whether to stop the tick; this time point takes
into account the tick period and the maximum target residency, and the
predicted period is compared against it to decide whether the tick
needs to be stopped.  Patch 0002 always gives compensation for a tick
event, so that the tick's impact on the correction factors is dismissed
for the next prediction.

The table below compares the results of the test cases without and with
this patch series; we run the test case with a single task with a 5ms
period and different duty cycles, and the total running time is 10s.
Based on the tracing log, we gather statistics for the duration of all
idle states across all CPUs; the unit is seconds (s).  On the Hikey
board the results show an improvement in the selection of the C2 state
(the deepest CPU state).

Some notations are used in the table:

state: C0: WFI; C1: CPU OFF; C2: Cluster OFF

All test cases have a single task with a 5ms period; durations are in
seconds, summed over all CPUs:

                   Without patches                 With patches                     Difference
            -----------------------------  -----------------------------  -------------------------------
Duty cycle      C0        C1        C2         C0        C1         C2         C0        C1         C2
  1%        0.218589  4.208460  87.995606  0.119723  0.847116  91.940569  -0.098866  -3.361344  +3.944963
  3%        0.801521  5.031361  86.444753  0.147346  0.820276  91.761191  -0.654175  -4.211085  +5.316438
  5%        0.590236  2.733048  88.284541  0.149237  1.042383  90.490482  -0.440999  -1.690665  +2.205941
 10%        0.601922  6.282368  84.899870  0.169491  1.304985  89.725754  -0.432431  -4.977383  +4.825884
 20%        1.381870  8.531687  80.627691  0.307390  3.302562  86.686887  -1.074480  -5.229125  +6.059196
 30%        1.785221  6.974483  81.083312  0.548050  5.319929  83.551747  -1.237171  -1.654554  +2.468435
 40%        1.403247  6.474203  80.577176  0.467686  6.366482  81.983384  -0.935561  -0.107721  +1.406208


Leo Yan (2):
  cpuidle: menu: Correct the criteria for stopping tick
  cpuidle: menu: Dismiss tick impaction on correction factors

 drivers/cpuidle/governors/menu.c | 55 ++++++++++++++++++++++++++++++++--------
 1 file changed, 45 insertions(+), 10 deletions(-)

-- 
2.7.4



* [PATCH v1 1/2] cpuidle: menu: Correct the criteria for stopping tick
  2018-08-07 14:27 [PATCH v1 0/2] Optimization CPU idle state impacted by tick Leo Yan
@ 2018-08-07 14:27 ` Leo Yan
  2018-08-07 14:27 ` [PATCH v1 2/2] cpuidle: menu: Dismiss tick impaction on correction factors Leo Yan
  2018-08-07 14:38 ` [PATCH v1 0/2] Optimization CPU idle state impacted by tick leo.yan
  2 siblings, 0 replies; 4+ messages in thread
From: Leo Yan @ 2018-08-07 14:27 UTC
  To: Rafael J. Wysocki, Peter Zijlstra (Intel),
	Ramesh Thomas, Daniel Lezcano, Vincent Guittot, linux-kernel
  Cc: Leo Yan

The criterion for keeping the tick running is that the predicted idle
duration is less than TICK_USEC.  With HZ=250, as the mainline kernel
configures it, TICK_USEC equals 4000us, so for any prediction below
4000us the tick is not stopped and the idle state is fixed up to a
shallow state.  On the other hand, take the 96boards Hikey (eight
Cortex-A53 CPUs) as an example: the deepest C-state on this platform is
the cluster off state, whose 'target_residency' is 2700us.  If the
'menu' governor predicts a next idle duration anywhere in the range
[2700us, 4000us), it keeps the sched tick running and rolls back to the
shallower CPU off state rather than the cluster off state.  As a
result, the CPU has much less chance of entering the deepest state when
a task runs on it repeatedly with a 5000us period and a 40% duty cycle
(so the task runs for 2000us and then sleeps for 3000us in every
period).  In theory, we should permit the CPU to stay in the cluster
off state, because the CPU sleeping time of 3000us exceeds the 2700us
'target_residency'.
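
The arithmetic above can be checked with a small user-space sketch.  It
is not kernel code: the helper names (sleep_us, old_keeps_tick,
missed_deepest) are hypothetical, and it assumes the governor's
prediction equals the task's real sleep time, which is an idealization.

```c
#include <stdbool.h>

#define TICK_USEC            4000u  /* HZ=250, as stated above */
#define DEEPEST_RESIDENCY_US 2700u  /* Hikey cluster off 'target_residency' */

/* Idle time per period of a periodic task (hypothetical helper). */
static unsigned int sleep_us(unsigned int period_us, unsigned int duty_pct)
{
	return period_us - period_us * duty_pct / 100;
}

/* Old criterion: the tick is kept when the prediction is below TICK_USEC. */
static bool old_keeps_tick(unsigned int expected_interval_us)
{
	return expected_interval_us < TICK_USEC;
}

/*
 * True when the idle time is long enough for the deepest state, yet the
 * old criterion keeps the tick and so rolls back to a shallower state.
 */
static bool missed_deepest(unsigned int expected_interval_us)
{
	return expected_interval_us >= DEEPEST_RESIDENCY_US &&
	       old_keeps_tick(expected_interval_us);
}
```

Under these assumptions, missed_deepest(sleep_us(5000, 40)) is true
(3000us of sleep), and the same holds for the 30% duty cycle (3500us),
while at 20% the sleep time is exactly 4000us and the old criterion
already stops the tick.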

This issue is caused by the 'menu' governor's criterion for deciding
whether to keep the tick enabled and roll back to a shallow state:
'expected_interval < TICK_USEC'.  This criterion only considers the
tick; it does not consider idle state residency, so it misses the
better choice of a deeper idle state when, e.g., the 'target_residency'
of the deepest idle state is less than TICK_USEC, which is quite common
on Arm platforms.

To fix this issue, this patch adds one extra variable,
'stop_tick_point', to help decide whether to stop the tick.  If the
prediction is not shorter than 'stop_tick_point', we can stop the tick;
otherwise the tick keeps running.

For 'stop_tick_point', besides comparing the prediction period with
TICK_USEC, we also need to consider the 'target_residency' of the
deepest idle state.  So 'stop_tick_point' is the minimum of the deepest
idle state's 'target_residency' and TICK_USEC.
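
A minimal stand-alone sketch of the new criterion follows.  It is an
illustration rather than the code in the diff: the function names and
the plain unsigned int parameters are assumptions, and the
CPUIDLE_FLAG_POLLING check that the real condition also contains is
left out.

```c
#include <stdbool.h>

#define TICK_USEC 4000u  /* HZ=250 */

/* Threshold at or above which the tick may be stopped. */
static unsigned int stop_tick_point(unsigned int deepest_target_residency_us)
{
	return deepest_target_residency_us < TICK_USEC ?
	       deepest_target_residency_us : TICK_USEC;
}

/* The tick is kept only when the prediction is below the threshold. */
static bool should_stop_tick(unsigned int expected_interval_us,
			     unsigned int deepest_target_residency_us)
{
	return expected_interval_us >=
	       stop_tick_point(deepest_target_residency_us);
}
```

With the Hikey value of 2700us, a 3000us prediction now stops the tick,
whereas a 2000us prediction still keeps it.  On a hypothetical platform
whose deepest state needs 10000us, the threshold stays at TICK_USEC, so
the behaviour there is unchanged.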

Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 drivers/cpuidle/governors/menu.c | 41 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 30ab759..2ce4068 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -294,6 +294,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 	unsigned int expected_interval;
 	unsigned long nr_iowaiters, cpu_load;
 	ktime_t delta_next;
+	unsigned int stop_tick_point;
 
 	if (data->needs_update) {
 		menu_update(drv, dev);
@@ -406,11 +407,47 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 		idx = 0; /* No states enabled. Must use 0. */
 
 	/*
+	 * Decide the time point for tick stopping, if the prediction is before
+	 * this time point it's better to keep the tick enabled and after the
+	 * time point it means the CPU can stay in idle state for a long enough
+	 * time so should stop the tick.  This point needs to consider two
+	 * factors: the first one is tick period and the other factor is the
+	 * maximum target residency.
+	 *
+	 * We can divide into below cases:
+	 *
+	 * The first case is the prediction is shorter than the maximum target
+	 * residency and also shorter than tick period, this means the
+	 * prediction rules out the deepest idle state and the CPU is supposed
+	 * to be woken up within the tick period, for this case we should keep
+	 * the tick enabled;
+	 *
+	 * The second case is the prediction is shorter than the maximum target
+	 * residency and longer than tick period, for this case the idle state
+	 * selection is already based on the prediction for a shallow state and
+	 * we expect some event to arrive later than the tick to wake up the
+	 * CPU; another view of this case is that the CPU is likely to stay in
+	 * the expected idle state for long while (which should be longer than
+	 * tick period), so it's reasonable to stop the tick.
+	 *
+	 * The third case is the prediction is longer than the maximum target
+	 * residency, whether it's longer or shorter than the tick period; for
+	 * this case we have selected the deepest idle state so it's pointless
+	 * to enable tick to wake up CPU from deepest state.
+	 *
+	 * To summarize the above cases, we use the value of min(TICK_USEC,
+	 * maximum_target_residency) as the critical point to decide whether to
+	 * stop the tick.
+	 */
+	stop_tick_point = min_t(unsigned int, TICK_USEC,
+			drv->states[drv->state_count-1].target_residency);
+
+	/*
 	 * Don't stop the tick if the selected state is a polling one or if the
-	 * expected idle duration is shorter than the tick period length.
+	 * expected idle duration is shorter than the estimated stop tick point.
 	 */
 	if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) ||
-	    expected_interval < TICK_USEC) {
+	    expected_interval < stop_tick_point) {
 		unsigned int delta_next_us = ktime_to_us(delta_next);
 
 		*stop_tick = false;
-- 
2.7.4



* [PATCH v1 2/2] cpuidle: menu: Dismiss tick impaction on correction factors
  2018-08-07 14:27 [PATCH v1 0/2] Optimization CPU idle state impacted by tick Leo Yan
  2018-08-07 14:27 ` [PATCH v1 1/2] cpuidle: menu: Correct the criteria for stopping tick Leo Yan
@ 2018-08-07 14:27 ` Leo Yan
  2018-08-07 14:38 ` [PATCH v1 0/2] Optimization CPU idle state impacted by tick leo.yan
  2 siblings, 0 replies; 4+ messages in thread
From: Leo Yan @ 2018-08-07 14:27 UTC
  To: Rafael J. Wysocki, Peter Zijlstra (Intel),
	Ramesh Thomas, Daniel Lezcano, Vincent Guittot, linux-kernel
  Cc: Leo Yan

If the idle duration predictor detects that the CPU was woken up by the
tick and the condition 'data->next_timer_us > TICK_USEC' is met, it
gives a big compensation for the 'measured' interval; this is intended
to avoid artificially small correction factor values.  Unfortunately,
this still does not cover all cases of the tick's impact on the
correction factors.  E.g., if the predicted next event is less than
TICK_USEC away, every wakeup by the tick is treated as a normal wakeup,
with the exit latency deducted; as a result, tick events heavily impact
the correction factors.  Moreover, the coming tick is sometimes very
soon: especially the first time the CPU becomes idle, the tick expiry
time may vary, so ticks can introduce a big deviation in the correction
factors.

If the idle governor deliberately does not stop the tick timer, the
tick event arrives as expected at a fixed interval, so it is
predictable.  If the tick event comes earlier than other normal timer
events and other possible wakeup events, we need to dismiss the tick's
impact on the correction factors, so that the correction factor array
is used purely for the correctness of other wakeup events rather than
the sched tick.

This patch checks whether the wakeup is a tick wakeup; if so, it
assumes the CPU could have stayed in the idle state for long enough and
gives a high compensation for the 'measured' interval, which avoids the
tick's impact on the correction factor array.
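
The change in condition can be sketched outside the kernel as below.
This is a simplified illustration: the function names and the
last_residency_us parameter are hypothetical, the value 50000 for
MAX_INTERESTING is an assumption about menu.c, and the non-tick branch
here returns the residency unchanged instead of deducting the exit
latency.

```c
#include <stdbool.h>

#define TICK_USEC       4000u   /* HZ=250 */
#define MAX_INTERESTING 50000u  /* assumed value, see the note above */

/* Before: compensate only if the next timer was beyond the tick. */
static unsigned int measured_before(bool tick_wakeup,
				    unsigned int next_timer_us,
				    unsigned int last_residency_us)
{
	if (tick_wakeup && next_timer_us > TICK_USEC)
		return 9 * MAX_INTERESTING / 10;
	return last_residency_us;
}

/* After: every tick wakeup gets the compensation. */
static unsigned int measured_after(bool tick_wakeup,
				   unsigned int last_residency_us)
{
	if (tick_wakeup)
		return 9 * MAX_INTERESTING / 10;
	return last_residency_us;
}
```

For a tick wakeup after 500us with the next timer 3000us away, the old
condition feeds 500us into the correction factor, while the new one
feeds the compensated value 9 * MAX_INTERESTING / 10.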

Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 drivers/cpuidle/governors/menu.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 2ce4068..43cbde3 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -525,15 +525,13 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 	 * assume the state was never reached and the exit latency is 0.
 	 */
 
-	if (data->tick_wakeup && data->next_timer_us > TICK_USEC) {
+	if (data->tick_wakeup) {
 		/*
-		 * The nohz code said that there wouldn't be any events within
-		 * the tick boundary (if the tick was stopped), but the idle
-		 * duration predictor had a differing opinion.  Since the CPU
-		 * was woken up by a tick (that wasn't stopped after all), the
-		 * predictor was not quite right, so assume that the CPU could
-		 * have been idle long (but not forever) to help the idle
-		 * duration predictor do a better job next time.
+		 * Since the CPU was woken up by a tick (that wasn't stopped
+		 * after all), the predictor was not quite right, so assume
+		 * that the CPU could have been idle long (but not forever)
+		 * to help the idle duration predictor do a better job next
+		 * time.
 		 */
 		measured_us = 9 * MAX_INTERESTING / 10;
 	} else {
-- 
2.7.4



* Re: [PATCH v1 0/2] Optimization CPU idle state impacted by tick
  2018-08-07 14:27 [PATCH v1 0/2] Optimization CPU idle state impacted by tick Leo Yan
  2018-08-07 14:27 ` [PATCH v1 1/2] cpuidle: menu: Correct the criteria for stopping tick Leo Yan
  2018-08-07 14:27 ` [PATCH v1 2/2] cpuidle: menu: Dismiss tick impaction on correction factors Leo Yan
@ 2018-08-07 14:38 ` leo.yan
  2 siblings, 0 replies; 4+ messages in thread
From: leo.yan @ 2018-08-07 14:38 UTC
  To: Rafael J. Wysocki, Peter Zijlstra (Intel),
	Ramesh Thomas, Daniel Lezcano, Vincent Guittot, linux-kernel

On Tue, Aug 07, 2018 at 10:27:02PM +0800, Leo Yan wrote:
> After Rafael's patch series 'sched/cpuidle: Idle loop rework' has been
> merged in mainline kernel, it perfectly resolved the Powernightmares
> issue [1] with not stopping the tick during the idle loop; we verified
> this patch series on Arm platform (96boards Hikey620 with octa CA53
> CPUs) with the rt-app [2] program to generate workloads: a single task
> with below combinded configurations with period 5ms and duty cycle
> 1%/3%/5%/10%/20%/30%/40%.

Oops, I missed the two reference links; listing them here for completeness:

[1] https://tu-dresden.de/zih/forschung/ressourcen/dateien/projekte/haec/powernightmares.pdf?lang=en
[2] https://git.linaro.org/power/rt-app.git

Thanks,
Leo Yan
