* [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
@ 2016-09-03  0:56 Rafael J. Wysocki
  2016-09-03  0:58 ` [RFC/RFT][PATCH 1/4] cpufreq / sched: SCHED_CPUFREQ_IOWAIT flag to indicate iowait condition Rafael J. Wysocki
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2016-09-03  0:56 UTC (permalink / raw)
  To: Linux PM list
  Cc: Linux Kernel Mailing List, Srinivas Pandruvada, Peter Zijlstra,
	Viresh Kumar, Ingo Molnar, Vincent Guittot, Morten Rasmussen,
	Juri Lelli, Dietmar Eggemann, Steve Muckle, Doug Smythies

Hi Everyone,

This is a new version of the "iowait boost" series I posted a few weeks
ago.  Since the first two patches from that series have been reworked and
are in linux-next now, I've rebased this series on top of my linux-next
branch.

In addition to that, I took Doug's feedback into account in the
intel_pstate patches [2-3/4].

Please let me know what you think and if you can run some benchmarks you
care about and see if the changes make any difference (this way or another),
please do that and let me know what you've found.

Thanks,
Rafael


* [RFC/RFT][PATCH 1/4] cpufreq / sched: SCHED_CPUFREQ_IOWAIT flag to indicate iowait condition
  2016-09-03  0:56 [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Rafael J. Wysocki
@ 2016-09-03  0:58 ` Rafael J. Wysocki
  2016-09-03  1:02 ` [RFC/RFT][PATCH 2/4] cpufreq: intel_pstate: Change P-state selection algorithm for Core Rafael J. Wysocki
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2016-09-03  0:58 UTC (permalink / raw)
  To: Linux PM list
  Cc: Linux Kernel Mailing List, Srinivas Pandruvada, Peter Zijlstra,
	Viresh Kumar, Ingo Molnar, Vincent Guittot, Morten Rasmussen,
	Juri Lelli, Dietmar Eggemann, Steve Muckle, Doug Smythies

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Testing indicates that it is possible to improve performance
significantly without increasing energy consumption too much by
teaching cpufreq governors to bump up the CPU performance level if
the in_iowait flag is set for the task in enqueue_task_fair().

For this purpose, define a new cpufreq_update_util() flag
SCHED_CPUFREQ_IOWAIT and modify enqueue_task_fair() to pass that
flag to cpufreq_update_util() in the in_iowait case.  That generally
requires cpufreq_update_util() to be called directly from there,
because update_load_avg() may not be invoked in that case.
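As an illustration only (not part of the patch), the flag test a governor
callback would perform can be sketched in plain user-space C.  The flag
values are the ones the patch defines; update_is_iowait() is a hypothetical
helper name introduced here for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in include/linux/sched.h by this patch. */
#define SCHED_CPUFREQ_RT	(1U << 0)
#define SCHED_CPUFREQ_DL	(1U << 1)
#define SCHED_CPUFREQ_IOWAIT	(1U << 2)

#define SCHED_CPUFREQ_RT_DL	(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)

/* The new flag must not alias the existing RT/DL bits. */
static_assert((SCHED_CPUFREQ_IOWAIT & SCHED_CPUFREQ_RT_DL) == 0,
	      "IOWAIT flag overlaps RT/DL flags");

/*
 * Hypothetical helper: returns true if the utilization update was
 * triggered for a task that had been waiting on I/O.
 */
static bool update_is_iowait(unsigned int flags)
{
	return (flags & SCHED_CPUFREQ_IOWAIT) != 0;
}
```

Because the flags are independent bits, a governor can test for the iowait
condition regardless of whether RT or DL bits are set in the same update.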

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 include/linux/sched.h |    1 +
 kernel/sched/fair.c   |    8 ++++++++
 2 files changed, 9 insertions(+)

Index: linux-pm/kernel/sched/fair.c
===================================================================
--- linux-pm.orig/kernel/sched/fair.c
+++ linux-pm/kernel/sched/fair.c
@@ -4500,6 +4500,14 @@ enqueue_task_fair(struct rq *rq, struct
 	struct cfs_rq *cfs_rq;
 	struct sched_entity *se = &p->se;
 
+	/*
+	 * If in_iowait is set, the code below may not trigger any cpufreq
+	 * utilization updates, so do it here explicitly with the IOWAIT flag
+	 * passed.
+	 */
+	if (p->in_iowait)
+		cpufreq_update_this_cpu(rq, SCHED_CPUFREQ_IOWAIT);
+
 	for_each_sched_entity(se) {
 		if (se->on_rq)
 			break;
Index: linux-pm/include/linux/sched.h
===================================================================
--- linux-pm.orig/include/linux/sched.h
+++ linux-pm/include/linux/sched.h
@@ -3471,6 +3471,7 @@ static inline unsigned long rlimit_max(u
 
 #define SCHED_CPUFREQ_RT	(1U << 0)
 #define SCHED_CPUFREQ_DL	(1U << 1)
+#define SCHED_CPUFREQ_IOWAIT	(1U << 2)
 
 #define SCHED_CPUFREQ_RT_DL	(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)
 


* [RFC/RFT][PATCH 2/4]  cpufreq: intel_pstate: Change P-state selection algorithm for Core
  2016-09-03  0:56 [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Rafael J. Wysocki
  2016-09-03  0:58 ` [RFC/RFT][PATCH 1/4] cpufreq / sched: SCHED_CPUFREQ_IOWAIT flag to indicate iowait condition Rafael J. Wysocki
@ 2016-09-03  1:02 ` Rafael J. Wysocki
  2016-09-03  1:03 ` [RFC/RFT][PATCH 3/4] cpufreq: intel_pstate: Use average P-state in get_target_pstate_default() Rafael J. Wysocki
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2016-09-03  1:02 UTC (permalink / raw)
  To: Linux PM list, Doug Smythies
  Cc: Linux Kernel Mailing List, Srinivas Pandruvada, Peter Zijlstra,
	Viresh Kumar, Ingo Molnar, Vincent Guittot, Morten Rasmussen,
	Juri Lelli, Dietmar Eggemann, Steve Muckle

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The PID-based P-state selection algorithm used by intel_pstate for
Core processors is based on very weak foundations.  Namely, its
decisions are mostly based on the values of the APERF and MPERF
feedback registers and it only estimates the actual utilization to
check if it is not extremely low (in order to avoid getting stuck
in the highest P-state in that case).

Since it generally causes the CPU P-state to ramp up quickly, it
leads to satisfactory performance, but the metric used by it is only
really valid when the CPU changes P-states by itself (i.e. in the turbo
range) and if the P-state value set by the driver is treated by the
CPU as the upper limit on turbo P-states selected by it.

As a result, the only case when P-states are reduced by that
algorithm is when the CPU has just come out of idle, but in that
particular case it would have been better to bump up the P-state
instead.  That causes some benchmarks to behave erratically and
attempts to improve the situation lead to excessive energy
consumption, because they make the CPU stay in very high P-states
almost all the time.

Consequently, the only viable way to fix that is to replace the
erroneous algorithm entirely with a new one.

To that end, notice that setting the P-state proportional to the
actual CPU utilization (measured with the help of MPERF and TSC)
generally leads to reasonable behavior, but it does not reflect
the "performance boosting" nature of the current P-state
selection algorithm.  It may be made more similar to that
algorithm, though, by adding iowait boosting to it.

Specifically, if the P-state is bumped up to the maximum after
receiving the SCHED_CPUFREQ_IOWAIT flag via cpufreq_update_util(),
it will allow tasks that were previously waiting on I/O to get the
full capacity of the CPU when they are ready to process data again
and that should lead to the desired performance increase overall
without sacrificing too much energy.

However, the utilization-based method of target P-state selection
may cause the resultant target P-state to oscillate which generally
leads to excessive consumption of energy, so apply an Infinite
Impulse Response filter on top of it to dampen those oscillations
and make it more energy-efficient (thanks to Doug Smythies for this
idea).
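For illustration, one step of such a filter with a time-scaled gain can be
sketched in user-space C as below.  This is a minimal sketch, not the driver
code: the fixed-point helpers are local re-definitions modelled on the ones
in intel_pstate (8 fractional bits is an assumption of the example), and
iir_step() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Local fixed-point helpers (assumed: 8 fractional bits). */
#define FRAC_BITS	8
#define int_tofp(x)	((int64_t)(x) << FRAC_BITS)
#define fp_toint(x)	((x) >> FRAC_BITS)

static int64_t mul_fp(int64_t x, int64_t y)
{
	return (x * y) >> FRAC_BITS;
}

static int64_t div_fp(int64_t x, int64_t y)
{
	assert(y != 0);
	return (x << FRAC_BITS) / y;
}

/*
 * One filter step:
 *
 *	new_output = old_output * (1 - gain) + input * gain
 *
 * with gain = 1/8 * delta_ns / interval_ns.  old_out and input are
 * fixed-point values.  If the gain reaches 1 (a stale old output),
 * the input is passed through unfiltered.
 */
static int64_t iir_step(int64_t old_out, int64_t input,
			int64_t delta_ns, int64_t interval_ns)
{
	int64_t gain = div_fp(delta_ns, interval_ns << 3);

	if (gain >= int_tofp(1))
		return input;

	return mul_fp(old_out, int_tofp(1) - gain) + mul_fp(input, gain);
}
```

With delta_ns equal to the sampling interval the effective gain is 1/8, so an
old output of 16 and an input of 24 give 17; once delta_ns reaches eight
sampling intervals the gain saturates and the input passes through as is.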

Use the approach described above in intel_pstate for Core processors.

Original-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Suggested-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

This includes an IIR filter on top of the load-based P-state selection,
but the filter is applied to the non-boosted case only (otherwise it
defeats the point of the boost) and I used a slightly different raw gain
value.

Thanks,
Rafael

---
 drivers/cpufreq/intel_pstate.c |   81 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 79 insertions(+), 2 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -98,6 +98,7 @@ static inline u64 div_ext_fp(u64 x, u64
  * @tsc:		Difference of time stamp counter between last and
  *			current sample
  * @time:		Current time from scheduler
+ * @target:		Target P-state
  *
  * This structure is used in the cpudata structure to store performance sample
  * data for choosing next P State.
@@ -109,6 +110,7 @@ struct sample {
 	u64 mperf;
 	u64 tsc;
 	u64 time;
+	int target;
 };
 
 /**
@@ -181,6 +183,8 @@ struct _pid {
  * @cpu:		CPU number for this instance data
  * @update_util:	CPUFreq utility callback information
  * @update_util_set:	CPUFreq utility callback is set
+ * @iowait_boost:	iowait-related boost fraction
+ * @last_update:	Time of the last update.
  * @pstate:		Stores P state limits for this CPU
  * @vid:		Stores VID limits for this CPU
  * @pid:		Stores PID parameters for this CPU
@@ -206,6 +210,7 @@ struct cpudata {
 	struct vid_data vid;
 	struct _pid pid;
 
+	u64	last_update;
 	u64	last_sample_time;
 	u64	prev_aperf;
 	u64	prev_mperf;
@@ -216,6 +221,7 @@ struct cpudata {
 	struct acpi_processor_performance acpi_perf_data;
 	bool valid_pss_table;
 #endif
+	unsigned int iowait_boost;
 };
 
 static struct cpudata **all_cpu_data;
@@ -229,6 +235,7 @@ static struct cpudata **all_cpu_data;
  * @p_gain_pct:		PID proportional gain
  * @i_gain_pct:		PID integral gain
  * @d_gain_pct:		PID derivative gain
+ * @boost_iowait:	Whether or not to use iowait boosting.
  *
  * Stores per CPU model static PID configuration data.
  */
@@ -240,6 +247,7 @@ struct pstate_adjust_policy {
 	int p_gain_pct;
 	int d_gain_pct;
 	int i_gain_pct;
+	bool boost_iowait;
 };
 
 /**
@@ -277,6 +285,7 @@ struct cpu_defaults {
 	struct pstate_funcs funcs;
 };
 
+static inline int32_t get_target_pstate_default(struct cpudata *cpu);
 static inline int32_t get_target_pstate_use_performance(struct cpudata *cpu);
 static inline int32_t get_target_pstate_use_cpu_load(struct cpudata *cpu);
 
@@ -1017,6 +1026,7 @@ static struct cpu_defaults core_params =
 		.p_gain_pct = 20,
 		.d_gain_pct = 0,
 		.i_gain_pct = 0,
+		.boost_iowait = true,
 	},
 	.funcs = {
 		.get_max = core_get_max_pstate,
@@ -1025,7 +1035,7 @@ static struct cpu_defaults core_params =
 		.get_turbo = core_get_turbo_pstate,
 		.get_scaling = core_get_scaling,
 		.get_val = core_get_val,
-		.get_target_pstate = get_target_pstate_use_performance,
+		.get_target_pstate = get_target_pstate_default,
 	},
 };
 
@@ -1139,6 +1149,7 @@ static void intel_pstate_set_min_pstate(
 
 	trace_cpu_frequency(pstate * cpu->pstate.scaling, cpu->cpu);
 	cpu->pstate.current_pstate = pstate;
+	cpu->sample.target = pstate;
 	/*
 	 * Generally, there is no guarantee that this code will always run on
 	 * the CPU being updated, so force the register update to run on the
@@ -1290,6 +1301,59 @@ static inline int32_t get_target_pstate_
 	return cpu->pstate.current_pstate - pid_calc(&cpu->pid, perf_scaled);
 }
 
+static inline int32_t get_target_pstate_default(struct cpudata *cpu)
+{
+	struct sample *sample = &cpu->sample;
+	int32_t busy_frac, boost;
+	int pstate, max_perf, min_perf;
+	int64_t target;
+
+	pstate = limits->no_turbo || limits->turbo_disabled ?
+			cpu->pstate.max_pstate : cpu->pstate.turbo_pstate;
+	pstate += pstate >> 2;
+
+	busy_frac = div_fp(sample->mperf, sample->tsc);
+	sample->busy_scaled = busy_frac * 100;
+
+	boost = cpu->iowait_boost;
+	cpu->iowait_boost >>= 1;
+
+	if (busy_frac < boost) {
+		target = pstate * boost;
+	} else {
+		int32_t iir_gain;
+
+		target = pstate * busy_frac;
+		/*
+		 * Use an Infinite Impulse Response (IIR) filter:
+		 *
+		 *	new_output = old_output * (1 - gain) + input * gain
+		 *
+		 * where pstate * busy_frac is the input.
+		 *
+		 * The purpose of this is to dampen output oscillations that are
+		 * otherwise possible and lead to increased energy consumption.
+		 *
+		 * Compute the filter gain as a function of the time since the
+		 * last pass (delta_t) so as to reduce, or even eliminate, the
+		 * influence of what might be a very stale old_output value.
+		 *
+		 * Take the raw gain as 1/8 and compute the effective gain as
+		 *
+		 *	iir_gain = 1/8 * delta_t / sampling_interval
+		 */
+		iir_gain = div_fp(sample->time - cpu->last_sample_time,
+				  pid_params.sample_rate_ns << 3);
+		if (iir_gain < int_tofp(1))
+			target = sample->target * (int_tofp(1) - iir_gain) +
+					mul_fp(target, iir_gain);
+	}
+	intel_pstate_get_min_max(cpu, &min_perf, &max_perf);
+	target = clamp_val(target, int_tofp(min_perf), int_tofp(max_perf));
+	sample->target = fp_toint(target + (1 << (FRAC_BITS-1)));
+	return sample->target;
+}
+
 static inline void intel_pstate_update_pstate(struct cpudata *cpu, int pstate)
 {
 	int max_perf, min_perf;
@@ -1332,8 +1396,21 @@ static void intel_pstate_update_util(str
 				     unsigned int flags)
 {
 	struct cpudata *cpu = container_of(data, struct cpudata, update_util);
-	u64 delta_ns = time - cpu->sample.time;
+	u64 delta_ns;
+
+	if (pid_params.boost_iowait) {
+		if (flags & SCHED_CPUFREQ_IOWAIT) {
+			cpu->iowait_boost = int_tofp(1);
+		} else if (cpu->iowait_boost) {
+			/* Clear iowait_boost if the CPU may have been idle. */
+			delta_ns = time - cpu->last_update;
+			if (delta_ns > TICK_NSEC)
+				cpu->iowait_boost = 0;
+		}
+		cpu->last_update = time;
+	}
 
+	delta_ns = time - cpu->sample.time;
 	if ((s64)delta_ns >= pid_params.sample_rate_ns) {
 		bool sample_taken = intel_pstate_sample(cpu, time);
 


* [RFC/RFT][PATCH 3/4] cpufreq: intel_pstate: Use average P-state in get_target_pstate_default()
  2016-09-03  0:56 [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Rafael J. Wysocki
  2016-09-03  0:58 ` [RFC/RFT][PATCH 1/4] cpufreq / sched: SCHED_CPUFREQ_IOWAIT flag to indicate iowait condition Rafael J. Wysocki
  2016-09-03  1:02 ` [RFC/RFT][PATCH 2/4] cpufreq: intel_pstate: Change P-state selection algorithm for Core Rafael J. Wysocki
@ 2016-09-03  1:03 ` Rafael J. Wysocki
  2016-09-03  1:04 ` [RFC/RFT][PATCH 4/4] cpufreq: schedutil: Add iowait boosting Rafael J. Wysocki
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2016-09-03  1:03 UTC (permalink / raw)
  To: Linux PM list, Doug Smythies
  Cc: Linux Kernel Mailing List, Srinivas Pandruvada, Peter Zijlstra,
	Viresh Kumar, Ingo Molnar, Vincent Guittot, Morten Rasmussen,
	Juri Lelli, Dietmar Eggemann, Steve Muckle

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Adjust the next P-state formula in get_target_pstate_default()
(in the filtered case) to use the average P-state as given by
the APERF and MPERF feedback registers instead of the exact
P-state requested previously, as that request might not be
granted.
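As a rough illustration of the quantity involved: the average P-state over a
sampling period can be estimated from the APERF and MPERF deltas by scaling a
reference P-state with their ratio.  The sketch below assumes the reference
is the P-state at which APERF and MPERF tick at the same rate; the function
name avg_pstate_fp() and the 8-bit fixed-point format are assumptions of the
example, not the driver's get_avg_pstate().

```c
#include <assert.h>
#include <stdint.h>

/* Local fixed-point helpers (assumed: 8 fractional bits). */
#define FRAC_BITS	8
#define int_tofp(x)	((int64_t)(x) << FRAC_BITS)

/*
 * Hypothetical helper: estimate the average P-state (fixed point) over
 * a sampling period as
 *
 *	ref_pstate * delta_aperf / delta_mperf
 *
 * where ref_pstate is assumed to be the P-state at which APERF and
 * MPERF advance at the same rate.
 */
static int64_t avg_pstate_fp(uint64_t delta_aperf, uint64_t delta_mperf,
			     unsigned int ref_pstate)
{
	assert(delta_mperf != 0);

	return (int64_t)((((uint64_t)ref_pstate * delta_aperf) << FRAC_BITS) /
			 delta_mperf);
}
```

An APERF delta 1.5 times the MPERF delta with a reference P-state of 20 gives
an average of 30, whatever P-state was requested for that period.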

Suggested-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/intel_pstate.c |   18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -98,7 +98,6 @@ static inline u64 div_ext_fp(u64 x, u64
  * @tsc:		Difference of time stamp counter between last and
  *			current sample
  * @time:		Current time from scheduler
- * @target:		Target P-state
  *
  * This structure is used in the cpudata structure to store performance sample
  * data for choosing next P State.
@@ -110,7 +109,6 @@ struct sample {
 	u64 mperf;
 	u64 tsc;
 	u64 time;
-	int target;
 };
 
 /**
@@ -1149,7 +1147,6 @@ static void intel_pstate_set_min_pstate(
 
 	trace_cpu_frequency(pstate * cpu->pstate.scaling, cpu->cpu);
 	cpu->pstate.current_pstate = pstate;
-	cpu->sample.target = pstate;
 	/*
 	 * Generally, there is no guarantee that this code will always run on
 	 * the CPU being updated, so force the register update to run on the
@@ -1305,8 +1302,8 @@ static inline int32_t get_target_pstate_
 {
 	struct sample *sample = &cpu->sample;
 	int32_t busy_frac, boost;
-	int pstate, max_perf, min_perf;
 	int64_t target;
+	int pstate;
 
 	pstate = limits->no_turbo || limits->turbo_disabled ?
 			cpu->pstate.max_pstate : cpu->pstate.turbo_pstate;
@@ -1341,17 +1338,20 @@ static inline int32_t get_target_pstate_
 		 * Take the raw gain as 1/8 and compute the effective gain as
 		 *
 		 *	iir_gain = 1/8 * delta_t / sampling_interval
+		 *
+		 * Moreover, since the output is a request that may or may not
+		 * be granted (depending on what the other CPUs are doing, for
+		 * example), instead of using the output value obtained
+		 * previously in the computation, use the effective average
+		 * P-state since the last pass as given by APERF and MPERF.
 		 */
 		iir_gain = div_fp(sample->time - cpu->last_sample_time,
 				  pid_params.sample_rate_ns << 3);
 		if (iir_gain < int_tofp(1))
-			target = sample->target * (int_tofp(1) - iir_gain) +
+			target = get_avg_pstate(cpu) * (int_tofp(1) - iir_gain) +
 					mul_fp(target, iir_gain);
 	}
-	intel_pstate_get_min_max(cpu, &min_perf, &max_perf);
-	target = clamp_val(target, int_tofp(min_perf), int_tofp(max_perf));
-	sample->target = fp_toint(target + (1 << (FRAC_BITS-1)));
-	return sample->target;
+	return fp_toint(target + (1 << (FRAC_BITS-1)));
 }
 
 static inline void intel_pstate_update_pstate(struct cpudata *cpu, int pstate)


* [RFC/RFT][PATCH 4/4] cpufreq: schedutil: Add iowait boosting
  2016-09-03  0:56 [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2016-09-03  1:03 ` [RFC/RFT][PATCH 3/4] cpufreq: intel_pstate: Use average P-state in get_target_pstate_default() Rafael J. Wysocki
@ 2016-09-03  1:04 ` Rafael J. Wysocki
  2016-09-07 15:26 ` [RFC/RFT][PATCH 2/4] cpufreq: intel_pstate: Change P-state selection algorithm for Core Doug Smythies
  2016-09-08  0:22 ` [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Steve Muckle
  5 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2016-09-03  1:04 UTC (permalink / raw)
  To: Linux PM list
  Cc: Linux Kernel Mailing List, Srinivas Pandruvada, Peter Zijlstra,
	Viresh Kumar, Ingo Molnar, Vincent Guittot, Morten Rasmussen,
	Juri Lelli, Dietmar Eggemann, Steve Muckle, Doug Smythies

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Modify the schedutil cpufreq governor to boost the CPU
frequency if the SCHED_CPUFREQ_IOWAIT flag is passed to
it via cpufreq_update_util().

If that happens, the frequency is set to the maximum during
the first update after receiving the SCHED_CPUFREQ_IOWAIT flag
and then the boost is reduced by half during each following update.
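The decay described above can be sketched in user-space C as follows.  This
is a simplified model, not the governor code: struct boost_state and the two
function names are hypothetical, and the idle timeout that clears the boost
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU boost state for the example. */
struct boost_state {
	unsigned long boost;		/* current boost value */
	unsigned long boost_max;	/* value set on an iowait update */
};

/* Arm the boost when an update carries the iowait condition. */
static void boost_on_update(struct boost_state *s, bool iowait)
{
	assert(s != NULL);
	if (iowait)
		s->boost = s->boost_max;
}

/*
 * If boost / boost_max exceeds util / max, use the boost in place of
 * the measured utilization, then halve the boost for the next update.
 * The comparison is cross-multiplied to avoid a division.
 */
static void boost_apply(struct boost_state *s, unsigned long *util,
			unsigned long *max)
{
	if (!s->boost)
		return;

	if (*util * s->boost_max < *max * s->boost) {
		*util = s->boost;
		*max = s->boost_max;
	}
	s->boost >>= 1;
}
```

Starting from boost_max = 1000 and a measured utilization of 100/1024, the
first update after the iowait wakeup requests 1000/1000 (the maximum), the
next one 500/1000, and so on until the measured utilization wins.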

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 kernel/sched/cpufreq_schedutil.c |   53 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 49 insertions(+), 4 deletions(-)

Index: linux-pm/kernel/sched/cpufreq_schedutil.c
===================================================================
--- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
+++ linux-pm/kernel/sched/cpufreq_schedutil.c
@@ -47,11 +47,13 @@ struct sugov_cpu {
 	struct sugov_policy *sg_policy;
 
 	unsigned int cached_raw_freq;
+	unsigned long iowait_boost;
+	unsigned long iowait_boost_max;
+	u64 last_update;
 
 	/* The fields below are only needed when sharing a policy. */
 	unsigned long util;
 	unsigned long max;
-	u64 last_update;
 	unsigned int flags;
 };
 
@@ -153,6 +155,36 @@ static void sugov_get_util(unsigned long
 	*max = cfs_max;
 }
 
+static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
+				   unsigned int flags)
+{
+	if (flags & SCHED_CPUFREQ_IOWAIT) {
+		sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
+	} else if (sg_cpu->iowait_boost) {
+		s64 delta_ns = time - sg_cpu->last_update;
+
+		/* Clear iowait_boost if the CPU appears to have been idle. */
+		if (delta_ns > TICK_NSEC)
+			sg_cpu->iowait_boost = 0;
+	}
+}
+
+static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
+			       unsigned long *max)
+{
+	unsigned long boost_util = sg_cpu->iowait_boost;
+	unsigned long boost_max = sg_cpu->iowait_boost_max;
+
+	if (!boost_util)
+		return;
+
+	if (*util * boost_max < *max * boost_util) {
+		*util = boost_util;
+		*max = boost_max;
+	}
+	sg_cpu->iowait_boost >>= 1;
+}
+
 static void sugov_update_single(struct update_util_data *hook, u64 time,
 				unsigned int flags)
 {
@@ -162,6 +194,9 @@ static void sugov_update_single(struct u
 	unsigned long util, max;
 	unsigned int next_f;
 
+	sugov_set_iowait_boost(sg_cpu, time, flags);
+	sg_cpu->last_update = time;
+
 	if (!sugov_should_update_freq(sg_policy, time))
 		return;
 
@@ -169,6 +204,7 @@ static void sugov_update_single(struct u
 		next_f = policy->cpuinfo.max_freq;
 	} else {
 		sugov_get_util(&util, &max);
+		sugov_iowait_boost(sg_cpu, &util, &max);
 		next_f = get_next_freq(sg_cpu, util, max);
 	}
 	sugov_update_commit(sg_policy, time, next_f);
@@ -187,6 +223,8 @@ static unsigned int sugov_next_freq_shar
 	if (flags & SCHED_CPUFREQ_RT_DL)
 		return max_f;
 
+	sugov_iowait_boost(sg_cpu, &util, &max);
+
 	for_each_cpu(j, policy->cpus) {
 		struct sugov_cpu *j_sg_cpu;
 		unsigned long j_util, j_max;
@@ -201,12 +239,13 @@ static unsigned int sugov_next_freq_shar
 		 * frequency update and the time elapsed between the last update
 		 * of the CPU utilization and the last frequency update is long
 		 * enough, don't take the CPU into account as it probably is
-		 * idle now.
+		 * idle now (and clear iowait_boost for it).
 		 */
 		delta_ns = last_freq_update_time - j_sg_cpu->last_update;
-		if (delta_ns > TICK_NSEC)
+		if (delta_ns > TICK_NSEC) {
+			j_sg_cpu->iowait_boost = 0;
 			continue;
-
+		}
 		if (j_sg_cpu->flags & SCHED_CPUFREQ_RT_DL)
 			return max_f;
 
@@ -216,6 +255,8 @@ static unsigned int sugov_next_freq_shar
 			util = j_util;
 			max = j_max;
 		}
+
+		sugov_iowait_boost(j_sg_cpu, &util, &max);
 	}
 
 	return get_next_freq(sg_cpu, util, max);
@@ -236,6 +277,8 @@ static void sugov_update_shared(struct u
 	sg_cpu->util = util;
 	sg_cpu->max = max;
 	sg_cpu->flags = flags;
+
+	sugov_set_iowait_boost(sg_cpu, time, flags);
 	sg_cpu->last_update = time;
 
 	if (sugov_should_update_freq(sg_policy, time)) {
@@ -468,6 +511,8 @@ static int sugov_start(struct cpufreq_po
 			sg_cpu->flags = SCHED_CPUFREQ_RT;
 			sg_cpu->last_update = 0;
 			sg_cpu->cached_raw_freq = 0;
+			sg_cpu->iowait_boost = 0;
+			sg_cpu->iowait_boost_max = policy->cpuinfo.max_freq;
 			cpufreq_add_update_util_hook(cpu, &sg_cpu->update_util,
 						     sugov_update_shared);
 		} else {


* RE: [RFC/RFT][PATCH 2/4]  cpufreq: intel_pstate: Change P-state selection algorithm for Core
  2016-09-03  0:56 [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2016-09-03  1:04 ` [RFC/RFT][PATCH 4/4] cpufreq: schedutil: Add iowait boosting Rafael J. Wysocki
@ 2016-09-07 15:26 ` Doug Smythies
  2016-09-08  0:22 ` [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Steve Muckle
  5 siblings, 0 replies; 19+ messages in thread
From: Doug Smythies @ 2016-09-07 15:26 UTC (permalink / raw)
  To: 'Rafael J. Wysocki', 'Linux PM list'
  Cc: 'Linux Kernel Mailing List',
	'Srinivas Pandruvada', 'Peter Zijlstra',
	'Viresh Kumar', 'Ingo Molnar',
	'Vincent Guittot', 'Morten Rasmussen',
	'Juri Lelli', 'Dietmar Eggemann',
	'Steve Muckle'

On 2016.09.02 18:02 Rafael J. Wysocki wrote:

...[cut]...

> This includes an IIR filter on top of the load-based P-state selection,
> but the filter is applied to the non-boosted case only (otherwise it
> defeats the point of the boost) and I used a slightly different raw gain
> value.

The different gain value, 12.5% instead of 10%, does come at a cost of some
energy, although we are finding inconsistencies in the test results.
(I estimated about 2.2% energy cost, for my 20% SpecPower simulator test,
and scaling off of a simple graph I did of energy vs gain with the previous
version).

...[cut]...
> +	intel_pstate_get_min_max(cpu, &min_perf, &max_perf);
> +	target = clamp_val(target, int_tofp(min_perf), int_tofp(max_perf));
> +	sample->target = fp_toint(target + (1 << (FRAC_BITS-1)));
> +	return sample->target;
> +}
> +

In my earlier proposed versions, it was very much on purpose that it
was keeping the pseudo floating point filtered target. Excerpt:

+	cpu->sample.target = div_u64((int_tofp(100) - scaled_gain) *
+			cpu->sample.target + scaled_gain *
+			unfiltered_target, int_tofp(100));
+	/*
+	 * Clamp the filtered value.
+	 */
+	intel_pstate_get_min_max(cpu, &min_perf, &max_perf);
+	if (cpu->sample.target < int_tofp(min_perf))
+		cpu->sample.target = int_tofp(min_perf);
+	if (cpu->sample.target > int_tofp(max_perf))
+		cpu->sample.target = int_tofp(max_perf);
+
+	return fp_toint(cpu->sample.target + (1 << (FRAC_BITS-1)));

Why? To prevent a lock up scenario where, depending on the processor
and the gain settings, the target pstate would never kick over to the
next value. i.e. if it only increased 1/3 of a pstate per iteration
as the filter approached its steady state value. While this condition
did occur in my older proposed implementations, with my processor it
doesn't seem to with this implementation. I didn't theoretically check
other processors.
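To make the lock up scenario concrete, here is a small user-space C sketch
comparing the two ways of keeping the filter state.  It is an illustration
under assumptions (8 fractional bits, a fixed gain of 1/8, hypothetical
function names), not code from either version of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local fixed-point helpers (assumed: 8 fractional bits). */
#define FRAC_BITS	8
#define int_tofp(x)	((int64_t)(x) << FRAC_BITS)
#define fp_toint(x)	((x) >> FRAC_BITS)
#define FP_HALF		(1 << (FRAC_BITS - 1))

static int64_t mul_fp(int64_t x, int64_t y)
{
	return (x * y) >> FRAC_BITS;
}

/* Filter state stored as a rounded integer P-state after every step. */
static int filter_rounded_state(int start, int input, int64_t gain, int steps)
{
	int state = start;

	assert(gain > 0 && gain < int_tofp(1));
	for (int i = 0; i < steps; i++) {
		int64_t out = state * (int_tofp(1) - gain) +
			      mul_fp(int_tofp(input), gain);

		state = (int)fp_toint(out + FP_HALF);
	}
	return state;
}

/* Filter state kept in fixed point; rounded only when it is read out. */
static int filter_fp_state(int start, int input, int64_t gain, int steps)
{
	int64_t state = int_tofp(start);

	assert(gain > 0 && gain < int_tofp(1));
	for (int i = 0; i < steps; i++)
		state = mul_fp(state, int_tofp(1) - gain) +
			mul_fp(int_tofp(input), gain);

	return (int)fp_toint(state + FP_HALF);
}
```

With a gain of 1/8, a starting P-state of 10 and a steady input of 12, each
step adds only a quarter of a P-state.  The rounded-state variant rounds that
back down to 10 every time and never moves, while the fixed-point variant
accumulates the fractions and settles at 12.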

Another side effect of this change is effectively a further increase
in the gain setting, and thus more energy being given back. This was
determined by looking at step function load response times, as opposed
to math analysis. (I can make pretty graphs if you want.)

The purpose of this e-mail is just to make us aware of the tradeoffs,
not to imply it should change.

... Doug


* Re: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-03  0:56 [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Rafael J. Wysocki
                   ` (4 preceding siblings ...)
  2016-09-07 15:26 ` [RFC/RFT][PATCH 2/4] cpufreq: intel_pstate: Change P-state selection algorithm for Core Doug Smythies
@ 2016-09-08  0:22 ` Steve Muckle
  2016-09-08  0:35   ` Srinivas Pandruvada
  2016-09-08  0:37   ` Rafael J. Wysocki
  5 siblings, 2 replies; 19+ messages in thread
From: Steve Muckle @ 2016-09-08  0:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Linux Kernel Mailing List, Srinivas Pandruvada,
	Peter Zijlstra, Viresh Kumar, Ingo Molnar, Vincent Guittot,
	Morten Rasmussen, Juri Lelli, Dietmar Eggemann, Doug Smythies

On Sat, Sep 03, 2016 at 02:56:48AM +0200, Rafael J. Wysocki wrote:
> Please let me know what you think and if you can run some benchmarks you
> care about and see if the changes make any difference (this way or another),
> please do that and let me know what you've found.

LGTM (I just reviewed the first and last patch, skipping the
intel_pstate ones).

I was unable to see a conclusive power regression in Android audio, video or
idle usecases on my hikey 96board.

thanks,
Steve


* Re: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-08  0:22 ` [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Steve Muckle
@ 2016-09-08  0:35   ` Srinivas Pandruvada
  2016-09-08  0:44     ` Rafael J. Wysocki
  2016-09-08 19:26     ` Steve Muckle
  2016-09-08  0:37   ` Rafael J. Wysocki
  1 sibling, 2 replies; 19+ messages in thread
From: Srinivas Pandruvada @ 2016-09-08  0:35 UTC (permalink / raw)
  To: Steve Muckle, Rafael J. Wysocki
  Cc: Linux PM list, Linux Kernel Mailing List, Peter Zijlstra,
	Viresh Kumar, Ingo Molnar, Vincent Guittot, Morten Rasmussen,
	Juri Lelli, Dietmar Eggemann, Doug Smythies

On Wed, 2016-09-07 at 17:22 -0700, Steve Muckle wrote:
> On Sat, Sep 03, 2016 at 02:56:48AM +0200, Rafael J. Wysocki wrote:
> > 
> > Please let me know what you think and if you can run some
> > benchmarks you
> > care about and see if the changes make any difference (this way or
> > another),
> > please do that and let me know what you've found.
> 
> LGTM (I just reviewed the first and last patch, skipping the
> intel_pstate ones).
> 
> I was unable to see a conclusive power regression in Android audio,
> video or
> idle usecases on my hikey 96board.
Did you see any performance regression on Android workloads?

Thanks,
Srinivas
> 
> thanks,
> Steve


* Re: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-08  0:22 ` [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Steve Muckle
  2016-09-08  0:35   ` Srinivas Pandruvada
@ 2016-09-08  0:37   ` Rafael J. Wysocki
  1 sibling, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2016-09-08  0:37 UTC (permalink / raw)
  To: Steve Muckle
  Cc: Linux PM list, Linux Kernel Mailing List, Srinivas Pandruvada,
	Peter Zijlstra, Viresh Kumar, Ingo Molnar, Vincent Guittot,
	Morten Rasmussen, Juri Lelli, Dietmar Eggemann, Doug Smythies

On Wednesday, September 07, 2016 05:22:26 PM Steve Muckle wrote:
> On Sat, Sep 03, 2016 at 02:56:48AM +0200, Rafael J. Wysocki wrote:
> > Please let me know what you think and if you can run some benchmarks you
> > care about and see if the changes make any difference (this way or another),
> > please do that and let me know what you've found.
> 
> LGTM (I just reviewed the first and last patch, skipping the
> intel_pstate ones).
> 
> I was unable to see a conclusive power regression in Android audio, video or
> idle usecases on my hikey 96board.

Cool, thanks!


* Re: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-08  0:35   ` Srinivas Pandruvada
@ 2016-09-08  0:44     ` Rafael J. Wysocki
  2016-09-08  0:49       ` Srinivas Pandruvada
  2016-09-08 19:26     ` Steve Muckle
  1 sibling, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2016-09-08  0:44 UTC (permalink / raw)
  To: Srinivas Pandruvada
  Cc: Steve Muckle, Linux PM list, Linux Kernel Mailing List,
	Peter Zijlstra, Viresh Kumar, Ingo Molnar, Vincent Guittot,
	Morten Rasmussen, Juri Lelli, Dietmar Eggemann, Doug Smythies

On Wednesday, September 07, 2016 05:35:50 PM Srinivas Pandruvada wrote:
> On Wed, 2016-09-07 at 17:22 -0700, Steve Muckle wrote:
> > On Sat, Sep 03, 2016 at 02:56:48AM +0200, Rafael J. Wysocki wrote:
> > > 
> > > Please let me know what you think and if you can run some
> > > benchmarks you
> > > care about and see if the changes make any difference (this way or
> > > another),
> > > please do that and let me know what you've found.
> > 
> > LGTM (I just reviewed the first and last patch, skipping the
> > intel_pstate ones).
> > 
> > I was unable to see a conclusive power regression in Android audio,
> > video or
> > idle usecases on my hikey 96board.
> Did you see any performance regression on Android workloads?

That's with schedutil and IOwait boost.  Why would performance regress?

Thanks,
Rafael


* Re: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-08  0:44     ` Rafael J. Wysocki
@ 2016-09-08  0:49       ` Srinivas Pandruvada
  2016-09-08  1:15         ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Srinivas Pandruvada @ 2016-09-08  0:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Steve Muckle, Linux PM list, Linux Kernel Mailing List,
	Peter Zijlstra, Viresh Kumar, Ingo Molnar, Vincent Guittot,
	Morten Rasmussen, Juri Lelli, Dietmar Eggemann, Doug Smythies

On Thu, 2016-09-08 at 02:44 +0200, Rafael J. Wysocki wrote:
> On Wednesday, September 07, 2016 05:35:50 PM Srinivas Pandruvada
> wrote:
> > 
> > On Wed, 2016-09-07 at 17:22 -0700, Steve Muckle wrote:
> > > 
> > > On Sat, Sep 03, 2016 at 02:56:48AM +0200, Rafael J. Wysocki
> > > wrote:
> > > > 
> > > > 
> > > > Please let me know what you think and if you can run some
> > > > benchmarks you
> > > > care about and see if the changes make any difference (this way
> > > > or
> > > > another),
> > > > please do that and let me know what you've found.
> > > 
> > > LGTM (I just reviewed the first and last patch, skipping the
> > > intel_pstate ones).
> > > 
> > > I was unable to see a conclusive power regression in Android
> > > audio,
> > > video or
> > > idle usecases on my hikey 96board.
> > Did you see any performance regression on Android workloads?
> 
> That's with schedutil and IOwait boost.  Why would performance
> regress?
Some Android tests reach thermal limits and aggressive throttling
causes performance issues. 

Thanks,
Srinivas


> 
> Thanks,
> Rafael
> 

* Re: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-08  0:49       ` Srinivas Pandruvada
@ 2016-09-08  1:15         ` Rafael J. Wysocki
  2016-09-08 15:02           ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2016-09-08  1:15 UTC (permalink / raw)
  To: Srinivas Pandruvada
  Cc: Steve Muckle, Linux PM list, Linux Kernel Mailing List,
	Peter Zijlstra, Viresh Kumar, Ingo Molnar, Vincent Guittot,
	Morten Rasmussen, Juri Lelli, Dietmar Eggemann, Doug Smythies

On Wednesday, September 07, 2016 05:49:31 PM Srinivas Pandruvada wrote:
> On Thu, 2016-09-08 at 02:44 +0200, Rafael J. Wysocki wrote:
> > On Wednesday, September 07, 2016 05:35:50 PM Srinivas Pandruvada
> > wrote:
> > > 
> > > On Wed, 2016-09-07 at 17:22 -0700, Steve Muckle wrote:
> > > > 
> > > > On Sat, Sep 03, 2016 at 02:56:48AM +0200, Rafael J. Wysocki
> > > > wrote:
> > > > > 
> > > > > 
> > > > > Please let me know what you think and if you can run some
> > > > > benchmarks you
> > > > > care about and see if the changes make any difference (this way
> > > > > or
> > > > > another),
> > > > > please do that and let me know what you've found.
> > > > 
> > > > LGTM (I just reviewed the first and last patch, skipping the
> > > > intel_pstate ones).
> > > > 
> > > > I was unable to see a conclusive power regression in Android
> > > > audio,
> > > > video or
> > > > idle usecases on my hikey 96board.
> > > Did you see any performance regression on Android workloads?
> > 
> > That's with schedutil and IOwait boost.  Why would performance
> > regress?
> Some Android tests reach thermal limits and aggressive throttling
> causes performance issues. 

I see, OK.

Thanks,
Rafael

* Re: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-08  1:15         ` Rafael J. Wysocki
@ 2016-09-08 15:02           ` Rafael J. Wysocki
  2016-09-08 17:30             ` Srinivas Pandruvada
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2016-09-08 15:02 UTC (permalink / raw)
  To: Srinivas Pandruvada
  Cc: Steve Muckle, Linux PM list, Linux Kernel Mailing List,
	Peter Zijlstra, Viresh Kumar, Ingo Molnar, Vincent Guittot,
	Morten Rasmussen, Juri Lelli, Dietmar Eggemann, Doug Smythies

On Thursday, September 08, 2016 03:15:49 AM Rafael J. Wysocki wrote:
> On Wednesday, September 07, 2016 05:49:31 PM Srinivas Pandruvada wrote:
> > On Thu, 2016-09-08 at 02:44 +0200, Rafael J. Wysocki wrote:
> > > On Wednesday, September 07, 2016 05:35:50 PM Srinivas Pandruvada
> > > wrote:
> > > > 
> > > > On Wed, 2016-09-07 at 17:22 -0700, Steve Muckle wrote:
> > > > > 
> > > > > On Sat, Sep 03, 2016 at 02:56:48AM +0200, Rafael J. Wysocki
> > > > > wrote:
> > > > > > 
> > > > > > 
> > > > > > Please let me know what you think and if you can run some
> > > > > > benchmarks you
> > > > > > care about and see if the changes make any difference (this way
> > > > > > or
> > > > > > another),
> > > > > > please do that and let me know what you've found.
> > > > > 
> > > > > LGTM (I just reviewed the first and last patch, skipping the
> > > > > intel_pstate ones).
> > > > > 
> > > > > I was unable to see a conclusive power regression in Android
> > > > > audio,
> > > > > video or
> > > > > idle usecases on my hikey 96board.
> > > > Did you see any performance regression on Android workloads?
> > > 
> > > That's with schedutil and IOwait boost.  Why would performance
> > > regress?
> > Some Android tests reach thermal limits and aggressive throttling
> > causes performance issues. 
> 
> I see, OK.

But in that case Steve would see a power regression as well IMO.  It would
be rather difficult to reach thermal limits without consuming more energy,
wouldn't it? :-)

Thanks,
Rafael

* Re: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-08 15:02           ` Rafael J. Wysocki
@ 2016-09-08 17:30             ` Srinivas Pandruvada
  0 siblings, 0 replies; 19+ messages in thread
From: Srinivas Pandruvada @ 2016-09-08 17:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Steve Muckle, Linux PM list, Linux Kernel Mailing List,
	Peter Zijlstra, Viresh Kumar, Ingo Molnar, Vincent Guittot,
	Morten Rasmussen, Juri Lelli, Dietmar Eggemann, Doug Smythies

On Thu, 2016-09-08 at 17:02 +0200, Rafael J. Wysocki wrote:
> On Thursday, September 08, 2016 03:15:49 AM Rafael J. Wysocki wrote:
> > 
> > On Wednesday, September 07, 2016 05:49:31 PM Srinivas Pandruvada
> > wrote:
> > > 
> > > On Thu, 2016-09-08 at 02:44 +0200, Rafael J. Wysocki wrote:
> > > > 
> > > > On Wednesday, September 07, 2016 05:35:50 PM Srinivas
> > > > Pandruvada
> > > > wrote:
> > > > > 
> > > > > 
> > > > > On Wed, 2016-09-07 at 17:22 -0700, Steve Muckle wrote:
> > > > > > 
> > > > > > 
> > > > > > On Sat, Sep 03, 2016 at 02:56:48AM +0200, Rafael J. Wysocki
> > > > > > wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > Please let me know what you think and if you can run some
> > > > > > > benchmarks you
> > > > > > > care about and see if the changes make any difference
> > > > > > > (this way
> > > > > > > or
> > > > > > > another),
> > > > > > > please do that and let me know what you've found.
> > > > > > 
> > > > > > LGTM (I just reviewed the first and last patch, skipping
> > > > > > the
> > > > > > intel_pstate ones).
> > > > > > 
> > > > > > I was unable to see a conclusive power regression in
> > > > > > Android
> > > > > > audio,
> > > > > > video or
> > > > > > idle usecases on my hikey 96board.
> > > > > Did you see any performance regression on Android workloads?
> > > > 
> > > > That's with schedutil and IOwait boost.  Why would performance
> > > > regress?
> > > Some Android tests reach thermal limits and aggressive throttling
> > > causes performance issues. 
> > 
> > I see, OK.
> 
> But in that case Steve would see a power regression as well IMO.
>   It would
> be rather difficult to reach thermal limits without consuming more
> energy,
> wouldn't it? :-)
Yes. It depends on the workload. Idle and AV tests, which tend to use HW
encoding/decoding, don't stress the CPU enough in my experience. Maybe
something like a CPU Mark or Disk Mark score would.

Anyway, this shouldn't be a reason for not including a change.

Thanks,
Srinivas

> 
> Thanks,
> Rafael
> 

* Re: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-08  0:35   ` Srinivas Pandruvada
  2016-09-08  0:44     ` Rafael J. Wysocki
@ 2016-09-08 19:26     ` Steve Muckle
  2016-09-08 19:49       ` Srinivas Pandruvada
  1 sibling, 1 reply; 19+ messages in thread
From: Steve Muckle @ 2016-09-08 19:26 UTC (permalink / raw)
  To: Srinivas Pandruvada
  Cc: Steve Muckle, Rafael J. Wysocki, Linux PM list,
	Linux Kernel Mailing List, Peter Zijlstra, Viresh Kumar,
	Ingo Molnar, Vincent Guittot, Morten Rasmussen, Juri Lelli,
	Dietmar Eggemann, Doug Smythies

On Wed, Sep 07, 2016 at 05:35:50PM -0700, Srinivas Pandruvada wrote:
> Did you see any performance regression on Android workloads?

I did a few AnTuTu runs and did not observe a regression.

thanks,
Steve

* Re: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-08 19:26     ` Steve Muckle
@ 2016-09-08 19:49       ` Srinivas Pandruvada
  0 siblings, 0 replies; 19+ messages in thread
From: Srinivas Pandruvada @ 2016-09-08 19:49 UTC (permalink / raw)
  To: Steve Muckle
  Cc: Rafael J. Wysocki, Linux PM list, Linux Kernel Mailing List,
	Peter Zijlstra, Viresh Kumar, Ingo Molnar, Vincent Guittot,
	Morten Rasmussen, Juri Lelli, Dietmar Eggemann, Doug Smythies

On Thu, 2016-09-08 at 12:26 -0700, Steve Muckle wrote:
> On Wed, Sep 07, 2016 at 05:35:50PM -0700, Srinivas Pandruvada wrote:
> > 
> > Did you see any performance regression on Android workloads?
> 
> I did a few AnTuTu runs and did not observe a regression.
Thanks.

-Srinivas


> thanks,
> Steve

* RE: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-04 15:54 Doug Smythies
  2016-09-04 23:54 ` Rafael J. Wysocki
@ 2016-09-07 15:25 ` Doug Smythies
  1 sibling, 0 replies; 19+ messages in thread
From: Doug Smythies @ 2016-09-07 15:25 UTC (permalink / raw)
  To: 'Rafael J. Wysocki'
  Cc: 'Linux PM list', 'Linux Kernel Mailing List',
	'Srinivas Pandruvada', 'Peter Zijlstra',
	'Viresh Kumar', 'Ingo Molnar',
	'Vincent Guittot', 'Morten Rasmussen',
	'Juri Lelli', 'Dietmar Eggemann',
	'Steve Muckle', 'Doug Smythies'

On 2016.09.04 16:55 Rafael J. Wysocki wrote:
> On Sunday, September 04, 2016 08:54:49 AM Doug Smythies wrote:
>> On 2016.09.02 17:57 Rafael J. Wysocki wrote:
>> 
>>> This is a new version of the "iowait boost" series I posted a few weeks
>>> ago.  Since the first two patches from that series have been reworked and
>>> are in linux-next now, I've rebased this series on top of my linux-next
>>> branch.
>>>
>>> In addition to that I took the Doug's feedback into account in the
>>> intel_pstate patches [2-3/4].
>> 
>> You got ahead of me a little.
>> Recall the suggestion for the addition of some filtering was based
>> on energy savings. And further that it might make sense to use
>> average pstate as input to the filter (your new patch 3 of 4).
>> In my testing (of the old patch set) I have been finding that some
>> of those energy savings are being given back by the average pstate
>> method, putting its value added into question.
>> 
>> Switching to the new patch set, I made two kernels (based on 4.8-rc4
>> + your pre-requisite 2 patches):
>> rfc4: has all 4 patches.
>> rfc2: has patches 1, 2, 4. (does not have the average pstate change)
>> 
>> Using my SpecPower simulator test at 20% load I get:
>> 
>> Unpatched (reference): ~5905 Joules
   (19.68 watts)
>> rfc4: ~ 6232 Joules (+5.5%)
   (20.77 watts)
>> rfc2: ~ 6075 Joules (+2.9%)
   (20.25 watts)
>> Old rfc, no filter (restated): ~7197 Joules (+21.9%)
>> Old rfc + old iir filter V2: ~5967 Joules (+1%)
>> Old rfc + old ave pstate method: ~6275 Joules (+6.3%)
The above numbers are all an average of 4 runs of 300 seconds each.
See further down for why I added normalized watts.
>> 
>> Srinivas was getting considerably different, but still
>> encouraging, numbers on the real SpecPower test beds.
>> 
>> I would like to suggest/ask that those real SpecPower tests be done
>> first so as to decide a preferred way forward. I'll also re-do my
>> simulator tests over a longer time period and at some other loads
>> (currently 20% is hard coded).
>
> The reason I made patch [3/4] separate was to make it easier to test without
> that change.  That is, apply [1-2/4] and see what difference it makes.
>
> I'd like to see the results from that if possible.

O.K., that is what I was doing anyway.
I have some more data from my SpecPower simulator test:

Note: My calibration was out by quite a bit, so what I called 20%
was actually about 36.4%. While I knew it was out, I didn't know it
was that much, but I didn't care as it wasn't really relevant to
the compare type tests I was doing. I'll just use "X" in the table
below, where X ~= 18.2% on a real SpecPower.

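In each block below, the first row is package energy in Joules (from
turbostat), the second row is average power in watts over the 1500 second
test run, and the third row (patched kernels only) is the change in energy
relative to Unpatched in the same run.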

Load:		idle	0.5X	X	2X	3X	4X	5X	100%
Unpatched:	5757	11050	16048	29012	47575	61313	76634	81737
		3.84	7.37	10.70	19.34	31.72	40.88	51.09	54.49

rfc4:		5723	11323	17079	31561	47666	62625	76286	81664
		3.82	7.55	11.39	21.04	31.78	41.75	50.86	54.44
		-0.6%	2.5%	6.4%	8.8%	0.2%	2.1%	-0.5%	-0.1%

rfc2:		5769	11319	17140	30533	45158	61387	75690	81722
		3.85	7.55	11.43	20.36	30.11	40.92	50.46	54.48
		0.2%	2.4%	6.8%	5.2%	-5.1%	0.1%	-1.2%	0.0%

And again, 2nd run:

Load:		idle	0.5X	X	2X	3X	4X	5X	100%
Unpatched:	5708	11037	16075	29147	45913	61165	76650	81695
		3.81	7.36	10.72	19.43	30.61	40.78	51.10	54.46

rfc4:		5770	11303	17023	31508	47653	62520	75798	81725
		3.85	7.54	11.35	21.01	31.77	41.68	50.53	54.48
		1.1%	2.4%	5.9%	8.1%	3.8%	2.2%	-1.1%	0.0%

rfc2:		5793	11242	17044	30258	45178	61526	75631	81669
		3.86	7.49	11.36	20.17	30.12	41.02	50.42	54.45
		1.5%	1.9%	6.0%	3.8%	-1.6%	0.6%	-1.3%	0.0%

Note: Comparing the 2X data with the numbers from the other day
(further above) shows more run-to-run variability than I had
expected. (I have very few services running on my test server,
so background idle is really quite idle.)

... Doug
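
For anyone re-deriving the watts rows: the figures are consistent with
package energy divided by run time. Below is a minimal Python sketch, not
part of the original test setup. The function name average_watts is made up
for illustration, and the sample values are taken from the 300 second
results and from the 2X column of the first 1500 second run, on the
assumption that the load earlier called "20%" corresponds to 2X.

```python
def average_watts(joules: float, seconds: float) -> float:
    """Average power over a run: energy divided by elapsed time."""
    return joules / seconds


# Package Joules from the earlier 300 second runs (the load then called "20%").
earlier = {"unpatched": 5905, "rfc4": 6232, "rfc2": 6075}
# Package Joules from the 2X column of the first 1500 second run.
later = {"unpatched": 29012, "rfc4": 31561, "rfc2": 30533}

for kernel in earlier:
    w_300 = average_watts(earlier[kernel], 300)
    w_1500 = average_watts(later[kernel], 1500)
    print(f"{kernel:10s} {w_300:6.2f} W (300 s)  {w_1500:6.2f} W (1500 s)")
```

Normalizing to watts is what makes runs of different length comparable.
Per kernel, the two sets differ by roughly 0.1 to 0.35 watts, which gives a
feel for the run-to-run variability mentioned in the note.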

* Re: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
  2016-09-04 15:54 Doug Smythies
@ 2016-09-04 23:54 ` Rafael J. Wysocki
  2016-09-07 15:25 ` Doug Smythies
  1 sibling, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2016-09-04 23:54 UTC (permalink / raw)
  To: Doug Smythies
  Cc: 'Linux PM list', 'Linux Kernel Mailing List',
	'Srinivas Pandruvada', 'Peter Zijlstra',
	'Viresh Kumar', 'Ingo Molnar',
	'Vincent Guittot', 'Morten Rasmussen',
	'Juri Lelli', 'Dietmar Eggemann',
	'Steve Muckle', 'Doug Smythies'

On Sunday, September 04, 2016 08:54:49 AM Doug Smythies wrote:
> Hi Rafael,
> 
> On 2016.09.02 17:57 Rafael J. Wysocki wrote:
> 
> > This is a new version of the "iowait boost" series I posted a few weeks
> > ago.  Since the first two patches from that series have been reworked and
> > are in linux-next now, I've rebased this series on top of my linux-next
> > branch.
> >
> > In addition to that I took the Doug's feedback into account in the
> > intel_pstate patches [2-3/4].
> 
> You got ahead of me a little.
> Recall the suggestion for the addition of some filtering was based
> on energy savings. And further that it might make sense to use
> average pstate as input to the filter (your new patch 3 of 4).
> In my testing (of the old patch set) I have been finding that some
> of those energy savings are being given back by the average pstate
> method, putting its value added into question.
> 
> Switching to the new patch set, I made two kernels (based on 4.8-rc4
> + your pre-requisite 2 patches):
> rfc4: has all 4 patches.
> rfc2: has patches 1, 2, 4. (does not have the average pstate change)
> 
> Using my SpecPower simulator test at 20% load I get:
> 
> Unpatched (reference): ~5905 Joules
> rfc4: ~ 6232 Joules (+5.5%)
> rfc2: ~ 6075 Joules (+2.9%)
> Old rfc, no filter (restated): ~7197 Joules (+21.9%)
> Old rfc + old iir filter V2: ~5967 Joules (+1%)
> Old rfc + old ave pstate method: ~6275 Joules (+6.3%)
> 
> Srinivas was getting considerably different, but still
> encouraging, numbers on the real SpecPower test beds.
> 
> I would like to suggest/ask that those real SpecPower tests be done
> first so as to decide a preferred way forward. I'll also re-do my
> simulator tests over a longer time period and at some other loads
> (currently 20% is hard coded).

The reason I made patch [3/4] separate was to make it easier to test without
that change.  That is, apply [1-2/4] and see what difference it makes.

I'd like to see the results from that if possible.

Thanks,
Rafael

* RE: [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil
@ 2016-09-04 15:54 Doug Smythies
  2016-09-04 23:54 ` Rafael J. Wysocki
  2016-09-07 15:25 ` Doug Smythies
  0 siblings, 2 replies; 19+ messages in thread
From: Doug Smythies @ 2016-09-04 15:54 UTC (permalink / raw)
  To: 'Rafael J. Wysocki', 'Linux PM list'
  Cc: 'Linux Kernel Mailing List',
	'Srinivas Pandruvada', 'Peter Zijlstra',
	'Viresh Kumar', 'Ingo Molnar',
	'Vincent Guittot', 'Morten Rasmussen',
	'Juri Lelli', 'Dietmar Eggemann',
	'Steve Muckle', 'Doug Smythies'

Hi Rafael,

On 2016.09.02 17:57 Rafael J. Wysocki wrote:

> This is a new version of the "iowait boost" series I posted a few weeks
> ago.  Since the first two patches from that series have been reworked and
> are in linux-next now, I've rebased this series on top of my linux-next
> branch.
>
> In addition to that I took the Doug's feedback into account in the
> intel_pstate patches [2-3/4].

You got ahead of me a little.
Recall the suggestion for the addition of some filtering was based
on energy savings. And further that it might make sense to use
average pstate as input to the filter (your new patch 3 of 4).
In my testing (of the old patch set) I have been finding that some
of those energy savings are being given back by the average pstate
method, putting its value added into question.

Switching to the new patch set, I made two kernels (based on 4.8-rc4
+ your pre-requisite 2 patches):
rfc4: has all 4 patches.
rfc2: has patches 1, 2, 4. (does not have the average pstate change)

Using my SpecPower simulator test at 20% load I get:

Unpatched (reference): ~5905 Joules
rfc4: ~ 6232 Joules (+5.5%)
rfc2: ~ 6075 Joules (+2.9%)
Old rfc, no filter (restated): ~7197 Joules (+21.9%)
Old rfc + old iir filter V2: ~5967 Joules (+1%)
Old rfc + old ave pstate method: ~6275 Joules (+6.3%)

Srinivas was getting considerably different, but still
encouraging, numbers on the real SpecPower test beds.

I would like to suggest/ask that those real SpecPower tests be done
first so as to decide a preferred way forward. I'll also re-do my
simulator tests over a longer time period and at some other loads
(currently 20% is hard coded).

... Doug
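
The percentages in the list above are consistent with the change in energy
relative to the unpatched reference. Below is a minimal Python sketch of
that calculation. The helper name percent_change is hypothetical, and the
Joule values are the approximate ones quoted in the message.

```python
def percent_change(value: float, baseline: float) -> float:
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


BASELINE_JOULES = 5905  # unpatched (reference) kernel

results = {
    "rfc4": 6232,
    "rfc2": 6075,
    "old rfc, no filter": 7197,
    "old rfc + old iir filter V2": 5967,
    "old rfc + old ave pstate method": 6275,
}

for name, joules in results.items():
    print(f"{name}: {percent_change(joules, BASELINE_JOULES):+.1f}%")
```

A positive value means the patched kernel used more package energy than the
reference over the same test.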

Thread overview: 19+ messages
2016-09-03  0:56 [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Rafael J. Wysocki
2016-09-03  0:58 ` [RFC/RFT][PATCH 1/4] cpufreq / sched: SCHED_CPUFREQ_IOWAIT flag to indicate iowait condition Rafael J. Wysocki
2016-09-03  1:02 ` [RFC/RFT][PATCH 2/4] cpufreq: intel_pstate: Change P-state selection algorithm for Core Rafael J. Wysocki
2016-09-03  1:03 ` [RFC/RFT][PATCH 3/4] cpufreq: intel_pstate: Use average P-state in get_target_pstate_default() Rafael J. Wysocki
2016-09-03  1:04 ` [RFC/RFT][PATCH 4/4] cpufreq: schedutil: Add iowait boosting Rafael J. Wysocki
2016-09-07 15:26 ` [RFC/RFT][PATCH 2/4] cpufreq: intel_pstate: Change P-state selection algorithm for Core Doug Smythies
2016-09-08  0:22 ` [RFC/RFT][PATCH 0/4] cpufreq / sched: iowait boost in intel_pstate and schedutil Steve Muckle
2016-09-08  0:35   ` Srinivas Pandruvada
2016-09-08  0:44     ` Rafael J. Wysocki
2016-09-08  0:49       ` Srinivas Pandruvada
2016-09-08  1:15         ` Rafael J. Wysocki
2016-09-08 15:02           ` Rafael J. Wysocki
2016-09-08 17:30             ` Srinivas Pandruvada
2016-09-08 19:26     ` Steve Muckle
2016-09-08 19:49       ` Srinivas Pandruvada
2016-09-08  0:37   ` Rafael J. Wysocki
2016-09-04 15:54 Doug Smythies
2016-09-04 23:54 ` Rafael J. Wysocki
2016-09-07 15:25 ` Doug Smythies
