* [RFC/RFT][PATCH] cpufreq: intel_pstate: Improve IO performance
@ 2017-07-28 6:44 Srinivas Pandruvada
2017-07-31 12:21 ` Rafael J. Wysocki
0 siblings, 1 reply; 3+ messages in thread
From: Srinivas Pandruvada @ 2017-07-28 6:44 UTC (permalink / raw)
To: rjw, lenb; +Cc: linux-pm, Srinivas Pandruvada
In the current implementation, the latency from SCHED_CPUFREQ_IOWAIT being
set to the actual P-state adjustment can be up to 10 ms. This can be improved
by reacting to SCHED_CPUFREQ_IOWAIT faster, within a millisecond. With this
trivial change the IO performance improves significantly.
With a simple "grep -r . linux" (here "linux" is the kernel source folder),
with caches dropped every time, on a platform with per-core P-states
(Broadwell and Haswell Xeon), the performance difference is significant:
user and kernel time improve by more than 20%.
The same performance difference was not observed on clients or on an
IvyTown server, which don't have per-core P-state support, so the
performance gain may not be apparent on all systems.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
---
The idea of this patch is to test if it brings in any significant
improvement on real world use cases.
drivers/cpufreq/intel_pstate.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 8c67b77..639979c 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -38,6 +38,7 @@
#include <asm/intel-family.h>
#define INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL (10 * NSEC_PER_MSEC)
+#define INTEL_PSTATE_IO_WAIT_SAMPLING_INTERVAL (NSEC_PER_MSEC)
#define INTEL_PSTATE_HWP_SAMPLING_INTERVAL (50 * NSEC_PER_MSEC)
#define INTEL_CPUFREQ_TRANSITION_LATENCY 20000
@@ -287,6 +288,7 @@ static struct pstate_funcs pstate_funcs __read_mostly;
static int hwp_active __read_mostly;
static bool per_cpu_limits __read_mostly;
+static int current_sample_interval = INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL;
static struct cpufreq_driver *intel_pstate_driver __read_mostly;
@@ -1527,15 +1529,18 @@ static void intel_pstate_update_util(struct update_util_data *data, u64 time,
if (flags & SCHED_CPUFREQ_IOWAIT) {
cpu->iowait_boost = int_tofp(1);
+ current_sample_interval = INTEL_PSTATE_IO_WAIT_SAMPLING_INTERVAL;
} else if (cpu->iowait_boost) {
/* Clear iowait_boost if the CPU may have been idle. */
delta_ns = time - cpu->last_update;
- if (delta_ns > TICK_NSEC)
+ if (delta_ns > TICK_NSEC) {
cpu->iowait_boost = 0;
+ current_sample_interval = INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL;
+ }
}
cpu->last_update = time;
delta_ns = time - cpu->sample.time;
- if ((s64)delta_ns < INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL)
+ if ((s64)delta_ns < current_sample_interval)
return;
if (intel_pstate_sample(cpu, time)) {
--
2.7.4
* Re: [RFC/RFT][PATCH] cpufreq: intel_pstate: Improve IO performance
2017-07-28 6:44 [RFC/RFT][PATCH] cpufreq: intel_pstate: Improve IO performance Srinivas Pandruvada
@ 2017-07-31 12:21 ` Rafael J. Wysocki
2017-07-31 16:39 ` Srinivas Pandruvada
0 siblings, 1 reply; 3+ messages in thread
From: Rafael J. Wysocki @ 2017-07-31 12:21 UTC (permalink / raw)
To: Srinivas Pandruvada; +Cc: lenb, linux-pm
On Thursday, July 27, 2017 11:44:52 PM Srinivas Pandruvada wrote:
> In the current implementation, the latency from SCHED_CPUFREQ_IOWAIT being
> set to the actual P-state adjustment can be up to 10 ms. This can be improved
> by reacting to SCHED_CPUFREQ_IOWAIT faster, within a millisecond. With this
> trivial change the IO performance improves significantly.
>
> With a simple "grep -r . linux" (here "linux" is the kernel source folder),
> with caches dropped every time, on a platform with per-core P-states
> (Broadwell and Haswell Xeon), the performance difference is significant:
> user and kernel time improve by more than 20%.
>
> The same performance difference was not observed on clients or on an
> IvyTown server, which don't have per-core P-state support, so the
> performance gain may not be apparent on all systems.
>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> ---
> The idea of this patch is to test if it brings in any significant
> improvement on real world use cases.
>
> drivers/cpufreq/intel_pstate.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 8c67b77..639979c 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -38,6 +38,7 @@
> #include <asm/intel-family.h>
>
> #define INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL (10 * NSEC_PER_MSEC)
> +#define INTEL_PSTATE_IO_WAIT_SAMPLING_INTERVAL (NSEC_PER_MSEC)
> #define INTEL_PSTATE_HWP_SAMPLING_INTERVAL (50 * NSEC_PER_MSEC)
First off, can we simply set INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL to NSEC_PER_MSEC?
I guess it may help quite a bit in the more "interactive" cases overall.
Or would that be too much overhead?
> #define INTEL_CPUFREQ_TRANSITION_LATENCY 20000
> @@ -287,6 +288,7 @@ static struct pstate_funcs pstate_funcs __read_mostly;
>
> static int hwp_active __read_mostly;
> static bool per_cpu_limits __read_mostly;
> +static int current_sample_interval = INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL;
>
> static struct cpufreq_driver *intel_pstate_driver __read_mostly;
>
> @@ -1527,15 +1529,18 @@ static void intel_pstate_update_util(struct update_util_data *data, u64 time,
>
> if (flags & SCHED_CPUFREQ_IOWAIT) {
> cpu->iowait_boost = int_tofp(1);
> + current_sample_interval = INTEL_PSTATE_IO_WAIT_SAMPLING_INTERVAL;
> } else if (cpu->iowait_boost) {
> /* Clear iowait_boost if the CPU may have been idle. */
> delta_ns = time - cpu->last_update;
> - if (delta_ns > TICK_NSEC)
> + if (delta_ns > TICK_NSEC) {
> cpu->iowait_boost = 0;
> + current_sample_interval = INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL;
Second, if reducing INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL is not viable, why
does the sample interval have to be reduced for all CPUs if SCHED_CPUFREQ_IOWAIT
is set for one of them and not just for the CPU receiving that flag?
> + }
> }
> cpu->last_update = time;
> delta_ns = time - cpu->sample.time;
> - if ((s64)delta_ns < INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL)
> + if ((s64)delta_ns < current_sample_interval)
> return;
>
> if (intel_pstate_sample(cpu, time)) {
>
* Re: [RFC/RFT][PATCH] cpufreq: intel_pstate: Improve IO performance
2017-07-31 12:21 ` Rafael J. Wysocki
@ 2017-07-31 16:39 ` Srinivas Pandruvada
0 siblings, 0 replies; 3+ messages in thread
From: Srinivas Pandruvada @ 2017-07-31 16:39 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: lenb, linux-pm
On Mon, 2017-07-31 at 14:21 +0200, Rafael J. Wysocki wrote:
>
[...]
> > #define INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL (10 * NSEC_PER_MSEC)
> > +#define INTEL_PSTATE_IO_WAIT_SAMPLING_INTERVAL (NSEC_PER_MSEC)
> > #define INTEL_PSTATE_HWP_SAMPLING_INTERVAL (50 * NSEC_PER_MSEC)
> First off, can we simply set INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL
> to NSEC_PER_MSEC?
>
> I guess it may help quite a bit in the more "interactive" cases
> overall.
>
> Or would that be too much overhead?
It will be too much overhead for clients.
>
> >
> >
[...]
> > @@ -1527,15 +1529,18 @@ static void intel_pstate_update_util(struct update_util_data *data, u64 time,
> >
> > if (flags & SCHED_CPUFREQ_IOWAIT) {
> > cpu->iowait_boost = int_tofp(1);
> > + current_sample_interval = INTEL_PSTATE_IO_WAIT_SAMPLING_INTERVAL;
> > } else if (cpu->iowait_boost) {
> > /* Clear iowait_boost if the CPU may have been idle. */
> > delta_ns = time - cpu->last_update;
> > - if (delta_ns > TICK_NSEC)
> > + if (delta_ns > TICK_NSEC) {
> > cpu->iowait_boost = 0;
> > + current_sample_interval = INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL;
> Second, if reducing INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL is not viable, why
> does the sample interval have to be reduced for all CPUs if SCHED_CPUFREQ_IOWAIT
> is set for one of them and not just for the CPU receiving that flag?
>
Correct; the data that was read may be passed to some other thread for
processing.
Replacing this patch with the one below improves simple grep user time and
system time by 40% on Haswell servers.
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 48a98f11a84e..dce3e324a9da 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1751,6 +1751,16 @@ static void intel_pstate_update_util(struct update_util_data *data, u64 time,
if (flags & SCHED_CPUFREQ_IOWAIT) {
cpu->iowait_boost = int_tofp(1);
+ /*
+ * The last time the busy was 100% so P-state was max anyway
+ * so avoid overhead of computation.
+ */
+ if (fp_toint(cpu->sample.busy_scaled) == 100) {
+ cpu->last_update = time;
+ return;
+ }
+ goto set_pstate;
+
} else if (cpu->iowait_boost) {
/* Clear iowait_boost if the CPU may have been idle. */
delta_ns = time - cpu->last_update;
@@ -1762,6 +1772,7 @@ static void intel_pstate_update_util(struct update_util_data *data, u64 time,
if ((s64)delta_ns < INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL)
return;
+set_pstate:
if (intel_pstate_sample(cpu, time)) {
int target_pstate;