linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V3 0/9] cpufreq: transition-latency cleanups
@ 2017-07-19 10:12 Viresh Kumar
  2017-07-19 10:12 ` [PATCH V3 1/9] cpufreq: governor: Drop min_sampling_rate Viresh Kumar
                   ` (9 more replies)
  0 siblings, 10 replies; 24+ messages in thread
From: Viresh Kumar @ 2017-07-19 10:12 UTC (permalink / raw)
  To: Rafael Wysocki
  Cc: Viresh Kumar, linux-pm, Vincent Guittot, linux,
	Benjamin Herrenschmidt, Ingo Molnar, Jonathan Corbet, Len Brown,
	linux-doc, linux-kernel, linuxppc-dev, Michael Ellerman,
	Paul Mackerras, Peter Zijlstra, Srinivas Pandruvada,
	Sudeep Holla

Hi Rafael,

This series tries to cleanup the code around transition-latency and its
users. Some of the old legacy code, which may not make much sense now,
is dropped as well. And some code consolidation is also done across
governors.

Based of: v4.13-rc1
Tested on: ARM64 Hikey board.

I have pushed it here as well (which gets tested by kbuild test bot):

git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git cpufreq/transition-latency

V2->V3:
- Rearranged patches to keep related stuff together
- Introduce CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING flag (Rafael)
- Minor optimization in cpufreq_policy_transition_delay_us() and moved
  it to cpufreq.c (Rafael)
- Allow dynamic switching for drivers which don't know their transition
  latency.

V1->V2:
- While we still get rid of the limitation of 10ms for using
  ondemand/conservative, but we preserve the earlier behavior where the
  transition latency set to CPUFREQ_ETERNAL would not allow use of
  ondemand/conservative governors. Thanks to Dominik for his feedback on
  that.

--
viresh

Viresh Kumar (9):
  cpufreq: governor: Drop min_sampling_rate
  cpufreq: Use transition_delay_us for legacy governors as well
  cpufreq: Cap the default transition delay value to 10 ms
  cpufreq: Don't set transition_latency for setpolicy drivers
  cpufreq: arm_big_little: Make ->get_transition_latency() mandatory
  cpufreq: Replace "max_transition_latency" with "dynamic_switching"
  cpufreq: schedutil: Set dynamic_switching to true
  cpufreq: Add CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING cpufreq driver flag
  cpufreq: Allow dynamic switching with CPUFREQ_ETERNAL latency

 Documentation/admin-guide/pm/cpufreq.rst |  8 --------
 drivers/cpufreq/arm_big_little.c         | 10 ++++------
 drivers/cpufreq/cpufreq-nforce2.c        |  2 +-
 drivers/cpufreq/cpufreq.c                | 34 ++++++++++++++++++++++++++++----
 drivers/cpufreq/cpufreq_conservative.c   |  6 ------
 drivers/cpufreq/cpufreq_governor.c       | 17 ++--------------
 drivers/cpufreq/cpufreq_governor.h       |  3 +--
 drivers/cpufreq/cpufreq_ondemand.c       | 12 -----------
 drivers/cpufreq/elanfreq.c               |  4 +---
 drivers/cpufreq/gx-suspmod.c             |  2 +-
 drivers/cpufreq/intel_pstate.c           |  1 -
 drivers/cpufreq/longrun.c                |  1 -
 drivers/cpufreq/pmac32-cpufreq.c         |  7 +++++--
 drivers/cpufreq/sa1100-cpufreq.c         |  5 +++--
 drivers/cpufreq/sa1110-cpufreq.c         |  5 +++--
 drivers/cpufreq/sh-cpufreq.c             |  3 +--
 drivers/cpufreq/speedstep-smi.c          |  2 +-
 drivers/cpufreq/unicore2-cpufreq.c       |  3 +--
 include/linux/cpufreq.h                  | 18 ++++++++---------
 kernel/sched/cpufreq_schedutil.c         | 12 ++---------
 20 files changed, 65 insertions(+), 90 deletions(-)

-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH V3 1/9] cpufreq: governor: Drop min_sampling_rate
  2017-07-19 10:12 [PATCH V3 0/9] cpufreq: transition-latency cleanups Viresh Kumar
@ 2017-07-19 10:12 ` Viresh Kumar
  2017-07-19 10:12 ` [PATCH V3 2/9] cpufreq: Use transition_delay_us for legacy governors as well Viresh Kumar
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Viresh Kumar @ 2017-07-19 10:12 UTC (permalink / raw)
  To: Rafael Wysocki, Jonathan Corbet, Viresh Kumar
  Cc: linux-pm, Vincent Guittot, linux, linux-doc, linux-kernel

The cpufreq core and governors aren't supposed to set a limit on how
fast we want to try changing the frequency. This is currently done for
the legacy governors with help of min_sampling_rate.

At worst, we may end up setting the sampling rate to a value lower than
the rate at which frequency can be changed and then one of the CPUs in
the policy will be only changing frequency for ever.

But that is something for the user to decide and there is no need to
have special handling for such cases in the core. Leave it for the user
to figure out.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 Documentation/admin-guide/pm/cpufreq.rst |  8 --------
 drivers/cpufreq/cpufreq_conservative.c   |  6 ------
 drivers/cpufreq/cpufreq_governor.c       | 10 ++--------
 drivers/cpufreq/cpufreq_governor.h       |  1 -
 drivers/cpufreq/cpufreq_ondemand.c       | 12 ------------
 include/linux/cpufreq.h                  |  2 --
 6 files changed, 2 insertions(+), 37 deletions(-)

diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
index 463cf7e73db8..2eb3bf62393e 100644
--- a/Documentation/admin-guide/pm/cpufreq.rst
+++ b/Documentation/admin-guide/pm/cpufreq.rst
@@ -471,14 +471,6 @@ it is allowed to use (the ``scaling_max_freq`` policy limit).
 
 	# echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) > ondemand/sampling_rate
 
-
-``min_sampling_rate``
-	The minimum value of ``sampling_rate``.
-
-	Equal to 10000 (10 ms) if :c:macro:`CONFIG_NO_HZ_COMMON` and
-	:c:data:`tick_nohz_active` are both set or to 20 times the value of
-	:c:data:`jiffies` in microseconds otherwise.
-
 ``up_threshold``
 	If the estimated CPU load is above this value (in percent), the governor
 	will set the frequency to the maximum value allowed for the policy.
diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 88220ff3e1c2..f20f20a77d4d 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -246,7 +246,6 @@ gov_show_one_common(sampling_rate);
 gov_show_one_common(sampling_down_factor);
 gov_show_one_common(up_threshold);
 gov_show_one_common(ignore_nice_load);
-gov_show_one_common(min_sampling_rate);
 gov_show_one(cs, down_threshold);
 gov_show_one(cs, freq_step);
 
@@ -254,12 +253,10 @@ gov_attr_rw(sampling_rate);
 gov_attr_rw(sampling_down_factor);
 gov_attr_rw(up_threshold);
 gov_attr_rw(ignore_nice_load);
-gov_attr_ro(min_sampling_rate);
 gov_attr_rw(down_threshold);
 gov_attr_rw(freq_step);
 
 static struct attribute *cs_attributes[] = {
-	&min_sampling_rate.attr,
 	&sampling_rate.attr,
 	&sampling_down_factor.attr,
 	&up_threshold.attr,
@@ -297,10 +294,7 @@ static int cs_init(struct dbs_data *dbs_data)
 	dbs_data->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
 	dbs_data->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
 	dbs_data->ignore_nice_load = 0;
-
 	dbs_data->tuners = tuners;
-	dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
-		jiffies_to_usecs(10);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 47e24b5384b3..858081f9c3d7 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -47,14 +47,11 @@ ssize_t store_sampling_rate(struct gov_attr_set *attr_set, const char *buf,
 {
 	struct dbs_data *dbs_data = to_dbs_data(attr_set);
 	struct policy_dbs_info *policy_dbs;
-	unsigned int rate;
 	int ret;
-	ret = sscanf(buf, "%u", &rate);
+	ret = sscanf(buf, "%u", &dbs_data->sampling_rate);
 	if (ret != 1)
 		return -EINVAL;
 
-	dbs_data->sampling_rate = max(rate, dbs_data->min_sampling_rate);
-
 	/*
 	 * We are operating under dbs_data->mutex and so the list and its
 	 * entries can't be freed concurrently.
@@ -437,10 +434,7 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)
 		latency = 1;
 
 	/* Bring kernel and HW constraints together */
-	dbs_data->min_sampling_rate = max(dbs_data->min_sampling_rate,
-					  MIN_LATENCY_MULTIPLIER * latency);
-	dbs_data->sampling_rate = max(dbs_data->min_sampling_rate,
-				      LATENCY_MULTIPLIER * latency);
+	dbs_data->sampling_rate = LATENCY_MULTIPLIER * latency;
 
 	if (!have_governor_per_policy())
 		gov->gdbs_data = dbs_data;
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 0236ec2cd654..95f207eb820e 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -41,7 +41,6 @@ enum {OD_NORMAL_SAMPLE, OD_SUB_SAMPLE};
 struct dbs_data {
 	struct gov_attr_set attr_set;
 	void *tuners;
-	unsigned int min_sampling_rate;
 	unsigned int ignore_nice_load;
 	unsigned int sampling_rate;
 	unsigned int sampling_down_factor;
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 3937acf7e026..6b423eebfd5d 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -319,7 +319,6 @@ gov_show_one_common(sampling_rate);
 gov_show_one_common(up_threshold);
 gov_show_one_common(sampling_down_factor);
 gov_show_one_common(ignore_nice_load);
-gov_show_one_common(min_sampling_rate);
 gov_show_one_common(io_is_busy);
 gov_show_one(od, powersave_bias);
 
@@ -329,10 +328,8 @@ gov_attr_rw(up_threshold);
 gov_attr_rw(sampling_down_factor);
 gov_attr_rw(ignore_nice_load);
 gov_attr_rw(powersave_bias);
-gov_attr_ro(min_sampling_rate);
 
 static struct attribute *od_attributes[] = {
-	&min_sampling_rate.attr,
 	&sampling_rate.attr,
 	&up_threshold.attr,
 	&sampling_down_factor.attr,
@@ -373,17 +370,8 @@ static int od_init(struct dbs_data *dbs_data)
 	if (idle_time != -1ULL) {
 		/* Idle micro accounting is supported. Use finer thresholds */
 		dbs_data->up_threshold = MICRO_FREQUENCY_UP_THRESHOLD;
-		/*
-		 * In nohz/micro accounting case we set the minimum frequency
-		 * not depending on HZ, but fixed (very low).
-		*/
-		dbs_data->min_sampling_rate = MICRO_FREQUENCY_MIN_SAMPLE_RATE;
 	} else {
 		dbs_data->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
-
-		/* For correct statistics, we need 10 ticks for each measure */
-		dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
-			jiffies_to_usecs(10);
 	}
 
 	dbs_data->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index f10a9b3761cd..02aec384cab9 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -491,9 +491,7 @@ static inline unsigned long cpufreq_scale(unsigned long old, u_int div,
  * For CPUs with transition latency > 10ms (mostly drivers with CPUFREQ_ETERNAL)
  * the ondemand governor will not work. All times here are in us (microseconds).
  */
-#define MIN_SAMPLING_RATE_RATIO		(2)
 #define LATENCY_MULTIPLIER		(1000)
-#define MIN_LATENCY_MULTIPLIER		(20)
 #define TRANSITION_LATENCY_LIMIT	(10 * 1000 * 1000)
 
 struct cpufreq_governor {
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 2/9] cpufreq: Use transition_delay_us for legacy governors as well
  2017-07-19 10:12 [PATCH V3 0/9] cpufreq: transition-latency cleanups Viresh Kumar
  2017-07-19 10:12 ` [PATCH V3 1/9] cpufreq: governor: Drop min_sampling_rate Viresh Kumar
@ 2017-07-19 10:12 ` Viresh Kumar
  2017-07-24 16:17   ` Peter Zijlstra
  2017-07-19 10:12 ` [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms Viresh Kumar
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Viresh Kumar @ 2017-07-19 10:12 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Ingo Molnar, Peter Zijlstra
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel

The policy->transition_delay_us field is used only by the schedutil
governor currently, and this field describes how fast the driver wants
the cpufreq governor to change CPUs frequency. It should rather be a
common thing across all governors, as it doesn't have any schedutil
dependency here.

Create a new helper cpufreq_policy_transition_delay_us() to get the
transition delay across all governors.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c          | 15 +++++++++++++++
 drivers/cpufreq/cpufreq_governor.c |  9 +--------
 include/linux/cpufreq.h            |  1 +
 kernel/sched/cpufreq_schedutil.c   | 11 +----------
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 9bf97a366029..c426d21822f7 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -524,6 +524,21 @@ unsigned int cpufreq_driver_resolve_freq(struct cpufreq_policy *policy,
 }
 EXPORT_SYMBOL_GPL(cpufreq_driver_resolve_freq);
 
+unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
+{
+	unsigned int latency;
+
+	if (policy->transition_delay_us)
+		return policy->transition_delay_us;
+
+	latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
+	if (latency)
+		return latency * LATENCY_MULTIPLIER;
+
+	return LATENCY_MULTIPLIER;
+}
+EXPORT_SYMBOL_GPL(cpufreq_policy_transition_delay_us);
+
 /*********************************************************************
  *                          SYSFS INTERFACE                          *
  *********************************************************************/
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 858081f9c3d7..eed069ecfd5e 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -389,7 +389,6 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)
 	struct dbs_governor *gov = dbs_governor_of(policy);
 	struct dbs_data *dbs_data;
 	struct policy_dbs_info *policy_dbs;
-	unsigned int latency;
 	int ret = 0;
 
 	/* State should be equivalent to EXIT */
@@ -428,13 +427,7 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)
 	if (ret)
 		goto free_policy_dbs_info;
 
-	/* policy latency is in ns. Convert it to us first */
-	latency = policy->cpuinfo.transition_latency / 1000;
-	if (latency == 0)
-		latency = 1;
-
-	/* Bring kernel and HW constraints together */
-	dbs_data->sampling_rate = LATENCY_MULTIPLIER * latency;
+	dbs_data->sampling_rate = cpufreq_policy_transition_delay_us(policy);
 
 	if (!have_governor_per_policy())
 		gov->gdbs_data = dbs_data;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 02aec384cab9..aaadfc543f63 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -523,6 +523,7 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 				   unsigned int relation);
 unsigned int cpufreq_driver_resolve_freq(struct cpufreq_policy *policy,
 					 unsigned int target_freq);
+unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy);
 int cpufreq_register_governor(struct cpufreq_governor *governor);
 void cpufreq_unregister_governor(struct cpufreq_governor *governor);
 
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 29a397067ffa..89c4dd9777bb 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -528,16 +528,7 @@ static int sugov_init(struct cpufreq_policy *policy)
 		goto stop_kthread;
 	}
 
-	if (policy->transition_delay_us) {
-		tunables->rate_limit_us = policy->transition_delay_us;
-	} else {
-		unsigned int lat;
-
-		tunables->rate_limit_us = LATENCY_MULTIPLIER;
-		lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
-		if (lat)
-			tunables->rate_limit_us *= lat;
-	}
+	tunables->rate_limit_us = cpufreq_policy_transition_delay_us(policy);
 
 	policy->governor_data = sg_policy;
 	sg_policy->tunables = tunables;
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms
  2017-07-19 10:12 [PATCH V3 0/9] cpufreq: transition-latency cleanups Viresh Kumar
  2017-07-19 10:12 ` [PATCH V3 1/9] cpufreq: governor: Drop min_sampling_rate Viresh Kumar
  2017-07-19 10:12 ` [PATCH V3 2/9] cpufreq: Use transition_delay_us for legacy governors as well Viresh Kumar
@ 2017-07-19 10:12 ` Viresh Kumar
  2017-07-25 11:54   ` Leonard Crestez
  2017-07-19 10:12 ` [PATCH V3 4/9] cpufreq: Don't set transition_latency for setpolicy drivers Viresh Kumar
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Viresh Kumar @ 2017-07-19 10:12 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel

If transition_delay_us isn't defined by the cpufreq driver, the default
value of transition delay (time after which the cpufreq governor will
try updating the frequency again) is currently calculated by multiplying
transition_latency (nsec) with LATENCY_MULTIPLIER (1000) and then
converting this time to usec. That gives the exact same value as
transition_latency, just that the time unit is usec instead of nsec.

With acpi-cpufreq for example, transition_latency is set to around 10
usec and we get transition delay as 10 ms. Which seems to be a
reasonable amount of time to reevaluate the frequency again.

But for platforms where frequency switching isn't that fast (like ARM),
the transition_latency varies from 500 usec to 3 ms, and the transition
delay becomes 500 ms to 3 seconds. Of course, that is a pretty bad
default value to start with.

We can try to come across a better formula (instead of multiplying with
LATENCY_MULTIPLIER) to solve this problem, but will that be worth it ?

This patch tries a simple approach and caps the maximum value of default
transition delay to 10 ms. Of course, userspace can still come in and
change this value anytime or individual drivers can rather provide
transition_delay_us instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index c426d21822f7..d00cde871c15 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -532,8 +532,19 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
 		return policy->transition_delay_us;
 
 	latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
-	if (latency)
-		return latency * LATENCY_MULTIPLIER;
+	if (latency) {
+		/*
+		 * For platforms that can change the frequency very fast (< 10
+		 * us), the above formula gives a decent transition delay. But
+		 * for platforms where transition_latency is in milliseconds, it
+		 * ends up giving unrealistic values.
+		 *
+		 * Cap the default transition delay to 10 ms, which seems to be
+		 * a reasonable amount of time after which we should reevaluate
+		 * the frequency.
+		 */
+		return min(latency * LATENCY_MULTIPLIER, (unsigned int)10000);
+	}
 
 	return LATENCY_MULTIPLIER;
 }
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 4/9] cpufreq: Don't set transition_latency for setpolicy drivers
  2017-07-19 10:12 [PATCH V3 0/9] cpufreq: transition-latency cleanups Viresh Kumar
                   ` (2 preceding siblings ...)
  2017-07-19 10:12 ` [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms Viresh Kumar
@ 2017-07-19 10:12 ` Viresh Kumar
  2017-07-19 10:12 ` [PATCH V3 5/9] cpufreq: arm_big_little: Make ->get_transition_latency() mandatory Viresh Kumar
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Viresh Kumar @ 2017-07-19 10:12 UTC (permalink / raw)
  To: Rafael Wysocki, Srinivas Pandruvada, Len Brown, Viresh Kumar
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel

The transition_latency field isn't used for drivers with ->setpolicy()
callback present and there is no point setting it from the drivers.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/intel_pstate.c | 1 -
 drivers/cpufreq/longrun.c      | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index b7fb8b7c980d..c1100a3e3325 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2138,7 +2138,6 @@ static int intel_pstate_cpu_init(struct cpufreq_policy *policy)
 	if (ret)
 		return ret;
 
-	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
 	if (IS_ENABLED(CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE))
 		policy->policy = CPUFREQ_POLICY_PERFORMANCE;
 	else
diff --git a/drivers/cpufreq/longrun.c b/drivers/cpufreq/longrun.c
index 074971b12635..542aa9adba1a 100644
--- a/drivers/cpufreq/longrun.c
+++ b/drivers/cpufreq/longrun.c
@@ -270,7 +270,6 @@ static int longrun_cpu_init(struct cpufreq_policy *policy)
 	/* cpuinfo and default policy values */
 	policy->cpuinfo.min_freq = longrun_low_freq;
 	policy->cpuinfo.max_freq = longrun_high_freq;
-	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
 	longrun_get_policy(policy);
 
 	return 0;
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 5/9] cpufreq: arm_big_little: Make ->get_transition_latency() mandatory
  2017-07-19 10:12 [PATCH V3 0/9] cpufreq: transition-latency cleanups Viresh Kumar
                   ` (3 preceding siblings ...)
  2017-07-19 10:12 ` [PATCH V3 4/9] cpufreq: Don't set transition_latency for setpolicy drivers Viresh Kumar
@ 2017-07-19 10:12 ` Viresh Kumar
  2017-07-19 10:12 ` [PATCH V3 6/9] cpufreq: Replace "max_transition_latency" with "dynamic_switching" Viresh Kumar
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Viresh Kumar @ 2017-07-19 10:12 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Sudeep Holla
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel

All users of arm_big_little driver are defining it and there is no need
to keep it optional.

Make it mandatory to remove the always true conditional statement.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/arm_big_little.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c
index ea6d62547b10..17504129fd77 100644
--- a/drivers/cpufreq/arm_big_little.c
+++ b/drivers/cpufreq/arm_big_little.c
@@ -483,11 +483,8 @@ static int bL_cpufreq_init(struct cpufreq_policy *policy)
 		return ret;
 	}
 
-	if (arm_bL_ops->get_transition_latency)
-		policy->cpuinfo.transition_latency =
-			arm_bL_ops->get_transition_latency(cpu_dev);
-	else
-		policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
+	policy->cpuinfo.transition_latency =
+				arm_bL_ops->get_transition_latency(cpu_dev);
 
 	if (is_bL_switching_enabled())
 		per_cpu(cpu_last_req_freq, policy->cpu) = clk_get_cpu_rate(policy->cpu);
@@ -622,7 +619,8 @@ int bL_cpufreq_register(struct cpufreq_arm_bL_ops *ops)
 		return -EBUSY;
 	}
 
-	if (!ops || !strlen(ops->name) || !ops->init_opp_table) {
+	if (!ops || !strlen(ops->name) || !ops->init_opp_table ||
+	    !ops->get_transition_latency) {
 		pr_err("%s: Invalid arm_bL_ops, exiting\n", __func__);
 		return -ENODEV;
 	}
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 6/9] cpufreq: Replace "max_transition_latency" with "dynamic_switching"
  2017-07-19 10:12 [PATCH V3 0/9] cpufreq: transition-latency cleanups Viresh Kumar
                   ` (4 preceding siblings ...)
  2017-07-19 10:12 ` [PATCH V3 5/9] cpufreq: arm_big_little: Make ->get_transition_latency() mandatory Viresh Kumar
@ 2017-07-19 10:12 ` Viresh Kumar
  2017-07-19 10:12 ` [PATCH V3 7/9] cpufreq: schedutil: Set dynamic_switching to true Viresh Kumar
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Viresh Kumar @ 2017-07-19 10:12 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel

There is no limitation in the ondemand or conservative governors which
disallow the transition_latency to be greater than 10 ms.

The max_transition_latency field is rather used to disallow automatic
dynamic frequency switching for platforms which didn't wanted these
governors to run.

Replace max_transition_latency with a boolean (dynamic_switching) and
check for transition_latency == CPUFREQ_ETERNAL along with that. This
makes it pretty straight forward to read/understand now.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c          | 8 ++++----
 drivers/cpufreq/cpufreq_governor.h | 2 +-
 include/linux/cpufreq.h            | 9 ++-------
 3 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d00cde871c15..2debfb5f9126 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2014,13 +2014,13 @@ static int cpufreq_init_governor(struct cpufreq_policy *policy)
 	if (!policy->governor)
 		return -EINVAL;
 
-	if (policy->governor->max_transition_latency &&
-	    policy->cpuinfo.transition_latency >
-	    policy->governor->max_transition_latency) {
+	/* Platform doesn't want dynamic frequency switching ? */
+	if (policy->governor->dynamic_switching &&
+	    policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL) {
 		struct cpufreq_governor *gov = cpufreq_fallback_governor();
 
 		if (gov) {
-			pr_warn("%s governor failed, too long transition latency of HW, fallback to %s governor\n",
+			pr_warn("Transition latency set to CPUFREQ_ETERNAL, can't use %s governor. Fallback to %s governor\n",
 				policy->governor->name, gov->name);
 			policy->governor = gov;
 		} else {
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 95f207eb820e..8463f5def0f5 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -159,7 +159,7 @@ void cpufreq_dbs_governor_limits(struct cpufreq_policy *policy);
 #define CPUFREQ_DBS_GOVERNOR_INITIALIZER(_name_)			\
 	{								\
 		.name = _name_,						\
-		.max_transition_latency	= TRANSITION_LATENCY_LIMIT,	\
+		.dynamic_switching = true,				\
 		.owner = THIS_MODULE,					\
 		.init = cpufreq_dbs_governor_init,			\
 		.exit = cpufreq_dbs_governor_exit,			\
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index aaadfc543f63..e141dbbb9d1c 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -487,12 +487,8 @@ static inline unsigned long cpufreq_scale(unsigned long old, u_int div,
  * polling frequency is 1000 times the transition latency of the processor. The
  * ondemand governor will work on any processor with transition latency <= 10ms,
  * using appropriate sampling rate.
- *
- * For CPUs with transition latency > 10ms (mostly drivers with CPUFREQ_ETERNAL)
- * the ondemand governor will not work. All times here are in us (microseconds).
  */
 #define LATENCY_MULTIPLIER		(1000)
-#define TRANSITION_LATENCY_LIMIT	(10 * 1000 * 1000)
 
 struct cpufreq_governor {
 	char	name[CPUFREQ_NAME_LEN];
@@ -505,9 +501,8 @@ struct cpufreq_governor {
 					 char *buf);
 	int	(*store_setspeed)	(struct cpufreq_policy *policy,
 					 unsigned int freq);
-	unsigned int max_transition_latency; /* HW must be able to switch to
-			next freq faster than this value in nano secs or we
-			will fallback to performance governor */
+	/* For governors which change frequency dynamically by themselves */
+	bool			dynamic_switching;
 	struct list_head	governor_list;
 	struct module		*owner;
 };
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 7/9] cpufreq: schedutil: Set dynamic_switching to true
  2017-07-19 10:12 [PATCH V3 0/9] cpufreq: transition-latency cleanups Viresh Kumar
                   ` (5 preceding siblings ...)
  2017-07-19 10:12 ` [PATCH V3 6/9] cpufreq: Replace "max_transition_latency" with "dynamic_switching" Viresh Kumar
@ 2017-07-19 10:12 ` Viresh Kumar
  2017-07-19 10:12 ` [PATCH V3 8/9] cpufreq: Add CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING cpufreq driver flag Viresh Kumar
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Viresh Kumar @ 2017-07-19 10:12 UTC (permalink / raw)
  To: Rafael Wysocki, Ingo Molnar, Peter Zijlstra
  Cc: Viresh Kumar, linux-pm, Vincent Guittot, linux, linux-kernel

Set dynamic_switching to 'true' to disallow use of schedutil governor
for platforms with transition_latency set to CPUFREQ_ETERNAL, as they
may not want to do automatic dynamic frequency switching.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/sched/cpufreq_schedutil.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 89c4dd9777bb..45fcf21ad685 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -646,6 +646,7 @@ static void sugov_limits(struct cpufreq_policy *policy)
 static struct cpufreq_governor schedutil_gov = {
 	.name = "schedutil",
 	.owner = THIS_MODULE,
+	.dynamic_switching = true,
 	.init = sugov_init,
 	.exit = sugov_exit,
 	.start = sugov_start,
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 8/9] cpufreq: Add CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING cpufreq driver flag
  2017-07-19 10:12 [PATCH V3 0/9] cpufreq: transition-latency cleanups Viresh Kumar
                   ` (6 preceding siblings ...)
  2017-07-19 10:12 ` [PATCH V3 7/9] cpufreq: schedutil: Set dynamic_switching to true Viresh Kumar
@ 2017-07-19 10:12 ` Viresh Kumar
  2017-07-19 16:30   ` Dominik Brodowski
  2017-07-19 10:12 ` [PATCH V3 9/9] cpufreq: Allow dynamic switching with CPUFREQ_ETERNAL latency Viresh Kumar
  2017-07-19 12:42 ` [PATCH V3 0/9] cpufreq: transition-latency cleanups Rafael J. Wysocki
  9 siblings, 1 reply; 24+ messages in thread
From: Viresh Kumar @ 2017-07-19 10:12 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel, linuxppc-dev

The policy->transition_latency field is used for multiple purposes
today and its not straight forward at all. This is how it is used:

A. Set the correct transition_latency value.

B. Set it to CPUFREQ_ETERNAL because:
   1. We don't want automatic dynamic switching (with
      ondemand/conservative) to happen at all.
   2. We don't know the transition latency.

This patch handles the B.1. case in a more readable way. A new flag for
the cpufreq drivers is added to disallow use of cpufreq governors which
have dynamic_switching flag set.

All the current cpufreq drivers which are setting transition_latency
unconditionally to CPUFREQ_ETERNAL are updated to use it. They don't
need to set transition_latency anymore.

There shouldn't be any functional change after this patch.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq-nforce2.c  | 2 +-
 drivers/cpufreq/cpufreq.c          | 5 +++--
 drivers/cpufreq/elanfreq.c         | 4 +---
 drivers/cpufreq/gx-suspmod.c       | 2 +-
 drivers/cpufreq/pmac32-cpufreq.c   | 7 +++++--
 drivers/cpufreq/sa1100-cpufreq.c   | 5 +++--
 drivers/cpufreq/sa1110-cpufreq.c   | 5 +++--
 drivers/cpufreq/sh-cpufreq.c       | 3 +--
 drivers/cpufreq/speedstep-smi.c    | 2 +-
 drivers/cpufreq/unicore2-cpufreq.c | 3 +--
 include/linux/cpufreq.h            | 6 ++++++
 11 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-nforce2.c b/drivers/cpufreq/cpufreq-nforce2.c
index 5503d491b016..dbf82f36d270 100644
--- a/drivers/cpufreq/cpufreq-nforce2.c
+++ b/drivers/cpufreq/cpufreq-nforce2.c
@@ -357,7 +357,6 @@ static int nforce2_cpu_init(struct cpufreq_policy *policy)
 	/* cpuinfo and default policy values */
 	policy->min = policy->cpuinfo.min_freq = min_fsb * fid * 100;
 	policy->max = policy->cpuinfo.max_freq = max_fsb * fid * 100;
-	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
 
 	return 0;
 }
@@ -369,6 +368,7 @@ static int nforce2_cpu_exit(struct cpufreq_policy *policy)
 
 static struct cpufreq_driver nforce2_driver = {
 	.name = "nforce2",
+	.flags = CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
 	.verify = nforce2_verify,
 	.target = nforce2_target,
 	.get = nforce2_get,
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2debfb5f9126..a4d9b47c4af4 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2016,11 +2016,12 @@ static int cpufreq_init_governor(struct cpufreq_policy *policy)
 
 	/* Platform doesn't want dynamic frequency switching ? */
 	if (policy->governor->dynamic_switching &&
-	    policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL) {
+	    (cpufreq_driver->flags & CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING ||
+	    policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL)) {
 		struct cpufreq_governor *gov = cpufreq_fallback_governor();
 
 		if (gov) {
-			pr_warn("Transition latency set to CPUFREQ_ETERNAL, can't use %s governor. Fallback to %s governor\n",
+			pr_warn("Can't use %s governor as dynamic switching is disallowed. Fallback to %s governor\n",
 				policy->governor->name, gov->name);
 			policy->governor = gov;
 		} else {
diff --git a/drivers/cpufreq/elanfreq.c b/drivers/cpufreq/elanfreq.c
index bfce11cba1df..45e2ca62515e 100644
--- a/drivers/cpufreq/elanfreq.c
+++ b/drivers/cpufreq/elanfreq.c
@@ -165,9 +165,6 @@ static int elanfreq_cpu_init(struct cpufreq_policy *policy)
 		if (pos->frequency > max_freq)
 			pos->frequency = CPUFREQ_ENTRY_INVALID;
 
-	/* cpuinfo and default policy values */
-	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
-
 	return cpufreq_table_validate_and_show(policy, elanfreq_table);
 }
 
@@ -196,6 +193,7 @@ __setup("elanfreq=", elanfreq_setup);
 
 static struct cpufreq_driver elanfreq_driver = {
 	.get		= elanfreq_get_cpu_frequency,
+	.flags		= CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
 	.verify		= cpufreq_generic_frequency_table_verify,
 	.target_index	= elanfreq_target,
 	.init		= elanfreq_cpu_init,
diff --git a/drivers/cpufreq/gx-suspmod.c b/drivers/cpufreq/gx-suspmod.c
index 3488c9c175eb..8f52a06664e3 100644
--- a/drivers/cpufreq/gx-suspmod.c
+++ b/drivers/cpufreq/gx-suspmod.c
@@ -428,7 +428,6 @@ static int cpufreq_gx_cpu_init(struct cpufreq_policy *policy)
 	policy->max = maxfreq;
 	policy->cpuinfo.min_freq = maxfreq / max_duration;
 	policy->cpuinfo.max_freq = maxfreq;
-	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
 
 	return 0;
 }
@@ -438,6 +437,7 @@ static int cpufreq_gx_cpu_init(struct cpufreq_policy *policy)
  *   MediaGX/Geode GX initialize cpufreq driver
  */
 static struct cpufreq_driver gx_suspmod_driver = {
+	.flags		= CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
 	.get		= gx_get_cpuspeed,
 	.verify		= cpufreq_gx_verify,
 	.target		= cpufreq_gx_target,
diff --git a/drivers/cpufreq/pmac32-cpufreq.c b/drivers/cpufreq/pmac32-cpufreq.c
index ff44016ea031..61ae06ca008e 100644
--- a/drivers/cpufreq/pmac32-cpufreq.c
+++ b/drivers/cpufreq/pmac32-cpufreq.c
@@ -442,7 +442,8 @@ static struct cpufreq_driver pmac_cpufreq_driver = {
 	.init		= pmac_cpufreq_cpu_init,
 	.suspend	= pmac_cpufreq_suspend,
 	.resume		= pmac_cpufreq_resume,
-	.flags		= CPUFREQ_PM_NO_WARN,
+	.flags		= CPUFREQ_PM_NO_WARN |
+			  CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
 	.attr		= cpufreq_generic_attr,
 	.name		= "powermac",
 };
@@ -626,14 +627,16 @@ static int __init pmac_cpufreq_setup(void)
 	if (!value)
 		goto out;
 	cur_freq = (*value) / 1000;
-	transition_latency = CPUFREQ_ETERNAL;
 
 	/*  Check for 7447A based MacRISC3 */
 	if (of_machine_is_compatible("MacRISC3") &&
 	    of_get_property(cpunode, "dynamic-power-step", NULL) &&
 	    PVR_VER(mfspr(SPRN_PVR)) == 0x8003) {
 		pmac_cpufreq_init_7447A(cpunode);
+
+		/* Allow dynamic switching */
 		transition_latency = 8000000;
+		pmac_cpufreq_driver.flags &= ~CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING;
 	/* Check for other MacRISC3 machines */
 	} else if (of_machine_is_compatible("PowerBook3,4") ||
 		   of_machine_is_compatible("PowerBook3,5") ||
diff --git a/drivers/cpufreq/sa1100-cpufreq.c b/drivers/cpufreq/sa1100-cpufreq.c
index 728eab77e8e0..e2d8a77c36d5 100644
--- a/drivers/cpufreq/sa1100-cpufreq.c
+++ b/drivers/cpufreq/sa1100-cpufreq.c
@@ -197,11 +197,12 @@ static int sa1100_target(struct cpufreq_policy *policy, unsigned int ppcr)
 
 static int __init sa1100_cpu_init(struct cpufreq_policy *policy)
 {
-	return cpufreq_generic_init(policy, sa11x0_freq_table, CPUFREQ_ETERNAL);
+	return cpufreq_generic_init(policy, sa11x0_freq_table, 0);
 }
 
 static struct cpufreq_driver sa1100_driver __refdata = {
-	.flags		= CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK,
+	.flags		= CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK |
+			  CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
 	.verify		= cpufreq_generic_frequency_table_verify,
 	.target_index	= sa1100_target,
 	.get		= sa11x0_getspeed,
diff --git a/drivers/cpufreq/sa1110-cpufreq.c b/drivers/cpufreq/sa1110-cpufreq.c
index 2bac9b6cfeea..66e5fb088ecc 100644
--- a/drivers/cpufreq/sa1110-cpufreq.c
+++ b/drivers/cpufreq/sa1110-cpufreq.c
@@ -306,13 +306,14 @@ static int sa1110_target(struct cpufreq_policy *policy, unsigned int ppcr)
 
 static int __init sa1110_cpu_init(struct cpufreq_policy *policy)
 {
-	return cpufreq_generic_init(policy, sa11x0_freq_table, CPUFREQ_ETERNAL);
+	return cpufreq_generic_init(policy, sa11x0_freq_table, 0);
 }
 
 /* sa1110_driver needs __refdata because it must remain after init registers
  * it with cpufreq_register_driver() */
 static struct cpufreq_driver sa1110_driver __refdata = {
-	.flags		= CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK,
+	.flags		= CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK |
+			  CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
 	.verify		= cpufreq_generic_frequency_table_verify,
 	.target_index	= sa1110_target,
 	.get		= sa11x0_getspeed,
diff --git a/drivers/cpufreq/sh-cpufreq.c b/drivers/cpufreq/sh-cpufreq.c
index 719c3d9f07fb..28893d435cf5 100644
--- a/drivers/cpufreq/sh-cpufreq.c
+++ b/drivers/cpufreq/sh-cpufreq.c
@@ -137,8 +137,6 @@ static int sh_cpufreq_cpu_init(struct cpufreq_policy *policy)
 			(clk_round_rate(cpuclk, ~0UL) + 500) / 1000;
 	}
 
-	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
-
 	dev_info(dev, "CPU Frequencies - Minimum %u.%03u MHz, "
 	       "Maximum %u.%03u MHz.\n",
 	       policy->min / 1000, policy->min % 1000,
@@ -159,6 +157,7 @@ static int sh_cpufreq_cpu_exit(struct cpufreq_policy *policy)
 
 static struct cpufreq_driver sh_cpufreq_driver = {
 	.name		= "sh",
+	.flags		= CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
 	.get		= sh_cpufreq_get,
 	.target		= sh_cpufreq_target,
 	.verify		= sh_cpufreq_verify,
diff --git a/drivers/cpufreq/speedstep-smi.c b/drivers/cpufreq/speedstep-smi.c
index 37b30071c220..d23f24ccff38 100644
--- a/drivers/cpufreq/speedstep-smi.c
+++ b/drivers/cpufreq/speedstep-smi.c
@@ -266,7 +266,6 @@ static int speedstep_cpu_init(struct cpufreq_policy *policy)
 			pr_debug("workaround worked.\n");
 	}
 
-	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
 	return cpufreq_table_validate_and_show(policy, speedstep_freqs);
 }
 
@@ -290,6 +289,7 @@ static int speedstep_resume(struct cpufreq_policy *policy)
 
 static struct cpufreq_driver speedstep_driver = {
 	.name		= "speedstep-smi",
+	.flags		= CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
 	.verify		= cpufreq_generic_frequency_table_verify,
 	.target_index	= speedstep_target,
 	.init		= speedstep_cpu_init,
diff --git a/drivers/cpufreq/unicore2-cpufreq.c b/drivers/cpufreq/unicore2-cpufreq.c
index 6f9dfa80563a..db62d9844751 100644
--- a/drivers/cpufreq/unicore2-cpufreq.c
+++ b/drivers/cpufreq/unicore2-cpufreq.c
@@ -58,13 +58,12 @@ static int __init ucv2_cpu_init(struct cpufreq_policy *policy)
 
 	policy->min = policy->cpuinfo.min_freq = 250000;
 	policy->max = policy->cpuinfo.max_freq = 1000000;
-	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
 	policy->clk = clk_get(NULL, "MAIN_CLK");
 	return PTR_ERR_OR_ZERO(policy->clk);
 }
 
 static struct cpufreq_driver ucv2_driver = {
-	.flags		= CPUFREQ_STICKY,
+	.flags		= CPUFREQ_STICKY | CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING,
 	.verify		= ucv2_verify_speed,
 	.target		= ucv2_target,
 	.get		= cpufreq_generic_get,
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index e141dbbb9d1c..5f40522ec98c 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -370,6 +370,12 @@ struct cpufreq_driver {
  */
 #define CPUFREQ_NEED_INITIAL_FREQ_CHECK	(1 << 5)
 
+/*
+ * Set by drivers to disallow use of governors with "dynamic_switching" flag
+ * set.
+ */
+#define CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING (1 << 6)
+
 int cpufreq_register_driver(struct cpufreq_driver *driver_data);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
 
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V3 9/9] cpufreq: Allow dynamic switching with CPUFREQ_ETERNAL latency
  2017-07-19 10:12 [PATCH V3 0/9] cpufreq: transition-latency cleanups Viresh Kumar
                   ` (7 preceding siblings ...)
  2017-07-19 10:12 ` [PATCH V3 8/9] cpufreq: Add CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING cpufreq driver flag Viresh Kumar
@ 2017-07-19 10:12 ` Viresh Kumar
  2017-07-19 12:42 ` [PATCH V3 0/9] cpufreq: transition-latency cleanups Rafael J. Wysocki
  9 siblings, 0 replies; 24+ messages in thread
From: Viresh Kumar @ 2017-07-19 10:12 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel

With the recent updates, CPUFREQ_ETERNAL is only used by the drivers
which don't know their transition latency but want to use dynamic
switching.

Anyway, the routine cpufreq_policy_transition_delay_us() caps the value
of transition latency to 10 ms now and that can be used safely with such
platforms.

Remove the check from cpufreq_init_governor() and allow dynamic
switching for such configurations as well.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index a4d9b47c4af4..c7ae67d6886d 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2016,8 +2016,7 @@ static int cpufreq_init_governor(struct cpufreq_policy *policy)
 
 	/* Platform doesn't want dynamic frequency switching ? */
 	if (policy->governor->dynamic_switching &&
-	    (cpufreq_driver->flags & CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING ||
-	    policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL)) {
+	    cpufreq_driver->flags & CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING) {
 		struct cpufreq_governor *gov = cpufreq_fallback_governor();
 
 		if (gov) {
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 0/9] cpufreq: transition-latency cleanups
  2017-07-19 10:12 [PATCH V3 0/9] cpufreq: transition-latency cleanups Viresh Kumar
                   ` (8 preceding siblings ...)
  2017-07-19 10:12 ` [PATCH V3 9/9] cpufreq: Allow dynamic switching with CPUFREQ_ETERNAL latency Viresh Kumar
@ 2017-07-19 12:42 ` Rafael J. Wysocki
  9 siblings, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2017-07-19 12:42 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-pm, Vincent Guittot, linux, Benjamin Herrenschmidt,
	Ingo Molnar, Jonathan Corbet, Len Brown, linux-doc, linux-kernel,
	linuxppc-dev, Michael Ellerman, Paul Mackerras, Peter Zijlstra,
	Srinivas Pandruvada, Sudeep Holla

On Wednesday, July 19, 2017 03:42:40 PM Viresh Kumar wrote:
> Hi Rafael,
> 
> This series tries to cleanup the code around transition-latency and its
> users. Some of the old legacy code, which may not make much sense now,
> is dropped as well. And some code consolidation is also done across
> governors.
> 
> Based of: v4.13-rc1
> Tested on: ARM64 Hikey board.

>From the first quick look this version is fine by me.

Unless I find anything of concern later, this will be queued up for 4.14.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 8/9] cpufreq: Add CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING cpufreq driver flag
  2017-07-19 10:12 ` [PATCH V3 8/9] cpufreq: Add CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING cpufreq driver flag Viresh Kumar
@ 2017-07-19 16:30   ` Dominik Brodowski
  0 siblings, 0 replies; 24+ messages in thread
From: Dominik Brodowski @ 2017-07-19 16:30 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, linux-pm, Vincent Guittot, linux-kernel,
	linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 1089 bytes --]


On Wed, Jul 19, 2017 at 03:42:48PM +0530, Viresh Kumar wrote:
> The policy->transition_latency field is used for multiple purposes
> today and its not straight forward at all. This is how it is used:
> 
> A. Set the correct transition_latency value.
> 
> B. Set it to CPUFREQ_ETERNAL because:
>    1. We don't want automatic dynamic switching (with
>       ondemand/conservative) to happen at all.
>    2. We don't know the transition latency.
> 
> This patch handles the B.1. case in a more readable way. A new flag for
> the cpufreq drivers is added to disallow use of cpufreq governors which
> have dynamic_switching flag set.
> 
> All the current cpufreq drivers which are setting transition_latency
> unconditionally to CPUFREQ_ETERNAL are updated to use it. They don't
> need to set transition_latency anymore.
> 
> There shouldn't be any functional change after this patch.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Looks good to me, so feel free to add:

	Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>

Best,
	Dominik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 2/9] cpufreq: Use transition_delay_us for legacy governors as well
  2017-07-19 10:12 ` [PATCH V3 2/9] cpufreq: Use transition_delay_us for legacy governors as well Viresh Kumar
@ 2017-07-24 16:17   ` Peter Zijlstra
  2017-07-28  4:48     ` Viresh Kumar
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2017-07-24 16:17 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Ingo Molnar, linux-pm, Vincent Guittot, linux,
	linux-kernel

On Wed, Jul 19, 2017 at 03:42:42PM +0530, Viresh Kumar wrote:
> The policy->transition_delay_us field is used only by the schedutil
> governor currently, and this field describes how fast the driver wants
> the cpufreq governor to change CPUs frequency. It should rather be a
> common thing across all governors, as it doesn't have any schedutil
> dependency here.
> 
> Create a new helper cpufreq_policy_transition_delay_us() to get the
> transition delay across all governors.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/cpufreq.c          | 15 +++++++++++++++
>  drivers/cpufreq/cpufreq_governor.c |  9 +--------
>  include/linux/cpufreq.h            |  1 +
>  kernel/sched/cpufreq_schedutil.c   | 11 +----------
>  4 files changed, 18 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 9bf97a366029..c426d21822f7 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -524,6 +524,21 @@ unsigned int cpufreq_driver_resolve_freq(struct cpufreq_policy *policy,
>  }
>  EXPORT_SYMBOL_GPL(cpufreq_driver_resolve_freq);
>  
> +unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
> +{
> +	unsigned int latency;
> +
> +	if (policy->transition_delay_us)
> +		return policy->transition_delay_us;
> +
> +	latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
> +	if (latency)
> +		return latency * LATENCY_MULTIPLIER;
> +
> +	return LATENCY_MULTIPLIER;
> +}
> +EXPORT_SYMBOL_GPL(cpufreq_policy_transition_delay_us);

I realize you're just moving code about, but _why_ are we doing that
division?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms
  2017-07-19 10:12 ` [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms Viresh Kumar
@ 2017-07-25 11:54   ` Leonard Crestez
  2017-07-26  0:19     ` Rafael J. Wysocki
  2017-07-26  6:06     ` Viresh Kumar
  0 siblings, 2 replies; 24+ messages in thread
From: Leonard Crestez @ 2017-07-25 11:54 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel, Rafael Wysocki

On Wed, 2017-07-19 at 15:42 +0530, Viresh Kumar wrote:
> If transition_delay_us isn't defined by the cpufreq driver, the default
> value of transition delay (time after which the cpufreq governor will
> try updating the frequency again) is currently calculated by multiplying
> transition_latency (nsec) with LATENCY_MULTIPLIER (1000) and then
> converting this time to usec. That gives the exact same value as
> transition_latency, just that the time unit is usec instead of nsec.
> 
> With acpi-cpufreq for example, transition_latency is set to around 10
> usec and we get transition delay as 10 ms. Which seems to be a
> reasonable amount of time to reevaluate the frequency again.
> 
> But for platforms where frequency switching isn't that fast (like ARM),
> the transition_latency varies from 500 usec to 3 ms, and the transition
> delay becomes 500 ms to 3 seconds. Of course, that is a pretty bad
> default value to start with.
> 
> We can try to come across a better formula (instead of multiplying with
> LATENCY_MULTIPLIER) to solve this problem, but will that be worth it ?
> 
> This patch tries a simple approach and caps the maximum value of default
> transition delay to 10 ms. Of course, userspace can still come in and
> change this value anytime or individual drivers can rather provide
> transition_delay_us instead.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/cpufreq.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index c426d21822f7..d00cde871c15 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -532,8 +532,19 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
>  		return policy->transition_delay_us;
>  
>  	latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
> -	if (latency)
> -		return latency * LATENCY_MULTIPLIER;
> +	if (latency) {
> +		/*
> +		 * For platforms that can change the frequency very fast (< 10
> +		 * us), the above formula gives a decent transition delay. But
> +		 * for platforms where transition_latency is in milliseconds, it
> +		 * ends up giving unrealistic values.
> +		 *
> +		 * Cap the default transition delay to 10 ms, which seems to be
> +		 * a reasonable amount of time after which we should reevaluate
> +		 * the frequency.
> +		 */
> +		return min(latency * LATENCY_MULTIPLIER, (unsigned int)10000);
> +	}
>  
>  	return LATENCY_MULTIPLIER;
>  }

This patch made it's way into linux-next and it seems to cause imx socs
to almost always hang around their max frequency with the ondemand
governor, even when almost completely idle. The lowest frequency is
never reached. This seems wrong?

This driver calculates transition_latency at probe time, the value is
not terribly accurate but it reaches values like latency = 109 us, so
this patch clamps it at about 10% of the value.

It's worth noting that the default IMX config has HZ=100 and
NO_HZ_IDLE=y, so maybe doing idle checks at a rate comparable to the
jiffie tick screws stuff up? I don't understand what ondemand is trying
to do.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms
  2017-07-25 11:54   ` Leonard Crestez
@ 2017-07-26  0:19     ` Rafael J. Wysocki
  2017-07-26  6:06     ` Viresh Kumar
  1 sibling, 0 replies; 24+ messages in thread
From: Rafael J. Wysocki @ 2017-07-26  0:19 UTC (permalink / raw)
  To: Leonard Crestez
  Cc: Viresh Kumar, linux-pm, Vincent Guittot, linux, linux-kernel

On Tuesday, July 25, 2017 02:54:46 PM Leonard Crestez wrote:
> On Wed, 2017-07-19 at 15:42 +0530, Viresh Kumar wrote:
> > If transition_delay_us isn't defined by the cpufreq driver, the default
> > value of transition delay (time after which the cpufreq governor will
> > try updating the frequency again) is currently calculated by multiplying
> > transition_latency (nsec) with LATENCY_MULTIPLIER (1000) and then
> > converting this time to usec. That gives the exact same value as
> > transition_latency, just that the time unit is usec instead of nsec.
> > 
> > With acpi-cpufreq for example, transition_latency is set to around 10
> > usec and we get transition delay as 10 ms. Which seems to be a
> > reasonable amount of time to reevaluate the frequency again.
> > 
> > But for platforms where frequency switching isn't that fast (like ARM),
> > the transition_latency varies from 500 usec to 3 ms, and the transition
> > delay becomes 500 ms to 3 seconds. Of course, that is a pretty bad
> > default value to start with.
> > 
> > We can try to come across a better formula (instead of multiplying with
> > LATENCY_MULTIPLIER) to solve this problem, but will that be worth it ?
> > 
> > This patch tries a simple approach and caps the maximum value of default
> > transition delay to 10 ms. Of course, userspace can still come in and
> > change this value anytime or individual drivers can rather provide
> > transition_delay_us instead.
> > 
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > ---
> >  drivers/cpufreq/cpufreq.c | 15 +++++++++++++--
> >  1 file changed, 13 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index c426d21822f7..d00cde871c15 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -532,8 +532,19 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
> >  		return policy->transition_delay_us;
> >  
> >  	latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
> > -	if (latency)
> > -		return latency * LATENCY_MULTIPLIER;
> > +	if (latency) {
> > +		/*
> > +		 * For platforms that can change the frequency very fast (< 10
> > +		 * us), the above formula gives a decent transition delay. But
> > +		 * for platforms where transition_latency is in milliseconds, it
> > +		 * ends up giving unrealistic values.
> > +		 *
> > +		 * Cap the default transition delay to 10 ms, which seems to be
> > +		 * a reasonable amount of time after which we should reevaluate
> > +		 * the frequency.
> > +		 */
> > +		return min(latency * LATENCY_MULTIPLIER, (unsigned int)10000);
> > +	}
> >  
> >  	return LATENCY_MULTIPLIER;
> >  }
> 
> This patch made it's way into linux-next and it seems to cause imx socs
> to almost always hang around their max frequency with the ondemand
> governor, even when almost completely idle. The lowest frequency is
> never reached. This seems wrong?
> 
> This driver calculates transition_latency at probe time, the value is
> not terribly accurate but it reaches values like latency = 109 us, so
> this patch clamps it at about 10% of the value.
> 
> It's worth noting that the default IMX config has HZ=100 and
> NO_HZ_IDLE=y, so maybe doing idle checks at a rate comparable to the
> jiffie tick screws stuff up? I don't understand what ondemand is trying
> to do.

I've dropped this commit from linux-next for now.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms
  2017-07-25 11:54   ` Leonard Crestez
  2017-07-26  0:19     ` Rafael J. Wysocki
@ 2017-07-26  6:06     ` Viresh Kumar
  2017-07-27 16:54       ` Leonard Crestez
  1 sibling, 1 reply; 24+ messages in thread
From: Viresh Kumar @ 2017-07-26  6:06 UTC (permalink / raw)
  To: Leonard Crestez
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel, Rafael Wysocki

[-- Attachment #1: Type: text/plain, Size: 1726 bytes --]

On 25-07-17, 14:54, Leonard Crestez wrote:

Thanks for reporting Leonard, really appreciate it.

> This patch made it's way into linux-next and it seems to cause imx socs
> to almost always hang around their max frequency with the ondemand
> governor, even when almost completely idle. The lowest frequency is
> never reached. This seems wrong?

Sure thing. Though I tried to reproduce it on Hikey today and it all
worked pretty fine with ondemand governor. The latency is around 500
us here. I don't have a clue right now on what's wrong in your setup.

> This driver calculates transition_latency at probe time, the value is
> not terribly accurate but it reaches values like latency = 109 us, so

So this is the value that is stored in the global variable
"transition_latency" in the imx6q-cpufreq.c file? i.e.
transition_latency = 109000 (ns) to be exact ?

> this patch clamps it at about 10% of the value.

Without this patch the sampling rate of ondemand governor will be 109
ms. And after this patch it would be capped at 10 ms. Why would that
screw up anyone's setup ? I don't have an answer to that right now.

> It's worth noting that the default IMX config has HZ=100 and
> NO_HZ_IDLE=y, so maybe doing idle checks at a rate comparable to the
> jiffie tick screws stuff up?

Maybe, but I am not sure at all.

Can you try few things ?

- Don't use this patch and try to change ondemand's sampling rate from
  sysfs. Try setting it to 10000 and see if the behavior is identical
  to after this patch.

- Find how much time does it really take to change the frequency of
  the CPU. I don't really thing 109 us is the right transition
  latency. Use attached patch for that and look for the print message.

-- 
viresh

[-- Attachment #2: 0001-cpufreq-Find-transition-latency-dynamically.patch --]
[-- Type: text/x-diff, Size: 2689 bytes --]

>From 95d830ad8765d6c35fb9e91d5028bf3cf1ff2451 Mon Sep 17 00:00:00 2001
Message-Id: <95d830ad8765d6c35fb9e91d5028bf3cf1ff2451.1501049147.git.viresh.kumar@linaro.org>
From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Fri, 2 Jun 2017 15:00:58 +0530
Subject: [PATCH] cpufreq: Find transition latency dynamically

Test patch to find latency dynamically.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 0d6fbb3099b4..e62d670dffd2 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1076,6 +1076,62 @@ __weak struct cpufreq_governor *cpufreq_default_governor(void)
 	return NULL;
 }
 
+static int find_dvfs_latency(struct cpufreq_policy *policy, unsigned long freq,
+			     unsigned int *latency_ns)
+{
+	u64 time;
+	int ret;
+
+	time = local_clock();
+	ret = __cpufreq_driver_target(policy, freq, CPUFREQ_RELATION_L);
+	*latency_ns = local_clock() - time;
+
+	return ret;
+}
+
+/*
+ * Find the transition latency dynamically by:
+ * - Switching to min freq first.
+ * - Then switching to max freq.
+ * - And finally switching back to the initial freq.
+ *
+ * The maximum duration of the above three freq changes should be good enough to
+ * find the maximum transition latency for a platform.
+ */
+static void cpufreq_find_target_latency(struct cpufreq_policy *policy)
+{
+	unsigned long initial_freq = policy->cur;
+	unsigned int latency_ns, latency_max_ns;
+	int ret;
+
+	if (!has_target())
+		return;
+
+	/* Go to min frequency first */
+	ret = find_dvfs_latency(policy, policy->cpuinfo.min_freq, &latency_ns);
+	if (ret)
+		return;
+
+	latency_max_ns = latency_ns;
+
+	/* Go to max frequency then.. */
+	ret = find_dvfs_latency(policy, policy->cpuinfo.max_freq, &latency_ns);
+	if (ret)
+		return;
+
+	latency_max_ns = max(latency_max_ns, latency_ns);
+
+	/* And finally switch back to where we started from */
+	ret = find_dvfs_latency(policy, initial_freq, &latency_ns);
+	if (ret)
+		return;
+
+	policy->cpuinfo.transition_latency = max(latency_max_ns, latency_ns);
+
+	pr_info("%s: Setting transition latency to %u ns for policy of CPU%d\n",
+		__func__, policy->cpuinfo.transition_latency, policy->cpu);
+}
+
 static int cpufreq_init_policy(struct cpufreq_policy *policy)
 {
 	struct cpufreq_governor *gov = NULL;
@@ -1345,6 +1401,8 @@ static int cpufreq_online(unsigned int cpu)
 	}
 
 	if (new_policy) {
+		cpufreq_find_target_latency(policy);
+
 		ret = cpufreq_add_dev_interface(policy);
 		if (ret)
 			goto out_exit_policy;
-- 
2.13.0.71.gd7076ec9c9cb


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms
  2017-07-26  6:06     ` Viresh Kumar
@ 2017-07-27 16:54       ` Leonard Crestez
  2017-07-28  5:28         ` Viresh Kumar
  0 siblings, 1 reply; 24+ messages in thread
From: Leonard Crestez @ 2017-07-27 16:54 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel, Rafael Wysocki

On Wed, 2017-07-26 at 11:36 +0530, Viresh Kumar wrote:
> On 25-07-17, 14:54, Leonard Crestez wrote:

> > This patch made it's way into linux-next and it seems to cause imx socs
> > to almost always hang around their max frequency with the ondemand
> > governor, even when almost completely idle. The lowest frequency is
> > never reached. This seems wrong?

> > This driver calculates transition_latency at probe time, the value is
> > not terribly accurate but it reaches values like latency = 109 us, so

> So this is the value that is stored in the global variable
> "transition_latency" in the imx6q-cpufreq.c file? i.e.
> transition_latency = 109000 (ns) to be exact ?

Yes.

> - Don't use this patch and try to change ondemand's sampling rate from
>   sysfs. Try setting it to 10000 and see if the behavior is identical
>   to after this patch.

Yes, it seems to be. Also setting 100000 explicitly fixes this.

I also tried to switch from HZ=100 to HZ=1000 but that did not make a
difference.

> - Find how much time does it really take to change the frequency of
>   the CPU. I don't really thing 109 us is the right transition
>   latency. Use attached patch for that and look for the print message.

Your patch measures latencies of around 2.5ms, but it can vary between
1.6 ms to 3ms from boot-to-boot. This is a lot more than what the
driver reports. Most transitions seem to be faster.

I did a little digging and it seems that a majority of time is always
spent inside clk_pllv3_wait_lock which spins on a HW bit while doing
usleep_range(50, 500). I originally thought it was because of
regulators but the delays involved in that are smaller.

Measuring wall time on a process that can sleep seems dubious, isn't
this vulnerable to random delays because of other tasks?

> Without this patch the sampling rate of ondemand governor will be 109
> ms. And after this patch it would be capped at 10 ms. Why would that
> screw up anyone's setup ? I don't have an answer to that right now.

On a closer look it seems that most of the time is actually spent at
low cpufreq though (90%+).

Your change makes it so that even something like "sleep 1; cat
scaling_cur_freq" raises the frequency to the maximum. This happens
enough that even if you do it in a loop you will never see the minimum
frequency. It seems there is enough internal bookkeeping on such a
wakeup that it takes more than 10ms and enough for a reevaluation of
cpufreq until cat returns the value?!
 
I found this by enabling the power:cpu_frequency tracepoint event and
checking for deltas with a script. Enabling CPU_FREQ_STAT show this:

time_in_state:

396000 1609
792000 71
996000 54

trans_table:

   From  :    To
         :    396000    792000    996000 
   396000:         0        10         7 
   792000:        16         0        12 
   996000:         1        18         0 

This is very unexpected but not necessarily wrong.

--
Regards,
Leonard

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 2/9] cpufreq: Use transition_delay_us for legacy governors as well
  2017-07-24 16:17   ` Peter Zijlstra
@ 2017-07-28  4:48     ` Viresh Kumar
  0 siblings, 0 replies; 24+ messages in thread
From: Viresh Kumar @ 2017-07-28  4:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rafael Wysocki, Ingo Molnar, linux-pm, Vincent Guittot, linux,
	linux-kernel

On 24-07-17, 18:17, Peter Zijlstra wrote:
> On Wed, Jul 19, 2017 at 03:42:42PM +0530, Viresh Kumar wrote:
> > The policy->transition_delay_us field is used only by the schedutil
> > governor currently, and this field describes how fast the driver wants
> > the cpufreq governor to change CPUs frequency. It should rather be a
> > common thing across all governors, as it doesn't have any schedutil
> > dependency here.
> > 
> > Create a new helper cpufreq_policy_transition_delay_us() to get the
> > transition delay across all governors.
> > 
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > ---
> >  drivers/cpufreq/cpufreq.c          | 15 +++++++++++++++
> >  drivers/cpufreq/cpufreq_governor.c |  9 +--------
> >  include/linux/cpufreq.h            |  1 +
> >  kernel/sched/cpufreq_schedutil.c   | 11 +----------
> >  4 files changed, 18 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 9bf97a366029..c426d21822f7 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -524,6 +524,21 @@ unsigned int cpufreq_driver_resolve_freq(struct cpufreq_policy *policy,
> >  }
> >  EXPORT_SYMBOL_GPL(cpufreq_driver_resolve_freq);
> >  
> > +unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
> > +{
> > +	unsigned int latency;
> > +
> > +	if (policy->transition_delay_us)
> > +		return policy->transition_delay_us;
> > +
> > +	latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
> > +	if (latency)
> > +		return latency * LATENCY_MULTIPLIER;
> > +
> > +	return LATENCY_MULTIPLIER;
> > +}
> > +EXPORT_SYMBOL_GPL(cpufreq_policy_transition_delay_us);
> 
> I realize you're just moving code about, but _why_ are we doing that
> division?

We are doing division by NSEC_PER_USEC because the values of
sampling_rate and rate_limit_us are in us, while transition_latency is
in ns.

We shouldn't be breaking the userspace ABI and so if we want, we can
actually make transition_latency be stored in us instead.

Though I am not sure why are we multiplying with LATENCY_MULTIPLIER
(1000) as that seems to be a huge value.

-- 
viresh

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms
  2017-07-27 16:54       ` Leonard Crestez
@ 2017-07-28  5:28         ` Viresh Kumar
  2017-08-01 17:48           ` Leonard Crestez
  2017-08-16  6:34           ` Viresh Kumar
  0 siblings, 2 replies; 24+ messages in thread
From: Viresh Kumar @ 2017-07-28  5:28 UTC (permalink / raw)
  To: Leonard Crestez, Shawn Guo, Sascha Hauer, Fabio Estevam
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel, Rafael Wysocki

+ IMX maintainers.

On 27-07-17, 19:54, Leonard Crestez wrote:
> On Wed, 2017-07-26 at 11:36 +0530, Viresh Kumar wrote:

> > - Find how much time does it really take to change the frequency of
> >   the CPU. I don't really thing 109 us is the right transition
> >   latency. Use attached patch for that and look for the print message.
> 
> Your patch measures latencies of around 2.5ms, but it can vary between
> 1.6 ms to 3ms from boot-to-boot. This is a lot more than what the
> driver reports. Most transitions seem to be faster.

Wow !!

I was pretty sure all these figures are just made up by some coder :)

> I did a little digging and it seems that a majority of time is always
> spent inside clk_pllv3_wait_lock which spins on a HW bit while doing
> usleep_range(50, 500). I originally thought it was because of
> regulators but the delays involved in that are smaller.
> 
> Measuring wall time on a process that can sleep seems dubious, isn't
> this vulnerable to random delays because of other tasks?

I am not sure I understood that, sorry.

> > Without this patch the sampling rate of ondemand governor will be 109
> > ms. And after this patch it would be capped at 10 ms. Why would that
> > screw up anyone's setup ? I don't have an answer to that right now.
> 
> On a closer look it seems that most of the time is actually spent at
> low cpufreq though (90%+).
> 
> Your change makes it so that even something like "sleep 1; cat
> scaling_cur_freq" raises the frequency to the maximum.

Why?

> This happens
> enough that even if you do it in a loop you will never see the minimum
> frequency. It seems there is enough internal bookkeeping on such a
> wakeup that it takes more than 10ms and enough for a reevaluation of
> cpufreq until cat returns the value?!

At this point I really feel that this is a hardware specific problem
and it was working by chance until now. And I am not sure if we
shouldn't be stopping this patch from getting merged just because of
that.

At least you can teach your distribution to go increase the sampling
rate from userspace to make it all work.

Can you try one more thing? Try using schedutil governor and see how
it behaves ?

> I found this by enabling the power:cpu_frequency tracepoint event and
> checking for deltas with a script. Enabling CPU_FREQ_STAT show this:
> 
> time_in_state:
> 
> 396000 1609

So we still stay at the lowest frequency most of the time.

> 792000 71
> 996000 54
> 
> trans_table:
> 
>    From  :    To
>          :    396000    792000    996000 
>    396000:         0        10         7 
>    792000:        16         0        12 
>    996000:         1        18         0 

What is it that you are trying to point out here? I still see that we
are coming back to 396 MHz quite often.

Maybe can you compare these values with and without this patch to let
us know?

> This is very unexpected but not necessarily wrong.

I haven't understood the problem completely yet :(

-- 
viresh

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms
  2017-07-28  5:28         ` Viresh Kumar
@ 2017-08-01 17:48           ` Leonard Crestez
  2017-08-02  3:23             ` Viresh Kumar
  2017-08-16  6:34           ` Viresh Kumar
  1 sibling, 1 reply; 24+ messages in thread
From: Leonard Crestez @ 2017-08-01 17:48 UTC (permalink / raw)
  To: Viresh Kumar, Rafael Wysocki, Shawn Guo, Sascha Hauer, Fabio Estevam
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel

On Fri, 2017-07-28 at 10:58 +0530, Viresh Kumar wrote:
> On 27-07-17, 19:54, Leonard Crestez wrote:
> > On Wed, 2017-07-26 at 11:36 +0530, Viresh Kumar wrote:
> > > Without this patch the sampling rate of ondemand governor will be 109
> > > ms. And after this patch it would be capped at 10 ms. Why would that
> > > screw up anyone's setup ? I don't have an answer to that right now.
> > On a closer look it seems that most of the time is actually spent at
> > low cpufreq though (90%+).
> > 
> > Your change makes it so that even something like "sleep 1; cat
> > scaling_cur_freq" raises the frequency to the maximum.
> Why?
> 
> > 
> > This happens
> > enough that even if you do it in a loop you will never see the minimum
> > frequency. It seems there is enough internal bookkeeping on such a
> > wakeup that it takes more than 10ms and enough for a reevaluation of
> > cpufreq until cat returns the value?!

> At this point I really feel that this is a hardware specific problem
> and it was working by chance until now. And I am not sure if we
> shouldn't be stopping this patch from getting merged just because of
> that.

Yes, I agree. Something is fishy here but most likely your patch just
expose the problem.

> At least you can teach your distribution to go increase the sampling
> rate from userspace to make it all work.
> 
> Can you try one more thing? Try using schedutil governor and see how
> it behaves ?

I don't have the time to investigate this properly right now.

> > I found this by enabling the power:cpu_frequency tracepoint event and
> > checking for deltas with a script. Enabling CPU_FREQ_STAT show this:
> > 
> > time_in_state:
> > 
> > 396000 1609
> So we still stay at the lowest frequency most of the time.

Yes

> Maybe can you compare these values with and without this patch to let
> us know?

Without the patch it is always at low freq. Sampling at a lower
frequency means spikes get ignored.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms
  2017-08-01 17:48           ` Leonard Crestez
@ 2017-08-02  3:23             ` Viresh Kumar
  0 siblings, 0 replies; 24+ messages in thread
From: Viresh Kumar @ 2017-08-02  3:23 UTC (permalink / raw)
  To: Leonard Crestez
  Cc: Rafael Wysocki, Shawn Guo, Sascha Hauer, Fabio Estevam, linux-pm,
	Vincent Guittot, linux, linux-kernel

On 01-08-17, 20:48, Leonard Crestez wrote:
> > > I found this by enabling the power:cpu_frequency tracepoint event and
> > > checking for deltas with a script. Enabling CPU_FREQ_STAT show this:
> > > 
> > > time_in_state:
> > > 
> > > 396000 1609
> > So we still stay at the lowest frequency most of the time.
> 
> Yes

Aren't these numbers you shared were taken after this patch is applied? What's
wrong with the numbers then ?

> > Maybe can you compare these values with and without this patch to let
> > us know?
> 
> Without the patch it is always at low freq. Sampling at a lower
> frequency means spikes get ignored.

-- 
viresh

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms
  2017-07-28  5:28         ` Viresh Kumar
  2017-08-01 17:48           ` Leonard Crestez
@ 2017-08-16  6:34           ` Viresh Kumar
  2017-08-16  9:42             ` Leonard Crestez
  1 sibling, 1 reply; 24+ messages in thread
From: Viresh Kumar @ 2017-08-16  6:34 UTC (permalink / raw)
  To: Leonard Crestez, Shawn Guo, Sascha Hauer, Fabio Estevam
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel, Rafael Wysocki

On 28-07-17, 10:58, Viresh Kumar wrote:
> At this point I really feel that this is a hardware specific problem
> and it was working by chance until now. And I am not sure if we
> shouldn't be stopping this patch from getting merged just because of
> that.
> 
> At least you can teach your distribution to go increase the sampling
> rate from userspace to make it all work.

Its been 3 weeks since my last email on this thread and no reply yet
from any of the IMX maintainers. Can someone please help here ?

@Shawn: Can you help debugging a bit here, to see what's get screwed
up due to this commit ? Its just that your platform isn't able to
change freq at 10 ms rate.

@Rafael: I am not sure, but should we be stopping this patch because
some hardware isn't able to change freq at 10ms interval and is just
faking the transition delay to start with ?

Maybe we get this merged again and the IMX guys can figure out what's
wrong on their platform and how to fix it ?

-- 
viresh

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms
  2017-08-16  6:34           ` Viresh Kumar
@ 2017-08-16  9:42             ` Leonard Crestez
  2017-08-17  3:38               ` Viresh Kumar
  0 siblings, 1 reply; 24+ messages in thread
From: Leonard Crestez @ 2017-08-16  9:42 UTC (permalink / raw)
  To: Viresh Kumar, Rafael Wysocki, Shawn Guo, Sascha Hauer, Fabio Estevam
  Cc: linux-pm, Vincent Guittot, linux, linux-kernel

On Wed, 2017-08-16 at 12:04 +0530, Viresh Kumar wrote:
> On 28-07-17, 10:58, Viresh Kumar wrote:
> > 
> > At this point I really feel that this is a hardware specific problem
> > and it was working by chance until now. And I am not sure if we
> > shouldn't be stopping this patch from getting merged just because of
> > that.
> > 
> > At least you can teach your distribution to go increase the sampling
> > rate from userspace to make it all work.
> Its been 3 weeks since my last email on this thread and no reply yet
> from any of the IMX maintainers. Can someone please help here ?
> 
> @Shawn: Can you help debugging a bit here, to see what's get screwed
> up due to this commit ? Its just that your platform isn't able to
> change freq at 10 ms rate.
> 
> @Rafael: I am not sure, but should we be stopping this patch because
> some hardware isn't able to change freq at 10ms interval and is just
> faking the transition delay to start with ?
> 
> Maybe we get this merged again and the IMX guys can figure out what's
> wrong on their platform and how to fix it ?

I reported the initial issue but did not have the time to do a more
thorough investigation, this is more complicated than it seems. I said
this before but maybe it got lost:

I don't think the odd behavior I noticed justifies keeping the patch
from merging.

--
Regards,
Leonrd

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms
  2017-08-16  9:42             ` Leonard Crestez
@ 2017-08-17  3:38               ` Viresh Kumar
  0 siblings, 0 replies; 24+ messages in thread
From: Viresh Kumar @ 2017-08-17  3:38 UTC (permalink / raw)
  To: Leonard Crestez
  Cc: Rafael Wysocki, Shawn Guo, Sascha Hauer, Fabio Estevam, linux-pm,
	Vincent Guittot, linux, linux-kernel

On 16-08-17, 12:42, Leonard Crestez wrote:
> I reported the initial issue but did not have the time to do a more
> thorough investigation, this is more complicated than it seems. I said
> this before but maybe it got lost:
> 
> I don't think the odd behavior I noticed justifies keeping the patch
> from merging.

Thanks for that.

I will resend this patch now.

-- 
viresh

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2017-08-17  3:38 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-19 10:12 [PATCH V3 0/9] cpufreq: transition-latency cleanups Viresh Kumar
2017-07-19 10:12 ` [PATCH V3 1/9] cpufreq: governor: Drop min_sampling_rate Viresh Kumar
2017-07-19 10:12 ` [PATCH V3 2/9] cpufreq: Use transition_delay_us for legacy governors as well Viresh Kumar
2017-07-24 16:17   ` Peter Zijlstra
2017-07-28  4:48     ` Viresh Kumar
2017-07-19 10:12 ` [PATCH V3 3/9] cpufreq: Cap the default transition delay value to 10 ms Viresh Kumar
2017-07-25 11:54   ` Leonard Crestez
2017-07-26  0:19     ` Rafael J. Wysocki
2017-07-26  6:06     ` Viresh Kumar
2017-07-27 16:54       ` Leonard Crestez
2017-07-28  5:28         ` Viresh Kumar
2017-08-01 17:48           ` Leonard Crestez
2017-08-02  3:23             ` Viresh Kumar
2017-08-16  6:34           ` Viresh Kumar
2017-08-16  9:42             ` Leonard Crestez
2017-08-17  3:38               ` Viresh Kumar
2017-07-19 10:12 ` [PATCH V3 4/9] cpufreq: Don't set transition_latency for setpolicy drivers Viresh Kumar
2017-07-19 10:12 ` [PATCH V3 5/9] cpufreq: arm_big_little: Make ->get_transition_latency() mandatory Viresh Kumar
2017-07-19 10:12 ` [PATCH V3 6/9] cpufreq: Replace "max_transition_latency" with "dynamic_switching" Viresh Kumar
2017-07-19 10:12 ` [PATCH V3 7/9] cpufreq: schedutil: Set dynamic_switching to true Viresh Kumar
2017-07-19 10:12 ` [PATCH V3 8/9] cpufreq: Add CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING cpufreq driver flag Viresh Kumar
2017-07-19 16:30   ` Dominik Brodowski
2017-07-19 10:12 ` [PATCH V3 9/9] cpufreq: Allow dynamic switching with CPUFREQ_ETERNAL latency Viresh Kumar
2017-07-19 12:42 ` [PATCH V3 0/9] cpufreq: transition-latency cleanups Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).