linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/6] cpufreq: Don't check for max_transition_latency
       [not found] <cover.1498733506.git.viresh.kumar@linaro.org>
@ 2017-06-29 10:59 ` Viresh Kumar
  2017-06-29 10:59 ` [PATCH 2/6] cpufreq: Remove (now) unused code related to max_transition_latency Viresh Kumar
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Viresh Kumar @ 2017-06-29 10:59 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar; +Cc: linux-pm, Vincent Guittot, linux-kernel

This check is there only for drivers that set transition_latency to
CPUFREQ_ETERNAL. And the "max_transition_latency" field is only set for
ondemand and conservative governors (i.e. This code doesn't run for
schedutil governor).

With CPUFREQ_ETERNAL, transition latency becomes around 4 seconds (yeah
its practically impossible). But even then there is nothing particular
in the governor's code that will not function if the latency is over
10ms (TRANSITION_LATENCY_LIMIT). Why disable use of the governor
completely ?

And if its about falling back to the performance governor for such
platforms, then we aren't doing the same for schedutil governor.

This patch get rids of this particular check and let platforms decide if
they want to use legacy governors or not.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 14 --------------
 include/linux/cpufreq.h   |  3 ---
 2 files changed, 17 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 6e7424d12de9..34dbbf3122c8 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1988,20 +1988,6 @@ static int cpufreq_init_governor(struct cpufreq_policy *policy)
 	if (!policy->governor)
 		return -EINVAL;
 
-	if (policy->governor->max_transition_latency &&
-	    policy->cpuinfo.transition_latency >
-	    policy->governor->max_transition_latency) {
-		struct cpufreq_governor *gov = cpufreq_fallback_governor();
-
-		if (gov) {
-			pr_warn("%s governor failed, too long transition latency of HW, fallback to %s governor\n",
-				policy->governor->name, gov->name);
-			policy->governor = gov;
-		} else {
-			return -EINVAL;
-		}
-	}
-
 	if (!try_module_get(policy->governor->owner))
 		return -EINVAL;
 
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 905117bd5012..0818bdc3ebf2 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -487,9 +487,6 @@ static inline unsigned long cpufreq_scale(unsigned long old, u_int div,
  * polling frequency is 1000 times the transition latency of the processor. The
  * ondemand governor will work on any processor with transition latency <= 10ms,
  * using appropriate sampling rate.
- *
- * For CPUs with transition latency > 10ms (mostly drivers with CPUFREQ_ETERNAL)
- * the ondemand governor will not work. All times here are in us (microseconds).
  */
 #define MIN_SAMPLING_RATE_RATIO		(2)
 #define LATENCY_MULTIPLIER		(1000)
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/6] cpufreq: Remove (now) unused code related to max_transition_latency
       [not found] <cover.1498733506.git.viresh.kumar@linaro.org>
  2017-06-29 10:59 ` [PATCH 1/6] cpufreq: Don't check for max_transition_latency Viresh Kumar
@ 2017-06-29 10:59 ` Viresh Kumar
  2017-06-29 10:59 ` [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate Viresh Kumar
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Viresh Kumar @ 2017-06-29 10:59 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar; +Cc: linux-pm, Vincent Guittot, linux-kernel

Get rid of the now unused code after the only user of
max_transition_latency is gone.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c             | 5 -----
 drivers/cpufreq/cpufreq_governor.h    | 1 -
 drivers/cpufreq/cpufreq_performance.c | 6 ------
 include/linux/cpufreq.h               | 5 -----
 4 files changed, 17 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 34dbbf3122c8..0508b62c6f9b 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1969,11 +1969,6 @@ int cpufreq_driver_target(struct cpufreq_policy *policy,
 }
 EXPORT_SYMBOL_GPL(cpufreq_driver_target);
 
-__weak struct cpufreq_governor *cpufreq_fallback_governor(void)
-{
-	return NULL;
-}
-
 static int cpufreq_init_governor(struct cpufreq_policy *policy)
 {
 	int ret;
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 0236ec2cd654..7cbb07512e4c 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -160,7 +160,6 @@ void cpufreq_dbs_governor_limits(struct cpufreq_policy *policy);
 #define CPUFREQ_DBS_GOVERNOR_INITIALIZER(_name_)			\
 	{								\
 		.name = _name_,						\
-		.max_transition_latency	= TRANSITION_LATENCY_LIMIT,	\
 		.owner = THIS_MODULE,					\
 		.init = cpufreq_dbs_governor_init,			\
 		.exit = cpufreq_dbs_governor_exit,			\
diff --git a/drivers/cpufreq/cpufreq_performance.c b/drivers/cpufreq/cpufreq_performance.c
index dafb679adc58..73bc6c28b956 100644
--- a/drivers/cpufreq/cpufreq_performance.c
+++ b/drivers/cpufreq/cpufreq_performance.c
@@ -44,12 +44,6 @@ struct cpufreq_governor *cpufreq_default_governor(void)
 	return &cpufreq_gov_performance;
 }
 #endif
-#ifndef CONFIG_CPU_FREQ_GOV_PERFORMANCE_MODULE
-struct cpufreq_governor *cpufreq_fallback_governor(void)
-{
-	return &cpufreq_gov_performance;
-}
-#endif
 
 MODULE_AUTHOR("Dominik Brodowski <linux@brodo.de>");
 MODULE_DESCRIPTION("CPUfreq policy governor 'performance'");
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 0818bdc3ebf2..54dfa1bdf138 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -491,7 +491,6 @@ static inline unsigned long cpufreq_scale(unsigned long old, u_int div,
 #define MIN_SAMPLING_RATE_RATIO		(2)
 #define LATENCY_MULTIPLIER		(1000)
 #define MIN_LATENCY_MULTIPLIER		(20)
-#define TRANSITION_LATENCY_LIMIT	(10 * 1000 * 1000)
 
 struct cpufreq_governor {
 	char	name[CPUFREQ_NAME_LEN];
@@ -504,9 +503,6 @@ struct cpufreq_governor {
 					 char *buf);
 	int	(*store_setspeed)	(struct cpufreq_policy *policy,
 					 unsigned int freq);
-	unsigned int max_transition_latency; /* HW must be able to switch to
-			next freq faster than this value in nano secs or we
-			will fallback to performance governor */
 	struct list_head	governor_list;
 	struct module		*owner;
 };
@@ -526,7 +522,6 @@ int cpufreq_register_governor(struct cpufreq_governor *governor);
 void cpufreq_unregister_governor(struct cpufreq_governor *governor);
 
 struct cpufreq_governor *cpufreq_default_governor(void);
-struct cpufreq_governor *cpufreq_fallback_governor(void);
 
 static inline void cpufreq_policy_apply_limits(struct cpufreq_policy *policy)
 {
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate
       [not found] <cover.1498733506.git.viresh.kumar@linaro.org>
  2017-06-29 10:59 ` [PATCH 1/6] cpufreq: Don't check for max_transition_latency Viresh Kumar
  2017-06-29 10:59 ` [PATCH 2/6] cpufreq: Remove (now) unused code related to max_transition_latency Viresh Kumar
@ 2017-06-29 10:59 ` Viresh Kumar
  2017-06-29 18:01   ` Dominik Brodowski
  2017-06-29 10:59 ` [PATCH 4/6] cpufreq: Use transition_delay_us for legacy governors as well Viresh Kumar
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Viresh Kumar @ 2017-06-29 10:59 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar
  Cc: linux-pm, Vincent Guittot, linux-doc, linux-kernel

The cpufreq core and governors aren't supposed to set a limit on how
fast we want to try changing the frequency. This is currently done for
the legacy governors with help of min_sampling_rate.

At worst, we may end up setting the sampling rate to a value lower than
the rate at which frequency can be changed and then one of the CPUs in
the policy will be only changing frequency for ever.

But that is something for the user to decide and there is no need to
have special handling for such cases in the core. Leave it for the user
to figure out.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 Documentation/admin-guide/pm/cpufreq.rst |  8 --------
 drivers/cpufreq/cpufreq_conservative.c   |  6 ------
 drivers/cpufreq/cpufreq_governor.c       | 10 ++--------
 drivers/cpufreq/cpufreq_governor.h       |  1 -
 drivers/cpufreq/cpufreq_ondemand.c       | 12 ------------
 include/linux/cpufreq.h                  |  2 --
 6 files changed, 2 insertions(+), 37 deletions(-)

diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
index 09aa2e949787..6adbe1ed58b9 100644
--- a/Documentation/admin-guide/pm/cpufreq.rst
+++ b/Documentation/admin-guide/pm/cpufreq.rst
@@ -471,14 +471,6 @@ it is allowed to use (the ``scaling_max_freq`` policy limit).
 
 	# echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) > ondemand/sampling_rate
 
-
-``min_sampling_rate``
-	The minimum value of ``sampling_rate``.
-
-	Equal to 10000 (10 ms) if :c:macro:`CONFIG_NO_HZ_COMMON` and
-	:c:data:`tick_nohz_active` are both set or to 20 times the value of
-	:c:data:`jiffies` in microseconds otherwise.
-
 ``up_threshold``
 	If the estimated CPU load is above this value (in percent), the governor
 	will set the frequency to the maximum value allowed for the policy.
diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 88220ff3e1c2..f20f20a77d4d 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -246,7 +246,6 @@ gov_show_one_common(sampling_rate);
 gov_show_one_common(sampling_down_factor);
 gov_show_one_common(up_threshold);
 gov_show_one_common(ignore_nice_load);
-gov_show_one_common(min_sampling_rate);
 gov_show_one(cs, down_threshold);
 gov_show_one(cs, freq_step);
 
@@ -254,12 +253,10 @@ gov_attr_rw(sampling_rate);
 gov_attr_rw(sampling_down_factor);
 gov_attr_rw(up_threshold);
 gov_attr_rw(ignore_nice_load);
-gov_attr_ro(min_sampling_rate);
 gov_attr_rw(down_threshold);
 gov_attr_rw(freq_step);
 
 static struct attribute *cs_attributes[] = {
-	&min_sampling_rate.attr,
 	&sampling_rate.attr,
 	&sampling_down_factor.attr,
 	&up_threshold.attr,
@@ -297,10 +294,7 @@ static int cs_init(struct dbs_data *dbs_data)
 	dbs_data->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
 	dbs_data->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
 	dbs_data->ignore_nice_load = 0;
-
 	dbs_data->tuners = tuners;
-	dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
-		jiffies_to_usecs(10);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 47e24b5384b3..858081f9c3d7 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -47,14 +47,11 @@ ssize_t store_sampling_rate(struct gov_attr_set *attr_set, const char *buf,
 {
 	struct dbs_data *dbs_data = to_dbs_data(attr_set);
 	struct policy_dbs_info *policy_dbs;
-	unsigned int rate;
 	int ret;
-	ret = sscanf(buf, "%u", &rate);
+	ret = sscanf(buf, "%u", &dbs_data->sampling_rate);
 	if (ret != 1)
 		return -EINVAL;
 
-	dbs_data->sampling_rate = max(rate, dbs_data->min_sampling_rate);
-
 	/*
 	 * We are operating under dbs_data->mutex and so the list and its
 	 * entries can't be freed concurrently.
@@ -437,10 +434,7 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)
 		latency = 1;
 
 	/* Bring kernel and HW constraints together */
-	dbs_data->min_sampling_rate = max(dbs_data->min_sampling_rate,
-					  MIN_LATENCY_MULTIPLIER * latency);
-	dbs_data->sampling_rate = max(dbs_data->min_sampling_rate,
-				      LATENCY_MULTIPLIER * latency);
+	dbs_data->sampling_rate = LATENCY_MULTIPLIER * latency;
 
 	if (!have_governor_per_policy())
 		gov->gdbs_data = dbs_data;
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 7cbb07512e4c..06d9f90ede93 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -41,7 +41,6 @@ enum {OD_NORMAL_SAMPLE, OD_SUB_SAMPLE};
 struct dbs_data {
 	struct gov_attr_set attr_set;
 	void *tuners;
-	unsigned int min_sampling_rate;
 	unsigned int ignore_nice_load;
 	unsigned int sampling_rate;
 	unsigned int sampling_down_factor;
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 3937acf7e026..6b423eebfd5d 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -319,7 +319,6 @@ gov_show_one_common(sampling_rate);
 gov_show_one_common(up_threshold);
 gov_show_one_common(sampling_down_factor);
 gov_show_one_common(ignore_nice_load);
-gov_show_one_common(min_sampling_rate);
 gov_show_one_common(io_is_busy);
 gov_show_one(od, powersave_bias);
 
@@ -329,10 +328,8 @@ gov_attr_rw(up_threshold);
 gov_attr_rw(sampling_down_factor);
 gov_attr_rw(ignore_nice_load);
 gov_attr_rw(powersave_bias);
-gov_attr_ro(min_sampling_rate);
 
 static struct attribute *od_attributes[] = {
-	&min_sampling_rate.attr,
 	&sampling_rate.attr,
 	&up_threshold.attr,
 	&sampling_down_factor.attr,
@@ -373,17 +370,8 @@ static int od_init(struct dbs_data *dbs_data)
 	if (idle_time != -1ULL) {
 		/* Idle micro accounting is supported. Use finer thresholds */
 		dbs_data->up_threshold = MICRO_FREQUENCY_UP_THRESHOLD;
-		/*
-		 * In nohz/micro accounting case we set the minimum frequency
-		 * not depending on HZ, but fixed (very low).
-		*/
-		dbs_data->min_sampling_rate = MICRO_FREQUENCY_MIN_SAMPLE_RATE;
 	} else {
 		dbs_data->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
-
-		/* For correct statistics, we need 10 ticks for each measure */
-		dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
-			jiffies_to_usecs(10);
 	}
 
 	dbs_data->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 54dfa1bdf138..083e7f3c23dd 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -488,9 +488,7 @@ static inline unsigned long cpufreq_scale(unsigned long old, u_int div,
  * ondemand governor will work on any processor with transition latency <= 10ms,
  * using appropriate sampling rate.
  */
-#define MIN_SAMPLING_RATE_RATIO		(2)
 #define LATENCY_MULTIPLIER		(1000)
-#define MIN_LATENCY_MULTIPLIER		(20)
 
 struct cpufreq_governor {
 	char	name[CPUFREQ_NAME_LEN];
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/6] cpufreq: Use transition_delay_us for legacy governors as well
       [not found] <cover.1498733506.git.viresh.kumar@linaro.org>
                   ` (2 preceding siblings ...)
  2017-06-29 10:59 ` [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate Viresh Kumar
@ 2017-06-29 10:59 ` Viresh Kumar
  2017-06-29 10:59 ` [PATCH 5/6] cpufreq: Cap the default transition delay value to 10 ms Viresh Kumar
  2017-06-29 10:59 ` [PATCH 6/6] cpufreq: arm_big_little: Make ->get_transition_latency() mandatory Viresh Kumar
  5 siblings, 0 replies; 11+ messages in thread
From: Viresh Kumar @ 2017-06-29 10:59 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Ingo Molnar, Peter Zijlstra
  Cc: linux-pm, Vincent Guittot, linux-kernel

The policy->transition_delay_us field is used only by the schedutil
governor currently, but it should rather be a common thing across all
governors.

Create a new helper cpufreq_policy_transition_delay_us() to get the
transition delay across all governors.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c |  9 +--------
 include/linux/cpufreq.h            | 15 +++++++++++++++
 kernel/sched/cpufreq_schedutil.c   | 11 +----------
 3 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 858081f9c3d7..eed069ecfd5e 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -389,7 +389,6 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)
 	struct dbs_governor *gov = dbs_governor_of(policy);
 	struct dbs_data *dbs_data;
 	struct policy_dbs_info *policy_dbs;
-	unsigned int latency;
 	int ret = 0;
 
 	/* State should be equivalent to EXIT */
@@ -428,13 +427,7 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)
 	if (ret)
 		goto free_policy_dbs_info;
 
-	/* policy latency is in ns. Convert it to us first */
-	latency = policy->cpuinfo.transition_latency / 1000;
-	if (latency == 0)
-		latency = 1;
-
-	/* Bring kernel and HW constraints together */
-	dbs_data->sampling_rate = LATENCY_MULTIPLIER * latency;
+	dbs_data->sampling_rate = cpufreq_policy_transition_delay_us(policy);
 
 	if (!have_governor_per_policy())
 		gov->gdbs_data = dbs_data;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 083e7f3c23dd..dee6fe3d8af0 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -529,6 +529,21 @@ static inline void cpufreq_policy_apply_limits(struct cpufreq_policy *policy)
 		__cpufreq_driver_target(policy, policy->min, CPUFREQ_RELATION_L);
 }
 
+static inline unsigned int
+cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
+{
+	unsigned int delay_us = LATENCY_MULTIPLIER, latency;
+
+	if (policy->transition_delay_us)
+		return policy->transition_delay_us;
+
+	latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
+	if (latency)
+		delay_us *= latency;
+
+	return delay_us;
+}
+
 /* Governor attribute set */
 struct gov_attr_set {
 	struct kobject kobj;
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 076a2e31951c..3bdec2a08e1e 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -528,16 +528,7 @@ static int sugov_init(struct cpufreq_policy *policy)
 		goto stop_kthread;
 	}
 
-	if (policy->transition_delay_us) {
-		tunables->rate_limit_us = policy->transition_delay_us;
-	} else {
-		unsigned int lat;
-
-		tunables->rate_limit_us = LATENCY_MULTIPLIER;
-		lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
-		if (lat)
-			tunables->rate_limit_us *= lat;
-	}
+	tunables->rate_limit_us = cpufreq_policy_transition_delay_us(policy);
 
 	policy->governor_data = sg_policy;
 	sg_policy->tunables = tunables;
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 5/6] cpufreq: Cap the default transition delay value to 10 ms
       [not found] <cover.1498733506.git.viresh.kumar@linaro.org>
                   ` (3 preceding siblings ...)
  2017-06-29 10:59 ` [PATCH 4/6] cpufreq: Use transition_delay_us for legacy governors as well Viresh Kumar
@ 2017-06-29 10:59 ` Viresh Kumar
  2017-06-29 10:59 ` [PATCH 6/6] cpufreq: arm_big_little: Make ->get_transition_latency() mandatory Viresh Kumar
  5 siblings, 0 replies; 11+ messages in thread
From: Viresh Kumar @ 2017-06-29 10:59 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar; +Cc: linux-pm, Vincent Guittot, linux-kernel

If transition_delay_us isn't defined by the cpufreq driver, the default
value of transition delay (time after which the cpufreq governor will
try updating the frequency again) is currently calculated by multiplying
transition_latency (nsec) with LATENCY_MULTIPLIER (1000) and then
converting this time to usec. That gives the exact same value as
transition_latency, just that the time unit is usec instead of nsec.

With acpi-cpufreq for example, transition_latency is set to around 10
usec and we get transition delay as 10 ms. Which seems to be a
reasonable amount of time to reevaluate the frequency again.

But for platforms where frequency switching isn't that fast (like ARM),
the transition_latency varies from 500 usec to 3 ms, and the transition
delay becomes 500 ms to 3 seconds. Of course, that is a pretty bad
default value to start with.

We can try to come across a better formula (instead of multiplying with
LATENCY_MULTIPLIER) to solve this problem, but will that be worth it ?

This patch tries a simple approach and caps the maximum value of default
transition delay to 10 ms. Of course, userspace can still come in and
change this value anytime or individual drivers can rather provide
transition_delay_us instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 include/linux/cpufreq.h | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index dee6fe3d8af0..cf1faba29113 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -541,7 +541,17 @@ cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
 	if (latency)
 		delay_us *= latency;
 
-	return delay_us;
+	/*
+	 * For platforms that can change the frequency very fast (< 10 us),
+	 * the above formula gives a decent transition delay. But for platforms
+	 * where transition_latency is in milliseconds, it ends up giving
+	 * unrealistic values.
+	 *
+	 * Cap the default transition delay to 10 ms, which seems to be a
+	 * reasonable amount of time after which we should reevaluate the
+	 * frequency.
+	 */
+	return min(delay_us, (unsigned int)10000);
 }
 
 /* Governor attribute set */
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 6/6] cpufreq: arm_big_little: Make ->get_transition_latency() mandatory
       [not found] <cover.1498733506.git.viresh.kumar@linaro.org>
                   ` (4 preceding siblings ...)
  2017-06-29 10:59 ` [PATCH 5/6] cpufreq: Cap the default transition delay value to 10 ms Viresh Kumar
@ 2017-06-29 10:59 ` Viresh Kumar
  5 siblings, 0 replies; 11+ messages in thread
From: Viresh Kumar @ 2017-06-29 10:59 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Sudeep Holla
  Cc: linux-pm, Vincent Guittot, linux-kernel

All users of arm_big_little driver are defining it and there is no need
to keep it optional. Make it mandatory.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/arm_big_little.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c
index 418042201e6d..d1eb2a53b61f 100644
--- a/drivers/cpufreq/arm_big_little.c
+++ b/drivers/cpufreq/arm_big_little.c
@@ -483,11 +483,8 @@ static int bL_cpufreq_init(struct cpufreq_policy *policy)
 		return ret;
 	}
 
-	if (arm_bL_ops->get_transition_latency)
-		policy->cpuinfo.transition_latency =
-			arm_bL_ops->get_transition_latency(cpu_dev);
-	else
-		policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
+	policy->cpuinfo.transition_latency =
+				arm_bL_ops->get_transition_latency(cpu_dev);
 
 	if (is_bL_switching_enabled())
 		per_cpu(cpu_last_req_freq, policy->cpu) = clk_get_cpu_rate(policy->cpu);
@@ -622,7 +619,8 @@ int bL_cpufreq_register(struct cpufreq_arm_bL_ops *ops)
 		return -EBUSY;
 	}
 
-	if (!ops || !strlen(ops->name) || !ops->init_opp_table) {
+	if (!ops || !strlen(ops->name) || !ops->init_opp_table ||
+	    !ops->get_transition_latency) {
 		pr_err("%s: Invalid arm_bL_ops, exiting\n", __func__);
 		return -ENODEV;
 	}
-- 
2.13.0.71.gd7076ec9c9cb

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate
  2017-06-29 10:59 ` [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate Viresh Kumar
@ 2017-06-29 18:01   ` Dominik Brodowski
  2017-06-30  3:34     ` Viresh Kumar
  0 siblings, 1 reply; 11+ messages in thread
From: Dominik Brodowski @ 2017-06-29 18:01 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, linux-pm, Vincent Guittot, linux-doc, linux-kernel

On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote:
> The cpufreq core and governors aren't supposed to set a limit on how
> fast we want to try changing the frequency. This is currently done for
> the legacy governors with help of min_sampling_rate.
> 
> At worst, we may end up setting the sampling rate to a value lower than
> the rate at which frequency can be changed and then one of the CPUs in
> the policy will be only changing frequency for ever.

Is it safe to issue requests to change the CPU frequency so frequently, even
on historic hardware such as speedstep-{ich,smi,centrino}? In the past,
these checks more or less disallowed the running of dynamic frequency
scaling at least on speedstep-smi[*], but maybe on a few other platforms as
well. That's why I am curious on whether this may break systems potentially
on a hardware level if the hardware was not designed to do dynamic frequency
scaling (and not just frequency switches on battery/AC).

Best,
	Dominik

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate
  2017-06-29 18:01   ` Dominik Brodowski
@ 2017-06-30  3:34     ` Viresh Kumar
  2017-06-30  4:53       ` Dominik Brodowski
  0 siblings, 1 reply; 11+ messages in thread
From: Viresh Kumar @ 2017-06-30  3:34 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: Rafael Wysocki, linux-pm, Vincent Guittot, linux-doc, linux-kernel

On 29-06-17, 20:01, Dominik Brodowski wrote:
> On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote:
> > The cpufreq core and governors aren't supposed to set a limit on how
> > fast we want to try changing the frequency. This is currently done for
> > the legacy governors with help of min_sampling_rate.
> > 
> > At worst, we may end up setting the sampling rate to a value lower than
> > the rate at which frequency can be changed and then one of the CPUs in
> > the policy will be only changing frequency for ever.
> 
> Is it safe to issue requests to change the CPU frequency so frequently,

Well, I assumed so. I am not sure the hardware would break though.
Overheating ?

> even
> on historic hardware such as speedstep-{ich,smi,centrino}? In the past,
> these checks more or less disallowed the running of dynamic frequency
> scaling at least on speedstep-smi[*],

We must by doing dynamic freq scaling even without this patch. I don't
see why you say the above then.

All we do here is that we get rid of the limit on how soon we can
change the freq again.

> but maybe on a few other platforms as
> well. That's why I am curious on whether this may break systems potentially
> on a hardware level if the hardware was not designed to do dynamic frequency
> scaling (and not just frequency switches on battery/AC).

Honestly I am not sure if any hardware can break or not, just because
of this commit.

-- 
viresh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate
  2017-06-30  3:34     ` Viresh Kumar
@ 2017-06-30  4:53       ` Dominik Brodowski
  2017-06-30  5:40         ` Viresh Kumar
  0 siblings, 1 reply; 11+ messages in thread
From: Dominik Brodowski @ 2017-06-30  4:53 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, linux-pm, Vincent Guittot, linux-doc, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2115 bytes --]

On Fri, Jun 30, 2017 at 09:04:25AM +0530, Viresh Kumar wrote:
> On 29-06-17, 20:01, Dominik Brodowski wrote:
> > On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote:
> > > The cpufreq core and governors aren't supposed to set a limit on how
> > > fast we want to try changing the frequency. This is currently done for
> > > the legacy governors with help of min_sampling_rate.
> > > 
> > > At worst, we may end up setting the sampling rate to a value lower than
> > > the rate at which frequency can be changed and then one of the CPUs in
> > > the policy will be only changing frequency for ever.
> > 
> > Is it safe to issue requests to change the CPU frequency so frequently,
> 
> Well, I assumed so. I am not sure the hardware would break though.
> Overheating ?
> 
> > even
> > on historic hardware such as speedstep-{ich,smi,centrino}? In the past,
> > these checks more or less disallowed the running of dynamic frequency
> > scaling at least on speedstep-smi[*],
> 
> We must by doing dynamic freq scaling even without this patch. I don't
> see why you say the above then.
> 
> All we do here is that we get rid of the limit on how soon we can
> change the freq again.

Well, as I understand it, first generation "speedstep" was designed more or
less to switch frequencies only when AC power was lost or restored.

The Linux implementation merely said: "no on-the-fly changes", but switch
frequencies whenever a user explicitly requested such a change (presumably
only every once in an unspecified while).

This same reasoning may be present in other drivers using CPUFREQ_ETERNAL.

> > but maybe on a few other platforms as
> > well. That's why I am curious on whether this may break systems potentially
> > on a hardware level if the hardware was not designed to do dynamic frequency
> > scaling (and not just frequency switches on battery/AC).
> 
> Honestly I am not sure if any hardware can break or not, just because
> of this commit.

I am not *sure* either, I am just worried of the consequences of doing
things out-of-spec...

Best
	Dominik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate
  2017-06-30  4:53       ` Dominik Brodowski
@ 2017-06-30  5:40         ` Viresh Kumar
  2017-07-02  7:23           ` Dominik Brodowski
  0 siblings, 1 reply; 11+ messages in thread
From: Viresh Kumar @ 2017-06-30  5:40 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: Rafael Wysocki, linux-pm, Vincent Guittot, linux-doc, linux-kernel

On 30-06-17, 06:53, Dominik Brodowski wrote:
> On Fri, Jun 30, 2017 at 09:04:25AM +0530, Viresh Kumar wrote:
> > On 29-06-17, 20:01, Dominik Brodowski wrote:
> > > On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote:
> > > > The cpufreq core and governors aren't supposed to set a limit on how
> > > > fast we want to try changing the frequency. This is currently done for
> > > > the legacy governors with help of min_sampling_rate.
> > > > 
> > > > At worst, we may end up setting the sampling rate to a value lower than
> > > > the rate at which frequency can be changed and then one of the CPUs in
> > > > the policy will be only changing frequency for ever.
> > > 
> > > Is it safe to issue requests to change the CPU frequency so frequently,
> > 
> > Well, I assumed so. I am not sure the hardware would break though.
> > Overheating ?
> > 
> > > even
> > > on historic hardware such as speedstep-{ich,smi,centrino}? In the past,

speedstep-smi is the only one which sets transition_latency to
CPUFREQ_ETERNAL and the others are putting some meaningful values. So
yes, they should be doing DVFS dynamically.

> > > these checks more or less disallowed the running of dynamic frequency
> > > scaling at least on speedstep-smi[*],
> > 
> > We must by doing dynamic freq scaling even without this patch. I don't
> > see why you say the above then.
> > 
> > All we do here is that we get rid of the limit on how soon we can
> > change the freq again.
> 
> Well, as I understand it, first generation "speedstep" was designed more or
> less to switch frequencies only when AC power was lost or restored.
> 
> The Linux implementation merely said: "no on-the-fly changes", but switch
> frequencies whenever a user explicitly requested such a change (presumably
> only every once in an unspecified while).
> 
> This same reasoning may be present in other drivers using CPUFREQ_ETERNAL.

Thanks for the explanation here and I am convinced that this series
has at least done one thing wrong. And that is removal of
max_transition_latency from governors and allowing ondemand to run on
such platforms (which may end up breaking them).

So I will actually modify that patch and set max_transition_latency to
CPUFREQ_ETERNAL for ondemand/conservative instead of 10ms. Also we
should do the same for schedutil as well, so that will also use the
max_transition_latency field.

But I hope, this patch will still be fine. Right ?

> I am not *sure* either, I am just worried of the consequences of doing
> things out-of-spec...

Thanks for your inputs Dominik.

-- 
viresh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate
  2017-06-30  5:40         ` Viresh Kumar
@ 2017-07-02  7:23           ` Dominik Brodowski
  0 siblings, 0 replies; 11+ messages in thread
From: Dominik Brodowski @ 2017-07-02  7:23 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, linux-pm, Vincent Guittot, linux-doc, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2717 bytes --]

On Fri, Jun 30, 2017 at 11:10:33AM +0530, Viresh Kumar wrote:
> On 30-06-17, 06:53, Dominik Brodowski wrote:
> > On Fri, Jun 30, 2017 at 09:04:25AM +0530, Viresh Kumar wrote:
> > > On 29-06-17, 20:01, Dominik Brodowski wrote:
> > > > On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote:
> > > > > The cpufreq core and governors aren't supposed to set a limit on how
> > > > > fast we want to try changing the frequency. This is currently done for
> > > > > the legacy governors with help of min_sampling_rate.
> > > > > 
> > > > > At worst, we may end up setting the sampling rate to a value lower than
> > > > > the rate at which frequency can be changed and then one of the CPUs in
> > > > > the policy will be only changing frequency for ever.
> > > > 
> > > > Is it safe to issue requests to change the CPU frequency so frequently,
> > > 
> > > Well, I assumed so. I am not sure the hardware would break though.
> > > Overheating ?
> > > 
> > > > even
> > > > on historic hardware such as speedstep-{ich,smi,centrino}? In the past,
> 
> speedstep-smi is the only one which sets transition_latency to
> CPUFREQ_ETERNAL and the others are putting some meaningful values. So
> yes, they should be doing DVFS dynamically.
> 
> > > > these checks more or less disallowed the running of dynamic frequency
> > > > scaling at least on speedstep-smi[*],
> > > 
> > > We must by doing dynamic freq scaling even without this patch. I don't
> > > see why you say the above then.
> > > 
> > > All we do here is that we get rid of the limit on how soon we can
> > > change the freq again.
> > 
> > Well, as I understand it, first generation "speedstep" was designed more or
> > less to switch frequencies only when AC power was lost or restored.
> > 
> > The Linux implementation merely said: "no on-the-fly changes", but switch
> > frequencies whenever a user explicitly requested such a change (presumably
> > only every once in an unspecified while).
> > 
> > This same reasoning may be present in other drivers using CPUFREQ_ETERNAL.
> 
> Thanks for the explanation here and I am convinced that this series
> has at least done one thing wrong. And that is removal of
> max_transition_latency from governors and allowing ondemand to run on
> such platforms (which may end up breaking them).
> 
> So I will actually modify that patch and set max_transition_latency to
> CPUFREQ_ETERNAL for ondemand/conservative instead of 10ms. Also we
> should do the same for schedutil as well, so that will also use the
> max_transition_latency field.
> 
> But I hope, this patch will still be fine. Right ?

Indeed, I have no comments otherwise. Thanks!

Best
	Dominik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-07-02  7:24 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <cover.1498733506.git.viresh.kumar@linaro.org>
2017-06-29 10:59 ` [PATCH 1/6] cpufreq: Don't check for max_transition_latency Viresh Kumar
2017-06-29 10:59 ` [PATCH 2/6] cpufreq: Remove (now) unused code related to max_transition_latency Viresh Kumar
2017-06-29 10:59 ` [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate Viresh Kumar
2017-06-29 18:01   ` Dominik Brodowski
2017-06-30  3:34     ` Viresh Kumar
2017-06-30  4:53       ` Dominik Brodowski
2017-06-30  5:40         ` Viresh Kumar
2017-07-02  7:23           ` Dominik Brodowski
2017-06-29 10:59 ` [PATCH 4/6] cpufreq: Use transition_delay_us for legacy governors as well Viresh Kumar
2017-06-29 10:59 ` [PATCH 5/6] cpufreq: Cap the default transition delay value to 10 ms Viresh Kumar
2017-06-29 10:59 ` [PATCH 6/6] cpufreq: arm_big_little: Make ->get_transition_latency() mandatory Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).