* [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions
@ 2015-06-19 11:48 Viresh Kumar
  2015-06-19 11:48 ` [PATCH V2 01/10] cpufreq: governor: Name delayed-work as dwork Viresh Kumar
                   ` (12 more replies)
  0 siblings, 13 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-19 11:48 UTC (permalink / raw)
  To: Rafael Wysocki, Preeti U Murthy, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan, Viresh Kumar

Hi Rafael/Preeti,

This is another attempt to fix the crashes reported by Preeti. The
patches work quite well for me now, and I hope they will work well for
Preeti as well :)

Patches [1-7,9] have already been reviewed by Preeti.

The first 5 patches are minor cleanups; the 6th and 7th optimize a few
things to make the code less complex.

Patches 8-10 actually solve (or try to solve :)) the synchronization
problems, i.e. the crashes I was getting.

V1->V2:
- 7/11 is dropped and only 8/11 is updated, which is 8/10 now.
- Avoid taking the same mutex in both cpufreq_governor_dbs() and the
  work handler, as that gave us some lockdep reports: classic ABBA stuff.
- And so timer_mutex, responsible for synchronizing governor
  work handlers, is kept as is.
- The later patches are almost the same, with minor updates.
- Invalid state transitions are now detected better, with improved checks.
- I have run plenty of tests on my Exynos dual-core board and failed to
  crash at all. Not that I wanted to crash :)

Rebased over pm/bleeding-edge.

Viresh Kumar (10):
  cpufreq: governor: Name delayed-work as dwork
  cpufreq: governor: Drop unused field 'cpu'
  cpufreq: governor: Rename 'cpu_dbs_common_info' to 'cpu_dbs_info'
  cpufreq: governor: name pointer to cpu_dbs_info as 'cdbs'
  cpufreq: governor: rename cur_policy as policy
  cpufreq: governor: Keep single copy of information common to
    policy->cpus
  cpufreq: governor: split out common part of {cs|od}_dbs_timer()
  cpufreq: governor: Avoid invalid states with additional checks
  cpufreq: governor: Don't WARN on invalid states
  cpufreq: propagate errors returned from __cpufreq_governor()

 drivers/cpufreq/cpufreq.c              |  22 ++--
 drivers/cpufreq/cpufreq_conservative.c |  25 ++---
 drivers/cpufreq/cpufreq_governor.c     | 196 ++++++++++++++++++++++++---------
 drivers/cpufreq/cpufreq_governor.h     |  40 ++++---
 drivers/cpufreq/cpufreq_ondemand.c     |  67 ++++++-----
 5 files changed, 220 insertions(+), 130 deletions(-)

-- 
2.4.0

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH V2 01/10] cpufreq: governor: Name delayed-work as dwork
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
@ 2015-06-19 11:48 ` Viresh Kumar
  2015-06-19 11:48 ` [PATCH V2 02/10] cpufreq: governor: Drop unused field 'cpu' Viresh Kumar
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-19 11:48 UTC (permalink / raw)
  To: Rafael Wysocki, Preeti U Murthy, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan, Viresh Kumar

The delayed work was named 'work', so accessing the work_struct within
it reads as work.work, which is not very readable. Rename the
delayed_work member to 'dwork'.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c | 2 +-
 drivers/cpufreq/cpufreq_governor.c     | 6 +++---
 drivers/cpufreq/cpufreq_governor.h     | 2 +-
 drivers/cpufreq/cpufreq_ondemand.c     | 8 ++++----
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index c86a10c30912..0e3ec1d968d9 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -105,7 +105,7 @@ static void cs_check_cpu(int cpu, unsigned int load)
 static void cs_dbs_timer(struct work_struct *work)
 {
 	struct cs_cpu_dbs_info_s *dbs_info = container_of(work,
-			struct cs_cpu_dbs_info_s, cdbs.work.work);
+			struct cs_cpu_dbs_info_s, cdbs.dwork.work);
 	unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;
 	struct cs_cpu_dbs_info_s *core_dbs_info = &per_cpu(cs_cpu_dbs_info,
 			cpu);
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 57a39f8a92b7..6024bbc782c0 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -165,7 +165,7 @@ static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
 {
 	struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
 
-	mod_delayed_work_on(cpu, system_wq, &cdbs->work, delay);
+	mod_delayed_work_on(cpu, system_wq, &cdbs->dwork, delay);
 }
 
 void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
@@ -204,7 +204,7 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
 
 	for_each_cpu(i, policy->cpus) {
 		cdbs = dbs_data->cdata->get_cpu_cdbs(i);
-		cancel_delayed_work_sync(&cdbs->work);
+		cancel_delayed_work_sync(&cdbs->dwork);
 	}
 }
 
@@ -367,7 +367,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 			j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 
 		mutex_init(&j_cdbs->timer_mutex);
-		INIT_DEFERRABLE_WORK(&j_cdbs->work, cdata->gov_dbs_timer);
+		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, cdata->gov_dbs_timer);
 	}
 
 	if (cdata->governor == GOV_CONSERVATIVE) {
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 34736f5e869d..352eecaae789 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -142,7 +142,7 @@ struct cpu_dbs_common_info {
 	 */
 	unsigned int prev_load;
 	struct cpufreq_policy *cur_policy;
-	struct delayed_work work;
+	struct delayed_work dwork;
 	/*
 	 * percpu mutex that serializes governor limit change with gov_dbs_timer
 	 * invocation. We do not want gov_dbs_timer to run when user is changing
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 3c1e10f2304c..830aef1db8c3 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -194,7 +194,7 @@ static void od_check_cpu(int cpu, unsigned int load)
 static void od_dbs_timer(struct work_struct *work)
 {
 	struct od_cpu_dbs_info_s *dbs_info =
-		container_of(work, struct od_cpu_dbs_info_s, cdbs.work.work);
+		container_of(work, struct od_cpu_dbs_info_s, cdbs.dwork.work);
 	unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;
 	struct od_cpu_dbs_info_s *core_dbs_info = &per_cpu(od_cpu_dbs_info,
 			cpu);
@@ -275,18 +275,18 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
 
 		mutex_lock(&dbs_info->cdbs.timer_mutex);
 
-		if (!delayed_work_pending(&dbs_info->cdbs.work)) {
+		if (!delayed_work_pending(&dbs_info->cdbs.dwork)) {
 			mutex_unlock(&dbs_info->cdbs.timer_mutex);
 			continue;
 		}
 
 		next_sampling = jiffies + usecs_to_jiffies(new_rate);
-		appointed_at = dbs_info->cdbs.work.timer.expires;
+		appointed_at = dbs_info->cdbs.dwork.timer.expires;
 
 		if (time_before(next_sampling, appointed_at)) {
 
 			mutex_unlock(&dbs_info->cdbs.timer_mutex);
-			cancel_delayed_work_sync(&dbs_info->cdbs.work);
+			cancel_delayed_work_sync(&dbs_info->cdbs.dwork);
 			mutex_lock(&dbs_info->cdbs.timer_mutex);
 
 			gov_queue_work(dbs_data, dbs_info->cdbs.cur_policy,
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V2 02/10] cpufreq: governor: Drop unused field 'cpu'
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
  2015-06-19 11:48 ` [PATCH V2 01/10] cpufreq: governor: Name delayed-work as dwork Viresh Kumar
@ 2015-06-19 11:48 ` Viresh Kumar
  2015-06-19 11:48 ` [PATCH V2 03/10] cpufreq: governor: Rename 'cpu_dbs_common_info' to 'cpu_dbs_info' Viresh Kumar
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-19 11:48 UTC (permalink / raw)
  To: Rafael Wysocki, Preeti U Murthy, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan, Viresh Kumar

It's not used at all; drop it.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c | 1 -
 drivers/cpufreq/cpufreq_governor.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 6024bbc782c0..2149ba7d32a8 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -353,7 +353,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		struct cpu_dbs_common_info *j_cdbs = cdata->get_cpu_cdbs(j);
 		unsigned int prev_load;
 
-		j_cdbs->cpu = j;
 		j_cdbs->cur_policy = policy;
 		j_cdbs->prev_cpu_idle =
 			get_cpu_idle_time(j, &j_cdbs->prev_cpu_wall, io_busy);
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 352eecaae789..1bbf8c87fdd5 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -130,7 +130,6 @@ static void *get_cpu_dbs_info_s(int cpu)				\
 
 /* Per cpu structures */
 struct cpu_dbs_common_info {
-	int cpu;
 	u64 prev_cpu_idle;
 	u64 prev_cpu_wall;
 	u64 prev_cpu_nice;
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V2 03/10] cpufreq: governor: Rename 'cpu_dbs_common_info' to 'cpu_dbs_info'
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
  2015-06-19 11:48 ` [PATCH V2 01/10] cpufreq: governor: Name delayed-work as dwork Viresh Kumar
  2015-06-19 11:48 ` [PATCH V2 02/10] cpufreq: governor: Drop unused field 'cpu' Viresh Kumar
@ 2015-06-19 11:48 ` Viresh Kumar
  2015-06-19 11:48 ` [PATCH V2 04/10] cpufreq: governor: name pointer to cpu_dbs_info as 'cdbs' Viresh Kumar
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-19 11:48 UTC (permalink / raw)
  To: Rafael Wysocki, Preeti U Murthy, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan, Viresh Kumar

It's not information common to all CPUs, but a structure representing
the type of per-CPU info common to both governor types. Let's drop
'common_' from its name.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c | 19 +++++++++----------
 drivers/cpufreq/cpufreq_governor.h | 13 ++++++-------
 2 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 2149ba7d32a8..529f236f2d05 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -32,7 +32,7 @@ static struct attribute_group *get_sysfs_attr(struct dbs_data *dbs_data)
 
 void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 {
-	struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
+	struct cpu_dbs_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 	struct cpufreq_policy *policy;
@@ -64,7 +64,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 
 	/* Get Absolute Load */
 	for_each_cpu(j, policy->cpus) {
-		struct cpu_dbs_common_info *j_cdbs;
+		struct cpu_dbs_info *j_cdbs;
 		u64 cur_wall_time, cur_idle_time;
 		unsigned int idle_time, wall_time;
 		unsigned int load;
@@ -163,7 +163,7 @@ EXPORT_SYMBOL_GPL(dbs_check_cpu);
 static inline void __gov_queue_work(int cpu, struct dbs_data *dbs_data,
 		unsigned int delay)
 {
-	struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
+	struct cpu_dbs_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
 
 	mod_delayed_work_on(cpu, system_wq, &cdbs->dwork, delay);
 }
@@ -199,7 +199,7 @@ EXPORT_SYMBOL_GPL(gov_queue_work);
 static inline void gov_cancel_work(struct dbs_data *dbs_data,
 		struct cpufreq_policy *policy)
 {
-	struct cpu_dbs_common_info *cdbs;
+	struct cpu_dbs_info *cdbs;
 	int i;
 
 	for_each_cpu(i, policy->cpus) {
@@ -209,8 +209,7 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
 }
 
 /* Will return if we need to evaluate cpu load again or not */
-bool need_load_eval(struct cpu_dbs_common_info *cdbs,
-		unsigned int sampling_rate)
+bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate)
 {
 	if (policy_is_shared(cdbs->cur_policy)) {
 		ktime_t time_now = ktime_get();
@@ -330,7 +329,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int sampling_rate, ignore_nice, j, cpu = policy->cpu;
-	struct cpu_dbs_common_info *cpu_cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_dbs_info *cpu_cdbs = cdata->get_cpu_cdbs(cpu);
 	int io_busy = 0;
 
 	if (!policy->cur)
@@ -350,7 +349,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 	}
 
 	for_each_cpu(j, policy->cpus) {
-		struct cpu_dbs_common_info *j_cdbs = cdata->get_cpu_cdbs(j);
+		struct cpu_dbs_info *j_cdbs = cdata->get_cpu_cdbs(j);
 		unsigned int prev_load;
 
 		j_cdbs->cur_policy = policy;
@@ -398,7 +397,7 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int cpu = policy->cpu;
-	struct cpu_dbs_common_info *cpu_cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_dbs_info *cpu_cdbs = cdata->get_cpu_cdbs(cpu);
 
 	if (cdata->governor == GOV_CONSERVATIVE) {
 		struct cs_cpu_dbs_info_s *cs_dbs_info =
@@ -418,7 +417,7 @@ static void cpufreq_governor_limits(struct cpufreq_policy *policy,
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int cpu = policy->cpu;
-	struct cpu_dbs_common_info *cpu_cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_dbs_info *cpu_cdbs = cdata->get_cpu_cdbs(cpu);
 
 	if (!cpu_cdbs->cur_policy)
 		return;
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 1bbf8c87fdd5..6b5e33f68064 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -109,7 +109,7 @@ store_one(_gov, file_name)
 
 /* create helper routines */
 #define define_get_cpu_dbs_routines(_dbs_info)				\
-static struct cpu_dbs_common_info *get_cpu_cdbs(int cpu)		\
+static struct cpu_dbs_info *get_cpu_cdbs(int cpu)			\
 {									\
 	return &per_cpu(_dbs_info, cpu).cdbs;				\
 }									\
@@ -129,7 +129,7 @@ static void *get_cpu_dbs_info_s(int cpu)				\
  */
 
 /* Per cpu structures */
-struct cpu_dbs_common_info {
+struct cpu_dbs_info {
 	u64 prev_cpu_idle;
 	u64 prev_cpu_wall;
 	u64 prev_cpu_nice;
@@ -152,7 +152,7 @@ struct cpu_dbs_common_info {
 };
 
 struct od_cpu_dbs_info_s {
-	struct cpu_dbs_common_info cdbs;
+	struct cpu_dbs_info cdbs;
 	struct cpufreq_frequency_table *freq_table;
 	unsigned int freq_lo;
 	unsigned int freq_lo_jiffies;
@@ -162,7 +162,7 @@ struct od_cpu_dbs_info_s {
 };
 
 struct cs_cpu_dbs_info_s {
-	struct cpu_dbs_common_info cdbs;
+	struct cpu_dbs_info cdbs;
 	unsigned int down_skip;
 	unsigned int requested_freq;
 	unsigned int enable:1;
@@ -203,7 +203,7 @@ struct common_dbs_data {
 	 */
 	struct dbs_data *gdbs_data;
 
-	struct cpu_dbs_common_info *(*get_cpu_cdbs)(int cpu);
+	struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
 	void *(*get_cpu_dbs_info_s)(int cpu);
 	void (*gov_dbs_timer)(struct work_struct *work);
 	void (*gov_check_cpu)(int cpu, unsigned int load);
@@ -264,8 +264,7 @@ static ssize_t show_sampling_rate_min_gov_pol				\
 extern struct mutex cpufreq_governor_lock;
 
 void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
-bool need_load_eval(struct cpu_dbs_common_info *cdbs,
-		unsigned int sampling_rate);
+bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate);
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 		struct common_dbs_data *cdata, unsigned int event);
 void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V2 04/10] cpufreq: governor: name pointer to cpu_dbs_info as 'cdbs'
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
                   ` (2 preceding siblings ...)
  2015-06-19 11:48 ` [PATCH V2 03/10] cpufreq: governor: Rename 'cpu_dbs_common_info' to 'cpu_dbs_info' Viresh Kumar
@ 2015-06-19 11:48 ` Viresh Kumar
  2015-06-19 11:48 ` [PATCH V2 05/10] cpufreq: governor: rename cur_policy as policy Viresh Kumar
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-19 11:48 UTC (permalink / raw)
  To: Rafael Wysocki, Preeti U Murthy, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan, Viresh Kumar

It is called 'cdbs' in most places and 'cpu_dbs' in others. Let's use
'cdbs' consistently for better readability.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 529f236f2d05..4ea13f182154 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -329,7 +329,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int sampling_rate, ignore_nice, j, cpu = policy->cpu;
-	struct cpu_dbs_info *cpu_cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
 	int io_busy = 0;
 
 	if (!policy->cur)
@@ -385,7 +385,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 	}
 
 	/* Initiate timer time stamp */
-	cpu_cdbs->time_stamp = ktime_get();
+	cdbs->time_stamp = ktime_get();
 
 	gov_queue_work(dbs_data, policy, delay_for_sampling_rate(sampling_rate),
 		       true);
@@ -397,7 +397,7 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int cpu = policy->cpu;
-	struct cpu_dbs_info *cpu_cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
 
 	if (cdata->governor == GOV_CONSERVATIVE) {
 		struct cs_cpu_dbs_info_s *cs_dbs_info =
@@ -408,8 +408,8 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 
 	gov_cancel_work(dbs_data, policy);
 
-	mutex_destroy(&cpu_cdbs->timer_mutex);
-	cpu_cdbs->cur_policy = NULL;
+	mutex_destroy(&cdbs->timer_mutex);
+	cdbs->cur_policy = NULL;
 }
 
 static void cpufreq_governor_limits(struct cpufreq_policy *policy,
@@ -417,20 +417,20 @@ static void cpufreq_governor_limits(struct cpufreq_policy *policy,
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int cpu = policy->cpu;
-	struct cpu_dbs_info *cpu_cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
 
-	if (!cpu_cdbs->cur_policy)
+	if (!cdbs->cur_policy)
 		return;
 
-	mutex_lock(&cpu_cdbs->timer_mutex);
-	if (policy->max < cpu_cdbs->cur_policy->cur)
-		__cpufreq_driver_target(cpu_cdbs->cur_policy, policy->max,
+	mutex_lock(&cdbs->timer_mutex);
+	if (policy->max < cdbs->cur_policy->cur)
+		__cpufreq_driver_target(cdbs->cur_policy, policy->max,
 					CPUFREQ_RELATION_H);
-	else if (policy->min > cpu_cdbs->cur_policy->cur)
-		__cpufreq_driver_target(cpu_cdbs->cur_policy, policy->min,
+	else if (policy->min > cdbs->cur_policy->cur)
+		__cpufreq_driver_target(cdbs->cur_policy, policy->min,
 					CPUFREQ_RELATION_L);
 	dbs_check_cpu(dbs_data, cpu);
-	mutex_unlock(&cpu_cdbs->timer_mutex);
+	mutex_unlock(&cdbs->timer_mutex);
 }
 
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V2 05/10] cpufreq: governor: rename cur_policy as policy
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
                   ` (3 preceding siblings ...)
  2015-06-19 11:48 ` [PATCH V2 04/10] cpufreq: governor: name pointer to cpu_dbs_info as 'cdbs' Viresh Kumar
@ 2015-06-19 11:48 ` Viresh Kumar
  2015-06-19 11:48 ` [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus Viresh Kumar
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-19 11:48 UTC (permalink / raw)
  To: Rafael Wysocki, Preeti U Murthy, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan, Viresh Kumar

Just call it 'policy'; 'cur_policy' is unnecessarily long and doesn't
carry any special meaning.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c | 10 +++++-----
 drivers/cpufreq/cpufreq_governor.c     | 18 +++++++++---------
 drivers/cpufreq/cpufreq_governor.h     |  2 +-
 drivers/cpufreq/cpufreq_ondemand.c     | 19 ++++++++++---------
 4 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 0e3ec1d968d9..af47d322679e 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -47,7 +47,7 @@ static inline unsigned int get_freq_target(struct cs_dbs_tuners *cs_tuners,
 static void cs_check_cpu(int cpu, unsigned int load)
 {
 	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, cpu);
-	struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
+	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
 	struct dbs_data *dbs_data = policy->governor_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 
@@ -106,10 +106,10 @@ static void cs_dbs_timer(struct work_struct *work)
 {
 	struct cs_cpu_dbs_info_s *dbs_info = container_of(work,
 			struct cs_cpu_dbs_info_s, cdbs.dwork.work);
-	unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;
+	unsigned int cpu = dbs_info->cdbs.policy->cpu;
 	struct cs_cpu_dbs_info_s *core_dbs_info = &per_cpu(cs_cpu_dbs_info,
 			cpu);
-	struct dbs_data *dbs_data = dbs_info->cdbs.cur_policy->governor_data;
+	struct dbs_data *dbs_data = dbs_info->cdbs.policy->governor_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 	int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
 	bool modify_all = true;
@@ -120,7 +120,7 @@ static void cs_dbs_timer(struct work_struct *work)
 	else
 		dbs_check_cpu(dbs_data, cpu);
 
-	gov_queue_work(dbs_data, dbs_info->cdbs.cur_policy, delay, modify_all);
+	gov_queue_work(dbs_data, dbs_info->cdbs.policy, delay, modify_all);
 	mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
 }
 
@@ -135,7 +135,7 @@ static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
 	if (!dbs_info->enable)
 		return 0;
 
-	policy = dbs_info->cdbs.cur_policy;
+	policy = dbs_info->cdbs.policy;
 
 	/*
 	 * we only care if our internally tracked freq moves outside the 'valid'
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 4ea13f182154..c0566f86caed 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -60,7 +60,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 		ignore_nice = cs_tuners->ignore_nice_load;
 	}
 
-	policy = cdbs->cur_policy;
+	policy = cdbs->policy;
 
 	/* Get Absolute Load */
 	for_each_cpu(j, policy->cpus) {
@@ -211,7 +211,7 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
 /* Will return if we need to evaluate cpu load again or not */
 bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate)
 {
-	if (policy_is_shared(cdbs->cur_policy)) {
+	if (policy_is_shared(cdbs->policy)) {
 		ktime_t time_now = ktime_get();
 		s64 delta_us = ktime_us_delta(time_now, cdbs->time_stamp);
 
@@ -352,7 +352,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		struct cpu_dbs_info *j_cdbs = cdata->get_cpu_cdbs(j);
 		unsigned int prev_load;
 
-		j_cdbs->cur_policy = policy;
+		j_cdbs->policy = policy;
 		j_cdbs->prev_cpu_idle =
 			get_cpu_idle_time(j, &j_cdbs->prev_cpu_wall, io_busy);
 
@@ -409,7 +409,7 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 	gov_cancel_work(dbs_data, policy);
 
 	mutex_destroy(&cdbs->timer_mutex);
-	cdbs->cur_policy = NULL;
+	cdbs->policy = NULL;
 }
 
 static void cpufreq_governor_limits(struct cpufreq_policy *policy,
@@ -419,15 +419,15 @@ static void cpufreq_governor_limits(struct cpufreq_policy *policy,
 	unsigned int cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
 
-	if (!cdbs->cur_policy)
+	if (!cdbs->policy)
 		return;
 
 	mutex_lock(&cdbs->timer_mutex);
-	if (policy->max < cdbs->cur_policy->cur)
-		__cpufreq_driver_target(cdbs->cur_policy, policy->max,
+	if (policy->max < cdbs->policy->cur)
+		__cpufreq_driver_target(cdbs->policy, policy->max,
 					CPUFREQ_RELATION_H);
-	else if (policy->min > cdbs->cur_policy->cur)
-		__cpufreq_driver_target(cdbs->cur_policy, policy->min,
+	else if (policy->min > cdbs->policy->cur)
+		__cpufreq_driver_target(cdbs->policy, policy->min,
 					CPUFREQ_RELATION_L);
 	dbs_check_cpu(dbs_data, cpu);
 	mutex_unlock(&cdbs->timer_mutex);
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 6b5e33f68064..a0f8eb79ee6d 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -140,7 +140,7 @@ struct cpu_dbs_info {
 	 * wake-up from idle.
 	 */
 	unsigned int prev_load;
-	struct cpufreq_policy *cur_policy;
+	struct cpufreq_policy *policy;
 	struct delayed_work dwork;
 	/*
 	 * percpu mutex that serializes governor limit change with gov_dbs_timer
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 830aef1db8c3..d29c6f9c6e3e 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -155,7 +155,7 @@ static void dbs_freq_increase(struct cpufreq_policy *policy, unsigned int freq)
 static void od_check_cpu(int cpu, unsigned int load)
 {
 	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
-	struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
+	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
 	struct dbs_data *dbs_data = policy->governor_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 
@@ -195,10 +195,10 @@ static void od_dbs_timer(struct work_struct *work)
 {
 	struct od_cpu_dbs_info_s *dbs_info =
 		container_of(work, struct od_cpu_dbs_info_s, cdbs.dwork.work);
-	unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;
+	unsigned int cpu = dbs_info->cdbs.policy->cpu;
 	struct od_cpu_dbs_info_s *core_dbs_info = &per_cpu(od_cpu_dbs_info,
 			cpu);
-	struct dbs_data *dbs_data = dbs_info->cdbs.cur_policy->governor_data;
+	struct dbs_data *dbs_data = dbs_info->cdbs.policy->governor_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	int delay = 0, sample_type = core_dbs_info->sample_type;
 	bool modify_all = true;
@@ -213,8 +213,9 @@ static void od_dbs_timer(struct work_struct *work)
 	core_dbs_info->sample_type = OD_NORMAL_SAMPLE;
 	if (sample_type == OD_SUB_SAMPLE) {
 		delay = core_dbs_info->freq_lo_jiffies;
-		__cpufreq_driver_target(core_dbs_info->cdbs.cur_policy,
-				core_dbs_info->freq_lo, CPUFREQ_RELATION_H);
+		__cpufreq_driver_target(core_dbs_info->cdbs.policy,
+					core_dbs_info->freq_lo,
+					CPUFREQ_RELATION_H);
 	} else {
 		dbs_check_cpu(dbs_data, cpu);
 		if (core_dbs_info->freq_lo) {
@@ -229,7 +230,7 @@ static void od_dbs_timer(struct work_struct *work)
 		delay = delay_for_sampling_rate(od_tuners->sampling_rate
 				* core_dbs_info->rate_mult);
 
-	gov_queue_work(dbs_data, dbs_info->cdbs.cur_policy, delay, modify_all);
+	gov_queue_work(dbs_data, dbs_info->cdbs.policy, delay, modify_all);
 	mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
 }
 
@@ -289,8 +290,8 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
 			cancel_delayed_work_sync(&dbs_info->cdbs.dwork);
 			mutex_lock(&dbs_info->cdbs.timer_mutex);
 
-			gov_queue_work(dbs_data, dbs_info->cdbs.cur_policy,
-					usecs_to_jiffies(new_rate), true);
+			gov_queue_work(dbs_data, dbs_info->cdbs.policy,
+				       usecs_to_jiffies(new_rate), true);
 
 		}
 		mutex_unlock(&dbs_info->cdbs.timer_mutex);
@@ -559,7 +560,7 @@ static void od_set_powersave_bias(unsigned int powersave_bias)
 		if (cpumask_test_cpu(cpu, &done))
 			continue;
 
-		policy = per_cpu(od_cpu_dbs_info, cpu).cdbs.cur_policy;
+		policy = per_cpu(od_cpu_dbs_info, cpu).cdbs.policy;
 		if (!policy)
 			continue;
 
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
                   ` (4 preceding siblings ...)
  2015-06-19 11:48 ` [PATCH V2 05/10] cpufreq: governor: rename cur_policy as policy Viresh Kumar
@ 2015-06-19 11:48 ` Viresh Kumar
  2015-07-17 22:11   ` Rafael J. Wysocki
  2015-11-27 11:56   ` Lucas Stach
  2015-06-19 11:48 ` [PATCH V2 07/10] cpufreq: governor: split out common part of {cs|od}_dbs_timer() Viresh Kumar
                   ` (6 subsequent siblings)
  12 siblings, 2 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-19 11:48 UTC (permalink / raw)
  To: Rafael Wysocki, Preeti U Murthy, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan, Viresh Kumar

Some information is common to all CPUs belonging to a policy, but it is
kept on a per-CPU basis. Let's keep it in another structure common to
all policy->cpus. That will make updates/reads to it less complex and
less error prone.

The memory for ccdbs is allocated/freed at INIT/EXIT, so that we don't
reallocate it for every STOP/START sequence. It will also be used (in
the next patch) while the governor is stopped, and so must not be freed
that early.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c | 18 ++++---
 drivers/cpufreq/cpufreq_governor.c     | 92 +++++++++++++++++++++++++---------
 drivers/cpufreq/cpufreq_governor.h     | 24 +++++----
 drivers/cpufreq/cpufreq_ondemand.c     | 38 +++++++-------
 4 files changed, 114 insertions(+), 58 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index af47d322679e..5b5c01ca556c 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -47,7 +47,7 @@ static inline unsigned int get_freq_target(struct cs_dbs_tuners *cs_tuners,
 static void cs_check_cpu(int cpu, unsigned int load)
 {
 	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, cpu);
-	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
+	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
 	struct dbs_data *dbs_data = policy->governor_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 
@@ -106,22 +106,24 @@ static void cs_dbs_timer(struct work_struct *work)
 {
 	struct cs_cpu_dbs_info_s *dbs_info = container_of(work,
 			struct cs_cpu_dbs_info_s, cdbs.dwork.work);
-	unsigned int cpu = dbs_info->cdbs.policy->cpu;
+	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
+	unsigned int cpu = policy->cpu;
 	struct cs_cpu_dbs_info_s *core_dbs_info = &per_cpu(cs_cpu_dbs_info,
 			cpu);
-	struct dbs_data *dbs_data = dbs_info->cdbs.policy->governor_data;
+	struct dbs_data *dbs_data = policy->governor_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 	int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
 	bool modify_all = true;
 
-	mutex_lock(&core_dbs_info->cdbs.timer_mutex);
-	if (!need_load_eval(&core_dbs_info->cdbs, cs_tuners->sampling_rate))
+	mutex_lock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
+	if (!need_load_eval(core_dbs_info->cdbs.ccdbs,
+			    cs_tuners->sampling_rate))
 		modify_all = false;
 	else
 		dbs_check_cpu(dbs_data, cpu);
 
-	gov_queue_work(dbs_data, dbs_info->cdbs.policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
+	gov_queue_work(dbs_data, policy, delay, modify_all);
+	mutex_unlock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
 }
 
 static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
@@ -135,7 +137,7 @@ static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
 	if (!dbs_info->enable)
 		return 0;
 
-	policy = dbs_info->cdbs.policy;
+	policy = dbs_info->cdbs.ccdbs->policy;
 
 	/*
 	 * we only care if our internally tracked freq moves outside the 'valid'
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index c0566f86caed..72a92fa71ba5 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -35,7 +35,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 	struct cpu_dbs_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
-	struct cpufreq_policy *policy;
+	struct cpufreq_policy *policy = cdbs->ccdbs->policy;
 	unsigned int sampling_rate;
 	unsigned int max_load = 0;
 	unsigned int ignore_nice;
@@ -60,8 +60,6 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 		ignore_nice = cs_tuners->ignore_nice_load;
 	}
 
-	policy = cdbs->policy;
-
 	/* Get Absolute Load */
 	for_each_cpu(j, policy->cpus) {
 		struct cpu_dbs_info *j_cdbs;
@@ -209,17 +207,18 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
 }
 
 /* Will return if we need to evaluate cpu load again or not */
-bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate)
+bool need_load_eval(struct cpu_common_dbs_info *ccdbs,
+		    unsigned int sampling_rate)
 {
-	if (policy_is_shared(cdbs->policy)) {
+	if (policy_is_shared(ccdbs->policy)) {
 		ktime_t time_now = ktime_get();
-		s64 delta_us = ktime_us_delta(time_now, cdbs->time_stamp);
+		s64 delta_us = ktime_us_delta(time_now, ccdbs->time_stamp);
 
 		/* Do nothing if we recently have sampled */
 		if (delta_us < (s64)(sampling_rate / 2))
 			return false;
 		else
-			cdbs->time_stamp = time_now;
+			ccdbs->time_stamp = time_now;
 	}
 
 	return true;
@@ -238,6 +237,37 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
 	}
 }
 
+static int alloc_ccdbs(struct cpufreq_policy *policy,
+		       struct common_dbs_data *cdata)
+{
+	struct cpu_common_dbs_info *ccdbs;
+	int j;
+
+	/* Allocate memory for the common information for policy->cpus */
+	ccdbs = kzalloc(sizeof(*ccdbs), GFP_KERNEL);
+	if (!ccdbs)
+		return -ENOMEM;
+
+	/* Set ccdbs for all CPUs, online+offline */
+	for_each_cpu(j, policy->related_cpus)
+		cdata->get_cpu_cdbs(j)->ccdbs = ccdbs;
+
+	return 0;
+}
+
+static void free_ccdbs(struct cpufreq_policy *policy,
+		       struct common_dbs_data *cdata)
+{
+	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(policy->cpu);
+	struct cpu_common_dbs_info *ccdbs = cdbs->ccdbs;
+	int j;
+
+	for_each_cpu(j, policy->cpus)
+		cdata->get_cpu_cdbs(j)->ccdbs = NULL;
+
+	kfree(ccdbs);
+}
+
 static int cpufreq_governor_init(struct cpufreq_policy *policy,
 				 struct dbs_data *dbs_data,
 				 struct common_dbs_data *cdata)
@@ -248,6 +278,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	if (dbs_data) {
 		if (WARN_ON(have_governor_per_policy()))
 			return -EINVAL;
+
+		ret = alloc_ccdbs(policy, cdata);
+		if (ret)
+			return ret;
+
 		dbs_data->usage_count++;
 		policy->governor_data = dbs_data;
 		return 0;
@@ -257,12 +292,16 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	if (!dbs_data)
 		return -ENOMEM;
 
+	ret = alloc_ccdbs(policy, cdata);
+	if (ret)
+		goto free_dbs_data;
+
 	dbs_data->cdata = cdata;
 	dbs_data->usage_count = 1;
 
 	ret = cdata->init(dbs_data, !policy->governor->initialized);
 	if (ret)
-		goto free_dbs_data;
+		goto free_ccdbs;
 
 	/* policy latency is in ns. Convert it to us first */
 	latency = policy->cpuinfo.transition_latency / 1000;
@@ -299,6 +338,8 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	}
 cdata_exit:
 	cdata->exit(dbs_data, !policy->governor->initialized);
+free_ccdbs:
+	free_ccdbs(policy, cdata);
 free_dbs_data:
 	kfree(dbs_data);
 	return ret;
@@ -322,6 +363,8 @@ static void cpufreq_governor_exit(struct cpufreq_policy *policy,
 		cdata->exit(dbs_data, policy->governor->initialized == 1);
 		kfree(dbs_data);
 	}
+
+	free_ccdbs(policy, cdata);
 }
 
 static int cpufreq_governor_start(struct cpufreq_policy *policy,
@@ -330,6 +373,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int sampling_rate, ignore_nice, j, cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_common_dbs_info *ccdbs = cdbs->ccdbs;
 	int io_busy = 0;
 
 	if (!policy->cur)
@@ -348,11 +392,14 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		io_busy = od_tuners->io_is_busy;
 	}
 
+	ccdbs->policy = policy;
+	ccdbs->time_stamp = ktime_get();
+	mutex_init(&ccdbs->timer_mutex);
+
 	for_each_cpu(j, policy->cpus) {
 		struct cpu_dbs_info *j_cdbs = cdata->get_cpu_cdbs(j);
 		unsigned int prev_load;
 
-		j_cdbs->policy = policy;
 		j_cdbs->prev_cpu_idle =
 			get_cpu_idle_time(j, &j_cdbs->prev_cpu_wall, io_busy);
 
@@ -364,7 +411,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		if (ignore_nice)
 			j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 
-		mutex_init(&j_cdbs->timer_mutex);
 		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, cdata->gov_dbs_timer);
 	}
 
@@ -384,9 +430,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		od_ops->powersave_bias_init_cpu(cpu);
 	}
 
-	/* Initiate timer time stamp */
-	cdbs->time_stamp = ktime_get();
-
 	gov_queue_work(dbs_data, policy, delay_for_sampling_rate(sampling_rate),
 		       true);
 	return 0;
@@ -398,6 +441,9 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_common_dbs_info *ccdbs = cdbs->ccdbs;
+
+	gov_cancel_work(dbs_data, policy);
 
 	if (cdata->governor == GOV_CONSERVATIVE) {
 		struct cs_cpu_dbs_info_s *cs_dbs_info =
@@ -406,10 +452,8 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 		cs_dbs_info->enable = 0;
 	}
 
-	gov_cancel_work(dbs_data, policy);
-
-	mutex_destroy(&cdbs->timer_mutex);
-	cdbs->policy = NULL;
+	ccdbs->policy = NULL;
+	mutex_destroy(&ccdbs->timer_mutex);
 }
 
 static void cpufreq_governor_limits(struct cpufreq_policy *policy,
@@ -419,18 +463,18 @@ static void cpufreq_governor_limits(struct cpufreq_policy *policy,
 	unsigned int cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
 
-	if (!cdbs->policy)
+	if (!cdbs->ccdbs || !cdbs->ccdbs->policy)
 		return;
 
-	mutex_lock(&cdbs->timer_mutex);
-	if (policy->max < cdbs->policy->cur)
-		__cpufreq_driver_target(cdbs->policy, policy->max,
+	mutex_lock(&cdbs->ccdbs->timer_mutex);
+	if (policy->max < cdbs->ccdbs->policy->cur)
+		__cpufreq_driver_target(cdbs->ccdbs->policy, policy->max,
 					CPUFREQ_RELATION_H);
-	else if (policy->min > cdbs->policy->cur)
-		__cpufreq_driver_target(cdbs->policy, policy->min,
+	else if (policy->min > cdbs->ccdbs->policy->cur)
+		__cpufreq_driver_target(cdbs->ccdbs->policy, policy->min,
 					CPUFREQ_RELATION_L);
 	dbs_check_cpu(dbs_data, cpu);
-	mutex_unlock(&cdbs->timer_mutex);
+	mutex_unlock(&cdbs->ccdbs->timer_mutex);
 }
 
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index a0f8eb79ee6d..153412cafb05 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -128,6 +128,18 @@ static void *get_cpu_dbs_info_s(int cpu)				\
  * cs_*: Conservative governor
  */
 
+/* Common to all CPUs of a policy */
+struct cpu_common_dbs_info {
+	struct cpufreq_policy *policy;
+	/*
+	 * percpu mutex that serializes governor limit change with gov_dbs_timer
+	 * invocation. We do not want gov_dbs_timer to run when user is changing
+	 * the governor or limits.
+	 */
+	struct mutex timer_mutex;
+	ktime_t time_stamp;
+};
+
 /* Per cpu structures */
 struct cpu_dbs_info {
 	u64 prev_cpu_idle;
@@ -140,15 +152,8 @@ struct cpu_dbs_info {
 	 * wake-up from idle.
 	 */
 	unsigned int prev_load;
-	struct cpufreq_policy *policy;
 	struct delayed_work dwork;
-	/*
-	 * percpu mutex that serializes governor limit change with gov_dbs_timer
-	 * invocation. We do not want gov_dbs_timer to run when user is changing
-	 * the governor or limits.
-	 */
-	struct mutex timer_mutex;
-	ktime_t time_stamp;
+	struct cpu_common_dbs_info *ccdbs;
 };
 
 struct od_cpu_dbs_info_s {
@@ -264,7 +269,8 @@ static ssize_t show_sampling_rate_min_gov_pol				\
 extern struct mutex cpufreq_governor_lock;
 
 void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
-bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate);
+bool need_load_eval(struct cpu_common_dbs_info *ccdbs,
+		    unsigned int sampling_rate);
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 		struct common_dbs_data *cdata, unsigned int event);
 void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index d29c6f9c6e3e..651677cfa581 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -155,7 +155,7 @@ static void dbs_freq_increase(struct cpufreq_policy *policy, unsigned int freq)
 static void od_check_cpu(int cpu, unsigned int load)
 {
 	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
-	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
+	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
 	struct dbs_data *dbs_data = policy->governor_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 
@@ -195,16 +195,18 @@ static void od_dbs_timer(struct work_struct *work)
 {
 	struct od_cpu_dbs_info_s *dbs_info =
 		container_of(work, struct od_cpu_dbs_info_s, cdbs.dwork.work);
-	unsigned int cpu = dbs_info->cdbs.policy->cpu;
+	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
+	unsigned int cpu = policy->cpu;
 	struct od_cpu_dbs_info_s *core_dbs_info = &per_cpu(od_cpu_dbs_info,
 			cpu);
-	struct dbs_data *dbs_data = dbs_info->cdbs.policy->governor_data;
+	struct dbs_data *dbs_data = policy->governor_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	int delay = 0, sample_type = core_dbs_info->sample_type;
 	bool modify_all = true;
 
-	mutex_lock(&core_dbs_info->cdbs.timer_mutex);
-	if (!need_load_eval(&core_dbs_info->cdbs, od_tuners->sampling_rate)) {
+	mutex_lock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
+	if (!need_load_eval(core_dbs_info->cdbs.ccdbs,
+			    od_tuners->sampling_rate)) {
 		modify_all = false;
 		goto max_delay;
 	}
@@ -213,8 +215,7 @@ static void od_dbs_timer(struct work_struct *work)
 	core_dbs_info->sample_type = OD_NORMAL_SAMPLE;
 	if (sample_type == OD_SUB_SAMPLE) {
 		delay = core_dbs_info->freq_lo_jiffies;
-		__cpufreq_driver_target(core_dbs_info->cdbs.policy,
-					core_dbs_info->freq_lo,
+		__cpufreq_driver_target(policy, core_dbs_info->freq_lo,
 					CPUFREQ_RELATION_H);
 	} else {
 		dbs_check_cpu(dbs_data, cpu);
@@ -230,8 +231,8 @@ static void od_dbs_timer(struct work_struct *work)
 		delay = delay_for_sampling_rate(od_tuners->sampling_rate
 				* core_dbs_info->rate_mult);
 
-	gov_queue_work(dbs_data, dbs_info->cdbs.policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
+	gov_queue_work(dbs_data, policy, delay, modify_all);
+	mutex_unlock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
 }
 
 /************************** sysfs interface ************************/
@@ -274,10 +275,10 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
 		dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
 		cpufreq_cpu_put(policy);
 
-		mutex_lock(&dbs_info->cdbs.timer_mutex);
+		mutex_lock(&dbs_info->cdbs.ccdbs->timer_mutex);
 
 		if (!delayed_work_pending(&dbs_info->cdbs.dwork)) {
-			mutex_unlock(&dbs_info->cdbs.timer_mutex);
+			mutex_unlock(&dbs_info->cdbs.ccdbs->timer_mutex);
 			continue;
 		}
 
@@ -286,15 +287,15 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
 
 		if (time_before(next_sampling, appointed_at)) {
 
-			mutex_unlock(&dbs_info->cdbs.timer_mutex);
+			mutex_unlock(&dbs_info->cdbs.ccdbs->timer_mutex);
 			cancel_delayed_work_sync(&dbs_info->cdbs.dwork);
-			mutex_lock(&dbs_info->cdbs.timer_mutex);
+			mutex_lock(&dbs_info->cdbs.ccdbs->timer_mutex);
 
-			gov_queue_work(dbs_data, dbs_info->cdbs.policy,
+			gov_queue_work(dbs_data, policy,
 				       usecs_to_jiffies(new_rate), true);
 
 		}
-		mutex_unlock(&dbs_info->cdbs.timer_mutex);
+		mutex_unlock(&dbs_info->cdbs.ccdbs->timer_mutex);
 	}
 }
 
@@ -557,13 +558,16 @@ static void od_set_powersave_bias(unsigned int powersave_bias)
 
 	get_online_cpus();
 	for_each_online_cpu(cpu) {
+		struct cpu_common_dbs_info *ccdbs;
+
 		if (cpumask_test_cpu(cpu, &done))
 			continue;
 
-		policy = per_cpu(od_cpu_dbs_info, cpu).cdbs.policy;
-		if (!policy)
+		ccdbs = per_cpu(od_cpu_dbs_info, cpu).cdbs.ccdbs;
+		if (!ccdbs)
 			continue;
 
+		policy = ccdbs->policy;
 		cpumask_or(&done, &done, policy->cpus);
 
 		if (policy->governor != &cpufreq_gov_ondemand)
-- 
2.4.0

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V2 07/10] cpufreq: governor: split out common part of {cs|od}_dbs_timer()
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
                   ` (5 preceding siblings ...)
  2015-06-19 11:48 ` [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus Viresh Kumar
@ 2015-06-19 11:48 ` Viresh Kumar
  2015-06-19 11:48 ` [PATCH V2 08/10] cpufreq: governor: Avoid invalid states with additional checks Viresh Kumar
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-19 11:48 UTC (permalink / raw)
  To: Rafael Wysocki, Preeti U Murthy, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan, Viresh Kumar

Some part of cs_dbs_timer() and od_dbs_timer() is exactly the same and
is unnecessarily duplicated.

Create the real work-handler in cpufreq_governor.c and put the common
code in this routine (dbs_timer()).

This shouldn't result in any functional change.
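The shape of the refactoring can be sketched in standalone C: the common
handler keeps the locking and load-evaluation decision, and each governor
supplies only a callback that returns the next sampling delay. All names and
numbers below are illustrative, not the kernel's:

```c
#include <assert.h>

/* Governor-specific part: only computes the next delay. */
typedef unsigned int (*gov_dbs_timer_fn)(unsigned int sampling_rate,
					 int modify_all);

/* conservative-style callback: fixed delay derived from the sampling rate */
static unsigned int cs_timer(unsigned int sampling_rate, int modify_all)
{
	(void)modify_all;
	return sampling_rate / 10;	/* stands in for delay_for_sampling_rate() */
}

/* ondemand-style callback: may stretch the delay by a rate multiplier */
static unsigned int od_timer(unsigned int sampling_rate, int modify_all)
{
	unsigned int rate_mult = modify_all ? 2 : 1;

	return (sampling_rate / 10) * rate_mult;
}

/* The common part, analogous to dbs_timer(): decide whether a full load
 * evaluation is due, delegate the rest, then (in the kernel) re-queue. */
static unsigned int dbs_timer(gov_dbs_timer_fn cb, unsigned int sampling_rate,
			      int load_eval_due)
{
	/* mutex_lock(&ccdbs->timer_mutex) would go here */
	int modify_all = load_eval_due;		/* need_load_eval() stand-in */
	unsigned int delay = cb(sampling_rate, modify_all);

	/* gov_queue_work(...); mutex_unlock(&ccdbs->timer_mutex) */
	return delay;
}
```

This is the same callback-with-common-wrapper pattern the patch applies:
need_load_eval() can then become static, since only the common handler calls
it.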

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c | 27 +++++++-----------------
 drivers/cpufreq/cpufreq_governor.c     | 38 ++++++++++++++++++++++++++++++----
 drivers/cpufreq/cpufreq_governor.h     | 10 ++++-----
 drivers/cpufreq/cpufreq_ondemand.c     | 36 +++++++++++++-------------------
 4 files changed, 60 insertions(+), 51 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 5b5c01ca556c..0e4154e584bf 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -102,28 +102,15 @@ static void cs_check_cpu(int cpu, unsigned int load)
 	}
 }
 
-static void cs_dbs_timer(struct work_struct *work)
+static unsigned int cs_dbs_timer(struct cpu_dbs_info *cdbs,
+				 struct dbs_data *dbs_data, bool modify_all)
 {
-	struct cs_cpu_dbs_info_s *dbs_info = container_of(work,
-			struct cs_cpu_dbs_info_s, cdbs.dwork.work);
-	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
-	unsigned int cpu = policy->cpu;
-	struct cs_cpu_dbs_info_s *core_dbs_info = &per_cpu(cs_cpu_dbs_info,
-			cpu);
-	struct dbs_data *dbs_data = policy->governor_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
-	int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
-	bool modify_all = true;
-
-	mutex_lock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
-	if (!need_load_eval(core_dbs_info->cdbs.ccdbs,
-			    cs_tuners->sampling_rate))
-		modify_all = false;
-	else
-		dbs_check_cpu(dbs_data, cpu);
-
-	gov_queue_work(dbs_data, policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
+
+	if (modify_all)
+		dbs_check_cpu(dbs_data, cdbs->ccdbs->policy->cpu);
+
+	return delay_for_sampling_rate(cs_tuners->sampling_rate);
 }
 
 static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 72a92fa71ba5..441150dea078 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -207,8 +207,8 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
 }
 
 /* Will return if we need to evaluate cpu load again or not */
-bool need_load_eval(struct cpu_common_dbs_info *ccdbs,
-		    unsigned int sampling_rate)
+static bool need_load_eval(struct cpu_common_dbs_info *ccdbs,
+			   unsigned int sampling_rate)
 {
 	if (policy_is_shared(ccdbs->policy)) {
 		ktime_t time_now = ktime_get();
@@ -223,7 +223,37 @@ bool need_load_eval(struct cpu_common_dbs_info *ccdbs,
 
 	return true;
 }
-EXPORT_SYMBOL_GPL(need_load_eval);
+
+static void dbs_timer(struct work_struct *work)
+{
+	struct cpu_dbs_info *cdbs = container_of(work, struct cpu_dbs_info,
+						 dwork.work);
+	struct cpu_common_dbs_info *ccdbs = cdbs->ccdbs;
+	struct cpufreq_policy *policy = ccdbs->policy;
+	struct dbs_data *dbs_data = policy->governor_data;
+	unsigned int sampling_rate, delay;
+	bool modify_all = true;
+
+	mutex_lock(&ccdbs->timer_mutex);
+
+	if (dbs_data->cdata->governor == GOV_CONSERVATIVE) {
+		struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+
+		sampling_rate = cs_tuners->sampling_rate;
+	} else {
+		struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+
+		sampling_rate = od_tuners->sampling_rate;
+	}
+
+	if (!need_load_eval(cdbs->ccdbs, sampling_rate))
+		modify_all = false;
+
+	delay = dbs_data->cdata->gov_dbs_timer(cdbs, dbs_data, modify_all);
+	gov_queue_work(dbs_data, policy, delay, modify_all);
+
+	mutex_unlock(&ccdbs->timer_mutex);
+}
 
 static void set_sampling_rate(struct dbs_data *dbs_data,
 		unsigned int sampling_rate)
@@ -411,7 +441,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		if (ignore_nice)
 			j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 
-		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, cdata->gov_dbs_timer);
+		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, dbs_timer);
 	}
 
 	if (cdata->governor == GOV_CONSERVATIVE) {
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 153412cafb05..2125c299c602 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -132,8 +132,8 @@ static void *get_cpu_dbs_info_s(int cpu)				\
 struct cpu_common_dbs_info {
 	struct cpufreq_policy *policy;
 	/*
-	 * percpu mutex that serializes governor limit change with gov_dbs_timer
-	 * invocation. We do not want gov_dbs_timer to run when user is changing
+	 * percpu mutex that serializes governor limit change with dbs_timer
+	 * invocation. We do not want dbs_timer to run when user is changing
 	 * the governor or limits.
 	 */
 	struct mutex timer_mutex;
@@ -210,7 +210,9 @@ struct common_dbs_data {
 
 	struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
 	void *(*get_cpu_dbs_info_s)(int cpu);
-	void (*gov_dbs_timer)(struct work_struct *work);
+	unsigned int (*gov_dbs_timer)(struct cpu_dbs_info *cdbs,
+				      struct dbs_data *dbs_data,
+				      bool modify_all);
 	void (*gov_check_cpu)(int cpu, unsigned int load);
 	int (*init)(struct dbs_data *dbs_data, bool notify);
 	void (*exit)(struct dbs_data *dbs_data, bool notify);
@@ -269,8 +271,6 @@ static ssize_t show_sampling_rate_min_gov_pol				\
 extern struct mutex cpufreq_governor_lock;
 
 void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
-bool need_load_eval(struct cpu_common_dbs_info *ccdbs,
-		    unsigned int sampling_rate);
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 		struct common_dbs_data *cdata, unsigned int event);
 void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 651677cfa581..11db20079fc6 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -191,48 +191,40 @@ static void od_check_cpu(int cpu, unsigned int load)
 	}
 }
 
-static void od_dbs_timer(struct work_struct *work)
+static unsigned int od_dbs_timer(struct cpu_dbs_info *cdbs,
+				 struct dbs_data *dbs_data, bool modify_all)
 {
-	struct od_cpu_dbs_info_s *dbs_info =
-		container_of(work, struct od_cpu_dbs_info_s, cdbs.dwork.work);
-	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
+	struct cpufreq_policy *policy = cdbs->ccdbs->policy;
 	unsigned int cpu = policy->cpu;
-	struct od_cpu_dbs_info_s *core_dbs_info = &per_cpu(od_cpu_dbs_info,
+	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info,
 			cpu);
-	struct dbs_data *dbs_data = policy->governor_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
-	int delay = 0, sample_type = core_dbs_info->sample_type;
-	bool modify_all = true;
+	int delay = 0, sample_type = dbs_info->sample_type;
 
-	mutex_lock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
-	if (!need_load_eval(core_dbs_info->cdbs.ccdbs,
-			    od_tuners->sampling_rate)) {
-		modify_all = false;
+	if (!modify_all)
 		goto max_delay;
-	}
 
 	/* Common NORMAL_SAMPLE setup */
-	core_dbs_info->sample_type = OD_NORMAL_SAMPLE;
+	dbs_info->sample_type = OD_NORMAL_SAMPLE;
 	if (sample_type == OD_SUB_SAMPLE) {
-		delay = core_dbs_info->freq_lo_jiffies;
-		__cpufreq_driver_target(policy, core_dbs_info->freq_lo,
+		delay = dbs_info->freq_lo_jiffies;
+		__cpufreq_driver_target(policy, dbs_info->freq_lo,
 					CPUFREQ_RELATION_H);
 	} else {
 		dbs_check_cpu(dbs_data, cpu);
-		if (core_dbs_info->freq_lo) {
+		if (dbs_info->freq_lo) {
 			/* Setup timer for SUB_SAMPLE */
-			core_dbs_info->sample_type = OD_SUB_SAMPLE;
-			delay = core_dbs_info->freq_hi_jiffies;
+			dbs_info->sample_type = OD_SUB_SAMPLE;
+			delay = dbs_info->freq_hi_jiffies;
 		}
 	}
 
 max_delay:
 	if (!delay)
 		delay = delay_for_sampling_rate(od_tuners->sampling_rate
-				* core_dbs_info->rate_mult);
+				* dbs_info->rate_mult);
 
-	gov_queue_work(dbs_data, policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
+	return delay;
 }
 
 /************************** sysfs interface ************************/
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V2 08/10] cpufreq: governor: Avoid invalid states with additional checks
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
                   ` (6 preceding siblings ...)
  2015-06-19 11:48 ` [PATCH V2 07/10] cpufreq: governor: split out common part of {cs|od}_dbs_timer() Viresh Kumar
@ 2015-06-19 11:48 ` Viresh Kumar
  2015-06-19 18:04   ` Preeti U Murthy
  2015-06-19 11:48 ` [PATCH V2 09/10] cpufreq: governor: Don't WARN on invalid states Viresh Kumar
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 42+ messages in thread
From: Viresh Kumar @ 2015-06-19 11:48 UTC (permalink / raw)
  To: Rafael Wysocki, Preeti U Murthy, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan, Viresh Kumar

There can be races where a request reaches the governor while it is in
the wrong state. For example, INIT followed by STOP (instead of START),
or START followed by EXIT (instead of STOP).

Tackle these races by making sure the state machine never gets into any
invalid state, and return an error if an invalid state-transition is
requested.
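The checks added below amount to a small state machine; a minimal sketch in
standalone C (illustrative names, not the kernel code) makes the rejected
sequences from the changelog concrete:

```c
#include <assert.h>
#include <errno.h>

/* Each transition is rejected with -EBUSY unless the governor is in the
 * expected predecessor state, mirroring the "State should be equivalent
 * to ..." checks in cpufreq_governor_{init,start,stop,exit}(). */
enum gov_state { GOV_EXITED, GOV_INITED, GOV_STARTED };

static enum gov_state state = GOV_EXITED;

static int do_init(void)
{
	if (state != GOV_EXITED)
		return -EBUSY;
	state = GOV_INITED;
	return 0;
}

static int do_start(void)
{
	if (state != GOV_INITED)
		return -EBUSY;
	state = GOV_STARTED;
	return 0;
}

static int do_stop(void)
{
	if (state != GOV_STARTED)
		return -EBUSY;
	state = GOV_INITED;
	return 0;
}

static int do_exit(void)
{
	if (state != GOV_INITED)
		return -EBUSY;
	state = GOV_EXITED;
	return 0;
}
```

With this in place, INIT followed by STOP and START followed by EXIT both
fail with -EBUSY instead of driving the governor into an invalid state.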

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c | 46 +++++++++++++++++++++++++++++---------
 1 file changed, 35 insertions(+), 11 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 441150dea078..8d4c86fa43ec 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -305,6 +305,10 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	unsigned int latency;
 	int ret;
 
+	/* State should be equivalent to EXIT */
+	if (policy->governor_data)
+		return -EBUSY;
+
 	if (dbs_data) {
 		if (WARN_ON(have_governor_per_policy()))
 			return -EINVAL;
@@ -375,10 +379,15 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	return ret;
 }
 
-static void cpufreq_governor_exit(struct cpufreq_policy *policy,
-				  struct dbs_data *dbs_data)
+static int cpufreq_governor_exit(struct cpufreq_policy *policy,
+				 struct dbs_data *dbs_data)
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
+	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(policy->cpu);
+
+	/* State should be equivalent to INIT */
+	if (!cdbs->ccdbs || cdbs->ccdbs->policy)
+		return -EBUSY;
 
 	policy->governor_data = NULL;
 	if (!--dbs_data->usage_count) {
@@ -395,6 +404,7 @@ static void cpufreq_governor_exit(struct cpufreq_policy *policy,
 	}
 
 	free_ccdbs(policy, cdata);
+	return 0;
 }
 
 static int cpufreq_governor_start(struct cpufreq_policy *policy,
@@ -409,6 +419,10 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 	if (!policy->cur)
 		return -EINVAL;
 
+	/* State should be equivalent to INIT */
+	if (!ccdbs || ccdbs->policy)
+		return -EBUSY;
+
 	if (cdata->governor == GOV_CONSERVATIVE) {
 		struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 
@@ -465,14 +479,18 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 	return 0;
 }
 
-static void cpufreq_governor_stop(struct cpufreq_policy *policy,
-				  struct dbs_data *dbs_data)
+static int cpufreq_governor_stop(struct cpufreq_policy *policy,
+				 struct dbs_data *dbs_data)
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
 	struct cpu_common_dbs_info *ccdbs = cdbs->ccdbs;
 
+	/* State should be equivalent to START */
+	if (!ccdbs || !ccdbs->policy)
+		return -EBUSY;
+
 	gov_cancel_work(dbs_data, policy);
 
 	if (cdata->governor == GOV_CONSERVATIVE) {
@@ -484,17 +502,19 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 
 	ccdbs->policy = NULL;
 	mutex_destroy(&ccdbs->timer_mutex);
+	return 0;
 }
 
-static void cpufreq_governor_limits(struct cpufreq_policy *policy,
-				    struct dbs_data *dbs_data)
+static int cpufreq_governor_limits(struct cpufreq_policy *policy,
+				   struct dbs_data *dbs_data)
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
 
+	/* State should be equivalent to START */
 	if (!cdbs->ccdbs || !cdbs->ccdbs->policy)
-		return;
+		return -EBUSY;
 
 	mutex_lock(&cdbs->ccdbs->timer_mutex);
 	if (policy->max < cdbs->ccdbs->policy->cur)
@@ -505,13 +525,15 @@ static void cpufreq_governor_limits(struct cpufreq_policy *policy,
 					CPUFREQ_RELATION_L);
 	dbs_check_cpu(dbs_data, cpu);
 	mutex_unlock(&cdbs->ccdbs->timer_mutex);
+
+	return 0;
 }
 
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 			 struct common_dbs_data *cdata, unsigned int event)
 {
 	struct dbs_data *dbs_data;
-	int ret = 0;
+	int ret;
 
 	/* Lock governor to block concurrent initialization of governor */
 	mutex_lock(&cdata->mutex);
@@ -531,17 +553,19 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 		ret = cpufreq_governor_init(policy, dbs_data, cdata);
 		break;
 	case CPUFREQ_GOV_POLICY_EXIT:
-		cpufreq_governor_exit(policy, dbs_data);
+		ret = cpufreq_governor_exit(policy, dbs_data);
 		break;
 	case CPUFREQ_GOV_START:
 		ret = cpufreq_governor_start(policy, dbs_data);
 		break;
 	case CPUFREQ_GOV_STOP:
-		cpufreq_governor_stop(policy, dbs_data);
+		ret = cpufreq_governor_stop(policy, dbs_data);
 		break;
 	case CPUFREQ_GOV_LIMITS:
-		cpufreq_governor_limits(policy, dbs_data);
+		ret = cpufreq_governor_limits(policy, dbs_data);
 		break;
+	default:
+		ret = -EINVAL;
 	}
 
 unlock:
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V2 09/10] cpufreq: governor: Don't WARN on invalid states
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
                   ` (7 preceding siblings ...)
  2015-06-19 11:48 ` [PATCH V2 08/10] cpufreq: governor: Avoid invalid states with additional checks Viresh Kumar
@ 2015-06-19 11:48 ` Viresh Kumar
  2015-06-19 11:48 ` [PATCH V2 10/10] cpufreq: propagate errors returned from __cpufreq_governor() Viresh Kumar
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-19 11:48 UTC (permalink / raw)
  To: Rafael Wysocki, Preeti U Murthy, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan, Viresh Kumar

With the previous commit, governors have started to return errors on
invalid state-transition requests. We already have a WARN for an invalid
state-transition request in cpufreq_governor_dbs(), and it does trigger
today, as the sequence of events isn't guaranteed by the cpufreq core.

Let's stop warning on that for now, and make sure we don't enter an
invalid state.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 8d4c86fa43ec..af63402a94a9 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -543,7 +543,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 	else
 		dbs_data = cdata->gdbs_data;
 
-	if (WARN_ON(!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT))) {
+	if (!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT)) {
 		ret = -EINVAL;
 		goto unlock;
 	}
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V2 10/10] cpufreq: propagate errors returned from __cpufreq_governor()
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
                   ` (8 preceding siblings ...)
  2015-06-19 11:48 ` [PATCH V2 09/10] cpufreq: governor: Don't WARN on invalid states Viresh Kumar
@ 2015-06-19 11:48 ` Viresh Kumar
  2015-06-19 18:02   ` Preeti U Murthy
  2015-06-19 18:09   ` Preeti U Murthy
  2015-06-19 18:08 ` [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Preeti U Murthy
                   ` (2 subsequent siblings)
  12 siblings, 2 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-19 11:48 UTC (permalink / raw)
  To: Rafael Wysocki, Preeti U Murthy, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan, Viresh Kumar

Return codes aren't honored properly in cpufreq_set_policy(). This can
lead to two problems:
- wrong errors propagated to sysfs
- we try to do the next state-change even if the previous one failed

cpufreq_governor_dbs() now returns proper errors on all invalid
state-transition requests and this code should honor that.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b612411655f9..da672b910760 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2284,16 +2284,20 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	old_gov = policy->governor;
 	/* end old governor */
 	if (old_gov) {
-		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
-		up_write(&policy->rwsem);
-		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
-		down_write(&policy->rwsem);
+		if(!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP))) {
+			up_write(&policy->rwsem);
+			ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
+			down_write(&policy->rwsem);
+		}
+
+		if (ret)
+			return ret;
 	}
 
 	/* start new governor */
 	policy->governor = new_policy->governor;
-	if (!__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT)) {
-		if (!__cpufreq_governor(policy, CPUFREQ_GOV_START))
+	if (!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))) {
+		if (!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_START)))
 			goto out;
 
 		up_write(&policy->rwsem);
@@ -2305,11 +2309,11 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	pr_debug("starting governor %s failed\n", policy->governor->name);
 	if (old_gov) {
 		policy->governor = old_gov;
-		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT);
-		__cpufreq_governor(policy, CPUFREQ_GOV_START);
+		if (!__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))
+			__cpufreq_governor(policy, CPUFREQ_GOV_START);
 	}
 
-	return -EINVAL;
+	return ret;
 
  out:
 	pr_debug("governor: change or update limits\n");
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 10/10] cpufreq: propagate errors returned from __cpufreq_governor()
  2015-06-19 11:48 ` [PATCH V2 10/10] cpufreq: propagate errors returned from __cpufreq_governor() Viresh Kumar
@ 2015-06-19 18:02   ` Preeti U Murthy
  2015-06-20  3:12     ` Viresh Kumar
  2015-06-19 18:09   ` Preeti U Murthy
  1 sibling, 1 reply; 42+ messages in thread
From: Preeti U Murthy @ 2015-06-19 18:02 UTC (permalink / raw)
  To: Viresh Kumar, Rafael Wysocki, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan

On 06/19/2015 05:18 PM, Viresh Kumar wrote:
> Return codes aren't honored properly in cpufreq_set_policy(). This can
> lead to two problems:
> - wrong errors propagated to sysfs
> - we try to do next state-change even if the previous one failed
> 
> cpufreq_governor_dbs() now returns proper errors on all invalid
> state-transition requests and this code should honor that.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Minor points below.

> ---
>  drivers/cpufreq/cpufreq.c | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index b612411655f9..da672b910760 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2284,16 +2284,20 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
>  	old_gov = policy->governor;
>  	/* end old governor */
>  	if (old_gov) {
> -		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> -		up_write(&policy->rwsem);
> -		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
> -		down_write(&policy->rwsem);
> +		if(!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP))) {
> +			up_write(&policy->rwsem);
> +			ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
> +			down_write(&policy->rwsem);
> +		}
> +
> +		if (ret)
> +			return ret;

Would some debug prints help here? Like "Failed to stop governor".
>  	}
> 
>  	/* start new governor */
>  	policy->governor = new_policy->governor;
> -	if (!__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT)) {
> -		if (!__cpufreq_governor(policy, CPUFREQ_GOV_START))
> +	if (!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))) {
> +		if (!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_START)))
>  			goto out;
> 
>  		up_write(&policy->rwsem);
> @@ -2305,11 +2309,11 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
>  	pr_debug("starting governor %s failed\n", policy->governor->name);
>  	if (old_gov) {
>  		policy->governor = old_gov;
> -		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT);
> -		__cpufreq_governor(policy, CPUFREQ_GOV_START);
> +		if (!__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))
> +			__cpufreq_governor(policy, CPUFREQ_GOV_START);

If INIT itself fails, we need to set policy->governor to NULL. I
included this in the testing of the patchset.

Regards
Preeti U Murthy


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 08/10] cpufreq: governor: Avoid invalid states with additional checks
  2015-06-19 11:48 ` [PATCH V2 08/10] cpufreq: governor: Avoid invalid states with additional checks Viresh Kumar
@ 2015-06-19 18:04   ` Preeti U Murthy
  0 siblings, 0 replies; 42+ messages in thread
From: Preeti U Murthy @ 2015-06-19 18:04 UTC (permalink / raw)
  To: Viresh Kumar, Rafael Wysocki, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan

On 06/19/2015 05:18 PM, Viresh Kumar wrote:
> There can be races where the request has come to a wrong state. For
> example INIT followed by STOP (instead of START) or START followed by
> EXIT (instead of STOP).
> 
> Tackle these races by making sure the state-machine never gets into
> any invalid state, and return an error if an invalid state-transition
> is requested.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
                   ` (9 preceding siblings ...)
  2015-06-19 11:48 ` [PATCH V2 10/10] cpufreq: propagate errors returned from __cpufreq_governor() Viresh Kumar
@ 2015-06-19 18:08 ` Preeti U Murthy
  2015-06-19 23:16 ` Rafael J. Wysocki
  2015-07-17 22:15 ` Rafael J. Wysocki
  12 siblings, 0 replies; 42+ messages in thread
From: Preeti U Murthy @ 2015-06-19 18:08 UTC (permalink / raw)
  To: Viresh Kumar, Rafael Wysocki, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan

On 06/19/2015 05:18 PM, Viresh Kumar wrote:
> Hi Rafael/Preeti,
> 
> This is another attempt to fix the crashes reported by Preeti. They work
> quite well for me now, and I hope they would work well for Preeti as
> well :)
> 
> So, patches [1-7,9] are already Reviewed by Preeti.
> 
> The first 5 patches are minor cleanups, 6th & 7th try to optimize few
> things to make code less complex.
> 
> Patches 8-10 actually solve (or try to solve :)) the synchronization
> problems, or the crashes I was getting.

This series has been tested on powerpc, by running governor switching in
parallel, and by another test where hotplug and governor switching were
run in parallel. So,

Tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> 
> V1->V2:
> - 7/11 is dropped and only 8/11 is updated, which is 8/10 now.
> - Avoid taking the same mutex in both cpufreq_governor_dbs() and
>   work-handler as that has given us some lockdeps, classic ABBA stuff.
> - And so timer_mutex, responsible for synchronizing governor
>   work-handlers is kept as is.
> - Later patches are almost same with minor updates.
> - Invalid state-transitions are sensed better now with improved checks.
> - I have run enough tests on my exynos dual core board and failed to
>   crash at all. Not that I wanted to crash :)
> 
> Rebased over pm/bleeeding-edge.
> 
> Viresh Kumar (10):
>   cpufreq: governor: Name delayed-work as dwork
>   cpufreq: governor: Drop unused field 'cpu'
>   cpufreq: governor: Rename 'cpu_dbs_common_info' to 'cpu_dbs_info'
>   cpufreq: governor: name pointer to cpu_dbs_info as 'cdbs'
>   cpufreq: governor: rename cur_policy as policy
>   cpufreq: governor: Keep single copy of information common to
>     policy->cpus
>   cpufreq: governor: split out common part of {cs|od}_dbs_timer()
>   cpufreq: governor: Avoid invalid states with additional checks
>   cpufreq: governor: Don't WARN on invalid states
>   cpufreq: propagate errors returned from __cpufreq_governor()
> 
>  drivers/cpufreq/cpufreq.c              |  22 ++--
>  drivers/cpufreq/cpufreq_conservative.c |  25 ++---
>  drivers/cpufreq/cpufreq_governor.c     | 196 ++++++++++++++++++++++++---------
>  drivers/cpufreq/cpufreq_governor.h     |  40 ++++---
>  drivers/cpufreq/cpufreq_ondemand.c     |  67 ++++++-----
>  5 files changed, 220 insertions(+), 130 deletions(-)
> 


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 10/10] cpufreq: propagate errors returned from __cpufreq_governor()
  2015-06-19 11:48 ` [PATCH V2 10/10] cpufreq: propagate errors returned from __cpufreq_governor() Viresh Kumar
  2015-06-19 18:02   ` Preeti U Murthy
@ 2015-06-19 18:09   ` Preeti U Murthy
  1 sibling, 0 replies; 42+ messages in thread
From: Preeti U Murthy @ 2015-06-19 18:09 UTC (permalink / raw)
  To: Viresh Kumar, Rafael Wysocki, ke.wang
  Cc: linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene, skannan

On 06/19/2015 05:18 PM, Viresh Kumar wrote:
> Return codes aren't honored properly in cpufreq_set_policy(). This can
> lead to two problems:
> - wrong errors propagated to sysfs
> - we try to do next state-change even if the previous one failed
> 
> cpufreq_governor_dbs() now returns proper errors on all invalid
> state-transition requests and this code should honor that.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Incorporating the nit in my previous reply to this patch would mean,

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
                   ` (10 preceding siblings ...)
  2015-06-19 18:08 ` [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Preeti U Murthy
@ 2015-06-19 23:16 ` Rafael J. Wysocki
  2015-06-20  2:36   ` Viresh Kumar
  2015-07-17 22:15 ` Rafael J. Wysocki
  12 siblings, 1 reply; 42+ messages in thread
From: Rafael J. Wysocki @ 2015-06-19 23:16 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Preeti U Murthy, ke.wang, linaro-kernel, linux-pm, ego, paulus,
	shilpa.bhat, prarit, robert.schoene, skannan

On Friday, June 19, 2015 05:18:00 PM Viresh Kumar wrote:
> Hi Rafael/Preeti,
> 
> This is another attempt to fix the crashes reported by Preeti. They work
> quite well for me now, and I hope they would work well for Preeti as
> well :)
> 
> So, patches [1-7,9] are already Reviewed by Preeti.
> 
> The first 5 patches are minor cleanups, 6th & 7th try to optimize few
> things to make code less complex.
> 
> Patches 8-10 actually solve (or try to solve :)) the synchronization
> problems, or the crashes I was getting.
> 
> V1->V2:
> - 7/11 is dropped and only 8/11 is updated, which is 8/10 now.
> - Avoid taking the same mutex in both cpufreq_governor_dbs() and
>   work-handler as that has given us some lockdeps, classic ABBA stuff.
> - And so timer_mutex, responsible for synchronizing governor
>   work-handlers is kept as is.
> - Later patches are almost same with minor updates.
> - Invalid state-transitions are sensed better now with improved checks.
> - I have run enough tests on my exynos dual core board and failed to
>   crash at all. Not that I wanted to crash :)
> 
> Rebased over pm/bleeeding-edge.
> 
> Viresh Kumar (10):
>   cpufreq: governor: Name delayed-work as dwork
>   cpufreq: governor: Drop unused field 'cpu'
>   cpufreq: governor: Rename 'cpu_dbs_common_info' to 'cpu_dbs_info'
>   cpufreq: governor: name pointer to cpu_dbs_info as 'cdbs'
>   cpufreq: governor: rename cur_policy as policy
>   cpufreq: governor: Keep single copy of information common to
>     policy->cpus
>   cpufreq: governor: split out common part of {cs|od}_dbs_timer()
>   cpufreq: governor: Avoid invalid states with additional checks
>   cpufreq: governor: Don't WARN on invalid states
>   cpufreq: propagate errors returned from __cpufreq_governor()
> 
>  drivers/cpufreq/cpufreq.c              |  22 ++--
>  drivers/cpufreq/cpufreq_conservative.c |  25 ++---
>  drivers/cpufreq/cpufreq_governor.c     | 196 ++++++++++++++++++++++++---------
>  drivers/cpufreq/cpufreq_governor.h     |  40 ++++---
>  drivers/cpufreq/cpufreq_ondemand.c     |  67 ++++++-----
>  5 files changed, 220 insertions(+), 130 deletions(-)

So as I said before, I'm not regarding this as 4.2 material.

The fact that you manage to send your stuff before the merge window opens
doesn't automatically mean that it has to go in during that merge window.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions
  2015-06-19 23:16 ` Rafael J. Wysocki
@ 2015-06-20  2:36   ` Viresh Kumar
  0 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-20  2:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Preeti U Murthy, ke.wang, linaro-kernel, linux-pm, ego, paulus,
	shilpa.bhat, prarit, robert.schoene, skannan

On 20-06-15, 01:16, Rafael J. Wysocki wrote:
> So as I said before, I'm not regarding this as 4.2 material.

Okay.

> The fact that you manage to send your stuff before the merge window opens
> doesn't automatically mean that it has to go in during that merge window.

Yeah, that wasn't the intention. There were a few reasons for me to
hurry up:
- It was related to crashes that we have been aware of for a few kernel
  releases now. And these aren't new features, but bug fixes, of course
  not for bugs introduced in the 4.2 material :)
- I wanted Preeti to test this stuff, as she wouldn't be working on it
  anymore after this month.

Anyway, it's fine with me even if this goes in the next cycle.

--
viresh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 10/10] cpufreq: propagate errors returned from __cpufreq_governor()
  2015-06-19 18:02   ` Preeti U Murthy
@ 2015-06-20  3:12     ` Viresh Kumar
  2015-06-22  4:41       ` Preeti U Murthy
  0 siblings, 1 reply; 42+ messages in thread
From: Viresh Kumar @ 2015-06-20  3:12 UTC (permalink / raw)
  To: Preeti U Murthy
  Cc: Rafael Wysocki, ke.wang, linaro-kernel, linux-pm, ego, paulus,
	shilpa.bhat, prarit, robert.schoene, skannan

On 19-06-15, 23:32, Preeti U Murthy wrote:
> If INIT itself fails, we need to set policy->governor to NULL. I
> included this in the testing of the patchset.

Diff from earlier patch:

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index da672b910760..59fa0c1b7922 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2284,14 +2284,23 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	old_gov = policy->governor;
 	/* end old governor */
 	if (old_gov) {
-		if(!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP))) {
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
+		if (ret) {
+			/* This can happen due to race with other operations */
+			pr_debug("%s: Failed to Stop Governor: %s (%d)\n",
+				 __func__, old_gov->name, ret);
+			return ret;
+		} else {
 			up_write(&policy->rwsem);
 			ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
 			down_write(&policy->rwsem);
-		}
 
-		if (ret)
-			return ret;
+			if (ret) {
+				pr_err("%s: Failed to Exit Governor: %s (%d)\n",
+				       __func__, old_gov->name, ret);
+				return ret;
+			}
+		}
 	}
 
 	/* start new governor */
@@ -2309,7 +2318,9 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	pr_debug("starting governor %s failed\n", policy->governor->name);
 	if (old_gov) {
 		policy->governor = old_gov;
-		if (!__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))
+		if (__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))
+			policy->governor = NULL;
+		else
 			__cpufreq_governor(policy, CPUFREQ_GOV_START);
 	}
 

New-patch:

--------------8<-------------------

From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Thu, 11 Sep 2014 10:50:48 +0530
Subject: [PATCH] cpufreq: propagate errors returned from __cpufreq_governor()

Return codes aren't honored properly in cpufreq_set_policy(). This can
lead to two problems:
- wrong errors propagated to sysfs
- we try to do the next state-change even if the previous one failed

cpufreq_governor_dbs() now returns proper errors on all invalid
state-transition requests and this code should honor that.

Reviewed-and-Tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 33 ++++++++++++++++++++++++---------
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b612411655f9..59fa0c1b7922 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2284,16 +2284,29 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	old_gov = policy->governor;
 	/* end old governor */
 	if (old_gov) {
-		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
-		up_write(&policy->rwsem);
-		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
-		down_write(&policy->rwsem);
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
+		if (ret) {
+			/* This can happen due to race with other operations */
+			pr_debug("%s: Failed to Stop Governor: %s (%d)\n",
+				 __func__, old_gov->name, ret);
+			return ret;
+		} else {
+			up_write(&policy->rwsem);
+			ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
+			down_write(&policy->rwsem);
+
+			if (ret) {
+				pr_err("%s: Failed to Exit Governor: %s (%d)\n",
+				       __func__, old_gov->name, ret);
+				return ret;
+			}
+		}
 	}
 
 	/* start new governor */
 	policy->governor = new_policy->governor;
-	if (!__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT)) {
-		if (!__cpufreq_governor(policy, CPUFREQ_GOV_START))
+	if (!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))) {
+		if (!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_START)))
 			goto out;
 
 		up_write(&policy->rwsem);
@@ -2305,11 +2318,13 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	pr_debug("starting governor %s failed\n", policy->governor->name);
 	if (old_gov) {
 		policy->governor = old_gov;
-		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT);
-		__cpufreq_governor(policy, CPUFREQ_GOV_START);
+		if (__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))
+			policy->governor = NULL;
+		else
+			__cpufreq_governor(policy, CPUFREQ_GOV_START);
 	}
 
-	return -EINVAL;
+	return ret;
 
  out:
 	pr_debug("governor: change or update limits\n");

-- 
viresh

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 10/10] cpufreq: propagate errors returned from __cpufreq_governor()
  2015-06-20  3:12     ` Viresh Kumar
@ 2015-06-22  4:41       ` Preeti U Murthy
  2015-06-22  5:21         ` Viresh Kumar
  0 siblings, 1 reply; 42+ messages in thread
From: Preeti U Murthy @ 2015-06-22  4:41 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, ke.wang, linaro-kernel, linux-pm, ego, paulus,
	shilpa.bhat, prarit, robert.schoene, skannan

On 06/20/2015 08:42 AM, Viresh Kumar wrote:
> On 19-06-15, 23:32, Preeti U Murthy wrote:
>> If INIT itself fails, we need to set policy->governor to NULL. I
>> included this in the testing of the patchset.
> 
> Diff from earlier patch:
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index da672b910760..59fa0c1b7922 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2284,14 +2284,23 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
>  	old_gov = policy->governor;
>  	/* end old governor */
>  	if (old_gov) {
> -		if(!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP))) {
> +		ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> +		if (ret) {
> +			/* This can happen due to race with other operations */
> +			pr_debug("%s: Failed to Stop Governor: %s (%d)\n",
> +				 __func__, old_gov->name, ret);
> +			return ret;
> +		} else {
>  			up_write(&policy->rwsem);
>  			ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
>  			down_write(&policy->rwsem);
> -		}
> 
> -		if (ret)
> -			return ret;
> +			if (ret) {
> +				pr_err("%s: Failed to Exit Governor: %s (%d)\n",
> +				       __func__, old_gov->name, ret);
> +				return ret;
> +			}
> +		}
>  	}
> 
>  	/* start new governor */
> @@ -2309,7 +2318,9 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
>  	pr_debug("starting governor %s failed\n", policy->governor->name);
>  	if (old_gov) {
>  		policy->governor = old_gov;
> -		if (!__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))
> +		if (__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))
> +			policy->governor = NULL;
> +		else
>  			__cpufreq_governor(policy, CPUFREQ_GOV_START);
>  	}
> 
> 
> New-patch:
> 
> --------------8<-------------------
> 
> From: Viresh Kumar <viresh.kumar@linaro.org>
> Date: Thu, 11 Sep 2014 10:50:48 +0530
> Subject: [PATCH] cpufreq: propagate errors returned from __cpufreq_governor()
> 
> Return codes aren't honored properly in cpufreq_set_policy(). This can
> lead to two problems:
> - wrong errors propagated to sysfs
> - we try to do next state-change even if the previous one failed
> 
> cpufreq_governor_dbs() now returns proper errors on all invalid
> state-transition requests and this code should honor that.
> 
> Reviewed-and-Tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/cpufreq.c | 33 ++++++++++++++++++++++++---------
>  1 file changed, 24 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index b612411655f9..59fa0c1b7922 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2284,16 +2284,29 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
>  	old_gov = policy->governor;
>  	/* end old governor */
>  	if (old_gov) {
> -		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> -		up_write(&policy->rwsem);
> -		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
> -		down_write(&policy->rwsem);
> +		ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> +		if (ret) {
> +			/* This can happen due to race with other operations */
> +			pr_debug("%s: Failed to Stop Governor: %s (%d)\n",
> +				 __func__, old_gov->name, ret);
> +			return ret;
> +		} else {
> +			up_write(&policy->rwsem);
> +			ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
> +			down_write(&policy->rwsem);
> +
> +			if (ret) {
> +				pr_err("%s: Failed to Exit Governor: %s (%d)\n",
> +				       __func__, old_gov->name, ret);
> +				return ret;
> +			}
> +		}
>  	}
> 
>  	/* start new governor */
>  	policy->governor = new_policy->governor;
> -	if (!__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT)) {
> -		if (!__cpufreq_governor(policy, CPUFREQ_GOV_START))
> +	if (!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))) {
> +		if (!(ret = __cpufreq_governor(policy, CPUFREQ_GOV_START)))
>  			goto out;
> 
>  		up_write(&policy->rwsem);
> @@ -2305,11 +2318,13 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
>  	pr_debug("starting governor %s failed\n", policy->governor->name);
>  	if (old_gov) {
>  		policy->governor = old_gov;
> -		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT);
> -		__cpufreq_governor(policy, CPUFREQ_GOV_START);
> +		if (__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))
> +			policy->governor = NULL;
> +		else
> +			__cpufreq_governor(policy, CPUFREQ_GOV_START);
>  	}
> 
> -	return -EINVAL;
> +	return ret;
> 
>   out:
>  	pr_debug("governor: change or update limits\n");
> 

There are some checkpatch errors on this patch.

Regards
Preeti U Murthy


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 10/10] cpufreq: propagate errors returned from __cpufreq_governor()
  2015-06-22  4:41       ` Preeti U Murthy
@ 2015-06-22  5:21         ` Viresh Kumar
  0 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-06-22  5:21 UTC (permalink / raw)
  To: Preeti U Murthy
  Cc: Rafael Wysocki, ke.wang, linaro-kernel, linux-pm, ego, paulus,
	shilpa.bhat, prarit, robert.schoene, skannan

On 22-06-15, 10:11, Preeti U Murthy wrote:
> There are some checkpatch errors on this patch.

--------------8<------------------

From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Thu, 11 Sep 2014 10:50:48 +0530
Subject: [PATCH] cpufreq: propagate errors returned from __cpufreq_governor()

Return codes aren't honored properly in cpufreq_set_policy(). This can
lead to two problems:
- wrong errors propagated to sysfs
- we try to do next state-change even if the previous one failed

cpufreq_governor_dbs() now returns proper errors on all invalid
state-transition requests and this code should honor that.

Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b612411655f9..0b3c60861bdf 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2284,16 +2284,31 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	old_gov = policy->governor;
 	/* end old governor */
 	if (old_gov) {
-		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
+		if (ret) {
+			/* This can happen due to race with other operations */
+			pr_debug("%s: Failed to Stop Governor: %s (%d)\n",
+				 __func__, old_gov->name, ret);
+			return ret;
+		}
+
 		up_write(&policy->rwsem);
-		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
 		down_write(&policy->rwsem);
+
+		if (ret) {
+			pr_err("%s: Failed to Exit Governor: %s (%d)\n",
+			       __func__, old_gov->name, ret);
+			return ret;
+		}
 	}
 
 	/* start new governor */
 	policy->governor = new_policy->governor;
-	if (!__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT)) {
-		if (!__cpufreq_governor(policy, CPUFREQ_GOV_START))
+	ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT);
+	if (!ret) {
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
+		if (!ret)
 			goto out;
 
 		up_write(&policy->rwsem);
@@ -2305,11 +2320,13 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	pr_debug("starting governor %s failed\n", policy->governor->name);
 	if (old_gov) {
 		policy->governor = old_gov;
-		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT);
-		__cpufreq_governor(policy, CPUFREQ_GOV_START);
+		if (__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))
+			policy->governor = NULL;
+		else
+			__cpufreq_governor(policy, CPUFREQ_GOV_START);
 	}
 
-	return -EINVAL;
+	return ret;
 
  out:
 	pr_debug("governor: change or update limits\n");

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus
  2015-06-19 11:48 ` [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus Viresh Kumar
@ 2015-07-17 22:11   ` Rafael J. Wysocki
  2015-11-27 11:56   ` Lucas Stach
  1 sibling, 0 replies; 42+ messages in thread
From: Rafael J. Wysocki @ 2015-07-17 22:11 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Preeti U Murthy, ke.wang, linaro-kernel, linux-pm, ego, paulus,
	shilpa.bhat, prarit, robert.schoene, skannan

On Friday, June 19, 2015 05:18:06 PM Viresh Kumar wrote:
> Some information is common to all CPUs belonging to a policy, but is
> kept on a per-cpu basis. Let's keep that in another structure common to
> all policy->cpus. That will make updates/reads to it less complex and
> less error prone.
> 
> The memory for ccdbs is allocated/freed at INIT/EXIT, so that we don't
> reallocate it for the STOP/START sequence. It will also be used (in the
> next patch) while the governor is stopped, and so must not be freed
> that early.
> 
> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/cpufreq_conservative.c | 18 ++++---
>  drivers/cpufreq/cpufreq_governor.c     | 92 +++++++++++++++++++++++++---------
>  drivers/cpufreq/cpufreq_governor.h     | 24 +++++----
>  drivers/cpufreq/cpufreq_ondemand.c     | 38 +++++++-------
>  4 files changed, 114 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index af47d322679e..5b5c01ca556c 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -47,7 +47,7 @@ static inline unsigned int get_freq_target(struct cs_dbs_tuners *cs_tuners,
>  static void cs_check_cpu(int cpu, unsigned int load)
>  {
>  	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, cpu);
> -	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
> +	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
>  	struct dbs_data *dbs_data = policy->governor_data;
>  	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
>  
> @@ -106,22 +106,24 @@ static void cs_dbs_timer(struct work_struct *work)
>  {
>  	struct cs_cpu_dbs_info_s *dbs_info = container_of(work,
>  			struct cs_cpu_dbs_info_s, cdbs.dwork.work);
> -	unsigned int cpu = dbs_info->cdbs.policy->cpu;
> +	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
> +	unsigned int cpu = policy->cpu;
>  	struct cs_cpu_dbs_info_s *core_dbs_info = &per_cpu(cs_cpu_dbs_info,
>  			cpu);
> -	struct dbs_data *dbs_data = dbs_info->cdbs.policy->governor_data;
> +	struct dbs_data *dbs_data = policy->governor_data;
>  	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
>  	int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
>  	bool modify_all = true;
>  
> -	mutex_lock(&core_dbs_info->cdbs.timer_mutex);
> -	if (!need_load_eval(&core_dbs_info->cdbs, cs_tuners->sampling_rate))
> +	mutex_lock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
> +	if (!need_load_eval(core_dbs_info->cdbs.ccdbs,
> +			    cs_tuners->sampling_rate))
>  		modify_all = false;
>  	else
>  		dbs_check_cpu(dbs_data, cpu);
>  
> -	gov_queue_work(dbs_data, dbs_info->cdbs.policy, delay, modify_all);
> -	mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
> +	gov_queue_work(dbs_data, policy, delay, modify_all);
> +	mutex_unlock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
>  }
>  
>  static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
> @@ -135,7 +137,7 @@ static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
>  	if (!dbs_info->enable)
>  		return 0;
>  
> -	policy = dbs_info->cdbs.policy;
> +	policy = dbs_info->cdbs.ccdbs->policy;
>  
>  	/*
>  	 * we only care if our internally tracked freq moves outside the 'valid'
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index c0566f86caed..72a92fa71ba5 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -35,7 +35,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>  	struct cpu_dbs_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>  	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
>  	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
> -	struct cpufreq_policy *policy;
> +	struct cpufreq_policy *policy = cdbs->ccdbs->policy;
>  	unsigned int sampling_rate;
>  	unsigned int max_load = 0;
>  	unsigned int ignore_nice;
> @@ -60,8 +60,6 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>  		ignore_nice = cs_tuners->ignore_nice_load;
>  	}
>  
> -	policy = cdbs->policy;
> -
>  	/* Get Absolute Load */
>  	for_each_cpu(j, policy->cpus) {
>  		struct cpu_dbs_info *j_cdbs;
> @@ -209,17 +207,18 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
>  }
>  
>  /* Will return if we need to evaluate cpu load again or not */
> -bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate)
> +bool need_load_eval(struct cpu_common_dbs_info *ccdbs,
> +		    unsigned int sampling_rate)
>  {
> -	if (policy_is_shared(cdbs->policy)) {
> +	if (policy_is_shared(ccdbs->policy)) {
>  		ktime_t time_now = ktime_get();
> -		s64 delta_us = ktime_us_delta(time_now, cdbs->time_stamp);
> +		s64 delta_us = ktime_us_delta(time_now, ccdbs->time_stamp);
>  
>  		/* Do nothing if we recently have sampled */
>  		if (delta_us < (s64)(sampling_rate / 2))
>  			return false;
>  		else
> -			cdbs->time_stamp = time_now;
> +			ccdbs->time_stamp = time_now;
>  	}
>  
>  	return true;
> @@ -238,6 +237,37 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
>  	}
>  }
>  
> +static int alloc_ccdbs(struct cpufreq_policy *policy,
> +		       struct common_dbs_data *cdata)
> +{
> +	struct cpu_common_dbs_info *ccdbs;
> +	int j;
> +
> +	/* Allocate memory for the common information for policy->cpus */
> +	ccdbs = kzalloc(sizeof(*ccdbs), GFP_KERNEL);
> +	if (!ccdbs)
> +		return -ENOMEM;
> +
> +	/* Set ccdbs for all CPUs, online+offline */
> +	for_each_cpu(j, policy->related_cpus)
> +		cdata->get_cpu_cdbs(j)->ccdbs = ccdbs;
> +
> +	return 0;
> +}
> +
> +static void free_ccdbs(struct cpufreq_policy *policy,
> +		       struct common_dbs_data *cdata)
> +{
> +	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(policy->cpu);
> +	struct cpu_common_dbs_info *ccdbs = cdbs->ccdbs;
> +	int j;
> +
> +	for_each_cpu(j, policy->cpus)
> +		cdata->get_cpu_cdbs(j)->ccdbs = NULL;
> +
> +	kfree(ccdbs);
> +}
> +
>  static int cpufreq_governor_init(struct cpufreq_policy *policy,
>  				 struct dbs_data *dbs_data,
>  				 struct common_dbs_data *cdata)
> @@ -248,6 +278,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
>  	if (dbs_data) {
>  		if (WARN_ON(have_governor_per_policy()))
>  			return -EINVAL;
> +
> +		ret = alloc_ccdbs(policy, cdata);
> +		if (ret)
> +			return ret;
> +
>  		dbs_data->usage_count++;
>  		policy->governor_data = dbs_data;
>  		return 0;
> @@ -257,12 +292,16 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
>  	if (!dbs_data)
>  		return -ENOMEM;
>  
> +	ret = alloc_ccdbs(policy, cdata);
> +	if (ret)
> +		goto free_dbs_data;
> +
>  	dbs_data->cdata = cdata;
>  	dbs_data->usage_count = 1;
>  
>  	ret = cdata->init(dbs_data, !policy->governor->initialized);
>  	if (ret)
> -		goto free_dbs_data;
> +		goto free_ccdbs;
>  
>  	/* policy latency is in ns. Convert it to us first */
>  	latency = policy->cpuinfo.transition_latency / 1000;
> @@ -299,6 +338,8 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
>  	}
>  cdata_exit:
>  	cdata->exit(dbs_data, !policy->governor->initialized);
> +free_ccdbs:
> +	free_ccdbs(policy, cdata);
>  free_dbs_data:
>  	kfree(dbs_data);
>  	return ret;
> @@ -322,6 +363,8 @@ static void cpufreq_governor_exit(struct cpufreq_policy *policy,
>  		cdata->exit(dbs_data, policy->governor->initialized == 1);
>  		kfree(dbs_data);
>  	}
> +
> +	free_ccdbs(policy, cdata);
>  }
>  
>  static int cpufreq_governor_start(struct cpufreq_policy *policy,
> @@ -330,6 +373,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
>  	struct common_dbs_data *cdata = dbs_data->cdata;
>  	unsigned int sampling_rate, ignore_nice, j, cpu = policy->cpu;
>  	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
> +	struct cpu_common_dbs_info *ccdbs = cdbs->ccdbs;
>  	int io_busy = 0;
>  
>  	if (!policy->cur)
> @@ -348,11 +392,14 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
>  		io_busy = od_tuners->io_is_busy;
>  	}
>  
> +	ccdbs->policy = policy;
> +	ccdbs->time_stamp = ktime_get();
> +	mutex_init(&ccdbs->timer_mutex);
> +
>  	for_each_cpu(j, policy->cpus) {
>  		struct cpu_dbs_info *j_cdbs = cdata->get_cpu_cdbs(j);
>  		unsigned int prev_load;
>  
> -		j_cdbs->policy = policy;
>  		j_cdbs->prev_cpu_idle =
>  			get_cpu_idle_time(j, &j_cdbs->prev_cpu_wall, io_busy);
>  
> @@ -364,7 +411,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
>  		if (ignore_nice)
>  			j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>  
> -		mutex_init(&j_cdbs->timer_mutex);
>  		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, cdata->gov_dbs_timer);
>  	}
>  
> @@ -384,9 +430,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
>  		od_ops->powersave_bias_init_cpu(cpu);
>  	}
>  
> -	/* Initiate timer time stamp */
> -	cdbs->time_stamp = ktime_get();
> -
>  	gov_queue_work(dbs_data, policy, delay_for_sampling_rate(sampling_rate),
>  		       true);
>  	return 0;
> @@ -398,6 +441,9 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
>  	struct common_dbs_data *cdata = dbs_data->cdata;
>  	unsigned int cpu = policy->cpu;
>  	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
> +	struct cpu_common_dbs_info *ccdbs = cdbs->ccdbs;
> +
> +	gov_cancel_work(dbs_data, policy);
>  
>  	if (cdata->governor == GOV_CONSERVATIVE) {
>  		struct cs_cpu_dbs_info_s *cs_dbs_info =
> @@ -406,10 +452,8 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
>  		cs_dbs_info->enable = 0;
>  	}
>  
> -	gov_cancel_work(dbs_data, policy);
> -
> -	mutex_destroy(&cdbs->timer_mutex);
> -	cdbs->policy = NULL;
> +	ccdbs->policy = NULL;
> +	mutex_destroy(&ccdbs->timer_mutex);
>  }
>  
>  static void cpufreq_governor_limits(struct cpufreq_policy *policy,
> @@ -419,18 +463,18 @@ static void cpufreq_governor_limits(struct cpufreq_policy *policy,
>  	unsigned int cpu = policy->cpu;
>  	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
>  
> -	if (!cdbs->policy)
> +	if (!cdbs->ccdbs || !cdbs->ccdbs->policy)
>  		return;
>  
> -	mutex_lock(&cdbs->timer_mutex);
> -	if (policy->max < cdbs->policy->cur)
> -		__cpufreq_driver_target(cdbs->policy, policy->max,
> +	mutex_lock(&cdbs->ccdbs->timer_mutex);
> +	if (policy->max < cdbs->ccdbs->policy->cur)
> +		__cpufreq_driver_target(cdbs->ccdbs->policy, policy->max,
>  					CPUFREQ_RELATION_H);
> -	else if (policy->min > cdbs->policy->cur)
> -		__cpufreq_driver_target(cdbs->policy, policy->min,
> +	else if (policy->min > cdbs->ccdbs->policy->cur)
> +		__cpufreq_driver_target(cdbs->ccdbs->policy, policy->min,
>  					CPUFREQ_RELATION_L);
>  	dbs_check_cpu(dbs_data, cpu);
> -	mutex_unlock(&cdbs->timer_mutex);
> +	mutex_unlock(&cdbs->ccdbs->timer_mutex);
>  }
>  
>  int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> index a0f8eb79ee6d..153412cafb05 100644
> --- a/drivers/cpufreq/cpufreq_governor.h
> +++ b/drivers/cpufreq/cpufreq_governor.h
> @@ -128,6 +128,18 @@ static void *get_cpu_dbs_info_s(int cpu)				\
>   * cs_*: Conservative governor
>   */
>  
> +/* Common to all CPUs of a policy */
> +struct cpu_common_dbs_info {
> +	struct cpufreq_policy *policy;
> +	/*
> +	 * percpu mutex that serializes governor limit change with gov_dbs_timer
> +	 * invocation. We do not want gov_dbs_timer to run when user is changing
> +	 * the governor or limits.
> +	 */
> +	struct mutex timer_mutex;
> +	ktime_t time_stamp;
> +};
> +
>  /* Per cpu structures */
>  struct cpu_dbs_info {
>  	u64 prev_cpu_idle;
> @@ -140,15 +152,8 @@ struct cpu_dbs_info {
>  	 * wake-up from idle.
>  	 */
>  	unsigned int prev_load;
> -	struct cpufreq_policy *policy;
>  	struct delayed_work dwork;
> -	/*
> -	 * percpu mutex that serializes governor limit change with gov_dbs_timer
> -	 * invocation. We do not want gov_dbs_timer to run when user is changing
> -	 * the governor or limits.
> -	 */
> -	struct mutex timer_mutex;
> -	ktime_t time_stamp;
> +	struct cpu_common_dbs_info *ccdbs;

I don't like this new field name to be honest.  It's just too cryptic to me.

Why don't you simply call it "common" or "shared", for example?

>  };
>  
>  struct od_cpu_dbs_info_s {
> @@ -264,7 +269,8 @@ static ssize_t show_sampling_rate_min_gov_pol				\
>  extern struct mutex cpufreq_governor_lock;
>  
>  void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
> -bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate);
> +bool need_load_eval(struct cpu_common_dbs_info *ccdbs,
> +		    unsigned int sampling_rate);
>  int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>  		struct common_dbs_data *cdata, unsigned int event);
>  void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> index d29c6f9c6e3e..651677cfa581 100644
> --- a/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/drivers/cpufreq/cpufreq_ondemand.c
> @@ -155,7 +155,7 @@ static void dbs_freq_increase(struct cpufreq_policy *policy, unsigned int freq)
>  static void od_check_cpu(int cpu, unsigned int load)
>  {
>  	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
> -	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
> +	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;

If you rename ccdbs to "shared" that will look as follows:

	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;

which is way more readable to me.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions
  2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
                   ` (11 preceding siblings ...)
  2015-06-19 23:16 ` Rafael J. Wysocki
@ 2015-07-17 22:15 ` Rafael J. Wysocki
  2015-07-18  6:00   ` [PATCH V3 0/5] " Viresh Kumar
  12 siblings, 1 reply; 42+ messages in thread
From: Rafael J. Wysocki @ 2015-07-17 22:15 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Preeti U Murthy, ke.wang, linaro-kernel, linux-pm, ego, paulus,
	shilpa.bhat, prarit, robert.schoene, skannan

On Friday, June 19, 2015 05:18:00 PM Viresh Kumar wrote:
> Hi Rafael/Preeti,
> 
> This is another attempt to fix the crashes reported by Preeti. They work
> quite well for me now, and I hope they would work well for Preeti as
> well :)
> 
> So, patches [1-7,9] have already been reviewed by Preeti.
> 
> The first 5 patches are minor cleanups; the 6th & 7th optimize a few
> things to make the code less complex.
> 
> Patches 8-10 actually solve (or try to solve :)) the synchronization
> problems, or the crashes I was getting.
> 
> V1->V2:
> - 7/11 is dropped and only 8/11 is updated, which is 8/10 now.
> - Avoid taking the same mutex in both cpufreq_governor_dbs() and
>   work-handler, as that has given us some lockdep reports, classic ABBA stuff.
> - And so timer_mutex, responsible for synchronizing governor
>   work-handlers is kept as is.
> - Later patches are almost same with minor updates.
> - Invalid state-transitions are now detected better with improved checks.
> - I have run enough tests on my exynos dual core board and failed to
>   crash at all. Not that I wanted to crash :)
> 
> Rebased over pm/bleeding-edge.
> 
> Viresh Kumar (10):
>   cpufreq: governor: Name delayed-work as dwork
>   cpufreq: governor: Drop unused field 'cpu'
>   cpufreq: governor: Rename 'cpu_dbs_common_info' to 'cpu_dbs_info'
>   cpufreq: governor: name pointer to cpu_dbs_info as 'cdbs'
>   cpufreq: governor: rename cur_policy as policy

The above are all fine by me, so I'm queuing them up for 4.3.

The one below needs a small update in my view which will propagate to the
next ones.

>   cpufreq: governor: Keep single copy of information common to
>     policy->cpus
>   cpufreq: governor: split out common part of {cs|od}_dbs_timer()
>   cpufreq: governor: Avoid invalid states with additional checks
>   cpufreq: governor: Don't WARN on invalid states
>   cpufreq: propagate errors returned from __cpufreq_governor()

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH V3 0/5] cpufreq: governor: Avoid invalid state-transitions
  2015-07-17 22:15 ` Rafael J. Wysocki
@ 2015-07-18  6:00   ` Viresh Kumar
  2015-07-18  6:00       ` Viresh Kumar
                       ` (5 more replies)
  0 siblings, 6 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-07-18  6:00 UTC (permalink / raw)
  To: Rafael Wysocki; +Cc: linaro-kernel, linux-pm, Viresh Kumar

Hi Rafael,

These are the rest of the patches, after replacing 'ccdbs' with 'shared'
as you suggested.

Viresh Kumar (5):
  cpufreq: governor: Keep single copy of information common to
    policy->cpus
  cpufreq: governor: split out common part of {cs|od}_dbs_timer()
  cpufreq: governor: Avoid invalid states with additional checks
  cpufreq: governor: Don't WARN on invalid states
  cpufreq: propagate errors returned from __cpufreq_governor()

 drivers/cpufreq/cpufreq.c              |  31 ++++--
 drivers/cpufreq/cpufreq_conservative.c |  25 ++---
 drivers/cpufreq/cpufreq_governor.c     | 174 ++++++++++++++++++++++++++-------
 drivers/cpufreq/cpufreq_governor.h     |  26 +++--
 drivers/cpufreq/cpufreq_ondemand.c     |  58 +++++------
 5 files changed, 210 insertions(+), 104 deletions(-)

-- 
2.4.0


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH V3 1/5] cpufreq: governor: Keep single copy of information common to policy->cpus
  2015-07-18  6:00   ` [PATCH V3 0/5] " Viresh Kumar
@ 2015-07-18  6:00       ` Viresh Kumar
  2015-07-18  6:01       ` Viresh Kumar
                         ` (4 subsequent siblings)
  5 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-07-18  6:00 UTC (permalink / raw)
  To: Rafael Wysocki; +Cc: linaro-kernel, linux-pm, Viresh Kumar, open list

Some information is common to all CPUs belonging to a policy, but is
kept on a per-cpu basis. Let's keep that in another structure, common to
all policy->cpus. That will make updates/reads to it less complex and
less error prone.

The memory for cpu_common_dbs_info is allocated/freed at INIT/EXIT, so
that we don't reallocate it across the STOP/START sequence. It will also
be used (in the next patch) while the governor is stopped and so must
not be freed that early.

Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c | 18 ++++---
 drivers/cpufreq/cpufreq_governor.c     | 92 +++++++++++++++++++++++++---------
 drivers/cpufreq/cpufreq_governor.h     | 24 +++++----
 drivers/cpufreq/cpufreq_ondemand.c     | 38 +++++++-------
 4 files changed, 114 insertions(+), 58 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index af47d322679e..d21c3cff9056 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -47,7 +47,7 @@ static inline unsigned int get_freq_target(struct cs_dbs_tuners *cs_tuners,
 static void cs_check_cpu(int cpu, unsigned int load)
 {
 	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, cpu);
-	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
+	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
 	struct dbs_data *dbs_data = policy->governor_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 
@@ -106,22 +106,24 @@ static void cs_dbs_timer(struct work_struct *work)
 {
 	struct cs_cpu_dbs_info_s *dbs_info = container_of(work,
 			struct cs_cpu_dbs_info_s, cdbs.dwork.work);
-	unsigned int cpu = dbs_info->cdbs.policy->cpu;
+	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
+	unsigned int cpu = policy->cpu;
 	struct cs_cpu_dbs_info_s *core_dbs_info = &per_cpu(cs_cpu_dbs_info,
 			cpu);
-	struct dbs_data *dbs_data = dbs_info->cdbs.policy->governor_data;
+	struct dbs_data *dbs_data = policy->governor_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 	int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
 	bool modify_all = true;
 
-	mutex_lock(&core_dbs_info->cdbs.timer_mutex);
-	if (!need_load_eval(&core_dbs_info->cdbs, cs_tuners->sampling_rate))
+	mutex_lock(&core_dbs_info->cdbs.shared->timer_mutex);
+	if (!need_load_eval(core_dbs_info->cdbs.shared,
+			    cs_tuners->sampling_rate))
 		modify_all = false;
 	else
 		dbs_check_cpu(dbs_data, cpu);
 
-	gov_queue_work(dbs_data, dbs_info->cdbs.policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
+	gov_queue_work(dbs_data, policy, delay, modify_all);
+	mutex_unlock(&core_dbs_info->cdbs.shared->timer_mutex);
 }
 
 static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
@@ -135,7 +137,7 @@ static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
 	if (!dbs_info->enable)
 		return 0;
 
-	policy = dbs_info->cdbs.policy;
+	policy = dbs_info->cdbs.shared->policy;
 
 	/*
 	 * we only care if our internally tracked freq moves outside the 'valid'
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index c0566f86caed..b01cb729104b 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -35,7 +35,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 	struct cpu_dbs_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
-	struct cpufreq_policy *policy;
+	struct cpufreq_policy *policy = cdbs->shared->policy;
 	unsigned int sampling_rate;
 	unsigned int max_load = 0;
 	unsigned int ignore_nice;
@@ -60,8 +60,6 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 		ignore_nice = cs_tuners->ignore_nice_load;
 	}
 
-	policy = cdbs->policy;
-
 	/* Get Absolute Load */
 	for_each_cpu(j, policy->cpus) {
 		struct cpu_dbs_info *j_cdbs;
@@ -209,17 +207,18 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
 }
 
 /* Will return if we need to evaluate cpu load again or not */
-bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate)
+bool need_load_eval(struct cpu_common_dbs_info *shared,
+		    unsigned int sampling_rate)
 {
-	if (policy_is_shared(cdbs->policy)) {
+	if (policy_is_shared(shared->policy)) {
 		ktime_t time_now = ktime_get();
-		s64 delta_us = ktime_us_delta(time_now, cdbs->time_stamp);
+		s64 delta_us = ktime_us_delta(time_now, shared->time_stamp);
 
 		/* Do nothing if we recently have sampled */
 		if (delta_us < (s64)(sampling_rate / 2))
 			return false;
 		else
-			cdbs->time_stamp = time_now;
+			shared->time_stamp = time_now;
 	}
 
 	return true;
@@ -238,6 +237,37 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
 	}
 }
 
+static int alloc_common_dbs_info(struct cpufreq_policy *policy,
+				 struct common_dbs_data *cdata)
+{
+	struct cpu_common_dbs_info *shared;
+	int j;
+
+	/* Allocate memory for the common information for policy->cpus */
+	shared = kzalloc(sizeof(*shared), GFP_KERNEL);
+	if (!shared)
+		return -ENOMEM;
+
+	/* Set shared for all CPUs, online+offline */
+	for_each_cpu(j, policy->related_cpus)
+		cdata->get_cpu_cdbs(j)->shared = shared;
+
+	return 0;
+}
+
+static void free_common_dbs_info(struct cpufreq_policy *policy,
+				 struct common_dbs_data *cdata)
+{
+	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(policy->cpu);
+	struct cpu_common_dbs_info *shared = cdbs->shared;
+	int j;
+
+	for_each_cpu(j, policy->cpus)
+		cdata->get_cpu_cdbs(j)->shared = NULL;
+
+	kfree(shared);
+}
+
 static int cpufreq_governor_init(struct cpufreq_policy *policy,
 				 struct dbs_data *dbs_data,
 				 struct common_dbs_data *cdata)
@@ -248,6 +278,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	if (dbs_data) {
 		if (WARN_ON(have_governor_per_policy()))
 			return -EINVAL;
+
+		ret = alloc_common_dbs_info(policy, cdata);
+		if (ret)
+			return ret;
+
 		dbs_data->usage_count++;
 		policy->governor_data = dbs_data;
 		return 0;
@@ -257,12 +292,16 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	if (!dbs_data)
 		return -ENOMEM;
 
+	ret = alloc_common_dbs_info(policy, cdata);
+	if (ret)
+		goto free_dbs_data;
+
 	dbs_data->cdata = cdata;
 	dbs_data->usage_count = 1;
 
 	ret = cdata->init(dbs_data, !policy->governor->initialized);
 	if (ret)
-		goto free_dbs_data;
+		goto free_common_dbs_info;
 
 	/* policy latency is in ns. Convert it to us first */
 	latency = policy->cpuinfo.transition_latency / 1000;
@@ -299,6 +338,8 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	}
 cdata_exit:
 	cdata->exit(dbs_data, !policy->governor->initialized);
+free_common_dbs_info:
+	free_common_dbs_info(policy, cdata);
 free_dbs_data:
 	kfree(dbs_data);
 	return ret;
@@ -322,6 +363,8 @@ static void cpufreq_governor_exit(struct cpufreq_policy *policy,
 		cdata->exit(dbs_data, policy->governor->initialized == 1);
 		kfree(dbs_data);
 	}
+
+	free_common_dbs_info(policy, cdata);
 }
 
 static int cpufreq_governor_start(struct cpufreq_policy *policy,
@@ -330,6 +373,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int sampling_rate, ignore_nice, j, cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_common_dbs_info *shared = cdbs->shared;
 	int io_busy = 0;
 
 	if (!policy->cur)
@@ -348,11 +392,14 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		io_busy = od_tuners->io_is_busy;
 	}
 
+	shared->policy = policy;
+	shared->time_stamp = ktime_get();
+	mutex_init(&shared->timer_mutex);
+
 	for_each_cpu(j, policy->cpus) {
 		struct cpu_dbs_info *j_cdbs = cdata->get_cpu_cdbs(j);
 		unsigned int prev_load;
 
-		j_cdbs->policy = policy;
 		j_cdbs->prev_cpu_idle =
 			get_cpu_idle_time(j, &j_cdbs->prev_cpu_wall, io_busy);
 
@@ -364,7 +411,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		if (ignore_nice)
 			j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 
-		mutex_init(&j_cdbs->timer_mutex);
 		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, cdata->gov_dbs_timer);
 	}
 
@@ -384,9 +430,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		od_ops->powersave_bias_init_cpu(cpu);
 	}
 
-	/* Initiate timer time stamp */
-	cdbs->time_stamp = ktime_get();
-
 	gov_queue_work(dbs_data, policy, delay_for_sampling_rate(sampling_rate),
 		       true);
 	return 0;
@@ -398,6 +441,9 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_common_dbs_info *shared = cdbs->shared;
+
+	gov_cancel_work(dbs_data, policy);
 
 	if (cdata->governor == GOV_CONSERVATIVE) {
 		struct cs_cpu_dbs_info_s *cs_dbs_info =
@@ -406,10 +452,8 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 		cs_dbs_info->enable = 0;
 	}
 
-	gov_cancel_work(dbs_data, policy);
-
-	mutex_destroy(&cdbs->timer_mutex);
-	cdbs->policy = NULL;
+	shared->policy = NULL;
+	mutex_destroy(&shared->timer_mutex);
 }
 
 static void cpufreq_governor_limits(struct cpufreq_policy *policy,
@@ -419,18 +463,18 @@ static void cpufreq_governor_limits(struct cpufreq_policy *policy,
 	unsigned int cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
 
-	if (!cdbs->policy)
+	if (!cdbs->shared || !cdbs->shared->policy)
 		return;
 
-	mutex_lock(&cdbs->timer_mutex);
-	if (policy->max < cdbs->policy->cur)
-		__cpufreq_driver_target(cdbs->policy, policy->max,
+	mutex_lock(&cdbs->shared->timer_mutex);
+	if (policy->max < cdbs->shared->policy->cur)
+		__cpufreq_driver_target(cdbs->shared->policy, policy->max,
 					CPUFREQ_RELATION_H);
-	else if (policy->min > cdbs->policy->cur)
-		__cpufreq_driver_target(cdbs->policy, policy->min,
+	else if (policy->min > cdbs->shared->policy->cur)
+		__cpufreq_driver_target(cdbs->shared->policy, policy->min,
 					CPUFREQ_RELATION_L);
 	dbs_check_cpu(dbs_data, cpu);
-	mutex_unlock(&cdbs->timer_mutex);
+	mutex_unlock(&cdbs->shared->timer_mutex);
 }
 
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index a0f8eb79ee6d..8e4a25f0730c 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -128,6 +128,18 @@ static void *get_cpu_dbs_info_s(int cpu)				\
  * cs_*: Conservative governor
  */
 
+/* Common to all CPUs of a policy */
+struct cpu_common_dbs_info {
+	struct cpufreq_policy *policy;
+	/*
+	 * percpu mutex that serializes governor limit change with gov_dbs_timer
+	 * invocation. We do not want gov_dbs_timer to run when user is changing
+	 * the governor or limits.
+	 */
+	struct mutex timer_mutex;
+	ktime_t time_stamp;
+};
+
 /* Per cpu structures */
 struct cpu_dbs_info {
 	u64 prev_cpu_idle;
@@ -140,15 +152,8 @@ struct cpu_dbs_info {
 	 * wake-up from idle.
 	 */
 	unsigned int prev_load;
-	struct cpufreq_policy *policy;
 	struct delayed_work dwork;
-	/*
-	 * percpu mutex that serializes governor limit change with gov_dbs_timer
-	 * invocation. We do not want gov_dbs_timer to run when user is changing
-	 * the governor or limits.
-	 */
-	struct mutex timer_mutex;
-	ktime_t time_stamp;
+	struct cpu_common_dbs_info *shared;
 };
 
 struct od_cpu_dbs_info_s {
@@ -264,7 +269,8 @@ static ssize_t show_sampling_rate_min_gov_pol				\
 extern struct mutex cpufreq_governor_lock;
 
 void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
-bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate);
+bool need_load_eval(struct cpu_common_dbs_info *shared,
+		    unsigned int sampling_rate);
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 		struct common_dbs_data *cdata, unsigned int event);
 void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index d29c6f9c6e3e..14d7e86e7f94 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -155,7 +155,7 @@ static void dbs_freq_increase(struct cpufreq_policy *policy, unsigned int freq)
 static void od_check_cpu(int cpu, unsigned int load)
 {
 	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
-	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
+	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
 	struct dbs_data *dbs_data = policy->governor_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 
@@ -195,16 +195,18 @@ static void od_dbs_timer(struct work_struct *work)
 {
 	struct od_cpu_dbs_info_s *dbs_info =
 		container_of(work, struct od_cpu_dbs_info_s, cdbs.dwork.work);
-	unsigned int cpu = dbs_info->cdbs.policy->cpu;
+	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
+	unsigned int cpu = policy->cpu;
 	struct od_cpu_dbs_info_s *core_dbs_info = &per_cpu(od_cpu_dbs_info,
 			cpu);
-	struct dbs_data *dbs_data = dbs_info->cdbs.policy->governor_data;
+	struct dbs_data *dbs_data = policy->governor_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	int delay = 0, sample_type = core_dbs_info->sample_type;
 	bool modify_all = true;
 
-	mutex_lock(&core_dbs_info->cdbs.timer_mutex);
-	if (!need_load_eval(&core_dbs_info->cdbs, od_tuners->sampling_rate)) {
+	mutex_lock(&core_dbs_info->cdbs.shared->timer_mutex);
+	if (!need_load_eval(core_dbs_info->cdbs.shared,
+			    od_tuners->sampling_rate)) {
 		modify_all = false;
 		goto max_delay;
 	}
@@ -213,8 +215,7 @@ static void od_dbs_timer(struct work_struct *work)
 	core_dbs_info->sample_type = OD_NORMAL_SAMPLE;
 	if (sample_type == OD_SUB_SAMPLE) {
 		delay = core_dbs_info->freq_lo_jiffies;
-		__cpufreq_driver_target(core_dbs_info->cdbs.policy,
-					core_dbs_info->freq_lo,
+		__cpufreq_driver_target(policy, core_dbs_info->freq_lo,
 					CPUFREQ_RELATION_H);
 	} else {
 		dbs_check_cpu(dbs_data, cpu);
@@ -230,8 +231,8 @@ static void od_dbs_timer(struct work_struct *work)
 		delay = delay_for_sampling_rate(od_tuners->sampling_rate
 				* core_dbs_info->rate_mult);
 
-	gov_queue_work(dbs_data, dbs_info->cdbs.policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
+	gov_queue_work(dbs_data, policy, delay, modify_all);
+	mutex_unlock(&core_dbs_info->cdbs.shared->timer_mutex);
 }
 
 /************************** sysfs interface ************************/
@@ -274,10 +275,10 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
 		dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
 		cpufreq_cpu_put(policy);
 
-		mutex_lock(&dbs_info->cdbs.timer_mutex);
+		mutex_lock(&dbs_info->cdbs.shared->timer_mutex);
 
 		if (!delayed_work_pending(&dbs_info->cdbs.dwork)) {
-			mutex_unlock(&dbs_info->cdbs.timer_mutex);
+			mutex_unlock(&dbs_info->cdbs.shared->timer_mutex);
 			continue;
 		}
 
@@ -286,15 +287,15 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
 
 		if (time_before(next_sampling, appointed_at)) {
 
-			mutex_unlock(&dbs_info->cdbs.timer_mutex);
+			mutex_unlock(&dbs_info->cdbs.shared->timer_mutex);
 			cancel_delayed_work_sync(&dbs_info->cdbs.dwork);
-			mutex_lock(&dbs_info->cdbs.timer_mutex);
+			mutex_lock(&dbs_info->cdbs.shared->timer_mutex);
 
-			gov_queue_work(dbs_data, dbs_info->cdbs.policy,
+			gov_queue_work(dbs_data, policy,
 				       usecs_to_jiffies(new_rate), true);
 
 		}
-		mutex_unlock(&dbs_info->cdbs.timer_mutex);
+		mutex_unlock(&dbs_info->cdbs.shared->timer_mutex);
 	}
 }
 
@@ -557,13 +558,16 @@ static void od_set_powersave_bias(unsigned int powersave_bias)
 
 	get_online_cpus();
 	for_each_online_cpu(cpu) {
+		struct cpu_common_dbs_info *shared;
+
 		if (cpumask_test_cpu(cpu, &done))
 			continue;
 
-		policy = per_cpu(od_cpu_dbs_info, cpu).cdbs.policy;
-		if (!policy)
+		shared = per_cpu(od_cpu_dbs_info, cpu).cdbs.shared;
+		if (!shared)
 			continue;
 
+		policy = shared->policy;
 		cpumask_or(&done, &done, policy->cpus);
 
 		if (policy->governor != &cpufreq_gov_ondemand)
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH V3 1/5] cpufreq: governor: Keep single copy of information common to policy->cpus
@ 2015-07-18  6:00       ` Viresh Kumar
  0 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-07-18  6:00 UTC (permalink / raw)
  To: Rafael Wysocki; +Cc: linaro-kernel, linux-pm, Viresh Kumar, open list

Some information is common to all CPUs belonging to a policy, but is
kept on a per-cpu basis. Let's keep it in a separate structure common to
all policy->cpus. That will make updates/reads of it less complex and
less error prone.

The memory for cpu_common_dbs_info is allocated/freed at INIT/EXIT, so
that we don't reallocate it for every STOP/START sequence. It will also
be used (in the next patch) while the governor is stopped, and so must
not be freed that early.

Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c | 18 ++++---
 drivers/cpufreq/cpufreq_governor.c     | 92 +++++++++++++++++++++++++---------
 drivers/cpufreq/cpufreq_governor.h     | 24 +++++----
 drivers/cpufreq/cpufreq_ondemand.c     | 38 +++++++-------
 4 files changed, 114 insertions(+), 58 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index af47d322679e..d21c3cff9056 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -47,7 +47,7 @@ static inline unsigned int get_freq_target(struct cs_dbs_tuners *cs_tuners,
 static void cs_check_cpu(int cpu, unsigned int load)
 {
 	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, cpu);
-	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
+	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
 	struct dbs_data *dbs_data = policy->governor_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 
@@ -106,22 +106,24 @@ static void cs_dbs_timer(struct work_struct *work)
 {
 	struct cs_cpu_dbs_info_s *dbs_info = container_of(work,
 			struct cs_cpu_dbs_info_s, cdbs.dwork.work);
-	unsigned int cpu = dbs_info->cdbs.policy->cpu;
+	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
+	unsigned int cpu = policy->cpu;
 	struct cs_cpu_dbs_info_s *core_dbs_info = &per_cpu(cs_cpu_dbs_info,
 			cpu);
-	struct dbs_data *dbs_data = dbs_info->cdbs.policy->governor_data;
+	struct dbs_data *dbs_data = policy->governor_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 	int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
 	bool modify_all = true;
 
-	mutex_lock(&core_dbs_info->cdbs.timer_mutex);
-	if (!need_load_eval(&core_dbs_info->cdbs, cs_tuners->sampling_rate))
+	mutex_lock(&core_dbs_info->cdbs.shared->timer_mutex);
+	if (!need_load_eval(core_dbs_info->cdbs.shared,
+			    cs_tuners->sampling_rate))
 		modify_all = false;
 	else
 		dbs_check_cpu(dbs_data, cpu);
 
-	gov_queue_work(dbs_data, dbs_info->cdbs.policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
+	gov_queue_work(dbs_data, policy, delay, modify_all);
+	mutex_unlock(&core_dbs_info->cdbs.shared->timer_mutex);
 }
 
 static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
@@ -135,7 +137,7 @@ static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
 	if (!dbs_info->enable)
 		return 0;
 
-	policy = dbs_info->cdbs.policy;
+	policy = dbs_info->cdbs.shared->policy;
 
 	/*
 	 * we only care if our internally tracked freq moves outside the 'valid'
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index c0566f86caed..b01cb729104b 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -35,7 +35,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 	struct cpu_dbs_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
-	struct cpufreq_policy *policy;
+	struct cpufreq_policy *policy = cdbs->shared->policy;
 	unsigned int sampling_rate;
 	unsigned int max_load = 0;
 	unsigned int ignore_nice;
@@ -60,8 +60,6 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 		ignore_nice = cs_tuners->ignore_nice_load;
 	}
 
-	policy = cdbs->policy;
-
 	/* Get Absolute Load */
 	for_each_cpu(j, policy->cpus) {
 		struct cpu_dbs_info *j_cdbs;
@@ -209,17 +207,18 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
 }
 
 /* Will return if we need to evaluate cpu load again or not */
-bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate)
+bool need_load_eval(struct cpu_common_dbs_info *shared,
+		    unsigned int sampling_rate)
 {
-	if (policy_is_shared(cdbs->policy)) {
+	if (policy_is_shared(shared->policy)) {
 		ktime_t time_now = ktime_get();
-		s64 delta_us = ktime_us_delta(time_now, cdbs->time_stamp);
+		s64 delta_us = ktime_us_delta(time_now, shared->time_stamp);
 
 		/* Do nothing if we recently have sampled */
 		if (delta_us < (s64)(sampling_rate / 2))
 			return false;
 		else
-			cdbs->time_stamp = time_now;
+			shared->time_stamp = time_now;
 	}
 
 	return true;
@@ -238,6 +237,37 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
 	}
 }
 
+static int alloc_common_dbs_info(struct cpufreq_policy *policy,
+				 struct common_dbs_data *cdata)
+{
+	struct cpu_common_dbs_info *shared;
+	int j;
+
+	/* Allocate memory for the common information for policy->cpus */
+	shared = kzalloc(sizeof(*shared), GFP_KERNEL);
+	if (!shared)
+		return -ENOMEM;
+
+	/* Set shared for all CPUs, online+offline */
+	for_each_cpu(j, policy->related_cpus)
+		cdata->get_cpu_cdbs(j)->shared = shared;
+
+	return 0;
+}
+
+static void free_common_dbs_info(struct cpufreq_policy *policy,
+				 struct common_dbs_data *cdata)
+{
+	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(policy->cpu);
+	struct cpu_common_dbs_info *shared = cdbs->shared;
+	int j;
+
+	for_each_cpu(j, policy->cpus)
+		cdata->get_cpu_cdbs(j)->shared = NULL;
+
+	kfree(shared);
+}
+
 static int cpufreq_governor_init(struct cpufreq_policy *policy,
 				 struct dbs_data *dbs_data,
 				 struct common_dbs_data *cdata)
@@ -248,6 +278,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	if (dbs_data) {
 		if (WARN_ON(have_governor_per_policy()))
 			return -EINVAL;
+
+		ret = alloc_common_dbs_info(policy, cdata);
+		if (ret)
+			return ret;
+
 		dbs_data->usage_count++;
 		policy->governor_data = dbs_data;
 		return 0;
@@ -257,12 +292,16 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	if (!dbs_data)
 		return -ENOMEM;
 
+	ret = alloc_common_dbs_info(policy, cdata);
+	if (ret)
+		goto free_dbs_data;
+
 	dbs_data->cdata = cdata;
 	dbs_data->usage_count = 1;
 
 	ret = cdata->init(dbs_data, !policy->governor->initialized);
 	if (ret)
-		goto free_dbs_data;
+		goto free_common_dbs_info;
 
 	/* policy latency is in ns. Convert it to us first */
 	latency = policy->cpuinfo.transition_latency / 1000;
@@ -299,6 +338,8 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	}
 cdata_exit:
 	cdata->exit(dbs_data, !policy->governor->initialized);
+free_common_dbs_info:
+	free_common_dbs_info(policy, cdata);
 free_dbs_data:
 	kfree(dbs_data);
 	return ret;
@@ -322,6 +363,8 @@ static void cpufreq_governor_exit(struct cpufreq_policy *policy,
 		cdata->exit(dbs_data, policy->governor->initialized == 1);
 		kfree(dbs_data);
 	}
+
+	free_common_dbs_info(policy, cdata);
 }
 
 static int cpufreq_governor_start(struct cpufreq_policy *policy,
@@ -330,6 +373,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int sampling_rate, ignore_nice, j, cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_common_dbs_info *shared = cdbs->shared;
 	int io_busy = 0;
 
 	if (!policy->cur)
@@ -348,11 +392,14 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		io_busy = od_tuners->io_is_busy;
 	}
 
+	shared->policy = policy;
+	shared->time_stamp = ktime_get();
+	mutex_init(&shared->timer_mutex);
+
 	for_each_cpu(j, policy->cpus) {
 		struct cpu_dbs_info *j_cdbs = cdata->get_cpu_cdbs(j);
 		unsigned int prev_load;
 
-		j_cdbs->policy = policy;
 		j_cdbs->prev_cpu_idle =
 			get_cpu_idle_time(j, &j_cdbs->prev_cpu_wall, io_busy);
 
@@ -364,7 +411,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		if (ignore_nice)
 			j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 
-		mutex_init(&j_cdbs->timer_mutex);
 		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, cdata->gov_dbs_timer);
 	}
 
@@ -384,9 +430,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		od_ops->powersave_bias_init_cpu(cpu);
 	}
 
-	/* Initiate timer time stamp */
-	cdbs->time_stamp = ktime_get();
-
 	gov_queue_work(dbs_data, policy, delay_for_sampling_rate(sampling_rate),
 		       true);
 	return 0;
@@ -398,6 +441,9 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
+	struct cpu_common_dbs_info *shared = cdbs->shared;
+
+	gov_cancel_work(dbs_data, policy);
 
 	if (cdata->governor == GOV_CONSERVATIVE) {
 		struct cs_cpu_dbs_info_s *cs_dbs_info =
@@ -406,10 +452,8 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 		cs_dbs_info->enable = 0;
 	}
 
-	gov_cancel_work(dbs_data, policy);
-
-	mutex_destroy(&cdbs->timer_mutex);
-	cdbs->policy = NULL;
+	shared->policy = NULL;
+	mutex_destroy(&shared->timer_mutex);
 }
 
 static void cpufreq_governor_limits(struct cpufreq_policy *policy,
@@ -419,18 +463,18 @@ static void cpufreq_governor_limits(struct cpufreq_policy *policy,
 	unsigned int cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
 
-	if (!cdbs->policy)
+	if (!cdbs->shared || !cdbs->shared->policy)
 		return;
 
-	mutex_lock(&cdbs->timer_mutex);
-	if (policy->max < cdbs->policy->cur)
-		__cpufreq_driver_target(cdbs->policy, policy->max,
+	mutex_lock(&cdbs->shared->timer_mutex);
+	if (policy->max < cdbs->shared->policy->cur)
+		__cpufreq_driver_target(cdbs->shared->policy, policy->max,
 					CPUFREQ_RELATION_H);
-	else if (policy->min > cdbs->policy->cur)
-		__cpufreq_driver_target(cdbs->policy, policy->min,
+	else if (policy->min > cdbs->shared->policy->cur)
+		__cpufreq_driver_target(cdbs->shared->policy, policy->min,
 					CPUFREQ_RELATION_L);
 	dbs_check_cpu(dbs_data, cpu);
-	mutex_unlock(&cdbs->timer_mutex);
+	mutex_unlock(&cdbs->shared->timer_mutex);
 }
 
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index a0f8eb79ee6d..8e4a25f0730c 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -128,6 +128,18 @@ static void *get_cpu_dbs_info_s(int cpu)				\
  * cs_*: Conservative governor
  */
 
+/* Common to all CPUs of a policy */
+struct cpu_common_dbs_info {
+	struct cpufreq_policy *policy;
+	/*
+	 * percpu mutex that serializes governor limit change with gov_dbs_timer
+	 * invocation. We do not want gov_dbs_timer to run when user is changing
+	 * the governor or limits.
+	 */
+	struct mutex timer_mutex;
+	ktime_t time_stamp;
+};
+
 /* Per cpu structures */
 struct cpu_dbs_info {
 	u64 prev_cpu_idle;
@@ -140,15 +152,8 @@ struct cpu_dbs_info {
 	 * wake-up from idle.
 	 */
 	unsigned int prev_load;
-	struct cpufreq_policy *policy;
 	struct delayed_work dwork;
-	/*
-	 * percpu mutex that serializes governor limit change with gov_dbs_timer
-	 * invocation. We do not want gov_dbs_timer to run when user is changing
-	 * the governor or limits.
-	 */
-	struct mutex timer_mutex;
-	ktime_t time_stamp;
+	struct cpu_common_dbs_info *shared;
 };
 
 struct od_cpu_dbs_info_s {
@@ -264,7 +269,8 @@ static ssize_t show_sampling_rate_min_gov_pol				\
 extern struct mutex cpufreq_governor_lock;
 
 void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
-bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate);
+bool need_load_eval(struct cpu_common_dbs_info *shared,
+		    unsigned int sampling_rate);
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 		struct common_dbs_data *cdata, unsigned int event);
 void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index d29c6f9c6e3e..14d7e86e7f94 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -155,7 +155,7 @@ static void dbs_freq_increase(struct cpufreq_policy *policy, unsigned int freq)
 static void od_check_cpu(int cpu, unsigned int load)
 {
 	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
-	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
+	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
 	struct dbs_data *dbs_data = policy->governor_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 
@@ -195,16 +195,18 @@ static void od_dbs_timer(struct work_struct *work)
 {
 	struct od_cpu_dbs_info_s *dbs_info =
 		container_of(work, struct od_cpu_dbs_info_s, cdbs.dwork.work);
-	unsigned int cpu = dbs_info->cdbs.policy->cpu;
+	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
+	unsigned int cpu = policy->cpu;
 	struct od_cpu_dbs_info_s *core_dbs_info = &per_cpu(od_cpu_dbs_info,
 			cpu);
-	struct dbs_data *dbs_data = dbs_info->cdbs.policy->governor_data;
+	struct dbs_data *dbs_data = policy->governor_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	int delay = 0, sample_type = core_dbs_info->sample_type;
 	bool modify_all = true;
 
-	mutex_lock(&core_dbs_info->cdbs.timer_mutex);
-	if (!need_load_eval(&core_dbs_info->cdbs, od_tuners->sampling_rate)) {
+	mutex_lock(&core_dbs_info->cdbs.shared->timer_mutex);
+	if (!need_load_eval(core_dbs_info->cdbs.shared,
+			    od_tuners->sampling_rate)) {
 		modify_all = false;
 		goto max_delay;
 	}
@@ -213,8 +215,7 @@ static void od_dbs_timer(struct work_struct *work)
 	core_dbs_info->sample_type = OD_NORMAL_SAMPLE;
 	if (sample_type == OD_SUB_SAMPLE) {
 		delay = core_dbs_info->freq_lo_jiffies;
-		__cpufreq_driver_target(core_dbs_info->cdbs.policy,
-					core_dbs_info->freq_lo,
+		__cpufreq_driver_target(policy, core_dbs_info->freq_lo,
 					CPUFREQ_RELATION_H);
 	} else {
 		dbs_check_cpu(dbs_data, cpu);
@@ -230,8 +231,8 @@ static void od_dbs_timer(struct work_struct *work)
 		delay = delay_for_sampling_rate(od_tuners->sampling_rate
 				* core_dbs_info->rate_mult);
 
-	gov_queue_work(dbs_data, dbs_info->cdbs.policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
+	gov_queue_work(dbs_data, policy, delay, modify_all);
+	mutex_unlock(&core_dbs_info->cdbs.shared->timer_mutex);
 }
 
 /************************** sysfs interface ************************/
@@ -274,10 +275,10 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
 		dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
 		cpufreq_cpu_put(policy);
 
-		mutex_lock(&dbs_info->cdbs.timer_mutex);
+		mutex_lock(&dbs_info->cdbs.shared->timer_mutex);
 
 		if (!delayed_work_pending(&dbs_info->cdbs.dwork)) {
-			mutex_unlock(&dbs_info->cdbs.timer_mutex);
+			mutex_unlock(&dbs_info->cdbs.shared->timer_mutex);
 			continue;
 		}
 
@@ -286,15 +287,15 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
 
 		if (time_before(next_sampling, appointed_at)) {
 
-			mutex_unlock(&dbs_info->cdbs.timer_mutex);
+			mutex_unlock(&dbs_info->cdbs.shared->timer_mutex);
 			cancel_delayed_work_sync(&dbs_info->cdbs.dwork);
-			mutex_lock(&dbs_info->cdbs.timer_mutex);
+			mutex_lock(&dbs_info->cdbs.shared->timer_mutex);
 
-			gov_queue_work(dbs_data, dbs_info->cdbs.policy,
+			gov_queue_work(dbs_data, policy,
 				       usecs_to_jiffies(new_rate), true);
 
 		}
-		mutex_unlock(&dbs_info->cdbs.timer_mutex);
+		mutex_unlock(&dbs_info->cdbs.shared->timer_mutex);
 	}
 }
 
@@ -557,13 +558,16 @@ static void od_set_powersave_bias(unsigned int powersave_bias)
 
 	get_online_cpus();
 	for_each_online_cpu(cpu) {
+		struct cpu_common_dbs_info *shared;
+
 		if (cpumask_test_cpu(cpu, &done))
 			continue;
 
-		policy = per_cpu(od_cpu_dbs_info, cpu).cdbs.policy;
-		if (!policy)
+		shared = per_cpu(od_cpu_dbs_info, cpu).cdbs.shared;
+		if (!shared)
 			continue;
 
+		policy = shared->policy;
 		cpumask_or(&done, &done, policy->cpus);
 
 		if (policy->governor != &cpufreq_gov_ondemand)
-- 
2.4.0



* [PATCH V3 2/5] cpufreq: governor: split out common part of {cs|od}_dbs_timer()
  2015-07-18  6:00   ` [PATCH V3 0/5] " Viresh Kumar
@ 2015-07-18  6:01       ` Viresh Kumar
  2015-07-18  6:01       ` Viresh Kumar
                         ` (4 subsequent siblings)
  5 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-07-18  6:01 UTC (permalink / raw)
  To: Rafael Wysocki; +Cc: linaro-kernel, linux-pm, Viresh Kumar, open list

Part of cs_dbs_timer() and od_dbs_timer() is exactly the same and is
unnecessarily duplicated.

Create the real work-handler in cpufreq_governor.c and put the common
code in this routine (dbs_timer()).

This shouldn't result in any functional changes.

Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c | 27 +++++++-----------------
 drivers/cpufreq/cpufreq_governor.c     | 38 ++++++++++++++++++++++++++++++----
 drivers/cpufreq/cpufreq_governor.h     | 10 ++++-----
 drivers/cpufreq/cpufreq_ondemand.c     | 36 +++++++++++++-------------------
 4 files changed, 60 insertions(+), 51 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index d21c3cff9056..84a1506950a7 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -102,28 +102,15 @@ static void cs_check_cpu(int cpu, unsigned int load)
 	}
 }
 
-static void cs_dbs_timer(struct work_struct *work)
+static unsigned int cs_dbs_timer(struct cpu_dbs_info *cdbs,
+				 struct dbs_data *dbs_data, bool modify_all)
 {
-	struct cs_cpu_dbs_info_s *dbs_info = container_of(work,
-			struct cs_cpu_dbs_info_s, cdbs.dwork.work);
-	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
-	unsigned int cpu = policy->cpu;
-	struct cs_cpu_dbs_info_s *core_dbs_info = &per_cpu(cs_cpu_dbs_info,
-			cpu);
-	struct dbs_data *dbs_data = policy->governor_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
-	int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
-	bool modify_all = true;
-
-	mutex_lock(&core_dbs_info->cdbs.shared->timer_mutex);
-	if (!need_load_eval(core_dbs_info->cdbs.shared,
-			    cs_tuners->sampling_rate))
-		modify_all = false;
-	else
-		dbs_check_cpu(dbs_data, cpu);
-
-	gov_queue_work(dbs_data, policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.shared->timer_mutex);
+
+	if (modify_all)
+		dbs_check_cpu(dbs_data, cdbs->shared->policy->cpu);
+
+	return delay_for_sampling_rate(cs_tuners->sampling_rate);
 }
 
 static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index b01cb729104b..7ed0ec2ac853 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -207,8 +207,8 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
 }
 
 /* Will return if we need to evaluate cpu load again or not */
-bool need_load_eval(struct cpu_common_dbs_info *shared,
-		    unsigned int sampling_rate)
+static bool need_load_eval(struct cpu_common_dbs_info *shared,
+			   unsigned int sampling_rate)
 {
 	if (policy_is_shared(shared->policy)) {
 		ktime_t time_now = ktime_get();
@@ -223,7 +223,37 @@ bool need_load_eval(struct cpu_common_dbs_info *shared,
 
 	return true;
 }
-EXPORT_SYMBOL_GPL(need_load_eval);
+
+static void dbs_timer(struct work_struct *work)
+{
+	struct cpu_dbs_info *cdbs = container_of(work, struct cpu_dbs_info,
+						 dwork.work);
+	struct cpu_common_dbs_info *shared = cdbs->shared;
+	struct cpufreq_policy *policy = shared->policy;
+	struct dbs_data *dbs_data = policy->governor_data;
+	unsigned int sampling_rate, delay;
+	bool modify_all = true;
+
+	mutex_lock(&shared->timer_mutex);
+
+	if (dbs_data->cdata->governor == GOV_CONSERVATIVE) {
+		struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+
+		sampling_rate = cs_tuners->sampling_rate;
+	} else {
+		struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+
+		sampling_rate = od_tuners->sampling_rate;
+	}
+
+	if (!need_load_eval(cdbs->shared, sampling_rate))
+		modify_all = false;
+
+	delay = dbs_data->cdata->gov_dbs_timer(cdbs, dbs_data, modify_all);
+	gov_queue_work(dbs_data, policy, delay, modify_all);
+
+	mutex_unlock(&shared->timer_mutex);
+}
 
 static void set_sampling_rate(struct dbs_data *dbs_data,
 		unsigned int sampling_rate)
@@ -411,7 +441,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		if (ignore_nice)
 			j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 
-		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, cdata->gov_dbs_timer);
+		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, dbs_timer);
 	}
 
 	if (cdata->governor == GOV_CONSERVATIVE) {
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 8e4a25f0730c..50f171796632 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -132,8 +132,8 @@ static void *get_cpu_dbs_info_s(int cpu)				\
 struct cpu_common_dbs_info {
 	struct cpufreq_policy *policy;
 	/*
-	 * percpu mutex that serializes governor limit change with gov_dbs_timer
-	 * invocation. We do not want gov_dbs_timer to run when user is changing
+	 * percpu mutex that serializes governor limit change with dbs_timer
+	 * invocation. We do not want dbs_timer to run when user is changing
 	 * the governor or limits.
 	 */
 	struct mutex timer_mutex;
@@ -210,7 +210,9 @@ struct common_dbs_data {
 
 	struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
 	void *(*get_cpu_dbs_info_s)(int cpu);
-	void (*gov_dbs_timer)(struct work_struct *work);
+	unsigned int (*gov_dbs_timer)(struct cpu_dbs_info *cdbs,
+				      struct dbs_data *dbs_data,
+				      bool modify_all);
 	void (*gov_check_cpu)(int cpu, unsigned int load);
 	int (*init)(struct dbs_data *dbs_data, bool notify);
 	void (*exit)(struct dbs_data *dbs_data, bool notify);
@@ -269,8 +271,6 @@ static ssize_t show_sampling_rate_min_gov_pol				\
 extern struct mutex cpufreq_governor_lock;
 
 void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
-bool need_load_eval(struct cpu_common_dbs_info *shared,
-		    unsigned int sampling_rate);
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 		struct common_dbs_data *cdata, unsigned int event);
 void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 14d7e86e7f94..1fa9088c84a8 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -191,48 +191,40 @@ static void od_check_cpu(int cpu, unsigned int load)
 	}
 }
 
-static void od_dbs_timer(struct work_struct *work)
+static unsigned int od_dbs_timer(struct cpu_dbs_info *cdbs,
+				 struct dbs_data *dbs_data, bool modify_all)
 {
-	struct od_cpu_dbs_info_s *dbs_info =
-		container_of(work, struct od_cpu_dbs_info_s, cdbs.dwork.work);
-	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
+	struct cpufreq_policy *policy = cdbs->shared->policy;
 	unsigned int cpu = policy->cpu;
-	struct od_cpu_dbs_info_s *core_dbs_info = &per_cpu(od_cpu_dbs_info,
+	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info,
 			cpu);
-	struct dbs_data *dbs_data = policy->governor_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
-	int delay = 0, sample_type = core_dbs_info->sample_type;
-	bool modify_all = true;
+	int delay = 0, sample_type = dbs_info->sample_type;
 
-	mutex_lock(&core_dbs_info->cdbs.shared->timer_mutex);
-	if (!need_load_eval(core_dbs_info->cdbs.shared,
-			    od_tuners->sampling_rate)) {
-		modify_all = false;
+	if (!modify_all)
 		goto max_delay;
-	}
 
 	/* Common NORMAL_SAMPLE setup */
-	core_dbs_info->sample_type = OD_NORMAL_SAMPLE;
+	dbs_info->sample_type = OD_NORMAL_SAMPLE;
 	if (sample_type == OD_SUB_SAMPLE) {
-		delay = core_dbs_info->freq_lo_jiffies;
-		__cpufreq_driver_target(policy, core_dbs_info->freq_lo,
+		delay = dbs_info->freq_lo_jiffies;
+		__cpufreq_driver_target(policy, dbs_info->freq_lo,
 					CPUFREQ_RELATION_H);
 	} else {
 		dbs_check_cpu(dbs_data, cpu);
-		if (core_dbs_info->freq_lo) {
+		if (dbs_info->freq_lo) {
 			/* Setup timer for SUB_SAMPLE */
-			core_dbs_info->sample_type = OD_SUB_SAMPLE;
-			delay = core_dbs_info->freq_hi_jiffies;
+			dbs_info->sample_type = OD_SUB_SAMPLE;
+			delay = dbs_info->freq_hi_jiffies;
 		}
 	}
 
 max_delay:
 	if (!delay)
 		delay = delay_for_sampling_rate(od_tuners->sampling_rate
-				* core_dbs_info->rate_mult);
+				* dbs_info->rate_mult);
 
-	gov_queue_work(dbs_data, policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.shared->timer_mutex);
+	return delay;
 }
 
 /************************** sysfs interface ************************/
-- 
2.4.0



* [PATCH V3 2/5] cpufreq: governor: split out common part of {cs|od}_dbs_timer()
@ 2015-07-18  6:01       ` Viresh Kumar
  0 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-07-18  6:01 UTC (permalink / raw)
  To: Rafael Wysocki; +Cc: linaro-kernel, linux-pm, Viresh Kumar, open list

Some part of cs_dbs_timer() and od_dbs_timer() is exactly same and is
unnecessarily duplicated.

Create the real work-handler in cpufreq_governor.c and put the common
code in this routine (dbs_timer()).

Shouldn't make any functional change.

Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c | 27 +++++++-----------------
 drivers/cpufreq/cpufreq_governor.c     | 38 ++++++++++++++++++++++++++++++----
 drivers/cpufreq/cpufreq_governor.h     | 10 ++++-----
 drivers/cpufreq/cpufreq_ondemand.c     | 36 +++++++++++++-------------------
 4 files changed, 60 insertions(+), 51 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index d21c3cff9056..84a1506950a7 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -102,28 +102,15 @@ static void cs_check_cpu(int cpu, unsigned int load)
 	}
 }
 
-static void cs_dbs_timer(struct work_struct *work)
+static unsigned int cs_dbs_timer(struct cpu_dbs_info *cdbs,
+				 struct dbs_data *dbs_data, bool modify_all)
 {
-	struct cs_cpu_dbs_info_s *dbs_info = container_of(work,
-			struct cs_cpu_dbs_info_s, cdbs.dwork.work);
-	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
-	unsigned int cpu = policy->cpu;
-	struct cs_cpu_dbs_info_s *core_dbs_info = &per_cpu(cs_cpu_dbs_info,
-			cpu);
-	struct dbs_data *dbs_data = policy->governor_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
-	int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
-	bool modify_all = true;
-
-	mutex_lock(&core_dbs_info->cdbs.shared->timer_mutex);
-	if (!need_load_eval(core_dbs_info->cdbs.shared,
-			    cs_tuners->sampling_rate))
-		modify_all = false;
-	else
-		dbs_check_cpu(dbs_data, cpu);
-
-	gov_queue_work(dbs_data, policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.shared->timer_mutex);
+
+	if (modify_all)
+		dbs_check_cpu(dbs_data, cdbs->shared->policy->cpu);
+
+	return delay_for_sampling_rate(cs_tuners->sampling_rate);
 }
 
 static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index b01cb729104b..7ed0ec2ac853 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -207,8 +207,8 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
 }
 
 /* Will return if we need to evaluate cpu load again or not */
-bool need_load_eval(struct cpu_common_dbs_info *shared,
-		    unsigned int sampling_rate)
+static bool need_load_eval(struct cpu_common_dbs_info *shared,
+			   unsigned int sampling_rate)
 {
 	if (policy_is_shared(shared->policy)) {
 		ktime_t time_now = ktime_get();
@@ -223,7 +223,37 @@ bool need_load_eval(struct cpu_common_dbs_info *shared,
 
 	return true;
 }
-EXPORT_SYMBOL_GPL(need_load_eval);
+
+static void dbs_timer(struct work_struct *work)
+{
+	struct cpu_dbs_info *cdbs = container_of(work, struct cpu_dbs_info,
+						 dwork.work);
+	struct cpu_common_dbs_info *shared = cdbs->shared;
+	struct cpufreq_policy *policy = shared->policy;
+	struct dbs_data *dbs_data = policy->governor_data;
+	unsigned int sampling_rate, delay;
+	bool modify_all = true;
+
+	mutex_lock(&shared->timer_mutex);
+
+	if (dbs_data->cdata->governor == GOV_CONSERVATIVE) {
+		struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+
+		sampling_rate = cs_tuners->sampling_rate;
+	} else {
+		struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+
+		sampling_rate = od_tuners->sampling_rate;
+	}
+
+	if (!need_load_eval(cdbs->shared, sampling_rate))
+		modify_all = false;
+
+	delay = dbs_data->cdata->gov_dbs_timer(cdbs, dbs_data, modify_all);
+	gov_queue_work(dbs_data, policy, delay, modify_all);
+
+	mutex_unlock(&shared->timer_mutex);
+}
 
 static void set_sampling_rate(struct dbs_data *dbs_data,
 		unsigned int sampling_rate)
@@ -411,7 +441,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 		if (ignore_nice)
 			j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 
-		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, cdata->gov_dbs_timer);
+		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, dbs_timer);
 	}
 
 	if (cdata->governor == GOV_CONSERVATIVE) {
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 8e4a25f0730c..50f171796632 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -132,8 +132,8 @@ static void *get_cpu_dbs_info_s(int cpu)				\
 struct cpu_common_dbs_info {
 	struct cpufreq_policy *policy;
 	/*
-	 * percpu mutex that serializes governor limit change with gov_dbs_timer
-	 * invocation. We do not want gov_dbs_timer to run when user is changing
+	 * percpu mutex that serializes governor limit change with dbs_timer
+	 * invocation. We do not want dbs_timer to run when user is changing
 	 * the governor or limits.
 	 */
 	struct mutex timer_mutex;
@@ -210,7 +210,9 @@ struct common_dbs_data {
 
 	struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
 	void *(*get_cpu_dbs_info_s)(int cpu);
-	void (*gov_dbs_timer)(struct work_struct *work);
+	unsigned int (*gov_dbs_timer)(struct cpu_dbs_info *cdbs,
+				      struct dbs_data *dbs_data,
+				      bool modify_all);
 	void (*gov_check_cpu)(int cpu, unsigned int load);
 	int (*init)(struct dbs_data *dbs_data, bool notify);
 	void (*exit)(struct dbs_data *dbs_data, bool notify);
@@ -269,8 +271,6 @@ static ssize_t show_sampling_rate_min_gov_pol				\
 extern struct mutex cpufreq_governor_lock;
 
 void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
-bool need_load_eval(struct cpu_common_dbs_info *shared,
-		    unsigned int sampling_rate);
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 		struct common_dbs_data *cdata, unsigned int event);
 void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 14d7e86e7f94..1fa9088c84a8 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -191,48 +191,40 @@ static void od_check_cpu(int cpu, unsigned int load)
 	}
 }
 
-static void od_dbs_timer(struct work_struct *work)
+static unsigned int od_dbs_timer(struct cpu_dbs_info *cdbs,
+				 struct dbs_data *dbs_data, bool modify_all)
 {
-	struct od_cpu_dbs_info_s *dbs_info =
-		container_of(work, struct od_cpu_dbs_info_s, cdbs.dwork.work);
-	struct cpufreq_policy *policy = dbs_info->cdbs.shared->policy;
+	struct cpufreq_policy *policy = cdbs->shared->policy;
 	unsigned int cpu = policy->cpu;
-	struct od_cpu_dbs_info_s *core_dbs_info = &per_cpu(od_cpu_dbs_info,
+	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info,
 			cpu);
-	struct dbs_data *dbs_data = policy->governor_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
-	int delay = 0, sample_type = core_dbs_info->sample_type;
-	bool modify_all = true;
+	int delay = 0, sample_type = dbs_info->sample_type;
 
-	mutex_lock(&core_dbs_info->cdbs.shared->timer_mutex);
-	if (!need_load_eval(core_dbs_info->cdbs.shared,
-			    od_tuners->sampling_rate)) {
-		modify_all = false;
+	if (!modify_all)
 		goto max_delay;
-	}
 
 	/* Common NORMAL_SAMPLE setup */
-	core_dbs_info->sample_type = OD_NORMAL_SAMPLE;
+	dbs_info->sample_type = OD_NORMAL_SAMPLE;
 	if (sample_type == OD_SUB_SAMPLE) {
-		delay = core_dbs_info->freq_lo_jiffies;
-		__cpufreq_driver_target(policy, core_dbs_info->freq_lo,
+		delay = dbs_info->freq_lo_jiffies;
+		__cpufreq_driver_target(policy, dbs_info->freq_lo,
 					CPUFREQ_RELATION_H);
 	} else {
 		dbs_check_cpu(dbs_data, cpu);
-		if (core_dbs_info->freq_lo) {
+		if (dbs_info->freq_lo) {
 			/* Setup timer for SUB_SAMPLE */
-			core_dbs_info->sample_type = OD_SUB_SAMPLE;
-			delay = core_dbs_info->freq_hi_jiffies;
+			dbs_info->sample_type = OD_SUB_SAMPLE;
+			delay = dbs_info->freq_hi_jiffies;
 		}
 	}
 
 max_delay:
 	if (!delay)
 		delay = delay_for_sampling_rate(od_tuners->sampling_rate
-				* core_dbs_info->rate_mult);
+				* dbs_info->rate_mult);
 
-	gov_queue_work(dbs_data, policy, delay, modify_all);
-	mutex_unlock(&core_dbs_info->cdbs.shared->timer_mutex);
+	return delay;
 }
 
 /************************** sysfs interface ************************/
-- 
2.4.0

^ permalink raw reply related	[flat|nested] 42+ messages in thread
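The diff above moves locking and the need-load-eval decision into a common dbs_timer() wrapper, while the governor-specific callback (od_dbs_timer() here) now just computes and returns the next sampling delay. A minimal userspace sketch of that split, with purely illustrative names and values rather than the kernel API:

```c
#include <assert.h>

#define TOY_MAX_DELAY 100	/* stand-in for delay_for_sampling_rate() */

/* Governor-specific part: computes and returns the delay until the
 * next sample, like the reworked od_dbs_timer() above. */
static int toy_od_dbs_timer(int modify_all, int freq_lo_delay)
{
	int delay = 0;

	if (modify_all && freq_lo_delay)
		delay = freq_lo_delay;	/* e.g. a SUB_SAMPLE delay */

	if (!delay)
		delay = TOY_MAX_DELAY;	/* fall back to the max delay */
	return delay;
}

/* Common part: owns the "should we evaluate load" decision (locking
 * and need_load_eval() in the real code), then defers to the
 * governor-specific callback. */
static int toy_dbs_timer(int need_eval, int freq_lo_delay)
{
	int modify_all = need_eval;

	return toy_od_dbs_timer(modify_all, freq_lo_delay);
}
```

When evaluation is skipped, the callback only picks the maximum delay, mirroring the `goto max_delay` path in the patch.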

* [PATCH V3 3/5] cpufreq: governor: Avoid invalid states with additional checks
  2015-07-18  6:00   ` [PATCH V3 0/5] " Viresh Kumar
@ 2015-07-18  6:01       ` Viresh Kumar
  2015-07-18  6:01       ` Viresh Kumar
                         ` (4 subsequent siblings)
  5 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-07-18  6:01 UTC (permalink / raw)
  To: Rafael Wysocki; +Cc: linaro-kernel, linux-pm, Viresh Kumar, open list

There can be races where a request arrives for the wrong state. For
example, INIT followed by STOP (instead of START), or START followed by
EXIT (instead of STOP).

Tackle these races by making sure the state machine never gets into an
invalid state, and return an error if an invalid state-transition is
requested.

Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c | 46 +++++++++++++++++++++++++++++---------
 1 file changed, 35 insertions(+), 11 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 7ed0ec2ac853..f225dc975415 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -305,6 +305,10 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	unsigned int latency;
 	int ret;
 
+	/* State should be equivalent to EXIT */
+	if (policy->governor_data)
+		return -EBUSY;
+
 	if (dbs_data) {
 		if (WARN_ON(have_governor_per_policy()))
 			return -EINVAL;
@@ -375,10 +379,15 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
 	return ret;
 }
 
-static void cpufreq_governor_exit(struct cpufreq_policy *policy,
-				  struct dbs_data *dbs_data)
+static int cpufreq_governor_exit(struct cpufreq_policy *policy,
+				 struct dbs_data *dbs_data)
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
+	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(policy->cpu);
+
+	/* State should be equivalent to INIT */
+	if (!cdbs->shared || cdbs->shared->policy)
+		return -EBUSY;
 
 	policy->governor_data = NULL;
 	if (!--dbs_data->usage_count) {
@@ -395,6 +404,7 @@ static void cpufreq_governor_exit(struct cpufreq_policy *policy,
 	}
 
 	free_common_dbs_info(policy, cdata);
+	return 0;
 }
 
 static int cpufreq_governor_start(struct cpufreq_policy *policy,
@@ -409,6 +419,10 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 	if (!policy->cur)
 		return -EINVAL;
 
+	/* State should be equivalent to INIT */
+	if (!shared || shared->policy)
+		return -EBUSY;
+
 	if (cdata->governor == GOV_CONSERVATIVE) {
 		struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 
@@ -465,14 +479,18 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
 	return 0;
 }
 
-static void cpufreq_governor_stop(struct cpufreq_policy *policy,
-				  struct dbs_data *dbs_data)
+static int cpufreq_governor_stop(struct cpufreq_policy *policy,
+				 struct dbs_data *dbs_data)
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
 	struct cpu_common_dbs_info *shared = cdbs->shared;
 
+	/* State should be equivalent to START */
+	if (!shared || !shared->policy)
+		return -EBUSY;
+
 	gov_cancel_work(dbs_data, policy);
 
 	if (cdata->governor == GOV_CONSERVATIVE) {
@@ -484,17 +502,19 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
 
 	shared->policy = NULL;
 	mutex_destroy(&shared->timer_mutex);
+	return 0;
 }
 
-static void cpufreq_governor_limits(struct cpufreq_policy *policy,
-				    struct dbs_data *dbs_data)
+static int cpufreq_governor_limits(struct cpufreq_policy *policy,
+				   struct dbs_data *dbs_data)
 {
 	struct common_dbs_data *cdata = dbs_data->cdata;
 	unsigned int cpu = policy->cpu;
 	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
 
+	/* State should be equivalent to START */
 	if (!cdbs->shared || !cdbs->shared->policy)
-		return;
+		return -EBUSY;
 
 	mutex_lock(&cdbs->shared->timer_mutex);
 	if (policy->max < cdbs->shared->policy->cur)
@@ -505,13 +525,15 @@ static void cpufreq_governor_limits(struct cpufreq_policy *policy,
 					CPUFREQ_RELATION_L);
 	dbs_check_cpu(dbs_data, cpu);
 	mutex_unlock(&cdbs->shared->timer_mutex);
+
+	return 0;
 }
 
 int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 			 struct common_dbs_data *cdata, unsigned int event)
 {
 	struct dbs_data *dbs_data;
-	int ret = 0;
+	int ret;
 
 	/* Lock governor to block concurrent initialization of governor */
 	mutex_lock(&cdata->mutex);
@@ -531,17 +553,19 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 		ret = cpufreq_governor_init(policy, dbs_data, cdata);
 		break;
 	case CPUFREQ_GOV_POLICY_EXIT:
-		cpufreq_governor_exit(policy, dbs_data);
+		ret = cpufreq_governor_exit(policy, dbs_data);
 		break;
 	case CPUFREQ_GOV_START:
 		ret = cpufreq_governor_start(policy, dbs_data);
 		break;
 	case CPUFREQ_GOV_STOP:
-		cpufreq_governor_stop(policy, dbs_data);
+		ret = cpufreq_governor_stop(policy, dbs_data);
 		break;
 	case CPUFREQ_GOV_LIMITS:
-		cpufreq_governor_limits(policy, dbs_data);
+		ret = cpufreq_governor_limits(policy, dbs_data);
 		break;
+	default:
+		ret = -EINVAL;
 	}
 
 unlock:
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread
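The checks added in this patch effectively enforce a small state machine: INIT is only valid from the exited state, START only after INIT, STOP only after START, and EXIT only after STOP. A toy userspace model of that transition table (illustrative names, not the kernel code; -16 mirrors -EBUSY):

```c
#include <assert.h>

/* Hypothetical model of the governor state machine enforced above. */
enum gov_state { GOV_EXITED, GOV_INITED, GOV_STARTED };

#define TOY_EBUSY (-16)

/* Each handler refuses to run unless the current state matches,
 * just like the "State should be equivalent to ..." checks. */
static int gov_init(enum gov_state *s)
{
	if (*s != GOV_EXITED)
		return TOY_EBUSY;
	*s = GOV_INITED;
	return 0;
}

static int gov_start(enum gov_state *s)
{
	if (*s != GOV_INITED)
		return TOY_EBUSY;
	*s = GOV_STARTED;
	return 0;
}

static int gov_stop(enum gov_state *s)
{
	if (*s != GOV_STARTED)
		return TOY_EBUSY;
	*s = GOV_INITED;
	return 0;
}

static int gov_exit(enum gov_state *s)
{
	if (*s != GOV_INITED)
		return TOY_EBUSY;
	*s = GOV_EXITED;
	return 0;
}
```

The two races named in the changelog (INIT followed by STOP, START followed by EXIT) both hit the -EBUSY path in this model.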


* [PATCH V3 4/5] cpufreq: governor: Don't WARN on invalid states
  2015-07-18  6:00   ` [PATCH V3 0/5] " Viresh Kumar
@ 2015-07-18  6:01       ` Viresh Kumar
  2015-07-18  6:01       ` Viresh Kumar
                         ` (4 subsequent siblings)
  5 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-07-18  6:01 UTC (permalink / raw)
  To: Rafael Wysocki; +Cc: linaro-kernel, linux-pm, Viresh Kumar, open list

With the previous commit, governors have started to return errors on
invalid state-transition requests. We already have a WARN for an invalid
state-transition request in cpufreq_governor_dbs(), and it does trigger
today, as the sequence of events isn't guaranteed by the cpufreq core.

Let's stop warning on that for now, and make sure we don't enter an
invalid state.

Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index f225dc975415..939197ffa4ac 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -543,7 +543,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 	else
 		dbs_data = cdata->gdbs_data;
 
-	if (WARN_ON(!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT))) {
+	if (!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT)) {
 		ret = -EINVAL;
 		goto unlock;
 	}
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread


* [PATCH V3 5/5] cpufreq: propagate errors returned from __cpufreq_governor()
  2015-07-18  6:00   ` [PATCH V3 0/5] " Viresh Kumar
@ 2015-07-18  6:01       ` Viresh Kumar
  2015-07-18  6:01       ` Viresh Kumar
                         ` (4 subsequent siblings)
  5 siblings, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2015-07-18  6:01 UTC (permalink / raw)
  To: Rafael Wysocki; +Cc: linaro-kernel, linux-pm, Viresh Kumar, open list

Return codes aren't honored properly in cpufreq_set_policy(). This can
lead to two problems:
- wrong errors are propagated to sysfs
- the next state-change is attempted even if the previous one failed

cpufreq_governor_dbs() now returns proper errors on all invalid
state-transition requests, and this code should honor that.

Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index a7b6ac6e048e..a3e8fb61cbcc 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2295,16 +2295,31 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	old_gov = policy->governor;
 	/* end old governor */
 	if (old_gov) {
-		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
+		if (ret) {
+			/* This can happen due to race with other operations */
+			pr_debug("%s: Failed to Stop Governor: %s (%d)\n",
+				 __func__, old_gov->name, ret);
+			return ret;
+		}
+
 		up_write(&policy->rwsem);
-		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
 		down_write(&policy->rwsem);
+
+		if (ret) {
+			pr_err("%s: Failed to Exit Governor: %s (%d)\n",
+			       __func__, old_gov->name, ret);
+			return ret;
+		}
 	}
 
 	/* start new governor */
 	policy->governor = new_policy->governor;
-	if (!__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT)) {
-		if (!__cpufreq_governor(policy, CPUFREQ_GOV_START))
+	ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT);
+	if (!ret) {
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
+		if (!ret)
 			goto out;
 
 		up_write(&policy->rwsem);
@@ -2316,11 +2331,13 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	pr_debug("starting governor %s failed\n", policy->governor->name);
 	if (old_gov) {
 		policy->governor = old_gov;
-		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT);
-		__cpufreq_governor(policy, CPUFREQ_GOV_START);
+		if (__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_INIT))
+			policy->governor = NULL;
+		else
+			__cpufreq_governor(policy, CPUFREQ_GOV_START);
 	}
 
-	return -EINVAL;
+	return ret;
 
  out:
 	pr_debug("governor: change or update limits\n");
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread
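The sequence the patch above hardens is: stop and exit the old governor, then init and start the new one, and roll back to the old governor if starting the new one fails. A toy model of that flow, with invented names and error values (not the kernel API), checking that errors now propagate and that the rollback restores the old governor:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for a governor; fail_start forces the START
 * step to fail so the rollback path can be exercised. */
struct toy_gov { const char *name; int fail_start; };

static int toy_stop(struct toy_gov *g)  { (void)g; return 0; }
static int toy_exit(struct toy_gov *g)  { (void)g; return 0; }
static int toy_init(struct toy_gov *g)  { (void)g; return 0; }
static int toy_start(struct toy_gov *g) { return g->fail_start ? -22 : 0; }

/* Toy cpufreq_set_policy(): returns 0 on success; on failure the
 * error is propagated and the old governor is restored. */
static int toy_set_policy(struct toy_gov **cur, struct toy_gov *new_gov)
{
	struct toy_gov *old_gov = *cur;
	int ret;

	if (old_gov) {
		ret = toy_stop(old_gov);
		if (ret)
			return ret;	/* propagate; don't plough on */
		ret = toy_exit(old_gov);
		if (ret)
			return ret;
	}

	/* start new governor */
	*cur = new_gov;
	ret = toy_init(new_gov);
	if (!ret) {
		ret = toy_start(new_gov);
		if (!ret)
			return 0;
		toy_exit(new_gov);
	}

	/* new governor failed to start: restore the old one */
	if (old_gov) {
		*cur = old_gov;
		if (toy_init(old_gov))
			*cur = NULL;
		else
			toy_start(old_gov);
	} else {
		*cur = NULL;
	}
	return ret;
}
```

Before this patch the model would have returned -EINVAL unconditionally on the failure path; now the actual error from the failing step reaches the caller (and so sysfs).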


* Re: [PATCH V3 0/5] cpufreq: governor: Avoid invalid state-transitions
  2015-07-18  6:00   ` [PATCH V3 0/5] " Viresh Kumar
                       ` (4 preceding siblings ...)
  2015-07-18  6:01       ` Viresh Kumar
@ 2015-07-23 21:00     ` Rafael J. Wysocki
  5 siblings, 0 replies; 42+ messages in thread
From: Rafael J. Wysocki @ 2015-07-23 21:00 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: linaro-kernel, linux-pm

On Saturday, July 18, 2015 11:30:58 AM Viresh Kumar wrote:
> Hi Rafael,
> 
> These are the rest of the patches after replacing 'ccdbs' with
> 'shared', as you suggested.
> 
> Viresh Kumar (5):
>   cpufreq: governor: Keep single copy of information common to
>     policy->cpus
>   cpufreq: governor: split out common part of {cs|od}_dbs_timer()
>   cpufreq: governor: Avoid invalid states with additional checks
>   cpufreq: governor: Don't WARN on invalid states
>   cpufreq: propagate errors returned from __cpufreq_governor()
> 
>  drivers/cpufreq/cpufreq.c              |  31 ++++--
>  drivers/cpufreq/cpufreq_conservative.c |  25 ++---
>  drivers/cpufreq/cpufreq_governor.c     | 174 ++++++++++++++++++++++++++-------
>  drivers/cpufreq/cpufreq_governor.h     |  26 +++--
>  drivers/cpufreq/cpufreq_ondemand.c     |  58 +++++------
>  5 files changed, 210 insertions(+), 104 deletions(-)

All queued up for 4.3, thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus
  2015-06-19 11:48 ` [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus Viresh Kumar
  2015-07-17 22:11   ` Rafael J. Wysocki
@ 2015-11-27 11:56   ` Lucas Stach
  2015-11-30  5:19     ` Viresh Kumar
  1 sibling, 1 reply; 42+ messages in thread
From: Lucas Stach @ 2015-11-27 11:56 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Preeti U Murthy, ke.wang, linaro-kernel,
	linux-pm, ego, paulus, shilpa.bhat, prarit, robert.schoene,
	skannan

Am Freitag, den 19.06.2015, 17:18 +0530 schrieb Viresh Kumar:
> Some information is common to all CPUs belonging to a policy, but is
> kept on a per-cpu basis. Let's keep that in another structure common
> to all policy->cpus. That will make updates/reads to it less complex
> and less error prone.
> 
> The memory for ccdbs is allocated/freed at INIT/EXIT, so that we don't
> reallocate it for the STOP/START sequence. It will also be used (in
> the next patch) while the governor is stopped, and so must not be
> freed that early.
> 
> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Sorry for being late to the party, as this has long been applied, but I
stumbled upon this while looking at other stuff and wanted to share my
findings.

From a short look, this change is totally bogus.
The *_dbs_timer functions are called from a workqueue on each CPU.
Moving the time_stamp and the timer_mutex into a shared structure now
causes a thundering-herd issue on multicore systems: all workqueues
wake up at the same time, with all but one of the workers bouncing on
the contended mutex.

Also, sharing the time_stamp causes all workers but the first one to
acquire the mutex to skip properly updating their sampling information,
I guess. I didn't look too deeply into this.
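To make the shared-timestamp behavior concrete, here is a single-threaded sketch of the need_load_eval() logic with the shared time_stamp (plain microsecond counters instead of ktime_t; names are illustrative): when all of a policy's per-CPU workers fire at nearly the same instant, only the first caller in each sampling window sees an expired timestamp and evaluates load, and the rest bail out.

```c
#include <assert.h>

/* Toy stand-in for cpu_common_dbs_info's shared timestamp. */
struct toy_shared { long long time_stamp_us; };

/* Returns 1 if this worker should evaluate load, 0 if another worker
 * in the same policy already sampled recently. */
static int toy_need_load_eval(struct toy_shared *shared,
			      long long now_us, long long sampling_rate_us)
{
	long long delta = now_us - shared->time_stamp_us;

	if (delta < sampling_rate_us / 2)
		return 0;		/* sampled recently: skip */
	shared->time_stamp_us = now_us;	/* claim this sampling window */
	return 1;
}
```

With several workers waking within a few microseconds of each other, only the first one per window returns 1, which is the behavior described above.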

Regards,
Lucas

> ---
>  drivers/cpufreq/cpufreq_conservative.c | 18 ++++---
>  drivers/cpufreq/cpufreq_governor.c     | 92 +++++++++++++++++++++++++---------
>  drivers/cpufreq/cpufreq_governor.h     | 24 +++++----
>  drivers/cpufreq/cpufreq_ondemand.c     | 38 +++++++-------
>  4 files changed, 114 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index af47d322679e..5b5c01ca556c 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -47,7 +47,7 @@ static inline unsigned int get_freq_target(struct cs_dbs_tuners *cs_tuners,
>  static void cs_check_cpu(int cpu, unsigned int load)
>  {
>  	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, cpu);
> -	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
> +	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
>  	struct dbs_data *dbs_data = policy->governor_data;
>  	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
>  
> @@ -106,22 +106,24 @@ static void cs_dbs_timer(struct work_struct *work)
>  {
>  	struct cs_cpu_dbs_info_s *dbs_info = container_of(work,
>  			struct cs_cpu_dbs_info_s, cdbs.dwork.work);
> -	unsigned int cpu = dbs_info->cdbs.policy->cpu;
> +	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
> +	unsigned int cpu = policy->cpu;
>  	struct cs_cpu_dbs_info_s *core_dbs_info = &per_cpu(cs_cpu_dbs_info,
>  			cpu);
> -	struct dbs_data *dbs_data = dbs_info->cdbs.policy->governor_data;
> +	struct dbs_data *dbs_data = policy->governor_data;
>  	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
>  	int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
>  	bool modify_all = true;
>  
> -	mutex_lock(&core_dbs_info->cdbs.timer_mutex);
> -	if (!need_load_eval(&core_dbs_info->cdbs, cs_tuners->sampling_rate))
> +	mutex_lock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
> +	if (!need_load_eval(core_dbs_info->cdbs.ccdbs,
> +			    cs_tuners->sampling_rate))
>  		modify_all = false;
>  	else
>  		dbs_check_cpu(dbs_data, cpu);
>  
> -	gov_queue_work(dbs_data, dbs_info->cdbs.policy, delay, modify_all);
> -	mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
> +	gov_queue_work(dbs_data, policy, delay, modify_all);
> +	mutex_unlock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
>  }
>  
>  static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
> @@ -135,7 +137,7 @@ static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
>  	if (!dbs_info->enable)
>  		return 0;
>  
> -	policy = dbs_info->cdbs.policy;
> +	policy = dbs_info->cdbs.ccdbs->policy;
>  
>  	/*
>  	 * we only care if our internally tracked freq moves outside the 'valid'
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index c0566f86caed..72a92fa71ba5 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -35,7 +35,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>  	struct cpu_dbs_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
>  	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
>  	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
> -	struct cpufreq_policy *policy;
> +	struct cpufreq_policy *policy = cdbs->ccdbs->policy;
>  	unsigned int sampling_rate;
>  	unsigned int max_load = 0;
>  	unsigned int ignore_nice;
> @@ -60,8 +60,6 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>  		ignore_nice = cs_tuners->ignore_nice_load;
>  	}
>  
> -	policy = cdbs->policy;
> -
>  	/* Get Absolute Load */
>  	for_each_cpu(j, policy->cpus) {
>  		struct cpu_dbs_info *j_cdbs;
> @@ -209,17 +207,18 @@ static inline void gov_cancel_work(struct dbs_data *dbs_data,
>  }
>  
>  /* Will return if we need to evaluate cpu load again or not */
> -bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate)
> +bool need_load_eval(struct cpu_common_dbs_info *ccdbs,
> +		    unsigned int sampling_rate)
>  {
> -	if (policy_is_shared(cdbs->policy)) {
> +	if (policy_is_shared(ccdbs->policy)) {
>  		ktime_t time_now = ktime_get();
> -		s64 delta_us = ktime_us_delta(time_now, cdbs->time_stamp);
> +		s64 delta_us = ktime_us_delta(time_now, ccdbs->time_stamp);
>  
>  		/* Do nothing if we recently have sampled */
>  		if (delta_us < (s64)(sampling_rate / 2))
>  			return false;
>  		else
> -			cdbs->time_stamp = time_now;
> +			ccdbs->time_stamp = time_now;
>  	}
>  
>  	return true;
> @@ -238,6 +237,37 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
>  	}
>  }
>  
> +static int alloc_ccdbs(struct cpufreq_policy *policy,
> +		       struct common_dbs_data *cdata)
> +{
> +	struct cpu_common_dbs_info *ccdbs;
> +	int j;
> +
> +	/* Allocate memory for the common information for policy->cpus */
> +	ccdbs = kzalloc(sizeof(*ccdbs), GFP_KERNEL);
> +	if (!ccdbs)
> +		return -ENOMEM;
> +
> +	/* Set ccdbs for all CPUs, online+offline */
> +	for_each_cpu(j, policy->related_cpus)
> +		cdata->get_cpu_cdbs(j)->ccdbs = ccdbs;
> +
> +	return 0;
> +}
> +
> +static void free_ccdbs(struct cpufreq_policy *policy,
> +		       struct common_dbs_data *cdata)
> +{
> +	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(policy->cpu);
> +	struct cpu_common_dbs_info *ccdbs = cdbs->ccdbs;
> +	int j;
> +
> +	for_each_cpu(j, policy->cpus)
> +		cdata->get_cpu_cdbs(j)->ccdbs = NULL;
> +
> +	kfree(ccdbs);
> +}
> +
>  static int cpufreq_governor_init(struct cpufreq_policy *policy,
>  				 struct dbs_data *dbs_data,
>  				 struct common_dbs_data *cdata)
> @@ -248,6 +278,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
>  	if (dbs_data) {
>  		if (WARN_ON(have_governor_per_policy()))
>  			return -EINVAL;
> +
> +		ret = alloc_ccdbs(policy, cdata);
> +		if (ret)
> +			return ret;
> +
>  		dbs_data->usage_count++;
>  		policy->governor_data = dbs_data;
>  		return 0;
> @@ -257,12 +292,16 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
>  	if (!dbs_data)
>  		return -ENOMEM;
>  
> +	ret = alloc_ccdbs(policy, cdata);
> +	if (ret)
> +		goto free_dbs_data;
> +
>  	dbs_data->cdata = cdata;
>  	dbs_data->usage_count = 1;
>  
>  	ret = cdata->init(dbs_data, !policy->governor->initialized);
>  	if (ret)
> -		goto free_dbs_data;
> +		goto free_ccdbs;
>  
>  	/* policy latency is in ns. Convert it to us first */
>  	latency = policy->cpuinfo.transition_latency / 1000;
> @@ -299,6 +338,8 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy,
>  	}
>  cdata_exit:
>  	cdata->exit(dbs_data, !policy->governor->initialized);
> +free_ccdbs:
> +	free_ccdbs(policy, cdata);
>  free_dbs_data:
>  	kfree(dbs_data);
>  	return ret;
> @@ -322,6 +363,8 @@ static void cpufreq_governor_exit(struct cpufreq_policy *policy,
>  		cdata->exit(dbs_data, policy->governor->initialized == 1);
>  		kfree(dbs_data);
>  	}
> +
> +	free_ccdbs(policy, cdata);
>  }
>  
>  static int cpufreq_governor_start(struct cpufreq_policy *policy,
> @@ -330,6 +373,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
>  	struct common_dbs_data *cdata = dbs_data->cdata;
>  	unsigned int sampling_rate, ignore_nice, j, cpu = policy->cpu;
>  	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
> +	struct cpu_common_dbs_info *ccdbs = cdbs->ccdbs;
>  	int io_busy = 0;
>  
>  	if (!policy->cur)
> @@ -348,11 +392,14 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
>  		io_busy = od_tuners->io_is_busy;
>  	}
>  
> +	ccdbs->policy = policy;
> +	ccdbs->time_stamp = ktime_get();
> +	mutex_init(&ccdbs->timer_mutex);
> +
>  	for_each_cpu(j, policy->cpus) {
>  		struct cpu_dbs_info *j_cdbs = cdata->get_cpu_cdbs(j);
>  		unsigned int prev_load;
>  
> -		j_cdbs->policy = policy;
>  		j_cdbs->prev_cpu_idle =
>  			get_cpu_idle_time(j, &j_cdbs->prev_cpu_wall, io_busy);
>  
> @@ -364,7 +411,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
>  		if (ignore_nice)
>  			j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>  
> -		mutex_init(&j_cdbs->timer_mutex);
>  		INIT_DEFERRABLE_WORK(&j_cdbs->dwork, cdata->gov_dbs_timer);
>  	}
>  
> @@ -384,9 +430,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy,
>  		od_ops->powersave_bias_init_cpu(cpu);
>  	}
>  
> -	/* Initiate timer time stamp */
> -	cdbs->time_stamp = ktime_get();
> -
>  	gov_queue_work(dbs_data, policy, delay_for_sampling_rate(sampling_rate),
>  		       true);
>  	return 0;
> @@ -398,6 +441,9 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
>  	struct common_dbs_data *cdata = dbs_data->cdata;
>  	unsigned int cpu = policy->cpu;
>  	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
> +	struct cpu_common_dbs_info *ccdbs = cdbs->ccdbs;
> +
> +	gov_cancel_work(dbs_data, policy);
>  
>  	if (cdata->governor == GOV_CONSERVATIVE) {
>  		struct cs_cpu_dbs_info_s *cs_dbs_info =
> @@ -406,10 +452,8 @@ static void cpufreq_governor_stop(struct cpufreq_policy *policy,
>  		cs_dbs_info->enable = 0;
>  	}
>  
> -	gov_cancel_work(dbs_data, policy);
> -
> -	mutex_destroy(&cdbs->timer_mutex);
> -	cdbs->policy = NULL;
> +	ccdbs->policy = NULL;
> +	mutex_destroy(&ccdbs->timer_mutex);
>  }
>  
>  static void cpufreq_governor_limits(struct cpufreq_policy *policy,
> @@ -419,18 +463,18 @@ static void cpufreq_governor_limits(struct cpufreq_policy *policy,
>  	unsigned int cpu = policy->cpu;
>  	struct cpu_dbs_info *cdbs = cdata->get_cpu_cdbs(cpu);
>  
> -	if (!cdbs->policy)
> +	if (!cdbs->ccdbs || !cdbs->ccdbs->policy)
>  		return;
>  
> -	mutex_lock(&cdbs->timer_mutex);
> -	if (policy->max < cdbs->policy->cur)
> -		__cpufreq_driver_target(cdbs->policy, policy->max,
> +	mutex_lock(&cdbs->ccdbs->timer_mutex);
> +	if (policy->max < cdbs->ccdbs->policy->cur)
> +		__cpufreq_driver_target(cdbs->ccdbs->policy, policy->max,
>  					CPUFREQ_RELATION_H);
> -	else if (policy->min > cdbs->policy->cur)
> -		__cpufreq_driver_target(cdbs->policy, policy->min,
> +	else if (policy->min > cdbs->ccdbs->policy->cur)
> +		__cpufreq_driver_target(cdbs->ccdbs->policy, policy->min,
>  					CPUFREQ_RELATION_L);
>  	dbs_check_cpu(dbs_data, cpu);
> -	mutex_unlock(&cdbs->timer_mutex);
> +	mutex_unlock(&cdbs->ccdbs->timer_mutex);
>  }
>  
>  int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> index a0f8eb79ee6d..153412cafb05 100644
> --- a/drivers/cpufreq/cpufreq_governor.h
> +++ b/drivers/cpufreq/cpufreq_governor.h
> @@ -128,6 +128,18 @@ static void *get_cpu_dbs_info_s(int cpu)				\
>   * cs_*: Conservative governor
>   */
>  
> +/* Common to all CPUs of a policy */
> +struct cpu_common_dbs_info {
> +	struct cpufreq_policy *policy;
> +	/*
> +	 * percpu mutex that serializes governor limit change with gov_dbs_timer
> +	 * invocation. We do not want gov_dbs_timer to run when user is changing
> +	 * the governor or limits.
> +	 */
> +	struct mutex timer_mutex;
> +	ktime_t time_stamp;
> +};
> +
>  /* Per cpu structures */
>  struct cpu_dbs_info {
>  	u64 prev_cpu_idle;
> @@ -140,15 +152,8 @@ struct cpu_dbs_info {
>  	 * wake-up from idle.
>  	 */
>  	unsigned int prev_load;
> -	struct cpufreq_policy *policy;
>  	struct delayed_work dwork;
> -	/*
> -	 * percpu mutex that serializes governor limit change with gov_dbs_timer
> -	 * invocation. We do not want gov_dbs_timer to run when user is changing
> -	 * the governor or limits.
> -	 */
> -	struct mutex timer_mutex;
> -	ktime_t time_stamp;
> +	struct cpu_common_dbs_info *ccdbs;
>  };
>  
>  struct od_cpu_dbs_info_s {
> @@ -264,7 +269,8 @@ static ssize_t show_sampling_rate_min_gov_pol				\
>  extern struct mutex cpufreq_governor_lock;
>  
>  void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
> -bool need_load_eval(struct cpu_dbs_info *cdbs, unsigned int sampling_rate);
> +bool need_load_eval(struct cpu_common_dbs_info *ccdbs,
> +		    unsigned int sampling_rate);
>  int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>  		struct common_dbs_data *cdata, unsigned int event);
>  void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> index d29c6f9c6e3e..651677cfa581 100644
> --- a/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/drivers/cpufreq/cpufreq_ondemand.c
> @@ -155,7 +155,7 @@ static void dbs_freq_increase(struct cpufreq_policy *policy, unsigned int freq)
>  static void od_check_cpu(int cpu, unsigned int load)
>  {
>  	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
> -	struct cpufreq_policy *policy = dbs_info->cdbs.policy;
> +	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
>  	struct dbs_data *dbs_data = policy->governor_data;
>  	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
>  
> @@ -195,16 +195,18 @@ static void od_dbs_timer(struct work_struct *work)
>  {
>  	struct od_cpu_dbs_info_s *dbs_info =
>  		container_of(work, struct od_cpu_dbs_info_s, cdbs.dwork.work);
> -	unsigned int cpu = dbs_info->cdbs.policy->cpu;
> +	struct cpufreq_policy *policy = dbs_info->cdbs.ccdbs->policy;
> +	unsigned int cpu = policy->cpu;
>  	struct od_cpu_dbs_info_s *core_dbs_info = &per_cpu(od_cpu_dbs_info,
>  			cpu);
> -	struct dbs_data *dbs_data = dbs_info->cdbs.policy->governor_data;
> +	struct dbs_data *dbs_data = policy->governor_data;
>  	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
>  	int delay = 0, sample_type = core_dbs_info->sample_type;
>  	bool modify_all = true;
>  
> -	mutex_lock(&core_dbs_info->cdbs.timer_mutex);
> -	if (!need_load_eval(&core_dbs_info->cdbs, od_tuners->sampling_rate)) {
> +	mutex_lock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
> +	if (!need_load_eval(core_dbs_info->cdbs.ccdbs,
> +			    od_tuners->sampling_rate)) {
>  		modify_all = false;
>  		goto max_delay;
>  	}
> @@ -213,8 +215,7 @@ static void od_dbs_timer(struct work_struct *work)
>  	core_dbs_info->sample_type = OD_NORMAL_SAMPLE;
>  	if (sample_type == OD_SUB_SAMPLE) {
>  		delay = core_dbs_info->freq_lo_jiffies;
> -		__cpufreq_driver_target(core_dbs_info->cdbs.policy,
> -					core_dbs_info->freq_lo,
> +		__cpufreq_driver_target(policy, core_dbs_info->freq_lo,
>  					CPUFREQ_RELATION_H);
>  	} else {
>  		dbs_check_cpu(dbs_data, cpu);
> @@ -230,8 +231,8 @@ static void od_dbs_timer(struct work_struct *work)
>  		delay = delay_for_sampling_rate(od_tuners->sampling_rate
>  				* core_dbs_info->rate_mult);
>  
> -	gov_queue_work(dbs_data, dbs_info->cdbs.policy, delay, modify_all);
> -	mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
> +	gov_queue_work(dbs_data, policy, delay, modify_all);
> +	mutex_unlock(&core_dbs_info->cdbs.ccdbs->timer_mutex);
>  }
>  
>  /************************** sysfs interface ************************/
> @@ -274,10 +275,10 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
>  		dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
>  		cpufreq_cpu_put(policy);
>  
> -		mutex_lock(&dbs_info->cdbs.timer_mutex);
> +		mutex_lock(&dbs_info->cdbs.ccdbs->timer_mutex);
>  
>  		if (!delayed_work_pending(&dbs_info->cdbs.dwork)) {
> -			mutex_unlock(&dbs_info->cdbs.timer_mutex);
> +			mutex_unlock(&dbs_info->cdbs.ccdbs->timer_mutex);
>  			continue;
>  		}
>  
> @@ -286,15 +287,15 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
>  
>  		if (time_before(next_sampling, appointed_at)) {
>  
> -			mutex_unlock(&dbs_info->cdbs.timer_mutex);
> +			mutex_unlock(&dbs_info->cdbs.ccdbs->timer_mutex);
>  			cancel_delayed_work_sync(&dbs_info->cdbs.dwork);
> -			mutex_lock(&dbs_info->cdbs.timer_mutex);
> +			mutex_lock(&dbs_info->cdbs.ccdbs->timer_mutex);
>  
> -			gov_queue_work(dbs_data, dbs_info->cdbs.policy,
> +			gov_queue_work(dbs_data, policy,
>  				       usecs_to_jiffies(new_rate), true);
>  
>  		}
> -		mutex_unlock(&dbs_info->cdbs.timer_mutex);
> +		mutex_unlock(&dbs_info->cdbs.ccdbs->timer_mutex);
>  	}
>  }
>  
> @@ -557,13 +558,16 @@ static void od_set_powersave_bias(unsigned int powersave_bias)
>  
>  	get_online_cpus();
>  	for_each_online_cpu(cpu) {
> +		struct cpu_common_dbs_info *ccdbs;
> +
>  		if (cpumask_test_cpu(cpu, &done))
>  			continue;
>  
> -		policy = per_cpu(od_cpu_dbs_info, cpu).cdbs.policy;
> -		if (!policy)
> +		ccdbs = per_cpu(od_cpu_dbs_info, cpu).cdbs.ccdbs;
> +		if (!ccdbs)
>  			continue;
>  
> +		policy = ccdbs->policy;
>  		cpumask_or(&done, &done, policy->cpus);
>  
>  		if (policy->governor != &cpufreq_gov_ondemand)

-- 
Pengutronix e.K.             | Lucas Stach                 |
Industrial Linux Solutions   | http://www.pengutronix.de/  |


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus
  2015-11-27 11:56   ` Lucas Stach
@ 2015-11-30  5:19     ` Viresh Kumar
  2015-11-30 10:34       ` Lucas Stach
  0 siblings, 1 reply; 42+ messages in thread
From: Viresh Kumar @ 2015-11-30  5:19 UTC (permalink / raw)
  To: Lucas Stach
  Cc: Rafael Wysocki, Preeti U Murthy, ke.wang, linaro-kernel,
	linux-pm, ego, paulus, shilpa.bhat, prarit, robert.schoene,
	skannan

On 27-11-15, 12:56, Lucas Stach wrote:
> Sorry for being late to the party, as this has long been applied, but I
> stumbled upon this while looking at other stuff and wanted to express my
> finding.

No issues, we can still fix things if they have gone wrong.

> From a short look this change is totally bogus.

That's a bit shocking, after two people reviewed and tested it :)

> The *_dbs_timer functions are called from a workqueue on each CPU.

Right.

> Moving the time_stamp and the timer_mutex to a shared structure now
> causes a thundering herd issue on multicore systems.

Did you face an issue, or do you just think that something is wrong?

> All workqueues wake
> up at the same time with all but one of the worker bouncing on the
> contended mutex.

This is exactly what used to be done earlier, but it was designed
badly. For example, for the ondemand governor, each CPU had a
struct od_cpu_dbs_info_s, but we only used the od_cpu_dbs_info_s of
'policy->cpu'. So, even though all CPUs had this structure (i.e. the
time_stamp and the lock), we only used the structure for policy->cpu.

> Also sharing the time_stamp causes all but the first worker to acquire
> the mutex to not properly update their sampling information, I guess.

Yeah, that's what we want per-policy.

> I didn't look too deeply into this.

Perhaps, yes.

-- 
viresh

* Re: [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus
  2015-11-30  5:19     ` Viresh Kumar
@ 2015-11-30 10:34       ` Lucas Stach
  2015-11-30 11:03         ` Viresh Kumar
  0 siblings, 1 reply; 42+ messages in thread
From: Lucas Stach @ 2015-11-30 10:34 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Preeti U Murthy, ke.wang, linaro-kernel,
	linux-pm, ego, paulus, shilpa.bhat, prarit, robert.schoene,
	skannan

Am Montag, den 30.11.2015, 10:49 +0530 schrieb Viresh Kumar:
> On 27-11-15, 12:56, Lucas Stach wrote:
> > Sorry for being late to the party, as this has long been applied, but I
> > stumbled upon this while looking at other stuff and wanted to express my
> > finding.
> 
> No issues, we can still fix things if they have gone wrong.
> 
> > From a short look this change is totally bogus.
> 
> That's a bit shocking, after two people reviewed and tested it :)
> 
> > The *_dbs_timer functions are called from a workqueue on each CPU.
> 
> Right.
> 
> > Moving the time_stamp and the timer_mutex to a shared structure now
> > causes a thundering herd issue on multicore systems.
> 
> You faced an issue or you just think that something is wrong ?

The timer_mutex is one of the top contended locks already on a quad-core
system. That really doesn't look right, as the timer is quite low
frequency. It's causing excessive wake-ups, as all 4 CPUs wake up to
handle the WQ, 3 of them go straight back to sleep waiting for the mutex
to be released, then the same thing happens with one CPU fewer each round
until we've rippled down to a single CPU.

> 
> > All workqueues wake
> > up at the same time with all but one of the worker bouncing on the
> > contended mutex.
> 
> This is exactly what used to be done earlier, but it was designed
> badly. For example, for ondemand governor, each CPU had a
> struct od_cpu_dbs_info_s and we only used the od_cpu_dbs_info_s for
> the 'policy->cpu'. So, even if all CPUs had this structure (i.e. the
> time_stamp and the lock), we only used the structure for policy->cpu.
> 
Ok, so this patch isn't actually at fault. My apologies. It was just the
first thing down the git history that caught my eye when looking into
why the timer mutex is so badly contended.
I would say it's still worth fixing, perhaps by not waking all the
workqueues at the same time, but spreading the wake times out over a
jiffy.

> > Also sharing the time_stamp causes all but the first worker to acquire
> > the mutex to not properly update their sampling information, I guess.
> 
> Yeah, that's what we want per-policy.
> 
That I still don't really understand. If we only sample a single CPU,
why do we also wake all the others? Also, if we serialize on the
timer_mutex and only the first CPU to acquire it updates the sampling
information, is it okay that essentially a random CPU does the work?

Again, I didn't really take the time to understand the code in depth,
hoping that you could provide the answers to those questions more
easily. :)

Thanks,
Lucas

-- 
Pengutronix e.K.             | Lucas Stach                 |
Industrial Linux Solutions   | http://www.pengutronix.de/  |


* Re: [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus
  2015-11-30 10:34       ` Lucas Stach
@ 2015-11-30 11:03         ` Viresh Kumar
  2015-12-02 21:43           ` Saravana Kannan
  0 siblings, 1 reply; 42+ messages in thread
From: Viresh Kumar @ 2015-11-30 11:03 UTC (permalink / raw)
  To: Lucas Stach
  Cc: Rafael Wysocki, Preeti U Murthy, ke.wang, linaro-kernel,
	linux-pm, ego, paulus, shilpa.bhat, prarit, robert.schoene,
	skannan

On 30-11-15, 11:34, Lucas Stach wrote:
> The timer_mutex is one of the top contended locks already on a quad core
> system.

My new series (not yet committed) should hopefully fix that to some extent:

http://marc.info/?l=linux-pm&m=144612165211091&w=2

> That really doesn't look right, as the timer is quite low
> frequency. It's causing excessive wake-ups, as all 4 CPUs wake up to
> handle the WQ, 3 of them go straight back to sleep waiting for the mutex
> to be released, then the same thing happens with one CPU fewer each round
> until we've rippled down to a single CPU.

There is a reason why we need to do this on all CPUs today. The timers
are deferrable by nature, as we shouldn't wake up a CPU just to change
its frequency. Now, a group of CPUs that change their DVFS state
together, or that share the same voltage/clock rails, are part of the
same cpufreq policy. We need to take into account the load of all the
CPUs that are part of the policy while updating the frequency of the
group.

If we queue the timer on any one CPU, then that CPU can go idle
and the deferred timer will not fire. But the other CPUs of the policy
can still be active, and then the frequency of the group wouldn't change
with load.

Hope that answers the query related to timers on all CPUs.

> I would say it's still worth fixing. Perhaps by not waking all the
> workqueues at the same time, but spreading the wake times out over a
> jiffy.

Maybe.

-- 
viresh

* Re: [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus
  2015-11-30 11:03         ` Viresh Kumar
@ 2015-12-02 21:43           ` Saravana Kannan
  2015-12-03  4:54             ` Viresh Kumar
  0 siblings, 1 reply; 42+ messages in thread
From: Saravana Kannan @ 2015-12-02 21:43 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Lucas Stach, Rafael Wysocki, Preeti U Murthy, ke.wang,
	linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene

On 11/30/2015 03:03 AM, Viresh Kumar wrote:
> On 30-11-15, 11:34, Lucas Stach wrote:
>> The timer_mutex is one of the top contended locks already on a quad core
>> system.
>
> My new series (un-committed) should fix that to some level hopefully:
>
> http://marc.info/?l=linux-pm&m=144612165211091&w=2
>
>> That really doesn't look right, as the timer is quite low
>> frequency. It's causing excessive wake-ups, as all 4 CPUs wake up to
>> handle the WQ, 3 of them go straight back to sleep waiting for the mutex
>> to be released, then the same thing happens with one CPU fewer each round
>> until we've rippled down to a single CPU.
>
> There is a reason why we need to do this on all CPUs today. The timers
> are deferrable by nature, as we shouldn't wake up a CPU just to change
> its frequency. Now, a group of CPUs that change their DVFS state
> together, or that share the same voltage/clock rails, are part of the
> same cpufreq policy. We need to take into account the load of all the
> CPUs that are part of the policy while updating the frequency of the
> group.
>
> If we queue the timer on any one CPU, then that CPU can go idle
> and the deferred timer will not fire. But the other CPUs of the policy
> can still be active, and then the frequency of the group wouldn't change
> with load.
>
> Hope that answers the query related to timers on all CPUs.
>
>> I would say it's still worth fixing. Perhaps by not waking all the
>> workqueues at the same time, but spreading the wake times out over a
>> jiffy.
>
> Maybe.
>

There's a separate thread where we proposed a fix for deferrable timers:
store them globally if they are not CPU bound, so that as long as any
one CPU is up, they get handled. But TGLX had some valid concerns about
cache thrashing and the impact on some network code. So, last I heard,
he was going to rewrite it and fix the deferrable timer problem by
having the "orphaned" (because their CPU has gone idle) deferrable
timers adopted by other CPUs while the original CPU is idle.

Once that's fixed, we just need one timer per policy. Long story short,
cpufreq is working around poor API semantics of deferrable timers.

-Saravana

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus
  2015-12-02 21:43           ` Saravana Kannan
@ 2015-12-03  4:54             ` Viresh Kumar
  2015-12-03  6:53               ` Robert Schöne
  0 siblings, 1 reply; 42+ messages in thread
From: Viresh Kumar @ 2015-12-03  4:54 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Lucas Stach, Rafael Wysocki, Preeti U Murthy, ke.wang,
	linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit,
	robert.schoene

On 02-12-15, 13:43, Saravana Kannan wrote:
> There's a separate thread where we proposed a fix to deferrable
> timers that are stored globally if they are not CPU bound. That way,
> even if one CPU is up, they get handled. But TGLX had some valid
> concerns with cache thrashing and impact on some network code. So,
> last I heard, he was going to rewrite and fixed the deferrable timer
> problem by having the "orphaned" (because CPU has gone idle)
> deferrable timers being adopted by other CPUs while the original CPU
> is idle.
> 
> Once that's fixed, we just need one timer per policy. Long story
> short, CPU freq is working around a poor API semantic of deferrable
> timers.

There is another idea that I have.

Let's sacrifice the idleness of CPU0 (which is already considered a
housekeeping CPU by the scheduler) to save us from all the complexity we
have today.

Suppose we have 16 CPUs, with 4 CPUs per policy and hence 4 policies.
- Keep a single delayed work (non-deferrable) per policy and queue it with
  queue_work_on(CPU0).
- This will work because any CPU can calculate the load of the other CPUs,
  and there is no dependency on the local CPU.
- CPU0 will hence get interrupted, check whether the policy->cpus are idle
  or not, and if not, update their frequency (perhaps with an IPI).

Not sure if this will be better performance-wise though.

-- 
viresh

* Re: [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus
  2015-12-03  4:54             ` Viresh Kumar
@ 2015-12-03  6:53               ` Robert Schöne
  2015-12-03 22:00                 ` Saravana Kannan
  0 siblings, 1 reply; 42+ messages in thread
From: Robert Schöne @ 2015-12-03  6:53 UTC (permalink / raw)
  To: Viresh Kumar, Saravana Kannan
  Cc: Lucas Stach, Rafael Wysocki, Preeti U Murthy, ke.wang,
	linaro-kernel, linux-pm, ego, paulus, shilpa.bhat, prarit


> There is another idea that I have.
> 
> Lets sacrifice idleness of CPU0 (which is already considered as
> housekeeping CPU in scheduler) to save us from all the complexity we
> have today.
> 
> Suppose we have 16 CPUs, with 4 CPUs per policy and hence 4 policies.
> - Keep a single delayed-work (non-deferrable) per policy and queue
> them as:
>   queue_work_on(CPU0).
> - This will work because any CPU can calculate the load of other
> CPUs,
>   and there is no dependency on the local CPU.
> - CPU0 will hence get interrupted, check if the policy->cpus are idle
>   or not, and if not, update their frequency (perhaps with an IPI).
> 
> Not sure if this will be better performance wise though.
> 

No, it is not. In your example everything could be okay, but only with
many ifs. I'm from the HPC field and would expect load imbalances,
because this approach does not scale with the number of frequency
domains.

In 2011, AMD released Bulldozer servers with 4 sockets, and each of
these sockets had 2 dies with 4 modules. Each module has its own
frequency domain, which sums up to 32 different domains.
Putting all the load on CPU0 will be unfavorable for balanced
performance over all CPUs. This is worsened by the cache coherence
protocol, which might delay accesses to shared and foreign memory
significantly.

And that was in 2011. Today we have Haswell-EP quad-socket servers
with up to 72 frequency domains for all CPUs of all sockets put
together. If you have an SGI Altix UV, you even have 256 sockets with
18 domains each, which sums up to 4608 domains.

I would expect architectures with more frequency domains in the future.

Robert

-- 

Dipl.-Inf. Robert Schoene
Computer Scientist - R&D Energy Efficient Computing

Technische Universitaet Dresden
Center for Information Services and High Performance Computing
Distributed and Data Intensive Computing
01062 Dresden
Tel.: +49 (351) 463-42483
Fax : +49 (351) 463-37773
E-Mail: Robert.Schoene@tu-dresden.de



* Re: [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus
  2015-12-03  6:53               ` Robert Schöne
@ 2015-12-03 22:00                 ` Saravana Kannan
  0 siblings, 0 replies; 42+ messages in thread
From: Saravana Kannan @ 2015-12-03 22:00 UTC (permalink / raw)
  To: Robert Schöne
  Cc: Viresh Kumar, Lucas Stach, Rafael Wysocki, Preeti U Murthy,
	ke.wang, linaro-kernel, linux-pm, ego, paulus, shilpa.bhat,
	prarit

On 12/02/2015 10:53 PM, Robert Schöne wrote:
>
>> There is another idea that I have.
>>
>> Lets sacrifice idleness of CPU0 (which is already considered as
>> housekeeping CPU in scheduler) to save us from all the complexity we
>> have today.
>>
>> Suppose we have 16 CPUs, with 4 CPUs per policy and hence 4 policies.
>> - Keep a single delayed-work (non-deferrable) per policy and queue
>> them as:

This is a 100% no go. Non-deferrable is going to kill mobile idle power.

>>    queue_work_on(CPU0).
>> - This will work because any CPU can calculate the load of other
>> CPUs,
>>    and there is no dependency on the local CPU.
>> - CPU0 will hence get interrupted, check if the policy->cpus are idle
>>    or not, and if not, update their frequency (perhaps with an IPI).
>>
>> Not sure if this will be better performance wise though.
>>
>
> No it is not. In your example, everything could be okay with many ifs.
> I'm from the HPC field and would expect load imbalances because this
> approach does not scale with the number of frequency domains.

Agree. I think that's why we had per-policy timers (forced to be 
implemented with per-CPU deferrable timers) in the first place.

-Saravana

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

end of thread, other threads:[~2015-12-03 22:00 UTC | newest]

Thread overview: 42+ messages
2015-06-19 11:48 [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Viresh Kumar
2015-06-19 11:48 ` [PATCH V2 01/10] cpufreq: governor: Name delayed-work as dwork Viresh Kumar
2015-06-19 11:48 ` [PATCH V2 02/10] cpufreq: governor: Drop unused field 'cpu' Viresh Kumar
2015-06-19 11:48 ` [PATCH V2 03/10] cpufreq: governor: Rename 'cpu_dbs_common_info' to 'cpu_dbs_info' Viresh Kumar
2015-06-19 11:48 ` [PATCH V2 04/10] cpufreq: governor: name pointer to cpu_dbs_info as 'cdbs' Viresh Kumar
2015-06-19 11:48 ` [PATCH V2 05/10] cpufreq: governor: rename cur_policy as policy Viresh Kumar
2015-06-19 11:48 ` [PATCH V2 06/10] cpufreq: governor: Keep single copy of information common to policy->cpus Viresh Kumar
2015-07-17 22:11   ` Rafael J. Wysocki
2015-11-27 11:56   ` Lucas Stach
2015-11-30  5:19     ` Viresh Kumar
2015-11-30 10:34       ` Lucas Stach
2015-11-30 11:03         ` Viresh Kumar
2015-12-02 21:43           ` Saravana Kannan
2015-12-03  4:54             ` Viresh Kumar
2015-12-03  6:53               ` Robert Schöne
2015-12-03 22:00                 ` Saravana Kannan
2015-06-19 11:48 ` [PATCH V2 07/10] cpufreq: governor: split out common part of {cs|od}_dbs_timer() Viresh Kumar
2015-06-19 11:48 ` [PATCH V2 08/10] cpufreq: governor: Avoid invalid states with additional checks Viresh Kumar
2015-06-19 18:04   ` Preeti U Murthy
2015-06-19 11:48 ` [PATCH V2 09/10] cpufreq: governor: Don't WARN on invalid states Viresh Kumar
2015-06-19 11:48 ` [PATCH V2 10/10] cpufreq: propagate errors returned from __cpufreq_governor() Viresh Kumar
2015-06-19 18:02   ` Preeti U Murthy
2015-06-20  3:12     ` Viresh Kumar
2015-06-22  4:41       ` Preeti U Murthy
2015-06-22  5:21         ` Viresh Kumar
2015-06-19 18:09   ` Preeti U Murthy
2015-06-19 18:08 ` [PATCH V2 00/10] cpufreq: governor: Avoid invalid state-transitions Preeti U Murthy
2015-06-19 23:16 ` Rafael J. Wysocki
2015-06-20  2:36   ` Viresh Kumar
2015-07-17 22:15 ` Rafael J. Wysocki
2015-07-18  6:00   ` [PATCH V3 0/5] " Viresh Kumar
2015-07-18  6:00     ` [PATCH V3 1/5] cpufreq: governor: Keep single copy of information common to policy->cpus Viresh Kumar
2015-07-18  6:00       ` Viresh Kumar
2015-07-18  6:01     ` [PATCH V3 2/5] cpufreq: governor: split out common part of {cs|od}_dbs_timer() Viresh Kumar
2015-07-18  6:01       ` Viresh Kumar
2015-07-18  6:01     ` [PATCH V3 3/5] cpufreq: governor: Avoid invalid states with additional checks Viresh Kumar
2015-07-18  6:01       ` Viresh Kumar
2015-07-18  6:01     ` [PATCH V3 4/5] cpufreq: governor: Don't WARN on invalid states Viresh Kumar
2015-07-18  6:01       ` Viresh Kumar
2015-07-18  6:01     ` [PATCH V3 5/5] cpufreq: propagate errors returned from __cpufreq_governor() Viresh Kumar
2015-07-18  6:01       ` Viresh Kumar
2015-07-23 21:00     ` [PATCH V3 0/5] cpufreq: governor: Avoid invalid state-transitions Rafael J. Wysocki
