All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups
@ 2016-02-08 11:39 Viresh Kumar
  2016-02-08 11:39 ` [PATCH V3 01/13] cpufreq: governor: Create generic macro for global tuners Viresh Kumar
                   ` (14 more replies)
  0 siblings, 15 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

Hi Rafael,

Things look much much better now. I have rebased this series over
pm/bleeding-edge, that has all your patches.

I have moved ahead and done few more changes in this series, that should
get rid of all the lockdeps we were getting earlier, that includes
fixing lockdep issue in update_sampling_rate() that we were chasing.

These are pushed here again:
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git cpufreq/governor-kobject

@Juri/Shilpa: Can you please confirm if all the issues got resolved now
?

V2->V3:
- The first patch from V2, that was moving min_sampling_rate to
  per governor structure was dropped, as you suggested.
- Also, I have moved common tunables to struct dbs_data now, which you
  also suggested sometime back.
- Last 5 patches are all new and fix rest of the issues we were facing.

--
viresh

Viresh Kumar (13):
  cpufreq: governor: Create generic macro for global tuners
  cpufreq: governor: Move common tunables to 'struct dbs_data'
  cpufreq: governor: New sysfs show/store callbacks for governor
    tunables
  cpufreq: governor: Drop unused macros for creating governor tunable
    attributes
  Revert "cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT"
  cpufreq: Merge cpufreq_offline_prepare/finish routines
  cpufreq: Call __cpufreq_governor() with policy->rwsem held
  cpufreq: Remove cpufreq_governor_lock
  cpufreq: governor: Move common sysfs tunables to cpufreq_governor.c
  cpufreq: governor: No need to manage state machine now
  cpufreq: governor: Keep list of policy_dbs within dbs_data
  cpufreq: ondemand: Traverse list of policy_dbs in
    update_sampling_rate()
  cpufreq: conservative: Update sample_delay_ns immediately

 drivers/cpufreq/cpufreq.c              |  98 ++++++------
 drivers/cpufreq/cpufreq_conservative.c | 143 ++++--------------
 drivers/cpufreq/cpufreq_governor.c     | 266 +++++++++++++++++++++++++--------
 drivers/cpufreq/cpufreq_governor.h     | 156 ++++++++-----------
 drivers/cpufreq/cpufreq_ondemand.c     | 241 +++++------------------------
 include/linux/cpufreq.h                |   4 -
 6 files changed, 381 insertions(+), 527 deletions(-)

-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH V3 01/13] cpufreq: governor: Create generic macro for global tuners
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 16:33   ` Rafael J. Wysocki
  2016-02-08 11:39 ` [PATCH V3 02/13] cpufreq: governor: Move common tunables to 'struct dbs_data' Viresh Kumar
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

Some tunables are present in governor specific structures, whereas one
(min_sampling_rate) is present in global 'struct dbs_data'.

The macro for that is pretty much specific to sampling_rate_min for now.
Next patch would be moving few more tunables to the 'struct dbs_data',
then it would be useful to create a generic macro for such cases.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c |  8 ++++----
 drivers/cpufreq/cpufreq_governor.h     | 36 +++++++++++++++++++---------------
 drivers/cpufreq/cpufreq_ondemand.c     |  8 ++++----
 3 files changed, 28 insertions(+), 24 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 1a899bb7d1a4..8aaa8a4c2fca 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -245,7 +245,7 @@ show_store_one(cs, up_threshold);
 show_store_one(cs, down_threshold);
 show_store_one(cs, ignore_nice_load);
 show_store_one(cs, freq_step);
-declare_show_sampling_rate_min(cs);
+show_one_global(cs, min_sampling_rate);
 
 gov_sys_pol_attr_rw(sampling_rate);
 gov_sys_pol_attr_rw(sampling_down_factor);
@@ -253,10 +253,10 @@ gov_sys_pol_attr_rw(up_threshold);
 gov_sys_pol_attr_rw(down_threshold);
 gov_sys_pol_attr_rw(ignore_nice_load);
 gov_sys_pol_attr_rw(freq_step);
-gov_sys_pol_attr_ro(sampling_rate_min);
+gov_sys_pol_attr_ro(min_sampling_rate);
 
 static struct attribute *dbs_attributes_gov_sys[] = {
-	&sampling_rate_min_gov_sys.attr,
+	&min_sampling_rate_gov_sys.attr,
 	&sampling_rate_gov_sys.attr,
 	&sampling_down_factor_gov_sys.attr,
 	&up_threshold_gov_sys.attr,
@@ -272,7 +272,7 @@ static struct attribute_group cs_attr_group_gov_sys = {
 };
 
 static struct attribute *dbs_attributes_gov_pol[] = {
-	&sampling_rate_min_gov_pol.attr,
+	&min_sampling_rate_gov_pol.attr,
 	&sampling_rate_gov_pol.attr,
 	&sampling_down_factor_gov_pol.attr,
 	&up_threshold_gov_pol.attr,
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 95e6834d36a8..cc9a373b9736 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -110,6 +110,26 @@ static ssize_t store_##file_name##_gov_pol				\
 show_one(_gov, file_name);						\
 store_one(_gov, file_name)
 
+#define show_one_global(_gov, file_name)				\
+static ssize_t show_##file_name##_gov_sys				\
+(struct kobject *kobj, struct attribute *attr, char *buf)		\
+{									\
+	struct dbs_data *dbs_data = _gov##_dbs_gov.gdbs_data;		\
+	return sprintf(buf, "%u\n", dbs_data->file_name);		\
+}									\
+									\
+static ssize_t show_##file_name##_gov_pol				\
+(struct cpufreq_policy *policy, char *buf)				\
+{									\
+	struct policy_dbs_info *policy_dbs = policy->governor_data;	\
+	struct dbs_data *dbs_data = policy_dbs->dbs_data;		\
+	return sprintf(buf, "%u\n", dbs_data->file_name);		\
+}
+
+#define show_store_one_global(_gov, file_name)				\
+show_one_global(_gov, file_name);					\
+store_one(_gov, file_name)
+
 /* create helper routines */
 #define define_get_cpu_dbs_routines(_dbs_info)				\
 static struct cpu_dbs_info *get_cpu_cdbs(int cpu)			\
@@ -264,22 +284,6 @@ static inline int delay_for_sampling_rate(unsigned int sampling_rate)
 	return delay;
 }
 
-#define declare_show_sampling_rate_min(_gov)				\
-static ssize_t show_sampling_rate_min_gov_sys				\
-(struct kobject *kobj, struct attribute *attr, char *buf)		\
-{									\
-	struct dbs_data *dbs_data = _gov##_dbs_gov.gdbs_data;		\
-	return sprintf(buf, "%u\n", dbs_data->min_sampling_rate);	\
-}									\
-									\
-static ssize_t show_sampling_rate_min_gov_pol				\
-(struct cpufreq_policy *policy, char *buf)				\
-{									\
-	struct policy_dbs_info *policy_dbs = policy->governor_data;	\
-	struct dbs_data *dbs_data = policy_dbs->dbs_data;		\
-	return sprintf(buf, "%u\n", dbs_data->min_sampling_rate);	\
-}
-
 extern struct mutex dbs_data_mutex;
 extern struct mutex cpufreq_governor_lock;
 void dbs_check_cpu(struct cpufreq_policy *policy);
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index b7ef2e7f4d4a..2a71a6755998 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -443,7 +443,7 @@ show_store_one(od, up_threshold);
 show_store_one(od, sampling_down_factor);
 show_store_one(od, ignore_nice_load);
 show_store_one(od, powersave_bias);
-declare_show_sampling_rate_min(od);
+show_one_global(od, min_sampling_rate);
 
 gov_sys_pol_attr_rw(sampling_rate);
 gov_sys_pol_attr_rw(io_is_busy);
@@ -451,10 +451,10 @@ gov_sys_pol_attr_rw(up_threshold);
 gov_sys_pol_attr_rw(sampling_down_factor);
 gov_sys_pol_attr_rw(ignore_nice_load);
 gov_sys_pol_attr_rw(powersave_bias);
-gov_sys_pol_attr_ro(sampling_rate_min);
+gov_sys_pol_attr_ro(min_sampling_rate);
 
 static struct attribute *dbs_attributes_gov_sys[] = {
-	&sampling_rate_min_gov_sys.attr,
+	&min_sampling_rate_gov_sys.attr,
 	&sampling_rate_gov_sys.attr,
 	&up_threshold_gov_sys.attr,
 	&sampling_down_factor_gov_sys.attr,
@@ -470,7 +470,7 @@ static struct attribute_group od_attr_group_gov_sys = {
 };
 
 static struct attribute *dbs_attributes_gov_pol[] = {
-	&sampling_rate_min_gov_pol.attr,
+	&min_sampling_rate_gov_pol.attr,
 	&sampling_rate_gov_pol.attr,
 	&up_threshold_gov_pol.attr,
 	&sampling_down_factor_gov_pol.attr,
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 02/13] cpufreq: governor: Move common tunables to 'struct dbs_data'
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
  2016-02-08 11:39 ` [PATCH V3 01/13] cpufreq: governor: Create generic macro for global tuners Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 11:39 ` [PATCH V3 03/13] cpufreq: governor: New sysfs show/store callbacks for governor tunables Viresh Kumar
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

There are few more common tunables shared across ondemand and
conservative governors. Move them to 'struct dbs_data' to simplify code.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c | 38 ++++++++++++++-----------------
 drivers/cpufreq/cpufreq_governor.c     | 37 ++++++------------------------
 drivers/cpufreq/cpufreq_governor.h     | 14 +++++-------
 drivers/cpufreq/cpufreq_ondemand.c     | 41 +++++++++++++++-------------------
 4 files changed, 47 insertions(+), 83 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 8aaa8a4c2fca..ee4937ab6a8b 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -60,7 +60,7 @@ static void cs_check_cpu(int cpu, unsigned int load)
 		return;
 
 	/* Check for frequency increase */
-	if (load > cs_tuners->up_threshold) {
+	if (load > dbs_data->up_threshold) {
 		dbs_info->down_skip = 0;
 
 		/* if we are already at full speed then break out early */
@@ -78,7 +78,7 @@ static void cs_check_cpu(int cpu, unsigned int load)
 	}
 
 	/* if sampling_down_factor is active break out early */
-	if (++dbs_info->down_skip < cs_tuners->sampling_down_factor)
+	if (++dbs_info->down_skip < dbs_data->sampling_down_factor)
 		return;
 	dbs_info->down_skip = 0;
 
@@ -107,10 +107,9 @@ static unsigned int cs_dbs_timer(struct cpufreq_policy *policy)
 {
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
-	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 
 	dbs_check_cpu(policy);
-	return delay_for_sampling_rate(cs_tuners->sampling_rate);
+	return delay_for_sampling_rate(dbs_data->sampling_rate);
 }
 
 static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
@@ -126,7 +125,6 @@ static struct dbs_governor cs_dbs_gov;
 static ssize_t store_sampling_down_factor(struct dbs_data *dbs_data,
 		const char *buf, size_t count)
 {
-	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 	unsigned int input;
 	int ret;
 	ret = sscanf(buf, "%u", &input);
@@ -134,14 +132,13 @@ static ssize_t store_sampling_down_factor(struct dbs_data *dbs_data,
 	if (ret != 1 || input > MAX_SAMPLING_DOWN_FACTOR || input < 1)
 		return -EINVAL;
 
-	cs_tuners->sampling_down_factor = input;
+	dbs_data->sampling_down_factor = input;
 	return count;
 }
 
 static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
 		size_t count)
 {
-	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 	unsigned int input;
 	int ret;
 	ret = sscanf(buf, "%u", &input);
@@ -149,7 +146,7 @@ static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
 	if (ret != 1)
 		return -EINVAL;
 
-	cs_tuners->sampling_rate = max(input, dbs_data->min_sampling_rate);
+	dbs_data->sampling_rate = max(input, dbs_data->min_sampling_rate);
 	return count;
 }
 
@@ -164,7 +161,7 @@ static ssize_t store_up_threshold(struct dbs_data *dbs_data, const char *buf,
 	if (ret != 1 || input > 100 || input <= cs_tuners->down_threshold)
 		return -EINVAL;
 
-	cs_tuners->up_threshold = input;
+	dbs_data->up_threshold = input;
 	return count;
 }
 
@@ -178,7 +175,7 @@ static ssize_t store_down_threshold(struct dbs_data *dbs_data, const char *buf,
 
 	/* cannot be lower than 11 otherwise freq will not fall */
 	if (ret != 1 || input < 11 || input > 100 ||
-			input >= cs_tuners->up_threshold)
+			input >= dbs_data->up_threshold)
 		return -EINVAL;
 
 	cs_tuners->down_threshold = input;
@@ -188,7 +185,6 @@ static ssize_t store_down_threshold(struct dbs_data *dbs_data, const char *buf,
 static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
 		const char *buf, size_t count)
 {
-	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 	unsigned int input, j;
 	int ret;
 
@@ -199,10 +195,10 @@ static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
 	if (input > 1)
 		input = 1;
 
-	if (input == cs_tuners->ignore_nice_load) /* nothing to do */
+	if (input == dbs_data->ignore_nice_load) /* nothing to do */
 		return count;
 
-	cs_tuners->ignore_nice_load = input;
+	dbs_data->ignore_nice_load = input;
 
 	/* we need to re-evaluate prev_cpu_idle */
 	for_each_online_cpu(j) {
@@ -210,7 +206,7 @@ static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
 		dbs_info = &per_cpu(cs_cpu_dbs_info, j);
 		dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
 					&dbs_info->cdbs.prev_cpu_wall, 0);
-		if (cs_tuners->ignore_nice_load)
+		if (dbs_data->ignore_nice_load)
 			dbs_info->cdbs.prev_cpu_nice =
 				kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 	}
@@ -239,12 +235,12 @@ static ssize_t store_freq_step(struct dbs_data *dbs_data, const char *buf,
 	return count;
 }
 
-show_store_one(cs, sampling_rate);
-show_store_one(cs, sampling_down_factor);
-show_store_one(cs, up_threshold);
 show_store_one(cs, down_threshold);
-show_store_one(cs, ignore_nice_load);
 show_store_one(cs, freq_step);
+show_store_one_global(cs, sampling_rate);
+show_store_one_global(cs, sampling_down_factor);
+show_store_one_global(cs, up_threshold);
+show_store_one_global(cs, ignore_nice_load);
 show_one_global(cs, min_sampling_rate);
 
 gov_sys_pol_attr_rw(sampling_rate);
@@ -299,11 +295,11 @@ static int cs_init(struct dbs_data *dbs_data, bool notify)
 		return -ENOMEM;
 	}
 
-	tuners->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
 	tuners->down_threshold = DEF_FREQUENCY_DOWN_THRESHOLD;
-	tuners->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
-	tuners->ignore_nice_load = 0;
 	tuners->freq_step = DEF_FREQUENCY_STEP;
+	dbs_data->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
+	dbs_data->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
+	dbs_data->ignore_nice_load = 0;
 
 	dbs_data->tuners = tuners;
 	dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 9c1dfcee0d57..b168a32cc8f0 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -38,10 +38,9 @@ void dbs_check_cpu(struct cpufreq_policy *policy)
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
-	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
-	unsigned int sampling_rate;
+	unsigned int sampling_rate = dbs_data->sampling_rate;
+	unsigned int ignore_nice = dbs_data->ignore_nice_load;
 	unsigned int max_load = 0;
-	unsigned int ignore_nice;
 	unsigned int j;
 
 	if (gov->governor == GOV_ONDEMAND) {
@@ -54,13 +53,8 @@ void dbs_check_cpu(struct cpufreq_policy *policy)
 		 * the 'sampling_rate', so as to keep the wake-up-from-idle
 		 * detection logic a bit conservative.
 		 */
-		sampling_rate = od_tuners->sampling_rate;
 		sampling_rate *= od_dbs_info->rate_mult;
 
-		ignore_nice = od_tuners->ignore_nice_load;
-	} else {
-		sampling_rate = cs_tuners->sampling_rate;
-		ignore_nice = cs_tuners->ignore_nice_load;
 	}
 
 	/* Get Absolute Load */
@@ -279,19 +273,6 @@ static void dbs_update_util_handler(struct update_util_data *data, u64 time,
 	atomic_dec(&policy_dbs->skip_work);
 }
 
-static void set_sampling_rate(struct dbs_data *dbs_data,
-			      struct dbs_governor *gov,
-			      unsigned int sampling_rate)
-{
-	if (gov->governor == GOV_CONSERVATIVE) {
-		struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
-		cs_tuners->sampling_rate = sampling_rate;
-	} else {
-		struct od_dbs_tuners *od_tuners = dbs_data->tuners;
-		od_tuners->sampling_rate = sampling_rate;
-	}
-}
-
 static struct policy_dbs_info *alloc_policy_dbs_info(struct cpufreq_policy *policy,
 						     struct dbs_governor *gov)
 {
@@ -383,8 +364,8 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
 	/* Bring kernel and HW constraints together */
 	dbs_data->min_sampling_rate = max(dbs_data->min_sampling_rate,
 					  MIN_LATENCY_MULTIPLIER * latency);
-	set_sampling_rate(dbs_data, gov, max(dbs_data->min_sampling_rate,
-					latency * LATENCY_MULTIPLIER));
+	dbs_data->sampling_rate = max(dbs_data->min_sampling_rate,
+				      LATENCY_MULTIPLIER * latency);
 
 	if (!have_governor_per_policy())
 		gov->gdbs_data = dbs_data;
@@ -456,16 +437,12 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy)
 	if (policy_dbs->policy)
 		return -EBUSY;
 
-	if (gov->governor == GOV_CONSERVATIVE) {
-		struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+	sampling_rate = dbs_data->sampling_rate;
+	ignore_nice = dbs_data->ignore_nice_load;
 
-		sampling_rate = cs_tuners->sampling_rate;
-		ignore_nice = cs_tuners->ignore_nice_load;
-	} else {
+	if (gov->governor == GOV_ONDEMAND) {
 		struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 
-		sampling_rate = od_tuners->sampling_rate;
-		ignore_nice = od_tuners->ignore_nice_load;
 		io_busy = od_tuners->io_is_busy;
 	}
 
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index cc9a373b9736..5c5d7936087c 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -153,9 +153,13 @@ static void *get_cpu_dbs_info_s(int cpu)				\
 
 /* Governor demand based switching data (per-policy or global). */
 struct dbs_data {
-	unsigned int min_sampling_rate;
 	int usage_count;
 	void *tuners;
+	unsigned int min_sampling_rate;
+	unsigned int ignore_nice_load;
+	unsigned int sampling_rate;
+	unsigned int sampling_down_factor;
+	unsigned int up_threshold;
 };
 
 /* Common to all CPUs of a policy */
@@ -216,19 +220,11 @@ struct cs_cpu_dbs_info_s {
 
 /* Per policy Governors sysfs tunables */
 struct od_dbs_tuners {
-	unsigned int ignore_nice_load;
-	unsigned int sampling_rate;
-	unsigned int sampling_down_factor;
-	unsigned int up_threshold;
 	unsigned int powersave_bias;
 	unsigned int io_is_busy;
 };
 
 struct cs_dbs_tuners {
-	unsigned int ignore_nice_load;
-	unsigned int sampling_rate;
-	unsigned int sampling_down_factor;
-	unsigned int up_threshold;
 	unsigned int down_threshold;
 	unsigned int freq_step;
 };
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 2a71a6755998..cb0d6ff1ced5 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -110,7 +110,7 @@ static unsigned int generic_powersave_bias_target(struct cpufreq_policy *policy,
 		dbs_info->freq_lo_jiffies = 0;
 		return freq_lo;
 	}
-	jiffies_total = usecs_to_jiffies(od_tuners->sampling_rate);
+	jiffies_total = usecs_to_jiffies(dbs_data->sampling_rate);
 	jiffies_hi = (freq_avg - freq_lo) * jiffies_total;
 	jiffies_hi += ((freq_hi - freq_lo) / 2);
 	jiffies_hi /= (freq_hi - freq_lo);
@@ -161,11 +161,10 @@ static void od_check_cpu(int cpu, unsigned int load)
 	dbs_info->freq_lo = 0;
 
 	/* Check for frequency increase */
-	if (load > od_tuners->up_threshold) {
+	if (load > dbs_data->up_threshold) {
 		/* If switching to max speed, apply sampling_down_factor */
 		if (policy->cur < policy->max)
-			dbs_info->rate_mult =
-				od_tuners->sampling_down_factor;
+			dbs_info->rate_mult = dbs_data->sampling_down_factor;
 		dbs_freq_increase(policy, policy->max);
 	} else {
 		/* Calculate the next frequency proportional to load */
@@ -195,7 +194,6 @@ static unsigned int od_dbs_timer(struct cpufreq_policy *policy)
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, policy->cpu);
-	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	int delay = 0, sample_type = dbs_info->sample_type;
 
 	/* Common NORMAL_SAMPLE setup */
@@ -214,7 +212,7 @@ static unsigned int od_dbs_timer(struct cpufreq_policy *policy)
 	}
 
 	if (!delay)
-		delay = delay_for_sampling_rate(od_tuners->sampling_rate
+		delay = delay_for_sampling_rate(dbs_data->sampling_rate
 				* dbs_info->rate_mult);
 
 	return delay;
@@ -239,11 +237,10 @@ static struct dbs_governor od_dbs_gov;
 static void update_sampling_rate(struct dbs_data *dbs_data,
 		unsigned int new_rate)
 {
-	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	struct cpumask cpumask;
 	int cpu;
 
-	od_tuners->sampling_rate = new_rate = max(new_rate,
+	dbs_data->sampling_rate = new_rate = max(new_rate,
 			dbs_data->min_sampling_rate);
 
 	/*
@@ -348,7 +345,6 @@ static ssize_t store_io_is_busy(struct dbs_data *dbs_data, const char *buf,
 static ssize_t store_up_threshold(struct dbs_data *dbs_data, const char *buf,
 		size_t count)
 {
-	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	unsigned int input;
 	int ret;
 	ret = sscanf(buf, "%u", &input);
@@ -358,21 +354,20 @@ static ssize_t store_up_threshold(struct dbs_data *dbs_data, const char *buf,
 		return -EINVAL;
 	}
 
-	od_tuners->up_threshold = input;
+	dbs_data->up_threshold = input;
 	return count;
 }
 
 static ssize_t store_sampling_down_factor(struct dbs_data *dbs_data,
 		const char *buf, size_t count)
 {
-	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	unsigned int input, j;
 	int ret;
 	ret = sscanf(buf, "%u", &input);
 
 	if (ret != 1 || input > MAX_SAMPLING_DOWN_FACTOR || input < 1)
 		return -EINVAL;
-	od_tuners->sampling_down_factor = input;
+	dbs_data->sampling_down_factor = input;
 
 	/* Reset down sampling multiplier in case it was active */
 	for_each_online_cpu(j) {
@@ -399,10 +394,10 @@ static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
 	if (input > 1)
 		input = 1;
 
-	if (input == od_tuners->ignore_nice_load) { /* nothing to do */
+	if (input == dbs_data->ignore_nice_load) { /* nothing to do */
 		return count;
 	}
-	od_tuners->ignore_nice_load = input;
+	dbs_data->ignore_nice_load = input;
 
 	/* we need to re-evaluate prev_cpu_idle */
 	for_each_online_cpu(j) {
@@ -410,7 +405,7 @@ static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
 		dbs_info = &per_cpu(od_cpu_dbs_info, j);
 		dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
 			&dbs_info->cdbs.prev_cpu_wall, od_tuners->io_is_busy);
-		if (od_tuners->ignore_nice_load)
+		if (dbs_data->ignore_nice_load)
 			dbs_info->cdbs.prev_cpu_nice =
 				kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 
@@ -437,12 +432,12 @@ static ssize_t store_powersave_bias(struct dbs_data *dbs_data, const char *buf,
 	return count;
 }
 
-show_store_one(od, sampling_rate);
 show_store_one(od, io_is_busy);
-show_store_one(od, up_threshold);
-show_store_one(od, sampling_down_factor);
-show_store_one(od, ignore_nice_load);
 show_store_one(od, powersave_bias);
+show_store_one_global(od, sampling_rate);
+show_store_one_global(od, up_threshold);
+show_store_one_global(od, sampling_down_factor);
+show_store_one_global(od, ignore_nice_load);
 show_one_global(od, min_sampling_rate);
 
 gov_sys_pol_attr_rw(sampling_rate);
@@ -504,7 +499,7 @@ static int od_init(struct dbs_data *dbs_data, bool notify)
 	put_cpu();
 	if (idle_time != -1ULL) {
 		/* Idle micro accounting is supported. Use finer thresholds */
-		tuners->up_threshold = MICRO_FREQUENCY_UP_THRESHOLD;
+		dbs_data->up_threshold = MICRO_FREQUENCY_UP_THRESHOLD;
 		/*
 		 * In nohz/micro accounting case we set the minimum frequency
 		 * not depending on HZ, but fixed (very low). The deferred
@@ -512,15 +507,15 @@ static int od_init(struct dbs_data *dbs_data, bool notify)
 		*/
 		dbs_data->min_sampling_rate = MICRO_FREQUENCY_MIN_SAMPLE_RATE;
 	} else {
-		tuners->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
+		dbs_data->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
 
 		/* For correct statistics, we need 10 ticks for each measure */
 		dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
 			jiffies_to_usecs(10);
 	}
 
-	tuners->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
-	tuners->ignore_nice_load = 0;
+	dbs_data->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
+	dbs_data->ignore_nice_load = 0;
 	tuners->powersave_bias = default_powersave_bias;
 	tuners->io_is_busy = should_io_be_busy();
 
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 03/13] cpufreq: governor: New sysfs show/store callbacks for governor tunables
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
  2016-02-08 11:39 ` [PATCH V3 01/13] cpufreq: governor: Create generic macro for global tuners Viresh Kumar
  2016-02-08 11:39 ` [PATCH V3 02/13] cpufreq: governor: Move common tunables to 'struct dbs_data' Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 17:07   ` Viresh Kumar
  2016-02-08 21:36   ` Rafael J. Wysocki
  2016-02-08 11:39 ` [PATCH V3 04/13] cpufreq: governor: Drop unused macros for creating governor tunable attributes Viresh Kumar
                   ` (11 subsequent siblings)
  14 siblings, 2 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

The ondemand and conservative governors use the global-attr or freq-attr
structures to represent sysfs attributes corresponding to their tunables
(which of them is actually used depends on whether or not different
policy objects can use the same governor with different tunables at the
same time and, consequently, on where those attributes are located in
sysfs).

Unfortunately, in the freq-attr case, the standard cpufreq show/store
sysfs attribute callbacks are applied to the governor tunable attributes
and they always acquire the policy->rwsem lock before carrying out the
operation.  That may lead to an ABBA deadlock if governor tunable
attributes are removed under policy->rwsem while one of them is being
accessed concurrently (if sysfs attributes removal wins the race, it
will wait for the access to complete with policy->rwsem held while the
attribute callback will block on policy->rwsem indefinitely).

We attempted to address this issue by dropping policy->rwsem around
governor tunable attributes removal (that is, around invocations of the
->governor callback with the event arg equal to CPUFREQ_GOV_POLICY_EXIT)
in cpufreq_set_policy(), but that opened up race conditions that had not
been possible with policy->rwsem held all the time.  Therefore
policy->rwsem cannot be dropped in cpufreq_set_policy() at any point,
but the deadlock situation described above must be avoided too.

To that end, use the observation that in principle governor tunables may
be represented by the same data type regardless of whether the governor
is system-wide or per-policy and introduce a new structure, struct
governor_attr, for representing them and new corresponding macros for
creating show/store sysfs callbacks for them.  Also make their parent
kobject use a new kobject type whose default show/store callbacks are
not related to the standard core cpufreq ones in any way (and they don't
acquire policy->rwsem in particular).

[ Rafael: Written changelog ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c | 73 ++++++++++++----------------------
 drivers/cpufreq/cpufreq_governor.c     | 68 +++++++++++++++++++++++++++----
 drivers/cpufreq/cpufreq_governor.h     | 40 ++++++++++++++++++-
 drivers/cpufreq/cpufreq_ondemand.c     | 73 ++++++++++++----------------------
 4 files changed, 150 insertions(+), 104 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index ee4937ab6a8b..6d45b7e6b43f 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -235,54 +235,33 @@ static ssize_t store_freq_step(struct dbs_data *dbs_data, const char *buf,
 	return count;
 }
 
-show_store_one(cs, down_threshold);
-show_store_one(cs, freq_step);
-show_store_one_global(cs, sampling_rate);
-show_store_one_global(cs, sampling_down_factor);
-show_store_one_global(cs, up_threshold);
-show_store_one_global(cs, ignore_nice_load);
-show_one_global(cs, min_sampling_rate);
-
-gov_sys_pol_attr_rw(sampling_rate);
-gov_sys_pol_attr_rw(sampling_down_factor);
-gov_sys_pol_attr_rw(up_threshold);
-gov_sys_pol_attr_rw(down_threshold);
-gov_sys_pol_attr_rw(ignore_nice_load);
-gov_sys_pol_attr_rw(freq_step);
-gov_sys_pol_attr_ro(min_sampling_rate);
-
-static struct attribute *dbs_attributes_gov_sys[] = {
-	&min_sampling_rate_gov_sys.attr,
-	&sampling_rate_gov_sys.attr,
-	&sampling_down_factor_gov_sys.attr,
-	&up_threshold_gov_sys.attr,
-	&down_threshold_gov_sys.attr,
-	&ignore_nice_load_gov_sys.attr,
-	&freq_step_gov_sys.attr,
+gov_show_one(sampling_rate);
+gov_show_one(sampling_down_factor);
+gov_show_one(up_threshold);
+gov_show_one(ignore_nice_load);
+gov_show_one(min_sampling_rate);
+gov_show_one_tunable(cs, down_threshold);
+gov_show_one_tunable(cs, freq_step);
+
+gov_attr_rw(sampling_rate);
+gov_attr_rw(sampling_down_factor);
+gov_attr_rw(up_threshold);
+gov_attr_rw(ignore_nice_load);
+gov_attr_ro(min_sampling_rate);
+gov_attr_rw(down_threshold);
+gov_attr_rw(freq_step);
+
+static struct attribute *cs_attributes[] = {
+	&min_sampling_rate.attr,
+	&sampling_rate.attr,
+	&sampling_down_factor.attr,
+	&up_threshold.attr,
+	&down_threshold.attr,
+	&ignore_nice_load.attr,
+	&freq_step.attr,
 	NULL
 };
 
-static struct attribute_group cs_attr_group_gov_sys = {
-	.attrs = dbs_attributes_gov_sys,
-	.name = "conservative",
-};
-
-static struct attribute *dbs_attributes_gov_pol[] = {
-	&min_sampling_rate_gov_pol.attr,
-	&sampling_rate_gov_pol.attr,
-	&sampling_down_factor_gov_pol.attr,
-	&up_threshold_gov_pol.attr,
-	&down_threshold_gov_pol.attr,
-	&ignore_nice_load_gov_pol.attr,
-	&freq_step_gov_pol.attr,
-	NULL
-};
-
-static struct attribute_group cs_attr_group_gov_pol = {
-	.attrs = dbs_attributes_gov_pol,
-	.name = "conservative",
-};
-
 /************************** sysfs end ************************/
 
 static int cs_init(struct dbs_data *dbs_data, bool notify)
@@ -331,8 +310,8 @@ static struct dbs_governor cs_dbs_gov = {
 		.owner = THIS_MODULE,
 	},
 	.governor = GOV_CONSERVATIVE,
-	.attr_group_gov_sys = &cs_attr_group_gov_sys,
-	.attr_group_gov_pol = &cs_attr_group_gov_pol,
+	.kobj_name = "conservative",
+	.kobj_type = { .default_attrs = cs_attributes },
 	.get_cpu_cdbs = get_cpu_cdbs,
 	.get_cpu_dbs_info_s = get_cpu_dbs_info_s,
 	.gov_dbs_timer = cs_dbs_timer,
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index b168a32cc8f0..295732e06354 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -25,12 +25,58 @@
 DEFINE_MUTEX(dbs_data_mutex);
 EXPORT_SYMBOL_GPL(dbs_data_mutex);
 
-static struct attribute_group *get_sysfs_attr(struct dbs_governor *gov)
+static inline struct dbs_data *to_dbs_data(struct kobject *kobj)
 {
-	return have_governor_per_policy() ?
-		gov->attr_group_gov_pol : gov->attr_group_gov_sys;
+	return container_of(kobj, struct dbs_data, kobj);
 }
 
+static inline struct governor_attr *to_gov_attr(struct attribute *attr)
+{
+	return container_of(attr, struct governor_attr, attr);
+}
+
+static ssize_t governor_show(struct kobject *kobj, struct attribute *attr,
+			     char *buf)
+{
+	struct dbs_data *dbs_data = to_dbs_data(kobj);
+	struct governor_attr *gattr = to_gov_attr(attr);
+	int ret = -EIO;
+
+	if (gattr->show)
+		ret = gattr->show(dbs_data, buf);
+
+	return ret;
+}
+
+static ssize_t governor_store(struct kobject *kobj, struct attribute *attr,
+			      const char *buf, size_t count)
+{
+	struct dbs_data *dbs_data = to_dbs_data(kobj);
+	struct governor_attr *gattr = to_gov_attr(attr);
+	int ret = -EIO;
+
+	mutex_lock(&dbs_data->mutex);
+
+	if (gattr->store)
+		ret = gattr->store(dbs_data, buf, count);
+
+	mutex_unlock(&dbs_data->mutex);
+
+	return ret;
+}
+
+/*
+ * Sysfs Ops for accessing governor attributes.
+ *
+ * All show/store invocations for governor specific sysfs attributes, will first
+ * call the below show/store callbacks and the attribute specific callback will
+ * be called from within it.
+ */
+static const struct sysfs_ops governor_sysfs_ops = {
+	.show	= governor_show,
+	.store	= governor_store,
+};
+
 void dbs_check_cpu(struct cpufreq_policy *policy)
 {
 	int cpu = policy->cpu;
@@ -351,6 +397,7 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
 	}
 
 	dbs_data->usage_count = 1;
+	mutex_init(&dbs_data->mutex);
 
 	ret = gov->init(dbs_data, !policy->governor->initialized);
 	if (ret)
@@ -373,10 +420,15 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
 	policy_dbs->dbs_data = dbs_data;
 	policy->governor_data = policy_dbs;
 
-	ret = sysfs_create_group(get_governor_parent_kobj(policy),
-				 get_sysfs_attr(gov));
-	if (ret)
+	gov->kobj_type.sysfs_ops = &governor_sysfs_ops;
+	ret = kobject_init_and_add(&dbs_data->kobj, &gov->kobj_type,
+				   get_governor_parent_kobj(policy),
+				   gov->kobj_name);
+	if (ret) {
+		pr_err("cpufreq: Governor initialization failed (dbs_data kobject initialization error %d)\n",
+		       ret);
 		goto reset_gdbs_data;
+	}
 
 	return 0;
 
@@ -404,8 +456,7 @@ static int cpufreq_governor_exit(struct cpufreq_policy *policy)
 		return -EBUSY;
 
 	if (!--dbs_data->usage_count) {
-		sysfs_remove_group(get_governor_parent_kobj(policy),
-				   get_sysfs_attr(gov));
+		kobject_put(&dbs_data->kobj);
 
 		policy->governor_data = NULL;
 
@@ -413,6 +464,7 @@ static int cpufreq_governor_exit(struct cpufreq_policy *policy)
 			gov->gdbs_data = NULL;
 
 		gov->exit(dbs_data, policy->governor->initialized == 1);
+		mutex_destroy(&dbs_data->mutex);
 		kfree(dbs_data);
 	} else {
 		policy->governor_data = NULL;
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 5c5d7936087c..a3afac5d8ab2 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -160,8 +160,44 @@ struct dbs_data {
 	unsigned int sampling_rate;
 	unsigned int sampling_down_factor;
 	unsigned int up_threshold;
+
+	struct kobject kobj;
+	/* Protect concurrent updates to governor tunables from sysfs */
+	struct mutex mutex;
+};
+
+/* Governor's specific attributes */
+struct dbs_data;
+struct governor_attr {
+	struct attribute attr;
+	ssize_t (*show)(struct dbs_data *dbs_data, char *buf);
+	ssize_t (*store)(struct dbs_data *dbs_data, const char *buf,
+			 size_t count);
 };
 
+#define gov_show_one_tunable(_gov, file_name)				\
+static ssize_t show_##file_name						\
+(struct dbs_data *dbs_data, char *buf)					\
+{									\
+	struct _gov##_dbs_tuners *tuners = dbs_data->tuners;		\
+	return sprintf(buf, "%u\n", tuners->file_name);			\
+}
+
+#define gov_show_one(file_name)						\
+static ssize_t show_##file_name						\
+(struct dbs_data *dbs_data, char *buf)					\
+{									\
+	return sprintf(buf, "%u\n", dbs_data->file_name);		\
+}
+
+#define gov_attr_ro(_name)						\
+static struct governor_attr _name =					\
+__ATTR(_name, 0444, show_##_name, NULL)
+
+#define gov_attr_rw(_name)						\
+static struct governor_attr _name =					\
+__ATTR(_name, 0644, show_##_name, store_##_name)
+
 /* Common to all CPUs of a policy */
 struct policy_dbs_info {
 	struct cpufreq_policy *policy;
@@ -236,8 +272,8 @@ struct dbs_governor {
 	#define GOV_ONDEMAND		0
 	#define GOV_CONSERVATIVE	1
 	int governor;
-	struct attribute_group *attr_group_gov_sys; /* one governor - system */
-	struct attribute_group *attr_group_gov_pol; /* one governor - policy */
+	const char *kobj_name;
+	struct kobj_type kobj_type;
 
 	/*
 	 * Common data for platforms that don't set
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index cb0d6ff1ced5..bf570800fa78 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -432,54 +432,33 @@ static ssize_t store_powersave_bias(struct dbs_data *dbs_data, const char *buf,
 	return count;
 }
 
-show_store_one(od, io_is_busy);
-show_store_one(od, powersave_bias);
-show_store_one_global(od, sampling_rate);
-show_store_one_global(od, up_threshold);
-show_store_one_global(od, sampling_down_factor);
-show_store_one_global(od, ignore_nice_load);
-show_one_global(od, min_sampling_rate);
-
-gov_sys_pol_attr_rw(sampling_rate);
-gov_sys_pol_attr_rw(io_is_busy);
-gov_sys_pol_attr_rw(up_threshold);
-gov_sys_pol_attr_rw(sampling_down_factor);
-gov_sys_pol_attr_rw(ignore_nice_load);
-gov_sys_pol_attr_rw(powersave_bias);
-gov_sys_pol_attr_ro(min_sampling_rate);
-
-static struct attribute *dbs_attributes_gov_sys[] = {
-	&min_sampling_rate_gov_sys.attr,
-	&sampling_rate_gov_sys.attr,
-	&up_threshold_gov_sys.attr,
-	&sampling_down_factor_gov_sys.attr,
-	&ignore_nice_load_gov_sys.attr,
-	&powersave_bias_gov_sys.attr,
-	&io_is_busy_gov_sys.attr,
+gov_show_one(sampling_rate);
+gov_show_one(up_threshold);
+gov_show_one(sampling_down_factor);
+gov_show_one(ignore_nice_load);
+gov_show_one(min_sampling_rate);
+gov_show_one_tunable(od, io_is_busy);
+gov_show_one_tunable(od, powersave_bias);
+
+gov_attr_rw(sampling_rate);
+gov_attr_rw(io_is_busy);
+gov_attr_rw(up_threshold);
+gov_attr_rw(sampling_down_factor);
+gov_attr_rw(ignore_nice_load);
+gov_attr_rw(powersave_bias);
+gov_attr_ro(min_sampling_rate);
+
+static struct attribute *od_attributes[] = {
+	&min_sampling_rate.attr,
+	&sampling_rate.attr,
+	&up_threshold.attr,
+	&sampling_down_factor.attr,
+	&ignore_nice_load.attr,
+	&powersave_bias.attr,
+	&io_is_busy.attr,
 	NULL
 };
 
-static struct attribute_group od_attr_group_gov_sys = {
-	.attrs = dbs_attributes_gov_sys,
-	.name = "ondemand",
-};
-
-static struct attribute *dbs_attributes_gov_pol[] = {
-	&min_sampling_rate_gov_pol.attr,
-	&sampling_rate_gov_pol.attr,
-	&up_threshold_gov_pol.attr,
-	&sampling_down_factor_gov_pol.attr,
-	&ignore_nice_load_gov_pol.attr,
-	&powersave_bias_gov_pol.attr,
-	&io_is_busy_gov_pol.attr,
-	NULL
-};
-
-static struct attribute_group od_attr_group_gov_pol = {
-	.attrs = dbs_attributes_gov_pol,
-	.name = "ondemand",
-};
-
 /************************** sysfs end ************************/
 
 static int od_init(struct dbs_data *dbs_data, bool notify)
@@ -544,8 +523,8 @@ static struct dbs_governor od_dbs_gov = {
 		.owner = THIS_MODULE,
 	},
 	.governor = GOV_ONDEMAND,
-	.attr_group_gov_sys = &od_attr_group_gov_sys,
-	.attr_group_gov_pol = &od_attr_group_gov_pol,
+	.kobj_name = "ondemand",
+	.kobj_type = { .default_attrs = od_attributes },
 	.get_cpu_cdbs = get_cpu_cdbs,
 	.get_cpu_dbs_info_s = get_cpu_dbs_info_s,
 	.gov_dbs_timer = od_dbs_timer,
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 04/13] cpufreq: governor: Drop unused macros for creating governor tunable attributes
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (2 preceding siblings ...)
  2016-02-08 11:39 ` [PATCH V3 03/13] cpufreq: governor: New sysfs show/store callbacks for governor tunables Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 11:39 ` [PATCH V3 05/13] Revert "cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT" Viresh Kumar
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

"The previous commit introduced a new set of macros for creating sysfs
attributes that represent governor tunables and the old macros used for
this purpose are not needed any more, so drop them."

[ Rafael: Written changelog ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.h | 89 --------------------------------------
 1 file changed, 89 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index a3afac5d8ab2..f164437f235e 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -41,95 +41,6 @@
 /* Ondemand Sampling types */
 enum {OD_NORMAL_SAMPLE, OD_SUB_SAMPLE};
 
-/*
- * Macro for creating governors sysfs routines
- *
- * - gov_sys: One governor instance per whole system
- * - gov_pol: One governor instance per policy
- */
-
-/* Create attributes */
-#define gov_sys_attr_ro(_name)						\
-static struct global_attr _name##_gov_sys =				\
-__ATTR(_name, 0444, show_##_name##_gov_sys, NULL)
-
-#define gov_sys_attr_rw(_name)						\
-static struct global_attr _name##_gov_sys =				\
-__ATTR(_name, 0644, show_##_name##_gov_sys, store_##_name##_gov_sys)
-
-#define gov_pol_attr_ro(_name)						\
-static struct freq_attr _name##_gov_pol =				\
-__ATTR(_name, 0444, show_##_name##_gov_pol, NULL)
-
-#define gov_pol_attr_rw(_name)						\
-static struct freq_attr _name##_gov_pol =				\
-__ATTR(_name, 0644, show_##_name##_gov_pol, store_##_name##_gov_pol)
-
-#define gov_sys_pol_attr_rw(_name)					\
-	gov_sys_attr_rw(_name);						\
-	gov_pol_attr_rw(_name)
-
-#define gov_sys_pol_attr_ro(_name)					\
-	gov_sys_attr_ro(_name);						\
-	gov_pol_attr_ro(_name)
-
-/* Create show/store routines */
-#define show_one(_gov, file_name)					\
-static ssize_t show_##file_name##_gov_sys				\
-(struct kobject *kobj, struct attribute *attr, char *buf)		\
-{									\
-	struct _gov##_dbs_tuners *tuners = _gov##_dbs_gov.gdbs_data->tuners; \
-	return sprintf(buf, "%u\n", tuners->file_name);			\
-}									\
-									\
-static ssize_t show_##file_name##_gov_pol				\
-(struct cpufreq_policy *policy, char *buf)				\
-{									\
-	struct policy_dbs_info *policy_dbs = policy->governor_data;	\
-	struct dbs_data *dbs_data = policy_dbs->dbs_data;		\
-	struct _gov##_dbs_tuners *tuners = dbs_data->tuners;		\
-	return sprintf(buf, "%u\n", tuners->file_name);			\
-}
-
-#define store_one(_gov, file_name)					\
-static ssize_t store_##file_name##_gov_sys				\
-(struct kobject *kobj, struct attribute *attr, const char *buf, size_t count) \
-{									\
-	struct dbs_data *dbs_data = _gov##_dbs_gov.gdbs_data;		\
-	return store_##file_name(dbs_data, buf, count);			\
-}									\
-									\
-static ssize_t store_##file_name##_gov_pol				\
-(struct cpufreq_policy *policy, const char *buf, size_t count)		\
-{									\
-	struct policy_dbs_info *policy_dbs = policy->governor_data;	\
-	return store_##file_name(policy_dbs->dbs_data, buf, count);			\
-}
-
-#define show_store_one(_gov, file_name)					\
-show_one(_gov, file_name);						\
-store_one(_gov, file_name)
-
-#define show_one_global(_gov, file_name)				\
-static ssize_t show_##file_name##_gov_sys				\
-(struct kobject *kobj, struct attribute *attr, char *buf)		\
-{									\
-	struct dbs_data *dbs_data = _gov##_dbs_gov.gdbs_data;		\
-	return sprintf(buf, "%u\n", dbs_data->file_name);		\
-}									\
-									\
-static ssize_t show_##file_name##_gov_pol				\
-(struct cpufreq_policy *policy, char *buf)				\
-{									\
-	struct policy_dbs_info *policy_dbs = policy->governor_data;	\
-	struct dbs_data *dbs_data = policy_dbs->dbs_data;		\
-	return sprintf(buf, "%u\n", dbs_data->file_name);		\
-}
-
-#define show_store_one_global(_gov, file_name)				\
-show_one_global(_gov, file_name);					\
-store_one(_gov, file_name)
-
 /* create helper routines */
 #define define_get_cpu_dbs_routines(_dbs_info)				\
 static struct cpu_dbs_info *get_cpu_cdbs(int cpu)			\
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 05/13] Revert "cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT"
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (3 preceding siblings ...)
  2016-02-08 11:39 ` [PATCH V3 04/13] cpufreq: governor: Drop unused macros for creating governor tunable attributes Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 11:39 ` [PATCH V3 06/13] cpufreq: Merge cpufreq_offline_prepare/finish routines Viresh Kumar
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

Earlier, when the struct freq-attr was used to represent governor
attributes, the standard cpufreq show/store sysfs attribute callbacks
were applied to the governor tunable attributes and they always acquire
the policy->rwsem lock before carrying out the operation.  That could
have resulted in an ABBA deadlock if governor tunable attributes are
removed under policy->rwsem while one of them is being accessed
concurrently (if sysfs attributes removal wins the race, it will wait
for the access to complete with policy->rwsem held while the attribute
callback will block on policy->rwsem indefinitely).

We attempted to address this issue by dropping policy->rwsem around
governor tunable attributes removal (that is, around invocations of the
->governor callback with the event arg equal to CPUFREQ_GOV_POLICY_EXIT)
in cpufreq_set_policy(), but that opened up race conditions that had not
been possible with policy->rwsem held all the time.

The previous commit, "cpufreq: governor: New sysfs show/store callbacks
for governor tunables", fixed the original ABBA deadlock by adding new
governor specific show/store callbacks.

We don't have to drop rwsem around invocations of governor event
CPUFREQ_GOV_POLICY_EXIT anymore, and original fix can be reverted now.

Fixes: 955ef4833574 ("cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT")
Reported-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 5 -----
 include/linux/cpufreq.h   | 4 ----
 2 files changed, 9 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 29755fcbf200..9c62bf35b9dc 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2204,10 +2204,7 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 			return ret;
 		}
 
-		up_write(&policy->rwsem);
 		ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
-		down_write(&policy->rwsem);
-
 		if (ret) {
 			pr_err("%s: Failed to Exit Governor: %s (%d)\n",
 			       __func__, old_gov->name, ret);
@@ -2223,9 +2220,7 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 		if (!ret)
 			goto out;
 
-		up_write(&policy->rwsem);
 		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
-		down_write(&policy->rwsem);
 	}
 
 	/* new governor failed, so re-start old one */
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 11aa93134493..2a029c587975 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -100,10 +100,6 @@ struct cpufreq_policy {
 	 * - Any routine that will write to the policy structure and/or may take away
 	 *   the policy altogether (eg. CPU hotplug), will hold this lock in write
 	 *   mode before doing so.
-	 *
-	 * Additional rules:
-	 * - Lock should not be held across
-	 *     __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
 	 */
 	struct rw_semaphore	rwsem;
 
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 06/13] cpufreq: Merge cpufreq_offline_prepare/finish routines
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (4 preceding siblings ...)
  2016-02-08 11:39 ` [PATCH V3 05/13] Revert "cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT" Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 11:39 ` [PATCH V3 07/13] cpufreq: Call __cpufreq_governor() with policy->rwsem held Viresh Kumar
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

The offline routine was separated into two halves earlier by
'commit 1aee40ac9c86 ("cpufreq: Invoke __cpufreq_remove_dev_finish()
after releasing cpu_hotplug.lock");.

And the reasons cited were, race issues between accessing policy's sysfs
files and policy kobject's cleanup.

That race isn't valid anymore, as we don't remove the policy & its
kobject completely on hotplugs, but do that from ->remove() callback of
subsys framework.

These two routines can be merged back now.

This is a preparatory step for the next patch, that will enforce
policy->rwsem lock around __cpufreq_governor() routines STOP/EXIT
sequence.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 36 ++++++++++--------------------------
 1 file changed, 10 insertions(+), 26 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 9c62bf35b9dc..863ac26c4ecf 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1361,9 +1361,10 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	return ret;
 }
 
-static void cpufreq_offline_prepare(unsigned int cpu)
+static void cpufreq_offline(unsigned int cpu)
 {
 	struct cpufreq_policy *policy;
+	int ret;
 
 	pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
 
@@ -1374,7 +1375,7 @@ static void cpufreq_offline_prepare(unsigned int cpu)
 	}
 
 	if (has_target()) {
-		int ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 		if (ret)
 			pr_err("%s: Failed to stop governor\n", __func__);
 	}
@@ -1397,34 +1398,23 @@ static void cpufreq_offline_prepare(unsigned int cpu)
 	/* Start governor again for active policy */
 	if (!policy_is_inactive(policy)) {
 		if (has_target()) {
-			int ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
+			ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
 			if (!ret)
 				ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
 
 			if (ret)
 				pr_err("%s: Failed to start governor\n", __func__);
 		}
-	} else if (cpufreq_driver->stop_cpu) {
-		cpufreq_driver->stop_cpu(policy);
-	}
-}
 
-static void cpufreq_offline_finish(unsigned int cpu)
-{
-	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
-
-	if (!policy) {
-		pr_debug("%s: No cpu_data found\n", __func__);
 		return;
 	}
 
-	/* Only proceed for inactive policies */
-	if (!policy_is_inactive(policy))
-		return;
+	if (cpufreq_driver->stop_cpu)
+		cpufreq_driver->stop_cpu(policy);
 
 	/* If cpu is last user of policy, free policy */
 	if (has_target()) {
-		int ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
 		if (ret)
 			pr_err("%s: Failed to exit governor\n", __func__);
 	}
@@ -1453,10 +1443,8 @@ static void cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
 	if (!policy)
 		return;
 
-	if (cpu_online(cpu)) {
-		cpufreq_offline_prepare(cpu);
-		cpufreq_offline_finish(cpu);
-	}
+	if (cpu_online(cpu))
+		cpufreq_offline(cpu);
 
 	cpumask_clear_cpu(cpu, policy->real_cpus);
 	remove_cpu_dev_symlink(policy, cpu);
@@ -2304,11 +2292,7 @@ static int cpufreq_cpu_callback(struct notifier_block *nfb,
 		break;
 
 	case CPU_DOWN_PREPARE:
-		cpufreq_offline_prepare(cpu);
-		break;
-
-	case CPU_POST_DEAD:
-		cpufreq_offline_finish(cpu);
+		cpufreq_offline(cpu);
 		break;
 
 	case CPU_DOWN_FAILED:
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 07/13] cpufreq: Call __cpufreq_governor() with policy->rwsem held
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (5 preceding siblings ...)
  2016-02-08 11:39 ` [PATCH V3 06/13] cpufreq: Merge cpufreq_offline_prepare/finish routines Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 11:39 ` [PATCH V3 08/13] cpufreq: Remove cpufreq_governor_lock Viresh Kumar
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

This isn't followed properly by all parts of the core code, some follow
it, whereas others don't.

Enforcing it will also enable us to remove cpufreq_governor_lock, that
is used today because we can't guarantee that __cpufreq_governor() isn't
executed in parallel.

We should also ensure that the lock is held across state changes to the
governors.

For example, while adding a CPU to the policy on cpu-online path, we
need to stop the governor, change policy->cpus, start the governor and
then refresh its limits. The complete sequence must be guaranteed to
execute without any concurrent races. And that can be achieved using
policy->rwsem around these use cases.

Also note that cpufreq_driver->stop_cpu() and ->exit() can get called
while policy->rwsem is held. That shouldn't have any side effects
though.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 49 +++++++++++++++++++++++++++++++----------------
 1 file changed, 33 insertions(+), 16 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 863ac26c4ecf..51fb47cd38a0 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1048,30 +1048,29 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, unsigned int cp
 	if (cpumask_test_cpu(cpu, policy->cpus))
 		return 0;
 
+	down_write(&policy->rwsem);
 	if (has_target()) {
 		ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 		if (ret) {
 			pr_err("%s: Failed to stop governor\n", __func__);
-			return ret;
+			goto unlock;
 		}
 	}
 
-	down_write(&policy->rwsem);
 	cpumask_set_cpu(cpu, policy->cpus);
-	up_write(&policy->rwsem);
 
 	if (has_target()) {
 		ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
 		if (!ret)
 			ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
 
-		if (ret) {
+		if (ret)
 			pr_err("%s: Failed to start governor\n", __func__);
-			return ret;
-		}
 	}
 
-	return 0;
+unlock:
+	up_write(&policy->rwsem);
+	return ret;
 }
 
 static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu)
@@ -1374,13 +1373,13 @@ static void cpufreq_offline(unsigned int cpu)
 		return;
 	}
 
+	down_write(&policy->rwsem);
 	if (has_target()) {
 		ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 		if (ret)
 			pr_err("%s: Failed to stop governor\n", __func__);
 	}
 
-	down_write(&policy->rwsem);
 	cpumask_clear_cpu(cpu, policy->cpus);
 
 	if (policy_is_inactive(policy)) {
@@ -1393,7 +1392,6 @@ static void cpufreq_offline(unsigned int cpu)
 		/* Nominate new CPU */
 		policy->cpu = cpumask_any(policy->cpus);
 	}
-	up_write(&policy->rwsem);
 
 	/* Start governor again for active policy */
 	if (!policy_is_inactive(policy)) {
@@ -1406,7 +1404,7 @@ static void cpufreq_offline(unsigned int cpu)
 				pr_err("%s: Failed to start governor\n", __func__);
 		}
 
-		return;
+		goto unlock;
 	}
 
 	if (cpufreq_driver->stop_cpu)
@@ -1428,6 +1426,9 @@ static void cpufreq_offline(unsigned int cpu)
 		cpufreq_driver->exit(policy);
 		policy->freq_table = NULL;
 	}
+
+unlock:
+	up_write(&policy->rwsem);
 }
 
 /**
@@ -1624,6 +1625,7 @@ EXPORT_SYMBOL(cpufreq_generic_suspend);
 void cpufreq_suspend(void)
 {
 	struct cpufreq_policy *policy;
+	int ret;
 
 	if (!cpufreq_driver)
 		return;
@@ -1634,7 +1636,11 @@ void cpufreq_suspend(void)
 	pr_debug("%s: Suspending Governors\n", __func__);
 
 	for_each_active_policy(policy) {
-		if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
+		down_write(&policy->rwsem);
+		ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
+		up_write(&policy->rwsem);
+
+		if (ret)
 			pr_err("%s: Failed to stop governor for policy: %p\n",
 				__func__, policy);
 		else if (cpufreq_driver->suspend
@@ -1656,6 +1662,7 @@ void cpufreq_suspend(void)
 void cpufreq_resume(void)
 {
 	struct cpufreq_policy *policy;
+	int ret;
 
 	if (!cpufreq_driver)
 		return;
@@ -1668,13 +1675,20 @@ void cpufreq_resume(void)
 	pr_debug("%s: Resuming Governors\n", __func__);
 
 	for_each_active_policy(policy) {
-		if (cpufreq_driver->resume && cpufreq_driver->resume(policy))
+		if (cpufreq_driver->resume && cpufreq_driver->resume(policy)) {
 			pr_err("%s: Failed to resume driver: %p\n", __func__,
 				policy);
-		else if (__cpufreq_governor(policy, CPUFREQ_GOV_START)
-		    || __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS))
-			pr_err("%s: Failed to start governor for policy: %p\n",
-				__func__, policy);
+		} else {
+			down_write(&policy->rwsem);
+			ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
+			if (!ret)
+				__cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
+			up_write(&policy->rwsem);
+
+			if (ret)
+				pr_err("%s: Failed to start governor for policy: %p\n",
+				       __func__, policy);
+		}
 	}
 
 	/*
@@ -2325,8 +2339,11 @@ static int cpufreq_boost_set_sw(int state)
 				       __func__);
 				break;
 			}
+
+			down_write(&policy->rwsem);
 			policy->user_policy.max = policy->max;
 			__cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
+			up_write(&policy->rwsem);
 		}
 	}
 
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 08/13] cpufreq: Remove cpufreq_governor_lock
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (6 preceding siblings ...)
  2016-02-08 11:39 ` [PATCH V3 07/13] cpufreq: Call __cpufreq_governor() with policy->rwsem held Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 11:39 ` [PATCH V3 09/13] cpufreq: governor: Move common sysfs tunables to cpufreq_governor.c Viresh Kumar
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

We used to drop policy->rwsem just before calling __cpufreq_governor()
in some cases earlier and so it was possible that __cpufreq_governor()
runs concurrently via separate threads.

In order to guarantee valid state transitions for governors,
'governor_enabled' was required to be protected using some locking and
we created cpufreq_governor_lock for that.

But now, __cpufreq_governor() is always called from within policy->rwsem
held and so 'governor_enabled' is protected against races even without
cpufreq_governor_lock.

Get rid of the extra lock now.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c          | 8 --------
 drivers/cpufreq/cpufreq_governor.h | 1 -
 2 files changed, 9 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 51fb47cd38a0..745da90d7b38 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -146,8 +146,6 @@ void cpufreq_update_util(u64 time, unsigned long util, unsigned long max)
 	rcu_read_unlock();
 }
 
-DEFINE_MUTEX(cpufreq_governor_lock);
-
 /* Flag to suspend/resume CPUFreq governors */
 static bool cpufreq_suspended;
 
@@ -2014,11 +2012,9 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 
 	pr_debug("%s: for CPU %u, event %u\n", __func__, policy->cpu, event);
 
-	mutex_lock(&cpufreq_governor_lock);
 	if ((policy->governor_enabled && event == CPUFREQ_GOV_START)
 	    || (!policy->governor_enabled
 	    && (event == CPUFREQ_GOV_LIMITS || event == CPUFREQ_GOV_STOP))) {
-		mutex_unlock(&cpufreq_governor_lock);
 		return -EBUSY;
 	}
 
@@ -2027,8 +2023,6 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 	else if (event == CPUFREQ_GOV_START)
 		policy->governor_enabled = true;
 
-	mutex_unlock(&cpufreq_governor_lock);
-
 	ret = policy->governor->governor(policy, event);
 
 	if (!ret) {
@@ -2038,12 +2032,10 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 			policy->governor->initialized--;
 	} else {
 		/* Restore original values */
-		mutex_lock(&cpufreq_governor_lock);
 		if (event == CPUFREQ_GOV_STOP)
 			policy->governor_enabled = true;
 		else if (event == CPUFREQ_GOV_START)
 			policy->governor_enabled = false;
-		mutex_unlock(&cpufreq_governor_lock);
 	}
 
 	if (((event == CPUFREQ_GOV_POLICY_INIT) && ret) ||
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index f164437f235e..886578b5a4fd 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -228,7 +228,6 @@ static inline int delay_for_sampling_rate(unsigned int sampling_rate)
 }
 
 extern struct mutex dbs_data_mutex;
-extern struct mutex cpufreq_governor_lock;
 void dbs_check_cpu(struct cpufreq_policy *policy);
 int cpufreq_governor_dbs(struct cpufreq_policy *policy, unsigned int event);
 void od_register_powersave_bias_handler(unsigned int (*f)
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 09/13] cpufreq: governor: Move common sysfs tunables to cpufreq_governor.c
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (7 preceding siblings ...)
  2016-02-08 11:39 ` [PATCH V3 08/13] cpufreq: Remove cpufreq_governor_lock Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 12:58   ` Rafael J. Wysocki
  2016-02-08 11:39 ` [PATCH V3 10/13] cpufreq: governor: No need to manage state machine now Viresh Kumar
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

We have got five common sysfs tunables between ondemand and conservative
governors, move their callbacks to cpufreq_governor.c to get rid of
redundant code.

Because of minor differences in the implementation of the callbacks,
some more per-governor callbacks are introduced in order to not
introduce any more "governor == ONDEMAND/CONSERVATIVE" like checks.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_conservative.c |  80 +++++---------------------
 drivers/cpufreq/cpufreq_governor.c     | 100 +++++++++++++++++++++++++++++++++
 drivers/cpufreq/cpufreq_governor.h     |  16 +++++-
 drivers/cpufreq/cpufreq_ondemand.c     | 100 ++++++++-------------------------
 4 files changed, 151 insertions(+), 145 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 6d45b7e6b43f..f96770dab788 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -122,47 +122,17 @@ static struct notifier_block cs_cpufreq_notifier_block = {
 /************************** sysfs interface ************************/
 static struct dbs_governor cs_dbs_gov;
 
-static ssize_t store_sampling_down_factor(struct dbs_data *dbs_data,
-		const char *buf, size_t count)
+static bool invalid_up_threshold(struct dbs_data *dbs_data,
+				 unsigned int threshold)
 {
-	unsigned int input;
-	int ret;
-	ret = sscanf(buf, "%u", &input);
-
-	if (ret != 1 || input > MAX_SAMPLING_DOWN_FACTOR || input < 1)
-		return -EINVAL;
+	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 
-	dbs_data->sampling_down_factor = input;
-	return count;
+	return threshold > 100 || threshold <= cs_tuners->down_threshold;
 }
 
-static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
-		size_t count)
+static bool invalid_sampling_down_factor(unsigned int factor)
 {
-	unsigned int input;
-	int ret;
-	ret = sscanf(buf, "%u", &input);
-
-	if (ret != 1)
-		return -EINVAL;
-
-	dbs_data->sampling_rate = max(input, dbs_data->min_sampling_rate);
-	return count;
-}
-
-static ssize_t store_up_threshold(struct dbs_data *dbs_data, const char *buf,
-		size_t count)
-{
-	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
-	unsigned int input;
-	int ret;
-	ret = sscanf(buf, "%u", &input);
-
-	if (ret != 1 || input > 100 || input <= cs_tuners->down_threshold)
-		return -EINVAL;
-
-	dbs_data->up_threshold = input;
-	return count;
+	return factor > MAX_SAMPLING_DOWN_FACTOR;
 }
 
 static ssize_t store_down_threshold(struct dbs_data *dbs_data, const char *buf,
@@ -182,27 +152,13 @@ static ssize_t store_down_threshold(struct dbs_data *dbs_data, const char *buf,
 	return count;
 }
 
-static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
-		const char *buf, size_t count)
+static void update_ignore_nice_load(struct dbs_data *dbs_data)
 {
-	unsigned int input, j;
-	int ret;
-
-	ret = sscanf(buf, "%u", &input);
-	if (ret != 1)
-		return -EINVAL;
-
-	if (input > 1)
-		input = 1;
-
-	if (input == dbs_data->ignore_nice_load) /* nothing to do */
-		return count;
-
-	dbs_data->ignore_nice_load = input;
+	struct cs_cpu_dbs_info_s *dbs_info;
+	unsigned int j;
 
 	/* we need to re-evaluate prev_cpu_idle */
 	for_each_online_cpu(j) {
-		struct cs_cpu_dbs_info_s *dbs_info;
 		dbs_info = &per_cpu(cs_cpu_dbs_info, j);
 		dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
 					&dbs_info->cdbs.prev_cpu_wall, 0);
@@ -210,7 +166,6 @@ static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
 			dbs_info->cdbs.prev_cpu_nice =
 				kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 	}
-	return count;
 }
 
 static ssize_t store_freq_step(struct dbs_data *dbs_data, const char *buf,
@@ -235,21 +190,11 @@ static ssize_t store_freq_step(struct dbs_data *dbs_data, const char *buf,
 	return count;
 }
 
-gov_show_one(sampling_rate);
-gov_show_one(sampling_down_factor);
-gov_show_one(up_threshold);
-gov_show_one(ignore_nice_load);
-gov_show_one(min_sampling_rate);
 gov_show_one_tunable(cs, down_threshold);
 gov_show_one_tunable(cs, freq_step);
 
-gov_attr_rw(sampling_rate);
-gov_attr_rw(sampling_down_factor);
-gov_attr_rw(up_threshold);
-gov_attr_rw(ignore_nice_load);
-gov_attr_ro(min_sampling_rate);
-gov_attr_rw(down_threshold);
-gov_attr_rw(freq_step);
+static gov_attr_rw(down_threshold);
+static gov_attr_rw(freq_step);
 
 static struct attribute *cs_attributes[] = {
 	&min_sampling_rate.attr,
@@ -316,6 +261,9 @@ static struct dbs_governor cs_dbs_gov = {
 	.get_cpu_dbs_info_s = get_cpu_dbs_info_s,
 	.gov_dbs_timer = cs_dbs_timer,
 	.gov_check_cpu = cs_check_cpu,
+	.invalid_up_threshold = invalid_up_threshold,
+	.invalid_sampling_down_factor = invalid_sampling_down_factor,
+	.update_ignore_nice_load = update_ignore_nice_load,
 	.init = cs_init,
 	.exit = cs_exit,
 };
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 295732e06354..5403f863b14d 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -25,6 +25,105 @@
 DEFINE_MUTEX(dbs_data_mutex);
 EXPORT_SYMBOL_GPL(dbs_data_mutex);
 
+/* Common sysfs tunables */
+static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
+				   size_t count)
+{
+	struct dbs_governor *gov = dbs_data->gov;
+	unsigned int rate;
+	int ret;
+	ret = sscanf(buf, "%u", &rate);
+	if (ret != 1)
+		return -EINVAL;
+
+	dbs_data->sampling_rate = max(rate, dbs_data->min_sampling_rate);
+
+	if (gov->update_sampling_rate)
+		gov->update_sampling_rate(dbs_data);
+
+	return count;
+}
+
+static ssize_t store_up_threshold(struct dbs_data *dbs_data, const char *buf,
+				  size_t count)
+{
+	struct dbs_governor *gov = dbs_data->gov;
+	unsigned int input;
+	int ret;
+	ret = sscanf(buf, "%u", &input);
+
+	if (ret != 1 || gov->invalid_up_threshold(dbs_data, input))
+		return -EINVAL;
+
+	dbs_data->up_threshold = input;
+	return count;
+}
+
+static ssize_t store_sampling_down_factor(struct dbs_data *dbs_data,
+					  const char *buf, size_t count)
+{
+	struct dbs_governor *gov = dbs_data->gov;
+	unsigned int input;
+	int ret;
+	ret = sscanf(buf, "%u", &input);
+
+	if (ret != 1 || gov->invalid_sampling_down_factor(input) || input < 1)
+		return -EINVAL;
+
+	dbs_data->sampling_down_factor = input;
+
+	if (gov->update_sampling_down_factor)
+		gov->update_sampling_down_factor(dbs_data);
+
+	return count;
+}
+
+static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
+				      const char *buf, size_t count)
+{
+	struct dbs_governor *gov = dbs_data->gov;
+	unsigned int input;
+	int ret;
+
+	ret = sscanf(buf, "%u", &input);
+	if (ret != 1)
+		return -EINVAL;
+
+	if (input > 1)
+		input = 1;
+
+	if (input == dbs_data->ignore_nice_load) { /* nothing to do */
+		return count;
+	}
+
+	dbs_data->ignore_nice_load = input;
+
+	gov->update_ignore_nice_load(dbs_data);
+	return count;
+}
+
+gov_show_one(sampling_rate);
+gov_show_one(up_threshold);
+gov_show_one(sampling_down_factor);
+gov_show_one(ignore_nice_load);
+gov_show_one(min_sampling_rate);
+
+gov_attr_rw(sampling_rate);
+EXPORT_SYMBOL_GPL(sampling_rate);
+
+gov_attr_rw(up_threshold);
+EXPORT_SYMBOL_GPL(up_threshold);
+
+gov_attr_rw(sampling_down_factor);
+EXPORT_SYMBOL_GPL(sampling_down_factor);
+
+gov_attr_rw(ignore_nice_load);
+EXPORT_SYMBOL_GPL(ignore_nice_load);
+
+gov_attr_ro(min_sampling_rate);
+EXPORT_SYMBOL_GPL(min_sampling_rate);
+
+/* Governor routines */
 static inline struct dbs_data *to_dbs_data(struct kobject *kobj)
 {
 	return container_of(kobj, struct dbs_data, kobj);
@@ -397,6 +496,7 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
 	}
 
 	dbs_data->usage_count = 1;
+	dbs_data->gov = gov;
 	mutex_init(&dbs_data->mutex);
 
 	ret = gov->init(dbs_data, !policy->governor->initialized);
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 886578b5a4fd..ced34ba5a18d 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -64,6 +64,7 @@ static void *get_cpu_dbs_info_s(int cpu)				\
 
 /* Governor demand based switching data (per-policy or global). */
 struct dbs_data {
+	struct dbs_governor *gov;
 	int usage_count;
 	void *tuners;
 	unsigned int min_sampling_rate;
@@ -102,13 +103,19 @@ static ssize_t show_##file_name						\
 }
 
 #define gov_attr_ro(_name)						\
-static struct governor_attr _name =					\
+struct governor_attr _name =						\
 __ATTR(_name, 0444, show_##_name, NULL)
 
 #define gov_attr_rw(_name)						\
-static struct governor_attr _name =					\
+struct governor_attr _name =						\
 __ATTR(_name, 0644, show_##_name, store_##_name)
 
+extern struct governor_attr sampling_rate;
+extern struct governor_attr up_threshold;
+extern struct governor_attr sampling_down_factor;
+extern struct governor_attr ignore_nice_load;
+extern struct governor_attr min_sampling_rate;
+
 /* Common to all CPUs of a policy */
 struct policy_dbs_info {
 	struct cpufreq_policy *policy;
@@ -198,6 +205,11 @@ struct dbs_governor {
 	void (*gov_check_cpu)(int cpu, unsigned int load);
 	int (*init)(struct dbs_data *dbs_data, bool notify);
 	void (*exit)(struct dbs_data *dbs_data, bool notify);
+	bool (*invalid_up_threshold)(struct dbs_data *dbs_data, unsigned int threshold);
+	bool (*invalid_sampling_down_factor)(unsigned int factor);
+	void (*update_sampling_rate)(struct dbs_data *dbs_data);
+	void (*update_sampling_down_factor)(struct dbs_data *dbs_data);
+	void (*update_ignore_nice_load)(struct dbs_data *dbs_data);
 
 	/* Governor specific ops, see below */
 	void *gov_ops;
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index bf570800fa78..ccc3419d43f3 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -234,15 +234,12 @@ static struct dbs_governor od_dbs_gov;
  * reducing the sampling rate, we need to make the new value effective
  * immediately.
  */
-static void update_sampling_rate(struct dbs_data *dbs_data,
-		unsigned int new_rate)
+static void update_sampling_rate(struct dbs_data *dbs_data)
 {
 	struct cpumask cpumask;
+	unsigned int new_rate = dbs_data->sampling_rate;
 	int cpu;
 
-	dbs_data->sampling_rate = new_rate = max(new_rate,
-			dbs_data->min_sampling_rate);
-
 	/*
 	 * Lock governor so that governor start/stop can't execute in parallel.
 	 */
@@ -306,17 +303,16 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
 	mutex_unlock(&dbs_data_mutex);
 }
 
-static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
-		size_t count)
+static bool invalid_up_threshold(struct dbs_data *dbs_data,
+				 unsigned int threshold)
 {
-	unsigned int input;
-	int ret;
-	ret = sscanf(buf, "%u", &input);
-	if (ret != 1)
-		return -EINVAL;
+	return threshold > MAX_FREQUENCY_UP_THRESHOLD ||
+	       threshold < MIN_FREQUENCY_UP_THRESHOLD;
+}
 
-	update_sampling_rate(dbs_data, input);
-	return count;
+static bool invalid_sampling_down_factor(unsigned int factor)
+{
+	return factor > MAX_SAMPLING_DOWN_FACTOR;
 }
 
 static ssize_t store_io_is_busy(struct dbs_data *dbs_data, const char *buf,
@@ -342,66 +338,22 @@ static ssize_t store_io_is_busy(struct dbs_data *dbs_data, const char *buf,
 	return count;
 }
 
-static ssize_t store_up_threshold(struct dbs_data *dbs_data, const char *buf,
-		size_t count)
+static void update_sampling_down_factor(struct dbs_data *dbs_data)
 {
-	unsigned int input;
-	int ret;
-	ret = sscanf(buf, "%u", &input);
-
-	if (ret != 1 || input > MAX_FREQUENCY_UP_THRESHOLD ||
-			input < MIN_FREQUENCY_UP_THRESHOLD) {
-		return -EINVAL;
-	}
-
-	dbs_data->up_threshold = input;
-	return count;
-}
-
-static ssize_t store_sampling_down_factor(struct dbs_data *dbs_data,
-		const char *buf, size_t count)
-{
-	unsigned int input, j;
-	int ret;
-	ret = sscanf(buf, "%u", &input);
-
-	if (ret != 1 || input > MAX_SAMPLING_DOWN_FACTOR || input < 1)
-		return -EINVAL;
-	dbs_data->sampling_down_factor = input;
+	unsigned int j;
 
 	/* Reset down sampling multiplier in case it was active */
-	for_each_online_cpu(j) {
-		struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info,
-				j);
-		dbs_info->rate_mult = 1;
-	}
-	return count;
+	for_each_online_cpu(j)
+		per_cpu(od_cpu_dbs_info, j).rate_mult = 1;
 }
 
-static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
-		const char *buf, size_t count)
+static void update_ignore_nice_load(struct dbs_data *dbs_data)
 {
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
-	unsigned int input;
-	int ret;
-
+	struct od_cpu_dbs_info_s *dbs_info;
 	unsigned int j;
 
-	ret = sscanf(buf, "%u", &input);
-	if (ret != 1)
-		return -EINVAL;
-
-	if (input > 1)
-		input = 1;
-
-	if (input == dbs_data->ignore_nice_load) { /* nothing to do */
-		return count;
-	}
-	dbs_data->ignore_nice_load = input;
-
-	/* we need to re-evaluate prev_cpu_idle */
 	for_each_online_cpu(j) {
-		struct od_cpu_dbs_info_s *dbs_info;
 		dbs_info = &per_cpu(od_cpu_dbs_info, j);
 		dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
 			&dbs_info->cdbs.prev_cpu_wall, od_tuners->io_is_busy);
@@ -410,7 +362,6 @@ static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
 				kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 
 	}
-	return count;
 }
 
 static ssize_t store_powersave_bias(struct dbs_data *dbs_data, const char *buf,
@@ -432,21 +383,11 @@ static ssize_t store_powersave_bias(struct dbs_data *dbs_data, const char *buf,
 	return count;
 }
 
-gov_show_one(sampling_rate);
-gov_show_one(up_threshold);
-gov_show_one(sampling_down_factor);
-gov_show_one(ignore_nice_load);
-gov_show_one(min_sampling_rate);
 gov_show_one_tunable(od, io_is_busy);
 gov_show_one_tunable(od, powersave_bias);
 
-gov_attr_rw(sampling_rate);
-gov_attr_rw(io_is_busy);
-gov_attr_rw(up_threshold);
-gov_attr_rw(sampling_down_factor);
-gov_attr_rw(ignore_nice_load);
-gov_attr_rw(powersave_bias);
-gov_attr_ro(min_sampling_rate);
+static gov_attr_rw(io_is_busy);
+static gov_attr_rw(powersave_bias);
 
 static struct attribute *od_attributes[] = {
 	&min_sampling_rate.attr,
@@ -529,6 +470,11 @@ static struct dbs_governor od_dbs_gov = {
 	.get_cpu_dbs_info_s = get_cpu_dbs_info_s,
 	.gov_dbs_timer = od_dbs_timer,
 	.gov_check_cpu = od_check_cpu,
+	.update_sampling_rate = update_sampling_rate,
+	.invalid_up_threshold = invalid_up_threshold,
+	.invalid_sampling_down_factor = invalid_sampling_down_factor,
+	.update_sampling_down_factor = update_sampling_down_factor,
+	.update_ignore_nice_load = update_ignore_nice_load,
 	.gov_ops = &od_ops,
 	.init = od_init,
 	.exit = od_exit,
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 10/13] cpufreq: governor: No need to manage state machine now
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (8 preceding siblings ...)
  2016-02-08 11:39 ` [PATCH V3 09/13] cpufreq: governor: Move common sysfs tunables to cpufreq_governor.c Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 11:39 ` [PATCH V3 11/13] cpufreq: governor: Keep list of policy_dbs within dbs_data Viresh Kumar
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

cpufreq core now guarantees that policy->rwsem wouldn't get dropped
while calling CPUFREQ_GOV_POLICY_EXIT governor event and will be kept
acquired until the complete sequence of governor state changes has
finished.

And so we can remove the state machine checks that were put in place
earlier.

This also means that policy_dbs->policy can be initialized while
policy_dbs is allocated, to move all initialization together.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c | 27 +++++----------------------
 drivers/cpufreq/cpufreq_ondemand.c |  6 +++---
 2 files changed, 8 insertions(+), 25 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 5403f863b14d..ee3c2d92da53 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -332,8 +332,10 @@ static inline void gov_clear_update_util(struct cpufreq_policy *policy)
 	synchronize_rcu();
 }
 
-static void gov_cancel_work(struct policy_dbs_info *policy_dbs)
+static void gov_cancel_work(struct cpufreq_policy *policy)
 {
+	struct policy_dbs_info *policy_dbs = policy->governor_data;
+
 	/* Tell dbs_update_util_handler() to skip queuing up work items. */
 	atomic_inc(&policy_dbs->skip_work);
 	/*
@@ -429,6 +431,7 @@ static struct policy_dbs_info *alloc_policy_dbs_info(struct cpufreq_policy *poli
 	if (!policy_dbs)
 		return NULL;
 
+	policy_dbs->policy = policy;
 	mutex_init(&policy_dbs->timer_mutex);
 	atomic_set(&policy_dbs->skip_work, 0);
 	init_irq_work(&policy_dbs->irq_work, dbs_irq_work);
@@ -551,10 +554,6 @@ static int cpufreq_governor_exit(struct cpufreq_policy *policy)
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 
-	/* State should be equivalent to INIT */
-	if (policy_dbs->policy)
-		return -EBUSY;
-
 	if (!--dbs_data->usage_count) {
 		kobject_put(&dbs_data->kobj);
 
@@ -585,10 +584,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy)
 	if (!policy->cur)
 		return -EINVAL;
 
-	/* State should be equivalent to INIT */
-	if (policy_dbs->policy)
-		return -EBUSY;
-
 	sampling_rate = dbs_data->sampling_rate;
 	ignore_nice = dbs_data->ignore_nice_load;
 
@@ -613,7 +608,6 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy)
 		if (ignore_nice)
 			j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 	}
-	policy_dbs->policy = policy;
 
 	if (gov->governor == GOV_CONSERVATIVE) {
 		struct cs_cpu_dbs_info_s *cs_dbs_info =
@@ -636,14 +630,7 @@ static int cpufreq_governor_start(struct cpufreq_policy *policy)
 
 static int cpufreq_governor_stop(struct cpufreq_policy *policy)
 {
-	struct policy_dbs_info *policy_dbs = policy->governor_data;
-
-	/* State should be equivalent to START */
-	if (!policy_dbs->policy)
-		return -EBUSY;
-
-	gov_cancel_work(policy_dbs);
-	policy_dbs->policy = NULL;
+	gov_cancel_work(policy);
 
 	return 0;
 }
@@ -652,10 +639,6 @@ static int cpufreq_governor_limits(struct cpufreq_policy *policy)
 {
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 
-	/* State should be equivalent to START */
-	if (!policy_dbs->policy)
-		return -EBUSY;
-
 	mutex_lock(&policy_dbs->timer_mutex);
 	if (policy->max < policy->cur)
 		__cpufreq_driver_target(policy, policy->max, CPUFREQ_RELATION_H);
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index ccc3419d43f3..745290d7f6a2 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -258,10 +258,10 @@ static void update_sampling_rate(struct dbs_data *dbs_data)
 		policy_dbs = cdbs->policy_dbs;
 
 		/*
-		 * A valid policy_dbs and policy_dbs->policy means governor
-		 * hasn't stopped or exited yet.
+		 * A valid policy_dbs means governor hasn't stopped or exited
+		 * yet.
 		 */
-		if (!policy_dbs || !policy_dbs->policy)
+		if (!policy_dbs)
 			continue;
 
 		policy = policy_dbs->policy;
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 11/13] cpufreq: governor: Keep list of policy_dbs within dbs_data
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (9 preceding siblings ...)
  2016-02-08 11:39 ` [PATCH V3 10/13] cpufreq: governor: No need to manage state machine now Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 13:21   ` Rafael J. Wysocki
  2016-02-08 11:39 ` [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate() Viresh Kumar
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

An instance of 'struct dbs_data' can support multiple 'struct
policy_dbs_info' instances. To traverse all policy_dbs supported by a
dbs_data, create a list of policy_dbs within dbs_data.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c | 12 ++++++++++++
 drivers/cpufreq/cpufreq_governor.h |  7 ++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index ee3c2d92da53..e267acc67067 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -489,6 +489,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
 		dbs_data->usage_count++;
 		policy_dbs->dbs_data = dbs_data;
 		policy->governor_data = policy_dbs;
+
+		mutex_lock(&dbs_data->mutex);
+		list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);
+		mutex_unlock(&dbs_data->mutex);
+
 		return 0;
 	}
 
@@ -500,8 +505,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
 
 	dbs_data->usage_count = 1;
 	dbs_data->gov = gov;
+	INIT_LIST_HEAD(&dbs_data->policy_dbs_list);
 	mutex_init(&dbs_data->mutex);
 
+	list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);
+
 	ret = gov->init(dbs_data, !policy->governor->initialized);
 	if (ret)
 		goto free_policy_dbs_info;
@@ -554,6 +562,10 @@ static int cpufreq_governor_exit(struct cpufreq_policy *policy)
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 
+	mutex_lock(&dbs_data->mutex);
+	list_del(&policy_dbs->list);
+	mutex_unlock(&dbs_data->mutex);
+
 	if (!--dbs_data->usage_count) {
 		kobject_put(&dbs_data->kobj);
 
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index ced34ba5a18d..b740633c2fbe 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -74,7 +74,11 @@ struct dbs_data {
 	unsigned int up_threshold;
 
 	struct kobject kobj;
-	/* Protect concurrent updates to governor tunables from sysfs */
+	struct list_head policy_dbs_list;
+	/*
+	 * Protect concurrent updates to governor tunables from sysfs and
+	 * policy_dbs_list.
+	 */
 	struct mutex mutex;
 };
 
@@ -132,6 +136,7 @@ struct policy_dbs_info {
 	struct work_struct work;
 	/* dbs_data may be shared between multiple policy objects */
 	struct dbs_data *dbs_data;
+	struct list_head list;
 };
 
 static inline void gov_update_sample_delay(struct policy_dbs_info *policy_dbs,
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate()
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (10 preceding siblings ...)
  2016-02-08 11:39 ` [PATCH V3 11/13] cpufreq: governor: Keep list of policy_dbs within dbs_data Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 13:32   ` Rafael J. Wysocki
  2016-02-08 11:39 ` [PATCH V3 13/13] cpufreq: conservative: Update sample_delay_ns immediately Viresh Kumar
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

Now that we maintain a list of all 'struct policy_dbs_info' for an
instance of 'struct dbs_data', we can traverse that instead of
traversing the loop for all online CPUs.

This also solves the circular dependency lockdep reported by Juri
(and verified by Shilpa) earlier:

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 4.4.0+ #445 Not tainted
 -------------------------------------------------------
 trace.sh/1723 is trying to acquire lock:
  (s_active#48){++++.+}, at: [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94

 but task is already holding lock:
  (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

-> #2 (od_dbs_cdata.mutex){+.+.+.}:
        [<c075b040>] mutex_lock_nested+0x7c/0x434
        [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
        [<c0017c10>] return_to_handler+0x0/0x18

-> #1 (&policy->rwsem){+++++.}:
        [<c075ca8c>] down_read+0x58/0x94
        [<c057c244>] show+0x30/0x60
        [<c01f934c>] sysfs_kf_seq_show+0x90/0xfc
        [<c01f7ad8>] kernfs_seq_show+0x34/0x38
        [<c01a22ec>] seq_read+0x1e4/0x4e4
        [<c01f8694>] kernfs_fop_read+0x120/0x1a0
        [<c01794b4>] __vfs_read+0x3c/0xe0
        [<c017a378>] vfs_read+0x98/0x104
        [<c017a434>] SyS_read+0x50/0x90
        [<c000fd40>] ret_fast_syscall+0x0/0x1c

-> #0 (s_active#48){++++.+}:
        [<c008238c>] lock_acquire+0xd4/0x20c
        [<c01f6ae4>] __kernfs_remove+0x288/0x328
        [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
        [<c01fa024>] remove_files+0x44/0x88
        [<c01fa5a4>] sysfs_remove_group+0x50/0xa4
        [<c058285c>] cpufreq_governor_dbs+0x3f0/0x5d4
        [<c0017c10>] return_to_handler+0x0/0x18

 other info that might help us debug this:

 Chain exists of:
  s_active#48 --> &policy->rwsem --> od_dbs_cdata.mutex

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(od_dbs_cdata.mutex);
                                lock(&policy->rwsem);
                                lock(od_dbs_cdata.mutex);
   lock(s_active#48);

  *** DEADLOCK ***

 5 locks held by trace.sh/1723:
  #0:  (sb_writers#6){.+.+.+}, at: [<c017beb8>] __sb_start_write+0xb4/0xc0
  #1:  (&of->mutex){+.+.+.}, at: [<c01f8418>] kernfs_fop_write+0x6c/0x1c8
  #2:  (s_active#35){.+.+.+}, at: [<c01f8420>] kernfs_fop_write+0x74/0x1c8
  #3:  (cpu_hotplug.lock){++++++}, at: [<c0029e6c>] get_online_cpus+0x48/0xb8
  #4:  (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4

 stack backtrace:
 CPU: 2 PID: 1723 Comm: trace.sh Not tainted 4.4.0+ #445
 Hardware name: ARM-Versatile Express
 [<c001883c>] (unwind_backtrace) from [<c0013f50>] (show_stack+0x20/0x24)
 [<c0013f50>] (show_stack) from [<c044ad90>] (dump_stack+0x80/0xb4)
 [<c044ad90>] (dump_stack) from [<c0128edc>] (print_circular_bug+0x29c/0x2f0)
 [<c0128edc>] (print_circular_bug) from [<c0081708>] (__lock_acquire+0x163c/0x1d74)
 [<c0081708>] (__lock_acquire) from [<c008238c>] (lock_acquire+0xd4/0x20c)
 [<c008238c>] (lock_acquire) from [<c01f6ae4>] (__kernfs_remove+0x288/0x328)
 [<c01f6ae4>] (__kernfs_remove) from [<c01f78c8>] (kernfs_remove_by_name_ns+0x4c/0x94)
 [<c01f78c8>] (kernfs_remove_by_name_ns) from [<c01fa024>] (remove_files+0x44/0x88)
 [<c01fa024>] (remove_files) from [<c01fa5a4>] (sysfs_remove_group+0x50/0xa4)
 [<c01fa5a4>] (sysfs_remove_group) from [<c058285c>] (cpufreq_governor_dbs+0x3f0/0x5d4)
 [<c058285c>] (cpufreq_governor_dbs) from [<c0017c10>] (return_to_handler+0x0/0x18)

Reported-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_ondemand.c | 89 ++++++++++----------------------------
 1 file changed, 22 insertions(+), 67 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 745290d7f6a2..f72087bc8302 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -224,83 +224,38 @@ static struct dbs_governor od_dbs_gov;
 /**
  * update_sampling_rate - update sampling rate effective immediately if needed.
  * @new_rate: new sampling rate
- *
- * If new rate is smaller than the old, simply updating
- * dbs_tuners_int.sampling_rate might not be appropriate. For example, if the
- * original sampling_rate was 1 second and the requested new sampling rate is 10
- * ms because the user needs immediate reaction from ondemand governor, but not
- * sure if higher frequency will be required or not, then, the governor may
- * change the sampling rate too late; up to 1 second later. Thus, if we are
- * reducing the sampling rate, we need to make the new value effective
- * immediately.
  */
 static void update_sampling_rate(struct dbs_data *dbs_data)
 {
-	struct cpumask cpumask;
+	struct policy_dbs_info *policy_dbs;
 	unsigned int new_rate = dbs_data->sampling_rate;
-	int cpu;
 
 	/*
-	 * Lock governor so that governor start/stop can't execute in parallel.
+	 * We are operating under dbs_data->mutex and so the list and its
+	 * entries can't be freed concurrently.
 	 */
-	mutex_lock(&dbs_data_mutex);
-
-	cpumask_copy(&cpumask, cpu_online_mask);
-
-	for_each_cpu(cpu, &cpumask) {
-		struct cpufreq_policy *policy;
-		struct od_cpu_dbs_info_s *dbs_info;
-		struct cpu_dbs_info *cdbs;
-		struct policy_dbs_info *policy_dbs;
-
-		dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
-		cdbs = &dbs_info->cdbs;
-		policy_dbs = cdbs->policy_dbs;
-
+	list_for_each_entry(policy_dbs, &dbs_data->policy_dbs_list, list) {
+		mutex_lock(&policy_dbs->timer_mutex);
 		/*
-		 * A valid policy_dbs means governor hasn't stopped or exited
-		 * yet.
+		 * On 32-bit architectures this may race with the
+		 * sample_delay_ns read in dbs_update_util_handler(), but that
+		 * really doesn't matter.  If the read returns a value that's
+		 * too big, the sample will be skipped, but the next invocation
+		 * of dbs_update_util_handler() (when the update has been
+		 * completed) will take a sample.  If the returned value is too
+		 * small, the sample will be taken immediately, but that isn't a
+		 * problem, as we want the new rate to take effect immediately
+		 * anyway.
+		 *
+		 * If this runs in parallel with dbs_work_handler(), we may end
+		 * up overwriting the sample_delay_ns value that it has just
+		 * written, but the difference should not be too big and it will
+		 * be corrected next time a sample is taken, so it shouldn't be
+		 * significant.
 		 */
-		if (!policy_dbs)
-			continue;
-
-		policy = policy_dbs->policy;
-
-		/* clear all CPUs of this policy */
-		cpumask_andnot(&cpumask, &cpumask, policy->cpus);
-
-		/*
-		 * Update sampling rate for CPUs whose policy is governed by
-		 * dbs_data. In case of governor_per_policy, only a single
-		 * policy will be governed by dbs_data, otherwise there can be
-		 * multiple policies that are governed by the same dbs_data.
-		 */
-		if (dbs_data == policy_dbs->dbs_data) {
-			mutex_lock(&policy_dbs->timer_mutex);
-			/*
-			 * On 32-bit architectures this may race with the
-			 * sample_delay_ns read in dbs_update_util_handler(),
-			 * but that really doesn't matter.  If the read returns
-			 * a value that's too big, the sample will be skipped,
-			 * but the next invocation of dbs_update_util_handler()
-			 * (when the update has been completed) will take a
-			 * sample.  If the returned value is too small, the
-			 * sample will be taken immediately, but that isn't a
-			 * problem, as we want the new rate to take effect
-			 * immediately anyway.
-			 *
-			 * If this runs in parallel with dbs_work_handler(), we
-			 * may end up overwriting the sample_delay_ns value that
-			 * it has just written, but the difference should not be
-			 * too big and it will be corrected next time a sample
-			 * is taken, so it shouldn't be significant.
-			 */
-			gov_update_sample_delay(policy_dbs, new_rate);
-			mutex_unlock(&policy_dbs->timer_mutex);
-		}
+		gov_update_sample_delay(policy_dbs, new_rate);
+		mutex_unlock(&policy_dbs->timer_mutex);
 	}
-
-	mutex_unlock(&dbs_data_mutex);
 }
 
 static bool invalid_up_threshold(struct dbs_data *dbs_data,
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH V3 13/13] cpufreq: conservative: Update sample_delay_ns immediately
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (11 preceding siblings ...)
  2016-02-08 11:39 ` [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate() Viresh Kumar
@ 2016-02-08 11:39 ` Viresh Kumar
  2016-02-08 12:51 ` [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Shilpasri G Bhat
  2016-02-08 21:43 ` Rafael J. Wysocki
  14 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 11:39 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel, Viresh Kumar

Ondemand governor already updates sample_delay_ns immediately on updates
to sampling rate, but conservative isn't doing that.

It was left out earlier as the code has been really complex to get that
done easily. But now things are sorted out very well, and we can follow
the same for conservative governor as well.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq_governor.c | 30 +++++++++++++++++++++++++++---
 drivers/cpufreq/cpufreq_governor.h |  1 -
 drivers/cpufreq/cpufreq_ondemand.c | 38 --------------------------------------
 3 files changed, 27 insertions(+), 42 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index e267acc67067..3f4338bdb3aa 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -29,7 +29,7 @@ EXPORT_SYMBOL_GPL(dbs_data_mutex);
 static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
 				   size_t count)
 {
-	struct dbs_governor *gov = dbs_data->gov;
+	struct policy_dbs_info *policy_dbs;
 	unsigned int rate;
 	int ret;
 	ret = sscanf(buf, "%u", &rate);
@@ -38,8 +38,32 @@ static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
 
 	dbs_data->sampling_rate = max(rate, dbs_data->min_sampling_rate);
 
-	if (gov->update_sampling_rate)
-		gov->update_sampling_rate(dbs_data);
+	/*
+	 * We are operating under dbs_data->mutex and so the list and its
+	 * entries can't be freed concurrently.
+	 */
+	list_for_each_entry(policy_dbs, &dbs_data->policy_dbs_list, list) {
+		mutex_lock(&policy_dbs->timer_mutex);
+		/*
+		 * On 32-bit architectures this may race with the
+		 * sample_delay_ns read in dbs_update_util_handler(), but that
+		 * really doesn't matter.  If the read returns a value that's
+		 * too big, the sample will be skipped, but the next invocation
+		 * of dbs_update_util_handler() (when the update has been
+		 * completed) will take a sample.  If the returned value is too
+		 * small, the sample will be taken immediately, but that isn't a
+		 * problem, as we want the new rate to take effect immediately
+		 * anyway.
+		 *
+		 * If this runs in parallel with dbs_work_handler(), we may end
+		 * up overwriting the sample_delay_ns value that it has just
+		 * written, but the difference should not be too big and it will
+		 * be corrected next time a sample is taken, so it shouldn't be
+		 * significant.
+		 */
+		gov_update_sample_delay(policy_dbs, dbs_data->sampling_rate);
+		mutex_unlock(&policy_dbs->timer_mutex);
+	}
 
 	return count;
 }
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index b740633c2fbe..127982f6d869 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -212,7 +212,6 @@ struct dbs_governor {
 	void (*exit)(struct dbs_data *dbs_data, bool notify);
 	bool (*invalid_up_threshold)(struct dbs_data *dbs_data, unsigned int threshold);
 	bool (*invalid_sampling_down_factor)(unsigned int factor);
-	void (*update_sampling_rate)(struct dbs_data *dbs_data);
 	void (*update_sampling_down_factor)(struct dbs_data *dbs_data);
 	void (*update_ignore_nice_load)(struct dbs_data *dbs_data);
 
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index f72087bc8302..1e17f6ffdf42 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -221,43 +221,6 @@ static unsigned int od_dbs_timer(struct cpufreq_policy *policy)
 /************************** sysfs interface ************************/
 static struct dbs_governor od_dbs_gov;
 
-/**
- * update_sampling_rate - update sampling rate effective immediately if needed.
- * @new_rate: new sampling rate
- */
-static void update_sampling_rate(struct dbs_data *dbs_data)
-{
-	struct policy_dbs_info *policy_dbs;
-	unsigned int new_rate = dbs_data->sampling_rate;
-
-	/*
-	 * We are operating under dbs_data->mutex and so the list and its
-	 * entries can't be freed concurrently.
-	 */
-	list_for_each_entry(policy_dbs, &dbs_data->policy_dbs_list, list) {
-		mutex_lock(&policy_dbs->timer_mutex);
-		/*
-		 * On 32-bit architectures this may race with the
-		 * sample_delay_ns read in dbs_update_util_handler(), but that
-		 * really doesn't matter.  If the read returns a value that's
-		 * too big, the sample will be skipped, but the next invocation
-		 * of dbs_update_util_handler() (when the update has been
-		 * completed) will take a sample.  If the returned value is too
-		 * small, the sample will be taken immediately, but that isn't a
-		 * problem, as we want the new rate to take effect immediately
-		 * anyway.
-		 *
-		 * If this runs in parallel with dbs_work_handler(), we may end
-		 * up overwriting the sample_delay_ns value that it has just
-		 * written, but the difference should not be too big and it will
-		 * be corrected next time a sample is taken, so it shouldn't be
-		 * significant.
-		 */
-		gov_update_sample_delay(policy_dbs, new_rate);
-		mutex_unlock(&policy_dbs->timer_mutex);
-	}
-}
-
 static bool invalid_up_threshold(struct dbs_data *dbs_data,
 				 unsigned int threshold)
 {
@@ -425,7 +388,6 @@ static struct dbs_governor od_dbs_gov = {
 	.get_cpu_dbs_info_s = get_cpu_dbs_info_s,
 	.gov_dbs_timer = od_dbs_timer,
 	.gov_check_cpu = od_check_cpu,
-	.update_sampling_rate = update_sampling_rate,
 	.invalid_up_threshold = invalid_up_threshold,
 	.invalid_sampling_down_factor = invalid_sampling_down_factor,
 	.update_sampling_down_factor = update_sampling_down_factor,
-- 
2.7.1.370.gb2aa7f8

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (12 preceding siblings ...)
  2016-02-08 11:39 ` [PATCH V3 13/13] cpufreq: conservative: Update sample_delay_ns immediately Viresh Kumar
@ 2016-02-08 12:51 ` Shilpasri G Bhat
  2016-02-08 12:54   ` Viresh Kumar
  2016-02-08 21:43 ` Rafael J. Wysocki
  14 siblings, 1 reply; 38+ messages in thread
From: Shilpasri G Bhat @ 2016-02-08 12:51 UTC (permalink / raw)
  To: Viresh Kumar, Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, linux-kernel

Hi,

On 02/08/2016 05:09 PM, Viresh Kumar wrote:
> Hi Rafael,
> 
> Things look much much better now. I have rebased this series over
> pm/bleeding-edge, that has all your patches.
> 
> I have moved ahead and done few more changes in this series, that should
> get rid of all the lockdeps we were getting earlier, that includes
> fixing lockdep issue in update_sampling_rate() that we were chasing.
> 
> These are pushed here again:
> git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git cpufreq/governor-kobject
> 
> @Juri/Shilpa: Can you please confirm if all the issues got resolved now
> ?

I have tested this patchset on Power8 box and did not find any lockdep warnings.

Thanks and Regards,
Shilpa

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups
  2016-02-08 12:51 ` [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Shilpasri G Bhat
@ 2016-02-08 12:54   ` Viresh Kumar
  2016-02-08 16:39     ` Juri Lelli
  0 siblings, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 12:54 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: Rafael Wysocki, juri.lelli, linaro-kernel, linux-pm, skannan,
	peterz, mturquette, steve.muckle, vincent.guittot,
	morten.rasmussen, dietmar.eggemann, linux-kernel

On 08-02-16, 18:21, Shilpasri G Bhat wrote:
> Hi,
> 
> On 02/08/2016 05:09 PM, Viresh Kumar wrote:
> > Hi Rafael,
> > 
> > Things look much much better now. I have rebased this series over
> > pm/bleeding-edge, that has all your patches.
> > 
> > I have moved ahead and done few more changes in this series, that should
> > get rid of all the lockdeps we were getting earlier, that includes
> > fixing lockdep issue in update_sampling_rate() that we were chasing.
> > 
> > These are pushed here again:
> > git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git cpufreq/governor-kobject
> > 
> > @Juri/Shilpa: Can you please confirm if all the issues got resolved now
> > ?
> 
> I have tested this patchset on Power8 box and did not find any lockdep warnings.

Awesome news Shilpa. Thanks a lot for helping us out :)

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 09/13] cpufreq: governor: Move common sysfs tunables to cpufreq_governor.c
  2016-02-08 11:39 ` [PATCH V3 09/13] cpufreq: governor: Move common sysfs tunables to cpufreq_governor.c Viresh Kumar
@ 2016-02-08 12:58   ` Rafael J. Wysocki
  2016-02-08 13:03     ` Viresh Kumar
  0 siblings, 1 reply; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 12:58 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> We have got five common sysfs tunables between ondemand and conservative
> governors, move their callbacks to cpufreq_governor.c to get rid of
> redundant code.
>
> Because of minor differences in the implementation of the callbacks,
> some more per-governor callbacks are introduced in order to not
> introduce any more "governor == ONDEMAND/CONSERVATIVE" like checks.

My most fundamental concern here is that attributes that don't apply
to a particular governor should not appear in sysfs at all when that
governor is in use (instead of appearing and always returning -EINVAL
which is sort of silly).

That doesn't mean the common code cannot access them, though.  They
still can be present in the data structure, but it may be a good idea
to set them to special values clearly meaning "invalid" then.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 09/13] cpufreq: governor: Move common sysfs tunables to cpufreq_governor.c
  2016-02-08 12:58   ` Rafael J. Wysocki
@ 2016-02-08 13:03     ` Viresh Kumar
  2016-02-08 13:24       ` Rafael J. Wysocki
  0 siblings, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 13:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On 08-02-16, 13:58, Rafael J. Wysocki wrote:
> My most fundamental concern here is that attributes that don't apply
> to a particular governor should not appear in sysfs at all when that
> governor is in use (instead of appearing and always returning -EINVAL

s/is in use/is not in use/ ??

> which is sort of silly).

But who said that I have made them available always ? Sorry, I didn't
understood your input.

I have just moved the show/store callbacks and the struct
governor_attr definition to cpufreq_governor.c. And sysfs files are
created only for the ones that are valid for a governor.

> That doesn't mean the common code cannot access them, though.  They
> still can be present in the data structure, but it may be a good idea
> to set them to special values clearly meaning "invalid" then.

Or are you saying that we should move all the tunables to dbs_data ?

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 11/13] cpufreq: governor: Keep list of policy_dbs within dbs_data
  2016-02-08 11:39 ` [PATCH V3 11/13] cpufreq: governor: Keep list of policy_dbs within dbs_data Viresh Kumar
@ 2016-02-08 13:21   ` Rafael J. Wysocki
  2016-02-08 13:30     ` Viresh Kumar
  0 siblings, 1 reply; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 13:21 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> An instance of 'struct dbs_data' can support multiple 'struct
> policy_dbs_info' instances. To traverse all policy_dbs supported by a
> dbs_data, create a list of policy_dbs within dbs_data.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Good idea overall, I like this.

> ---
>  drivers/cpufreq/cpufreq_governor.c | 12 ++++++++++++
>  drivers/cpufreq/cpufreq_governor.h |  7 ++++++-
>  2 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index ee3c2d92da53..e267acc67067 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -489,6 +489,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
>                 dbs_data->usage_count++;
>                 policy_dbs->dbs_data = dbs_data;
>                 policy->governor_data = policy_dbs;
> +
> +               mutex_lock(&dbs_data->mutex);
> +               list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);
> +               mutex_unlock(&dbs_data->mutex);

The previous statements should be under the mutex too IMO, at least
the usage count incrementation in case two instances of this happen at
the same time.

That can't happen now, but if we want to get rid of dbs_data_mutex
going forward, having it under the mutex will be actually useful.

> +
>                 return 0;
>         }
>
> @@ -500,8 +505,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
>
>         dbs_data->usage_count = 1;
>         dbs_data->gov = gov;
> +       INIT_LIST_HEAD(&dbs_data->policy_dbs_list);
>         mutex_init(&dbs_data->mutex);
>
> +       list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);

That line should go to where policy_dbs->dbs_data is set so it is
clear that they go together.  And I'd set the usage count to 1 in
there too for consistency.

> +
>         ret = gov->init(dbs_data, !policy->governor->initialized);
>         if (ret)
>                 goto free_policy_dbs_info;
> @@ -554,6 +562,10 @@ static int cpufreq_governor_exit(struct cpufreq_policy *policy)
>         struct policy_dbs_info *policy_dbs = policy->governor_data;
>         struct dbs_data *dbs_data = policy_dbs->dbs_data;
>
> +       mutex_lock(&dbs_data->mutex);
> +       list_del(&policy_dbs->list);
> +       mutex_unlock(&dbs_data->mutex);
> +

Same here.  I'd do the decrementation of the counter under the mutex
along with the removal from the list.  They are related in any case
("one user goes away") and will be useful for dbs_data_mutex
elimination.

>         if (!--dbs_data->usage_count) {
>                 kobject_put(&dbs_data->kobj);
>
> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> index ced34ba5a18d..b740633c2fbe 100644
> --- a/drivers/cpufreq/cpufreq_governor.h
> +++ b/drivers/cpufreq/cpufreq_governor.h
> @@ -74,7 +74,11 @@ struct dbs_data {
>         unsigned int up_threshold;
>
>         struct kobject kobj;
> -       /* Protect concurrent updates to governor tunables from sysfs */
> +       struct list_head policy_dbs_list;
> +       /*
> +        * Protect concurrent updates to governor tunables from sysfs and
> +        * policy_dbs_list.
> +        */

And if the counter is modified under the mutex, it should be mentioned
in this comment.

Maybe keep the counter close to the list and the mutex in the
structure definition too?

>         struct mutex mutex;
>  };
>
> @@ -132,6 +136,7 @@ struct policy_dbs_info {
>         struct work_struct work;
>         /* dbs_data may be shared between multiple policy objects */
>         struct dbs_data *dbs_data;
> +       struct list_head list;
>  };
>
>  static inline void gov_update_sample_delay(struct policy_dbs_info *policy_dbs,
> --

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 09/13] cpufreq: governor: Move common sysfs tunables to cpufreq_governor.c
  2016-02-08 13:03     ` Viresh Kumar
@ 2016-02-08 13:24       ` Rafael J. Wysocki
  0 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 13:24 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael Wysocki, Juri Lelli,
	Lists linaro-kernel, linux-pm, Saravana Kannan, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann, Shilpasri G Bhat,
	Linux Kernel Mailing List

On Mon, Feb 8, 2016 at 2:03 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 08-02-16, 13:58, Rafael J. Wysocki wrote:
>> My most fundamental concern here is that attributes that don't apply
>> to a particular governor should not appear in sysfs at all when that
>> governor is in use (instead of appearing and always returning -EINVAL
>
> s/is in use/is not in use/ ??
>
>> which is sort of silly).
>
> But who said that I have made them available always ? Sorry, I didn't
> understood your input.
>
> I have just moved the show/store callbacks and the struct
> governor_attr definition to cpufreq_governor.c. And sysfs files are
> created only for the ones that are valid for a governor.

OK, I need to look at it more carefully then.

>> That doesn't mean the common code cannot access them, though.  They
>> still can be present in the data structure, but it may be a good idea
>> to set them to special values clearly meaning "invalid" then.
>
> Or are you saying that we should move all the tunables to dbs_data ?

Well, maybe.  I'm not sure, but that may be done later in any case.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 11/13] cpufreq: governor: Keep list of policy_dbs within dbs_data
  2016-02-08 13:21   ` Rafael J. Wysocki
@ 2016-02-08 13:30     ` Viresh Kumar
  2016-02-08 13:35       ` Rafael J. Wysocki
  0 siblings, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 13:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On 08-02-16, 14:21, Rafael J. Wysocki wrote:
> On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > An instance of 'struct dbs_data' can support multiple 'struct
> > policy_dbs_info' instances. To traverse all policy_dbs supported by a
> > dbs_data, create a list of policy_dbs within dbs_data.
> >
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> 
> Good idea overall, I like this.

Thanks.

> > ---
> >  drivers/cpufreq/cpufreq_governor.c | 12 ++++++++++++
> >  drivers/cpufreq/cpufreq_governor.h |  7 ++++++-
> >  2 files changed, 18 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> > index ee3c2d92da53..e267acc67067 100644
> > --- a/drivers/cpufreq/cpufreq_governor.c
> > +++ b/drivers/cpufreq/cpufreq_governor.c
> > @@ -489,6 +489,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
> >                 dbs_data->usage_count++;
> >                 policy_dbs->dbs_data = dbs_data;
> >                 policy->governor_data = policy_dbs;
> > +
> > +               mutex_lock(&dbs_data->mutex);
> > +               list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);
> > +               mutex_unlock(&dbs_data->mutex);
> 
> The previous statements should be under the mutex too IMO, at least
> the usage count incrementation in case two instances of this happen at
> the same time.
> 
> That can't happen now, but if we want to get rid of dbs_data_mutex
> going forward, having it under the mutex will be actually useful.

I think we should keep it precise for now. Right now, we are only
concerned about the list modification, so just lock around that.

Once we are going to remove dbs_data_mutex, then we can cover more
things under it.

Is there anything that is broken right now ?

> > +
> >                 return 0;
> >         }
> >
> > @@ -500,8 +505,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
> >
> >         dbs_data->usage_count = 1;
> >         dbs_data->gov = gov;
> > +       INIT_LIST_HEAD(&dbs_data->policy_dbs_list);
> >         mutex_init(&dbs_data->mutex);
> >
> > +       list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);
> 
> That line should go to where policy_dbs->dbs_data is set so it is
> clear that they go together.

Okay.

> And I'd set the usage count to 1 in
> there too for consistency.

I am not sure about including any updates within the lock, which don't
need protection in current state of code.

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate()
  2016-02-08 11:39 ` [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate() Viresh Kumar
@ 2016-02-08 13:32   ` Rafael J. Wysocki
  2016-02-08 13:34       ` Viresh Kumar
  2016-02-08 17:20     ` Viresh Kumar
  0 siblings, 2 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 13:32 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Now that we maintain a list of all 'struct policy_dbs_info' for an
> instance of 'struct dbs_data', we can traverse that instead of
> traversing the loop for all online CPUs.
>
> This also solves the circular dependency lockdep reported by Juri
> (and verified by Shilpa) earlier:
>
>  ======================================================
>  [ INFO: possible circular locking dependency detected ]
>  4.4.0+ #445 Not tainted
>  -------------------------------------------------------
>  trace.sh/1723 is trying to acquire lock:
>   (s_active#48){++++.+}, at: [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
>
>  but task is already holding lock:
>   (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
>
>  which lock already depends on the new lock.
>
>  the existing dependency chain (in reverse order) is:
>
> -> #2 (od_dbs_cdata.mutex){+.+.+.}:
>         [<c075b040>] mutex_lock_nested+0x7c/0x434
>         [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
>         [<c0017c10>] return_to_handler+0x0/0x18
>
> -> #1 (&policy->rwsem){+++++.}:
>         [<c075ca8c>] down_read+0x58/0x94
>         [<c057c244>] show+0x30/0x60
>         [<c01f934c>] sysfs_kf_seq_show+0x90/0xfc
>         [<c01f7ad8>] kernfs_seq_show+0x34/0x38
>         [<c01a22ec>] seq_read+0x1e4/0x4e4
>         [<c01f8694>] kernfs_fop_read+0x120/0x1a0
>         [<c01794b4>] __vfs_read+0x3c/0xe0
>         [<c017a378>] vfs_read+0x98/0x104
>         [<c017a434>] SyS_read+0x50/0x90
>         [<c000fd40>] ret_fast_syscall+0x0/0x1c
>
> -> #0 (s_active#48){++++.+}:
>         [<c008238c>] lock_acquire+0xd4/0x20c
>         [<c01f6ae4>] __kernfs_remove+0x288/0x328
>         [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
>         [<c01fa024>] remove_files+0x44/0x88
>         [<c01fa5a4>] sysfs_remove_group+0x50/0xa4
>         [<c058285c>] cpufreq_governor_dbs+0x3f0/0x5d4
>         [<c0017c10>] return_to_handler+0x0/0x18
>
>  other info that might help us debug this:
>
>  Chain exists of:
>   s_active#48 --> &policy->rwsem --> od_dbs_cdata.mutex
>
>   Possible unsafe locking scenario:
>
>         CPU0                    CPU1
>         ----                    ----
>    lock(od_dbs_cdata.mutex);
>                                 lock(&policy->rwsem);
>                                 lock(od_dbs_cdata.mutex);
>    lock(s_active#48);
>
>   *** DEADLOCK ***
>
>  5 locks held by trace.sh/1723:
>   #0:  (sb_writers#6){.+.+.+}, at: [<c017beb8>] __sb_start_write+0xb4/0xc0
>   #1:  (&of->mutex){+.+.+.}, at: [<c01f8418>] kernfs_fop_write+0x6c/0x1c8
>   #2:  (s_active#35){.+.+.+}, at: [<c01f8420>] kernfs_fop_write+0x74/0x1c8
>   #3:  (cpu_hotplug.lock){++++++}, at: [<c0029e6c>] get_online_cpus+0x48/0xb8
>   #4:  (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
>
>  stack backtrace:
>  CPU: 2 PID: 1723 Comm: trace.sh Not tainted 4.4.0+ #445
>  Hardware name: ARM-Versatile Express
>  [<c001883c>] (unwind_backtrace) from [<c0013f50>] (show_stack+0x20/0x24)
>  [<c0013f50>] (show_stack) from [<c044ad90>] (dump_stack+0x80/0xb4)
>  [<c044ad90>] (dump_stack) from [<c0128edc>] (print_circular_bug+0x29c/0x2f0)
>  [<c0128edc>] (print_circular_bug) from [<c0081708>] (__lock_acquire+0x163c/0x1d74)
>  [<c0081708>] (__lock_acquire) from [<c008238c>] (lock_acquire+0xd4/0x20c)
>  [<c008238c>] (lock_acquire) from [<c01f6ae4>] (__kernfs_remove+0x288/0x328)
>  [<c01f6ae4>] (__kernfs_remove) from [<c01f78c8>] (kernfs_remove_by_name_ns+0x4c/0x94)
>  [<c01f78c8>] (kernfs_remove_by_name_ns) from [<c01fa024>] (remove_files+0x44/0x88)
>  [<c01fa024>] (remove_files) from [<c01fa5a4>] (sysfs_remove_group+0x50/0xa4)
>  [<c01fa5a4>] (sysfs_remove_group) from [<c058285c>] (cpufreq_governor_dbs+0x3f0/0x5d4)
>  [<c058285c>] (cpufreq_governor_dbs) from [<c0017c10>] (return_to_handler+0x0/0x18)
>
> Reported-by: Juri Lelli <juri.lelli@arm.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/cpufreq_ondemand.c | 89 ++++++++++----------------------------
>  1 file changed, 22 insertions(+), 67 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> index 745290d7f6a2..f72087bc8302 100644
> --- a/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/drivers/cpufreq/cpufreq_ondemand.c
> @@ -224,83 +224,38 @@ static struct dbs_governor od_dbs_gov;
>  /**
>   * update_sampling_rate - update sampling rate effective immediately if needed.
>   * @new_rate: new sampling rate
> - *
> - * If new rate is smaller than the old, simply updating
> - * dbs_tuners_int.sampling_rate might not be appropriate. For example, if the
> - * original sampling_rate was 1 second and the requested new sampling rate is 10
> - * ms because the user needs immediate reaction from ondemand governor, but not
> - * sure if higher frequency will be required or not, then, the governor may
> - * change the sampling rate too late; up to 1 second later. Thus, if we are
> - * reducing the sampling rate, we need to make the new value effective
> - * immediately.

The comment still applies.

Moreover, please extend it to say that this must be called with
dbs_data->mutex held (or it looks racy otherwise).

Thanks,
Rafael


On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Now that we maintain a list of all 'struct policy_dbs_info' for an
> instance of 'struct dbs_data', we can traverse that instead of
> traversing the loop for all online CPUs.
>
> This also solves the circular dependency lockdep reported by Juri
> (and verified by Shilpa) earlier:
>
>  ======================================================
>  [ INFO: possible circular locking dependency detected ]
>  4.4.0+ #445 Not tainted
>  -------------------------------------------------------
>  trace.sh/1723 is trying to acquire lock:
>   (s_active#48){++++.+}, at: [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
>
>  but task is already holding lock:
>   (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
>
>  which lock already depends on the new lock.
>
>  the existing dependency chain (in reverse order) is:
>
> -> #2 (od_dbs_cdata.mutex){+.+.+.}:
>         [<c075b040>] mutex_lock_nested+0x7c/0x434
>         [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
>         [<c0017c10>] return_to_handler+0x0/0x18
>
> -> #1 (&policy->rwsem){+++++.}:
>         [<c075ca8c>] down_read+0x58/0x94
>         [<c057c244>] show+0x30/0x60
>         [<c01f934c>] sysfs_kf_seq_show+0x90/0xfc
>         [<c01f7ad8>] kernfs_seq_show+0x34/0x38
>         [<c01a22ec>] seq_read+0x1e4/0x4e4
>         [<c01f8694>] kernfs_fop_read+0x120/0x1a0
>         [<c01794b4>] __vfs_read+0x3c/0xe0
>         [<c017a378>] vfs_read+0x98/0x104
>         [<c017a434>] SyS_read+0x50/0x90
>         [<c000fd40>] ret_fast_syscall+0x0/0x1c
>
> -> #0 (s_active#48){++++.+}:
>         [<c008238c>] lock_acquire+0xd4/0x20c
>         [<c01f6ae4>] __kernfs_remove+0x288/0x328
>         [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
>         [<c01fa024>] remove_files+0x44/0x88
>         [<c01fa5a4>] sysfs_remove_group+0x50/0xa4
>         [<c058285c>] cpufreq_governor_dbs+0x3f0/0x5d4
>         [<c0017c10>] return_to_handler+0x0/0x18
>
>  other info that might help us debug this:
>
>  Chain exists of:
>   s_active#48 --> &policy->rwsem --> od_dbs_cdata.mutex
>
>   Possible unsafe locking scenario:
>
>         CPU0                    CPU1
>         ----                    ----
>    lock(od_dbs_cdata.mutex);
>                                 lock(&policy->rwsem);
>                                 lock(od_dbs_cdata.mutex);
>    lock(s_active#48);
>
>   *** DEADLOCK ***
>
>  5 locks held by trace.sh/1723:
>   #0:  (sb_writers#6){.+.+.+}, at: [<c017beb8>] __sb_start_write+0xb4/0xc0
>   #1:  (&of->mutex){+.+.+.}, at: [<c01f8418>] kernfs_fop_write+0x6c/0x1c8
>   #2:  (s_active#35){.+.+.+}, at: [<c01f8420>] kernfs_fop_write+0x74/0x1c8
>   #3:  (cpu_hotplug.lock){++++++}, at: [<c0029e6c>] get_online_cpus+0x48/0xb8
>   #4:  (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
>
>  stack backtrace:
>  CPU: 2 PID: 1723 Comm: trace.sh Not tainted 4.4.0+ #445
>  Hardware name: ARM-Versatile Express
>  [<c001883c>] (unwind_backtrace) from [<c0013f50>] (show_stack+0x20/0x24)
>  [<c0013f50>] (show_stack) from [<c044ad90>] (dump_stack+0x80/0xb4)
>  [<c044ad90>] (dump_stack) from [<c0128edc>] (print_circular_bug+0x29c/0x2f0)
>  [<c0128edc>] (print_circular_bug) from [<c0081708>] (__lock_acquire+0x163c/0x1d74)
>  [<c0081708>] (__lock_acquire) from [<c008238c>] (lock_acquire+0xd4/0x20c)
>  [<c008238c>] (lock_acquire) from [<c01f6ae4>] (__kernfs_remove+0x288/0x328)
>  [<c01f6ae4>] (__kernfs_remove) from [<c01f78c8>] (kernfs_remove_by_name_ns+0x4c/0x94)
>  [<c01f78c8>] (kernfs_remove_by_name_ns) from [<c01fa024>] (remove_files+0x44/0x88)
>  [<c01fa024>] (remove_files) from [<c01fa5a4>] (sysfs_remove_group+0x50/0xa4)
>  [<c01fa5a4>] (sysfs_remove_group) from [<c058285c>] (cpufreq_governor_dbs+0x3f0/0x5d4)
>  [<c058285c>] (cpufreq_governor_dbs) from [<c0017c10>] (return_to_handler+0x0/0x18)
>
> Reported-by: Juri Lelli <juri.lelli@arm.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/cpufreq_ondemand.c | 89 ++++++++++----------------------------
>  1 file changed, 22 insertions(+), 67 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> index 745290d7f6a2..f72087bc8302 100644
> --- a/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/drivers/cpufreq/cpufreq_ondemand.c
> @@ -224,83 +224,38 @@ static struct dbs_governor od_dbs_gov;
>  /**
>   * update_sampling_rate - update sampling rate effective immediately if needed.
>   * @new_rate: new sampling rate
> - *
> - * If new rate is smaller than the old, simply updating
> - * dbs_tuners_int.sampling_rate might not be appropriate. For example, if the
> - * original sampling_rate was 1 second and the requested new sampling rate is 10
> - * ms because the user needs immediate reaction from ondemand governor, but not
> - * sure if higher frequency will be required or not, then, the governor may
> - * change the sampling rate too late; up to 1 second later. Thus, if we are
> - * reducing the sampling rate, we need to make the new value effective
> - * immediately.
>   */
>  static void update_sampling_rate(struct dbs_data *dbs_data)
>  {
> -       struct cpumask cpumask;
> +       struct policy_dbs_info *policy_dbs;
>         unsigned int new_rate = dbs_data->sampling_rate;
> -       int cpu;
>
>         /*
> -        * Lock governor so that governor start/stop can't execute in parallel.
> +        * We are operating under dbs_data->mutex and so the list and its
> +        * entries can't be freed concurrently.
>          */
> -       mutex_lock(&dbs_data_mutex);
> -
> -       cpumask_copy(&cpumask, cpu_online_mask);
> -
> -       for_each_cpu(cpu, &cpumask) {
> -               struct cpufreq_policy *policy;
> -               struct od_cpu_dbs_info_s *dbs_info;
> -               struct cpu_dbs_info *cdbs;
> -               struct policy_dbs_info *policy_dbs;
> -
> -               dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
> -               cdbs = &dbs_info->cdbs;
> -               policy_dbs = cdbs->policy_dbs;
> -
> +       list_for_each_entry(policy_dbs, &dbs_data->policy_dbs_list, list) {
> +               mutex_lock(&policy_dbs->timer_mutex);
>                 /*
> -                * A valid policy_dbs means governor hasn't stopped or exited
> -                * yet.
> +                * On 32-bit architectures this may race with the
> +                * sample_delay_ns read in dbs_update_util_handler(), but that
> +                * really doesn't matter.  If the read returns a value that's
> +                * too big, the sample will be skipped, but the next invocation
> +                * of dbs_update_util_handler() (when the update has been
> +                * completed) will take a sample.  If the returned value is too
> +                * small, the sample will be taken immediately, but that isn't a
> +                * problem, as we want the new rate to take effect immediately
> +                * anyway.
> +                *
> +                * If this runs in parallel with dbs_work_handler(), we may end
> +                * up overwriting the sample_delay_ns value that it has just
> +                * written, but the difference should not be too big and it will
> +                * be corrected next time a sample is taken, so it shouldn't be
> +                * significant.
>                  */
> -               if (!policy_dbs)
> -                       continue;
> -
> -               policy = policy_dbs->policy;
> -
> -               /* clear all CPUs of this policy */
> -               cpumask_andnot(&cpumask, &cpumask, policy->cpus);
> -
> -               /*
> -                * Update sampling rate for CPUs whose policy is governed by
> -                * dbs_data. In case of governor_per_policy, only a single
> -                * policy will be governed by dbs_data, otherwise there can be
> -                * multiple policies that are governed by the same dbs_data.
> -                */
> -               if (dbs_data == policy_dbs->dbs_data) {
> -                       mutex_lock(&policy_dbs->timer_mutex);
> -                       /*
> -                        * On 32-bit architectures this may race with the
> -                        * sample_delay_ns read in dbs_update_util_handler(),
> -                        * but that really doesn't matter.  If the read returns
> -                        * a value that's too big, the sample will be skipped,
> -                        * but the next invocation of dbs_update_util_handler()
> -                        * (when the update has been completed) will take a
> -                        * sample.  If the returned value is too small, the
> -                        * sample will be taken immediately, but that isn't a
> -                        * problem, as we want the new rate to take effect
> -                        * immediately anyway.
> -                        *
> -                        * If this runs in parallel with dbs_work_handler(), we
> -                        * may end up overwriting the sample_delay_ns value that
> -                        * it has just written, but the difference should not be
> -                        * too big and it will be corrected next time a sample
> -                        * is taken, so it shouldn't be significant.
> -                        */
> -                       gov_update_sample_delay(policy_dbs, new_rate);
> -                       mutex_unlock(&policy_dbs->timer_mutex);
> -               }
> +               gov_update_sample_delay(policy_dbs, new_rate);
> +               mutex_unlock(&policy_dbs->timer_mutex);
>         }
> -
> -       mutex_unlock(&dbs_data_mutex);
>  }
>
>  static bool invalid_up_threshold(struct dbs_data *dbs_data,
> --
> 2.7.1.370.gb2aa7f8
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate()
  2016-02-08 13:32   ` Rafael J. Wysocki
@ 2016-02-08 13:34       ` Viresh Kumar
  2016-02-08 17:20     ` Viresh Kumar
  1 sibling, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 13:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On 08-02-16, 14:32, Rafael J. Wysocki wrote:
> > - * If new rate is smaller than the old, simply updating
> > - * dbs_tuners_int.sampling_rate might not be appropriate. For example, if the
> > - * original sampling_rate was 1 second and the requested new sampling rate is 10
> > - * ms because the user needs immediate reaction from ondemand governor, but not
> > - * sure if higher frequency will be required or not, then, the governor may
> > - * change the sampling rate too late; up to 1 second later. Thus, if we are
> > - * reducing the sampling rate, we need to make the new value effective
> > - * immediately.
> 
> The comment still applies.

Why? It talks about the case where we have reduced sampling rate, but
that's not the case anymore. We *always* update sample_delay_ns now.

> Moreover, please extend it to say that this must be called with
> dbs_data->mutex held (or it looks racy otherwise).

Yeah, that can be done.

--
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate()
@ 2016-02-08 13:34       ` Viresh Kumar
  0 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 13:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On 08-02-16, 14:32, Rafael J. Wysocki wrote:
> > - * If new rate is smaller than the old, simply updating
> > - * dbs_tuners_int.sampling_rate might not be appropriate. For example, if the
> > - * original sampling_rate was 1 second and the requested new sampling rate is 10
> > - * ms because the user needs immediate reaction from ondemand governor, but not
> > - * sure if higher frequency will be required or not, then, the governor may
> > - * change the sampling rate too late; up to 1 second later. Thus, if we are
> > - * reducing the sampling rate, we need to make the new value effective
> > - * immediately.
> 
> The comment still applies.

Why? It talks about the case where we have reduced sampling rate, but
that's not the case anymore. We *always* update sample_delay_ns now.

> Moreover, please extend it to say that this must be called with
> dbs_data->mutex held (or it looks racy otherwise).

Yeah, that can be done.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 11/13] cpufreq: governor: Keep list of policy_dbs within dbs_data
  2016-02-08 13:30     ` Viresh Kumar
@ 2016-02-08 13:35       ` Rafael J. Wysocki
  0 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 13:35 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael Wysocki, Juri Lelli,
	Lists linaro-kernel, linux-pm, Saravana Kannan, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann, Shilpasri G Bhat,
	Linux Kernel Mailing List

On Mon, Feb 8, 2016 at 2:30 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 08-02-16, 14:21, Rafael J. Wysocki wrote:
>> On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> > An instance of 'struct dbs_data' can support multiple 'struct
>> > policy_dbs_info' instances. To traverse all policy_dbs supported by a
>> > dbs_data, create a list of policy_dbs within dbs_data.
>> >
>> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
>>
>> Good idea overall, I like this.
>
> Thanks.
>
>> > ---
>> >  drivers/cpufreq/cpufreq_governor.c | 12 ++++++++++++
>> >  drivers/cpufreq/cpufreq_governor.h |  7 ++++++-
>> >  2 files changed, 18 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> > index ee3c2d92da53..e267acc67067 100644
>> > --- a/drivers/cpufreq/cpufreq_governor.c
>> > +++ b/drivers/cpufreq/cpufreq_governor.c
>> > @@ -489,6 +489,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
>> >                 dbs_data->usage_count++;
>> >                 policy_dbs->dbs_data = dbs_data;
>> >                 policy->governor_data = policy_dbs;
>> > +
>> > +               mutex_lock(&dbs_data->mutex);
>> > +               list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);
>> > +               mutex_unlock(&dbs_data->mutex);
>>
>> The previous statements should be under the mutex too IMO, at least
>> the usage count incrementation in case two instances of this happen at
>> the same time.
>>
>> That can't happen now, but if we want to get rid of dbs_data_mutex
>> going forward, having it under the mutex will be actually useful.
>
> I think we should keep it precise for now. Right now, we are only
> concerned about the list modification, so just lock around that.
>
> Once we are going to remove dbs_data_mutex, then we can cover more
> things under it.
>
> Is there anything that is broken right now ?

Yes, the logic.

The counter technically is the number of items in policy_dbs_list.
Updating the list alone under the lock is simply illogical.

>> > +
>> >                 return 0;
>> >         }
>> >
>> > @@ -500,8 +505,11 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
>> >
>> >         dbs_data->usage_count = 1;
>> >         dbs_data->gov = gov;
>> > +       INIT_LIST_HEAD(&dbs_data->policy_dbs_list);
>> >         mutex_init(&dbs_data->mutex);
>> >
>> > +       list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);
>>
>> That line should go to where policy_dbs->dbs_data is set so it is
>> clear that they go together.
>
> Okay.
>
>> And I'd set the usage count to 1 in
>> there too for consistency.
>
> I am not sure about including any updates within the lock, which don't
> need protection in current state of code.

Well, if you're not sure, then please simply follow my request. :-)

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate()
  2016-02-08 13:34       ` Viresh Kumar
  (?)
@ 2016-02-08 13:37       ` Rafael J. Wysocki
  -1 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 13:37 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael Wysocki, Juri Lelli,
	Lists linaro-kernel, linux-pm, Saravana Kannan, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann, Shilpasri G Bhat,
	Linux Kernel Mailing List

On Mon, Feb 8, 2016 at 2:34 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 08-02-16, 14:32, Rafael J. Wysocki wrote:
>> > - * If new rate is smaller than the old, simply updating
>> > - * dbs_tuners_int.sampling_rate might not be appropriate. For example, if the
>> > - * original sampling_rate was 1 second and the requested new sampling rate is 10
>> > - * ms because the user needs immediate reaction from ondemand governor, but not
>> > - * sure if higher frequency will be required or not, then, the governor may
>> > - * change the sampling rate too late; up to 1 second later. Thus, if we are
>> > - * reducing the sampling rate, we need to make the new value effective
>> > - * immediately.
>>
>> The comment still applies.
>
> Why? It talks about the case where we have reduced sampling rate, but
> that's not the case anymore. We *always* update sample_delay_ns now.

But the comment explains *why* we do that, doesn't it?

If it doesn't apply, then why do we need this function at all?

>> Moreover, please extend it to say that this must be called with
>> dbs_data->mutex held (or it looks racy otherwise).
>
> Yeah, that can be done.


OK

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 01/13] cpufreq: governor: Create generic macro for global tuners
  2016-02-08 11:39 ` [PATCH V3 01/13] cpufreq: governor: Create generic macro for global tuners Viresh Kumar
@ 2016-02-08 16:33   ` Rafael J. Wysocki
  0 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 16:33 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Some tunables are present in governor specific structures, whereas one
> (min_sampling_rate) is present in global 'struct dbs_data'.

To me, the word "global" is not the best one to express that.
"Common" would probably be better.  Or "generic" or similar.

It's just that these tunables apply to both governors (or "all"
governors if you will) as opposed to some other ones specific to a
particular governor type.

So I'd suggest replacing "global" with "common" in the patch in the
names of the macros too.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups
  2016-02-08 12:54   ` Viresh Kumar
@ 2016-02-08 16:39     ` Juri Lelli
  2016-02-08 16:55       ` Viresh Kumar
  0 siblings, 1 reply; 38+ messages in thread
From: Juri Lelli @ 2016-02-08 16:39 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Shilpasri G Bhat, Rafael Wysocki, linaro-kernel, linux-pm,
	skannan, peterz, mturquette, steve.muckle, vincent.guittot,
	morten.rasmussen, dietmar.eggemann, linux-kernel

Hi,

On 08/02/16 18:24, Viresh Kumar wrote:
> On 08-02-16, 18:21, Shilpasri G Bhat wrote:
> > Hi,
> > 
> > On 02/08/2016 05:09 PM, Viresh Kumar wrote:
> > > Hi Rafael,
> > > 
> > > Things look much much better now. I have rebased this series over
> > > pm/bleeding-edge, that has all your patches.
> > > 
> > > I have moved ahead and done few more changes in this series, that should
> > > get rid of all the lockdeps we were getting earlier, that includes
> > > fixing lockdep issue in update_sampling_rate() that we were chasing.
> > > 
> > > These are pushed here again:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git cpufreq/governor-kobject
> > > 
> > > @Juri/Shilpa: Can you please confirm if all the issues got resolved now
> > > ?
> > 
> > I have tested this patchset on Power8 box and did not find any lockdep warnings.
> 

Everything looks good on TC2/Juno as well :).

Best,

- Juri

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups
  2016-02-08 16:39     ` Juri Lelli
@ 2016-02-08 16:55       ` Viresh Kumar
  0 siblings, 0 replies; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 16:55 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Shilpasri G Bhat, Rafael Wysocki, linaro-kernel, linux-pm,
	skannan, peterz, mturquette, steve.muckle, vincent.guittot,
	morten.rasmussen, dietmar.eggemann, linux-kernel

On 08-02-16, 16:39, Juri Lelli wrote:
> Hi,
> 
> On 08/02/16 18:24, Viresh Kumar wrote:
> > On 08-02-16, 18:21, Shilpasri G Bhat wrote:
> > > Hi,
> > > 
> > > On 02/08/2016 05:09 PM, Viresh Kumar wrote:
> > > > Hi Rafael,
> > > > 
> > > > Things look much much better now. I have rebased this series over
> > > > pm/bleeding-edge, that has all your patches.
> > > > 
> > > > I have moved ahead and done few more changes in this series, that should
> > > > get rid of all the lockdeps we were getting earlier, that includes
> > > > fixing lockdep issue in update_sampling_rate() that we were chasing.
> > > > 
> > > > These are pushed here again:
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git cpufreq/governor-kobject
> > > > 
> > > > @Juri/Shilpa: Can you please confirm if all the issues got resolved now
> > > > ?
> > > 
> > > I have tested this patchset on Power8 box and did not find any lockdep warnings.
> > 
> 
> Everything looks good on TC2/Juno as well :).

Yeah, we finally solved the puzzle :)

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 03/13] cpufreq: governor: New sysfs show/store callbacks for governor tunables
  2016-02-08 11:39 ` [PATCH V3 03/13] cpufreq: governor: New sysfs show/store callbacks for governor tunables Viresh Kumar
@ 2016-02-08 17:07   ` Viresh Kumar
  2016-02-08 21:28     ` Rafael J. Wysocki
  2016-02-08 21:36   ` Rafael J. Wysocki
  1 sibling, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 17:07 UTC (permalink / raw)
  To: Rafael Wysocki, juri.lelli
  Cc: linaro-kernel, linux-pm, skannan, peterz, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann, shilpa.bhat, linux-kernel

On 08-02-16, 17:09, Viresh Kumar wrote:
> +gov_show_one(sampling_rate);
> +gov_show_one(sampling_down_factor);
> +gov_show_one(up_threshold);
> +gov_show_one(ignore_nice_load);
> +gov_show_one(min_sampling_rate);
> +gov_show_one_tunable(cs, down_threshold);
> +gov_show_one_tunable(cs, freq_step);

Based on the review comments on 1/13, I will do:
- s/gov_show_one/gov_show_one_common
- s/gov_show_one_tunable/gov_show_one

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate()
  2016-02-08 13:32   ` Rafael J. Wysocki
  2016-02-08 13:34       ` Viresh Kumar
@ 2016-02-08 17:20     ` Viresh Kumar
  2016-02-08 22:05       ` Rafael J. Wysocki
  1 sibling, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2016-02-08 17:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On 08-02-16, 14:32, Rafael J. Wysocki wrote:
> The comment still applies.
> 
> Moreover, please extend it to say that this must be called with
> dbs_data->mutex held (or it looks racy otherwise).

Modified it as:

+ *
+ * Simply updating dbs_tuners_int.sampling_rate might not be appropriate here.
+ * For example, if the original sampling_rate was 1 second and the requested new
+ * sampling rate is 10 ms because the user needs immediate reaction from
+ * ondemand governor, otherwise the governor may change the sampling rate too
+ * late; up to 1 second later.
+ *
+ * Similar logic applies while increasing the sampling rate. And so we need to
+ * update it with immediate effect.
+ *
+ * This must be called with dbs_data->mutex held, otherwise traversing
+ * policy_dbs_list isn't safe.

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 03/13] cpufreq: governor: New sysfs show/store callbacks for governor tunables
  2016-02-08 17:07   ` Viresh Kumar
@ 2016-02-08 21:28     ` Rafael J. Wysocki
  0 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 21:28 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On Mon, Feb 8, 2016 at 6:07 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 08-02-16, 17:09, Viresh Kumar wrote:
>> +gov_show_one(sampling_rate);
>> +gov_show_one(sampling_down_factor);
>> +gov_show_one(up_threshold);
>> +gov_show_one(ignore_nice_load);
>> +gov_show_one(min_sampling_rate);
>> +gov_show_one_tunable(cs, down_threshold);
>> +gov_show_one_tunable(cs, freq_step);
>
> Based on the review comments on 1/13, I will do:
> - s/gov_show_one/gov_show_one_common
> - s/gov_show_one_tunable/gov_show_one

OK

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 03/13] cpufreq: governor: New sysfs show/store callbacks for governor tunables
  2016-02-08 11:39 ` [PATCH V3 03/13] cpufreq: governor: New sysfs show/store callbacks for governor tunables Viresh Kumar
  2016-02-08 17:07   ` Viresh Kumar
@ 2016-02-08 21:36   ` Rafael J. Wysocki
  2016-02-09  3:21     ` Viresh Kumar
  1 sibling, 1 reply; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 21:36 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:

[cut]

> @@ -331,8 +310,8 @@ static struct dbs_governor cs_dbs_gov = {
>                 .owner = THIS_MODULE,
>         },
>         .governor = GOV_CONSERVATIVE,
> -       .attr_group_gov_sys = &cs_attr_group_gov_sys,
> -       .attr_group_gov_pol = &cs_attr_group_gov_pol,
> +       .kobj_name = "conservative",

I don't think you need this.

> +       .kobj_type = { .default_attrs = cs_attributes },
>         .get_cpu_cdbs = get_cpu_cdbs,
>         .get_cpu_dbs_info_s = get_cpu_dbs_info_s,
>         .gov_dbs_timer = cs_dbs_timer,

[cut]

> @@ -373,10 +420,15 @@ static int cpufreq_governor_init(struct cpufreq_policy *policy)
>         policy_dbs->dbs_data = dbs_data;
>         policy->governor_data = policy_dbs;
>
> -       ret = sysfs_create_group(get_governor_parent_kobj(policy),
> -                                get_sysfs_attr(gov));
> -       if (ret)
> +       gov->kobj_type.sysfs_ops = &governor_sysfs_ops;
> +       ret = kobject_init_and_add(&dbs_data->kobj, &gov->kobj_type,
> +                                  get_governor_parent_kobj(policy),
> +                                  gov->kobj_name);

gov->gov.name can be used here instead of the new kobj_name thing.

Besides, you forgot about the format argument for kobject_init_and_add().

> +       if (ret) {
> +               pr_err("cpufreq: Governor initialization failed (dbs_data kobject initialization error %d)\n",
> +                      ret);
>                 goto reset_gdbs_data;
> +       }
>
>         return 0;
>

[cut]

> diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
> index 5c5d7936087c..a3afac5d8ab2 100644
> --- a/drivers/cpufreq/cpufreq_governor.h
> +++ b/drivers/cpufreq/cpufreq_governor.h
> @@ -160,8 +160,44 @@ struct dbs_data {
>         unsigned int sampling_rate;
>         unsigned int sampling_down_factor;
>         unsigned int up_threshold;
> +
> +       struct kobject kobj;
> +       /* Protect concurrent updates to governor tunables from sysfs */
> +       struct mutex mutex;
> +};
> +
> +/* Governor's specific attributes */
> +struct dbs_data;
> +struct governor_attr {
> +       struct attribute attr;
> +       ssize_t (*show)(struct dbs_data *dbs_data, char *buf);
> +       ssize_t (*store)(struct dbs_data *dbs_data, const char *buf,
> +                        size_t count);
>  };
>
> +#define gov_show_one_tunable(_gov, file_name)                          \
> +static ssize_t show_##file_name                                                \
> +(struct dbs_data *dbs_data, char *buf)                                 \
> +{                                                                      \
> +       struct _gov##_dbs_tuners *tuners = dbs_data->tuners;            \
> +       return sprintf(buf, "%u\n", tuners->file_name);                 \
> +}
> +
> +#define gov_show_one(file_name)                                                \
> +static ssize_t show_##file_name                                                \
> +(struct dbs_data *dbs_data, char *buf)                                 \
> +{                                                                      \
> +       return sprintf(buf, "%u\n", dbs_data->file_name);               \
> +}
> +
> +#define gov_attr_ro(_name)                                             \
> +static struct governor_attr _name =                                    \
> +__ATTR(_name, 0444, show_##_name, NULL)
> +
> +#define gov_attr_rw(_name)                                             \
> +static struct governor_attr _name =                                    \
> +__ATTR(_name, 0644, show_##_name, store_##_name)
> +
>  /* Common to all CPUs of a policy */
>  struct policy_dbs_info {
>         struct cpufreq_policy *policy;
> @@ -236,8 +272,8 @@ struct dbs_governor {
>         #define GOV_ONDEMAND            0
>         #define GOV_CONSERVATIVE        1
>         int governor;
> -       struct attribute_group *attr_group_gov_sys; /* one governor - system */
> -       struct attribute_group *attr_group_gov_pol; /* one governor - policy */
> +       const char *kobj_name;

So this isn't really necessary.

> +       struct kobj_type kobj_type;
>
>         /*
>          * Common data for platforms that don't set
> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> index cb0d6ff1ced5..bf570800fa78 100644
> --- a/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/drivers/cpufreq/cpufreq_ondemand.c

[cut]

> @@ -544,8 +523,8 @@ static struct dbs_governor od_dbs_gov = {
>                 .owner = THIS_MODULE,
>         },
>         .governor = GOV_ONDEMAND,
> -       .attr_group_gov_sys = &od_attr_group_gov_sys,
> -       .attr_group_gov_pol = &od_attr_group_gov_pol,
> +       .kobj_name = "ondemand",

And this isn't necessary too.

> +       .kobj_type = { .default_attrs = od_attributes },
>         .get_cpu_cdbs = get_cpu_cdbs,
>         .get_cpu_dbs_info_s = get_cpu_dbs_info_s,
>         .gov_dbs_timer = od_dbs_timer,
> --

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups
  2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
                   ` (13 preceding siblings ...)
  2016-02-08 12:51 ` [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Shilpasri G Bhat
@ 2016-02-08 21:43 ` Rafael J. Wysocki
  14 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 21:43 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

Hi,

On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Hi Rafael,
>
> Things look much much better now. I have rebased this series over
> pm/bleeding-edge, that has all your patches.
>
> I have moved ahead and done few more changes in this series, that should
> get rid of all the lockdeps we were getting earlier, that includes
> fixing lockdep issue in update_sampling_rate() that we were chasing.
>
> These are pushed here again:
> git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git cpufreq/governor-kobject
>
> @Juri/Shilpa: Can you please confirm if all the issues got resolved now
> ?
>
> V2->V3:
> - The first patch from V2, that was moving min_sampling_rate to
>   per governor structure was dropped, as you suggested.
> - Also, I have moved common tunables to struct dbs_data now, which you
>   also suggested sometime back.
> - Last 5 patches are all new and fix rest of the issues we were facing.
>
> --
> viresh
>
> Viresh Kumar (13):
>   cpufreq: governor: Create generic macro for global tuners
>   cpufreq: governor: Move common tunables to 'struct dbs_data'
>   cpufreq: governor: New sysfs show/store callbacks for governor
>     tunables
>   cpufreq: governor: Drop unused macros for creating governor tunable
>     attributes
>   Revert "cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT"
>   cpufreq: Merge cpufreq_offline_prepare/finish routines
>   cpufreq: Call __cpufreq_governor() with policy->rwsem held
>   cpufreq: Remove cpufreq_governor_lock
>   cpufreq: governor: Move common sysfs tunables to cpufreq_governor.c
>   cpufreq: governor: No need to manage state machine now
>   cpufreq: governor: Keep list of policy_dbs within dbs_data
>   cpufreq: ondemand: Traverse list of policy_dbs in
>     update_sampling_rate()
>   cpufreq: conservative: Update sample_delay_ns immediately

This is OK overall, but we'll need to reorder it somewhat.

I have reviewed patches [1-5,12/13].  Where I didn't send comments, I had none.

In my opinion, those 6 patches should go in first.  At least I don't
see why [12/13] cannot be reworked on top of [1-5/13].  If there is
any fundamental reason, please let me know what it is.  Otherwise,
please first send new versions of those 6 patches, preferably as a new
series.

The rest of the patchset will require more review time, so please send
them again later, when you're done with the first item.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate()
  2016-02-08 17:20     ` Viresh Kumar
@ 2016-02-08 22:05       ` Rafael J. Wysocki
  2016-02-08 22:08         ` Rafael J. Wysocki
  0 siblings, 1 reply; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 22:05 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael Wysocki, Juri Lelli,
	Lists linaro-kernel, linux-pm, Saravana Kannan, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann, Shilpasri G Bhat,
	Linux Kernel Mailing List

On Mon, Feb 8, 2016 at 6:20 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 08-02-16, 14:32, Rafael J. Wysocki wrote:
>> The comment still applies.
>>
>> Moreover, please extend it to say that this must be called with
>> dbs_data->mutex held (or it looks racy otherwise).
>
> Modified it as:
>
> + *
> + * Simply updating dbs_tuners_int.sampling_rate might not be appropriate here.
> + * For example, if the original sampling_rate was 1 second and the requested new
> + * sampling rate is 10 ms because the user needs immediate reaction from
> + * ondemand governor, otherwise the governor may change the sampling rate too
> + * late; up to 1 second later.

The "otherwise" doesn't seem to be necessary here.

> + *
> + * Similar logic applies while increasing the sampling rate. And so we need to
> + * update it with immediate effect.

Actually, no, it doesn't apply.  If you increase the sampling rate,
the governor will never be late.  It may be early, but that's fine in
this case.

It just doesn't hurt to update immediately in this case too.

> + *
> + * This must be called with dbs_data->mutex held, otherwise traversing
> + * policy_dbs_list isn't safe.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate()
  2016-02-08 22:05       ` Rafael J. Wysocki
@ 2016-02-08 22:08         ` Rafael J. Wysocki
  0 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-08 22:08 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On Mon, Feb 8, 2016 at 11:05 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Mon, Feb 8, 2016 at 6:20 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> On 08-02-16, 14:32, Rafael J. Wysocki wrote:
>>> The comment still applies.
>>>
>>> Moreover, please extend it to say that this must be called with
>>> dbs_data->mutex held (or it looks racy otherwise).
>>
>> Modified it as:
>>
>> + *
>> + * Simply updating dbs_tuners_int.sampling_rate might not be appropriate here.
>> + * For example, if the original sampling_rate was 1 second and the requested new
>> + * sampling rate is 10 ms because the user needs immediate reaction from
>> + * ondemand governor, otherwise the governor may change the sampling rate too
>> + * late; up to 1 second later.
>
> The "otherwise" doesn't seem to be necessary here.
>
>> + *
>> + * Similar logic applies while increasing the sampling rate. And so we need to
>> + * update it with immediate effect.
>
> Actually, no, it doesn't apply.  If you increase the sampling rate,
> the governor will never be late.  It may be early, but that's fine in
> this case.
>
> It just doesn't hurt to update immediately in this case too.
>
>> + *
>> + * This must be called with dbs_data->mutex held, otherwise traversing
>> + * policy_dbs_list isn't safe.

I really don't know what's wrong with retaining the original paragraph
and adding the above sentence after it.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 03/13] cpufreq: governor: New sysfs show/store callbacks for governor tunables
  2016-02-08 21:36   ` Rafael J. Wysocki
@ 2016-02-09  3:21     ` Viresh Kumar
  2016-02-09 20:20       ` Rafael J. Wysocki
  0 siblings, 1 reply; 38+ messages in thread
From: Viresh Kumar @ 2016-02-09  3:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On 08-02-16, 22:36, Rafael J. Wysocki wrote:
> On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > +       ret = kobject_init_and_add(&dbs_data->kobj, &gov->kobj_type,
> > +                                  get_governor_parent_kobj(policy),
> > +                                  gov->kobj_name);
> 
> Besides, you forgot about the format argument for kobject_init_and_add().

What about that? Why is it required here ? We don't have to modify the
gov->gov.name string at all, and that string can be used here without adding any
more format arguments.

-- 
viresh

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH V3 03/13] cpufreq: governor: New sysfs show/store callbacks for governor tunables
  2016-02-09  3:21     ` Viresh Kumar
@ 2016-02-09 20:20       ` Rafael J. Wysocki
  0 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2016-02-09 20:20 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Juri Lelli, Lists linaro-kernel, linux-pm,
	Saravana Kannan, Peter Zijlstra, Michael Turquette, Steve Muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann,
	Shilpasri G Bhat, Linux Kernel Mailing List

On Tuesday, February 09, 2016 08:51:26 AM Viresh Kumar wrote:
> On 08-02-16, 22:36, Rafael J. Wysocki wrote:
> > On Mon, Feb 8, 2016 at 12:39 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > +       ret = kobject_init_and_add(&dbs_data->kobj, &gov->kobj_type,
> > > +                                  get_governor_parent_kobj(policy),
> > > +                                  gov->kobj_name);
> > 
> > Besides, you forgot about the format argument for kobject_init_and_add().
> 
> What about that? Why is it required here ? We don't have to modify the
> gov->gov.name string at all, and that string can be used here without adding any
> more format arguments.

But that's because the governor name is a static string and we can safely pass
it as a format (because we know that it doesn't contain any output field specifiers
in particular).

So either there should be a comment to that effect in the code, or the format
argument should be present.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2016-02-09 20:19 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-08 11:39 [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Viresh Kumar
2016-02-08 11:39 ` [PATCH V3 01/13] cpufreq: governor: Create generic macro for global tuners Viresh Kumar
2016-02-08 16:33   ` Rafael J. Wysocki
2016-02-08 11:39 ` [PATCH V3 02/13] cpufreq: governor: Move common tunables to 'struct dbs_data' Viresh Kumar
2016-02-08 11:39 ` [PATCH V3 03/13] cpufreq: governor: New sysfs show/store callbacks for governor tunables Viresh Kumar
2016-02-08 17:07   ` Viresh Kumar
2016-02-08 21:28     ` Rafael J. Wysocki
2016-02-08 21:36   ` Rafael J. Wysocki
2016-02-09  3:21     ` Viresh Kumar
2016-02-09 20:20       ` Rafael J. Wysocki
2016-02-08 11:39 ` [PATCH V3 04/13] cpufreq: governor: Drop unused macros for creating governor tunable attributes Viresh Kumar
2016-02-08 11:39 ` [PATCH V3 05/13] Revert "cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT" Viresh Kumar
2016-02-08 11:39 ` [PATCH V3 06/13] cpufreq: Merge cpufreq_offline_prepare/finish routines Viresh Kumar
2016-02-08 11:39 ` [PATCH V3 07/13] cpufreq: Call __cpufreq_governor() with policy->rwsem held Viresh Kumar
2016-02-08 11:39 ` [PATCH V3 08/13] cpufreq: Remove cpufreq_governor_lock Viresh Kumar
2016-02-08 11:39 ` [PATCH V3 09/13] cpufreq: governor: Move common sysfs tunables to cpufreq_governor.c Viresh Kumar
2016-02-08 12:58   ` Rafael J. Wysocki
2016-02-08 13:03     ` Viresh Kumar
2016-02-08 13:24       ` Rafael J. Wysocki
2016-02-08 11:39 ` [PATCH V3 10/13] cpufreq: governor: No need to manage state machine now Viresh Kumar
2016-02-08 11:39 ` [PATCH V3 11/13] cpufreq: governor: Keep list of policy_dbs within dbs_data Viresh Kumar
2016-02-08 13:21   ` Rafael J. Wysocki
2016-02-08 13:30     ` Viresh Kumar
2016-02-08 13:35       ` Rafael J. Wysocki
2016-02-08 11:39 ` [PATCH V3 12/13] cpufreq: ondemand: Traverse list of policy_dbs in update_sampling_rate() Viresh Kumar
2016-02-08 13:32   ` Rafael J. Wysocki
2016-02-08 13:34     ` Viresh Kumar
2016-02-08 13:34       ` Viresh Kumar
2016-02-08 13:37       ` Rafael J. Wysocki
2016-02-08 17:20     ` Viresh Kumar
2016-02-08 22:05       ` Rafael J. Wysocki
2016-02-08 22:08         ` Rafael J. Wysocki
2016-02-08 11:39 ` [PATCH V3 13/13] cpufreq: conservative: Update sample_delay_ns immediately Viresh Kumar
2016-02-08 12:51 ` [PATCH V3 00/13] cpufreq: governors: Fix ABBA lockups Shilpasri G Bhat
2016-02-08 12:54   ` Viresh Kumar
2016-02-08 16:39     ` Juri Lelli
2016-02-08 16:55       ` Viresh Kumar
2016-02-08 21:43 ` Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.