linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/12] cpufreq: More governor code reorganization
@ 2016-02-18  1:17 Rafael J. Wysocki
  2016-02-18  1:19 ` [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition Rafael J. Wysocki
                   ` (11 more replies)
  0 siblings, 12 replies; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:17 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

Hi,

This series continues the governor code reorganization I've been doing
for the last couple of weeks.

Again, it doesn't change the way the code works fundamentally, but some
minor changes in behavior may be noticeable.

[1/12] closes a race condition for dbs_data updates.

[2/12] moves the io_is_busy tunable to struct dbs_data.

[3/12] adds a ->start callback for governors to initialize governor-specific
       stuff and avoid the ugly governor == GOV_SOMETHING checks in the common
       code.

[4/12] drops unnecessary things from struct dbs_governor (and related stuff).

[5/12] drops an unnecessary ondemand operation callback.

[6/12] fixes the way some governor tunable sysfs attributes update CPU load info.

[7/12] fixes the way in which powersave bias updates are handled in ondemand.

[8/12] moves some data items from per-CPU structures to per-policy ones.

[9/12] rearranges the per-CPU structures (moves them to the common code which
       is the only user of them now and drops several related things that
       aren't necessary any more).

[10/12] moves the definitions of "tuners" structures to governors.

[11/12] makes dbs_data_mutex static.

[12/12] reduces the dbs_data_mutex usage area and renames it.

Patches on top of the current linux-next branch of linux-pm.git, lightly tested
on Toshiba Portege R500 with the ACPI driver.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
@ 2016-02-18  1:19 ` Rafael J. Wysocki
  2016-02-18  5:24   ` Viresh Kumar
  2016-02-19  3:09   ` Viresh Kumar
  2016-02-18  1:20 ` [PATCH 2/12] cpufreq: governor: Move io_is_busy to struct dbs_data Rafael J. Wysocki
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:19 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

It is possible for a dbs_data object to be updated after its
usage counter has become 0.  That may happen if governor_store()
runs (via a govenor tunable sysfs attribute write) in parallel
with cpufreq_governor_exit() called for the last cpufreq policy
associated with the dbs_data in question.  In that case, if
governor_store() acquires dbs_data->mutex right after
cpufreq_governor_exit() has released it, the ->store() callback
invoked by it may operate on dbs_data with no users.  Although
sysfs will cause the kobject_put() in cpufreq_governor_exit() to
block until governor_store() has returned, that situation may
lead to some unexpected results, depending on the implementation
of the ->store callback, and therefore it should be avoided.

To that end, modify governor_store() to check the dbs_data's
usage count before invoking the ->store() callback and return
an error if it is 0 at that point.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_governor.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -112,7 +112,7 @@ static ssize_t governor_store(struct kob
 
 	mutex_lock(&dbs_data->mutex);
 
-	if (gattr->store)
+	if (dbs_data->usage_count && gattr->store)
 		ret = gattr->store(dbs_data, buf, count);
 
 	mutex_unlock(&dbs_data->mutex);

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 2/12] cpufreq: governor: Move io_is_busy to struct dbs_data
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
  2016-02-18  1:19 ` [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition Rafael J. Wysocki
@ 2016-02-18  1:20 ` Rafael J. Wysocki
  2016-02-18  5:28   ` Viresh Kumar
  2016-02-18  1:21 ` [PATCH 3/12] cpufreq: governor: Add a ->start callback for governors Rafael J. Wysocki
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:20 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The io_is_busy governor tunable is only used by the ondemand governor
and is located in the ondemand-specific data structure, but it is
looked at by the common governor code that has to do ugly things to
get to that value, so move it to struct dbs_data and modify ondemand
accordingly.

Since the conservative governor never touches that field, it will
be always 0 for that governor and it won't have any effect on the
results of computations in that case.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_governor.c |   27 +++++++++------------------
 drivers/cpufreq/cpufreq_governor.h |    2 +-
 drivers/cpufreq/cpufreq_ondemand.c |   12 +++++-------
 3 files changed, 15 insertions(+), 26 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -137,10 +137,9 @@ unsigned int dbs_update(struct cpufreq_p
 	struct dbs_governor *gov = dbs_governor_of(policy);
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
-	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	unsigned int ignore_nice = dbs_data->ignore_nice_load;
 	unsigned int max_load = 0;
-	unsigned int sampling_rate, j;
+	unsigned int sampling_rate, io_busy, j;
 
 	/*
 	 * Sometimes governors may use an additional multiplier to increase
@@ -149,6 +148,12 @@ unsigned int dbs_update(struct cpufreq_p
 	 * conservative.
 	 */
 	sampling_rate = dbs_data->sampling_rate * policy_dbs->rate_mult;
+	/*
+	 * For the purpose of ondemand, waiting for disk IO is an indication
+	 * that you're performance critical, and not that the system is actually
+	 * idle, so do not add the iowait time to the CPU idle time then.
+	 */
+	io_busy = dbs_data->io_is_busy;
 
 	/* Get Absolute Load */
 	for_each_cpu(j, policy->cpus) {
@@ -156,18 +161,9 @@ unsigned int dbs_update(struct cpufreq_p
 		u64 cur_wall_time, cur_idle_time;
 		unsigned int idle_time, wall_time;
 		unsigned int load;
-		int io_busy = 0;
 
 		j_cdbs = gov->get_cpu_cdbs(j);
 
-		/*
-		 * For the purpose of ondemand, waiting for disk IO is
-		 * an indication that you're performance critical, and
-		 * not that the system is actually idle. So do not add
-		 * the iowait time to the cpu idle time.
-		 */
-		if (gov->governor == GOV_ONDEMAND)
-			io_busy = od_tuners->io_is_busy;
 		cur_idle_time = get_cpu_idle_time(j, &cur_wall_time, io_busy);
 
 		wall_time = cur_wall_time - j_cdbs->prev_cpu_wall;
@@ -522,7 +518,7 @@ static int cpufreq_governor_start(struct
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	unsigned int sampling_rate, ignore_nice, j, cpu = policy->cpu;
-	int io_busy = 0;
+	unsigned int io_busy;
 
 	if (!policy->cur)
 		return -EINVAL;
@@ -532,12 +528,7 @@ static int cpufreq_governor_start(struct
 
 	sampling_rate = dbs_data->sampling_rate;
 	ignore_nice = dbs_data->ignore_nice_load;
-
-	if (gov->governor == GOV_ONDEMAND) {
-		struct od_dbs_tuners *od_tuners = dbs_data->tuners;
-
-		io_busy = od_tuners->io_is_busy;
-	}
+	io_busy = dbs_data->io_is_busy;
 
 	for_each_cpu(j, policy->cpus) {
 		struct cpu_dbs_info *j_cdbs = gov->get_cpu_cdbs(j);
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -71,6 +71,7 @@ struct dbs_data {
 	unsigned int sampling_rate;
 	unsigned int sampling_down_factor;
 	unsigned int up_threshold;
+	unsigned int io_is_busy;
 
 	struct kobject kobj;
 	struct list_head policy_dbs_list;
@@ -177,7 +178,6 @@ struct cs_cpu_dbs_info_s {
 /* Per policy Governors sysfs tunables */
 struct od_dbs_tuners {
 	unsigned int powersave_bias;
-	unsigned int io_is_busy;
 };
 
 struct cs_dbs_tuners {
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -220,7 +220,6 @@ static struct dbs_governor od_dbs_gov;
 static ssize_t store_io_is_busy(struct dbs_data *dbs_data, const char *buf,
 		size_t count)
 {
-	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	unsigned int input;
 	int ret;
 	unsigned int j;
@@ -228,14 +227,14 @@ static ssize_t store_io_is_busy(struct d
 	ret = sscanf(buf, "%u", &input);
 	if (ret != 1)
 		return -EINVAL;
-	od_tuners->io_is_busy = !!input;
+	dbs_data->io_is_busy = !!input;
 
 	/* we need to re-evaluate prev_cpu_idle */
 	for_each_online_cpu(j) {
 		struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info,
 									j);
 		dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
-			&dbs_info->cdbs.prev_cpu_wall, od_tuners->io_is_busy);
+			&dbs_info->cdbs.prev_cpu_wall, dbs_data->io_is_busy);
 	}
 	return count;
 }
@@ -286,7 +285,6 @@ static ssize_t store_sampling_down_facto
 static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
 		const char *buf, size_t count)
 {
-	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	unsigned int input;
 	int ret;
 
@@ -309,7 +307,7 @@ static ssize_t store_ignore_nice_load(st
 		struct od_cpu_dbs_info_s *dbs_info;
 		dbs_info = &per_cpu(od_cpu_dbs_info, j);
 		dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
-			&dbs_info->cdbs.prev_cpu_wall, od_tuners->io_is_busy);
+			&dbs_info->cdbs.prev_cpu_wall, dbs_data->io_is_busy);
 		if (dbs_data->ignore_nice_load)
 			dbs_info->cdbs.prev_cpu_nice =
 				kcpustat_cpu(j).cpustat[CPUTIME_NICE];
@@ -342,7 +340,7 @@ gov_show_one_common(up_threshold);
 gov_show_one_common(sampling_down_factor);
 gov_show_one_common(ignore_nice_load);
 gov_show_one_common(min_sampling_rate);
-gov_show_one(od, io_is_busy);
+gov_show_one_common(io_is_busy);
 gov_show_one(od, powersave_bias);
 
 gov_attr_rw(sampling_rate);
@@ -401,7 +399,7 @@ static int od_init(struct dbs_data *dbs_
 	dbs_data->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
 	dbs_data->ignore_nice_load = 0;
 	tuners->powersave_bias = default_powersave_bias;
-	tuners->io_is_busy = should_io_be_busy();
+	dbs_data->io_is_busy = should_io_be_busy();
 
 	dbs_data->tuners = tuners;
 	return 0;

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 3/12] cpufreq: governor: Add a ->start callback for governors
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
  2016-02-18  1:19 ` [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition Rafael J. Wysocki
  2016-02-18  1:20 ` [PATCH 2/12] cpufreq: governor: Move io_is_busy to struct dbs_data Rafael J. Wysocki
@ 2016-02-18  1:21 ` Rafael J. Wysocki
  2016-02-18  5:36   ` Viresh Kumar
  2016-02-18  1:22 ` [PATCH 4/12] cpufreq: governor: Drop unused governor callback and data fields Rafael J. Wysocki
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:21 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

To avoid having to check the governor type explicitly in the common
code in order to initialize data structures specific to the governor
type properly, add a ->start callback to struct dbs_governor and
use it to initialize those data structures for the ondemand and
conservative governors.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_conservative.c |    9 +++++++++
 drivers/cpufreq/cpufreq_governor.c     |   16 ++--------------
 drivers/cpufreq/cpufreq_governor.h     |    1 +
 drivers/cpufreq/cpufreq_ondemand.c     |   10 ++++++++++
 4 files changed, 22 insertions(+), 14 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -517,7 +517,7 @@ static int cpufreq_governor_start(struct
 	struct dbs_governor *gov = dbs_governor_of(policy);
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
-	unsigned int sampling_rate, ignore_nice, j, cpu = policy->cpu;
+	unsigned int sampling_rate, ignore_nice, j;
 	unsigned int io_busy;
 
 	if (!policy->cur)
@@ -543,19 +543,7 @@ static int cpufreq_governor_start(struct
 			j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
 	}
 
-	if (gov->governor == GOV_CONSERVATIVE) {
-		struct cs_cpu_dbs_info_s *cs_dbs_info =
-			gov->get_cpu_dbs_info_s(cpu);
-
-		cs_dbs_info->down_skip = 0;
-		cs_dbs_info->requested_freq = policy->cur;
-	} else {
-		struct od_ops *od_ops = gov->gov_ops;
-		struct od_cpu_dbs_info_s *od_dbs_info = gov->get_cpu_dbs_info_s(cpu);
-
-		od_dbs_info->sample_type = OD_NORMAL_SAMPLE;
-		od_ops->powersave_bias_init_cpu(cpu);
-	}
+	gov->start(policy);
 
 	gov_set_update_util(policy_dbs, sampling_rate);
 	return 0;
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -205,6 +205,7 @@ struct dbs_governor {
 	unsigned int (*gov_dbs_timer)(struct cpufreq_policy *policy);
 	int (*init)(struct dbs_data *dbs_data, bool notify);
 	void (*exit)(struct dbs_data *dbs_data, bool notify);
+	void (*start)(struct cpufreq_policy *policy);
 
 	/* Governor specific ops, see below */
 	void *gov_ops;
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -410,6 +410,15 @@ static void od_exit(struct dbs_data *dbs
 	kfree(dbs_data->tuners);
 }
 
+static void od_start(struct cpufreq_policy *policy)
+{
+	unsigned int cpu = policy->cpu;
+	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
+
+	dbs_info->sample_type = OD_NORMAL_SAMPLE;
+	od_ops.powersave_bias_init_cpu(cpu);
+}
+
 define_get_cpu_dbs_routines(od_cpu_dbs_info);
 
 static struct od_ops od_ops = {
@@ -432,6 +441,7 @@ static struct dbs_governor od_dbs_gov =
 	.gov_ops = &od_ops,
 	.init = od_init,
 	.exit = od_exit,
+	.start = od_start,
 };
 
 #define CPU_FREQ_GOV_ONDEMAND	(&od_dbs_gov.gov)
Index: linux-pm/drivers/cpufreq/cpufreq_conservative.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_conservative.c
+++ linux-pm/drivers/cpufreq/cpufreq_conservative.c
@@ -279,6 +279,14 @@ static void cs_exit(struct dbs_data *dbs
 	kfree(dbs_data->tuners);
 }
 
+static void cs_start(struct cpufreq_policy *policy)
+{
+	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, policy->cpu);
+
+	dbs_info->down_skip = 0;
+	dbs_info->requested_freq = policy->cur;
+}
+
 define_get_cpu_dbs_routines(cs_cpu_dbs_info);
 
 static struct dbs_governor cs_dbs_gov = {
@@ -295,6 +303,7 @@ static struct dbs_governor cs_dbs_gov =
 	.gov_dbs_timer = cs_dbs_timer,
 	.init = cs_init,
 	.exit = cs_exit,
+	.start = cs_start,
 };
 
 #define CPU_FREQ_GOV_CONSERVATIVE	(&cs_dbs_gov.gov)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 4/12] cpufreq: governor: Drop unused governor callback and data fields
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2016-02-18  1:21 ` [PATCH 3/12] cpufreq: governor: Add a ->start callback for governors Rafael J. Wysocki
@ 2016-02-18  1:22 ` Rafael J. Wysocki
  2016-02-18  5:37   ` Viresh Kumar
  2016-02-18  1:24 ` [PATCH 5/12] cpufreq: ondemand: Drop one more callback from struct od_ops Rafael J. Wysocki
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:22 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

After some previous changes, the ->get_cpu_dbs_info_s governor
callback and the "governor" field in struct dbs_governor (whose
value represents the governor type) are not used any more, so
drop them.

Also drop the unused gov_ops field from struct dbs_governor.

No functional changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_conservative.c |    2 --
 drivers/cpufreq/cpufreq_governor.h     |   15 +--------------
 drivers/cpufreq/cpufreq_ondemand.c     |    3 ---
 3 files changed, 1 insertion(+), 19 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_conservative.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_conservative.c
+++ linux-pm/drivers/cpufreq/cpufreq_conservative.c
@@ -296,10 +296,8 @@ static struct dbs_governor cs_dbs_gov =
 		.max_transition_latency = TRANSITION_LATENCY_LIMIT,
 		.owner = THIS_MODULE,
 	},
-	.governor = GOV_CONSERVATIVE,
 	.kobj_type = { .default_attrs = cs_attributes },
 	.get_cpu_cdbs = get_cpu_cdbs,
-	.get_cpu_dbs_info_s = get_cpu_dbs_info_s,
 	.gov_dbs_timer = cs_dbs_timer,
 	.init = cs_init,
 	.exit = cs_exit,
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -46,11 +46,6 @@ enum {OD_NORMAL_SAMPLE, OD_SUB_SAMPLE};
 static struct cpu_dbs_info *get_cpu_cdbs(int cpu)			\
 {									\
 	return &per_cpu(_dbs_info, cpu).cdbs;				\
-}									\
-									\
-static void *get_cpu_dbs_info_s(int cpu)				\
-{									\
-	return &per_cpu(_dbs_info, cpu);				\
 }
 
 /*
@@ -188,10 +183,6 @@ struct cs_dbs_tuners {
 /* Common Governor data across policies */
 struct dbs_governor {
 	struct cpufreq_governor gov;
-
-	#define GOV_ONDEMAND		0
-	#define GOV_CONSERVATIVE	1
-	int governor;
 	struct kobj_type kobj_type;
 
 	/*
@@ -201,14 +192,10 @@ struct dbs_governor {
 	struct dbs_data *gdbs_data;
 
 	struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
-	void *(*get_cpu_dbs_info_s)(int cpu);
 	unsigned int (*gov_dbs_timer)(struct cpufreq_policy *policy);
 	int (*init)(struct dbs_data *dbs_data, bool notify);
 	void (*exit)(struct dbs_data *dbs_data, bool notify);
 	void (*start)(struct cpufreq_policy *policy);
-
-	/* Governor specific ops, see below */
-	void *gov_ops;
 };
 
 static inline struct dbs_governor *dbs_governor_of(struct cpufreq_policy *policy)
@@ -216,7 +203,7 @@ static inline struct dbs_governor *dbs_g
 	return container_of(policy->governor, struct dbs_governor, gov);
 }
 
-/* Governor specific ops, will be passed to dbs_data->gov_ops */
+/* Governor specific operations */
 struct od_ops {
 	void (*powersave_bias_init_cpu)(int cpu);
 	unsigned int (*powersave_bias_target)(struct cpufreq_policy *policy,
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -433,12 +433,9 @@ static struct dbs_governor od_dbs_gov =
 		.max_transition_latency	= TRANSITION_LATENCY_LIMIT,
 		.owner = THIS_MODULE,
 	},
-	.governor = GOV_ONDEMAND,
 	.kobj_type = { .default_attrs = od_attributes },
 	.get_cpu_cdbs = get_cpu_cdbs,
-	.get_cpu_dbs_info_s = get_cpu_dbs_info_s,
 	.gov_dbs_timer = od_dbs_timer,
-	.gov_ops = &od_ops,
 	.init = od_init,
 	.exit = od_exit,
 	.start = od_start,

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 5/12] cpufreq: ondemand: Drop one more callback from struct od_ops
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2016-02-18  1:22 ` [PATCH 4/12] cpufreq: governor: Drop unused governor callback and data fields Rafael J. Wysocki
@ 2016-02-18  1:24 ` Rafael J. Wysocki
  2016-02-18  5:38   ` Viresh Kumar
  2016-02-18  1:26 ` [PATCH 6/12] cpufreq: governor: Fix CPU load information updates via ->store Rafael J. Wysocki
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:24 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The ->powersave_bias_init_cpu callback in struct od_ops is only used
in one place and that invocation may be replaced with a direct call
to the function pointed to by that callback, so change the code
accordingly and drop the callback.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_governor.h |    1 -
 drivers/cpufreq/cpufreq_ondemand.c |    3 +--
 2 files changed, 1 insertion(+), 3 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -205,7 +205,6 @@ static inline struct dbs_governor *dbs_g
 
 /* Governor specific operations */
 struct od_ops {
-	void (*powersave_bias_init_cpu)(int cpu);
 	unsigned int (*powersave_bias_target)(struct cpufreq_policy *policy,
 			unsigned int freq_next, unsigned int relation);
 };
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -416,13 +416,12 @@ static void od_start(struct cpufreq_poli
 	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
 
 	dbs_info->sample_type = OD_NORMAL_SAMPLE;
-	od_ops.powersave_bias_init_cpu(cpu);
+	ondemand_powersave_bias_init_cpu(cpu);
 }
 
 define_get_cpu_dbs_routines(od_cpu_dbs_info);
 
 static struct od_ops od_ops = {
-	.powersave_bias_init_cpu = ondemand_powersave_bias_init_cpu,
 	.powersave_bias_target = generic_powersave_bias_target,
 };
 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 6/12] cpufreq: governor: Fix CPU load information updates via ->store
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
                   ` (4 preceding siblings ...)
  2016-02-18  1:24 ` [PATCH 5/12] cpufreq: ondemand: Drop one more callback from struct od_ops Rafael J. Wysocki
@ 2016-02-18  1:26 ` Rafael J. Wysocki
  2016-02-18  5:44   ` Viresh Kumar
  2016-02-18  1:28 ` [PATCH 7/12] cpufreq: ondemand: Rework the handling of powersave bias updates Rafael J. Wysocki
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:26 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The ->store() callbacks of some tunable sysfs attributes of the
ondemand and conservative governors trigger immediate updates of
the CPU load information for all CPUs "governed" by the given
dbs_data by walking the cpu_dbs_info structures for all online
CPUs in the system and updating them.

This is questionable for two reasons.  First, it may lead to a lot of
extra overhead on a system with many CPUs if the given dbs_data is
only associated with a few of them.  Second, if governor tunables are
per-policy, the CPUs associated with the other sets of governor
tunables should not be updated.

To address this issue, use the observation that in all of the places
in question the update operation may be carried out in the same way
(because all of the tunables involved are now located in struct
dbs_data and readily available to the common code) and make the
code in those places invoke the same (new) helper function that
will carry out the update correctly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_conservative.c |   15 +++++----------
 drivers/cpufreq/cpufreq_governor.c     |   30 ++++++++++++++++++++++++++++++
 drivers/cpufreq/cpufreq_governor.h     |    1 +
 drivers/cpufreq/cpufreq_ondemand.c     |   22 ++++------------------
 4 files changed, 40 insertions(+), 28 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -80,6 +80,36 @@ ssize_t store_sampling_rate(struct dbs_d
 }
 EXPORT_SYMBOL_GPL(store_sampling_rate);
 
+/**
+ * gov_update_cpu_data - Update CPU load data.
+ * @gov: Governor whose data is to be updated.
+ * @dbs_data: Top-level governor data pointer.
+ *
+ * Update CPU load data for all CPUs in the domain governed by @dbs_data
+ * (that may be a single policy or a bunch of them if governor tunables are
+ * system-wide).
+ *
+ * Call under the @dbs_data mutex.
+ */
+void gov_update_cpu_data(struct dbs_governor *gov, struct dbs_data *dbs_data)
+{
+	struct policy_dbs_info *policy_dbs;
+
+	list_for_each_entry(policy_dbs, &dbs_data->policy_dbs_list, list) {
+		unsigned int j;
+
+		for_each_cpu(j, policy_dbs->policy->cpus) {
+			struct cpu_dbs_info *j_cdbs = gov->get_cpu_cdbs(j);
+
+			j_cdbs->prev_cpu_idle = get_cpu_idle_time(j, &j_cdbs->prev_cpu_wall,
+								  dbs_data->io_is_busy);
+			if (dbs_data->ignore_nice_load)
+				j_cdbs->prev_cpu_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE];
+		}
+	}
+}
+EXPORT_SYMBOL_GPL(gov_update_cpu_data);
+
 static inline struct dbs_data *to_dbs_data(struct kobject *kobj)
 {
 	return container_of(kobj, struct dbs_data, kobj);
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -218,4 +218,5 @@ void od_register_powersave_bias_handler(
 void od_unregister_powersave_bias_handler(void);
 ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
 			    size_t count);
+void gov_update_cpu_data(struct dbs_governor *gov, struct dbs_data *dbs_data);
 #endif /* _CPUFREQ_GOVERNOR_H */
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -29,6 +29,7 @@
 
 static DEFINE_PER_CPU(struct od_cpu_dbs_info_s, od_cpu_dbs_info);
 
+static struct dbs_governor od_dbs_gov;
 static struct od_ops od_ops;
 
 static unsigned int default_powersave_bias;
@@ -222,7 +223,6 @@ static ssize_t store_io_is_busy(struct d
 {
 	unsigned int input;
 	int ret;
-	unsigned int j;
 
 	ret = sscanf(buf, "%u", &input);
 	if (ret != 1)
@@ -230,12 +230,8 @@ static ssize_t store_io_is_busy(struct d
 	dbs_data->io_is_busy = !!input;
 
 	/* we need to re-evaluate prev_cpu_idle */
-	for_each_online_cpu(j) {
-		struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info,
-									j);
-		dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
-			&dbs_info->cdbs.prev_cpu_wall, dbs_data->io_is_busy);
-	}
+	gov_update_cpu_data(&od_dbs_gov, dbs_data);
+
 	return count;
 }
 
@@ -288,8 +284,6 @@ static ssize_t store_ignore_nice_load(st
 	unsigned int input;
 	int ret;
 
-	unsigned int j;
-
 	ret = sscanf(buf, "%u", &input);
 	if (ret != 1)
 		return -EINVAL;
@@ -303,16 +297,8 @@ static ssize_t store_ignore_nice_load(st
 	dbs_data->ignore_nice_load = input;
 
 	/* we need to re-evaluate prev_cpu_idle */
-	for_each_online_cpu(j) {
-		struct od_cpu_dbs_info_s *dbs_info;
-		dbs_info = &per_cpu(od_cpu_dbs_info, j);
-		dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
-			&dbs_info->cdbs.prev_cpu_wall, dbs_data->io_is_busy);
-		if (dbs_data->ignore_nice_load)
-			dbs_info->cdbs.prev_cpu_nice =
-				kcpustat_cpu(j).cpustat[CPUTIME_NICE];
+	gov_update_cpu_data(&od_dbs_gov, dbs_data);
 
-	}
 	return count;
 }
 
Index: linux-pm/drivers/cpufreq/cpufreq_conservative.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_conservative.c
+++ linux-pm/drivers/cpufreq/cpufreq_conservative.c
@@ -23,6 +23,8 @@
 
 static DEFINE_PER_CPU(struct cs_cpu_dbs_info_s, cs_cpu_dbs_info);
 
+static struct dbs_governor cs_dbs_gov;
+
 static inline unsigned int get_freq_target(struct cs_dbs_tuners *cs_tuners,
 					   struct cpufreq_policy *policy)
 {
@@ -164,7 +166,7 @@ static ssize_t store_down_threshold(stru
 static ssize_t store_ignore_nice_load(struct dbs_data *dbs_data,
 		const char *buf, size_t count)
 {
-	unsigned int input, j;
+	unsigned int input;
 	int ret;
 
 	ret = sscanf(buf, "%u", &input);
@@ -180,15 +182,8 @@ static ssize_t store_ignore_nice_load(st
 	dbs_data->ignore_nice_load = input;
 
 	/* we need to re-evaluate prev_cpu_idle */
-	for_each_online_cpu(j) {
-		struct cs_cpu_dbs_info_s *dbs_info;
-		dbs_info = &per_cpu(cs_cpu_dbs_info, j);
-		dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
-					&dbs_info->cdbs.prev_cpu_wall, 0);
-		if (dbs_data->ignore_nice_load)
-			dbs_info->cdbs.prev_cpu_nice =
-				kcpustat_cpu(j).cpustat[CPUTIME_NICE];
-	}
+	gov_update_cpu_data(&cs_dbs_gov, dbs_data);
+
 	return count;
 }
 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 7/12] cpufreq: ondemand: Rework the handling of powersave bias updates
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
                   ` (5 preceding siblings ...)
  2016-02-18  1:26 ` [PATCH 6/12] cpufreq: governor: Fix CPU load information updates via ->store Rafael J. Wysocki
@ 2016-02-18  1:28 ` Rafael J. Wysocki
  2016-02-18  5:53   ` Viresh Kumar
  2016-02-18  1:30 ` [PATCH 8/12] cpufreq: governor: Make governor private data per-policy Rafael J. Wysocki
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:28 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The ondemand_powersave_bias_init() function used for resetting data
fields related to the powersave bias tunable of the ondemand governor
works by walking all of the online CPUs in the system and updating the
od_cpu_dbs_info_s structures for all of them.

However, if governor tunables are per policy, the update should not
touch the CPUs that are not associated with the given dbs_data.

Moreover, since the data fields in question are only ever used for
policy->cpu in each policy governed by ondemand, the update can be
limited to those specific CPUs.

Rework the code to take the above observations into account.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_ondemand.c |   30 +++++++++++++-----------------
 1 file changed, 13 insertions(+), 17 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -34,14 +34,6 @@ static struct od_ops od_ops;
 
 static unsigned int default_powersave_bias;
 
-static void ondemand_powersave_bias_init_cpu(int cpu)
-{
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
-
-	dbs_info->freq_table = cpufreq_frequency_get_table(cpu);
-	dbs_info->freq_lo = 0;
-}
-
 /*
  * Not all CPUs want IO time to be accounted as busy; this depends on how
  * efficient idling at a higher frequency/voltage is.
@@ -120,12 +112,13 @@ static unsigned int generic_powersave_bi
 	return freq_hi;
 }
 
-static void ondemand_powersave_bias_init(void)
+static void ondemand_powersave_bias_init(struct cpufreq_policy *policy)
 {
-	int i;
-	for_each_online_cpu(i) {
-		ondemand_powersave_bias_init_cpu(i);
-	}
+	unsigned int cpu = policy->cpu;
+	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
+
+	dbs_info->freq_table = cpufreq_frequency_get_table(cpu);
+	dbs_info->freq_lo = 0;
 }
 
 static void dbs_freq_increase(struct cpufreq_policy *policy, unsigned int freq)
@@ -306,6 +299,7 @@ static ssize_t store_powersave_bias(stru
 		size_t count)
 {
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+	struct policy_dbs_info *policy_dbs;
 	unsigned int input;
 	int ret;
 	ret = sscanf(buf, "%u", &input);
@@ -317,7 +311,10 @@ static ssize_t store_powersave_bias(stru
 		input = 1000;
 
 	od_tuners->powersave_bias = input;
-	ondemand_powersave_bias_init();
+
+	list_for_each_entry(policy_dbs, &dbs_data->policy_dbs_list, list)
+		ondemand_powersave_bias_init(policy_dbs->policy);
+
 	return count;
 }
 
@@ -398,11 +395,10 @@ static void od_exit(struct dbs_data *dbs
 
 static void od_start(struct cpufreq_policy *policy)
 {
-	unsigned int cpu = policy->cpu;
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
+	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, policy->cpu);
 
 	dbs_info->sample_type = OD_NORMAL_SAMPLE;
-	ondemand_powersave_bias_init_cpu(cpu);
+	ondemand_powersave_bias_init(policy);
 }
 
 define_get_cpu_dbs_routines(od_cpu_dbs_info);

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 8/12] cpufreq: governor: Make governor private data per-policy
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
                   ` (6 preceding siblings ...)
  2016-02-18  1:28 ` [PATCH 7/12] cpufreq: ondemand: Rework the handling of powersave bias updates Rafael J. Wysocki
@ 2016-02-18  1:30 ` Rafael J. Wysocki
  2016-02-18  6:03   ` Viresh Kumar
  2016-02-18  1:31 ` [PATCH 9/12] cpufreq: governor: Move per-CPU data to the common code Rafael J. Wysocki
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:30 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Some fields in struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s
are only used for a limited set of CPUs.  Namely, if a policy is
shared between multiple CPUs, those fields will only be used for one
of them (policy->cpu).  This means that they really are per-policy
rather than per-CPU and holding room for them in per-CPU data
structures is generally wasteful.  Also moving those fields into
per-policy data structures will allow some significant simplifications
to be made going forward.

For this reason, introduce struct cs_policy_dbs_info and
struct od_policy_dbs_info to hold those fields.  Define each of the
new structures as an extension of struct policy_dbs_info (such that
struct policy_dbs_info is embedded in each of them) and introduce
new ->alloc and ->free governor callbacks to allocate and free
those structures, respectively, such that ->alloc() will return
a pointer to the struct policy_dbs_info embedded in the allocated
data structure and ->free() will take that pointer as its argument.

With that, modify the code accessing the data fields in question
in per-CPU data objects to look for them in the new structures
via the struct policy_dbs_info pointer available to it and drop
them from struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_conservative.c |   34 ++++++++++++++++++++++--
 drivers/cpufreq/cpufreq_governor.c     |    7 ++---
 drivers/cpufreq/cpufreq_governor.h     |    9 +-----
 drivers/cpufreq/cpufreq_ondemand.c     |   45 ++++++++++++++++++++++++++-------
 4 files changed, 71 insertions(+), 24 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_conservative.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_conservative.c
+++ linux-pm/drivers/cpufreq/cpufreq_conservative.c
@@ -14,6 +14,17 @@
 #include <linux/slab.h>
 #include "cpufreq_governor.h"
 
+struct cs_policy_dbs_info {
+	struct policy_dbs_info policy_dbs;
+	unsigned int down_skip;
+	unsigned int requested_freq;
+};
+
+static inline struct cs_policy_dbs_info *to_dbs_info(struct policy_dbs_info *policy_dbs)
+{
+	return container_of(policy_dbs, struct cs_policy_dbs_info, policy_dbs);
+}
+
 /* Conservative governor macros */
 #define DEF_FREQUENCY_UP_THRESHOLD		(80)
 #define DEF_FREQUENCY_DOWN_THRESHOLD		(20)
@@ -48,8 +59,8 @@ static inline unsigned int get_freq_targ
  */
 static unsigned int cs_dbs_timer(struct cpufreq_policy *policy)
 {
-	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, policy->cpu);
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
+	struct cs_policy_dbs_info *dbs_info = to_dbs_info(policy_dbs);
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 	unsigned int load = dbs_update(policy);
@@ -238,6 +249,19 @@ static struct attribute *cs_attributes[]
 
 /************************** sysfs end ************************/
 
+static struct policy_dbs_info *cs_alloc(void)
+{
+	struct cs_policy_dbs_info *dbs_info;
+
+	dbs_info = kzalloc(sizeof(*dbs_info), GFP_KERNEL);
+	return dbs_info ? &dbs_info->policy_dbs : NULL;
+}
+
+static void cs_free(struct policy_dbs_info *policy_dbs)
+{
+	kfree(to_dbs_info(policy_dbs));
+}
+
 static int cs_init(struct dbs_data *dbs_data, bool notify)
 {
 	struct cs_dbs_tuners *tuners;
@@ -276,7 +300,7 @@ static void cs_exit(struct dbs_data *dbs
 
 static void cs_start(struct cpufreq_policy *policy)
 {
-	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, policy->cpu);
+	struct cs_policy_dbs_info *dbs_info = to_dbs_info(policy->governor_data);
 
 	dbs_info->down_skip = 0;
 	dbs_info->requested_freq = policy->cur;
@@ -294,6 +318,8 @@ static struct dbs_governor cs_dbs_gov =
 	.kobj_type = { .default_attrs = cs_attributes },
 	.get_cpu_cdbs = get_cpu_cdbs,
 	.gov_dbs_timer = cs_dbs_timer,
+	.alloc = cs_alloc,
+	.free = cs_free,
 	.init = cs_init,
 	.exit = cs_exit,
 	.start = cs_start,
@@ -305,9 +331,8 @@ static int dbs_cpufreq_notifier(struct n
 				void *data)
 {
 	struct cpufreq_freqs *freq = data;
-	struct cs_cpu_dbs_info_s *dbs_info =
-					&per_cpu(cs_cpu_dbs_info, freq->cpu);
 	struct cpufreq_policy *policy = cpufreq_cpu_get_raw(freq->cpu);
+	struct cs_policy_dbs_info *dbs_info;
 
 	if (!policy)
 		return 0;
@@ -316,6 +341,7 @@ static int dbs_cpufreq_notifier(struct n
 	if (policy->governor != CPU_FREQ_GOV_CONSERVATIVE)
 		return 0;
 
+	dbs_info = to_dbs_info(policy->governor_data);
 	/*
 	 * we only care if our internally tracked freq moves outside the 'valid'
 	 * ranges of frequency available to us otherwise we do not change it
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -157,17 +157,10 @@ struct cpu_dbs_info {
 
 struct od_cpu_dbs_info_s {
 	struct cpu_dbs_info cdbs;
-	struct cpufreq_frequency_table *freq_table;
-	unsigned int freq_lo;
-	unsigned int freq_lo_delay_us;
-	unsigned int freq_hi_delay_us;
-	unsigned int sample_type:1;
 };
 
 struct cs_cpu_dbs_info_s {
 	struct cpu_dbs_info cdbs;
-	unsigned int down_skip;
-	unsigned int requested_freq;
 };
 
 /* Per policy Governors sysfs tunables */
@@ -193,6 +186,8 @@ struct dbs_governor {
 
 	struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
 	unsigned int (*gov_dbs_timer)(struct cpufreq_policy *policy);
+	struct policy_dbs_info *(*alloc)(void);
+	void (*free)(struct policy_dbs_info *policy_dbs);
 	int (*init)(struct dbs_data *dbs_data, bool notify);
 	void (*exit)(struct dbs_data *dbs_data, bool notify);
 	void (*start)(struct cpufreq_policy *policy);
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -18,6 +18,20 @@
 #include <linux/tick.h>
 #include "cpufreq_governor.h"
 
+struct od_policy_dbs_info {
+	struct policy_dbs_info policy_dbs;
+	struct cpufreq_frequency_table *freq_table;
+	unsigned int freq_lo;
+	unsigned int freq_lo_delay_us;
+	unsigned int freq_hi_delay_us;
+	unsigned int sample_type:1;
+};
+
+static inline struct od_policy_dbs_info *to_dbs_info(struct policy_dbs_info *policy_dbs)
+{
+	return container_of(policy_dbs, struct od_policy_dbs_info, policy_dbs);
+}
+
 /* On-demand governor macros */
 #define DEF_FREQUENCY_UP_THRESHOLD		(80)
 #define DEF_SAMPLING_DOWN_FACTOR		(1)
@@ -69,9 +83,8 @@ static unsigned int generic_powersave_bi
 	unsigned int freq_hi, freq_lo;
 	unsigned int index = 0;
 	unsigned int delay_hi_us;
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info,
-						   policy->cpu);
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
+	struct od_policy_dbs_info *dbs_info = to_dbs_info(policy_dbs);
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 
@@ -114,10 +127,9 @@ static unsigned int generic_powersave_bi
 
 static void ondemand_powersave_bias_init(struct cpufreq_policy *policy)
 {
-	unsigned int cpu = policy->cpu;
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
+	struct od_policy_dbs_info *dbs_info = to_dbs_info(policy->governor_data);
 
-	dbs_info->freq_table = cpufreq_frequency_get_table(cpu);
+	dbs_info->freq_table = cpufreq_frequency_get_table(policy->cpu);
 	dbs_info->freq_lo = 0;
 }
 
@@ -144,8 +156,8 @@ static void dbs_freq_increase(struct cpu
  */
 static void od_update(struct cpufreq_policy *policy)
 {
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, policy->cpu);
-	struct policy_dbs_info *policy_dbs = dbs_info->cdbs.policy_dbs;
+	struct policy_dbs_info *policy_dbs = policy->governor_data;
+	struct od_policy_dbs_info *dbs_info = to_dbs_info(policy_dbs);
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	unsigned int load = dbs_update(policy);
@@ -182,7 +194,7 @@ static unsigned int od_dbs_timer(struct
 {
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, policy->cpu);
+	struct od_policy_dbs_info *dbs_info = to_dbs_info(policy_dbs);
 	int sample_type = dbs_info->sample_type;
 
 	/* Common NORMAL_SAMPLE setup */
@@ -347,6 +359,19 @@ static struct attribute *od_attributes[]
 
 /************************** sysfs end ************************/
 
+static struct policy_dbs_info *od_alloc(void)
+{
+	struct od_policy_dbs_info *dbs_info;
+
+	dbs_info = kzalloc(sizeof(*dbs_info), GFP_KERNEL);
+	return dbs_info ? &dbs_info->policy_dbs : NULL;
+}
+
+static void od_free(struct policy_dbs_info *policy_dbs)
+{
+	kfree(to_dbs_info(policy_dbs));
+}
+
 static int od_init(struct dbs_data *dbs_data, bool notify)
 {
 	struct od_dbs_tuners *tuners;
@@ -395,7 +420,7 @@ static void od_exit(struct dbs_data *dbs
 
 static void od_start(struct cpufreq_policy *policy)
 {
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, policy->cpu);
+	struct od_policy_dbs_info *dbs_info = to_dbs_info(policy->governor_data);
 
 	dbs_info->sample_type = OD_NORMAL_SAMPLE;
 	ondemand_powersave_bias_init(policy);
@@ -417,6 +442,8 @@ static struct dbs_governor od_dbs_gov =
 	.kobj_type = { .default_attrs = od_attributes },
 	.get_cpu_cdbs = get_cpu_cdbs,
 	.gov_dbs_timer = od_dbs_timer,
+	.alloc = od_alloc,
+	.free = od_free,
 	.init = od_init,
 	.exit = od_exit,
 	.start = od_start,
Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -385,8 +385,8 @@ static struct policy_dbs_info *alloc_pol
 	struct policy_dbs_info *policy_dbs;
 	int j;
 
-	/* Allocate memory for the common information for policy->cpus */
-	policy_dbs = kzalloc(sizeof(*policy_dbs), GFP_KERNEL);
+	/* Allocate memory for per-policy governor data. */
+	policy_dbs = gov->alloc();
 	if (!policy_dbs)
 		return NULL;
 
@@ -421,7 +421,7 @@ static void free_policy_dbs_info(struct
 		j_cdbs->policy_dbs = NULL;
 		j_cdbs->update_util.func = NULL;
 	}
-	kfree(policy_dbs);
+	gov->free(policy_dbs);
 }
 
 static int cpufreq_governor_init(struct cpufreq_policy *policy)
@@ -582,7 +582,6 @@ static int cpufreq_governor_start(struct
 static int cpufreq_governor_stop(struct cpufreq_policy *policy)
 {
 	gov_cancel_work(policy);
-
 	return 0;
 }
 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 9/12] cpufreq: governor: Move per-CPU data to the common code
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
                   ` (7 preceding siblings ...)
  2016-02-18  1:30 ` [PATCH 8/12] cpufreq: governor: Make governor private data per-policy Rafael J. Wysocki
@ 2016-02-18  1:31 ` Rafael J. Wysocki
  2016-02-18  6:08   ` Viresh Kumar
  2016-02-18  1:32 ` [PATCH 10/12] cpufreq: governor: Relocate definitions of tuners structures Rafael J. Wysocki
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:31 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

After previous changes there is only one piece of code in the
ondemand governor making references to per-CPU data structures,
but it can be easily modified to avoid doing that, so modify it
accordingly and move the definition of per-CPU data used by the
ondemand and conservative governors to the common code.  Next,
change that code to access the per-CPU data structures directly
rather than via a governor callback.

This causes the ->get_cpu_cdbs governor callback to become
unnecessary, so drop it along with the macro and function
definitions related to it.

Finally, drop the definitions of struct od_cpu_dbs_info_s and
struct cs_cpu_dbs_info_s that aren't necessary any more.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_conservative.c |    9 +--------
 drivers/cpufreq/cpufreq_governor.c     |   24 ++++++++++--------------
 drivers/cpufreq/cpufreq_governor.h     |   18 +-----------------
 drivers/cpufreq/cpufreq_ondemand.c     |   26 ++++++++++----------------
 4 files changed, 22 insertions(+), 55 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -22,6 +22,8 @@
 
 #include "cpufreq_governor.h"
 
+static DEFINE_PER_CPU(struct cpu_dbs_info, cpu_dbs);
+
 DEFINE_MUTEX(dbs_data_mutex);
 EXPORT_SYMBOL_GPL(dbs_data_mutex);
 
@@ -82,7 +84,6 @@ EXPORT_SYMBOL_GPL(store_sampling_rate);
 
 /**
  * gov_update_cpu_data - Update CPU load data.
- * @gov: Governor whose data is to be updated.
  * @dbs_data: Top-level governor data pointer.
  *
  * Update CPU load data for all CPUs in the domain governed by @dbs_data
@@ -91,7 +92,7 @@ EXPORT_SYMBOL_GPL(store_sampling_rate);
  *
  * Call under the @dbs_data mutex.
  */
-void gov_update_cpu_data(struct dbs_governor *gov, struct dbs_data *dbs_data)
+void gov_update_cpu_data(struct dbs_data *dbs_data)
 {
 	struct policy_dbs_info *policy_dbs;
 
@@ -99,7 +100,7 @@ void gov_update_cpu_data(struct dbs_gove
 		unsigned int j;
 
 		for_each_cpu(j, policy_dbs->policy->cpus) {
-			struct cpu_dbs_info *j_cdbs = gov->get_cpu_cdbs(j);
+			struct cpu_dbs_info *j_cdbs = &per_cpu(cpu_dbs, j);
 
 			j_cdbs->prev_cpu_idle = get_cpu_idle_time(j, &j_cdbs->prev_cpu_wall,
 								  dbs_data->io_is_busy);
@@ -164,7 +165,6 @@ static const struct sysfs_ops governor_s
 
 unsigned int dbs_update(struct cpufreq_policy *policy)
 {
-	struct dbs_governor *gov = dbs_governor_of(policy);
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	unsigned int ignore_nice = dbs_data->ignore_nice_load;
@@ -187,13 +187,11 @@ unsigned int dbs_update(struct cpufreq_p
 
 	/* Get Absolute Load */
 	for_each_cpu(j, policy->cpus) {
-		struct cpu_dbs_info *j_cdbs;
+		struct cpu_dbs_info *j_cdbs = &per_cpu(cpu_dbs, j);
 		u64 cur_wall_time, cur_idle_time;
 		unsigned int idle_time, wall_time;
 		unsigned int load;
 
-		j_cdbs = gov->get_cpu_cdbs(j);
-
 		cur_idle_time = get_cpu_idle_time(j, &cur_wall_time, io_busy);
 
 		wall_time = cur_wall_time - j_cdbs->prev_cpu_wall;
@@ -268,14 +266,13 @@ void gov_set_update_util(struct policy_d
 			 unsigned int delay_us)
 {
 	struct cpufreq_policy *policy = policy_dbs->policy;
-	struct dbs_governor *gov = dbs_governor_of(policy);
 	int cpu;
 
 	gov_update_sample_delay(policy_dbs, delay_us);
 	policy_dbs->last_sample_time = 0;
 
 	for_each_cpu(cpu, policy->cpus) {
-		struct cpu_dbs_info *cdbs = gov->get_cpu_cdbs(cpu);
+		struct cpu_dbs_info *cdbs = &per_cpu(cpu_dbs, cpu);
 
 		cpufreq_set_update_util_data(cpu, &cdbs->update_util);
 	}
@@ -398,7 +395,7 @@ static struct policy_dbs_info *alloc_pol
 
 	/* Set policy_dbs for all CPUs, online+offline */
 	for_each_cpu(j, policy->related_cpus) {
-		struct cpu_dbs_info *j_cdbs = gov->get_cpu_cdbs(j);
+		struct cpu_dbs_info *j_cdbs = &per_cpu(cpu_dbs, j);
 
 		j_cdbs->policy_dbs = policy_dbs;
 		j_cdbs->update_util.func = dbs_update_util_handler;
@@ -409,14 +406,13 @@ static struct policy_dbs_info *alloc_pol
 static void free_policy_dbs_info(struct cpufreq_policy *policy,
 				 struct dbs_governor *gov)
 {
-	struct cpu_dbs_info *cdbs = gov->get_cpu_cdbs(policy->cpu);
-	struct policy_dbs_info *policy_dbs = cdbs->policy_dbs;
+	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	int j;
 
 	mutex_destroy(&policy_dbs->timer_mutex);
 
 	for_each_cpu(j, policy->related_cpus) {
-		struct cpu_dbs_info *j_cdbs = gov->get_cpu_cdbs(j);
+		struct cpu_dbs_info *j_cdbs = &per_cpu(cpu_dbs, j);
 
 		j_cdbs->policy_dbs = NULL;
 		j_cdbs->update_util.func = NULL;
@@ -561,7 +557,7 @@ static int cpufreq_governor_start(struct
 	io_busy = dbs_data->io_is_busy;
 
 	for_each_cpu(j, policy->cpus) {
-		struct cpu_dbs_info *j_cdbs = gov->get_cpu_cdbs(j);
+		struct cpu_dbs_info *j_cdbs = &per_cpu(cpu_dbs, j);
 		unsigned int prev_load;
 
 		j_cdbs->prev_cpu_idle = get_cpu_idle_time(j, &j_cdbs->prev_cpu_wall, io_busy);
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -41,9 +41,6 @@ static inline struct od_policy_dbs_info
 #define MIN_FREQUENCY_UP_THRESHOLD		(11)
 #define MAX_FREQUENCY_UP_THRESHOLD		(100)
 
-static DEFINE_PER_CPU(struct od_cpu_dbs_info_s, od_cpu_dbs_info);
-
-static struct dbs_governor od_dbs_gov;
 static struct od_ops od_ops;
 
 static unsigned int default_powersave_bias;
@@ -235,7 +232,7 @@ static ssize_t store_io_is_busy(struct d
 	dbs_data->io_is_busy = !!input;
 
 	/* we need to re-evaluate prev_cpu_idle */
-	gov_update_cpu_data(&od_dbs_gov, dbs_data);
+	gov_update_cpu_data(dbs_data);
 
 	return count;
 }
@@ -302,7 +299,7 @@ static ssize_t store_ignore_nice_load(st
 	dbs_data->ignore_nice_load = input;
 
 	/* we need to re-evaluate prev_cpu_idle */
-	gov_update_cpu_data(&od_dbs_gov, dbs_data);
+	gov_update_cpu_data(dbs_data);
 
 	return count;
 }
@@ -426,8 +423,6 @@ static void od_start(struct cpufreq_poli
 	ondemand_powersave_bias_init(policy);
 }
 
-define_get_cpu_dbs_routines(od_cpu_dbs_info);
-
 static struct od_ops od_ops = {
 	.powersave_bias_target = generic_powersave_bias_target,
 };
@@ -440,7 +435,6 @@ static struct dbs_governor od_dbs_gov =
 		.owner = THIS_MODULE,
 	},
 	.kobj_type = { .default_attrs = od_attributes },
-	.get_cpu_cdbs = get_cpu_cdbs,
 	.gov_dbs_timer = od_dbs_timer,
 	.alloc = od_alloc,
 	.free = od_free,
@@ -453,9 +447,6 @@ static struct dbs_governor od_dbs_gov =
 
 static void od_set_powersave_bias(unsigned int powersave_bias)
 {
-	struct cpufreq_policy *policy;
-	struct dbs_data *dbs_data;
-	struct od_dbs_tuners *od_tuners;
 	unsigned int cpu;
 	cpumask_t done;
 
@@ -464,21 +455,24 @@ static void od_set_powersave_bias(unsign
 
 	get_online_cpus();
 	for_each_online_cpu(cpu) {
+		struct cpufreq_policy *policy;
 		struct policy_dbs_info *policy_dbs;
+		struct dbs_data *dbs_data;
+		struct od_dbs_tuners *od_tuners;
 
 		if (cpumask_test_cpu(cpu, &done))
 			continue;
 
-		policy_dbs = per_cpu(od_cpu_dbs_info, cpu).cdbs.policy_dbs;
+		policy = cpufreq_cpu_get_raw(cpu);
+		if (!policy || policy->governor != CPU_FREQ_GOV_ONDEMAND)
+			continue;
+
+		policy_dbs = policy->governor_data;
 		if (!policy_dbs)
 			continue;
 
-		policy = policy_dbs->policy;
 		cpumask_or(&done, &done, policy->cpus);
 
-		if (policy->governor != CPU_FREQ_GOV_ONDEMAND)
-			continue;
-
 		dbs_data = policy_dbs->dbs_data;
 		od_tuners = dbs_data->tuners;
 		od_tuners->powersave_bias = default_powersave_bias;
Index: linux-pm/drivers/cpufreq/cpufreq_conservative.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_conservative.c
+++ linux-pm/drivers/cpufreq/cpufreq_conservative.c
@@ -32,10 +32,6 @@ static inline struct cs_policy_dbs_info
 #define DEF_SAMPLING_DOWN_FACTOR		(1)
 #define MAX_SAMPLING_DOWN_FACTOR		(10)
 
-static DEFINE_PER_CPU(struct cs_cpu_dbs_info_s, cs_cpu_dbs_info);
-
-static struct dbs_governor cs_dbs_gov;
-
 static inline unsigned int get_freq_target(struct cs_dbs_tuners *cs_tuners,
 					   struct cpufreq_policy *policy)
 {
@@ -193,7 +189,7 @@ static ssize_t store_ignore_nice_load(st
 	dbs_data->ignore_nice_load = input;
 
 	/* we need to re-evaluate prev_cpu_idle */
-	gov_update_cpu_data(&cs_dbs_gov, dbs_data);
+	gov_update_cpu_data(dbs_data);
 
 	return count;
 }
@@ -306,8 +302,6 @@ static void cs_start(struct cpufreq_poli
 	dbs_info->requested_freq = policy->cur;
 }
 
-define_get_cpu_dbs_routines(cs_cpu_dbs_info);
-
 static struct dbs_governor cs_dbs_gov = {
 	.gov = {
 		.name = "conservative",
@@ -316,7 +310,6 @@ static struct dbs_governor cs_dbs_gov =
 		.owner = THIS_MODULE,
 	},
 	.kobj_type = { .default_attrs = cs_attributes },
-	.get_cpu_cdbs = get_cpu_cdbs,
 	.gov_dbs_timer = cs_dbs_timer,
 	.alloc = cs_alloc,
 	.free = cs_free,
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -41,13 +41,6 @@
 /* Ondemand Sampling types */
 enum {OD_NORMAL_SAMPLE, OD_SUB_SAMPLE};
 
-/* create helper routines */
-#define define_get_cpu_dbs_routines(_dbs_info)				\
-static struct cpu_dbs_info *get_cpu_cdbs(int cpu)			\
-{									\
-	return &per_cpu(_dbs_info, cpu).cdbs;				\
-}
-
 /*
  * Abbreviations:
  * dbs: used as a shortform for demand based switching It helps to keep variable
@@ -155,14 +148,6 @@ struct cpu_dbs_info {
 	struct policy_dbs_info *policy_dbs;
 };
 
-struct od_cpu_dbs_info_s {
-	struct cpu_dbs_info cdbs;
-};
-
-struct cs_cpu_dbs_info_s {
-	struct cpu_dbs_info cdbs;
-};
-
 /* Per policy Governors sysfs tunables */
 struct od_dbs_tuners {
 	unsigned int powersave_bias;
@@ -184,7 +169,6 @@ struct dbs_governor {
 	 */
 	struct dbs_data *gdbs_data;
 
-	struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
 	unsigned int (*gov_dbs_timer)(struct cpufreq_policy *policy);
 	struct policy_dbs_info *(*alloc)(void);
 	void (*free)(struct policy_dbs_info *policy_dbs);
@@ -213,5 +197,5 @@ void od_register_powersave_bias_handler(
 void od_unregister_powersave_bias_handler(void);
 ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
 			    size_t count);
-void gov_update_cpu_data(struct dbs_governor *gov, struct dbs_data *dbs_data);
+void gov_update_cpu_data(struct dbs_data *dbs_data);
 #endif /* _CPUFREQ_GOVERNOR_H */

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 10/12] cpufreq: governor: Relocate definitions of tuners structures
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
                   ` (8 preceding siblings ...)
  2016-02-18  1:31 ` [PATCH 9/12] cpufreq: governor: Move per-CPU data to the common code Rafael J. Wysocki
@ 2016-02-18  1:32 ` Rafael J. Wysocki
  2016-02-18  6:09   ` Viresh Kumar
  2016-02-18  1:33 ` [PATCH 11/12] cpufreq: governor: Make dbs_data_mutex static Rafael J. Wysocki
  2016-02-18  1:38 ` [PATCH 12/12] cpufreq: governor: Narrow down the dbs_data_mutex coverage Rafael J. Wysocki
  11 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:32 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Move the definitions of struct od_dbs_tuners and struct cs_dbs_tuners
from the common governor header to the ondemand and conservative
governor code, respectively, as they don't need to be in the common
header any more.

No functional changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_conservative.c |    5 +++++
 drivers/cpufreq/cpufreq_governor.h     |   10 ----------
 drivers/cpufreq/cpufreq_ondemand.c     |    4 ++++
 3 files changed, 9 insertions(+), 10 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_conservative.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_conservative.c
+++ linux-pm/drivers/cpufreq/cpufreq_conservative.c
@@ -25,6 +25,11 @@ static inline struct cs_policy_dbs_info
 	return container_of(policy_dbs, struct cs_policy_dbs_info, policy_dbs);
 }
 
+struct cs_dbs_tuners {
+	unsigned int down_threshold;
+	unsigned int freq_step;
+};
+
 /* Conservative governor macros */
 #define DEF_FREQUENCY_UP_THRESHOLD		(80)
 #define DEF_FREQUENCY_DOWN_THRESHOLD		(20)
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -148,16 +148,6 @@ struct cpu_dbs_info {
 	struct policy_dbs_info *policy_dbs;
 };
 
-/* Per policy Governors sysfs tunables */
-struct od_dbs_tuners {
-	unsigned int powersave_bias;
-};
-
-struct cs_dbs_tuners {
-	unsigned int down_threshold;
-	unsigned int freq_step;
-};
-
 /* Common Governor data across policies */
 struct dbs_governor {
 	struct cpufreq_governor gov;
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -32,6 +32,10 @@ static inline struct od_policy_dbs_info
 	return container_of(policy_dbs, struct od_policy_dbs_info, policy_dbs);
 }
 
+struct od_dbs_tuners {
+	unsigned int powersave_bias;
+};
+
 /* On-demand governor macros */
 #define DEF_FREQUENCY_UP_THRESHOLD		(80)
 #define DEF_SAMPLING_DOWN_FACTOR		(1)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 11/12] cpufreq: governor: Make dbs_data_mutex static
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
                   ` (9 preceding siblings ...)
  2016-02-18  1:32 ` [PATCH 10/12] cpufreq: governor: Relocate definitions of tuners structures Rafael J. Wysocki
@ 2016-02-18  1:33 ` Rafael J. Wysocki
  2016-02-18  6:09   ` Viresh Kumar
  2016-02-18  1:38 ` [PATCH 12/12] cpufreq: governor: Narrow down the dbs_data_mutex coverage Rafael J. Wysocki
  11 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:33 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

That mutex is only used by cpufreq_governor_dbs() and it doesn't
need to be exported to modules, so make it static and drop the
export incantation.

No functional changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_governor.c |    3 +--
 drivers/cpufreq/cpufreq_governor.h |    1 -
 2 files changed, 1 insertion(+), 3 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -24,8 +24,7 @@
 
 static DEFINE_PER_CPU(struct cpu_dbs_info, cpu_dbs);
 
-DEFINE_MUTEX(dbs_data_mutex);
-EXPORT_SYMBOL_GPL(dbs_data_mutex);
+static DEFINE_MUTEX(dbs_data_mutex);
 
 /* Common sysfs tunables */
 /**
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -178,7 +178,6 @@ struct od_ops {
 			unsigned int freq_next, unsigned int relation);
 };
 
-extern struct mutex dbs_data_mutex;
 unsigned int dbs_update(struct cpufreq_policy *policy);
 int cpufreq_governor_dbs(struct cpufreq_policy *policy, unsigned int event);
 void od_register_powersave_bias_handler(unsigned int (*f)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 12/12] cpufreq: governor: Narrow down the dbs_data_mutex coverage
  2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
                   ` (10 preceding siblings ...)
  2016-02-18  1:33 ` [PATCH 11/12] cpufreq: governor: Make dbs_data_mutex static Rafael J. Wysocki
@ 2016-02-18  1:38 ` Rafael J. Wysocki
  2016-02-18  6:20   ` Viresh Kumar
  11 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18  1:38 UTC (permalink / raw)
  To: Linux PM list; +Cc: Linux Kernel Mailing List, Viresh Kumar

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Since cpufreq_governor_dbs() is now always called with policy->rwsem
held, it cannot be executed twice in parallel for the same policy.
Thus it is not necessary to hold dbs_data_mutex around the invocations
of cpufreq_governor_start/stop/limits() from it as those functions
never modify any data that can be shared between different policies.

However, cpufreq_governor_dbs() may be executed twice in parallal
for different policies using the same gov->gdbs_data object and
dbs_data_mutex is still necessary to protect that object against
concurrent updates.

For this reason, narrow down the dbs_data_mutex locking to
cpufreq_governor_init/exit() where it is needed and rename the
mutex to gov_dbs_data_mutex to reflect its purpose.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_governor.c |   53 ++++++++++++++++++-------------------
 1 file changed, 27 insertions(+), 26 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -24,7 +24,7 @@
 
 static DEFINE_PER_CPU(struct cpu_dbs_info, cpu_dbs);
 
-static DEFINE_MUTEX(dbs_data_mutex);
+static DEFINE_MUTEX(gov_dbs_data_mutex);
 
 /* Common sysfs tunables */
 /**
@@ -422,10 +422,10 @@ static void free_policy_dbs_info(struct
 static int cpufreq_governor_init(struct cpufreq_policy *policy)
 {
 	struct dbs_governor *gov = dbs_governor_of(policy);
-	struct dbs_data *dbs_data = gov->gdbs_data;
+	struct dbs_data *dbs_data;
 	struct policy_dbs_info *policy_dbs;
 	unsigned int latency;
-	int ret;
+	int ret = 0;
 
 	/* State should be equivalent to EXIT */
 	if (policy->governor_data)
@@ -435,6 +435,10 @@ static int cpufreq_governor_init(struct
 	if (!policy_dbs)
 		return -ENOMEM;
 
+	/* Protect gov->gdbs_data against concurrent updates. */
+	mutex_lock(&gov_dbs_data_mutex);
+
+	dbs_data = gov->gdbs_data;
 	if (dbs_data) {
 		if (WARN_ON(have_governor_per_policy())) {
 			ret = -EINVAL;
@@ -447,8 +451,7 @@ static int cpufreq_governor_init(struct
 		dbs_data->usage_count++;
 		list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);
 		mutex_unlock(&dbs_data->mutex);
-
-		return 0;
+		goto out;
 	}
 
 	dbs_data = kzalloc(sizeof(*dbs_data), GFP_KERNEL);
@@ -488,10 +491,14 @@ static int cpufreq_governor_init(struct
 	ret = kobject_init_and_add(&dbs_data->kobj, &gov->kobj_type,
 				   get_governor_parent_kobj(policy),
 				   "%s", gov->gov.name);
-	if (!ret)
-		return 0;
+	if (ret)
+		goto err;
 
-	/* Failure, so roll back. */
+out:
+	mutex_unlock(&gov_dbs_data_mutex);
+	return ret;
+
+err:
 	pr_err("cpufreq: Governor initialization failed (dbs_data kobject init error %d)\n", ret);
 
 	policy->governor_data = NULL;
@@ -503,7 +510,7 @@ static int cpufreq_governor_init(struct
 
 free_policy_dbs_info:
 	free_policy_dbs_info(policy, gov);
-	return ret;
+	goto out;
 }
 
 static int cpufreq_governor_exit(struct cpufreq_policy *policy)
@@ -513,6 +520,9 @@ static int cpufreq_governor_exit(struct
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	int count;
 
+	/* Protect gov->gdbs_data against concurrent updates. */
+	mutex_lock(&gov_dbs_data_mutex);
+
 	mutex_lock(&dbs_data->mutex);
 	list_del(&policy_dbs->list);
 	count = --dbs_data->usage_count;
@@ -533,6 +543,8 @@ static int cpufreq_governor_exit(struct
 		policy->governor_data = NULL;
 	}
 
+	mutex_unlock(&gov_dbs_data_mutex);
+
 	free_policy_dbs_info(policy, gov);
 	return 0;
 }
@@ -600,31 +612,20 @@ static int cpufreq_governor_limits(struc
 
 int cpufreq_governor_dbs(struct cpufreq_policy *policy, unsigned int event)
 {
-	int ret = -EINVAL;
-
-	/* Lock governor to block concurrent initialization of governor */
-	mutex_lock(&dbs_data_mutex);
-
 	if (event == CPUFREQ_GOV_POLICY_INIT) {
-		ret = cpufreq_governor_init(policy);
+		return cpufreq_governor_init(policy);
 	} else if (policy->governor_data) {
 		switch (event) {
 		case CPUFREQ_GOV_POLICY_EXIT:
-			ret = cpufreq_governor_exit(policy);
-			break;
+			return cpufreq_governor_exit(policy);
 		case CPUFREQ_GOV_START:
-			ret = cpufreq_governor_start(policy);
-			break;
+			return cpufreq_governor_start(policy);
 		case CPUFREQ_GOV_STOP:
-			ret = cpufreq_governor_stop(policy);
-			break;
+			return cpufreq_governor_stop(policy);
 		case CPUFREQ_GOV_LIMITS:
-			ret = cpufreq_governor_limits(policy);
-			break;
+			return cpufreq_governor_limits(policy);
 		}
 	}
-
-	mutex_unlock(&dbs_data_mutex);
-	return ret;
+	return -EINVAL;
 }
 EXPORT_SYMBOL_GPL(cpufreq_governor_dbs);

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition
  2016-02-18  1:19 ` [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition Rafael J. Wysocki
@ 2016-02-18  5:24   ` Viresh Kumar
  2016-02-18 16:20     ` Rafael J. Wysocki
  2016-02-19  3:09   ` Viresh Kumar
  1 sibling, 1 reply; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  5:24 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:19, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> It is possible for a dbs_data object to be updated after its
> usage counter has become 0.  That may happen if governor_store()
> runs (via a govenor tunable sysfs attribute write) in parallel
> with cpufreq_governor_exit() called for the last cpufreq policy
> associated with the dbs_data in question.  In that case, if
> governor_store() acquires dbs_data->mutex right after
> cpufreq_governor_exit() has released it, the ->store() callback
> invoked by it may operate on dbs_data with no users.  Although
> sysfs will cause the kobject_put() in cpufreq_governor_exit() to
> block until governor_store() has returned, that situation may
> lead to some unexpected results, depending on the implementation
> of the ->store callback, and therefore it should be avoided.
> 
> To that end, modify governor_store() to check the dbs_data's
> usage count before invoking the ->store() callback and return
> an error if it is 0 at that point.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq_governor.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
> +++ linux-pm/drivers/cpufreq/cpufreq_governor.c
> @@ -112,7 +112,7 @@ static ssize_t governor_store(struct kob
>  
>  	mutex_lock(&dbs_data->mutex);
>  
> -	if (gattr->store)
> +	if (dbs_data->usage_count && gattr->store)

That's not gonna be enough. The above lock doesn't guarantee
protection with any such races. And so usage_count can become zero
just after this check.

Btw, we should also kill the gattr->store checks here as well, as we
did it in cpufreq-core.

>  		ret = gattr->store(dbs_data, buf, count);
>  
>  	mutex_unlock(&dbs_data->mutex);

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/12] cpufreq: governor: Move io_is_busy to struct dbs_data
  2016-02-18  1:20 ` [PATCH 2/12] cpufreq: governor: Move io_is_busy to struct dbs_data Rafael J. Wysocki
@ 2016-02-18  5:28   ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  5:28 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:20, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The io_is_busy governor tunable is only used by the ondemand governor
> and is located in the ondemand-specific data structure, but it is
> looked at by the common governor code that has to do ugly things to
> get to that value, so move it to struct dbs_data and modify ondemand
> accordingly.
> 
> Since the conservative governor never touches that field, it will
> be always 0 for that governor and it won't have any effect on the
> results of computations in that case.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq_governor.c |   27 +++++++++------------------
>  drivers/cpufreq/cpufreq_governor.h |    2 +-
>  drivers/cpufreq/cpufreq_ondemand.c |   12 +++++-------
>  3 files changed, 15 insertions(+), 26 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/12] cpufreq: governor: Add a ->start callback for governors
  2016-02-18  1:21 ` [PATCH 3/12] cpufreq: governor: Add a ->start callback for governors Rafael J. Wysocki
@ 2016-02-18  5:36   ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  5:36 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:21, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> To avoid having to check the governor type explicitly in the common
> code in order to initialize data structures specific to the governor
> type properly, add a ->start callback to struct dbs_governor and
> use it to initialize those data structures for the ondemand and
> conservative governors.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq_conservative.c |    9 +++++++++
>  drivers/cpufreq/cpufreq_governor.c     |   16 ++--------------
>  drivers/cpufreq/cpufreq_governor.h     |    1 +
>  drivers/cpufreq/cpufreq_ondemand.c     |   10 ++++++++++
>  4 files changed, 22 insertions(+), 14 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 4/12] cpufreq: governor: Drop unused governor callback and data fields
  2016-02-18  1:22 ` [PATCH 4/12] cpufreq: governor: Drop unused governor callback and data fields Rafael J. Wysocki
@ 2016-02-18  5:37   ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  5:37 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:22, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> After some previous changes, the ->get_cpu_dbs_info_s governor
> callback and the "governor" field in struct dbs_governor (whose
> value represents the governor type) are not used any more, so
> drop them.
> 
> Also drop the unused gov_ops field from struct dbs_governor.
> 
> No functional changes.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq_conservative.c |    2 --
>  drivers/cpufreq/cpufreq_governor.h     |   15 +--------------
>  drivers/cpufreq/cpufreq_ondemand.c     |    3 ---
>  3 files changed, 1 insertion(+), 19 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 5/12] cpufreq: ondemand: Drop one more callback from struct od_ops
  2016-02-18  1:24 ` [PATCH 5/12] cpufreq: ondemand: Drop one more callback from struct od_ops Rafael J. Wysocki
@ 2016-02-18  5:38   ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  5:38 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:24, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The ->powersave_bias_init_cpu callback in struct od_ops is only used
> in one place and that invocation may be replaced with a direct call
> to the function pointed to by that callback, so change the code
> accordingly and drop the callback.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq_governor.h |    1 -
>  drivers/cpufreq/cpufreq_ondemand.c |    3 +--
>  2 files changed, 1 insertion(+), 3 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 6/12] cpufreq: governor: Fix CPU load information updates via ->store
  2016-02-18  1:26 ` [PATCH 6/12] cpufreq: governor: Fix CPU load information updates via ->store Rafael J. Wysocki
@ 2016-02-18  5:44   ` Viresh Kumar
  2016-02-18 17:37     ` Rafael J. Wysocki
  0 siblings, 1 reply; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  5:44 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:26, Rafael J. Wysocki wrote:
> Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
> +++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
> @@ -29,6 +29,7 @@
>  
>  static DEFINE_PER_CPU(struct od_cpu_dbs_info_s, od_cpu_dbs_info);
>  
> +static struct dbs_governor od_dbs_gov;
>  static struct od_ops od_ops;
>  
>  static unsigned int default_powersave_bias;
> @@ -222,7 +223,6 @@ static ssize_t store_io_is_busy(struct d
>  {
>  	unsigned int input;
>  	int ret;
> -	unsigned int j;
>  
>  	ret = sscanf(buf, "%u", &input);
>  	if (ret != 1)
> @@ -230,12 +230,8 @@ static ssize_t store_io_is_busy(struct d
>  	dbs_data->io_is_busy = !!input;
>  
>  	/* we need to re-evaluate prev_cpu_idle */
> -	for_each_online_cpu(j) {
> -		struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info,
> -									j);
> -		dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
> -			&dbs_info->cdbs.prev_cpu_wall, dbs_data->io_is_busy);
> -	}

We weren't doing ignore_nice_load check here, but will be done after
this patch. Will that make a different? And then this should be
mentioned in the log as well ?

Apart from that.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 7/12] cpufreq: ondemand: Rework the handling of powersave bias updates
  2016-02-18  1:28 ` [PATCH 7/12] cpufreq: ondemand: Rework the handling of powersave bias updates Rafael J. Wysocki
@ 2016-02-18  5:53   ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  5:53 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:28, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The ondemand_powersave_bias_init() function used for resetting data
> fields related to the powersave bias tunable of the ondemand governor
> works by walking all of the online CPUs in the system and updating the
> od_cpu_dbs_info_s structures for all of them.
> 
> However, if governor tunables are per policy, the update should not
> touch the CPUs that are not associated with the given dbs_data.
> 
> Moreover, since the data fields in question are only ever used for
> policy->cpu in each policy governed by ondemand, the update can be
> limited to those specific CPUs.

Now that I am looking at struct **_cpu_dbs_info_s definitions, I think
we need to change the design of the structures a bit. Only cdbs is
something that is per-cpu and everything else is per-policy. And so we
will be able to move the per-cpu structures present in individual
governors into cpufreq_governor.c and save some memory.

> Rework the code to take the above observations into account.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq_ondemand.c |   30 +++++++++++++-----------------
>  1 file changed, 13 insertions(+), 17 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
> +++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
> @@ -34,14 +34,6 @@ static struct od_ops od_ops;
>  
>  static unsigned int default_powersave_bias;
>  
> -static void ondemand_powersave_bias_init_cpu(int cpu)
> -{
> -	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
> -
> -	dbs_info->freq_table = cpufreq_frequency_get_table(cpu);
> -	dbs_info->freq_lo = 0;
> -}
> -
>  /*
>   * Not all CPUs want IO time to be accounted as busy; this depends on how
>   * efficient idling at a higher frequency/voltage is.
> @@ -120,12 +112,13 @@ static unsigned int generic_powersave_bi
>  	return freq_hi;
>  }
>  
> -static void ondemand_powersave_bias_init(void)
> +static void ondemand_powersave_bias_init(struct cpufreq_policy *policy)
>  {
> -	int i;
> -	for_each_online_cpu(i) {
> -		ondemand_powersave_bias_init_cpu(i);
> -	}
> +	unsigned int cpu = policy->cpu;
> +	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
> +
> +	dbs_info->freq_table = cpufreq_frequency_get_table(cpu);
> +	dbs_info->freq_lo = 0;
>  }
>  
>  static void dbs_freq_increase(struct cpufreq_policy *policy, unsigned int freq)
> @@ -306,6 +299,7 @@ static ssize_t store_powersave_bias(stru
>  		size_t count)
>  {
>  	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
> +	struct policy_dbs_info *policy_dbs;
>  	unsigned int input;
>  	int ret;
>  	ret = sscanf(buf, "%u", &input);
> @@ -317,7 +311,10 @@ static ssize_t store_powersave_bias(stru
>  		input = 1000;
>  
>  	od_tuners->powersave_bias = input;
> -	ondemand_powersave_bias_init();
> +
> +	list_for_each_entry(policy_dbs, &dbs_data->policy_dbs_list, list)
> +		ondemand_powersave_bias_init(policy_dbs->policy);
> +
>  	return count;
>  }
>  
> @@ -398,11 +395,10 @@ static void od_exit(struct dbs_data *dbs
>  
>  static void od_start(struct cpufreq_policy *policy)
>  {
> -	unsigned int cpu = policy->cpu;
> -	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
> +	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, policy->cpu);
>  
>  	dbs_info->sample_type = OD_NORMAL_SAMPLE;
> -	ondemand_powersave_bias_init_cpu(cpu);
> +	ondemand_powersave_bias_init(policy);
>  }
>  
>  define_get_cpu_dbs_routines(od_cpu_dbs_info);

I agree with the patch, but because we have all these fields per-cpu,
I feel a bit scared while depending on policy->cpu for everything.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 8/12] cpufreq: governor: Make governor private data per-policy
  2016-02-18  1:30 ` [PATCH 8/12] cpufreq: governor: Make governor private data per-policy Rafael J. Wysocki
@ 2016-02-18  6:03   ` Viresh Kumar
  2016-02-18 17:56     ` [PATCH v2 " Rafael J. Wysocki
  0 siblings, 1 reply; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  6:03 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:30, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Some fields in struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s
> are only used for a limited set of CPUs.  Namely, if a policy is
> shared between multiple CPUs, those fields will only be used for one
> of them (policy->cpu).  This means that they really are per-policy
> rather than per-CPU and holding room for them in per-CPU data
> structures is generally wasteful.  Also moving those fields into
> per-policy data structures will allow some significant simplifications
> to be made going forward.
> 
> For this reason, introduce struct cs_policy_dbs_info and
> struct od_policy_dbs_info to hold those fields.  Define each of the
> new structures as an extension of struct policy_dbs_info (such that
> struct policy_dbs_info is embedded in each of them) and introduce
> new ->alloc and ->free governor callbacks to allocate and free
> those structures, respectively, such that ->alloc() will return
> a pointer to the struct policy_dbs_info embedded in the allocated
> data structure and ->free() will take that pointer as its argument.
> 
> With that, modify the code accessing the data fields in question
> in per-CPU data objects to look for them in the new structures
> via the struct policy_dbs_info pointer available to it and drop
> them from struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s.

Fantastic, that's what I just suggested in the previous patch :)


Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 9/12] cpufreq: governor: Move per-CPU data to the common code
  2016-02-18  1:31 ` [PATCH 9/12] cpufreq: governor: Move per-CPU data to the common code Rafael J. Wysocki
@ 2016-02-18  6:08   ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  6:08 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:31, Rafael J. Wysocki wrote:
> @@ -464,21 +455,24 @@ static void od_set_powersave_bias(unsign
>  
>  	get_online_cpus();
>  	for_each_online_cpu(cpu) {
> +		struct cpufreq_policy *policy;
>  		struct policy_dbs_info *policy_dbs;
> +		struct dbs_data *dbs_data;
> +		struct od_dbs_tuners *od_tuners;
>  
>  		if (cpumask_test_cpu(cpu, &done))
>  			continue;
>  
> -		policy_dbs = per_cpu(od_cpu_dbs_info, cpu).cdbs.policy_dbs;
> +		policy = cpufreq_cpu_get_raw(cpu);

This is surely racy, as this might get called while governors are
getting exchanged. But this is racy today as well.

We should be using the list of policies present with the governors
here, with dbs_data->lock or something like that.


Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 10/12] cpufreq: governor: Relocate definitions of tuners structures
  2016-02-18  1:32 ` [PATCH 10/12] cpufreq: governor: Relocate definitions of tuners structures Rafael J. Wysocki
@ 2016-02-18  6:09   ` Viresh Kumar
  2016-02-18 17:57     ` [PATCH v2 " Rafael J. Wysocki
  0 siblings, 1 reply; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  6:09 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:32, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Move the definitions of struct od_dbs_tuners and struct cs_dbs_tuners
> from the common governor header to the ondemand and conservative
> governor code, respectively, as they don't need to be in the common
> header any more.
> 
> No functional changes.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq_conservative.c |    5 +++++
>  drivers/cpufreq/cpufreq_governor.h     |   10 ----------
>  drivers/cpufreq/cpufreq_ondemand.c     |    4 ++++
>  3 files changed, 9 insertions(+), 10 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 11/12] cpufreq: governor: Make dbs_data_mutex static
  2016-02-18  1:33 ` [PATCH 11/12] cpufreq: governor: Make dbs_data_mutex static Rafael J. Wysocki
@ 2016-02-18  6:09   ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  6:09 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:33, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> That mutex is only used by cpufreq_governor_dbs() and it doesn't
> need to be exported to modules, so make it static and drop the
> export incantation.
> 
> No functional changes.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq_governor.c |    3 +--
>  drivers/cpufreq/cpufreq_governor.h |    1 -
>  2 files changed, 1 insertion(+), 3 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
> +++ linux-pm/drivers/cpufreq/cpufreq_governor.c
> @@ -24,8 +24,7 @@
>  
>  static DEFINE_PER_CPU(struct cpu_dbs_info, cpu_dbs);
>  
> -DEFINE_MUTEX(dbs_data_mutex);
> -EXPORT_SYMBOL_GPL(dbs_data_mutex);
> +static DEFINE_MUTEX(dbs_data_mutex);
>  
>  /* Common sysfs tunables */
>  /**
> Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
> +++ linux-pm/drivers/cpufreq/cpufreq_governor.h
> @@ -178,7 +178,6 @@ struct od_ops {
>  			unsigned int freq_next, unsigned int relation);
>  };
>  
> -extern struct mutex dbs_data_mutex;
>  unsigned int dbs_update(struct cpufreq_policy *policy);
>  int cpufreq_governor_dbs(struct cpufreq_policy *policy, unsigned int event);
>  void od_register_powersave_bias_handler(unsigned int (*f)


Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 12/12] cpufreq: governor: Narrow down the dbs_data_mutex coverage
  2016-02-18  1:38 ` [PATCH 12/12] cpufreq: governor: Narrow down the dbs_data_mutex coverage Rafael J. Wysocki
@ 2016-02-18  6:20   ` Viresh Kumar
  2016-02-18 16:32     ` Rafael J. Wysocki
  2016-02-18 17:58     ` [PATCH v2 " Rafael J. Wysocki
  0 siblings, 2 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-18  6:20 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:38, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Since cpufreq_governor_dbs() is now always called with policy->rwsem
> held, it cannot be executed twice in parallel for the same policy.
> Thus it is not necessary to hold dbs_data_mutex around the invocations
> of cpufreq_governor_start/stop/limits() from it as those functions
> never modify any data that can be shared between different policies.
> 
> However, cpufreq_governor_dbs() may be executed twice in parallal
> for different policies using the same gov->gdbs_data object and
> dbs_data_mutex is still necessary to protect that object against
> concurrent updates.
> 
> For this reason, narrow down the dbs_data_mutex locking to
> cpufreq_governor_init/exit() where it is needed and rename the
> mutex to gov_dbs_data_mutex to reflect its purpose.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq_governor.c |   53 ++++++++++++++++++-------------------
>  1 file changed, 27 insertions(+), 26 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
> +++ linux-pm/drivers/cpufreq/cpufreq_governor.c
> @@ -24,7 +24,7 @@
>  
>  static DEFINE_PER_CPU(struct cpu_dbs_info, cpu_dbs);
>  
> -static DEFINE_MUTEX(dbs_data_mutex);
> +static DEFINE_MUTEX(gov_dbs_data_mutex);
>  
>  /* Common sysfs tunables */
>  /**
> @@ -422,10 +422,10 @@ static void free_policy_dbs_info(struct
>  static int cpufreq_governor_init(struct cpufreq_policy *policy)
>  {
>  	struct dbs_governor *gov = dbs_governor_of(policy);
> -	struct dbs_data *dbs_data = gov->gdbs_data;
> +	struct dbs_data *dbs_data;
>  	struct policy_dbs_info *policy_dbs;
>  	unsigned int latency;
> -	int ret;
> +	int ret = 0;
>  
>  	/* State should be equivalent to EXIT */
>  	if (policy->governor_data)
> @@ -435,6 +435,10 @@ static int cpufreq_governor_init(struct
>  	if (!policy_dbs)
>  		return -ENOMEM;
>  
> +	/* Protect gov->gdbs_data against concurrent updates. */
> +	mutex_lock(&gov_dbs_data_mutex);
> +
> +	dbs_data = gov->gdbs_data;
>  	if (dbs_data) {
>  		if (WARN_ON(have_governor_per_policy())) {
>  			ret = -EINVAL;
> @@ -447,8 +451,7 @@ static int cpufreq_governor_init(struct
>  		dbs_data->usage_count++;
>  		list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);
>  		mutex_unlock(&dbs_data->mutex);
> -
> -		return 0;
> +		goto out;
>  	}
>  
>  	dbs_data = kzalloc(sizeof(*dbs_data), GFP_KERNEL);
> @@ -488,10 +491,14 @@ static int cpufreq_governor_init(struct
>  	ret = kobject_init_and_add(&dbs_data->kobj, &gov->kobj_type,
>  				   get_governor_parent_kobj(policy),
>  				   "%s", gov->gov.name);
> -	if (!ret)
> -		return 0;
> +	if (ret)
> +		goto err;
>  
> -	/* Failure, so roll back. */
> +out:
> +	mutex_unlock(&gov_dbs_data_mutex);
> +	return ret;
> +
> +err:

This has turned into an ugly maze, really. I think it would be much
better if we sacrifice a bit on consistency in the code, and move the
locks in cpufreq_governor_dbs() around invocations to
cpufreq_governor_init(). Or maybe create a
__cpufreq_governor_init(), or whatever.

That routine is hardly readably anymore.

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition
  2016-02-18  5:24   ` Viresh Kumar
@ 2016-02-18 16:20     ` Rafael J. Wysocki
  2016-02-19  2:27       ` Viresh Kumar
  0 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18 16:20 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Rafael J. Wysocki, Linux PM list, Linux Kernel Mailing List

On Thu, Feb 18, 2016 at 6:24 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 18-02-16, 02:19, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> It is possible for a dbs_data object to be updated after its
>> usage counter has become 0.  That may happen if governor_store()
>> runs (via a govenor tunable sysfs attribute write) in parallel
>> with cpufreq_governor_exit() called for the last cpufreq policy
>> associated with the dbs_data in question.  In that case, if
>> governor_store() acquires dbs_data->mutex right after
>> cpufreq_governor_exit() has released it, the ->store() callback
>> invoked by it may operate on dbs_data with no users.  Although
>> sysfs will cause the kobject_put() in cpufreq_governor_exit() to
>> block until governor_store() has returned, that situation may
>> lead to some unexpected results, depending on the implementation
>> of the ->store callback, and therefore it should be avoided.
>>
>> To that end, modify governor_store() to check the dbs_data's
>> usage count before invoking the ->store() callback and return
>> an error if it is 0 at that point.
>>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> ---
>>  drivers/cpufreq/cpufreq_governor.c |    2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
>> ===================================================================
>> --- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
>> +++ linux-pm/drivers/cpufreq/cpufreq_governor.c
>> @@ -112,7 +112,7 @@ static ssize_t governor_store(struct kob
>>
>>       mutex_lock(&dbs_data->mutex);
>>
>> -     if (gattr->store)
>> +     if (dbs_data->usage_count && gattr->store)
>
> That's not gonna be enough. The above lock doesn't guarantee
> protection with any such races.

I'm not really sure what you're talking about to be honest, so please
be more specific.  You can say "For example, function X decrements the
usage count without locking" or similar.

Such vague comments are quite difficult to address, especially if they
don't hold any water. :-)

> And so usage_count can become zero
> just after this check.

But how?

The only place it is decremented is cpufreq_governor_exit() and there
it is done under dbs_data->mutex (at my direct request, BTW).  So we
are guaranteed that it won't go down to zero while we're holding
dbs_data->mutex, aren't we?

> Btw, we should also kill the gattr->store checks here as well, as we
> did it in cpufreq-core.
>
>>               ret = gattr->store(dbs_data, buf, count);
>>
>>       mutex_unlock(&dbs_data->mutex);

Yeah, they are quite useless.  But not in this patch.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 12/12] cpufreq: governor: Narrow down the dbs_data_mutex coverage
  2016-02-18  6:20   ` Viresh Kumar
@ 2016-02-18 16:32     ` Rafael J. Wysocki
  2016-02-18 17:58     ` [PATCH v2 " Rafael J. Wysocki
  1 sibling, 0 replies; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18 16:32 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Rafael J. Wysocki, Linux PM list, Linux Kernel Mailing List

On Thu, Feb 18, 2016 at 7:20 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 18-02-16, 02:38, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> Since cpufreq_governor_dbs() is now always called with policy->rwsem
>> held, it cannot be executed twice in parallel for the same policy.
>> Thus it is not necessary to hold dbs_data_mutex around the invocations
>> of cpufreq_governor_start/stop/limits() from it as those functions
>> never modify any data that can be shared between different policies.
>>
>> However, cpufreq_governor_dbs() may be executed twice in parallal
>> for different policies using the same gov->gdbs_data object and
>> dbs_data_mutex is still necessary to protect that object against
>> concurrent updates.
>>
>> For this reason, narrow down the dbs_data_mutex locking to
>> cpufreq_governor_init/exit() where it is needed and rename the
>> mutex to gov_dbs_data_mutex to reflect its purpose.
>>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> ---
>>  drivers/cpufreq/cpufreq_governor.c |   53 ++++++++++++++++++-------------------
>>  1 file changed, 27 insertions(+), 26 deletions(-)
>>
>> Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
>> ===================================================================
>> --- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
>> +++ linux-pm/drivers/cpufreq/cpufreq_governor.c
>> @@ -24,7 +24,7 @@
>>
>>  static DEFINE_PER_CPU(struct cpu_dbs_info, cpu_dbs);
>>
>> -static DEFINE_MUTEX(dbs_data_mutex);
>> +static DEFINE_MUTEX(gov_dbs_data_mutex);
>>
>>  /* Common sysfs tunables */
>>  /**
>> @@ -422,10 +422,10 @@ static void free_policy_dbs_info(struct
>>  static int cpufreq_governor_init(struct cpufreq_policy *policy)
>>  {
>>       struct dbs_governor *gov = dbs_governor_of(policy);
>> -     struct dbs_data *dbs_data = gov->gdbs_data;
>> +     struct dbs_data *dbs_data;
>>       struct policy_dbs_info *policy_dbs;
>>       unsigned int latency;
>> -     int ret;
>> +     int ret = 0;
>>
>>       /* State should be equivalent to EXIT */
>>       if (policy->governor_data)
>> @@ -435,6 +435,10 @@ static int cpufreq_governor_init(struct
>>       if (!policy_dbs)
>>               return -ENOMEM;
>>
>> +     /* Protect gov->gdbs_data against concurrent updates. */
>> +     mutex_lock(&gov_dbs_data_mutex);
>> +
>> +     dbs_data = gov->gdbs_data;
>>       if (dbs_data) {
>>               if (WARN_ON(have_governor_per_policy())) {
>>                       ret = -EINVAL;
>> @@ -447,8 +451,7 @@ static int cpufreq_governor_init(struct
>>               dbs_data->usage_count++;
>>               list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);
>>               mutex_unlock(&dbs_data->mutex);
>> -
>> -             return 0;
>> +             goto out;
>>       }
>>
>>       dbs_data = kzalloc(sizeof(*dbs_data), GFP_KERNEL);
>> @@ -488,10 +491,14 @@ static int cpufreq_governor_init(struct
>>       ret = kobject_init_and_add(&dbs_data->kobj, &gov->kobj_type,
>>                                  get_governor_parent_kobj(policy),
>>                                  "%s", gov->gov.name);
>> -     if (!ret)
>> -             return 0;
>> +     if (ret)
>> +             goto err;
>>
>> -     /* Failure, so roll back. */
>> +out:
>> +     mutex_unlock(&gov_dbs_data_mutex);
>> +     return ret;
>> +
>> +err:
>
> This has turned into an ugly maze, really. I think it would be much
> better if we sacrifice a bit on consistency in the code, and move the
> locks in cpufreq_governor_dbs() around invocations to
> cpufreq_governor_init(). Or maybe create a
> __cpufreq_governor_init(), or whatever.
>
> That routine is hardly readably anymore.

Yes, it's not pretty, but I can still read it just fine.  Maybe that's
because I'm used to things like that. :-)

But OK, you have a point.  I'll rework this one.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 6/12] cpufreq: governor: Fix CPU load information updates via ->store
  2016-02-18  5:44   ` Viresh Kumar
@ 2016-02-18 17:37     ` Rafael J. Wysocki
  0 siblings, 0 replies; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18 17:37 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Rafael J. Wysocki, Linux PM list, Linux Kernel Mailing List

On Thu, Feb 18, 2016 at 6:44 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 18-02-16, 02:26, Rafael J. Wysocki wrote:
>> Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
>> ===================================================================
>> --- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
>> +++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
>> @@ -29,6 +29,7 @@
>>
>>  static DEFINE_PER_CPU(struct od_cpu_dbs_info_s, od_cpu_dbs_info);
>>
>> +static struct dbs_governor od_dbs_gov;
>>  static struct od_ops od_ops;
>>
>>  static unsigned int default_powersave_bias;
>> @@ -222,7 +223,6 @@ static ssize_t store_io_is_busy(struct d
>>  {
>>       unsigned int input;
>>       int ret;
>> -     unsigned int j;
>>
>>       ret = sscanf(buf, "%u", &input);
>>       if (ret != 1)
>> @@ -230,12 +230,8 @@ static ssize_t store_io_is_busy(struct d
>>       dbs_data->io_is_busy = !!input;
>>
>>       /* we need to re-evaluate prev_cpu_idle */
>> -     for_each_online_cpu(j) {
>> -             struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info,
>> -                                                                     j);
>> -             dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
>> -                     &dbs_info->cdbs.prev_cpu_wall, dbs_data->io_is_busy);
>> -     }
>
> We weren't doing ignore_nice_load check here, but will be done after
> this patch. Will that make a different?

Yes, IMO failing to update prev_cpu_nice if ignore_nice_load is set is
at least inconsistent with the other updates.

> And then this should be mentioned in the log as well ?

OK, I'll update the changelog to mention that.

> Apart from that.
>
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

Thanks!

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v2 8/12] cpufreq: governor: Make governor private data per-policy
  2016-02-18  6:03   ` Viresh Kumar
@ 2016-02-18 17:56     ` Rafael J. Wysocki
  2016-02-19  2:36       ` Viresh Kumar
  0 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18 17:56 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Linux PM list, Linux Kernel Mailing List

On Thursday, February 18, 2016 11:33:24 AM Viresh Kumar wrote:
> On 18-02-16, 02:30, Rafael J. Wysocki wrote:

[cut]

> Fantastic, that's what I just suggested in the previous patch :)
> 
> 
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

I had to update this one, because I forgot that the amd_freq_sensitivity thing
refers to the ondemand data structures.

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] cpufreq: governor: Make governor private data per-policy

Some fields in struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s
are only used for a limited set of CPUs.  Namely, if a policy is
shared between multiple CPUs, those fields will only be used for one
of them (policy->cpu).  This means that they really are per-policy
rather than per-CPU and holding room for them in per-CPU data
structures is generally wasteful.  Also moving those fields into
per-policy data structures will allow some significant simplifications
to be made going forward.

For this reason, introduce struct cs_policy_dbs_info and
struct od_policy_dbs_info to hold those fields.  Define each of the
new structures as an extension of struct policy_dbs_info (such that
struct policy_dbs_info is embedded in each of them) and introduce
new ->alloc and ->free governor callbacks to allocate and free
those structures, respectively, such that ->alloc() will return
a pointer to the struct policy_dbs_info embedded in the allocated
data structure and ->free() will take that pointer as its argument.

With that, modify the code accessing the data fields in question
in per-CPU data objects to look for them in the new structures
via the struct policy_dbs_info pointer available to it and drop
them from struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

This adds a header file for the definitions of data structures shared
between ondemand and amd_freq_sensitivity.

The latter is updated to look for the governor tunables in struct od_policy_dbs_info
instead of struct od_cpu_dbs_info_s.

---
 drivers/cpufreq/amd_freq_sensitivity.c |    5 +---
 drivers/cpufreq/cpufreq_conservative.c |   34 +++++++++++++++++++++++++++++----
 drivers/cpufreq/cpufreq_governor.c     |    7 ++----
 drivers/cpufreq/cpufreq_governor.h     |    9 +-------
 drivers/cpufreq/cpufreq_ondemand.c     |   34 +++++++++++++++++++++++----------
 drivers/cpufreq/cpufreq_ondemand.h     |   26 +++++++++++++++++++++++++
 6 files changed, 87 insertions(+), 28 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_conservative.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_conservative.c
+++ linux-pm/drivers/cpufreq/cpufreq_conservative.c
@@ -14,6 +14,17 @@
 #include <linux/slab.h>
 #include "cpufreq_governor.h"
 
+struct cs_policy_dbs_info {
+	struct policy_dbs_info policy_dbs;
+	unsigned int down_skip;
+	unsigned int requested_freq;
+};
+
+static inline struct cs_policy_dbs_info *to_dbs_info(struct policy_dbs_info *policy_dbs)
+{
+	return container_of(policy_dbs, struct cs_policy_dbs_info, policy_dbs);
+}
+
 /* Conservative governor macros */
 #define DEF_FREQUENCY_UP_THRESHOLD		(80)
 #define DEF_FREQUENCY_DOWN_THRESHOLD		(20)
@@ -48,8 +59,8 @@ static inline unsigned int get_freq_targ
  */
 static unsigned int cs_dbs_timer(struct cpufreq_policy *policy)
 {
-	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, policy->cpu);
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
+	struct cs_policy_dbs_info *dbs_info = to_dbs_info(policy_dbs);
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
 	unsigned int load = dbs_update(policy);
@@ -238,6 +249,19 @@ static struct attribute *cs_attributes[]
 
 /************************** sysfs end ************************/
 
+static struct policy_dbs_info *cs_alloc(void)
+{
+	struct cs_policy_dbs_info *dbs_info;
+
+	dbs_info = kzalloc(sizeof(*dbs_info), GFP_KERNEL);
+	return dbs_info ? &dbs_info->policy_dbs : NULL;
+}
+
+static void cs_free(struct policy_dbs_info *policy_dbs)
+{
+	kfree(to_dbs_info(policy_dbs));
+}
+
 static int cs_init(struct dbs_data *dbs_data, bool notify)
 {
 	struct cs_dbs_tuners *tuners;
@@ -276,7 +300,7 @@ static void cs_exit(struct dbs_data *dbs
 
 static void cs_start(struct cpufreq_policy *policy)
 {
-	struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, policy->cpu);
+	struct cs_policy_dbs_info *dbs_info = to_dbs_info(policy->governor_data);
 
 	dbs_info->down_skip = 0;
 	dbs_info->requested_freq = policy->cur;
@@ -294,6 +318,8 @@ static struct dbs_governor cs_dbs_gov =
 	.kobj_type = { .default_attrs = cs_attributes },
 	.get_cpu_cdbs = get_cpu_cdbs,
 	.gov_dbs_timer = cs_dbs_timer,
+	.alloc = cs_alloc,
+	.free = cs_free,
 	.init = cs_init,
 	.exit = cs_exit,
 	.start = cs_start,
@@ -305,9 +331,8 @@ static int dbs_cpufreq_notifier(struct n
 				void *data)
 {
 	struct cpufreq_freqs *freq = data;
-	struct cs_cpu_dbs_info_s *dbs_info =
-					&per_cpu(cs_cpu_dbs_info, freq->cpu);
 	struct cpufreq_policy *policy = cpufreq_cpu_get_raw(freq->cpu);
+	struct cs_policy_dbs_info *dbs_info;
 
 	if (!policy)
 		return 0;
@@ -316,6 +341,7 @@ static int dbs_cpufreq_notifier(struct n
 	if (policy->governor != CPU_FREQ_GOV_CONSERVATIVE)
 		return 0;
 
+	dbs_info = to_dbs_info(policy->governor_data);
 	/*
 	 * we only care if our internally tracked freq moves outside the 'valid'
 	 * ranges of frequency available to us otherwise we do not change it
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -157,17 +157,10 @@ struct cpu_dbs_info {
 
 struct od_cpu_dbs_info_s {
 	struct cpu_dbs_info cdbs;
-	struct cpufreq_frequency_table *freq_table;
-	unsigned int freq_lo;
-	unsigned int freq_lo_delay_us;
-	unsigned int freq_hi_delay_us;
-	unsigned int sample_type:1;
 };
 
 struct cs_cpu_dbs_info_s {
 	struct cpu_dbs_info cdbs;
-	unsigned int down_skip;
-	unsigned int requested_freq;
 };
 
 /* Per policy Governors sysfs tunables */
@@ -193,6 +186,8 @@ struct dbs_governor {
 
 	struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
 	unsigned int (*gov_dbs_timer)(struct cpufreq_policy *policy);
+	struct policy_dbs_info *(*alloc)(void);
+	void (*free)(struct policy_dbs_info *policy_dbs);
 	int (*init)(struct dbs_data *dbs_data, bool notify);
 	void (*exit)(struct dbs_data *dbs_data, bool notify);
 	void (*start)(struct cpufreq_policy *policy);
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.c
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.c
@@ -16,7 +16,8 @@
 #include <linux/percpu-defs.h>
 #include <linux/slab.h>
 #include <linux/tick.h>
-#include "cpufreq_governor.h"
+
+#include "cpufreq_ondemand.h"
 
 /* On-demand governor macros */
 #define DEF_FREQUENCY_UP_THRESHOLD		(80)
@@ -69,9 +70,8 @@ static unsigned int generic_powersave_bi
 	unsigned int freq_hi, freq_lo;
 	unsigned int index = 0;
 	unsigned int delay_hi_us;
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info,
-						   policy->cpu);
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
+	struct od_policy_dbs_info *dbs_info = to_dbs_info(policy_dbs);
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 
@@ -114,10 +114,9 @@ static unsigned int generic_powersave_bi
 
 static void ondemand_powersave_bias_init(struct cpufreq_policy *policy)
 {
-	unsigned int cpu = policy->cpu;
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
+	struct od_policy_dbs_info *dbs_info = to_dbs_info(policy->governor_data);
 
-	dbs_info->freq_table = cpufreq_frequency_get_table(cpu);
+	dbs_info->freq_table = cpufreq_frequency_get_table(policy->cpu);
 	dbs_info->freq_lo = 0;
 }
 
@@ -144,8 +143,8 @@ static void dbs_freq_increase(struct cpu
  */
 static void od_update(struct cpufreq_policy *policy)
 {
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, policy->cpu);
-	struct policy_dbs_info *policy_dbs = dbs_info->cdbs.policy_dbs;
+	struct policy_dbs_info *policy_dbs = policy->governor_data;
+	struct od_policy_dbs_info *dbs_info = to_dbs_info(policy_dbs);
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	struct od_dbs_tuners *od_tuners = dbs_data->tuners;
 	unsigned int load = dbs_update(policy);
@@ -182,7 +181,7 @@ static unsigned int od_dbs_timer(struct
 {
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, policy->cpu);
+	struct od_policy_dbs_info *dbs_info = to_dbs_info(policy_dbs);
 	int sample_type = dbs_info->sample_type;
 
 	/* Common NORMAL_SAMPLE setup */
@@ -347,6 +346,19 @@ static struct attribute *od_attributes[]
 
 /************************** sysfs end ************************/
 
+static struct policy_dbs_info *od_alloc(void)
+{
+	struct od_policy_dbs_info *dbs_info;
+
+	dbs_info = kzalloc(sizeof(*dbs_info), GFP_KERNEL);
+	return dbs_info ? &dbs_info->policy_dbs : NULL;
+}
+
+static void od_free(struct policy_dbs_info *policy_dbs)
+{
+	kfree(to_dbs_info(policy_dbs));
+}
+
 static int od_init(struct dbs_data *dbs_data, bool notify)
 {
 	struct od_dbs_tuners *tuners;
@@ -395,7 +407,7 @@ static void od_exit(struct dbs_data *dbs
 
 static void od_start(struct cpufreq_policy *policy)
 {
-	struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, policy->cpu);
+	struct od_policy_dbs_info *dbs_info = to_dbs_info(policy->governor_data);
 
 	dbs_info->sample_type = OD_NORMAL_SAMPLE;
 	ondemand_powersave_bias_init(policy);
@@ -417,6 +429,8 @@ static struct dbs_governor od_dbs_gov =
 	.kobj_type = { .default_attrs = od_attributes },
 	.get_cpu_cdbs = get_cpu_cdbs,
 	.gov_dbs_timer = od_dbs_timer,
+	.alloc = od_alloc,
+	.free = od_free,
 	.init = od_init,
 	.exit = od_exit,
 	.start = od_start,
Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -385,8 +385,8 @@ static struct policy_dbs_info *alloc_pol
 	struct policy_dbs_info *policy_dbs;
 	int j;
 
-	/* Allocate memory for the common information for policy->cpus */
-	policy_dbs = kzalloc(sizeof(*policy_dbs), GFP_KERNEL);
+	/* Allocate memory for per-policy governor data. */
+	policy_dbs = gov->alloc();
 	if (!policy_dbs)
 		return NULL;
 
@@ -421,7 +421,7 @@ static void free_policy_dbs_info(struct
 		j_cdbs->policy_dbs = NULL;
 		j_cdbs->update_util.func = NULL;
 	}
-	kfree(policy_dbs);
+	gov->free(policy_dbs);
 }
 
 static int cpufreq_governor_init(struct cpufreq_policy *policy)
@@ -582,7 +582,6 @@ static int cpufreq_governor_start(struct
 static int cpufreq_governor_stop(struct cpufreq_policy *policy)
 {
 	gov_cancel_work(policy);
-
 	return 0;
 }
 
Index: linux-pm/drivers/cpufreq/amd_freq_sensitivity.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/amd_freq_sensitivity.c
+++ linux-pm/drivers/cpufreq/amd_freq_sensitivity.c
@@ -21,7 +21,7 @@
 #include <asm/msr.h>
 #include <asm/cpufeature.h>
 
-#include "cpufreq_governor.h"
+#include "cpufreq_ondemand.h"
 
 #define MSR_AMD64_FREQ_SENSITIVITY_ACTUAL	0xc0010080
 #define MSR_AMD64_FREQ_SENSITIVITY_REFERENCE	0xc0010081
@@ -48,8 +48,7 @@ static unsigned int amd_powersave_bias_t
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *od_data = policy_dbs->dbs_data;
 	struct od_dbs_tuners *od_tuners = od_data->tuners;
-	struct od_cpu_dbs_info_s *od_info =
-		dbs_governor_of(policy)->get_cpu_dbs_info_s(policy->cpu);
+	struct od_policy_dbs_info *od_info = to_dbs_info(policy_dbs);
 
 	if (!od_info->freq_table)
 		return freq_next;
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.h
===================================================================
--- /dev/null
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.h
@@ -0,0 +1,26 @@
+/*
+ * Header file for CPUFreq ondemand governor and related code.
+ *
+ * Copyright (C) 2016, Intel Corporation
+ * Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include "cpufreq_governor.h"
+
+struct od_policy_dbs_info {
+	struct policy_dbs_info policy_dbs;
+	struct cpufreq_frequency_table *freq_table;
+	unsigned int freq_lo;
+	unsigned int freq_lo_delay_us;
+	unsigned int freq_hi_delay_us;
+	unsigned int sample_type:1;
+};
+
+static inline struct od_policy_dbs_info *to_dbs_info(struct policy_dbs_info *policy_dbs)
+{
+	return container_of(policy_dbs, struct od_policy_dbs_info, policy_dbs);
+}

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v2 10/12] cpufreq: governor: Relocate definitions of tuners structures
  2016-02-18  6:09   ` Viresh Kumar
@ 2016-02-18 17:57     ` Rafael J. Wysocki
  2016-02-19  2:36       ` Viresh Kumar
  0 siblings, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18 17:57 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Linux PM list, Linux Kernel Mailing List

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] cpufreq: governor: Relocate definitions of tuners structures

Move the definitions of struct od_dbs_tuners and struct cs_dbs_tuners
from the common governor header to the ondemand and conservative
governor code, respectively, as they don't need to be in the common
header any more.

No functional changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

This had to be update because of the update of [8/12] it depends on.

---
 drivers/cpufreq/cpufreq_conservative.c |    5 +++++
 drivers/cpufreq/cpufreq_governor.h     |   10 ----------
 drivers/cpufreq/cpufreq_ondemand.h     |    4 ++++
 3 files changed, 9 insertions(+), 10 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_conservative.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_conservative.c
+++ linux-pm/drivers/cpufreq/cpufreq_conservative.c
@@ -25,6 +25,11 @@ static inline struct cs_policy_dbs_info
 	return container_of(policy_dbs, struct cs_policy_dbs_info, policy_dbs);
 }
 
+struct cs_dbs_tuners {
+	unsigned int down_threshold;
+	unsigned int freq_step;
+};
+
 /* Conservative governor macros */
 #define DEF_FREQUENCY_UP_THRESHOLD		(80)
 #define DEF_FREQUENCY_DOWN_THRESHOLD		(20)
Index: linux-pm/drivers/cpufreq/cpufreq_governor.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.h
+++ linux-pm/drivers/cpufreq/cpufreq_governor.h
@@ -148,16 +148,6 @@ struct cpu_dbs_info {
 	struct policy_dbs_info *policy_dbs;
 };
 
-/* Per policy Governors sysfs tunables */
-struct od_dbs_tuners {
-	unsigned int powersave_bias;
-};
-
-struct cs_dbs_tuners {
-	unsigned int down_threshold;
-	unsigned int freq_step;
-};
-
 /* Common Governor data across policies */
 struct dbs_governor {
 	struct cpufreq_governor gov;
Index: linux-pm/drivers/cpufreq/cpufreq_ondemand.h
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_ondemand.h
+++ linux-pm/drivers/cpufreq/cpufreq_ondemand.h
@@ -24,3 +24,7 @@ static inline struct od_policy_dbs_info
 {
 	return container_of(policy_dbs, struct od_policy_dbs_info, policy_dbs);
 }
+
+struct od_dbs_tuners {
+	unsigned int powersave_bias;
+};

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v2 12/12] cpufreq: governor: Narrow down the dbs_data_mutex coverage
  2016-02-18  6:20   ` Viresh Kumar
  2016-02-18 16:32     ` Rafael J. Wysocki
@ 2016-02-18 17:58     ` Rafael J. Wysocki
  2016-02-19  2:38       ` Viresh Kumar
  1 sibling, 1 reply; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-18 17:58 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Linux PM list, Linux Kernel Mailing List

On Thursday, February 18, 2016 11:50:40 AM Viresh Kumar wrote:
> On 18-02-16, 02:38, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

[cut]

> 
> This has turned into an ugly maze, really. I think it would be much
> better if we sacrifice a bit on consistency in the code, and move the
> locks in cpufreq_governor_dbs() around invocations to
> cpufreq_governor_init(). Or maybe create a
> __cpufreq_governor_init(), or whatever.
> 
> That routine is hardly readably anymore.

So does the one below look better?

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] cpufreq: governor: Narrow down the dbs_data_mutex coverage

Since cpufreq_governor_dbs() is now always called with policy->rwsem
held, it cannot be executed twice in parallel for the same policy.
Thus it is not necessary to hold dbs_data_mutex around the invocations
of cpufreq_governor_start/stop/limits() from it as those functions
never modify any data that can be shared between different policies.

However, cpufreq_governor_dbs() may be executed twice in parallal
for different policies using the same gov->gdbs_data object and
dbs_data_mutex is still necessary to protect that object against
concurrent updates.

For this reason, narrow down the dbs_data_mutex locking to
cpufreq_governor_init/exit() where it is needed and rename the
mutex to gov_dbs_data_mutex to reflect its purpose.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq_governor.c |   46 ++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
+++ linux-pm/drivers/cpufreq/cpufreq_governor.c
@@ -24,7 +24,7 @@
 
 static DEFINE_PER_CPU(struct cpu_dbs_info, cpu_dbs);
 
-static DEFINE_MUTEX(dbs_data_mutex);
+static DEFINE_MUTEX(gov_dbs_data_mutex);
 
 /* Common sysfs tunables */
 /**
@@ -422,10 +422,10 @@ static void free_policy_dbs_info(struct
 static int cpufreq_governor_init(struct cpufreq_policy *policy)
 {
 	struct dbs_governor *gov = dbs_governor_of(policy);
-	struct dbs_data *dbs_data = gov->gdbs_data;
+	struct dbs_data *dbs_data;
 	struct policy_dbs_info *policy_dbs;
 	unsigned int latency;
-	int ret;
+	int ret = 0;
 
 	/* State should be equivalent to EXIT */
 	if (policy->governor_data)
@@ -435,6 +435,10 @@ static int cpufreq_governor_init(struct
 	if (!policy_dbs)
 		return -ENOMEM;
 
+	/* Protect gov->gdbs_data against concurrent updates. */
+	mutex_lock(&gov_dbs_data_mutex);
+
+	dbs_data = gov->gdbs_data;
 	if (dbs_data) {
 		if (WARN_ON(have_governor_per_policy())) {
 			ret = -EINVAL;
@@ -447,8 +451,7 @@ static int cpufreq_governor_init(struct
 		dbs_data->usage_count++;
 		list_add(&policy_dbs->list, &dbs_data->policy_dbs_list);
 		mutex_unlock(&dbs_data->mutex);
-
-		return 0;
+		goto out;
 	}
 
 	dbs_data = kzalloc(sizeof(*dbs_data), GFP_KERNEL);
@@ -489,7 +492,7 @@ static int cpufreq_governor_init(struct
 				   get_governor_parent_kobj(policy),
 				   "%s", gov->gov.name);
 	if (!ret)
-		return 0;
+		goto out;
 
 	/* Failure, so roll back. */
 	pr_err("cpufreq: Governor initialization failed (dbs_data kobject init error %d)\n", ret);
@@ -503,6 +506,9 @@ static int cpufreq_governor_init(struct
 
 free_policy_dbs_info:
 	free_policy_dbs_info(policy, gov);
+
+out:
+	mutex_unlock(&gov_dbs_data_mutex);
 	return ret;
 }
 
@@ -513,6 +519,9 @@ static int cpufreq_governor_exit(struct
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
 	int count;
 
+	/* Protect gov->gdbs_data against concurrent updates. */
+	mutex_lock(&gov_dbs_data_mutex);
+
 	mutex_lock(&dbs_data->mutex);
 	list_del(&policy_dbs->list);
 	count = --dbs_data->usage_count;
@@ -534,6 +543,8 @@ static int cpufreq_governor_exit(struct
 	}
 
 	free_policy_dbs_info(policy, gov);
+
+	mutex_unlock(&gov_dbs_data_mutex);
 	return 0;
 }
 
@@ -600,31 +611,20 @@ static int cpufreq_governor_limits(struc
 
 int cpufreq_governor_dbs(struct cpufreq_policy *policy, unsigned int event)
 {
-	int ret = -EINVAL;
-
-	/* Lock governor to block concurrent initialization of governor */
-	mutex_lock(&dbs_data_mutex);
-
 	if (event == CPUFREQ_GOV_POLICY_INIT) {
-		ret = cpufreq_governor_init(policy);
+		return cpufreq_governor_init(policy);
 	} else if (policy->governor_data) {
 		switch (event) {
 		case CPUFREQ_GOV_POLICY_EXIT:
-			ret = cpufreq_governor_exit(policy);
-			break;
+			return cpufreq_governor_exit(policy);
 		case CPUFREQ_GOV_START:
-			ret = cpufreq_governor_start(policy);
-			break;
+			return cpufreq_governor_start(policy);
 		case CPUFREQ_GOV_STOP:
-			ret = cpufreq_governor_stop(policy);
-			break;
+			return cpufreq_governor_stop(policy);
 		case CPUFREQ_GOV_LIMITS:
-			ret = cpufreq_governor_limits(policy);
-			break;
+			return cpufreq_governor_limits(policy);
 		}
 	}
-
-	mutex_unlock(&dbs_data_mutex);
-	return ret;
+	return -EINVAL;
 }
 EXPORT_SYMBOL_GPL(cpufreq_governor_dbs);

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition
  2016-02-18 16:20     ` Rafael J. Wysocki
@ 2016-02-19  2:27       ` Viresh Kumar
  2016-02-19  2:34         ` Rafael J. Wysocki
  0 siblings, 1 reply; 37+ messages in thread
From: Viresh Kumar @ 2016-02-19  2:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Linux PM list, Linux Kernel Mailing List

On 18-02-16, 17:20, Rafael J. Wysocki wrote:
> On Thu, Feb 18, 2016 at 6:24 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On 18-02-16, 02:19, Rafael J. Wysocki wrote:

> >> @@ -112,7 +112,7 @@ static ssize_t governor_store(struct kob
> >>
> >>       mutex_lock(&dbs_data->mutex);
> >>
> >> -     if (gattr->store)
> >> +     if (dbs_data->usage_count && gattr->store)
> >
> > That's not gonna be enough. The above lock doesn't guarantee
> > protection with any such races.

Oops, I completely misread it. Really sorry about that.

But now that I have read the code again, I wonder why we need this protection at
all. The first thing we do after decrementing the usage_count counter, is we put
the kobject. Which will ensure that the sysfs files are all gone. So, what is
the race we are trying to fix then?

> Yeah, they are quite useless.  But not in this patch.

Sure.

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition
  2016-02-19  2:27       ` Viresh Kumar
@ 2016-02-19  2:34         ` Rafael J. Wysocki
  0 siblings, 0 replies; 37+ messages in thread
From: Rafael J. Wysocki @ 2016-02-19  2:34 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Linux PM list,
	Linux Kernel Mailing List

On Fri, Feb 19, 2016 at 3:27 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 18-02-16, 17:20, Rafael J. Wysocki wrote:
>> On Thu, Feb 18, 2016 at 6:24 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> > On 18-02-16, 02:19, Rafael J. Wysocki wrote:
>
>> >> @@ -112,7 +112,7 @@ static ssize_t governor_store(struct kob
>> >>
>> >>       mutex_lock(&dbs_data->mutex);
>> >>
>> >> -     if (gattr->store)
>> >> +     if (dbs_data->usage_count && gattr->store)
>> >
>> > That's not gonna be enough. The above lock doesn't guarantee
>> > protection with any such races.
>
> Oops, I completely misread it. Really sorry about that.
>
> But now that I have read the code again, I wonder why we need this protection at
> all. The first thing we do after decrementing the usage_count counter, is we put
> the kobject. Which will ensure that the sysfs files are all gone. So, what is
> the race we are trying to fix then?

The ->store() callbacks for different attributes may do silly stuff
like walking all CPUs in the system and updating per-CPU data for
them.  If the dbs_data the callback has been called for is going away,
this is pointless at best and may be actually harmful depending on
what the callback is really doing.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 8/12] cpufreq: governor: Make governor private data per-policy
  2016-02-18 17:56     ` [PATCH v2 " Rafael J. Wysocki
@ 2016-02-19  2:36       ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-19  2:36 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH] cpufreq: governor: Make governor private data per-policy
> 
> Some fields in struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s
> are only used for a limited set of CPUs.  Namely, if a policy is
> shared between multiple CPUs, those fields will only be used for one
> of them (policy->cpu).  This means that they really are per-policy
> rather than per-CPU and holding room for them in per-CPU data
> structures is generally wasteful.  Also moving those fields into
> per-policy data structures will allow some significant simplifications
> to be made going forward.
> 
> For this reason, introduce struct cs_policy_dbs_info and
> struct od_policy_dbs_info to hold those fields.  Define each of the
> new structures as an extension of struct policy_dbs_info (such that
> struct policy_dbs_info is embedded in each of them) and introduce
> new ->alloc and ->free governor callbacks to allocate and free
> those structures, respectively, such that ->alloc() will return
> a pointer to the struct policy_dbs_info embedded in the allocated
> data structure and ->free() will take that pointer as its argument.
> 
> With that, modify the code accessing the data fields in question
> in per-CPU data objects to look for them in the new structures
> via the struct policy_dbs_info pointer available to it and drop
> them from struct od_cpu_dbs_info_s and struct cs_cpu_dbs_info_s.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> This adds a header file for the definitions of data structures shared
> between ondemand and amd_freq_sensitivity.
> 
> The latter is updated to look for the governor tunables in struct od_policy_dbs_info
> instead of struct od_cpu_dbs_info_s.
> 
> ---
>  drivers/cpufreq/amd_freq_sensitivity.c |    5 +---
>  drivers/cpufreq/cpufreq_conservative.c |   34 +++++++++++++++++++++++++++++----
>  drivers/cpufreq/cpufreq_governor.c     |    7 ++----
>  drivers/cpufreq/cpufreq_governor.h     |    9 +-------
>  drivers/cpufreq/cpufreq_ondemand.c     |   34 +++++++++++++++++++++++----------
>  drivers/cpufreq/cpufreq_ondemand.h     |   26 +++++++++++++++++++++++++
>  6 files changed, 87 insertions(+), 28 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 10/12] cpufreq: governor: Relocate definitions of tuners structures
  2016-02-18 17:57     ` [PATCH v2 " Rafael J. Wysocki
@ 2016-02-19  2:36       ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-19  2:36 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 18:57, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH] cpufreq: governor: Relocate definitions of tuners structures
> 
> Move the definitions of struct od_dbs_tuners and struct cs_dbs_tuners
> from the common governor header to the ondemand and conservative
> governor code, respectively, as they don't need to be in the common
> header any more.
> 
> No functional changes.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> This had to be update because of the update of [8/12] it depends on.
> 
> ---
>  drivers/cpufreq/cpufreq_conservative.c |    5 +++++
>  drivers/cpufreq/cpufreq_governor.h     |   10 ----------
>  drivers/cpufreq/cpufreq_ondemand.h     |    4 ++++
>  3 files changed, 9 insertions(+), 10 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 12/12] cpufreq: governor: Narrow down the dbs_data_mutex coverage
  2016-02-18 17:58     ` [PATCH v2 " Rafael J. Wysocki
@ 2016-02-19  2:38       ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-19  2:38 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 18:58, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH] cpufreq: governor: Narrow down the dbs_data_mutex coverage
> 
> Since cpufreq_governor_dbs() is now always called with policy->rwsem
> held, it cannot be executed twice in parallel for the same policy.
> Thus it is not necessary to hold dbs_data_mutex around the invocations
> of cpufreq_governor_start/stop/limits() from it as those functions
> never modify any data that can be shared between different policies.
> 
> However, cpufreq_governor_dbs() may be executed twice in parallal
> for different policies using the same gov->gdbs_data object and
> dbs_data_mutex is still necessary to protect that object against
> concurrent updates.
> 
> For this reason, narrow down the dbs_data_mutex locking to
> cpufreq_governor_init/exit() where it is needed and rename the
> mutex to gov_dbs_data_mutex to reflect its purpose.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq_governor.c |   46 ++++++++++++++++++-------------------
>  1 file changed, 23 insertions(+), 23 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition
  2016-02-18  1:19 ` [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition Rafael J. Wysocki
  2016-02-18  5:24   ` Viresh Kumar
@ 2016-02-19  3:09   ` Viresh Kumar
  1 sibling, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2016-02-19  3:09 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM list, Linux Kernel Mailing List

On 18-02-16, 02:19, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> It is possible for a dbs_data object to be updated after its
> usage counter has become 0.  That may happen if governor_store()
> runs (via a govenor tunable sysfs attribute write) in parallel
> with cpufreq_governor_exit() called for the last cpufreq policy
> associated with the dbs_data in question.  In that case, if
> governor_store() acquires dbs_data->mutex right after
> cpufreq_governor_exit() has released it, the ->store() callback
> invoked by it may operate on dbs_data with no users.  Although
> sysfs will cause the kobject_put() in cpufreq_governor_exit() to
> block until governor_store() has returned, that situation may
> lead to some unexpected results, depending on the implementation
> of the ->store callback, and therefore it should be avoided.
> 
> To that end, modify governor_store() to check the dbs_data's
> usage count before invoking the ->store() callback and return
> an error if it is 0 at that point.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq_governor.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq_governor.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq_governor.c
> +++ linux-pm/drivers/cpufreq/cpufreq_governor.c
> @@ -112,7 +112,7 @@ static ssize_t governor_store(struct kob
>  
>  	mutex_lock(&dbs_data->mutex);
>  
> -	if (gattr->store)
> +	if (dbs_data->usage_count && gattr->store)
>  		ret = gattr->store(dbs_data, buf, count);
>  
>  	mutex_unlock(&dbs_data->mutex);

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2016-02-19  3:10 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-18  1:17 [PATCH 0/12] cpufreq: More governor code reorganization Rafael J. Wysocki
2016-02-18  1:19 ` [PATCH 1/12] cpufreq: governor: Close dbs_data update race condition Rafael J. Wysocki
2016-02-18  5:24   ` Viresh Kumar
2016-02-18 16:20     ` Rafael J. Wysocki
2016-02-19  2:27       ` Viresh Kumar
2016-02-19  2:34         ` Rafael J. Wysocki
2016-02-19  3:09   ` Viresh Kumar
2016-02-18  1:20 ` [PATCH 2/12] cpufreq: governor: Move io_is_busy to struct dbs_data Rafael J. Wysocki
2016-02-18  5:28   ` Viresh Kumar
2016-02-18  1:21 ` [PATCH 3/12] cpufreq: governor: Add a ->start callback for governors Rafael J. Wysocki
2016-02-18  5:36   ` Viresh Kumar
2016-02-18  1:22 ` [PATCH 4/12] cpufreq: governor: Drop unused governor callback and data fields Rafael J. Wysocki
2016-02-18  5:37   ` Viresh Kumar
2016-02-18  1:24 ` [PATCH 5/12] cpufreq: ondemand: Drop one more callback from struct od_ops Rafael J. Wysocki
2016-02-18  5:38   ` Viresh Kumar
2016-02-18  1:26 ` [PATCH 6/12] cpufreq: governor: Fix CPU load information updates via ->store Rafael J. Wysocki
2016-02-18  5:44   ` Viresh Kumar
2016-02-18 17:37     ` Rafael J. Wysocki
2016-02-18  1:28 ` [PATCH 7/12] cpufreq: ondemand: Rework the handling of powersave bias updates Rafael J. Wysocki
2016-02-18  5:53   ` Viresh Kumar
2016-02-18  1:30 ` [PATCH 8/12] cpufreq: governor: Make governor private data per-policy Rafael J. Wysocki
2016-02-18  6:03   ` Viresh Kumar
2016-02-18 17:56     ` [PATCH v2 " Rafael J. Wysocki
2016-02-19  2:36       ` Viresh Kumar
2016-02-18  1:31 ` [PATCH 9/12] cpufreq: governor: Move per-CPU data to the common code Rafael J. Wysocki
2016-02-18  6:08   ` Viresh Kumar
2016-02-18  1:32 ` [PATCH 10/12] cpufreq: governor: Relocate definitions of tuners structures Rafael J. Wysocki
2016-02-18  6:09   ` Viresh Kumar
2016-02-18 17:57     ` [PATCH v2 " Rafael J. Wysocki
2016-02-19  2:36       ` Viresh Kumar
2016-02-18  1:33 ` [PATCH 11/12] cpufreq: governor: Make dbs_data_mutex static Rafael J. Wysocki
2016-02-18  6:09   ` Viresh Kumar
2016-02-18  1:38 ` [PATCH 12/12] cpufreq: governor: Narrow down the dbs_data_mutex coverage Rafael J. Wysocki
2016-02-18  6:20   ` Viresh Kumar
2016-02-18 16:32     ` Rafael J. Wysocki
2016-02-18 17:58     ` [PATCH v2 " Rafael J. Wysocki
2016-02-19  2:38       ` Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).