* [RFC PATCH 00/19] cpufreq locking cleanups and documentation
@ 2016-01-11 17:35 Juri Lelli
  2016-01-11 17:35 ` [RFC PATCH 01/19] cpufreq: do not expose cpufreq_governor_lock Juri Lelli
                   ` (20 more replies)
  0 siblings, 21 replies; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

Hi all,

In the context of the ongoing discussion about introducing a simple platform
energy model to guide scheduling decisions (Energy Aware Scheduling [1]),
Peter has expressed concerns about the component in charge of driving clock
frequency selection (Steve recently posted an update of that component [2]):
https://lkml.org/lkml/2015/8/15/141.

The problem is that, with this new approach, cpufreq core functions need to be
called from scheduler hot paths, and the overhead associated with the current
locking scheme might prove unsustainable.

Peter's proposed approach of using RCU logic to reduce locking overhead seems
reasonable, but things may not be as straightforward as originally thought. The
very first thing I realized when I started looking into this is that it was
hard for me to understand which locking mechanism protects which data
structure. I came up with this set of patches mostly as a way to build a better
understanding of the current cpufreq locking scheme, and also as preparatory
work for implementing RCU logic. At this stage, I would like each patch to be
considered a question I'm asking rather than a proposed change, hence the RFC
tag for the series; the intent is to document the current locking scheme and to
modify it a bit in order to make the RCU implementation easier.

Actually, as you'll soon notice, I didn't really start from scratch. Mike
shared with me some patches he has been developing while looking at the same
problem. I've given Mike attribution for the patches that I took unchanged from
him, with thanks for sharing his findings with me.

High level description of patches:

 o [01-04] clean up and move code around to make things (hopefully) cleaner
 o [05-14] insert lockdep assertions and fix the erroneous situations they
           uncover
 o [15-18] remove excessive use of locking
 o [19]    add documentation for the cleaned-up locking scheme

With Viresh's tests [3] on both arm TC2 and arm64 Juno boards I'm not seeing
anything bad happening. However, coverage is really small (as is my personal
confidence that I'm not breaking things for other configurations :-)).

This set is based on top of linux-pm/linux-next as of today and it is also
available from here:

 git://linux-arm.org/linux-jl.git upstream/cpufreq_cleanups

Comments, concerns and rants are the primary goal of this posting; I'm thus
looking forward to them.

Best,

- Juri

[1] https://lkml.org/lkml/2015/7/7/754
[2] https://lkml.org/lkml/2015/12/9/35 
[3] https://git.linaro.org/people/viresh.kumar/cpufreq-tests.git

Juri Lelli (16):
  cpufreq: kill for_each_policy
  cpufreq: bring data structures close to their locks
  cpufreq: assert locking when accessing cpufreq_policy_list
  cpufreq: always access cpufreq_policy_list while holding
    cpufreq_driver_lock
  cpufreq: assert locking when accessing cpufreq_governor_list
  cpufreq: fix warning for cpufreq_init_policy unlocked access to
    cpufreq_governor_list
  cpufreq: fix warning for show_scaling_available_governors unlocked
    access to cpufreq_governor_list
  cpufreq: assert policy->rwsem is held in cpufreq_set_policy
  cpufreq: assert policy->rwsem is held in __cpufreq_governor
  cpufreq: fix locking of policy->rwsem in cpufreq_init_policy
  cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare
  cpufreq: fix locking of policy->rwsem in cpufreq_offline_finish
  cpufreq: remove useless usage of cpufreq_governor_mutex in
    __cpufreq_governor
  cpufreq: hold policy->rwsem across CPUFREQ_GOV_POLICY_EXIT
  cpufreq: stop checking for cpufreq_driver being present in
    cpufreq_cpu_get
  cpufreq: documentation: document locking scheme

Michael Turquette (3):
  cpufreq: do not expose cpufreq_governor_lock
  cpufreq: merge governor lock and mutex
  cpufreq: remove transition_lock

 Documentation/cpu-freq/core.txt    |  44 +++++++++++++
 drivers/cpufreq/cpufreq.c          | 132 +++++++++++++++++++++++--------------
 drivers/cpufreq/cpufreq_governor.h |   2 -
 include/linux/cpufreq.h            |   5 --
 4 files changed, 125 insertions(+), 58 deletions(-)

-- 
2.2.2


* [RFC PATCH 01/19] cpufreq: do not expose cpufreq_governor_lock
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12  8:56   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 02/19] cpufreq: merge governor lock and mutex Juri Lelli
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

From: Michael Turquette <mturquette@baylibre.com>

Since commit 3a91b069eabf ("cpufreq: governor: Quit work-handlers early if
governor is stopped") cpufreq_governor_lock is not used anywhere outside
cpufreq.c. Make it static again.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Michael Turquette <mturquette@baylibre.com>
---
 drivers/cpufreq/cpufreq.c          | 2 +-
 drivers/cpufreq/cpufreq_governor.h | 2 --
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index c35e7da..7bdd845 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -102,7 +102,7 @@ static LIST_HEAD(cpufreq_governor_list);
 static struct cpufreq_driver *cpufreq_driver;
 static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 static DEFINE_RWLOCK(cpufreq_driver_lock);
-DEFINE_MUTEX(cpufreq_governor_lock);
+static DEFINE_MUTEX(cpufreq_governor_lock);
 
 /* Flag to suspend/resume CPUFreq governors */
 static bool cpufreq_suspended;
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 91e767a..65b2893 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -269,8 +269,6 @@ static ssize_t show_sampling_rate_min_gov_pol				\
 	return sprintf(buf, "%u\n", dbs_data->min_sampling_rate);	\
 }
 
-extern struct mutex cpufreq_governor_lock;
-
 void gov_add_timers(struct cpufreq_policy *policy, unsigned int delay);
 void gov_cancel_work(struct cpu_common_dbs_info *shared);
 void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
-- 
2.2.2


* [RFC PATCH 02/19] cpufreq: merge governor lock and mutex
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
  2016-01-11 17:35 ` [RFC PATCH 01/19] cpufreq: do not expose cpufreq_governor_lock Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12  9:00   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 03/19] cpufreq: kill for_each_policy Juri Lelli
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

From: Michael Turquette <mturquette@baylibre.com>

Commit 95731ebb114c ("cpufreq: Fix governor start/stop race condition")
introduced cpufreq_governor_lock. This was actually overkill, as the same
result can be achieved by using the existing cpufreq_governor_mutex.
Removing cpufreq_governor_lock cleans things up and makes deadlocks less
likely to happen.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Michael Turquette <mturquette@baylibre.com>
---
 drivers/cpufreq/cpufreq.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 7bdd845..0802705 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -102,7 +102,7 @@ static LIST_HEAD(cpufreq_governor_list);
 static struct cpufreq_driver *cpufreq_driver;
 static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 static DEFINE_RWLOCK(cpufreq_driver_lock);
-static DEFINE_MUTEX(cpufreq_governor_lock);
+static DEFINE_MUTEX(cpufreq_governor_mutex);
 
 /* Flag to suspend/resume CPUFreq governors */
 static bool cpufreq_suspended;
@@ -146,7 +146,6 @@ void disable_cpufreq(void)
 {
 	off = 1;
 }
-static DEFINE_MUTEX(cpufreq_governor_mutex);
 
 bool have_governor_per_policy(void)
 {
@@ -1963,11 +1962,11 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 
 	pr_debug("%s: for CPU %u, event %u\n", __func__, policy->cpu, event);
 
-	mutex_lock(&cpufreq_governor_lock);
+	mutex_lock(&cpufreq_governor_mutex);
 	if ((policy->governor_enabled && event == CPUFREQ_GOV_START)
 	    || (!policy->governor_enabled
 	    && (event == CPUFREQ_GOV_LIMITS || event == CPUFREQ_GOV_STOP))) {
-		mutex_unlock(&cpufreq_governor_lock);
+		mutex_unlock(&cpufreq_governor_mutex);
 		return -EBUSY;
 	}
 
@@ -1976,7 +1975,7 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 	else if (event == CPUFREQ_GOV_START)
 		policy->governor_enabled = true;
 
-	mutex_unlock(&cpufreq_governor_lock);
+	mutex_unlock(&cpufreq_governor_mutex);
 
 	ret = policy->governor->governor(policy, event);
 
@@ -1987,12 +1986,12 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 			policy->governor->initialized--;
 	} else {
 		/* Restore original values */
-		mutex_lock(&cpufreq_governor_lock);
+		mutex_lock(&cpufreq_governor_mutex);
 		if (event == CPUFREQ_GOV_STOP)
 			policy->governor_enabled = true;
 		else if (event == CPUFREQ_GOV_START)
 			policy->governor_enabled = false;
-		mutex_unlock(&cpufreq_governor_lock);
+		mutex_unlock(&cpufreq_governor_mutex);
 	}
 
 	if (((event == CPUFREQ_GOV_POLICY_INIT) && ret) ||
-- 
2.2.2
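
The check-and-update of governor_enabled that this patch puts under a single
mutex can be sketched in userspace as follows. This is only an illustration
under stated assumptions, not the kernel code: struct policy_sketch, enum
gov_event, governor_mutex and governor_event() are hypothetical names, and a
pthread mutex stands in for the kernel mutex.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical event codes; the kernel's CPUFREQ_GOV_* values differ. */
enum gov_event { GOV_START, GOV_STOP, GOV_LIMITS };

/* Hypothetical stand-in for the one policy field used in this sketch. */
struct policy_sketch {
	bool governor_enabled;
};

/* A single mutex plays the role of the merged cpufreq_governor_mutex. */
static pthread_mutex_t governor_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Check and update governor_enabled under the mutex.
 * Returns 0 if the event is accepted, -EBUSY if it is redundant
 * (START while enabled, or STOP/LIMITS while disabled).
 */
static int governor_event(struct policy_sketch *policy, enum gov_event event)
{
	pthread_mutex_lock(&governor_mutex);
	if ((policy->governor_enabled && event == GOV_START) ||
	    (!policy->governor_enabled &&
	     (event == GOV_LIMITS || event == GOV_STOP))) {
		pthread_mutex_unlock(&governor_mutex);
		return -EBUSY;
	}

	if (event == GOV_STOP)
		policy->governor_enabled = false;
	else if (event == GOV_START)
		policy->governor_enabled = true;
	pthread_mutex_unlock(&governor_mutex);

	/* The kernel function calls the governor callback here, unlocked. */
	return 0;
}
```

As in the diff above, the mutex is dropped before the (omitted) governor
callback would run, so only the flag check and update are serialized.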


* [RFC PATCH 03/19] cpufreq: kill for_each_policy
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
  2016-01-11 17:35 ` [RFC PATCH 01/19] cpufreq: do not expose cpufreq_governor_lock Juri Lelli
  2016-01-11 17:35 ` [RFC PATCH 02/19] cpufreq: merge governor lock and mutex Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12  9:01   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 04/19] cpufreq: bring data structures close to their locks Juri Lelli
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

The for_each_policy() macro is not used anywhere. Kill it.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 0802705..2e41356 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -86,9 +86,6 @@ static struct cpufreq_policy *first_policy(bool active)
 #define for_each_inactive_policy(__policy)		\
 	for_each_suitable_policy(__policy, false)
 
-#define for_each_policy(__policy)			\
-	list_for_each_entry(__policy, &cpufreq_policy_list, policy_list)
-
 /* Iterate over governors */
 static LIST_HEAD(cpufreq_governor_list);
 #define for_each_governor(__governor)				\
-- 
2.2.2


* [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (2 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 03/19] cpufreq: kill for_each_policy Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-11 22:05   ` Peter Zijlstra
                     ` (2 more replies)
  2016-01-11 17:35 ` [RFC PATCH 05/19] cpufreq: assert locking when accessing cpufreq_policy_list Juri Lelli
                   ` (16 subsequent siblings)
  20 siblings, 3 replies; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

Currently it is not easy to figure out which lock/mutex protects which data
structure. Clean things up by moving data structures and their locks/mutexes
closer together; also, update comments to document these relations.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2e41356..00a00cd 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -31,7 +31,25 @@
 #include <linux/tick.h>
 #include <trace/events/power.h>
 
+/**
+ * Iterate over governors
+ *
+ * cpufreq_governor_list is protected by cpufreq_governor_mutex.
+ */
+static LIST_HEAD(cpufreq_governor_list);
+static DEFINE_MUTEX(cpufreq_governor_mutex);
+#define for_each_governor(__governor)				\
+	list_for_each_entry(__governor, &cpufreq_governor_list, governor_list)
+
+/**
+ * The "cpufreq driver" - the arch- or hardware-dependent low
+ * level driver of CPUFreq support, and its spinlock (cpufreq_driver_lock).
+ * This lock also protects cpufreq_cpu_data array and cpufreq_policy_list.
+ */
+static struct cpufreq_driver *cpufreq_driver;
+static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 static LIST_HEAD(cpufreq_policy_list);
+static DEFINE_RWLOCK(cpufreq_driver_lock);
 
 static inline bool policy_is_inactive(struct cpufreq_policy *policy)
 {
@@ -86,21 +104,6 @@ static struct cpufreq_policy *first_policy(bool active)
 #define for_each_inactive_policy(__policy)		\
 	for_each_suitable_policy(__policy, false)
 
-/* Iterate over governors */
-static LIST_HEAD(cpufreq_governor_list);
-#define for_each_governor(__governor)				\
-	list_for_each_entry(__governor, &cpufreq_governor_list, governor_list)
-
-/**
- * The "cpufreq driver" - the arch- or hardware-dependent low
- * level driver of CPUFreq support, and its spinlock. This lock
- * also protects the cpufreq_cpu_data array.
- */
-static struct cpufreq_driver *cpufreq_driver;
-static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
-static DEFINE_RWLOCK(cpufreq_driver_lock);
-static DEFINE_MUTEX(cpufreq_governor_mutex);
-
 /* Flag to suspend/resume CPUFreq governors */
 static bool cpufreq_suspended;
 
-- 
2.2.2


* [RFC PATCH 05/19] cpufreq: assert locking when accessing cpufreq_policy_list
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (3 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 04/19] cpufreq: bring data structures close to their locks Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12  9:34   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 06/19] cpufreq: always access cpufreq_policy_list while holding cpufreq_driver_lock Juri Lelli
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

cpufreq_policy_list is guarded by cpufreq_driver_lock. Add appropriate
locking assertions to check that we always access the list while holding
the associated lock.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 00a00cd..63d6efb 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -65,6 +65,7 @@ static bool suitable_policy(struct cpufreq_policy *policy, bool active)
 static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy,
 					  bool active)
 {
+	lockdep_assert_held(&cpufreq_driver_lock);
 	do {
 		policy = list_next_entry(policy, policy_list);
 
@@ -80,6 +81,7 @@ static struct cpufreq_policy *first_policy(bool active)
 {
 	struct cpufreq_policy *policy;
 
+	lockdep_assert_held(&cpufreq_driver_lock);
 	/* No policies in the list */
 	if (list_empty(&cpufreq_policy_list))
 		return NULL;
@@ -2430,6 +2432,7 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	if (ret)
 		goto err_boost_unreg;
 
+	lockdep_assert_held(&cpufreq_driver_lock);
 	if (!(cpufreq_driver->flags & CPUFREQ_STICKY) &&
 	    list_empty(&cpufreq_policy_list)) {
 		/* if all ->init() calls failed, unregister */
-- 
2.2.2


* [RFC PATCH 06/19] cpufreq: always access cpufreq_policy_list while holding cpufreq_driver_lock
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (4 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 05/19] cpufreq: assert locking when accessing cpufreq_policy_list Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12  9:57   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 07/19] cpufreq: assert locking when accessing cpufreq_governor_list Juri Lelli
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

The lockdep assertions added by the previous patch highlight paths where we
access cpufreq_policy_list without holding cpufreq_driver_lock; one example
is the following:

[    8.245779] ------------[ cut here ]------------
[    8.305977] WARNING: CPU: 2 PID: 1 at kernel/drivers/cpufreq/cpufreq.c:2447 cpufreq_register_driver+0xfd/0x120()
[    8.438611] Modules linked in:
[    8.493751] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc4+ #369
[    8.561039] Hardware name: ARM-Versatile Express
[    8.622765] [<c0014215>] (unwind_backtrace) from [<c0010e25>] (show_stack+0x11/0x14)
[    8.629651] atkbd serio0: keyboard reset failed on 1c060000.kmi
[    8.810905] [<c0010e25>] (show_stack) from [<c02ece7d>] (dump_stack+0x55/0x78)
[    8.935122] [<c02ece7d>] (dump_stack) from [<c00202cd>] (warn_slowpath_common+0x59/0x84)
[    9.067097] [<c00202cd>] (warn_slowpath_common) from [<c002030f>] (warn_slowpath_null+0x17/0x1c)
[    9.204101] [<c002030f>] (warn_slowpath_null) from [<c03ba329>] (cpufreq_register_driver+0xfd/0x120)
[    9.209603] usb 1-1.2: new high-speed USB device number 3 using isp1760
[    9.419507] [<c03ba329>] (cpufreq_register_driver) from [<c03bc481>] (bL_cpufreq_register+0x49/0x98)
[    9.560548] [<c03bc481>] (bL_cpufreq_register) from [<c0342517>] (platform_drv_probe+0x3b/0x6c)
[    9.573806] usb-storage 1-1.2:1.0: USB Mass Storage device detected
[    9.575468] scsi host0: usb-storage 1-1.2:1.0
[    9.855845] [<c0342517>] (platform_drv_probe) from [<c03412e7>] (driver_probe_device+0x153/0x1bc)
[   10.006137] [<c03412e7>] (driver_probe_device) from [<c03413a7>] (__driver_attach+0x57/0x58)
[   10.009576] atkbd serio1: keyboard reset failed on 1c070000.kmi
[   10.237057] [<c03413a7>] (__driver_attach) from [<c0340199>] (bus_for_each_dev+0x2d/0x4c)
[   10.387824] [<c0340199>] (bus_for_each_dev) from [<c0340bd7>] (bus_add_driver+0xa3/0x14c)
[   10.539200] [<c0340bd7>] (bus_add_driver) from [<c0341bff>] (driver_register+0x3b/0x88)
[   10.691023] [<c0341bff>] (driver_register) from [<c0009613>] (do_one_initcall+0x5b/0x150)
[   10.703809] scsi 0:0:0:0: Direct-Access     General  USB Flash Disk   1.0  PQ: 0 ANSI: 2
[   10.713081] sd 0:0:0:0: [sda] 7831552 512-byte logical blocks: (4.00 GB/3.73 GiB)
[   10.713973] sd 0:0:0:0: [sda] Write Protect is off
[   10.713984] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
[   10.730783] sd 0:0:0:0: [sda] No Caching mode page found
[   10.730814] sd 0:0:0:0: [sda] Assuming drive cache: write through
[   10.779815]  sda: sda1 sda2
[   10.823590] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   11.581894] [<c0009613>] (do_one_initcall) from [<c0734b45>] (kernel_init_freeable+0x18d/0x22c)
[   11.720454] [<c0734b45>] (kernel_init_freeable) from [<c04f45f9>] (kernel_init+0xd/0xa4)
[   11.857340] [<c04f45f9>] (kernel_init) from [<c000dfb9>] (ret_from_fork+0x11/0x38)
[   11.993082] ---[ end trace 62ff5522fb3f41dd ]---

Fix this path, and the others, by taking cpufreq_driver_lock around the
accesses.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 63d6efb..98adbc2 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1585,6 +1585,7 @@ EXPORT_SYMBOL(cpufreq_generic_suspend);
 void cpufreq_suspend(void)
 {
 	struct cpufreq_policy *policy;
+	unsigned long flags;
 
 	if (!cpufreq_driver)
 		return;
@@ -1594,6 +1595,7 @@ void cpufreq_suspend(void)
 
 	pr_debug("%s: Suspending Governors\n", __func__);
 
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_active_policy(policy) {
 		if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
 			pr_err("%s: Failed to stop governor for policy: %p\n",
@@ -1603,6 +1605,7 @@ void cpufreq_suspend(void)
 			pr_err("%s: Failed to suspend driver: %p\n", __func__,
 				policy);
 	}
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 suspend:
 	cpufreq_suspended = true;
@@ -1617,6 +1620,7 @@ suspend:
 void cpufreq_resume(void)
 {
 	struct cpufreq_policy *policy;
+	unsigned long flags;
 
 	if (!cpufreq_driver)
 		return;
@@ -1628,6 +1632,7 @@ void cpufreq_resume(void)
 
 	pr_debug("%s: Resuming Governors\n", __func__);
 
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_active_policy(policy) {
 		if (cpufreq_driver->resume && cpufreq_driver->resume(policy))
 			pr_err("%s: Failed to resume driver: %p\n", __func__,
@@ -1637,6 +1642,7 @@ void cpufreq_resume(void)
 			pr_err("%s: Failed to start governor for policy: %p\n",
 				__func__, policy);
 	}
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	/*
 	 * schedule call cpufreq_update_policy() for first-online CPU, as that
@@ -2287,7 +2293,9 @@ static int cpufreq_boost_set_sw(int state)
 	struct cpufreq_frequency_table *freq_table;
 	struct cpufreq_policy *policy;
 	int ret = -EINVAL;
+	unsigned long flags;
 
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_active_policy(policy) {
 		freq_table = cpufreq_frequency_get_table(policy->cpu);
 		if (freq_table) {
@@ -2302,6 +2310,7 @@ static int cpufreq_boost_set_sw(int state)
 			__cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
 		}
 	}
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	return ret;
 }
@@ -2432,14 +2441,16 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	if (ret)
 		goto err_boost_unreg;
 
-	lockdep_assert_held(&cpufreq_driver_lock);
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 	if (!(cpufreq_driver->flags & CPUFREQ_STICKY) &&
 	    list_empty(&cpufreq_policy_list)) {
 		/* if all ->init() calls failed, unregister */
 		pr_debug("%s: No CPU initialized for driver %s\n", __func__,
 			 driver_data->name);
+		read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		goto err_if_unreg;
 	}
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	register_hotcpu_notifier(&cpufreq_cpu_notifier);
 	pr_debug("driver %s up and running\n", driver_data->name);
-- 
2.2.2


* [RFC PATCH 07/19] cpufreq: assert locking when accessing cpufreq_governor_list
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (5 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 06/19] cpufreq: always access cpufreq_policy_list while holding cpufreq_driver_lock Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 10:01   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 08/19] cpufreq: fix warning for cpufreq_init_policy unlocked access to cpufreq_governor_list Juri Lelli
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

cpufreq_governor_list is guarded by cpufreq_governor_mutex. Add
appropriate locking assertions to check that we always access the list
while holding the lock protecting it.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 98adbc2..7dae7f3 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -506,6 +506,7 @@ static struct cpufreq_governor *find_governor(const char *str_governor)
 {
 	struct cpufreq_governor *t;
 
+	lockdep_assert_held(&cpufreq_governor_mutex);
 	for_each_governor(t)
 		if (!strncasecmp(str_governor, t->name, CPUFREQ_NAME_LEN))
 			return t;
@@ -693,6 +694,7 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
 		goto out;
 	}
 
+	lockdep_assert_held(&cpufreq_governor_mutex);
 	for_each_governor(t) {
 		if (i >= (ssize_t) ((PAGE_SIZE / sizeof(char))
 		    - (CPUFREQ_NAME_LEN + 2)))
@@ -2025,6 +2027,7 @@ int cpufreq_register_governor(struct cpufreq_governor *governor)
 	err = -EBUSY;
 	if (!find_governor(governor->name)) {
 		err = 0;
+		lockdep_assert_held(&cpufreq_governor_mutex);
 		list_add(&governor->governor_list, &cpufreq_governor_list);
 	}
 
-- 
2.2.2


* [RFC PATCH 08/19] cpufreq: fix warning for cpufreq_init_policy unlocked access to cpufreq_governor_list
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (6 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 07/19] cpufreq: assert locking when accessing cpufreq_governor_list Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 10:09   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 09/19] cpufreq: fix warning for show_scaling_available_governors " Juri Lelli
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

cpufreq_init_policy calls find_governor, which iterates through
cpufreq_governor_list.  cpufreq_governor_mutex has to be held before
calling that function or the following warning will be generated:

[    8.100161] cpu cpu0: bL_cpufreq_init: CPU 0 initialized
[    8.164477] ------------[ cut here ]------------
[    8.225164] WARNING: CPU: 2 PID: 1 at kernel/drivers/cpufreq/cpufreq.c:512 find_governor+0x57/0x68()
[    8.356296] Modules linked in:
[    8.411252] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc2+ #298
[    8.477501] Hardware name: ARM-Versatile Express
[    8.538416] [<c0014215>] (unwind_backtrace) from [<c0010e25>] (show_stack+0x11/0x14)
[    8.657973] [<c0010e25>] (show_stack) from [<c02eca5d>] (dump_stack+0x55/0x78)
[    8.778347] [<c02eca5d>] (dump_stack) from [<c00202cd>] (warn_slowpath_common+0x59/0x84)
[    8.888775] usb 1-1.2: new high-speed USB device number 3 using isp1760
[    8.981012] [<c00202cd>] (warn_slowpath_common) from [<c002030f>] (warn_slowpath_null+0x17/0x1c)
[    8.993193] usb-storage 1-1.2:1.0: USB Mass Storage device detected
[    8.995167] scsi host0: usb-storage 1-1.2:1.0
[    9.260384] [<c002030f>] (warn_slowpath_null) from [<c03b7caf>] (find_governor+0x57/0x68)
[    9.395241] [<c03b7caf>] (find_governor) from [<c03b917d>] (cpufreq_init_policy+0x21/0x50)
[    9.532811] [<c03b917d>] (cpufreq_init_policy) from [<c03b9391>] (cpufreq_online+0x1e5/0x530)
[    9.676622] [<c03b9391>] (cpufreq_online) from [<c033fc9f>] (subsys_interface_register+0x53/0x78)
[    9.826894] [<c033fc9f>] (subsys_interface_register) from [<c03b986f>] (cpufreq_register_driver+0x9f/0x108)
[    9.981174] [<c03b986f>] (cpufreq_register_driver) from [<c03bb9b1>] (bL_cpufreq_register+0x49/0x98)
[   10.002780] scsi 0:0:0:0: Direct-Access     General  USB Flash Disk   1.0  PQ: 0 ANSI: 2
[   10.024039] sd 0:0:0:0: [sda] 7831552 512-byte logical blocks: (4.00 GB/3.73 GiB)
[   10.030505] sd 0:0:0:0: [sda] Write Protect is off
[   10.030544] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
[   10.039631] sd 0:0:0:0: [sda] No Caching mode page found
[   10.039672] sd 0:0:0:0: [sda] Assuming drive cache: write through
[   10.093331]  sda: sda1 sda2
[   10.138827] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   10.883930] [<c03bb9b1>] (bL_cpufreq_register) from [<c034203f>] (platform_drv_probe+0x3b/0x6c)
[   11.024538] [<c034203f>] (platform_drv_probe) from [<c0340e2f>] (driver_probe_device+0x16f/0x1c0)
[   11.167217] [<c0340e2f>] (driver_probe_device) from [<c0340ed7>] (__driver_attach+0x57/0x58)
[   11.308084] [<c0340ed7>] (__driver_attach) from [<c033fd41>] (bus_for_each_dev+0x2d/0x4c)
[   11.448056] [<c033fd41>] (bus_for_each_dev) from [<c034077f>] (bus_add_driver+0xa3/0x14c)
[   11.587859] [<c034077f>] (bus_add_driver) from [<c0341727>] (driver_register+0x3b/0x88)
[   11.726813] [<c0341727>] (driver_register) from [<c0009613>] (do_one_initcall+0x5b/0x150)
[   11.866247] [<c0009613>] (do_one_initcall) from [<c0732b45>] (kernel_init_freeable+0x18d/0x22c)
[   12.008765] [<c0732b45>] (kernel_init_freeable) from [<c04f24ed>] (kernel_init+0xd/0xa4)
[   12.150461] [<c04f24ed>] (kernel_init) from [<c000dfb9>] (ret_from_fork+0x11/0x38)
[   12.294473] ---[ end trace 545905b1fdc9cd96 ]---
[   12.371823] atkbd serio0: keyboard reset failed on 1c060000.kmi
[   12.372910] cpu cpu1: bL_cpufreq_init: CPU 1 initialized
[   12.373741] ------------[ cut here ]------------

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 7dae7f3..d065435 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -969,6 +969,7 @@ static int cpufreq_init_policy(struct cpufreq_policy *policy)
 
 	memcpy(&new_policy, policy, sizeof(*policy));
 
+	mutex_lock(&cpufreq_governor_mutex);
 	/* Update governor of new_policy to the governor used before hotplug */
 	gov = find_governor(policy->last_governor);
 	if (gov)
@@ -976,6 +977,7 @@ static int cpufreq_init_policy(struct cpufreq_policy *policy)
 				policy->governor->name, policy->cpu);
 	else
 		gov = CPUFREQ_DEFAULT_GOVERNOR;
+	mutex_unlock(&cpufreq_governor_mutex);
 
 	new_policy.governor = gov;
 
-- 
2.2.2

* [RFC PATCH 09/19] cpufreq: fix warning for show_scaling_available_governors unlocked access to cpufreq_governor_list
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (7 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 08/19] cpufreq: fix warning for cpufreq_init_policy unlocked access to cpufreq_governor_list Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 10:13   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 10/19] cpufreq: assert policy->rwsem is held in cpufreq_set_policy Juri Lelli
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

show_scaling_available_governors iterates through cpufreq_governor_list
without holding cpufreq_governor_mutex; this generates the following
warning:

[  700.910381] ------------[ cut here ]------------
[  700.924282] WARNING: CPU: 2 PID: 1756 at kernel/drivers/cpufreq/cpufreq.c:700 show_scaling_available_governors+0x6f/0xb8()
[  700.965473] Modules linked in:
[  700.974637] CPU: 2 PID: 1756 Comm: cat Tainted: G        W       4.4.0-rc2+ #299
[  700.996813] Hardware name: ARM-Versatile Express
[  701.010674] [<c0014215>] (unwind_backtrace) from [<c0010e25>] (show_stack+0x11/0x14)
[  701.033905] [<c0010e25>] (show_stack) from [<c02eca5d>] (dump_stack+0x55/0x78)
[  701.055561] [<c02eca5d>] (dump_stack) from [<c00202cd>] (warn_slowpath_common+0x59/0x84)
[  701.079839] [<c00202cd>] (warn_slowpath_common) from [<c002030f>] (warn_slowpath_null+0x17/0x1c)
[  701.106182] [<c002030f>] (warn_slowpath_null) from [<c03b7bef>] (show_scaling_available_governors+0x6f/0xb8)
[  701.135656] [<c03b7bef>] (show_scaling_available_governors) from [<c03b7dc3>] (show+0x27/0x38)
[  701.161488] [<c03b7dc3>] (show) from [<c015469f>] (sysfs_kf_seq_show+0x5f/0xa0)
[  701.183409] [<c015469f>] (sysfs_kf_seq_show) from [<c01536a7>] (kernfs_seq_show+0x1b/0x1c)
[  701.208188] [<c01536a7>] (kernfs_seq_show) from [<c011a6d5>] (seq_read+0x129/0x33c)
[  701.231161] [<c011a6d5>] (seq_read) from [<c00ff7c7>] (__vfs_read+0x1b/0x84)
[  701.252300] [<c00ff7c7>] (__vfs_read) from [<c010000f>] (vfs_read+0x5f/0xb0)
[  701.273436] [<c010000f>] (vfs_read) from [<c0100099>] (SyS_read+0x39/0x68)
[  701.294049] [<c0100099>] (SyS_read) from [<c000df21>] (ret_fast_syscall+0x1/0x1a)
[  701.316484] ---[ end trace 5dd15744a4da127c ]---

Fix this by locking cpufreq_governor_mutex before for_each_governor().

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d065435..d91fdb8 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -694,7 +694,7 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
 		goto out;
 	}
 
-	lockdep_assert_held(&cpufreq_governor_mutex);
+	mutex_lock(&cpufreq_governor_mutex);
 	for_each_governor(t) {
 		if (i >= (ssize_t) ((PAGE_SIZE / sizeof(char))
 		    - (CPUFREQ_NAME_LEN + 2)))
@@ -702,6 +702,7 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
 		i += scnprintf(&buf[i], CPUFREQ_NAME_PLEN, "%s ", t->name);
 	}
 out:
+	mutex_unlock(&cpufreq_governor_mutex);
 	i += sprintf(&buf[i], "\n");
 	return i;
 }
-- 
2.2.2

* [RFC PATCH 10/19] cpufreq: assert policy->rwsem is held in cpufreq_set_policy
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (8 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 09/19] cpufreq: fix warning for show_scaling_available_governors " Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 10:15   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor Juri Lelli
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

Since cpufreq_set_policy modifies policy, it has to run under policy->rwsem
protection.

Assert this condition.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d91fdb8..f1f9fbc 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2109,6 +2109,8 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	pr_debug("setting new policy for CPU %u: %u - %u kHz\n",
 		 new_policy->cpu, new_policy->min, new_policy->max);
 
+	lockdep_assert_held(&policy->rwsem);
+
 	memcpy(&new_policy->cpuinfo, &policy->cpuinfo, sizeof(policy->cpuinfo));
 
 	/*
-- 
2.2.2

* [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (9 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 10/19] cpufreq: assert policy->rwsem is held in cpufreq_set_policy Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 10:20   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 12/19] cpufreq: fix locking of policy->rwsem in cpufreq_init_policy Juri Lelli
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

__cpufreq_governor works on policy, so policy->rwsem has to be held.
Add an assertion for this condition.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index f1f9fbc..e7fc5c9 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1950,6 +1950,9 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 	/* Don't start any governor operations if we are entering suspend */
 	if (cpufreq_suspended)
 		return 0;
+
+	lockdep_assert_held(&policy->rwsem);
+
 	/*
 	 * Governor might not be initiated here if ACPI _PPC changed
 	 * notification happened, so check it.
-- 
2.2.2

* [RFC PATCH 12/19] cpufreq: fix locking of policy->rwsem in cpufreq_init_policy
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (10 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 10:39   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 13/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare Juri Lelli
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

There are paths in cpufreq_init_policy where policy is used, but its rwsem
is not held.

Fix it.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index e7fc5c9..2c7cc6c73 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -998,21 +998,24 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, unsigned int cp
 {
 	int ret = 0;
 
+	down_write(&policy->rwsem);
+
 	/* Has this CPU been taken care of already? */
-	if (cpumask_test_cpu(cpu, policy->cpus))
+	if (cpumask_test_cpu(cpu, policy->cpus)) {
+		up_write(&policy->rwsem);
 		return 0;
+	}
 
 	if (has_target()) {
 		ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 		if (ret) {
+			up_write(&policy->rwsem);
 			pr_err("%s: Failed to stop governor\n", __func__);
 			return ret;
 		}
 	}
 
-	down_write(&policy->rwsem);
 	cpumask_set_cpu(cpu, policy->cpus);
-	up_write(&policy->rwsem);
 
 	if (has_target()) {
 		ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
@@ -1020,10 +1023,12 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, unsigned int cp
 			ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
 
 		if (ret) {
+			up_write(&policy->rwsem);
 			pr_err("%s: Failed to start governor\n", __func__);
 			return ret;
 		}
 	}
+	up_write(&policy->rwsem);
 
 	return 0;
 }
-- 
2.2.2

* [RFC PATCH 13/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (11 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 12/19] cpufreq: fix locking of policy->rwsem in cpufreq_init_policy Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 10:54   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 14/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_finish Juri Lelli
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

There are paths in cpufreq_offline_prepare where policy is used, but its
rwsem is not held.

Fix it.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2c7cc6c73..91158b0 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1332,13 +1332,13 @@ static void cpufreq_offline_prepare(unsigned int cpu)
 		return;
 	}
 
+	down_write(&policy->rwsem);
 	if (has_target()) {
 		int ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 		if (ret)
 			pr_err("%s: Failed to stop governor\n", __func__);
 	}
 
-	down_write(&policy->rwsem);
 	cpumask_clear_cpu(cpu, policy->cpus);
 
 	if (policy_is_inactive(policy)) {
@@ -1356,12 +1356,16 @@ static void cpufreq_offline_prepare(unsigned int cpu)
 	/* Start governor again for active policy */
 	if (!policy_is_inactive(policy)) {
 		if (has_target()) {
-			int ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
+			int ret;
+
+			down_write(&policy->rwsem);
+			ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
 			if (!ret)
 				ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
 
 			if (ret)
 				pr_err("%s: Failed to start governor\n", __func__);
+			up_write(&policy->rwsem);
 		}
 	} else if (cpufreq_driver->stop_cpu) {
 		cpufreq_driver->stop_cpu(policy);
-- 
2.2.2

* [RFC PATCH 14/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_finish
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (12 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 13/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 11:02   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor Juri Lelli
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

There are paths in cpufreq_offline_finish where policy is used, but its
rwsem is not held.

Fix it.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 91158b0..ba452c3 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1381,9 +1381,13 @@ static void cpufreq_offline_finish(unsigned int cpu)
 		return;
 	}
 
+	down_write(&policy->rwsem);
+
 	/* Only proceed for inactive policies */
-	if (!policy_is_inactive(policy))
+	if (!policy_is_inactive(policy)) {
+		up_write(&policy->rwsem);
 		return;
+	}
 
 	/* If cpu is last user of policy, free policy */
 	if (has_target()) {
@@ -1392,6 +1396,8 @@ static void cpufreq_offline_finish(unsigned int cpu)
 			pr_err("%s: Failed to exit governor\n", __func__);
 	}
 
+	up_write(&policy->rwsem);
+
 	/*
 	 * Perform the ->exit() even during light-weight tear-down,
 	 * since this is a core component, and is essential for the
-- 
2.2.2

* [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (13 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 14/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_finish Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 11:06   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 16/19] cpufreq: hold policy->rwsem across CPUFREQ_GOV_POLICY_EXIT Juri Lelli
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

Commit 6f1e4efd882e ("cpufreq: Fix timer/workqueue corruption by
protecting reading governor_enabled") made policy->governor_enabled
guarded by cpufreq_governor_mutex in __cpufreq_governor. Now that
holding of policy->rwsem is asserted in __cpufreq_governor, also taking
cpufreq_governor_mutex is overkill.

Remove that usage. This also cleans up the semantics of
cpufreq_governor_mutex: it guards cpufreq_governor_list only.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index ba452c3..d58a622 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1993,11 +1993,9 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 
 	pr_debug("%s: for CPU %u, event %u\n", __func__, policy->cpu, event);
 
-	mutex_lock(&cpufreq_governor_mutex);
 	if ((policy->governor_enabled && event == CPUFREQ_GOV_START)
 	    || (!policy->governor_enabled
 	    && (event == CPUFREQ_GOV_LIMITS || event == CPUFREQ_GOV_STOP))) {
-		mutex_unlock(&cpufreq_governor_mutex);
 		return -EBUSY;
 	}
 
@@ -2006,8 +2004,6 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 	else if (event == CPUFREQ_GOV_START)
 		policy->governor_enabled = true;
 
-	mutex_unlock(&cpufreq_governor_mutex);
-
 	ret = policy->governor->governor(policy, event);
 
 	if (!ret) {
@@ -2017,12 +2013,10 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
 			policy->governor->initialized--;
 	} else {
 		/* Restore original values */
-		mutex_lock(&cpufreq_governor_mutex);
 		if (event == CPUFREQ_GOV_STOP)
 			policy->governor_enabled = true;
 		else if (event == CPUFREQ_GOV_START)
 			policy->governor_enabled = false;
-		mutex_unlock(&cpufreq_governor_mutex);
 	}
 
 	if (((event == CPUFREQ_GOV_POLICY_INIT) && ret) ||
-- 
2.2.2

* [RFC PATCH 16/19] cpufreq: hold policy->rwsem across CPUFREQ_GOV_POLICY_EXIT
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (14 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 11:09   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 17/19] cpufreq: stop checking for cpufreq_driver being present in cpufreq_cpu_get Juri Lelli
                   ` (4 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

There is no good reason why policy->rwsem cannot be held across calls to
__cpufreq_governor with the CPUFREQ_GOV_POLICY_EXIT event.

Remove {up,down}_write around such call sites. This also satisfies the
assertion that policy->rwsem is always held when calling into
__cpufreq_governor.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 4 ----
 include/linux/cpufreq.h   | 4 ----
 2 files changed, 8 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d58a622..797bfae 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2182,9 +2182,7 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 			return ret;
 		}
 
-		up_write(&policy->rwsem);
 		ret = __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
-		down_write(&policy->rwsem);
 
 		if (ret) {
 			pr_err("%s: Failed to Exit Governor: %s (%d)\n",
@@ -2201,9 +2199,7 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 		if (!ret)
 			goto out;
 
-		up_write(&policy->rwsem);
 		__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT);
-		down_write(&policy->rwsem);
 	}
 
 	/* new governor failed, so re-start old one */
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 88a4215..79b87ce 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -100,10 +100,6 @@ struct cpufreq_policy {
 	 * - Any routine that will write to the policy structure and/or may take away
 	 *   the policy altogether (eg. CPU hotplug), will hold this lock in write
 	 *   mode before doing so.
-	 *
-	 * Additional rules:
-	 * - Lock should not be held across
-	 *     __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
 	 */
 	struct rw_semaphore	rwsem;
 
-- 
2.2.2

* [RFC PATCH 17/19] cpufreq: stop checking for cpufreq_driver being present in cpufreq_cpu_get
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (15 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 16/19] cpufreq: hold policy->rwsem across CPUFREQ_GOV_POLICY_EXIT Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 11:17   ` Viresh Kumar
  2016-01-11 17:35 ` [RFC PATCH 18/19] cpufreq: remove transition_lock Juri Lelli
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

After cpufreq_driver_lock is acquired in cpufreq_cpu_get() we are sure
we can't race with cpufreq_policy_free() (which is in the path that ends
up removing cpufreq_driver). We can thus safely remove the check for
cpufreq_driver being present (which is a leftover from commit
6eed9404ab3c ("cpufreq: Use rwsem for protecting critical sections")).

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 drivers/cpufreq/cpufreq.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 797bfae..6c9bef7 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -282,15 +282,15 @@ struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu)
 	if (WARN_ON(cpu >= nr_cpu_ids))
 		return NULL;
 
-	/* get the cpufreq driver */
 	read_lock_irqsave(&cpufreq_driver_lock, flags);
 
-	if (cpufreq_driver) {
-		/* get the CPU */
-		policy = cpufreq_cpu_get_raw(cpu);
-		if (policy)
-			kobject_get(&policy->kobj);
-	}
+	/*
+	 * If we get a policy, cpufreq_policy_free() didn't
+	 * yet run.
+	 */
+	policy = cpufreq_cpu_get_raw(cpu);
+	if (policy)
+		kobject_get(&policy->kobj);
 
 	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
-- 
2.2.2

* [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (16 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 17/19] cpufreq: stop checking for cpufreq_driver being present in cpufreq_cpu_get Juri Lelli
@ 2016-01-11 17:35 ` Juri Lelli
  2016-01-12 11:24   ` Viresh Kumar
  2016-01-11 17:36 ` [RFC PATCH 19/19] cpufreq: documentation: document locking scheme Juri Lelli
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli

From: Michael Turquette <mturquette@baylibre.com>

transition_lock was introduced to serialize cpufreq transition
notifiers. Instead of using a separate lock to protect against concurrent
modifications of policy, it is better to require that callers of
transition notifiers implement appropriate locking (this is already the
case AFAICS). Removing transition_lock also simplifies the current locking
scheme.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Michael Turquette <mturquette@baylibre.com>
---
 drivers/cpufreq/cpufreq.c | 19 ++++++++++---------
 include/linux/cpufreq.h   |  1 -
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 6c9bef7..78b1e2f 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -421,10 +421,15 @@ static void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
 	cpufreq_notify_transition(policy, freqs, CPUFREQ_POSTCHANGE);
 }
 
+/*
+ * Callers must ensure proper mutual exclusion on policy (for
+ * transition_ongoing/transition_task handling). While holding policy->rwsem
+ * is sufficient, other schemes might work as well (e.g., cpufreq_governor.c
+ * holds timer_mutex while entering the path that generates transitions).
+ */
 void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
 		struct cpufreq_freqs *freqs)
 {
-
 	/*
 	 * Catch double invocations of _begin() which lead to self-deadlock.
 	 * ASYNC_NOTIFICATION drivers are left out because the cpufreq core
@@ -439,22 +444,19 @@ void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
 wait:
 	wait_event(policy->transition_wait, !policy->transition_ongoing);
 
-	spin_lock(&policy->transition_lock);
-
-	if (unlikely(policy->transition_ongoing)) {
-		spin_unlock(&policy->transition_lock);
+	if (unlikely(policy->transition_ongoing))
 		goto wait;
-	}
 
 	policy->transition_ongoing = true;
 	policy->transition_task = current;
 
-	spin_unlock(&policy->transition_lock);
-
 	cpufreq_notify_transition(policy, freqs, CPUFREQ_PRECHANGE);
 }
 EXPORT_SYMBOL_GPL(cpufreq_freq_transition_begin);
 
+/*
+ * As above, callers must ensure proper mutual exclusion on policy.
+ */
 void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
 		struct cpufreq_freqs *freqs, int transition_failed)
 {
@@ -1057,7 +1059,6 @@ static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu)
 	kobject_init(&policy->kobj, &ktype_cpufreq);
 	INIT_LIST_HEAD(&policy->policy_list);
 	init_rwsem(&policy->rwsem);
-	spin_lock_init(&policy->transition_lock);
 	init_waitqueue_head(&policy->transition_wait);
 	init_completion(&policy->kobj_unregister);
 	INIT_WORK(&policy->update, handle_update);
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 79b87ce..6bbb88f 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -105,7 +105,6 @@ struct cpufreq_policy {
 
 	/* Synchronization for frequency transitions */
 	bool			transition_ongoing; /* Tracks transition status */
-	spinlock_t		transition_lock;
 	wait_queue_head_t	transition_wait;
 	struct task_struct	*transition_task; /* Task which is doing the transition */
 
-- 
2.2.2

* [RFC PATCH 19/19] cpufreq: documentation: document locking scheme
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (17 preceding siblings ...)
  2016-01-11 17:35 ` [RFC PATCH 18/19] cpufreq: remove transition_lock Juri Lelli
@ 2016-01-11 17:36 ` Juri Lelli
  2016-01-11 22:45 ` [RFC PATCH 00/19] cpufreq locking cleanups and documentation Rafael J. Wysocki
  2016-01-30  0:57 ` Saravana Kannan
  20 siblings, 0 replies; 110+ messages in thread
From: Juri Lelli @ 2016-01-11 17:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-pm, peterz, rjw, viresh.kumar, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann, juri.lelli,
	Jonathan Corbet, linux-doc

The CPUFreq locking scheme is quite complicated. Provide some initial
documentation for it.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 Documentation/cpu-freq/core.txt | 44 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/Documentation/cpu-freq/core.txt b/Documentation/cpu-freq/core.txt
index ba78e7c..1838976 100644
--- a/Documentation/cpu-freq/core.txt
+++ b/Documentation/cpu-freq/core.txt
@@ -21,6 +21,7 @@ Contents:
 1.  CPUFreq core and interfaces
 2.  CPUFreq notifiers
 3.  CPUFreq Table Generation with Operating Performance Point (OPP)
+4.  CPUFreq locking scheme for internal data structures
 
 1. General Information
 =======================
@@ -118,3 +119,46 @@ dev_pm_opp_init_cpufreq_table - cpufreq framework typically is initialized with
 	addition to CONFIG_PM_OPP.
 
 dev_pm_opp_free_cpufreq_table - Free up the table allocated by dev_pm_opp_init_cpufreq_table
+
+4. CPUFreq locking scheme for internal data structures
+======================================================
+CPUFreq locking scheme has to guarantee ordering and mutual exclusion on
+internal data structures while concurrent execution of CPU hotplug, OPP
+selection, governor changes and driver changes might happen.
+
+Data structures depicted in the following diagram (except policyX data
+structures, see below) are protected by cpufreq_driver_lock RWLOCK (see
+beginning of cpufreq.c).
+
+ cpufreq_driver -> struct cpufreq_driver
+
+ cpufreq_cpu_data (per-CPU)
+
+                         +---+---+---+    +---+
+                         | 0 | 1 | 2 | .. |N-1|
+                         +-+-+-+-+-+-+    +-+-+
+                           |   |   |        |
+                           |---+   +--------+
+                           |                |
+                           |                |
+                           V                V
+ cpufreq_policy_list --> policy0 -- .. -> policy(N-1)-+
+                                                      |
+					              x
+
+Concurrent reads/updates of policyX data structures are guarded by
+policy->rwsem. The rules for this semaphore are (as also reported in
+include/linux/cpufreq.h):
+
+ - any routine that wants to read from the policy structure will
+   do a down_read on this semaphore;
+ - any routine that will write to the policy structure and/or may take away
+   the policy altogether (eg. CPU hotplug), will hold this lock in write
+   mode before doing so.
+
+We then have another list that keeps track of available governors and that is
+protected by cpufreq_governor_mutex MUTEX:
+
+ cpufreq_governor_list --> governor0 -- .. --> governor(N-1) -+
+                                                              |
+				                              x
-- 
2.2.2

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-11 17:35 ` [RFC PATCH 04/19] cpufreq: bring data structures close to their locks Juri Lelli
@ 2016-01-11 22:05   ` Peter Zijlstra
  2016-01-11 23:03     ` Rafael J. Wysocki
  2016-01-11 22:07   ` Peter Zijlstra
  2016-01-12  9:10   ` Viresh Kumar
  2 siblings, 1 reply; 110+ messages in thread
From: Peter Zijlstra @ 2016-01-11 22:05 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, rjw, viresh.kumar, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Mon, Jan 11, 2016 at 05:35:45PM +0000, Juri Lelli wrote:
> +/**
> + * The "cpufreq driver" - the arch- or hardware-dependent low
> + * level driver of CPUFreq support, and its spinlock (cpufreq_driver_lock).
> + * This lock also protects cpufreq_cpu_data array and cpufreq_policy_list.
> + */
> +static struct cpufreq_driver *cpufreq_driver;
> +static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
>  static LIST_HEAD(cpufreq_policy_list);
> +static DEFINE_RWLOCK(cpufreq_driver_lock);

Part of my suggestion was to fold the per-cpu data of cpufreq_cpu_data
into struct cpufreq_driver.

That way each cpufreq_driver will have its own copy and there'd be only
the one global pointer to swizzle. Something very well suited to RCU.
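
The shape of that suggestion can be sketched in userspace C. This is only
an illustration under stated assumptions, not kernel code: driver_sketch,
policy_sketch, NR_CPUS_SKETCH, publish_driver() and lookup_policy() are
hypothetical names, C11 atomics stand in for rcu_assign_pointer() and
rcu_dereference(), and grace-period handling (waiting for readers before
the old driver is freed) is not modelled at all.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical, fixed CPU count for the sketch */

/* Hypothetical stand-in for struct cpufreq_policy. */
struct policy_sketch {
	unsigned int cur_khz;
};

/*
 * Hypothetical stand-in for struct cpufreq_driver with the per-CPU
 * policy pointers folded in, so each driver carries its own copy.
 */
struct driver_sketch {
	const char *name;
	struct policy_sketch *cpu_data[NR_CPUS_SKETCH];
};

/* The single global pointer that gets swapped on driver changes. */
static _Atomic(struct driver_sketch *) active_driver;

/* Writer side: publish a fully initialised driver (or NULL to remove it). */
static void publish_driver(struct driver_sketch *drv)
{
	atomic_store_explicit(&active_driver, drv, memory_order_release);
}

/* Reader side: one pointer load, then a plain array lookup, no lock. */
static struct policy_sketch *lookup_policy(unsigned int cpu)
{
	struct driver_sketch *drv;

	if (cpu >= NR_CPUS_SKETCH)
		return NULL;

	drv = atomic_load_explicit(&active_driver, memory_order_acquire);
	return drv ? drv->cpu_data[cpu] : NULL;
}
```

The point of the layout is that a reader never sees a driver pointer from
one registration paired with per-CPU data from another; in the kernel the
missing piece would be deferring the free of the old object until all
readers are done.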

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-11 17:35 ` [RFC PATCH 04/19] cpufreq: bring data structures close to their locks Juri Lelli
  2016-01-11 22:05   ` Peter Zijlstra
@ 2016-01-11 22:07   ` Peter Zijlstra
  2016-01-12  9:27     ` Viresh Kumar
  2016-01-12  9:10   ` Viresh Kumar
  2 siblings, 1 reply; 110+ messages in thread
From: Peter Zijlstra @ 2016-01-11 22:07 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, rjw, viresh.kumar, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Mon, Jan 11, 2016 at 05:35:45PM +0000, Juri Lelli wrote:
> +/**
> + * Iterate over governors
> + *
> + * cpufreq_governor_list is protected by cpufreq_governor_mutex.
> + */
> +static LIST_HEAD(cpufreq_governor_list);
> +static DEFINE_MUTEX(cpufreq_governor_mutex);
> +#define for_each_governor(__governor)				\
> +	list_for_each_entry(__governor, &cpufreq_governor_list, governor_list)

So you could stuff the lockdep_assert_held() you later add individually
into the for_each_governor macro, impossible to forget that way.
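
One way this could look, as a minimal user-space sketch rather than kernel
code. All demo_* names are hypothetical, and a plain bool plus assert() stands
in for lockdep's own tracking of cpufreq_governor_mutex. The helper function is
there because the check has to fit into the for-init expression;
lockdep_assert_held() is, as far as I can tell, a statement-style macro, so it
could not be dropped into a comma expression directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpufreq_governor. */
struct demo_governor {
	const char *name;
	struct demo_governor *next;
};

static struct demo_governor *demo_governor_list;

/* Assumption: stands in for "cpufreq_governor_mutex is held". */
static bool demo_governor_locked;

/* Checks the locking rule, then returns the list head for the iterator. */
static inline struct demo_governor *demo_first_governor(void)
{
	assert(demo_governor_locked);
	return demo_governor_list;
}

/* Every user of the iterator gets the assertion for free. */
#define demo_for_each_governor(g) \
	for ((g) = demo_first_governor(); (g); (g) = (g)->next)

/* Example user: counts the registered governors. */
static int demo_count_governors(void)
{
	struct demo_governor *g;
	int n = 0;

	demo_for_each_governor(g)
		n++;
	return n;
}
```

The assertion runs once per loop, when the iterator is set up, which matches
what a single lockdep_assert_held() in front of each loop would check.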

* Re: [RFC PATCH 00/19] cpufreq locking cleanups and documentation
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (18 preceding siblings ...)
  2016-01-11 17:36 ` [RFC PATCH 19/19] cpufreq: documentation: document locking scheme Juri Lelli
@ 2016-01-11 22:45 ` Rafael J. Wysocki
  2016-01-12 10:46   ` Juri Lelli
  2016-01-30  0:57 ` Saravana Kannan
  20 siblings, 1 reply; 110+ messages in thread
From: Rafael J. Wysocki @ 2016-01-11 22:45 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, viresh.kumar, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Monday, January 11, 2016 05:35:41 PM Juri Lelli wrote:
> Hi all,
> 
> In the context of the ongoing discussion about introducing a simple platform
> energy model to guide scheduling decisions (Energy Aware Scheduling [1])
> concerns have been expressed by Peter about the component in charge of driving
> clock frequency selection (Steve recently posted an update of such component
> [2]): https://lkml.org/lkml/2015/8/15/141.
> 
> The problem is that, with this new approach, cpufreq core functions need to be
> accessed from scheduler hot-paths and the overhead associated with the current
> locking scheme might prove unsustainable.
> 
> Peter's proposed approach of using RCU logic to reduce locking overhead seems
> reasonable, but things may not be so straightforward as originally thought. The
> very first thing I actually realized when I started looking into this is that
> it was hard for me to understand which locking mechanism was protecting which
> data structure. As mostly a way to build a better understanding of the current
> cpufreq locking scheme and also as preparatory work for implementing RCU logic,
> I came up with this set of patches. In fact, at this stage, I would like each
> patch to be considered as a question I'm asking rather than a proposed change,
> thus the RFC tag for the series; with the intent of documenting current locking
> scheme and modifying it a bit in order to make RCU logic implementation easier.
> Actually, as you'll soon notice, I didn't really start from scratch. Mike
> shared with me some patches he has been developing while looking at the same
> problem. I've given Mike attribution for the patches that I took unchanged from
> him, with thanks for sharing his findings with me.
> 
> High level description of patches:
> 
>  o [01-04] cleanup and move code around to make things (hopefully) cleaner
>  o [05-14] insert lockdep assertions and fix uncovered erroneous situations
>  o [15-18] remove overkill usage of locking mechanism
>  o 19      adds documentation for the cleaned up locking scheme
> 
> With Viresh's tests [3] on both arm TC2 and arm64 Juno boards I'm not seeing
> anything bad happening. However, coverage is really small (as is my personal
> confidence of not breaking things for other confs :-)).
> 
> This set is based on top of linux-pm/linux-next as of today and it is also
> available from here:

Due to the merge window in progress I have more urgent things to do than
looking at this material right now.  Sorry about that.

I may be able to look at it towards the end of the week.

Thanks,
Rafael

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-11 22:05   ` Peter Zijlstra
@ 2016-01-11 23:03     ` Rafael J. Wysocki
  2016-01-12  8:27       ` Peter Zijlstra
  0 siblings, 1 reply; 110+ messages in thread
From: Rafael J. Wysocki @ 2016-01-11 23:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Juri Lelli, linux-kernel, linux-pm, viresh.kumar, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Monday, January 11, 2016 11:05:28 PM Peter Zijlstra wrote:
> On Mon, Jan 11, 2016 at 05:35:45PM +0000, Juri Lelli wrote:
> > +/**
> > + * The "cpufreq driver" - the arch- or hardware-dependent low
> > + * level driver of CPUFreq support, and its spinlock (cpufreq_driver_lock).
> > + * This lock also protects cpufreq_cpu_data array and cpufreq_policy_list.
> > + */
> > +static struct cpufreq_driver *cpufreq_driver;
> > +static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
> >  static LIST_HEAD(cpufreq_policy_list);
> > +static DEFINE_RWLOCK(cpufreq_driver_lock);
> 
> Part of my suggestion was to fold the per-cpu data of cpufreq_cpu_data
> into struct cpufreq_driver.
> 
> That way each cpufreq_driver will have its own copy and there'd be only
> the one global pointer to swizzle. Something very well suited to RCU.

Well, I'm not really sure reworking all that is necessary.

What we need is to be able to call something analogous to dbs_timer_handler()
from the scheduler and a driver callback from there (if present).  For that,
it should be sufficient to have a pointer to that callback (that may be set
upon driver registration) protected by RCU (or should that be sched RCU
rather?) if I'm not missing anything.

Thanks,
Rafael

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-11 23:03     ` Rafael J. Wysocki
@ 2016-01-12  8:27       ` Peter Zijlstra
  2016-01-12 10:43         ` Juri Lelli
  2016-01-12 16:47         ` Rafael J. Wysocki
  0 siblings, 2 replies; 110+ messages in thread
From: Peter Zijlstra @ 2016-01-12  8:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Juri Lelli, linux-kernel, linux-pm, viresh.kumar, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Tue, Jan 12, 2016 at 12:03:39AM +0100, Rafael J. Wysocki wrote:
> On Monday, January 11, 2016 11:05:28 PM Peter Zijlstra wrote:
> > On Mon, Jan 11, 2016 at 05:35:45PM +0000, Juri Lelli wrote:
> > > +/**
> > > + * The "cpufreq driver" - the arch- or hardware-dependent low
> > > + * level driver of CPUFreq support, and its spinlock (cpufreq_driver_lock).
> > > + * This lock also protects cpufreq_cpu_data array and cpufreq_policy_list.
> > > + */
> > > +static struct cpufreq_driver *cpufreq_driver;
> > > +static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
> > >  static LIST_HEAD(cpufreq_policy_list);
> > > +static DEFINE_RWLOCK(cpufreq_driver_lock);
> > 
> > Part of my suggestion was to fold the per-cpu data of cpufreq_cpu_data
> > into struct cpufreq_driver.
> > 
> > That way each cpufreq_driver will have its own copy and there'd be only
> > the one global pointer to swizzle. Something very well suited to RCU.
> 
> Well, I'm not really sure reworking all that is necessary.
> 
> What we need is to be able to call something analogous to dbs_timer_handler()
> from the scheduler and a driver callback from there (if present).  For that,
> it should be sufficient to have a pointer to that callback (that may be set
> upon driver registration) protected by RCU (or should that be sched RCU
> rather?) if I'm not missing anything.

But such a callback will invariably want to use the per-cpu state. And
now you have two pointers, one for the driver and one for the per-cpu
state. Keeping that in sync is a pain.

Moving the per-cpu data into the driver solves that trivially.
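
To make the argument concrete, here is a minimal user-space sketch, not kernel
code. It assumes C11 atomics as a stand-in for rcu_assign_pointer() and
rcu_dereference(), and every demo_* name is hypothetical. Because the per-CPU
state hangs off the driver, a reader that loads the one global pointer always
sees a driver together with its own per-CPU data. Freeing the old driver after
a grace period is deliberately left out.

```c
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

/* Hypothetical stand-in for struct cpufreq_policy. */
struct demo_policy {
	unsigned int cur_khz;
};

/* Hypothetical driver that carries its own per-CPU state. */
struct demo_driver {
	const char *name;
	struct demo_policy *cpu_data[DEMO_NR_CPUS];
};

/* The only global pointer; swapping it swaps driver and per-CPU data at once. */
static _Atomic(struct demo_driver *) demo_cur_driver;

/* Writer side; the kernel analogue would be rcu_assign_pointer(). */
static void demo_publish(struct demo_driver *drv)
{
	atomic_store_explicit(&demo_cur_driver, drv, memory_order_release);
}

/*
 * Reader side; the kernel analogue would be rcu_dereference() inside an
 * RCU read-side critical section. Returns 0 and fills *khz on success,
 * -1 if no driver or no policy is available for this CPU.
 */
static int demo_cur_khz(int cpu, unsigned int *khz)
{
	struct demo_driver *drv =
		atomic_load_explicit(&demo_cur_driver, memory_order_acquire);

	if (!drv || cpu < 0 || cpu >= DEMO_NR_CPUS || !drv->cpu_data[cpu])
		return -1;

	*khz = drv->cpu_data[cpu]->cur_khz;
	return 0;
}
```

With two separate globals (driver pointer and per-CPU array) a reader could
observe the new driver next to the old per-CPU state; with one pointer that
combination cannot be seen.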

* Re: [RFC PATCH 01/19] cpufreq: do not expose cpufreq_governor_lock
  2016-01-11 17:35 ` [RFC PATCH 01/19] cpufreq: do not expose cpufreq_governor_lock Juri Lelli
@ 2016-01-12  8:56   ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12  8:56 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> From: Michael Turquette <mturquette@baylibre.com>
> 
> Since commit 3a91b069eabf ("cpufreq: governor: Quit work-handlers early if
> governor is stopped") cpufreq_governor_lock is not used anywhere outside
> cpufreq.c. Make it static again.
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> Signed-off-by: Michael Turquette <mturquette@baylibre.com>
> ---
>  drivers/cpufreq/cpufreq.c          | 2 +-
>  drivers/cpufreq/cpufreq_governor.h | 2 --
>  2 files changed, 1 insertion(+), 3 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

* Re: [RFC PATCH 02/19] cpufreq: merge governor lock and mutex
  2016-01-11 17:35 ` [RFC PATCH 02/19] cpufreq: merge governor lock and mutex Juri Lelli
@ 2016-01-12  9:00   ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12  9:00 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> From: Michael Turquette <mturquette@baylibre.com>
> 
> Commit 95731ebb114c ("cpufreq: Fix governor start/stop race condition")
> introduced cpufreq_governor_lock. This was actually overkill, as the
> same can be achieved by using the existing cpufreq_governor_mutex.
> Removing cpufreq_governor_lock cleans things up and makes deadlocks less
> likely to happen.
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Michael Turquette <mturquette@baylibre.com>
> ---
>  drivers/cpufreq/cpufreq.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)


Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

* Re: [RFC PATCH 03/19] cpufreq: kill for_each_policy
  2016-01-11 17:35 ` [RFC PATCH 03/19] cpufreq: kill for_each_policy Juri Lelli
@ 2016-01-12  9:01   ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12  9:01 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> for_each_policy() macro is not used anywhere. Kill it.
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 3 ---
>  1 file changed, 3 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-11 17:35 ` [RFC PATCH 04/19] cpufreq: bring data structures close to their locks Juri Lelli
  2016-01-11 22:05   ` Peter Zijlstra
  2016-01-11 22:07   ` Peter Zijlstra
@ 2016-01-12  9:10   ` Viresh Kumar
  2 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12  9:10 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> Currently it is not easy to figure out which lock/mutex protects which data
> structure. Clean things up by moving data structures and their locks/mutexes
> closer; also, change comments to document relations further.
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 33 ++++++++++++++++++---------------
>  1 file changed, 18 insertions(+), 15 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-11 22:07   ` Peter Zijlstra
@ 2016-01-12  9:27     ` Viresh Kumar
  2016-01-12 11:21       ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12  9:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Juri Lelli, linux-kernel, linux-pm, rjw, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 11-01-16, 23:07, Peter Zijlstra wrote:
> On Mon, Jan 11, 2016 at 05:35:45PM +0000, Juri Lelli wrote:
> > +/**
> > + * Iterate over governors
> > + *
> > + * cpufreq_governor_list is protected by cpufreq_governor_mutex.
> > + */
> > +static LIST_HEAD(cpufreq_governor_list);
> > +static DEFINE_MUTEX(cpufreq_governor_mutex);
> > +#define for_each_governor(__governor)				\
> > +	list_for_each_entry(__governor, &cpufreq_governor_list, governor_list)
> 
> > So you could stuff the lockdep_assert_held() you later add individually
> > into the for_each_governor macro, impossible to forget that way.

How exactly? I couldn't see how it can be done in a neat and clean
way.

-- 
viresh

* Re: [RFC PATCH 05/19] cpufreq: assert locking when accessing cpufreq_policy_list
  2016-01-11 17:35 ` [RFC PATCH 05/19] cpufreq: assert locking when accessing cpufreq_policy_list Juri Lelli
@ 2016-01-12  9:34   ` Viresh Kumar
  2016-01-12 11:44     ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12  9:34 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> cpufreq_policy_list is guarded by cpufreq_driver_lock. Add appropriate
> locking assertions to check that we always access the list while holding
> the associated lock.
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 00a00cd..63d6efb 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -65,6 +65,7 @@ static bool suitable_policy(struct cpufreq_policy *policy, bool active)
>  static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy,
>  					  bool active)
>  {
> +	lockdep_assert_held(&cpufreq_driver_lock);
>  	do {
>  		policy = list_next_entry(policy, policy_list);
>  
> @@ -80,6 +81,7 @@ static struct cpufreq_policy *first_policy(bool active)
>  {
>  	struct cpufreq_policy *policy;
>  
> +	lockdep_assert_held(&cpufreq_driver_lock);

Because both first_policy() and next_policy() are part of the
for_each_suitable_policy() macro, checking this in first_policy() is
sufficient. next_policy() isn't designed to be used by any other code.

>  	/* No policies in the list */
>  	if (list_empty(&cpufreq_policy_list))
>  		return NULL;
> @@ -2430,6 +2432,7 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
>  	if (ret)
>  		goto err_boost_unreg;
>  
> +	lockdep_assert_held(&cpufreq_driver_lock);

Why do you need cpufreq_driver_lock here? And the above change
should trigger a lockdep warning here, as the lock isn't taken right now.

>  	if (!(cpufreq_driver->flags & CPUFREQ_STICKY) &&
>  	    list_empty(&cpufreq_policy_list)) {
>  		/* if all ->init() calls failed, unregister */
> -- 
> 2.2.2

-- 
viresh

* Re: [RFC PATCH 06/19] cpufreq: always access cpufreq_policy_list while holding cpufreq_driver_lock
  2016-01-11 17:35 ` [RFC PATCH 06/19] cpufreq: always access cpufreq_policy_list while holding cpufreq_driver_lock Juri Lelli
@ 2016-01-12  9:57   ` Viresh Kumar
  2016-01-12 12:08     ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12  9:57 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> Commit highlights paths where we access cpufreq_policy_list without
> holding cpufreq_driver_lock; one example being the following:
> 
> [    8.245779] ------------[ cut here ]------------
> [    8.305977] WARNING: CPU: 2 PID: 1 at kernel/drivers/cpufreq/cpufreq.c:2447 cpufreq_register_driver+0xfd/0x120()
> [    8.438611] Modules linked in:
> [    8.493751] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc4+ #369
> [    8.561039] Hardware name: ARM-Versatile Express
> [    8.622765] [<c0014215>] (unwind_backtrace) from [<c0010e25>] (show_stack+0x11/0x14)
> [    8.629651] atkbd serio0: keyboard reset failed on 1c060000.kmi
> [    8.810905] [<c0010e25>] (show_stack) from [<c02ece7d>] (dump_stack+0x55/0x78)
> [    8.935122] [<c02ece7d>] (dump_stack) from [<c00202cd>] (warn_slowpath_common+0x59/0x84)
> [    9.067097] [<c00202cd>] (warn_slowpath_common) from [<c002030f>] (warn_slowpath_null+0x17/0x1c)
> [    9.204101] [<c002030f>] (warn_slowpath_null) from [<c03ba329>] (cpufreq_register_driver+0xfd/0x120)
> [    9.209603] usb 1-1.2: new high-speed USB device number 3 using isp1760
> [    9.419507] [<c03ba329>] (cpufreq_register_driver) from [<c03bc481>] (bL_cpufreq_register+0x49/0x98)
> [    9.560548] [<c03bc481>] (bL_cpufreq_register) from [<c0342517>] (platform_drv_probe+0x3b/0x6c)
> [    9.573806] usb-storage 1-1.2:1.0: USB Mass Storage device detected
> [    9.575468] scsi host0: usb-storage 1-1.2:1.0
> [    9.855845] [<c0342517>] (platform_drv_probe) from [<c03412e7>] (driver_probe_device+0x153/0x1bc)
> [   10.006137] [<c03412e7>] (driver_probe_device) from [<c03413a7>] (__driver_attach+0x57/0x58)
> [   10.009576] atkbd serio1: keyboard reset failed on 1c070000.kmi
> [   10.237057] [<c03413a7>] (__driver_attach) from [<c0340199>] (bus_for_each_dev+0x2d/0x4c)
> [   10.387824] [<c0340199>] (bus_for_each_dev) from [<c0340bd7>] (bus_add_driver+0xa3/0x14c)
> [   10.539200] [<c0340bd7>] (bus_add_driver) from [<c0341bff>] (driver_register+0x3b/0x88)
> [   10.691023] [<c0341bff>] (driver_register) from [<c0009613>] (do_one_initcall+0x5b/0x150)
> [   10.703809] scsi 0:0:0:0: Direct-Access     General  USB Flash Disk   1.0  PQ: 0 ANSI: 2
> [   10.713081] sd 0:0:0:0: [sda] 7831552 512-byte logical blocks: (4.00 GB/3.73 GiB)
> [   10.713973] sd 0:0:0:0: [sda] Write Protect is off
> [   10.713984] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> [   10.730783] sd 0:0:0:0: [sda] No Caching mode page found
> [   10.730814] sd 0:0:0:0: [sda] Assuming drive cache: write through
> [   10.779815]  sda: sda1 sda2
> [   10.823590] sd 0:0:0:0: [sda] Attached SCSI removable disk
> [   11.581894] [<c0009613>] (do_one_initcall) from [<c0734b45>] (kernel_init_freeable+0x18d/0x22c)
> [   11.720454] [<c0734b45>] (kernel_init_freeable) from [<c04f45f9>] (kernel_init+0xd/0xa4)
> [   11.857340] [<c04f45f9>] (kernel_init) from [<c000dfb9>] (ret_from_fork+0x11/0x38)
> [   11.993082] ---[ end trace 62ff5522fb3f41dd ]---
> 
> Fix this, and others, with proper locking of cpufreq_driver_lock.

Perhaps this should be added prior to the lockdep patch, so that git
bisect doesn't hit lockdep warnings?

> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 63d6efb..98adbc2 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1585,6 +1585,7 @@ EXPORT_SYMBOL(cpufreq_generic_suspend);
>  void cpufreq_suspend(void)
>  {
>  	struct cpufreq_policy *policy;
> +	unsigned long flags;
>  
>  	if (!cpufreq_driver)
>  		return;
> @@ -1594,6 +1595,7 @@ void cpufreq_suspend(void)
>  
>  	pr_debug("%s: Suspending Governors\n", __func__);
>  
> +	read_lock_irqsave(&cpufreq_driver_lock, flags);
>  	for_each_active_policy(policy) {
>  		if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
>  			pr_err("%s: Failed to stop governor for policy: %p\n",
> @@ -1603,6 +1605,7 @@ void cpufreq_suspend(void)
>  			pr_err("%s: Failed to suspend driver: %p\n", __func__,
>  				policy);
>  	}
> +	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
>  
>  suspend:
>  	cpufreq_suspended = true;
> @@ -1617,6 +1620,7 @@ suspend:
>  void cpufreq_resume(void)
>  {
>  	struct cpufreq_policy *policy;
> +	unsigned long flags;
>  
>  	if (!cpufreq_driver)
>  		return;
> @@ -1628,6 +1632,7 @@ void cpufreq_resume(void)
>  
>  	pr_debug("%s: Resuming Governors\n", __func__);
>  
> +	read_lock_irqsave(&cpufreq_driver_lock, flags);
>  	for_each_active_policy(policy) {
>  		if (cpufreq_driver->resume && cpufreq_driver->resume(policy))
>  			pr_err("%s: Failed to resume driver: %p\n", __func__,
> @@ -1637,6 +1642,7 @@ void cpufreq_resume(void)
>  			pr_err("%s: Failed to start governor for policy: %p\n",
>  				__func__, policy);
>  	}
> +	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
>  
>  	/*
>  	 * schedule call cpufreq_update_policy() for first-online CPU, as that
> @@ -2287,7 +2293,9 @@ static int cpufreq_boost_set_sw(int state)
>  	struct cpufreq_frequency_table *freq_table;
>  	struct cpufreq_policy *policy;
>  	int ret = -EINVAL;
> +	unsigned long flags;
>  
> +	read_lock_irqsave(&cpufreq_driver_lock, flags);
>  	for_each_active_policy(policy) {
>  		freq_table = cpufreq_frequency_get_table(policy->cpu);
>  		if (freq_table) {
> @@ -2302,6 +2310,7 @@ static int cpufreq_boost_set_sw(int state)
>  			__cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
>  		}
>  	}
> +	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
>  
>  	return ret;
>  }

For the above three, I am not sure whether there can be side effects.
Can you please push a branch somewhere, to be tested by Fengguang's
build bot, so that we know of any new lockdep warnings due to this? All
of the above routines directly/indirectly call governor-specific routines,
and that leads to a frequency update in a few cases. AFAIR, there were
some issues with locking here.

> @@ -2432,14 +2441,16 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
>  	if (ret)
>  		goto err_boost_unreg;
>  
> -	lockdep_assert_held(&cpufreq_driver_lock);
> +	read_lock_irqsave(&cpufreq_driver_lock, flags);
>  	if (!(cpufreq_driver->flags & CPUFREQ_STICKY) &&
>  	    list_empty(&cpufreq_policy_list)) {
>  		/* if all ->init() calls failed, unregister */
>  		pr_debug("%s: No CPU initialized for driver %s\n", __func__,
>  			 driver_data->name);
> +		read_unlock_irqrestore(&cpufreq_driver_lock, flags);
>  		goto err_if_unreg;
>  	}
> +	read_unlock_irqrestore(&cpufreq_driver_lock, flags);

We have just registered the cpufreq driver, there is no other path
that can simultaneously update the list here.

And so we don't need the lock here.

-- 
viresh

* Re: [RFC PATCH 07/19] cpufreq: assert locking when accessing cpufreq_governor_list
  2016-01-11 17:35 ` [RFC PATCH 07/19] cpufreq: assert locking when accessing cpufreq_governor_list Juri Lelli
@ 2016-01-12 10:01   ` Viresh Kumar
  2016-01-12 15:33     ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 10:01 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> @@ -2025,6 +2027,7 @@ int cpufreq_register_governor(struct cpufreq_governor *governor)
>  	err = -EBUSY;
>  	if (!find_governor(governor->name)) {
>  		err = 0;
> +		lockdep_assert_held(&cpufreq_governor_mutex);
>  		list_add(&governor->governor_list, &cpufreq_governor_list);
>  	}

Why here? This is what the routine looks like:

int cpufreq_register_governor(struct cpufreq_governor *governor)
{
	int err;

	if (!governor)
		return -EINVAL;

	if (cpufreq_disabled())
		return -ENODEV;

	mutex_lock(&cpufreq_governor_mutex);

	governor->initialized = 0;
	err = -EBUSY;
	if (!find_governor(governor->name)) {
		err = 0;
		list_add(&governor->governor_list, &cpufreq_governor_list);
	}

	mutex_unlock(&cpufreq_governor_mutex);
	return err;
}


-- 
viresh

* Re: [RFC PATCH 08/19] cpufreq: fix warning for cpufreq_init_policy unlocked access to cpufreq_governor_list
  2016-01-11 17:35 ` [RFC PATCH 08/19] cpufreq: fix warning for cpufreq_init_policy unlocked access to cpufreq_governor_list Juri Lelli
@ 2016-01-12 10:09   ` Viresh Kumar
  2016-01-12 15:52     ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 10:09 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 7dae7f3..d065435 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -969,6 +969,7 @@ static int cpufreq_init_policy(struct cpufreq_policy *policy)
>  
>  	memcpy(&new_policy, policy, sizeof(*policy));
>  
> +	mutex_lock(&cpufreq_governor_mutex);
>  	/* Update governor of new_policy to the governor used before hotplug */
>  	gov = find_governor(policy->last_governor);

You should take the lock within find_governor() instead, i.e.  around
the while loop.
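
Roughly what I have in mind, as a user-space sketch only: a pthread mutex
stands in for cpufreq_governor_mutex and all demo_* names are hypothetical.
One assumption to flag: callers that already hold the mutex (the register path
does) cannot call a self-locking lookup without deadlocking, so the sketch
keeps an unlocked inner helper for them.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct cpufreq_governor. */
struct demo_governor {
	const char *name;
	struct demo_governor *next;
};

static struct demo_governor *demo_governor_list;
static pthread_mutex_t demo_governor_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Inner lookup; the caller must already hold demo_governor_mutex. */
static struct demo_governor *demo_find_governor_locked(const char *name)
{
	struct demo_governor *g;

	for (g = demo_governor_list; g; g = g->next)
		if (!strcmp(g->name, name))
			return g;
	return NULL;
}

/* Lookup for callers that do not hold the mutex: lock around the walk only. */
static struct demo_governor *demo_find_governor(const char *name)
{
	struct demo_governor *g;

	pthread_mutex_lock(&demo_governor_mutex);
	g = demo_find_governor_locked(name);
	pthread_mutex_unlock(&demo_governor_mutex);
	return g;
}

/* Register path: check and insert under one lock hold, so no duplicate slips in. */
static int demo_register_governor(struct demo_governor *gov)
{
	int err = -1;

	pthread_mutex_lock(&demo_governor_mutex);
	if (!demo_find_governor_locked(gov->name)) {
		gov->next = demo_governor_list;
		demo_governor_list = gov;
		err = 0;
	}
	pthread_mutex_unlock(&demo_governor_mutex);
	return err;
}
```

Note that the pointer returned by demo_find_governor() is used after the mutex
is dropped, so this only protects the list walk, not the lifetime of the
governor itself.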

>  	if (gov)
> @@ -976,6 +977,7 @@ static int cpufreq_init_policy(struct cpufreq_policy *policy)
>  				policy->governor->name, policy->cpu);
>  	else
>  		gov = CPUFREQ_DEFAULT_GOVERNOR;
> +	mutex_unlock(&cpufreq_governor_mutex);
>  
>  	new_policy.governor = gov;
>  
> -- 
> 2.2.2

-- 
viresh

* Re: [RFC PATCH 09/19] cpufreq: fix warning for show_scaling_available_governors unlocked access to cpufreq_governor_list
  2016-01-11 17:35 ` [RFC PATCH 09/19] cpufreq: fix warning for show_scaling_available_governors " Juri Lelli
@ 2016-01-12 10:13   ` Viresh Kumar
  2016-01-13 10:25     ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 10:13 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> show_scaling_available_governors iterates through cpufreq_governor_list
> without holding cpufreq_governor_mutex; this generates the following
> warning:
> 
> [  700.910381] ------------[ cut here ]------------
> [  700.924282] WARNING: CPU: 2 PID: 1756 at kernel/drivers/cpufreq/cpufreq.c:700 show_scaling_available_governors+0x6f/0xb8()
> [  700.965473] Modules linked in:
> [  700.974637] CPU: 2 PID: 1756 Comm: cat Tainted: G        W       4.4.0-rc2+ #299
> [  700.996813] Hardware name: ARM-Versatile Express
> [  701.010674] [<c0014215>] (unwind_backtrace) from [<c0010e25>] (show_stack+0x11/0x14)
> [  701.033905] [<c0010e25>] (show_stack) from [<c02eca5d>] (dump_stack+0x55/0x78)
> [  701.055561] [<c02eca5d>] (dump_stack) from [<c00202cd>] (warn_slowpath_common+0x59/0x84)
> [  701.079839] [<c00202cd>] (warn_slowpath_common) from [<c002030f>] (warn_slowpath_null+0x17/0x1c)
> [  701.106182] [<c002030f>] (warn_slowpath_null) from [<c03b7bef>] (show_scaling_available_governors+0x6f/0xb8)
> [  701.135656] [<c03b7bef>] (show_scaling_available_governors) from [<c03b7dc3>] (show+0x27/0x38)
> [  701.161488] [<c03b7dc3>] (show) from [<c015469f>] (sysfs_kf_seq_show+0x5f/0xa0)
> [  701.183409] [<c015469f>] (sysfs_kf_seq_show) from [<c01536a7>] (kernfs_seq_show+0x1b/0x1c)
> [  701.208188] [<c01536a7>] (kernfs_seq_show) from [<c011a6d5>] (seq_read+0x129/0x33c)
> [  701.231161] [<c011a6d5>] (seq_read) from [<c00ff7c7>] (__vfs_read+0x1b/0x84)
> [  701.252300] [<c00ff7c7>] (__vfs_read) from [<c010000f>] (vfs_read+0x5f/0xb0)
> [  701.273436] [<c010000f>] (vfs_read) from [<c0100099>] (SyS_read+0x39/0x68)
> [  701.294049] [<c0100099>] (SyS_read) from [<c000df21>] (ret_fast_syscall+0x1/0x1a)
> [  701.316484] ---[ end trace 5dd15744a4da127c ]---

FWIW, I would suggest using cpufreq-dt for Juno instead of
arm_bL. I asked Sudeep to do it earlier, but perhaps he was busy.

> Fix this by locking cpufreq_governor_mutex before for_each_governor().
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index d065435..d91fdb8 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -694,7 +694,7 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
>  		goto out;
>  	}
>  
> -	lockdep_assert_held(&cpufreq_governor_mutex);
> +	mutex_lock(&cpufreq_governor_mutex);
>  	for_each_governor(t) {
>  		if (i >= (ssize_t) ((PAGE_SIZE / sizeof(char))
>  		    - (CPUFREQ_NAME_LEN + 2)))
> @@ -702,6 +702,7 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
>  		i += scnprintf(&buf[i], CPUFREQ_NAME_PLEN, "%s ", t->name);
>  	}
>  out:
> +	mutex_unlock(&cpufreq_governor_mutex);
>  	i += sprintf(&buf[i], "\n");
>  	return i;
>  }

Just move this patch before the patch that added the
lockdep-assert and we wouldn't be required to add the
lockdep_assert_held() in the first place.

-- 
viresh

* Re: [RFC PATCH 10/19] cpufreq: assert policy->rwsem is held in cpufreq_set_policy
  2016-01-11 17:35 ` [RFC PATCH 10/19] cpufreq: assert policy->rwsem is held in cpufreq_set_policy Juri Lelli
@ 2016-01-12 10:15   ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 10:15 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> Since cpufreq_set_policy is modifying policy, it has to work under policy->
> rwsem protection.
> 
> Assert such condition.
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 2 ++
>  1 file changed, 2 insertions(+)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-01-11 17:35 ` [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor Juri Lelli
@ 2016-01-12 10:20   ` Viresh Kumar
  2016-01-30  0:33     ` Saravana Kannan
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 10:20 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> __cpufreq_governor works on policy, so policy->rwsem has to be held.
> Add assertion for such condition.
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index f1f9fbc..e7fc5c9 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1950,6 +1950,9 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
>  	/* Don't start any governor operations if we are entering suspend */
>  	if (cpufreq_suspended)
>  		return 0;
> +
> +	lockdep_assert_held(&policy->rwsem);
> +

We had an ABBA problem with the EXIT governor callback, so set_policy()
drops this rwsem just before making that call.

commit 955ef4833574 ("cpufreq: Drop rwsem lock around
CPUFREQ_GOV_POLICY_EXIT")

-- 
viresh

* Re: [RFC PATCH 12/19] cpufreq: fix locking of policy->rwsem in cpufreq_init_policy
  2016-01-11 17:35 ` [RFC PATCH 12/19] cpufreq: fix locking of policy->rwsem in cpufreq_init_policy Juri Lelli
@ 2016-01-12 10:39   ` Viresh Kumar
  2016-01-14 17:58     ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 10:39 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> There are paths in cpufreq_init_policy where policy is used, but its rwsem
> is not held.
> 
> Fix it.
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index e7fc5c9..2c7cc6c73 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -998,21 +998,24 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, unsigned int cp
>  {
>  	int ret = 0;
>  
> +	down_write(&policy->rwsem);
> +
>  	/* Has this CPU been taken care of already? */
> -	if (cpumask_test_cpu(cpu, policy->cpus))
> +	if (cpumask_test_cpu(cpu, policy->cpus)) {
> +		up_write(&policy->rwsem);

Perhaps create a label at the end to unlock the rwsem and jump to it?
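
The single-exit shape suggested here can be sketched in plain userspace C.
This is only an illustration, not the patch: struct policy_sketch, struct
lock_sketch and every *_sketch helper are hypothetical stand-ins, and a
counter plays the role of policy->rwsem so the example builds outside the
kernel.

```c
/* Hypothetical stand-in for policy->rwsem: counts write holders. */
struct lock_sketch {
	int write_held;
};

/* Hypothetical stand-in for struct cpufreq_policy. */
struct policy_sketch {
	struct lock_sketch rwsem;
	unsigned long cpus;	/* bitmask standing in for policy->cpus */
};

static void down_write_sketch(struct lock_sketch *l) { l->write_held++; }
static void up_write_sketch(struct lock_sketch *l) { l->write_held--; }

/*
 * Add @cpu to the policy. Every exit path funnels through the single
 * "unlock" label, so the lock is released exactly once.
 */
static int add_policy_cpu_sketch(struct policy_sketch *policy, unsigned int cpu)
{
	int ret = 0;

	down_write_sketch(&policy->rwsem);

	/* Reject CPUs that do not fit in the bitmask (sketch-only check). */
	if (cpu >= 8 * sizeof(policy->cpus)) {
		ret = -1;
		goto unlock;
	}

	/* Has this CPU been taken care of already? */
	if (policy->cpus & (1UL << cpu))
		goto unlock;

	policy->cpus |= 1UL << cpu;

unlock:
	up_write_sketch(&policy->rwsem);
	return ret;
}
```

With one label, adding another early return later cannot leak the lock as
long as it jumps to "unlock" instead of returning directly.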

-- 
viresh

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-12  8:27       ` Peter Zijlstra
@ 2016-01-12 10:43         ` Juri Lelli
  2016-01-12 16:47         ` Rafael J. Wysocki
  1 sibling, 0 replies; 110+ messages in thread
From: Juri Lelli @ 2016-01-12 10:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rafael J. Wysocki, linux-kernel, linux-pm, viresh.kumar,
	mturquette, steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

Hi,

On 12/01/16 09:27, Peter Zijlstra wrote:
> On Tue, Jan 12, 2016 at 12:03:39AM +0100, Rafael J. Wysocki wrote:
> > On Monday, January 11, 2016 11:05:28 PM Peter Zijlstra wrote:
> > > On Mon, Jan 11, 2016 at 05:35:45PM +0000, Juri Lelli wrote:
> > > > +/**
> > > > + * The "cpufreq driver" - the arch- or hardware-dependent low
> > > > + * level driver of CPUFreq support, and its spinlock (cpufreq_driver_lock).
> > > > + * This lock also protects cpufreq_cpu_data array and cpufreq_policy_list.
> > > > + */
> > > > +static struct cpufreq_driver *cpufreq_driver;
> > > > +static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
> > > >  static LIST_HEAD(cpufreq_policy_list);
> > > > +static DEFINE_RWLOCK(cpufreq_driver_lock);
> > > 
> > > Part of my suggestion was to fold the per-cpu data of cpufreq_cpu_data
> > > into struct cpufreq_driver.
> > > 
> > > That way each cpufreq_driver will have its own copy and there'd be only
> > > the one global pointer to swizzle. Something very well suited to RCU.
> > 
> > Well, I'm not really sure reworking all that is necessary.
> > 
> > What we need is to be able to call something analogous to dbs_timer_handler()
> > from the scheduler and a driver callback from there (if present).  For that,
> > it should be sufficient to have a pointer to that callback (that may be set
> > upon driver registration) protected by RCU (or should that be sched RCU
> > rather?) if I'm not missing anything.
> 
> But such a callback will invariably want to use the per-cpu state. And
> now you have two pointers, one for the driver and one for the per-cpu
> state. Keeping that in sync is a pain.
> 
> Moving the per-cpu data into the driver solves that trivially.
> 

Oh, I think I now finally get your suggestion completely (I hope :-));
and it makes sense to me. On top of this series I have patches that
implement RCU logic. What I tried to do is to protect all cpufreq.c
stuff with a single mutex (plus RCU logic) and single policies with
another mutex (plus RCU logic). What you are saying should make things
easier to get right. I have to go back and try that.
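
To make the one-pointer idea concrete, here is a userspace sketch with C11
atomics standing in for RCU publication. Everything named *_sketch is
hypothetical, and there is no grace period or reclamation here (in the
kernel that would be the job of rcu_assign_pointer(), rcu_dereference() and
friends). It only shows why a reader that loads a single pointer gets a
driver and per-cpu state that belong together.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4

/* Hypothetical per-cpu state, folded into the driver object. */
struct cpu_state_sketch {
	unsigned int cur_freq;
};

/* Hypothetical driver: callbacks would live here next to the state. */
struct driver_sketch {
	const char *name;
	struct cpu_state_sketch cpu[NR_CPUS_SKETCH];
};

/* The one global pointer to swizzle. */
static _Atomic(struct driver_sketch *) active_driver_sketch;

/* Writer side: publish a fully initialised driver in one store. */
static void publish_driver_sketch(struct driver_sketch *drv)
{
	atomic_store_explicit(&active_driver_sketch, drv, memory_order_release);
}

/*
 * Reader side: a single load yields both the driver and its per-cpu
 * state, so the two cannot get out of sync. Returns 0 when no driver
 * is published or @cpu is out of range.
 */
static unsigned int read_cur_freq_sketch(unsigned int cpu)
{
	struct driver_sketch *drv =
		atomic_load_explicit(&active_driver_sketch, memory_order_acquire);

	if (!drv || cpu >= NR_CPUS_SKETCH)
		return 0;

	return drv->cpu[cpu].cur_freq;
}
```

With two separate pointers (driver and per-cpu array) a reader could observe
the new driver together with the old state; with one pointer that window
does not exist.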

My idea was to try to build some confidence that this first set is right
and then post the second part implementing RCU logic. 

Thanks,

- Juri

* Re: [RFC PATCH 00/19] cpufreq locking cleanups and documentation
  2016-01-11 22:45 ` [RFC PATCH 00/19] cpufreq locking cleanups and documentation Rafael J. Wysocki
@ 2016-01-12 10:46   ` Juri Lelli
  0 siblings, 0 replies; 110+ messages in thread
From: Juri Lelli @ 2016-01-12 10:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-kernel, linux-pm, peterz, viresh.kumar, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

Hi Rafael,

On 11/01/16 23:45, Rafael J. Wysocki wrote:
> On Monday, January 11, 2016 05:35:41 PM Juri Lelli wrote:
> > Hi all,
> > 
> > In the context of the ongoing discussion about introducing a simple platform
> > energy model to guide scheduling decisions (Energy Aware Scheduling [1])
> > concerns have been expressed by Peter about the component in charge of driving
> > clock frequency selection (Steve recently posted an update of such component
> > [2]): https://lkml.org/lkml/2015/8/15/141.
> > 
> > The problem is that, with this new approach, cpufreq core functions need to be
> > accessed from scheduler hot-paths and the overhead associated with the current
> > locking scheme might result to be unsustainable. 
> > 
> > Peter's proposed approach of using RCU logic to reduce locking overhead seems
> > reasonable, but things may not be so straightforward as originally thought. The
> > very first thing I actually realized when I started looking into this is that
> > it was hard for me to understand which locking mechanism was protecting which
> > data structure. As mostly a way to build a better understanding of the current
> > cpufreq locking scheme and also as preparatory work for implementing RCU logic,
> > I came up with this set of patches. In fact, at this stage, I would like each
> > patch to be considered as a question I'm asking rather than a proposed change,
> > thus the RFC tag for the series; with the intent of documenting current locking
> > scheme and modifying it a bit in order to make RCU logic implementation easier.
> > Actually, as you'll soon notice, I didn't really start from scratch. Mike
> > shared with me some patches he has been developing while looking at the same
> > problem. I've given Mike attribution for the patches that I took unchanged from
> > him, with thanks for sharing his findings with me.
> > 
> > High level description of patches:
> > 
> >  o [01-04] cleanup and move code around to make things (hopefully) cleaner
> >  o [05-14] insert lockdep assertions and fix uncovered erroneous situations
> >  o [15-18] remove overkill usage of locking mechanism
> >  o 19      adds documentation for the cleaned up locking scheme
> > 
> > With Viresh' tests [3] on both arm TC2 and arm64 Juno boards I'm not seeing
> > anything bad happening. However, coverage is really small (as is my personal
> > confidence of not breaking things for other confs :-)).
> > 
> > This set is based on top of linux-pm/linux-next as of today and it is also
> > available from here:
> 
> Due to the merge window in progress I have more urgent things to do than
> looking at this material right now.  Sorry about that.
> 

No problem, I understand that.

> I may be able to look at it towards the end of the week.
> 

Great! Viresh is already doing his review, so I have things to work on
until you get time to have a look. Looking forward to receiving your
comments as well.

Best,

- Juri

* Re: [RFC PATCH 13/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare
  2016-01-11 17:35 ` [RFC PATCH 13/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare Juri Lelli
@ 2016-01-12 10:54   ` Viresh Kumar
  2016-01-15 12:37     ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 10:54 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> There are paths in cpufreq_offline_prepare where policy is used, but its
> rwsem is not held.
> 
> Fix it.
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

I know the locking in general in cpufreq core is poor. We recently
fixed lots of issues in governors ..

> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 2c7cc6c73..91158b0 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1332,13 +1332,13 @@ static void cpufreq_offline_prepare(unsigned int cpu)
>  		return;
>  	}
>  
> +	down_write(&policy->rwsem);
>  	if (has_target()) {
>  		int ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
>  		if (ret)
>  			pr_err("%s: Failed to stop governor\n", __func__);
>  	}
>  
> -	down_write(&policy->rwsem);
>  	cpumask_clear_cpu(cpu, policy->cpus);
>  
>  	if (policy_is_inactive(policy)) {
> @@ -1356,12 +1356,16 @@ static void cpufreq_offline_prepare(unsigned int cpu)
>  	/* Start governor again for active policy */
>  	if (!policy_is_inactive(policy)) {

Why shouldn't this be under the lock?

>  		if (has_target()) {
> -			int ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
> +			int ret;
> +
> +			down_write(&policy->rwsem);
> +			ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
>  			if (!ret)
>  				ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
>  
>  			if (ret)
>  				pr_err("%s: Failed to start governor\n", __func__);
> +			up_write(&policy->rwsem);
>  		}
>  	} else if (cpufreq_driver->stop_cpu) {
>  		cpufreq_driver->stop_cpu(policy);

And this ?

-- 
viresh

* Re: [RFC PATCH 14/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_finish
  2016-01-11 17:35 ` [RFC PATCH 14/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_finish Juri Lelli
@ 2016-01-12 11:02   ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 11:02 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> There are paths in cpufreq_offline_finish where policy is used, but its
> rwsem is not held.
> 
> Fix it.
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 91158b0..ba452c3 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1381,9 +1381,13 @@ static void cpufreq_offline_finish(unsigned int cpu)
>  		return;
>  	}
>  
> +	down_write(&policy->rwsem);
> +
>  	/* Only proceed for inactive policies */
> -	if (!policy_is_inactive(policy))
> +	if (!policy_is_inactive(policy)) {
> +		up_write(&policy->rwsem);
>  		return;
> +	}
>  
>  	/* If cpu is last user of policy, free policy */
>  	if (has_target()) {
> @@ -1392,6 +1396,8 @@ static void cpufreq_offline_finish(unsigned int cpu)
>  			pr_err("%s: Failed to exit governor\n", __func__);
>  	}
>  
> +	up_write(&policy->rwsem);
> +
>  	/*
>  	 * Perform the ->exit() even during light-weight tear-down,
>  	 * since this is a core component, and is essential for the

I think we need to nail down the purpose of the lock first and discuss
the races we are trying to fix. For example, policy is used by all
cpufreq drivers, etc., and no one is stopping them from using it without
taking the lock.

FWIW, I have also tried to do some cleanups earlier, but was never
able to send them upstream due to a busy schedule.

ssh://git@git.linaro.org/people/viresh.kumar/linux.git
cpufreq/core/locking

You might find some interesting bits there.

-- 
viresh

* Re: [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor
  2016-01-11 17:35 ` [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor Juri Lelli
@ 2016-01-12 11:06   ` Viresh Kumar
  2016-01-15 16:30     ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 11:06 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> Commit 6f1e4efd882e ("cpufreq: Fix timer/workqueue corruption by
> protecting reading governor_enabled") made policy->governor_enabled
> guarded by cpufreq_governor_mutex in __cpufreq_governor. Now that
> holding of policy->rwsem is asserted in __cpufreq_governor,
> cpufreq_governor_mutex is overkill.

I am sure that is going to break it. Try that on x86; somehow I don't
hit it on my Exynos boards.

> -	mutex_lock(&cpufreq_governor_mutex);
>  	if ((policy->governor_enabled && event == CPUFREQ_GOV_START)
>  	    || (!policy->governor_enabled
>  	    && (event == CPUFREQ_GOV_LIMITS || event == CPUFREQ_GOV_STOP))) {
> -		mutex_unlock(&cpufreq_governor_mutex);
>  		return -EBUSY;
>  	}

Actually the above checks should also be removed as the governors are
responsible for maintaining their state machines. But
userspace/powersave/performance don't have that support yet and so
these checks save them from going into undefined states.

On top of that, the checks above and below are incomplete.
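
For reference, the guard being discussed boils down to a two-state check.
Below is a standalone sketch of just that condition. The enum and function
names are hypothetical, and only the START/STOP/LIMITS events from the
quoted hunk are modelled, so it says nothing about which cases the real
checks miss.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the CPUFREQ_GOV_* events used in the check. */
enum gov_event_sketch {
	GOV_START_SKETCH,
	GOV_STOP_SKETCH,
	GOV_LIMITS_SKETCH,
};

/*
 * Mirror of the quoted condition: refuse START on an already enabled
 * governor, and refuse LIMITS or STOP on a disabled one.
 */
static int gov_event_check_sketch(bool governor_enabled,
				  enum gov_event_sketch event)
{
	if (governor_enabled && event == GOV_START_SKETCH)
		return -EBUSY;

	if (!governor_enabled &&
	    (event == GOV_LIMITS_SKETCH || event == GOV_STOP_SKETCH))
		return -EBUSY;

	return 0;
}
```

Since the result depends on governor_enabled, whatever lock serialises
updates to that flag also has to cover this test.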

> @@ -2006,8 +2004,6 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
>  	else if (event == CPUFREQ_GOV_START)
>  		policy->governor_enabled = true;
>  
> -	mutex_unlock(&cpufreq_governor_mutex);
> -
>  	ret = policy->governor->governor(policy, event);
>  
>  	if (!ret) {
> @@ -2017,12 +2013,10 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
>  			policy->governor->initialized--;
>  	} else {
>  		/* Restore original values */
> -		mutex_lock(&cpufreq_governor_mutex);
>  		if (event == CPUFREQ_GOV_STOP)
>  			policy->governor_enabled = true;
>  		else if (event == CPUFREQ_GOV_START)
>  			policy->governor_enabled = false;
> -		mutex_unlock(&cpufreq_governor_mutex);
>  	}
>  
>  	if (((event == CPUFREQ_GOV_POLICY_INIT) && ret) ||
> -- 
> 2.2.2

-- 
viresh

* Re: [RFC PATCH 16/19] cpufreq: hold policy->rwsem across CPUFREQ_GOV_POLICY_EXIT
  2016-01-11 17:35 ` [RFC PATCH 16/19] cpufreq: hold policy->rwsem across CPUFREQ_GOV_POLICY_EXIT Juri Lelli
@ 2016-01-12 11:09   ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 11:09 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> There are no good reasons why policy->rwsem cannot be held across calls to
> __cpufreq_governor with CPUFREQ_GOV_POLICY_EXIT event.
> 
> Remove {up,down}_write across such call sites. This also verifies the
> assertion that policy->rwsem is always held when calling into __cpufreq_governor.

Test on x86 with prove_locking etc. enabled, and try running the tests
from my cpufreq-test repo. There were real concerns. On top of that, I
have identified the issue completely, as to why the ABBA dependency I
mentioned earlier is there. You can find that in the branch I shared
with you earlier.

commit 57714d5b1778 ("cpufreq: Access governor's sysfs attributes without 'policy->rwsem'")

-- 
viresh

* Re: [RFC PATCH 17/19] cpufreq: stop checking for cpufreq_driver being present in cpufreq_cpu_get
  2016-01-11 17:35 ` [RFC PATCH 17/19] cpufreq: stop checking for cpufreq_driver being present in cpufreq_cpu_get Juri Lelli
@ 2016-01-12 11:17   ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 11:17 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> After cpufreq_driver_lock() is acquired in cpufreq_cpu_get() we are sure
> we can't race with cpufreq_policy_free() (which is in the path that ends
> up removing cpufreq_driver). We can thus safely remove check for
> cpufreq_driver being present (which is a leftover from commit
> 6eed9404ab3c ("cpufreq: Use rwsem for protecting critical sections")).
> 
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> ---
>  drivers/cpufreq/cpufreq.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 797bfae..6c9bef7 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -282,15 +282,15 @@ struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu)
>  	if (WARN_ON(cpu >= nr_cpu_ids))
>  		return NULL;
>  
> -	/* get the cpufreq driver */
>  	read_lock_irqsave(&cpufreq_driver_lock, flags);
>  
> -	if (cpufreq_driver) {
> -		/* get the CPU */
> -		policy = cpufreq_cpu_get_raw(cpu);
> -		if (policy)
> -			kobject_get(&policy->kobj);
> -	}
> +	/*
> +	 * If we get a policy, cpufreq_policy_free() didn't
> +	 * yet run.
> +	 */
> +	policy = cpufreq_cpu_get_raw(cpu);
> +	if (policy)
> +		kobject_get(&policy->kobj);
>  
>  	read_unlock_irqrestore(&cpufreq_driver_lock, flags);

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-12  9:27     ` Viresh Kumar
@ 2016-01-12 11:21       ` Juri Lelli
  2016-01-12 11:58         ` Peter Zijlstra
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-12 11:21 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Peter Zijlstra, linux-kernel, linux-pm, rjw, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

Hi,

On 12/01/16 14:57, Viresh Kumar wrote:
> On 11-01-16, 23:07, Peter Zijlstra wrote:
> > On Mon, Jan 11, 2016 at 05:35:45PM +0000, Juri Lelli wrote:
> > > +/**
> > > + * Iterate over governors
> > > + *
> > > + * cpufreq_governor_list is protected by cpufreq_governor_mutex.
> > > + */
> > > +static LIST_HEAD(cpufreq_governor_list);
> > > +static DEFINE_MUTEX(cpufreq_governor_mutex);
> > > +#define for_each_governor(__governor)				\
> > > +	list_for_each_entry(__governor, &cpufreq_governor_list, governor_list)
> > 
> > > So you could stuff the lockdep_assert_held() you later add individually
> > > into the for_each_governor macro, impossible to forget that way.
> 
> How exactly? I couldn't see how it can be done in a neat and clean
> way.
> 

I tried to see if something like for_each_domain() can be done, but here
we use list_for_each_entry() macro. Peter, do you mean something like
the following?

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 78b1e2f..1a847a6 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -39,6 +39,7 @@
 static LIST_HEAD(cpufreq_governor_list);
 static DEFINE_MUTEX(cpufreq_governor_mutex);
 #define for_each_governor(__governor)				\
+	lockdep_assert_held(&cpufreq_governor_mutex);		\
 	list_for_each_entry(__governor, &cpufreq_governor_list, governor_list)
 
 /**
@@ -508,7 +509,6 @@ static struct cpufreq_governor *find_governor(const char *str_governor)
 {
 	struct cpufreq_governor *t;
 
-	lockdep_assert_held(&cpufreq_governor_mutex);
 	for_each_governor(t)
 		if (!strncasecmp(str_governor, t->name, CPUFREQ_NAME_LEN))
 			return t;

Since for_each_governor() is not used in if conditions that should be
fine?

Thanks,

- Juri

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-11 17:35 ` [RFC PATCH 18/19] cpufreq: remove transition_lock Juri Lelli
@ 2016-01-12 11:24   ` Viresh Kumar
  2016-01-13  0:54     ` Michael Turquette
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-12 11:24 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 11-01-16, 17:35, Juri Lelli wrote:
> From: Michael Turquette <mturquette@baylibre.com>
> 
> transition_lock was introduced to serialize cpufreq transition
> notifiers. Instead of using a different lock for protecting concurrent
> modifications of policy, it is better to require that callers of
> transition notifiers implement appropriate locking (this is already the
> case AFAICS). Removing transition_lock also simplifies current locking
> scheme.

So, are you saying that the reasoning mentioned in this patch is all
wrong?

commit 12478cf0c55e ("cpufreq: Make sure frequency transitions are
serialized")

-- 
viresh

* Re: [RFC PATCH 05/19] cpufreq: assert locking when accessing cpufreq_policy_list
  2016-01-12  9:34   ` Viresh Kumar
@ 2016-01-12 11:44     ` Juri Lelli
  2016-01-13  5:59       ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-12 11:44 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

Hi,

On 12/01/16 15:04, Viresh Kumar wrote:
> On 11-01-16, 17:35, Juri Lelli wrote:
> > cpufreq_policy_list is guarded by cpufreq_driver_lock. Add appropriate
> > locking assertions to check that we always access the list while holding
> > the associated lock.
> > 
> > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> > ---
> >  drivers/cpufreq/cpufreq.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 00a00cd..63d6efb 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -65,6 +65,7 @@ static bool suitable_policy(struct cpufreq_policy *policy, bool active)
> >  static struct cpufreq_policy *next_policy(struct cpufreq_policy *policy,
> >  					  bool active)
> >  {
> > +	lockdep_assert_held(&cpufreq_driver_lock);
> >  	do {
> >  		policy = list_next_entry(policy, policy_list);
> >  
> > @@ -80,6 +81,7 @@ static struct cpufreq_policy *first_policy(bool active)
> >  {
> >  	struct cpufreq_policy *policy;
> >  
> > +	lockdep_assert_held(&cpufreq_driver_lock);
> 
> Because both first_policy() and next_policy() are parts of
> for_each_suitable_policy() macro, checking this in first_policy() is
> sufficient. next_policy() isn't designed to be used by any other code.
> 

But next_policy is called multiple times as part of
for_each_suitable_policy().  What if someone thinks she/he can release
cpufreq_driver_lock inside for_each_(in)active_policy() loop? Not that
it makes sense, but don't you think it could happen?

> >  	/* No policies in the list */
> >  	if (list_empty(&cpufreq_policy_list))
> >  		return NULL;
> > @@ -2430,6 +2432,7 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
> >  	if (ret)
> >  		goto err_boost_unreg;
> >  
> > +	lockdep_assert_held(&cpufreq_driver_lock);
> 
> Why do you need a cpufreq_driver_lock here? And the above change
> should generate a lockdep here as the lock isn't taken right now.
> 

Because you are checking cpufreq_policy_list to see if it's empty. And
it generates a lockdep warning, yes; fixed by next patch. Maybe putting
fixes before warnings, as you are suggesting, is better.

Thanks,

- Juri

> >  	if (!(cpufreq_driver->flags & CPUFREQ_STICKY) &&
> >  	    list_empty(&cpufreq_policy_list)) {
> >  		/* if all ->init() calls failed, unregister */
> > -- 
> > 2.2.2
> 
> -- 
> viresh
> 

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-12 11:21       ` Juri Lelli
@ 2016-01-12 11:58         ` Peter Zijlstra
  2016-01-12 12:36           ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Peter Zijlstra @ 2016-01-12 11:58 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Viresh Kumar, linux-kernel, linux-pm, rjw, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Tue, Jan 12, 2016 at 11:21:25AM +0000, Juri Lelli wrote:
> I tried to see if something like for_each_domain() can be done, but here
> we use list_for_each_entry() macro. Peter, do you mean something like
> the following?
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 78b1e2f..1a847a6 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -39,6 +39,7 @@
>  static LIST_HEAD(cpufreq_governor_list);
>  static DEFINE_MUTEX(cpufreq_governor_mutex);
>  #define for_each_governor(__governor)				\
> +	lockdep_assert_held(&cpufreq_governor_mutex);		\
>  	list_for_each_entry(__governor, &cpufreq_governor_list, governor_list)

That fails for things like:

	if (blah)
		for_each_governor(...) {
		}

which looks like valid C -- even though our Coding Style says the if
should have { } on.

I was thinking of either open coding the for statement and adding it to
the first statement like:

	#define for_each_governor(_g) \
		for (_g = list_first_entry(&cpufreq_governor_list, typeof(*_g), governor_list), lockdep_assert_held(), \
		     ..... )

Or use something like this:

  lkml.kernel.org/r/20150422154212.GE3007@worktop.Skamania.guest

	#define for_each_governor(_g) \
		list_for_each_entry(_g, &cpufreq_governor_list, governor_list) \
			if (lockdep_assert_held(..), false) \
				; \
			else

Which should preserve C syntax rules.
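
A small userspace sketch of why the two-statement macro misbehaves under an
unbraced if while the if/else form does not. Nothing here is kernel code:
check_lock_held_sketch() is a hypothetical stand-in for the lock assertion
(it returns a value so it can sit in a comma expression), and the loop walks
a plain array rather than a list.

```c
#include <stdbool.h>
#include <stddef.h>

static int assert_calls_sketch;	/* how often the stand-in check ran */

/* Hypothetical stand-in for a lock assertion usable in an expression. */
static bool check_lock_held_sketch(void)
{
	assert_calls_sketch++;
	return true;
}

static const int items_sketch[] = { 1, 2, 3 };
#define N_ITEMS_SKETCH (sizeof(items_sketch) / sizeof(items_sketch[0]))

/* Two statements: a preceding unbraced if only controls the first one. */
#define for_each_item_broken(i)					\
	check_lock_held_sketch();				\
	for ((i) = 0; (i) < N_ITEMS_SKETCH; (i)++)

/* One statement: the loop body becomes the else branch. */
#define for_each_item(i)					\
	for ((i) = 0; (i) < N_ITEMS_SKETCH; (i)++)		\
		if (check_lock_held_sketch(), false)		\
			;					\
		else

static int sum_if_sketch(bool cond)
{
	size_t i;
	int sum = 0;

	if (cond)
		for_each_item(i)
			sum += items_sketch[i];

	return sum;
}

static int sum_if_broken_sketch(bool cond)
{
	size_t i;
	int sum = 0;

	/* Expands to: if (cond) check(); for (...) sum += ...; */
	if (cond)
		for_each_item_broken(i)
			sum += items_sketch[i];

	return sum;
}
```

With cond false, sum_if_sketch() skips the loop, but sum_if_broken_sketch()
still runs it, because only the check ended up under the if. Note that the
if/else form evaluates the check on every iteration rather than once, and a
compiler may warn about the dangling else.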

* Re: [RFC PATCH 06/19] cpufreq: always access cpufreq_policy_list while holding cpufreq_driver_lock
  2016-01-12  9:57   ` Viresh Kumar
@ 2016-01-12 12:08     ` Juri Lelli
  2016-01-13  6:01       ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-12 12:08 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

Hi,

On 12/01/16 15:27, Viresh Kumar wrote:
> On 11-01-16, 17:35, Juri Lelli wrote:
> > Commit highlights paths where we access cpufreq_policy_list without
> > holding cpufreq_driver_lock; one example being the following:
> > 
> > [    8.245779] ------------[ cut here ]------------
> > [    8.305977] WARNING: CPU: 2 PID: 1 at kernel/drivers/cpufreq/cpufreq.c:2447 cpufreq_register_driver+0xfd/0x120()
> > [    8.438611] Modules linked in:
> > [    8.493751] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc4+ #369
> > [    8.561039] Hardware name: ARM-Versatile Express
> > [    8.622765] [<c0014215>] (unwind_backtrace) from [<c0010e25>] (show_stack+0x11/0x14)
> > [    8.629651] atkbd serio0: keyboard reset failed on 1c060000.kmi
> > [    8.810905] [<c0010e25>] (show_stack) from [<c02ece7d>] (dump_stack+0x55/0x78)
> > [    8.935122] [<c02ece7d>] (dump_stack) from [<c00202cd>] (warn_slowpath_common+0x59/0x84)
> > [    9.067097] [<c00202cd>] (warn_slowpath_common) from [<c002030f>] (warn_slowpath_null+0x17/0x1c)
> > [    9.204101] [<c002030f>] (warn_slowpath_null) from [<c03ba329>] (cpufreq_register_driver+0xfd/0x120)
> > [    9.209603] usb 1-1.2: new high-speed USB device number 3 using isp1760
> > [    9.419507] [<c03ba329>] (cpufreq_register_driver) from [<c03bc481>] (bL_cpufreq_register+0x49/0x98)
> > [    9.560548] [<c03bc481>] (bL_cpufreq_register) from [<c0342517>] (platform_drv_probe+0x3b/0x6c)
> > [    9.573806] usb-storage 1-1.2:1.0: USB Mass Storage device detected
> > [    9.575468] scsi host0: usb-storage 1-1.2:1.0
> > [    9.855845] [<c0342517>] (platform_drv_probe) from [<c03412e7>] (driver_probe_device+0x153/0x1bc)
> > [   10.006137] [<c03412e7>] (driver_probe_device) from [<c03413a7>] (__driver_attach+0x57/0x58)
> > [   10.009576] atkbd serio1: keyboard reset failed on 1c070000.kmi
> > [   10.237057] [<c03413a7>] (__driver_attach) from [<c0340199>] (bus_for_each_dev+0x2d/0x4c)
> > [   10.387824] [<c0340199>] (bus_for_each_dev) from [<c0340bd7>] (bus_add_driver+0xa3/0x14c)
> > [   10.539200] [<c0340bd7>] (bus_add_driver) from [<c0341bff>] (driver_register+0x3b/0x88)
> > [   10.691023] [<c0341bff>] (driver_register) from [<c0009613>] (do_one_initcall+0x5b/0x150)
> > [   10.703809] scsi 0:0:0:0: Direct-Access     General  USB Flash Disk   1.0  PQ: 0 ANSI: 2
> > [   10.713081] sd 0:0:0:0: [sda] 7831552 512-byte logical blocks: (4.00 GB/3.73 GiB)
> > [   10.713973] sd 0:0:0:0: [sda] Write Protect is off
> > [   10.713984] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
> > [   10.730783] sd 0:0:0:0: [sda] No Caching mode page found
> > [   10.730814] sd 0:0:0:0: [sda] Assuming drive cache: write through
> > [   10.779815]  sda: sda1 sda2
> > [   10.823590] sd 0:0:0:0: [sda] Attached SCSI removable disk
> > [   11.581894] [<c0009613>] (do_one_initcall) from [<c0734b45>] (kernel_init_freeable+0x18d/0x22c)
> > [   11.720454] [<c0734b45>] (kernel_init_freeable) from [<c04f45f9>] (kernel_init+0xd/0xa4)
> > [   11.857340] [<c04f45f9>] (kernel_init) from [<c000dfb9>] (ret_from_fork+0x11/0x38)
> > [   11.993082] ---[ end trace 62ff5522fb3f41dd ]---
> > 
> > Fix this, and others, with proper locking of cpufreq_driver_lock.
> 
> Perhaps this should be added prior to the lockdep patch, so that git
> bisect doesn't show lockdeps ?
> 

I put patches in this order to be able to highlight problems before
fixing them. But I agree this is not nice for bisectability. I guess I
could squash related fixes and assertions together (when removing the
RFC tag) so that we don't break bisectability.

> > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> > ---
> >  drivers/cpufreq/cpufreq.c | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 63d6efb..98adbc2 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -1585,6 +1585,7 @@ EXPORT_SYMBOL(cpufreq_generic_suspend);
> >  void cpufreq_suspend(void)
> >  {
> >  	struct cpufreq_policy *policy;
> > +	unsigned long flags;
> >  
> >  	if (!cpufreq_driver)
> >  		return;
> > @@ -1594,6 +1595,7 @@ void cpufreq_suspend(void)
> >  
> >  	pr_debug("%s: Suspending Governors\n", __func__);
> >  
> > +	read_lock_irqsave(&cpufreq_driver_lock, flags);
> >  	for_each_active_policy(policy) {
> >  		if (__cpufreq_governor(policy, CPUFREQ_GOV_STOP))
> >  			pr_err("%s: Failed to stop governor for policy: %p\n",
> > @@ -1603,6 +1605,7 @@ void cpufreq_suspend(void)
> >  			pr_err("%s: Failed to suspend driver: %p\n", __func__,
> >  				policy);
> >  	}
> > +	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> >  
> >  suspend:
> >  	cpufreq_suspended = true;
> > @@ -1617,6 +1620,7 @@ suspend:
> >  void cpufreq_resume(void)
> >  {
> >  	struct cpufreq_policy *policy;
> > +	unsigned long flags;
> >  
> >  	if (!cpufreq_driver)
> >  		return;
> > @@ -1628,6 +1632,7 @@ void cpufreq_resume(void)
> >  
> >  	pr_debug("%s: Resuming Governors\n", __func__);
> >  
> > +	read_lock_irqsave(&cpufreq_driver_lock, flags);
> >  	for_each_active_policy(policy) {
> >  		if (cpufreq_driver->resume && cpufreq_driver->resume(policy))
> >  			pr_err("%s: Failed to resume driver: %p\n", __func__,
> > @@ -1637,6 +1642,7 @@ void cpufreq_resume(void)
> >  			pr_err("%s: Failed to start governor for policy: %p\n",
> >  				__func__, policy);
> >  	}
> > +	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> >  
> >  	/*
> >  	 * schedule call cpufreq_update_policy() for first-online CPU, as that
> > @@ -2287,7 +2293,9 @@ static int cpufreq_boost_set_sw(int state)
> >  	struct cpufreq_frequency_table *freq_table;
> >  	struct cpufreq_policy *policy;
> >  	int ret = -EINVAL;
> > +	unsigned long flags;
> >  
> > +	read_lock_irqsave(&cpufreq_driver_lock, flags);
> >  	for_each_active_policy(policy) {
> >  		freq_table = cpufreq_frequency_get_table(policy->cpu);
> >  		if (freq_table) {
> > @@ -2302,6 +2310,7 @@ static int cpufreq_boost_set_sw(int state)
> >  			__cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
> >  		}
> >  	}
> > +	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> >  
> >  	return ret;
> >  }
> 
> For the above three, I am not sure if there can be some side effects.
> Can you please push a branch somewhere, to be tested by Fengguang's
> build bot? So that we know of any new lockdeps due to this? All above
> routines directly/indirectly call governor specific routines and that
> leads to freq-update in few cases. AFAIR, there were some issues with
> locking here.
> 

I currently don't have any branch fetched by Fengguang's bot; I'll see
how to start doing that. In the meantime I'll try to set up an x86 box
and run some more tests.

> > @@ -2432,14 +2441,16 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
> >  	if (ret)
> >  		goto err_boost_unreg;
> >  
> > -	lockdep_assert_held(&cpufreq_driver_lock);
> > +	read_lock_irqsave(&cpufreq_driver_lock, flags);
> >  	if (!(cpufreq_driver->flags & CPUFREQ_STICKY) &&
> >  	    list_empty(&cpufreq_policy_list)) {
> >  		/* if all ->init() calls failed, unregister */
> >  		pr_debug("%s: No CPU initialized for driver %s\n", __func__,
> >  			 driver_data->name);
> > +		read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> >  		goto err_if_unreg;
> >  	}
> > +	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> 
> We have just registered the cpufreq driver, there is no other path
> that can simultaneously update the list here.
> 
> And so we don't need the lock here.
> 

I was thinking hotplug can get in the way, but we are inside a
{get,put}_online_cpus block. I'll remove that.

Thanks,

- Juri

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-12 11:58         ` Peter Zijlstra
@ 2016-01-12 12:36           ` Juri Lelli
  2016-01-12 15:26             ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-12 12:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Viresh Kumar, linux-kernel, linux-pm, rjw, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 12/01/16 12:58, Peter Zijlstra wrote:
> On Tue, Jan 12, 2016 at 11:21:25AM +0000, Juri Lelli wrote:
> > I tried to see if something like for_each_domain() can be done, but here
> > we use list_for_each_entry() macro. Peter, do you mean something like
> > the following?
> > 
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 78b1e2f..1a847a6 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -39,6 +39,7 @@
> >  static LIST_HEAD(cpufreq_governor_list);
> >  static DEFINE_MUTEX(cpufreq_governor_mutex);
> >  #define for_each_governor(__governor)				\
> > +	lockdep_assert_held(&cpufreq_governor_mutex);		\
> >  	list_for_each_entry(__governor, &cpufreq_governor_list, governor_list)
> 
> That fails for things like:
> 
> 	if (blah)
> 		for_each_governor(...) {
> 		}
> 
> which looks like valid C -- even though our Coding Style says the if
> should have { } on.
> 
> I was thinking of either open coding the for statement and adding it to
> the first statement like:
> 
> 	#define for_each_governor(_g) \
> 		for (_g = list_first_entry(&cpufreq_governor_list, typeof(*_g), governor_list), lockdep_assert_held(), \
> 		     ..... )
> 
> Or use something like this:
> 
>   lkml.kernel.org/r/20150422154212.GE3007@worktop.Skamania.guest
> 
> 	#define for_each_governor(_g) \
> 		list_for_each_entry(_g, &cpufreq_governor_list, governor_list)
> 			if (lockdep_assert_held(..), false)
> 				;
> 			else
> 
> Which should preserve C syntax rules.
> 

Oh, this is nice! I'll try it.

Thanks,

- Juri
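
For reference, the if/else trick Peter describes can be tried outside the
kernel. What follows is only a minimal userspace sketch: every demo_* name
is hypothetical, and a plain flag stands in for lockdep. The point it
illustrates is that the macro expands to a single statement, whereas a
macro that puts a separate assertion statement in front of the loop would
leave the loop outside a braceless "if".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names are kernel API. */
bool demo_lock_held;	/* pretend "is the mutex held?" state */
int demo_violations;	/* counts iterations entered without the lock */

static void demo_assert_held(void)
{
	if (!demo_lock_held)
		demo_violations++;
}

/*
 * The check runs once per iteration. The empty "if" branch is never taken
 * (the comma expression always evaluates to false), so the caller's loop
 * body binds to the trailing "else" and the macro stays one statement.
 */
#define demo_for_each(i, n)					\
	for ((i) = 0; (i) < (n); (i)++)				\
		if (demo_assert_held(), false)			\
			;					\
		else

/* Sum 0..n-1 when enabled; braces kept, as the coding style asks. */
int demo_sum(bool enabled, int n)
{
	int i, sum = 0;

	if (enabled) {
		demo_for_each(i, n)
			sum += i;
	}

	return sum;
}
```

Calling demo_sum(true, 4) with demo_lock_held unset bumps demo_violations
once per iteration, which is the per-iteration behaviour this variant has.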

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-12 12:36           ` Juri Lelli
@ 2016-01-12 15:26             ` Juri Lelli
  2016-01-12 15:58               ` Peter Zijlstra
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-12 15:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Viresh Kumar, linux-kernel, linux-pm, rjw, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 12/01/16 12:36, Juri Lelli wrote:
> On 12/01/16 12:58, Peter Zijlstra wrote:
> > On Tue, Jan 12, 2016 at 11:21:25AM +0000, Juri Lelli wrote:
> > > I tried to see if something like for_each_domain() can be done, but here
> > > we use list_for_each_entry() macro. Peter, do you mean something like
> > > the following?
> > > 
> > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > > index 78b1e2f..1a847a6 100644
> > > --- a/drivers/cpufreq/cpufreq.c
> > > +++ b/drivers/cpufreq/cpufreq.c
> > > @@ -39,6 +39,7 @@
> > >  static LIST_HEAD(cpufreq_governor_list);
> > >  static DEFINE_MUTEX(cpufreq_governor_mutex);
> > >  #define for_each_governor(__governor)				\
> > > +	lockdep_assert_held(&cpufreq_governor_mutex);		\
> > >  	list_for_each_entry(__governor, &cpufreq_governor_list, governor_list)
> > 
> > That fails for things like:
> > 
> > 	if (blah)
> > 		for_each_governor(...) {
> > 		}
> > 
> > which looks like valid C -- even though our Coding Style says the if
> > should have { } on.
> > 
> > I was thinking of either open coding the for statement and adding it to
> > the first statement like:
> > 
> > 	#define for_each_governor(_g) \
> > 		for (_g = list_first_entry(&cpufreq_governor_list, typeof(*_g), governor_list), lockdep_assert_held(), \
> > 		     ..... )
> > 
> > Or use something like this:
> > 
> >   lkml.kernel.org/r/20150422154212.GE3007@worktop.Skamania.guest
> > 
> > 	#define for_each_governor(_g) \
> > 		list_for_each_entry(_g, &cpufreq_governor_list, governor_list)
> > 			if (lockdep_assert_held(..), false)
> > 				;
> > 			else
> > 
> > Which should preserve C syntax rules.
> > 
> 
> Oh, this is nice! I'll try it.
> 

This second approach doesn't really play well with the
lockdep_assert_held() definition (a do { } while (0) statement, which
can't be the left operand of a comma expression), right?

However, it seems I could make this work with

 #ifdef CONFIG_LOCKDEP
 #define for_each_governor(__gov)					    \
 	for (__gov = list_first_entry(&cpufreq_governor_list, 		    \
 				      typeof(*__gov), 			    \
 				      governor_list),			    \
 				WARN_ON(debug_locks &&			    \
 				!lockdep_is_held(&cpufreq_governor_mutex)); \
 	     &__gov->governor_list != (&cpufreq_governor_list);		    \
 	     __gov = list_next_entry(__gov, governor_list))
 #else /* !CONFIG_LOCKDEP */
 #define for_each_governor(__gov)					    \
 	list_for_each_entry(__gov, &cpufreq_governor_list, governor_list)
 #endif /* CONFIG_LOCKDEP */

Thanks,

- Juri
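
The open-coded variant above puts the check in the init clause of the
for statement, so it is evaluated once when the loop is entered rather
than on every iteration. A rough userspace sketch of the same shape
follows. It is an illustration only: the demo_* names are hypothetical,
a flag replaces lockdep_is_held(), and a hand-rolled singly linked list
replaces the kernel list helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list node; not the kernel's struct cpufreq_governor. */
struct demo_gov {
	const char *name;
	struct demo_gov *next;
};

bool demo_mutex_held;	/* stand-in for "is the governor mutex held?" */
int demo_warnings;	/* stand-in for WARN_ON() splats */

/* Returns a value, so it can follow a comma operator. */
static int demo_warn_unless_held(void)
{
	if (!demo_mutex_held)
		demo_warnings++;
	return 0;
}

/* The check sits in the init clause: evaluated once per loop entry. */
#define demo_for_each_gov(g, head)				\
	for ((g) = (head), (void)demo_warn_unless_held();	\
	     (g) != NULL;					\
	     (g) = (g)->next)

int demo_count_govs(struct demo_gov *head)
{
	struct demo_gov *g;
	int n = 0;

	demo_for_each_gov(g, head)
		n++;

	return n;
}
```

Walking a three-entry list without the flag set records a single
warning, not three.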

* Re: [RFC PATCH 07/19] cpufreq: assert locking when accessing cpufreq_governor_list
  2016-01-12 10:01   ` Viresh Kumar
@ 2016-01-12 15:33     ` Juri Lelli
  0 siblings, 0 replies; 110+ messages in thread
From: Juri Lelli @ 2016-01-12 15:33 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

Hi,

On 12/01/16 15:31, Viresh Kumar wrote:
> On 11-01-16, 17:35, Juri Lelli wrote:
> > @@ -2025,6 +2027,7 @@ int cpufreq_register_governor(struct cpufreq_governor *governor)
> >  	err = -EBUSY;
> >  	if (!find_governor(governor->name)) {
> >  		err = 0;
> > +		lockdep_assert_held(&cpufreq_governor_mutex);
> >  		list_add(&governor->governor_list, &cpufreq_governor_list);
> >  	}
> 
> Why here? This is how the routine looks like:
> 

I guess I was simply over-paranoid. We can drop this assertion.

Thanks,

- Juri

> int cpufreq_register_governor(struct cpufreq_governor *governor)
> {
> 	int err;
> 
> 	if (!governor)
> 		return -EINVAL;
> 
> 	if (cpufreq_disabled())
> 		return -ENODEV;
> 
> 	mutex_lock(&cpufreq_governor_mutex);
> 
> 	governor->initialized = 0;
> 	err = -EBUSY;
> 	if (!find_governor(governor->name)) {
> 		err = 0;
> 		list_add(&governor->governor_list, &cpufreq_governor_list);
> 	}
> 
> 	mutex_unlock(&cpufreq_governor_mutex);
> 	return err;
> }
> 
> 
> -- 
> viresh
> 

* Re: [RFC PATCH 08/19] cpufreq: fix warning for cpufreq_init_policy unlocked access to cpufreq_governor_list
  2016-01-12 10:09   ` Viresh Kumar
@ 2016-01-12 15:52     ` Juri Lelli
  2016-01-13  6:07       ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-12 15:52 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 12/01/16 15:39, Viresh Kumar wrote:
> On 11-01-16, 17:35, Juri Lelli wrote:
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 7dae7f3..d065435 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -969,6 +969,7 @@ static int cpufreq_init_policy(struct cpufreq_policy *policy)
> >  
> >  	memcpy(&new_policy, policy, sizeof(*policy));
> >  
> > +	mutex_lock(&cpufreq_governor_mutex);
> >  	/* Update governor of new_policy to the governor used before hotplug */
> >  	gov = find_governor(policy->last_governor);
> 
> You should take the lock within find_governor() instead, i.e.  around
> the while loop.
> 

Other users (i.e., cpufreq_parse_governor and cpufreq_register_governor)
need to take the mutex externally. So, we need to unify this behaviour.

Best,

- Juri

> >  	if (gov)
> > @@ -976,6 +977,7 @@ static int cpufreq_init_policy(struct cpufreq_policy *policy)
> >  				policy->governor->name, policy->cpu);
> >  	else
> >  		gov = CPUFREQ_DEFAULT_GOVERNOR;
> > +	mutex_unlock(&cpufreq_governor_mutex);
> >  
> >  	new_policy.governor = gov;
> >  
> > -- 
> > 2.2.2
> 
> -- 
> viresh
> 

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-12 15:26             ` Juri Lelli
@ 2016-01-12 15:58               ` Peter Zijlstra
  0 siblings, 0 replies; 110+ messages in thread
From: Peter Zijlstra @ 2016-01-12 15:58 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Viresh Kumar, linux-kernel, linux-pm, rjw, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Tue, Jan 12, 2016 at 03:26:01PM +0000, Juri Lelli wrote:
> > > 	#define for_each_governor(_g) \
> > > 		list_for_each_entry(_g, &cpufreq_governor_list, governor_list)
> > > 			if (lockdep_assert_held(..), false)
> > > 				;
> > > 			else
> > > 
> > > Which should preserve C syntax rules.
> > > 
> > 
> > Oh, this is nice! I'll try it.
> > 
> 
> This second approach doesn't really play well with lockdep_assert_held
> definition, right?

Right, the below however makes it work, except:

../kernel/sched/core.c: In function ‘scheduler_ipi’:
../kernel/sched/core.c:1831:32: warning: left-hand operand of comma expression has no effect [-Wunused-value]
  if (lockdep_assert_held(&lock), false)

Which is of course correct and very much on purpose :/
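
For readers less familiar with the construct: the lockdep.h change in
this message swaps a do { } while (0) statement for a GNU statement
expression, which is what lets the assertion sit on the left of a comma.
The following is a small userspace sketch of that difference only. The
demo_* names are hypothetical, and it needs GCC or clang because
statement expressions are a GNU extension.

```c
#include <assert.h>
#include <stdbool.h>

bool demo_held;		/* stand-in for lockdep_is_held() */
int demo_splats;	/* stand-in for WARN_ON() output */

/* Statement form: fine as "demo_assert_stmt();" but not in an expression. */
#define demo_assert_stmt()	do { if (!demo_held) demo_splats++; } while (0)

/* GNU statement expression: same check, usable where an expression goes. */
#define demo_assert_expr()	({ if (!demo_held) demo_splats++; (void)0; })

bool demo_check(void)
{
	/* "if (demo_assert_stmt(), false)" would be a syntax error. */
	if (demo_assert_expr(), false)
		return true;	/* never reached: the comma yields false */

	return false;
}
```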

---

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index c57e424d914b..caf7a89643d8 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -362,9 +362,9 @@ extern void lock_unpin_lock(struct lockdep_map *lock);
 
 #define lockdep_depth(tsk)	(debug_locks ? (tsk)->lockdep_depth : 0)
 
-#define lockdep_assert_held(l)	do {				\
+#define lockdep_assert_held(l)	({				\
 		WARN_ON(debug_locks && !lockdep_is_held(l));	\
-	} while (0)
+		(void)l; })
 
 #define lockdep_assert_held_once(l)	do {				\
 		WARN_ON_ONCE(debug_locks && !lockdep_is_held(l));	\
@@ -422,7 +422,7 @@ struct lock_class_key { };
 
 #define lockdep_depth(tsk)	(0)
 
-#define lockdep_assert_held(l)			do { (void)(l); } while (0)
+#define lockdep_assert_held(l)			({ (void)l; })
 #define lockdep_assert_held_once(l)		do { (void)(l); } while (0)
 
 #define lockdep_recursing(tsk)			(0)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 77d97a6fc715..f6f36217133d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1817,6 +1817,8 @@ void sched_ttwu_pending(void)
 	raw_spin_unlock_irqrestore(&rq->lock, flags);
 }
 
+raw_spinlock_t lock;
+
 void scheduler_ipi(void)
 {
 	/*
@@ -1826,6 +1828,9 @@ void scheduler_ipi(void)
 	 */
 	preempt_fold_need_resched();
 
+	if (lockdep_assert_held(&lock), false)
+		;
+
 	if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick())
 		return;
 

* Re: [RFC PATCH 04/19] cpufreq: bring data structures close to their locks
  2016-01-12  8:27       ` Peter Zijlstra
  2016-01-12 10:43         ` Juri Lelli
@ 2016-01-12 16:47         ` Rafael J. Wysocki
  1 sibling, 0 replies; 110+ messages in thread
From: Rafael J. Wysocki @ 2016-01-12 16:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Juri Lelli, linux-kernel, linux-pm, viresh.kumar, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Tuesday, January 12, 2016 09:27:18 AM Peter Zijlstra wrote:
> On Tue, Jan 12, 2016 at 12:03:39AM +0100, Rafael J. Wysocki wrote:
> > On Monday, January 11, 2016 11:05:28 PM Peter Zijlstra wrote:
> > > On Mon, Jan 11, 2016 at 05:35:45PM +0000, Juri Lelli wrote:
> > > > +/**
> > > > + * The "cpufreq driver" - the arch- or hardware-dependent low
> > > > + * level driver of CPUFreq support, and its spinlock (cpufreq_driver_lock).
> > > > + * This lock also protects cpufreq_cpu_data array and cpufreq_policy_list.
> > > > + */
> > > > +static struct cpufreq_driver *cpufreq_driver;
> > > > +static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
> > > >  static LIST_HEAD(cpufreq_policy_list);
> > > > +static DEFINE_RWLOCK(cpufreq_driver_lock);
> > > 
> > > Part of my suggestion was to fold the per-cpu data of cpufreq_cpu_data
> > > into struct cpufreq_driver.
> > > 
> > > That way each cpufreq_driver will have its own copy and there'd be only
> > > the one global pointer to swizzle. Something very well suited to RCU.
> > 
> > Well, I'm not really sure reworking all that is necessary.
> > 
> > What we need is to be able to call something analogous to dbs_timer_handler()
> > from the scheduler and a driver callback from there (if present).  For that,
> > it should be sufficient to have a pointer to that callback (that may be set
> > upon driver registration) protected by RCU (or should that be sched RCU
> > rather?) if I'm not missing anything.
> 
> But such a callback will invariably want to use the per-cpu state.

Which likely is the driver's own per-cpu state, not the policy object itself.

> And now you have two pointers, one for the driver and one for the per-cpu
> state. Keeping that in sync is a pain.

Well, I basically need to guarantee that all of the pointers involved are set
and the data structures are valid when the driver pointer is set.

> Moving the per-cpu data into the driver solves that trivially.

It doesn't really address the case when the driver has its own per-cpu state
as I said above.

Thanks,
Rafael
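
To make the "one global pointer to swizzle" idea concrete, here is a
userspace sketch, not kernel code. It assumes a hypothetical
struct demo_driver that embeds its per-CPU state, and it uses C11
release/acquire atomics in place of rcu_assign_pointer() and
rcu_dereference(). It deliberately leaves out what RCU would add on top,
namely knowing when the previous driver object may be freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

/* Hypothetical driver object with the per-CPU state folded in. */
struct demo_driver {
	const char *name;
	unsigned int cur_khz[DEMO_NR_CPUS];
};

static _Atomic(struct demo_driver *) demo_active_driver;

/* Writer: initialise *drv completely first, then publish with one store. */
void demo_publish(struct demo_driver *drv)
{
	atomic_store_explicit(&demo_active_driver, drv, memory_order_release);
}

/* Reader: a single load yields the driver and its per-CPU state together. */
unsigned int demo_read_khz(int cpu)
{
	struct demo_driver *drv =
		atomic_load_explicit(&demo_active_driver, memory_order_acquire);

	return drv ? drv->cur_khz[cpu] : 0;
}
```

With two separately published pointers (driver and per-CPU state) a
reader could observe one updated and the other not; the sketch avoids
that only because both live behind the same pointer.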

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-12 11:24   ` Viresh Kumar
@ 2016-01-13  0:54     ` Michael Turquette
  2016-01-13  6:31       ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Michael Turquette @ 2016-01-13  0:54 UTC (permalink / raw)
  To: Viresh Kumar, Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

Hi Viresh,

Quoting Viresh Kumar (2016-01-12 03:24:09)
> On 11-01-16, 17:35, Juri Lelli wrote:
> > From: Michael Turquette <mturquette@baylibre.com>
> > 
> > transition_lock was introduced to serialize cpufreq transition
> > notifiers. Instead of using a different lock for protecting concurrent
> > modifications of policy, it is better to require that callers of
> > transition notifiers implement appropriate locking (this is already the
> > case AFAICS). Removing transition_lock also simplifies current locking
> > scheme.
> 
> So, are you saying that the reasoning mentioned in this patch is all
> wrong?
> 
> commit 12478cf0c55e ("cpufreq: Make sure frequency transitions are
> serialized")

No, that's not what I'm saying. Quoting that patch:

"""
The key challenge is to allow drivers to begin the transition from one thread
and end it in a completely different thread (this is to enable drivers that do
asynchronous POSTCHANGE notification from bottom-halves, to also use the same
interface).

To achieve this, a 'transition_ongoing' flag, a 'transition_lock' spinlock and a
wait-queue are added per-policy. The flag and the wait-queue are used in
conjunction to create an "uninterrupted flow" from _begin() to _end(). The
spinlock is used to ensure that only one such "flow" is in flight at any given
time. Put together, this provides us all the necessary synchronization.
"""

So the transition_ongoing flag and wait-queue are all good. That stuff is
just great. This patch doesn't touch it.

What it does change is that it removes a superfluous spinlock that
should never have needed to exist in the first place.
cpufreq_freq_transition_begin is called directly by driver target
callbacks, and it is called by __cpufreq_driver_target.

__cpufreq_driver_target should be using a per-policy lock. Any other
behavior is just insane. I haven't gone through this thread to see if
that change has been made by Juri, but we need to get there either in
this series or the follow-up series that introduces some RCU locking.

Regards,
Mike

> 
> -- 
> viresh
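
The flag plus wait-queue scheme quoted above can be approximated in
userspace to see the "uninterrupted flow" from _begin() to _end(). This
is a sketch under stated assumptions, not the cpufreq implementation:
struct demo_policy and the demo_* functions are hypothetical, a pthread
mutex guards the flag, and a condition variable plays the role of the
wait-queue.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-policy state; not the kernel's struct cpufreq_policy. */
struct demo_policy {
	pthread_mutex_t lock;		/* protects transition_ongoing */
	pthread_cond_t wait;		/* stands in for the wait-queue */
	bool transition_ongoing;
	unsigned int cur_khz;
};

void demo_policy_init(struct demo_policy *p, unsigned int khz)
{
	pthread_mutex_init(&p->lock, NULL);
	pthread_cond_init(&p->wait, NULL);
	p->transition_ongoing = false;
	p->cur_khz = khz;
}

/* Block until no transition is in flight, then claim the flag. */
void demo_transition_begin(struct demo_policy *p)
{
	pthread_mutex_lock(&p->lock);
	while (p->transition_ongoing)
		pthread_cond_wait(&p->wait, &p->lock);
	p->transition_ongoing = true;
	pthread_mutex_unlock(&p->lock);
}

/* May run in a different thread from the one that called _begin(). */
void demo_transition_end(struct demo_policy *p, unsigned int new_khz)
{
	pthread_mutex_lock(&p->lock);
	p->cur_khz = new_khz;
	p->transition_ongoing = false;
	pthread_cond_broadcast(&p->wait);
	pthread_mutex_unlock(&p->lock);
}
```

The lock here is only held for the few instructions that test or flip
the flag; whether the kernel needs a dedicated lock for that, or can
rely on callers' locking, is the question the thread is debating.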

* Re: [RFC PATCH 05/19] cpufreq: assert locking when accessing cpufreq_policy_list
  2016-01-12 11:44     ` Juri Lelli
@ 2016-01-13  5:59       ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-13  5:59 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 12-01-16, 11:44, Juri Lelli wrote:
> But next_policy is called multiple times as part of
> for_each_suitable_policy().  What if someone thinks she/he can release
> cpufreq_driver_lock inside for_each_(in)active_policy() loop? Not that
> it makes sense, but don't you think it could happen?

Okay, I don't have a strong opinion about using that only in the first
routine. No issues.

> > >  	/* No policies in the list */
> > >  	if (list_empty(&cpufreq_policy_list))
> > >  		return NULL;
> > > @@ -2430,6 +2432,7 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
> > >  	if (ret)
> > >  		goto err_boost_unreg;
> > >  
> > > +	lockdep_assert_held(&cpufreq_driver_lock);
> > 
> > Why do you need a cpufreq_driver_lock here? And the above change
> > should generate a lockdep here as the lock isn't taken right now.
> > 
> 
> Because you are checking cpufreq_policy_list to see if it's empty. And
> it generates a lockdep warning, yes; fixed by next patch. Maybe putting
> fixes before warnings, as you are suggesting, is better.

Well, locking isn't required merely because we access a variable (like
cpufreq_policy_list here). It is required only where that access can
race with something else.

What I am saying is, we can't have a race here. And so no need to lock
it down.

-- 
viresh

* Re: [RFC PATCH 06/19] cpufreq: always access cpufreq_policy_list while holding cpufreq_driver_lock
  2016-01-12 12:08     ` Juri Lelli
@ 2016-01-13  6:01       ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-13  6:01 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 12-01-16, 12:08, Juri Lelli wrote:
> I currently don't have any branch fetched by Fengguang's bot; I'll see
> how to start doing that. In the meantime I'll try to set up an x86 box
> and run some more tests.

Perhaps, just create a new tree (that you always want to be tested by
the Bot) and push your branch there. And then ask Fengguang ((In)formal
email to: fengguang.wu@intel.com) to add your tree. He is very fast :)

> I was thinking hotplug can get in the way, but we are inside a
> {get,put}_online_cpus block. I'll remove that.

Exactly.

-- 
viresh

* Re: [RFC PATCH 08/19] cpufreq: fix warning for cpufreq_init_policy unlocked access to cpufreq_governor_list
  2016-01-12 15:52     ` Juri Lelli
@ 2016-01-13  6:07       ` Viresh Kumar
  2016-01-14 16:35         ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-13  6:07 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 12-01-16, 15:52, Juri Lelli wrote:
> Other users (i.e., cpufreq_parse_governor and cpufreq_register_governor)
> need to take the mutex externally. So, we need to unify this behaviour.

No they don't have to.

And that's why I have been saying that we better nail down the exact
thing the mutex is supposed to protect.

There can be two cases here:
- It protects the governor list, in that case we can move it to
  find_governor().
- It guarantees that the governor pointer stays valid: That's not true
  as we are using the governor pointer outside of the lock.

And so I said, "No they don't have to" :)

-- 
viresh
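
The first of the two cases above (the mutex protects the list, so the
lookup can take it internally) might look like the userspace sketch
below. It is an assumption-laden illustration: the demo_* names are
hypothetical, a pthread mutex replaces cpufreq_governor_mutex, and a
hand-rolled list replaces the kernel one. It also shows the second
point: once the lookup returns, the lock no longer says anything about
the lifetime of the returned pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical governor node; not the kernel's struct cpufreq_governor. */
struct demo_gov {
	const char *name;
	struct demo_gov *next;
};

static pthread_mutex_t demo_gov_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct demo_gov *demo_gov_list;

void demo_register(struct demo_gov *gov)
{
	pthread_mutex_lock(&demo_gov_mutex);
	gov->next = demo_gov_list;
	demo_gov_list = gov;
	pthread_mutex_unlock(&demo_gov_mutex);
}

/* The lock is taken inside the lookup and covers only the list walk. */
struct demo_gov *demo_find(const char *name)
{
	struct demo_gov *g;

	pthread_mutex_lock(&demo_gov_mutex);
	for (g = demo_gov_list; g; g = g->next)
		if (strcmp(g->name, name) == 0)
			break;
	pthread_mutex_unlock(&demo_gov_mutex);

	/* Nothing pins *g from here on; lifetime needs a separate rule. */
	return g;
}
```

Note what the sketch does not cover: a caller that must look up and
then insert as one atomic step would still need the lock held across
both operations.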

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-13  0:54     ` Michael Turquette
@ 2016-01-13  6:31       ` Viresh Kumar
       [not found]         ` <20160113182131.1168.45753@quark.deferred.io>
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-13  6:31 UTC (permalink / raw)
  To: Michael Turquette
  Cc: Juri Lelli, linux-kernel, linux-pm, peterz, rjw, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 12-01-16, 16:54, Michael Turquette wrote:
> __cpufreq_driver_target should be using a per-policy lock.

It doesn't :)

-- 
viresh

* Re: [RFC PATCH 09/19] cpufreq: fix warning for show_scaling_available_governors unlocked access to cpufreq_governor_list
  2016-01-12 10:13   ` Viresh Kumar
@ 2016-01-13 10:25     ` Juri Lelli
  2016-01-13 10:32       ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-13 10:25 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

Hi,

On 12/01/16 15:43, Viresh Kumar wrote:
> On 11-01-16, 17:35, Juri Lelli wrote:
> > show_scaling_available_governors iterates through cpufreq_governor_list
> > without holding cpufreq_governor_mutex; this generates the following
> > warning:
> > 
> > [  700.910381] ------------[ cut here ]------------
> > [  700.924282] WARNING: CPU: 2 PID: 1756 at kernel/drivers/cpufreq/cpufreq.c:700 show_scaling_available_governors+0x6f/0xb8()
> > [  700.965473] Modules linked in:
> > [  700.974637] CPU: 2 PID: 1756 Comm: cat Tainted: G        W       4.4.0-rc2+ #299
> > [  700.996813] Hardware name: ARM-Versatile Express
> > [  701.010674] [<c0014215>] (unwind_backtrace) from [<c0010e25>] (show_stack+0x11/0x14)
> > [  701.033905] [<c0010e25>] (show_stack) from [<c02eca5d>] (dump_stack+0x55/0x78)
> > [  701.055561] [<c02eca5d>] (dump_stack) from [<c00202cd>] (warn_slowpath_common+0x59/0x84)
> > [  701.079839] [<c00202cd>] (warn_slowpath_common) from [<c002030f>] (warn_slowpath_null+0x17/0x1c)
> > [  701.106182] [<c002030f>] (warn_slowpath_null) from [<c03b7bef>] (show_scaling_available_governors+0x6f/0xb8)
> > [  701.135656] [<c03b7bef>] (show_scaling_available_governors) from [<c03b7dc3>] (show+0x27/0x38)
> > [  701.161488] [<c03b7dc3>] (show) from [<c015469f>] (sysfs_kf_seq_show+0x5f/0xa0)
> > [  701.183409] [<c015469f>] (sysfs_kf_seq_show) from [<c01536a7>] (kernfs_seq_show+0x1b/0x1c)
> > [  701.208188] [<c01536a7>] (kernfs_seq_show) from [<c011a6d5>] (seq_read+0x129/0x33c)
> > [  701.231161] [<c011a6d5>] (seq_read) from [<c00ff7c7>] (__vfs_read+0x1b/0x84)
> > [  701.252300] [<c00ff7c7>] (__vfs_read) from [<c010000f>] (vfs_read+0x5f/0xb0)
> > [  701.273436] [<c010000f>] (vfs_read) from [<c0100099>] (SyS_read+0x39/0x68)
> > [  701.294049] [<c0100099>] (SyS_read) from [<c000df21>] (ret_fast_syscall+0x1/0x1a)
> > [  701.316484] ---[ end trace 5dd15744a4da127c ]---
> 
> FWIW, I would suggest you to use cpufreq-dt for Juno instead of
> arm_bL. I have asked Sudeep to do it earlier, but perhaps he was busy.
>

I couldn't really relate this comment with this patch or the backtrace.
Can you please clarify why you are referring to switching to use
cpufreq-dt here?

> > Fix this by locking cpufreq_governor_mutex before for_each_governor().
> > 
> > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> > ---
> >  drivers/cpufreq/cpufreq.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index d065435..d91fdb8 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -694,7 +694,7 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
> >  		goto out;
> >  	}
> >  
> > -	lockdep_assert_held(&cpufreq_governor_mutex);
> > +	mutex_lock(&cpufreq_governor_mutex);
> >  	for_each_governor(t) {
> >  		if (i >= (ssize_t) ((PAGE_SIZE / sizeof(char))
> >  		    - (CPUFREQ_NAME_LEN + 2)))
> > @@ -702,6 +702,7 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
> >  		i += scnprintf(&buf[i], CPUFREQ_NAME_PLEN, "%s ", t->name);
> >  	}
> >  out:
> > +	mutex_unlock(&cpufreq_governor_mutex);
> >  	i += sprintf(&buf[i], "\n");
> >  	return i;
> >  }
> 
> Just move this patch before the patch that added the
> lockdep-assert and we wouldn't be required to add the
> lockdep_assert_held() in the first place.
> 

Yep. As said, I just wanted to try to highlight possible problems with
this RFC.

Thanks,

- Juri

* Re: [RFC PATCH 09/19] cpufreq: fix warning for show_scaling_available_governors unlocked access to cpufreq_governor_list
  2016-01-13 10:25     ` Juri Lelli
@ 2016-01-13 10:32       ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-13 10:32 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 13-01-16, 10:25, Juri Lelli wrote:
> I couldn't really relate this comment with this patch or the backtrace.
> Can you please clarify why you are referring to switching to use
> cpufreq-dt here?

Yeah, sorry about that. One of your previous patches had a backtrace
mentioning bL, so I wanted to mention that to you. But by that time I
had moved on to the new patch and thought of just adding the point
before forgetting it completely.

-- 
viresh

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
       [not found]         ` <20160113182131.1168.45753@quark.deferred.io>
@ 2016-01-14  9:44           ` Juri Lelli
  2016-01-14 10:32           ` Viresh Kumar
  2016-01-19 14:00           ` Peter Zijlstra
  2 siblings, 0 replies; 110+ messages in thread
From: Juri Lelli @ 2016-01-14  9:44 UTC (permalink / raw)
  To: Michael Turquette
  Cc: Viresh Kumar, linux-kernel, linux-pm, peterz, rjw, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

Hi,

On 13/01/16 10:21, Michael Turquette wrote:
> Hi Viresh,
> 
> Quoting Viresh Kumar (2016-01-12 22:31:48)
> > On 12-01-16, 16:54, Michael Turquette wrote:
> > > __cpufreq_driver_target should be using a per-policy lock.
> > 
> > It doesn't :)
> 
> It should.
> 
> A less conceited response is that a per-policy lock should be held
> around calls to __cpufreq_driver_target. This can obviously be done by
> cpufreq_driver_target (no double underscore), but there are quite a few
> drivers that call __cpufreq_driver_target, and since we're touching the
> policy structure we need a lock around it.
> 

Agreed, we should enforce the rule that everything that touches the
policy structure has to lock it first.

> Juri's cover letter did not explicitly state my original, full intention
> for the patches I was working on. I'll spell that out below and
> hopefully we can gather consensus around it before moving forward. Juri,
> I'm implicitly assuming that you agree with the stuff below, but please
> correct me if I am wrong.

Right. With this RFC I decided to post only a subset of the patches we
came up with, because I needed to build some more confidence with the
subsystem I was going to propose changes for. The review comments
received are helping me on that front. I didn't mention the next steps
(RCU) at all because I wanted to focus on understanding and documenting
the current status, and maybe fixing it where required, before we
change it.

> The original idea for overhauling the locking
> in cpufreq is to use two locks:
> 
> 1) per-policy lock (my patches were using a mutex), which is the only
> lock that needs to be touched during a frequency transition. We do not
> want any lock contention between policies during freq transition. For
> read-side operation this locking scheme should utilize RCU so that the
> scheduler can safely access the values in struct cpufreq_policy within
> its schedule() context. [a note on RCU below]
> 
> 2) a single, framework-wide lock (my patches were using a mutex) that
> handles all of the other synchronization: governor events, driver events
> and anything else that does not happen on a per-policy basis. I don't
> think RCU is necessary for this. These operations are all slow-path ones
> so reducing the mess of 6-ish locks in cpufreq.c and friends down to a
> single mutex simplifies things greatly, eliminates the "drop the lock
> here for a few instructions" hacks and generally makes things more
> readable.
> 

This is basically what I also have on top of this series. I actually
went for RCU also for 2, but yes, that may be overkill.

A comment on 1 above, and something I got stuck on for some
time: if we implement RCU logic as it is supposed to be, I
think we can generate a lot of copy-update operations when changing
frequency (as the policy structure needs to be changed). Also, we might read
stale data. So, I'm not sure this will pay off. However, I tried to get
around this problem and I guess we will discuss whether 1 is doable in the
next RFC :-).
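
To make the copy-update cost concrete, here is a minimal userspace sketch.
It uses C11 atomics rather than the kernel RCU API, and struct toy_policy
and the toy_policy_*() helpers are made-up names for illustration only.
Every frequency change allocates a new copy and publishes it, and a reader
that loaded the pointer just before the update keeps seeing the stale value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the fields a lockless reader would want. */
struct toy_policy {
	unsigned int cur;	/* current frequency, kHz */
	unsigned int max;	/* maximum frequency, kHz */
};

/* Pointer that readers dereference without taking any lock. */
static _Atomic(struct toy_policy *) live_policy;

/* Publish the initial copy. Returns 0 on success, -1 on allocation failure. */
static int toy_policy_init(unsigned int cur, unsigned int max)
{
	struct toy_policy *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->cur = cur;
	p->max = max;
	atomic_store_explicit(&live_policy, p, memory_order_release);
	return 0;
}

/* Read side: a single acquire load, no lock. */
static const struct toy_policy *toy_policy_get(void)
{
	return atomic_load_explicit(&live_policy, memory_order_acquire);
}

/*
 * Write side: copy, modify the private copy, publish it. The old copy is
 * handed back through oldp; it may be freed only once no reader can still
 * hold it (the grace period, which this sketch does not model). Writers
 * are assumed to be serialized by some outer lock.
 */
static int toy_policy_update_cur(unsigned int freq, struct toy_policy **oldp)
{
	struct toy_policy *old = atomic_load_explicit(&live_policy,
						      memory_order_relaxed);
	struct toy_policy *new;

	assert(old);		/* toy_policy_init() must have run */
	new = malloc(sizeof(*new));
	if (!new)
		return -1;
	*new = *old;		/* copy */
	new->cur = freq;	/* update the private copy */
	atomic_store_explicit(&live_policy, new, memory_order_release);
	*oldp = old;
	return 0;
}
```

Note that one allocation and one deferred free happen per transition, which
is the overhead I am worried about for a structure written this often.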

> A quick note on RCU and the scheduler-driven DVFS stuff: RCU only helps
> us on read-side operations. For the purposes of sched-dvfs, this means
> that when we look at capacity utilization and want to normalize
> frequency based on that, we need to access the per-policy structure in a
> lockless way. RCU makes this possible.
> 
> RCU is absolutely not a magic bullet or elixir that lets us kick off
> DVFS transitions from the schedule() context. The frequency transitions
> are write-side operations, as we invariably touch struct cpufreq_policy.
> This means that the read-side stuff can live in the schedule() context,
> but write-side needs to be kicked out to a thread.
> 

Correct. We will still need the kthread machinery even after these
changes.
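
Roughly, the split could look like the userspace sketch below. All names
are hypothetical, and in the kernel the worker would be a kthread woken
from the scheduler path. The hot path only posts the latest requested
frequency with an atomic exchange and never sleeps; the worker picks it up
later and is free to take sleeping locks and poke the hardware.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_REQUEST 0u	/* assumption: 0 kHz is never a valid target */

static atomic_uint pending_freq;	/* written by the hot path */
static unsigned int hw_freq;		/* touched only by the worker */

/*
 * Hot path (think: scheduler context). Must not sleep, so it only records
 * the most recent request; an older unserviced request is overwritten.
 * Returns true if nothing was pending, i.e. the worker needs a wake-up.
 */
static bool request_freq(unsigned int freq)
{
	assert(freq != NO_REQUEST);
	return atomic_exchange(&pending_freq, freq) == NO_REQUEST;
}

/*
 * Worker body (think: kthread). May sleep. Services at most one coalesced
 * request per call and returns true if it did any work.
 */
static bool worker_service(void)
{
	unsigned int freq = atomic_exchange(&pending_freq, NO_REQUEST);

	if (freq == NO_REQUEST)
		return false;
	/* A real driver would take sleeping locks and program hardware here. */
	hw_freq = freq;
	return true;
}
```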

Thanks for clarifying things!

Best,

- Juri

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
       [not found]         ` <20160113182131.1168.45753@quark.deferred.io>
  2016-01-14  9:44           ` Juri Lelli
@ 2016-01-14 10:32           ` Viresh Kumar
  2016-01-14 13:52             ` Juri Lelli
  2016-01-19 14:00           ` Peter Zijlstra
  2 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-14 10:32 UTC (permalink / raw)
  To: Michael Turquette
  Cc: Juri Lelli, linux-kernel, linux-pm, peterz, rjw, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 13-01-16, 10:21, Michael Turquette wrote:
> Quoting Viresh Kumar (2016-01-12 22:31:48)
> > On 12-01-16, 16:54, Michael Turquette wrote:
> > > __cpufreq_driver_target should be using a per-policy lock.
> > 
> > It doesn't :)
> 
> It should.

I thought we wanted the routine doing DVFS to not sleep as it will be
called from scheduler ?

Looks fine otherwise. But yeah, the series is still incomplete in the
sense that there is no lock today around __cpufreq_driver_target().

-- 
viresh

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-14 10:32           ` Viresh Kumar
@ 2016-01-14 13:52             ` Juri Lelli
  2016-01-18  5:09               ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-14 13:52 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Michael Turquette, linux-kernel, linux-pm, peterz, rjw,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 14/01/16 16:02, Viresh Kumar wrote:
> On 13-01-16, 10:21, Michael Turquette wrote:
> > Quoting Viresh Kumar (2016-01-12 22:31:48)
> > > On 12-01-16, 16:54, Michael Turquette wrote:
> > > > __cpufreq_driver_target should be using a per-policy lock.
> > > 
> > > It doesn't :)
> > 
> > It should.
> 
> I thought we wanted the routine doing DVFS to not sleep as it will be
> called from scheduler ?
> 
> Looks fine otherwise. But yeah, the series is still incomplete in the
> sense that there is no lock today around __cpufreq_driver_target().
> 

I was under the impression that exporting __cpufreq_driver_target()
outside cpufreq.c works because users implement their own locking.

That's why I put the following comment in this patch.

 /*
  * Callers must ensure proper mutual exclusion on policy (for transition_
  * ongoing/transition_task handling). While holding policy->rwsem is
  * sufficient, other schemes might work as well (e.g., cpufreq_governor.c
  * holds timer_mutex while entering the path that generates transitions).
  */

From what I can see, ondemand and conservative (via governor) seem to use
timer_mutex, while userspace uses userspace_mutex. Do they serve different
purposes? How do we currently serialize operations on policy
when using __cpufreq_driver_target() directly otherwise?

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 08/19] cpufreq: fix warning for cpufreq_init_policy unlocked access to cpufreq_governor_list
  2016-01-13  6:07       ` Viresh Kumar
@ 2016-01-14 16:35         ` Juri Lelli
  2016-01-18  5:23           ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-14 16:35 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 13/01/16 11:37, Viresh Kumar wrote:
> On 12-01-16, 15:52, Juri Lelli wrote:
> > Other users (i.e., cpufreq_parse_governor and cpufreq_register_governor)
> > need to take the mutex externally. So, we need to unify this behaviour.
> 
> No they don't have to.
> 
> And that's why I have been saying that we better nail down the exact
> thing the mutex is supposed to protect.
> 
> There can be two cases here:
> - It protects the governor list, in that case we can move it to
>   find_governor().
> - It guarantees that the governor pointer stays valid: That's not true
>   as we are using the governor pointer outside of the lock.
> 
> And so I said, "No they don't have to" :)
> 

But, don't we have to guarantee consistency between multiple operations
on cpufreq_governor_list?

In cpufreq_register_governor() we have:

 mutex_lock(&cpufreq_governor_mutex);
 
 governor->initialized = 0;
 err = -EBUSY;
 if (!find_governor(governor->name)) {
 	err = 0;
 	list_add(&governor->governor_list, &cpufreq_governor_list);
 }
 
 mutex_unlock(&cpufreq_governor_mutex);

IIUC, find_governor and list_add have to be atomic. Couldn't someone
slip in right after find_governor and add the same governor to the list?

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 12/19] cpufreq: fix locking of policy->rwsem in cpufreq_init_policy
  2016-01-12 10:39   ` Viresh Kumar
@ 2016-01-14 17:58     ` Juri Lelli
  0 siblings, 0 replies; 110+ messages in thread
From: Juri Lelli @ 2016-01-14 17:58 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 12/01/16 16:09, Viresh Kumar wrote:
> On 11-01-16, 17:35, Juri Lelli wrote:
> > There are paths in cpufreq_init_policy where policy is used, but its rwsem
> > is not held.
> > 
> > Fix it.
> > 
> > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> > ---
> >  drivers/cpufreq/cpufreq.c | 11 ++++++++---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index e7fc5c9..2c7cc6c73 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -998,21 +998,24 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy, unsigned int cp
> >  {
> >  	int ret = 0;
> >  
> > +	down_write(&policy->rwsem);
> > +
> >  	/* Has this CPU been taken care of already? */
> > -	if (cpumask_test_cpu(cpu, policy->cpus))
> > +	if (cpumask_test_cpu(cpu, policy->cpus)) {
> > +		up_write(&policy->rwsem);
> 
> Perhaps create a label at the end to unlock the rwsem and jump to it?
> 

Yep, done.
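
The resulting shape is roughly the following. This is a userspace sketch,
not the actual patch: a pthread mutex stands in for policy->rwsem, a bool
array stands in for policy->cpus, and the toy_ names are made up.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define TOY_NR_CPUS 8

struct toy_policy {
	pthread_mutex_t lock;		/* stands in for policy->rwsem */
	bool cpus[TOY_NR_CPUS];		/* stands in for policy->cpus */
};

/* Adds cpu to the policy; every exit path goes through a single unlock. */
static int toy_add_policy_cpu(struct toy_policy *policy, unsigned int cpu)
{
	int ret = 0;

	assert(cpu < TOY_NR_CPUS);
	pthread_mutex_lock(&policy->lock);

	/* Has this CPU been taken care of already? */
	if (policy->cpus[cpu])
		goto unlock;

	policy->cpus[cpu] = true;

unlock:
	pthread_mutex_unlock(&policy->lock);
	return ret;
}
```

With a single exit label there is no early return that can forget to drop
the lock.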

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 13/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare
  2016-01-12 10:54   ` Viresh Kumar
@ 2016-01-15 12:37     ` Juri Lelli
  0 siblings, 0 replies; 110+ messages in thread
From: Juri Lelli @ 2016-01-15 12:37 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 12/01/16 16:24, Viresh Kumar wrote:
> On 11-01-16, 17:35, Juri Lelli wrote:
> > There are paths in cpufreq_offline_prepare where policy is used, but its
> > rwsem is not held.
> > 
> > Fix it.
> > 
> > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> > ---
> >  drivers/cpufreq/cpufreq.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> I know the locking in general in cpufreq core is poor. We recently
> fixed lots of issues in governors ..
> 
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 2c7cc6c73..91158b0 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -1332,13 +1332,13 @@ static void cpufreq_offline_prepare(unsigned int cpu)
> >  		return;
> >  	}
> >  
> > +	down_write(&policy->rwsem);
> >  	if (has_target()) {
> >  		int ret = __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> >  		if (ret)
> >  			pr_err("%s: Failed to stop governor\n", __func__);
> >  	}
> >  
> > -	down_write(&policy->rwsem);
> >  	cpumask_clear_cpu(cpu, policy->cpus);
> >  
> >  	if (policy_is_inactive(policy)) {
> > @@ -1356,12 +1356,16 @@ static void cpufreq_offline_prepare(unsigned int cpu)
> >  	/* Start governor again for active policy */
> >  	if (!policy_is_inactive(policy)) {
> 
> Why shouldn't this be under the lock?
> 
> >  		if (has_target()) {
> > -			int ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
> > +			int ret;
> > +
> > +			down_write(&policy->rwsem);
> > +			ret = __cpufreq_governor(policy, CPUFREQ_GOV_START);
> >  			if (!ret)
> >  				ret = __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
> >  
> >  			if (ret)
> >  				pr_err("%s: Failed to start governor\n", __func__);
> > +			up_write(&policy->rwsem);
> >  		}
> >  	} else if (cpufreq_driver->stop_cpu) {
> >  		cpufreq_driver->stop_cpu(policy);
> 
> And this ?
> 

Right. Releasing rwsem at the end seems to work.

Best,

- Juri

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor
  2016-01-12 11:06   ` Viresh Kumar
@ 2016-01-15 16:30     ` Juri Lelli
  2016-01-18  5:50       ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-15 16:30 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

Hi,

On 12/01/16 16:36, Viresh Kumar wrote:
> On 11-01-16, 17:35, Juri Lelli wrote:
> > Commit 6f1e4efd882e ("cpufreq: Fix timer/workqueue corruption by
> > protecting reading governor_enabled") made policy->governor_enabled
> > guarded by cpufreq_governor_mutex in __cpufreq_governor. Now that
> > holding of policy->rwsem is asserted in __cpufreq_governor,
> > cpufreq_governor_mutex is overkill.
> 
> I am sure that is going to break it. Try that x86, somehow I don't get
> it on my exynos boards.
> 

But governor_enabled seems to not be checked anymore outside cpufreq.c
(see also 01/19), as it was in the commit you are referring to. Now
users of this should be holding policy->rwsem, so that should suffice
for protecting governor_enabled, as governor_enabled is only changed
inside here.

I ran some tests on an x86 box I set up and didn't see anything related to
this. I'll wait to get the first 0-day report anyway.

> > -	mutex_lock(&cpufreq_governor_mutex);
> >  	if ((policy->governor_enabled && event == CPUFREQ_GOV_START)
> >  	    || (!policy->governor_enabled
> >  	    && (event == CPUFREQ_GOV_LIMITS || event == CPUFREQ_GOV_STOP))) {
> > -		mutex_unlock(&cpufreq_governor_mutex);
> >  		return -EBUSY;
> >  	}
> 
> Actually the above checks should also be removed as the governors are
> responsible for maintaining their state machines. But
> userspace/powersave/performance don't have that support yet and so
> these checks save them from going into undefined states.
> 
> Over that, above and below checks are incomplete..
> 

You mean we need an additional patch that extends the checks performed?

Thanks,

- Juri

> > @@ -2006,8 +2004,6 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
> >  	else if (event == CPUFREQ_GOV_START)
> >  		policy->governor_enabled = true;
> >  
> > -	mutex_unlock(&cpufreq_governor_mutex);
> > -
> >  	ret = policy->governor->governor(policy, event);
> >  
> >  	if (!ret) {
> > @@ -2017,12 +2013,10 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
> >  			policy->governor->initialized--;
> >  	} else {
> >  		/* Restore original values */
> > -		mutex_lock(&cpufreq_governor_mutex);
> >  		if (event == CPUFREQ_GOV_STOP)
> >  			policy->governor_enabled = true;
> >  		else if (event == CPUFREQ_GOV_START)
> >  			policy->governor_enabled = false;
> > -		mutex_unlock(&cpufreq_governor_mutex);
> >  	}
> >  
> >  	if (((event == CPUFREQ_GOV_POLICY_INIT) && ret) ||
> > -- 
> > 2.2.2
> 
> -- 
> viresh
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-14 13:52             ` Juri Lelli
@ 2016-01-18  5:09               ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-18  5:09 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Michael Turquette, linux-kernel, linux-pm, peterz, rjw,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 14-01-16, 13:52, Juri Lelli wrote:
> I was under the impression that exporting __cpufreq_driver_target()
> outside cpufreq.c works because users implement their own locking.
> 
> That's why I put the following comment in this patch.
> 
>  /*
>   * Callers must ensure proper mutual exclusion on policy (for transition_
>   * ongoing/transition_task handling). While holding policy->rwsem is
>   * sufficient, other schemes might work as well (e.g., cpufreq_governor.c
>   * holds timer_mutex while entering the path that generates transitions).
>   */
> 
> From what I can see, ondemand and conservative (via governor) seem to use
> timer_mutex, while userspace uses userspace_mutex. Do they serve different
> purposes? How do we currently serialize operations on policy
> when using __cpufreq_driver_target() directly otherwise?

The patch I referred to earlier in the thread detailed a few of the
races we were worried about. The lock you just removed is responsible
for taking care of the races you are worried about now :)

-- 
viresh

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 08/19] cpufreq: fix warning for cpufreq_init_policy unlocked access to cpufreq_governor_list
  2016-01-14 16:35         ` Juri Lelli
@ 2016-01-18  5:23           ` Viresh Kumar
  2016-01-18 15:19             ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-18  5:23 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 14-01-16, 16:35, Juri Lelli wrote:
> But, don't we have to guarantee consistency between multiple operations
> on cpufreq_governor_list?
> 
> In cpufreq_register_governor() we have:
> 
>  mutex_lock(&cpufreq_governor_mutex);
>  
>  governor->initialized = 0;
>  err = -EBUSY;
>  if (!find_governor(governor->name)) {
>  	err = 0;
>  	list_add(&governor->governor_list, &cpufreq_governor_list);
>  }
>  
>  mutex_unlock(&cpufreq_governor_mutex);
> 
> IIUC, find_governor and list_add have to be atomic. Couldn't someone
> slip in right after find_governor and add the same governor to the list?

Yeah, I was wrong that cpufreq_register_governor() doesn't need a
lock. We already have that in place ..

But most of the other places are really useless and show that we
haven't implemented it well.

I would suggest that we move the lock within find_governor() and
create another find_governor_unlocked() or __find_governor() that will
be used only from cpufreq_register_governor(), with an outer lock.

Looks reasonable ?
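
Something along these lines, as a userspace sketch only: a pthread mutex
and a hand-rolled list replace the kernel primitives, and the toy_ names
are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

struct toy_governor {
	const char *name;
	struct toy_governor *next;
};

static struct toy_governor *governor_list;
static pthread_mutex_t governor_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Caller must already hold governor_mutex. */
static struct toy_governor *find_governor_unlocked(const char *name)
{
	struct toy_governor *g;

	for (g = governor_list; g; g = g->next)
		if (!strcmp(g->name, name))
			return g;
	return NULL;
}

/*
 * Takes the lock itself, so ordinary callers need no external locking.
 * The lock protects the list walk only: the returned pointer is used
 * after the unlock, so it does not keep the governor registered.
 */
static struct toy_governor *find_governor(const char *name)
{
	struct toy_governor *g;

	pthread_mutex_lock(&governor_mutex);
	g = find_governor_unlocked(name);
	pthread_mutex_unlock(&governor_mutex);
	return g;
}

/*
 * Lookup and insertion share one critical section, hence the unlocked
 * helper: otherwise two threads registering the same name could both
 * see "not found" and both add it.
 */
static int toy_register_governor(struct toy_governor *gov)
{
	int err = -EBUSY;

	assert(gov && gov->name);
	pthread_mutex_lock(&governor_mutex);
	if (!find_governor_unlocked(gov->name)) {
		gov->next = governor_list;
		governor_list = gov;
		err = 0;
	}
	pthread_mutex_unlock(&governor_mutex);
	return err;
}
```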

-- 
viresh

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor
  2016-01-15 16:30     ` Juri Lelli
@ 2016-01-18  5:50       ` Viresh Kumar
  2016-01-19 16:49         ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-18  5:50 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 15-01-16, 16:30, Juri Lelli wrote:
> But governor_enabled seems to not be checked anymore outside cpufreq.c
> (see also 01/19), as it was in the commit you are referring to.

Okay, I must have told you this earlier but anyway ..

governor_enabled was introduced long back to keep governor state
changes serialized. Because of the complex cases we had in hand
(governor-per-policy or system wide governors, etc.), it failed to do
so. Though simple races were avoided with it, complex ones still came
back to haunt us.

We fixed that by managing state changes within ondemand and
conservative governors instead and that worked very well.

Then I wrote a patch to kill the stupid code around the governor_enabled
thing, but I ran into a few races. Those races happened because of the
userspace governor, which was getting into invalid states in some
extreme cases (these were caught using the test-suite I wrote, and you
perhaps used it).

And I never came back to fix those corner cases ..

You can try that on ARM or x86 by running the following command from my
test-suite (I remember that you are using it, right?):

./runme.sh -f sp1 or sp2 or sp3

Only one of sp1, sp2 or sp3 is required..

> Now
> users of this should be holding policy->rwsem, so that should suffice
> for protecting governor_enabled, as governor_enabled is only changed
> inside here.

If we can get rid of the rwsem dropping problem, then yeah this can be
killed for sure.

> I ran some tests on an x86 box I set up and didn't see anything related to
> this. I'll wait to get the first 0-day report anyway.

Okay, so run the above test and make sure you have the following enabled
in your configuration:

CONFIG_LOCKDEP_SUPPORT=y
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_DEBUG_ATOMIC_SLEEP=y

> > > -	mutex_lock(&cpufreq_governor_mutex);
> > >  	if ((policy->governor_enabled && event == CPUFREQ_GOV_START)
> > >  	    || (!policy->governor_enabled
> > >  	    && (event == CPUFREQ_GOV_LIMITS || event == CPUFREQ_GOV_STOP))) {
> > > -		mutex_unlock(&cpufreq_governor_mutex);
> > >  		return -EBUSY;
> > >  	}
> > 
> > Actually the above checks should also be removed as the governors are
> > responsible for maintaining their state machines. But
> > userspace/powersave/performance don't have that support yet and so
> > these checks save them from going into undefined states.
> > 
> > Over that, above and below checks are incomplete..
> > 
> 
> You mean we need an additional patch that extends the checks performed?

Yeah, we need to add some state-management code in
userspace/powersave/performance governors as well.
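
As a rough idea of what such per-governor state management might look
like, here is a self-contained sketch. The enum and struct names are
hypothetical, and the rules simply mirror the governor_enabled checks
quoted above rather than any existing governor code.

```c
#include <assert.h>
#include <errno.h>

enum toy_event { TOY_GOV_START, TOY_GOV_STOP, TOY_GOV_LIMITS };
enum toy_state { TOY_STOPPED, TOY_STARTED };

struct toy_governor_data {
	enum toy_state state;
};

/*
 * Rejects events that make no sense in the current state: START while
 * already started, and STOP or LIMITS while stopped, return -EBUSY.
 * The caller is assumed to serialize calls (e.g. under a policy lock).
 */
static int toy_governor_event(struct toy_governor_data *gd,
			      enum toy_event event)
{
	assert(gd);

	switch (event) {
	case TOY_GOV_START:
		if (gd->state == TOY_STARTED)
			return -EBUSY;
		gd->state = TOY_STARTED;
		return 0;
	case TOY_GOV_STOP:
		if (gd->state == TOY_STOPPED)
			return -EBUSY;
		gd->state = TOY_STOPPED;
		return 0;
	case TOY_GOV_LIMITS:
		return gd->state == TOY_STARTED ? 0 : -EBUSY;
	}
	return -EINVAL;
}
```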

-- 
viresh

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 08/19] cpufreq: fix warning for cpufreq_init_policy unlocked access to cpufreq_governor_list
  2016-01-18  5:23           ` Viresh Kumar
@ 2016-01-18 15:19             ` Juri Lelli
  0 siblings, 0 replies; 110+ messages in thread
From: Juri Lelli @ 2016-01-18 15:19 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 18/01/16 10:53, Viresh Kumar wrote:
> On 14-01-16, 16:35, Juri Lelli wrote:
> > But, don't we have to guarantee consistency between multiple operations
> > on cpufreq_governor_list?
> > 
> > In cpufreq_register_governor() we have:
> > 
> >  mutex_lock(&cpufreq_governor_mutex);
> >  
> >  governor->initialized = 0;
> >  err = -EBUSY;
> >  if (!find_governor(governor->name)) {
> >  	err = 0;
> >  	list_add(&governor->governor_list, &cpufreq_governor_list);
> >  }
> >  
> >  mutex_unlock(&cpufreq_governor_mutex);
> > 
> > IIUC, find_governor and list_add have to be atomic. Couldn't someone
> > slip in right after find_governor and add the same governor to the list?
> 
> Yeah, I was wrong that cpufreq_register_governor() doesn't need a
> lock. We already have that in place ..
> 
> But most of the other places are really useless and show that we
> haven't implemented it well.
> 
> I would suggest that we move the lock within find_governor() and
> create another find_governor_unlocked() or __find_governor() that will
> be used only from cpufreq_register_governor(), with an outer lock.
> 
> Looks reasonable ?
> 

Yes it does. I'll look into doing that.

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
       [not found]         ` <20160113182131.1168.45753@quark.deferred.io>
  2016-01-14  9:44           ` Juri Lelli
  2016-01-14 10:32           ` Viresh Kumar
@ 2016-01-19 14:00           ` Peter Zijlstra
  2016-01-19 14:42             ` Juri Lelli
  2 siblings, 1 reply; 110+ messages in thread
From: Peter Zijlstra @ 2016-01-19 14:00 UTC (permalink / raw)
  To: Michael Turquette
  Cc: Viresh Kumar, Juri Lelli, linux-kernel, linux-pm, rjw,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Wed, Jan 13, 2016 at 10:21:31AM -0800, Michael Turquette wrote:
> RCU is absolutely not a magic bullet or elixir that lets us kick off
> DVFS transitions from the schedule() context. The frequency transitions
> are write-side operations, as we invariably touch struct cpufreq_policy.
> This means that the read-side stuff can live in the schedule() context,
> but write-side needs to be kicked out to a thread.

Why? If the state is per-cpu and acquired by RCU, updates should be no
problem at all.

If you need inter-cpu state, then things get to be a little tricky
though, but you can actually nest a raw_spinlock_t in there if you
absolutely have to.

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-19 14:00           ` Peter Zijlstra
@ 2016-01-19 14:42             ` Juri Lelli
  2016-01-19 15:30               ` Peter Zijlstra
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-19 14:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Turquette, Viresh Kumar, linux-kernel, linux-pm, rjw,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 19/01/16 15:00, Peter Zijlstra wrote:
> On Wed, Jan 13, 2016 at 10:21:31AM -0800, Michael Turquette wrote:
> > RCU is absolutely not a magic bullet or elixir that lets us kick off
> > DVFS transitions from the schedule() context. The frequency transitions
> > are write-side operations, as we invariably touch struct cpufreq_policy.
> > This means that the read-side stuff can live in the schedule() context,
> > but write-side needs to be kicked out to a thread.
> 
> Why? If the state is per-cpu and acquired by RCU, updates should be no
> problem at all.
> 
> If you need inter-cpu state, then things get to be a little tricky
> though, but you can actually nest a raw_spinlock_t in there if you
> absolutely have to.
> 

We have at least two problems. First one is that state is per frequency
domain (struct cpufreq_policy) and this usually spans more than one cpu.
Second one is that we might need to sleep while servicing the frequency
transition, both because platform needs to sleep and because some paths
of cpufreq core use sleeping locks (yes, that might be changed as well I
guess). A solution based on spinlocks only might not be usable on
platforms that need to sleep, either.

Another thing that I was thinking of actually is that since struct
cpufreq_policy is updated a lot (more or less at every frequency
transition), is it actually suitable for RCU?

Best,

- Juri

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-19 14:42             ` Juri Lelli
@ 2016-01-19 15:30               ` Peter Zijlstra
  2016-01-19 16:01                 ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Peter Zijlstra @ 2016-01-19 15:30 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Michael Turquette, Viresh Kumar, linux-kernel, linux-pm, rjw,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Tue, Jan 19, 2016 at 02:42:33PM +0000, Juri Lelli wrote:
> On 19/01/16 15:00, Peter Zijlstra wrote:
> > On Wed, Jan 13, 2016 at 10:21:31AM -0800, Michael Turquette wrote:
> > > RCU is absolutely not a magic bullet or elixir that lets us kick off
> > > DVFS transitions from the schedule() context. The frequency transitions
> > > are write-side operations, as we invariably touch struct cpufreq_policy.
> > > This means that the read-side stuff can live in the schedule() context,
> > > but write-side needs to be kicked out to a thread.
> > 
> > Why? If the state is per-cpu and acquired by RCU, updates should be no
> > problem at all.
> > 
> > If you need inter-cpu state, then things get to be a little tricky
> > though, but you can actually nest a raw_spinlock_t in there if you
> > absolutely have to.
> > 
> 
> We have at least two problems. First one is that state is per frequency
> domain (struct cpufreq_policy) and this usually spans more than one cpu.
> Second one is that we might need to sleep while servicing the frequency
> transition, both because platform needs to sleep and because some paths
> of cpufreq core use sleeping locks (yes, that might be changed as well I
> guess). A solution based on spinlocks only might not be usable on
> platforms that need to sleep, either.

Sure, if you need to actually sleep to poke the hardware you've lost and
you do indeed need the kthread thingy.

> Another thing that I was thinking of actually is that since struct
> cpufreq_policy is updated a lot (more or less at every frequency
> transition), is it actually suitable for RCU?

That entirely depends on how 'hard' it is to 'replace/change' the
cpufreq policy.

Typically I envision that to be very hard and require mutexes and the
like, in which case RCU can provide a cheap lookup and existence.

So on 'sane' hardware with per logical cpu hints you can get away
without any locks.

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-19 15:30               ` Peter Zijlstra
@ 2016-01-19 16:01                 ` Juri Lelli
  2016-01-19 19:17                   ` Peter Zijlstra
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-19 16:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Turquette, Viresh Kumar, linux-kernel, linux-pm, rjw,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 19/01/16 16:30, Peter Zijlstra wrote:
> On Tue, Jan 19, 2016 at 02:42:33PM +0000, Juri Lelli wrote:
> > On 19/01/16 15:00, Peter Zijlstra wrote:
> > > On Wed, Jan 13, 2016 at 10:21:31AM -0800, Michael Turquette wrote:
> > > > RCU is absolutely not a magic bullet or elixir that lets us kick off
> > > > DVFS transitions from the schedule() context. The frequency transitions
> > > > are write-side operations, as we invariably touch struct cpufreq_policy.
> > > > This means that the read-side stuff can live in the schedule() context,
> > > > but write-side needs to be kicked out to a thread.
> > > 
> > > Why? If the state is per-cpu and acquired by RCU, updates should be no
> > > problem at all.
> > > 
> > > If you need inter-cpu state, then things get to be a little tricky
> > > though, but you can actually nest a raw_spinlock_t in there if you
> > > absolutely have to.
> > > 
> > 
> > We have at least two problems. First one is that state is per frequency
> > domain (struct cpufreq_policy) and this usually spans more than one cpu.
> > Second one is that we might need to sleep while servicing the frequency
> > transition, both because platform needs to sleep and because some paths
> > of cpufreq core use sleeping locks (yes, that might be changed as well I
> > guess). A solution based on spinlocks only might not be usable on
> > platforms that need to sleep, either.
> 
> Sure, if you need to actually sleep to poke the hardware you've lost and
> you do indeed need the kthread thingy.
> 

Yeah, also cpufreq relies on blocking notifiers (to name one thing). So,
it seems to me that quite a few things need to be changed to make it fully
non-sleeping.

> > Another thing that I was thinking of actually is that since struct
> > cpufreq_policy is updated a lot (more or less at every frequency
> > transition), is it actually suitable for RCU?
> 
> That entirely depends on how 'hard' it is to 'replace/change' the
> cpufreq policy.
> 
> Typically I envision that to be very hard and require mutexes and the
> like, in which case RCU can provide a cheap lookup and existence.
> 

Right, read path is fast, but write path still requires some sort of
locking (malloc, copy and update). So, I'm wondering if this still pays
off for a structure that gets written a lot.

> So on 'sane' hardware with per logical cpu hints you can get away
> without any locks.
> 

But maybe you are saying that there are ways we can make that work :).

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor
  2016-01-18  5:50       ` Viresh Kumar
@ 2016-01-19 16:49         ` Juri Lelli
  2016-01-20  7:29           ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-19 16:49 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 18/01/16 11:20, Viresh Kumar wrote:
> On 15-01-16, 16:30, Juri Lelli wrote:
> > But governor_enabled seems to not be checked anymore outside cpufreq.c
> > (see also 01/19), as it was in the commit you are referring to.
> 
> Okay, I must have told you this earlier but anyway ..
> 
> governor_enabled was introduced long back to keep governor state
> changes serialized. Because of the complex cases we had in hand
> (governor-per-policy or system wide governors, etc.), it failed to do
> so. Though simple races were avoided with it, complex ones still came
> back to haunt us.
> 
> We fixed that by managing state changes within ondemand and
> conservative governors instead and that worked very well.
> 
> Then I wrote a patch to kill the stupid code around the governor_enabled
> thing, but I ran into a few races. Those races happened because of the
> userspace governor, which was getting into invalid states in some
> extreme cases (these were caught using the test-suite I wrote, and you
> perhaps used it).
> 
> And I never came back to fix those corner cases ..
> 

OK, thanks for the explanation.

> You can try that on ARM or x86 by running the following command from my
> test-suite (I remember that you are using it, right?):
> 

Yep, I'm constantly running those on my boxes.

> ./runme.sh -f sp1 or sp2 or sp3
> 
> Only one of sp1, sp2 or sp3 is required..
> 

I'm actually hitting this running sp2, on linux-pm/linux-next :/.

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 4.4.0+ #445 Not tainted
 -------------------------------------------------------
 trace.sh/1723 is trying to acquire lock:
  (s_active#48){++++.+}, at: [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94

 but task is already holding lock:
  (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4

 which lock already depends on the new lock.


 the existing dependency chain (in reverse order) is:

-> #2 (od_dbs_cdata.mutex){+.+.+.}:
        [<c075b040>] mutex_lock_nested+0x7c/0x434
        [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
        [<c0017c10>] return_to_handler+0x0/0x18

-> #1 (&policy->rwsem){+++++.}:
        [<c075ca8c>] down_read+0x58/0x94
        [<c057c244>] show+0x30/0x60
        [<c01f934c>] sysfs_kf_seq_show+0x90/0xfc
        [<c01f7ad8>] kernfs_seq_show+0x34/0x38
        [<c01a22ec>] seq_read+0x1e4/0x4e4
        [<c01f8694>] kernfs_fop_read+0x120/0x1a0
        [<c01794b4>] __vfs_read+0x3c/0xe0
        [<c017a378>] vfs_read+0x98/0x104
        [<c017a434>] SyS_read+0x50/0x90
        [<c000fd40>] ret_fast_syscall+0x0/0x1c

-> #0 (s_active#48){++++.+}:
        [<c008238c>] lock_acquire+0xd4/0x20c
        [<c01f6ae4>] __kernfs_remove+0x288/0x328
        [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
        [<c01fa024>] remove_files+0x44/0x88
        [<c01fa5a4>] sysfs_remove_group+0x50/0xa4
        [<c058285c>] cpufreq_governor_dbs+0x3f0/0x5d4
        [<c0017c10>] return_to_handler+0x0/0x18

 other info that might help us debug this:

 Chain exists of:
  s_active#48 --> &policy->rwsem --> od_dbs_cdata.mutex

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(od_dbs_cdata.mutex);
                                lock(&policy->rwsem);
                                lock(od_dbs_cdata.mutex);
   lock(s_active#48);

  *** DEADLOCK ***

 5 locks held by trace.sh/1723:
  #0:  (sb_writers#6){.+.+.+}, at: [<c017beb8>] __sb_start_write+0xb4/0xc0
  #1:  (&of->mutex){+.+.+.}, at: [<c01f8418>] kernfs_fop_write+0x6c/0x1c8
  #2:  (s_active#35){.+.+.+}, at: [<c01f8420>] kernfs_fop_write+0x74/0x1c8
  #3:  (cpu_hotplug.lock){++++++}, at: [<c0029e6c>] get_online_cpus+0x48/0xb8
  #4:  (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4

 stack backtrace:
 CPU: 2 PID: 1723 Comm: trace.sh Not tainted 4.4.0+ #445
 Hardware name: ARM-Versatile Express
 [<c001883c>] (unwind_backtrace) from [<c0013f50>] (show_stack+0x20/0x24)
 [<c0013f50>] (show_stack) from [<c044ad90>] (dump_stack+0x80/0xb4)
 [<c044ad90>] (dump_stack) from [<c0128edc>] (print_circular_bug+0x29c/0x2f0)
 [<c0128edc>] (print_circular_bug) from [<c0081708>] (__lock_acquire+0x163c/0x1d74)
 [<c0081708>] (__lock_acquire) from [<c008238c>] (lock_acquire+0xd4/0x20c)
 [<c008238c>] (lock_acquire) from [<c01f6ae4>] (__kernfs_remove+0x288/0x328)
 [<c01f6ae4>] (__kernfs_remove) from [<c01f78c8>] (kernfs_remove_by_name_ns+0x4c/0x94)
 [<c01f78c8>] (kernfs_remove_by_name_ns) from [<c01fa024>] (remove_files+0x44/0x88)
 [<c01fa024>] (remove_files) from [<c01fa5a4>] (sysfs_remove_group+0x50/0xa4)
 [<c01fa5a4>] (sysfs_remove_group) from [<c058285c>] (cpufreq_governor_dbs+0x3f0/0x5d4)
 [<c058285c>] (cpufreq_governor_dbs) from [<c0017c10>] (return_to_handler+0x0/0x18)

Now, I couldn't yet make sense of this, but it seems to be
triggered by setting ondemand, printing its attributes and then
switching to conservative (that's what sp2 does, right?). Also, s_active
seems to come into play only when lockdep is enabled. Are you seeing
this as well?
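
For reference, the cycle lockdep complains about can be reproduced on
paper with a toy dependency graph. The sketch below is user-space C, not
kernel code and not how lockdep is actually implemented; the enum names
and record_acquire() are made up for illustration. It records "B taken
while holding A" edges and refuses an edge that would close a loop.

```c
#include <stdbool.h>

/* Lock classes named in the report; the numbering is arbitrary. */
enum lock_class { S_ACTIVE, POLICY_RWSEM, OD_DBS_MUTEX, NR_CLASSES };

/* dep[a][b] is true when b has been acquired while a was held. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (dep[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record the edge "held -> wanted". Returns false (and records nothing)
 * when the new edge would close a cycle, i.e. when 'wanted' already
 * leads back to 'held' through existing edges.
 */
bool record_acquire(enum lock_class held, enum lock_class wanted)
{
	bool seen[NR_CLASSES] = { false };

	if (reachable(wanted, held, seen))
		return false;
	dep[held][wanted] = true;
	return true;
}
```

Feeding it s_active -> policy->rwsem (the sysfs read path), then
policy->rwsem -> od_dbs_cdata.mutex, and finally od_dbs_cdata.mutex ->
s_active (the sysfs_remove_group() path) makes the third call fail, which
is the chain reported above.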

> > Now that
> > users of this should be holding policy->rwsem, so that should suffice
> > for protecting governor_enabled, as governor_enabled is only changed
> > inside here.
> 
> If we can get rid of the rwsem dropping problem, then yeah this can be
> killed for sure.
> 

OK.

> > I run some test on a x86 box I setup and didn't see anything related to
> > this. I'll wait to get the first 0-day report anyway.
> 

0-day is set up. I haven't received any major complaints from it yet :).

> Okay, so run the above test and make sure you have following enabled
> in your configuration:
> 
> CONFIG_LOCKDEP_SUPPORT=y
> CONFIG_DEBUG_RT_MUTEXES=y
> CONFIG_DEBUG_PI_LIST=y
> CONFIG_DEBUG_SPINLOCK=y
> CONFIG_DEBUG_MUTEXES=y
> CONFIG_DEBUG_LOCK_ALLOC=y
> CONFIG_PROVE_LOCKING=y
> CONFIG_LOCKDEP=y
> CONFIG_DEBUG_ATOMIC_SLEEP=y
> 

Yep, that's what I normally use for developing.

Thanks,

- Juri

> > > > -	mutex_lock(&cpufreq_governor_mutex);
> > > >  	if ((policy->governor_enabled && event == CPUFREQ_GOV_START)
> > > >  	    || (!policy->governor_enabled
> > > >  	    && (event == CPUFREQ_GOV_LIMITS || event == CPUFREQ_GOV_STOP))) {
> > > > -		mutex_unlock(&cpufreq_governor_mutex);
> > > >  		return -EBUSY;
> > > >  	}
> > > 
> > > Actually the above checks should also be removed as the governors are
> > > responsible for maintaining their state machines. But
> > > userspace/powersave/performance don't have that support yet and so
> > > these checks save them from going into undefined states.
> > > 
> > > Over that, above and below checks are incomplete..
> > > 
> > 
> > You mean we need an additional patch that extends the checks performed?
> 
> Yeah, we need to add some state-management code in
> userspace/powersave/performance governors as well.
> 
> -- 
> viresh
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-19 16:01                 ` Juri Lelli
@ 2016-01-19 19:17                   ` Peter Zijlstra
  2016-01-19 19:21                     ` Peter Zijlstra
  0 siblings, 1 reply; 110+ messages in thread
From: Peter Zijlstra @ 2016-01-19 19:17 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Michael Turquette, Viresh Kumar, linux-kernel, linux-pm, rjw,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Tue, Jan 19, 2016 at 04:01:55PM +0000, Juri Lelli wrote:
> Right, read path is fast, but write path still requires some sort of
> locking (malloc, copy and update). So, I'm wondering if this still pays
> off for a structure that gets written a lot.

No, not at all.

struct cpufreq_driver *driver;

void sched_util_change(unsigned int util)
{
	struct my_per_cpu_data *foo;

	rcu_read_lock();
	foo = __this_cpu_ptr(rcu_dereference(driver)->data);
	if (foo) {
		if (abs(util - foo->last_util) > 10) {
			foo->last_util = util;
			foo->set_util(util);
		}
	}
	rcu_read_unlock();
}


struct cpufreq_driver *cpufreq_flip_driver(struct cpufreq_driver *new_driver)
{
	struct cpufreq_driver *old_driver;

	mutex_lock(&cpufreq_driver_lock);
	old_driver = driver;
	rcu_assign_pointer(driver, new_driver);
	if (old_driver)
		synchronize_rcu();
	mutex_unlock(&cpufreq_driver_lock);

	return old_driver;
}

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-19 19:17                   ` Peter Zijlstra
@ 2016-01-19 19:21                     ` Peter Zijlstra
  2016-01-19 21:52                       ` Rafael J. Wysocki
  2016-01-20 12:59                       ` Juri Lelli
  0 siblings, 2 replies; 110+ messages in thread
From: Peter Zijlstra @ 2016-01-19 19:21 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Michael Turquette, Viresh Kumar, linux-kernel, linux-pm, rjw,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Tue, Jan 19, 2016 at 08:17:34PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 19, 2016 at 04:01:55PM +0000, Juri Lelli wrote:
> > Right, read path is fast, but write path still requires some sort of
> > locking (malloc, copy and update). So, I'm wondering if this still pays
> > off for a structure that gets written a lot.
> 
> No, not at all.
> 
> struct cpufreq_driver *driver;
> 
> void sched_util_change(unsigned int util)
> {
> 	struct my_per_cpu_data *foo;
> 
> 	rcu_read_lock();

That should obviously be:

	d = rcu_dereference(driver);
	if (d) {
		foo = __this_cpu_ptr(d->data);

> 		if (abs(util - foo->last_util) > 10) {
> 			foo->last_util = util;
> 			foo->set_util(util);
> 		}
> 	}
> 	rcu_read_unlock();
> }
> 
> 
> struct cpufreq_driver *cpufreq_flip_driver(struct cpufreq_driver *new_driver)
> {
> 	struct cpufreq_driver *old_driver;
> 
> 	mutex_lock(&cpufreq_driver_lock);
> 	old_driver = driver;
> 	rcu_assign_pointer(driver, new_driver);
> 	if (old_driver)
> 		synchronize_rcu();
> 	mutex_unlock(&cpufreq_driver_lock);
> 
> 	return old_driver;
> }
> 
> 
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-19 19:21                     ` Peter Zijlstra
@ 2016-01-19 21:52                       ` Rafael J. Wysocki
  2016-01-20 17:04                         ` Peter Zijlstra
  2016-01-20 12:59                       ` Juri Lelli
  1 sibling, 1 reply; 110+ messages in thread
From: Rafael J. Wysocki @ 2016-01-19 21:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Juri Lelli, Michael Turquette, Viresh Kumar, linux-kernel,
	linux-pm, steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Tuesday, January 19, 2016 08:21:11 PM Peter Zijlstra wrote:
> On Tue, Jan 19, 2016 at 08:17:34PM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 19, 2016 at 04:01:55PM +0000, Juri Lelli wrote:
> > > Right, read path is fast, but write path still requires some sort of
> > > locking (malloc, copy and update). So, I'm wondering if this still pays
> > > off for a structure that gets written a lot.
> > 
> > No, not at all.
> > 

This is very similar to what I was thinking about, plus-minus a couple of
things.

> > struct cpufreq_driver *driver;
> > 
> > void sched_util_change(unsigned int util)
> > {
> > 	struct my_per_cpu_data *foo;
> > 
> > 	rcu_read_lock();
> 
> That should obviously be:
> 
> 	d = rcu_dereference(driver);
> 	if (d) {
> 		foo = __this_cpu_ptr(d->data);

If we do this, it would be convenient to define ->set_util() to take
foo as an arg too, in addition to util.

And is there any particular reason why d->data has to be per-cpu?

> 
> > 		if (abs(util - foo->last_util) > 10) {

Even if the utilization doesn't change, it still may be too high or too low,
so we may want to call foo->set_util() in that case too, at least once in a
while.
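
Something along these lines, perhaps. This is a user-space sketch only:
the struct, the helper names and the 10 ms refresh period are made up for
illustration, and the threshold of 10 is just the number from the code
quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define UTIL_DELTA_THRESHOLD	10U		/* number used in the quoted sketch */
#define UTIL_REFRESH_NS		10000000ULL	/* hypothetical: 10 ms */

struct util_state {
	unsigned int last_util;		/* utilization last passed on */
	uint64_t last_update_ns;	/* when that happened */
};

/* Absolute difference of two unsigned values without wrap-around. */
static unsigned int util_delta(unsigned int a, unsigned int b)
{
	return a > b ? a - b : b - a;
}

/*
 * Returns true when the driver should be told about 'util': either it
 * moved by more than the threshold, or the last update is too old.
 * Updates 'st' only when it returns true.
 */
bool util_needs_update(struct util_state *st, unsigned int util,
		       uint64_t now_ns)
{
	bool changed = util_delta(util, st->last_util) > UTIL_DELTA_THRESHOLD;
	bool stale = now_ns - st->last_update_ns >= UTIL_REFRESH_NS;

	if (!changed && !stale)
		return false;

	st->last_util = util;
	st->last_update_ns = now_ns;
	return true;
}
```

The explicit comparison in util_delta() also avoids relying on abs() of
an unsigned difference.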

> > 			foo->last_util = util;
> > 			foo->set_util(util);
> > 		}
> > 	}
> > 	rcu_read_unlock();
> > }
> > 
> > 
> > struct cpufreq_driver *cpufreq_flip_driver(struct cpufreq_driver *new_driver)
> > {
> > 	struct cpufreq_driver *old_driver;
> > 
> > 	mutex_lock(&cpufreq_driver_lock);
> > 	old_driver = driver;
> > 	rcu_assign_pointer(driver, new_driver);
> > 	if (old_driver)
> > 		synchronize_rcu();
> > 	mutex_unlock(&cpufreq_driver_lock);
> > 
> > 	return old_driver;
> > }

We never need to do this, because we never replace one driver with another in
one go.  We need to go from a valid driver pointer to NULL and the other way
around only.

This means there may be other pointers around that may be accessed safely
from foo->set_util() above if there's a rule that they must be set before
the driver pointer and the data structures they point to must stay around
until the synchronize_rcu() returns.
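
The ordering rule can be illustrated with C11 atomics standing in for
rcu_assign_pointer() and rcu_dereference(). This is a user-space sketch
with hypothetical struct and function names; it does not model the grace
period, so the caller of drv_unpublish() would still have to wait for all
readers (synchronize_rcu() in the kernel) before freeing anything.

```c
#include <stdatomic.h>
#include <stddef.h>

struct drv_data {		/* hypothetical payload reached via the driver */
	unsigned int max_freq;
};

struct drv {			/* hypothetical stand-in for the driver object */
	struct drv_data *data;
};

static _Atomic(struct drv *) cur_drv;	/* NULL means "no driver" */

/* Writer: set the dependent pointer first, then publish with release. */
void drv_publish(struct drv *d, struct drv_data *data)
{
	d->data = data;
	atomic_store_explicit(&cur_drv, d, memory_order_release);
}

/*
 * Writer: go back to NULL and hand the old pointer to the caller, who
 * must not free it (or d->data) until all readers are done.
 */
struct drv *drv_unpublish(void)
{
	return atomic_exchange_explicit(&cur_drv, NULL, memory_order_acq_rel);
}

/* Reader: anything stored before the publish is visible after the load. */
unsigned int drv_read_max_freq(void)
{
	struct drv *d = atomic_load_explicit(&cur_drv, memory_order_acquire);

	return d ? d->data->max_freq : 0;
}
```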

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor
  2016-01-19 16:49         ` Juri Lelli
@ 2016-01-20  7:29           ` Viresh Kumar
  2016-01-20 10:17             ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-20  7:29 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 19-01-16, 16:49, Juri Lelli wrote:
> I'm actually hitting this running sp2, on linux-pm/linux-next :/.

That's really bad .. Are you hitting this on Juno or x86 ?

And I am sure you would have hit that with your changes as well, but
now it's on the currently merged patches :(

>  ======================================================
>  [ INFO: possible circular locking dependency detected ]
>  4.4.0+ #445 Not tainted
>  -------------------------------------------------------
>  trace.sh/1723 is trying to acquire lock:
>   (s_active#48){++++.+}, at: [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
> 
>  but task is already holding lock:
>   (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
> 
>  which lock already depends on the new lock.
> 
> 
>  the existing dependency chain (in reverse order) is:
> 
> -> #2 (od_dbs_cdata.mutex){+.+.+.}:
>         [<c075b040>] mutex_lock_nested+0x7c/0x434
>         [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
>         [<c0017c10>] return_to_handler+0x0/0x18
> 
> -> #1 (&policy->rwsem){+++++.}:
>         [<c075ca8c>] down_read+0x58/0x94
>         [<c057c244>] show+0x30/0x60
>         [<c01f934c>] sysfs_kf_seq_show+0x90/0xfc
>         [<c01f7ad8>] kernfs_seq_show+0x34/0x38
>         [<c01a22ec>] seq_read+0x1e4/0x4e4
>         [<c01f8694>] kernfs_fop_read+0x120/0x1a0
>         [<c01794b4>] __vfs_read+0x3c/0xe0
>         [<c017a378>] vfs_read+0x98/0x104
>         [<c017a434>] SyS_read+0x50/0x90
>         [<c000fd40>] ret_fast_syscall+0x0/0x1c
> 
> -> #0 (s_active#48){++++.+}:
>         [<c008238c>] lock_acquire+0xd4/0x20c
>         [<c01f6ae4>] __kernfs_remove+0x288/0x328
>         [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
>         [<c01fa024>] remove_files+0x44/0x88
>         [<c01fa5a4>] sysfs_remove_group+0x50/0xa4
>         [<c058285c>] cpufreq_governor_dbs+0x3f0/0x5d4
>         [<c0017c10>] return_to_handler+0x0/0x18
> 
>  other info that might help us debug this:
> 
>  Chain exists of:
>   s_active#48 --> &policy->rwsem --> od_dbs_cdata.mutex
> 
>   Possible unsafe locking scenario:
> 
>         CPU0                    CPU1
>         ----                    ----
>    lock(od_dbs_cdata.mutex);
>                                 lock(&policy->rwsem);
>                                 lock(od_dbs_cdata.mutex);
>    lock(s_active#48);
> 
>   *** DEADLOCK ***
> 
>  5 locks held by trace.sh/1723:
>   #0:  (sb_writers#6){.+.+.+}, at: [<c017beb8>] __sb_start_write+0xb4/0xc0
>   #1:  (&of->mutex){+.+.+.}, at: [<c01f8418>] kernfs_fop_write+0x6c/0x1c8
>   #2:  (s_active#35){.+.+.+}, at: [<c01f8420>] kernfs_fop_write+0x74/0x1c8
>   #3:  (cpu_hotplug.lock){++++++}, at: [<c0029e6c>] get_online_cpus+0x48/0xb8
>   #4:  (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
> 
>  stack backtrace:
>  CPU: 2 PID: 1723 Comm: trace.sh Not tainted 4.4.0+ #445
>  Hardware name: ARM-Versatile Express
>  [<c001883c>] (unwind_backtrace) from [<c0013f50>] (show_stack+0x20/0x24)
>  [<c0013f50>] (show_stack) from [<c044ad90>] (dump_stack+0x80/0xb4)
>  [<c044ad90>] (dump_stack) from [<c0128edc>] (print_circular_bug+0x29c/0x2f0)
>  [<c0128edc>] (print_circular_bug) from [<c0081708>] (__lock_acquire+0x163c/0x1d74)
>  [<c0081708>] (__lock_acquire) from [<c008238c>] (lock_acquire+0xd4/0x20c)
>  [<c008238c>] (lock_acquire) from [<c01f6ae4>] (__kernfs_remove+0x288/0x328)
>  [<c01f6ae4>] (__kernfs_remove) from [<c01f78c8>] (kernfs_remove_by_name_ns+0x4c/0x94)
>  [<c01f78c8>] (kernfs_remove_by_name_ns) from [<c01fa024>] (remove_files+0x44/0x88)
>  [<c01fa024>] (remove_files) from [<c01fa5a4>] (sysfs_remove_group+0x50/0xa4)
>  [<c01fa5a4>] (sysfs_remove_group) from [<c058285c>] (cpufreq_governor_dbs+0x3f0/0x5d4)
>  [<c058285c>] (cpufreq_governor_dbs) from [<c0017c10>] (return_to_handler+0x0/0x18)
> 
> Now, I couldn't yet make sense of this, but it seems to be
> triggered by setting ondemand, printing its attributes and then
> switching to conservative (that's what sp2 does, right?). Also, s_active
> seems to come into play only when lockdep is enabled. Are you seeing
> this as well?

There is something about the platform you are running this on.. I
don't hit it most of the time on my Exynos board (Dual A15), but x86
and powerpc guys used to report this all the time. I have tried with
both have-governor-per-policy and otherwise.

I have explained something similar in the earlier commits I pointed to
you, here is the commit log:

http://pastebin.com/JbEJBLzU

-- 
viresh

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor
  2016-01-20  7:29           ` Viresh Kumar
@ 2016-01-20 10:17             ` Juri Lelli
  2016-01-20 10:18               ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-20 10:17 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 20/01/16 12:59, Viresh Kumar wrote:
> On 19-01-16, 16:49, Juri Lelli wrote:
> > I'm actually hitting this running sp2, on linux-pm/linux-next :/.
> 
> That's really bad .. Are you hitting this on Juno or x86 ?
> 

That's on TC2. I'll try to run the same on Juno and x86.

> And I am sure you would have hit that with your changes as well, but
> now it's on the currently merged patches :(
> 
> >  ======================================================
> >  [ INFO: possible circular locking dependency detected ]
> >  4.4.0+ #445 Not tainted
> >  -------------------------------------------------------
> >  trace.sh/1723 is trying to acquire lock:
> >   (s_active#48){++++.+}, at: [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
> > 
> >  but task is already holding lock:
> >   (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
> > 
> >  which lock already depends on the new lock.
> > 
> > 
> >  the existing dependency chain (in reverse order) is:
> > 
> > -> #2 (od_dbs_cdata.mutex){+.+.+.}:
> >         [<c075b040>] mutex_lock_nested+0x7c/0x434
> >         [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
> >         [<c0017c10>] return_to_handler+0x0/0x18
> > 
> > -> #1 (&policy->rwsem){+++++.}:
> >         [<c075ca8c>] down_read+0x58/0x94
> >         [<c057c244>] show+0x30/0x60
> >         [<c01f934c>] sysfs_kf_seq_show+0x90/0xfc
> >         [<c01f7ad8>] kernfs_seq_show+0x34/0x38
> >         [<c01a22ec>] seq_read+0x1e4/0x4e4
> >         [<c01f8694>] kernfs_fop_read+0x120/0x1a0
> >         [<c01794b4>] __vfs_read+0x3c/0xe0
> >         [<c017a378>] vfs_read+0x98/0x104
> >         [<c017a434>] SyS_read+0x50/0x90
> >         [<c000fd40>] ret_fast_syscall+0x0/0x1c
> > 
> > -> #0 (s_active#48){++++.+}:
> >         [<c008238c>] lock_acquire+0xd4/0x20c
> >         [<c01f6ae4>] __kernfs_remove+0x288/0x328
> >         [<c01f78c8>] kernfs_remove_by_name_ns+0x4c/0x94
> >         [<c01fa024>] remove_files+0x44/0x88
> >         [<c01fa5a4>] sysfs_remove_group+0x50/0xa4
> >         [<c058285c>] cpufreq_governor_dbs+0x3f0/0x5d4
> >         [<c0017c10>] return_to_handler+0x0/0x18
> > 
> >  other info that might help us debug this:
> > 
> >  Chain exists of:
> >   s_active#48 --> &policy->rwsem --> od_dbs_cdata.mutex
> > 
> >   Possible unsafe locking scenario:
> > 
> >         CPU0                    CPU1
> >         ----                    ----
> >    lock(od_dbs_cdata.mutex);
> >                                 lock(&policy->rwsem);
> >                                 lock(od_dbs_cdata.mutex);
> >    lock(s_active#48);
> > 
> >   *** DEADLOCK ***
> > 
> >  5 locks held by trace.sh/1723:
> >   #0:  (sb_writers#6){.+.+.+}, at: [<c017beb8>] __sb_start_write+0xb4/0xc0
> >   #1:  (&of->mutex){+.+.+.}, at: [<c01f8418>] kernfs_fop_write+0x6c/0x1c8
> >   #2:  (s_active#35){.+.+.+}, at: [<c01f8420>] kernfs_fop_write+0x74/0x1c8
> >   #3:  (cpu_hotplug.lock){++++++}, at: [<c0029e6c>] get_online_cpus+0x48/0xb8
> >   #4:  (od_dbs_cdata.mutex){+.+.+.}, at: [<c05824a0>] cpufreq_governor_dbs+0x34/0x5d4
> > 
> >  stack backtrace:
> >  CPU: 2 PID: 1723 Comm: trace.sh Not tainted 4.4.0+ #445
> >  Hardware name: ARM-Versatile Express
> >  [<c001883c>] (unwind_backtrace) from [<c0013f50>] (show_stack+0x20/0x24)
> >  [<c0013f50>] (show_stack) from [<c044ad90>] (dump_stack+0x80/0xb4)
> >  [<c044ad90>] (dump_stack) from [<c0128edc>] (print_circular_bug+0x29c/0x2f0)
> >  [<c0128edc>] (print_circular_bug) from [<c0081708>] (__lock_acquire+0x163c/0x1d74)
> >  [<c0081708>] (__lock_acquire) from [<c008238c>] (lock_acquire+0xd4/0x20c)
> >  [<c008238c>] (lock_acquire) from [<c01f6ae4>] (__kernfs_remove+0x288/0x328)
> >  [<c01f6ae4>] (__kernfs_remove) from [<c01f78c8>] (kernfs_remove_by_name_ns+0x4c/0x94)
> >  [<c01f78c8>] (kernfs_remove_by_name_ns) from [<c01fa024>] (remove_files+0x44/0x88)
> >  [<c01fa024>] (remove_files) from [<c01fa5a4>] (sysfs_remove_group+0x50/0xa4)
> >  [<c01fa5a4>] (sysfs_remove_group) from [<c058285c>] (cpufreq_governor_dbs+0x3f0/0x5d4)
> >  [<c058285c>] (cpufreq_governor_dbs) from [<c0017c10>] (return_to_handler+0x0/0x18)
> > 
> > Now, I couldn't yet make sense of this, but it seems to be
> > triggered by setting ondemand, printing its attributes and then
> > switching to conservative (that's what sp2 does, right?). Also, s_active
> > seems to come into play only when lockdep is enabled. Are you seeing
> > this as well?
> 
> There is something about the platform you are running this on.. I
> don't hit it most of the time on my Exynos board (Dual A15), but x86
> and powerpc guys used to report this all the time. I have tried with
> both have-governor-per-policy and otherwise.
> 
> I have explained something similar in the earlier commits I pointed to
> you, here is the commit log:
> 
> http://pastebin.com/JbEJBLzU
> 

Yeah, saw that. I guess I have to stare at this thing more.

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor
  2016-01-20 10:17             ` Juri Lelli
@ 2016-01-20 10:18               ` Viresh Kumar
  2016-01-20 10:27                 ` Juri Lelli
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-01-20 10:18 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 20-01-16, 10:17, Juri Lelli wrote:
> On 20/01/16 12:59, Viresh Kumar wrote:
> > On 19-01-16, 16:49, Juri Lelli wrote:
> > > I'm actually hitting this running sp2, on linux-pm/linux-next :/.
> > 
> > That's really bad .. Are you hitting this on Juno or x86 ?
> > 
> 
> That's on TC2. I'll try to run the same on Juno and x86.

Juno will be the same as that also set the per-policy-governor thing
:)

-- 
viresh

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor
  2016-01-20 10:18               ` Viresh Kumar
@ 2016-01-20 10:27                 ` Juri Lelli
  2016-01-20 10:30                   ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Juri Lelli @ 2016-01-20 10:27 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 20/01/16 15:48, Viresh Kumar wrote:
> On 20-01-16, 10:17, Juri Lelli wrote:
> > On 20/01/16 12:59, Viresh Kumar wrote:
> > > On 19-01-16, 16:49, Juri Lelli wrote:
> > > > I'm actually hitting this running sp2, on linux-pm/linux-next :/.
> > > 
> > > That's really bad .. Are you hitting this on Juno or x86 ?
> > > 
> > 
> > That's on TC2. I'll try to run the same on Juno and x86.
> 
> Juno will be the same as that also set the per-policy-governor thing
> :)

That is what I expect as well, but you never know ;).

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor
  2016-01-20 10:27                 ` Juri Lelli
@ 2016-01-20 10:30                   ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-01-20 10:30 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, mturquette, steve.muckle,
	vincent.guittot, morten.rasmussen, dietmar.eggemann

On 20-01-16, 10:27, Juri Lelli wrote:
> That is what I expect as well, but you never know ;).

Perhaps not. We are talking about policy->rwsem here being used while
reading the values of governor's sysfs files. And that lock is only
taken for single governor case.

-- 
viresh

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-19 19:21                     ` Peter Zijlstra
  2016-01-19 21:52                       ` Rafael J. Wysocki
@ 2016-01-20 12:59                       ` Juri Lelli
  1 sibling, 0 replies; 110+ messages in thread
From: Juri Lelli @ 2016-01-20 12:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Turquette, Viresh Kumar, linux-kernel, linux-pm, rjw,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 19/01/16 20:21, Peter Zijlstra wrote:
> On Tue, Jan 19, 2016 at 08:17:34PM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 19, 2016 at 04:01:55PM +0000, Juri Lelli wrote:
> > > Right, read path is fast, but write path still requires some sort of
> > > locking (malloc, copy and update). So, I'm wondering if this still pays
> > > off for a structure that gets written a lot.
> > 
> > No, not at all.
> > 
> > struct cpufreq_driver *driver;
> > 
> > void sched_util_change(unsigned int util)
> > {
> > 	struct my_per_cpu_data *foo;
> > 
> > 	rcu_read_lock();
> 
> That should obviously be:
> 
> 	d = rcu_dereference(driver);
> 	if (d) {
> 		foo = __this_cpu_ptr(d->data);
> 
> > 		if (abs(util - foo->last_util) > 10) {
> > 			foo->last_util = util;
> > 			foo->set_util(util);
> > 		}
> > 	}
> > 	rcu_read_unlock();
> > }
> > 
> > 
> > struct cpufreq_driver *cpufreq_flip_driver(struct cpufreq_driver *new_driver)
> > {
> > 	struct cpufreq_driver *old_driver;
> > 
> > 	mutex_lock(&cpufreq_driver_lock);
> > 	old_driver = driver;
> > 	rcu_assign_pointer(driver, new_driver);
> > 	if (old_driver)
> > 		synchronize_rcu();
> > 	mutex_unlock(&cpufreq_driver_lock);
> > 
> > 	return old_driver;
> > }
> > 
> > 
> > 
> 

Right, this addresses the driver side (modulo what Rafael pointed out
about setting driver pointer to NULL and then to point to the new
driver); and for this part I think RCU works well. I'm not concerned
about the driver side :).

Now, assuming that we move cpufreq_cpu_data inside cpufreq_driver (IIUC
this is your d->data), we will have per_cpu pointers pointing to the
different policies. Inside these policy data structures we have
information regarding current frequency, maximum allowed frequency, cpus
covered by this policy, and a few more. IIUC this is your foo thing.
Since the structure pointed to by foo will be shared among several
cpus, we need some way to guarantee mutual exclusion and such. I think
we were thinking to use RCU for this bit as well and that is what I'm
concerned about, as curr frequency will change at every frequency
transition.

Maybe you are also implying that we need to change cpufreq_cpu_data as
well. I need to think more about that.

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-19 21:52                       ` Rafael J. Wysocki
@ 2016-01-20 17:04                         ` Peter Zijlstra
  2016-01-20 22:12                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 110+ messages in thread
From: Peter Zijlstra @ 2016-01-20 17:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Juri Lelli, Michael Turquette, Viresh Kumar, linux-kernel,
	linux-pm, steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Tue, Jan 19, 2016 at 10:52:22PM +0100, Rafael J. Wysocki wrote:
> This is very similar to what I was thinking about, plus-minus a couple of
> things.
> 
> > > struct cpufreq_driver *driver;
> > > 
> > > void sched_util_change(unsigned int util)
> > > {
> > > 	struct my_per_cpu_data *foo;
> > > 
> > > 	rcu_read_lock();
> > 
> > That should obviously be:
> > 
> > 	d = rcu_dereference(driver);
> > 	if (d) {
> > 		foo = __this_cpu_ptr(d->data);
> 
> If we do this, it would be convenient to define ->set_util() to take
> foo as an arg too, in addition to util.
> 
> And is there any particular reason why d->data has to be per-cpu?

Seems sensible, at best it actually is per cpu data, at worst this per
cpu pointer points to the same data for multiple cpus (the freq domain).

> > 
> > > 		if (abs(util - foo->last_util) > 10) {
> 
> Even if the utilization doesn't change, it still may be too high or too low,
> so we may want to call foo->set_util() in that case too, at least once in a
> while.
> 
> > > 			foo->last_util = util;

Ah, the whole point of this was that ^^^ store.

Modifying the data structure doesn't need a new alloc / copy etc.. We
only use RCU to guarantee the data exists, once we have the data, the
data itself can be modified however.

Here it's strictly per-cpu data, so modifying it can be unserialized
since CPUs themselves are sequentially consistent.

If you have a freq domain with multiple CPUs in, you'll have to go stick
a lock in.
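
For the shared-domain case, a rough user-space sketch follows. The struct
and function names are hypothetical, and a pthread mutex stands in for
whatever lock the kernel side would use, which in a scheduler hot path
would presumably have to be a spinlock rather than a sleeping lock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical state shared by all CPUs of one frequency domain. */
struct freq_domain {
	pthread_mutex_t lock;		/* serializes updates from those CPUs */
	unsigned int last_util;
	unsigned int nr_updates;	/* stand-in for calling ->set_util() */
};

#define FREQ_DOMAIN_INIT { .lock = PTHREAD_MUTEX_INITIALIZER }

/*
 * Same threshold check as the per-cpu version, but the read-modify-write
 * of last_util happens under the domain lock because several CPUs may
 * get here at the same time. Returns true if an update was issued.
 */
bool domain_util_change(struct freq_domain *dom, unsigned int util)
{
	unsigned int delta;
	bool update;

	pthread_mutex_lock(&dom->lock);
	delta = util > dom->last_util ? util - dom->last_util
				      : dom->last_util - util;
	update = delta > 10;
	if (update) {
		dom->last_util = util;
		dom->nr_updates++;
	}
	pthread_mutex_unlock(&dom->lock);

	return update;
}
```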

> > > 			foo->set_util(util);
> > > 		}
> > > 	}
> > > 	rcu_read_unlock();
> > > }
> > > 
> > > 
> > > struct cpufreq_driver *cpufreq_flip_driver(struct cpufreq_driver *new_driver)
> > > {
> > > 	struct cpufreq_driver *old_driver;
> > > 
> > > 	mutex_lock(&cpufreq_driver_lock);
> > > 	old_driver = driver;
> > > 	rcu_assign_pointer(driver, new_driver);
> > > 	if (old_driver)
> > > 		synchronize_rcu();
> > > 	mutex_unlock(&cpufreq_driver_lock);
> > > 
> > > 	return old_driver;
> > > }
> 
> We never need to do this, because we never replace one driver with another in
> one go.  We need to go from a valid driver pointer to NULL and the other way
> around only.

The above can do those transitions :-)

> This means there may be other pointers around that may be accessed safely
> from foo->set_util() above if there's a rule that they must be set before
> the driver pointer and the data structures they point to must stay around
> until the synchronize_rcu() returns.

I would dangle _everything_ off the one driver pointer, that's much
easier.

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-20 17:04                         ` Peter Zijlstra
@ 2016-01-20 22:12                           ` Rafael J. Wysocki
  2016-01-20 22:38                             ` Peter Zijlstra
  0 siblings, 1 reply; 110+ messages in thread
From: Rafael J. Wysocki @ 2016-01-20 22:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rafael J. Wysocki, Juri Lelli, Michael Turquette, Viresh Kumar,
	Linux Kernel Mailing List, linux-pm, steve.muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann

On Wed, Jan 20, 2016 at 6:04 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Jan 19, 2016 at 10:52:22PM +0100, Rafael J. Wysocki wrote:
>> This is very similar to what I was thinking about, plus-minus a couple of
>> things.
>>
>> > > struct cpufreq_driver *driver;
>> > >
>> > > void sched_util_change(unsigned int util)
>> > > {
>> > >   struct my_per_cpu_data *foo;
>> > >
>> > >   rcu_read_lock();
>> >
>> > That should obviously be:
>> >
>> >     d = rcu_dereference(driver);
>> >     if (d) {
>> >             foo = __this_cpu_ptr(d->data);
>>
>> If we do this, it would be convenient to define ->set_util() to take
>> foo as an arg too, in addition to util.
>>
>> And is there any particular reason why d->data has to be per-cpu?
>
> Seems sensible, at best it actually is per cpu data, at worst this per
> cpu pointer points to the same data for multiple cpus (the freq domain).
>
>> >
>> > >           if (abs(util - foo->last_util) > 10) {
>>
>> Even if the utilization doesn't change, it still may be too high or too low,
>> so we may want to call foo->set_util() in that case too, at least once in a
>> while.
>>
>> > >                   foo->last_util = util;
>
> Ah, the whole point of this was that ^^^ store.

OK, I see.

> Modifying the data structure doesn't need a new alloc / copy etc.. We
> only use RCU to guarantee the data exists, once we have the data, the
> data itself can be modified however.
>
> Here it's strictly per-cpu data, so modifying it can be unserialized
> since CPUs themselves are sequentially consistent.

Right.  So that's why you want it to be per-cpu really.

> If you have a freq domain with multiple CPUs in, you'll have to go stick
> a lock in.

Right.

>> > >                   foo->set_util(util);
>> > >           }
>> > >   }
>> > >   rcu_read_unlock();
>> > > }
>> > >
>> > >
>> > > struct cpufreq_driver *cpufreq_flip_driver(struct cpufreq_driver *new_driver)
>> > > {
>> > >   struct cpufreq_driver *old_driver;
>> > >
>> > >   mutex_lock(&cpufreq_driver_lock);
>> > >   old_driver = driver;
>> > >   rcu_assign_pointer(driver, new_driver);
>> > >   if (old_driver)
>> > >           synchronize_rcu();
>> > >   mutex_unlock(&cpufreq_driver_lock);
>> > >
>> > >   return old_driver;
>> > > }
>>
>> We never need to do this, because we never replace one driver with another in
>> one go.  We need to go from a valid driver pointer to NULL and the other way
>> around only.
>
> The above can do those transitions :-)

Yes, it can, but the real thing will probably be more complicated than
the code above and then the difference may actually matter.

>> This means there may be other pointers around that may be accessed safely
>> from foo->set_util() above if there's a rule that they must be set before
>> the driver pointer and the data structures they point to must stay around
>> until the synchronize_rcu() returns.
>
> I would dangle _everything_ off the one driver pointer, that's much
> easier.

I'm not sure how much easier it is in practice.

Even if everything dangles out of the driver pointer, data structures
pointed to by those things need not be allocated all in one go by the
same entity.  Some of them are allocated by drivers, some of them by
the core, at different times.  The ordering between those allocations
and populating the pointers is what matters, not how all that is laid
out in memory.
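
To illustrate the ordering point, here is a minimal userspace sketch that
uses C11 atomics as a stand-in for rcu_assign_pointer()/rcu_dereference().
All names (demo_driver, core_data, driver_data, demo_publish() and so on)
are hypothetical and are not the real cpufreq structures. The only thing it
tries to show is the rule itself: populate every pointer first, publish the
top-level pointer last, and unpublish before anything is freed.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real cpufreq structures. */
struct core_data   { int min_freq, max_freq; };
struct driver_data { int cur_freq; };

struct demo_driver {
	struct core_data   *core;  /* allocated by the "core" */
	struct driver_data *drv;   /* allocated by the "driver" */
};

static _Atomic(struct demo_driver *) published;

/* Writer: populate every pointer first, publish last (release store). */
static void demo_publish(struct demo_driver *d,
			 struct core_data *core, struct driver_data *drv)
{
	d->core = core;
	d->drv = drv;
	atomic_store_explicit(&published, d, memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store above, so a
 * non-NULL result implies d->core and d->drv are already visible.
 */
static int demo_read_cur_freq(void)
{
	struct demo_driver *d =
		atomic_load_explicit(&published, memory_order_acquire);

	if (!d)
		return -1;
	return d->drv->cur_freq;
}

/*
 * Writer: unpublish first. Only after all readers are known to be done
 * (that is what synchronize_rcu() provides in the kernel; this sketch has
 * no equivalent) may the pointed-to data be freed.
 */
static struct demo_driver *demo_unpublish(void)
{
	return atomic_exchange_explicit(&published,
					(struct demo_driver *)NULL,
					memory_order_acq_rel);
}
```

Note that the sketch works the same way whether core_data and driver_data
are embedded in one object or allocated separately by different entities;
what the reader relies on is only the order of the stores.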

Thanks,
Rafael


* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-20 22:12                           ` Rafael J. Wysocki
@ 2016-01-20 22:38                             ` Peter Zijlstra
  2016-01-20 23:33                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 110+ messages in thread
From: Peter Zijlstra @ 2016-01-20 22:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Juri Lelli, Michael Turquette, Viresh Kumar,
	Linux Kernel Mailing List, linux-pm, steve.muckle,
	Vincent Guittot, Morten Rasmussen, dietmar.eggemann

On Wed, Jan 20, 2016 at 11:12:45PM +0100, Rafael J. Wysocki wrote:
> > I would dangle _everything_ off the one driver pointer, that's much
> > easier.
> 
> I'm not sure how much easier it is in practice.
> 
> Even if everything dangles out of the driver pointer, data structures
> pointed to by those things need not be allocated all in one go by the
> same entity.  Some of them are allocated by drivers, some of them by
> the core, at different times.

Yes, I've noticed, some of that is really bonkers.

> The ordering between those allocations
> and populating the pointers is what matters, not how all that is laid
> out in memory.

I'm thinking getting that ordering right is easier/more natural if it's
all contained in one object. But this could be subjective.


* Re: [RFC PATCH 18/19] cpufreq: remove transition_lock
  2016-01-20 22:38                             ` Peter Zijlstra
@ 2016-01-20 23:33                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 110+ messages in thread
From: Rafael J. Wysocki @ 2016-01-20 23:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Juri Lelli,
	Michael Turquette, Viresh Kumar, Linux Kernel Mailing List,
	linux-pm, steve.muckle, Vincent Guittot, Morten Rasmussen,
	dietmar.eggemann

On Wed, Jan 20, 2016 at 11:38 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Jan 20, 2016 at 11:12:45PM +0100, Rafael J. Wysocki wrote:
>> > I would dangle _everything_ off the one driver pointer, that's much
>> > easier.
>>
>> I'm not sure how much easier it is in practice.
>>
>> Even if everything dangles out of the driver pointer, data structures
>> pointed to by those things need not be allocated all in one go by the
>> same entity.  Some of them are allocated by drivers, some of them by
>> the core, at different times.
>
> Yes, I've noticed, some of that is really bonkers.
>
>> The ordering between those allocations
>> and populating the pointers is what matters, not how all that is laid
>> out in memory.
>
> I'm thinking getting that ordering right is easier/more natural if it's
> all contained in one object. But this could be subjective.

I'm trying to look at this from the perspective of making changes.

It should be possible to change the ordering of how the data
structures are populated and pointers set without changing the
existing memory layout of them, which may allow us to minimize the
amount of changes to cpufreq drivers for old hardware (and therefore
generally difficult to test), for example.

Also, this way each individual change may be more limited in scope and
therefore less error prone IMO.


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-01-12 10:20   ` Viresh Kumar
@ 2016-01-30  0:33     ` Saravana Kannan
  2016-01-30 11:49       ` Rafael J. Wysocki
  0 siblings, 1 reply; 110+ messages in thread
From: Saravana Kannan @ 2016-01-30  0:33 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Juri Lelli, linux-kernel, linux-pm, peterz, rjw, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 01/12/2016 02:20 AM, Viresh Kumar wrote:
> On 11-01-16, 17:35, Juri Lelli wrote:
>> __cpufreq_governor works on policy, so policy->rwsem has to be held.
>> Add assertion for such condition.
>>
>> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
>> ---
>>   drivers/cpufreq/cpufreq.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index f1f9fbc..e7fc5c9 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1950,6 +1950,9 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
>>   	/* Don't start any governor operations if we are entering suspend */
>>   	if (cpufreq_suspended)
>>   		return 0;
>> +
>> +	lockdep_assert_held(&policy->rwsem);
>> +
>
> We had an ABBA problem with the EXIT governor callback and so this
> rwsem is dropped just before that from set_policy()..
>
> commit 955ef4833574 ("cpufreq: Drop rwsem lock around
> CPUFREQ_GOV_POLICY_EXIT")
>

AFAIR, the ABBA issue was between the sysfs lock and the policy lock. 
The fix for that issue should not be dropping the lock around 
POLICY_EXIT. The proper fix is to have the governor "export" the 
attributes it wants to add/remove and have the cpufreq framework do the 
adding/removing of the attributes from sysfs for the governor.

Thanks,
Saravana

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


* Re: [RFC PATCH 00/19] cpufreq locking cleanups and documentation
  2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
                   ` (19 preceding siblings ...)
  2016-01-11 22:45 ` [RFC PATCH 00/19] cpufreq locking cleanups and documentation Rafael J. Wysocki
@ 2016-01-30  0:57 ` Saravana Kannan
  2016-02-01  6:02   ` Viresh Kumar
  2016-02-01 12:06   ` Juri Lelli
  20 siblings, 2 replies; 110+ messages in thread
From: Saravana Kannan @ 2016-01-30  0:57 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, linux-pm, peterz, rjw, viresh.kumar, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 01/11/2016 09:35 AM, Juri Lelli wrote:
> Hi all,
>
> In the context of the ongoing discussion about introducing a simple platform
> energy model to guide scheduling decisions (Energy Aware Scheduling [1])
> concerns have been expressed by Peter about the component in charge of driving
> clock frequency selection (Steve recently posted an update of such component
> [2]): https://lkml.org/lkml/2015/8/15/141.
>
> The problem is that, with this new approach, cpufreq core functions need to be
> accessed from scheduler hot-paths and the overhead associated with the current
> locking scheme might result to be unsustainable.
>
> Peter's proposed approach of using RCU logic to reduce locking overhead seems
> reasonable, but things may not be so straightforward as originally thought. The
> very first thing I actually realized when I started looking into this is that
> it was hard for me to understand which locking mechanism was protecting which
> data structure. As mostly a way to build a better understanding of the current
> cpufreq locking scheme and also as preparatory work for implementing RCU logic,
> I came up with this set of patches. In fact, at this stage, I would like each
> patch to be considered as a question I'm asking rather than a proposed change,
> thus the RFC tag for the series; with the intent of documenting current locking
> scheme and modifying it a bit in order to make RCU logic implementation easier.
> Actually, as you'll soon notice, I didn't really start from scratch. Mike
> shared with me some patches he has been developing while looking at the same
> problem. I've given Mike attribution for the patches that I took unchanged from
> him, with thanks for sharing his findings with me.
>
> High level description of patches:
>
>   o [01-04] cleanup and move code around to make things (hopefully) cleaner
>   o [05-14] insert lockdep assertions and fix uncovered erroneous situations
>   o [15-18] remove overkill usage of locking mechanism
>   o 19      adds documentation for the cleaned up locking scheme
>
> With Viresh' tests [3] on both arm TC2 and arm64 Juno boards I'm not seeing
> anything bad happening. However, coverage is really small (as is my personal
> confidence of not breaking things for other confs :-)).
>
> This set is based on top of linux-pm/linux-next as of today and it is also
> available from here:
>
>   git://linux-arm.org/linux-jl.git upstream/cpufreq_cleanups
>
> Comments, concerns and rants are the primary goal of this posting; I'm thus
> looking forward to them.
>
> Best,
>
> - Juri
>
> [1] https://lkml.org/lkml/2015/7/7/754
> [2] https://lkml.org/lkml/2015/12/9/35
> [3] https://git.linaro.org/people/viresh.kumar/cpufreq-tests.git
>
> Juri Lelli (16):
>    cpufreq: kill for_each_policy
>    cpufreq: bring data structures close to their locks
>    cpufreq: assert locking when accessing cpufreq_policy_list
>    cpufreq: always access cpufreq_policy_list while holding
>      cpufreq_driver_lock
>    cpufreq: assert locking when accessing cpufreq_governor_list
>    cpufreq: fix warning for cpufreq_init_policy unlocked access to
>      cpufreq_governor_list
>    cpufreq: fix warning for show_scaling_available_governors unlocked
>      access to cpufreq_governor_list
>    cpufreq: assert policy->rwsem is held in cpufreq_set_policy
>    cpufreq: assert policy->rwsem is held in __cpufreq_governor
>    cpufreq: fix locking of policy->rwsem in cpufreq_init_policy
>    cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare
>    cpufreq: fix locking of policy->rwsem in cpufreq_offline_finish
>    cpufreq: remove useless usage of cpufreq_governor_mutex in
>      __cpufreq_governor
>    cpufreq: hold policy->rwsem across CPUFREQ_GOV_POLICY_EXIT
>    cpufreq: stop checking for cpufreq_driver being present in
>      cpufreq_cpu_get
>    cpufreq: documentation: document locking scheme
>
> Michael Turquette (3):
>    cpufreq: do not expose cpufreq_governor_lock
>    cpufreq: merge governor lock and mutex
>    cpufreq: remove transition_lock
>
>   Documentation/cpu-freq/core.txt    |  44 +++++++++++++
>   drivers/cpufreq/cpufreq.c          | 132 +++++++++++++++++++++++--------------
>   drivers/cpufreq/cpufreq_governor.h |   2 -
>   include/linux/cpufreq.h            |   5 --
>   4 files changed, 125 insertions(+), 58 deletions(-)
>

Juri,

I haven't looked at the cpufreq-tests, but I doubt they do hotplug 
testing where they remove all the CPUs of a policy (to trigger a policy 
exit).

Can you please add that to your testing? I wouldn't be surprised if some 
of your cleanups caused a deadlock. This cleanup series is 
definitely appreciated, but I think it might still be 
missing some patches that are needed to make things work without 
deadlocking.

I'll try to do a deeper analysis/review/testing, but kinda hard pressed 
on time here.

Thanks,
Saravana

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-01-30  0:33     ` Saravana Kannan
@ 2016-01-30 11:49       ` Rafael J. Wysocki
  2016-02-01  6:09         ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Rafael J. Wysocki @ 2016-01-30 11:49 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Viresh Kumar, Juri Lelli, linux-kernel, linux-pm, peterz,
	mturquette, steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On Friday, January 29, 2016 04:33:39 PM Saravana Kannan wrote:
> On 01/12/2016 02:20 AM, Viresh Kumar wrote:
> > On 11-01-16, 17:35, Juri Lelli wrote:
> >> __cpufreq_governor works on policy, so policy->rwsem has to be held.
> >> Add assertion for such condition.
> >>
> >> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> >> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> >> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> >> ---
> >>   drivers/cpufreq/cpufreq.c | 3 +++
> >>   1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> >> index f1f9fbc..e7fc5c9 100644
> >> --- a/drivers/cpufreq/cpufreq.c
> >> +++ b/drivers/cpufreq/cpufreq.c
> >> @@ -1950,6 +1950,9 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
> >>   	/* Don't start any governor operations if we are entering suspend */
> >>   	if (cpufreq_suspended)
> >>   		return 0;
> >> +
> >> +	lockdep_assert_held(&policy->rwsem);
> >> +
> >
> > We had an ABBA problem with the EXIT governor callback and so this
> > rwsem is dropped just before that from set_policy()..
> >
> > commit 955ef4833574 ("cpufreq: Drop rwsem lock around
> > CPUFREQ_GOV_POLICY_EXIT")
> >
> 
> AFAIR, the ABBA issue was between the sysfs lock and the policy lock. 
> The fix for that issue should not be dropping the lock around 
> POLICY_EXIT.

Right.  Dropping the lock is a mistake (which I overlooked, sadly).

Thanks,
Rafael


* Re: [RFC PATCH 00/19] cpufreq locking cleanups and documentation
  2016-01-30  0:57 ` Saravana Kannan
@ 2016-02-01  6:02   ` Viresh Kumar
  2016-02-01 12:06   ` Juri Lelli
  1 sibling, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-02-01  6:02 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Juri Lelli, linux-kernel, linux-pm, peterz, rjw, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 29-01-16, 16:57, Saravana Kannan wrote:
> I haven't looked at the cpufreq-tests, but I doubt they do hotplug testing
> where they remove all the CPUs of a policy (to trigger a policy exit).

They do.

-- 
viresh


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-01-30 11:49       ` Rafael J. Wysocki
@ 2016-02-01  6:09         ` Viresh Kumar
  2016-02-01 10:22           ` Rafael J. Wysocki
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-02-01  6:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Saravana Kannan, Juri Lelli, linux-kernel, linux-pm, peterz,
	mturquette, steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

On 30-01-16, 12:49, Rafael J. Wysocki wrote:
> On Friday, January 29, 2016 04:33:39 PM Saravana Kannan wrote:
> > AFAIR, the ABBA issue was between the sysfs lock and the policy lock. 

Yeah, to be precise here it is:

CPU0 (sysfs read)               CPU1 (exit governor)

sysfs-read                      set_policy()-> lock policy->rwsem
sysfs-active lock               Remove sysfs files
lock policy->rwsem              sysfs-active lock
Actual read
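
The interleaving above is a lock-order inversion: one path takes the sysfs
active reference and then policy->rwsem, the other takes them in the
opposite order. As a rough illustration only, here is a toy userspace model
that records acquisition order for two lock classes and flags the
inversion. The names (demo_lock, demo_record_order()) are made up for this
sketch. It is not how lockdep is implemented, since lockdep tracks a full
dependency graph rather than just pairs.

```c
#include <stdbool.h>

enum demo_lock { SYSFS_ACTIVE, POLICY_RWSEM, NR_DEMO_LOCKS };

/* seen_before[a][b]: some path took b while already holding a. */
static bool seen_before[NR_DEMO_LOCKS][NR_DEMO_LOCKS];

/*
 * Record "held, then acquired". Returns true if the opposite order was
 * recorded earlier, i.e. the two paths can deadlock ABBA-style.
 */
static bool demo_record_order(enum demo_lock held, enum demo_lock acquired)
{
	seen_before[held][acquired] = true;
	return seen_before[acquired][held];
}
```

Feeding it the two columns of the table, the sysfs read path records
(SYSFS_ACTIVE, POLICY_RWSEM) without complaint, and the governor exit path
then records (POLICY_RWSEM, SYSFS_ACTIVE), which is reported as an
inversion.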

> > The fix for that issue should not be dropping the lock around 
> > POLICY_EXIT.
> 
> Right.  Dropping the lock is a mistake (which I overlooked, sadly).

I joined the party around the time of 3.10, and we had this problem and
this hacky solution then as well. We tried to get rid of it multiple times,
but sadly failed.

> > The proper fix is to have the governor "export" the
> > attributes it wants to add/remove and have the cpufreq framework do
> > the adding/removing of the attributes from sysfs for the governor.

I failed to understand your solution, sorry. Care to explain this a
bit more?

-- 
viresh


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-01  6:09         ` Viresh Kumar
@ 2016-02-01 10:22           ` Rafael J. Wysocki
  2016-02-01 20:24             ` Saravana Kannan
  0 siblings, 1 reply; 110+ messages in thread
From: Rafael J. Wysocki @ 2016-02-01 10:22 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Saravana Kannan, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On Mon, Feb 1, 2016 at 7:09 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 30-01-16, 12:49, Rafael J. Wysocki wrote:
>> On Friday, January 29, 2016 04:33:39 PM Saravana Kannan wrote:
>> > AFAIR, the ABBA issue was between the sysfs lock and the policy lock.
>
> Yeah, to be precise here it is:
>
> CPU0 (sysfs read)               CPU1 (exit governor)
>
> sysfs-read                      set_policy()-> lock policy->rwsem
> sysfs-active lock               Remove sysfs files
> lock policy->rwsem              sysfs-active lock
> Actual read
>
>> > The fix for that issue should not be dropping the lock around
>> > POLICY_EXIT.
>>
>> Right.  Dropping the lock is a mistake (which I overlooked, sadly).
>
> I joined the party at around time of 3.10, and we had this problem and
> hacky solution then as well. We tried to get rid of it multiple times,
> but sadly failed.

I kind of like your idea of accessing governor attributes without
holding the policy rwsem.

I looked at that code and it seems doable to me.  The problem to solve
there would be to ensure that the dbs_data pointer is valid when
show/store runs for those attributes.

The fact that we make the distinction between global and policy
governors in there doesn't really help, but it looks like getting rid
of that bit wouldn't be too much effort.  Let me take a deeper look at
that.

Thanks,
Rafael


* Re: [RFC PATCH 00/19] cpufreq locking cleanups and documentation
  2016-01-30  0:57 ` Saravana Kannan
  2016-02-01  6:02   ` Viresh Kumar
@ 2016-02-01 12:06   ` Juri Lelli
  1 sibling, 0 replies; 110+ messages in thread
From: Juri Lelli @ 2016-02-01 12:06 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: linux-kernel, linux-pm, peterz, rjw, viresh.kumar, mturquette,
	steve.muckle, vincent.guittot, morten.rasmussen,
	dietmar.eggemann

Hi Saravana,

On 29/01/16 16:57, Saravana Kannan wrote:
> On 01/11/2016 09:35 AM, Juri Lelli wrote:
> >Hi all,
> >
> >In the context of the ongoing discussion about introducing a simple platform
> >energy model to guide scheduling decisions (Energy Aware Scheduling [1])
> >concerns have been expressed by Peter about the component in charge of driving
> >clock frequency selection (Steve recently posted an update of such component
> >[2]): https://lkml.org/lkml/2015/8/15/141.
> >
> >The problem is that, with this new approach, cpufreq core functions need to be
> >accessed from scheduler hot-paths and the overhead associated with the current
> >locking scheme might result to be unsustainable.
> >
> >Peter's proposed approach of using RCU logic to reduce locking overhead seems
> >reasonable, but things may not be so straightforward as originally thought. The
> >very first thing I actually realized when I started looking into this is that
> >it was hard for me to understand which locking mechanism was protecting which
> >data structure. As mostly a way to build a better understanding of the current
> >cpufreq locking scheme and also as preparatory work for implementing RCU logic,
> >I came up with this set of patches. In fact, at this stage, I would like each
> >patch to be considered as a question I'm asking rather than a proposed change,
> >thus the RFC tag for the series; with the intent of documenting current locking
> >scheme and modifying it a bit in order to make RCU logic implementation easier.
> >Actually, as you'll soon notice, I didn't really start from scratch. Mike
> >shared with me some patches he has been developing while looking at the same
> >problem. I've given Mike attribution for the patches that I took unchanged from
> >him, with thanks for sharing his findings with me.
> >
> >High level description of patches:
> >
> >  o [01-04] cleanup and move code around to make things (hopefully) cleaner
> >  o [05-14] insert lockdep assertions and fix uncovered erroneous situations
> >  o [15-18] remove overkill usage of locking mechanism
> >  o 19      adds documentation for the cleaned up locking scheme
> >
> >With Viresh' tests [3] on both arm TC2 and arm64 Juno boards I'm not seeing
> >anything bad happening. However, coverage is really small (as is my personal
> >confidence of not breaking things for other confs :-)).
> >
> >This set is based on top of linux-pm/linux-next as of today and it is also
> >available from here:
> >
> >  git://linux-arm.org/linux-jl.git upstream/cpufreq_cleanups
> >
> >Comments, concerns and rants are the primary goal of this posting; I'm thus
> >looking forward to them.
> >
> >Best,
> >
> >- Juri
> >
> >[1] https://lkml.org/lkml/2015/7/7/754
> >[2] https://lkml.org/lkml/2015/12/9/35
> >[3] https://git.linaro.org/people/viresh.kumar/cpufreq-tests.git
> >
> >Juri Lelli (16):
> >   cpufreq: kill for_each_policy
> >   cpufreq: bring data structures close to their locks
> >   cpufreq: assert locking when accessing cpufreq_policy_list
> >   cpufreq: always access cpufreq_policy_list while holding
> >     cpufreq_driver_lock
> >   cpufreq: assert locking when accessing cpufreq_governor_list
> >   cpufreq: fix warning for cpufreq_init_policy unlocked access to
> >     cpufreq_governor_list
> >   cpufreq: fix warning for show_scaling_available_governors unlocked
> >     access to cpufreq_governor_list
> >   cpufreq: assert policy->rwsem is held in cpufreq_set_policy
> >   cpufreq: assert policy->rwsem is held in __cpufreq_governor
> >   cpufreq: fix locking of policy->rwsem in cpufreq_init_policy
> >   cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare
> >   cpufreq: fix locking of policy->rwsem in cpufreq_offline_finish
> >   cpufreq: remove useless usage of cpufreq_governor_mutex in
> >     __cpufreq_governor
> >   cpufreq: hold policy->rwsem across CPUFREQ_GOV_POLICY_EXIT
> >   cpufreq: stop checking for cpufreq_driver being present in
> >     cpufreq_cpu_get
> >   cpufreq: documentation: document locking scheme
> >
> >Michael Turquette (3):
> >   cpufreq: do not expose cpufreq_governor_lock
> >   cpufreq: merge governor lock and mutex
> >   cpufreq: remove transition_lock
> >
> >  Documentation/cpu-freq/core.txt    |  44 +++++++++++++
> >  drivers/cpufreq/cpufreq.c          | 132 +++++++++++++++++++++++--------------
> >  drivers/cpufreq/cpufreq_governor.h |   2 -
> >  include/linux/cpufreq.h            |   5 --
> >  4 files changed, 125 insertions(+), 58 deletions(-)
> >
> 
> Juri,
> 
> I haven't looked at the cpufreq-tests, but I doubt they do hotplug
> testing where they remove all the CPUs of a policy (to trigger a
> policy exit).
> 

As already pointed out by Viresh, they do actually test the case when we
hotplug out all CPUs of a policy. I'm running those on b.L. arm and
arm64 targets (two policies).

> Can you please add that to your testing? I wouldn't be surprised if
> some of your clean ups would cause a dead lock. This clean up series
> is definitely appreciated, but I think the patch series might still
> be missing some patches that are needed to make things work without
> deadlocking.
> 

Right, the problem with governors' sysfs attributes was there. But Viresh
proposed a patch (in a different thread, I'm afraid) to fix that.

Another thing that was still missing/wrong in this set is that governors
don't hold policy->rwsem when calling __cpufreq_driver_target(), so we
can't remove transition_lock. This seems to be a bit tricky to solve
though, as we seem to be running into a lot of ABBA deadlocks
(timer_mutex, etc.) if we do that.
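
To make the dependency concrete: if callers are not already serialized by
policy->rwsem, something else has to keep two frequency transitions from
overlapping. The following is a simplified userspace sketch of that idea
using a C11 atomic flag. The names (demo_policy,
demo_transition_try_begin(), demo_transition_end()) are hypothetical, and
unlike the kernel code, which makes a second caller wait, this version
simply refuses a second transition while one is in flight.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_policy {
	atomic_bool transition_ongoing;
	unsigned int cur_freq;
};

/* Claim the transition slot; fails if another transition is in flight. */
static bool demo_transition_try_begin(struct demo_policy *p)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&p->transition_ongoing,
					      &expected, true);
}

/* Record the new frequency, then release the slot. */
static void demo_transition_end(struct demo_policy *p, unsigned int new_freq)
{
	p->cur_freq = new_freq;
	atomic_store(&p->transition_ongoing, false);
}
```

If every caller held the same outer lock across the whole transition, the
flag would be redundant, which is the condition for dropping
transition_lock.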

> I'll try to do a deeper analysis/review/testing, but kinda hard
> pressed on time here.
> 

Thanks a lot for looking at this set. I should be able to post an
updated version fairly soon that should address review comments.

Best,

- Juri


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-01 10:22           ` Rafael J. Wysocki
@ 2016-02-01 20:24             ` Saravana Kannan
  2016-02-01 21:00               ` Rafael J. Wysocki
  2016-02-02  6:34               ` Viresh Kumar
  0 siblings, 2 replies; 110+ messages in thread
From: Saravana Kannan @ 2016-02-01 20:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Rafael J. Wysocki, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On 02/01/2016 02:22 AM, Rafael J. Wysocki wrote:
> On Mon, Feb 1, 2016 at 7:09 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> On 30-01-16, 12:49, Rafael J. Wysocki wrote:
>>> On Friday, January 29, 2016 04:33:39 PM Saravana Kannan wrote:
>>>> AFAIR, the ABBA issue was between the sysfs lock and the policy lock.
>>
>> Yeah, to be precise here it is:
>>
>> CPU0 (sysfs read)               CPU1 (exit governor)
>>
>> sysfs-read                      set_policy()-> lock policy->rwsem
>> sysfs-active lock               Remove sysfs files
>> lock policy->rwsem              sysfs-active lock
>> Actual read
>>
>>>> The fix for that issue should not be dropping the lock around
>>>> POLICY_EXIT.
>>>
>>> Right.  Dropping the lock is a mistake (which I overlooked, sadly).
>>
>> I joined the party at around time of 3.10, and we had this problem and
>> hacky solution then as well. We tried to get rid of it multiple times,
>> but sadly failed.
>
> I kind of like your idea of accessing governor attributes without
> holding the policy rwsem.

I'm not sure whose idea you are referring to. Viresh's (I don't think I 
saw his proposal) or mine.

> I looked at that code and it seems doable to me.  The problem to solve
> there would be to ensure that the dbs_data pointer is valid when
> show/store runs for those attributes.
>
> The fact that we make the distinction between global and policy
> governors in there doesn't really help, but it looks like getting rid
> of that bit wouldn't be too much effort.  Let me take a deeper look at
> that.
>

Anyway, to explain my suggestion better, I'm proposing to make it so 
that we don't have a need for the ABBA locking. The only reason the 
governor needs to even grab the sysfs lock is to add/remove the sysfs 
attribute files.

That can be easily achieved if the policy struct has some "gov_attrs" 
field(s) that each governor populates. Then the framework just has to 
create them after POLICY_INIT is processed by the governor and remove 
them before POLICY_EXIT is sent to the governor.

That way, we also avoid having to worry about the gov attributes 
accessed by the show/store disappearing while the files are being 
accessed. Since we remove those files before we even ask the gov to 
clean up, that situation can never happen.

The current problem is that there is no good place for the governor to 
populate this "gov_attrs" field(s). Governor registration might be 
one place for it to provide the data to the framework, and the framework 
can later fill it in itself when switching governors.

Thanks,
Saravana

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-01 20:24             ` Saravana Kannan
@ 2016-02-01 21:00               ` Rafael J. Wysocki
  2016-02-02  6:36                 ` Viresh Kumar
  2016-02-02  6:34               ` Viresh Kumar
  1 sibling, 1 reply; 110+ messages in thread
From: Rafael J. Wysocki @ 2016-02-01 21:00 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Viresh Kumar, Rafael J. Wysocki, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On Mon, Feb 1, 2016 at 9:24 PM, Saravana Kannan <skannan@codeaurora.org> wrote:
> On 02/01/2016 02:22 AM, Rafael J. Wysocki wrote:
>>
>> On Mon, Feb 1, 2016 at 7:09 AM, Viresh Kumar <viresh.kumar@linaro.org>
>> wrote:
>>>
>>> On 30-01-16, 12:49, Rafael J. Wysocki wrote:
>>>>
>>>> On Friday, January 29, 2016 04:33:39 PM Saravana Kannan wrote:
>>>>>
>>>>> AFAIR, the ABBA issue was between the sysfs lock and the policy lock.
>>>
>>>
>>> Yeah, to be precise here it is:
>>>
>>> CPU0 (sysfs read)               CPU1 (exit governor)
>>>
>>> sysfs-read                      set_policy()-> lock policy->rwsem
>>> sysfs-active lock               Remove sysfs files
>>> lock policy->rwsem              sysfs-active lock
>>> Actual read
>>>
>>>>> The fix for that issue should not be dropping the lock around
>>>>> POLICY_EXIT.
>>>>
>>>>
>>>> Right.  Dropping the lock is a mistake (which I overlooked, sadly).
>>>
>>>
>>> I joined the party at around time of 3.10, and we had this problem and
>>> hacky solution then as well. We tried to get rid of it multiple times,
>>> but sadly failed.
>>
>>
>> I kind of like your idea of accessing governor attributes without
>> holding the policy rwsem.
>
>
> I'm not sure whose idea you are referring to. Viresh's (I don't think I saw
> his proposal) or mine.

I meant an idea of Viresh's that he discussed with Preeti Murthy a while
ago (or maybe he just pointed her to a message where it was outlined, I
can't recall ATM).

>> I looked at that code and it seems doable to me.  The problem to solve
>> there would be to ensure that the dbs_data pointer is valid when
>> show/store runs for those attributes.
>>
>> The fact that we make the distinction between global and policy
>> governors in there doesn't really help, but it looks like getting rid
>> of that bit wouldn't be too much effort.  Let me take a deeper look at
>> that.
>>
>
> Anyway, to explain my suggestion better, I'm proposing to make it so that we
> don't have a need for the AB BA locking. The only reason the governor needs
> to even grab the sysfs lock is to add/remove the sysfs attribute files.

I'm not sure what you mean by "the sysfs lock" here?  The policy rwsem
or something else?

> That can be easily achieved if the policy struct has some "gov_attrs"
> field(s) that each governor populates. Then the framework just has to create
> them after POLICY_INIT is processed by the governor and remove them before
> POLICY_EXIT is sent to the governor.
>
> That way, we also avoid having to worry about the gov attributes accessed by
> the show/store disappearing while the files are being accessed. Since we
> remove those files before we even ask the gov to clean up, that situation
> can never happen.
>
> The current problem is that there is no good place for the governor to
> populate this "gov_attrs" field(s). Governor registration might be one
> place for it to provide the data to the framework, and the framework can
> later fill it in itself when switching governors.

Well, as I said, let me see what can be done to avoid holding the
policy rwsem around governor attributes access.

Thanks,
Rafael
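
The "gov_attrs" ordering proposed in the quoted text can be sketched as a
small userspace model. This is only an illustration under stated
assumptions: struct gov_model, struct policy_model and the framework_*()
helpers are hypothetical names, not cpufreq API, and the "files" are just
an array of attribute names rather than real sysfs entries. The point it
shows is the ordering: files appear only after the governor has been
initialized, and disappear before it is torn down.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILES 8

/* Hypothetical governor descriptor: attribute names come with the governor. */
struct gov_model {
	const char *name;
	const char *const *gov_attrs;	/* NULL-terminated list of names */
	int initialized;		/* set between POLICY_INIT and POLICY_EXIT */
};

/* Hypothetical policy: remembers which attribute files currently exist. */
struct policy_model {
	struct gov_model *gov;
	const char *files[MAX_FILES];
	size_t nr_files;
};

/* Framework side: initialize the governor first, then create the files. */
static int framework_start_governor(struct policy_model *p, struct gov_model *g)
{
	size_t n = 0, i;

	while (g->gov_attrs[n])
		n++;
	if (n > MAX_FILES)
		return -1;

	g->initialized = 1;		/* stands in for POLICY_INIT */
	for (i = 0; i < n; i++)
		p->files[i] = g->gov_attrs[i];
	p->nr_files = n;
	p->gov = g;
	return 0;
}

/* Framework side: remove the files first, then tear the governor down. */
static void framework_stop_governor(struct policy_model *p)
{
	assert(p->gov != NULL);
	p->nr_files = 0;		/* files go away first */
	p->gov->initialized = 0;	/* stands in for POLICY_EXIT */
	p->gov = NULL;
}

/* A show()/store() can only be reached while its file exists. */
static int attr_readable(const struct policy_model *p, const char *name)
{
	size_t i;

	for (i = 0; i < p->nr_files; i++)
		if (strcmp(p->files[i], name) == 0)
			return p->gov != NULL && p->gov->initialized;
	return 0;
}
```

In this model an attribute is reachable only while the governor data behind
it is live, which is the property the proposal is after. It says nothing
about which locks the real framework would hold while doing so.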


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-01 20:24             ` Saravana Kannan
  2016-02-01 21:00               ` Rafael J. Wysocki
@ 2016-02-02  6:34               ` Viresh Kumar
  2016-02-02 21:37                 ` Saravana Kannan
  1 sibling, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-02-02  6:34 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On 01-02-16, 12:24, Saravana Kannan wrote:
> On 02/01/2016 02:22 AM, Rafael J. Wysocki wrote:
> I'm not sure whose idea you are referring to. Viresh's (I don't think I saw
> his proposal) or mine.

http://git.linaro.org/people/viresh.kumar/linux.git/commit/57714d5b1778f2f610bcc5c74d85b29ba1cc1995

> Anyway, to explain my suggestion better, I'm proposing to make it so that we
> don't have a need for the AB BA locking. The only reason the governor needs
> to even grab the sysfs lock is to add/remove the sysfs attribute files.
> 
> That can be easily achieved if the policy struct has some "gov_attrs"
> field(s) that each governor populates. Then the framework just has to create
> them after POLICY_INIT is processed by the governor and remove them before
> POLICY_EXIT is sent to the governor.

What will that solve? It will stay exactly the same then as well, as we
would be adding/removing these attributes while holding the same
policy->rwsem ..

> That way, we also avoid having to worry about the gov attributes accessed by
> the show/store disappearing while the files are being accessed.

It can't happen. The s_active lock should be taking care of that,
shouldn't it?

-- 
viresh


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-01 21:00               ` Rafael J. Wysocki
@ 2016-02-02  6:36                 ` Viresh Kumar
  2016-02-02 21:38                   ` Saravana Kannan
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-02-02  6:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Saravana Kannan, Rafael J. Wysocki, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On 01-02-16, 22:00, Rafael J. Wysocki wrote:
> I'm not sure what you mean by "the sysfs lock" here?  The policy rwsem
> or something else?

He perhaps referred to the s_active.lock that we see in traces.

-- 
viresh


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-02  6:34               ` Viresh Kumar
@ 2016-02-02 21:37                 ` Saravana Kannan
  2016-02-03  2:13                   ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Saravana Kannan @ 2016-02-02 21:37 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On 02/01/2016 10:34 PM, Viresh Kumar wrote:
> On 01-02-16, 12:24, Saravana Kannan wrote:
>> On 02/01/2016 02:22 AM, Rafael J. Wysocki wrote:
>> I'm not sure whose idea you are referring to. Viresh's (I don't think I saw
>> his proposal) or mine.
>
> http://git.linaro.org/people/viresh.kumar/linux.git/commit/57714d5b1778f2f610bcc5c74d85b29ba1cc1995
>
>> Anyway, to explain my suggestion better, I'm proposing to make it so that we
>> don't have a need for the AB BA locking. The only reason the governor needs
>> to even grab the sysfs lock is to add/remove the sysfs attribute files.
>>
>> That can be easily achieved if the policy struct has some "gov_attrs"
>> field(s) that each governor populates. Then the framework just has to create
>> them after POLICY_INIT is processed by the governor and remove them before
>> POLICY_EXIT is sent to the governor.
>
> What will that solve? It will stay exactly the same then as well, as we
> would be adding/removing these attributes while holding the same
> policy->rwsem ..

The problem isn't that you are holding the policy rwsem. The problem is 
that we are trying to grab the same locks in different order. This is 
trying to fix that.
>
>> That way, we also avoid having to worry about the gov attributes accessed by
>> the show/store disappearing while the files are being accessed.
>
> It can't happen. The s_active lock should be taking care of that,
> shouldn't it?

You are right. That can't happen because we have the s_active lock. What
I meant is that, in general, we would no longer have to worry about a
show/store that needs some policy-specific data within the governor to
be valid racing with a governor change that invalidates it. Releasing
the policy rwsem across POLICY_EXIT allows that race to happen today.

-Saravana


-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-02  6:36                 ` Viresh Kumar
@ 2016-02-02 21:38                   ` Saravana Kannan
  0 siblings, 0 replies; 110+ messages in thread
From: Saravana Kannan @ 2016-02-02 21:38 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On 02/01/2016 10:36 PM, Viresh Kumar wrote:
> On 01-02-16, 22:00, Rafael J. Wysocki wrote:
>> I'm not sure what you mean by "the sysfs lock" here?  The policy rwsem
>> or something else?
>
> He perhaps referred to the s_active.lock that we see in traces.
>

Yeah, that's what I mean. I generally don't use the exact name of the
lock in emails (too lazy to look it up) if there isn't much chance of
mistaking it for another lock.

-Saravana

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-02 21:37                 ` Saravana Kannan
@ 2016-02-03  2:13                   ` Viresh Kumar
  2016-02-03  4:04                     ` Saravana Kannan
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-02-03  2:13 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On 02-02-16, 13:37, Saravana Kannan wrote:
> On 02/01/2016 10:34 PM, Viresh Kumar wrote:
> >What will that solve? It will stay exactly the same then as well, as we
> >would be adding/removing these attributes while holding the same
> >policy->rwsem ..
> 
> The problem isn't that you are holding the policy rwsem. The problem is that
> we are trying to grab the same locks in different order. This is trying to
> fix that.

That's exactly what I was trying to say, sorry for not being very
clear.

Even if you moved the sysfs file creation into the cpufreq core instead
of the governor, the locks would still be taken this way:

CPU0                            CPU1
(sysfs read)                    (sysfs dir remove)
s_active lock                   policy->rwsem
policy->rwsem
                                s_active lock (hang)


And so I said, nothing will change.

-- 
viresh
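
The inversion in the diagram above can be checked mechanically with a very
small lock-order tracker, loosely in the spirit of what lockdep reports.
This is a hedged userspace sketch, not kernel code: enum lock_id, struct
order_tracker and record_path() are hypothetical names, and s_active is
modelled as a plain lock for simplicity. Each path is given as the ordered
list of locks it acquires and keeps holding.

```c
#include <assert.h>
#include <stdbool.h>

enum lock_id { S_ACTIVE, POLICY_RWSEM, NR_LOCKS };

/* seen[a][b] becomes true once some path has acquired b while holding a. */
struct order_tracker {
	bool seen[NR_LOCKS][NR_LOCKS];
};

/*
 * Record one code path, given as the locks it acquires in order (all held
 * until the end of the path). Returns true if the path acquires any pair of
 * locks in the opposite order from a previously recorded path, i.e. an
 * ABBA inversion.
 */
static bool record_path(struct order_tracker *t, const enum lock_id *locks, int n)
{
	bool inversion = false;
	int i, j;

	for (i = 0; i < n; i++) {
		for (j = i + 1; j < n; j++) {
			assert(locks[i] < NR_LOCKS && locks[j] < NR_LOCKS);
			if (locks[i] == locks[j])
				continue;	/* same lock twice: not an ordering */
			if (t->seen[locks[j]][locks[i]])
				inversion = true;
			t->seen[locks[i]][locks[j]] = true;
		}
	}
	return inversion;
}
```

Recording the sysfs read path (s_active, then policy->rwsem) followed by
the directory removal path (policy->rwsem, then s_active) makes the second
call report an inversion, matching the two columns of the diagram.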


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-03  2:13                   ` Viresh Kumar
@ 2016-02-03  4:04                     ` Saravana Kannan
  2016-02-03  5:02                       ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Saravana Kannan @ 2016-02-03  4:04 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On 02/02/2016 06:13 PM, Viresh Kumar wrote:
> On 02-02-16, 13:37, Saravana Kannan wrote:
>> On 02/01/2016 10:34 PM, Viresh Kumar wrote:
>>> What will that solve? It will stay exactly the same then as well, as we
>>> would be adding/removing these attributes while holding the same
>>> policy->rwsem ..
>>
>> The problem isn't that you are holding the policy rwsem. The problem is that
>> we are trying to grab the same locks in different order. This is trying to
>> fix that.
>
> That's exactly what I was trying to say, sorry for not being very
> clear.
>
> Even if you moved the sysfs file creation into the cpufreq core instead
> of the governor, the locks would still be taken this way:
>
> CPU0                            CPU1
> (sysfs read)                    (sysfs dir remove)
> s_active lock                   policy->rwsem
> policy->rwsem
>                                  s_active lock (hang)
>
>
> And so I said, nothing will change.
>

What's the s_active lock in CPU1 coming from? The only reason it's there
today is because of the sysfs dir remove. If you move that removal before
taking policy->rwsem, you won't take s_active under policy->rwsem either.
So, I think it will fix the issue.

-Saravana

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
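
The reordering argued for above can be illustrated by modelling each path
as a sequence of acquire/release steps and asking whether s_active is ever
taken while policy->rwsem is held. This is a simplified sketch under
assumptions: struct step, nests() and the three example paths are
hypothetical, and s_active is treated as an ordinary lock even though in
sysfs it behaves more like a reference that removal has to wait for.

```c
#include <assert.h>
#include <stdbool.h>

enum lock_id { S_ACTIVE, POLICY_RWSEM, NR_LOCKS };

struct step {
	enum lock_id lock;
	bool acquire;		/* true = acquire, false = release */
};

/* Returns true if the path ever acquires 'inner' while 'outer' is held. */
static bool nests(const struct step *path, int n,
		  enum lock_id outer, enum lock_id inner)
{
	bool held[NR_LOCKS] = { false };
	int i;

	for (i = 0; i < n; i++) {
		if (path[i].acquire && path[i].lock == inner && held[outer])
			return true;
		held[path[i].lock] = path[i].acquire;
	}
	return false;
}

/* sysfs read: s_active first, then policy->rwsem inside it. */
static const struct step sysfs_read[] = {
	{ S_ACTIVE, true }, { POLICY_RWSEM, true },
	{ POLICY_RWSEM, false }, { S_ACTIVE, false },
};

/* File removal done while policy->rwsem is held (the diagram's CPU1). */
static const struct step remove_under_rwsem[] = {
	{ POLICY_RWSEM, true }, { S_ACTIVE, true },
	{ S_ACTIVE, false }, { POLICY_RWSEM, false },
};

/* File removal moved before policy->rwsem is taken (the proposal). */
static const struct step remove_before_rwsem[] = {
	{ S_ACTIVE, true }, { S_ACTIVE, false },
	{ POLICY_RWSEM, true }, { POLICY_RWSEM, false },
};
```

With these paths, only remove_under_rwsem nests s_active inside
policy->rwsem, so only that variant conflicts with the read path's
ordering. Whether the removal can really be hoisted that early in
cpufreq_set_policy() is exactly what the thread goes on to debate.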


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-03  4:04                     ` Saravana Kannan
@ 2016-02-03  5:02                       ` Viresh Kumar
  2016-02-03  5:06                         ` Saravana Kannan
  0 siblings, 1 reply; 110+ messages in thread
From: Viresh Kumar @ 2016-02-03  5:02 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On 02-02-16, 20:04, Saravana Kannan wrote:
> What's the s_active lock in CPU1 coming from?

That's taken by sysfs core while removing the files.

> The only reason it's there
> today is because of the sysfs dir remove. If you move that removal before
> taking policy->rwsem, you won't take s_active under policy->rwsem either.
> So, I think it will fix the issue.

It's complex and we will end up making it ugly..

For example, EXIT can be called while switching governors. The
policy->rwsem is taken at the beginning of cpufreq_set_policy().
Deciding whether to remove the governor sysfs directory that early in
the call (i.e. before taking the rwsem) is going to be difficult.

On top of that, the same directory might be shared across multiple
policies, and all that information is present only in the governor-core.

-- 
viresh


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-03  5:02                       ` Viresh Kumar
@ 2016-02-03  5:06                         ` Saravana Kannan
  2016-02-03  6:59                           ` Viresh Kumar
  0 siblings, 1 reply; 110+ messages in thread
From: Saravana Kannan @ 2016-02-03  5:06 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On 02/02/2016 09:02 PM, Viresh Kumar wrote:
> On 02-02-16, 20:04, Saravana Kannan wrote:
>> What's the s_active lock in CPU1 coming from?
>
> That's taken by sysfs core while removing the files.
>
>> The only reason it's there
>> today is because of the sysfs dir remove. If you move that removal before
>> taking policy->rwsem, you won't take s_active under policy->rwsem either.
>> So, I think it will fix the issue.
>
> It's complex and we will end up making it ugly..

I disagree. I think it's way better and simpler than this patch set. It 
also doesn't tie into cpufreq_governor.* which is a good thing IMO since 
it keeps things simpler for sched-dvfs too.

> For example, EXIT can be called while switching governors. The
> policy->rwsem is taken at the beginning of cpufreq_set_policy().
> Deciding whether to remove the governor sysfs directory that early in
> the call (i.e. before taking the rwsem) is going to be difficult.

Just check if the governor is changing. And if it is, you just need to 
remove the policy specific stuff.

> On top of that, the same directory might be shared across multiple
> policies, and all that information is present only in the governor-core.

That's why I said the gov needs to register the per-policy and
system-wide attr lists separately.

This will also remove the need for every governor to do this crap.

-Saravana

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
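
One way to read the "register the lists separately" suggestion above is
sketched below. It is an illustration only: struct gov_reg,
attrs_to_create() and attrs_to_remove() are hypothetical, and the user
count that decides when the shared, system-wide attributes come and go is
an assumption added here, not something spelled out in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum attr_set {
	ATTRS_NONE   = 0,
	ATTRS_POLICY = 1,	/* per-policy attribute files */
	ATTRS_GLOBAL = 2,	/* system-wide files shared across policies */
};

/* Hypothetical registration record: two separate attribute lists. */
struct gov_reg {
	const char *name;
	const char *const *policy_attrs;	/* NULL-terminated, may be NULL */
	const char *const *global_attrs;	/* NULL-terminated, may be NULL */
	int users;				/* policies currently using this governor */
};

/* Which attribute sets the framework would create when a policy starts using g. */
static unsigned int attrs_to_create(struct gov_reg *g)
{
	unsigned int what = ATTRS_POLICY;

	if (g->users++ == 0)
		what |= ATTRS_GLOBAL;	/* first user creates the shared files */
	return what;
}

/*
 * Which attribute sets the framework would remove. Nothing is removed
 * unless the governor is actually changing for this policy.
 */
static unsigned int attrs_to_remove(struct gov_reg *g, bool governor_changing)
{
	unsigned int what;

	if (!governor_changing)
		return ATTRS_NONE;

	assert(g->users > 0);
	what = ATTRS_POLICY;
	if (--g->users == 0)
		what |= ATTRS_GLOBAL;	/* last user removes the shared files */
	return what;
}
```

Under these assumptions the framework can make the decision before taking
any policy lock, because it depends only on whether the governor is
changing and on how many policies still use it.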


* Re: [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor
  2016-02-03  5:06                         ` Saravana Kannan
@ 2016-02-03  6:59                           ` Viresh Kumar
  0 siblings, 0 replies; 110+ messages in thread
From: Viresh Kumar @ 2016-02-03  6:59 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Juri Lelli,
	Linux Kernel Mailing List, linux-pm, Peter Zijlstra,
	Michael Turquette, Steve Muckle, Vincent Guittot,
	Morten Rasmussen, dietmar.eggemann

On 02-02-16, 21:06, Saravana Kannan wrote:
> I disagree. I think it's way better and simpler than this patch set. It also
> doesn't tie into cpufreq_governor.* which is a good thing IMO since it keeps
> things simpler for sched-dvfs too.

Let's discuss it further on the other thread ..

-- 
viresh

Thread overview: 110+ messages
2016-01-11 17:35 [RFC PATCH 00/19] cpufreq locking cleanups and documentation Juri Lelli
2016-01-11 17:35 ` [RFC PATCH 01/19] cpufreq: do not expose cpufreq_governor_lock Juri Lelli
2016-01-12  8:56   ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 02/19] cpufreq: merge governor lock and mutex Juri Lelli
2016-01-12  9:00   ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 03/19] cpufreq: kill for_each_policy Juri Lelli
2016-01-12  9:01   ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 04/19] cpufreq: bring data structures close to their locks Juri Lelli
2016-01-11 22:05   ` Peter Zijlstra
2016-01-11 23:03     ` Rafael J. Wysocki
2016-01-12  8:27       ` Peter Zijlstra
2016-01-12 10:43         ` Juri Lelli
2016-01-12 16:47         ` Rafael J. Wysocki
2016-01-11 22:07   ` Peter Zijlstra
2016-01-12  9:27     ` Viresh Kumar
2016-01-12 11:21       ` Juri Lelli
2016-01-12 11:58         ` Peter Zijlstra
2016-01-12 12:36           ` Juri Lelli
2016-01-12 15:26             ` Juri Lelli
2016-01-12 15:58               ` Peter Zijlstra
2016-01-12  9:10   ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 05/19] cpufreq: assert locking when accessing cpufreq_policy_list Juri Lelli
2016-01-12  9:34   ` Viresh Kumar
2016-01-12 11:44     ` Juri Lelli
2016-01-13  5:59       ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 06/19] cpufreq: always access cpufreq_policy_list while holding cpufreq_driver_lock Juri Lelli
2016-01-12  9:57   ` Viresh Kumar
2016-01-12 12:08     ` Juri Lelli
2016-01-13  6:01       ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 07/19] cpufreq: assert locking when accessing cpufreq_governor_list Juri Lelli
2016-01-12 10:01   ` Viresh Kumar
2016-01-12 15:33     ` Juri Lelli
2016-01-11 17:35 ` [RFC PATCH 08/19] cpufreq: fix warning for cpufreq_init_policy unlocked access to cpufreq_governor_list Juri Lelli
2016-01-12 10:09   ` Viresh Kumar
2016-01-12 15:52     ` Juri Lelli
2016-01-13  6:07       ` Viresh Kumar
2016-01-14 16:35         ` Juri Lelli
2016-01-18  5:23           ` Viresh Kumar
2016-01-18 15:19             ` Juri Lelli
2016-01-11 17:35 ` [RFC PATCH 09/19] cpufreq: fix warning for show_scaling_available_governors " Juri Lelli
2016-01-12 10:13   ` Viresh Kumar
2016-01-13 10:25     ` Juri Lelli
2016-01-13 10:32       ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 10/19] cpufreq: assert policy->rwsem is held in cpufreq_set_policy Juri Lelli
2016-01-12 10:15   ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 11/19] cpufreq: assert policy->rwsem is held in __cpufreq_governor Juri Lelli
2016-01-12 10:20   ` Viresh Kumar
2016-01-30  0:33     ` Saravana Kannan
2016-01-30 11:49       ` Rafael J. Wysocki
2016-02-01  6:09         ` Viresh Kumar
2016-02-01 10:22           ` Rafael J. Wysocki
2016-02-01 20:24             ` Saravana Kannan
2016-02-01 21:00               ` Rafael J. Wysocki
2016-02-02  6:36                 ` Viresh Kumar
2016-02-02 21:38                   ` Saravana Kannan
2016-02-02  6:34               ` Viresh Kumar
2016-02-02 21:37                 ` Saravana Kannan
2016-02-03  2:13                   ` Viresh Kumar
2016-02-03  4:04                     ` Saravana Kannan
2016-02-03  5:02                       ` Viresh Kumar
2016-02-03  5:06                         ` Saravana Kannan
2016-02-03  6:59                           ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 12/19] cpufreq: fix locking of policy->rwsem in cpufreq_init_policy Juri Lelli
2016-01-12 10:39   ` Viresh Kumar
2016-01-14 17:58     ` Juri Lelli
2016-01-11 17:35 ` [RFC PATCH 13/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_prepare Juri Lelli
2016-01-12 10:54   ` Viresh Kumar
2016-01-15 12:37     ` Juri Lelli
2016-01-11 17:35 ` [RFC PATCH 14/19] cpufreq: fix locking of policy->rwsem in cpufreq_offline_finish Juri Lelli
2016-01-12 11:02   ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 15/19] cpufreq: remove useless usage of cpufreq_governor_mutex in __cpufreq_governor Juri Lelli
2016-01-12 11:06   ` Viresh Kumar
2016-01-15 16:30     ` Juri Lelli
2016-01-18  5:50       ` Viresh Kumar
2016-01-19 16:49         ` Juri Lelli
2016-01-20  7:29           ` Viresh Kumar
2016-01-20 10:17             ` Juri Lelli
2016-01-20 10:18               ` Viresh Kumar
2016-01-20 10:27                 ` Juri Lelli
2016-01-20 10:30                   ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 16/19] cpufreq: hold policy->rwsem across CPUFREQ_GOV_POLICY_EXIT Juri Lelli
2016-01-12 11:09   ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 17/19] cpufreq: stop checking for cpufreq_driver being present in cpufreq_cpu_get Juri Lelli
2016-01-12 11:17   ` Viresh Kumar
2016-01-11 17:35 ` [RFC PATCH 18/19] cpufreq: remove transition_lock Juri Lelli
2016-01-12 11:24   ` Viresh Kumar
2016-01-13  0:54     ` Michael Turquette
2016-01-13  6:31       ` Viresh Kumar
     [not found]         ` <20160113182131.1168.45753@quark.deferred.io>
2016-01-14  9:44           ` Juri Lelli
2016-01-14 10:32           ` Viresh Kumar
2016-01-14 13:52             ` Juri Lelli
2016-01-18  5:09               ` Viresh Kumar
2016-01-19 14:00           ` Peter Zijlstra
2016-01-19 14:42             ` Juri Lelli
2016-01-19 15:30               ` Peter Zijlstra
2016-01-19 16:01                 ` Juri Lelli
2016-01-19 19:17                   ` Peter Zijlstra
2016-01-19 19:21                     ` Peter Zijlstra
2016-01-19 21:52                       ` Rafael J. Wysocki
2016-01-20 17:04                         ` Peter Zijlstra
2016-01-20 22:12                           ` Rafael J. Wysocki
2016-01-20 22:38                             ` Peter Zijlstra
2016-01-20 23:33                               ` Rafael J. Wysocki
2016-01-20 12:59                       ` Juri Lelli
2016-01-11 17:36 ` [RFC PATCH 19/19] cpufreq: documentation: document locking scheme Juri Lelli
2016-01-11 22:45 ` [RFC PATCH 00/19] cpufreq locking cleanups and documentation Rafael J. Wysocki
2016-01-12 10:46   ` Juri Lelli
2016-01-30  0:57 ` Saravana Kannan
2016-02-01  6:02   ` Viresh Kumar
2016-02-01 12:06   ` Juri Lelli
