* [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking
@ 2016-12-07 10:37 Viresh Kumar
  2016-12-07 10:37 ` [PATCH 01/12] PM / OPP: Add per OPP table mutex Viresh Kumar
                   ` (12 more replies)
  0 siblings, 13 replies; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC
  To: Rafael Wysocki
  Cc: linaro-kernel, linux-pm, linux-kernel, Stephen Boyd,
	Nishanth Menon, Vincent Guittot, Viresh Kumar

Hi Rafael,

You can pretty much ignore this series until all other OPP cleanup/fixes
get merged. I am posting these to get early reviews from Stephen as
these patches have been lying with me for almost a week now. And I am
also _not_ pushing these for 4.10-rc1. It all depends on how the reviews
go.

RCU locking isn't well suited for the OPP core. RCU fits reader-heavy
code best, while the OPP core has at most one or two readers at a
time.

On top of that, the way RCU locking was used within the OPP core was
getting very confusing. The individual OPPs are mostly well handled,
i.e. for an update a new structure was created and then that replaced
the older one. But the OPP tables were updated directly all the time
from various parts of the core. Though they were mostly used from within
RCU locked regions, they didn't have much to do with RCU and were
governed by the mutex instead.

And that mixed with the 'opp_table_lock' has made the core even more
confusing.

Similar concerns were shared by Stephen Boyd earlier [1].

This patchset simplifies the locking in the OPP core to a great extent
using the kernel reference counting mechanism (kref) along with a per
OPP table mutex. And finally it gets rid of RCU locking as well.

Each and every patch of this series is individually:
- build tested
- boot tested with cpufreq-dt.ko module. Insmod and rmmod to make sure
  the OPPs and the OPP tables are getting freed.

More testing was also done by various build and boot bots over the last
few days. They reported lots of issues (both build and boot time) that
helped make this series more robust:
- Kernel CI (Linaro)
- Fengguang Wu's bot (Intel)

This series has a few dependencies though. It is rebased over:
  pm/bleeding-edge
  + OPP cleanup series [2]
  + a few devfreq fixes [3], [4], and [5]
  + a recent revert [6]

Though all of those shall get merged before we end up reviewing this
series.

--
viresh

[1] https://marc.info/?l=linux-kernel&m=147742717527548&w=2
[2] https://marc.info/?l=linux-pm&m=148108573618896&w=2
[3] https://patchwork.kernel.org/patch/9455789/
[4] https://patchwork.kernel.org/patch/9455757/
[5] https://marc.info/?l=linux-pm&m=148090824301852&w=2
[6] https://marc.info/?l=linux-kernel&m=148110674223377&w=2

Viresh Kumar (12):
  PM / OPP: Add per OPP table mutex
  PM / OPP: Add 'struct kref' to OPP table
  PM / OPP: Return opp_table from dev_pm_opp_set_*() routines
  PM / OPP: Take reference of the OPP table while adding/removing OPPs
  PM / OPP: Use dev_pm_opp_get_opp_table() instead of _add_opp_table()
  PM / OPP: Add 'struct kref' to struct dev_pm_opp
  PM / OPP: Update OPP users to put reference
  PM / OPP: Take kref from _find_opp_table()
  PM / OPP: Move away from RCU locking
  PM / OPP: Simplify _opp_set_availability()
  PM / OPP: Simplify dev_pm_opp_get_max_volt_latency()
  PM / OPP: Update Documentation to remove RCU specific bits

 Documentation/power/opp.txt          |  47 +-
 arch/arm/mach-omap2/pm.c             |   5 +-
 drivers/base/power/opp/core.c        | 888 +++++++++++------------------------
 drivers/base/power/opp/cpu.c         |  66 +--
 drivers/base/power/opp/of.c          |  94 +---
 drivers/base/power/opp/opp.h         |  31 +-
 drivers/clk/tegra/clk-dfll.c         |  17 +-
 drivers/cpufreq/exynos5440-cpufreq.c |   5 +-
 drivers/cpufreq/imx6q-cpufreq.c      |  10 +-
 drivers/cpufreq/mt8173-cpufreq.c     |   8 +-
 drivers/cpufreq/omap-cpufreq.c       |   4 +-
 drivers/cpufreq/sti-cpufreq.c        |  13 +-
 drivers/devfreq/devfreq.c            |  14 +-
 drivers/devfreq/exynos-bus.c         |  14 +-
 drivers/devfreq/governor_passive.c   |   4 +-
 drivers/devfreq/rk3399_dmc.c         |  16 +-
 drivers/devfreq/tegra-devfreq.c      |   4 +-
 drivers/thermal/cpu_cooling.c        |  11 +-
 drivers/thermal/devfreq_cooling.c    |  14 +-
 include/linux/pm_opp.h               |  48 +-
 20 files changed, 429 insertions(+), 884 deletions(-)

-- 
2.7.1.410.g6faf27b


* [PATCH 01/12] PM / OPP: Add per OPP table mutex
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
@ 2016-12-07 10:37 ` Viresh Kumar
  2017-01-09 23:11   ` Stephen Boyd
  2016-12-07 10:37 ` [PATCH 02/12] PM / OPP: Add 'struct kref' to OPP table Viresh Kumar
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC
  To: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

Add a per OPP table lock to protect opp_table->opp_list.

Note that in a few places opp_list is used under rcu_read_lock(), and
so a mutex can't be taken there for now. This will be fixed by a later
patch.
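
For readers unfamiliar with the pattern, here is a minimal userspace
sketch of a per-table lock guarding a sorted list. It is not the kernel
code: struct list_table, struct list_opp and the list_*() helpers are
hypothetical names, and pthread_mutex_t stands in for struct mutex. It
only illustrates that every walk or update of the list happens between
lock and unlock, including the duplicate-rate error path.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct dev_pm_opp and struct opp_table. */
struct list_opp {
	unsigned long rate;
	struct list_opp *next;
};

struct list_table {
	pthread_mutex_t lock;		/* protects opp_list */
	struct list_opp *opp_list;	/* sorted by ascending rate */
};

void list_table_init(struct list_table *table)
{
	assert(table != NULL);
	pthread_mutex_init(&table->lock, NULL);
	table->opp_list = NULL;
}

/* Insert in ascending order; returns 0, -EEXIST or -ENOMEM. */
int list_opp_add(struct list_table *table, unsigned long rate)
{
	struct list_opp **pos, *new_opp;
	int ret = 0;

	new_opp = malloc(sizeof(*new_opp));
	if (!new_opp)
		return -ENOMEM;
	new_opp->rate = rate;

	pthread_mutex_lock(&table->lock);
	for (pos = &table->opp_list; *pos; pos = &(*pos)->next) {
		if ((*pos)->rate == rate) {
			/* Duplicate: unlock before returning the error. */
			ret = -EEXIST;
			goto unlock;
		}
		if ((*pos)->rate > rate)
			break;
	}
	new_opp->next = *pos;
	*pos = new_opp;
unlock:
	pthread_mutex_unlock(&table->lock);

	if (ret)
		free(new_opp);
	return ret;
}

/* Unlink and free the entry with the given rate; returns 0 or -ENOENT. */
int list_opp_remove(struct list_table *table, unsigned long rate)
{
	struct list_opp **pos, *victim = NULL;

	pthread_mutex_lock(&table->lock);
	for (pos = &table->opp_list; *pos; pos = &(*pos)->next) {
		if ((*pos)->rate == rate) {
			victim = *pos;
			*pos = victim->next;
			break;
		}
	}
	pthread_mutex_unlock(&table->lock);

	if (!victim)
		return -ENOENT;
	free(victim);
	return 0;
}
```

The sketch frees the entry right after unlinking it, which is only safe
because nothing reads the list outside the lock. The patch below still
has RCU readers, which is why it keeps list_del_rcu() and call_srcu().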

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 31 +++++++++++++++++++++++++++----
 drivers/base/power/opp/opp.h  |  2 ++
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 6af371a55062..212b7dbecae2 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -854,6 +854,7 @@ static struct opp_table *_allocate_opp_table(struct device *dev)
 
 	srcu_init_notifier_head(&opp_table->srcu_head);
 	INIT_LIST_HEAD(&opp_table->opp_list);
+	mutex_init(&opp_table->lock);
 
 	/* Secure the device table modification */
 	list_add_rcu(&opp_table->node, &opp_tables);
@@ -909,6 +910,7 @@ static void _free_opp_table(struct opp_table *opp_table)
 	/* dev_list must be empty now */
 	WARN_ON(!list_empty(&opp_table->dev_list));
 
+	mutex_destroy(&opp_table->lock);
 	list_del_rcu(&opp_table->node);
 	call_srcu(&opp_table->srcu_head.srcu, &opp_table->rcu_head,
 		  _kfree_device_rcu);
@@ -969,6 +971,8 @@ static void _kfree_opp_rcu(struct rcu_head *head)
  */
 static void _opp_remove(struct opp_table *opp_table, struct dev_pm_opp *opp)
 {
+	mutex_lock(&opp_table->lock);
+
 	/*
 	 * Notify the changes in the availability of the operable
 	 * frequency/voltage list.
@@ -978,6 +982,8 @@ static void _opp_remove(struct opp_table *opp_table, struct dev_pm_opp *opp)
 	list_del_rcu(&opp->node);
 	call_srcu(&opp_table->srcu_head.srcu, &opp->rcu_head, _kfree_opp_rcu);
 
+	mutex_unlock(&opp_table->lock);
+
 	_remove_opp_table(opp_table);
 }
 
@@ -1007,6 +1013,8 @@ void dev_pm_opp_remove(struct device *dev, unsigned long freq)
 	if (IS_ERR(opp_table))
 		goto unlock;
 
+	mutex_lock(&opp_table->lock);
+
 	list_for_each_entry(opp, &opp_table->opp_list, node) {
 		if (opp->rate == freq) {
 			found = true;
@@ -1014,6 +1022,8 @@ void dev_pm_opp_remove(struct device *dev, unsigned long freq)
 		}
 	}
 
+	mutex_unlock(&opp_table->lock);
+
 	if (!found) {
 		dev_warn(dev, "%s: Couldn't find OPP with freq: %lu\n",
 			 __func__, freq);
@@ -1084,7 +1094,7 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp,
 	     struct opp_table *opp_table)
 {
 	struct dev_pm_opp *opp;
-	struct list_head *head = &opp_table->opp_list;
+	struct list_head *head;
 	int ret;
 
 	/*
@@ -1095,6 +1105,9 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp,
 	 * loop, don't replace it with head otherwise it will become an infinite
 	 * loop.
 	 */
+	mutex_lock(&opp_table->lock);
+	head = &opp_table->opp_list;
+
 	list_for_each_entry_rcu(opp, &opp_table->opp_list, node) {
 		if (new_opp->rate > opp->rate) {
 			head = &opp->node;
@@ -1111,12 +1124,17 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp,
 			 new_opp->supplies[0].u_volt, new_opp->available);
 
 		/* Should we compare voltages for all regulators here ? */
-		return opp->available &&
-		       new_opp->supplies[0].u_volt == opp->supplies[0].u_volt ? -EBUSY : -EEXIST;
+		ret = opp->available &&
+		      new_opp->supplies[0].u_volt == opp->supplies[0].u_volt ? -EBUSY : -EEXIST;
+
+		mutex_unlock(&opp_table->lock);
+		return ret;
 	}
 
-	new_opp->opp_table = opp_table;
 	list_add_rcu(&new_opp->node, head);
+	mutex_unlock(&opp_table->lock);
+
+	new_opp->opp_table = opp_table;
 
 	ret = opp_debug_create_one(new_opp, opp_table);
 	if (ret)
@@ -1780,6 +1798,8 @@ static int _opp_set_availability(struct device *dev, unsigned long freq,
 		goto unlock;
 	}
 
+	mutex_lock(&opp_table->lock);
+
 	/* Do we have the frequency? */
 	list_for_each_entry(tmp_opp, &opp_table->opp_list, node) {
 		if (tmp_opp->rate == freq) {
@@ -1787,6 +1807,9 @@ static int _opp_set_availability(struct device *dev, unsigned long freq,
 			break;
 		}
 	}
+
+	mutex_unlock(&opp_table->lock);
+
 	if (IS_ERR(opp)) {
 		r = PTR_ERR(opp);
 		goto unlock;
diff --git a/drivers/base/power/opp/opp.h b/drivers/base/power/opp/opp.h
index e32dc80ddc12..81eb6cee7295 100644
--- a/drivers/base/power/opp/opp.h
+++ b/drivers/base/power/opp/opp.h
@@ -131,6 +131,7 @@ enum opp_table_access {
  * @rcu_head:	RCU callback head used for deferred freeing
  * @dev_list:	list of devices that share these OPPs
  * @opp_list:	table of opps
+ * @lock:	mutex protecting the opp_list.
  * @np:		struct device_node pointer for opp's DT node.
  * @clock_latency_ns_max: Max clock latency in nanoseconds.
  * @shared_opp: OPP is shared between multiple devices.
@@ -163,6 +164,7 @@ struct opp_table {
 	struct rcu_head rcu_head;
 	struct list_head dev_list;
 	struct list_head opp_list;
+	struct mutex lock;
 
 	struct device_node *np;
 	unsigned long clock_latency_ns_max;
-- 
2.7.1.410.g6faf27b


* [PATCH 02/12] PM / OPP: Add 'struct kref' to OPP table
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
  2016-12-07 10:37 ` [PATCH 01/12] PM / OPP: Add per OPP table mutex Viresh Kumar
@ 2016-12-07 10:37 ` Viresh Kumar
  2017-01-09 23:36   ` Stephen Boyd
  2016-12-07 10:37 ` [PATCH 03/12] PM / OPP: Return opp_table from dev_pm_opp_set_*() routines Viresh Kumar
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC
  To: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

Add kref to struct opp_table for easier accounting of the OPP table.

Note that the new routine dev_pm_opp_get_opp_table() takes the reference
from under the opp_table_lock, which guarantees that the OPP table
doesn't get freed unless dev_pm_opp_put_opp_table() is called for the
OPP table.

Two separate release mechanisms are added: locked and unlocked. In the
unlocked version the routines aren't required to take/drop
opp_table_lock as the callers have already done that. This is required
to avoid breaking git bisect, otherwise we may get lockdeps between
commits. Once all the users of the OPP table are updated, the unlocked
version shall be removed.
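
The difference between the two put variants can be sketched in
userspace as below. This is an illustration under assumptions, not the
kernel implementation: struct ref_table, ref_tables_lock and the
ref_table_*() helpers are hypothetical, an atomic_int plays the role of
struct kref, and the locked variant only mimics the idea behind
kref_put_mutex() (take the lock only when the count may reach zero).
Unlike the patch, where the release callback drops opp_table_lock, the
sketch unlocks in the put routine itself to stay short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global opp_table_lock. */
static pthread_mutex_t ref_tables_lock = PTHREAD_MUTEX_INITIALIZER;

struct ref_table {
	atomic_int refcount;	/* plays the role of struct kref */
	bool released;		/* set on release, for illustration only */
};

void ref_table_init(struct ref_table *t)
{
	atomic_init(&t->refcount, 1);
	t->released = false;
}

void ref_table_get(struct ref_table *t)
{
	int old = atomic_fetch_add(&t->refcount, 1);

	/* Taking a reference on a dead object would be a bug. */
	assert(old > 0);
	(void)old;
}

/* Shared release body; the caller must hold ref_tables_lock. */
static void ref_table_release(struct ref_table *t)
{
	t->released = true;
}

/* Unlocked put: the caller already holds ref_tables_lock. */
void ref_table_put_unlocked(struct ref_table *t)
{
	if (atomic_fetch_sub(&t->refcount, 1) == 1)
		ref_table_release(t);
}

/* Locked put: the caller must NOT hold ref_tables_lock. */
void ref_table_put(struct ref_table *t)
{
	int old = atomic_load(&t->refcount);

	/* Fast path: drop a reference that is not the last one. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(&t->refcount, &old, old - 1))
			return;
	}

	/* Slow path: possibly the last reference, serialize with lookups. */
	pthread_mutex_lock(&ref_tables_lock);
	if (atomic_fetch_sub(&t->refcount, 1) == 1)
		ref_table_release(t);
	pthread_mutex_unlock(&ref_tables_lock);
}
```

Calling ref_table_put() while already holding ref_tables_lock would
self-deadlock on the slow path, which is the situation the temporary
unlocked variant avoids for callers that have not been converted yet.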

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 51 +++++++++++++++++++++++++++++++++++++++++--
 drivers/base/power/opp/opp.h  |  3 +++
 include/linux/pm_opp.h        | 10 +++++++++
 3 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 212b7dbecae2..0d9bc89de41a 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -855,6 +855,7 @@ static struct opp_table *_allocate_opp_table(struct device *dev)
 	srcu_init_notifier_head(&opp_table->srcu_head);
 	INIT_LIST_HEAD(&opp_table->opp_list);
 	mutex_init(&opp_table->lock);
+	kref_init(&opp_table->kref);
 
 	/* Secure the device table modification */
 	list_add_rcu(&opp_table->node, &opp_tables);
@@ -894,8 +895,36 @@ static void _kfree_device_rcu(struct rcu_head *head)
 	kfree_rcu(opp_table, rcu_head);
 }
 
-static void _free_opp_table(struct opp_table *opp_table)
+void _get_opp_table_kref(struct opp_table *opp_table)
 {
+	kref_get(&opp_table->kref);
+}
+
+struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
+{
+	struct opp_table *opp_table;
+
+	/* Hold our table modification lock here */
+	mutex_lock(&opp_table_lock);
+
+	opp_table = _find_opp_table(dev);
+	if (!IS_ERR(opp_table)) {
+		_get_opp_table_kref(opp_table);
+		goto unlock;
+	}
+
+	opp_table = _allocate_opp_table(dev);
+
+unlock:
+	mutex_unlock(&opp_table_lock);
+
+	return opp_table;
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_table);
+
+static void _opp_table_kref_release_unlocked(struct kref *kref)
+{
+	struct opp_table *opp_table = container_of(kref, struct opp_table, kref);
 	struct opp_device *opp_dev;
 
 	/* Release clk */
@@ -916,6 +945,24 @@ static void _free_opp_table(struct opp_table *opp_table)
 		  _kfree_device_rcu);
 }
 
+static void dev_pm_opp_put_opp_table_unlocked(struct opp_table *opp_table)
+{
+	kref_put(&opp_table->kref, _opp_table_kref_release_unlocked);
+}
+
+static void _opp_table_kref_release(struct kref *kref)
+{
+	_opp_table_kref_release_unlocked(kref);
+	mutex_unlock(&opp_table_lock);
+}
+
+void dev_pm_opp_put_opp_table(struct opp_table *opp_table)
+{
+	kref_put_mutex(&opp_table->kref, _opp_table_kref_release,
+		       &opp_table_lock);
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_put_opp_table);
+
 /**
  * _remove_opp_table() - Removes a OPP table
  * @opp_table: OPP table to be removed.
@@ -939,7 +986,7 @@ static void _remove_opp_table(struct opp_table *opp_table)
 	if (opp_table->set_opp)
 		return;
 
-	_free_opp_table(opp_table);
+	dev_pm_opp_put_opp_table_unlocked(opp_table);
 }
 
 void _opp_free(struct dev_pm_opp *opp)
diff --git a/drivers/base/power/opp/opp.h b/drivers/base/power/opp/opp.h
index 81eb6cee7295..596f361fbe70 100644
--- a/drivers/base/power/opp/opp.h
+++ b/drivers/base/power/opp/opp.h
@@ -131,6 +131,7 @@ enum opp_table_access {
  * @rcu_head:	RCU callback head used for deferred freeing
  * @dev_list:	list of devices that share these OPPs
  * @opp_list:	table of opps
+ * @kref:	for reference count of the table.
  * @lock:	mutex protecting the opp_list.
  * @np:		struct device_node pointer for opp's DT node.
  * @clock_latency_ns_max: Max clock latency in nanoseconds.
@@ -164,6 +165,7 @@ struct opp_table {
 	struct rcu_head rcu_head;
 	struct list_head dev_list;
 	struct list_head opp_list;
+	struct kref kref;
 	struct mutex lock;
 
 	struct device_node *np;
@@ -192,6 +194,7 @@ struct opp_table {
 };
 
 /* Routines internal to opp core */
+void _get_opp_table_kref(struct opp_table *opp_table);
 struct opp_table *_find_opp_table(struct device *dev);
 struct opp_table *_add_opp_table(struct device *dev);
 struct opp_device *_add_opp_dev(const struct device *dev, struct opp_table *opp_table);
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 66a02deeb03f..d867c6b25f9a 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -78,6 +78,9 @@ struct dev_pm_set_opp_data {
 
 #if defined(CONFIG_PM_OPP)
 
+struct opp_table *dev_pm_opp_get_opp_table(struct device *dev);
+void dev_pm_opp_put_opp_table(struct opp_table *opp_table);
+
 unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp);
 
 unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp);
@@ -126,6 +129,13 @@ int dev_pm_opp_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
 void dev_pm_opp_remove_table(struct device *dev);
 void dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask);
 #else
+static inline struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
+{
+	return ERR_PTR(-ENOTSUPP);
+}
+
+static inline void dev_pm_opp_put_opp_table(struct opp_table *opp_table) {}
+
 static inline unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
 {
 	return 0;
-- 
2.7.1.410.g6faf27b


* [PATCH 03/12] PM / OPP: Return opp_table from dev_pm_opp_set_*() routines
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
  2016-12-07 10:37 ` [PATCH 01/12] PM / OPP: Add per OPP table mutex Viresh Kumar
  2016-12-07 10:37 ` [PATCH 02/12] PM / OPP: Add 'struct kref' to OPP table Viresh Kumar
@ 2016-12-07 10:37 ` Viresh Kumar
  2017-01-09 23:37   ` Stephen Boyd
  2016-12-07 10:37 ` [PATCH 04/12] PM / OPP: Take reference of the OPP table while adding/removing OPPs Viresh Kumar
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC
  To: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Patrice Chotard
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

Now that we have proper kernel reference infrastructure in place for OPP
tables, use it to guarantee that the OPP table isn't freed while being
used by the callers of dev_pm_opp_set_*() APIs.

Make them all return the pointer to the OPP table after taking its
reference and put the reference back with dev_pm_opp_put_*() APIs.

Now that the OPP table won't get freed while these routines are
executing after dev_pm_opp_get_opp_table() is called, there is no need
to take opp_table_lock. Drop those lock/unlock calls as well.

Remove the RCU-specific comments from these routines as they aren't
relevant anymore.

Note that prototypes of dev_pm_opp_{set|put}_regulators() were already
updated by another patch.
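
The calling convention this introduces (set returns a referenced table
or an encoded error, put takes that table back) can be sketched in
userspace as follows. Everything here is hypothetical and single
threaded: the hdl_*() helpers re-implement the idea of ERR_PTR(),
IS_ERR() and PTR_ERR() rather than using the kernel macros, a plain int
replaces the kref, one static table replaces the per-device lookup, and
returning -EBUSY for an already-set name is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HDL_MAX_ERRNO 4095

/* Userspace re-implementations of the error-pointer idea. */
static inline void *hdl_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long hdl_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int hdl_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-HDL_MAX_ERRNO;
}

struct hdl_table {
	int refcount;		/* plain counter instead of a kref */
	char *prop_name;
};

/* Single table for the sketch; no per-device lookup is modelled. */
static struct hdl_table hdl_table_storage;

struct hdl_table *hdl_get_table(void)
{
	hdl_table_storage.refcount++;
	return &hdl_table_storage;
}

void hdl_put_table(struct hdl_table *t)
{
	assert(t->refcount > 0);
	t->refcount--;
}

/* Returns the table with a reference held, or an encoded errno. */
struct hdl_table *hdl_set_prop_name(const char *name)
{
	struct hdl_table *t;
	size_t len;

	if (!name)
		return hdl_err_ptr(-EINVAL);

	t = hdl_get_table();
	if (t->prop_name) {
		/* Error path: give back the reference taken above. */
		hdl_put_table(t);
		return hdl_err_ptr(-EBUSY);
	}

	len = strlen(name) + 1;
	t->prop_name = malloc(len);
	if (!t->prop_name) {
		hdl_put_table(t);
		return hdl_err_ptr(-ENOMEM);
	}
	memcpy(t->prop_name, name, len);

	return t;
}

/* Undo hdl_set_prop_name() and drop the reference it returned. */
void hdl_put_prop_name(struct hdl_table *t)
{
	free(t->prop_name);
	t->prop_name = NULL;
	hdl_put_table(t);
}
```

A caller keeps the returned pointer and hands it to the matching put
routine later, so the put side no longer needs to look the table up by
device.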

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 243 +++++++++---------------------------------
 drivers/cpufreq/sti-cpufreq.c |  13 +--
 include/linux/pm_opp.h        |  35 +++---
 3 files changed, 74 insertions(+), 217 deletions(-)

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 0d9bc89de41a..472f93755945 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -974,18 +974,6 @@ static void _remove_opp_table(struct opp_table *opp_table)
 	if (!list_empty(&opp_table->opp_list))
 		return;
 
-	if (opp_table->supported_hw)
-		return;
-
-	if (opp_table->prop_name)
-		return;
-
-	if (opp_table->regulators)
-		return;
-
-	if (opp_table->set_opp)
-		return;
-
 	dev_pm_opp_put_opp_table_unlocked(opp_table);
 }
 
@@ -1278,27 +1266,16 @@ int _opp_add_v1(struct opp_table *opp_table, struct device *dev,
  * specify the hierarchy of versions it supports. OPP layer will then enable
  * OPPs, which are available for those versions, based on its 'opp-supported-hw'
  * property.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
-int dev_pm_opp_set_supported_hw(struct device *dev, const u32 *versions,
-				unsigned int count)
+struct opp_table *dev_pm_opp_set_supported_hw(struct device *dev,
+			const u32 *versions, unsigned int count)
 {
 	struct opp_table *opp_table;
-	int ret = 0;
-
-	/* Hold our table modification lock here */
-	mutex_lock(&opp_table_lock);
+	int ret;
 
-	opp_table = _add_opp_table(dev);
-	if (!opp_table) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
+	opp_table = dev_pm_opp_get_opp_table(dev);
+	if (!opp_table)
+		return ERR_PTR(-ENOMEM);
 
 	/* Make sure there are no concurrent readers while updating opp_table */
 	WARN_ON(!list_empty(&opp_table->opp_list));
@@ -1319,65 +1296,40 @@ int dev_pm_opp_set_supported_hw(struct device *dev, const u32 *versions,
 	}
 
 	opp_table->supported_hw_count = count;
-	mutex_unlock(&opp_table_lock);
-	return 0;
+
+	return opp_table;
 
 err:
-	_remove_opp_table(opp_table);
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 
-	return ret;
+	return ERR_PTR(ret);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_set_supported_hw);
 
 /**
  * dev_pm_opp_put_supported_hw() - Releases resources blocked for supported hw
- * @dev: Device for which supported-hw has to be put.
+ * @opp_table: OPP table returned by dev_pm_opp_set_supported_hw().
  *
  * This is required only for the V2 bindings, and is called for a matching
  * dev_pm_opp_set_supported_hw(). Until this is called, the opp_table structure
  * will not be freed.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
-void dev_pm_opp_put_supported_hw(struct device *dev)
+void dev_pm_opp_put_supported_hw(struct opp_table *opp_table)
 {
-	struct opp_table *opp_table;
-
-	/* Hold our table modification lock here */
-	mutex_lock(&opp_table_lock);
-
-	/* Check for existing table for 'dev' first */
-	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table)) {
-		dev_err(dev, "Failed to find opp_table: %ld\n",
-			PTR_ERR(opp_table));
-		goto unlock;
-	}
-
 	/* Make sure there are no concurrent readers while updating opp_table */
 	WARN_ON(!list_empty(&opp_table->opp_list));
 
 	if (!opp_table->supported_hw) {
-		dev_err(dev, "%s: Doesn't have supported hardware list\n",
-			__func__);
-		goto unlock;
+		pr_err("%s: Doesn't have supported hardware list\n",
+		       __func__);
+		return;
 	}
 
 	kfree(opp_table->supported_hw);
 	opp_table->supported_hw = NULL;
 	opp_table->supported_hw_count = 0;
 
-	/* Try freeing opp_table if this was the last blocking resource */
-	_remove_opp_table(opp_table);
-
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_put_supported_hw);
 
@@ -1390,26 +1342,15 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_put_supported_hw);
  * specify the extn to be used for certain property names. The properties to
  * which the extension will apply are opp-microvolt and opp-microamp. OPP core
  * should postfix the property name with -<name> while looking for them.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
-int dev_pm_opp_set_prop_name(struct device *dev, const char *name)
+struct opp_table *dev_pm_opp_set_prop_name(struct device *dev, const char *name)
 {
 	struct opp_table *opp_table;
-	int ret = 0;
-
-	/* Hold our table modification lock here */
-	mutex_lock(&opp_table_lock);
+	int ret;
 
-	opp_table = _add_opp_table(dev);
-	if (!opp_table) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
+	opp_table = dev_pm_opp_get_opp_table(dev);
+	if (!opp_table)
+		return ERR_PTR(-ENOMEM);
 
 	/* Make sure there are no concurrent readers while updating opp_table */
 	WARN_ON(!list_empty(&opp_table->opp_list));
@@ -1428,63 +1369,37 @@ int dev_pm_opp_set_prop_name(struct device *dev, const char *name)
 		goto err;
 	}
 
-	mutex_unlock(&opp_table_lock);
-	return 0;
+	return opp_table;
 
 err:
-	_remove_opp_table(opp_table);
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 
-	return ret;
+	return ERR_PTR(ret);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_set_prop_name);
 
 /**
  * dev_pm_opp_put_prop_name() - Releases resources blocked for prop-name
- * @dev: Device for which the prop-name has to be put.
+ * @opp_table: OPP table returned by dev_pm_opp_set_prop_name().
  *
  * This is required only for the V2 bindings, and is called for a matching
  * dev_pm_opp_set_prop_name(). Until this is called, the opp_table structure
  * will not be freed.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
-void dev_pm_opp_put_prop_name(struct device *dev)
+void dev_pm_opp_put_prop_name(struct opp_table *opp_table)
 {
-	struct opp_table *opp_table;
-
-	/* Hold our table modification lock here */
-	mutex_lock(&opp_table_lock);
-
-	/* Check for existing table for 'dev' first */
-	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table)) {
-		dev_err(dev, "Failed to find opp_table: %ld\n",
-			PTR_ERR(opp_table));
-		goto unlock;
-	}
-
 	/* Make sure there are no concurrent readers while updating opp_table */
 	WARN_ON(!list_empty(&opp_table->opp_list));
 
 	if (!opp_table->prop_name) {
-		dev_err(dev, "%s: Doesn't have a prop-name\n", __func__);
-		goto unlock;
+		pr_err("%s: Doesn't have a prop-name\n", __func__);
+		return;
 	}
 
 	kfree(opp_table->prop_name);
 	opp_table->prop_name = NULL;
 
-	/* Try freeing opp_table if this was the last blocking resource */
-	_remove_opp_table(opp_table);
-
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_put_prop_name);
 
@@ -1531,12 +1446,6 @@ static void _free_set_opp_data(struct opp_table *opp_table)
  * well.
  *
  * This must be called before any OPPs are initialized for the device.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
 struct opp_table *dev_pm_opp_set_regulators(struct device *dev,
 					    const char * const names[],
@@ -1546,13 +1455,9 @@ struct opp_table *dev_pm_opp_set_regulators(struct device *dev,
 	struct regulator *reg;
 	int ret, i;
 
-	mutex_lock(&opp_table_lock);
-
-	opp_table = _add_opp_table(dev);
-	if (!opp_table) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
+	opp_table = dev_pm_opp_get_opp_table(dev);
+	if (!opp_table)
+		return ERR_PTR(-ENOMEM);
 
 	/* This should be called before OPPs are initialized */
 	if (WARN_ON(!list_empty(&opp_table->opp_list))) {
@@ -1594,7 +1499,6 @@ struct opp_table *dev_pm_opp_set_regulators(struct device *dev,
 	if (ret)
 		goto free_regulators;
 
-	mutex_unlock(&opp_table_lock);
 	return opp_table;
 
 free_regulators:
@@ -1605,9 +1509,7 @@ struct opp_table *dev_pm_opp_set_regulators(struct device *dev,
 	opp_table->regulators = NULL;
 	opp_table->regulator_count = 0;
 err:
-	_remove_opp_table(opp_table);
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 
 	return ERR_PTR(ret);
 }
@@ -1616,22 +1518,14 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_set_regulators);
 /**
  * dev_pm_opp_put_regulators() - Releases resources blocked for regulator
  * @opp_table: OPP table returned from dev_pm_opp_set_regulators().
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
 void dev_pm_opp_put_regulators(struct opp_table *opp_table)
 {
 	int i;
 
-	mutex_lock(&opp_table_lock);
-
 	if (!opp_table->regulators) {
 		pr_err("%s: Doesn't have regulators set\n", __func__);
-		goto unlock;
+		return;
 	}
 
 	/* Make sure there are no concurrent readers while updating opp_table */
@@ -1646,11 +1540,7 @@ void dev_pm_opp_put_regulators(struct opp_table *opp_table)
 	opp_table->regulators = NULL;
 	opp_table->regulator_count = 0;
 
-	/* Try freeing opp_table if this was the last blocking resource */
-	_remove_opp_table(opp_table);
-
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_put_regulators);
 
@@ -1663,29 +1553,19 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_put_regulators);
  * regulators per device), instead of the generic OPP set rate helper.
  *
  * This must be called before any OPPs are initialized for the device.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
-int dev_pm_opp_register_set_opp_helper(struct device *dev,
+struct opp_table *dev_pm_opp_register_set_opp_helper(struct device *dev,
 			int (*set_opp)(struct dev_pm_set_opp_data *data))
 {
 	struct opp_table *opp_table;
 	int ret;
 
 	if (!set_opp)
-		return -EINVAL;
-
-	mutex_lock(&opp_table_lock);
+		return ERR_PTR(-EINVAL);
 
-	opp_table = _add_opp_table(dev);
-	if (!opp_table) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
+	opp_table = dev_pm_opp_get_opp_table(dev);
+	if (!opp_table)
+		return ERR_PTR(-ENOMEM);
 
 	/* This should be called before OPPs are initialized */
 	if (WARN_ON(!list_empty(&opp_table->opp_list))) {
@@ -1701,47 +1581,28 @@ int dev_pm_opp_register_set_opp_helper(struct device *dev,
 
 	opp_table->set_opp = set_opp;
 
-	mutex_unlock(&opp_table_lock);
-	return 0;
+	return opp_table;
 
 err:
-	_remove_opp_table(opp_table);
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 
-	return ret;
+	return ERR_PTR(ret);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_register_set_opp_helper);
 
 /**
  * dev_pm_opp_register_put_opp_helper() - Releases resources blocked for
  *					   set_opp helper
- * @dev: Device for which custom set_opp helper has to be cleared.
+ * @opp_table: OPP table returned from dev_pm_opp_register_set_opp_helper().
  *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
+ * Release resources blocked for platform specific set_opp helper.
  */
-void dev_pm_opp_register_put_opp_helper(struct device *dev)
+void dev_pm_opp_register_put_opp_helper(struct opp_table *opp_table)
 {
-	struct opp_table *opp_table;
-
-	mutex_lock(&opp_table_lock);
-
-	/* Check for existing table for 'dev' first */
-	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table)) {
-		dev_err(dev, "Failed to find opp_table: %ld\n",
-			PTR_ERR(opp_table));
-		goto unlock;
-	}
-
 	if (!opp_table->set_opp) {
-		dev_err(dev, "%s: Doesn't have custom set_opp helper set\n",
-			__func__);
-		goto unlock;
+		pr_err("%s: Doesn't have custom set_opp helper set\n",
+		       __func__);
+		return;
 	}
 
 	/* Make sure there are no concurrent readers while updating opp_table */
@@ -1749,11 +1610,7 @@ void dev_pm_opp_register_put_opp_helper(struct device *dev)
 
 	opp_table->set_opp = NULL;
 
-	/* Try freeing opp_table if this was the last blocking resource */
-	_remove_opp_table(opp_table);
-
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_register_put_opp_helper);
 
diff --git a/drivers/cpufreq/sti-cpufreq.c b/drivers/cpufreq/sti-cpufreq.c
index b366e6d830ea..a7db9011d5fe 100644
--- a/drivers/cpufreq/sti-cpufreq.c
+++ b/drivers/cpufreq/sti-cpufreq.c
@@ -160,6 +160,7 @@ static int sti_cpufreq_set_opp_info(void)
 	int pcode, substrate, major, minor;
 	int ret;
 	char name[MAX_PCODE_NAME_LEN];
+	struct opp_table *opp_table;
 
 	reg_fields = sti_cpufreq_match();
 	if (!reg_fields) {
@@ -211,20 +212,20 @@ static int sti_cpufreq_set_opp_info(void)
 
 	snprintf(name, MAX_PCODE_NAME_LEN, "pcode%d", pcode);
 
-	ret = dev_pm_opp_set_prop_name(dev, name);
-	if (ret) {
+	opp_table = dev_pm_opp_set_prop_name(dev, name);
+	if (IS_ERR(opp_table)) {
 		dev_err(dev, "Failed to set prop name\n");
-		return ret;
+		return PTR_ERR(opp_table);
 	}
 
 	version[0] = BIT(major);
 	version[1] = BIT(minor);
 	version[2] = BIT(substrate);
 
-	ret = dev_pm_opp_set_supported_hw(dev, version, VERSION_ELEMENTS);
-	if (ret) {
+	opp_table = dev_pm_opp_set_supported_hw(dev, version, VERSION_ELEMENTS);
+	if (IS_ERR(opp_table)) {
 		dev_err(dev, "Failed to set supported hardware\n");
-		return ret;
+		return PTR_ERR(opp_table);
 	}
 
 	dev_dbg(dev, "pcode: %d major: %d minor: %d substrate: %d\n",
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index d867c6b25f9a..99787cbcaab2 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -114,15 +114,14 @@ int dev_pm_opp_disable(struct device *dev, unsigned long freq);
 int dev_pm_opp_register_notifier(struct device *dev, struct notifier_block *nb);
 int dev_pm_opp_unregister_notifier(struct device *dev, struct notifier_block *nb);
 
-int dev_pm_opp_set_supported_hw(struct device *dev, const u32 *versions,
-				unsigned int count);
-void dev_pm_opp_put_supported_hw(struct device *dev);
-int dev_pm_opp_set_prop_name(struct device *dev, const char *name);
-void dev_pm_opp_put_prop_name(struct device *dev);
+struct opp_table *dev_pm_opp_set_supported_hw(struct device *dev, const u32 *versions, unsigned int count);
+void dev_pm_opp_put_supported_hw(struct opp_table *opp_table);
+struct opp_table *dev_pm_opp_set_prop_name(struct device *dev, const char *name);
+void dev_pm_opp_put_prop_name(struct opp_table *opp_table);
 struct opp_table *dev_pm_opp_set_regulators(struct device *dev, const char * const names[], unsigned int count);
 void dev_pm_opp_put_regulators(struct opp_table *opp_table);
-int dev_pm_opp_register_set_opp_helper(struct device *dev, int (*set_opp)(struct dev_pm_set_opp_data *data));
-void dev_pm_opp_register_put_opp_helper(struct device *dev);
+struct opp_table *dev_pm_opp_register_set_opp_helper(struct device *dev, int (*set_opp)(struct dev_pm_set_opp_data *data));
+void dev_pm_opp_register_put_opp_helper(struct opp_table *opp_table);
 int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq);
 int dev_pm_opp_set_sharing_cpus(struct device *cpu_dev, const struct cpumask *cpumask);
 int dev_pm_opp_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask);
@@ -224,29 +223,29 @@ static inline int dev_pm_opp_unregister_notifier(struct device *dev, struct noti
 	return -ENOTSUPP;
 }
 
-static inline int dev_pm_opp_set_supported_hw(struct device *dev,
-					      const u32 *versions,
-					      unsigned int count)
+static inline struct opp_table *dev_pm_opp_set_supported_hw(struct device *dev,
+							    const u32 *versions,
+							    unsigned int count)
 {
-	return -ENOTSUPP;
+	return ERR_PTR(-ENOTSUPP);
 }
 
-static inline void dev_pm_opp_put_supported_hw(struct device *dev) {}
+static inline void dev_pm_opp_put_supported_hw(struct opp_table *opp_table) {}
 
-static inline int dev_pm_opp_register_set_opp_helper(struct device *dev,
+static inline struct opp_table *dev_pm_opp_register_set_opp_helper(struct device *dev,
 			int (*set_opp)(struct dev_pm_set_opp_data *data))
 {
-	return -ENOTSUPP;
+	return ERR_PTR(-ENOTSUPP);
 }
 
-static inline void dev_pm_opp_register_put_opp_helper(struct device *dev) {}
+static inline void dev_pm_opp_register_put_opp_helper(struct opp_table *opp_table) {}
 
-static inline int dev_pm_opp_set_prop_name(struct device *dev, const char *name)
+static inline struct opp_table *dev_pm_opp_set_prop_name(struct device *dev, const char *name)
 {
-	return -ENOTSUPP;
+	return ERR_PTR(-ENOTSUPP);
 }
 
-static inline void dev_pm_opp_put_prop_name(struct device *dev) {}
+static inline void dev_pm_opp_put_prop_name(struct opp_table *opp_table) {}
 
 static inline struct opp_table *dev_pm_opp_set_regulators(struct device *dev, const char * const names[], unsigned int count)
 {
-- 
2.7.1.410.g6faf27b


* [PATCH 04/12] PM / OPP: Take reference of the OPP table while adding/removing OPPs
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
                   ` (2 preceding siblings ...)
  2016-12-07 10:37 ` [PATCH 03/12] PM / OPP: Return opp_table from dev_pm_opp_set_*() routines Viresh Kumar
@ 2016-12-07 10:37 ` Viresh Kumar
  2017-01-09 23:38   ` Stephen Boyd
  2016-12-07 10:37 ` [PATCH 05/12] PM / OPP: Use dev_pm_opp_get_opp_table() instead of _add_opp_table() Viresh Kumar
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

Take a reference to the OPP table when an OPP is added and drop it when
the OPP is removed. That lets us remove the special checks in
_remove_opp_table().
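
To illustrate the idea, here is a minimal userspace sketch, not the kernel
code: a plain integer stands in for the table's struct kref, and all names
(table_model, table_alloc, opp_add, ...) are hypothetical. Because every
OPP holds its own reference, the put path no longer needs a "list is
empty" check before freeing the table.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-in for struct opp_table; the kernel uses struct kref. */
struct table_model {
	int refcount;	/* models opp_table->kref */
	int nr_opps;	/* models the length of opp_table->opp_list */
};

static int tables_freed;	/* lets a caller observe the release */

static struct table_model *table_alloc(void)
{
	struct table_model *t = calloc(1, sizeof(*t));

	if (t)
		t->refcount = 1;	/* reference owned by the creator */
	return t;
}

static void table_get(struct table_model *t)
{
	t->refcount++;
}

static void table_put(struct table_model *t)
{
	/* No list_empty() check: the count alone decides when to free. */
	if (--t->refcount == 0) {
		tables_freed++;
		free(t);
	}
}

static void opp_add(struct table_model *t)
{
	t->nr_opps++;
	table_get(t);	/* each OPP pins its table */
}

static void opp_remove(struct table_model *t)
{
	t->nr_opps--;
	table_put(t);	/* the last OPP going away may free the table */
}
```

In this model the creator can drop its own reference right after adding an
OPP, and the table stays alive until that OPP is removed.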

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 472f93755945..aacca85ebd20 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -971,9 +971,6 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_put_opp_table);
  */
 static void _remove_opp_table(struct opp_table *opp_table)
 {
-	if (!list_empty(&opp_table->opp_list))
-		return;
-
 	dev_pm_opp_put_opp_table_unlocked(opp_table);
 }
 
@@ -1018,8 +1015,7 @@ static void _opp_remove(struct opp_table *opp_table, struct dev_pm_opp *opp)
 	call_srcu(&opp_table->srcu_head.srcu, &opp->rcu_head, _kfree_opp_rcu);
 
 	mutex_unlock(&opp_table->lock);
-
-	_remove_opp_table(opp_table);
+	dev_pm_opp_put_opp_table(opp_table);
 }
 
 /**
@@ -1171,6 +1167,9 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp,
 
 	new_opp->opp_table = opp_table;
 
+	/* Get a reference to the OPP table */
+	_get_opp_table_kref(opp_table);
+
 	ret = opp_debug_create_one(new_opp, opp_table);
 	if (ret)
 		dev_err(dev, "%s: Failed to register opp to debugfs (%d)\n",
-- 
2.7.1.410.g6faf27b


* [PATCH 05/12] PM / OPP: Use dev_pm_opp_get_opp_table() instead of _add_opp_table()
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
                   ` (3 preceding siblings ...)
  2016-12-07 10:37 ` [PATCH 04/12] PM / OPP: Take reference of the OPP table while adding/removing OPPs Viresh Kumar
@ 2016-12-07 10:37 ` Viresh Kumar
  2017-01-09 23:43   ` Stephen Boyd
  2016-12-07 10:37 ` [PATCH 06/12] PM / OPP: Add 'struct kref' to struct dev_pm_opp Viresh Kumar
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

Migrate all users of _add_opp_table() to use dev_pm_opp_get_opp_table()
to guarantee that the OPP table doesn't get freed while being used.

Also update _managed_opp() to get the reference to the OPP table.

Now that the OPP table can't get freed while these routines are
executing, once dev_pm_opp_get_opp_table() has been called, there is no
need to take opp_table_lock. Drop it as well.

Now that _add_opp_table(), _remove_opp_table() and the unlocked release
routines aren't used anymore, remove them.
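
As a rough userspace model of the resulting get/add/put flow (an
illustration only; get_table, put_table and add_opp are hypothetical names,
and a single global pointer stands in for the list of OPP tables): the
caller always drops its own reference, and a failed add needs no explicit
cleanup because that put is then the last one.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct table_model {
	int refcount;	/* models opp_table->kref */
	int nr_opps;
};

static struct table_model *registered;	/* models a one-entry table list */
static int tables_freed;

/* Find the existing table and take a reference, or allocate a new one. */
static struct table_model *get_table(void)
{
	if (registered) {
		registered->refcount++;
		return registered;
	}

	registered = calloc(1, sizeof(*registered));
	if (registered)
		registered->refcount = 1;
	return registered;
}

static void put_table(struct table_model *t)
{
	if (--t->refcount == 0) {
		registered = NULL;
		tables_freed++;
		free(t);
	}
}

/* Models the shape of dev_pm_opp_add(); 'fail' simulates an add error. */
static int add_opp(int fail)
{
	struct table_model *t = get_table();
	int ret = 0;

	if (!t)
		return -ENOMEM;

	if (fail) {
		ret = -EINVAL;
	} else {
		t->nr_opps++;
		t->refcount++;	/* the new OPP holds its own reference */
	}

	put_table(t);	/* drop the caller's reference in both cases */
	return ret;
}
```

On failure the freshly allocated table is released by the final put; on
success the reference held by the new OPP keeps it registered.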

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 63 ++++---------------------------------------
 drivers/base/power/opp/of.c   | 54 +++++++++++++++++--------------------
 drivers/base/power/opp/opp.h  |  1 -
 3 files changed, 29 insertions(+), 89 deletions(-)

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index aacca85ebd20..ec833a8f7aa5 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -863,27 +863,6 @@ static struct opp_table *_allocate_opp_table(struct device *dev)
 }
 
 /**
- * _add_opp_table() - Find OPP table or allocate a new one
- * @dev:	device for which we do this operation
- *
- * It tries to find an existing table first, if it couldn't find one, it
- * allocates a new OPP table and returns that.
- *
- * Return: valid opp_table pointer if success, else NULL.
- */
-struct opp_table *_add_opp_table(struct device *dev)
-{
-	struct opp_table *opp_table;
-
-	/* Check for existing table for 'dev' first */
-	opp_table = _find_opp_table(dev);
-	if (!IS_ERR(opp_table))
-		return opp_table;
-
-	return _allocate_opp_table(dev);
-}
-
-/**
  * _kfree_device_rcu() - Free opp_table RCU handler
  * @head:	RCU head
  */
@@ -922,7 +901,7 @@ struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_table);
 
-static void _opp_table_kref_release_unlocked(struct kref *kref)
+static void _opp_table_kref_release(struct kref *kref)
 {
 	struct opp_table *opp_table = container_of(kref, struct opp_table, kref);
 	struct opp_device *opp_dev;
@@ -943,16 +922,7 @@ static void _opp_table_kref_release_unlocked(struct kref *kref)
 	list_del_rcu(&opp_table->node);
 	call_srcu(&opp_table->srcu_head.srcu, &opp_table->rcu_head,
 		  _kfree_device_rcu);
-}
 
-static void dev_pm_opp_put_opp_table_unlocked(struct opp_table *opp_table)
-{
-	kref_put(&opp_table->kref, _opp_table_kref_release_unlocked);
-}
-
-static void _opp_table_kref_release(struct kref *kref)
-{
-	_opp_table_kref_release_unlocked(kref);
 	mutex_unlock(&opp_table_lock);
 }
 
@@ -963,17 +933,6 @@ void dev_pm_opp_put_opp_table(struct opp_table *opp_table)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_put_opp_table);
 
-/**
- * _remove_opp_table() - Removes a OPP table
- * @opp_table: OPP table to be removed.
- *
- * Removes/frees OPP table if it doesn't contain any OPPs.
- */
-static void _remove_opp_table(struct opp_table *opp_table)
-{
-	dev_pm_opp_put_opp_table_unlocked(opp_table);
-}
-
 void _opp_free(struct dev_pm_opp *opp)
 {
 	kfree(opp);
@@ -1219,8 +1178,6 @@ int _opp_add_v1(struct opp_table *opp_table, struct device *dev,
 	unsigned long tol;
 	int ret;
 
-	opp_rcu_lockdep_assert();
-
 	new_opp = _opp_allocate(dev, opp_table);
 	if (!new_opp)
 		return -ENOMEM;
@@ -1641,21 +1598,13 @@ int dev_pm_opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
 	struct opp_table *opp_table;
 	int ret;
 
-	/* Hold our table modification lock here */
-	mutex_lock(&opp_table_lock);
-
-	opp_table = _add_opp_table(dev);
-	if (!opp_table) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
+	opp_table = dev_pm_opp_get_opp_table(dev);
+	if (!opp_table)
+		return -ENOMEM;
 
 	ret = _opp_add_v1(opp_table, dev, freq, u_volt, true);
-	if (ret)
-		_remove_opp_table(opp_table);
 
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_add);
@@ -1866,8 +1815,6 @@ void _dev_pm_opp_remove_table(struct opp_table *opp_table, struct device *dev,
 {
 	struct dev_pm_opp *opp, *tmp;
 
-	opp_rcu_lockdep_assert();
-
 	/* Find if opp_table manages a single device */
 	if (list_is_singular(&opp_table->dev_list)) {
 		/* Free static OPPs */
diff --git a/drivers/base/power/opp/of.c b/drivers/base/power/opp/of.c
index 38efc14d829c..a789dc228a6a 100644
--- a/drivers/base/power/opp/of.c
+++ b/drivers/base/power/opp/of.c
@@ -24,7 +24,9 @@
 
 static struct opp_table *_managed_opp(const struct device_node *np)
 {
-	struct opp_table *opp_table;
+	struct opp_table *opp_table, *managed_table = NULL;
+
+	mutex_lock(&opp_table_lock);
 
 	list_for_each_entry_rcu(opp_table, &opp_tables, node) {
 		if (opp_table->np == np) {
@@ -35,14 +37,18 @@ static struct opp_table *_managed_opp(const struct device_node *np)
 			 * But the OPPs will be considered as shared only if the
 			 * OPP table contains a "opp-shared" property.
 			 */
-			if (opp_table->shared_opp == OPP_TABLE_ACCESS_SHARED)
-				return opp_table;
+			if (opp_table->shared_opp == OPP_TABLE_ACCESS_SHARED) {
+				_get_opp_table_kref(opp_table);
+				managed_table = opp_table;
+			}
 
-			return NULL;
+			break;
 		}
 	}
 
-	return NULL;
+	mutex_unlock(&opp_table_lock);
+
+	return managed_table;
 }
 
 void _of_init_opp_table(struct opp_table *opp_table, struct device *dev)
@@ -368,21 +374,17 @@ static int _of_add_opp_table_v2(struct device *dev, struct device_node *opp_np)
 	struct opp_table *opp_table;
 	int ret = 0, count = 0;
 
-	mutex_lock(&opp_table_lock);
-
 	opp_table = _managed_opp(opp_np);
 	if (opp_table) {
 		/* OPPs are already managed */
 		if (!_add_opp_dev(dev, opp_table))
 			ret = -ENOMEM;
-		goto unlock;
+		goto put_opp_table;
 	}
 
-	opp_table = _add_opp_table(dev);
-	if (!opp_table) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
+	opp_table = dev_pm_opp_get_opp_table(dev);
+	if (!opp_table)
+		return -ENOMEM;
 
 	/* We have opp-table node now, iterate over it and add OPPs */
 	for_each_available_child_of_node(opp_np, np) {
@@ -392,14 +394,15 @@ static int _of_add_opp_table_v2(struct device *dev, struct device_node *opp_np)
 		if (ret) {
 			dev_err(dev, "%s: Failed to add OPP, %d\n", __func__,
 				ret);
-			goto free_table;
+			_dev_pm_opp_remove_table(opp_table, dev, false);
+			goto put_opp_table;
 		}
 	}
 
 	/* There should be one of more OPP defined */
 	if (WARN_ON(!count)) {
 		ret = -ENOENT;
-		goto free_table;
+		goto put_opp_table;
 	}
 
 	opp_table->np = opp_np;
@@ -408,12 +411,8 @@ static int _of_add_opp_table_v2(struct device *dev, struct device_node *opp_np)
 	else
 		opp_table->shared_opp = OPP_TABLE_ACCESS_EXCLUSIVE;
 
-	goto unlock;
-
-free_table:
-	_dev_pm_opp_remove_table(opp_table, dev, false);
-unlock:
-	mutex_unlock(&opp_table_lock);
+put_opp_table:
+	dev_pm_opp_put_opp_table(opp_table);
 
 	return ret;
 }
@@ -442,13 +441,9 @@ static int _of_add_opp_table_v1(struct device *dev)
 		return -EINVAL;
 	}
 
-	mutex_lock(&opp_table_lock);
-
-	opp_table = _add_opp_table(dev);
-	if (!opp_table) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
+	opp_table = dev_pm_opp_get_opp_table(dev);
+	if (!opp_table)
+		return -ENOMEM;
 
 	val = prop->value;
 	while (nr) {
@@ -465,8 +460,7 @@ static int _of_add_opp_table_v1(struct device *dev)
 		nr -= 2;
 	}
 
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 	return ret;
 }
 
diff --git a/drivers/base/power/opp/opp.h b/drivers/base/power/opp/opp.h
index 596f361fbe70..8435a0eb27be 100644
--- a/drivers/base/power/opp/opp.h
+++ b/drivers/base/power/opp/opp.h
@@ -196,7 +196,6 @@ struct opp_table {
 /* Routines internal to opp core */
 void _get_opp_table_kref(struct opp_table *opp_table);
 struct opp_table *_find_opp_table(struct device *dev);
-struct opp_table *_add_opp_table(struct device *dev);
 struct opp_device *_add_opp_dev(const struct device *dev, struct opp_table *opp_table);
 void _dev_pm_opp_remove_table(struct opp_table *opp_table, struct device *dev, bool remove_all);
 void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all);
-- 
2.7.1.410.g6faf27b


* [PATCH 06/12] PM / OPP: Add 'struct kref' to struct dev_pm_opp
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
                   ` (4 preceding siblings ...)
  2016-12-07 10:37 ` [PATCH 05/12] PM / OPP: Use dev_pm_opp_get_opp_table() instead of _add_opp_table() Viresh Kumar
@ 2016-12-07 10:37 ` Viresh Kumar
  2017-01-09 23:44   ` Stephen Boyd
  2016-12-07 10:37   ` Viresh Kumar
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

Add kref to struct dev_pm_opp for easier accounting of the OPPs.

Note that the OPPs are freed under the opp_table->lock mutex only.
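
The put-under-mutex behaviour can be pictured with a single-threaded
userspace sketch. This is an assumption-laden model, not the kernel
implementation: a boolean flag stands in for the mutex, a plain int for
the kref, and the names lock_model, opp_model, opp_release and opp_put are
hypothetical. The point it shows is that the lock is only taken when the
count may reach zero, and that the release callback runs with the lock held
and is responsible for dropping it.

```c
#include <assert.h>
#include <stdbool.h>

/* A flag standing in for a mutex; good enough for one thread. */
struct lock_model {
	bool held;
};

struct opp_model {
	int refcount;			/* models opp->kref */
	struct lock_model *table_lock;	/* models opp->opp_table->lock */
	bool released;
	bool released_under_lock;
};

/* Release callback: entered with the table lock held, must drop it. */
static void opp_release(struct opp_model *opp)
{
	opp->released_under_lock = opp->table_lock->held;
	opp->released = true;
	opp->table_lock->held = false;
}

/* Returns 1 if the object was released, 0 otherwise. */
static int opp_put(struct opp_model *opp)
{
	/* Fast path: not the last reference, so no lock is needed. */
	if (opp->refcount > 1) {
		opp->refcount--;
		return 0;
	}

	/* Slow path: take the lock before the final decrement. */
	opp->table_lock->held = true;
	if (--opp->refcount != 0) {
		opp->table_lock->held = false;
		return 0;
	}

	opp_release(opp);
	return 1;
}
```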

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 27 ++++++++++++---------------
 drivers/base/power/opp/opp.h  |  3 +++
 include/linux/pm_opp.h        |  3 +++
 3 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index ec833a8f7aa5..12be0f29f2ad 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -949,20 +949,10 @@ static void _kfree_opp_rcu(struct rcu_head *head)
 	kfree_rcu(opp, rcu_head);
 }
 
-/**
- * _opp_remove()  - Remove an OPP from a table definition
- * @opp_table:	points back to the opp_table struct this opp belongs to
- * @opp:	pointer to the OPP to remove
- *
- * This function removes an opp definition from the opp table.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * It is assumed that the caller holds required mutex for an RCU updater
- * strategy.
- */
-static void _opp_remove(struct opp_table *opp_table, struct dev_pm_opp *opp)
+static void _opp_kref_release(struct kref *kref)
 {
-	mutex_lock(&opp_table->lock);
+	struct dev_pm_opp *opp = container_of(kref, struct dev_pm_opp, kref);
+	struct opp_table *opp_table = opp->opp_table;
 
 	/*
 	 * Notify the changes in the availability of the operable
@@ -977,6 +967,12 @@ static void _opp_remove(struct opp_table *opp_table, struct dev_pm_opp *opp)
 	dev_pm_opp_put_opp_table(opp_table);
 }
 
+void dev_pm_opp_put(struct dev_pm_opp *opp)
+{
+	kref_put_mutex(&opp->kref, _opp_kref_release, &opp->opp_table->lock);
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_put);
+
 /**
  * dev_pm_opp_remove()  - Remove an OPP from OPP table
  * @dev:	device for which we do this operation
@@ -1020,7 +1016,7 @@ void dev_pm_opp_remove(struct device *dev, unsigned long freq)
 		goto unlock;
 	}
 
-	_opp_remove(opp_table, opp);
+	dev_pm_opp_put(opp);
 unlock:
 	mutex_unlock(&opp_table_lock);
 }
@@ -1125,6 +1121,7 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp,
 	mutex_unlock(&opp_table->lock);
 
 	new_opp->opp_table = opp_table;
+	kref_init(&new_opp->kref);
 
 	/* Get a reference to the OPP table */
 	_get_opp_table_kref(opp_table);
@@ -1820,7 +1817,7 @@ void _dev_pm_opp_remove_table(struct opp_table *opp_table, struct device *dev,
 		/* Free static OPPs */
 		list_for_each_entry_safe(opp, tmp, &opp_table->opp_list, node) {
 			if (remove_all || !opp->dynamic)
-				_opp_remove(opp_table, opp);
+				dev_pm_opp_put(opp);
 		}
 	} else {
 		_remove_opp_dev(_find_opp_dev(dev, opp_table), opp_table);
diff --git a/drivers/base/power/opp/opp.h b/drivers/base/power/opp/opp.h
index 8435a0eb27be..bd929ba6efaf 100644
--- a/drivers/base/power/opp/opp.h
+++ b/drivers/base/power/opp/opp.h
@@ -16,6 +16,7 @@
 
 #include <linux/device.h>
 #include <linux/kernel.h>
+#include <linux/kref.h>
 #include <linux/list.h>
 #include <linux/limits.h>
 #include <linux/pm_opp.h>
@@ -56,6 +57,7 @@ extern struct list_head opp_tables;
  *		are protected by the opp_table_lock for integrity.
  *		IMPORTANT: the opp nodes should be maintained in increasing
  *		order.
+ * @kref:	for reference count of the OPP.
  * @available:	true/false - marks if this OPP as available or not
  * @dynamic:	not-created from static DT entries.
  * @turbo:	true if turbo (boost) OPP
@@ -73,6 +75,7 @@ extern struct list_head opp_tables;
  */
 struct dev_pm_opp {
 	struct list_head node;
+	struct kref kref;
 
 	bool available;
 	bool dynamic;
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 99787cbcaab2..731d548657aa 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -102,6 +102,7 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 
 struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev,
 					     unsigned long *freq);
+void dev_pm_opp_put(struct dev_pm_opp *opp);
 
 int dev_pm_opp_add(struct device *dev, unsigned long freq,
 		   unsigned long u_volt);
@@ -193,6 +194,8 @@ static inline struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev,
 	return ERR_PTR(-ENOTSUPP);
 }
 
+static inline void dev_pm_opp_put(struct dev_pm_opp *opp) {}
+
 static inline int dev_pm_opp_add(struct device *dev, unsigned long freq,
 					unsigned long u_volt)
 {
-- 
2.7.1.410.g6faf27b


* [PATCH 07/12] PM / OPP: Update OPP users to put reference
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
@ 2016-12-07 10:37   ` Viresh Kumar
  2016-12-07 10:37 ` [PATCH 02/12] PM / OPP: Add 'struct kref' to OPP table Viresh Kumar
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC (permalink / raw)
  To: Rafael Wysocki, Kevin Hilman, Tony Lindgren, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Peter De Schrijver,
	Prashant Gaikwad, Stephen Warren, Thierry Reding,
	Alexandre Courbot, Kukjin Kim, Krzysztof Kozlowski,
	Javier Martinez Canillas, MyungJoo Ham, Kyungmin Park,
	Chanwoo Choi, Amit Daniel Kachhap, Javi Merino, Zhang Rui,
	Eduardo Valentin
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

This patch updates the dev_pm_opp_find_freq_*() routines to take a
reference to the OPPs they return.

It also updates the users of the dev_pm_opp_find_freq_*() routines to
call dev_pm_opp_put() once they are done using the OPPs.

As it is now guaranteed that the OPPs won't get freed while being used,
the RCU read-side locking in the users isn't required anymore. Drop it
as well.

This patch also updates all users of devfreq_recommended_opp() which was
returning an OPP received from the OPP core.

Note that some of the OPP core routines have gained
rcu_read_{lock|unlock}() calls, as those still use RCU specific APIs
within them.
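
The caller-side pattern (find, use, put) can be sketched in userspace as
follows. This is a simplified model rather than the kernel API: the names
opp_model, find_freq_ceil, opp_put and voltage_for are hypothetical, the
OPPs live in a sorted array, and NULL is returned on a miss where the
kernel routines return an ERR_PTR() value.

```c
#include <assert.h>
#include <stddef.h>

struct opp_model {
	unsigned long rate;	/* Hz */
	unsigned long u_volt;	/* microvolts */
	int refcount;		/* models opp->kref */
};

/*
 * Return the lowest-rate OPP with rate >= *freq, with a reference taken
 * on behalf of the caller, or NULL if there is none. 'opps' must be
 * sorted by increasing rate.
 */
static struct opp_model *find_freq_ceil(struct opp_model *opps, size_t n,
					unsigned long *freq)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (opps[i].rate >= *freq) {
			opps[i].refcount++;
			*freq = opps[i].rate;
			return &opps[i];
		}
	}

	return NULL;
}

static void opp_put(struct opp_model *opp)
{
	opp->refcount--;
}

/* Caller pattern: look up, read what is needed, then drop the reference. */
static unsigned long voltage_for(struct opp_model *opps, size_t n,
				 unsigned long freq)
{
	struct opp_model *opp = find_freq_ceil(opps, n, &freq);
	unsigned long volt;

	if (!opp)
		return 0;

	volt = opp->u_volt;
	opp_put(opp);

	return volt;
}
```

The reference count of the matched entry is the same before and after
voltage_for(), which is the invariant converted callers are expected to
keep.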

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
Cover-letter: lkml.kernel.org/r/cover.1481106919.git.viresh.kumar@linaro.org

 arch/arm/mach-omap2/pm.c             |   5 +-
 drivers/base/power/opp/core.c        | 114 +++++++++++++++++++----------------
 drivers/base/power/opp/cpu.c         |  22 ++-----
 drivers/clk/tegra/clk-dfll.c         |  17 ++----
 drivers/cpufreq/exynos5440-cpufreq.c |   5 +-
 drivers/cpufreq/imx6q-cpufreq.c      |  10 +--
 drivers/cpufreq/mt8173-cpufreq.c     |   8 +--
 drivers/cpufreq/omap-cpufreq.c       |   4 +-
 drivers/devfreq/devfreq.c            |  14 ++---
 drivers/devfreq/exynos-bus.c         |  14 ++---
 drivers/devfreq/governor_passive.c   |   4 +-
 drivers/devfreq/rk3399_dmc.c         |  16 ++---
 drivers/devfreq/tegra-devfreq.c      |   4 +-
 drivers/thermal/cpu_cooling.c        |  11 +---
 drivers/thermal/devfreq_cooling.c    |  14 +----
 15 files changed, 109 insertions(+), 153 deletions(-)

diff --git a/arch/arm/mach-omap2/pm.c b/arch/arm/mach-omap2/pm.c
index 678d2a31dcb8..c5a1d4439202 100644
--- a/arch/arm/mach-omap2/pm.c
+++ b/arch/arm/mach-omap2/pm.c
@@ -167,17 +167,16 @@ static int __init omap2_set_init_voltage(char *vdd_name, char *clk_name,
 	freq = clk_get_rate(clk);
 	clk_put(clk);
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(dev, &freq);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("%s: unable to find boot up OPP for vdd_%s\n",
 			__func__, vdd_name);
 		goto exit;
 	}
 
 	bootup_volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	if (!bootup_volt) {
 		pr_err("%s: unable to find voltage corresponding to the bootup OPP for vdd_%s\n",
 		       __func__, vdd_name);
diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 12be0f29f2ad..d112b1846327 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -40,6 +40,8 @@ do {									\
 			 "opp_table_lock protection");			\
 } while (0)
 
+static void dev_pm_opp_get(struct dev_pm_opp *opp);
+
 static struct opp_device *_find_opp_dev(const struct device *dev,
 					struct opp_table *opp_table)
 {
@@ -94,21 +96,13 @@ struct opp_table *_find_opp_table(struct device *dev)
  * return 0
  *
  * This is useful only for devices with single power supply.
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
 	unsigned long v = 0;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp))
@@ -116,6 +110,7 @@ unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
 	else
 		v = tmp_opp->supplies[0].u_volt;
 
+	rcu_read_unlock();
 	return v;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
@@ -126,21 +121,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
  *
  * Return: frequency in hertz corresponding to the opp, else
  * return 0
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
 	unsigned long f = 0;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available)
@@ -148,6 +135,7 @@ unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 	else
 		f = tmp_opp->rate;
 
+	rcu_read_unlock();
 	return f;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
@@ -161,20 +149,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
  * quickly. Running on them for longer times may overheat the chip.
  *
  * Return: true if opp is turbo opp, else false.
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
+	bool turbo;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available) {
@@ -182,7 +163,10 @@ bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
 		return false;
 	}
 
-	return tmp_opp->turbo;
+	turbo = tmp_opp->turbo;
+
+	rcu_read_unlock();
+	return turbo;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_is_turbo);
 
@@ -410,11 +394,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_count);
  * This provides a mechanism to enable an opp which is not available currently
  * or the opposite as well.
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 					      unsigned long freq,
@@ -423,13 +404,14 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 	struct opp_table *opp_table;
 	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
 		int r = PTR_ERR(opp_table);
 
 		dev_err(dev, "%s: OPP table not found (%d)\n", __func__, r);
+		rcu_read_unlock();
 		return ERR_PTR(r);
 	}
 
@@ -437,10 +419,15 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 		if (temp_opp->available == available &&
 				temp_opp->rate == freq) {
 			opp = temp_opp;
+
+			/* Increment the reference count of OPP */
+			dev_pm_opp_get(opp);
 			break;
 		}
 	}
 
+	rcu_read_unlock();
+
 	return opp;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_exact);
@@ -454,6 +441,9 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
 		if (temp_opp->available && temp_opp->rate >= *freq) {
 			opp = temp_opp;
 			*freq = opp->rate;
+
+			/* Increment the reference count of OPP */
+			dev_pm_opp_get(opp);
 			break;
 		}
 	}
@@ -476,29 +466,33 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
  * ERANGE:	no match found for search
  * ENODEV:	if device not found in list of registered devices
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev,
 					     unsigned long *freq)
 {
 	struct opp_table *opp_table;
-
-	opp_rcu_lockdep_assert();
+	struct dev_pm_opp *opp;
 
 	if (!dev || !freq) {
 		dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
 		return ERR_PTR(-EINVAL);
 	}
 
+	rcu_read_lock();
+
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table))
+	if (IS_ERR(opp_table)) {
+		rcu_read_unlock();
 		return ERR_CAST(opp_table);
+	}
+
+	opp = _find_freq_ceil(opp_table, freq);
 
-	return _find_freq_ceil(opp_table, freq);
+	rcu_read_unlock();
+
+	return opp;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
 
@@ -517,11 +511,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
  * ERANGE:	no match found for search
  * ENODEV:	if device not found in list of registered devices
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 					      unsigned long *freq)
@@ -529,16 +520,18 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 	struct opp_table *opp_table;
 	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
 
-	opp_rcu_lockdep_assert();
-
 	if (!dev || !freq) {
 		dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
 		return ERR_PTR(-EINVAL);
 	}
 
+	rcu_read_lock();
+
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table))
+	if (IS_ERR(opp_table)) {
+		rcu_read_unlock();
 		return ERR_CAST(opp_table);
+	}
 
 	list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
 		if (temp_opp->available) {
@@ -549,6 +542,12 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 				opp = temp_opp;
 		}
 	}
+
+	/* Increment the reference count of OPP */
+	if (!IS_ERR(opp))
+		dev_pm_opp_get(opp);
+	rcu_read_unlock();
+
 	if (!IS_ERR(opp))
 		*freq = opp->rate;
 
@@ -736,6 +735,8 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 		ret = PTR_ERR(opp);
 		dev_err(dev, "%s: failed to find OPP for freq %lu (%d)\n",
 			__func__, freq, ret);
+		if (!IS_ERR(old_opp))
+			dev_pm_opp_put(old_opp);
 		rcu_read_unlock();
 		return ret;
 	}
@@ -747,6 +748,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 
 	/* Only frequency scaling */
 	if (!regulators) {
+		dev_pm_opp_put(opp);
+		if (!IS_ERR(old_opp))
+			dev_pm_opp_put(old_opp);
 		rcu_read_unlock();
 		return _generic_set_opp_clk_only(dev, clk, old_freq, freq);
 	}
@@ -772,6 +776,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 	data->new_opp.rate = freq;
 	memcpy(data->new_opp.supplies, opp->supplies, size);
 
+	dev_pm_opp_put(opp);
+	if (!IS_ERR(old_opp))
+		dev_pm_opp_put(old_opp);
 	rcu_read_unlock();
 
 	return set_opp(data);
@@ -967,6 +974,11 @@ static void _opp_kref_release(struct kref *kref)
 	dev_pm_opp_put_opp_table(opp_table);
 }
 
+static void dev_pm_opp_get(struct dev_pm_opp *opp)
+{
+	kref_get(&opp->kref);
+}
+
 void dev_pm_opp_put(struct dev_pm_opp *opp)
 {
 	kref_put_mutex(&opp->kref, _opp_kref_release, &opp->opp_table->lock);
diff --git a/drivers/base/power/opp/cpu.c b/drivers/base/power/opp/cpu.c
index 8c3434bdb26d..adef788862d5 100644
--- a/drivers/base/power/opp/cpu.c
+++ b/drivers/base/power/opp/cpu.c
@@ -42,11 +42,6 @@
  *
  * WARNING: It is  important for the callers to ensure refreshing their copy of
  * the table if any of the mentioned functions have been invoked in the interim.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Since we just use the regular accessor functions to access the internal data
- * structures, we use RCU read lock inside this function. As a result, users of
- * this function DONOT need to use explicit locks for invoking.
  */
 int dev_pm_opp_init_cpufreq_table(struct device *dev,
 				  struct cpufreq_frequency_table **table)
@@ -56,19 +51,13 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 	int i, max_opps, ret = 0;
 	unsigned long rate;
 
-	rcu_read_lock();
-
 	max_opps = dev_pm_opp_get_opp_count(dev);
-	if (max_opps <= 0) {
-		ret = max_opps ? max_opps : -ENODATA;
-		goto out;
-	}
+	if (max_opps <= 0)
+		return max_opps ? max_opps : -ENODATA;
 
 	freq_table = kcalloc((max_opps + 1), sizeof(*freq_table), GFP_ATOMIC);
-	if (!freq_table) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!freq_table)
+		return -ENOMEM;
 
 	for (i = 0, rate = 0; i < max_opps; i++, rate++) {
 		/* find next rate */
@@ -83,6 +72,8 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 		/* Is Boost/turbo opp ? */
 		if (dev_pm_opp_is_turbo(opp))
 			freq_table[i].flags = CPUFREQ_BOOST_FREQ;
+
+		dev_pm_opp_put(opp);
 	}
 
 	freq_table[i].driver_data = i;
@@ -91,7 +82,6 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 	*table = &freq_table[0];
 
 out:
-	rcu_read_unlock();
 	if (ret)
 		kfree(freq_table);
 
diff --git a/drivers/clk/tegra/clk-dfll.c b/drivers/clk/tegra/clk-dfll.c
index f010562534eb..2c44aeb0b97c 100644
--- a/drivers/clk/tegra/clk-dfll.c
+++ b/drivers/clk/tegra/clk-dfll.c
@@ -633,16 +633,12 @@ static int find_lut_index_for_rate(struct tegra_dfll *td, unsigned long rate)
 	struct dev_pm_opp *opp;
 	int i, uv;
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_ceil(td->soc->dev, &rate);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
-	uv = dev_pm_opp_get_voltage(opp);
 
-	rcu_read_unlock();
+	uv = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 
 	for (i = 0; i < td->i2c_lut_size; i++) {
 		if (regulator_list_voltage(td->vdd_reg, td->i2c_lut[i]) == uv)
@@ -1440,8 +1436,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 	struct dev_pm_opp *opp;
 	int lut;
 
-	rcu_read_lock();
-
 	rate = ULONG_MAX;
 	opp = dev_pm_opp_find_freq_floor(td->soc->dev, &rate);
 	if (IS_ERR(opp)) {
@@ -1449,6 +1443,7 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		goto out;
 	}
 	v_max = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 
 	v = td->soc->cvb->min_millivolts * 1000;
 	lut = find_vdd_map_entry_exact(td, v);
@@ -1465,6 +1460,8 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		if (v_opp <= td->soc->cvb->min_millivolts * 1000)
 			td->dvco_rate_min = dev_pm_opp_get_freq(opp);
 
+		dev_pm_opp_put(opp);
+
 		for (;;) {
 			v += max(1, (v_max - v) / (MAX_DFLL_VOLTAGES - j));
 			if (v >= v_opp)
@@ -1496,8 +1493,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		ret = 0;
 
 out:
-	rcu_read_unlock();
-
 	return ret;
 }
 
diff --git a/drivers/cpufreq/exynos5440-cpufreq.c b/drivers/cpufreq/exynos5440-cpufreq.c
index c0f3373706f4..9180d34cc9fc 100644
--- a/drivers/cpufreq/exynos5440-cpufreq.c
+++ b/drivers/cpufreq/exynos5440-cpufreq.c
@@ -118,12 +118,10 @@ static int init_div_table(void)
 	unsigned int tmp, clk_div, ema_div, freq, volt_id;
 	struct dev_pm_opp *opp;
 
-	rcu_read_lock();
 	cpufreq_for_each_entry(pos, freq_tbl) {
 		opp = dev_pm_opp_find_freq_exact(dvfs_info->dev,
 					pos->frequency * 1000, true);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			dev_err(dvfs_info->dev,
 				"failed to find valid OPP for %u KHZ\n",
 				pos->frequency);
@@ -140,6 +138,7 @@ static int init_div_table(void)
 
 		/* Calculate EMA */
 		volt_id = dev_pm_opp_get_voltage(opp);
+
 		volt_id = (MAX_VOLTAGE - volt_id) / VOLTAGE_STEP;
 		if (volt_id < PMIC_HIGH_VOLT) {
 			ema_div = (CPUEMA_HIGH << P0_7_CPUEMA_SHIFT) |
@@ -157,9 +156,9 @@ static int init_div_table(void)
 
 		__raw_writel(tmp, dvfs_info->base + XMU_PMU_P0_7 + 4 *
 						(pos - freq_tbl));
+		dev_pm_opp_put(opp);
 	}
 
-	rcu_read_unlock();
 	return 0;
 }
 
diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
index ef1fa8145419..7719b02e04f5 100644
--- a/drivers/cpufreq/imx6q-cpufreq.c
+++ b/drivers/cpufreq/imx6q-cpufreq.c
@@ -53,16 +53,15 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
 	freq_hz = new_freq * 1000;
 	old_freq = clk_get_rate(arm_clk) / 1000;
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		dev_err(cpu_dev, "failed to find OPP for %ld\n", freq_hz);
 		return PTR_ERR(opp);
 	}
 
 	volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	volt_old = regulator_get_voltage(arm_reg);
 
 	dev_dbg(cpu_dev, "%u MHz, %ld mV --> %u MHz, %ld mV\n",
@@ -321,14 +320,15 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
 	 * freq_table initialised from OPP is therefore sorted in the
 	 * same order.
 	 */
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_exact(cpu_dev,
 				  freq_table[0].frequency * 1000, true);
 	min_volt = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 	opp = dev_pm_opp_find_freq_exact(cpu_dev,
 				  freq_table[--num].frequency * 1000, true);
 	max_volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	ret = regulator_set_voltage_time(arm_reg, min_volt, max_volt);
 	if (ret > 0)
 		transition_latency += ret * 1000;
diff --git a/drivers/cpufreq/mt8173-cpufreq.c b/drivers/cpufreq/mt8173-cpufreq.c
index 643f43179df1..ab25b1235a5e 100644
--- a/drivers/cpufreq/mt8173-cpufreq.c
+++ b/drivers/cpufreq/mt8173-cpufreq.c
@@ -232,16 +232,14 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 
 	freq_hz = freq_table[index].frequency * 1000;
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("cpu%d: failed to find OPP for %ld\n",
 		       policy->cpu, freq_hz);
 		return PTR_ERR(opp);
 	}
 	vproc = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	/*
 	 * If the new voltage or the intermediate voltage is higher than the
@@ -411,16 +409,14 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 
 	/* Search a safe voltage for intermediate frequency. */
 	rate = clk_get_rate(inter_clk);
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &rate);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("failed to get intermediate opp for cpu%d\n", cpu);
 		ret = PTR_ERR(opp);
 		goto out_free_opp_table;
 	}
 	info->intermediate_voltage = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	info->cpu_dev = cpu_dev;
 	info->proc_reg = proc_reg;
diff --git a/drivers/cpufreq/omap-cpufreq.c b/drivers/cpufreq/omap-cpufreq.c
index 376e63ca94e8..71e81bbf031b 100644
--- a/drivers/cpufreq/omap-cpufreq.c
+++ b/drivers/cpufreq/omap-cpufreq.c
@@ -63,16 +63,14 @@ static int omap_target(struct cpufreq_policy *policy, unsigned int index)
 	freq = ret;
 
 	if (mpu_reg) {
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_ceil(mpu_dev, &freq);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			dev_err(mpu_dev, "%s: unable to find MPU OPP for %d\n",
 				__func__, new_freq);
 			return -EINVAL;
 		}
 		volt = dev_pm_opp_get_voltage(opp);
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 		tol = volt * OPP_TOLERANCE / 100;
 		volt_old = regulator_get_voltage(mpu_reg);
 	}
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index b0de42972b74..89add0d7c017 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -111,18 +111,16 @@ static void devfreq_set_freq_table(struct devfreq *devfreq)
 		return;
 	}
 
-	rcu_read_lock();
 	for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
 		opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
 		if (IS_ERR(opp)) {
 			devm_kfree(devfreq->dev.parent, profile->freq_table);
 			profile->max_state = 0;
-			rcu_read_unlock();
 			return;
 		}
+		dev_pm_opp_put(opp);
 		profile->freq_table[i] = freq;
 	}
-	rcu_read_unlock();
 }
 
 /**
@@ -1107,9 +1105,9 @@ static ssize_t available_frequencies_show(struct device *d,
 	ssize_t count = 0;
 	unsigned long freq = 0;
 
-	rcu_read_lock();
 	do {
 		opp = dev_pm_opp_find_freq_ceil(dev, &freq);
 		if (IS_ERR(opp))
 			break;
+		dev_pm_opp_put(opp);
 
@@ -1117,7 +1115,6 @@ static ssize_t available_frequencies_show(struct device *d,
 				   "%lu ", freq);
 		freq++;
 	} while (1);
-	rcu_read_unlock();
 
 	/* Truncate the trailing space */
 	if (count)
@@ -1219,11 +1216,8 @@ subsys_initcall(devfreq_init);
  * @freq:	The frequency given to target function
  * @flags:	Flags handed from devfreq framework.
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
 					   unsigned long *freq,
diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
index a8ed7792ece2..49ce38cef460 100644
--- a/drivers/devfreq/exynos-bus.c
+++ b/drivers/devfreq/exynos-bus.c
@@ -103,18 +103,17 @@ static int exynos_bus_target(struct device *dev, unsigned long *freq, u32 flags)
 	int ret = 0;
 
 	/* Get new opp-bus instance according to new bus clock */
-	rcu_read_lock();
 	new_opp = devfreq_recommended_opp(dev, freq, flags);
 	if (IS_ERR(new_opp)) {
 		dev_err(dev, "failed to get recommended opp instance\n");
-		rcu_read_unlock();
 		return PTR_ERR(new_opp);
 	}
 
 	new_freq = dev_pm_opp_get_freq(new_opp);
 	new_volt = dev_pm_opp_get_voltage(new_opp);
+	dev_pm_opp_put(new_opp);
+
 	old_freq = bus->curr_freq;
-	rcu_read_unlock();
 
 	if (old_freq == new_freq)
 		return 0;
@@ -214,17 +213,16 @@ static int exynos_bus_passive_target(struct device *dev, unsigned long *freq,
 	int ret = 0;
 
 	/* Get new opp-bus instance according to new bus clock */
-	rcu_read_lock();
 	new_opp = devfreq_recommended_opp(dev, freq, flags);
 	if (IS_ERR(new_opp)) {
 		dev_err(dev, "failed to get recommended opp instance\n");
-		rcu_read_unlock();
 		return PTR_ERR(new_opp);
 	}
 
 	new_freq = dev_pm_opp_get_freq(new_opp);
+	dev_pm_opp_put(new_opp);
+
 	old_freq = bus->curr_freq;
-	rcu_read_unlock();
 
 	if (old_freq == new_freq)
 		return 0;
@@ -358,16 +356,14 @@ static int exynos_bus_parse_of(struct device_node *np,
 
 	rate = clk_get_rate(bus->clk);
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &rate, 0);
 	if (IS_ERR(opp)) {
 		dev_err(dev, "failed to find dev_pm_opp\n");
-		rcu_read_unlock();
 		ret = PTR_ERR(opp);
 		goto err_opp;
 	}
 	bus->curr_freq = dev_pm_opp_get_freq(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	return 0;
 
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 9ef46e2592c4..671a1e0afc6e 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -59,9 +59,9 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
 	 * list of parent device. Because in this case, *freq is temporary
 	 * value which is decided by ondemand governor.
 	 */
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(parent_devfreq->dev.parent, freq, 0);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	if (IS_ERR(opp)) {
 		ret = PTR_ERR(opp);
 		goto out;
diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
index 27d2f349b53c..40a2499730fc 100644
--- a/drivers/devfreq/rk3399_dmc.c
+++ b/drivers/devfreq/rk3399_dmc.c
@@ -91,17 +91,13 @@ static int rk3399_dmcfreq_target(struct device *dev, unsigned long *freq,
 	unsigned long target_volt, target_rate;
 	int err;
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, freq, flags);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
 
 	target_rate = dev_pm_opp_get_freq(opp);
 	target_volt = dev_pm_opp_get_voltage(opp);
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (dmcfreq->rate == target_rate)
 		return 0;
@@ -422,15 +418,13 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev)
 
 	data->rate = clk_get_rate(data->dmc_clk);
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &data->rate, 0);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
+
 	data->rate = dev_pm_opp_get_freq(opp);
 	data->volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	rk3399_devfreq_dmc_profile.initial_freq = data->rate;
 
diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
index fe9dce0245bf..214fff96fa4a 100644
--- a/drivers/devfreq/tegra-devfreq.c
+++ b/drivers/devfreq/tegra-devfreq.c
@@ -487,15 +487,13 @@ static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
 	struct dev_pm_opp *opp;
 	unsigned long rate = *freq * KHZ;
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &rate, flags);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		dev_err(dev, "Failed to find opp for %lu KHz\n", *freq);
 		return PTR_ERR(opp);
 	}
 	rate = dev_pm_opp_get_freq(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	clk_set_min_rate(tegra->emc_clock, rate);
 	clk_set_rate(tegra->emc_clock, 0);
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 9ce0e9eef923..85fdbf762fa0 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -297,8 +297,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 	if (!power_table)
 		return -ENOMEM;
 
-	rcu_read_lock();
-
 	for (freq = 0, i = 0;
 	     opp = dev_pm_opp_find_freq_ceil(dev, &freq), !IS_ERR(opp);
 	     freq++, i++) {
@@ -306,13 +304,13 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 		u64 power;
 
 		if (i >= num_opps) {
-			rcu_read_unlock();
 			ret = -EAGAIN;
 			goto free_power_table;
 		}
 
 		freq_mhz = freq / 1000000;
 		voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
+		dev_pm_opp_put(opp);
 
 		/*
 		 * Do the multiplication with MHz and millivolt so as
@@ -328,8 +326,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 		power_table[i].power = power;
 	}
 
-	rcu_read_unlock();
-
 	if (i != num_opps) {
 		ret = PTR_ERR(opp);
 		goto free_power_table;
@@ -433,13 +429,10 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_device,
 		return 0;
 	}
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_exact(cpufreq_device->cpu_dev, freq_hz,
 					 true);
 	voltage = dev_pm_opp_get_voltage(opp);
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (voltage == 0) {
 		dev_warn_ratelimited(cpufreq_device->cpu_dev,
diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
index 81631b110e17..55839dd2ded2 100644
--- a/drivers/thermal/devfreq_cooling.c
+++ b/drivers/thermal/devfreq_cooling.c
@@ -113,9 +113,8 @@ static int partition_enable_opps(struct devfreq_cooling_device *dfc,
 		unsigned int freq = dfc->freq_table[i];
 		bool want_enable = i >= cdev_state ? true : false;
 
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_exact(dev, freq, !want_enable);
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 
 		if (PTR_ERR(opp) == -ERANGE)
 			continue;
@@ -221,15 +220,12 @@ get_static_power(struct devfreq_cooling_device *dfc, unsigned long freq)
 	if (!dfc->power_ops->get_static_power)
 		return 0;
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_exact(dev, freq, true);
 	if (IS_ERR(opp) && (PTR_ERR(opp) == -ERANGE))
 		opp = dev_pm_opp_find_freq_exact(dev, freq, false);
 
 	voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (voltage == 0) {
 		dev_warn_ratelimited(dev,
@@ -411,18 +407,14 @@ static int devfreq_cooling_gen_tables(struct devfreq_cooling_device *dfc)
 		unsigned long power_dyn, voltage;
 		struct dev_pm_opp *opp;
 
-		rcu_read_lock();
-
 		opp = dev_pm_opp_find_freq_floor(dev, &freq);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			ret = PTR_ERR(opp);
 			goto free_tables;
 		}
 
 		voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
-
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 
 		if (dfc->power_ops) {
 			power_dyn = get_dynamic_power(dfc, freq, voltage);
-- 
2.7.1.410.g6faf27b

* [PATCH 07/12] PM / OPP: Update OPP users to put reference
@ 2016-12-07 10:37   ` Viresh Kumar
  0 siblings, 0 replies; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC (permalink / raw)
  To: Rafael Wysocki, Kevin Hilman, Tony Lindgren, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Peter De Schrijver,
	Prashant Gaikwad, Stephen Warren, Thierry Reding,
	Alexandre Courbot, Kukjin Kim, Krzysztof Kozlowski,
	Javier Martinez Canillas, MyungJoo Ham, Kyungmin Park,
	Chanwoo Choi, Amit Daniel Kachhap, Javi Merino, Zhang Rui
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

This patch updates the dev_pm_opp_find_freq_*() routines to take a
reference to the OPPs they return.

It also updates the users of the dev_pm_opp_find_freq_*() routines to
call dev_pm_opp_put() once they are done using the OPPs.

As it is now guaranteed that the OPPs won't get freed while they are in
use, the RCU read-side locking in the users isn't required anymore.
Drop it as well.

This patch also updates all users of devfreq_recommended_opp(), which
returns an OPP received from the OPP core.

Note that some of the OPP core routines have gained
rcu_read_{lock|unlock}() calls, as those still use RCU specific APIs
within them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
Cover-letter: lkml.kernel.org/r/cover.1481106919.git.viresh.kumar@linaro.org
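
The caller-side contract described above (the find helper returns an OPP
with a reference held, and the caller drops it after use) can be
sketched roughly as below. This is only a userspace illustration, not
kernel code: mock_opp, mock_table, mock_find_freq_ceil(), mock_opp_put()
and mock_voltage_for() are hypothetical names, a plain integer stands in
for the kref, and NULL stands in for the ERR_PTR() values the real
routines return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct dev_pm_opp; not kernel code. */
struct mock_opp {
	unsigned long rate;	/* Hz */
	unsigned long u_volt;	/* microvolts */
	int refcount;		/* plain counter standing in for the kref */
};

/* Sorted by ascending rate; each entry starts with the table's own reference. */
struct mock_opp mock_table[] = {
	{  500000000UL,  900000UL, 1 },
	{ 1000000000UL, 1000000UL, 1 },
	{ 1500000000UL, 1100000UL, 1 },
};

/* Drop a reference previously taken by a find helper. */
void mock_opp_put(struct mock_opp *opp)
{
	assert(opp && opp->refcount > 0);
	opp->refcount--;
}

/*
 * Return the lowest entry with rate >= *freq, with its reference count
 * incremented, or NULL if nothing matches. *freq is updated on success.
 */
struct mock_opp *mock_find_freq_ceil(unsigned long *freq)
{
	size_t i;

	for (i = 0; i < sizeof(mock_table) / sizeof(mock_table[0]); i++) {
		if (mock_table[i].rate >= *freq) {
			mock_table[i].refcount++;
			*freq = mock_table[i].rate;
			return &mock_table[i];
		}
	}

	return NULL;
}

/* Caller pattern: find, check for failure, read the data, then put. */
unsigned long mock_voltage_for(unsigned long freq)
{
	struct mock_opp *opp = mock_find_freq_ceil(&freq);
	unsigned long volt;

	if (!opp)
		return 0;	/* no reference was taken, so nothing to put */

	volt = opp->u_volt;
	mock_opp_put(opp);	/* balance the reference taken by the find */

	return volt;
}
```

Note that the put happens only after the failure check: a failed lookup
hands back no reference, so there is nothing to drop on that path.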

 arch/arm/mach-omap2/pm.c             |   5 +-
 drivers/base/power/opp/core.c        | 114 +++++++++++++++++++----------------
 drivers/base/power/opp/cpu.c         |  22 ++-----
 drivers/clk/tegra/clk-dfll.c         |  17 ++----
 drivers/cpufreq/exynos5440-cpufreq.c |   5 +-
 drivers/cpufreq/imx6q-cpufreq.c      |  10 +--
 drivers/cpufreq/mt8173-cpufreq.c     |   8 +--
 drivers/cpufreq/omap-cpufreq.c       |   4 +-
 drivers/devfreq/devfreq.c            |  14 ++---
 drivers/devfreq/exynos-bus.c         |  14 ++---
 drivers/devfreq/governor_passive.c   |   4 +-
 drivers/devfreq/rk3399_dmc.c         |  16 ++---
 drivers/devfreq/tegra-devfreq.c      |   4 +-
 drivers/thermal/cpu_cooling.c        |  11 +---
 drivers/thermal/devfreq_cooling.c    |  14 +----
 15 files changed, 109 insertions(+), 153 deletions(-)

diff --git a/arch/arm/mach-omap2/pm.c b/arch/arm/mach-omap2/pm.c
index 678d2a31dcb8..c5a1d4439202 100644
--- a/arch/arm/mach-omap2/pm.c
+++ b/arch/arm/mach-omap2/pm.c
@@ -167,17 +167,16 @@ static int __init omap2_set_init_voltage(char *vdd_name, char *clk_name,
 	freq = clk_get_rate(clk);
 	clk_put(clk);
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(dev, &freq);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("%s: unable to find boot up OPP for vdd_%s\n",
 			__func__, vdd_name);
 		goto exit;
 	}
 
 	bootup_volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	if (!bootup_volt) {
 		pr_err("%s: unable to find voltage corresponding to the bootup OPP for vdd_%s\n",
 		       __func__, vdd_name);
diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 12be0f29f2ad..d112b1846327 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -40,6 +40,8 @@ do {									\
 			 "opp_table_lock protection");			\
 } while (0)
 
+static void dev_pm_opp_get(struct dev_pm_opp *opp);
+
 static struct opp_device *_find_opp_dev(const struct device *dev,
 					struct opp_table *opp_table)
 {
@@ -94,21 +96,13 @@ struct opp_table *_find_opp_table(struct device *dev)
  * return 0
  *
  * This is useful only for devices with single power supply.
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
 	unsigned long v = 0;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp))
@@ -116,6 +110,7 @@ unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
 	else
 		v = tmp_opp->supplies[0].u_volt;
 
+	rcu_read_unlock();
 	return v;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
@@ -126,21 +121,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
  *
  * Return: frequency in hertz corresponding to the opp, else
  * return 0
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
 	unsigned long f = 0;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available)
@@ -148,6 +135,7 @@ unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 	else
 		f = tmp_opp->rate;
 
+	rcu_read_unlock();
 	return f;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
@@ -161,20 +149,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
  * quickly. Running on them for longer times may overheat the chip.
  *
  * Return: true if opp is turbo opp, else false.
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
+	bool turbo;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available) {
@@ -182,7 +163,10 @@ bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
 		return false;
 	}
 
-	return tmp_opp->turbo;
+	turbo = tmp_opp->turbo;
+
+	rcu_read_unlock();
+	return turbo;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_is_turbo);
 
@@ -410,11 +394,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_count);
  * This provides a mechanism to enable an opp which is not available currently
  * or the opposite as well.
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 					      unsigned long freq,
@@ -423,13 +404,14 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 	struct opp_table *opp_table;
 	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
 		int r = PTR_ERR(opp_table);
 
 		dev_err(dev, "%s: OPP table not found (%d)\n", __func__, r);
+		rcu_read_unlock();
 		return ERR_PTR(r);
 	}
 
@@ -437,10 +419,15 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 		if (temp_opp->available == available &&
 				temp_opp->rate == freq) {
 			opp = temp_opp;
+
+			/* Increment the reference count of OPP */
+			dev_pm_opp_get(opp);
 			break;
 		}
 	}
 
+	rcu_read_unlock();
+
 	return opp;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_exact);
@@ -454,6 +441,9 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
 		if (temp_opp->available && temp_opp->rate >= *freq) {
 			opp = temp_opp;
 			*freq = opp->rate;
+
+			/* Increment the reference count of OPP */
+			dev_pm_opp_get(opp);
 			break;
 		}
 	}
@@ -476,29 +466,33 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
  * ERANGE:	no match found for search
  * ENODEV:	if device not found in list of registered devices
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev,
 					     unsigned long *freq)
 {
 	struct opp_table *opp_table;
-
-	opp_rcu_lockdep_assert();
+	struct dev_pm_opp *opp;
 
 	if (!dev || !freq) {
 		dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
 		return ERR_PTR(-EINVAL);
 	}
 
+	rcu_read_lock();
+
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table))
+	if (IS_ERR(opp_table)) {
+		rcu_read_unlock();
 		return ERR_CAST(opp_table);
+	}
+
+	opp = _find_freq_ceil(opp_table, freq);
 
-	return _find_freq_ceil(opp_table, freq);
+	rcu_read_unlock();
+
+	return opp;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
 
@@ -517,11 +511,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
  * ERANGE:	no match found for search
  * ENODEV:	if device not found in list of registered devices
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 					      unsigned long *freq)
@@ -529,16 +520,18 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 	struct opp_table *opp_table;
 	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
 
-	opp_rcu_lockdep_assert();
-
 	if (!dev || !freq) {
 		dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
 		return ERR_PTR(-EINVAL);
 	}
 
+	rcu_read_lock();
+
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table))
+	if (IS_ERR(opp_table)) {
+		rcu_read_unlock();
 		return ERR_CAST(opp_table);
+	}
 
 	list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
 		if (temp_opp->available) {
@@ -549,6 +542,12 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 				opp = temp_opp;
 		}
 	}
+
+	/* Increment the reference count of OPP */
+	if (!IS_ERR(opp))
+		dev_pm_opp_get(opp);
+	rcu_read_unlock();
+
 	if (!IS_ERR(opp))
 		*freq = opp->rate;
 
@@ -736,6 +735,8 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 		ret = PTR_ERR(opp);
 		dev_err(dev, "%s: failed to find OPP for freq %lu (%d)\n",
 			__func__, freq, ret);
+		if (!IS_ERR(old_opp))
+			dev_pm_opp_put(old_opp);
 		rcu_read_unlock();
 		return ret;
 	}
@@ -747,6 +748,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 
 	/* Only frequency scaling */
 	if (!regulators) {
+		dev_pm_opp_put(opp);
+		if (!IS_ERR(old_opp))
+			dev_pm_opp_put(old_opp);
 		rcu_read_unlock();
 		return _generic_set_opp_clk_only(dev, clk, old_freq, freq);
 	}
@@ -772,6 +776,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 	data->new_opp.rate = freq;
 	memcpy(data->new_opp.supplies, opp->supplies, size);
 
+	dev_pm_opp_put(opp);
+	if (!IS_ERR(old_opp))
+		dev_pm_opp_put(old_opp);
 	rcu_read_unlock();
 
 	return set_opp(data);
@@ -967,6 +974,11 @@ static void _opp_kref_release(struct kref *kref)
 	dev_pm_opp_put_opp_table(opp_table);
 }
 
+static void dev_pm_opp_get(struct dev_pm_opp *opp)
+{
+	kref_get(&opp->kref);
+}
+
 void dev_pm_opp_put(struct dev_pm_opp *opp)
 {
 	kref_put_mutex(&opp->kref, _opp_kref_release, &opp->opp_table->lock);
diff --git a/drivers/base/power/opp/cpu.c b/drivers/base/power/opp/cpu.c
index 8c3434bdb26d..adef788862d5 100644
--- a/drivers/base/power/opp/cpu.c
+++ b/drivers/base/power/opp/cpu.c
@@ -42,11 +42,6 @@
  *
  * WARNING: It is  important for the callers to ensure refreshing their copy of
  * the table if any of the mentioned functions have been invoked in the interim.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Since we just use the regular accessor functions to access the internal data
- * structures, we use RCU read lock inside this function. As a result, users of
- * this function DONOT need to use explicit locks for invoking.
  */
 int dev_pm_opp_init_cpufreq_table(struct device *dev,
 				  struct cpufreq_frequency_table **table)
@@ -56,19 +51,13 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 	int i, max_opps, ret = 0;
 	unsigned long rate;
 
-	rcu_read_lock();
-
 	max_opps = dev_pm_opp_get_opp_count(dev);
-	if (max_opps <= 0) {
-		ret = max_opps ? max_opps : -ENODATA;
-		goto out;
-	}
+	if (max_opps <= 0)
+		return max_opps ? max_opps : -ENODATA;
 
 	freq_table = kcalloc((max_opps + 1), sizeof(*freq_table), GFP_ATOMIC);
-	if (!freq_table) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!freq_table)
+		return -ENOMEM;
 
 	for (i = 0, rate = 0; i < max_opps; i++, rate++) {
 		/* find next rate */
@@ -83,6 +72,8 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 		/* Is Boost/turbo opp ? */
 		if (dev_pm_opp_is_turbo(opp))
 			freq_table[i].flags = CPUFREQ_BOOST_FREQ;
+
+		dev_pm_opp_put(opp);
 	}
 
 	freq_table[i].driver_data = i;
@@ -91,7 +82,6 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 	*table = &freq_table[0];
 
 out:
-	rcu_read_unlock();
 	if (ret)
 		kfree(freq_table);
 
diff --git a/drivers/clk/tegra/clk-dfll.c b/drivers/clk/tegra/clk-dfll.c
index f010562534eb..2c44aeb0b97c 100644
--- a/drivers/clk/tegra/clk-dfll.c
+++ b/drivers/clk/tegra/clk-dfll.c
@@ -633,16 +633,12 @@ static int find_lut_index_for_rate(struct tegra_dfll *td, unsigned long rate)
 	struct dev_pm_opp *opp;
 	int i, uv;
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_ceil(td->soc->dev, &rate);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
-	uv = dev_pm_opp_get_voltage(opp);
 
-	rcu_read_unlock();
+	uv = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 
 	for (i = 0; i < td->i2c_lut_size; i++) {
 		if (regulator_list_voltage(td->vdd_reg, td->i2c_lut[i]) == uv)
@@ -1440,8 +1436,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 	struct dev_pm_opp *opp;
 	int lut;
 
-	rcu_read_lock();
-
 	rate = ULONG_MAX;
 	opp = dev_pm_opp_find_freq_floor(td->soc->dev, &rate);
 	if (IS_ERR(opp)) {
@@ -1449,6 +1443,7 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		goto out;
 	}
 	v_max = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 
 	v = td->soc->cvb->min_millivolts * 1000;
 	lut = find_vdd_map_entry_exact(td, v);
@@ -1465,6 +1460,8 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		if (v_opp <= td->soc->cvb->min_millivolts * 1000)
 			td->dvco_rate_min = dev_pm_opp_get_freq(opp);
 
+		dev_pm_opp_put(opp);
+
 		for (;;) {
 			v += max(1, (v_max - v) / (MAX_DFLL_VOLTAGES - j));
 			if (v >= v_opp)
@@ -1496,8 +1493,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		ret = 0;
 
 out:
-	rcu_read_unlock();
-
 	return ret;
 }
 
diff --git a/drivers/cpufreq/exynos5440-cpufreq.c b/drivers/cpufreq/exynos5440-cpufreq.c
index c0f3373706f4..9180d34cc9fc 100644
--- a/drivers/cpufreq/exynos5440-cpufreq.c
+++ b/drivers/cpufreq/exynos5440-cpufreq.c
@@ -118,12 +118,10 @@ static int init_div_table(void)
 	unsigned int tmp, clk_div, ema_div, freq, volt_id;
 	struct dev_pm_opp *opp;
 
-	rcu_read_lock();
 	cpufreq_for_each_entry(pos, freq_tbl) {
 		opp = dev_pm_opp_find_freq_exact(dvfs_info->dev,
 					pos->frequency * 1000, true);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			dev_err(dvfs_info->dev,
 				"failed to find valid OPP for %u KHZ\n",
 				pos->frequency);
@@ -140,6 +138,7 @@ static int init_div_table(void)
 
 		/* Calculate EMA */
 		volt_id = dev_pm_opp_get_voltage(opp);
+
 		volt_id = (MAX_VOLTAGE - volt_id) / VOLTAGE_STEP;
 		if (volt_id < PMIC_HIGH_VOLT) {
 			ema_div = (CPUEMA_HIGH << P0_7_CPUEMA_SHIFT) |
@@ -157,9 +156,9 @@ static int init_div_table(void)
 
 		__raw_writel(tmp, dvfs_info->base + XMU_PMU_P0_7 + 4 *
 						(pos - freq_tbl));
+		dev_pm_opp_put(opp);
 	}
 
-	rcu_read_unlock();
 	return 0;
 }
 
diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
index ef1fa8145419..7719b02e04f5 100644
--- a/drivers/cpufreq/imx6q-cpufreq.c
+++ b/drivers/cpufreq/imx6q-cpufreq.c
@@ -53,16 +53,15 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
 	freq_hz = new_freq * 1000;
 	old_freq = clk_get_rate(arm_clk) / 1000;
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		dev_err(cpu_dev, "failed to find OPP for %ld\n", freq_hz);
 		return PTR_ERR(opp);
 	}
 
 	volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	volt_old = regulator_get_voltage(arm_reg);
 
 	dev_dbg(cpu_dev, "%u MHz, %ld mV --> %u MHz, %ld mV\n",
@@ -321,14 +320,15 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
 	 * freq_table initialised from OPP is therefore sorted in the
 	 * same order.
 	 */
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_exact(cpu_dev,
 				  freq_table[0].frequency * 1000, true);
 	min_volt = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 	opp = dev_pm_opp_find_freq_exact(cpu_dev,
 				  freq_table[--num].frequency * 1000, true);
 	max_volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	ret = regulator_set_voltage_time(arm_reg, min_volt, max_volt);
 	if (ret > 0)
 		transition_latency += ret * 1000;
diff --git a/drivers/cpufreq/mt8173-cpufreq.c b/drivers/cpufreq/mt8173-cpufreq.c
index 643f43179df1..ab25b1235a5e 100644
--- a/drivers/cpufreq/mt8173-cpufreq.c
+++ b/drivers/cpufreq/mt8173-cpufreq.c
@@ -232,16 +232,14 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 
 	freq_hz = freq_table[index].frequency * 1000;
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("cpu%d: failed to find OPP for %ld\n",
 		       policy->cpu, freq_hz);
 		return PTR_ERR(opp);
 	}
 	vproc = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	/*
 	 * If the new voltage or the intermediate voltage is higher than the
@@ -411,16 +409,14 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 
 	/* Search a safe voltage for intermediate frequency. */
 	rate = clk_get_rate(inter_clk);
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &rate);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("failed to get intermediate opp for cpu%d\n", cpu);
 		ret = PTR_ERR(opp);
 		goto out_free_opp_table;
 	}
 	info->intermediate_voltage = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	info->cpu_dev = cpu_dev;
 	info->proc_reg = proc_reg;
diff --git a/drivers/cpufreq/omap-cpufreq.c b/drivers/cpufreq/omap-cpufreq.c
index 376e63ca94e8..71e81bbf031b 100644
--- a/drivers/cpufreq/omap-cpufreq.c
+++ b/drivers/cpufreq/omap-cpufreq.c
@@ -63,16 +63,14 @@ static int omap_target(struct cpufreq_policy *policy, unsigned int index)
 	freq = ret;
 
 	if (mpu_reg) {
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_ceil(mpu_dev, &freq);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			dev_err(mpu_dev, "%s: unable to find MPU OPP for %d\n",
 				__func__, new_freq);
 			return -EINVAL;
 		}
 		volt = dev_pm_opp_get_voltage(opp);
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 		tol = volt * OPP_TOLERANCE / 100;
 		volt_old = regulator_get_voltage(mpu_reg);
 	}
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index b0de42972b74..89add0d7c017 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -111,18 +111,16 @@ static void devfreq_set_freq_table(struct devfreq *devfreq)
 		return;
 	}
 
-	rcu_read_lock();
 	for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
 		opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
 		if (IS_ERR(opp)) {
 			devm_kfree(devfreq->dev.parent, profile->freq_table);
 			profile->max_state = 0;
-			rcu_read_unlock();
 			return;
 		}
+		dev_pm_opp_put(opp);
 		profile->freq_table[i] = freq;
 	}
-	rcu_read_unlock();
 }
 
 /**
@@ -1107,9 +1105,9 @@ static ssize_t available_frequencies_show(struct device *d,
 	ssize_t count = 0;
 	unsigned long freq = 0;
 
-	rcu_read_lock();
 	do {
 		opp = dev_pm_opp_find_freq_ceil(dev, &freq);
 		if (IS_ERR(opp))
 			break;
+		dev_pm_opp_put(opp);
 
@@ -1117,7 +1115,6 @@ static ssize_t available_frequencies_show(struct device *d,
 				   "%lu ", freq);
 		freq++;
 	} while (1);
-	rcu_read_unlock();
 
 	/* Truncate the trailing space */
 	if (count)
@@ -1219,11 +1216,8 @@ subsys_initcall(devfreq_init);
  * @freq:	The frequency given to target function
  * @flags:	Flags handed from devfreq framework.
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
 					   unsigned long *freq,
diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
index a8ed7792ece2..49ce38cef460 100644
--- a/drivers/devfreq/exynos-bus.c
+++ b/drivers/devfreq/exynos-bus.c
@@ -103,18 +103,17 @@ static int exynos_bus_target(struct device *dev, unsigned long *freq, u32 flags)
 	int ret = 0;
 
 	/* Get new opp-bus instance according to new bus clock */
-	rcu_read_lock();
 	new_opp = devfreq_recommended_opp(dev, freq, flags);
 	if (IS_ERR(new_opp)) {
 		dev_err(dev, "failed to get recommended opp instance\n");
-		rcu_read_unlock();
 		return PTR_ERR(new_opp);
 	}
 
 	new_freq = dev_pm_opp_get_freq(new_opp);
 	new_volt = dev_pm_opp_get_voltage(new_opp);
+	dev_pm_opp_put(new_opp);
+
 	old_freq = bus->curr_freq;
-	rcu_read_unlock();
 
 	if (old_freq == new_freq)
 		return 0;
@@ -214,17 +213,16 @@ static int exynos_bus_passive_target(struct device *dev, unsigned long *freq,
 	int ret = 0;
 
 	/* Get new opp-bus instance according to new bus clock */
-	rcu_read_lock();
 	new_opp = devfreq_recommended_opp(dev, freq, flags);
 	if (IS_ERR(new_opp)) {
 		dev_err(dev, "failed to get recommended opp instance\n");
-		rcu_read_unlock();
 		return PTR_ERR(new_opp);
 	}
 
 	new_freq = dev_pm_opp_get_freq(new_opp);
+	dev_pm_opp_put(new_opp);
+
 	old_freq = bus->curr_freq;
-	rcu_read_unlock();
 
 	if (old_freq == new_freq)
 		return 0;
@@ -358,16 +356,14 @@ static int exynos_bus_parse_of(struct device_node *np,
 
 	rate = clk_get_rate(bus->clk);
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &rate, 0);
 	if (IS_ERR(opp)) {
 		dev_err(dev, "failed to find dev_pm_opp\n");
-		rcu_read_unlock();
 		ret = PTR_ERR(opp);
 		goto err_opp;
 	}
 	bus->curr_freq = dev_pm_opp_get_freq(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	return 0;
 
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 9ef46e2592c4..671a1e0afc6e 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -59,9 +59,9 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
 	 * list of parent device. Because in this case, *freq is temporary
 	 * value which is decided by ondemand governor.
 	 */
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(parent_devfreq->dev.parent, freq, 0);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	if (IS_ERR(opp)) {
 		ret = PTR_ERR(opp);
 		goto out;
diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
index 27d2f349b53c..40a2499730fc 100644
--- a/drivers/devfreq/rk3399_dmc.c
+++ b/drivers/devfreq/rk3399_dmc.c
@@ -91,17 +91,13 @@ static int rk3399_dmcfreq_target(struct device *dev, unsigned long *freq,
 	unsigned long target_volt, target_rate;
 	int err;
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, freq, flags);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
 
 	target_rate = dev_pm_opp_get_freq(opp);
 	target_volt = dev_pm_opp_get_voltage(opp);
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (dmcfreq->rate == target_rate)
 		return 0;
@@ -422,15 +418,13 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev)
 
 	data->rate = clk_get_rate(data->dmc_clk);
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &data->rate, 0);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
+
 	data->rate = dev_pm_opp_get_freq(opp);
 	data->volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	rk3399_devfreq_dmc_profile.initial_freq = data->rate;
 
diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
index fe9dce0245bf..214fff96fa4a 100644
--- a/drivers/devfreq/tegra-devfreq.c
+++ b/drivers/devfreq/tegra-devfreq.c
@@ -487,15 +487,13 @@ static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
 	struct dev_pm_opp *opp;
 	unsigned long rate = *freq * KHZ;
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &rate, flags);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		dev_err(dev, "Failed to find opp for %lu KHz\n", *freq);
 		return PTR_ERR(opp);
 	}
 	rate = dev_pm_opp_get_freq(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	clk_set_min_rate(tegra->emc_clock, rate);
 	clk_set_rate(tegra->emc_clock, 0);
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 9ce0e9eef923..85fdbf762fa0 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -297,8 +297,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 	if (!power_table)
 		return -ENOMEM;
 
-	rcu_read_lock();
-
 	for (freq = 0, i = 0;
 	     opp = dev_pm_opp_find_freq_ceil(dev, &freq), !IS_ERR(opp);
 	     freq++, i++) {
@@ -306,13 +304,13 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 		u64 power;
 
 		if (i >= num_opps) {
-			rcu_read_unlock();
 			ret = -EAGAIN;
 			goto free_power_table;
 		}
 
 		freq_mhz = freq / 1000000;
 		voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
+		dev_pm_opp_put(opp);
 
 		/*
 		 * Do the multiplication with MHz and millivolt so as
@@ -328,8 +326,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 		power_table[i].power = power;
 	}
 
-	rcu_read_unlock();
-
 	if (i != num_opps) {
 		ret = PTR_ERR(opp);
 		goto free_power_table;
@@ -433,13 +429,10 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_device,
 		return 0;
 	}
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_exact(cpufreq_device->cpu_dev, freq_hz,
 					 true);
 	voltage = dev_pm_opp_get_voltage(opp);
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (voltage == 0) {
 		dev_warn_ratelimited(cpufreq_device->cpu_dev,
diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
index 81631b110e17..55839dd2ded2 100644
--- a/drivers/thermal/devfreq_cooling.c
+++ b/drivers/thermal/devfreq_cooling.c
@@ -113,9 +113,8 @@ static int partition_enable_opps(struct devfreq_cooling_device *dfc,
 		unsigned int freq = dfc->freq_table[i];
 		bool want_enable = i >= cdev_state ? true : false;
 
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_exact(dev, freq, !want_enable);
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 
 		if (PTR_ERR(opp) == -ERANGE)
 			continue;
@@ -221,15 +220,12 @@ get_static_power(struct devfreq_cooling_device *dfc, unsigned long freq)
 	if (!dfc->power_ops->get_static_power)
 		return 0;
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_exact(dev, freq, true);
 	if (IS_ERR(opp) && (PTR_ERR(opp) == -ERANGE))
 		opp = dev_pm_opp_find_freq_exact(dev, freq, false);
 
 	voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (voltage == 0) {
 		dev_warn_ratelimited(dev,
@@ -411,18 +407,14 @@ static int devfreq_cooling_gen_tables(struct devfreq_cooling_device *dfc)
 		unsigned long power_dyn, voltage;
 		struct dev_pm_opp *opp;
 
-		rcu_read_lock();
-
 		opp = dev_pm_opp_find_freq_floor(dev, &freq);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			ret = PTR_ERR(opp);
 			goto free_tables;
 		}
 
 		voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
-
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 
 		if (dfc->power_ops) {
 			power_dyn = get_dynamic_power(dfc, freq, voltage);
-- 
2.7.1.410.g6faf27b
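
The caller-side conversions in the diff above all follow one shape: look
up the OPP, bail out on error before touching the reference, copy the
values needed, then drop the reference. A minimal userspace sketch of
that shape follows. Everything in it (struct opp, opp_get(), opp_put(),
opp_find_ceil(), voltage_for()) is a hypothetical stand-in, not the
kernel API: it uses NULL where the kernel returns ERR_PTR() values, and
a plain int where the kernel uses an atomic kref.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dev_pm_opp. */
struct opp {
	unsigned long rate;	/* Hz */
	unsigned long u_volt;	/* microvolts */
	int refcount;		/* plain int; the kernel uses an atomic kref */
};

/* Sorted by rate; each entry starts with the reference held by its table. */
static struct opp opps[] = {
	{  500000000UL,  900000UL, 1 },
	{ 1000000000UL, 1100000UL, 1 },
};

static void opp_get(struct opp *o)
{
	o->refcount++;
}

static void opp_put(struct opp *o)
{
	/* Must only be called on a valid, referenced entry. */
	assert(o && o->refcount > 0);
	o->refcount--;
}

/*
 * Find the lowest entry with rate >= *freq, take a reference on it and
 * round *freq up to its rate. Returns NULL if no such entry exists.
 */
static struct opp *opp_find_ceil(unsigned long *freq)
{
	size_t i;

	for (i = 0; i < sizeof(opps) / sizeof(opps[0]); i++) {
		if (opps[i].rate >= *freq) {
			opp_get(&opps[i]);
			*freq = opps[i].rate;
			return &opps[i];
		}
	}

	return NULL;
}

/* Caller pattern: check for failure first, copy the data, then put. */
static unsigned long voltage_for(unsigned long freq)
{
	struct opp *o = opp_find_ceil(&freq);
	unsigned long volt;

	if (!o)
		return 0;	/* lookup failed: there is nothing to put */

	volt = o->u_volt;	/* copy while the reference is held */
	opp_put(o);

	return volt;
}
```

The ordering matters: the put must come after the error check, because a
failed lookup returns no object whose reference could be dropped.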

* [PATCH 08/12] PM / OPP: Take kref from _find_opp_table()
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
                   ` (6 preceding siblings ...)
  2016-12-07 10:37   ` Viresh Kumar
@ 2016-12-07 10:37 ` Viresh Kumar
  2017-01-09 23:49   ` Stephen Boyd
  2016-12-07 10:37 ` [PATCH 09/12] PM / OPP: Move away from RCU locking Viresh Kumar
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

Take a reference to the OPP table from within _find_opp_table(), and
update the callers of _find_opp_table() to call
dev_pm_opp_put_opp_table() once they are done with the OPP table.

Note that _find_opp_table() increments the reference count while holding
opp_table_lock.

Now that the OPP table can't get freed until the callers of
_find_opp_table() call dev_pm_opp_put_opp_table(), there is no need to
take opp_table_lock or rcu_read_lock() around the lookup. Drop them.
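
The scheme described above can be sketched in userspace as follows. This
is only an illustration under stated assumptions, not the kernel code:
struct table, table_add(), table_find(), table_put() and get_latency()
are hypothetical names, a pthread mutex stands in for opp_table_lock, a
plain int guarded by that mutex stands in for the kref, and NULL stands
in for the ERR_PTR() return values.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct opp_table. */
struct table {
	int dev_id;		/* stands in for the device pointer */
	int refcount;		/* stands in for the kref */
	int latency_ns;		/* example payload read by callers */
	struct table *next;
};

static struct table *tables;	/* list of all tables */
static pthread_mutex_t table_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Create a table and link it into the list with one initial reference. */
static struct table *table_add(int dev_id, int latency_ns)
{
	struct table *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;

	t->dev_id = dev_id;
	t->latency_ns = latency_ns;
	t->refcount = 1;

	pthread_mutex_lock(&table_list_lock);
	t->next = tables;
	tables = t;
	pthread_mutex_unlock(&table_list_lock);

	return t;
}

/* Look up a table and take a reference before the list lock is dropped. */
static struct table *table_find(int dev_id)
{
	struct table *t;

	pthread_mutex_lock(&table_list_lock);
	for (t = tables; t; t = t->next) {
		if (t->dev_id == dev_id) {
			t->refcount++;
			break;
		}
	}
	pthread_mutex_unlock(&table_list_lock);

	return t;	/* NULL when no table matches */
}

/* Drop a reference; unlink and free on the last put. Returns 1 if freed. */
static int table_put(struct table *t)
{
	struct table **pp;
	int freed = 0;

	pthread_mutex_lock(&table_list_lock);
	assert(t->refcount > 0);
	if (--t->refcount == 0) {
		for (pp = &tables; *pp; pp = &(*pp)->next) {
			if (*pp == t) {
				*pp = t->next;
				break;
			}
		}
		free(t);
		freed = 1;
	}
	pthread_mutex_unlock(&table_list_lock);

	return freed;
}

/* Caller pattern: no lock is held around the lookup or the read. */
static int get_latency(int dev_id)
{
	struct table *t = table_find(dev_id);
	int latency;

	if (!t)
		return 0;

	latency = t->latency_ns;	/* safe: our reference pins the table */
	table_put(t);

	return latency;
}
```

Because the reference is taken before the list lock is released, the
table cannot be freed between the lookup and the caller's use of it,
which is why callers no longer need to hold a lock across the call.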

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 191 +++++++++++++++++++-----------------------
 drivers/base/power/opp/cpu.c  |  26 ++----
 2 files changed, 95 insertions(+), 122 deletions(-)

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index d112b1846327..2b689fc73596 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -54,6 +54,21 @@ static struct opp_device *_find_opp_dev(const struct device *dev,
 	return NULL;
 }
 
+struct opp_table *_find_opp_table_unlocked(struct device *dev)
+{
+	struct opp_table *opp_table;
+
+	list_for_each_entry(opp_table, &opp_tables, node) {
+		if (_find_opp_dev(dev, opp_table)) {
+			_get_opp_table_kref(opp_table);
+
+			return opp_table;
+		}
+	}
+
+	return ERR_PTR(-ENODEV);
+}
+
 /**
  * _find_opp_table() - find opp_table struct using device pointer
  * @dev:	device pointer used to lookup OPP table
@@ -64,28 +79,22 @@ static struct opp_device *_find_opp_dev(const struct device *dev,
  * Return: pointer to 'struct opp_table' if found, otherwise -ENODEV or
  * -EINVAL based on type of error.
  *
- * Locking: For readers, this function must be called under rcu_read_lock().
- * opp_table is a RCU protected pointer, which means that opp_table is valid
- * as long as we are under RCU lock.
- *
- * For Writers, this function must be called with opp_table_lock held.
+ * The callers must call dev_pm_opp_put_opp_table() after the table is used.
  */
 struct opp_table *_find_opp_table(struct device *dev)
 {
 	struct opp_table *opp_table;
 
-	opp_rcu_lockdep_assert();
-
 	if (IS_ERR_OR_NULL(dev)) {
 		pr_err("%s: Invalid parameters\n", __func__);
 		return ERR_PTR(-EINVAL);
 	}
 
-	list_for_each_entry_rcu(opp_table, &opp_tables, node)
-		if (_find_opp_dev(dev, opp_table))
-			return opp_table;
+	mutex_lock(&opp_table_lock);
+	opp_table = _find_opp_table_unlocked(dev);
+	mutex_unlock(&opp_table_lock);
 
-	return ERR_PTR(-ENODEV);
+	return opp_table;
 }
 
 /**
@@ -175,23 +184,20 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_is_turbo);
  * @dev:	device for which we do this operation
  *
  * Return: This function returns the max clock latency in nanoseconds.
- *
- * Locking: This function takes rcu_read_lock().
  */
 unsigned long dev_pm_opp_get_max_clock_latency(struct device *dev)
 {
 	struct opp_table *opp_table;
 	unsigned long clock_latency_ns;
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table))
-		clock_latency_ns = 0;
-	else
-		clock_latency_ns = opp_table->clock_latency_ns_max;
+		return 0;
+
+	clock_latency_ns = opp_table->clock_latency_ns_max;
+
+	dev_pm_opp_put_opp_table(opp_table);
 
-	rcu_read_unlock();
 	return clock_latency_ns;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_max_clock_latency);
@@ -201,15 +207,13 @@ static int _get_regulator_count(struct device *dev)
 	struct opp_table *opp_table;
 	int count;
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
-	if (!IS_ERR(opp_table))
-		count = opp_table->regulator_count;
-	else
-		count = 0;
+	if (IS_ERR(opp_table))
+		return 0;
 
-	rcu_read_unlock();
+	count = opp_table->regulator_count;
+
+	dev_pm_opp_put_opp_table(opp_table);
 
 	return count;
 }
@@ -248,13 +252,11 @@ unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
 	if (!uV)
 		goto free_regulators;
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp_table))
 		goto free_uV;
-	}
+
+	rcu_read_lock();
 
 	memcpy(regulators, opp_table->regulators, count * sizeof(*regulators));
 
@@ -274,6 +276,7 @@ unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
 	}
 
 	rcu_read_unlock();
+	dev_pm_opp_put_opp_table(opp_table);
 
 	/*
 	 * The caller needs to ensure that opp_table (and hence the regulator)
@@ -323,17 +326,15 @@ unsigned long dev_pm_opp_get_suspend_opp_freq(struct device *dev)
 	struct opp_table *opp_table;
 	unsigned long freq = 0;
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table) || !opp_table->suspend_opp ||
-	    !opp_table->suspend_opp->available)
-		goto unlock;
+	if (IS_ERR(opp_table))
+		return 0;
 
-	freq = dev_pm_opp_get_freq(opp_table->suspend_opp);
+	if (opp_table->suspend_opp && opp_table->suspend_opp->available)
+		freq = dev_pm_opp_get_freq(opp_table->suspend_opp);
+
+	dev_pm_opp_put_opp_table(opp_table);
 
-unlock:
-	rcu_read_unlock();
 	return freq;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_suspend_opp_freq);
@@ -353,23 +354,24 @@ int dev_pm_opp_get_opp_count(struct device *dev)
 	struct dev_pm_opp *temp_opp;
 	int count = 0;
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
 		count = PTR_ERR(opp_table);
 		dev_err(dev, "%s: OPP table not found (%d)\n",
 			__func__, count);
-		goto out_unlock;
+		return count;
 	}
 
+	rcu_read_lock();
+
 	list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
 		if (temp_opp->available)
 			count++;
 	}
 
-out_unlock:
 	rcu_read_unlock();
+	dev_pm_opp_put_opp_table(opp_table);
+
 	return count;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_count);
@@ -404,17 +406,16 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 	struct opp_table *opp_table;
 	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
 		int r = PTR_ERR(opp_table);
 
 		dev_err(dev, "%s: OPP table not found (%d)\n", __func__, r);
-		rcu_read_unlock();
 		return ERR_PTR(r);
 	}
 
+	rcu_read_lock();
+
 	list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
 		if (temp_opp->available == available &&
 				temp_opp->rate == freq) {
@@ -427,6 +428,7 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 	}
 
 	rcu_read_unlock();
+	dev_pm_opp_put_opp_table(opp_table);
 
 	return opp;
 }
@@ -480,17 +482,16 @@ struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev,
 		return ERR_PTR(-EINVAL);
 	}
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp_table))
 		return ERR_CAST(opp_table);
-	}
+
+	rcu_read_lock();
 
 	opp = _find_freq_ceil(opp_table, freq);
 
 	rcu_read_unlock();
+	dev_pm_opp_put_opp_table(opp_table);
 
 	return opp;
 }
@@ -525,13 +526,11 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 		return ERR_PTR(-EINVAL);
 	}
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp_table))
 		return ERR_CAST(opp_table);
-	}
+
+	rcu_read_lock();
 
 	list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
 		if (temp_opp->available) {
@@ -547,6 +546,7 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 	if (!IS_ERR(opp))
 		dev_pm_opp_get(opp);
 	rcu_read_unlock();
+	dev_pm_opp_put_opp_table(opp_table);
 
 	if (!IS_ERR(opp))
 		*freq = opp->rate;
@@ -564,22 +564,18 @@ static struct clk *_get_opp_clk(struct device *dev)
 	struct opp_table *opp_table;
 	struct clk *clk;
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
 		dev_err(dev, "%s: device opp doesn't exist\n", __func__);
-		clk = ERR_CAST(opp_table);
-		goto unlock;
+		return ERR_CAST(opp_table);
 	}
 
 	clk = opp_table->clk;
 	if (IS_ERR(clk))
 		dev_err(dev, "%s: No clock available for the device\n",
 			__func__);
+	dev_pm_opp_put_opp_table(opp_table);
 
-unlock:
-	rcu_read_unlock();
 	return clk;
 }
 
@@ -715,15 +711,14 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 		return 0;
 	}
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
 		dev_err(dev, "%s: device opp doesn't exist\n", __func__);
-		rcu_read_unlock();
 		return PTR_ERR(opp_table);
 	}
 
+	rcu_read_lock();
+
 	old_opp = _find_freq_ceil(opp_table, &old_freq);
 	if (IS_ERR(old_opp)) {
 		dev_err(dev, "%s: failed to find current OPP for freq %lu (%ld)\n",
@@ -738,6 +733,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 		if (!IS_ERR(old_opp))
 			dev_pm_opp_put(old_opp);
 		rcu_read_unlock();
+		dev_pm_opp_put_opp_table(opp_table);
 		return ret;
 	}
 
@@ -752,6 +748,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 		if (!IS_ERR(old_opp))
 			dev_pm_opp_put(old_opp);
 		rcu_read_unlock();
+		dev_pm_opp_put_opp_table(opp_table);
 		return _generic_set_opp_clk_only(dev, clk, old_freq, freq);
 	}
 
@@ -780,6 +777,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 	if (!IS_ERR(old_opp))
 		dev_pm_opp_put(old_opp);
 	rcu_read_unlock();
+	dev_pm_opp_put_opp_table(opp_table);
 
 	return set_opp(data);
 }
@@ -893,11 +891,9 @@ struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
 	/* Hold our table modification lock here */
 	mutex_lock(&opp_table_lock);
 
-	opp_table = _find_opp_table(dev);
-	if (!IS_ERR(opp_table)) {
-		_get_opp_table_kref(opp_table);
+	opp_table = _find_opp_table_unlocked(dev);
+	if (!IS_ERR(opp_table))
 		goto unlock;
-	}
 
 	opp_table = _allocate_opp_table(dev);
 
@@ -1004,12 +1000,9 @@ void dev_pm_opp_remove(struct device *dev, unsigned long freq)
 	struct opp_table *opp_table;
 	bool found = false;
 
-	/* Hold our table modification lock here */
-	mutex_lock(&opp_table_lock);
-
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table))
-		goto unlock;
+		return;
 
 	mutex_lock(&opp_table->lock);
 
@@ -1022,15 +1015,14 @@ void dev_pm_opp_remove(struct device *dev, unsigned long freq)
 
 	mutex_unlock(&opp_table->lock);
 
-	if (!found) {
+	if (found) {
+		dev_pm_opp_put(opp);
+	} else {
 		dev_warn(dev, "%s: Couldn't find OPP with freq: %lu\n",
 			 __func__, freq);
-		goto unlock;
 	}
 
-	dev_pm_opp_put(opp);
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_remove);
 
@@ -1649,14 +1641,12 @@ static int _opp_set_availability(struct device *dev, unsigned long freq,
 	if (!new_opp)
 		return -ENOMEM;
 
-	mutex_lock(&opp_table_lock);
-
 	/* Find the opp_table */
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
 		r = PTR_ERR(opp_table);
 		dev_warn(dev, "%s: Device OPP not found (%d)\n", __func__, r);
-		goto unlock;
+		goto free_opp;
 	}
 
 	mutex_lock(&opp_table->lock);
@@ -1669,8 +1659,6 @@ static int _opp_set_availability(struct device *dev, unsigned long freq,
 		}
 	}
 
-	mutex_unlock(&opp_table->lock);
-
 	if (IS_ERR(opp)) {
 		r = PTR_ERR(opp);
 		goto unlock;
@@ -1686,7 +1674,6 @@ static int _opp_set_availability(struct device *dev, unsigned long freq,
 	new_opp->available = availability_req;
 
 	list_replace_rcu(&opp->node, &new_opp->node);
-	mutex_unlock(&opp_table_lock);
 	call_srcu(&opp_table->srcu_head.srcu, &opp->rcu_head, _kfree_opp_rcu);
 
 	/* Notify the change of the OPP availability */
@@ -1697,10 +1684,14 @@ static int _opp_set_availability(struct device *dev, unsigned long freq,
 		srcu_notifier_call_chain(&opp_table->srcu_head,
 					 OPP_EVENT_DISABLE, new_opp);
 
+	mutex_unlock(&opp_table->lock);
+	dev_pm_opp_put_opp_table(opp_table);
 	return 0;
 
 unlock:
-	mutex_unlock(&opp_table_lock);
+	mutex_unlock(&opp_table->lock);
+	dev_pm_opp_put_opp_table(opp_table);
+free_opp:
 	kfree(new_opp);
 	return r;
 }
@@ -1768,18 +1759,16 @@ int dev_pm_opp_register_notifier(struct device *dev, struct notifier_block *nb)
 	struct opp_table *opp_table;
 	int ret;
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table)) {
-		ret = PTR_ERR(opp_table);
-		goto unlock;
-	}
+	if (IS_ERR(opp_table))
+		return PTR_ERR(opp_table);
+
+	rcu_read_lock();
 
 	ret = srcu_notifier_chain_register(&opp_table->srcu_head, nb);
 
-unlock:
 	rcu_read_unlock();
+	dev_pm_opp_put_opp_table(opp_table);
 
 	return ret;
 }
@@ -1798,18 +1787,14 @@ int dev_pm_opp_unregister_notifier(struct device *dev,
 	struct opp_table *opp_table;
 	int ret;
 
-	rcu_read_lock();
-
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table)) {
-		ret = PTR_ERR(opp_table);
-		goto unlock;
-	}
+	if (IS_ERR(opp_table))
+		return PTR_ERR(opp_table);
 
 	ret = srcu_notifier_chain_unregister(&opp_table->srcu_head, nb);
 
-unlock:
 	rcu_read_unlock();
+	dev_pm_opp_put_opp_table(opp_table);
 
 	return ret;
 }
@@ -1840,9 +1825,6 @@ void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all)
 {
 	struct opp_table *opp_table;
 
-	/* Hold our table modification lock here */
-	mutex_lock(&opp_table_lock);
-
 	/* Check for existing table for 'dev' */
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
@@ -1853,13 +1835,12 @@ void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all)
 			     IS_ERR_OR_NULL(dev) ?
 					"Invalid device" : dev_name(dev),
 			     error);
-		goto unlock;
+		return;
 	}
 
 	_dev_pm_opp_remove_table(opp_table, dev, remove_all);
 
-unlock:
-	mutex_unlock(&opp_table_lock);
+	dev_pm_opp_put_opp_table(opp_table);
 }
 
 /**
diff --git a/drivers/base/power/opp/cpu.c b/drivers/base/power/opp/cpu.c
index adef788862d5..df29f08eecc4 100644
--- a/drivers/base/power/opp/cpu.c
+++ b/drivers/base/power/opp/cpu.c
@@ -174,13 +174,9 @@ int dev_pm_opp_set_sharing_cpus(struct device *cpu_dev,
 	struct device *dev;
 	int cpu, ret = 0;
 
-	mutex_lock(&opp_table_lock);
-
 	opp_table = _find_opp_table(cpu_dev);
-	if (IS_ERR(opp_table)) {
-		ret = PTR_ERR(opp_table);
-		goto unlock;
-	}
+	if (IS_ERR(opp_table))
+		return PTR_ERR(opp_table);
 
 	for_each_cpu(cpu, cpumask) {
 		if (cpu == cpu_dev->id)
@@ -203,8 +199,8 @@ int dev_pm_opp_set_sharing_cpus(struct device *cpu_dev,
 		/* Mark opp-table as multiple CPUs are sharing it now */
 		opp_table->shared_opp = OPP_TABLE_ACCESS_SHARED;
 	}
-unlock:
-	mutex_unlock(&opp_table_lock);
+
+	dev_pm_opp_put_opp_table(opp_table);
 
 	return ret;
 }
@@ -232,17 +228,13 @@ int dev_pm_opp_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
 	struct opp_table *opp_table;
 	int ret = 0;
 
-	mutex_lock(&opp_table_lock);
-
 	opp_table = _find_opp_table(cpu_dev);
-	if (IS_ERR(opp_table)) {
-		ret = PTR_ERR(opp_table);
-		goto unlock;
-	}
+	if (IS_ERR(opp_table))
+		return PTR_ERR(opp_table);
 
 	if (opp_table->shared_opp == OPP_TABLE_ACCESS_UNKNOWN) {
 		ret = -EINVAL;
-		goto unlock;
+		goto put_opp_table;
 	}
 
 	cpumask_clear(cpumask);
@@ -254,8 +246,8 @@ int dev_pm_opp_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
 		cpumask_set_cpu(cpu_dev->id, cpumask);
 	}
 
-unlock:
-	mutex_unlock(&opp_table_lock);
+put_opp_table:
+	dev_pm_opp_put_opp_table(opp_table);
 
 	return ret;
 }
-- 
2.7.1.410.g6faf27b


* [PATCH 09/12] PM / OPP: Move away from RCU locking
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
                   ` (7 preceding siblings ...)
  2016-12-07 10:37 ` [PATCH 08/12] PM / OPP: Take kref from _find_opp_table() Viresh Kumar
@ 2016-12-07 10:37 ` Viresh Kumar
  2017-01-09 23:57   ` Stephen Boyd
  2016-12-07 10:37 ` [PATCH 10/12] PM / OPP: Simplify _opp_set_availability() Viresh Kumar
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

RCU locking isn't well suited to the OPP core. RCU fits reader-heavy
workloads better, while the OPP core has at most one or two readers at
a time.

On top of that, the way RCU locking was used within the OPP core was
getting very confusing. The individual OPPs are mostly well handled,
i.e. for an update a new structure was created and then that replaced
the older one. But the OPP tables were updated directly all the time
from various parts of the core. Though they were mostly used from within
RCU locked regions, they didn't have much to do with RCU and were
governed by the mutex instead.

And that, mixed with the 'opp_table_lock', has made the core even more
confusing.

Now that we are already managing the OPPs and the OPP tables with the
kernel's reference counting (kref) infrastructure, we can get rid of
RCU locking completely and simplify the code a lot.

Remove all RCU references from code and comments.

Do acquire opp_table->lock while traversing the list of OPPs, though.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 294 +++++++++++-------------------------------
 drivers/base/power/opp/cpu.c  |  18 ---
 drivers/base/power/opp/of.c   |  40 +-----
 drivers/base/power/opp/opp.h  |  22 +---
 4 files changed, 80 insertions(+), 294 deletions(-)
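
For readers who want the resulting scheme in one place, here is a rough
userspace sketch of the pattern the changelog describes: the list is
walked only with the per-table lock held, and a lookup hands back an
entry with its reference count already raised. This is an illustration
only, not kernel code. Every demo_* name is hypothetical, a pthread
mutex stands in for opp_table->lock, and a plain counter guarded by
that same mutex stands in for kref.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_table;

/* Hypothetical stand-in for struct dev_pm_opp. */
struct demo_opp {
	unsigned long rate;
	int available;
	int refcount;			/* stand-in for kref, guarded by table->lock */
	struct demo_table *table;
	struct demo_opp *next;
};

/* Hypothetical stand-in for struct opp_table. */
struct demo_table {
	pthread_mutex_t lock;		/* stand-in for opp_table->lock */
	struct demo_opp *opp_list;	/* kept sorted by increasing rate */
};

void demo_table_init(struct demo_table *t)
{
	pthread_mutex_init(&t->lock, NULL);
	t->opp_list = NULL;
}

/* Insert a new entry in sorted order; the list itself owns one reference. */
int demo_opp_add(struct demo_table *t, unsigned long rate)
{
	struct demo_opp *opp = calloc(1, sizeof(*opp));
	struct demo_opp **pp;

	if (!opp)
		return -1;

	opp->rate = rate;
	opp->available = 1;
	opp->refcount = 1;
	opp->table = t;

	pthread_mutex_lock(&t->lock);
	for (pp = &t->opp_list; *pp && (*pp)->rate < rate; pp = &(*pp)->next)
		;
	opp->next = *pp;
	*pp = opp;
	pthread_mutex_unlock(&t->lock);

	return 0;
}

/* Drop one reference; unlink and free on the last put. Caller holds the lock. */
static void demo_opp_put_locked(struct demo_table *t, struct demo_opp *opp)
{
	struct demo_opp **pp;

	assert(opp->refcount > 0);
	if (--opp->refcount > 0)
		return;

	for (pp = &t->opp_list; *pp; pp = &(*pp)->next) {
		if (*pp == opp) {
			*pp = opp->next;
			break;
		}
	}
	free(opp);
}

void demo_opp_put(struct demo_opp *opp)
{
	struct demo_table *t = opp->table;

	pthread_mutex_lock(&t->lock);
	demo_opp_put_locked(t, opp);
	pthread_mutex_unlock(&t->lock);
}

/* Lowest available entry with rate >= *freq, returned with a reference held. */
struct demo_opp *demo_find_freq_ceil(struct demo_table *t, unsigned long *freq)
{
	struct demo_opp *opp, *found = NULL;

	pthread_mutex_lock(&t->lock);
	for (opp = t->opp_list; opp; opp = opp->next) {
		if (opp->available && opp->rate >= *freq) {
			found = opp;
			*freq = opp->rate;
			found->refcount++;	/* taken before the lock is dropped */
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);

	return found;
}

/* Count available entries, walking the list under the table lock. */
int demo_opp_count(struct demo_table *t)
{
	struct demo_opp *opp;
	int count = 0;

	pthread_mutex_lock(&t->lock);
	for (opp = t->opp_list; opp; opp = opp->next)
		if (opp->available)
			count++;
	pthread_mutex_unlock(&t->lock);

	return count;
}

/* Drop the list's own reference; simplification: call at most once per rate. */
void demo_opp_remove(struct demo_table *t, unsigned long rate)
{
	struct demo_opp *opp;

	pthread_mutex_lock(&t->lock);
	for (opp = t->opp_list; opp; opp = opp->next) {
		if (opp->rate == rate) {
			demo_opp_put_locked(t, opp);
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);
}
```

In this sketch an entry that a caller still holds survives
demo_opp_remove() and is unlinked and freed only on the final
demo_opp_put(), which is the property that lets the deferred
call_srcu()/kfree_rcu() freeing go away. A real caller would pair every
successful lookup with exactly one put.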

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 2b689fc73596..b5e9600058c2 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -32,14 +32,6 @@ LIST_HEAD(opp_tables);
 /* Lock to allow exclusive modification to the device and opp lists */
 DEFINE_MUTEX(opp_table_lock);
 
-#define opp_rcu_lockdep_assert()					\
-do {									\
-	RCU_LOCKDEP_WARN(!rcu_read_lock_held() &&			\
-			 !lockdep_is_held(&opp_table_lock),		\
-			 "Missing rcu_read_lock() or "			\
-			 "opp_table_lock protection");			\
-} while (0)
-
 static void dev_pm_opp_get(struct dev_pm_opp *opp);
 
 static struct opp_device *_find_opp_dev(const struct device *dev,
@@ -73,8 +65,7 @@ struct opp_table *_find_opp_table_unlocked(struct device *dev)
  * _find_opp_table() - find opp_table struct using device pointer
  * @dev:	device pointer used to lookup OPP table
  *
- * Search OPP table for one containing matching device. Does a RCU reader
- * operation to grab the pointer needed.
+ * Search OPP table for one containing matching device.
  *
  * Return: pointer to 'struct opp_table' if found, otherwise -ENODEV or
  * -EINVAL based on type of error.
@@ -108,19 +99,12 @@ struct opp_table *_find_opp_table(struct device *dev)
  */
 unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
 {
-	struct dev_pm_opp *tmp_opp;
-	unsigned long v = 0;
-
-	rcu_read_lock();
-
-	tmp_opp = rcu_dereference(opp);
-	if (IS_ERR_OR_NULL(tmp_opp))
+	if (IS_ERR_OR_NULL(opp)) {
 		pr_err("%s: Invalid parameters\n", __func__);
-	else
-		v = tmp_opp->supplies[0].u_volt;
+		return 0;
+	}
 
-	rcu_read_unlock();
-	return v;
+	return opp->supplies[0].u_volt;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
 
@@ -133,19 +117,12 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
  */
 unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 {
-	struct dev_pm_opp *tmp_opp;
-	unsigned long f = 0;
-
-	rcu_read_lock();
-
-	tmp_opp = rcu_dereference(opp);
-	if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available)
+	if (IS_ERR_OR_NULL(opp) || !opp->available) {
 		pr_err("%s: Invalid parameters\n", __func__);
-	else
-		f = tmp_opp->rate;
+		return 0;
+	}
 
-	rcu_read_unlock();
-	return f;
+	return opp->rate;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
 
@@ -161,21 +138,12 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
  */
 bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
 {
-	struct dev_pm_opp *tmp_opp;
-	bool turbo;
-
-	rcu_read_lock();
-
-	tmp_opp = rcu_dereference(opp);
-	if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available) {
+	if (IS_ERR_OR_NULL(opp) || !opp->available) {
 		pr_err("%s: Invalid parameters\n", __func__);
 		return false;
 	}
 
-	turbo = tmp_opp->turbo;
-
-	rcu_read_unlock();
-	return turbo;
+	return opp->turbo;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_is_turbo);
 
@@ -223,8 +191,6 @@ static int _get_regulator_count(struct device *dev)
  * @dev: device for which we do this operation
  *
  * Return: This function returns the max voltage latency in nanoseconds.
- *
- * Locking: This function takes rcu_read_lock().
  */
 unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
 {
@@ -256,15 +222,15 @@ unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
 	if (IS_ERR(opp_table))
 		goto free_uV;
 
-	rcu_read_lock();
-
 	memcpy(regulators, opp_table->regulators, count * sizeof(*regulators));
 
+	mutex_lock(&opp_table->lock);
+
 	for (i = 0; i < count; i++) {
 		uV[i].min = ~0;
 		uV[i].max = 0;
 
-		list_for_each_entry_rcu(opp, &opp_table->opp_list, node) {
+		list_for_each_entry(opp, &opp_table->opp_list, node) {
 			if (!opp->available)
 				continue;
 
@@ -275,7 +241,7 @@ unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
 		}
 	}
 
-	rcu_read_unlock();
+	mutex_unlock(&opp_table->lock);
 	dev_pm_opp_put_opp_table(opp_table);
 
 	/*
@@ -304,8 +270,6 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_max_volt_latency);
  *
  * Return: This function returns the max transition latency, in nanoseconds, to
  * switch from one OPP to other.
- *
- * Locking: This function takes rcu_read_lock().
  */
 unsigned long dev_pm_opp_get_max_transition_latency(struct device *dev)
 {
@@ -345,8 +309,6 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_suspend_opp_freq);
  *
  * Return: This function returns the number of available opps if there are any,
  * else returns 0 if none or the corresponding error value.
- *
- * Locking: This function takes rcu_read_lock().
  */
 int dev_pm_opp_get_opp_count(struct device *dev)
 {
@@ -362,14 +324,14 @@ int dev_pm_opp_get_opp_count(struct device *dev)
 		return count;
 	}
 
-	rcu_read_lock();
+	mutex_lock(&opp_table->lock);
 
-	list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
+	list_for_each_entry(temp_opp, &opp_table->opp_list, node) {
 		if (temp_opp->available)
 			count++;
 	}
 
-	rcu_read_unlock();
+	mutex_unlock(&opp_table->lock);
 	dev_pm_opp_put_opp_table(opp_table);
 
 	return count;
@@ -414,9 +376,9 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 		return ERR_PTR(r);
 	}
 
-	rcu_read_lock();
+	mutex_lock(&opp_table->lock);
 
-	list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
+	list_for_each_entry(temp_opp, &opp_table->opp_list, node) {
 		if (temp_opp->available == available &&
 				temp_opp->rate == freq) {
 			opp = temp_opp;
@@ -427,7 +389,7 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 		}
 	}
 
-	rcu_read_unlock();
+	mutex_unlock(&opp_table->lock);
 	dev_pm_opp_put_opp_table(opp_table);
 
 	return opp;
@@ -439,7 +401,9 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
 {
 	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
 
-	list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
+	mutex_lock(&opp_table->lock);
+
+	list_for_each_entry(temp_opp, &opp_table->opp_list, node) {
 		if (temp_opp->available && temp_opp->rate >= *freq) {
 			opp = temp_opp;
 			*freq = opp->rate;
@@ -450,6 +414,8 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
 		}
 	}
 
+	mutex_unlock(&opp_table->lock);
+
 	return opp;
 }
 
@@ -486,11 +452,8 @@ struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev,
 	if (IS_ERR(opp_table))
 		return ERR_CAST(opp_table);
 
-	rcu_read_lock();
-
 	opp = _find_freq_ceil(opp_table, freq);
 
-	rcu_read_unlock();
 	dev_pm_opp_put_opp_table(opp_table);
 
 	return opp;
@@ -530,9 +493,9 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 	if (IS_ERR(opp_table))
 		return ERR_CAST(opp_table);
 
-	rcu_read_lock();
+	mutex_lock(&opp_table->lock);
 
-	list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
+	list_for_each_entry(temp_opp, &opp_table->opp_list, node) {
 		if (temp_opp->available) {
 			/* go to the next node, before choosing prev */
 			if (temp_opp->rate > *freq)
@@ -545,7 +508,7 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 	/* Increment the reference count of OPP */
 	if (!IS_ERR(opp))
 		dev_pm_opp_get(opp);
-	rcu_read_unlock();
+	mutex_unlock(&opp_table->lock);
 	dev_pm_opp_put_opp_table(opp_table);
 
 	if (!IS_ERR(opp))
@@ -555,30 +518,6 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_floor);
 
-/*
- * The caller needs to ensure that opp_table (and hence the clk) isn't freed,
- * while clk returned here is used.
- */
-static struct clk *_get_opp_clk(struct device *dev)
-{
-	struct opp_table *opp_table;
-	struct clk *clk;
-
-	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table)) {
-		dev_err(dev, "%s: device opp doesn't exist\n", __func__);
-		return ERR_CAST(opp_table);
-	}
-
-	clk = opp_table->clk;
-	if (IS_ERR(clk))
-		dev_err(dev, "%s: No clock available for the device\n",
-			__func__);
-	dev_pm_opp_put_opp_table(opp_table);
-
-	return clk;
-}
-
 static int _set_opp_voltage(struct device *dev, struct regulator *reg,
 			    struct dev_pm_opp_supply *supply)
 {
@@ -674,8 +613,6 @@ static int _generic_set_opp(struct dev_pm_set_opp_data *data)
  *
  * This configures the power-supplies and clock source to the levels specified
  * by the OPP corresponding to the target_freq.
- *
- * Locking: This function takes rcu_read_lock().
  */
 int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 {
@@ -694,9 +631,19 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 		return -EINVAL;
 	}
 
-	clk = _get_opp_clk(dev);
-	if (IS_ERR(clk))
-		return PTR_ERR(clk);
+	opp_table = _find_opp_table(dev);
+	if (IS_ERR(opp_table)) {
+		dev_err(dev, "%s: device opp doesn't exist\n", __func__);
+		return PTR_ERR(opp_table);
+	}
+
+	clk = opp_table->clk;
+	if (IS_ERR(clk)) {
+		dev_err(dev, "%s: No clock available for the device\n",
+			__func__);
+		ret = PTR_ERR(clk);
+		goto put_opp_table;
+	}
 
 	freq = clk_round_rate(clk, target_freq);
 	if ((long)freq <= 0)
@@ -708,17 +655,10 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 	if (old_freq == freq) {
 		dev_dbg(dev, "%s: old/new frequencies (%lu Hz) are same, nothing to do\n",
 			__func__, freq);
-		return 0;
-	}
-
-	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table)) {
-		dev_err(dev, "%s: device opp doesn't exist\n", __func__);
-		return PTR_ERR(opp_table);
+		ret = 0;
+		goto put_opp_table;
 	}
 
-	rcu_read_lock();
-
 	old_opp = _find_freq_ceil(opp_table, &old_freq);
 	if (IS_ERR(old_opp)) {
 		dev_err(dev, "%s: failed to find current OPP for freq %lu (%ld)\n",
@@ -730,11 +670,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 		ret = PTR_ERR(opp);
 		dev_err(dev, "%s: failed to find OPP for freq %lu (%d)\n",
 			__func__, freq, ret);
-		if (!IS_ERR(old_opp))
-			dev_pm_opp_put(old_opp);
-		rcu_read_unlock();
-		dev_pm_opp_put_opp_table(opp_table);
-		return ret;
+		goto put_old_opp;
 	}
 
 	dev_dbg(dev, "%s: switching OPP: %lu Hz --> %lu Hz\n", __func__,
@@ -744,12 +680,8 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 
 	/* Only frequency scaling */
 	if (!regulators) {
-		dev_pm_opp_put(opp);
-		if (!IS_ERR(old_opp))
-			dev_pm_opp_put(old_opp);
-		rcu_read_unlock();
-		dev_pm_opp_put_opp_table(opp_table);
-		return _generic_set_opp_clk_only(dev, clk, old_freq, freq);
+		ret = _generic_set_opp_clk_only(dev, clk, old_freq, freq);
+		goto put_opps;
 	}
 
 	if (opp_table->set_opp)
@@ -773,32 +705,26 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 	data->new_opp.rate = freq;
 	memcpy(data->new_opp.supplies, opp->supplies, size);
 
+	ret = set_opp(data);
+
+put_opps:
 	dev_pm_opp_put(opp);
+put_old_opp:
 	if (!IS_ERR(old_opp))
 		dev_pm_opp_put(old_opp);
-	rcu_read_unlock();
+put_opp_table:
 	dev_pm_opp_put_opp_table(opp_table);
-
-	return set_opp(data);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_set_rate);
 
 /* OPP-dev Helpers */
-static void _kfree_opp_dev_rcu(struct rcu_head *head)
-{
-	struct opp_device *opp_dev;
-
-	opp_dev = container_of(head, struct opp_device, rcu_head);
-	kfree_rcu(opp_dev, rcu_head);
-}
-
 static void _remove_opp_dev(struct opp_device *opp_dev,
 			    struct opp_table *opp_table)
 {
 	opp_debug_unregister(opp_dev, opp_table);
 	list_del(&opp_dev->node);
-	call_srcu(&opp_table->srcu_head.srcu, &opp_dev->rcu_head,
-		  _kfree_opp_dev_rcu);
+	kfree(opp_dev);
 }
 
 struct opp_device *_add_opp_dev(const struct device *dev,
@@ -813,7 +739,7 @@ struct opp_device *_add_opp_dev(const struct device *dev,
 
 	/* Initialize opp-dev */
 	opp_dev->dev = dev;
-	list_add_rcu(&opp_dev->node, &opp_table->dev_list);
+	list_add(&opp_dev->node, &opp_table->dev_list);
 
 	/* Create debugfs entries for the opp_table */
 	ret = opp_debug_register(opp_dev, opp_table);
@@ -857,28 +783,16 @@ static struct opp_table *_allocate_opp_table(struct device *dev)
 				ret);
 	}
 
-	srcu_init_notifier_head(&opp_table->srcu_head);
+	BLOCKING_INIT_NOTIFIER_HEAD(&opp_table->head);
 	INIT_LIST_HEAD(&opp_table->opp_list);
 	mutex_init(&opp_table->lock);
 	kref_init(&opp_table->kref);
 
 	/* Secure the device table modification */
-	list_add_rcu(&opp_table->node, &opp_tables);
+	list_add(&opp_table->node, &opp_tables);
 	return opp_table;
 }
 
-/**
- * _kfree_device_rcu() - Free opp_table RCU handler
- * @head:	RCU head
- */
-static void _kfree_device_rcu(struct rcu_head *head)
-{
-	struct opp_table *opp_table = container_of(head, struct opp_table,
-						   rcu_head);
-
-	kfree_rcu(opp_table, rcu_head);
-}
-
 void _get_opp_table_kref(struct opp_table *opp_table)
 {
 	kref_get(&opp_table->kref);
@@ -922,9 +836,8 @@ static void _opp_table_kref_release(struct kref *kref)
 	WARN_ON(!list_empty(&opp_table->dev_list));
 
 	mutex_destroy(&opp_table->lock);
-	list_del_rcu(&opp_table->node);
-	call_srcu(&opp_table->srcu_head.srcu, &opp_table->rcu_head,
-		  _kfree_device_rcu);
+	list_del(&opp_table->node);
+	kfree(opp_table);
 
 	mutex_unlock(&opp_table_lock);
 }
@@ -941,17 +854,6 @@ void _opp_free(struct dev_pm_opp *opp)
 	kfree(opp);
 }
 
-/**
- * _kfree_opp_rcu() - Free OPP RCU handler
- * @head:	RCU head
- */
-static void _kfree_opp_rcu(struct rcu_head *head)
-{
-	struct dev_pm_opp *opp = container_of(head, struct dev_pm_opp, rcu_head);
-
-	kfree_rcu(opp, rcu_head);
-}
-
 static void _opp_kref_release(struct kref *kref)
 {
 	struct dev_pm_opp *opp = container_of(kref, struct dev_pm_opp, kref);
@@ -961,10 +863,10 @@ static void _opp_kref_release(struct kref *kref)
 	 * Notify the changes in the availability of the operable
 	 * frequency/voltage list.
 	 */
-	srcu_notifier_call_chain(&opp_table->srcu_head, OPP_EVENT_REMOVE, opp);
+	blocking_notifier_call_chain(&opp_table->head, OPP_EVENT_REMOVE, opp);
 	opp_debug_remove_one(opp);
-	list_del_rcu(&opp->node);
-	call_srcu(&opp_table->srcu_head.srcu, &opp->rcu_head, _kfree_opp_rcu);
+	list_del(&opp->node);
+	kfree(opp);
 
 	mutex_unlock(&opp_table->lock);
 	dev_pm_opp_put_opp_table(opp_table);
@@ -987,12 +889,6 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_put);
  * @freq:	OPP to remove with matching 'freq'
  *
  * This function removes an opp from the opp table.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
 void dev_pm_opp_remove(struct device *dev, unsigned long freq)
 {
@@ -1098,7 +994,7 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp,
 	mutex_lock(&opp_table->lock);
 	head = &opp_table->opp_list;
 
-	list_for_each_entry_rcu(opp, &opp_table->opp_list, node) {
+	list_for_each_entry(opp, &opp_table->opp_list, node) {
 		if (new_opp->rate > opp->rate) {
 			head = &opp->node;
 			continue;
@@ -1121,7 +1017,7 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp,
 		return ret;
 	}
 
-	list_add_rcu(&new_opp->node, head);
+	list_add(&new_opp->node, head);
 	mutex_unlock(&opp_table->lock);
 
 	new_opp->opp_table = opp_table;
@@ -1159,12 +1055,6 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp,
  * NOTE: "dynamic" parameter impacts OPPs added by the dev_pm_opp_of_add_table
  * and freed by dev_pm_opp_of_remove_table.
  *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
- *
  * Return:
  * 0		On success OR
  *		Duplicate OPPs (both freq and volt are same) and opp->available
@@ -1204,7 +1094,7 @@ int _opp_add_v1(struct opp_table *opp_table, struct device *dev,
 	 * Notify the changes in the availability of the operable
 	 * frequency/voltage list.
 	 */
-	srcu_notifier_call_chain(&opp_table->srcu_head, OPP_EVENT_ADD, new_opp);
+	blocking_notifier_call_chain(&opp_table->head, OPP_EVENT_ADD, new_opp);
 	return 0;
 
 free_opp:
@@ -1581,12 +1471,6 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_register_put_opp_helper);
  * The opp is made available by default and it can be controlled using
  * dev_pm_opp_enable/disable functions.
  *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
- *
  * Return:
  * 0		On success OR
  *		Duplicate OPPs (both freq and volt are same) and opp->available
@@ -1616,18 +1500,12 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_add);
  * @freq:		OPP frequency to modify availability
  * @availability_req:	availability status requested for this opp
  *
- * Set the availability of an OPP with an RCU operation, opp_{enable,disable}
- * share a common logic which is isolated here.
+ * Set the availability of an OPP, opp_{enable,disable} share a common logic
+ * which is isolated here.
  *
  * Return: -EINVAL for bad pointers, -ENOMEM if no memory available for the
  * copy operation, returns 0 if no modification was done OR modification was
  * successful.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks to
- * keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex locking or synchronize_rcu() blocking calls cannot be used.
  */
 static int _opp_set_availability(struct device *dev, unsigned long freq,
 				 bool availability_req)
@@ -1673,16 +1551,16 @@ static int _opp_set_availability(struct device *dev, unsigned long freq,
 	/* plug in new node */
 	new_opp->available = availability_req;
 
-	list_replace_rcu(&opp->node, &new_opp->node);
-	call_srcu(&opp_table->srcu_head.srcu, &opp->rcu_head, _kfree_opp_rcu);
+	list_replace(&opp->node, &new_opp->node);
+	kfree(opp);
 
 	/* Notify the change of the OPP availability */
 	if (availability_req)
-		srcu_notifier_call_chain(&opp_table->srcu_head,
-					 OPP_EVENT_ENABLE, new_opp);
+		blocking_notifier_call_chain(&opp_table->head, OPP_EVENT_ENABLE,
+					     new_opp);
 	else
-		srcu_notifier_call_chain(&opp_table->srcu_head,
-					 OPP_EVENT_DISABLE, new_opp);
+		blocking_notifier_call_chain(&opp_table->head,
+					     OPP_EVENT_DISABLE, new_opp);
 
 	mutex_unlock(&opp_table->lock);
 	dev_pm_opp_put_opp_table(opp_table);
@@ -1705,12 +1583,6 @@ static int _opp_set_availability(struct device *dev, unsigned long freq,
  * corresponding error value. It is meant to be used for users an OPP available
  * after being temporarily made unavailable with dev_pm_opp_disable.
  *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function indirectly uses RCU and mutex locks to keep the
- * integrity of the internal data structures. Callers should ensure that
- * this function is *NOT* called under RCU protection or in contexts where
- * mutex locking or synchronize_rcu() blocking calls cannot be used.
- *
  * Return: -EINVAL for bad pointers, -ENOMEM if no memory available for the
  * copy operation, returns 0 if no modification was done OR modification was
  * successful.
@@ -1731,12 +1603,6 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_enable);
  * control by users to make this OPP not available until the circumstances are
  * right to make it available again (with a call to dev_pm_opp_enable).
  *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function indirectly uses RCU and mutex locks to keep the
- * integrity of the internal data structures. Callers should ensure that
- * this function is *NOT* called under RCU protection or in contexts where
- * mutex locking or synchronize_rcu() blocking calls cannot be used.
- *
  * Return: -EINVAL for bad pointers, -ENOMEM if no memory available for the
  * copy operation, returns 0 if no modification was done OR modification was
  * successful.
@@ -1763,11 +1629,8 @@ int dev_pm_opp_register_notifier(struct device *dev, struct notifier_block *nb)
 	if (IS_ERR(opp_table))
 		return PTR_ERR(opp_table);
 
-	rcu_read_lock();
-
-	ret = srcu_notifier_chain_register(&opp_table->srcu_head, nb);
+	ret = blocking_notifier_chain_register(&opp_table->head, nb);
 
-	rcu_read_unlock();
 	dev_pm_opp_put_opp_table(opp_table);
 
 	return ret;
@@ -1791,9 +1654,8 @@ int dev_pm_opp_unregister_notifier(struct device *dev,
 	if (IS_ERR(opp_table))
 		return PTR_ERR(opp_table);
 
-	ret = srcu_notifier_chain_unregister(&opp_table->srcu_head, nb);
+	ret = blocking_notifier_chain_unregister(&opp_table->head, nb);
 
-	rcu_read_unlock();
 	dev_pm_opp_put_opp_table(opp_table);
 
 	return ret;
@@ -1849,12 +1711,6 @@ void _dev_pm_opp_find_and_remove_table(struct device *dev, bool remove_all)
  *
  * Free both OPPs created using static entries present in DT and the
  * dynamically added entries.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function indirectly uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
 void dev_pm_opp_remove_table(struct device *dev)
 {
diff --git a/drivers/base/power/opp/cpu.c b/drivers/base/power/opp/cpu.c
index df29f08eecc4..2d87bc1adf38 100644
--- a/drivers/base/power/opp/cpu.c
+++ b/drivers/base/power/opp/cpu.c
@@ -137,12 +137,6 @@ void _dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask, bool of)
  * This removes the OPP tables for CPUs present in the @cpumask.
  * This should be used to remove all the OPPs entries associated with
  * the cpus in @cpumask.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
 void dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask)
 {
@@ -159,12 +153,6 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_cpumask_remove_table);
  * @cpumask.
  *
  * Returns -ENODEV if OPP table isn't already present.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
 int dev_pm_opp_set_sharing_cpus(struct device *cpu_dev,
 				const struct cpumask *cpumask)
@@ -215,12 +203,6 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_set_sharing_cpus);
  *
  * Returns -ENODEV if OPP table isn't already present and -EINVAL if the OPP
  * table's status is access-unknown.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
 int dev_pm_opp_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
 {
diff --git a/drivers/base/power/opp/of.c b/drivers/base/power/opp/of.c
index a789dc228a6a..f78b9aba587c 100644
--- a/drivers/base/power/opp/of.c
+++ b/drivers/base/power/opp/of.c
@@ -28,7 +28,7 @@ static struct opp_table *_managed_opp(const struct device_node *np)
 
 	mutex_lock(&opp_table_lock);
 
-	list_for_each_entry_rcu(opp_table, &opp_tables, node) {
+	list_for_each_entry(opp_table, &opp_tables, node) {
 		if (opp_table->np == np) {
 			/*
 			 * Multiple devices can point to the same OPP table and
@@ -235,12 +235,6 @@ static int opp_parse_supplies(struct dev_pm_opp *opp, struct device *dev,
  * @dev:	device pointer used to lookup OPP table.
  *
  * Free OPPs created using static entries present in DT.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function indirectly uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
 void dev_pm_opp_of_remove_table(struct device *dev)
 {
@@ -269,12 +263,6 @@ static struct device_node *_of_get_opp_desc_node(struct device *dev)
  * opp can be controlled using dev_pm_opp_enable/disable functions and may be
  * removed by dev_pm_opp_remove.
  *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
- *
  * Return:
  * 0		On success OR
  *		Duplicate OPPs (both freq and volt are same) and opp->available
@@ -358,7 +346,7 @@ static int _opp_add_static_v2(struct opp_table *opp_table, struct device *dev,
 	 * Notify the changes in the availability of the operable
 	 * frequency/voltage list.
 	 */
-	srcu_notifier_call_chain(&opp_table->srcu_head, OPP_EVENT_ADD, new_opp);
+	blocking_notifier_call_chain(&opp_table->head, OPP_EVENT_ADD, new_opp);
 	return 0;
 
 free_opp:
@@ -470,12 +458,6 @@ static int _of_add_opp_table_v1(struct device *dev)
  *
  * Register the initial OPP table with the OPP library for given device.
  *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function indirectly uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
- *
  * Return:
  * 0		On success OR
  *		Duplicate OPPs (both freq and volt are same) and opp->available
@@ -520,12 +502,6 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_of_add_table);
  *
  * This removes the OPP tables for CPUs present in the @cpumask.
  * This should be used only to remove static entries created from DT.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
 void dev_pm_opp_of_cpumask_remove_table(const struct cpumask *cpumask)
 {
@@ -538,12 +514,6 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_of_cpumask_remove_table);
  * @cpumask:	cpumask for which OPP table needs to be added.
  *
  * This adds the OPP tables for CPUs present in the @cpumask.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
 int dev_pm_opp_of_cpumask_add_table(const struct cpumask *cpumask)
 {
@@ -591,12 +561,6 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_of_cpumask_add_table);
  * This updates the @cpumask with CPUs that are sharing OPPs with @cpu_dev.
  *
  * Returns -ENOENT if operating-points-v2 isn't present for @cpu_dev.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Hence this function internally uses RCU updater strategy with mutex locks
- * to keep the integrity of the internal data structures. Callers should ensure
- * that this function is *NOT* called under RCU protection or in contexts where
- * mutex cannot be locked.
  */
 int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev,
 				   struct cpumask *cpumask)
diff --git a/drivers/base/power/opp/opp.h b/drivers/base/power/opp/opp.h
index bd929ba6efaf..e7898afddef2 100644
--- a/drivers/base/power/opp/opp.h
+++ b/drivers/base/power/opp/opp.h
@@ -20,8 +20,7 @@
 #include <linux/list.h>
 #include <linux/limits.h>
 #include <linux/pm_opp.h>
-#include <linux/rculist.h>
-#include <linux/rcupdate.h>
+#include <linux/notifier.h>
 
 struct clk;
 struct regulator;
@@ -52,9 +51,6 @@ extern struct list_head opp_tables;
  * @node:	opp table node. The nodes are maintained throughout the lifetime
  *		of boot. It is expected only an optimal set of OPPs are
  *		added to the library by the SoC framework.
- *		RCU usage: opp table is traversed with RCU locks. node
- *		modification is possible realtime, hence the modifications
- *		are protected by the opp_table_lock for integrity.
  *		IMPORTANT: the opp nodes should be maintained in increasing
  *		order.
  * @kref:	for reference count of the OPP.
@@ -67,7 +63,6 @@ extern struct list_head opp_tables;
  * @clock_latency_ns: Latency (in nanoseconds) of switching to this OPP's
  *		frequency from any other OPP's frequency.
  * @opp_table:	points back to the opp_table struct this opp belongs to
- * @rcu_head:	RCU callback head used for deferred freeing
  * @np:		OPP's device node.
  * @dentry:	debugfs dentry pointer (per opp)
  *
@@ -88,7 +83,6 @@ struct dev_pm_opp {
 	unsigned long clock_latency_ns;
 
 	struct opp_table *opp_table;
-	struct rcu_head rcu_head;
 
 	struct device_node *np;
 
@@ -101,7 +95,6 @@ struct dev_pm_opp {
  * struct opp_device - devices managed by 'struct opp_table'
  * @node:	list node
  * @dev:	device to which the struct object belongs
- * @rcu_head:	RCU callback head used for deferred freeing
  * @dentry:	debugfs dentry pointer (per device)
  *
  * This is an internal data structure maintaining the devices that are managed
@@ -110,7 +103,6 @@ struct dev_pm_opp {
 struct opp_device {
 	struct list_head node;
 	const struct device *dev;
-	struct rcu_head rcu_head;
 
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *dentry;
@@ -128,10 +120,7 @@ enum opp_table_access {
  * @node:	table node - contains the devices with OPPs that
  *		have been registered. Nodes once added are not modified in this
  *		table.
- *		RCU usage: nodes are not modified in the table of opp_table,
- *		however addition is possible and is secured by opp_table_lock
- * @srcu_head:	notifier head to notify the OPP availability changes.
- * @rcu_head:	RCU callback head used for deferred freeing
+ * @head:	notifier head to notify the OPP availability changes.
  * @dev_list:	list of devices that share these OPPs
  * @opp_list:	table of opps
  * @kref:	for reference count of the table.
@@ -156,16 +145,11 @@ enum opp_table_access {
  * This is an internal data structure maintaining the link to opps attached to
  * a device. This structure is not meant to be shared to users as it is
  * meant for book keeping and private to OPP library.
- *
- * Because the opp structures can be used from both rcu and srcu readers, we
- * need to wait for the grace period of both of them before freeing any
- * resources. And so we have used kfree_rcu() from within call_srcu() handlers.
  */
 struct opp_table {
 	struct list_head node;
 
-	struct srcu_notifier_head srcu_head;
-	struct rcu_head rcu_head;
+	struct blocking_notifier_head head;
 	struct list_head dev_list;
 	struct list_head opp_list;
 	struct kref kref;
-- 
2.7.1.410.g6faf27b

* [PATCH 10/12] PM / OPP: Simplify _opp_set_availability()
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
                   ` (8 preceding siblings ...)
  2016-12-07 10:37 ` [PATCH 09/12] PM / OPP: Move away from RCU locking Viresh Kumar
@ 2016-12-07 10:37 ` Viresh Kumar
  2017-01-10  0:00   ` Stephen Boyd
  2016-12-07 10:37 ` [PATCH 11/12] PM / OPP: Simplify dev_pm_opp_get_max_volt_latency() Viresh Kumar
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

As we don't use RCU locking anymore, there is no need to replace an
earlier OPP node with a new one. Just update the existing one.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 27 +++++----------------------
 1 file changed, 5 insertions(+), 22 deletions(-)
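
A minimal userspace sketch of the simplification described above, assuming a
pthread mutex stands in for the per-table opp_table->lock and a plain array
stands in for the OPP list. The names model_opp, model_table and
model_set_availability() are hypothetical and are not part of the OPP core;
the notifier calls are left out.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct dev_pm_opp and struct opp_table. */
struct model_opp {
	unsigned long rate;
	bool available;
};

struct model_table {
	pthread_mutex_t lock;	/* plays the role of opp_table->lock */
	struct model_opp *opps;
	size_t count;
};

/*
 * Flip the availability of the OPP matching @freq in place. Without
 * lockless readers there is nothing to copy, replace or free: the
 * mutex alone serialises the update.
 */
static int model_set_availability(struct model_table *table,
				  unsigned long freq, bool availability_req)
{
	struct model_opp *opp = NULL;
	int r = 0;
	size_t i;

	assert(table != NULL);

	pthread_mutex_lock(&table->lock);

	for (i = 0; i < table->count; i++) {
		if (table->opps[i].rate == freq) {
			opp = &table->opps[i];
			break;
		}
	}

	if (!opp) {
		r = -ENODEV;
		goto unlock;
	}

	/* Is update really needed? */
	if (opp->available == availability_req)
		goto unlock;

	/* Update the existing node; no allocation, so no -ENOMEM path. */
	opp->available = availability_req;

unlock:
	pthread_mutex_unlock(&table->lock);
	return r;
}
```

Because nothing is allocated up front, the only cleanup left on every exit
path is dropping the lock, which is why a single label is enough.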

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index b5e9600058c2..6a1374fafe75 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -1511,20 +1511,15 @@ static int _opp_set_availability(struct device *dev, unsigned long freq,
 				 bool availability_req)
 {
 	struct opp_table *opp_table;
-	struct dev_pm_opp *new_opp, *tmp_opp, *opp = ERR_PTR(-ENODEV);
+	struct dev_pm_opp *tmp_opp, *opp = ERR_PTR(-ENODEV);
 	int r = 0;
 
-	/* keep the node allocated */
-	new_opp = kmalloc(sizeof(*new_opp), GFP_KERNEL);
-	if (!new_opp)
-		return -ENOMEM;
-
 	/* Find the opp_table */
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
 		r = PTR_ERR(opp_table);
 		dev_warn(dev, "%s: Device OPP not found (%d)\n", __func__, r);
-		goto free_opp;
+		return r;
 	}
 
 	mutex_lock(&opp_table->lock);
@@ -1545,32 +1540,20 @@ static int _opp_set_availability(struct device *dev, unsigned long freq,
 	/* Is update really needed? */
 	if (opp->available == availability_req)
 		goto unlock;
-	/* copy the old data over */
-	*new_opp = *opp;
-
-	/* plug in new node */
-	new_opp->available = availability_req;
 
-	list_replace(&opp->node, &new_opp->node);
-	kfree(opp);
+	opp->available = availability_req;
 
 	/* Notify the change of the OPP availability */
 	if (availability_req)
 		blocking_notifier_call_chain(&opp_table->head, OPP_EVENT_ENABLE,
-					     new_opp);
+					     opp);
 	else
 		blocking_notifier_call_chain(&opp_table->head,
-					     OPP_EVENT_DISABLE, new_opp);
-
-	mutex_unlock(&opp_table->lock);
-	dev_pm_opp_put_opp_table(opp_table);
-	return 0;
+					     OPP_EVENT_DISABLE, opp);
 
 unlock:
 	mutex_unlock(&opp_table->lock);
 	dev_pm_opp_put_opp_table(opp_table);
-free_opp:
-	kfree(new_opp);
 	return r;
 }
 
-- 
2.7.1.410.g6faf27b

* [PATCH 11/12] PM / OPP: Simplify dev_pm_opp_get_max_volt_latency()
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
                   ` (9 preceding siblings ...)
  2016-12-07 10:37 ` [PATCH 10/12] PM / OPP: Simplify _opp_set_availability() Viresh Kumar
@ 2016-12-07 10:37 ` Viresh Kumar
  2017-01-09 22:40   ` Stephen Boyd
  2016-12-07 10:37 ` [PATCH 12/12] PM / OPP: Update Documentation to remove RCU specific bits Viresh Kumar
  2016-12-07 23:14 ` [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Rafael J. Wysocki
  12 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

dev_pm_opp_get_max_volt_latency() effectively calls _find_opp_table()
twice.

Merge _get_regulator_count() into dev_pm_opp_get_max_volt_latency() to
avoid that.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 34 +++++++++-------------------------
 1 file changed, 9 insertions(+), 25 deletions(-)
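
A minimal userspace sketch of the resulting control flow, assuming a plain
counter in place of the table's kref and malloc() in place of
kmalloc_array(). All model_* names and the latency_ns[] field are
hypothetical. It shows one lookup at the top and one put at the bottom, with
every early exit unwinding through the same label.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_REGULATORS 4

/* Hypothetical stand-in for struct opp_table; only the fields used here. */
struct model_table {
	int refcount;		/* plain counter, not a real kref */
	int regulator_count;
	unsigned long latency_ns[MODEL_MAX_REGULATORS];	/* made-up data */
};

/* Take a reference; NULL models "no table for this device". */
static struct model_table *model_find_table(struct model_table *table)
{
	if (table)
		table->refcount++;
	return table;
}

static void model_put_table(struct model_table *table)
{
	assert(table->refcount > 0);
	table->refcount--;
}

static unsigned long model_max_volt_latency(struct model_table *dev_table)
{
	struct model_table *table;
	unsigned long *scratch, latency_ns = 0;
	int count, i;

	/* Single lookup: the reference is held until put_table. */
	table = model_find_table(dev_table);
	if (!table)
		return 0;

	count = table->regulator_count;
	assert(count <= MODEL_MAX_REGULATORS);

	/* Regulator may not be required for the device */
	if (!count)
		goto put_table;

	scratch = malloc((size_t)count * sizeof(*scratch));
	if (!scratch)
		goto put_table;

	/* Sum the per-regulator latencies via the scratch copy. */
	for (i = 0; i < count; i++) {
		scratch[i] = table->latency_ns[i];
		latency_ns += scratch[i];
	}

	free(scratch);
put_table:
	model_put_table(table);
	return latency_ns;
}
```

Whichever path is taken after a successful lookup, the reference count ends
where it started.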

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 6a1374fafe75..21be3377c135 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -170,22 +170,6 @@ unsigned long dev_pm_opp_get_max_clock_latency(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_max_clock_latency);
 
-static int _get_regulator_count(struct device *dev)
-{
-	struct opp_table *opp_table;
-	int count;
-
-	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table))
-		return 0;
-
-	count = opp_table->regulator_count;
-
-	dev_pm_opp_put_opp_table(opp_table);
-
-	return count;
-}
-
 /**
  * dev_pm_opp_get_max_volt_latency() - Get max voltage latency in nanoseconds
  * @dev: device for which we do this operation
@@ -204,24 +188,24 @@ unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
 		unsigned long max;
 	} *uV;
 
-	count = _get_regulator_count(dev);
+	opp_table = _find_opp_table(dev);
+	if (IS_ERR(opp_table))
+		return 0;
+
+	count = opp_table->regulator_count;
 
 	/* Regulator may not be required for the device */
 	if (!count)
-		return 0;
+		goto put_opp_table;
 
 	regulators = kmalloc_array(count, sizeof(*regulators), GFP_KERNEL);
 	if (!regulators)
-		return 0;
+		goto put_opp_table;
 
 	uV = kmalloc_array(count, sizeof(*uV), GFP_KERNEL);
 	if (!uV)
 		goto free_regulators;
 
-	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table))
-		goto free_uV;
-
 	memcpy(regulators, opp_table->regulators, count * sizeof(*regulators));
 
 	mutex_lock(&opp_table->lock);
@@ -242,7 +226,6 @@ unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
 	}
 
 	mutex_unlock(&opp_table->lock);
-	dev_pm_opp_put_opp_table(opp_table);
 
 	/*
 	 * The caller needs to ensure that opp_table (and hence the regulator)
@@ -254,10 +237,11 @@ unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
 			latency_ns += ret * 1000;
 	}
 
-free_uV:
 	kfree(uV);
 free_regulators:
 	kfree(regulators);
+put_opp_table:
+	dev_pm_opp_put_opp_table(opp_table);
 
 	return latency_ns;
 }
-- 
2.7.1.410.g6faf27b

* [PATCH 12/12] PM / OPP: Update Documentation to remove RCU specific bits
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
                   ` (10 preceding siblings ...)
  2016-12-07 10:37 ` [PATCH 11/12] PM / OPP: Simplify dev_pm_opp_get_max_volt_latency() Viresh Kumar
@ 2016-12-07 10:37 ` Viresh Kumar
  2017-01-09 22:39   ` Stephen Boyd
  2016-12-07 23:14 ` [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Rafael J. Wysocki
  12 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2016-12-07 10:37 UTC (permalink / raw)
  To: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Stephen Boyd,
	Len Brown, Pavel Machek
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

Update OPP documentation to remove the RCU specific bits.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 Documentation/power/opp.txt | 47 +++++++++++++--------------------------------
 1 file changed, 13 insertions(+), 34 deletions(-)
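
A minimal userspace sketch of the get/put lifetime rule that the updated
documentation asks callers to follow. It assumes a plain, non-atomic counter
where the kernel would use a kref, and the model_* names and the
model_released counter are hypothetical and exist only for illustration. The
object is freed when the last reference is dropped, so a missing put means it
is never freed.

```c
#include <assert.h>
#include <stdlib.h>

struct model_opp {
	unsigned long rate;
	int refcount;	/* plain counter standing in for a kref */
};

/* Counts how many times an OPP was released; for illustration only. */
static int model_released;

static struct model_opp *model_opp_alloc(unsigned long rate)
{
	struct model_opp *opp = malloc(sizeof(*opp));

	if (!opp)
		return NULL;

	opp->rate = rate;
	opp->refcount = 1;	/* initial reference, owned by the table */
	return opp;
}

/* What a finder routine would do before handing the OPP to a caller. */
static struct model_opp *model_opp_get(struct model_opp *opp)
{
	opp->refcount++;
	return opp;
}

/* Drop one reference; the last put frees the object. */
static void model_opp_put(struct model_opp *opp)
{
	assert(opp->refcount > 0);
	if (--opp->refcount == 0) {
		model_released++;
		free(opp);
	}
}
```

In this model an OPP removed from its table while a caller still holds a
reference stays allocated until that caller's put.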

diff --git a/Documentation/power/opp.txt b/Documentation/power/opp.txt
index c6279c2be47c..be895e32022d 100644
--- a/Documentation/power/opp.txt
+++ b/Documentation/power/opp.txt
@@ -79,22 +79,6 @@ dependent subsystems such as cpufreq are left to the discretion of the SoC
 specific framework which uses the OPP library. Similar care needs to be taken
 care to refresh the cpufreq table in cases of these operations.
 
-WARNING on OPP List locking mechanism:
--------------------------------------------------
-OPP library uses RCU for exclusivity. RCU allows the query functions to operate
-in multiple contexts and this synchronization mechanism is optimal for a read
-intensive operations on data structure as the OPP library caters to.
-
-To ensure that the data retrieved are sane, the users such as SoC framework
-should ensure that the section of code operating on OPP queries are locked
-using RCU read locks. The opp_find_freq_{exact,ceil,floor},
-opp_get_{voltage, freq, opp_count} fall into this category.
-
-opp_{add,enable,disable} are updaters which use mutex and implement it's own
-RCU locking mechanisms. These functions should *NOT* be called under RCU locks
-and other contexts that prevent blocking functions in RCU or mutex operations
-from working.
-
 2. Initial OPP List Registration
 ================================
 The SoC implementation calls dev_pm_opp_add function iteratively to add OPPs per
@@ -137,15 +121,18 @@ functions return the matching pointer representing the opp if a match is
 found, else returns error. These errors are expected to be handled by standard
 error checks such as IS_ERR() and appropriate actions taken by the caller.
 
+Callers of these functions shall call dev_pm_opp_put() after they have used the
+OPP. Otherwise the memory for the OPP will never get freed, resulting in a
+memory leak.
+
 dev_pm_opp_find_freq_exact - Search for an OPP based on an *exact* frequency and
 	availability. This function is especially useful to enable an OPP which
 	is not available by default.
 	Example: In a case when SoC framework detects a situation where a
 	higher frequency could be made available, it can use this function to
 	find the OPP prior to call the dev_pm_opp_enable to actually make it available.
-	 rcu_read_lock();
 	 opp = dev_pm_opp_find_freq_exact(dev, 1000000000, false);
-	 rcu_read_unlock();
+	 dev_pm_opp_put(opp);
 	 /* dont operate on the pointer.. just do a sanity check.. */
 	 if (IS_ERR(opp)) {
 		pr_err("frequency not disabled!\n");
@@ -163,9 +150,8 @@ dev_pm_opp_find_freq_floor - Search for an available OPP which is *at most* the
 	frequency.
 	Example: To find the highest opp for a device:
 	 freq = ULONG_MAX;
-	 rcu_read_lock();
 	 dev_pm_opp_find_freq_floor(dev, &freq);
-	 rcu_read_unlock();
+	 dev_pm_opp_put(opp);
 
 dev_pm_opp_find_freq_ceil - Search for an available OPP which is *at least* the
 	provided frequency. This function is useful while searching for a
@@ -173,17 +159,15 @@ dev_pm_opp_find_freq_ceil - Search for an available OPP which is *at least* the
 	frequency.
 	Example 1: To find the lowest opp for a device:
 	 freq = 0;
-	 rcu_read_lock();
 	 dev_pm_opp_find_freq_ceil(dev, &freq);
-	 rcu_read_unlock();
+	 dev_pm_opp_put(opp);
 	Example 2: A simplified implementation of a SoC cpufreq_driver->target:
 	 soc_cpufreq_target(..)
 	 {
 		/* Do stuff like policy checks etc. */
 		/* Find the best frequency match for the req */
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_ceil(dev, &freq);
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 		if (!IS_ERR(opp))
 			soc_switch_to_freq_voltage(freq);
 		else
@@ -208,9 +192,8 @@ dev_pm_opp_enable - Make a OPP available for operation.
 	implementation might choose to do something as follows:
 	 if (cur_temp < temp_low_thresh) {
 		/* Enable 1GHz if it was disabled */
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_exact(dev, 1000000000, false);
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 		/* just error check */
 		if (!IS_ERR(opp))
 			ret = dev_pm_opp_enable(dev, 1000000000);
@@ -224,9 +207,8 @@ dev_pm_opp_disable - Make an OPP to be not available for operation
 	choose to do something as follows:
 	 if (cur_temp > temp_high_thresh) {
 		/* Disable 1GHz if it was enabled */
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_exact(dev, 1000000000, true);
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 		/* just error check */
 		if (!IS_ERR(opp))
 			ret = dev_pm_opp_disable(dev, 1000000000);
@@ -249,10 +231,9 @@ dev_pm_opp_get_voltage - Retrieve the voltage represented by the opp pointer.
 	 soc_switch_to_freq_voltage(freq)
 	 {
 		/* do things */
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_ceil(dev, &freq);
 		v = dev_pm_opp_get_voltage(opp);
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 		if (v)
 			regulator_set_voltage(.., v);
 		/* do other things */
@@ -266,12 +247,11 @@ dev_pm_opp_get_freq - Retrieve the freq represented by the opp pointer.
 	 {
 		/* do things.. */
 		 max_freq = ULONG_MAX;
-		 rcu_read_lock();
 		 max_opp = dev_pm_opp_find_freq_floor(dev,&max_freq);
 		 requested_opp = dev_pm_opp_find_freq_ceil(dev,&freq);
 		 if (!IS_ERR(max_opp) && !IS_ERR(requested_opp))
 			r = soc_test_validity(max_opp, requested_opp);
-		 rcu_read_unlock();
+		 dev_pm_opp_put(max_opp);
 		/* do other things */
 	 }
 	 soc_test_validity(..)
@@ -289,7 +269,6 @@ dev_pm_opp_get_opp_count - Retrieve the number of available opps for a device
 	 soc_notify_coproc_available_frequencies()
 	 {
 		/* Do things */
-		rcu_read_lock();
 		num_available = dev_pm_opp_get_opp_count(dev);
 		speeds = kzalloc(sizeof(u32) * num_available, GFP_KERNEL);
 		/* populate the table in increasing order */
@@ -298,8 +277,8 @@ dev_pm_opp_get_opp_count - Retrieve the number of available opps for a device
 			speeds[i] = freq;
 			freq++;
 			i++;
+			dev_pm_opp_put(opp);
 		}
-		rcu_read_unlock();
 
 		soc_notify_coproc(AVAILABLE_FREQs, speeds, num_available);
 		/* Do other things */
-- 
2.7.1.410.g6faf27b

* Re: [PATCH 07/12] PM / OPP: Update OPP users to put reference
  2016-12-07 10:37   ` Viresh Kumar
@ 2016-12-07 13:23     ` Chanwoo Choi
  -1 siblings, 0 replies; 41+ messages in thread
From: Chanwoo Choi @ 2016-12-07 13:23 UTC (permalink / raw)
  To: Viresh Kumar, Rafael Wysocki, Kevin Hilman, Tony Lindgren,
	Viresh Kumar, Nishanth Menon, Stephen Boyd, Peter De Schrijver,
	Prashant Gaikwad, Stephen Warren, Thierry Reding,
	Alexandre Courbot, Kukjin Kim, Krzysztof Kozlowski,
	Javier Martinez Canillas, MyungJoo Ham, Kyungmin Park,
	Amit Daniel Kachhap, Javi Merino, Zhang Rui, Eduardo Valentin
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot

Hi Viresh,

[snip]

> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index b0de42972b74..89add0d7c017 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -111,18 +111,16 @@ static void devfreq_set_freq_table(struct devfreq *devfreq)
>  		return;
>  	}
>  
> -	rcu_read_lock();
>  	for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
>  		opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
> +		dev_pm_opp_put(opp);

I think dev_pm_opp_put(opp) should be called after the if statement.
If dev_pm_opp_find_freq_ceil() returns an error, calling
dev_pm_opp_put(opp) is not necessary.
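
A minimal userspace sketch of that ordering, assuming imitations of the
kernel's ERR_PTR()/IS_ERR() helpers and a plain counter in place of the kref.
All model_* names are hypothetical. The put is issued only once the error
check has passed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitations of ERR_PTR()/IS_ERR() from linux/err.h. */
#define MODEL_MAX_ERRNO 4095

static void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static int model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

struct model_opp {
	unsigned long rate;
	int refcount;	/* plain counter standing in for the kref */
};

/* Return the lowest OPP with rate >= *freq, holding a reference. */
static struct model_opp *model_find_freq_ceil(struct model_opp *opps,
					      size_t n, unsigned long *freq)
{
	size_t i;

	/* opps[] is assumed to be sorted by increasing rate */
	for (i = 0; i < n; i++) {
		if (opps[i].rate >= *freq) {
			opps[i].refcount++;
			*freq = opps[i].rate;
			return &opps[i];
		}
	}

	return model_err_ptr(-ERANGE);
}

/* Dereferences opp, so it must never be handed an error pointer. */
static void model_opp_put(struct model_opp *opp)
{
	assert(opp->refcount > 0);
	opp->refcount--;
}

/*
 * Fill freq_table[] with max_state entries. Returns the number of
 * entries written, or 0 if a lookup failed and the table is unusable.
 */
static size_t model_fill_freq_table(struct model_opp *opps, size_t n,
				    unsigned long *freq_table,
				    size_t max_state)
{
	unsigned long freq = 0;
	size_t i;

	for (i = 0; i < max_state; i++, freq++) {
		struct model_opp *opp = model_find_freq_ceil(opps, n, &freq);

		if (model_is_err(opp))
			return 0;	/* no reference was taken */

		model_opp_put(opp);	/* only reached for a valid OPP */
		freq_table[i] = freq;
	}

	return i;
}
```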

>  		if (IS_ERR(opp)) {
>  			devm_kfree(devfreq->dev.parent, profile->freq_table);
>  			profile->max_state = 0;
> -			rcu_read_unlock();
>  			return;
>  		}
>  		profile->freq_table[i] = freq;
>  	}
> -	rcu_read_unlock();
>  }
>  
>  /**
> @@ -1107,9 +1105,9 @@ static ssize_t available_frequencies_show(struct device *d,
>  	ssize_t count = 0;
>  	unsigned long freq = 0;
>  
> -	rcu_read_lock();
>  	do {
>  		opp = dev_pm_opp_find_freq_ceil(dev, &freq);
> +		dev_pm_opp_put(opp);

ditto.

>  		if (IS_ERR(opp))
>  			break;
>  
> @@ -1117,7 +1115,6 @@ static ssize_t available_frequencies_show(struct device *d,
>  				   "%lu ", freq);
>  		freq++;
>  	} while (1);
> -	rcu_read_unlock();
>  
>  	/* Truncate the trailing space */
>  	if (count)
> @@ -1219,11 +1216,8 @@ subsys_initcall(devfreq_init);
>   * @freq:	The frequency given to target function
>   * @flags:	Flags handed from devfreq framework.
>   *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. The reason for the same is that the opp pointer which is
> - * returned will remain valid for use with opp_get_{voltage, freq} only while
> - * under the locked area. The pointer returned must be used prior to unlocking
> - * with rcu_read_unlock() to maintain the integrity of the pointer.
> + * The callers are required to call dev_pm_opp_put() for the returned OPP after
> + * use.
>   */
>  struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
>  					   unsigned long *freq,
> diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
> index a8ed7792ece2..49ce38cef460 100644
> --- a/drivers/devfreq/exynos-bus.c
> +++ b/drivers/devfreq/exynos-bus.c
> @@ -103,18 +103,17 @@ static int exynos_bus_target(struct device *dev, unsigned long *freq, u32 flags)
>  	int ret = 0;
>  
>  	/* Get new opp-bus instance according to new bus clock */
> -	rcu_read_lock();
>  	new_opp = devfreq_recommended_opp(dev, freq, flags);
>  	if (IS_ERR(new_opp)) {
>  		dev_err(dev, "failed to get recommended opp instance\n");
> -		rcu_read_unlock();
>  		return PTR_ERR(new_opp);
>  	}
>  
>  	new_freq = dev_pm_opp_get_freq(new_opp);
>  	new_volt = dev_pm_opp_get_voltage(new_opp);
> +	dev_pm_opp_put(new_opp);
> +
>  	old_freq = bus->curr_freq;
> -	rcu_read_unlock();
>  
>  	if (old_freq == new_freq)
>  		return 0;
> @@ -214,17 +213,16 @@ static int exynos_bus_passive_target(struct device *dev, unsigned long *freq,
>  	int ret = 0;
>  
>  	/* Get new opp-bus instance according to new bus clock */
> -	rcu_read_lock();
>  	new_opp = devfreq_recommended_opp(dev, freq, flags);
>  	if (IS_ERR(new_opp)) {
>  		dev_err(dev, "failed to get recommended opp instance\n");
> -		rcu_read_unlock();
>  		return PTR_ERR(new_opp);
>  	}
>  
>  	new_freq = dev_pm_opp_get_freq(new_opp);
> +	dev_pm_opp_put(new_opp);
> +
>  	old_freq = bus->curr_freq;
> -	rcu_read_unlock();
>  
>  	if (old_freq == new_freq)
>  		return 0;
> @@ -358,16 +356,14 @@ static int exynos_bus_parse_of(struct device_node *np,
>  
>  	rate = clk_get_rate(bus->clk);
>  
> -	rcu_read_lock();
>  	opp = devfreq_recommended_opp(dev, &rate, 0);
>  	if (IS_ERR(opp)) {
>  		dev_err(dev, "failed to find dev_pm_opp\n");
> -		rcu_read_unlock();
>  		ret = PTR_ERR(opp);
>  		goto err_opp;
>  	}
>  	bus->curr_freq = dev_pm_opp_get_freq(opp);
> -	rcu_read_unlock();
> +	dev_pm_opp_put(opp);
>  
>  	return 0;
>  
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index 9ef46e2592c4..671a1e0afc6e 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -59,9 +59,9 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>  	 * list of parent device. Because in this case, *freq is temporary
>  	 * value which is decided by ondemand governor.
>  	 */
> -	rcu_read_lock();
>  	opp = devfreq_recommended_opp(parent_devfreq->dev.parent, freq, 0);
> -	rcu_read_unlock();
> +	dev_pm_opp_put(opp);

ditto.

> +
>  	if (IS_ERR(opp)) {
>  		ret = PTR_ERR(opp);
>  		goto out;
> diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
> index 27d2f349b53c..40a2499730fc 100644
> --- a/drivers/devfreq/rk3399_dmc.c
> +++ b/drivers/devfreq/rk3399_dmc.c
> @@ -91,17 +91,13 @@ static int rk3399_dmcfreq_target(struct device *dev, unsigned long *freq,
>  	unsigned long target_volt, target_rate;
>  	int err;
>  
> -	rcu_read_lock();
>  	opp = devfreq_recommended_opp(dev, freq, flags);
> -	if (IS_ERR(opp)) {
> -		rcu_read_unlock();
> +	if (IS_ERR(opp))
>  		return PTR_ERR(opp);
> -	}
>  
>  	target_rate = dev_pm_opp_get_freq(opp);
>  	target_volt = dev_pm_opp_get_voltage(opp);
> -
> -	rcu_read_unlock();
> +	dev_pm_opp_put(opp);
>  
>  	if (dmcfreq->rate == target_rate)
>  		return 0;
> @@ -422,15 +418,13 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev)
>  
>  	data->rate = clk_get_rate(data->dmc_clk);
>  
> -	rcu_read_lock();
>  	opp = devfreq_recommended_opp(dev, &data->rate, 0);
> -	if (IS_ERR(opp)) {
> -		rcu_read_unlock();
> +	if (IS_ERR(opp))
>  		return PTR_ERR(opp);
> -	}
> +
>  	data->rate = dev_pm_opp_get_freq(opp);
>  	data->volt = dev_pm_opp_get_voltage(opp);
> -	rcu_read_unlock();
> +	dev_pm_opp_put(opp);
>  
>  	rk3399_devfreq_dmc_profile.initial_freq = data->rate;
>  
> diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
> index fe9dce0245bf..214fff96fa4a 100644
> --- a/drivers/devfreq/tegra-devfreq.c
> +++ b/drivers/devfreq/tegra-devfreq.c
> @@ -487,15 +487,13 @@ static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
>  	struct dev_pm_opp *opp;
>  	unsigned long rate = *freq * KHZ;
>  
> -	rcu_read_lock();
>  	opp = devfreq_recommended_opp(dev, &rate, flags);
>  	if (IS_ERR(opp)) {
> -		rcu_read_unlock();
>  		dev_err(dev, "Failed to find opp for %lu KHz\n", *freq);
>  		return PTR_ERR(opp);
>  	}
>  	rate = dev_pm_opp_get_freq(opp);
> -	rcu_read_unlock();
> +	dev_pm_opp_put(opp);
>  
>  	clk_set_min_rate(tegra->emc_clock, rate);
>  	clk_set_rate(tegra->emc_clock, 0);

[snip]

Best Regards,
Chanwoo Choi

* Re: [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking
  2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
                   ` (11 preceding siblings ...)
  2016-12-07 10:37 ` [PATCH 12/12] PM / OPP: Update Documentation to remove RCU specific bits Viresh Kumar
@ 2016-12-07 23:14 ` Rafael J. Wysocki
  12 siblings, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2016-12-07 23:14 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Lists linaro-kernel, Linux PM,
	Linux Kernel Mailing List, Stephen Boyd, Nishanth Menon,
	Vincent Guittot

On Wed, Dec 7, 2016 at 11:37 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Hi Rafael,
>
> You can pretty much ignore this series until all other OPP cleanup/fixes
> get merged. I am posting these to get early reviews from Stephen as
> these patches have been lying with me for almost a week now. And I am
> also _not_ pushing these for 4.10-rc1. It all depends on how the reviews
> go.

So please stop sending this now if you will.

The amount of list noise you've just generated is well above what I'm
able to deal with today, so I'm going to ignore all of your patches
sent since yesterday (maybe except for the revert one as that should
be easy to handle).

This also means they are not going into 4.10-rc, so please don't
resend them before the merge window opens at least.

Thanks,
Rafael

* Re: [PATCH 07/12] PM / OPP: Update OPP users to put reference
  2016-12-07 13:23     ` Chanwoo Choi
@ 2016-12-08  4:00       ` Viresh Kumar
  -1 siblings, 0 replies; 41+ messages in thread
From: Viresh Kumar @ 2016-12-08  4:00 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: Rafael Wysocki, Kevin Hilman, Tony Lindgren, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Peter De Schrijver,
	Prashant Gaikwad, Stephen Warren, Thierry Reding,
	Alexandre Courbot, Kukjin Kim, Krzysztof Kozlowski,
	Javier Martinez Canillas, MyungJoo Ham, Kyungmin Park,
	Amit Daniel Kachhap, Javi Merino, Zhang Rui, Eduardo Valentin,
	linaro-kernel, linux-pm, linux-kernel, Vincent Guittot

On 07-12-16, 22:23, Chanwoo Choi wrote:
> I think that the dev_pm_opp_put(opp) should be called after if statement
> If dev_pm_opp_find_freq_ceil() return error, I think the calling of
> dev_pm_opp_put(opp) is not necessary.

During development I had the following check in dev_pm_opp_put():

        if (IS_ERR(opp))
                return;

But that check isn't there anymore, so it is unsafe to call
dev_pm_opp_put() on an invalid (error) OPP pointer.
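
To illustrate the calling convention, here is a minimal userspace sketch
(not kernel code). The mock_* names, the two-entry table and the plain
integer refcount are hypothetical stand-ins; only the ERR_PTR-style
encoding and the "check IS_ERR() first, then put" ordering mirror what
the callers are expected to do:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long err)
{
	return (void *)(intptr_t)err;
}

static inline long PTR_ERR(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, simplified OPP: only a rate and a reference count. */
struct mock_opp {
	unsigned long rate;
	int refcount;
};

/* Each entry starts with one reference, held by the table itself. */
static struct mock_opp mock_table[] = {
	{ .rate = 500000000, .refcount = 1 },
	{ .rate = 1000000000, .refcount = 1 },
};

/* Modelled on dev_pm_opp_find_freq_ceil(): takes a reference on success. */
static struct mock_opp *mock_find_freq_ceil(unsigned long *freq)
{
	size_t i;

	for (i = 0; i < sizeof(mock_table) / sizeof(mock_table[0]); i++) {
		if (mock_table[i].rate >= *freq) {
			*freq = mock_table[i].rate;
			mock_table[i].refcount++;
			return &mock_table[i];
		}
	}
	return ERR_PTR(-ERANGE);
}

/* Modelled on dev_pm_opp_put(): no IS_ERR() check, so callers must guard. */
static void mock_opp_put(struct mock_opp *opp)
{
	assert(opp->refcount > 0);
	opp->refcount--;
}

/* Caller pattern: bail out on error first, put only a valid OPP. */
static long mock_get_rate(unsigned long freq, unsigned long *rate)
{
	struct mock_opp *opp = mock_find_freq_ceil(&freq);

	if (IS_ERR(opp))
		return PTR_ERR(opp);

	*rate = opp->rate;
	mock_opp_put(opp);
	return 0;
}
```

Passing the error pointer to mock_opp_put() would dereference an
invalid address, which is the failure mode the IS_ERR() check avoids.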

Thanks for reviewing this properly. devfreq_cooling.c also had the same issue
which you missed. Here is the new version of the patch:

-------------------------8<-------------------------
Subject: [PATCH] PM / OPP: Update OPP users to put reference

This patch updates dev_pm_opp_find_freq_*() routines to get a reference
to the OPPs returned by them.

Also update the users of dev_pm_opp_find_freq_*() routines to call
dev_pm_opp_put() after they are done using the OPPs.

As it is guaranteed that the OPPs won't get freed while being used,
the RCU read side locking present with the users isn't required anymore.
Drop it as well.

This patch also updates all users of devfreq_recommended_opp(), which
returns an OPP received from the OPP core.

Note that some of the OPP core routines have gained
rcu_read_{lock|unlock}() calls, as those still use RCU specific APIs
within them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 arch/arm/mach-omap2/pm.c             |   5 +-
 drivers/base/power/opp/core.c        | 114 +++++++++++++++++++----------------
 drivers/base/power/opp/cpu.c         |  22 ++-----
 drivers/clk/tegra/clk-dfll.c         |  17 ++----
 drivers/cpufreq/exynos5440-cpufreq.c |   5 +-
 drivers/cpufreq/imx6q-cpufreq.c      |  10 +--
 drivers/cpufreq/mt8173-cpufreq.c     |   8 +--
 drivers/cpufreq/omap-cpufreq.c       |   4 +-
 drivers/devfreq/devfreq.c            |  14 ++---
 drivers/devfreq/exynos-bus.c         |  14 ++---
 drivers/devfreq/governor_passive.c   |   4 +-
 drivers/devfreq/rk3399_dmc.c         |  16 ++---
 drivers/devfreq/tegra-devfreq.c      |   4 +-
 drivers/thermal/cpu_cooling.c        |  11 +---
 drivers/thermal/devfreq_cooling.c    |  15 ++---
 15 files changed, 110 insertions(+), 153 deletions(-)

diff --git a/arch/arm/mach-omap2/pm.c b/arch/arm/mach-omap2/pm.c
index 678d2a31dcb8..c5a1d4439202 100644
--- a/arch/arm/mach-omap2/pm.c
+++ b/arch/arm/mach-omap2/pm.c
@@ -167,17 +167,16 @@ static int __init omap2_set_init_voltage(char *vdd_name, char *clk_name,
 	freq = clk_get_rate(clk);
 	clk_put(clk);
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(dev, &freq);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("%s: unable to find boot up OPP for vdd_%s\n",
 			__func__, vdd_name);
 		goto exit;
 	}
 
 	bootup_volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	if (!bootup_volt) {
 		pr_err("%s: unable to find voltage corresponding to the bootup OPP for vdd_%s\n",
 		       __func__, vdd_name);
diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 9870ee54d708..a6efa818029a 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -40,6 +40,8 @@ do {									\
 			 "opp_table_lock protection");			\
 } while (0)
 
+static void dev_pm_opp_get(struct dev_pm_opp *opp);
+
 static struct opp_device *_find_opp_dev(const struct device *dev,
 					struct opp_table *opp_table)
 {
@@ -94,21 +96,13 @@ struct opp_table *_find_opp_table(struct device *dev)
  * return 0
  *
  * This is useful only for devices with single power supply.
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
 	unsigned long v = 0;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp))
@@ -116,6 +110,7 @@ unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
 	else
 		v = tmp_opp->supplies[0].u_volt;
 
+	rcu_read_unlock();
 	return v;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
@@ -126,21 +121,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
  *
  * Return: frequency in hertz corresponding to the opp, else
  * return 0
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
 	unsigned long f = 0;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available)
@@ -148,6 +135,7 @@ unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 	else
 		f = tmp_opp->rate;
 
+	rcu_read_unlock();
 	return f;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
@@ -161,20 +149,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
  * quickly. Running on them for longer times may overheat the chip.
  *
  * Return: true if opp is turbo opp, else false.
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
+	bool turbo;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available) {
@@ -182,7 +163,10 @@ bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
 		return false;
 	}
 
-	return tmp_opp->turbo;
+	turbo = tmp_opp->turbo;
+
+	rcu_read_unlock();
+	return turbo;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_is_turbo);
 
@@ -410,11 +394,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_count);
  * This provides a mechanism to enable an opp which is not available currently
  * or the opposite as well.
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 					      unsigned long freq,
@@ -423,13 +404,14 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 	struct opp_table *opp_table;
 	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
 		int r = PTR_ERR(opp_table);
 
 		dev_err(dev, "%s: OPP table not found (%d)\n", __func__, r);
+		rcu_read_unlock();
 		return ERR_PTR(r);
 	}
 
@@ -437,10 +419,15 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 		if (temp_opp->available == available &&
 				temp_opp->rate == freq) {
 			opp = temp_opp;
+
+			/* Increment the reference count of OPP */
+			dev_pm_opp_get(opp);
 			break;
 		}
 	}
 
+	rcu_read_unlock();
+
 	return opp;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_exact);
@@ -454,6 +441,9 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
 		if (temp_opp->available && temp_opp->rate >= *freq) {
 			opp = temp_opp;
 			*freq = opp->rate;
+
+			/* Increment the reference count of OPP */
+			dev_pm_opp_get(opp);
 			break;
 		}
 	}
@@ -476,29 +466,33 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
  * ERANGE:	no match found for search
  * ENODEV:	if device not found in list of registered devices
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev,
 					     unsigned long *freq)
 {
 	struct opp_table *opp_table;
-
-	opp_rcu_lockdep_assert();
+	struct dev_pm_opp *opp;
 
 	if (!dev || !freq) {
 		dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
 		return ERR_PTR(-EINVAL);
 	}
 
+	rcu_read_lock();
+
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table))
+	if (IS_ERR(opp_table)) {
+		rcu_read_unlock();
 		return ERR_CAST(opp_table);
+	}
+
+	opp = _find_freq_ceil(opp_table, freq);
 
-	return _find_freq_ceil(opp_table, freq);
+	rcu_read_unlock();
+
+	return opp;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
 
@@ -517,11 +511,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
  * ERANGE:	no match found for search
  * ENODEV:	if device not found in list of registered devices
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 					      unsigned long *freq)
@@ -529,16 +520,18 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 	struct opp_table *opp_table;
 	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
 
-	opp_rcu_lockdep_assert();
-
 	if (!dev || !freq) {
 		dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
 		return ERR_PTR(-EINVAL);
 	}
 
+	rcu_read_lock();
+
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table))
+	if (IS_ERR(opp_table)) {
+		rcu_read_unlock();
 		return ERR_CAST(opp_table);
+	}
 
 	list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
 		if (temp_opp->available) {
@@ -549,6 +542,12 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 				opp = temp_opp;
 		}
 	}
+
+	/* Increment the reference count of OPP */
+	if (!IS_ERR(opp))
+		dev_pm_opp_get(opp);
+	rcu_read_unlock();
+
 	if (!IS_ERR(opp))
 		*freq = opp->rate;
 
@@ -736,6 +735,8 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 		ret = PTR_ERR(opp);
 		dev_err(dev, "%s: failed to find OPP for freq %lu (%d)\n",
 			__func__, freq, ret);
+		if (!IS_ERR(old_opp))
+			dev_pm_opp_put(old_opp);
 		rcu_read_unlock();
 		return ret;
 	}
@@ -747,6 +748,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 
 	/* Only frequency scaling */
 	if (!regulators) {
+		dev_pm_opp_put(opp);
+		if (!IS_ERR(old_opp))
+			dev_pm_opp_put(old_opp);
 		rcu_read_unlock();
 		return _generic_set_opp_clk_only(dev, clk, old_freq, freq);
 	}
@@ -772,6 +776,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 	data->new_opp.rate = freq;
 	memcpy(data->new_opp.supplies, opp->supplies, size);
 
+	dev_pm_opp_put(opp);
+	if (!IS_ERR(old_opp))
+		dev_pm_opp_put(old_opp);
 	rcu_read_unlock();
 
 	return set_opp(data);
@@ -967,6 +974,11 @@ static void _opp_kref_release(struct kref *kref)
 	dev_pm_opp_put_opp_table(opp_table);
 }
 
+static void dev_pm_opp_get(struct dev_pm_opp *opp)
+{
+	kref_get(&opp->kref);
+}
+
 void dev_pm_opp_put(struct dev_pm_opp *opp)
 {
 	kref_put_mutex(&opp->kref, _opp_kref_release, &opp->opp_table->lock);
diff --git a/drivers/base/power/opp/cpu.c b/drivers/base/power/opp/cpu.c
index 8c3434bdb26d..adef788862d5 100644
--- a/drivers/base/power/opp/cpu.c
+++ b/drivers/base/power/opp/cpu.c
@@ -42,11 +42,6 @@
  *
  * WARNING: It is  important for the callers to ensure refreshing their copy of
  * the table if any of the mentioned functions have been invoked in the interim.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Since we just use the regular accessor functions to access the internal data
- * structures, we use RCU read lock inside this function. As a result, users of
- * this function DONOT need to use explicit locks for invoking.
  */
 int dev_pm_opp_init_cpufreq_table(struct device *dev,
 				  struct cpufreq_frequency_table **table)
@@ -56,19 +51,13 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 	int i, max_opps, ret = 0;
 	unsigned long rate;
 
-	rcu_read_lock();
-
 	max_opps = dev_pm_opp_get_opp_count(dev);
-	if (max_opps <= 0) {
-		ret = max_opps ? max_opps : -ENODATA;
-		goto out;
-	}
+	if (max_opps <= 0)
+		return max_opps ? max_opps : -ENODATA;
 
 	freq_table = kcalloc((max_opps + 1), sizeof(*freq_table), GFP_ATOMIC);
-	if (!freq_table) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!freq_table)
+		return -ENOMEM;
 
 	for (i = 0, rate = 0; i < max_opps; i++, rate++) {
 		/* find next rate */
@@ -83,6 +72,8 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 		/* Is Boost/turbo opp ? */
 		if (dev_pm_opp_is_turbo(opp))
 			freq_table[i].flags = CPUFREQ_BOOST_FREQ;
+
+		dev_pm_opp_put(opp);
 	}
 
 	freq_table[i].driver_data = i;
@@ -91,7 +82,6 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 	*table = &freq_table[0];
 
 out:
-	rcu_read_unlock();
 	if (ret)
 		kfree(freq_table);
 
diff --git a/drivers/clk/tegra/clk-dfll.c b/drivers/clk/tegra/clk-dfll.c
index f010562534eb..2c44aeb0b97c 100644
--- a/drivers/clk/tegra/clk-dfll.c
+++ b/drivers/clk/tegra/clk-dfll.c
@@ -633,16 +633,12 @@ static int find_lut_index_for_rate(struct tegra_dfll *td, unsigned long rate)
 	struct dev_pm_opp *opp;
 	int i, uv;
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_ceil(td->soc->dev, &rate);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
-	uv = dev_pm_opp_get_voltage(opp);
 
-	rcu_read_unlock();
+	uv = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 
 	for (i = 0; i < td->i2c_lut_size; i++) {
 		if (regulator_list_voltage(td->vdd_reg, td->i2c_lut[i]) == uv)
@@ -1440,8 +1436,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 	struct dev_pm_opp *opp;
 	int lut;
 
-	rcu_read_lock();
-
 	rate = ULONG_MAX;
 	opp = dev_pm_opp_find_freq_floor(td->soc->dev, &rate);
 	if (IS_ERR(opp)) {
@@ -1449,6 +1443,7 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		goto out;
 	}
 	v_max = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 
 	v = td->soc->cvb->min_millivolts * 1000;
 	lut = find_vdd_map_entry_exact(td, v);
@@ -1465,6 +1460,8 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		if (v_opp <= td->soc->cvb->min_millivolts * 1000)
 			td->dvco_rate_min = dev_pm_opp_get_freq(opp);
 
+		dev_pm_opp_put(opp);
+
 		for (;;) {
 			v += max(1, (v_max - v) / (MAX_DFLL_VOLTAGES - j));
 			if (v >= v_opp)
@@ -1496,8 +1493,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		ret = 0;
 
 out:
-	rcu_read_unlock();
-
 	return ret;
 }
 
diff --git a/drivers/cpufreq/exynos5440-cpufreq.c b/drivers/cpufreq/exynos5440-cpufreq.c
index c0f3373706f4..9180d34cc9fc 100644
--- a/drivers/cpufreq/exynos5440-cpufreq.c
+++ b/drivers/cpufreq/exynos5440-cpufreq.c
@@ -118,12 +118,10 @@ static int init_div_table(void)
 	unsigned int tmp, clk_div, ema_div, freq, volt_id;
 	struct dev_pm_opp *opp;
 
-	rcu_read_lock();
 	cpufreq_for_each_entry(pos, freq_tbl) {
 		opp = dev_pm_opp_find_freq_exact(dvfs_info->dev,
 					pos->frequency * 1000, true);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			dev_err(dvfs_info->dev,
 				"failed to find valid OPP for %u KHZ\n",
 				pos->frequency);
@@ -140,6 +138,7 @@ static int init_div_table(void)
 
 		/* Calculate EMA */
 		volt_id = dev_pm_opp_get_voltage(opp);
+
 		volt_id = (MAX_VOLTAGE - volt_id) / VOLTAGE_STEP;
 		if (volt_id < PMIC_HIGH_VOLT) {
 			ema_div = (CPUEMA_HIGH << P0_7_CPUEMA_SHIFT) |
@@ -157,9 +156,9 @@ static int init_div_table(void)
 
 		__raw_writel(tmp, dvfs_info->base + XMU_PMU_P0_7 + 4 *
 						(pos - freq_tbl));
+		dev_pm_opp_put(opp);
 	}
 
-	rcu_read_unlock();
 	return 0;
 }
 
diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
index ef1fa8145419..7719b02e04f5 100644
--- a/drivers/cpufreq/imx6q-cpufreq.c
+++ b/drivers/cpufreq/imx6q-cpufreq.c
@@ -53,16 +53,15 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
 	freq_hz = new_freq * 1000;
 	old_freq = clk_get_rate(arm_clk) / 1000;
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		dev_err(cpu_dev, "failed to find OPP for %ld\n", freq_hz);
 		return PTR_ERR(opp);
 	}
 
 	volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	volt_old = regulator_get_voltage(arm_reg);
 
 	dev_dbg(cpu_dev, "%u MHz, %ld mV --> %u MHz, %ld mV\n",
@@ -321,14 +320,15 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
 	 * freq_table initialised from OPP is therefore sorted in the
 	 * same order.
 	 */
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_exact(cpu_dev,
 				  freq_table[0].frequency * 1000, true);
 	min_volt = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 	opp = dev_pm_opp_find_freq_exact(cpu_dev,
 				  freq_table[--num].frequency * 1000, true);
 	max_volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	ret = regulator_set_voltage_time(arm_reg, min_volt, max_volt);
 	if (ret > 0)
 		transition_latency += ret * 1000;
diff --git a/drivers/cpufreq/mt8173-cpufreq.c b/drivers/cpufreq/mt8173-cpufreq.c
index 643f43179df1..ab25b1235a5e 100644
--- a/drivers/cpufreq/mt8173-cpufreq.c
+++ b/drivers/cpufreq/mt8173-cpufreq.c
@@ -232,16 +232,14 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 
 	freq_hz = freq_table[index].frequency * 1000;
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("cpu%d: failed to find OPP for %ld\n",
 		       policy->cpu, freq_hz);
 		return PTR_ERR(opp);
 	}
 	vproc = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	/*
 	 * If the new voltage or the intermediate voltage is higher than the
@@ -411,16 +409,14 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 
 	/* Search a safe voltage for intermediate frequency. */
 	rate = clk_get_rate(inter_clk);
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &rate);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("failed to get intermediate opp for cpu%d\n", cpu);
 		ret = PTR_ERR(opp);
 		goto out_free_opp_table;
 	}
 	info->intermediate_voltage = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	info->cpu_dev = cpu_dev;
 	info->proc_reg = proc_reg;
diff --git a/drivers/cpufreq/omap-cpufreq.c b/drivers/cpufreq/omap-cpufreq.c
index 376e63ca94e8..71e81bbf031b 100644
--- a/drivers/cpufreq/omap-cpufreq.c
+++ b/drivers/cpufreq/omap-cpufreq.c
@@ -63,16 +63,14 @@ static int omap_target(struct cpufreq_policy *policy, unsigned int index)
 	freq = ret;
 
 	if (mpu_reg) {
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_ceil(mpu_dev, &freq);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			dev_err(mpu_dev, "%s: unable to find MPU OPP for %d\n",
 				__func__, new_freq);
 			return -EINVAL;
 		}
 		volt = dev_pm_opp_get_voltage(opp);
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 		tol = volt * OPP_TOLERANCE / 100;
 		volt_old = regulator_get_voltage(mpu_reg);
 	}
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index b0de42972b74..378f12a51496 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -111,18 +111,16 @@ static void devfreq_set_freq_table(struct devfreq *devfreq)
 		return;
 	}
 
-	rcu_read_lock();
 	for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
 		opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
 		if (IS_ERR(opp)) {
 			devm_kfree(devfreq->dev.parent, profile->freq_table);
 			profile->max_state = 0;
-			rcu_read_unlock();
 			return;
 		}
+		dev_pm_opp_put(opp);
 		profile->freq_table[i] = freq;
 	}
-	rcu_read_unlock();
 }
 
 /**
@@ -1107,17 +1105,16 @@ static ssize_t available_frequencies_show(struct device *d,
 	ssize_t count = 0;
 	unsigned long freq = 0;
 
-	rcu_read_lock();
 	do {
 		opp = dev_pm_opp_find_freq_ceil(dev, &freq);
 		if (IS_ERR(opp))
 			break;
 
+		dev_pm_opp_put(opp);
 		count += scnprintf(&buf[count], (PAGE_SIZE - count - 2),
 				   "%lu ", freq);
 		freq++;
 	} while (1);
-	rcu_read_unlock();
 
 	/* Truncate the trailing space */
 	if (count)
@@ -1219,11 +1216,8 @@ subsys_initcall(devfreq_init);
  * @freq:	The frequency given to target function
  * @flags:	Flags handed from devfreq framework.
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
 					   unsigned long *freq,
diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
index a8ed7792ece2..49ce38cef460 100644
--- a/drivers/devfreq/exynos-bus.c
+++ b/drivers/devfreq/exynos-bus.c
@@ -103,18 +103,17 @@ static int exynos_bus_target(struct device *dev, unsigned long *freq, u32 flags)
 	int ret = 0;
 
 	/* Get new opp-bus instance according to new bus clock */
-	rcu_read_lock();
 	new_opp = devfreq_recommended_opp(dev, freq, flags);
 	if (IS_ERR(new_opp)) {
 		dev_err(dev, "failed to get recommended opp instance\n");
-		rcu_read_unlock();
 		return PTR_ERR(new_opp);
 	}
 
 	new_freq = dev_pm_opp_get_freq(new_opp);
 	new_volt = dev_pm_opp_get_voltage(new_opp);
+	dev_pm_opp_put(new_opp);
+
 	old_freq = bus->curr_freq;
-	rcu_read_unlock();
 
 	if (old_freq == new_freq)
 		return 0;
@@ -214,17 +213,16 @@ static int exynos_bus_passive_target(struct device *dev, unsigned long *freq,
 	int ret = 0;
 
 	/* Get new opp-bus instance according to new bus clock */
-	rcu_read_lock();
 	new_opp = devfreq_recommended_opp(dev, freq, flags);
 	if (IS_ERR(new_opp)) {
 		dev_err(dev, "failed to get recommended opp instance\n");
-		rcu_read_unlock();
 		return PTR_ERR(new_opp);
 	}
 
 	new_freq = dev_pm_opp_get_freq(new_opp);
+	dev_pm_opp_put(new_opp);
+
 	old_freq = bus->curr_freq;
-	rcu_read_unlock();
 
 	if (old_freq == new_freq)
 		return 0;
@@ -358,16 +356,14 @@ static int exynos_bus_parse_of(struct device_node *np,
 
 	rate = clk_get_rate(bus->clk);
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &rate, 0);
 	if (IS_ERR(opp)) {
 		dev_err(dev, "failed to find dev_pm_opp\n");
-		rcu_read_unlock();
 		ret = PTR_ERR(opp);
 		goto err_opp;
 	}
 	bus->curr_freq = dev_pm_opp_get_freq(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	return 0;
 
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 9ef46e2592c4..bd452236dba4 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -59,14 +59,14 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
 	 * list of parent device. Because in this case, *freq is temporary
 	 * value which is decided by ondemand governor.
 	 */
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(parent_devfreq->dev.parent, freq, 0);
-	rcu_read_unlock();
 	if (IS_ERR(opp)) {
 		ret = PTR_ERR(opp);
 		goto out;
 	}
 
+	dev_pm_opp_put(opp);
+
 	/*
 	 * Get the OPP table's index of decided freqeuncy by governor
 	 * of parent device.
diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
index 27d2f349b53c..40a2499730fc 100644
--- a/drivers/devfreq/rk3399_dmc.c
+++ b/drivers/devfreq/rk3399_dmc.c
@@ -91,17 +91,13 @@ static int rk3399_dmcfreq_target(struct device *dev, unsigned long *freq,
 	unsigned long target_volt, target_rate;
 	int err;
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, freq, flags);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
 
 	target_rate = dev_pm_opp_get_freq(opp);
 	target_volt = dev_pm_opp_get_voltage(opp);
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (dmcfreq->rate == target_rate)
 		return 0;
@@ -422,15 +418,13 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev)
 
 	data->rate = clk_get_rate(data->dmc_clk);
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &data->rate, 0);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
+
 	data->rate = dev_pm_opp_get_freq(opp);
 	data->volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	rk3399_devfreq_dmc_profile.initial_freq = data->rate;
 
diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
index fe9dce0245bf..214fff96fa4a 100644
--- a/drivers/devfreq/tegra-devfreq.c
+++ b/drivers/devfreq/tegra-devfreq.c
@@ -487,15 +487,13 @@ static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
 	struct dev_pm_opp *opp;
 	unsigned long rate = *freq * KHZ;
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &rate, flags);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		dev_err(dev, "Failed to find opp for %lu KHz\n", *freq);
 		return PTR_ERR(opp);
 	}
 	rate = dev_pm_opp_get_freq(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	clk_set_min_rate(tegra->emc_clock, rate);
 	clk_set_rate(tegra->emc_clock, 0);
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 9ce0e9eef923..85fdbf762fa0 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -297,8 +297,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 	if (!power_table)
 		return -ENOMEM;
 
-	rcu_read_lock();
-
 	for (freq = 0, i = 0;
 	     opp = dev_pm_opp_find_freq_ceil(dev, &freq), !IS_ERR(opp);
 	     freq++, i++) {
@@ -306,13 +304,13 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 		u64 power;
 
 		if (i >= num_opps) {
-			rcu_read_unlock();
 			ret = -EAGAIN;
 			goto free_power_table;
 		}
 
 		freq_mhz = freq / 1000000;
 		voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
+		dev_pm_opp_put(opp);
 
 		/*
 		 * Do the multiplication with MHz and millivolt so as
@@ -328,8 +326,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 		power_table[i].power = power;
 	}
 
-	rcu_read_unlock();
-
 	if (i != num_opps) {
 		ret = PTR_ERR(opp);
 		goto free_power_table;
@@ -433,13 +429,10 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_device,
 		return 0;
 	}
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_exact(cpufreq_device->cpu_dev, freq_hz,
 					 true);
 	voltage = dev_pm_opp_get_voltage(opp);
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (voltage == 0) {
 		dev_warn_ratelimited(cpufreq_device->cpu_dev,
diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
index 81631b110e17..abe8ad76bd8b 100644
--- a/drivers/thermal/devfreq_cooling.c
+++ b/drivers/thermal/devfreq_cooling.c
@@ -113,15 +113,15 @@ static int partition_enable_opps(struct devfreq_cooling_device *dfc,
 		unsigned int freq = dfc->freq_table[i];
 		bool want_enable = i >= cdev_state ? true : false;
 
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_exact(dev, freq, !want_enable);
-		rcu_read_unlock();
 
 		if (PTR_ERR(opp) == -ERANGE)
 			continue;
 		else if (IS_ERR(opp))
 			return PTR_ERR(opp);
 
+		dev_pm_opp_put(opp);
+
 		if (want_enable)
 			ret = dev_pm_opp_enable(dev, freq);
 		else
@@ -221,15 +221,12 @@ get_static_power(struct devfreq_cooling_device *dfc, unsigned long freq)
 	if (!dfc->power_ops->get_static_power)
 		return 0;
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_exact(dev, freq, true);
 	if (IS_ERR(opp) && (PTR_ERR(opp) == -ERANGE))
 		opp = dev_pm_opp_find_freq_exact(dev, freq, false);
 
 	voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (voltage == 0) {
 		dev_warn_ratelimited(dev,
@@ -411,18 +408,14 @@ static int devfreq_cooling_gen_tables(struct devfreq_cooling_device *dfc)
 		unsigned long power_dyn, voltage;
 		struct dev_pm_opp *opp;
 
-		rcu_read_lock();
-
 		opp = dev_pm_opp_find_freq_floor(dev, &freq);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			ret = PTR_ERR(opp);
 			goto free_tables;
 		}
 
 		voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
-
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 
 		if (dfc->power_ops) {
 			power_dyn = get_dynamic_power(dfc, freq, voltage);

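The get/put lifetime rule that this patch relies on can be modelled in
plain userspace C. This is only a sketch under stated assumptions: the
ref_* names are hypothetical, the counter is a non-atomic int, and no
mutex is taken on the final put, whereas the patch itself uses
kref_get() and kref_put_mutex() on opp->kref.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kref: a plain, non-atomic counter. */
struct ref_count {
	int count;
};

/* Simplified OPP; the counter is the first member so a cast recovers it. */
struct ref_opp {
	struct ref_count ref;
	unsigned long rate;
};

static int ref_released;	/* number of OPPs freed so far */

/* Release callback: runs only when the last reference is dropped. */
static void ref_opp_release(struct ref_count *ref)
{
	free((struct ref_opp *)ref);
	ref_released++;
}

/* Allocate an OPP holding one initial reference, owned by the "table". */
static struct ref_opp *ref_opp_add(unsigned long rate)
{
	struct ref_opp *opp = malloc(sizeof(*opp));

	if (!opp)
		return NULL;
	opp->ref.count = 1;
	opp->rate = rate;
	return opp;
}

/* Counterpart of dev_pm_opp_get(): caller must already hold a reference. */
static void ref_opp_get(struct ref_opp *opp)
{
	assert(opp->ref.count > 0);
	opp->ref.count++;
}

/* Counterpart of dev_pm_opp_put(): returns 1 if this put freed the OPP. */
static int ref_opp_put(struct ref_opp *opp)
{
	assert(opp->ref.count > 0);
	if (--opp->ref.count == 0) {
		ref_opp_release(&opp->ref);
		return 1;
	}
	return 0;
}
```

A reader that holds its own reference can keep using the OPP after the
table has dropped its reference; the memory goes away only on the last
put, which is why the callers no longer need rcu_read_lock() around
their accesses.
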
* Re: [PATCH 07/12] PM / OPP: Update OPP users to put reference
@ 2016-12-08  4:00       ` Viresh Kumar
  0 siblings, 0 replies; 41+ messages in thread
From: Viresh Kumar @ 2016-12-08  4:00 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: Rafael Wysocki, Kevin Hilman, Tony Lindgren, Viresh Kumar,
	Nishanth Menon, Stephen Boyd, Peter De Schrijver,
	Prashant Gaikwad, Stephen Warren, Thierry Reding,
	Alexandre Courbot, Kukjin Kim, Krzysztof Kozlowski,
	Javier Martinez Canillas, MyungJoo Ham, Kyungmin Park,
	Amit Daniel Kachhap, Javi Merino, Zhang Rui

On 07-12-16, 22:23, Chanwoo Choi wrote:
> I think that the dev_pm_opp_put(opp) should be called after if statement
> If dev_pm_opp_find_freq_ceil() return error, I think the calling of
> dev_pm_opp_put(opp) is not necessary.

During development I had the following check in dev_pm_opp_put():

        if (IS_ERR(opp))
                return;

But that check isn't there anymore, so it is unsafe to call
dev_pm_opp_put() on an invalid (error) OPP pointer.

Thanks for reviewing this properly. devfreq_cooling.c also had the same issue
which you missed. Here is the new version of the patch:

-------------------------8<-------------------------
Subject: [PATCH] PM / OPP: Update OPP users to put reference

This patch updates dev_pm_opp_find_freq_*() routines to get a reference
to the OPPs returned by them.

Also update the users of dev_pm_opp_find_freq_*() routines to call
dev_pm_opp_put() after they are done using the OPPs.

As it is guaranteed that the OPPs won't get freed while being used,
the RCU read side locking present with the users isn't required anymore.
Drop it as well.

This patch also updates all users of devfreq_recommended_opp() which was
returning an OPP received from the OPP core.

Note that some of the OPP core routines have gained
rcu_read_{lock|unlock}() calls, as those still use RCU specific APIs
within them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 arch/arm/mach-omap2/pm.c             |   5 +-
 drivers/base/power/opp/core.c        | 114 +++++++++++++++++++----------------
 drivers/base/power/opp/cpu.c         |  22 ++-----
 drivers/clk/tegra/clk-dfll.c         |  17 ++----
 drivers/cpufreq/exynos5440-cpufreq.c |   5 +-
 drivers/cpufreq/imx6q-cpufreq.c      |  10 +--
 drivers/cpufreq/mt8173-cpufreq.c     |   8 +--
 drivers/cpufreq/omap-cpufreq.c       |   4 +-
 drivers/devfreq/devfreq.c            |  14 ++---
 drivers/devfreq/exynos-bus.c         |  14 ++---
 drivers/devfreq/governor_passive.c   |   4 +-
 drivers/devfreq/rk3399_dmc.c         |  16 ++---
 drivers/devfreq/tegra-devfreq.c      |   4 +-
 drivers/thermal/cpu_cooling.c        |  11 +---
 drivers/thermal/devfreq_cooling.c    |  15 ++---
 15 files changed, 110 insertions(+), 153 deletions(-)

diff --git a/arch/arm/mach-omap2/pm.c b/arch/arm/mach-omap2/pm.c
index 678d2a31dcb8..c5a1d4439202 100644
--- a/arch/arm/mach-omap2/pm.c
+++ b/arch/arm/mach-omap2/pm.c
@@ -167,17 +167,16 @@ static int __init omap2_set_init_voltage(char *vdd_name, char *clk_name,
 	freq = clk_get_rate(clk);
 	clk_put(clk);
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(dev, &freq);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("%s: unable to find boot up OPP for vdd_%s\n",
 			__func__, vdd_name);
 		goto exit;
 	}
 
 	bootup_volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	if (!bootup_volt) {
 		pr_err("%s: unable to find voltage corresponding to the bootup OPP for vdd_%s\n",
 		       __func__, vdd_name);
diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 9870ee54d708..a6efa818029a 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -40,6 +40,8 @@ do {									\
 			 "opp_table_lock protection");			\
 } while (0)
 
+static void dev_pm_opp_get(struct dev_pm_opp *opp);
+
 static struct opp_device *_find_opp_dev(const struct device *dev,
 					struct opp_table *opp_table)
 {
@@ -94,21 +96,13 @@ struct opp_table *_find_opp_table(struct device *dev)
  * return 0
  *
  * This is useful only for devices with single power supply.
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
 	unsigned long v = 0;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp))
@@ -116,6 +110,7 @@ unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
 	else
 		v = tmp_opp->supplies[0].u_volt;
 
+	rcu_read_unlock();
 	return v;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
@@ -126,21 +121,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
  *
  * Return: frequency in hertz corresponding to the opp, else
  * return 0
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
 	unsigned long f = 0;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available)
@@ -148,6 +135,7 @@ unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 	else
 		f = tmp_opp->rate;
 
+	rcu_read_unlock();
 	return f;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
@@ -161,20 +149,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
  * quickly. Running on them for longer times may overheat the chip.
  *
  * Return: true if opp is turbo opp, else false.
- *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. This means that opp which could have been fetched by
- * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
- * under RCU lock. The pointer returned by the opp_find_freq family must be
- * used in the same section as the usage of this function with the pointer
- * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
- * pointer.
  */
 bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
 {
 	struct dev_pm_opp *tmp_opp;
+	bool turbo;
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	tmp_opp = rcu_dereference(opp);
 	if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available) {
@@ -182,7 +163,10 @@ bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
 		return false;
 	}
 
-	return tmp_opp->turbo;
+	turbo = tmp_opp->turbo;
+
+	rcu_read_unlock();
+	return turbo;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_is_turbo);
 
@@ -410,11 +394,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_count);
  * This provides a mechanism to enable an opp which is not available currently
  * or the opposite as well.
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 					      unsigned long freq,
@@ -423,13 +404,14 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 	struct opp_table *opp_table;
 	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
 
-	opp_rcu_lockdep_assert();
+	rcu_read_lock();
 
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
 		int r = PTR_ERR(opp_table);
 
 		dev_err(dev, "%s: OPP table not found (%d)\n", __func__, r);
+		rcu_read_unlock();
 		return ERR_PTR(r);
 	}
 
@@ -437,10 +419,15 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 		if (temp_opp->available == available &&
 				temp_opp->rate == freq) {
 			opp = temp_opp;
+
+			/* Increment the reference count of OPP */
+			dev_pm_opp_get(opp);
 			break;
 		}
 	}
 
+	rcu_read_unlock();
+
 	return opp;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_exact);
@@ -454,6 +441,9 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
 		if (temp_opp->available && temp_opp->rate >= *freq) {
 			opp = temp_opp;
 			*freq = opp->rate;
+
+			/* Increment the reference count of OPP */
+			dev_pm_opp_get(opp);
 			break;
 		}
 	}
@@ -476,29 +466,33 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
  * ERANGE:	no match found for search
  * ENODEV:	if device not found in list of registered devices
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev,
 					     unsigned long *freq)
 {
 	struct opp_table *opp_table;
-
-	opp_rcu_lockdep_assert();
+	struct dev_pm_opp *opp;
 
 	if (!dev || !freq) {
 		dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
 		return ERR_PTR(-EINVAL);
 	}
 
+	rcu_read_lock();
+
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table))
+	if (IS_ERR(opp_table)) {
+		rcu_read_unlock();
 		return ERR_CAST(opp_table);
+	}
+
+	opp = _find_freq_ceil(opp_table, freq);
 
-	return _find_freq_ceil(opp_table, freq);
+	rcu_read_unlock();
+
+	return opp;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
 
@@ -517,11 +511,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
  * ERANGE:	no match found for search
  * ENODEV:	if device not found in list of registered devices
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 					      unsigned long *freq)
@@ -529,16 +520,18 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 	struct opp_table *opp_table;
 	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
 
-	opp_rcu_lockdep_assert();
-
 	if (!dev || !freq) {
 		dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
 		return ERR_PTR(-EINVAL);
 	}
 
+	rcu_read_lock();
+
 	opp_table = _find_opp_table(dev);
-	if (IS_ERR(opp_table))
+	if (IS_ERR(opp_table)) {
+		rcu_read_unlock();
 		return ERR_CAST(opp_table);
+	}
 
 	list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
 		if (temp_opp->available) {
@@ -549,6 +542,12 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 				opp = temp_opp;
 		}
 	}
+
+	/* Increment the reference count of OPP */
+	if (!IS_ERR(opp))
+		dev_pm_opp_get(opp);
+	rcu_read_unlock();
+
 	if (!IS_ERR(opp))
 		*freq = opp->rate;
 
@@ -736,6 +735,8 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 		ret = PTR_ERR(opp);
 		dev_err(dev, "%s: failed to find OPP for freq %lu (%d)\n",
 			__func__, freq, ret);
+		if (!IS_ERR(old_opp))
+			dev_pm_opp_put(old_opp);
 		rcu_read_unlock();
 		return ret;
 	}
@@ -747,6 +748,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 
 	/* Only frequency scaling */
 	if (!regulators) {
+		dev_pm_opp_put(opp);
+		if (!IS_ERR(old_opp))
+			dev_pm_opp_put(old_opp);
 		rcu_read_unlock();
 		return _generic_set_opp_clk_only(dev, clk, old_freq, freq);
 	}
@@ -772,6 +776,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 	data->new_opp.rate = freq;
 	memcpy(data->new_opp.supplies, opp->supplies, size);
 
+	dev_pm_opp_put(opp);
+	if (!IS_ERR(old_opp))
+		dev_pm_opp_put(old_opp);
 	rcu_read_unlock();
 
 	return set_opp(data);
@@ -967,6 +974,11 @@ static void _opp_kref_release(struct kref *kref)
 	dev_pm_opp_put_opp_table(opp_table);
 }
 
+static void dev_pm_opp_get(struct dev_pm_opp *opp)
+{
+	kref_get(&opp->kref);
+}
+
 void dev_pm_opp_put(struct dev_pm_opp *opp)
 {
 	kref_put_mutex(&opp->kref, _opp_kref_release, &opp->opp_table->lock);
diff --git a/drivers/base/power/opp/cpu.c b/drivers/base/power/opp/cpu.c
index 8c3434bdb26d..adef788862d5 100644
--- a/drivers/base/power/opp/cpu.c
+++ b/drivers/base/power/opp/cpu.c
@@ -42,11 +42,6 @@
  *
  * WARNING: It is  important for the callers to ensure refreshing their copy of
  * the table if any of the mentioned functions have been invoked in the interim.
- *
- * Locking: The internal opp_table and opp structures are RCU protected.
- * Since we just use the regular accessor functions to access the internal data
- * structures, we use RCU read lock inside this function. As a result, users of
- * this function DONOT need to use explicit locks for invoking.
  */
 int dev_pm_opp_init_cpufreq_table(struct device *dev,
 				  struct cpufreq_frequency_table **table)
@@ -56,19 +51,13 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 	int i, max_opps, ret = 0;
 	unsigned long rate;
 
-	rcu_read_lock();
-
 	max_opps = dev_pm_opp_get_opp_count(dev);
-	if (max_opps <= 0) {
-		ret = max_opps ? max_opps : -ENODATA;
-		goto out;
-	}
+	if (max_opps <= 0)
+		return max_opps ? max_opps : -ENODATA;
 
 	freq_table = kcalloc((max_opps + 1), sizeof(*freq_table), GFP_ATOMIC);
-	if (!freq_table) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!freq_table)
+		return -ENOMEM;
 
 	for (i = 0, rate = 0; i < max_opps; i++, rate++) {
 		/* find next rate */
@@ -83,6 +72,8 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 		/* Is Boost/turbo opp ? */
 		if (dev_pm_opp_is_turbo(opp))
 			freq_table[i].flags = CPUFREQ_BOOST_FREQ;
+
+		dev_pm_opp_put(opp);
 	}
 
 	freq_table[i].driver_data = i;
@@ -91,7 +82,6 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
 	*table = &freq_table[0];
 
 out:
-	rcu_read_unlock();
 	if (ret)
 		kfree(freq_table);
 
diff --git a/drivers/clk/tegra/clk-dfll.c b/drivers/clk/tegra/clk-dfll.c
index f010562534eb..2c44aeb0b97c 100644
--- a/drivers/clk/tegra/clk-dfll.c
+++ b/drivers/clk/tegra/clk-dfll.c
@@ -633,16 +633,12 @@ static int find_lut_index_for_rate(struct tegra_dfll *td, unsigned long rate)
 	struct dev_pm_opp *opp;
 	int i, uv;
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_ceil(td->soc->dev, &rate);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
-	uv = dev_pm_opp_get_voltage(opp);
 
-	rcu_read_unlock();
+	uv = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 
 	for (i = 0; i < td->i2c_lut_size; i++) {
 		if (regulator_list_voltage(td->vdd_reg, td->i2c_lut[i]) == uv)
@@ -1440,8 +1436,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 	struct dev_pm_opp *opp;
 	int lut;
 
-	rcu_read_lock();
-
 	rate = ULONG_MAX;
 	opp = dev_pm_opp_find_freq_floor(td->soc->dev, &rate);
 	if (IS_ERR(opp)) {
@@ -1449,6 +1443,7 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		goto out;
 	}
 	v_max = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 
 	v = td->soc->cvb->min_millivolts * 1000;
 	lut = find_vdd_map_entry_exact(td, v);
@@ -1465,6 +1460,8 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		if (v_opp <= td->soc->cvb->min_millivolts * 1000)
 			td->dvco_rate_min = dev_pm_opp_get_freq(opp);
 
+		dev_pm_opp_put(opp);
+
 		for (;;) {
 			v += max(1, (v_max - v) / (MAX_DFLL_VOLTAGES - j));
 			if (v >= v_opp)
@@ -1496,8 +1493,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
 		ret = 0;
 
 out:
-	rcu_read_unlock();
-
 	return ret;
 }
 
diff --git a/drivers/cpufreq/exynos5440-cpufreq.c b/drivers/cpufreq/exynos5440-cpufreq.c
index c0f3373706f4..9180d34cc9fc 100644
--- a/drivers/cpufreq/exynos5440-cpufreq.c
+++ b/drivers/cpufreq/exynos5440-cpufreq.c
@@ -118,12 +118,10 @@ static int init_div_table(void)
 	unsigned int tmp, clk_div, ema_div, freq, volt_id;
 	struct dev_pm_opp *opp;
 
-	rcu_read_lock();
 	cpufreq_for_each_entry(pos, freq_tbl) {
 		opp = dev_pm_opp_find_freq_exact(dvfs_info->dev,
 					pos->frequency * 1000, true);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			dev_err(dvfs_info->dev,
 				"failed to find valid OPP for %u KHZ\n",
 				pos->frequency);
@@ -140,6 +138,7 @@ static int init_div_table(void)
 
 		/* Calculate EMA */
 		volt_id = dev_pm_opp_get_voltage(opp);
+
 		volt_id = (MAX_VOLTAGE - volt_id) / VOLTAGE_STEP;
 		if (volt_id < PMIC_HIGH_VOLT) {
 			ema_div = (CPUEMA_HIGH << P0_7_CPUEMA_SHIFT) |
@@ -157,9 +156,9 @@ static int init_div_table(void)
 
 		__raw_writel(tmp, dvfs_info->base + XMU_PMU_P0_7 + 4 *
 						(pos - freq_tbl));
+		dev_pm_opp_put(opp);
 	}
 
-	rcu_read_unlock();
 	return 0;
 }
 
diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
index ef1fa8145419..7719b02e04f5 100644
--- a/drivers/cpufreq/imx6q-cpufreq.c
+++ b/drivers/cpufreq/imx6q-cpufreq.c
@@ -53,16 +53,15 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
 	freq_hz = new_freq * 1000;
 	old_freq = clk_get_rate(arm_clk) / 1000;
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		dev_err(cpu_dev, "failed to find OPP for %ld\n", freq_hz);
 		return PTR_ERR(opp);
 	}
 
 	volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	volt_old = regulator_get_voltage(arm_reg);
 
 	dev_dbg(cpu_dev, "%u MHz, %ld mV --> %u MHz, %ld mV\n",
@@ -321,14 +320,15 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
 	 * freq_table initialised from OPP is therefore sorted in the
 	 * same order.
 	 */
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_exact(cpu_dev,
 				  freq_table[0].frequency * 1000, true);
 	min_volt = dev_pm_opp_get_voltage(opp);
+	dev_pm_opp_put(opp);
 	opp = dev_pm_opp_find_freq_exact(cpu_dev,
 				  freq_table[--num].frequency * 1000, true);
 	max_volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
+
 	ret = regulator_set_voltage_time(arm_reg, min_volt, max_volt);
 	if (ret > 0)
 		transition_latency += ret * 1000;
diff --git a/drivers/cpufreq/mt8173-cpufreq.c b/drivers/cpufreq/mt8173-cpufreq.c
index 643f43179df1..ab25b1235a5e 100644
--- a/drivers/cpufreq/mt8173-cpufreq.c
+++ b/drivers/cpufreq/mt8173-cpufreq.c
@@ -232,16 +232,14 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 
 	freq_hz = freq_table[index].frequency * 1000;
 
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("cpu%d: failed to find OPP for %ld\n",
 		       policy->cpu, freq_hz);
 		return PTR_ERR(opp);
 	}
 	vproc = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	/*
 	 * If the new voltage or the intermediate voltage is higher than the
@@ -411,16 +409,14 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
 
 	/* Search a safe voltage for intermediate frequency. */
 	rate = clk_get_rate(inter_clk);
-	rcu_read_lock();
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &rate);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		pr_err("failed to get intermediate opp for cpu%d\n", cpu);
 		ret = PTR_ERR(opp);
 		goto out_free_opp_table;
 	}
 	info->intermediate_voltage = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	info->cpu_dev = cpu_dev;
 	info->proc_reg = proc_reg;
diff --git a/drivers/cpufreq/omap-cpufreq.c b/drivers/cpufreq/omap-cpufreq.c
index 376e63ca94e8..71e81bbf031b 100644
--- a/drivers/cpufreq/omap-cpufreq.c
+++ b/drivers/cpufreq/omap-cpufreq.c
@@ -63,16 +63,14 @@ static int omap_target(struct cpufreq_policy *policy, unsigned int index)
 	freq = ret;
 
 	if (mpu_reg) {
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_ceil(mpu_dev, &freq);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			dev_err(mpu_dev, "%s: unable to find MPU OPP for %d\n",
 				__func__, new_freq);
 			return -EINVAL;
 		}
 		volt = dev_pm_opp_get_voltage(opp);
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 		tol = volt * OPP_TOLERANCE / 100;
 		volt_old = regulator_get_voltage(mpu_reg);
 	}
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index b0de42972b74..378f12a51496 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -111,18 +111,16 @@ static void devfreq_set_freq_table(struct devfreq *devfreq)
 		return;
 	}
 
-	rcu_read_lock();
 	for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
 		opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
 		if (IS_ERR(opp)) {
 			devm_kfree(devfreq->dev.parent, profile->freq_table);
 			profile->max_state = 0;
-			rcu_read_unlock();
 			return;
 		}
+		dev_pm_opp_put(opp);
 		profile->freq_table[i] = freq;
 	}
-	rcu_read_unlock();
 }
 
 /**
@@ -1107,17 +1105,16 @@ static ssize_t available_frequencies_show(struct device *d,
 	ssize_t count = 0;
 	unsigned long freq = 0;
 
-	rcu_read_lock();
 	do {
 		opp = dev_pm_opp_find_freq_ceil(dev, &freq);
 		if (IS_ERR(opp))
 			break;
 
+		dev_pm_opp_put(opp);
 		count += scnprintf(&buf[count], (PAGE_SIZE - count - 2),
 				   "%lu ", freq);
 		freq++;
 	} while (1);
-	rcu_read_unlock();
 
 	/* Truncate the trailing space */
 	if (count)
@@ -1219,11 +1216,8 @@ subsys_initcall(devfreq_init);
  * @freq:	The frequency given to target function
  * @flags:	Flags handed from devfreq framework.
  *
- * Locking: This function must be called under rcu_read_lock(). opp is a rcu
- * protected pointer. The reason for the same is that the opp pointer which is
- * returned will remain valid for use with opp_get_{voltage, freq} only while
- * under the locked area. The pointer returned must be used prior to unlocking
- * with rcu_read_unlock() to maintain the integrity of the pointer.
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
  */
 struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
 					   unsigned long *freq,
diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
index a8ed7792ece2..49ce38cef460 100644
--- a/drivers/devfreq/exynos-bus.c
+++ b/drivers/devfreq/exynos-bus.c
@@ -103,18 +103,17 @@ static int exynos_bus_target(struct device *dev, unsigned long *freq, u32 flags)
 	int ret = 0;
 
 	/* Get new opp-bus instance according to new bus clock */
-	rcu_read_lock();
 	new_opp = devfreq_recommended_opp(dev, freq, flags);
 	if (IS_ERR(new_opp)) {
 		dev_err(dev, "failed to get recommended opp instance\n");
-		rcu_read_unlock();
 		return PTR_ERR(new_opp);
 	}
 
 	new_freq = dev_pm_opp_get_freq(new_opp);
 	new_volt = dev_pm_opp_get_voltage(new_opp);
+	dev_pm_opp_put(new_opp);
+
 	old_freq = bus->curr_freq;
-	rcu_read_unlock();
 
 	if (old_freq == new_freq)
 		return 0;
@@ -214,17 +213,16 @@ static int exynos_bus_passive_target(struct device *dev, unsigned long *freq,
 	int ret = 0;
 
 	/* Get new opp-bus instance according to new bus clock */
-	rcu_read_lock();
 	new_opp = devfreq_recommended_opp(dev, freq, flags);
 	if (IS_ERR(new_opp)) {
 		dev_err(dev, "failed to get recommended opp instance\n");
-		rcu_read_unlock();
 		return PTR_ERR(new_opp);
 	}
 
 	new_freq = dev_pm_opp_get_freq(new_opp);
+	dev_pm_opp_put(new_opp);
+
 	old_freq = bus->curr_freq;
-	rcu_read_unlock();
 
 	if (old_freq == new_freq)
 		return 0;
@@ -358,16 +356,14 @@ static int exynos_bus_parse_of(struct device_node *np,
 
 	rate = clk_get_rate(bus->clk);
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &rate, 0);
 	if (IS_ERR(opp)) {
 		dev_err(dev, "failed to find dev_pm_opp\n");
-		rcu_read_unlock();
 		ret = PTR_ERR(opp);
 		goto err_opp;
 	}
 	bus->curr_freq = dev_pm_opp_get_freq(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	return 0;
 
diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
index 9ef46e2592c4..bd452236dba4 100644
--- a/drivers/devfreq/governor_passive.c
+++ b/drivers/devfreq/governor_passive.c
@@ -59,14 +59,14 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
 	 * list of parent device. Because in this case, *freq is temporary
 	 * value which is decided by ondemand governor.
 	 */
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(parent_devfreq->dev.parent, freq, 0);
-	rcu_read_unlock();
 	if (IS_ERR(opp)) {
 		ret = PTR_ERR(opp);
 		goto out;
 	}
 
+	dev_pm_opp_put(opp);
+
 	/*
 	 * Get the OPP table's index of decided freqeuncy by governor
 	 * of parent device.
diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
index 27d2f349b53c..40a2499730fc 100644
--- a/drivers/devfreq/rk3399_dmc.c
+++ b/drivers/devfreq/rk3399_dmc.c
@@ -91,17 +91,13 @@ static int rk3399_dmcfreq_target(struct device *dev, unsigned long *freq,
 	unsigned long target_volt, target_rate;
 	int err;
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, freq, flags);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
 
 	target_rate = dev_pm_opp_get_freq(opp);
 	target_volt = dev_pm_opp_get_voltage(opp);
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (dmcfreq->rate == target_rate)
 		return 0;
@@ -422,15 +418,13 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev)
 
 	data->rate = clk_get_rate(data->dmc_clk);
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &data->rate, 0);
-	if (IS_ERR(opp)) {
-		rcu_read_unlock();
+	if (IS_ERR(opp))
 		return PTR_ERR(opp);
-	}
+
 	data->rate = dev_pm_opp_get_freq(opp);
 	data->volt = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	rk3399_devfreq_dmc_profile.initial_freq = data->rate;
 
diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
index fe9dce0245bf..214fff96fa4a 100644
--- a/drivers/devfreq/tegra-devfreq.c
+++ b/drivers/devfreq/tegra-devfreq.c
@@ -487,15 +487,13 @@ static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
 	struct dev_pm_opp *opp;
 	unsigned long rate = *freq * KHZ;
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &rate, flags);
 	if (IS_ERR(opp)) {
-		rcu_read_unlock();
 		dev_err(dev, "Failed to find opp for %lu KHz\n", *freq);
 		return PTR_ERR(opp);
 	}
 	rate = dev_pm_opp_get_freq(opp);
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	clk_set_min_rate(tegra->emc_clock, rate);
 	clk_set_rate(tegra->emc_clock, 0);
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 9ce0e9eef923..85fdbf762fa0 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -297,8 +297,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 	if (!power_table)
 		return -ENOMEM;
 
-	rcu_read_lock();
-
 	for (freq = 0, i = 0;
 	     opp = dev_pm_opp_find_freq_ceil(dev, &freq), !IS_ERR(opp);
 	     freq++, i++) {
@@ -306,13 +304,13 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 		u64 power;
 
 		if (i >= num_opps) {
-			rcu_read_unlock();
 			ret = -EAGAIN;
 			goto free_power_table;
 		}
 
 		freq_mhz = freq / 1000000;
 		voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
+		dev_pm_opp_put(opp);
 
 		/*
 		 * Do the multiplication with MHz and millivolt so as
@@ -328,8 +326,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
 		power_table[i].power = power;
 	}
 
-	rcu_read_unlock();
-
 	if (i != num_opps) {
 		ret = PTR_ERR(opp);
 		goto free_power_table;
@@ -433,13 +429,10 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_device,
 		return 0;
 	}
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_exact(cpufreq_device->cpu_dev, freq_hz,
 					 true);
 	voltage = dev_pm_opp_get_voltage(opp);
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (voltage == 0) {
 		dev_warn_ratelimited(cpufreq_device->cpu_dev,
diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
index 81631b110e17..abe8ad76bd8b 100644
--- a/drivers/thermal/devfreq_cooling.c
+++ b/drivers/thermal/devfreq_cooling.c
@@ -113,15 +113,15 @@ static int partition_enable_opps(struct devfreq_cooling_device *dfc,
 		unsigned int freq = dfc->freq_table[i];
 		bool want_enable = i >= cdev_state ? true : false;
 
-		rcu_read_lock();
 		opp = dev_pm_opp_find_freq_exact(dev, freq, !want_enable);
-		rcu_read_unlock();
 
 		if (PTR_ERR(opp) == -ERANGE)
 			continue;
 		else if (IS_ERR(opp))
 			return PTR_ERR(opp);
 
+		dev_pm_opp_put(opp);
+
 		if (want_enable)
 			ret = dev_pm_opp_enable(dev, freq);
 		else
@@ -221,15 +221,12 @@ get_static_power(struct devfreq_cooling_device *dfc, unsigned long freq)
 	if (!dfc->power_ops->get_static_power)
 		return 0;
 
-	rcu_read_lock();
-
 	opp = dev_pm_opp_find_freq_exact(dev, freq, true);
 	if (IS_ERR(opp) && (PTR_ERR(opp) == -ERANGE))
 		opp = dev_pm_opp_find_freq_exact(dev, freq, false);
 
 	voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
-
-	rcu_read_unlock();
+	dev_pm_opp_put(opp);
 
 	if (voltage == 0) {
 		dev_warn_ratelimited(dev,
@@ -411,18 +408,14 @@ static int devfreq_cooling_gen_tables(struct devfreq_cooling_device *dfc)
 		unsigned long power_dyn, voltage;
 		struct dev_pm_opp *opp;
 
-		rcu_read_lock();
-
 		opp = dev_pm_opp_find_freq_floor(dev, &freq);
 		if (IS_ERR(opp)) {
-			rcu_read_unlock();
 			ret = PTR_ERR(opp);
 			goto free_tables;
 		}
 
 		voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
-
-		rcu_read_unlock();
+		dev_pm_opp_put(opp);
 
 		if (dfc->power_ops) {
 			power_dyn = get_dynamic_power(dfc, freq, voltage);

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH 12/12] PM / OPP: Update Documentation to remove RCU specific bits
  2016-12-07 10:37 ` [PATCH 12/12] PM / OPP: Update Documentation to remove RCU specific bits Viresh Kumar
@ 2017-01-09 22:39   ` Stephen Boyd
  2017-01-10  4:39     ` Viresh Kumar
  0 siblings, 1 reply; 41+ messages in thread
From: Stephen Boyd @ 2017-01-09 22:39 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Len Brown,
	Pavel Machek, linaro-kernel, linux-pm, linux-kernel,
	Vincent Guittot

On 12/07, Viresh Kumar wrote:
> @@ -137,15 +121,18 @@ functions return the matching pointer representing the opp if a match is
>  found, else returns error. These errors are expected to be handled by standard
>  error checks such as IS_ERR() and appropriate actions taken by the caller.
>  
> +Callers of these functions shall call dev_pm_opp_put() after they have used the
> +OPP. Otherwise the memory for the OPP will never get freed and result in
> +memleak.
> +
>  dev_pm_opp_find_freq_exact - Search for an OPP based on an *exact* frequency and
>  	availability. This function is especially useful to enable an OPP which
>  	is not available by default.
>  	Example: In a case when SoC framework detects a situation where a
>  	higher frequency could be made available, it can use this function to
>  	find the OPP prior to call the dev_pm_opp_enable to actually make it available.
> -	 rcu_read_lock();
>  	 opp = dev_pm_opp_find_freq_exact(dev, 1000000000, false);
> -	 rcu_read_unlock();
> +	 dev_pm_opp_put(opp);
>  	 /* dont operate on the pointer.. just do a sanity check.. */
>  	 if (IS_ERR(opp)) {
>  		pr_err("frequency not disabled!\n");
> @@ -163,9 +150,8 @@ dev_pm_opp_find_freq_floor - Search for an available OPP which is *at most* the
>  	frequency.
>  	Example: To find the highest opp for a device:
>  	 freq = ULONG_MAX;
> -	 rcu_read_lock();
>  	 dev_pm_opp_find_freq_floor(dev, &freq);
> -	 rcu_read_unlock();
> +	 dev_pm_opp_put(opp);

opp doesn't exist in the scope here. Missing an assignment during
the dev_pm_opp_find_freq_floor() call?
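
For illustration, a minimal userspace sketch of what the documentation
example presumably intends: keep the pointer returned by the floor lookup,
check it, then put it. The mock_* names and highest_freq() are hypothetical
and only imitate the kernel API; the ERR_PTR helpers are simplified
stand-ins, and an int counter stands in for the kref.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mock OPP list, sorted by ascending rate. */
struct mock_opp {
	unsigned long rate;	/* Hz */
	int refcount;		/* starts at 1: the table's own reference */
};

static struct mock_opp mock_opps[] = {
	{ 300000000UL, 1 },
	{ 600000000UL, 1 },
	{ 900000000UL, 1 },
};

/* Returns the highest OPP with rate <= *freq, holding an extra reference. */
static struct mock_opp *mock_find_freq_floor(unsigned long *freq)
{
	struct mock_opp *opp = ERR_PTR(-ERANGE);
	size_t i;

	for (i = 0; i < sizeof(mock_opps) / sizeof(mock_opps[0]); i++) {
		if (mock_opps[i].rate > *freq)
			break;
		opp = &mock_opps[i];
	}

	if (!IS_ERR(opp)) {
		opp->refcount++;
		*freq = opp->rate;
	}
	return opp;
}

static void mock_opp_put(struct mock_opp *opp)
{
	assert(!IS_ERR(opp));
	opp->refcount--;
}

/* Find the highest frequency: the return value must be assigned. */
static unsigned long highest_freq(void)
{
	unsigned long freq = ULONG_MAX;
	struct mock_opp *opp;

	opp = mock_find_freq_floor(&freq);	/* keep the returned pointer */
	if (IS_ERR(opp))
		return 0;

	mock_opp_put(opp);			/* opp is now in scope */
	return freq;
}
```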

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 11/12] PM / OPP: Simplify dev_pm_opp_get_max_volt_latency()
  2016-12-07 10:37 ` [PATCH 11/12] PM / OPP: Simplify dev_pm_opp_get_max_volt_latency() Viresh Kumar
@ 2017-01-09 22:40   ` Stephen Boyd
  0 siblings, 0 replies; 41+ messages in thread
From: Stephen Boyd @ 2017-01-09 22:40 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 12/07, Viresh Kumar wrote:
> dev_pm_opp_get_max_volt_latency() calls _find_opp_table() two times
> effectively.
> 
> Merge _get_regulator_count() into dev_pm_opp_get_max_volt_latency() to
> avoid that.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 01/12] PM / OPP: Add per OPP table mutex
  2016-12-07 10:37 ` [PATCH 01/12] PM / OPP: Add per OPP table mutex Viresh Kumar
@ 2017-01-09 23:11   ` Stephen Boyd
  0 siblings, 0 replies; 41+ messages in thread
From: Stephen Boyd @ 2017-01-09 23:11 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 12/07, Viresh Kumar wrote:
> Add per OPP table lock to protect opp_table->opp_list.
> 
> Note that at few places opp_list is used under the rcu_read_lock() and
> so a mutex can't be added there for now. This will be fixed by a later
> patch.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 02/12] PM / OPP: Add 'struct kref' to OPP table
  2016-12-07 10:37 ` [PATCH 02/12] PM / OPP: Add 'struct kref' to OPP table Viresh Kumar
@ 2017-01-09 23:36   ` Stephen Boyd
  2017-01-10  4:23     ` Viresh Kumar
  0 siblings, 1 reply; 41+ messages in thread
From: Stephen Boyd @ 2017-01-09 23:36 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 12/07, Viresh Kumar wrote:
> @@ -894,8 +895,36 @@ static void _kfree_device_rcu(struct rcu_head *head)
>  	kfree_rcu(opp_table, rcu_head);
>  }
>  
> -static void _free_opp_table(struct opp_table *opp_table)
> +void _get_opp_table_kref(struct opp_table *opp_table)
>  {
> +	kref_get(&opp_table->kref);
> +}
> +
> +struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
> +{
> +	struct opp_table *opp_table;
> +
> +	/* Hold our table modification lock here */
> +	mutex_lock(&opp_table_lock);
> +
> +	opp_table = _find_opp_table(dev);
> +	if (!IS_ERR(opp_table)) {
> +		_get_opp_table_kref(opp_table);

It seems odd to have _get_opp_table_kref() take a pointer to
increment a kref on. It would be better to have _find_opp_table()
return the pointer with the reference already taken so that we
don't have to update callers with reference grabbing calls.
Typically if a function returns a reference counted pointer the
reference counting has already been done.
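
To make that convention concrete, here is a minimal userspace sketch, not the OPP core itself. The lookup bumps the count under the list lock before returning, so callers never need a separate get call. All names (`struct table`, `table_register()`, `table_find_get()`, `table_put()`) are hypothetical, and a plain int guarded by a pthread mutex stands in for struct kref.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct opp_table; refs plays the role of a kref. */
struct table {
	int dev_id;
	int refs;
	struct table *next;
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct table *tables;

/* Register a table; the list itself owns the initial reference. */
static void table_register(struct table *t, int dev_id)
{
	pthread_mutex_lock(&table_lock);
	t->dev_id = dev_id;
	t->refs = 1;
	t->next = tables;
	tables = t;
	pthread_mutex_unlock(&table_lock);
}

/* Look up a table and return it with a reference already taken, or NULL. */
static struct table *table_find_get(int dev_id)
{
	struct table *t;

	pthread_mutex_lock(&table_lock);
	for (t = tables; t; t = t->next) {
		if (t->dev_id == dev_id) {
			t->refs++;	/* taken under the lock, before returning */
			break;
		}
	}
	pthread_mutex_unlock(&table_lock);
	return t;
}

/* Drop a reference; returns the remaining count. */
static int table_put(struct table *t)
{
	int left;

	pthread_mutex_lock(&table_lock);
	left = --t->refs;
	pthread_mutex_unlock(&table_lock);
	return left;
}
```

With this shape, every successful find is paired with exactly one put at the call site, and no caller ever touches the count directly.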

> +		goto unlock;
> +	}
> +
> +	opp_table = _allocate_opp_table(dev);
> +
> +unlock:
> +	mutex_unlock(&opp_table_lock);
> +
> +	return opp_table;
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_table);
> +
> +static void _opp_table_kref_release_unlocked(struct kref *kref)
> +{
> +	struct opp_table *opp_table = container_of(kref, struct opp_table, kref);
>  	struct opp_device *opp_dev;
>  
>  	/* Release clk */

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 03/12] PM / OPP: Return opp_table from dev_pm_opp_set_*() routines
  2016-12-07 10:37 ` [PATCH 03/12] PM / OPP: Return opp_table from dev_pm_opp_set_*() routines Viresh Kumar
@ 2017-01-09 23:37   ` Stephen Boyd
  0 siblings, 0 replies; 41+ messages in thread
From: Stephen Boyd @ 2017-01-09 23:37 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Patrice Chotard,
	linaro-kernel, linux-pm, linux-kernel, Vincent Guittot

On 12/07, Viresh Kumar wrote:
> Now that we have proper kernel reference infrastructure in place for OPP
> tables, use it to guarantee that the OPP table isn't freed while being
> used by the callers of dev_pm_opp_set_*() APIs.
> 
> Make them all return the pointer to the OPP table after taking its
> reference and put the reference back with dev_pm_opp_put_*() APIs.
> 
> Now that the OPP table wouldn't get freed while these routines are
> executing after dev_pm_opp_get_opp_table() is called, there is no need
> to take opp_table_lock. Drop them as well.
> 
> Remove the rcu specific comments from these routines as they aren't
> relevant anymore.
> 
> Note that prototypes of dev_pm_opp_{set|put}_regulators() were already
> updated by another patch.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 04/12] PM / OPP: Take reference of the OPP table while adding/removing OPPs
  2016-12-07 10:37 ` [PATCH 04/12] PM / OPP: Take reference of the OPP table while adding/removing OPPs Viresh Kumar
@ 2017-01-09 23:38   ` Stephen Boyd
  0 siblings, 0 replies; 41+ messages in thread
From: Stephen Boyd @ 2017-01-09 23:38 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 12/07, Viresh Kumar wrote:
> Take reference of the OPP table while adding and removing OPPs, that
> helps us remove special checks in _remove_opp_table().
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 05/12] PM / OPP: Use dev_pm_opp_get_opp_table() instead of _add_opp_table()
  2016-12-07 10:37 ` [PATCH 05/12] PM / OPP: Use dev_pm_opp_get_opp_table() instead of _add_opp_table() Viresh Kumar
@ 2017-01-09 23:43   ` Stephen Boyd
  0 siblings, 0 replies; 41+ messages in thread
From: Stephen Boyd @ 2017-01-09 23:43 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 12/07, Viresh Kumar wrote:
> Migrate all users of _add_opp_table() to use dev_pm_opp_get_opp_table()
> to guarantee that the OPP table doesn't get freed while being used.
> 
> Also update _managed_opp() to get the reference to the OPP table.
> 
> Now that the OPP table wouldn't get freed while these routines are
> executing after dev_pm_opp_get_opp_table() is called, there is no need
> to take opp_table_lock. Drop them as well.
> 
> Now that _add_opp_table(), _remove_opp_table() and the unlocked release
> routines aren't used anymore, remove them.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 06/12] PM / OPP: Add 'struct kref' to struct dev_pm_opp
  2016-12-07 10:37 ` [PATCH 06/12] PM / OPP: Add 'struct kref' to struct dev_pm_opp Viresh Kumar
@ 2017-01-09 23:44   ` Stephen Boyd
  2017-01-10  4:26     ` Viresh Kumar
  0 siblings, 1 reply; 41+ messages in thread
From: Stephen Boyd @ 2017-01-09 23:44 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 12/07, Viresh Kumar wrote:
> Add kref to struct dev_pm_opp for easier accounting of the OPPs.
> 
> Note that the OPPs are freed under the opp_table->lock mutex only.

I'm lost. Why add another level of krefs?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 08/12] PM / OPP: Take kref from _find_opp_table()
  2016-12-07 10:37 ` [PATCH 08/12] PM / OPP: Take kref from _find_opp_table() Viresh Kumar
@ 2017-01-09 23:49   ` Stephen Boyd
  0 siblings, 0 replies; 41+ messages in thread
From: Stephen Boyd @ 2017-01-09 23:49 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 12/07, Viresh Kumar wrote:
> Take reference of the OPP table from within _find_opp_table(). Also
> update the callers of _find_opp_table() to call
> dev_pm_opp_put_opp_table() after they have used the OPP table.
> 
> Note that _find_opp_table() increments the reference under the
> opp_table_lock.
> 
> Now that the OPP table wouldn't get freed until the callers of
> _find_opp_table() call dev_pm_opp_put_opp_table(), there is no need to
> take the opp_table_lock or rcu_read_lock() around it. Drop them.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 09/12] PM / OPP: Move away from RCU locking
  2016-12-07 10:37 ` [PATCH 09/12] PM / OPP: Move away from RCU locking Viresh Kumar
@ 2017-01-09 23:57   ` Stephen Boyd
  2017-01-10  4:28     ` Viresh Kumar
  0 siblings, 1 reply; 41+ messages in thread
From: Stephen Boyd @ 2017-01-09 23:57 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 12/07, Viresh Kumar wrote:
> The RCU locking isn't well suited for the OPP core. RCU locking fits
> better for reader-heavy code, while the OPP core has at most one or two
> readers at a time.
> 
> On top of that, the way RCU locking was used within the OPP core was
> getting very confusing. The individual OPPs are mostly well handled, i.e.
> for an update a new structure was created and then that replaced the older
> one. But the OPP tables were updated directly all the time from various
> parts of the core. Though they were mostly used from within RCU locked
> regions, they didn't have much to do with RCU and were governed by the
> mutex instead.
> 
> And that mixed with the 'opp_table_lock' has made the core even more
> confusing.
> 
> Now that we are already managing the OPPs and the OPP tables with kernel
> reference infrastructure, we can get rid of RCU locking completely and
> simplify the code a lot.
> 
> Remove all RCU references from code and comments.
> 
> Acquire opp_table->lock while parsing the list of OPPs though.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

> diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
> index 2b689fc73596..b5e9600058c2 100644
> --- a/drivers/base/power/opp/core.c
> +++ b/drivers/base/power/opp/core.c
> @@ -133,19 +117,12 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
>   */
>  unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
>  {
> -	struct dev_pm_opp *tmp_opp;
> -	unsigned long f = 0;
> -
> -	rcu_read_lock();
> -
> -	tmp_opp = rcu_dereference(opp);
> -	if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available)
> +	if (IS_ERR_OR_NULL(opp) || !opp->available) {

I suppose this is one thing RCU was being used for, marking OPPs
available and then having these "getter" APIs fail if the OPPs go
away. But that was never right because the OPP could have been
made unavailable after this function returned and things still
wouldn't work.

>  		pr_err("%s: Invalid parameters\n", __func__);
> -	else
> -		f = tmp_opp->rate;
> +		return 0;
> +	}
>  
> -	rcu_read_unlock();
> -	return f;
> +	return opp->rate;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
>  

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 10/12] PM / OPP: Simplify _opp_set_availability()
  2016-12-07 10:37 ` [PATCH 10/12] PM / OPP: Simplify _opp_set_availability() Viresh Kumar
@ 2017-01-10  0:00   ` Stephen Boyd
  0 siblings, 0 replies; 41+ messages in thread
From: Stephen Boyd @ 2017-01-10  0:00 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 12/07, Viresh Kumar wrote:
> As we don't use RCU locking anymore, there is no need to replace an
> earlier OPP node with a new one. Just update the existing one.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 02/12] PM / OPP: Add 'struct kref' to OPP table
  2017-01-09 23:36   ` Stephen Boyd
@ 2017-01-10  4:23     ` Viresh Kumar
  2017-01-13  8:54       ` Stephen Boyd
  0 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2017-01-10  4:23 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 09-01-17, 15:36, Stephen Boyd wrote:
> On 12/07, Viresh Kumar wrote:
> > @@ -894,8 +895,36 @@ static void _kfree_device_rcu(struct rcu_head *head)
> >  	kfree_rcu(opp_table, rcu_head);
> >  }
> >  
> > -static void _free_opp_table(struct opp_table *opp_table)
> > +void _get_opp_table_kref(struct opp_table *opp_table)
> >  {
> > +	kref_get(&opp_table->kref);
> > +}
> > +
> > +struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
> > +{
> > +	struct opp_table *opp_table;
> > +
> > +	/* Hold our table modification lock here */
> > +	mutex_lock(&opp_table_lock);
> > +
> > +	opp_table = _find_opp_table(dev);
> > +	if (!IS_ERR(opp_table)) {
> > +		_get_opp_table_kref(opp_table);
> 
> It seems odd to have _get_opp_table_kref() take a pointer to
> increment a kref on.

This function is provided for better readability and passing opp_table to it is
the only option I had :)

> It would be better to have _find_opp_table()
> return the pointer with the reference already taken so that we
> don't have to update callers with reference grabbing calls.
> Typically if a function returns a reference counted pointer the
> reference counting has already been done.

Absolutely, but that happens with later patches in the series. I couldn't have
done it now, as something or the other would have broken.

-- 
viresh

* Re: [PATCH 06/12] PM / OPP: Add 'struct kref' to struct dev_pm_opp
  2017-01-09 23:44   ` Stephen Boyd
@ 2017-01-10  4:26     ` Viresh Kumar
  2017-01-13  8:52       ` Stephen Boyd
  0 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2017-01-10  4:26 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 09-01-17, 15:44, Stephen Boyd wrote:
> On 12/07, Viresh Kumar wrote:
> > Add kref to struct dev_pm_opp for easier accounting of the OPPs.
> > 
> > Note that the OPPs are freed under the opp_table->lock mutex only.
> 
> I'm lost. Why add another level of krefs?

Heh. The earlier krefs were for the OPP table itself, so that it gets freed once
there are no more users of it.

The kref introduced now is for individual OPPs, so that they don't disappear
while being used and get freed once all users are done.

Also note that the OPP table will get freed only after all the OPPs are freed
and there are no more users left, like platform code which might have set the
supported-hw property.
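
A rough userspace sketch of the two levels, assuming each OPP pins its table for as long as the OPP exists. The names (`table_alloc()`, `table_put()`, `opp_add()`, `opp_put()`) are made up for illustration, plain ints stand in for struct kref, and locking is left out, so this is single-threaded only:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; plain ints play the role of struct kref. */
struct table {
	int refs;
	int *freed;	/* set to 1 when the table is released (demo only) */
};

struct opp {
	int refs;
	struct table *table;
};

static struct table *table_alloc(int *freed_flag)
{
	struct table *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->refs = 1;		/* held by the creator, e.g. platform code */
	t->freed = freed_flag;
	*freed_flag = 0;
	return t;
}

static void table_put(struct table *t)
{
	if (--t->refs == 0) {
		*t->freed = 1;
		free(t);
	}
}

/* Each OPP takes a table reference when it is added. */
static struct opp *opp_add(struct table *t)
{
	struct opp *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	o->refs = 1;
	o->table = t;
	t->refs++;
	return o;
}

static void opp_put(struct opp *o)
{
	if (--o->refs == 0) {
		struct table *t = o->table;

		free(o);
		table_put(t);	/* last OPP user gone: drop its table reference */
	}
}
```

In this sketch the table outlives the creator's put as long as any OPP is still around, and goes away with the last OPP.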

-- 
viresh

* Re: [PATCH 09/12] PM / OPP: Move away from RCU locking
  2017-01-09 23:57   ` Stephen Boyd
@ 2017-01-10  4:28     ` Viresh Kumar
  0 siblings, 0 replies; 41+ messages in thread
From: Viresh Kumar @ 2017-01-10  4:28 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 09-01-17, 15:57, Stephen Boyd wrote:
> I suppose this is one thing RCU was being used for, marking OPPs
> available and then having these "getter" APIs fail if the OPPs go
> away. But that was never right because the OPP could have been
> made unavailable after this function returned and things still
> wouldn't work.

Right, it was all so confusing with RCUs in OPP library :)

-- 
viresh

* Re: [PATCH 12/12] PM / OPP: Update Documentation to remove RCU specific bits
  2017-01-09 22:39   ` Stephen Boyd
@ 2017-01-10  4:39     ` Viresh Kumar
  2017-01-13  8:44       ` Stephen Boyd
  0 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2017-01-10  4:39 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Len Brown,
	Pavel Machek, linaro-kernel, linux-pm, linux-kernel,
	Vincent Guittot

On 09-01-17, 14:39, Stephen Boyd wrote:
> On 12/07, Viresh Kumar wrote:
> > @@ -137,15 +121,18 @@ functions return the matching pointer representing the opp if a match is
> >  found, else returns error. These errors are expected to be handled by standard
> >  error checks such as IS_ERR() and appropriate actions taken by the caller.
> >  
> > +Callers of these functions shall call dev_pm_opp_put() after they have used the
> > +OPP. Otherwise the memory for the OPP will never get freed and result in
> > +memleak.
> > +
> >  dev_pm_opp_find_freq_exact - Search for an OPP based on an *exact* frequency and
> >  	availability. This function is especially useful to enable an OPP which
> >  	is not available by default.
> >  	Example: In a case when SoC framework detects a situation where a
> >  	higher frequency could be made available, it can use this function to
> >  	find the OPP prior to call the dev_pm_opp_enable to actually make it available.
> > -	 rcu_read_lock();
> >  	 opp = dev_pm_opp_find_freq_exact(dev, 1000000000, false);
> > -	 rcu_read_unlock();
> > +	 dev_pm_opp_put(opp);
> >  	 /* dont operate on the pointer.. just do a sanity check.. */
> >  	 if (IS_ERR(opp)) {
> >  		pr_err("frequency not disabled!\n");
> > @@ -163,9 +150,8 @@ dev_pm_opp_find_freq_floor - Search for an available OPP which is *at most* the
> >  	frequency.
> >  	Example: To find the highest opp for a device:
> >  	 freq = ULONG_MAX;
> > -	 rcu_read_lock();
> >  	 dev_pm_opp_find_freq_floor(dev, &freq);
> > -	 rcu_read_unlock();
> > +	 dev_pm_opp_put(opp);
> 
> opp doesn't exist in the scope here. Missing an assignment during
> the dev_pm_opp_find_freq_floor() call?

Thanks for noticing this. Following is the diff I am adding to this patch:

diff --git a/Documentation/power/opp.txt b/Documentation/power/opp.txt
index be895e32022d..0c007e250cd1 100644
--- a/Documentation/power/opp.txt
+++ b/Documentation/power/opp.txt
@@ -150,7 +150,7 @@ dev_pm_opp_find_freq_floor - Search for an available OPP which is *at most* the
        frequency.
        Example: To find the highest opp for a device:
         freq = ULONG_MAX;
-        dev_pm_opp_find_freq_floor(dev, &freq);
+        opp = dev_pm_opp_find_freq_floor(dev, &freq);
         dev_pm_opp_put(opp);
 
 dev_pm_opp_find_freq_ceil - Search for an available OPP which is *at least* the
@@ -159,7 +159,7 @@ dev_pm_opp_find_freq_ceil - Search for an available OPP which is *at least* the
        frequency.
        Example 1: To find the lowest opp for a device:
         freq = 0;
-        dev_pm_opp_find_freq_ceil(dev, &freq);
+        opp = dev_pm_opp_find_freq_ceil(dev, &freq);
         dev_pm_opp_put(opp);
        Example 2: A simplified implementation of a SoC cpufreq_driver->target:
         soc_cpufreq_target(..)
@@ -252,6 +252,7 @@ dev_pm_opp_get_freq - Retrieve the freq represented by the opp pointer.
                 if (!IS_ERR(max_opp) && !IS_ERR(requested_opp))
                        r = soc_test_validity(max_opp, requested_opp);
                 dev_pm_opp_put(max_opp);
+                dev_pm_opp_put(requested_opp);
                /* do other things */
         }
         soc_test_validity(..)


Please add your RBY if it looks fine to you now.

-- 
viresh

* Re: [PATCH 12/12] PM / OPP: Update Documentation to remove RCU specific bits
  2017-01-10  4:39     ` Viresh Kumar
@ 2017-01-13  8:44       ` Stephen Boyd
  0 siblings, 0 replies; 41+ messages in thread
From: Stephen Boyd @ 2017-01-13  8:44 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, Len Brown,
	Pavel Machek, linaro-kernel, linux-pm, linux-kernel,
	Vincent Guittot

On 01/10, Viresh Kumar wrote:
> @@ -252,6 +252,7 @@ dev_pm_opp_get_freq - Retrieve the freq represented by the opp pointer.
>                  if (!IS_ERR(max_opp) && !IS_ERR(requested_opp))
>                         r = soc_test_validity(max_opp, requested_opp);
>                  dev_pm_opp_put(max_opp);
> +                dev_pm_opp_put(requested_opp);
>                 /* do other things */
>          }
>          soc_test_validity(..)
> 
> 
> Please add your RBY if it looks fine to you now.
> 

Sure.

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 06/12] PM / OPP: Add 'struct kref' to struct dev_pm_opp
  2017-01-10  4:26     ` Viresh Kumar
@ 2017-01-13  8:52       ` Stephen Boyd
  2017-01-13  8:56         ` Viresh Kumar
  0 siblings, 1 reply; 41+ messages in thread
From: Stephen Boyd @ 2017-01-13  8:52 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 01/10, Viresh Kumar wrote:
> On 09-01-17, 15:44, Stephen Boyd wrote:
> > On 12/07, Viresh Kumar wrote:
> > > Add kref to struct dev_pm_opp for easier accounting of the OPPs.
> > > 
> > > Note that the OPPs are freed under the opp_table->lock mutex only.
> > 
> > I'm lost. Why add another level of krefs?
> 
> Heh. The earlier krefs were for the OPP table itself, so that it gets freed once
> there are no more users of it.
> 
> The kref introduced now is for individual OPPs, so that they don't disappear
> while being used and get freed once all users are done.
> 
> Also note that the OPP table will get freed only after all the OPPs are freed
> and there are no more users left, like platform code which might have set the
> supported-hw property.
> 

What still doesn't make sense is how an individual OPP could go
away without the table that the OPP lives in also going away. If
an OPP is going away while a driver has a reference to it, then
the driver using that OPP should probably not be using it. TL;DR
letting drivers use OPP pointers outside of the OPP core feels
racy.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 02/12] PM / OPP: Add 'struct kref' to OPP table
  2017-01-10  4:23     ` Viresh Kumar
@ 2017-01-13  8:54       ` Stephen Boyd
  0 siblings, 0 replies; 41+ messages in thread
From: Stephen Boyd @ 2017-01-13  8:54 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 01/10, Viresh Kumar wrote:
> On 09-01-17, 15:36, Stephen Boyd wrote:
> 
> > It would be better to have _find_opp_table()
> > return the pointer with the reference already taken so that we
> > don't have to update callers with reference grabbing calls.
> > Typically if a function returns a reference counted pointer the
> > reference counting has already been done.
> 
> Absolutely, but that happens with later patches in the series. I couldn't have
> done it now, as something or the other would have broken.
> 

Ok, if things get better later in the series then you can have my

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 06/12] PM / OPP: Add 'struct kref' to struct dev_pm_opp
  2017-01-13  8:52       ` Stephen Boyd
@ 2017-01-13  8:56         ` Viresh Kumar
  2017-01-19 20:01           ` Stephen Boyd
  0 siblings, 1 reply; 41+ messages in thread
From: Viresh Kumar @ 2017-01-13  8:56 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 13-01-17, 00:52, Stephen Boyd wrote:
> What still doesn't make sense is how an individual OPP could go
> away without the table that the OPP lives in also going away.

dev_pm_opp_remove() is one such option, which can remove OPPs
individually. Beyond that, while removing tables we remove all the OPPs
one by one. So that really does happen.
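
As a simplified userspace model of that case (hypothetical names, a plain int instead of struct kref, no locking): removing an OPP unlinks it and drops the table's reference, but the object stays valid for a driver that still holds its own reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct dev_pm_opp. */
struct opp {
	unsigned long rate;
	int refs;
	bool listed;	/* still reachable through the table's list */
	bool freed;	/* demo flag instead of actually freeing memory */
};

static void opp_init(struct opp *o, unsigned long rate)
{
	o->rate = rate;
	o->refs = 1;	/* the table's own reference */
	o->listed = true;
	o->freed = false;
}

static void opp_get(struct opp *o)
{
	o->refs++;
}

static void opp_put(struct opp *o)
{
	if (--o->refs == 0)
		o->freed = true;	/* real code would free the memory here */
}

/* Unlink the OPP so new lookups miss it, then drop the table's reference. */
static void opp_remove(struct opp *o)
{
	o->listed = false;
	opp_put(o);
}
```

A driver that took a reference before the removal can keep reading the fields until its own put, which is the guarantee described above; whether the values it reads are still meaningful is a separate question.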

> If
> an OPP is going away while a driver has a reference to it, then
> the driver using that OPP should probably not be using it.

That is being protected with this patch now and the drivers can use
them freely.

> TL;DR
> letting drivers use OPP pointers outside of the OPP core feels
> racy.

Hmm, we don't update the OPP a lot after creating it today. But that's
a different problem to solve, if we really see a race there.

-- 
viresh

* Re: [PATCH 06/12] PM / OPP: Add 'struct kref' to struct dev_pm_opp
  2017-01-13  8:56         ` Viresh Kumar
@ 2017-01-19 20:01           ` Stephen Boyd
  0 siblings, 0 replies; 41+ messages in thread
From: Stephen Boyd @ 2017-01-19 20:01 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Viresh Kumar, Nishanth Menon, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot

On 01/13, Viresh Kumar wrote:
> On 13-01-17, 00:52, Stephen Boyd wrote:
> > What still doesn't make sense is how an individual OPP could go
> > away without the table that the OPP lives in also going away.
> 
> dev_pm_opp_remove() is one such option, which can remove OPPs
> individually. Beyond that, while removing tables we remove all the OPPs
> one by one. So that really does happen.
> 
> > If
> > an OPP is going away while a driver has a reference to it, then
> > the driver using that OPP should probably not be using it.
> 
> That is being protected with this patch now and the drivers can use
> them freely.
> 
> > TL;DR
> > letting drivers use OPP pointers outside of the OPP core feels
> > racy.
> 
> Hmm, we don't update the OPP a lot after creating it today. But that's
> a different problem to solve, if we really see a race there.
> 

Ok. We still have work to do to fix the race between drivers
using dev_pm_opp pointers and other drivers updating the data
those pointers point to like voltage, enable/disable, etc. This
isn't making anything worse than it already is though, so:

Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH 07/12] PM / OPP: Update OPP users to put reference
  2016-12-08  4:00       ` Viresh Kumar
@ 2017-01-21  7:42         ` Chanwoo Choi
  -1 siblings, 0 replies; 41+ messages in thread
From: Chanwoo Choi @ 2017-01-21  7:42 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Chanwoo Choi, Nishanth Menon, Prashant Gaikwad, Tony Lindgren,
	Stephen Boyd, Thierry Reding, Javi Merino, Alexandre Courbot,
	Viresh Kumar, linux-pm, Krzysztof Kozlowski,
	Javier Martinez Canillas, MyungJoo Ham, Zhang Rui, linaro-kernel,
	Stephen Warren, Eduardo Valentin, Peter De Schrijver,
	Rafael Wysocki, linux-kernel, Kyungmin Park, Kukjin Kim,
	Amit Daniel Kachhap

Hi Viresh,

For the devfreq part, looks good to me.

Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>


2016-12-08 13:00 GMT+09:00 Viresh Kumar <viresh.kumar@linaro.org>:
> On 07-12-16, 22:23, Chanwoo Choi wrote:
>> I think that dev_pm_opp_put(opp) should be called after the if statement.
>> If dev_pm_opp_find_freq_ceil() returns an error, I think calling
>> dev_pm_opp_put(opp) is not necessary.
>
> During development I had following check in dev_pm_opp_put():
>
>         if (IS_ERR(opp))
>                 return;
>
> But that check isn't there anymore. And so it is also unsafe to call
> dev_pm_opp_put() for invalid OPP pointers.
>
> Thanks for reviewing this properly. devfreq_cooling.c also had the same issue
> which you missed. Here is the new version of the patch:
>
> -------------------------8<-------------------------
> Subject: [PATCH] PM / OPP: Update OPP users to put reference
>
> This patch updates dev_pm_opp_find_freq_*() routines to get a reference
> to the OPPs returned by them.
>
> Also updates the users of dev_pm_opp_find_freq_*() routines to call
> dev_pm_opp_put() after they are done using the OPPs.
>
> As it is guaranteed that the OPPs wouldn't get freed while being used,
> the RCU read side locking present with the users isn't required anymore.
> Drop it as well.
>
> This patch also updates all users of devfreq_recommended_opp() which was
> returning an OPP received from the OPP core.
>
> Note that some of the OPP core routines have gained
> rcu_read_{lock|unlock}() calls, as those still use RCU specific APIs
> within them.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  arch/arm/mach-omap2/pm.c             |   5 +-
>  drivers/base/power/opp/core.c        | 114 +++++++++++++++++++----------------
>  drivers/base/power/opp/cpu.c         |  22 ++-----
>  drivers/clk/tegra/clk-dfll.c         |  17 ++----
>  drivers/cpufreq/exynos5440-cpufreq.c |   5 +-
>  drivers/cpufreq/imx6q-cpufreq.c      |  10 +--
>  drivers/cpufreq/mt8173-cpufreq.c     |   8 +--
>  drivers/cpufreq/omap-cpufreq.c       |   4 +-
>  drivers/devfreq/devfreq.c            |  14 ++---
>  drivers/devfreq/exynos-bus.c         |  14 ++---
>  drivers/devfreq/governor_passive.c   |   4 +-
>  drivers/devfreq/rk3399_dmc.c         |  16 ++---
>  drivers/devfreq/tegra-devfreq.c      |   4 +-
>  drivers/thermal/cpu_cooling.c        |  11 +---
>  drivers/thermal/devfreq_cooling.c    |  15 ++---
>  15 files changed, 110 insertions(+), 153 deletions(-)
>
> diff --git a/arch/arm/mach-omap2/pm.c b/arch/arm/mach-omap2/pm.c
> index 678d2a31dcb8..c5a1d4439202 100644
> --- a/arch/arm/mach-omap2/pm.c
> +++ b/arch/arm/mach-omap2/pm.c
> @@ -167,17 +167,16 @@ static int __init omap2_set_init_voltage(char *vdd_name, char *clk_name,
>         freq = clk_get_rate(clk);
>         clk_put(clk);
>
> -       rcu_read_lock();
>         opp = dev_pm_opp_find_freq_ceil(dev, &freq);
>         if (IS_ERR(opp)) {
> -               rcu_read_unlock();
>                 pr_err("%s: unable to find boot up OPP for vdd_%s\n",
>                         __func__, vdd_name);
>                 goto exit;
>         }
>
>         bootup_volt = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
> +
>         if (!bootup_volt) {
>                 pr_err("%s: unable to find voltage corresponding to the bootup OPP for vdd_%s\n",
>                        __func__, vdd_name);
> diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
> index 9870ee54d708..a6efa818029a 100644
> --- a/drivers/base/power/opp/core.c
> +++ b/drivers/base/power/opp/core.c
> @@ -40,6 +40,8 @@ do {                                                                  \
>                          "opp_table_lock protection");                  \
>  } while (0)
>
> +static void dev_pm_opp_get(struct dev_pm_opp *opp);
> +
>  static struct opp_device *_find_opp_dev(const struct device *dev,
>                                         struct opp_table *opp_table)
>  {
> @@ -94,21 +96,13 @@ struct opp_table *_find_opp_table(struct device *dev)
>   * return 0
>   *
>   * This is useful only for devices with single power supply.
> - *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. This means that opp which could have been fetched by
> - * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
> - * under RCU lock. The pointer returned by the opp_find_freq family must be
> - * used in the same section as the usage of this function with the pointer
> - * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
> - * pointer.
>   */
>  unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
>  {
>         struct dev_pm_opp *tmp_opp;
>         unsigned long v = 0;
>
> -       opp_rcu_lockdep_assert();
> +       rcu_read_lock();
>
>         tmp_opp = rcu_dereference(opp);
>         if (IS_ERR_OR_NULL(tmp_opp))
> @@ -116,6 +110,7 @@ unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
>         else
>                 v = tmp_opp->supplies[0].u_volt;
>
> +       rcu_read_unlock();
>         return v;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
> @@ -126,21 +121,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
>   *
>   * Return: frequency in hertz corresponding to the opp, else
>   * return 0
> - *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. This means that opp which could have been fetched by
> - * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
> - * under RCU lock. The pointer returned by the opp_find_freq family must be
> - * used in the same section as the usage of this function with the pointer
> - * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
> - * pointer.
>   */
>  unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
>  {
>         struct dev_pm_opp *tmp_opp;
>         unsigned long f = 0;
>
> -       opp_rcu_lockdep_assert();
> +       rcu_read_lock();
>
>         tmp_opp = rcu_dereference(opp);
>         if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available)
> @@ -148,6 +135,7 @@ unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
>         else
>                 f = tmp_opp->rate;
>
> +       rcu_read_unlock();
>         return f;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
> @@ -161,20 +149,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
>   * quickly. Running on them for longer times may overheat the chip.
>   *
>   * Return: true if opp is turbo opp, else false.
> - *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. This means that opp which could have been fetched by
> - * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
> - * under RCU lock. The pointer returned by the opp_find_freq family must be
> - * used in the same section as the usage of this function with the pointer
> - * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
> - * pointer.
>   */
>  bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
>  {
>         struct dev_pm_opp *tmp_opp;
> +       bool turbo;
>
> -       opp_rcu_lockdep_assert();
> +       rcu_read_lock();
>
>         tmp_opp = rcu_dereference(opp);
>         if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available) {
> @@ -182,7 +163,10 @@ bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
>                 return false;
>         }
>
> -       return tmp_opp->turbo;
> +       turbo = tmp_opp->turbo;
> +
> +       rcu_read_unlock();
> +       return turbo;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_is_turbo);
>
> @@ -410,11 +394,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_count);
>   * This provides a mechanism to enable an opp which is not available currently
>   * or the opposite as well.
>   *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. The reason for the same is that the opp pointer which is
> - * returned will remain valid for use with opp_get_{voltage, freq} only while
> - * under the locked area. The pointer returned must be used prior to unlocking
> - * with rcu_read_unlock() to maintain the integrity of the pointer.
> + * The callers are required to call dev_pm_opp_put() for the returned OPP after
> + * use.
>   */
>  struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
>                                               unsigned long freq,
> @@ -423,13 +404,14 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
>         struct opp_table *opp_table;
>         struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
>
> -       opp_rcu_lockdep_assert();
> +       rcu_read_lock();
>
>         opp_table = _find_opp_table(dev);
>         if (IS_ERR(opp_table)) {
>                 int r = PTR_ERR(opp_table);
>
>                 dev_err(dev, "%s: OPP table not found (%d)\n", __func__, r);
> +               rcu_read_unlock();
>                 return ERR_PTR(r);
>         }
>
> @@ -437,10 +419,15 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
>                 if (temp_opp->available == available &&
>                                 temp_opp->rate == freq) {
>                         opp = temp_opp;
> +
> +                       /* Increment the reference count of OPP */
> +                       dev_pm_opp_get(opp);
>                         break;
>                 }
>         }
>
> +       rcu_read_unlock();
> +
>         return opp;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_exact);
> @@ -454,6 +441,9 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
>                 if (temp_opp->available && temp_opp->rate >= *freq) {
>                         opp = temp_opp;
>                         *freq = opp->rate;
> +
> +                       /* Increment the reference count of OPP */
> +                       dev_pm_opp_get(opp);
>                         break;
>                 }
>         }
> @@ -476,29 +466,33 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
>   * ERANGE:     no match found for search
>   * ENODEV:     if device not found in list of registered devices
>   *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. The reason for the same is that the opp pointer which is
> - * returned will remain valid for use with opp_get_{voltage, freq} only while
> - * under the locked area. The pointer returned must be used prior to unlocking
> - * with rcu_read_unlock() to maintain the integrity of the pointer.
> + * The callers are required to call dev_pm_opp_put() for the returned OPP after
> + * use.
>   */
>  struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev,
>                                              unsigned long *freq)
>  {
>         struct opp_table *opp_table;
> -
> -       opp_rcu_lockdep_assert();
> +       struct dev_pm_opp *opp;
>
>         if (!dev || !freq) {
>                 dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
>                 return ERR_PTR(-EINVAL);
>         }
>
> +       rcu_read_lock();
> +
>         opp_table = _find_opp_table(dev);
> -       if (IS_ERR(opp_table))
> +       if (IS_ERR(opp_table)) {
> +               rcu_read_unlock();
>                 return ERR_CAST(opp_table);
> +       }
> +
> +       opp = _find_freq_ceil(opp_table, freq);
>
> -       return _find_freq_ceil(opp_table, freq);
> +       rcu_read_unlock();
> +
> +       return opp;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
>
> @@ -517,11 +511,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
>   * ERANGE:     no match found for search
>   * ENODEV:     if device not found in list of registered devices
>   *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. The reason for the same is that the opp pointer which is
> - * returned will remain valid for use with opp_get_{voltage, freq} only while
> - * under the locked area. The pointer returned must be used prior to unlocking
> - * with rcu_read_unlock() to maintain the integrity of the pointer.
> + * The callers are required to call dev_pm_opp_put() for the returned OPP after
> + * use.
>   */
>  struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
>                                               unsigned long *freq)
> @@ -529,16 +520,18 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
>         struct opp_table *opp_table;
>         struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
>
> -       opp_rcu_lockdep_assert();
> -
>         if (!dev || !freq) {
>                 dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
>                 return ERR_PTR(-EINVAL);
>         }
>
> +       rcu_read_lock();
> +
>         opp_table = _find_opp_table(dev);
> -       if (IS_ERR(opp_table))
> +       if (IS_ERR(opp_table)) {
> +               rcu_read_unlock();
>                 return ERR_CAST(opp_table);
> +       }
>
>         list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
>                 if (temp_opp->available) {
> @@ -549,6 +542,12 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
>                                 opp = temp_opp;
>                 }
>         }
> +
> +       /* Increment the reference count of OPP */
> +       if (!IS_ERR(opp))
> +               dev_pm_opp_get(opp);
> +       rcu_read_unlock();
> +
>         if (!IS_ERR(opp))
>                 *freq = opp->rate;
>
> @@ -736,6 +735,8 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>                 ret = PTR_ERR(opp);
>                 dev_err(dev, "%s: failed to find OPP for freq %lu (%d)\n",
>                         __func__, freq, ret);
> +               if (!IS_ERR(old_opp))
> +                       dev_pm_opp_put(old_opp);
>                 rcu_read_unlock();
>                 return ret;
>         }
> @@ -747,6 +748,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>
>         /* Only frequency scaling */
>         if (!regulators) {
> +               dev_pm_opp_put(opp);
> +               if (!IS_ERR(old_opp))
> +                       dev_pm_opp_put(old_opp);
>                 rcu_read_unlock();
>                 return _generic_set_opp_clk_only(dev, clk, old_freq, freq);
>         }
> @@ -772,6 +776,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>         data->new_opp.rate = freq;
>         memcpy(data->new_opp.supplies, opp->supplies, size);
>
> +       dev_pm_opp_put(opp);
> +       if (!IS_ERR(old_opp))
> +               dev_pm_opp_put(old_opp);
>         rcu_read_unlock();
>
>         return set_opp(data);
> @@ -967,6 +974,11 @@ static void _opp_kref_release(struct kref *kref)
>         dev_pm_opp_put_opp_table(opp_table);
>  }
>
> +static void dev_pm_opp_get(struct dev_pm_opp *opp)
> +{
> +       kref_get(&opp->kref);
> +}
> +
>  void dev_pm_opp_put(struct dev_pm_opp *opp)
>  {
>         kref_put_mutex(&opp->kref, _opp_kref_release, &opp->opp_table->lock);
> diff --git a/drivers/base/power/opp/cpu.c b/drivers/base/power/opp/cpu.c
> index 8c3434bdb26d..adef788862d5 100644
> --- a/drivers/base/power/opp/cpu.c
> +++ b/drivers/base/power/opp/cpu.c
> @@ -42,11 +42,6 @@
>   *
>   * WARNING: It is  important for the callers to ensure refreshing their copy of
>   * the table if any of the mentioned functions have been invoked in the interim.
> - *
> - * Locking: The internal opp_table and opp structures are RCU protected.
> - * Since we just use the regular accessor functions to access the internal data
> - * structures, we use RCU read lock inside this function. As a result, users of
> - * this function DONOT need to use explicit locks for invoking.
>   */
>  int dev_pm_opp_init_cpufreq_table(struct device *dev,
>                                   struct cpufreq_frequency_table **table)
> @@ -56,19 +51,13 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
>         int i, max_opps, ret = 0;
>         unsigned long rate;
>
> -       rcu_read_lock();
> -
>         max_opps = dev_pm_opp_get_opp_count(dev);
> -       if (max_opps <= 0) {
> -               ret = max_opps ? max_opps : -ENODATA;
> -               goto out;
> -       }
> +       if (max_opps <= 0)
> +               return max_opps ? max_opps : -ENODATA;
>
>         freq_table = kcalloc((max_opps + 1), sizeof(*freq_table), GFP_ATOMIC);
> -       if (!freq_table) {
> -               ret = -ENOMEM;
> -               goto out;
> -       }
> +       if (!freq_table)
> +               return -ENOMEM;
>
>         for (i = 0, rate = 0; i < max_opps; i++, rate++) {
>                 /* find next rate */
> @@ -83,6 +72,8 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
>                 /* Is Boost/turbo opp ? */
>                 if (dev_pm_opp_is_turbo(opp))
>                         freq_table[i].flags = CPUFREQ_BOOST_FREQ;
> +
> +               dev_pm_opp_put(opp);
>         }
>
>         freq_table[i].driver_data = i;
> @@ -91,7 +82,6 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
>         *table = &freq_table[0];
>
>  out:
> -       rcu_read_unlock();
>         if (ret)
>                 kfree(freq_table);
>
> diff --git a/drivers/clk/tegra/clk-dfll.c b/drivers/clk/tegra/clk-dfll.c
> index f010562534eb..2c44aeb0b97c 100644
> --- a/drivers/clk/tegra/clk-dfll.c
> +++ b/drivers/clk/tegra/clk-dfll.c
> @@ -633,16 +633,12 @@ static int find_lut_index_for_rate(struct tegra_dfll *td, unsigned long rate)
>         struct dev_pm_opp *opp;
>         int i, uv;
>
> -       rcu_read_lock();
> -
>         opp = dev_pm_opp_find_freq_ceil(td->soc->dev, &rate);
> -       if (IS_ERR(opp)) {
> -               rcu_read_unlock();
> +       if (IS_ERR(opp))
>                 return PTR_ERR(opp);
> -       }
> -       uv = dev_pm_opp_get_voltage(opp);
>
> -       rcu_read_unlock();
> +       uv = dev_pm_opp_get_voltage(opp);
> +       dev_pm_opp_put(opp);
>
>         for (i = 0; i < td->i2c_lut_size; i++) {
>                 if (regulator_list_voltage(td->vdd_reg, td->i2c_lut[i]) == uv)
> @@ -1440,8 +1436,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
>         struct dev_pm_opp *opp;
>         int lut;
>
> -       rcu_read_lock();
> -
>         rate = ULONG_MAX;
>         opp = dev_pm_opp_find_freq_floor(td->soc->dev, &rate);
>         if (IS_ERR(opp)) {
> @@ -1449,6 +1443,7 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
>                 goto out;
>         }
>         v_max = dev_pm_opp_get_voltage(opp);
> +       dev_pm_opp_put(opp);
>
>         v = td->soc->cvb->min_millivolts * 1000;
>         lut = find_vdd_map_entry_exact(td, v);
> @@ -1465,6 +1460,8 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
>                 if (v_opp <= td->soc->cvb->min_millivolts * 1000)
>                         td->dvco_rate_min = dev_pm_opp_get_freq(opp);
>
> +               dev_pm_opp_put(opp);
> +
>                 for (;;) {
>                         v += max(1, (v_max - v) / (MAX_DFLL_VOLTAGES - j));
>                         if (v >= v_opp)
> @@ -1496,8 +1493,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
>                 ret = 0;
>
>  out:
> -       rcu_read_unlock();
> -
>         return ret;
>  }
>
> diff --git a/drivers/cpufreq/exynos5440-cpufreq.c b/drivers/cpufreq/exynos5440-cpufreq.c
> index c0f3373706f4..9180d34cc9fc 100644
> --- a/drivers/cpufreq/exynos5440-cpufreq.c
> +++ b/drivers/cpufreq/exynos5440-cpufreq.c
> @@ -118,12 +118,10 @@ static int init_div_table(void)
>         unsigned int tmp, clk_div, ema_div, freq, volt_id;
>         struct dev_pm_opp *opp;
>
> -       rcu_read_lock();
>         cpufreq_for_each_entry(pos, freq_tbl) {
>                 opp = dev_pm_opp_find_freq_exact(dvfs_info->dev,
>                                         pos->frequency * 1000, true);
>                 if (IS_ERR(opp)) {
> -                       rcu_read_unlock();
>                         dev_err(dvfs_info->dev,
>                                 "failed to find valid OPP for %u KHZ\n",
>                                 pos->frequency);
> @@ -140,6 +138,7 @@ static int init_div_table(void)
>
>                 /* Calculate EMA */
>                 volt_id = dev_pm_opp_get_voltage(opp);
> +
>                 volt_id = (MAX_VOLTAGE - volt_id) / VOLTAGE_STEP;
>                 if (volt_id < PMIC_HIGH_VOLT) {
>                         ema_div = (CPUEMA_HIGH << P0_7_CPUEMA_SHIFT) |
> @@ -157,9 +156,9 @@ static int init_div_table(void)
>
>                 __raw_writel(tmp, dvfs_info->base + XMU_PMU_P0_7 + 4 *
>                                                 (pos - freq_tbl));
> +               dev_pm_opp_put(opp);
>         }
>
> -       rcu_read_unlock();
>         return 0;
>  }
>
> diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
> index ef1fa8145419..7719b02e04f5 100644
> --- a/drivers/cpufreq/imx6q-cpufreq.c
> +++ b/drivers/cpufreq/imx6q-cpufreq.c
> @@ -53,16 +53,15 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
>         freq_hz = new_freq * 1000;
>         old_freq = clk_get_rate(arm_clk) / 1000;
>
> -       rcu_read_lock();
>         opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
>         if (IS_ERR(opp)) {
> -               rcu_read_unlock();
>                 dev_err(cpu_dev, "failed to find OPP for %ld\n", freq_hz);
>                 return PTR_ERR(opp);
>         }
>
>         volt = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
> +
>         volt_old = regulator_get_voltage(arm_reg);
>
>         dev_dbg(cpu_dev, "%u MHz, %ld mV --> %u MHz, %ld mV\n",
> @@ -321,14 +320,15 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
>          * freq_table initialised from OPP is therefore sorted in the
>          * same order.
>          */
> -       rcu_read_lock();
>         opp = dev_pm_opp_find_freq_exact(cpu_dev,
>                                   freq_table[0].frequency * 1000, true);
>         min_volt = dev_pm_opp_get_voltage(opp);
> +       dev_pm_opp_put(opp);
>         opp = dev_pm_opp_find_freq_exact(cpu_dev,
>                                   freq_table[--num].frequency * 1000, true);
>         max_volt = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
> +
>         ret = regulator_set_voltage_time(arm_reg, min_volt, max_volt);
>         if (ret > 0)
>                 transition_latency += ret * 1000;
> diff --git a/drivers/cpufreq/mt8173-cpufreq.c b/drivers/cpufreq/mt8173-cpufreq.c
> index 643f43179df1..ab25b1235a5e 100644
> --- a/drivers/cpufreq/mt8173-cpufreq.c
> +++ b/drivers/cpufreq/mt8173-cpufreq.c
> @@ -232,16 +232,14 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
>
>         freq_hz = freq_table[index].frequency * 1000;
>
> -       rcu_read_lock();
>         opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
>         if (IS_ERR(opp)) {
> -               rcu_read_unlock();
>                 pr_err("cpu%d: failed to find OPP for %ld\n",
>                        policy->cpu, freq_hz);
>                 return PTR_ERR(opp);
>         }
>         vproc = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         /*
>          * If the new voltage or the intermediate voltage is higher than the
> @@ -411,16 +409,14 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
>
>         /* Search a safe voltage for intermediate frequency. */
>         rate = clk_get_rate(inter_clk);
> -       rcu_read_lock();
>         opp = dev_pm_opp_find_freq_ceil(cpu_dev, &rate);
>         if (IS_ERR(opp)) {
> -               rcu_read_unlock();
>                 pr_err("failed to get intermediate opp for cpu%d\n", cpu);
>                 ret = PTR_ERR(opp);
>                 goto out_free_opp_table;
>         }
>         info->intermediate_voltage = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         info->cpu_dev = cpu_dev;
>         info->proc_reg = proc_reg;
> diff --git a/drivers/cpufreq/omap-cpufreq.c b/drivers/cpufreq/omap-cpufreq.c
> index 376e63ca94e8..71e81bbf031b 100644
> --- a/drivers/cpufreq/omap-cpufreq.c
> +++ b/drivers/cpufreq/omap-cpufreq.c
> @@ -63,16 +63,14 @@ static int omap_target(struct cpufreq_policy *policy, unsigned int index)
>         freq = ret;
>
>         if (mpu_reg) {
> -               rcu_read_lock();
>                 opp = dev_pm_opp_find_freq_ceil(mpu_dev, &freq);
>                 if (IS_ERR(opp)) {
> -                       rcu_read_unlock();
>                         dev_err(mpu_dev, "%s: unable to find MPU OPP for %d\n",
>                                 __func__, new_freq);
>                         return -EINVAL;
>                 }
>                 volt = dev_pm_opp_get_voltage(opp);
> -               rcu_read_unlock();
> +               dev_pm_opp_put(opp);
>                 tol = volt * OPP_TOLERANCE / 100;
>                 volt_old = regulator_get_voltage(mpu_reg);
>         }
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index b0de42972b74..378f12a51496 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -111,18 +111,16 @@ static void devfreq_set_freq_table(struct devfreq *devfreq)
>                 return;
>         }
>
> -       rcu_read_lock();
>         for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
>                 opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
>                 if (IS_ERR(opp)) {
>                         devm_kfree(devfreq->dev.parent, profile->freq_table);
>                         profile->max_state = 0;
> -                       rcu_read_unlock();
>                         return;
>                 }
> +               dev_pm_opp_put(opp);
>                 profile->freq_table[i] = freq;
>         }
> -       rcu_read_unlock();
>  }
>
>  /**
> @@ -1107,17 +1105,16 @@ static ssize_t available_frequencies_show(struct device *d,
>         ssize_t count = 0;
>         unsigned long freq = 0;
>
> -       rcu_read_lock();
>         do {
>                 opp = dev_pm_opp_find_freq_ceil(dev, &freq);
>                 if (IS_ERR(opp))
>                         break;
>
> +               dev_pm_opp_put(opp);
>                 count += scnprintf(&buf[count], (PAGE_SIZE - count - 2),
>                                    "%lu ", freq);
>                 freq++;
>         } while (1);
> -       rcu_read_unlock();
>
>         /* Truncate the trailing space */
>         if (count)
> @@ -1219,11 +1216,8 @@ subsys_initcall(devfreq_init);
>   * @freq:      The frequency given to target function
>   * @flags:     Flags handed from devfreq framework.
>   *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. The reason for the same is that the opp pointer which is
> - * returned will remain valid for use with opp_get_{voltage, freq} only while
> - * under the locked area. The pointer returned must be used prior to unlocking
> - * with rcu_read_unlock() to maintain the integrity of the pointer.
> + * The callers are required to call dev_pm_opp_put() for the returned OPP after
> + * use.
>   */
>  struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
>                                            unsigned long *freq,
> diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
> index a8ed7792ece2..49ce38cef460 100644
> --- a/drivers/devfreq/exynos-bus.c
> +++ b/drivers/devfreq/exynos-bus.c
> @@ -103,18 +103,17 @@ static int exynos_bus_target(struct device *dev, unsigned long *freq, u32 flags)
>         int ret = 0;
>
>         /* Get new opp-bus instance according to new bus clock */
> -       rcu_read_lock();
>         new_opp = devfreq_recommended_opp(dev, freq, flags);
>         if (IS_ERR(new_opp)) {
>                 dev_err(dev, "failed to get recommended opp instance\n");
> -               rcu_read_unlock();
>                 return PTR_ERR(new_opp);
>         }
>
>         new_freq = dev_pm_opp_get_freq(new_opp);
>         new_volt = dev_pm_opp_get_voltage(new_opp);
> +       dev_pm_opp_put(new_opp);
> +
>         old_freq = bus->curr_freq;
> -       rcu_read_unlock();
>
>         if (old_freq == new_freq)
>                 return 0;
> @@ -214,17 +213,16 @@ static int exynos_bus_passive_target(struct device *dev, unsigned long *freq,
>         int ret = 0;
>
>         /* Get new opp-bus instance according to new bus clock */
> -       rcu_read_lock();
>         new_opp = devfreq_recommended_opp(dev, freq, flags);
>         if (IS_ERR(new_opp)) {
>                 dev_err(dev, "failed to get recommended opp instance\n");
> -               rcu_read_unlock();
>                 return PTR_ERR(new_opp);
>         }
>
>         new_freq = dev_pm_opp_get_freq(new_opp);
> +       dev_pm_opp_put(new_opp);
> +
>         old_freq = bus->curr_freq;
> -       rcu_read_unlock();
>
>         if (old_freq == new_freq)
>                 return 0;
> @@ -358,16 +356,14 @@ static int exynos_bus_parse_of(struct device_node *np,
>
>         rate = clk_get_rate(bus->clk);
>
> -       rcu_read_lock();
>         opp = devfreq_recommended_opp(dev, &rate, 0);
>         if (IS_ERR(opp)) {
>                 dev_err(dev, "failed to find dev_pm_opp\n");
> -               rcu_read_unlock();
>                 ret = PTR_ERR(opp);
>                 goto err_opp;
>         }
>         bus->curr_freq = dev_pm_opp_get_freq(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         return 0;
>
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index 9ef46e2592c4..bd452236dba4 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -59,14 +59,14 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>          * list of parent device. Because in this case, *freq is temporary
>          * value which is decided by ondemand governor.
>          */
> -       rcu_read_lock();
>         opp = devfreq_recommended_opp(parent_devfreq->dev.parent, freq, 0);
> -       rcu_read_unlock();
>         if (IS_ERR(opp)) {
>                 ret = PTR_ERR(opp);
>                 goto out;
>         }
>
> +       dev_pm_opp_put(opp);
> +
>         /*
>          * Get the OPP table's index of decided freqeuncy by governor
>          * of parent device.
> diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
> index 27d2f349b53c..40a2499730fc 100644
> --- a/drivers/devfreq/rk3399_dmc.c
> +++ b/drivers/devfreq/rk3399_dmc.c
> @@ -91,17 +91,13 @@ static int rk3399_dmcfreq_target(struct device *dev, unsigned long *freq,
>         unsigned long target_volt, target_rate;
>         int err;
>
> -       rcu_read_lock();
>         opp = devfreq_recommended_opp(dev, freq, flags);
> -       if (IS_ERR(opp)) {
> -               rcu_read_unlock();
> +       if (IS_ERR(opp))
>                 return PTR_ERR(opp);
> -       }
>
>         target_rate = dev_pm_opp_get_freq(opp);
>         target_volt = dev_pm_opp_get_voltage(opp);
> -
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         if (dmcfreq->rate == target_rate)
>                 return 0;
> @@ -422,15 +418,13 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev)
>
>         data->rate = clk_get_rate(data->dmc_clk);
>
> -       rcu_read_lock();
>         opp = devfreq_recommended_opp(dev, &data->rate, 0);
> -       if (IS_ERR(opp)) {
> -               rcu_read_unlock();
> +       if (IS_ERR(opp))
>                 return PTR_ERR(opp);
> -       }
> +
>         data->rate = dev_pm_opp_get_freq(opp);
>         data->volt = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         rk3399_devfreq_dmc_profile.initial_freq = data->rate;
>
> diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
> index fe9dce0245bf..214fff96fa4a 100644
> --- a/drivers/devfreq/tegra-devfreq.c
> +++ b/drivers/devfreq/tegra-devfreq.c
> @@ -487,15 +487,13 @@ static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
>         struct dev_pm_opp *opp;
>         unsigned long rate = *freq * KHZ;
>
> -       rcu_read_lock();
>         opp = devfreq_recommended_opp(dev, &rate, flags);
>         if (IS_ERR(opp)) {
> -               rcu_read_unlock();
>                 dev_err(dev, "Failed to find opp for %lu KHz\n", *freq);
>                 return PTR_ERR(opp);
>         }
>         rate = dev_pm_opp_get_freq(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         clk_set_min_rate(tegra->emc_clock, rate);
>         clk_set_rate(tegra->emc_clock, 0);
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index 9ce0e9eef923..85fdbf762fa0 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -297,8 +297,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
>         if (!power_table)
>                 return -ENOMEM;
>
> -       rcu_read_lock();
> -
>         for (freq = 0, i = 0;
>              opp = dev_pm_opp_find_freq_ceil(dev, &freq), !IS_ERR(opp);
>              freq++, i++) {
> @@ -306,13 +304,13 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
>                 u64 power;
>
>                 if (i >= num_opps) {
> -                       rcu_read_unlock();
>                         ret = -EAGAIN;
>                         goto free_power_table;
>                 }
>
>                 freq_mhz = freq / 1000000;
>                 voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
> +               dev_pm_opp_put(opp);
>
>                 /*
>                  * Do the multiplication with MHz and millivolt so as
> @@ -328,8 +326,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
>                 power_table[i].power = power;
>         }
>
> -       rcu_read_unlock();
> -
>         if (i != num_opps) {
>                 ret = PTR_ERR(opp);
>                 goto free_power_table;
> @@ -433,13 +429,10 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_device,
>                 return 0;
>         }
>
> -       rcu_read_lock();
> -
>         opp = dev_pm_opp_find_freq_exact(cpufreq_device->cpu_dev, freq_hz,
>                                          true);
>         voltage = dev_pm_opp_get_voltage(opp);
> -
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         if (voltage == 0) {
>                 dev_warn_ratelimited(cpufreq_device->cpu_dev,
> diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
> index 81631b110e17..abe8ad76bd8b 100644
> --- a/drivers/thermal/devfreq_cooling.c
> +++ b/drivers/thermal/devfreq_cooling.c
> @@ -113,15 +113,15 @@ static int partition_enable_opps(struct devfreq_cooling_device *dfc,
>                 unsigned int freq = dfc->freq_table[i];
>                 bool want_enable = i >= cdev_state ? true : false;
>
> -               rcu_read_lock();
>                 opp = dev_pm_opp_find_freq_exact(dev, freq, !want_enable);
> -               rcu_read_unlock();
>
>                 if (PTR_ERR(opp) == -ERANGE)
>                         continue;
>                 else if (IS_ERR(opp))
>                         return PTR_ERR(opp);
>
> +               dev_pm_opp_put(opp);
> +
>                 if (want_enable)
>                         ret = dev_pm_opp_enable(dev, freq);
>                 else
> @@ -221,15 +221,12 @@ get_static_power(struct devfreq_cooling_device *dfc, unsigned long freq)
>         if (!dfc->power_ops->get_static_power)
>                 return 0;
>
> -       rcu_read_lock();
> -
>         opp = dev_pm_opp_find_freq_exact(dev, freq, true);
>         if (IS_ERR(opp) && (PTR_ERR(opp) == -ERANGE))
>                 opp = dev_pm_opp_find_freq_exact(dev, freq, false);
>
>         voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
> -
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         if (voltage == 0) {
>                 dev_warn_ratelimited(dev,
> @@ -411,18 +408,14 @@ static int devfreq_cooling_gen_tables(struct devfreq_cooling_device *dfc)
>                 unsigned long power_dyn, voltage;
>                 struct dev_pm_opp *opp;
>
> -               rcu_read_lock();
> -
>                 opp = dev_pm_opp_find_freq_floor(dev, &freq);
>                 if (IS_ERR(opp)) {
> -                       rcu_read_unlock();
>                         ret = PTR_ERR(opp);
>                         goto free_tables;
>                 }
>
>                 voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
> -
> -               rcu_read_unlock();
> +               dev_pm_opp_put(opp);
>
>                 if (dfc->power_ops) {
>                         power_dyn = get_dynamic_power(dfc, freq, voltage);
> _______________________________________________
> linaro-kernel mailing list
> linaro-kernel@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/linaro-kernel



-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


* Re: [PATCH 07/12] PM / OPP: Update OPP users to put reference
@ 2017-01-21  7:42         ` Chanwoo Choi
  0 siblings, 0 replies; 41+ messages in thread
From: Chanwoo Choi @ 2017-01-21  7:42 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Chanwoo Choi, Nishanth Menon, Prashant Gaikwad, Tony Lindgren,
	Stephen Boyd, Thierry Reding, Javi Merino, Alexandre Courbot,
	Viresh Kumar, linux-pm, Krzysztof Kozlowski,
	Javier Martinez Canillas, MyungJoo Ham, Zhang Rui, linaro-kernel,
	Stephen Warren, Eduardo Valentin, Peter De Schrijver,
	Rafael Wysocki, linux-ke

Hi Viresh,

For the devfreq part, this looks good to me.

Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>


2016-12-08 13:00 GMT+09:00 Viresh Kumar <viresh.kumar@linaro.org>:
> On 07-12-16, 22:23, Chanwoo Choi wrote:
>> I think that dev_pm_opp_put(opp) should be called after the if statement.
>> If dev_pm_opp_find_freq_ceil() returns an error, I think calling
>> dev_pm_opp_put(opp) is not necessary.
>
> During development I had the following check in dev_pm_opp_put():
>
>         if (IS_ERR(opp))
>                 return;
>
> But that check isn't there anymore, so it is unsafe to call
> dev_pm_opp_put() for invalid OPP pointers.
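
The caller-side rule discussed above (check IS_ERR() first, and only put a
reference that was actually taken) can be sketched in userspace as below. This
is only an illustration under stated assumptions, not the kernel code: the
ERR_PTR helpers are simplified stand-ins, and struct mock_opp, mock_table,
mock_find_freq_ceil(), mock_put() and lookup_voltage() are hypothetical names
that do not exist in the OPP core.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers (assumption). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical miniature OPP: only the fields this sketch needs. */
struct mock_opp {
	unsigned long rate;	/* Hz */
	unsigned long u_volt;	/* uV */
	int refcount;		/* plain int here; the real code uses a kref */
};

/* Each entry starts with one reference, held by the table itself. */
static struct mock_opp mock_table[] = {
	{  500000000UL,  900000UL, 1 },
	{ 1000000000UL, 1100000UL, 1 },
};

/* Return the lowest OPP with rate >= *freq and take a reference on it. */
static struct mock_opp *mock_find_freq_ceil(unsigned long *freq)
{
	size_t i;

	for (i = 0; i < sizeof(mock_table) / sizeof(mock_table[0]); i++) {
		if (mock_table[i].rate >= *freq) {
			*freq = mock_table[i].rate;
			mock_table[i].refcount++;
			return &mock_table[i];
		}
	}
	return ERR_PTR(-ERANGE);	/* no reference was taken */
}

/* Drop a reference; must never be called with an error pointer. */
static void mock_put(struct mock_opp *opp)
{
	assert(!IS_ERR(opp) && opp->refcount > 0);
	opp->refcount--;
}

/* Caller pattern: bail out on error before any put, then read and put. */
static long lookup_voltage(unsigned long freq, unsigned long *volt)
{
	struct mock_opp *opp = mock_find_freq_ceil(&freq);

	if (IS_ERR(opp))
		return PTR_ERR(opp);	/* nothing to put on this path */

	*volt = opp->u_volt;
	mock_put(opp);			/* done with the OPP, drop our reference */
	return 0;
}
```

With this shape the error path never reaches the put, which is the ordering
the review comment asked for.
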
>
> Thanks for reviewing this properly. devfreq_cooling.c also had the same issue
> which you missed. Here is the new version of the patch:
>
> -------------------------8<-------------------------
> Subject: [PATCH] PM / OPP: Update OPP users to put reference
>
> This patch updates dev_pm_opp_find_freq_*() routines to get a reference
> to the OPPs returned by them.
>
> Also updates the users of dev_pm_opp_find_freq_*() routines to call
> dev_pm_opp_put() after they are done using the OPPs.
>
> As it is guaranteed that the OPPs won't get freed while being used,
> the RCU read side locking present with the users isn't required anymore.
> Drop it as well.
>
> This patch also updates all users of devfreq_recommended_opp() which was
> returning an OPP received from the OPP core.
>
> Note that some of the OPP core routines have gained
> rcu_read_{lock|unlock}() calls, as those still use RCU specific APIs
> within them.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  arch/arm/mach-omap2/pm.c             |   5 +-
>  drivers/base/power/opp/core.c        | 114 +++++++++++++++++++----------------
>  drivers/base/power/opp/cpu.c         |  22 ++-----
>  drivers/clk/tegra/clk-dfll.c         |  17 ++----
>  drivers/cpufreq/exynos5440-cpufreq.c |   5 +-
>  drivers/cpufreq/imx6q-cpufreq.c      |  10 +--
>  drivers/cpufreq/mt8173-cpufreq.c     |   8 +--
>  drivers/cpufreq/omap-cpufreq.c       |   4 +-
>  drivers/devfreq/devfreq.c            |  14 ++---
>  drivers/devfreq/exynos-bus.c         |  14 ++---
>  drivers/devfreq/governor_passive.c   |   4 +-
>  drivers/devfreq/rk3399_dmc.c         |  16 ++---
>  drivers/devfreq/tegra-devfreq.c      |   4 +-
>  drivers/thermal/cpu_cooling.c        |  11 +---
>  drivers/thermal/devfreq_cooling.c    |  15 ++---
>  15 files changed, 110 insertions(+), 153 deletions(-)
>
> diff --git a/arch/arm/mach-omap2/pm.c b/arch/arm/mach-omap2/pm.c
> index 678d2a31dcb8..c5a1d4439202 100644
> --- a/arch/arm/mach-omap2/pm.c
> +++ b/arch/arm/mach-omap2/pm.c
> @@ -167,17 +167,16 @@ static int __init omap2_set_init_voltage(char *vdd_name, char *clk_name,
>         freq = clk_get_rate(clk);
>         clk_put(clk);
>
> -       rcu_read_lock();
>         opp = dev_pm_opp_find_freq_ceil(dev, &freq);
>         if (IS_ERR(opp)) {
> -               rcu_read_unlock();
>                 pr_err("%s: unable to find boot up OPP for vdd_%s\n",
>                         __func__, vdd_name);
>                 goto exit;
>         }
>
>         bootup_volt = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
> +
>         if (!bootup_volt) {
>                 pr_err("%s: unable to find voltage corresponding to the bootup OPP for vdd_%s\n",
>                        __func__, vdd_name);
> diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
> index 9870ee54d708..a6efa818029a 100644
> --- a/drivers/base/power/opp/core.c
> +++ b/drivers/base/power/opp/core.c
> @@ -40,6 +40,8 @@ do {                                                                  \
>                          "opp_table_lock protection");                  \
>  } while (0)
>
> +static void dev_pm_opp_get(struct dev_pm_opp *opp);
> +
>  static struct opp_device *_find_opp_dev(const struct device *dev,
>                                         struct opp_table *opp_table)
>  {
> @@ -94,21 +96,13 @@ struct opp_table *_find_opp_table(struct device *dev)
>   * return 0
>   *
>   * This is useful only for devices with single power supply.
> - *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. This means that opp which could have been fetched by
> - * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
> - * under RCU lock. The pointer returned by the opp_find_freq family must be
> - * used in the same section as the usage of this function with the pointer
> - * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
> - * pointer.
>   */
>  unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
>  {
>         struct dev_pm_opp *tmp_opp;
>         unsigned long v = 0;
>
> -       opp_rcu_lockdep_assert();
> +       rcu_read_lock();
>
>         tmp_opp = rcu_dereference(opp);
>         if (IS_ERR_OR_NULL(tmp_opp))
> @@ -116,6 +110,7 @@ unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp)
>         else
>                 v = tmp_opp->supplies[0].u_volt;
>
> +       rcu_read_unlock();
>         return v;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
> @@ -126,21 +121,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_voltage);
>   *
>   * Return: frequency in hertz corresponding to the opp, else
>   * return 0
> - *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. This means that opp which could have been fetched by
> - * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
> - * under RCU lock. The pointer returned by the opp_find_freq family must be
> - * used in the same section as the usage of this function with the pointer
> - * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
> - * pointer.
>   */
>  unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
>  {
>         struct dev_pm_opp *tmp_opp;
>         unsigned long f = 0;
>
> -       opp_rcu_lockdep_assert();
> +       rcu_read_lock();
>
>         tmp_opp = rcu_dereference(opp);
>         if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available)
> @@ -148,6 +135,7 @@ unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
>         else
>                 f = tmp_opp->rate;
>
> +       rcu_read_unlock();
>         return f;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
> @@ -161,20 +149,13 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
>   * quickly. Running on them for longer times may overheat the chip.
>   *
>   * Return: true if opp is turbo opp, else false.
> - *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. This means that opp which could have been fetched by
> - * opp_find_freq_{exact,ceil,floor} functions is valid as long as we are
> - * under RCU lock. The pointer returned by the opp_find_freq family must be
> - * used in the same section as the usage of this function with the pointer
> - * prior to unlocking with rcu_read_unlock() to maintain the integrity of the
> - * pointer.
>   */
>  bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
>  {
>         struct dev_pm_opp *tmp_opp;
> +       bool turbo;
>
> -       opp_rcu_lockdep_assert();
> +       rcu_read_lock();
>
>         tmp_opp = rcu_dereference(opp);
>         if (IS_ERR_OR_NULL(tmp_opp) || !tmp_opp->available) {
> @@ -182,7 +163,10 @@ bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
>                 return false;
>         }
>
> -       return tmp_opp->turbo;
> +       turbo = tmp_opp->turbo;
> +
> +       rcu_read_unlock();
> +       return turbo;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_is_turbo);
>
> @@ -410,11 +394,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_opp_count);
>   * This provides a mechanism to enable an opp which is not available currently
>   * or the opposite as well.
>   *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. The reason for the same is that the opp pointer which is
> - * returned will remain valid for use with opp_get_{voltage, freq} only while
> - * under the locked area. The pointer returned must be used prior to unlocking
> - * with rcu_read_unlock() to maintain the integrity of the pointer.
> + * The callers are required to call dev_pm_opp_put() for the returned OPP after
> + * use.
>   */
>  struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
>                                               unsigned long freq,
> @@ -423,13 +404,14 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
>         struct opp_table *opp_table;
>         struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
>
> -       opp_rcu_lockdep_assert();
> +       rcu_read_lock();
>
>         opp_table = _find_opp_table(dev);
>         if (IS_ERR(opp_table)) {
>                 int r = PTR_ERR(opp_table);
>
>                 dev_err(dev, "%s: OPP table not found (%d)\n", __func__, r);
> +               rcu_read_unlock();
>                 return ERR_PTR(r);
>         }
>
> @@ -437,10 +419,15 @@ struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
>                 if (temp_opp->available == available &&
>                                 temp_opp->rate == freq) {
>                         opp = temp_opp;
> +
> +                       /* Increment the reference count of OPP */
> +                       dev_pm_opp_get(opp);
>                         break;
>                 }
>         }
>
> +       rcu_read_unlock();
> +
>         return opp;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_exact);
> @@ -454,6 +441,9 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
>                 if (temp_opp->available && temp_opp->rate >= *freq) {
>                         opp = temp_opp;
>                         *freq = opp->rate;
> +
> +                       /* Increment the reference count of OPP */
> +                       dev_pm_opp_get(opp);
>                         break;
>                 }
>         }
> @@ -476,29 +466,33 @@ static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
>   * ERANGE:     no match found for search
>   * ENODEV:     if device not found in list of registered devices
>   *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. The reason for the same is that the opp pointer which is
> - * returned will remain valid for use with opp_get_{voltage, freq} only while
> - * under the locked area. The pointer returned must be used prior to unlocking
> - * with rcu_read_unlock() to maintain the integrity of the pointer.
> + * The callers are required to call dev_pm_opp_put() for the returned OPP after
> + * use.
>   */
>  struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev,
>                                              unsigned long *freq)
>  {
>         struct opp_table *opp_table;
> -
> -       opp_rcu_lockdep_assert();
> +       struct dev_pm_opp *opp;
>
>         if (!dev || !freq) {
>                 dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
>                 return ERR_PTR(-EINVAL);
>         }
>
> +       rcu_read_lock();
> +
>         opp_table = _find_opp_table(dev);
> -       if (IS_ERR(opp_table))
> +       if (IS_ERR(opp_table)) {
> +               rcu_read_unlock();
>                 return ERR_CAST(opp_table);
> +       }
> +
> +       opp = _find_freq_ceil(opp_table, freq);
>
> -       return _find_freq_ceil(opp_table, freq);
> +       rcu_read_unlock();
> +
> +       return opp;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
>
> @@ -517,11 +511,8 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_ceil);
>   * ERANGE:     no match found for search
>   * ENODEV:     if device not found in list of registered devices
>   *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. The reason for the same is that the opp pointer which is
> - * returned will remain valid for use with opp_get_{voltage, freq} only while
> - * under the locked area. The pointer returned must be used prior to unlocking
> - * with rcu_read_unlock() to maintain the integrity of the pointer.
> + * The callers are required to call dev_pm_opp_put() for the returned OPP after
> + * use.
>   */
>  struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
>                                               unsigned long *freq)
> @@ -529,16 +520,18 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
>         struct opp_table *opp_table;
>         struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
>
> -       opp_rcu_lockdep_assert();
> -
>         if (!dev || !freq) {
>                 dev_err(dev, "%s: Invalid argument freq=%p\n", __func__, freq);
>                 return ERR_PTR(-EINVAL);
>         }
>
> +       rcu_read_lock();
> +
>         opp_table = _find_opp_table(dev);
> -       if (IS_ERR(opp_table))
> +       if (IS_ERR(opp_table)) {
> +               rcu_read_unlock();
>                 return ERR_CAST(opp_table);
> +       }
>
>         list_for_each_entry_rcu(temp_opp, &opp_table->opp_list, node) {
>                 if (temp_opp->available) {
> @@ -549,6 +542,12 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
>                                 opp = temp_opp;
>                 }
>         }
> +
> +       /* Increment the reference count of OPP */
> +       if (!IS_ERR(opp))
> +               dev_pm_opp_get(opp);
> +       rcu_read_unlock();
> +
>         if (!IS_ERR(opp))
>                 *freq = opp->rate;
>
> @@ -736,6 +735,8 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>                 ret = PTR_ERR(opp);
>                 dev_err(dev, "%s: failed to find OPP for freq %lu (%d)\n",
>                         __func__, freq, ret);
> +               if (!IS_ERR(old_opp))
> +                       dev_pm_opp_put(old_opp);
>                 rcu_read_unlock();
>                 return ret;
>         }
> @@ -747,6 +748,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>
>         /* Only frequency scaling */
>         if (!regulators) {
> +               dev_pm_opp_put(opp);
> +               if (!IS_ERR(old_opp))
> +                       dev_pm_opp_put(old_opp);
>                 rcu_read_unlock();
>                 return _generic_set_opp_clk_only(dev, clk, old_freq, freq);
>         }
> @@ -772,6 +776,9 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>         data->new_opp.rate = freq;
>         memcpy(data->new_opp.supplies, opp->supplies, size);
>
> +       dev_pm_opp_put(opp);
> +       if (!IS_ERR(old_opp))
> +               dev_pm_opp_put(old_opp);
>         rcu_read_unlock();
>
>         return set_opp(data);
> @@ -967,6 +974,11 @@ static void _opp_kref_release(struct kref *kref)
>         dev_pm_opp_put_opp_table(opp_table);
>  }
>
> +static void dev_pm_opp_get(struct dev_pm_opp *opp)
> +{
> +       kref_get(&opp->kref);
> +}
> +
>  void dev_pm_opp_put(struct dev_pm_opp *opp)
>  {
>         kref_put_mutex(&opp->kref, _opp_kref_release, &opp->opp_table->lock);
> diff --git a/drivers/base/power/opp/cpu.c b/drivers/base/power/opp/cpu.c
> index 8c3434bdb26d..adef788862d5 100644
> --- a/drivers/base/power/opp/cpu.c
> +++ b/drivers/base/power/opp/cpu.c
> @@ -42,11 +42,6 @@
>   *
>   * WARNING: It is  important for the callers to ensure refreshing their copy of
>   * the table if any of the mentioned functions have been invoked in the interim.
> - *
> - * Locking: The internal opp_table and opp structures are RCU protected.
> - * Since we just use the regular accessor functions to access the internal data
> - * structures, we use RCU read lock inside this function. As a result, users of
> - * this function DONOT need to use explicit locks for invoking.
>   */
>  int dev_pm_opp_init_cpufreq_table(struct device *dev,
>                                   struct cpufreq_frequency_table **table)
> @@ -56,19 +51,13 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
>         int i, max_opps, ret = 0;
>         unsigned long rate;
>
> -       rcu_read_lock();
> -
>         max_opps = dev_pm_opp_get_opp_count(dev);
> -       if (max_opps <= 0) {
> -               ret = max_opps ? max_opps : -ENODATA;
> -               goto out;
> -       }
> +       if (max_opps <= 0)
> +               return max_opps ? max_opps : -ENODATA;
>
>         freq_table = kcalloc((max_opps + 1), sizeof(*freq_table), GFP_ATOMIC);
> -       if (!freq_table) {
> -               ret = -ENOMEM;
> -               goto out;
> -       }
> +       if (!freq_table)
> +               return -ENOMEM;
>
>         for (i = 0, rate = 0; i < max_opps; i++, rate++) {
>                 /* find next rate */
> @@ -83,6 +72,8 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
>                 /* Is Boost/turbo opp ? */
>                 if (dev_pm_opp_is_turbo(opp))
>                         freq_table[i].flags = CPUFREQ_BOOST_FREQ;
> +
> +               dev_pm_opp_put(opp);
>         }
>
>         freq_table[i].driver_data = i;
> @@ -91,7 +82,6 @@ int dev_pm_opp_init_cpufreq_table(struct device *dev,
>         *table = &freq_table[0];
>
>  out:
> -       rcu_read_unlock();
>         if (ret)
>                 kfree(freq_table);
>
> diff --git a/drivers/clk/tegra/clk-dfll.c b/drivers/clk/tegra/clk-dfll.c
> index f010562534eb..2c44aeb0b97c 100644
> --- a/drivers/clk/tegra/clk-dfll.c
> +++ b/drivers/clk/tegra/clk-dfll.c
> @@ -633,16 +633,12 @@ static int find_lut_index_for_rate(struct tegra_dfll *td, unsigned long rate)
>         struct dev_pm_opp *opp;
>         int i, uv;
>
> -       rcu_read_lock();
> -
>         opp = dev_pm_opp_find_freq_ceil(td->soc->dev, &rate);
> -       if (IS_ERR(opp)) {
> -               rcu_read_unlock();
> +       if (IS_ERR(opp))
>                 return PTR_ERR(opp);
> -       }
> -       uv = dev_pm_opp_get_voltage(opp);
>
> -       rcu_read_unlock();
> +       uv = dev_pm_opp_get_voltage(opp);
> +       dev_pm_opp_put(opp);
>
>         for (i = 0; i < td->i2c_lut_size; i++) {
>                 if (regulator_list_voltage(td->vdd_reg, td->i2c_lut[i]) == uv)
> @@ -1440,8 +1436,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
>         struct dev_pm_opp *opp;
>         int lut;
>
> -       rcu_read_lock();
> -
>         rate = ULONG_MAX;
>         opp = dev_pm_opp_find_freq_floor(td->soc->dev, &rate);
>         if (IS_ERR(opp)) {
> @@ -1449,6 +1443,7 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
>                 goto out;
>         }
>         v_max = dev_pm_opp_get_voltage(opp);
> +       dev_pm_opp_put(opp);
>
>         v = td->soc->cvb->min_millivolts * 1000;
>         lut = find_vdd_map_entry_exact(td, v);
> @@ -1465,6 +1460,8 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
>                 if (v_opp <= td->soc->cvb->min_millivolts * 1000)
>                         td->dvco_rate_min = dev_pm_opp_get_freq(opp);
>
> +               dev_pm_opp_put(opp);
> +
>                 for (;;) {
>                         v += max(1, (v_max - v) / (MAX_DFLL_VOLTAGES - j));
>                         if (v >= v_opp)
> @@ -1496,8 +1493,6 @@ static int dfll_build_i2c_lut(struct tegra_dfll *td)
>                 ret = 0;
>
>  out:
> -       rcu_read_unlock();
> -
>         return ret;
>  }
>
> diff --git a/drivers/cpufreq/exynos5440-cpufreq.c b/drivers/cpufreq/exynos5440-cpufreq.c
> index c0f3373706f4..9180d34cc9fc 100644
> --- a/drivers/cpufreq/exynos5440-cpufreq.c
> +++ b/drivers/cpufreq/exynos5440-cpufreq.c
> @@ -118,12 +118,10 @@ static int init_div_table(void)
>         unsigned int tmp, clk_div, ema_div, freq, volt_id;
>         struct dev_pm_opp *opp;
>
> -       rcu_read_lock();
>         cpufreq_for_each_entry(pos, freq_tbl) {
>                 opp = dev_pm_opp_find_freq_exact(dvfs_info->dev,
>                                         pos->frequency * 1000, true);
>                 if (IS_ERR(opp)) {
> -                       rcu_read_unlock();
>                         dev_err(dvfs_info->dev,
>                                 "failed to find valid OPP for %u KHZ\n",
>                                 pos->frequency);
> @@ -140,6 +138,7 @@ static int init_div_table(void)
>
>                 /* Calculate EMA */
>                 volt_id = dev_pm_opp_get_voltage(opp);
> +
>                 volt_id = (MAX_VOLTAGE - volt_id) / VOLTAGE_STEP;
>                 if (volt_id < PMIC_HIGH_VOLT) {
>                         ema_div = (CPUEMA_HIGH << P0_7_CPUEMA_SHIFT) |
> @@ -157,9 +156,9 @@ static int init_div_table(void)
>
>                 __raw_writel(tmp, dvfs_info->base + XMU_PMU_P0_7 + 4 *
>                                                 (pos - freq_tbl));
> +               dev_pm_opp_put(opp);
>         }
>
> -       rcu_read_unlock();
>         return 0;
>  }
>
> diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
> index ef1fa8145419..7719b02e04f5 100644
> --- a/drivers/cpufreq/imx6q-cpufreq.c
> +++ b/drivers/cpufreq/imx6q-cpufreq.c
> @@ -53,16 +53,15 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
>         freq_hz = new_freq * 1000;
>         old_freq = clk_get_rate(arm_clk) / 1000;
>
> -       rcu_read_lock();
>         opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
>         if (IS_ERR(opp)) {
> -               rcu_read_unlock();
>                 dev_err(cpu_dev, "failed to find OPP for %ld\n", freq_hz);
>                 return PTR_ERR(opp);
>         }
>
>         volt = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
> +
>         volt_old = regulator_get_voltage(arm_reg);
>
>         dev_dbg(cpu_dev, "%u MHz, %ld mV --> %u MHz, %ld mV\n",
> @@ -321,14 +320,15 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
>          * freq_table initialised from OPP is therefore sorted in the
>          * same order.
>          */
> -       rcu_read_lock();
>         opp = dev_pm_opp_find_freq_exact(cpu_dev,
>                                   freq_table[0].frequency * 1000, true);
>         min_volt = dev_pm_opp_get_voltage(opp);
> +       dev_pm_opp_put(opp);
>         opp = dev_pm_opp_find_freq_exact(cpu_dev,
>                                   freq_table[--num].frequency * 1000, true);
>         max_volt = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
> +
>         ret = regulator_set_voltage_time(arm_reg, min_volt, max_volt);
>         if (ret > 0)
>                 transition_latency += ret * 1000;
> diff --git a/drivers/cpufreq/mt8173-cpufreq.c b/drivers/cpufreq/mt8173-cpufreq.c
> index 643f43179df1..ab25b1235a5e 100644
> --- a/drivers/cpufreq/mt8173-cpufreq.c
> +++ b/drivers/cpufreq/mt8173-cpufreq.c
> @@ -232,16 +232,14 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
>
>         freq_hz = freq_table[index].frequency * 1000;
>
> -       rcu_read_lock();
>         opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
>         if (IS_ERR(opp)) {
> -               rcu_read_unlock();
>                 pr_err("cpu%d: failed to find OPP for %ld\n",
>                        policy->cpu, freq_hz);
>                 return PTR_ERR(opp);
>         }
>         vproc = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         /*
>          * If the new voltage or the intermediate voltage is higher than the
> @@ -411,16 +409,14 @@ static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
>
>         /* Search a safe voltage for intermediate frequency. */
>         rate = clk_get_rate(inter_clk);
> -       rcu_read_lock();
>         opp = dev_pm_opp_find_freq_ceil(cpu_dev, &rate);
>         if (IS_ERR(opp)) {
> -               rcu_read_unlock();
>                 pr_err("failed to get intermediate opp for cpu%d\n", cpu);
>                 ret = PTR_ERR(opp);
>                 goto out_free_opp_table;
>         }
>         info->intermediate_voltage = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         info->cpu_dev = cpu_dev;
>         info->proc_reg = proc_reg;
> diff --git a/drivers/cpufreq/omap-cpufreq.c b/drivers/cpufreq/omap-cpufreq.c
> index 376e63ca94e8..71e81bbf031b 100644
> --- a/drivers/cpufreq/omap-cpufreq.c
> +++ b/drivers/cpufreq/omap-cpufreq.c
> @@ -63,16 +63,14 @@ static int omap_target(struct cpufreq_policy *policy, unsigned int index)
>         freq = ret;
>
>         if (mpu_reg) {
> -               rcu_read_lock();
>                 opp = dev_pm_opp_find_freq_ceil(mpu_dev, &freq);
>                 if (IS_ERR(opp)) {
> -                       rcu_read_unlock();
>                         dev_err(mpu_dev, "%s: unable to find MPU OPP for %d\n",
>                                 __func__, new_freq);
>                         return -EINVAL;
>                 }
>                 volt = dev_pm_opp_get_voltage(opp);
> -               rcu_read_unlock();
> +               dev_pm_opp_put(opp);
>                 tol = volt * OPP_TOLERANCE / 100;
>                 volt_old = regulator_get_voltage(mpu_reg);
>         }
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index b0de42972b74..378f12a51496 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -111,18 +111,16 @@ static void devfreq_set_freq_table(struct devfreq *devfreq)
>                 return;
>         }
>
> -       rcu_read_lock();
>         for (i = 0, freq = 0; i < profile->max_state; i++, freq++) {
>                 opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, &freq);
>                 if (IS_ERR(opp)) {
>                         devm_kfree(devfreq->dev.parent, profile->freq_table);
>                         profile->max_state = 0;
> -                       rcu_read_unlock();
>                         return;
>                 }
> +               dev_pm_opp_put(opp);
>                 profile->freq_table[i] = freq;
>         }
> -       rcu_read_unlock();
>  }
>
>  /**
> @@ -1107,17 +1105,16 @@ static ssize_t available_frequencies_show(struct device *d,
>         ssize_t count = 0;
>         unsigned long freq = 0;
>
> -       rcu_read_lock();
>         do {
>                 opp = dev_pm_opp_find_freq_ceil(dev, &freq);
>                 if (IS_ERR(opp))
>                         break;
>
> +               dev_pm_opp_put(opp);
>                 count += scnprintf(&buf[count], (PAGE_SIZE - count - 2),
>                                    "%lu ", freq);
>                 freq++;
>         } while (1);
> -       rcu_read_unlock();
>
>         /* Truncate the trailing space */
>         if (count)
> @@ -1219,11 +1216,8 @@ subsys_initcall(devfreq_init);
>   * @freq:      The frequency given to target function
>   * @flags:     Flags handed from devfreq framework.
>   *
> - * Locking: This function must be called under rcu_read_lock(). opp is a rcu
> - * protected pointer. The reason for the same is that the opp pointer which is
> - * returned will remain valid for use with opp_get_{voltage, freq} only while
> - * under the locked area. The pointer returned must be used prior to unlocking
> - * with rcu_read_unlock() to maintain the integrity of the pointer.
> + * The callers are required to call dev_pm_opp_put() for the returned OPP after
> + * use.
>   */
>  struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
>                                            unsigned long *freq,
> diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c
> index a8ed7792ece2..49ce38cef460 100644
> --- a/drivers/devfreq/exynos-bus.c
> +++ b/drivers/devfreq/exynos-bus.c
> @@ -103,18 +103,17 @@ static int exynos_bus_target(struct device *dev, unsigned long *freq, u32 flags)
>         int ret = 0;
>
>         /* Get new opp-bus instance according to new bus clock */
> -       rcu_read_lock();
>         new_opp = devfreq_recommended_opp(dev, freq, flags);
>         if (IS_ERR(new_opp)) {
>                 dev_err(dev, "failed to get recommended opp instance\n");
> -               rcu_read_unlock();
>                 return PTR_ERR(new_opp);
>         }
>
>         new_freq = dev_pm_opp_get_freq(new_opp);
>         new_volt = dev_pm_opp_get_voltage(new_opp);
> +       dev_pm_opp_put(new_opp);
> +
>         old_freq = bus->curr_freq;
> -       rcu_read_unlock();
>
>         if (old_freq == new_freq)
>                 return 0;
> @@ -214,17 +213,16 @@ static int exynos_bus_passive_target(struct device *dev, unsigned long *freq,
>         int ret = 0;
>
>         /* Get new opp-bus instance according to new bus clock */
> -       rcu_read_lock();
>         new_opp = devfreq_recommended_opp(dev, freq, flags);
>         if (IS_ERR(new_opp)) {
>                 dev_err(dev, "failed to get recommended opp instance\n");
> -               rcu_read_unlock();
>                 return PTR_ERR(new_opp);
>         }
>
>         new_freq = dev_pm_opp_get_freq(new_opp);
> +       dev_pm_opp_put(new_opp);
> +
>         old_freq = bus->curr_freq;
> -       rcu_read_unlock();
>
>         if (old_freq == new_freq)
>                 return 0;
> @@ -358,16 +356,14 @@ static int exynos_bus_parse_of(struct device_node *np,
>
>         rate = clk_get_rate(bus->clk);
>
> -       rcu_read_lock();
>         opp = devfreq_recommended_opp(dev, &rate, 0);
>         if (IS_ERR(opp)) {
>                 dev_err(dev, "failed to find dev_pm_opp\n");
> -               rcu_read_unlock();
>                 ret = PTR_ERR(opp);
>                 goto err_opp;
>         }
>         bus->curr_freq = dev_pm_opp_get_freq(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         return 0;
>
> diff --git a/drivers/devfreq/governor_passive.c b/drivers/devfreq/governor_passive.c
> index 9ef46e2592c4..bd452236dba4 100644
> --- a/drivers/devfreq/governor_passive.c
> +++ b/drivers/devfreq/governor_passive.c
> @@ -59,14 +59,14 @@ static int devfreq_passive_get_target_freq(struct devfreq *devfreq,
>          * list of parent device. Because in this case, *freq is temporary
>          * value which is decided by ondemand governor.
>          */
> -       rcu_read_lock();
>         opp = devfreq_recommended_opp(parent_devfreq->dev.parent, freq, 0);
> -       rcu_read_unlock();
>         if (IS_ERR(opp)) {
>                 ret = PTR_ERR(opp);
>                 goto out;
>         }
>
> +       dev_pm_opp_put(opp);
> +
>         /*
>          * Get the OPP table's index of decided freqeuncy by governor
>          * of parent device.
> diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c
> index 27d2f349b53c..40a2499730fc 100644
> --- a/drivers/devfreq/rk3399_dmc.c
> +++ b/drivers/devfreq/rk3399_dmc.c
> @@ -91,17 +91,13 @@ static int rk3399_dmcfreq_target(struct device *dev, unsigned long *freq,
>         unsigned long target_volt, target_rate;
>         int err;
>
> -       rcu_read_lock();
>         opp = devfreq_recommended_opp(dev, freq, flags);
> -       if (IS_ERR(opp)) {
> -               rcu_read_unlock();
> +       if (IS_ERR(opp))
>                 return PTR_ERR(opp);
> -       }
>
>         target_rate = dev_pm_opp_get_freq(opp);
>         target_volt = dev_pm_opp_get_voltage(opp);
> -
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         if (dmcfreq->rate == target_rate)
>                 return 0;
> @@ -422,15 +418,13 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev)
>
>         data->rate = clk_get_rate(data->dmc_clk);
>
> -       rcu_read_lock();
>         opp = devfreq_recommended_opp(dev, &data->rate, 0);
> -       if (IS_ERR(opp)) {
> -               rcu_read_unlock();
> +       if (IS_ERR(opp))
>                 return PTR_ERR(opp);
> -       }
> +
>         data->rate = dev_pm_opp_get_freq(opp);
>         data->volt = dev_pm_opp_get_voltage(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         rk3399_devfreq_dmc_profile.initial_freq = data->rate;
>
> diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
> index fe9dce0245bf..214fff96fa4a 100644
> --- a/drivers/devfreq/tegra-devfreq.c
> +++ b/drivers/devfreq/tegra-devfreq.c
> @@ -487,15 +487,13 @@ static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
>         struct dev_pm_opp *opp;
>         unsigned long rate = *freq * KHZ;
>
> -       rcu_read_lock();
>         opp = devfreq_recommended_opp(dev, &rate, flags);
>         if (IS_ERR(opp)) {
> -               rcu_read_unlock();
>                 dev_err(dev, "Failed to find opp for %lu KHz\n", *freq);
>                 return PTR_ERR(opp);
>         }
>         rate = dev_pm_opp_get_freq(opp);
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         clk_set_min_rate(tegra->emc_clock, rate);
>         clk_set_rate(tegra->emc_clock, 0);
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index 9ce0e9eef923..85fdbf762fa0 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -297,8 +297,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
>         if (!power_table)
>                 return -ENOMEM;
>
> -       rcu_read_lock();
> -
>         for (freq = 0, i = 0;
>              opp = dev_pm_opp_find_freq_ceil(dev, &freq), !IS_ERR(opp);
>              freq++, i++) {
> @@ -306,13 +304,13 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
>                 u64 power;
>
>                 if (i >= num_opps) {
> -                       rcu_read_unlock();
>                         ret = -EAGAIN;
>                         goto free_power_table;
>                 }
>
>                 freq_mhz = freq / 1000000;
>                 voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
> +               dev_pm_opp_put(opp);
>
>                 /*
>                  * Do the multiplication with MHz and millivolt so as
> @@ -328,8 +326,6 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
>                 power_table[i].power = power;
>         }
>
> -       rcu_read_unlock();
> -
>         if (i != num_opps) {
>                 ret = PTR_ERR(opp);
>                 goto free_power_table;
> @@ -433,13 +429,10 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_device,
>                 return 0;
>         }
>
> -       rcu_read_lock();
> -
>         opp = dev_pm_opp_find_freq_exact(cpufreq_device->cpu_dev, freq_hz,
>                                          true);
>         voltage = dev_pm_opp_get_voltage(opp);
> -
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         if (voltage == 0) {
>                 dev_warn_ratelimited(cpufreq_device->cpu_dev,
> diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
> index 81631b110e17..abe8ad76bd8b 100644
> --- a/drivers/thermal/devfreq_cooling.c
> +++ b/drivers/thermal/devfreq_cooling.c
> @@ -113,15 +113,15 @@ static int partition_enable_opps(struct devfreq_cooling_device *dfc,
>                 unsigned int freq = dfc->freq_table[i];
>                 bool want_enable = i >= cdev_state ? true : false;
>
> -               rcu_read_lock();
>                 opp = dev_pm_opp_find_freq_exact(dev, freq, !want_enable);
> -               rcu_read_unlock();
>
>                 if (PTR_ERR(opp) == -ERANGE)
>                         continue;
>                 else if (IS_ERR(opp))
>                         return PTR_ERR(opp);
>
> +               dev_pm_opp_put(opp);
> +
>                 if (want_enable)
>                         ret = dev_pm_opp_enable(dev, freq);
>                 else
> @@ -221,15 +221,12 @@ get_static_power(struct devfreq_cooling_device *dfc, unsigned long freq)
>         if (!dfc->power_ops->get_static_power)
>                 return 0;
>
> -       rcu_read_lock();
> -
>         opp = dev_pm_opp_find_freq_exact(dev, freq, true);
>         if (IS_ERR(opp) && (PTR_ERR(opp) == -ERANGE))
>                 opp = dev_pm_opp_find_freq_exact(dev, freq, false);
>
>         voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
> -
> -       rcu_read_unlock();
> +       dev_pm_opp_put(opp);
>
>         if (voltage == 0) {
>                 dev_warn_ratelimited(dev,
> @@ -411,18 +408,14 @@ static int devfreq_cooling_gen_tables(struct devfreq_cooling_device *dfc)
>                 unsigned long power_dyn, voltage;
>                 struct dev_pm_opp *opp;
>
> -               rcu_read_lock();
> -
>                 opp = dev_pm_opp_find_freq_floor(dev, &freq);
>                 if (IS_ERR(opp)) {
> -                       rcu_read_unlock();
>                         ret = PTR_ERR(opp);
>                         goto free_tables;
>                 }
>
>                 voltage = dev_pm_opp_get_voltage(opp) / 1000; /* mV */
> -
> -               rcu_read_unlock();
> +               dev_pm_opp_put(opp);
>
>                 if (dfc->power_ops) {
>                         power_dyn = get_dynamic_power(dfc, freq, voltage);



-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


end of thread, other threads:[~2017-01-21  7:42 UTC | newest]

Thread overview: 41+ messages
2016-12-07 10:37 [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Viresh Kumar
2016-12-07 10:37 ` [PATCH 01/12] PM / OPP: Add per OPP table mutex Viresh Kumar
2017-01-09 23:11   ` Stephen Boyd
2016-12-07 10:37 ` [PATCH 02/12] PM / OPP: Add 'struct kref' to OPP table Viresh Kumar
2017-01-09 23:36   ` Stephen Boyd
2017-01-10  4:23     ` Viresh Kumar
2017-01-13  8:54       ` Stephen Boyd
2016-12-07 10:37 ` [PATCH 03/12] PM / OPP: Return opp_table from dev_pm_opp_set_*() routines Viresh Kumar
2017-01-09 23:37   ` Stephen Boyd
2016-12-07 10:37 ` [PATCH 04/12] PM / OPP: Take reference of the OPP table while adding/removing OPPs Viresh Kumar
2017-01-09 23:38   ` Stephen Boyd
2016-12-07 10:37 ` [PATCH 05/12] PM / OPP: Use dev_pm_opp_get_opp_table() instead of _add_opp_table() Viresh Kumar
2017-01-09 23:43   ` Stephen Boyd
2016-12-07 10:37 ` [PATCH 06/12] PM / OPP: Add 'struct kref' to struct dev_pm_opp Viresh Kumar
2017-01-09 23:44   ` Stephen Boyd
2017-01-10  4:26     ` Viresh Kumar
2017-01-13  8:52       ` Stephen Boyd
2017-01-13  8:56         ` Viresh Kumar
2017-01-19 20:01           ` Stephen Boyd
2016-12-07 10:37 ` [PATCH 07/12] PM / OPP: Update OPP users to put reference Viresh Kumar
2016-12-07 10:37   ` Viresh Kumar
2016-12-07 13:23   ` Chanwoo Choi
2016-12-07 13:23     ` Chanwoo Choi
2016-12-08  4:00     ` Viresh Kumar
2016-12-08  4:00       ` Viresh Kumar
2017-01-21  7:42       ` Chanwoo Choi
2017-01-21  7:42         ` Chanwoo Choi
2016-12-07 10:37 ` [PATCH 08/12] PM / OPP: Take kref from _find_opp_table() Viresh Kumar
2017-01-09 23:49   ` Stephen Boyd
2016-12-07 10:37 ` [PATCH 09/12] PM / OPP: Move away from RCU locking Viresh Kumar
2017-01-09 23:57   ` Stephen Boyd
2017-01-10  4:28     ` Viresh Kumar
2016-12-07 10:37 ` [PATCH 10/12] PM / OPP: Simplify _opp_set_availability() Viresh Kumar
2017-01-10  0:00   ` Stephen Boyd
2016-12-07 10:37 ` [PATCH 11/12] PM / OPP: Simplify dev_pm_opp_get_max_volt_latency() Viresh Kumar
2017-01-09 22:40   ` Stephen Boyd
2016-12-07 10:37 ` [PATCH 12/12] PM / OPP: Update Documentation to remove RCU specific bits Viresh Kumar
2017-01-09 22:39   ` Stephen Boyd
2017-01-10  4:39     ` Viresh Kumar
2017-01-13  8:44       ` Stephen Boyd
2016-12-07 23:14 ` [PATCH 00/12] PM / OPP: Use kref and move away from RCU locking Rafael J. Wysocki
