All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] clk: readd refcounting for struct clk instances
@ 2015-09-22 23:55 Heiko Stübner
  2015-09-28 18:46 ` Stephen Boyd
  0 siblings, 1 reply; 4+ messages in thread
From: Heiko Stübner @ 2015-09-22 23:55 UTC (permalink / raw)
  To: mturquette, sboyd, Kevin Hilman, Rafael J. Wysocki, Pavel Machek
  Cc: Caesar Wang, linux-clk, linux-kernel, linux-pm

With the split into struct clk and struct clk_core, clocks lost the
ability for nested __clk_get clkdev calls. While it stays possible to
call __clk_get, the first call to (__)clk_put will clear the struct clk,
making subsequent clk_put calls run into a NULL pointer dereference.

One prime example of this sits in the generic power domain code, where it
is possible to add the clocks both by name and by passing in a struct clk
via pm_clk_add_clk(). __pm_clk_add() in turn then calls __clk_get to
increase the refcount, so that the original code can put the clock again.

A possible call-path looks like
clk = of_clk_get();
pm_clk_add_clk(dev, clk);
clk_put(clk);

with pm_clk_add_clk() => __pm_clk_add() then calling __clk_get on the clk
and later clk_put when the pm clock list gets destroyed, thus creating
a NULL pointer deref, as the struct clk doesn't exist anymore.

So add a separate refcounting for struct clk instances and only clean up
once the refcount reaches zero. This makes it possible again to hand off
a clock reference to common code without needing to track it further.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>
---

While it may be nice to do the actual handling of the clock references
only in the calling code, in this current use case it would create
a big additional overhead.

It looks like this so called synchronous reset on power-domain state-
changes, requiring device clocks to be turned on, is not that uncommon
or rockchip-specific.
For this Kevin requested that we read the clocks from the actual consumer
devices and not double-list them in the power-domain node as well.

So when expecting pm_clk_add_clk() to work, the current powerdomain code
can simply do when adding a device to a domain in rockchip_pd_attach_dev():
    while ((clk = of_clk_get(dev->of_node, i++)) && !IS_ERR(clk)) {
	dev_dbg(dev, "adding clock '%pC' to list of PM clocks\n", clk);
	error = pm_clk_add_clk(dev, clk);
	clk_put(clk);
    }

The clock gets handed off to the generic pm clock handling and thus
clk_put in there.


On the other hand when only the rockchip power-domain code is expected
to get and put the clock, we would require a lot of new overhead, as now
the code would also need to track which devices got added to what
domain and also all clock-references until the device gets detached
again. So this would essentially duplicate a big part of what the
genpd-code does (per-domain device-list and per-device clock-list).

As this seems to be not uncommon, future powerdomain drivers
might need that too and would also need to duplicate that handling.

When allowing multiple __clk_get and __clk_put calls on the other
hand, the overhead for the regular case comes down to one atomic_inc,
atomic_sub_and_test and the function call to the new separate release
function ;-) .


changes in v2: removed double parentheses found by Stephen Boyd

 drivers/clk/clk.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 43e2c3a..7aab1a4 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -85,6 +85,7 @@ struct clk {
 	unsigned long min_rate;
 	unsigned long max_rate;
 	struct hlist_node clks_node;
+	struct kref ref;
 };
 
 /***           locking             ***/
@@ -2590,7 +2591,7 @@ fail_out:
 EXPORT_SYMBOL_GPL(clk_register);
 
 /* Free memory allocated for a clock. */
-static void __clk_release(struct kref *ref)
+static void __clk_core_release(struct kref *ref)
 {
 	struct clk_core *core = container_of(ref, struct clk_core, ref);
 	int i = core->num_parents;
@@ -2606,6 +2607,18 @@ static void __clk_release(struct kref *ref)
 	kfree(core);
 }
 
+static void __clk_release(struct kref *ref)
+{
+	struct clk *clk = container_of(ref, struct clk, ref);
+
+	hlist_del(&clk->clks_node);
+	if (clk->min_rate > clk->core->req_rate ||
+	    clk->max_rate < clk->core->req_rate)
+		clk_core_set_rate_nolock(clk->core, clk->core->req_rate);
+
+	kfree(clk);
+}
+
 /*
  * Empty clk_ops for unregistered clocks. These are used temporarily
  * after clk_unregister() was called on a clock and until last clock
@@ -2684,7 +2697,7 @@ void clk_unregister(struct clk *clk)
 	if (clk->core->prepare_count)
 		pr_warn("%s: unregistering prepared clock: %s\n",
 					__func__, clk->core->name);
-	kref_put(&clk->core->ref, __clk_release);
+	kref_put(&clk->core->ref, __clk_core_release);
 
 	clk_prepare_unlock();
 }
@@ -2759,12 +2772,14 @@ int __clk_get(struct clk *clk)
 			return 0;
 
 		kref_get(&core->ref);
+		kref_get(&clk->ref);
 	}
 	return 1;
 }
 
 void __clk_put(struct clk *clk)
 {
+	struct clk_core *core;
 	struct module *owner;
 
 	if (!clk || WARN_ON_ONCE(IS_ERR(clk)))
@@ -2772,19 +2787,15 @@ void __clk_put(struct clk *clk)
 
 	clk_prepare_lock();
 
-	hlist_del(&clk->clks_node);
-	if (clk->min_rate > clk->core->req_rate ||
-	    clk->max_rate < clk->core->req_rate)
-		clk_core_set_rate_nolock(clk->core, clk->core->req_rate);
+	core = clk->core;
+	owner = core->owner;
 
-	owner = clk->core->owner;
-	kref_put(&clk->core->ref, __clk_release);
+	kref_put(&clk->ref, __clk_release);
+	kref_put(&core->ref, __clk_core_release);
 
 	clk_prepare_unlock();
 
 	module_put(owner);
-
-	kfree(clk);
 }
 
 /***        clk rate change notifiers        ***/
-- 
2.5.1



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] clk: readd refcounting for struct clk instances
  2015-09-22 23:55 [PATCH v2] clk: readd refcounting for struct clk instances Heiko Stübner
@ 2015-09-28 18:46 ` Stephen Boyd
  2015-09-28 19:36     ` Heiko Stübner
  0 siblings, 1 reply; 4+ messages in thread
From: Stephen Boyd @ 2015-09-28 18:46 UTC (permalink / raw)
  To: Heiko Stübner
  Cc: mturquette, Kevin Hilman, Rafael J. Wysocki, Pavel Machek,
	Caesar Wang, linux-clk, linux-kernel, linux-pm

On 09/23, Heiko Stübner wrote:
> ---
> 
> While it may be nice to do the actual handling of the clock references
> only in the calling code, in this current use case it would create
> a big additional overhead.
> 
> It looks like this so called synchronous reset on power-domain state-
> changes, requiring device clocks to be turned on, is not that uncommon
> or rockchip-specific.
> For this Kevin requested that we read the clocks from the actual consumer
> devices and not double-list them in the power-domain node as well.
> 
> So when expecting pm_clk_add_clk() to work, the current powerdomain code
> can simply do when adding a device to a domain in rockchip_pd_attach_dev():
>     while ((clk = of_clk_get(dev->of_node, i++)) && !IS_ERR(clk)) {
> 	dev_dbg(dev, "adding clock '%pC' to list of PM clocks\n", clk);
> 	error = pm_clk_add_clk(dev, clk);
> 	clk_put(clk);
>     }
> 
> The clock gets handed off to the generic pm clock handling and thus
> clk_put in there.
> 
> 
> On the other hand when only the rockchip power-domain code is expected
> to get and put the clock, we would require a lot of new overhead, as now
> the code would also need to track which devices got added to what
> domain and also all clock-references until the device gets detached
> again. So this would essentially duplicate a big part of what the
> genpd-code does (per-domain device-list and per-device clock-list).
> 
> As this seems to be not uncommon, future powerdomain drivers
> might need that too and would also need to duplicate that handling.
> 
> When allowing multiple __clk_get and __clk_put calls on the other
> hand, the overhead for the regular case comes down to one atomic_inc,
> atomic_sub_and_test and the function call to the new separate release
> function ;-) .
> 

Why are we doing of_clk_get(), pm_clk_add_clk(), and then
clk_put()?  Just drop that clk_put() in the caller and remove the
__clk_get() inside pm_clk_add_clk() and everything works the
same. This patch does most of that, except it doesn't handle the
error path where we would need to throw a clk_put().

Really, that snippet of code that loops over a device's clocks
and adds them to a pm domain is an example of duplicate code that
should go into some common layer like the PM clock stuff. Then we
don't have any kind of situation where we're passing struct clk
pointers off to other layers of code and this "problem" doesn't
exist.

----8<----
diff --git a/drivers/base/power/clock_ops.c b/drivers/base/power/clock_ops.c
index acef9f9f759a..529a03e8282c 100644
--- a/drivers/base/power/clock_ops.c
+++ b/drivers/base/power/clock_ops.c
@@ -95,7 +95,7 @@ static int __pm_clk_add(struct device *dev, const char *con_id,
                        return -ENOMEM;
                }
        } else {
-               if (IS_ERR(clk) || !__clk_get(clk)) {
+               if (IS_ERR(clk)) {
                        kfree(ce);
                        return -ENOENT;
                }
@@ -129,7 +129,7 @@ int pm_clk_add(struct device *dev, const char *con_id)
  * @clk: Clock pointer
  *
  * Add the clock to the list of clocks used for the power management of @dev.
- * It will increment refcount on clock pointer, use clk_put() on it when done.
+ * Callers should not call clk_put() on @clk after calling this function.
  */
 int pm_clk_add_clk(struct device *dev, struct clk *clk)
 {


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] clk: readd refcounting for struct clk instances
  2015-09-28 18:46 ` Stephen Boyd
@ 2015-09-28 19:36     ` Heiko Stübner
  0 siblings, 0 replies; 4+ messages in thread
From: Heiko Stübner @ 2015-09-28 19:36 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: mturquette, Kevin Hilman, Rafael J. Wysocki, Pavel Machek,
	Caesar Wang, linux-clk, linux-kernel, linux-pm

Am Montag, 28. September 2015, 11:46:40 schrieb Stephen Boyd:
> On 09/23, Heiko Stübner wrote:
> > ---
> > 
> > While it may be nice to do the actual handling of the clock references
> > only in the calling code, in this current use case it would create
> > a big additional overhead.
> > 
> > It looks like this so called synchronous reset on power-domain state-
> > changes, requiring device clocks to be turned on, is not that uncommon
> > or rockchip-specific.
> > For this Kevin requested that we read the clocks from the actual consumer
> > devices and not double-list them in the power-domain node as well.
> > 
> > So when expecting pm_clk_add_clk() to work, the current powerdomain code
> > 
> > can simply do when adding a device to a domain in 
rockchip_pd_attach_dev():
> >     while ((clk = of_clk_get(dev->of_node, i++)) && !IS_ERR(clk)) {
> > 	
> > 	dev_dbg(dev, "adding clock '%pC' to list of PM clocks\n", clk);
> > 	error = pm_clk_add_clk(dev, clk);
> > 	clk_put(clk);
> > 	
> >     }
> > 
> > The clock gets handed off to the generic pm clock handling and thus
> > clk_put in there.
> > 
> > 
> > On the other hand when only the rockchip power-domain code is expected
> > to get and put the clock, we would require a lot of new overhead, as now
> > the code would also need to track which devices got added to what
> > domain and also all clock-references until the device gets detached
> > again. So this would essentially duplicate a big part of what the
> > genpd-code does (per-domain device-list and per-device clock-list).
> > 
> > As this seems to be not uncommon, future powerdomain drivers
> > might need that too and would also need to duplicate that handling.
> > 
> > When allowing multiple __clk_get and __clk_put calls on the other
> > hand, the overhead for the regular case comes down to one atomic_inc,
> > atomic_sub_and_test and the function call to the new separate release
> > function ;-) .
> 
> Why are we doing of_clk_get(), pm_clk_add_clk(), and then
> clk_put()?  Just drop that clk_put() in the caller and remove the
> __clk_get() inside pm_clk_add_clk() and everything works the
> same. This patch does most of that, except it doesn't handle the
> error path where we would need to throw a clk_put().

I guess that was my try to keep all the gets and puts together, but also an 
instance of not seeing the forest for the trees. Looking at your solution 
below does look actually really cool.

Would still be nice if the genpd people would speak up on their preference. In 
the worst case I'll try hunting them down at ELCE next week :-)

I guess for the error path, one could just set the notion that if 
pm_clk_add_clk returns sucessfully it has taken over the clock reference 
completely and in the error case the caller should clean up?


> 
> Really, that snippet of code that loops over a device's clocks
> and adds them to a pm domain is an example of duplicate code that
> should go into some common layer like the PM clock stuff. Then we
> don't have any kind of situation where we're passing struct clk
> pointers off to other layers of code and this "problem" doesn't
> exist.
> 
> ----8<----
> diff --git a/drivers/base/power/clock_ops.c b/drivers/base/power/clock_ops.c
> index acef9f9f759a..529a03e8282c 100644
> --- a/drivers/base/power/clock_ops.c
> +++ b/drivers/base/power/clock_ops.c
> @@ -95,7 +95,7 @@ static int __pm_clk_add(struct device *dev, const char
> *con_id, return -ENOMEM;
>                 }
>         } else {
> -               if (IS_ERR(clk) || !__clk_get(clk)) {
> +               if (IS_ERR(clk)) {
>                         kfree(ce);
>                         return -ENOENT;
>                 }
> @@ -129,7 +129,7 @@ int pm_clk_add(struct device *dev, const char *con_id)
>   * @clk: Clock pointer
>   *
>   * Add the clock to the list of clocks used for the power management of
> @dev. - * It will increment refcount on clock pointer, use clk_put() on it
> when done. + * Callers should not call clk_put() on @clk after calling this
> function. */
>  int pm_clk_add_clk(struct device *dev, struct clk *clk)
>  {


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] clk: readd refcounting for struct clk instances
@ 2015-09-28 19:36     ` Heiko Stübner
  0 siblings, 0 replies; 4+ messages in thread
From: Heiko Stübner @ 2015-09-28 19:36 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: mturquette, Kevin Hilman, Rafael J. Wysocki, Pavel Machek,
	Caesar Wang, linux-clk, linux-kernel, linux-pm

Am Montag, 28. September 2015, 11:46:40 schrieb Stephen Boyd:
> On 09/23, Heiko St=FCbner wrote:
> > ---
> >=20
> > While it may be nice to do the actual handling of the clock referen=
ces
> > only in the calling code, in this current use case it would create
> > a big additional overhead.
> >=20
> > It looks like this so called synchronous reset on power-domain stat=
e-
> > changes, requiring device clocks to be turned on, is not that uncom=
mon
> > or rockchip-specific.
> > For this Kevin requested that we read the clocks from the actual co=
nsumer
> > devices and not double-list them in the power-domain node as well.
> >=20
> > So when expecting pm_clk_add_clk() to work, the current powerdomain=
 code
> >=20
> > can simply do when adding a device to a domain in=20
rockchip_pd_attach_dev():
> >     while ((clk =3D of_clk_get(dev->of_node, i++)) && !IS_ERR(clk))=
 {
> > =09
> > =09dev_dbg(dev, "adding clock '%pC' to list of PM clocks\n", clk);
> > =09error =3D pm_clk_add_clk(dev, clk);
> > =09clk_put(clk);
> > =09
> >     }
> >=20
> > The clock gets handed off to the generic pm clock handling and thus=

> > clk_put in there.
> >=20
> >=20
> > On the other hand when only the rockchip power-domain code is expec=
ted
> > to get and put the clock, we would require a lot of new overhead, a=
s now
> > the code would also need to track which devices got added to what
> > domain and also all clock-references until the device gets detached=

> > again. So this would essentially duplicate a big part of what the
> > genpd-code does (per-domain device-list and per-device clock-list).=

> >=20
> > As this seems to be not uncommon, future powerdomain drivers
> > might need that too and would also need to duplicate that handling.=

> >=20
> > When allowing multiple __clk_get and __clk_put calls on the other
> > hand, the overhead for the regular case comes down to one atomic_in=
c,
> > atomic_sub_and_test and the function call to the new separate relea=
se
> > function ;-) .
>=20
> Why are we doing of_clk_get(), pm_clk_add_clk(), and then
> clk_put()?  Just drop that clk_put() in the caller and remove the
> __clk_get() inside pm_clk_add_clk() and everything works the
> same. This patch does most of that, except it doesn't handle the
> error path where we would need to throw a clk_put().

I guess that was my try to keep all the gets and puts together, but als=
o an=20
instance of not seeing the forest for the trees. Looking at your soluti=
on=20
below does look actually really cool.

Would still be nice if the genpd people would speak up on their prefere=
nce. In=20
the worst case I'll try hunting them down at ELCE next week :-)

I guess for the error path, one could just set the notion that if=20
pm_clk_add_clk returns sucessfully it has taken over the clock referenc=
e=20
completely and in the error case the caller should clean up?


>=20
> Really, that snippet of code that loops over a device's clocks
> and adds them to a pm domain is an example of duplicate code that
> should go into some common layer like the PM clock stuff. Then we
> don't have any kind of situation where we're passing struct clk
> pointers off to other layers of code and this "problem" doesn't
> exist.
>=20
> ----8<----
> diff --git a/drivers/base/power/clock_ops.c b/drivers/base/power/cloc=
k_ops.c
> index acef9f9f759a..529a03e8282c 100644
> --- a/drivers/base/power/clock_ops.c
> +++ b/drivers/base/power/clock_ops.c
> @@ -95,7 +95,7 @@ static int __pm_clk_add(struct device *dev, const c=
har
> *con_id, return -ENOMEM;
>                 }
>         } else {
> -               if (IS_ERR(clk) || !__clk_get(clk)) {
> +               if (IS_ERR(clk)) {
>                         kfree(ce);
>                         return -ENOENT;
>                 }
> @@ -129,7 +129,7 @@ int pm_clk_add(struct device *dev, const char *co=
n_id)
>   * @clk: Clock pointer
>   *
>   * Add the clock to the list of clocks used for the power management=
 of
> @dev. - * It will increment refcount on clock pointer, use clk_put() =
on it
> when done. + * Callers should not call clk_put() on @clk after callin=
g this
> function. */
>  int pm_clk_add_clk(struct device *dev, struct clk *clk)
>  {

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-09-28 19:36 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-22 23:55 [PATCH v2] clk: readd refcounting for struct clk instances Heiko Stübner
2015-09-28 18:46 ` Stephen Boyd
2015-09-28 19:36   ` Heiko Stübner
2015-09-28 19:36     ` Heiko Stübner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.