All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
@ 2013-08-01  0:08 Rafael J. Wysocki
  2013-08-01  8:11 ` Srivatsa S. Bhat
  0 siblings, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2013-08-01  0:08 UTC (permalink / raw)
  To: Viresh Kumar, Srivatsa S. Bhat; +Cc: Linux PM list, LKML, cpufreq

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The cpufreq core is a little inconsistent in the way it uses the
driver module refcount.

Namely, if __cpufreq_add_dev() is called for a CPU without siblings
or generally a CPU for which a new policy object has to be created,
it grabs a reference to the driver module to start with, but drops
that reference before returning.  As a result, the driver module
refcount is then equal to 0 after __cpufreq_add_dev() has returned.

On the other hand, if the given CPU is a sibling of some other
CPU already having a policy, cpufreq_add_policy_cpu() is called
to link the new CPU to the existing policy.  In that case,
cpufreq_cpu_get() is called to obtain that policy and grabs a
reference to the driver module, but that reference is not
released and the module refcount will be different from 0 after
__cpufreq_add_dev() returns (unless there is an error).  That
prevents the driver module from being unloaded until
__cpufreq_remove_dev() is called for all the CPUs that
cpufreq_add_policy_cpu() was called for previously.

To remove that inconsistency make cpufreq_add_policy_cpu() execute
cpufreq_cpu_put() for the given policy before returning, which
decrements the driver module refcount so that it will be 0 after
__cpufreq_add_dev() returns, but also make it take a reference to
the policy itself using kobject_get() and do not release that
reference (unless there's an error or system resume is under way),
which again is consistent with the "raw" __cpufreq_add_dev()
behavior.

Accordingly, modify __cpufreq_remove_dev() to use kobject_put() to
drop policy references taken by cpufreq_add_policy_cpu().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

On top of current linux-pm.git/linux-next.

Thanks,
Rafael

---
 drivers/cpufreq/cpufreq.c |   24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -908,8 +908,10 @@ static int cpufreq_add_policy_cpu(unsign
 	unsigned long flags;
 
 	policy = cpufreq_cpu_get(sibling);
-	WARN_ON(!policy);
+	if (WARN_ON_ONCE(!policy))
+		return -ENODATA;
 
+	kobject_get(&policy->kobj);
 	if (has_target)
 		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 
@@ -932,14 +934,14 @@ static int cpufreq_add_policy_cpu(unsign
 	/* Don't touch sysfs links during light-weight init */
 	if (frozen) {
 		/* Drop the extra refcount that we took above */
-		cpufreq_cpu_put(policy);
-		return 0;
+		kobject_put(&policy->kobj);
+	} else {
+		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
+		if (ret)
+			kobject_put(&policy->kobj);
 	}
 
-	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
-	if (ret)
-		cpufreq_cpu_put(policy);
-
+	cpufreq_cpu_put(policy);
 	return ret;
 }
 #endif
@@ -1298,10 +1300,14 @@ static int __cpufreq_remove_dev(struct d
 		if (!frozen)
 			cpufreq_policy_free(data);
 	} else {
-
+		/*
+		 * There are more CPUs using the same policy, so only drop the
+		 * reference taken by cpufreq_add_policy_cpu() (unless the
+		 * system is suspending).
+		 */
 		if (!frozen) {
 			pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
-			cpufreq_cpu_put(data);
+			kobject_put(&data->kobj);
 		}
 
 		if (cpufreq_driver->target) {


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01  0:08 [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs Rafael J. Wysocki
@ 2013-08-01  8:11 ` Srivatsa S. Bhat
  2013-08-01 14:44   ` Viresh Kumar
  0 siblings, 1 reply; 29+ messages in thread
From: Srivatsa S. Bhat @ 2013-08-01  8:11 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Viresh Kumar, Linux PM list, LKML, cpufreq

On 08/01/2013 05:38 AM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The cpufreq core is a little inconsistent in the way it uses the
> driver module refcount.
> 
> Namely, if __cpufreq_add_dev() is called for a CPU without siblings
> or generally a CPU for which a new policy object has to be created,
> it grabs a reference to the driver module to start with, but drops
> that reference before returning.  As a result, the driver module
> refcount is then equal to 0 after __cpufreq_add_dev() has returned.
> 
> On the other hand, if the given CPU is a sibling of some other
> CPU already having a policy, cpufreq_add_policy_cpu() is called
> to link the new CPU to the existing policy.  In that case,
> cpufreq_cpu_get() is called to obtain that policy and grabs a
> reference to the driver module, but that reference is not
> released and the module refcount will be different from 0 after
> __cpufreq_add_dev() returns (unless there is an error).  That
> prevents the driver module from being unloaded until
> __cpufreq_remove_dev() is called for all the CPUs that
> cpufreq_add_policy_cpu() was called for previously.
> 
> To remove that inconsistency make cpufreq_add_policy_cpu() execute
> cpufreq_cpu_put() for the given policy before returning, which
> decrements the driver module refcount so that it will be 0 after
> __cpufreq_add_dev() returns,

Removing the inconsistency is a good thing, but I think we should
make it consistent the other way around - make a CPU-online increment
the driver module refcount and decrement it only on CPU-offline.

The reason is, one should not be able to unload the back-end cpufreq
driver module when some CPUs are still being managed. Nasty things
will result if we allow that. For example, if we unload the module,
and then try to do a CPU offline, then the cpufreq hotplug notifier
won't even be called (because cpufreq_unregister_driver also
unregisters the hotplug notifier). And that might be troublesome.

Even worse, if we unload a cpufreq driver module and load a new
one and *then* try to offline the CPU, then the cpufreq_driver->exit()
function that we call during CPU offline will end up calling the
corresponding function of an entirely different driver! So the
->init() and ->exit() calls won't match.

These complications won't exist if we simply prevent unloading the
driver module as long as they are used in managing atleast one CPU.
So I think it would be good to make the code consistent that way.

Regards,
Srivatsa S. Bhat

> but also make it take a reference to
> the policy itself using kobject_get() and do not release that
> reference (unless there's an error or system resume is under way),
> which again is consistent with the "raw" __cpufreq_add_dev()
> behavior.
> 
> Accordingly, modify __cpufreq_remove_dev() to use kobject_put() to
> drop policy references taken by cpufreq_add_policy_cpu().
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> On top of current linux-pm.git/linux-next.
> 
> Thanks,
> Rafael
> 
> ---
>  drivers/cpufreq/cpufreq.c |   24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -908,8 +908,10 @@ static int cpufreq_add_policy_cpu(unsign
>  	unsigned long flags;
> 
>  	policy = cpufreq_cpu_get(sibling);
> -	WARN_ON(!policy);
> +	if (WARN_ON_ONCE(!policy))
> +		return -ENODATA;
> 
> +	kobject_get(&policy->kobj);
>  	if (has_target)
>  		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> 
> @@ -932,14 +934,14 @@ static int cpufreq_add_policy_cpu(unsign
>  	/* Don't touch sysfs links during light-weight init */
>  	if (frozen) {
>  		/* Drop the extra refcount that we took above */
> -		cpufreq_cpu_put(policy);
> -		return 0;
> +		kobject_put(&policy->kobj);
> +	} else {
> +		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> +		if (ret)
> +			kobject_put(&policy->kobj);
>  	}
> 
> -	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> -	if (ret)
> -		cpufreq_cpu_put(policy);
> -
> +	cpufreq_cpu_put(policy);
>  	return ret;
>  }
>  #endif
> @@ -1298,10 +1300,14 @@ static int __cpufreq_remove_dev(struct d
>  		if (!frozen)
>  			cpufreq_policy_free(data);
>  	} else {
> -
> +		/*
> +		 * There are more CPUs using the same policy, so only drop the
> +		 * reference taken by cpufreq_add_policy_cpu() (unless the
> +		 * system is suspending).
> +		 */
>  		if (!frozen) {
>  			pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> -			cpufreq_cpu_put(data);
> +			kobject_put(&data->kobj);
>  		}
> 
>  		if (cpufreq_driver->target) {
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01  8:11 ` Srivatsa S. Bhat
@ 2013-08-01 14:44   ` Viresh Kumar
  2013-08-01 15:24     ` Srivatsa S. Bhat
  0 siblings, 1 reply; 29+ messages in thread
From: Viresh Kumar @ 2013-08-01 14:44 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Rafael J. Wysocki, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On 1 August 2013 13:41, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> On 08/01/2013 05:38 AM, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> The cpufreq core is a little inconsistent in the way it uses the
>> driver module refcount.
>>
>> Namely, if __cpufreq_add_dev() is called for a CPU without siblings
>> or generally a CPU for which a new policy object has to be created,
>> it grabs a reference to the driver module to start with, but drops
>> that reference before returning.  As a result, the driver module
>> refcount is then equal to 0 after __cpufreq_add_dev() has returned.
>>
>> On the other hand, if the given CPU is a sibling of some other
>> CPU already having a policy, cpufreq_add_policy_cpu() is called
>> to link the new CPU to the existing policy.  In that case,
>> cpufreq_cpu_get() is called to obtain that policy and grabs a
>> reference to the driver module, but that reference is not
>> released and the module refcount will be different from 0 after
>> __cpufreq_add_dev() returns (unless there is an error).  That
>> prevents the driver module from being unloaded until
>> __cpufreq_remove_dev() is called for all the CPUs that
>> cpufreq_add_policy_cpu() was called for previously.
>>
>> To remove that inconsistency make cpufreq_add_policy_cpu() execute
>> cpufreq_cpu_put() for the given policy before returning, which
>> decrements the driver module refcount so that it will be 0 after
>> __cpufreq_add_dev() returns,
>
> Removing the inconsistency is a good thing, but I think we should
> make it consistent the other way around - make a CPU-online increment
> the driver module refcount and decrement it only on CPU-offline.

I took time to review to this mail as I was looking at the problem
yesterday. I am sorry to say, but I have completely different views as
compared to You and Rafael both :)

First of all, Rafael's patch is incomplete as it hasn't fixed the issue
completely. When we have multiple CPUs per policy and
cpufreq_add_dev() is called for the first one, it call cpufreq_get_cpu()
for all cpus of this policy(), so count is == x (no. of CPUs in this policy)
+ 1 (This is the initial value of .owner).

And so we still have module count getting incremented for other cpus :)

Now few lines about My point of view to this whole thing. I believe we
should get rid of .owner field from struct cpufreq_driver completely. It
doesn't make sense to me in doing all this management at all. Surprised?
Shocked? Laughing at me? :)

Well I may be wrong but this is what I think:
- It looks stupid to me that I can't do this from userspace in one go:
  $ insmod cpufreq_driver.ko
  $ rmmod cpufreq_driver.ko

What the hell changed in between that isn't visible to user? It looked
completely stupid in that way..

Something like this sure makes sense:
$ insmod ondemand-governor.ko
$ change governor to ondemand for few CPUs
$ rmmod ondemand-governor.ko

as we have deliberately add few users of governor. And so without second
step, rmmod should really work smoothly. And it does.

Now, why shouldn't there be a problem with this approach? I will write
that inline to the problems you just described.

> The reason is, one should not be able to unload the back-end cpufreq
> driver module when some CPUs are still being managed. Nasty things
> will result if we allow that. For example, if we unload the module,
> and then try to do a CPU offline, then the cpufreq hotplug notifier
> won't even be called (because cpufreq_unregister_driver also
> unregisters the hotplug notifier). And that might be troublesome.

So what? Its simply equivalent to we have booted our system, haven't
inserted cpufreq module and taken out a cpu.

> Even worse, if we unload a cpufreq driver module and load a new
> one and *then* try to offline the CPU, then the cpufreq_driver->exit()
> function that we call during CPU offline will end up calling the
> corresponding function of an entirely different driver! So the
> ->init() and ->exit() calls won't match.

That's not true. When we unload the module, it must call
cpufreq_unregister_driver() which should call cpufreq_remove_cpu()
for all cpus and so exit() is already called for last module.

If we get something new now, it should simply work.

What do you think gentlemen?

--
viresh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 14:44   ` Viresh Kumar
@ 2013-08-01 15:24     ` Srivatsa S. Bhat
  2013-08-01 17:24       ` Viresh Kumar
  2013-08-01 18:04       ` Rafael J. Wysocki
  0 siblings, 2 replies; 29+ messages in thread
From: Srivatsa S. Bhat @ 2013-08-01 15:24 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On 08/01/2013 08:14 PM, Viresh Kumar wrote:
> On 1 August 2013 13:41, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> On 08/01/2013 05:38 AM, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> The cpufreq core is a little inconsistent in the way it uses the
>>> driver module refcount.
>>>
>>> Namely, if __cpufreq_add_dev() is called for a CPU without siblings
>>> or generally a CPU for which a new policy object has to be created,
>>> it grabs a reference to the driver module to start with, but drops
>>> that reference before returning.  As a result, the driver module
>>> refcount is then equal to 0 after __cpufreq_add_dev() has returned.
>>>
>>> On the other hand, if the given CPU is a sibling of some other
>>> CPU already having a policy, cpufreq_add_policy_cpu() is called
>>> to link the new CPU to the existing policy.  In that case,
>>> cpufreq_cpu_get() is called to obtain that policy and grabs a
>>> reference to the driver module, but that reference is not
>>> released and the module refcount will be different from 0 after
>>> __cpufreq_add_dev() returns (unless there is an error).  That
>>> prevents the driver module from being unloaded until
>>> __cpufreq_remove_dev() is called for all the CPUs that
>>> cpufreq_add_policy_cpu() was called for previously.
>>>
>>> To remove that inconsistency make cpufreq_add_policy_cpu() execute
>>> cpufreq_cpu_put() for the given policy before returning, which
>>> decrements the driver module refcount so that it will be 0 after
>>> __cpufreq_add_dev() returns,
>>
>> Removing the inconsistency is a good thing, but I think we should
>> make it consistent the other way around - make a CPU-online increment
>> the driver module refcount and decrement it only on CPU-offline.
> 
> I took time to review to this mail as I was looking at the problem
> yesterday. I am sorry to say, but I have completely different views as
> compared to You and Rafael both :)
> 
> First of all, Rafael's patch is incomplete as it hasn't fixed the issue
> completely. When we have multiple CPUs per policy and
> cpufreq_add_dev() is called for the first one, it call cpufreq_get_cpu()
> for all cpus of this policy(), so count is == x (no. of CPUs in this policy)
> + 1 (This is the initial value of .owner).
> 
> And so we still have module count getting incremented for other cpus :)
>

Good catch!
 
> Now few lines about My point of view to this whole thing. I believe we
> should get rid of .owner field from struct cpufreq_driver completely. It
> doesn't make sense to me in doing all this management at all. Surprised?
> Shocked? Laughing at me? :)
> 
> Well I may be wrong but this is what I think:
> - It looks stupid to me that I can't do this from userspace in one go:
>   $ insmod cpufreq_driver.ko
>   $ rmmod cpufreq_driver.ko
> 
> What the hell changed in between that isn't visible to user? It looked
> completely stupid in that way..
> 
> Something like this sure makes sense:
> $ insmod ondemand-governor.ko
> $ change governor to ondemand for few CPUs
> $ rmmod ondemand-governor.ko
> 
> as we have deliberately add few users of governor. And so without second
> step, rmmod should really work smoothly. And it does.
> 
> Now, why shouldn't there be a problem with this approach? I will write
> that inline to the problems you just described.
> 
>> The reason is, one should not be able to unload the back-end cpufreq
>> driver module when some CPUs are still being managed. Nasty things
>> will result if we allow that. For example, if we unload the module,
>> and then try to do a CPU offline, then the cpufreq hotplug notifier
>> won't even be called (because cpufreq_unregister_driver also
>> unregisters the hotplug notifier). And that might be troublesome.
> 
> So what? Its simply equivalent to we have booted our system, haven't
> inserted cpufreq module and taken out a cpu.
> 
>> Even worse, if we unload a cpufreq driver module and load a new
>> one and *then* try to offline the CPU, then the cpufreq_driver->exit()
>> function that we call during CPU offline will end up calling the
>> corresponding function of an entirely different driver! So the
>> ->init() and ->exit() calls won't match.
> 
> That's not true. When we unload the module, it must call
> cpufreq_unregister_driver() which should call cpufreq_remove_cpu()
> for all cpus and so exit() is already called for last module.
> 

Sorry, I missed this one.

> If we get something new now, it should simply work.
> 

Yeah, I now see your point. It won't create any problems by
unloading the module and loading a new one.

> What do you think gentlemen?
> 

Well, I now agree that we don't have to keep the module refcount
non-zero as long as CPUs are being managed (that was just my
misunderstanding, sorry for the noise). However, I think the _get()
and _put() used in the existing code is for synchronization: that
is, to avoid races between trying to unload the cpufreq driver
module and running a hotplug notifier (and calling the driver module's
->init() or ->exit() function).

With that being the case, I think we can retain the module refcounts
and use them only for synchronization. And naturally the refcount
should drop to zero after the critical section; no point keeping
it incremented until the CPU is taken offline.

Or, am I confused again?

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 15:24     ` Srivatsa S. Bhat
@ 2013-08-01 17:24       ` Viresh Kumar
  2013-08-01 18:09         ` Rafael J. Wysocki
  2013-08-01 18:04       ` Rafael J. Wysocki
  1 sibling, 1 reply; 29+ messages in thread
From: Viresh Kumar @ 2013-08-01 17:24 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Rafael J. Wysocki, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On 1 August 2013 20:54, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> Well, I now agree that we don't have to keep the module refcount
> non-zero as long as CPUs are being managed (that was just my
> misunderstanding, sorry for the noise). However, I think the _get()
> and _put() used in the existing code is for synchronization: that
> is, to avoid races between trying to unload the cpufreq driver
> module and running a hotplug notifier (and calling the driver module's
> ->init() or ->exit() function).
>
> With that being the case, I think we can retain the module refcounts
> and use them only for synchronization. And naturally the refcount
> should drop to zero after the critical section; no point keeping
> it incremented until the CPU is taken offline.
>
> Or, am I confused again?

No, you aren't.

But, for synchronization we need some blocking stuff, so that when
user tries to rmmod the module, it should wait until other critical
sections are over and then unload the module. But here, we are
simply returning from rmmod, saying that we are busy :)

So, for that kind of synchronization we better use locks available
inside cpufreq core of if required another variable which can be
used for refcount. But now this.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 15:24     ` Srivatsa S. Bhat
  2013-08-01 17:24       ` Viresh Kumar
@ 2013-08-01 18:04       ` Rafael J. Wysocki
  2013-08-01 18:06         ` Srivatsa S. Bhat
  1 sibling, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2013-08-01 18:04 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Viresh Kumar, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On Thursday, August 01, 2013 08:54:59 PM Srivatsa S. Bhat wrote:
> On 08/01/2013 08:14 PM, Viresh Kumar wrote:
> > On 1 August 2013 13:41, Srivatsa S. Bhat
> > <srivatsa.bhat@linux.vnet.ibm.com> wrote:
> >> On 08/01/2013 05:38 AM, Rafael J. Wysocki wrote:
> >>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>
> >>> The cpufreq core is a little inconsistent in the way it uses the
> >>> driver module refcount.
> >>>
> >>> Namely, if __cpufreq_add_dev() is called for a CPU without siblings
> >>> or generally a CPU for which a new policy object has to be created,
> >>> it grabs a reference to the driver module to start with, but drops
> >>> that reference before returning.  As a result, the driver module
> >>> refcount is then equal to 0 after __cpufreq_add_dev() has returned.
> >>>
> >>> On the other hand, if the given CPU is a sibling of some other
> >>> CPU already having a policy, cpufreq_add_policy_cpu() is called
> >>> to link the new CPU to the existing policy.  In that case,
> >>> cpufreq_cpu_get() is called to obtain that policy and grabs a
> >>> reference to the driver module, but that reference is not
> >>> released and the module refcount will be different from 0 after
> >>> __cpufreq_add_dev() returns (unless there is an error).  That
> >>> prevents the driver module from being unloaded until
> >>> __cpufreq_remove_dev() is called for all the CPUs that
> >>> cpufreq_add_policy_cpu() was called for previously.
> >>>
> >>> To remove that inconsistency make cpufreq_add_policy_cpu() execute
> >>> cpufreq_cpu_put() for the given policy before returning, which
> >>> decrements the driver module refcount so that it will be 0 after
> >>> __cpufreq_add_dev() returns,
> >>
> >> Removing the inconsistency is a good thing, but I think we should
> >> make it consistent the other way around - make a CPU-online increment
> >> the driver module refcount and decrement it only on CPU-offline.
> > 
> > I took time to review to this mail as I was looking at the problem
> > yesterday. I am sorry to say, but I have completely different views as
> > compared to You and Rafael both :)
> > 
> > First of all, Rafael's patch is incomplete as it hasn't fixed the issue
> > completely. When we have multiple CPUs per policy and
> > cpufreq_add_dev() is called for the first one, it call cpufreq_get_cpu()
> > for all cpus of this policy(), so count is == x (no. of CPUs in this policy)
> > + 1 (This is the initial value of .owner).
> > 
> > And so we still have module count getting incremented for other cpus :)
> >
> 
> Good catch!

Sorry, I don't see how this happens.

__cpufreq_add_dev() only directly calls cpufreq_cpu_get() at the beginning to
check if the policy is there and then immediately calls cpufreq_cpu_put() in
that case (for that policy).

Next, cpufreq_add_policy_cpu() calls cpufreq_cpu_get(), but that's what my
patch changes.

I don't see where else cpufreq_cpu_get() is called by __cpufreq_add_dev()
whether directly or indirectly.

Moreover, if I'm completely wrong and it is called there in an invisible
hush-hush way, it has to be explained why the module usage count as printed by
lsmod for acpi-cpufreq is 0 (in current linux-next).

> > Now few lines about My point of view to this whole thing. I believe we
> > should get rid of .owner field from struct cpufreq_driver completely. It
> > doesn't make sense to me in doing all this management at all. Surprised?
> > Shocked? Laughing at me? :)
> > 
> > Well I may be wrong but this is what I think:
> > - It looks stupid to me that I can't do this from userspace in one go:
> >   $ insmod cpufreq_driver.ko
> >   $ rmmod cpufreq_driver.ko
> > 
> > What the hell changed in between that isn't visible to user? It looked
> > completely stupid in that way..
> > 
> > Something like this sure makes sense:
> > $ insmod ondemand-governor.ko
> > $ change governor to ondemand for few CPUs
> > $ rmmod ondemand-governor.ko
> > 
> > as we have deliberately add few users of governor. And so without second
> > step, rmmod should really work smoothly. And it does.
> > 
> > Now, why shouldn't there be a problem with this approach? I will write
> > that inline to the problems you just described.
> > 
> >> The reason is, one should not be able to unload the back-end cpufreq
> >> driver module when some CPUs are still being managed. Nasty things
> >> will result if we allow that. For example, if we unload the module,
> >> and then try to do a CPU offline, then the cpufreq hotplug notifier
> >> won't even be called (because cpufreq_unregister_driver also
> >> unregisters the hotplug notifier). And that might be troublesome.
> > 
> > So what? Its simply equivalent to we have booted our system, haven't
> > inserted cpufreq module and taken out a cpu.

I'd put that differently.

With the current code as is it may cause problems to happen, but there are
two ways to change that in general:

(1) Disallow the removal of the cpufreq driver while there are any users, but
    for that we only need the driver to be refcounted *once* when a new policy
    is created (and not as many times as there are CPUs using that policy).
    Then, the reference can be dropped while removing the policy object.

(2) Allow the removal of the cpufreq driver, but harden the code against
    that.  [Maybe it doesn't have to be hardened any more as is, I haven't
    checked that.]

I agree with Viresh that (1) is kind of weird from the usability perspective,
because if we did that, it wouldn't be practically possible to remove cpufreq
driver modules after loading them (cpuidle currently has that problem).

> >> Even worse, if we unload a cpufreq driver module and load a new
> >> one and *then* try to offline the CPU, then the cpufreq_driver->exit()
> >> function that we call during CPU offline will end up calling the
> >> corresponding function of an entirely different driver! So the
> >> ->init() and ->exit() calls won't match.
> > 
> > That's not true. When we unload the module, it must call
> > cpufreq_unregister_driver() which should call cpufreq_remove_cpu()
> > for all cpus and so exit() is already called for last module.
> > 
> 
> Sorry, I missed this one.
> 
> > If we get something new now, it should simply work.
> > 
> 
> Yeah, I now see your point. It won't create any problems by
> unloading the module and loading a new one.
> 
> > What do you think gentlemen?
> > 
> 
> Well, I now agree that we don't have to keep the module refcount
> non-zero as long as CPUs are being managed (that was just my
> misunderstanding, sorry for the noise). However, I think the _get()
> and _put() used in the existing code is for synchronization: that
> is, to avoid races between trying to unload the cpufreq driver
> module and running a hotplug notifier (and calling the driver module's
> ->init() or ->exit() function).
> 
> With that being the case, I think we can retain the module refcounts
> and use them only for synchronization. And naturally the refcount
> should drop to zero after the critical section; no point keeping
> it incremented until the CPU is taken offline.
> 
> Or, am I confused again?

No, I don't think so.

In fact, the point of my patch was to make the module refcount stay 0
beyond critical sections, but it looks like I overlooked something.  What is
that?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 18:04       ` Rafael J. Wysocki
@ 2013-08-01 18:06         ` Srivatsa S. Bhat
  2013-08-01 19:01           ` Rafael J. Wysocki
  0 siblings, 1 reply; 29+ messages in thread
From: Srivatsa S. Bhat @ 2013-08-01 18:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On 08/01/2013 11:34 PM, Rafael J. Wysocki wrote:
> On Thursday, August 01, 2013 08:54:59 PM Srivatsa S. Bhat wrote:
>> On 08/01/2013 08:14 PM, Viresh Kumar wrote:
>>> On 1 August 2013 13:41, Srivatsa S. Bhat
>>> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>>>> On 08/01/2013 05:38 AM, Rafael J. Wysocki wrote:
>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>
>>>>> The cpufreq core is a little inconsistent in the way it uses the
>>>>> driver module refcount.
>>>>>
>>>>> Namely, if __cpufreq_add_dev() is called for a CPU without siblings
>>>>> or generally a CPU for which a new policy object has to be created,
>>>>> it grabs a reference to the driver module to start with, but drops
>>>>> that reference before returning.  As a result, the driver module
>>>>> refcount is then equal to 0 after __cpufreq_add_dev() has returned.
>>>>>
>>>>> On the other hand, if the given CPU is a sibling of some other
>>>>> CPU already having a policy, cpufreq_add_policy_cpu() is called
>>>>> to link the new CPU to the existing policy.  In that case,
>>>>> cpufreq_cpu_get() is called to obtain that policy and grabs a
>>>>> reference to the driver module, but that reference is not
>>>>> released and the module refcount will be different from 0 after
>>>>> __cpufreq_add_dev() returns (unless there is an error).  That
>>>>> prevents the driver module from being unloaded until
>>>>> __cpufreq_remove_dev() is called for all the CPUs that
>>>>> cpufreq_add_policy_cpu() was called for previously.
>>>>>
>>>>> To remove that inconsistency make cpufreq_add_policy_cpu() execute
>>>>> cpufreq_cpu_put() for the given policy before returning, which
>>>>> decrements the driver module refcount so that it will be 0 after
>>>>> __cpufreq_add_dev() returns,
>>>>
>>>> Removing the inconsistency is a good thing, but I think we should
>>>> make it consistent the other way around - make a CPU-online increment
>>>> the driver module refcount and decrement it only on CPU-offline.
>>>
>>> I took time to review to this mail as I was looking at the problem
>>> yesterday. I am sorry to say, but I have completely different views as
>>> compared to You and Rafael both :)
>>>
>>> First of all, Rafael's patch is incomplete as it hasn't fixed the issue
>>> completely. When we have multiple CPUs per policy and
>>> cpufreq_add_dev() is called for the first one, it call cpufreq_get_cpu()
>>> for all cpus of this policy(), so count is == x (no. of CPUs in this policy)
>>> + 1 (This is the initial value of .owner).
>>>
>>> And so we still have module count getting incremented for other cpus :)
>>>
>>
>> Good catch!
> 
> Sorry, I don't see how this happens.
> 
> __cpufreq_add_dev() only directly calls cpufreq_cpu_get() at the beginning to
> check if the policy is there and then immediately calls cpufreq_cpu_put() in
> that case (for that policy).
> 
> Next, cpufreq_add_policy_cpu() calls cpufreq_cpu_get(), but that's what my
> patch changes.
> 
> I don't see where else cpufreq_cpu_get() is called by __cpufreq_add_dev()
> whether directly or indirectly.
>

__cpufreq_add_dev()->cpufreq_add_dev_interface()->cpufreq_add_dev_symlink().

The last one does:

 815         for_each_cpu(j, policy->cpus) {
 816                 struct cpufreq_policy *managed_policy;
 817                 struct device *cpu_dev;
 818 
 819                 if (j == cpu)
 820                         continue;
 821 
 822                 pr_debug("CPU %u already managed, adding link\n", j);
 823                 managed_policy = cpufreq_cpu_get(cpu);
 824                 cpu_dev = get_cpu_device(j);
 825                 ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
 826                                         "cpufreq");
		     ...
	     }

 
> Moreover, if I'm completely wrong and it is called there in an invisible
> hush-hush way, it has to be explained why the module usage count as printed by
> lsmod for acpi-cpufreq is 0 (in current linux-next).
>

Perhaps because none of your policies have more than one CPU associated
with it? I think related_cpus should be able to tell us that..

But yes, it is a little hidden and moreover, we don't take the refcount if
there is only one CPU in the mask. Which is a little inconsistent, IMHO.
 
>>> Now few lines about My point of view to this whole thing. I believe we
>>> should get rid of .owner field from struct cpufreq_driver completely. It
>>> doesn't make sense to me in doing all this management at all. Surprised?
>>> Shocked? Laughing at me? :)
>>>
>>> Well I may be wrong but this is what I think:
>>> - It looks stupid to me that I can't do this from userspace in one go:
>>>   $ insmod cpufreq_driver.ko
>>>   $ rmmod cpufreq_driver.ko
>>>
>>> What the hell changed in between that isn't visible to user? It looked
>>> completely stupid in that way..
>>>
>>> Something like this sure makes sense:
>>> $ insmod ondemand-governor.ko
>>> $ change governor to ondemand for few CPUs
>>> $ rmmod ondemand-governor.ko
>>>
>>> as we have deliberately add few users of governor. And so without second
>>> step, rmmod should really work smoothly. And it does.
>>>
>>> Now, why shouldn't there be a problem with this approach? I will write
>>> that inline to the problems you just described.
>>>
>>>> The reason is, one should not be able to unload the back-end cpufreq
>>>> driver module when some CPUs are still being managed. Nasty things
>>>> will result if we allow that. For example, if we unload the module,
>>>> and then try to do a CPU offline, then the cpufreq hotplug notifier
>>>> won't even be called (because cpufreq_unregister_driver also
>>>> unregisters the hotplug notifier). And that might be troublesome.
>>>
>>> So what? Its simply equivalent to we have booted our system, haven't
>>> inserted cpufreq module and taken out a cpu.
> 
> I'd put that differently.
> 
> With the current code as is it may cause problems to happen, but there are
> two ways to change that in general:
> 
> (1) Disallow the removal of the cpufreq driver while there are any users, but
>     for that we only need the driver to be refcounted *once* when a new policy
>     is created (and not as many times as there are CPUs using that policy).
>     Then, the reference can be dropped while removing the policy object.
> 
> (2) Allow the removal of the cpufreq driver, but harden the code against
>     that.  [Maybe it doesn't have to be hardened any more as is, I haven't
>     checked that.]
> 
> I agree with Viresh that (1) is kind of weird from the usability perspective,
> because if we did that, it wouldn't be practically possible to remove cpufreq
> driver modules after loading them (cpuidle currently has that problem).
>

Yes, I think we can go with Viresh's approach and use plain locking to
synchronize things. Returning -EBUSY isn't really beneficial, since the
critical sections are small and finite - its not like the user has to wait
a long time to rmmod the module if we use locking.
 
>>>> Even worse, if we unload a cpufreq driver module and load a new
>>>> one and *then* try to offline the CPU, then the cpufreq_driver->exit()
>>>> function that we call during CPU offline will end up calling the
>>>> corresponding function of an entirely different driver! So the
>>>> ->init() and ->exit() calls won't match.
>>>
>>> That's not true. When we unload the module, it must call
>>> cpufreq_unregister_driver() which should call cpufreq_remove_cpu()
>>> for all cpus and so exit() is already called for last module.
>>>
>>
>> Sorry, I missed this one.
>>
>>> If we get something new now, it should simply work.
>>>
>>
>> Yeah, I now see your point. It won't create any problems by
>> unloading the module and loading a new one.
>>
>>> What do you think gentlemen?
>>>
>>
>> Well, I now agree that we don't have to keep the module refcount
>> non-zero as long as CPUs are being managed (that was just my
>> misunderstanding, sorry for the noise). However, I think the _get()
>> and _put() used in the existing code is for synchronization: that
>> is, to avoid races between trying to unload the cpufreq driver
>> module and running a hotplug notifier (and calling the driver module's
>> ->init() or ->exit() function).
>>
>> With that being the case, I think we can retain the module refcounts
>> and use them only for synchronization. And naturally the refcount
>> should drop to zero after the critical section; no point keeping
>> it incremented until the CPU is taken offline.
>>
>> Or, am I confused again?
> 
> No, I don't think so.
> 
> In fact, the point of my patch was to make the module refcount stay 0
> beyond critical sections, but it looks like I overlooked something.  What is
> that?
> 

Its the cpufreq_cpu_get() hidden away in cpufreq_add_dev_symlink(). With
that taken care of, everything should be OK. Then we can change the
synchronization part to avoid using refcounts.

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 17:24       ` Viresh Kumar
@ 2013-08-01 18:09         ` Rafael J. Wysocki
  0 siblings, 0 replies; 29+ messages in thread
From: Rafael J. Wysocki @ 2013-08-01 18:09 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Srivatsa S. Bhat, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On Thursday, August 01, 2013 10:54:08 PM Viresh Kumar wrote:
> On 1 August 2013 20:54, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
> > Well, I now agree that we don't have to keep the module refcount
> > non-zero as long as CPUs are being managed (that was just my
> > misunderstanding, sorry for the noise). However, I think the _get()
> > and _put() used in the existing code is for synchronization: that
> > is, to avoid races between trying to unload the cpufreq driver
> > module and running a hotplug notifier (and calling the driver module's
> > ->init() or ->exit() function).
> >
> > With that being the case, I think we can retain the module refcounts
> > and use them only for synchronization. And naturally the refcount
> > should drop to zero after the critical section; no point keeping
> > it incremented until the CPU is taken offline.
> >
> > Or, am I confused again?
> 
> No, you aren't.
> 
> But, for synchronization we need some blocking stuff, so that when
> user tries to rmmod the module, it should wait until other critical
> sections are over and then unload the module. But here, we are
> simply returning from rmmod, saying that we are busy :)
> 
> So, for that kind of synchronization we better use locks available
> inside cpufreq core of if required another variable which can be
> used for refcount. But now this.

I suppose you wanted to say "But not this"?

I agree.  That said the situation now is that for acpi-cpufreq the module
usage count is 0 (as of linux-next from today), so you can actually rmmod it.

I don't see why it should be different in the case when there are multiple
CPUs for the same policy and hence my patch.  [I don't see the problem with
it as I said to my reply to Srivatsa, so care to explain it to me?.]

Now it seems to me that the code *is* racy (I haven't verified that, though),
so I agree that we should use some other synchronization framework to be able
to rmmod and modprobe cpufreq drivers safely.  Using the driver module usage
counter for that is not quite correct in my view.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 19:01           ` Rafael J. Wysocki
@ 2013-08-01 19:01             ` Srivatsa S. Bhat
  2013-08-01 19:21               ` Rafael J. Wysocki
  0 siblings, 1 reply; 29+ messages in thread
From: Srivatsa S. Bhat @ 2013-08-01 19:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On 08/02/2013 12:31 AM, Rafael J. Wysocki wrote:
> On Thursday, August 01, 2013 11:36:49 PM Srivatsa S. Bhat wrote:
>> Its the cpufreq_cpu_get() hidden away in cpufreq_add_dev_symlink(). With
>> that taken care of, everything should be OK. Then we can change the
>> synchronization part to avoid using refcounts.
> 
> So I actually don't see why cpufreq_add_dev_symlink() needs to call
> cpufreq_cpu_get() at all, since the policy refcount is already 1 at the
> point it is called and the bumping up of the driver module refcount is
> pointless.
> 

Hmm, yes, it seems so.

> However, if I change that I also need to change the piece of code that
> calls the complementary cpufreq_cpu_put() and I kind of cannot find it.
> 

... I guess that's because you are looking at the code with your patch
applied (and your patch removed that _put()) ;-)

Its this part in __cpufreq_remove_dev():

1303         } else {
1304 
1305                 if (!frozen) {
1306                         pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
1307                         cpufreq_cpu_put(data);
1308                 }
1309 

 
Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 18:06         ` Srivatsa S. Bhat
@ 2013-08-01 19:01           ` Rafael J. Wysocki
  2013-08-01 19:01             ` Srivatsa S. Bhat
  0 siblings, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2013-08-01 19:01 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Viresh Kumar, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On Thursday, August 01, 2013 11:36:49 PM Srivatsa S. Bhat wrote:
> On 08/01/2013 11:34 PM, Rafael J. Wysocki wrote:
> > On Thursday, August 01, 2013 08:54:59 PM Srivatsa S. Bhat wrote:
> >> On 08/01/2013 08:14 PM, Viresh Kumar wrote:
> >>> On 1 August 2013 13:41, Srivatsa S. Bhat
> >>> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
> >>>> On 08/01/2013 05:38 AM, Rafael J. Wysocki wrote:
> >>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>>>
> >>>>> The cpufreq core is a little inconsistent in the way it uses the
> >>>>> driver module refcount.
> >>>>>
> >>>>> Namely, if __cpufreq_add_dev() is called for a CPU without siblings
> >>>>> or generally a CPU for which a new policy object has to be created,
> >>>>> it grabs a reference to the driver module to start with, but drops
> >>>>> that reference before returning.  As a result, the driver module
> >>>>> refcount is then equal to 0 after __cpufreq_add_dev() has returned.
> >>>>>
> >>>>> On the other hand, if the given CPU is a sibling of some other
> >>>>> CPU already having a policy, cpufreq_add_policy_cpu() is called
> >>>>> to link the new CPU to the existing policy.  In that case,
> >>>>> cpufreq_cpu_get() is called to obtain that policy and grabs a
> >>>>> reference to the driver module, but that reference is not
> >>>>> released and the module refcount will be different from 0 after
> >>>>> __cpufreq_add_dev() returns (unless there is an error).  That
> >>>>> prevents the driver module from being unloaded until
> >>>>> __cpufreq_remove_dev() is called for all the CPUs that
> >>>>> cpufreq_add_policy_cpu() was called for previously.
> >>>>>
> >>>>> To remove that inconsistency make cpufreq_add_policy_cpu() execute
> >>>>> cpufreq_cpu_put() for the given policy before returning, which
> >>>>> decrements the driver module refcount so that it will be 0 after
> >>>>> __cpufreq_add_dev() returns,
> >>>>
> >>>> Removing the inconsistency is a good thing, but I think we should
> >>>> make it consistent the other way around - make a CPU-online increment
> >>>> the driver module refcount and decrement it only on CPU-offline.
> >>>
> >>> I took time to review to this mail as I was looking at the problem
> >>> yesterday. I am sorry to say, but I have completely different views as
> >>> compared to You and Rafael both :)
> >>>
> >>> First of all, Rafael's patch is incomplete as it hasn't fixed the issue
> >>> completely. When we have multiple CPUs per policy and
> >>> cpufreq_add_dev() is called for the first one, it call cpufreq_get_cpu()
> >>> for all cpus of this policy(), so count is == x (no. of CPUs in this policy)
> >>> + 1 (This is the initial value of .owner).
> >>>
> >>> And so we still have module count getting incremented for other cpus :)
> >>>
> >>
> >> Good catch!
> > 
> > Sorry, I don't see how this happens.
> > 
> > __cpufreq_add_dev() only directly calls cpufreq_cpu_get() at the beginning to
> > check if the policy is there and then immediately calls cpufreq_cpu_put() in
> > that case (for that policy).
> > 
> > Next, cpufreq_add_policy_cpu() calls cpufreq_cpu_get(), but that's what my
> > patch changes.
> > 
> > I don't see where else cpufreq_cpu_get() is called by __cpufreq_add_dev()
> > whether directly or indirectly.
> >
> 
> __cpufreq_add_dev()->cpufreq_add_dev_interface()->cpufreq_add_dev_symlink().

Ah, OK, thanks!

> The last one does:
> 
>  815         for_each_cpu(j, policy->cpus) {
>  816                 struct cpufreq_policy *managed_policy;
>  817                 struct device *cpu_dev;
>  818 
>  819                 if (j == cpu)
>  820                         continue;
>  821 
>  822                 pr_debug("CPU %u already managed, adding link\n", j);
>  823                 managed_policy = cpufreq_cpu_get(cpu);
>  824                 cpu_dev = get_cpu_device(j);
>  825                 ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
>  826                                         "cpufreq");
> 		     ...
> 	     }

And when do we drop this one?

> > Moreover, if I'm completely wrong and it is called there in an invisible
> > hush-hush way, it has to be explained why the module usage count as printed by
> > lsmod for acpi-cpufreq is 0 (in current linux-next).
> >
> 
> Perhaps because none of your policies have more than one CPU associated
> with it? I think related_cpus should be able to tell us that..

Yes, that's the case.

> But yes, it is a little hidden and moreover, we don't take the refcount if
> there is only one CPU in the mask. Which is a little inconsistent, IMHO.

Well, I obviously agree.

> >>> Now few lines about My point of view to this whole thing. I believe we
> >>> should get rid of .owner field from struct cpufreq_driver completely. It
> >>> doesn't make sense to me in doing all this management at all. Surprised?
> >>> Shocked? Laughing at me? :)
> >>>
> >>> Well I may be wrong but this is what I think:
> >>> - It looks stupid to me that I can't do this from userspace in one go:
> >>>   $ insmod cpufreq_driver.ko
> >>>   $ rmmod cpufreq_driver.ko
> >>>
> >>> What the hell changed in between that isn't visible to user? It looked
> >>> completely stupid in that way..
> >>>
> >>> Something like this sure makes sense:
> >>> $ insmod ondemand-governor.ko
> >>> $ change governor to ondemand for few CPUs
> >>> $ rmmod ondemand-governor.ko
> >>>
> >>> as we have deliberately add few users of governor. And so without second
> >>> step, rmmod should really work smoothly. And it does.
> >>>
> >>> Now, why shouldn't there be a problem with this approach? I will write
> >>> that inline to the problems you just described.
> >>>
> >>>> The reason is, one should not be able to unload the back-end cpufreq
> >>>> driver module when some CPUs are still being managed. Nasty things
> >>>> will result if we allow that. For example, if we unload the module,
> >>>> and then try to do a CPU offline, then the cpufreq hotplug notifier
> >>>> won't even be called (because cpufreq_unregister_driver also
> >>>> unregisters the hotplug notifier). And that might be troublesome.
> >>>
> >>> So what? Its simply equivalent to we have booted our system, haven't
> >>> inserted cpufreq module and taken out a cpu.
> > 
> > I'd put that differently.
> > 
> > With the current code as is it may cause problems to happen, but there are
> > two ways to change that in general:
> > 
> > (1) Disallow the removal of the cpufreq driver while there are any users, but
> >     for that we only need the driver to be refcounted *once* when a new policy
> >     is created (and not as many times as there are CPUs using that policy).
> >     Then, the reference can be dropped while removing the policy object.
> > 
> > (2) Allow the removal of the cpufreq driver, but harden the code against
> >     that.  [Maybe it doesn't have to be hardened any more as is, I haven't
> >     checked that.]
> > 
> > I agree with Viresh that (1) is kind of weird from the usability perspective,
> > because if we did that, it wouldn't be practically possible to remove cpufreq
> > driver modules after loading them (cpuidle currently has that problem).
> >
> 
> Yes, I think we can go with Viresh's approach and use plain locking to
> synchronize things. Returning -EBUSY isn't really beneficial, since the
> critical sections are small and finite - its not like the user has to wait
> a long time to rmmod the module if we use locking.

Agreed.

> >>>> Even worse, if we unload a cpufreq driver module and load a new
> >>>> one and *then* try to offline the CPU, then the cpufreq_driver->exit()
> >>>> function that we call during CPU offline will end up calling the
> >>>> corresponding function of an entirely different driver! So the
> >>>> ->init() and ->exit() calls won't match.
> >>>
> >>> That's not true. When we unload the module, it must call
> >>> cpufreq_unregister_driver() which should call cpufreq_remove_cpu()
> >>> for all cpus and so exit() is already called for last module.
> >>>
> >>
> >> Sorry, I missed this one.
> >>
> >>> If we get something new now, it should simply work.
> >>>
> >>
> >> Yeah, I now see your point. It won't create any problems by
> >> unloading the module and loading a new one.
> >>
> >>> What do you think gentlemen?
> >>>
> >>
> >> Well, I now agree that we don't have to keep the module refcount
> >> non-zero as long as CPUs are being managed (that was just my
> >> misunderstanding, sorry for the noise). However, I think the _get()
> >> and _put() used in the existing code is for synchronization: that
> >> is, to avoid races between trying to unload the cpufreq driver
> >> module and running a hotplug notifier (and calling the driver module's
> >> ->init() or ->exit() function).
> >>
> >> With that being the case, I think we can retain the module refcounts
> >> and use them only for synchronization. And naturally the refcount
> >> should drop to zero after the critical section; no point keeping
> >> it incremented until the CPU is taken offline.
> >>
> >> Or, am I confused again?
> > 
> > No, I don't think so.
> > 
> > In fact, the point of my patch was to make the module refcount stay 0
> > beyond critical sections, but it looks like I overlooked something.  What is
> > that?
> > 
> 
> Its the cpufreq_cpu_get() hidden away in cpufreq_add_dev_symlink(). With
> that taken care of, everything should be OK. Then we can change the
> synchronization part to avoid using refcounts.

So I actually don't see why cpufreq_add_dev_symlink() needs to call
cpufreq_cpu_get() at all, since the policy refcount is already 1 at the
point it is called and the bumping up of the driver module refcount is
pointless.

However, if I change that I also need to change the piece of code that
calls the complementary cpufreq_cpu_put() and I kind of cannot find it.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 19:21               ` Rafael J. Wysocki
@ 2013-08-01 19:21                 ` Srivatsa S. Bhat
  2013-08-01 20:04                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 29+ messages in thread
From: Srivatsa S. Bhat @ 2013-08-01 19:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On 08/02/2013 12:51 AM, Rafael J. Wysocki wrote:
> On Friday, August 02, 2013 12:31:23 AM Srivatsa S. Bhat wrote:
>> On 08/02/2013 12:31 AM, Rafael J. Wysocki wrote:
>>> On Thursday, August 01, 2013 11:36:49 PM Srivatsa S. Bhat wrote:
>>>> Its the cpufreq_cpu_get() hidden away in cpufreq_add_dev_symlink(). With
>>>> that taken care of, everything should be OK. Then we can change the
>>>> synchronization part to avoid using refcounts.
>>>
>>> So I actually don't see why cpufreq_add_dev_symlink() needs to call
>>> cpufreq_cpu_get() at all, since the policy refcount is already 1 at the
>>> point it is called and the bumping up of the driver module refcount is
>>> pointless.
>>>
>>
>> Hmm, yes, it seems so.
>>
>>> However, if I change that I also need to change the piece of code that
>>> calls the complementary cpufreq_cpu_put() and I kind of cannot find it.
>>>
>>
>> ... I guess that's because you are looking at the code with your patch
>> applied (and your patch removed that _put()) ;-)
> 
> No, it's not that one.  That one was complementary to the cpufreq_cpu_get()
> done by cpufreq_add_policy_cpu() before my patch.  Since my patch changes
> cpufreq_add_policy_cpu() to call cpufreq_cpu_put() before returning and
> bump up the policy refcount with kobject_get(), the one in
> __cpufreq_remove_dev() is changed into kobject_put() (correctly, IMO).
> 
> What gives?
> 

Actually, it _is_ the one I pointed above. This thing is tricky, here's why:

cpufreq_add_policy_cpu() is called only if:
a. The CPU being onlined has per_cpu(cpufreq_cpu_data, cpu) == NULL
and 
b. Its is present in some CPU's related_cpus mask. 

If condition (a) doesn't hold good, you get out right in the beginning of
__cpufreq_add_dev().

So, cpufreq_add_policy_cpu() is called very rarely because, inside
__cpufreq_add_dev we do:

1093         write_lock_irqsave(&cpufreq_driver_lock, flags);
1094         for_each_cpu(j, policy->cpus) {
1095                 per_cpu(cpufreq_cpu_data, j) = policy;
1096                 per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
1097         }
1098         write_unlock_irqrestore(&cpufreq_driver_lock, flags);

So for all the CPUs in the above policy->cpus mask, we simply return
without further ado when they are onlined. In particular, we *dont* call
cpufreq_add_policy_cpu() for any of them.

And their refcounts are incremented by the cpufreq_add_dev_interface()->
cpufreq_add_dev_symlink() function.

So, ultimately, we increment the refcount for a given non-policy-owner CPU
only once. *Either* in cpufreq_add_dev_symlink() *or* in cpufreq_add_policy_cpu(),
but never both.

So, in the teardown path, __cpufreq_remove_dev() needs only one place to
decrement it as shown below:

1303         } else {
1304 
1305                 if (!frozen) {
1306                         pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
1307                         cpufreq_cpu_put(data);
1308                 }


Pretty good maze, right? ;-(
 
Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 19:01             ` Srivatsa S. Bhat
@ 2013-08-01 19:21               ` Rafael J. Wysocki
  2013-08-01 19:21                 ` Srivatsa S. Bhat
  0 siblings, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2013-08-01 19:21 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Viresh Kumar, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On Friday, August 02, 2013 12:31:23 AM Srivatsa S. Bhat wrote:
> On 08/02/2013 12:31 AM, Rafael J. Wysocki wrote:
> > On Thursday, August 01, 2013 11:36:49 PM Srivatsa S. Bhat wrote:
> >> Its the cpufreq_cpu_get() hidden away in cpufreq_add_dev_symlink(). With
> >> that taken care of, everything should be OK. Then we can change the
> >> synchronization part to avoid using refcounts.
> > 
> > So I actually don't see why cpufreq_add_dev_symlink() needs to call
> > cpufreq_cpu_get() at all, since the policy refcount is already 1 at the
> > point it is called and the bumping up of the driver module refcount is
> > pointless.
> > 
> 
> Hmm, yes, it seems so.
> 
> > However, if I change that I also need to change the piece of code that
> > calls the complementary cpufreq_cpu_put() and I kind of cannot find it.
> > 
> 
> ... I guess that's because you are looking at the code with your patch
> applied (and your patch removed that _put()) ;-)

No, it's not that one.  That one was complementary to the cpufreq_cpu_get()
done by cpufreq_add_policy_cpu() before my patch.  Since my patch changes
cpufreq_add_policy_cpu() to call cpufreq_cpu_put() before returning and
bump up the policy refcount with kobject_get(), the one in
__cpufreq_remove_dev() is changed into kobject_put() (correctly, IMO).

What gives?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 19:21                 ` Srivatsa S. Bhat
@ 2013-08-01 20:04                   ` Rafael J. Wysocki
  2013-08-01 20:26                     ` Srivatsa S. Bhat
  0 siblings, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2013-08-01 20:04 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Viresh Kumar, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On Friday, August 02, 2013 12:51:24 AM Srivatsa S. Bhat wrote:
> On 08/02/2013 12:51 AM, Rafael J. Wysocki wrote:
> > On Friday, August 02, 2013 12:31:23 AM Srivatsa S. Bhat wrote:
> >> On 08/02/2013 12:31 AM, Rafael J. Wysocki wrote:
> >>> On Thursday, August 01, 2013 11:36:49 PM Srivatsa S. Bhat wrote:
> >>>> Its the cpufreq_cpu_get() hidden away in cpufreq_add_dev_symlink(). With
> >>>> that taken care of, everything should be OK. Then we can change the
> >>>> synchronization part to avoid using refcounts.
> >>>
> >>> So I actually don't see why cpufreq_add_dev_symlink() needs to call
> >>> cpufreq_cpu_get() at all, since the policy refcount is already 1 at the
> >>> point it is called and the bumping up of the driver module refcount is
> >>> pointless.
> >>>
> >>
> >> Hmm, yes, it seems so.
> >>
> >>> However, if I change that I also need to change the piece of code that
> >>> calls the complementary cpufreq_cpu_put() and I kind of cannot find it.
> >>>
> >>
> >> ... I guess that's because you are looking at the code with your patch
> >> applied (and your patch removed that _put()) ;-)
> > 
> > No, it's not that one.  That one was complementary to the cpufreq_cpu_get()
> > done by cpufreq_add_policy_cpu() before my patch.  Since my patch changes
> > cpufreq_add_policy_cpu() to call cpufreq_cpu_put() before returning and
> > bump up the policy refcount with kobject_get(), the one in
> > __cpufreq_remove_dev() is changed into kobject_put() (correctly, IMO).
> > 
> > What gives?
> > 
> 
> Actually, it _is_ the one I pointed above. This thing is tricky, here's why:
> 
> cpufreq_add_policy_cpu() is called only if:
> a. The CPU being onlined has per_cpu(cpufreq_cpu_data, cpu) == NULL
> and 
> b. Its is present in some CPU's related_cpus mask. 
> 
> If condition (a) doesn't hold good, you get out right in the beginning of
> __cpufreq_add_dev().
> 
> So, cpufreq_add_policy_cpu() is called very rarely because, inside
> __cpufreq_add_dev we do:
> 
> 1093         write_lock_irqsave(&cpufreq_driver_lock, flags);
> 1094         for_each_cpu(j, policy->cpus) {
> 1095                 per_cpu(cpufreq_cpu_data, j) = policy;
> 1096                 per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
> 1097         }
> 1098         write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> 
> So for all the CPUs in the above policy->cpus mask, we simply return
> without further ado when they are onlined. In particular, we *dont* call
> cpufreq_add_policy_cpu() for any of them.
> 
> And their refcounts are incremented by the cpufreq_add_dev_interface()->
> cpufreq_add_dev_symlink() function.
> 
> So, ultimately, we increment the refcount for a given non-policy-owner CPU
> only once. *Either* in cpufreq_add_dev_symlink() *or* in cpufreq_add_policy_cpu(),
> but never both.
> 
> So, in the teardown path, __cpufreq_remove_dev() needs only one place to
> decrement it as shown below:
> 
> 1303         } else {
> 1304 
> 1305                 if (!frozen) {
> 1306                         pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> 1307                         cpufreq_cpu_put(data);
> 1308                 }
> 
> 
> Pretty good maze, right? ;-(

Oh dear.  Right.

I tgought I could change cpufreq_add_dev_symlink() to use kobject_get() to bump
up the policy refcount in analogy with cpufreq_add_policy_cpu() and then it
wouldn't need to call cpufreq_cpu_get() at all, but there is a bug in the
error code path of cpufreq_add_dev_interface(), because if
cpufreq_add_dev_symlink() fails for one of the CPUs sharing the policy,
it will just fail to drop references grabbed in there.  [Moreover, if it
fails for the first one different from policy->cpu, kobject_put() will be
called for that policy twice in a row if I'm not mistaken (first by
cpufreq_add_dev_interface() and then by __cpufreq_add_dev()), but that's
a different matter.]

So I think that neither cpufreq_add_dev_symlink() nor
cpufreq_add_policy_cpu() should bump up the policy refcount in any way.

Which entirely boils down to something like this:

---
 drivers/cpufreq/cpufreq.c |   31 +++++++------------------------
 1 file changed, 7 insertions(+), 24 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -818,14 +818,11 @@ static int cpufreq_add_dev_symlink(struc
 			continue;
 
 		pr_debug("Adding link for CPU: %u\n", j);
-		cpufreq_cpu_get(policy->cpu);
 		cpu_dev = get_cpu_device(j);
 		ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
 					"cpufreq");
-		if (ret) {
-			cpufreq_cpu_put(policy);
-			return ret;
-		}
+		if (ret)
+			break;
 	}
 	return ret;
 }
@@ -908,7 +905,8 @@ static int cpufreq_add_policy_cpu(unsign
 	unsigned long flags;
 
 	policy = cpufreq_cpu_get(sibling);
-	WARN_ON(!policy);
+	if (WARN_ON_ONCE(!policy))
+		return -ENODATA;
 
 	if (has_target)
 		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
@@ -930,16 +928,10 @@ static int cpufreq_add_policy_cpu(unsign
 	}
 
 	/* Don't touch sysfs links during light-weight init */
-	if (frozen) {
-		/* Drop the extra refcount that we took above */
-		cpufreq_cpu_put(policy);
-		return 0;
-	}
-
-	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
-	if (ret)
-		cpufreq_cpu_put(policy);
+	if (!frozen)
+		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
 
+	cpufreq_cpu_put(policy);
 	return ret;
 }
 #endif
@@ -1117,9 +1109,6 @@ err_out_unregister:
 	}
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
-	kobject_put(&policy->kobj);
-	wait_for_completion(&policy->kobj_unregister);
-
 err_set_policy_cpu:
 	per_cpu(cpufreq_policy_cpu, cpu) = -1;
 	cpufreq_policy_free(policy);
@@ -1298,12 +1287,6 @@ static int __cpufreq_remove_dev(struct d
 		if (!frozen)
 			cpufreq_policy_free(data);
 	} else {
-
-		if (!frozen) {
-			pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
-			cpufreq_cpu_put(data);
-		}
-
 		if (cpufreq_driver->target) {
 			__cpufreq_governor(data, CPUFREQ_GOV_START);
 			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 20:04                   ` Rafael J. Wysocki
@ 2013-08-01 20:26                     ` Srivatsa S. Bhat
  2013-08-01 20:47                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 29+ messages in thread
From: Srivatsa S. Bhat @ 2013-08-01 20:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On 08/02/2013 01:34 AM, Rafael J. Wysocki wrote:
> On Friday, August 02, 2013 12:51:24 AM Srivatsa S. Bhat wrote:
>> On 08/02/2013 12:51 AM, Rafael J. Wysocki wrote:
>>> On Friday, August 02, 2013 12:31:23 AM Srivatsa S. Bhat wrote:
>>>> On 08/02/2013 12:31 AM, Rafael J. Wysocki wrote:
>>>>> On Thursday, August 01, 2013 11:36:49 PM Srivatsa S. Bhat wrote:
>>>>>> Its the cpufreq_cpu_get() hidden away in cpufreq_add_dev_symlink(). With
>>>>>> that taken care of, everything should be OK. Then we can change the
>>>>>> synchronization part to avoid using refcounts.
>>>>>
>>>>> So I actually don't see why cpufreq_add_dev_symlink() needs to call
>>>>> cpufreq_cpu_get() at all, since the policy refcount is already 1 at the
>>>>> point it is called and the bumping up of the driver module refcount is
>>>>> pointless.
>>>>>
>>>>
>>>> Hmm, yes, it seems so.
>>>>
>>>>> However, if I change that I also need to change the piece of code that
>>>>> calls the complementary cpufreq_cpu_put() and I kind of cannot find it.
>>>>>
>>>>
>>>> ... I guess that's because you are looking at the code with your patch
>>>> applied (and your patch removed that _put()) ;-)
>>>
>>> No, it's not that one.  That one was complementary to the cpufreq_cpu_get()
>>> done by cpufreq_add_policy_cpu() before my patch.  Since my patch changes
>>> cpufreq_add_policy_cpu() to call cpufreq_cpu_put() before returning and
>>> bump up the policy refcount with kobject_get(), the one in
>>> __cpufreq_remove_dev() is changed into kobject_put() (correctly, IMO).
>>>
>>> What gives?
>>>
>>
>> Actually, it _is_ the one I pointed above. This thing is tricky, here's why:
>>
>> cpufreq_add_policy_cpu() is called only if:
>> a. The CPU being onlined has per_cpu(cpufreq_cpu_data, cpu) == NULL
>> and 
>> b. Its is present in some CPU's related_cpus mask. 
>>
>> If condition (a) doesn't hold good, you get out right in the beginning of
>> __cpufreq_add_dev().
>>
>> So, cpufreq_add_policy_cpu() is called very rarely because, inside
>> __cpufreq_add_dev we do:
>>
>> 1093         write_lock_irqsave(&cpufreq_driver_lock, flags);
>> 1094         for_each_cpu(j, policy->cpus) {
>> 1095                 per_cpu(cpufreq_cpu_data, j) = policy;
>> 1096                 per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
>> 1097         }
>> 1098         write_unlock_irqrestore(&cpufreq_driver_lock, flags);
>>
>> So for all the CPUs in the above policy->cpus mask, we simply return
>> without further ado when they are onlined. In particular, we *dont* call
>> cpufreq_add_policy_cpu() for any of them.
>>
>> And their refcounts are incremented by the cpufreq_add_dev_interface()->
>> cpufreq_add_dev_symlink() function.
>>
>> So, ultimately, we increment the refcount for a given non-policy-owner CPU
>> only once. *Either* in cpufreq_add_dev_symlink() *or* in cpufreq_add_policy_cpu(),
>> but never both.
>>
>> So, in the teardown path, __cpufreq_remove_dev() needs only one place to
>> decrement it as shown below:
>>
>> 1303         } else {
>> 1304 
>> 1305                 if (!frozen) {
>> 1306                         pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
>> 1307                         cpufreq_cpu_put(data);
>> 1308                 }
>>
>>
>> Pretty good maze, right? ;-(
> 
> Oh dear.  Right.
> 
> I tgought I could change cpufreq_add_dev_symlink() to use kobject_get() to bump
> up the policy refcount in analogy with cpufreq_add_policy_cpu() and then it
> wouldn't need to call cpufreq_cpu_get() at all, but there is a bug in the
> error code path of cpufreq_add_dev_interface(), because if
> cpufreq_add_dev_symlink() fails for one of the CPUs sharing the policy,
> it will just fail to drop references grabbed in there.  [Moreover, if it
> fails for the first one different from policy->cpu, kobject_put() will be
> called for that policy twice in a row if I'm not mistaken (first by
> cpufreq_add_dev_interface() and then by __cpufreq_add_dev()), but that's
> a different matter.]
> 
> So I think that neither cpufreq_add_dev_symlink() nor
> cpufreq_add_policy_cpu() should bump up the policy refcount in any way.
> 

Yeah, that greatly simplifies things, as seen in the patch below.

> Which entirely boils down to something like this:
>

Looks good to me.

Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Regards,
Srivatsa S. Bhat
 
> ---
>  drivers/cpufreq/cpufreq.c |   31 +++++++------------------------
>  1 file changed, 7 insertions(+), 24 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -818,14 +818,11 @@ static int cpufreq_add_dev_symlink(struc
>  			continue;
> 
>  		pr_debug("Adding link for CPU: %u\n", j);
> -		cpufreq_cpu_get(policy->cpu);
>  		cpu_dev = get_cpu_device(j);
>  		ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
>  					"cpufreq");
> -		if (ret) {
> -			cpufreq_cpu_put(policy);
> -			return ret;
> -		}
> +		if (ret)
> +			break;
>  	}
>  	return ret;
>  }
> @@ -908,7 +905,8 @@ static int cpufreq_add_policy_cpu(unsign
>  	unsigned long flags;
> 
>  	policy = cpufreq_cpu_get(sibling);
> -	WARN_ON(!policy);
> +	if (WARN_ON_ONCE(!policy))
> +		return -ENODATA;
> 
>  	if (has_target)
>  		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> @@ -930,16 +928,10 @@ static int cpufreq_add_policy_cpu(unsign
>  	}
> 
>  	/* Don't touch sysfs links during light-weight init */
> -	if (frozen) {
> -		/* Drop the extra refcount that we took above */
> -		cpufreq_cpu_put(policy);
> -		return 0;
> -	}
> -
> -	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> -	if (ret)
> -		cpufreq_cpu_put(policy);
> +	if (!frozen)
> +		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> 
> +	cpufreq_cpu_put(policy);
>  	return ret;
>  }
>  #endif
> @@ -1117,9 +1109,6 @@ err_out_unregister:
>  	}
>  	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> 
> -	kobject_put(&policy->kobj);
> -	wait_for_completion(&policy->kobj_unregister);
> -
>  err_set_policy_cpu:
>  	per_cpu(cpufreq_policy_cpu, cpu) = -1;
>  	cpufreq_policy_free(policy);
> @@ -1298,12 +1287,6 @@ static int __cpufreq_remove_dev(struct d
>  		if (!frozen)
>  			cpufreq_policy_free(data);
>  	} else {
> -
> -		if (!frozen) {
> -			pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> -			cpufreq_cpu_put(data);
> -		}
> -
>  		if (cpufreq_driver->target) {
>  			__cpufreq_governor(data, CPUFREQ_GOV_START);
>  			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 20:26                     ` Srivatsa S. Bhat
@ 2013-08-01 20:47                       ` Rafael J. Wysocki
  2013-08-01 20:53                         ` [Update][PATCH] " Rafael J. Wysocki
  0 siblings, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2013-08-01 20:47 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Viresh Kumar, Linux PM list, LKML, cpufreq, Lists linaro-kernel

On Friday, August 02, 2013 01:56:21 AM Srivatsa S. Bhat wrote:
> On 08/02/2013 01:34 AM, Rafael J. Wysocki wrote:
> > On Friday, August 02, 2013 12:51:24 AM Srivatsa S. Bhat wrote:
> >> On 08/02/2013 12:51 AM, Rafael J. Wysocki wrote:
> >>> On Friday, August 02, 2013 12:31:23 AM Srivatsa S. Bhat wrote:
> >>>> On 08/02/2013 12:31 AM, Rafael J. Wysocki wrote:
> >>>>> On Thursday, August 01, 2013 11:36:49 PM Srivatsa S. Bhat wrote:
> >>>>>> Its the cpufreq_cpu_get() hidden away in cpufreq_add_dev_symlink(). With
> >>>>>> that taken care of, everything should be OK. Then we can change the
> >>>>>> synchronization part to avoid using refcounts.
> >>>>>
> >>>>> So I actually don't see why cpufreq_add_dev_symlink() needs to call
> >>>>> cpufreq_cpu_get() at all, since the policy refcount is already 1 at the
> >>>>> point it is called and the bumping up of the driver module refcount is
> >>>>> pointless.
> >>>>>
> >>>>
> >>>> Hmm, yes, it seems so.
> >>>>
> >>>>> However, if I change that I also need to change the piece of code that
> >>>>> calls the complementary cpufreq_cpu_put() and I kind of cannot find it.
> >>>>>
> >>>>
> >>>> ... I guess that's because you are looking at the code with your patch
> >>>> applied (and your patch removed that _put()) ;-)
> >>>
> >>> No, it's not that one.  That one was complementary to the cpufreq_cpu_get()
> >>> done by cpufreq_add_policy_cpu() before my patch.  Since my patch changes
> >>> cpufreq_add_policy_cpu() to call cpufreq_cpu_put() before returning and
> >>> bump up the policy refcount with kobject_get(), the one in
> >>> __cpufreq_remove_dev() is changed into kobject_put() (correctly, IMO).
> >>>
> >>> What gives?
> >>>
> >>
> >> Actually, it _is_ the one I pointed above. This thing is tricky, here's why:
> >>
> >> cpufreq_add_policy_cpu() is called only if:
> >> a. The CPU being onlined has per_cpu(cpufreq_cpu_data, cpu) == NULL
> >> and 
> >> b. Its is present in some CPU's related_cpus mask. 
> >>
> >> If condition (a) doesn't hold good, you get out right in the beginning of
> >> __cpufreq_add_dev().
> >>
> >> So, cpufreq_add_policy_cpu() is called very rarely because, inside
> >> __cpufreq_add_dev we do:
> >>
> >> 1093         write_lock_irqsave(&cpufreq_driver_lock, flags);
> >> 1094         for_each_cpu(j, policy->cpus) {
> >> 1095                 per_cpu(cpufreq_cpu_data, j) = policy;
> >> 1096                 per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
> >> 1097         }
> >> 1098         write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> >>
> >> So for all the CPUs in the above policy->cpus mask, we simply return
> >> without further ado when they are onlined. In particular, we *dont* call
> >> cpufreq_add_policy_cpu() for any of them.
> >>
> >> And their refcounts are incremented by the cpufreq_add_dev_interface()->
> >> cpufreq_add_dev_symlink() function.
> >>
> >> So, ultimately, we increment the refcount for a given non-policy-owner CPU
> >> only once. *Either* in cpufreq_add_dev_symlink() *or* in cpufreq_add_policy_cpu(),
> >> but never both.
> >>
> >> So, in the teardown path, __cpufreq_remove_dev() needs only one place to
> >> decrement it as shown below:
> >>
> >> 1303         } else {
> >> 1304 
> >> 1305                 if (!frozen) {
> >> 1306                         pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> >> 1307                         cpufreq_cpu_put(data);
> >> 1308                 }
> >>
> >>
> >> Pretty good maze, right? ;-(
> > 
> > Oh dear.  Right.
> > 
> > I tgought I could change cpufreq_add_dev_symlink() to use kobject_get() to bump
> > up the policy refcount in analogy with cpufreq_add_policy_cpu() and then it
> > wouldn't need to call cpufreq_cpu_get() at all, but there is a bug in the
> > error code path of cpufreq_add_dev_interface(), because if
> > cpufreq_add_dev_symlink() fails for one of the CPUs sharing the policy,
> > it will just fail to drop references grabbed in there.  [Moreover, if it
> > fails for the first one different from policy->cpu, kobject_put() will be
> > called for that policy twice in a row if I'm not mistaken (first by
> > cpufreq_add_dev_interface() and then by __cpufreq_add_dev()), but that's
> > a different matter.]
> > 
> > So I think that neither cpufreq_add_dev_symlink() nor
> > cpufreq_add_policy_cpu() should bump up the policy refcount in any way.
> > 
> 
> Yeah, that greatly simplifies things, as seen in the patch below.
> 
> > Which entirely boils down to something like this:
> >
> 
> Looks good to me.
> 
> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Thanks! :-)

I actually think that I should move the error code path bug fix ->

> > ---
> >  drivers/cpufreq/cpufreq.c |   31 +++++++------------------------
> >  1 file changed, 7 insertions(+), 24 deletions(-)
> > 
> > Index: linux-pm/drivers/cpufreq/cpufreq.c
> > ===================================================================
> > --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> > +++ linux-pm/drivers/cpufreq/cpufreq.c
> > @@ -818,14 +818,11 @@ static int cpufreq_add_dev_symlink(struc
> >  			continue;
> > 
> >  		pr_debug("Adding link for CPU: %u\n", j);
> > -		cpufreq_cpu_get(policy->cpu);
> >  		cpu_dev = get_cpu_device(j);
> >  		ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
> >  					"cpufreq");
> > -		if (ret) {
> > -			cpufreq_cpu_put(policy);
> > -			return ret;
> > -		}
> > +		if (ret)
> > +			break;
> >  	}
> >  	return ret;
> >  }
> > @@ -908,7 +905,8 @@ static int cpufreq_add_policy_cpu(unsign
> >  	unsigned long flags;
> > 
> >  	policy = cpufreq_cpu_get(sibling);
> > -	WARN_ON(!policy);
> > +	if (WARN_ON_ONCE(!policy))
> > +		return -ENODATA;
> > 
> >  	if (has_target)
> >  		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> > @@ -930,16 +928,10 @@ static int cpufreq_add_policy_cpu(unsign
> >  	}
> > 
> >  	/* Don't touch sysfs links during light-weight init */
> > -	if (frozen) {
> > -		/* Drop the extra refcount that we took above */
> > -		cpufreq_cpu_put(policy);
> > -		return 0;
> > -	}
> > -
> > -	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> > -	if (ret)
> > -		cpufreq_cpu_put(policy);
> > +	if (!frozen)
> > +		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> > 
> > +	cpufreq_cpu_put(policy);
> >  	return ret;
> >  }
> >  #endif
> > @@ -1117,9 +1109,6 @@ err_out_unregister:
> >  	}
> >  	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> > -	kobject_put(&policy->kobj);
> > -	wait_for_completion(&policy->kobj_unregister);
> > -
> >  err_set_policy_cpu:
> >  	per_cpu(cpufreq_policy_cpu, cpu) = -1;
> >  	cpufreq_policy_free(policy);

-> into a separate patch, because it's not really related to the other changes
made here.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 20:47                       ` Rafael J. Wysocki
@ 2013-08-01 20:53                         ` Rafael J. Wysocki
  2013-08-02  4:37                           ` Viresh Kumar
  2013-08-02 10:30                           ` Viresh Kumar
  0 siblings, 2 replies; 29+ messages in thread
From: Rafael J. Wysocki @ 2013-08-01 20:53 UTC (permalink / raw)
  To: Linux PM list, cpufreq
  Cc: Srivatsa S. Bhat, Viresh Kumar, LKML, Lists linaro-kernel

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: cpufreq: Do not hold driver module references for additional policy CPUs

The cpufreq core is a little inconsistent in the way it uses the
driver module refcount.

Namely, if __cpufreq_add_dev() is called for a CPU that doesn't
share the policy object with any other CPUs, the driver module
refcount it grabs to start with will be dropped by it before
returning and will be equal to 0 afterward.

However, if the given CPU does share the policy object with other
CPUs, either cpufreq_add_policy_cpu() is called to link the new CPU
to the existing policy, or cpufreq_add_dev_symlink() is used to link
the other CPUs sharing the policy with it to the just created policy
object.  In that case, because both cpufreq_add_policy_cpu() and
cpufreq_add_dev_symlink() call cpufreq_cpu_get() for the given
policy (the latter possibly many times) without the balancing
cpufreq_cpu_put() (unless there is an error), the driver module
refcount will be left by __cpufreq_add_dev() with a nonzero value.

To remove that inconsistency make cpufreq_add_policy_cpu() execute
cpufreq_cpu_put() for the given policy before returning, which
decrements the driver module refcount so that it will be 0 after
__cpufreq_add_dev() returns.  Moreover, remove the cpufreq_cpu_get()
call from cpufreq_add_dev_symlink(), since both the policy refcount
and the driver module refcount are nonzero when it is called and they
don't need to be bumped up by it.

Accordingly, drop the cpufreq_cpu_put() from __cpufreq_remove_dev(),
since it is only necessary to balance the cpufreq_cpu_get() called
by cpufreq_add_policy_cpu() or cpufreq_add_dev_symlink().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
 drivers/cpufreq/cpufreq.c |   28 +++++++---------------------
 1 file changed, 7 insertions(+), 21 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -818,14 +818,11 @@ static int cpufreq_add_dev_symlink(struc
 			continue;
 
 		pr_debug("Adding link for CPU: %u\n", j);
-		cpufreq_cpu_get(policy->cpu);
 		cpu_dev = get_cpu_device(j);
 		ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
 					"cpufreq");
-		if (ret) {
-			cpufreq_cpu_put(policy);
-			return ret;
-		}
+		if (ret)
+			break;
 	}
 	return ret;
 }
@@ -908,7 +905,8 @@ static int cpufreq_add_policy_cpu(unsign
 	unsigned long flags;
 
 	policy = cpufreq_cpu_get(sibling);
-	WARN_ON(!policy);
+	if (WARN_ON_ONCE(!policy))
+		return -ENODATA;
 
 	if (has_target)
 		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
@@ -930,16 +928,10 @@ static int cpufreq_add_policy_cpu(unsign
 	}
 
 	/* Don't touch sysfs links during light-weight init */
-	if (frozen) {
-		/* Drop the extra refcount that we took above */
-		cpufreq_cpu_put(policy);
-		return 0;
-	}
-
-	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
-	if (ret)
-		cpufreq_cpu_put(policy);
+	if (!frozen)
+		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
 
+	cpufreq_cpu_put(policy);
 	return ret;
 }
 #endif
@@ -1298,12 +1290,6 @@ static int __cpufreq_remove_dev(struct d
 		if (!frozen)
 			cpufreq_policy_free(data);
 	} else {
-
-		if (!frozen) {
-			pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
-			cpufreq_cpu_put(data);
-		}
-
 		if (cpufreq_driver->target) {
 			__cpufreq_governor(data, CPUFREQ_GOV_START);
 			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 20:53                         ` [Update][PATCH] " Rafael J. Wysocki
@ 2013-08-02  4:37                           ` Viresh Kumar
  2013-08-02  6:49                             ` Srivatsa S. Bhat
  2013-08-02 10:55                             ` Viresh Kumar
  2013-08-02 10:30                           ` Viresh Kumar
  1 sibling, 2 replies; 29+ messages in thread
From: Viresh Kumar @ 2013-08-02  4:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, cpufreq, Srivatsa S. Bhat, LKML, Lists linaro-kernel

Wow!! Lot of stuff happened while I was asleep..

@Srivatsa: Thanks for answering what I would have answered to Rafael :)
And you should really get some sleep, I would suggest :)

On 2 August 2013 02:23, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: cpufreq: Do not hold driver module references for additional policy CPUs

I still have issues with this subject. Why don't we get rid of .owner
field completely? And stop using a mix of cpufreq_cpu_get() and
kobject_get()?

> The cpufreq core is a little inconsistent in the way it uses the
> driver module refcount.
>
> Namely, if __cpufreq_add_dev() is called for a CPU that doesn't
> share the policy object with any other CPUs, the driver module
> refcount it grabs to start with will be dropped by it before
> returning and will be equal to 0 afterward.

It wouldn't be zero but 1, this is what it is initialized with probably.
That's what I can see in my tests.

> However, if the given CPU does share the policy object with other
> CPUs, either cpufreq_add_policy_cpu() is called to link the new CPU
> to the existing policy, or cpufreq_add_dev_symlink() is used to link
> the other CPUs sharing the policy with it to the just created policy
> object.  In that case, because both cpufreq_add_policy_cpu() and
> cpufreq_add_dev_symlink() call cpufreq_cpu_get() for the given
> policy (the latter possibly many times) without the balancing
> cpufreq_cpu_put() (unless there is an error), the driver module
> refcount will be left by __cpufreq_add_dev() with a nonzero value.
>
> To remove that inconsistency make cpufreq_add_policy_cpu() execute
> cpufreq_cpu_put() for the given policy before returning, which
> decrements the driver module refcount so that it will be 0 after
> __cpufreq_add_dev() returns.  Moreover, remove the cpufreq_cpu_get()
> call from cpufreq_add_dev_symlink(), since both the policy refcount
> and the driver module refcount are nonzero when it is called and they
> don't need to be bumped up by it.
>
> Accordingly, drop the cpufreq_cpu_put() from __cpufreq_remove_dev(),
> since it is only necessary to balance the cpufreq_cpu_get() called
> by cpufreq_add_policy_cpu() or cpufreq_add_dev_symlink().
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/cpufreq.c |   28 +++++++---------------------
>  1 file changed, 7 insertions(+), 21 deletions(-)

So, we can't rmmod the module as soon as it is inserted and so the
problem stays as is. :(

> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -818,14 +818,11 @@ static int cpufreq_add_dev_symlink(struc
>                         continue;
>
>                 pr_debug("Adding link for CPU: %u\n", j);
> -               cpufreq_cpu_get(policy->cpu);
>                 cpu_dev = get_cpu_device(j);
>                 ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
>                                         "cpufreq");
> -               if (ret) {
> -                       cpufreq_cpu_put(policy);
> -                       return ret;
> -               }
> +               if (ret)
> +                       break;
>         }
>         return ret;
>  }
> @@ -908,7 +905,8 @@ static int cpufreq_add_policy_cpu(unsign
>         unsigned long flags;
>
>         policy = cpufreq_cpu_get(sibling);

This can be skipped completely at this place. Caller of
cpufreq_add_policy_cpu() has got the policy pointer with it and so
can be passed. I haven't done it earlier as the impression was we need
to call cpufreq_cpu_get()..

> -       WARN_ON(!policy);
> +       if (WARN_ON_ONCE(!policy))
> +               return -ENODATA;
>
>         if (has_target)
>                 __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> @@ -930,16 +928,10 @@ static int cpufreq_add_policy_cpu(unsign
>         }
>
>         /* Don't touch sysfs links during light-weight init */
> -       if (frozen) {
> -               /* Drop the extra refcount that we took above */
> -               cpufreq_cpu_put(policy);
> -               return 0;
> -       }
> -
> -       ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> -       if (ret)
> -               cpufreq_cpu_put(policy);
> +       if (!frozen)
> +               ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>
> +       cpufreq_cpu_put(policy);

And so this will go away.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-02  4:37                           ` Viresh Kumar
@ 2013-08-02  6:49                             ` Srivatsa S. Bhat
  2013-08-02  6:59                               ` Viresh Kumar
  2013-08-02  9:36                               ` Viresh Kumar
  2013-08-02 10:55                             ` Viresh Kumar
  1 sibling, 2 replies; 29+ messages in thread
From: Srivatsa S. Bhat @ 2013-08-02  6:49 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM list, cpufreq, LKML, Lists linaro-kernel

On 08/02/2013 10:07 AM, Viresh Kumar wrote:
> Wow!! Lot of stuff happened while I was asleep..
> 
> @Srivatsa: Thanks for answering what I would have answered to Rafael :)
> And you should really get some sleep, I would suggest :)

No problem :-) And thank you for your concern :-)

> 
> On 2 August 2013 02:23, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> Subject: cpufreq: Do not hold driver module references for additional policy CPUs
> 
> I still have issues with this subject. Why don't we get rid of .owner
> field completely? And stop using a mix of cpufreq_cpu_get() and
> kobject_get()?
>

I guess Rafael's intention is to do one thing at a time - fix the
inconsistency first, and then rework the synchronization on top of
it. And that makes sense to me, since its the logical way of fixing
all these issues.
 
>> The cpufreq core is a little inconsistent in the way it uses the
>> driver module refcount.
>>
>> Namely, if __cpufreq_add_dev() is called for a CPU that doesn't
>> share the policy object with any other CPUs, the driver module
>> refcount it grabs to start with will be dropped by it before
>> returning and will be equal to 0 afterward.
> 
> It wouldn't be zero but 1, this is what it is initialized with probably.
> That's what I can see in my tests.
> 

But lsmod shows 0 for the cpufreq driver right? (Note, your related_cpus
should have only 1 CPU each, for you to see 0. Else, you'll see a non-zero
value due to the very bug/inconsistency that Rafael is fixing in this
patch).

>> However, if the given CPU does share the policy object with other
>> CPUs, either cpufreq_add_policy_cpu() is called to link the new CPU
>> to the existing policy, or cpufreq_add_dev_symlink() is used to link
>> the other CPUs sharing the policy with it to the just created policy
>> object.  In that case, because both cpufreq_add_policy_cpu() and
>> cpufreq_add_dev_symlink() call cpufreq_cpu_get() for the given
>> policy (the latter possibly many times) without the balancing
>> cpufreq_cpu_put() (unless there is an error), the driver module
>> refcount will be left by __cpufreq_add_dev() with a nonzero value.
>>
>> To remove that inconsistency make cpufreq_add_policy_cpu() execute
>> cpufreq_cpu_put() for the given policy before returning, which
>> decrements the driver module refcount so that it will be 0 after
>> __cpufreq_add_dev() returns.  Moreover, remove the cpufreq_cpu_get()
>> call from cpufreq_add_dev_symlink(), since both the policy refcount
>> and the driver module refcount are nonzero when it is called and they
>> don't need to be bumped up by it.
>>
>> Accordingly, drop the cpufreq_cpu_put() from __cpufreq_remove_dev(),
>> since it is only necessary to balance the cpufreq_cpu_get() called
>> by cpufreq_add_policy_cpu() or cpufreq_add_dev_symlink().
>>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
>> ---
>>  drivers/cpufreq/cpufreq.c |   28 +++++++---------------------
>>  1 file changed, 7 insertions(+), 21 deletions(-)
> 
> So, we can't rmmod the module as soon as it is inserted and so the
> problem stays as is. :(
> 

No, we get one step closer to the solution, since we fix the inconsistency
between refcounts. Next step would be to get rid of refcounts and use
locking like you suggested. Then we can rmmod it easily. I'm assuming
Rafael has the same plan.

(We could have done all this in one-shot, but that would make it difficult
to track regressions etc. So good to have each improvement in a separate
patch).

>> Index: linux-pm/drivers/cpufreq/cpufreq.c
>> ===================================================================
>> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
>> +++ linux-pm/drivers/cpufreq/cpufreq.c
>> @@ -818,14 +818,11 @@ static int cpufreq_add_dev_symlink(struc
>>                         continue;
>>
>>                 pr_debug("Adding link for CPU: %u\n", j);
>> -               cpufreq_cpu_get(policy->cpu);
>>                 cpu_dev = get_cpu_device(j);
>>                 ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj,
>>                                         "cpufreq");
>> -               if (ret) {
>> -                       cpufreq_cpu_put(policy);
>> -                       return ret;
>> -               }
>> +               if (ret)
>> +                       break;
>>         }
>>         return ret;
>>  }
>> @@ -908,7 +905,8 @@ static int cpufreq_add_policy_cpu(unsign
>>         unsigned long flags;
>>
>>         policy = cpufreq_cpu_get(sibling);
> 
> This can be skipped completely at this place. Caller of
> cpufreq_add_policy_cpu() has got the policy pointer with it and so
> can be passed. I haven't done it earlier as the impression was we need
> to call cpufreq_cpu_get()..
> 

Agreed, that would be a good cleanup.

>> -       WARN_ON(!policy);
>> +       if (WARN_ON_ONCE(!policy))
>> +               return -ENODATA;
>>
>>         if (has_target)
>>                 __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
>> @@ -930,16 +928,10 @@ static int cpufreq_add_policy_cpu(unsign
>>         }
>>
>>         /* Don't touch sysfs links during light-weight init */
>> -       if (frozen) {
>> -               /* Drop the extra refcount that we took above */
>> -               cpufreq_cpu_put(policy);
>> -               return 0;
>> -       }
>> -
>> -       ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>> -       if (ret)
>> -               cpufreq_cpu_put(policy);
>> +       if (!frozen)
>> +               ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>>
>> +       cpufreq_cpu_put(policy);
> 
> And so this will go away.
> 

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-02  6:49                             ` Srivatsa S. Bhat
@ 2013-08-02  6:59                               ` Viresh Kumar
  2013-08-02  7:09                                 ` Srivatsa S. Bhat
  2013-08-02  9:36                               ` Viresh Kumar
  1 sibling, 1 reply; 29+ messages in thread
From: Viresh Kumar @ 2013-08-02  6:59 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Rafael J. Wysocki, Linux PM list, cpufreq, LKML, Lists linaro-kernel

On 2 August 2013 12:19, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> But lsmod shows 0 for the cpufreq driver right? (Note, your related_cpus
> should have only 1 CPU each, for you to see 0. Else, you'll see a non-zero
> value due to the very bug/inconsistency that Rafael is fixing in this
> patch).

I have hacked the driver this way:

@@ -2114,10 +2114,16 @@ int cpufreq_register_driver(struct
cpufreq_driver *driver_data)
        cpufreq_driver = driver_data;
        write_unlock_irqrestore(&cpufreq_driver_lock, flags);

+       printk(KERN_INFO "%s: Module refcount: %lu\n", __func__,
+                       module_refcount(cpufreq_driver->owner));
+
        ret = subsys_interface_register(&cpufreq_interface);
        if (ret)
                goto err_null_driver;


And this gave me 1..

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-02  6:59                               ` Viresh Kumar
@ 2013-08-02  7:09                                 ` Srivatsa S. Bhat
  2013-08-02  7:16                                   ` Viresh Kumar
  0 siblings, 1 reply; 29+ messages in thread
From: Srivatsa S. Bhat @ 2013-08-02  7:09 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM list, cpufreq, LKML, Lists linaro-kernel

On 08/02/2013 12:29 PM, Viresh Kumar wrote:
> On 2 August 2013 12:19, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> But lsmod shows 0 for the cpufreq driver right? (Note, your related_cpus
>> should have only 1 CPU each, for you to see 0. Else, you'll see a non-zero
>> value due to the very bug/inconsistency that Rafael is fixing in this
>> patch).
> 
> I have hacked the driver this way:
> 
> @@ -2114,10 +2114,16 @@ int cpufreq_register_driver(struct
> cpufreq_driver *driver_data)
>         cpufreq_driver = driver_data;
>         write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> 
> +       printk(KERN_INFO "%s: Module refcount: %lu\n", __func__,
> +                       module_refcount(cpufreq_driver->owner));
> +
>         ret = subsys_interface_register(&cpufreq_interface);
>         if (ret)
>                 goto err_null_driver;
> 
> 
> And this gave me 1..
> 

Well, on my system, lsmod shows:

acpi_cpufreq           13643  0 

The last column is the refcount, as printed by:
kernel/module.c: print_unload_info()

 913         seq_printf(m, " %lu ", module_refcount(mod));


I guess you are printing it at an odd time, when the module
is still running its init function. Perhaps the core kernel
module infrastructure increments the refcount around that
region temporarily?

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-02  7:09                                 ` Srivatsa S. Bhat
@ 2013-08-02  7:16                                   ` Viresh Kumar
  0 siblings, 0 replies; 29+ messages in thread
From: Viresh Kumar @ 2013-08-02  7:16 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Rafael J. Wysocki, Linux PM list, cpufreq, LKML, Lists linaro-kernel

On 2 August 2013 12:39, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> I guess you are printing it at an odd time, when the module
> is still running its init function. Perhaps the core kernel
> module infrastructure increments the refcount around that
> region temporarily?

If I think logically, that sounds correct. I haven't looked at the
implementation details though. But yes, I understood why
my refcount was incremented :)

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-02  6:49                             ` Srivatsa S. Bhat
  2013-08-02  6:59                               ` Viresh Kumar
@ 2013-08-02  9:36                               ` Viresh Kumar
  2013-08-02 10:12                                 ` Srivatsa S. Bhat
  1 sibling, 1 reply; 29+ messages in thread
From: Viresh Kumar @ 2013-08-02  9:36 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Rafael J. Wysocki, Linux PM list, cpufreq, LKML, Lists linaro-kernel

On 2 August 2013 12:19, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> On 08/02/2013 10:07 AM, Viresh Kumar wrote:
>> So, we can't rmmod the module as soon as it is inserted and so the
>> problem stays as is. :(
>>
>
> No, we get one step closer to the solution, since we fix the inconsistency
> between refcounts. Next step would be to get rid of refcounts and use
> locking like you suggested. Then we can rmmod it easily. I'm assuming
> Rafael has the same plan.

Not really. We are putting the reference at the end of add_dev() and
so refcount would be zero when we aren't running any critical sections.
And so, we can rmmod the module now and that problem is gone.

@Rafael: I will try to do generic cleanups in cpufreq in coming time
and will take care to remove .owner field completely in that. Until that
point your patches look fine:

For both of your patches:
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-02  9:36                               ` Viresh Kumar
@ 2013-08-02 10:12                                 ` Srivatsa S. Bhat
  0 siblings, 0 replies; 29+ messages in thread
From: Srivatsa S. Bhat @ 2013-08-02 10:12 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM list, cpufreq, LKML, Lists linaro-kernel

On 08/02/2013 03:06 PM, Viresh Kumar wrote:
> On 2 August 2013 12:19, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> On 08/02/2013 10:07 AM, Viresh Kumar wrote:
>>> So, we can't rmmod the module as soon as it is inserted and so the
>>> problem stays as is. :(
>>>
>>
>> No, we get one step closer to the solution, since we fix the inconsistency
>> between refcounts. Next step would be to get rid of refcounts and use
>> locking like you suggested. Then we can rmmod it easily. I'm assuming
>> Rafael has the same plan.
> 
> Not really. We are putting the reference at the end of add_dev() and
> so refcount would be zero when we aren't running any critical sections.
> And so, we can rmmod the module now and that problem is gone.
> 

Ah, yes, you are right.

> @Rafael: I will try to do generic cleanups in cpufreq in coming time
> and will take care to remove .owner field completely in that. Until that
> point your patches look fine:
> 
> For both of your patches:
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> 
 
Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-01 20:53                         ` [Update][PATCH] " Rafael J. Wysocki
  2013-08-02  4:37                           ` Viresh Kumar
@ 2013-08-02 10:30                           ` Viresh Kumar
  2013-08-02 11:35                             ` Srivatsa S. Bhat
  1 sibling, 1 reply; 29+ messages in thread
From: Viresh Kumar @ 2013-08-02 10:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, cpufreq, Srivatsa S. Bhat, LKML, Lists linaro-kernel

On 2 August 2013 02:23, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> To remove that inconsistency make cpufreq_add_policy_cpu() execute
> cpufreq_cpu_put() for the given policy before returning, which
> decrements the driver module refcount so that it will be 0 after
> __cpufreq_add_dev() returns.  Moreover, remove the cpufreq_cpu_get()
> call from cpufreq_add_dev_symlink(), since both the policy refcount
> and the driver module refcount are nonzero when it is called and they
> don't need to be bumped up by it.

Sorry for creating so many problems but my concerns with this patch
aren't yet over :(

Should we increment policy refcount or kobj refcount for every cpu it
is used on? I think yes, that's probably the right way of doing it.

And so we simply can't remove calls to cpufreq_cpu_get() from
cpufreq_add_dev_symlink() routine and also from
cpufreq_add_policy_cpu()..

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-02  4:37                           ` Viresh Kumar
  2013-08-02  6:49                             ` Srivatsa S. Bhat
@ 2013-08-02 10:55                             ` Viresh Kumar
  2013-08-02 13:31                               ` Rafael J. Wysocki
  1 sibling, 1 reply; 29+ messages in thread
From: Viresh Kumar @ 2013-08-02 10:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, cpufreq, Srivatsa S. Bhat, LKML, Lists linaro-kernel

[-- Attachment #1: Type: text/plain, Size: 2576 bytes --]

On 2 August 2013 10:07, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> @@ -908,7 +905,8 @@ static int cpufreq_add_policy_cpu(unsign
>>         unsigned long flags;
>>
>>         policy = cpufreq_cpu_get(sibling);
>
> This can be skipped completely at this place. Caller of
> cpufreq_add_policy_cpu() has got the policy pointer with it and so
> can be passed. I haven't done it earlier as the impression was we need
> to call cpufreq_cpu_get()..

And here is the fixup to do this (attached too):

---
 drivers/cpufreq/cpufreq.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 46e70ae..47f2a6e 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -889,21 +889,17 @@ static void cpufreq_init_policy(struct
cpufreq_policy *policy)
 }

 #ifdef CONFIG_HOTPLUG_CPU
-static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
-				  struct device *dev, bool frozen)
+static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
+				  unsigned int cpu, struct device *dev,
+				  bool frozen)
 {
-	struct cpufreq_policy *policy;
 	int ret = 0, has_target = !!cpufreq_driver->target;
 	unsigned long flags;

-	policy = cpufreq_cpu_get(sibling);
-	if (WARN_ON_ONCE(!policy))
-		return -ENODATA;
-
 	if (has_target)
 		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);

-	lock_policy_rwsem_write(sibling);
+	lock_policy_rwsem_write(policy->cpu);

 	write_lock_irqsave(&cpufreq_driver_lock, flags);

@@ -912,7 +908,7 @@ static int cpufreq_add_policy_cpu(unsigned int
cpu, unsigned int sibling,
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);

-	unlock_policy_rwsem_write(sibling);
+	unlock_policy_rwsem_write(policy->cpu);

 	if (has_target) {
 		__cpufreq_governor(policy, CPUFREQ_GOV_START);
@@ -923,7 +919,6 @@ static int cpufreq_add_policy_cpu(unsigned int
cpu, unsigned int sibling,
 	if (!frozen)
 		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");

-	cpufreq_cpu_put(policy);
 	return ret;
 }
 #endif
@@ -1006,8 +1001,7 @@ static int __cpufreq_add_dev(struct device *dev,
struct subsys_interface *sif,
 		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
 		if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
 			read_unlock_irqrestore(&cpufreq_driver_lock, flags);
-			return cpufreq_add_policy_cpu(cpu, sibling, dev,
-						      frozen);
+			return cpufreq_add_policy_cpu(cp, cpu, dev, frozen);
 		}
 	}
 	read_unlock_irqrestore(&cpufreq_driver_lock, flags);

[-- Attachment #2: 0001-Fix.patch --]
[-- Type: application/octet-stream, Size: 2437 bytes --]

From 77963cbcea2e148a2fe1b8865266cc4e2f187262 Mon Sep 17 00:00:00 2001
Message-Id: <77963cbcea2e148a2fe1b8865266cc4e2f187262.1375440908.git.viresh.kumar@linaro.org>
From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Fri, 2 Aug 2013 16:17:33 +0530
Subject: [PATCH] Fix

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 46e70ae..47f2a6e 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -889,21 +889,17 @@ static void cpufreq_init_policy(struct cpufreq_policy *policy)
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
-static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
-				  struct device *dev, bool frozen)
+static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
+				  unsigned int cpu, struct device *dev,
+				  bool frozen)
 {
-	struct cpufreq_policy *policy;
 	int ret = 0, has_target = !!cpufreq_driver->target;
 	unsigned long flags;
 
-	policy = cpufreq_cpu_get(sibling);
-	if (WARN_ON_ONCE(!policy))
-		return -ENODATA;
-
 	if (has_target)
 		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 
-	lock_policy_rwsem_write(sibling);
+	lock_policy_rwsem_write(policy->cpu);
 
 	write_lock_irqsave(&cpufreq_driver_lock, flags);
 
@@ -912,7 +908,7 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
-	unlock_policy_rwsem_write(sibling);
+	unlock_policy_rwsem_write(policy->cpu);
 
 	if (has_target) {
 		__cpufreq_governor(policy, CPUFREQ_GOV_START);
@@ -923,7 +919,6 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 	if (!frozen)
 		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
 
-	cpufreq_cpu_put(policy);
 	return ret;
 }
 #endif
@@ -1006,8 +1001,7 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
 		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
 		if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
 			read_unlock_irqrestore(&cpufreq_driver_lock, flags);
-			return cpufreq_add_policy_cpu(cpu, sibling, dev,
-						      frozen);
+			return cpufreq_add_policy_cpu(cp, cpu, dev, frozen);
 		}
 	}
 	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
-- 
1.7.12.rc2.18.g61b472e


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-02 10:30                           ` Viresh Kumar
@ 2013-08-02 11:35                             ` Srivatsa S. Bhat
  0 siblings, 0 replies; 29+ messages in thread
From: Srivatsa S. Bhat @ 2013-08-02 11:35 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM list, cpufreq, LKML, Lists linaro-kernel

On 08/02/2013 04:00 PM, Viresh Kumar wrote:
> On 2 August 2013 02:23, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> To remove that inconsistency make cpufreq_add_policy_cpu() execute
>> cpufreq_cpu_put() for the given policy before returning, which
>> decrements the driver module refcount so that it will be 0 after
>> __cpufreq_add_dev() returns.  Moreover, remove the cpufreq_cpu_get()
>> call from cpufreq_add_dev_symlink(), since both the policy refcount
>> and the driver module refcount are nonzero when it is called and they
>> don't need to be bumped up by it.
> 
> Sorry for creating so many problems but my concerns with this patch
> aren't yet over :(
> 
> Should we increment policy refcount or kobj refcount for every cpu it
> is used on? I think yes, that's probably the right way of doing it.
>

It depends on how you look at it. The number of CPUs in the policy
(cpumask_weight(policy)) itself serves as a refcount. We don't actually
need yet another refcount to manage things. Besides, not bumping
up the policy refcount for every CPU actually seems to simplify the
code and make it easier to understand, so why not do it? :-)
 
> And so we simply can't remove calls to cpufreq_cpu_get() from
> cpufreq_add_dev_symlink() routine and also from
> cpufreq_add_policy_cpu()..


Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-02 10:55                             ` Viresh Kumar
@ 2013-08-02 13:31                               ` Rafael J. Wysocki
  2013-08-02 14:38                                 ` Viresh Kumar
  0 siblings, 1 reply; 29+ messages in thread
From: Rafael J. Wysocki @ 2013-08-02 13:31 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Linux PM list, cpufreq, Srivatsa S. Bhat, LKML, Lists linaro-kernel

On Friday, August 02, 2013 04:25:58 PM Viresh Kumar wrote:
> On 2 August 2013 10:07, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >> @@ -908,7 +905,8 @@ static int cpufreq_add_policy_cpu(unsign
> >>         unsigned long flags;
> >>
> >>         policy = cpufreq_cpu_get(sibling);
> >
> > This can be skipped completely at this place. Caller of
> > cpufreq_add_policy_cpu() has got the policy pointer with it and so
> > can be passed. I haven't done it earlier as the impression was we need
> > to call cpufreq_cpu_get()..
> 
> And here is the fixup to do this (attached too):

Care to add a changelog?

I'll apply this on top of my $subject patch, then.

Thanks,
Rafael


> ---
>  drivers/cpufreq/cpufreq.c | 18 ++++++------------
>  1 file changed, 6 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 46e70ae..47f2a6e 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -889,21 +889,17 @@ static void cpufreq_init_policy(struct
> cpufreq_policy *policy)
>  }
> 
>  #ifdef CONFIG_HOTPLUG_CPU
> -static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
> -				  struct device *dev, bool frozen)
> +static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
> +				  unsigned int cpu, struct device *dev,
> +				  bool frozen)
>  {
> -	struct cpufreq_policy *policy;
>  	int ret = 0, has_target = !!cpufreq_driver->target;
>  	unsigned long flags;
> 
> -	policy = cpufreq_cpu_get(sibling);
> -	if (WARN_ON_ONCE(!policy))
> -		return -ENODATA;
> -
>  	if (has_target)
>  		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> 
> -	lock_policy_rwsem_write(sibling);
> +	lock_policy_rwsem_write(policy->cpu);
> 
>  	write_lock_irqsave(&cpufreq_driver_lock, flags);
> 
> @@ -912,7 +908,7 @@ static int cpufreq_add_policy_cpu(unsigned int
> cpu, unsigned int sibling,
>  	per_cpu(cpufreq_cpu_data, cpu) = policy;
>  	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> 
> -	unlock_policy_rwsem_write(sibling);
> +	unlock_policy_rwsem_write(policy->cpu);
> 
>  	if (has_target) {
>  		__cpufreq_governor(policy, CPUFREQ_GOV_START);
> @@ -923,7 +919,6 @@ static int cpufreq_add_policy_cpu(unsigned int
> cpu, unsigned int sibling,
>  	if (!frozen)
>  		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> 
> -	cpufreq_cpu_put(policy);
>  	return ret;
>  }
>  #endif
> @@ -1006,8 +1001,7 @@ static int __cpufreq_add_dev(struct device *dev,
> struct subsys_interface *sif,
>  		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
>  		if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
>  			read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> -			return cpufreq_add_policy_cpu(cpu, sibling, dev,
> -						      frozen);
> +			return cpufreq_add_policy_cpu(cp, cpu, dev, frozen);
>  		}
>  	}
>  	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-02 13:31                               ` Rafael J. Wysocki
@ 2013-08-02 14:38                                 ` Viresh Kumar
  2013-08-02 14:46                                   ` Srivatsa S. Bhat
  0 siblings, 1 reply; 29+ messages in thread
From: Viresh Kumar @ 2013-08-02 14:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, cpufreq, Srivatsa S. Bhat, LKML, Lists linaro-kernel

[-- Attachment #1: Type: text/plain, Size: 762 bytes --]

On 2 August 2013 19:01, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> Care to add a changelog?
>
> I'll apply this on top of my $subject patch, then.

I thought you will merge this one with your patch, but there is no
harm in getting one more on my name :)

Changelog is (patch attached):

From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Fri, 2 Aug 2013 16:17:33 +0530
Subject: [PATCH] cpufreq: Pass policy to cpufreq_add_policy_cpu()

Caller of cpufreq_add_policy_cpu() already has pointer to policy structure and
so there is no need to find it again in cpufreq_add_policy_cpu(). Lets pass it
directly.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

[-- Attachment #2: 0001-cpufreq-Pass-policy-to-cpufreq_add_policy_cpu.patch --]
[-- Type: application/octet-stream, Size: 2651 bytes --]

From e42fced50772cd2893cccc48990de123e82fa9ba Mon Sep 17 00:00:00 2001
Message-Id: <e42fced50772cd2893cccc48990de123e82fa9ba.1375454298.git.viresh.kumar@linaro.org>
From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Fri, 2 Aug 2013 16:17:33 +0530
Subject: [PATCH] cpufreq: Pass policy to cpufreq_add_policy_cpu()

Caller of cpufreq_add_policy_cpu() already has pointer to policy structure and
so there is no need to find it again in cpufreq_add_policy_cpu(). Lets pass it
directly.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 46e70ae..47f2a6e 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -889,21 +889,17 @@ static void cpufreq_init_policy(struct cpufreq_policy *policy)
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
-static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
-				  struct device *dev, bool frozen)
+static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
+				  unsigned int cpu, struct device *dev,
+				  bool frozen)
 {
-	struct cpufreq_policy *policy;
 	int ret = 0, has_target = !!cpufreq_driver->target;
 	unsigned long flags;
 
-	policy = cpufreq_cpu_get(sibling);
-	if (WARN_ON_ONCE(!policy))
-		return -ENODATA;
-
 	if (has_target)
 		__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 
-	lock_policy_rwsem_write(sibling);
+	lock_policy_rwsem_write(policy->cpu);
 
 	write_lock_irqsave(&cpufreq_driver_lock, flags);
 
@@ -912,7 +908,7 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
-	unlock_policy_rwsem_write(sibling);
+	unlock_policy_rwsem_write(policy->cpu);
 
 	if (has_target) {
 		__cpufreq_governor(policy, CPUFREQ_GOV_START);
@@ -923,7 +919,6 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 	if (!frozen)
 		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
 
-	cpufreq_cpu_put(policy);
 	return ret;
 }
 #endif
@@ -1006,8 +1001,7 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
 		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
 		if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
 			read_unlock_irqrestore(&cpufreq_driver_lock, flags);
-			return cpufreq_add_policy_cpu(cpu, sibling, dev,
-						      frozen);
+			return cpufreq_add_policy_cpu(cp, cpu, dev, frozen);
 		}
 	}
 	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
-- 
1.7.12.rc2.18.g61b472e


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [Update][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs
  2013-08-02 14:38                                 ` Viresh Kumar
@ 2013-08-02 14:46                                   ` Srivatsa S. Bhat
  0 siblings, 0 replies; 29+ messages in thread
From: Srivatsa S. Bhat @ 2013-08-02 14:46 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Linux PM list, cpufreq, LKML, Lists linaro-kernel

On 08/02/2013 08:08 PM, Viresh Kumar wrote:
> On 2 August 2013 19:01, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> Care to add a changelog?
>>
>> I'll apply this on top of my $subject patch, then.
> 
> I thought you will merge this one with your patch, but there is no
> harm in getting one more on my name :)
> 
> Changelog is (patch attached):
> 
> From: Viresh Kumar <viresh.kumar@linaro.org>
> Date: Fri, 2 Aug 2013 16:17:33 +0530
> Subject: [PATCH] cpufreq: Pass policy to cpufreq_add_policy_cpu()
> 
> Caller of cpufreq_add_policy_cpu() already has pointer to policy structure and
> so there is no need to find it again in cpufreq_add_policy_cpu(). Lets pass it
> directly.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---

The patch looks good.

Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Regards,
Srivatsa S. Bhat

>  drivers/cpufreq/cpufreq.c | 18 ++++++------------
>  1 file changed, 6 insertions(+), 12 deletions(-)
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2013-08-02 14:49 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-01  0:08 [RFC][PATCH] cpufreq: Do not hold driver module references for additional policy CPUs Rafael J. Wysocki
2013-08-01  8:11 ` Srivatsa S. Bhat
2013-08-01 14:44   ` Viresh Kumar
2013-08-01 15:24     ` Srivatsa S. Bhat
2013-08-01 17:24       ` Viresh Kumar
2013-08-01 18:09         ` Rafael J. Wysocki
2013-08-01 18:04       ` Rafael J. Wysocki
2013-08-01 18:06         ` Srivatsa S. Bhat
2013-08-01 19:01           ` Rafael J. Wysocki
2013-08-01 19:01             ` Srivatsa S. Bhat
2013-08-01 19:21               ` Rafael J. Wysocki
2013-08-01 19:21                 ` Srivatsa S. Bhat
2013-08-01 20:04                   ` Rafael J. Wysocki
2013-08-01 20:26                     ` Srivatsa S. Bhat
2013-08-01 20:47                       ` Rafael J. Wysocki
2013-08-01 20:53                         ` [Update][PATCH] " Rafael J. Wysocki
2013-08-02  4:37                           ` Viresh Kumar
2013-08-02  6:49                             ` Srivatsa S. Bhat
2013-08-02  6:59                               ` Viresh Kumar
2013-08-02  7:09                                 ` Srivatsa S. Bhat
2013-08-02  7:16                                   ` Viresh Kumar
2013-08-02  9:36                               ` Viresh Kumar
2013-08-02 10:12                                 ` Srivatsa S. Bhat
2013-08-02 10:55                             ` Viresh Kumar
2013-08-02 13:31                               ` Rafael J. Wysocki
2013-08-02 14:38                                 ` Viresh Kumar
2013-08-02 14:46                                   ` Srivatsa S. Bhat
2013-08-02 10:30                           ` Viresh Kumar
2013-08-02 11:35                             ` Srivatsa S. Bhat

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.