All of lore.kernel.org
 help / color / mirror / Atom feed
* stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
@ 2013-07-27 17:40 Toralf Förster
  2013-07-27 23:39   ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Toralf Förster @ 2013-07-27 17:40 UTC (permalink / raw)
  To: cpufreq

it gives at a ThinkPad T420:

tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
acpi_cpufreq           12902  2147483647


wow !

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-27 17:40 stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq" Toralf Förster
@ 2013-07-27 23:39   ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2013-07-27 23:39 UTC (permalink / raw)
  To: Toralf Förster; +Cc: cpufreq, Linux PM list

On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
> it gives at a ThinkPad T420:
> 
> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
> acpi_cpufreq           12902  2147483647

That is -1, which indicates some module refcount woes.

I definitely can't see that with the mainline on my machines.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
@ 2013-07-27 23:39   ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2013-07-27 23:39 UTC (permalink / raw)
  To: Toralf Förster; +Cc: cpufreq, Linux PM list

On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
> it gives at a ThinkPad T420:
> 
> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
> acpi_cpufreq           12902  2147483647

That is -1, which indicates some module refcount woes.

I definitely can't see that with the mainline on my machines.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-27 23:39   ` Rafael J. Wysocki
  (?)
@ 2013-07-28  8:08   ` Toralf Förster
  2013-07-28  8:08     ` Srivatsa S. Bhat
  -1 siblings, 1 reply; 25+ messages in thread
From: Toralf Förster @ 2013-07-28  8:08 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: cpufreq, Linux PM list, Srivatsa S. Bhat

On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
> On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
>> it gives at a ThinkPad T420:
>>
>> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
>> acpi_cpufreq           12902  2147483647
> 
> That is -1, which indicates some module refcount woes.
> 
> I definitely can't see that with the mainline on my machines.

It might be a regression in -stable only, b/c in 3,10.2 I did not
observed it.

Srivatsa,

by any chance - could the revert of the cpufreq patches have something
to do with that ?

-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-28  8:08   ` Toralf Förster
@ 2013-07-28  8:08     ` Srivatsa S. Bhat
  0 siblings, 0 replies; 25+ messages in thread
From: Srivatsa S. Bhat @ 2013-07-28  8:08 UTC (permalink / raw)
  To: Toralf Förster
  Cc: Rafael J. Wysocki, cpufreq, Linux PM list, linux-kernel, Viresh Kumar

On 07/28/2013 01:38 PM, Toralf Förster wrote:
> On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
>> On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
>>> it gives at a ThinkPad T420:
>>>
>>> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
>>> acpi_cpufreq           12902  2147483647
>>
>> That is -1, which indicates some module refcount woes.
>>
>> I definitely can't see that with the mainline on my machines.
> 
> It might be a regression in -stable only, b/c in 3,10.2 I did not
> observed it.
> 
> Srivatsa,
> 
> by any chance - could the revert of the cpufreq patches have something
> to do with that ?
>

Hmmm? Those reverts didn't touch anything related to module
refcounts.. So I don't think they have anything to do with this.

Also, is the issue related to suspend/resume at all? (Sorry, I don't
have your original email, so I'm not sure what the exact issue is).

 
Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-27 23:39   ` Rafael J. Wysocki
  (?)
  (?)
@ 2013-07-28 10:21   ` Toralf Förster
  2013-07-28 22:11     ` Rafael J. Wysocki
  -1 siblings, 1 reply; 25+ messages in thread
From: Toralf Förster @ 2013-07-28 10:21 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: cpufreq, Linux PM list, Srivatsa S. Bhat

On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
> On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
>> it gives at a ThinkPad T420:
>>
>> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
>> acpi_cpufreq           12902  2147483647
> 
> That is -1, which indicates some module refcount woes.

yes, ofc.

The issue apears after 1 s2ram/resume cycle, before s2ram the refcount is 1.

> I definitely can't see that with the mainline on my machines.

It is in mainline too.


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-28 10:21   ` Toralf Förster
@ 2013-07-28 22:11     ` Rafael J. Wysocki
  2013-07-28 22:43       ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2013-07-28 22:11 UTC (permalink / raw)
  To: Toralf Förster; +Cc: cpufreq, Linux PM list, Srivatsa S. Bhat

On Sunday, July 28, 2013 12:21:22 PM Toralf Förster wrote:
> On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
> > On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
> >> it gives at a ThinkPad T420:
> >>
> >> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
> >> acpi_cpufreq           12902  2147483647
> > 
> > That is -1, which indicates some module refcount woes.
> 
> yes, ofc.
> 
> The issue apears after 1 s2ram/resume cycle, before s2ram the refcount is 1.
> 
> > I definitely can't see that with the mainline on my machines.
> 
> It is in mainline too.

Does the appended patch help?

Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume

Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
driver module refcount, the latter has to bump up that refcount to
start with to balance the drop, or the cpufreq driver refcount will
become negative after a suspend/resume cycle.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq.c |   14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -1016,6 +1016,9 @@ static int __cpufreq_remove_dev(struct d
 
 	pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
 
+	if (!try_module_get(cpufreq_driver->owner))
+		return -EINVAL;
+
 	write_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	data = per_cpu(cpufreq_cpu_data, cpu);
@@ -1025,7 +1028,8 @@ static int __cpufreq_remove_dev(struct d
 
 	if (!data) {
 		pr_debug("%s: No cpu_data found\n", __func__);
-		return -EINVAL;
+		ret = -EINVAL;
+		goto err;
 	}
 
 	if (cpufreq_driver->target)
@@ -1063,9 +1067,9 @@ static int __cpufreq_remove_dev(struct d
 
 			unlock_policy_rwsem_write(cpu);
 
-			ret = sysfs_create_link(&cpu_dev->kobj, &data->kobj,
+			sysfs_create_link(&cpu_dev->kobj, &data->kobj,
 					"cpufreq");
-			return -EINVAL;
+			goto err;
 		}
 
 		WARN_ON(lock_policy_rwsem_write(cpu));
@@ -1110,6 +1114,10 @@ static int __cpufreq_remove_dev(struct d
 
 	per_cpu(cpufreq_policy_cpu, cpu) = -1;
 	return 0;
+
+ err:
+	module_put(cpufreq_driver->owner);
+	return ret;
 }
 
 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-28 22:11     ` Rafael J. Wysocki
@ 2013-07-28 22:43       ` Rafael J. Wysocki
  2013-07-28 23:20         ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2013-07-28 22:43 UTC (permalink / raw)
  To: Toralf Förster; +Cc: cpufreq, Linux PM list, Srivatsa S. Bhat

On Monday, July 29, 2013 12:11:18 AM Rafael J. Wysocki wrote:
> On Sunday, July 28, 2013 12:21:22 PM Toralf Förster wrote:
> > On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
> > > On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
> > >> it gives at a ThinkPad T420:
> > >>
> > >> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
> > >> acpi_cpufreq           12902  2147483647
> > > 
> > > That is -1, which indicates some module refcount woes.
> > 
> > yes, ofc.
> > 
> > The issue apears after 1 s2ram/resume cycle, before s2ram the refcount is 1.
> > 
> > > I definitely can't see that with the mainline on my machines.
> > 
> > It is in mainline too.
> 
> Does the appended patch help?

Actually, something as simple as this also should help:

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume

Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
driver module refcount, __cpufreq_remove_dev() causes that refcount
to become negative after a suspend/resume cycle, for example.

To prevent this from happening make __cpufreq_remove_dev() put
the policy kobject only instead of calling cpufreq_cpu_put().

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -1181,7 +1181,7 @@ static int __cpufreq_remove_dev(struct d
 		__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
 
 	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
-	cpufreq_cpu_put(data);
+	kobject_put(&data->kobj);
 
 	/* If cpu is last user of policy, free policy */
 	if (cpus == 1) {


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-28 22:43       ` Rafael J. Wysocki
@ 2013-07-28 23:20         ` Rafael J. Wysocki
  2013-07-29  7:51           ` Viresh Kumar
  2013-07-29  9:41           ` Srivatsa S. Bhat
  0 siblings, 2 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2013-07-28 23:20 UTC (permalink / raw)
  To: Toralf Förster, Srivatsa S. Bhat; +Cc: cpufreq, Linux PM list

On Monday, July 29, 2013 12:43:59 AM Rafael J. Wysocki wrote:
> On Monday, July 29, 2013 12:11:18 AM Rafael J. Wysocki wrote:
> > On Sunday, July 28, 2013 12:21:22 PM Toralf Förster wrote:
> > > On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
> > > > On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
> > > >> it gives at a ThinkPad T420:
> > > >>
> > > >> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
> > > >> acpi_cpufreq           12902  2147483647
> > > > 
> > > > That is -1, which indicates some module refcount woes.
> > > 
> > > yes, ofc.
> > > 
> > > The issue apears after 1 s2ram/resume cycle, before s2ram the refcount is 1.
> > > 
> > > > I definitely can't see that with the mainline on my machines.
> > > 
> > > It is in mainline too.
> > 
> > Does the appended patch help?
> 
> Actually, something as simple as this also should help:
> 
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
> 
> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
> driver module refcount, __cpufreq_remove_dev() causes that refcount
> to become negative after a suspend/resume cycle, for example.
> 
> To prevent this from happening make __cpufreq_remove_dev() put
> the policy kobject only instead of calling cpufreq_cpu_put().

Having a deeper look at it, though, I see that in fact the whole
cpufreq_cpu_put() is needed if the CPU is not the last one for the given
policy and is not needed at all otherwise (as described in the changelog
of the patch below).

Srivatsa, does this make sense to you?

Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume

Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
driver module refcount, __cpufreq_remove_dev() causes that refcount
to become negative for the acpi-cpufreq driver after a suspend/resume
cycle.

This is not the only bad thing that happens there, however, because
kobject_put() should only be called for the policy kobject at this
point if the CPU is not the last one for that policy.

Namely, if the given CPU is the last one for that policy, the
policy kobject's refcount should be 1 at this point, as set by
cpufreq_add_dev_interface(), and only needs to be dropped once for
the kobject to go away.  This actually happens under the cpu == 1
check, so it need not be done before by cpufreq_cpu_put().

On the other hand, if the given CPU is not the last one for that
policy, this means that cpufreq_add_policy_cpu() has been called
at least once for that policy and cpufreq_cpu_get() has been
called for it too.  To balance that cpufreq_cpu_get(), we need to
call pufreq_cpu_put() in that case.

Thus, to fix the described problem and keep the reference
counters balanced in both cases, move the cpufreq_cpu_get() call
in __cpufreq_remove_dev() to the code path executed only for
CPUs that share the policy with other CPUs.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -1181,7 +1181,6 @@ static int __cpufreq_remove_dev(struct d
 		__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
 
 	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
-	cpufreq_cpu_put(data);
 
 	/* If cpu is last user of policy, free policy */
 	if (cpus == 1) {
@@ -1205,9 +1204,12 @@ static int __cpufreq_remove_dev(struct d
 		free_cpumask_var(data->related_cpus);
 		free_cpumask_var(data->cpus);
 		kfree(data);
-	} else if (cpufreq_driver->target) {
-		__cpufreq_governor(data, CPUFREQ_GOV_START);
-		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
+	} else {
+		cpufreq_cpu_put(data);
+		if (cpufreq_driver->target) {
+			__cpufreq_governor(data, CPUFREQ_GOV_START);
+			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
+		}
 	}
 
 	per_cpu(cpufreq_policy_cpu, cpu) = -1;


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq"
  2013-07-28 23:20         ` Rafael J. Wysocki
@ 2013-07-29  7:51           ` Viresh Kumar
  2013-07-29  9:44             ` Srivatsa S. Bhat
  2013-07-29  9:41           ` Srivatsa S. Bhat
  1 sibling, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2013-07-29  7:51 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toralf Förster, Srivatsa S. Bhat, cpufreq, Linux PM list

On Mon, Jul 29, 2013 at 4:50 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> Having a deeper look at it, though, I see that in fact the whole
> cpufreq_cpu_put() is needed if the CPU is not the last one for the given
> policy and is not needed at all otherwise (as described in the changelog
> of the patch below).
>
> Srivatsa, does this make sense to you?

It makes atleast to me :)

> This is not the only bad thing that happens there, however, because
> kobject_put() should only be called for the policy kobject at this
> point if the CPU is not the last one for that policy.
>
> Namely, if the given CPU is the last one for that policy, the
> policy kobject's refcount should be 1 at this point, as set by
> cpufreq_add_dev_interface(), and only needs to be dropped once for
> the kobject to go away.  This actually happens under the cpu == 1
> check, so it need not be done before by cpufreq_cpu_put().

But I see one more issue with this code. For the last cpu we are just
calling kobject_put() and not cpufreq_cpu_put() and hence call to
module_put() is skipped. I am not sure, but that will probably cause
a problem when we try to rmmod the module? But which module then?
As we can't compile cpufreq.c as module.. So, is this part of code junk?
And so can be removed?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-28 23:20         ` Rafael J. Wysocki
  2013-07-29  7:51           ` Viresh Kumar
@ 2013-07-29  9:41           ` Srivatsa S. Bhat
  2013-07-29 11:22             ` Viresh Kumar
  2013-07-29 11:49             ` Rafael J. Wysocki
  1 sibling, 2 replies; 25+ messages in thread
From: Srivatsa S. Bhat @ 2013-07-29  9:41 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Toralf Förster, cpufreq, Linux PM list

On 07/29/2013 04:50 AM, Rafael J. Wysocki wrote:
> On Monday, July 29, 2013 12:43:59 AM Rafael J. Wysocki wrote:
>> On Monday, July 29, 2013 12:11:18 AM Rafael J. Wysocki wrote:
>>> On Sunday, July 28, 2013 12:21:22 PM Toralf Förster wrote:
>>>> On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
>>>>> On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
>>>>>> it gives at a ThinkPad T420:
>>>>>>
>>>>>> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
>>>>>> acpi_cpufreq           12902  2147483647
>>>>>
>>>>> That is -1, which indicates some module refcount woes.
>>>>
>>>> yes, ofc.
>>>>
>>>> The issue apears after 1 s2ram/resume cycle, before s2ram the refcount is 1.
>>>>
>>>>> I definitely can't see that with the mainline on my machines.
>>>>
>>>> It is in mainline too.
>>>
>>> Does the appended patch help?
>>
>> Actually, something as simple as this also should help:
>>
>> ---
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
>>
>> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
>> driver module refcount, __cpufreq_remove_dev() causes that refcount
>> to become negative after a suspend/resume cycle, for example.
>>
>> To prevent this from happening make __cpufreq_remove_dev() put
>> the policy kobject only instead of calling cpufreq_cpu_put().
> 
> Having a deeper look at it, though, I see that in fact the whole
> cpufreq_cpu_put() is needed if the CPU is not the last one for the given
> policy and is not needed at all otherwise (as described in the changelog
> of the patch below).
> 
> Srivatsa, does this make sense to you?
>

Code-wise, your patch looks good to me. But one thing in the existing code
struck me as a little strange.

I'm assuming that the module_get() is used in the cpufreq core to ensure that
until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
driver module (eg: acpi-cpufreq) can't be removed.

But the cpufreq_add_dev() function does a module *put* at the end of
initializing a fresh CPU.

1057         kobject_uevent(&policy->kobj, KOBJ_ADD);
1058         module_put(cpufreq_driver->owner);
1059         pr_debug("initialization complete\n");
1060 
1061         return 0;

So at that point, the module refcount of the cpufreq backend driver will become
zero, since we dropped it. Perhaps that was not the original intention.

Looking further (with respect to the problem we are discussing), the root-cause
is that a fresh CPU never does a cpufreq_cpu_get() for itself, but does it only
for the other CPUs in its policy->cpus mask.

From cpufreq_add_dev_symlink():

 814         for_each_cpu(j, policy->cpus) {
 815                 struct cpufreq_policy *managed_policy;
 816                 struct device *cpu_dev;
 817 
 818                 if (j == cpu)
 819                         continue;
 820 
 821                 pr_debug("CPU %u already managed, adding link\n", j);
 822                 managed_policy = cpufreq_cpu_get(cpu);
 823                 cpu_dev = get_cpu_device(j);

So, that 'continue' statement skips bumping the refcount on behalf of the master
CPU.

So, I wonder if it would be a good idea to instead allow that CPU to take a
module refcount as well. That way, the problem that Toralf reported would go
away, and at the same time, we can ensure that as long as the cpufreq core is
managing CPUs, the cpufreq-backend-driver module refcount never drops to zero.

Something like this:

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index a4ad733..ecfbc52 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -878,9 +878,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 	}
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
+	/* Bump up the refcount for this CPU */
+	policy = cpufreq_cpu_get(cpu);
+
 	ret = cpufreq_add_dev_symlink(cpu, policy);
-	if (ret)
+	if (ret) {
+		cpufreq_cpu_put(policy);
 		goto err_out_kobj_put;
+	}
 
 	memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
 	/* assure that the starting sequence is run in __cpufreq_set_policy */

Thoughts?

Regards,
Srivatsa S. Bhat
 
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
> 
> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
> driver module refcount, __cpufreq_remove_dev() causes that refcount
> to become negative for the acpi-cpufreq driver after a suspend/resume
> cycle.
> 
> This is not the only bad thing that happens there, however, because
> kobject_put() should only be called for the policy kobject at this
> point if the CPU is not the last one for that policy.
> 
> Namely, if the given CPU is the last one for that policy, the
> policy kobject's refcount should be 1 at this point, as set by
> cpufreq_add_dev_interface(), and only needs to be dropped once for
> the kobject to go away.  This actually happens under the cpu == 1
> check, so it need not be done before by cpufreq_cpu_put().
> 
> On the other hand, if the given CPU is not the last one for that
> policy, this means that cpufreq_add_policy_cpu() has been called
> at least once for that policy and cpufreq_cpu_get() has been
> called for it too.  To balance that cpufreq_cpu_get(), we need to
> call pufreq_cpu_put() in that case.
> 
> Thus, to fix the described problem and keep the reference
> counters balanced in both cases, move the cpufreq_cpu_get() call
> in __cpufreq_remove_dev() to the code path executed only for
> CPUs that share the policy with other CPUs.
> 
> Reported-by: Toralf Förster <toralf.foerster@gmx.de>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq.c |   10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -1181,7 +1181,6 @@ static int __cpufreq_remove_dev(struct d
>  		__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
>  
>  	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> -	cpufreq_cpu_put(data);
>  
>  	/* If cpu is last user of policy, free policy */
>  	if (cpus == 1) {
> @@ -1205,9 +1204,12 @@ static int __cpufreq_remove_dev(struct d
>  		free_cpumask_var(data->related_cpus);
>  		free_cpumask_var(data->cpus);
>  		kfree(data);
> -	} else if (cpufreq_driver->target) {
> -		__cpufreq_governor(data, CPUFREQ_GOV_START);
> -		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
> +	} else {
> +		cpufreq_cpu_put(data);
> +		if (cpufreq_driver->target) {
> +			__cpufreq_governor(data, CPUFREQ_GOV_START);
> +			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
> +		}
>  	}
>  
>  	per_cpu(cpufreq_policy_cpu, cpu) = -1;
> 


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq"
  2013-07-29  7:51           ` Viresh Kumar
@ 2013-07-29  9:44             ` Srivatsa S. Bhat
  0 siblings, 0 replies; 25+ messages in thread
From: Srivatsa S. Bhat @ 2013-07-29  9:44 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Toralf Förster, cpufreq, Linux PM list

On 07/29/2013 01:21 PM, Viresh Kumar wrote:
> On Mon, Jul 29, 2013 at 4:50 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> Having a deeper look at it, though, I see that in fact the whole
>> cpufreq_cpu_put() is needed if the CPU is not the last one for the given
>> policy and is not needed at all otherwise (as described in the changelog
>> of the patch below).
>>
>> Srivatsa, does this make sense to you?
> 
> It makes atleast to me :)
> 
>> This is not the only bad thing that happens there, however, because
>> kobject_put() should only be called for the policy kobject at this
>> point if the CPU is not the last one for that policy.
>>
>> Namely, if the given CPU is the last one for that policy, the
>> policy kobject's refcount should be 1 at this point, as set by
>> cpufreq_add_dev_interface(), and only needs to be dropped once for
>> the kobject to go away.  This actually happens under the cpu == 1
>> check, so it need not be done before by cpufreq_cpu_put().
> 
> But I see one more issue with this code. For the last cpu we are just
> calling kobject_put() and not cpufreq_cpu_put() and hence call to
> module_put() is skipped. I am not sure, but that will probably cause
> a problem when we try to rmmod the module? But which module then?
> As we can't compile cpufreq.c as module.. So, is this part of code junk?
> And so can be removed?
> 

I tried to address this concern in my other mail to Rafael. (Sorry I
forgot to CC you on that!). Let me know what you think of that solution.

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq"
  2013-07-29  9:41           ` Srivatsa S. Bhat
@ 2013-07-29 11:22             ` Viresh Kumar
  2013-07-29 11:54                 ` Rafael J. Wysocki
  2013-07-29 11:49             ` Rafael J. Wysocki
  1 sibling, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2013-07-29 11:22 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Rafael J. Wysocki, Toralf Förster, cpufreq, Linux PM list

On Mon, Jul 29, 2013 at 3:11 PM, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
> I'm assuming that the module_get() is used in the cpufreq core to ensure that
> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
> driver module (eg: acpi-cpufreq) can't be removed.

I missed this simple stuff in my email.. :(

> But the cpufreq_add_dev() function does a module *put* at the end of
> initializing a fresh CPU.
>
> 1057         kobject_uevent(&policy->kobj, KOBJ_ADD);
> 1058         module_put(cpufreq_driver->owner);
> 1059         pr_debug("initialization complete\n");
> 1060
> 1061         return 0;

That actually looks wrong. And shoud be fixed.

> So, I wonder if it would be a good idea to instead allow that CPU to take a
> module refcount as well. That way, the problem that Toralf reported would go
> away, and at the same time, we can ensure that as long as the cpufreq core is
> managing CPUs, the cpufreq-backend-driver module refcount never drops to zero.
>
> Something like this:
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index a4ad733..ecfbc52 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -878,9 +878,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
>         }
>         write_unlock_irqrestore(&cpufreq_driver_lock, flags);
>
> +       /* Bump up the refcount for this CPU */
> +       policy = cpufreq_cpu_get(cpu);
> +
>         ret = cpufreq_add_dev_symlink(cpu, policy);
> -       if (ret)
> +       if (ret) {
> +               cpufreq_cpu_put(policy);
>                 goto err_out_kobj_put;
> +       }

That will do an extra kobject_get() which we don't require.. Only removing what
I mentioned earlier should be good enough, I believe.

Over that, I think below code in __cpufreq_governor() is also wrong.

	/* we keep one module reference alive for
			each CPU governed by this CPU */
	if ((event != CPUFREQ_GOV_START) || ret)
		module_put(policy->governor->owner);
	if ((event == CPUFREQ_GOV_STOP) && !ret)
		module_put(policy->governor->owner);

The second if is sensible as it puts count when governor is stopped.
But the first one decrements the count when we failed for anything
other than START..

But this routine is called for other stuff too:
- INIT/Exit
- LIMITS,

And so, count must be incremented for a passed INIT call and
decremented for a passed EXIT call, otherwise shouldn't be
touched.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-29 11:49             ` Rafael J. Wysocki
@ 2013-07-29 11:44               ` Srivatsa S. Bhat
  2013-07-29 12:04                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Srivatsa S. Bhat @ 2013-07-29 11:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toralf Förster, cpufreq, Linux PM list, Viresh Kumar

On 07/29/2013 05:19 PM, Rafael J. Wysocki wrote:
> On Monday, July 29, 2013 03:11:45 PM Srivatsa S. Bhat wrote:
>> On 07/29/2013 04:50 AM, Rafael J. Wysocki wrote:
>>> On Monday, July 29, 2013 12:43:59 AM Rafael J. Wysocki wrote:
>>>> On Monday, July 29, 2013 12:11:18 AM Rafael J. Wysocki wrote:
>>>>> On Sunday, July 28, 2013 12:21:22 PM Toralf Förster wrote:
>>>>>> On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
>>>>>>> On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
>>>>>>>> it gives at a ThinkPad T420:
>>>>>>>>
>>>>>>>> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
>>>>>>>> acpi_cpufreq           12902  2147483647
>>>>>>>
>>>>>>> That is -1, which indicates some module refcount woes.
>>>>>>
>>>>>> yes, ofc.
>>>>>>
>>>>>> The issue apears after 1 s2ram/resume cycle, before s2ram the refcount is 1.
>>>>>>
>>>>>>> I definitely can't see that with the mainline on my machines.
>>>>>>
>>>>>> It is in mainline too.
>>>>>
>>>>> Does the appended patch help?
>>>>
>>>> Actually, something as simple as this also should help:
>>>>
>>>> ---
>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
>>>>
>>>> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
>>>> driver module refcount, __cpufreq_remove_dev() causes that refcount
>>>> to become negative after a suspend/resume cycle, for example.
>>>>
>>>> To prevent this from happening make __cpufreq_remove_dev() put
>>>> the policy kobject only instead of calling cpufreq_cpu_put().
>>>
>>> Having a deeper look at it, though, I see that in fact the whole
>>> cpufreq_cpu_put() is needed if the CPU is not the last one for the given
>>> policy and is not needed at all otherwise (as described in the changelog
>>> of the patch below).
>>>
>>> Srivatsa, does this make sense to you?
>>>
>>
>> Code-wise, your patch looks good to me. But one thing in the existing code
>> struck me as a little strange.
>>
>> I'm assuming that the module_get() is used in the cpufreq core to ensure that
>> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
>> driver module (eg: acpi-cpufreq) can't be removed.
> 
> Quite frankly, I'm not sure about that.  If that were the case,
> cpufreq_add_dev() would not call module_put() which it does.
> 
> That may be a bug, I agree, but that's not for the present release cycle.  For
> now, we need to ensure that the reference counts are *balanced* .
> 

Sure, in that case, I agree that your patch is the right fix at this point,
since it resolves the immediate problem that we have with the refcounts.

Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq"
  2013-07-29 11:54                 ` Rafael J. Wysocki
@ 2013-07-29 11:48                   ` Srivatsa S. Bhat
  -1 siblings, 0 replies; 25+ messages in thread
From: Srivatsa S. Bhat @ 2013-07-29 11:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Toralf Förster, cpufreq, Linux PM list

On 07/29/2013 05:24 PM, Rafael J. Wysocki wrote:
> On Monday, July 29, 2013 04:52:39 PM Viresh Kumar wrote:
>> On Mon, Jul 29, 2013 at 3:11 PM, Srivatsa S. Bhat
>> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>>> I'm assuming that the module_get() is used in the cpufreq core to ensure that
>>> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
>>> driver module (eg: acpi-cpufreq) can't be removed.
>>
>> I missed this simple stuff in my email.. :(
>>
>>> But the cpufreq_add_dev() function does a module *put* at the end of
>>> initializing a fresh CPU.
>>>
>>> 1057         kobject_uevent(&policy->kobj, KOBJ_ADD);
>>> 1058         module_put(cpufreq_driver->owner);
>>> 1059         pr_debug("initialization complete\n");
>>> 1060
>>> 1061         return 0;
>>
>> That actually looks wrong. And shoud be fixed.
> 
> OK
> 
>>> So, I wonder if it would be a good idea to instead allow that CPU to take a
>>> module refcount as well. That way, the problem that Toralf reported would go
>>> away, and at the same time, we can ensure that as long as the cpufreq core is
>>> managing CPUs, the cpufreq-backend-driver module refcount never drops to zero.
>>>
>>> Something like this:
>>>
>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> index a4ad733..ecfbc52 100644
>>> --- a/drivers/cpufreq/cpufreq.c
>>> +++ b/drivers/cpufreq/cpufreq.c
>>> @@ -878,9 +878,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
>>>         }
>>>         write_unlock_irqrestore(&cpufreq_driver_lock, flags);
>>>
>>> +       /* Bump up the refcount for this CPU */
>>> +       policy = cpufreq_cpu_get(cpu);
>>> +
>>>         ret = cpufreq_add_dev_symlink(cpu, policy);
>>> -       if (ret)
>>> +       if (ret) {
>>> +               cpufreq_cpu_put(policy);
>>>                 goto err_out_kobj_put;
>>> +       }
>>
>> That will do an extra kobject_get() which we don't require.. Only removing what
>> I mentioned earlier should be good enough, I believe.
>>
>> Over that, I think below code in __cpufreq_governor() is also wrong.
>>
>> 	/* we keep one module reference alive for
>> 			each CPU governed by this CPU */
>> 	if ((event != CPUFREQ_GOV_START) || ret)
>> 		module_put(policy->governor->owner);
>> 	if ((event == CPUFREQ_GOV_STOP) && !ret)
>> 		module_put(policy->governor->owner);
>>
>> The second if is sensible as it puts count when governor is stopped.
>> But the first one decrements the count when we failed for anything
>> other than START..
>>
>> But this routine is called for other stuff too:
>> - INIT/Exit
>> - LIMITS,
>>
>> And so, count must be incremented for a passed INIT call and
>> decremented for a passed EXIT call, otherwise shouldn't be
>> touched.
> 
> That sounds good, but I don't want to make those changes for 3.11 and at the
> same time I *do* want the reference count issue to go away.
> 
> So the patch below is the one I'd like to apply for the time being and
> we can work on more improvements on top of that.
> 
> Any objections?
> 
> Toralf, please test this patch in the meantime.
> 
> Rafael
> 
> 
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
> 
> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
> driver module refcount, __cpufreq_remove_dev() causes that refcount
> to become negative for the cpufreq driver after a suspend/resume
> cycle.
> 
> This is not the only bad thing that happens there, however, because
> kobject_put() should only be called for the policy kobject at this
> point if the CPU is not the last one for that policy.
> 
> Namely, if the given CPU is the last one for that policy, the
> policy kobject's refcount should be 1 at this point, as set by
> cpufreq_add_dev_interface(), and only needs to be dropped once for
> the kobject to go away.  This actually happens under the cpu == 1
> check, so it need not be done before by cpufreq_cpu_put().
> 
> On the other hand, if the given CPU is not the last one for that
> policy, this means that cpufreq_add_policy_cpu() has been called
> at least once for that policy and cpufreq_cpu_get() has been
> called for it too.  To balance that cpufreq_cpu_get(), we need to
> call cpufreq_cpu_put() in that case.
> 
> Thus, to fix the described problem and keep the reference
> counters balanced in both cases, move the cpufreq_cpu_get() call
> in __cpufreq_remove_dev() to the code path executed only for
> CPUs that share the policy with other CPUs.
> 
> Reported-by: Toralf Förster <toralf.foerster@gmx.de>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

This version looks good as well.

Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Regards,
Srivatsa S. Bhat

> ---
>  drivers/cpufreq/cpufreq.c |   19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -1177,14 +1177,11 @@ static int __cpufreq_remove_dev(struct d
>  				__func__, cpu_dev->id, cpu);
>  	}
>  
> -	if ((cpus == 1) && (cpufreq_driver->target))
> -		__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
> -
> -	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> -	cpufreq_cpu_put(data);
> -
>  	/* If cpu is last user of policy, free policy */
>  	if (cpus == 1) {
> +		if (cpufreq_driver->target)
> +			__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
> +
>  		lock_policy_rwsem_read(cpu);
>  		kobj = &data->kobj;
>  		cmp = &data->kobj_unregister;
> @@ -1205,9 +1202,13 @@ static int __cpufreq_remove_dev(struct d
>  		free_cpumask_var(data->related_cpus);
>  		free_cpumask_var(data->cpus);
>  		kfree(data);
> -	} else if (cpufreq_driver->target) {
> -		__cpufreq_governor(data, CPUFREQ_GOV_START);
> -		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
> +	} else {
> +		pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> +		cpufreq_cpu_put(data);
> +		if (cpufreq_driver->target) {
> +			__cpufreq_governor(data, CPUFREQ_GOV_START);
> +			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
> +		}
>  	}
>  
>  	per_cpu(cpufreq_policy_cpu, cpu) = -1;
> 



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq"
@ 2013-07-29 11:48                   ` Srivatsa S. Bhat
  0 siblings, 0 replies; 25+ messages in thread
From: Srivatsa S. Bhat @ 2013-07-29 11:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Toralf Förster, cpufreq, Linux PM list

On 07/29/2013 05:24 PM, Rafael J. Wysocki wrote:
> On Monday, July 29, 2013 04:52:39 PM Viresh Kumar wrote:
>> On Mon, Jul 29, 2013 at 3:11 PM, Srivatsa S. Bhat
>> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>>> I'm assuming that the module_get() is used in the cpufreq core to ensure that
>>> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
>>> driver module (eg: acpi-cpufreq) can't be removed.
>>
>> I missed this simple stuff in my email.. :(
>>
>>> But the cpufreq_add_dev() function does a module *put* at the end of
>>> initializing a fresh CPU.
>>>
>>> 1057         kobject_uevent(&policy->kobj, KOBJ_ADD);
>>> 1058         module_put(cpufreq_driver->owner);
>>> 1059         pr_debug("initialization complete\n");
>>> 1060
>>> 1061         return 0;
>>
>> That actually looks wrong. And shoud be fixed.
> 
> OK
> 
>>> So, I wonder if it would be a good idea to instead allow that CPU to take a
>>> module refcount as well. That way, the problem that Toralf reported would go
>>> away, and at the same time, we can ensure that as long as the cpufreq core is
>>> managing CPUs, the cpufreq-backend-driver module refcount never drops to zero.
>>>
>>> Something like this:
>>>
>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> index a4ad733..ecfbc52 100644
>>> --- a/drivers/cpufreq/cpufreq.c
>>> +++ b/drivers/cpufreq/cpufreq.c
>>> @@ -878,9 +878,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
>>>         }
>>>         write_unlock_irqrestore(&cpufreq_driver_lock, flags);
>>>
>>> +       /* Bump up the refcount for this CPU */
>>> +       policy = cpufreq_cpu_get(cpu);
>>> +
>>>         ret = cpufreq_add_dev_symlink(cpu, policy);
>>> -       if (ret)
>>> +       if (ret) {
>>> +               cpufreq_cpu_put(policy);
>>>                 goto err_out_kobj_put;
>>> +       }
>>
>> That will do an extra kobject_get() which we don't require.. Only removing what
>> I mentioned earlier should be good enough, I believe.
>>
>> Over that, I think below code in __cpufreq_governor() is also wrong.
>>
>> 	/* we keep one module reference alive for
>> 			each CPU governed by this CPU */
>> 	if ((event != CPUFREQ_GOV_START) || ret)
>> 		module_put(policy->governor->owner);
>> 	if ((event == CPUFREQ_GOV_STOP) && !ret)
>> 		module_put(policy->governor->owner);
>>
>> The second if is sensible as it puts count when governor is stopped.
>> But the first one decrements the count when we failed for anything
>> other than START..
>>
>> But this routine is called for other stuff too:
>> - INIT/Exit
>> - LIMITS,
>>
>> And so, count must be incremented for a passed INIT call and
>> decremented for a passed EXIT call, otherwise shouldn't be
>> touched.
> 
> That sounds good, but I don't want to make those changes for 3.11 and at the
> same time I *do* want the reference count issue to go away.
> 
> So the patch below is the one I'd like to apply for the time being and
> we can work on more improvements on top of that.
> 
> Any objections?
> 
> Toralf, please test this patch in the meantime.
> 
> Rafael
> 
> 
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
> 
> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
> driver module refcount, __cpufreq_remove_dev() causes that refcount
> to become negative for the cpufreq driver after a suspend/resume
> cycle.
> 
> This is not the only bad thing that happens there, however, because
> kobject_put() should only be called for the policy kobject at this
> point if the CPU is not the last one for that policy.
> 
> Namely, if the given CPU is the last one for that policy, the
> policy kobject's refcount should be 1 at this point, as set by
> cpufreq_add_dev_interface(), and only needs to be dropped once for
> the kobject to go away.  This actually happens under the cpu == 1
> check, so it need not be done before by cpufreq_cpu_put().
> 
> On the other hand, if the given CPU is not the last one for that
> policy, this means that cpufreq_add_policy_cpu() has been called
> at least once for that policy and cpufreq_cpu_get() has been
> called for it too.  To balance that cpufreq_cpu_get(), we need to
> call cpufreq_cpu_put() in that case.
> 
> Thus, to fix the described problem and keep the reference
> counters balanced in both cases, move the cpufreq_cpu_get() call
> in __cpufreq_remove_dev() to the code path executed only for
> CPUs that share the policy with other CPUs.
> 
> Reported-by: Toralf Förster <toralf.foerster@gmx.de>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

This version looks good as well.

Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Regards,
Srivatsa S. Bhat

> ---
>  drivers/cpufreq/cpufreq.c |   19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -1177,14 +1177,11 @@ static int __cpufreq_remove_dev(struct d
>  				__func__, cpu_dev->id, cpu);
>  	}
>  
> -	if ((cpus == 1) && (cpufreq_driver->target))
> -		__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
> -
> -	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> -	cpufreq_cpu_put(data);
> -
>  	/* If cpu is last user of policy, free policy */
>  	if (cpus == 1) {
> +		if (cpufreq_driver->target)
> +			__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
> +
>  		lock_policy_rwsem_read(cpu);
>  		kobj = &data->kobj;
>  		cmp = &data->kobj_unregister;
> @@ -1205,9 +1202,13 @@ static int __cpufreq_remove_dev(struct d
>  		free_cpumask_var(data->related_cpus);
>  		free_cpumask_var(data->cpus);
>  		kfree(data);
> -	} else if (cpufreq_driver->target) {
> -		__cpufreq_governor(data, CPUFREQ_GOV_START);
> -		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
> +	} else {
> +		pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> +		cpufreq_cpu_put(data);
> +		if (cpufreq_driver->target) {
> +			__cpufreq_governor(data, CPUFREQ_GOV_START);
> +			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
> +		}
>  	}
>  
>  	per_cpu(cpufreq_policy_cpu, cpu) = -1;
> 



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-29  9:41           ` Srivatsa S. Bhat
  2013-07-29 11:22             ` Viresh Kumar
@ 2013-07-29 11:49             ` Rafael J. Wysocki
  2013-07-29 11:44               ` Srivatsa S. Bhat
  1 sibling, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2013-07-29 11:49 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Toralf Förster, cpufreq, Linux PM list, Viresh Kumar

On Monday, July 29, 2013 03:11:45 PM Srivatsa S. Bhat wrote:
> On 07/29/2013 04:50 AM, Rafael J. Wysocki wrote:
> > On Monday, July 29, 2013 12:43:59 AM Rafael J. Wysocki wrote:
> >> On Monday, July 29, 2013 12:11:18 AM Rafael J. Wysocki wrote:
> >>> On Sunday, July 28, 2013 12:21:22 PM Toralf Förster wrote:
> >>>> On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
> >>>>> On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
> >>>>>> it gives at a ThinkPad T420:
> >>>>>>
> >>>>>> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
> >>>>>> acpi_cpufreq           12902  2147483647
> >>>>>
> >>>>> That is -1, which indicates some module refcount woes.
> >>>>
> >>>> yes, ofc.
> >>>>
> >>>> The issue apears after 1 s2ram/resume cycle, before s2ram the refcount is 1.
> >>>>
> >>>>> I definitely can't see that with the mainline on my machines.
> >>>>
> >>>> It is in mainline too.
> >>>
> >>> Does the appended patch help?
> >>
> >> Actually, something as simple as this also should help:
> >>
> >> ---
> >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
> >>
> >> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
> >> driver module refcount, __cpufreq_remove_dev() causes that refcount
> >> to become negative after a suspend/resume cycle, for example.
> >>
> >> To prevent this from happening make __cpufreq_remove_dev() put
> >> the policy kobject only instead of calling cpufreq_cpu_put().
> > 
> > Having a deeper look at it, though, I see that in fact the whole
> > cpufreq_cpu_put() is needed if the CPU is not the last one for the given
> > policy and is not needed at all otherwise (as described in the changelog
> > of the patch below).
> > 
> > Srivatsa, does this make sense to you?
> >
> 
> Code-wise, your patch looks good to me. But one thing in the existing code
> struck me as a little strange.
> 
> I'm assuming that the module_get() is used in the cpufreq core to ensure that
> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
> driver module (eg: acpi-cpufreq) can't be removed.

Quite frankly, I'm not sure about that.  If that were the case,
cpufreq_add_dev() would not call module_put() which it does.

That may be a bug, I agree, but that's not for the present release cycle.  For
now, we need to ensure that the reference counts are *balanced* .

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq"
  2013-07-29 11:22             ` Viresh Kumar
@ 2013-07-29 11:54                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2013-07-29 11:54 UTC (permalink / raw)
  To: Viresh Kumar, Srivatsa S. Bhat, Toralf Förster
  Cc: cpufreq, Linux PM list

On Monday, July 29, 2013 04:52:39 PM Viresh Kumar wrote:
> On Mon, Jul 29, 2013 at 3:11 PM, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
> > I'm assuming that the module_get() is used in the cpufreq core to ensure that
> > until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
> > driver module (eg: acpi-cpufreq) can't be removed.
> 
> I missed this simple stuff in my email.. :(
> 
> > But the cpufreq_add_dev() function does a module *put* at the end of
> > initializing a fresh CPU.
> >
> > 1057         kobject_uevent(&policy->kobj, KOBJ_ADD);
> > 1058         module_put(cpufreq_driver->owner);
> > 1059         pr_debug("initialization complete\n");
> > 1060
> > 1061         return 0;
> 
> That actually looks wrong. And shoud be fixed.

OK

> > So, I wonder if it would be a good idea to instead allow that CPU to take a
> > module refcount as well. That way, the problem that Toralf reported would go
> > away, and at the same time, we can ensure that as long as the cpufreq core is
> > managing CPUs, the cpufreq-backend-driver module refcount never drops to zero.
> >
> > Something like this:
> >
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index a4ad733..ecfbc52 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -878,9 +878,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
> >         }
> >         write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> >
> > +       /* Bump up the refcount for this CPU */
> > +       policy = cpufreq_cpu_get(cpu);
> > +
> >         ret = cpufreq_add_dev_symlink(cpu, policy);
> > -       if (ret)
> > +       if (ret) {
> > +               cpufreq_cpu_put(policy);
> >                 goto err_out_kobj_put;
> > +       }
> 
> That will do an extra kobject_get() which we don't require.. Only removing what
> I mentioned earlier should be good enough, I believe.
> 
> Over that, I think below code in __cpufreq_governor() is also wrong.
> 
> 	/* we keep one module reference alive for
> 			each CPU governed by this CPU */
> 	if ((event != CPUFREQ_GOV_START) || ret)
> 		module_put(policy->governor->owner);
> 	if ((event == CPUFREQ_GOV_STOP) && !ret)
> 		module_put(policy->governor->owner);
> 
> The second if is sensible as it puts count when governor is stopped.
> But the first one decrements the count when we failed for anything
> other than START..
> 
> But this routine is called for other stuff too:
> - INIT/Exit
> - LIMITS,
> 
> And so, count must be incremented for a passed INIT call and
> decremented for a passed EXIT call, otherwise shouldn't be
> touched.

That sounds good, but I don't want to make those changes for 3.11 and at the
same time I *do* want the reference count issue to go away.

So the patch below is the one I'd like to apply for the time being and
we can work on more improvements on top of that.

Any objections?

Toralf, please test this patch in the meantime.

Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume

Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
driver module refcount, __cpufreq_remove_dev() causes that refcount
to become negative for the cpufreq driver after a suspend/resume
cycle.

This is not the only bad thing that happens there, however, because
kobject_put() should only be called for the policy kobject at this
point if the CPU is not the last one for that policy.

Namely, if the given CPU is the last one for that policy, the
policy kobject's refcount should be 1 at this point, as set by
cpufreq_add_dev_interface(), and only needs to be dropped once for
the kobject to go away.  This actually happens under the cpu == 1
check, so it need not be done before by cpufreq_cpu_put().

On the other hand, if the given CPU is not the last one for that
policy, this means that cpufreq_add_policy_cpu() has been called
at least once for that policy and cpufreq_cpu_get() has been
called for it too.  To balance that cpufreq_cpu_get(), we need to
call cpufreq_cpu_put() in that case.

Thus, to fix the described problem and keep the reference
counters balanced in both cases, move the cpufreq_cpu_get() call
in __cpufreq_remove_dev() to the code path executed only for
CPUs that share the policy with other CPUs.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq.c |   19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -1177,14 +1177,11 @@ static int __cpufreq_remove_dev(struct d
 				__func__, cpu_dev->id, cpu);
 	}
 
-	if ((cpus == 1) && (cpufreq_driver->target))
-		__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
-
-	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
-	cpufreq_cpu_put(data);
-
 	/* If cpu is last user of policy, free policy */
 	if (cpus == 1) {
+		if (cpufreq_driver->target)
+			__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
+
 		lock_policy_rwsem_read(cpu);
 		kobj = &data->kobj;
 		cmp = &data->kobj_unregister;
@@ -1205,9 +1202,13 @@ static int __cpufreq_remove_dev(struct d
 		free_cpumask_var(data->related_cpus);
 		free_cpumask_var(data->cpus);
 		kfree(data);
-	} else if (cpufreq_driver->target) {
-		__cpufreq_governor(data, CPUFREQ_GOV_START);
-		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
+	} else {
+		pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
+		cpufreq_cpu_put(data);
+		if (cpufreq_driver->target) {
+			__cpufreq_governor(data, CPUFREQ_GOV_START);
+			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
+		}
 	}
 
 	per_cpu(cpufreq_policy_cpu, cpu) = -1;


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq"
@ 2013-07-29 11:54                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2013-07-29 11:54 UTC (permalink / raw)
  To: Viresh Kumar, Srivatsa S. Bhat, Toralf Förster
  Cc: cpufreq, Linux PM list

On Monday, July 29, 2013 04:52:39 PM Viresh Kumar wrote:
> On Mon, Jul 29, 2013 at 3:11 PM, Srivatsa S. Bhat
> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
> > I'm assuming that the module_get() is used in the cpufreq core to ensure that
> > until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
> > driver module (eg: acpi-cpufreq) can't be removed.
> 
> I missed this simple stuff in my email.. :(
> 
> > But the cpufreq_add_dev() function does a module *put* at the end of
> > initializing a fresh CPU.
> >
> > 1057         kobject_uevent(&policy->kobj, KOBJ_ADD);
> > 1058         module_put(cpufreq_driver->owner);
> > 1059         pr_debug("initialization complete\n");
> > 1060
> > 1061         return 0;
> 
> That actually looks wrong. And shoud be fixed.

OK

> > So, I wonder if it would be a good idea to instead allow that CPU to take a
> > module refcount as well. That way, the problem that Toralf reported would go
> > away, and at the same time, we can ensure that as long as the cpufreq core is
> > managing CPUs, the cpufreq-backend-driver module refcount never drops to zero.
> >
> > Something like this:
> >
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index a4ad733..ecfbc52 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -878,9 +878,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
> >         }
> >         write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> >
> > +       /* Bump up the refcount for this CPU */
> > +       policy = cpufreq_cpu_get(cpu);
> > +
> >         ret = cpufreq_add_dev_symlink(cpu, policy);
> > -       if (ret)
> > +       if (ret) {
> > +               cpufreq_cpu_put(policy);
> >                 goto err_out_kobj_put;
> > +       }
> 
> That will do an extra kobject_get() which we don't require.. Only removing what
> I mentioned earlier should be good enough, I believe.
> 
> Over that, I think below code in __cpufreq_governor() is also wrong.
> 
> 	/* we keep one module reference alive for
> 			each CPU governed by this CPU */
> 	if ((event != CPUFREQ_GOV_START) || ret)
> 		module_put(policy->governor->owner);
> 	if ((event == CPUFREQ_GOV_STOP) && !ret)
> 		module_put(policy->governor->owner);
> 
> The second if is sensible as it puts count when governor is stopped.
> But the first one decrements the count when we failed for anything
> other than START..
> 
> But this routine is called for other stuff too:
> - INIT/Exit
> - LIMITS,
> 
> And so, count must be incremented for a passed INIT call and
> decremented for a passed EXIT call, otherwise shouldn't be
> touched.

That sounds good, but I don't want to make those changes for 3.11 and at the
same time I *do* want the reference count issue to go away.

So the patch below is the one I'd like to apply for the time being and
we can work on more improvements on top of that.

Any objections?

Toralf, please test this patch in the meantime.

Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume

Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
driver module refcount, __cpufreq_remove_dev() causes that refcount
to become negative for the cpufreq driver after a suspend/resume
cycle.

This is not the only bad thing that happens there, however, because
kobject_put() should only be called for the policy kobject at this
point if the CPU is not the last one for that policy.

Namely, if the given CPU is the last one for that policy, the
policy kobject's refcount should be 1 at this point, as set by
cpufreq_add_dev_interface(), and only needs to be dropped once for
the kobject to go away.  This actually happens under the cpu == 1
check, so it need not be done before by cpufreq_cpu_put().

On the other hand, if the given CPU is not the last one for that
policy, this means that cpufreq_add_policy_cpu() has been called
at least once for that policy and cpufreq_cpu_get() has been
called for it too.  To balance that cpufreq_cpu_get(), we need to
call cpufreq_cpu_put() in that case.

Thus, to fix the described problem and keep the reference
counters balanced in both cases, move the cpufreq_cpu_get() call
in __cpufreq_remove_dev() to the code path executed only for
CPUs that share the policy with other CPUs.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq.c |   19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -1177,14 +1177,11 @@ static int __cpufreq_remove_dev(struct d
 				__func__, cpu_dev->id, cpu);
 	}
 
-	if ((cpus == 1) && (cpufreq_driver->target))
-		__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
-
-	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
-	cpufreq_cpu_put(data);
-
 	/* If cpu is last user of policy, free policy */
 	if (cpus == 1) {
+		if (cpufreq_driver->target)
+			__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
+
 		lock_policy_rwsem_read(cpu);
 		kobj = &data->kobj;
 		cmp = &data->kobj_unregister;
@@ -1205,9 +1202,13 @@ static int __cpufreq_remove_dev(struct d
 		free_cpumask_var(data->related_cpus);
 		free_cpumask_var(data->cpus);
 		kfree(data);
-	} else if (cpufreq_driver->target) {
-		__cpufreq_governor(data, CPUFREQ_GOV_START);
-		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
+	} else {
+		pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
+		cpufreq_cpu_put(data);
+		if (cpufreq_driver->target) {
+			__cpufreq_governor(data, CPUFREQ_GOV_START);
+			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
+		}
 	}
 
 	per_cpu(cpufreq_policy_cpu, cpu) = -1;


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-29 11:44               ` Srivatsa S. Bhat
@ 2013-07-29 12:04                 ` Rafael J. Wysocki
  2013-07-29 15:27                   ` Srivatsa S. Bhat
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2013-07-29 12:04 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Toralf Förster, cpufreq, Linux PM list, Viresh Kumar

On Monday, July 29, 2013 05:14:25 PM Srivatsa S. Bhat wrote:
> On 07/29/2013 05:19 PM, Rafael J. Wysocki wrote:
> > On Monday, July 29, 2013 03:11:45 PM Srivatsa S. Bhat wrote:
> >> On 07/29/2013 04:50 AM, Rafael J. Wysocki wrote:
> >>> On Monday, July 29, 2013 12:43:59 AM Rafael J. Wysocki wrote:
> >>>> On Monday, July 29, 2013 12:11:18 AM Rafael J. Wysocki wrote:
> >>>>> On Sunday, July 28, 2013 12:21:22 PM Toralf Förster wrote:
> >>>>>> On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
> >>>>>>> On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
> >>>>>>>> it gives at a ThinkPad T420:
> >>>>>>>>
> >>>>>>>> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
> >>>>>>>> acpi_cpufreq           12902  2147483647
> >>>>>>>
> >>>>>>> That is -1, which indicates some module refcount woes.
> >>>>>>
> >>>>>> yes, ofc.
> >>>>>>
> >>>>>> The issue apears after 1 s2ram/resume cycle, before s2ram the refcount is 1.
> >>>>>>
> >>>>>>> I definitely can't see that with the mainline on my machines.
> >>>>>>
> >>>>>> It is in mainline too.
> >>>>>
> >>>>> Does the appended patch help?
> >>>>
> >>>> Actually, something as simple as this also should help:
> >>>>
> >>>> ---
> >>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
> >>>>
> >>>> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
> >>>> driver module refcount, __cpufreq_remove_dev() causes that refcount
> >>>> to become negative after a suspend/resume cycle, for example.
> >>>>
> >>>> To prevent this from happening make __cpufreq_remove_dev() put
> >>>> the policy kobject only instead of calling cpufreq_cpu_put().
> >>>
> >>> Having a deeper look at it, though, I see that in fact the whole
> >>> cpufreq_cpu_put() is needed if the CPU is not the last one for the given
> >>> policy and is not needed at all otherwise (as described in the changelog
> >>> of the patch below).
> >>>
> >>> Srivatsa, does this make sense to you?
> >>>
> >>
> >> Code-wise, your patch looks good to me. But one thing in the existing code
> >> struck me as a little strange.
> >>
> >> I'm assuming that the module_get() is used in the cpufreq core to ensure that
> >> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
> >> driver module (eg: acpi-cpufreq) can't be removed.
> > 
> > Quite frankly, I'm not sure about that.  If that were the case,
> > cpufreq_add_dev() would not call module_put() which it does.
> > 
> > That may be a bug, I agree, but that's not for the present release cycle.  For
> > now, we need to ensure that the reference counts are *balanced* .
> > 
> 
> Sure, in that case, I agree that your patch is the right fix at this point,
> since it resolves the immediate problem that we have with the refcounts.
> 
> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Great, thanks!

Does your patchset avoiding the creation/removal of sysfs directories over
suspend/resume need to be modified to take this patch into account?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-29 12:04                 ` Rafael J. Wysocki
@ 2013-07-29 15:27                   ` Srivatsa S. Bhat
  2013-07-29 20:28                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Srivatsa S. Bhat @ 2013-07-29 15:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Toralf Förster, cpufreq, Linux PM list, Viresh Kumar

On 07/29/2013 05:34 PM, Rafael J. Wysocki wrote:
> On Monday, July 29, 2013 05:14:25 PM Srivatsa S. Bhat wrote:
>> On 07/29/2013 05:19 PM, Rafael J. Wysocki wrote:
>>> On Monday, July 29, 2013 03:11:45 PM Srivatsa S. Bhat wrote:
>>>> On 07/29/2013 04:50 AM, Rafael J. Wysocki wrote:
>>>>> On Monday, July 29, 2013 12:43:59 AM Rafael J. Wysocki wrote:
>>>>>> On Monday, July 29, 2013 12:11:18 AM Rafael J. Wysocki wrote:
>>>>>>> On Sunday, July 28, 2013 12:21:22 PM Toralf Förster wrote:
>>>>>>>> On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
>>>>>>>>> On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
>>>>>>>>>> it gives at a ThinkPad T420:
>>>>>>>>>>
>>>>>>>>>> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
>>>>>>>>>> acpi_cpufreq           12902  2147483647
>>>>>>>>>
>>>>>>>>> That is -1, which indicates some module refcount woes.
>>>>>>>>
>>>>>>>> yes, ofc.
>>>>>>>>
>>>>>>>> The issue apears after 1 s2ram/resume cycle, before s2ram the refcount is 1.
>>>>>>>>
>>>>>>>>> I definitely can't see that with the mainline on my machines.
>>>>>>>>
>>>>>>>> It is in mainline too.
>>>>>>>
>>>>>>> Does the appended patch help?
>>>>>>
>>>>>> Actually, something as simple as this also should help:
>>>>>>
>>>>>> ---
>>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
>>>>>>
>>>>>> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
>>>>>> driver module refcount, __cpufreq_remove_dev() causes that refcount
>>>>>> to become negative after a suspend/resume cycle, for example.
>>>>>>
>>>>>> To prevent this from happening make __cpufreq_remove_dev() put
>>>>>> the policy kobject only instead of calling cpufreq_cpu_put().
>>>>>
>>>>> Having a deeper look at it, though, I see that in fact the whole
>>>>> cpufreq_cpu_put() is needed if the CPU is not the last one for the given
>>>>> policy and is not needed at all otherwise (as described in the changelog
>>>>> of the patch below).
>>>>>
>>>>> Srivatsa, does this make sense to you?
>>>>>
>>>>
>>>> Code-wise, your patch looks good to me. But one thing in the existing code
>>>> struck me as a little strange.
>>>>
>>>> I'm assuming that the module_get() is used in the cpufreq core to ensure that
>>>> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
>>>> driver module (eg: acpi-cpufreq) can't be removed.
>>>
>>> Quite frankly, I'm not sure about that.  If that were the case,
>>> cpufreq_add_dev() would not call module_put() which it does.
>>>
>>> That may be a bug, I agree, but that's not for the present release cycle.  For
>>> now, we need to ensure that the reference counts are *balanced* .
>>>
>>
>> Sure, in that case, I agree that your patch is the right fix at this point,
>> since it resolves the immediate problem that we have with the refcounts.
>>
>> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> 
> Great, thanks!
> 
> Does your patchset avoiding the creation/removal of sysfs directories over
> suspend/resume need to be modified to take this patch into account?
> 

There will be some trivial conflicts, but looking a bit closer at my patchset,
I believe it has a subtle refcounting bug, relating to patches 6 and 7.

Patch 6: http://marc.info/?l=linux-kernel&m=137358141628042&w=2
Patch 7: http://marc.info/?l=linux-pm&m=137358124527993&w=2


Patch 6 modifies cpufreq_add_policy_cpu() to return 0 if frozen == true.
But by that time, it would have already done a cpufreq_cpu_get().

In patch 7, under __cpufreq_remove_dev(), we avoid dropping refcounts
for *any* CPU when frozen == true.

In the general case, this is fine (like myself and Viresh had discussed
earlier in that thread), because __cpufreq_add_dev() doesn't increment the
refcount in the frozen case.

But we overlooked a rather subtle scenario: say, CPU 'X' went through
cpufreq_add_dev in the resume path, and restored its policy structure.
Since this was fresh, no refcount was taken. And then CPU 'Y' came along,
but it was related to CPU 'X', so it will take this path:

1018         /* Check if this cpu was hot-unplugged earlier and has siblings */
1019         read_lock_irqsave(&cpufreq_driver_lock, flags);
1020         for_each_online_cpu(sibling) {
1021                 struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
1022                 if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
1023                         read_unlock_irqrestore(&cpufreq_driver_lock, flags);
1024                         return cpufreq_add_policy_cpu(cpu, sibling, dev,
1025                                                       frozen);
1026                 }
1027         }

This time, cp won't be NULL for CPU 'X', because its policy was restored
in the previous call. So we end up calling cpufreq_add_policy_cpu() for
CPU 'Y'. And that function takes a new refcount, as mentioned above.

So we need to have this extra hunk in cpufreq_add_policy_cpu() to ensure
we don't take any additional refcounts during suspend/resume:


diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index c7f59e8..9681c01 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -932,8 +932,11 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 	}
 
 	/* Don't touch sysfs links during light-weight init */
-	if (frozen)
+	if (frozen) {
+		/* Drop the extra refcount that we took above */
+		cpufreq_cpu_put(policy);
 		return 0;
+	}
 
 	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
 	if (ret)


I can respin my entire patchset with this fix included, on top of your
patch. Please let me know if you'd prefer some other method for resolving
this.

Regards,
Srivatsa S. Bhat


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq"
  2013-07-29 11:54                 ` Rafael J. Wysocki
  (?)
  (?)
@ 2013-07-29 17:23                 ` Toralf Förster
  2013-07-29 20:19                   ` Rafael J. Wysocki
  -1 siblings, 1 reply; 25+ messages in thread
From: Toralf Förster @ 2013-07-29 17:23 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Viresh Kumar, Srivatsa S. Bhat, cpufreq, Linux PM list

On 07/29/2013 01:54 PM, Rafael J. Wysocki wrote:
> On Monday, July 29, 2013 04:52:39 PM Viresh Kumar wrote:
>> On Mon, Jul 29, 2013 at 3:11 PM, Srivatsa S. Bhat
>> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
>>> I'm assuming that the module_get() is used in the cpufreq core to ensure that
>>> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
>>> driver module (eg: acpi-cpufreq) can't be removed.
>>
>> I missed this simple stuff in my email.. :(
>>
>>> But the cpufreq_add_dev() function does a module *put* at the end of
>>> initializing a fresh CPU.
>>>
>>> 1057         kobject_uevent(&policy->kobj, KOBJ_ADD);
>>> 1058         module_put(cpufreq_driver->owner);
>>> 1059         pr_debug("initialization complete\n");
>>> 1060
>>> 1061         return 0;
>>
>> That actually looks wrong. And shoud be fixed.
> 
> OK
> 
>>> So, I wonder if it would be a good idea to instead allow that CPU to take a
>>> module refcount as well. That way, the problem that Toralf reported would go
>>> away, and at the same time, we can ensure that as long as the cpufreq core is
>>> managing CPUs, the cpufreq-backend-driver module refcount never drops to zero.
>>>
>>> Something like this:
>>>
>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> index a4ad733..ecfbc52 100644
>>> --- a/drivers/cpufreq/cpufreq.c
>>> +++ b/drivers/cpufreq/cpufreq.c
>>> @@ -878,9 +878,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
>>>         }
>>>         write_unlock_irqrestore(&cpufreq_driver_lock, flags);
>>>
>>> +       /* Bump up the refcount for this CPU */
>>> +       policy = cpufreq_cpu_get(cpu);
>>> +
>>>         ret = cpufreq_add_dev_symlink(cpu, policy);
>>> -       if (ret)
>>> +       if (ret) {
>>> +               cpufreq_cpu_put(policy);
>>>                 goto err_out_kobj_put;
>>> +       }
>>
>> That will do an extra kobject_get() which we don't require.. Only removing what
>> I mentioned earlier should be good enough, I believe.
>>
>> Over that, I think below code in __cpufreq_governor() is also wrong.
>>
>> 	/* we keep one module reference alive for
>> 			each CPU governed by this CPU */
>> 	if ((event != CPUFREQ_GOV_START) || ret)
>> 		module_put(policy->governor->owner);
>> 	if ((event == CPUFREQ_GOV_STOP) && !ret)
>> 		module_put(policy->governor->owner);
>>
>> The second if is sensible as it puts count when governor is stopped.
>> But the first one decrements the count when we failed for anything
>> other than START..
>>
>> But this routine is called for other stuff too:
>> - INIT/Exit
>> - LIMITS,
>>
>> And so, count must be incremented for a passed INIT call and
>> decremented for a passed EXIT call, otherwise shouldn't be
>> touched.
> 
> That sounds good, but I don't want to make those changes for 3.11 and at the
> same time I *do* want the reference count issue to go away.
> 
> So the patch below is the one I'd like to apply for the time being and
> we can work on more improvements on top of that.
> 
> Any objections?
> 
> Toralf, please test this patch in the meantime.

works - tested on top of 3.10.4

> Rafael
> 
> 
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
> 
> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
> driver module refcount, __cpufreq_remove_dev() causes that refcount
> to become negative for the cpufreq driver after a suspend/resume
> cycle.
> 
> This is not the only bad thing that happens there, however, because
> kobject_put() should only be called for the policy kobject at this
> point if the CPU is not the last one for that policy.
> 
> Namely, if the given CPU is the last one for that policy, the
> policy kobject's refcount should be 1 at this point, as set by
> cpufreq_add_dev_interface(), and only needs to be dropped once for
> the kobject to go away.  This actually happens under the cpu == 1
> check, so it need not be done before by cpufreq_cpu_put().
> 
> On the other hand, if the given CPU is not the last one for that
> policy, this means that cpufreq_add_policy_cpu() has been called
> at least once for that policy and cpufreq_cpu_get() has been
> called for it too.  To balance that cpufreq_cpu_get(), we need to
> call cpufreq_cpu_put() in that case.
> 
> Thus, to fix the described problem and keep the reference
> counters balanced in both cases, move the cpufreq_cpu_get() call
> in __cpufreq_remove_dev() to the code path executed only for
> CPUs that share the policy with other CPUs.
> 
> Reported-by: Toralf Förster <toralf.foerster@gmx.de>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq.c |   19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -1177,14 +1177,11 @@ static int __cpufreq_remove_dev(struct d
>  				__func__, cpu_dev->id, cpu);
>  	}
>  
> -	if ((cpus == 1) && (cpufreq_driver->target))
> -		__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
> -
> -	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> -	cpufreq_cpu_put(data);
> -
>  	/* If cpu is last user of policy, free policy */
>  	if (cpus == 1) {
> +		if (cpufreq_driver->target)
> +			__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
> +
>  		lock_policy_rwsem_read(cpu);
>  		kobj = &data->kobj;
>  		cmp = &data->kobj_unregister;
> @@ -1205,9 +1202,13 @@ static int __cpufreq_remove_dev(struct d
>  		free_cpumask_var(data->related_cpus);
>  		free_cpumask_var(data->cpus);
>  		kfree(data);
> -	} else if (cpufreq_driver->target) {
> -		__cpufreq_governor(data, CPUFREQ_GOV_START);
> -		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
> +	} else {
> +		pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> +		cpufreq_cpu_put(data);
> +		if (cpufreq_driver->target) {
> +			__cpufreq_governor(data, CPUFREQ_GOV_START);
> +			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
> +		}
>  	}
>  
>  	per_cpu(cpufreq_policy_cpu, cpu) = -1;
> 
> 


-- 
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq"
  2013-07-29 17:23                 ` Toralf Förster
@ 2013-07-29 20:19                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2013-07-29 20:19 UTC (permalink / raw)
  To: Toralf Förster
  Cc: Viresh Kumar, Srivatsa S. Bhat, cpufreq, Linux PM list

On Monday, July 29, 2013 07:23:35 PM Toralf Förster wrote:
> On 07/29/2013 01:54 PM, Rafael J. Wysocki wrote:
> > On Monday, July 29, 2013 04:52:39 PM Viresh Kumar wrote:
> >> On Mon, Jul 29, 2013 at 3:11 PM, Srivatsa S. Bhat
> >> <srivatsa.bhat@linux.vnet.ibm.com> wrote:
> >>> I'm assuming that the module_get() is used in the cpufreq core to ensure that
> >>> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
> >>> driver module (eg: acpi-cpufreq) can't be removed.
> >>
> >> I missed this simple stuff in my email.. :(
> >>
> >>> But the cpufreq_add_dev() function does a module *put* at the end of
> >>> initializing a fresh CPU.
> >>>
> >>> 1057         kobject_uevent(&policy->kobj, KOBJ_ADD);
> >>> 1058         module_put(cpufreq_driver->owner);
> >>> 1059         pr_debug("initialization complete\n");
> >>> 1060
> >>> 1061         return 0;
> >>
> >> That actually looks wrong. And shoud be fixed.
> > 
> > OK
> > 
> >>> So, I wonder if it would be a good idea to instead allow that CPU to take a
> >>> module refcount as well. That way, the problem that Toralf reported would go
> >>> away, and at the same time, we can ensure that as long as the cpufreq core is
> >>> managing CPUs, the cpufreq-backend-driver module refcount never drops to zero.
> >>>
> >>> Something like this:
> >>>
> >>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> >>> index a4ad733..ecfbc52 100644
> >>> --- a/drivers/cpufreq/cpufreq.c
> >>> +++ b/drivers/cpufreq/cpufreq.c
> >>> @@ -878,9 +878,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
> >>>         }
> >>>         write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> >>>
> >>> +       /* Bump up the refcount for this CPU */
> >>> +       policy = cpufreq_cpu_get(cpu);
> >>> +
> >>>         ret = cpufreq_add_dev_symlink(cpu, policy);
> >>> -       if (ret)
> >>> +       if (ret) {
> >>> +               cpufreq_cpu_put(policy);
> >>>                 goto err_out_kobj_put;
> >>> +       }
> >>
> >> That will do an extra kobject_get() which we don't require.. Only removing what
> >> I mentioned earlier should be good enough, I believe.
> >>
> >> Over that, I think below code in __cpufreq_governor() is also wrong.
> >>
> >> 	/* we keep one module reference alive for
> >> 			each CPU governed by this CPU */
> >> 	if ((event != CPUFREQ_GOV_START) || ret)
> >> 		module_put(policy->governor->owner);
> >> 	if ((event == CPUFREQ_GOV_STOP) && !ret)
> >> 		module_put(policy->governor->owner);
> >>
> >> The second if is sensible as it puts count when governor is stopped.
> >> But the first one decrements the count when we failed for anything
> >> other than START..
> >>
> >> But this routine is called for other stuff too:
> >> - INIT/Exit
> >> - LIMITS,
> >>
> >> And so, count must be incremented for a passed INIT call and
> >> decremented for a passed EXIT call, otherwise shouldn't be
> >> touched.
> > 
> > That sounds good, but I don't want to make those changes for 3.11 and at the
> > same time I *do* want the reference count issue to go away.
> > 
> > So the patch below is the one I'd like to apply for the time being and
> > we can work on more improvements on top of that.
> > 
> > Any objections?
> > 
> > Toralf, please test this patch in the meantime.
> 
> works - tested on top of 3.10.4

Great, thanks for the confirmation!

Rafael


> > ---
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
> > 
> > Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
> > driver module refcount, __cpufreq_remove_dev() causes that refcount
> > to become negative for the cpufreq driver after a suspend/resume
> > cycle.
> > 
> > This is not the only bad thing that happens there, however, because
> > kobject_put() should only be called for the policy kobject at this
> > point if the CPU is not the last one for that policy.
> > 
> > Namely, if the given CPU is the last one for that policy, the
> > policy kobject's refcount should be 1 at this point, as set by
> > cpufreq_add_dev_interface(), and only needs to be dropped once for
> > the kobject to go away.  This actually happens under the cpu == 1
> > check, so it need not be done before by cpufreq_cpu_put().
> > 
> > On the other hand, if the given CPU is not the last one for that
> > policy, this means that cpufreq_add_policy_cpu() has been called
> > at least once for that policy and cpufreq_cpu_get() has been
> > called for it too.  To balance that cpufreq_cpu_get(), we need to
> > call cpufreq_cpu_put() in that case.
> > 
> > Thus, to fix the described problem and keep the reference
> > counters balanced in both cases, move the cpufreq_cpu_get() call
> > in __cpufreq_remove_dev() to the code path executed only for
> > CPUs that share the policy with other CPUs.
> > 
> > Reported-by: Toralf Förster <toralf.foerster@gmx.de>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/cpufreq/cpufreq.c |   19 ++++++++++---------
> >  1 file changed, 10 insertions(+), 9 deletions(-)
> > 
> > Index: linux-pm/drivers/cpufreq/cpufreq.c
> > ===================================================================
> > --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> > +++ linux-pm/drivers/cpufreq/cpufreq.c
> > @@ -1177,14 +1177,11 @@ static int __cpufreq_remove_dev(struct d
> >  				__func__, cpu_dev->id, cpu);
> >  	}
> >  
> > -	if ((cpus == 1) && (cpufreq_driver->target))
> > -		__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
> > -
> > -	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> > -	cpufreq_cpu_put(data);
> > -
> >  	/* If cpu is last user of policy, free policy */
> >  	if (cpus == 1) {
> > +		if (cpufreq_driver->target)
> > +			__cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
> > +
> >  		lock_policy_rwsem_read(cpu);
> >  		kobj = &data->kobj;
> >  		cmp = &data->kobj_unregister;
> > @@ -1205,9 +1202,13 @@ static int __cpufreq_remove_dev(struct d
> >  		free_cpumask_var(data->related_cpus);
> >  		free_cpumask_var(data->cpus);
> >  		kfree(data);
> > -	} else if (cpufreq_driver->target) {
> > -		__cpufreq_governor(data, CPUFREQ_GOV_START);
> > -		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
> > +	} else {
> > +		pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
> > +		cpufreq_cpu_put(data);
> > +		if (cpufreq_driver->target) {
> > +			__cpufreq_governor(data, CPUFREQ_GOV_START);
> > +			__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
> > +		}
> >  	}
> >  
> >  	per_cpu(cpufreq_policy_cpu, cpu) = -1;
> > 
> > 
> 
> 
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of  "lsmod | grep ^acpi_cpufreq"
  2013-07-29 15:27                   ` Srivatsa S. Bhat
@ 2013-07-29 20:28                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2013-07-29 20:28 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Toralf Förster, cpufreq, Linux PM list, Viresh Kumar

On Monday, July 29, 2013 08:57:20 PM Srivatsa S. Bhat wrote:
> On 07/29/2013 05:34 PM, Rafael J. Wysocki wrote:
> > On Monday, July 29, 2013 05:14:25 PM Srivatsa S. Bhat wrote:
> >> On 07/29/2013 05:19 PM, Rafael J. Wysocki wrote:
> >>> On Monday, July 29, 2013 03:11:45 PM Srivatsa S. Bhat wrote:
> >>>> On 07/29/2013 04:50 AM, Rafael J. Wysocki wrote:
> >>>>> On Monday, July 29, 2013 12:43:59 AM Rafael J. Wysocki wrote:
> >>>>>> On Monday, July 29, 2013 12:11:18 AM Rafael J. Wysocki wrote:
> >>>>>>> On Sunday, July 28, 2013 12:21:22 PM Toralf Förster wrote:
> >>>>>>>> On 07/28/2013 01:39 AM, Rafael J. Wysocki wrote:
> >>>>>>>>> On Saturday, July 27, 2013 07:40:34 PM Toralf Förster wrote:
> >>>>>>>>>> it gives at a ThinkPad T420:
> >>>>>>>>>>
> >>>>>>>>>> tfoerste@n22 ~/tmp $ lsmod | grep ^acpi_cpufreq
> >>>>>>>>>> acpi_cpufreq           12902  2147483647
> >>>>>>>>>
> >>>>>>>>> That is -1, which indicates some module refcount woes.
> >>>>>>>>
> >>>>>>>> yes, ofc.
> >>>>>>>>
> >>>>>>>> The issue apears after 1 s2ram/resume cycle, before s2ram the refcount is 1.
> >>>>>>>>
> >>>>>>>>> I definitely can't see that with the mainline on my machines.
> >>>>>>>>
> >>>>>>>> It is in mainline too.
> >>>>>>>
> >>>>>>> Does the appended patch help?
> >>>>>>
> >>>>>> Actually, something as simple as this also should help:
> >>>>>>
> >>>>>> ---
> >>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>>>> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
> >>>>>>
> >>>>>> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
> >>>>>> driver module refcount, __cpufreq_remove_dev() causes that refcount
> >>>>>> to become negative after a suspend/resume cycle, for example.
> >>>>>>
> >>>>>> To prevent this from happening make __cpufreq_remove_dev() put
> >>>>>> the policy kobject only instead of calling cpufreq_cpu_put().
> >>>>>
> >>>>> Having a deeper look at it, though, I see that in fact the whole
> >>>>> cpufreq_cpu_put() is needed if the CPU is not the last one for the given
> >>>>> policy and is not needed at all otherwise (as described in the changelog
> >>>>> of the patch below).
> >>>>>
> >>>>> Srivatsa, does this make sense to you?
> >>>>>
> >>>>
> >>>> Code-wise, your patch looks good to me. But one thing in the existing code
> >>>> struck me as a little strange.
> >>>>
> >>>> I'm assuming that the module_get() is used in the cpufreq core to ensure that
> >>>> until the cpufreq core is managing (atleast one) CPU(s), the cpufreq backend
> >>>> driver module (eg: acpi-cpufreq) can't be removed.
> >>>
> >>> Quite frankly, I'm not sure about that.  If that were the case,
> >>> cpufreq_add_dev() would not call module_put() which it does.
> >>>
> >>> That may be a bug, I agree, but that's not for the present release cycle.  For
> >>> now, we need to ensure that the reference counts are *balanced* .
> >>>
> >>
> >> Sure, in that case, I agree that your patch is the right fix at this point,
> >> since it resolves the immediate problem that we have with the refcounts.
> >>
> >> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> > 
> > Great, thanks!
> > 
> > Does your patchset avoiding the creation/removal of sysfs directories over
> > suspend/resume need to be modified to take this patch into account?
> > 
> 
> There will be some trivial conflicts, but looking a bit closer at my patchset,
> I believe it has a subtle refcounting bug, relating to patches 6 and 7.
> 
> Patch 6: http://marc.info/?l=linux-kernel&m=137358141628042&w=2
> Patch 7: http://marc.info/?l=linux-pm&m=137358124527993&w=2
> 
> 
> Patch 6 modifies cpufreq_add_policy_cpu() to return 0 if frozen == true.
> But by that time, it would have already done a cpufreq_cpu_get().
> 
> In patch 7, under __cpufreq_remove_dev(), we avoid dropping refcounts
> for *any* CPU when frozen == true.
> 
> In the general case, this is fine (like myself and Viresh had discussed
> earlier in that thread), because __cpufreq_add_dev() doesn't increment the
> refcount in the frozen case.
> 
> But we overlooked a rather subtle scenario: say, CPU 'X' went through
> cpufreq_add_dev in the resume path, and restored its policy structure.
> Since this was fresh, no refcount was taken. And then CPU 'Y' came along,
> but it was related to CPU 'X', so it will take this path:
> 
> 1018         /* Check if this cpu was hot-unplugged earlier and has siblings */
> 1019         read_lock_irqsave(&cpufreq_driver_lock, flags);
> 1020         for_each_online_cpu(sibling) {
> 1021                 struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
> 1022                 if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
> 1023                         read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> 1024                         return cpufreq_add_policy_cpu(cpu, sibling, dev,
> 1025                                                       frozen);
> 1026                 }
> 1027         }
> 
> This time, cp won't be NULL for CPU 'X', because its policy was restored
> in the previous call. So we end up calling cpufreq_add_policy_cpu() for
> CPU 'Y'. And that function takes a new refcount, as mentioned above.
> 
> So we need to have this extra hunk in cpufreq_add_policy_cpu() to ensure
> we don't take any additional refcounts during suspend/resume:
> 
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index c7f59e8..9681c01 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -932,8 +932,11 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
>  	}
>  
>  	/* Don't touch sysfs links during light-weight init */
> -	if (frozen)
> +	if (frozen) {
> +		/* Drop the extra refcount that we took above */
> +		cpufreq_cpu_put(policy);
>  		return 0;
> +	}
>  
>  	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>  	if (ret)
> 
> 
> I can respin my entire patchset with this fix included, on top of your
> patch. Please let me know if you'd prefer some other method for resolving
> this.

Please respin the patchset with the fix folded in.

My patch is in the bleeding-edge branch of linux-pm.git, so please rebase the
new version on that if that's not a problem.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq"
  2013-07-29 11:54                 ` Rafael J. Wysocki
                                   ` (2 preceding siblings ...)
  (?)
@ 2013-07-30  5:23                 ` Viresh Kumar
  -1 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2013-07-30  5:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Srivatsa S. Bhat, Toralf Förster, cpufreq, Linux PM list

On 29 July 2013 17:24, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
>
> Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the
> driver module refcount, __cpufreq_remove_dev() causes that refcount
> to become negative for the cpufreq driver after a suspend/resume
> cycle.
>
> This is not the only bad thing that happens there, however, because
> kobject_put() should only be called for the policy kobject at this
> point if the CPU is not the last one for that policy.
>
> Namely, if the given CPU is the last one for that policy, the
> policy kobject's refcount should be 1 at this point, as set by
> cpufreq_add_dev_interface(), and only needs to be dropped once for
> the kobject to go away.  This actually happens under the cpu == 1
> check, so it need not be done before by cpufreq_cpu_put().
>
> On the other hand, if the given CPU is not the last one for that
> policy, this means that cpufreq_add_policy_cpu() has been called
> at least once for that policy and cpufreq_cpu_get() has been
> called for it too.  To balance that cpufreq_cpu_get(), we need to
> call cpufreq_cpu_put() in that case.
>
> Thus, to fix the described problem and keep the reference
> counters balanced in both cases, move the cpufreq_cpu_get() call
> in __cpufreq_remove_dev() to the code path executed only for
> CPUs that share the policy with other CPUs.
>
> Reported-by: Toralf Förster <toralf.foerster@gmx.de>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/cpufreq.c |   19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)

Looks fine.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2013-07-30  5:23 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-27 17:40 stable 3-10-3: strange output of "lsmod | grep ^acpi_cpufreq" Toralf Förster
2013-07-27 23:39 ` Rafael J. Wysocki
2013-07-27 23:39   ` Rafael J. Wysocki
2013-07-28  8:08   ` Toralf Förster
2013-07-28  8:08     ` Srivatsa S. Bhat
2013-07-28 10:21   ` Toralf Förster
2013-07-28 22:11     ` Rafael J. Wysocki
2013-07-28 22:43       ` Rafael J. Wysocki
2013-07-28 23:20         ` Rafael J. Wysocki
2013-07-29  7:51           ` Viresh Kumar
2013-07-29  9:44             ` Srivatsa S. Bhat
2013-07-29  9:41           ` Srivatsa S. Bhat
2013-07-29 11:22             ` Viresh Kumar
2013-07-29 11:54               ` Rafael J. Wysocki
2013-07-29 11:54                 ` Rafael J. Wysocki
2013-07-29 11:48                 ` Srivatsa S. Bhat
2013-07-29 11:48                   ` Srivatsa S. Bhat
2013-07-29 17:23                 ` Toralf Förster
2013-07-29 20:19                   ` Rafael J. Wysocki
2013-07-30  5:23                 ` Viresh Kumar
2013-07-29 11:49             ` Rafael J. Wysocki
2013-07-29 11:44               ` Srivatsa S. Bhat
2013-07-29 12:04                 ` Rafael J. Wysocki
2013-07-29 15:27                   ` Srivatsa S. Bhat
2013-07-29 20:28                     ` Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.