linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] cpufreq: Avoid double addition/removal of sysfs links
       [not found] <20150718163149.GP7557@n2100.arm.linux.org.uk>
@ 2015-07-20  9:47 ` Viresh Kumar
  2015-07-20 10:36   ` Russell King - ARM Linux
  2015-07-21 23:15   ` Rafael J. Wysocki
  2015-07-22 12:07 ` [PATCH V2] cpufreq: Fix double addition " Viresh Kumar
  1 sibling, 2 replies; 23+ messages in thread
From: Viresh Kumar @ 2015-07-20  9:47 UTC (permalink / raw)
  To: Rafael Wysocki, linux; +Cc: linaro-kernel, linux-pm, Viresh Kumar, open list

Consider a dual core (0/1) system with two CPUs:
- sharing clock/voltage rails and hence cpufreq-policy
- CPU1 is offline while the cpufreq driver is registered

- cpufreq_add_dev() is called from subsys callback for CPU0 and we
  create the policy for the CPUs and create link for CPU1.
- cpufreq_add_dev() is called from subsys callback for CPU1, we find
  that the cpu is offline and we try to create a sysfs link for CPU1.
- This results in double addition of the sysfs link and we get this:

	WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x7c()
	sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
	Modules linked in:
	CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc2+ #1704
	Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
	Backtrace:
	[<c0013248>] (dump_backtrace) from [<c00133e4>] (show_stack+0x18/0x1c)
	 r6:c01a1f30 r5:0000001f r4:00000000 r3:00000000
	[<c00133cc>] (show_stack) from [<c076920c>] (dump_stack+0x7c/0x98)
	[<c0769190>] (dump_stack) from [<c0029ab4>] (warn_slowpath_common+0x80/0xbc)
	 r4:d74abbd0 r3:d74c0000
	[<c0029a34>] (warn_slowpath_common) from [<c0029b94>] (warn_slowpath_fmt+0x38/0x40)
	 r8:ffffffef r7:00000000 r6:d75a8960 r5:c0993280 r4:d6b4d000
	[<c0029b60>] (warn_slowpath_fmt) from [<c01a1f30>] (sysfs_warn_dup+0x60/0x7c)
	 r3:d6b4dfe7 r2:c0930750
	[<c01a1ed0>] (sysfs_warn_dup) from [<c01a22c8>] (sysfs_do_create_link_sd+0xb8/0xc0)
	 r6:d75a8960 r5:c0993280 r4:d00aba20
	[<c01a2210>] (sysfs_do_create_link_sd) from [<c01a22fc>] (sysfs_create_link+0x2c/0x3c)
	 r10:00000001 r8:c14db3c8 r7:d7b89010 r6:c0ae7c60 r5:d7b89010 r4:d00d1200
	[<c01a22d0>] (sysfs_create_link) from [<c0506160>] (add_cpu_dev_symlink+0x34/0x5c)
	[<c050612c>] (add_cpu_dev_symlink) from [<c05084d0>] (cpufreq_add_dev+0x674/0x794)
	 r5:00000001 r4:00000000
	[<c0507e5c>] (cpufreq_add_dev) from [<c03db114>] (subsys_interface_register+0x8c/0xd0)
	 r10:00000003 r9:d7bb01f0 r8:c14db3c8 r7:00106738 r6:c0ae7c60 r5:c0acbd08
	 r4:c0ae7e20
	[<c03db088>] (subsys_interface_register) from [<c0508a2c>] (cpufreq_register_driver+0x104/0x1f4)


The check for offline-cpu in cpufreq_add_dev() is present to ensure that
link gets added for the CPUs, that weren't physically present earlier
and we missed the case where a CPU is offline while registering the
driver.

Fix this by keeping track of CPUs for which link is already created, and
avoiding duplicate sysfs entries.

Fixes: 87549141d516 ("cpufreq: Stop migrating sysfs files on hotplug")
Reported-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
Russell,

Can you please give this a try? (completely untested).

 drivers/cpufreq/cpufreq.c | 26 +++++++++++++++++++++++---
 include/linux/cpufreq.h   |  1 +
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index cdbe0676d246..12d089b78cad 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -984,17 +984,26 @@ EXPORT_SYMBOL(cpufreq_sysfs_remove_file);
 static int add_cpu_dev_symlink(struct cpufreq_policy *policy, int cpu)
 {
 	struct device *cpu_dev;
+	int ret;
 
 	pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);
 
 	if (!policy)
 		return 0;
 
+	/* Already added for offline CPUS from subsys callback */
+	if (cpumask_test_cpu(cpu, policy->symlinks))
+		return 0;
+
 	cpu_dev = get_cpu_device(cpu);
 	if (WARN_ON(!cpu_dev))
 		return 0;
 
-	return sysfs_create_link(&cpu_dev->kobj, &policy->kobj, "cpufreq");
+	ret = sysfs_create_link(&cpu_dev->kobj, &policy->kobj, "cpufreq");
+	if (!ret)
+		cpumask_set_cpu(cpu, policy->symlinks);
+
+	return ret;
 }
 
 static void remove_cpu_dev_symlink(struct cpufreq_policy *policy, int cpu)
@@ -1007,6 +1016,7 @@ static void remove_cpu_dev_symlink(struct cpufreq_policy *policy, int cpu)
 	if (WARN_ON(!cpu_dev))
 		return;
 
+	cpumask_clear_cpu(cpu, policy->symlinks);
 	sysfs_remove_link(&cpu_dev->kobj, "cpufreq");
 }
 
@@ -1038,6 +1048,10 @@ static void cpufreq_remove_dev_symlink(struct cpufreq_policy *policy)
 		if (j == policy->kobj_cpu)
 			continue;
 
+		/* Already removed for offline CPUS from subsys callback */
+		if (!cpumask_test_cpu(j, policy->symlinks))
+			continue;
+
 		remove_cpu_dev_symlink(policy, j);
 	}
 }
@@ -1172,11 +1186,14 @@ static struct cpufreq_policy *cpufreq_policy_alloc(struct device *dev)
 	if (!zalloc_cpumask_var(&policy->related_cpus, GFP_KERNEL))
 		goto err_free_cpumask;
 
+	if (!zalloc_cpumask_var(&policy->symlinks, GFP_KERNEL))
+		goto err_free_related_cpumask;
+
 	ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq, &dev->kobj,
 				   "cpufreq");
 	if (ret) {
 		pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
-		goto err_free_rcpumask;
+		goto err_free_symlink_cpumask;
 	}
 
 	INIT_LIST_HEAD(&policy->policy_list);
@@ -1193,7 +1210,9 @@ static struct cpufreq_policy *cpufreq_policy_alloc(struct device *dev)
 
 	return policy;
 
-err_free_rcpumask:
+err_free_symlink_cpumask:
+	free_cpumask_var(policy->symlinks);
+err_free_related_cpumask:
 	free_cpumask_var(policy->related_cpus);
 err_free_cpumask:
 	free_cpumask_var(policy->cpus);
@@ -1243,6 +1262,7 @@ static void cpufreq_policy_free(struct cpufreq_policy *policy, bool notify)
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	cpufreq_policy_put_kobj(policy, notify);
+	free_cpumask_var(policy->symlinks);
 	free_cpumask_var(policy->related_cpus);
 	free_cpumask_var(policy->cpus);
 	kfree(policy);
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index a82049683016..b4812f6977c6 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -62,6 +62,7 @@ struct cpufreq_policy {
 	/* CPUs sharing clock, require sw coordination */
 	cpumask_var_t		cpus;	/* Online CPUs only */
 	cpumask_var_t		related_cpus; /* Online + Offline CPUs */
+	cpumask_var_t		symlinks; /* CPUs for which cpufreq sysfs directory is present */
 
 	unsigned int		shared_type; /* ACPI: ANY or ALL affected CPUs
 						should set cpufreq */
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] cpufreq: Avoid double addition/removal of sysfs links
  2015-07-20  9:47 ` [PATCH] cpufreq: Avoid double addition/removal of sysfs links Viresh Kumar
@ 2015-07-20 10:36   ` Russell King - ARM Linux
  2015-07-21  0:47     ` Rafael J. Wysocki
                       ` (2 more replies)
  2015-07-21 23:15   ` Rafael J. Wysocki
  1 sibling, 3 replies; 23+ messages in thread
From: Russell King - ARM Linux @ 2015-07-20 10:36 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Rafael Wysocki, linaro-kernel, linux-pm, open list

On Mon, Jul 20, 2015 at 03:17:10PM +0530, Viresh Kumar wrote:
> Consider a dual core (0/1) system with two CPUs:
> - sharing clock/voltage rails and hence cpufreq-policy
> - CPU1 is offline while the cpufreq driver is registered
> 
> - cpufreq_add_dev() is called from subsys callback for CPU0 and we
>   create the policy for the CPUs and create link for CPU1.
> - cpufreq_add_dev() is called from subsys callback for CPU1, we find
>   that the cpu is offline and we try to create a sysfs link for CPU1.
> - This results in double addition of the sysfs link and we get this:
> 
> 	WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x7c()
> 	sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
> 	Modules linked in:
> 	CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc2+ #1704
> 	Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> 	Backtrace:
> 	[<c0013248>] (dump_backtrace) from [<c00133e4>] (show_stack+0x18/0x1c)
> 	 r6:c01a1f30 r5:0000001f r4:00000000 r3:00000000
> 	[<c00133cc>] (show_stack) from [<c076920c>] (dump_stack+0x7c/0x98)
> 	[<c0769190>] (dump_stack) from [<c0029ab4>] (warn_slowpath_common+0x80/0xbc)
> 	 r4:d74abbd0 r3:d74c0000
> 	[<c0029a34>] (warn_slowpath_common) from [<c0029b94>] (warn_slowpath_fmt+0x38/0x40)
> 	 r8:ffffffef r7:00000000 r6:d75a8960 r5:c0993280 r4:d6b4d000
> 	[<c0029b60>] (warn_slowpath_fmt) from [<c01a1f30>] (sysfs_warn_dup+0x60/0x7c)
> 	 r3:d6b4dfe7 r2:c0930750
> 	[<c01a1ed0>] (sysfs_warn_dup) from [<c01a22c8>] (sysfs_do_create_link_sd+0xb8/0xc0)
> 	 r6:d75a8960 r5:c0993280 r4:d00aba20
> 	[<c01a2210>] (sysfs_do_create_link_sd) from [<c01a22fc>] (sysfs_create_link+0x2c/0x3c)
> 	 r10:00000001 r8:c14db3c8 r7:d7b89010 r6:c0ae7c60 r5:d7b89010 r4:d00d1200
> 	[<c01a22d0>] (sysfs_create_link) from [<c0506160>] (add_cpu_dev_symlink+0x34/0x5c)
> 	[<c050612c>] (add_cpu_dev_symlink) from [<c05084d0>] (cpufreq_add_dev+0x674/0x794)
> 	 r5:00000001 r4:00000000
> 	[<c0507e5c>] (cpufreq_add_dev) from [<c03db114>] (subsys_interface_register+0x8c/0xd0)
> 	 r10:00000003 r9:d7bb01f0 r8:c14db3c8 r7:00106738 r6:c0ae7c60 r5:c0acbd08
> 	 r4:c0ae7e20
> 	[<c03db088>] (subsys_interface_register) from [<c0508a2c>] (cpufreq_register_driver+0x104/0x1f4)
> 
> 
> The check for offline-cpu in cpufreq_add_dev() is present to ensure that
> link gets added for the CPUs, that weren't physically present earlier
> and we missed the case where a CPU is offline while registering the
> driver.
> 
> Fix this by keeping track of CPUs for which link is already created, and
> avoiding duplicate sysfs entries.

Why do we try to create the symlink for CPU devices which we haven't
"detected" yet (iow, we haven't had cpufreq_add_dev() called for)?
Surely we are guaranteed to have cpufreq_add_dev() called for every
CPU which exists in sysfs?  So why not _only_ create the sysfs symlinks
when cpufreq_add_dev() is notified that a CPU subsys interface is
present?

Sure, if the policy changes, we need to do maintanence on these symlinks,
but I see only one path down into cpufreq_add_dev_symlink(), which is:

	cpufreq_add_dev() -> cpufreq_add_dev_interface() ->
		cpufreq_add_dev_symlink()

In other words, only when we see a new CPU interface appears, not when
the policy changes.  If the set of related CPUs is policy independent,
why is this information carried in the cpufreq_policy struct?

If it is policy dependent, then I see no code which handles the effect
of a policy change where the policy->related_cpus is different.  To me,
that sounds like a rather huge design hole.

Things get worse.  Reading drivers/base/cpu.c, CPU interface nodes are
only ever created - they're created for the set of _possible_ CPUs in
the system, not those which are possible and present, and there is no
unregister_cpu() API, only a register_cpu() API.  So, cpufreq_remove_dev()
won't be called for CPUs which were present and are no longer present.
This appears to be a misunderstanding of CPU hotplug...

So, cpufreq_remove_dev() will only get called when you call
subsys_interface_unregister(), not when the CPU present mask changes.
I suspect that the code in cpufreq_remove_dev() dealing with "offline"
CPUs even works... I'd recommend reading Documentation/cpu-hotplug.txt:

| cpu_present_mask: Bitmap of CPUs currently present in the system. Not all
| of them may be online. When physical hotplug is processed by the relevant
| subsystem (e.g ACPI) can change and new bit either be added or removed
| from the map depending on the event is hot-add/hot-remove. There are
| currently no locking rules as of now. Typical usage is to init topology
| during boot, at which time hotplug is disabled.
|
| You really dont need to manipulate any of the system cpu maps. They should
| be read-only for most use. When setting up per-cpu resources almost always
| use cpu_possible_mask/for_each_possible_cpu() to iterate.

In other words, I think your usage of cpu_present_mask in this code is
buggy in itself.

Please rethink the design of this code - I think your original change is
mis-designed.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] cpufreq: Avoid double addition/removal of sysfs links
  2015-07-20 10:36   ` Russell King - ARM Linux
@ 2015-07-21  0:47     ` Rafael J. Wysocki
  2015-07-21  1:14       ` Rafael J. Wysocki
  2015-07-21 10:43     ` Viresh Kumar
  2015-07-22  6:57     ` Viresh Kumar
  2 siblings, 1 reply; 23+ messages in thread
From: Rafael J. Wysocki @ 2015-07-21  0:47 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Viresh Kumar, Rafael Wysocki, Lists linaro-kernel, linux-pm, open list

Hi Russell,

On Mon, Jul 20, 2015 at 12:36 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Jul 20, 2015 at 03:17:10PM +0530, Viresh Kumar wrote:
>> Consider a dual core (0/1) system with two CPUs:
>> - sharing clock/voltage rails and hence cpufreq-policy
>> - CPU1 is offline while the cpufreq driver is registered
>>
>> - cpufreq_add_dev() is called from subsys callback for CPU0 and we
>>   create the policy for the CPUs and create link for CPU1.
>> - cpufreq_add_dev() is called from subsys callback for CPU1, we find
>>   that the cpu is offline and we try to create a sysfs link for CPU1.
>> - This results in double addition of the sysfs link and we get this:
>>
>>       WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x7c()
>>       sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
>>       Modules linked in:
>>       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc2+ #1704
>>       Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
>>       Backtrace:
>>       [<c0013248>] (dump_backtrace) from [<c00133e4>] (show_stack+0x18/0x1c)
>>        r6:c01a1f30 r5:0000001f r4:00000000 r3:00000000
>>       [<c00133cc>] (show_stack) from [<c076920c>] (dump_stack+0x7c/0x98)
>>       [<c0769190>] (dump_stack) from [<c0029ab4>] (warn_slowpath_common+0x80/0xbc)
>>        r4:d74abbd0 r3:d74c0000
>>       [<c0029a34>] (warn_slowpath_common) from [<c0029b94>] (warn_slowpath_fmt+0x38/0x40)
>>        r8:ffffffef r7:00000000 r6:d75a8960 r5:c0993280 r4:d6b4d000
>>       [<c0029b60>] (warn_slowpath_fmt) from [<c01a1f30>] (sysfs_warn_dup+0x60/0x7c)
>>        r3:d6b4dfe7 r2:c0930750
>>       [<c01a1ed0>] (sysfs_warn_dup) from [<c01a22c8>] (sysfs_do_create_link_sd+0xb8/0xc0)
>>        r6:d75a8960 r5:c0993280 r4:d00aba20
>>       [<c01a2210>] (sysfs_do_create_link_sd) from [<c01a22fc>] (sysfs_create_link+0x2c/0x3c)
>>        r10:00000001 r8:c14db3c8 r7:d7b89010 r6:c0ae7c60 r5:d7b89010 r4:d00d1200
>>       [<c01a22d0>] (sysfs_create_link) from [<c0506160>] (add_cpu_dev_symlink+0x34/0x5c)
>>       [<c050612c>] (add_cpu_dev_symlink) from [<c05084d0>] (cpufreq_add_dev+0x674/0x794)
>>        r5:00000001 r4:00000000
>>       [<c0507e5c>] (cpufreq_add_dev) from [<c03db114>] (subsys_interface_register+0x8c/0xd0)
>>        r10:00000003 r9:d7bb01f0 r8:c14db3c8 r7:00106738 r6:c0ae7c60 r5:c0acbd08
>>        r4:c0ae7e20
>>       [<c03db088>] (subsys_interface_register) from [<c0508a2c>] (cpufreq_register_driver+0x104/0x1f4)
>>
>>
>> The check for offline-cpu in cpufreq_add_dev() is present to ensure that
>> link gets added for the CPUs, that weren't physically present earlier
>> and we missed the case where a CPU is offline while registering the
>> driver.
>>
>> Fix this by keeping track of CPUs for which link is already created, and
>> avoiding duplicate sysfs entries.
>
> Why do we try to create the symlink for CPU devices which we haven't
> "detected" yet (iow, we haven't had cpufreq_add_dev() called for)?
> Surely we are guaranteed to have cpufreq_add_dev() called for every
> CPU which exists in sysfs?  So why not _only_ create the sysfs symlinks
> when cpufreq_add_dev() is notified that a CPU subsys interface is
> present?

That's something I've overlooked.

Yes, we should be doing exactly that.

> Sure, if the policy changes, we need to do maintanence on these symlinks,
> but I see only one path down into cpufreq_add_dev_symlink(), which is:
>
>         cpufreq_add_dev() -> cpufreq_add_dev_interface() ->
>                 cpufreq_add_dev_symlink()
>
> In other words, only when we see a new CPU interface appears, not when
> the policy changes.  If the set of related CPUs is policy independent,
> why is this information carried in the cpufreq_policy struct?

It is not policy-dependent, but the way that information is gathered
is not exactly straightforward.  It generally depends on what the
platform firmware tells us about the topology.

> If it is policy dependent, then I see no code which handles the effect
> of a policy change where the policy->related_cpus is different.  To me,
> that sounds like a rather huge design hole.
>
> Things get worse.  Reading drivers/base/cpu.c, CPU interface nodes are
> only ever created - they're created for the set of _possible_ CPUs in
> the system, not those which are possible and present, and there is no
> unregister_cpu() API, only a register_cpu() API.

There is unregister_cpu() API too, but it is called from
arch_unregister_cpu().  And it calls device_unregister() and all of
the appropriate things happen AFAICS.  Eventually,
cpufreq_remove_dev() is called from that path.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] cpufreq: Avoid double addition/removal of sysfs links
  2015-07-21  0:47     ` Rafael J. Wysocki
@ 2015-07-21  1:14       ` Rafael J. Wysocki
  2015-07-22  7:04         ` Viresh Kumar
  0 siblings, 1 reply; 23+ messages in thread
From: Rafael J. Wysocki @ 2015-07-21  1:14 UTC (permalink / raw)
  To: Russell King - ARM Linux, Viresh Kumar
  Cc: Rafael Wysocki, Lists linaro-kernel, linux-pm, open list

On Tue, Jul 21, 2015 at 2:47 AM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> Hi Russell,
>
> On Mon, Jul 20, 2015 at 12:36 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
>> On Mon, Jul 20, 2015 at 03:17:10PM +0530, Viresh Kumar wrote:
>>> Consider a dual core (0/1) system with two CPUs:
>>> - sharing clock/voltage rails and hence cpufreq-policy
>>> - CPU1 is offline while the cpufreq driver is registered
>>>
>>> - cpufreq_add_dev() is called from subsys callback for CPU0 and we
>>>   create the policy for the CPUs and create link for CPU1.
>>> - cpufreq_add_dev() is called from subsys callback for CPU1, we find
>>>   that the cpu is offline and we try to create a sysfs link for CPU1.
>>> - This results in double addition of the sysfs link and we get this:
>>>
>>>       WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x7c()
>>>       sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
>>>       Modules linked in:
>>>       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc2+ #1704
>>>       Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
>>>       Backtrace:
>>>       [<c0013248>] (dump_backtrace) from [<c00133e4>] (show_stack+0x18/0x1c)
>>>        r6:c01a1f30 r5:0000001f r4:00000000 r3:00000000
>>>       [<c00133cc>] (show_stack) from [<c076920c>] (dump_stack+0x7c/0x98)
>>>       [<c0769190>] (dump_stack) from [<c0029ab4>] (warn_slowpath_common+0x80/0xbc)
>>>        r4:d74abbd0 r3:d74c0000
>>>       [<c0029a34>] (warn_slowpath_common) from [<c0029b94>] (warn_slowpath_fmt+0x38/0x40)
>>>        r8:ffffffef r7:00000000 r6:d75a8960 r5:c0993280 r4:d6b4d000
>>>       [<c0029b60>] (warn_slowpath_fmt) from [<c01a1f30>] (sysfs_warn_dup+0x60/0x7c)
>>>        r3:d6b4dfe7 r2:c0930750
>>>       [<c01a1ed0>] (sysfs_warn_dup) from [<c01a22c8>] (sysfs_do_create_link_sd+0xb8/0xc0)
>>>        r6:d75a8960 r5:c0993280 r4:d00aba20
>>>       [<c01a2210>] (sysfs_do_create_link_sd) from [<c01a22fc>] (sysfs_create_link+0x2c/0x3c)
>>>        r10:00000001 r8:c14db3c8 r7:d7b89010 r6:c0ae7c60 r5:d7b89010 r4:d00d1200
>>>       [<c01a22d0>] (sysfs_create_link) from [<c0506160>] (add_cpu_dev_symlink+0x34/0x5c)
>>>       [<c050612c>] (add_cpu_dev_symlink) from [<c05084d0>] (cpufreq_add_dev+0x674/0x794)
>>>        r5:00000001 r4:00000000
>>>       [<c0507e5c>] (cpufreq_add_dev) from [<c03db114>] (subsys_interface_register+0x8c/0xd0)
>>>        r10:00000003 r9:d7bb01f0 r8:c14db3c8 r7:00106738 r6:c0ae7c60 r5:c0acbd08
>>>        r4:c0ae7e20
>>>       [<c03db088>] (subsys_interface_register) from [<c0508a2c>] (cpufreq_register_driver+0x104/0x1f4)
>>>
>>>
>>> The check for offline-cpu in cpufreq_add_dev() is present to ensure that
>>> link gets added for the CPUs, that weren't physically present earlier
>>> and we missed the case where a CPU is offline while registering the
>>> driver.
>>>
>>> Fix this by keeping track of CPUs for which link is already created, and
>>> avoiding duplicate sysfs entries.
>>
>> Why do we try to create the symlink for CPU devices which we haven't
>> "detected" yet (iow, we haven't had cpufreq_add_dev() called for)?
>> Surely we are guaranteed to have cpufreq_add_dev() called for every
>> CPU which exists in sysfs?  So why not _only_ create the sysfs symlinks
>> when cpufreq_add_dev() is notified that a CPU subsys interface is
>> present?
>
> That's something I've overlooked.
>
> Yes, we should be doing exactly that.
>
>> Sure, if the policy changes, we need to do maintanence on these symlinks,
>> but I see only one path down into cpufreq_add_dev_symlink(), which is:
>>
>>         cpufreq_add_dev() -> cpufreq_add_dev_interface() ->
>>                 cpufreq_add_dev_symlink()
>>
>> In other words, only when we see a new CPU interface appears, not when
>> the policy changes.  If the set of related CPUs is policy independent,
>> why is this information carried in the cpufreq_policy struct?
>
> It is not policy-dependent, but the way that information is gathered
> is not exactly straightforward.  It generally depends on what the
> platform firmware tells us about the topology.
>
>> If it is policy dependent, then I see no code which handles the effect
>> of a policy change where the policy->related_cpus is different.  To me,
>> that sounds like a rather huge design hole.
>>
>> Things get worse.  Reading drivers/base/cpu.c, CPU interface nodes are
>> only ever created - they're created for the set of _possible_ CPUs in
>> the system, not those which are possible and present, and there is no
>> unregister_cpu() API, only a register_cpu() API.
>
> There is unregister_cpu() API too, but it is called from
> arch_unregister_cpu().  And it calls device_unregister() and all of
> the appropriate things happen AFAICS.  Eventually,
> cpufreq_remove_dev() is called from that path.

That said, cpu_present_mask may only be updated after calling
arch_unregister_cpu(), so checking it in cpufreq_remove_dev() doesn't
really help.

It looks like using cpufreq_remove_dev() as the subsys ->remove_dev
callback is a mistake as it cannot really tell the difference between
that code path and the CPU offline one.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] cpufreq: Avoid double addition/removal of sysfs links
  2015-07-20 10:36   ` Russell King - ARM Linux
  2015-07-21  0:47     ` Rafael J. Wysocki
@ 2015-07-21 10:43     ` Viresh Kumar
  2015-07-22  6:57     ` Viresh Kumar
  2 siblings, 0 replies; 23+ messages in thread
From: Viresh Kumar @ 2015-07-21 10:43 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Rafael Wysocki, Linaro Kernel Mailman List, linux-pm, open list

On 20 July 2015 at 16:06, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> Why do we try to create the symlink for CPU devices which we haven't
> "detected" yet (iow, we haven't had cpufreq_add_dev() called for)?
> Surely we are guaranteed to have cpufreq_add_dev() called for every
> CPU which exists in sysfs?  So why not _only_ create the sysfs symlinks
> when cpufreq_add_dev() is notified that a CPU subsys interface is
> present?
>
> Sure, if the policy changes, we need to do maintanence on these symlinks,
> but I see only one path down into cpufreq_add_dev_symlink(), which is:
>
>         cpufreq_add_dev() -> cpufreq_add_dev_interface() ->
>                 cpufreq_add_dev_symlink()
>
> In other words, only when we see a new CPU interface appears, not when
> the policy changes.  If the set of related CPUs is policy independent,
> why is this information carried in the cpufreq_policy struct?
>
> If it is policy dependent, then I see no code which handles the effect
> of a policy change where the policy->related_cpus is different.  To me,
> that sounds like a rather huge design hole.
>
> Things get worse.  Reading drivers/base/cpu.c, CPU interface nodes are
> only ever created - they're created for the set of _possible_ CPUs in
> the system, not those which are possible and present, and there is no
> unregister_cpu() API, only a register_cpu() API.  So, cpufreq_remove_dev()
> won't be called for CPUs which were present and are no longer present.
> This appears to be a misunderstanding of CPU hotplug...
>
> So, cpufreq_remove_dev() will only get called when you call
> subsys_interface_unregister(), not when the CPU present mask changes.
> I suspect that the code in cpufreq_remove_dev() dealing with "offline"
> CPUs even works... I'd recommend reading Documentation/cpu-hotplug.txt:
>
> | cpu_present_mask: Bitmap of CPUs currently present in the system. Not all
> | of them may be online. When physical hotplug is processed by the relevant
> | subsystem (e.g ACPI) can change and new bit either be added or removed
> | from the map depending on the event is hot-add/hot-remove. There are
> | currently no locking rules as of now. Typical usage is to init topology
> | during boot, at which time hotplug is disabled.
> |
> | You really dont need to manipulate any of the system cpu maps. They should
> | be read-only for most use. When setting up per-cpu resources almost always
> | use cpu_possible_mask/for_each_possible_cpu() to iterate.
>
> In other words, I think your usage of cpu_present_mask in this code is
> buggy in itself.
>
> Please rethink the design of this code - I think your original change is
> mis-designed.

I wasn't able to get time in last few days for this, sorry about that..

Will try my best tomorrow to come back to this..

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] cpufreq: Avoid double addition/removal of sysfs links
  2015-07-20  9:47 ` [PATCH] cpufreq: Avoid double addition/removal of sysfs links Viresh Kumar
  2015-07-20 10:36   ` Russell King - ARM Linux
@ 2015-07-21 23:15   ` Rafael J. Wysocki
  2015-07-22  1:56     ` Rafael J. Wysocki
  2015-07-22  7:12     ` Viresh Kumar
  1 sibling, 2 replies; 23+ messages in thread
From: Rafael J. Wysocki @ 2015-07-21 23:15 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Russell King - ARM Linux, Lists linaro-kernel,
	linux-pm, open list

Hi VIresh,

On Mon, Jul 20, 2015 at 11:47 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Consider a dual core (0/1) system with two CPUs:
> - sharing clock/voltage rails and hence cpufreq-policy
> - CPU1 is offline while the cpufreq driver is registered
>
> - cpufreq_add_dev() is called from subsys callback for CPU0 and we
>   create the policy for the CPUs and create link for CPU1.
> - cpufreq_add_dev() is called from subsys callback for CPU1, we find
>   that the cpu is offline and we try to create a sysfs link for CPU1.

So the problem is that the cpu_is_offline(cpu) check in
cpufreq_add_dev() matches two distinct cases: (1) the CPU was not
present before and it is just being hot-added and (2) the CPU is
initially offline, but present, and this is the first time its device
is registered.  In the first case we can expect that the CPU will
become online shortly (although that is not guaranteed too), but in
the second case that very well may not happen.

We need to be able to distinguish between those two cases and your
patch does that, but I'm not sure if this really is the most
straightforward way to do it.

I'm also unsure why you're changing the removal code paths.  Is there
any particular failure scenario you're concerned about?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] cpufreq: Avoid double addition/removal of sysfs links
  2015-07-21 23:15   ` Rafael J. Wysocki
@ 2015-07-22  1:56     ` Rafael J. Wysocki
  2015-07-22  3:00       ` Rafael J. Wysocki
  2015-07-22  7:12     ` Viresh Kumar
  1 sibling, 1 reply; 23+ messages in thread
From: Rafael J. Wysocki @ 2015-07-22  1:56 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Russell King - ARM Linux, Lists linaro-kernel,
	linux-pm, open list

On Wednesday, July 22, 2015 01:15:01 AM Rafael J. Wysocki wrote:
> Hi VIresh,
> 
> On Mon, Jul 20, 2015 at 11:47 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > Consider a dual core (0/1) system with two CPUs:
> > - sharing clock/voltage rails and hence cpufreq-policy
> > - CPU1 is offline while the cpufreq driver is registered
> >
> > - cpufreq_add_dev() is called from subsys callback for CPU0 and we
> >   create the policy for the CPUs and create link for CPU1.
> > - cpufreq_add_dev() is called from subsys callback for CPU1, we find
> >   that the cpu is offline and we try to create a sysfs link for CPU1.
> 
> So the problem is that the cpu_is_offline(cpu) check in
> cpufreq_add_dev() matches two distinct cases: (1) the CPU was not
> present before and it is just being hot-added and (2) the CPU is
> initially offline, but present, and this is the first time its device
> is registered.  In the first case we can expect that the CPU will
> become online shortly (although that is not guaranteed too), but in
> the second case that very well may not happen.
> 
> We need to be able to distinguish between those two cases and your
> patch does that, but I'm not sure if this really is the most
> straightforward way to do it.

It looks like we need a mask of related CPUs that are present.  Or,
alternatively, a mask of CPUs that would have been related had they
been present.

That's sort of what your patch is adding, but not quite.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] cpufreq: Avoid double addition/removal of sysfs links
  2015-07-22  1:56     ` Rafael J. Wysocki
@ 2015-07-22  3:00       ` Rafael J. Wysocki
  0 siblings, 0 replies; 23+ messages in thread
From: Rafael J. Wysocki @ 2015-07-22  3:00 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Russell King - ARM Linux, Lists linaro-kernel,
	linux-pm, open list

On Wednesday, July 22, 2015 03:56:21 AM Rafael J. Wysocki wrote:
> On Wednesday, July 22, 2015 01:15:01 AM Rafael J. Wysocki wrote:
> > Hi VIresh,
> > 
> > On Mon, Jul 20, 2015 at 11:47 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > Consider a dual core (0/1) system with two CPUs:
> > > - sharing clock/voltage rails and hence cpufreq-policy
> > > - CPU1 is offline while the cpufreq driver is registered
> > >
> > > - cpufreq_add_dev() is called from subsys callback for CPU0 and we
> > >   create the policy for the CPUs and create link for CPU1.
> > > - cpufreq_add_dev() is called from subsys callback for CPU1, we find
> > >   that the cpu is offline and we try to create a sysfs link for CPU1.
> > 
> > So the problem is that the cpu_is_offline(cpu) check in
> > cpufreq_add_dev() matches two distinct cases: (1) the CPU was not
> > present before and it is just being hot-added and (2) the CPU is
> > initially offline, but present, and this is the first time its device
> > is registered.  In the first case we can expect that the CPU will
> > become online shortly (although that is not guaranteed too), but in
> > the second case that very well may not happen.
> > 
> > We need to be able to distinguish between those two cases and your
> > patch does that, but I'm not sure if this really is the most
> > straightforward way to do it.
> 
> It looks like we need a mask of related CPUs that are present.  Or,
> alternatively, a mask of CPUs that would have been related had they
> been present.
> 
> That's sort of what your patch is adding, but not quite.

OK, below is my take on this (untested), on top of https://patchwork.kernel.org/patch/6839031/

We still only create policies for online CPUs which may be confusing in some
cases, but that's because drivers may not be able to handle CPUs that aren't
online.


---
 drivers/cpufreq/cpufreq.c |   41 +++++++++++++++++++++++++----------------
 include/linux/cpufreq.h   |    1 +
 2 files changed, 26 insertions(+), 16 deletions(-)

Index: linux-pm/include/linux/cpufreq.h
===================================================================
--- linux-pm.orig/include/linux/cpufreq.h
+++ linux-pm/include/linux/cpufreq.h
@@ -62,6 +62,7 @@ struct cpufreq_policy {
 	/* CPUs sharing clock, require sw coordination */
 	cpumask_var_t		cpus;	/* Online CPUs only */
 	cpumask_var_t		related_cpus; /* Online + Offline CPUs */
+	cpumask_var_t		real_cpus; /* Related and present */
 
 	unsigned int		shared_type; /* ACPI: ANY or ALL affected CPUs
 						should set cpufreq */
Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -1002,7 +1002,7 @@ static int cpufreq_add_dev_symlink(struc
 	int ret = 0;
 
 	/* Some related CPUs might not be present (physically hotplugged) */
-	for_each_cpu_and(j, policy->related_cpus, cpu_present_mask) {
+	for_each_cpu(j, policy->real_cpus) {
 		if (j == policy->kobj_cpu)
 			continue;
 
@@ -1019,7 +1019,7 @@ static void cpufreq_remove_dev_symlink(s
 	unsigned int j;
 
 	/* Some related CPUs might not be present (physically hotplugged) */
-	for_each_cpu_and(j, policy->related_cpus, cpu_present_mask) {
+	for_each_cpu(j, policy->real_cpus) {
 		if (j == policy->kobj_cpu)
 			continue;
 
@@ -1157,11 +1157,14 @@ static struct cpufreq_policy *cpufreq_po
 	if (!zalloc_cpumask_var(&policy->related_cpus, GFP_KERNEL))
 		goto err_free_cpumask;
 
+	if (!zalloc_cpumask_var(&policy->real_cpus, GFP_KERNEL))
+		goto err_free_rcpumask;
+
 	ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq, &dev->kobj,
 				   "cpufreq");
 	if (ret) {
 		pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
-		goto err_free_rcpumask;
+		goto err_free_real_cpus;
 	}
 
 	INIT_LIST_HEAD(&policy->policy_list);
@@ -1178,6 +1181,8 @@ static struct cpufreq_policy *cpufreq_po
 
 	return policy;
 
+err_free_real_cpus:
+	free_cpumask_var(policy->real_cpus);
 err_free_rcpumask:
 	free_cpumask_var(policy->related_cpus);
 err_free_cpumask:
@@ -1228,6 +1233,7 @@ static void cpufreq_policy_free(struct c
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	cpufreq_policy_put_kobj(policy, notify);
+	free_cpumask_var(policy->real_cpus);
 	free_cpumask_var(policy->related_cpus);
 	free_cpumask_var(policy->cpus);
 	kfree(policy);
@@ -1252,14 +1258,17 @@ static int cpufreq_add_dev(struct device
 
 	pr_debug("adding CPU %u\n", cpu);
 
-	/*
-	 * Only possible if 'cpu' wasn't physically present earlier and we are
-	 * here from subsys_interface add callback. A hotplug notifier will
-	 * follow and we will handle it like logical CPU hotplug then. For now,
-	 * just create the sysfs link.
-	 */
-	if (cpu_is_offline(cpu))
-		return add_cpu_dev_symlink(per_cpu(cpufreq_cpu_data, cpu), cpu);
+	if (cpu_is_offline(cpu)) {
+		/*
+		 * Only possible if we are here from the subsys_interface add
+		 * callback.  A hotplug notifier will follow and we will handle
+		 * it as logical CPU hotplug then.  For now, just create the
+		 * sysfs link, unless there is no policy.
+		 */
+		policy = per_cpu(cpufreq_cpu_data, cpu);
+		return policy && !cpumask_test_and_set_cpu(cpu, policy->real_cpus)
+			? add_cpu_dev_symlink(policy, cpu) : 0;
+	}
 
 	if (!down_read_trylock(&cpufreq_rwsem))
 		return 0;
@@ -1301,6 +1310,9 @@ static int cpufreq_add_dev(struct device
 	/* related cpus should atleast have policy->cpus */
 	cpumask_or(policy->related_cpus, policy->related_cpus, policy->cpus);
 
+	cpumask_and(policy->cpus, policy->cpus, cpu_present_mask);
+	cpumask_or(policy->real_cpus, policy->real_cpus, policy->cpus);
+
 	/*
 	 * affected cpus must always be the one, which are online. We aren't
 	 * managing offline cpus here.
@@ -1525,19 +1537,16 @@ static int cpufreq_remove_dev(struct dev
 	 * link or free policy here.
 	 */
 	if (cpu_is_offline(cpu)) {
-		struct cpumask mask;
-
 		if (!policy)
 			return 0;
 
-		cpumask_copy(&mask, policy->related_cpus);
-		cpumask_clear_cpu(cpu, &mask);
+		cpumask_clear_cpu(cpu, policy->real_cpus);
 
 		/*
 		 * Free policy only if all policy->related_cpus are removed
 		 * physically.
 		 */
-		if (cpumask_intersects(&mask, cpu_present_mask)) {
+		if (!cpumask_empty(policy->real_cpus)) {
 			remove_cpu_dev_symlink(policy, cpu);
 			return 0;
 		}


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] cpufreq: Avoid double addition/removal of sysfs links
  2015-07-20 10:36   ` Russell King - ARM Linux
  2015-07-21  0:47     ` Rafael J. Wysocki
  2015-07-21 10:43     ` Viresh Kumar
@ 2015-07-22  6:57     ` Viresh Kumar
  2 siblings, 0 replies; 23+ messages in thread
From: Viresh Kumar @ 2015-07-22  6:57 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Rafael Wysocki, linaro-kernel, linux-pm, open list

Back now, sorry for the delay ..

On 20-07-15, 11:36, Russell King - ARM Linux wrote:
> Why do we try to create the symlink for CPU devices which we haven't
> "detected" yet (iow, we haven't had cpufreq_add_dev() called for)?
> Surely we are guaranteed to have cpufreq_add_dev() called for every
> CPU which exists in sysfs?  So why not _only_ create the sysfs symlinks
> when cpufreq_add_dev() is notified that a CPU subsys interface is
> present?

There are lot of combinations in which things can happen and so it was
done to simplify things a bit, but I agree to what you are saying.
Makes sense, let me put some brain again on that path.

> Sure, if the policy changes, we need to do maintanence on these symlinks,

What do you mean by policy changes? The complete policy structure get
reallocated? or any of its properties changes?

The policy structure is only freed today, if either the driver gets
unregistered or its CPUs are physically hotplugged out. No maintenance
then.

> but I see only one path down into cpufreq_add_dev_symlink(), which is:
> 
> 	cpufreq_add_dev() -> cpufreq_add_dev_interface() ->
> 		cpufreq_add_dev_symlink()
> 
> In other words, only when we see a new CPU interface appears, not when
> the policy changes.

Right.

> If the set of related CPUs is policy independent,
> why is this information carried in the cpufreq_policy struct?

Management of policy is done based on what's there in related_cpus and
so its present here. IOW, these are the CPUs which own this policy.

> If it is policy dependent, then I see no code which handles the effect
> of a policy change where the policy->related_cpus is different.

Sorry, but I don't exactly understand the point here. What's policy
change? When a policy is destroyed we take care of all CPUs which are
there in ->cpus or related_cpus mask.. What else are we missing ?

> To me,
> that sounds like a rather huge design hole.

Maybe, just that I haven't understood it well yet.

> Things get worse.  Reading drivers/base/cpu.c, CPU interface nodes are
> only ever created - they're created for the set of _possible_ CPUs in
> the system, not those which are possible and present, and there is no
> unregister_cpu() API, only a register_cpu() API.  So, cpufreq_remove_dev()
> won't be called for CPUs which were present and are no longer present.
> This appears to be a misunderstanding of CPU hotplug...

You really got me worrying on this one and good that I read Rafael's
reply on this about it being called from arch code.

To be honest, I had no idea how the physical hotplug thing really
works and shouted couple of times on the list for help while working
on this set. Finally I took help of Srivatsa Bhat, who did lot of work
in the hotplug code and my patch was based on his suggestions. I
didn't looked at cpu.c in details to follow all code paths.

> So, cpufreq_remove_dev() will only get called when you call
> subsys_interface_unregister(), not when the CPU present mask changes.

I hope this statement doesn't hold true anymore.

> I suspect that the code in cpufreq_remove_dev() dealing with "offline"
> CPUs even works... I'd recommend reading Documentation/cpu-hotplug.txt:

I never tested it, our lovely ARM world doesn't have a case where we
can do physical hotplug :) .. Or do we have a virtual test to get that
done in some way? That would be helpful for future testing, in case
you are aware of any way out.

> | cpu_present_mask: Bitmap of CPUs currently present in the system. Not all
> | of them may be online. When physical hotplug is processed by the relevant
> | subsystem (e.g ACPI) can change and new bit either be added or removed
> | from the map depending on the event is hot-add/hot-remove. There are
> | currently no locking rules as of now. Typical usage is to init topology
> | during boot, at which time hotplug is disabled.
> |
> | You really dont need to manipulate any of the system cpu maps. They should
> | be read-only for most use. When setting up per-cpu resources almost always
> | use cpu_possible_mask/for_each_possible_cpu() to iterate.
> 
> In other words, I think your usage of cpu_present_mask in this code is
> buggy in itself.

I do not accept it yet, I thought it was just fine.

> Please rethink the design of this code - I think your original change is
> mis-designed.

Yeah, based on the suggestion at the top, things need to change a bit
for sure. Thanks for looking into the details of the crappy design :)

-- 
viresh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] cpufreq: Avoid double addition/removal of sysfs links
  2015-07-21  1:14       ` Rafael J. Wysocki
@ 2015-07-22  7:04         ` Viresh Kumar
  0 siblings, 0 replies; 23+ messages in thread
From: Viresh Kumar @ 2015-07-22  7:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Russell King - ARM Linux, Rafael Wysocki, Lists linaro-kernel,
	linux-pm, open list

On 21-07-15, 03:14, Rafael J. Wysocki wrote:
> That said, cpu_present_mask may only be updated after calling
> arch_unregister_cpu(), so checking it in cpufreq_remove_dev() doesn't
> really help.

No, it is indeed useful. This is a snippet from the latest code we
have:

		cpumask_copy(&mask, policy->related_cpus);
		cpumask_clear_cpu(cpu, &mask);

		/*
		 * Free policy only if all policy->related_cpus are removed
		 * physically.
		 */
		if (cpumask_intersects(&mask, cpu_present_mask)) {
			remove_cpu_dev_symlink(policy, cpu);
			return 0;
		}

		cpufreq_policy_free(policy, true);



So what we are checking in the 'if' block is: "Is any CPU from
related_cpus, apart from the one getting removed now, present in the
system."

If not, then free the policy.

> It looks like using cpufreq_remove_dev() as the subsys ->remove_dev
> callback is a mistake as it cannot really tell the difference between
> that code path and the CPU offline one.

What do you mean by this? Doesn't the sif parameter confirms that its
called from subsys path ?

-- 
viresh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] cpufreq: Avoid double addition/removal of sysfs links
  2015-07-21 23:15   ` Rafael J. Wysocki
  2015-07-22  1:56     ` Rafael J. Wysocki
@ 2015-07-22  7:12     ` Viresh Kumar
  1 sibling, 0 replies; 23+ messages in thread
From: Viresh Kumar @ 2015-07-22  7:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Russell King - ARM Linux, Lists linaro-kernel,
	linux-pm, open list

On 22-07-15, 01:15, Rafael J. Wysocki wrote:
> So the problem is that the cpu_is_offline(cpu) check in
> cpufreq_add_dev() matches two distinct cases: (1) the CPU was not
> present before and it is just being hot-added and (2) the CPU is
> initially offline, but present, and this is the first time its device
> is registered.  In the first case we can expect that the CPU will
> become online shortly (although that is not guaranteed too), but in
> the second case that very well may not happen.

Yeah.

> We need to be able to distinguish between those two cases and your
> patch does that, but I'm not sure if this really is the most
> straightforward way to do it.

Maybe yeah. I will take another look into that after considering
Russell's input.

> I'm also unsure why you're changing the removal code paths.  Is there
> any particular failure scenario you're concerned about?

The same issue is present here too. The problem was that cpu_offline()
check was getting hit for a CPU that is present in related_cpus mask.
While allocating/freeing the policy, we create links for all
related_cpus and the cpu_offline() check was adding/removing the link
again.

-- 
viresh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH V2] cpufreq: Fix double addition of sysfs links
       [not found] <20150718163149.GP7557@n2100.arm.linux.org.uk>
  2015-07-20  9:47 ` [PATCH] cpufreq: Avoid double addition/removal of sysfs links Viresh Kumar
@ 2015-07-22 12:07 ` Viresh Kumar
  2015-07-22 12:44   ` Rafael J. Wysocki
  2015-07-22 13:15   ` Russell King - ARM Linux
  1 sibling, 2 replies; 23+ messages in thread
From: Viresh Kumar @ 2015-07-22 12:07 UTC (permalink / raw)
  To: Rafael Wysocki, linux; +Cc: linaro-kernel, linux-pm, Viresh Kumar, open list

Consider a dual core (0/1) system with two CPUs:
- sharing clock/voltage rails and hence cpufreq-policy
- CPU1 is offline while the cpufreq driver is registered
- cpufreq_add_dev() is called from subsys callback for CPU0 and we
  create the policy for the group of CPUs and create links for all
  present CPUs, i.e. CPU1 as well.
- cpufreq_add_dev() is called from subsys callback for CPU1, we find
  that the cpu is offline and we try to create a sysfs link for CPU1.

This results in double addtion of the sysfs link and we will get this:

WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x7c()
sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc2+ #1704
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Backtrace:
[<c0013248>] (dump_backtrace) from [<c00133e4>] (show_stack+0x18/0x1c)
 r6:c01a1f30 r5:0000001f r4:00000000 r3:00000000
[<c00133cc>] (show_stack) from [<c076920c>] (dump_stack+0x7c/0x98)
[<c0769190>] (dump_stack) from [<c0029ab4>] (warn_slowpath_common+0x80/0xbc)
 r4:d74abbd0 r3:d74c0000
[<c0029a34>] (warn_slowpath_common) from [<c0029b94>] (warn_slowpath_fmt+0x38/0x40)
 r8:ffffffef r7:00000000 r6:d75a8960 r5:c0993280 r4:d6b4d000
[<c0029b60>] (warn_slowpath_fmt) from [<c01a1f30>] (sysfs_warn_dup+0x60/0x7c)
 r3:d6b4dfe7 r2:c0930750
[<c01a1ed0>] (sysfs_warn_dup) from [<c01a22c8>] (sysfs_do_create_link_sd+0xb8/0xc0)
 r6:d75a8960 r5:c0993280 r4:d00aba20
[<c01a2210>] (sysfs_do_create_link_sd) from [<c01a22fc>] (sysfs_create_link+0x2c/0x3c)
 r10:00000001 r8:c14db3c8 r7:d7b89010 r6:c0ae7c60 r5:d7b89010 r4:d00d1200
[<c01a22d0>] (sysfs_create_link) from [<c0506160>] (add_cpu_dev_symlink+0x34/0x5c)
[<c050612c>] (add_cpu_dev_symlink) from [<c05084d0>] (cpufreq_add_dev+0x674/0x794)
 r5:00000001 r4:00000000
[<c0507e5c>] (cpufreq_add_dev) from [<c03db114>] (subsys_interface_register+0x8c/0xd0)
 r10:00000003 r9:d7bb01f0 r8:c14db3c8 r7:00106738 r6:c0ae7c60 r5:c0acbd08
 r4:c0ae7e20
[<c03db088>] (subsys_interface_register) from [<c0508a2c>] (cpufreq_register_driver+0x104/0x1f4)

The check for offline-cpu in cpufreq_add_dev() is to ensure that link
gets added for the CPUs which weren't physically present earlier and
that misses the case where a CPU is offline while registering the
driver.

To fix this properly, don't create these links when the policy get
initialized. Rather wait for individual subsys callback for CPUs to
add/remove these links. This simplifies most of the code leaving
cpufreq_remove_dev().

The problem is that, we might remove cpu which was owner of policy->kobj
in sysfs, before other CPUs are removed. Fix this by the solution we
have been using until very recently, in which we move the kobject to any
other CPU, for which remove is yet to be called.

Tested on dual core exynos board with cpufreq-dt driver. The driver was
compiled as module and inserted/removed multiple times on a running
kernel.

Fixes: 87549141d516 ("cpufreq: Stop migrating sysfs files on hotplug")
Reported-and-suggested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
V1->V2: Completely changed, please review again :)

@Rafael: I didn't review your solution and gave this one because I
thought Russell suggested the right thing. i.e. don't create links in
the beginning.

This is based of 4.2-rc3 and so your other patch,
https://patchwork.kernel.org/patch/6839031/ has to be rebased over it.

I didn't rebase this patch over yours for two reasons:
- Yours wasn't necessarily 4.2 material.
- I already mentioned a problem in that patch.

@Russell: I hope this will look much better than V1 to you. Please give
it a try once you get some time.

 drivers/cpufreq/cpufreq.c | 165 ++++++++++++++++++----------------------------
 include/linux/cpufreq.h   |   1 +
 2 files changed, 65 insertions(+), 101 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 26063afb3eba..81c2417e52f4 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -966,67 +966,6 @@ void cpufreq_sysfs_remove_file(const struct attribute *attr)
 }
 EXPORT_SYMBOL(cpufreq_sysfs_remove_file);
 
-static int add_cpu_dev_symlink(struct cpufreq_policy *policy, int cpu)
-{
-	struct device *cpu_dev;
-
-	pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);
-
-	if (!policy)
-		return 0;
-
-	cpu_dev = get_cpu_device(cpu);
-	if (WARN_ON(!cpu_dev))
-		return 0;
-
-	return sysfs_create_link(&cpu_dev->kobj, &policy->kobj, "cpufreq");
-}
-
-static void remove_cpu_dev_symlink(struct cpufreq_policy *policy, int cpu)
-{
-	struct device *cpu_dev;
-
-	pr_debug("%s: Removing symlink for CPU: %u\n", __func__, cpu);
-
-	cpu_dev = get_cpu_device(cpu);
-	if (WARN_ON(!cpu_dev))
-		return;
-
-	sysfs_remove_link(&cpu_dev->kobj, "cpufreq");
-}
-
-/* Add/remove symlinks for all related CPUs */
-static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy)
-{
-	unsigned int j;
-	int ret = 0;
-
-	/* Some related CPUs might not be present (physically hotplugged) */
-	for_each_cpu_and(j, policy->related_cpus, cpu_present_mask) {
-		if (j == policy->kobj_cpu)
-			continue;
-
-		ret = add_cpu_dev_symlink(policy, j);
-		if (ret)
-			break;
-	}
-
-	return ret;
-}
-
-static void cpufreq_remove_dev_symlink(struct cpufreq_policy *policy)
-{
-	unsigned int j;
-
-	/* Some related CPUs might not be present (physically hotplugged) */
-	for_each_cpu_and(j, policy->related_cpus, cpu_present_mask) {
-		if (j == policy->kobj_cpu)
-			continue;
-
-		remove_cpu_dev_symlink(policy, j);
-	}
-}
-
 static int cpufreq_add_dev_interface(struct cpufreq_policy *policy,
 				     struct device *dev)
 {
@@ -1057,7 +996,7 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy,
 			return ret;
 	}
 
-	return cpufreq_add_dev_symlink(policy);
+	return 0;
 }
 
 static void cpufreq_init_policy(struct cpufreq_policy *policy)
@@ -1163,11 +1102,14 @@ static struct cpufreq_policy *cpufreq_policy_alloc(struct device *dev)
 	if (!zalloc_cpumask_var(&policy->related_cpus, GFP_KERNEL))
 		goto err_free_cpumask;
 
+	if (!zalloc_cpumask_var(&policy->symlinks, GFP_KERNEL))
+		goto err_free_related_cpumask;
+
 	ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq, &dev->kobj,
 				   "cpufreq");
 	if (ret) {
 		pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
-		goto err_free_rcpumask;
+		goto err_free_symlink_cpumask;
 	}
 
 	INIT_LIST_HEAD(&policy->policy_list);
@@ -1184,7 +1126,9 @@ static struct cpufreq_policy *cpufreq_policy_alloc(struct device *dev)
 
 	return policy;
 
-err_free_rcpumask:
+err_free_symlink_cpumask:
+	free_cpumask_var(policy->symlinks);
+err_free_related_cpumask:
 	free_cpumask_var(policy->related_cpus);
 err_free_cpumask:
 	free_cpumask_var(policy->cpus);
@@ -1204,7 +1148,6 @@ static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy, bool notify)
 					     CPUFREQ_REMOVE_POLICY, policy);
 
 	down_write(&policy->rwsem);
-	cpufreq_remove_dev_symlink(policy);
 	kobj = &policy->kobj;
 	cmp = &policy->kobj_unregister;
 	up_write(&policy->rwsem);
@@ -1234,6 +1177,7 @@ static void cpufreq_policy_free(struct cpufreq_policy *policy, bool notify)
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	cpufreq_policy_put_kobj(policy, notify);
+	free_cpumask_var(policy->symlinks);
 	free_cpumask_var(policy->related_cpus);
 	free_cpumask_var(policy->cpus);
 	kfree(policy);
@@ -1252,26 +1196,37 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 {
 	unsigned int j, cpu = dev->id;
 	int ret = -ENOMEM;
-	struct cpufreq_policy *policy;
+	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
 	unsigned long flags;
 	bool recover_policy = !sif;
 
 	pr_debug("adding CPU %u\n", cpu);
 
+	/* sysfs links are only created on subsys callback */
+	if (sif && policy) {
+		pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);
+		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
+		if (ret) {
+			dev_err(dev, "%s: Failed to create link for cpu %d (%d)\n",
+				__func__, cpu, ret);
+			return ret;
+		}
+
+		/* Track CPUs for which sysfs links are created */
+		cpumask_set_cpu(cpu, policy->symlinks);
+	}
+
 	/*
-	 * Only possible if 'cpu' wasn't physically present earlier and we are
-	 * here from subsys_interface add callback. A hotplug notifier will
-	 * follow and we will handle it like logical CPU hotplug then. For now,
-	 * just create the sysfs link.
+	 * A hotplug notifier will follow and we will take care of rest
+	 * of the initialization then.
 	 */
 	if (cpu_is_offline(cpu))
-		return add_cpu_dev_symlink(per_cpu(cpufreq_cpu_data, cpu), cpu);
+		return 0;
 
 	if (!down_read_trylock(&cpufreq_rwsem))
 		return 0;
 
 	/* Check if this CPU already has a policy to manage it */
-	policy = per_cpu(cpufreq_cpu_data, cpu);
 	if (policy && !policy_is_inactive(policy)) {
 		WARN_ON(!cpumask_test_cpu(cpu, policy->related_cpus));
 		ret = cpufreq_add_policy_cpu(policy, cpu, dev);
@@ -1506,10 +1461,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
 	if (cpufreq_driver->exit)
 		cpufreq_driver->exit(policy);
 
-	/* Free the policy only if the driver is getting removed. */
-	if (sif)
-		cpufreq_policy_free(policy, true);
-
 	return 0;
 }
 
@@ -1521,42 +1472,54 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
 static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
 {
 	unsigned int cpu = dev->id;
+	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
 	int ret;
 
-	/*
-	 * Only possible if 'cpu' is getting physically removed now. A hotplug
-	 * notifier should have already been called and we just need to remove
-	 * link or free policy here.
-	 */
-	if (cpu_is_offline(cpu)) {
-		struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
-		struct cpumask mask;
+	if (!policy)
+		return 0;
 
-		if (!policy)
-			return 0;
+	if (cpu_online(cpu)) {
+		ret = __cpufreq_remove_dev_prepare(dev, sif);
+		if (!ret)
+			ret = __cpufreq_remove_dev_finish(dev, sif);
+		if (ret)
+			return ret;
+	}
 
-		cpumask_copy(&mask, policy->related_cpus);
-		cpumask_clear_cpu(cpu, &mask);
+	/* sysfs links are removed only on subsys callback */
+	if (cpumask_test_cpu(cpu, policy->symlinks)) {
+		dev_dbg(dev, "%s: Removing symlink for CPU: %u\n", __func__,
+			cpu);
+		cpumask_clear_cpu(cpu, policy->symlinks);
+		sysfs_remove_link(&dev->kobj, "cpufreq");
+		return 0;
+	}
 
+	if (cpumask_weight(policy->symlinks)) {
 		/*
-		 * Free policy only if all policy->related_cpus are removed
-		 * physically.
+		 * Okay, we still have some CPUs left. Transfer the ownership of
+		 * policy to one of them. Would be better to pass that to
+		 * cpumask_last() as that will be the last CPU to get removed,
+		 * but there is no API to get last cpu of the mask. Lets move it
+		 * to the first cpu in the mask.
 		 */
-		if (cpumask_intersects(&mask, cpu_present_mask)) {
-			remove_cpu_dev_symlink(policy, cpu);
-			return 0;
-		}
+		int new_cpu = cpumask_first(policy->symlinks);
+		struct device *new_dev = get_cpu_device(new_cpu);
 
-		cpufreq_policy_free(policy, true);
-		return 0;
-	}
+		dev_dbg(dev, "%s: Migrating kobj from %d to %d\n", __func__,
+			cpu, new_cpu);
 
-	ret = __cpufreq_remove_dev_prepare(dev, sif);
+		cpumask_clear_cpu(new_cpu, policy->symlinks);
+		sysfs_remove_link(&new_dev->kobj, "cpufreq");
 
-	if (!ret)
-		ret = __cpufreq_remove_dev_finish(dev, sif);
+		policy->kobj_cpu = new_cpu;
+		WARN_ON(kobject_move(&policy->kobj, &new_dev->kobj));
+	} else {
+		/* This is the last CPU to be removed */
+		cpufreq_policy_free(policy, true);
+	}
 
-	return ret;
+	return 0;
 }
 
 static void handle_update(struct work_struct *work)
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 29ad97c34fd5..c748d1cd0815 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -62,6 +62,7 @@ struct cpufreq_policy {
 	/* CPUs sharing clock, require sw coordination */
 	cpumask_var_t		cpus;	/* Online CPUs only */
 	cpumask_var_t		related_cpus; /* Online + Offline CPUs */
+	cpumask_var_t		symlinks; /* CPUs for which cpufreq sysfs directory is present */
 
 	unsigned int		shared_type; /* ACPI: ANY or ALL affected CPUs
 						should set cpufreq */
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH V2] cpufreq: Fix double addition of sysfs links
  2015-07-22 12:07 ` [PATCH V2] cpufreq: Fix double addition " Viresh Kumar
@ 2015-07-22 12:44   ` Rafael J. Wysocki
  2015-07-22 13:15   ` Russell King - ARM Linux
  1 sibling, 0 replies; 23+ messages in thread
From: Rafael J. Wysocki @ 2015-07-22 12:44 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Russell King - ARM Linux, Lists linaro-kernel,
	linux-pm, open list

Hi Viresh,

On Wed, Jul 22, 2015 at 2:07 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Consider a dual core (0/1) system with two CPUs:
> - sharing clock/voltage rails and hence cpufreq-policy
> - CPU1 is offline while the cpufreq driver is registered
> - cpufreq_add_dev() is called from subsys callback for CPU0 and we
>   create the policy for the group of CPUs and create links for all
>   present CPUs, i.e. CPU1 as well.
> - cpufreq_add_dev() is called from subsys callback for CPU1, we find
>   that the cpu is offline and we try to create a sysfs link for CPU1.
>
> This results in double addtion of the sysfs link and we will get this:
>
> WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x7c()
> sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc2+ #1704
> Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> Backtrace:
> [<c0013248>] (dump_backtrace) from [<c00133e4>] (show_stack+0x18/0x1c)
>  r6:c01a1f30 r5:0000001f r4:00000000 r3:00000000
> [<c00133cc>] (show_stack) from [<c076920c>] (dump_stack+0x7c/0x98)
> [<c0769190>] (dump_stack) from [<c0029ab4>] (warn_slowpath_common+0x80/0xbc)
>  r4:d74abbd0 r3:d74c0000
> [<c0029a34>] (warn_slowpath_common) from [<c0029b94>] (warn_slowpath_fmt+0x38/0x40)
>  r8:ffffffef r7:00000000 r6:d75a8960 r5:c0993280 r4:d6b4d000
> [<c0029b60>] (warn_slowpath_fmt) from [<c01a1f30>] (sysfs_warn_dup+0x60/0x7c)
>  r3:d6b4dfe7 r2:c0930750
> [<c01a1ed0>] (sysfs_warn_dup) from [<c01a22c8>] (sysfs_do_create_link_sd+0xb8/0xc0)
>  r6:d75a8960 r5:c0993280 r4:d00aba20
> [<c01a2210>] (sysfs_do_create_link_sd) from [<c01a22fc>] (sysfs_create_link+0x2c/0x3c)
>  r10:00000001 r8:c14db3c8 r7:d7b89010 r6:c0ae7c60 r5:d7b89010 r4:d00d1200
> [<c01a22d0>] (sysfs_create_link) from [<c0506160>] (add_cpu_dev_symlink+0x34/0x5c)
> [<c050612c>] (add_cpu_dev_symlink) from [<c05084d0>] (cpufreq_add_dev+0x674/0x794)
>  r5:00000001 r4:00000000
> [<c0507e5c>] (cpufreq_add_dev) from [<c03db114>] (subsys_interface_register+0x8c/0xd0)
>  r10:00000003 r9:d7bb01f0 r8:c14db3c8 r7:00106738 r6:c0ae7c60 r5:c0acbd08
>  r4:c0ae7e20
> [<c03db088>] (subsys_interface_register) from [<c0508a2c>] (cpufreq_register_driver+0x104/0x1f4)
>
> The check for offline-cpu in cpufreq_add_dev() is to ensure that link
> gets added for the CPUs which weren't physically present earlier and
> that misses the case where a CPU is offline while registering the
> driver.
>
> To fix this properly, don't create these links when the policy get
> initialized. Rather wait for individual subsys callback for CPUs to
> add/remove these links. This simplifies most of the code leaving
> cpufreq_remove_dev().
>
> The problem is that, we might remove cpu which was owner of policy->kobj
> in sysfs, before other CPUs are removed. Fix this by the solution we
> have been using until very recently, in which we move the kobject to any
> other CPU, for which remove is yet to be called.
>
> Tested on dual core exynos board with cpufreq-dt driver. The driver was
> compiled as module and inserted/removed multiple times on a running
> kernel.
>
> Fixes: 87549141d516 ("cpufreq: Stop migrating sysfs files on hotplug")
> Reported-and-suggested-by: Russell King <linux@arm.linux.org.uk>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

That looks good to me overall, but please let me rename your new
"symlinks" CPU mask to "dependent_cpus".

> ---
> V1->V2: Completely changed, please review again :)
>
> @Rafael: I didn't review your solution and gave this one because I
> thought Russell suggested the right thing. i.e. don't create links in
> the beginning.

Sure.  I prefer this approach too.

> This is based of 4.2-rc3 and so your other patch,
> https://patchwork.kernel.org/patch/6839031/ has to be rebased over it.

OK

> I didn't rebase this patch over yours for two reasons:
> - Yours wasn't necessarily 4.2 material.

Right.

> - I already mentioned a problem in that patch.

I'm not sure if the problem is really there, but after the changes in
this patch it doesn't really matter. :-)

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V2] cpufreq: Fix double addition of sysfs links
  2015-07-22 12:07 ` [PATCH V2] cpufreq: Fix double addition " Viresh Kumar
  2015-07-22 12:44   ` Rafael J. Wysocki
@ 2015-07-22 13:15   ` Russell King - ARM Linux
  2015-07-22 16:42     ` Rafael J. Wysocki
  2015-07-23  5:54     ` Viresh Kumar
  1 sibling, 2 replies; 23+ messages in thread
From: Russell King - ARM Linux @ 2015-07-22 13:15 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Rafael Wysocki, linaro-kernel, linux-pm, open list

On Wed, Jul 22, 2015 at 05:37:18PM +0530, Viresh Kumar wrote:
> Consider a dual core (0/1) system with two CPUs:
> - sharing clock/voltage rails and hence cpufreq-policy
> - CPU1 is offline while the cpufreq driver is registered
> - cpufreq_add_dev() is called from subsys callback for CPU0 and we
>   create the policy for the group of CPUs and create links for all
>   present CPUs, i.e. CPU1 as well.
> - cpufreq_add_dev() is called from subsys callback for CPU1, we find
>   that the cpu is offline and we try to create a sysfs link for CPU1.
> 
> This results in double addtion of the sysfs link and we will get this:
> 
> WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x7c()
> sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc2+ #1704
> Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> Backtrace:
> [<c0013248>] (dump_backtrace) from [<c00133e4>] (show_stack+0x18/0x1c)
>  r6:c01a1f30 r5:0000001f r4:00000000 r3:00000000
> [<c00133cc>] (show_stack) from [<c076920c>] (dump_stack+0x7c/0x98)
> [<c0769190>] (dump_stack) from [<c0029ab4>] (warn_slowpath_common+0x80/0xbc)
>  r4:d74abbd0 r3:d74c0000
> [<c0029a34>] (warn_slowpath_common) from [<c0029b94>] (warn_slowpath_fmt+0x38/0x40)
>  r8:ffffffef r7:00000000 r6:d75a8960 r5:c0993280 r4:d6b4d000
> [<c0029b60>] (warn_slowpath_fmt) from [<c01a1f30>] (sysfs_warn_dup+0x60/0x7c)
>  r3:d6b4dfe7 r2:c0930750
> [<c01a1ed0>] (sysfs_warn_dup) from [<c01a22c8>] (sysfs_do_create_link_sd+0xb8/0xc0)
>  r6:d75a8960 r5:c0993280 r4:d00aba20
> [<c01a2210>] (sysfs_do_create_link_sd) from [<c01a22fc>] (sysfs_create_link+0x2c/0x3c)
>  r10:00000001 r8:c14db3c8 r7:d7b89010 r6:c0ae7c60 r5:d7b89010 r4:d00d1200
> [<c01a22d0>] (sysfs_create_link) from [<c0506160>] (add_cpu_dev_symlink+0x34/0x5c)
> [<c050612c>] (add_cpu_dev_symlink) from [<c05084d0>] (cpufreq_add_dev+0x674/0x794)
>  r5:00000001 r4:00000000
> [<c0507e5c>] (cpufreq_add_dev) from [<c03db114>] (subsys_interface_register+0x8c/0xd0)
>  r10:00000003 r9:d7bb01f0 r8:c14db3c8 r7:00106738 r6:c0ae7c60 r5:c0acbd08
>  r4:c0ae7e20
> [<c03db088>] (subsys_interface_register) from [<c0508a2c>] (cpufreq_register_driver+0x104/0x1f4)
> 
> The check for offline-cpu in cpufreq_add_dev() is to ensure that link
> gets added for the CPUs which weren't physically present earlier and
> that misses the case where a CPU is offline while registering the
> driver.
> 
> To fix this properly, don't create these links when the policy get
> initialized. Rather wait for individual subsys callback for CPUs to
> add/remove these links. This simplifies most of the code leaving
> cpufreq_remove_dev().
> 
> The problem is that, we might remove cpu which was owner of policy->kobj
> in sysfs, before other CPUs are removed. Fix this by the solution we
> have been using until very recently, in which we move the kobject to any
> other CPU, for which remove is yet to be called.
> 
> Tested on dual core exynos board with cpufreq-dt driver. The driver was
> compiled as module and inserted/removed multiple times on a running
> kernel.
> 
> Fixes: 87549141d516 ("cpufreq: Stop migrating sysfs files on hotplug")
> Reported-and-suggested-by: Russell King <linux@arm.linux.org.uk>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> V1->V2: Completely changed, please review again :)
> 
> @Rafael: I didn't review your solution and gave this one because I
> thought Russell suggested the right thing. i.e. don't create links in
> the beginning.
> 
> This is based of 4.2-rc3 and so your other patch,
> https://patchwork.kernel.org/patch/6839031/ has to be rebased over it.
> 
> I didn't rebase this patch over yours for two reasons:
> - Yours wasn't necessarily 4.2 material.
> - I already mentioned a problem in that patch.
> 
> @Russell: I hope this will look much better than V1 to you. Please give
> it a try once you get some time.
> 
>  drivers/cpufreq/cpufreq.c | 165 ++++++++++++++++++----------------------------
>  include/linux/cpufreq.h   |   1 +
>  2 files changed, 65 insertions(+), 101 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 26063afb3eba..81c2417e52f4 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -966,67 +966,6 @@ void cpufreq_sysfs_remove_file(const struct attribute *attr)
>  }
>  EXPORT_SYMBOL(cpufreq_sysfs_remove_file);
>  
> -static int add_cpu_dev_symlink(struct cpufreq_policy *policy, int cpu)
> -{
> -	struct device *cpu_dev;
> -
> -	pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);
> -
> -	if (!policy)
> -		return 0;
> -
> -	cpu_dev = get_cpu_device(cpu);
> -	if (WARN_ON(!cpu_dev))
> -		return 0;
> -
> -	return sysfs_create_link(&cpu_dev->kobj, &policy->kobj, "cpufreq");
> -}
> -
> -static void remove_cpu_dev_symlink(struct cpufreq_policy *policy, int cpu)
> -{
> -	struct device *cpu_dev;
> -
> -	pr_debug("%s: Removing symlink for CPU: %u\n", __func__, cpu);
> -
> -	cpu_dev = get_cpu_device(cpu);
> -	if (WARN_ON(!cpu_dev))
> -		return;
> -
> -	sysfs_remove_link(&cpu_dev->kobj, "cpufreq");
> -}
> -
> -/* Add/remove symlinks for all related CPUs */
> -static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy)
> -{
> -	unsigned int j;
> -	int ret = 0;
> -
> -	/* Some related CPUs might not be present (physically hotplugged) */
> -	for_each_cpu_and(j, policy->related_cpus, cpu_present_mask) {
> -		if (j == policy->kobj_cpu)
> -			continue;
> -
> -		ret = add_cpu_dev_symlink(policy, j);
> -		if (ret)
> -			break;
> -	}
> -
> -	return ret;
> -}
> -
> -static void cpufreq_remove_dev_symlink(struct cpufreq_policy *policy)
> -{
> -	unsigned int j;
> -
> -	/* Some related CPUs might not be present (physically hotplugged) */
> -	for_each_cpu_and(j, policy->related_cpus, cpu_present_mask) {
> -		if (j == policy->kobj_cpu)
> -			continue;
> -
> -		remove_cpu_dev_symlink(policy, j);
> -	}
> -}
> -
>  static int cpufreq_add_dev_interface(struct cpufreq_policy *policy,
>  				     struct device *dev)
>  {
> @@ -1057,7 +996,7 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy,
>  			return ret;
>  	}
>  
> -	return cpufreq_add_dev_symlink(policy);
> +	return 0;
>  }
>  
>  static void cpufreq_init_policy(struct cpufreq_policy *policy)
> @@ -1163,11 +1102,14 @@ static struct cpufreq_policy *cpufreq_policy_alloc(struct device *dev)
>  	if (!zalloc_cpumask_var(&policy->related_cpus, GFP_KERNEL))
>  		goto err_free_cpumask;
>  
> +	if (!zalloc_cpumask_var(&policy->symlinks, GFP_KERNEL))
> +		goto err_free_related_cpumask;
> +
>  	ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq, &dev->kobj,
>  				   "cpufreq");
>  	if (ret) {
>  		pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
> -		goto err_free_rcpumask;
> +		goto err_free_symlink_cpumask;
>  	}
>  
>  	INIT_LIST_HEAD(&policy->policy_list);
> @@ -1184,7 +1126,9 @@ static struct cpufreq_policy *cpufreq_policy_alloc(struct device *dev)
>  
>  	return policy;
>  
> -err_free_rcpumask:
> +err_free_symlink_cpumask:
> +	free_cpumask_var(policy->symlinks);
> +err_free_related_cpumask:
>  	free_cpumask_var(policy->related_cpus);
>  err_free_cpumask:
>  	free_cpumask_var(policy->cpus);
> @@ -1204,7 +1148,6 @@ static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy, bool notify)
>  					     CPUFREQ_REMOVE_POLICY, policy);
>  
>  	down_write(&policy->rwsem);
> -	cpufreq_remove_dev_symlink(policy);
>  	kobj = &policy->kobj;
>  	cmp = &policy->kobj_unregister;
>  	up_write(&policy->rwsem);
> @@ -1234,6 +1177,7 @@ static void cpufreq_policy_free(struct cpufreq_policy *policy, bool notify)
>  	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
>  
>  	cpufreq_policy_put_kobj(policy, notify);
> +	free_cpumask_var(policy->symlinks);
>  	free_cpumask_var(policy->related_cpus);
>  	free_cpumask_var(policy->cpus);
>  	kfree(policy);
> @@ -1252,26 +1196,37 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
>  {
>  	unsigned int j, cpu = dev->id;
>  	int ret = -ENOMEM;
> -	struct cpufreq_policy *policy;
> +	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
>  	unsigned long flags;
>  	bool recover_policy = !sif;
>  
>  	pr_debug("adding CPU %u\n", cpu);
>  
> +	/* sysfs links are only created on subsys callback */
> +	if (sif && policy) {
> +		pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);

dev_dbg() ?

> +		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> +		if (ret) {
> +			dev_err(dev, "%s: Failed to create link for cpu %d (%d)\n",
> +				__func__, cpu, ret);

I wonder why we print the CPU number - since it's from dev->id, isn't it
included in the struct device name printed by dev_err() already?

> +			return ret;
> +		}
> +
> +		/* Track CPUs for which sysfs links are created */
> +		cpumask_set_cpu(cpu, policy->symlinks);
> +	}
> +

I guess this will do for -rc, but it's not particularly nice.  Can I
suggest splitting the two operations here - the add_dev callback from
the subsys interface, and the handling of hotplug online/offline
notifications.

You only need to do the above for the subsys interface, and you only
need to do the remainder if the CPU was online.

Also, what about the CPU "owning" the policy?

So, this would become:

static int cpufreq_cpu_online(struct device *dev)
{
	pr_debug("bringing CPU%d online\n", dev->id);
	... stuff to do when CPU is online ...
}

static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
{
	unsigned int cpu = dev->id;
	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);

	pr_debug("adding CPU %u\n", cpu);

	if (policy && policy->kobj_cpu != cpu) {
		dev_dbg(dev, "%s: Adding cpufreq symlink\n", __func__);
		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
		if (ret) {
			dev_err(dev,
				"%s: Failed to create cpufreq symlink (%d)\n",
				__func__, ret);
			return ret;
		}

		/* Track CPUs for which sysfs links are created */
		cpumask_set_cpu(cpu, policy->symlinks);
	}

	/* Now do the remainder if the CPU is already online */
	if (cpu_online(cpu))
		return cpufreq_cpu_online(dev);

	return 0;
}

Next, change the cpufreq_add_dev(dev, NULL) in the hotplug notifier call
to cpufreq_cpu_online(dev) instead.

Doing the similar thing for the cpufreq_remove_dev() path would also make
sense.

>  	/*
> -	 * Only possible if 'cpu' wasn't physically present earlier and we are
> -	 * here from subsys_interface add callback. A hotplug notifier will
> -	 * follow and we will handle it like logical CPU hotplug then. For now,
> -	 * just create the sysfs link.
> +	 * A hotplug notifier will follow and we will take care of rest
> +	 * of the initialization then.
>  	 */
>  	if (cpu_is_offline(cpu))
> -		return add_cpu_dev_symlink(per_cpu(cpufreq_cpu_data, cpu), cpu);
> +		return 0;
>  
>  	if (!down_read_trylock(&cpufreq_rwsem))
>  		return 0;
>  
>  	/* Check if this CPU already has a policy to manage it */
> -	policy = per_cpu(cpufreq_cpu_data, cpu);
>  	if (policy && !policy_is_inactive(policy)) {
>  		WARN_ON(!cpumask_test_cpu(cpu, policy->related_cpus));
>  		ret = cpufreq_add_policy_cpu(policy, cpu, dev);
> @@ -1506,10 +1461,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>  	if (cpufreq_driver->exit)
>  		cpufreq_driver->exit(policy);
>  
> -	/* Free the policy only if the driver is getting removed. */
> -	if (sif)
> -		cpufreq_policy_free(policy, true);
> -
>  	return 0;
>  }
>  
> @@ -1521,42 +1472,54 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>  static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
>  {
>  	unsigned int cpu = dev->id;
> +	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
>  	int ret;
>  
> -	/*
> -	 * Only possible if 'cpu' is getting physically removed now. A hotplug
> -	 * notifier should have already been called and we just need to remove
> -	 * link or free policy here.
> -	 */
> -	if (cpu_is_offline(cpu)) {
> -		struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
> -		struct cpumask mask;
> +	if (!policy)
> +		return 0;
>  
> -		if (!policy)
> -			return 0;
> +	if (cpu_online(cpu)) {
> +		ret = __cpufreq_remove_dev_prepare(dev, sif);
> +		if (!ret)
> +			ret = __cpufreq_remove_dev_finish(dev, sif);
> +		if (ret)
> +			return ret;

Here, I have to wonder about this.  If you look at the code in
drivers/base/bus.c, you'll notice that the ->remove_dev return code is
not used (personally, I hate interfaces which are created with an int
return type for a removal operation, but then ignore the return code.
Either have the return code and use it, or don't confuse driver authors
by having one.)

What this means is that in the remove path, the device _is_ going away,
whether you like it or not.  So, if you have an error early in your
remove path, returning that error does you no good - you end up leaking
memory because subsequent cleanup doesn't get done.

It's better to either ensure that your removal path can't fail, or if it
can, to reasonably clean up as much as you can (which here, means
continuing to remove the symlink.)

If you adopt my suggestion above, then cpufreq_remove_dev() becomes
something like:

static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
{
	unsigned int cpu = dev->id;
	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);

	if (cpu_is_online(cpu))
		cpufreq_cpu_offline(dev);

	if (policy) {
		if (cpumask_test_cpu(cpu, policy->symlinks)) {
			dev_dbg(dev, "%s: Removing cpufreq symlink\n",
				__func__);
			cpumask_clear_cpu(cpu, policy->symlinks);
			sysfs_remove_link(&dev->kobj, "cpufreq");
		}

		if (policy->kobj_cpu == cpu) {
			... migration code and final CPU deletion code ...
		}
	}

	return 0;
}

which IMHO is easier to read and follow, and more symetrical with
cpufreq_add_dev().

Now, I'm left wondering about a few things:

1. whether having a CPU "own" the policy, and having the cpufreq/ directory
   beneath the cpuN node is a good idea, or whether it would be better to
   place this in the /sys/devices/system/cpufreq/ subdirectory and always
   symlink to there.  It strikes me that would simplify the code a little.

2. whether using a kref to track the usage of the policy would be better
   than tracking symlink weight (or in the case of (1) being adopted,
   whether the symlink cpumask becomes empty.)  Note that the symlink
   weight becoming zero without (1) (in other words, no symlinks) is not
   the correct condition for freeing the policy - we still have one CPU,
   that being the CPU for policy->kobj_cpu.

3. what happens when 'policy' is NULL at the point when the first (few) CPUs
   are added - how do the symlinks get created later if/when policy becomes
   non-NULL (can it?)

4. what about policy->related_cpus ?  What if one of the CPUs being added is
   not in policy->related_cpus?  Should we still go ahead and create the
   symlink?

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V2] cpufreq: Fix double addition of sysfs links
  2015-07-22 13:15   ` Russell King - ARM Linux
@ 2015-07-22 16:42     ` Rafael J. Wysocki
  2015-07-23  6:09       ` Viresh Kumar
  2015-07-23  5:54     ` Viresh Kumar
  1 sibling, 1 reply; 23+ messages in thread
From: Rafael J. Wysocki @ 2015-07-22 16:42 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Viresh Kumar, Rafael Wysocki, Lists linaro-kernel, linux-pm, open list

Hi Russell,

On Wed, Jul 22, 2015 at 3:15 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Jul 22, 2015 at 05:37:18PM +0530, Viresh Kumar wrote:

[cut]

>> @@ -1252,26 +1196,37 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
>>  {
>>       unsigned int j, cpu = dev->id;
>>       int ret = -ENOMEM;
>> -     struct cpufreq_policy *policy;
>> +     struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
>>       unsigned long flags;
>>       bool recover_policy = !sif;
>>
>>       pr_debug("adding CPU %u\n", cpu);
>>
>> +     /* sysfs links are only created on subsys callback */
>> +     if (sif && policy) {
>> +             pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);
>
> dev_dbg() ?

Right.

>> +             ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>> +             if (ret) {
>> +                     dev_err(dev, "%s: Failed to create link for cpu %d (%d)\n",
>> +                             __func__, cpu, ret);
>
> I wonder why we print the CPU number - since it's from dev->id, isn't it
> included in the struct device name printed by dev_err() already?

It is AFAICS.

>> +                     return ret;
>> +             }
>> +
>> +             /* Track CPUs for which sysfs links are created */
>> +             cpumask_set_cpu(cpu, policy->symlinks);
>> +     }
>> +
>
> I guess this will do for -rc, but it's not particularly nice.

Right.

This is what I'm queuing up for -rc:
http://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git/commit/?h=bleeding-edge&id=9b07109f06a1edd6e636b1e7397157eae0e6baa4

I've made a few other minor changes (discussed with Viresh on IRC) to it.

> Can I suggest splitting the two operations here - the add_dev callback from
> the subsys interface, and the handling of hotplug online/offline
> notifications.
>
> You only need to do the above for the subsys interface, and you only
> need to do the remainder if the CPU was online.
>
> Also, what about the CPU "owning" the policy?
>
> So, this would become:
>
> static int cpufreq_cpu_online(struct device *dev)
> {
>         pr_debug("bringing CPU%d online\n", dev->id);
>         ... stuff to do when CPU is online ...
> }
>
> static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
> {
>         unsigned int cpu = dev->id;
>         struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
>
>         pr_debug("adding CPU %u\n", cpu);
>
>         if (policy && policy->kobj_cpu != cpu) {
>                 dev_dbg(dev, "%s: Adding cpufreq symlink\n", __func__);
>                 ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>                 if (ret) {
>                         dev_err(dev,
>                                 "%s: Failed to create cpufreq symlink (%d)\n",
>                                 __func__, ret);
>                         return ret;
>                 }
>
>                 /* Track CPUs for which sysfs links are created */
>                 cpumask_set_cpu(cpu, policy->symlinks);
>         }
>
>         /* Now do the remainder if the CPU is already online */
>         if (cpu_online(cpu))
>                 return cpufreq_cpu_online(dev);
>
>         return 0;
> }
>
> Next, change the cpufreq_add_dev(dev, NULL) in the hotplug notifier call
> to cpufreq_cpu_online(dev) instead.

These are good suggestions.  I actually have a plan to do that cleanup
on top of the VIresh's patch.

> Doing the similar thing for the cpufreq_remove_dev() path would also make
> sense.

cpufreq_remove_dev() is only called from bus.c already, but it also
has to handle the driver removal case.

And I already have a patch to drop the "sif" argument from
__cpufreq_remove_dev_prepare/finish() that are called on CPU offline.

>>       /*
>> -      * Only possible if 'cpu' wasn't physically present earlier and we are
>> -      * here from subsys_interface add callback. A hotplug notifier will
>> -      * follow and we will handle it like logical CPU hotplug then. For now,
>> -      * just create the sysfs link.
>> +      * A hotplug notifier will follow and we will take care of rest
>> +      * of the initialization then.
>>        */
>>       if (cpu_is_offline(cpu))
>> -             return add_cpu_dev_symlink(per_cpu(cpufreq_cpu_data, cpu), cpu);
>> +             return 0;
>>
>>       if (!down_read_trylock(&cpufreq_rwsem))
>>               return 0;
>>
>>       /* Check if this CPU already has a policy to manage it */
>> -     policy = per_cpu(cpufreq_cpu_data, cpu);
>>       if (policy && !policy_is_inactive(policy)) {
>>               WARN_ON(!cpumask_test_cpu(cpu, policy->related_cpus));
>>               ret = cpufreq_add_policy_cpu(policy, cpu, dev);
>> @@ -1506,10 +1461,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>>       if (cpufreq_driver->exit)
>>               cpufreq_driver->exit(policy);
>>
>> -     /* Free the policy only if the driver is getting removed. */
>> -     if (sif)
>> -             cpufreq_policy_free(policy, true);
>> -
>>       return 0;
>>  }
>>
>> @@ -1521,42 +1472,54 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>>  static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
>>  {
>>       unsigned int cpu = dev->id;
>> +     struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
>>       int ret;
>>
>> -     /*
>> -      * Only possible if 'cpu' is getting physically removed now. A hotplug
>> -      * notifier should have already been called and we just need to remove
>> -      * link or free policy here.
>> -      */
>> -     if (cpu_is_offline(cpu)) {
>> -             struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
>> -             struct cpumask mask;
>> +     if (!policy)
>> +             return 0;
>>
>> -             if (!policy)
>> -                     return 0;
>> +     if (cpu_online(cpu)) {
>> +             ret = __cpufreq_remove_dev_prepare(dev, sif);
>> +             if (!ret)
>> +                     ret = __cpufreq_remove_dev_finish(dev, sif);
>> +             if (ret)
>> +                     return ret;
>
> Here, I have to wonder about this.  If you look at the code in
> drivers/base/bus.c, you'll notice that the ->remove_dev return code is
> not used (personally, I hate interfaces which are created with an int
> return type for a removal operation, but then ignore the return code.
> Either have the return code and use it, or don't confuse driver authors
> by having one.)
>
> What this means is that in the remove path, the device _is_ going away,
> whether you like it or not.  So, if you have an error early in your
> remove path, returning that error does you no good - you end up leaking
> memory because subsequent cleanup doesn't get done.

Right.

> It's better to either ensure that your removal path can't fail, or if it
> can, to reasonably clean up as much as you can (which here, means
> continuing to remove the symlink.)
>
> If you adopt my suggestion above, then cpufreq_remove_dev() becomes
> something like:
>
> static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
> {
>         unsigned int cpu = dev->id;
>         struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
>
>         if (cpu_is_online(cpu))
>                 cpufreq_cpu_offline(dev);
>
>         if (policy) {
>                 if (cpumask_test_cpu(cpu, policy->symlinks)) {
>                         dev_dbg(dev, "%s: Removing cpufreq symlink\n",
>                                 __func__);
>                         cpumask_clear_cpu(cpu, policy->symlinks);
>                         sysfs_remove_link(&dev->kobj, "cpufreq");
>                 }
>
>                 if (policy->kobj_cpu == cpu) {
>                         ... migration code and final CPU deletion code ...
>                 }
>         }
>
>         return 0;
> }
>
> which IMHO is easier to read and follow, and more symetrical with
> cpufreq_add_dev().

Agreed (in general).

> Now, I'm left wondering about a few things:
>
> 1. whether having a CPU "own" the policy, and having the cpufreq/ directory
>    beneath the cpuN node is a good idea, or whether it would be better to
>    place this in the /sys/devices/system/cpufreq/ subdirectory and always
>    symlink to there.  It strikes me that would simplify the code a little.

That is a good idea IMO.  A small complication here is that there may
be multiple policies in the system and a kobject is needed for each of
them.

> 2. whether using a kref to track the usage of the policy would be better
>    than tracking symlink weight (or in the case of (1) being adopted,
>    whether the symlink cpumask becomes empty.)  Note that the symlink
>    weight becoming zero without (1) (in other words, no symlinks) is not
>    the correct condition for freeing the policy - we still have one CPU,
>    that being the CPU for policy->kobj_cpu.

Well, if we do (1), it certainly would be more straightforward to use
krefs for that.

> 3. what happens when 'policy' is NULL at the point when the first (few) CPUs
>    are added - how do the symlinks get created later if/when policy becomes
>    non-NULL (can it?)

Yes, it can, and we have a design issue here that bothers me a bit.
Namley, we need a driver's ->init callback to populate policy->cpus
for us, but this is not the only thing it is doing, so the concern is
that it may not be able to deal with CPUs that aren't online.

I was thinking about an additional driver callback that would *only*
populate a mask of CPUs that should use the same policy as the given
one.  We'd be able to call that from cpufreq_add_dev() for offline
CPUs too and this way the policy object could be created for the first
CPU using the policy that is registered instead of being added for the
first CPU using that policy that becomes online (which happens today).

> 4. what about policy->related_cpus ?  What if one of the CPUs being added is
>    not in policy->related_cpus?

It will need a new policy.

>  Should we still go ahead and create the  symlink?

There's nothing to link to then.  The policy object will be created
when that CPU becomes online (as per the above).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V2] cpufreq: Fix double addition of sysfs links
  2015-07-22 13:15   ` Russell King - ARM Linux
  2015-07-22 16:42     ` Rafael J. Wysocki
@ 2015-07-23  5:54     ` Viresh Kumar
  1 sibling, 0 replies; 23+ messages in thread
From: Viresh Kumar @ 2015-07-23  5:54 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Rafael Wysocki, linaro-kernel, linux-pm, open list

On 22-07-15, 14:15, Russell King - ARM Linux wrote:
> > +	/* sysfs links are only created on subsys callback */
> > +	if (sif && policy) {
> > +		pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);
> 
> dev_dbg() ?

Hmm, right.

> > +		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> > +		if (ret) {
> > +			dev_err(dev, "%s: Failed to create link for cpu %d (%d)\n",

Rafael updated this instead with dev_dbg :), I am sending separate
patches to fix that now.

> > +				__func__, cpu, ret);
> 
> I wonder why we print the CPU number - since it's from dev->id, isn't it
> included in the struct device name printed by dev_err() already?

:(

> > +			return ret;
> > +		}
> > +
> > +		/* Track CPUs for which sysfs links are created */
> > +		cpumask_set_cpu(cpu, policy->symlinks);
> > +	}
> > +
> 
> I guess this will do for -rc, but it's not particularly nice.  Can I
> suggest splitting the two operations here - the add_dev callback from
> the subsys interface, and the handling of hotplug online/offline
> notifications.
> 
> You only need to do the above for the subsys interface, and you only
> need to do the remainder if the CPU was online.
> 
> Also, what about the CPU "owning" the policy?
> 
> So, this would become:
> 
> static int cpufreq_cpu_online(struct device *dev)
> {
> 	pr_debug("bringing CPU%d online\n", dev->id);
> 	... stuff to do when CPU is online ...
> }
> 
> static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
> {
> 	unsigned int cpu = dev->id;
> 	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
> 
> 	pr_debug("adding CPU %u\n", cpu);
> 
> 	if (policy && policy->kobj_cpu != cpu) {
> 		dev_dbg(dev, "%s: Adding cpufreq symlink\n", __func__);
> 		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> 		if (ret) {
> 			dev_err(dev,
> 				"%s: Failed to create cpufreq symlink (%d)\n",
> 				__func__, ret);
> 			return ret;
> 		}
> 
> 		/* Track CPUs for which sysfs links are created */
> 		cpumask_set_cpu(cpu, policy->symlinks);
> 	}
> 
> 	/* Now do the remainder if the CPU is already online */
> 	if (cpu_online(cpu))
> 		return cpufreq_cpu_online(dev);
> 
> 	return 0;
> }
> 
> Next, change the cpufreq_add_dev(dev, NULL) in the hotplug notifier call
> to cpufreq_cpu_online(dev) instead.
> 
> Doing the similar thing for the cpufreq_remove_dev() path would also make
> sense.

Hmmm, Looks better ofcourse.

> > @@ -1521,42 +1472,54 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
> >  static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
> >  {
> >  	unsigned int cpu = dev->id;
> > +	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
> >  	int ret;
> >  
> > -	/*
> > -	 * Only possible if 'cpu' is getting physically removed now. A hotplug
> > -	 * notifier should have already been called and we just need to remove
> > -	 * link or free policy here.
> > -	 */
> > -	if (cpu_is_offline(cpu)) {
> > -		struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
> > -		struct cpumask mask;
> > +	if (!policy)
> > +		return 0;
> >  
> > -		if (!policy)
> > -			return 0;
> > +	if (cpu_online(cpu)) {
> > +		ret = __cpufreq_remove_dev_prepare(dev, sif);
> > +		if (!ret)
> > +			ret = __cpufreq_remove_dev_finish(dev, sif);
> > +		if (ret)
> > +			return ret;
> 
> Here, I have to wonder about this.  If you look at the code in
> drivers/base/bus.c, you'll notice that the ->remove_dev return code is
> not used

Its not even using the return type of ->add_dev :), I have send an
update for that recently as that was required for cpufreq-drivers.
Greg must be applying that for 4.3 I hope :)

> (personally, I hate interfaces which are created with an int
> return type for a removal operation, but then ignore the return code.
> Either have the return code and use it, or don't confuse driver authors
> by having one.)

+1

> What this means is that in the remove path, the device _is_ going away,
> whether you like it or not.  So, if you have an error early in your
> remove path, returning that error does you no good - you end up leaking
> memory because subsequent cleanup doesn't get done.
> 
> It's better to either ensure that your removal path can't fail, or if it
> can, to reasonably clean up as much as you can (which here, means
> continuing to remove the symlink.)
> 
> If you adopt my suggestion above, then cpufreq_remove_dev() becomes
> something like:
> 
> static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
> {
> 	unsigned int cpu = dev->id;
> 	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
> 
> 	if (cpu_is_online(cpu))
> 		cpufreq_cpu_offline(dev);
> 
> 	if (policy) {
> 		if (cpumask_test_cpu(cpu, policy->symlinks)) {
> 			dev_dbg(dev, "%s: Removing cpufreq symlink\n",
> 				__func__);
> 			cpumask_clear_cpu(cpu, policy->symlinks);
> 			sysfs_remove_link(&dev->kobj, "cpufreq");
> 		}
> 
> 		if (policy->kobj_cpu == cpu) {
> 			... migration code and final CPU deletion code ...
> 		}
> 	}
> 
> 	return 0;
> }
> 
> which IMHO is easier to read and follow, and more symetrical with
> cpufreq_add_dev().

Ack.

> Now, I'm left wondering about a few things:
> 
> 1. whether having a CPU "own" the policy, and having the cpufreq/ directory
>    beneath the cpuN node is a good idea, or whether it would be better to
>    place this in the /sys/devices/system/cpufreq/ subdirectory and always
>    symlink to there.  It strikes me that would simplify the code a little.

Hmm, but there can be multiple policies in a system and that would
surely confuse people.

> 2. whether using a kref to track the usage of the policy would be better
>    than tracking symlink weight (or in the case of (1) being adopted,
>    whether the symlink cpumask becomes empty.)


>    Note that the symlink
>    weight becoming zero without (1) (in other words, no symlinks) is not
>    the correct condition for freeing the policy - we still have one CPU,
>    that being the CPU for policy->kobj_cpu.

But that's the cpu which is getting removed now, so it was really the
last cpu and we can free the policy.

> 3. what happens when 'policy' is NULL at the point when the first (few) CPUs
>    are added

The first CPU that comes up has to create the policy.

>    - how do the symlinks get created later if/when policy becomes
>    non-NULL (can it?)

It can't.

> 4. what about policy->related_cpus ?  What if one of the CPUs being added is
>    not in policy->related_cpus?  Should we still go ahead and create the
>    symlink?

Let me explain a bit around how policy are managed, you might already
know this but I got a bit confused by your question.

Consider a octa-core big LITTLE platform. All big core share
clock/voltage rails and all LITTLE too..

The system will have two policies:
- big: This will manage four CPUs (0-3)
  - policy->related_cpus = 0 1 2 3
  - policy->cpus = all online CPUs from 0-3
- LITTLE: This will manage four CPUs (4-7)
  - policy->related_cpus = 4 5 6 7
  - policy->cpus = all online CPUs from 4-7

So if a CPU (say 5) doesn't find a place in big cluster's
policy->related_cpus, then it must belong to a different policy.

Does that clear your query? Or did I completely miss your concern ?

-- 
viresh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V2] cpufreq: Fix double addition of sysfs links
  2015-07-22 16:42     ` Rafael J. Wysocki
@ 2015-07-23  6:09       ` Viresh Kumar
  2015-07-23  8:13         ` [PATCH 1/3] cpufreq: print error messages with dev_err() Viresh Kumar
  2015-07-23 17:22         ` [PATCH V2] cpufreq: Fix double addition of sysfs links Rafael J. Wysocki
  0 siblings, 2 replies; 23+ messages in thread
From: Viresh Kumar @ 2015-07-23  6:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Russell King - ARM Linux, Rafael Wysocki, Lists linaro-kernel,
	linux-pm, open list

On 22-07-15, 18:42, Rafael J. Wysocki wrote:
> > 3. what happens when 'policy' is NULL at the point when the first (few) CPUs
> >    are added - how do the symlinks get created later if/when policy becomes
> >    non-NULL (can it?)
> 
> Yes, it can, and we have a design issue here that bothers me a bit.

I replied to Russell with a NO here as the first CPU should have
created the policy. BUT...

> Namley, we need a driver's ->init callback to populate policy->cpus
> for us, but this is not the only thing it is doing, so the concern is
> that it may not be able to deal with CPUs that aren't online.

... the first few CPUs could have been offline and so we might not
have tried to add the policy at all.. Need to fix that for sure.

> I was thinking about an additional driver callback that would *only*
> populate a mask of CPUs that should use the same policy as the given
> one.

Why so ? Drivers today are required to set policy->cpus with all CPUs
that should be managed by that policy. i.e. all online+offline. So,
actually ->init() fills policy->cpus with the value of
policy->related_cpus.

Yes, I thought earlier to change that by setting policy->related_cpus
from drivers, instead of policy->cpus and wasn't sure if I should do
that :)

> We'd be able to call that from cpufreq_add_dev() for offline
> CPUs too and this way the policy object could be created for the first
> CPU using the policy that is registered instead of being added for the
> first CPU using that policy that becomes online (which happens today).

Creating policy for offline CPUs doesn't look that great to me.

What we can do to fix the problem in hand, is to update a global mask
of CPUs (with policy == NULL) which were offline when
cpufreq_add_dev() was called for them. And when we create the policy,
we can add links for all such CPUs.

-- 
viresh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/3] cpufreq: print error messages with dev_err()
  2015-07-23  6:09       ` Viresh Kumar
@ 2015-07-23  8:13         ` Viresh Kumar
  2015-07-23  8:13           ` [PATCH 2/3] cpufreq: Create links for offline CPUs that got added earlier Viresh Kumar
  2015-07-23  8:13           ` [PATCH 3/3] cpufreq: use cpumask_test_and_clear_cpu() instead of separate routines Viresh Kumar
  2015-07-23 17:22         ` [PATCH V2] cpufreq: Fix double addition of sysfs links Rafael J. Wysocki
  1 sibling, 2 replies; 23+ messages in thread
From: Viresh Kumar @ 2015-07-23  8:13 UTC (permalink / raw)
  To: Rafael Wysocki; +Cc: linaro-kernel, linux-pm, linux, Viresh Kumar, open list

By mistake dev_err was replaced by dev_dbg in a recent patch, fix that.

Fixes: 9b07109f06a1 ("cpufreq: Fix double addition of sysfs links")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index fa718644f1ee..84504ae3fb38 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1173,7 +1173,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 		pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);
 		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
 		if (ret) {
-			dev_dbg(dev, "%s: Failed to create link (%d)\n",
+			dev_err(dev, "%s: Failed to create link (%d)\n",
 				__func__, ret);
 			return ret;
 		}
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/3] cpufreq: Create links for offline CPUs that got added earlier
  2015-07-23  8:13         ` [PATCH 1/3] cpufreq: print error messages with dev_err() Viresh Kumar
@ 2015-07-23  8:13           ` Viresh Kumar
  2015-07-23 19:28             ` Rafael J. Wysocki
  2015-07-23  8:13           ` [PATCH 3/3] cpufreq: use cpumask_test_and_clear_cpu() instead of separate routines Viresh Kumar
  1 sibling, 1 reply; 23+ messages in thread
From: Viresh Kumar @ 2015-07-23  8:13 UTC (permalink / raw)
  To: Rafael Wysocki; +Cc: linaro-kernel, linux-pm, linux, Viresh Kumar, open list

If subsys callback ->add_dev() is called for an offline CPU, before its
policy is allocated, we will miss adding its sysfs symlink.

Fix this by tracking such CPUs in a separate mask.

Fixes: 9b07109f06a1 ("cpufreq: Fix double addition of sysfs links")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 74 +++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 62 insertions(+), 12 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 84504ae3fb38..d01cad993fa7 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -31,6 +31,12 @@
 #include <linux/tick.h>
 #include <trace/events/power.h>
 
+/*
+ * CPUs that were offline when a request to allocate policy was issued, symlinks
+ * for them should be created once the policy is available for them.
+ */
+cpumask_t linked_cpus_pending;
+
 static LIST_HEAD(cpufreq_policy_list);
 
 static inline bool policy_is_inactive(struct cpufreq_policy *policy)
@@ -938,6 +944,47 @@ void cpufreq_sysfs_remove_file(const struct attribute *attr)
 }
 EXPORT_SYMBOL(cpufreq_sysfs_remove_file);
 
+static int cpufreq_add_symlink(struct cpufreq_policy *policy,
+			       struct device *dev)
+{
+	int ret, cpu = dev->id;
+
+	dev_dbg(dev, "%s: Adding symlink for CPU: %u\n", __func__, cpu);
+	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
+	if (ret) {
+		dev_err(dev, "%s: Failed to create link (%d)\n", __func__, ret);
+		return ret;
+	}
+
+	/* Track CPUs for which sysfs links are created */
+	cpumask_set_cpu(cpu, policy->linked_cpus);
+	return 0;
+}
+
+/*
+ * Create symlinks for CPUs which are already added via subsys callbacks (and
+ * were offline then), before the policy was created.
+ */
+static int cpufreq_add_pending_symlinks(struct cpufreq_policy *policy)
+{
+	struct cpumask mask;
+	int cpu, ret;
+
+	cpumask_and(&mask, policy->related_cpus, &linked_cpus_pending);
+
+	if (cpumask_empty(&mask))
+		return 0;
+
+	for_each_cpu(cpu, &mask) {
+		ret = cpufreq_add_symlink(policy, get_cpu_device(cpu));
+		if (ret)
+			return ret;
+		cpumask_clear_cpu(cpu, &linked_cpus_pending);
+	}
+
+	return 0;
+}
+
 static int cpufreq_add_dev_interface(struct cpufreq_policy *policy,
 				     struct device *dev)
 {
@@ -968,7 +1015,7 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy,
 			return ret;
 	}
 
-	return 0;
+	return cpufreq_add_pending_symlinks(policy);
 }
 
 static int cpufreq_init_policy(struct cpufreq_policy *policy)
@@ -1170,24 +1217,21 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 
 	/* sysfs links are only created on subsys callback */
 	if (sif && policy) {
-		pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);
-		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
-		if (ret) {
-			dev_err(dev, "%s: Failed to create link (%d)\n",
-				__func__, ret);
+		ret = cpufreq_add_symlink(policy, dev);
+		if (ret)
 			return ret;
-		}
-
-		/* Track CPUs for which sysfs links are created */
-		cpumask_set_cpu(cpu, policy->linked_cpus);
 	}
 
 	/*
 	 * A hotplug notifier will follow and we will take care of rest
 	 * of the initialization then.
 	 */
-	if (cpu_is_offline(cpu))
+	if (cpu_is_offline(cpu)) {
+		/* symlink should be added for this CPU later */
+		if (!policy)
+			cpumask_set_cpu(cpu, &linked_cpus_pending);
 		return 0;
+	}
 
 	/* Check if this CPU already has a policy to manage it */
 	if (policy && !policy_is_inactive(policy)) {
@@ -1440,8 +1484,10 @@ static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
 	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
 	int ret;
 
-	if (!policy)
+	if (!policy) {
+		cpumask_clear_cpu(cpu, &linked_cpus_pending);
 		return 0;
+	}
 
 	if (cpu_online(cpu)) {
 		ret = __cpufreq_remove_dev_prepare(dev, sif);
@@ -2533,10 +2579,14 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 
 	/* Protect against concurrent cpu hotplug */
 	get_online_cpus();
+
 	subsys_interface_unregister(&cpufreq_interface);
 	if (cpufreq_boost_supported())
 		cpufreq_sysfs_remove_file(&boost.attr);
 
+	if (WARN_ON(!cpumask_empty(&linked_cpus_pending)))
+		cpumask_clear(&linked_cpus_pending);
+
 	unregister_hotcpu_notifier(&cpufreq_cpu_notifier);
 
 	write_lock_irqsave(&cpufreq_driver_lock, flags);
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/3] cpufreq: use cpumask_test_and_clear_cpu() instead of separate routines
  2015-07-23  8:13         ` [PATCH 1/3] cpufreq: print error messages with dev_err() Viresh Kumar
  2015-07-23  8:13           ` [PATCH 2/3] cpufreq: Create links for offline CPUs that got added earlier Viresh Kumar
@ 2015-07-23  8:13           ` Viresh Kumar
  1 sibling, 0 replies; 23+ messages in thread
From: Viresh Kumar @ 2015-07-23  8:13 UTC (permalink / raw)
  To: Rafael Wysocki; +Cc: linaro-kernel, linux-pm, linux, Viresh Kumar, open list

We need to clear cpumask only if the relevant cpu is set and we could
have used cpumask_test_and_clear_cpu() and set instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d01cad993fa7..b223c9c5296b 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1498,10 +1498,9 @@ static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
 	}
 
 	/* sysfs links are removed only on subsys callback */
-	if (cpumask_test_cpu(cpu, policy->linked_cpus)) {
+	if (cpumask_test_and_clear_cpu(cpu, policy->linked_cpus)) {
 		dev_dbg(dev, "%s: Removing symlink for CPU: %u\n", __func__,
 			cpu);
-		cpumask_clear_cpu(cpu, policy->linked_cpus);
 		sysfs_remove_link(&dev->kobj, "cpufreq");
 		return 0;
 	}
-- 
2.4.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH V2] cpufreq: Fix double addition of sysfs links
  2015-07-23  6:09       ` Viresh Kumar
  2015-07-23  8:13         ` [PATCH 1/3] cpufreq: print error messages with dev_err() Viresh Kumar
@ 2015-07-23 17:22         ` Rafael J. Wysocki
  2015-07-23 19:29           ` Rafael J. Wysocki
  1 sibling, 1 reply; 23+ messages in thread
From: Rafael J. Wysocki @ 2015-07-23 17:22 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Russell King - ARM Linux, Rafael Wysocki,
	Lists linaro-kernel, linux-pm, open list

Hi Viresh,

On Thu, Jul 23, 2015 at 8:09 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 22-07-15, 18:42, Rafael J. Wysocki wrote:
>> > 3. what happens when 'policy' is NULL at the point when the first (few) CPUs
>> >    are added - how do the symlinks get created later if/when policy becomes
>> >    non-NULL (can it?)
>>
>> Yes, it can, and we have a design issue here that bothers me a bit.
>
> I replied to Russell with a NO here as the first CPU should have
> created the policy. BUT...
>
>> Namley, we need a driver's ->init callback to populate policy->cpus
>> for us, but this is not the only thing it is doing, so the concern is
>> that it may not be able to deal with CPUs that aren't online.
>
> ... the first few CPUs could have been offline and so we might not
> have tried to add the policy at all.. Need to fix that for sure.

Wait here.

The current Linus' tree doesn't have that problem as far as I can say.

Say cpufreq_interface->add_dev() is called for an offline CPU (say
CPU2).  It points to cpufreq_add_dev(), so we see that the CPU is
offline and call add_cpu_dev_symlink() for it.  But the first argument
we pass to that is per_cpu(cpufreq_cpu_data, cpu) and that is NULL,
because the policy is not there yet.  So we just return 0 (and the CPU
has no policy and no link).

Now say cpufreq_interface->add_dev() is called for an online CPU (say
CPU3).  It goes and creates the policy for it and the driver's
->init() tells us that CPU2 is related to it.  So
cpufreq_add_dev_interface() creates the link for CPU2 and we're fine.

Now say CPU3 was offline too when cpufreq_interface->add_dev() was
called for it.  We don't create a policy or a link for it.  Now say
CPU2 becomes online.  cpufreq_cpu_callback() calls cpufreq_add_dev()
for it and we land in the previous case.

The *broken* case is when CPU2 is online to start with and it had
created the link for CPU3, so when an offline CPU3 is now being added,
we try to create the link for it again.  That is the case we need to
address in -rc without introducing new problems.  The $subject patch
adresses that issue, but it introduces the above problem.  On the
other hand, my patch at https://patchwork.kernel.org/patch/6839151/
should take care of this too (unless it is broken in a way I'm not
seeing now).

>> I was thinking about an additional driver callback that would *only*
>> populate a mask of CPUs that should use the same policy as the given
>> one.
>
> Why so ? Drivers today are required to set policy->cpus with all CPUs
> that should be managed by that policy. i.e. all online+offline. So,
> actually ->init() fills policy->cpus with the value of
> policy->related_cpus.

So the problem is that the setting of policy->cpus is not the *only*
thing that ->init() is supposed to do.  It can go and write to
registers etc and is that guaranteed to work with offline CPUs?  I
honestly don't think so.

> Yes, I thought earlier to change that by setting policy->related_cpus
> from drivers, instead of policy->cpus and wasn't sure if I should do
> that :)
>
>> We'd be able to call that from cpufreq_add_dev() for offline
>> CPUs too and this way the policy object could be created for the first
>> CPU using the policy that is registered instead of being added for the
>> first CPU using that policy that becomes online (which happens today).
>
> Creating policy for offline CPUs doesn't look that great to me.

Then we have a problem that CPUs that are not initially online do not
have policies, but if they go online and *then* offline subsequently,
the policies will be there.  So there are two different "offline"
statuses for a CPU as far as cpufreq is concerned, depending on
whether or not the CPU has ever been online.  That's weird and IMO we
should avoid it.

> What we can do to fix the problem in hand, is to update a global mask
> of CPUs (with policy == NULL) which were offline when
> cpufreq_add_dev() was called for them. And when we create the policy,
> we can add links for all such CPUs.

For -rc, why don't we have a mask of CPUs that are both "related" and
present and create links only for those?  Which is what the patch at
https://patchwork.kernel.org/patch/6839151/ is doing?

And then we can target a major rework at the next merge window.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3] cpufreq: Create links for offline CPUs that got added earlier
  2015-07-23  8:13           ` [PATCH 2/3] cpufreq: Create links for offline CPUs that got added earlier Viresh Kumar
@ 2015-07-23 19:28             ` Rafael J. Wysocki
  0 siblings, 0 replies; 23+ messages in thread
From: Rafael J. Wysocki @ 2015-07-23 19:28 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Lists linaro-kernel, linux-pm,
	Russell King - ARM Linux, open list

Hi Viresh,

On Thu, Jul 23, 2015 at 10:13 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> If subsys callback ->add_dev() is called for an offline CPU, before its
> policy is allocated, we will miss adding its sysfs symlink.
>
> Fix this by tracking such CPUs in a separate mask.
>
> Fixes: 9b07109f06a1 ("cpufreq: Fix double addition of sysfs links")
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

No, we need to go to back to square one.

No fixes of fixes of fixes etc please.

Let me prepare a patch for -rc that won't introduce *new* problems and
we can make major changes as 4.3 material, OK?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH V2] cpufreq: Fix double addition of sysfs links
  2015-07-23 17:22         ` [PATCH V2] cpufreq: Fix double addition of sysfs links Rafael J. Wysocki
@ 2015-07-23 19:29           ` Rafael J. Wysocki
  0 siblings, 0 replies; 23+ messages in thread
From: Rafael J. Wysocki @ 2015-07-23 19:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Russell King - ARM Linux, Rafael Wysocki,
	Lists linaro-kernel, linux-pm, open list

On Thu, Jul 23, 2015 at 7:22 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> Hi Viresh,
>
> On Thu, Jul 23, 2015 at 8:09 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> On 22-07-15, 18:42, Rafael J. Wysocki wrote:
>>> > 3. what happens when 'policy' is NULL at the point when the first (few) CPUs
>>> >    are added - how do the symlinks get created later if/when policy becomes
>>> >    non-NULL (can it?)
>>>
>>> Yes, it can, and we have a design issue here that bothers me a bit.
>>
>> I replied to Russell with a NO here as the first CPU should have
>> created the policy. BUT...
>>
>>> Namley, we need a driver's ->init callback to populate policy->cpus
>>> for us, but this is not the only thing it is doing, so the concern is
>>> that it may not be able to deal with CPUs that aren't online.
>>
>> ... the first few CPUs could have been offline and so we might not
>> have tried to add the policy at all.. Need to fix that for sure.
>
> Wait here.
>
> The current Linus' tree doesn't have that problem as far as I can say.
>
> Say cpufreq_interface->add_dev() is called for an offline CPU (say
> CPU2).  It points to cpufreq_add_dev(), so we see that the CPU is
> offline and call add_cpu_dev_symlink() for it.  But the first argument
> we pass to that is per_cpu(cpufreq_cpu_data, cpu) and that is NULL,
> because the policy is not there yet.  So we just return 0 (and the CPU
> has no policy and no link).
>
> Now say cpufreq_interface->add_dev() is called for an online CPU (say
> CPU3).  It goes and creates the policy for it and the driver's
> ->init() tells us that CPU2 is related to it.  So
> cpufreq_add_dev_interface() creates the link for CPU2 and we're fine.
>
> Now say CPU3 was offline too when cpufreq_interface->add_dev() was
> called for it.  We don't create a policy or a link for it.  Now say
> CPU2 becomes online.  cpufreq_cpu_callback() calls cpufreq_add_dev()
> for it and we land in the previous case.
>
> The *broken* case is when CPU2 is online to start with and it had
> created the link for CPU3, so when an offline CPU3 is now being added,
> we try to create the link for it again.  That is the case we need to
> address in -rc without introducing new problems.  The $subject patch
> adresses that issue, but it introduces the above problem.  On the
> other hand, my patch at https://patchwork.kernel.org/patch/6839151/
> should take care of this too (unless it is broken in a way I'm not
> seeing now).

It doesn't address the case when the CPU being removed is the policy owner.

Let me prepare a new version of it and we'll start over from there.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2015-07-23 19:29 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20150718163149.GP7557@n2100.arm.linux.org.uk>
2015-07-20  9:47 ` [PATCH] cpufreq: Avoid double addition/removal of sysfs links Viresh Kumar
2015-07-20 10:36   ` Russell King - ARM Linux
2015-07-21  0:47     ` Rafael J. Wysocki
2015-07-21  1:14       ` Rafael J. Wysocki
2015-07-22  7:04         ` Viresh Kumar
2015-07-21 10:43     ` Viresh Kumar
2015-07-22  6:57     ` Viresh Kumar
2015-07-21 23:15   ` Rafael J. Wysocki
2015-07-22  1:56     ` Rafael J. Wysocki
2015-07-22  3:00       ` Rafael J. Wysocki
2015-07-22  7:12     ` Viresh Kumar
2015-07-22 12:07 ` [PATCH V2] cpufreq: Fix double addition " Viresh Kumar
2015-07-22 12:44   ` Rafael J. Wysocki
2015-07-22 13:15   ` Russell King - ARM Linux
2015-07-22 16:42     ` Rafael J. Wysocki
2015-07-23  6:09       ` Viresh Kumar
2015-07-23  8:13         ` [PATCH 1/3] cpufreq: print error messages with dev_err() Viresh Kumar
2015-07-23  8:13           ` [PATCH 2/3] cpufreq: Create links for offline CPUs that got added earlier Viresh Kumar
2015-07-23 19:28             ` Rafael J. Wysocki
2015-07-23  8:13           ` [PATCH 3/3] cpufreq: use cpumask_test_and_clear_cpu() instead of separate routines Viresh Kumar
2015-07-23 17:22         ` [PATCH V2] cpufreq: Fix double addition of sysfs links Rafael J. Wysocki
2015-07-23 19:29           ` Rafael J. Wysocki
2015-07-23  5:54     ` Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).