All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
@ 2013-12-03 11:14 ` Bjørn Mork
  0 siblings, 0 replies; 28+ messages in thread
From: Bjørn Mork @ 2013-12-03 11:14 UTC (permalink / raw)
  To: cpufreq
  Cc: linux-pm, Srivatsa S. Bhat, Viresh Kumar, Rafael J. Wysocki,
	Bjørn Mork

This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
light-weight init/teardown during suspend/resume"), which enabled
suspend/resume optimizations leaving the sysfs files in place.

Errors during suspend/resume are not handled properly, leaving
dead sysfs attributes in case of failures.  There are are number of
functions with special code for the "frozen" case, and all these
need to also have special error handling.

The problem is easy to demonstrate by making cpufreq_driver->init()
or cpufreq_driver->get() fail during resume.

The code is too complex for a simple fix, with split code paths
in multiple blocks within a number of functions.  It is therefore
best to revert the patch enabling this code until the error handling
is in place.

Examples of problems resulting from resume errors:

Dec  2 09:54:38 nemi kernel: [  930.162476] ------------[ cut here ]------------
Dec  2 09:54:38 nemi kernel: [  930.162489] WARNING: CPU: 0 PID: 6055 at fs/sysfs/file.c:343 sysfs_open_file+0x77/0x212()
Dec  2 09:54:38 nemi kernel: [  930.162493] missing sysfs attribute operations for kobject: (null)
Dec  2 09:54:38 nemi kernel: [  930.162495] Modules linked in: [stripped as irrelevant]
Dec  2 09:54:38 nemi kernel: [  930.162662] CPU: 0 PID: 6055 Comm: grep Tainted: G      D      3.13.0-rc2 #153
Dec  2 09:54:38 nemi kernel: [  930.162665] Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
Dec  2 09:54:38 nemi kernel: [  930.162668]  0000000000000009 ffff8802327ebb78 ffffffff81380b0e 0000000000000006
Dec  2 09:54:38 nemi kernel: [  930.162676]  ffff8802327ebbc8 ffff8802327ebbb8 ffffffff81038635 0000000000000000
Dec  2 09:54:38 nemi kernel: [  930.162682]  ffffffff811823c7 ffff88021a19e688 ffff88021a19e688 ffff8802302f9310
Dec  2 09:54:38 nemi kernel: [  930.162690] Call Trace:
Dec  2 09:54:38 nemi kernel: [  930.162698]  [<ffffffff81380b0e>] dump_stack+0x55/0x76
Dec  2 09:54:38 nemi kernel: [  930.162705]  [<ffffffff81038635>] warn_slowpath_common+0x7c/0x96
Dec  2 09:54:38 nemi kernel: [  930.162710]  [<ffffffff811823c7>] ? sysfs_open_file+0x77/0x212
Dec  2 09:54:38 nemi kernel: [  930.162715]  [<ffffffff810386e3>] warn_slowpath_fmt+0x41/0x43
Dec  2 09:54:38 nemi kernel: [  930.162720]  [<ffffffff81182dec>] ? sysfs_get_active+0x6b/0x82
Dec  2 09:54:38 nemi kernel: [  930.162725]  [<ffffffff81182382>] ? sysfs_open_file+0x32/0x212
Dec  2 09:54:38 nemi kernel: [  930.162730]  [<ffffffff811823c7>] sysfs_open_file+0x77/0x212
Dec  2 09:54:38 nemi kernel: [  930.162736]  [<ffffffff81182350>] ? sysfs_schedule_callback+0x1ac/0x1ac
Dec  2 09:54:38 nemi kernel: [  930.162742]  [<ffffffff81122562>] do_dentry_open+0x17c/0x257
Dec  2 09:54:38 nemi kernel: [  930.162748]  [<ffffffff8112267e>] finish_open+0x41/0x4f
Dec  2 09:54:38 nemi kernel: [  930.162754]  [<ffffffff81130225>] do_last+0x80c/0x9ba
Dec  2 09:54:38 nemi kernel: [  930.162759]  [<ffffffff8112dbbd>] ? inode_permission+0x40/0x42
Dec  2 09:54:38 nemi kernel: [  930.162764]  [<ffffffff81130606>] path_openat+0x233/0x4a1
Dec  2 09:54:38 nemi kernel: [  930.162770]  [<ffffffff81130b7e>] do_filp_open+0x35/0x85
Dec  2 09:54:38 nemi kernel: [  930.162776]  [<ffffffff8113b787>] ? __alloc_fd+0x172/0x184
Dec  2 09:54:38 nemi kernel: [  930.162782]  [<ffffffff811232ea>] do_sys_open+0x6b/0xfa
Dec  2 09:54:38 nemi kernel: [  930.162787]  [<ffffffff811233a7>] SyS_openat+0xf/0x11
Dec  2 09:54:38 nemi kernel: [  930.162794]  [<ffffffff8138c812>] system_call_fastpath+0x16/0x1b
Dec  2 09:54:38 nemi kernel: [  930.162798] ---[ end trace 48ce7fe74a95d4be ]---

The failure to restore cpufreq devices on cancelled hibernation is
not a new bug. It is caused by the ACPI _PPC call failing unless the
hibernate is completed. This makes the acpi_cpufreq driver fail its
init.

Previously, the cpufreq device could be restored by offlining the
cpu temporarily.  And as a complete hibernation cycle would do this,
it would be automatically restored most of the time.  But after
commit 5302c3fb2e62 the leftover sysfs attributes will block any
device add action.  Therefore offlining and onlining CPU 1 will no
longer restore the cpufreq object, and a complete suspend/resume
cycle will replace it with garbage.

Fixes: 5302c3fb2e62 ("cpufreq: Perform light-weight init/teardown during suspend/resume")
Cc: <stable@vger.kernel.org> # v3.12
Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
 drivers/cpufreq/cpufreq.c |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 02d534da22dd..b7c3b877da44 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2076,9 +2076,6 @@ static int cpufreq_cpu_callback(struct notifier_block *nfb,
 	dev = get_cpu_device(cpu);
 	if (dev) {
 
-		if (action & CPU_TASKS_FROZEN)
-			frozen = true;
-
 		switch (action & ~CPU_TASKS_FROZEN) {
 		case CPU_ONLINE:
 			__cpufreq_add_dev(dev, NULL, frozen);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
@ 2013-12-03 11:14 ` Bjørn Mork
  0 siblings, 0 replies; 28+ messages in thread
From: Bjørn Mork @ 2013-12-03 11:14 UTC (permalink / raw)
  To: cpufreq
  Cc: linux-pm, Srivatsa S. Bhat, Viresh Kumar, Rafael J. Wysocki,
	Bjørn Mork

This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
light-weight init/teardown during suspend/resume"), which enabled
suspend/resume optimizations leaving the sysfs files in place.

Errors during suspend/resume are not handled properly, leaving
dead sysfs attributes in case of failures.  There are are number of
functions with special code for the "frozen" case, and all these
need to also have special error handling.

The problem is easy to demonstrate by making cpufreq_driver->init()
or cpufreq_driver->get() fail during resume.

The code is too complex for a simple fix, with split code paths
in multiple blocks within a number of functions.  It is therefore
best to revert the patch enabling this code until the error handling
is in place.

Examples of problems resulting from resume errors:

Dec  2 09:54:38 nemi kernel: [  930.162476] ------------[ cut here ]------------
Dec  2 09:54:38 nemi kernel: [  930.162489] WARNING: CPU: 0 PID: 6055 at fs/sysfs/file.c:343 sysfs_open_file+0x77/0x212()
Dec  2 09:54:38 nemi kernel: [  930.162493] missing sysfs attribute operations for kobject: (null)
Dec  2 09:54:38 nemi kernel: [  930.162495] Modules linked in: [stripped as irrelevant]
Dec  2 09:54:38 nemi kernel: [  930.162662] CPU: 0 PID: 6055 Comm: grep Tainted: G      D      3.13.0-rc2 #153
Dec  2 09:54:38 nemi kernel: [  930.162665] Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
Dec  2 09:54:38 nemi kernel: [  930.162668]  0000000000000009 ffff8802327ebb78 ffffffff81380b0e 0000000000000006
Dec  2 09:54:38 nemi kernel: [  930.162676]  ffff8802327ebbc8 ffff8802327ebbb8 ffffffff81038635 0000000000000000
Dec  2 09:54:38 nemi kernel: [  930.162682]  ffffffff811823c7 ffff88021a19e688 ffff88021a19e688 ffff8802302f9310
Dec  2 09:54:38 nemi kernel: [  930.162690] Call Trace:
Dec  2 09:54:38 nemi kernel: [  930.162698]  [<ffffffff81380b0e>] dump_stack+0x55/0x76
Dec  2 09:54:38 nemi kernel: [  930.162705]  [<ffffffff81038635>] warn_slowpath_common+0x7c/0x96
Dec  2 09:54:38 nemi kernel: [  930.162710]  [<ffffffff811823c7>] ? sysfs_open_file+0x77/0x212
Dec  2 09:54:38 nemi kernel: [  930.162715]  [<ffffffff810386e3>] warn_slowpath_fmt+0x41/0x43
Dec  2 09:54:38 nemi kernel: [  930.162720]  [<ffffffff81182dec>] ? sysfs_get_active+0x6b/0x82
Dec  2 09:54:38 nemi kernel: [  930.162725]  [<ffffffff81182382>] ? sysfs_open_file+0x32/0x212
Dec  2 09:54:38 nemi kernel: [  930.162730]  [<ffffffff811823c7>] sysfs_open_file+0x77/0x212
Dec  2 09:54:38 nemi kernel: [  930.162736]  [<ffffffff81182350>] ? sysfs_schedule_callback+0x1ac/0x1ac
Dec  2 09:54:38 nemi kernel: [  930.162742]  [<ffffffff81122562>] do_dentry_open+0x17c/0x257
Dec  2 09:54:38 nemi kernel: [  930.162748]  [<ffffffff8112267e>] finish_open+0x41/0x4f
Dec  2 09:54:38 nemi kernel: [  930.162754]  [<ffffffff81130225>] do_last+0x80c/0x9ba
Dec  2 09:54:38 nemi kernel: [  930.162759]  [<ffffffff8112dbbd>] ? inode_permission+0x40/0x42
Dec  2 09:54:38 nemi kernel: [  930.162764]  [<ffffffff81130606>] path_openat+0x233/0x4a1
Dec  2 09:54:38 nemi kernel: [  930.162770]  [<ffffffff81130b7e>] do_filp_open+0x35/0x85
Dec  2 09:54:38 nemi kernel: [  930.162776]  [<ffffffff8113b787>] ? __alloc_fd+0x172/0x184
Dec  2 09:54:38 nemi kernel: [  930.162782]  [<ffffffff811232ea>] do_sys_open+0x6b/0xfa
Dec  2 09:54:38 nemi kernel: [  930.162787]  [<ffffffff811233a7>] SyS_openat+0xf/0x11
Dec  2 09:54:38 nemi kernel: [  930.162794]  [<ffffffff8138c812>] system_call_fastpath+0x16/0x1b
Dec  2 09:54:38 nemi kernel: [  930.162798] ---[ end trace 48ce7fe74a95d4be ]---

The failure to restore cpufreq devices on cancelled hibernation is
not a new bug. It is caused by the ACPI _PPC call failing unless the
hibernate is completed. This makes the acpi_cpufreq driver fail its
init.

Previously, the cpufreq device could be restored by offlining the
cpu temporarily.  And as a complete hibernation cycle would do this,
it would be automatically restored most of the time.  But after
commit 5302c3fb2e62 the leftover sysfs attributes will block any
device add action.  Therefore offlining and onlining CPU 1 will no
longer restore the cpufreq object, and a complete suspend/resume
cycle will replace it with garbage.

Fixes: 5302c3fb2e62 ("cpufreq: Perform light-weight init/teardown during suspend/resume")
Cc: <stable@vger.kernel.org> # v3.12
Signed-off-by: Bj√∏rn Mork <bjorn@mork.no>
---
 drivers/cpufreq/cpufreq.c |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 02d534da22dd..b7c3b877da44 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2076,9 +2076,6 @@ static int cpufreq_cpu_callback(struct notifier_block *nfb,
 	dev = get_cpu_device(cpu);
 	if (dev) {
 
-		if (action & CPU_TASKS_FROZEN)
-			frozen = true;
-
 		switch (action & ~CPU_TASKS_FROZEN) {
 		case CPU_ONLINE:
 			__cpufreq_add_dev(dev, NULL, frozen);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-03 11:14 ` Bjørn Mork
  (?)
@ 2013-12-03 21:45 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2013-12-03 21:45 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: cpufreq, linux-pm, Srivatsa S. Bhat, Viresh Kumar, Rafael J. Wysocki

On Tuesday, December 03, 2013 12:14:32 PM Bjørn Mork wrote:
> This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
> light-weight init/teardown during suspend/resume"), which enabled
> suspend/resume optimizations leaving the sysfs files in place.
> 
> Errors during suspend/resume are not handled properly, leaving
> dead sysfs attributes in case of failures.  There are are number of
> functions with special code for the "frozen" case, and all these
> need to also have special error handling.
> 
> The problem is easy to demonstrate by making cpufreq_driver->init()
> or cpufreq_driver->get() fail during resume.
> 
> The code is too complex for a simple fix, with split code paths
> in multiple blocks within a number of functions.  It is therefore
> best to revert the patch enabling this code until the error handling
> is in place.
> 
> Examples of problems resulting from resume errors:

Applied, but next time you add call traces to patch changelogs, please remove
timestamps from them unless those timestamps are relevant for the bug
description.  They are pure noise otherwise.

Thanks,
Rafael


> Dec  2 09:54:38 nemi kernel: [  930.162476] ------------[ cut here ]------------
> Dec  2 09:54:38 nemi kernel: [  930.162489] WARNING: CPU: 0 PID: 6055 at fs/sysfs/file.c:343 sysfs_open_file+0x77/0x212()
> Dec  2 09:54:38 nemi kernel: [  930.162493] missing sysfs attribute operations for kobject: (null)
> Dec  2 09:54:38 nemi kernel: [  930.162495] Modules linked in: [stripped as irrelevant]
> Dec  2 09:54:38 nemi kernel: [  930.162662] CPU: 0 PID: 6055 Comm: grep Tainted: G      D      3.13.0-rc2 #153
> Dec  2 09:54:38 nemi kernel: [  930.162665] Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
> Dec  2 09:54:38 nemi kernel: [  930.162668]  0000000000000009 ffff8802327ebb78 ffffffff81380b0e 0000000000000006
> Dec  2 09:54:38 nemi kernel: [  930.162676]  ffff8802327ebbc8 ffff8802327ebbb8 ffffffff81038635 0000000000000000
> Dec  2 09:54:38 nemi kernel: [  930.162682]  ffffffff811823c7 ffff88021a19e688 ffff88021a19e688 ffff8802302f9310
> Dec  2 09:54:38 nemi kernel: [  930.162690] Call Trace:
> Dec  2 09:54:38 nemi kernel: [  930.162698]  [<ffffffff81380b0e>] dump_stack+0x55/0x76
> Dec  2 09:54:38 nemi kernel: [  930.162705]  [<ffffffff81038635>] warn_slowpath_common+0x7c/0x96
> Dec  2 09:54:38 nemi kernel: [  930.162710]  [<ffffffff811823c7>] ? sysfs_open_file+0x77/0x212
> Dec  2 09:54:38 nemi kernel: [  930.162715]  [<ffffffff810386e3>] warn_slowpath_fmt+0x41/0x43
> Dec  2 09:54:38 nemi kernel: [  930.162720]  [<ffffffff81182dec>] ? sysfs_get_active+0x6b/0x82
> Dec  2 09:54:38 nemi kernel: [  930.162725]  [<ffffffff81182382>] ? sysfs_open_file+0x32/0x212
> Dec  2 09:54:38 nemi kernel: [  930.162730]  [<ffffffff811823c7>] sysfs_open_file+0x77/0x212
> Dec  2 09:54:38 nemi kernel: [  930.162736]  [<ffffffff81182350>] ? sysfs_schedule_callback+0x1ac/0x1ac
> Dec  2 09:54:38 nemi kernel: [  930.162742]  [<ffffffff81122562>] do_dentry_open+0x17c/0x257
> Dec  2 09:54:38 nemi kernel: [  930.162748]  [<ffffffff8112267e>] finish_open+0x41/0x4f
> Dec  2 09:54:38 nemi kernel: [  930.162754]  [<ffffffff81130225>] do_last+0x80c/0x9ba
> Dec  2 09:54:38 nemi kernel: [  930.162759]  [<ffffffff8112dbbd>] ? inode_permission+0x40/0x42
> Dec  2 09:54:38 nemi kernel: [  930.162764]  [<ffffffff81130606>] path_openat+0x233/0x4a1
> Dec  2 09:54:38 nemi kernel: [  930.162770]  [<ffffffff81130b7e>] do_filp_open+0x35/0x85
> Dec  2 09:54:38 nemi kernel: [  930.162776]  [<ffffffff8113b787>] ? __alloc_fd+0x172/0x184
> Dec  2 09:54:38 nemi kernel: [  930.162782]  [<ffffffff811232ea>] do_sys_open+0x6b/0xfa
> Dec  2 09:54:38 nemi kernel: [  930.162787]  [<ffffffff811233a7>] SyS_openat+0xf/0x11
> Dec  2 09:54:38 nemi kernel: [  930.162794]  [<ffffffff8138c812>] system_call_fastpath+0x16/0x1b
> Dec  2 09:54:38 nemi kernel: [  930.162798] ---[ end trace 48ce7fe74a95d4be ]---
> 
> The failure to restore cpufreq devices on cancelled hibernation is
> not a new bug. It is caused by the ACPI _PPC call failing unless the
> hibernate is completed. This makes the acpi_cpufreq driver fail its
> init.
> 
> Previously, the cpufreq device could be restored by offlining the
> cpu temporarily.  And as a complete hibernation cycle would do this,
> it would be automatically restored most of the time.  But after
> commit 5302c3fb2e62 the leftover sysfs attributes will block any
> device add action.  Therefore offlining and onlining CPU 1 will no
> longer restore the cpufreq object, and a complete suspend/resume
> cycle will replace it with garbage.
> 
> Fixes: 5302c3fb2e62 ("cpufreq: Perform light-weight init/teardown during suspend/resume")
> Cc: <stable@vger.kernel.org> # v3.12
> Signed-off-by: Bjørn Mork <bjorn@mork.no>
> ---
>  drivers/cpufreq/cpufreq.c |    3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 02d534da22dd..b7c3b877da44 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2076,9 +2076,6 @@ static int cpufreq_cpu_callback(struct notifier_block *nfb,
>  	dev = get_cpu_device(cpu);
>  	if (dev) {
>  
> -		if (action & CPU_TASKS_FROZEN)
> -			frozen = true;
> -
>  		switch (action & ~CPU_TASKS_FROZEN) {
>  		case CPU_ONLINE:
>  			__cpufreq_add_dev(dev, NULL, frozen);
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-03 11:14 ` Bjørn Mork
  (?)
  (?)
@ 2013-12-04  6:23 ` Srivatsa S. Bhat
  2013-12-24  9:46   ` Jarzmik, Robert
  -1 siblings, 1 reply; 28+ messages in thread
From: Srivatsa S. Bhat @ 2013-12-04  6:23 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: cpufreq, linux-pm, Viresh Kumar, Rafael J. Wysocki, Jarzmik,
	Robert, R, Durgadoss

On 12/03/2013 04:44 PM, Bjørn Mork wrote:
> This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
> light-weight init/teardown during suspend/resume"), which enabled
> suspend/resume optimizations leaving the sysfs files in place.
> 
> Errors during suspend/resume are not handled properly, leaving
> dead sysfs attributes in case of failures.  There are are number of
> functions with special code for the "frozen" case, and all these
> need to also have special error handling.
> 
> The problem is easy to demonstrate by making cpufreq_driver->init()
> or cpufreq_driver->get() fail during resume.
> 
> The code is too complex for a simple fix, with split code paths
> in multiple blocks within a number of functions.

Yes, unfortunately. It was hard enough to get it into shape to make
suspend/resume preserve sysfs files. I definitely don't expect it to
be simple to fix error handling on top of that. 

>  It is therefore
> best to revert the patch enabling this code until the error handling
> is in place.

I agree, that's a good decision. I'll take a look at it later to see
if we can restructure the code to include proper error handling in all
the failure paths. If that gets way out of control in terms of complexity,
then its probably best to drop this "feature" altogether. It has caused
enough problems already, and the initial motivation behind doing that
doesn't seem to be that strong now (CC'ing Robert and Durga).

Thanks for the debugging and the fix!

Regards,
Srivatsa S. Bhat

> 
> Examples of problems resulting from resume errors:
> 
> Dec  2 09:54:38 nemi kernel: [  930.162476] ------------[ cut here ]------------
> Dec  2 09:54:38 nemi kernel: [  930.162489] WARNING: CPU: 0 PID: 6055 at fs/sysfs/file.c:343 sysfs_open_file+0x77/0x212()
> Dec  2 09:54:38 nemi kernel: [  930.162493] missing sysfs attribute operations for kobject: (null)
> Dec  2 09:54:38 nemi kernel: [  930.162495] Modules linked in: [stripped as irrelevant]
> Dec  2 09:54:38 nemi kernel: [  930.162662] CPU: 0 PID: 6055 Comm: grep Tainted: G      D      3.13.0-rc2 #153
> Dec  2 09:54:38 nemi kernel: [  930.162665] Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
> Dec  2 09:54:38 nemi kernel: [  930.162668]  0000000000000009 ffff8802327ebb78 ffffffff81380b0e 0000000000000006
> Dec  2 09:54:38 nemi kernel: [  930.162676]  ffff8802327ebbc8 ffff8802327ebbb8 ffffffff81038635 0000000000000000
> Dec  2 09:54:38 nemi kernel: [  930.162682]  ffffffff811823c7 ffff88021a19e688 ffff88021a19e688 ffff8802302f9310
> Dec  2 09:54:38 nemi kernel: [  930.162690] Call Trace:
> Dec  2 09:54:38 nemi kernel: [  930.162698]  [<ffffffff81380b0e>] dump_stack+0x55/0x76
> Dec  2 09:54:38 nemi kernel: [  930.162705]  [<ffffffff81038635>] warn_slowpath_common+0x7c/0x96
> Dec  2 09:54:38 nemi kernel: [  930.162710]  [<ffffffff811823c7>] ? sysfs_open_file+0x77/0x212
> Dec  2 09:54:38 nemi kernel: [  930.162715]  [<ffffffff810386e3>] warn_slowpath_fmt+0x41/0x43
> Dec  2 09:54:38 nemi kernel: [  930.162720]  [<ffffffff81182dec>] ? sysfs_get_active+0x6b/0x82
> Dec  2 09:54:38 nemi kernel: [  930.162725]  [<ffffffff81182382>] ? sysfs_open_file+0x32/0x212
> Dec  2 09:54:38 nemi kernel: [  930.162730]  [<ffffffff811823c7>] sysfs_open_file+0x77/0x212
> Dec  2 09:54:38 nemi kernel: [  930.162736]  [<ffffffff81182350>] ? sysfs_schedule_callback+0x1ac/0x1ac
> Dec  2 09:54:38 nemi kernel: [  930.162742]  [<ffffffff81122562>] do_dentry_open+0x17c/0x257
> Dec  2 09:54:38 nemi kernel: [  930.162748]  [<ffffffff8112267e>] finish_open+0x41/0x4f
> Dec  2 09:54:38 nemi kernel: [  930.162754]  [<ffffffff81130225>] do_last+0x80c/0x9ba
> Dec  2 09:54:38 nemi kernel: [  930.162759]  [<ffffffff8112dbbd>] ? inode_permission+0x40/0x42
> Dec  2 09:54:38 nemi kernel: [  930.162764]  [<ffffffff81130606>] path_openat+0x233/0x4a1
> Dec  2 09:54:38 nemi kernel: [  930.162770]  [<ffffffff81130b7e>] do_filp_open+0x35/0x85
> Dec  2 09:54:38 nemi kernel: [  930.162776]  [<ffffffff8113b787>] ? __alloc_fd+0x172/0x184
> Dec  2 09:54:38 nemi kernel: [  930.162782]  [<ffffffff811232ea>] do_sys_open+0x6b/0xfa
> Dec  2 09:54:38 nemi kernel: [  930.162787]  [<ffffffff811233a7>] SyS_openat+0xf/0x11
> Dec  2 09:54:38 nemi kernel: [  930.162794]  [<ffffffff8138c812>] system_call_fastpath+0x16/0x1b
> Dec  2 09:54:38 nemi kernel: [  930.162798] ---[ end trace 48ce7fe74a95d4be ]---
> 
> The failure to restore cpufreq devices on cancelled hibernation is
> not a new bug. It is caused by the ACPI _PPC call failing unless the
> hibernate is completed. This makes the acpi_cpufreq driver fail its
> init.
> 
> Previously, the cpufreq device could be restored by offlining the
> cpu temporarily.  And as a complete hibernation cycle would do this,
> it would be automatically restored most of the time.  But after
> commit 5302c3fb2e62 the leftover sysfs attributes will block any
> device add action.  Therefore offlining and onlining CPU 1 will no
> longer restore the cpufreq object, and a complete suspend/resume
> cycle will replace it with garbage.
> 
> Fixes: 5302c3fb2e62 ("cpufreq: Perform light-weight init/teardown during suspend/resume")
> Cc: <stable@vger.kernel.org> # v3.12
> Signed-off-by: Bjørn Mork <bjorn@mork.no>
> ---
>  drivers/cpufreq/cpufreq.c |    3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 02d534da22dd..b7c3b877da44 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2076,9 +2076,6 @@ static int cpufreq_cpu_callback(struct notifier_block *nfb,
>  	dev = get_cpu_device(cpu);
>  	if (dev) {
> 
> -		if (action & CPU_TASKS_FROZEN)
> -			frozen = true;
> -
>  		switch (action & ~CPU_TASKS_FROZEN) {
>  		case CPU_ONLINE:
>  			__cpufreq_add_dev(dev, NULL, frozen);
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-03 11:14 ` Bjørn Mork
                   ` (2 preceding siblings ...)
  (?)
@ 2013-12-04 10:32 ` viresh kumar
  2013-12-04 12:08   ` Bjørn Mork
  2013-12-05  1:29   ` Rafael J. Wysocki
  -1 siblings, 2 replies; 28+ messages in thread
From: viresh kumar @ 2013-12-04 10:32 UTC (permalink / raw)
  To: Bjørn Mork, cpufreq; +Cc: linux-pm, Srivatsa S. Bhat, Rafael J. Wysocki

On Tuesday 03 December 2013 04:44 PM, Bjørn Mork wrote:
> This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
> light-weight init/teardown during suspend/resume"), which enabled
> suspend/resume optimizations leaving the sysfs files in place.
> 
> Errors during suspend/resume are not handled properly, leaving
> dead sysfs attributes in case of failures.  There are are number of
> functions with special code for the "frozen" case, and all these
> need to also have special error handling.

Can you try this please if it fixes things for you (with your patch reverted):

From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Wed, 4 Dec 2013 15:20:12 +0530
Subject: [PATCH] cpufreq: remove sysfs files for CPU which failed to come
 back after resume

There are cases where cpufreq_add_dev() may fail for some CPUs during resume.
With the current code we will still have sysfs cpufreq files for such CPUs, and
struct cpufreq_policy would be already freed for them. Hence any operation on
those sysfs files would result in kernel warnings.

To fix this, lets remove those sysfs files or put the associated kobject in case
of such errors. Also, to make it simple lets remove the sysfs links from all the
CPUs (except policy->cpu) during suspend as that operation wouldn't result with a
loss of sysfs file permissions. And so we will create those links during resume
as well.

Reported-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 61 ++++++++++++++++++++++++-----------------------
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 606224a..57c80a7 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -849,8 +849,7 @@ static void cpufreq_init_policy(struct cpufreq_policy *policy)

 #ifdef CONFIG_HOTPLUG_CPU
 static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
-				  unsigned int cpu, struct device *dev,
-				  bool frozen)
+				  unsigned int cpu, struct device *dev)
 {
 	int ret = 0;
 	unsigned long flags;
@@ -881,9 +880,7 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
 		}
 	}

-	/* Don't touch sysfs links during light-weight init */
-	if (!frozen)
-		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
+	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");

 	return ret;
 }
@@ -930,6 +927,27 @@ err_free_policy:
 	return NULL;
 }

+static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy)
+{
+	struct kobject *kobj;
+	struct completion *cmp;
+
+	down_read(&policy->rwsem);
+	kobj = &policy->kobj;
+	cmp = &policy->kobj_unregister;
+	up_read(&policy->rwsem);
+	kobject_put(kobj);
+
+	/*
+	 * We need to make sure that the underlying kobj is
+	 * actually not referenced anymore by anybody before we
+	 * proceed with unloading.
+	 */
+	pr_debug("waiting for dropping of refcount\n");
+	wait_for_completion(cmp);
+	pr_debug("wait complete\n");
+}
+
 static void cpufreq_policy_free(struct cpufreq_policy *policy)
 {
 	free_cpumask_var(policy->related_cpus);
@@ -990,7 +1008,7 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
 	list_for_each_entry(tpolicy, &cpufreq_policy_list, policy_list) {
 		if (cpumask_test_cpu(cpu, tpolicy->related_cpus)) {
 			read_unlock_irqrestore(&cpufreq_driver_lock, flags);
-			ret = cpufreq_add_policy_cpu(tpolicy, cpu, dev, frozen);
+			ret = cpufreq_add_policy_cpu(tpolicy, cpu, dev);
 			up_read(&cpufreq_rwsem);
 			return ret;
 		}
@@ -1100,7 +1118,10 @@ err_get_freq:
 	if (cpufreq_driver->exit)
 		cpufreq_driver->exit(policy);
 err_set_policy_cpu:
+	if (frozen)
+		cpufreq_policy_put_kobj(policy);
 	cpufreq_policy_free(policy);
+
 nomem_out:
 	up_read(&cpufreq_rwsem);

@@ -1122,7 +1143,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 }

 static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy,
-					   unsigned int old_cpu, bool frozen)
+					   unsigned int old_cpu)
 {
 	struct device *cpu_dev;
 	int ret;
@@ -1130,10 +1151,6 @@ static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy,
 	/* first sibling now owns the new sysfs dir */
 	cpu_dev = get_cpu_device(cpumask_any_but(policy->cpus, old_cpu));

-	/* Don't touch sysfs files during light-weight tear-down */
-	if (frozen)
-		return cpu_dev->id;
-
 	sysfs_remove_link(&cpu_dev->kobj, "cpufreq");
 	ret = kobject_move(&policy->kobj, &cpu_dev->kobj);
 	if (ret) {
@@ -1200,7 +1217,7 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
 		if (!frozen)
 			sysfs_remove_link(&dev->kobj, "cpufreq");
 	} else if (cpus > 1) {
-		new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu, frozen);
+		new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu);
 		if (new_cpu >= 0) {
 			update_policy_cpu(policy, new_cpu);

@@ -1222,8 +1239,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
 	int ret;
 	unsigned long flags;
 	struct cpufreq_policy *policy;
-	struct kobject *kobj;
-	struct completion *cmp;

 	read_lock_irqsave(&cpufreq_driver_lock, flags);
 	policy = per_cpu(cpufreq_cpu_data, cpu);
@@ -1253,22 +1268,8 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
 			}
 		}

-		if (!frozen) {
-			down_read(&policy->rwsem);
-			kobj = &policy->kobj;
-			cmp = &policy->kobj_unregister;
-			up_read(&policy->rwsem);
-			kobject_put(kobj);
-
-			/*
-			 * We need to make sure that the underlying kobj is
-			 * actually not referenced anymore by anybody before we
-			 * proceed with unloading.
-			 */
-			pr_debug("waiting for dropping of refcount\n");
-			wait_for_completion(cmp);
-			pr_debug("wait complete\n");
-		}
+		if (!frozen)
+			cpufreq_policy_put_kobj(policy);

 		/*
 		 * Perform the ->exit() even during light-weight tear-down,


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-04 10:32 ` viresh kumar
@ 2013-12-04 12:08   ` Bjørn Mork
  2013-12-04 14:41     ` Viresh Kumar
  2013-12-05  1:29   ` Rafael J. Wysocki
  1 sibling, 1 reply; 28+ messages in thread
From: Bjørn Mork @ 2013-12-04 12:08 UTC (permalink / raw)
  To: viresh kumar; +Cc: cpufreq, linux-pm, Srivatsa S. Bhat, Rafael J. Wysocki

viresh kumar <viresh.kumar@linaro.org> writes:

> On Tuesday 03 December 2013 04:44 PM, Bjørn Mork wrote:
>> This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
>> light-weight init/teardown during suspend/resume"), which enabled
>> suspend/resume optimizations leaving the sysfs files in place.
>> 
>> Errors during suspend/resume are not handled properly, leaving
>> dead sysfs attributes in case of failures.  There are are number of
>> functions with special code for the "frozen" case, and all these
>> need to also have special error handling.
>
> Can you try this please if it fixes things for you (with your patch reverted):

Yes, that patch looks like it fix the problem, and I cannot spot any
obvious missing cleanups.

But I must say that the whole kobj + free thing makes me feel a bit
uneasy.  I assume there are good reasons why "cpufreq" isn't just a
device with a release method?

And I do hope there is something gained by the special suspend handling,
because keeping this partial device looks really messy. If it's all just
for some file permissions, then there must be better ways?  Isn't
consistent file permissions on device creation really a userspace issue?

Anyway, the patch works so you can add

Tested-by: Bjørn Mork <bjorn@mork.no>


BTW, a simple way to test these things is cancelling a userspace
hibernate on an ACPI system.  It will always make the acpi_cpufreq
driver fail because it attemps to call _PPC inbetween
acpi_ec_block_transactions and acpi_ec_unblock_transactions.  There is
no acpi_ec_unblock_transactions_early call in this case.



Bjørn




> From: Viresh Kumar <viresh.kumar@linaro.org>
> Date: Wed, 4 Dec 2013 15:20:12 +0530
> Subject: [PATCH] cpufreq: remove sysfs files for CPU which failed to come
>  back after resume
>
> There are cases where cpufreq_add_dev() may fail for some CPUs during resume.
> With the current code we will still have sysfs cpufreq files for such CPUs, and
> struct cpufreq_policy would be already freed for them. Hence any operation on
> those sysfs files would result in kernel warnings.
>
> To fix this, lets remove those sysfs files or put the associated kobject in case
> of such errors. Also, to make it simple lets remove the sysfs links from all the
> CPUs (except policy->cpu) during suspend as that operation wouldn't result with a
> loss of sysfs file permissions. And so we will create those links during resume
> as well.
>
> Reported-by: Bjørn Mork <bjorn@mork.no>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/cpufreq.c | 61 ++++++++++++++++++++++++-----------------------
>  1 file changed, 31 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 606224a..57c80a7 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -849,8 +849,7 @@ static void cpufreq_init_policy(struct cpufreq_policy *policy)
>
>  #ifdef CONFIG_HOTPLUG_CPU
>  static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
> -				  unsigned int cpu, struct device *dev,
> -				  bool frozen)
> +				  unsigned int cpu, struct device *dev)
>  {
>  	int ret = 0;
>  	unsigned long flags;
> @@ -881,9 +880,7 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
>  		}
>  	}
>
> -	/* Don't touch sysfs links during light-weight init */
> -	if (!frozen)
> -		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> +	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>
>  	return ret;
>  }
> @@ -930,6 +927,27 @@ err_free_policy:
>  	return NULL;
>  }
>
> +static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy)
> +{
> +	struct kobject *kobj;
> +	struct completion *cmp;
> +
> +	down_read(&policy->rwsem);
> +	kobj = &policy->kobj;
> +	cmp = &policy->kobj_unregister;
> +	up_read(&policy->rwsem);
> +	kobject_put(kobj);
> +
> +	/*
> +	 * We need to make sure that the underlying kobj is
> +	 * actually not referenced anymore by anybody before we
> +	 * proceed with unloading.
> +	 */
> +	pr_debug("waiting for dropping of refcount\n");
> +	wait_for_completion(cmp);
> +	pr_debug("wait complete\n");
> +}
> +
>  static void cpufreq_policy_free(struct cpufreq_policy *policy)
>  {
>  	free_cpumask_var(policy->related_cpus);
> @@ -990,7 +1008,7 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
>  	list_for_each_entry(tpolicy, &cpufreq_policy_list, policy_list) {
>  		if (cpumask_test_cpu(cpu, tpolicy->related_cpus)) {
>  			read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> -			ret = cpufreq_add_policy_cpu(tpolicy, cpu, dev, frozen);
> +			ret = cpufreq_add_policy_cpu(tpolicy, cpu, dev);
>  			up_read(&cpufreq_rwsem);
>  			return ret;
>  		}
> @@ -1100,7 +1118,10 @@ err_get_freq:
>  	if (cpufreq_driver->exit)
>  		cpufreq_driver->exit(policy);
>  err_set_policy_cpu:
> +	if (frozen)
> +		cpufreq_policy_put_kobj(policy);
>  	cpufreq_policy_free(policy);
> +
>  nomem_out:
>  	up_read(&cpufreq_rwsem);
>
> @@ -1122,7 +1143,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
>  }
>
>  static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy,
> -					   unsigned int old_cpu, bool frozen)
> +					   unsigned int old_cpu)
>  {
>  	struct device *cpu_dev;
>  	int ret;
> @@ -1130,10 +1151,6 @@ static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy,
>  	/* first sibling now owns the new sysfs dir */
>  	cpu_dev = get_cpu_device(cpumask_any_but(policy->cpus, old_cpu));
>
> -	/* Don't touch sysfs files during light-weight tear-down */
> -	if (frozen)
> -		return cpu_dev->id;
> -
>  	sysfs_remove_link(&cpu_dev->kobj, "cpufreq");
>  	ret = kobject_move(&policy->kobj, &cpu_dev->kobj);
>  	if (ret) {
> @@ -1200,7 +1217,7 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
>  		if (!frozen)
>  			sysfs_remove_link(&dev->kobj, "cpufreq");
>  	} else if (cpus > 1) {
> -		new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu, frozen);
> +		new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu);
>  		if (new_cpu >= 0) {
>  			update_policy_cpu(policy, new_cpu);
>
> @@ -1222,8 +1239,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>  	int ret;
>  	unsigned long flags;
>  	struct cpufreq_policy *policy;
> -	struct kobject *kobj;
> -	struct completion *cmp;
>
>  	read_lock_irqsave(&cpufreq_driver_lock, flags);
>  	policy = per_cpu(cpufreq_cpu_data, cpu);
> @@ -1253,22 +1268,8 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>  			}
>  		}
>
> -		if (!frozen) {
> -			down_read(&policy->rwsem);
> -			kobj = &policy->kobj;
> -			cmp = &policy->kobj_unregister;
> -			up_read(&policy->rwsem);
> -			kobject_put(kobj);
> -
> -			/*
> -			 * We need to make sure that the underlying kobj is
> -			 * actually not referenced anymore by anybody before we
> -			 * proceed with unloading.
> -			 */
> -			pr_debug("waiting for dropping of refcount\n");
> -			wait_for_completion(cmp);
> -			pr_debug("wait complete\n");
> -		}
> +		if (!frozen)
> +			cpufreq_policy_put_kobj(policy);
>
>  		/*
>  		 * Perform the ->exit() even during light-weight tear-down,

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-04 12:08   ` Bjørn Mork
@ 2013-12-04 14:41     ` Viresh Kumar
  2013-12-04 15:41       ` Bjørn Mork
  0 siblings, 1 reply; 28+ messages in thread
From: Viresh Kumar @ 2013-12-04 14:41 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: cpufreq, linux-pm, Srivatsa S. Bhat, Rafael J. Wysocki

On 4 December 2013 17:38, Bjørn Mork <bjorn@mork.no> wrote:
> Yes, that patch looks like it fix the problem, and I cannot spot any
> obvious missing cleanups.

Great!!

> But I must say that the whole kobj + free thing makes me feel a bit
> uneasy.  I assume there are good reasons why "cpufreq" isn't just a
> device with a release method?

I didn't get it completely.. How exactly you wanted it to be..

> And I do hope there is something gained by the special suspend handling,
> because keeping this partial device looks really messy. If it's all just
> for some file permissions, then there must be better ways?

We didn't found many and so Srivatsa went with this way.. If you have a
better way then we are all for it :)

> Isn't consistent file permissions on device creation really a userspace issue?

Yes, normally it is.. But suspend/resume or hibernation isn't a userspace
problem at all. We had a discussion earlier on it and realized that kernel
is screwing it up and so it should get it back. Userspace doesn't have to
know how cpufreq removed these directories/files and added them back
on resume.

> Anyway, the patch works so you can add
>
> Tested-by: Bjørn Mork <bjorn@mork.no>

Great. Rafael can take this with a revert of your patch now..

> BTW, a simple way to test these things is cancelling a userspace
> hibernate on an ACPI system.  It will always make the acpi_cpufreq
> driver fail because it attemps to call _PPC inbetween
> acpi_ec_block_transactions and acpi_ec_unblock_transactions.  There is
> no acpi_ec_unblock_transactions_early call in this case.

I did test it on my thinkpad with following hack:

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 57c80a7..ebaec3d 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1046,6 +1046,10 @@ static int __cpufreq_add_dev(struct device
*dev, struct subsys_interface *sif,
        /* call driver. From then on the cpufreq must be able
         * to accept all calls to ->verify and ->setpolicy for this CPU
         */
+       if (frozen) {
+               ret = -EBUSY;
+               goto err_set_policy_cpu;
+       }
+
        ret = cpufreq_driver->init(policy);
        if (ret) {
                pr_debug("initialization failed\n");

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-04 14:41     ` Viresh Kumar
@ 2013-12-04 15:41       ` Bjørn Mork
       [not found]         ` <CAKohponu3Fu=WaBHXP1iBJM87V9g=+hDPe=M168U_weODenZdQ@mail.gmail.com>
  0 siblings, 1 reply; 28+ messages in thread
From: Bjørn Mork @ 2013-12-04 15:41 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: cpufreq, linux-pm, Srivatsa S. Bhat, Rafael J. Wysocki

Viresh Kumar <viresh.kumar@linaro.org> writes:
> On 4 December 2013 17:38, Bjørn Mork <bjorn@mork.no> wrote:
>> Yes, that patch looks like it fix the problem, and I cannot spot any
>> obvious missing cleanups.
>
> Great!!
>
>> But I must say that the whole kobj + free thing makes me feel a bit
>> uneasy.  I assume there are good reasons why "cpufreq" isn't just a
>> device with a release method?
>
> I didn't get it completely.. How exactly you wanted it to be..

Like a normal "struct device".  All that manual fiddling with sysfs
files and kobj removal etc looks so complicated and error prone to me.
But this is likely because I don't understand the code at all.

Take my comments for what they are worth.  I'm not claiming any
understanding at all here....

>> And I do hope there is something gained by the special suspend handling,
>> because keeping this partial device looks really messy. If it's all just
>> for some file permissions, then there must be better ways?
>
> We didn't found many and so Srivatsa went with this way.. If you have a
> better way then we are all for it :)
>
>> Isn't consistent file permissions on device creation really a userspace issue?
>
> Yes, normally it is.. But suspend/resume or hibernation isn't a userspace
> problem at all.

It is a userspace problem for other devices which are unplugged and
plugged on suspend/resume. CPUs are obviously exceptional for some
reason. This is what I don't understand.  I assume the file permissions
are set up by userspace when the device is first created? Why doesn't
the same apply to device creation after a resume hotplug?

You have made the very wise decision to handle suspend/resume of non
boot CPUs as hotplugging events.  But then you totally destroy this nice
and simple model by adding lots of code implementing some very device
specific suspend/resume handling.  That confuses me.  And it's not like
you save anything valuable over the suspend/resume.  You want to save
file permissions of sysfs attributes? Who cares?  They should be set by
udev rules anyway.

> We had a discussion earlier on it and realized that kernel
> is screwing it up and so it should get it back. Userspace doesn't have to
> know how cpufreq removed these directories/files and added them back
> on resume.

Userspace should see this as CPUs going away and then being added back,
right? If userspace set the file permissions the first time the CPU was
added, why can't it do so the next time as well?

This wouldn't have been such a big deal if the workaorund was a simple
line or two.  But it is not.



Bjørn

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-04 10:32 ` viresh kumar
  2013-12-04 12:08   ` Bjørn Mork
@ 2013-12-05  1:29   ` Rafael J. Wysocki
  2013-12-09  2:59     ` Lan Tianyu
  1 sibling, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2013-12-05  1:29 UTC (permalink / raw)
  To: viresh kumar
  Cc: Bjørn Mork, cpufreq, linux-pm, Srivatsa S. Bhat, Rafael J. Wysocki

On Wednesday, December 04, 2013 04:02:18 PM viresh kumar wrote:
> On Tuesday 03 December 2013 04:44 PM, Bjørn Mork wrote:
> > This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
> > light-weight init/teardown during suspend/resume"), which enabled
> > suspend/resume optimizations leaving the sysfs files in place.
> > 
> > Errors during suspend/resume are not handled properly, leaving
> > dead sysfs attributes in case of failures.  There are are number of
> > functions with special code for the "frozen" case, and all these
> > need to also have special error handling.
> 
> Can you try this please if it fixes things for you (with your patch reverted):
> 
> From: Viresh Kumar <viresh.kumar@linaro.org>
> Date: Wed, 4 Dec 2013 15:20:12 +0530
> Subject: [PATCH] cpufreq: remove sysfs files for CPU which failed to come
>  back after resume
> 
> There are cases where cpufreq_add_dev() may fail for some CPUs during resume.
> With the current code we will still have sysfs cpufreq files for such CPUs, and
> struct cpufreq_policy would be already freed for them. Hence any operation on
> those sysfs files would result in kernel warnings.
> 
> To fix this, lets remove those sysfs files or put the associated kobject in case
> of such errors. Also, to make it simple lets remove the sysfs links from all the
> CPUs (except policy->cpu) during suspend as that operation wouldn't result with a
> loss of sysfs file permissions. And so we will create those links during resume
> as well.
> 
> Reported-by: Bjørn Mork <bjorn@mork.no>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

There's no way this can go into 3.13, even though it fixes the problem for
Bjorn.

I took the Bjorn's patch for 3.13 and this one I can queued up for 3.14,
but for that I guess it should contain a revert of the change made by the
Bjorn's patch.

Thanks!

> ---
>  drivers/cpufreq/cpufreq.c | 61 ++++++++++++++++++++++++-----------------------
>  1 file changed, 31 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 606224a..57c80a7 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -849,8 +849,7 @@ static void cpufreq_init_policy(struct cpufreq_policy *policy)
> 
>  #ifdef CONFIG_HOTPLUG_CPU
>  static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
> -				  unsigned int cpu, struct device *dev,
> -				  bool frozen)
> +				  unsigned int cpu, struct device *dev)
>  {
>  	int ret = 0;
>  	unsigned long flags;
> @@ -881,9 +880,7 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
>  		}
>  	}
> 
> -	/* Don't touch sysfs links during light-weight init */
> -	if (!frozen)
> -		ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> +	ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
> 
>  	return ret;
>  }
> @@ -930,6 +927,27 @@ err_free_policy:
>  	return NULL;
>  }
> 
> +static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy)
> +{
> +	struct kobject *kobj;
> +	struct completion *cmp;
> +
> +	down_read(&policy->rwsem);
> +	kobj = &policy->kobj;
> +	cmp = &policy->kobj_unregister;
> +	up_read(&policy->rwsem);
> +	kobject_put(kobj);
> +
> +	/*
> +	 * We need to make sure that the underlying kobj is
> +	 * actually not referenced anymore by anybody before we
> +	 * proceed with unloading.
> +	 */
> +	pr_debug("waiting for dropping of refcount\n");
> +	wait_for_completion(cmp);
> +	pr_debug("wait complete\n");
> +}
> +
>  static void cpufreq_policy_free(struct cpufreq_policy *policy)
>  {
>  	free_cpumask_var(policy->related_cpus);
> @@ -990,7 +1008,7 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
>  	list_for_each_entry(tpolicy, &cpufreq_policy_list, policy_list) {
>  		if (cpumask_test_cpu(cpu, tpolicy->related_cpus)) {
>  			read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> -			ret = cpufreq_add_policy_cpu(tpolicy, cpu, dev, frozen);
> +			ret = cpufreq_add_policy_cpu(tpolicy, cpu, dev);
>  			up_read(&cpufreq_rwsem);
>  			return ret;
>  		}
> @@ -1100,7 +1118,10 @@ err_get_freq:
>  	if (cpufreq_driver->exit)
>  		cpufreq_driver->exit(policy);
>  err_set_policy_cpu:
> +	if (frozen)
> +		cpufreq_policy_put_kobj(policy);
>  	cpufreq_policy_free(policy);
> +
>  nomem_out:
>  	up_read(&cpufreq_rwsem);
> 
> @@ -1122,7 +1143,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
>  }
> 
>  static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy,
> -					   unsigned int old_cpu, bool frozen)
> +					   unsigned int old_cpu)
>  {
>  	struct device *cpu_dev;
>  	int ret;
> @@ -1130,10 +1151,6 @@ static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy,
>  	/* first sibling now owns the new sysfs dir */
>  	cpu_dev = get_cpu_device(cpumask_any_but(policy->cpus, old_cpu));
> 
> -	/* Don't touch sysfs files during light-weight tear-down */
> -	if (frozen)
> -		return cpu_dev->id;
> -
>  	sysfs_remove_link(&cpu_dev->kobj, "cpufreq");
>  	ret = kobject_move(&policy->kobj, &cpu_dev->kobj);
>  	if (ret) {
> @@ -1200,7 +1217,7 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
>  		if (!frozen)
>  			sysfs_remove_link(&dev->kobj, "cpufreq");
>  	} else if (cpus > 1) {
> -		new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu, frozen);
> +		new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu);
>  		if (new_cpu >= 0) {
>  			update_policy_cpu(policy, new_cpu);
> 
> @@ -1222,8 +1239,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>  	int ret;
>  	unsigned long flags;
>  	struct cpufreq_policy *policy;
> -	struct kobject *kobj;
> -	struct completion *cmp;
> 
>  	read_lock_irqsave(&cpufreq_driver_lock, flags);
>  	policy = per_cpu(cpufreq_cpu_data, cpu);
> @@ -1253,22 +1268,8 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>  			}
>  		}
> 
> -		if (!frozen) {
> -			down_read(&policy->rwsem);
> -			kobj = &policy->kobj;
> -			cmp = &policy->kobj_unregister;
> -			up_read(&policy->rwsem);
> -			kobject_put(kobj);
> -
> -			/*
> -			 * We need to make sure that the underlying kobj is
> -			 * actually not referenced anymore by anybody before we
> -			 * proceed with unloading.
> -			 */
> -			pr_debug("waiting for dropping of refcount\n");
> -			wait_for_completion(cmp);
> -			pr_debug("wait complete\n");
> -		}
> +		if (!frozen)
> +			cpufreq_policy_put_kobj(policy);
> 
>  		/*
>  		 * Perform the ->exit() even during light-weight tear-down,
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
       [not found]           ` <878uvzyecg.fsf@nemi.mork.no>
@ 2013-12-05 12:41               ` Srivatsa S. Bhat
  0 siblings, 0 replies; 28+ messages in thread
From: Srivatsa S. Bhat @ 2013-12-05 12:41 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: Viresh Kumar, Rafael J. Wysocki, Linux PM mailing list, cpufreq

On 12/05/2013 12:27 PM, Bjørn Mork wrote:
> Viresh Kumar <viresh.kumar@linaro.org> writes:
> 
>> Sending from phone.. html.. so left list.
>>
>> Here is the old thread where we discussed this.. see if this helps..
>>
>> http://marc.info/?t=136845016900003&r=1&w=2
> 
> Thanks.  That helped a lot.
> 
> Unless I miss something, it looks like the permission problem *started*
> with fallout from special suspend code - surprising the user by not
> creating any offline/online event on suspend/resume.  Quoting from
> http://marc.info/?l=linux-pm&m=136847781510358 :
> 
>   (And yes, even code-wise, we use a slightly different
>    path in the S3 code, to initiate hotplug. That's why the uevents
>    are by-passed.)
> 

I hope you didn't miss the main idea I was trying to convey in that
reply: 
"IMHO, using CPU hotplug (offline/online of CPUs) in the S3 path is
supposed to be totally internal to the suspend/resume code. It is not
intended for userspace to know that we are internally offlining/
onlining CPUs."

...

"The user initiated an S3 operation, not CPU hotplug. So there is
no reason to surprise the user with unexpected events. Put another
way, in the future, if we change the kernel code to do S3 without
using hotplug, then there should be no visible change in userspace,
because how S3 is handled in the kernel is intended to be an "internal"
operation."


> So instead of going in the direction of even more special treatment to
> hide the fact that a offline/online is done, you could also have solved
> the problem by removed the existing special treatment.  That would
> likely have simplified the code and made it do what userspace expects.
> 

You seem to be getting confused as to what the *actual* userspace
expectation was, in that mail thread. The expectation was that suspend/
resume is a kernel operation that brings back the system to the same
state (as much as possible) at the end of resume, as it was before
suspend. And that is a perfectly valid expectation, and it is something
that the kernel has to try its level best to honor.

And in this particular case, the specific expectation was that the sysfs
file permissions set by the user for cpufreq files will remain as it is
even after a suspend/resume cycle. That's it. There is _absolutely_ _no_
talk about CPU hotplug here.

Robert _happened_ to dig this further and observe that suspend/resume
actually does offline/online of CPUs, and thought that he should have
also seen the associated udev events as well. But we have purposefully
not exposed the fact that suspend/resume involves CPU hotplug. Today,
suspend/resume uses CPU hotplug internally because we don't have any
other better alternative. The very concept/semantic of suspend/resume
_does_ _not_ imply CPU hotplug - it is just an implementation detail
that userspace should not need to care about or rely on.

Moreover, cpufreq is not the only subsystem that participates in suspend/
resume and CPU hotplug. And fundamentally, regular CPU hotplug has _very_
different semantics and guarantees than the hotplug done in suspend/resume.
For example, if you offline CPUs normally, the cpusets associated with
those CPUs will get destroyed, potentially in ways that _won't_ bring
them back to the same state when you online those exact same CPUs!
And this would have been totally unacceptable to a user innocuously
using suspend/resume. Look at commit d35be8bab9 (CPU hotplug, cpusets,
suspend: Don't modify cpusets during suspend/resume), for more details
on how we fixed that problem.


So, to summarize, suspend/resume should mean one simple thing to
userspace - "the kernel will transparently (to the extent possible)
perform suspend/resume and bring back the system during resume, to a
state as close as possible compared to how it was before suspend".
Any implementation challenges must be handled in the kernel (as far as
possible), and we should avoid burdening userspace with extraneous
events etc.

> You have chosen to do offline+online.  Userspace expects to see the
> offline+online.  Don't try to hide the implementation.  That's only
> confusing.
> 

I hope my above explanation answers your questions. We are not trying
to hide the implementation for no reason. We want to do it to preserve
sane semantics for suspend/resume. And that effectively translates to
"provide userspace with a transparent suspend/resume operation".

 
Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
@ 2013-12-05 12:41               ` Srivatsa S. Bhat
  0 siblings, 0 replies; 28+ messages in thread
From: Srivatsa S. Bhat @ 2013-12-05 12:41 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: Viresh Kumar, Rafael J. Wysocki, Linux PM mailing list, cpufreq

On 12/05/2013 12:27 PM, Bjørn Mork wrote:
> Viresh Kumar <viresh.kumar@linaro.org> writes:
> 
>> Sending from phone.. html.. so left list.
>>
>> Here is the old thread where we discussed this.. see if this helps..
>>
>> http://marc.info/?t=136845016900003&r=1&w=2
> 
> Thanks.  That helped a lot.
> 
> Unless I miss something, it looks like the permission problem *started*
> with fallout from special suspend code - surprising the user by not
> creating any offline/online event on suspend/resume.  Quoting from
> http://marc.info/?l=linux-pm&m=136847781510358 :
> 
>   (And yes, even code-wise, we use a slightly different
>    path in the S3 code, to initiate hotplug. That's why the uevents
>    are by-passed.)
> 

I hope you didn't miss the main idea I was trying to convey in that
reply: 
"IMHO, using CPU hotplug (offline/online of CPUs) in the S3 path is
supposed to be totally internal to the suspend/resume code. It is not
intended for userspace to know that we are internally offlining/
onlining CPUs."

...

"The user initiated an S3 operation, not CPU hotplug. So there is
no reason to surprise the user with unexpected events. Put another
way, in the future, if we change the kernel code to do S3 without
using hotplug, then there should be no visible change in userspace,
because how S3 is handled in the kernel is intended to be an "internal"
operation."


> So instead of going in the direction of even more special treatment to
> hide the fact that a offline/online is done, you could also have solved
> the problem by removed the existing special treatment.  That would
> likely have simplified the code and made it do what userspace expects.
> 

You seem to be getting confused as to what the *actual* userspace
expectation was, in that mail thread. The expectation was that suspend/
resume is a kernel operation that brings back the system to the same
state (as much as possible) at the end of resume, as it was before
suspend. And that is a perfectly valid expectation, and it is something
that the kernel has to try its level best to honor.

And in this particular case, the specific expectation was that the sysfs
file permissions set by the user for cpufreq files will remain as it is
even after a suspend/resume cycle. That's it. There is _absolutely_ _no_
talk about CPU hotplug here.

Robert _happened_ to dig this further and observe that suspend/resume
actually does offline/online of CPUs, and thought that he should have
also seen the associated udev events as well. But we have purposefully
not exposed the fact that suspend/resume involves CPU hotplug. Today,
suspend/resume uses CPU hotplug internally because we don't have any
other better alternative. The very concept/semantic of suspend/resume
_does_ _not_ imply CPU hotplug - it is just an implementation detail
that userspace should not need to care about or rely on.

Moreover, cpufreq is not the only subsystem that participates in suspend/
resume and CPU hotplug. And fundamentally, regular CPU hotplug has _very_
different semantics and guarantees than the hotplug done in suspend/resume.
For example, if you offline CPUs normally, the cpusets associated with
those CPUs will get destroyed, potentially in ways that _won't_ bring
them back to the same state when you online those exact same CPUs!
And this would have been totally unacceptable to a user innocuously
using suspend/resume. Look at commit d35be8bab9 (CPU hotplug, cpusets,
suspend: Don't modify cpusets during suspend/resume), for more details
on how we fixed that problem.


So, to summarize, suspend/resume should mean one simple thing to
userspace - "the kernel will transparently (to the extent possible)
perform suspend/resume and bring back the system during resume, to a
state as close as possible compared to how it was before suspend".
Any implementation challenges must be handled in the kernel (as far as
possible), and we should avoid burdening userspace with extraneous
events etc.

> You have chosen to do offline+online.  Userspace expects to see the
> offline+online.  Don't try to hide the implementation.  That's only
> confusing.
> 

I hope my above explanation answers your questions. We are not trying
to hide the implementation for no reason. We want to do it to preserve
sane semantics for suspend/resume. And that effectively translates to
"provide userspace with a transparent suspend/resume operation".

 
Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-05 12:41               ` Srivatsa S. Bhat
@ 2013-12-05 13:21                 ` Bjørn Mork
  -1 siblings, 0 replies; 28+ messages in thread
From: Bjørn Mork @ 2013-12-05 13:21 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Viresh Kumar, Rafael J. Wysocki, Linux PM mailing list, cpufreq

"Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> writes:

> You seem to be getting confused as to what the *actual* userspace
> expectation was, in that mail thread. The expectation was that suspend/
> resume is a kernel operation that brings back the system to the same
> state (as much as possible) at the end of resume, as it was before
> suspend. And that is a perfectly valid expectation, and it is something
> that the kernel has to try its level best to honor.

Our understanding of the thread is obviously biased by our respective
views of the matter ;-)

> And in this particular case, the specific expectation was that the sysfs
> file permissions set by the user for cpufreq files will remain as it is
> even after a suspend/resume cycle. That's it. There is _absolutely_ _no_
> talk about CPU hotplug here.

This is not a cpufreq problem.  It is a CPU problem.  You are only
complicating things by trying to solve it in cpufreq.  What about
permissions on the other CPU sysfs attributes?

> Robert _happened_ to dig this further and observe that suspend/resume
> actually does offline/online of CPUs, and thought that he should have
> also seen the associated udev events as well. But we have purposefully
> not exposed the fact that suspend/resume involves CPU hotplug.

Well, that is a bug in my opinion.  You are adding lots of kernel code
to avoid a very simple userspace configuration.  Seems very backwards.
This is mostly a documentation problem, if a problem at all.  I do note
that Robert did try the udev way before concluding that this was a
kernel bug.

> Today,
> suspend/resume uses CPU hotplug internally because we don't have any
> other better alternative. The very concept/semantic of suspend/resume
> _does_ _not_ imply CPU hotplug - it is just an implementation detail
> that userspace should not need to care about or rely on.
>
> Moreover, cpufreq is not the only subsystem that participates in suspend/
> resume and CPU hotplug. And fundamentally, regular CPU hotplug has _very_
> different semantics and guarantees than the hotplug done in suspend/resume.
> For example, if you offline CPUs normally, the cpusets associated with
> those CPUs will get destroyed, potentially in ways that _won't_ bring
> them back to the same state when you online those exact same CPUs!
> And this would have been totally unacceptable to a user innocuously
> using suspend/resume. Look at commit d35be8bab9 (CPU hotplug, cpusets,
> suspend: Don't modify cpusets during suspend/resume), for more details
> on how we fixed that problem.

Yes, I do see that point.  The special suspend/resume handling is
definitely necessary in this case.

> So, to summarize, suspend/resume should mean one simple thing to
> userspace - "the kernel will transparently (to the extent possible)
> perform suspend/resume and bring back the system during resume, to a
> state as close as possible compared to how it was before suspend".
> Any implementation challenges must be handled in the kernel (as far as
> possible), and we should avoid burdening userspace with extraneous
> events etc.

OK, I think we may have different views on how much kernel
implementation details userspace need to know :-)

That's fine, and it really was not my intention to question the way
cpufreq handled this.  It all just seemed so meaningless to me.  I now
understand that it has a meaning to you, which of course is what
matters.

I am glad I could help removing the most obvious bugs.


Bjørn


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
@ 2013-12-05 13:21                 ` Bjørn Mork
  0 siblings, 0 replies; 28+ messages in thread
From: Bjørn Mork @ 2013-12-05 13:21 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Viresh Kumar, Rafael J. Wysocki, Linux PM mailing list, cpufreq

"Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> writes:

> You seem to be getting confused as to what the *actual* userspace
> expectation was, in that mail thread. The expectation was that suspend/
> resume is a kernel operation that brings back the system to the same
> state (as much as possible) at the end of resume, as it was before
> suspend. And that is a perfectly valid expectation, and it is something
> that the kernel has to try its level best to honor.

Our understanding of the thread is obviously biased by our respective
views of the matter ;-)

> And in this particular case, the specific expectation was that the sysfs
> file permissions set by the user for cpufreq files will remain as it is
> even after a suspend/resume cycle. That's it. There is _absolutely_ _no_
> talk about CPU hotplug here.

This is not a cpufreq problem.  It is a CPU problem.  You are only
complicating things by trying to solve it in cpufreq.  What about
permissions on the other CPU sysfs attributes?

> Robert _happened_ to dig this further and observe that suspend/resume
> actually does offline/online of CPUs, and thought that he should have
> also seen the associated udev events as well. But we have purposefully
> not exposed the fact that suspend/resume involves CPU hotplug.

Well, that is a bug in my opinion.  You are adding lots of kernel code
to avoid a very simple userspace configuration.  Seems very backwards.
This is mostly a documentation problem, if a problem at all.  I do note
that Robert did try the udev way before concluding that this was a
kernel bug.

> Today,
> suspend/resume uses CPU hotplug internally because we don't have any
> other better alternative. The very concept/semantic of suspend/resume
> _does_ _not_ imply CPU hotplug - it is just an implementation detail
> that userspace should not need to care about or rely on.
>
> Moreover, cpufreq is not the only subsystem that participates in suspend/
> resume and CPU hotplug. And fundamentally, regular CPU hotplug has _very_
> different semantics and guarantees than the hotplug done in suspend/resume.
> For example, if you offline CPUs normally, the cpusets associated with
> those CPUs will get destroyed, potentially in ways that _won't_ bring
> them back to the same state when you online those exact same CPUs!
> And this would have been totally unacceptable to a user innocuously
> using suspend/resume. Look at commit d35be8bab9 (CPU hotplug, cpusets,
> suspend: Don't modify cpusets during suspend/resume), for more details
> on how we fixed that problem.

Yes, I do see that point.  The special suspend/resume handling is
definitely necessary in this case.

> So, to summarize, suspend/resume should mean one simple thing to
> userspace - "the kernel will transparently (to the extent possible)
> perform suspend/resume and bring back the system during resume, to a
> state as close as possible compared to how it was before suspend".
> Any implementation challenges must be handled in the kernel (as far as
> possible), and we should avoid burdening userspace with extraneous
> events etc.

OK, I think we may have different views on how much kernel
implementation details userspace need to know :-)

That's fine, and it really was not my intention to question the way
cpufreq handled this.  It all just seemed so meaningless to me.  I now
understand that it has a meaning to you, which of course is what
matters.

I am glad I could help removing the most obvious bugs.


Bj√∏rn


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-05 12:41               ` Srivatsa S. Bhat
@ 2013-12-05 22:29                 ` Rafael J. Wysocki
  -1 siblings, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2013-12-05 22:29 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Bjørn Mork, Viresh Kumar, Linux PM mailing list, cpufreq

On Thursday, December 05, 2013 06:11:19 PM Srivatsa S. Bhat wrote:
> On 12/05/2013 12:27 PM, Bjørn Mork wrote:
> > Viresh Kumar <viresh.kumar@linaro.org> writes:
> > 
> >> Sending from phone.. html.. so left list.
> >>
> >> Here is the old thread where we discussed this.. see if this helps..
> >>
> >> http://marc.info/?t=136845016900003&r=1&w=2
> > 
> > Thanks.  That helped a lot.
> > 
> > Unless I miss something, it looks like the permission problem *started*
> > with fallout from special suspend code - surprising the user by not
> > creating any offline/online event on suspend/resume.  Quoting from
> > http://marc.info/?l=linux-pm&m=136847781510358 :
> > 
> >   (And yes, even code-wise, we use a slightly different
> >    path in the S3 code, to initiate hotplug. That's why the uevents
> >    are by-passed.)
> > 
> 
> I hope you didn't miss the main idea I was trying to convey in that
> reply: 
> "IMHO, using CPU hotplug (offline/online of CPUs) in the S3 path is
> supposed to be totally internal to the suspend/resume code. It is not
> intended for userspace to know that we are internally offlining/
> onlining CPUs."

By the way, in the meantime I discussed this with Viresh in the context of
a different (although related) fix and I suggested a different approach.

Namely, to split the CPU offline/online code into "core" and "add-ons"
parts, where the core part will do just whatever is needed to offline/online
CPU cores cleanly and the "add-ons" part will carry out the rest (e.g.
removal/addition of sysfs attributes and so on).

Then, the system suspend/resume code path will only run the "core" part
and whatever else CPU handling is needed during suspend/resume will be
carried out by the device suspend/resume code (via driver callbacks or
stuff similar to cpufreq_suspend() and cpufreq_resume() recently proposed
by Viresh).

In turn, the "runtime" CPU offline/online will carry out both the core
and the add-ons parts as it does today.

In my view this should address the problems we have with sysfs attributes,
governors start/stop etc. during system suspend/resume.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
@ 2013-12-05 22:29                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2013-12-05 22:29 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Bjørn Mork, Viresh Kumar, Linux PM mailing list, cpufreq

On Thursday, December 05, 2013 06:11:19 PM Srivatsa S. Bhat wrote:
> On 12/05/2013 12:27 PM, Bjørn Mork wrote:
> > Viresh Kumar <viresh.kumar@linaro.org> writes:
> > 
> >> Sending from phone.. html.. so left list.
> >>
> >> Here is the old thread where we discussed this.. see if this helps..
> >>
> >> http://marc.info/?t=136845016900003&r=1&w=2
> > 
> > Thanks.  That helped a lot.
> > 
> > Unless I miss something, it looks like the permission problem *started*
> > with fallout from special suspend code - surprising the user by not
> > creating any offline/online event on suspend/resume.  Quoting from
> > http://marc.info/?l=linux-pm&m=136847781510358 :
> > 
> >   (And yes, even code-wise, we use a slightly different
> >    path in the S3 code, to initiate hotplug. That's why the uevents
> >    are by-passed.)
> > 
> 
> I hope you didn't miss the main idea I was trying to convey in that
> reply: 
> "IMHO, using CPU hotplug (offline/online of CPUs) in the S3 path is
> supposed to be totally internal to the suspend/resume code. It is not
> intended for userspace to know that we are internally offlining/
> onlining CPUs."

By the way, in the meantime I discussed this with Viresh in the context of
a different (although related) fix and I suggested a different approach.

Namely, to split the CPU offline/online code into "core" and "add-ons"
parts, where the core part will do just whatever is needed to offline/online
CPU cores cleanly and the "add-ons" part will carry out the rest (e.g.
removal/addition of sysfs attributes and so on).

Then, the system suspend/resume code path will only run the "core" part
and whatever else CPU handling is needed during suspend/resume will be
carried out by the device suspend/resume code (via driver callbacks or
stuff similar to cpufreq_suspend() and cpufreq_resume() recently proposed
by Viresh).

In turn, the "runtime" CPU offline/online will carry out both the core
and the add-ons parts as it does today.

In my view this should address the problems we have with sysfs attributes,
governors start/stop etc. during system suspend/resume.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-05 22:29                 ` Rafael J. Wysocki
@ 2013-12-06  5:23                   ` Srivatsa S. Bhat
  -1 siblings, 0 replies; 28+ messages in thread
From: Srivatsa S. Bhat @ 2013-12-06  5:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjørn Mork, Viresh Kumar, Linux PM mailing list, cpufreq

On 12/06/2013 03:59 AM, Rafael J. Wysocki wrote:
> On Thursday, December 05, 2013 06:11:19 PM Srivatsa S. Bhat wrote:
>> On 12/05/2013 12:27 PM, Bjørn Mork wrote:
>>> Viresh Kumar <viresh.kumar@linaro.org> writes:
>>>
>>>> Sending from phone.. html.. so left list.
>>>>
>>>> Here is the old thread where we discussed this.. see if this helps..
>>>>
>>>> http://marc.info/?t=136845016900003&r=1&w=2
>>>
>>> Thanks.  That helped a lot.
>>>
>>> Unless I miss something, it looks like the permission problem *started*
>>> with fallout from special suspend code - surprising the user by not
>>> creating any offline/online event on suspend/resume.  Quoting from
>>> http://marc.info/?l=linux-pm&m=136847781510358 :
>>>
>>>   (And yes, even code-wise, we use a slightly different
>>>    path in the S3 code, to initiate hotplug. That's why the uevents
>>>    are by-passed.)
>>>
>>
>> I hope you didn't miss the main idea I was trying to convey in that
>> reply: 
>> "IMHO, using CPU hotplug (offline/online of CPUs) in the S3 path is
>> supposed to be totally internal to the suspend/resume code. It is not
>> intended for userspace to know that we are internally offlining/
>> onlining CPUs."
> 
> By the way, in the meantime I discussed this with Viresh in the context of
> a different (although related) fix and I suggested a different approach.
> 
> Namely, to split the CPU offline/online code into "core" and "add-ons"
> parts, where the core part will do just whatever is needed to offline/online
> CPU cores cleanly and the "add-ons" part will carry out the rest (e.g.
> removal/addition of sysfs attributes and so on).
> 
> Then, the system suspend/resume code path will only run the "core" part
> and whatever else CPU handling is needed during suspend/resume will be
> carried out by the device suspend/resume code (via driver callbacks or
> stuff similar to cpufreq_suspend() and cpufreq_resume() recently proposed
> by Viresh).
> 
> In turn, the "runtime" CPU offline/online will carry out both the core
> and the add-ons parts as it does today.
> 
> In my view this should address the problems we have with sysfs attributes,
> governors start/stop etc. during system suspend/resume.
> 

Hmm, yes that sounds like a good idea. Are you suggesting this "core" and
"add-on" split only for the cpufreq parts of CPU hotplug or for everything
involved in CPU hotplug code? IIUC you are suggesting the latter, which is
likely to be a significant undertaking, but very well worth it in the long
run, since it gives us an elegant solution for all these problems.

I guess the *_FROZEN CPU hotplug notifications were originally introduced
to provide us an infrastructure to do something like this, but obviously
that hasn't worked out very well. So I agree that a fundamental restructuring
is in order, to cure all the innumerable problems.

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
@ 2013-12-06  5:23                   ` Srivatsa S. Bhat
  0 siblings, 0 replies; 28+ messages in thread
From: Srivatsa S. Bhat @ 2013-12-06  5:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Bjørn Mork, Viresh Kumar, Linux PM mailing list, cpufreq

On 12/06/2013 03:59 AM, Rafael J. Wysocki wrote:
> On Thursday, December 05, 2013 06:11:19 PM Srivatsa S. Bhat wrote:
>> On 12/05/2013 12:27 PM, Bjørn Mork wrote:
>>> Viresh Kumar <viresh.kumar@linaro.org> writes:
>>>
>>>> Sending from phone.. html.. so left list.
>>>>
>>>> Here is the old thread where we discussed this.. see if this helps..
>>>>
>>>> http://marc.info/?t=136845016900003&r=1&w=2
>>>
>>> Thanks.  That helped a lot.
>>>
>>> Unless I miss something, it looks like the permission problem *started*
>>> with fallout from special suspend code - surprising the user by not
>>> creating any offline/online event on suspend/resume.  Quoting from
>>> http://marc.info/?l=linux-pm&m=136847781510358 :
>>>
>>>   (And yes, even code-wise, we use a slightly different
>>>    path in the S3 code, to initiate hotplug. That's why the uevents
>>>    are by-passed.)
>>>
>>
>> I hope you didn't miss the main idea I was trying to convey in that
>> reply: 
>> "IMHO, using CPU hotplug (offline/online of CPUs) in the S3 path is
>> supposed to be totally internal to the suspend/resume code. It is not
>> intended for userspace to know that we are internally offlining/
>> onlining CPUs."
> 
> By the way, in the meantime I discussed this with Viresh in the context of
> a different (although related) fix and I suggested a different approach.
> 
> Namely, to split the CPU offline/online code into "core" and "add-ons"
> parts, where the core part will do just whatever is needed to offline/online
> CPU cores cleanly and the "add-ons" part will carry out the rest (e.g.
> removal/addition of sysfs attributes and so on).
> 
> Then, the system suspend/resume code path will only run the "core" part
> and whatever else CPU handling is needed during suspend/resume will be
> carried out by the device suspend/resume code (via driver callbacks or
> stuff similar to cpufreq_suspend() and cpufreq_resume() recently proposed
> by Viresh).
> 
> In turn, the "runtime" CPU offline/online will carry out both the core
> and the add-ons parts as it does today.
> 
> In my view this should address the problems we have with sysfs attributes,
> governors start/stop etc. during system suspend/resume.
> 

Hmm, yes that sounds like a good idea. Are you suggesting this "core" and
"add-on" split only for the cpufreq parts of CPU hotplug or for everything
involved in CPU hotplug code? IIUC you are suggesting the latter, which is
likely to be a significant undertaking, but very well worth it in the long
run, since it gives us an elegant solution for all these problems.

I guess the *_FROZEN CPU hotplug notifications were originally introduced
to provide us an infrastructure to do something like this, but obviously
that hasn't worked out very well. So I agree that a fundamental restructuring
is in order, to cure all the innumerable problems.

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-06  5:23                   ` Srivatsa S. Bhat
@ 2013-12-07  1:17                     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2013-12-07  1:17 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Bjørn Mork, Viresh Kumar, Linux PM mailing list, cpufreq

On Friday, December 06, 2013 10:53:16 AM Srivatsa S. Bhat wrote:
> On 12/06/2013 03:59 AM, Rafael J. Wysocki wrote:
> > On Thursday, December 05, 2013 06:11:19 PM Srivatsa S. Bhat wrote:
> >> On 12/05/2013 12:27 PM, Bjørn Mork wrote:
> >>> Viresh Kumar <viresh.kumar@linaro.org> writes:
> >>>
> >>>> Sending from phone.. html.. so left list.
> >>>>
> >>>> Here is the old thread where we discussed this.. see if this helps..
> >>>>
> >>>> http://marc.info/?t=136845016900003&r=1&w=2
> >>>
> >>> Thanks.  That helped a lot.
> >>>
> >>> Unless I miss something, it looks like the permission problem *started*
> >>> with fallout from special suspend code - surprising the user by not
> >>> creating any offline/online event on suspend/resume.  Quoting from
> >>> http://marc.info/?l=linux-pm&m=136847781510358 :
> >>>
> >>>   (And yes, even code-wise, we use a slightly different
> >>>    path in the S3 code, to initiate hotplug. That's why the uevents
> >>>    are by-passed.)
> >>>
> >>
> >> I hope you didn't miss the main idea I was trying to convey in that
> >> reply: 
> >> "IMHO, using CPU hotplug (offline/online of CPUs) in the S3 path is
> >> supposed to be totally internal to the suspend/resume code. It is not
> >> intended for userspace to know that we are internally offlining/
> >> onlining CPUs."
> > 
> > By the way, in the meantime I discussed this with Viresh in the context of
> > a different (although related) fix and I suggested a different approach.
> > 
> > Namely, to split the CPU offline/online code into "core" and "add-ons"
> > parts, where the core part will do just whatever is needed to offline/online
> > CPU cores cleanly and the "add-ons" part will carry out the rest (e.g.
> > removal/addition of sysfs attributes and so on).
> > 
> > Then, the system suspend/resume code path will only run the "core" part
> > and whatever else CPU handling is needed during suspend/resume will be
> > carried out by the device suspend/resume code (via driver callbacks or
> > stuff similar to cpufreq_suspend() and cpufreq_resume() recently proposed
> > by Viresh).
> > 
> > In turn, the "runtime" CPU offline/online will carry out both the core
> > and the add-ons parts as it does today.
> > 
> > In my view this should address the problems we have with sysfs attributes,
> > governors start/stop etc. during system suspend/resume.
> > 
> 
> Hmm, yes that sounds like a good idea. Are you suggesting this "core" and
> "add-on" split only for the cpufreq parts of CPU hotplug or for everything
> involved in CPU hotplug code? IIUC you are suggesting the latter, which is
> likely to be a significant undertaking, but very well worth it in the long
> run, since it gives us an elegant solution for all these problems.

Yeah, and I'd like that to be done.

> I guess the *_FROZEN CPU hotplug notifications were originally introduced
> to provide us an infrastructure to do something like this, but obviously
> that hasn't worked out very well.

That's been a workaround to be honest.

> So I agree that a fundamental restructuring is in order, to cure all the
> innumerable problems.

Yes.

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
@ 2013-12-07  1:17                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2013-12-07  1:17 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Bjørn Mork, Viresh Kumar, Linux PM mailing list, cpufreq

On Friday, December 06, 2013 10:53:16 AM Srivatsa S. Bhat wrote:
> On 12/06/2013 03:59 AM, Rafael J. Wysocki wrote:
> > On Thursday, December 05, 2013 06:11:19 PM Srivatsa S. Bhat wrote:
> >> On 12/05/2013 12:27 PM, Bjørn Mork wrote:
> >>> Viresh Kumar <viresh.kumar@linaro.org> writes:
> >>>
> >>>> Sending from phone.. html.. so left list.
> >>>>
> >>>> Here is the old thread where we discussed this.. see if this helps..
> >>>>
> >>>> http://marc.info/?t=136845016900003&r=1&w=2
> >>>
> >>> Thanks.  That helped a lot.
> >>>
> >>> Unless I miss something, it looks like the permission problem *started*
> >>> with fallout from special suspend code - surprising the user by not
> >>> creating any offline/online event on suspend/resume.  Quoting from
> >>> http://marc.info/?l=linux-pm&m=136847781510358 :
> >>>
> >>>   (And yes, even code-wise, we use a slightly different
> >>>    path in the S3 code, to initiate hotplug. That's why the uevents
> >>>    are by-passed.)
> >>>
> >>
> >> I hope you didn't miss the main idea I was trying to convey in that
> >> reply: 
> >> "IMHO, using CPU hotplug (offline/online of CPUs) in the S3 path is
> >> supposed to be totally internal to the suspend/resume code. It is not
> >> intended for userspace to know that we are internally offlining/
> >> onlining CPUs."
> > 
> > By the way, in the meantime I discussed this with Viresh in the context of
> > a different (although related) fix and I suggested a different approach.
> > 
> > Namely, to split the CPU offline/online code into "core" and "add-ons"
> > parts, where the core part will do just whatever is needed to offline/online
> > CPU cores cleanly and the "add-ons" part will carry out the rest (e.g.
> > removal/addition of sysfs attributes and so on).
> > 
> > Then, the system suspend/resume code path will only run the "core" part
> > and whatever else CPU handling is needed during suspend/resume will be
> > carried out by the device suspend/resume code (via driver callbacks or
> > stuff similar to cpufreq_suspend() and cpufreq_resume() recently proposed
> > by Viresh).
> > 
> > In turn, the "runtime" CPU offline/online will carry out both the core
> > and the add-ons parts as it does today.
> > 
> > In my view this should address the problems we have with sysfs attributes,
> > governors start/stop etc. during system suspend/resume.
> > 
> 
> Hmm, yes that sounds like a good idea. Are you suggesting this "core" and
> "add-on" split only for the cpufreq parts of CPU hotplug or for everything
> involved in CPU hotplug code? IIUC you are suggesting the latter, which is
> likely to be a significant undertaking, but very well worth it in the long
> run, since it gives us an elegant solution for all these problems.

Yeah, and I'd like that to be done.

> I guess the *_FROZEN CPU hotplug notifications were originally introduced
> to provide us an infrastructure to do something like this, but obviously
> that hasn't worked out very well.

That's been a workaround to be honest.

> So I agree that a fundamental restructuring is in order, to cure all the
> innumerable problems.

Yes.

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-05  1:29   ` Rafael J. Wysocki
@ 2013-12-09  2:59     ` Lan Tianyu
  2013-12-09  6:48       ` Srivatsa S. Bhat
  0 siblings, 1 reply; 28+ messages in thread
From: Lan Tianyu @ 2013-12-09  2:59 UTC (permalink / raw)
  To: Rafael J. Wysocki, ziegler
  Cc: viresh kumar, Bjørn Mork, cpufreq, Linux PM list,
	Srivatsa S. Bhat, Rafael J. Wysocki

2013/12/5 Rafael J. Wysocki <rjw@rjwysocki.net>:
> On Wednesday, December 04, 2013 04:02:18 PM viresh kumar wrote:
>> On Tuesday 03 December 2013 04:44 PM, Bjørn Mork wrote:
>> > This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
>> > light-weight init/teardown during suspend/resume"), which enabled
>> > suspend/resume optimizations leaving the sysfs files in place.
>> >
>> > Errors during suspend/resume are not handled properly, leaving
>> > dead sysfs attributes in case of failures.  There are are number of
>> > functions with special code for the "frozen" case, and all these
>> > need to also have special error handling.
>>
>> Can you try this please if it fixes things for you (with your patch reverted):
>>
>> From: Viresh Kumar <viresh.kumar@linaro.org>
>> Date: Wed, 4 Dec 2013 15:20:12 +0530
>> Subject: [PATCH] cpufreq: remove sysfs files for CPU which failed to come
>>  back after resume
>>
>> There are cases where cpufreq_add_dev() may fail for some CPUs during resume.
>> With the current code we will still have sysfs cpufreq files for such CPUs, and
>> struct cpufreq_policy would be already freed for them. Hence any operation on
>> those sysfs files would result in kernel warnings.
>>
>> To fix this, lets remove those sysfs files or put the associated kobject in case
>> of such errors. Also, to make it simple lets remove the sysfs links from all the
>> CPUs (except policy->cpu) during suspend as that operation wouldn't result with a
>> loss of sysfs file permissions. And so we will create those links during resume
>> as well.
>>
>> Reported-by: Bjørn Mork <bjorn@mork.no>
>> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
>
> There's no way this can go into 3.13, even though it fixes the problem for
> Bjorn.
>
> I took the Bjorn's patch for 3.13 and this one I can queued up for 3.14,
> but for that I guess it should contain a revert of the change made by the
> Bjorn's patch.

This patch causes a s3 regression. Cc:Martin Ziegler
https://bugzilla.kernel.org/show_bug.cgi?id=66751

>
> Thanks!
>
>> ---
>>  drivers/cpufreq/cpufreq.c | 61 ++++++++++++++++++++++++-----------------------
>>  1 file changed, 31 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 606224a..57c80a7 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -849,8 +849,7 @@ static void cpufreq_init_policy(struct cpufreq_policy *policy)
>>
>>  #ifdef CONFIG_HOTPLUG_CPU
>>  static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
>> -                               unsigned int cpu, struct device *dev,
>> -                               bool frozen)
>> +                               unsigned int cpu, struct device *dev)
>>  {
>>       int ret = 0;
>>       unsigned long flags;
>> @@ -881,9 +880,7 @@ static int cpufreq_add_policy_cpu(struct cpufreq_policy *policy,
>>               }
>>       }
>>
>> -     /* Don't touch sysfs links during light-weight init */
>> -     if (!frozen)
>> -             ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>> +     ret = sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
>>
>>       return ret;
>>  }
>> @@ -930,6 +927,27 @@ err_free_policy:
>>       return NULL;
>>  }
>>
>> +static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy)
>> +{
>> +     struct kobject *kobj;
>> +     struct completion *cmp;
>> +
>> +     down_read(&policy->rwsem);
>> +     kobj = &policy->kobj;
>> +     cmp = &policy->kobj_unregister;
>> +     up_read(&policy->rwsem);
>> +     kobject_put(kobj);
>> +
>> +     /*
>> +      * We need to make sure that the underlying kobj is
>> +      * actually not referenced anymore by anybody before we
>> +      * proceed with unloading.
>> +      */
>> +     pr_debug("waiting for dropping of refcount\n");
>> +     wait_for_completion(cmp);
>> +     pr_debug("wait complete\n");
>> +}
>> +
>>  static void cpufreq_policy_free(struct cpufreq_policy *policy)
>>  {
>>       free_cpumask_var(policy->related_cpus);
>> @@ -990,7 +1008,7 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
>>       list_for_each_entry(tpolicy, &cpufreq_policy_list, policy_list) {
>>               if (cpumask_test_cpu(cpu, tpolicy->related_cpus)) {
>>                       read_unlock_irqrestore(&cpufreq_driver_lock, flags);
>> -                     ret = cpufreq_add_policy_cpu(tpolicy, cpu, dev, frozen);
>> +                     ret = cpufreq_add_policy_cpu(tpolicy, cpu, dev);
>>                       up_read(&cpufreq_rwsem);
>>                       return ret;
>>               }
>> @@ -1100,7 +1118,10 @@ err_get_freq:
>>       if (cpufreq_driver->exit)
>>               cpufreq_driver->exit(policy);
>>  err_set_policy_cpu:
>> +     if (frozen)
>> +             cpufreq_policy_put_kobj(policy);
>>       cpufreq_policy_free(policy);
>> +
>>  nomem_out:
>>       up_read(&cpufreq_rwsem);
>>
>> @@ -1122,7 +1143,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
>>  }
>>
>>  static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy,
>> -                                        unsigned int old_cpu, bool frozen)
>> +                                        unsigned int old_cpu)
>>  {
>>       struct device *cpu_dev;
>>       int ret;
>> @@ -1130,10 +1151,6 @@ static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy,
>>       /* first sibling now owns the new sysfs dir */
>>       cpu_dev = get_cpu_device(cpumask_any_but(policy->cpus, old_cpu));
>>
>> -     /* Don't touch sysfs files during light-weight tear-down */
>> -     if (frozen)
>> -             return cpu_dev->id;
>> -
>>       sysfs_remove_link(&cpu_dev->kobj, "cpufreq");
>>       ret = kobject_move(&policy->kobj, &cpu_dev->kobj);
>>       if (ret) {
>> @@ -1200,7 +1217,7 @@ static int __cpufreq_remove_dev_prepare(struct device *dev,
>>               if (!frozen)
>>                       sysfs_remove_link(&dev->kobj, "cpufreq");
>>       } else if (cpus > 1) {
>> -             new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu, frozen);
>> +             new_cpu = cpufreq_nominate_new_policy_cpu(policy, cpu);
>>               if (new_cpu >= 0) {
>>                       update_policy_cpu(policy, new_cpu);
>>
>> @@ -1222,8 +1239,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>>       int ret;
>>       unsigned long flags;
>>       struct cpufreq_policy *policy;
>> -     struct kobject *kobj;
>> -     struct completion *cmp;
>>
>>       read_lock_irqsave(&cpufreq_driver_lock, flags);
>>       policy = per_cpu(cpufreq_cpu_data, cpu);
>> @@ -1253,22 +1268,8 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>>                       }
>>               }
>>
>> -             if (!frozen) {
>> -                     down_read(&policy->rwsem);
>> -                     kobj = &policy->kobj;
>> -                     cmp = &policy->kobj_unregister;
>> -                     up_read(&policy->rwsem);
>> -                     kobject_put(kobj);
>> -
>> -                     /*
>> -                      * We need to make sure that the underlying kobj is
>> -                      * actually not referenced anymore by anybody before we
>> -                      * proceed with unloading.
>> -                      */
>> -                     pr_debug("waiting for dropping of refcount\n");
>> -                     wait_for_completion(cmp);
>> -                     pr_debug("wait complete\n");
>> -             }
>> +             if (!frozen)
>> +                     cpufreq_policy_put_kobj(policy);
>>
>>               /*
>>                * Perform the ->exit() even during light-weight tear-down,
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.
> --
> To unsubscribe from this list: send the line "unsubscribe cpufreq" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Best regards
Tianyu Lan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-09  2:59     ` Lan Tianyu
@ 2013-12-09  6:48       ` Srivatsa S. Bhat
  2013-12-09 10:04         ` Bjørn Mork
                           ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Srivatsa S. Bhat @ 2013-12-09  6:48 UTC (permalink / raw)
  To: Lan Tianyu, ziegler
  Cc: Rafael J. Wysocki, viresh kumar, Bjørn Mork, cpufreq,
	Linux PM list, Rafael J. Wysocki

On 12/09/2013 08:29 AM, Lan Tianyu wrote:
> 2013/12/5 Rafael J. Wysocki <rjw@rjwysocki.net>:
>> On Wednesday, December 04, 2013 04:02:18 PM viresh kumar wrote:
>>> On Tuesday 03 December 2013 04:44 PM, Bjørn Mork wrote:
>>>> This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
>>>> light-weight init/teardown during suspend/resume"), which enabled
>>>> suspend/resume optimizations leaving the sysfs files in place.
[...]
>> I took the Bjorn's patch for 3.13 and this one I can queued up for 3.14,
>> but for that I guess it should contain a revert of the change made by the
>> Bjorn's patch.
> 
> This patch causes a s3 regression. Cc:Martin Ziegler
> https://bugzilla.kernel.org/show_bug.cgi?id=66751
> 

Hmm.. With Bjorn's patch applied, the cpufreq hotplug callback should become
identical to what happens during regular CPU hotplug. So is regular CPU
hotplug also failing for you, Martin?

You can do CPU hotplug by:

CPU offline:
echo 0 > /sys/devices/system/cpu/cpu<num>/online

CPU online:
echo 1 > /sys/devices/system/cpu/cpu<num>/online


Bjorn's patch looks pretty innocuous to me.. I couldn't catch any obvious
bug looking at the code. So answer to the above question should help us dig
deeper.

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-09  6:48       ` Srivatsa S. Bhat
@ 2013-12-09 10:04         ` Bjørn Mork
  2013-12-12  1:59           ` Rafael J. Wysocki
  2013-12-09 11:24         ` Martin Ziegler
  2013-12-10 16:02         ` Martin Ziegler
  2 siblings, 1 reply; 28+ messages in thread
From: Bjørn Mork @ 2013-12-09 10:04 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Lan Tianyu, ziegler, Rafael J. Wysocki, viresh kumar, cpufreq,
	Linux PM list, Rafael J. Wysocki

"Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> writes:
> On 12/09/2013 08:29 AM, Lan Tianyu wrote:
>> 2013/12/5 Rafael J. Wysocki <rjw@rjwysocki.net>:
>>> On Wednesday, December 04, 2013 04:02:18 PM viresh kumar wrote:
>>>> On Tuesday 03 December 2013 04:44 PM, Bjørn Mork wrote:
>>>>> This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
>>>>> light-weight init/teardown during suspend/resume"), which enabled
>>>>> suspend/resume optimizations leaving the sysfs files in place.
> [...]
>>> I took the Bjorn's patch for 3.13 and this one I can queued up for 3.14,
>>> but for that I guess it should contain a revert of the change made by the
>>> Bjorn's patch.
>> 
>> This patch causes a s3 regression. Cc:Martin Ziegler
>> https://bugzilla.kernel.org/show_bug.cgi?id=66751
>> 
>
> Hmm.. With Bjorn's patch applied, the cpufreq hotplug callback should become
> identical to what happens during regular CPU hotplug.

Yes, I also wondered how that could have happened.

Apparently this is due to bad interaction between two patches. Commit 

  5a87182aa21d ("cpufreq: suspend governors on system suspend/hibernate")

added an implicit dependency on the suspend/resume code which commit

  2167e2399dc5 ("cpufreq: fix garbage kobjects on errors during suspend/resume")

disabled.

This would make the last patch applied of these two come out of the
bisect, which is 2167e2399dc5 in this case.  I can confirm that
reverting only this patch also fixes my hibernate problem.

BUT: It reintroduces the problem it was supposed to fix.  AND: As you
note, it really does nothing but revert to the assumed safe regular CPU
hotplug operations.  Which means that the other patch somehow has made
regular CPU hotplugging fail *if suspending*.  It won't make it fail
unless suspending, so there is no need to test CPU hotplugging
separately. 

In any case, my claim is that the real bug here still is in commit
5a87182aa21d, which added an undocumented implicit dependency on the
special cpufreq suspend/resume code.  There is no way in hell that
anyone could have guessed that the seemingly innocent changes in commit
2167e2399dc5 would fail because of this. Which should be more than
enough to understand why the continues sprinkling of suspend/resume code
all over has to stop.  Where did all the nice and clean pm hooks design
disappear?

My opinion is that commit 2167e2399dc5 still is the correct short term
fix, and it should be reapplied to v3.13-rcX and resubmitted for
3.12-stable.

I anticipate the real cleanup of this mess.  But I don't think any
additional "if suspending" tests has any place in it.  Test *once* and
fork to whatever you want to do differently when suspending .
Sprinkling these tests all over, having separate code blocks implicitly
depending on each other, is nothing but a recipe for hard to track bugs.

Just my € .015 (yes, I'm cheap)


Bjørn


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-09  6:48       ` Srivatsa S. Bhat
  2013-12-09 10:04         ` Bjørn Mork
@ 2013-12-09 11:24         ` Martin Ziegler
  2013-12-09 11:53           ` Bjørn Mork
  2013-12-10 16:02         ` Martin Ziegler
  2 siblings, 1 reply; 28+ messages in thread
From: Martin Ziegler @ 2013-12-09 11:24 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Lan Tianyu, ziegler, Rafael J. Wysocki, viresh kumar,
	Bjørn Mork, cpufreq, Linux PM list, Rafael J. Wysocki,
	martin.ziegler

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1758 bytes --]

This works fine for cpu{1,2,3} after

  Revert "cpufreq: fix garbage kobjects on
  errors during suspend/resume"

  This reverts commit
  2167e2399dc5e69c62db56d933e9c8cbe107620a.

is applied to v3.13-rc3. There is no file 
cpu0/online.


I can check the behaviour of v3.13-rc3 itself only tomorrow,
since I am travelling.

Regards
Martin

Am Mo 09 Dez 2013 12:18:00 CET schrieb Srivatsa S. Bhat:

> On 12/09/2013 08:29 AM, Lan Tianyu wrote:
>> 2013/12/5 Rafael J. Wysocki <rjw@rjwysocki.net>:
>>> On Wednesday, December 04, 2013 04:02:18 PM viresh kumar wrote:
>>>> On Tuesday 03 December 2013 04:44 PM, Bjørn Mork wrote:
>>>>> This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
>>>>> light-weight init/teardown during suspend/resume"), which enabled
>>>>> suspend/resume optimizations leaving the sysfs files in place.
> [...]
>>> I took the Bjorn's patch for 3.13 and this one I can queued up for 3.14,
>>> but for that I guess it should contain a revert of the change made by the
>>> Bjorn's patch.
>>
>> This patch causes a s3 regression. Cc:Martin Ziegler
>> https://bugzilla.kernel.org/show_bug.cgi?id=66751
>>
>
> Hmm.. With Bjorn's patch applied, the cpufreq hotplug callback should become
> identical to what happens during regular CPU hotplug. So is regular CPU
> hotplug also failing for you, Martin?
>
> You can do CPU hotplug by:
>
> CPU offline:
> echo 0 > /sys/devices/system/cpu/cpu<num>/online
>
> CPU online:
> echo 1 > /sys/devices/system/cpu/cpu<num>/online
>
>
> Bjorn's patch looks pretty innocuous to me.. I couldn't catch any obvious
> bug looking at the code. So answer to the above question should help us dig
> deeper.
>
> Regards,
> Srivatsa S. Bhat
>
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-09 11:24         ` Martin Ziegler
@ 2013-12-09 11:53           ` Bjørn Mork
  0 siblings, 0 replies; 28+ messages in thread
From: Bjørn Mork @ 2013-12-09 11:53 UTC (permalink / raw)
  To: Martin Ziegler
  Cc: Srivatsa S. Bhat, Lan Tianyu, Rafael J. Wysocki, viresh kumar,
	cpufreq, Linux PM list, Rafael J. Wysocki, martin.ziegler

Martin Ziegler <ziegler@email.mathematik.uni-freiburg.de> writes:

> This works fine for cpu{1,2,3} after
>
>  Revert "cpufreq: fix garbage kobjects on
>  errors during suspend/resume"
>
>  This reverts commit
>  2167e2399dc5e69c62db56d933e9c8cbe107620a.
>
> is applied to v3.13-rc3. There is no file cpu0/online.
>
>
> I can check the behaviour of v3.13-rc3 itself only tomorrow,
> since I am travelling.

I don't think there is any need.  The bug is caused by the combination
of commits 2167e2399dc5 and 5a87182aa21d, and both only affect
suspend/resume.  CPU hotplug behaviour as such is unaffected by both
patches.


Bjørn

> Am Mo 09 Dez 2013 12:18:00 CET schrieb Srivatsa S. Bhat:
>
>> On 12/09/2013 08:29 AM, Lan Tianyu wrote:
>>> 2013/12/5 Rafael J. Wysocki <rjw@rjwysocki.net>:
>>>> On Wednesday, December 04, 2013 04:02:18 PM viresh kumar wrote:
>>>>> On Tuesday 03 December 2013 04:44 PM, Bjørn Mork wrote:
>>>>>> This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
>>>>>> light-weight init/teardown during suspend/resume"), which enabled
>>>>>> suspend/resume optimizations leaving the sysfs files in place.
>> [...]
>>>> I took the Bjorn's patch for 3.13 and this one I can queued up for 3.14,
>>>> but for that I guess it should contain a revert of the change made by the
>>>> Bjorn's patch.
>>>
>>> This patch causes a s3 regression. Cc:Martin Ziegler
>>> https://bugzilla.kernel.org/show_bug.cgi?id=66751
>>>
>>
>> Hmm.. With Bjorn's patch applied, the cpufreq hotplug callback should become
>> identical to what happens during regular CPU hotplug. So is regular CPU
>> hotplug also failing for you, Martin?
>>
>> You can do CPU hotplug by:
>>
>> CPU offline:
>> echo 0 > /sys/devices/system/cpu/cpu<num>/online
>>
>> CPU online:
>> echo 1 > /sys/devices/system/cpu/cpu<num>/online
>>
>>
>> Bjorn's patch looks pretty innocuous to me.. I couldn't catch any obvious
>> bug looking at the code. So answer to the above question should help us dig
>> deeper.
>>
>> Regards,
>> Srivatsa S. Bhat
>>
>>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-09  6:48       ` Srivatsa S. Bhat
  2013-12-09 10:04         ` Bjørn Mork
  2013-12-09 11:24         ` Martin Ziegler
@ 2013-12-10 16:02         ` Martin Ziegler
  2 siblings, 0 replies; 28+ messages in thread
From: Martin Ziegler @ 2013-12-10 16:02 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Lan Tianyu, ziegler, Rafael J. Wysocki, viresh kumar,
	Bjørn Mork, cpufreq, Linux PM list, Rafael J. Wysocki,
	martin.ziegler

On my machine

CPU offline:
echo 0 > /sys/devices/system/cpu/cpu<num>/online

CPU online:
echo 1 > /sys/devices/system/cpu/cpu<num>/online

works fine with 3.13-rc3.

Regards

Martin

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-09 10:04         ` Bjørn Mork
@ 2013-12-12  1:59           ` Rafael J. Wysocki
  2013-12-12  8:52             ` Bjørn Mork
  0 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2013-12-12  1:59 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: Srivatsa S. Bhat, Lan Tianyu, ziegler, viresh kumar, cpufreq,
	Linux PM list, Rafael J. Wysocki

On Monday, December 09, 2013 11:04:53 AM Bjørn Mork wrote:
> "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> writes:
> > On 12/09/2013 08:29 AM, Lan Tianyu wrote:
> >> 2013/12/5 Rafael J. Wysocki <rjw@rjwysocki.net>:
> >>> On Wednesday, December 04, 2013 04:02:18 PM viresh kumar wrote:
> >>>> On Tuesday 03 December 2013 04:44 PM, Bjørn Mork wrote:
> >>>>> This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
> >>>>> light-weight init/teardown during suspend/resume"), which enabled
> >>>>> suspend/resume optimizations leaving the sysfs files in place.
> > [...]
> >>> I took the Bjorn's patch for 3.13 and this one I can queued up for 3.14,
> >>> but for that I guess it should contain a revert of the change made by the
> >>> Bjorn's patch.
> >> 
> >> This patch causes a s3 regression. Cc:Martin Ziegler
> >> https://bugzilla.kernel.org/show_bug.cgi?id=66751
> >> 
> >
> > Hmm.. With Bjorn's patch applied, the cpufreq hotplug callback should become
> > identical to what happens during regular CPU hotplug.
> 
> Yes, I also wondered how that could have happened.
> 
> Apparently this is due to bad interaction between two patches. Commit 
> 
>   5a87182aa21d ("cpufreq: suspend governors on system suspend/hibernate")
> 
> added an implicit dependency on the suspend/resume code which commit
> 
>   2167e2399dc5 ("cpufreq: fix garbage kobjects on errors during suspend/resume")
> 
> disabled.

I suspected so, but then I was about to jump on a plane to another continent
in several hours, so I preferred to simply revert both commits and start over
after the dust settled.

> This would make the last patch applied of these two come out of the
> bisect, which is 2167e2399dc5 in this case.  I can confirm that
> reverting only this patch also fixes my hibernate problem.
> 
> BUT: It reintroduces the problem it was supposed to fix.  AND: As you
> note, it really does nothing but revert to the assumed safe regular CPU
> hotplug operations.  Which means that the other patch somehow has made
> regular CPU hotplugging fail *if suspending*.  It won't make it fail
> unless suspending, so there is no need to test CPU hotplugging
> separately. 
> 
> In any case, my claim is that the real bug here still is in commit
> 5a87182aa21d, which added an undocumented implicit dependency on the
> special cpufreq suspend/resume code.  There is no way in hell that
> anyone could have guessed that the seemingly innocent changes in commit
> 2167e2399dc5 would fail because of this. Which should be more than
> enough to understand why the continues sprinkling of suspend/resume code
> all over has to stop.  Where did all the nice and clean pm hooks design
> disappear?

cpufreq has always had problems with suspend/resume in the first place,
but it just didn't have so much testing coverage before.

> My opinion is that commit 2167e2399dc5 still is the correct short term
> fix, and it should be reapplied to v3.13-rcX and resubmitted for
> 3.12-stable.

First of all, I'm not going to send any pull requests this week and even
the next week may be too early to reintroduce that commit.  However, the
second next week will be the -rc6 time frame, so I'm not sure.  It may
end up in 3.14-rc1.

> I anticipate the real cleanup of this mess.  But I don't think any
> additional "if suspending" tests has any place in it.  Test *once* and
> fork to whatever you want to do differently when suspending .
> Sprinkling these tests all over, having separate code blocks implicitly
> depending on each other, is nothing but a recipe for hard to track bugs.

Yes, that's pretty much the case, but it looks like we need to do a major
redesign of stuff to really fix those problems.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-12  1:59           ` Rafael J. Wysocki
@ 2013-12-12  8:52             ` Bjørn Mork
  0 siblings, 0 replies; 28+ messages in thread
From: Bjørn Mork @ 2013-12-12  8:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Srivatsa S. Bhat, Lan Tianyu, ziegler, viresh kumar, cpufreq,
	Linux PM list, Rafael J. Wysocki

"Rafael J. Wysocki" <rjw@rjwysocki.net> writes:
> On Monday, December 09, 2013 11:04:53 AM Bjørn Mork wrote:
>> "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> writes:
>> > On 12/09/2013 08:29 AM, Lan Tianyu wrote:
>> >> 2013/12/5 Rafael J. Wysocki <rjw@rjwysocki.net>:
>> >>> On Wednesday, December 04, 2013 04:02:18 PM viresh kumar wrote:
>> >>>> On Tuesday 03 December 2013 04:44 PM, Bjørn Mork wrote:
>> >>>>> This is effectively a revert of commit 5302c3fb2e62 ("cpufreq: Perform
>> >>>>> light-weight init/teardown during suspend/resume"), which enabled
>> >>>>> suspend/resume optimizations leaving the sysfs files in place.
>> > [...]
>> >>> I took the Bjorn's patch for 3.13 and this one I can queued up for 3.14,
>> >>> but for that I guess it should contain a revert of the change made by the
>> >>> Bjorn's patch.
>> >> 
>> >> This patch causes a s3 regression. Cc:Martin Ziegler
>> >> https://bugzilla.kernel.org/show_bug.cgi?id=66751
>> >> 
>> >
>> > Hmm.. With Bjorn's patch applied, the cpufreq hotplug callback should become
>> > identical to what happens during regular CPU hotplug.
>> 
>> Yes, I also wondered how that could have happened.
>> 
>> Apparently this is due to bad interaction between two patches. Commit 
>> 
>>   5a87182aa21d ("cpufreq: suspend governors on system suspend/hibernate")
>> 
>> added an implicit dependency on the suspend/resume code which commit
>> 
>>   2167e2399dc5 ("cpufreq: fix garbage kobjects on errors during suspend/resume")
>> 
>> disabled.
>
> I suspected so, but then I was about to jump on a plane to another continent
> in several hours, so I preferred to simply revert both commits and start over
> after the dust settled.

No, problem.  I saw your mail about travelling. And I definitely support
the "revert first, research later" strategy in any case.  There was
still too many people hit by this, and bisecting it just to find an
already known bug.

>> This would make the last patch applied of these two come out of the
>> bisect, which is 2167e2399dc5 in this case.  I can confirm that
>> reverting only this patch also fixes my hibernate problem.
>> 
>> BUT: It reintroduces the problem it was supposed to fix.  AND: As you
>> note, it really does nothing but revert to the assumed safe regular CPU
>> hotplug operations.  Which means that the other patch somehow has made
>> regular CPU hotplugging fail *if suspending*.  It won't make it fail
>> unless suspending, so there is no need to test CPU hotplugging
>> separately. 
>> 
>> In any case, my claim is that the real bug here still is in commit
>> 5a87182aa21d, which added an undocumented implicit dependency on the
>> special cpufreq suspend/resume code.  There is no way in hell that
>> anyone could have guessed that the seemingly innocent changes in commit
>> 2167e2399dc5 would fail because of this. Which should be more than
>> enough to understand why the continues sprinkling of suspend/resume code
>> all over has to stop.  Where did all the nice and clean pm hooks design
>> disappear?
>
> cpufreq has always had problems with suspend/resume in the first place,
> but it just didn't have so much testing coverage before.

Yes...  I have known about the problems with acpi-cpufreq "forever" and
do feel bad about not reporting it before.  But I usually don't want to
report bugs without being able to dedicate some time to follow up in
case the developers need more info or patch testing etc.  Which means
that "low priority" (rare, only slightly annoying, etc) bugs can end up
not being reported at all.

So the additional cpufreq breakage in v3.12 was actually good because it
made the acpi-cpufreq bug a log more annoying, and therefore increased
the priority :-)

>> My opinion is that commit 2167e2399dc5 still is the correct short term
>> fix, and it should be reapplied to v3.13-rcX and resubmitted for
>> 3.12-stable.
>
> First of all, I'm not going to send any pull requests this week and even
> the next week may be too early to reintroduce that commit.  However, the
> second next week will be the -rc6 time frame, so I'm not sure.  It may
> end up in 3.14-rc1.

You decide of course, but if it matters then I tend to agree that this
should wait for 3.14.  It has gone enough back and forth for now, and
the fact that noone(?) else has reported it as a 3.12 regression shows
that it probably isn't a big problem for most people.

>> I anticipate the real cleanup of this mess.  But I don't think any
>> additional "if suspending" tests has any place in it.  Test *once* and
>> fork to whatever you want to do differently when suspending .
>> Sprinkling these tests all over, having separate code blocks implicitly
>> depending on each other, is nothing but a recipe for hard to track bugs.
>
> Yes, that's pretty much the case, but it looks like we need to do a major
> redesign of stuff to really fix those problems.

Yes, I am hoping you will do that :-)



Bjørn

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume
  2013-12-04  6:23 ` Srivatsa S. Bhat
@ 2013-12-24  9:46   ` Jarzmik, Robert
  0 siblings, 0 replies; 28+ messages in thread
From: Jarzmik, Robert @ 2013-12-24  9:46 UTC (permalink / raw)
  To: Srivatsa S. Bhat, Bjørn Mork
  Cc: cpufreq, linux-pm, Viresh Kumar, Wysocki, Rafael J, R, Durgadoss

> -----Original Message-----
> From: Srivatsa S. Bhat [mailto:srivatsa.bhat@linux.vnet.ibm.com]
> Sent: Wednesday, December 4, 2013 07:24
> To: Bjørn Mork
> Cc: cpufreq@vger.kernel.org; linux-pm@vger.kernel.org; Viresh Kumar;
> Wysocki, Rafael J; Jarzmik, Robert; R, Durgadoss
> Subject: Re: [PATCH] cpufreq: fix garbage kobj on errors during
> suspend/resume
> 
> >  It is therefore
> > best to revert the patch enabling this code until the error handling
> > is in place.
Agreed.

> I agree, that's a good decision. I'll take a look at it later to see if we
> can restructure the code to include proper error handling in all the
> failure paths. If that gets way out of control in terms of complexity,
> then its probably best to drop this "feature" altogether. It has caused
> enough problems already, and the initial motivation behind doing that
> doesn't seem to be that strong now (CC'ing Robert and Durga).

Well, the motivation is still the same I think, ie. making thermal management
work, but that's not as important as a broken hibernation.

And in the long term, if I understood correctly that it's the suspend error
path with "stale" kobjects which is the problem, then this will have to be
addressed sooner or later, won't it ?

Cheers.

--
Robert

---------------------------------------------------------------------
Intel Corporation SAS (French simplified joint stock company)
Registered headquarters: "Les Montalets"- 2, rue de Paris, 
92196 Meudon Cedex, France
Registration Number:  302 456 199 R.C.S. NANTERRE
Capital: 4,572,000 Euros

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2013-12-24  9:46 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-03 11:14 [PATCH] cpufreq: fix garbage kobj on errors during suspend/resume Bjørn Mork
2013-12-03 11:14 ` Bjørn Mork
2013-12-03 21:45 ` Rafael J. Wysocki
2013-12-04  6:23 ` Srivatsa S. Bhat
2013-12-24  9:46   ` Jarzmik, Robert
2013-12-04 10:32 ` viresh kumar
2013-12-04 12:08   ` Bjørn Mork
2013-12-04 14:41     ` Viresh Kumar
2013-12-04 15:41       ` Bjørn Mork
     [not found]         ` <CAKohponu3Fu=WaBHXP1iBJM87V9g=+hDPe=M168U_weODenZdQ@mail.gmail.com>
     [not found]           ` <878uvzyecg.fsf@nemi.mork.no>
2013-12-05 12:41             ` Srivatsa S. Bhat
2013-12-05 12:41               ` Srivatsa S. Bhat
2013-12-05 13:21               ` Bjørn Mork
2013-12-05 13:21                 ` Bjørn Mork
2013-12-05 22:29               ` Rafael J. Wysocki
2013-12-05 22:29                 ` Rafael J. Wysocki
2013-12-06  5:23                 ` Srivatsa S. Bhat
2013-12-06  5:23                   ` Srivatsa S. Bhat
2013-12-07  1:17                   ` Rafael J. Wysocki
2013-12-07  1:17                     ` Rafael J. Wysocki
2013-12-05  1:29   ` Rafael J. Wysocki
2013-12-09  2:59     ` Lan Tianyu
2013-12-09  6:48       ` Srivatsa S. Bhat
2013-12-09 10:04         ` Bjørn Mork
2013-12-12  1:59           ` Rafael J. Wysocki
2013-12-12  8:52             ` Bjørn Mork
2013-12-09 11:24         ` Martin Ziegler
2013-12-09 11:53           ` Bjørn Mork
2013-12-10 16:02         ` Martin Ziegler

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.