linux-kernel.vger.kernel.org archive mirror
* [PATCH] cpufreq: Fix RCU reboot regression on x86 PIC machines
@ 2019-10-03 14:08 Ville Syrjala
  2019-10-03 15:05 ` Rafael J. Wysocki
  0 siblings, 1 reply; 6+ messages in thread
From: Ville Syrjala @ 2019-10-03 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: stable, Paul E . McKenney, Andi Kleen, Rafael J. Wysocki,
	Viresh Kumar, linux-pm, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Ville Syrjälä

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Since 4.20-rc1 my PIC machines no longer reboot/shutdown.
I bisected this down to commit 45975c7d21a1 ("rcu: Define RCU-sched
API in terms of RCU for Tree RCU PREEMPT builds").

I traced the hang into
-> cpufreq_suspend()
 -> cpufreq_stop_governor()
  -> cpufreq_dbs_governor_stop()
   -> gov_clear_update_util()
    -> synchronize_sched()
     -> synchronize_rcu()

Only PREEMPT=y is affected, since that commit only changes Tree
RCU PREEMPT builds. The problem is limited to PIC machines since
they mask off interrupts
in i8259A_shutdown() (syscore_ops.shutdown() registered
from device_initcall()).

I reported this long ago but no better fix has surfaced,
hence sending out my initial workaround which I've been
carrying around ever since. I just move cpufreq_core_init()
to late_initcall() so the syscore_ops get registered in the
opposite order and thus the .shutdown() hooks get executed
in the opposite order as well. Not 100% convinced this is
safe (especially moving the cpufreq_global_kobject creation
to late_initcall()) but I've not had any problems with it
at least.

Here's the resulting change in initcall_debug:
+ PM: Calling cpufreq_suspend+0x0/0x100
  PM: Calling mce_syscore_shutdown+0x0/0x10
  PM: Calling i8259A_shutdown+0x0/0x10
- PM: Calling cpufreq_suspend+0x0/0x100
+ reboot: Restarting system
+ reboot: machine restart

Cc: stable@vger.kernel.org
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-pm@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Fixes: 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/cpufreq/cpufreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index c52d6fa32aac..6a8fb9b08e33 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2761,4 +2761,4 @@ static int __init cpufreq_core_init(void)
 	return 0;
 }
 module_param(off, int, 0444);
-core_initcall(cpufreq_core_init);
+late_initcall(cpufreq_core_init);
-- 
2.21.0
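
The ordering argument in the changelog above can be sketched as a small
user-space model. This is only an illustration, not kernel code: every
model_* name is hypothetical, and it assumes (as drivers/base/syscore.c
appears to do) that registration appends to the tail of a list and that
shutdown walks the list backward, so the hook registered last runs first.

```c
#include <stddef.h>
#include <string.h>

/* User-space model only: none of the model_* names are kernel APIs. */
#define MODEL_MAX_OPS 8

struct model_ops {
	const char *name;	/* label of the modelled .shutdown() hook */
};

static const struct model_ops *model_list[MODEL_MAX_OPS];
static size_t model_count;

/* Forget all registrations so another scenario can be modelled. */
static void model_reset(void)
{
	model_count = 0;
}

/* Append at the tail (assumed to mirror register_syscore_ops()). */
static int model_register(const struct model_ops *ops)
{
	if (model_count >= MODEL_MAX_OPS)
		return -1;
	model_list[model_count++] = ops;
	return 0;
}

/*
 * Walk the list backward (assumed to mirror syscore_shutdown()) and
 * record the hook names in the order they would be called.
 */
static size_t model_shutdown(const char **order, size_t max)
{
	size_t n = 0;

	for (size_t i = model_count; i > 0 && n < max; i--)
		order[n++] = model_list[i - 1]->name;
	return n;
}
```

Registering the cpufreq entry first (core_initcall) makes its hook run
after the PIC one; registering it last (late_initcall) makes it run
before, which is the change visible in the initcall_debug output above.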



* Re: [PATCH] cpufreq: Fix RCU reboot regression on x86 PIC machines
  2019-10-03 14:08 [PATCH] cpufreq: Fix RCU reboot regression on x86 PIC machines Ville Syrjala
@ 2019-10-03 15:05 ` Rafael J. Wysocki
  2019-10-04 12:30   ` Ville Syrjälä
  0 siblings, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2019-10-03 15:05 UTC (permalink / raw)
  To: Ville Syrjala
  Cc: linux-kernel, stable, Paul E . McKenney, Andi Kleen,
	Viresh Kumar, linux-pm, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Greg Kroah-Hartman

On Thursday, October 3, 2019 4:08:28 PM CEST Ville Syrjala wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Since 4.20-rc1 my PIC machines no longer reboot/shutdown.
> I bisected this down to commit 45975c7d21a1 ("rcu: Define RCU-sched
> API in terms of RCU for Tree RCU PREEMPT builds").
> 
> I traced the hang into
> -> cpufreq_suspend()
>  -> cpufreq_stop_governor()
>   -> cpufreq_dbs_governor_stop()
>    -> gov_clear_update_util()
>     -> synchronize_sched()
>      -> synchronize_rcu()
> 
> Only PREEMPT=y is affected for obvious reasons. The problem
> is limited to PIC machines since they mask off interrupts
> in i8259A_shutdown() (syscore_ops.shutdown() registered
> from device_initcall()).

Let me treat this as a fresh bug report. :-)

> I reported this long ago but no better fix has surfaced,

So I don't recall seeing the original report or if I did, I had not understood
the problem then.

> hence sending out my initial workaround which I've been
> carrying around ever since. I just move cpufreq_core_init()
> to late_initcall() so the syscore_ops get registered in the
> opposite order and thus the .shutdown() hooks get executed
> in the opposite order as well. Not 100% convinced this is
> safe (especially moving the cpufreq_global_kobject creation
> to late_initcall()) but I've not had any problems with it
> at least.

The problem is a bug in cpufreq that shouldn't point its syscore shutdown
callback pointer to cpufreq_suspend(), because the syscore stage is generally
too late to call that function and I'm not sure why this has not been causing
any other issues to trigger (or maybe it did, but they were not reported).

Does the patch below work for you?

---
 drivers/base/core.c       |    3 +++
 drivers/cpufreq/cpufreq.c |   10 ----------
 2 files changed, 3 insertions(+), 10 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -2737,14 +2737,6 @@ int cpufreq_unregister_driver(struct cpu
 }
 EXPORT_SYMBOL_GPL(cpufreq_unregister_driver);
 
-/*
- * Stop cpufreq at shutdown to make sure it isn't holding any locks
- * or mutexes when secondary CPUs are halted.
- */
-static struct syscore_ops cpufreq_syscore_ops = {
-	.shutdown = cpufreq_suspend,
-};
-
 struct kobject *cpufreq_global_kobject;
 EXPORT_SYMBOL(cpufreq_global_kobject);
 
@@ -2756,8 +2748,6 @@ static int __init cpufreq_core_init(void
 	cpufreq_global_kobject = kobject_create_and_add("cpufreq", &cpu_subsys.dev_root->kobj);
 	BUG_ON(!cpufreq_global_kobject);
 
-	register_syscore_ops(&cpufreq_syscore_ops);
-
 	return 0;
 }
 module_param(off, int, 0444);
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -9,6 +9,7 @@
  */
 
 #include <linux/acpi.h>
+#include <linux/cpufreq.h>
 #include <linux/device.h>
 #include <linux/err.h>
 #include <linux/fwnode.h>
@@ -3179,6 +3180,8 @@ void device_shutdown(void)
 	wait_for_device_probe();
 	device_block_probing();
 
+	cpufreq_suspend();
+
 	spin_lock(&devices_kset->list_lock);
 	/*
 	 * Walk the devices list backward, shutting down each in turn.





* Re: [PATCH] cpufreq: Fix RCU reboot regression on x86 PIC machines
  2019-10-03 15:05 ` Rafael J. Wysocki
@ 2019-10-04 12:30   ` Ville Syrjälä
  2019-10-06 15:34     ` Rafael J. Wysocki
  0 siblings, 1 reply; 6+ messages in thread
From: Ville Syrjälä @ 2019-10-04 12:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-kernel, stable, Paul E . McKenney, Andi Kleen,
	Viresh Kumar, linux-pm, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Greg Kroah-Hartman

On Thu, Oct 03, 2019 at 05:05:28PM +0200, Rafael J. Wysocki wrote:
> On Thursday, October 3, 2019 4:08:28 PM CEST Ville Syrjala wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > 
> > Since 4.20-rc1 my PIC machines no longer reboot/shutdown.
> > I bisected this down to commit 45975c7d21a1 ("rcu: Define RCU-sched
> > API in terms of RCU for Tree RCU PREEMPT builds").
> > 
> > I traced the hang into
> > -> cpufreq_suspend()
> >  -> cpufreq_stop_governor()
> >   -> cpufreq_dbs_governor_stop()
> >    -> gov_clear_update_util()
> >     -> synchronize_sched()
> >      -> synchronize_rcu()
> > 
> > Only PREEMPT=y is affected for obvious reasons. The problem
> > is limited to PIC machines since they mask off interrupts
> > in i8259A_shutdown() (syscore_ops.shutdown() registered
> > from device_initcall()).
> 
> Let me treat this as a fresh bug report. :-)
> 
> > I reported this long ago but no better fix has surfaced,
> 
> So I don't recall seeing the original report or if I did, I had not understood
> the problem then.
> 
> > hence sending out my initial workaround which I've been
> > carrying around ever since. I just move cpufreq_core_init()
> > to late_initcall() so the syscore_ops get registered in the
> > opposite order and thus the .shutdown() hooks get executed
> > in the opposite order as well. Not 100% convinced this is
> > safe (especially moving the cpufreq_global_kobject creation
> > to late_initcall()) but I've not had any problems with it
> > at least.
> 
> The problem is a bug in cpufreq that shouldn't point its syscore shutdown
> callback pointer to cpufreq_suspend(), because the syscore stage is generally
> too late to call that function and I'm not sure why this has not been causing
> any other issues to trigger (or maybe it did, but they were not reported).
> 
> Does the patch below work for you?

It does. Thanks.

Feel free to slap on
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

if you want to go with that.

> 
> ---
>  drivers/base/core.c       |    3 +++
>  drivers/cpufreq/cpufreq.c |   10 ----------
>  2 files changed, 3 insertions(+), 10 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/cpufreq.c
> +++ linux-pm/drivers/cpufreq/cpufreq.c
> @@ -2737,14 +2737,6 @@ int cpufreq_unregister_driver(struct cpu
>  }
>  EXPORT_SYMBOL_GPL(cpufreq_unregister_driver);
>  
> -/*
> - * Stop cpufreq at shutdown to make sure it isn't holding any locks
> - * or mutexes when secondary CPUs are halted.
> - */
> -static struct syscore_ops cpufreq_syscore_ops = {
> -	.shutdown = cpufreq_suspend,
> -};
> -
>  struct kobject *cpufreq_global_kobject;
>  EXPORT_SYMBOL(cpufreq_global_kobject);
>  
> @@ -2756,8 +2748,6 @@ static int __init cpufreq_core_init(void
>  	cpufreq_global_kobject = kobject_create_and_add("cpufreq", &cpu_subsys.dev_root->kobj);
>  	BUG_ON(!cpufreq_global_kobject);
>  
> -	register_syscore_ops(&cpufreq_syscore_ops);
> -
>  	return 0;
>  }
>  module_param(off, int, 0444);
> Index: linux-pm/drivers/base/core.c
> ===================================================================
> --- linux-pm.orig/drivers/base/core.c
> +++ linux-pm/drivers/base/core.c
> @@ -9,6 +9,7 @@
>   */
>  
>  #include <linux/acpi.h>
> +#include <linux/cpufreq.h>
>  #include <linux/device.h>
>  #include <linux/err.h>
>  #include <linux/fwnode.h>
> @@ -3179,6 +3180,8 @@ void device_shutdown(void)
>  	wait_for_device_probe();
>  	device_block_probing();
>  
> +	cpufreq_suspend();
> +
>  	spin_lock(&devices_kset->list_lock);
>  	/*
>  	 * Walk the devices list backward, shutting down each in turn.
> 
> 

-- 
Ville Syrjälä
Intel


* Re: [PATCH] cpufreq: Fix RCU reboot regression on x86 PIC machines
  2019-10-04 12:30   ` Ville Syrjälä
@ 2019-10-06 15:34     ` Rafael J. Wysocki
  2019-10-08 23:29       ` [PATCH] cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown Rafael J. Wysocki
  0 siblings, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2019-10-06 15:34 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Stable,
	Paul E . McKenney, Andi Kleen, Viresh Kumar, Linux PM,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Greg Kroah-Hartman

On Fri, Oct 4, 2019 at 2:30 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
>
> On Thu, Oct 03, 2019 at 05:05:28PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, October 3, 2019 4:08:28 PM CEST Ville Syrjala wrote:
> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > >
> > > Since 4.20-rc1 my PIC machines no longer reboot/shutdown.
> > > I bisected this down to commit 45975c7d21a1 ("rcu: Define RCU-sched
> > > API in terms of RCU for Tree RCU PREEMPT builds").
> > >
> > > I traced the hang into
> > > -> cpufreq_suspend()
> > >  -> cpufreq_stop_governor()
> > >   -> cpufreq_dbs_governor_stop()
> > >    -> gov_clear_update_util()
> > >     -> synchronize_sched()
> > >      -> synchronize_rcu()
> > >
> > > Only PREEMPT=y is affected for obvious reasons. The problem
> > > is limited to PIC machines since they mask off interrupts
> > > in i8259A_shutdown() (syscore_ops.shutdown() registered
> > > from device_initcall()).
> >
> > Let me treat this as a fresh bug report. :-)
> >
> > > I reported this long ago but no better fix has surfaced,
> >
> > So I don't recall seeing the original report or if I did, I had not understood
> > the problem then.
> >
> > > hence sending out my initial workaround which I've been
> > > carrying around ever since. I just move cpufreq_core_init()
> > > to late_initcall() so the syscore_ops get registered in the
> > > opposite order and thus the .shutdown() hooks get executed
> > > in the opposite order as well. Not 100% convinced this is
> > > safe (especially moving the cpufreq_global_kobject creation
> > > to late_initcall()) but I've not had any problems with it
> > > at least.
> >
> > The problem is a bug in cpufreq that shouldn't point its syscore shutdown
> > callback pointer to cpufreq_suspend(), because the syscore stage is generally
> > too late to call that function and I'm not sure why this has not been causing
> > any other issues to trigger (or maybe it did, but they were not reported).
> >
> > Does the patch below work for you?
>
> It does. Thanks.
>
> Feel free to slap on
> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> if you want to go with that.

I will, thank you!


* [PATCH] cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown
  2019-10-06 15:34     ` Rafael J. Wysocki
@ 2019-10-08 23:29       ` Rafael J. Wysocki
  2019-10-10  6:53         ` Viresh Kumar
  0 siblings, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2019-10-08 23:29 UTC (permalink / raw)
  To: Linux PM
  Cc: Ville Syrjälä,
	Linux Kernel Mailing List, Stable, Paul E . McKenney, Andi Kleen,
	Viresh Kumar, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Greg Kroah-Hartman

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

It is incorrect to set the cpufreq syscore shutdown callback pointer
to cpufreq_suspend(), because that function cannot be run in the
syscore stage of system shutdown for two reasons: (a) it may attempt
to carry out actions depending on devices that have already been shut
down at that point and (b) the RCU synchronization carried out by it
may not be able to make progress then.

The latter issue has been present since commit 45975c7d21a1 ("rcu:
Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds"),
but the former one has always been there regardless.

Fix that by dropping cpufreq_syscore_ops altogether and making
device_shutdown() call cpufreq_suspend() directly before shutting
down devices, which is along the lines of what system-wide power
management does.

Fixes: 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds")
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/core.c       |    3 +++
 drivers/cpufreq/cpufreq.c |   10 ----------
 2 files changed, 3 insertions(+), 10 deletions(-)

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -2737,14 +2737,6 @@ int cpufreq_unregister_driver(struct cpu
 }
 EXPORT_SYMBOL_GPL(cpufreq_unregister_driver);
 
-/*
- * Stop cpufreq at shutdown to make sure it isn't holding any locks
- * or mutexes when secondary CPUs are halted.
- */
-static struct syscore_ops cpufreq_syscore_ops = {
-	.shutdown = cpufreq_suspend,
-};
-
 struct kobject *cpufreq_global_kobject;
 EXPORT_SYMBOL(cpufreq_global_kobject);
 
@@ -2756,8 +2748,6 @@ static int __init cpufreq_core_init(void
 	cpufreq_global_kobject = kobject_create_and_add("cpufreq", &cpu_subsys.dev_root->kobj);
 	BUG_ON(!cpufreq_global_kobject);
 
-	register_syscore_ops(&cpufreq_syscore_ops);
-
 	return 0;
 }
 module_param(off, int, 0444);
Index: linux-pm/drivers/base/core.c
===================================================================
--- linux-pm.orig/drivers/base/core.c
+++ linux-pm/drivers/base/core.c
@@ -9,6 +9,7 @@
  */
 
 #include <linux/acpi.h>
+#include <linux/cpufreq.h>
 #include <linux/device.h>
 #include <linux/err.h>
 #include <linux/fwnode.h>
@@ -3179,6 +3180,8 @@ void device_shutdown(void)
 	wait_for_device_probe();
 	device_block_probing();
 
+	cpufreq_suspend();
+
 	spin_lock(&devices_kset->list_lock);
 	/*
 	 * Walk the devices list backward, shutting down each in turn.
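
The effect of moving the call can be sketched with a user-space model.
It is an illustration under stated assumptions, not kernel code: all
model_* names are hypothetical, and the model simply assumes, as reported
in this thread, that the RCU synchronization done by cpufreq_suspend()
cannot make progress once the PIC shutdown hook has masked interrupts.

```c
#include <stdbool.h>

/* User-space model only: none of the model_* names are kernel APIs. */
struct model_state {
	bool irqs_masked;	/* set by the modelled i8259A_shutdown() */
	bool governor_stopped;	/* set when the modelled suspend completes */
	bool hung;		/* set when the modelled RCU wait cannot finish */
};

/* Stand-in for cpufreq_suspend(): its RCU wait is assumed to need interrupts. */
static void model_cpufreq_suspend(struct model_state *s)
{
	if (s->governor_stopped)
		return;
	if (s->irqs_masked) {
		s->hung = true;	/* the real code would block here */
		return;
	}
	s->governor_stopped = true;
}

/* Stand-in for the PIC syscore shutdown hook. */
static void model_pic_shutdown(struct model_state *s)
{
	s->irqs_masked = true;
}

/* Old flow: cpufreq_suspend() runs as a syscore hook after the PIC hook. */
static struct model_state model_old_flow(void)
{
	struct model_state s = { false, false, false };

	model_pic_shutdown(&s);		/* syscore stage */
	model_cpufreq_suspend(&s);	/* syscore stage, too late */
	return s;
}

/* New flow: device_shutdown() calls cpufreq_suspend() before any syscore hook. */
static struct model_state model_new_flow(void)
{
	struct model_state s = { false, false, false };

	model_cpufreq_suspend(&s);	/* device shutdown stage */
	model_pic_shutdown(&s);		/* syscore stage */
	return s;
}
```

In this model the old flow ends with hung set and the governor still
running, while the new flow stops the governor before interrupts are
masked.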





* Re: [PATCH] cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown
  2019-10-08 23:29       ` [PATCH] cpufreq: Avoid cpufreq_suspend() deadlock on system shutdown Rafael J. Wysocki
@ 2019-10-10  6:53         ` Viresh Kumar
  0 siblings, 0 replies; 6+ messages in thread
From: Viresh Kumar @ 2019-10-10  6:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Ville Syrjälä,
	Linux Kernel Mailing List, Stable, Paul E . McKenney, Andi Kleen,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Greg Kroah-Hartman

On 09-10-19, 01:29, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> It is incorrect to set the cpufreq syscore shutdown callback pointer
> to cpufreq_suspend(), because that function cannot be run in the
> syscore stage of system shutdown for two reasons: (a) it may attempt
> to carry out actions depending on devices that have already been shut
> down at that point and (b) the RCU synchronization carried out by it
> may not be able to make progress then.
> 
> The latter issue has been present since commit 45975c7d21a1 ("rcu:
> Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds"),
> but the former one has always been there regardless.
> 
> Fix that by dropping cpufreq_syscore_ops altogether and making
> device_shutdown() call cpufreq_suspend() directly before shutting
> down devices, which is along the lines of what system-wide power
> management does.
> 
> Fixes: 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds")
> Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/core.c       |    3 +++
>  drivers/cpufreq/cpufreq.c |   10 ----------
>  2 files changed, 3 insertions(+), 10 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

