All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC] suspend/hibernation: Fix racing timers
@ 2014-07-21 17:35 Soren Brinkmann
  2014-07-24  3:55 ` John Stultz
  0 siblings, 1 reply; 18+ messages in thread
From: Soren Brinkmann @ 2014-07-21 17:35 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Rafael J. Wysocki, Pavel Machek, Len Brown
  Cc: linux-kernel, linux-pm, Daniel Lezcano, Soren Brinkmann

On platforms that do not power off during suspend, successfully entering
suspend races with timers.

The race happening in a couple of location is:

  1. disable IRQs 	(e.g. arch_suspend_disable_irqs())
     ...
  2. syscore_suspend()
        -> tick_suspend() 	(timers are turned off here)
     ...
  3. wfi		(wait for wake-IRQ here)

Between steps 1 and 2 the timers can still generate interrupts that are
not handled and stay pending until step 3. That pending IRQ causes an
immediate - spurious - wake.

The solution is to remove the timekeeping suspend/resume functions from
the syscore functions and explictly call them at the appropriate time in
the suspend/hibernation patchs. I.e. timers are suspend _before_ IRQs
get disabled. And accordingly in the resume path.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
---
Hi,

I think I found the cause of spurious wakes that I'm observing
(https://lkml.org/lkml/2014/6/30/609). This patch seems
to work well for me.

	Sören

 include/linux/time.h      |  2 ++
 kernel/power/hibernate.c  |  9 +++++++++
 kernel/power/suspend.c    |  4 ++++
 kernel/time/timekeeping.c | 20 ++------------------
 4 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/include/linux/time.h b/include/linux/time.h
index d5d229b2e5af..6ef10c0cb35a 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -127,6 +127,8 @@ extern void read_boot_clock(struct timespec *ts);
 extern int persistent_clock_is_local;
 extern int update_persistent_clock(struct timespec now);
 void timekeeping_init(void);
+void timekeeping_suspend(void);
+void timekeeping_resume(void);
 extern int timekeeping_suspended;
 
 unsigned long get_seconds(void);
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index fcc2611d3f14..a4e3375318a6 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -285,6 +285,8 @@ static int create_image(int platform_mode)
 	if (error || hibernation_test(TEST_CPUS))
 		goto Enable_cpus;
 
+	timekeeping_suspend();
+
 	local_irq_disable();
 
 	error = syscore_suspend();
@@ -316,6 +318,7 @@ static int create_image(int platform_mode)
 	syscore_resume();
 
  Enable_irqs:
+	timekeeping_resume();
 	local_irq_enable();
 
  Enable_cpus:
@@ -440,6 +443,8 @@ static int resume_target_kernel(bool platform_mode)
 	if (error)
 		goto Enable_cpus;
 
+	timekeeping_suspend();
+
 	local_irq_disable();
 
 	error = syscore_suspend();
@@ -474,6 +479,7 @@ static int resume_target_kernel(bool platform_mode)
 	syscore_resume();
 
  Enable_irqs:
+	timekeeping_resume();
 	local_irq_enable();
 
  Enable_cpus:
@@ -555,6 +561,8 @@ int hibernation_platform_enter(void)
 	if (error)
 		goto Platform_finish;
 
+	timekeeping_suspend();
+
 	local_irq_disable();
 	syscore_suspend();
 	if (pm_wakeup_pending()) {
@@ -568,6 +576,7 @@ int hibernation_platform_enter(void)
 
  Power_up:
 	syscore_resume();
+	timekeeping_resume();
 	local_irq_enable();
 	enable_nonboot_cpus();
 
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index ed35a4790afe..fcc47bae35d3 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -253,6 +253,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
 	if (error || suspend_test(TEST_CPUS))
 		goto Enable_cpus;
 
+	timekeeping_suspend();
+
 	arch_suspend_disable_irqs();
 	BUG_ON(!irqs_disabled());
 
@@ -270,6 +272,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
 		syscore_resume();
 	}
 
+	timekeeping_resume();
+
 	arch_suspend_enable_irqs();
 	BUG_ON(irqs_disabled());
 
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 32d8d6aaedb8..96c14996da68 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -908,7 +908,7 @@ void timekeeping_inject_sleeptime(struct timespec *delta)
  * xtime/wall_to_monotonic/jiffies/etc are
  * still managed by arch specific suspend/resume code.
  */
-static void timekeeping_resume(void)
+void timekeeping_resume(void)
 {
 	struct timekeeper *tk = &timekeeper;
 	struct clocksource *clock = tk->clock;
@@ -986,7 +986,7 @@ static void timekeeping_resume(void)
 	hrtimers_resume();
 }
 
-static int timekeeping_suspend(void)
+void timekeeping_suspend(void)
 {
 	struct timekeeper *tk = &timekeeper;
 	unsigned long flags;
@@ -1035,24 +1035,8 @@ static int timekeeping_suspend(void)
 	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
 	clocksource_suspend();
 	clockevents_suspend();
-
-	return 0;
-}
-
-/* sysfs resume/suspend bits for timekeeping */
-static struct syscore_ops timekeeping_syscore_ops = {
-	.resume		= timekeeping_resume,
-	.suspend	= timekeeping_suspend,
-};
-
-static int __init timekeeping_init_ops(void)
-{
-	register_syscore_ops(&timekeeping_syscore_ops);
-	return 0;
 }
 
-device_initcall(timekeeping_init_ops);
-
 /*
  * If the error is already larger, we look ahead even further
  * to compensate for late or lost adjustments.
-- 
2.0.1.1.gfbfc394


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC] suspend/hibernation: Fix racing timers
  2014-07-21 17:35 [PATCH RFC] suspend/hibernation: Fix racing timers Soren Brinkmann
@ 2014-07-24  3:55 ` John Stultz
  2014-07-24 15:59     ` Sören Brinkmann
  2014-07-25 21:06   ` [PATCH RFC v2] PM / sleep: " Soren Brinkmann
  0 siblings, 2 replies; 18+ messages in thread
From: John Stultz @ 2014-07-24  3:55 UTC (permalink / raw)
  To: Soren Brinkmann, Thomas Gleixner, Rafael J. Wysocki,
	Pavel Machek, Len Brown
  Cc: linux-kernel, linux-pm, Daniel Lezcano

On 07/21/2014 10:35 AM, Soren Brinkmann wrote:
> On platforms that do not power off during suspend, successfully entering
> suspend races with timers.
>
> The race happening in a couple of location is:
>
>   1. disable IRQs 	(e.g. arch_suspend_disable_irqs())
>      ...
>   2. syscore_suspend()
>         -> tick_suspend() 	(timers are turned off here)
>      ...
>   3. wfi		(wait for wake-IRQ here)
>
> Between steps 1 and 2 the timers can still generate interrupts that are
> not handled and stay pending until step 3. That pending IRQ causes an
> immediate - spurious - wake.
>
> The solution is to remove the timekeeping suspend/resume functions from
> the syscore functions and explictly call them at the appropriate time in
> the suspend/hibernation patchs. I.e. timers are suspend _before_ IRQs
> get disabled. And accordingly in the resume path.

So.. I sort of follow this, though from the description disabling
timekeeping to turn off timers seems a little indirect (I do see that
suspending timekeeping calls clockevents_suspend() which is the key
part). Maybe this could be clarified in a future version of the patch
description?

I worry that moving timekeeping_suspend earlier in the suspend process
might cause problems where things access time in the suspend path. I
recall these orderings have been problematic in the past, and slightly
tweaking them can often destabilize things badly.

I wonder if it would be better just to move the clockevent_suspend()
call to the earlier site, that way timers are halted but timekeeping
continues until its normal suspend point.

thanks
-john


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC] suspend/hibernation: Fix racing timers
  2014-07-24  3:55 ` John Stultz
@ 2014-07-24 15:59     ` Sören Brinkmann
  2014-07-25 21:06   ` [PATCH RFC v2] PM / sleep: " Soren Brinkmann
  1 sibling, 0 replies; 18+ messages in thread
From: Sören Brinkmann @ 2014-07-24 15:59 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Rafael J. Wysocki, Pavel Machek, Len Brown,
	linux-kernel, linux-pm, Daniel Lezcano

Hi John,

On Wed, 2014-07-23 at 08:55PM -0700, John Stultz wrote:
> On 07/21/2014 10:35 AM, Soren Brinkmann wrote:
> > On platforms that do not power off during suspend, successfully entering
> > suspend races with timers.
> >
> > The race happening in a couple of location is:
> >
> >   1. disable IRQs 	(e.g. arch_suspend_disable_irqs())
> >      ...
> >   2. syscore_suspend()
> >         -> tick_suspend() 	(timers are turned off here)
> >      ...
> >   3. wfi		(wait for wake-IRQ here)
> >
> > Between steps 1 and 2 the timers can still generate interrupts that are
> > not handled and stay pending until step 3. That pending IRQ causes an
> > immediate - spurious - wake.
> >
> > The solution is to remove the timekeeping suspend/resume functions from
> > the syscore functions and explictly call them at the appropriate time in
> > the suspend/hibernation patchs. I.e. timers are suspend _before_ IRQs
> > get disabled. And accordingly in the resume path.
> 
> So.. I sort of follow this, though from the description disabling
> timekeeping to turn off timers seems a little indirect (I do see that
> suspending timekeeping calls clockevents_suspend() which is the key
> part). Maybe this could be clarified in a future version of the patch
> description?
> 
> I worry that moving timekeeping_suspend earlier in the suspend process
> might cause problems where things access time in the suspend path. I
> recall these orderings have been problematic in the past, and slightly
> tweaking them can often destabilize things badly.

You're right. Just when I received this I started seeing some warning
from the kernel due to ktime_get() called with timekeeping being
suspended.
Though, stability-wise it seems to work.

> 
> I wonder if it would be better just to move the clockevent_suspend()
> call to the earlier site, that way timers are halted but timekeeping
> continues until its normal suspend point.

I'll look into this and send out a patch once I have something working.

	Thanks,
	Sören

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC] suspend/hibernation: Fix racing timers
@ 2014-07-24 15:59     ` Sören Brinkmann
  0 siblings, 0 replies; 18+ messages in thread
From: Sören Brinkmann @ 2014-07-24 15:59 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Rafael J. Wysocki, Pavel Machek, Len Brown,
	linux-kernel, linux-pm, Daniel Lezcano

Hi John,

On Wed, 2014-07-23 at 08:55PM -0700, John Stultz wrote:
> On 07/21/2014 10:35 AM, Soren Brinkmann wrote:
> > On platforms that do not power off during suspend, successfully entering
> > suspend races with timers.
> >
> > The race happening in a couple of location is:
> >
> >   1. disable IRQs 	(e.g. arch_suspend_disable_irqs())
> >      ...
> >   2. syscore_suspend()
> >         -> tick_suspend() 	(timers are turned off here)
> >      ...
> >   3. wfi		(wait for wake-IRQ here)
> >
> > Between steps 1 and 2 the timers can still generate interrupts that are
> > not handled and stay pending until step 3. That pending IRQ causes an
> > immediate - spurious - wake.
> >
> > The solution is to remove the timekeeping suspend/resume functions from
> > the syscore functions and explictly call them at the appropriate time in
> > the suspend/hibernation patchs. I.e. timers are suspend _before_ IRQs
> > get disabled. And accordingly in the resume path.
> 
> So.. I sort of follow this, though from the description disabling
> timekeeping to turn off timers seems a little indirect (I do see that
> suspending timekeeping calls clockevents_suspend() which is the key
> part). Maybe this could be clarified in a future version of the patch
> description?
> 
> I worry that moving timekeeping_suspend earlier in the suspend process
> might cause problems where things access time in the suspend path. I
> recall these orderings have been problematic in the past, and slightly
> tweaking them can often destabilize things badly.

You're right. Just when I received this I started seeing some warning
from the kernel due to ktime_get() called with timekeeping being
suspended.
Though, stability-wise it seems to work.

> 
> I wonder if it would be better just to move the clockevent_suspend()
> call to the earlier site, that way timers are halted but timekeeping
> continues until its normal suspend point.

I'll look into this and send out a patch once I have something working.

	Thanks,
	Sören

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH RFC v2] PM / sleep: Fix racing timers
  2014-07-24  3:55 ` John Stultz
  2014-07-24 15:59     ` Sören Brinkmann
@ 2014-07-25 21:06   ` Soren Brinkmann
  2014-07-27 23:18     ` Rafael J. Wysocki
                       ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Soren Brinkmann @ 2014-07-25 21:06 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Rafael J. Wysocki, Pavel Machek, Len Brown
  Cc: linux-kernel, linux-pm, Daniel Lezcano, Soren Brinkmann

On platforms that do not power off during suspend, successfully entering
suspend races with timers.

The race happening in a couple of location is:

  1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
     ...
  2. syscore_suspend()
      -> timekeeping_suspend()
       -> clockevents_notify(SUSPEND)
        -> tick_suspend()   	(timers are turned off here)
     ...
  3. wfi            		(wait for wake-IRQ here)

Between steps 1 and 2 the timers can still generate interrupts that are
not handled and stay pending until step 3. That pending IRQ causes an
immediate - spurious - wake.

The solution is to move the clockevents suspend/resume notification
out of the syscore_suspend step and explictly call them at the appropriate
time in the suspend/hibernation paths. I.e. timers are suspend _before_
IRQs get disabled. And accordingly in the resume path.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
---
Hi,

This is my second shot at this. I followed John's suggestion to keep the
timekeeping suspend where it is and just move the shutdown of the clockevent
devices around.

	Thanks,
	Sören

 kernel/power/hibernate.c  | 9 +++++++++
 kernel/power/suspend.c    | 5 +++++
 kernel/time/timekeeping.c | 1 -
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index fcc2611d3f14..ab9f807e2ccb 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -285,6 +285,8 @@ static int create_image(int platform_mode)
 	if (error || hibernation_test(TEST_CPUS))
 		goto Enable_cpus;
 
+	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
+
 	local_irq_disable();
 
 	error = syscore_suspend();
@@ -316,6 +318,7 @@ static int create_image(int platform_mode)
 	syscore_resume();
 
  Enable_irqs:
+	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
 	local_irq_enable();
 
  Enable_cpus:
@@ -440,6 +443,8 @@ static int resume_target_kernel(bool platform_mode)
 	if (error)
 		goto Enable_cpus;
 
+	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
+
 	local_irq_disable();
 
 	error = syscore_suspend();
@@ -474,6 +479,7 @@ static int resume_target_kernel(bool platform_mode)
 	syscore_resume();
 
  Enable_irqs:
+	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
 	local_irq_enable();
 
  Enable_cpus:
@@ -555,6 +561,8 @@ int hibernation_platform_enter(void)
 	if (error)
 		goto Platform_finish;
 
+	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
+
 	local_irq_disable();
 	syscore_suspend();
 	if (pm_wakeup_pending()) {
@@ -568,6 +576,7 @@ int hibernation_platform_enter(void)
 
  Power_up:
 	syscore_resume();
+	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
 	local_irq_enable();
 	enable_nonboot_cpus();
 
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index ed35a4790afe..ca6c56a87ea3 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -12,6 +12,7 @@
 #include <linux/delay.h>
 #include <linux/errno.h>
 #include <linux/init.h>
+#include <linux/clockchips.h>
 #include <linux/console.h>
 #include <linux/cpu.h>
 #include <linux/cpuidle.h>
@@ -253,6 +254,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
 	if (error || suspend_test(TEST_CPUS))
 		goto Enable_cpus;
 
+	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
+
 	arch_suspend_disable_irqs();
 	BUG_ON(!irqs_disabled());
 
@@ -270,6 +273,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
 		syscore_resume();
 	}
 
+	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
+
 	arch_suspend_enable_irqs();
 	BUG_ON(irqs_disabled());
 
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 32d8d6aaedb8..d2f21cbe2bfd 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1032,7 +1032,6 @@ static int timekeeping_suspend(void)
 	write_seqcount_end(&timekeeper_seq);
 	raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
 
-	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
 	clocksource_suspend();
 	clockevents_suspend();
 
-- 
2.0.1.1.gfbfc394


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
  2014-07-25 21:06   ` [PATCH RFC v2] PM / sleep: " Soren Brinkmann
@ 2014-07-27 23:18     ` Rafael J. Wysocki
  2014-07-28 19:26       ` John Stultz
  2014-07-28 10:05     ` Pavel Machek
  2014-07-28 19:19     ` Stephen Boyd
  2 siblings, 1 reply; 18+ messages in thread
From: Rafael J. Wysocki @ 2014-07-27 23:18 UTC (permalink / raw)
  To: Soren Brinkmann, John Stultz
  Cc: Thomas Gleixner, Pavel Machek, Len Brown, linux-kernel, linux-pm,
	Daniel Lezcano

On Friday, July 25, 2014 02:06:48 PM Soren Brinkmann wrote:
> On platforms that do not power off during suspend, successfully entering
> suspend races with timers.
> 
> The race happening in a couple of location is:
> 
>   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
>      ...
>   2. syscore_suspend()
>       -> timekeeping_suspend()
>        -> clockevents_notify(SUSPEND)
>         -> tick_suspend()   	(timers are turned off here)
>      ...
>   3. wfi            		(wait for wake-IRQ here)
> 
> Between steps 1 and 2 the timers can still generate interrupts that are
> not handled and stay pending until step 3. That pending IRQ causes an
> immediate - spurious - wake.
> 
> The solution is to move the clockevents suspend/resume notification
> out of the syscore_suspend step and explictly call them at the appropriate
> time in the suspend/hibernation paths. I.e. timers are suspend _before_
> IRQs get disabled. And accordingly in the resume path.
> 
> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> ---
> Hi,
> 
> This is my second shot at this. I followed John's suggestion to keep the
> timekeeping suspend where it is and just move the shutdown of the clockevent
> devices around.

John, what do you think?


>  kernel/power/hibernate.c  | 9 +++++++++
>  kernel/power/suspend.c    | 5 +++++
>  kernel/time/timekeeping.c | 1 -
>  3 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> index fcc2611d3f14..ab9f807e2ccb 100644
> --- a/kernel/power/hibernate.c
> +++ b/kernel/power/hibernate.c
> @@ -285,6 +285,8 @@ static int create_image(int platform_mode)
>  	if (error || hibernation_test(TEST_CPUS))
>  		goto Enable_cpus;
>  
> +	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
> +
>  	local_irq_disable();
>  
>  	error = syscore_suspend();
> @@ -316,6 +318,7 @@ static int create_image(int platform_mode)
>  	syscore_resume();
>  
>   Enable_irqs:
> +	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
>  	local_irq_enable();
>  
>   Enable_cpus:
> @@ -440,6 +443,8 @@ static int resume_target_kernel(bool platform_mode)
>  	if (error)
>  		goto Enable_cpus;
>  
> +	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
> +
>  	local_irq_disable();
>  
>  	error = syscore_suspend();
> @@ -474,6 +479,7 @@ static int resume_target_kernel(bool platform_mode)
>  	syscore_resume();
>  
>   Enable_irqs:
> +	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
>  	local_irq_enable();
>  
>   Enable_cpus:
> @@ -555,6 +561,8 @@ int hibernation_platform_enter(void)
>  	if (error)
>  		goto Platform_finish;
>  
> +	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
> +
>  	local_irq_disable();
>  	syscore_suspend();
>  	if (pm_wakeup_pending()) {
> @@ -568,6 +576,7 @@ int hibernation_platform_enter(void)
>  
>   Power_up:
>  	syscore_resume();
> +	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
>  	local_irq_enable();
>  	enable_nonboot_cpus();
>  
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index ed35a4790afe..ca6c56a87ea3 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -12,6 +12,7 @@
>  #include <linux/delay.h>
>  #include <linux/errno.h>
>  #include <linux/init.h>
> +#include <linux/clockchips.h>
>  #include <linux/console.h>
>  #include <linux/cpu.h>
>  #include <linux/cpuidle.h>
> @@ -253,6 +254,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
>  	if (error || suspend_test(TEST_CPUS))
>  		goto Enable_cpus;
>  
> +	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
> +
>  	arch_suspend_disable_irqs();
>  	BUG_ON(!irqs_disabled());
>  
> @@ -270,6 +273,8 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
>  		syscore_resume();
>  	}
>  
> +	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
> +
>  	arch_suspend_enable_irqs();
>  	BUG_ON(irqs_disabled());
>  
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 32d8d6aaedb8..d2f21cbe2bfd 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1032,7 +1032,6 @@ static int timekeeping_suspend(void)
>  	write_seqcount_end(&timekeeper_seq);
>  	raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
>  
> -	clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
>  	clocksource_suspend();
>  	clockevents_suspend();
>  
> 

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
  2014-07-25 21:06   ` [PATCH RFC v2] PM / sleep: " Soren Brinkmann
  2014-07-27 23:18     ` Rafael J. Wysocki
@ 2014-07-28 10:05     ` Pavel Machek
  2014-07-28 15:51         ` Sören Brinkmann
  2014-07-28 19:19     ` Stephen Boyd
  2 siblings, 1 reply; 18+ messages in thread
From: Pavel Machek @ 2014-07-28 10:05 UTC (permalink / raw)
  To: Soren Brinkmann
  Cc: John Stultz, Thomas Gleixner, Rafael J. Wysocki, Len Brown,
	linux-kernel, linux-pm, Daniel Lezcano

On Fri 2014-07-25 14:06:48, Soren Brinkmann wrote:
> On platforms that do not power off during suspend, successfully entering
> suspend races with timers.
> 
> The race happening in a couple of location is:
> 
>   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
>      ...
>   2. syscore_suspend()
>       -> timekeeping_suspend()
>        -> clockevents_notify(SUSPEND)
>         -> tick_suspend()   	(timers are turned off here)
>      ...
>   3. wfi            		(wait for wake-IRQ here)
> 
> Between steps 1 and 2 the timers can still generate interrupts that are
> not handled and stay pending until step 3. That pending IRQ causes an
> immediate - spurious - wake.
> 
> The solution is to move the clockevents suspend/resume notification
> out of the syscore_suspend step and explictly call them at the appropriate
> time in the suspend/hibernation paths. I.e. timers are suspend _before_
> IRQs get disabled. And accordingly in the resume path.
> 
> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>

Did you test the hibernation?

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
  2014-07-28 10:05     ` Pavel Machek
@ 2014-07-28 15:51         ` Sören Brinkmann
  0 siblings, 0 replies; 18+ messages in thread
From: Sören Brinkmann @ 2014-07-28 15:51 UTC (permalink / raw)
  To: Pavel Machek
  Cc: John Stultz, Thomas Gleixner, Rafael J. Wysocki, Len Brown,
	linux-kernel, linux-pm, Daniel Lezcano

On Mon, 2014-07-28 at 12:05PM +0200, Pavel Machek wrote:
> On Fri 2014-07-25 14:06:48, Soren Brinkmann wrote:
> > On platforms that do not power off during suspend, successfully entering
> > suspend races with timers.
> > 
> > The race happening in a couple of location is:
> > 
> >   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> >      ...
> >   2. syscore_suspend()
> >       -> timekeeping_suspend()
> >        -> clockevents_notify(SUSPEND)
> >         -> tick_suspend()   	(timers are turned off here)
> >      ...
> >   3. wfi            		(wait for wake-IRQ here)
> > 
> > Between steps 1 and 2 the timers can still generate interrupts that are
> > not handled and stay pending until step 3. That pending IRQ causes an
> > immediate - spurious - wake.
> > 
> > The solution is to move the clockevents suspend/resume notification
> > out of the syscore_suspend step and explictly call them at the appropriate
> > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > IRQs get disabled. And accordingly in the resume path.
> > 
> > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> 
> Did you test the hibernation?
No, I don't have a system to test hibernation. Suspend seems to work
fine though.

	Sören

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
@ 2014-07-28 15:51         ` Sören Brinkmann
  0 siblings, 0 replies; 18+ messages in thread
From: Sören Brinkmann @ 2014-07-28 15:51 UTC (permalink / raw)
  To: Pavel Machek
  Cc: John Stultz, Thomas Gleixner, Rafael J. Wysocki, Len Brown,
	linux-kernel, linux-pm, Daniel Lezcano

On Mon, 2014-07-28 at 12:05PM +0200, Pavel Machek wrote:
> On Fri 2014-07-25 14:06:48, Soren Brinkmann wrote:
> > On platforms that do not power off during suspend, successfully entering
> > suspend races with timers.
> > 
> > The race happening in a couple of location is:
> > 
> >   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> >      ...
> >   2. syscore_suspend()
> >       -> timekeeping_suspend()
> >        -> clockevents_notify(SUSPEND)
> >         -> tick_suspend()   	(timers are turned off here)
> >      ...
> >   3. wfi            		(wait for wake-IRQ here)
> > 
> > Between steps 1 and 2 the timers can still generate interrupts that are
> > not handled and stay pending until step 3. That pending IRQ causes an
> > immediate - spurious - wake.
> > 
> > The solution is to move the clockevents suspend/resume notification
> > out of the syscore_suspend step and explictly call them at the appropriate
> > time in the suspend/hibernation paths. I.e. timers are suspend _before_
> > IRQs get disabled. And accordingly in the resume path.
> > 
> > Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> 
> Did you test the hibernation?
No, I don't have a system to test hibernation. Suspend seems to work
fine though.

	Sören

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
  2014-07-25 21:06   ` [PATCH RFC v2] PM / sleep: " Soren Brinkmann
  2014-07-27 23:18     ` Rafael J. Wysocki
  2014-07-28 10:05     ` Pavel Machek
@ 2014-07-28 19:19     ` Stephen Boyd
  2014-07-28 19:24       ` John Stultz
  2 siblings, 1 reply; 18+ messages in thread
From: Stephen Boyd @ 2014-07-28 19:19 UTC (permalink / raw)
  To: Soren Brinkmann
  Cc: John Stultz, Thomas Gleixner, Rafael J. Wysocki, Pavel Machek,
	Len Brown, linux-kernel, linux-pm, Daniel Lezcano

On 07/25/14 14:06, Soren Brinkmann wrote:
> On platforms that do not power off during suspend, successfully entering
> suspend races with timers.
>
> The race happening in a couple of location is:
>
>   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
>      ...
>   2. syscore_suspend()
>       -> timekeeping_suspend()
>        -> clockevents_notify(SUSPEND)
>         -> tick_suspend()   	(timers are turned off here)
>      ...
>   3. wfi            		(wait for wake-IRQ here)
>
> Between steps 1 and 2 the timers can still generate interrupts that are
> not handled and stay pending until step 3. That pending IRQ causes an
> immediate - spurious - wake.
>
> The solution is to move the clockevents suspend/resume notification
> out of the syscore_suspend step and explictly call them at the appropriate
> time in the suspend/hibernation paths. I.e. timers are suspend _before_
> IRQs get disabled. And accordingly in the resume path.
>
> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>

Do we know which timer_list or hrtimer wants to run while entering
suspend? I'd suspect the scheduler tick but perhaps we just forgot to
cancel some timer during suspend?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
  2014-07-28 19:19     ` Stephen Boyd
@ 2014-07-28 19:24       ` John Stultz
  2014-07-28 19:38         ` Stephen Boyd
  0 siblings, 1 reply; 18+ messages in thread
From: John Stultz @ 2014-07-28 19:24 UTC (permalink / raw)
  To: Stephen Boyd, Soren Brinkmann
  Cc: Thomas Gleixner, Rafael J. Wysocki, Pavel Machek, Len Brown,
	linux-kernel, linux-pm, Daniel Lezcano

On 07/28/2014 12:19 PM, Stephen Boyd wrote:
> On 07/25/14 14:06, Soren Brinkmann wrote:
>> On platforms that do not power off during suspend, successfully entering
>> suspend races with timers.
>>
>> The race happening in a couple of location is:
>>
>>   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
>>      ...
>>   2. syscore_suspend()
>>       -> timekeeping_suspend()
>>        -> clockevents_notify(SUSPEND)
>>         -> tick_suspend()   	(timers are turned off here)
>>      ...
>>   3. wfi            		(wait for wake-IRQ here)
>>
>> Between steps 1 and 2 the timers can still generate interrupts that are
>> not handled and stay pending until step 3. That pending IRQ causes an
>> immediate - spurious - wake.
>>
>> The solution is to move the clockevents suspend/resume notification
>> out of the syscore_suspend step and explictly call them at the appropriate
>> time in the suspend/hibernation paths. I.e. timers are suspend _before_
>> IRQs get disabled. And accordingly in the resume path.
>>
>> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> Do we know which timer_list or hrtimer wants to run while entering
> suspend? I'd suspect the scheduler tick but perhaps we just forgot to
> cancel some timer during suspend?

Though, canceling timers really shouldn't be necessary for
suspend/resume, no?

thanks
-john


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
  2014-07-27 23:18     ` Rafael J. Wysocki
@ 2014-07-28 19:26       ` John Stultz
  0 siblings, 0 replies; 18+ messages in thread
From: John Stultz @ 2014-07-28 19:26 UTC (permalink / raw)
  To: Rafael J. Wysocki, Soren Brinkmann
  Cc: Thomas Gleixner, Pavel Machek, Len Brown, linux-kernel, linux-pm,
	Daniel Lezcano

On 07/27/2014 04:18 PM, Rafael J. Wysocki wrote:
> On Friday, July 25, 2014 02:06:48 PM Soren Brinkmann wrote:
>> On platforms that do not power off during suspend, successfully entering
>> suspend races with timers.
>>
>> The race happening in a couple of location is:
>>
>>   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
>>      ...
>>   2. syscore_suspend()
>>       -> timekeeping_suspend()
>>        -> clockevents_notify(SUSPEND)
>>         -> tick_suspend()   	(timers are turned off here)
>>      ...
>>   3. wfi            		(wait for wake-IRQ here)
>>
>> Between steps 1 and 2 the timers can still generate interrupts that are
>> not handled and stay pending until step 3. That pending IRQ causes an
>> immediate - spurious - wake.
>>
>> The solution is to move the clockevents suspend/resume notification
>> out of the syscore_suspend step and explictly call them at the appropriate
>> time in the suspend/hibernation paths. I.e. timers are suspend _before_
>> IRQs get disabled. And accordingly in the resume path.
>>
>> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
>> ---
>> Hi,
>>
>> This is my second shot at this. I followed John's suggestion to keep the
>> timekeeping suspend where it is and just move the shutdown of the clockevent
>> devices around.
> John, what do you think?

I've not had the chance to take a closer look and do any testing. I
suspect we'll need tgxl's input here as well.

The change makes sense, but ordering modifications in this area tend to
be fragile, as there are lots of implicit dependencies.

thanks
-john


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
  2014-07-28 19:24       ` John Stultz
@ 2014-07-28 19:38         ` Stephen Boyd
  2014-07-28 20:02             ` Sören Brinkmann
  0 siblings, 1 reply; 18+ messages in thread
From: Stephen Boyd @ 2014-07-28 19:38 UTC (permalink / raw)
  To: John Stultz
  Cc: Soren Brinkmann, Thomas Gleixner, Rafael J. Wysocki,
	Pavel Machek, Len Brown, linux-kernel, linux-pm, Daniel Lezcano

On 07/28/14 12:24, John Stultz wrote:
> On 07/28/2014 12:19 PM, Stephen Boyd wrote:
>> On 07/25/14 14:06, Soren Brinkmann wrote:
>>> On platforms that do not power off during suspend, successfully entering
>>> suspend races with timers.
>>>
>>> The race happening in a couple of location is:
>>>
>>>   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
>>>      ...
>>>   2. syscore_suspend()
>>>       -> timekeeping_suspend()
>>>        -> clockevents_notify(SUSPEND)
>>>         -> tick_suspend()   	(timers are turned off here)
>>>      ...
>>>   3. wfi            		(wait for wake-IRQ here)
>>>
>>> Between steps 1 and 2 the timers can still generate interrupts that are
>>> not handled and stay pending until step 3. That pending IRQ causes an
>>> immediate - spurious - wake.
>>>
>>> The solution is to move the clockevents suspend/resume notification
>>> out of the syscore_suspend step and explictly call them at the appropriate
>>> time in the suspend/hibernation paths. I.e. timers are suspend _before_
>>> IRQs get disabled. And accordingly in the resume path.
>>>
>>> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
>> Do we know which timer_list or hrtimer wants to run while entering
>> suspend? I'd suspect the scheduler tick but perhaps we just forgot to
>> cancel some timer during suspend?
> Though, canceling timers really shouldn't be necessary for
> suspend/resume, no?
>
>

Agreed. Perhaps I put it the wrong way. I'm worried that some timer
needs to run just when we go into suspend. As long as that timer is the
scheduler tick we should be ok, but if it isn't the scheduler tick then
it would be good to know what it is and why it's pending. Unless the
idea is that if we get this far into suspend and there's a pending timer
we should just ignore it and go to sleep anyway?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
  2014-07-28 19:38         ` Stephen Boyd
@ 2014-07-28 20:02             ` Sören Brinkmann
  0 siblings, 0 replies; 18+ messages in thread
From: Sören Brinkmann @ 2014-07-28 20:02 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: John Stultz, Thomas Gleixner, Rafael J. Wysocki, Pavel Machek,
	Len Brown, linux-kernel, linux-pm, Daniel Lezcano

On Mon, 2014-07-28 at 12:38PM -0700, Stephen Boyd wrote:
> On 07/28/14 12:24, John Stultz wrote:
> > On 07/28/2014 12:19 PM, Stephen Boyd wrote:
> >> On 07/25/14 14:06, Soren Brinkmann wrote:
> >>> On platforms that do not power off during suspend, successfully entering
> >>> suspend races with timers.
> >>>
> >>> The race happening in a couple of location is:
> >>>
> >>>   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> >>>      ...
> >>>   2. syscore_suspend()
> >>>       -> timekeeping_suspend()
> >>>        -> clockevents_notify(SUSPEND)
> >>>         -> tick_suspend()   	(timers are turned off here)
> >>>      ...
> >>>   3. wfi            		(wait for wake-IRQ here)
> >>>
> >>> Between steps 1 and 2 the timers can still generate interrupts that are
> >>> not handled and stay pending until step 3. That pending IRQ causes an
> >>> immediate - spurious - wake.
> >>>
> >>> The solution is to move the clockevents suspend/resume notification
> >>> out of the syscore_suspend step and explictly call them at the appropriate
> >>> time in the suspend/hibernation paths. I.e. timers are suspend _before_
> >>> IRQs get disabled. And accordingly in the resume path.
> >>>
> >>> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> >> Do we know which timer_list or hrtimer wants to run while entering
> >> suspend? I'd suspect the scheduler tick but perhaps we just forgot to
> >> cancel some timer during suspend?
> > Though, canceling timers really shouldn't be necessary for
> > suspend/resume, no?
> >
> >
> 
> Agreed. Perhaps I put it the wrong way. I'm worried that some timer
> needs to run just when we go into suspend. As long as that timer is the
> scheduler tick we should be ok, but if it isn't the scheduler tick then
> it would be good to know what it is and why it's pending. Unless the
> idea is that if we get this far into suspend and there's a pending timer
> we should just ignore it and go to sleep anyway?

Well, that is pretty much what happens currently. The IRQs are disabled
and nobody cares about the pending timer. My problem with that is, that
"suspend" for Zynq is just waiting in WFI. Hence, the pending interrupts
causes an immediate resume.
So, it should hopefully be more or less fine since the current
implementation basically ignores the timer. With this patch we just shut
them down a little earlier to prevent this pending interrupt - at least
that is the intention.

	Thanks,
	Sören

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
@ 2014-07-28 20:02             ` Sören Brinkmann
  0 siblings, 0 replies; 18+ messages in thread
From: Sören Brinkmann @ 2014-07-28 20:02 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: John Stultz, Thomas Gleixner, Rafael J. Wysocki, Pavel Machek,
	Len Brown, linux-kernel, linux-pm, Daniel Lezcano

On Mon, 2014-07-28 at 12:38PM -0700, Stephen Boyd wrote:
> On 07/28/14 12:24, John Stultz wrote:
> > On 07/28/2014 12:19 PM, Stephen Boyd wrote:
> >> On 07/25/14 14:06, Soren Brinkmann wrote:
> >>> On platforms that do not power off during suspend, successfully entering
> >>> suspend races with timers.
> >>>
> >>> The race happening in a couple of location is:
> >>>
> >>>   1. disable IRQs   		(e.g. arch_suspend_disable_irqs())
> >>>      ...
> >>>   2. syscore_suspend()
> >>>       -> timekeeping_suspend()
> >>>        -> clockevents_notify(SUSPEND)
> >>>         -> tick_suspend()   	(timers are turned off here)
> >>>      ...
> >>>   3. wfi            		(wait for wake-IRQ here)
> >>>
> >>> Between steps 1 and 2 the timers can still generate interrupts that are
> >>> not handled and stay pending until step 3. That pending IRQ causes an
> >>> immediate - spurious - wake.
> >>>
> >>> The solution is to move the clockevents suspend/resume notification
> >>> out of the syscore_suspend step and explictly call them at the appropriate
> >>> time in the suspend/hibernation paths. I.e. timers are suspend _before_
> >>> IRQs get disabled. And accordingly in the resume path.
> >>>
> >>> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
> >> Do we know which timer_list or hrtimer wants to run while entering
> >> suspend? I'd suspect the scheduler tick but perhaps we just forgot to
> >> cancel some timer during suspend?
> > Though, canceling timers really shouldn't be necessary for
> > suspend/resume, no?
> >
> >
> 
> Agreed. Perhaps I put it the wrong way. I'm worried that some timer
> needs to run just when we go into suspend. As long as that timer is the
> scheduler tick we should be ok, but if it isn't the scheduler tick then
> it would be good to know what it is and why it's pending. Unless the
> idea is that if we get this far into suspend and there's a pending timer
> we should just ignore it and go to sleep anyway?

Well, that is pretty much what happens currently. The IRQs are disabled
and nobody cares about the pending timer. My problem with that is, that
"suspend" for Zynq is just waiting in WFI. Hence, the pending interrupts
causes an immediate resume.
So, it should hopefully be more or less fine since the current
implementation basically ignores the timer. With this patch we just shut
them down a little earlier to prevent this pending interrupt - at least
that is the intention.

	Thanks,
	Sören

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
  2014-07-28 20:02             ` Sören Brinkmann
  (?)
@ 2014-07-29 23:05             ` Stephen Boyd
  2014-07-29 23:22                 ` Sören Brinkmann
  -1 siblings, 1 reply; 18+ messages in thread
From: Stephen Boyd @ 2014-07-29 23:05 UTC (permalink / raw)
  To: Sören Brinkmann
  Cc: John Stultz, Thomas Gleixner, Rafael J. Wysocki, Pavel Machek,
	Len Brown, linux-kernel, linux-pm, Daniel Lezcano

On 07/28/14 13:02, Sören Brinkmann wrote:
> On Mon, 2014-07-28 at 12:38PM -0700, Stephen Boyd wrote:
>>
>> Agreed. Perhaps I put it the wrong way. I'm worried that some timer
>> needs to run just when we go into suspend. As long as that timer is the
>> scheduler tick we should be ok, but if it isn't the scheduler tick then
>> it would be good to know what it is and why it's pending. Unless the
>> idea is that if we get this far into suspend and there's a pending timer
>> we should just ignore it and go to sleep anyway?
> Well, that is pretty much what happens currently. The IRQs are disabled
> and nobody cares about the pending timer. 

Yep. It sounds like we don't know what it is so let's hope it's the
sched tick. I suspect that driver suspend paths are canceling their
timers because their hardware has been quiesced.

> My problem with that is, that
> "suspend" for Zynq is just waiting in WFI. Hence, the pending interrupts
> causes an immediate resume.
> So, it should hopefully be more or less fine since the current
> implementation basically ignores the timer. With this patch we just shut
> them down a little earlier to prevent this pending interrupt - at least
> that is the intention.
>

That sort of WFI based suspend doesn't actually sound like a memory
suspend at all. It's really the "freeze" state where we would sit in the
deepest CPU idle state waiting for some prescribed wakeup event (power
button press, etc.) that would then trigger a wakeup_source to be
activated and then wakeup the suspend thread.

Unless the WFI actually triggers some power state controller? For
example, on the ARM platforms I have we trigger suspend via a WFI, which
causes a power state controller to pull the power from the CPU that
triggered the WFI and then goes ahead and turns off the rest of the SoC
power and puts the ddr in self-refresh. If we have a pending irq then
the power state controller would abort suspend and we'd come right back
almost immediately (similar to your situation). The thing is we don't
see any pending irqs and we don't have this patch, so I wonder if we
just haven't hit this case, or if there's something more fundamental
going on that causes a difference. Or maybe we do see this pending irq
sometimes but we don't care because we'll try and go right back to
suspend again.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
  2014-07-29 23:05             ` Stephen Boyd
@ 2014-07-29 23:22                 ` Sören Brinkmann
  0 siblings, 0 replies; 18+ messages in thread
From: Sören Brinkmann @ 2014-07-29 23:22 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: John Stultz, Thomas Gleixner, Rafael J. Wysocki, Pavel Machek,
	Len Brown, linux-kernel, linux-pm, Daniel Lezcano

On Tue, 2014-07-29 at 04:05PM -0700, Stephen Boyd wrote:
> On 07/28/14 13:02, Sören Brinkmann wrote:
> > On Mon, 2014-07-28 at 12:38PM -0700, Stephen Boyd wrote:
> >>
> >> Agreed. Perhaps I put it the wrong way. I'm worried that some timer
> >> needs to run just when we go into suspend. As long as that timer is the
> >> scheduler tick we should be ok, but if it isn't the scheduler tick then
> >> it would be good to know what it is and why it's pending. Unless the
> >> idea is that if we get this far into suspend and there's a pending timer
> >> we should just ignore it and go to sleep anyway?
> > Well, that is pretty much what happens currently. The IRQs are disabled
> > and nobody cares about the pending timer. 
> 
> Yep. It sounds like we don't know what it is so let's hope it's the
> sched tick. I suspect that driver suspend paths are canceling their
> timers because their hardware has been quiesced.
Drivers probably, as well as the suspend path in general, mask all non-timer
and non-wakeup interrupts.

> 
> > My problem with that is, that
> > "suspend" for Zynq is just waiting in WFI. Hence, the pending interrupts
> > causes an immediate resume.
> > So, it should hopefully be more or less fine since the current
> > implementation basically ignores the timer. With this patch we just shut
> > them down a little earlier to prevent this pending interrupt - at least
> > that is the intention.
> >
> 
> That sort of WFI based suspend doesn't actually sound like a memory
> suspend at all. It's really the "freeze" state where we would sit in the
> deepest CPU idle state waiting for some prescribed wakeup event (power
> button press, etc.) that would then trigger a wakeup_source to be
> activated and then wakeup the suspend thread.
> 
> Unless the WFI actually triggers some power state controller? For
> example, on the ARM platforms I have we trigger suspend via a WFI, which
> causes a power state controller to pull the power from the CPU that
> triggered the WFI and then goes ahead and turns off the rest of the SoC
> power and puts the ddr in self-refresh. If we have a pending irq then
> the power state controller would abort suspend and we'd come right back
> almost immediately (similar to your situation). The thing is we don't
> see any pending irqs and we don't have this patch, so I wonder if we
> just haven't hit this case, or if there's something more fundamental
> going on that causes a difference. Or maybe we do see this pending irq
> sometimes but we don't care because we'll try and go right back to
> suspend again.
On Zynq we don't have such sophisticated external helpers. The ARM core
does everything on its own and power down is pretty much impossible by
design.
So, for Zynq we enable some low power features, move execution to OCM,
shut down PLLs and DRAM as far as possible and then just sit in wfi
(which might trigger some external low power features like SCU and L2$
standby etc.), but no smart external power controller.

One thing that might make this happen on Zynq more frequently is
that we use a 16-bit timer. I guess that timer overflowing all the
time probably causes more frequent interrupts than you'd see on
platforms with wider timers. After all the window in which this
problem would occur is rather small.

	Thanks,
	Sören

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC v2] PM / sleep: Fix racing timers
@ 2014-07-29 23:22                 ` Sören Brinkmann
  0 siblings, 0 replies; 18+ messages in thread
From: Sören Brinkmann @ 2014-07-29 23:22 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: John Stultz, Thomas Gleixner, Rafael J. Wysocki, Pavel Machek,
	Len Brown, linux-kernel, linux-pm, Daniel Lezcano

On Tue, 2014-07-29 at 04:05PM -0700, Stephen Boyd wrote:
> On 07/28/14 13:02, Sören Brinkmann wrote:
> > On Mon, 2014-07-28 at 12:38PM -0700, Stephen Boyd wrote:
> >>
> >> Agreed. Perhaps I put it the wrong way. I'm worried that some timer
> >> needs to run just when we go into suspend. As long as that timer is the
> >> scheduler tick we should be ok, but if it isn't the scheduler tick then
> >> it would be good to know what it is and why it's pending. Unless the
> >> idea is that if we get this far into suspend and there's a pending timer
> >> we should just ignore it and go to sleep anyway?
> > Well, that is pretty much what happens currently. The IRQs are disabled
> > and nobody cares about the pending timer. 
> 
> Yep. It sounds like we don't know what it is so let's hope it's the
> sched tick. I suspect that driver suspend paths are canceling their
> timers because their hardware has been quiesced.
Drivers probably, as well as the suspend path in general, mask all non-timer
and non-wakeup interrupts.

> 
> > My problem with that is, that
> > "suspend" for Zynq is just waiting in WFI. Hence, the pending interrupts
> > causes an immediate resume.
> > So, it should hopefully be more or less fine since the current
> > implementation basically ignores the timer. With this patch we just shut
> > them down a little earlier to prevent this pending interrupt - at least
> > that is the intention.
> >
> 
> That sort of WFI based suspend doesn't actually sound like a memory
> suspend at all. It's really the "freeze" state where we would sit in the
> deepest CPU idle state waiting for some prescribed wakeup event (power
> button press, etc.) that would then trigger a wakeup_source to be
> activated and then wakeup the suspend thread.
> 
> Unless the WFI actually triggers some power state controller? For
> example, on the ARM platforms I have we trigger suspend via a WFI, which
> causes a power state controller to pull the power from the CPU that
> triggered the WFI and then goes ahead and turns off the rest of the SoC
> power and puts the ddr in self-refresh. If we have a pending irq then
> the power state controller would abort suspend and we'd come right back
> almost immediately (similar to your situation). The thing is we don't
> see any pending irqs and we don't have this patch, so I wonder if we
> just haven't hit this case, or if there's something more fundamental
> going on that causes a difference. Or maybe we do see this pending irq
> sometimes but we don't care because we'll try and go right back to
> suspend again.
On Zynq we don't have such sophisticated external helpers. The ARM core
does everything on its own and power down is pretty much impossible by
design.
So, for Zynq we enable some low power features, move execution to OCM,
shut down PLLs and DRAM as far as possible and then just sit in wfi
(which might trigger some external low power features like SCU and L2$
standby etc.), but no smart external power controller.

One thing that might make this happen on Zynq more frequently is
that we use a 16-bit timer. I guess that timer overflowing all the
time probably causes more frequent interrupts than you'd see on
platforms with wider timers. After all the window in which this
problem would occur is rather small.

	Thanks,
	Sören

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2014-07-29 23:22 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-21 17:35 [PATCH RFC] suspend/hibernation: Fix racing timers Soren Brinkmann
2014-07-24  3:55 ` John Stultz
2014-07-24 15:59   ` Sören Brinkmann
2014-07-24 15:59     ` Sören Brinkmann
2014-07-25 21:06   ` [PATCH RFC v2] PM / sleep: " Soren Brinkmann
2014-07-27 23:18     ` Rafael J. Wysocki
2014-07-28 19:26       ` John Stultz
2014-07-28 10:05     ` Pavel Machek
2014-07-28 15:51       ` Sören Brinkmann
2014-07-28 15:51         ` Sören Brinkmann
2014-07-28 19:19     ` Stephen Boyd
2014-07-28 19:24       ` John Stultz
2014-07-28 19:38         ` Stephen Boyd
2014-07-28 20:02           ` Sören Brinkmann
2014-07-28 20:02             ` Sören Brinkmann
2014-07-29 23:05             ` Stephen Boyd
2014-07-29 23:22               ` Sören Brinkmann
2014-07-29 23:22                 ` Sören Brinkmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.