linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle
@ 2015-02-11  4:00 Rafael J. Wysocki
  2015-02-11  4:01 ` [PATCH 1/6] PM / sleep: Re-implement suspend-to-idle handling Rafael J. Wysocki
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-11  4:00 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra, Alan Cox
  Cc: Li, Aubrey, LKML, Linux PM list, ACPI Devel Mailing List,
	Kristen Carlson Accardi, John Stultz, Len Brown

Hi,

This series adds support for quiescing timers during the last state of
suspend-to-idle transitions.

Patches [1-4/6] together are functionally equivalent to the combo RFC patch
I sent last time (http://marc.info/?l=linux-kernel&m=142344909201464&w=4).

Patches [5-6/6] add ->enter_freeze callback implementations to intel_idle
and the ACPI cpuidle driver.

[1/6] - Rework the suspend-to-idle "mechanics" in preparation for the subsequent
        changes.  The existing functionality should not change after this.
[2/6] - Modify update_fast_timekeeper() to take struct tk_read_base pointers as
        arguments.
[3/6] - Make it safe to use the fast timekeeper while suspended.
[4/6] - Support for quiescing timers during suspend-to-idle (core part).
[5/6] - ->enter_freeze callback for intel_idle.
[6/6] - ->enter_freeze callback for ACPI cpuidle.

This works as expected on everything I have in my office and can readily test.

The patches should apply without any problems on top of the current Linus tree.

Rafael



* [PATCH 1/6] PM / sleep: Re-implement suspend-to-idle handling
  2015-02-11  4:00 [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Rafael J. Wysocki
@ 2015-02-11  4:01 ` Rafael J. Wysocki
  2015-02-12 13:14   ` Peter Zijlstra
  2015-02-12 22:33   ` [Update][PATCH 1/6 v2] " Rafael J. Wysocki
  2015-02-11  4:01 ` [PATCH 2/6] timekeeping: Pass readout base to update_fast_timekeeper() Rafael J. Wysocki
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-11  4:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

In preparation for adding support for quiescing timers in the final
stage of suspend-to-idle transitions, rework freeze_enter(), which
makes the system wait for a wakeup event; freeze_wake(), which
terminates the suspend-to-idle loop; and the mechanism by which
deep idle states are entered during suspend-to-idle.

First of all, introduce a simple state machine for suspend-to-idle
and make the code in question use it.

Second, prevent freeze_enter() from losing wakeup events due to race
conditions and ensure that the number of online CPUs won't change
while it is being executed.  In addition to that, make it force
all of the CPUs re-enter the idle loop in case they are in idle
states already (so they can enter deeper idle states if possible).

Next, drop cpuidle_use_deepest_state() and replace use_deepest_state
checks in cpuidle_select() and cpuidle_reflect() with a single
suspend-to-idle state check in cpuidle_idle_call().

Finally, introduce cpuidle_enter_freeze() that will simply find the
deepest idle state available to the given CPU and enter it using
cpuidle_enter().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/cpuidle.c |   49 +++++++++++++++++++++-------------------
 include/linux/cpuidle.h   |    4 +--
 include/linux/suspend.h   |    2 +
 kernel/power/suspend.c    |   55 ++++++++++++++++++++++++++++++++++++++++------
 kernel/sched/idle.c       |   16 +++++++++++++
 5 files changed, 94 insertions(+), 32 deletions(-)

Index: linux-pm/kernel/power/suspend.c
===================================================================
--- linux-pm.orig/kernel/power/suspend.c
+++ linux-pm/kernel/power/suspend.c
@@ -37,7 +37,21 @@ const char *pm_states[PM_SUSPEND_MAX];
 static const struct platform_suspend_ops *suspend_ops;
 static const struct platform_freeze_ops *freeze_ops;
 static DECLARE_WAIT_QUEUE_HEAD(suspend_freeze_wait_head);
-static bool suspend_freeze_wake;
+
+/* Suspend-to-idle state machine. */
+enum freeze_state {
+	FREEZE_STATE_NONE,      /* Not suspended/suspending. */
+	FREEZE_STATE_ENTER,     /* Enter suspend-to-idle. */
+	FREEZE_STATE_WAKE,      /* Wake up from suspend-to-idle. */
+};
+
+static enum freeze_state __read_mostly suspend_freeze_state;
+static DEFINE_SPINLOCK(suspend_freeze_lock);
+
+bool idle_should_freeze(void)
+{
+	return unlikely(suspend_freeze_state == FREEZE_STATE_ENTER);
+}
 
 void freeze_set_ops(const struct platform_freeze_ops *ops)
 {
@@ -48,22 +62,49 @@ void freeze_set_ops(const struct platfor
 
 static void freeze_begin(void)
 {
-	suspend_freeze_wake = false;
+	suspend_freeze_state = FREEZE_STATE_NONE;
 }
 
 static void freeze_enter(void)
 {
-	cpuidle_use_deepest_state(true);
+	spin_lock_irq(&suspend_freeze_lock);
+	if (pm_wakeup_pending())
+		goto out;
+
+	suspend_freeze_state = FREEZE_STATE_ENTER;
+	spin_unlock_irq(&suspend_freeze_lock);
+
+	get_online_cpus();
 	cpuidle_resume();
-	wait_event(suspend_freeze_wait_head, suspend_freeze_wake);
+
+	/* Push all the CPUs into the idle loop. */
+	wake_up_all_idle_cpus();
+	pr_debug("PM: suspend-to-idle\n");
+	/* Make the current CPU wait so it can enter the idle loop too. */
+	wait_event(suspend_freeze_wait_head,
+		   suspend_freeze_state == FREEZE_STATE_WAKE);
+	pr_debug("PM: resume from suspend-to-idle\n");
+
 	cpuidle_pause();
-	cpuidle_use_deepest_state(false);
+	put_online_cpus();
+
+	spin_lock_irq(&suspend_freeze_lock);
+
+ out:
+	suspend_freeze_state = FREEZE_STATE_NONE;
+	spin_unlock_irq(&suspend_freeze_lock);
 }
 
 void freeze_wake(void)
 {
-	suspend_freeze_wake = true;
-	wake_up(&suspend_freeze_wait_head);
+	unsigned long flags;
+
+	spin_lock_irqsave(&suspend_freeze_lock, flags);
+	if (suspend_freeze_state > FREEZE_STATE_NONE) {
+		suspend_freeze_state = FREEZE_STATE_WAKE;
+		wake_up(&suspend_freeze_wait_head);
+	}
+	spin_unlock_irqrestore(&suspend_freeze_lock, flags);
 }
 EXPORT_SYMBOL_GPL(freeze_wake);
 
Index: linux-pm/include/linux/suspend.h
===================================================================
--- linux-pm.orig/include/linux/suspend.h
+++ linux-pm/include/linux/suspend.h
@@ -203,6 +203,7 @@ extern void suspend_set_ops(const struct
 extern int suspend_valid_only_mem(suspend_state_t state);
 extern void freeze_set_ops(const struct platform_freeze_ops *ops);
 extern void freeze_wake(void);
+extern bool idle_should_freeze(void);
 
 /**
  * arch_suspend_disable_irqs - disable IRQs for suspend
@@ -230,6 +231,7 @@ static inline void suspend_set_ops(const
 static inline int pm_suspend(suspend_state_t state) { return -ENOSYS; }
 static inline void freeze_set_ops(const struct platform_freeze_ops *ops) {}
 static inline void freeze_wake(void) {}
+static inline bool idle_should_freeze(void) { return false; }
 #endif /* !CONFIG_SUSPEND */
 
 /* struct pbe is used for creating lists of pages that should be restored
Index: linux-pm/kernel/sched/idle.c
===================================================================
--- linux-pm.orig/kernel/sched/idle.c
+++ linux-pm/kernel/sched/idle.c
@@ -7,6 +7,7 @@
 #include <linux/tick.h>
 #include <linux/mm.h>
 #include <linux/stackprotector.h>
+#include <linux/suspend.h>
 
 #include <asm/tlb.h>
 
@@ -104,6 +105,21 @@ static void cpuidle_idle_call(void)
 	rcu_idle_enter();
 
 	/*
+	 * Suspend-to-idle ("freeze") is a system state in which all user space
+	 * has been frozen, all I/O devices have been suspended and the only
+	 * activity happens here and in interrupts (if any).  In that case bypass
+	 * the cpuidle governor and go straight for the deepest idle state
+	 * available.  Possibly also suspend the local tick and the entire
+	 * timekeeping to prevent timer interrupts from kicking us out of idle
+	 * until a proper wakeup interrupt happens.
+	 */
+	if (idle_should_freeze()) {
+		cpuidle_enter_freeze();
+		local_irq_enable();
+		goto exit_idle;
+	}
+
+	/*
 	 * Ask the cpuidle framework to choose a convenient idle state.
 	 * Fall back to the default arch idle method on errors.
 	 */
Index: linux-pm/drivers/cpuidle/cpuidle.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/cpuidle.c
+++ linux-pm/drivers/cpuidle/cpuidle.c
@@ -19,6 +19,7 @@
 #include <linux/ktime.h>
 #include <linux/hrtimer.h>
 #include <linux/module.h>
+#include <linux/suspend.h>
 #include <trace/events/power.h>
 
 #include "cpuidle.h"
@@ -32,7 +33,6 @@ LIST_HEAD(cpuidle_detected_devices);
 static int enabled_devices;
 static int off __read_mostly;
 static int initialized __read_mostly;
-static bool use_deepest_state __read_mostly;
 
 int cpuidle_disabled(void)
 {
@@ -66,24 +66,9 @@ int cpuidle_play_dead(void)
 }
 
 /**
- * cpuidle_use_deepest_state - Enable/disable the "deepest idle" mode.
- * @enable: Whether enable or disable the feature.
- *
- * If the "deepest idle" mode is enabled, cpuidle will ignore the governor and
- * always use the state with the greatest exit latency (out of the states that
- * are not disabled).
- *
- * This function can only be called after cpuidle_pause() to avoid races.
- */
-void cpuidle_use_deepest_state(bool enable)
-{
-	use_deepest_state = enable;
-}
-
-/**
- * cpuidle_find_deepest_state - Find the state of the greatest exit latency.
- * @drv: cpuidle driver for a given CPU.
- * @dev: cpuidle device for a given CPU.
+ * cpuidle_find_deepest_state - Find deepest state meeting specific conditions.
+ * @drv: cpuidle driver for the given CPU.
+ * @dev: cpuidle device for the given CPU.
  */
 static int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
 				      struct cpuidle_device *dev)
@@ -105,6 +90,27 @@ static int cpuidle_find_deepest_state(st
 }
 
 /**
+ * cpuidle_enter_freeze - Enter an idle state suitable for suspend-to-idle.
+ *
+ * Find the deepest state available and enter it.
+ */
+void cpuidle_enter_freeze(void)
+{
+	struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices);
+	struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
+	int index;
+
+	index = cpuidle_find_deepest_state(drv, dev);
+	if (index >= 0)
+		cpuidle_enter(drv, dev, index);
+	else
+		arch_cpu_idle();
+
+	/* Interrupts are enabled again here. */
+	local_irq_disable();
+}
+
+/**
  * cpuidle_enter_state - enter the state and update stats
  * @dev: cpuidle device for this cpu
  * @drv: cpuidle driver for this cpu
@@ -166,9 +172,6 @@ int cpuidle_select(struct cpuidle_driver
 	if (!drv || !dev || !dev->enabled)
 		return -EBUSY;
 
-	if (unlikely(use_deepest_state))
-		return cpuidle_find_deepest_state(drv, dev);
-
 	return cpuidle_curr_governor->select(drv, dev);
 }
 
@@ -200,7 +203,7 @@ int cpuidle_enter(struct cpuidle_driver
  */
 void cpuidle_reflect(struct cpuidle_device *dev, int index)
 {
-	if (cpuidle_curr_governor->reflect && !unlikely(use_deepest_state))
+	if (cpuidle_curr_governor->reflect)
 		cpuidle_curr_governor->reflect(dev, index);
 }
 
Index: linux-pm/include/linux/cpuidle.h
===================================================================
--- linux-pm.orig/include/linux/cpuidle.h
+++ linux-pm/include/linux/cpuidle.h
@@ -141,7 +141,7 @@ extern void cpuidle_resume(void);
 extern int cpuidle_enable_device(struct cpuidle_device *dev);
 extern void cpuidle_disable_device(struct cpuidle_device *dev);
 extern int cpuidle_play_dead(void);
-extern void cpuidle_use_deepest_state(bool enable);
+extern void cpuidle_enter_freeze(void);
 
 extern struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev);
 #else
@@ -174,7 +174,7 @@ static inline int cpuidle_enable_device(
 {return -ENODEV; }
 static inline void cpuidle_disable_device(struct cpuidle_device *dev) { }
 static inline int cpuidle_play_dead(void) {return -ENODEV; }
-static inline void cpuidle_use_deepest_state(bool enable) {}
+static inline void cpuidle_enter_freeze(void) { }
 static inline struct cpuidle_driver *cpuidle_get_cpu_driver(
 	struct cpuidle_device *dev) {return NULL; }
 #endif



* [PATCH 2/6] timekeeping: Pass readout base to update_fast_timekeeper()
  2015-02-11  4:00 [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Rafael J. Wysocki
  2015-02-11  4:01 ` [PATCH 1/6] PM / sleep: Re-implement suspend-to-idle handling Rafael J. Wysocki
@ 2015-02-11  4:01 ` Rafael J. Wysocki
  2015-02-13  0:29   ` John Stultz
  2015-02-11  4:03 ` [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended Rafael J. Wysocki
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-11  4:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Modify update_fast_timekeeper() to take a struct tk_read_base
pointer as its argument (instead of a struct timekeeper pointer)
and update its kerneldoc comment to reflect that.

That will allow a struct tk_read_base that is not part of a
struct timekeeper to be passed to it in the next patch.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 kernel/time/timekeeping.c |   14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

Index: linux-pm/kernel/time/timekeeping.c
===================================================================
--- linux-pm.orig/kernel/time/timekeeping.c
+++ linux-pm/kernel/time/timekeeping.c
@@ -230,9 +230,7 @@ static inline s64 timekeeping_get_ns_raw
 
 /**
  * update_fast_timekeeper - Update the fast and NMI safe monotonic timekeeper.
- * @tk:		The timekeeper from which we take the update
- * @tkf:	The fast timekeeper to update
- * @tbase:	The time base for the fast timekeeper (mono/raw)
+ * @tkr: Timekeeping readout base from which we take the update
  *
  * We want to use this from any context including NMI and tracing /
  * instrumenting the timekeeping code itself.
@@ -244,11 +242,11 @@ static inline s64 timekeeping_get_ns_raw
  * smp_wmb();	<- Ensure that the last base[1] update is visible
  * tkf->seq++;
  * smp_wmb();	<- Ensure that the seqcount update is visible
- * update(tkf->base[0], tk);
+ * update(tkf->base[0], tkr);
  * smp_wmb();	<- Ensure that the base[0] update is visible
  * tkf->seq++;
  * smp_wmb();	<- Ensure that the seqcount update is visible
- * update(tkf->base[1], tk);
+ * update(tkf->base[1], tkr);
  *
  * The reader side does:
  *
@@ -269,7 +267,7 @@ static inline s64 timekeeping_get_ns_raw
  * slightly wrong timestamp (a few nanoseconds). See
  * @ktime_get_mono_fast_ns.
  */
-static void update_fast_timekeeper(struct timekeeper *tk)
+static void update_fast_timekeeper(struct tk_read_base *tkr)
 {
 	struct tk_read_base *base = tk_fast_mono.base;
 
@@ -277,7 +275,7 @@ static void update_fast_timekeeper(struc
 	raw_write_seqcount_latch(&tk_fast_mono.seq);
 
 	/* Update base[0] */
-	memcpy(base, &tk->tkr, sizeof(*base));
+	memcpy(base, tkr, sizeof(*base));
 
 	/* Force readers back to base[0] */
 	raw_write_seqcount_latch(&tk_fast_mono.seq);
@@ -462,7 +460,7 @@ static void timekeeping_update(struct ti
 		memcpy(&shadow_timekeeper, &tk_core.timekeeper,
 		       sizeof(tk_core.timekeeper));
 
-	update_fast_timekeeper(tk);
+	update_fast_timekeeper(&tk->tkr);
 }
 
 /**



* [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended
  2015-02-11  4:00 [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Rafael J. Wysocki
  2015-02-11  4:01 ` [PATCH 1/6] PM / sleep: Re-implement suspend-to-idle handling Rafael J. Wysocki
  2015-02-11  4:01 ` [PATCH 2/6] timekeeping: Pass readout base to update_fast_timekeeper() Rafael J. Wysocki
@ 2015-02-11  4:03 ` Rafael J. Wysocki
  2015-02-13  0:53   ` John Stultz
  2015-02-11  4:03 ` [PATCH 4/6] PM / sleep: Make it possible to quiesce timers during suspend-to-idle Rafael J. Wysocki
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-11  4:03 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Theoretically, ktime_get_mono_fast_ns() may be executed after
timekeeping has been suspended (or before it is resumed) which
in turn may lead to undefined behavior, for example, when the
clocksource read from timekeeping_get_ns() called by it is
not accessible at that time.

Prevent that from happening by setting up a dummy readout base for
the fast timekeeper during timekeeping_suspend() such that it will
always return the same number of cycles.

After the last timekeeping_update() in timekeeping_suspend() the
clocksource is read and the result is stored as cycles_at_suspend.
The readout base from the current timekeeper is copied onto the
dummy and the ->read pointer of the dummy is set to a routine
unconditionally returning cycles_at_suspend.  Next, the dummy is
passed to update_fast_timekeeper().

Then, ktime_get_mono_fast_ns() will work until the subsequent
timekeeping_resume() and the proper readout base for the fast
timekeeper will be restored by the timekeeping_update() called
right after clearing timekeeping_suspended.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 kernel/time/timekeeping.c |   22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

Index: linux-pm/kernel/time/timekeeping.c
===================================================================
--- linux-pm.orig/kernel/time/timekeeping.c
+++ linux-pm/kernel/time/timekeeping.c
@@ -1249,9 +1249,23 @@ static void timekeeping_resume(void)
 	hrtimers_resume();
 }
 
+/*
+ * Dummy readout base and suspend-time cycles value for the fast timekeeper to
+ * work in a consistent way after timekeeping has been suspended if the core
+ * timekeeper clocksource is not suspend-nonstop.
+ */
+static struct tk_read_base tkr_dummy;
+static cycle_t cycles_at_suspend;
+
+static cycle_t dummy_clock_read(struct clocksource *cs)
+{
+	return cycles_at_suspend;
+}
+
 static int timekeeping_suspend(void)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
+	struct clocksource *clock = tk->tkr.clock;
 	unsigned long flags;
 	struct timespec64		delta, delta_delta;
 	static struct timespec64	old_delta;
@@ -1294,6 +1308,14 @@ static int timekeeping_suspend(void)
 	}
 
 	timekeeping_update(tk, TK_MIRROR);
+
+	if (!(clock->flags & CLOCK_SOURCE_SUSPEND_NONSTOP)) {
+		memcpy(&tkr_dummy, &tk->tkr, sizeof(tkr_dummy));
+		cycles_at_suspend = tk->tkr.read(clock);
+		tkr_dummy.read = dummy_clock_read;
+		update_fast_timekeeper(&tkr_dummy);
+	}
+
 	write_seqcount_end(&tk_core.seq);
 	raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
 



* [PATCH 4/6] PM / sleep: Make it possible to quiesce timers during suspend-to-idle
  2015-02-11  4:00 [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2015-02-11  4:03 ` [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended Rafael J. Wysocki
@ 2015-02-11  4:03 ` Rafael J. Wysocki
  2015-02-12 13:24   ` Peter Zijlstra
  2015-02-12 22:36   ` [Update][PATCH 4/6 v2] " Rafael J. Wysocki
  2015-02-11  4:04 ` [PATCH 5/6] intel_idle: Add ->enter_freeze callbacks Rafael J. Wysocki
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-11  4:03 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The efficiency of suspend-to-idle depends on being able to keep CPUs
in the deepest available idle states for as much time as possible.
Ideally, they should only be brought out of idle by system wakeup
interrupts.

However, timer interrupts occurring periodically prevent that from
happening and it is not practical to chase all of the "misbehaving"
timers in a whack-a-mole fashion.  A much more effective approach is
to suspend the local ticks for all CPUs and the entire timekeeping
along the lines of what is done during full suspend, which also
helps to keep suspend-to-idle and full suspend reasonably similar.

The idea is to suspend the local tick on each CPU executing
cpuidle_enter_freeze() and to make the last of them suspend the
entire timekeeping.  That should prevent timer interrupts from
triggering until an IO interrupt wakes up one of the CPUs.  It
needs to be done with interrupts disabled on all of the CPUs,
though, because otherwise the suspended clocksource might be
accessed by an interrupt handler which might lead to fatal
consequences.

Unfortunately, the existing ->enter callbacks provided by cpuidle
drivers generally cannot be used for implementing that, because some
of them re-enable interrupts temporarily and some idle entry methods
cause interrupts to be re-enabled automatically on exit.  Also some
of these callbacks manipulate local clock event devices of the CPUs
which really shouldn't be done after suspending their ticks.

To overcome that difficulty, introduce a new cpuidle state callback,
->enter_freeze, that will be guaranteed (1) to keep interrupts
disabled all the time (and return with interrupts disabled) and (2)
not to touch the CPU timer devices.  Modify cpuidle_enter_freeze() to
look for the deepest available idle state with ->enter_freeze present
and to make the CPU execute that callback with suspended tick (and the
last of the online CPUs to execute it with suspended timekeeping).

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/cpuidle.c |   48 +++++++++++++++++++++++++++++++++++++++-----
 include/linux/cpuidle.h   |    4 +++
 include/linux/tick.h      |    2 +
 kernel/time/tick-common.c |   50 ++++++++++++++++++++++++++++++++++++++++++++++
 kernel/time/timekeeping.c |    4 +--
 kernel/time/timekeeping.h |    2 +
 6 files changed, 103 insertions(+), 7 deletions(-)

Index: linux-pm/include/linux/tick.h
===================================================================
--- linux-pm.orig/include/linux/tick.h
+++ linux-pm/include/linux/tick.h
@@ -226,5 +226,7 @@ static inline void tick_nohz_task_switch
 		__tick_nohz_task_switch(tsk);
 }
 
+extern void tick_freeze(void);
+extern void tick_unfreeze(void);
 
 #endif
Index: linux-pm/kernel/time/tick-common.c
===================================================================
--- linux-pm.orig/kernel/time/tick-common.c
+++ linux-pm/kernel/time/tick-common.c
@@ -394,6 +394,56 @@ void tick_resume(void)
 	}
 }
 
+static DEFINE_RAW_SPINLOCK(tick_freeze_lock);
+static unsigned int tick_freeze_depth;
+
+/**
+ * tick_freeze - Suspend the local tick and (possibly) timekeeping.
+ *
+ * Check if this is the last online CPU executing the function and if so,
+ * suspend timekeeping.  Otherwise suspend the local tick.
+ *
+ * Call with interrupts disabled.  Must be balanced with %tick_unfreeze().
+ * Interrupts must not be enabled before the subsequent %tick_unfreeze().
+ */
+void tick_freeze(void)
+{
+	raw_spin_lock(&tick_freeze_lock);
+
+	tick_freeze_depth++;
+	if (tick_freeze_depth == num_online_cpus()) {
+		timekeeping_suspend();
+	} else {
+		tick_suspend();
+		tick_suspend_broadcast();
+	}
+
+	raw_spin_unlock(&tick_freeze_lock);
+}
+
+/**
+ * tick_unfreeze - Resume the local tick and (possibly) timekeeping.
+ *
+ * Check if this is the first CPU executing the function and if so, resume
+ * timekeeping.  Otherwise resume the local tick.
+ *
+ * Call with interrupts disabled.  Must be balanced with %tick_freeze().
+ * Interrupts must not be enabled after the preceding %tick_freeze().
+ */
+void tick_unfreeze(void)
+{
+	raw_spin_lock(&tick_freeze_lock);
+
+	if (tick_freeze_depth == num_online_cpus())
+		timekeeping_resume();
+	else
+		tick_resume();
+
+	tick_freeze_depth--;
+
+	raw_spin_unlock(&tick_freeze_lock);
+}
+
 /**
  * tick_init - initialize the tick control
  */
Index: linux-pm/kernel/time/timekeeping.h
===================================================================
--- linux-pm.orig/kernel/time/timekeeping.h
+++ linux-pm/kernel/time/timekeeping.h
@@ -16,5 +16,7 @@ extern int timekeeping_inject_offset(str
 extern s32 timekeeping_get_tai_offset(void);
 extern void timekeeping_set_tai_offset(s32 tai_offset);
 extern void timekeeping_clocktai(struct timespec *ts);
+extern int timekeeping_suspend(void);
+extern void timekeeping_resume(void);
 
 #endif
Index: linux-pm/kernel/time/timekeeping.c
===================================================================
--- linux-pm.orig/kernel/time/timekeeping.c
+++ linux-pm/kernel/time/timekeeping.c
@@ -1168,7 +1168,7 @@ void timekeeping_inject_sleeptime64(stru
  * xtime/wall_to_monotonic/jiffies/etc are
  * still managed by arch specific suspend/resume code.
  */
-static void timekeeping_resume(void)
+void timekeeping_resume(void)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
 	struct clocksource *clock = tk->tkr.clock;
@@ -1262,7 +1262,7 @@ static cycle_t dummy_clock_read(struct c
 	return cycles_at_suspend;
 }
 
-static int timekeeping_suspend(void)
+int timekeeping_suspend(void)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
 	struct clocksource *clock = tk->tkr.clock;
Index: linux-pm/include/linux/cpuidle.h
===================================================================
--- linux-pm.orig/include/linux/cpuidle.h
+++ linux-pm/include/linux/cpuidle.h
@@ -50,6 +50,10 @@ struct cpuidle_state {
 			int index);
 
 	int (*enter_dead) (struct cpuidle_device *dev, int index);
+
+	void (*enter_freeze) (struct cpuidle_device *dev,
+			      struct cpuidle_driver *drv,
+			      int index);
 };
 
 /* Idle State Flags */
Index: linux-pm/drivers/cpuidle/cpuidle.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/cpuidle.c
+++ linux-pm/drivers/cpuidle/cpuidle.c
@@ -20,6 +20,7 @@
 #include <linux/hrtimer.h>
 #include <linux/module.h>
 #include <linux/suspend.h>
+#include <linux/tick.h>
 #include <trace/events/power.h>
 
 #include "cpuidle.h"
@@ -69,18 +70,20 @@ int cpuidle_play_dead(void)
  * cpuidle_find_deepest_state - Find deepest state meeting specific conditions.
  * @drv: cpuidle driver for the given CPU.
  * @dev: cpuidle device for the given CPU.
+ * @freeze: Whether or not the state should be suitable for suspend-to-idle.
  */
 static int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
-				      struct cpuidle_device *dev)
+				      struct cpuidle_device *dev, bool freeze)
 {
 	unsigned int latency_req = 0;
-	int i, ret = CPUIDLE_DRIVER_STATE_START - 1;
+	int i, ret = freeze ? -1 : CPUIDLE_DRIVER_STATE_START - 1;
 
 	for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) {
 		struct cpuidle_state *s = &drv->states[i];
 		struct cpuidle_state_usage *su = &dev->states_usage[i];
 
-		if (s->disabled || su->disable || s->exit_latency <= latency_req)
+		if (s->disabled || su->disable || s->exit_latency <= latency_req
+		    || (freeze && !s->enter_freeze))
 			continue;
 
 		latency_req = s->exit_latency;
@@ -89,10 +92,30 @@ static int cpuidle_find_deepest_state(st
 	return ret;
 }
 
+static void enter_freeze_proper(struct cpuidle_driver *drv,
+				struct cpuidle_device *dev, int index)
+{
+	tick_freeze();
+	/*
+	 * The state used here cannot be a "coupled" one, because the "coupled"
+	 * cpuidle mechanism enables interrupts and doing that with timekeeping
+	 * suspended is generally unsafe.
+	 */
+	drv->states[index].enter_freeze(dev, drv, index);
+	/*
+	 * timekeeping_resume() that will be called by tick_unfreeze() for the
+	 * last CPU executing it calls functions containing RCU read-side
+	 * critical sections, so tell RCU about that.
+	 */
+	RCU_NONIDLE(tick_unfreeze());
+}
+
 /**
  * cpuidle_enter_freeze - Enter an idle state suitable for suspend-to-idle.
  *
- * Find the deepest state available and enter it.
+ * If there are states with the ->enter_freeze callback, find the deepest of
+ * them and enter it with frozen tick.  Otherwise, find the deepest state
+ * available and enter it normally.
  */
 void cpuidle_enter_freeze(void)
 {
@@ -100,7 +123,22 @@ void cpuidle_enter_freeze(void)
 	struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
 	int index;
 
-	index = cpuidle_find_deepest_state(drv, dev);
+	/*
+	 * Find the deepest state with ->enter_freeze present, which guarantees
+	 * that interrupts won't be enabled when it exits and allows the tick to
+	 * be frozen safely.
+	 */
+	index = cpuidle_find_deepest_state(drv, dev, true);
+	if (index >= 0) {
+		enter_freeze_proper(drv, dev, index);
+		return;
+	}
+
+	/*
+	 * It is not safe to freeze the tick, find the deepest state available
+	 * at all and try to enter it normally.
+	 */
+	index = cpuidle_find_deepest_state(drv, dev, false);
 	if (index >= 0)
 		cpuidle_enter(drv, dev, index);
 	else



* [PATCH 5/6] intel_idle: Add ->enter_freeze callbacks
  2015-02-11  4:00 [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2015-02-11  4:03 ` [PATCH 4/6] PM / sleep: Make it possible to quiesce timers during suspend-to-idle Rafael J. Wysocki
@ 2015-02-11  4:04 ` Rafael J. Wysocki
  2015-02-12 13:26   ` Peter Zijlstra
  2015-02-11  4:04 ` [PATCH 6/6] ACPI / idle: Implement ->enter_freeze callback routine Rafael J. Wysocki
  2015-02-13  8:13 ` [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Peter Zijlstra
  6 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-11  4:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Add an ->enter_freeze callback routine, intel_idle_freeze(), to
the intel_idle driver and point the ->enter_freeze callback
pointers of all of the driver's state objects to it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/idle/intel_idle.c |  179 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 125 insertions(+), 54 deletions(-)

Index: linux-pm/drivers/idle/intel_idle.c
===================================================================
--- linux-pm.orig/drivers/idle/intel_idle.c
+++ linux-pm/drivers/idle/intel_idle.c
@@ -97,6 +97,8 @@ static const struct idle_cpu *icpu;
 static struct cpuidle_device __percpu *intel_idle_cpuidle_devices;
 static int intel_idle(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv, int index);
+static void intel_idle_freeze(struct cpuidle_device *dev,
+			      struct cpuidle_driver *drv, int index);
 static int intel_idle_cpu_init(int cpu);
 
 static struct cpuidle_state *cpuidle_state_table;
@@ -131,28 +133,32 @@ static struct cpuidle_state nehalem_csta
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 3,
 		.target_residency = 6,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C1E-NHM",
 		.desc = "MWAIT 0x01",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C3-NHM",
 		.desc = "MWAIT 0x10",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 20,
 		.target_residency = 80,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6-NHM",
 		.desc = "MWAIT 0x20",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 200,
 		.target_residency = 800,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.enter = NULL }
 };
@@ -164,35 +170,40 @@ static struct cpuidle_state snb_cstates[
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C1E-SNB",
 		.desc = "MWAIT 0x01",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C3-SNB",
 		.desc = "MWAIT 0x10",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 80,
 		.target_residency = 211,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6-SNB",
 		.desc = "MWAIT 0x20",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 104,
 		.target_residency = 345,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C7-SNB",
 		.desc = "MWAIT 0x30",
 		.flags = MWAIT2flg(0x30) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 109,
 		.target_residency = 345,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.enter = NULL }
 };
@@ -204,42 +215,48 @@ static struct cpuidle_state byt_cstates[
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 1,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C1E-BYT",
 		.desc = "MWAIT 0x01",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 15,
 		.target_residency = 30,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6N-BYT",
 		.desc = "MWAIT 0x58",
 		.flags = MWAIT2flg(0x58) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 40,
 		.target_residency = 275,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6S-BYT",
 		.desc = "MWAIT 0x52",
 		.flags = MWAIT2flg(0x52) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 140,
 		.target_residency = 560,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C7-BYT",
 		.desc = "MWAIT 0x60",
 		.flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 1200,
 		.target_residency = 1500,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C7S-BYT",
 		.desc = "MWAIT 0x64",
 		.flags = MWAIT2flg(0x64) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 10000,
 		.target_residency = 20000,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.enter = NULL }
 };
@@ -251,35 +268,40 @@ static struct cpuidle_state ivb_cstates[
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 1,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C1E-IVB",
 		.desc = "MWAIT 0x01",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C3-IVB",
 		.desc = "MWAIT 0x10",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 59,
 		.target_residency = 156,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6-IVB",
 		.desc = "MWAIT 0x20",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 80,
 		.target_residency = 300,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C7-IVB",
 		.desc = "MWAIT 0x30",
 		.flags = MWAIT2flg(0x30) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 87,
 		.target_residency = 300,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.enter = NULL }
 };
@@ -291,28 +313,32 @@ static struct cpuidle_state ivt_cstates[
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 1,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C1E-IVT",
 		.desc = "MWAIT 0x01",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 80,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C3-IVT",
 		.desc = "MWAIT 0x10",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 59,
 		.target_residency = 156,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6-IVT",
 		.desc = "MWAIT 0x20",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 82,
 		.target_residency = 300,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.enter = NULL }
 };
@@ -324,28 +350,32 @@ static struct cpuidle_state ivt_cstates_
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 1,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C1E-IVT-4S",
 		.desc = "MWAIT 0x01",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 250,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C3-IVT-4S",
 		.desc = "MWAIT 0x10",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 59,
 		.target_residency = 300,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6-IVT-4S",
 		.desc = "MWAIT 0x20",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 84,
 		.target_residency = 400,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.enter = NULL }
 };
@@ -357,28 +387,32 @@ static struct cpuidle_state ivt_cstates_
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 1,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C1E-IVT-8S",
 		.desc = "MWAIT 0x01",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 500,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C3-IVT-8S",
 		.desc = "MWAIT 0x10",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 59,
 		.target_residency = 600,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6-IVT-8S",
 		.desc = "MWAIT 0x20",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 88,
 		.target_residency = 700,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.enter = NULL }
 };
@@ -390,56 +424,64 @@ static struct cpuidle_state hsw_cstates[
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C1E-HSW",
 		.desc = "MWAIT 0x01",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C3-HSW",
 		.desc = "MWAIT 0x10",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 33,
 		.target_residency = 100,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6-HSW",
 		.desc = "MWAIT 0x20",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 133,
 		.target_residency = 400,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C7s-HSW",
 		.desc = "MWAIT 0x32",
 		.flags = MWAIT2flg(0x32) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 166,
 		.target_residency = 500,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C8-HSW",
 		.desc = "MWAIT 0x40",
 		.flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 300,
 		.target_residency = 900,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C9-HSW",
 		.desc = "MWAIT 0x50",
 		.flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 600,
 		.target_residency = 1800,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C10-HSW",
 		.desc = "MWAIT 0x60",
 		.flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 2600,
 		.target_residency = 7700,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.enter = NULL }
 };
@@ -450,56 +492,64 @@ static struct cpuidle_state bdw_cstates[
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C1E-BDW",
 		.desc = "MWAIT 0x01",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C3-BDW",
 		.desc = "MWAIT 0x10",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 40,
 		.target_residency = 100,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6-BDW",
 		.desc = "MWAIT 0x20",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 133,
 		.target_residency = 400,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C7s-BDW",
 		.desc = "MWAIT 0x32",
 		.flags = MWAIT2flg(0x32) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 166,
 		.target_residency = 500,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C8-BDW",
 		.desc = "MWAIT 0x40",
 		.flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 300,
 		.target_residency = 900,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C9-BDW",
 		.desc = "MWAIT 0x50",
 		.flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 600,
 		.target_residency = 1800,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C10-BDW",
 		.desc = "MWAIT 0x60",
 		.flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 2600,
 		.target_residency = 7700,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.enter = NULL }
 };
@@ -511,28 +561,32 @@ static struct cpuidle_state atom_cstates
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 10,
 		.target_residency = 20,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C2-ATM",
 		.desc = "MWAIT 0x10",
 		.flags = MWAIT2flg(0x10),
 		.exit_latency = 20,
 		.target_residency = 80,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C4-ATM",
 		.desc = "MWAIT 0x30",
 		.flags = MWAIT2flg(0x30) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 100,
 		.target_residency = 400,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6-ATM",
 		.desc = "MWAIT 0x52",
 		.flags = MWAIT2flg(0x52) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 140,
 		.target_residency = 560,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.enter = NULL }
 };
@@ -543,14 +597,16 @@ static struct cpuidle_state avn_cstates[
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.name = "C6-AVN",
 		.desc = "MWAIT 0x51",
 		.flags = MWAIT2flg(0x51) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 15,
 		.target_residency = 45,
-		.enter = &intel_idle },
+		.enter = &intel_idle,
+		.enter_freeze = intel_idle_freeze, },
 	{
 		.enter = NULL }
 };
@@ -592,6 +648,21 @@ static int intel_idle(struct cpuidle_dev
 	return index;
 }
 
+/**
+ * intel_idle_freeze - simplified "enter" callback routine for suspend-to-idle
+ * @dev: cpuidle_device
+ * @drv: cpuidle driver
+ * @index: state index
+ */
+static void intel_idle_freeze(struct cpuidle_device *dev,
+			     struct cpuidle_driver *drv, int index)
+{
+	unsigned long ecx = 1; /* break on interrupt flag */
+	unsigned long eax = flg2MWAIT(drv->states[index].flags);
+
+	mwait_idle_with_hints(eax, ecx);
+}
+
 static void __setup_broadcast_timer(void *arg)
 {
 	unsigned long reason = (unsigned long)arg;


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 6/6] ACPI / idle: Implement ->enter_freeze callback routine
  2015-02-11  4:00 [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Rafael J. Wysocki
                   ` (4 preceding siblings ...)
  2015-02-11  4:04 ` [PATCH 5/6] intel_idle: Add ->enter_freeze callbacks Rafael J. Wysocki
@ 2015-02-11  4:04 ` Rafael J. Wysocki
  2015-02-13  8:13 ` [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Peter Zijlstra
  6 siblings, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-11  4:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Add an ->enter_freeze callback routine, acpi_idle_enter_freeze(), to
the ACPI cpuidle driver and point ->enter_freeze to it for all the
C2-type and C3-type states that don't need to fall back to C1
(which may be halt-induced and would re-enable interrupts on exit
from idle, something that ->enter_freeze callbacks must never do).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/processor_idle.c |   48 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 40 insertions(+), 8 deletions(-)

Index: linux-pm/drivers/acpi/processor_idle.c
===================================================================
--- linux-pm.orig/drivers/acpi/processor_idle.c
+++ linux-pm/drivers/acpi/processor_idle.c
@@ -732,9 +732,8 @@ static int acpi_idle_play_dead(struct cp
 
 static bool acpi_idle_fallback_to_c1(struct acpi_processor *pr)
 {
-	return IS_ENABLED(CONFIG_HOTPLUG_CPU) && num_online_cpus() > 1 &&
-		!(acpi_gbl_FADT.flags & ACPI_FADT_C2_MP_SUPPORTED) &&
-		!pr->flags.has_cst;
+	return IS_ENABLED(CONFIG_HOTPLUG_CPU) && !pr->flags.has_cst &&
+		!(acpi_gbl_FADT.flags & ACPI_FADT_C2_MP_SUPPORTED);
 }
 
 static int c3_cpu_count;
@@ -744,9 +743,10 @@ static DEFINE_RAW_SPINLOCK(c3_lock);
  * acpi_idle_enter_bm - enters C3 with proper BM handling
  * @pr: Target processor
  * @cx: Target state context
+ * @timer_bc: Whether or not to change timer mode to broadcast
  */
 static void acpi_idle_enter_bm(struct acpi_processor *pr,
-			       struct acpi_processor_cx *cx)
+			       struct acpi_processor_cx *cx, bool timer_bc)
 {
 	acpi_unlazy_tlb(smp_processor_id());
 
@@ -754,7 +754,8 @@ static void acpi_idle_enter_bm(struct ac
 	 * Must be done before busmaster disable as we might need to
 	 * access HPET !
 	 */
-	lapic_timer_state_broadcast(pr, cx, 1);
+	if (timer_bc)
+		lapic_timer_state_broadcast(pr, cx, 1);
 
 	/*
 	 * disable bus master
@@ -784,7 +785,8 @@ static void acpi_idle_enter_bm(struct ac
 		raw_spin_unlock(&c3_lock);
 	}
 
-	lapic_timer_state_broadcast(pr, cx, 0);
+	if (timer_bc)
+		lapic_timer_state_broadcast(pr, cx, 0);
 }
 
 static int acpi_idle_enter(struct cpuidle_device *dev,
@@ -798,12 +800,12 @@ static int acpi_idle_enter(struct cpuidl
 		return -EINVAL;
 
 	if (cx->type != ACPI_STATE_C1) {
-		if (acpi_idle_fallback_to_c1(pr)) {
+		if (acpi_idle_fallback_to_c1(pr) && num_online_cpus() > 1) {
 			index = CPUIDLE_DRIVER_STATE_START;
 			cx = per_cpu(acpi_cstate[index], dev->cpu);
 		} else if (cx->type == ACPI_STATE_C3 && pr->flags.bm_check) {
 			if (cx->bm_sts_skip || !acpi_idle_bm_check()) {
-				acpi_idle_enter_bm(pr, cx);
+				acpi_idle_enter_bm(pr, cx, true);
 				return index;
 			} else if (drv->safe_state_index >= 0) {
 				index = drv->safe_state_index;
@@ -827,6 +829,27 @@ static int acpi_idle_enter(struct cpuidl
 	return index;
 }
 
+static void acpi_idle_enter_freeze(struct cpuidle_device *dev,
+				   struct cpuidle_driver *drv, int index)
+{
+	struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu);
+
+	if (cx->type == ACPI_STATE_C3) {
+		struct acpi_processor *pr = __this_cpu_read(processors);
+
+		if (unlikely(!pr))
+			return;
+
+		if (pr->flags.bm_check) {
+			acpi_idle_enter_bm(pr, cx, false);
+			return;
+		} else {
+			ACPI_FLUSH_CPU_CACHE();
+		}
+	}
+	acpi_idle_do_entry(cx);
+}
+
 struct cpuidle_driver acpi_idle_driver = {
 	.name =		"acpi_idle",
 	.owner =	THIS_MODULE,
@@ -925,6 +948,15 @@ static int acpi_processor_setup_cpuidle_
 			state->enter_dead = acpi_idle_play_dead;
 			drv->safe_state_index = count;
 		}
+		/*
+		 * Halt-induced C1 is not good for ->enter_freeze, because it
+		 * re-enables interrupts on exit.  Moreover, C1 is generally not
+		 * particularly interesting from the suspend-to-idle angle, so
+		 * avoid C1 and the situations in which we may need to fall back
+		 * to it altogether.
+		 */
+		if (cx->type != ACPI_STATE_C1 && !acpi_idle_fallback_to_c1(pr))
+			state->enter_freeze = acpi_idle_enter_freeze;
 
 		count++;
 		if (count == CPUIDLE_STATE_MAX)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6] PM / sleep: Re-implement suspend-to-idle handling
  2015-02-11  4:01 ` [PATCH 1/6] PM / sleep: Re-implement suspend-to-idle handling Rafael J. Wysocki
@ 2015-02-12 13:14   ` Peter Zijlstra
  2015-02-12 16:18     ` Rafael J. Wysocki
  2015-02-12 22:33   ` [Update][PATCH 1/6 v2] " Rafael J. Wysocki
  1 sibling, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2015-02-12 13:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On Wed, Feb 11, 2015 at 05:01:09AM +0100, Rafael J. Wysocki wrote:
> +/* Suspend-to-idle state machine. */
> +enum freeze_state {
> +	FREEZE_STATE_NONE,      /* Not suspended/suspending. */
> +	FREEZE_STATE_ENTER,     /* Enter suspend-to-idle. */
> +	FREEZE_STATE_WAKE,      /* Wake up from suspend-to-idle. */
> +};
> +
> +static enum freeze_state __read_mostly suspend_freeze_state;
> +static DEFINE_SPINLOCK(suspend_freeze_lock);
> +
> +bool idle_should_freeze(void)
> +{
> +	return unlikely(suspend_freeze_state == FREEZE_STATE_ENTER);
> +}

I don't see how a compiler can propagate the unlikely through an actual
function call. AFAICT that needs to be an inline function for that to
work.

It would mean exposing suspend_freeze_state and the enum; is there a
reason not to want to do that?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/6] PM / sleep: Make it possible to quiesce timers during suspend-to-idle
  2015-02-11  4:03 ` [PATCH 4/6] PM / sleep: Make it possible to quiesce timers during suspend-to-idle Rafael J. Wysocki
@ 2015-02-12 13:24   ` Peter Zijlstra
  2015-02-12 16:19     ` Rafael J. Wysocki
  2015-02-12 22:36   ` [Update][PATCH 4/6 v2] " Rafael J. Wysocki
  1 sibling, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2015-02-12 13:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On Wed, Feb 11, 2015 at 05:03:44AM +0100, Rafael J. Wysocki wrote:
> Index: linux-pm/include/linux/cpuidle.h
> ===================================================================
> --- linux-pm.orig/include/linux/cpuidle.h
> +++ linux-pm/include/linux/cpuidle.h
> @@ -50,6 +50,10 @@ struct cpuidle_state {
>  			int index);
>  
>  	int (*enter_dead) (struct cpuidle_device *dev, int index);

Do we want a comment here describing that enter_freeze() must not
re-enable interrupts _ever_?

To help people who want to enable this on their platform.

> +
> +	void (*enter_freeze) (struct cpuidle_device *dev,
> +			      struct cpuidle_driver *drv,
> +			      int index);
>  };

> +static void enter_freeze_proper(struct cpuidle_driver *drv,
> +				struct cpuidle_device *dev, int index)
> +{
> +	tick_freeze();
> +	/*
> +	 * The state used here cannot be a "coupled" one, because the "coupled"
> +	 * cpuidle mechanism enables interrupts and doing that with timekeeping
> +	 * suspended is generally unsafe.
> +	 */
> +	drv->states[index].enter_freeze(dev, drv, index);

	WARN_ON(!irqs_disabled());

To go along with the comment and catch fail?

> +	/*
> +	 * timekeeping_resume() that will be called by tick_unfreeze() for the
> +	 * last CPU executing it calls functions containing RCU read-side
> +	 * critical sections, so tell RCU about that.
> +	 */
> +	RCU_NONIDLE(tick_unfreeze());
> +}

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/6] intel_idle: Add ->enter_freeze callbacks
  2015-02-11  4:04 ` [PATCH 5/6] intel_idle: Add ->enter_freeze callbacks Rafael J. Wysocki
@ 2015-02-12 13:26   ` Peter Zijlstra
  2015-02-12 16:24     ` Rafael J. Wysocki
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2015-02-12 13:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On Wed, Feb 11, 2015 at 05:04:17AM +0100, Rafael J. Wysocki wrote:
> @@ -131,28 +133,32 @@ static struct cpuidle_state nehalem_csta
>  		.flags = MWAIT2flg(0x00),
>  		.exit_latency = 3,
>  		.target_residency = 6,
> -		.enter = &intel_idle },
> +		.enter = &intel_idle,
> +		.enter_freeze = intel_idle_freeze, },
>  	{
>  		.name = "C1E-NHM",
>  		.desc = "MWAIT 0x01",
>  		.flags = MWAIT2flg(0x01),
>  		.exit_latency = 10,
>  		.target_residency = 20,
> -		.enter = &intel_idle },
> +		.enter = &intel_idle,
> +		.enter_freeze = intel_idle_freeze, },
>  	{
>  		.name = "C3-NHM",
>  		.desc = "MWAIT 0x10",
>  		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
>  		.exit_latency = 20,
>  		.target_residency = 80,
> -		.enter = &intel_idle },
> +		.enter = &intel_idle,
> +		.enter_freeze = intel_idle_freeze, },
>  	{
>  		.name = "C6-NHM",
>  		.desc = "MWAIT 0x20",
>  		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
>  		.exit_latency = 200,
>  		.target_residency = 800,
> -		.enter = &intel_idle },
> +		.enter = &intel_idle,
> +		.enter_freeze = intel_idle_freeze, },
>  	{
>  		.enter = NULL }
>  };

Why bother with enter_freeze() for any but the deepest state (C6 in this
case)?

Also, should we ignore things like intel_idle.max_cstate for this
selection?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6] PM / sleep: Re-implement suspend-to-idle handling
  2015-02-12 13:14   ` Peter Zijlstra
@ 2015-02-12 16:18     ` Rafael J. Wysocki
  0 siblings, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-12 16:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On Thursday, February 12, 2015 02:14:20 PM Peter Zijlstra wrote:
> On Wed, Feb 11, 2015 at 05:01:09AM +0100, Rafael J. Wysocki wrote:
> > +/* Suspend-to-idle state machine. */
> > +enum freeze_state {
> > +	FREEZE_STATE_NONE,      /* Not suspended/suspending. */
> > +	FREEZE_STATE_ENTER,     /* Enter suspend-to-idle. */
> > +	FREEZE_STATE_WAKE,      /* Wake up from suspend-to-idle. */
> > +};
> > +
> > +static enum freeze_state __read_mostly suspend_freeze_state;
> > +static DEFINE_SPINLOCK(suspend_freeze_lock);
> > +
> > +bool idle_should_freeze(void)
> > +{
> > +	return unlikely(suspend_freeze_state == FREEZE_STATE_ENTER);
> > +}
> 
> I don't see how a compiler can propagate the unlikely through an actual
> function call. AFAICT that needs to be an inline function for that to
> work.

Oh.

That was silly, I'll send an update later today.

> It would mean exposing suspend_freeze_state and the enum; is there a
> reason not to want to do that?

Not really.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/6] PM / sleep: Make it possible to quiesce timers during suspend-to-idle
  2015-02-12 13:24   ` Peter Zijlstra
@ 2015-02-12 16:19     ` Rafael J. Wysocki
  0 siblings, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-12 16:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On Thursday, February 12, 2015 02:24:47 PM Peter Zijlstra wrote:
> On Wed, Feb 11, 2015 at 05:03:44AM +0100, Rafael J. Wysocki wrote:
> > Index: linux-pm/include/linux/cpuidle.h
> > ===================================================================
> > --- linux-pm.orig/include/linux/cpuidle.h
> > +++ linux-pm/include/linux/cpuidle.h
> > @@ -50,6 +50,10 @@ struct cpuidle_state {
> >  			int index);
> >  
> >  	int (*enter_dead) (struct cpuidle_device *dev, int index);
> 
> Do we want a comment here describing that enter_freeze() must not
> re-enable interrupts _ever_?
> 
> To help people who want to enable this on their platform.

Good point.

I'll update the patch later today.

> > +
> > +	void (*enter_freeze) (struct cpuidle_device *dev,
> > +			      struct cpuidle_driver *drv,
> > +			      int index);
> >  };
> 
> > +static void enter_freeze_proper(struct cpuidle_driver *drv,
> > +				struct cpuidle_device *dev, int index)
> > +{
> > +	tick_freeze();
> > +	/*
> > +	 * The state used here cannot be a "coupled" one, because the "coupled"
> > +	 * cpuidle mechanism enables interrupts and doing that with timekeeping
> > +	 * suspended is generally unsafe.
> > +	 */
> > +	drv->states[index].enter_freeze(dev, drv, index);
> 
> 	WARN_ON(!irqs_disabled());
> 
> To go along with the comment and catch fail?

Yeah, will do.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/6] intel_idle: Add ->enter_freeze callbacks
  2015-02-12 13:26   ` Peter Zijlstra
@ 2015-02-12 16:24     ` Rafael J. Wysocki
  2015-02-12 16:25       ` Peter Zijlstra
  2015-03-04 23:50       ` Li, Aubrey
  0 siblings, 2 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-12 16:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On Thursday, February 12, 2015 02:26:43 PM Peter Zijlstra wrote:
> On Wed, Feb 11, 2015 at 05:04:17AM +0100, Rafael J. Wysocki wrote:
> > @@ -131,28 +133,32 @@ static struct cpuidle_state nehalem_csta
> >  		.flags = MWAIT2flg(0x00),
> >  		.exit_latency = 3,
> >  		.target_residency = 6,
> > -		.enter = &intel_idle },
> > +		.enter = &intel_idle,
> > +		.enter_freeze = intel_idle_freeze, },
> >  	{
> >  		.name = "C1E-NHM",
> >  		.desc = "MWAIT 0x01",
> >  		.flags = MWAIT2flg(0x01),
> >  		.exit_latency = 10,
> >  		.target_residency = 20,
> > -		.enter = &intel_idle },
> > +		.enter = &intel_idle,
> > +		.enter_freeze = intel_idle_freeze, },
> >  	{
> >  		.name = "C3-NHM",
> >  		.desc = "MWAIT 0x10",
> >  		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
> >  		.exit_latency = 20,
> >  		.target_residency = 80,
> > -		.enter = &intel_idle },
> > +		.enter = &intel_idle,
> > +		.enter_freeze = intel_idle_freeze, },
> >  	{
> >  		.name = "C6-NHM",
> >  		.desc = "MWAIT 0x20",
> >  		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
> >  		.exit_latency = 200,
> >  		.target_residency = 800,
> > -		.enter = &intel_idle },
> > +		.enter = &intel_idle,
> > +		.enter_freeze = intel_idle_freeze, },
> >  	{
> >  		.enter = NULL }
> >  };
> 
> Why bother with enter_freeze() for any but the deepest state (C6 in this
> case)?

User space may disable the deepest one (and any of them in general) via
sysfs, and there's no good reason to ignore its choice in this particular
case while we're honoring it otherwise.

So the logic is basically "find the deepest one which isn't disabled" and
setting the pointers costs us nothing really.

> Also, should we ignore things like intel_idle.max_cstate for this
> selection?

No, we shouldn't.  The deeper ones may just not work then.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/6] intel_idle: Add ->enter_freeze callbacks
  2015-02-12 16:24     ` Rafael J. Wysocki
@ 2015-02-12 16:25       ` Peter Zijlstra
  2015-03-04 23:50       ` Li, Aubrey
  1 sibling, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2015-02-12 16:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On Thu, Feb 12, 2015 at 05:24:51PM +0100, Rafael J. Wysocki wrote:
> On Thursday, February 12, 2015 02:26:43 PM Peter Zijlstra wrote:

> > Why bother with enter_freeze() for any but the deepest state (C6 in this
> > case)?
> 
> User space may disable the deepest one (and any of them in general) via sysfs
> and there's no good reason to ignore its choice in this particular case while
> we're honoring it otherwise.
> 
> So the logic is basically "find the deepest one which isn't disabled" and
> setting the pointers costs us nothing really.
> 
> > Also, should we ignore things like intel_idle.max_cstate for this
> > selection?
> 
> No, we shouldn't.  The deeper ones may just not work then.

OK, fair enough.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Update][PATCH 1/6 v2] PM / sleep: Re-implement suspend-to-idle handling
  2015-02-11  4:01 ` [PATCH 1/6] PM / sleep: Re-implement suspend-to-idle handling Rafael J. Wysocki
  2015-02-12 13:14   ` Peter Zijlstra
@ 2015-02-12 22:33   ` Rafael J. Wysocki
  1 sibling, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-12 22:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Mailing List, Kristen Carlson Accardi, John Stultz,
	Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

In preparation for adding support for quiescing timers in the final
stage of suspend-to-idle transitions, rework the freeze_enter()
function that makes the system wait for a wakeup event, the
freeze_wake() function that terminates the suspend-to-idle loop, and
the mechanism by which deep idle states are entered during
suspend-to-idle.

First of all, introduce a simple state machine for suspend-to-idle
and make the code in question use it.

Second, prevent freeze_enter() from losing wakeup events due to race
conditions and ensure that the number of online CPUs won't change
while it is being executed.  In addition to that, make it force
all of the CPUs to re-enter the idle loop in case they are in idle
states already (so they can enter deeper idle states if possible).

Next, drop cpuidle_use_deepest_state() and replace use_deepest_state
checks in cpuidle_select() and cpuidle_reflect() with a single
suspend-to-idle state check in cpuidle_idle_call().

Finally, introduce cpuidle_enter_freeze() that will simply find the
deepest idle state available to the given CPU and enter it using
cpuidle_enter().
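[Editor's note: a minimal user-space sketch of the state machine this changelog describes. Locking is omitted for brevity (the patch guards these transitions with suspend_freeze_lock), and `freeze_wake_sketch` is a hypothetical name; the point is only that a wakeup has an effect if and only if a suspend-to-idle transition is in progress.]

```c
#include <assert.h>
#include <stdbool.h>

enum freeze_state {
	FREEZE_STATE_NONE,	/* Not suspended/suspending. */
	FREEZE_STATE_ENTER,	/* Enter suspend-to-idle. */
	FREEZE_STATE_WAKE,	/* Wake up from suspend-to-idle. */
};

static enum freeze_state suspend_freeze_state;

/*
 * Mirror of freeze_wake(): only terminate the suspend-to-idle loop
 * if a transition is actually in progress; spurious calls while in
 * FREEZE_STATE_NONE are ignored.
 */
static bool freeze_wake_sketch(void)
{
	if (suspend_freeze_state > FREEZE_STATE_NONE) {
		suspend_freeze_state = FREEZE_STATE_WAKE;
		return true;	/* would wake the waiter on the wait queue */
	}
	return false;
}
```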

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v1 -> v2:
 - Move the definition of enum freeze_state to suspend.h,
 - make suspend_freeze_state extern and declare it in suspend.h,
 - move the definition of idle_should_freeze() to suspend.h and
   define it as static inline. 

---
 drivers/cpuidle/cpuidle.c |   49 ++++++++++++++++++++++++----------------------
 include/linux/cpuidle.h   |    4 +--
 include/linux/suspend.h   |   16 +++++++++++++++
 kernel/power/suspend.c    |   43 +++++++++++++++++++++++++++++++++-------
 kernel/sched/idle.c       |   16 +++++++++++++++
 5 files changed, 96 insertions(+), 32 deletions(-)

Index: linux-pm/kernel/power/suspend.c
===================================================================
--- linux-pm.orig/kernel/power/suspend.c
+++ linux-pm/kernel/power/suspend.c
@@ -37,7 +37,9 @@ const char *pm_states[PM_SUSPEND_MAX];
 static const struct platform_suspend_ops *suspend_ops;
 static const struct platform_freeze_ops *freeze_ops;
 static DECLARE_WAIT_QUEUE_HEAD(suspend_freeze_wait_head);
-static bool suspend_freeze_wake;
+
+enum freeze_state __read_mostly suspend_freeze_state;
+static DEFINE_SPINLOCK(suspend_freeze_lock);
 
 void freeze_set_ops(const struct platform_freeze_ops *ops)
 {
@@ -48,22 +50,49 @@ void freeze_set_ops(const struct platfor
 
 static void freeze_begin(void)
 {
-	suspend_freeze_wake = false;
+	suspend_freeze_state = FREEZE_STATE_NONE;
 }
 
 static void freeze_enter(void)
 {
-	cpuidle_use_deepest_state(true);
+	spin_lock_irq(&suspend_freeze_lock);
+	if (pm_wakeup_pending())
+		goto out;
+
+	suspend_freeze_state = FREEZE_STATE_ENTER;
+	spin_unlock_irq(&suspend_freeze_lock);
+
+	get_online_cpus();
 	cpuidle_resume();
-	wait_event(suspend_freeze_wait_head, suspend_freeze_wake);
+
+	/* Push all the CPUs into the idle loop. */
+	wake_up_all_idle_cpus();
+	pr_debug("PM: suspend-to-idle\n");
+	/* Make the current CPU wait so it can enter the idle loop too. */
+	wait_event(suspend_freeze_wait_head,
+		   suspend_freeze_state == FREEZE_STATE_WAKE);
+	pr_debug("PM: resume from suspend-to-idle\n");
+
 	cpuidle_pause();
-	cpuidle_use_deepest_state(false);
+	put_online_cpus();
+
+	spin_lock_irq(&suspend_freeze_lock);
+
+ out:
+	suspend_freeze_state = FREEZE_STATE_NONE;
+	spin_unlock_irq(&suspend_freeze_lock);
 }
 
 void freeze_wake(void)
 {
-	suspend_freeze_wake = true;
-	wake_up(&suspend_freeze_wait_head);
+	unsigned long flags;
+
+	spin_lock_irqsave(&suspend_freeze_lock, flags);
+	if (suspend_freeze_state > FREEZE_STATE_NONE) {
+		suspend_freeze_state = FREEZE_STATE_WAKE;
+		wake_up(&suspend_freeze_wait_head);
+	}
+	spin_unlock_irqrestore(&suspend_freeze_lock, flags);
 }
 EXPORT_SYMBOL_GPL(freeze_wake);
 
Index: linux-pm/include/linux/suspend.h
===================================================================
--- linux-pm.orig/include/linux/suspend.h
+++ linux-pm/include/linux/suspend.h
@@ -201,6 +201,21 @@ struct platform_freeze_ops {
  */
 extern void suspend_set_ops(const struct platform_suspend_ops *ops);
 extern int suspend_valid_only_mem(suspend_state_t state);
+
+/* Suspend-to-idle state machine. */
+enum freeze_state {
+	FREEZE_STATE_NONE,      /* Not suspended/suspending. */
+	FREEZE_STATE_ENTER,     /* Enter suspend-to-idle. */
+	FREEZE_STATE_WAKE,      /* Wake up from suspend-to-idle. */
+};
+
+extern enum freeze_state __read_mostly suspend_freeze_state;
+
+static inline bool idle_should_freeze(void)
+{
+	return unlikely(suspend_freeze_state == FREEZE_STATE_ENTER);
+}
+
 extern void freeze_set_ops(const struct platform_freeze_ops *ops);
 extern void freeze_wake(void);
 
@@ -228,6 +243,7 @@ extern int pm_suspend(suspend_state_t st
 
 static inline void suspend_set_ops(const struct platform_suspend_ops *ops) {}
 static inline int pm_suspend(suspend_state_t state) { return -ENOSYS; }
+static inline bool idle_should_freeze(void) { return false; }
 static inline void freeze_set_ops(const struct platform_freeze_ops *ops) {}
 static inline void freeze_wake(void) {}
 #endif /* !CONFIG_SUSPEND */
Index: linux-pm/kernel/sched/idle.c
===================================================================
--- linux-pm.orig/kernel/sched/idle.c
+++ linux-pm/kernel/sched/idle.c
@@ -7,6 +7,7 @@
 #include <linux/tick.h>
 #include <linux/mm.h>
 #include <linux/stackprotector.h>
+#include <linux/suspend.h>
 
 #include <asm/tlb.h>
 
@@ -104,6 +105,21 @@ static void cpuidle_idle_call(void)
 	rcu_idle_enter();
 
 	/*
+	 * Suspend-to-idle ("freeze") is a system state in which all user space
+	 * has been frozen, all I/O devices have been suspended and the only
+	 * activity happens here and in interrupts (if any).  In that case bypass
+	 * the cpuidle governor and go straight for the deepest idle state
+	 * available.  Possibly also suspend the local tick and the entire
+	 * timekeeping to prevent timer interrupts from kicking us out of idle
+	 * until a proper wakeup interrupt happens.
+	 */
+	if (idle_should_freeze()) {
+		cpuidle_enter_freeze();
+		local_irq_enable();
+		goto exit_idle;
+	}
+
+	/*
 	 * Ask the cpuidle framework to choose a convenient idle state.
 	 * Fall back to the default arch idle method on errors.
 	 */
Index: linux-pm/drivers/cpuidle/cpuidle.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/cpuidle.c
+++ linux-pm/drivers/cpuidle/cpuidle.c
@@ -19,6 +19,7 @@
 #include <linux/ktime.h>
 #include <linux/hrtimer.h>
 #include <linux/module.h>
+#include <linux/suspend.h>
 #include <trace/events/power.h>
 
 #include "cpuidle.h"
@@ -32,7 +33,6 @@ LIST_HEAD(cpuidle_detected_devices);
 static int enabled_devices;
 static int off __read_mostly;
 static int initialized __read_mostly;
-static bool use_deepest_state __read_mostly;
 
 int cpuidle_disabled(void)
 {
@@ -66,24 +66,9 @@ int cpuidle_play_dead(void)
 }
 
 /**
- * cpuidle_use_deepest_state - Enable/disable the "deepest idle" mode.
- * @enable: Whether enable or disable the feature.
- *
- * If the "deepest idle" mode is enabled, cpuidle will ignore the governor and
- * always use the state with the greatest exit latency (out of the states that
- * are not disabled).
- *
- * This function can only be called after cpuidle_pause() to avoid races.
- */
-void cpuidle_use_deepest_state(bool enable)
-{
-	use_deepest_state = enable;
-}
-
-/**
- * cpuidle_find_deepest_state - Find the state of the greatest exit latency.
- * @drv: cpuidle driver for a given CPU.
- * @dev: cpuidle device for a given CPU.
+ * cpuidle_find_deepest_state - Find deepest state meeting specific conditions.
+ * @drv: cpuidle driver for the given CPU.
+ * @dev: cpuidle device for the given CPU.
  */
 static int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
 				      struct cpuidle_device *dev)
@@ -105,6 +90,27 @@ static int cpuidle_find_deepest_state(st
 }
 
 /**
+ * cpuidle_enter_freeze - Enter an idle state suitable for suspend-to-idle.
+ *
+ * Find the deepest state available and enter it.
+ */
+void cpuidle_enter_freeze(void)
+{
+	struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices);
+	struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
+	int index;
+
+	index = cpuidle_find_deepest_state(drv, dev);
+	if (index >= 0)
+		cpuidle_enter(drv, dev, index);
+	else
+		arch_cpu_idle();
+
+	/* Interrupts are enabled again here. */
+	local_irq_disable();
+}
+
+/**
  * cpuidle_enter_state - enter the state and update stats
  * @dev: cpuidle device for this cpu
  * @drv: cpuidle driver for this cpu
@@ -166,9 +172,6 @@ int cpuidle_select(struct cpuidle_driver
 	if (!drv || !dev || !dev->enabled)
 		return -EBUSY;
 
-	if (unlikely(use_deepest_state))
-		return cpuidle_find_deepest_state(drv, dev);
-
 	return cpuidle_curr_governor->select(drv, dev);
 }
 
@@ -200,7 +203,7 @@ int cpuidle_enter(struct cpuidle_driver
  */
 void cpuidle_reflect(struct cpuidle_device *dev, int index)
 {
-	if (cpuidle_curr_governor->reflect && !unlikely(use_deepest_state))
+	if (cpuidle_curr_governor->reflect)
 		cpuidle_curr_governor->reflect(dev, index);
 }
 
Index: linux-pm/include/linux/cpuidle.h
===================================================================
--- linux-pm.orig/include/linux/cpuidle.h
+++ linux-pm/include/linux/cpuidle.h
@@ -141,7 +141,7 @@ extern void cpuidle_resume(void);
 extern int cpuidle_enable_device(struct cpuidle_device *dev);
 extern void cpuidle_disable_device(struct cpuidle_device *dev);
 extern int cpuidle_play_dead(void);
-extern void cpuidle_use_deepest_state(bool enable);
+extern void cpuidle_enter_freeze(void);
 
 extern struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev);
 #else
@@ -174,7 +174,7 @@ static inline int cpuidle_enable_device(
 {return -ENODEV; }
 static inline void cpuidle_disable_device(struct cpuidle_device *dev) { }
 static inline int cpuidle_play_dead(void) {return -ENODEV; }
-static inline void cpuidle_use_deepest_state(bool enable) {}
+static inline void cpuidle_enter_freeze(void) { }
 static inline struct cpuidle_driver *cpuidle_get_cpu_driver(
 	struct cpuidle_device *dev) {return NULL; }
 #endif


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [Update][PATCH 4/6 v2] PM / sleep: Make it possible to quiesce timers during suspend-to-idle
  2015-02-11  4:03 ` [PATCH 4/6] PM / sleep: Make it possible to quiesce timers during suspend-to-idle Rafael J. Wysocki
  2015-02-12 13:24   ` Peter Zijlstra
@ 2015-02-12 22:36   ` Rafael J. Wysocki
  1 sibling, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-12 22:36 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra
  Cc: Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Maling List, Kristen Carlson Accardi, John Stultz,
	Len Brown

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The efficiency of suspend-to-idle depends on being able to keep CPUs
in the deepest available idle states for as much time as possible.
Ideally, they should only be brought out of idle by system wakeup
interrupts.

However, timer interrupts occurring periodically prevent that from
happening and it is not practical to chase all of the "misbehaving"
timers in a whack-a-mole fashion.  A much more effective approach is
to suspend the local ticks for all CPUs and the entire timekeeping
along the lines of what is done during full suspend, which also
helps to keep suspend-to-idle and full suspend reasonably similar.

The idea is to suspend the local tick on each CPU executing
cpuidle_enter_freeze() and to make the last of them suspend the
entire timekeeping.  That should prevent timer interrupts from
triggering until an IO interrupt wakes up one of the CPUs.  It
needs to be done with interrupts disabled on all of the CPUs,
though, because otherwise the suspended clocksource might be
accessed by an interrupt handler which might lead to fatal
consequences.

Unfortunately, the existing ->enter callbacks provided by cpuidle
drivers generally cannot be used for implementing that, because some
of them re-enable interrupts temporarily and some idle entry methods
cause interrupts to be re-enabled automatically on exit.  Also, some
of these callbacks manipulate the local clock event devices of the
CPUs, which really shouldn't be done after suspending their ticks.

To overcome that difficulty, introduce a new cpuidle state callback,
->enter_freeze, that will be guaranteed (1) to keep interrupts
disabled all the time (and return with interrupts disabled) and (2)
not to touch the CPU timer devices.  Modify cpuidle_enter_freeze() to
look for the deepest available idle state with ->enter_freeze present
and to make the CPU execute that callback with suspended tick (and the
last of the online CPUs to execute it with suspended timekeeping).
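[Editor's note: the "last CPU in suspends timekeeping, first CPU out resumes it" counting scheme described above can be sketched as follows. The lock and the actual tick/timekeeping calls are replaced by a boolean, and the `_sketch` names are hypothetical; this only demonstrates the depth-counting invariant, not the real tick machinery.]

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int tick_freeze_depth;
static bool timekeeping_suspended_flag;

/* Called by each CPU, with interrupts disabled, under tick_freeze_lock. */
static void tick_freeze_sketch(unsigned int num_online)
{
	tick_freeze_depth++;
	if (tick_freeze_depth == num_online)
		timekeeping_suspended_flag = true;	/* timekeeping_suspend() */
	/* else: suspend only the local tick */
}

/* Balanced counterpart: the first CPU out resumes timekeeping. */
static void tick_unfreeze_sketch(unsigned int num_online)
{
	if (tick_freeze_depth == num_online)
		timekeeping_suspended_flag = false;	/* timekeeping_resume() */
	/* else: resume only the local tick */
	tick_freeze_depth--;
}
```

Simulating four CPUs, timekeeping is only suspended once the fourth CPU has frozen its tick, and is resumed by the very first CPU to unfreeze.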

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

v1 -> v2:
 - Add WARN_ON(!irqs_disabled()) to enter_freeze_proper(),
 - add a comment about ->enter_freeze requirements to the definition
   of struct cpuidle_state in cpuidle.h.

---
 drivers/cpuidle/cpuidle.c |   49 ++++++++++++++++++++++++++++++++++++++++-----
 include/linux/cpuidle.h   |    9 ++++++++
 include/linux/tick.h      |    2 +
 kernel/time/tick-common.c |   50 ++++++++++++++++++++++++++++++++++++++++++++++
 kernel/time/timekeeping.c |    4 +--
 kernel/time/timekeeping.h |    2 +
 6 files changed, 109 insertions(+), 7 deletions(-)

Index: linux-pm/include/linux/tick.h
===================================================================
--- linux-pm.orig/include/linux/tick.h
+++ linux-pm/include/linux/tick.h
@@ -226,5 +226,7 @@ static inline void tick_nohz_task_switch
 		__tick_nohz_task_switch(tsk);
 }
 
+extern void tick_freeze(void);
+extern void tick_unfreeze(void);
 
 #endif
Index: linux-pm/kernel/time/tick-common.c
===================================================================
--- linux-pm.orig/kernel/time/tick-common.c
+++ linux-pm/kernel/time/tick-common.c
@@ -394,6 +394,56 @@ void tick_resume(void)
 	}
 }
 
+static DEFINE_RAW_SPINLOCK(tick_freeze_lock);
+static unsigned int tick_freeze_depth;
+
+/**
+ * tick_freeze - Suspend the local tick and (possibly) timekeeping.
+ *
+ * Check if this is the last online CPU executing the function and if so,
+ * suspend timekeeping.  Otherwise suspend the local tick.
+ *
+ * Call with interrupts disabled.  Must be balanced with %tick_unfreeze().
+ * Interrupts must not be enabled before the subsequent %tick_unfreeze().
+ */
+void tick_freeze(void)
+{
+	raw_spin_lock(&tick_freeze_lock);
+
+	tick_freeze_depth++;
+	if (tick_freeze_depth == num_online_cpus()) {
+		timekeeping_suspend();
+	} else {
+		tick_suspend();
+		tick_suspend_broadcast();
+	}
+
+	raw_spin_unlock(&tick_freeze_lock);
+}
+
+/**
+ * tick_unfreeze - Resume the local tick and (possibly) timekeeping.
+ *
+ * Check if this is the first CPU executing the function and if so, resume
+ * timekeeping.  Otherwise resume the local tick.
+ *
+ * Call with interrupts disabled.  Must be balanced with %tick_freeze().
+ * Interrupts must not be enabled after the preceding %tick_freeze().
+ */
+void tick_unfreeze(void)
+{
+	raw_spin_lock(&tick_freeze_lock);
+
+	if (tick_freeze_depth == num_online_cpus())
+		timekeeping_resume();
+	else
+		tick_resume();
+
+	tick_freeze_depth--;
+
+	raw_spin_unlock(&tick_freeze_lock);
+}
+
 /**
  * tick_init - initialize the tick control
  */
Index: linux-pm/kernel/time/timekeeping.h
===================================================================
--- linux-pm.orig/kernel/time/timekeeping.h
+++ linux-pm/kernel/time/timekeeping.h
@@ -16,5 +16,7 @@ extern int timekeeping_inject_offset(str
 extern s32 timekeeping_get_tai_offset(void);
 extern void timekeeping_set_tai_offset(s32 tai_offset);
 extern void timekeeping_clocktai(struct timespec *ts);
+extern int timekeeping_suspend(void);
+extern void timekeeping_resume(void);
 
 #endif
Index: linux-pm/kernel/time/timekeeping.c
===================================================================
--- linux-pm.orig/kernel/time/timekeeping.c
+++ linux-pm/kernel/time/timekeeping.c
@@ -1168,7 +1168,7 @@ void timekeeping_inject_sleeptime64(stru
  * xtime/wall_to_monotonic/jiffies/etc are
  * still managed by arch specific suspend/resume code.
  */
-static void timekeeping_resume(void)
+void timekeeping_resume(void)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
 	struct clocksource *clock = tk->tkr.clock;
@@ -1262,7 +1262,7 @@ static cycle_t dummy_clock_read(struct c
 	return cycles_at_suspend;
 }
 
-static int timekeeping_suspend(void)
+int timekeeping_suspend(void)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
 	struct clocksource *clock = tk->tkr.clock;
Index: linux-pm/include/linux/cpuidle.h
===================================================================
--- linux-pm.orig/include/linux/cpuidle.h
+++ linux-pm/include/linux/cpuidle.h
@@ -50,6 +50,15 @@ struct cpuidle_state {
 			int index);
 
 	int (*enter_dead) (struct cpuidle_device *dev, int index);
+
+	/*
+	 * CPUs execute ->enter_freeze with the local tick or entire timekeeping
+	 * suspended, so it must not re-enable interrupts at any point (even
+	 * temporarily) or attempt to change states of clock event devices.
+	 */
+	void (*enter_freeze) (struct cpuidle_device *dev,
+			      struct cpuidle_driver *drv,
+			      int index);
 };
 
 /* Idle State Flags */
Index: linux-pm/drivers/cpuidle/cpuidle.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/cpuidle.c
+++ linux-pm/drivers/cpuidle/cpuidle.c
@@ -20,6 +20,7 @@
 #include <linux/hrtimer.h>
 #include <linux/module.h>
 #include <linux/suspend.h>
+#include <linux/tick.h>
 #include <trace/events/power.h>
 
 #include "cpuidle.h"
@@ -69,18 +70,20 @@ int cpuidle_play_dead(void)
  * cpuidle_find_deepest_state - Find deepest state meeting specific conditions.
  * @drv: cpuidle driver for the given CPU.
  * @dev: cpuidle device for the given CPU.
+ * @freeze: Whether or not the state should be suitable for suspend-to-idle.
  */
 static int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
-				      struct cpuidle_device *dev)
+				      struct cpuidle_device *dev, bool freeze)
 {
 	unsigned int latency_req = 0;
-	int i, ret = CPUIDLE_DRIVER_STATE_START - 1;
+	int i, ret = freeze ? -1 : CPUIDLE_DRIVER_STATE_START - 1;
 
 	for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) {
 		struct cpuidle_state *s = &drv->states[i];
 		struct cpuidle_state_usage *su = &dev->states_usage[i];
 
-		if (s->disabled || su->disable || s->exit_latency <= latency_req)
+		if (s->disabled || su->disable || s->exit_latency <= latency_req
+		    || (freeze && !s->enter_freeze))
 			continue;
 
 		latency_req = s->exit_latency;
@@ -89,10 +92,31 @@ static int cpuidle_find_deepest_state(st
 	return ret;
 }
 
+static void enter_freeze_proper(struct cpuidle_driver *drv,
+				struct cpuidle_device *dev, int index)
+{
+	tick_freeze();
+	/*
+	 * The state used here cannot be a "coupled" one, because the "coupled"
+	 * cpuidle mechanism enables interrupts and doing that with timekeeping
+	 * suspended is generally unsafe.
+	 */
+	drv->states[index].enter_freeze(dev, drv, index);
+	WARN_ON(!irqs_disabled());
+	/*
+	 * timekeeping_resume() that will be called by tick_unfreeze() for the
+	 * last CPU executing it calls functions containing RCU read-side
+	 * critical sections, so tell RCU about that.
+	 */
+	RCU_NONIDLE(tick_unfreeze());
+}
+
 /**
  * cpuidle_enter_freeze - Enter an idle state suitable for suspend-to-idle.
  *
- * Find the deepest state available and enter it.
+ * If there are states with the ->enter_freeze callback, find the deepest of
+ * them and enter it with frozen tick.  Otherwise, find the deepest state
+ * available and enter it normally.
  */
 void cpuidle_enter_freeze(void)
 {
@@ -100,7 +124,22 @@ void cpuidle_enter_freeze(void)
 	struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
 	int index;
 
-	index = cpuidle_find_deepest_state(drv, dev);
+	/*
+	 * Find the deepest state with ->enter_freeze present, which guarantees
+	 * that interrupts won't be enabled when it exits and allows the tick to
+	 * be frozen safely.
+	 */
+	index = cpuidle_find_deepest_state(drv, dev, true);
+	if (index >= 0) {
+		enter_freeze_proper(drv, dev, index);
+		return;
+	}
+
+	/*
+	 * It is not safe to freeze the tick, so find the deepest state
+	 * available at all and try to enter it normally.
+	 */
+	index = cpuidle_find_deepest_state(drv, dev, false);
 	if (index >= 0)
 		cpuidle_enter(drv, dev, index);
 	else


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/6] timekeeping: Pass readout base to update_fast_timekeeper()
  2015-02-11  4:01 ` [PATCH 2/6] timekeeping: Pass readout base to update_fast_timekeeper() Rafael J. Wysocki
@ 2015-02-13  0:29   ` John Stultz
  0 siblings, 0 replies; 28+ messages in thread
From: John Stultz @ 2015-02-13  0:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Peter Zijlstra, Alan Cox, Li, Aubrey, LKML,
	Linux PM list, ACPI Devel Maling List, Kristen Carlson Accardi,
	Len Brown

On Wed, Feb 11, 2015 at 12:01 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Modify update_fast_timekeeper() to take a struct tk_read_base
> pointer as its argument (instead of a struct timekeeper pointer)
> and update its kerneldoc comment to reflect that.
>
> That will allow a struct tk_read_base that is not part of a
> struct timekeeper to be passed to it in the next patch.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Thomas was really behind the update_fast_timekeeper design so I'm not
sure I can critique if this fits in the larger vision here, but I
don't see anything wrong and it's not breaking in any of my tests,
so...

Acked-by: John Stultz <john.stultz@linaro.org>

thanks
-john

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended
  2015-02-11  4:03 ` [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended Rafael J. Wysocki
@ 2015-02-13  0:53   ` John Stultz
  2015-02-13  2:03     ` Rafael J. Wysocki
  0 siblings, 1 reply; 28+ messages in thread
From: John Stultz @ 2015-02-13  0:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Peter Zijlstra, Alan Cox, Li, Aubrey, LKML,
	Linux PM list, ACPI Devel Maling List, Kristen Carlson Accardi,
	Len Brown

On Wed, Feb 11, 2015 at 12:03 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Theoretically, ktime_get_mono_fast_ns() may be executed after
> timekeeping has been suspended (or before it is resumed) which
> in turn may lead to undefined behavior, for example, when the
> clocksource read from timekeeping_get_ns() called by it is
> not accessible at that time.

And the callers of the ktime_get_mono_fast_ns() have to get back a
value? Or can we return an error on timekeeping_suspended like we do
w/ __getnstimeofday64()?

Also, what exactly is the case when the clocksource being read isn't
accessible? I see this is conditionalized on
CLOCK_SOURCE_SUSPEND_NONSTOP, so is the concern on resume we read the
clocksource and it's been reset, causing a crazy time value?

> Prevent that from happening by setting up a dummy readout base for
> the fast timekeeper during timekeeping_suspend() such that it will
> always return the same number of cycles.
>
> After the last timekeeping_update() in timekeeping_suspend() the
> clocksource is read and the result is stored as cycles_at_suspend.
> The readout base from the current timekeeper is copied onto the
> dummy and the ->read pointer of the dummy is set to a routine
> unconditionally returning cycles_at_suspend.  Next, the dummy is
> passed to update_fast_timekeeper().
>
> Then, ktime_get_mono_fast_ns() will work until the subsequent
> timekeeping_resume() and the proper readout base for the fast
> timekeeper will be restored by the timekeeping_update() called
> right after clearing timekeeping_suspended.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  kernel/time/timekeeping.c |   22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> Index: linux-pm/kernel/time/timekeeping.c
> ===================================================================
> --- linux-pm.orig/kernel/time/timekeeping.c
> +++ linux-pm/kernel/time/timekeeping.c
> @@ -1249,9 +1249,23 @@ static void timekeeping_resume(void)
>         hrtimers_resume();
>  }
>
> +/*
> + * Dummy readout base and suspend-time cycles value for the fast timekeeper to
> + * work in a consistent way after timekeeping has been suspended if the core
> + * timekeeper clocksource is not suspend-nonstop.
> + */
> +static struct tk_read_base tkr_dummy;
> +static cycle_t cycles_at_suspend;
> +
> +static cycle_t dummy_clock_read(struct clocksource *cs)
> +{
> +       return cycles_at_suspend;
> +}
> +
>  static int timekeeping_suspend(void)
>  {
>         struct timekeeper *tk = &tk_core.timekeeper;
> +       struct clocksource *clock = tk->tkr.clock;
>         unsigned long flags;
>         struct timespec64               delta, delta_delta;
>         static struct timespec64        old_delta;
> @@ -1294,6 +1308,14 @@ static int timekeeping_suspend(void)
>         }
>
>         timekeeping_update(tk, TK_MIRROR);
> +
> +       if (!(clock->flags & CLOCK_SOURCE_SUSPEND_NONSTOP)) {
> +               memcpy(&tkr_dummy, &tk->tkr, sizeof(tkr_dummy));
> +               cycles_at_suspend = tk->tkr.read(clock);
> +               tkr_dummy.read = dummy_clock_read;
> +               update_fast_timekeeper(&tkr_dummy);
> +       }

It's a little ugly... though I'm not sure I have a better idea right off.

thanks
-john

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended
  2015-02-13  0:53   ` John Stultz
@ 2015-02-13  2:03     ` Rafael J. Wysocki
  2015-02-13  2:59       ` Travis
  2015-02-13  9:03       ` John Stultz
  0 siblings, 2 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-13  2:03 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Peter Zijlstra, Alan Cox, Li, Aubrey, LKML,
	Linux PM list, ACPI Devel Maling List, Kristen Carlson Accardi,
	Len Brown

On Friday, February 13, 2015 08:53:38 AM John Stultz wrote:
> On Wed, Feb 11, 2015 at 12:03 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Theoretically, ktime_get_mono_fast_ns() may be executed after
> > timekeeping has been suspended (or before it is resumed) which
> > in turn may lead to undefined behavior, for example, when the
> > clocksource read from timekeeping_get_ns() called by it is
> > not accessible at that time.
> 
> And the callers of the ktime_get_mono_fast_ns() have to get back a
> value?

Yes, they do.

> Or can we return an error on timekeeping_suspended like we do
> w/ __getnstimeofday64()?

No, we can't.

> Also, what exactly is the case when the clocksource being read isn't
> accessible? I see this is conditionalized on
> CLOCK_SOURCE_SUSPEND_NONSTOP, so is the concern on resume we read the
> clocksource and its been reset causing a crazy time value?

The clocksource's ->suspend method may have been called (during suspend)
and depending on what that did we may even crash things theoretically.

During resume, before the clocksource's ->resume callback, it may just
be undefined behavior (random data etc).

For system suspend as we have today the window is quite narrow, but after
patch [4/6] from this series suspend-to-idle may suspend timekeeping and
just sit there in idle for extended time (hours even) which broadens the
potential exposure quite a bit.

Of course, it does that with interrupts disabled, but ktime_get_mono_fast_ns()
is for NMI, so theoretically, if an NMI happens while we're in suspend-to-idle
with timekeeping suspended and the clocksource is not CLOCK_SOURCE_SUSPEND_NONSTOP
and the NMI calls ktime_get_mono_fast_ns(), strange and undesirable things may
happen.
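[Editor's note: the defensive pattern under discussion — snapshot the counter at suspend and swap in a reader that always returns that snapshot — can be illustrated in plain C. Names suffixed `_sketch` are hypothetical stand-ins for the kernel's clocksource and tk_read_base structures.]

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long cycle_t;

struct clocksource_sketch;	/* opaque stand-in for struct clocksource */

struct tk_read_base_sketch {
	cycle_t (*read)(struct clocksource_sketch *cs);
};

static cycle_t hw_counter;	/* pretend hardware counter */

static cycle_t real_clock_read(struct clocksource_sketch *cs)
{
	(void)cs;
	return hw_counter;	/* unsafe to call once the clock is suspended */
}

static cycle_t cycles_at_suspend;

static cycle_t dummy_clock_read(struct clocksource_sketch *cs)
{
	(void)cs;
	return cycles_at_suspend;	/* frozen snapshot, always safe */
}

/*
 * At suspend time: read the counter one last time, then make all
 * subsequent (e.g. NMI-context) reads go through the dummy.
 */
static void freeze_fast_timekeeper(struct tk_read_base_sketch *tkr)
{
	cycles_at_suspend = tkr->read(NULL);
	tkr->read = dummy_clock_read;
}
```

After the swap, readers see a constant cycle count rather than touching hardware that may be suspended or reset, so time simply stands still until resume restores the real readout base.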

> > Prevent that from happening by setting up a dummy readout base for
> > the fast timekeeper during timekeeping_suspend() such that it will
> > always return the same number of cycles.
> >
> > After the last timekeeping_update() in timekeeping_suspend() the
> > clocksource is read and the result is stored as cycles_at_suspend.
> > The readout base from the current timekeeper is copied onto the
> > dummy and the ->read pointer of the dummy is set to a routine
> > unconditionally returning cycles_at_suspend.  Next, the dummy is
> > passed to update_fast_timekeeper().
> >
> > Then, ktime_get_mono_fast_ns() will work until the subsequent
> > timekeeping_resume() and the proper readout base for the fast
> > timekeeper will be restored by the timekeeping_update() called
> > right after clearing timekeeping_suspended.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  kernel/time/timekeeping.c |   22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> >
> > Index: linux-pm/kernel/time/timekeeping.c
> > ===================================================================
> > --- linux-pm.orig/kernel/time/timekeeping.c
> > +++ linux-pm/kernel/time/timekeeping.c
> > @@ -1249,9 +1249,23 @@ static void timekeeping_resume(void)
> >         hrtimers_resume();
> >  }
> >
> > +/*
> > + * Dummy readout base and suspend-time cycles value for the fast timekeeper to
> > + * work in a consistent way after timekeeping has been suspended if the core
> > + * timekeeper clocksource is not suspend-nonstop.
> > + */
> > +static struct tk_read_base tkr_dummy;
> > +static cycle_t cycles_at_suspend;
> > +
> > +static cycle_t dummy_clock_read(struct clocksource *cs)
> > +{
> > +       return cycles_at_suspend;
> > +}
> > +
> >  static int timekeeping_suspend(void)
> >  {
> >         struct timekeeper *tk = &tk_core.timekeeper;
> > +       struct clocksource *clock = tk->tkr.clock;
> >         unsigned long flags;
> >         struct timespec64               delta, delta_delta;
> >         static struct timespec64        old_delta;
> > @@ -1294,6 +1308,14 @@ static int timekeeping_suspend(void)
> >         }
> >
> >         timekeeping_update(tk, TK_MIRROR);
> > +
> > +       if (!(clock->flags & CLOCK_SOURCE_SUSPEND_NONSTOP)) {
> > +               memcpy(&tkr_dummy, &tk->tkr, sizeof(tkr_dummy));
> > +               cycles_at_suspend = tk->tkr.read(clock);
> > +               tkr_dummy.read = dummy_clock_read;
> > +               update_fast_timekeeper(&tkr_dummy);
> > +       }
> 
> Its a little ugly... though I'm not sure I have a better idea right off.
> 
> thanks
> -john

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended
  2015-02-13  2:03     ` Rafael J. Wysocki
@ 2015-02-13  2:59       ` Travis
  2015-02-13  9:03       ` John Stultz
  1 sibling, 0 replies; 28+ messages in thread
From: Travis @ 2015-02-13  2:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, Alan Cox, Thomas Gleixner, John Stultz, Len Brown,
	Kristen Carlson Accardi, Li, Aubrey, LKML,
	ACPI Devel Maling List, Peter Zijlstra


Sounds good to me!

On Feb 12, 2015 8:03 PM, "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
>
> On Friday, February 13, 2015 08:53:38 AM John Stultz wrote: 
> > On Wed, Feb 11, 2015 at 12:03 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: 
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> 
> > > 
> > > Theoretically, ktime_get_mono_fast_ns() may be executed after 
> > > timekeeping has been suspended (or before it is resumed) which 
> > > in turn may lead to undefined behavior, for example, when the 
> > > clocksource read from timekeeping_get_ns() called by it is 
> > > not accessible at that time. 
> > 
> > And the callers of the ktime_get_mono_fast_ns() have to get back a 
> > value? 
>
> Yes, they do. 
>
> > Or can we return an error on timekeeping_suspended like we do 
> > w/ __getnstimeofday64()? 
>
> No, we can't. 
>
> > Also, what exactly is the case when the clocksource being read isn't 
> > accessible? I see this is conditionalized on 
> > CLOCK_SOURCE_SUSPEND_NONSTOP, so is the concern on resume we read the 
> > clocksource and its been reset causing a crazy time value? 
>
> The clocksource's ->suspend method may have been called (during suspend) 
> and depending on what that did we may even crash things theoretically. 
>
> During resume, before the clocksource's ->resume callback, it may just 
> be undefined behavior (random data etc). 
>
> For system suspend as we have today the window is quite narrow, but after 
> patch [4/6] from this series suspend-to-idle may suspend timekeeping and 
> just sit there in idle for extended time (hours even) which broadens the 
> potential exposure quite a bit. 
>
> Of course, it does that with interrupts disabled, but ktime_get_mono_fast_ns() 
> is for NMI, so theoretically, if an NMI happens while we're in suspend-to-idle 
> with timekeeping suspended and the clocksource is not CLOCK_SOURCE_SUSPEND_NONSTOP 
> and the NMI calls ktime_get_mono_fast_ns(), strange and undesirable things may 
> happen. 
>
> > > Prevent that from happening by setting up a dummy readout base for 
> > > the fast timekeeper during timekeeping_suspend() such that it will 
> > > always return the same number of cycles. 
> > > 
> > > After the last timekeeping_update() in timekeeping_suspend() the 
> > > clocksource is read and the result is stored as cycles_at_suspend. 
> > > The readout base from the current timekeeper is copied onto the 
> > > dummy and the ->read pointer of the dummy is set to a routine 
> > > unconditionally returning cycles_at_suspend.  Next, the dummy is 
> > > passed to update_fast_timekeeper(). 
> > > 
> > > Then, ktime_get_mono_fast_ns() will work until the subsequent 
> > > timekeeping_resume() and the proper readout base for the fast 
> > > timekeeper will be restored by the timekeeping_update() called 
> > > right after clearing timekeeping_suspended. 
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> 
> > > --- 
> > >  kernel/time/timekeeping.c |   22 ++++++++++++++++++++++ 
> > >  1 file changed, 22 insertions(+) 
> > > 
> > > Index: linux-pm/kernel/time/timekeeping.c 
> > > =================================================================== 
> > > --- linux-pm.orig/kernel/time/timekeeping.c 
> > > +++ linux-pm/kernel/time/timekeeping.c 
> > > @@ -1249,9 +1249,23 @@ static void timekeeping_resume(void) 
> > >         hrtimers_resume(); 
> > >  } 
> > > 
> > > +/* 
> > > + * Dummy readout base and suspend-time cycles value for the fast timekeeper to 
> > > + * work in a consistent way after timekeeping has been suspended if the core 
> > > + * timekeeper clocksource is not suspend-nonstop. 
> > > + */ 
> > > +static struct tk_read_base tkr_dummy; 
> > > +static cycle_t cycles_at_suspend; 
> > > + 
> > > +static cycle_t dummy_clock_read(struct clocksource *cs) 
> > > +{ 
> > > +       return cycles_at_suspend; 
> > > +} 
> > > + 
> > >  static int timekeeping_suspend(void) 
> > >  { 
> > >         struct timekeeper *tk = &tk_core.timekeeper; 
> > > +       struct clocksource *clock = tk->tkr.clock; 
> > >         unsigned long flags; 
> > >         struct timespec64               delta, delta_delta; 
> > >         static struct timespec64        old_delta; 
> > > @@ -1294,6 +1308,14 @@ static int timekeeping_suspend(void) 
> > >         } 
> > > 
> > >         timekeeping_update(tk, TK_MIRROR); 
> > > + 
> > > +       if (!(clock->flags & CLOCK_SOURCE_SUSPEND_NONSTOP)) { 
> > > +               memcpy(&tkr_dummy, &tk->tkr, sizeof(tkr_dummy)); 
> > > +               cycles_at_suspend = tk->tkr.read(clock); 
> > > +               tkr_dummy.read = dummy_clock_read; 
> > > +               update_fast_timekeeper(&tkr_dummy); 
> > > +       } 
> > 
> > Its a little ugly... though I'm not sure I have a better idea right off. 
> > 
> > thanks 
> > -john 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle
  2015-02-11  4:00 [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Rafael J. Wysocki
                   ` (5 preceding siblings ...)
  2015-02-11  4:04 ` [PATCH 6/6] ACPI / idle: Implement ->enter_freeze callback routine Rafael J. Wysocki
@ 2015-02-13  8:13 ` Peter Zijlstra
  2015-02-13 14:29   ` Rafael J. Wysocki
  6 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2015-02-13  8:13 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Maling List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On Wed, Feb 11, 2015 at 05:00:15AM +0100, Rafael J. Wysocki wrote:
> Hi,
> 
> This series adds support for quiescing timers during the last state of
> suspend-to-idle transitions.
> 
> Patches [1-4/6] together are functionally equivalent to the combo RFC patch
> I sent last time (http://marc.info/?l=linux-kernel&m=142344909201464&w=4).
> 
> Patches [5-6/6] add ->enter_freeze callback implementations to intel_idle
> and the ACPI cpuidle driver.
> 
> [1/6] - Rework the suspend-to-idle "mechanics" in preparation for the subsequent
>         changes.  The existing functionality should not change after this.
> [2/6] - Modify update_fast_timekeeper() to take struct tk_read_base pointers as
>         arguments.
> [3/6] - Make it safe to use the fast timekeeper while suspended.
> [4/6] - Support for quiescing timers during suspend-to-idle (core part).
> [5/6] - ->enter_freeze callback for intel_idle.
> [6/6] - ->enter_freeze callback for ACPI cpuidle.
> 
> This works as expected on everything I have in my office and can readily test.
> 
> The patches should apply without any problems on top of the current Linus' tree.

Thanks for the updates, looks good to me.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended
  2015-02-13  2:03     ` Rafael J. Wysocki
  2015-02-13  2:59       ` Travis
@ 2015-02-13  9:03       ` John Stultz
  2015-02-13 14:32         ` Rafael J. Wysocki
  1 sibling, 1 reply; 28+ messages in thread
From: John Stultz @ 2015-02-13  9:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Peter Zijlstra, Alan Cox, Li, Aubrey, LKML,
	Linux PM list, ACPI Devel Maling List, Kristen Carlson Accardi,
	Len Brown

On Fri, Feb 13, 2015 at 10:03 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Friday, February 13, 2015 08:53:38 AM John Stultz wrote:
>> On Wed, Feb 11, 2015 at 12:03 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >
>> > Theoretically, ktime_get_mono_fast_ns() may be executed after
>> > timekeeping has been suspended (or before it is resumed) which
>> > in turn may lead to undefined behavior, for example, when the
>> > clocksource read from timekeeping_get_ns() called by it is
>> > not accessible at that time.
>>
>> And the callers of the ktime_get_mono_fast_ns() have to get back a
>> value?
>
> Yes, they do.
>
>> Or can we return an error on timekeeping_suspended like we do
>> w/ __getnstimeofday64()?
>
> No, we can't.
>
>> Also, what exactly is the case when the clocksource being read isn't
>> accessible? I see this is conditionalized on
>> CLOCK_SOURCE_SUSPEND_NONSTOP, so is the concern on resume we read the
>> clocksource and its been reset causing a crazy time value?
>
> The clocksource's ->suspend method may have been called (during suspend)
> and depending on what that did we may even crash things theoretically.
>
> During resume, before the clocksource's ->resume callback, it may just
> be undefined behavior (random data etc).
>
> For system suspend as we have today the window is quite narrow, but after
> patch [4/6] from this series suspend-to-idle may suspend timekeeping and
> just sit there in idle for extended time (hours even) which broadens the
> potential exposure quite a bit.
>
> Of course, it does that with interrupts disabled, but ktime_get_mono_fast_ns()
> is for NMI, so theoretically, if an NMI happens while we're in suspend-to-idle
> with timekeeping suspended and the clocksource is not CLOCK_SOURCE_SUSPEND_NONSTOP
> and the NMI calls ktime_get_mono_fast_ns(), strange and undesirable things may
> happen.

Ok.. No objection to the approach then. But maybe could you wrap the
new logic in a halt_fast_timekeeper() function? Also is there much
value in not halting it for SUSPEND_NONSTOP clocksources? If not,
might as well halt it in all cases just to simplify the conditions we
have to keep track of in our heads. :)

thanks
-john

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle
  2015-02-13  8:13 ` [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Peter Zijlstra
@ 2015-02-13 14:29   ` Rafael J. Wysocki
  0 siblings, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-13 14:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Alan Cox, Li, Aubrey, LKML, Linux PM list,
	ACPI Devel Maling List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On Friday, February 13, 2015 09:13:42 AM Peter Zijlstra wrote:
> On Wed, Feb 11, 2015 at 05:00:15AM +0100, Rafael J. Wysocki wrote:
> > Hi,
> > 
> > This series adds support for quiescing timers during the last state of
> > suspend-to-idle transitions.
> > 
> > Patches [1-4/6] together are functionally equivalent to the combo RFC patch
> > I sent last time (http://marc.info/?l=linux-kernel&m=142344909201464&w=4).
> > 
> > Patches [5-6/6] add ->enter_freeze callback implementations to intel_idle
> > and the ACPI cpuidle driver.
> > 
> > [1/6] - Rework the suspend-to-idle "mechanics" in preparation for the subsequent
> >         changes.  The existing functionality should not change after this.
> > [2/6] - Modify update_fast_timekeeper() to take struct tk_read_base pointers as
> >         arguments.
> > [3/6] - Make it safe to use the fast timekeeper while suspended.
> > [4/6] - Support for quiescing timers during suspend-to-idle (core part).
> > [5/6] - ->enter_freeze callback for intel_idle.
> > [6/6] - ->enter_freeze callback for ACPI cpuidle.
> > 
> > This works as expected on everything I have in my office and can readily test.
> > 
> > The patches should apply without any problems on top of the current Linus' tree.
> 
> Thanks for the updates, looks good to me.
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Thanks!


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended
  2015-02-13  9:03       ` John Stultz
@ 2015-02-13 14:32         ` Rafael J. Wysocki
  2015-02-14 17:30           ` John Stultz
  0 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-02-13 14:32 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Peter Zijlstra, Alan Cox, Li, Aubrey, LKML,
	Linux PM list, ACPI Devel Maling List, Kristen Carlson Accardi,
	Len Brown

On Friday, February 13, 2015 05:03:51 PM John Stultz wrote:
> On Fri, Feb 13, 2015 at 10:03 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Friday, February 13, 2015 08:53:38 AM John Stultz wrote:
> >> On Wed, Feb 11, 2015 at 12:03 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> >
> >> > Theoretically, ktime_get_mono_fast_ns() may be executed after
> >> > timekeeping has been suspended (or before it is resumed) which
> >> > in turn may lead to undefined behavior, for example, when the
> >> > clocksource read from timekeeping_get_ns() called by it is
> >> > not accessible at that time.
> >>
> >> And the callers of the ktime_get_mono_fast_ns() have to get back a
> >> value?
> >
> > Yes, they do.
> >
> >> Or can we return an error on timekeeping_suspended like we do
> >> w/ __getnstimeofday64()?
> >
> > No, we can't.
> >
> >> Also, what exactly is the case when the clocksource being read isn't
> >> accessible? I see this is conditionalized on
> >> CLOCK_SOURCE_SUSPEND_NONSTOP, so is the concern on resume we read the
> >> clocksource and its been reset causing a crazy time value?
> >
> > The clocksource's ->suspend method may have been called (during suspend)
> > and depending on what that did we may even crash things theoretically.
> >
> > During resume, before the clocksource's ->resume callback, it may just
> > be undefined behavior (random data etc).
> >
> > For system suspend as we have today the window is quite narrow, but after
> > patch [4/6] from this series suspend-to-idle may suspend timekeeping and
> > just sit there in idle for extended time (hours even) which broadens the
> > potential exposure quite a bit.
> >
> > Of course, it does that with interrupts disabled, but ktime_get_mono_fast_ns()
> > is for NMI, so theoretically, if an NMI happens while we're in suspend-to-idle
> > with timekeeping suspended and the clocksource is not CLOCK_SOURCE_SUSPEND_NONSTOP
> > and the NMI calls ktime_get_mono_fast_ns(), strange and undesirable things may
> > happen.
> 
> Ok.. No objection to the approach then. But maybe could you wrap the
> new logic in a halt_fast_timekeeper() function? Also is there much
> value in not halting it for SUSPEND_NONSTOP clocksources? If not,
> might as well halt it in all cases just to simplify the conditions we
> have to keep track of in our heads. :)

I don't see a problem with doing that unconditionally.

What about the appended version of the patch, then?

Rafael


---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: timekeeping: Make it safe to use the fast timekeeper while suspended

Theoretically, ktime_get_mono_fast_ns() may be executed after
timekeeping has been suspended (or before it is resumed) which
in turn may lead to undefined behavior, for example, when the
clocksource read from timekeeping_get_ns() called by it is
not accessible at that time.

Prevent that from happening by setting up a dummy readout base for
the fast timekeeper during timekeeping_suspend() such that it will
always return the same number of cycles.

After the last timekeeping_update() in timekeeping_suspend() the
clocksource is read and the result is stored as cycles_at_suspend.
The readout base from the current timekeeper is copied onto the
dummy and the ->read pointer of the dummy is set to a routine
unconditionally returning cycles_at_suspend.  Next, the dummy is
passed to update_fast_timekeeper().

Then, ktime_get_mono_fast_ns() will work until the subsequent
timekeeping_resume() and the proper readout base for the fast
timekeeper will be restored by the timekeeping_update() called
right after clearing timekeeping_suspended.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 kernel/time/timekeeping.c |   30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

Index: linux-pm/kernel/time/timekeeping.c
===================================================================
--- linux-pm.orig/kernel/time/timekeeping.c
+++ linux-pm/kernel/time/timekeeping.c
@@ -332,6 +332,35 @@ u64 notrace ktime_get_mono_fast_ns(void)
 }
 EXPORT_SYMBOL_GPL(ktime_get_mono_fast_ns);
 
+/* Suspend-time cycles value for halted fast timekeeper. */
+static cycle_t cycles_at_suspend;
+
+static cycle_t dummy_clock_read(struct clocksource *cs)
+{
+	return cycles_at_suspend;
+}
+
+/**
+ * halt_fast_timekeeper - Prevent fast timekeeper from accessing clocksource.
+ * @tk: Timekeeper to snapshot.
+ *
+ * It generally is unsafe to access the clocksource after timekeeping has been
+ * suspended, so take a snapshot of the readout base of @tk and use it as the
+ * fast timekeeper's readout base while suspended.  It will return the same
+ * number of cycles every time until timekeeping is resumed at which time the
+ * proper readout base for the fast timekeeper will be restored automatically.
+ */
+static void halt_fast_timekeeper(struct timekeeper *tk)
+{
+	static struct tk_read_base tkr_dummy;
+	struct tk_read_base *tkr = &tk->tkr;
+
+	memcpy(&tkr_dummy, tkr, sizeof(tkr_dummy));
+	cycles_at_suspend = tkr->read(tkr->clock);
+	tkr_dummy.read = dummy_clock_read;
+	update_fast_timekeeper(&tkr_dummy);
+}
+
 #ifdef CONFIG_GENERIC_TIME_VSYSCALL_OLD
 
 static inline void update_vsyscall(struct timekeeper *tk)
@@ -1294,6 +1323,7 @@ static int timekeeping_suspend(void)
 	}
 
 	timekeeping_update(tk, TK_MIRROR);
+	halt_fast_timekeeper(tk);
 	write_seqcount_end(&tk_core.seq);
 	raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended
  2015-02-13 14:32         ` Rafael J. Wysocki
@ 2015-02-14 17:30           ` John Stultz
  0 siblings, 0 replies; 28+ messages in thread
From: John Stultz @ 2015-02-14 17:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Peter Zijlstra, Alan Cox, Li, Aubrey, LKML,
	Linux PM list, ACPI Devel Maling List, Kristen Carlson Accardi,
	Len Brown

On Fri, Feb 13, 2015 at 10:32 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Friday, February 13, 2015 05:03:51 PM John Stultz wrote:
>> On Fri, Feb 13, 2015 at 10:03 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> > On Friday, February 13, 2015 08:53:38 AM John Stultz wrote:
>> >> On Wed, Feb 11, 2015 at 12:03 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >> >
>> >> > Theoretically, ktime_get_mono_fast_ns() may be executed after
>> >> > timekeeping has been suspended (or before it is resumed) which
>> >> > in turn may lead to undefined behavior, for example, when the
>> >> > clocksource read from timekeeping_get_ns() called by it is
>> >> > not accessible at that time.
>> >>
>> >> And the callers of the ktime_get_mono_fast_ns() have to get back a
>> >> value?
>> >
>> > Yes, they do.
>> >
>> >> Or can we return an error on timekeeping_suspended like we do
>> >> w/ __getnstimeofday64()?
>> >
>> > No, we can't.
>> >
>> >> Also, what exactly is the case when the clocksource being read isn't
>> >> accessible? I see this is conditionalized on
>> >> CLOCK_SOURCE_SUSPEND_NONSTOP, so is the concern on resume we read the
>> >> clocksource and its been reset causing a crazy time value?
>> >
>> > The clocksource's ->suspend method may have been called (during suspend)
>> > and depending on what that did we may even crash things theoretically.
>> >
>> > During resume, before the clocksource's ->resume callback, it may just
>> > be undefined behavior (random data etc).
>> >
>> > For system suspend as we have today the window is quite narrow, but after
>> > patch [4/6] from this series suspend-to-idle may suspend timekeeping and
>> > just sit there in idle for extended time (hours even) which broadens the
>> > potential exposure quite a bit.
>> >
>> > Of course, it does that with interrupts disabled, but ktime_get_mono_fast_ns()
>> > is for NMI, so theoretically, if an NMI happens while we're in suspend-to-idle
>> > with timekeeping suspended and the clocksource is not CLOCK_SOURCE_SUSPEND_NONSTOP
>> > and the NMI calls ktime_get_mono_fast_ns(), strange and undesirable things may
>> > happen.
>>
>> Ok.. No objection to the approach then. But maybe could you wrap the
>> new logic in a halt_fast_timekeeper() function? Also is there much
>> value in not halting it for SUSPEND_NONSTOP clocksources? If not,
>> might as well halt it in all cases just to simplify the conditions we
>> have to keep track of in our heads. :)
>
> I don't see a problem with doing that unconditionally.
>
> What about the appended version of the patch, then?
>
> Rafael
>
>
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: timekeeping: Make it safe to use the fast timekeeper while suspended
>
> Theoretically, ktime_get_mono_fast_ns() may be executed after
> timekeeping has been suspended (or before it is resumed) which
> in turn may lead to undefined behavior, for example, when the
> clocksource read from timekeeping_get_ns() called by it is
> not accessible at that time.
>
> Prevent that from happening by setting up a dummy readout base for
> the fast timekeeper during timekeeping_suspend() such that it will
> always return the same number of cycles.
>
> After the last timekeeping_update() in timekeeping_suspend() the
> clocksource is read and the result is stored as cycles_at_suspend.
> The readout base from the current timekeeper is copied onto the
> dummy and the ->read pointer of the dummy is set to a routine
> unconditionally returning cycles_at_suspend.  Next, the dummy is
> passed to update_fast_timekeeper().
>
> Then, ktime_get_mono_fast_ns() will work until the subsequent
> timekeeping_resume() and the proper readout base for the fast
> timekeeper will be restored by the timekeeping_update() called
> right after clearing timekeeping_suspended.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  kernel/time/timekeeping.c |   30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>
> Index: linux-pm/kernel/time/timekeeping.c
> ===================================================================
> --- linux-pm.orig/kernel/time/timekeeping.c
> +++ linux-pm/kernel/time/timekeeping.c
> @@ -332,6 +332,35 @@ u64 notrace ktime_get_mono_fast_ns(void)
>  }
>  EXPORT_SYMBOL_GPL(ktime_get_mono_fast_ns);
>
> +/* Suspend-time cycles value for halted fast timekeeper. */
> +static cycle_t cycles_at_suspend;
> +
> +static cycle_t dummy_clock_read(struct clocksource *cs)
> +{
> +       return cycles_at_suspend;
> +}
> +
> +/**
> + * halt_fast_timekeeper - Prevent fast timekeeper from accessing clocksource.
> + * @tk: Timekeeper to snapshot.
> + *
> + * It generally is unsafe to access the clocksource after timekeeping has been
> + * suspended, so take a snapshot of the readout base of @tk and use it as the
> + * fast timekeeper's readout base while suspended.  It will return the same
> + * number of cycles every time until timekeeping is resumed at which time the
> + * proper readout base for the fast timekeeper will be restored automatically.
> + */
> +static void halt_fast_timekeeper(struct timekeeper *tk)
> +{
> +       static struct tk_read_base tkr_dummy;
> +       struct tk_read_base *tkr = &tk->tkr;
> +
> +       memcpy(&tkr_dummy, tkr, sizeof(tkr_dummy));
> +       cycles_at_suspend = tkr->read(tkr->clock);
> +       tkr_dummy.read = dummy_clock_read;
> +       update_fast_timekeeper(&tkr_dummy);
> +}
> +
>  #ifdef CONFIG_GENERIC_TIME_VSYSCALL_OLD
>
>  static inline void update_vsyscall(struct timekeeper *tk)
> @@ -1294,6 +1323,7 @@ static int timekeeping_suspend(void)
>         }
>
>         timekeeping_update(tk, TK_MIRROR);
> +       halt_fast_timekeeper(tk);
>         write_seqcount_end(&tk_core.seq);
>         raw_spin_unlock_irqrestore(&timekeeper_lock, flags);


Yep, this looks much nicer. Thanks for respinning it!

Untested, but...
Acked-by: John Stultz <john.stultz@linaro.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/6] intel_idle: Add ->enter_freeze callbacks
  2015-02-12 16:24     ` Rafael J. Wysocki
  2015-02-12 16:25       ` Peter Zijlstra
@ 2015-03-04 23:50       ` Li, Aubrey
  2015-03-05  0:18         ` Rafael J. Wysocki
  1 sibling, 1 reply; 28+ messages in thread
From: Li, Aubrey @ 2015-03-04 23:50 UTC (permalink / raw)
  To: Rafael J. Wysocki, Peter Zijlstra
  Cc: Thomas Gleixner, Alan Cox, LKML, Linux PM list,
	ACPI Devel Maling List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On 2015/2/13 0:24, Rafael J. Wysocki wrote:
> On Thursday, February 12, 2015 02:26:43 PM Peter Zijlstra wrote:
>>
>> Why bother with enter_freeze() for any but the deepest state (C6 in this
>> case)?
> 
> User space may disable the deepest one (and any of them in general) via sysfs
> and there's no good reason to ignore its choice in this particular case while
> we're honoring it otherwise.
> 
> So the logic is basically "find the deepest one which isn't disabled" and
> setting the pointers costs us nothing really.
> 

If the user has the chance to disable C6 via /sys, doesn't that mean C6 works?
Shouldn't we ignore the user space setting during freeze? Otherwise, won't we
lose S0ix?

Thanks,
-Aubrey


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/6] intel_idle: Add ->enter_freeze callbacks
  2015-03-05  0:18         ` Rafael J. Wysocki
@ 2015-03-05  0:09           ` Li, Aubrey
  0 siblings, 0 replies; 28+ messages in thread
From: Li, Aubrey @ 2015-03-05  0:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Peter Zijlstra, Thomas Gleixner, Alan Cox, LKML, Linux PM list,
	ACPI Devel Maling List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On 2015/3/5 8:18, Rafael J. Wysocki wrote:
> On Thursday, March 05, 2015 07:50:26 AM Li, Aubrey wrote:
>> On 2015/2/13 0:24, Rafael J. Wysocki wrote:
>>> On Thursday, February 12, 2015 02:26:43 PM Peter Zijlstra wrote:
>>>>
>>>> Why bother with enter_freeze() for any but the deepest state (C6 in this
>>>> case)?
>>>
>>> User space may disable the deepest one (and any of them in general) via sysfs
>>> and there's no good reason to ignore its choice in this particular case while
>>> we're honoring it otherwise.
>>>
>>> So the logic is basically "find the deepest one which isn't disabled" and
>>> setting the pointers costs us nothing really.
>>>
>>
>> If the user has the chance to disable C6 via /sys, doesn't that mean C6 works?
>> Shouldn't we ignore the user space setting during freeze? Otherwise, won't we
>> lose S0ix?
> 
> We can't ignore it, because we don't know the reason why the state was
> disabled.
> 

> It may just not work reliably enough on the given platform.
>
okay, makes sense to me.

Thanks,
-Aubrey


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/6] intel_idle: Add ->enter_freeze callbacks
  2015-03-04 23:50       ` Li, Aubrey
@ 2015-03-05  0:18         ` Rafael J. Wysocki
  2015-03-05  0:09           ` Li, Aubrey
  0 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2015-03-05  0:18 UTC (permalink / raw)
  To: Li, Aubrey
  Cc: Peter Zijlstra, Thomas Gleixner, Alan Cox, LKML, Linux PM list,
	ACPI Devel Maling List, Kristen Carlson Accardi, John Stultz,
	Len Brown

On Thursday, March 05, 2015 07:50:26 AM Li, Aubrey wrote:
> On 2015/2/13 0:24, Rafael J. Wysocki wrote:
> > On Thursday, February 12, 2015 02:26:43 PM Peter Zijlstra wrote:
> >>
> >> Why bother with enter_freeze() for any but the deepest state (C6 in this
> >> case)?
> > 
> > User space may disable the deepest one (and any of them in general) via sysfs
> > and there's no good reason to ignore its choice in this particular case while
> > we're honoring it otherwise.
> > 
> > So the logic is basically "find the deepest one which isn't disabled" and
> > setting the pointers costs us nothing really.
> > 
> 
> If the user has the chance to disable C6 via /sys, doesn't that mean C6 works?
> Shouldn't we ignore the user space setting during freeze? Otherwise, won't we
> lose S0ix?

We can't ignore it, because we don't know the reason why the state was
disabled.

It may just not work reliably enough on the given platform.
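For reference, the sysfs knobs in question look roughly like this (the paths follow the cpuidle sysfs layout, but the state index that corresponds to C6 is illustrative and varies by CPU and idle driver):

```shell
# Check which index each idle state has on this machine first:
grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name
# Writing 1 to "disable" takes that state out of consideration; the
# suspend-to-idle code then falls back to the deepest *enabled* state.
echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state3/disable
```

Since the kernel cannot tell whether the state was disabled because it is broken on this platform or merely as a policy choice, honoring the setting during freeze is the safe option.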


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2015-03-05  0:09 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-11  4:00 [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Rafael J. Wysocki
2015-02-11  4:01 ` [PATCH 1/6] PM / sleep: Re-implement suspend-to-idle handling Rafael J. Wysocki
2015-02-12 13:14   ` Peter Zijlstra
2015-02-12 16:18     ` Rafael J. Wysocki
2015-02-12 22:33   ` [Update][PATCH 1/6 v2] " Rafael J. Wysocki
2015-02-11  4:01 ` [PATCH 2/6] timekeeping: Pass readout base to update_fast_timekeeper() Rafael J. Wysocki
2015-02-13  0:29   ` John Stultz
2015-02-11  4:03 ` [PATCH 3/6] timekeeping: Make it safe to use the fast timekeeper while suspended Rafael J. Wysocki
2015-02-13  0:53   ` John Stultz
2015-02-13  2:03     ` Rafael J. Wysocki
2015-02-13  2:59       ` Travis
2015-02-13  9:03       ` John Stultz
2015-02-13 14:32         ` Rafael J. Wysocki
2015-02-14 17:30           ` John Stultz
2015-02-11  4:03 ` [PATCH 4/6] PM / sleep: Make it possible to quiesce timers during suspend-to-idle Rafael J. Wysocki
2015-02-12 13:24   ` Peter Zijlstra
2015-02-12 16:19     ` Rafael J. Wysocki
2015-02-12 22:36   ` [Update][PATCH 4/6 v2] " Rafael J. Wysocki
2015-02-11  4:04 ` [PATCH 5/6] intel_idle: Add ->enter_freeze callbacks Rafael J. Wysocki
2015-02-12 13:26   ` Peter Zijlstra
2015-02-12 16:24     ` Rafael J. Wysocki
2015-02-12 16:25       ` Peter Zijlstra
2015-03-04 23:50       ` Li, Aubrey
2015-03-05  0:18         ` Rafael J. Wysocki
2015-03-05  0:09           ` Li, Aubrey
2015-02-11  4:04 ` [PATCH 6/6] ACPI / idle: Implement ->enter_freeze callback routine Rafael J. Wysocki
2015-02-13  8:13 ` [PATCH 0/6] PM / sleep: Support for quiescing timers during suspend-to-idle Peter Zijlstra
2015-02-13 14:29   ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).