linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/5] Add suspend-to-idle validation for Intel SoCs
@ 2016-06-09  3:08 dbasehore
  2016-06-09  3:08 ` [PATCH v3 1/5] x86: stub out pmc function dbasehore
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: dbasehore @ 2016-06-09  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: dbasehore, linux-pm, rjw, pavel, len.brown, tglx, gnomes, peterz

From: Derek Basehore <dbasehore@chromium.org>

This patch set adds support for catching errors when entering freeze
on Intel Skylake SoCs. Support for this can be added to newer SoCs in
later patches.

Verification is done by waking up the CPU up to 1000 seconds later
based on base 10 exponential backoff from 1 second to check the
residency of S0ix. This can't be verified before attempting to enter
S0ix through mwait, so we have to repeatedly verify entry into that
state. Successfully entering S0ix is no guarantee that it will be
entered on the next attempt, so we have to schedule another check.
This has a minimal average power impact of ~3uW on Skylake systems.
This leaves plenty of room for additional overhead based on changes
to the system.

This relies on the recently added patch "platform/x86: Add PMC Driver
for Intel Core SoC"

Changes for v3:
- Fixed compilation error for "freeze: Add error reporting"

Changes for v2:
- Moved to exponential backoff for the freeze duration with a max of
  1000 seconds
- Changed to make the feature default off
- Add module parameter for enabling/disabling the feature instead of
  a debugfs entry

Derek Basehore (5):
  x86: stub out pmc function
  clockevents: Add timed freeze
  x86, apic: Add timed freeze support
  freeze: Add error reporting
  intel_idle: Add S0ix validation

 arch/x86/include/asm/pmc_core.h |   6 +-
 arch/x86/kernel/apic/apic.c     |  25 ++++++-
 drivers/acpi/processor_idle.c   |  10 ++-
 drivers/cpuidle/cpuidle.c       |  31 ++++++--
 drivers/idle/intel_idle.c       | 161 +++++++++++++++++++++++++++++++++++++---
 include/linux/clockchips.h      |  10 +++
 include/linux/cpuidle.h         |  12 ++-
 include/linux/suspend.h         |  10 +++
 kernel/power/suspend.c          |  11 ++-
 kernel/time/clockevents.c       | 117 +++++++++++++++++++++++++++++
 10 files changed, 363 insertions(+), 30 deletions(-)

-- 
2.8.0.rc3.226.g39d4020

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v3 1/5] x86: stub out pmc function
  2016-06-09  3:08 [PATCH v3 0/5] Add suspend-to-idle validation for Intel SoCs dbasehore
@ 2016-06-09  3:08 ` dbasehore
  2016-06-09  3:08 ` [PATCH v3 2/5] clockevents: Add timed freeze dbasehore
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: dbasehore @ 2016-06-09  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: dbasehore, linux-pm, rjw, pavel, len.brown, tglx, gnomes, peterz

From: Derek Basehore <dbasehore@chromium.org>

This creates an inline function of intel_pmc_slp_s0_counter_read for
!CONFIG_INTEL_PMC_CORE.

Signed-off-by: Derek Basehore <dbasehore@chromium.org>
---
 arch/x86/include/asm/pmc_core.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pmc_core.h b/arch/x86/include/asm/pmc_core.h
index d4855f1..786e526 100644
--- a/arch/x86/include/asm/pmc_core.h
+++ b/arch/x86/include/asm/pmc_core.h
@@ -22,6 +22,10 @@
 #define _ASM_PMC_CORE_H
 
 /* API to read SLP_S0_RESIDENCY counter */
-int intel_pmc_slp_s0_counter_read(u32 *data);
+#ifdef CONFIG_INTEL_PMC_CORE
+extern int intel_pmc_slp_s0_counter_read(u32 *data);
+#else
+static inline int intel_pmc_slp_s0_counter_read(u32 *data) { return -ENOSYS; }
+#endif
 
 #endif /* _ASM_PMC_CORE_H */
-- 
2.8.0.rc3.226.g39d4020

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v3 2/5] clockevents: Add timed freeze
  2016-06-09  3:08 [PATCH v3 0/5] Add suspend-to-idle validation for Intel SoCs dbasehore
  2016-06-09  3:08 ` [PATCH v3 1/5] x86: stub out pmc function dbasehore
@ 2016-06-09  3:08 ` dbasehore
  2016-06-09  3:08 ` [PATCH v3 3/5] x86, apic: Add timed freeze support dbasehore
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: dbasehore @ 2016-06-09  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: dbasehore, linux-pm, rjw, pavel, len.brown, tglx, gnomes, peterz

From: Derek Basehore <dbasehore@chromium.org>

Adds a new feature to clockevents to schedule wakeups on a CPU during
freeze. These won't fully wake up the system, but allow simple
platform callbacks that don't require device support to be run during
freeze with little power impact.

This implementation allows an idle driver to setup a timer event with
the clock event device when entering freeze by calling timed_freeze.
timed_freeze is a wrapper around entering freeze and setting up the
timer event so that clock event device state is not exposed to the
caller. This only allows one user of a timed freeze event, but that
will likely be the case for some time.

The reason this isn't implemented on top of the RTC is that wake irqs
unconditionally wake up the system by calling pm_system_wakeup. This
is done by both the SCI and RTC irqs on x86 when an RTC alarm fires.

An abstraction could be easily added on top of this if there is ever
more than one user for freeze events.

Signed-off-by: Derek Basehore <dbasehore@chromium.org>
---
 include/linux/clockchips.h |  10 ++++
 include/linux/suspend.h    |  10 ++++
 kernel/time/clockevents.c  | 117 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 137 insertions(+)

diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
index 0d442e3..e1b66d4 100644
--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -66,12 +66,20 @@ enum clock_event_state {
  */
 # define CLOCK_EVT_FEAT_HRTIMER		0x000080
 
+/*
+ * Clockevent device may run during freeze
+ */
+# define CLOCK_EVT_FEAT_FREEZE		0x000100
+
 /**
  * struct clock_event_device - clock event device descriptor
  * @event_handler:	Assigned by the framework to be called by the low
  *			level handler of the event source
  * @set_next_event:	set next event function using a clocksource delta
  * @set_next_ktime:	set next event function using a direct ktime value
+ * @event_pending:	check if the programmed event is still pending. Used
+ *			for freeze events when timekeeping is suspended and
+ *			irqs are disabled.
  * @next_event:		local storage for the next event in oneshot mode
  * @max_delta_ns:	maximum delta value in ns
  * @min_delta_ns:	minimum delta value in ns
@@ -100,7 +108,9 @@ struct clock_event_device {
 	void			(*event_handler)(struct clock_event_device *);
 	int			(*set_next_event)(unsigned long evt, struct clock_event_device *);
 	int			(*set_next_ktime)(ktime_t expires, struct clock_event_device *);
+	bool			(*event_expired)(struct clock_event_device *);
 	ktime_t			next_event;
+	bool			freeze_event_programmed;
 	u64			max_delta_ns;
 	u64			min_delta_ns;
 	u32			mult;
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 8b6ec7e..d8d4296 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -194,6 +194,11 @@ struct platform_freeze_ops {
 	void (*end)(void);
 };
 
+struct timed_freeze_ops {
+	int (*enter_freeze)(void *);
+	int (*callback)(void *);
+};
+
 #ifdef CONFIG_SUSPEND
 /**
  * suspend_set_ops - set platform dependent suspend operations
@@ -246,6 +251,9 @@ static inline bool idle_should_freeze(void)
 	return unlikely(suspend_freeze_state == FREEZE_STATE_ENTER);
 }
 
+extern int timed_freeze(struct timed_freeze_ops *ops, void *data,
+			ktime_t delta);
+
 extern void freeze_set_ops(const struct platform_freeze_ops *ops);
 extern void freeze_wake(void);
 
@@ -280,6 +288,8 @@ static inline bool pm_resume_via_firmware(void) { return false; }
 static inline void suspend_set_ops(const struct platform_suspend_ops *ops) {}
 static inline int pm_suspend(suspend_state_t state) { return -ENOSYS; }
 static inline bool idle_should_freeze(void) { return false; }
+static inline int timed_freeze(struct timed_freeze_ops *ops, void *data,
+			       ktime_t delta) { return -ENOSYS; }
 static inline void freeze_set_ops(const struct platform_freeze_ops *ops) {}
 static inline void freeze_wake(void) {}
 #endif /* !CONFIG_SUSPEND */
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index a9b76a4..e7ca673 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -17,6 +17,7 @@
 #include <linux/module.h>
 #include <linux/smp.h>
 #include <linux/device.h>
+#include <linux/suspend.h>
 
 #include "tick-internal.h"
 
@@ -341,6 +342,122 @@ int clockevents_program_event(struct clock_event_device *dev, ktime_t expires,
 	return (rc && force) ? clockevents_program_min_delta(dev) : rc;
 }
 
+static int clockevents_program_freeze_event(struct clock_event_device *dev,
+					    ktime_t delta)
+{
+	int64_t delta_ns = ktime_to_ns(delta);
+	unsigned long long clc;
+	int ret;
+
+	if (delta_ns > (int64_t) dev->max_delta_ns) {
+		printk_deferred(KERN_WARNING
+				"Freeze event time longer than max delta\n");
+		delta_ns = (int64_t) dev->max_delta_ns;
+	}
+
+	clockevents_tick_resume(dev);
+	clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT);
+	delta_ns = max_t(int64_t, delta_ns, dev->min_delta_ns);
+	clc = ((unsigned long long) delta_ns * dev->mult) >> dev->shift;
+	ret = dev->set_next_event((unsigned long) clc, dev);
+	if (ret < 0) {
+		printk_deferred(KERN_WARNING
+				"Failed to program freeze event\n");
+		clockevents_shutdown(dev);
+	} else {
+		dev->freeze_event_programmed = true;
+	}
+
+	return ret;
+}
+
+static bool clockevents_freeze_event_expired(struct clock_event_device *dev)
+{
+	if (dev->freeze_event_programmed)
+		return dev->event_expired(dev);
+
+	return false;
+}
+
+static void clockevents_cleanup_freeze_event(struct clock_event_device *dev)
+{
+	if (!(dev->features & CLOCK_EVT_FEAT_FREEZE))
+		return;
+
+	clockevents_shutdown(dev);
+	dev->freeze_event_programmed = false;
+}
+
+/**
+ * timed_freeze - Enter freeze on a CPU for a timed duration
+ * @ops:	Pointers for enter freeze and callback functions.
+ * @data:	Pointer to pass arguments to the function pointers.
+ * @delta:	Time to freeze for. If this amount of time passes in freeze, the
+ *		callback in ops will be called.
+ *
+ * Returns the value from ops->enter_freeze or ops->callback on success, -EERROR
+ * otherwise. If an error is encountered while setting up the clock event,
+ * freeze with still be entered, but it will not be timed nor will the callback
+ * function be run.
+ */
+int timed_freeze(struct timed_freeze_ops *ops, void *data, ktime_t delta)
+{
+	int cpu = smp_processor_id();
+	struct tick_device *td = tick_get_device(cpu);
+	struct clock_event_device *dev;
+	int ret;
+
+	if (!ops || !ops->enter_freeze) {
+		printk_deferred(KERN_ERR
+				"[%s] called with invalid ops\n", __func__);
+		return -EINVAL;
+	}
+
+	if (!td || !td->evtdev ||
+	    !(td->evtdev->features & CLOCK_EVT_FEAT_FREEZE)) {
+		printk_deferred(KERN_WARNING
+				"[%s] called with invalid clock event device\n",
+				__func__);
+		ret = -ENOSYS;
+		goto freeze_no_check;
+	}
+
+	dev = td->evtdev;
+	if (!clockevent_state_shutdown(dev)) {
+		printk_deferred(KERN_WARNING
+				"[%s] called while clock event device in use\n",
+				__func__);
+		ret = -EBUSY;
+		goto freeze_no_check;
+	}
+
+	ret = clockevents_program_freeze_event(dev, delta);
+	if (ret < 0)
+		goto freeze_no_check;
+
+	ret = ops->enter_freeze(data);
+	if (ret < 0)
+		goto out;
+
+	if (ops->callback && clockevents_freeze_event_expired(dev))
+		ret = ops->callback(data);
+
+out:
+	clockevents_cleanup_freeze_event(dev);
+	return ret;
+
+freeze_no_check:
+	/*
+	 * If an error happens before enter_freeze, enter freeze normally and
+	 * return an error. The called can't tell if freeze should be entered on
+	 * an error (since errors can happen after returning from freeze), so
+	 * just handle it here.
+	 */
+	ops->enter_freeze(data);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(timed_freeze);
+
 /*
  * Called after a notify add to make devices available which were
  * released from the notifier call.
-- 
2.8.0.rc3.226.g39d4020

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v3 3/5] x86, apic: Add timed freeze support
  2016-06-09  3:08 [PATCH v3 0/5] Add suspend-to-idle validation for Intel SoCs dbasehore
  2016-06-09  3:08 ` [PATCH v3 1/5] x86: stub out pmc function dbasehore
  2016-06-09  3:08 ` [PATCH v3 2/5] clockevents: Add timed freeze dbasehore
@ 2016-06-09  3:08 ` dbasehore
  2016-06-09  3:08 ` [PATCH v3 4/5] freeze: Add error reporting dbasehore
  2016-06-09  3:08 ` [PATCH v3 5/5] intel_idle: Add S0ix validation dbasehore
  4 siblings, 0 replies; 7+ messages in thread
From: dbasehore @ 2016-06-09  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: dbasehore, linux-pm, rjw, pavel, len.brown, tglx, gnomes, peterz

From: Derek Basehore <dbasehore@chromium.org>

This adds support to the clock event devices created by apic to use
timed freeze. The apic is able to run a timer during freeze with near
izero impact on modern CPUs such as skylake. This will allow S0ix,
suspend-to-idle, to be validated on Intel CPUs that support it.

This is needed because bugs with power settings on the SoC can prevent
S0ix entry. There is also no way to check this before idling all of
the CPUs.

Signed-off-by: Derek Basehore <dbasehore@chromium.org>
---
 arch/x86/kernel/apic/apic.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 60078a6..f0c5f92 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -475,6 +475,26 @@ static int lapic_next_deadline(unsigned long delta,
 	return 0;
 }
 
+static bool lapic_event_expired(struct clock_event_device *evt)
+{
+	u32 cct;
+
+	cct = apic_read(APIC_TMCCT);
+	return cct == 0;
+}
+
+static bool lapic_deadline_expired(struct clock_event_device *evt)
+{
+	u64 msr;
+
+	/*
+	 * When the timer interrupt is triggered, the register is cleared, so a
+	 * non-zero value indicates a pending timer event.
+	 */
+	rdmsrl(MSR_IA32_TSC_DEADLINE, msr);
+	return msr == 0;
+}
+
 static int lapic_timer_shutdown(struct clock_event_device *evt)
 {
 	unsigned int v;
@@ -529,12 +549,14 @@ static struct clock_event_device lapic_clockevent = {
 	.name			= "lapic",
 	.features		= CLOCK_EVT_FEAT_PERIODIC |
 				  CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_C3STOP
-				  | CLOCK_EVT_FEAT_DUMMY,
+				  | CLOCK_EVT_FEAT_DUMMY |
+				  CLOCK_EVT_FEAT_FREEZE,
 	.shift			= 32,
 	.set_state_shutdown	= lapic_timer_shutdown,
 	.set_state_periodic	= lapic_timer_set_periodic,
 	.set_state_oneshot	= lapic_timer_set_oneshot,
 	.set_next_event		= lapic_next_event,
+	.event_expired		= lapic_event_expired,
 	.broadcast		= lapic_timer_broadcast,
 	.rating			= 100,
 	.irq			= -1,
@@ -562,6 +584,7 @@ static void setup_APIC_timer(void)
 		levt->features &= ~(CLOCK_EVT_FEAT_PERIODIC |
 				    CLOCK_EVT_FEAT_DUMMY);
 		levt->set_next_event = lapic_next_deadline;
+		levt->event_expired = lapic_deadline_expired;
 		clockevents_config_and_register(levt,
 						(tsc_khz / TSC_DIVISOR) * 1000,
 						0xF, ~0UL);
-- 
2.8.0.rc3.226.g39d4020

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v3 4/5] freeze: Add error reporting
  2016-06-09  3:08 [PATCH v3 0/5] Add suspend-to-idle validation for Intel SoCs dbasehore
                   ` (2 preceding siblings ...)
  2016-06-09  3:08 ` [PATCH v3 3/5] x86, apic: Add timed freeze support dbasehore
@ 2016-06-09  3:08 ` dbasehore
  2016-06-09  4:14   ` kbuild test robot
  2016-06-09  3:08 ` [PATCH v3 5/5] intel_idle: Add S0ix validation dbasehore
  4 siblings, 1 reply; 7+ messages in thread
From: dbasehore @ 2016-06-09  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: dbasehore, linux-pm, rjw, pavel, len.brown, tglx, gnomes, peterz

From: Derek Basehore <dbasehore@chromium.org>

This adds error reporting for cpuidle to freeze so suspend-to-idle can
report errors when the CPU/SoC is unable to idle properly. Freeze will
abort when an error is encounted.

Signed-off-by: Derek Basehore <dbasehore@chromium.org>
---
 drivers/acpi/processor_idle.c | 10 ++++++----
 drivers/cpuidle/cpuidle.c     | 31 ++++++++++++++++++++++++++-----
 drivers/idle/intel_idle.c     |  8 +++++---
 include/linux/cpuidle.h       | 12 ++++++++----
 kernel/power/suspend.c        | 11 +++++++----
 5 files changed, 52 insertions(+), 20 deletions(-)

diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 444e374..a959b32 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -783,8 +783,8 @@ static int acpi_idle_enter(struct cpuidle_device *dev,
 	return index;
 }
 
-static void acpi_idle_enter_freeze(struct cpuidle_device *dev,
-				   struct cpuidle_driver *drv, int index)
+static int acpi_idle_enter_freeze(struct cpuidle_device *dev,
+				  struct cpuidle_driver *drv, int index)
 {
 	struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu);
 
@@ -792,16 +792,18 @@ static void acpi_idle_enter_freeze(struct cpuidle_device *dev,
 		struct acpi_processor *pr = __this_cpu_read(processors);
 
 		if (unlikely(!pr))
-			return;
+			return 0;
 
 		if (pr->flags.bm_check) {
 			acpi_idle_enter_bm(pr, cx, false);
-			return;
+			return 0;
 		} else {
 			ACPI_FLUSH_CPU_CACHE();
 		}
 	}
 	acpi_idle_do_entry(cx);
+
+	return 0;
 }
 
 struct cpuidle_driver acpi_idle_driver = {
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index a4d0059..45bbaf7 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -34,6 +34,7 @@ LIST_HEAD(cpuidle_detected_devices);
 static int enabled_devices;
 static int off __read_mostly;
 static int initialized __read_mostly;
+static int cpuidle_freeze_error;
 
 int cpuidle_disabled(void)
 {
@@ -109,9 +110,11 @@ int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
 	return find_deepest_state(drv, dev, UINT_MAX, 0, false);
 }
 
-static void enter_freeze_proper(struct cpuidle_driver *drv,
+static int enter_freeze_proper(struct cpuidle_driver *drv,
 				struct cpuidle_device *dev, int index)
 {
+	int ret;
+
 	/*
 	 * trace_suspend_resume() called by tick_freeze() for the last CPU
 	 * executing it contains RCU usage regarded as invalid in the idle
@@ -124,7 +127,7 @@ static void enter_freeze_proper(struct cpuidle_driver *drv,
 	 * suspended is generally unsafe.
 	 */
 	stop_critical_timings();
-	drv->states[index].enter_freeze(dev, drv, index);
+	ret = drv->states[index].enter_freeze(dev, drv, index);
 	WARN_ON(!irqs_disabled());
 	/*
 	 * timekeeping_resume() that will be called by tick_unfreeze() for the
@@ -133,6 +136,7 @@ static void enter_freeze_proper(struct cpuidle_driver *drv,
 	 */
 	RCU_NONIDLE(tick_unfreeze());
 	start_critical_timings();
+	return ret;
 }
 
 /**
@@ -145,7 +149,7 @@ static void enter_freeze_proper(struct cpuidle_driver *drv,
  */
 int cpuidle_enter_freeze(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
-	int index;
+	int index, ret = 0;
 
 	/*
 	 * Find the deepest state with ->enter_freeze present, which guarantees
@@ -153,11 +157,28 @@ int cpuidle_enter_freeze(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 	 * be frozen safely.
 	 */
 	index = find_deepest_state(drv, dev, UINT_MAX, 0, true);
-	if (index > 0)
-		enter_freeze_proper(drv, dev, index);
+	if (index >= 0)
+		ret = enter_freeze_proper(drv, dev, index);
+
+	if (ret < 0) {
+		cpuidle_freeze_error = ret;
+		freeze_wake();
+	}
 
 	return index;
 }
+
+void cpuidle_prepare_freeze(void)
+{
+	cpuidle_freeze_error = 0;
+	cpuidle_resume();
+}
+
+int cpuidle_complete_freeze(void)
+{
+	cpuidle_pause();
+	return cpuidle_freeze_error;
+}
 #endif /* CONFIG_SUSPEND */
 
 /**
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index c966492..98565de 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -97,8 +97,8 @@ static const struct idle_cpu *icpu;
 static struct cpuidle_device __percpu *intel_idle_cpuidle_devices;
 static int intel_idle(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv, int index);
-static void intel_idle_freeze(struct cpuidle_device *dev,
-			      struct cpuidle_driver *drv, int index);
+static int intel_idle_freeze(struct cpuidle_device *dev,
+			     struct cpuidle_driver *drv, int index);
 static int intel_idle_cpu_init(int cpu);
 
 static struct cpuidle_state *cpuidle_state_table;
@@ -870,13 +870,15 @@ static int intel_idle(struct cpuidle_device *dev,
  * @drv: cpuidle driver
  * @index: state index
  */
-static void intel_idle_freeze(struct cpuidle_device *dev,
+static int intel_idle_freeze(struct cpuidle_device *dev,
 			     struct cpuidle_driver *drv, int index)
 {
 	unsigned long ecx = 1; /* break on interrupt flag */
 	unsigned long eax = flg2MWAIT(drv->states[index].flags);
 
 	mwait_idle_with_hints(eax, ecx);
+
+	return 0;
 }
 
 static void __setup_broadcast_timer(void *arg)
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 786ad32..2309849 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -54,11 +54,11 @@ struct cpuidle_state {
 	/*
 	 * CPUs execute ->enter_freeze with the local tick or entire timekeeping
 	 * suspended, so it must not re-enable interrupts at any point (even
-	 * temporarily) or attempt to change states of clock event devices.
+	 * temporarily). Returns 0 on success and non-zero if an error occurred.
 	 */
-	void (*enter_freeze) (struct cpuidle_device *dev,
-			      struct cpuidle_driver *drv,
-			      int index);
+	int (*enter_freeze) (struct cpuidle_device *dev,
+			     struct cpuidle_driver *drv,
+			     int index);
 };
 
 /* Idle State Flags */
@@ -194,6 +194,8 @@ extern int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
 				      struct cpuidle_device *dev);
 extern int cpuidle_enter_freeze(struct cpuidle_driver *drv,
 				struct cpuidle_device *dev);
+extern void cpuidle_prepare_freeze(void);
+extern int cpuidle_complete_freeze(void);
 #else
 static inline int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
 					     struct cpuidle_device *dev)
@@ -201,6 +203,8 @@ static inline int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
 static inline int cpuidle_enter_freeze(struct cpuidle_driver *drv,
 				       struct cpuidle_device *dev)
 {return -ENODEV; }
+static inline void cpuidle_prepare_freeze(void) { }
+static inline int cpuidle_complete_freeze(void) { return -ENODEV; }
 #endif
 
 /* kernel/sched/idle.c */
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 5b70d64..419154b 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -57,8 +57,10 @@ static void freeze_begin(void)
 	suspend_freeze_state = FREEZE_STATE_NONE;
 }
 
-static void freeze_enter(void)
+static int freeze_enter(void)
 {
+	int error = 0;
+
 	spin_lock_irq(&suspend_freeze_lock);
 	if (pm_wakeup_pending())
 		goto out;
@@ -67,7 +69,7 @@ static void freeze_enter(void)
 	spin_unlock_irq(&suspend_freeze_lock);
 
 	get_online_cpus();
-	cpuidle_resume();
+	cpuidle_prepare_freeze();
 
 	/* Push all the CPUs into the idle loop. */
 	wake_up_all_idle_cpus();
@@ -77,7 +79,7 @@ static void freeze_enter(void)
 		   suspend_freeze_state == FREEZE_STATE_WAKE);
 	pr_debug("PM: resume from suspend-to-idle\n");
 
-	cpuidle_pause();
+	error = cpuidle_complete_freeze();
 	put_online_cpus();
 
 	spin_lock_irq(&suspend_freeze_lock);
@@ -85,6 +87,7 @@ static void freeze_enter(void)
  out:
 	suspend_freeze_state = FREEZE_STATE_NONE;
 	spin_unlock_irq(&suspend_freeze_lock);
+	return error;
 }
 
 void freeze_wake(void)
@@ -347,7 +350,7 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
 	 */
 	if (state == PM_SUSPEND_FREEZE) {
 		trace_suspend_resume(TPS("machine_suspend"), state, true);
-		freeze_enter();
+		error = freeze_enter();
 		trace_suspend_resume(TPS("machine_suspend"), state, false);
 		goto Platform_wake;
 	}
-- 
2.8.0.rc3.226.g39d4020

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v3 5/5] intel_idle: Add S0ix validation
  2016-06-09  3:08 [PATCH v3 0/5] Add suspend-to-idle validation for Intel SoCs dbasehore
                   ` (3 preceding siblings ...)
  2016-06-09  3:08 ` [PATCH v3 4/5] freeze: Add error reporting dbasehore
@ 2016-06-09  3:08 ` dbasehore
  4 siblings, 0 replies; 7+ messages in thread
From: dbasehore @ 2016-06-09  3:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: dbasehore, linux-pm, rjw, pavel, len.brown, tglx, gnomes, peterz

From: Derek Basehore <dbasehore@chromium.org>

This adds validation of S0ix entry and enables it on Skylake. Using
the new timed_freeze function, we program the CPU to wake up X seconds
after entering freeze. After X seconds, it will wake the CPU to check
the S0ix residency counters and make sure we entered the lowest power
state for suspend-to-idle.

It exits freeze and reports an error to userspace when the SoC does
not enter S0ix on suspend-to-idle.

One example of a bug that can prevent a Skylake CPU from entering S0ix
(suspend-to-idle) is a leaked reference count to one of the i915 power
wells. The CPU will not be able to enter Package C10 and will
therefore use about 4x as much power for the entire system. The issue
is not specific to the i915 power wells though.

Signed-off-by: Derek Basehore <dbasehore@chromium.org>
---
 drivers/idle/intel_idle.c | 153 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 145 insertions(+), 8 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 98565de..85dcbc6 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -61,9 +61,11 @@
 #include <linux/notifier.h>
 #include <linux/cpu.h>
 #include <linux/module.h>
+#include <linux/suspend.h>
 #include <asm/cpu_device_id.h>
 #include <asm/mwait.h>
 #include <asm/msr.h>
+#include <asm/pmc_core.h>
 
 #define INTEL_IDLE_VERSION "0.4.1"
 #define PREFIX "intel_idle: "
@@ -93,12 +95,36 @@ struct idle_cpu {
 	bool disable_promotion_to_c1e;
 };
 
+/*
+ * The limit for the exponential backoff for the freeze duration. At this point,
+ * power impact is is far from measurable. An estimate is about 3uW based on
+ * scaling from waking up 10 times a second.
+ */
+#define MAX_SLP_S0_SECONDS 1000
+#define SLP_S0_EXP_BASE 10
+
+struct timed_freeze_data {
+	u32 slp_s0_saved_count;
+	struct cpuidle_device *dev;
+	struct cpuidle_driver *drv;
+	int index;
+};
+
+static bool slp_s0_check;
+static unsigned int slp_s0_seconds;
+
+static DEFINE_SPINLOCK(slp_s0_check_lock);
+static unsigned int slp_s0_num_cpus;
+static bool slp_s0_check_inprogress;
+
 static const struct idle_cpu *icpu;
 static struct cpuidle_device __percpu *intel_idle_cpuidle_devices;
 static int intel_idle(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv, int index);
 static int intel_idle_freeze(struct cpuidle_device *dev,
 			     struct cpuidle_driver *drv, int index);
+static int intel_idle_freeze_and_check(struct cpuidle_device *dev,
+				       struct cpuidle_driver *drv, int index);
 static int intel_idle_cpu_init(int cpu);
 
 static struct cpuidle_state *cpuidle_state_table;
@@ -599,7 +625,7 @@ static struct cpuidle_state skl_cstates[] = {
 		.exit_latency = 2,
 		.target_residency = 2,
 		.enter = &intel_idle,
-		.enter_freeze = intel_idle_freeze, },
+		.enter_freeze = intel_idle_freeze_and_check, },
 	{
 		.name = "C1E-SKL",
 		.desc = "MWAIT 0x01",
@@ -607,7 +633,7 @@ static struct cpuidle_state skl_cstates[] = {
 		.exit_latency = 10,
 		.target_residency = 20,
 		.enter = &intel_idle,
-		.enter_freeze = intel_idle_freeze, },
+		.enter_freeze = intel_idle_freeze_and_check, },
 	{
 		.name = "C3-SKL",
 		.desc = "MWAIT 0x10",
@@ -615,7 +641,7 @@ static struct cpuidle_state skl_cstates[] = {
 		.exit_latency = 70,
 		.target_residency = 100,
 		.enter = &intel_idle,
-		.enter_freeze = intel_idle_freeze, },
+		.enter_freeze = intel_idle_freeze_and_check, },
 	{
 		.name = "C6-SKL",
 		.desc = "MWAIT 0x20",
@@ -623,7 +649,7 @@ static struct cpuidle_state skl_cstates[] = {
 		.exit_latency = 85,
 		.target_residency = 200,
 		.enter = &intel_idle,
-		.enter_freeze = intel_idle_freeze, },
+		.enter_freeze = intel_idle_freeze_and_check, },
 	{
 		.name = "C7s-SKL",
 		.desc = "MWAIT 0x33",
@@ -631,7 +657,7 @@ static struct cpuidle_state skl_cstates[] = {
 		.exit_latency = 124,
 		.target_residency = 800,
 		.enter = &intel_idle,
-		.enter_freeze = intel_idle_freeze, },
+		.enter_freeze = intel_idle_freeze_and_check, },
 	{
 		.name = "C8-SKL",
 		.desc = "MWAIT 0x40",
@@ -639,7 +665,7 @@ static struct cpuidle_state skl_cstates[] = {
 		.exit_latency = 200,
 		.target_residency = 800,
 		.enter = &intel_idle,
-		.enter_freeze = intel_idle_freeze, },
+		.enter_freeze = intel_idle_freeze_and_check, },
 	{
 		.name = "C9-SKL",
 		.desc = "MWAIT 0x50",
@@ -647,7 +673,7 @@ static struct cpuidle_state skl_cstates[] = {
 		.exit_latency = 480,
 		.target_residency = 5000,
 		.enter = &intel_idle,
-		.enter_freeze = intel_idle_freeze, },
+		.enter_freeze = intel_idle_freeze_and_check, },
 	{
 		.name = "C10-SKL",
 		.desc = "MWAIT 0x60",
@@ -655,7 +681,7 @@ static struct cpuidle_state skl_cstates[] = {
 		.exit_latency = 890,
 		.target_residency = 5000,
 		.enter = &intel_idle,
-		.enter_freeze = intel_idle_freeze, },
+		.enter_freeze = intel_idle_freeze_and_check, },
 	{
 		.enter = NULL }
 };
@@ -869,6 +895,8 @@ static int intel_idle(struct cpuidle_device *dev,
  * @dev: cpuidle_device
  * @drv: cpuidle driver
  * @index: state index
+ *
+ * @return 0 for success, no failure state
  */
 static int intel_idle_freeze(struct cpuidle_device *dev,
 			     struct cpuidle_driver *drv, int index)
@@ -881,6 +909,105 @@ static int intel_idle_freeze(struct cpuidle_device *dev,
 	return 0;
 }
 
+static int intel_idle_freeze_wrapper(void *data)
+{
+	struct timed_freeze_data *tfd = data;
+
+	return intel_idle_freeze(tfd->dev, tfd->drv, tfd->index);
+}
+
+static int check_slp_s0(void *data)
+{
+	struct timed_freeze_data *tfd = data;
+	u32 slp_s0_new_count;
+
+	if (intel_pmc_slp_s0_counter_read(&slp_s0_new_count)) {
+		pr_warn("Unable to read SLP S0 residency counter\n");
+		return -EIO;
+	}
+
+	if (tfd->slp_s0_saved_count == slp_s0_new_count) {
+		pr_warn("CPU did not enter SLP S0 for suspend-to-idle.\n");
+		return -EIO;
+	}
+
+	return 0;
+}
+
+/**
+ * intel_idle_freeze_and_check - enters suspend-to-idle and validates the power
+ * state
+ *
+ * This function enters suspend-to-idle with intel_idle_freeze, but also sets up
+ * a timer to check that S0ix (low power state for suspend-to-idle on Intel
+ * CPUs) is properly entered.
+ *
+ * @dev: cpuidle_device
+ * @drv: cpuidle_driver
+ * @index: state index
+ * @return 0 for success, -EERROR if S0ix was not entered.
+ */
+static int intel_idle_freeze_and_check(struct cpuidle_device *dev,
+				       struct cpuidle_driver *drv, int index)
+{
+	bool check_on_this_cpu = false;
+	struct timed_freeze_ops ops;
+	struct timed_freeze_data tfd;
+	unsigned long flags;
+	int ret = 0;
+
+	/* The last CPU to freeze sets up checking SLP S0 assertion. */
+	spin_lock_irqsave(&slp_s0_check_lock, flags);
+	if (slp_s0_seconds &&
+	    ++slp_s0_num_cpus == num_online_cpus() &&
+	    !slp_s0_check_inprogress &&
+	    !intel_pmc_slp_s0_counter_read(&tfd.slp_s0_saved_count)) {
+		tfd.dev = dev;
+		tfd.drv = drv;
+		tfd.index = index;
+		ops.enter_freeze = intel_idle_freeze_wrapper;
+		ops.callback = check_slp_s0;
+		check_on_this_cpu = true;
+		/*
+		 * Make sure check_slp_s0 isn't scheduled on another CPU if it
+		 * were to leave freeze and enter it again before this CPU
+		 * leaves freeze.
+		 */
+		slp_s0_check_inprogress = true;
+	}
+	spin_unlock_irqrestore(&slp_s0_check_lock, flags);
+
+	if (check_on_this_cpu)
+		ret = timed_freeze(&ops, &tfd, ktime_set(slp_s0_seconds, 0));
+	else
+		ret = intel_idle_freeze(dev, drv, index);
+
+	spin_lock_irqsave(&slp_s0_check_lock, flags);
+	slp_s0_num_cpus--;
+	if (check_on_this_cpu) {
+		slp_s0_check_inprogress = false;
+		slp_s0_seconds = min_t(unsigned int,
+				       SLP_S0_EXP_BASE * slp_s0_seconds,
+				       MAX_SLP_S0_SECONDS);
+	}
+
+	spin_unlock_irqrestore(&slp_s0_check_lock, flags);
+	return ret;
+}
+
+static int slp_s0_check_prepare(struct notifier_block *nb, unsigned long action,
+				void *data)
+{
+	if (action == PM_SUSPEND_PREPARE)
+		slp_s0_seconds = slp_s0_check ? 1 : 0;
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block intel_slp_s0_check_nb = {
+	.notifier_call = slp_s0_check_prepare,
+};
+
 static void __setup_broadcast_timer(void *arg)
 {
 	unsigned long on = (unsigned long)arg;
@@ -1391,6 +1518,13 @@ static int __init intel_idle_init(void)
 		return retval;
 	}
 
+	retval = register_pm_notifier(&intel_slp_s0_check_nb);
+	if (retval) {
+		free_percpu(intel_idle_cpuidle_devices);
+		cpuidle_unregister_driver(&intel_idle_driver);
+		return retval;
+	}
+
 	cpu_notifier_register_begin();
 
 	for_each_online_cpu(i) {
@@ -1398,6 +1532,7 @@ static int __init intel_idle_init(void)
 		if (retval) {
 			intel_idle_cpuidle_devices_uninit();
 			cpu_notifier_register_done();
+			unregister_pm_notifier(&intel_slp_s0_check_nb);
 			cpuidle_unregister_driver(&intel_idle_driver);
 			free_percpu(intel_idle_cpuidle_devices);
 			return retval;
@@ -1436,6 +1571,7 @@ static void __exit intel_idle_exit(void)
 
 	cpu_notifier_register_done();
 
+	unregister_pm_notifier(&intel_slp_s0_check_nb);
 	cpuidle_unregister_driver(&intel_idle_driver);
 	free_percpu(intel_idle_cpuidle_devices);
 }
@@ -1444,6 +1580,7 @@ module_init(intel_idle_init);
 module_exit(intel_idle_exit);
 
 module_param(max_cstate, int, 0444);
+module_param(slp_s0_check, bool, 0644);
 
 MODULE_AUTHOR("Len Brown <len.brown@intel.com>");
 MODULE_DESCRIPTION("Cpuidle driver for Intel Hardware v" INTEL_IDLE_VERSION);
-- 
2.8.0.rc3.226.g39d4020

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 4/5] freeze: Add error reporting
  2016-06-09  3:08 ` [PATCH v3 4/5] freeze: Add error reporting dbasehore
@ 2016-06-09  4:14   ` kbuild test robot
  0 siblings, 0 replies; 7+ messages in thread
From: kbuild test robot @ 2016-06-09  4:14 UTC (permalink / raw)
  To: dbasehore
  Cc: kbuild-all, linux-kernel, dbasehore, linux-pm, rjw, pavel,
	len.brown, tglx, gnomes, peterz

[-- Attachment #1: Type: text/plain, Size: 1763 bytes --]

Hi,

[auto build test WARNING on pm/linux-next]
[also build test WARNING on v4.7-rc2 next-20160608]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/dbasehore-chromium-org/Add-suspend-to-idle-validation-for-Intel-SoCs/20160609-111421
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
config: x86_64-randconfig-s3-06091046 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

>> drivers/cpuidle/cpuidle.c:37:12: warning: 'cpuidle_freeze_error' defined but not used [-Wunused-variable]
    static int cpuidle_freeze_error;
               ^~~~~~~~~~~~~~~~~~~~

vim +/cpuidle_freeze_error +37 drivers/cpuidle/cpuidle.c

    21	#include <linux/module.h>
    22	#include <linux/suspend.h>
    23	#include <linux/tick.h>
    24	#include <trace/events/power.h>
    25	
    26	#include "cpuidle.h"
    27	
    28	DEFINE_PER_CPU(struct cpuidle_device *, cpuidle_devices);
    29	DEFINE_PER_CPU(struct cpuidle_device, cpuidle_dev);
    30	
    31	DEFINE_MUTEX(cpuidle_lock);
    32	LIST_HEAD(cpuidle_detected_devices);
    33	
    34	static int enabled_devices;
    35	static int off __read_mostly;
    36	static int initialized __read_mostly;
  > 37	static int cpuidle_freeze_error;
    38	
    39	int cpuidle_disabled(void)
    40	{
    41		return off;
    42	}
    43	void disable_cpuidle(void)
    44	{
    45		off = 1;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 25981 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-06-09  4:30 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-09  3:08 [PATCH v3 0/5] Add suspend-to-idle validation for Intel SoCs dbasehore
2016-06-09  3:08 ` [PATCH v3 1/5] x86: stub out pmc function dbasehore
2016-06-09  3:08 ` [PATCH v3 2/5] clockevents: Add timed freeze dbasehore
2016-06-09  3:08 ` [PATCH v3 3/5] x86, apic: Add timed freeze support dbasehore
2016-06-09  3:08 ` [PATCH v3 4/5] freeze: Add error reporting dbasehore
2016-06-09  4:14   ` kbuild test robot
2016-06-09  3:08 ` [PATCH v3 5/5] intel_idle: Add S0ix validation dbasehore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).