linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/5] powernv:stop: Use psscr_val,mask provided by firmware
@ 2017-01-10  9:06 Gautham R. Shenoy
  2017-01-10  9:07 ` [PATCH v5 1/5] powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro Gautham R. Shenoy
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Gautham R. Shenoy @ 2017-01-10  9:06 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rafael J. Wysocki, Daniel Lezcano, Michael Neuling,
	Vaidyanathan Srinivasan, Shreyas B. Prabhu, Shilpasri G Bhat,
	Stewart Smith, Balbir Singh, Oliver O'Halloran
  Cc: linuxppc-dev, linux-kernel, linux-pm, devicetree, Rob Herring,
	mark.rutland, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

This is the fifth iteration of the patchset to use the psscr_val and
psscr_mask provided by the firmware for each of the stop states.

The previous versions can be found here:
[v4]: https://lkml.org/lkml/2016/12/9/288
[v3]: https://lkml.org/lkml/2016/11/10/37
[v2]: https://lkml.org/lkml/2016/10/27/143
[v1]: https://lkml.org/lkml/2016/9/29/45

This version addresses the feedback provided by Balbir and Vaidy to
v4. The key changes are:

- The current code had some functions/variables names containing
  arch300_* while others had power9*_. Uniformly rename this to arch300_*.

- Add a comment for POWERNV_THRESHOLD_LATENCY_NS.

- In case of a new firmware, validate that the PSSCR values provided
  by the firmware preserves the following invariants as required by the ISA:
    a) While running in Hypervisor mode (HV=1), EC bit must have the
    same value as ESL bit.
    b) For deep stop states that result in state-loss, ESL bit must be set.

- Optimize the sequence of instructions before executing stop with
  ESL=EC=0. Reduce one instruction.

- Fixed the typos in the Documentation for the device-tree bindings
  describing the stop-states exposed by the firmware.

Synopsis
==========
In the current implementation, the code for ISA
v3.0 stop implementation has a couple of shortcomings.

a) The code hand-codes the values for ESL,EC,TR,MTL bits of PSSCR and
   uses only the RL field from the firmware. While this is not
   incorrect, since the hand-coded values are legitimate, it is not a
   very flexible design since the firmware has the capability to
   communicate these values via the "ibm,cpu-idle-state-psscr" and
   "ibm,cpu-idle-state-psscr-mask" properties. In case where the
   firmware provides values for these fields that is different from
   the hand-coded values, the current code will not work as intended.

b) Due to issue a), the current code assumes that ESL=EC=1 for all the
   stop states and hence the wakeup from the stop instruction will
   happen at 0x100, the system-reset vector. However, the ISA v3.0
   allows the ESL=EC=0 behaviour where the corresponding stop-state
   loses no state and wakes up from the subsequent instruction. The
   current code doesn't handle this case.
   
This patch series addresses these issues.

The first patch in the series renames the existing
IDLE_STATE_ENTER_SEQ macro to IDLE_STATE_ENTER_SEQ_NORET. It reuses
the name IDLE_STATE_ENTER_SEQ for entering into stop-states which wake
up at the subsequent instruction.

The second patch in the series uniformly renames all occurences of "power9"
to "arch300" thereby making the variable/function names consistent.

The third patch adds a helper function in cpuidle-powernv.c for
initializing entries of the powernv_states[] table that is passed to
the cpu-idle core. This eliminates some of the code duplication in the
function that discovers and initializes the stop states.

The fourth patch in the series fixes issues a) and b) by ensuring that
the psscr-value and the psscr-mask provided by the firmware are what
will be used to set a particular stop state. It also adds support for
handling wake-up from stop states which were entered with ESL=EC=0.
It validates hat the psscr values exposed by the firmware maintains
the invariants mentioned in the ISA.

The fourth patch also handles the older firmware which sets only the
Requested Level (RL) field in the psscr and psscr-mask exposed in the
device tree. In the presence of such older firmware, this patch will
set the default sane values for for remaining PSSCR fields (i.e PSLL,
MTL, ESL, EC, and TR).

The fifth patch provides the documentation for the device-tree
bindings describing the idle state properties under the @power-mgt
node in the device-tree.

The skiboot patch populates all the relevant fields in the PSSCR
values and the mask for all the stop states can be found here:
https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html

Gautham R. Shenoy (5):
  powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro
  powernv:stop: Uniformly rename power9 to arch300
  cpuidle:powernv: Add helper function to populate powernv idle states.
  powernv: Pass PSSCR value and mask to power9_idle_stop
  Documentation:powerpc: Add device-tree bindings for power-mgt

 .../devicetree/bindings/powerpc/opal/power-mgt.txt | 125 ++++++++++++++++++
 arch/powerpc/include/asm/cpuidle.h                 |  49 ++++++-
 arch/powerpc/include/asm/processor.h               |   3 +-
 arch/powerpc/kernel/exceptions-64s.S               |   6 +-
 arch/powerpc/kernel/idle_book3s.S                  |  53 ++++----
 arch/powerpc/platforms/powernv/idle.c              | 142 ++++++++++++++++++---
 arch/powerpc/platforms/powernv/powernv.h           |   3 +-
 arch/powerpc/platforms/powernv/smp.c               |  14 +-
 drivers/cpuidle/cpuidle-powernv.c                  | 129 +++++++++++++------
 include/linux/cpuidle.h                            |   1 +
 10 files changed, 435 insertions(+), 90 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt

-- 
1.9.4

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v5 1/5] powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro
  2017-01-10  9:06 [PATCH v5 0/5] powernv:stop: Use psscr_val,mask provided by firmware Gautham R. Shenoy
@ 2017-01-10  9:07 ` Gautham R. Shenoy
  2017-01-12  9:42   ` Balbir Singh
  2017-01-10  9:07 ` [PATCH v5 2/5] powernv:stop: Uniformly rename power9 to arch300 Gautham R. Shenoy
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Gautham R. Shenoy @ 2017-01-10  9:07 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rafael J. Wysocki, Daniel Lezcano, Michael Neuling,
	Vaidyanathan Srinivasan, Shreyas B. Prabhu, Shilpasri G Bhat,
	Stewart Smith, Balbir Singh, Oliver O'Halloran
  Cc: linuxppc-dev, linux-kernel, linux-pm, devicetree, Rob Herring,
	mark.rutland, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Currently all the low-power idle states are expected to wake up
at reset vector 0x100. Which is why the macro IDLE_STATE_ENTER_SEQ
that puts the CPU to an idle state and never returns.

On ISA v3.0, when the ESL and EC bits in the PSSCR are zero, the CPU
is expected to wake up at the next instruction of the idle
instruction.

This patch adds a new macro named IDLE_STATE_ENTER_SEQ_NORET for the
no-return variant and reuses the name IDLE_STATE_ENTER_SEQ
for a variant that allows resuming operation at the instruction next
to the idle-instruction.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
No changes from v4

 arch/powerpc/include/asm/cpuidle.h   |  5 ++++-
 arch/powerpc/kernel/exceptions-64s.S |  6 +++---
 arch/powerpc/kernel/idle_book3s.S    | 10 +++++-----
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index 3919332..0a3255b 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -21,7 +21,7 @@
 
 /* Idle state entry routines */
 #ifdef	CONFIG_PPC_P7_NAP
-#define	IDLE_STATE_ENTER_SEQ(IDLE_INST)				\
+#define IDLE_STATE_ENTER_SEQ(IDLE_INST)                         \
 	/* Magic NAP/SLEEP/WINKLE mode enter sequence */	\
 	std	r0,0(r1);					\
 	ptesync;						\
@@ -29,6 +29,9 @@
 1:	cmpd	cr0,r0,r0;					\
 	bne	1b;						\
 	IDLE_INST;						\
+
+#define	IDLE_STATE_ENTER_SEQ_NORET(IDLE_INST)			\
+	IDLE_STATE_ENTER_SEQ(IDLE_INST)                         \
 	b	.
 #endif /* CONFIG_PPC_P7_NAP */
 
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 1ba82ea..7aa8afc 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -381,12 +381,12 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
 	lbz	r3,PACA_THREAD_IDLE_STATE(r13)
 	cmpwi	r3,PNV_THREAD_NAP
 	bgt	10f
-	IDLE_STATE_ENTER_SEQ(PPC_NAP)
+	IDLE_STATE_ENTER_SEQ_NORET(PPC_NAP)
 	/* No return */
 10:
 	cmpwi	r3,PNV_THREAD_SLEEP
 	bgt	2f
-	IDLE_STATE_ENTER_SEQ(PPC_SLEEP)
+	IDLE_STATE_ENTER_SEQ_NORET(PPC_SLEEP)
 	/* No return */
 
 2:
@@ -400,7 +400,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
 	 */
 	ori	r13,r13,1
 	SET_PACA(r13)
-	IDLE_STATE_ENTER_SEQ(PPC_WINKLE)
+	IDLE_STATE_ENTER_SEQ_NORET(PPC_WINKLE)
 	/* No return */
 4:
 #endif
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 72dac0b..be90e2f 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -205,7 +205,7 @@ pnv_enter_arch207_idle_mode:
 	stb	r3,PACA_THREAD_IDLE_STATE(r13)
 	cmpwi	cr3,r3,PNV_THREAD_SLEEP
 	bge	cr3,2f
-	IDLE_STATE_ENTER_SEQ(PPC_NAP)
+	IDLE_STATE_ENTER_SEQ_NORET(PPC_NAP)
 	/* No return */
 2:
 	/* Sleep or winkle */
@@ -239,7 +239,7 @@ pnv_fastsleep_workaround_at_entry:
 
 common_enter: /* common code for all the threads entering sleep or winkle */
 	bgt	cr3,enter_winkle
-	IDLE_STATE_ENTER_SEQ(PPC_SLEEP)
+	IDLE_STATE_ENTER_SEQ_NORET(PPC_SLEEP)
 
 fastsleep_workaround_at_entry:
 	ori	r15,r15,PNV_CORE_IDLE_LOCK_BIT
@@ -261,7 +261,7 @@ fastsleep_workaround_at_entry:
 enter_winkle:
 	bl	save_sprs_to_stack
 
-	IDLE_STATE_ENTER_SEQ(PPC_WINKLE)
+	IDLE_STATE_ENTER_SEQ_NORET(PPC_WINKLE)
 
 /*
  * r3 - requested stop state
@@ -280,7 +280,7 @@ power_enter_stop:
 	ld	r4,ADDROFF(pnv_first_deep_stop_state)(r5)
 	cmpd	r3,r4
 	bge	2f
-	IDLE_STATE_ENTER_SEQ(PPC_STOP)
+	IDLE_STATE_ENTER_SEQ_NORET(PPC_STOP)
 2:
 /*
  * Entering deep idle state.
@@ -302,7 +302,7 @@ lwarx_loop_stop:
 
 	bl	save_sprs_to_stack
 
-	IDLE_STATE_ENTER_SEQ(PPC_STOP)
+	IDLE_STATE_ENTER_SEQ_NORET(PPC_STOP)
 
 _GLOBAL(power7_idle)
 	/* Now check if user or arch enabled NAP mode */
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 2/5] powernv:stop: Uniformly rename power9 to arch300
  2017-01-10  9:06 [PATCH v5 0/5] powernv:stop: Use psscr_val,mask provided by firmware Gautham R. Shenoy
  2017-01-10  9:07 ` [PATCH v5 1/5] powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro Gautham R. Shenoy
@ 2017-01-10  9:07 ` Gautham R. Shenoy
  2017-01-12  9:47   ` Balbir Singh
  2017-01-10  9:07 ` [PATCH v5 3/5] cpuidle:powernv: Add helper function to populate powernv idle states Gautham R. Shenoy
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Gautham R. Shenoy @ 2017-01-10  9:07 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rafael J. Wysocki, Daniel Lezcano, Michael Neuling,
	Vaidyanathan Srinivasan, Shreyas B. Prabhu, Shilpasri G Bhat,
	Stewart Smith, Balbir Singh, Oliver O'Halloran
  Cc: linuxppc-dev, linux-kernel, linux-pm, devicetree, Rob Herring,
	mark.rutland, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Balbir pointed out that in idle_book3s.S and powernv/idle.c some
functions and variables had power9 in their names while some others
had arch300.

This patch uniformly renames all instances of "power9" in the
variables/function/comments occuring in these files to "arch300" in
order to make them consistent.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
New patch in v5

 arch/powerpc/include/asm/processor.h  |  2 +-
 arch/powerpc/kernel/idle_book3s.S     | 13 +++++++------
 arch/powerpc/platforms/powernv/idle.c |  6 +++---
 arch/powerpc/platforms/powernv/smp.c  |  2 +-
 drivers/cpuidle/cpuidle-powernv.c     |  2 +-
 5 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index c07c31b..4b47308 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -458,7 +458,7 @@ static inline unsigned long get_clean_sp(unsigned long sp, int is_32)
 extern unsigned long power7_nap(int check_irq);
 extern unsigned long power7_sleep(void);
 extern unsigned long power7_winkle(void);
-extern unsigned long power9_idle_stop(unsigned long stop_level);
+extern unsigned long arch300_idle_stop(unsigned long stop_level);
 
 extern void flush_instruction_cache(void);
 extern void hard_reset_now(void);
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index be90e2f..7f6657f 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -106,8 +106,9 @@ core_idle_lock_held:
 
 /*
  * Pass requested state in r3:
- *	r3 - PNV_THREAD_NAP/SLEEP/WINKLE in POWER8
- *	   - Requested STOP state in POWER9
+ *	r3 - PNV_THREAD_NAP/SLEEP/WINKLE on ISA 2.07 or less
+	     (POWER7,POWER8)
+ *	   - Requested STOP state on ISA 3.0 CPUs
  *
  * To check IRQ_HAPPENED in r4
  * 	0 - don't check
@@ -357,7 +358,7 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66);		\
 /*
  * r3 - requested stop state
  */
-_GLOBAL(power9_idle_stop)
+_GLOBAL(arch300_idle_stop)
 	LOAD_REG_IMMEDIATE(r4, PSSCR_HV_TEMPLATE)
 	or	r4,r4,r3
 	mtspr	SPRN_PSSCR, r4
@@ -377,7 +378,7 @@ _GLOBAL(pnv_restore_hyp_resource)
 BEGIN_FTR_SECTION
 	ld	r2,PACATOC(r13);
 	/*
-	 * POWER ISA 3. Use PSSCR to determine if we
+	 * POWER ISA 3.0. Use PSSCR to determine if we
 	 * are waking up from deep idle state
 	 */
 	LOAD_REG_ADDRBASE(r5,pnv_first_deep_stop_state)
@@ -429,8 +430,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
 /*
  * Called if waking up from idle state which can cause either partial or
  * complete hyp state loss.
- * In POWER8, called if waking up from fastsleep or winkle
- * In POWER9, called if waking up from stop state >= pnv_first_deep_stop_state
+ * In ISA 2.07 (POWER8),called if waking up from fastsleep or winkle
+ * In ISA 3.0, called if waking up from stop state >= pnv_first_deep_stop_state
  *
  * r13 - PACA
  * cr3 - gt if waking up with partial/complete hypervisor state loss
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 479c256..c3a2fac 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -241,10 +241,10 @@ static DEVICE_ATTR(fastsleep_workaround_applyonce, 0600,
 /*
  * Used for ppc_md.power_save which needs a function with no parameters
  */
-static void power9_idle(void)
+static void arch300_idle(void)
 {
 	/* Requesting stop state 0 */
-	power9_idle_stop(0);
+	arch300_idle_stop(0);
 }
 /*
  * First deep stop state. Used to figure out when to save/restore
@@ -415,7 +415,7 @@ static int __init pnv_init_idle_states(void)
 	if (supported_cpuidle_states & OPAL_PM_NAP_ENABLED)
 		ppc_md.power_save = power7_idle;
 	else if (supported_cpuidle_states & OPAL_PM_STOP_INST_FAST)
-		ppc_md.power_save = power9_idle;
+		ppc_md.power_save = arch300_idle;
 
 out:
 	return 0;
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index c789258..c931bb2 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -183,7 +183,7 @@ static void pnv_smp_cpu_kill_self(void)
 		ppc64_runlatch_off();
 
 		if (cpu_has_feature(CPU_FTR_ARCH_300))
-			srr1 = power9_idle_stop(pnv_deepest_stop_state);
+			srr1 = arch300_idle_stop(pnv_deepest_stop_state);
 		else if (idle_states & OPAL_PM_WINKLE_ENABLED)
 			srr1 = power7_winkle();
 		else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 7fe442c..a7f6528 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -102,7 +102,7 @@ static int stop_loop(struct cpuidle_device *dev,
 		     int index)
 {
 	ppc64_runlatch_off();
-	power9_idle_stop(stop_psscr_table[index]);
+	arch300_idle_stop(stop_psscr_table[index]);
 	ppc64_runlatch_on();
 	return index;
 }
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 3/5] cpuidle:powernv: Add helper function to populate powernv idle states.
  2017-01-10  9:06 [PATCH v5 0/5] powernv:stop: Use psscr_val,mask provided by firmware Gautham R. Shenoy
  2017-01-10  9:07 ` [PATCH v5 1/5] powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro Gautham R. Shenoy
  2017-01-10  9:07 ` [PATCH v5 2/5] powernv:stop: Uniformly rename power9 to arch300 Gautham R. Shenoy
@ 2017-01-10  9:07 ` Gautham R. Shenoy
  2017-01-12 10:04   ` Balbir Singh
  2017-01-10  9:07 ` [PATCH v5 4/5] powernv: Pass PSSCR value and mask to power9_idle_stop Gautham R. Shenoy
  2017-01-10  9:07 ` [PATCH v5 5/5] Documentation:powerpc: Add device-tree bindings for power-mgt Gautham R. Shenoy
  4 siblings, 1 reply; 14+ messages in thread
From: Gautham R. Shenoy @ 2017-01-10  9:07 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rafael J. Wysocki, Daniel Lezcano, Michael Neuling,
	Vaidyanathan Srinivasan, Shreyas B. Prabhu, Shilpasri G Bhat,
	Stewart Smith, Balbir Singh, Oliver O'Halloran
  Cc: linuxppc-dev, linux-kernel, linux-pm, devicetree, Rob Herring,
	mark.rutland, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

In the current code for powernv_add_idle_states, there is a lot of code
duplication while initializing an idle state in powernv_states table.

Add an inline helper function to populate the powernv_states[] table
for a given idle state. Invoke this for populating the "Nap",
"Fastsleep" and the stop states in powernv_add_idle_states.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
[v4] --> [v5] : Added a comment above POWERNV_THRESHOLD_LATENCY_NS

 drivers/cpuidle/cpuidle-powernv.c | 89 +++++++++++++++++++++++----------------
 include/linux/cpuidle.h           |  1 +
 2 files changed, 54 insertions(+), 36 deletions(-)

diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index a7f6528..a21f1f0 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -20,6 +20,10 @@
 #include <asm/opal.h>
 #include <asm/runlatch.h>
 
+/*
+ * Expose only those Hardware idle states via the cpuidle framework
+ * that have latency value below POWERNV_THRESHOLD_LATENCY_NS.
+ */
 #define POWERNV_THRESHOLD_LATENCY_NS 200000
 
 struct cpuidle_driver powernv_idle_driver = {
@@ -167,6 +171,24 @@ static int powernv_cpuidle_driver_init(void)
 	return 0;
 }
 
+static inline void add_powernv_state(int index, const char *name,
+				     unsigned int flags,
+				     int (*idle_fn)(struct cpuidle_device *,
+						    struct cpuidle_driver *,
+						    int),
+				     unsigned int target_residency,
+				     unsigned int exit_latency,
+				     u64 psscr_val)
+{
+	strlcpy(powernv_states[index].name, name, CPUIDLE_NAME_LEN);
+	strlcpy(powernv_states[index].desc, name, CPUIDLE_NAME_LEN);
+	powernv_states[index].flags = flags;
+	powernv_states[index].target_residency = target_residency;
+	powernv_states[index].exit_latency = exit_latency;
+	powernv_states[index].enter = idle_fn;
+	stop_psscr_table[index] = psscr_val;
+}
+
 static int powernv_add_idle_states(void)
 {
 	struct device_node *power_mgt;
@@ -236,6 +258,7 @@ static int powernv_add_idle_states(void)
 		"ibm,cpu-idle-state-residency-ns", residency_ns, dt_idle_states);
 
 	for (i = 0; i < dt_idle_states; i++) {
+		unsigned int exit_latency, target_residency;
 		/*
 		 * If an idle state has exit latency beyond
 		 * POWERNV_THRESHOLD_LATENCY_NS then don't use it
@@ -243,28 +266,33 @@ static int powernv_add_idle_states(void)
 		 */
 		if (latency_ns[i] > POWERNV_THRESHOLD_LATENCY_NS)
 			continue;
+		/*
+		 * Firmware passes residency and latency values in ns.
+		 * cpuidle expects it in us.
+		 */
+		exit_latency = latency_ns[i] / 1000;
+		if (!rc)
+			target_residency = residency_ns[i] / 1000;
+		else
+			target_residency = 0;
 
 		/*
-		 * Cpuidle accepts exit_latency and target_residency in us.
-		 * Use default target_residency values if f/w does not expose it.
+		 * For nap and fastsleep, use default target_residency
+		 * values if f/w does not expose it.
 		 */
 		if (flags[i] & OPAL_PM_NAP_ENABLED) {
+			if (!rc)
+				target_residency = 100;
 			/* Add NAP state */
-			strcpy(powernv_states[nr_idle_states].name, "Nap");
-			strcpy(powernv_states[nr_idle_states].desc, "Nap");
-			powernv_states[nr_idle_states].flags = 0;
-			powernv_states[nr_idle_states].target_residency = 100;
-			powernv_states[nr_idle_states].enter = nap_loop;
+			add_powernv_state(nr_idle_states, "Nap",
+					  CPUIDLE_FLAG_NONE, nap_loop,
+					  target_residency, exit_latency, 0);
 		} else if ((flags[i] & OPAL_PM_STOP_INST_FAST) &&
 				!(flags[i] & OPAL_PM_TIMEBASE_STOP)) {
-			strncpy(powernv_states[nr_idle_states].name,
-				names[i], CPUIDLE_NAME_LEN);
-			strncpy(powernv_states[nr_idle_states].desc,
-				names[i], CPUIDLE_NAME_LEN);
-			powernv_states[nr_idle_states].flags = 0;
-
-			powernv_states[nr_idle_states].enter = stop_loop;
-			stop_psscr_table[nr_idle_states] = psscr_val[i];
+			add_powernv_state(nr_idle_states, names[i],
+					  CPUIDLE_FLAG_NONE, stop_loop,
+					  target_residency, exit_latency,
+					  psscr_val[i]);
 		}
 
 		/*
@@ -274,32 +302,21 @@ static int powernv_add_idle_states(void)
 #ifdef CONFIG_TICK_ONESHOT
 		if (flags[i] & OPAL_PM_SLEEP_ENABLED ||
 			flags[i] & OPAL_PM_SLEEP_ENABLED_ER1) {
+			if (!rc)
+				target_residency = 300000;
 			/* Add FASTSLEEP state */
-			strcpy(powernv_states[nr_idle_states].name, "FastSleep");
-			strcpy(powernv_states[nr_idle_states].desc, "FastSleep");
-			powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIMER_STOP;
-			powernv_states[nr_idle_states].target_residency = 300000;
-			powernv_states[nr_idle_states].enter = fastsleep_loop;
+			add_powernv_state(nr_idle_states, "FastSleep",
+					  CPUIDLE_FLAG_TIMER_STOP,
+					  fastsleep_loop,
+					  target_residency, exit_latency, 0);
 		} else if ((flags[i] & OPAL_PM_STOP_INST_DEEP) &&
 				(flags[i] & OPAL_PM_TIMEBASE_STOP)) {
-			strncpy(powernv_states[nr_idle_states].name,
-				names[i], CPUIDLE_NAME_LEN);
-			strncpy(powernv_states[nr_idle_states].desc,
-				names[i], CPUIDLE_NAME_LEN);
-
-			powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIMER_STOP;
-			powernv_states[nr_idle_states].enter = stop_loop;
-			stop_psscr_table[nr_idle_states] = psscr_val[i];
+			add_powernv_state(nr_idle_states, names[i],
+					  CPUIDLE_FLAG_TIMER_STOP, stop_loop,
+					  target_residency, exit_latency,
+					  psscr_val[i]);
 		}
 #endif
-		powernv_states[nr_idle_states].exit_latency =
-				((unsigned int)latency_ns[i]) / 1000;
-
-		if (!rc) {
-			powernv_states[nr_idle_states].target_residency =
-				((unsigned int)residency_ns[i]) / 1000;
-		}
-
 		nr_idle_states++;
 	}
 out:
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index bb31373..c4e10f8 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -62,6 +62,7 @@ struct cpuidle_state {
 };
 
 /* Idle State Flags */
+#define CPUIDLE_FLAG_NONE       (0x00)
 #define CPUIDLE_FLAG_COUPLED	(0x02) /* state applies to multiple cpus */
 #define CPUIDLE_FLAG_TIMER_STOP (0x04)  /* timer is stopped on this state */
 
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 4/5] powernv: Pass PSSCR value and mask to power9_idle_stop
  2017-01-10  9:06 [PATCH v5 0/5] powernv:stop: Use psscr_val,mask provided by firmware Gautham R. Shenoy
                   ` (2 preceding siblings ...)
  2017-01-10  9:07 ` [PATCH v5 3/5] cpuidle:powernv: Add helper function to populate powernv idle states Gautham R. Shenoy
@ 2017-01-10  9:07 ` Gautham R. Shenoy
  2017-01-12 17:18   ` Balbir Singh
  2017-01-10  9:07 ` [PATCH v5 5/5] Documentation:powerpc: Add device-tree bindings for power-mgt Gautham R. Shenoy
  4 siblings, 1 reply; 14+ messages in thread
From: Gautham R. Shenoy @ 2017-01-10  9:07 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rafael J. Wysocki, Daniel Lezcano, Michael Neuling,
	Vaidyanathan Srinivasan, Shreyas B. Prabhu, Shilpasri G Bhat,
	Stewart Smith, Balbir Singh, Oliver O'Halloran
  Cc: linuxppc-dev, linux-kernel, linux-pm, devicetree, Rob Herring,
	mark.rutland, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

The arch300_idle_stop method currently takes only the requested stop
level as a parameter and picks up the rest of the PSSCR bits from a
hand-coded macro. This is not a very flexible design, especially when
the firmware has the capability to communicate the psscr value and the
mask associated with a particular stop state via device tree.

This patch modifies the power9_idle_stop API to take as parameters the
PSSCR value and the PSSCR mask corresponding to the stop state that
needs to be set. These PSSCR value and mask are respectively obtained
by parsing the "ibm,cpu-idle-state-psscr" and
"ibm,cpu-idle-state-psscr-mask" fields from the device tree.

In addition to this, the patch adds support for handling stop states
for which ESL and EC bits in the PSSCR are zero. As per the
architecture, a wakeup from these stop states resumes execution from
the subsequent instruction as opposed to waking up at the System
Vector.

The older firmware sets only the Requested Level (RL) field in the
psscr and psscr-mask exposed in the device tree. For older firmware
where psscr-mask=0xf, this patch will set the default sane values that
the set for for remaining PSSCR fields (i.e PSLL, MTL, ESL, EC, and
TR). For the new firmware, the patch will validate that the invariants
required by the ISA for the psscr values are maintained by the
firmware.

This skiboot patch that exports fully populated PSSCR values and the
mask for all the stop states can be found here:
https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html

[Optimize the number of instructions before entering STOP with
ESL=EC=0, validate the PSSCR values provided by the firimware
maintains the invariants required as per the ISA suggested by Balbir
Singh]

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
[v4] -> [v5]: a) Validate the psscr values provided by firmware.
              b) Optimize the number of instructions before entering
	         stop with ESL=EC=0.

arch/powerpc/include/asm/cpuidle.h       |  44 ++++++++++
 arch/powerpc/include/asm/processor.h     |   3 +-
 arch/powerpc/kernel/idle_book3s.S        |  30 ++++---
 arch/powerpc/platforms/powernv/idle.c    | 138 ++++++++++++++++++++++++++++---
 arch/powerpc/platforms/powernv/powernv.h |   3 +-
 arch/powerpc/platforms/powernv/smp.c     |  14 ++--
 drivers/cpuidle/cpuidle-powernv.c        |  52 +++++++++---
 7 files changed, 241 insertions(+), 43 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index 0a3255b..ced47ee 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -10,11 +10,55 @@
 #define PNV_CORE_IDLE_LOCK_BIT          0x100
 #define PNV_CORE_IDLE_THREAD_BITS       0x0FF
 
+/*
+ * ============================ NOTE =================================
+ * The older firmware populates only the RL field in the psscr_val and
+ * sets the psscr_mask to 0xf. On such a firmware, the kernel sets the
+ * remaining PSSCR fields to default values as follows:
+ *
+ * - ESL and EC bits are to 1. So wakeup from any stop state will be
+ *   at vector 0x100.
+ *
+ * - MTL and PSLL are set to the maximum allowed value as per the ISA,
+ *    i.e. 15.
+ *
+ * - The Transition Rate, TR is set to the Maximum value 3.
+ */
+#define PSSCR_HV_DEFAULT_VAL    (PSSCR_ESL | PSSCR_EC |		    \
+				PSSCR_PSLL_MASK | PSSCR_TR_MASK |   \
+				PSSCR_MTL_MASK)
+
+#define PSSCR_HV_DEFAULT_MASK   (PSSCR_ESL | PSSCR_EC |		    \
+				PSSCR_PSLL_MASK | PSSCR_TR_MASK |   \
+				PSSCR_MTL_MASK | PSSCR_RL_MASK)
+#define PSSCR_EC_SHIFT    20
+#define PSSCR_ESL_SHIFT   21
+#define GET_PSSCR_EC(x)   (((x) & PSSCR_EC) >> PSSCR_EC_SHIFT)
+#define GET_PSSCR_ESL(x)  (((x) & PSSCR_ESL) >> PSSCR_ESL_SHIFT)
+#define GET_PSSCR_RL(x)   ((x) & PSSCR_RL_MASK)
+
+#define ERR_EC_ESL_MISMATCH		-1
+#define ERR_DEEP_STATE_ESL_MISMATCH	-2
+
 #ifndef __ASSEMBLY__
 extern u32 pnv_fastsleep_workaround_at_entry[];
 extern u32 pnv_fastsleep_workaround_at_exit[];
 
 extern u64 pnv_first_deep_stop_state;
+
+int validate_psscr_val_mask(u64 *psscr_val, u64 *psscr_mask, u32 flags);
+static inline void report_invalid_psscr_val(u64 psscr_val, int err)
+{
+	switch (err) {
+	case ERR_EC_ESL_MISMATCH:
+		pr_warn("Invalid psscr %llx : ESL,EC bits unequal",
+			psscr_val);
+		break;
+	case ERR_DEEP_STATE_ESL_MISMATCH:
+		pr_warn("Invalid psscr %llx : ESL cleared for deep stop-state",
+			psscr_val);
+	}
+}
 #endif
 
 #endif
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 4b47308..da68890 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -458,7 +458,8 @@ static inline unsigned long get_clean_sp(unsigned long sp, int is_32)
 extern unsigned long power7_nap(int check_irq);
 extern unsigned long power7_sleep(void);
 extern unsigned long power7_winkle(void);
-extern unsigned long arch300_idle_stop(unsigned long stop_level);
+extern unsigned long arch300_idle_stop(unsigned long stop_psscr_val,
+				      unsigned long stop_psscr_mask);
 
 extern void flush_instruction_cache(void);
 extern void hard_reset_now(void);
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 7f6657f..61965ab 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -40,9 +40,7 @@
 #define _WORC	GPR11
 #define _PTCR	GPR12
 
-#define PSSCR_HV_TEMPLATE	PSSCR_ESL | PSSCR_EC | \
-				PSSCR_PSLL_MASK | PSSCR_TR_MASK | \
-				PSSCR_MTL_MASK
+#define PSSCR_EC_ESL_MASK_SHIFTED          (PSSCR_EC | PSSCR_ESL) >> 16
 
 	.text
 
@@ -265,7 +263,7 @@ enter_winkle:
 	IDLE_STATE_ENTER_SEQ_NORET(PPC_WINKLE)
 
 /*
- * r3 - requested stop state
+ * r3 - PSSCR value corresponding to the requested stop state.
  */
 power_enter_stop:
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
@@ -275,9 +273,18 @@ power_enter_stop:
 	stb	r4,HSTATE_HWTHREAD_STATE(r13)
 #endif
 /*
+ * Check if we are executing the lite variant with ESL=EC=0
+ */
+	andis.   r4, r3, PSSCR_EC_ESL_MASK_SHIFTED
+	clrldi   r3, r3, 60 /* r3 = Bits 60:63 = Requested Level (RL) */
+	bne	 1f
+	IDLE_STATE_ENTER_SEQ(PPC_STOP)
+	li	r3,0  /* Since we didn't lose state, return 0 */
+	b 	pnv_wakeup_noloss
+/*
  * Check if the requested state is a deep idle state.
  */
-	LOAD_REG_ADDRBASE(r5,pnv_first_deep_stop_state)
+1:	LOAD_REG_ADDRBASE(r5,pnv_first_deep_stop_state)
 	ld	r4,ADDROFF(pnv_first_deep_stop_state)(r5)
 	cmpd	r3,r4
 	bge	2f
@@ -354,16 +361,17 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66);		\
 	ld	r3,ORIG_GPR3(r1);	/* Restore original r3 */	\
 20:	nop;
 
-
 /*
- * r3 - requested stop state
+ * r3 - The PSSCR value corresponding to the stop state.
+ * r4 - The PSSCR mask corrresonding to the stop state.
  */
 _GLOBAL(arch300_idle_stop)
-	LOAD_REG_IMMEDIATE(r4, PSSCR_HV_TEMPLATE)
-	or	r4,r4,r3
-	mtspr	SPRN_PSSCR, r4
-	li	r4, 1
+	mfspr   r5, SPRN_PSSCR
+	andc    r5, r5, r4
+	or      r3, r3, r5
+	mtspr 	SPRN_PSSCR, r3
 	LOAD_REG_ADDR(r5,power_enter_stop)
+	li	r4, 1
 	b	pnv_powersave_common
 	/* No return */
 /*
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index c3a2fac..7a6ff4a 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -237,15 +237,21 @@ static DEVICE_ATTR(fastsleep_workaround_applyonce, 0600,
 			show_fastsleep_workaround_applyonce,
 			store_fastsleep_workaround_applyonce);
 
+/*
+ * The default stop state that will be used by ppc_md.power_save
+ * function on platforms that support stop instruction.
+ */
+u64 pnv_default_stop_val;
+u64 pnv_default_stop_mask;
 
 /*
  * Used for ppc_md.power_save which needs a function with no parameters
  */
 static void arch300_idle(void)
 {
-	/* Requesting stop state 0 */
-	arch300_idle_stop(0);
+	arch300_idle_stop(pnv_default_stop_val, pnv_default_stop_mask);
 }
+
 /*
  * First deep stop state. Used to figure out when to save/restore
  * hypervisor context.
@@ -253,9 +259,11 @@ static void arch300_idle(void)
 u64 pnv_first_deep_stop_state = MAX_STOP_STATE;
 
 /*
- * Deepest stop idle state. Used when a cpu is offlined
+ * psscr value and mask of the deepest stop idle state.
+ * Used when a cpu is offlined.
  */
-u64 pnv_deepest_stop_state;
+u64 pnv_deepest_stop_psscr_val;
+u64 pnv_deepest_stop_psscr_mask;
 
 /*
  * Power ISA 3.0 idle initialization.
@@ -292,6 +300,44 @@ static void arch300_idle(void)
  *	Bits 60:63 - Requested Level
  *	Used to specify which power-saving level must be entered on executing
  *	stop instruction
+ */
+
+int validate_psscr_val_mask(u64 *psscr_val, u64 *psscr_mask, u32 flags)
+{
+	int err = 0;
+
+	/*
+	 * psscr_mask == 0xf indicates an older firmware.
+	 * Set remaining fields of psscr to the default values.
+	 * See NOTE above definition of PSSCR_HV_DEFAULT_VAL
+	 */
+	if (*psscr_mask == 0xf) {
+		*psscr_val = *psscr_val | PSSCR_HV_DEFAULT_VAL;
+		*psscr_mask = PSSCR_HV_DEFAULT_MASK;
+		return err;
+	}
+
+	/*
+	 * New firmware is expected to set the psscr_val bits correctly.
+	 * Validate that the following invariants are correctly maintained by
+	 * the new firmware.
+	 * - ESL bit value matches the EC bit value.
+	 * - ESL bit is set for all the deep stop states.
+	 */
+	if (GET_PSSCR_ESL(*psscr_val) != GET_PSSCR_EC(*psscr_val)) {
+		err = ERR_EC_ESL_MISMATCH;
+	} else if ((flags & OPAL_PM_LOSE_FULL_CONTEXT) &&
+		GET_PSSCR_ESL(*psscr_val) == 0) {
+		err = ERR_DEEP_STATE_ESL_MISMATCH;
+	}
+
+	return err;
+}
+
+/*
+ * pnv_arch300_idle_init: Initializes the default idle state, first
+ *                        deep idle state and deepest idle state on
+ *                        ISA 3.0 CPUs.
  *
  * @np: /ibm,opal/power-mgt device node
  * @flags: cpu-idle-state-flags array
@@ -302,43 +348,109 @@ static int __init pnv_arch300_idle_init(struct device_node *np, u32 *flags,
 					int dt_idle_states)
 {
 	u64 *psscr_val = NULL;
+	u64 *psscr_mask = NULL;
+	u32 *residency_ns = NULL;
+	u64 max_residency_ns = 0;
 	int rc = 0, i;
+	bool default_stop_found = false, deepest_stop_found = false;
 
-	psscr_val = kcalloc(dt_idle_states, sizeof(*psscr_val),
-				GFP_KERNEL);
-	if (!psscr_val) {
+	psscr_val = kcalloc(dt_idle_states, sizeof(*psscr_val), GFP_KERNEL);
+	psscr_mask = kcalloc(dt_idle_states, sizeof(*psscr_mask), GFP_KERNEL);
+	residency_ns = kcalloc(dt_idle_states, sizeof(*residency_ns),
+			       GFP_KERNEL);
+
+	if (!psscr_val || !psscr_mask || !residency_ns) {
 		rc = -1;
 		goto out;
 	}
+
 	if (of_property_read_u64_array(np,
 		"ibm,cpu-idle-state-psscr",
 		psscr_val, dt_idle_states)) {
-		pr_warn("cpuidle-powernv: missing ibm,cpu-idle-states-psscr in DT\n");
+		pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-psscr in DT\n");
+		rc = -1;
+		goto out;
+	}
+
+	if (of_property_read_u64_array(np,
+				       "ibm,cpu-idle-state-psscr-mask",
+				       psscr_mask, dt_idle_states)) {
+		pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-psscr-mask in DT\n");
+		rc = -1;
+		goto out;
+	}
+
+	if (of_property_read_u32_array(np,
+				       "ibm,cpu-idle-state-residency-ns",
+					residency_ns, dt_idle_states)) {
+		pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-residency-ns in DT\n");
 		rc = -1;
 		goto out;
 	}
 
 	/*
-	 * Set pnv_first_deep_stop_state and pnv_deepest_stop_state.
+	 * Set pnv_first_deep_stop_state, pnv_deepest_stop_psscr_{val,mask},
+	 * and the pnv_default_stop_{val,mask}.
+	 *
 	 * pnv_first_deep_stop_state should be set to the first stop
 	 * level to cause hypervisor state loss.
-	 * pnv_deepest_stop_state should be set to the deepest stop
-	 * stop state.
+	 *
+	 * pnv_deepest_stop_{val,mask} should be set to values corresponding to
+	 * the deepest stop state.
+	 *
+	 * pnv_default_stop_{val,mask} should be set to values corresponding to
+	 * the shallowest (OPAL_PM_STOP_INST_FAST) loss-less stop state.
 	 */
 	pnv_first_deep_stop_state = MAX_STOP_STATE;
 	for (i = 0; i < dt_idle_states; i++) {
+		int err;
 		u64 psscr_rl = psscr_val[i] & PSSCR_RL_MASK;
 
 		if ((flags[i] & OPAL_PM_LOSE_FULL_CONTEXT) &&
 		     (pnv_first_deep_stop_state > psscr_rl))
 			pnv_first_deep_stop_state = psscr_rl;
 
-		if (pnv_deepest_stop_state < psscr_rl)
-			pnv_deepest_stop_state = psscr_rl;
+		err = validate_psscr_val_mask(&psscr_val[i], &psscr_mask[i],
+					      flags[i]);
+		if (err) {
+			report_invalid_psscr_val(psscr_val[i], err);
+			continue;
+		}
+
+		if (max_residency_ns < residency_ns[i]) {
+			max_residency_ns = residency_ns[i];
+			pnv_deepest_stop_psscr_val = psscr_val[i];
+			pnv_deepest_stop_psscr_mask = psscr_mask[i];
+			deepest_stop_found = true;
+		}
+
+		if (!default_stop_found &&
+		    (flags[i] & OPAL_PM_STOP_INST_FAST)) {
+			pnv_default_stop_val = psscr_val[i];
+			pnv_default_stop_mask = psscr_mask[i];
+			default_stop_found = true;
+		}
+	}
+
+	if (!default_stop_found) {
+		pnv_default_stop_val = PSSCR_HV_DEFAULT_VAL;
+		pnv_default_stop_mask = PSSCR_HV_DEFAULT_MASK;
+		pr_warn("Setting default stop psscr val=%llx,mask=%llx\n",
+			pnv_default_stop_val, pnv_default_stop_mask);
+	}
+
+	if (!deepest_stop_found) {
+		pnv_deepest_stop_psscr_val = PSSCR_HV_DEFAULT_VAL;
+		pnv_deepest_stop_psscr_mask = PSSCR_HV_DEFAULT_MASK;
+		pr_warn("Setting default stop psscr val=%llx,mask=%llx\n",
+			pnv_deepest_stop_psscr_val,
+			pnv_deepest_stop_psscr_mask);
 	}
 
 out:
 	kfree(psscr_val);
+	kfree(psscr_mask);
+	kfree(residency_ns);
 	return rc;
 }
 
diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h
index da7c843..6130522 100644
--- a/arch/powerpc/platforms/powernv/powernv.h
+++ b/arch/powerpc/platforms/powernv/powernv.h
@@ -18,7 +18,8 @@ static inline void pnv_pci_shutdown(void) { }
 #endif
 
 extern u32 pnv_get_supported_cpuidle_states(void);
-extern u64 pnv_deepest_stop_state;
+extern u64 pnv_deepest_stop_psscr_val;
+extern u64 pnv_deepest_stop_psscr_mask;
 
 extern void pnv_lpc_init(void);
 
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index c931bb2..408695f 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -182,15 +182,17 @@ static void pnv_smp_cpu_kill_self(void)
 
 		ppc64_runlatch_off();
 
-		if (cpu_has_feature(CPU_FTR_ARCH_300))
-			srr1 = arch300_idle_stop(pnv_deepest_stop_state);
-		else if (idle_states & OPAL_PM_WINKLE_ENABLED)
+		if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+			srr1 = arch300_idle_stop(pnv_deepest_stop_psscr_val,
+						pnv_deepest_stop_psscr_mask);
+		} else if (idle_states & OPAL_PM_WINKLE_ENABLED) {
 			srr1 = power7_winkle();
-		else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
-				(idle_states & OPAL_PM_SLEEP_ENABLED_ER1))
+		} else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
+			   (idle_states & OPAL_PM_SLEEP_ENABLED_ER1)) {
 			srr1 = power7_sleep();
-		else
+		} else {
 			srr1 = power7_nap(1);
+		}
 
 		ppc64_runlatch_on();
 
diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index a21f1f0..bbadd91 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -19,6 +19,7 @@
 #include <asm/firmware.h>
 #include <asm/opal.h>
 #include <asm/runlatch.h>
+#include <asm/cpuidle.h>
 
 /*
  * Expose only those Hardware idle states via the cpuidle framework
@@ -34,7 +35,12 @@ struct cpuidle_driver powernv_idle_driver = {
 static int max_idle_state;
 static struct cpuidle_state *cpuidle_state_table;
 
-static u64 stop_psscr_table[CPUIDLE_STATE_MAX];
+struct stop_psscr_table {
+	u64 val;
+	u64 mask;
+};
+
+static struct stop_psscr_table stop_psscr_table[CPUIDLE_STATE_MAX];
 
 static u64 snooze_timeout;
 static bool snooze_timeout_en;
@@ -106,7 +112,8 @@ static int stop_loop(struct cpuidle_device *dev,
 		     int index)
 {
 	ppc64_runlatch_off();
-	arch300_idle_stop(stop_psscr_table[index]);
+	arch300_idle_stop(stop_psscr_table[index].val,
+			 stop_psscr_table[index].mask);
 	ppc64_runlatch_on();
 	return index;
 }
@@ -178,7 +185,7 @@ static inline void add_powernv_state(int index, const char *name,
 						    int),
 				     unsigned int target_residency,
 				     unsigned int exit_latency,
-				     u64 psscr_val)
+				     u64 psscr_val, u64 psscr_mask)
 {
 	strlcpy(powernv_states[index].name, name, CPUIDLE_NAME_LEN);
 	strlcpy(powernv_states[index].desc, name, CPUIDLE_NAME_LEN);
@@ -186,7 +193,8 @@ static inline void add_powernv_state(int index, const char *name,
 	powernv_states[index].target_residency = target_residency;
 	powernv_states[index].exit_latency = exit_latency;
 	powernv_states[index].enter = idle_fn;
-	stop_psscr_table[index] = psscr_val;
+	stop_psscr_table[index].val = psscr_val;
+	stop_psscr_table[index].mask = psscr_mask;
 }
 
 static int powernv_add_idle_states(void)
@@ -198,7 +206,9 @@ static int powernv_add_idle_states(void)
 	u32 residency_ns[CPUIDLE_STATE_MAX];
 	u32 flags[CPUIDLE_STATE_MAX];
 	u64 psscr_val[CPUIDLE_STATE_MAX];
+	u64 psscr_mask[CPUIDLE_STATE_MAX];
 	const char *names[CPUIDLE_STATE_MAX];
+	bool has_stop_states = false;
 	int i, rc;
 
 	/* Currently we have snooze statically defined */
@@ -245,15 +255,25 @@ static int powernv_add_idle_states(void)
 
 	/*
 	 * If the idle states use stop instruction, probe for psscr values
-	 * which are necessary to specify required stop level.
+	 * and psscr mask which are necessary to specify required stop level.
 	 */
-	if (flags[0] & (OPAL_PM_STOP_INST_FAST | OPAL_PM_STOP_INST_DEEP))
+	if (flags[0] & (OPAL_PM_STOP_INST_FAST | OPAL_PM_STOP_INST_DEEP)) {
 		if (of_property_read_u64_array(power_mgt,
 		    "ibm,cpu-idle-state-psscr", psscr_val, dt_idle_states)) {
-			pr_warn("cpuidle-powernv: missing ibm,cpu-idle-states-psscr in DT\n");
+			pr_warn("cpuidle-powernv: missing ibm,cpu-idle-state-psscr in DT\n");
+			goto out;
+		}
+
+		if (of_property_read_u64_array(power_mgt,
+					       "ibm,cpu-idle-state-psscr-mask",
+						psscr_mask, dt_idle_states)) {
+			pr_warn("cpuidle-powernv:Missing ibm,cpu-idle-state-psscr-mask in DT\n");
 			goto out;
 		}
 
+		has_stop_states = true;
+	}
+
 	rc = of_property_read_u32_array(power_mgt,
 		"ibm,cpu-idle-state-residency-ns", residency_ns, dt_idle_states);
 
@@ -276,6 +296,16 @@ static int powernv_add_idle_states(void)
 		else
 			target_residency = 0;
 
+		if (has_stop_states) {
+			int err = validate_psscr_val_mask(&psscr_val[i],
+							  &psscr_mask[i],
+							  flags[i]);
+			if (err) {
+				report_invalid_psscr_val(psscr_val[i], err);
+				continue;
+			}
+		}
+
 		/*
 		 * For nap and fastsleep, use default target_residency
 		 * values if f/w does not expose it.
@@ -286,13 +316,13 @@ static int powernv_add_idle_states(void)
 			/* Add NAP state */
 			add_powernv_state(nr_idle_states, "Nap",
 					  CPUIDLE_FLAG_NONE, nap_loop,
-					  target_residency, exit_latency, 0);
+					  target_residency, exit_latency, 0, 0);
 		} else if ((flags[i] & OPAL_PM_STOP_INST_FAST) &&
 				!(flags[i] & OPAL_PM_TIMEBASE_STOP)) {
 			add_powernv_state(nr_idle_states, names[i],
 					  CPUIDLE_FLAG_NONE, stop_loop,
 					  target_residency, exit_latency,
-					  psscr_val[i]);
+					  psscr_val[i], psscr_mask[i]);
 		}
 
 		/*
@@ -308,13 +338,13 @@ static int powernv_add_idle_states(void)
 			add_powernv_state(nr_idle_states, "FastSleep",
 					  CPUIDLE_FLAG_TIMER_STOP,
 					  fastsleep_loop,
-					  target_residency, exit_latency, 0);
+					  target_residency, exit_latency, 0, 0);
 		} else if ((flags[i] & OPAL_PM_STOP_INST_DEEP) &&
 				(flags[i] & OPAL_PM_TIMEBASE_STOP)) {
 			add_powernv_state(nr_idle_states, names[i],
 					  CPUIDLE_FLAG_TIMER_STOP, stop_loop,
 					  target_residency, exit_latency,
-					  psscr_val[i]);
+					  psscr_val[i], psscr_mask[i]);
 		}
 #endif
 		nr_idle_states++;
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 5/5] Documentation:powerpc: Add device-tree bindings for power-mgt
  2017-01-10  9:06 [PATCH v5 0/5] powernv:stop: Use psscr_val,mask provided by firmware Gautham R. Shenoy
                   ` (3 preceding siblings ...)
  2017-01-10  9:07 ` [PATCH v5 4/5] powernv: Pass PSSCR value and mask to power9_idle_stop Gautham R. Shenoy
@ 2017-01-10  9:07 ` Gautham R. Shenoy
  2017-01-13 16:57   ` Rob Herring
  4 siblings, 1 reply; 14+ messages in thread
From: Gautham R. Shenoy @ 2017-01-10  9:07 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rafael J. Wysocki, Daniel Lezcano, Michael Neuling,
	Vaidyanathan Srinivasan, Shreyas B. Prabhu, Shilpasri G Bhat,
	Stewart Smith, Balbir Singh, Oliver O'Halloran
  Cc: linuxppc-dev, linux-kernel, linux-pm, devicetree, Rob Herring,
	mark.rutland, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Document the device-tree bindings defining the the properties under
the @power-mgt node in the device tree that describe the idle states
for Linux running on baremetal POWER servers.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
[v4]-> [v5]: Fixed a couple of typos.

 .../devicetree/bindings/powerpc/opal/power-mgt.txt | 125 +++++++++++++++++++++
 1 file changed, 125 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt

diff --git a/Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt b/Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt
new file mode 100644
index 0000000..4967831
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt
@@ -0,0 +1,125 @@
+IBM Power-Management Bindings
+=============================
+
+Linux running on baremetal POWER machines has access to the processor
+idle states. The description of these idle states is exposed via the
+node @power-mgt in the device-tree by the firmware.
+
+Definitions:
+----------------
+Typically each idle state has the following associated properties:
+
+- name: The name of the idle state as defined by the firmware.
+
+- flags: indicating some aspects of this idle states such as the
+         extent of state-loss, whether timebase is stopped on this
+         idle states and so on. The flag bits are as follows:
+
+- exit-latency: The latency involved in transitioning the state of the
+		CPU from idle to running.
+
+- target-residency: The minimum time that the CPU needs to reside in
+		    this idle state in order to accrue power-savings
+		    benefit.
+
+Properties
+----------------
+The following properties provide details about the idle states. These
+properties are optional unless mentioned otherwise below.
+
+- ibm,cpu-idle-state-names:
+	Array of strings containing the names of the idle states.
+
+- ibm,cpu-idle-state-flags:
+	Array of unsigned 32-bit values containing the values of the
+	flags associated with the the aforementioned idle-states. This
+	property is required on POWER9 whenever
+	ibm,cpu-idle-state-names is defined and the length of this
+	property array should be the same as
+	ibm,-cpu-idle-state-names.The flag bits are as follows:
+		0x00000001 /* Decrementer would stop */
+		0x00000002 /* Needs timebase restore */
+		0x00001000 /* Restore GPRs like nap */
+		0x00002000 /* Restore hypervisor resource from PACA pointer */
+		0x00004000 /* Program PORE to restore PACA pointer */
+		0x00010000 /* This is a nap state */
+		0x00020000 /* This is a fast-sleep state */
+		0x00040000 /* This is a winkle state */
+		0x00080000 /* This is a fast-sleep state which requires a */
+			   /* software workaround for restoring the timebase*/
+		0x00800000 /* This state uses SPR PMICR instruction */
+		0x00100000 /* This is a fast stop state */
+		0x00200000 /* This is a deep-stop state */
+
+- ibm,cpu-idle-state-latencies-ns:
+	Array of unsigned 32-bit values containing the values of the
+	exit-latencies (in ns) for the idle states in
+	ibm,cpu-idle-state-names. This property is required whenever
+	ibm,cpu-idle-state-names is defined and the length of this
+	property array should be the same as
+	ibm,-cpu-idle-state-names.
+
+- ibm,cpu-idle-state-residency-ns:
+	Array of unsigned 32-bit values containing the values of the
+	target-residency (in ns) for the idle states in
+	ibm,cpu-idle-state-names. On POWER8 this is an optional
+	property. If the property is absent, the target residency for
+	the "Nap", "FastSleep" are defined to 10000 and 300000000
+	respectively. On POWER9 this property must be defined if
+	ibm,cpu-idle-state-names is defined and the length should be
+	same as that of ibm,cpu-idle-state-names.
+
+- ibm,cpu-idle-state-psscr:
+	Array of unsigned 64-bit values containing the values for the
+	PSSCR for each of the idle states in ibm,cpu-idle-state-names.
+	This property is required on POWER9 whenever
+	ibm,cpu-idle-state-names is defined and the length of this
+	property array should be the same as
+	ibm,cpu-idle-state-names.
+
+- ibm,cpu-idle-state-psscr-mask:
+	Array of unsigned 64-bit values containing the masks
+	indicating which psscr fields are set in the corresponding
+	entries of ibm,cpu-idle-state-psscr.  This property is
+	required on POWER9 whenever ibm,cpu-idle-state-names is
+	defined and the length of this property array should be the
+	same as ibm,cpu-idle-state-names.
+
+	Whenever the firmware sets an entry in
+	ibm,cpu-idle-state-psscr-mask value to 0xf, it implies that
+	only the Requested Level (RL) field of the corresponding entry
+	in ibm,cpu-idle-state-psscr should be considered by the
+	kernel. For such idle states, the kernel would set the
+	remaining fields of the psscr to the following sane-default
+	values.
+
+		- ESL and EC bits are to 1. So wakeup from any stop
+		  state will be at vector 0x100.
+
+		- MTL and PSLL are set to the maximum allowed value as
+		  per the ISA, i.e. 15.
+
+		- The Transition Rate, TR is set to the Maximum value
+                  3.
+
+	For all the other values of the entry in
+	ibm,cpu-idle-state-psscr-mask, the Kernel expects all the
+	psscr fields of the corresponding entry in
+	ibm,cpu-idle-state-psscr to be correctly set by the firmware.
+
+- ibm,cpu-idle-state-pmicr:
+	Array of unsigned 64-bit values containing the pmicr values
+	for the idle states in ibm,cpu-idle-state-names. This 64-bit
+	register value is to be set in pmicr for the corresponding
+	state if the flag indicates that pmicr SPR should be set. This
+	is an optional property on POWER8 and is absent on
+	POWER9. When present on POWER8, the length of this property
+	array should be the same as ibm,cpu-idle-state-names.
+
+- ibm,cpu-idle-state-pmicr:-mask
+	Array of unsigned 64-bit values containing the mask indicating
+	which of the fields of the PMICR are set in the corresponding
+	entries in ibm,cpu-idle-state-pmicr. This is an optional
+	property on POWER8 and is absent on POWER9. When present on
+	POWER8, the length of this property array should be the same
+	as ibm,cpu-idle-state-names.
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 1/5] powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro
  2017-01-10  9:07 ` [PATCH v5 1/5] powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro Gautham R. Shenoy
@ 2017-01-12  9:42   ` Balbir Singh
  0 siblings, 0 replies; 14+ messages in thread
From: Balbir Singh @ 2017-01-12  9:42 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rafael J. Wysocki, Daniel Lezcano, Michael Neuling,
	Vaidyanathan Srinivasan, Shreyas B. Prabhu, Shilpasri G Bhat,
	Stewart Smith, Balbir Singh, Oliver O'Halloran, linuxppc-dev,
	linux-kernel, linux-pm, devicetree, Rob Herring, mark.rutland

On Tue, Jan 10, 2017 at 02:37:00PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> Currently all the low-power idle states are expected to wake up
> at reset vector 0x100. Which is why the macro IDLE_STATE_ENTER_SEQ
> that puts the CPU to an idle state and never returns.
> 
> On ISA v3.0, when the ESL and EC bits in the PSSCR are zero, the CPU
> is expected to wake up at the next instruction of the idle
> instruction.
> 
> This patch adds a new macro named IDLE_STATE_ENTER_SEQ_NORET for the
> no-return variant and reuses the name IDLE_STATE_ENTER_SEQ
> for a variant that allows resuming operation at the instruction next
> to the idle-instruction.
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
> No changes from v4
>

Acked-by: Balbir Singh <bsingharora@gmail.com> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 2/5] powernv:stop: Uniformly rename power9 to arch300
  2017-01-10  9:07 ` [PATCH v5 2/5] powernv:stop: Uniformly rename power9 to arch300 Gautham R. Shenoy
@ 2017-01-12  9:47   ` Balbir Singh
  2017-01-13  3:44     ` Gautham R Shenoy
  0 siblings, 1 reply; 14+ messages in thread
From: Balbir Singh @ 2017-01-12  9:47 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rafael J. Wysocki, Daniel Lezcano, Michael Neuling,
	Vaidyanathan Srinivasan, Shreyas B. Prabhu, Shilpasri G Bhat,
	Stewart Smith, Balbir Singh, Oliver O'Halloran, linuxppc-dev,
	linux-kernel, linux-pm, devicetree, Rob Herring, mark.rutland

On Tue, Jan 10, 2017 at 02:37:01PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> Balbir pointed out that in idle_book3s.S and powernv/idle.c some
> functions and variables had power9 in their names while some others
> had arch300.
>

I would prefer power9 to arch300

Balbir Singh. 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 3/5] cpuidle:powernv: Add helper function to populate powernv idle states.
  2017-01-10  9:07 ` [PATCH v5 3/5] cpuidle:powernv: Add helper function to populate powernv idle states Gautham R. Shenoy
@ 2017-01-12 10:04   ` Balbir Singh
  0 siblings, 0 replies; 14+ messages in thread
From: Balbir Singh @ 2017-01-12 10:04 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rafael J. Wysocki, Daniel Lezcano, Michael Neuling,
	Vaidyanathan Srinivasan, Shreyas B. Prabhu, Shilpasri G Bhat,
	Stewart Smith, Balbir Singh, Oliver O'Halloran, linuxppc-dev,
	linux-kernel, linux-pm, devicetree, Rob Herring, mark.rutland

On Tue, Jan 10, 2017 at 02:37:02PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> In the current code for powernv_add_idle_states, there is a lot of code
> duplication while initializing an idle state in powernv_states table.
> 
> Add an inline helper function to populate the powernv_states[] table
> for a given idle state. Invoke this for populating the "Nap",
> "Fastsleep" and the stop states in powernv_add_idle_states.
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> -- 

Acked-by: Balbir Singh <bsingharora@gmail.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 4/5] powernv: Pass PSSCR value and mask to power9_idle_stop
  2017-01-10  9:07 ` [PATCH v5 4/5] powernv: Pass PSSCR value and mask to power9_idle_stop Gautham R. Shenoy
@ 2017-01-12 17:18   ` Balbir Singh
  0 siblings, 0 replies; 14+ messages in thread
From: Balbir Singh @ 2017-01-12 17:18 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rafael J. Wysocki, Daniel Lezcano, Michael Neuling,
	Vaidyanathan Srinivasan, Shreyas B. Prabhu, Shilpasri G Bhat,
	Stewart Smith, Balbir Singh, Oliver O'Halloran, linuxppc-dev,
	linux-kernel, linux-pm, devicetree, Rob Herring, mark.rutland

On Tue, Jan 10, 2017 at 02:37:03PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> The arch300_idle_stop method currently takes only the requested stop
      power9_idle_stop (see subject :) and second paragraph)
> level as a parameter and picks up the rest of the PSSCR bits from a
> hand-coded macro. This is not a very flexible design, especially when
> the firmware has the capability to communicate the psscr value and the
> mask associated with a particular stop state via device tree.
> 
> This patch modifies the power9_idle_stop API to take as parameters the
> PSSCR value and the PSSCR mask corresponding to the stop state that
> needs to be set. These PSSCR value and mask are respectively obtained
> by parsing the "ibm,cpu-idle-state-psscr" and
> "ibm,cpu-idle-state-psscr-mask" fields from the device tree.
> 
> In addition to this, the patch adds support for handling stop states
> for which ESL and EC bits in the PSSCR are zero. As per the
> architecture, a wakeup from these stop states resumes execution from
> the subsequent instruction as opposed to waking up at the System
> Vector.
> 
> The older firmware sets only the Requested Level (RL) field in the
> psscr and psscr-mask exposed in the device tree. For older firmware
> where psscr-mask=0xf, this patch will set the default sane values that
> the set for for remaining PSSCR fields (i.e PSLL, MTL, ESL, EC, and
> TR). For the new firmware, the patch will validate that the invariants
> required by the ISA for the psscr values are maintained by the
> firmware.
> 
> This skiboot patch that exports fully populated PSSCR values and the
> mask for all the stop states can be found here:
> https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html
> 
> [Optimize the number of instructions before entering STOP with
> ESL=EC=0, validate the PSSCR values provided by the firimware
> maintains the invariants required as per the ISA suggested by Balbir
> Singh]
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---

Acked-by: Balbir Singh <bsingharora@gmail.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 2/5] powernv:stop: Uniformly rename power9 to arch300
  2017-01-12  9:47   ` Balbir Singh
@ 2017-01-13  3:44     ` Gautham R Shenoy
  2017-01-13  3:46       ` Oliver O'Halloran
  0 siblings, 1 reply; 14+ messages in thread
From: Gautham R Shenoy @ 2017-01-13  3:44 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Gautham R. Shenoy, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Rafael J. Wysocki, Daniel Lezcano,
	Michael Neuling, Vaidyanathan Srinivasan, Shreyas B. Prabhu,
	Shilpasri G Bhat, Stewart Smith, Oliver O'Halloran,
	linuxppc-dev, linux-kernel, linux-pm, devicetree, Rob Herring,
	mark.rutland

On Thu, Jan 12, 2017 at 03:17:33PM +0530, Balbir Singh wrote:
> On Tue, Jan 10, 2017 at 02:37:01PM +0530, Gautham R. Shenoy wrote:
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> > 
> > Balbir pointed out that in idle_book3s.S and powernv/idle.c some
> > functions and variables had power9 in their names while some others
> > had arch300.
> >
> 
> I would prefer power9 to arch300
>


I don't have a strong preference for arch300 vs power9, will change it
to power9 if that looks better.

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 2/5] powernv:stop: Uniformly rename power9 to arch300
  2017-01-13  3:44     ` Gautham R Shenoy
@ 2017-01-13  3:46       ` Oliver O'Halloran
  0 siblings, 0 replies; 14+ messages in thread
From: Oliver O'Halloran @ 2017-01-13  3:46 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Balbir Singh, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Rafael J. Wysocki, Daniel Lezcano,
	Michael Neuling, Vaidyanathan Srinivasan, Shreyas B. Prabhu,
	Shilpasri G Bhat, Stewart Smith, linuxppc-dev, linux-kernel,
	linux-pm, devicetree, Rob Herring, mark.rutland

On Fri, Jan 13, 2017 at 2:44 PM, Gautham R Shenoy
<ego@linux.vnet.ibm.com> wrote:
> On Thu, Jan 12, 2017 at 03:17:33PM +0530, Balbir Singh wrote:
>> On Tue, Jan 10, 2017 at 02:37:01PM +0530, Gautham R. Shenoy wrote:
>> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>> >
>> > Balbir pointed out that in idle_book3s.S and powernv/idle.c some
>> > functions and variables had power9 in their names while some others
>> > had arch300.
>> >
>>
>> I would prefer power9 to arch300
>>
>
>
> I don't have a strong preference for arch300 vs power9, will change it
> to power9 if that looks better.

Personally I think we should be as descriptive as possible and use
power_9_arch_300_the_bikeshed_is_red_dammit.

Oliver

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 5/5] Documentation:powerpc: Add device-tree bindings for power-mgt
  2017-01-10  9:07 ` [PATCH v5 5/5] Documentation:powerpc: Add device-tree bindings for power-mgt Gautham R. Shenoy
@ 2017-01-13 16:57   ` Rob Herring
  2017-01-25  8:09     ` Gautham R Shenoy
  0 siblings, 1 reply; 14+ messages in thread
From: Rob Herring @ 2017-01-13 16:57 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Rafael J. Wysocki, Daniel Lezcano, Michael Neuling,
	Vaidyanathan Srinivasan, Shreyas B. Prabhu, Shilpasri G Bhat,
	Stewart Smith, Balbir Singh, Oliver O'Halloran, linuxppc-dev,
	linux-kernel, linux-pm, devicetree, mark.rutland

On Tue, Jan 10, 2017 at 02:37:04PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> Document the device-tree bindings defining the the properties under
> the @power-mgt node in the device tree that describe the idle states
> for Linux running on baremetal POWER servers.
> 

We have "common" idle state bindings. Perhaps some explanation why those 
can't be used.

> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
> [v4]-> [v5]: Fixed a couple of typos.
> 
>  .../devicetree/bindings/powerpc/opal/power-mgt.txt | 125 +++++++++++++++++++++
>  1 file changed, 125 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt
> 
> diff --git a/Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt b/Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt
> new file mode 100644
> index 0000000..4967831
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt
> @@ -0,0 +1,125 @@
> +IBM Power-Management Bindings
> +=============================
> +
> +Linux running on baremetal POWER machines has access to the processor
> +idle states. The description of these idle states is exposed via the
> +node @power-mgt in the device-tree by the firmware.
> +
> +Definitions:
> +----------------
> +Typically each idle state has the following associated properties:
> +
> +- name: The name of the idle state as defined by the firmware.
> +
> +- flags: indicating some aspects of this idle states such as the
> +         extent of state-loss, whether timebase is stopped on this
> +         idle states and so on. The flag bits are as follows:
> +
> +- exit-latency: The latency involved in transitioning the state of the
> +		CPU from idle to running.
> +
> +- target-residency: The minimum time that the CPU needs to reside in
> +		    this idle state in order to accrue power-savings
> +		    benefit.
> +
> +Properties
> +----------------
> +The following properties provide details about the idle states. These
> +properties are optional unless mentioned otherwise below.

-names is optional but everything else seems to require it. It is not 
clear what the binding looks like if -names is not present.

> +
> +- ibm,cpu-idle-state-names:
> +	Array of strings containing the names of the idle states.
> +
> +- ibm,cpu-idle-state-flags:
> +	Array of unsigned 32-bit values containing the values of the
> +	flags associated with the the aforementioned idle-states. This
> +	property is required on POWER9 whenever
> +	ibm,cpu-idle-state-names is defined and the length of this
> +	property array should be the same as
> +	ibm,-cpu-idle-state-names.The flag bits are as follows:

s/ibm,-cpu/ibm,cpu/

Needs a space after the period.

> +		0x00000001 /* Decrementer would stop */
> +		0x00000002 /* Needs timebase restore */
> +		0x00001000 /* Restore GPRs like nap */
> +		0x00002000 /* Restore hypervisor resource from PACA pointer */
> +		0x00004000 /* Program PORE to restore PACA pointer */
> +		0x00010000 /* This is a nap state */
> +		0x00020000 /* This is a fast-sleep state */
> +		0x00040000 /* This is a winkle state */
> +		0x00080000 /* This is a fast-sleep state which requires a */
> +			   /* software workaround for restoring the timebase*/
> +		0x00800000 /* This state uses SPR PMICR instruction */
> +		0x00100000 /* This is a fast stop state */
> +		0x00200000 /* This is a deep-stop state */
> +
> +- ibm,cpu-idle-state-latencies-ns:
> +	Array of unsigned 32-bit values containing the values of the
> +	exit-latencies (in ns) for the idle states in
> +	ibm,cpu-idle-state-names. This property is required whenever
> +	ibm,cpu-idle-state-names is defined and the length of this
> +	property array should be the same as
> +	ibm,-cpu-idle-state-names.

Same typo.

> +
> +- ibm,cpu-idle-state-residency-ns:
> +	Array of unsigned 32-bit values containing the values of the
> +	target-residency (in ns) for the idle states in
> +	ibm,cpu-idle-state-names. On POWER8 this is an optional
> +	property. If the property is absent, the target residency for
> +	the "Nap", "FastSleep" are defined to 10000 and 300000000
> +	respectively. On POWER9 this property must be defined if
> +	ibm,cpu-idle-state-names is defined and the length should be
> +	same as that of ibm,cpu-idle-state-names.
> +
> +- ibm,cpu-idle-state-psscr:
> +	Array of unsigned 64-bit values containing the values for the
> +	PSSCR for each of the idle states in ibm,cpu-idle-state-names.
> +	This property is required on POWER9 whenever
> +	ibm,cpu-idle-state-names is defined and the length of this
> +	property array should be the same as
> +	ibm,cpu-idle-state-names.
> +
> +- ibm,cpu-idle-state-psscr-mask:
> +	Array of unsigned 64-bit values containing the masks
> +	indicating which psscr fields are set in the corresponding
> +	entries of ibm,cpu-idle-state-psscr.  This property is
> +	required on POWER9 whenever ibm,cpu-idle-state-names is
> +	defined and the length of this property array should be the
> +	same as ibm,cpu-idle-state-names.
> +
> +	Whenever the firmware sets an entry in
> +	ibm,cpu-idle-state-psscr-mask value to 0xf, it implies that
> +	only the Requested Level (RL) field of the corresponding entry
> +	in ibm,cpu-idle-state-psscr should be considered by the
> +	kernel. For such idle states, the kernel would set the
> +	remaining fields of the psscr to the following sane-default
> +	values.
> +
> +		- ESL and EC bits are to 1. So wakeup from any stop
> +		  state will be at vector 0x100.
> +
> +		- MTL and PSLL are set to the maximum allowed value as
> +		  per the ISA, i.e. 15.
> +
> +		- The Transition Rate, TR is set to the Maximum value
> +                  3.
> +
> +	For all the other values of the entry in
> +	ibm,cpu-idle-state-psscr-mask, the Kernel expects all the
> +	psscr fields of the corresponding entry in
> +	ibm,cpu-idle-state-psscr to be correctly set by the firmware.
> +
> +- ibm,cpu-idle-state-pmicr:
> +	Array of unsigned 64-bit values containing the pmicr values
> +	for the idle states in ibm,cpu-idle-state-names. This 64-bit
> +	register value is to be set in pmicr for the corresponding
> +	state if the flag indicates that pmicr SPR should be set. This
> +	is an optional property on POWER8 and is absent on
> +	POWER9. When present on POWER8, the length of this property
> +	array should be the same as ibm,cpu-idle-state-names.
> +
> +- ibm,cpu-idle-state-pmicr:-mask
> +	Array of unsigned 64-bit values containing the mask indicating
> +	which of the fields of the PMICR are set in the corresponding
> +	entries in ibm,cpu-idle-state-pmicr. This is an optional
> +	property on POWER8 and is absent on POWER9. When present on
> +	POWER8, the length of this property array should be the same
> +	as ibm,cpu-idle-state-names.
> -- 
> 1.9.4
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 5/5] Documentation:powerpc: Add device-tree bindings for power-mgt
  2017-01-13 16:57   ` Rob Herring
@ 2017-01-25  8:09     ` Gautham R Shenoy
  0 siblings, 0 replies; 14+ messages in thread
From: Gautham R Shenoy @ 2017-01-25  8:09 UTC (permalink / raw)
  To: Rob Herring
  Cc: Gautham R. Shenoy, Michael Ellerman, Benjamin Herrenschmidt,
	Paul Mackerras, Rafael J. Wysocki, Daniel Lezcano,
	Michael Neuling, Vaidyanathan Srinivasan, Shreyas B. Prabhu,
	Shilpasri G Bhat, Stewart Smith, Balbir Singh,
	Oliver O'Halloran, linuxppc-dev, linux-kernel, linux-pm,
	devicetree, mark.rutland

Hello Rob,

Thank you very much for your review. I had missed this mail
and found it while looking at the lkml thread while preparing for the
next iteration.

On Fri, Jan 13, 2017 at 10:57:43AM -0600, Rob Herring wrote:
> On Tue, Jan 10, 2017 at 02:37:04PM +0530, Gautham R. Shenoy wrote:
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> > 
> > Document the device-tree bindings defining the the properties under
> > the @power-mgt node in the device tree that describe the idle states
> > for Linux running on baremetal POWER servers.
> > 
> 
> We have "common" idle state bindings. Perhaps some explanation why those 
> can't be used.

Are you referring to
Documentation/devicetree/bindings/power/domain-idle-state.txt ?

On POWER, since POWER8 time, the DVFS states as well as the idle
states were exposed as properties under the common /ibm,opal/power-mgt
node. Since the DVFS states were large in number, creating a separate
node for each one of them didn't seem like a good design. Hence the
DVFS state properties were encoded as property arrays. The same design
was carried forward to the idle-states as well.

Which is why the common idle-state bindings in domain-idle-state.txt
cannot be used since each idle-state is described as a node there.

> 
> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> > ---
> > [v4]-> [v5]: Fixed a couple of typos.
> > 
> >  .../devicetree/bindings/powerpc/opal/power-mgt.txt | 125 +++++++++++++++++++++
> >  1 file changed, 125 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt
> > 
> > diff --git a/Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt b/Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt
> > new file mode 100644
> > index 0000000..4967831
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt
> > @@ -0,0 +1,125 @@
> > +IBM Power-Management Bindings
> > +=============================
> > +
> > +Linux running on baremetal POWER machines has access to the processor
> > +idle states. The description of these idle states is exposed via the
> > +node @power-mgt in the device-tree by the firmware.
> > +
> > +Definitions:
> > +----------------
> > +Typically each idle state has the following associated properties:
> > +
> > +- name: The name of the idle state as defined by the firmware.
> > +
> > +- flags: indicating some aspects of this idle states such as the
> > +         extent of state-loss, whether timebase is stopped on this
> > +         idle states and so on. The flag bits are as follows:
> > +
> > +- exit-latency: The latency involved in transitioning the state of the
> > +		CPU from idle to running.
> > +
> > +- target-residency: The minimum time that the CPU needs to reside in
> > +		    this idle state in order to accrue power-savings
> > +		    benefit.
> > +
> > +Properties
> > +----------------
> > +The following properties provide details about the idle states. These
> > +properties are optional unless mentioned otherwise below.
> 
> -names is optional but everything else seems to require it. It is not 
> clear what the binding looks like if -names is not present.

The firmware could choose not to expose the idle-states to the kernel,
in which case, it would expose any of these properties.

However, if the firmware chooses to expose idle-states, then the 
ibm,cpu-idle-state-names and ibm,cpu-idle-state-flags properties are
required.

I will reword this as follows:

  The following properties provide details about the idle
  states. These properties are exposed as arrays. Each entry in the
  property array provides the value of that property for the idle
  state associated with the array index of that entry.

  If idle-states are defined, then the properties
  "ibm,cpu-idle-state-names" and "ibm,cpu-idle-state-flags" are
  required. The other properties are required unless mentioned
  otherwise. The length of all the property arrays must be the same.


> 
> > +
> > +- ibm,cpu-idle-state-names:
> > +	Array of strings containing the names of the idle states.
> > +
> > +- ibm,cpu-idle-state-flags:
> > +	Array of unsigned 32-bit values containing the values of the
> > +	flags associated with the the aforementioned idle-states. This
> > +	property is required on POWER9 whenever
> > +	ibm,cpu-idle-state-names is defined and the length of this
> > +	property array should be the same as
> > +	ibm,-cpu-idle-state-names.The flag bits are as follows:
> 
> s/ibm,-cpu/ibm,cpu/
> 
> Needs a space after the period.

Thanks for catching this. Will fix it in the next iteration.

> 
> > +		0x00000001 /* Decrementer would stop */
> > +		0x00000002 /* Needs timebase restore */
> > +		0x00001000 /* Restore GPRs like nap */
> > +		0x00002000 /* Restore hypervisor resource from PACA pointer */
> > +		0x00004000 /* Program PORE to restore PACA pointer */
> > +		0x00010000 /* This is a nap state */
> > +		0x00020000 /* This is a fast-sleep state */
> > +		0x00040000 /* This is a winkle state */
> > +		0x00080000 /* This is a fast-sleep state which requires a */
> > +			   /* software workaround for restoring the timebase*/
> > +		0x00800000 /* This state uses SPR PMICR instruction */
> > +		0x00100000 /* This is a fast stop state */
> > +		0x00200000 /* This is a deep-stop state */
> > +
> > +- ibm,cpu-idle-state-latencies-ns:
> > +	Array of unsigned 32-bit values containing the values of the
> > +	exit-latencies (in ns) for the idle states in
> > +	ibm,cpu-idle-state-names. This property is required whenever
> > +	ibm,cpu-idle-state-names is defined and the length of this
> > +	property array should be the same as
> > +	ibm,-cpu-idle-state-names.
> 
> Same typo.

Will fix this one as well.

> 
> > +
> > +- ibm,cpu-idle-state-residency-ns:
> > +	Array of unsigned 32-bit values containing the values of the
> > +	target-residency (in ns) for the idle states in
> > +	ibm,cpu-idle-state-names. On POWER8 this is an optional
> > +	property. If the property is absent, the target residency for
> > +	the "Nap", "FastSleep" are defined to 10000 and 300000000
> > +	respectively. On POWER9 this property must be defined if
> > +	ibm,cpu-idle-state-names is defined and the length should be
> > +	same as that of ibm,cpu-idle-state-names.
> > +
> > +- ibm,cpu-idle-state-psscr:
> > +	Array of unsigned 64-bit values containing the values for the
> > +	PSSCR for each of the idle states in ibm,cpu-idle-state-names.
> > +	This property is required on POWER9 whenever
> > +	ibm,cpu-idle-state-names is defined and the length of this
> > +	property array should be the same as
> > +	ibm,cpu-idle-state-names.
> > +
> > +- ibm,cpu-idle-state-psscr-mask:
> > +	Array of unsigned 64-bit values containing the masks
> > +	indicating which psscr fields are set in the corresponding
> > +	entries of ibm,cpu-idle-state-psscr.  This property is
> > +	required on POWER9 whenever ibm,cpu-idle-state-names is
> > +	defined and the length of this property array should be the
> > +	same as ibm,cpu-idle-state-names.
> > +
> > +	Whenever the firmware sets an entry in
> > +	ibm,cpu-idle-state-psscr-mask value to 0xf, it implies that
> > +	only the Requested Level (RL) field of the corresponding entry
> > +	in ibm,cpu-idle-state-psscr should be considered by the
> > +	kernel. For such idle states, the kernel would set the
> > +	remaining fields of the psscr to the following sane-default
> > +	values.
> > +
> > +		- ESL and EC bits are to 1. So wakeup from any stop
> > +		  state will be at vector 0x100.
> > +
> > +		- MTL and PSLL are set to the maximum allowed value as
> > +		  per the ISA, i.e. 15.
> > +
> > +		- The Transition Rate, TR is set to the Maximum value
> > +                  3.
> > +
> > +	For all the other values of the entry in
> > +	ibm,cpu-idle-state-psscr-mask, the Kernel expects all the
> > +	psscr fields of the corresponding entry in
> > +	ibm,cpu-idle-state-psscr to be correctly set by the firmware.
> > +
> > +- ibm,cpu-idle-state-pmicr:
> > +	Array of unsigned 64-bit values containing the pmicr values
> > +	for the idle states in ibm,cpu-idle-state-names. This 64-bit
> > +	register value is to be set in pmicr for the corresponding
> > +	state if the flag indicates that pmicr SPR should be set. This
> > +	is an optional property on POWER8 and is absent on
> > +	POWER9. When present on POWER8, the length of this property
> > +	array should be the same as ibm,cpu-idle-state-names.
> > +
> > +- ibm,cpu-idle-state-pmicr:-mask
> > +	Array of unsigned 64-bit values containing the mask indicating
> > +	which of the fields of the PMICR are set in the corresponding
> > +	entries in ibm,cpu-idle-state-pmicr. This is an optional
> > +	property on POWER8 and is absent on POWER9. When present on
> > +	POWER8, the length of this property array should be the same
> > +	as ibm,cpu-idle-state-names.
> > -- 
> > 1.9.4
> > 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-01-25  8:25 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-10  9:06 [PATCH v5 0/5] powernv:stop: Use psscr_val,mask provided by firmware Gautham R. Shenoy
2017-01-10  9:07 ` [PATCH v5 1/5] powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro Gautham R. Shenoy
2017-01-12  9:42   ` Balbir Singh
2017-01-10  9:07 ` [PATCH v5 2/5] powernv:stop: Uniformly rename power9 to arch300 Gautham R. Shenoy
2017-01-12  9:47   ` Balbir Singh
2017-01-13  3:44     ` Gautham R Shenoy
2017-01-13  3:46       ` Oliver O'Halloran
2017-01-10  9:07 ` [PATCH v5 3/5] cpuidle:powernv: Add helper function to populate powernv idle states Gautham R. Shenoy
2017-01-12 10:04   ` Balbir Singh
2017-01-10  9:07 ` [PATCH v5 4/5] powernv: Pass PSSCR value and mask to power9_idle_stop Gautham R. Shenoy
2017-01-12 17:18   ` Balbir Singh
2017-01-10  9:07 ` [PATCH v5 5/5] Documentation:powerpc: Add device-tree bindings for power-mgt Gautham R. Shenoy
2017-01-13 16:57   ` Rob Herring
2017-01-25  8:09     ` Gautham R Shenoy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).