linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [v3 PATCH 0/4] powernv:idle: Fixes for CPU-Hotplug on POWER DD1.0
@ 2017-03-22 15:04 Gautham R. Shenoy
  2017-03-22 15:04 ` [v3 PATCH 1/4] powernv: Move CPU-Offline idle state invocation from smp.c to idle.c Gautham R. Shenoy
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Gautham R. Shenoy @ 2017-03-22 15:04 UTC (permalink / raw)
  To: Michael Ellerman, Michael Neuling, Benjamin Herrenschmidt,
	Shreyas B. Prabhu, Shilpasri G Bhat, Vaidyanathan Srinivasan,
	Anton Blanchard, Balbir Singh, Akshay Adiga, Nicholas Piggin,
	Mahesh J Salgaonkar
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Hi,

This is the third version of the patchset containing the fixes to
make CPU-Hotplug working on correctly on POWER9 DD1 systems.

The earlier versions can be found here:
[v2] : https://lkml.org/lkml/2017/3/20/555
[v1] : https://lkml.org/lkml/2017/3/13/46

The only change in this patch series from v2 are the following
optimizations suggested by Nicholas Piggin.

- Dynamically allocate the thread_sibling_pacas array to contain
  "threads_per_core" number of slots instead of declaring the array
  size upfront.

- Use SPRN_TIR instead of (SPRN_PIR & 0x3) to determine the thread's
  index within a core.

Patch 4 in the series requires Nicholas Piggin's ack. Rest of the
patches are unchanged from the previous version.

These patches are based on v4.11-rc3.

The patches have been tested with stop1 (ESL=EC=1) as the
deepest-state entered into during CPU-Hotplug.

Gautham R. Shenoy (4):
  powernv: Move CPU-Offline idle state invocation from smp.c to idle.c
  powernv:smp: Add busy-wait loop as fall back for CPU-Hotplug
  powernv:idle: Don't override default/deepest directly in kernel
  powernv: Recover correct PACA on wakeup from a stop on P9 DD1

 arch/powerpc/include/asm/cpuidle.h       |  1 +
 arch/powerpc/include/asm/paca.h          |  5 ++
 arch/powerpc/kernel/asm-offsets.c        |  1 +
 arch/powerpc/kernel/idle_book3s.S        | 48 +++++++++++++++-
 arch/powerpc/platforms/powernv/idle.c    | 96 ++++++++++++++++++++++++++------
 arch/powerpc/platforms/powernv/powernv.h |  2 -
 arch/powerpc/platforms/powernv/smp.c     | 18 +-----
 7 files changed, 136 insertions(+), 35 deletions(-)

-- 
1.9.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [v3 PATCH 1/4] powernv: Move CPU-Offline idle state invocation from smp.c to idle.c
  2017-03-22 15:04 [v3 PATCH 0/4] powernv:idle: Fixes for CPU-Hotplug on POWER DD1.0 Gautham R. Shenoy
@ 2017-03-22 15:04 ` Gautham R. Shenoy
  2017-04-12 13:31   ` [v3, " Michael Ellerman
  2017-03-22 15:04 ` [v3 PATCH 2/4] powernv:smp: Add busy-wait loop as fall back for CPU-Hotplug Gautham R. Shenoy
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Gautham R. Shenoy @ 2017-03-22 15:04 UTC (permalink / raw)
  To: Michael Ellerman, Michael Neuling, Benjamin Herrenschmidt,
	Shreyas B. Prabhu, Shilpasri G Bhat, Vaidyanathan Srinivasan,
	Anton Blanchard, Balbir Singh, Akshay Adiga, Nicholas Piggin,
	Mahesh J Salgaonkar
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Move the piece of code in powernv/smp.c::pnv_smp_cpu_kill_self() which
transitions the CPU to the deepest available platform idle state to a
new function named pnv_cpu_offline() in powernv/idle.c. The rationale
behind this code movement is that the data required to determine the
deepest available platform state resides in powernv/idle.c.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/cpuidle.h       |  1 +
 arch/powerpc/platforms/powernv/idle.c    | 25 +++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/powernv.h |  2 --
 arch/powerpc/platforms/powernv/smp.c     | 18 ++----------------
 4 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index 1557315..4649ca0 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -46,6 +46,7 @@
 
 extern u64 pnv_first_deep_stop_state;
 
+unsigned long pnv_cpu_offline(unsigned int cpu);
 int validate_psscr_val_mask(u64 *psscr_val, u64 *psscr_mask, u32 flags);
 static inline void report_invalid_psscr_val(u64 psscr_val, int err)
 {
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 4ee837e..419edff 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -266,6 +266,31 @@ static void power9_idle(void)
 u64 pnv_deepest_stop_psscr_mask;
 
 /*
+ * pnv_cpu_offline: A function that puts the CPU into the deepest
+ * available platform idle state on a CPU-Offline.
+ */
+unsigned long pnv_cpu_offline(unsigned int cpu)
+{
+	unsigned long srr1;
+
+	u32 idle_states = pnv_get_supported_cpuidle_states();
+
+	if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+		srr1 = power9_idle_stop(pnv_deepest_stop_psscr_val,
+					pnv_deepest_stop_psscr_mask);
+	} else if (idle_states & OPAL_PM_WINKLE_ENABLED) {
+		srr1 = power7_winkle();
+	} else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
+		   (idle_states & OPAL_PM_SLEEP_ENABLED_ER1)) {
+		srr1 = power7_sleep();
+	} else {
+		srr1 = power7_nap(1);
+	}
+
+	return srr1;
+}
+
+/*
  * Power ISA 3.0 idle initialization.
  *
  * POWER ISA 3.0 defines a new SPR Processor stop Status and Control
diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h
index 6130522..6dbc0a1 100644
--- a/arch/powerpc/platforms/powernv/powernv.h
+++ b/arch/powerpc/platforms/powernv/powernv.h
@@ -18,8 +18,6 @@ static inline void pnv_pci_shutdown(void) { }
 #endif
 
 extern u32 pnv_get_supported_cpuidle_states(void);
-extern u64 pnv_deepest_stop_psscr_val;
-extern u64 pnv_deepest_stop_psscr_mask;
 
 extern void pnv_lpc_init(void);
 
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 8b67e1e..914b456 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -35,6 +35,7 @@
 #include <asm/dbell.h>
 #include <asm/kvm_ppc.h>
 #include <asm/ppc-opcode.h>
+#include <asm/cpuidle.h>
 
 #include "powernv.h"
 
@@ -140,7 +141,6 @@ static void pnv_smp_cpu_kill_self(void)
 {
 	unsigned int cpu;
 	unsigned long srr1, wmask;
-	u32 idle_states;
 
 	/* Standard hot unplug procedure */
 	local_irq_disable();
@@ -155,8 +155,6 @@ static void pnv_smp_cpu_kill_self(void)
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		wmask = SRR1_WAKEMASK_P8;
 
-	idle_states = pnv_get_supported_cpuidle_states();
-
 	/* We don't want to take decrementer interrupts while we are offline,
 	 * so clear LPCR:PECE1. We keep PECE2 (and LPCR_PECE_HVEE on P9)
 	 * enabled as to let IPIs in.
@@ -184,19 +182,7 @@ static void pnv_smp_cpu_kill_self(void)
 		kvmppc_set_host_ipi(cpu, 0);
 
 		ppc64_runlatch_off();
-
-		if (cpu_has_feature(CPU_FTR_ARCH_300)) {
-			srr1 = power9_idle_stop(pnv_deepest_stop_psscr_val,
-						pnv_deepest_stop_psscr_mask);
-		} else if (idle_states & OPAL_PM_WINKLE_ENABLED) {
-			srr1 = power7_winkle();
-		} else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
-			   (idle_states & OPAL_PM_SLEEP_ENABLED_ER1)) {
-			srr1 = power7_sleep();
-		} else {
-			srr1 = power7_nap(1);
-		}
-
+		srr1 = pnv_cpu_offline(cpu);
 		ppc64_runlatch_on();
 
 		/*
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [v3 PATCH 2/4] powernv:smp: Add busy-wait loop as fall back for CPU-Hotplug
  2017-03-22 15:04 [v3 PATCH 0/4] powernv:idle: Fixes for CPU-Hotplug on POWER DD1.0 Gautham R. Shenoy
  2017-03-22 15:04 ` [v3 PATCH 1/4] powernv: Move CPU-Offline idle state invocation from smp.c to idle.c Gautham R. Shenoy
@ 2017-03-22 15:04 ` Gautham R. Shenoy
  2017-03-27 11:43   ` Michael Ellerman
  2017-03-22 15:04 ` [v3 PATCH 3/4] powernv:idle: Don't override default/deepest directly in kernel Gautham R. Shenoy
  2017-03-22 15:04 ` [v3 PATCH 4/4] powernv: Recover correct PACA on wakeup from a stop on P9 DD1 Gautham R. Shenoy
  3 siblings, 1 reply; 9+ messages in thread
From: Gautham R. Shenoy @ 2017-03-22 15:04 UTC (permalink / raw)
  To: Michael Ellerman, Michael Neuling, Benjamin Herrenschmidt,
	Shreyas B. Prabhu, Shilpasri G Bhat, Vaidyanathan Srinivasan,
	Anton Blanchard, Balbir Singh, Akshay Adiga, Nicholas Piggin,
	Mahesh J Salgaonkar
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Currently, the powernv cpu-offline function assumes that platform idle
states such as stop on POWER9, winkle/sleep/nap on POWER8 are always
available. On POWER8, it picks nap as the default state if other deep
idle states like sleep/winkle are not available and enabled in the
platform.

On POWER9, nap is not available and all idle states are managed by
STOP instruction.  The parameters to the idle state are passed through
processor stop status control register (PSSCR).  Hence as such
executing STOP would take parameters from current PSSCR. We do not
want to make any assumptions in kernel on what STOP states and PSSCR
features are configured by the platform.

Ideally platform will configure a good set of stop states that can be
used in the kernel.  We would like to start with a clean slate, if the
platform choose to not configure any state or there is an error in
platform firmware that lead to no stop states being configured or
allowed to be requested.

This patch adds a fallback method for CPU-Hotplug that is similar to
snooze loop at idle where the threads are left to spin at low priority
and hence reduce the cycles consumed.

This is a safe fallback mechanism in the case when no stop state would
be requested if the platform firmware did not configure them most
likely due to an error condition.

Requesting a stop state when the platform has not configured them or
enabled them would lead to further error conditions which could be
difficult to debug.

[Changelog written with inputs from svaidy@linux.vnet.ibm.com]
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/idle.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 419edff..f335e0f 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -283,8 +283,16 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
 	} else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
 		   (idle_states & OPAL_PM_SLEEP_ENABLED_ER1)) {
 		srr1 = power7_sleep();
-	} else {
+	} else if (idle_states & OPAL_PM_NAP_ENABLED) {
 		srr1 = power7_nap(1);
+	} else {
+		/* This is the fallback method. We emulate snooze */
+		while (!generic_check_cpu_restart(cpu)) {
+			HMT_low();
+			HMT_very_low();
+		}
+		srr1 = 0;
+		HMT_medium();
 	}
 
 	return srr1;
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [v3 PATCH 3/4] powernv:idle: Don't override default/deepest directly in kernel
  2017-03-22 15:04 [v3 PATCH 0/4] powernv:idle: Fixes for CPU-Hotplug on POWER DD1.0 Gautham R. Shenoy
  2017-03-22 15:04 ` [v3 PATCH 1/4] powernv: Move CPU-Offline idle state invocation from smp.c to idle.c Gautham R. Shenoy
  2017-03-22 15:04 ` [v3 PATCH 2/4] powernv:smp: Add busy-wait loop as fall back for CPU-Hotplug Gautham R. Shenoy
@ 2017-03-22 15:04 ` Gautham R. Shenoy
  2017-03-22 15:04 ` [v3 PATCH 4/4] powernv: Recover correct PACA on wakeup from a stop on P9 DD1 Gautham R. Shenoy
  3 siblings, 0 replies; 9+ messages in thread
From: Gautham R. Shenoy @ 2017-03-22 15:04 UTC (permalink / raw)
  To: Michael Ellerman, Michael Neuling, Benjamin Herrenschmidt,
	Shreyas B. Prabhu, Shilpasri G Bhat, Vaidyanathan Srinivasan,
	Anton Blanchard, Balbir Singh, Akshay Adiga, Nicholas Piggin,
	Mahesh J Salgaonkar
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Currently during idle-init on power9, if we don't find suitable stop
states in the device tree that can be used as the
default_stop/deepest_stop, we set stop0 (ESL=1,EC=1) as the default
stop state psscr to be used by power9_idle and deepest stop state
which is used by CPU-Hotplug.

However, if the platform firmware has not configured or enabled a stop
state, the kernel should not make any assumptions and fallback to a
default choice.

If the kernel uses a stop state that is not configured by the platform
firmware, it may lead to further failures which should be avoided.

In this patch, we modify the init code to ensure that the kernel uses
only the stop states exposed by the firmware through the device
tree. When a suitable default stop state isn't found, we disable
ppc_md.power_save for power9. Similarly, when a suitable
deepest_stop_state is not found in the device tree exported by the
firmware, fall back to the default busy-wait loop in the CPU-Hotplug
code.

[Changelog written with inputs from svaidy@linux.vnet.ibm.com]
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/idle.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index f335e0f..63ade78 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -147,7 +147,6 @@ u32 pnv_get_supported_cpuidle_states(void)
 }
 EXPORT_SYMBOL_GPL(pnv_get_supported_cpuidle_states);
 
-
 static void pnv_fastsleep_workaround_apply(void *info)
 
 {
@@ -241,8 +240,9 @@ static DEVICE_ATTR(fastsleep_workaround_applyonce, 0600,
  * The default stop state that will be used by ppc_md.power_save
  * function on platforms that support stop instruction.
  */
-u64 pnv_default_stop_val;
-u64 pnv_default_stop_mask;
+static u64 pnv_default_stop_val;
+static u64 pnv_default_stop_mask;
+static bool default_stop_found;
 
 /*
  * Used for ppc_md.power_save which needs a function with no parameters
@@ -262,8 +262,9 @@ static void power9_idle(void)
  * psscr value and mask of the deepest stop idle state.
  * Used when a cpu is offlined.
  */
-u64 pnv_deepest_stop_psscr_val;
-u64 pnv_deepest_stop_psscr_mask;
+static u64 pnv_deepest_stop_psscr_val;
+static u64 pnv_deepest_stop_psscr_mask;
+static bool deepest_stop_found;
 
 /*
  * pnv_cpu_offline: A function that puts the CPU into the deepest
@@ -275,7 +276,7 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
 
 	u32 idle_states = pnv_get_supported_cpuidle_states();
 
-	if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+	if (cpu_has_feature(CPU_FTR_ARCH_300) && deepest_stop_found) {
 		srr1 = power9_idle_stop(pnv_deepest_stop_psscr_val,
 					pnv_deepest_stop_psscr_mask);
 	} else if (idle_states & OPAL_PM_WINKLE_ENABLED) {
@@ -385,7 +386,6 @@ static int __init pnv_power9_idle_init(struct device_node *np, u32 *flags,
 	u32 *residency_ns = NULL;
 	u64 max_residency_ns = 0;
 	int rc = 0, i;
-	bool default_stop_found = false, deepest_stop_found = false;
 
 	psscr_val = kcalloc(dt_idle_states, sizeof(*psscr_val), GFP_KERNEL);
 	psscr_mask = kcalloc(dt_idle_states, sizeof(*psscr_mask), GFP_KERNEL);
@@ -465,21 +465,24 @@ static int __init pnv_power9_idle_init(struct device_node *np, u32 *flags,
 		}
 	}
 
-	if (!default_stop_found) {
-		pnv_default_stop_val = PSSCR_HV_DEFAULT_VAL;
-		pnv_default_stop_mask = PSSCR_HV_DEFAULT_MASK;
-		pr_warn("Setting default stop psscr val=0x%016llx,mask=0x%016llx\n",
+	if (unlikely(!default_stop_found)) {
+		pr_warn("cpuidle-powernv: No suitable default stop state found. Disabling platform idle.\n");
+	} else {
+		ppc_md.power_save = power9_idle;
+		pr_info("cpuidle-powernv: Default stop: psscr = 0x%016llx,mask=0x%016llx\n",
 			pnv_default_stop_val, pnv_default_stop_mask);
 	}
 
-	if (!deepest_stop_found) {
-		pnv_deepest_stop_psscr_val = PSSCR_HV_DEFAULT_VAL;
-		pnv_deepest_stop_psscr_mask = PSSCR_HV_DEFAULT_MASK;
-		pr_warn("Setting default stop psscr val=0x%016llx,mask=0x%016llx\n",
+	if (unlikely(!deepest_stop_found)) {
+		pr_warn("cpuidle-powernv: No suitable stop state for CPU-Hotplug. Offlined CPUs will busy wait");
+	} else {
+		pr_info("cpuidle-powernv: Deepest stop: psscr = 0x%016llx,mask=0x%016llx\n",
 			pnv_deepest_stop_psscr_val,
 			pnv_deepest_stop_psscr_mask);
 	}
 
+	pr_info("cpuidle-powernv: Requested Level (RL) value of first deep stop = 0x%llx\n",
+		pnv_first_deep_stop_state);
 out:
 	kfree(psscr_val);
 	kfree(psscr_mask);
@@ -559,8 +562,6 @@ static int __init pnv_init_idle_states(void)
 
 	if (supported_cpuidle_states & OPAL_PM_NAP_ENABLED)
 		ppc_md.power_save = power7_idle;
-	else if (supported_cpuidle_states & OPAL_PM_STOP_INST_FAST)
-		ppc_md.power_save = power9_idle;
 
 out:
 	return 0;
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [v3 PATCH 4/4] powernv: Recover correct PACA on wakeup from a stop on P9 DD1
  2017-03-22 15:04 [v3 PATCH 0/4] powernv:idle: Fixes for CPU-Hotplug on POWER DD1.0 Gautham R. Shenoy
                   ` (2 preceding siblings ...)
  2017-03-22 15:04 ` [v3 PATCH 3/4] powernv:idle: Don't override default/deepest directly in kernel Gautham R. Shenoy
@ 2017-03-22 15:04 ` Gautham R. Shenoy
  2017-03-22 16:06   ` Nicholas Piggin
  3 siblings, 1 reply; 9+ messages in thread
From: Gautham R. Shenoy @ 2017-03-22 15:04 UTC (permalink / raw)
  To: Michael Ellerman, Michael Neuling, Benjamin Herrenschmidt,
	Shreyas B. Prabhu, Shilpasri G Bhat, Vaidyanathan Srinivasan,
	Anton Blanchard, Balbir Singh, Akshay Adiga, Nicholas Piggin,
	Mahesh J Salgaonkar
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

POWER9 DD1.0 hardware has an issue due to which the SPRs of a thread
waking up from stop 0,1,2 with ESL=1 can endup being misplaced in the
core. Thus the HSPRG0 of a thread waking up from can contain the paca
pointer of its sibling.

This patch implements a context recovery framework within threads of a
core, by provisioning space in paca_struct for saving every sibling
threads's paca pointers. Basically, we should be able to arrive at the
right paca pointer from any of the thread's existing paca pointer.

At bootup, during powernv idle-init, we save the paca address of every
CPU in each one its siblings paca_struct in the slot corresponding to
this CPU's index in the core.

On wakeup from a stop, the thread will determine its index in the core
from the TIR register and recover its PACA pointer by indexing into
the correct slot in the provisioned space in the current PACA.

Furthermore, ensure that the NVGPRs are restored from the stack on the
way out by setting the NAPSTATELOST in paca.

[Changelog written with inputs from svaidy@linux.vnet.ibm.com]

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/paca.h       |  5 ++++
 arch/powerpc/kernel/asm-offsets.c     |  1 +
 arch/powerpc/kernel/idle_book3s.S     | 48 ++++++++++++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/idle.c | 30 ++++++++++++++++++++++
 4 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 708c3e5..2a17c15 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -172,6 +172,11 @@ struct paca_struct {
 	u8 thread_mask;
 	/* Mask to denote subcore sibling threads */
 	u8 subcore_sibling_mask;
+	/*
+	 * Pointer to an array which contains pointer
+	 * to the sibling threads' paca.
+	 */
+	struct paca_struct **thread_sibling_pacas;
 #endif
 
 #ifdef CONFIG_PPC_BOOK3S_64
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 4367e7d..6ec5016 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -727,6 +727,7 @@ int main(void)
 	OFFSET(PACA_THREAD_IDLE_STATE, paca_struct, thread_idle_state);
 	OFFSET(PACA_THREAD_MASK, paca_struct, thread_mask);
 	OFFSET(PACA_SUBCORE_SIBLING_MASK, paca_struct, subcore_sibling_mask);
+	OFFSET(PACA_SIBLING_PACA_PTRS, paca_struct, thread_sibling_pacas);
 #endif
 
 	DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);
diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
index 9957287..24717a7 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -375,6 +375,46 @@ _GLOBAL(power9_idle_stop)
 	li	r4,1
 	b	pnv_powersave_common
 	/* No return */
+
+
+/*
+ * On waking up from stop 0,1,2 with ESL=1 on POWER9 DD1,
+ * HSPRG0 will be set to the HSPRG0 value of one of the
+ * threads in this core. Thus the value we have in r13
+ * may not be this thread's paca pointer.
+ *
+ * Fortunately, the TIR remains invariant. Since this thread's
+ * paca pointer is recorded in all its sibling's paca, we can
+ * correctly recover this thread's paca pointer if we
+ * know the index of this thread in the core.
+ *
+ * This index can be obtained from the TIR.
+ *
+ * i.e, thread's position in the core = TIR.
+ * If this value is i, then this thread's paca is
+ * paca->thread_sibling_pacas[i].
+ */
+power9_dd1_recover_paca:
+	mfspr	r4, SPRN_TIR
+	/*
+	 * Since each entry in thread_sibling_pacas is 8 bytes
+	 * we need to left-shift by 3 bits. Thus r4 = i * 8
+	 */
+	sldi	r4, r4, 3
+	/* Get &paca->thread_sibling_pacas[0] in r5 */
+	ld	r5, PACA_SIBLING_PACA_PTRS(r13)
+	/* Load paca->thread_sibling_pacas[i] into r13 */
+	ldx	r13, r4, r5
+	SET_PACA(r13)
+	ld	r2, PACATOC(r13)
+	/*
+	 * Indicate that we have lost NVGPR state
+	 * which needs to be restored from the stack.
+	 */
+	li	r3, 1
+	stb	r0,PACA_NAPSTATELOST(r13)
+	blr
+
 /*
  * Called from reset vector. Check whether we have woken up with
  * hypervisor state loss. If yes, restore hypervisor state and return
@@ -385,7 +425,13 @@ _GLOBAL(power9_idle_stop)
  */
 _GLOBAL(pnv_restore_hyp_resource)
 BEGIN_FTR_SECTION
-	ld	r2,PACATOC(r13);
+BEGIN_FTR_SECTION_NESTED(70)
+	mflr 	r6
+	bl	power9_dd1_recover_paca
+	mtlr	r6
+FTR_SECTION_ELSE_NESTED(70)
+	ld	r2, PACATOC(r13)
+ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_POWER9_DD1, 70)
 	/*
 	 * POWER ISA 3. Use PSSCR to determine if we
 	 * are waking up from deep idle state
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 63ade78..b369e39 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -122,9 +122,12 @@ static void pnv_alloc_idle_core_states(void)
 	for (i = 0; i < nr_cores; i++) {
 		int first_cpu = i * threads_per_core;
 		int node = cpu_to_node(first_cpu);
+		size_t paca_ptr_array_size;
 
 		core_idle_state = kmalloc_node(sizeof(u32), GFP_KERNEL, node);
 		*core_idle_state = PNV_CORE_IDLE_THREAD_BITS;
+		paca_ptr_array_size = (threads_per_core *
+				       sizeof(struct paca_struct *));
 
 		for (j = 0; j < threads_per_core; j++) {
 			int cpu = first_cpu + j;
@@ -132,6 +135,11 @@ static void pnv_alloc_idle_core_states(void)
 			paca[cpu].core_idle_state_ptr = core_idle_state;
 			paca[cpu].thread_idle_state = PNV_THREAD_RUNNING;
 			paca[cpu].thread_mask = 1 << j;
+			if (!cpu_has_feature(CPU_FTR_POWER9_DD1))
+				continue;
+			paca[cpu].thread_sibling_pacas =
+				kmalloc_node(paca_ptr_array_size,
+					     GFP_KERNEL, node);
 		}
 	}
 
@@ -560,6 +568,28 @@ static int __init pnv_init_idle_states(void)
 
 	pnv_alloc_idle_core_states();
 
+	/*
+	 * For each CPU, record its PACA address in each of it's
+	 * sibling thread's PACA at the slot corresponding to this
+	 * CPU's index in the core.
+	 */
+	if (cpu_has_feature(CPU_FTR_POWER9_DD1)) {
+		int cpu;
+
+		pr_info("powernv: idle: Saving PACA pointers of all CPUs in their thread sibling PACA\n");
+		for_each_possible_cpu(cpu) {
+			int base_cpu = cpu_first_thread_sibling(cpu);
+			int idx = cpu_thread_in_core(cpu);
+			int i;
+
+			for (i = 0; i < threads_per_core; i++) {
+				int j = base_cpu + i;
+
+				paca[j].thread_sibling_pacas[idx] = &paca[cpu];
+			}
+		}
+	}
+
 	if (supported_cpuidle_states & OPAL_PM_NAP_ENABLED)
 		ppc_md.power_save = power7_idle;
 
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [v3 PATCH 4/4] powernv: Recover correct PACA on wakeup from a stop on P9 DD1
  2017-03-22 15:04 ` [v3 PATCH 4/4] powernv: Recover correct PACA on wakeup from a stop on P9 DD1 Gautham R. Shenoy
@ 2017-03-22 16:06   ` Nicholas Piggin
  0 siblings, 0 replies; 9+ messages in thread
From: Nicholas Piggin @ 2017-03-22 16:06 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Michael Ellerman, Michael Neuling, Benjamin Herrenschmidt,
	Shreyas B. Prabhu, Shilpasri G Bhat, Vaidyanathan Srinivasan,
	Anton Blanchard, Balbir Singh, Akshay Adiga, Mahesh J Salgaonkar,
	linuxppc-dev, linux-kernel

On Wed, 22 Mar 2017 20:34:17 +0530
"Gautham R. Shenoy" <ego@linux.vnet.ibm.com> wrote:

> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> POWER9 DD1.0 hardware has an issue due to which the SPRs of a thread
> waking up from stop 0,1,2 with ESL=1 can endup being misplaced in the
> core. Thus the HSPRG0 of a thread waking up from can contain the paca
> pointer of its sibling.
> 
> This patch implements a context recovery framework within threads of a
> core, by provisioning space in paca_struct for saving every sibling
> threads's paca pointers. Basically, we should be able to arrive at the
> right paca pointer from any of the thread's existing paca pointer.
> 
> At bootup, during powernv idle-init, we save the paca address of every
> CPU in each one its siblings paca_struct in the slot corresponding to
> this CPU's index in the core.
> 
> On wakeup from a stop, the thread will determine its index in the core
> from the TIR register and recover its PACA pointer by indexing into
> the correct slot in the provisioned space in the current PACA.
> 
> Furthermore, ensure that the NVGPRs are restored from the stack on the
> way out by setting the NAPSTATELOST in paca.
> 
> [Changelog written with inputs from svaidy@linux.vnet.ibm.com]

Looks good.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [v3 PATCH 2/4] powernv:smp: Add busy-wait loop as fall back for CPU-Hotplug
  2017-03-22 15:04 ` [v3 PATCH 2/4] powernv:smp: Add busy-wait loop as fall back for CPU-Hotplug Gautham R. Shenoy
@ 2017-03-27 11:43   ` Michael Ellerman
  2017-03-30  9:19     ` Gautham R Shenoy
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2017-03-27 11:43 UTC (permalink / raw)
  To: Gautham R. Shenoy, Michael Neuling, Benjamin Herrenschmidt,
	Shreyas B. Prabhu, Shilpasri G Bhat, Vaidyanathan Srinivasan,
	Anton Blanchard, Balbir Singh, Akshay Adiga, Nicholas Piggin,
	Mahesh J Salgaonkar
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

"Gautham R. Shenoy" <ego@linux.vnet.ibm.com> writes:

> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 419edff..f335e0f 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -283,8 +283,16 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
>  	} else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
>  		   (idle_states & OPAL_PM_SLEEP_ENABLED_ER1)) {
>  		srr1 = power7_sleep();
> -	} else {
> +	} else if (idle_states & OPAL_PM_NAP_ENABLED) {
>  		srr1 = power7_nap(1);
> +	} else {
> +		/* This is the fallback method. We emulate snooze */
> +		while (!generic_check_cpu_restart(cpu)) {

Breaks the SMP=n build :/

arch/powerpc/platforms/powernv/idle.c:299:11: error: implicit declaration of function 'generic_check_cpu_restart' [-Werror=implicit-function-declaration]

cheers

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [v3 PATCH 2/4] powernv:smp: Add busy-wait loop as fall back for CPU-Hotplug
  2017-03-27 11:43   ` Michael Ellerman
@ 2017-03-30  9:19     ` Gautham R Shenoy
  0 siblings, 0 replies; 9+ messages in thread
From: Gautham R Shenoy @ 2017-03-30  9:19 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Gautham R. Shenoy, Michael Neuling, Benjamin Herrenschmidt,
	Shreyas B. Prabhu, Shilpasri G Bhat, Vaidyanathan Srinivasan,
	Anton Blanchard, Balbir Singh, Akshay Adiga, Nicholas Piggin,
	Mahesh J Salgaonkar, linuxppc-dev, linux-kernel

On Mon, Mar 27, 2017 at 10:43:44PM +1100, Michael Ellerman wrote:
> "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> writes:
> 
> > diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> > index 419edff..f335e0f 100644
> > --- a/arch/powerpc/platforms/powernv/idle.c
> > +++ b/arch/powerpc/platforms/powernv/idle.c
> > @@ -283,8 +283,16 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
> >  	} else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
> >  		   (idle_states & OPAL_PM_SLEEP_ENABLED_ER1)) {
> >  		srr1 = power7_sleep();
> > -	} else {
> > +	} else if (idle_states & OPAL_PM_NAP_ENABLED) {
> >  		srr1 = power7_nap(1);
> > +	} else {
> > +		/* This is the fallback method. We emulate snooze */
> > +		while (!generic_check_cpu_restart(cpu)) {
> 
> Breaks the SMP=n build :/

Hmmm.. Looks like I have missed adding a CONFIG_HOTPLUG_CPU somewhere.
> 
> arch/powerpc/platforms/powernv/idle.c:299:11: error: implicit declaration of function 'generic_check_cpu_restart' [-Werror=implicit-function-declaration]
>

Could you add the following patch as the PATCH 1 in the series instead
of what was present in this series ? It adds a CONFIG_HOTPLUG_CPU
around pnv_cpu_offline(). I have compiled it with both all
combinations of CONFIG_HOTPLUG_CPU and CONFIG_SMP and it builds fine.


----------------------------x-----x-----x-------------------------------
powernv: Move CPU-Offline idle state invocation from smp.c to idle.c

Move the piece of code in powernv/smp.c::pnv_smp_cpu_kill_self() which
transitions the CPU to the deepest available platform idle state to a
new function named pnv_cpu_offline() in powernv/idle.c. The rationale
behind this code movement is that the data required to determine the
deepest available platform state resides in powernv/idle.c.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
Restricts pnv_cpu_offline() to CONFIG_HOTPLUG_CPU

 arch/powerpc/include/asm/cpuidle.h       |  1 +
 arch/powerpc/platforms/powernv/idle.c    | 27 +++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/powernv.h |  2 --
 arch/powerpc/platforms/powernv/smp.c     | 18 ++----------------
 4 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index 1557315..4649ca0 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -46,6 +46,7 @@
 
 extern u64 pnv_first_deep_stop_state;
 
+unsigned long pnv_cpu_offline(unsigned int cpu);
 int validate_psscr_val_mask(u64 *psscr_val, u64 *psscr_mask, u32 flags);
 static inline void report_invalid_psscr_val(u64 psscr_val, int err)
 {
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 4ee837e..fafafa3 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -265,6 +265,33 @@ static void power9_idle(void)
 u64 pnv_deepest_stop_psscr_val;
 u64 pnv_deepest_stop_psscr_mask;
 
+#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * pnv_cpu_offline: A function that puts the CPU into the deepest
+ * available platform idle state on a CPU-Offline.
+ */
+unsigned long pnv_cpu_offline(unsigned int cpu)
+{
+	unsigned long srr1;
+
+	u32 idle_states = pnv_get_supported_cpuidle_states();
+
+	if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+		srr1 = power9_idle_stop(pnv_deepest_stop_psscr_val,
+					pnv_deepest_stop_psscr_mask);
+	} else if (idle_states & OPAL_PM_WINKLE_ENABLED) {
+		srr1 = power7_winkle();
+	} else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
+		   (idle_states & OPAL_PM_SLEEP_ENABLED_ER1)) {
+		srr1 = power7_sleep();
+	} else {
+		srr1 = power7_nap(1);
+	}
+
+	return srr1;
+}
+#endif
+
 /*
  * Power ISA 3.0 idle initialization.
  *
diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h
index 6130522..6dbc0a1 100644
--- a/arch/powerpc/platforms/powernv/powernv.h
+++ b/arch/powerpc/platforms/powernv/powernv.h
@@ -18,8 +18,6 @@ static inline void pnv_pci_shutdown(void) { }
 #endif
 
 extern u32 pnv_get_supported_cpuidle_states(void);
-extern u64 pnv_deepest_stop_psscr_val;
-extern u64 pnv_deepest_stop_psscr_mask;
 
 extern void pnv_lpc_init(void);
 
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 8b67e1e..914b456 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -35,6 +35,7 @@
 #include <asm/dbell.h>
 #include <asm/kvm_ppc.h>
 #include <asm/ppc-opcode.h>
+#include <asm/cpuidle.h>
 
 #include "powernv.h"
 
@@ -140,7 +141,6 @@ static void pnv_smp_cpu_kill_self(void)
 {
 	unsigned int cpu;
 	unsigned long srr1, wmask;
-	u32 idle_states;
 
 	/* Standard hot unplug procedure */
 	local_irq_disable();
@@ -155,8 +155,6 @@ static void pnv_smp_cpu_kill_self(void)
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		wmask = SRR1_WAKEMASK_P8;
 
-	idle_states = pnv_get_supported_cpuidle_states();
-
 	/* We don't want to take decrementer interrupts while we are offline,
 	 * so clear LPCR:PECE1. We keep PECE2 (and LPCR_PECE_HVEE on P9)
 	 * enabled as to let IPIs in.
@@ -184,19 +182,7 @@ static void pnv_smp_cpu_kill_self(void)
 		kvmppc_set_host_ipi(cpu, 0);
 
 		ppc64_runlatch_off();
-
-		if (cpu_has_feature(CPU_FTR_ARCH_300)) {
-			srr1 = power9_idle_stop(pnv_deepest_stop_psscr_val,
-						pnv_deepest_stop_psscr_mask);
-		} else if (idle_states & OPAL_PM_WINKLE_ENABLED) {
-			srr1 = power7_winkle();
-		} else if ((idle_states & OPAL_PM_SLEEP_ENABLED) ||
-			   (idle_states & OPAL_PM_SLEEP_ENABLED_ER1)) {
-			srr1 = power7_sleep();
-		} else {
-			srr1 = power7_nap(1);
-		}
-
+		srr1 = pnv_cpu_offline(cpu);
 		ppc64_runlatch_on();
 
 		/*
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [v3, 1/4] powernv: Move CPU-Offline idle state invocation from smp.c to idle.c
  2017-03-22 15:04 ` [v3 PATCH 1/4] powernv: Move CPU-Offline idle state invocation from smp.c to idle.c Gautham R. Shenoy
@ 2017-04-12 13:31   ` Michael Ellerman
  0 siblings, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2017-04-12 13:31 UTC (permalink / raw)
  To: Gautham R. Shenoy, Michael Neuling, Benjamin Herrenschmidt,
	Shreyas B. Prabhu, Shilpasri G Bhat, Vaidyanathan Srinivasan,
	Anton Blanchard, Balbir Singh, Akshay Adiga, Nicholas Piggin,
	Mahesh J Salgaonkar
  Cc: Gautham R. Shenoy, linuxppc-dev, linux-kernel

On Wed, 2017-03-22 at 15:04:14 UTC, "Gautham R. Shenoy" wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> Move the piece of code in powernv/smp.c::pnv_smp_cpu_kill_self() which
> transitions the CPU to the deepest available platform idle state to a
> new function named pnv_cpu_offline() in powernv/idle.c. The rationale
> behind this code movement is that the data required to determine the
> deepest available platform state resides in powernv/idle.c.
> 
> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/a7cd88da97040513e17cd77ae3e577

cheers

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-04-12 13:32 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-22 15:04 [v3 PATCH 0/4] powernv:idle: Fixes for CPU-Hotplug on POWER DD1.0 Gautham R. Shenoy
2017-03-22 15:04 ` [v3 PATCH 1/4] powernv: Move CPU-Offline idle state invocation from smp.c to idle.c Gautham R. Shenoy
2017-04-12 13:31   ` [v3, " Michael Ellerman
2017-03-22 15:04 ` [v3 PATCH 2/4] powernv:smp: Add busy-wait loop as fall back for CPU-Hotplug Gautham R. Shenoy
2017-03-27 11:43   ` Michael Ellerman
2017-03-30  9:19     ` Gautham R Shenoy
2017-03-22 15:04 ` [v3 PATCH 3/4] powernv:idle: Don't override default/deepest directly in kernel Gautham R. Shenoy
2017-03-22 15:04 ` [v3 PATCH 4/4] powernv: Recover correct PACA on wakeup from a stop on P9 DD1 Gautham R. Shenoy
2017-03-22 16:06   ` Nicholas Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).