From: Michael Ellerman <mpe@ellerman.id.au>
To: "Shreyas B. Prabhu" <shreyas@linux.vnet.ibm.com>
Cc: ego@linux.vnet.ibm.com, mikey@neuling.org, benh@au1.ibm.com,
"Shreyas B. Prabhu" <shreyas@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org, maddy@linux.vnet.ibm.com,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [v6, 08/11] powerpc/powernv: Add platform support for stop instruction
Date: Wed, 15 Jun 2016 21:14:51 +1000 (AEST) [thread overview]
Message-ID: <3rV3nm07hsz9t1V@ozlabs.org> (raw)
In-Reply-To: <1465404871-5406-9-git-send-email-shreyas@linux.vnet.ibm.com>
Hi Shreyas,
Comments inline ...
On Wed, 2016-08-06 at 16:54:28 UTC, "Shreyas B. Prabhu" wrote:
> POWER ISA v3 defines a new idle processor core mechanism. In summary,
> a) new instruction named stop is added. This instruction replaces
> instructions like nap, sleep, rvwinkle.
> b) new per thread SPR named Processor Stop Status and Control Register
> (PSSCR) is added which controls the behavior of stop instruction.
>
> PSSCR layout:
> ----------------------------------------------------------
> | PLS | /// | SD | ESL | EC | PSLL | /// | TR | MTL | RL |
> ----------------------------------------------------------
> 0 4 41 42 43 44 48 54 56 60
>
> PSSCR key fields:
> Bits 0:3 - Power-Saving Level Status. This field indicates the lowest
> power-saving state the thread entered since stop instruction was last
> executed.
>
> Bit 42 - Enable State Loss
> 0 - No state is lost irrespective of other fields
> 1 - Allows state loss
>
> Bits 44:47 - Power-Saving Level Limit
> This limits the power-saving level that can be entered into.
>
> Bits 60:63 - Requested Level
> Used to specify which power-saving level must be entered on executing
> stop instruction
That would probably be good as a comment somewhere too, maybe idle.c
> diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
> index d2f99ca..3d7fc06 100644
> --- a/arch/powerpc/include/asm/cpuidle.h
> +++ b/arch/powerpc/include/asm/cpuidle.h
> @@ -13,6 +13,8 @@
> #ifndef __ASSEMBLY__
> extern u32 pnv_fastsleep_workaround_at_entry[];
> extern u32 pnv_fastsleep_workaround_at_exit[];
> +
> +extern u64 pnv_first_deep_stop_state;
Should this have some safe initial value?
> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
> index 6bdcd0d..ae3b155 100644
> --- a/arch/powerpc/include/asm/machdep.h
> +++ b/arch/powerpc/include/asm/machdep.h
> @@ -262,6 +262,7 @@ struct machdep_calls {
> extern void e500_idle(void);
> extern void power4_idle(void);
> extern void power7_idle(void);
> +extern void power_stop0(void);
Can that have a better name please?
> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
> index 9bb8ddf..7f3f8c6 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -162,13 +162,20 @@
>
> /* Device tree flags */
>
> -/* Flags set in power-mgmt nodes in device tree if
> - * respective idle states are supported in the platform.
> +/*
> + * Flags set in power-mgmt nodes in device tree describing
> + * idle states that are supported in the platform.
> */
> +
> +#define OPAL_PM_TIMEBASE_STOP 0x00000002
> +#define OPAL_PM_LOSE_HYP_CONTEXT 0x00002000
> +#define OPAL_PM_LOSE_FULL_CONTEXT 0x00004000
> #define OPAL_PM_NAP_ENABLED 0x00010000
> #define OPAL_PM_SLEEP_ENABLED 0x00020000
> #define OPAL_PM_WINKLE_ENABLED 0x00040000
> #define OPAL_PM_SLEEP_ENABLED_ER1 0x00080000 /* with workaround */
> +#define OPAL_PM_STOP_INST_FAST 0x00100000
> +#define OPAL_PM_STOP_INST_DEEP 0x00200000
I don't see the above in skiboot yet?
> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
> index 546540b..ae91b44 100644
> --- a/arch/powerpc/include/asm/paca.h
> +++ b/arch/powerpc/include/asm/paca.h
> @@ -171,6 +171,8 @@ struct paca_struct {
> /* Mask to denote subcore sibling threads */
> u8 subcore_sibling_mask;
> #endif
> + /* Template for PSSCR with EC, ESL, TR, PSLL, MTL fields set */
> + u64 thread_psscr;
I'm not entirely clear on why that needs to be in the paca. Could it not be global?
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index 009fab1..7f92fc8 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -457,6 +457,7 @@ extern int powersave_nap; /* set if nap mode can be used in idle loop */
> extern unsigned long power7_nap(int check_irq);
> extern unsigned long power7_sleep(void);
> extern unsigned long power7_winkle(void);
> +extern unsigned long power_stop(unsigned long state);
Can that also have a better name?
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index a0948f4..89a00d9 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -145,6 +145,16 @@
> #define MSR_64BIT 0
> #endif
>
> +/* Power Management - PSSCR Fields */
> +#define PSSCR_RL_MASK 0x0000000F
> +#define PSSCR_MTL_MASK 0x000000F0
> +#define PSSCR_TR_MASK 0x00000300
> +#define PSSCR_PSLL_MASK 0x000F0000
> +#define PSSCR_EC 0x00100000
> +#define PSSCR_ESL 0x00200000
> +#define PSSCR_SD 0x00400000
Can we get a comment for each #define saying what it is?
> @@ -288,6 +298,7 @@
> #define SPRN_PMICR 0x354 /* Power Management Idle Control Reg */
> #define SPRN_PMSR 0x355 /* Power Management Status Reg */
> #define SPRN_PMMAR 0x356 /* Power Management Memory Activity Register */
> +#define SPRN_PSSCR 0x357 /* Processor Stop Status and Control Register */
Can you add ISA 3, eg:
#define SPRN_PSSCR 0x357 /* Processor Stop Status and Control Register (ISA 3.0) */
I know we haven't been very consistent about that in the past, but we can always try :)
> @@ -761,6 +772,9 @@
> #define SIER_SDAR_VALID 0x0200000 /* SDAR contents valid */
> #define SPRN_SIAR 796
> #define SPRN_SDAR 797
> +#define SPRN_LMRR 813
> +#define SPRN_LMSER 814
> +#define SPRN_ASDR 816
Ditto, comments with ISA 3 please.
> diff --git a/arch/powerpc/kernel/idle_power_common.S b/arch/powerpc/kernel/idle_power_common.S
> index 2f909a1..c6c2f66 100644
> --- a/arch/powerpc/kernel/idle_power_common.S
> +++ b/arch/powerpc/kernel/idle_power_common.S
> @@ -50,6 +55,15 @@
> IDLE_INST; \
> b .
>
> +/*
> + * rA - Requested stop state
> + * rB - Spare reg that can be used
> + */
> +#define PSSCR_REQUEST_STATE(rA, rB) \
> + ld rB, PACA_THREAD_PSSCR(r13); \
> + or rB,rB,rA; \
> + mtspr SPRN_PSSCR, rB;
I only see this used once, so it can just be inline.
> +
> .text
>
> /*
> @@ -61,8 +75,19 @@ save_sprs_to_stack:
> * Note all register i.e per-core, per-subcore or per-thread is saved
> * here since any thread in the core might wake up first
> */
> +BEGIN_FTR_SECTION
> + mfspr r3,SPRN_PTCR
> + std r3,_PTCR(r1)
> + mfspr r3,SPRN_LMRR
> + std r3,_LMRR(r1)
> + mfspr r3,SPRN_LMSER
> + std r3,_LMSER(r1)
> + mfspr r3,SPRN_ASDR
> + std r3,_ASDR(r1)
> +FTR_SECTION_ELSE
A comment here saying that SDR1 is removed in ISA 3.0 would be helpful.
> mfspr r3,SPRN_SDR1
> std r3,_SDR1(r1)
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
> @@ -293,6 +354,21 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66); \
>
>
> /*
> + * Used for ppc_md.power_save which needs a function with no parameters
> + */
> +_GLOBAL(power_stop0)
> + li r3,0
Zero?
> + /* Fall through to power_stop */
I think I'd rather you just did that as a C function.
> +/*
> + * r3 - requested stop state
> + */
> +_GLOBAL(power_stop)
> + PSSCR_REQUEST_STATE(r3,r4)
> + li r4, 1
> + LOAD_REG_ADDR(r5,power_enter_stop)
> + b pnv_powersave_common
> + /* No return */
> +/*
> * Called from reset vector. Check whether we have woken up with
> * hypervisor state loss. If yes, restore hypervisor state and return
> * back to reset vector.
> @@ -301,7 +377,32 @@ ALT_FTR_SECTION_END_NESTED_IFSET(CPU_FTR_ARCH_207S, 66); \
> * cr3 - set to gt if waking up with partial/complete hypervisor state loss
> */
> _GLOBAL(pnv_restore_hyp_resource)
> +BEGIN_FTR_SECTION
> /*
> + * POWER ISA 3. Use PSSCR to determine if we
> + * are waking up from deep idle state
> + */
> + LOAD_REG_ADDRBASE(r5,pnv_first_deep_stop_state)
That's an @got load using r2, but have we restored r2 yet?
> + ld r4,ADDROFF(pnv_first_deep_stop_state)(r5)
> +
> + mfspr r5,SPRN_PSSCR
> @@ -397,8 +507,11 @@ first_thread_in_subcore:
> bne cr4,subcore_state_restored
>
> /* Restore per-subcore state */
We don't have subcores on P9, or did I miss a memo?
A comment somewhere explaining that would help I think, it's not clear AFAICS.
> +BEGIN_FTR_SECTION
> ld r4,_SDR1(r1)
> mtspr SPRN_SDR1,r4
> +END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
> +
> ld r4,_RPR(r1)
> mtspr SPRN_RPR,r4
> ld r4,_AMOR(r1)
> @@ -477,6 +601,21 @@ common_exit:
> slbmte r6,r5
> 1: addi r8,r8,16
> .endr
> +END_MMU_FTR_SECTION_IFCLR(MMU_FTR_RADIX)
> +
> + /* Restore per thread state */
> +BEGIN_FTR_SECTION
> + bl __restore_cpu_power9
> +
> + ld r4,_LMRR(r1)
> + mtspr SPRN_LMRR,r4
> + ld r4,_LMSER(r1)
> + mtspr SPRN_LMSER,r4
> + ld r4,_ASDR(r1)
> + mtspr SPRN_ASDR,r4
Should those be in __restore_cpu_power9 ?
> +FTR_SECTION_ELSE
> + bl __restore_cpu_power8
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
Then we could potentially do the above by calling through cur_cpu_spec.
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index fbb09fb..bfbd359 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -27,9 +27,11 @@
> #include "powernv.h"
> #include "subcore.h"
>
> +#define MAX_STOP_STATE 0xF
Says who?
> @@ -130,8 +136,8 @@ static void pnv_alloc_idle_core_states(void)
>
> update_subcore_sibling_mask();
>
> - if (supported_cpuidle_states & OPAL_PM_WINKLE_ENABLED)
> - pnv_save_sprs_for_winkle();
> + if (supported_cpuidle_states & OPAL_PM_LOSE_FULL_CONTEXT)
> + pnv_save_sprs_for_deep_states();
How does that work on old skiboot that doesn't set OPAL_PM_LOSE_FULL_CONTEXT but
still sets OPAL_PM_WINKLE_ENABLED?
> }
>
> u32 pnv_get_supported_cpuidle_states(void)
> @@ -230,11 +236,18 @@ static DEVICE_ATTR(fastsleep_workaround_applyonce, 0600,
> show_fastsleep_workaround_applyonce,
> store_fastsleep_workaround_applyonce);
>
> +/*
> + * First deep stop state. Used to figure out when to save/restore
> + * hypervisor context.
> + */
> +u64 pnv_first_deep_stop_state;
> +
> static int __init pnv_init_idle_states(void)
> {
> struct device_node *power_mgt;
I prefer just "np" - it's shorter and I immediately know what it is.
> int dt_idle_states;
> u32 *flags;
> + u64 *psscr_val = NULL;
> int i;
>
> supported_cpuidle_states = 0;
> @@ -264,6 +277,32 @@ static int __init pnv_init_idle_states(void)
> goto out_free;
> }
>
> + if (cpu_has_feature(CPU_FTR_ARCH_300)) {
> + psscr_val = kcalloc(dt_idle_states, sizeof(*psscr_val),
> + GFP_KERNEL);
> + if (!psscr_val)
> + goto out_free;
Newline please.
> + if (of_property_read_u64_array(power_mgt,
> + "ibm,cpu-idle-state-psscr",
> + psscr_val, dt_idle_states)) {
> + pr_warn("cpuidle-powernv: missing ibm,cpu-idle-states-psscr in DT\n");
> + goto out_free_psscr;
> + }
> +
> + /*
> + * Set pnv_first_deep_stop_state to the first stop level
> + * to cause hypervisor state loss
> + */
> + pnv_first_deep_stop_state = MAX_STOP_STATE;
> + for (i = 0; i < dt_idle_states; i++) {
> + u64 psscr_rl = psscr_val[i] & PSSCR_RL_MASK;
> +
> + if ((flags[i] & OPAL_PM_LOSE_FULL_CONTEXT) &&
> + (pnv_first_deep_stop_state > psscr_rl))
> + pnv_first_deep_stop_state = psscr_rl;
> + }
> + }
This function is >110 lines, which is too big for my taste.
The above would be better as a separate function I think.
> for (i = 0; i < dt_idle_states; i++)
> supported_cpuidle_states |= flags[i];
>
> @@ -286,8 +325,29 @@ static int __init pnv_init_idle_states(void)
>
> pnv_alloc_idle_core_states();
>
> + if (supported_cpuidle_states & OPAL_PM_STOP_INST_FAST)
> + for_each_possible_cpu(i) {
> +
> + u64 psscr_init_val = PSSCR_ESL | PSSCR_EC |
> + PSSCR_PSLL_MASK | PSSCR_TR_MASK |
> + PSSCR_MTL_MASK;
> +
> + paca[i].thread_psscr = psscr_init_val;
> + /*
> + * Memory barrier to ensure that the writes to PACA
> + * goes through before ppc_md.power_save is updated
> + * below.
> + */
> + mb();
> + }
And likewise that loop.
cheers
next prev parent reply other threads:[~2016-06-15 11:14 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-08 16:54 [PATCH v6 00/11] powerpc/powernv/cpuidle: Add support for POWER ISA v3 idle states Shreyas B. Prabhu
2016-06-08 16:54 ` Shreyas B. Prabhu
2016-06-08 16:54 ` [PATCH v6 01/11] powerpc/powernv: Use PNV_THREAD_WINKLE macro while requesting for winkle Shreyas B. Prabhu
2016-06-08 16:54 ` [PATCH v6 02/11] powerpc/kvm: make hypervisor state restore a function Shreyas B. Prabhu
2016-06-08 16:54 ` [PATCH v6 03/11] powerpc/powernv: Rename idle_power7.S to idle_power_common.S Shreyas B. Prabhu
2016-06-15 5:31 ` [v6, " Michael Ellerman
2016-06-08 16:54 ` [PATCH v6 04/11] powerpc/powernv: Rename reusable idle functions to hardware agnostic names Shreyas B. Prabhu
2016-06-08 16:54 ` [PATCH v6 05/11] powerpc/powernv: Make pnv_powersave_common more generic Shreyas B. Prabhu
2016-06-08 16:54 ` [PATCH v6 06/11] powerpc/powernv: abstraction for saving SPRs before entering deep idle states Shreyas B. Prabhu
2016-06-08 16:54 ` [PATCH v6 07/11] powerpc/powernv: set power_save func after the idle states are initialized Shreyas B. Prabhu
2016-06-15 5:41 ` [v6, " Michael Ellerman
2016-06-15 6:11 ` Shreyas B Prabhu
2016-06-22 1:54 ` [PATCH v6 " Benjamin Herrenschmidt
2016-06-22 5:11 ` Michael Neuling
2016-06-22 5:11 ` Michael Neuling
2016-06-28 12:10 ` [v6, " Michael Ellerman
2016-06-08 16:54 ` [PATCH v6 08/11] powerpc/powernv: Add platform support for stop instruction Shreyas B. Prabhu
2016-06-15 11:14 ` Michael Ellerman [this message]
2016-06-15 13:00 ` [v6, " Shreyas B Prabhu
2016-06-21 6:14 ` Michael Neuling
2016-06-21 6:14 ` Michael Neuling
2016-06-08 16:54 ` [PATCH v6 09/11] cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES Shreyas B. Prabhu
2016-06-13 15:01 ` Daniel Lezcano
2016-06-13 20:58 ` Rafael J. Wysocki
2016-06-08 16:54 ` [PATCH v6 10/11] cpuidle/powernv: Add support for POWER ISA v3 idle states Shreyas B. Prabhu
2016-06-13 15:34 ` Daniel Lezcano
2016-06-14 10:47 ` Shreyas B Prabhu
2016-06-14 11:29 ` Benjamin Herrenschmidt
2016-06-15 4:57 ` Shreyas B Prabhu
2016-06-13 21:48 ` Benjamin Herrenschmidt
2016-06-14 11:11 ` Shreyas B Prabhu
2016-06-14 11:31 ` Benjamin Herrenschmidt
2016-06-14 11:31 ` Benjamin Herrenschmidt
2016-06-08 16:54 ` [PATCH v6 11/11] powerpc/powernv: Use deepest stop state when cpu is offlined Shreyas B. Prabhu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3rV3nm07hsz9t1V@ozlabs.org \
--to=mpe@ellerman.id.au \
--cc=benh@au1.ibm.com \
--cc=ego@linux.vnet.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=maddy@linux.vnet.ibm.com \
--cc=mikey@neuling.org \
--cc=shreyas@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.