* [PATCH v5 0/2] x86/fpu: Make AMX state ready for CPU idle
From: Chang S. Bae @ 2022-06-08 16:47 UTC
To: linux-kernel, x86, linux-pm
Cc: tglx, dave.hansen, peterz, bp, rafael, riel, bigeasy, hch, fenghua.yu,
    rui.zhang, artem.bityutskiy, jacob.jun.pan, lenb, chang.seok.bae

Here is the fifth version of this series. I've addressed Dave's comment [2],
assuming that the change makes sense to folks:
* Check the AMX_TILE feature bit instead of XGETBV1.
* Massage the changelog accordingly.

While many people have had their eyeballs on this, only Rafael's ACK has
been given so far. Hopefully this can attract more acknowledgment or
endorsement if it looks fine.

=== Cover Letter ===

AMX state is a large state (at least 8KB). Entering CPU idle with this
large state non-initialized may result in a shallower idle state even
though a deeper low-power state is available. We can confirm this behavior
is implementation-specific. Section 3.3 in [3] will be updated to clarify
this.

This patch set ensures the AMX state is initialized before entering the CPU
idle state.

The patch set is based on 5.19-rc1. It is also available here:
    git://github.com/intel/amx-linux.git tilerelease

[1]: V4 https://lore.kernel.org/lkml/20220517222430.24524-1-chang.seok.bae@intel.com/
[2]: https://lore.kernel.org/lkml/25a2a82f-b5e5-0fce-86c8-03d7da5fcdd1@intel.com/
[3]: Intel Architecture Instruction Set Extension Programming Reference
     May 2021, https://software.intel.com/content/dam/develop/external/us/en/documents-tps/architecture-instruction-set-extensions-programming-reference.pdf

Chang S. Bae (2):
  x86/fpu: Add a helper to prepare AMX state for low-power CPU idle
  intel_idle: Add a new flag to initialize the AMX state

 arch/x86/include/asm/fpu/api.h       |  2 ++
 arch/x86/include/asm/special_insns.h |  9 +++++++++
 arch/x86/kernel/fpu/core.c           | 14 ++++++++++++++
 drivers/idle/intel_idle.c            | 18 ++++++++++++++++--
 4 files changed, 41 insertions(+), 2 deletions(-)

base-commit: f2906aa863381afb0015a9eb7fefad885d4e5a56
-- 
2.17.1

* [PATCH v5 1/2] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle
From: Chang S. Bae @ 2022-06-08 16:47 UTC
To: linux-kernel, x86, linux-pm
Cc: tglx, dave.hansen, peterz, bp, rafael, riel, bigeasy, hch, fenghua.yu,
    rui.zhang, artem.bityutskiy, jacob.jun.pan, lenb, chang.seok.bae

When a CPU enters an idle state, a non-initialized AMX register state may
prevent it from reaching a deeper low-power state. Other extended register
states, whether initialized or not, do not impact the CPU idle state.

The new helper can ensure the AMX state is initialized before the CPU is
idle, and it will be used by the intel idle driver.

Check the AMX_TILE feature bit before using XGETBV1 as a chain of
dependencies was established via cpuid_deps[]: AMX->XFD->XGETBV1.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Rik van Riel <riel@fb.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
---
Changes from v4:
* Switch to check the AMX_TILE flag instead of XGETBV1 (Dave Hansen).
* Massage the changelog.

Changes from v3:
* Call out AMX state in changelog (Thomas Gleixner).

Changes from v2:
* Check the feature flag instead of fpu_state_size_dynamic() (Dave Hansen).

Changes from v1:
* Check the dynamic state flag first, to avoid #UD with XGETBV(1).
---
 arch/x86/include/asm/fpu/api.h       |  2 ++
 arch/x86/include/asm/special_insns.h |  9 +++++++++
 arch/x86/kernel/fpu/core.c           | 14 ++++++++++++++
 3 files changed, 25 insertions(+)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index 6b0f31fb53f7..503a577814b2 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -164,4 +164,6 @@ static inline bool fpstate_is_confidential(struct fpu_guest *gfpu)
 /* prctl */
 extern long fpu_xstate_prctl(int option, unsigned long arg2);
 
+extern void fpu_idle_fpregs(void);
+
 #endif /* _ASM_X86_FPU_API_H */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 45b18eb94fa1..35f709f619fb 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -295,6 +295,15 @@ static inline int enqcmds(void __iomem *dst, const void *src)
 	return 0;
 }
 
+static inline void tile_release(void)
+{
+	/*
+	 * Instruction opcode for TILERELEASE; supported in binutils
+	 * version >= 2.36.
+	 */
+	asm volatile(".byte 0xc4, 0xe2, 0x78, 0x49, 0xc0");
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_X86_SPECIAL_INSNS_H */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 0531d6a06df5..3b28c5b25e12 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -851,3 +851,17 @@ int fpu__exception_code(struct fpu *fpu, int trap_nr)
 	 */
 	return 0;
 }
+
+/*
+ * Initialize register state that may prevent from entering low-power idle.
+ * This function will be invoked from the cpuidle driver only when needed.
+ */
+void fpu_idle_fpregs(void)
+{
+	/* Note: AMX_TILE being enabled implies XGETBV1 support */
+	if (cpu_feature_enabled(X86_FEATURE_AMX_TILE) &&
+	    (xfeatures_in_use() & XFEATURE_MASK_XTILE)) {
+		tile_release();
+		fpregs_deactivate(&current->thread.fpu);
+	}
+}
-- 
2.17.1

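As a side note on the raw bytes used in tile_release(): the sequence can be
read as a three-byte VEX prefix followed by an opcode and a ModRM byte. The
user-space sketch below splits it into those fields. It is only an
illustration; the names vex3, vex3_decode and tilerelease_example are made
up here and do not exist in the kernel, and the field layout is the generic
VEX encoding rather than anything taken from this patch. The values it
checks correspond to the form VEX.128.NP.0F38.W0 49 C0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fields of a three-byte VEX prefix plus opcode and ModRM (illustrative type). */
struct vex3 {
	bool r, x, b;	/* logical values; the prefix stores them inverted */
	uint8_t map;	/* mmmmm: 1 = 0F, 2 = 0F38, 3 = 0F3A */
	bool w;		/* VEX.W */
	uint8_t vvvv;	/* logical value; the prefix stores it inverted */
	bool l;		/* 0 = 128-bit, 1 = 256-bit */
	uint8_t pp;	/* 0 = none, 1 = 66, 2 = F3, 3 = F2 */
	uint8_t opcode;
	uint8_t modrm;
};

/* Decode "C4 P1 P2 opcode modrm"; returns false if the first byte is not C4. */
static bool vex3_decode(const uint8_t insn[5], struct vex3 *out)
{
	if (insn[0] != 0xc4)
		return false;

	out->r = !(insn[1] & 0x80);
	out->x = !(insn[1] & 0x40);
	out->b = !(insn[1] & 0x20);
	out->map = insn[1] & 0x1f;
	out->w = (insn[2] & 0x80) != 0;
	out->vvvv = ((insn[2] >> 3) & 0x0f) ^ 0x0f;
	out->l = (insn[2] & 0x04) != 0;
	out->pp = insn[2] & 0x03;
	out->opcode = insn[3];
	out->modrm = insn[4];
	return true;
}

/* Decode the bytes emitted by tile_release() and check each field. */
void tilerelease_example(void)
{
	static const uint8_t insn[5] = { 0xc4, 0xe2, 0x78, 0x49, 0xc0 };
	struct vex3 v;

	assert(vex3_decode(insn, &v));
	assert(v.map == 2);		/* 0F38 opcode map */
	assert(!v.w && !v.l);		/* W0, 128-bit */
	assert(v.pp == 0);		/* no implied legacy prefix */
	assert(v.vvvv == 0);		/* no extra register operand */
	assert(v.opcode == 0x49 && v.modrm == 0xc0);
}
```

Calling tilerelease_example() from a small main() is enough to run the
checks; an assembler that knows the mnemonic (the patch comment mentions
binutils >= 2.36) would make the .byte form unnecessary.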
* [tip: x86/fpu] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle 2022-06-08 16:47 ` [PATCH v5 1/2] x86/fpu: Add a helper to prepare AMX state for low-power " Chang S. Bae @ 2022-06-08 19:32 ` tip-bot2 for Chang S. Bae 2022-06-14 22:46 ` tip-bot2 for Chang S. Bae ` (2 subsequent siblings) 3 siblings, 0 replies; 16+ messages in thread From: tip-bot2 for Chang S. Bae @ 2022-06-08 19:32 UTC (permalink / raw) To: linux-tip-commits; +Cc: Chang S. Bae, Dave Hansen, x86, linux-kernel The following commit has been merged into the x86/fpu branch of tip: Commit-ID: 407f1fd0780e39d60778a66c6a157d8c8c832729 Gitweb: https://git.kernel.org/tip/407f1fd0780e39d60778a66c6a157d8c8c832729 Author: Chang S. Bae <chang.seok.bae@intel.com> AuthorDate: Wed, 08 Jun 2022 09:47:47 -07:00 Committer: Dave Hansen <dave.hansen@linux.intel.com> CommitterDate: Wed, 08 Jun 2022 12:03:56 -07:00 x86/fpu: Add a helper to prepare AMX state for low-power CPU idle When a CPU enters an idle state, a non-initialized AMX register state may be the cause of preventing a deeper low-power state. Other extended register states whether initialized or not do not impact the CPU idle state. The new helper can ensure the AMX state is initialized before the CPU is idle, and it will be used by the intel idle driver. Check the AMX_TILE feature bit before using XGETBV1 as a chain of dependencies was established via cpuid_deps[]: AMX->XFD->XGETBV1. Signed-off-by: Chang S. 
Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20220608164748.11864-2-chang.seok.bae@intel.com --- arch/x86/include/asm/fpu/api.h | 2 ++ arch/x86/include/asm/special_insns.h | 9 +++++++++ arch/x86/kernel/fpu/core.c | 14 ++++++++++++++ 3 files changed, 25 insertions(+) diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h index c83b302..df48912 100644 --- a/arch/x86/include/asm/fpu/api.h +++ b/arch/x86/include/asm/fpu/api.h @@ -165,4 +165,6 @@ static inline bool fpstate_is_confidential(struct fpu_guest *gfpu) struct task_struct; extern long fpu_xstate_prctl(struct task_struct *tsk, int option, unsigned long arg2); +extern void fpu_idle_fpregs(void); + #endif /* _ASM_X86_FPU_API_H */ diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index 68c257a..d434fba 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -294,6 +294,15 @@ static inline int enqcmds(void __iomem *dst, const void *src) return 0; } +static inline void tile_release(void) +{ + /* + * Instruction opcode for TILERELEASE; supported in binutils + * version >= 2.36. + */ + asm volatile(".byte 0xc4, 0xe2, 0x78, 0x49, 0xc0"); +} + #endif /* __KERNEL__ */ #endif /* _ASM_X86_SPECIAL_INSNS_H */ diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index c049561..209cfac 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -851,3 +851,17 @@ int fpu__exception_code(struct fpu *fpu, int trap_nr) */ return 0; } + +/* + * Initialize register state that may prevent from entering low-power idle. + * This function will be invoked from the cpuidle driver only when needed. 
+ */
+void fpu_idle_fpregs(void)
+{
+	/* Note: AMX_TILE being enabled implies XGETBV1 support */
+	if (cpu_feature_enabled(X86_FEATURE_AMX_TILE) &&
+	    (xfeatures_in_use() & XFEATURE_MASK_XTILE)) {
+		tile_release();
+		fpregs_deactivate(&current->thread.fpu);
+	}
+}

* [tip: x86/fpu] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle 2022-06-08 16:47 ` [PATCH v5 1/2] x86/fpu: Add a helper to prepare AMX state for low-power " Chang S. Bae 2022-06-08 19:32 ` [tip: x86/fpu] " tip-bot2 for Chang S. Bae @ 2022-06-14 22:46 ` tip-bot2 for Chang S. Bae 2022-06-14 22:53 ` tip-bot2 for Chang S. Bae 2022-07-19 17:31 ` tip-bot2 for Chang S. Bae 3 siblings, 0 replies; 16+ messages in thread From: tip-bot2 for Chang S. Bae @ 2022-06-14 22:46 UTC (permalink / raw) To: linux-tip-commits; +Cc: Chang S. Bae, Dave Hansen, x86, linux-kernel The following commit has been merged into the x86/fpu branch of tip: Commit-ID: 012b91af28e4d83240e053e16af3528a902b0d84 Gitweb: https://git.kernel.org/tip/012b91af28e4d83240e053e16af3528a902b0d84 Author: Chang S. Bae <chang.seok.bae@intel.com> AuthorDate: Wed, 08 Jun 2022 09:47:47 -07:00 Committer: Dave Hansen <dave.hansen@linux.intel.com> CommitterDate: Tue, 14 Jun 2022 15:42:41 -07:00 x86/fpu: Add a helper to prepare AMX state for low-power CPU idle When a CPU enters an idle state, a non-initialized AMX register state may be the cause of preventing a deeper low-power state. Other extended register states whether initialized or not do not impact the CPU idle state. The new helper can ensure the AMX state is initialized before the CPU is idle, and it will be used by the intel idle driver. Check the AMX_TILE feature bit before using XGETBV1 as a chain of dependencies was established via cpuid_deps[]: AMX->XFD->XGETBV1. Signed-off-by: Chang S. 
Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20220608164748.11864-2-chang.seok.bae@intel.com --- arch/x86/include/asm/fpu/api.h | 2 ++ arch/x86/include/asm/special_insns.h | 9 +++++++++ arch/x86/kernel/fpu/core.c | 14 ++++++++++++++ 3 files changed, 25 insertions(+) diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h index 6b0f31f..503a577 100644 --- a/arch/x86/include/asm/fpu/api.h +++ b/arch/x86/include/asm/fpu/api.h @@ -164,4 +164,6 @@ static inline bool fpstate_is_confidential(struct fpu_guest *gfpu) /* prctl */ extern long fpu_xstate_prctl(int option, unsigned long arg2); +extern void fpu_idle_fpregs(void); + #endif /* _ASM_X86_FPU_API_H */ diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index 45b18eb..35f709f 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -295,6 +295,15 @@ static inline int enqcmds(void __iomem *dst, const void *src) return 0; } +static inline void tile_release(void) +{ + /* + * Instruction opcode for TILERELEASE; supported in binutils + * version >= 2.36. + */ + asm volatile(".byte 0xc4, 0xe2, 0x78, 0x49, 0xc0"); +} + #endif /* __KERNEL__ */ #endif /* _ASM_X86_SPECIAL_INSNS_H */ diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 0fdc807..8fbbe89 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -851,3 +851,17 @@ int fpu__exception_code(struct fpu *fpu, int trap_nr) */ return 0; } + +/* + * Initialize register state that may prevent from entering low-power idle. + * This function will be invoked from the cpuidle driver only when needed. 
+ */
+void fpu_idle_fpregs(void)
+{
+	/* Note: AMX_TILE being enabled implies XGETBV1 support */
+	if (cpu_feature_enabled(X86_FEATURE_AMX_TILE) &&
+	    (xfeatures_in_use() & XFEATURE_MASK_XTILE)) {
+		tile_release();
+		fpregs_deactivate(&current->thread.fpu);
+	}
+}

* [tip: x86/fpu] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle 2022-06-08 16:47 ` [PATCH v5 1/2] x86/fpu: Add a helper to prepare AMX state for low-power " Chang S. Bae 2022-06-08 19:32 ` [tip: x86/fpu] " tip-bot2 for Chang S. Bae 2022-06-14 22:46 ` tip-bot2 for Chang S. Bae @ 2022-06-14 22:53 ` tip-bot2 for Chang S. Bae 2022-07-19 17:31 ` tip-bot2 for Chang S. Bae 3 siblings, 0 replies; 16+ messages in thread From: tip-bot2 for Chang S. Bae @ 2022-06-14 22:53 UTC (permalink / raw) To: linux-tip-commits; +Cc: Chang S. Bae, Dave Hansen, x86, linux-kernel The following commit has been merged into the x86/fpu branch of tip: Commit-ID: 418bf5f906c33e83e76239748982dc3d2330cf30 Gitweb: https://git.kernel.org/tip/418bf5f906c33e83e76239748982dc3d2330cf30 Author: Chang S. Bae <chang.seok.bae@intel.com> AuthorDate: Wed, 08 Jun 2022 09:47:47 -07:00 Committer: Dave Hansen <dave.hansen@linux.intel.com> CommitterDate: Tue, 14 Jun 2022 15:48:44 -07:00 x86/fpu: Add a helper to prepare AMX state for low-power CPU idle When a CPU enters an idle state, a non-initialized AMX register state may be the cause of preventing a deeper low-power state. Other extended register states whether initialized or not do not impact the CPU idle state. The new helper can ensure the AMX state is initialized before the CPU is idle, and it will be used by the intel idle driver. Check the AMX_TILE feature bit before using XGETBV1 as a chain of dependencies was established via cpuid_deps[]: AMX->XFD->XGETBV1. Signed-off-by: Chang S. 
Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20220608164748.11864-2-chang.seok.bae@intel.com --- arch/x86/include/asm/fpu/api.h | 2 ++ arch/x86/include/asm/special_insns.h | 9 +++++++++ arch/x86/kernel/fpu/core.c | 14 ++++++++++++++ 3 files changed, 25 insertions(+) diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h index 6b0f31f..503a577 100644 --- a/arch/x86/include/asm/fpu/api.h +++ b/arch/x86/include/asm/fpu/api.h @@ -164,4 +164,6 @@ static inline bool fpstate_is_confidential(struct fpu_guest *gfpu) /* prctl */ extern long fpu_xstate_prctl(int option, unsigned long arg2); +extern void fpu_idle_fpregs(void); + #endif /* _ASM_X86_FPU_API_H */ diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index 45b18eb..35f709f 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -295,6 +295,15 @@ static inline int enqcmds(void __iomem *dst, const void *src) return 0; } +static inline void tile_release(void) +{ + /* + * Instruction opcode for TILERELEASE; supported in binutils + * version >= 2.36. + */ + asm volatile(".byte 0xc4, 0xe2, 0x78, 0x49, 0xc0"); +} + #endif /* __KERNEL__ */ #endif /* _ASM_X86_SPECIAL_INSNS_H */ diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 0531d6a..3b28c5b 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -851,3 +851,17 @@ int fpu__exception_code(struct fpu *fpu, int trap_nr) */ return 0; } + +/* + * Initialize register state that may prevent from entering low-power idle. + * This function will be invoked from the cpuidle driver only when needed. 
+ */
+void fpu_idle_fpregs(void)
+{
+	/* Note: AMX_TILE being enabled implies XGETBV1 support */
+	if (cpu_feature_enabled(X86_FEATURE_AMX_TILE) &&
+	    (xfeatures_in_use() & XFEATURE_MASK_XTILE)) {
+		tile_release();
+		fpregs_deactivate(&current->thread.fpu);
+	}
+}

* [tip: x86/fpu] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle 2022-06-08 16:47 ` [PATCH v5 1/2] x86/fpu: Add a helper to prepare AMX state for low-power " Chang S. Bae ` (2 preceding siblings ...) 2022-06-14 22:53 ` tip-bot2 for Chang S. Bae @ 2022-07-19 17:31 ` tip-bot2 for Chang S. Bae 3 siblings, 0 replies; 16+ messages in thread From: tip-bot2 for Chang S. Bae @ 2022-07-19 17:31 UTC (permalink / raw) To: linux-tip-commits Cc: Chang S. Bae, Dave Hansen, Borislav Petkov, x86, linux-kernel The following commit has been merged into the x86/fpu branch of tip: Commit-ID: f17b168734c0fe47343a7502d012266a051f9942 Gitweb: https://git.kernel.org/tip/f17b168734c0fe47343a7502d012266a051f9942 Author: Chang S. Bae <chang.seok.bae@intel.com> AuthorDate: Wed, 08 Jun 2022 09:47:47 -07:00 Committer: Borislav Petkov <bp@suse.de> CommitterDate: Tue, 19 Jul 2022 18:46:15 +02:00 x86/fpu: Add a helper to prepare AMX state for low-power CPU idle When a CPU enters an idle state, a non-initialized AMX register state may be the cause of preventing a deeper low-power state. Other extended register states whether initialized or not do not impact the CPU idle state. The new helper can ensure the AMX state is initialized before the CPU is idle, and it will be used by the intel idle driver. Check the AMX_TILE feature bit before using XGETBV1 as a chain of dependencies was established via cpuid_deps[]: AMX->XFD->XGETBV1. Signed-off-by: Chang S. 
Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20220608164748.11864-2-chang.seok.bae@intel.com --- arch/x86/include/asm/fpu/api.h | 2 ++ arch/x86/include/asm/special_insns.h | 9 +++++++++ arch/x86/kernel/fpu/core.c | 14 ++++++++++++++ 3 files changed, 25 insertions(+) diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h index 6b0f31f..503a577 100644 --- a/arch/x86/include/asm/fpu/api.h +++ b/arch/x86/include/asm/fpu/api.h @@ -164,4 +164,6 @@ static inline bool fpstate_is_confidential(struct fpu_guest *gfpu) /* prctl */ extern long fpu_xstate_prctl(int option, unsigned long arg2); +extern void fpu_idle_fpregs(void); + #endif /* _ASM_X86_FPU_API_H */ diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index 45b18eb..35f709f 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -295,6 +295,15 @@ static inline int enqcmds(void __iomem *dst, const void *src) return 0; } +static inline void tile_release(void) +{ + /* + * Instruction opcode for TILERELEASE; supported in binutils + * version >= 2.36. + */ + asm volatile(".byte 0xc4, 0xe2, 0x78, 0x49, 0xc0"); +} + #endif /* __KERNEL__ */ #endif /* _ASM_X86_SPECIAL_INSNS_H */ diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 0531d6a..3b28c5b 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -851,3 +851,17 @@ int fpu__exception_code(struct fpu *fpu, int trap_nr) */ return 0; } + +/* + * Initialize register state that may prevent from entering low-power idle. + * This function will be invoked from the cpuidle driver only when needed. 
+ */
+void fpu_idle_fpregs(void)
+{
+	/* Note: AMX_TILE being enabled implies XGETBV1 support */
+	if (cpu_feature_enabled(X86_FEATURE_AMX_TILE) &&
+	    (xfeatures_in_use() & XFEATURE_MASK_XTILE)) {
+		tile_release();
+		fpregs_deactivate(&current->thread.fpu);
+	}
+}

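The ordering of the two conditions in fpu_idle_fpregs() matters: the
feature check comes first, so XGETBV(1) is only executed when it is known
to be supported. A minimal user-space model of that guard is sketched
below. Everything in it is a stand-in: struct cpu_model and the model_*
functions are hypothetical, the mask assumes the two AMX components sit at
xstate bits 17 and 18, and the fpregs_deactivate() step of the real helper
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed positions of the AMX components: bit 17 (TILECFG), bit 18 (TILEDATA). */
#define MODEL_XFEATURE_MASK_XTILE	((1ULL << 17) | (1ULL << 18))

/* Hypothetical stand-in for the CPU state the real helper consults. */
struct cpu_model {
	bool amx_tile_enabled;		/* models cpu_feature_enabled(X86_FEATURE_AMX_TILE) */
	uint64_t xinuse;		/* models the value XGETBV(1) would return */
	unsigned int xgetbv1_reads;	/* how often the model "executed" XGETBV(1) */
	unsigned int tile_releases;	/* how often the model "executed" TILERELEASE */
};

static uint64_t model_xfeatures_in_use(struct cpu_model *cpu)
{
	cpu->xgetbv1_reads++;
	return cpu->xinuse;
}

static void model_tile_release(struct cpu_model *cpu)
{
	/* TILERELEASE returns the tile state to its initial configuration. */
	cpu->xinuse &= ~MODEL_XFEATURE_MASK_XTILE;
	cpu->tile_releases++;
}

/* Same shape as the guard in the patch: feature bit first, then in-use check. */
static void model_fpu_idle_fpregs(struct cpu_model *cpu)
{
	if (cpu->amx_tile_enabled &&
	    (model_xfeatures_in_use(cpu) & MODEL_XFEATURE_MASK_XTILE))
		model_tile_release(cpu);
}

void fpu_idle_guard_example(void)
{
	struct cpu_model no_amx = { .amx_tile_enabled = false, .xinuse = 0x7 };
	struct cpu_model amx = { .amx_tile_enabled = true,
				 .xinuse = 0x7 | MODEL_XFEATURE_MASK_XTILE };

	/* Without AMX the short-circuit keeps XGETBV(1) from being touched. */
	model_fpu_idle_fpregs(&no_amx);
	assert(no_amx.xgetbv1_reads == 0 && no_amx.tile_releases == 0);

	/* With live tile state the first call releases it, the second is a no-op. */
	model_fpu_idle_fpregs(&amx);
	model_fpu_idle_fpregs(&amx);
	assert(amx.xgetbv1_reads == 2 && amx.tile_releases == 1);
	assert(amx.xinuse == 0x7);
}
```

The second call in the example shows why the in-use check is worth having:
once the tile state is back in its initial configuration, repeated idle
entries skip the release entirely.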
* [PATCH v5 2/2] intel_idle: Add a new flag to initialize the AMX state
From: Chang S. Bae @ 2022-06-08 16:47 UTC
To: linux-kernel, x86, linux-pm
Cc: tglx, dave.hansen, peterz, bp, rafael, riel, bigeasy, hch, fenghua.yu,
    rui.zhang, artem.bityutskiy, jacob.jun.pan, lenb, chang.seok.bae

The non-initialized AMX state can be the cause of C-state demotion from C6
to C1E. The deeper C6 state may improve power savings and thus result in a
higher available turbo frequency budget.

This behavior is implementation-specific. Initialize the state for the C6
entrance of Sapphire Rapids as needed.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Len Brown <lenb@kernel.org>
---
Changes from v2:
* Remove an unnecessary backslash (Rafael Wysocki).

Changes from v1:
* Simplify the code with a new flag (Rui).
* Rebase on Artem's patches for SPR intel_idle.
* Massage the changelog.
---
 drivers/idle/intel_idle.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index b9bb94bd0f67..5f36c4b28f9d 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -54,6 +54,7 @@
 #include <asm/intel-family.h>
 #include <asm/mwait.h>
 #include <asm/msr.h>
+#include <asm/fpu/api.h>
 
 #define INTEL_IDLE_VERSION "0.5.1"
 
@@ -105,6 +106,11 @@ static unsigned int mwait_substates __initdata;
  */
 #define CPUIDLE_FLAG_ALWAYS_ENABLE	BIT(15)
 
+/*
+ * Initialize large xstate for the C6-state entrance.
+ */
+#define CPUIDLE_FLAG_INIT_XSTATE	BIT(16)
+
 /*
  * MWAIT takes an 8-bit "hint" in EAX "suggesting"
  * the C-state (top nibble) and sub-state (bottom nibble)
@@ -139,6 +145,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
 	if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
 		local_irq_enable();
 
+	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+		fpu_idle_fpregs();
+
 	mwait_idle_with_hints(eax, ecx);
 
 	return index;
@@ -159,8 +168,12 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
 static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
 				       struct cpuidle_driver *drv, int index)
 {
-	unsigned long eax = flg2MWAIT(drv->states[index].flags);
 	unsigned long ecx = 1; /* break on interrupt flag */
+	struct cpuidle_state *state = &drv->states[index];
+	unsigned long eax = flg2MWAIT(state->flags);
+
+	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+		fpu_idle_fpregs();
 
 	mwait_idle_with_hints(eax, ecx);
 
@@ -895,7 +908,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
 	{
 		.name = "C6",
 		.desc = "MWAIT 0x20",
-		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
+					   CPUIDLE_FLAG_INIT_XSTATE,
 		.exit_latency = 290,
 		.target_residency = 800,
 		.enter = &intel_idle,
-- 
2.17.1

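To make the .flags line of the C6 entry easier to read, here is a small
user-space sketch of how an MWAIT hint and the driver-private flag bits can
share one flags word. The macro names are the ones that appear in the
patch, but their definitions here are assumptions made for the sketch: the
hint is assumed to live in bits 31:24, and CPUIDLE_FLAG_TLB_FLUSHED is
given an arbitrary low bit. The helper functions and the example function
are hypothetical. Only the nibble split of the hint (C-state in the top
nibble, sub-state in the bottom nibble) and BIT(16) for
CPUIDLE_FLAG_INIT_XSTATE come from the patch text.

```c
#include <assert.h>

#define BIT(n)				(1U << (n))

/* Assumed layout: the 8-bit MWAIT hint is stored in bits 31:24 of flags. */
#define MWAIT2flg(eax)			(((eax) & 0xFFU) << 24)
#define flg2MWAIT(flags)		(((flags) >> 24) & 0xFFU)

/* Assumed value, chosen only so it stays clear of the other bits used here. */
#define CPUIDLE_FLAG_TLB_FLUSHED	BIT(5)
/* Value as introduced by this version of the patch. */
#define CPUIDLE_FLAG_INIT_XSTATE	BIT(16)

/* Top nibble of the hint: 0x0 means "MWAIT(C1)", 0x1 means "MWAIT(C2)", etc. */
static unsigned int hint_cstate_nibble(unsigned int hint)
{
	return (hint >> 4) & 0xFU;
}

/* Bottom nibble of the hint: the sub-state. */
static unsigned int hint_substate(unsigned int hint)
{
	return hint & 0xFU;
}

void spr_c6_flags_example(void)
{
	unsigned int flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
			     CPUIDLE_FLAG_INIT_XSTATE;
	unsigned int eax = flg2MWAIT(flags);

	/* The hint survives the round trip and the new flag is visible. */
	assert(eax == 0x20);
	assert(flags & CPUIDLE_FLAG_INIT_XSTATE);

	/* The flag bits do not overlap the byte that carries the hint. */
	assert((MWAIT2flg(0xFF) & CPUIDLE_FLAG_INIT_XSTATE) == 0);

	assert(hint_cstate_nibble(eax) == 0x2);
	assert(hint_substate(eax) == 0x0);
}
```

Under these assumptions, adding another flag in the low bits never changes
the hint that ends up in EAX, which is why the C6 entry can simply OR in
CPUIDLE_FLAG_INIT_XSTATE.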
* [tip: x86/fpu] intel_idle: Add a new flag to initialize the AMX state
From: tip-bot2 for Chang S. Bae @ 2022-06-08 19:32 UTC
To: linux-tip-commits
Cc: Zhang Rui, Peter Zijlstra (Intel), Chang S. Bae, Dave Hansen,
    Rafael J. Wysocki, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     43843d58393026fef4a43d192b641a4fabdc42bf
Gitweb:        https://git.kernel.org/tip/43843d58393026fef4a43d192b641a4fabdc42bf
Author:        Chang S. Bae <chang.seok.bae@intel.com>
AuthorDate:    Wed, 08 Jun 2022 09:47:48 -07:00
Committer:     Dave Hansen <dave.hansen@linux.intel.com>
CommitterDate: Wed, 08 Jun 2022 12:04:11 -07:00

intel_idle: Add a new flag to initialize the AMX state

The non-initialized AMX state can be the cause of C-state demotion from C6
to C1E. The deeper C6 state may improve power savings and thus result in a
higher available turbo frequency budget.

This behavior is implementation-specific. Initialize the state for the C6
entrance of Sapphire Rapids as needed.

Tested-by: Zhang Rui <rui.zhang@intel.com>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lkml.kernel.org/r/20220608164748.11864-3-chang.seok.bae@intel.com
---
 drivers/idle/intel_idle.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index b7640cf..d357908 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -54,6 +54,7 @@
 #include <asm/intel-family.h>
 #include <asm/mwait.h>
 #include <asm/msr.h>
+#include <asm/fpu/api.h>
 
 #define INTEL_IDLE_VERSION "0.5.1"
 
@@ -101,6 +102,11 @@ static unsigned int mwait_substates __initdata;
 #define CPUIDLE_FLAG_ALWAYS_ENABLE	BIT(15)
 
 /*
+ * Initialize large xstate for the C6-state entrance.
+ */
+#define CPUIDLE_FLAG_INIT_XSTATE	BIT(16)
+
+/*
  * MWAIT takes an 8-bit "hint" in EAX "suggesting"
  * the C-state (top nibble) and sub-state (bottom nibble)
  * 0x00 means "MWAIT(C1)", 0x10 means "MWAIT(C2)" etc.
@@ -134,6 +140,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
 	if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
 		local_irq_enable();
 
+	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+		fpu_idle_fpregs();
+
 	mwait_idle_with_hints(eax, ecx);
 
 	return index;
@@ -154,8 +163,12 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
 static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
 				       struct cpuidle_driver *drv, int index)
 {
-	unsigned long eax = flg2MWAIT(drv->states[index].flags);
 	unsigned long ecx = 1; /* break on interrupt flag */
+	struct cpuidle_state *state = &drv->states[index];
+	unsigned long eax = flg2MWAIT(state->flags);
+
+	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+		fpu_idle_fpregs();
 
 	mwait_idle_with_hints(eax, ecx);
 
@@ -790,7 +803,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
 	{
 		.name = "C6",
 		.desc = "MWAIT 0x20",
-		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
+					   CPUIDLE_FLAG_INIT_XSTATE,
 		.exit_latency = 290,
 		.target_residency = 800,
 		.enter = &intel_idle,

* Re: [tip: x86/fpu] intel_idle: Add a new flag to initialize the AMX state
From: Peter Zijlstra @ 2022-06-09 10:23 UTC
To: linux-kernel
Cc: linux-tip-commits, Zhang Rui, Chang S. Bae, Dave Hansen,
    Rafael J. Wysocki, x86

On Wed, Jun 08, 2022 at 07:32:37PM -0000, tip-bot2 for Chang S. Bae wrote:

> @@ -134,6 +140,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
>  	if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
>  		local_irq_enable();
>
> +	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
> +		fpu_idle_fpregs();
> +
>  	mwait_idle_with_hints(eax, ecx);
>
>  	return index;

This will conflict with an intel_idle patch Rafael took from me; the
resolution would be something along these lines:

--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -166,6 +166,13 @@ static __cpuidle int intel_idle_irq(stru
 	return ret;
 }
 
+static __cpuidle int intel_idle_xstate(struct cpuidle_device *dev,
+				       struct cpuidle_driver *drv, int index)
+{
+	fpu_idle_fpregs();
+	return __intel_idle(dev, drv, index);
+}
+
 /**
  * intel_idle_s2idle - Ask the processor to enter the given idle state.
  * @dev: cpuidle device of the target CPU.
@@ -1831,6 +1838,9 @@ static void __init intel_idle_init_cstat
 		if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IRQ_ENABLE)
 			drv->states[drv->state_count].enter = intel_idle_irq;
 
+		if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_INIT_XSTATE)
+			drv->states[drv->state_count].enter = intel_idle_xstate;
+
 		if ((disabled_states_mask & BIT(drv->state_count)) ||
 		    ((icpu->use_acpi || force_use_acpi) &&
 		     intel_idle_off_by_default(mwait_hint) &&

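The resolution above moves the flag test out of the idle path: instead of
checking the flag on every idle entry, the driver picks a dedicated enter
callback once, at init time. A compact user-space model of that pattern is
sketched below. All names (idle_state, enter_plain, enter_irq,
enter_xstate, pick_enter) and the flag values are invented for the sketch;
the counter stands in for the call to fpu_idle_fpregs().

```c
#include <assert.h>

#define BIT(n)			(1U << (n))
/* Flag values are arbitrary here; only their distinctness matters. */
#define FLAG_IRQ_ENABLE		BIT(14)
#define FLAG_INIT_XSTATE	BIT(16)

struct idle_state;
typedef int (*enter_fn)(struct idle_state *s, int index);

struct idle_state {
	unsigned int flags;
	enter_fn enter;
	unsigned int fpregs_inits;	/* counts modeled fpu_idle_fpregs() calls */
};

/* Models the common idle entry path. */
static int enter_plain(struct idle_state *s, int index)
{
	(void)s;
	return index;
}

/* Models a wrapper that would enable interrupts first. */
static int enter_irq(struct idle_state *s, int index)
{
	return enter_plain(s, index);
}

/* Models intel_idle_xstate(): initialize the state, then take the common path. */
static int enter_xstate(struct idle_state *s, int index)
{
	s->fpregs_inits++;
	return enter_plain(s, index);
}

/* Init-time selection, in the same order as the two checks in the hunk above. */
static void pick_enter(struct idle_state *s)
{
	s->enter = enter_plain;
	if (s->flags & FLAG_IRQ_ENABLE)
		s->enter = enter_irq;
	if (s->flags & FLAG_INIT_XSTATE)
		s->enter = enter_xstate;
}

void enter_selection_example(void)
{
	struct idle_state c6 = { .flags = FLAG_INIT_XSTATE };

	pick_enter(&c6);
	assert(c6.enter == enter_xstate);
	assert(c6.enter(&c6, 2) == 2);
	assert(c6.fpregs_inits == 1);
}
```

One property of this sketch worth noticing: because the assignments are
sequential, a state carrying both flags ends up with the xstate wrapper
only. Whether any state table actually sets both flags is not something
this thread says.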
* [PATCH][Rebased] intel_idle: Add a new flag to initialize the AMX state 2022-06-09 10:23 ` Peter Zijlstra @ 2022-06-14 16:41 ` Chang S. Bae 2022-07-19 17:31 ` [tip: x86/fpu] " tip-bot2 for Chang S. Bae 0 siblings, 1 reply; 16+ messages in thread From: Chang S. Bae @ 2022-06-14 16:41 UTC (permalink / raw) To: peterz, linux-kernel, dave.hansen Cc: linux-tip-commits, rui.zhang, rafael.j.wysocki, x86, Chang S. Bae The non-initialized AMX state can be the cause of C-state demotion from C6 to C1E. This low-power idle state may improve power savings and thus result in a higher available turbo frequency budget. This behavior is implementation-specific. Initialize the state for the C6 entrance of Sapphire Rapids as needed. Tested-by: Zhang Rui <rui.zhang@intel.com> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lkml.kernel.org/r/20220608164748.11864-3-chang.seok.bae@intel.com [changb: Rebase to the upstream with peterz's help] Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> --- The patch merged in the tip's x86/fpu branch [1] has conflict with the upstream -- commit 32d4fd5751ea ("cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE") as of v5.19-rc2. 
[1] https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/fpu --- drivers/idle/intel_idle.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 424ef470223d..8a19ba1c2c1b 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -54,6 +54,7 @@ #include <asm/intel-family.h> #include <asm/mwait.h> #include <asm/msr.h> +#include <asm/fpu/api.h> #define INTEL_IDLE_VERSION "0.5.1" @@ -105,6 +106,11 @@ static unsigned int mwait_substates __initdata; */ #define CPUIDLE_FLAG_ALWAYS_ENABLE BIT(15) +/* + * Initialize large xstate for the C6-state entrance. + */ +#define CPUIDLE_FLAG_INIT_XSTATE BIT(16) + /* * MWAIT takes an 8-bit "hint" in EAX "suggesting" * the C-state (top nibble) and sub-state (bottom nibble) @@ -159,6 +165,13 @@ static __cpuidle int intel_idle_irq(struct cpuidle_device *dev, return ret; } +static __cpuidle int intel_idle_xstate(struct cpuidle_device *dev, + struct cpuidle_driver *drv, int index) +{ + fpu_idle_fpregs(); + return __intel_idle(dev, drv, index); +} + /** * intel_idle_s2idle - Ask the processor to enter the given idle state. * @dev: cpuidle device of the target CPU. 
@@ -174,8 +187,12 @@ static __cpuidle int intel_idle_irq(struct cpuidle_device *dev, static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index) { - unsigned long eax = flg2MWAIT(drv->states[index].flags); unsigned long ecx = 1; /* break on interrupt flag */ + struct cpuidle_state *state = &drv->states[index]; + unsigned long eax = flg2MWAIT(state->flags); + + if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) + fpu_idle_fpregs(); mwait_idle_with_hints(eax, ecx); @@ -910,7 +927,8 @@ static struct cpuidle_state spr_cstates[] __initdata = { { .name = "C6", .desc = "MWAIT 0x20", - .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | + CPUIDLE_FLAG_INIT_XSTATE, .exit_latency = 290, .target_residency = 800, .enter = &intel_idle, @@ -1819,6 +1837,9 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv) if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IRQ_ENABLE) drv->states[drv->state_count].enter = intel_idle_irq; + if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_INIT_XSTATE) + drv->states[drv->state_count].enter = intel_idle_xstate; + if ((disabled_states_mask & BIT(drv->state_count)) || ((icpu->use_acpi || force_use_acpi) && intel_idle_off_by_default(mwait_hint) && -- 2.17.1 ^ permalink raw reply related [flat|nested] 16+ messages in thread

* [tip: x86/fpu] intel_idle: Add a new flag to initialize the AMX state
  2022-06-14 16:41 ` [PATCH][Rebased] " Chang S. Bae
@ 2022-07-19 17:31 ` tip-bot2 for Chang S. Bae
  0 siblings, 0 replies; 16+ messages in thread
From: tip-bot2 for Chang S. Bae @ 2022-07-19 17:31 UTC
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Dave Hansen, Chang S. Bae, Borislav Petkov,
      Rafael J. Wysocki, Zhang Rui, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     9f01129382774d98ec21526f13da26a0630ee3d8
Gitweb:        https://git.kernel.org/tip/9f01129382774d98ec21526f13da26a0630ee3d8
Author:        Chang S. Bae <chang.seok.bae@intel.com>
AuthorDate:    Mon, 18 Jul 2022 11:56:11 -07:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Tue, 19 Jul 2022 19:17:28 +02:00

intel_idle: Add a new flag to initialize the AMX state

The non-initialized AMX state can be the cause of C-state demotion from C6
to C1E. This low-power idle state may improve power savings and thus result
in a higher available turbo frequency budget.

This behavior is implementation-specific. Initialize the state for the C6
entrance of Sapphire Rapids as needed.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lkml.kernel.org/r/20220614164116.5196-1-chang.seok.bae@intel.com
---
 drivers/idle/intel_idle.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index f5c6802..1ec2210 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -56,6 +56,7 @@
 #include <asm/nospec-branch.h>
 #include <asm/mwait.h>
 #include <asm/msr.h>
+#include <asm/fpu/api.h>

 #define INTEL_IDLE_VERSION "0.5.1"

@@ -114,6 +115,11 @@ static unsigned int mwait_substates __initdata;
 #define CPUIDLE_FLAG_IBRS		BIT(16)

 /*
+ * Initialize large xstate for the C6-state entrance.
+ */
+#define CPUIDLE_FLAG_INIT_XSTATE	BIT(17)
+
+/*
  * MWAIT takes an 8-bit "hint" in EAX "suggesting"
  * the C-state (top nibble) and sub-state (bottom nibble)
  * 0x00 means "MWAIT(C1)", 0x10 means "MWAIT(C2)" etc.
@@ -185,6 +191,13 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
 	return ret;
 }

+static __cpuidle int intel_idle_xstate(struct cpuidle_device *dev,
+				       struct cpuidle_driver *drv, int index)
+{
+	fpu_idle_fpregs();
+	return __intel_idle(dev, drv, index);
+}
+
 /**
  * intel_idle_s2idle - Ask the processor to enter the given idle state.
  * @dev: cpuidle device of the target CPU.
@@ -200,8 +213,12 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
 static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
 				       struct cpuidle_driver *drv, int index)
 {
-	unsigned long eax = flg2MWAIT(drv->states[index].flags);
 	unsigned long ecx = 1; /* break on interrupt flag */
+	struct cpuidle_state *state = &drv->states[index];
+	unsigned long eax = flg2MWAIT(state->flags);
+
+	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+		fpu_idle_fpregs();

 	mwait_idle_with_hints(eax, ecx);

@@ -936,7 +953,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
 	{
 		.name = "C6",
 		.desc = "MWAIT 0x20",
-		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
+					   CPUIDLE_FLAG_INIT_XSTATE,
 		.exit_latency = 290,
 		.target_residency = 800,
 		.enter = &intel_idle,
@@ -1851,6 +1869,9 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
 			drv->states[drv->state_count].enter = intel_idle_ibrs;
 		}

+		if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_INIT_XSTATE)
+			drv->states[drv->state_count].enter = intel_idle_xstate;
+
 		if ((disabled_states_mask & BIT(drv->state_count)) ||
 		    ((icpu->use_acpi || force_use_acpi) &&
 		     intel_idle_off_by_default(mwait_hint) &&
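
The commit above wires the new flag in two ways: at init time, flagged states get their .enter callback swapped for intel_idle_xstate(), while the s2idle path tests the flag inline on each entry. The following is a minimal userspace sketch of that pattern, not kernel code. Every model_* name and struct idle_state are hypothetical stand-ins invented for illustration, and the real fpu_idle_fpregs() and __intel_idle() are replaced by stubs.

```c
#include <assert.h>
#include <stddef.h>

#define BIT(n)				(1U << (n))
#define CPUIDLE_FLAG_INIT_XSTATE	BIT(17)	/* value used by the merged commit */

struct idle_state;
typedef int (*enter_fn)(const struct idle_state *state, int index);

/* Hypothetical, trimmed-down stand-in for struct cpuidle_state. */
struct idle_state {
	const char *name;
	unsigned int flags;
	enter_fn enter;
};

/* Counts how often the stand-in for fpu_idle_fpregs() ran. */
static int xstate_init_calls;

/* Stub: the real helper initializes the AMX state; this only counts. */
static void model_fpu_idle_fpregs(void)
{
	xstate_init_calls++;
}

/* Stub for __intel_idle(): would issue MWAIT; here it echoes the index. */
static int model_intel_idle(const struct idle_state *state, int index)
{
	(void)state;
	return index;
}

/* Mirrors intel_idle_xstate(): initialize the large xstate, then idle. */
static int model_intel_idle_xstate(const struct idle_state *state, int index)
{
	model_fpu_idle_fpregs();
	return model_intel_idle(state, index);
}

/* Mirrors the init-time hook: swap .enter only for flagged states. */
static void model_init_states(struct idle_state *states, size_t count)
{
	assert(states != NULL);
	for (size_t i = 0; i < count; i++) {
		if (states[i].flags & CPUIDLE_FLAG_INIT_XSTATE)
			states[i].enter = model_intel_idle_xstate;
	}
}

/* Mirrors the s2idle path: the flag is tested inline on every entry. */
static int model_intel_idle_s2idle(const struct idle_state *state, int index)
{
	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
		model_fpu_idle_fpregs();
	return index;
}
```

With a two-entry table holding an unflagged "C1E" and a flagged "C6", model_init_states() leaves the first callback alone and replaces the second, so the extra work is paid only on entry to the flagged state.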
[parent not found: <38cd51750ef7b995506d001eae3e4ec872cf5b77.camel@linux.intel.com>]

* Re: [PATCH v5 2/2] intel_idle: Add a new flag to initialize the AMX state
  [not found] ` <38cd51750ef7b995506d001eae3e4ec872cf5b77.camel@linux.intel.com>
@ 2022-06-14 17:23 ` Chang S. Bae
  2022-06-15 6:25 ` Artem Bityutskiy
  0 siblings, 1 reply; 16+ messages in thread
From: Chang S. Bae @ 2022-06-14 17:23 UTC
  To: Artem Bityutskiy, linux-kernel, x86, linux-pm
  Cc: tglx, dave.hansen, peterz, bp, rafael, riel, bigeasy, hch,
      fenghua.yu, rui.zhang, jacob.jun.pan, lenb

On 6/10/2022 3:02 AM, Artem Bityutskiy wrote:
>
> LGTM,
>
> Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

Thanks, Artem!

Chang

* Re: [PATCH v5 2/2] intel_idle: Add a new flag to initialize the AMX state
  2022-06-14 17:23 ` [PATCH v5 2/2] " Chang S. Bae
@ 2022-06-15 6:25 ` Artem Bityutskiy
  0 siblings, 0 replies; 16+ messages in thread
From: Artem Bityutskiy @ 2022-06-15 6:25 UTC
  To: Chang S. Bae, linux-kernel, x86, linux-pm
  Cc: tglx, dave.hansen, peterz, bp, rafael, riel, bigeasy, hch,
      fenghua.yu, rui.zhang, jacob.jun.pan, lenb

On Tue, 2022-06-14 at 10:23 -0700, Chang S. Bae wrote:
> On 6/10/2022 3:02 AM, Artem Bityutskiy wrote:
> >
> > LGTM,
> >
> > Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>
> Thanks, Artem!

I apologize for sending that e-mail in HTML format. It did not reach the
mailing lists.

Artem.

* [tip: x86/fpu] intel_idle: Add a new flag to initialize the AMX state
  2022-06-08 16:47 ` [PATCH v5 2/2] intel_idle: Add a new flag to initialize the AMX state Chang S. Bae
  2022-06-08 19:32 ` [tip: x86/fpu] " tip-bot2 for Chang S. Bae
  [not found] ` <38cd51750ef7b995506d001eae3e4ec872cf5b77.camel@linux.intel.com>
@ 2022-06-14 22:53 ` tip-bot2 for Chang S. Bae
  2022-07-18 9:06 ` Borislav Petkov
  2 siblings, 1 reply; 16+ messages in thread
From: tip-bot2 for Chang S. Bae @ 2022-06-14 22:53 UTC
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Chang S. Bae, Dave Hansen,
      Rafael J. Wysocki, Zhang Rui, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     f08ef9057b7b110f44cd364744ba6b5f0115390f
Gitweb:        https://git.kernel.org/tip/f08ef9057b7b110f44cd364744ba6b5f0115390f
Author:        Chang S. Bae <chang.seok.bae@intel.com>
AuthorDate:    Tue, 14 Jun 2022 09:41:16 -07:00
Committer:     Dave Hansen <dave.hansen@linux.intel.com>
CommitterDate: Tue, 14 Jun 2022 15:48:58 -07:00

intel_idle: Add a new flag to initialize the AMX state

The non-initialized AMX state can be the cause of C-state demotion from C6
to C1E. This low-power idle state may improve power savings and thus result
in a higher available turbo frequency budget.

This behavior is implementation-specific. Initialize the state for the C6
entrance of Sapphire Rapids as needed.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lkml.kernel.org/r/20220608164748.11864-3-chang.seok.bae@intel.com
Link: https://lkml.kernel.org/r/20220614164116.5196-1-chang.seok.bae@intel.com
---
 drivers/idle/intel_idle.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 424ef47..8a19ba1 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -54,6 +54,7 @@
 #include <asm/intel-family.h>
 #include <asm/mwait.h>
 #include <asm/msr.h>
+#include <asm/fpu/api.h>

 #define INTEL_IDLE_VERSION "0.5.1"

@@ -106,6 +107,11 @@ static unsigned int mwait_substates __initdata;
 #define CPUIDLE_FLAG_ALWAYS_ENABLE	BIT(15)

 /*
+ * Initialize large xstate for the C6-state entrance.
+ */
+#define CPUIDLE_FLAG_INIT_XSTATE	BIT(16)
+
+/*
  * MWAIT takes an 8-bit "hint" in EAX "suggesting"
  * the C-state (top nibble) and sub-state (bottom nibble)
  * 0x00 means "MWAIT(C1)", 0x10 means "MWAIT(C2)" etc.
@@ -159,6 +165,13 @@ static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
 	return ret;
 }

+static __cpuidle int intel_idle_xstate(struct cpuidle_device *dev,
+				       struct cpuidle_driver *drv, int index)
+{
+	fpu_idle_fpregs();
+	return __intel_idle(dev, drv, index);
+}
+
 /**
  * intel_idle_s2idle - Ask the processor to enter the given idle state.
  * @dev: cpuidle device of the target CPU.
@@ -174,8 +187,12 @@ static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
 static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
 				       struct cpuidle_driver *drv, int index)
 {
-	unsigned long eax = flg2MWAIT(drv->states[index].flags);
 	unsigned long ecx = 1; /* break on interrupt flag */
+	struct cpuidle_state *state = &drv->states[index];
+	unsigned long eax = flg2MWAIT(state->flags);
+
+	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+		fpu_idle_fpregs();

 	mwait_idle_with_hints(eax, ecx);

@@ -910,7 +927,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
 	{
 		.name = "C6",
 		.desc = "MWAIT 0x20",
-		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
+					   CPUIDLE_FLAG_INIT_XSTATE,
 		.exit_latency = 290,
 		.target_residency = 800,
 		.enter = &intel_idle,
@@ -1819,6 +1837,9 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
 		if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IRQ_ENABLE)
 			drv->states[drv->state_count].enter = intel_idle_irq;

+		if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_INIT_XSTATE)
+			drv->states[drv->state_count].enter = intel_idle_xstate;
+
 		if ((disabled_states_mask & BIT(drv->state_count)) ||
 		    ((icpu->use_acpi || force_use_acpi) &&
 		     intel_idle_off_by_default(mwait_hint) &&

* Re: [tip: x86/fpu] intel_idle: Add a new flag to initialize the AMX state
  2022-06-14 22:53 ` [tip: x86/fpu] " tip-bot2 for Chang S. Bae
@ 2022-07-18 9:06 ` Borislav Petkov
  2022-07-18 18:56 ` [PATCH][Rebased] " Chang S. Bae
  0 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2022-07-18 9:06 UTC
  To: Chang S. Bae
  Cc: linux-tip-commits, Peter Zijlstra (Intel), Chang S. Bae, Dave Hansen,
      Rafael J. Wysocki, Zhang Rui, x86, linux-kernel

Hi,

this is conflicting with the retbleed changes which went upstream and
resolving those conflicts would practically mean rewriting your patches.

Can you please redo them on top of -rc7 and send them again?

This is not something we normally do but retbleed is not something
normal so..

Thanks!

On Tue, Jun 14, 2022 at 10:53:58PM -0000, tip-bot2 for Chang S. Bae wrote:
> The following commit has been merged into the x86/fpu branch of tip:
>
> Commit-ID:     f08ef9057b7b110f44cd364744ba6b5f0115390f
> Gitweb:        https://git.kernel.org/tip/f08ef9057b7b110f44cd364744ba6b5f0115390f
> Author:        Chang S. Bae <chang.seok.bae@intel.com>
> AuthorDate:    Tue, 14 Jun 2022 09:41:16 -07:00
> Committer:     Dave Hansen <dave.hansen@linux.intel.com>
> CommitterDate: Tue, 14 Jun 2022 15:48:58 -07:00
>
> intel_idle: Add a new flag to initialize the AMX state
>
> The non-initialized AMX state can be the cause of C-state demotion from C6
> to C1E. This low-power idle state may improve power savings and thus result
> in a higher available turbo frequency budget.
>
> This behavior is implementation-specific. Initialize the state for the C6
> entrance of Sapphire Rapids as needed.
>
> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Tested-by: Zhang Rui <rui.zhang@intel.com>
> Link: https://lkml.kernel.org/r/20220608164748.11864-3-chang.seok.bae@intel.com
> Link: https://lkml.kernel.org/r/20220614164116.5196-1-chang.seok.bae@intel.com
> ---
>  drivers/idle/intel_idle.c | 25 +++++++++++++++++++++++--
>  1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 424ef47..8a19ba1 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -54,6 +54,7 @@
>  #include <asm/intel-family.h>
>  #include <asm/mwait.h>
>  #include <asm/msr.h>
> +#include <asm/fpu/api.h>
>
>  #define INTEL_IDLE_VERSION "0.5.1"
>
> @@ -106,6 +107,11 @@ static unsigned int mwait_substates __initdata;
>  #define CPUIDLE_FLAG_ALWAYS_ENABLE	BIT(15)
>
>  /*
> + * Initialize large xstate for the C6-state entrance.
> + */
> +#define CPUIDLE_FLAG_INIT_XSTATE	BIT(16)
> +
> +/*
>   * MWAIT takes an 8-bit "hint" in EAX "suggesting"
>   * the C-state (top nibble) and sub-state (bottom nibble)
>   * 0x00 means "MWAIT(C1)", 0x10 means "MWAIT(C2)" etc.
> @@ -159,6 +165,13 @@ static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
>  	return ret;
>  }
>
> +static __cpuidle int intel_idle_xstate(struct cpuidle_device *dev,
> +				       struct cpuidle_driver *drv, int index)
> +{
> +	fpu_idle_fpregs();
> +	return __intel_idle(dev, drv, index);
> +}
> +
>  /**
>   * intel_idle_s2idle - Ask the processor to enter the given idle state.
>   * @dev: cpuidle device of the target CPU.
> @@ -174,8 +187,12 @@ static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
>  static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
>  				       struct cpuidle_driver *drv, int index)
>  {
> -	unsigned long eax = flg2MWAIT(drv->states[index].flags);
>  	unsigned long ecx = 1; /* break on interrupt flag */
> +	struct cpuidle_state *state = &drv->states[index];
> +	unsigned long eax = flg2MWAIT(state->flags);
> +
> +	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
> +		fpu_idle_fpregs();
>
>  	mwait_idle_with_hints(eax, ecx);
>
> @@ -910,7 +927,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
>  	{
>  		.name = "C6",
>  		.desc = "MWAIT 0x20",
> -		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
> +		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
> +					   CPUIDLE_FLAG_INIT_XSTATE,
>  		.exit_latency = 290,
>  		.target_residency = 800,
>  		.enter = &intel_idle,
> @@ -1819,6 +1837,9 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
>  		if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_IRQ_ENABLE)
>  			drv->states[drv->state_count].enter = intel_idle_irq;
>
> +		if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_INIT_XSTATE)
> +			drv->states[drv->state_count].enter = intel_idle_xstate;
> +
>  		if ((disabled_states_mask & BIT(drv->state_count)) ||
>  		    ((icpu->use_acpi || force_use_acpi) &&
>  		     intel_idle_off_by_default(mwait_hint) &&

--
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* [PATCH][Rebased] intel_idle: Add a new flag to initialize the AMX state
  2022-07-18 9:06 ` Borislav Petkov
@ 2022-07-18 18:56 ` Chang S. Bae
  0 siblings, 0 replies; 16+ messages in thread
From: Chang S. Bae @ 2022-07-18 18:56 UTC
  To: bp
  Cc: linux-tip-commits, peterz, dave.hansen, rafael.j.wysocki, rui.zhang,
      x86, linux-kernel, chang.seok.bae

The non-initialized AMX state can be the cause of C-state demotion from C6
to C1E. This low-power idle state may improve power savings and thus result
in a higher available turbo frequency budget.

This behavior is implementation-specific. Initialize the state for the C6
entrance of Sapphire Rapids as needed.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lkml.kernel.org/r/20220608164748.11864-3-chang.seok.bae@intel.com
Link: https://lkml.kernel.org/r/20220614164116.5196-1-chang.seok.bae@intel.com
[ changb: Rebased to the upstream again. ]
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
---
The patch merged in tip's x86/fpu branch conflicts with the retbleed patch --
commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle") as of
v5.19-rc7.
---
 drivers/idle/intel_idle.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index f5c6802aa6c3..1ec221079367 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -56,6 +56,7 @@
 #include <asm/nospec-branch.h>
 #include <asm/mwait.h>
 #include <asm/msr.h>
+#include <asm/fpu/api.h>

 #define INTEL_IDLE_VERSION "0.5.1"

@@ -113,6 +114,11 @@ static unsigned int mwait_substates __initdata;
  */
 #define CPUIDLE_FLAG_IBRS		BIT(16)

+/*
+ * Initialize large xstate for the C6-state entrance.
+ */
+#define CPUIDLE_FLAG_INIT_XSTATE	BIT(17)
+
 /*
  * MWAIT takes an 8-bit "hint" in EAX "suggesting"
  * the C-state (top nibble) and sub-state (bottom nibble)
@@ -185,6 +191,13 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
 	return ret;
 }

+static __cpuidle int intel_idle_xstate(struct cpuidle_device *dev,
+				       struct cpuidle_driver *drv, int index)
+{
+	fpu_idle_fpregs();
+	return __intel_idle(dev, drv, index);
+}
+
 /**
  * intel_idle_s2idle - Ask the processor to enter the given idle state.
  * @dev: cpuidle device of the target CPU.
@@ -200,8 +213,12 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
 static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
 				       struct cpuidle_driver *drv, int index)
 {
-	unsigned long eax = flg2MWAIT(drv->states[index].flags);
 	unsigned long ecx = 1; /* break on interrupt flag */
+	struct cpuidle_state *state = &drv->states[index];
+	unsigned long eax = flg2MWAIT(state->flags);
+
+	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+		fpu_idle_fpregs();

 	mwait_idle_with_hints(eax, ecx);

@@ -936,7 +953,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
 	{
 		.name = "C6",
 		.desc = "MWAIT 0x20",
-		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
+					   CPUIDLE_FLAG_INIT_XSTATE,
 		.exit_latency = 290,
 		.target_residency = 800,
 		.enter = &intel_idle,
@@ -1851,6 +1869,9 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
 			drv->states[drv->state_count].enter = intel_idle_ibrs;
 		}

+		if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_INIT_XSTATE)
+			drv->states[drv->state_count].enter = intel_idle_xstate;
+
 		if ((disabled_states_mask & BIT(drv->state_count)) ||
 		    ((icpu->use_acpi || force_use_acpi) &&
 		     intel_idle_off_by_default(mwait_hint) &&
--
2.17.1
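
The rebase moved CPUIDLE_FLAG_INIT_XSTATE from BIT(16) to BIT(17) because the retbleed work had taken BIT(16) for CPUIDLE_FLAG_IBRS. The sketch below illustrates why the driver-private flag bits and the MWAIT hint can share one .flags word. The definitions of MWAIT2flg() and flg2MWAIT() are written from memory of intel_idle.c and should be read as assumptions, as should MODEL_FLAG_TLB_FLUSHED, which is a placeholder value rather than the kernel's definition. The helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)				(UINT32_C(1) << (n))

/* Assumed to match intel_idle.c: the MWAIT hint sits in the top byte. */
#define MWAIT2flg(eax)			(((uint32_t)(eax) & 0xFF) << 24)
#define flg2MWAIT(flags)		(((flags) >> 24) & 0xFF)

/* Driver-private flags as they appear in the diffs in this thread. */
#define CPUIDLE_FLAG_ALWAYS_ENABLE	BIT(15)
#define CPUIDLE_FLAG_IBRS		BIT(16)
#define CPUIDLE_FLAG_INIT_XSTATE	BIT(17)

/* Placeholder only; the real value is defined in <linux/cpuidle.h>. */
#define MODEL_FLAG_TLB_FLUSHED		BIT(5)

/* Builds the flags word of the Sapphire Rapids "C6" entry after the patch. */
static uint32_t spr_c6_flags(void)
{
	uint32_t flags = MWAIT2flg(0x20) | MODEL_FLAG_TLB_FLUSHED |
			 CPUIDLE_FLAG_INIT_XSTATE;

	/* Adding flag bits below bit 24 must not disturb the encoded hint. */
	assert(flg2MWAIT(flags) == 0x20);
	return flags;
}

/* Top nibble of the 8-bit hint selects the C-state. */
static unsigned int hint_cstate_nibble(uint32_t hint)
{
	return (hint >> 4) & 0xF;
}

/* Bottom nibble of the 8-bit hint selects the sub-state. */
static unsigned int hint_substate(uint32_t hint)
{
	return hint & 0xF;
}
```

Under these assumptions the hint occupies bits 24 to 31, so any flag at bit 17 or below round-trips through MWAIT2flg()/flg2MWAIT() without touching the value passed to MWAIT in EAX.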

end of thread, other threads:[~2022-07-19 17:31 UTC | newest]

Thread overview: 16+ messages
2022-06-08 16:47 [PATCH v5 0/2] x86/fpu: Make AMX state ready for CPU idle Chang S. Bae
2022-06-08 16:47 ` [PATCH v5 1/2] x86/fpu: Add a helper to prepare AMX state for low-power " Chang S. Bae
2022-06-08 19:32 ` [tip: x86/fpu] " tip-bot2 for Chang S. Bae
2022-06-14 22:46 ` tip-bot2 for Chang S. Bae
2022-06-14 22:53 ` tip-bot2 for Chang S. Bae
2022-07-19 17:31 ` tip-bot2 for Chang S. Bae
2022-06-08 16:47 ` [PATCH v5 2/2] intel_idle: Add a new flag to initialize the AMX state Chang S. Bae
2022-06-08 19:32 ` [tip: x86/fpu] " tip-bot2 for Chang S. Bae
2022-06-09 10:23 ` Peter Zijlstra
2022-06-14 16:41 ` [PATCH][Rebased] " Chang S. Bae
2022-07-19 17:31 ` [tip: x86/fpu] " tip-bot2 for Chang S. Bae
[not found] ` <38cd51750ef7b995506d001eae3e4ec872cf5b77.camel@linux.intel.com>
2022-06-14 17:23 ` [PATCH v5 2/2] " Chang S. Bae
2022-06-15 6:25 ` Artem Bityutskiy
2022-06-14 22:53 ` [tip: x86/fpu] " tip-bot2 for Chang S. Bae
2022-07-18 9:06 ` Borislav Petkov
2022-07-18 18:56 ` [PATCH][Rebased] " Chang S. Bae