* [PATCH v3 0/3] x86/fpu: Make AMX state ready for CPU idle
@ 2022-03-25  2:22 Chang S. Bae
  2022-03-25  2:22 ` [PATCH v3 1/3] x86/fpu: Make XCR0 accessors immune to unwanted compiler reordering Chang S. Bae
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Chang S. Bae @ 2022-03-25  2:22 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pm
  Cc: tglx, dave.hansen, peterz, bp, rafael, ravi.v.shankar, chang.seok.bae

Changes from v2 [1]:
* Secure XCR0 accessors from the compiler reordering issue (Dave Hansen).
* Check the feature flag instead of fpu_state_size_dynamic() (Dave Hansen).
* Remove the meaningless backslash (Rafael Wysocki).

Note that the two changes this series depends on are now in the mainline:
* the opcode for TILERELEASE [2] and
* the C-state table for Sapphire Rapids [3]

The series on top of the above is available here:
  git://github.com/intel/amx-linux.git tilerelease

Thanks,
Chang

[1]: https://lore.kernel.org/lkml/20220309223431.26560-1-chang.seok.bae@intel.com/
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch?id=9dd94df75b30eca03ed2151dd5bbc152a6f19abf
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch?id=9edf3c0ffef0ec1bed8300315852b5c6a0997130

Chang S. Bae (3):
  x86/fpu: Make XCR0 accessors immune to unwanted compiler reordering
  x86/fpu: Add a helper to prepare AMX state for low-power CPU idle
  intel_idle: Add a new flag to initialize the AMX state

 arch/x86/include/asm/fpu/api.h       |  2 ++
 arch/x86/include/asm/fpu/xcr.h       |  8 ++++++--
 arch/x86/include/asm/special_insns.h |  9 +++++++++
 arch/x86/kernel/fpu/core.c           | 13 +++++++++++++
 drivers/idle/intel_idle.c            | 18 ++++++++++++++++--
 5 files changed, 46 insertions(+), 4 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 1/3] x86/fpu: Make XCR0 accessors immune to unwanted compiler reordering
  2022-03-25  2:22 [PATCH v3 0/3] x86/fpu: Make AMX state ready for CPU idle Chang S. Bae
@ 2022-03-25  2:22 ` Chang S. Bae
  2022-04-01 17:58   ` Dave Hansen
  2022-03-25  2:22 ` [PATCH v3 2/3] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle Chang S. Bae
  2022-03-25  2:22 ` [PATCH v3 3/3] intel_idle: Add a new flag to initialize the AMX state Chang S. Bae
  2 siblings, 1 reply; 10+ messages in thread
From: Chang S. Bae @ 2022-03-25  2:22 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pm
  Cc: tglx, dave.hansen, peterz, bp, rafael, ravi.v.shankar, chang.seok.bae

Some old GCC versions (4.9.x and 5.x) have a bug that can incorrectly
reorder volatile asm statements with each other [1]. While this bug was
fixed in later versions (8.1, 7.3, and 6.5), and the kernel's current XCR0
read/write accessors do not appear to be affected, it is prudent to ensure
they follow program order.

Add a memory clobber to the write accessor to prevent the compiler from
caching or reordering memory accesses across XCR0 writes. For the read
accessor, add a dummy operand referencing an arbitrary address, which
prevents the compiler from reordering it against writes, as done for the
other accessors in commit aa5cacdc29d ("x86/asm: Replace __force_order
with a memory clobber").

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82602
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v2:
* Add as a new patch (Dave Hansen).
---
 arch/x86/include/asm/fpu/xcr.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xcr.h b/arch/x86/include/asm/fpu/xcr.h
index 9656a5bc6fea..9b513e7c0161 100644
--- a/arch/x86/include/asm/fpu/xcr.h
+++ b/arch/x86/include/asm/fpu/xcr.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_X86_FPU_XCR_H
 #define _ASM_X86_FPU_XCR_H
 
+#include <asm/special_insns.h>
+
 #define XCR_XFEATURE_ENABLED_MASK	0x00000000
 #define XCR_XFEATURE_IN_USE_MASK	0x00000001
 
@@ -9,7 +11,8 @@ static inline u64 xgetbv(u32 index)
 {
 	u32 eax, edx;
 
-	asm volatile("xgetbv" : "=a" (eax), "=d" (edx) : "c" (index));
+	asm volatile("xgetbv" : "=a" (eax), "=d" (edx) : "c" (index),
+				__FORCE_ORDER);
 	return eax + ((u64)edx << 32);
 }
 
@@ -18,7 +21,8 @@ static inline void xsetbv(u32 index, u64 value)
 	u32 eax = value;
 	u32 edx = value >> 32;
 
-	asm volatile("xsetbv" :: "a" (eax), "d" (edx), "c" (index));
+	asm volatile("xsetbv" :: "a" (eax), "d" (edx), "c" (index)
+			      :	 "memory");
 }
 
 /*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 2/3] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle
  2022-03-25  2:22 [PATCH v3 0/3] x86/fpu: Make AMX state ready for CPU idle Chang S. Bae
  2022-03-25  2:22 ` [PATCH v3 1/3] x86/fpu: Make XCR0 accessors immune to unwanted compiler reordering Chang S. Bae
@ 2022-03-25  2:22 ` Chang S. Bae
  2022-04-03 16:37   ` Thomas Gleixner
  2022-03-25  2:22 ` [PATCH v3 3/3] intel_idle: Add a new flag to initialize the AMX state Chang S. Bae
  2 siblings, 1 reply; 10+ messages in thread
From: Chang S. Bae @ 2022-03-25  2:22 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pm
  Cc: tglx, dave.hansen, peterz, bp, rafael, ravi.v.shankar, chang.seok.bae

When a CPU enters an idle state, non-initialized states left in large
registers may prevent it from reaching deeper low-power states.

The new helper ensures the AMX state is initialized so that the CPU is
ready for low-power states. It will be used by the intel_idle driver.
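
Roughly, the call site in the idle driver will look like this (an
illustrative sketch; the actual hook-up is done in the next patch):

	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
		fpu_idle_fpregs();

	mwait_idle_with_hints(eax, ecx);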

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v2:
* Check the feature flag instead of fpu_state_size_dynamic() (Dave Hansen).

Changes from v1:
* Check the dynamic state flag first, to avoid #UD with XGETBV(1).
---
 arch/x86/include/asm/fpu/api.h       |  2 ++
 arch/x86/include/asm/special_insns.h |  9 +++++++++
 arch/x86/kernel/fpu/core.c           | 13 +++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index c83b3020350a..df48912fd1c8 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -165,4 +165,6 @@ static inline bool fpstate_is_confidential(struct fpu_guest *gfpu)
 struct task_struct;
 extern long fpu_xstate_prctl(struct task_struct *tsk, int option, unsigned long arg2);
 
+extern void fpu_idle_fpregs(void);
+
 #endif /* _ASM_X86_FPU_API_H */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 68c257a3de0d..d434fbaeb3ff 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -294,6 +294,15 @@ static inline int enqcmds(void __iomem *dst, const void *src)
 	return 0;
 }
 
+static inline void tile_release(void)
+{
+	/*
+	 * Instruction opcode for TILERELEASE; supported in binutils
+	 * version >= 2.36.
+	 */
+	asm volatile(".byte 0xc4, 0xe2, 0x78, 0x49, 0xc0");
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_X86_SPECIAL_INSNS_H */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 8dea01ffc5c1..3507609e22d7 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -847,3 +847,16 @@ int fpu__exception_code(struct fpu *fpu, int trap_nr)
 	 */
 	return 0;
 }
+
+/*
+ * Initialize register state that may prevent the CPU from entering low-power idle.
+ * This function will be invoked from the cpuidle driver only when needed.
+ */
+void fpu_idle_fpregs(void)
+{
+	if (cpu_feature_enabled(X86_FEATURE_XGETBV1) &&
+	    (xfeatures_in_use() & XFEATURE_MASK_XTILE)) {
+		tile_release();
+		fpregs_deactivate(&current->thread.fpu);
+	}
+}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 3/3] intel_idle: Add a new flag to initialize the AMX state
  2022-03-25  2:22 [PATCH v3 0/3] x86/fpu: Make AMX state ready for CPU idle Chang S. Bae
  2022-03-25  2:22 ` [PATCH v3 1/3] x86/fpu: Make XCR0 accessors immune to unwanted compiler reordering Chang S. Bae
  2022-03-25  2:22 ` [PATCH v3 2/3] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle Chang S. Bae
@ 2022-03-25  2:22 ` Chang S. Bae
  2022-03-29 17:42   ` Rafael J. Wysocki
  2 siblings, 1 reply; 10+ messages in thread
From: Chang S. Bae @ 2022-03-25  2:22 UTC (permalink / raw)
  To: linux-kernel, x86, linux-pm
  Cc: tglx, dave.hansen, peterz, bp, rafael, ravi.v.shankar,
	chang.seok.bae, Artem Bityutskiy

A non-initialized AMX state can cause C-state demotion from C6 to C1E. The
deeper C6 state may improve power savings and thus result in a higher
available turbo frequency budget.

This behavior is implementation-specific. Initialize the AMX state on C6
entry for Sapphire Rapids as needed.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org
---
Changes from v2:
* Remove an unnecessary backslash (Rafael Wysocki).

Changes from v1:
* Simplify the code with a new flag (Rui).
* Rebase on Artem's patches for SPR intel_idle.
* Massage the changelog.
---
 drivers/idle/intel_idle.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index b7640cfe0020..d35790890a3f 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -54,6 +54,7 @@
 #include <asm/intel-family.h>
 #include <asm/mwait.h>
 #include <asm/msr.h>
+#include <asm/fpu/api.h>
 
 #define INTEL_IDLE_VERSION "0.5.1"
 
@@ -100,6 +101,11 @@ static unsigned int mwait_substates __initdata;
  */
 #define CPUIDLE_FLAG_ALWAYS_ENABLE	BIT(15)
 
+/*
+ * Initialize large xstate for the C6-state entrance.
+ */
+#define CPUIDLE_FLAG_INIT_XSTATE	BIT(16)
+
 /*
  * MWAIT takes an 8-bit "hint" in EAX "suggesting"
  * the C-state (top nibble) and sub-state (bottom nibble)
@@ -134,6 +140,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
 	if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
 		local_irq_enable();
 
+	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+		fpu_idle_fpregs();
+
 	mwait_idle_with_hints(eax, ecx);
 
 	return index;
@@ -154,8 +163,12 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
 static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
 				       struct cpuidle_driver *drv, int index)
 {
-	unsigned long eax = flg2MWAIT(drv->states[index].flags);
 	unsigned long ecx = 1; /* break on interrupt flag */
+	struct cpuidle_state *state = &drv->states[index];
+	unsigned long eax = flg2MWAIT(state->flags);
+
+	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+		fpu_idle_fpregs();
 
 	mwait_idle_with_hints(eax, ecx);
 
@@ -790,7 +803,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
 	{
 		.name = "C6",
 		.desc = "MWAIT 0x20",
-		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
+					   CPUIDLE_FLAG_INIT_XSTATE,
 		.exit_latency = 290,
 		.target_residency = 800,
 		.enter = &intel_idle,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 3/3] intel_idle: Add a new flag to initialize the AMX state
  2022-03-25  2:22 ` [PATCH v3 3/3] intel_idle: Add a new flag to initialize the AMX state Chang S. Bae
@ 2022-03-29 17:42   ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2022-03-29 17:42 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: Linux Kernel Mailing List, the arch/x86 maintainers, Linux PM,
	Thomas Gleixner, Dave Hansen, Peter Zijlstra, Borislav Petkov,
	Rafael J. Wysocki, Ravi V. Shankar, Artem Bityutskiy

On Fri, Mar 25, 2022 at 3:30 AM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> A non-initialized AMX state can cause C-state demotion from C6 to C1E. The
> deeper C6 state may improve power savings and thus result in a higher
> available turbo frequency budget.
>
> This behavior is implementation-specific. Initialize the AMX state on C6
> entry for Sapphire Rapids as needed.
>
> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Tested-by: Zhang Rui <rui.zhang@intel.com>
> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-pm@vger.kernel.org

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

and I'm expecting this to be routed along with the rest of the series.

> ---
> Changes from v2:
> * Remove an unnecessary backslash (Rafael Wysocki).
>
> Changes from v1:
> * Simplify the code with a new flag (Rui).
> * Rebase on Artem's patches for SPR intel_idle.
> * Massage the changelog.
> ---
>  drivers/idle/intel_idle.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index b7640cfe0020..d35790890a3f 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -54,6 +54,7 @@
>  #include <asm/intel-family.h>
>  #include <asm/mwait.h>
>  #include <asm/msr.h>
> +#include <asm/fpu/api.h>
>
>  #define INTEL_IDLE_VERSION "0.5.1"
>
> @@ -100,6 +101,11 @@ static unsigned int mwait_substates __initdata;
>   */
>  #define CPUIDLE_FLAG_ALWAYS_ENABLE     BIT(15)
>
> +/*
> + * Initialize large xstate for the C6-state entrance.
> + */
> +#define CPUIDLE_FLAG_INIT_XSTATE       BIT(16)
> +
>  /*
>   * MWAIT takes an 8-bit "hint" in EAX "suggesting"
>   * the C-state (top nibble) and sub-state (bottom nibble)
> @@ -134,6 +140,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
>         if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
>                 local_irq_enable();
>
> +       if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
> +               fpu_idle_fpregs();
> +
>         mwait_idle_with_hints(eax, ecx);
>
>         return index;
> @@ -154,8 +163,12 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
>  static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
>                                        struct cpuidle_driver *drv, int index)
>  {
> -       unsigned long eax = flg2MWAIT(drv->states[index].flags);
>         unsigned long ecx = 1; /* break on interrupt flag */
> +       struct cpuidle_state *state = &drv->states[index];
> +       unsigned long eax = flg2MWAIT(state->flags);
> +
> +       if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
> +               fpu_idle_fpregs();
>
>         mwait_idle_with_hints(eax, ecx);
>
> @@ -790,7 +803,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
>         {
>                 .name = "C6",
>                 .desc = "MWAIT 0x20",
> -               .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
> +               .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
> +                                          CPUIDLE_FLAG_INIT_XSTATE,
>                 .exit_latency = 290,
>                 .target_residency = 800,
>                 .enter = &intel_idle,
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] x86/fpu: Make XCR0 accessors immune to unwanted compiler reordering
  2022-03-25  2:22 ` [PATCH v3 1/3] x86/fpu: Make XCR0 accessors immune to unwanted compiler reordering Chang S. Bae
@ 2022-04-01 17:58   ` Dave Hansen
  2022-04-01 18:16     ` Dave Hansen
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2022-04-01 17:58 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel, x86, linux-pm
  Cc: tglx, dave.hansen, peterz, bp, rafael, ravi.v.shankar

On 3/24/22 19:22, Chang S. Bae wrote:
> Some old GCC versions (4.9.x and 5.x) have a bug that can incorrectly
> reorder volatile asm statements with each other [1]. While this bug was
> fixed in later versions (8.1, 7.3, and 6.5), and the kernel's current XCR0
> read/write accessors do not appear to be affected, it is prudent to ensure
> they follow program order.

I thought you *actually* saw xgetbv() be reordered, though.  Was that on
a buggy compiler?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] x86/fpu: Make XCR0 accessors immune to unwanted compiler reordering
  2022-04-01 17:58   ` Dave Hansen
@ 2022-04-01 18:16     ` Dave Hansen
  2022-04-01 22:14       ` Chang S. Bae
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2022-04-01 18:16 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel, x86, linux-pm
  Cc: tglx, dave.hansen, peterz, bp, rafael, ravi.v.shankar

On 4/1/22 10:58, Dave Hansen wrote:
> On 3/24/22 19:22, Chang S. Bae wrote:
>> Some old GCC versions (4.9.x and 5.x) have a bug that can incorrectly
>> reorder volatile asm statements with each other [1]. While this bug was
>> fixed in later versions (8.1, 7.3, and 6.5), and the kernel's current XCR0
>> read/write accessors do not appear to be affected, it is prudent to ensure
>> they follow program order.
> I thought you *actually* saw xgetbv() be reordered, though.  Was that on
> a buggy compiler?

Actually, reading the gcc bug[1] a bit more, it says:

> However, correctness here depends on the compiler honouring the
> ordering of volatile asm statements with respect to other volatile
> code, and this patch fixes that.

In your case, there was presumably no volatile code, just a
fpu_state_size_dynamic() call.  The compiler thought the xgetbv() was
safe to reorder, and did so.
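
That is, presumably a pattern along these lines (a sketch of the concern,
not the actual code):

	/*
	 * xfeatures_in_use() wraps XGETBV(1), which #UDs on CPUs without
	 * XGETBV1 support, so it must not be hoisted above the check.
	 */
	if (fpu_state_size_dynamic() &&
	    (xfeatures_in_use() & XFEATURE_MASK_XTILE))
		tile_release();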

So, I'm not quite sure how that bug is relevant.  It just dealt with
multiple "asm volatile" statements that don't have memory clobbers.  We
only have one "asm volatile" in play.  Adding the memory clobber should
make that bug totally irrelevant.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82602

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] x86/fpu: Make XCR0 accessors immune to unwanted compiler reordering
  2022-04-01 18:16     ` Dave Hansen
@ 2022-04-01 22:14       ` Chang S. Bae
  0 siblings, 0 replies; 10+ messages in thread
From: Chang S. Bae @ 2022-04-01 22:14 UTC (permalink / raw)
  To: Dave Hansen, linux-kernel, x86, linux-pm
  Cc: tglx, dave.hansen, peterz, bp, rafael, ravi.v.shankar

On 4/1/2022 11:16 AM, Dave Hansen wrote:
> On 4/1/22 10:58, Dave Hansen wrote:
>> On 3/24/22 19:22, Chang S. Bae wrote:
>>> Some old GCC versions (4.9.x and 5.x) have a bug that can incorrectly
>>> reorder volatile asm statements with each other [1]. While this bug was
>>> fixed in later versions (8.1, 7.3, and 6.5), and the kernel's current XCR0
>>> read/write accessors do not appear to be affected, it is prudent to ensure
>>> they follow program order.
>> I thought you *actually* saw xgetbv() be reordered, though.  Was that on
>> a buggy compiler?

No, sorry, my earlier analysis was naive. The #UD was raised on the 
non-XGETBV1 system because objtool didn't process 
fpu_state_size_dynamic() at build time. That happened only when the 
TILERELEASE opcode was not given. Here is a bit more detail:
https://lore.kernel.org/lkml/aa3ff0f4-d74c-2c84-37c0-0880cabc0dd4@intel.com/

> 
> Actually, reading the gcc bug[1] a bit more, it says:
> 
>> However, correctness here depends on the compiler honouring the
>> ordering of volatile asm statements with respect to other volatile
>> code, and this patch fixes that.
> In your case, there was presumably no volatile code, just a
> fpu_state_size_dynamic() call.  The compiler thought the xgetbv() was
> safe to reorder, and did so.
> 
> So, I'm not quite sure how that bug is relevant.  It just dealt with
> multiple "asm volatile" statements that don't have memory clobbers.  We
> only have one "asm volatile" in play.  Adding the memory clobber should
> make that bug totally irrelevant.
> 
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82602

Yeah, now this patch looks to have created more confusion than it fixed 
any real problem. Let me drop it in v4.

Thanks,
Chang

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 2/3] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle
  2022-03-25  2:22 ` [PATCH v3 2/3] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle Chang S. Bae
@ 2022-04-03 16:37   ` Thomas Gleixner
  2022-04-05  0:24     ` Chang S. Bae
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2022-04-03 16:37 UTC (permalink / raw)
  To: Chang S. Bae, linux-kernel, x86, linux-pm
  Cc: dave.hansen, peterz, bp, rafael, ravi.v.shankar, chang.seok.bae

On Thu, Mar 24 2022 at 19:22, Chang S. Bae wrote:
> When a CPU enters an idle state, non-initialized states left in large
> registers may prevent it from reaching deeper low-power states.
>
> The new helper ensures the AMX state is initialized so that the CPU is
> ready for low-power states. It will be used by the intel_idle driver.

What about AVX...AVX512? Are they harmless in that regard?

If so, then the first sentence above is confusing and should clearly
spell out that it's AMX which causes that problem.

If not, then why are we not putting them into init too?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 2/3] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle
  2022-04-03 16:37   ` Thomas Gleixner
@ 2022-04-05  0:24     ` Chang S. Bae
  0 siblings, 0 replies; 10+ messages in thread
From: Chang S. Bae @ 2022-04-05  0:24 UTC (permalink / raw)
  To: Thomas Gleixner, linux-kernel, x86, linux-pm
  Cc: dave.hansen, peterz, bp, rafael, ravi.v.shankar

On 4/3/2022 9:37 AM, Thomas Gleixner wrote:
> On Thu, Mar 24 2022 at 19:22, Chang S. Bae wrote:
>> When a CPU enters an idle state, non-initialized states left in large
>> registers may prevent it from reaching deeper low-power states.
>>
>> The new helper ensures the AMX state is initialized so that the CPU is
>> ready for low-power states. It will be used by the intel_idle driver.
> 
> What about AVX...AVX512? Are they harmless in that regard?

I double-checked with HW folks. Yes, that's the case. Their state, 
whether initialized or not, does not prevent the processor from entering 
a deeper low-power state.

> 
> If so, then the first sentence above is confusing and should clearly
> spell out that it's AMX which causes that problem.

I see. Will do that.

Thanks,
Chang

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-04-05  0:45 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-25  2:22 [PATCH v3 0/3] x86/fpu: Make AMX state ready for CPU idle Chang S. Bae
2022-03-25  2:22 ` [PATCH v3 1/3] x86/fpu: Make XCR0 accessors immune to unwanted compiler reordering Chang S. Bae
2022-04-01 17:58   ` Dave Hansen
2022-04-01 18:16     ` Dave Hansen
2022-04-01 22:14       ` Chang S. Bae
2022-03-25  2:22 ` [PATCH v3 2/3] x86/fpu: Add a helper to prepare AMX state for low-power CPU idle Chang S. Bae
2022-04-03 16:37   ` Thomas Gleixner
2022-04-05  0:24     ` Chang S. Bae
2022-03-25  2:22 ` [PATCH v3 3/3] intel_idle: Add a new flag to initialize the AMX state Chang S. Bae
2022-03-29 17:42   ` Rafael J. Wysocki
