All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/3] x86/speculation: Disable IBRS when idle
@ 2023-06-22  0:36 Waiman Long
  2023-06-22  0:36 ` [PATCH v3 1/3] x86/idle: Disable IBRS when cpu is offline Waiman Long
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Waiman Long @ 2023-06-22  0:36 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown
  Cc: linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario, Waiman Long

 v3:
  - Drop patches 1 ("x86/speculation: Provide a debugfs file to dump
    SPEC_CTRL MSRs") and 5 ("x86/idle: Disable IBRS entering mwait idle
    and enable it on wakeup") for now.
  - Drop the MSR restoration code in ("x86/idle: Disable IBRS when cpu
    is offline") as native_play_dead() does not return.
  - For patch ("intel_idle: Add ibrs_off module parameter to force
    disable IBRS"), change the name from "no_ibrs" to "ibrs_off" and
    document the new parameter in intel_idle.rst.

For Intel processors that need to turn on IBRS to protect against
Spectre v2 and Retbleed, the IBRS bit in the SPEC_CTRL MSR affects
the performance of the whole core even if only one thread is turning
it on when running in the kernel. For user space heavy applications,
the performance impact of occasionally turning IBRS on during syscalls
shouldn't be significant. Unfortunately, that is not the case when the
sibling thread is idling in the kernel. In that case, the performance
impact can be significant.

When DPDK is running on an isolated CPU thread processing network packets
in user space while its sibling thread is idle. The performance of the
busy DPDK thread with IBRS on and off in the sibling idle thread are:

                                IBRS on         IBRS off
                                -------         --------
  packets/second:                  7.8M           10.4M
  avg tsc cycles/packet:         282.26          209.86

This is a 25% performance degradation. The test system is a Intel Xeon
4114 CPU @ 2.20GHz.

Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the CPU enters long idle (C6 or below). However, there
are existing users out there who have set "intel_idle.max_cstate=1"
to decrease latency. Those users won't be able to benefit from this
commit. This patch series extends this commit by providing a new
"intel_idle.ibrs_off" module parameter to force disable IBRS even when
"intel_idle.max_cstate=1" at the expense of increased IRQ response
latency. It also includes a commit to allow the disabling of IBRS when
a CPU becomes offline.

Waiman Long (3):
  x86/idle: Disable IBRS when cpu is offline
  intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current
  intel_idle: Add ibrs_off module parameter to force disable IBRS

 Documentation/admin-guide/pm/intel_idle.rst | 17 +++++++++++++++-
 arch/x86/kernel/smpboot.c                   | 10 ++++++++++
 drivers/idle/intel_idle.c                   | 22 +++++++++++++++++----
 3 files changed, 44 insertions(+), 5 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 1/3] x86/idle: Disable IBRS when cpu is offline
  2023-06-22  0:36 [PATCH v3 0/3] x86/speculation: Disable IBRS when idle Waiman Long
@ 2023-06-22  0:36 ` Waiman Long
  2023-06-22  2:05   ` Randy Dunlap
  2023-06-22  5:40   ` Josh Poimboeuf
  2023-06-22  0:36 ` [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current Waiman Long
  2023-06-22  0:36 ` [PATCH v3 3/3] intel_idle: Add ibrs_off module parameter to force disable IBRS Waiman Long
  2 siblings, 2 replies; 15+ messages in thread
From: Waiman Long @ 2023-06-22  0:36 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown
  Cc: linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario, Waiman Long

Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the CPU enters long idle. However, when a CPU
becomes offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS
is enabled. That will impact the performance of a sibling CPU. Mitigate
this performance impact by clearing all the mitigation bits in SPEC_CTRL
MSR when offline. When the CPU is online again, it will be re-initialized
and so restoring the SPEC_CTRL value isn't needed.

Add a comment to say that native_play_dead() is a __noreturn function,
but it can't be marked as such to avoid confusion about the missing
MSR restoration code.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 arch/x86/kernel/smpboot.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 352f0ce1ece4..7bc33885518c 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -84,6 +84,7 @@
 #include <asm/hw_irq.h>
 #include <asm/stackprotector.h>
 #include <asm/sev.h>
+#include <asm/nospec-branch.h>
 
 /* representing HT siblings of each logical CPU */
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
@@ -1836,8 +1837,17 @@ void __noreturn hlt_play_dead(void)
 	}
 }
 
+/*
+ * naitve_play_dead() is essentially a __noreturn function, but it can't
+ * be marked as such as the compiler may complain about it.
+ */
 void native_play_dead(void)
 {
+	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
+		this_cpu_write(x86_spec_ctrl_current, 0);
+		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+	}
+
 	play_dead_common();
 	tboot_shutdown(TB_SHUTDOWN_WFS);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current
  2023-06-22  0:36 [PATCH v3 0/3] x86/speculation: Disable IBRS when idle Waiman Long
  2023-06-22  0:36 ` [PATCH v3 1/3] x86/idle: Disable IBRS when cpu is offline Waiman Long
@ 2023-06-22  0:36 ` Waiman Long
  2023-06-22  5:46   ` Josh Poimboeuf
  2023-06-22  0:36 ` [PATCH v3 3/3] intel_idle: Add ibrs_off module parameter to force disable IBRS Waiman Long
  2 siblings, 1 reply; 15+ messages in thread
From: Waiman Long @ 2023-06-22  0:36 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown
  Cc: linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario, Waiman Long

When intel_idle_ibrs() is called, it modifies the SPEC_CTRL MSR to
0 in order disable IBRS. However, the new MSR value isn't reflected
in x86_spec_ctrl_current which is at odd with the other code that
keep track of its state in that percpu variable. Fix that by updating
x86_spec_ctrl_current percpu value to always match the content of the
SPEC_CTRL MSR.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 drivers/idle/intel_idle.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index aa2d19db2b1d..07fa23707b3c 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -181,13 +181,17 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
 	u64 spec_ctrl = spec_ctrl_current();
 	int ret;
 
-	if (smt_active)
+	if (smt_active) {
+		__this_cpu_write(x86_spec_ctrl_current, 0);
 		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+	}
 
 	ret = __intel_idle(dev, drv, index);
 
-	if (smt_active)
+	if (smt_active) {
 		native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
+		__this_cpu_write(x86_spec_ctrl_current, spec_ctrl);
+	}
 
 	return ret;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 3/3] intel_idle: Add ibrs_off module parameter to force disable IBRS
  2023-06-22  0:36 [PATCH v3 0/3] x86/speculation: Disable IBRS when idle Waiman Long
  2023-06-22  0:36 ` [PATCH v3 1/3] x86/idle: Disable IBRS when cpu is offline Waiman Long
  2023-06-22  0:36 ` [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current Waiman Long
@ 2023-06-22  0:36 ` Waiman Long
  2 siblings, 0 replies; 15+ messages in thread
From: Waiman Long @ 2023-06-22  0:36 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown
  Cc: linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario, Waiman Long

Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the cstate is 6 or lower. However, there are
some use cases where a customer may want to use max_cstate=1 to
lower latency. Such use cases will suffer from the performance
degradation caused by the enabling of IBRS in the sibling idle thread.
Add a "ibrs_off" module parameter to force disable IBRS and the
CPUIDLE_FLAG_IRQ_ENABLE flag if set.

In the case of a Skylake server with max_cstate=1, this new ibrs_off
option will likely increase the IRQ response latency as IRQ will now
be disabled.

When running SPECjbb2015 with cstates set to C1 on a Skylake system.

First test when the kernel is booted with: "intel_idle.ibrs_off"
  max-jOPS = 117828, critical-jOPS = 66047

Then retest when the kernel is booted without the "intel_idle.ibrs_off"
added.
  max-jOPS = 116408, critical-jOPS = 58958

That means booting with "intel_idle.ibrs_off" improves performance by:
  max-jOPS:   1.2%, which could be considered noise range.
  critical-jOPS: 12%, which is definitely a solid improvement.

The admin-guide/pm/intel_idle.rst file is updated to add a description
about the new "ibrs_off" module parameter.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 Documentation/admin-guide/pm/intel_idle.rst | 17 ++++++++++++++++-
 drivers/idle/intel_idle.c                   | 14 ++++++++++++--
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/pm/intel_idle.rst b/Documentation/admin-guide/pm/intel_idle.rst
index b799a43da62e..8604e6d1fe2c 100644
--- a/Documentation/admin-guide/pm/intel_idle.rst
+++ b/Documentation/admin-guide/pm/intel_idle.rst
@@ -170,7 +170,7 @@ and ``idle=nomwait``.  If any of them is present in the kernel command line, the
 ``MWAIT`` instruction is not allowed to be used, so the initialization of
 ``intel_idle`` will fail.
 
-Apart from that there are four module parameters recognized by ``intel_idle``
+Apart from that there are five module parameters recognized by ``intel_idle``
 itself that can be set via the kernel command line (they cannot be updated via
 sysfs, so that is the only way to change their values).
 
@@ -216,6 +216,21 @@ are ignored).
 The idle states disabled this way can be enabled (on a per-CPU basis) from user
 space via ``sysfs``.
 
+The ``ibrs_off`` module parameter is a boolean flag (default to false). It is
+used to control if IBRS (Indirect Branch Restricted Speculation) should be
+turned off, if set, when the CPU enters an idle state.  This flag will not
+affect CPUs that are using Enhanced IBRS which can remain on with little
+performance impact.
+
+For some CPUs, IBRS will be selected as mitigation for Spectre v2 and Retbleed
+security vulnerabilities by default.  Leaving the IBRS mode on while idling may
+have a performance impact on its sibling CPU.  The IBRS mode will be turned off
+by default when the CPU enters into a deep idle state, but not in some
+shallower ones.  Setting the ``ibrs_off`` module parameter will force the IBRS
+mode to off when the CPU is in any one of the available idle states.  This may
+help performance of a sibling CPU at the expense of a slightly higher wakeup
+latency for the idle CPU.
+
 
 .. _intel-idle-core-and-package-idle-states:
 
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 07fa23707b3c..8bbf1e6d845c 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -69,6 +69,7 @@ static int max_cstate = CPUIDLE_STATE_MAX - 1;
 static unsigned int disabled_states_mask __read_mostly;
 static unsigned int preferred_states_mask __read_mostly;
 static bool force_irq_on __read_mostly;
+static bool ibrs_off __read_mostly;
 
 static struct cpuidle_device __percpu *intel_idle_cpuidle_devices;
 
@@ -1907,12 +1908,15 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
 			WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE);
 			state->enter = intel_idle_xstate;
 		} else if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) &&
-			   state->flags & CPUIDLE_FLAG_IBRS) {
+			  ((state->flags & CPUIDLE_FLAG_IBRS) || ibrs_off)) {
 			/*
 			 * IBRS mitigation requires that C-states are entered
 			 * with interrupts disabled.
 			 */
-			WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE);
+			if (ibrs_off && (state->flags & CPUIDLE_FLAG_IRQ_ENABLE))
+				state->flags &= ~CPUIDLE_FLAG_IRQ_ENABLE;
+			else
+				WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE);
 			state->enter = intel_idle_ibrs;
 		} else if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE) {
 			state->enter = intel_idle_irq;
@@ -2165,3 +2169,9 @@ MODULE_PARM_DESC(preferred_cstates, "Mask of preferred idle states");
  * 'CPUIDLE_FLAG_INIT_XSTATE' and 'CPUIDLE_FLAG_IBRS' flags.
  */
 module_param(force_irq_on, bool, 0444);
+/*
+ * Force the disabling of IBRS when X86_FEATURE_KERNEL_IBRS is on and
+ * CPUIDLE_FLAG_IRQ_ENABLE isn't set.
+ */
+module_param(ibrs_off, bool, 0444);
+MODULE_PARM_DESC(ibrs_off, "Disable IBRS when idle");
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] x86/idle: Disable IBRS when cpu is offline
  2023-06-22  0:36 ` [PATCH v3 1/3] x86/idle: Disable IBRS when cpu is offline Waiman Long
@ 2023-06-22  2:05   ` Randy Dunlap
  2023-06-22  2:13     ` Waiman Long
  2023-06-22  5:40   ` Josh Poimboeuf
  1 sibling, 1 reply; 15+ messages in thread
From: Randy Dunlap @ 2023-06-22  2:05 UTC (permalink / raw)
  To: Waiman Long, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Josh Poimboeuf,
	Pawan Gupta, Jacob Pan, Len Brown
  Cc: linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario



On 6/21/23 17:36, Waiman Long wrote:
> Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
> disables IBRS when the CPU enters long idle. However, when a CPU
> becomes offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS
> is enabled. That will impact the performance of a sibling CPU. Mitigate
> this performance impact by clearing all the mitigation bits in SPEC_CTRL
> MSR when offline. When the CPU is online again, it will be re-initialized
> and so restoring the SPEC_CTRL value isn't needed.
> 
> Add a comment to say that native_play_dead() is a __noreturn function,
> but it can't be marked as such to avoid confusion about the missing
> MSR restoration code.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  arch/x86/kernel/smpboot.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 352f0ce1ece4..7bc33885518c 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -84,6 +84,7 @@
>  #include <asm/hw_irq.h>
>  #include <asm/stackprotector.h>
>  #include <asm/sev.h>
> +#include <asm/nospec-branch.h>
>  
>  /* representing HT siblings of each logical CPU */
>  DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
> @@ -1836,8 +1837,17 @@ void __noreturn hlt_play_dead(void)
>  	}
>  }
>  
> +/*
> + * naitve_play_dead() is essentially a __noreturn function, but it can't

typo: native_play_dead()

> + * be marked as such as the compiler may complain about it.
> + */
>  void native_play_dead(void)
>  {
> +	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
> +		this_cpu_write(x86_spec_ctrl_current, 0);
> +		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> +	}
> +
>  	play_dead_common();
>  	tboot_shutdown(TB_SHUTDOWN_WFS);
>  

-- 
~Randy

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] x86/idle: Disable IBRS when cpu is offline
  2023-06-22  2:05   ` Randy Dunlap
@ 2023-06-22  2:13     ` Waiman Long
  0 siblings, 0 replies; 15+ messages in thread
From: Waiman Long @ 2023-06-22  2:13 UTC (permalink / raw)
  To: Randy Dunlap, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, Josh Poimboeuf,
	Pawan Gupta, Jacob Pan, Len Brown
  Cc: linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario


On 6/21/23 22:05, Randy Dunlap wrote:
>
> On 6/21/23 17:36, Waiman Long wrote:
>> Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
>> disables IBRS when the CPU enters long idle. However, when a CPU
>> becomes offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS
>> is enabled. That will impact the performance of a sibling CPU. Mitigate
>> this performance impact by clearing all the mitigation bits in SPEC_CTRL
>> MSR when offline. When the CPU is online again, it will be re-initialized
>> and so restoring the SPEC_CTRL value isn't needed.
>>
>> Add a comment to say that native_play_dead() is a __noreturn function,
>> but it can't be marked as such to avoid confusion about the missing
>> MSR restoration code.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   arch/x86/kernel/smpboot.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index 352f0ce1ece4..7bc33885518c 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -84,6 +84,7 @@
>>   #include <asm/hw_irq.h>
>>   #include <asm/stackprotector.h>
>>   #include <asm/sev.h>
>> +#include <asm/nospec-branch.h>
>>   
>>   /* representing HT siblings of each logical CPU */
>>   DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
>> @@ -1836,8 +1837,17 @@ void __noreturn hlt_play_dead(void)
>>   	}
>>   }
>>   
>> +/*
>> + * naitve_play_dead() is essentially a __noreturn function, but it can't
> typo: native_play_dead()

Thanks for spotting the typo. Will fix it in the next version.

Cheers,
Longman

>
>> + * be marked as such as the compiler may complain about it.
>> + */
>>   void native_play_dead(void)
>>   {
>> +	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
>> +		this_cpu_write(x86_spec_ctrl_current, 0);
>> +		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
>> +	}
>> +
>>   	play_dead_common();
>>   	tboot_shutdown(TB_SHUTDOWN_WFS);
>>   


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] x86/idle: Disable IBRS when cpu is offline
  2023-06-22  0:36 ` [PATCH v3 1/3] x86/idle: Disable IBRS when cpu is offline Waiman Long
  2023-06-22  2:05   ` Randy Dunlap
@ 2023-06-22  5:40   ` Josh Poimboeuf
  2023-06-22 12:27     ` Waiman Long
  1 sibling, 1 reply; 15+ messages in thread
From: Josh Poimboeuf @ 2023-06-22  5:40 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Pawan Gupta, Jacob Pan, Len Brown,
	linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario

On Wed, Jun 21, 2023 at 08:36:01PM -0400, Waiman Long wrote:
> +/*
> + * naitve_play_dead() is essentially a __noreturn function, but it can't
> + * be marked as such as the compiler may complain about it.
> + */

FWIW, we could in theory do so by marking the smp_ops.play_dead function
pointer as __noreturn, but it would be tricky to teach objtool how to
understand that.

>  void native_play_dead(void)
>  {
> +	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
> +		this_cpu_write(x86_spec_ctrl_current, 0);
> +		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> +	}

Can update_spec_ctrl() be used instead?

-- 
Josh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current
  2023-06-22  0:36 ` [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current Waiman Long
@ 2023-06-22  5:46   ` Josh Poimboeuf
  2023-06-22  9:38     ` Peter Zijlstra
                       ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Josh Poimboeuf @ 2023-06-22  5:46 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Pawan Gupta, Jacob Pan, Len Brown,
	linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario

On Wed, Jun 21, 2023 at 08:36:02PM -0400, Waiman Long wrote:
> When intel_idle_ibrs() is called, it modifies the SPEC_CTRL MSR to
> 0 in order disable IBRS. However, the new MSR value isn't reflected
> in x86_spec_ctrl_current which is at odd with the other code that
> keep track of its state in that percpu variable. Fix that by updating
> x86_spec_ctrl_current percpu value to always match the content of the
> SPEC_CTRL MSR.

Is this fixing an actual bug or is there some other reason for doing
this?

> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  drivers/idle/intel_idle.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index aa2d19db2b1d..07fa23707b3c 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -181,13 +181,17 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
>  	u64 spec_ctrl = spec_ctrl_current();
>  	int ret;
>  
> -	if (smt_active)
> +	if (smt_active) {
> +		__this_cpu_write(x86_spec_ctrl_current, 0);
>  		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> +	}
>  
>  	ret = __intel_idle(dev, drv, index);
>  
> -	if (smt_active)
> +	if (smt_active) {
>  		native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
> +		__this_cpu_write(x86_spec_ctrl_current, spec_ctrl);
> +	}

More candidates for update_spec_ctrl()?

-- 
Josh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current
  2023-06-22  5:46   ` Josh Poimboeuf
@ 2023-06-22  9:38     ` Peter Zijlstra
  2023-06-22 12:43       ` Waiman Long
  2023-06-22 13:08       ` Peter Zijlstra
  2023-06-22  9:40     ` Peter Zijlstra
  2023-06-22 12:31     ` Waiman Long
  2 siblings, 2 replies; 15+ messages in thread
From: Peter Zijlstra @ 2023-06-22  9:38 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Waiman Long, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Pawan Gupta, Jacob Pan, Len Brown,
	linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario

On Wed, Jun 21, 2023 at 10:46:33PM -0700, Josh Poimboeuf wrote:
> On Wed, Jun 21, 2023 at 08:36:02PM -0400, Waiman Long wrote:
> > When intel_idle_ibrs() is called, it modifies the SPEC_CTRL MSR to
> > 0 in order disable IBRS. However, the new MSR value isn't reflected
> > in x86_spec_ctrl_current which is at odd with the other code that
> > keep track of its state in that percpu variable. Fix that by updating
> > x86_spec_ctrl_current percpu value to always match the content of the
> > SPEC_CTRL MSR.
> 
> Is this fixing an actual bug or is there some other reason for doing
> this?
> 
> > 
> > Signed-off-by: Waiman Long <longman@redhat.com>
> > ---
> >  drivers/idle/intel_idle.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> > index aa2d19db2b1d..07fa23707b3c 100644
> > --- a/drivers/idle/intel_idle.c
> > +++ b/drivers/idle/intel_idle.c
> > @@ -181,13 +181,17 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
> >  	u64 spec_ctrl = spec_ctrl_current();
> >  	int ret;
> >  
> > -	if (smt_active)
> > +	if (smt_active) {
> > +		__this_cpu_write(x86_spec_ctrl_current, 0);
> >  		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> > +	}
> >  
> >  	ret = __intel_idle(dev, drv, index);
> >  
> > -	if (smt_active)
> > +	if (smt_active) {
> >  		native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
> > +		__this_cpu_write(x86_spec_ctrl_current, spec_ctrl);
> > +	}
> 
> More candidates for update_spec_ctrl()?

Both this and the play_dead case can't use update_spec_ctrl() because
RCU isn't there anymore and all that is noinstr. Additionally, both
sites rely on preemption being off already, where update_spec_ctrl()
can't do that.

That said, I suppose one could write it like so:

static __always_inline __update_spec_ctrl(u64 val)
{
	__this_cpu_write(x86_spec_ctrl_current, val);
	native_wrmsrl(MSR_IA32_SPEC_CTRL, val);
}

static void update_spec_ctrl(u64 val)
{
	preempt_disable();
	__update_spec_ctrl(val);
	preempt_enable();
}

And then you can use __update_spec_ctrl(). But that would need a wee
audit of using native_wrmsrl() in all places, probably ok, IIRC Xen
wasn't using our IBRS stuff anyway.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current
  2023-06-22  5:46   ` Josh Poimboeuf
  2023-06-22  9:38     ` Peter Zijlstra
@ 2023-06-22  9:40     ` Peter Zijlstra
  2023-06-22 12:34       ` Waiman Long
  2023-06-22 12:31     ` Waiman Long
  2 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2023-06-22  9:40 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Waiman Long, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Pawan Gupta, Jacob Pan, Len Brown,
	linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario

On Wed, Jun 21, 2023 at 10:46:33PM -0700, Josh Poimboeuf wrote:
> On Wed, Jun 21, 2023 at 08:36:02PM -0400, Waiman Long wrote:
> > When intel_idle_ibrs() is called, it modifies the SPEC_CTRL MSR to
> > 0 in order disable IBRS. However, the new MSR value isn't reflected
> > in x86_spec_ctrl_current which is at odd with the other code that
> > keep track of its state in that percpu variable. Fix that by updating
> > x86_spec_ctrl_current percpu value to always match the content of the
> > SPEC_CTRL MSR.
> 
> Is this fixing an actual bug or is there some other reason for doing
> this?

No actual bug, he did this for his debugfs file -- which is no longer
part of the series. With that on, you can observe the
x86_spec_ctrl_current value while idle.

Arguably that's not even inconsistent because we disable/enable the
thing with IRQs disabled, so nothing can observe the difference.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/3] x86/idle: Disable IBRS when cpu is offline
  2023-06-22  5:40   ` Josh Poimboeuf
@ 2023-06-22 12:27     ` Waiman Long
  0 siblings, 0 replies; 15+ messages in thread
From: Waiman Long @ 2023-06-22 12:27 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Pawan Gupta, Jacob Pan, Len Brown,
	linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario

On 6/22/23 01:40, Josh Poimboeuf wrote:
> On Wed, Jun 21, 2023 at 08:36:01PM -0400, Waiman Long wrote:
>> +/*
>> + * naitve_play_dead() is essentially a __noreturn function, but it can't
>> + * be marked as such as the compiler may complain about it.
>> + */
> FWIW, we could in theory do so by marking the smp_ops.play_dead function
> pointer as __noreturn, but it would be tricky to teach objtool how to
> understand that.
I added the comment here because I had taken out the MSR restoration 
code. We can always replace that later on if there is a better way to do 
that.
>
>>   void native_play_dead(void)
>>   {
>> +	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
>> +		this_cpu_write(x86_spec_ctrl_current, 0);
>> +		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
>> +	}
> Can update_spec_ctrl() be used instead?

Yes, the code is similar to what has been done in update_spec_ctrl(). 
Using it, however, will require exporting the function either by putting 
it into a public header or making it a global function.

Cheers,
Longman

>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current
  2023-06-22  5:46   ` Josh Poimboeuf
  2023-06-22  9:38     ` Peter Zijlstra
  2023-06-22  9:40     ` Peter Zijlstra
@ 2023-06-22 12:31     ` Waiman Long
  2 siblings, 0 replies; 15+ messages in thread
From: Waiman Long @ 2023-06-22 12:31 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Pawan Gupta, Jacob Pan, Len Brown,
	linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario

On 6/22/23 01:46, Josh Poimboeuf wrote:
> On Wed, Jun 21, 2023 at 08:36:02PM -0400, Waiman Long wrote:
>> When intel_idle_ibrs() is called, it modifies the SPEC_CTRL MSR to
>> 0 in order disable IBRS. However, the new MSR value isn't reflected
>> in x86_spec_ctrl_current which is at odd with the other code that
>> keep track of its state in that percpu variable. Fix that by updating
>> x86_spec_ctrl_current percpu value to always match the content of the
>> SPEC_CTRL MSR.
> Is this fixing an actual bug or is there some other reason for doing
> this?
It is not a bug per se. It is mainly to make the per cpu variable more 
up to date.
>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   drivers/idle/intel_idle.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
>> index aa2d19db2b1d..07fa23707b3c 100644
>> --- a/drivers/idle/intel_idle.c
>> +++ b/drivers/idle/intel_idle.c
>> @@ -181,13 +181,17 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
>>   	u64 spec_ctrl = spec_ctrl_current();
>>   	int ret;
>>   
>> -	if (smt_active)
>> +	if (smt_active) {
>> +		__this_cpu_write(x86_spec_ctrl_current, 0);
>>   		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
>> +	}
>>   
>>   	ret = __intel_idle(dev, drv, index);
>>   
>> -	if (smt_active)
>> +	if (smt_active) {
>>   		native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
>> +		__this_cpu_write(x86_spec_ctrl_current, spec_ctrl);
>> +	}
> More candidates for update_spec_ctrl()?

I don't think we can use update_spec_ctrl() here simply because of the 
noinstr requirement. See commit 9b461a6faae7 ("cpuidle, intel_idle: Fix 
CPUIDLE_FLAG_IBRS").

Cheers,
Longman


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current
  2023-06-22  9:40     ` Peter Zijlstra
@ 2023-06-22 12:34       ` Waiman Long
  0 siblings, 0 replies; 15+ messages in thread
From: Waiman Long @ 2023-06-22 12:34 UTC (permalink / raw)
  To: Peter Zijlstra, Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Pawan Gupta, Jacob Pan, Len Brown, linux-kernel,
	x86, linux-pm, Robin Jarry, Joe Mario


On 6/22/23 05:40, Peter Zijlstra wrote:
> On Wed, Jun 21, 2023 at 10:46:33PM -0700, Josh Poimboeuf wrote:
>> On Wed, Jun 21, 2023 at 08:36:02PM -0400, Waiman Long wrote:
>>> When intel_idle_ibrs() is called, it modifies the SPEC_CTRL MSR to
>>> 0 in order disable IBRS. However, the new MSR value isn't reflected
>>> in x86_spec_ctrl_current which is at odd with the other code that
>>> keep track of its state in that percpu variable. Fix that by updating
>>> x86_spec_ctrl_current percpu value to always match the content of the
>>> SPEC_CTRL MSR.
>> Is this fixing an actual bug or is there some other reason for doing
>> this?
> No actual bug, he did this for his debugfs file -- which is no longer
> part of the series. With that on, you can observe the
> x86_spec_ctrl_current value while idle.

Right. That is the main reason as I want the SPEC_CTRL MSRs value to be 
observable by some external mean:-)

Regards,
Longman


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current
  2023-06-22  9:38     ` Peter Zijlstra
@ 2023-06-22 12:43       ` Waiman Long
  2023-06-22 13:08       ` Peter Zijlstra
  1 sibling, 0 replies; 15+ messages in thread
From: Waiman Long @ 2023-06-22 12:43 UTC (permalink / raw)
  To: Peter Zijlstra, Josh Poimboeuf
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Pawan Gupta, Jacob Pan, Len Brown, linux-kernel,
	x86, linux-pm, Robin Jarry, Joe Mario


On 6/22/23 05:38, Peter Zijlstra wrote:
> On Wed, Jun 21, 2023 at 10:46:33PM -0700, Josh Poimboeuf wrote:
>> On Wed, Jun 21, 2023 at 08:36:02PM -0400, Waiman Long wrote:
>>> When intel_idle_ibrs() is called, it modifies the SPEC_CTRL MSR to
>>> 0 in order disable IBRS. However, the new MSR value isn't reflected
>>> in x86_spec_ctrl_current which is at odd with the other code that
>>> keep track of its state in that percpu variable. Fix that by updating
>>> x86_spec_ctrl_current percpu value to always match the content of the
>>> SPEC_CTRL MSR.
>> Is this fixing an actual bug or is there some other reason for doing
>> this?
>>
>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>> ---
>>>   drivers/idle/intel_idle.c | 8 ++++++--
>>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
>>> index aa2d19db2b1d..07fa23707b3c 100644
>>> --- a/drivers/idle/intel_idle.c
>>> +++ b/drivers/idle/intel_idle.c
>>> @@ -181,13 +181,17 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
>>>   	u64 spec_ctrl = spec_ctrl_current();
>>>   	int ret;
>>>   
>>> -	if (smt_active)
>>> +	if (smt_active) {
>>> +		__this_cpu_write(x86_spec_ctrl_current, 0);
>>>   		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
>>> +	}
>>>   
>>>   	ret = __intel_idle(dev, drv, index);
>>>   
>>> -	if (smt_active)
>>> +	if (smt_active) {
>>>   		native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
>>> +		__this_cpu_write(x86_spec_ctrl_current, spec_ctrl);
>>> +	}
>> More candidates for update_spec_ctrl()?
> Both this and the play_dead case can't use update_spec_ctrl() because
> RCU isn't there anymore and all that is noinstr. Additionally, both
> sites rely on preemption being off already, where update_spec_ctrl()
> can't do that.
>
> That said, I suppose one could write it like so:
>
> static __always_inline __update_spec_ctrl(u64 val)
> {
> 	__this_cpu_write(x86_spec_ctrl_current, val);
> 	native_wrmsrl(MSR_IA32_SPEC_CTRL, val);
> }

We can put this into asm/nospec-branch.h since x86_spec_ctrl_current is 
defined there as well.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current
  2023-06-22  9:38     ` Peter Zijlstra
  2023-06-22 12:43       ` Waiman Long
@ 2023-06-22 13:08       ` Peter Zijlstra
  1 sibling, 0 replies; 15+ messages in thread
From: Peter Zijlstra @ 2023-06-22 13:08 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Waiman Long, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Pawan Gupta, Jacob Pan, Len Brown,
	linux-kernel, x86, linux-pm, Robin Jarry, Joe Mario

On Thu, Jun 22, 2023 at 11:38:28AM +0200, Peter Zijlstra wrote:

> Both this and the play_dead case can't use update_spec_ctrl() because
> RCU isn't there anymore and all that is noinstr. Additionally, both
> sites rely on preemption being off already, where update_spec_ctrl()
> can't do that.

Oh, I might be wrong about the preemption thing; it doesn't make sense
to do wrmsr with preemption on, so it could be simpler.

> That said, I suppose one could write it like so:
> 
> static __always_inline __update_spec_ctrl(u64 val)
> {
> 	__this_cpu_write(x86_spec_ctrl_current, val);
> 	native_wrmsrl(MSR_IA32_SPEC_CTRL, val);
> }
> 
> static void update_spec_ctrl(u64 val)
> {
> 	preempt_disable();
> 	__update_spec_ctrl(val);
> 	preempt_enable();
> }
> 
> And then you can use __update_spec_ctrl(). But that would need a wee
> audit of using native_wrmsrl() in all places, probably ok, IIRC Xen
> wasn't using our IBRS stuff anyway.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-06-22 13:09 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-22  0:36 [PATCH v3 0/3] x86/speculation: Disable IBRS when idle Waiman Long
2023-06-22  0:36 ` [PATCH v3 1/3] x86/idle: Disable IBRS when cpu is offline Waiman Long
2023-06-22  2:05   ` Randy Dunlap
2023-06-22  2:13     ` Waiman Long
2023-06-22  5:40   ` Josh Poimboeuf
2023-06-22 12:27     ` Waiman Long
2023-06-22  0:36 ` [PATCH v3 2/3] intel_idle: Sync up the SPEC_CTRL MSR value to x86_spec_ctrl_current Waiman Long
2023-06-22  5:46   ` Josh Poimboeuf
2023-06-22  9:38     ` Peter Zijlstra
2023-06-22 12:43       ` Waiman Long
2023-06-22 13:08       ` Peter Zijlstra
2023-06-22  9:40     ` Peter Zijlstra
2023-06-22 12:34       ` Waiman Long
2023-06-22 12:31     ` Waiman Long
2023-06-22  0:36 ` [PATCH v3 3/3] intel_idle: Add ibrs_off module parameter to force disable IBRS Waiman Long

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.