All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/4] x86/speculation: Disable IBRS when idle
@ 2023-07-27 18:45 Waiman Long
  2023-07-27 18:45 ` [PATCH v6 1/4] x86/speculation: Add __update_spec_ctrl() helper Waiman Long
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Waiman Long @ 2023-07-27 18:45 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown, Jonathan Corbet, Rafael J . Wysocki
  Cc: linux-kernel, linux-doc, x86, linux-pm, Robin Jarry, Joe Mario,
	Randy Dunlap, Waiman Long

 v6:
  - Fix allyesconfig build error by moving __update_spec_ctrl()
    helper from nospec-branch.h to spec-ctrl.h and include it in files
    that need the helper.

 v5:
  - Update comment in patch 1.
  - Minor doc update and code twist in patch 4 as suggested by Peter and
    Randy.

 v4:
  - Add a new __update_spec_ctrl() helper in patch 1.
  - Rebased to the latest linux kernel.

 v3:
  - Drop patches 1 ("x86/speculation: Provide a debugfs file to dump
    SPEC_CTRL MSRs") and 5 ("x86/idle: Disable IBRS entering mwait idle
    and enable it on wakeup") for now.
  - Drop the MSR restoration code in ("x86/idle: Disable IBRS when cpu
    is offline") as native_play_dead() does not return.
  - For patch ("intel_idle: Add ibrs_off module parameter to force
    disable IBRS"), change the name from "no_ibrs" to "ibrs_off" and
    document the new parameter in intel_idle.rst.

For Intel processors that need to turn on IBRS to protect against
Spectre v2 and Retbleed, the IBRS bit in the SPEC_CTRL MSR affects
the performance of the whole core even if only one thread is turning
it on when running in the kernel. For user space heavy applications,
the performance impact of occasionally turning IBRS on during syscalls
shouldn't be significant. Unfortunately, that is not the case when the
sibling thread is idling in the kernel. In that case, the performance
impact can be significant.

When DPDK is running on an isolated CPU thread processing network packets
in user space while its sibling thread is idle. The performance of the
busy DPDK thread with IBRS on and off in the sibling idle thread are:

                                IBRS on         IBRS off
                                -------         --------
  packets/second:                  7.8M           10.4M
  avg tsc cycles/packet:         282.26          209.86

This is a 25% performance degradation. The test system is a Intel Xeon
4114 CPU @ 2.20GHz.

Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the CPU enters long idle (C6 or below). However, there
are existing users out there who have set "intel_idle.max_cstate=1"
to decrease latency. Those users won't be able to benefit from this
commit. This patch series extends this commit by providing a new
"intel_idle.ibrs_off" module parameter to force disable IBRS even when
"intel_idle.max_cstate=1" at the expense of increased IRQ response
latency. It also includes a commit to allow the disabling of IBRS when
a CPU becomes offline.

Waiman Long (4):
  x86/speculation: Add __update_spec_ctrl() helper
  x86/idle: Disable IBRS when cpu is offline
  intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs()
  intel_idle: Add ibrs_off module parameter to force disable IBRS

 Documentation/admin-guide/pm/intel_idle.rst | 17 ++++++++++++++++-
 arch/x86/include/asm/spec-ctrl.h            | 11 +++++++++++
 arch/x86/kernel/smpboot.c                   |  8 ++++++++
 drivers/idle/intel_idle.c                   | 18 +++++++++++++-----
 4 files changed, 48 insertions(+), 6 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v6 1/4] x86/speculation: Add __update_spec_ctrl() helper
  2023-07-27 18:45 [PATCH v6 0/4] x86/speculation: Disable IBRS when idle Waiman Long
@ 2023-07-27 18:45 ` Waiman Long
  2023-10-04 15:47   ` [tip: sched/core] " tip-bot2 for Waiman Long
  2023-10-07 17:10   ` tip-bot2 for Waiman Long
  2023-07-27 18:45 ` [PATCH v6 2/4] x86/idle: Disable IBRS when cpu is offline Waiman Long
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 16+ messages in thread
From: Waiman Long @ 2023-07-27 18:45 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown, Jonathan Corbet, Rafael J . Wysocki
  Cc: linux-kernel, linux-doc, x86, linux-pm, Robin Jarry, Joe Mario,
	Randy Dunlap, Waiman Long

Add a new __update_spec_ctrl() helper which is a variant of
update_spec_ctrl() that can be used in a noinstr function.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 arch/x86/include/asm/spec-ctrl.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index cb0386fc4dc3..d03408e0468a 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -4,6 +4,7 @@
 
 #include <linux/thread_info.h>
 #include <asm/nospec-branch.h>
+#include <asm/msr.h>
 
 /*
  * On VMENTER we must preserve whatever view of the SPEC_CTRL MSR
@@ -76,6 +77,16 @@ static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn)
 	return (tifn & _TIF_SSBD) ? x86_amd_ls_cfg_ssbd_mask : 0ULL;
 }
 
+/*
+ * This can be used in noinstr function & should only be called in bare
+ * metal context.
+ */
+static __always_inline void __update_spec_ctrl(u64 val)
+{
+	__this_cpu_write(x86_spec_ctrl_current, val);
+	native_wrmsrl(MSR_IA32_SPEC_CTRL, val);
+}
+
 #ifdef CONFIG_SMP
 extern void speculative_store_bypass_ht_init(void);
 #else
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v6 2/4] x86/idle: Disable IBRS when cpu is offline
  2023-07-27 18:45 [PATCH v6 0/4] x86/speculation: Disable IBRS when idle Waiman Long
  2023-07-27 18:45 ` [PATCH v6 1/4] x86/speculation: Add __update_spec_ctrl() helper Waiman Long
@ 2023-07-27 18:45 ` Waiman Long
  2023-10-04 15:47   ` [tip: sched/core] x86/idle: Disable IBRS when CPU is offline to improve single-threaded performance tip-bot2 for Waiman Long
  2023-10-07 17:10   ` tip-bot2 for Waiman Long
  2023-07-27 18:45 ` [PATCH v6 3/4] intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs() Waiman Long
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 16+ messages in thread
From: Waiman Long @ 2023-07-27 18:45 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown, Jonathan Corbet, Rafael J . Wysocki
  Cc: linux-kernel, linux-doc, x86, linux-pm, Robin Jarry, Joe Mario,
	Randy Dunlap, Waiman Long

Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the CPU enters long idle. However, when a CPU
becomes offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS
is enabled. That will impact the performance of a sibling CPU. Mitigate
this performance impact by clearing all the mitigation bits in SPEC_CTRL
MSR when offline. When the CPU is online again, it will be re-initialized
and so restoring the SPEC_CTRL value isn't needed.

Add a comment to say that native_play_dead() is a __noreturn function,
but it can't be marked as such to avoid confusion about the missing
MSR restoration code.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 arch/x86/kernel/smpboot.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index e1aa2cd7734b..16f420d49a13 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -87,6 +87,7 @@
 #include <asm/hw_irq.h>
 #include <asm/stackprotector.h>
 #include <asm/sev.h>
+#include <asm/spec-ctrl.h>
 
 /* representing HT siblings of each logical CPU */
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
@@ -1743,8 +1744,15 @@ void __noreturn hlt_play_dead(void)
 		native_halt();
 }
 
+/*
+ * native_play_dead() is essentially a __noreturn function, but it can't
+ * be marked as such as the compiler may complain about it.
+ */
 void native_play_dead(void)
 {
+	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS))
+		__update_spec_ctrl(0);
+
 	play_dead_common();
 	tboot_shutdown(TB_SHUTDOWN_WFS);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v6 3/4] intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs()
  2023-07-27 18:45 [PATCH v6 0/4] x86/speculation: Disable IBRS when idle Waiman Long
  2023-07-27 18:45 ` [PATCH v6 1/4] x86/speculation: Add __update_spec_ctrl() helper Waiman Long
  2023-07-27 18:45 ` [PATCH v6 2/4] x86/idle: Disable IBRS when cpu is offline Waiman Long
@ 2023-07-27 18:45 ` Waiman Long
  2023-10-04 15:47   ` [tip: sched/core] " tip-bot2 for Waiman Long
  2023-10-07 17:10   ` tip-bot2 for Waiman Long
  2023-07-27 18:46 ` [PATCH v6 4/4] intel_idle: Add ibrs_off module parameter to force disable IBRS Waiman Long
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 16+ messages in thread
From: Waiman Long @ 2023-07-27 18:45 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown, Jonathan Corbet, Rafael J . Wysocki
  Cc: linux-kernel, linux-doc, x86, linux-pm, Robin Jarry, Joe Mario,
	Randy Dunlap, Waiman Long

When intel_idle_ibrs() is called, it modifies the SPEC_CTRL MSR to 0
in order disable IBRS. However, the new MSR value isn't reflected in
x86_spec_ctrl_current which is at odd with the other code that keep track
of its state in that percpu variable.  Use the new __update_spec_ctrl()
to have the x86_spec_ctrl_current percpu value properly updated.

Since spec-ctrl.h includes both msr.h and nospec-branch.h, we can remove
those from the include file list.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/idle/intel_idle.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 256c2d42e350..d286f31a5b96 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -53,9 +53,8 @@
 #include <linux/moduleparam.h>
 #include <asm/cpu_device_id.h>
 #include <asm/intel-family.h>
-#include <asm/nospec-branch.h>
 #include <asm/mwait.h>
-#include <asm/msr.h>
+#include <asm/spec-ctrl.h>
 #include <asm/fpu/api.h>
 
 #define INTEL_IDLE_VERSION "0.5.1"
@@ -182,12 +181,12 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
 	int ret;
 
 	if (smt_active)
-		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+		__update_spec_ctrl(0);
 
 	ret = __intel_idle(dev, drv, index);
 
 	if (smt_active)
-		native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
+		__update_spec_ctrl(spec_ctrl);
 
 	return ret;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v6 4/4] intel_idle: Add ibrs_off module parameter to force disable IBRS
  2023-07-27 18:45 [PATCH v6 0/4] x86/speculation: Disable IBRS when idle Waiman Long
                   ` (2 preceding siblings ...)
  2023-07-27 18:45 ` [PATCH v6 3/4] intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs() Waiman Long
@ 2023-07-27 18:46 ` Waiman Long
  2023-10-04 15:47   ` [tip: sched/core] intel_idle: Add ibrs_off module parameter to force-disable IBRS tip-bot2 for Waiman Long
  2023-10-07 17:10   ` tip-bot2 for Waiman Long
  2023-08-16 20:42 ` [PATCH v6 0/4] x86/speculation: Disable IBRS when idle Waiman Long
  2023-10-04 11:50 ` Ingo Molnar
  5 siblings, 2 replies; 16+ messages in thread
From: Waiman Long @ 2023-07-27 18:46 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown, Jonathan Corbet, Rafael J . Wysocki
  Cc: linux-kernel, linux-doc, x86, linux-pm, Robin Jarry, Joe Mario,
	Randy Dunlap, Waiman Long

Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the cstate is 6 or lower. However, there are
some use cases where a customer may want to use max_cstate=1 to
lower latency. Such use cases will suffer from the performance
degradation caused by the enabling of IBRS in the sibling idle thread.
Add a "ibrs_off" module parameter to force disable IBRS and the
CPUIDLE_FLAG_IRQ_ENABLE flag if set.

In the case of a Skylake server with max_cstate=1, this new ibrs_off
option will likely increase the IRQ response latency as IRQ will now
be disabled.

When running SPECjbb2015 with cstates set to C1 on a Skylake system.

First test when the kernel is booted with: "intel_idle.ibrs_off"
  max-jOPS = 117828, critical-jOPS = 66047

Then retest when the kernel is booted without the "intel_idle.ibrs_off"
added.
  max-jOPS = 116408, critical-jOPS = 58958

That means booting with "intel_idle.ibrs_off" improves performance by:
  max-jOPS:   1.2%, which could be considered noise range.
  critical-jOPS: 12%, which is definitely a solid improvement.

The admin-guide/pm/intel_idle.rst file is updated to add a description
about the new "ibrs_off" module parameter.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/admin-guide/pm/intel_idle.rst | 17 ++++++++++++++++-
 drivers/idle/intel_idle.c                   | 11 ++++++++++-
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/pm/intel_idle.rst b/Documentation/admin-guide/pm/intel_idle.rst
index b799a43da62e..39bd6ecce7de 100644
--- a/Documentation/admin-guide/pm/intel_idle.rst
+++ b/Documentation/admin-guide/pm/intel_idle.rst
@@ -170,7 +170,7 @@ and ``idle=nomwait``.  If any of them is present in the kernel command line, the
 ``MWAIT`` instruction is not allowed to be used, so the initialization of
 ``intel_idle`` will fail.
 
-Apart from that there are four module parameters recognized by ``intel_idle``
+Apart from that there are five module parameters recognized by ``intel_idle``
 itself that can be set via the kernel command line (they cannot be updated via
 sysfs, so that is the only way to change their values).
 
@@ -216,6 +216,21 @@ are ignored).
 The idle states disabled this way can be enabled (on a per-CPU basis) from user
 space via ``sysfs``.
 
+The ``ibrs_off`` module parameter is a boolean flag (defaults to
+false). If set, it is used to control if IBRS (Indirect Branch Restricted
+Speculation) should be turned off when the CPU enters an idle state.
+This flag does not affect CPUs that use Enhanced IBRS which can remain
+on with little performance impact.
+
+For some CPUs, IBRS will be selected as mitigation for Spectre v2 and Retbleed
+security vulnerabilities by default.  Leaving the IBRS mode on while idling may
+have a performance impact on its sibling CPU.  The IBRS mode will be turned off
+by default when the CPU enters into a deep idle state, but not in some
+shallower ones.  Setting the ``ibrs_off`` module parameter will force the IBRS
+mode to off when the CPU is in any one of the available idle states.  This may
+help performance of a sibling CPU at the expense of a slightly higher wakeup
+latency for the idle CPU.
+
 
 .. _intel-idle-core-and-package-idle-states:
 
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index d286f31a5b96..18fa47370bde 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -68,6 +68,7 @@ static int max_cstate = CPUIDLE_STATE_MAX - 1;
 static unsigned int disabled_states_mask __read_mostly;
 static unsigned int preferred_states_mask __read_mostly;
 static bool force_irq_on __read_mostly;
+static bool ibrs_off __read_mostly;
 
 static struct cpuidle_device __percpu *intel_idle_cpuidle_devices;
 
@@ -1852,11 +1853,13 @@ static void state_update_enter_method(struct cpuidle_state *state, int cstate)
 	}
 
 	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) &&
-			   state->flags & CPUIDLE_FLAG_IBRS) {
+			((state->flags & CPUIDLE_FLAG_IBRS) || ibrs_off)) {
 		/*
 		 * IBRS mitigation requires that C-states are entered
 		 * with interrupts disabled.
 		 */
+		if (ibrs_off && (state->flags & CPUIDLE_FLAG_IRQ_ENABLE))
+			state->flags &= ~CPUIDLE_FLAG_IRQ_ENABLE;
 		WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE);
 		state->enter = intel_idle_ibrs;
 		return;
@@ -2175,3 +2178,9 @@ MODULE_PARM_DESC(preferred_cstates, "Mask of preferred idle states");
  * 'CPUIDLE_FLAG_INIT_XSTATE' and 'CPUIDLE_FLAG_IBRS' flags.
  */
 module_param(force_irq_on, bool, 0444);
+/*
+ * Force the disabling of IBRS when X86_FEATURE_KERNEL_IBRS is on and
+ * CPUIDLE_FLAG_IRQ_ENABLE isn't set.
+ */
+module_param(ibrs_off, bool, 0444);
+MODULE_PARM_DESC(ibrs_off, "Disable IBRS when idle");
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 0/4] x86/speculation: Disable IBRS when idle
  2023-07-27 18:45 [PATCH v6 0/4] x86/speculation: Disable IBRS when idle Waiman Long
                   ` (3 preceding siblings ...)
  2023-07-27 18:46 ` [PATCH v6 4/4] intel_idle: Add ibrs_off module parameter to force disable IBRS Waiman Long
@ 2023-08-16 20:42 ` Waiman Long
  2023-10-04 11:50 ` Ingo Molnar
  5 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2023-08-16 20:42 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown, Jonathan Corbet, Rafael J . Wysocki
  Cc: linux-kernel, linux-doc, x86, linux-pm, Robin Jarry, Joe Mario,
	Randy Dunlap

On 7/27/23 14:45, Waiman Long wrote:
>   v6:
>    - Fix allyesconfig build error by moving __update_spec_ctrl()
>      helper from nospec-branch.h to spec-ctrl.h and include it in files
>      that need the helper.
>
>   v5:
>    - Update comment in patch 1.
>    - Minor doc update and code twist in patch 4 as suggested by Peter and
>      Randy.
>
>   v4:
>    - Add a new __update_spec_ctrl() helper in patch 1.
>    - Rebased to the latest linux kernel.
>
>   v3:
>    - Drop patches 1 ("x86/speculation: Provide a debugfs file to dump
>      SPEC_CTRL MSRs") and 5 ("x86/idle: Disable IBRS entering mwait idle
>      and enable it on wakeup") for now.
>    - Drop the MSR restoration code in ("x86/idle: Disable IBRS when cpu
>      is offline") as native_play_dead() does not return.
>    - For patch ("intel_idle: Add ibrs_off module parameter to force
>      disable IBRS"), change the name from "no_ibrs" to "ibrs_off" and
>      document the new parameter in intel_idle.rst.
>
> For Intel processors that need to turn on IBRS to protect against
> Spectre v2 and Retbleed, the IBRS bit in the SPEC_CTRL MSR affects
> the performance of the whole core even if only one thread is turning
> it on when running in the kernel. For user space heavy applications,
> the performance impact of occasionally turning IBRS on during syscalls
> shouldn't be significant. Unfortunately, that is not the case when the
> sibling thread is idling in the kernel. In that case, the performance
> impact can be significant.
>
> When DPDK is running on an isolated CPU thread processing network packets
> in user space while its sibling thread is idle. The performance of the
> busy DPDK thread with IBRS on and off in the sibling idle thread are:
>
>                                  IBRS on         IBRS off
>                                  -------         --------
>    packets/second:                  7.8M           10.4M
>    avg tsc cycles/packet:         282.26          209.86
>
> This is a 25% performance degradation. The test system is a Intel Xeon
> 4114 CPU @ 2.20GHz.
>
> Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
> disables IBRS when the CPU enters long idle (C6 or below). However, there
> are existing users out there who have set "intel_idle.max_cstate=1"
> to decrease latency. Those users won't be able to benefit from this
> commit. This patch series extends this commit by providing a new
> "intel_idle.ibrs_off" module parameter to force disable IBRS even when
> "intel_idle.max_cstate=1" at the expense of increased IRQ response
> latency. It also includes a commit to allow the disabling of IBRS when
> a CPU becomes offline.
>
> Waiman Long (4):
>    x86/speculation: Add __update_spec_ctrl() helper
>    x86/idle: Disable IBRS when cpu is offline
>    intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs()
>    intel_idle: Add ibrs_off module parameter to force disable IBRS
>
>   Documentation/admin-guide/pm/intel_idle.rst | 17 ++++++++++++++++-
>   arch/x86/include/asm/spec-ctrl.h            | 11 +++++++++++
>   arch/x86/kernel/smpboot.c                   |  8 ++++++++
>   drivers/idle/intel_idle.c                   | 18 +++++++++++++-----
>   4 files changed, 48 insertions(+), 6 deletions(-)
>
Peter,

Is this patch series good enough to be merged?

Thanks,
Longman


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 0/4] x86/speculation: Disable IBRS when idle
  2023-07-27 18:45 [PATCH v6 0/4] x86/speculation: Disable IBRS when idle Waiman Long
                   ` (4 preceding siblings ...)
  2023-08-16 20:42 ` [PATCH v6 0/4] x86/speculation: Disable IBRS when idle Waiman Long
@ 2023-10-04 11:50 ` Ingo Molnar
  2023-10-04 16:11   ` Waiman Long
  5 siblings, 1 reply; 16+ messages in thread
From: Ingo Molnar @ 2023-10-04 11:50 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown, Jonathan Corbet, Rafael J . Wysocki,
	linux-kernel, linux-doc, x86, linux-pm, Robin Jarry, Joe Mario,
	Randy Dunlap, Linus Torvalds


* Waiman Long <longman@redhat.com> wrote:

> For Intel processors that need to turn on IBRS to protect against
> Spectre v2 and Retbleed, the IBRS bit in the SPEC_CTRL MSR affects
> the performance of the whole core even if only one thread is turning
> it on when running in the kernel. For user space heavy applications,
> the performance impact of occasionally turning IBRS on during syscalls
> shouldn't be significant. Unfortunately, that is not the case when the
> sibling thread is idling in the kernel. In that case, the performance
> impact can be significant.
> 
> When DPDK is running on an isolated CPU thread processing network packets
> in user space while its sibling thread is idle. The performance of the
> busy DPDK thread with IBRS on and off in the sibling idle thread are:
> 
>                                 IBRS on         IBRS off
>                                 -------         --------
>   packets/second:                  7.8M           10.4M
>   avg tsc cycles/packet:         282.26          209.86
> 
> This is a 25% performance degradation. The test system is a Intel Xeon
> 4114 CPU @ 2.20GHz.

Ok, that's a solid improvement, and the feature has no obvious
downsides, so I've applied your series to tip:sched/core with a few
edits here and there.

Thanks!

	Ingo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [tip: sched/core] intel_idle: Add ibrs_off module parameter to force-disable IBRS
  2023-07-27 18:46 ` [PATCH v6 4/4] intel_idle: Add ibrs_off module parameter to force disable IBRS Waiman Long
@ 2023-10-04 15:47   ` tip-bot2 for Waiman Long
  2023-10-07 17:10   ` tip-bot2 for Waiman Long
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot2 for Waiman Long @ 2023-10-04 15:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Waiman Long, Ingo Molnar, Rafael J. Wysocki, Linus Torvalds, x86,
	linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     238437d88cea4bfcbc0e7c5031e873dec15d3e93
Gitweb:        https://git.kernel.org/tip/238437d88cea4bfcbc0e7c5031e873dec15d3e93
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Thu, 27 Jul 2023 14:46:00 -04:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 04 Oct 2023 13:48:48 +02:00

intel_idle: Add ibrs_off module parameter to force-disable IBRS

Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the cstate is 6 or lower. However, there are
some use cases where a customer may want to use max_cstate=1 to
lower latency. Such use cases will suffer from the performance
degradation caused by the enabling of IBRS in the sibling idle thread.
Add a "ibrs_off" module parameter to force disable IBRS and the
CPUIDLE_FLAG_IRQ_ENABLE flag if set.

In the case of a Skylake server with max_cstate=1, this new ibrs_off
option will likely increase the IRQ response latency as IRQ will now
be disabled.

When running SPECjbb2015 with cstates set to C1 on a Skylake system.

First test when the kernel is booted with: "intel_idle.ibrs_off":

  max-jOPS = 117828, critical-jOPS = 66047

Then retest when the kernel is booted without the "intel_idle.ibrs_off"
added:

  max-jOPS = 116408, critical-jOPS = 58958

That means booting with "intel_idle.ibrs_off" improves performance by:

  max-jOPS:      +1.2%, which could be considered noise range.
  critical-jOPS: +12%,  which is definitely a solid improvement.

The admin-guide/pm/intel_idle.rst file is updated to add a description
about the new "ibrs_off" module parameter.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20230727184600.26768-5-longman@redhat.com
---
 Documentation/admin-guide/pm/intel_idle.rst | 17 ++++++++++++++++-
 drivers/idle/intel_idle.c                   | 11 ++++++++++-
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/pm/intel_idle.rst b/Documentation/admin-guide/pm/intel_idle.rst
index b799a43..39bd6ec 100644
--- a/Documentation/admin-guide/pm/intel_idle.rst
+++ b/Documentation/admin-guide/pm/intel_idle.rst
@@ -170,7 +170,7 @@ and ``idle=nomwait``.  If any of them is present in the kernel command line, the
 ``MWAIT`` instruction is not allowed to be used, so the initialization of
 ``intel_idle`` will fail.
 
-Apart from that there are four module parameters recognized by ``intel_idle``
+Apart from that there are five module parameters recognized by ``intel_idle``
 itself that can be set via the kernel command line (they cannot be updated via
 sysfs, so that is the only way to change their values).
 
@@ -216,6 +216,21 @@ are ignored).
 The idle states disabled this way can be enabled (on a per-CPU basis) from user
 space via ``sysfs``.
 
+The ``ibrs_off`` module parameter is a boolean flag (defaults to
+false). If set, it is used to control if IBRS (Indirect Branch Restricted
+Speculation) should be turned off when the CPU enters an idle state.
+This flag does not affect CPUs that use Enhanced IBRS which can remain
+on with little performance impact.
+
+For some CPUs, IBRS will be selected as mitigation for Spectre v2 and Retbleed
+security vulnerabilities by default.  Leaving the IBRS mode on while idling may
+have a performance impact on its sibling CPU.  The IBRS mode will be turned off
+by default when the CPU enters into a deep idle state, but not in some
+shallower ones.  Setting the ``ibrs_off`` module parameter will force the IBRS
+mode to off when the CPU is in any one of the available idle states.  This may
+help performance of a sibling CPU at the expense of a slightly higher wakeup
+latency for the idle CPU.
+
 
 .. _intel-idle-core-and-package-idle-states:
 
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 86ac9a4..dcda0af 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -68,6 +68,7 @@ static int max_cstate = CPUIDLE_STATE_MAX - 1;
 static unsigned int disabled_states_mask __read_mostly;
 static unsigned int preferred_states_mask __read_mostly;
 static bool force_irq_on __read_mostly;
+static bool ibrs_off __read_mostly;
 
 static struct cpuidle_device __percpu *intel_idle_cpuidle_devices;
 
@@ -1852,11 +1853,13 @@ static void state_update_enter_method(struct cpuidle_state *state, int cstate)
 	}
 
 	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) &&
-			   state->flags & CPUIDLE_FLAG_IBRS) {
+			((state->flags & CPUIDLE_FLAG_IBRS) || ibrs_off)) {
 		/*
 		 * IBRS mitigation requires that C-states are entered
 		 * with interrupts disabled.
 		 */
+		if (ibrs_off && (state->flags & CPUIDLE_FLAG_IRQ_ENABLE))
+			state->flags &= ~CPUIDLE_FLAG_IRQ_ENABLE;
 		WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE);
 		state->enter = intel_idle_ibrs;
 		return;
@@ -2175,3 +2178,9 @@ MODULE_PARM_DESC(preferred_cstates, "Mask of preferred idle states");
  * 'CPUIDLE_FLAG_INIT_XSTATE' and 'CPUIDLE_FLAG_IBRS' flags.
  */
 module_param(force_irq_on, bool, 0444);
+/*
+ * Force the disabling of IBRS when X86_FEATURE_KERNEL_IBRS is on and
+ * CPUIDLE_FLAG_IRQ_ENABLE isn't set.
+ */
+module_param(ibrs_off, bool, 0444);
+MODULE_PARM_DESC(ibrs_off, "Disable IBRS when idle");

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [tip: sched/core] intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs()
  2023-07-27 18:45 ` [PATCH v6 3/4] intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs() Waiman Long
@ 2023-10-04 15:47   ` tip-bot2 for Waiman Long
  2023-10-07 17:10   ` tip-bot2 for Waiman Long
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot2 for Waiman Long @ 2023-10-04 15:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Waiman Long, Ingo Molnar, Rafael J. Wysocki, Linus Torvalds, x86,
	linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     1fe96fb4d0f2df207c906a7b1bba70154900ebd1
Gitweb:        https://git.kernel.org/tip/1fe96fb4d0f2df207c906a7b1bba70154900ebd1
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Thu, 27 Jul 2023 14:45:59 -04:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 04 Oct 2023 13:48:48 +02:00

intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs()

When intel_idle_ibrs() is called, it modifies the SPEC_CTRL MSR to 0
in order disable IBRS. However, the new MSR value isn't reflected in
x86_spec_ctrl_current which is at odd with the other code that keep track
of its state in that percpu variable.  Use the new __update_spec_ctrl()
to have the x86_spec_ctrl_current percpu value properly updated.

Since spec-ctrl.h includes both msr.h and nospec-branch.h, we can remove
those from the include file list.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20230727184600.26768-4-longman@redhat.com
---
 drivers/idle/intel_idle.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index ea5a6a1..86ac9a4 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -53,9 +53,8 @@
 #include <linux/moduleparam.h>
 #include <asm/cpu_device_id.h>
 #include <asm/intel-family.h>
-#include <asm/nospec-branch.h>
 #include <asm/mwait.h>
-#include <asm/msr.h>
+#include <asm/spec-ctrl.h>
 #include <asm/fpu/api.h>
 
 #define INTEL_IDLE_VERSION "0.5.1"
@@ -182,12 +181,12 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
 	int ret;
 
 	if (smt_active)
-		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+		__update_spec_ctrl(0);
 
 	ret = __intel_idle(dev, drv, index);
 
 	if (smt_active)
-		native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
+		__update_spec_ctrl(spec_ctrl);
 
 	return ret;
 }

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [tip: sched/core] x86/idle: Disable IBRS when CPU is offline to improve single-threaded performance
  2023-07-27 18:45 ` [PATCH v6 2/4] x86/idle: Disable IBRS when cpu is offline Waiman Long
@ 2023-10-04 15:47   ` tip-bot2 for Waiman Long
  2023-10-07 17:10   ` tip-bot2 for Waiman Long
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot2 for Waiman Long @ 2023-10-04 15:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Waiman Long, Ingo Molnar, Rafael J. Wysocki, Linus Torvalds, x86,
	linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     8c7a9b1bb45060b6e67456c3cf28475f6e0bd65d
Gitweb:        https://git.kernel.org/tip/8c7a9b1bb45060b6e67456c3cf28475f6e0bd65d
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Thu, 27 Jul 2023 14:45:58 -04:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 04 Oct 2023 13:48:48 +02:00

x86/idle: Disable IBRS when CPU is offline to improve single-threaded performance

Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the CPU enters long idle. However, when a CPU
becomes offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS
is enabled. That will impact the performance of a sibling CPU. Mitigate
this performance impact by clearing all the mitigation bits in SPEC_CTRL
MSR when offline. When the CPU is online again, it will be re-initialized
and so restoring the SPEC_CTRL value isn't needed.

Add a comment to say that native_play_dead() is a __noreturn function,
but it can't be marked as such to avoid confusion about the missing
MSR restoration code.

When DPDK is running on an isolated CPU thread processing network packets
in user space while its sibling thread is idle. The performance of the
busy DPDK thread with IBRS on and off in the sibling idle thread are:

                                IBRS on         IBRS off
                                -------         --------
  packets/second:                  7.8M           10.4M
  avg tsc cycles/packet:         282.26          209.86

This is a 25% performance degradation. The test system is a Intel Xeon
4114 CPU @ 2.20GHz.

[ mingo: Extended the changelog with performance data from the 0/4 mail. ]

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20230727184600.26768-3-longman@redhat.com
---
 arch/x86/kernel/smpboot.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 48e0406..02765d9 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -87,6 +87,7 @@
 #include <asm/hw_irq.h>
 #include <asm/stackprotector.h>
 #include <asm/sev.h>
+#include <asm/spec-ctrl.h>
 
 /* representing HT siblings of each logical CPU */
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
@@ -1623,8 +1624,15 @@ void __noreturn hlt_play_dead(void)
 		native_halt();
 }
 
+/*
+ * native_play_dead() is essentially a __noreturn function, but it can't
+ * be marked as such as the compiler may complain about it.
+ */
 void native_play_dead(void)
 {
+	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS))
+		__update_spec_ctrl(0);
+
 	play_dead_common();
 	tboot_shutdown(TB_SHUTDOWN_WFS);
 

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [tip: sched/core] x86/speculation: Add __update_spec_ctrl() helper
  2023-07-27 18:45 ` [PATCH v6 1/4] x86/speculation: Add __update_spec_ctrl() helper Waiman Long
@ 2023-10-04 15:47   ` tip-bot2 for Waiman Long
  2023-10-07 17:10   ` tip-bot2 for Waiman Long
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot2 for Waiman Long @ 2023-10-04 15:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra, Waiman Long, Ingo Molnar, Rafael J. Wysocki,
	Linus Torvalds, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     aaa3e6678978b5e2b5c6e80e439fc4db9bbdb375
Gitweb:        https://git.kernel.org/tip/aaa3e6678978b5e2b5c6e80e439fc4db9bbdb375
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Thu, 27 Jul 2023 14:45:57 -04:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 04 Oct 2023 13:48:48 +02:00

x86/speculation: Add __update_spec_ctrl() helper

Add a new __update_spec_ctrl() helper which is a variant of
update_spec_ctrl() that can be used in a noinstr function.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20230727184600.26768-2-longman@redhat.com
---
 arch/x86/include/asm/spec-ctrl.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index cb0386f..c648502 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -4,6 +4,7 @@
 
 #include <linux/thread_info.h>
 #include <asm/nospec-branch.h>
+#include <asm/msr.h>
 
 /*
  * On VMENTER we must preserve whatever view of the SPEC_CTRL MSR
@@ -76,6 +77,16 @@ static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn)
 	return (tifn & _TIF_SSBD) ? x86_amd_ls_cfg_ssbd_mask : 0ULL;
 }
 
+/*
+ * This can be used in noinstr functions & should only be called in bare
+ * metal context.
+ */
+static __always_inline void __update_spec_ctrl(u64 val)
+{
+	__this_cpu_write(x86_spec_ctrl_current, val);
+	native_wrmsrl(MSR_IA32_SPEC_CTRL, val);
+}
+
 #ifdef CONFIG_SMP
 extern void speculative_store_bypass_ht_init(void);
 #else

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v6 0/4] x86/speculation: Disable IBRS when idle
  2023-10-04 11:50 ` Ingo Molnar
@ 2023-10-04 16:11   ` Waiman Long
  0 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2023-10-04 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Josh Poimboeuf, Pawan Gupta,
	Jacob Pan, Len Brown, Jonathan Corbet, Rafael J . Wysocki,
	linux-kernel, linux-doc, x86, linux-pm, Robin Jarry, Joe Mario,
	Randy Dunlap, Linus Torvalds

On 10/4/23 07:50, Ingo Molnar wrote:
> * Waiman Long <longman@redhat.com> wrote:
>
>> For Intel processors that need to turn on IBRS to protect against
>> Spectre v2 and Retbleed, the IBRS bit in the SPEC_CTRL MSR affects
>> the performance of the whole core even if only one thread is turning
>> it on when running in the kernel. For user space heavy applications,
>> the performance impact of occasionally turning IBRS on during syscalls
>> shouldn't be significant. Unfortunately, that is not the case when the
>> sibling thread is idling in the kernel. In that case, the performance
>> impact can be significant.
>>
>> When DPDK is running on an isolated CPU thread processing network packets
>> in user space while its sibling thread is idle. The performance of the
>> busy DPDK thread with IBRS on and off in the sibling idle thread are:
>>
>>                                  IBRS on         IBRS off
>>                                  -------         --------
>>    packets/second:                  7.8M           10.4M
>>    avg tsc cycles/packet:         282.26          209.86
>>
>> This is a 25% performance degradation. The test system is a Intel Xeon
>> 4114 CPU @ 2.20GHz.
> Ok, that's a solid improvement, and the feature has no obvious
> downsides, so I've applied your series to tip:sched/core with a few
> edits here and there.

Thanks!

-Longman


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [tip: sched/core] intel_idle: Add ibrs_off module parameter to force-disable IBRS
  2023-07-27 18:46 ` [PATCH v6 4/4] intel_idle: Add ibrs_off module parameter to force disable IBRS Waiman Long
  2023-10-04 15:47   ` [tip: sched/core] intel_idle: Add ibrs_off module parameter to force-disable IBRS tip-bot2 for Waiman Long
@ 2023-10-07 17:10   ` tip-bot2 for Waiman Long
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot2 for Waiman Long @ 2023-10-07 17:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Waiman Long, Ingo Molnar, Rafael J. Wysocki, Linus Torvalds, x86,
	linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     aa1567a7e6440b8c3af4b0d8a8219d8fc5028c5f
Gitweb:        https://git.kernel.org/tip/aa1567a7e6440b8c3af4b0d8a8219d8fc5028c5f
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Thu, 27 Jul 2023 14:46:00 -04:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Sat, 07 Oct 2023 11:33:28 +02:00

intel_idle: Add ibrs_off module parameter to force-disable IBRS

Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the cstate is 6 or lower. However, there are
some use cases where a customer may want to use max_cstate=1 to
lower latency. Such use cases will suffer from the performance
degradation caused by the enabling of IBRS in the sibling idle thread.
Add a "ibrs_off" module parameter to force disable IBRS and the
CPUIDLE_FLAG_IRQ_ENABLE flag if set.

In the case of a Skylake server with max_cstate=1, this new ibrs_off
option will likely increase the IRQ response latency as IRQ will now
be disabled.

When running SPECjbb2015 with cstates set to C1 on a Skylake system.

First test when the kernel is booted with: "intel_idle.ibrs_off":

  max-jOPS = 117828, critical-jOPS = 66047

Then retest when the kernel is booted without the "intel_idle.ibrs_off"
added:

  max-jOPS = 116408, critical-jOPS = 58958

That means booting with "intel_idle.ibrs_off" improves performance by:

  max-jOPS:      +1.2%, which could be considered noise range.
  critical-jOPS: +12%,  which is definitely a solid improvement.

The admin-guide/pm/intel_idle.rst file is updated to add a description
about the new "ibrs_off" module parameter.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20230727184600.26768-5-longman@redhat.com
---
 Documentation/admin-guide/pm/intel_idle.rst | 17 ++++++++++++++++-
 drivers/idle/intel_idle.c                   | 11 ++++++++++-
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/pm/intel_idle.rst b/Documentation/admin-guide/pm/intel_idle.rst
index b799a43..39bd6ec 100644
--- a/Documentation/admin-guide/pm/intel_idle.rst
+++ b/Documentation/admin-guide/pm/intel_idle.rst
@@ -170,7 +170,7 @@ and ``idle=nomwait``.  If any of them is present in the kernel command line, the
 ``MWAIT`` instruction is not allowed to be used, so the initialization of
 ``intel_idle`` will fail.
 
-Apart from that there are four module parameters recognized by ``intel_idle``
+Apart from that there are five module parameters recognized by ``intel_idle``
 itself that can be set via the kernel command line (they cannot be updated via
 sysfs, so that is the only way to change their values).
 
@@ -216,6 +216,21 @@ are ignored).
 The idle states disabled this way can be enabled (on a per-CPU basis) from user
 space via ``sysfs``.
 
+The ``ibrs_off`` module parameter is a boolean flag (defaults to
+false). If set, it is used to control if IBRS (Indirect Branch Restricted
+Speculation) should be turned off when the CPU enters an idle state.
+This flag does not affect CPUs that use Enhanced IBRS which can remain
+on with little performance impact.
+
+For some CPUs, IBRS will be selected as mitigation for Spectre v2 and Retbleed
+security vulnerabilities by default.  Leaving the IBRS mode on while idling may
+have a performance impact on its sibling CPU.  The IBRS mode will be turned off
+by default when the CPU enters into a deep idle state, but not in some
+shallower ones.  Setting the ``ibrs_off`` module parameter will force the IBRS
+mode to off when the CPU is in any one of the available idle states.  This may
+help performance of a sibling CPU at the expense of a slightly higher wakeup
+latency for the idle CPU.
+
 
 .. _intel-idle-core-and-package-idle-states:
 
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 86ac9a4..dcda0af 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -68,6 +68,7 @@ static int max_cstate = CPUIDLE_STATE_MAX - 1;
 static unsigned int disabled_states_mask __read_mostly;
 static unsigned int preferred_states_mask __read_mostly;
 static bool force_irq_on __read_mostly;
+static bool ibrs_off __read_mostly;
 
 static struct cpuidle_device __percpu *intel_idle_cpuidle_devices;
 
@@ -1852,11 +1853,13 @@ static void state_update_enter_method(struct cpuidle_state *state, int cstate)
 	}
 
 	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) &&
-			   state->flags & CPUIDLE_FLAG_IBRS) {
+			((state->flags & CPUIDLE_FLAG_IBRS) || ibrs_off)) {
 		/*
 		 * IBRS mitigation requires that C-states are entered
 		 * with interrupts disabled.
 		 */
+		if (ibrs_off && (state->flags & CPUIDLE_FLAG_IRQ_ENABLE))
+			state->flags &= ~CPUIDLE_FLAG_IRQ_ENABLE;
 		WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE);
 		state->enter = intel_idle_ibrs;
 		return;
@@ -2175,3 +2178,9 @@ MODULE_PARM_DESC(preferred_cstates, "Mask of preferred idle states");
  * 'CPUIDLE_FLAG_INIT_XSTATE' and 'CPUIDLE_FLAG_IBRS' flags.
  */
 module_param(force_irq_on, bool, 0444);
+/*
+ * Force the disabling of IBRS when X86_FEATURE_KERNEL_IBRS is on and
+ * CPUIDLE_FLAG_IRQ_ENABLE isn't set.
+ */
+module_param(ibrs_off, bool, 0444);
+MODULE_PARM_DESC(ibrs_off, "Disable IBRS when idle");

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [tip: sched/core] intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs()
  2023-07-27 18:45 ` [PATCH v6 3/4] intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs() Waiman Long
  2023-10-04 15:47   ` [tip: sched/core] " tip-bot2 for Waiman Long
@ 2023-10-07 17:10   ` tip-bot2 for Waiman Long
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot2 for Waiman Long @ 2023-10-07 17:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Waiman Long, Ingo Molnar, Rafael J. Wysocki, Linus Torvalds, x86,
	linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     7506203089dceb1d9e1f35d37ad2e46d44798a6d
Gitweb:        https://git.kernel.org/tip/7506203089dceb1d9e1f35d37ad2e46d44798a6d
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Thu, 27 Jul 2023 14:45:59 -04:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Sat, 07 Oct 2023 11:33:28 +02:00

intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs()

When intel_idle_ibrs() is called, it modifies the SPEC_CTRL MSR to 0
in order disable IBRS. However, the new MSR value isn't reflected in
x86_spec_ctrl_current which is at odd with the other code that keep track
of its state in that percpu variable.  Use the new __update_spec_ctrl()
to have the x86_spec_ctrl_current percpu value properly updated.

Since spec-ctrl.h includes both msr.h and nospec-branch.h, we can remove
those from the include file list.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20230727184600.26768-4-longman@redhat.com
---
 drivers/idle/intel_idle.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index ea5a6a1..86ac9a4 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -53,9 +53,8 @@
 #include <linux/moduleparam.h>
 #include <asm/cpu_device_id.h>
 #include <asm/intel-family.h>
-#include <asm/nospec-branch.h>
 #include <asm/mwait.h>
-#include <asm/msr.h>
+#include <asm/spec-ctrl.h>
 #include <asm/fpu/api.h>
 
 #define INTEL_IDLE_VERSION "0.5.1"
@@ -182,12 +181,12 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
 	int ret;
 
 	if (smt_active)
-		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+		__update_spec_ctrl(0);
 
 	ret = __intel_idle(dev, drv, index);
 
 	if (smt_active)
-		native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
+		__update_spec_ctrl(spec_ctrl);
 
 	return ret;
 }

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [tip: sched/core] x86/idle: Disable IBRS when CPU is offline to improve single-threaded performance
  2023-07-27 18:45 ` [PATCH v6 2/4] x86/idle: Disable IBRS when cpu is offline Waiman Long
  2023-10-04 15:47   ` [tip: sched/core] x86/idle: Disable IBRS when CPU is offline to improve single-threaded performance tip-bot2 for Waiman Long
@ 2023-10-07 17:10   ` tip-bot2 for Waiman Long
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot2 for Waiman Long @ 2023-10-07 17:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Waiman Long, Ingo Molnar, Rafael J. Wysocki, Linus Torvalds, x86,
	linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     2743fe89d4d41616ffbe1e7e96e443ae7a4b1cc6
Gitweb:        https://git.kernel.org/tip/2743fe89d4d41616ffbe1e7e96e443ae7a4b1cc6
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Thu, 27 Jul 2023 14:45:58 -04:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Sat, 07 Oct 2023 11:33:28 +02:00

x86/idle: Disable IBRS when CPU is offline to improve single-threaded performance

Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the CPU enters long idle. However, when a CPU
becomes offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS
is enabled. That will impact the performance of a sibling CPU. Mitigate
this performance impact by clearing all the mitigation bits in SPEC_CTRL
MSR when offline. When the CPU is online again, it will be re-initialized
and so restoring the SPEC_CTRL value isn't needed.

Add a comment to say that native_play_dead() is a __noreturn function,
but it can't be marked as such to avoid confusion about the missing
MSR restoration code.

When DPDK is running on an isolated CPU thread processing network packets
in user space while its sibling thread is idle. The performance of the
busy DPDK thread with IBRS on and off in the sibling idle thread are:

                                IBRS on         IBRS off
                                -------         --------
  packets/second:                  7.8M           10.4M
  avg tsc cycles/packet:         282.26          209.86

This is a 25% performance degradation. The test system is a Intel Xeon
4114 CPU @ 2.20GHz.

[ mingo: Extended the changelog with performance data from the 0/4 mail. ]

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20230727184600.26768-3-longman@redhat.com
---
 arch/x86/kernel/smpboot.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 48e0406..02765d9 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -87,6 +87,7 @@
 #include <asm/hw_irq.h>
 #include <asm/stackprotector.h>
 #include <asm/sev.h>
+#include <asm/spec-ctrl.h>
 
 /* representing HT siblings of each logical CPU */
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
@@ -1623,8 +1624,15 @@ void __noreturn hlt_play_dead(void)
 		native_halt();
 }
 
+/*
+ * native_play_dead() is essentially a __noreturn function, but it can't
+ * be marked as such as the compiler may complain about it.
+ */
 void native_play_dead(void)
 {
+	if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS))
+		__update_spec_ctrl(0);
+
 	play_dead_common();
 	tboot_shutdown(TB_SHUTDOWN_WFS);
 

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [tip: sched/core] x86/speculation: Add __update_spec_ctrl() helper
  2023-07-27 18:45 ` [PATCH v6 1/4] x86/speculation: Add __update_spec_ctrl() helper Waiman Long
  2023-10-04 15:47   ` [tip: sched/core] " tip-bot2 for Waiman Long
@ 2023-10-07 17:10   ` tip-bot2 for Waiman Long
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot2 for Waiman Long @ 2023-10-07 17:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra, Waiman Long, Ingo Molnar, Rafael J. Wysocki,
	Linus Torvalds, x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     e3e3bab1844d448a239cd57ebf618839e26b4157
Gitweb:        https://git.kernel.org/tip/e3e3bab1844d448a239cd57ebf618839e26b4157
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Thu, 27 Jul 2023 14:45:57 -04:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Sat, 07 Oct 2023 11:33:28 +02:00

x86/speculation: Add __update_spec_ctrl() helper

Add a new __update_spec_ctrl() helper which is a variant of
update_spec_ctrl() that can be used in a noinstr function.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20230727184600.26768-2-longman@redhat.com
---
 arch/x86/include/asm/spec-ctrl.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index cb0386f..c648502 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -4,6 +4,7 @@
 
 #include <linux/thread_info.h>
 #include <asm/nospec-branch.h>
+#include <asm/msr.h>
 
 /*
  * On VMENTER we must preserve whatever view of the SPEC_CTRL MSR
@@ -76,6 +77,16 @@ static inline u64 ssbd_tif_to_amd_ls_cfg(u64 tifn)
 	return (tifn & _TIF_SSBD) ? x86_amd_ls_cfg_ssbd_mask : 0ULL;
 }
 
+/*
+ * This can be used in noinstr functions & should only be called in bare
+ * metal context.
+ */
+static __always_inline void __update_spec_ctrl(u64 val)
+{
+	__this_cpu_write(x86_spec_ctrl_current, val);
+	native_wrmsrl(MSR_IA32_SPEC_CTRL, val);
+}
+
 #ifdef CONFIG_SMP
 extern void speculative_store_bypass_ht_init(void);
 #else

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-10-07 17:11 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-27 18:45 [PATCH v6 0/4] x86/speculation: Disable IBRS when idle Waiman Long
2023-07-27 18:45 ` [PATCH v6 1/4] x86/speculation: Add __update_spec_ctrl() helper Waiman Long
2023-10-04 15:47   ` [tip: sched/core] " tip-bot2 for Waiman Long
2023-10-07 17:10   ` tip-bot2 for Waiman Long
2023-07-27 18:45 ` [PATCH v6 2/4] x86/idle: Disable IBRS when cpu is offline Waiman Long
2023-10-04 15:47   ` [tip: sched/core] x86/idle: Disable IBRS when CPU is offline to improve single-threaded performance tip-bot2 for Waiman Long
2023-10-07 17:10   ` tip-bot2 for Waiman Long
2023-07-27 18:45 ` [PATCH v6 3/4] intel_idle: Use __update_spec_ctrl() in intel_idle_ibrs() Waiman Long
2023-10-04 15:47   ` [tip: sched/core] " tip-bot2 for Waiman Long
2023-10-07 17:10   ` tip-bot2 for Waiman Long
2023-07-27 18:46 ` [PATCH v6 4/4] intel_idle: Add ibrs_off module parameter to force disable IBRS Waiman Long
2023-10-04 15:47   ` [tip: sched/core] intel_idle: Add ibrs_off module parameter to force-disable IBRS tip-bot2 for Waiman Long
2023-10-07 17:10   ` tip-bot2 for Waiman Long
2023-08-16 20:42 ` [PATCH v6 0/4] x86/speculation: Disable IBRS when idle Waiman Long
2023-10-04 11:50 ` Ingo Molnar
2023-10-04 16:11   ` Waiman Long

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.