* [patch 0/6] x86/idle: Cure RCU violations and cleanups
@ 2024-02-29 14:23 Thomas Gleixner
  2024-02-29 14:23 ` [patch 1/6] sched/idle: Conditionally handle tick broadcast in default_idle_call() Thomas Gleixner
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Thomas Gleixner @ 2024-02-29 14:23 UTC (permalink / raw)
  To: LKML; +Cc: x86, Steven Rostedt, Paul E. McKenney

Boris reported that an RCU-related warning triggers in the tracer code on
AMD machines which are affected by Erratum 400. On those CPUs the local
APIC timer stops in the C1E halt state. This is handled by a special idle
function which invokes tick_broadcast_enter()/exit() around HALT.  These
functions can end up in lockdep or tracing, which use RCU-protected data,
but the core code has already set RCU to idle, which means that the RCU
protection is no longer given.

This series fixes this by handling the tick broadcast conditionally in the
core idle function. While working on it, I noticed a few bogosities in the
related code and cleaned those up on top.
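
For illustration, a condensed before/after sketch of the call ordering
(schematic only, pieced together from kernel/sched/idle.c and the old
amd_e400_idle(); not a standalone build unit):

    /* Before: the arch idle routine runs inside the RCU-idle section */
    ct_cpuidle_enter();            /* core marks RCU idle */
    amd_e400_idle();               /* -> tick_broadcast_enter()/exit() may
                                    *    hit tracing/lockdep: RCU splat */
    ct_cpuidle_exit();

    /* After: the core handles the broadcast while RCU is still watching,
     * gated by a static branch which only affected CPUs enable */
    cond_tick_broadcast_enter();
    ct_cpuidle_enter();
    arch_cpu_idle();               /* plain default_idle() / HALT */
    ct_cpuidle_exit();
    cond_tick_broadcast_exit();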

The series is also available from git:

   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/core

Thanks,

	tglx
---
 arch/x86/Kconfig                 |    1 
 arch/x86/include/asm/processor.h |    2 
 arch/x86/kernel/cpu/common.c     |    4 -
 arch/x86/kernel/process.c        |   89 +++++++++++----------------------------
 include/linux/cpu.h              |    2 
 include/linux/tick.h             |    3 +
 kernel/sched/idle.c              |   21 +++++++++
 kernel/time/Kconfig              |    5 ++
 8 files changed, 62 insertions(+), 65 deletions(-)


* [patch 1/6] sched/idle: Conditionally handle tick broadcast in default_idle_call()
  2024-02-29 14:23 [patch 0/6] x86/idle: Cure RCU violations and cleanups Thomas Gleixner
@ 2024-02-29 14:23 ` Thomas Gleixner
  2024-03-01 20:48   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
  2024-02-29 14:23 ` [patch 2/6] x86/idle: Sanitize X86_BUG_AMD_E400 handling Thomas Gleixner
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2024-02-29 14:23 UTC (permalink / raw)
  To: LKML; +Cc: x86, Steven Rostedt, Paul E. McKenney, Borislav Petkov

X86 has an idle routine for AMD CPUs which are affected by erratum 400. On
the affected CPUs the local APIC timer stops in the C1E halt state. It
therefore requires tick broadcasting. The invocation of
tick_broadcast_enter()/exit() from this function violates the RCU
constraints because it can end up in lockdep or tracing, which rightfully
triggers a warning.

tick_broadcast_enter()/exit() must be invoked before ct_cpuidle_enter() and
after ct_cpuidle_exit() in default_idle_call().

Add a static branch conditional invocation of tick_broadcast_enter()/exit()
into this function to allow X86 to replace the AMD-specific idle code. It's
guarded by a config switch which will be selected by X86. Otherwise it's a
NOOP.
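
For reference, the arch-side opt-in boils down to two pieces: selecting
the new config switch and enabling the default-off static branch once the
affected hardware has been detected (a schematic sketch of what the next
patch does for x86, not a standalone build unit):

    # arch Kconfig
    select GENERIC_CLOCKEVENTS_BROADCAST_IDLE	if GENERIC_CLOCKEVENTS_BROADCAST

    /* arch C code, once the erratum has been detected */
    static_branch_enable(&arch_needs_tick_broadcast);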

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/cpu.h  |    2 ++
 include/linux/tick.h |    3 +++
 kernel/sched/idle.c  |   21 +++++++++++++++++++++
 kernel/time/Kconfig  |    5 +++++
 4 files changed, 31 insertions(+)

--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -196,6 +196,8 @@ void arch_cpu_idle(void);
 void arch_cpu_idle_prepare(void);
 void arch_cpu_idle_enter(void);
 void arch_cpu_idle_exit(void);
+void arch_tick_broadcast_enter(void);
+void arch_tick_broadcast_exit(void);
 void __noreturn arch_cpu_idle_dead(void);
 
 #ifdef CONFIG_ARCH_HAS_CPU_FINALIZE_INIT
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -12,6 +12,7 @@
 #include <linux/cpumask.h>
 #include <linux/sched.h>
 #include <linux/rcupdate.h>
+#include <linux/static_key.h>
 
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
 extern void __init tick_init(void);
@@ -69,6 +70,8 @@ extern void tick_broadcast_control(enum
 static inline void tick_broadcast_control(enum tick_broadcast_mode mode) { }
 #endif /* BROADCAST */
 
+extern struct static_key_false arch_needs_tick_broadcast;
+
 #if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_HOTPLUG_CPU)
 extern void tick_offline_cpu(unsigned int cpu);
 #else
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -81,6 +81,25 @@ void __weak arch_cpu_idle(void)
 	cpu_idle_force_poll = 1;
 }
 
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST_IDLE
+DEFINE_STATIC_KEY_FALSE(arch_needs_tick_broadcast);
+
+static inline void cond_tick_broadcast_enter(void)
+{
+	if (static_branch_unlikely(&arch_needs_tick_broadcast))
+		tick_broadcast_enter();
+}
+
+static inline void cond_tick_broadcast_exit(void)
+{
+	if (static_branch_unlikely(&arch_needs_tick_broadcast))
+		tick_broadcast_exit();
+}
+#else
+static inline void cond_tick_broadcast_enter(void) { }
+static inline void cond_tick_broadcast_exit(void) { }
+#endif
+
 /**
  * default_idle_call - Default CPU idle routine.
  *
@@ -90,6 +109,7 @@ void __cpuidle default_idle_call(void)
 {
 	instrumentation_begin();
 	if (!current_clr_polling_and_test()) {
+		cond_tick_broadcast_enter();
 		trace_cpu_idle(1, smp_processor_id());
 		stop_critical_timings();
 
@@ -99,6 +119,7 @@ void __cpuidle default_idle_call(void)
 
 		start_critical_timings();
 		trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
+		cond_tick_broadcast_exit();
 	}
 	local_irq_enable();
 	instrumentation_end();
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -39,6 +39,11 @@ config GENERIC_CLOCKEVENTS_BROADCAST
 	bool
 	depends on GENERIC_CLOCKEVENTS
 
+# Handle broadcast in default_idle_call()
+config GENERIC_CLOCKEVENTS_BROADCAST_IDLE
+	bool
+	depends on GENERIC_CLOCKEVENTS_BROADCAST
+
 # Automatically adjust the min. reprogramming time for
 # clock event device
 config GENERIC_CLOCKEVENTS_MIN_ADJUST



* [patch 2/6] x86/idle: Sanitize X86_BUG_AMD_E400 handling
  2024-02-29 14:23 [patch 0/6] x86/idle: Cure RCU violations and cleanups Thomas Gleixner
  2024-02-29 14:23 ` [patch 1/6] sched/idle: Conditionally handle tick broadcast in default_idle_call() Thomas Gleixner
@ 2024-02-29 14:23 ` Thomas Gleixner
  2024-03-01 12:33   ` Thomas Gleixner
  2024-02-29 14:23 ` [patch 3/6] x86/idle: Clean up idle selection Thomas Gleixner
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2024-02-29 14:23 UTC (permalink / raw)
  To: LKML; +Cc: x86, Steven Rostedt, Paul E. McKenney, Borislav Petkov

amd_e400_idle(), the idle routine for AMD CPUs which are affected by
erratum 400, violates the RCU constraints by invoking tick_broadcast_enter()
and tick_broadcast_exit() after the core code has marked RCU idle.  The
functions can end up in lockdep or tracing, which rightfully triggers an
RCU warning.

The core code now provides a static branch conditional invocation of the
broadcast functions.

Remove amd_e400_idle(), enforce default_idle() and enable the static branch
on affected CPUs to cure this.
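
The net effect on an affected CPU, pieced together from the diff below
(schematic):

    /* select_idle_routine(): E400 CPUs now take the default path */
    static_call_update(x86_idle, default_idle);

    /* arch_post_acpi_subsys_init(): once C1E is detected */
    static_branch_enable(&arch_needs_tick_broadcast);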

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/Kconfig          |    1 +
 arch/x86/kernel/process.c |   41 ++++++++---------------------------------
 2 files changed, 9 insertions(+), 33 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,6 +147,7 @@ config X86
 	select EDAC_ATOMIC_SCRUB
 	select EDAC_SUPPORT
 	select GENERIC_CLOCKEVENTS_BROADCAST	if X86_64 || (X86_32 && X86_LOCAL_APIC)
+	select GENERIC_CLOCKEVENTS_BROADCAST_IDLE	if GENERIC_CLOCKEVENTS_BROADCAST
 	select GENERIC_CLOCKEVENTS_MIN_ADJUST
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_AUTOPROBE
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -846,31 +846,6 @@ void __noreturn stop_this_cpu(void *dumm
 }
 
 /*
- * AMD Erratum 400 aware idle routine. We handle it the same way as C3 power
- * states (local apic timer and TSC stop).
- *
- * XXX this function is completely buggered vs RCU and tracing.
- */
-static void amd_e400_idle(void)
-{
-	/*
-	 * We cannot use static_cpu_has_bug() here because X86_BUG_AMD_APIC_C1E
-	 * gets set after static_cpu_has() places have been converted via
-	 * alternatives.
-	 */
-	if (!boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E)) {
-		default_idle();
-		return;
-	}
-
-	tick_broadcast_enter();
-
-	default_idle();
-
-	tick_broadcast_exit();
-}
-
-/*
  * Prefer MWAIT over HALT if MWAIT is supported, MWAIT_CPUID leaf
  * exists and whenever MONITOR/MWAIT extensions are present there is at
  * least one C1 substate.
@@ -890,8 +865,8 @@ static int prefer_mwait_c1_over_halt(con
 	if (!cpu_has(c, X86_FEATURE_MWAIT))
 		return 0;
 
-	/* Monitor has a bug. Fallback to HALT */
-	if (boot_cpu_has_bug(X86_BUG_MONITOR))
+	/* Monitor has a bug or APIC stops in C1E. Fallback to HALT */
+	if (boot_cpu_has_bug(X86_BUG_MONITOR) || boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E))
 		return 0;
 
 	cpuid(CPUID_MWAIT_LEAF, &eax, &ebx, &ecx, &edx);
@@ -942,17 +917,15 @@ void select_idle_routine(const struct cp
 	if (x86_idle_set() || boot_option_idle_override == IDLE_POLL)
 		return;
 
-	if (boot_cpu_has_bug(X86_BUG_AMD_E400)) {
-		pr_info("using AMD E400 aware idle routine\n");
-		static_call_update(x86_idle, amd_e400_idle);
-	} else if (prefer_mwait_c1_over_halt(c)) {
+	if (prefer_mwait_c1_over_halt(c)) {
 		pr_info("using mwait in idle threads\n");
 		static_call_update(x86_idle, mwait_idle);
 	} else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
 		pr_info("using TDX aware idle routine\n");
 		static_call_update(x86_idle, tdx_safe_halt);
-	} else
+	} else {
 		static_call_update(x86_idle, default_idle);
+	}
 }
 
 void amd_e400_c1e_apic_setup(void)
@@ -985,7 +958,9 @@ void __init arch_post_acpi_subsys_init(v
 
 	if (!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
 		mark_tsc_unstable("TSC halt in AMD C1E");
-	pr_info("System has AMD C1E enabled\n");
+
+	static_branch_enable(&arch_needs_tick_broadcast);
+	pr_info("System has AMD C1E erratum E400. Workaround enabled.\n");
 }
 
 static int __init idle_setup(char *str)



* [patch 3/6] x86/idle: Clean up idle selection
  2024-02-29 14:23 [patch 0/6] x86/idle: Cure RCU violations and cleanups Thomas Gleixner
  2024-02-29 14:23 ` [patch 1/6] sched/idle: Conditionally handle tick broadcast in default_idle_call() Thomas Gleixner
  2024-02-29 14:23 ` [patch 2/6] x86/idle: Sanitize X86_BUG_AMD_E400 handling Thomas Gleixner
@ 2024-02-29 14:23 ` Thomas Gleixner
  2024-03-01 20:48   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
  2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
  2024-02-29 14:23 ` [patch 4/6] x86/idle: Cleanup idle_setup() Thomas Gleixner
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2024-02-29 14:23 UTC (permalink / raw)
  To: LKML; +Cc: x86, Steven Rostedt, Paul E. McKenney

Clean up the code to make it readable. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/process.c |   12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -910,11 +910,13 @@ static __cpuidle void mwait_idle(void)
 
 void select_idle_routine(const struct cpuinfo_x86 *c)
 {
-#ifdef CONFIG_SMP
-	if (boot_option_idle_override == IDLE_POLL && smp_num_siblings > 1)
-		pr_warn_once("WARNING: polling idle and HT enabled, performance may degrade\n");
-#endif
-	if (x86_idle_set() || boot_option_idle_override == IDLE_POLL)
+	if (boot_option_idle_override == IDLE_POLL) {
+		if (IS_ENABLED(CONFIG_SMP) && smp_num_siblings > 1)
+			pr_warn_once("WARNING: polling idle and HT enabled, performance may degrade\n");
+		return;
+	}
+
+	if (x86_idle_set())
 		return;
 
 	if (prefer_mwait_c1_over_halt(c)) {



* [patch 4/6] x86/idle: Cleanup idle_setup()
  2024-02-29 14:23 [patch 0/6] x86/idle: Cure RCU violations and cleanups Thomas Gleixner
                   ` (2 preceding siblings ...)
  2024-02-29 14:23 ` [patch 3/6] x86/idle: Clean up idle selection Thomas Gleixner
@ 2024-02-29 14:23 ` Thomas Gleixner
  2024-03-01 20:48   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
  2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
  2024-02-29 14:23 ` [patch 5/6] x86/idle: Let prefer_mwait_c1_over_halt() return bool Thomas Gleixner
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2024-02-29 14:23 UTC (permalink / raw)
  To: LKML; +Cc: x86, Steven Rostedt, Paul E. McKenney

Updating the static call for x86_idle() from idle_setup() is
counter-intuitive.

Let select_idle_routine() handle it like the other idle choices, which
allows simplifying the idle selection later on.

While at it, rewrite the comments and return a proper error code instead
of a bare -1.
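
For reference, the boot options handled by idle_setup() after this change
(behavior as per the diff below):

    idle=poll      # poll in idle: IDLE_POLL, cpu_idle_poll_ctrl(true)
    idle=halt      # force HALT for idle: IDLE_HALT, C-states disabled
    idle=nomwait   # disable MWAIT for idle: IDLE_NOMWAIT
                   # anything else now fails with -EINVAL instead of -1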

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/process.c |   24 +++++++-----------------
 1 file changed, 7 insertions(+), 17 deletions(-)

--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -857,8 +857,8 @@ static int prefer_mwait_c1_over_halt(con
 {
 	u32 eax, ebx, ecx, edx;
 
-	/* User has disallowed the use of MWAIT. Fallback to HALT */
-	if (boot_option_idle_override == IDLE_NOMWAIT)
+	/* If override is enforced on the command line, fall back to HALT. */
+	if (boot_option_idle_override != IDLE_NO_OVERRIDE)
 		return 0;
 
 	/* MWAIT is not supported on this platform. Fallback to HALT */
@@ -975,24 +975,14 @@ static int __init idle_setup(char *str)
 		boot_option_idle_override = IDLE_POLL;
 		cpu_idle_poll_ctrl(true);
 	} else if (!strcmp(str, "halt")) {
-		/*
-		 * When the boot option of idle=halt is added, halt is
-		 * forced to be used for CPU idle. In such case CPU C2/C3
-		 * won't be used again.
-		 * To continue to load the CPU idle driver, don't touch
-		 * the boot_option_idle_override.
-		 */
-		static_call_update(x86_idle, default_idle);
+		/* 'idle=halt' HALT for idle. C-states are disabled. */
 		boot_option_idle_override = IDLE_HALT;
 	} else if (!strcmp(str, "nomwait")) {
-		/*
-		 * If the boot option of "idle=nomwait" is added,
-		 * it means that mwait will be disabled for CPU C1/C2/C3
-		 * states.
-		 */
+		/* 'idle=nomwait' disables MWAIT for idle */
 		boot_option_idle_override = IDLE_NOMWAIT;
-	} else
-		return -1;
+	} else {
+		return -EINVAL;
+	}
 
 	return 0;
 }



* [patch 5/6] x86/idle: Let prefer_mwait_c1_over_halt() return bool
  2024-02-29 14:23 [patch 0/6] x86/idle: Cure RCU violations and cleanups Thomas Gleixner
                   ` (3 preceding siblings ...)
  2024-02-29 14:23 ` [patch 4/6] x86/idle: Cleanup idle_setup() Thomas Gleixner
@ 2024-02-29 14:23 ` Thomas Gleixner
  2024-03-01 20:48   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
  2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
  2024-02-29 14:23 ` [patch 6/6] x86/idle: Select idle routine only once Thomas Gleixner
  2024-03-01 18:41 ` [patch 0/6] x86/idle: Cure RCU violations and cleanups Borislav Petkov
  6 siblings, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2024-02-29 14:23 UTC (permalink / raw)
  To: LKML; +Cc: x86, Steven Rostedt, Paul E. McKenney

The return value is truly boolean. Make it so.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/process.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -853,21 +853,21 @@ void __noreturn stop_this_cpu(void *dumm
  * Do not prefer MWAIT if MONITOR instruction has a bug or idle=nomwait
  * is passed to kernel commandline parameter.
  */
-static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
+static bool prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
 {
 	u32 eax, ebx, ecx, edx;
 
 	/* If override is enforced on the command line, fall back to HALT. */
 	if (boot_option_idle_override != IDLE_NO_OVERRIDE)
-		return 0;
+		return false;
 
 	/* MWAIT is not supported on this platform. Fallback to HALT */
 	if (!cpu_has(c, X86_FEATURE_MWAIT))
-		return 0;
+		return false;
 
 	/* Monitor has a bug or APIC stops in C1E. Fallback to HALT */
 	if (boot_cpu_has_bug(X86_BUG_MONITOR) || boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E))
-		return 0;
+		return false;
 
 	cpuid(CPUID_MWAIT_LEAF, &eax, &ebx, &ecx, &edx);
 
@@ -876,13 +876,13 @@ static int prefer_mwait_c1_over_halt(con
 	 * with EAX=0, ECX=0.
 	 */
 	if (!(ecx & CPUID5_ECX_EXTENSIONS_SUPPORTED))
-		return 1;
+		return true;
 
 	/*
 	 * If MWAIT extensions are available, there should be at least one
 	 * MWAIT C1 substate present.
 	 */
-	return (edx & MWAIT_C1_SUBSTATE_MASK);
+	return !!(edx & MWAIT_C1_SUBSTATE_MASK);
 }
 
 /*



* [patch 6/6] x86/idle: Select idle routine only once
  2024-02-29 14:23 [patch 0/6] x86/idle: Cure RCU violations and cleanups Thomas Gleixner
                   ` (4 preceding siblings ...)
  2024-02-29 14:23 ` [patch 5/6] x86/idle: Let prefer_mwait_c1_over_halt() return bool Thomas Gleixner
@ 2024-02-29 14:23 ` Thomas Gleixner
  2024-03-01  8:14   ` Thomas Gleixner
  2024-03-01 18:41 ` [patch 0/6] x86/idle: Cure RCU violations and cleanups Borislav Petkov
  6 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2024-02-29 14:23 UTC (permalink / raw)
  To: LKML; +Cc: x86, Steven Rostedt, Paul E. McKenney

The idle routine selection is done on every CPU bringup operation, guarded
by a check which takes effect after the first invocation. Every invocation
after the first is therefore a pointless exercise.

Invoke it once on the boot CPU and mark the related functions __init.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/processor.h |    2 +-
 arch/x86/kernel/cpu/common.c     |    4 ++--
 arch/x86/kernel/process.c        |   10 ++++------
 3 files changed, 7 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -555,7 +555,7 @@ static inline void load_sp0(unsigned lon
 
 unsigned long __get_wchan(struct task_struct *p);
 
-extern void select_idle_routine(const struct cpuinfo_x86 *c);
+extern void select_idle_routine(void);
 extern void amd_e400_c1e_apic_setup(void);
 
 extern unsigned long		boot_option_idle_override;
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1938,8 +1938,6 @@ static void identify_cpu(struct cpuinfo_
 	/* Init Machine Check Exception if available. */
 	mcheck_cpu_init(c);
 
-	select_idle_routine(c);
-
 #ifdef CONFIG_NUMA
 	numa_add_cpu(smp_processor_id());
 #endif
@@ -2343,6 +2341,8 @@ void __init arch_cpu_finalize_init(void)
 {
 	identify_boot_cpu();
 
+	select_idle_routine();
+
 	/*
 	 * identify_boot_cpu() initialized SMT support information, let the
 	 * core code know.
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -853,8 +853,9 @@ void __noreturn stop_this_cpu(void *dumm
  * Do not prefer MWAIT if MONITOR instruction has a bug or idle=nomwait
  * is passed to kernel commandline parameter.
  */
-static bool prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
+static __init bool prefer_mwait_c1_over_halt(void)
 {
+	const struct cpuinfo_x86 *c = &boot_cpu_data;
 	u32 eax, ebx, ecx, edx;
 
 	/* If override is enforced on the command line, fall back to HALT. */
@@ -908,7 +909,7 @@ static __cpuidle void mwait_idle(void)
 	__current_clr_polling();
 }
 
-void select_idle_routine(const struct cpuinfo_x86 *c)
+void __init select_idle_routine(void)
 {
 	if (boot_option_idle_override == IDLE_POLL) {
 		if (IS_ENABLED(CONFIG_SMP) && smp_num_siblings > 1)
@@ -916,10 +917,7 @@ void select_idle_routine(const struct cp
 		return;
 	}
 
-	if (x86_idle_set())
-		return;
-
-	if (prefer_mwait_c1_over_halt(c)) {
+	if (prefer_mwait_c1_over_halt()) {
 		pr_info("using mwait in idle threads\n");
 		static_call_update(x86_idle, mwait_idle);
 	} else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {



* Re: [patch 6/6] x86/idle: Select idle routine only once
  2024-02-29 14:23 ` [patch 6/6] x86/idle: Select idle routine only once Thomas Gleixner
@ 2024-03-01  8:14   ` Thomas Gleixner
  2024-03-01 11:33     ` Thomas Gleixner
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2024-03-01  8:14 UTC (permalink / raw)
  To: LKML; +Cc: x86, Steven Rostedt, Paul E. McKenney

On Thu, Feb 29 2024 at 15:23, Thomas Gleixner wrote:
>  
> -	if (x86_idle_set())
> -		return;
> -

Bah. With XEN=n this results in a 'defined but not used' warning.
Updated version below.

---
Subject: x86/idle: Select idle routine only once
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 28 Feb 2024 23:20:32 +0100

The idle routine selection is done on every CPU bringup operation, guarded
by a check which takes effect after the first invocation. Every invocation
after the first is therefore a pointless exercise.

Invoke it once on the boot CPU and mark the related functions __init.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V1a: Move x86_idle_set() into the only usage site (0day)
---
 arch/x86/include/asm/processor.h |    2 +-
 arch/x86/kernel/cpu/common.c     |    4 ++--
 arch/x86/kernel/process.c        |   18 +++++-------------
 3 files changed, 8 insertions(+), 16 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -555,7 +555,7 @@ static inline void load_sp0(unsigned lon
 
 unsigned long __get_wchan(struct task_struct *p);
 
-extern void select_idle_routine(const struct cpuinfo_x86 *c);
+extern void select_idle_routine(void);
 extern void amd_e400_c1e_apic_setup(void);
 
 extern unsigned long		boot_option_idle_override;
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1938,8 +1938,6 @@ static void identify_cpu(struct cpuinfo_
 	/* Init Machine Check Exception if available. */
 	mcheck_cpu_init(c);
 
-	select_idle_routine(c);
-
 #ifdef CONFIG_NUMA
 	numa_add_cpu(smp_processor_id());
 #endif
@@ -2343,6 +2341,8 @@ void __init arch_cpu_finalize_init(void)
 {
 	identify_boot_cpu();
 
+	select_idle_routine();
+
 	/*
 	 * identify_boot_cpu() initialized SMT support information, let the
 	 * core code know.
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -748,11 +748,6 @@ EXPORT_SYMBOL(default_idle);
 
 DEFINE_STATIC_CALL_NULL(x86_idle, default_idle);
 
-static bool x86_idle_set(void)
-{
-	return !!static_call_query(x86_idle);
-}
-
 #ifndef CONFIG_SMP
 static inline void __noreturn play_dead(void)
 {
@@ -783,10 +778,9 @@ EXPORT_SYMBOL_GPL(arch_cpu_idle);
 #ifdef CONFIG_XEN
 bool xen_set_default_idle(void)
 {
-	bool ret = x86_idle_set();
+	bool ret = !!static_call_query(x86_idle);
 
 	static_call_update(x86_idle, default_idle);
-
 	return ret;
 }
 #endif
@@ -853,8 +847,9 @@ void __noreturn stop_this_cpu(void *dumm
  * Do not prefer MWAIT if MONITOR instruction has a bug or idle=nomwait
  * is passed to kernel commandline parameter.
  */
-static bool prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
+static __init bool prefer_mwait_c1_over_halt(void)
 {
+	const struct cpuinfo_x86 *c = &boot_cpu_data;
 	u32 eax, ebx, ecx, edx;
 
 	/* If override is enforced on the command line, fall back to HALT. */
@@ -908,7 +903,7 @@ static __cpuidle void mwait_idle(void)
 	__current_clr_polling();
 }
 
-void select_idle_routine(const struct cpuinfo_x86 *c)
+void __init select_idle_routine(void)
 {
 	if (boot_option_idle_override == IDLE_POLL) {
 		if (IS_ENABLED(CONFIG_SMP) && smp_num_siblings > 1)
@@ -916,10 +911,7 @@ void select_idle_routine(const struct cp
 		return;
 	}
 
-	if (x86_idle_set())
-		return;
-
-	if (prefer_mwait_c1_over_halt(c)) {
+	if (prefer_mwait_c1_over_halt()) {
 		pr_info("using mwait in idle threads\n");
 		static_call_update(x86_idle, mwait_idle);
 	} else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {


* Re: [patch 6/6] x86/idle: Select idle routine only once
  2024-03-01  8:14   ` Thomas Gleixner
@ 2024-03-01 11:33     ` Thomas Gleixner
  2024-03-01 20:48       ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
  2024-03-04 16:51       ` tip-bot2 for Thomas Gleixner
  0 siblings, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2024-03-01 11:33 UTC (permalink / raw)
  To: LKML; +Cc: x86, Steven Rostedt, Paul E. McKenney

On Fri, Mar 01 2024 at 09:14, Thomas Gleixner wrote:
> On Thu, Feb 29 2024 at 15:23, Thomas Gleixner wrote:
>>  
>> -	if (x86_idle_set())
>> -		return;
>> -
>
> Bah. With XEN=n this results in a defined but not used warning.
> Updated version below.

I'm a moron. xen_set_default_idle() is invoked very early on, so
select_idle_routine() has to keep the check. Sigh.

Fixed it and added a comment to that effect.
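
Schematically, the boot ordering which makes the guard necessary:

    xen_set_default_idle();        /* early Xen PV setup updates x86_idle */
        ...
    arch_cpu_finalize_init()
      select_idle_routine();       /* must not override Xen's choice: */
        if (x86_idle_set())        /* hence the retained guard */
            return;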

---
Subject: x86/idle: Select idle routine only once
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 28 Feb 2024 23:20:32 +0100

The idle routine selection is done on every CPU bringup operation, guarded
by a check which takes effect after the first invocation. Every invocation
after the first is therefore a pointless exercise.

Invoke it once on the boot CPU and mark the related functions __init.
The guard check has to stay as xen_set_default_idle() runs early.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V1b: Keep the x86_idle_set() check so XEN keeps working
V1a: Move x86_idle_set() into the only usage site (0day)
---
 arch/x86/include/asm/processor.h |    2 +-
 arch/x86/kernel/cpu/common.c     |    4 ++--
 arch/x86/kernel/process.c        |    8 +++++---
 3 files changed, 8 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -555,7 +555,7 @@ static inline void load_sp0(unsigned lon
 
 unsigned long __get_wchan(struct task_struct *p);
 
-extern void select_idle_routine(const struct cpuinfo_x86 *c);
+extern void select_idle_routine(void);
 extern void amd_e400_c1e_apic_setup(void);
 
 extern unsigned long		boot_option_idle_override;
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1938,8 +1938,6 @@ static void identify_cpu(struct cpuinfo_
 	/* Init Machine Check Exception if available. */
 	mcheck_cpu_init(c);
 
-	select_idle_routine(c);
-
 #ifdef CONFIG_NUMA
 	numa_add_cpu(smp_processor_id());
 #endif
@@ -2343,6 +2341,8 @@ void __init arch_cpu_finalize_init(void)
 {
 	identify_boot_cpu();
 
+	select_idle_routine();
+
 	/*
 	 * identify_boot_cpu() initialized SMT support information, let the
 	 * core code know.
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -853,8 +853,9 @@ void __noreturn stop_this_cpu(void *dumm
  * Do not prefer MWAIT if MONITOR instruction has a bug or idle=nomwait
  * is passed to kernel commandline parameter.
  */
-static bool prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
+static __init bool prefer_mwait_c1_over_halt(void)
 {
+	const struct cpuinfo_x86 *c = &boot_cpu_data;
 	u32 eax, ebx, ecx, edx;
 
 	/* If override is enforced on the command line, fall back to HALT. */
@@ -908,7 +909,7 @@ static __cpuidle void mwait_idle(void)
 	__current_clr_polling();
 }
 
-void select_idle_routine(const struct cpuinfo_x86 *c)
+void __init select_idle_routine(void)
 {
 	if (boot_option_idle_override == IDLE_POLL) {
 		if (IS_ENABLED(CONFIG_SMP) && smp_num_siblings > 1)
@@ -916,10 +917,11 @@ void select_idle_routine(const struct cp
 		return;
 	}
 
+	/* Required to guard against xen_set_default_idle() */
 	if (x86_idle_set())
 		return;
 
-	if (prefer_mwait_c1_over_halt(c)) {
+	if (prefer_mwait_c1_over_halt()) {
 		pr_info("using mwait in idle threads\n");
 		static_call_update(x86_idle, mwait_idle);
 	} else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {


* Re: [patch 2/6] x86/idle: Sanitize X86_BUG_AMD_E400 handling
  2024-02-29 14:23 ` [patch 2/6] x86/idle: Sanitize X86_BUG_AMD_E400 handling Thomas Gleixner
@ 2024-03-01 12:33   ` Thomas Gleixner
  2024-03-01 20:48     ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
  2024-03-04 16:51     ` tip-bot2 for Thomas Gleixner
  0 siblings, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2024-03-01 12:33 UTC (permalink / raw)
  To: LKML; +Cc: x86, Steven Rostedt, Paul E. McKenney, Borislav Petkov

On Thu, Feb 29 2024 at 15:23, Thomas Gleixner wrote:
> @@ -985,7 +958,9 @@ void __init arch_post_acpi_subsys_init(v
>  
>  	if (!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
>  		mark_tsc_unstable("TSC halt in AMD C1E");
> -	pr_info("System has AMD C1E enabled\n");
> +
> +	static_branch_enable(&arch_needs_tick_broadcast);
> +	pr_info("System has AMD C1E erratum E400. Workaround enabled.\n");
>  }

Breaks the 32-bit build with APIC=n :(

---
Subject: x86/idle: Sanitize X86_BUG_AMD_E400 handling
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 28 Feb 2024 23:13:00 +0100

amd_e400_idle(), the idle routine for AMD CPUs which are affected by
erratum 400, violates the RCU constraints by invoking tick_broadcast_enter()
and tick_broadcast_exit() after the core code has marked RCU idle.  The
functions can end up in lockdep or tracing, which rightfully triggers an
RCU warning.

The core code now provides a static branch conditional invocation of the
broadcast functions.

Remove amd_e400_idle(), enforce default_idle() and enable the static branch
on affected CPUs to cure this.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V1b: Fix 32-bit build with APIC=n (0-day)
---
 arch/x86/Kconfig          |    1 +
 arch/x86/kernel/process.c |   42 +++++++++---------------------------------
 2 files changed, 10 insertions(+), 33 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,6 +147,7 @@ config X86
 	select EDAC_ATOMIC_SCRUB
 	select EDAC_SUPPORT
 	select GENERIC_CLOCKEVENTS_BROADCAST	if X86_64 || (X86_32 && X86_LOCAL_APIC)
+	select GENERIC_CLOCKEVENTS_BROADCAST_IDLE	if GENERIC_CLOCKEVENTS_BROADCAST
 	select GENERIC_CLOCKEVENTS_MIN_ADJUST
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_AUTOPROBE
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -846,31 +846,6 @@ void __noreturn stop_this_cpu(void *dumm
 }
 
 /*
- * AMD Erratum 400 aware idle routine. We handle it the same way as C3 power
- * states (local apic timer and TSC stop).
- *
- * XXX this function is completely buggered vs RCU and tracing.
- */
-static void amd_e400_idle(void)
-{
-	/*
-	 * We cannot use static_cpu_has_bug() here because X86_BUG_AMD_APIC_C1E
-	 * gets set after static_cpu_has() places have been converted via
-	 * alternatives.
-	 */
-	if (!boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E)) {
-		default_idle();
-		return;
-	}
-
-	tick_broadcast_enter();
-
-	default_idle();
-
-	tick_broadcast_exit();
-}
-
-/*
  * Prefer MWAIT over HALT if MWAIT is supported, MWAIT_CPUID leaf
  * exists and whenever MONITOR/MWAIT extensions are present there is at
  * least one C1 substate.
@@ -890,8 +865,8 @@ static int prefer_mwait_c1_over_halt(con
 	if (!cpu_has(c, X86_FEATURE_MWAIT))
 		return 0;
 
-	/* Monitor has a bug. Fallback to HALT */
-	if (boot_cpu_has_bug(X86_BUG_MONITOR))
+	/* Monitor has a bug or APIC stops in C1E. Fallback to HALT */
+	if (boot_cpu_has_bug(X86_BUG_MONITOR) || boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E))
 		return 0;
 
 	cpuid(CPUID_MWAIT_LEAF, &eax, &ebx, &ecx, &edx);
@@ -942,17 +917,15 @@ void select_idle_routine(const struct cp
 	if (x86_idle_set() || boot_option_idle_override == IDLE_POLL)
 		return;
 
-	if (boot_cpu_has_bug(X86_BUG_AMD_E400)) {
-		pr_info("using AMD E400 aware idle routine\n");
-		static_call_update(x86_idle, amd_e400_idle);
-	} else if (prefer_mwait_c1_over_halt(c)) {
+	if (prefer_mwait_c1_over_halt(c)) {
 		pr_info("using mwait in idle threads\n");
 		static_call_update(x86_idle, mwait_idle);
 	} else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
 		pr_info("using TDX aware idle routine\n");
 		static_call_update(x86_idle, tdx_safe_halt);
-	} else
+	} else {
 		static_call_update(x86_idle, default_idle);
+	}
 }
 
 void amd_e400_c1e_apic_setup(void)
@@ -985,7 +958,10 @@ void __init arch_post_acpi_subsys_init(v
 
 	if (!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
 		mark_tsc_unstable("TSC halt in AMD C1E");
-	pr_info("System has AMD C1E enabled\n");
+
+	if (IS_ENABLED(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST_IDLE))
+		static_branch_enable(&arch_needs_tick_broadcast);
+	pr_info("System has AMD C1E erratum E400. Workaround enabled.\n");
 }
 
 static int __init idle_setup(char *str)


* Re: [patch 0/6] x86/idle: Cure RCU violations and cleanups
  2024-02-29 14:23 [patch 0/6] x86/idle: Cure RCU violations and cleanups Thomas Gleixner
                   ` (5 preceding siblings ...)
  2024-02-29 14:23 ` [patch 6/6] x86/idle: Select idle routine only once Thomas Gleixner
@ 2024-03-01 18:41 ` Borislav Petkov
  6 siblings, 0 replies; 22+ messages in thread
From: Borislav Petkov @ 2024-03-01 18:41 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, x86, Steven Rostedt, Paul E. McKenney

On Thu, Feb 29, 2024 at 03:23:35PM +0100, Thomas Gleixner wrote:
> Boris reported that an RCU-related warning triggers in the tracer code on
> AMD machines which are affected by Erratum 400. On those CPUs the local
> APIC timer stops in the C1E halt state. This is handled by a special idle
> function which invokes tick_broadcast_enter()/exit() around HALT.  These
> functions can end up in lockdep or tracing, which use RCU-protected data,
> but the core code has already set RCU to idle, which means that the RCU
> protection is no longer given.
> 
> This series fixes this by handling the tick broadcast conditionally in the
> core idle function. While working on it, I noticed a few bogosities in the
> related code and cleaned those up on top.
> 
> The series is also available from git:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/core
> 
> Thanks,
> 
> 	tglx
> ---
>  arch/x86/Kconfig                 |    1 
>  arch/x86/include/asm/processor.h |    2 
>  arch/x86/kernel/cpu/common.c     |    4 -
>  arch/x86/kernel/process.c        |   89 +++++++++++----------------------------
>  include/linux/cpu.h              |    2 
>  include/linux/tick.h             |    3 +
>  kernel/sched/idle.c              |   21 +++++++++
>  kernel/time/Kconfig              |    5 ++
>  8 files changed, 62 insertions(+), 65 deletions(-)

I refreshed my local branch with all your fixed patches and it still
works.

Tested-by: Borislav Petkov (AMD) <bp@alien8.de>

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* [tip: x86/core] x86/idle: Select idle routine only once
  2024-03-01 11:33     ` Thomas Gleixner
@ 2024-03-01 20:48       ` tip-bot2 for Thomas Gleixner
  2024-03-04 16:51       ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-01 20:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     25525edd9c99d3aa799e80a8e98bdd62ed1639f9
Gitweb:        https://git.kernel.org/tip/25525edd9c99d3aa799e80a8e98bdd62ed1639f9
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 28 Feb 2024 23:20:32 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Fri, 01 Mar 2024 21:34:55 +01:00

x86/idle: Select idle routine only once

The idle routine selection is done on every CPU bringup operation, guarded
by a check which takes effect after the first invocation. Every invocation
after the first is therefore a pointless exercise.

Invoke it once on the boot CPU and mark the related functions __init.
The guard check has to stay as xen_set_default_idle() runs early.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/87edcu6vaq.ffs@tglx
---
 arch/x86/include/asm/processor.h | 2 +-
 arch/x86/kernel/cpu/common.c     | 4 ++--
 arch/x86/kernel/process.c        | 8 +++++---
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 1188e8b..523c466 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -558,7 +558,7 @@ static inline void load_sp0(unsigned long sp0)
 
 unsigned long __get_wchan(struct task_struct *p);
 
-extern void select_idle_routine(const struct cpuinfo_x86 *c);
+extern void select_idle_routine(void);
 extern void amd_e400_c1e_apic_setup(void);
 
 extern unsigned long		boot_option_idle_override;
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8f367d3..5c72af1 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1938,8 +1938,6 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	/* Init Machine Check Exception if available. */
 	mcheck_cpu_init(c);
 
-	select_idle_routine(c);
-
 #ifdef CONFIG_NUMA
 	numa_add_cpu(smp_processor_id());
 #endif
@@ -2344,6 +2342,8 @@ void __init arch_cpu_finalize_init(void)
 {
 	identify_boot_cpu();
 
+	select_idle_routine();
+
 	/*
 	 * identify_boot_cpu() initialized SMT support information, let the
 	 * core code know.
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 2d51cbd..ac34342 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -853,8 +853,9 @@ void __noreturn stop_this_cpu(void *dummy)
  * Do not prefer MWAIT if MONITOR instruction has a bug or idle=nomwait
  * is passed to kernel commandline parameter.
  */
-static bool prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
+static __init bool prefer_mwait_c1_over_halt(void)
 {
+	const struct cpuinfo_x86 *c = &boot_cpu_data;
 	u32 eax, ebx, ecx, edx;
 
 	/* If override is enforced on the command line, fall back to HALT. */
@@ -908,7 +909,7 @@ static __cpuidle void mwait_idle(void)
 	__current_clr_polling();
 }
 
-void select_idle_routine(const struct cpuinfo_x86 *c)
+void __init select_idle_routine(void)
 {
 	if (boot_option_idle_override == IDLE_POLL) {
 		if (IS_ENABLED(CONFIG_SMP) && smp_num_siblings > 1)
@@ -916,10 +917,11 @@ void select_idle_routine(const struct cpuinfo_x86 *c)
 		return;
 	}
 
+	/* Required to guard against xen_set_default_idle() */
 	if (x86_idle_set())
 		return;
 
-	if (prefer_mwait_c1_over_halt(c)) {
+	if (prefer_mwait_c1_over_halt()) {
 		pr_info("using mwait in idle threads\n");
 		static_call_update(x86_idle, mwait_idle);
 	} else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {


* [tip: x86/core] x86/idle: Let prefer_mwait_c1_over_halt() return bool
  2024-02-29 14:23 ` [patch 5/6] x86/idle: Let prefer_mwait_c1_over_halt() return bool Thomas Gleixner
@ 2024-03-01 20:48   ` tip-bot2 for Thomas Gleixner
  2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-01 20:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     71f3a783a758696b5f8084b6f1621240b8804568
Gitweb:        https://git.kernel.org/tip/71f3a783a758696b5f8084b6f1621240b8804568
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Feb 2024 15:23:41 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Fri, 01 Mar 2024 21:33:12 +01:00

x86/idle: Let prefer_mwait_c1_over_halt() return bool

The return value is truly boolean. Make it so.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240229142248.518723854@linutronix.de
---
 arch/x86/kernel/process.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 678d5cf..2d51cbd 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -853,21 +853,21 @@ void __noreturn stop_this_cpu(void *dummy)
  * Do not prefer MWAIT if MONITOR instruction has a bug or idle=nomwait
  * is passed to kernel commandline parameter.
  */
-static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
+static bool prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
 {
 	u32 eax, ebx, ecx, edx;
 
 	/* If override is enforced on the command line, fall back to HALT. */
 	if (boot_option_idle_override != IDLE_NO_OVERRIDE)
-		return 0;
+		return false;
 
 	/* MWAIT is not supported on this platform. Fallback to HALT */
 	if (!cpu_has(c, X86_FEATURE_MWAIT))
-		return 0;
+		return false;
 
 	/* Monitor has a bug or APIC stops in C1E. Fallback to HALT */
 	if (boot_cpu_has_bug(X86_BUG_MONITOR) || boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E))
-		return 0;
+		return false;
 
 	cpuid(CPUID_MWAIT_LEAF, &eax, &ebx, &ecx, &edx);
 
@@ -876,13 +876,13 @@ static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
 	 * with EAX=0, ECX=0.
 	 */
 	if (!(ecx & CPUID5_ECX_EXTENSIONS_SUPPORTED))
-		return 1;
+		return true;
 
 	/*
 	 * If MWAIT extensions are available, there should be at least one
 	 * MWAIT C1 substate present.
 	 */
-	return (edx & MWAIT_C1_SUBSTATE_MASK);
+	return !!(edx & MWAIT_C1_SUBSTATE_MASK);
 }
 
 /*


* [tip: x86/core] x86/idle: Cleanup idle_setup()
  2024-02-29 14:23 ` [patch 4/6] x86/idle: Cleanup idle_setup() Thomas Gleixner
@ 2024-03-01 20:48   ` tip-bot2 for Thomas Gleixner
  2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-01 20:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     695f90619cf785f8a13cbf21bbfddae103542fdf
Gitweb:        https://git.kernel.org/tip/695f90619cf785f8a13cbf21bbfddae103542fdf
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Feb 2024 15:23:40 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Fri, 01 Mar 2024 21:33:12 +01:00

x86/idle: Cleanup idle_setup()

Updating the static call for x86_idle() from idle_setup() is
counter-intuitive.

Let select_idle_routine() handle it like the other idle choices, which
allows simplifying the idle selection later on.

While at it, rewrite the comments and return a proper error code instead
of a bare -1.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240229142248.455616019@linutronix.de
---
 arch/x86/kernel/process.c | 24 +++++++-----------------
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b6512f7..678d5cf 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -857,8 +857,8 @@ static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
 {
 	u32 eax, ebx, ecx, edx;
 
-	/* User has disallowed the use of MWAIT. Fallback to HALT */
-	if (boot_option_idle_override == IDLE_NOMWAIT)
+	/* If override is enforced on the command line, fall back to HALT. */
+	if (boot_option_idle_override != IDLE_NO_OVERRIDE)
 		return 0;
 
 	/* MWAIT is not supported on this platform. Fallback to HALT */
@@ -976,24 +976,14 @@ static int __init idle_setup(char *str)
 		boot_option_idle_override = IDLE_POLL;
 		cpu_idle_poll_ctrl(true);
 	} else if (!strcmp(str, "halt")) {
-		/*
-		 * When the boot option of idle=halt is added, halt is
-		 * forced to be used for CPU idle. In such case CPU C2/C3
-		 * won't be used again.
-		 * To continue to load the CPU idle driver, don't touch
-		 * the boot_option_idle_override.
-		 */
-		static_call_update(x86_idle, default_idle);
+		/* 'idle=halt' HALT for idle. C-states are disabled. */
 		boot_option_idle_override = IDLE_HALT;
 	} else if (!strcmp(str, "nomwait")) {
-		/*
-		 * If the boot option of "idle=nomwait" is added,
-		 * it means that mwait will be disabled for CPU C1/C2/C3
-		 * states.
-		 */
+		/* 'idle=nomwait' disables MWAIT for idle */
 		boot_option_idle_override = IDLE_NOMWAIT;
-	} else
-		return -1;
+	} else {
+		return -EINVAL;
+	}
 
 	return 0;
 }


* [tip: x86/core] x86/idle: Clean up idle selection
  2024-02-29 14:23 ` [patch 3/6] x86/idle: Clean up idle selection Thomas Gleixner
@ 2024-03-01 20:48   ` tip-bot2 for Thomas Gleixner
  2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-01 20:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     18a9f4806dcb33ed7eab93232d13facf5c6e2b70
Gitweb:        https://git.kernel.org/tip/18a9f4806dcb33ed7eab93232d13facf5c6e2b70
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Feb 2024 15:23:38 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Fri, 01 Mar 2024 21:33:12 +01:00

x86/idle: Clean up idle selection

Clean up the code to make it readable. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240229142248.392017685@linutronix.de
---
 arch/x86/kernel/process.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 5853d56..b6512f7 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -910,11 +910,13 @@ static __cpuidle void mwait_idle(void)
 
 void select_idle_routine(const struct cpuinfo_x86 *c)
 {
-#ifdef CONFIG_SMP
-	if (boot_option_idle_override == IDLE_POLL && smp_num_siblings > 1)
-		pr_warn_once("WARNING: polling idle and HT enabled, performance may degrade\n");
-#endif
-	if (x86_idle_set() || boot_option_idle_override == IDLE_POLL)
+	if (boot_option_idle_override == IDLE_POLL) {
+		if (IS_ENABLED(CONFIG_SMP) && smp_num_siblings > 1)
+			pr_warn_once("WARNING: polling idle and HT enabled, performance may degrade\n");
+		return;
+	}
+
+	if (x86_idle_set())
 		return;
 
 	if (prefer_mwait_c1_over_halt(c)) {


* [tip: x86/core] x86/idle: Sanitize X86_BUG_AMD_E400 handling
  2024-03-01 12:33   ` Thomas Gleixner
@ 2024-03-01 20:48     ` tip-bot2 for Thomas Gleixner
  2024-03-04 16:51     ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-01 20:48 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Borislav Petkov, Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     3b9b8e89c8e1e97c76be6abc67db702ea978d89b
Gitweb:        https://git.kernel.org/tip/3b9b8e89c8e1e97c76be6abc67db702ea978d89b
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 28 Feb 2024 23:13:00 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Fri, 01 Mar 2024 21:33:05 +01:00

x86/idle: Sanitize X86_BUG_AMD_E400 handling

amd_e400_idle(), the idle routine for AMD CPUs which are affected by
erratum 400, violates the RCU constraints by invoking tick_broadcast_enter()
and tick_broadcast_exit() after the core code has marked RCU idle.  The
functions can end up in lockdep or tracing, which rightfully triggers an
RCU warning.

The core code now provides a static branch conditional invocation of the
broadcast functions.

Remove amd_e400_idle(), enforce default_idle() and enable the static branch
on affected CPUs to cure this.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/877cim6sis.ffs@tglx
---
 arch/x86/Kconfig          |  1 +-
 arch/x86/kernel/process.c | 42 ++++++++------------------------------
 2 files changed, 10 insertions(+), 33 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5029862..0802653 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,6 +147,7 @@ config X86
 	select EDAC_ATOMIC_SCRUB
 	select EDAC_SUPPORT
 	select GENERIC_CLOCKEVENTS_BROADCAST	if X86_64 || (X86_32 && X86_LOCAL_APIC)
+	select GENERIC_CLOCKEVENTS_BROADCAST_IDLE	if GENERIC_CLOCKEVENTS_BROADCAST
 	select GENERIC_CLOCKEVENTS_MIN_ADJUST
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_AUTOPROBE
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 45a9d49..5853d56 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -846,31 +846,6 @@ void __noreturn stop_this_cpu(void *dummy)
 }
 
 /*
- * AMD Erratum 400 aware idle routine. We handle it the same way as C3 power
- * states (local apic timer and TSC stop).
- *
- * XXX this function is completely buggered vs RCU and tracing.
- */
-static void amd_e400_idle(void)
-{
-	/*
-	 * We cannot use static_cpu_has_bug() here because X86_BUG_AMD_APIC_C1E
-	 * gets set after static_cpu_has() places have been converted via
-	 * alternatives.
-	 */
-	if (!boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E)) {
-		default_idle();
-		return;
-	}
-
-	tick_broadcast_enter();
-
-	default_idle();
-
-	tick_broadcast_exit();
-}
-
-/*
  * Prefer MWAIT over HALT if MWAIT is supported, MWAIT_CPUID leaf
  * exists and whenever MONITOR/MWAIT extensions are present there is at
  * least one C1 substate.
@@ -890,8 +865,8 @@ static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
 	if (!cpu_has(c, X86_FEATURE_MWAIT))
 		return 0;
 
-	/* Monitor has a bug. Fallback to HALT */
-	if (boot_cpu_has_bug(X86_BUG_MONITOR))
+	/* Monitor has a bug or APIC stops in C1E. Fallback to HALT */
+	if (boot_cpu_has_bug(X86_BUG_MONITOR) || boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E))
 		return 0;
 
 	cpuid(CPUID_MWAIT_LEAF, &eax, &ebx, &ecx, &edx);
@@ -942,17 +917,15 @@ void select_idle_routine(const struct cpuinfo_x86 *c)
 	if (x86_idle_set() || boot_option_idle_override == IDLE_POLL)
 		return;
 
-	if (boot_cpu_has_bug(X86_BUG_AMD_E400)) {
-		pr_info("using AMD E400 aware idle routine\n");
-		static_call_update(x86_idle, amd_e400_idle);
-	} else if (prefer_mwait_c1_over_halt(c)) {
+	if (prefer_mwait_c1_over_halt(c)) {
 		pr_info("using mwait in idle threads\n");
 		static_call_update(x86_idle, mwait_idle);
 	} else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
 		pr_info("using TDX aware idle routine\n");
 		static_call_update(x86_idle, tdx_safe_halt);
-	} else
+	} else {
 		static_call_update(x86_idle, default_idle);
+	}
 }
 
 void amd_e400_c1e_apic_setup(void)
@@ -985,7 +958,10 @@ void __init arch_post_acpi_subsys_init(void)
 
 	if (!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
 		mark_tsc_unstable("TSC halt in AMD C1E");
-	pr_info("System has AMD C1E enabled\n");
+
+	if (IS_ENABLED(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST_IDLE))
+		static_branch_enable(&arch_needs_tick_broadcast);
+	pr_info("System has AMD C1E erratum E400. Workaround enabled.\n");
 }
 
 static int __init idle_setup(char *str)


* [tip: x86/core] sched/idle: Conditionally handle tick broadcast in default_idle_call()
  2024-02-29 14:23 ` [patch 1/6] sched/idle: Conditionally handle tick broadcast in default_idle_call() Thomas Gleixner
@ 2024-03-01 20:48   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-01 20:48 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Borislav Petkov, Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     2be2a197ff6c3a659ab9285e1d88cbdc609ac6de
Gitweb:        https://git.kernel.org/tip/2be2a197ff6c3a659ab9285e1d88cbdc609ac6de
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Feb 2024 15:23:36 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Fri, 01 Mar 2024 21:04:27 +01:00

sched/idle: Conditionally handle tick broadcast in default_idle_call()

The x86 architecture has an idle routine for AMD CPUs which are affected
by erratum 400. On the affected CPUs the local APIC timer stops in the
C1E halt state.

It therefore requires tick broadcasting. The invocation of
tick_broadcast_enter()/exit() from this function violates the RCU
constraints because it can end up in lockdep or tracing, which
rightfully triggers a warning.

tick_broadcast_enter()/exit() must be invoked before ct_cpuidle_enter()
and after ct_cpuidle_exit() in default_idle_call().

Add a static branch conditional invocation of tick_broadcast_enter()/exit()
into this function to allow X86 to replace the AMD-specific idle code. It's
guarded by a config switch which will be selected by x86. Otherwise it's
a NOOP.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240229142248.266708822@linutronix.de
---
 include/linux/cpu.h  |  2 ++
 include/linux/tick.h |  3 +++
 kernel/sched/idle.c  | 21 +++++++++++++++++++++
 kernel/time/Kconfig  |  5 +++++
 4 files changed, 31 insertions(+)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index dcb89c9..715017d 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -196,6 +196,8 @@ void arch_cpu_idle(void);
 void arch_cpu_idle_prepare(void);
 void arch_cpu_idle_enter(void);
 void arch_cpu_idle_exit(void);
+void arch_tick_broadcast_enter(void);
+void arch_tick_broadcast_exit(void);
 void __noreturn arch_cpu_idle_dead(void);
 
 #ifdef CONFIG_ARCH_HAS_CPU_FINALIZE_INIT
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 716d17f..e134f28 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -12,6 +12,7 @@
 #include <linux/cpumask.h>
 #include <linux/sched.h>
 #include <linux/rcupdate.h>
+#include <linux/static_key.h>
 
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
 extern void __init tick_init(void);
@@ -63,6 +64,8 @@ enum tick_broadcast_state {
 	TICK_BROADCAST_ENTER,
 };
 
+extern struct static_key_false arch_needs_tick_broadcast;
+
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 extern void tick_broadcast_control(enum tick_broadcast_mode mode);
 #else
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 3123192..31ad811 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -81,6 +81,25 @@ void __weak arch_cpu_idle(void)
 	cpu_idle_force_poll = 1;
 }
 
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST_IDLE
+DEFINE_STATIC_KEY_FALSE(arch_needs_tick_broadcast);
+
+static inline void cond_tick_broadcast_enter(void)
+{
+	if (static_branch_unlikely(&arch_needs_tick_broadcast))
+		tick_broadcast_enter();
+}
+
+static inline void cond_tick_broadcast_exit(void)
+{
+	if (static_branch_unlikely(&arch_needs_tick_broadcast))
+		tick_broadcast_exit();
+}
+#else
+static inline void cond_tick_broadcast_enter(void) { }
+static inline void cond_tick_broadcast_exit(void) { }
+#endif
+
 /**
  * default_idle_call - Default CPU idle routine.
  *
@@ -90,6 +109,7 @@ void __cpuidle default_idle_call(void)
 {
 	instrumentation_begin();
 	if (!current_clr_polling_and_test()) {
+		cond_tick_broadcast_enter();
 		trace_cpu_idle(1, smp_processor_id());
 		stop_critical_timings();
 
@@ -99,6 +119,7 @@ void __cpuidle default_idle_call(void)
 
 		start_critical_timings();
 		trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
+		cond_tick_broadcast_exit();
 	}
 	local_irq_enable();
 	instrumentation_end();
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index bae8f11..fc3b1a0 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -39,6 +39,11 @@ config GENERIC_CLOCKEVENTS_BROADCAST
 	bool
 	depends on GENERIC_CLOCKEVENTS
 
+# Handle broadcast in default_idle_call()
+config GENERIC_CLOCKEVENTS_BROADCAST_IDLE
+	bool
+	depends on GENERIC_CLOCKEVENTS_BROADCAST
+
 # Automatically adjust the min. reprogramming time for
 # clock event device
 config GENERIC_CLOCKEVENTS_MIN_ADJUST


* [tip: x86/core] x86/idle: Select idle routine only once
  2024-03-01 11:33     ` Thomas Gleixner
  2024-03-01 20:48       ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
@ 2024-03-04 16:51       ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 16:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     35ce64922c8263448e58a2b9e8d15a64e11e9b2d
Gitweb:        https://git.kernel.org/tip/35ce64922c8263448e58a2b9e8d15a64e11e9b2d
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 28 Feb 2024 23:20:32 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 04 Mar 2024 17:39:24 +01:00

x86/idle: Select idle routine only once

The idle routine selection is done on every CPU bringup operation although
a guard ensures that every invocation after the first one bails out, which
makes the repeated calls a pointless exercise.

Invoke it once on the boot CPU and mark the related functions __init.
The guard check has to stay as xen_set_default_idle() runs early.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/87edcu6vaq.ffs@tglx
---
 arch/x86/include/asm/processor.h | 2 +-
 arch/x86/kernel/cpu/common.c     | 4 ++--
 arch/x86/kernel/process.c        | 8 +++++---
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 1188e8b..523c466 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -558,7 +558,7 @@ static inline void load_sp0(unsigned long sp0)
 
 unsigned long __get_wchan(struct task_struct *p);
 
-extern void select_idle_routine(const struct cpuinfo_x86 *c);
+extern void select_idle_routine(void);
 extern void amd_e400_c1e_apic_setup(void);
 
 extern unsigned long		boot_option_idle_override;
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8f367d3..5c72af1 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1938,8 +1938,6 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	/* Init Machine Check Exception if available. */
 	mcheck_cpu_init(c);
 
-	select_idle_routine(c);
-
 #ifdef CONFIG_NUMA
 	numa_add_cpu(smp_processor_id());
 #endif
@@ -2344,6 +2342,8 @@ void __init arch_cpu_finalize_init(void)
 {
 	identify_boot_cpu();
 
+	select_idle_routine();
+
 	/*
 	 * identify_boot_cpu() initialized SMT support information, let the
 	 * core code know.
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index ccaacc7..f0166b3 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -853,8 +853,9 @@ void __noreturn stop_this_cpu(void *dummy)
  * Do not prefer MWAIT if MONITOR instruction has a bug or idle=nomwait
  * is passed to kernel commandline parameter.
  */
-static bool prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
+static __init bool prefer_mwait_c1_over_halt(void)
 {
+	const struct cpuinfo_x86 *c = &boot_cpu_data;
 	u32 eax, ebx, ecx, edx;
 
 	/* If override is enforced on the command line, fall back to HALT. */
@@ -908,7 +909,7 @@ static __cpuidle void mwait_idle(void)
 	__current_clr_polling();
 }
 
-void select_idle_routine(const struct cpuinfo_x86 *c)
+void __init select_idle_routine(void)
 {
 	if (boot_option_idle_override == IDLE_POLL) {
 		if (IS_ENABLED(CONFIG_SMP) && smp_num_siblings > 1)
@@ -916,10 +917,11 @@ void select_idle_routine(const struct cpuinfo_x86 *c)
 		return;
 	}
 
+	/* Required to guard against xen_set_default_idle() */
 	if (x86_idle_set())
 		return;
 
-	if (prefer_mwait_c1_over_halt(c)) {
+	if (prefer_mwait_c1_over_halt()) {
 		pr_info("using mwait in idle threads\n");
 		static_call_update(x86_idle, mwait_idle);
 	} else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
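
The guard tested above is x86_idle_set(), whose body is not part of this
hunk. A plausible sketch of its shape, assuming it queries the x86_idle
static call (an illustration, not a quote from the tree):

	/* True once an idle routine occupies the x86_idle static call,
	 * e.g. the one installed early by xen_set_default_idle(). */
	static bool x86_idle_set(void)
	{
		return !!static_call_query(x86_idle);
	}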

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: x86/core] x86/idle: Let prefer_mwait_c1_over_halt() return bool
  2024-02-29 14:23 ` [patch 5/6] x86/idle: Let prefer_mwait_c1_over_halt() return bool Thomas Gleixner
  2024-03-01 20:48   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
@ 2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 16:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     5f75916ec6ecdc6314b637746f3ad809f2fc7379
Gitweb:        https://git.kernel.org/tip/5f75916ec6ecdc6314b637746f3ad809f2fc7379
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Feb 2024 15:23:41 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 04 Mar 2024 17:39:24 +01:00

x86/idle: Let prefer_mwait_c1_over_halt() return bool

The return value is truly boolean. Make it so.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240229142248.518723854@linutronix.de
---
 arch/x86/kernel/process.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 9709959..ccaacc7 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -853,21 +853,21 @@ void __noreturn stop_this_cpu(void *dummy)
  * Do not prefer MWAIT if MONITOR instruction has a bug or idle=nomwait
  * is passed to kernel commandline parameter.
  */
-static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
+static bool prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
 {
 	u32 eax, ebx, ecx, edx;
 
 	/* If override is enforced on the command line, fall back to HALT. */
 	if (boot_option_idle_override != IDLE_NO_OVERRIDE)
-		return 0;
+		return false;
 
 	/* MWAIT is not supported on this platform. Fallback to HALT */
 	if (!cpu_has(c, X86_FEATURE_MWAIT))
-		return 0;
+		return false;
 
 	/* Monitor has a bug or APIC stops in C1E. Fallback to HALT */
 	if (boot_cpu_has_bug(X86_BUG_MONITOR) || boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E))
-		return 0;
+		return false;
 
 	cpuid(CPUID_MWAIT_LEAF, &eax, &ebx, &ecx, &edx);
 
@@ -876,13 +876,13 @@ static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
 	 * with EAX=0, ECX=0.
 	 */
 	if (!(ecx & CPUID5_ECX_EXTENSIONS_SUPPORTED))
-		return 1;
+		return true;
 
 	/*
 	 * If MWAIT extensions are available, there should be at least one
 	 * MWAIT C1 substate present.
 	 */
-	return (edx & MWAIT_C1_SUBSTATE_MASK);
+	return !!(edx & MWAIT_C1_SUBSTATE_MASK);
 }
 
 /*
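
For context on the final check: CPUID leaf 5 reports the number of MWAIT
sub-states per C-state in four-bit fields of EDX, with C1 occupying bits
7:4. A small decode sketch, assuming the mask and shift values from
asm/mwait.h:

	#define MWAIT_SUBSTATE_SIZE	4	/* assumed, matches asm/mwait.h */
	#define MWAIT_C1_SUBSTATE_MASK	0xf0	/* assumed, matches asm/mwait.h */

	/* Number of enumerated MWAIT C1 sub-states */
	unsigned int c1 = (edx & MWAIT_C1_SUBSTATE_MASK) >> MWAIT_SUBSTATE_SIZE;

Hence !!(edx & MWAIT_C1_SUBSTATE_MASK) is true when at least one C1
sub-state exists.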

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: x86/core] x86/idle: Cleanup idle_setup()
  2024-02-29 14:23 ` [patch 4/6] x86/idle: Cleanup idle_setup() Thomas Gleixner
  2024-03-01 20:48   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
@ 2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 16:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     f3d7eab7be871d948d896e7021038b092ece687e
Gitweb:        https://git.kernel.org/tip/f3d7eab7be871d948d896e7021038b092ece687e
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Feb 2024 15:23:40 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 04 Mar 2024 17:39:24 +01:00

x86/idle: Cleanup idle_setup()

Updating the static call for x86_idle() from idle_setup() is
counter-intuitive.

Let select_idle_routine() handle it like the other idle choices, which
allows simplifying the idle selection later on.

While at it, rewrite the comments and return a proper error code instead
of -1.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240229142248.455616019@linutronix.de
---
 arch/x86/kernel/process.c | 24 +++++++-----------------
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 792394b..9709959 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -857,8 +857,8 @@ static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
 {
 	u32 eax, ebx, ecx, edx;
 
-	/* User has disallowed the use of MWAIT. Fallback to HALT */
-	if (boot_option_idle_override == IDLE_NOMWAIT)
+	/* If override is enforced on the command line, fall back to HALT. */
+	if (boot_option_idle_override != IDLE_NO_OVERRIDE)
 		return 0;
 
 	/* MWAIT is not supported on this platform. Fallback to HALT */
@@ -976,24 +976,14 @@ static int __init idle_setup(char *str)
 		boot_option_idle_override = IDLE_POLL;
 		cpu_idle_poll_ctrl(true);
 	} else if (!strcmp(str, "halt")) {
-		/*
-		 * When the boot option of idle=halt is added, halt is
-		 * forced to be used for CPU idle. In such case CPU C2/C3
-		 * won't be used again.
-		 * To continue to load the CPU idle driver, don't touch
-		 * the boot_option_idle_override.
-		 */
-		static_call_update(x86_idle, default_idle);
+		/* 'idle=halt' HALT for idle. C-states are disabled. */
 		boot_option_idle_override = IDLE_HALT;
 	} else if (!strcmp(str, "nomwait")) {
-		/*
-		 * If the boot option of "idle=nomwait" is added,
-		 * it means that mwait will be disabled for CPU C1/C2/C3
-		 * states.
-		 */
+		/* 'idle=nomwait' disables MWAIT for idle */
 		boot_option_idle_override = IDLE_NOMWAIT;
-	} else
-		return -1;
+	} else {
+		return -EINVAL;
+	}
 
 	return 0;
 }
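
For reference, the surviving 'idle=' values map directly onto the strcmp()
checks above (the "poll" branch sits just before the first line of this
hunk). A usage sketch for the kernel command line:

	idle=poll	# poll loop in idle, HT performance warning applies
	idle=halt	# force HALT; C-states are disabled
	idle=nomwait	# disable MWAIT for idle

Anything else now fails with -EINVAL instead of the bare -1.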

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: x86/core] x86/idle: Clean up idle selection
  2024-02-29 14:23 ` [patch 3/6] x86/idle: Clean up idle selection Thomas Gleixner
  2024-03-01 20:48   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
@ 2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 16:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     0ab562875c01c91ec8167f8f6593ea61e510fd0a
Gitweb:        https://git.kernel.org/tip/0ab562875c01c91ec8167f8f6593ea61e510fd0a
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 29 Feb 2024 15:23:38 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 04 Mar 2024 17:39:24 +01:00

x86/idle: Clean up idle selection

Clean up the code to make it readable. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240229142248.392017685@linutronix.de
---
 arch/x86/kernel/process.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b86ff0f..792394b 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -910,11 +910,13 @@ static __cpuidle void mwait_idle(void)
 
 void select_idle_routine(const struct cpuinfo_x86 *c)
 {
-#ifdef CONFIG_SMP
-	if (boot_option_idle_override == IDLE_POLL && smp_num_siblings > 1)
-		pr_warn_once("WARNING: polling idle and HT enabled, performance may degrade\n");
-#endif
-	if (x86_idle_set() || boot_option_idle_override == IDLE_POLL)
+	if (boot_option_idle_override == IDLE_POLL) {
+		if (IS_ENABLED(CONFIG_SMP) && smp_num_siblings > 1)
+			pr_warn_once("WARNING: polling idle and HT enabled, performance may degrade\n");
+		return;
+	}
+
+	if (x86_idle_set())
 		return;
 
 	if (prefer_mwait_c1_over_halt(c)) {
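
The notable idiom here is replacing the #ifdef block with IS_ENABLED(),
which keeps the branch visible to the compiler in every configuration. A
sketch of the pattern as used above:

	/* With CONFIG_SMP=n this folds to 'if (0)' and the dead code is
	 * eliminated, while still being parsed and type-checked. */
	if (IS_ENABLED(CONFIG_SMP) && smp_num_siblings > 1)
		pr_warn_once("WARNING: polling idle and HT enabled, performance may degrade\n");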

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: x86/core] x86/idle: Sanitize X86_BUG_AMD_E400 handling
  2024-03-01 12:33   ` Thomas Gleixner
  2024-03-01 20:48     ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
@ 2024-03-04 16:51     ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 22+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-03-04 16:51 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Borislav Petkov, Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     cb81deefb59de01325ab822f900c13941bfaf67f
Gitweb:        https://git.kernel.org/tip/cb81deefb59de01325ab822f900c13941bfaf67f
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 28 Feb 2024 23:13:00 +01:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 04 Mar 2024 17:38:06 +01:00

x86/idle: Sanitize X86_BUG_AMD_E400 handling

amd_e400_idle(), the idle routine for AMD CPUs which are affected by
erratum 400, violates the RCU constraints by invoking tick_broadcast_enter()
and tick_broadcast_exit() after the core code has already marked RCU idle.
These functions can end up in lockdep or tracing, which rightfully triggers
an RCU warning.

The core code now provides a static branch controlled, conditional
invocation of the broadcast functions.

Remove amd_e400_idle(), enforce default_idle() and enable the static branch
on affected CPUs to cure this.

  [ bp: Fold in a fix for an IS_ENABLED() check which was missing the
    "CONFIG_" prefix, spotted by tglx. ]

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/877cim6sis.ffs@tglx
---
 arch/x86/Kconfig          |  1 +-
 arch/x86/kernel/process.c | 42 ++++++++------------------------------
 2 files changed, 10 insertions(+), 33 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5029862..0802653 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,6 +147,7 @@ config X86
 	select EDAC_ATOMIC_SCRUB
 	select EDAC_SUPPORT
 	select GENERIC_CLOCKEVENTS_BROADCAST	if X86_64 || (X86_32 && X86_LOCAL_APIC)
+	select GENERIC_CLOCKEVENTS_BROADCAST_IDLE	if GENERIC_CLOCKEVENTS_BROADCAST
 	select GENERIC_CLOCKEVENTS_MIN_ADJUST
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_AUTOPROBE
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 45a9d49..b86ff0f 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -846,31 +846,6 @@ void __noreturn stop_this_cpu(void *dummy)
 }
 
 /*
- * AMD Erratum 400 aware idle routine. We handle it the same way as C3 power
- * states (local apic timer and TSC stop).
- *
- * XXX this function is completely buggered vs RCU and tracing.
- */
-static void amd_e400_idle(void)
-{
-	/*
-	 * We cannot use static_cpu_has_bug() here because X86_BUG_AMD_APIC_C1E
-	 * gets set after static_cpu_has() places have been converted via
-	 * alternatives.
-	 */
-	if (!boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E)) {
-		default_idle();
-		return;
-	}
-
-	tick_broadcast_enter();
-
-	default_idle();
-
-	tick_broadcast_exit();
-}
-
-/*
  * Prefer MWAIT over HALT if MWAIT is supported, MWAIT_CPUID leaf
  * exists and whenever MONITOR/MWAIT extensions are present there is at
  * least one C1 substate.
@@ -890,8 +865,8 @@ static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
 	if (!cpu_has(c, X86_FEATURE_MWAIT))
 		return 0;
 
-	/* Monitor has a bug. Fallback to HALT */
-	if (boot_cpu_has_bug(X86_BUG_MONITOR))
+	/* Monitor has a bug or APIC stops in C1E. Fallback to HALT */
+	if (boot_cpu_has_bug(X86_BUG_MONITOR) || boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E))
 		return 0;
 
 	cpuid(CPUID_MWAIT_LEAF, &eax, &ebx, &ecx, &edx);
@@ -942,17 +917,15 @@ void select_idle_routine(const struct cpuinfo_x86 *c)
 	if (x86_idle_set() || boot_option_idle_override == IDLE_POLL)
 		return;
 
-	if (boot_cpu_has_bug(X86_BUG_AMD_E400)) {
-		pr_info("using AMD E400 aware idle routine\n");
-		static_call_update(x86_idle, amd_e400_idle);
-	} else if (prefer_mwait_c1_over_halt(c)) {
+	if (prefer_mwait_c1_over_halt(c)) {
 		pr_info("using mwait in idle threads\n");
 		static_call_update(x86_idle, mwait_idle);
 	} else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
 		pr_info("using TDX aware idle routine\n");
 		static_call_update(x86_idle, tdx_safe_halt);
-	} else
+	} else {
 		static_call_update(x86_idle, default_idle);
+	}
 }
 
 void amd_e400_c1e_apic_setup(void)
@@ -985,7 +958,10 @@ void __init arch_post_acpi_subsys_init(void)
 
 	if (!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
 		mark_tsc_unstable("TSC halt in AMD C1E");
-	pr_info("System has AMD C1E enabled\n");
+
+	if (IS_ENABLED(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST_IDLE))
+		static_branch_enable(&arch_needs_tick_broadcast);
+	pr_info("System has AMD C1E erratum E400. Workaround enabled.\n");
 }
 
 static int __init idle_setup(char *str)
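
Taken together with patch 1, the resulting idle sequence on an
E400-affected CPU looks roughly like this (a sketch assembled from the
default_idle_call() hunks shown earlier in this thread):

	/* default_idle_call() with arch_needs_tick_broadcast enabled */
	cond_tick_broadcast_enter();	/* tick_broadcast_enter() before RCU goes idle */
	trace_cpu_idle(1, smp_processor_id());
	stop_critical_timings();
	/* ... RCU marked idle, default_idle() executes HLT ... */
	start_critical_timings();
	trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
	cond_tick_broadcast_exit();	/* after RCU is watching again */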

^ permalink raw reply related	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2024-03-04 16:51 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-29 14:23 [patch 0/6] x86/idle: Cure RCU violations and cleanups Thomas Gleixner
2024-02-29 14:23 ` [patch 1/6] sched/idle: Conditionally handle tick broadcast in default_idle_call() Thomas Gleixner
2024-03-01 20:48   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2024-02-29 14:23 ` [patch 2/6] x86/idle: Sanitize X86_BUG_AMD_E400 handling Thomas Gleixner
2024-03-01 12:33   ` Thomas Gleixner
2024-03-01 20:48     ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2024-03-04 16:51     ` tip-bot2 for Thomas Gleixner
2024-02-29 14:23 ` [patch 3/6] x86/idle: Clean up idle selection Thomas Gleixner
2024-03-01 20:48   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
2024-02-29 14:23 ` [patch 4/6] x86/idle: Cleanup idle_setup() Thomas Gleixner
2024-03-01 20:48   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
2024-02-29 14:23 ` [patch 5/6] x86/idle: Let prefer_mwait_c1_over_halt() return bool Thomas Gleixner
2024-03-01 20:48   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2024-03-04 16:51   ` tip-bot2 for Thomas Gleixner
2024-02-29 14:23 ` [patch 6/6] x86/idle: Select idle routine only once Thomas Gleixner
2024-03-01  8:14   ` Thomas Gleixner
2024-03-01 11:33     ` Thomas Gleixner
2024-03-01 20:48       ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2024-03-04 16:51       ` tip-bot2 for Thomas Gleixner
2024-03-01 18:41 ` [patch 0/6] x86/idle: Cure RCU violations and cleanups Borislav Petkov
