All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch 0/2] x86/hotplug: Cure the MCE wreckage
@ 2018-06-29 14:05 Thomas Gleixner
  2018-06-29 14:05 ` [patch 1/2] Revert "x86/apic: Ignore secondary threads if nosmt=force" Thomas Gleixner
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Thomas Gleixner @ 2018-06-29 14:05 UTC (permalink / raw)
  To: speck

As Dave pointed out, the HT siblings have to be booted up at least once in
order to set the CR4.MCE bit. Otherwise a MCE will cause the Internal
Processor Error to be raised which in turn causes shutdown or reset.

Revert the patch which ignores the siblings in the enumeration and let the
hotplug core boot the siblings up to the initial boot state and then
immediately bring them offline again.

Thanks,

	tglx

8<-------------------
 Documentation/admin-guide/kernel-parameters.txt |    8 --
 arch/x86/include/asm/apic.h                     |    2 
 arch/x86/kernel/acpi/boot.c                     |    3 -
 arch/x86/kernel/apic/apic.c                     |   19 ------
 kernel/cpu.c                                    |   69 +++++++++++++++---------
 5 files changed, 48 insertions(+), 53 deletions(-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [patch 1/2] Revert "x86/apic: Ignore secondary threads if nosmt=force"
  2018-06-29 14:05 [patch 0/2] x86/hotplug: Cure the MCE wreckage Thomas Gleixner
@ 2018-06-29 14:05 ` Thomas Gleixner
  2018-06-29 14:05 ` [patch 2/2] cpu/hotplug: Boot HT siblings at least once Thomas Gleixner
  2018-07-02 12:27 ` [MODERATED] Re: [patch 0/2] x86/hotplug: Cure the MCE wreckage Peter Zijlstra
  2 siblings, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2018-06-29 14:05 UTC (permalink / raw)
  To: speck

From: Thomas Gleixner tglx@linutronix.de
Subject: [patch 1/2] Revert "x86/apic: Ignore secondary threads if nosmt=force"

Dave Hansen reported, that it's outright dangerous to keep SMT siblings
disabled completely so they are stuck in the BIOS and wait for SIPI.

The reason is that Machine Check Exceptions are broadcasted to siblings and
the soft disabled sibling has CR4.MCE = 0. If a MCE is delivered to a
logical core with CR4.MCE = 0, it asserts IERR#, which shuts down or
reboots the machine. The MCE chapter in the SDM contains the following
blurb:

    Because the logical processors within a physical package are tightly
    coupled with respect to shared hardware resources, both logical
    processors are notified of machine check errors that occur within a
    given physical processor. If machine-check exceptions are enabled when
    a fatal error is reported, all the logical processors within a physical
    package are dispatched to the machine-check exception handler. If
    machine-check exceptions are disabled, the logical processors enter the
    shutdown state and assert the IERR# signal. When enabling machine-check
    exceptions, the MCE flag in control register CR4 should be set for each
    logical processor.

Reverting the commit which ignores siblings at enumeration time solves only
half of the problem. The core cpuhotplug logic needs to be adjusted as
well.

This thoughtful engineered mechanism also turns the boot process on all
Intel HT enabled systems into a MCE lottery. MCE is enabled on the boot CPU
before the secondary CPUs are brought up. Depending on the number of
physical cores the window in which this situation can happen is smaller or
larger. On a HSW-EX it's about 750ms:

MCE is enabled on the boot CPU:

[    0.244017] mce: CPU supports 22 MCE banks

The corresponding sibling #72 boots:

[    1.008005] .... node  #0, CPUs:    #72

That means if an MCE hits on physical core 0 (logical CPUs 0 and 72)
between these two points the machine is going to shutdown. At least it's a
known safe state.

It's obvious that the early boot can be hit by an MCE as well and then runs
into the same situation because MCEs are not yet enabled on the boot CPU.
But after enabling them on the boot CPU, it does not make any sense to
prevent the kernel from recovering.

Adjust the nosmt kernel parameter documentation as well.

Reverts: 2207def700f9 ("x86/apic: Ignore secondary threads if nosmt=force")
Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


---
 Documentation/admin-guide/kernel-parameters.txt |    8 ++------
 arch/x86/include/asm/apic.h                     |    2 --
 arch/x86/kernel/acpi/boot.c                     |    3 +--
 arch/x86/kernel/apic/apic.c                     |   19 -------------------
 4 files changed, 3 insertions(+), 29 deletions(-)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2688,12 +2688,8 @@
 			Equivalent to smt=1.
 
 			[KNL,x86] Disable symmetric multithreading (SMT).
-			nosmt=force: Force disable SMT, similar to disabling
-				     it in the BIOS except that some of the
-				     resource partitioning effects which are
-				     caused by having SMT enabled in the BIOS
-				     cannot be undone. Depending on the CPU
-				     type this might have a performance impact.
+			nosmt=force: Force disable SMT, cannot be undone
+				     via the sysfs control file.
 
 	nospectre_v2	[X86] Disable all mitigations for the Spectre variant 2
 			(indirect branch prediction) vulnerability. System may
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -504,10 +504,8 @@ extern int default_check_phys_apicid_pre
 
 #ifdef CONFIG_SMP
 bool apic_id_is_primary_thread(unsigned int id);
-bool apic_id_disabled(unsigned int id);
 #else
 static inline bool apic_id_is_primary_thread(unsigned int id) { return false; }
-static inline bool apic_id_disabled(unsigned int id) { return false; }
 #endif
 
 extern void irq_enter(void);
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -181,8 +181,7 @@ static int acpi_register_lapic(int id, u
 	}
 
 	if (!enabled) {
-		if (!apic_id_disabled(id))
-			++disabled_cpus;
+		++disabled_cpus;
 		return -EINVAL;
 	}
 
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2204,16 +2204,6 @@ bool apic_id_is_primary_thread(unsigned
 	return !(apicid & mask);
 }
 
-/**
- * apic_id_disabled - Check whether APIC ID is disabled via SMT control
- * @id:	APIC ID to check
- */
-bool apic_id_disabled(unsigned int id)
-{
-	return (cpu_smt_control == CPU_SMT_FORCE_DISABLED &&
-		!apic_id_is_primary_thread(id));
-}
-
 /*
  * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
  * and cpuid_to_apicid[] synchronized.
@@ -2309,15 +2299,6 @@ int generic_processor_info(int apicid, i
 		return -EINVAL;
 	}
 
-	/*
-	 * If SMT is force disabled and the APIC ID belongs to
-	 * a secondary thread, ignore it.
-	 */
-	if (apic_id_disabled(apicid)) {
-		pr_info_once("Ignoring secondary SMT threads\n");
-		return -EINVAL;
-	}
-
 	if (apicid == boot_cpu_physical_apicid) {
 		/*
 		 * x86_bios_cpu_apicid is required to have processors listed

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 14:05 [patch 0/2] x86/hotplug: Cure the MCE wreckage Thomas Gleixner
  2018-06-29 14:05 ` [patch 1/2] Revert "x86/apic: Ignore secondary threads if nosmt=force" Thomas Gleixner
@ 2018-06-29 14:05 ` Thomas Gleixner
  2018-06-29 19:13   ` [MODERATED] " Josh Poimboeuf
                     ` (3 more replies)
  2018-07-02 12:27 ` [MODERATED] Re: [patch 0/2] x86/hotplug: Cure the MCE wreckage Peter Zijlstra
  2 siblings, 4 replies; 17+ messages in thread
From: Thomas Gleixner @ 2018-06-29 14:05 UTC (permalink / raw)
  To: speck

From: Thomas Gleixner <tglx@linutronix.de>
Subject: cpu/hotplug: Boot HT siblings at least once

Due to the way Machine Check Exceptions work on X86 hyperthreads it's
required to boot up _all_ logical cores at least once in order to set the
CR4.MCE bit.

So instead of ignoring the sibling threads right away, let them boot up
once so they can configure themself. After they came out of the initial
boot stage check whether its a "secondary" sibling and cancel the operation
which puts the CPU back into offline state.

Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/cpu.c |   69 ++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 45 insertions(+), 24 deletions(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -60,6 +60,7 @@ struct cpuhp_cpu_state {
 	bool			rollback;
 	bool			single;
 	bool			bringup;
+	bool			booted_once;
 	struct hlist_node	*node;
 	struct hlist_node	*last;
 	enum cpuhp_state	cb_state;
@@ -342,6 +343,37 @@ void cpu_hotplug_enable(void)
 EXPORT_SYMBOL_GPL(cpu_hotplug_enable);
 #endif	/* CONFIG_HOTPLUG_CPU */
 
+#ifdef CONFIG_HOTPLUG_SMT
+enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED;
+
+static int __init smt_cmdline_disable(char *str)
+{
+	cpu_smt_control = CPU_SMT_DISABLED;
+	if (str && !strcmp(str, "force")) {
+		pr_info("SMT: Force disabled\n");
+		cpu_smt_control = CPU_SMT_FORCE_DISABLED;
+	}
+	return 0;
+}
+early_param("nosmt", smt_cmdline_disable);
+
+static inline bool cpu_smt_allowed(unsigned int cpu)
+{
+	if (cpu_smt_control == CPU_SMT_ENABLED)
+		return true;
+	if (topology_is_primary_thread(cpu))
+		return true;
+	/*
+	 * X86 requires that the sibling threads are at least booted up
+	 * once to set the CR4.MCE bit so Machine Check Exceptions can be
+	 * handled and do not end up raising the CPU Internal Error line.
+	 */
+	return !per_cpu(cpuhp_state, cpu).booted_once;
+}
+#else
+static inline bool cpu_smt_allowed(unsigned int cpu) { return true; }
+#endif
+
 static inline enum cpuhp_state
 cpuhp_set_state(struct cpuhp_cpu_state *st, enum cpuhp_state target)
 {
@@ -422,6 +454,16 @@ static int bringup_wait_for_ap(unsigned
 	stop_machine_unpark(cpu);
 	kthread_unpark(st->thread);
 
+	/*
+	 * SMT soft disabling on X86 requires to bring the CPU out of the
+	 * BIOS 'wait for SIPI' state in order to set the CR4.MCE bit.  The
+	 * CPU marked itself as booted_once in cpu_notify_starting() so the
+	 * cpu_smt_allowed() check will now return false if this is not the
+	 * primary sibling.
+	 */
+	if (!cpu_smt_allowed(cpu))
+		return -ECANCELED;
+
 	if (st->target <= CPUHP_AP_ONLINE_IDLE)
 		return 0;
 
@@ -933,29 +975,6 @@ EXPORT_SYMBOL(cpu_down);
 #define takedown_cpu		NULL
 #endif /*CONFIG_HOTPLUG_CPU*/
 
-#ifdef CONFIG_HOTPLUG_SMT
-enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED;
-
-static int __init smt_cmdline_disable(char *str)
-{
-	cpu_smt_control = CPU_SMT_DISABLED;
-	if (str && !strcmp(str, "force")) {
-		pr_info("SMT: Force disabled\n");
-		cpu_smt_control = CPU_SMT_FORCE_DISABLED;
-	}
-	return 0;
-}
-early_param("nosmt", smt_cmdline_disable);
-
-static inline bool cpu_smt_allowed(unsigned int cpu)
-{
-	return cpu_smt_control == CPU_SMT_ENABLED ||
-		topology_is_primary_thread(cpu);
-}
-#else
-static inline bool cpu_smt_allowed(unsigned int cpu) { return true; }
-#endif
-
 /**
  * notify_cpu_starting(cpu) - Invoke the callbacks on the starting CPU
  * @cpu: cpu that just started
@@ -970,6 +989,7 @@ void notify_cpu_starting(unsigned int cp
 	int ret;
 
 	rcu_cpu_starting(cpu);	/* Enables RCU usage on this CPU. */
+	st->booted_once = true;
 	while (st->state < target) {
 		st->state++;
 		ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
@@ -2180,5 +2200,6 @@ void __init boot_cpu_init(void)
  */
 void __init boot_cpu_state_init(void)
 {
-	per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE;
+	this_cpu_write(cpuhp_state.booted_once, true);
+	this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
 }

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 14:05 ` [patch 2/2] cpu/hotplug: Boot HT siblings at least once Thomas Gleixner
@ 2018-06-29 19:13   ` Josh Poimboeuf
  2018-06-29 20:25     ` Josh Poimboeuf
  2018-06-29 19:17   ` [MODERATED] " Luck, Tony
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Josh Poimboeuf @ 2018-06-29 19:13 UTC (permalink / raw)
  To: speck

On Fri, Jun 29, 2018 at 04:05:48PM +0200, speck for Thomas Gleixner wrote:
> @@ -422,6 +454,16 @@ static int bringup_wait_for_ap(unsigned
>  	stop_machine_unpark(cpu);
>  	kthread_unpark(st->thread);
>  
> +	/*
> +	 * SMT soft disabling on X86 requires to bring the CPU out of the
> +	 * BIOS 'wait for SIPI' state in order to set the CR4.MCE bit.  The
> +	 * CPU marked itself as booted_once in cpu_notify_starting() so the
> +	 * cpu_smt_allowed() check will now return false if this is not the
> +	 * primary sibling.
> +	 */
> +	if (!cpu_smt_allowed(cpu))
> +		return -ECANCELED;
> +

Shouldn't this be checked *before* unparking the kthreads?  Or does
ECANCELED magically re-park them somehow?

-- 
Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 14:05 ` [patch 2/2] cpu/hotplug: Boot HT siblings at least once Thomas Gleixner
  2018-06-29 19:13   ` [MODERATED] " Josh Poimboeuf
@ 2018-06-29 19:17   ` Luck, Tony
  2018-06-29 19:27     ` Thomas Gleixner
  2018-06-29 20:08   ` [MODERATED] " Josh Poimboeuf
  2018-06-30 16:38   ` Borislav Petkov
  3 siblings, 1 reply; 17+ messages in thread
From: Luck, Tony @ 2018-06-29 19:17 UTC (permalink / raw)
  To: speck

On Fri, Jun 29, 2018 at 04:05:48PM +0200, speck for Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> Subject: cpu/hotplug: Boot HT siblings at least once
> 
> Due to the way Machine Check Exceptions work on X86 hyperthreads it's
> required to boot up _all_ logical cores at least once in order to set the
> CR4.MCE bit.
> 
> So instead of ignoring the sibling threads right away, let them boot up
> once so they can configure themself. After they came out of the initial
> boot stage check whether its a "secondary" sibling and cancel the operation
> which puts the CPU back into offline state.

Having just joined the list. I don't have the full tree to apply this to.

I hacked just this patch on top of v4.18-rc2 and booted with nosmt.

[Is there a place to securely pick up a tarball of patches?  Some of the
 problems below might be a result of my hacking not covering some patch
 bits I don't have.]

Basics are working. dmesg during boot looked like:

[    0.158114] x86: Booting SMP configuration:
[    0.159002] .... node  #0, CPUs:          #1   #2   #3   #4   #5   #6   #7   #8   #9  #10  #11  #12  #13  #14  #15  #16  #17  #18  #19  #20  #21  #22  #23
[    0.388003] .... node  #1, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31  #32  #33  #34  #35  #36  #37  #38  #39  #40  #41  #42  #43  #44  #45  #46  #47
[    0.754003] .... node  #2, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55  #56  #57  #58  #59  #60  #61  #62  #63  #64  #65  #66  #67  #68  #69  #70  #71
[    1.121003] .... node  #3, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79  #80  #81  #82  #83  #84  #85  #86  #87  #88  #89  #90  #91  #92  #93  #94  #95
[    1.430003] .... node  #0, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103 #104 #105 #106 #107 #108 #109 #110 #111 #112 #113 #114 #115 #116 #117 #118 #119
[    1.537008] .... node  #1, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127 #128 #129 #130 #131 #132 #133 #134 #135 #136 #137 #138 #139 #140 #141 #142 #143
[    1.659002] .... node  #2, CPUs:   #144 #145 #146 #147 #148 #149 #150 #151 #152 #153 #154 #155 #156 #157 #158 #159 #160 #161 #162 #163 #164 #165 #166 #167
[    1.784002] .... node  #3, CPUs:   #168 #169 #170 #171 #172 #173 #174 #175 #176 #177 #178 #179 #180 #181 #182 #183 #184 #185 #186 #187 #188 #189 #190 #191
[    1.889073] smp: Brought up 4 nodes, 96 CPUs

so we started bringing up all CPUs, but only half ended up online. Awesome!

Confusingly only five of the 96 offline CPUs reported they were offline:

[   26.528492] smpboot: CPU 100 is now offline
[   35.081182] smpboot: CPU 97 is now offline
[   35.122864] smpboot: CPU 99 is now offline
[   35.170483] smpboot: CPU 96 is now offline
[   35.239174] smpboot: CPU 98 is now offline

Also a bit odd that these messages were delayed so long.  It might be better
if we didn't print this for any of them ... just useless noise?

I injected an uncorrected error and consumed it to generate a broadcast machine
check.  That worked perfectly.  Hurrah!

I tried to bring CPU 96 online:

# echo 1 > /sys/devices/system/cpu/cpu96/online

but it tried, and failed:

[  318.998632] smpboot: Booting Node 0 Processor 96 APIC 0x1
[  319.023475] smpboot: CPU 96 is now offline

-Tony

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 19:17   ` [MODERATED] " Luck, Tony
@ 2018-06-29 19:27     ` Thomas Gleixner
  2018-06-29 20:57       ` [MODERATED] " Luck, Tony
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Gleixner @ 2018-06-29 19:27 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 279 bytes --]

On Fri, 29 Jun 2018, speck for Luck, Tony wrote:
> [Is there a place to securely pick up a tarball of patches?  Some of the
>  problems below might be a result of my hacking not covering some patch
>  bits I don't have.]

Git bundle of the lot which is in the git repo attached.

[-- Attachment #2: Type: application/octet-stream, Size: 39687 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 14:05 ` [patch 2/2] cpu/hotplug: Boot HT siblings at least once Thomas Gleixner
  2018-06-29 19:13   ` [MODERATED] " Josh Poimboeuf
  2018-06-29 19:17   ` [MODERATED] " Luck, Tony
@ 2018-06-29 20:08   ` Josh Poimboeuf
  2018-06-30 16:38   ` Borislav Petkov
  3 siblings, 0 replies; 17+ messages in thread
From: Josh Poimboeuf @ 2018-06-29 20:08 UTC (permalink / raw)
  To: speck

> @@ -422,6 +454,16 @@ static int bringup_wait_for_ap(unsigned
>  	stop_machine_unpark(cpu);
>  	kthread_unpark(st->thread);
>  
> +	/*
> +	 * SMT soft disabling on X86 requires to bring the CPU out of the
> +	 * BIOS 'wait for SIPI' state in order to set the CR4.MCE bit.  The
> +	 * CPU marked itself as booted_once in cpu_notify_starting() so the

s/cpu_notify_starting/notify_cpu_starting

-- 
Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 19:13   ` [MODERATED] " Josh Poimboeuf
@ 2018-06-29 20:25     ` Josh Poimboeuf
  2018-06-29 20:52       ` Josh Poimboeuf
  0 siblings, 1 reply; 17+ messages in thread
From: Josh Poimboeuf @ 2018-06-29 20:25 UTC (permalink / raw)
  To: speck

On Fri, Jun 29, 2018 at 02:13:57PM -0500, Josh Poimboeuf wrote:
> On Fri, Jun 29, 2018 at 04:05:48PM +0200, speck for Thomas Gleixner wrote:
> > @@ -422,6 +454,16 @@ static int bringup_wait_for_ap(unsigned
> >  	stop_machine_unpark(cpu);
> >  	kthread_unpark(st->thread);
> >  
> > +	/*
> > +	 * SMT soft disabling on X86 requires to bring the CPU out of the
> > +	 * BIOS 'wait for SIPI' state in order to set the CR4.MCE bit.  The
> > +	 * CPU marked itself as booted_once in cpu_notify_starting() so the
> > +	 * cpu_smt_allowed() check will now return false if this is not the
> > +	 * primary sibling.
> > +	 */
> > +	if (!cpu_smt_allowed(cpu))
> > +		return -ECANCELED;
> > +
> 
> Shouldn't this be checked *before* unparking the kthreads?  Or does
> ECANCELED magically re-park them somehow?

Oh, I think I see it now.  Looks like the state machine will call
takedown_cpu() in the CPUHP_TEARDOWN_CPU state on the way down.  A short
comment to that effect would help neophyte code readers like myself.

I suppose the 'WARN_ON_ONCE((!cpu_online(cpu)))' check+return should
also be done after unparking the threads, to prevent "double parking" in
the error path?

-- 
Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 20:25     ` Josh Poimboeuf
@ 2018-06-29 20:52       ` Josh Poimboeuf
  2018-06-29 21:47         ` Thomas Gleixner
  0 siblings, 1 reply; 17+ messages in thread
From: Josh Poimboeuf @ 2018-06-29 20:52 UTC (permalink / raw)
  To: speck

On Fri, Jun 29, 2018 at 03:25:22PM -0500, Josh Poimboeuf wrote:
> On Fri, Jun 29, 2018 at 02:13:57PM -0500, Josh Poimboeuf wrote:
> > On Fri, Jun 29, 2018 at 04:05:48PM +0200, speck for Thomas Gleixner wrote:
> > > @@ -422,6 +454,16 @@ static int bringup_wait_for_ap(unsigned
> > >  	stop_machine_unpark(cpu);
> > >  	kthread_unpark(st->thread);
> > >  
> > > +	/*
> > > +	 * SMT soft disabling on X86 requires to bring the CPU out of the
> > > +	 * BIOS 'wait for SIPI' state in order to set the CR4.MCE bit.  The
> > > +	 * CPU marked itself as booted_once in cpu_notify_starting() so the
> > > +	 * cpu_smt_allowed() check will now return false if this is not the
> > > +	 * primary sibling.
> > > +	 */
> > > +	if (!cpu_smt_allowed(cpu))
> > > +		return -ECANCELED;
> > > +
> > 
> > Shouldn't this be checked *before* unparking the kthreads?  Or does
> > ECANCELED magically re-park them somehow?
> 
> Oh, I think I see it now.  Looks like the state machine will call
> takedown_cpu() in the CPUHP_TEARDOWN_CPU state on the way down.  A short
> comment to that effect would help neophyte code readers like myself.

But... this code unparks both threads, whereas takedown_cpu() only
re-parks the cpuhp thread.  So it looks like the stopper thread never
gets re-parked?

It also looks like the hotplug thread gets unparked twice on the way up:
once in bringup_wait_for_ap() and once in smpboot_unpark_threads().
Maybe that's harmless though.

> I suppose the 'WARN_ON_ONCE((!cpu_online(cpu)))' check+return should
> also be done after unparking the threads, to prevent "double parking" in
> the error path?

-- 
Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 19:27     ` Thomas Gleixner
@ 2018-06-29 20:57       ` Luck, Tony
  2018-06-29 21:01         ` [MODERATED] " Borislav Petkov
  2018-06-29 21:48         ` Thomas Gleixner
  0 siblings, 2 replies; 17+ messages in thread
From: Luck, Tony @ 2018-06-29 20:57 UTC (permalink / raw)
  To: speck

On Fri, Jun 29, 2018 at 09:27:12PM +0200, speck for Thomas Gleixner wrote:
> Git bundle of the lot which is in the git repo attached.

Ok. Did a fetch from that bundle, and then applied patches 1 & 2.

Booted with "nosmt".

The weirdness with 5 of the HT threads reporting offline went away.

System did boot with just 96 of 192 CPUs online. They are the right
ones (all the apicids in /proc/cpuinfo are even, so thread 0 on each core).

Broadcast recoverable machine check works and recovers neatly.

Just one issue.  Even though I booted with "nosmt" rather than
"nosmt=force", I still can't bring the offline threads online:

# echo 1 > /sys/devices/system/cpu/cpu96/online
bash: echo: write error: Operation not permitted


I can take cpu95 offline and bring it back again.

-Tony

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 20:57       ` [MODERATED] " Luck, Tony
@ 2018-06-29 21:01         ` Borislav Petkov
  2018-06-29 21:07           ` Luck, Tony
  2018-06-29 21:48         ` Thomas Gleixner
  1 sibling, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2018-06-29 21:01 UTC (permalink / raw)
  To: speck

On Fri, Jun 29, 2018 at 01:57:52PM -0700, speck for Luck, Tony wrote:
> Just one issue.  Even though I booted with "nosmt" rather than
> "nosmt=force", I still can't bring the offline threads online:
> 
> # echo 1 > /sys/devices/system/cpu/cpu96/online
> bash: echo: write error: Operation not permitted

Did you echo "on" into the smt/control file before trying to online
them?

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 21:01         ` [MODERATED] " Borislav Petkov
@ 2018-06-29 21:07           ` Luck, Tony
  0 siblings, 0 replies; 17+ messages in thread
From: Luck, Tony @ 2018-06-29 21:07 UTC (permalink / raw)
  To: speck

On Fri, Jun 29, 2018 at 11:01:17PM +0200, speck for Borislav Petkov wrote:
> On Fri, Jun 29, 2018 at 01:57:52PM -0700, speck for Luck, Tony wrote:
> > Just one issue.  Even though I booted with "nosmt" rather than
> > "nosmt=force", I still can't bring the offline threads online:
> > 
> > # echo 1 > /sys/devices/system/cpu/cpu96/online
> > bash: echo: write error: Operation not permitted
> 
> Did you echo "on" into the smt/control file before trying to online
> them?

No.

But now I know about it I did that, and the the online of cpu96
worked just fine.

So this all seems to be working fine.

Tested-by: Tony Luck <tony.luck@intel.com>

Thanks

-Tony

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 20:52       ` Josh Poimboeuf
@ 2018-06-29 21:47         ` Thomas Gleixner
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2018-06-29 21:47 UTC (permalink / raw)
  To: speck

On Fri, 29 Jun 2018, speck for Josh Poimboeuf wrote:
> On Fri, Jun 29, 2018 at 03:25:22PM -0500, Josh Poimboeuf wrote:
> > On Fri, Jun 29, 2018 at 02:13:57PM -0500, Josh Poimboeuf wrote:
> > > On Fri, Jun 29, 2018 at 04:05:48PM +0200, speck for Thomas Gleixner wrote:
> > > > @@ -422,6 +454,16 @@ static int bringup_wait_for_ap(unsigned
> > > >  	stop_machine_unpark(cpu);
> > > >  	kthread_unpark(st->thread);
> > > >  
> > > > +	/*
> > > > +	 * SMT soft disabling on X86 requires to bring the CPU out of the
> > > > +	 * BIOS 'wait for SIPI' state in order to set the CR4.MCE bit.  The
> > > > +	 * CPU marked itself as booted_once in cpu_notify_starting() so the
> > > > +	 * cpu_smt_allowed() check will now return false if this is not the
> > > > +	 * primary sibling.
> > > > +	 */
> > > > +	if (!cpu_smt_allowed(cpu))
> > > > +		return -ECANCELED;
> > > > +
> > > 
> > > Shouldn't this be checked *before* unparking the kthreads?  Or does
> > > ECANCELED magically re-park them somehow?
> > 
> > Oh, I think I see it now.  Looks like the state machine will call
> > takedown_cpu() in the CPUHP_TEARDOWN_CPU state on the way down.  A short
> > comment to that effect would help neophyte code readers like myself.
> 
> But... this code unparks both threads, whereas takedown_cpu() only
> re-parks the cpuhp thread.  So it looks like the stopper thread never
> gets re-parked?

The stopper thread is self parking.

> It also looks like the hotplug thread gets unparked twice on the way up:
> once in bringup_wait_for_ap() and once in smpboot_unpark_threads().
> Maybe that's harmless though.

Yes, it is.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 20:57       ` [MODERATED] " Luck, Tony
  2018-06-29 21:01         ` [MODERATED] " Borislav Petkov
@ 2018-06-29 21:48         ` Thomas Gleixner
  1 sibling, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2018-06-29 21:48 UTC (permalink / raw)
  To: speck

On Fri, 29 Jun 2018, speck for Luck, Tony wrote:
> On Fri, Jun 29, 2018 at 09:27:12PM +0200, speck for Thomas Gleixner wrote:
> > Git bundle of the lot which is in the git repo attached.
> 
> Ok. Did a fetch from that bundle, and then applied patches 1 & 2.
> 
> Booted with "nosmt".
> 
> The weirdness with 5 of the HT threads reporting offline went away.
> 
> System did boot with just 96 of 192 CPUs online. They are the right
> ones (all the apicids in /proc/cpuinfo are even, so thread 0 on each core).
> 
> Broadcast recoverable machine check works and recovers neatly.
> 
> Just one issue.  Even though I booted with "nosmt" rather than
> "nosmt=force", I still can't bring the offline threads online:
> 
> # echo 1 > /sys/devices/system/cpu/cpu96/online
> bash: echo: write error: Operation not permitted

Hrm. I'll have a look tomorrow with brain awake.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [patch 2/2] cpu/hotplug: Boot HT siblings at least once
  2018-06-29 14:05 ` [patch 2/2] cpu/hotplug: Boot HT siblings at least once Thomas Gleixner
                     ` (2 preceding siblings ...)
  2018-06-29 20:08   ` [MODERATED] " Josh Poimboeuf
@ 2018-06-30 16:38   ` Borislav Petkov
  3 siblings, 0 replies; 17+ messages in thread
From: Borislav Petkov @ 2018-06-30 16:38 UTC (permalink / raw)
  To: speck

On Fri, Jun 29, 2018 at 04:05:48PM +0200, speck for Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> Subject: cpu/hotplug: Boot HT siblings at least once
> 
> Due to the way Machine Check Exceptions work on X86 hyperthreads it's
> required to boot up _all_ logical cores at least once in order to set the
> CR4.MCE bit.
> 
> So instead of ignoring the sibling threads right away, let them boot up
> once so they can configure themself. After they came out of the initial

			    themselves.

> boot stage check whether its a "secondary" sibling and cancel the operation
> which puts the CPU back into offline state.
> 
> Reported-by: Dave Hansen <dave.hansen@intel.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/cpu.c |   69 ++++++++++++++++++++++++++++++++++++++---------------------
>  1 file changed, 45 insertions(+), 24 deletions(-)
> 
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -60,6 +60,7 @@ struct cpuhp_cpu_state {
>  	bool			rollback;
>  	bool			single;
>  	bool			bringup;
> +	bool			booted_once;

Might wanna turn all those bools into a bitfield at some point, as a
cleanup.

>  	struct hlist_node	*node;
>  	struct hlist_node	*last;
>  	enum cpuhp_state	cb_state;
> @@ -342,6 +343,37 @@ void cpu_hotplug_enable(void)
>  EXPORT_SYMBOL_GPL(cpu_hotplug_enable);
>  #endif	/* CONFIG_HOTPLUG_CPU */
>  
> +#ifdef CONFIG_HOTPLUG_SMT
> +enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED;
> +
> +static int __init smt_cmdline_disable(char *str)
> +{
> +	cpu_smt_control = CPU_SMT_DISABLED;
> +	if (str && !strcmp(str, "force")) {
> +		pr_info("SMT: Force disabled\n");
> +		cpu_smt_control = CPU_SMT_FORCE_DISABLED;
> +	}
> +	return 0;
> +}
> +early_param("nosmt", smt_cmdline_disable);
> +
> +static inline bool cpu_smt_allowed(unsigned int cpu)
> +{
> +	if (cpu_smt_control == CPU_SMT_ENABLED)
> +		return true;

<---- newline here.

> +	if (topology_is_primary_thread(cpu))
> +		return true;

<---- newline here.

It looks better with newlines spacing them out. :)

> +	/*
> +	 * X86 requires that the sibling threads are at least booted up
> +	 * once to set the CR4.MCE bit so Machine Check Exceptions can be

Let's rewrite that:

"Boot all logical CPUs at least once so that the init code can get a chance to
set CR4.MCE on each CPU. Otherwise, a broadacasted MCE observing CR4.MCE=0b on
any core will shutdown the machine."

> +	 * handled and do not end up raising the CPU Internal Error line.
> +	 */
> +	return !per_cpu(cpuhp_state, cpu).booted_once;
> +}
> +#else
> +static inline bool cpu_smt_allowed(unsigned int cpu) { return true; }
> +#endif
> +
>  static inline enum cpuhp_state
>  cpuhp_set_state(struct cpuhp_cpu_state *st, enum cpuhp_state target)
>  {

...

> @@ -970,6 +989,7 @@ void notify_cpu_starting(unsigned int cp
>  	int ret;
>  
>  	rcu_cpu_starting(cpu);	/* Enables RCU usage on this CPU. */
> +	st->booted_once = true;
>  	while (st->state < target) {
>  		st->state++;
>  		ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
> @@ -2180,5 +2200,6 @@ void __init boot_cpu_init(void)
>   */
>  void __init boot_cpu_state_init(void)
>  {
> -	per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE;
> +	this_cpu_write(cpuhp_state.booted_once, true);

I guess it doesn't matter if set ->booted_once on the BSP too - it is a
primary thread, I'd strongly hope. But it is better this way as we don't
have to special-case it.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [patch 0/2] x86/hotplug: Cure the MCE wreckage
  2018-06-29 14:05 [patch 0/2] x86/hotplug: Cure the MCE wreckage Thomas Gleixner
  2018-06-29 14:05 ` [patch 1/2] Revert "x86/apic: Ignore secondary threads if nosmt=force" Thomas Gleixner
  2018-06-29 14:05 ` [patch 2/2] cpu/hotplug: Boot HT siblings at least once Thomas Gleixner
@ 2018-07-02 12:27 ` Peter Zijlstra
  2018-07-02 13:24   ` Thomas Gleixner
  2 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2018-07-02 12:27 UTC (permalink / raw)
  To: speck

On Fri, Jun 29, 2018 at 04:05:46PM +0200, speck for Thomas Gleixner wrote:
> As Dave pointed out, the HT siblings have to be booted up at least once in
> order to set the CR4.MCE bit. Otherwise a MCE will cause the Internal
> Processor Error to be raised which in turn causes shutdown or reset.

Urghh.. but doesn't this also mean that things like nosmp/maxcpus= are
currently broken?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [patch 0/2] x86/hotplug: Cure the MCE wreckage
  2018-07-02 12:27 ` [MODERATED] Re: [patch 0/2] x86/hotplug: Cure the MCE wreckage Peter Zijlstra
@ 2018-07-02 13:24   ` Thomas Gleixner
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2018-07-02 13:24 UTC (permalink / raw)
  To: speck

On Mon, 2 Jul 2018, speck for Peter Zijlstra wrote:

> On Fri, Jun 29, 2018 at 04:05:46PM +0200, speck for Thomas Gleixner wrote:
> > As Dave pointed out, the HT siblings have to be booted up at least once in
> > order to set the CR4.MCE bit. Otherwise a MCE will cause the Internal
> > Processor Error to be raised which in turn causes shutdown or reset.
> 
> Urghh.. but doesn't this also mean that things like nosmp/maxcpus= are
> currently broken?

Of course.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-07-02 13:25 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-29 14:05 [patch 0/2] x86/hotplug: Cure the MCE wreckage Thomas Gleixner
2018-06-29 14:05 ` [patch 1/2] Revert "x86/apic: Ignore secondary threads if nosmt=force" Thomas Gleixner
2018-06-29 14:05 ` [patch 2/2] cpu/hotplug: Boot HT siblings at least once Thomas Gleixner
2018-06-29 19:13   ` [MODERATED] " Josh Poimboeuf
2018-06-29 20:25     ` Josh Poimboeuf
2018-06-29 20:52       ` Josh Poimboeuf
2018-06-29 21:47         ` Thomas Gleixner
2018-06-29 19:17   ` [MODERATED] " Luck, Tony
2018-06-29 19:27     ` Thomas Gleixner
2018-06-29 20:57       ` [MODERATED] " Luck, Tony
2018-06-29 21:01         ` [MODERATED] " Borislav Petkov
2018-06-29 21:07           ` Luck, Tony
2018-06-29 21:48         ` Thomas Gleixner
2018-06-29 20:08   ` [MODERATED] " Josh Poimboeuf
2018-06-30 16:38   ` Borislav Petkov
2018-07-02 12:27 ` [MODERATED] Re: [patch 0/2] x86/hotplug: Cure the MCE wreckage Peter Zijlstra
2018-07-02 13:24   ` Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.