* [patch V4 00/13] SMT control knobs
@ 2018-06-20 20:19 Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 01/13] sched/smt: Update sched_smt_present at runtime Thomas Gleixner
                   ` (14 more replies)
  0 siblings, 15 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

V4 of the SMT control knob series. It addresses the review feedback on
V3 plus:

 - Added Borislav's patch which removes the weird CPUID leaf check and
   adapted the early sibling check accordingly

 - Borislav noticed that the sys/..../cpuN/online file is stale after
   writing 'off' to the smt/control file. Added a workaround which might be
   replaced when Greg either has a better idea or starts to yell murder.

 - Updated the changelog of the last patch.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 01/13] sched/smt: Update sched_smt_present at runtime
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 02/13] x86/smp: Provide topology_is_primary_thread() Thomas Gleixner
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 01/13] sched/smt: Update sched_smt_present at runtime
From: Peter Zijlstra <peterz@infradead.org>

The static key sched_smt_present is only updated at boot time when SMT
siblings have been detected. Booting with maxcpus=1 and bringing the
siblings online after boot rebuilds the scheduling domains correctly but
does not update the static key, so the SMT code is not enabled.

Let the key be updated in the scheduler CPU hotplug code to fix this.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
V3: Fixed changelog

 kernel/sched/core.c |   30 ++++++++++++------------------
 kernel/sched/fair.c |    1 +
 2 files changed, 13 insertions(+), 18 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5777,6 +5777,18 @@ int sched_cpu_activate(unsigned int cpu)
 	struct rq *rq = cpu_rq(cpu);
 	struct rq_flags rf;
 
+#ifdef CONFIG_SCHED_SMT
+	/*
+	 * The sched_smt_present static key needs to be evaluated on every
+	 * hotplug event because at boot time SMT might be disabled when
+	 * the number of booted CPUs is limited.
+	 *
+	 * If then later a sibling gets hotplugged, then the key would stay
+	 * off and SMT scheduling would never be functional.
+	 */
+	if (cpumask_weight(cpu_smt_mask(cpu)) > 1)
+		static_branch_enable_cpuslocked(&sched_smt_present);
+#endif
 	set_cpu_active(cpu, true);
 
 	if (sched_smp_initialized) {
@@ -5874,22 +5886,6 @@ int sched_cpu_dying(unsigned int cpu)
 }
 #endif
 
-#ifdef CONFIG_SCHED_SMT
-DEFINE_STATIC_KEY_FALSE(sched_smt_present);
-
-static void sched_init_smt(void)
-{
-	/*
-	 * We've enumerated all CPUs and will assume that if any CPU
-	 * has SMT siblings, CPU0 will too.
-	 */
-	if (cpumask_weight(cpu_smt_mask(0)) > 1)
-		static_branch_enable(&sched_smt_present);
-}
-#else
-static inline void sched_init_smt(void) { }
-#endif
-
 void __init sched_init_smp(void)
 {
 	sched_init_numa();
@@ -5911,8 +5907,6 @@ void __init sched_init_smp(void)
 	init_sched_rt_class();
 	init_sched_dl_class();
 
-	sched_init_smt();
-
 	sched_smp_initialized = true;
 }
 
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6238,6 +6238,7 @@ static inline int find_idlest_cpu(struct
 }
 
 #ifdef CONFIG_SCHED_SMT
+DEFINE_STATIC_KEY_FALSE(sched_smt_present);
 
 static inline void set_idle_cores(int cpu, int val)
 {

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 02/13] x86/smp: Provide topology_is_primary_thread()
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 01/13] sched/smt: Update sched_smt_present at runtime Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-27 13:56   ` [MODERATED] " Josh Poimboeuf
  2018-06-20 20:19 ` [patch V4 03/13] cpu/hotplug: Make bringup/teardown of smp threads symmetric Thomas Gleixner
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 02/13] x86/smp: Provide topology_is_primary_thread()
From: Thomas Gleixner <tglx@linutronix.de>

If the CPU supports SMT then the primary thread can be found by checking
whether the lower APIC ID bits are zero. smp_num_siblings is used to build
the mask for the APIC ID bits which need to be taken into account.

This uses the APIC ID supplied by the MP table or ACPI/MADT, which can
differ from the initial APIC ID in CPUID. But according to AMD the lower
bits have to be consistent. Intel gave a tentative confirmation as well.

Preparatory patch to support disabling SMT at boot/runtime.
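
For illustration only (not part of the patch): a user-space sketch of the
mask logic described above, with the kernel's fls() replaced by a compiler
builtin. With smp_num_siblings == 2 the mask is 0x1, so even APIC IDs are
primary threads and odd ones are secondary.

#include <stdio.h>

/* Minimal stand-in for the kernel's fls(): index of the last set bit */
static unsigned int fls(unsigned int x)
{
        return x ? 32 - (unsigned int)__builtin_clz(x) : 0;
}

static int is_primary_thread(unsigned int apicid, unsigned int smp_num_siblings)
{
        unsigned int mask;

        if (smp_num_siblings == 1)
                return 1;
        /* Isolate the SMT bit(s) in the APIC ID and check for 0 */
        mask = (1U << (fls(smp_num_siblings) - 1)) - 1;
        return !(apicid & mask);
}

int main(void)
{
        unsigned int id;

        for (id = 0; id < 8; id++)
                printf("APIC ID %u: %s\n", id,
                       is_primary_thread(id, 2) ? "primary" : "secondary");
        return 0;
}

Compiled and run, this prints "primary" for APIC IDs 0, 2, 4, 6 and
"secondary" for the odd ones, matching the rule stated above.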

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/include/asm/apic.h     |    6 ++++++
 arch/x86/include/asm/topology.h |    4 +++-
 arch/x86/kernel/apic/apic.c     |   15 +++++++++++++++
 arch/x86/kernel/smpboot.c       |    9 +++++++++
 4 files changed, 33 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -502,6 +502,12 @@ extern int default_check_phys_apicid_pre
 
 #endif /* CONFIG_X86_LOCAL_APIC */
 
+#ifdef CONFIG_SMP
+bool apic_id_is_primary_thread(unsigned int id);
+#else
+static inline bool apic_id_is_primary_thread(unsigned int id) { return false; }
+#endif
+
 extern void irq_enter(void);
 extern void irq_exit(void);
 
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -123,13 +123,15 @@ static inline int topology_max_smt_threa
 }
 
 int topology_update_package_map(unsigned int apicid, unsigned int cpu);
-extern int topology_phys_to_logical_pkg(unsigned int pkg);
+int topology_phys_to_logical_pkg(unsigned int pkg);
+bool topology_is_primary_thread(unsigned int cpu);
 #else
 #define topology_max_packages()			(1)
 static inline int
 topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
 static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
 static inline int topology_max_smt_threads(void) { return 1; }
+static inline bool topology_is_primary_thread(unsigned int cpu) { return true; ]
 #endif
 
 static inline void arch_fix_phys_package_id(int num, u32 slot)
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2189,6 +2189,21 @@ static int cpuid_to_apicid[] = {
 	[0 ... NR_CPUS - 1] = -1,
 };
 
+/**
+ * apic_id_is_primary_thread - Check whether APIC ID belongs to a primary thread
+ * @id:	APIC ID to check
+ */
+bool apic_id_is_primary_thread(unsigned int apicid)
+{
+	u32 mask;
+
+	if (smp_num_siblings == 1)
+		return true;
+	/* Isolate the SMT bit(s) in the APICID and check for 0 */
+	mask = (1U << (fls(smp_num_siblings) - 1)) - 1;
+	return !(apicid & mask);
+}
+
 /*
  * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
  * and cpuid_to_apicid[] synchronized.
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -266,6 +266,15 @@ static void notrace start_secondary(void
 }
 
 /**
+ * topology_is_primary_thread - Check whether CPU is the primary SMT thread
+ * @cpu:	CPU to check
+ */
+bool topology_is_primary_thread(unsigned int cpu)
+{
+	return apic_id_is_primary_thread(per_cpu(x86_cpu_to_apicid, cpu));
+}
+
+/**
  * topology_phys_to_logical_pkg - Map a physical package id to a logical
  *
  * Returns logical package id or -1 if not found

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 03/13] cpu/hotplug: Make bringup/teardown of smp threads symmetric
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 01/13] sched/smt: Update sched_smt_present at runtime Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 02/13] x86/smp: Provide topology_is_primary_thread() Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 04/13] cpu/hotplug: Split do_cpu_down() Thomas Gleixner
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 03/13] cpu/hotplug: Make bringup/teardown of smp threads symmetric
From: Thomas Gleixner <tglx@linutronix.de>

The asymmetry caused a warning to trigger if the bootup was stopped in state
CPUHP_AP_ONLINE_IDLE. The warning no longer triggers as kthread_park() can
now be invoked on already or still parked threads. But there is still no
reason to have this be asymmetric.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 kernel/cpu.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -754,7 +754,6 @@ static int takedown_cpu(unsigned int cpu
 
 	/* Park the smpboot threads */
 	kthread_park(per_cpu_ptr(&cpuhp_state, cpu)->thread);
-	smpboot_park_threads(cpu);
 
 	/*
 	 * Prevent irq alloc/free while the dying cpu reorganizes the
@@ -1332,7 +1331,7 @@ static struct cpuhp_step cpuhp_hp_states
 	[CPUHP_AP_SMPBOOT_THREADS] = {
 		.name			= "smpboot/threads:online",
 		.startup.single		= smpboot_unpark_threads,
-		.teardown.single	= NULL,
+		.teardown.single	= smpboot_park_threads,
 	},
 	[CPUHP_AP_IRQ_AFFINITY_ONLINE] = {
 		.name			= "irq/affinity:online",

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 04/13] cpu/hotplug: Split do_cpu_down()
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (2 preceding siblings ...)
  2018-06-20 20:19 ` [patch V4 03/13] cpu/hotplug: Make bringup/teardown of smp threads symmetric Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 05/13] cpu/hotplug: Provide knobs to control SMT Thomas Gleixner
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 04/13] cpu/hotplug: Split do_cpu_down()
From: Thomas Gleixner <tglx@linutronix.de>

Split out the inner workings of do_cpu_down() to allow reuse of that
function for the upcoming SMT disabling mechanism.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 kernel/cpu.c |   17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -906,20 +906,19 @@ static int __ref _cpu_down(unsigned int
 	return ret;
 }
 
+static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
+{
+	if (cpu_hotplug_disabled)
+		return -EBUSY;
+	return _cpu_down(cpu, 0, target);
+}
+
 static int do_cpu_down(unsigned int cpu, enum cpuhp_state target)
 {
 	int err;
 
 	cpu_maps_update_begin();
-
-	if (cpu_hotplug_disabled) {
-		err = -EBUSY;
-		goto out;
-	}
-
-	err = _cpu_down(cpu, 0, target);
-
-out:
+	err = cpu_down_maps_locked(cpu, target);
 	cpu_maps_update_done();
 	return err;
 }

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 05/13] cpu/hotplug: Provide knobs to control SMT
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (3 preceding siblings ...)
  2018-06-20 20:19 ` [patch V4 04/13] cpu/hotplug: Split do_cpu_down() Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 23:05   ` [MODERATED] " Greg KH
  2018-06-20 20:19 ` [patch V4 06/13] x86/cpu: Remove the pointless CPU printout Thomas Gleixner
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 05/13] cpu/hotplug: Provide knobs to control SMT
From: Thomas Gleixner <tglx@linutronix.de>

Provide a command line and a sysfs knob to control SMT.

The command line options are:

 'nosmt':	Enumerate secondary threads, but do not online them
 		
 'nosmt=force': Ignore secondary threads completely during enumeration
 		via MP table and ACPI/MADT.

The sysfs control file has the following states (read/write):

 'on':		SMT is enabled. Secondary threads can be freely onlined.
 'off':		SMT is disabled. Secondary threads, even if enumerated,
 		cannot be onlined.
 'forceoff':	SMT is permanently disabled. Writes to the control
 		file are rejected.

The command line option 'nosmt' sets the sysfs control to 'off'. This
can be changed to 'on' to reenable SMT during runtime.

The command line option 'nosmt=force' sets the sysfs control to
'forceoff'. This cannot be changed during runtime.

When SMT is 'on' and the control file is changed to 'off' then all online
secondary threads are offlined and attempts to online a secondary thread
later on are rejected.

When SMT is 'off' and the control file is changed to 'on' then secondary
threads can be onlined again. The 'off' -> 'on' transition does not
automatically online the secondary threads.

When the control file is set to 'forceoff', the behaviour is the same as
setting it to 'off', but the operation is irreversible and later writes to
the control file are rejected.
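
For illustration only (not part of the patch): a minimal user-space sketch
of driving the control file described above. The path comes from the sysfs
ABI addition below; error handling is trimmed and the program assumes it
runs as root on a kernel with this series applied.

#include <stdio.h>

int main(void)
{
        char state[16] = "";
        FILE *f = fopen("/sys/devices/system/cpu/smt/control", "r+");

        if (!f) {
                perror("smt/control");
                return 1;
        }
        if (fgets(state, sizeof(state), f))
                printf("current SMT control state: %s", state);

        /* Offline all secondary threads; rejected once 'forceoff' is set */
        rewind(f);
        if (fputs("off\n", f) == EOF || fflush(f) == EOF)
                perror("writing 'off'");

        fclose(f);
        return 0;
}

The same can of course be done from a shell with a plain
'echo off > /sys/devices/system/cpu/smt/control'.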

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
V3: Changelog grammar update, Minor tweaks as requested by Borislav
V2: Rewrote changelog. Make the enum named and use it for the state string
    array. Rename the evaluation function to cpu_smt_allowed().

 Documentation/ABI/testing/sysfs-devices-system-cpu |   16 ++
 Documentation/admin-guide/kernel-parameters.txt    |    8 +
 arch/Kconfig                                       |    3 
 arch/x86/Kconfig                                   |    1 
 include/linux/cpu.h                                |   12 +
 kernel/cpu.c                                       |  155 +++++++++++++++++++++
 6 files changed, 195 insertions(+)

--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -487,3 +487,19 @@ Description:	Information about CPU vulne
 		"Not affected"	  CPU is not affected by the vulnerability
 		"Vulnerable"	  CPU is affected and no mitigation in effect
 		"Mitigation: $M"  CPU is affected and mitigation $M is in effect
+
+What:		/sys/devices/system/cpu/smt
+		/sys/devices/system/cpu/smt/active
+		/sys/devices/system/cpu/smt/control
+Date:		June 2018
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description:	Control Symmetric Multi Threading (SMT)
+
+		active:  Tells whether SMT is active (enabled and siblings online)
+
+		control: Read/write interface to control SMT. Possible
+			 values:
+
+			 "on"		SMT is enabled
+			 "off"		SMT is disabled
+			 "forceoff"	SMT is force disabled. Cannot be changed.
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2687,6 +2687,14 @@
 	nosmt		[KNL,S390] Disable symmetric multithreading (SMT).
 			Equivalent to smt=1.
 
+			[KNL,x86] Disable symmetric multithreading (SMT).
+			nosmt=force: Force disable SMT, similar to disabling
+				     it in the BIOS except that some of the
+				     resource partitioning effects which are
+				     caused by having SMT enabled in the BIOS
+				     cannot be undone. Depending on the CPU
+				     type this might have a performance impact.
+
 	nospectre_v2	[X86] Disable all mitigations for the Spectre variant 2
 			(indirect branch prediction) vulnerability. System may
 			allow data leaks with this option, which is equivalent
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -13,6 +13,9 @@ config KEXEC_CORE
 config HAVE_IMA_KEXEC
 	bool
 
+config HOTPLUG_SMT
+	bool
+
 config OPROFILE
 	tristate "OProfile system profiling"
 	depends on PROFILING
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -187,6 +187,7 @@ config X86
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_UNSTABLE_SCHED_CLOCK
 	select HAVE_USER_RETURN_NOTIFIER
+	select HOTPLUG_SMT			if SMP
 	select IRQ_FORCED_THREADING
 	select NEED_SG_DMA_LENGTH
 	select PCI_LOCKLESS_CONFIG
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -168,4 +168,16 @@ void cpuhp_report_idle_dead(void);
 static inline void cpuhp_report_idle_dead(void) { }
 #endif /* #ifdef CONFIG_HOTPLUG_CPU */
 
+enum cpuhp_smt_control {
+	CPU_SMT_ENABLED,
+	CPU_SMT_DISABLED,
+	CPU_SMT_FORCE_DISABLED,
+};
+
+#if defined(CONFIG_SMP) && defined(CONFIG_HOTPLUG_SMT)
+extern enum cpuhp_smt_control cpu_smt_control;
+#else
+# define cpu_smt_control		(CPU_SMT_ENABLED)
+#endif
+
 #endif /* _LINUX_CPU_H_ */
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -933,6 +933,29 @@ EXPORT_SYMBOL(cpu_down);
 #define takedown_cpu		NULL
 #endif /*CONFIG_HOTPLUG_CPU*/
 
+#ifdef CONFIG_HOTPLUG_SMT
+enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED;
+
+static int __init smt_cmdline_disable(char *str)
+{
+	cpu_smt_control = CPU_SMT_DISABLED;
+	if (str && !strcmp(str, "force")) {
+		pr_info("SMT: Force disabled\n");
+		cpu_smt_control = CPU_SMT_FORCE_DISABLED;
+	}
+	return 0;
+}
+early_param("nosmt", smt_cmdline_disable);
+
+static inline bool cpu_smt_allowed(unsigned int cpu)
+{
+	return cpu_smt_control == CPU_SMT_ENABLED ||
+		topology_is_primary_thread(cpu);
+}
+#else
+static inline bool cpu_smt_allowed(unsigned int cpu) { return true; }
+#endif
+
 /**
  * notify_cpu_starting(cpu) - Invoke the callbacks on the starting CPU
  * @cpu: cpu that just started
@@ -1056,6 +1079,10 @@ static int do_cpu_up(unsigned int cpu, e
 		err = -EBUSY;
 		goto out;
 	}
+	if (!cpu_smt_allowed(cpu)) {
+		err = -EPERM;
+		goto out;
+	}
 
 	err = _cpu_up(cpu, 0, target);
 out:
@@ -1904,10 +1931,138 @@ static const struct attribute_group cpuh
 	NULL
 };
 
+#ifdef CONFIG_HOTPLUG_SMT
+
+static const char *smt_states[] = {
+	[CPU_SMT_ENABLED]		= "on",
+	[CPU_SMT_DISABLED]		= "off",
+	[CPU_SMT_FORCE_DISABLED]	= "forceoff",
+};
+
+static ssize_t
+show_smt_control(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return snprintf(buf, PAGE_SIZE - 2, "%s\n", smt_states[cpu_smt_control]);
+}
+
+static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
+{
+	int cpu, ret = 0;
+
+	cpu_maps_update_begin();
+	for_each_online_cpu(cpu) {
+		struct device *dev;
+
+		if (topology_is_primary_thread(cpu))
+			continue;
+		ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
+		if (ret)
+			break;
+		/*
+		 * As this needs to hold the cpu maps lock it's impossible
+		 * to call device_offline(), so nothing would update
+		 * device:offline state. That would leave the sysfs entry
+		 * stale and prevent onlining after smt control has been
+		 * changed to 'off' again. This is called under the sysfs
+		 * hotplug lock, so it is properly serialized.
+		 *
+		 * Temporary workaround until Greg has a smarter idea to do
+		 * that.
+		 */
+		dev = get_cpu_device(cpu);
+		dev->offline = true;
+	}
+	if (!ret)
+		cpu_smt_control = ctrlval;
+	cpu_maps_update_done();
+	return ret;
+}
+
+static void cpuhp_smt_enable(void)
+{
+	cpu_maps_update_begin();
+	cpu_smt_control = CPU_SMT_ENABLED;
+	cpu_maps_update_done();
+}
+
+static ssize_t
+store_smt_control(struct device *dev, struct device_attribute *attr,
+		  const char *buf, size_t count)
+{
+	int ctrlval, ret;
+
+	if (sysfs_streq(buf, "on"))
+		ctrlval = CPU_SMT_ENABLED;
+	else if (sysfs_streq(buf, "off"))
+		ctrlval = CPU_SMT_DISABLED;
+	else if (sysfs_streq(buf, "forceoff"))
+		ctrlval = CPU_SMT_FORCE_DISABLED;
+	else
+		return -EINVAL;
+
+	if (cpu_smt_control == CPU_SMT_FORCE_DISABLED)
+		return -EPERM;
+
+	ret = lock_device_hotplug_sysfs();
+	if (ret)
+		return ret;
+
+	if (ctrlval != cpu_smt_control) {
+		switch (ctrlval) {
+		case CPU_SMT_ENABLED:
+			cpuhp_smt_enable();
+			break;
+		case CPU_SMT_DISABLED:
+		case CPU_SMT_FORCE_DISABLED:
+			ret = cpuhp_smt_disable(ctrlval);
+			break;
+		}
+	}
+
+	unlock_device_hotplug();
+	return ret ? ret : count;
+}
+static DEVICE_ATTR(control, 0644, show_smt_control, store_smt_control);
+
+static ssize_t
+show_smt_active(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	bool active = topology_max_smt_threads() > 1;
+
+	return snprintf(buf, PAGE_SIZE - 2, "%d\n", active);
+}
+static DEVICE_ATTR(active, 0444, show_smt_active, NULL);
+
+static struct attribute *cpuhp_smt_attrs[] = {
+	&dev_attr_control.attr,
+	&dev_attr_active.attr,
+	NULL
+};
+
+static const struct attribute_group cpuhp_smt_attr_group = {
+	.attrs = cpuhp_smt_attrs,
+	.name = "smt",
+	NULL
+};
+
+static int __init cpu_smt_state_init(void)
+{
+	return sysfs_create_group(&cpu_subsys.dev_root->kobj,
+				  &cpuhp_smt_attr_group);
+}
+
+#else
+static inline int cpu_smt_state_init(void) { return 0; }
+#endif
+
 static int __init cpuhp_sysfs_init(void)
 {
 	int cpu, ret;
 
+	ret = cpu_smt_state_init();
+	if (ret)
+		return ret;
+
 	ret = sysfs_create_group(&cpu_subsys.dev_root->kobj,
 				 &cpuhp_cpu_root_attr_group);
 	if (ret)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 06/13] x86/cpu: Remove the pointless CPU printout
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (4 preceding siblings ...)
  2018-06-20 20:19 ` [patch V4 05/13] cpu/hotplug: Provide knobs to control SMT Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 07/13] x86/cpu/AMD: Remove the pointless detect_ht() call Thomas Gleixner
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 06/13] x86/cpu: Remove the pointless CPU printout
From: Thomas Gleixner <tglx@linutronix.de>

The value of this printout is dubious at best and there is no point in
having it in two different places along with convoluted ways to reach it.

Remove it completely.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/kernel/cpu/common.c   |   20 +++++---------------
 arch/x86/kernel/cpu/topology.c |   10 ----------
 2 files changed, 5 insertions(+), 25 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -663,13 +663,12 @@ void detect_ht(struct cpuinfo_x86 *c)
 #ifdef CONFIG_SMP
 	u32 eax, ebx, ecx, edx;
 	int index_msb, core_bits;
-	static bool printed;
 
 	if (!cpu_has(c, X86_FEATURE_HT))
 		return;
 
 	if (cpu_has(c, X86_FEATURE_CMP_LEGACY))
-		goto out;
+		return;
 
 	if (cpu_has(c, X86_FEATURE_XTOPOLOGY))
 		return;
@@ -678,14 +677,14 @@ void detect_ht(struct cpuinfo_x86 *c)
 
 	smp_num_siblings = (ebx & 0xff0000) >> 16;
 
+	if (!smp_num_siblings)
+		smp_num_siblings = 1;
+
 	if (smp_num_siblings == 1) {
 		pr_info_once("CPU0: Hyper-Threading is disabled\n");
-		goto out;
+		return;
 	}
 
-	if (smp_num_siblings <= 1)
-		goto out;
-
 	index_msb = get_count_order(smp_num_siblings);
 	c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid, index_msb);
 
@@ -697,15 +696,6 @@ void detect_ht(struct cpuinfo_x86 *c)
 
 	c->cpu_core_id = apic->phys_pkg_id(c->initial_apicid, index_msb) &
 				       ((1 << core_bits) - 1);
-
-out:
-	if (!printed && (c->x86_max_cores * smp_num_siblings) > 1) {
-		pr_info("CPU: Physical Processor ID: %d\n",
-			c->phys_proc_id);
-		pr_info("CPU: Processor Core ID: %d\n",
-			c->cpu_core_id);
-		printed = 1;
-	}
 #endif
 }
 
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -33,7 +33,6 @@ int detect_extended_topology(struct cpui
 	unsigned int eax, ebx, ecx, edx, sub_index;
 	unsigned int ht_mask_width, core_plus_mask_width;
 	unsigned int core_select_mask, core_level_siblings;
-	static bool printed;
 
 	if (c->cpuid_level < 0xb)
 		return -1;
@@ -86,15 +85,6 @@ int detect_extended_topology(struct cpui
 	c->apicid = apic->phys_pkg_id(c->initial_apicid, 0);
 
 	c->x86_max_cores = (core_level_siblings / smp_num_siblings);
-
-	if (!printed) {
-		pr_info("CPU: Physical Processor ID: %d\n",
-		       c->phys_proc_id);
-		if (c->x86_max_cores > 1)
-			pr_info("CPU: Processor Core ID: %d\n",
-			       c->cpu_core_id);
-		printed = 1;
-	}
 #endif
 	return 0;
 }

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 07/13] x86/cpu/AMD: Remove the pointless detect_ht() call
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (5 preceding siblings ...)
  2018-06-20 20:19 ` [patch V4 06/13] x86/cpu: Remove the pointless CPU printout Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 08/13] x86/cpu/common: Provide detect_ht_early() Thomas Gleixner
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 07/13] x86/cpu/AMD: Remove the pointless detect_ht() call
From: Thomas Gleixner <tglx@linutronix.de>

Real 32-bit AMD CPUs do not have SMT, and the only value of the call was to
reach the magic printout, which got removed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/kernel/cpu/amd.c |    4 ----
 1 file changed, 4 deletions(-)

--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -855,10 +855,6 @@ static void init_amd(struct cpuinfo_x86
 		srat_detect_node(c);
 	}
 
-#ifdef CONFIG_X86_32
-	detect_ht(c);
-#endif
-
 	init_amd_cacheinfo(c);
 
 	if (c->x86 >= 0xf)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 08/13] x86/cpu/common: Provide detect_ht_early()
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (6 preceding siblings ...)
  2018-06-20 20:19 ` [patch V4 07/13] x86/cpu/AMD: Remove the pointless detect_ht() call Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 09/13] x86/cpu/topology: Provide detect_extended_topology_early() Thomas Gleixner
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 08/13] x86/cpu/common: Provide detect_ht_early()
From: Thomas Gleixner <tglx@linutronix.de>

To support force disabling of SMT it's required to know the number of
thread siblings early. detect_ht() cannot be called before the APIC driver
is selected, so split out the part which initializes smp_num_siblings.
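
As a stand-alone illustration of what detect_ht_early() reads (this is not
kernel code; it uses gcc's <cpuid.h> instead of the kernel's cpuid() helper
and omits the CMP_LEGACY/XTOPOLOGY checks the kernel performs):

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
        unsigned int eax, ebx, ecx, edx;

        if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
                return 1;

        if (!(edx & (1U << 28))) {	/* CPUID.1:EDX.HTT */
                printf("HT not supported, smp_num_siblings stays 1\n");
                return 0;
        }
        /* CPUID.1:EBX[23:16] is what the kernel stores in smp_num_siblings */
        printf("smp_num_siblings = %u\n", (ebx >> 16) & 0xff);
        return 0;
}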

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/kernel/cpu/common.c |   24 ++++++++++++++----------
 arch/x86/kernel/cpu/cpu.h    |    1 +
 2 files changed, 15 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -658,32 +658,36 @@ static void cpu_detect_tlb(struct cpuinf
 		tlb_lld_4m[ENTRIES], tlb_lld_1g[ENTRIES]);
 }
 
-void detect_ht(struct cpuinfo_x86 *c)
+int detect_ht_early(struct cpuinfo_x86 *c)
 {
 #ifdef CONFIG_SMP
 	u32 eax, ebx, ecx, edx;
-	int index_msb, core_bits;
 
 	if (!cpu_has(c, X86_FEATURE_HT))
-		return;
+		return -1;
 
 	if (cpu_has(c, X86_FEATURE_CMP_LEGACY))
-		return;
+		return -1;
 
 	if (cpu_has(c, X86_FEATURE_XTOPOLOGY))
-		return;
+		return -1;
 
 	cpuid(1, &eax, &ebx, &ecx, &edx);
 
 	smp_num_siblings = (ebx & 0xff0000) >> 16;
+	if (smp_num_siblings == 1)
+		pr_info_once("CPU0: Hyper-Threading is disabled\n");
+#endif
+	return 0;
+}
 
-	if (!smp_num_siblings)
-		smp_num_siblings = 1;
+void detect_ht(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_SMP
+	int index_msb, core_bits;
 
-	if (smp_num_siblings == 1) {
-		pr_info_once("CPU0: Hyper-Threading is disabled\n");
+	if (detect_ht_early(c) < 0)
 		return;
-	}
 
 	index_msb = get_count_order(smp_num_siblings);
 	c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid, index_msb);
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -56,6 +56,7 @@ extern void init_amd_cacheinfo(struct cp
 
 extern void detect_num_cpu_cores(struct cpuinfo_x86 *c);
 extern int detect_extended_topology(struct cpuinfo_x86 *c);
+extern int detect_ht_early(struct cpuinfo_x86 *c);
 extern void detect_ht(struct cpuinfo_x86 *c);
 
 unsigned int aperfmperf_get_khz(int cpu);

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 09/13] x86/cpu/topology: Provide detect_extended_topology_early()
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (7 preceding siblings ...)
  2018-06-20 20:19 ` [patch V4 08/13] x86/cpu/common: Provide detect_ht_early() Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 10/13] x86/cpu/intel: Evaluate smp_num_siblings early Thomas Gleixner
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 09/13] x86/cpu/topology: Provide detect_extended_topology_early()
From: Thomas Gleixner <tglx@linutronix.de>

To support force disabling of SMT it's required to know the number of
thread siblings early. detect_extended_topology() cannot be called before
the APIC driver is selected, so split out the part which initializes
smp_num_siblings.
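
A stand-alone illustration of the sub-leaf read the new early helper does
(not kernel code; uses gcc's <cpuid.h>, with SMT_LEVEL being 0 as in the
kernel source, and without the additional validity checks the kernel makes):

#include <stdio.h>
#include <cpuid.h>

#define SMT_LEVEL	0	/* sub-leaf 0 of CPUID leaf 0xb enumerates SMT */

int main(void)
{
        unsigned int eax, ebx, ecx, edx;

        /* __get_cpuid_count() checks the maximum supported leaf first */
        if (!__get_cpuid_count(0xb, SMT_LEVEL, &eax, &ebx, &ecx, &edx)) {
                printf("extended topology leaf 0xb not available\n");
                return 1;
        }
        /* LEVEL_MAX_SIBLINGS(ebx): logical processors at the SMT level */
        printf("smp_num_siblings = %u, initial APIC ID = %u\n",
               ebx & 0xffff, edx);
        return 0;
}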

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/kernel/cpu/cpu.h      |    1 +
 arch/x86/kernel/cpu/topology.c |   31 ++++++++++++++++++++++---------
 2 files changed, 23 insertions(+), 9 deletions(-)

--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -55,6 +55,7 @@ extern void init_intel_cacheinfo(struct
 extern void init_amd_cacheinfo(struct cpuinfo_x86 *c);
 
 extern void detect_num_cpu_cores(struct cpuinfo_x86 *c);
+extern int detect_extended_topology_early(struct cpuinfo_x86 *c);
 extern int detect_extended_topology(struct cpuinfo_x86 *c);
 extern int detect_ht_early(struct cpuinfo_x86 *c);
 extern void detect_ht(struct cpuinfo_x86 *c);
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -22,17 +22,10 @@
 #define BITS_SHIFT_NEXT_LEVEL(eax)	((eax) & 0x1f)
 #define LEVEL_MAX_SIBLINGS(ebx)		((ebx) & 0xffff)
 
-/*
- * Check for extended topology enumeration cpuid leaf 0xb and if it
- * exists, use it for populating initial_apicid and cpu topology
- * detection.
- */
-int detect_extended_topology(struct cpuinfo_x86 *c)
+int detect_extended_topology_early(struct cpuinfo_x86 *c)
 {
 #ifdef CONFIG_SMP
-	unsigned int eax, ebx, ecx, edx, sub_index;
-	unsigned int ht_mask_width, core_plus_mask_width;
-	unsigned int core_select_mask, core_level_siblings;
+	unsigned int eax, ebx, ecx, edx;
 
 	if (c->cpuid_level < 0xb)
 		return -1;
@@ -51,10 +44,30 @@ int detect_extended_topology(struct cpui
 	 * initial apic id, which also represents 32-bit extended x2apic id.
 	 */
 	c->initial_apicid = edx;
+	smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
+#endif
+	return 0;
+}
+
+/*
+ * Check for extended topology enumeration cpuid leaf 0xb and if it
+ * exists, use it for populating initial_apicid and cpu topology
+ * detection.
+ */
+int detect_extended_topology(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_SMP
+	unsigned int eax, ebx, ecx, edx, sub_index;
+	unsigned int ht_mask_width, core_plus_mask_width;
+	unsigned int core_select_mask, core_level_siblings;
+
+	if (detect_extended_topology_early(c) < 0)
+		return -1;
 
 	/*
 	 * Populate HT related information from sub-leaf level 0.
 	 */
+	cpuid_count(0xb, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
 	core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
 	core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 10/13] x86/cpu/intel: Evaluate smp_num_siblings early
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (8 preceding siblings ...)
  2018-06-20 20:19 ` [patch V4 09/13] x86/cpu/topology: Provide detect_extended_topology_early() Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 11/13] x86/CPU/AMD: Do not check CPUID max ext level before parsing SMP info Thomas Gleixner
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 10/13] x86/cpu/intel: Evaluate smp_num_siblings early
From: Thomas Gleixner <tglx@linutronix.de>

Make use of the new early detection function to initialize smp_num_siblings
on the boot cpu before the MP-Table or ACPI/MADT scan happens. That's
required for force disabling SMT.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
V2: Drop the bogus 32bit conditional

 arch/x86/kernel/cpu/intel.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -301,6 +301,13 @@ static void early_init_intel(struct cpui
 	}
 
 	check_mpx_erratum(c);
+
+	/*
+	 * Get the number of SMT siblings early from the extended topology
+	 * leaf, if available. Otherwise try the legacy SMT detection.
+	 */
+	if (detect_extended_topology_early(c) < 0)
+		detect_ht_early(c);
 }
 
 #ifdef CONFIG_X86_32

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 11/13] x86/CPU/AMD: Do not check CPUID max ext level before parsing SMP info
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (9 preceding siblings ...)
  2018-06-20 20:19 ` [patch V4 10/13] x86/cpu/intel: Evaluate smp_num_siblings early Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 12/13] x86/cpu/AMD: Evaluate smp_num_siblings early Thomas Gleixner
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

From: Borislav Petkov <bp@suse.de>
Subject: [patch V4 11/13] x86/CPU/AMD: Do not check CPUID max ext level before parsing SMP info

The old code checked whether the maximum extended CPUID level is >=
0x80000008 because that leaf contains the number of cores of the physical
CPU. The three functions called there no longer depend on that leaf, so the
check can go.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/amd.c |    9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -848,12 +848,9 @@ static void init_amd(struct cpuinfo_x86
 
 	cpu_detect_cache_sizes(c);
 
-	/* Multi core CPU? */
-	if (c->extended_cpuid_level >= 0x80000008) {
-		amd_detect_cmp(c);
-		amd_get_topology(c);
-		srat_detect_node(c);
-	}
+	amd_detect_cmp(c);
+	amd_get_topology(c);
+	srat_detect_node(c);
 
 	init_amd_cacheinfo(c);
 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 12/13] x86/cpu/AMD: Evaluate smp_num_siblings early
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (10 preceding siblings ...)
  2018-06-20 20:19 ` [patch V4 11/13] x86/CPU/AMD: Do not check CPUID max ext level before parsing SMP info Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 20:19 ` [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force Thomas Gleixner
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 12/13] x86/cpu/AMD: Evaluate smp_num_siblings early
From: Thomas Gleixner <tglx@linutronix.de>

To support force disabling of SMT it's required to know the number of
thread siblings early. amd_get_topology() cannot be called before the APIC
driver is selected, so split out the part which initializes
smp_num_siblings and invoke it from amd_early_init().
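
A stand-alone illustration of the CPUID read that amd_get_topology_early()
performs (not kernel code; uses gcc's <cpuid.h> and assumes an AMD CPU with
the TOPOEXT feature, which the kernel checks via boot_cpu_has()):

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
        unsigned int eax, ebx, ecx, edx;

        if (!__get_cpuid(0x8000001e, &eax, &ebx, &ecx, &edx)) {
                printf("CPUID leaf 0x8000001e not available\n");
                return 1;
        }
        /* EBX[15:8] holds threads per core minus one */
        printf("smp_num_siblings = %u\n", ((ebx >> 8) & 0xff) + 1);
        return 0;
}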

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/amd.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -315,6 +315,17 @@ static void legacy_fixup_core_id(struct
 	c->cpu_core_id %= cus_per_node;
 }
 
+
+static void amd_get_topology_early(struct cpuinfo_x86 *c)
+{
+	if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
+		u32 eax, ebx, ecx, edx;
+
+		cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
+		smp_num_siblings = ((ebx >> 8) & 0xff) + 1;
+	}
+}
+
 /*
  * Fixup core topology information for
  * (1) AMD multi-node processors
@@ -681,6 +692,8 @@ static void early_init_amd(struct cpuinf
 		set_cpu_bug(c, X86_BUG_AMD_E400);
 
 	early_detect_mem_encrypt(c);
+
+	amd_get_topology_early(c);
 }
 
 static void init_amd_k8(struct cpuinfo_x86 *c)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (11 preceding siblings ...)
  2018-06-20 20:19 ` [patch V4 12/13] x86/cpu/AMD: Evaluate smp_num_siblings early Thomas Gleixner
@ 2018-06-20 20:19 ` Thomas Gleixner
  2018-06-20 21:44   ` [MODERATED] " Linus Torvalds
  2018-06-28 22:13   ` Dave Hansen
  2018-06-20 20:53 ` [patch V4 00/13] SMT control knobs Thomas Gleixner
  2018-06-21 11:52 ` [MODERATED] " Ingo Molnar
  14 siblings, 2 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:19 UTC (permalink / raw)
  To: speck

Subject: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
From: Thomas Gleixner <tglx@linutronix.de>

nosmt on the kernel command line merily prevents the onlining of the
secondary SMT siblings.

nosmt=force makes the APIC detection code ignore the secondary SMT siblings
completely, so they do not even show up as possible CPUs. That reduces the
amount of memory allocated for per-cpu variables and keeps other resources
from being sized too large.

This is not fully equivalent to disabling SMT in the BIOS because the low
level SMT enabling in the BIOS can result in partitioning of resources
between the siblings, which is not undone by just ignoring them. Some CPUs
can use the full resources when their sibling is not onlined, but this
depends on the CPU family and model and it's not well documented whether it
applies to all partitioned resources. That means that, depending on the
workload, disabling SMT in the BIOS might result in better performance.

Linus' analysis of the Intel manual:

  The intel optimization manual is not very clear on what the partitioning
  rules are.

  I find:

    "In general, the buffers for staging instructions between major pipe
     stages  are partitioned. These buffers include µop queues after the
     execution trace cache, the queues after the register rename stage, the
     reorder buffer which stages instructions for retirement, and the load
     and store buffers.

     In the case of load and store buffers, partitioning also provided an
     easier implementation to maintain memory ordering for each logical
     processor and detect memory ordering violations"

  but some of that partitioning may be relaxed if the HT thread is "not
  active":

    "In Intel microarchitecture code name Sandy Bridge, the micro-op queue
     is statically partitioned to provide 28 entries for each logical
     processor,  irrespective of software executing in single thread or
     multiple threads. If one logical processor is not active in Intel
     microarchitecture code name Ivy Bridge, then a single thread executing
     on that processor  core can use the 56 entries in the micro-op queue"

  but I do not know what "not active" means, and how dynamic it is. Some of
  that partitioning may be entirely static and depend on the early BIOS
  disabling of HT, and even if we park the cores, the resources will just be
  wasted.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
V3: Updated changelog
V2: Moved the mask related hunk to the proper patch and clarified that
    the command line switch might not be completely equivalent to the
    BIOS switch.

 arch/x86/include/asm/apic.h |    2 ++
 arch/x86/kernel/acpi/boot.c |    3 ++-
 arch/x86/kernel/apic/apic.c |   19 +++++++++++++++++++
 3 files changed, 23 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -504,8 +504,10 @@ extern int default_check_phys_apicid_pre
 
 #ifdef CONFIG_SMP
 bool apic_id_is_primary_thread(unsigned int id);
+bool apic_id_disabled(unsigned int id);
 #else
 static inline bool apic_id_is_primary_thread(unsigned int id) { return false; }
+static inline bool apic_id_disabled(unsigned int id) { return false; }
 #endif
 
 extern void irq_enter(void);
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -181,7 +181,8 @@ static int acpi_register_lapic(int id, u
 	}
 
 	if (!enabled) {
-		++disabled_cpus;
+		if (!apic_id_disabled(id))
+			++disabled_cpus;
 		return -EINVAL;
 	}
 
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2204,6 +2204,16 @@ bool apic_id_is_primary_thread(unsigned
 	return !(apicid & mask);
 }
 
+/**
+ * apic_id_disabled - Check whether APIC ID is disabled via SMT control
+ * @id:	APIC ID to check
+ */
+bool apic_id_disabled(unsigned int id)
+{
+	return (cpu_smt_control == CPU_SMT_FORCE_DISABLED &&
+		!apic_id_is_primary_thread(id));
+}
+
 /*
  * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
  * and cpuid_to_apicid[] synchronized.
@@ -2299,6 +2309,15 @@ int generic_processor_info(int apicid, i
 		return -EINVAL;
 	}
 
+	/*
+	 * If SMT is force disabled and the APIC ID belongs to
+	 * a secondary thread, ignore it.
+	 */
+	if (apic_id_disabled(apicid)) {
+		pr_info_once("Ignoring secondary SMT threads\n");
+		return -EINVAL;
+	}
+
 	if (apicid == boot_cpu_physical_apicid) {
 		/*
 		 * x86_bios_cpu_apicid is required to have processors listed

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch V4 00/13] SMT control knobs
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (12 preceding siblings ...)
  2018-06-20 20:19 ` [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force Thomas Gleixner
@ 2018-06-20 20:53 ` Thomas Gleixner
  2018-06-21 11:52 ` [MODERATED] " Ingo Molnar
  14 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-20 20:53 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 629 bytes --]

On Wed, 20 Jun 2018, speck for Thomas Gleixner wrote:

> V4 of the SMT control knob series. It addresses the review feed back of
> V3 plus:
> 
>  - Added Borislavs patch which removes the weird cpu leaf check and
>    adapted the early sibling check accordingly
> 
>  - Borislav noticed that the sys/..../cpuN/online file is stale after
>    writing 'off' to the smt/control file. Added a workaround which might be
>    replaced when Greg either has a better idea or starts to yell murder.
> 
>  - Updated the changelog of the last patch.

git bundle attached. It's the combo of l1tf host side plus the SMT stuff.

Thanks,

	tglx

[-- Attachment #2: Type: application/octet-stream, Size: 32463 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-20 20:19 ` [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force Thomas Gleixner
@ 2018-06-20 21:44   ` Linus Torvalds
  2018-06-28 22:13   ` Dave Hansen
  1 sibling, 0 replies; 33+ messages in thread
From: Linus Torvalds @ 2018-06-20 21:44 UTC (permalink / raw)
  To: speck



On Wed, 20 Jun 2018, speck for Thomas Gleixner wrote:
> 
> nosmt on the kernel command line merily prevents the onlining of the
> secondary SMT siblings.

"merely", I assume.

Unless you meant "merrily". I guess it parses that way too ;)

           Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: [patch V4 05/13] cpu/hotplug: Provide knobs to control SMT
  2018-06-20 20:19 ` [patch V4 05/13] cpu/hotplug: Provide knobs to control SMT Thomas Gleixner
@ 2018-06-20 23:05   ` Greg KH
  2018-06-21  6:08     ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Greg KH @ 2018-06-20 23:05 UTC (permalink / raw)
  To: speck

On Wed, Jun 20, 2018 at 10:19:12PM +0200, speck for Thomas Gleixner wrote:
> +static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> +{
> +	int cpu, ret = 0;
> +
> +	cpu_maps_update_begin();
> +	for_each_online_cpu(cpu) {
> +		struct device *dev;
> +
> +		if (topology_is_primary_thread(cpu))
> +			continue;
> +		ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
> +		if (ret)
> +			break;
> +		/*
> +		 * As this needs to hold the cpu maps lock it's impossible
> +		 * to call device_offline(), so nothing would update
> +		 * device:offline state. That would leave the sysfs entry
> +		 * stale and prevent onlining after smt control has been
> +		 * changed to 'off' again. This is called under the sysfs
> +		 * hotplug lock, so it is properly serialized.
> +		 *
> +		 * Temporary workaround until Greg has a smarter idea to do
> +		 * that.
> +		 */
> +		dev = get_cpu_device(cpu);
> +		dev->offline = true;

You are right to call me out here.  Ugh.

I'm missing where device_offline() wants to take any lock other than the
lock on the device you are offlining.  Why can't that be called here?

If you do just want to "open code" this, you should also tell userspace
what just happened, so should you add this line as well:
		kobject_uevent(&dev->kobj, KOBJ_OFFLINE);

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch V4 05/13] cpu/hotplug: Provide knobs to control SMT
  2018-06-20 23:05   ` [MODERATED] " Greg KH
@ 2018-06-21  6:08     ` Thomas Gleixner
  0 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-21  6:08 UTC (permalink / raw)
  To: speck

On Thu, 21 Jun 2018, speck for Greg KH wrote:
> On Wed, Jun 20, 2018 at 10:19:12PM +0200, speck for Thomas Gleixner wrote:
> > +static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> > +{
> > +	int cpu, ret = 0;
> > +
> > +	cpu_maps_update_begin();
> > +	for_each_online_cpu(cpu) {
> > +		struct device *dev;
> > +
> > +		if (topology_is_primary_thread(cpu))
> > +			continue;
> > +		ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
> > +		if (ret)
> > +			break;
> > +		/*
> > +		 * As this needs to hold the cpu maps lock it's impossible
> > +		 * to call device_offline(), so nothing would update
> > +		 * device:offline state. That would leave the sysfs entry
> > +		 * stale and prevent onlining after smt control has been
> > +		 * changed to 'off' again. This is called under the sysfs
> > +		 * hotplug lock, so it is properly serialized.
> > +		 *
> > +		 * Temporary workaround until Greg has a smarter idea to do
> > +		 * that.
> > +		 */
> > +		dev = get_cpu_device(cpu);
> > +		dev->offline = true;
> 
> You are right to call me out here.  Ugh.
> 
> I'm missing where device_offline() wants to call any lock other than the
> lock on the device you are offlineing.  Why can't that be called here?

Because it ends up calling cpu_maps_update_begin() at the end, and that
lock is already held.

> If you do just want to "open code" this, you should also tell userspace
> what just happened, so should you add this line as well:
> 		kobject_uevent(&dev->kobj, KOBJ_OFFLINE);

Ok.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: [patch V4 00/13] SMT control knobs
  2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
                   ` (13 preceding siblings ...)
  2018-06-20 20:53 ` [patch V4 00/13] SMT control knobs Thomas Gleixner
@ 2018-06-21 11:52 ` Ingo Molnar
  14 siblings, 0 replies; 33+ messages in thread
From: Ingo Molnar @ 2018-06-21 11:52 UTC (permalink / raw)
  To: speck


* speck for Thomas Gleixner <speck@linutronix.de> wrote:

> V4 of the SMT control knob series. It addresses the review feed back of
> V3 plus:
> 
>  - Added Borislavs patch which removes the weird cpu leaf check and
>    adapted the early sibling check accordingly
> 
>  - Borislav noticed that the sys/..../cpuN/online file is stale after
>    writing 'off' to the smt/control file. Added a workaround which might be
>    replaced when Greg either has a better idea or starts to yell murder.
> 
>  - Updated the changelog of the last patch.

Very nice!

Acked-by: Ingo Molnar <mingo@kernel.org>

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: [patch V4 02/13] x86/smp: Provide topology_is_primary_thread()
  2018-06-20 20:19 ` [patch V4 02/13] x86/smp: Provide topology_is_primary_thread() Thomas Gleixner
@ 2018-06-27 13:56   ` Josh Poimboeuf
  2018-06-27 15:54     ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Josh Poimboeuf @ 2018-06-27 13:56 UTC (permalink / raw)
  To: speck

On Wed, Jun 20, 2018 at 10:19:09PM +0200, speck for Thomas Gleixner wrote:
> --- a/arch/x86/include/asm/topology.h
> +++ b/arch/x86/include/asm/topology.h
> @@ -123,13 +123,15 @@ static inline int topology_max_smt_threa
>  }
>  
>  int topology_update_package_map(unsigned int apicid, unsigned int cpu);
> -extern int topology_phys_to_logical_pkg(unsigned int pkg);
> +int topology_phys_to_logical_pkg(unsigned int pkg);
> +bool topology_is_primary_thread(unsigned int cpu);
>  #else
>  #define topology_max_packages()			(1)
>  static inline int
>  topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
>  static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
>  static inline int topology_max_smt_threads(void) { return 1; }
> +static inline bool topology_is_primary_thread(unsigned int cpu) { return true; ]

Longman found a typo: "]" at the end of the line

-- 
Josh

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch V4 02/13] x86/smp: Provide topology_is_primary_thread()
  2018-06-27 13:56   ` [MODERATED] " Josh Poimboeuf
@ 2018-06-27 15:54     ` Thomas Gleixner
  2018-06-27 16:20       ` [MODERATED] " Josh Poimboeuf
  0 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-27 15:54 UTC (permalink / raw)
  To: speck

On Wed, 27 Jun 2018, speck for Josh Poimboeuf wrote:

> On Wed, Jun 20, 2018 at 10:19:09PM +0200, speck for Thomas Gleixner wrote:
> > --- a/arch/x86/include/asm/topology.h
> > +++ b/arch/x86/include/asm/topology.h
> > @@ -123,13 +123,15 @@ static inline int topology_max_smt_threa
> >  }
> >  
> >  int topology_update_package_map(unsigned int apicid, unsigned int cpu);
> > -extern int topology_phys_to_logical_pkg(unsigned int pkg);
> > +int topology_phys_to_logical_pkg(unsigned int pkg);
> > +bool topology_is_primary_thread(unsigned int cpu);
> >  #else
> >  #define topology_max_packages()			(1)
> >  static inline int
> >  topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
> >  static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
> >  static inline int topology_max_smt_threads(void) { return 1; }
> > +static inline bool topology_is_primary_thread(unsigned int cpu) { return true; ]
> 
> Longman found a typo: "]" at the end of the line

In the patch but not in the repo...

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: [patch V4 02/13] x86/smp: Provide topology_is_primary_thread()
  2018-06-27 15:54     ` Thomas Gleixner
@ 2018-06-27 16:20       ` Josh Poimboeuf
  2018-06-27 16:52         ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Josh Poimboeuf @ 2018-06-27 16:20 UTC (permalink / raw)
  To: speck

On Wed, Jun 27, 2018 at 05:54:04PM +0200, speck for Thomas Gleixner wrote:
> On Wed, 27 Jun 2018, speck for Josh Poimboeuf wrote:
> 
> > On Wed, Jun 20, 2018 at 10:19:09PM +0200, speck for Thomas Gleixner wrote:
> > > --- a/arch/x86/include/asm/topology.h
> > > +++ b/arch/x86/include/asm/topology.h
> > > @@ -123,13 +123,15 @@ static inline int topology_max_smt_threa
> > >  }
> > >  
> > >  int topology_update_package_map(unsigned int apicid, unsigned int cpu);
> > > -extern int topology_phys_to_logical_pkg(unsigned int pkg);
> > > +int topology_phys_to_logical_pkg(unsigned int pkg);
> > > +bool topology_is_primary_thread(unsigned int cpu);
> > >  #else
> > >  #define topology_max_packages()			(1)
> > >  static inline int
> > >  topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
> > >  static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
> > >  static inline int topology_max_smt_threads(void) { return 1; }
> > > +static inline bool topology_is_primary_thread(unsigned int cpu) { return true; ]
> > 
> > Longman found a typo: "]" at the end of the line
> 
> In the patch but not in the repo...

Where is the repo?

-- 
Josh

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch V4 02/13] x86/smp: Provide topology_is_primary_thread()
  2018-06-27 16:20       ` [MODERATED] " Josh Poimboeuf
@ 2018-06-27 16:52         ` Thomas Gleixner
  0 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-27 16:52 UTC (permalink / raw)
  To: speck

On Wed, 27 Jun 2018, speck for Josh Poimboeuf wrote:
> On Wed, Jun 27, 2018 at 05:54:04PM +0200, speck for Thomas Gleixner wrote:
> > On Wed, 27 Jun 2018, speck for Josh Poimboeuf wrote:
> > 
> > > On Wed, Jun 20, 2018 at 10:19:09PM +0200, speck for Thomas Gleixner wrote:
> > > > --- a/arch/x86/include/asm/topology.h
> > > > +++ b/arch/x86/include/asm/topology.h
> > > > @@ -123,13 +123,15 @@ static inline int topology_max_smt_threa
> > > >  }
> > > >  
> > > >  int topology_update_package_map(unsigned int apicid, unsigned int cpu);
> > > > -extern int topology_phys_to_logical_pkg(unsigned int pkg);
> > > > +int topology_phys_to_logical_pkg(unsigned int pkg);
> > > > +bool topology_is_primary_thread(unsigned int cpu);
> > > >  #else
> > > >  #define topology_max_packages()			(1)
> > > >  static inline int
> > > >  topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
> > > >  static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
> > > >  static inline int topology_max_smt_threads(void) { return 1; }
> > > > +static inline bool topology_is_primary_thread(unsigned int cpu) { return true; ]
> > > 
> > > Longman found a typo: "]" at the end of the line
> > 
> > In the patch but not in the repo...
> 
> Where is the repo?

It's a sekrit gitolite instance of course. Send me a ssh pub key please.

I also posted the git bundle on top of 4.18-rc1 on the list.

  Subject: L!TF Bulletin: State of the horrors
  
  Message-ID: alpine.DEB.2.21.1806211719180.1591@nanos.tec.linutronix.de

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-20 20:19 ` [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force Thomas Gleixner
  2018-06-20 21:44   ` [MODERATED] " Linus Torvalds
@ 2018-06-28 22:13   ` Dave Hansen
  2018-06-28 22:19     ` Andrew Cooper
                       ` (3 more replies)
  1 sibling, 4 replies; 33+ messages in thread
From: Dave Hansen @ 2018-06-28 22:13 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 1761 bytes --]

On 06/20/2018 01:19 PM, speck for Thomas Gleixner wrote:
> +	/*
> +	 * If SMT is force disabled and the APIC ID belongs to
> +	 * a secondary thread, ignore it.
> +	 */
> +	if (apic_id_disabled(apicid)) {
> +		pr_info_once("Ignoring secondary SMT threads\n");
> +		return -EINVAL;
> +	}

Thomas, this boottime-disable stuff just ends up ignoring the
hyperthread and leaves it alone, right?

Some Intel folks pointed out a few problems with this.  One is with
machine checks.  If one thread is booted and has CR4.MCE=1, but the
other never gets booted and still has CR4.MCE=0, things go boom
(everything goes to shutdown state) if a machine check happens.

We've traditionally pretended that this does not happen because folks
don't tend to turn off CPUs they've paid for via things like maxcpus= in
the real world.

Ashok Raj and Tony Luck were evidently looking at this at some point,
but it got tricky and they decided it wasn't worth the trouble.

It makes me think we should either scrap or recommend against "nosmt=force".

Some relevant SDM language:

> Because the logical processors within a physical package are tightly
> coupled with respect to shared hardware resources, both logical
> processors are notified of machine check errors that occur within a
> given physical processor. If machine-check exceptions are enabled
> when a fatal error is reported, all the logical processors within a
> physical package are dispatched to the machine-check exception
> handler. If machine-check exceptions are disabled, the logical
> processors enter the shutdown state and assert the IERR# signal. When
> enabling machine-check exceptions, the MCE flag in control register
> CR4 should be set for each logical processor.
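
To make the window concrete, a minimal sketch of the per-logical-CPU part
of machine check init (roughly the shape of the mcheck_cpu_init() path in
mainline; exact function names differ between trees and this is not part
of the patch series):

/*
 * CR4.MCE is a per-thread bit: a sibling that never receives
 * INIT/SIPI never runs this, keeps CR4.MCE=0, and a broadcast #MC
 * therefore sends it to shutdown state with IERR# asserted instead
 * of into the exception handler.
 */
static void enable_mce_on_this_thread(void)
{
        u64 cap;
        unsigned int i, nr_banks;

        rdmsrl(MSR_IA32_MCG_CAP, cap);
        nr_banks = cap & MCG_BANKCNT_MASK;

        /* Enable reporting in all implemented banks on this thread */
        for (i = 0; i < nr_banks; i++)
                wrmsrl(MSR_IA32_MCx_CTL(i), ~0ULL);

        /* The per-thread bit that decides handler vs. shutdown */
        cr4_set_bits(X86_CR4_MCE);
}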



^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-28 22:13   ` Dave Hansen
@ 2018-06-28 22:19     ` Andrew Cooper
  2018-06-28 22:23     ` Luck, Tony
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Andrew Cooper @ 2018-06-28 22:19 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 2113 bytes --]

On 28/06/2018 23:13, speck for Dave Hansen wrote:
> On 06/20/2018 01:19 PM, speck for Thomas Gleixner wrote:
>> +	/*
>> +	 * If SMT is force disabled and the APIC ID belongs to
>> +	 * a secondary thread, ignore it.
>> +	 */
>> +	if (apic_id_disabled(apicid)) {
>> +		pr_info_once("Ignoring secondary SMT threads\n");
>> +		return -EINVAL;
>> +	}
> Thomas, this boottime-disable stuff just ends up ignoring the
> hyperthread and leaves it alone, right?

Yes

>
> Some Intel folks pointed out a few problems with this.  One is with
> machine checks.  If one thread is booted and has CR4.MCE=1, but the
> other never gets booted and still has CR4.MCE=0, things go boom
> (everything goes to shutdown state) if a machine check happens.
>
> We've traditionally pretended that this does not happen because folks
> don't tend to turn off CPUs they've paid for via things like maxcpus= in
> the real world.
>
> Ashok Raj and Tony Luck were evidently looking at this at some point,
> but it got tricky and they decided it wasn't worth the trouble.
>
> It makes me think we should either scrap or recommend against "nosmt=force".
>
> Some relevant SDM language:
>
>> Because the logical processors within a physical package are tightly
>> coupled with respect to shared hardware resources, both logical
>> processors are notified of machine check errors that occur within a
>> given physical processor. If machine-check exceptions are enabled
>> when a fatal error is reported, all the logical processors within a
>> physical package are dispatched to the machine-check exception
>> handler. If machine-check exceptions are disabled, the logical
>> processors enter the shutdown state and assert the IERR# signal. When
>> enabling machine-check exceptions, the MCE flag in control register
>> CR4 should be set for each logical processor.

So what you're saying is that we need to boot all the threads, including
MCE setup etc., then leave them alone (mwait / deep C-states?) so they
avoid causing a shutdown?

If so, I've got quite a lot of extra work to do in Xen...
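
Roughly, something like this purely hypothetical parking loop. The
monitor target and the mwait hint are placeholders, nothing from this
series; a real version would pick the deepest supported C-state from
CPUID leaf 5:

/*
 * Park a sibling that was brought up far enough to have CR4.MCE set
 * but must never run real work.  Never returns.
 */
static char park_flag;

static void __noreturn park_secondary_thread(void)
{
        local_irq_disable();
        for (;;) {
                __monitor(&park_flag, 0, 0);
                __mwait(0x20, 0);       /* placeholder hint for a deep C-state */
        }
}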

~Andrew


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-28 22:13   ` Dave Hansen
  2018-06-28 22:19     ` Andrew Cooper
@ 2018-06-28 22:23     ` Luck, Tony
  2018-06-28 22:29       ` Andrew Cooper
  2018-06-29  8:26     ` [MODERATED] " Borislav Petkov
  2018-06-29  8:50     ` Thomas Gleixner
  3 siblings, 1 reply; 33+ messages in thread
From: Luck, Tony @ 2018-06-28 22:23 UTC (permalink / raw)
  To: speck

On Thu, Jun 28, 2018 at 03:13:11PM -0700, speck for Dave Hansen wrote:
> On 06/20/2018 01:19 PM, speck for Thomas Gleixner wrote:
> > +	/*
> > +	 * If SMT is force disabled and the APIC ID belongs to
> > +	 * a secondary thread, ignore it.
> > +	 */
> > +	if (apic_id_disabled(apicid)) {
> > +		pr_info_once("Ignoring secondary SMT threads\n");
> > +		return -EINVAL;
> > +	}
> 
> Thomas, this boottime-disable stuff just ends up ignoring the
> hyperthread and leaves it alone, right?
> 
> Some Intel folks pointed out a few problems with this.  One is with
> machine checks.  If one thread is booted and has CR4.MCE=1, but the
> other never gets booted and still has CR4.MCE=0, things go boom
> (everything goes to shutdown state) if a machine check happens.

There is also a power consumption concern. Asking BIOS folks now
whether a logical CPU in the BIOS waiting for SIPI is in a deep
C-state.

-Tony

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-28 22:23     ` Luck, Tony
@ 2018-06-28 22:29       ` Andrew Cooper
  0 siblings, 0 replies; 33+ messages in thread
From: Andrew Cooper @ 2018-06-28 22:29 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 1150 bytes --]

On 28/06/2018 23:23, speck for Luck, Tony wrote:
> On Thu, Jun 28, 2018 at 03:13:11PM -0700, speck for Dave Hansen wrote:
>> On 06/20/2018 01:19 PM, speck for Thomas Gleixner wrote:
>>> +	/*
>>> +	 * If SMT is force disabled and the APIC ID belongs to
>>> +	 * a secondary thread, ignore it.
>>> +	 */
>>> +	if (apic_id_disabled(apicid)) {
>>> +		pr_info_once("Ignoring secondary SMT threads\n");
>>> +		return -EINVAL;
>>> +	}
>> Thomas, this boottime-disable stuff just ends up ignoring the
>> hyperthread and leaves it alone, right?
>>
>> Some Intel folks pointed out a few problems with this.  One is with
>> machine checks.  If one thread is booted and has CR4.MCE=1, but the
>> other never gets booted and still has CR4.MCE=0, things go boom
>> (everything goes to shutdown state) if a machine check happens.
> There is also a power consumption concern. Asking BIOS folks now
> whether a logical CPU in the BIOS waiting for SIPI is in a deep
> C-state.

Alternatively, and probably better, are there any MSRs which can be used
to turn off hyperthreading once the OS starts, or is it a write-once-lock
setting?

~Andrew


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-28 22:13   ` Dave Hansen
  2018-06-28 22:19     ` Andrew Cooper
  2018-06-28 22:23     ` Luck, Tony
@ 2018-06-29  8:26     ` Borislav Petkov
  2018-06-29 17:01       ` Luck, Tony
  2018-06-29  8:50     ` Thomas Gleixner
  3 siblings, 1 reply; 33+ messages in thread
From: Borislav Petkov @ 2018-06-29  8:26 UTC (permalink / raw)
  To: speck

On Thu, Jun 28, 2018 at 03:13:11PM -0700, speck for Dave Hansen wrote:
> > If machine-check exceptions are disabled, the logical
> > processors enter the shutdown state and assert the IERR# signal. When
> > enabling machine-check exceptions, the MCE flag in control register
> > CR4 should be set for each logical processor.

If only we were able to set MCG_STATUS_LMCES on all relevant CPUs and
disable this MCE broadcasting insanity...
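
For reference, opting out of broadcast delivery on hardware with LMCE
support goes through IA32_MCG_EXT_CTL. A minimal sketch, roughly the
shape of mainline's intel_init_lmce() but simplified (the real code also
checks the FEATURE_CONTROL MSR), and it only helps CPUs which actually
execute it, which is exactly the problem for never-onlined siblings:

static void try_enable_lmce(void)
{
        u64 cap, ctl;

        rdmsrl(MSR_IA32_MCG_CAP, cap);
        if (!(cap & MCG_LMCE_P))
                return;                 /* no LMCE, broadcast it is */

        rdmsrl(MSR_IA32_MCG_EXT_CTL, ctl);
        wrmsrl(MSR_IA32_MCG_EXT_CTL, ctl | MCG_EXT_CTL_LMCE_EN);
}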

Ok, question: what exactly does the hw/microcode do on an offlined
logical CPU: does it test CR4.MCE first and assert IERR#, or does it
look at MCG_CTL/MCi_CTL too so that we can try to disable MCE reporting
on those offlined cores?

There must be some recommendation about what BIOS should be doing when
*it* causes an MCE during its init phase and the other cores are not up.
Or does it have its own #MC handler and then disables everything before
handoff to the OS?

Hmmm.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-28 22:13   ` Dave Hansen
                       ` (2 preceding siblings ...)
  2018-06-29  8:26     ` [MODERATED] " Borislav Petkov
@ 2018-06-29  8:50     ` Thomas Gleixner
  2018-06-29 16:48       ` [MODERATED] " Dave Hansen
  3 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-29  8:50 UTC (permalink / raw)
  To: speck

On Thu, 28 Jun 2018, speck for Dave Hansen wrote:

> On 06/20/2018 01:19 PM, speck for Thomas Gleixner wrote:
> > +	/*
> > +	 * If SMT is force disabled and the APIC ID belongs to
> > +	 * a secondary thread, ignore it.
> > +	 */
> > +	if (apic_id_disabled(apicid)) {
> > +		pr_info_once("Ignoring secondary SMT threads\n");
> > +		return -EINVAL;
> > +	}
> 
> Thomas, this boottime-disable stuff just ends up ignoring the
> hyperthread and leaves it alone, right?
> 
> Some Intel folks pointed out a few problems with this.  One is with
> machine checks.  If one thread is booted and has CR4.MCE=1, but the
> other never gets booted and still has CR4.MCE=0, things go boom
> (everything goes to shutdown state) if a machine check happens.
> 
> We've traditionally pretended that this does not happen because folks
> don't tend to turn off CPUs they've paid for via things like maxcpus= in
> the real world.
> 
> Ashok Raj and Tony Luck were evidently looking at this at some point,
> but it got tricky and they decided it wasn't worth the trouble.
> 
> It makes me think we should either scrap or recommend against "nosmt=force".

So you are saying that the boot process on _EVERY_ Intel box with HT
enabled just works by chance.

MCE is enabled on the boot CPU here:

[    0.244017] mce: CPU supports 22 MCE banks

The corresponding sibling #72 boots here:

[    1.008005] .... node  #0, CPUs:    #72

That means if an MCE hits on physical core 0 (logical CPUs 0 and 72)
between these two points the machine is going to shutdown.

At least it's a known safe state.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-29  8:50     ` Thomas Gleixner
@ 2018-06-29 16:48       ` Dave Hansen
  0 siblings, 0 replies; 33+ messages in thread
From: Dave Hansen @ 2018-06-29 16:48 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 825 bytes --]

On 06/29/2018 01:50 AM, speck for Thomas Gleixner wrote:
> On Thu, 28 Jun 2018, speck for Dave Hansen wrote:
>> Some Intel folks pointed out a few problems with this.  One is with
>> machine checks.  If one thread is booted and has CR4.MCE=1, but the
>> other never gets booted and still has CR4.MCE=0, things go boom
>> (everything goes to shutdown state) if a machine check happens.
...
> So you are saying that the boot process on _EVERY_ Intel box with HT
> enabled just works by chance.

Yeah, it works by praying an MCE doesn't hit in that window.  Seems to
be a rather screwy architecture to me.

> That means if an MCE hits on physical core 0 (logical CPUs 0 and 72)
> between these two points the machine is going to shutdown.

Yep.

It seems to be a well understood but somewhat ignored issue.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-29  8:26     ` [MODERATED] " Borislav Petkov
@ 2018-06-29 17:01       ` Luck, Tony
  2018-06-29 17:08         ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Luck, Tony @ 2018-06-29 17:01 UTC (permalink / raw)
  To: speck

On Fri, Jun 29, 2018 at 10:26:17AM +0200, speck for Borislav Petkov wrote:
> Ok, question: what exactly does the hw/microcode do on an offlined
> logical CPU: does it test CR4.MCE first and assert that IERR# or does it
> look at MCG_CTL/MCi_CTL too so that we can try to disable MCE reporting
> on those offlined cores?

MCi_CTL may or may not save you. Banks are shared across cores or the
whole package. So if you bring any CPU online that shares the bank that
reports the error, then the offline CPUs that share that bank will have
the machine check enabled.

For the hyper-thread disabled case you are guaranteed to be screwed.
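
Illustrative sketch of why per-bank disabling does not rescue a parked
sibling (bank count and sharing scope are model specific; this is not
anything proposed in the series):

static void disable_shared_banks(unsigned int nr_banks)
{
        unsigned int i;

        /*
         * Writing 0 from an online thread does keep a bank from
         * signalling ...
         */
        for (i = 0; i < nr_banks; i++)
                wrmsrl(MSR_IA32_MCx_CTL(i), 0);

        /*
         * ... but because the bank is shared per core or per package,
         * this also blinds every online CPU that shares it.  There is
         * no per-thread enable to clear for just the parked sibling.
         */
}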

-Tony

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [MODERATED] Re: Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-29 17:01       ` Luck, Tony
@ 2018-06-29 17:08         ` Linus Torvalds
  2018-06-29 18:36           ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2018-06-29 17:08 UTC (permalink / raw)
  To: speck



On Fri, 29 Jun 2018, speck for Luck, Tony wrote:
> 
> MCi_CTL may or may not save you. Banks are shared across cores or the
> whole package. So if you bring any CPU online that shares the bank that
> reports the error, then the offline CPUs that share that bank will have
> the machine check enabled.
> 
> For the hyper-thread disabled case you are guaranteed to be screwed.

Can we ask people inside intel to 

 (a) get a babysitter for the people who designed MCE? They need a 
     pacifier and/or a banana. It's not like this is the first time we hit 
     a "christ, this is so stupid that clearly no adult ever checked it" 
     issue with the garbage that Intel calls machine checks.

 (b) Get an adult to supervise and fix MCE once and for all for future 
     CPU's. 

Honestly, MCE's are broken garbage. The whole "let's broadcast this shit 
to random CPU's" was incredibly broken from the very beginning. Adding 
this "let's initialize CPU's in a mode that is guaranteed to be broken" is 
just icing on the shit-cake that is MCE.

Seriously. We've complained about MCE before. There must be people inside 
intel who are just twiddling their thumbs waiting for the process people 
get their shit together some day next decade, could some of them perhaps 
be made to look at MCE?

            Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Re: [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force
  2018-06-29 17:08         ` Linus Torvalds
@ 2018-06-29 18:36           ` Thomas Gleixner
  0 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2018-06-29 18:36 UTC (permalink / raw)
  To: speck

On Fri, 29 Jun 2018, speck for Linus Torvalds wrote:
> On Fri, 29 Jun 2018, speck for Luck, Tony wrote:
> > 
> > MCi_CTL may or may not save you. Banks are shared across cores or the
> > whole package. So if you bring any CPU online that shares the bank that
> > reports the error, then the offline CPUs that share that bank will have
> > the machine check enabled.
> > 
> > For the hyper-thread disabled case you are guaranteed to be screwed.
> 
> Can we ask people inside intel to 
> 
>  (a) get a babysitter for the people who designed MCE? They need a 
>      pacifier and/or a banana. It's not like this is the first time we hit 
>      a "christ, this is so stupid that clearly no adult ever checked it" 
>      issue with the garbage that Intel calls machine checks.
> 
>  (b) Get an adult to supervise and fix MCE once and for all for future 
>      CPU's. 
> 
> Honestly, MCE's are broken garbage. The whole "let's broadcast this shit 
> to random CPU's" was incredibly broken from the very beginning. Adding 
> this "let's initialize CPU's in a mode that is guaranteed to be broken" is 
> just icing on the shit-cake that is MCE.
> 
> Seriously. We've complained about MCE before. There must be people inside 
> intel who are just twiddling their thumbs waiting for the process people 
> get their shit together some day next decade, could some of them perhaps 
> be made to look at MCE?

The thumb twiddlers might already be members of the Massive Cake
Engineering department.

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2018-06-29 18:36 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-20 20:19 [patch V4 00/13] SMT control knobs Thomas Gleixner
2018-06-20 20:19 ` [patch V4 01/13] sched/smt: Update sched_smt_present at runtime Thomas Gleixner
2018-06-20 20:19 ` [patch V4 02/13] x86/smp: Provide topology_is_primary_thread() Thomas Gleixner
2018-06-27 13:56   ` [MODERATED] " Josh Poimboeuf
2018-06-27 15:54     ` Thomas Gleixner
2018-06-27 16:20       ` [MODERATED] " Josh Poimboeuf
2018-06-27 16:52         ` Thomas Gleixner
2018-06-20 20:19 ` [patch V4 03/13] cpu/hotplug: Make bringup/teardown of smp threads symmetric Thomas Gleixner
2018-06-20 20:19 ` [patch V4 04/13] cpu/hotplug: Split do_cpu_down() Thomas Gleixner
2018-06-20 20:19 ` [patch V4 05/13] cpu/hotplug: Provide knobs to control SMT Thomas Gleixner
2018-06-20 23:05   ` [MODERATED] " Greg KH
2018-06-21  6:08     ` Thomas Gleixner
2018-06-20 20:19 ` [patch V4 06/13] x86/cpu: Remove the pointless CPU printout Thomas Gleixner
2018-06-20 20:19 ` [patch V4 07/13] x86/cpu/AMD: Remove the pointless detect_ht() call Thomas Gleixner
2018-06-20 20:19 ` [patch V4 08/13] x86/cpu/common: Provide detect_ht_early() Thomas Gleixner
2018-06-20 20:19 ` [patch V4 09/13] x86/cpu/topology: Provide detect_extended_topology_early() Thomas Gleixner
2018-06-20 20:19 ` [patch V4 10/13] x86/cpu/intel: Evaluate smp_num_siblings early Thomas Gleixner
2018-06-20 20:19 ` [patch V4 11/13] x86/CPU/AMD: Do not check CPUID max ext level before parsing SMP info Thomas Gleixner
2018-06-20 20:19 ` [patch V4 12/13] x86/cpu/AMD: Evaluate smp_num_siblings early Thomas Gleixner
2018-06-20 20:19 ` [patch V4 13/13] x86/apic: Ignore secondary threads if nosmt=force Thomas Gleixner
2018-06-20 21:44   ` [MODERATED] " Linus Torvalds
2018-06-28 22:13   ` Dave Hansen
2018-06-28 22:19     ` Andrew Cooper
2018-06-28 22:23     ` Luck, Tony
2018-06-28 22:29       ` Andrew Cooper
2018-06-29  8:26     ` [MODERATED] " Borislav Petkov
2018-06-29 17:01       ` Luck, Tony
2018-06-29 17:08         ` Linus Torvalds
2018-06-29 18:36           ` Thomas Gleixner
2018-06-29  8:50     ` Thomas Gleixner
2018-06-29 16:48       ` [MODERATED] " Dave Hansen
2018-06-20 20:53 ` [patch V4 00/13] SMT control knobs Thomas Gleixner
2018-06-21 11:52 ` [MODERATED] " Ingo Molnar
