All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] cpu/hotplug: SMT control knobs
@ 2018-05-25 22:08 Thomas Gleixner
  2018-05-25 23:01 ` [MODERATED] " Andi Kleen
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Thomas Gleixner @ 2018-05-25 22:08 UTC (permalink / raw)
  To: speck

On Fri, 25 May 2018, speck for Thomas Gleixner wrote:
> On Thu, 24 May 2018, speck for Linus Torvalds wrote:
> > But if there is actually a way to turn HT off dynamically, maybe we could 
> > make it be CPU hotplug. That would certainly be _so_ much better than the 
> > nasty "turn it on/off in BIOS" even for other uses.
> 
> CPU hotplug is pretty much the only solution for runtime HT disable. In the
> current state it won't give the boot time allocated per cpu resources back
> and nr_present_cpus() will still be the same as before shutting them down,
> but everything else will be cleaned out. I need to check the scheduler
> topology stuff, but that should see the change as well. We could optimize
> that with the static key which is already available (sched_smt_present),
> but I have to look which nastiness is involved when toggling that at
> runtime.
> 
> So yes, it's different from the case where HT is disabled in the BIOS, but
> I don't think it really matters much in practice.
> 
> I'll implement a knob in sysfs for that and see how that behaves.

The below patch implements boot/runtime SMT control.

SMT can be disabled on the command line with "nosmt" or at runtime
controlled via

    /sys/devices/system/cpu/smt/disabled

There is also a file which tells whether SMT is active or not

    /sys/devices/system/cpu/smt/active

There are a few oddities, which are all fixable (see further down) but I
wanted to share the initial version for discussion.

1) If a CPU is onlined the first time, then the cpu_core_id is currently
   unknown, so there is no way to figure out whether it has an online
   sibling or not. The solution for this is to boot the cpu up to the point
   where it reaches the idle task and check whether it has an already
   online sibling. If no, the bringup continues. If yes, it cancels the
   operation.

   So the boot log looks a bit funny as it boots the secondary siblings and
   then tells that only half of the CPUs which were booted are actually
   booted.

   Another side effect is that the per cpu threads which have been created
   when the CPU had to be booted to figure out the core ID stay around. The
   smp_boot_threads are parked and the kworkers are in I state.

2) Depending on the ACPI tables the resulting CPU enumeration might not be
   linear.

   Many ACPI tables enumerate first the primary siblings of all cores and
   afterwards the secondary ones. In that case it ends up with CPU0-N where
   N is the number of physical cores.

   One of my machines enumerates the siblings consecutively so the cpu
   numbers of online CPUs are 0,2,4.... Virtualization has the same
   enumeration mode.

3) If both siblings are unplugged, then there is momentarily no possibility
   to prevent that the secondary sibling is plugged first which blocks a
   subsequent online request for the primary one.

4) The static key sched_smt_present is enabled at boot time after smp
   bringup, if the number of SMT threads is > 1.

   If the kernel is booted with 'nosmt' on the command line or with a
   limited number or CPUs, then the key stays disabled even if later the
   SMT disabled restriction is lifted. That's not a new problem, upstream
   has the same problem when booting with maxcpus=1 and then bringing the
   rest up later.

   The topoology rebuild is still working on hotplug, but it's certainly
   worthwhile to fix the behaviour of the static key. Peter?

#1 - #3 can be addressed by looking at the initial (x2)APIC ID.

For APIC bit 0 is reserved for the siblings on CPUs which support SMT. So
we can use the initial APIC ID tables to figure that out. For x2APIC the
number of bits valid for the SMT level can be queried from the boot CPU
topology enumeration CPUID leaves.

That allows to linearize the CPU number space and always put the secondary
siblings at the end - or the other way round :)

Plus it allows to make the command line switch permanent which limits the
number of possible CPUs (the ACPI enumeration would just discard the
siblings based on the (x2)APIC ID) and therefore the memory waste. OTOH the
full runtime control has it's charme as well.

Thoughts?

Thanks,

	tglx

8<----------------------------

 arch/x86/Kconfig                |    3 +
 arch/x86/include/asm/topology.h |    3 -
 arch/x86/kernel/smpboot.c       |   41 +++++++++++++
 kernel/cpu.c                    |  120 +++++++++++++++++++++++++++++++++++++++-
 4 files changed, 163 insertions(+), 4 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2274,6 +2274,9 @@ config DEBUG_HOTPLUG_CPU0
 
 	  If unsure, say N.
 
+config HOTPLUG_SMT
+	def_bool y if HOTPLUG_CPU
+
 config COMPAT_VDSO
 	def_bool n
 	prompt "Disable the 32-bit vDSO (needed for glibc 2.3.3)"
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -123,7 +123,8 @@ static inline int topology_max_smt_threa
 }
 
 int topology_update_package_map(unsigned int apicid, unsigned int cpu);
-extern int topology_phys_to_logical_pkg(unsigned int pkg);
+int topology_phys_to_logical_pkg(unsigned int pkg);
+int topology_get_online_sibling(unsigned int cpu);
 #else
 #define topology_max_packages()			(1)
 static inline int
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -326,6 +326,44 @@ void __init smp_store_boot_cpu_info(void
 	c->initialized = true;
 }
 
+/**
+ * topology_get_online_sibling - Get an online sibling for a CPU
+ * @cpu:	CPU to check for
+ */
+int topology_get_online_sibling(unsigned int cpu)
+{
+	struct cpuinfo_x86 *cs, *c = &cpu_data(cpu);
+	unsigned int sibling;
+
+	if (smp_num_siblings == 1)
+		return -ENODEV;
+
+	/* If @cpu is online, just query the sibling mask */
+	if (cpu_online(cpu)) {
+		sibling = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
+		return sibling < nr_cpu_ids ? sibling : -ENODEV;
+	}
+
+	/*
+	 * @cpu is about to be onlined. Walk the online CPUs and check for
+	 * matching phys_proc_id and core_id.
+	 *
+	 * If the CPU never booted before the core id is unknown
+	 * and cannot be queried.
+	 */
+	if (c->cpu_core_id == U16_MAX)
+		return -ENODEV;
+
+	for_each_online_cpu(sibling) {
+		cs = &cpu_data(sibling);
+
+		if (cs->phys_proc_id == c->phys_proc_id &&
+		    cs->cpu_core_id == c->cpu_core_id)
+			return sibling;
+	}
+	return -ENODEV;
+}
+
 /*
  * The bootstrap kernel entry code has set these up. Save them for
  * a given CPU
@@ -1217,6 +1255,8 @@ static void __init smp_cpu_index_default
 		c = &cpu_data(i);
 		/* mark all to hotplug */
 		c->cpu_index = nr_cpu_ids;
+		/* Mark core ID as unknown */
+		c->cpu_core_id = U16_MAX;
 	}
 }
 
@@ -1476,7 +1516,6 @@ static void remove_siblinginfo(int cpu)
 	cpumask_clear(cpu_llc_shared_mask(cpu));
 	cpumask_clear(topology_sibling_cpumask(cpu));
 	cpumask_clear(topology_core_cpumask(cpu));
-	c->cpu_core_id = 0;
 	c->booted_cores = 0;
 	cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
 	recompute_smt_state();
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -96,6 +96,8 @@ static inline void cpuhp_lock_release(bo
 
 #endif
 
+static inline bool cpu_smt_blocked(unsigned int cpu);
+
 /**
  * cpuhp_step - Hotplug state machine step
  * @name:	Name of the step
@@ -422,6 +424,15 @@ static int bringup_wait_for_ap(unsigned
 	stop_machine_unpark(cpu);
 	kthread_unpark(st->thread);
 
+	/*
+	 * If the CPU was booted the first time then there was no
+	 * information about the sibling state. If SMT is disabled then
+	 * return with an error and let the hotplug machinery offline
+	 * the CPU again.
+	 */
+	if (cpu_smt_blocked(cpu))
+		return -ECANCELED;
+
 	if (st->target <= CPUHP_AP_ONLINE_IDLE)
 		return 0;
 
@@ -754,7 +765,6 @@ static int takedown_cpu(unsigned int cpu
 
 	/* Park the smpboot threads */
 	kthread_park(per_cpu_ptr(&cpuhp_state, cpu)->thread);
-	smpboot_park_threads(cpu);
 
 	/*
 	 * Prevent irq alloc/free while the dying cpu reorganizes the
@@ -1039,6 +1049,9 @@ static int do_cpu_up(unsigned int cpu, e
 {
 	int err = 0;
 
+	if (cpu_smt_blocked(cpu))
+		return -EBUSY;
+
 	if (!cpu_possible(cpu)) {
 		pr_err("can't online cpu %d because it is not configured as may-hotadd at boot time\n",
 		       cpu);
@@ -1332,7 +1345,7 @@ static struct cpuhp_step cpuhp_hp_states
 	[CPUHP_AP_SMPBOOT_THREADS] = {
 		.name			= "smpboot/threads:online",
 		.startup.single		= smpboot_unpark_threads,
-		.teardown.single	= NULL,
+		.teardown.single	= smpboot_park_threads,
 	},
 	[CPUHP_AP_IRQ_AFFINITY_ONLINE] = {
 		.name			= "irq/affinity:online",
@@ -1906,10 +1919,113 @@ static const struct attribute_group cpuh
 	NULL
 };
 
+#ifdef CONFIG_HOTPLUG_SMT
+static bool cpu_smt_disabled __read_mostly;
+
+static int __init smt_disable(char *str)
+{
+	cpu_smt_disabled = true;
+	return 1;
+}
+__setup("nosmt", smt_disable);
+
+static inline bool cpu_smt_blocked(unsigned int cpu)
+{
+	return cpu_smt_disabled && topology_get_online_sibling(cpu) >= 0;
+}
+
+static ssize_t
+show_smt_disabled(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return snprintf(buf, PAGE_SIZE - 2, "%d\n", cpu_smt_disabled);
+}
+
+static int offline_smt_siblings(void)
+{
+	int cpu, sibling, ret = 0;
+
+	for_each_online_cpu(cpu) {
+		/* Bring all siblings down */
+		for (sibling = topology_get_online_sibling(cpu); sibling >= 0;
+		     sibling = topology_get_online_sibling(cpu)) {
+			ret = cpu_down(sibling);
+			if (ret)
+			       return ret;
+		}
+	}
+	return 0;
+}
+
+static ssize_t
+store_smt_disabled(struct device *dev, struct device_attribute *attr,
+		   const char *buf, size_t count)
+{
+	bool val;
+	int ret;
+
+	ret = kstrtobool(buf, &val);
+	if (ret)
+		return ret;
+
+	ret = lock_device_hotplug_sysfs();
+	if (ret)
+		return ret;
+
+	if (val == cpu_smt_disabled)
+		goto out;
+
+	if (val) {
+		ret = offline_smt_siblings();
+		if (ret)
+			goto out;
+	}
+	cpu_smt_disabled = val;
+out:
+	unlock_device_hotplug();
+	return ret ? ret : count;
+}
+static DEVICE_ATTR(disabled, 0644, show_smt_disabled, store_smt_disabled);
+
+static ssize_t
+show_smt_active(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	bool active = topology_max_smt_threads() > 1;
+
+	return snprintf(buf, PAGE_SIZE - 2, "%d\n", active);
+}
+static DEVICE_ATTR(active, 0444, show_smt_active, NULL);
+
+static struct attribute *cpuhp_smt_attrs[] = {
+	&dev_attr_disabled.attr,
+	&dev_attr_active.attr,
+	NULL
+};
+
+static const struct attribute_group cpuhp_smt_attr_group = {
+	.attrs = cpuhp_smt_attrs,
+	.name = "smt",
+	NULL
+};
+
+static int __init cpu_smt_state_init(void)
+{
+	return sysfs_create_group(&cpu_subsys.dev_root->kobj,
+				  &cpuhp_smt_attr_group);
+}
+
+#else
+static inline int cpu_smt_state_init(void) { return 0; }
+static inline bool cpu_smt_blocked(unsigned int cpu) { return false; }
+#endif
+
 static int __init cpuhp_sysfs_init(void)
 {
 	int cpu, ret;
 
+	ret = cpu_smt_state_init();
+	if (ret)
+		return ret;
+
 	ret = sysfs_create_group(&cpu_subsys.dev_root->kobj,
 				 &cpuhp_cpu_root_attr_group);
 	if (ret)

   

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [MODERATED] Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-25 22:08 [PATCH] cpu/hotplug: SMT control knobs Thomas Gleixner
@ 2018-05-25 23:01 ` Andi Kleen
  2018-05-26  6:58   ` Thomas Gleixner
  2018-05-29  7:17 ` Peter Zijlstra
  2018-05-29  9:23 ` [MODERATED] " Peter Zijlstra
  2 siblings, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2018-05-25 23:01 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 1599 bytes --]

On Sat, May 26, 2018 at 12:08:51AM +0200, speck for Thomas Gleixner wrote:
> On Fri, 25 May 2018, speck for Thomas Gleixner wrote:
> > On Thu, 24 May 2018, speck for Linus Torvalds wrote:
> > > But if there is actually a way to turn HT off dynamically, maybe we could 
> > > make it be CPU hotplug. That would certainly be _so_ much better than the 
> > > nasty "turn it on/off in BIOS" even for other uses.
> > 
> > CPU hotplug is pretty much the only solution for runtime HT disable. In the
> > current state it won't give the boot time allocated per cpu resources back
> > and nr_present_cpus() will still be the same as before shutting them down,
> > but everything else will be cleaned out. I need to check the scheduler
> > topology stuff, but that should see the change as well. We could optimize
> > that with the static key which is already available (sched_smt_present),
> > but I have to look which nastiness is involved when toggling that at
> > runtime.
> > 
> > So yes, it's different from the case where HT is disabled in the BIOS, but
> > I don't think it really matters much in practice.
> > 
> > I'll implement a knob in sysfs for that and see how that behaves.
> 
> The below patch implements boot/runtime SMT control.

Seems very complicated for something that can be done in user space today. 

The only advantage is to block newly onlined CPUs, but then perhaps
the user should never online them?

I usually use the attached python to disable SMT

cputop.py 'thread == 1' offline | sh

cputop.py 'thread == 1' online | sh

The expression can be extended to subset CPUs.

-Andi

[-- Attachment #2: cputop.py --]
[-- Type: text/plain, Size: 2325 bytes --]

#!/usr/bin/env python
# query cpu topology and print all matching cpu numbers
# cputop "query" ["format"]
# query is a python expression, using variables:
# socket, core, thread
# or "offline" to query all offline cpus
# format is a printf format with %d
# %d will be replaced with the cpu number
# format can be offline to offline the cpu or online to online
# Author: Andi Kleen
import sys, os, re, argparse

def numfile(fn):
    f = open(fn, "r")
    v = int(f.read())
    f.close()
    return v

def output(p, fmt):
    if fmt:
        print fmt % (p,)
    else:
        print p

ap = argparse.ArgumentParser(description='''
query cpu topology and print all matching cpu numbers
cputop "query" ["format"]
query is a python expression, using variables:
socket, core, thread
or "offline" to query all offline cpus
format is a printf format with %d
%d will be replaced with the cpu number''',
epilog='''
Examples:
print all cores on socket 0
cputop "socket == 0"

print all first threads in each core on socket 0
cputop "thread == 0 and socket == 0"

disable all second threads (disable hyper threading)
cputop "thread == 1" offline

reenable all offlined cpus
cputop offline online

print all online cpus
cputop True ''', formatter_class=argparse.RawTextHelpFormatter)
ap.add_argument('expr', help='python expression with socket/core/thread')
ap.add_argument('fmt', help='Output format string with %%d, or online/offline', nargs='?')
args = ap.parse_args()

special = {
    "offline": "echo 0 > /sys/devices/system/cpu/cpu%d/online",
    "online": "echo 1 > /sys/devices/system/cpu/cpu%d/online",
}        

if args.fmt in special:
    args.fmt = special[args.fmt]

base = "/sys/devices/system/cpu/"
p = {}
l = os.listdir(base)
for d in l:
    m = re.match(r"cpu([0-9]+)", d)
    if not m:
        continue
    proc = int(m.group(1))
    top = base + d + "/topology"
    if not os.path.exists(top):
        if args.expr == "offline":
            output(proc, args.fmt)
        continue
    socket = numfile(top + "/physical_package_id")
    core = numfile(top + "/core_id")
    n = 0
    while (socket, core, n) in p:
        n += 1
    p[(socket, core, n)] = proc

if args.expr == "offline":
    sys.exit(0)

for j in sorted(p.keys()):
    socket, core, thread = j
    if eval(args.expr):
        output(p[j], args.fmt)
 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-25 23:01 ` [MODERATED] " Andi Kleen
@ 2018-05-26  6:58   ` Thomas Gleixner
  2018-05-28 20:59     ` [MODERATED] " Jiri Kosina
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2018-05-26  6:58 UTC (permalink / raw)
  To: speck

On Fri, 25 May 2018, speck for Andi Kleen wrote:
> On Sat, May 26, 2018 at 12:08:51AM +0200, speck for Thomas Gleixner wrote:
> > The below patch implements boot/runtime SMT control.
> 
> Seems very complicated for something that can be done in user space today. 
> 
> The only advantage is to block newly onlined CPUs, but then perhaps
> the user should never online them?

Emphasis on should. I just experimented with it to see how ugly it becomes
and I don't think it's too ugly. It will be even simpler when the APIC ID
is used to identify the threads because that makes the whole loop business
go away.

The main advantage I see from having it in the kernel is ease of
deployment. It just needs a command line argument and no extra user space
interaction.

The extra bonus for using APIC ID plus a commend line switch like
"nosmt=force" is that it completely hides the threads and also avoids the
memory allocations, threads etc.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [MODERATED] Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-26  6:58   ` Thomas Gleixner
@ 2018-05-28 20:59     ` Jiri Kosina
  0 siblings, 0 replies; 13+ messages in thread
From: Jiri Kosina @ 2018-05-28 20:59 UTC (permalink / raw)
  To: speck

On Sat, 26 May 2018, speck for Thomas Gleixner wrote:

> The extra bonus for using APIC ID plus a commend line switch like 
> "nosmt=force" is that it completely hides the threads and also avoids 
> the memory allocations, threads etc.

I'd have sweared that we did have "noht" command-line option in the past, 
but I currently fail to see whether/why/when it's been removed.

But yeah, it existed exactly because of this reason IIRC -- to not keep 
all the housekeeping memory occupied for nothing.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [MODERATED] Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-25 22:08 [PATCH] cpu/hotplug: SMT control knobs Thomas Gleixner
  2018-05-25 23:01 ` [MODERATED] " Andi Kleen
@ 2018-05-29  7:17 ` Peter Zijlstra
  2018-05-29  7:54   ` Thomas Gleixner
  2018-05-29  9:23 ` [MODERATED] " Peter Zijlstra
  2 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2018-05-29  7:17 UTC (permalink / raw)
  To: speck

On Sat, May 26, 2018 at 12:08:51AM +0200, speck for Thomas Gleixner wrote:

> +/**
> + * topology_get_online_sibling - Get an online sibling for a CPU
> + * @cpu:	CPU to check for

Might want to document the return value..

> + */
> +int topology_get_online_sibling(unsigned int cpu)
> +{
> +	struct cpuinfo_x86 *cs, *c = &cpu_data(cpu);
> +	unsigned int sibling;
> +
> +	if (smp_num_siblings == 1)
> +		return -ENODEV;
> +
> +	/* If @cpu is online, just query the sibling mask */
> +	if (cpu_online(cpu)) {
> +		sibling = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
> +		return sibling < nr_cpu_ids ? sibling : -ENODEV;
> +	}
> +
> +	/*
> +	 * @cpu is about to be onlined. Walk the online CPUs and check for
> +	 * matching phys_proc_id and core_id.
> +	 *
> +	 * If the CPU never booted before the core id is unknown
> +	 * and cannot be queried.
> +	 */
> +	if (c->cpu_core_id == U16_MAX)
> +		return -ENODEV;
> +
> +	for_each_online_cpu(sibling) {

afaict, not all callers have the cpu hotplug lock taken, so online is
not a stable set.

That _might_ not matter, but it does need a comment if so.

> +		cs = &cpu_data(sibling);
> +
> +		if (cs->phys_proc_id == c->phys_proc_id &&
> +		    cs->cpu_core_id == c->cpu_core_id)
> +			return sibling;
> +	}
> +	return -ENODEV;
> +}
> +
>  /*
>   * The bootstrap kernel entry code has set these up. Save them for
>   * a given CPU


> +static int offline_smt_siblings(void)
> +{
> +	int cpu, sibling, ret = 0;
> +
> +	for_each_online_cpu(cpu) {

Again, no hotplug lock taken.

> +		/* Bring all siblings down */
> +		for (sibling = topology_get_online_sibling(cpu); sibling >= 0;
> +		     sibling = topology_get_online_sibling(cpu)) {
> +			ret = cpu_down(sibling);
> +			if (ret)
> +			       return ret;
> +		}
> +	}
> +	return 0;
> +}
> +
> +static ssize_t
> +store_smt_disabled(struct device *dev, struct device_attribute *attr,
> +		   const char *buf, size_t count)
> +{
> +	bool val;
> +	int ret;
> +
> +	ret = kstrtobool(buf, &val);
> +	if (ret)
> +		return ret;
> +
> +	ret = lock_device_hotplug_sysfs();

I wondered what that was, and omg wtf kind of hack is that?

> +	if (ret)
> +		return ret;
> +
> +	if (val == cpu_smt_disabled)
> +		goto out;
> +
> +	if (val) {
> +		ret = offline_smt_siblings();
> +		if (ret)
> +			goto out;
> +	}
> +	cpu_smt_disabled = val;

I think you want to set cpu_smt_disabled _before_ doing
offline_smt_siblings, such that we _will_ fail concurrent hotplug.

> +out:
> +	unlock_device_hotplug();
> +	return ret ? ret : count;
> +}

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-29  7:17 ` Peter Zijlstra
@ 2018-05-29  7:54   ` Thomas Gleixner
  2018-05-29  9:29     ` [MODERATED] " Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2018-05-29  7:54 UTC (permalink / raw)
  To: speck

On Tue, 29 May 2018, speck for Peter Zijlstra wrote:
> On Sat, May 26, 2018 at 12:08:51AM +0200, speck for Thomas Gleixner wrote:
> > +	for_each_online_cpu(sibling) {
> 
> afaict, not all callers have the cpu hotplug lock taken, so online is
> not a stable set.
> 
> That _might_ not matter, but it does need a comment if so.

See below.

> > +	ret = kstrtobool(buf, &val);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = lock_device_hotplug_sysfs();
> 
> I wondered what that was, and omg wtf kind of hack is that?

That's serializing the CPU hotplug related interfaces. 

> > +	if (ret)
> > +		return ret;
> > +
> > +	if (val == cpu_smt_disabled)
> > +		goto out;
> > +
> > +	if (val) {
> > +		ret = offline_smt_siblings();
> > +		if (ret)
> > +			goto out;
> > +	}
> > +	cpu_smt_disabled = val;
> 
> I think you want to set cpu_smt_disabled _before_ doing
> offline_smt_siblings, such that we _will_ fail concurrent hotplug.

Can't happen because that magic sysfs lock is held.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [MODERATED] Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-25 22:08 [PATCH] cpu/hotplug: SMT control knobs Thomas Gleixner
  2018-05-25 23:01 ` [MODERATED] " Andi Kleen
  2018-05-29  7:17 ` Peter Zijlstra
@ 2018-05-29  9:23 ` Peter Zijlstra
  2018-05-29 14:53   ` Thomas Gleixner
  2 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2018-05-29  9:23 UTC (permalink / raw)
  To: speck

On Sat, May 26, 2018 at 12:08:51AM +0200, speck for Thomas Gleixner wrote:
> 4) The static key sched_smt_present is enabled at boot time after smp
>    bringup, if the number of SMT threads is > 1.
> 
>    If the kernel is booted with 'nosmt' on the command line or with a
>    limited number or CPUs, then the key stays disabled even if later the
>    SMT disabled restriction is lifted. That's not a new problem, upstream
>    has the same problem when booting with maxcpus=1 and then bringing the
>    rest up later.
> 
>    The topoology rebuild is still working on hotplug, but it's certainly
>    worthwhile to fix the behaviour of the static key. Peter?

This seems to 'work'..

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8d59b259af4a..84c38a61a326 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5754,6 +5754,20 @@ int sched_cpu_activate(unsigned int cpu)
 	struct rq *rq = cpu_rq(cpu);
 	struct rq_flags rf;
 
+#ifdef CONFIG_SCHED_SMT
+	/*
+	 * We can't rely on the sched domains themselves to reliably inform us
+	 * of SMT, but if we ever see a CPU with siblings pass by, enable the
+	 * SMT code.
+	 *
+	 * The failure case for domains is when we boot with SMT disabled, then
+	 * create partitions that split all cores and then hotplug the
+	 * siblings. In that case we'll never build an SMT domain.
+	 */
+	if (cpumask_weight(cpu_smt_mask(cpu)) > 1)
+		static_branch_enable_cpuslocked(&sched_smt_present);
+#endif
+
 	set_cpu_active(cpu, true);
 
 	if (sched_smp_initialized) {
@@ -5851,22 +5865,6 @@ int sched_cpu_dying(unsigned int cpu)
 }
 #endif
 
-#ifdef CONFIG_SCHED_SMT
-DEFINE_STATIC_KEY_FALSE(sched_smt_present);
-
-static void sched_init_smt(void)
-{
-	/*
-	 * We've enumerated all CPUs and will assume that if any CPU
-	 * has SMT siblings, CPU0 will too.
-	 */
-	if (cpumask_weight(cpu_smt_mask(0)) > 1)
-		static_branch_enable(&sched_smt_present);
-}
-#else
-static inline void sched_init_smt(void) { }
-#endif
-
 void __init sched_init_smp(void)
 {
 	sched_init_numa();
@@ -5888,8 +5886,6 @@ void __init sched_init_smp(void)
 	init_sched_rt_class();
 	init_sched_dl_class();
 
-	sched_init_smt();
-
 	sched_smp_initialized = true;
 }
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e497c05aab7f..973396933809 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6238,6 +6238,7 @@ static inline int find_idlest_cpu(struct sched_domain *sd, struct task_struct *p
 }
 
 #ifdef CONFIG_SCHED_SMT
+DEFINE_STATIC_KEY_FALSE(sched_smt_present);
 
 static inline void set_idle_cores(int cpu, int val)
 {

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [MODERATED] Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-29  7:54   ` Thomas Gleixner
@ 2018-05-29  9:29     ` Peter Zijlstra
  2018-05-29  9:59       ` Thomas Gleixner
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2018-05-29  9:29 UTC (permalink / raw)
  To: speck

On Tue, May 29, 2018 at 09:54:57AM +0200, speck for Thomas Gleixner wrote:
> On Tue, 29 May 2018, speck for Peter Zijlstra wrote:
> > I think you want to set cpu_smt_disabled _before_ doing
> > offline_smt_siblings, such that we _will_ fail concurrent hotplug.
> 
> Can't happen because that magic sysfs lock is held.

No, userspace cannot, but there are plenty in-kernel hotplug thingies
around which can.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-29  9:29     ` [MODERATED] " Peter Zijlstra
@ 2018-05-29  9:59       ` Thomas Gleixner
  2018-05-29 10:48         ` [MODERATED] " Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2018-05-29  9:59 UTC (permalink / raw)
  To: speck

On Tue, 29 May 2018, speck for Peter Zijlstra wrote:
> On Tue, May 29, 2018 at 09:54:57AM +0200, speck for Thomas Gleixner wrote:
> > On Tue, 29 May 2018, speck for Peter Zijlstra wrote:
> > > I think you want to set cpu_smt_disabled _before_ doing
> > > offline_smt_siblings, such that we _will_ fail concurrent hotplug.
> > 
> > Can't happen because that magic sysfs lock is held.
> 
> No, userspace cannot, but there are plenty in-kernel hotplug thingies
> around which can.

Aside of suspend/resume are there more which can initiate hotplug from the
kernel?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [MODERATED] Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-29  9:59       ` Thomas Gleixner
@ 2018-05-29 10:48         ` Peter Zijlstra
  2018-05-29 12:58           ` Thomas Gleixner
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2018-05-29 10:48 UTC (permalink / raw)
  To: speck

On Tue, May 29, 2018 at 11:59:33AM +0200, speck for Thomas Gleixner wrote:
> On Tue, 29 May 2018, speck for Peter Zijlstra wrote:
> > On Tue, May 29, 2018 at 09:54:57AM +0200, speck for Thomas Gleixner wrote:
> > > On Tue, 29 May 2018, speck for Peter Zijlstra wrote:
> > > > I think you want to set cpu_smt_disabled _before_ doing
> > > > offline_smt_siblings, such that we _will_ fail concurrent hotplug.
> > > 
> > > Can't happen because that magic sysfs lock is held.
> > 
> > No, userspace cannot, but there are plenty in-kernel hotplug thingies
> > around which can.
> 
> Aside of suspend/resume are there more which can initiate hotplug from the
> kernel?

A quick grep shows:

  drivers/firmware/psci_checker.c

and I'm pretty sure there is some arm big-little hotplug thing
(bL_switcher) and some ACPI power thing that randomly unplugs CPUs if
the cabinet gets too hot or something, possible this thing:

  drivers/acpi/processor_throttling.c

and I think there were one or two other things around that did something
hotplug.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-29 10:48         ` [MODERATED] " Peter Zijlstra
@ 2018-05-29 12:58           ` Thomas Gleixner
  0 siblings, 0 replies; 13+ messages in thread
From: Thomas Gleixner @ 2018-05-29 12:58 UTC (permalink / raw)
  To: speck

On Tue, 29 May 2018, speck for Peter Zijlstra wrote:
> On Tue, May 29, 2018 at 11:59:33AM +0200, speck for Thomas Gleixner wrote:
> > On Tue, 29 May 2018, speck for Peter Zijlstra wrote:
> > > On Tue, May 29, 2018 at 09:54:57AM +0200, speck for Thomas Gleixner wrote:
> > > > On Tue, 29 May 2018, speck for Peter Zijlstra wrote:
> > > > > I think you want to set cpu_smt_disabled _before_ doing
> > > > > offline_smt_siblings, such that we _will_ fail concurrent hotplug.
> > > > 
> > > > Can't happen because that magic sysfs lock is held.
> > > 
> > > No, userspace cannot, but there are plenty in-kernel hotplug thingies
> > > around which can.
> > 
> > Aside of suspend/resume are there more which can initiate hotplug from the
> > kernel?
> 
> A quick grep shows:
> 
>   drivers/firmware/psci_checker.c
> 
> and I'm pretty sure there is some arm big-little hotplug thing
> (bL_switcher) and some ACPI power thing that randomly unplugs CPUs if
> the cabinet gets too hot or something, possible this thing:
> 
>   drivers/acpi/processor_throttling.c

It does all kind of crap, but it does not do hotplug :)

> and I think there were one or two other things around that did something
> hotplug.

These I found:

arch/arm64/kernel/hibernate.c:          ret = cpu_up(sleep_cpu);
arch/powerpc/kernel/rtas.c:                     cpuret = cpu_up(cpu);
arch/sparc/kernel/ds.c:         err = cpu_up(cpu);
arch/x86/kernel/topology.c:             ret = cpu_up(cpu);
arch/x86/mm/mmio-mod.c:         err = cpu_up(cpu);
drivers/base/cpu.c:     ret = cpu_up(cpuid);
drivers/firmware/psci_checker.c:                int ret = cpu_up(cpu);
kernel/torture.c:       ret = cpu_up(cpu);

So yes, we need some more protection. I'll have a look.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-29  9:23 ` [MODERATED] " Peter Zijlstra
@ 2018-05-29 14:53   ` Thomas Gleixner
  2018-05-29 15:01     ` Thomas Gleixner
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2018-05-29 14:53 UTC (permalink / raw)
  To: speck

On Tue, 29 May 2018, speck for Peter Zijlstra wrote:
> On Sat, May 26, 2018 at 12:08:51AM +0200, speck for Thomas Gleixner wrote:
> >    The topoology rebuild is still working on hotplug, but it's certainly
> >    worthwhile to fix the behaviour of the static key. Peter?
> 
> This seems to 'work'..

:)

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8d59b259af4a..84c38a61a326 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5754,6 +5754,20 @@ int sched_cpu_activate(unsigned int cpu)
>  	struct rq *rq = cpu_rq(cpu);
>  	struct rq_flags rf;
>  
> +#ifdef CONFIG_SCHED_SMT
> +	/*
> +	 * We can't rely on the sched domains themselves to reliably inform us
> +	 * of SMT, but if we ever see a CPU with siblings pass by, enable the
> +	 * SMT code.
> +	 *
> +	 * The failure case for domains is when we boot with SMT disabled, then
> +	 * create partitions that split all cores and then hotplug the
> +	 * siblings. In that case we'll never build an SMT domain.
> +	 */
> +	if (cpumask_weight(cpu_smt_mask(cpu)) > 1)
> +		static_branch_enable_cpuslocked(&sched_smt_present);

This triggers the WARN_ON_ONCE() in static_key_enable() when the next
sibling is onlined.... So this wants to be:

	if (!static_branch_likely(&sched_smt_present) &&
	    cpumask_weight(cpu_smt_mask(cpu)) > 1)
		static_branch_enable_cpuslocked(&sched_smt_present);
	    
or something like that.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] cpu/hotplug: SMT control knobs
  2018-05-29 14:53   ` Thomas Gleixner
@ 2018-05-29 15:01     ` Thomas Gleixner
  0 siblings, 0 replies; 13+ messages in thread
From: Thomas Gleixner @ 2018-05-29 15:01 UTC (permalink / raw)
  To: speck

On Tue, 29 May 2018, speck for Thomas Gleixner wrote:
> On Tue, 29 May 2018, speck for Peter Zijlstra wrote:
> > +	if (cpumask_weight(cpu_smt_mask(cpu)) > 1)
> > +		static_branch_enable_cpuslocked(&sched_smt_present);
> 
> This triggers the WARN_ON_ONCE() in static_key_enable() when the next
> sibling is onlined.... So this wants to be:
> 
> 	if (!static_branch_likely(&sched_smt_present) &&
> 	    cpumask_weight(cpu_smt_mask(cpu)) > 1)
> 		static_branch_enable_cpuslocked(&sched_smt_present);
> 	    
> or something like that.

Ignore me. Misread the code ....

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-05-29 15:01 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-25 22:08 [PATCH] cpu/hotplug: SMT control knobs Thomas Gleixner
2018-05-25 23:01 ` [MODERATED] " Andi Kleen
2018-05-26  6:58   ` Thomas Gleixner
2018-05-28 20:59     ` [MODERATED] " Jiri Kosina
2018-05-29  7:17 ` Peter Zijlstra
2018-05-29  7:54   ` Thomas Gleixner
2018-05-29  9:29     ` [MODERATED] " Peter Zijlstra
2018-05-29  9:59       ` Thomas Gleixner
2018-05-29 10:48         ` [MODERATED] " Peter Zijlstra
2018-05-29 12:58           ` Thomas Gleixner
2018-05-29  9:23 ` [MODERATED] " Peter Zijlstra
2018-05-29 14:53   ` Thomas Gleixner
2018-05-29 15:01     ` Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.