* [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
@ 2016-09-21 11:39 Prarit Bhargava
  2016-09-21 11:39 ` [PATCH 1/2 v3] drivers/base: Combine topology.c and cpu.c Prarit Bhargava
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Prarit Bhargava @ 2016-09-21 11:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Prarit Bhargava, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, Greg Kroah-Hartman, Peter Zijlstra, Len Brown,
	Borislav Petkov, Andi Kleen, Jiri Olsa, Juergen Gross

The information in the /sys/devices/system/cpu/cpuX/topology
directory is useful for userspace monitoring applications and for
in-tree utilities like cpupower & turbostat.

When down'ing a CPU, the /sys/devices/system/cpu/cpuX/topology directory
is removed during the CPU_DEAD hotplug callback in the kernel.  The
problem with this model is that the CPU has not been physically removed
and the data in the topology directory is still valid.  IOW, the cpu is
still present, but the kernel has removed the topology directory, making
it very difficult to determine exactly where the cpu is located.

This patchset adds CONFIG_PERMANENT_CPU_TOPOLOGY, which defaults to Y
for x86 and N for all other arches.  When enabled, the topology
directory is added to the core cpu sysfs files so that the topology
directory exists for the lifetime of the CPU.  When disabled, the
behavior of the current kernel is maintained (that is, the topology
directory is removed on a down and added on an up).  Enabling
CONFIG_PERMANENT_CPU_TOPOLOGY on other arches may require additional
architecture-specific changes so that the cpumask data in the CPU's
topology is not cleared during a CPU down.

This patchset combines drivers/base/topology.c and drivers/base/cpu.c to
implement CONFIG_PERMANENT_CPU_TOPOLOGY and leaves all arches except
x86 with the current behavior.

peterz asked when the topology directory is destroyed when
CONFIG_PERMANENT_CPU_TOPOLOGY=y.  With CONFIG_PERMANENT_CPU_TOPOLOGY=y
the topology directory's lifetime changes: instead of existing only
while the thread is online, it is created when the struct device
associated with the thread is created, and similarly destroyed when
that struct device is destroyed.

[v2 & v3]: Fix ktest build robot build failures

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Juergen Gross <jgross@suse.com>

Prarit Bhargava (2):
  drivers/base: Combine topology.c and cpu.c
  cpu hotplug: add CONFIG_PERMANENT_CPU_TOPOLOGY

 arch/x86/kernel/smpboot.c |    3 -
 drivers/base/Kconfig      |   12 ++++
 drivers/base/Makefile     |    2 +-
 drivers/base/cpu.c        |  148 +++++++++++++++++++++++++++++++++++++++
 drivers/base/topology.c   |  168 ---------------------------------------------
 5 files changed, 161 insertions(+), 172 deletions(-)
 delete mode 100644 drivers/base/topology.c

-- 
1.7.9.3

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/2 v3] drivers/base: Combine topology.c and cpu.c
  2016-09-21 11:39 [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event Prarit Bhargava
@ 2016-09-21 11:39 ` Prarit Bhargava
  2016-09-21 11:39 ` [PATCH 2/2 v3] cpu hotplug: add CONFIG_PERMANENT_CPU_TOPOLOGY Prarit Bhargava
  2016-09-21 13:04 ` [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event Borislav Petkov
  2 siblings, 0 replies; 20+ messages in thread
From: Prarit Bhargava @ 2016-09-21 11:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Prarit Bhargava, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, Greg Kroah-Hartman, Peter Zijlstra, Len Brown,
	Borislav Petkov, Andi Kleen, Jiri Olsa, Juergen Gross

The topology.c file contains sysfs files that describe a cpu's location
and its siblings.  There is no reason for this code to live separately
from the core cpu code.  This patch merges topology.c into cpu.c to make
the next set of changes easier to understand.

There are no functional changes with this patch.

[v2]: fix error from ktest build robot with CONFIG_KEXEC=n
[v3]: fix error from ktest build robot with ARCH=sparc

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
---
 drivers/base/Makefile   |    2 +-
 drivers/base/cpu.c      |  140 +++++++++++++++++++++++++++++++++++++++
 drivers/base/topology.c |  168 -----------------------------------------------
 3 files changed, 141 insertions(+), 169 deletions(-)
 delete mode 100644 drivers/base/topology.c

diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 2609ba20b396..e50491bb3d60 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -4,7 +4,7 @@ obj-y			:= component.o core.o bus.o dd.o syscore.o \
 			   driver.o class.o platform.o \
 			   cpu.o firmware.o init.o map.o devres.o \
 			   attribute_container.o transport_class.o \
-			   topology.o container.o property.o cacheinfo.o
+			   container.o property.o cacheinfo.o
 obj-$(CONFIG_DEVTMPFS)	+= devtmpfs.o
 obj-$(CONFIG_DMA_CMA) += dma-contiguous.o
 obj-y			+= power/
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 691eeea2f19a..ce722fdf48c6 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -17,6 +17,7 @@
 #include <linux/of.h>
 #include <linux/cpufeature.h>
 #include <linux/tick.h>
+#include <linux/hardirq.h>
 
 #include "base.h"
 
@@ -180,6 +181,145 @@ static struct attribute_group crash_note_cpu_attr_group = {
 };
 #endif
 
+
+#define define_id_show_func(name)				\
+static ssize_t name##_show(struct device *dev,			\
+		struct device_attribute *attr, char *buf)	\
+{								\
+	return sprintf(buf, "%d\n", topology_##name(dev->id));	\
+}
+
+#define define_siblings_show_map(name, mask)				\
+static ssize_t name##_show(struct device *dev,				\
+			   struct device_attribute *attr, char *buf)	\
+{									\
+	return cpumap_print_to_pagebuf(false, buf, topology_##mask(dev->id));\
+}
+
+#define define_siblings_show_list(name, mask)				\
+static ssize_t name##_list_show(struct device *dev,			\
+				struct device_attribute *attr,		\
+				char *buf)				\
+{									\
+	return cpumap_print_to_pagebuf(true, buf, topology_##mask(dev->id));\
+}
+
+#define define_siblings_show_func(name, mask)	\
+	define_siblings_show_map(name, mask);	\
+	define_siblings_show_list(name, mask)
+
+define_id_show_func(physical_package_id);
+static DEVICE_ATTR_RO(physical_package_id);
+
+define_id_show_func(core_id);
+static DEVICE_ATTR_RO(core_id);
+
+define_siblings_show_func(thread_siblings, sibling_cpumask);
+static DEVICE_ATTR_RO(thread_siblings);
+static DEVICE_ATTR_RO(thread_siblings_list);
+
+define_siblings_show_func(core_siblings, core_cpumask);
+static DEVICE_ATTR_RO(core_siblings);
+static DEVICE_ATTR_RO(core_siblings_list);
+
+#ifdef CONFIG_SCHED_BOOK
+define_id_show_func(book_id);
+static DEVICE_ATTR_RO(book_id);
+define_siblings_show_func(book_siblings, book_cpumask);
+static DEVICE_ATTR_RO(book_siblings);
+static DEVICE_ATTR_RO(book_siblings_list);
+#endif
+
+#ifdef CONFIG_SCHED_DRAWER
+define_id_show_func(drawer_id);
+static DEVICE_ATTR_RO(drawer_id);
+define_siblings_show_func(drawer_siblings, drawer_cpumask);
+static DEVICE_ATTR_RO(drawer_siblings);
+static DEVICE_ATTR_RO(drawer_siblings_list);
+#endif
+
+static struct attribute *topology_attrs[] = {
+	&dev_attr_physical_package_id.attr,
+	&dev_attr_core_id.attr,
+	&dev_attr_thread_siblings.attr,
+	&dev_attr_thread_siblings_list.attr,
+	&dev_attr_core_siblings.attr,
+	&dev_attr_core_siblings_list.attr,
+#ifdef CONFIG_SCHED_BOOK
+	&dev_attr_book_id.attr,
+	&dev_attr_book_siblings.attr,
+	&dev_attr_book_siblings_list.attr,
+#endif
+#ifdef CONFIG_SCHED_DRAWER
+	&dev_attr_drawer_id.attr,
+	&dev_attr_drawer_siblings.attr,
+	&dev_attr_drawer_siblings_list.attr,
+#endif
+	NULL
+};
+
+static struct attribute_group topology_attr_group = {
+	.attrs = topology_attrs,
+	.name = "topology"
+};
+
+/* Add/Remove cpu_topology interface for CPU device */
+static int topology_add_dev(unsigned int cpu)
+{
+	struct device *dev = get_cpu_device(cpu);
+
+	return sysfs_create_group(&dev->kobj, &topology_attr_group);
+}
+
+static void topology_remove_dev(unsigned int cpu)
+{
+	struct device *dev = get_cpu_device(cpu);
+
+	sysfs_remove_group(&dev->kobj, &topology_attr_group);
+}
+
+static int topology_cpu_callback(struct notifier_block *nfb,
+				 unsigned long action, void *hcpu)
+{
+	unsigned int cpu = (unsigned long)hcpu;
+	int rc = 0;
+
+	switch (action) {
+	case CPU_UP_PREPARE:
+	case CPU_UP_PREPARE_FROZEN:
+		rc = topology_add_dev(cpu);
+		break;
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
+	case CPU_DEAD:
+	case CPU_DEAD_FROZEN:
+		topology_remove_dev(cpu);
+		break;
+	}
+	return notifier_from_errno(rc);
+}
+
+static int topology_sysfs_init(void)
+{
+	int cpu;
+	int rc = 0;
+
+	cpu_notifier_register_begin();
+
+	for_each_online_cpu(cpu) {
+		rc = topology_add_dev(cpu);
+		if (rc)
+			goto out;
+	}
+	__hotcpu_notifier(topology_cpu_callback, 0);
+
+out:
+	cpu_notifier_register_done();
+	return rc;
+}
+
+device_initcall(topology_sysfs_init);
+
 static const struct attribute_group *common_cpu_attr_groups[] = {
 #ifdef CONFIG_KEXEC
 	&crash_note_cpu_attr_group,
diff --git a/drivers/base/topology.c b/drivers/base/topology.c
deleted file mode 100644
index df3c97cb4c99..000000000000
--- a/drivers/base/topology.c
+++ /dev/null
@@ -1,168 +0,0 @@
-/*
- * driver/base/topology.c - Populate sysfs with cpu topology information
- *
- * Written by: Zhang Yanmin, Intel Corporation
- *
- * Copyright (C) 2006, Intel Corp.
- *
- * All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
- * NON INFRINGEMENT.  See the GNU General Public License for more
- * details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
- */
-#include <linux/mm.h>
-#include <linux/cpu.h>
-#include <linux/module.h>
-#include <linux/hardirq.h>
-#include <linux/topology.h>
-
-#define define_id_show_func(name)				\
-static ssize_t name##_show(struct device *dev,			\
-		struct device_attribute *attr, char *buf)	\
-{								\
-	return sprintf(buf, "%d\n", topology_##name(dev->id));	\
-}
-
-#define define_siblings_show_map(name, mask)				\
-static ssize_t name##_show(struct device *dev,				\
-			   struct device_attribute *attr, char *buf)	\
-{									\
-	return cpumap_print_to_pagebuf(false, buf, topology_##mask(dev->id));\
-}
-
-#define define_siblings_show_list(name, mask)				\
-static ssize_t name##_list_show(struct device *dev,			\
-				struct device_attribute *attr,		\
-				char *buf)				\
-{									\
-	return cpumap_print_to_pagebuf(true, buf, topology_##mask(dev->id));\
-}
-
-#define define_siblings_show_func(name, mask)	\
-	define_siblings_show_map(name, mask);	\
-	define_siblings_show_list(name, mask)
-
-define_id_show_func(physical_package_id);
-static DEVICE_ATTR_RO(physical_package_id);
-
-define_id_show_func(core_id);
-static DEVICE_ATTR_RO(core_id);
-
-define_siblings_show_func(thread_siblings, sibling_cpumask);
-static DEVICE_ATTR_RO(thread_siblings);
-static DEVICE_ATTR_RO(thread_siblings_list);
-
-define_siblings_show_func(core_siblings, core_cpumask);
-static DEVICE_ATTR_RO(core_siblings);
-static DEVICE_ATTR_RO(core_siblings_list);
-
-#ifdef CONFIG_SCHED_BOOK
-define_id_show_func(book_id);
-static DEVICE_ATTR_RO(book_id);
-define_siblings_show_func(book_siblings, book_cpumask);
-static DEVICE_ATTR_RO(book_siblings);
-static DEVICE_ATTR_RO(book_siblings_list);
-#endif
-
-#ifdef CONFIG_SCHED_DRAWER
-define_id_show_func(drawer_id);
-static DEVICE_ATTR_RO(drawer_id);
-define_siblings_show_func(drawer_siblings, drawer_cpumask);
-static DEVICE_ATTR_RO(drawer_siblings);
-static DEVICE_ATTR_RO(drawer_siblings_list);
-#endif
-
-static struct attribute *default_attrs[] = {
-	&dev_attr_physical_package_id.attr,
-	&dev_attr_core_id.attr,
-	&dev_attr_thread_siblings.attr,
-	&dev_attr_thread_siblings_list.attr,
-	&dev_attr_core_siblings.attr,
-	&dev_attr_core_siblings_list.attr,
-#ifdef CONFIG_SCHED_BOOK
-	&dev_attr_book_id.attr,
-	&dev_attr_book_siblings.attr,
-	&dev_attr_book_siblings_list.attr,
-#endif
-#ifdef CONFIG_SCHED_DRAWER
-	&dev_attr_drawer_id.attr,
-	&dev_attr_drawer_siblings.attr,
-	&dev_attr_drawer_siblings_list.attr,
-#endif
-	NULL
-};
-
-static struct attribute_group topology_attr_group = {
-	.attrs = default_attrs,
-	.name = "topology"
-};
-
-/* Add/Remove cpu_topology interface for CPU device */
-static int topology_add_dev(unsigned int cpu)
-{
-	struct device *dev = get_cpu_device(cpu);
-
-	return sysfs_create_group(&dev->kobj, &topology_attr_group);
-}
-
-static void topology_remove_dev(unsigned int cpu)
-{
-	struct device *dev = get_cpu_device(cpu);
-
-	sysfs_remove_group(&dev->kobj, &topology_attr_group);
-}
-
-static int topology_cpu_callback(struct notifier_block *nfb,
-				 unsigned long action, void *hcpu)
-{
-	unsigned int cpu = (unsigned long)hcpu;
-	int rc = 0;
-
-	switch (action) {
-	case CPU_UP_PREPARE:
-	case CPU_UP_PREPARE_FROZEN:
-		rc = topology_add_dev(cpu);
-		break;
-	case CPU_UP_CANCELED:
-	case CPU_UP_CANCELED_FROZEN:
-	case CPU_DEAD:
-	case CPU_DEAD_FROZEN:
-		topology_remove_dev(cpu);
-		break;
-	}
-	return notifier_from_errno(rc);
-}
-
-static int topology_sysfs_init(void)
-{
-	int cpu;
-	int rc = 0;
-
-	cpu_notifier_register_begin();
-
-	for_each_online_cpu(cpu) {
-		rc = topology_add_dev(cpu);
-		if (rc)
-			goto out;
-	}
-	__hotcpu_notifier(topology_cpu_callback, 0);
-
-out:
-	cpu_notifier_register_done();
-	return rc;
-}
-
-device_initcall(topology_sysfs_init);
-- 
1.7.9.3


* [PATCH 2/2 v3] cpu hotplug: add CONFIG_PERMANENT_CPU_TOPOLOGY
  2016-09-21 11:39 [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event Prarit Bhargava
  2016-09-21 11:39 ` [PATCH 1/2 v3] drivers/base: Combine topology.c and cpu.c Prarit Bhargava
@ 2016-09-21 11:39 ` Prarit Bhargava
  2016-09-21 13:04 ` [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event Borislav Petkov
  2 siblings, 0 replies; 20+ messages in thread
From: Prarit Bhargava @ 2016-09-21 11:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Prarit Bhargava, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, Greg Kroah-Hartman, Peter Zijlstra, Len Brown,
	Borislav Petkov, Andi Kleen, Jiri Olsa, Juergen Gross

The information in the /sys/devices/system/cpu/cpuX/topology
directory is useful for userspace monitoring applications and for
in-tree utilities like cpupower & turbostat.

When down'ing a thread, the /sys/devices/system/cpu/cpuX/topology
directory is removed during the CPU_DEAD hotplug callback in the kernel.
The problem with this model is that the thread's core has not been
physically removed, the data in the topology directory is still valid,
and the core's location is now lost to userspace.

This patch adds CONFIG_PERMANENT_CPU_TOPOLOGY, which defaults to Y for
x86 and N for all other arches.  When enabled, the topology directory is
added to the core cpu sysfs files so that the topology directory exists
while the CPU is physically present.  When disabled, the behavior of the
current kernel is maintained (that is, the topology directory is removed
on a soft down of a thread and added on a soft up).

Enabling CONFIG_PERMANENT_CPU_TOPOLOGY on other arches may require
additional architecture-specific changes so that the cpumask data in the
CPU's topology is not cleared during a CPU down.

Before patch:

[root@hp-z620-01 ~]# grep ^ /sys/devices/system/cpu/cpu10/topology/*
/sys/devices/system/cpu/cpu10/topology/core_id:3
/sys/devices/system/cpu/cpu10/topology/core_siblings:ffff
/sys/devices/system/cpu/cpu10/topology/core_siblings_list:0-15
/sys/devices/system/cpu/cpu10/topology/physical_package_id:0
/sys/devices/system/cpu/cpu10/topology/thread_siblings:0404
/sys/devices/system/cpu/cpu10/topology/thread_siblings_list:2,10

Down a cpu

[root@hp-z620-01 ~]# echo 0 > /sys/devices/system/cpu/cpu10/online

[root@hp-z620-01 ~]# ls /sys/devices/system/cpu/cpu10/topology
ls: cannot access topology: No such file or directory

After patch:

[root@hp-z620-01 ~]# grep ^ /sys/devices/system/cpu/cpu10/topology/*
/sys/devices/system/cpu/cpu10/topology/core_id:3
/sys/devices/system/cpu/cpu10/topology/core_siblings:ffff
/sys/devices/system/cpu/cpu10/topology/core_siblings_list:0-15
/sys/devices/system/cpu/cpu10/topology/physical_package_id:0
/sys/devices/system/cpu/cpu10/topology/thread_siblings:0404
/sys/devices/system/cpu/cpu10/topology/thread_siblings_list:2,10

Down a cpu

[root@hp-z620-01 ~]# echo 0 > /sys/devices/system/cpu/cpu10/online

[root@hp-z620-01 ~]# grep ^ /sys/devices/system/cpu/cpu10/topology/*
/sys/devices/system/cpu/cpu10/topology/core_id:3
/sys/devices/system/cpu/cpu10/topology/core_siblings:0000
/sys/devices/system/cpu/cpu10/topology/core_siblings_list:
/sys/devices/system/cpu/cpu10/topology/physical_package_id:0
/sys/devices/system/cpu/cpu10/topology/thread_siblings:0000
/sys/devices/system/cpu/cpu10/topology/thread_siblings_list:

I did some testing with and without BOOTPARAM_HOTPLUG_CPU0 enabled,
and up'd and down'd threads in sequence, randomly, by thread group, and
by socket group, and didn't see any issues.

core_siblings and thread_siblings report "package siblings that are
online" and "thread siblings that are online" respectively, and are used
as such within the kernel.  They must be zeroed out when the thread is
offline.

CONFIG_PERMANENT_CPU_TOPOLOGY=y changes the lifetime of the topology
directory from existing when a thread is online to when a thread is
created and destroyed.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
---
 arch/x86/kernel/smpboot.c |    3 ---
 drivers/base/Kconfig      |   12 ++++++++++++
 drivers/base/cpu.c        |    8 ++++++++
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 4296beb8fdd3..ae82f8f45b61 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1472,7 +1472,6 @@ static void recompute_smt_state(void)
 static void remove_siblinginfo(int cpu)
 {
 	int sibling;
-	struct cpuinfo_x86 *c = &cpu_data(cpu);
 
 	for_each_cpu(sibling, topology_core_cpumask(cpu)) {
 		cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
@@ -1490,8 +1489,6 @@ static void remove_siblinginfo(int cpu)
 	cpumask_clear(cpu_llc_shared_mask(cpu));
 	cpumask_clear(topology_sibling_cpumask(cpu));
 	cpumask_clear(topology_core_cpumask(cpu));
-	c->phys_proc_id = 0;
-	c->cpu_core_id = 0;
 	cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
 	recompute_smt_state();
 }
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 98504ec99c7d..b3935a272c3c 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -324,4 +324,16 @@ config CMA_ALIGNMENT
 
 endif
 
+config PERMANENT_CPU_TOPOLOGY
+	bool "Permanent CPU Topology"
+	depends on HOTPLUG_CPU
+	default y if X86_64
+	help
+	  This option configures CPU topology to be permanent for the lifetime
+	  of the CPU (until it is physically removed).  Selecting Y here
+	  results in the kernel reporting the physical location for offlined
+	  CPUs.
+
+	  If unsure, leave the default value as is.
+
 endmenu
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index ce722fdf48c6..891afc94c149 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -263,6 +263,7 @@ static struct attribute_group topology_attr_group = {
 	.name = "topology"
 };
 
+#ifndef CONFIG_PERMANENT_CPU_TOPOLOGY
 /* Add/Remove cpu_topology interface for CPU device */
 static int topology_add_dev(unsigned int cpu)
 {
@@ -319,11 +320,15 @@ out:
 }
 
 device_initcall(topology_sysfs_init);
+#endif
 
 static const struct attribute_group *common_cpu_attr_groups[] = {
 #ifdef CONFIG_KEXEC
 	&crash_note_cpu_attr_group,
 #endif
+#ifdef CONFIG_PERMANENT_CPU_TOPOLOGY
+	&topology_attr_group,
+#endif
 	NULL
 };
 
@@ -331,6 +336,9 @@ static const struct attribute_group *hotplugable_cpu_attr_groups[] = {
 #ifdef CONFIG_KEXEC
 	&crash_note_cpu_attr_group,
 #endif
+#ifdef CONFIG_PERMANENT_CPU_TOPOLOGY
+	&topology_attr_group,
+#endif
 	NULL
 };
 
-- 
1.7.9.3


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-21 11:39 [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event Prarit Bhargava
  2016-09-21 11:39 ` [PATCH 1/2 v3] drivers/base: Combine topology.c and cpu.c Prarit Bhargava
  2016-09-21 11:39 ` [PATCH 2/2 v3] cpu hotplug: add CONFIG_PERMANENT_CPU_TOPOLOGY Prarit Bhargava
@ 2016-09-21 13:04 ` Borislav Petkov
  2016-09-21 13:32   ` Prarit Bhargava
  2 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2016-09-21 13:04 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Greg Kroah-Hartman, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross

On Wed, Sep 21, 2016 at 07:39:31AM -0400, Prarit Bhargava wrote:
> The information in /sys/devices/system/cpu/cpuX/topology
> directory is useful for userspace monitoring applications and in-tree
> utilities like cpupower & turbostat.
> 
> When down'ing a CPU the /sys/devices/system/cpu/cpuX/topology directory is
> removed during the CPU_DEAD hotplug callback in the kernel.  The problem
> with this model is that the CPU has not been physically removed and the
> data in the topology directory is still valid.  IOW, the cpu is still
> present but the kernel has removed the topology directory making it
> very difficult to determine exactly where the cpu is located.

So I'm afraid I still don't understand what the problem here is.

And the commit message of 2/2 doesn't make it any clearer. Can you
please give a concrete example what the problem is and what you're
trying to achieve.

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-21 13:04 ` [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event Borislav Petkov
@ 2016-09-21 13:32   ` Prarit Bhargava
  2016-09-21 14:01     ` Borislav Petkov
  0 siblings, 1 reply; 20+ messages in thread
From: Prarit Bhargava @ 2016-09-21 13:32 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Greg Kroah-Hartman, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross



On 09/21/2016 09:04 AM, Borislav Petkov wrote:
> On Wed, Sep 21, 2016 at 07:39:31AM -0400, Prarit Bhargava wrote:
>> The information in /sys/devices/system/cpu/cpuX/topology
>> directory is useful for userspace monitoring applications and in-tree
>> utilities like cpupower & turbostat.
>>
>> When down'ing a CPU the /sys/devices/system/cpu/cpuX/topology directory is
>> removed during the CPU_DEAD hotplug callback in the kernel.  The problem
>> with this model is that the CPU has not been physically removed and the
>> data in the topology directory is still valid.  IOW, the cpu is still
>> present but the kernel has removed the topology directory making it
>> very difficult to determine exactly where the cpu is located.
> 
> So I'm afraid I still don't understand what the problem here is.
> 
> And the commit message of 2/2 doesn't make it any clearer. Can you
> please give a concrete example what the problem is and what you're
> trying to achieve.

Sorry, I'll try again....

Right now, if you removed thread 29 from your system you would do:

echo 0 > /sys/devices/system/cpu/cpu29/online

From the userspace side, this results in the removal of the

/sys/devices/system/cpu/cpu29/topology

directory.

This is not the right thing to do [1].  The topology directory should
exist as long as the thread is present in the system.  The thread (and
its core) are still physically there; it's just that the thread is not
available to the scheduler.  The topology of the thread hasn't changed
by being soft offlined this way.

turbostat was modified to deal with the missing topology directory, and
the in-tree utility cpupower prints out significantly less information
when a thread is offline.  ISTR a powertop bug due to hotplug too.  This
makes these monitoring utilities a problem for users who want only one
thread per core.

The patchset does two things.  The first patch unifies the topology.c
and cpu.c code.  The second patch introduces a config option that
changes the lifetime of the topology directory so that it exists as long
as the thread's struct device exists in the device subsystem.

This now means that

echo 0 > /sys/devices/system/cpu/cpu29/online

will result in the thread's topology directory staying around until the struct
device associated with it is destroyed upon a physical socket hotplug event.

This patchset will result in cleanups to turbostat, and make fixes to cpupower
*much* easier to deal with.

[1] I cannot say with any certainty that other arches do or do not require this
change.  That is the only reason the change is restricted to x86 right now.

P.

> 
> Thanks.
> 


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-21 13:32   ` Prarit Bhargava
@ 2016-09-21 14:01     ` Borislav Petkov
  2016-09-22 11:59       ` Prarit Bhargava
  0 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2016-09-21 14:01 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Greg Kroah-Hartman, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross

On Wed, Sep 21, 2016 at 09:32:47AM -0400, Prarit Bhargava wrote:
> This is not the right thing to do [1].  The topology directory should exist as
> long as the thread is present in the system.  The thread (and its core) are
> still physically there, it's just that the thread is not available to the
> scheduler.  The topology of the thread hasn't changed due to it being soft
> offlined this way.

So far so good.

> turbostat was modified to deal with the missing topology directory, and in tree
> utility cpupower prints out significantly less information when a thread is
> offline.

Why does it do that? Why does an offlined core change that info?

Concrete details please.

> ISTR a powertop bug due to hotplug too.  This makes these monitoring
> utilities a problem for users who want only one thread per core.

one thread per core? What does that mean?

> This now means that
> 
> echo 0 > /sys/devices/system/cpu/cpu29/online
> 
> will result in the thread's topology directory staying around until the struct
> device associated with it is destroyed upon a physical socket hotplug event.

So your 2/2 says that on an offlined CPU, you have

/sys/devices/system/cpu/cpu10/topology/core_id:3
/sys/devices/system/cpu/cpu10/topology/core_siblings:0000
/sys/devices/system/cpu/cpu10/topology/core_siblings_list:
/sys/devices/system/cpu/cpu10/topology/physical_package_id:0
/sys/devices/system/cpu/cpu10/topology/thread_siblings:0000
/sys/devices/system/cpu/cpu10/topology/thread_siblings_list:

and this information is bollocks. core_siblings is 0, thread_siblings
is 0. You can just as well not have them there at all.

So is this whole jumping around just so that you can have a
/sys/devices/system/cpu/cpu10/topology directory and so that tools don't
get confused by it missing?

So again, what exactly are those tools accessing and how does the
offlined cores puzzle them?

A concrete example please:

"turbostat tries to access X and it is gone when the CPU is offlined so
this is a problem because it can't do Y"

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-21 14:01     ` Borislav Petkov
@ 2016-09-22 11:59       ` Prarit Bhargava
  2016-09-22 12:10         ` Borislav Petkov
  0 siblings, 1 reply; 20+ messages in thread
From: Prarit Bhargava @ 2016-09-22 11:59 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Greg Kroah-Hartman, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross



On 09/21/2016 10:01 AM, Borislav Petkov wrote:
> On Wed, Sep 21, 2016 at 09:32:47AM -0400, Prarit Bhargava wrote:
>> This is not the right thing to do [1].  The topology directory should exist as
>> long as the thread is present in the system.  The thread (and its core) are
>> still physically there, it's just that the thread is not available to the
>> scheduler.  The topology of the thread hasn't changed due to it being soft
>> offlined this way.
> 
> So far so good.
> 
>> turbostat was modified to deal with the missing topology directory, and in tree
>> utility cpupower prints out significantly less information when a thread is
>> offline.
> 
> Why does it do that? Why does an offlined core change that info?
> 
> Concrete details please.
> 
>> ISTR a powertop bug due to hotplug too.  This makes these monitoring
>> utilities a problem for users who want only one thread per core.
> 
> one thread per core? What does that mean?

Systems boot (usually) with 2 threads/core.  Some performance users want
one thread per core.  Since there is no "noht" option anymore, users use
/sys to disable a thread on each core.

> 
>> This now means that
>>
>> echo 0 > /sys/devices/system/cpu/cpu29/online
>>
>> will result in the thread's topology directory staying around until the struct
>> device associated with it is destroyed upon a physical socket hotplug event.
> 
> So your 2/2 says that on an offlined CPU, you have
> 
> /sys/devices/system/cpu/cpu10/topology/core_id:3
> /sys/devices/system/cpu/cpu10/topology/core_siblings:0000
> /sys/devices/system/cpu/cpu10/topology/core_siblings_list:
> /sys/devices/system/cpu/cpu10/topology/physical_package_id:0
> /sys/devices/system/cpu/cpu10/topology/thread_siblings:0000
> /sys/devices/system/cpu/cpu10/topology/thread_siblings_list:
> 
> and this information is bollocks. core_siblings is 0, thread_siblings
> is 0. You can just as well not have them there at all.

core_siblings and thread_siblings list the sibling cores and threads that are
currently available to the scheduler, so they are 0 when the thread is
offline.  That comes directly from reading the code.

> 
> So is this whole jumping around just so that you can have a
> /sys/devices/system/cpu/cpu10/topology directory and so that tools don't
> get confused by it missing?

Yes.

> 
> So again, what exactly are those tools accessing and how does the
> offlined cores puzzle them?
> 
> A concrete example please:
> 

See commit 20102ac5bee3 ("cpupower: cpupower monitor reports uninitialized
values for offline cpus").  That patch papers over the bug of not being able to
find core_id and physical_package_id for an offline thread.

P.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-22 11:59       ` Prarit Bhargava
@ 2016-09-22 12:10         ` Borislav Petkov
  2016-09-26 11:45           ` Prarit Bhargava
  0 siblings, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2016-09-22 12:10 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Greg Kroah-Hartman, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross

On Thu, Sep 22, 2016 at 07:59:08AM -0400, Prarit Bhargava wrote:
> Systems usually boot with 2 threads/core.  Some performance-sensitive users want
> one thread per core.  Since there is no "noht" kernel option anymore, users use
> /sys to disable one thread on each core.

I see.

> core_siblings and thread_siblings are the online thread's sibling cores and
> threads that are available to the scheduler

Hmm, I see something else:

<Documentation/cputopology.txt>:
7) /sys/devices/system/cpu/cpuX/topology/core_siblings:

        internal kernel map of cpuX's hardware threads within the same
        physical_package_id.

> and should be 0 when the thread is offline. That comes directly from
> reading the code.

But then code which reads those will have to *know* that those cores are
offline - otherwise it would be confused by what it is reading there.

For example, the core siblings of an offlined core are still the same,
they don't change. It is just the core that is offline.

> See commit 20102ac5bee3 ("cpupower: cpupower monitor reports uninitialized
> values for offline cpus").  That patch papers over the bug of not being able to
> find core_id and physical_package_id for an offline thread.

Right, and this is *exactly* the *right* thing to do - tools should
handle the case gracefully when cores are offline.

And we already state that explicitly in /sys/devices/system/cpu/online.

And for offlined cores they should show exactly that - cores
are offlined and either say that the measurement is going to be
wrong/inaccurate or, if they're showing per-core info, they don't show it
for the offlined cores or show special symbols like stars and whatnot.

So far I still see no need for changing anything in the kernel.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-22 12:10         ` Borislav Petkov
@ 2016-09-26 11:45           ` Prarit Bhargava
  2016-09-26 11:57             ` Borislav Petkov
  2016-09-26 11:59             ` Peter Zijlstra
  0 siblings, 2 replies; 20+ messages in thread
From: Prarit Bhargava @ 2016-09-26 11:45 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Greg Kroah-Hartman, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross



On 09/22/2016 08:10 AM, Borislav Petkov wrote:
> On Thu, Sep 22, 2016 at 07:59:08AM -0400, Prarit Bhargava wrote:
>> Systems usually boot with 2 threads/core.  Some performance-sensitive users want
>> one thread per core.  Since there is no "noht" kernel option anymore, users use
>> /sys to disable one thread on each core.
> 
> I see.
> 
>> core_siblings and thread_siblings are the online thread's sibling cores and
>> threads that are available to the scheduler
> 
> Hmm, I see something else:
> 
> <Documentation/cputopology.txt>:
> 7) /sys/devices/system/cpu/cpuX/topology/core_siblings:
> 
>         internal kernel map of cpuX's hardware threads within the same
>         physical_package_id.

Thanks.  I'll send a patch to modify the Documentation which is out-of-date.

> 
>> and should be 0 when the thread is offline. That comes directly from
>> reading the code.
> 
> But then code which reads those will have to *know* that those cores are
> offline - otherwise it would be confused by what it is reading there.

When offline, /sys/devices/system/cpu/cpuX/online is 0.  The problem is that
when online is 0, the topology directory disappears so there is no way to
determine _the location_ of the offlined thread.

> 
> For example, the core siblings of an offlined core are still the same,
> they don't change. It is just the core that is offline.

Please see (in latest linux.git)

arch/x86/kernel/smpboot.c:1493 function remove_siblinginfo()

Specifically,

        cpumask_clear(topology_sibling_cpumask(cpu));
        cpumask_clear(topology_core_cpumask(cpu));

> 
>> See commit 20102ac5bee3 ("cpupower: cpupower monitor reports uninitialized
>> values for offline cpus").  That patch papers over the bug of not being able to
>> find core_id and physical_package_id for an offline thread.
> 
> Right, and this is *exactly* the *right* thing to do - tools should
> handle the case gracefully when cores are offline.

cpupower should still print out all asterisks for down'd threads.  It does not
because the topology directory is incorrectly removed.

IOW how does userspace know the _location_ of the thread?  The topology
directory no longer exists when the thread is downed, so core_id and
physical_package_id (both of which would be effectively static) do not exist.
The whole point of this patchset is to know where the offline'd thread actually is.
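For illustration, a hedged sketch of what a monitoring tool has to do today
(hypothetical helper, not cpupower's actual code): it can only fall back to
'*' because the topology directory vanishes along with the thread.

```shell
#!/bin/sh
# Report a thread's socket/core location from its per-cpu sysfs directory,
# e.g. /sys/devices/system/cpu/cpu2.  When the thread is offline the
# topology directory is gone, so the location is unknowable from userspace.

report_location() {
    cpudir=$1
    if [ -r "$cpudir/topology/physical_package_id" ]; then
        pkg=$(cat "$cpudir/topology/physical_package_id")
        core=$(cat "$cpudir/topology/core_id")
        echo "package $pkg core $core"
    else
        # Topology directory removed after 'echo 0 > online'.
        echo "package * core *"
    fi
}
```

Under the current kernel behavior, the else branch is the only option for an
offlined thread; the patchset's point is that the first branch could keep
working instead.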

P.


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-26 11:45           ` Prarit Bhargava
@ 2016-09-26 11:57             ` Borislav Petkov
  2016-09-27 11:45               ` Prarit Bhargava
  2016-09-26 11:59             ` Peter Zijlstra
  1 sibling, 1 reply; 20+ messages in thread
From: Borislav Petkov @ 2016-09-26 11:57 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Greg Kroah-Hartman, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross

On Mon, Sep 26, 2016 at 07:45:37AM -0400, Prarit Bhargava wrote:
> When offline, /sys/devices/system/cpu/cpuX/online is 0.  The problem is that
> when online is 0, the topology directory disappears so there is no way to
> determine _the location_ of the offlined thread.

What does "the location" mean exactly?

> cpupower should still print out all asterisks for down'd threads.  It does not
> because the topology directory is incorrectly removed.
> 
> IOW how does userspace know the _location_ of the thread?  The topology
> directory no longer exists when the thread is downed, so core_id and
> physical_package_id (both of which would be effectively static) do not exist.
> The whole point of this patchset is to know where the offline'd thread actually is.

What do you mean "where"?

$ echo 0 > /sys/devices/system/cpu/cpu2/online
$ cat /sys/devices/system/cpu/online
0-1,3-7

So core 2 is right between 1 and 3.

If you need to show the package id, you still iterate over the core
numbers in an increasing order and show '*' for the offlined ones.
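For illustration, the iteration described above could look like this minimal
sketch (hypothetical helper names; the "0-1,3-7" range format is the standard
sysfs CPU list format):

```shell
#!/bin/sh
# Walk CPU numbers in increasing order and print '*' for CPUs missing
# from the online mask, as read from /sys/devices/system/cpu/online.

# Return success if CPU $2 is contained in a range list like "0-1,3-7".
cpu_in_mask() {
    mask=$1 cpu=$2
    for part in $(echo "$mask" | tr ',' ' '); do
        lo=${part%-*} hi=${part#*-}     # a bare "3" yields lo=hi=3
        [ "$cpu" -ge "$lo" ] && [ "$cpu" -le "$hi" ] && return 0
    done
    return 1
}

# Print one line per CPU: its number if online, '*' if offline.
show_cpus() {
    mask=$1 ncpus=$2
    i=0
    while [ "$i" -lt "$ncpus" ]; do
        if cpu_in_mask "$mask" "$i"; then echo "$i"; else echo '*'; fi
        i=$((i + 1))
    done
}
```

With the mask "0-1,3-7" on an 8-CPU box this prints 0, 1, '*', 3..7 - the
offlined CPU shows up as '*', but nothing here can say which socket or core
it sits on.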



* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-26 11:45           ` Prarit Bhargava
  2016-09-26 11:57             ` Borislav Petkov
@ 2016-09-26 11:59             ` Peter Zijlstra
  2016-09-27 11:47               ` Prarit Bhargava
  1 sibling, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2016-09-26 11:59 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Borislav Petkov, linux-kernel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Greg Kroah-Hartman, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross

On Mon, Sep 26, 2016 at 07:45:37AM -0400, Prarit Bhargava wrote:
> > But then code which reads those will have to *know* that those cores are
> > offline - otherwise it would be confused by what it is reading there.
> 
> When offline, /sys/devices/system/cpu/cpuX/online is 0.  The problem is that
> when online is 0, the topology directory disappears so there is no way to
> determine _the location_ of the offlined thread.

As far as all that code is concerned, that CPU doesn't even have a
location anymore.

While there might be some distinction between hotplug and physical
hotplug on the user API side (I really wouldn't know), there isn't on
the kernel side.

Once you unplug a CPU, its _gone_. There isn't another hotplug operation
once you really take the CPU out.

Offline means out gone, vamoosh.

And it doesn't make sense to talk about the location of a resource
that's not there.


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-26 11:57             ` Borislav Petkov
@ 2016-09-27 11:45               ` Prarit Bhargava
  2016-09-27 13:49                 ` Greg Kroah-Hartman
  2016-09-28  5:02                 ` Borislav Petkov
  0 siblings, 2 replies; 20+ messages in thread
From: Prarit Bhargava @ 2016-09-27 11:45 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Greg Kroah-Hartman, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross



On 09/26/2016 07:57 AM, Borislav Petkov wrote:
> On Mon, Sep 26, 2016 at 07:45:37AM -0400, Prarit Bhargava wrote:
>> When offline, /sys/devices/system/cpu/cpuX/online is 0.  The problem is that
>> when online is 0, the topology directory disappears so there is no way to
>> determine _the location_ of the offlined thread.
> 
> What does "the location" mean exactly?
> 
>> cpupower should still print out all asterisks for down'd threads.  It does not
>> because the topology directory is incorrectly removed.
>>
>> IOW how does userspace know the _location_ of the thread?  The topology
>> directory no longer exists when the thread is downed, so core_id and
>> physical_package_id (both of which would be effectively static) do not exist.
>> The whole point of this patchset is to know where the offline'd thread actually is.
> 
> What do you mean "where"?

The socket and core location.

Look at it this way (and let's get the terminology straight at the same time).

You have a CPU socket.  That socket has cores on it.  Each core (at least on
Intel) has two threads.

I down a thread (as you did):

> 
> $ echo 0 > /sys/devices/system/cpu/cpu2/online

This results in the topology directory being destroyed.  It shouldn't be -- the
socket and core are still there.  If you could open up your computer you could
touch them.  This is similar to downing a PCI device, or removing a non-kernel
memory DIMM from a system.  The device is still physically there.

> $ cat /sys/devices/system/cpu/online
> 0-1,3-7
> 
> So core 2 is right between 1 and 3.

Yes.  But *where* is it relative to the cores and socket(s)?

> 
> If you need to show the package id, you still iterate over the core
> numbers in an increasing order and show '*' for the offlined ones.
> 

Explain this in more detail please?

P.


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-26 11:59             ` Peter Zijlstra
@ 2016-09-27 11:47               ` Prarit Bhargava
  2016-09-27 11:57                 ` Peter Zijlstra
  0 siblings, 1 reply; 20+ messages in thread
From: Prarit Bhargava @ 2016-09-27 11:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Borislav Petkov, linux-kernel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Greg Kroah-Hartman, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross



On 09/26/2016 07:59 AM, Peter Zijlstra wrote:
> On Mon, Sep 26, 2016 at 07:45:37AM -0400, Prarit Bhargava wrote:
>>> But then code which reads those will have to *know* that those cores are
>>> offline - otherwise it would be confused by what it is reading there.
>>
>> When offline, /sys/devices/system/cpu/cpuX/online is 0.  The problem is that
>> when online is 0, the topology directory disappears so there is no way to
>> determine _the location_ of the offlined thread.
> 
> As far as all that code is concerned, that CPU doesn't even have a
> location anymore.
> 
> While there might be some distinction between hotplug and physical
> hotplug on the user API side (I really wouldn't know), there isn't on
> the kernel side.
> 
> Once you unplug a CPU, its _gone_. There isn't another hotplug operation
> once you really take the CPU out.

There's a difference between soft remove (via sysfs) and a true hot remove
operation (where the whole thing is physically removed).  Soft remove only
results in the processor being made "not available" to the scheduler.

> 
> Offline means out gone, vamoosh.

No, that is incorrect.  The socket that contains the cores (and threads) is
still plugged in.

> 
> And it doesn't make sense to talk about the location of a resource
> that's not there.
> 

Again, it is _physically_ there.

P.


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-27 11:47               ` Prarit Bhargava
@ 2016-09-27 11:57                 ` Peter Zijlstra
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2016-09-27 11:57 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Borislav Petkov, linux-kernel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Greg Kroah-Hartman, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross

On Tue, Sep 27, 2016 at 07:47:56AM -0400, Prarit Bhargava wrote:

> There's a difference between soft remove (via sysfs) and a true hot remove
> operation (where the whole thing is physically removed).  Soft remove only
> results in the processor being made "not available" to the scheduler.

How is it different? We do _ONE_ CPU unplug operation. We do not touch
the thing anymore after that. It _can_ be taken out after that.

Therefore hotplug wipes the topology information and clears the CPU from
relevant bitmasks.

From the kernel's POV there really is no distinction.


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-27 11:45               ` Prarit Bhargava
@ 2016-09-27 13:49                 ` Greg Kroah-Hartman
  2016-09-27 15:26                   ` Prarit Bhargava
  2016-09-28  5:02                 ` Borislav Petkov
  1 sibling, 1 reply; 20+ messages in thread
From: Greg Kroah-Hartman @ 2016-09-27 13:49 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Borislav Petkov, linux-kernel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross

On Tue, Sep 27, 2016 at 07:45:56AM -0400, Prarit Bhargava wrote:
> On 09/26/2016 07:57 AM, Borislav Petkov wrote:
> > $ echo 0 > /sys/devices/system/cpu/cpu2/online
> 
> This results in the topology directory being destroyed.  It shouldn't be -- the
> socket and core are still there.  If you could open up your computer you could
> touch them.  This is similar to downing a PCI device, or removing a non-kernel
> memory DIMM from a system.  The device is still physically there.

If you remove a PCI device from the system, by turning it off from the
PCI hotplug interface, it will go away from sysfs, just like this CPU
device is going away, no matter if you physically remove the device or
not.

Thanks for validating that everything is working the same and correctly
:)

greg k-h


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-27 13:49                 ` Greg Kroah-Hartman
@ 2016-09-27 15:26                   ` Prarit Bhargava
  2016-09-28  5:05                     ` Borislav Petkov
  2016-09-28  6:48                     ` Peter Zijlstra
  0 siblings, 2 replies; 20+ messages in thread
From: Prarit Bhargava @ 2016-09-27 15:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Borislav Petkov, linux-kernel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross



On 09/27/2016 09:49 AM, Greg Kroah-Hartman wrote:
> On Tue, Sep 27, 2016 at 07:45:56AM -0400, Prarit Bhargava wrote:
>> On 09/26/2016 07:57 AM, Borislav Petkov wrote:
>>> $ echo 0 > /sys/devices/system/cpu/cpu2/online
>>
>> This results in the topology directory being destroyed.  It shouldn't be -- the
>> socket and core are still there.  If you could open up your computer you could
>> touch them.  This is similar to downing a PCI device, or removing a non-kernel
>> memory DIMM from a system.  The device is still physically there.
> 
> If you remove a PCI device from the system, by turning it off from the
> PCI hotplug interface, it will go away from sysfs, just like this CPU
> device is going away, no matter if you physically remove the device or
> not.
> 

While similar, the thread's device struct, links to other structures, and
cpumasks in the kernel are kept in place.  In the PCI case these links are all
destroyed, which means a bus rescan is required to find the device.  The sysfs
(soft) removal of a thread has a very different outcome from the PCI case.

I see now that the issue is not understanding the difference between physical
and soft thread removal.  I will write that up and get back to everyone.

P.

> Thanks for validating that everything is working the same and correctly
> :)
> 
> greg k-h
> 


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-27 11:45               ` Prarit Bhargava
  2016-09-27 13:49                 ` Greg Kroah-Hartman
@ 2016-09-28  5:02                 ` Borislav Petkov
  1 sibling, 0 replies; 20+ messages in thread
From: Borislav Petkov @ 2016-09-28  5:02 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Greg Kroah-Hartman, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross

On Tue, Sep 27, 2016 at 07:45:56AM -0400, Prarit Bhargava wrote:
> Yes.  But *where* is it relative to the cores and socket(s)?

And you need that information because...

> > If you need to show the package id, you still iterate over the core
> > numbers in an increasing order and show '*' for the offlined ones.
> > 
> 
> Explain this in more detail please?

Just s/package id/core id/ and then it makes sense - I meant "core id".



* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-27 15:26                   ` Prarit Bhargava
@ 2016-09-28  5:05                     ` Borislav Petkov
  2016-09-28  6:48                     ` Peter Zijlstra
  1 sibling, 0 replies; 20+ messages in thread
From: Borislav Petkov @ 2016-09-28  5:05 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Greg Kroah-Hartman, linux-kernel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Peter Zijlstra, Len Brown, Andi Kleen,
	Jiri Olsa, Juergen Gross

On Tue, Sep 27, 2016 at 11:26:14AM -0400, Prarit Bhargava wrote:
> I see now that the issue is not understanding the difference between physical
> and soft thread removal.  I will write that up and get back to everyone.

No need - we understand the issue.

What I don't understand is what information you need *exactly* in sysfs
from the offlined cores and why. Something like "I need to know the id
of the thread on core X on socket Y because..."

I've been trying to get it out of you but I've failed so far at it.



* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-27 15:26                   ` Prarit Bhargava
  2016-09-28  5:05                     ` Borislav Petkov
@ 2016-09-28  6:48                     ` Peter Zijlstra
  2016-09-28 10:06                       ` Prarit Bhargava
  1 sibling, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2016-09-28  6:48 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Greg Kroah-Hartman, Borislav Petkov, linux-kernel,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Len Brown,
	Andi Kleen, Jiri Olsa, Juergen Gross

On Tue, Sep 27, 2016 at 11:26:14AM -0400, Prarit Bhargava wrote:
> I see now that the issue is not understanding the difference between physical
> and soft thread removal.  I will write that up and get back to everyone.

You don't seem to understand that from the kernels POV there is no such
distinction.

There is only one unplug operation that caters to both cases. Therefore
unplug needs to assume the most stringent (ie. physical) for things to
work right.

There is no 'soft thread removal', except maybe in the userspace API,
you're the one that is confused.


* Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event
  2016-09-28  6:48                     ` Peter Zijlstra
@ 2016-09-28 10:06                       ` Prarit Bhargava
  0 siblings, 0 replies; 20+ messages in thread
From: Prarit Bhargava @ 2016-09-28 10:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Greg Kroah-Hartman, Borislav Petkov, linux-kernel,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Len Brown,
	Andi Kleen, Jiri Olsa, Juergen Gross



On 09/28/2016 02:48 AM, Peter Zijlstra wrote:
> On Tue, Sep 27, 2016 at 11:26:14AM -0400, Prarit Bhargava wrote:
>> I see now that the issue is not understanding the difference between physical
>> and soft thread removal.  I will write that up and get back to everyone.
> 
> You don't seem to understand that from the kernels POV there is no such
> distinction.
> 

And I'm saying you're wrong.  Unlike PCI, CPU sysfs unplug does not completely
remove the device from the kernel's knowledge.  If that were the case, then

/sys/devices/system/cpu/cpuX/online (and the other files left after the unplug)
wouldn't exist.

If you do a physical removal (signaled via ACPI or interrupt) the entire device
is removed from the kernel's POV.

Yes, the sysfs removal is a subset of the physical removal.  But the end result
is very different.

> There is only one unplug operation that caters to both cases. Therefore
> unplug needs to assume the most stringent (ie. physical) for things to
> work right.

See last comment above.

P.


end of thread, other threads:[~2016-09-28 10:06 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-21 11:39 [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event Prarit Bhargava
2016-09-21 11:39 ` [PATCH 1/2 v3] drivers/base: Combine topology.c and cpu.c Prarit Bhargava
2016-09-21 11:39 ` [PATCH 2/2 v3] cpu hotplug: add CONFIG_PERMANENT_CPU_TOPOLOGY Prarit Bhargava
2016-09-21 13:04 ` [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event Borislav Petkov
2016-09-21 13:32   ` Prarit Bhargava
2016-09-21 14:01     ` Borislav Petkov
2016-09-22 11:59       ` Prarit Bhargava
2016-09-22 12:10         ` Borislav Petkov
2016-09-26 11:45           ` Prarit Bhargava
2016-09-26 11:57             ` Borislav Petkov
2016-09-27 11:45               ` Prarit Bhargava
2016-09-27 13:49                 ` Greg Kroah-Hartman
2016-09-27 15:26                   ` Prarit Bhargava
2016-09-28  5:05                     ` Borislav Petkov
2016-09-28  6:48                     ` Peter Zijlstra
2016-09-28 10:06                       ` Prarit Bhargava
2016-09-28  5:02                 ` Borislav Petkov
2016-09-26 11:59             ` Peter Zijlstra
2016-09-27 11:47               ` Prarit Bhargava
2016-09-27 11:57                 ` Peter Zijlstra
