linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] CPU DLPAR/hotplug for v5.16
@ 2021-09-20 13:55 Nathan Lynch
  2021-09-20 13:55 ` [PATCH 1/3] powerpc/pseries/cpuhp: cache node corrections Nathan Lynch
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Nathan Lynch @ 2021-09-20 13:55 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, ldufour, aneesh.kumar, danielhb413

Fixes for some vintage bugs in handling cache node addition and removal, a
miscellaneous BUG->WARN conversion, and removal of the fragile "by count"
CPU DLPAR code that probably has no users.

Nathan Lynch (3):
  powerpc/pseries/cpuhp: cache node corrections
  powerpc/cpuhp: BUG -> WARN conversion in offline path
  powerpc/pseries/cpuhp: delete add/remove_by_count code

 arch/powerpc/kernel/sysfs.c                  |   3 +-
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 290 +++++--------------
 2 files changed, 74 insertions(+), 219 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] powerpc/pseries/cpuhp: cache node corrections
  2021-09-20 13:55 [PATCH 0/3] CPU DLPAR/hotplug for v5.16 Nathan Lynch
@ 2021-09-20 13:55 ` Nathan Lynch
  2021-09-20 23:59   ` Daniel Henrique Barboza
  2021-09-20 13:55 ` [PATCH 2/3] powerpc/cpuhp: BUG -> WARN conversion in offline path Nathan Lynch
  2021-09-20 13:55 ` [PATCH 3/3] powerpc/pseries/cpuhp: delete add/remove_by_count code Nathan Lynch
  2 siblings, 1 reply; 11+ messages in thread
From: Nathan Lynch @ 2021-09-20 13:55 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, ldufour, aneesh.kumar, danielhb413

On pseries, cache nodes in the device tree can be added and removed by the
CPU DLPAR code as well as the partition migration (mobility) code. PowerVM
partitions in dedicated processor mode typically have L2 and L3 cache
nodes.

The CPU DLPAR code has the following shortcomings:

* Cache nodes returned as siblings of a new CPU node by
  ibm,configure-connector are silently discarded; only the CPU node is
  added to the device tree.

* Cache nodes which become unreferenced in the processor removal path are
  not removed from the device tree. This can lead to duplicate nodes when
  the post-migration device tree update code replaces cache nodes.

This is long-standing behavior. Presumably it has gone mostly unnoticed
because the two bugs have the property of obscuring each other in common
simple scenarios (e.g. remove a CPU and add it back). Likely you'd notice
only if you cared to inspect the device tree or the sysfs cacheinfo
information.

Booted with two processors:

  $ pwd
  /sys/firmware/devicetree/base/cpus
  $ ls -1d */
  l2-cache@2010/
  l2-cache@2011/
  l3-cache@3110/
  l3-cache@3111/
  PowerPC,POWER9@0/
  PowerPC,POWER9@8/
  $ lsprop */l2-cache
  l2-cache@2010/l2-cache
                 00003110 (12560)
  l2-cache@2011/l2-cache
                 00003111 (12561)
  PowerPC,POWER9@0/l2-cache
                 00002010 (8208)
  PowerPC,POWER9@8/l2-cache
                 00002011 (8209)
  $ ls /sys/devices/system/cpu/cpu0/cache/
  index0  index1  index2  index3

After DLPAR-adding PowerPC,POWER9@10, we see that its associated cache
nodes are absent, its threads' L2+L3 cacheinfo is unpopulated, and it is
missing a cache level in its sched domain hierarchy:

  $ ls -1d */
  l2-cache@2010/
  l2-cache@2011/
  l3-cache@3110/
  l3-cache@3111/
  PowerPC,POWER9@0/
  PowerPC,POWER9@10/
  PowerPC,POWER9@8/
  $ lsprop PowerPC\,POWER9@10/l2-cache
  PowerPC,POWER9@10/l2-cache
                 00002012 (8210)
  $ ls /sys/devices/system/cpu/cpu16/cache/
  index0  index1
  $ grep . /sys/kernel/debug/sched/domains/cpu{0,8,16}/domain*/name
  /sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT
  /sys/kernel/debug/sched/domains/cpu0/domain1/name:CACHE
  /sys/kernel/debug/sched/domains/cpu0/domain2/name:DIE
  /sys/kernel/debug/sched/domains/cpu8/domain0/name:SMT
  /sys/kernel/debug/sched/domains/cpu8/domain1/name:CACHE
  /sys/kernel/debug/sched/domains/cpu8/domain2/name:DIE
  /sys/kernel/debug/sched/domains/cpu16/domain0/name:SMT
  /sys/kernel/debug/sched/domains/cpu16/domain1/name:DIE

When removing PowerPC,POWER9@8, we see that its cache nodes are left
behind:

  $ ls -1d */
  l2-cache@2010/
  l2-cache@2011/
  l3-cache@3110/
  l3-cache@3111/
  PowerPC,POWER9@0/

When DLPAR is combined with VM migration, we can get duplicate nodes. E.g.
removing one processor, then migrating, adding a processor, and then
migrating again can result in warnings from the OF core during
post-migration device tree updates:

  Duplicate name in cpus, renamed to "l2-cache@2011#1"
  Duplicate name in cpus, renamed to "l3-cache@3111#1"

and nodes with duplicated phandles in the tree, making lookup behavior
unpredictable:

  $ lsprop l[23]-cache@*/ibm,phandle
  l2-cache@2010/ibm,phandle
                   00002010 (8208)
  l2-cache@2011#1/ibm,phandle
                   00002011 (8209)
  l2-cache@2011/ibm,phandle
                   00002011 (8209)
  l3-cache@3110/ibm,phandle
                   00003110 (12560)
  l3-cache@3111#1/ibm,phandle
                   00003111 (12561)
  l3-cache@3111/ibm,phandle
                   00003111 (12561)

Address these issues by:

* Correctly processing siblings of the node returned from
  dlpar_configure_connector().
* Removing cache nodes in the CPU remove path when it can be determined
  that they are not associated with other CPUs or caches.

Use the of_changeset API in both cases, which allows us to keep the error
handling in this code from becoming more complex while ensuring that the
device tree cannot become inconsistent.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: ac71380 ("powerpc/pseries: Add CPU dlpar remove functionality")
Fixes: 90edf18 ("powerpc/pseries: Add CPU dlpar add functionality")
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 72 +++++++++++++++++++-
 1 file changed, 70 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index d646c22e94ab..87a0fbe9cf12 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -521,6 +521,27 @@ static bool valid_cpu_drc_index(struct device_node *parent, u32 drc_index)
 	return found;
 }
 
+static int pseries_cpuhp_attach_nodes(struct device_node *dn)
+{
+	struct of_changeset cs;
+	int ret;
+
+	/*
+	 * This device node is unattached but may have siblings; open-code the
+	 * traversal.
+	 */
+	for (of_changeset_init(&cs); dn != NULL; dn = dn->sibling) {
+		ret = of_changeset_attach_node(&cs, dn);
+		if (ret)
+			goto out;
+	}
+
+	ret = of_changeset_apply(&cs);
+out:
+	of_changeset_destroy(&cs);
+	return ret;
+}
+
 static ssize_t dlpar_cpu_add(u32 drc_index)
 {
 	struct device_node *dn, *parent;
@@ -563,7 +584,7 @@ static ssize_t dlpar_cpu_add(u32 drc_index)
 		return -EINVAL;
 	}
 
-	rc = dlpar_attach_node(dn, parent);
+	rc = pseries_cpuhp_attach_nodes(dn);
 
 	/* Regardless we are done with parent now */
 	of_node_put(parent);
@@ -600,6 +621,53 @@ static ssize_t dlpar_cpu_add(u32 drc_index)
 	return rc;
 }
 
+static unsigned int pseries_cpuhp_cache_use_count(const struct device_node *cachedn)
+{
+	unsigned int use_count = 0;
+	struct device_node *dn;
+
+	WARN_ON(!of_node_is_type(cachedn, "cache"));
+
+	for_each_of_cpu_node(dn) {
+		if (of_find_next_cache_node(dn) == cachedn)
+			use_count++;
+	}
+
+	for_each_node_by_type(dn, "cache") {
+		if (of_find_next_cache_node(dn) == cachedn)
+			use_count++;
+	}
+
+	return use_count;
+}
+
+static int pseries_cpuhp_detach_nodes(struct device_node *cpudn)
+{
+	struct device_node *dn;
+	struct of_changeset cs;
+	int ret = 0;
+
+	of_changeset_init(&cs);
+	ret = of_changeset_detach_node(&cs, cpudn);
+	if (ret)
+		goto out;
+
+	dn = cpudn;
+	while ((dn = of_find_next_cache_node(dn))) {
+		if (pseries_cpuhp_cache_use_count(dn) > 1)
+			break;
+
+		ret = of_changeset_detach_node(&cs, dn);
+		if (ret)
+			goto out;
+	}
+
+	ret = of_changeset_apply(&cs);
+out:
+	of_changeset_destroy(&cs);
+	return ret;
+}
+
 static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index)
 {
 	int rc;
@@ -621,7 +689,7 @@ static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index)
 		return rc;
 	}
 
-	rc = dlpar_detach_node(dn);
+	rc = pseries_cpuhp_detach_nodes(dn);
 	if (rc) {
 		int saved_rc = rc;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] powerpc/cpuhp: BUG -> WARN conversion in offline path
  2021-09-20 13:55 [PATCH 0/3] CPU DLPAR/hotplug for v5.16 Nathan Lynch
  2021-09-20 13:55 ` [PATCH 1/3] powerpc/pseries/cpuhp: cache node corrections Nathan Lynch
@ 2021-09-20 13:55 ` Nathan Lynch
  2021-09-20 14:26   ` Christophe Leroy
  2021-09-21  0:05   ` Daniel Henrique Barboza
  2021-09-20 13:55 ` [PATCH 3/3] powerpc/pseries/cpuhp: delete add/remove_by_count code Nathan Lynch
  2 siblings, 2 replies; 11+ messages in thread
From: Nathan Lynch @ 2021-09-20 13:55 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, ldufour, aneesh.kumar, danielhb413

If, due to bugs elsewhere, we get into unregister_cpu_online() with a CPU
that isn't marked hotpluggable, we can emit a warning and return an
appropriate error instead of crashing.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---
 arch/powerpc/kernel/sysfs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index defecb3b1b15..08d8072d6e7a 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -928,7 +928,8 @@ static int unregister_cpu_online(unsigned int cpu)
 	struct device_attribute *attrs, *pmc_attrs;
 	int i, nattrs;
 
-	BUG_ON(!c->hotpluggable);
+	if (WARN_RATELIMIT(!c->hotpluggable, "cpu %d can't be offlined\n", cpu))
+		return -EBUSY;
 
 #ifdef CONFIG_PPC64
 	if (cpu_has_feature(CPU_FTR_SMT))
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] powerpc/pseries/cpuhp: delete add/remove_by_count code
  2021-09-20 13:55 [PATCH 0/3] CPU DLPAR/hotplug for v5.16 Nathan Lynch
  2021-09-20 13:55 ` [PATCH 1/3] powerpc/pseries/cpuhp: cache node corrections Nathan Lynch
  2021-09-20 13:55 ` [PATCH 2/3] powerpc/cpuhp: BUG -> WARN conversion in offline path Nathan Lynch
@ 2021-09-20 13:55 ` Nathan Lynch
  2021-09-20 21:50   ` kernel test robot
  2021-09-21  0:18   ` Daniel Henrique Barboza
  2 siblings, 2 replies; 11+ messages in thread
From: Nathan Lynch @ 2021-09-20 13:55 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: tyreld, ldufour, aneesh.kumar, danielhb413

The core DLPAR code supports two actions (add and remove) and three
subtypes of action:

* By DRC index: the action is attempted on a single specified resource.
  This is the usual case for processors.
* By indexed count: the action is attempted on a range of resources
  beginning at the specified index. This is implemented only by the memory
  DLPAR code.
* By count: the lower layer (CPU or memory) is responsible for locating the
  specified number of resources to which the action can be applied.

I cannot find any evidence of the "by count" subtype being used by drmgr or
qemu for processors. And when I try to exercise this code, the add case
does not work:

  $ ppc64_cpu --smt ; nproc
  SMT=8
  24
  $ printf "cpu remove count 2" > /sys/kernel/dlpar
  $ nproc
  8
  $ printf "cpu add count 2" > /sys/kernel/dlpar
  -bash: printf: write error: Invalid argument
  $ dmesg | tail -2
  pseries-hotplug-cpu: Failed to find enough CPUs (1 of 2) to add
  dlpar: Could not handle DLPAR request "cpu add count 2"
  $ nproc
  8
  $ drmgr -c cpu -a -q 2         # this uses the by-index method
  Validating CPU DLPAR capability...yes.
  CPU 1
  CPU 17
  $ nproc
  24

This is because find_drc_info_cpus_to_add() does not increment drc_index
appropriately during its search.

This is not hard to fix. But the _by_count() functions also have the
property that they attempt to roll back all prior operations if the entire
request cannot be satisfied, even though the rollback itself can encounter
errors. It's not possible to provide transaction-like behavior at this
level, and it's undesirable to have code that can only pretend to do that.
Any users of these functions cannot know what the state of the system is in
the error case. And the error paths are, to my knowledge, impossible to
test without adding custom error injection code.

Summary:

* This code has not worked reliably since its introduction.
* There is no evidence that it is used.
* It contains questionable rollback behaviors in error paths which are
  difficult to test.

So let's remove it.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: ac71380071d1 ("powerpc/pseries: Add CPU dlpar remove functionality")
Fixes: 90edf184b9b7 ("powerpc/pseries: Add CPU dlpar add functionality")
Fixes: b015f6bc9547 ("powerpc/pseries: Add cpu DLPAR support for drc-info property")
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 218 +------------------
 1 file changed, 2 insertions(+), 216 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 87a0fbe9cf12..768997261ce8 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -741,216 +741,6 @@ static int dlpar_cpu_remove_by_index(u32 drc_index)
 	return rc;
 }
 
-static int find_dlpar_cpus_to_remove(u32 *cpu_drcs, int cpus_to_remove)
-{
-	struct device_node *dn;
-	int cpus_found = 0;
-	int rc;
-
-	/* We want to find cpus_to_remove + 1 CPUs to ensure we do not
-	 * remove the last CPU.
-	 */
-	for_each_node_by_type(dn, "cpu") {
-		cpus_found++;
-
-		if (cpus_found > cpus_to_remove) {
-			of_node_put(dn);
-			break;
-		}
-
-		/* Note that cpus_found is always 1 ahead of the index
-		 * into the cpu_drcs array, so we use cpus_found - 1
-		 */
-		rc = of_property_read_u32(dn, "ibm,my-drc-index",
-					  &cpu_drcs[cpus_found - 1]);
-		if (rc) {
-			pr_warn("Error occurred getting drc-index for %pOFn\n",
-				dn);
-			of_node_put(dn);
-			return -1;
-		}
-	}
-
-	if (cpus_found < cpus_to_remove) {
-		pr_warn("Failed to find enough CPUs (%d of %d) to remove\n",
-			cpus_found, cpus_to_remove);
-	} else if (cpus_found == cpus_to_remove) {
-		pr_warn("Cannot remove all CPUs\n");
-	}
-
-	return cpus_found;
-}
-
-static int dlpar_cpu_remove_by_count(u32 cpus_to_remove)
-{
-	u32 *cpu_drcs;
-	int cpus_found;
-	int cpus_removed = 0;
-	int i, rc;
-
-	pr_debug("Attempting to hot-remove %d CPUs\n", cpus_to_remove);
-
-	cpu_drcs = kcalloc(cpus_to_remove, sizeof(*cpu_drcs), GFP_KERNEL);
-	if (!cpu_drcs)
-		return -EINVAL;
-
-	cpus_found = find_dlpar_cpus_to_remove(cpu_drcs, cpus_to_remove);
-	if (cpus_found <= cpus_to_remove) {
-		kfree(cpu_drcs);
-		return -EINVAL;
-	}
-
-	for (i = 0; i < cpus_to_remove; i++) {
-		rc = dlpar_cpu_remove_by_index(cpu_drcs[i]);
-		if (rc)
-			break;
-
-		cpus_removed++;
-	}
-
-	if (cpus_removed != cpus_to_remove) {
-		pr_warn("CPU hot-remove failed, adding back removed CPUs\n");
-
-		for (i = 0; i < cpus_removed; i++)
-			dlpar_cpu_add(cpu_drcs[i]);
-
-		rc = -EINVAL;
-	} else {
-		rc = 0;
-	}
-
-	kfree(cpu_drcs);
-	return rc;
-}
-
-static int find_drc_info_cpus_to_add(struct device_node *cpus,
-				     struct property *info,
-				     u32 *cpu_drcs, u32 cpus_to_add)
-{
-	struct of_drc_info drc;
-	const __be32 *value;
-	u32 count, drc_index;
-	int cpus_found = 0;
-	int i, j;
-
-	if (!info)
-		return -1;
-
-	value = of_prop_next_u32(info, NULL, &count);
-	if (value)
-		value++;
-
-	for (i = 0; i < count; i++) {
-		of_read_drc_info_cell(&info, &value, &drc);
-		if (strncmp(drc.drc_type, "CPU", 3))
-			break;
-
-		drc_index = drc.drc_index_start;
-		for (j = 0; j < drc.num_sequential_elems; j++) {
-			if (dlpar_cpu_exists(cpus, drc_index))
-				continue;
-
-			cpu_drcs[cpus_found++] = drc_index;
-
-			if (cpus_found == cpus_to_add)
-				return cpus_found;
-
-			drc_index += drc.sequential_inc;
-		}
-	}
-
-	return cpus_found;
-}
-
-static int find_drc_index_cpus_to_add(struct device_node *cpus,
-				      u32 *cpu_drcs, u32 cpus_to_add)
-{
-	int cpus_found = 0;
-	int index, rc;
-	u32 drc_index;
-
-	/* Search the ibm,drc-indexes array for possible CPU drcs to
-	 * add. Note that the format of the ibm,drc-indexes array is
-	 * the number of entries in the array followed by the array
-	 * of drc values so we start looking at index = 1.
-	 */
-	index = 1;
-	while (cpus_found < cpus_to_add) {
-		rc = of_property_read_u32_index(cpus, "ibm,drc-indexes",
-						index++, &drc_index);
-
-		if (rc)
-			break;
-
-		if (dlpar_cpu_exists(cpus, drc_index))
-			continue;
-
-		cpu_drcs[cpus_found++] = drc_index;
-	}
-
-	return cpus_found;
-}
-
-static int dlpar_cpu_add_by_count(u32 cpus_to_add)
-{
-	struct device_node *parent;
-	struct property *info;
-	u32 *cpu_drcs;
-	int cpus_added = 0;
-	int cpus_found;
-	int i, rc;
-
-	pr_debug("Attempting to hot-add %d CPUs\n", cpus_to_add);
-
-	cpu_drcs = kcalloc(cpus_to_add, sizeof(*cpu_drcs), GFP_KERNEL);
-	if (!cpu_drcs)
-		return -EINVAL;
-
-	parent = of_find_node_by_path("/cpus");
-	if (!parent) {
-		pr_warn("Could not find CPU root node in device tree\n");
-		kfree(cpu_drcs);
-		return -1;
-	}
-
-	info = of_find_property(parent, "ibm,drc-info", NULL);
-	if (info)
-		cpus_found = find_drc_info_cpus_to_add(parent, info, cpu_drcs, cpus_to_add);
-	else
-		cpus_found = find_drc_index_cpus_to_add(parent, cpu_drcs, cpus_to_add);
-
-	of_node_put(parent);
-
-	if (cpus_found < cpus_to_add) {
-		pr_warn("Failed to find enough CPUs (%d of %d) to add\n",
-			cpus_found, cpus_to_add);
-		kfree(cpu_drcs);
-		return -EINVAL;
-	}
-
-	for (i = 0; i < cpus_to_add; i++) {
-		rc = dlpar_cpu_add(cpu_drcs[i]);
-		if (rc)
-			break;
-
-		cpus_added++;
-	}
-
-	if (cpus_added < cpus_to_add) {
-		pr_warn("CPU hot-add failed, removing any added CPUs\n");
-
-		for (i = 0; i < cpus_added; i++)
-			dlpar_cpu_remove_by_index(cpu_drcs[i]);
-
-		rc = -EINVAL;
-	} else {
-		rc = 0;
-	}
-
-	kfree(cpu_drcs);
-	return rc;
-}
-
 int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
 {
 	u32 count, drc_index;
@@ -963,9 +753,7 @@ int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
 
 	switch (hp_elog->action) {
 	case PSERIES_HP_ELOG_ACTION_REMOVE:
-		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_COUNT)
-			rc = dlpar_cpu_remove_by_count(count);
-		else if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX) {
+		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX) {
 			rc = dlpar_cpu_remove_by_index(drc_index);
 			/*
 			 * Setting the isolation state of an UNISOLATED/CONFIGURED
@@ -979,9 +767,7 @@ int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
 			rc = -EINVAL;
 		break;
 	case PSERIES_HP_ELOG_ACTION_ADD:
-		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_COUNT)
-			rc = dlpar_cpu_add_by_count(count);
-		else if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX)
+		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX)
 			rc = dlpar_cpu_add(drc_index);
 		else
 			rc = -EINVAL;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] powerpc/cpuhp: BUG -> WARN conversion in offline path
  2021-09-20 13:55 ` [PATCH 2/3] powerpc/cpuhp: BUG -> WARN conversion in offline path Nathan Lynch
@ 2021-09-20 14:26   ` Christophe Leroy
  2021-09-20 14:39     ` Nathan Lynch
  2021-09-21  0:05   ` Daniel Henrique Barboza
  1 sibling, 1 reply; 11+ messages in thread
From: Christophe Leroy @ 2021-09-20 14:26 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev; +Cc: tyreld, ldufour, aneesh.kumar, danielhb413



Le 20/09/2021 à 15:55, Nathan Lynch a écrit :
> If, due to bugs elsewhere, we get into unregister_cpu_online() with a CPU
> that isn't marked hotpluggable, we can emit a warning and return an
> appropriate error instead of crashing.

Is it only a bug situation, or is it something that can happen in real 
life ?

If it can happen in real life, kernels with panic_on_warn will still be 
impacted.

> 
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> ---
>   arch/powerpc/kernel/sysfs.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
> index defecb3b1b15..08d8072d6e7a 100644
> --- a/arch/powerpc/kernel/sysfs.c
> +++ b/arch/powerpc/kernel/sysfs.c
> @@ -928,7 +928,8 @@ static int unregister_cpu_online(unsigned int cpu)
>   	struct device_attribute *attrs, *pmc_attrs;
>   	int i, nattrs;
>   
> -	BUG_ON(!c->hotpluggable);
> +	if (WARN_RATELIMIT(!c->hotpluggable, "cpu %d can't be offlined\n", cpu))
> +		return -EBUSY;
>   
>   #ifdef CONFIG_PPC64
>   	if (cpu_has_feature(CPU_FTR_SMT))
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] powerpc/cpuhp: BUG -> WARN conversion in offline path
  2021-09-20 14:26   ` Christophe Leroy
@ 2021-09-20 14:39     ` Nathan Lynch
  0 siblings, 0 replies; 11+ messages in thread
From: Nathan Lynch @ 2021-09-20 14:39 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev; +Cc: danielhb413, tyreld, ldufour, aneesh.kumar

Christophe Leroy <christophe.leroy@csgroup.eu> writes:

> Le 20/09/2021 à 15:55, Nathan Lynch a écrit :
>> If, due to bugs elsewhere, we get into unregister_cpu_online() with a CPU
>> that isn't marked hotpluggable, we can emit a warning and return an
>> appropriate error instead of crashing.
>
> Is it only a bug situation, or is it something that can happen in real 
> life ?
>
> If it can happen in real life, kernels with panic_on_warn will still be 
> impacted.

I only found this by inspection, and it can happen only due to a bug in
CPU device registration at boot. The flag must not be set if the
platform or CPU can't support going offline.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] powerpc/pseries/cpuhp: delete add/remove_by_count code
  2021-09-20 13:55 ` [PATCH 3/3] powerpc/pseries/cpuhp: delete add/remove_by_count code Nathan Lynch
@ 2021-09-20 21:50   ` kernel test robot
  2021-09-21  0:18   ` Daniel Henrique Barboza
  1 sibling, 0 replies; 11+ messages in thread
From: kernel test robot @ 2021-09-20 21:50 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev
  Cc: danielhb413, tyreld, ldufour, kbuild-all, aneesh.kumar

[-- Attachment #1: Type: text/plain, Size: 5199 bytes --]

Hi Nathan,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on linus/master v5.15-rc2 next-20210920]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Nathan-Lynch/CPU-DLPAR-hotplug-for-v5-16/20210920-215907
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/72ea4c8a5398a4a72da34051a66f260ab0154f57
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Nathan-Lynch/CPU-DLPAR-hotplug-for-v5-16/20210920-215907
        git checkout 72ea4c8a5398a4a72da34051a66f260ab0154f57
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   arch/powerpc/platforms/pseries/hotplug-cpu.c: In function 'dlpar_cpu':
>> arch/powerpc/platforms/pseries/hotplug-cpu.c:746:13: error: variable 'count' set but not used [-Werror=unused-but-set-variable]
     746 |         u32 count, drc_index;
         |             ^~~~~
   cc1: all warnings being treated as errors


vim +/count +746 arch/powerpc/platforms/pseries/hotplug-cpu.c

ac71380071d19d Nathan Fontenot         2015-12-16  743  
ac71380071d19d Nathan Fontenot         2015-12-16  744  int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
ac71380071d19d Nathan Fontenot         2015-12-16  745  {
ac71380071d19d Nathan Fontenot         2015-12-16 @746  	u32 count, drc_index;
ac71380071d19d Nathan Fontenot         2015-12-16  747  	int rc;
ac71380071d19d Nathan Fontenot         2015-12-16  748  
ac71380071d19d Nathan Fontenot         2015-12-16  749  	count = hp_elog->_drc_u.drc_count;
ac71380071d19d Nathan Fontenot         2015-12-16  750  	drc_index = hp_elog->_drc_u.drc_index;
ac71380071d19d Nathan Fontenot         2015-12-16  751  
ac71380071d19d Nathan Fontenot         2015-12-16  752  	lock_device_hotplug();
ac71380071d19d Nathan Fontenot         2015-12-16  753  
ac71380071d19d Nathan Fontenot         2015-12-16  754  	switch (hp_elog->action) {
ac71380071d19d Nathan Fontenot         2015-12-16  755  	case PSERIES_HP_ELOG_ACTION_REMOVE:
72ea4c8a5398a4 Nathan Lynch            2021-09-20  756  		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX) {
ac71380071d19d Nathan Fontenot         2015-12-16  757  			rc = dlpar_cpu_remove_by_index(drc_index);
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  758  			/*
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  759  			 * Setting the isolation state of an UNISOLATED/CONFIGURED
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  760  			 * device to UNISOLATE is a no-op, but the hypervisor can
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  761  			 * use it as a hint that the CPU removal failed.
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  762  			 */
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  763  			if (rc)
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  764  				dlpar_unisolate_drc(drc_index);
29c9a2699e71a7 Daniel Henrique Barboza 2021-04-16  765  		}
ac71380071d19d Nathan Fontenot         2015-12-16  766  		else
ac71380071d19d Nathan Fontenot         2015-12-16  767  			rc = -EINVAL;
ac71380071d19d Nathan Fontenot         2015-12-16  768  		break;
90edf184b9b727 Nathan Fontenot         2015-12-16  769  	case PSERIES_HP_ELOG_ACTION_ADD:
72ea4c8a5398a4 Nathan Lynch            2021-09-20  770  		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX)
90edf184b9b727 Nathan Fontenot         2015-12-16  771  			rc = dlpar_cpu_add(drc_index);
90edf184b9b727 Nathan Fontenot         2015-12-16  772  		else
90edf184b9b727 Nathan Fontenot         2015-12-16  773  			rc = -EINVAL;
90edf184b9b727 Nathan Fontenot         2015-12-16  774  		break;
ac71380071d19d Nathan Fontenot         2015-12-16  775  	default:
ac71380071d19d Nathan Fontenot         2015-12-16  776  		pr_err("Invalid action (%d) specified\n", hp_elog->action);
ac71380071d19d Nathan Fontenot         2015-12-16  777  		rc = -EINVAL;
ac71380071d19d Nathan Fontenot         2015-12-16  778  		break;
ac71380071d19d Nathan Fontenot         2015-12-16  779  	}
ac71380071d19d Nathan Fontenot         2015-12-16  780  
ac71380071d19d Nathan Fontenot         2015-12-16  781  	unlock_device_hotplug();
ac71380071d19d Nathan Fontenot         2015-12-16  782  	return rc;
ac71380071d19d Nathan Fontenot         2015-12-16  783  }
ac71380071d19d Nathan Fontenot         2015-12-16  784  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 74041 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] powerpc/pseries/cpuhp: cache node corrections
  2021-09-20 13:55 ` [PATCH 1/3] powerpc/pseries/cpuhp: cache node corrections Nathan Lynch
@ 2021-09-20 23:59   ` Daniel Henrique Barboza
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Henrique Barboza @ 2021-09-20 23:59 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev; +Cc: tyreld, ldufour, aneesh.kumar



On 9/20/21 10:55, Nathan Lynch wrote:
> On pseries, cache nodes in the device tree can be added and removed by the
> CPU DLPAR code as well as the partition migration (mobility) code. PowerVM
> partitions in dedicated processor mode typically have L2 and L3 cache
> nodes.
> 
> The CPU DLPAR code has the following shortcomings:
> 
> * Cache nodes returned as siblings of a new CPU node by
>    ibm,configure-connector are silently discarded; only the CPU node is
>    added to the device tree.
> 
> * Cache nodes which become unreferenced in the processor removal path are
>    not removed from the device tree. This can lead to duplicate nodes when
>    the post-migration device tree update code replaces cache nodes.
> 
> This is long-standing behavior. Presumably it has gone mostly unnoticed
> because the two bugs have the property of obscuring each other in common
> simple scenarios (e.g. remove a CPU and add it back). Likely you'd notice
> only if you cared to inspect the device tree or the sysfs cacheinfo
> information.
> 
> Booted with two processors:
> 
>    $ pwd
>    /sys/firmware/devicetree/base/cpus
>    $ ls -1d */
>    l2-cache@2010/
>    l2-cache@2011/
>    l3-cache@3110/
>    l3-cache@3111/
>    PowerPC,POWER9@0/
>    PowerPC,POWER9@8/
>    $ lsprop */l2-cache
>    l2-cache@2010/l2-cache
>                   00003110 (12560)
>    l2-cache@2011/l2-cache
>                   00003111 (12561)
>    PowerPC,POWER9@0/l2-cache
>                   00002010 (8208)
>    PowerPC,POWER9@8/l2-cache
>                   00002011 (8209)
>    $ ls /sys/devices/system/cpu/cpu0/cache/
>    index0  index1  index2  index3
> 
> After DLPAR-adding PowerPC,POWER9@10, we see that its associated cache
> nodes are absent, its threads' L2+L3 cacheinfo is unpopulated, and it is
> missing a cache level in its sched domain hierarchy:
> 
>    $ ls -1d */
>    l2-cache@2010/
>    l2-cache@2011/
>    l3-cache@3110/
>    l3-cache@3111/
>    PowerPC,POWER9@0/
>    PowerPC,POWER9@10/
>    PowerPC,POWER9@8/
>    $ lsprop PowerPC\,POWER9@10/l2-cache
>    PowerPC,POWER9@10/l2-cache
>                   00002012 (8210)
>    $ ls /sys/devices/system/cpu/cpu16/cache/
>    index0  index1
>    $ grep . /sys/kernel/debug/sched/domains/cpu{0,8,16}/domain*/name
>    /sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT
>    /sys/kernel/debug/sched/domains/cpu0/domain1/name:CACHE
>    /sys/kernel/debug/sched/domains/cpu0/domain2/name:DIE
>    /sys/kernel/debug/sched/domains/cpu8/domain0/name:SMT
>    /sys/kernel/debug/sched/domains/cpu8/domain1/name:CACHE
>    /sys/kernel/debug/sched/domains/cpu8/domain2/name:DIE
>    /sys/kernel/debug/sched/domains/cpu16/domain0/name:SMT
>    /sys/kernel/debug/sched/domains/cpu16/domain1/name:DIE
> 
> When removing PowerPC,POWER9@8, we see that its cache nodes are left
> behind:
> 
>    $ ls -1d */
>    l2-cache@2010/
>    l2-cache@2011/
>    l3-cache@3110/
>    l3-cache@3111/
>    PowerPC,POWER9@0/
> 
> When DLPAR is combined with VM migration, we can get duplicate nodes. E.g.
> removing one processor, then migrating, adding a processor, and then
> migrating again can result in warnings from the OF core during
> post-migration device tree updates:
> 
>    Duplicate name in cpus, renamed to "l2-cache@2011#1"
>    Duplicate name in cpus, renamed to "l3-cache@3111#1"
> 
> and nodes with duplicated phandles in the tree, making lookup behavior
> unpredictable:
> 
>    $ lsprop l[23]-cache@*/ibm,phandle
>    l2-cache@2010/ibm,phandle
>                     00002010 (8208)
>    l2-cache@2011#1/ibm,phandle
>                     00002011 (8209)
>    l2-cache@2011/ibm,phandle
>                     00002011 (8209)
>    l3-cache@3110/ibm,phandle
>                     00003110 (12560)
>    l3-cache@3111#1/ibm,phandle
>                     00003111 (12561)
>    l3-cache@3111/ibm,phandle
>                     00003111 (12561)
> 
> Address these issues by:
> 
> * Correctly processing siblings of the node returned from
>    dlpar_configure_connector().
> * Removing cache nodes in the CPU remove path when it can be determined
>    that they are not associated with other CPUs or caches.
> 
> Use the of_changeset API in both cases, which allows us to keep the error
> handling in this code from becoming more complex while ensuring that the
> device tree cannot become inconsistent.
> 
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> Fixes: ac71380 ("powerpc/pseries: Add CPU dlpar remove functionality")
> Fixes: 90edf18 ("powerpc/pseries: Add CPU dlpar add functionality")
> ---
>   arch/powerpc/platforms/pseries/hotplug-cpu.c | 72 +++++++++++++++++++-
>   1 file changed, 70 insertions(+), 2 deletions(-)

Tested with a QEMU pseries guest, multiple CPU add/removals of the same CPU,
and no issues found with these new pseries_cpuhp* functions.

Code LGTM as well.


Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>




> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index d646c22e94ab..87a0fbe9cf12 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -521,6 +521,27 @@ static bool valid_cpu_drc_index(struct device_node *parent, u32 drc_index)
>   	return found;
>   }
>   
> +static int pseries_cpuhp_attach_nodes(struct device_node *dn)
> +{
> +	struct of_changeset cs;
> +	int ret;
> +
> +	/*
> +	 * This device node is unattached but may have siblings; open-code the
> +	 * traversal.
> +	 */
> +	for (of_changeset_init(&cs); dn != NULL; dn = dn->sibling) {
> +		ret = of_changeset_attach_node(&cs, dn);
> +		if (ret)
> +			goto out;
> +	}
> +
> +	ret = of_changeset_apply(&cs);
> +out:
> +	of_changeset_destroy(&cs);
> +	return ret;
> +}
> +
>   static ssize_t dlpar_cpu_add(u32 drc_index)
>   {
>   	struct device_node *dn, *parent;
> @@ -563,7 +584,7 @@ static ssize_t dlpar_cpu_add(u32 drc_index)
>   		return -EINVAL;
>   	}
>   
> -	rc = dlpar_attach_node(dn, parent);
> +	rc = pseries_cpuhp_attach_nodes(dn);
>   
>   	/* Regardless we are done with parent now */
>   	of_node_put(parent);
> @@ -600,6 +621,53 @@ static ssize_t dlpar_cpu_add(u32 drc_index)
>   	return rc;
>   }
>   
> +static unsigned int pseries_cpuhp_cache_use_count(const struct device_node *cachedn)
> +{
> +	unsigned int use_count = 0;
> +	struct device_node *dn;
> +
> +	WARN_ON(!of_node_is_type(cachedn, "cache"));
> +
> +	for_each_of_cpu_node(dn) {
> +		if (of_find_next_cache_node(dn) == cachedn)
> +			use_count++;
> +	}
> +
> +	for_each_node_by_type(dn, "cache") {
> +		if (of_find_next_cache_node(dn) == cachedn)
> +			use_count++;
> +	}
> +
> +	return use_count;
> +}
> +
> +static int pseries_cpuhp_detach_nodes(struct device_node *cpudn)
> +{
> +	struct device_node *dn;
> +	struct of_changeset cs;
> +	int ret = 0;
> +
> +	of_changeset_init(&cs);
> +	ret = of_changeset_detach_node(&cs, cpudn);
> +	if (ret)
> +		goto out;
> +
> +	dn = cpudn;
> +	while ((dn = of_find_next_cache_node(dn))) {
> +		if (pseries_cpuhp_cache_use_count(dn) > 1)
> +			break;
> +
> +		ret = of_changeset_detach_node(&cs, dn);
> +		if (ret)
> +			goto out;
> +	}
> +
> +	ret = of_changeset_apply(&cs);
> +out:
> +	of_changeset_destroy(&cs);
> +	return ret;
> +}
> +
>   static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index)
>   {
>   	int rc;
> @@ -621,7 +689,7 @@ static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index)
>   		return rc;
>   	}
>   
> -	rc = dlpar_detach_node(dn);
> +	rc = pseries_cpuhp_detach_nodes(dn);
>   	if (rc) {
>   		int saved_rc = rc;
>   
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] powerpc/cpuhp: BUG -> WARN conversion in offline path
  2021-09-20 13:55 ` [PATCH 2/3] powerpc/cpuhp: BUG -> WARN conversion in offline path Nathan Lynch
  2021-09-20 14:26   ` Christophe Leroy
@ 2021-09-21  0:05   ` Daniel Henrique Barboza
  1 sibling, 0 replies; 11+ messages in thread
From: Daniel Henrique Barboza @ 2021-09-21  0:05 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev; +Cc: tyreld, ldufour, aneesh.kumar



On 9/20/21 10:55, Nathan Lynch wrote:
> If, due to bugs elsewhere, we get into unregister_cpu_online() with a CPU
> that isn't marked hotpluggable, we can emit a warning and return an
> appropriate error instead of crashing.
> 
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> ---

As mentioned by Christophe this will not solve the crash for kernels with
panic_on_warn, but at least it will panic with a clearer message on those
and will not panic for everyone else.


Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>



>   arch/powerpc/kernel/sysfs.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
> index defecb3b1b15..08d8072d6e7a 100644
> --- a/arch/powerpc/kernel/sysfs.c
> +++ b/arch/powerpc/kernel/sysfs.c
> @@ -928,7 +928,8 @@ static int unregister_cpu_online(unsigned int cpu)
>   	struct device_attribute *attrs, *pmc_attrs;
>   	int i, nattrs;
>   
> -	BUG_ON(!c->hotpluggable);
> +	if (WARN_RATELIMIT(!c->hotpluggable, "cpu %d can't be offlined\n", cpu))
> +		return -EBUSY;
>   
>   #ifdef CONFIG_PPC64
>   	if (cpu_has_feature(CPU_FTR_SMT))
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] powerpc/pseries/cpuhp: delete add/remove_by_count code
  2021-09-20 13:55 ` [PATCH 3/3] powerpc/pseries/cpuhp: delete add/remove_by_count code Nathan Lynch
  2021-09-20 21:50   ` kernel test robot
@ 2021-09-21  0:18   ` Daniel Henrique Barboza
  2021-09-21  0:43     ` Nathan Lynch
  1 sibling, 1 reply; 11+ messages in thread
From: Daniel Henrique Barboza @ 2021-09-21  0:18 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev; +Cc: tyreld, ldufour, aneesh.kumar



On 9/20/21 10:55, Nathan Lynch wrote:
> The core DLPAR code supports two actions (add and remove) and three
> subtypes of action:
> 
> * By DRC index: the action is attempted on a single specified resource.
>    This is the usual case for processors.
> * By indexed count: the action is attempted on a range of resources
>    beginning at the specified index. This is implemented only by the memory
>    DLPAR code.
> * By count: the lower layer (CPU or memory) is responsible for locating the
>    specified number of resources to which the action can be applied.
> 
> I cannot find any evidence of the "by count" subtype being used by drmgr or
> qemu for processors. And when I try to exercise this code, the add case
> does not work:


Just to clarify: did you check both CPU and memory cases and found out that the
'by count' subtype isn't used with CPUs, but drmgr has some cases in which
'by count' is used with LMBs?

I'm asking because I worked with a part of the LMB removal code a few months ago,
and got stuck in a situation in which the 'by count' and 'by indexed count' are
similar enough to feel repetitive, but distinct enough to not be easily reduced
into a single function. If drmgr wasn't using the 'by count' subtypes for LMBs
that would be a good chance for more code redux.


> 
>    $ ppc64_cpu --smt ; nproc
>    SMT=8
>    24
>    $ printf "cpu remove count 2" > /sys/kernel/dlpar
>    $ nproc
>    8
>    $ printf "cpu add count 2" > /sys/kernel/dlpar
>    -bash: printf: write error: Invalid argument
>    $ dmesg | tail -2
>    pseries-hotplug-cpu: Failed to find enough CPUs (1 of 2) to add
>    dlpar: Could not handle DLPAR request "cpu add count 2"
>    $ nproc
>    8
>    $ drmgr -c cpu -a -q 2         # this uses the by-index method
>    Validating CPU DLPAR capability...yes.
>    CPU 1
>    CPU 17
>    $ nproc
>    24
> 
> This is because find_drc_info_cpus_to_add() does not increment drc_index
> appropriately during its search.
> 
> This is not hard to fix. But the _by_count() functions also have the
> property that they attempt to roll back all prior operations if the entire
> request cannot be satisfied, even though the rollback itself can encounter
> errors. It's not possible to provide transaction-like behavior at this
> level, and it's undesirable to have code that can only pretend to do that.
> Any users of these functions cannot know what the state of the system is in
> the error case. And the error paths are, to my knowledge, impossible to
> test without adding custom error injection code.
> 
> Summary:
> 
> * This code has not worked reliably since its introduction.
> * There is no evidence that it is used.
> * It contains questionable rollback behaviors in error paths which are
>    difficult to test.
> 
> So let's remove it.
> 
> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
> Fixes: ac71380071d1 ("powerpc/pseries: Add CPU dlpar remove functionality")
> Fixes: 90edf184b9b7 ("powerpc/pseries: Add CPU dlpar add functionality")
> Fixes: b015f6bc9547 ("powerpc/pseries: Add cpu DLPAR support for drc-info property")
> ---

Tested with a QEMU pseries guest, no issues found.


Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>



>   arch/powerpc/platforms/pseries/hotplug-cpu.c | 218 +------------------
>   1 file changed, 2 insertions(+), 216 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index 87a0fbe9cf12..768997261ce8 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -741,216 +741,6 @@ static int dlpar_cpu_remove_by_index(u32 drc_index)
>   	return rc;
>   }
>   
> -static int find_dlpar_cpus_to_remove(u32 *cpu_drcs, int cpus_to_remove)
> -{
> -	struct device_node *dn;
> -	int cpus_found = 0;
> -	int rc;
> -
> -	/* We want to find cpus_to_remove + 1 CPUs to ensure we do not
> -	 * remove the last CPU.
> -	 */
> -	for_each_node_by_type(dn, "cpu") {
> -		cpus_found++;
> -
> -		if (cpus_found > cpus_to_remove) {
> -			of_node_put(dn);
> -			break;
> -		}
> -
> -		/* Note that cpus_found is always 1 ahead of the index
> -		 * into the cpu_drcs array, so we use cpus_found - 1
> -		 */
> -		rc = of_property_read_u32(dn, "ibm,my-drc-index",
> -					  &cpu_drcs[cpus_found - 1]);
> -		if (rc) {
> -			pr_warn("Error occurred getting drc-index for %pOFn\n",
> -				dn);
> -			of_node_put(dn);
> -			return -1;
> -		}
> -	}
> -
> -	if (cpus_found < cpus_to_remove) {
> -		pr_warn("Failed to find enough CPUs (%d of %d) to remove\n",
> -			cpus_found, cpus_to_remove);
> -	} else if (cpus_found == cpus_to_remove) {
> -		pr_warn("Cannot remove all CPUs\n");
> -	}
> -
> -	return cpus_found;
> -}
> -
> -static int dlpar_cpu_remove_by_count(u32 cpus_to_remove)
> -{
> -	u32 *cpu_drcs;
> -	int cpus_found;
> -	int cpus_removed = 0;
> -	int i, rc;
> -
> -	pr_debug("Attempting to hot-remove %d CPUs\n", cpus_to_remove);
> -
> -	cpu_drcs = kcalloc(cpus_to_remove, sizeof(*cpu_drcs), GFP_KERNEL);
> -	if (!cpu_drcs)
> -		return -EINVAL;
> -
> -	cpus_found = find_dlpar_cpus_to_remove(cpu_drcs, cpus_to_remove);
> -	if (cpus_found <= cpus_to_remove) {
> -		kfree(cpu_drcs);
> -		return -EINVAL;
> -	}
> -
> -	for (i = 0; i < cpus_to_remove; i++) {
> -		rc = dlpar_cpu_remove_by_index(cpu_drcs[i]);
> -		if (rc)
> -			break;
> -
> -		cpus_removed++;
> -	}
> -
> -	if (cpus_removed != cpus_to_remove) {
> -		pr_warn("CPU hot-remove failed, adding back removed CPUs\n");
> -
> -		for (i = 0; i < cpus_removed; i++)
> -			dlpar_cpu_add(cpu_drcs[i]);
> -
> -		rc = -EINVAL;
> -	} else {
> -		rc = 0;
> -	}
> -
> -	kfree(cpu_drcs);
> -	return rc;
> -}
> -
> -static int find_drc_info_cpus_to_add(struct device_node *cpus,
> -				     struct property *info,
> -				     u32 *cpu_drcs, u32 cpus_to_add)
> -{
> -	struct of_drc_info drc;
> -	const __be32 *value;
> -	u32 count, drc_index;
> -	int cpus_found = 0;
> -	int i, j;
> -
> -	if (!info)
> -		return -1;
> -
> -	value = of_prop_next_u32(info, NULL, &count);
> -	if (value)
> -		value++;
> -
> -	for (i = 0; i < count; i++) {
> -		of_read_drc_info_cell(&info, &value, &drc);
> -		if (strncmp(drc.drc_type, "CPU", 3))
> -			break;
> -
> -		drc_index = drc.drc_index_start;
> -		for (j = 0; j < drc.num_sequential_elems; j++) {
> -			if (dlpar_cpu_exists(cpus, drc_index))
> -				continue;
> -
> -			cpu_drcs[cpus_found++] = drc_index;
> -
> -			if (cpus_found == cpus_to_add)
> -				return cpus_found;
> -
> -			drc_index += drc.sequential_inc;
> -		}
> -	}
> -
> -	return cpus_found;
> -}
> -
> -static int find_drc_index_cpus_to_add(struct device_node *cpus,
> -				      u32 *cpu_drcs, u32 cpus_to_add)
> -{
> -	int cpus_found = 0;
> -	int index, rc;
> -	u32 drc_index;
> -
> -	/* Search the ibm,drc-indexes array for possible CPU drcs to
> -	 * add. Note that the format of the ibm,drc-indexes array is
> -	 * the number of entries in the array followed by the array
> -	 * of drc values so we start looking at index = 1.
> -	 */
> -	index = 1;
> -	while (cpus_found < cpus_to_add) {
> -		rc = of_property_read_u32_index(cpus, "ibm,drc-indexes",
> -						index++, &drc_index);
> -
> -		if (rc)
> -			break;
> -
> -		if (dlpar_cpu_exists(cpus, drc_index))
> -			continue;
> -
> -		cpu_drcs[cpus_found++] = drc_index;
> -	}
> -
> -	return cpus_found;
> -}
> -
> -static int dlpar_cpu_add_by_count(u32 cpus_to_add)
> -{
> -	struct device_node *parent;
> -	struct property *info;
> -	u32 *cpu_drcs;
> -	int cpus_added = 0;
> -	int cpus_found;
> -	int i, rc;
> -
> -	pr_debug("Attempting to hot-add %d CPUs\n", cpus_to_add);
> -
> -	cpu_drcs = kcalloc(cpus_to_add, sizeof(*cpu_drcs), GFP_KERNEL);
> -	if (!cpu_drcs)
> -		return -EINVAL;
> -
> -	parent = of_find_node_by_path("/cpus");
> -	if (!parent) {
> -		pr_warn("Could not find CPU root node in device tree\n");
> -		kfree(cpu_drcs);
> -		return -1;
> -	}
> -
> -	info = of_find_property(parent, "ibm,drc-info", NULL);
> -	if (info)
> -		cpus_found = find_drc_info_cpus_to_add(parent, info, cpu_drcs, cpus_to_add);
> -	else
> -		cpus_found = find_drc_index_cpus_to_add(parent, cpu_drcs, cpus_to_add);
> -
> -	of_node_put(parent);
> -
> -	if (cpus_found < cpus_to_add) {
> -		pr_warn("Failed to find enough CPUs (%d of %d) to add\n",
> -			cpus_found, cpus_to_add);
> -		kfree(cpu_drcs);
> -		return -EINVAL;
> -	}
> -
> -	for (i = 0; i < cpus_to_add; i++) {
> -		rc = dlpar_cpu_add(cpu_drcs[i]);
> -		if (rc)
> -			break;
> -
> -		cpus_added++;
> -	}
> -
> -	if (cpus_added < cpus_to_add) {
> -		pr_warn("CPU hot-add failed, removing any added CPUs\n");
> -
> -		for (i = 0; i < cpus_added; i++)
> -			dlpar_cpu_remove_by_index(cpu_drcs[i]);
> -
> -		rc = -EINVAL;
> -	} else {
> -		rc = 0;
> -	}
> -
> -	kfree(cpu_drcs);
> -	return rc;
> -}
> -
>   int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
>   {
>   	u32 count, drc_index;
> @@ -963,9 +753,7 @@ int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
>   
>   	switch (hp_elog->action) {
>   	case PSERIES_HP_ELOG_ACTION_REMOVE:
> -		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_COUNT)
> -			rc = dlpar_cpu_remove_by_count(count);
> -		else if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX) {
> +		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX) {
>   			rc = dlpar_cpu_remove_by_index(drc_index);
>   			/*
>   			 * Setting the isolation state of an UNISOLATED/CONFIGURED
> @@ -979,9 +767,7 @@ int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
>   			rc = -EINVAL;
>   		break;
>   	case PSERIES_HP_ELOG_ACTION_ADD:
> -		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_COUNT)
> -			rc = dlpar_cpu_add_by_count(count);
> -		else if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX)
> +		if (hp_elog->id_type == PSERIES_HP_ELOG_ID_DRC_INDEX)
>   			rc = dlpar_cpu_add(drc_index);
>   		else
>   			rc = -EINVAL;
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] powerpc/pseries/cpuhp: delete add/remove_by_count code
  2021-09-21  0:18   ` Daniel Henrique Barboza
@ 2021-09-21  0:43     ` Nathan Lynch
  0 siblings, 0 replies; 11+ messages in thread
From: Nathan Lynch @ 2021-09-21  0:43 UTC (permalink / raw)
  To: Daniel Henrique Barboza, linuxppc-dev; +Cc: tyreld, ldufour, aneesh.kumar

Daniel Henrique Barboza <danielhb413@gmail.com> writes:

> On 9/20/21 10:55, Nathan Lynch wrote:
>> The core DLPAR code supports two actions (add and remove) and three
>> subtypes of action:
>> 
>> * By DRC index: the action is attempted on a single specified resource.
>>    This is the usual case for processors.
>> * By indexed count: the action is attempted on a range of resources
>>    beginning at the specified index. This is implemented only by the memory
>>    DLPAR code.
>> * By count: the lower layer (CPU or memory) is responsible for locating the
>>    specified number of resources to which the action can be applied.
>> 
>> I cannot find any evidence of the "by count" subtype being used by drmgr or
>> qemu for processors. And when I try to exercise this code, the add case
>> does not work:
>
>
> Just to clarify: did you check both CPU and memory cases and found out that the
> 'by count' subtype isn't used with CPUs, but drmgr has some cases in which
> 'by count' is used with LMBs?

Yes, drmgr uses both the 'by count' and the 'by index' methods for
memory currently on PowerVM.

> I'm asking because I worked with a part of the LMB removal code a few months ago,
> and got stuck in a situation in which the 'by count' and 'by indexed count' are
> similar enough to feel repetitive, but distinct enough to not be easily reduced
> into a single function. If drmgr wasn't using the 'by count' subtypes for LMBs
> that would be a good chance for more code redux.

The 'by count' method is definitely used for memory on PowerVM. I was
under the impression that the 'by indexed count' method was used by qemu
for memory sometimes; I'm pretty sure it's not used on PowerVM.

>> Summary:
>> 
>> * This code has not worked reliably since its introduction.
>> * There is no evidence that it is used.
>> * It contains questionable rollback behaviors in error paths which are
>>    difficult to test.
>> 
>> So let's remove it.
>> 
>> Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
>> Fixes: ac71380071d1 ("powerpc/pseries: Add CPU dlpar remove functionality")
>> Fixes: 90edf184b9b7 ("powerpc/pseries: Add CPU dlpar add functionality")
>> Fixes: b015f6bc9547 ("powerpc/pseries: Add cpu DLPAR support for drc-info property")
>> ---
>
> Tested with a QEMU pseries guest, no issues found.
>
>
> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>

Thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-09-22  2:28 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-20 13:55 [PATCH 0/3] CPU DLPAR/hotplug for v5.16 Nathan Lynch
2021-09-20 13:55 ` [PATCH 1/3] powerpc/pseries/cpuhp: cache node corrections Nathan Lynch
2021-09-20 23:59   ` Daniel Henrique Barboza
2021-09-20 13:55 ` [PATCH 2/3] powerpc/cpuhp: BUG -> WARN conversion in offline path Nathan Lynch
2021-09-20 14:26   ` Christophe Leroy
2021-09-20 14:39     ` Nathan Lynch
2021-09-21  0:05   ` Daniel Henrique Barboza
2021-09-20 13:55 ` [PATCH 3/3] powerpc/pseries/cpuhp: delete add/remove_by_count code Nathan Lynch
2021-09-20 21:50   ` kernel test robot
2021-09-21  0:18   ` Daniel Henrique Barboza
2021-09-21  0:43     ` Nathan Lynch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).