* [PATCH v3 0/1] show 'last online CPU' error in dlpar_cpu_offline()
@ 2021-03-26 14:19 Daniel Henrique Barboza
2021-03-26 14:19 ` [PATCH v3 1/1] hotplug-cpu.c: " Daniel Henrique Barboza
0 siblings, 1 reply; 3+ messages in thread
From: Daniel Henrique Barboza @ 2021-03-26 14:19 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Daniel Henrique Barboza, dja
changes in v3 after Daniel Axtens review:
- fixed typo in commit msg
- fixed block comment format to make checkpatch happy
v2 link:
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210323205056.52768-2-danielhb413@gmail.com/
changes in v2 after Michael Ellerman review:
- moved the verification code from dlpar_cpu_remove() to
  dlpar_cpu_offline(), while holding cpu_add_remove_lock
- reworded the commit message and code comment
v1 link:
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210305173845.451158-1-danielhb413@gmail.com/
Daniel Henrique Barboza (1):
hotplug-cpu.c: show 'last online CPU' error in dlpar_cpu_offline()
arch/powerpc/platforms/pseries/hotplug-cpu.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
--
2.30.2
* [PATCH v3 1/1] hotplug-cpu.c: show 'last online CPU' error in dlpar_cpu_offline()
2021-03-26 14:19 [PATCH v3 0/1] show 'last online CPU' error in dlpar_cpu_offline() Daniel Henrique Barboza
@ 2021-03-26 14:19 ` Daniel Henrique Barboza
2021-03-26 15:30 ` Andrew Donnellan
0 siblings, 1 reply; 3+ messages in thread
From: Daniel Henrique Barboza @ 2021-03-26 14:19 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Daniel Henrique Barboza, dja
One of the reasons that dlpar_cpu_offline can fail is when attempting to
offline the last online CPU of the kernel. This can be observed in a
pseries QEMU guest that has hotplugged CPUs. If the user offlines all
other CPUs of the guest, and a hotplugged CPU is now the last online
CPU, trying to reclaim it will fail. See [1] for an example.
The current error message in this situation returns rc with -EBUSY and a
generic explanation, e.g.:
pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9, rc: -16
EBUSY can be caused by other conditions, such as cpu_hotplug_disable
being true. Emitting a more specific error message for this case,
instead of just "Failed to offline CPU", makes it clear that the error
is a known situation rather than some other generic or unknown
cause.
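For readers decoding the "rc: -16" in the log line above: kernel
functions conventionally return negated errno values, and on Linux
errno 16 is EBUSY. A minimal user-space sketch of that mapping follows.
rc_name() is a hypothetical helper used only for illustration; it is
not part of this patch or of the kernel.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (illustration only): map a negated errno value,
 * as printed in "rc: -16", to its symbolic name. Only the two codes
 * relevant to this discussion are covered.
 */
static const char *rc_name(int rc)
{
	switch (-rc) {
	case EBUSY:
		return "EBUSY";
	case EINVAL:
		return "EINVAL";
	default:
		return "unknown";
	}
}

/* Print a return code together with its symbolic name. */
static void print_rc(int rc)
{
	printf("rc: %d (%s)\n", rc, rc_name(rc));
}
```

Calling print_rc(-16) on a Linux system shows that the generic message
carries nothing beyond EBUSY itself, which is why the same code cannot
distinguish the 'last online CPU' case from other busy conditions.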
This patch adds a 'last online' check in dlpar_cpu_offline() to catch
the 'last online CPU' offline error, returning a more informative error
message:
pseries-hotplug-cpu: Unable to remove last online CPU PowerPC,POWER9
[1] https://bugzilla.redhat.com/1911414
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
arch/powerpc/platforms/pseries/hotplug-cpu.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 12cbffd3c2e3..4b9df4d645b4 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -271,6 +271,19 @@ static int dlpar_offline_cpu(struct device_node *dn)
 			if (!cpu_online(cpu))
 				break;
 
+			/* device_offline() will return -EBUSY (via cpu_down())
+			 * if there is only one CPU left. Check it here to fail
+			 * earlier and with a more informative error message,
+			 * while also retaining the cpu_add_remove_lock to be sure
+			 * that no CPUs are being online/offlined during this
+			 * check.
+			 */
+			if (num_online_cpus() == 1) {
+				pr_warn("Unable to remove last online CPU %pOFn\n", dn);
+				rc = -EBUSY;
+				goto out_unlock;
+			}
+
 			cpu_maps_update_done();
 			rc = device_offline(get_cpu_device(cpu));
 			if (rc)
@@ -283,6 +296,7 @@ static int dlpar_offline_cpu(struct device_node *dn)
 					thread);
 		}
 	}
+out_unlock:
 	cpu_maps_update_done();
 
 out:
--
2.30.2
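
The control flow added by the hunk above (check the online count while
the lock is held, fail early with -EBUSY, and leave through a label
that releases the lock) can be modelled in user space roughly as shown
below. This is a single-threaded sketch under stated assumptions, not
the kernel implementation: every model_* name is hypothetical, the lock
is a plain flag standing in for cpu_add_remove_lock, and the model
keeps the lock held across the whole operation, whereas the real
function releases it before calling device_offline().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state (illustration only). */
static bool model_lock_held;      /* models cpu_add_remove_lock */
static int model_online_cpus = 2; /* models num_online_cpus() */

/* Models cpu_maps_update_begin(): take the lock. */
static void model_lock(void)
{
	assert(!model_lock_held);
	model_lock_held = true;
}

/* Models cpu_maps_update_done(): release the lock. */
static void model_unlock(void)
{
	assert(model_lock_held);
	model_lock_held = false;
}

/*
 * Models one offline attempt. The 'last online' check runs while the
 * lock is held, so the count cannot change underneath it, and the
 * failure path leaves through out_unlock so the lock is always
 * released exactly once.
 */
static int model_offline_cpu(const char *name)
{
	int rc = 0;

	model_lock();
	if (model_online_cpus == 1) {
		fprintf(stderr, "Unable to remove last online CPU %s\n", name);
		rc = -EBUSY;
		goto out_unlock;
	}
	model_online_cpus--;
out_unlock:
	model_unlock();
	return rc;
}
```

With two CPUs online, a first call to model_offline_cpu() succeeds and
a second call fails with -EBUSY while leaving the count at one, which
mirrors the behaviour the new message reports.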
* Re: [PATCH v3 1/1] hotplug-cpu.c: show 'last online CPU' error in dlpar_cpu_offline()
2021-03-26 14:19 ` [PATCH v3 1/1] hotplug-cpu.c: " Daniel Henrique Barboza
@ 2021-03-26 15:30 ` Andrew Donnellan
0 siblings, 0 replies; 3+ messages in thread
From: Andrew Donnellan @ 2021-03-26 15:30 UTC (permalink / raw)
To: Daniel Henrique Barboza, linuxppc-dev; +Cc: dja
On 27/3/21 1:19 am, Daniel Henrique Barboza wrote:
> One of the reasons that dlpar_cpu_offline can fail is when attempting to
> offline the last online CPU of the kernel. This can be observed in a
> pseries QEMU guest that has hotplugged CPUs. If the user offlines all
> other CPUs of the guest, and a hotplugged CPU is now the last online
> CPU, trying to reclaim it will fail. See [1] for an example.
>
> The current error message in this situation returns rc with -EBUSY and a
> generic explanation, e.g.:
>
> pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9, rc: -16
>
> EBUSY can be caused by other conditions, such as cpu_hotplug_disable
> being true. Emitting a more specific error message for this case,
> instead of just "Failed to offline CPU", makes it clear that the error
> is a known situation rather than some other generic or unknown
> cause.
>
> This patch adds a 'last online' check in dlpar_cpu_offline() to catch
> the 'last online CPU' offline error, returning a more informative error
> message:
>
> pseries-hotplug-cpu: Unable to remove last online CPU PowerPC,POWER9
>
> [1] https://bugzilla.redhat.com/1911414
>
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Thanks for addressing the issues in Daniel's review.
I haven't tested it, but this patch looks sensible enough to me.
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
--
Andrew Donnellan OzLabs, ADL Canberra
ajd@linux.ibm.com IBM Australia Limited