* [PATCH v3 0/1] show 'last online CPU' error in dlpar_cpu_offline()
From: Daniel Henrique Barboza @ 2021-03-26 14:19 UTC
To: linuxppc-dev; +Cc: Daniel Henrique Barboza, dja

changes in v3 after Daniel Axtens review:
- fixed typo in commit msg
- fixed block comment format to make checkpatch happy
v2 link: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210323205056.52768-2-danielhb413@gmail.com/

changes in v2 after Michael Ellerman review:
- moved the verification code from dlpar_cpu_remove() to
  dlpar_cpu_offline(), while holding cpu_add_remove_lock
- reworded the commit message and code comment
v1 link: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210305173845.451158-1-danielhb413@gmail.com/

Daniel Henrique Barboza (1):
  hotplug-cpu.c: show 'last online CPU' error in dlpar_cpu_offline()

 arch/powerpc/platforms/pseries/hotplug-cpu.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

--
2.30.2

* [PATCH v3 1/1] hotplug-cpu.c: show 'last online CPU' error in dlpar_cpu_offline()
From: Daniel Henrique Barboza @ 2021-03-26 14:19 UTC
To: linuxppc-dev; +Cc: Daniel Henrique Barboza, dja

One of the reasons that dlpar_cpu_offline can fail is when attempting to
offline the last online CPU of the kernel. This can be observed in a
pseries QEMU guest that has hotplugged CPUs. If the user offlines all
other CPUs of the guest, and a hotplugged CPU is now the last online
CPU, trying to reclaim it will fail. See [1] for an example.

The current error message in this situation returns rc with -EBUSY and a
generic explanation, e.g.:

pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9, rc: -16

EBUSY can be caused by other conditions, such as cpu_hotplug_disable
being true. Throwing a more specific error message for this case,
instead of just "Failed to offline CPU", makes it clearer that the error
is in fact a known error situation instead of other generic/unknown
cause.

This patch adds a 'last online' check in dlpar_cpu_offline() to catch
the 'last online CPU' offline error, returning a more informative error
message:

pseries-hotplug-cpu: Unable to remove last online CPU PowerPC,POWER9

[1] https://bugzilla.redhat.com/1911414

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 12cbffd3c2e3..4b9df4d645b4 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -271,6 +271,19 @@ static int dlpar_offline_cpu(struct device_node *dn)
 			if (!cpu_online(cpu))
 				break;
 
+			/* device_offline() will return -EBUSY (via cpu_down())
+			 * if there is only one CPU left. Check it here to fail
+			 * earlier and with a more informative error message,
+			 * while also retaining the cpu_add_remove_lock to be sure
+			 * that no CPUs are being online/offlined during this
+			 * check.
+			 */
+			if (num_online_cpus() == 1) {
+				pr_warn("Unable to remove last online CPU %pOFn\n", dn);
+				rc = -EBUSY;
+				goto out_unlock;
+			}
+
 			cpu_maps_update_done();
 			rc = device_offline(get_cpu_device(cpu));
 			if (rc)
@@ -283,6 +296,7 @@ static int dlpar_offline_cpu(struct device_node *dn)
 				thread);
 		}
 	}
+out_unlock:
 	cpu_maps_update_done();
 
 out:
--
2.30.2

* Re: [PATCH v3 1/1] hotplug-cpu.c: show 'last online CPU' error in dlpar_cpu_offline()
From: Andrew Donnellan @ 2021-03-26 15:30 UTC
To: Daniel Henrique Barboza, linuxppc-dev; +Cc: dja

On 27/3/21 1:19 am, Daniel Henrique Barboza wrote:
> One of the reasons that dlpar_cpu_offline can fail is when attempting to
> offline the last online CPU of the kernel. This can be observed in a
> pseries QEMU guest that has hotplugged CPUs. If the user offlines all
> other CPUs of the guest, and a hotplugged CPU is now the last online
> CPU, trying to reclaim it will fail. See [1] for an example.
>
> The current error message in this situation returns rc with -EBUSY and a
> generic explanation, e.g.:
>
> pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9, rc: -16
>
> EBUSY can be caused by other conditions, such as cpu_hotplug_disable
> being true. Throwing a more specific error message for this case,
> instead of just "Failed to offline CPU", makes it clearer that the error
> is in fact a known error situation instead of other generic/unknown
> cause.
>
> This patch adds a 'last online' check in dlpar_cpu_offline() to catch
> the 'last online CPU' offline error, returning a more informative error
> message:
>
> pseries-hotplug-cpu: Unable to remove last online CPU PowerPC,POWER9
>
> [1] https://bugzilla.redhat.com/1911414
>
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

Thanks for addressing the issues in Daniel's review. I haven't tested
it, but this patch looks sensible enough to me.

Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>

--
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited
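
The commit message argues that a bare -EBUSY cannot tell "last online CPU" apart from other causes such as hotplug being disabled. The user-space sketch below is only a model of that reasoning, not kernel code: every `model_*` name, the `MODEL_NR_CPUS` constant and the integer CPU ids are hypothetical stand-ins, and locking is left out entirely. It mirrors the order of checks in the patched function (named `dlpar_offline_cpu()` in the hunk header): test for the last online CPU first and print the specific message, otherwise fall through to a generic offline path that can still return -EBUSY for a different reason.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 4	/* hypothetical size, for illustration only */

/* Hypothetical stand-in for the kernel's CPU hotplug state. */
struct model_cpus {
	bool online[MODEL_NR_CPUS];
	bool hotplug_disabled;	/* models the "hotplug disabled" condition */
};

static int model_num_online(const struct model_cpus *c)
{
	int n = 0;

	for (int i = 0; i < MODEL_NR_CPUS; i++)
		n += c->online[i] ? 1 : 0;
	return n;
}

/* Generic offline path: two unrelated conditions both yield -EBUSY. */
static int model_device_offline(struct model_cpus *c, int cpu)
{
	if (c->hotplug_disabled || model_num_online(c) == 1)
		return -EBUSY;
	c->online[cpu] = false;
	return 0;
}

/* Patched flow: catch the last-online case early, with its own message. */
static int model_dlpar_offline_cpu(struct model_cpus *c, int cpu)
{
	int rc;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	if (!c->online[cpu])
		return 0;	/* already offline, nothing to do */

	if (model_num_online(c) == 1) {
		fprintf(stderr, "Unable to remove last online CPU %d\n", cpu);
		return -EBUSY;
	}

	rc = model_device_offline(c, cpu);
	if (rc)
		fprintf(stderr, "Failed to offline CPU %d, rc: %d\n", cpu, rc);
	return rc;
}
```

In this model, offlining CPUs until one remains and then calling `model_dlpar_offline_cpu()` on it returns -EBUSY with the specific message, while setting `hotplug_disabled` with several CPUs online returns the same code with the generic message. The return value is unchanged in both cases; only the log line distinguishes them, which is the point of the patch.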