On Fri, Apr 16, 2021 at 06:02:14PM -0300, Daniel Henrique Barboza wrote: > At this moment, PAPR [1] does not have a way to report errors during a device > removal operation. This puts a strain in the hypervisor, which needs extra > mechanisms to try to fallback and recover from an error that might have > happened during the removal. The QEMU community has dealt with it during these > years by either trying to preempt the error before sending the HP event or, in > case of a guest side failure, reboot the guest to complete the removal process. > > This started to change with QEMU commit fe1831eff8a4 ("spapr_drc.c: use DRC > reconfiguration to cleanup DIMM unplug state"), where a way to fallback from a > memory removal error was introduced. In this case, when QEMU detects that the > kernel is reconfiguring LMBs DRCs that were marked as pending removal, the > entire process is reverted from the QEMU side as well. Around the same time, > other discussions in the QEMU mailing discussed an alternative for other device > as well. > > In [2] the idea of using RTAS set-indicator for this role was first introduced. > The RTAS set-indicator call, when attempting to UNISOLATE a DRC that is already > UNISOLATED or CONFIGURED, returns RTAS_OK and does nothing else for both QEMU > and phyp. This gives us an opportunity to use this behavior to signal the > hypervisor layer when a device removal happens, allowing it to do a > proper Nit: it's not when a device removal happens, but when it *fails* to happen. > error handling knowing for sure that the removal failed in the kernel. Using > set-indicator to report HP errors isn't strange to PAPR, as per R1-13.5.3.4-4. > of table 13.7 of [1]: > "For all DR options: If this is a DR operation that involves the user insert- > ing a DR entity, then if the firmware can determine that the inserted entity > would cause a system disturbance, then the set-indicator RTAS call must not > unisolate the entity and must return an error status which is unique to the > particular error." > > PAPR does not make any restrictions or considerations about setting an already > Unisolated/Configured DRC to 'unisolate', meaning we have a chance to use it > for this purpose - signal an OS side error when attempting to remove a DR > entity. To validate the design, this is being implemented only for CPUs. > > QEMU will use this mechanism to rollback the device removal (hotunplug) state, > allowing for a better error handling mechanism. A implementation of how QEMU > can do it is in [3]. When using a kernel with this series applied, together > with this QEMU build, this is what happens in a common CPU removal/hotunplug > error scenario (trying to remove the last online CPU): > > ( QEMU command line: qemu-system-ppc64 -machine pseries,accel=kvm,usb=off > -smp 1,maxcpus=2,threads=1,cores=2,sockets=1 ... ) > > [root@localhost ~]# QEMU 5.2.92 monitor - type 'help' for more information > (qemu) device_add host-spapr-cpu-core,core-id=1,id=core1 > (qemu) > > [root@localhost ~]# echo 0 > /sys/devices/system/cpu/cpu0/online > [ 77.548442][ T13] IRQ 19: no longer affine to CPU0 > [ 77.548452][ T13] IRQ 20: no longer affine to CPU0 > [ 77.548458][ T13] IRQ 256: no longer affine to CPU0 > [ 77.548465][ T13] IRQ 258: no longer affine to CPU0 > [ 77.548472][ T13] IRQ 259: no longer affine to CPU0 > [ 77.548479][ T13] IRQ 260: no longer affine to CPU0 > [ 77.548485][ T13] IRQ 261: no longer affine to CPU0 > [ 77.548590][ T0] cpu 0 (hwid 0) Ready to die... > [root@localhost ~]# (qemu) > (qemu) device_del core1 > (qemu) [ 83.214073][ T100] pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9, rc: -16 > qemu-system-ppc64: Device hotunplug rejected by the guest for device core1 > > (qemu) > > As soon as the CPU removal fails in dlpar_cpu(), QEMU becames aware of > it and is able to do error recovery. > > If this solution is well received, I'll push for an architecture change > request internally at IBM to make this mechanism PAPR official. > > > [1] https://openpowerfoundation.org/wp-content/uploads/2020/07/LoPAR-20200611.pdf > [2] https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg06395.html > [3] https://github.com/danielhb/qemu/tree/unisolate_drc_callback_v1 > > Daniel Henrique Barboza (2): > dlpar.c: introduce dlpar_unisolate_drc() > hotplug-cpu.c: set UNISOLATE on dlpar_cpu_remove() failure > > arch/powerpc/platforms/pseries/dlpar.c | 14 ++++++++++++++ > arch/powerpc/platforms/pseries/hotplug-cpu.c | 9 ++++++++- > arch/powerpc/platforms/pseries/pseries.h | 1 + > 3 files changed, 23 insertions(+), 1 deletion(-) > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson