On Wed, May 24, 2017 at 11:28:57AM +0200, Greg Kurz wrote: > On Wed, 24 May 2017 15:07:54 +1000 > David Gibson wrote: > > > On Tue, May 23, 2017 at 01:18:11PM +0200, Laurent Vivier wrote: > > > If the OS is not started, QEMU sends an event to the OS > > > that is lost and cannot be recovered. An unplug is not > > > able to restore QEMU in a coherent state. > > > So, while the OS is not started, disable CPU and memory hotplug. > > > We use option vector 6 to know if the OS is started > > > > > > Signed-off-by: Laurent Vivier > > > > Urgh.. I'm not terribly confident that this is really correct. As > > discussed on the previous patch, you're essentially using OV6 as a > > flag that CAS is complete. > > > > But while it undoubtedly makes the race window much smaller, I don't > > see that there's any guarantee the guest OS will really be able to > > handle hotplug events immediately after CAS. > > > > In particular if the CAS process completes partially but then needs to > > trigger a reboot, I think that would end up setting the ov6 variable, > > but the OS would definitely not be in a state to accept events. > > > > We never have any guarantee that the OS will process an event that > we've sent actually (think of a kernel crash just after a successful > CAS negotiation for example, or any failure with the various guest > components involved in the process of hotplug). > > > Mike, I really think we need some input from someone familiar with how > > these hotplug events are supposed to work. What do we need to do to > > handle lost or stale events, such as those delivered when an OS is not > > booted. > > > > AFAIK, in the PowerVM world, the HMC exposes a user configurable timeout. > > https://www.ibm.com/support/knowledgecenter/POWER8/p8hat/p8hat_dlparprocpoweraddp6.htm > > I'm not sure we can do anything better than being able to "cancel" a previous > hotplug attempt if it takes too long, but I'm not necessarily the expert you're > looking for :) Right, but at the moment we *don't* have a way to cancel a previous hotplug attempt. Trying to remove again ends up with things in a tangle. > > > > --- > > > hw/ppc/spapr.c | 22 +++++++++++++++++++--- > > > 1 file changed, 19 insertions(+), 3 deletions(-) > > > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > > > index eceb4cc..2e9320d 100644 > > > --- a/hw/ppc/spapr.c > > > +++ b/hw/ppc/spapr.c > > > @@ -2625,6 +2625,7 @@ out: > > > static void spapr_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev, > > > Error **errp) > > > { > > > + sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev); > > > PCDIMMDevice *dimm = PC_DIMM(dev); > > > PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm); > > > MemoryRegion *mr = ddc->get_memory_region(dimm); > > > @@ -2645,6 +2646,13 @@ static void spapr_memory_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev, > > > goto out; > > > } > > > > > > + if (dev->hotplugged) { > > > + if (!ms->os_name) { > > > + error_setg(&local_err, "Memory hotplug not supported without OS"); > > > + goto out; > > > + } > > > + } > > > + > > > out: > > > error_propagate(errp, local_err); > > > } > > > @@ -2874,6 +2882,7 @@ static void spapr_core_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev, > > > Error **errp) > > > { > > > MachineState *machine = MACHINE(OBJECT(hotplug_dev)); > > > + sPAPRMachineState *ms = SPAPR_MACHINE(machine); > > > MachineClass *mc = MACHINE_GET_CLASS(hotplug_dev); > > > Error *local_err = NULL; > > > CPUCore *cc = CPU_CORE(dev); > > > @@ -2884,9 +2893,16 @@ static void spapr_core_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev, > > > int node_id; > > > int index; > > > > > > - if (dev->hotplugged && !mc->has_hotpluggable_cpus) { > > > - error_setg(&local_err, "CPU hotplug not supported for this machine"); > > > - goto out; > > > + if (dev->hotplugged) { > > > + if (!mc->has_hotpluggable_cpus) { > > > + error_setg(&local_err, > > > + "CPU hotplug not supported for this machine"); > > > + goto out; > > > + } > > > + if (!ms->os_name) { > > > + error_setg(&local_err, "CPU hotplug not supported without OS"); > > > + goto out; > > > + } > > > } > > > > > > if (strcmp(base_core_type, type)) { > > > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson