All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/4 v6] migration/ppc: migrating DRC, ccs_list and pending_events
@ 2017-04-24 22:08 Daniel Henrique Barboza
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 1/4] migration: alternative way to set instance_id in SaveStateEntry Daniel Henrique Barboza
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-24 22:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

Hi,

This is the version 6 of the pseries patches that was last sent in the mailing list
more than 6 months ago. The original v5 patchset was authored by Jianjun Duan (see link
below):

http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg00270.html

The specific pseries patches were stripped out in the original v6 patchset
and it was later pushed upstream in the v17 in the 'extend VMStateInfo' and
'migrate QTAILQ' contributions.

The changelog as far as the pseries patches are concerned:


v6: - Rebased with QEMU master after 6+ months.
    - Simplified the logic in patch 1.
    - Reworked patch 2: added CPU DRC migration, removed a function pointer from DRC
class and minor improvements.
    - Added clarifications from the previous v5 discussions in the commit messages.

v5: - Rebased to David's ppc-for-2.8.

v4: - Introduce a way to set customized instance_id in SaveStateEntry. Use it
      to set instance_id for DRC using its unique index to address David 
      Gibson's concern.
    - Rename VMS_CSTM to VMS_LINKED based on Paolo Bonzini's suggestions.
    - Clean up qjson stuff in put_qtailq. 
    - Add trace for put_qtailq and get_qtailq based on David Gilbert's 
      suggestion.

    - Based on David's ppc-for-2.7. 

v3: - Simplify overall design followng discussion with Paolo. No longer need
      metadata to migrate QTAILQ.
    - Extend VMStateInfo instead of adding similar fields to VMStateField.
    - Clean up macros in qemu/queue.h.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg05695.html)

v2: - Introduce a general approach to migrate QTAILQ in qemu/queue.h.
    - Migrate signalled field in the DRC state.
    - Put the newly added migrating fields in subsections so that backward 
      migration is not broken.  
    - Set detach_cb field right after migration so that a migrated hot-unplug
      event could finish its course.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg04188.html)

v1: - Inital version.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-04/msg02601.html)


To make guest devices (PCI, CPU and memory) hotplug work together 
with guest migration, spapr drc state needs be transmitted in
migration. This patch defines the VMStateDescription struct for
spapr drc state to enable it.

To fix the potential racing between hotplug events on guest and 
guest migration, ccs_list and pending_events of spapr state need be 
transmitted in migration. This patch set also takes care of it.



Daniel Henrique Barboza (1):
  hw/ppc: migrating the DRC state of hotplugged devices

Jianjun Duan (3):
  migration: alternative way to set instance_id in SaveStateEntry
  migration: spapr: migrate ccs_list in spapr state
  migration: spapr: migrate pending_events of spapr state

 hw/ppc/spapr.c         | 65 ++++++++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_drc.c     | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_events.c  | 24 +++++++++---------
 hw/ppc/spapr_pci.c     | 22 +++++++++++++++++
 include/hw/ppc/spapr.h |  3 ++-
 include/hw/qdev-core.h |  6 +++++
 migration/savevm.c     |  6 +++++
 7 files changed, 181 insertions(+), 12 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 1/4] migration: alternative way to set instance_id in SaveStateEntry
  2017-04-24 22:08 [Qemu-devel] [PATCH 0/4 v6] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
@ 2017-04-24 22:08 ` Daniel Henrique Barboza
  2017-04-25 22:26   ` Michael Roth
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices Daniel Henrique Barboza
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-24 22:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

From: Jianjun Duan <duanj@linux.vnet.ibm.com>

In QOM (QEMU Object Model) migrated objects are identified with instance_id
which is calculated automatically using their path in the QOM composition
tree. For some objects, this path could change from source to target in
migration. To migrate such objects, we need to make sure the instance_id does
not change from source to target. We add a hook in DeviceClass to do customized
instance_id calculation in such cases.

As a result, in these cases compat will not be set in the concerned
SaveStateEntry. This will prevent the inconsistent idstr to be sent over in
migration. We could have set alias_id in a similar way. But that will be
overloading the purpose of alias_id.

The first application will be setting instance_id for pseries DRC objects using
its unique index. Doing this makes the instance_id of DRC to be consistent
across migration and supports flexible management of DRC objects in migration.

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 include/hw/qdev-core.h | 6 ++++++
 migration/savevm.c     | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 4bf86b0..9b3914c 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -127,6 +127,12 @@ typedef struct DeviceClass {
     qdev_initfn init; /* TODO remove, once users are converted to realize */
     qdev_event exit; /* TODO remove, once users are converted to unrealize */
     const char *bus_type;
+
+    /* When this field is set, qemu will use it to get an unique instance_id
+     * instead of calculating an auto idstr and instance_id for the relevant
+     * SaveStateEntry
+     */
+    int (*dev_get_instance_id)(DeviceState *dev);
 } DeviceClass;
 
 typedef struct NamedGPIOList NamedGPIOList;
diff --git a/migration/savevm.c b/migration/savevm.c
index 03ae1bd..5d8135f 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -606,6 +606,9 @@ int register_savevm_live(DeviceState *dev,
                          calculate_compat_instance_id(idstr) : instance_id;
             instance_id = -1;
         }
+        if (DEVICE_GET_CLASS(dev)->dev_get_instance_id) {
+            instance_id = DEVICE_GET_CLASS(dev)->dev_get_instance_id(dev);
+        }
     }
     pstrcat(se->idstr, sizeof(se->idstr), idstr);
 
@@ -696,6 +699,9 @@ int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
                          calculate_compat_instance_id(vmsd->name) : instance_id;
             instance_id = -1;
         }
+        if (DEVICE_GET_CLASS(dev)->dev_get_instance_id) {
+            instance_id = DEVICE_GET_CLASS(dev)->dev_get_instance_id(dev);
+        }
     }
     pstrcat(se->idstr, sizeof(se->idstr), vmsd->name);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices
  2017-04-24 22:08 [Qemu-devel] [PATCH 0/4 v6] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 1/4] migration: alternative way to set instance_id in SaveStateEntry Daniel Henrique Barboza
@ 2017-04-24 22:08 ` Daniel Henrique Barboza
  2017-04-25 22:45   ` Michael Roth
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 3/4] migration: spapr: migrate ccs_list in spapr state Daniel Henrique Barboza
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 4/4] migration: spapr: migrate pending_events of " Daniel Henrique Barboza
  3 siblings, 1 reply; 17+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-24 22:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

In pseries, a firmware abstraction called Dynamic Reconfiguration
Connector (DRC) is used to assign a particular dynamic resource
to the guest and provide an interface to manage configuration/removal
of the resource associated with it. In other words, DRC is the
'plugged state' of a device.

Before this patch, DRC wasn't being migrated. This causes
post-migration problems due to DRC state mismatch between source and
target. The DRC state of a device X in the source might
change, while in the target the DRC state of X is still fresh. When
migrating the guest, X will not have the same hotplugged state as it
did in the source. This means that we can't hot unplug X in the
target after migration is completed because its DRC state is not consistent.
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
bug that is caused by this DRC state mismatch between source and
target.

To migrate the DRC state, we defined the VMStateDescription struct for
spapr_drc to enable the transmission of spapr_drc state in migration.
Not all the elements in the DRC state are migrated - only those
that can be modified by guest actions or device add/remove
operations:

- 'isolation_state', 'allocation_state' and 'configured' are involved
in the DR state transition diagram from PAPR+ 2.7, 13.4;

- 'configured' and 'signalled' are needed in attaching and detaching
devices;

- 'indicator_state' provides users with hardware state information.

These are the DRC elements that are migrated.

In this patch the DRC state is migrated for PCI, LMB and CPU
connector types. At this moment there is no support to migrate
DRC for the PHB (PCI Host Bridge) type.

The instance_id is used to identify objects in migration. We set
instance_id of DRC using the unique index so that it is the same
across migration.

In hw/ppc/spapr_pci.c, a function called spapr_pci_set_detach_cb
was created to set the detach_cb of the migrated DRC in the
spapr_pci_post_load. The reason is that detach_cb is a DRC function
pointer that can't be migrated but we need it set in the target
so a ongoing hot-unplug event can properly finish.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr_drc.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_pci.c | 22 ++++++++++++++++++
 2 files changed, 89 insertions(+)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index a1cdc87..5c2baad 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -651,6 +651,70 @@ static void spapr_dr_connector_instance_init(Object *obj)
                         NULL, NULL, NULL, NULL);
 }
 
+static bool spapr_drc_needed(void *opaque)
+{
+    sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+    bool rc = false;
+    sPAPRDREntitySense value;
+    drck->entity_sense(drc, &value);
+    /* If no dev is plugged in there is no need to migrate the DRC state */
+    if (value != SPAPR_DR_ENTITY_SENSE_PRESENT) {
+        return false;
+    }
+
+    /*
+     * If there is dev plugged in, we need to migrate the DRC state when
+     * it is different from cold-plugged state
+     */
+    switch (drc->type) {
+
+    case SPAPR_DR_CONNECTOR_TYPE_PCI:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
+               drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    case SPAPR_DR_CONNECTOR_TYPE_LMB:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
+               drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    case SPAPR_DR_CONNECTOR_TYPE_CPU:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
+                drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    default:
+        ;
+    }
+    return rc;
+}
+
+/* return the unique drc index as instance_id for qom interfaces*/
+static int get_instance_id(DeviceState *dev)
+{
+    return (int)get_index(SPAPR_DR_CONNECTOR(OBJECT(dev)));
+}
+
+static const VMStateDescription vmstate_spapr_drc = {
+    .name = "spapr_drc",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_drc_needed,
+    .fields  = (VMStateField []) {
+        VMSTATE_UINT32(isolation_state, sPAPRDRConnector),
+        VMSTATE_UINT32(allocation_state, sPAPRDRConnector),
+        VMSTATE_UINT32(indicator_state, sPAPRDRConnector),
+        VMSTATE_BOOL(configured, sPAPRDRConnector),
+        VMSTATE_BOOL(awaiting_release, sPAPRDRConnector),
+        VMSTATE_BOOL(signalled, sPAPRDRConnector),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
 {
     DeviceClass *dk = DEVICE_CLASS(k);
@@ -659,6 +723,8 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
     dk->reset = reset;
     dk->realize = realize;
     dk->unrealize = unrealize;
+    dk->vmsd = &vmstate_spapr_drc;
+    dk->dev_get_instance_id = get_instance_id;
     drck->set_isolation_state = set_isolation_state;
     drck->set_indicator_state = set_indicator_state;
     drck->set_allocation_state = set_allocation_state;
@@ -672,6 +738,7 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
     drck->detach = detach;
     drck->release_pending = release_pending;
     drck->set_signalled = set_signalled;
+
     /*
      * Reason: it crashes FIXME find and document the real reason
      */
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 98c52e4..639dad2 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1922,12 +1922,34 @@ static void spapr_pci_pre_save(void *opaque)
     }
 }
 
+/*
+ * detach_cb in the DRC state is a function pointer that cannot be
+ * migrated. We set it right after migration so that a migrated
+ * hot-unplug event could finish its work.
+ */
+static void spapr_pci_set_detach_cb(PCIBus *bus, PCIDevice *pdev,
+                                 void *opaque)
+{
+    sPAPRPHBState *sphb = opaque;
+    sPAPRDRConnector *drc = spapr_phb_get_pci_drc(sphb, pdev);
+    drc->detach_cb = spapr_phb_remove_pci_device_cb;
+}
+
 static int spapr_pci_post_load(void *opaque, int version_id)
 {
     sPAPRPHBState *sphb = opaque;
     gpointer key, value;
     int i;
 
+    PCIBus *bus = PCI_HOST_BRIDGE(sphb)->bus;
+    unsigned int bus_no = 0;
+
+    /* Set detach_cb for the drc unconditionally after migration */
+    if (bus) {
+        pci_for_each_device(bus, pci_bus_num(bus), spapr_pci_set_detach_cb,
+                            &bus_no);
+    }
+
     for (i = 0; i < sphb->msi_devs_num; ++i) {
         key = g_memdup(&sphb->msi_devs[i].key,
                        sizeof(sphb->msi_devs[i].key));
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 3/4] migration: spapr: migrate ccs_list in spapr state
  2017-04-24 22:08 [Qemu-devel] [PATCH 0/4 v6] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 1/4] migration: alternative way to set instance_id in SaveStateEntry Daniel Henrique Barboza
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices Daniel Henrique Barboza
@ 2017-04-24 22:08 ` Daniel Henrique Barboza
  2017-04-25 22:47   ` Michael Roth
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 4/4] migration: spapr: migrate pending_events of " Daniel Henrique Barboza
  3 siblings, 1 reply; 17+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-24 22:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

From: Jianjun Duan <duanj@linux.vnet.ibm.com>

ccs_list in spapr state maintains the device tree related
information on the rtas side for hotplugged devices. In racing
situations between hotplug events and migration operation, a rtas
hotplug event could be migrated from the source guest to target
guest, or the source guest could have not yet finished fetching
the device tree when migration is started, the target will try
to finish fetching the device tree. By migrating ccs_list, the
target can fetch the device tree properly.

In theory there would be other alternatives besides migrating the
css_list to fix this. For example, we could add a flag that indicates
whether a device is in the middle of the configure_connector during the
migration process, in the post_load we can detect if this flag
is active and then return an error informing the guest to restart the
hotplug process. However, the DRC state can still be modified outside of
hotplug. Using:

   drmgr -c pci -s <drc_index> -r
   drmgr -c pci -s <drc_index> -a

it is possible to return a device to firmware and then later take it
back and reconfigure it. This is not a common case but it's not prohibited,
and performing a migration between these 2 operations would fail because
the default coldplug state on target assumes a configured state in
the source*. Migrating ccs_list is one solution that cover this
case as well.

ccs_list is put in a subsection in the spapr state VMSD to make
sure migration across different versions is not broken.

* see http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg01763.html
for more information on this discussion.

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 35db949..22f351c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1388,6 +1388,37 @@ static bool version_before_3(void *opaque, int version_id)
     return version_id < 3;
 }
 
+static bool spapr_ccs_list_needed(void *opaque)
+{
+    sPAPRMachineState *spapr = (sPAPRMachineState *)opaque;
+    return !QTAILQ_EMPTY(&spapr->ccs_list);
+}
+
+static const VMStateDescription vmstate_spapr_ccs = {
+    .name = "spaprconfigureconnectorstate",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(drc_index, sPAPRConfigureConnectorState),
+        VMSTATE_INT32(fdt_offset, sPAPRConfigureConnectorState),
+        VMSTATE_INT32(fdt_depth, sPAPRConfigureConnectorState),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static const VMStateDescription vmstate_spapr_ccs_list = {
+    .name = "spaprccslist",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_ccs_list_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_QTAILQ_V(ccs_list, sPAPRMachineState, 1,
+                         vmstate_spapr_ccs, sPAPRConfigureConnectorState,
+                         next),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static bool spapr_ov5_cas_needed(void *opaque)
 {
     sPAPRMachineState *spapr = opaque;
@@ -1486,6 +1517,7 @@ static const VMStateDescription vmstate_spapr = {
     .subsections = (const VMStateDescription*[]) {
         &vmstate_spapr_ov5_cas,
         &vmstate_spapr_patb_entry,
+        &vmstate_spapr_ccs_list,
         NULL
     }
 };
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 4/4] migration: spapr: migrate pending_events of spapr state
  2017-04-24 22:08 [Qemu-devel] [PATCH 0/4 v6] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
                   ` (2 preceding siblings ...)
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 3/4] migration: spapr: migrate ccs_list in spapr state Daniel Henrique Barboza
@ 2017-04-24 22:08 ` Daniel Henrique Barboza
  2017-04-25 22:53   ` Michael Roth
  3 siblings, 1 reply; 17+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-24 22:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

From: Jianjun Duan <duanj@linux.vnet.ibm.com>

In racing situations between hotplug events and migration operation,
a rtas hotplug event could have not yet be delivered to the source
guest when migration is started. In this case the pending_events of
spapr state need be transmitted to the target so that the hotplug
event can be finished on the target.

All the different fields of the events are encoded as defined by
PAPR. We can migrate them as uint8_t binary stream without any
concerns about data padding or endianess.

pending_events is put in a subsection in the spapr state VMSD to make
sure migration across different versions is not broken.

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c         | 33 +++++++++++++++++++++++++++++++++
 hw/ppc/spapr_events.c  | 24 +++++++++++++-----------
 include/hw/ppc/spapr.h |  3 ++-
 3 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 22f351c..a3e939b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1419,6 +1419,38 @@ static const VMStateDescription vmstate_spapr_ccs_list = {
     },
 };
 
+static bool spapr_pending_events_needed(void *opaque)
+{
+    sPAPRMachineState *spapr = (sPAPRMachineState *)opaque;
+    return !QTAILQ_EMPTY(&spapr->pending_events);
+}
+
+static const VMStateDescription vmstate_spapr_event_entry = {
+    .name = "spapreventlogentry",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_INT32(log_type, sPAPREventLogEntry),
+        VMSTATE_BOOL(exception, sPAPREventLogEntry),
+        VMSTATE_UINT32(data_size, sPAPREventLogEntry),
+        VMSTATE_VARRAY_UINT32_ALLOC(data, sPAPREventLogEntry, data_size,
+                                    0, vmstate_info_uint8, uint8_t),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static const VMStateDescription vmstate_spapr_pending_events = {
+    .name = "spaprpendingevents",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_pending_events_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_QTAILQ_V(pending_events, sPAPRMachineState, 1,
+                         vmstate_spapr_event_entry, sPAPREventLogEntry, next),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static bool spapr_ov5_cas_needed(void *opaque)
 {
     sPAPRMachineState *spapr = opaque;
@@ -1518,6 +1550,7 @@ static const VMStateDescription vmstate_spapr = {
         &vmstate_spapr_ov5_cas,
         &vmstate_spapr_patb_entry,
         &vmstate_spapr_ccs_list,
+        &vmstate_spapr_pending_events,
         NULL
     }
 };
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 24a5758..399dd49 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -342,7 +342,8 @@ static int rtas_event_log_to_irq(sPAPRMachineState *spapr, int log_type)
     return source->irq;
 }
 
-static void rtas_event_log_queue(int log_type, void *data, bool exception)
+static void rtas_event_log_queue(int log_type, void *data, bool exception,
+                                 int data_size)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
     sPAPREventLogEntry *entry = g_new(sPAPREventLogEntry, 1);
@@ -351,6 +352,7 @@ static void rtas_event_log_queue(int log_type, void *data, bool exception)
     entry->log_type = log_type;
     entry->exception = exception;
     entry->data = data;
+    entry->data_size = data_size;
     QTAILQ_INSERT_TAIL(&spapr->pending_events, entry, next);
 }
 
@@ -445,6 +447,7 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
     struct rtas_event_log_v6_mainb *mainb;
     struct rtas_event_log_v6_epow *epow;
     struct epow_log_full *new_epow;
+    uint32_t data_size;
 
     new_epow = g_malloc0(sizeof(*new_epow));
     hdr = &new_epow->hdr;
@@ -453,14 +456,13 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
     mainb = &new_epow->mainb;
     epow = &new_epow->epow;
 
+    data_size = sizeof(*new_epow);
     hdr->summary = cpu_to_be32(RTAS_LOG_VERSION_6
                                | RTAS_LOG_SEVERITY_EVENT
                                | RTAS_LOG_DISPOSITION_NOT_RECOVERED
                                | RTAS_LOG_OPTIONAL_PART_PRESENT
                                | RTAS_LOG_TYPE_EPOW);
-    hdr->extended_length = cpu_to_be32(sizeof(*new_epow)
-                                       - sizeof(new_epow->hdr));
-
+    hdr->extended_length = cpu_to_be32(data_size - sizeof(new_epow->hdr));
     spapr_init_v6hdr(v6hdr);
     spapr_init_maina(maina, 3 /* Main-A, Main-B and EPOW */);
 
@@ -479,7 +481,7 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
     epow->event_modifier = RTAS_LOG_V6_EPOW_MODIFIER_NORMAL;
     epow->extended_modifier = RTAS_LOG_V6_EPOW_XMODIFIER_PARTITION_SPECIFIC;
 
-    rtas_event_log_queue(RTAS_LOG_TYPE_EPOW, new_epow, true);
+    rtas_event_log_queue(RTAS_LOG_TYPE_EPOW, new_epow, true, data_size);
 
     qemu_irq_pulse(xics_get_qirq(XICS_FABRIC(spapr),
                                  rtas_event_log_to_irq(spapr,
@@ -504,6 +506,7 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
     struct rtas_event_log_v6_maina *maina;
     struct rtas_event_log_v6_mainb *mainb;
     struct rtas_event_log_v6_hp *hp;
+    uint32_t data_size;
 
     new_hp = g_malloc0(sizeof(struct hp_log_full));
     hdr = &new_hp->hdr;
@@ -512,14 +515,14 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
     mainb = &new_hp->mainb;
     hp = &new_hp->hp;
 
+    data_size = sizeof(*new_hp);
     hdr->summary = cpu_to_be32(RTAS_LOG_VERSION_6
                                | RTAS_LOG_SEVERITY_EVENT
                                | RTAS_LOG_DISPOSITION_NOT_RECOVERED
                                | RTAS_LOG_OPTIONAL_PART_PRESENT
                                | RTAS_LOG_INITIATOR_HOTPLUG
                                | RTAS_LOG_TYPE_HOTPLUG);
-    hdr->extended_length = cpu_to_be32(sizeof(*new_hp)
-                                       - sizeof(new_hp->hdr));
+    hdr->extended_length = cpu_to_be32(data_size - sizeof(new_hp->hdr));
 
     spapr_init_v6hdr(v6hdr);
     spapr_init_maina(maina, 3 /* Main-A, Main-B, HP */);
@@ -572,7 +575,7 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
             cpu_to_be32(drc_id->count_indexed.index);
     }
 
-    rtas_event_log_queue(RTAS_LOG_TYPE_HOTPLUG, new_hp, true);
+    rtas_event_log_queue(RTAS_LOG_TYPE_HOTPLUG, new_hp, true, data_size);
 
     qemu_irq_pulse(xics_get_qirq(XICS_FABRIC(spapr),
                                  rtas_event_log_to_irq(spapr,
@@ -671,8 +674,7 @@ static void check_exception(PowerPCCPU *cpu, sPAPRMachineState *spapr,
     if (!event) {
         goto out_no_events;
     }
-
-    hdr = event->data;
+    hdr = (struct rtas_error_log *)event->data;
     event_len = be32_to_cpu(hdr->extended_length) + sizeof(*hdr);
 
     if (event_len < len) {
@@ -728,7 +730,7 @@ static void event_scan(PowerPCCPU *cpu, sPAPRMachineState *spapr,
         goto out_no_events;
     }
 
-    hdr = event->data;
+    hdr = (struct rtas_error_log *)event->data;
     event_len = be32_to_cpu(hdr->extended_length) + sizeof(*hdr);
 
     if (event_len < len) {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 808aac8..c60dbd2 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -584,7 +584,8 @@ sPAPRTCETable *spapr_tce_find_by_liobn(target_ulong liobn);
 struct sPAPREventLogEntry {
     int log_type;
     bool exception;
-    void *data;
+    uint8_t *data;
+    uint32_t data_size;
     QTAILQ_ENTRY(sPAPREventLogEntry) next;
 };
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] migration: alternative way to set instance_id in SaveStateEntry
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 1/4] migration: alternative way to set instance_id in SaveStateEntry Daniel Henrique Barboza
@ 2017-04-25 22:26   ` Michael Roth
  2017-04-26 10:05     ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Roth @ 2017-04-25 22:26 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel; +Cc: qemu-ppc, david

Quoting Daniel Henrique Barboza (2017-04-24 17:08:25)
> From: Jianjun Duan <duanj@linux.vnet.ibm.com>
> 
> In QOM (QEMU Object Model) migrated objects are identified with instance_id
> which is calculated automatically using their path in the QOM composition
> tree. For some objects, this path could change from source to target in
> migration. To migrate such objects, we need to make sure the instance_id does
> not change from source to target. We add a hook in DeviceClass to do customized
> instance_id calculation in such cases.

When I tried to pluck a subset of these patches for another series it
was noticed that we don't actually need this patch anymore:

  https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05475.html

hw/ppc/spapr_iommu.c already implements an approach for registering DRCs
that would work for our case as well since DRCs are bus-less and do not sit
on a BusClass that implements bc->get_dev_path, so using
vmstate_register(DEVICE(drc), drck->get_index(drc), ...) will work in
the same way this patch expects it to.

> 
> As a result, in these cases compat will not be set in the concerned
> SaveStateEntry. This will prevent the inconsistent idstr to be sent over in
> migration. We could have set alias_id in a similar way. But that will be
> overloading the purpose of alias_id.
> 
> The first application will be setting instance_id for pseries DRC objects using
> its unique index. Doing this makes the instance_id of DRC to be consistent
> across migration and supports flexible management of DRC objects in migration.
> 
> Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
> ---
>  include/hw/qdev-core.h | 6 ++++++
>  migration/savevm.c     | 6 ++++++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index 4bf86b0..9b3914c 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -127,6 +127,12 @@ typedef struct DeviceClass {
>      qdev_initfn init; /* TODO remove, once users are converted to realize */
>      qdev_event exit; /* TODO remove, once users are converted to unrealize */
>      const char *bus_type;
> +
> +    /* When this field is set, qemu will use it to get an unique instance_id
> +     * instead of calculating an auto idstr and instance_id for the relevant
> +     * SaveStateEntry
> +     */
> +    int (*dev_get_instance_id)(DeviceState *dev);
>  } DeviceClass;
> 
>  typedef struct NamedGPIOList NamedGPIOList;
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 03ae1bd..5d8135f 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -606,6 +606,9 @@ int register_savevm_live(DeviceState *dev,
>                           calculate_compat_instance_id(idstr) : instance_id;
>              instance_id = -1;
>          }
> +        if (DEVICE_GET_CLASS(dev)->dev_get_instance_id) {
> +            instance_id = DEVICE_GET_CLASS(dev)->dev_get_instance_id(dev);
> +        }
>      }
>      pstrcat(se->idstr, sizeof(se->idstr), idstr);
> 
> @@ -696,6 +699,9 @@ int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
>                           calculate_compat_instance_id(vmsd->name) : instance_id;
>              instance_id = -1;
>          }
> +        if (DEVICE_GET_CLASS(dev)->dev_get_instance_id) {
> +            instance_id = DEVICE_GET_CLASS(dev)->dev_get_instance_id(dev);
> +        }
>      }
>      pstrcat(se->idstr, sizeof(se->idstr), vmsd->name);
> 
> -- 
> 2.9.3
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices Daniel Henrique Barboza
@ 2017-04-25 22:45   ` Michael Roth
  2017-04-26  5:55     ` David Gibson
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Roth @ 2017-04-25 22:45 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel; +Cc: qemu-ppc, david

Quoting Daniel Henrique Barboza (2017-04-24 17:08:26)
> In pseries, a firmware abstraction called Dynamic Reconfiguration
> Connector (DRC) is used to assign a particular dynamic resource
> to the guest and provide an interface to manage configuration/removal
> of the resource associated with it. In other words, DRC is the
> 'plugged state' of a device.
> 
> Before this patch, DRC wasn't being migrated. This causes
> post-migration problems due to DRC state mismatch between source and
> target. The DRC state of a device X in the source might
> change, while in the target the DRC state of X is still fresh. When
> migrating the guest, X will not have the same hotplugged state as it
> did in the source. This means that we can't hot unplug X in the
> target after migration is completed because its DRC state is not consistent.
> https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
> bug that is caused by this DRC state mismatch between source and
> target.
> 
> To migrate the DRC state, we defined the VMStateDescription struct for
> spapr_drc to enable the transmission of spapr_drc state in migration.
> Not all the elements in the DRC state are migrated - only those
> that can be modified by guest actions or device add/remove
> operations:
> 
> - 'isolation_state', 'allocation_state' and 'configured' are involved
> in the DR state transition diagram from PAPR+ 2.7, 13.4;
> 
> - 'configured' and 'signalled' are needed in attaching and detaching
> devices;
> 
> - 'indicator_state' provides users with hardware state information.
> 
> These are the DRC elements that are migrated.
> 
> In this patch the DRC state is migrated for PCI, LMB and CPU
> connector types. At this moment there is no support to migrate
> DRC for the PHB (PCI Host Bridge) type.
> 
> The instance_id is used to identify objects in migration. We set
> instance_id of DRC using the unique index so that it is the same
> across migration.
> 
> In hw/ppc/spapr_pci.c, a function called spapr_pci_set_detach_cb
> was created to set the detach_cb of the migrated DRC in the
> spapr_pci_post_load. The reason is that detach_cb is a DRC function
> pointer that can't be migrated but we need it set in the target
> so a ongoing hot-unplug event can properly finish.
> 
> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr_drc.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_pci.c | 22 ++++++++++++++++++
>  2 files changed, 89 insertions(+)
> 
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index a1cdc87..5c2baad 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -651,6 +651,70 @@ static void spapr_dr_connector_instance_init(Object *obj)
>                          NULL, NULL, NULL, NULL);
>  }
> 
> +static bool spapr_drc_needed(void *opaque)
> +{
> +    sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
> +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +    bool rc = false;
> +    sPAPRDREntitySense value;
> +    drck->entity_sense(drc, &value);
> +    /* If no dev is plugged in there is no need to migrate the DRC state */
> +    if (value != SPAPR_DR_ENTITY_SENSE_PRESENT) {
> +        return false;
> +    }
> +
> +    /*
> +     * If there is dev plugged in, we need to migrate the DRC state when
> +     * it is different from cold-plugged state
> +     */
> +    switch (drc->type) {
> +
> +    case SPAPR_DR_CONNECTOR_TYPE_PCI:
> +        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
> +               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
> +               drc->configured && drc->signalled && !drc->awaiting_release);
> +        break;
> +
> +    case SPAPR_DR_CONNECTOR_TYPE_LMB:
> +        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
> +               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
> +               drc->configured && drc->signalled && !drc->awaiting_release);
> +        break;
> +
> +    case SPAPR_DR_CONNECTOR_TYPE_CPU:
> +        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
> +               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
> +                drc->configured && drc->signalled && !drc->awaiting_release);
> +        break;
> +
> +    default:
> +        ;
> +    }
> +    return rc;
> +}
> +
> +/* return the unique drc index as instance_id for qom interfaces*/
> +static int get_instance_id(DeviceState *dev)
> +{
> +    return (int)get_index(SPAPR_DR_CONNECTOR(OBJECT(dev)));
> +}
> +
> +static const VMStateDescription vmstate_spapr_drc = {
> +    .name = "spapr_drc",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = spapr_drc_needed,
> +    .fields  = (VMStateField []) {
> +        VMSTATE_UINT32(isolation_state, sPAPRDRConnector),
> +        VMSTATE_UINT32(allocation_state, sPAPRDRConnector),
> +        VMSTATE_UINT32(indicator_state, sPAPRDRConnector),
> +        VMSTATE_BOOL(configured, sPAPRDRConnector),
> +        VMSTATE_BOOL(awaiting_release, sPAPRDRConnector),
> +        VMSTATE_BOOL(signalled, sPAPRDRConnector),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
>  static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
>  {
>      DeviceClass *dk = DEVICE_CLASS(k);
> @@ -659,6 +723,8 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
>      dk->reset = reset;
>      dk->realize = realize;
>      dk->unrealize = unrealize;
> +    dk->vmsd = &vmstate_spapr_drc;
> +    dk->dev_get_instance_id = get_instance_id;
>      drck->set_isolation_state = set_isolation_state;
>      drck->set_indicator_state = set_indicator_state;
>      drck->set_allocation_state = set_allocation_state;
> @@ -672,6 +738,7 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
>      drck->detach = detach;
>      drck->release_pending = release_pending;
>      drck->set_signalled = set_signalled;
> +
>      /*
>       * Reason: it crashes FIXME find and document the real reason
>       */
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 98c52e4..639dad2 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1922,12 +1922,34 @@ static void spapr_pci_pre_save(void *opaque)
>      }
>  }
> 
> +/*
> + * detach_cb in the DRC state is a function pointer that cannot be
> + * migrated. We set it right after migration so that a migrated
> + * hot-unplug event could finish its work.
> + */
> +static void spapr_pci_set_detach_cb(PCIBus *bus, PCIDevice *pdev,
> +                                 void *opaque)
> +{
> +    sPAPRPHBState *sphb = opaque;
> +    sPAPRDRConnector *drc = spapr_phb_get_pci_drc(sphb, pdev);
> +    drc->detach_cb = spapr_phb_remove_pci_device_cb;
> +}

This is doesn't quite work, because drc->detach_cb takes an opaque
argument that is not being restored in here (and is not really
possible to restore).

However, the only DRC user who currently relies on the opaque is
memory unplug, which passes an sPAPRDIMMState via spapr_del_lmbs:

  drck->detach(drc, dev, spapr_lmb_release, ds, errp);

It's actually possible to do away with this, you just need to add
the sPAPRDIMMStates to a QTAILQ that's attached to sPAPRPHBState,
and then migrate it when it is non-empty, similar to how ccs_list
is migrated in the following patch. Then you would modify
spapr_lmb_release to search that queue for the matching
sPAPRDIMMState instead of relying on the opaque argument. You
can get to the sPAPRPHBState via qdev_get_machine(),
spapr_phb_realize() has an example.

spapr_phb_remove_pci_device_cb also currently takes an opaque to
an sPAPRPHBState*, but it doesn't actually do anything with it,
so you can drop it from there. and spapr_core_release never used
an opaque.

> +
>  static int spapr_pci_post_load(void *opaque, int version_id)
>  {
>      sPAPRPHBState *sphb = opaque;
>      gpointer key, value;
>      int i;
> 
> +    PCIBus *bus = PCI_HOST_BRIDGE(sphb)->bus;
> +    unsigned int bus_no = 0;
> +
> +    /* Set detach_cb for the drc unconditionally after migration */
> +    if (bus) {
> +        pci_for_each_device(bus, pci_bus_num(bus), spapr_pci_set_detach_cb,
> +                            &bus_no);
> +    }

In a previous thread it was suggested that rather than restoring these
after migration, we should modify spapr_dr_connector_new() to take the
attach/detach callbacks as parameters so that they are set statically
at init time. Then we can drop all the post-load hooks (memory/cpu
would need similar post-load restorations as well, otherwise).

> +
>      for (i = 0; i < sphb->msi_devs_num; ++i) {
>          key = g_memdup(&sphb->msi_devs[i].key,
>                         sizeof(sphb->msi_devs[i].key));
> -- 
> 2.9.3
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] migration: spapr: migrate ccs_list in spapr state
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 3/4] migration: spapr: migrate ccs_list in spapr state Daniel Henrique Barboza
@ 2017-04-25 22:47   ` Michael Roth
  0 siblings, 0 replies; 17+ messages in thread
From: Michael Roth @ 2017-04-25 22:47 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel; +Cc: qemu-ppc, david

Quoting Daniel Henrique Barboza (2017-04-24 17:08:27)
> From: Jianjun Duan <duanj@linux.vnet.ibm.com>
> 
> ccs_list in spapr state maintains the device tree related
> information on the rtas side for hotplugged devices. In racing
> situations between hotplug events and migration operation, a rtas
> hotplug event could be migrated from the source guest to target
> guest, or the source guest could have not yet finished fetching
> the device tree when migration is started, the target will try
> to finish fetching the device tree. By migrating ccs_list, the
> target can fetch the device tree properly.
> 
> In theory there would be other alternatives besides migrating the
> css_list to fix this. For example, we could add a flag that indicates
> whether a device is in the middle of the configure_connector during the
> migration process, in the post_load we can detect if this flag
> is active and then return an error informing the guest to restart the
> hotplug process. However, the DRC state can still be modified outside of
> hotplug. Using:
> 
>    drmgr -c pci -s <drc_index> -r
>    drmgr -c pci -s <drc_index> -a
> 
> it is possible to return a device to firmware and then later take it
> back and reconfigure it. This is not a common case but it's not prohibited,
> and performing a migration between these 2 operations would fail because
> the default coldplug state on target assumes a configured state in
> the source*. Migrating ccs_list is one solution that cover this
> case as well.
> 
> ccs_list is put in a subsection in the spapr state VMSD to make
> sure migration across different versions is not broken.
> 
> * see http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg01763.html
> for more information on this discussion.
> 
> Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 35db949..22f351c 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1388,6 +1388,37 @@ static bool version_before_3(void *opaque, int version_id)
>      return version_id < 3;
>  }
> 
> +static bool spapr_ccs_list_needed(void *opaque)
> +{
> +    sPAPRMachineState *spapr = (sPAPRMachineState *)opaque;
> +    return !QTAILQ_EMPTY(&spapr->ccs_list);
> +}
> +
> +static const VMStateDescription vmstate_spapr_ccs = {
> +    .name = "spaprconfigureconnectorstate",

some word separators wouldn't hurt, since they might show up in
migration error messages.

> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT32(drc_index, sPAPRConfigureConnectorState),
> +        VMSTATE_INT32(fdt_offset, sPAPRConfigureConnectorState),
> +        VMSTATE_INT32(fdt_depth, sPAPRConfigureConnectorState),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
> +static const VMStateDescription vmstate_spapr_ccs_list = {
> +    .name = "spaprccslist",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = spapr_ccs_list_needed,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_QTAILQ_V(ccs_list, sPAPRMachineState, 1,
> +                         vmstate_spapr_ccs, sPAPRConfigureConnectorState,
> +                         next),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
>  static bool spapr_ov5_cas_needed(void *opaque)
>  {
>      sPAPRMachineState *spapr = opaque;
> @@ -1486,6 +1517,7 @@ static const VMStateDescription vmstate_spapr = {
>      .subsections = (const VMStateDescription*[]) {
>          &vmstate_spapr_ov5_cas,
>          &vmstate_spapr_patb_entry,
> +        &vmstate_spapr_ccs_list,
>          NULL
>      }
>  };
> -- 
> 2.9.3
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] migration: spapr: migrate pending_events of spapr state
  2017-04-24 22:08 ` [Qemu-devel] [PATCH 4/4] migration: spapr: migrate pending_events of " Daniel Henrique Barboza
@ 2017-04-25 22:53   ` Michael Roth
  2017-04-26 10:10     ` Daniel Henrique Barboza
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Roth @ 2017-04-25 22:53 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel; +Cc: qemu-ppc, david

Quoting Daniel Henrique Barboza (2017-04-24 17:08:28)
> From: Jianjun Duan <duanj@linux.vnet.ibm.com>
> 
> In racing situations between hotplug events and migration operation,
> a rtas hotplug event could have not yet be delivered to the source
> guest when migration is started. In this case the pending_events of
> spapr state need be transmitted to the target so that the hotplug
> event can be finished on the target.
> 
> All the different fields of the events are encoded as defined by
> PAPR. We can migrate them as uint8_t binary stream without any
> concerns about data padding or endianess.
> 
> pending_events is put in a subsection in the spapr state VMSD to make
> sure migration across different versions is not broken.
> 
> Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         | 33 +++++++++++++++++++++++++++++++++
>  hw/ppc/spapr_events.c  | 24 +++++++++++++-----------
>  include/hw/ppc/spapr.h |  3 ++-
>  3 files changed, 48 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 22f351c..a3e939b 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1419,6 +1419,38 @@ static const VMStateDescription vmstate_spapr_ccs_list = {
>      },
>  };
> 
> +static bool spapr_pending_events_needed(void *opaque)
> +{
> +    sPAPRMachineState *spapr = (sPAPRMachineState *)opaque;
> +    return !QTAILQ_EMPTY(&spapr->pending_events);
> +}
> +
> +static const VMStateDescription vmstate_spapr_event_entry = {
> +    .name = "spapreventlogentry",

Similar suggestion on the naming here and elsewhere.

> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_INT32(log_type, sPAPREventLogEntry),
> +        VMSTATE_BOOL(exception, sPAPREventLogEntry),
> +        VMSTATE_UINT32(data_size, sPAPREventLogEntry),
> +        VMSTATE_VARRAY_UINT32_ALLOC(data, sPAPREventLogEntry, data_size,
> +                                    0, vmstate_info_uint8, uint8_t),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
> +static const VMStateDescription vmstate_spapr_pending_events = {
> +    .name = "spaprpendingevents",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = spapr_pending_events_needed,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_QTAILQ_V(pending_events, sPAPRMachineState, 1,
> +                         vmstate_spapr_event_entry, sPAPREventLogEntry, next),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
>  static bool spapr_ov5_cas_needed(void *opaque)
>  {
>      sPAPRMachineState *spapr = opaque;
> @@ -1518,6 +1550,7 @@ static const VMStateDescription vmstate_spapr = {
>          &vmstate_spapr_ov5_cas,
>          &vmstate_spapr_patb_entry,
>          &vmstate_spapr_ccs_list,
> +        &vmstate_spapr_pending_events,
>          NULL
>      }
>  };
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 24a5758..399dd49 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -342,7 +342,8 @@ static int rtas_event_log_to_irq(sPAPRMachineState *spapr, int log_type)
>      return source->irq;
>  }
> 
> -static void rtas_event_log_queue(int log_type, void *data, bool exception)
> +static void rtas_event_log_queue(int log_type, void *data, bool exception,
> +                                 int data_size)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>      sPAPREventLogEntry *entry = g_new(sPAPREventLogEntry, 1);
> @@ -351,6 +352,7 @@ static void rtas_event_log_queue(int log_type, void *data, bool exception)
>      entry->log_type = log_type;
>      entry->exception = exception;
>      entry->data = data;
> +    entry->data_size = data_size;
>      QTAILQ_INSERT_TAIL(&spapr->pending_events, entry, next);
>  }
> 
> @@ -445,6 +447,7 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
>      struct rtas_event_log_v6_mainb *mainb;
>      struct rtas_event_log_v6_epow *epow;
>      struct epow_log_full *new_epow;
> +    uint32_t data_size;
> 
>      new_epow = g_malloc0(sizeof(*new_epow));
>      hdr = &new_epow->hdr;
> @@ -453,14 +456,13 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
>      mainb = &new_epow->mainb;
>      epow = &new_epow->epow;
> 
> +    data_size = sizeof(*new_epow);
>      hdr->summary = cpu_to_be32(RTAS_LOG_VERSION_6
>                                 | RTAS_LOG_SEVERITY_EVENT
>                                 | RTAS_LOG_DISPOSITION_NOT_RECOVERED
>                                 | RTAS_LOG_OPTIONAL_PART_PRESENT
>                                 | RTAS_LOG_TYPE_EPOW);
> -    hdr->extended_length = cpu_to_be32(sizeof(*new_epow)
> -                                       - sizeof(new_epow->hdr));
> -
> +    hdr->extended_length = cpu_to_be32(data_size - sizeof(new_epow->hdr));
>      spapr_init_v6hdr(v6hdr);
>      spapr_init_maina(maina, 3 /* Main-A, Main-B and EPOW */);
> 
> @@ -479,7 +481,7 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
>      epow->event_modifier = RTAS_LOG_V6_EPOW_MODIFIER_NORMAL;
>      epow->extended_modifier = RTAS_LOG_V6_EPOW_XMODIFIER_PARTITION_SPECIFIC;
> 
> -    rtas_event_log_queue(RTAS_LOG_TYPE_EPOW, new_epow, true);
> +    rtas_event_log_queue(RTAS_LOG_TYPE_EPOW, new_epow, true, data_size);
> 
>      qemu_irq_pulse(xics_get_qirq(XICS_FABRIC(spapr),
>                                   rtas_event_log_to_irq(spapr,
> @@ -504,6 +506,7 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
>      struct rtas_event_log_v6_maina *maina;
>      struct rtas_event_log_v6_mainb *mainb;
>      struct rtas_event_log_v6_hp *hp;
> +    uint32_t data_size;
> 
>      new_hp = g_malloc0(sizeof(struct hp_log_full));
>      hdr = &new_hp->hdr;
> @@ -512,14 +515,14 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
>      mainb = &new_hp->mainb;
>      hp = &new_hp->hp;
> 
> +    data_size = sizeof(*new_hp);
>      hdr->summary = cpu_to_be32(RTAS_LOG_VERSION_6
>                                 | RTAS_LOG_SEVERITY_EVENT
>                                 | RTAS_LOG_DISPOSITION_NOT_RECOVERED
>                                 | RTAS_LOG_OPTIONAL_PART_PRESENT
>                                 | RTAS_LOG_INITIATOR_HOTPLUG
>                                 | RTAS_LOG_TYPE_HOTPLUG);
> -    hdr->extended_length = cpu_to_be32(sizeof(*new_hp)
> -                                       - sizeof(new_hp->hdr));
> +    hdr->extended_length = cpu_to_be32(data_size - sizeof(new_hp->hdr));
> 
>      spapr_init_v6hdr(v6hdr);
>      spapr_init_maina(maina, 3 /* Main-A, Main-B, HP */);
> @@ -572,7 +575,7 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
>              cpu_to_be32(drc_id->count_indexed.index);
>      }
> 
> -    rtas_event_log_queue(RTAS_LOG_TYPE_HOTPLUG, new_hp, true);
> +    rtas_event_log_queue(RTAS_LOG_TYPE_HOTPLUG, new_hp, true, data_size);
> 
>      qemu_irq_pulse(xics_get_qirq(XICS_FABRIC(spapr),
>                                   rtas_event_log_to_irq(spapr,
> @@ -671,8 +674,7 @@ static void check_exception(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>      if (!event) {
>          goto out_no_events;
>      }
> -
> -    hdr = event->data;
> +    hdr = (struct rtas_error_log *)event->data;
>      event_len = be32_to_cpu(hdr->extended_length) + sizeof(*hdr);
> 
>      if (event_len < len) {
> @@ -728,7 +730,7 @@ static void event_scan(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>          goto out_no_events;
>      }
> 
> -    hdr = event->data;
> +    hdr = (struct rtas_error_log *)event->data;
>      event_len = be32_to_cpu(hdr->extended_length) + sizeof(*hdr);
> 
>      if (event_len < len) {
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 808aac8..c60dbd2 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -584,7 +584,8 @@ sPAPRTCETable *spapr_tce_find_by_liobn(target_ulong liobn);
>  struct sPAPREventLogEntry {
>      int log_type;
>      bool exception;
> -    void *data;
> +    uint8_t *data;
> +    uint32_t data_size;
>      QTAILQ_ENTRY(sPAPREventLogEntry) next;
>  };
> 
> -- 
> 2.9.3
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices
  2017-04-25 22:45   ` Michael Roth
@ 2017-04-26  5:55     ` David Gibson
  2017-04-26 10:07       ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
  0 siblings, 1 reply; 17+ messages in thread
From: David Gibson @ 2017-04-26  5:55 UTC (permalink / raw)
  To: Michael Roth; +Cc: Daniel Henrique Barboza, qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 10303 bytes --]

On Tue, Apr 25, 2017 at 05:45:11PM -0500, Michael Roth wrote:
> Quoting Daniel Henrique Barboza (2017-04-24 17:08:26)
> > In pseries, a firmware abstraction called Dynamic Reconfiguration
> > Connector (DRC) is used to assign a particular dynamic resource
> > to the guest and provide an interface to manage configuration/removal
> > of the resource associated with it. In other words, DRC is the
> > 'plugged state' of a device.
> > 
> > Before this patch, DRC wasn't being migrated. This causes
> > post-migration problems due to DRC state mismatch between source and
> > target. The DRC state of a device X in the source might
> > change, while in the target the DRC state of X is still fresh. When
> > migrating the guest, X will not have the same hotplugged state as it
> > did in the source. This means that we can't hot unplug X in the
> > target after migration is completed because its DRC state is not consistent.
> > https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
> > bug that is caused by this DRC state mismatch between source and
> > target.
> > 
> > To migrate the DRC state, we defined the VMStateDescription struct for
> > spapr_drc to enable the transmission of spapr_drc state in migration.
> > Not all the elements in the DRC state are migrated - only those
> > that can be modified by guest actions or device add/remove
> > operations:
> > 
> > - 'isolation_state', 'allocation_state' and 'configured' are involved
> > in the DR state transition diagram from PAPR+ 2.7, 13.4;
> > 
> > - 'configured' and 'signalled' are needed in attaching and detaching
> > devices;
> > 
> > - 'indicator_state' provides users with hardware state information.
> > 
> > These are the DRC elements that are migrated.
> > 
> > In this patch the DRC state is migrated for PCI, LMB and CPU
> > connector types. At this moment there is no support to migrate
> > DRC for the PHB (PCI Host Bridge) type.
> > 
> > The instance_id is used to identify objects in migration. We set
> > instance_id of DRC using the unique index so that it is the same
> > across migration.
> > 
> > In hw/ppc/spapr_pci.c, a function called spapr_pci_set_detach_cb
> > was created to set the detach_cb of the migrated DRC in the
> > spapr_pci_post_load. The reason is that detach_cb is a DRC function
> > pointer that can't be migrated but we need it set in the target
> > so a ongoing hot-unplug event can properly finish.
> > 
> > Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
> > ---
> >  hw/ppc/spapr_drc.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/ppc/spapr_pci.c | 22 ++++++++++++++++++
> >  2 files changed, 89 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> > index a1cdc87..5c2baad 100644
> > --- a/hw/ppc/spapr_drc.c
> > +++ b/hw/ppc/spapr_drc.c
> > @@ -651,6 +651,70 @@ static void spapr_dr_connector_instance_init(Object *obj)
> >                          NULL, NULL, NULL, NULL);
> >  }
> > 
> > +static bool spapr_drc_needed(void *opaque)
> > +{
> > +    sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
> > +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > +    bool rc = false;
> > +    sPAPRDREntitySense value;
> > +    drck->entity_sense(drc, &value);
> > +    /* If no dev is plugged in there is no need to migrate the DRC state */
> > +    if (value != SPAPR_DR_ENTITY_SENSE_PRESENT) {
> > +        return false;
> > +    }
> > +
> > +    /*
> > +     * If there is dev plugged in, we need to migrate the DRC state when
> > +     * it is different from cold-plugged state
> > +     */
> > +    switch (drc->type) {
> > +
> > +    case SPAPR_DR_CONNECTOR_TYPE_PCI:
> > +        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
> > +               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
> > +               drc->configured && drc->signalled && !drc->awaiting_release);
> > +        break;
> > +
> > +    case SPAPR_DR_CONNECTOR_TYPE_LMB:
> > +        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
> > +               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
> > +               drc->configured && drc->signalled && !drc->awaiting_release);
> > +        break;
> > +
> > +    case SPAPR_DR_CONNECTOR_TYPE_CPU:
> > +        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
> > +               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
> > +                drc->configured && drc->signalled && !drc->awaiting_release);
> > +        break;
> > +
> > +    default:
> > +        ;
> > +    }
> > +    return rc;
> > +}
> > +
> > +/* return the unique drc index as instance_id for qom interfaces*/
> > +static int get_instance_id(DeviceState *dev)
> > +{
> > +    return (int)get_index(SPAPR_DR_CONNECTOR(OBJECT(dev)));
> > +}
> > +
> > +static const VMStateDescription vmstate_spapr_drc = {
> > +    .name = "spapr_drc",
> > +    .version_id = 1,
> > +    .minimum_version_id = 1,
> > +    .needed = spapr_drc_needed,
> > +    .fields  = (VMStateField []) {
> > +        VMSTATE_UINT32(isolation_state, sPAPRDRConnector),
> > +        VMSTATE_UINT32(allocation_state, sPAPRDRConnector),
> > +        VMSTATE_UINT32(indicator_state, sPAPRDRConnector),
> > +        VMSTATE_BOOL(configured, sPAPRDRConnector),
> > +        VMSTATE_BOOL(awaiting_release, sPAPRDRConnector),
> > +        VMSTATE_BOOL(signalled, sPAPRDRConnector),
> > +        VMSTATE_END_OF_LIST()
> > +    }
> > +};
> > +
> >  static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
> >  {
> >      DeviceClass *dk = DEVICE_CLASS(k);
> > @@ -659,6 +723,8 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
> >      dk->reset = reset;
> >      dk->realize = realize;
> >      dk->unrealize = unrealize;
> > +    dk->vmsd = &vmstate_spapr_drc;
> > +    dk->dev_get_instance_id = get_instance_id;
> >      drck->set_isolation_state = set_isolation_state;
> >      drck->set_indicator_state = set_indicator_state;
> >      drck->set_allocation_state = set_allocation_state;
> > @@ -672,6 +738,7 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
> >      drck->detach = detach;
> >      drck->release_pending = release_pending;
> >      drck->set_signalled = set_signalled;
> > +
> >      /*
> >       * Reason: it crashes FIXME find and document the real reason
> >       */
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 98c52e4..639dad2 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -1922,12 +1922,34 @@ static void spapr_pci_pre_save(void *opaque)
> >      }
> >  }
> > 
> > +/*
> > + * detach_cb in the DRC state is a function pointer that cannot be
> > + * migrated. We set it right after migration so that a migrated
> > + * hot-unplug event could finish its work.
> > + */
> > +static void spapr_pci_set_detach_cb(PCIBus *bus, PCIDevice *pdev,
> > +                                 void *opaque)
> > +{
> > +    sPAPRPHBState *sphb = opaque;
> > +    sPAPRDRConnector *drc = spapr_phb_get_pci_drc(sphb, pdev);
> > +    drc->detach_cb = spapr_phb_remove_pci_device_cb;
> > +}
> 
> This is doesn't quite work, because drc->detach_cb takes an opaque
> argument that is not being restored in here (and is not really
> possible to restore).

Yeah.  Plus the whole fact that we need to sort-of migrate this
non-migratable callback pointer in the first place probably indicates
our state isn't properly factored anyway.

I think we need to rework the DRC code, so that rather than
transiently setting the callback pointer, we have a fixed callback
pointer or hook in the DRC class or somewhere.  Then we just have a
flag indicating whether it is currently pending or not, which *is*
migratable.  Or possibly we can even deduce that flag from the
existing state values, I'm not sure.

> 
> However, the only DRC user who currently relies on the opaque is
> memory unplug, which passes an sPAPRDIMMState via spapr_del_lmbs:
> 
>   drck->detach(drc, dev, spapr_lmb_release, ds, errp);
> 
> It's actually possible to do away with this, you just need to add
> the sPAPRDIMMStates to a QTAILQ that's attached to sPAPRPHBState,
> and then migrate it when it is non-empty, similar to how ccs_list
> is migrated in the following patch. Then you would modify
> spapr_lmb_release to search that queue for the matching
> sPAPRDIMMState instead of relying on the opaque argument. You
> can get to the sPAPRPHBState via qdev_get_machine(),
> spapr_phb_realize() has an example.
> 
> spapr_phb_remove_pci_device_cb also currently takes an opaque to
> an sPAPRPHBState*, but it doesn't actually do anything with it,
> so you can drop it from there. and spapr_core_release never used
> an opaque.
> 
> > +
> >  static int spapr_pci_post_load(void *opaque, int version_id)
> >  {
> >      sPAPRPHBState *sphb = opaque;
> >      gpointer key, value;
> >      int i;
> > 
> > +    PCIBus *bus = PCI_HOST_BRIDGE(sphb)->bus;
> > +    unsigned int bus_no = 0;
> > +
> > +    /* Set detach_cb for the drc unconditionally after migration */
> > +    if (bus) {
> > +        pci_for_each_device(bus, pci_bus_num(bus), spapr_pci_set_detach_cb,
> > +                            &bus_no);
> > +    }
> 
> In a previous thread it was suggested that rather than restoring these
> after migration, we should modify spapr_dr_connector_new() to take the
> attach/detach callbacks as parameters so that they are set statically
> at init time. Then we can drop all the post-load hooks (memory/cpu
> would need similar post-load restorations as well, otherwise).

Yeah, that's absolutely the better way to do this.

> 
> > +
> >      for (i = 0; i < sphb->msi_devs_num; ++i) {
> >          key = g_memdup(&sphb->msi_devs[i].key,
> >                         sizeof(sphb->msi_devs[i].key));
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/4] migration: alternative way to set instance_id in SaveStateEntry
  2017-04-25 22:26   ` Michael Roth
@ 2017-04-26 10:05     ` Daniel Henrique Barboza
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-26 10:05 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: qemu-ppc, david



On 04/25/2017 07:26 PM, Michael Roth wrote:
> Quoting Daniel Henrique Barboza (2017-04-24 17:08:25)
>> From: Jianjun Duan <duanj@linux.vnet.ibm.com>
>>
>> In QOM (QEMU Object Model) migrated objects are identified with instance_id
>> which is calculated automatically using their path in the QOM composition
>> tree. For some objects, this path could change from source to target in
>> migration. To migrate such objects, we need to make sure the instance_id does
>> not change from source to target. We add a hook in DeviceClass to do customized
>> instance_id calculation in such cases.
> When I tried to pluck a subset of these patches for another series it
> was noticed that we don't actually need this patch anymore:
>
>    https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05475.html
>
> hw/ppc/spapr_iommu.c already implements an approach for registering DRCs
> that would work for our case as well since DRCs are bus-less and do not sit
> on a BusClass that implements bc->get_dev_path, so using
> vmstate_register(DEVICE(drc), drck->get_index(drc), ...) will work in
> the same way this patch expects it to.
Good to know. I'll see if I can get rid of this whole patch and use this
approach instead.

>
>> As a result, in these cases compat will not be set in the concerned
>> SaveStateEntry. This will prevent the inconsistent idstr to be sent over in
>> migration. We could have set alias_id in a similar way. But that will be
>> overloading the purpose of alias_id.
>>
>> The first application will be setting instance_id for pseries DRC objects using
>> its unique index. Doing this makes the instance_id of DRC to be consistent
>> across migration and supports flexible management of DRC objects in migration.
>>
>> Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
>> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
>> ---
>>   include/hw/qdev-core.h | 6 ++++++
>>   migration/savevm.c     | 6 ++++++
>>   2 files changed, 12 insertions(+)
>>
>> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
>> index 4bf86b0..9b3914c 100644
>> --- a/include/hw/qdev-core.h
>> +++ b/include/hw/qdev-core.h
>> @@ -127,6 +127,12 @@ typedef struct DeviceClass {
>>       qdev_initfn init; /* TODO remove, once users are converted to realize */
>>       qdev_event exit; /* TODO remove, once users are converted to unrealize */
>>       const char *bus_type;
>> +
>> +    /* When this field is set, qemu will use it to get an unique instance_id
>> +     * instead of calculating an auto idstr and instance_id for the relevant
>> +     * SaveStateEntry
>> +     */
>> +    int (*dev_get_instance_id)(DeviceState *dev);
>>   } DeviceClass;
>>
>>   typedef struct NamedGPIOList NamedGPIOList;
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 03ae1bd..5d8135f 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -606,6 +606,9 @@ int register_savevm_live(DeviceState *dev,
>>                            calculate_compat_instance_id(idstr) : instance_id;
>>               instance_id = -1;
>>           }
>> +        if (DEVICE_GET_CLASS(dev)->dev_get_instance_id) {
>> +            instance_id = DEVICE_GET_CLASS(dev)->dev_get_instance_id(dev);
>> +        }
>>       }
>>       pstrcat(se->idstr, sizeof(se->idstr), idstr);
>>
>> @@ -696,6 +699,9 @@ int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
>>                            calculate_compat_instance_id(vmsd->name) : instance_id;
>>               instance_id = -1;
>>           }
>> +        if (DEVICE_GET_CLASS(dev)->dev_get_instance_id) {
>> +            instance_id = DEVICE_GET_CLASS(dev)->dev_get_instance_id(dev);
>> +        }
>>       }
>>       pstrcat(se->idstr, sizeof(se->idstr), vmsd->name);
>>
>> -- 
>> 2.9.3
>>
>>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices
  2017-04-26  5:55     ` David Gibson
@ 2017-04-26 10:07       ` Daniel Henrique Barboza
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-26 10:07 UTC (permalink / raw)
  To: David Gibson, Michael Roth; +Cc: qemu-ppc, qemu-devel



On 04/26/2017 02:55 AM, David Gibson wrote:
> On Tue, Apr 25, 2017 at 05:45:11PM -0500, Michael Roth wrote:
>> Quoting Daniel Henrique Barboza (2017-04-24 17:08:26)
>>> In pseries, a firmware abstraction called Dynamic Reconfiguration
>>> Connector (DRC) is used to assign a particular dynamic resource
>>> to the guest and provide an interface to manage configuration/removal
>>> of the resource associated with it. In other words, DRC is the
>>> 'plugged state' of a device.
>>>
>>> Before this patch, DRC wasn't being migrated. This causes
>>> post-migration problems due to DRC state mismatch between source and
>>> target. The DRC state of a device X in the source might
>>> change, while in the target the DRC state of X is still fresh. When
>>> migrating the guest, X will not have the same hotplugged state as it
>>> did in the source. This means that we can't hot unplug X in the
>>> target after migration is completed because its DRC state is not consistent.
>>> https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
>>> bug that is caused by this DRC state mismatch between source and
>>> target.
>>>
>>> To migrate the DRC state, we defined the VMStateDescription struct for
>>> spapr_drc to enable the transmission of spapr_drc state in migration.
>>> Not all the elements in the DRC state are migrated - only those
>>> that can be modified by guest actions or device add/remove
>>> operations:
>>>
>>> - 'isolation_state', 'allocation_state' and 'configured' are involved
>>> in the DR state transition diagram from PAPR+ 2.7, 13.4;
>>>
>>> - 'configured' and 'signalled' are needed in attaching and detaching
>>> devices;
>>>
>>> - 'indicator_state' provides users with hardware state information.
>>>
>>> These are the DRC elements that are migrated.
>>>
>>> In this patch the DRC state is migrated for PCI, LMB and CPU
>>> connector types. At this moment there is no support to migrate
>>> DRC for the PHB (PCI Host Bridge) type.
>>>
>>> The instance_id is used to identify objects in migration. We set
>>> instance_id of DRC using the unique index so that it is the same
>>> across migration.
>>>
>>> In hw/ppc/spapr_pci.c, a function called spapr_pci_set_detach_cb
>>> was created to set the detach_cb of the migrated DRC in the
>>> spapr_pci_post_load. The reason is that detach_cb is a DRC function
>>> pointer that can't be migrated but we need it set in the target
>>> so a ongoing hot-unplug event can properly finish.
>>>
>>> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
>>> ---
>>>   hw/ppc/spapr_drc.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>   hw/ppc/spapr_pci.c | 22 ++++++++++++++++++
>>>   2 files changed, 89 insertions(+)
>>>
>>> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
>>> index a1cdc87..5c2baad 100644
>>> --- a/hw/ppc/spapr_drc.c
>>> +++ b/hw/ppc/spapr_drc.c
>>> @@ -651,6 +651,70 @@ static void spapr_dr_connector_instance_init(Object *obj)
>>>                           NULL, NULL, NULL, NULL);
>>>   }
>>>
>>> +static bool spapr_drc_needed(void *opaque)
>>> +{
>>> +    sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
>>> +    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>>> +    bool rc = false;
>>> +    sPAPRDREntitySense value;
>>> +    drck->entity_sense(drc, &value);
>>> +    /* If no dev is plugged in there is no need to migrate the DRC state */
>>> +    if (value != SPAPR_DR_ENTITY_SENSE_PRESENT) {
>>> +        return false;
>>> +    }
>>> +
>>> +    /*
>>> +     * If there is dev plugged in, we need to migrate the DRC state when
>>> +     * it is different from cold-plugged state
>>> +     */
>>> +    switch (drc->type) {
>>> +
>>> +    case SPAPR_DR_CONNECTOR_TYPE_PCI:
>>> +        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
>>> +               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
>>> +               drc->configured && drc->signalled && !drc->awaiting_release);
>>> +        break;
>>> +
>>> +    case SPAPR_DR_CONNECTOR_TYPE_LMB:
>>> +        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
>>> +               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
>>> +               drc->configured && drc->signalled && !drc->awaiting_release);
>>> +        break;
>>> +
>>> +    case SPAPR_DR_CONNECTOR_TYPE_CPU:
>>> +        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
>>> +               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
>>> +                drc->configured && drc->signalled && !drc->awaiting_release);
>>> +        break;
>>> +
>>> +    default:
>>> +        ;
>>> +    }
>>> +    return rc;
>>> +}
>>> +
>>> +/* return the unique drc index as instance_id for qom interfaces*/
>>> +static int get_instance_id(DeviceState *dev)
>>> +{
>>> +    return (int)get_index(SPAPR_DR_CONNECTOR(OBJECT(dev)));
>>> +}
>>> +
>>> +static const VMStateDescription vmstate_spapr_drc = {
>>> +    .name = "spapr_drc",
>>> +    .version_id = 1,
>>> +    .minimum_version_id = 1,
>>> +    .needed = spapr_drc_needed,
>>> +    .fields  = (VMStateField []) {
>>> +        VMSTATE_UINT32(isolation_state, sPAPRDRConnector),
>>> +        VMSTATE_UINT32(allocation_state, sPAPRDRConnector),
>>> +        VMSTATE_UINT32(indicator_state, sPAPRDRConnector),
>>> +        VMSTATE_BOOL(configured, sPAPRDRConnector),
>>> +        VMSTATE_BOOL(awaiting_release, sPAPRDRConnector),
>>> +        VMSTATE_BOOL(signalled, sPAPRDRConnector),
>>> +        VMSTATE_END_OF_LIST()
>>> +    }
>>> +};
>>> +
>>>   static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
>>>   {
>>>       DeviceClass *dk = DEVICE_CLASS(k);
>>> @@ -659,6 +723,8 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
>>>       dk->reset = reset;
>>>       dk->realize = realize;
>>>       dk->unrealize = unrealize;
>>> +    dk->vmsd = &vmstate_spapr_drc;
>>> +    dk->dev_get_instance_id = get_instance_id;
>>>       drck->set_isolation_state = set_isolation_state;
>>>       drck->set_indicator_state = set_indicator_state;
>>>       drck->set_allocation_state = set_allocation_state;
>>> @@ -672,6 +738,7 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
>>>       drck->detach = detach;
>>>       drck->release_pending = release_pending;
>>>       drck->set_signalled = set_signalled;
>>> +
>>>       /*
>>>        * Reason: it crashes FIXME find and document the real reason
>>>        */
>>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>>> index 98c52e4..639dad2 100644
>>> --- a/hw/ppc/spapr_pci.c
>>> +++ b/hw/ppc/spapr_pci.c
>>> @@ -1922,12 +1922,34 @@ static void spapr_pci_pre_save(void *opaque)
>>>       }
>>>   }
>>>
>>> +/*
>>> + * detach_cb in the DRC state is a function pointer that cannot be
>>> + * migrated. We set it right after migration so that a migrated
>>> + * hot-unplug event could finish its work.
>>> + */
>>> +static void spapr_pci_set_detach_cb(PCIBus *bus, PCIDevice *pdev,
>>> +                                 void *opaque)
>>> +{
>>> +    sPAPRPHBState *sphb = opaque;
>>> +    sPAPRDRConnector *drc = spapr_phb_get_pci_drc(sphb, pdev);
>>> +    drc->detach_cb = spapr_phb_remove_pci_device_cb;
>>> +}
>> This is doesn't quite work, because drc->detach_cb takes an opaque
>> argument that is not being restored in here (and is not really
>> possible to restore).
> Yeah.  Plus the whole fact that we need to sort-of migrate this
> non-migratable callback pointer in the first place probably indicates
> our state isn't properly factored anyway.
>
> I think we need to rework the DRC code, so that rather than
> transiently setting the callback pointer, we have a fixed callback
> pointer or hook in the DRC class or somewhere.  Then we just have a
> flag indicating whether it is currently pending or not, which *is*
> migratable.  Or possibly we can even deduce that flag from the
> existing state values, I'm not sure.
>
>> However, the only DRC user who currently relies on the opaque is
>> memory unplug, which passes an sPAPRDIMMState via spapr_del_lmbs:
>>
>>    drck->detach(drc, dev, spapr_lmb_release, ds, errp);
>>
>> It's actually possible to do away with this, you just need to add
>> the sPAPRDIMMStates to a QTAILQ that's attached to sPAPRPHBState,
>> and then migrate it when it is non-empty, similar to how ccs_list
>> is migrated in the following patch. Then you would modify
>> spapr_lmb_release to search that queue for the matching
>> sPAPRDIMMState instead of relying on the opaque argument. You
>> can get to the sPAPRPHBState via qdev_get_machine(),
>> spapr_phb_realize() has an example.
>>
>> spapr_phb_remove_pci_device_cb also currently takes an opaque to
>> an sPAPRPHBState*, but it doesn't actually do anything with it,
>> so you can drop it from there. and spapr_core_release never used
>> an opaque.
>>
>>> +
>>>   static int spapr_pci_post_load(void *opaque, int version_id)
>>>   {
>>>       sPAPRPHBState *sphb = opaque;
>>>       gpointer key, value;
>>>       int i;
>>>
>>> +    PCIBus *bus = PCI_HOST_BRIDGE(sphb)->bus;
>>> +    unsigned int bus_no = 0;
>>> +
>>> +    /* Set detach_cb for the drc unconditionally after migration */
>>> +    if (bus) {
>>> +        pci_for_each_device(bus, pci_bus_num(bus), spapr_pci_set_detach_cb,
>>> +                            &bus_no);
>>> +    }
>> In a previous thread it was suggested that rather than restoring these
>> after migration, we should modify spapr_dr_connector_new() to take the
>> attach/detach callbacks as parameters so that they are set statically
>> at init time. Then we can drop all the post-load hooks (memory/cpu
>> would need similar post-load restorations as well, otherwise).
> Yeah, that's absolutely the better way to do this.
Agree. I'll remove all the code that migrates callbacks and see if setting
them statically at spapr_dr_connector_init() works.
>>> +
>>>       for (i = 0; i < sphb->msi_devs_num; ++i) {
>>>           key = g_memdup(&sphb->msi_devs[i].key,
>>>                          sizeof(sphb->msi_devs[i].key));

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 4/4] migration: spapr: migrate pending_events of spapr state
  2017-04-25 22:53   ` Michael Roth
@ 2017-04-26 10:10     ` Daniel Henrique Barboza
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-26 10:10 UTC (permalink / raw)
  To: Michael Roth, qemu-devel; +Cc: qemu-ppc, david



On 04/25/2017 07:53 PM, Michael Roth wrote:
> Quoting Daniel Henrique Barboza (2017-04-24 17:08:28)
>> From: Jianjun Duan <duanj@linux.vnet.ibm.com>
>>
>> In racing situations between hotplug events and migration operation,
>> a rtas hotplug event could have not yet be delivered to the source
>> guest when migration is started. In this case the pending_events of
>> spapr state need be transmitted to the target so that the hotplug
>> event can be finished on the target.
>>
>> All the different fields of the events are encoded as defined by
>> PAPR. We can migrate them as uint8_t binary stream without any
>> concerns about data padding or endianess.
>>
>> pending_events is put in a subsection in the spapr state VMSD to make
>> sure migration across different versions is not broken.
>>
>> Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
>> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
>> ---
>>   hw/ppc/spapr.c         | 33 +++++++++++++++++++++++++++++++++
>>   hw/ppc/spapr_events.c  | 24 +++++++++++++-----------
>>   include/hw/ppc/spapr.h |  3 ++-
>>   3 files changed, 48 insertions(+), 12 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 22f351c..a3e939b 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1419,6 +1419,38 @@ static const VMStateDescription vmstate_spapr_ccs_list = {
>>       },
>>   };
>>
>> +static bool spapr_pending_events_needed(void *opaque)
>> +{
>> +    sPAPRMachineState *spapr = (sPAPRMachineState *)opaque;
>> +    return !QTAILQ_EMPTY(&spapr->pending_events);
>> +}
>> +
>> +static const VMStateDescription vmstate_spapr_event_entry = {
>> +    .name = "spapreventlogentry",
> Similar suggestion on the naming here and elsewhere.

I'll review all of them in the code to make them more readable.

Thanks!

>
>> +    .version_id = 1,
>> +    .minimum_version_id = 1,
>> +    .fields = (VMStateField[]) {
>> +        VMSTATE_INT32(log_type, sPAPREventLogEntry),
>> +        VMSTATE_BOOL(exception, sPAPREventLogEntry),
>> +        VMSTATE_UINT32(data_size, sPAPREventLogEntry),
>> +        VMSTATE_VARRAY_UINT32_ALLOC(data, sPAPREventLogEntry, data_size,
>> +                                    0, vmstate_info_uint8, uint8_t),
>> +        VMSTATE_END_OF_LIST()
>> +    },
>> +};
>> +
>> +static const VMStateDescription vmstate_spapr_pending_events = {
>> +    .name = "spaprpendingevents",
>> +    .version_id = 1,
>> +    .minimum_version_id = 1,
>> +    .needed = spapr_pending_events_needed,
>> +    .fields = (VMStateField[]) {
>> +        VMSTATE_QTAILQ_V(pending_events, sPAPRMachineState, 1,
>> +                         vmstate_spapr_event_entry, sPAPREventLogEntry, next),
>> +        VMSTATE_END_OF_LIST()
>> +    },
>> +};
>> +
>>   static bool spapr_ov5_cas_needed(void *opaque)
>>   {
>>       sPAPRMachineState *spapr = opaque;
>> @@ -1518,6 +1550,7 @@ static const VMStateDescription vmstate_spapr = {
>>           &vmstate_spapr_ov5_cas,
>>           &vmstate_spapr_patb_entry,
>>           &vmstate_spapr_ccs_list,
>> +        &vmstate_spapr_pending_events,
>>           NULL
>>       }
>>   };
>> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
>> index 24a5758..399dd49 100644
>> --- a/hw/ppc/spapr_events.c
>> +++ b/hw/ppc/spapr_events.c
>> @@ -342,7 +342,8 @@ static int rtas_event_log_to_irq(sPAPRMachineState *spapr, int log_type)
>>       return source->irq;
>>   }
>>
>> -static void rtas_event_log_queue(int log_type, void *data, bool exception)
>> +static void rtas_event_log_queue(int log_type, void *data, bool exception,
>> +                                 int data_size)
>>   {
>>       sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>>       sPAPREventLogEntry *entry = g_new(sPAPREventLogEntry, 1);
>> @@ -351,6 +352,7 @@ static void rtas_event_log_queue(int log_type, void *data, bool exception)
>>       entry->log_type = log_type;
>>       entry->exception = exception;
>>       entry->data = data;
>> +    entry->data_size = data_size;
>>       QTAILQ_INSERT_TAIL(&spapr->pending_events, entry, next);
>>   }
>>
>> @@ -445,6 +447,7 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
>>       struct rtas_event_log_v6_mainb *mainb;
>>       struct rtas_event_log_v6_epow *epow;
>>       struct epow_log_full *new_epow;
>> +    uint32_t data_size;
>>
>>       new_epow = g_malloc0(sizeof(*new_epow));
>>       hdr = &new_epow->hdr;
>> @@ -453,14 +456,13 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
>>       mainb = &new_epow->mainb;
>>       epow = &new_epow->epow;
>>
>> +    data_size = sizeof(*new_epow);
>>       hdr->summary = cpu_to_be32(RTAS_LOG_VERSION_6
>>                                  | RTAS_LOG_SEVERITY_EVENT
>>                                  | RTAS_LOG_DISPOSITION_NOT_RECOVERED
>>                                  | RTAS_LOG_OPTIONAL_PART_PRESENT
>>                                  | RTAS_LOG_TYPE_EPOW);
>> -    hdr->extended_length = cpu_to_be32(sizeof(*new_epow)
>> -                                       - sizeof(new_epow->hdr));
>> -
>> +    hdr->extended_length = cpu_to_be32(data_size - sizeof(new_epow->hdr));
>>       spapr_init_v6hdr(v6hdr);
>>       spapr_init_maina(maina, 3 /* Main-A, Main-B and EPOW */);
>>
>> @@ -479,7 +481,7 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
>>       epow->event_modifier = RTAS_LOG_V6_EPOW_MODIFIER_NORMAL;
>>       epow->extended_modifier = RTAS_LOG_V6_EPOW_XMODIFIER_PARTITION_SPECIFIC;
>>
>> -    rtas_event_log_queue(RTAS_LOG_TYPE_EPOW, new_epow, true);
>> +    rtas_event_log_queue(RTAS_LOG_TYPE_EPOW, new_epow, true, data_size);
>>
>>       qemu_irq_pulse(xics_get_qirq(XICS_FABRIC(spapr),
>>                                    rtas_event_log_to_irq(spapr,
>> @@ -504,6 +506,7 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
>>       struct rtas_event_log_v6_maina *maina;
>>       struct rtas_event_log_v6_mainb *mainb;
>>       struct rtas_event_log_v6_hp *hp;
>> +    uint32_t data_size;
>>
>>       new_hp = g_malloc0(sizeof(struct hp_log_full));
>>       hdr = &new_hp->hdr;
>> @@ -512,14 +515,14 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
>>       mainb = &new_hp->mainb;
>>       hp = &new_hp->hp;
>>
>> +    data_size = sizeof(*new_hp);
>>       hdr->summary = cpu_to_be32(RTAS_LOG_VERSION_6
>>                                  | RTAS_LOG_SEVERITY_EVENT
>>                                  | RTAS_LOG_DISPOSITION_NOT_RECOVERED
>>                                  | RTAS_LOG_OPTIONAL_PART_PRESENT
>>                                  | RTAS_LOG_INITIATOR_HOTPLUG
>>                                  | RTAS_LOG_TYPE_HOTPLUG);
>> -    hdr->extended_length = cpu_to_be32(sizeof(*new_hp)
>> -                                       - sizeof(new_hp->hdr));
>> +    hdr->extended_length = cpu_to_be32(data_size - sizeof(new_hp->hdr));
>>
>>       spapr_init_v6hdr(v6hdr);
>>       spapr_init_maina(maina, 3 /* Main-A, Main-B, HP */);
>> @@ -572,7 +575,7 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
>>               cpu_to_be32(drc_id->count_indexed.index);
>>       }
>>
>> -    rtas_event_log_queue(RTAS_LOG_TYPE_HOTPLUG, new_hp, true);
>> +    rtas_event_log_queue(RTAS_LOG_TYPE_HOTPLUG, new_hp, true, data_size);
>>
>>       qemu_irq_pulse(xics_get_qirq(XICS_FABRIC(spapr),
>>                                    rtas_event_log_to_irq(spapr,
>> @@ -671,8 +674,7 @@ static void check_exception(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>>       if (!event) {
>>           goto out_no_events;
>>       }
>> -
>> -    hdr = event->data;
>> +    hdr = (struct rtas_error_log *)event->data;
>>       event_len = be32_to_cpu(hdr->extended_length) + sizeof(*hdr);
>>
>>       if (event_len < len) {
>> @@ -728,7 +730,7 @@ static void event_scan(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>>           goto out_no_events;
>>       }
>>
>> -    hdr = event->data;
>> +    hdr = (struct rtas_error_log *)event->data;
>>       event_len = be32_to_cpu(hdr->extended_length) + sizeof(*hdr);
>>
>>       if (event_len < len) {
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index 808aac8..c60dbd2 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -584,7 +584,8 @@ sPAPRTCETable *spapr_tce_find_by_liobn(target_ulong liobn);
>>   struct sPAPREventLogEntry {
>>       int log_type;
>>       bool exception;
>> -    void *data;
>> +    uint8_t *data;
>> +    uint32_t data_size;
>>       QTAILQ_ENTRY(sPAPREventLogEntry) next;
>>   };
>>
>> -- 
>> 2.9.3
>>
>>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices
  2017-05-03 16:09   ` Dr. David Alan Gilbert
@ 2017-05-03 18:05     ` Daniel Henrique Barboza
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Henrique Barboza @ 2017-05-03 18:05 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel



On 05/03/2017 01:09 PM, Dr. David Alan Gilbert wrote:
> * Daniel Henrique Barboza (danielhb@linux.vnet.ibm.com) wrote:
>
> <snip>
>
>>   static void realize(DeviceState *d, Error **errp)
>>   {
>>       sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(d);
>> @@ -540,6 +598,8 @@ static void realize(DeviceState *d, Error **errp)
>>           object_unref(OBJECT(drc));
>>       }
>>       g_free(child_name);
>> +    vmstate_register(DEVICE(drc), drck->get_index(drc), &vmstate_spapr_drc,
>> +                     drc);
>>       trace_spapr_drc_realize_complete(drck->get_index(drc));
>>   }
>>   
>> @@ -658,6 +718,7 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
>>       dk->reset = reset;
>>       dk->realize = realize;
>>       dk->unrealize = unrealize;
>> +    dk->vmsd = &vmstate_spapr_drc;
> Are you sure this is right - isn't it unusual to have both
> a ->vmsd entry AND a vmstate_register?

I've changed the code to use vmstate_register but forgot to remove the
->vmsd entry that was being used in v6.

Thanks for pointing it out Dave. I'll fix it in v9.


Daniel

>
> a ->vmsd =   is the preferable way I think, but I see you're
> doing something with the 2nd parameter of vmstate_register;
> if you *need* to do that then I think it's the only way.
>
> Dave
>
>>       drck->set_isolation_state = set_isolation_state;
>>       drck->set_indicator_state = set_indicator_state;
>>       drck->set_allocation_state = set_allocation_state;
>> -- 
>> 2.9.3
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices
  2017-04-26 21:23 ` [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices Daniel Henrique Barboza
@ 2017-05-03 16:09   ` Dr. David Alan Gilbert
  2017-05-03 18:05     ` Daniel Henrique Barboza
  0 siblings, 1 reply; 17+ messages in thread
From: Dr. David Alan Gilbert @ 2017-05-03 16:09 UTC (permalink / raw)
  To: Daniel Henrique Barboza; +Cc: qemu-devel

* Daniel Henrique Barboza (danielhb@linux.vnet.ibm.com) wrote:

<snip>

>  static void realize(DeviceState *d, Error **errp)
>  {
>      sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(d);
> @@ -540,6 +598,8 @@ static void realize(DeviceState *d, Error **errp)
>          object_unref(OBJECT(drc));
>      }
>      g_free(child_name);
> +    vmstate_register(DEVICE(drc), drck->get_index(drc), &vmstate_spapr_drc,
> +                     drc);
>      trace_spapr_drc_realize_complete(drck->get_index(drc));
>  }
>  
> @@ -658,6 +718,7 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
>      dk->reset = reset;
>      dk->realize = realize;
>      dk->unrealize = unrealize;
> +    dk->vmsd = &vmstate_spapr_drc;

Are you sure this is right - isn't it unusual to have both
a ->vmsd entry AND a vmstate_register?

a ->vmsd =   is the preferable way I think, but I see you're
doing something with the 2nd parameter of vmstate_register;
if you *need* to do that then I think it's the only way.

Dave

>      drck->set_isolation_state = set_isolation_state;
>      drck->set_indicator_state = set_indicator_state;
>      drck->set_allocation_state = set_allocation_state;
> -- 
> 2.9.3
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices
  2017-04-26 21:31 [Qemu-devel] [PATCH 0/4 v7] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
@ 2017-04-26 21:31 ` Daniel Henrique Barboza
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-26 21:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

In pseries, a firmware abstraction called Dynamic Reconfiguration
Connector (DRC) is used to assign a particular dynamic resource
to the guest and provide an interface to manage configuration/removal
of the resource associated with it. In other words, DRC is the
'plugged state' of a device.

Before this patch, DRC wasn't being migrated. This causes
post-migration problems due to DRC state mismatch between source and
target. The DRC state of a device X in the source might
change, while in the target the DRC state of X is still fresh. When
migrating the guest, X will not have the same hotplugged state as it
did in the source. This means that we can't hot unplug X in the
target after migration is completed because its DRC state is not consistent.
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
bug that is caused by this DRC state mismatch between source and
target.

To migrate the DRC state, we defined the VMStateDescription struct for
spapr_drc to enable the transmission of spapr_drc state in migration.
Not all the elements in the DRC state are migrated - only those
that can be modified by guest actions or device add/remove
operations:

- 'isolation_state', 'allocation_state' and 'configured' are involved
in the DR state transition diagram from PAPR+ 2.7, 13.4;

- 'configured' and 'signalled' are needed in attaching and detaching
devices;

- 'indicator_state' provides users with hardware state information.

These are the DRC elements that are migrated.

In this patch the DRC state is migrated for PCI, LMB and CPU
connector types. At this moment there is no support to migrate
DRC for the PHB (PCI Host Bridge) type.

In the 'realize' function the DRC is registered using vmstate_register,
similar to what hw/ppc/spapr_iommu.c does in 'spapr_tce_table_realize'.
This approach works because  DRCs are bus-less and do not sit
on a BusClass that implements bc->get_dev_path, so as a fallback the
VMSD gets identified via "spapr_drc"/get_index(drc).

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr_drc.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index afe5d82..13d18f5 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -512,6 +512,64 @@ static void reset(DeviceState *d)
     }
 }
 
+static bool spapr_drc_needed(void *opaque)
+{
+    sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+    bool rc = false;
+    sPAPRDREntitySense value;
+    drck->entity_sense(drc, &value);
+    /* If no dev is plugged in there is no need to migrate the DRC state */
+    if (value != SPAPR_DR_ENTITY_SENSE_PRESENT) {
+        return false;
+    }
+
+    /*
+     * If there is dev plugged in, we need to migrate the DRC state when
+     * it is different from cold-plugged state
+     */
+    switch (drc->type) {
+
+    case SPAPR_DR_CONNECTOR_TYPE_PCI:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
+               drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    case SPAPR_DR_CONNECTOR_TYPE_LMB:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
+               drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    case SPAPR_DR_CONNECTOR_TYPE_CPU:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
+                drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    default:
+        ;
+    }
+    return rc;
+}
+
+static const VMStateDescription vmstate_spapr_drc = {
+    .name = "spapr_drc",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_drc_needed,
+    .fields  = (VMStateField []) {
+        VMSTATE_UINT32(isolation_state, sPAPRDRConnector),
+        VMSTATE_UINT32(allocation_state, sPAPRDRConnector),
+        VMSTATE_UINT32(indicator_state, sPAPRDRConnector),
+        VMSTATE_BOOL(configured, sPAPRDRConnector),
+        VMSTATE_BOOL(awaiting_release, sPAPRDRConnector),
+        VMSTATE_BOOL(signalled, sPAPRDRConnector),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static void realize(DeviceState *d, Error **errp)
 {
     sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(d);
@@ -540,6 +598,8 @@ static void realize(DeviceState *d, Error **errp)
         object_unref(OBJECT(drc));
     }
     g_free(child_name);
+    vmstate_register(DEVICE(drc), drck->get_index(drc), &vmstate_spapr_drc,
+                     drc);
     trace_spapr_drc_realize_complete(drck->get_index(drc));
 }
 
@@ -658,6 +718,7 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
     dk->reset = reset;
     dk->realize = realize;
     dk->unrealize = unrealize;
+    dk->vmsd = &vmstate_spapr_drc;
     drck->set_isolation_state = set_isolation_state;
     drck->set_indicator_state = set_indicator_state;
     drck->set_allocation_state = set_allocation_state;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices
  2017-04-26 21:22 [Qemu-devel] [PATCH 0/4 v7] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
@ 2017-04-26 21:23 ` Daniel Henrique Barboza
  2017-05-03 16:09   ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 17+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-26 21:23 UTC (permalink / raw)
  To: qemu-devel

In pseries, a firmware abstraction called Dynamic Reconfiguration
Connector (DRC) is used to assign a particular dynamic resource
to the guest and provide an interface to manage configuration/removal
of the resource associated with it. In other words, DRC is the
'plugged state' of a device.

Before this patch, DRC wasn't being migrated. This causes
post-migration problems due to DRC state mismatch between source and
target. The DRC state of a device X in the source might
change, while in the target the DRC state of X is still fresh. When
migrating the guest, X will not have the same hotplugged state as it
did in the source. This means that we can't hot unplug X in the
target after migration is completed because its DRC state is not consistent.
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
bug that is caused by this DRC state mismatch between source and
target.

To migrate the DRC state, we defined the VMStateDescription struct for
spapr_drc to enable the transmission of spapr_drc state in migration.
Not all the elements in the DRC state are migrated - only those
that can be modified by guest actions or device add/remove
operations:

- 'isolation_state', 'allocation_state' and 'configured' are involved
in the DR state transition diagram from PAPR+ 2.7, 13.4;

- 'configured' and 'signalled' are needed in attaching and detaching
devices;

- 'indicator_state' provides users with hardware state information.

These are the DRC elements that are migrated.

In this patch the DRC state is migrated for PCI, LMB and CPU
connector types. At this moment there is no support to migrate
DRC for the PHB (PCI Host Bridge) type.

In the 'realize' function the DRC is registered using vmstate_register,
similar to what hw/ppc/spapr_iommu.c does in 'spapr_tce_table_realize'.
This approach works because  DRCs are bus-less and do not sit
on a BusClass that implements bc->get_dev_path, so as a fallback the
VMSD gets identified via "spapr_drc"/get_index(drc).

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr_drc.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index afe5d82..13d18f5 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -512,6 +512,64 @@ static void reset(DeviceState *d)
     }
 }
 
+static bool spapr_drc_needed(void *opaque)
+{
+    sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+    bool rc = false;
+    sPAPRDREntitySense value;
+    drck->entity_sense(drc, &value);
+    /* If no dev is plugged in there is no need to migrate the DRC state */
+    if (value != SPAPR_DR_ENTITY_SENSE_PRESENT) {
+        return false;
+    }
+
+    /*
+     * If there is dev plugged in, we need to migrate the DRC state when
+     * it is different from cold-plugged state
+     */
+    switch (drc->type) {
+
+    case SPAPR_DR_CONNECTOR_TYPE_PCI:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
+               drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    case SPAPR_DR_CONNECTOR_TYPE_LMB:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
+               drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    case SPAPR_DR_CONNECTOR_TYPE_CPU:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
+                drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    default:
+        ;
+    }
+    return rc;
+}
+
+static const VMStateDescription vmstate_spapr_drc = {
+    .name = "spapr_drc",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_drc_needed,
+    .fields  = (VMStateField []) {
+        VMSTATE_UINT32(isolation_state, sPAPRDRConnector),
+        VMSTATE_UINT32(allocation_state, sPAPRDRConnector),
+        VMSTATE_UINT32(indicator_state, sPAPRDRConnector),
+        VMSTATE_BOOL(configured, sPAPRDRConnector),
+        VMSTATE_BOOL(awaiting_release, sPAPRDRConnector),
+        VMSTATE_BOOL(signalled, sPAPRDRConnector),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static void realize(DeviceState *d, Error **errp)
 {
     sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(d);
@@ -540,6 +598,8 @@ static void realize(DeviceState *d, Error **errp)
         object_unref(OBJECT(drc));
     }
     g_free(child_name);
+    vmstate_register(DEVICE(drc), drck->get_index(drc), &vmstate_spapr_drc,
+                     drc);
     trace_spapr_drc_realize_complete(drck->get_index(drc));
 }
 
@@ -658,6 +718,7 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
     dk->reset = reset;
     dk->realize = realize;
     dk->unrealize = unrealize;
+    dk->vmsd = &vmstate_spapr_drc;
     drck->set_isolation_state = set_isolation_state;
     drck->set_indicator_state = set_indicator_state;
     drck->set_allocation_state = set_allocation_state;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-05-03 18:05 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-24 22:08 [Qemu-devel] [PATCH 0/4 v6] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
2017-04-24 22:08 ` [Qemu-devel] [PATCH 1/4] migration: alternative way to set instance_id in SaveStateEntry Daniel Henrique Barboza
2017-04-25 22:26   ` Michael Roth
2017-04-26 10:05     ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
2017-04-24 22:08 ` [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices Daniel Henrique Barboza
2017-04-25 22:45   ` Michael Roth
2017-04-26  5:55     ` David Gibson
2017-04-26 10:07       ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
2017-04-24 22:08 ` [Qemu-devel] [PATCH 3/4] migration: spapr: migrate ccs_list in spapr state Daniel Henrique Barboza
2017-04-25 22:47   ` Michael Roth
2017-04-24 22:08 ` [Qemu-devel] [PATCH 4/4] migration: spapr: migrate pending_events of " Daniel Henrique Barboza
2017-04-25 22:53   ` Michael Roth
2017-04-26 10:10     ` Daniel Henrique Barboza
2017-04-26 21:22 [Qemu-devel] [PATCH 0/4 v7] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
2017-04-26 21:23 ` [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices Daniel Henrique Barboza
2017-05-03 16:09   ` Dr. David Alan Gilbert
2017-05-03 18:05     ` Daniel Henrique Barboza
2017-04-26 21:31 [Qemu-devel] [PATCH 0/4 v7] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
2017-04-26 21:31 ` [Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices Daniel Henrique Barboza

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.