All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/5 v8] migration/ppc: migrating DRC, ccs_list and pending_events
@ 2017-04-30 17:25 Daniel Henrique Barboza
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 1/5] hw/ppc: setting spapr_drc_detach_cb in spapr_dr_connector_new Daniel Henrique Barboza
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-30 17:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

v8:
- new patch added: 'removing spapr_drc_detach_cb opaques'. This new patch removes
the need for the detach_cb_opaques inside the removal callback functions. See
the commit message of the patch for more info.

v7:
- removed the first patch. DRC registration is now done by vmstate_register
in patch 2.
- added a new patch that changes spapr_dr_connector_new to receive as argument
the detach_cb.
- removed the callback logic of patch 2 since there is no need to restore the
detach_cb on post-load due to the detach_cb on spapr_dr_connector_new change.
- added word separators in the VMSD names of patch 3 and 4.

v6: - Rebased with QEMU master after 6+ months.
    - Simplified the logic in patch 1.
    - Reworked patch 2: added CPU DRC migration, removed a function pointer from DRC
class and minor improvements.
    - Added clarifications from the previous v5 discussions in the commit messages.

v5: - Rebased to David's ppc-for-2.8.

v4: - Introduce a way to set customized instance_id in SaveStateEntry. Use it
      to set instance_id for DRC using its unique index to address David 
      Gibson's concern.
    - Rename VMS_CSTM to VMS_LINKED based on Paolo Bonzini's suggestions.
    - Clean up qjson stuff in put_qtailq. 
    - Add trace for put_qtailq and get_qtailq based on David Gilbert's 
      suggestion.

    - Based on David's ppc-for-2.7. 

v3: - Simplify overall design followng discussion with Paolo. No longer need
      metadata to migrate QTAILQ.
    - Extend VMStateInfo instead of adding similar fields to VMStateField.
    - Clean up macros in qemu/queue.h.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg05695.html)

v2: - Introduce a general approach to migrate QTAILQ in qemu/queue.h.
    - Migrate signalled field in the DRC state.
    - Put the newly added migrating fields in subsections so that backward 
      migration is not broken.  
    - Set detach_cb field right after migration so that a migrated hot-unplug
      event could finish its course.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg04188.html)

v1: - Inital version.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-04/msg02601.html)


To make guest devices (PCI, CPU and memory) hotplug work together 
with guest migration, spapr drc state needs be transmitted in
migration. This patch defines the VMStateDescription struct for
spapr drc state to enable it.

To fix the potential racing between hotplug events on guest and 
guest migration, ccs_list and pending_events of spapr state need be 
transmitted in migration. This patch set also takes care of it.


Daniel Henrique Barboza (3):
  hw/ppc: setting spapr_drc_detach_cb in spapr_dr_connector_new
  hw/ppc: removing spapr_drc_detach_cb opaques
  hw/ppc: migrating the DRC state of hotplugged devices

Jianjun Duan (2):
  migration: spapr: migrate ccs_list in spapr state
  migration: spapr: migrate pending_events of spapr state

 hw/ppc/spapr.c             | 160 ++++++++++++++++++++++++++++++++++-----------
 hw/ppc/spapr_drc.c         |  84 +++++++++++++++++++-----
 hw/ppc/spapr_events.c      |  24 +++----
 hw/ppc/spapr_pci.c         |   7 +-
 include/hw/ppc/spapr.h     |   3 +-
 include/hw/ppc/spapr_drc.h |  10 ++-
 6 files changed, 215 insertions(+), 73 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 1/5] hw/ppc: setting spapr_drc_detach_cb in spapr_dr_connector_new
  2017-04-30 17:25 [Qemu-devel] [PATCH 0/5 v8] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
@ 2017-04-30 17:25 ` Daniel Henrique Barboza
  2017-05-03 14:01   ` Laurent Vivier
  2017-05-04  5:33   ` David Gibson
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques Daniel Henrique Barboza
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-30 17:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

The idea of moving the detach callback functions to the constructor
of the dr_connector is to set them statically at init time, avoiding
any post-load hooks to restore it (after a migration, for example).

Summary of changes:

- hw/ppc/spapr_drc.c and include/hw/ppc/spapr_drc.h:
    *  spapr_dr_connector_new() now has an additional parameter,
spapr_drc_detach_cb *detach_cb
    *  'spapr_drc_detach_cb *detach_cb' parameter was removed of
the detach function pointer in sPAPRDRConnectorClass

- hw/ppc/spapr_pci.c:
    * the callback 'spapr_phb_remove_pci_device_cb' is now passed
as a parameter in 'spapr_dr_connector_new' instead of 'drck->detach()'

- hw/ppc/spapr.c:
    * 'spapr_create_lmb_dr_connectors' now passes the callback
'spapr_lmb_release' to 'spapr_dr_connector_new' instead of 'drck-detach()'
    * 'spapr_init_cpus' now passes the callback 'spapr_core_release'
to 'spapr_dr_connector_new' instead of 'drck-detach()'
    * moved the callback functions up in the code so they can be referenced
by 'spapr_create_lmb_dr_connectors' and 'spapr_init_cpus'

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c             | 71 +++++++++++++++++++++++-----------------------
 hw/ppc/spapr_drc.c         | 17 ++++++-----
 hw/ppc/spapr_pci.c         |  5 ++--
 include/hw/ppc/spapr_drc.h |  4 +--
 4 files changed, 49 insertions(+), 48 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 80d12d0..bc11757 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1887,6 +1887,29 @@ static void spapr_drc_reset(void *opaque)
     }
 }
 
+typedef struct sPAPRDIMMState {
+    uint32_t nr_lmbs;
+} sPAPRDIMMState;
+
+static void spapr_lmb_release(DeviceState *dev, void *opaque)
+{
+    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
+    HotplugHandler *hotplug_ctrl;
+
+    if (--ds->nr_lmbs) {
+        return;
+    }
+
+    g_free(ds);
+
+    /*
+     * Now that all the LMBs have been removed by the guest, call the
+     * pc-dimm unplug handler to cleanup up the pc-dimm device.
+     */
+    hotplug_ctrl = qdev_get_hotplug_handler(dev);
+    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
+}
+
 static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
 {
     MachineState *machine = MACHINE(spapr);
@@ -1900,7 +1923,7 @@ static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
 
         addr = i * lmb_size + spapr->hotplug_memory.base;
         drc = spapr_dr_connector_new(OBJECT(spapr), SPAPR_DR_CONNECTOR_TYPE_LMB,
-                                     addr/lmb_size);
+                                     (addr / lmb_size), spapr_lmb_release);
         qemu_register_reset(spapr_drc_reset, drc);
     }
 }
@@ -1956,6 +1979,14 @@ static CPUArchId *spapr_find_cpu_slot(MachineState *ms, uint32_t id, int *idx)
     return &ms->possible_cpus->cpus[index];
 }
 
+static void spapr_core_release(DeviceState *dev, void *opaque)
+{
+    HotplugHandler *hotplug_ctrl;
+
+    hotplug_ctrl = qdev_get_hotplug_handler(dev);
+    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
+}
+
 static void spapr_init_cpus(sPAPRMachineState *spapr)
 {
     MachineState *machine = MACHINE(spapr);
@@ -1998,7 +2029,8 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
             sPAPRDRConnector *drc =
                 spapr_dr_connector_new(OBJECT(spapr),
                                        SPAPR_DR_CONNECTOR_TYPE_CPU,
-                                       (core_id / smp_threads) * smt);
+                                       (core_id / smp_threads) * smt,
+                                       spapr_core_release);
 
             qemu_register_reset(spapr_drc_reset, drc);
         }
@@ -2596,29 +2628,6 @@ out:
     error_propagate(errp, local_err);
 }
 
-typedef struct sPAPRDIMMState {
-    uint32_t nr_lmbs;
-} sPAPRDIMMState;
-
-static void spapr_lmb_release(DeviceState *dev, void *opaque)
-{
-    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
-    HotplugHandler *hotplug_ctrl;
-
-    if (--ds->nr_lmbs) {
-        return;
-    }
-
-    g_free(ds);
-
-    /*
-     * Now that all the LMBs have been removed by the guest, call the
-     * pc-dimm unplug handler to cleanup up the pc-dimm device.
-     */
-    hotplug_ctrl = qdev_get_hotplug_handler(dev);
-    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
-}
-
 static void spapr_del_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
                            Error **errp)
 {
@@ -2636,7 +2645,7 @@ static void spapr_del_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
         g_assert(drc);
 
         drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
-        drck->detach(drc, dev, spapr_lmb_release, ds, errp);
+        drck->detach(drc, dev, ds, errp);
         addr += SPAPR_MEMORY_BLOCK_SIZE;
     }
 
@@ -2712,14 +2721,6 @@ static void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
     object_unparent(OBJECT(dev));
 }
 
-static void spapr_core_release(DeviceState *dev, void *opaque)
-{
-    HotplugHandler *hotplug_ctrl;
-
-    hotplug_ctrl = qdev_get_hotplug_handler(dev);
-    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
-}
-
 static
 void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
                                Error **errp)
@@ -2745,7 +2746,7 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
     g_assert(drc);
 
     drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
-    drck->detach(drc, dev, spapr_core_release, NULL, &local_err);
+    drck->detach(drc, dev, NULL, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index a1cdc87..afe5d82 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -99,8 +99,8 @@ static uint32_t set_isolation_state(sPAPRDRConnector *drc,
         if (drc->awaiting_release) {
             if (drc->configured) {
                 trace_spapr_drc_set_isolation_state_finalizing(get_index(drc));
-                drck->detach(drc, DEVICE(drc->dev), drc->detach_cb,
-                             drc->detach_cb_opaque, NULL);
+                drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque,
+                                         NULL);
             } else {
                 trace_spapr_drc_set_isolation_state_deferring(get_index(drc));
             }
@@ -153,8 +153,8 @@ static uint32_t set_allocation_state(sPAPRDRConnector *drc,
         if (drc->awaiting_release &&
             drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) {
             trace_spapr_drc_set_allocation_state_finalizing(get_index(drc));
-            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb,
-                         drc->detach_cb_opaque, NULL);
+            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque,
+                         NULL);
         } else if (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) {
             drc->awaiting_allocation = false;
         }
@@ -405,12 +405,10 @@ static void attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
 }
 
 static void detach(sPAPRDRConnector *drc, DeviceState *d,
-                   spapr_drc_detach_cb *detach_cb,
                    void *detach_cb_opaque, Error **errp)
 {
     trace_spapr_drc_detach(get_index(drc));
 
-    drc->detach_cb = detach_cb;
     drc->detach_cb_opaque = detach_cb_opaque;
 
     /* if we've signalled device presence to the guest, or if the guest
@@ -498,8 +496,7 @@ static void reset(DeviceState *d)
          * force removal if we are
          */
         if (drc->awaiting_release) {
-            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb,
-                         drc->detach_cb_opaque, NULL);
+            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque, NULL);
         }
 
         /* non-PCI devices may be awaiting a transition to UNUSABLE */
@@ -566,7 +563,8 @@ static void unrealize(DeviceState *d, Error **errp)
 
 sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
                                          sPAPRDRConnectorType type,
-                                         uint32_t id)
+                                         uint32_t id,
+                                         spapr_drc_detach_cb *detach_cb)
 {
     sPAPRDRConnector *drc =
         SPAPR_DR_CONNECTOR(object_new(TYPE_SPAPR_DR_CONNECTOR));
@@ -577,6 +575,7 @@ sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
     drc->type = type;
     drc->id = id;
     drc->owner = owner;
+    drc->detach_cb = detach_cb;
     prop_name = g_strdup_printf("dr-connector[%"PRIu32"]", get_index(drc));
     object_property_add_child(owner, prop_name, OBJECT(drc), NULL);
     object_property_set_bool(OBJECT(drc), true, "realized", NULL);
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index e7567e2..935e65e 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1392,7 +1392,7 @@ static void spapr_phb_remove_pci_device(sPAPRDRConnector *drc,
 {
     sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 
-    drck->detach(drc, DEVICE(pdev), spapr_phb_remove_pci_device_cb, phb, errp);
+    drck->detach(drc, DEVICE(pdev), phb, errp);
 }
 
 static sPAPRDRConnector *spapr_phb_get_pci_func_drc(sPAPRPHBState *phb,
@@ -1764,7 +1764,8 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
         for (i = 0; i < PCI_SLOT_MAX * 8; i++) {
             spapr_dr_connector_new(OBJECT(phb),
                                    SPAPR_DR_CONNECTOR_TYPE_PCI,
-                                   (sphb->index << 16) | i);
+                                   (sphb->index << 16) | i,
+                                   spapr_phb_remove_pci_device_cb);
         }
     }
 
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index 5524247..0a2c173 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -189,7 +189,6 @@ typedef struct sPAPRDRConnectorClass {
     void (*attach)(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
                    int fdt_start_offset, bool coldplug, Error **errp);
     void (*detach)(sPAPRDRConnector *drc, DeviceState *d,
-                   spapr_drc_detach_cb *detach_cb,
                    void *detach_cb_opaque, Error **errp);
     bool (*release_pending)(sPAPRDRConnector *drc);
     void (*set_signalled)(sPAPRDRConnector *drc);
@@ -197,7 +196,8 @@ typedef struct sPAPRDRConnectorClass {
 
 sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
                                          sPAPRDRConnectorType type,
-                                         uint32_t id);
+                                         uint32_t id,
+                                         spapr_drc_detach_cb *detach_cb);
 sPAPRDRConnector *spapr_dr_connector_by_index(uint32_t index);
 sPAPRDRConnector *spapr_dr_connector_by_id(sPAPRDRConnectorType type,
                                            uint32_t id);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-04-30 17:25 [Qemu-devel] [PATCH 0/5 v8] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 1/5] hw/ppc: setting spapr_drc_detach_cb in spapr_dr_connector_new Daniel Henrique Barboza
@ 2017-04-30 17:25 ` Daniel Henrique Barboza
  2017-05-02  3:40   ` Bharata B Rao
  2017-05-03 14:33   ` [Qemu-devel] " Laurent Vivier
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 3/5] hw/ppc: migrating the DRC state of hotplugged devices Daniel Henrique Barboza
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-30 17:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

Following up the previous detach_cb change, this patch removes the
detach_cb_opaque entirely from the code.

The reason is that the drc->detach_cb_opaque object can't be
restored in the post load of the upcoming DRC migration and no detach
callbacks actually need this opaque. 'spapr_core_release' is
receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is receiving
a phb object as opaque but is't using it. These were trivial removal
cases.

However, the LM removal callback 'spapr_lmb_release' is receiving
and using the opaque object, a 'sPAPRDIMMState' struct. This struct
holds the number of LMBs the DIMM object contains and the callback
was using this counter as a countdown to check if all LMB DRCs were
release before proceeding to the DIMM unplug. To remove the need of
this callback we have choices such as:

- migrate the 'sPAPRDIMMState' struct. This would require creating a
QTAILQ to store all DIMMStates and an additional 'dimm_id' field to
associate the DIMMState with the DIMM object. We could attach this
QTAILQ to the 'sPAPRPHBState' and retrieve it later in the callback.

- fetch the state of the LMB DRCs directly by scanning the state of
them and, if all of them are released, proceed with the DIMM unplug.

The second approach was chosen. The new 'spapr_all_lmbs_drcs_released'
function scans all LMBs of a given DIMM device to see if their DRC
state are inactive. If all of them are inactive return 'true', 'false'
otherwise. This function is being called inside the 'spapr_lmb_release'
callback, replacing the role of the 'sPAPRDIMMState' opaque. The
'sPAPRDIMMState' struct was removed from the code given that there are
no more uses for it.

After all these changes, there are no roles left for the 'detach_cb_opaque'
attribute of the 'sPAPRDRConnector' as well, so we can safely remove
it from the code too.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c             | 46 +++++++++++++++++++++++++++++++++-------------
 hw/ppc/spapr_drc.c         | 16 +++++-----------
 hw/ppc/spapr_pci.c         |  4 ++--
 include/hw/ppc/spapr_drc.h |  6 ++----
 4 files changed, 42 insertions(+), 30 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bc11757..8b9a6cf 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1887,21 +1887,43 @@ static void spapr_drc_reset(void *opaque)
     }
 }
 
-typedef struct sPAPRDIMMState {
-    uint32_t nr_lmbs;
-} sPAPRDIMMState;
+static bool spapr_all_lmbs_drcs_released(PCDIMMDevice *dimm)
+{
+    Error *local_err = NULL;
+    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
+    MemoryRegion *mr = ddc->get_memory_region(dimm);
+    uint64_t size = memory_region_size(mr);
+
+    uint64_t addr;
+    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, &local_err);
+    if (local_err) {
+        error_propagate(&error_abort, local_err);
+        return false;
+    }
+    uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE;
 
-static void spapr_lmb_release(DeviceState *dev, void *opaque)
+    sPAPRDRConnector *drc;
+    int i = 0;
+    for (i = 0; i < nr_lmbs; i++) {
+        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
+                addr / SPAPR_MEMORY_BLOCK_SIZE);
+        g_assert(drc);
+        if (drc->indicator_state != SPAPR_DR_INDICATOR_STATE_INACTIVE) {
+            return false;
+        }
+        addr += SPAPR_MEMORY_BLOCK_SIZE;
+    }
+    return true;
+}
+
+static void spapr_lmb_release(DeviceState *dev)
 {
-    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
     HotplugHandler *hotplug_ctrl;
 
-    if (--ds->nr_lmbs) {
+    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
         return;
     }
 
-    g_free(ds);
-
     /*
      * Now that all the LMBs have been removed by the guest, call the
      * pc-dimm unplug handler to cleanup up the pc-dimm device.
@@ -1979,7 +2001,7 @@ static CPUArchId *spapr_find_cpu_slot(MachineState *ms, uint32_t id, int *idx)
     return &ms->possible_cpus->cpus[index];
 }
 
-static void spapr_core_release(DeviceState *dev, void *opaque)
+static void spapr_core_release(DeviceState *dev)
 {
     HotplugHandler *hotplug_ctrl;
 
@@ -2635,17 +2657,15 @@ static void spapr_del_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
     sPAPRDRConnectorClass *drck;
     uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE;
     int i;
-    sPAPRDIMMState *ds = g_malloc0(sizeof(sPAPRDIMMState));
     uint64_t addr = addr_start;
 
-    ds->nr_lmbs = nr_lmbs;
     for (i = 0; i < nr_lmbs; i++) {
         drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
                 addr / SPAPR_MEMORY_BLOCK_SIZE);
         g_assert(drc);
 
         drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
-        drck->detach(drc, dev, ds, errp);
+        drck->detach(drc, dev, errp);
         addr += SPAPR_MEMORY_BLOCK_SIZE;
     }
 
@@ -2746,7 +2766,7 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
     g_assert(drc);
 
     drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
-    drck->detach(drc, dev, NULL, &local_err);
+    drck->detach(drc, dev, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index afe5d82..6f98242 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -99,8 +99,7 @@ static uint32_t set_isolation_state(sPAPRDRConnector *drc,
         if (drc->awaiting_release) {
             if (drc->configured) {
                 trace_spapr_drc_set_isolation_state_finalizing(get_index(drc));
-                drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque,
-                                         NULL);
+                drck->detach(drc, DEVICE(drc->dev), NULL);
             } else {
                 trace_spapr_drc_set_isolation_state_deferring(get_index(drc));
             }
@@ -153,8 +152,7 @@ static uint32_t set_allocation_state(sPAPRDRConnector *drc,
         if (drc->awaiting_release &&
             drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) {
             trace_spapr_drc_set_allocation_state_finalizing(get_index(drc));
-            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque,
-                         NULL);
+            drck->detach(drc, DEVICE(drc->dev), NULL);
         } else if (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) {
             drc->awaiting_allocation = false;
         }
@@ -404,13 +402,10 @@ static void attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
                              NULL, 0, NULL);
 }
 
-static void detach(sPAPRDRConnector *drc, DeviceState *d,
-                   void *detach_cb_opaque, Error **errp)
+static void detach(sPAPRDRConnector *drc, DeviceState *d, Error **errp)
 {
     trace_spapr_drc_detach(get_index(drc));
 
-    drc->detach_cb_opaque = detach_cb_opaque;
-
     /* if we've signalled device presence to the guest, or if the guest
      * has gone ahead and configured the device (via manually-executed
      * device add via drmgr in guest, namely), we need to wait
@@ -455,7 +450,7 @@ static void detach(sPAPRDRConnector *drc, DeviceState *d,
     drc->indicator_state = SPAPR_DR_INDICATOR_STATE_INACTIVE;
 
     if (drc->detach_cb) {
-        drc->detach_cb(drc->dev, drc->detach_cb_opaque);
+        drc->detach_cb(drc->dev);
     }
 
     drc->awaiting_release = false;
@@ -466,7 +461,6 @@ static void detach(sPAPRDRConnector *drc, DeviceState *d,
     object_property_del(OBJECT(drc), "device", NULL);
     drc->dev = NULL;
     drc->detach_cb = NULL;
-    drc->detach_cb_opaque = NULL;
 }
 
 static bool release_pending(sPAPRDRConnector *drc)
@@ -496,7 +490,7 @@ static void reset(DeviceState *d)
          * force removal if we are
          */
         if (drc->awaiting_release) {
-            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque, NULL);
+            drck->detach(drc, DEVICE(drc->dev), NULL);
         }
 
         /* non-PCI devices may be awaiting a transition to UNUSABLE */
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 935e65e..c620eee 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1369,7 +1369,7 @@ out:
     }
 }
 
-static void spapr_phb_remove_pci_device_cb(DeviceState *dev, void *opaque)
+static void spapr_phb_remove_pci_device_cb(DeviceState *dev)
 {
     /* some version guests do not wait for completion of a device
      * cleanup (generally done asynchronously by the kernel) before
@@ -1392,7 +1392,7 @@ static void spapr_phb_remove_pci_device(sPAPRDRConnector *drc,
 {
     sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 
-    drck->detach(drc, DEVICE(pdev), phb, errp);
+    drck->detach(drc, DEVICE(pdev), errp);
 }
 
 static sPAPRDRConnector *spapr_phb_get_pci_func_drc(sPAPRPHBState *phb,
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index 0a2c173..f68c90e 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -130,7 +130,7 @@ typedef enum {
     SPAPR_DR_CC_RESPONSE_NOT_CONFIGURABLE = -9003,
 } sPAPRDRCCResponse;
 
-typedef void (spapr_drc_detach_cb)(DeviceState *d, void *opaque);
+typedef void (spapr_drc_detach_cb)(DeviceState *d);
 
 typedef struct sPAPRDRConnector {
     /*< private >*/
@@ -159,7 +159,6 @@ typedef struct sPAPRDRConnector {
     /* device pointer, via link property */
     DeviceState *dev;
     spapr_drc_detach_cb *detach_cb;
-    void *detach_cb_opaque;
 } sPAPRDRConnector;
 
 typedef struct sPAPRDRConnectorClass {
@@ -188,8 +187,7 @@ typedef struct sPAPRDRConnectorClass {
     /* QEMU interfaces for managing hotplug operations */
     void (*attach)(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
                    int fdt_start_offset, bool coldplug, Error **errp);
-    void (*detach)(sPAPRDRConnector *drc, DeviceState *d,
-                   void *detach_cb_opaque, Error **errp);
+    void (*detach)(sPAPRDRConnector *drc, DeviceState *d, Error **errp);
     bool (*release_pending)(sPAPRDRConnector *drc);
     void (*set_signalled)(sPAPRDRConnector *drc);
 } sPAPRDRConnectorClass;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 3/5] hw/ppc: migrating the DRC state of hotplugged devices
  2017-04-30 17:25 [Qemu-devel] [PATCH 0/5 v8] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 1/5] hw/ppc: setting spapr_drc_detach_cb in spapr_dr_connector_new Daniel Henrique Barboza
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques Daniel Henrique Barboza
@ 2017-04-30 17:25 ` Daniel Henrique Barboza
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 4/5] migration: spapr: migrate ccs_list in spapr state Daniel Henrique Barboza
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 5/5] migration: spapr: migrate pending_events of " Daniel Henrique Barboza
  4 siblings, 0 replies; 20+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-30 17:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

In pseries, a firmware abstraction called Dynamic Reconfiguration
Connector (DRC) is used to assign a particular dynamic resource
to the guest and provide an interface to manage configuration/removal
of the resource associated with it. In other words, DRC is the
'plugged state' of a device.

Before this patch, DRC wasn't being migrated. This causes
post-migration problems due to DRC state mismatch between source and
target. The DRC state of a device X in the source might
change, while in the target the DRC state of X is still fresh. When
migrating the guest, X will not have the same hotplugged state as it
did in the source. This means that we can't hot unplug X in the
target after migration is completed because its DRC state is not consistent.
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
bug that is caused by this DRC state mismatch between source and
target.

To migrate the DRC state, we defined the VMStateDescription struct for
spapr_drc to enable the transmission of spapr_drc state in migration.
Not all the elements in the DRC state are migrated - only those
that can be modified by guest actions or device add/remove
operations:

- 'isolation_state', 'allocation_state' and 'configured' are involved
in the DR state transition diagram from PAPR+ 2.7, 13.4;

- 'configured' and 'signalled' are needed in attaching and detaching
devices;

- 'indicator_state' provides users with hardware state information.

These are the DRC elements that are migrated.

In this patch the DRC state is migrated for PCI, LMB and CPU
connector types. At this moment there is no support to migrate
DRC for the PHB (PCI Host Bridge) type.

In the 'realize' function the DRC is registered using vmstate_register,
similar to what hw/ppc/spapr_iommu.c does in 'spapr_tce_table_realize'.
This approach works because  DRCs are bus-less and do not sit
on a BusClass that implements bc->get_dev_path, so as a fallback the
VMSD gets identified via "spapr_drc"/get_index(drc).

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr_drc.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 6f98242..0da5f02 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -506,6 +506,64 @@ static void reset(DeviceState *d)
     }
 }
 
+static bool spapr_drc_needed(void *opaque)
+{
+    sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+    bool rc = false;
+    sPAPRDREntitySense value;
+    drck->entity_sense(drc, &value);
+    /* If no dev is plugged in there is no need to migrate the DRC state */
+    if (value != SPAPR_DR_ENTITY_SENSE_PRESENT) {
+        return false;
+    }
+
+    /*
+     * If there is dev plugged in, we need to migrate the DRC state when
+     * it is different from cold-plugged state
+     */
+    switch (drc->type) {
+
+    case SPAPR_DR_CONNECTOR_TYPE_PCI:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
+               drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    case SPAPR_DR_CONNECTOR_TYPE_LMB:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
+               drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    case SPAPR_DR_CONNECTOR_TYPE_CPU:
+        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
+               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
+                drc->configured && drc->signalled && !drc->awaiting_release);
+        break;
+
+    default:
+        ;
+    }
+    return rc;
+}
+
+static const VMStateDescription vmstate_spapr_drc = {
+    .name = "spapr_drc",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_drc_needed,
+    .fields  = (VMStateField []) {
+        VMSTATE_UINT32(isolation_state, sPAPRDRConnector),
+        VMSTATE_UINT32(allocation_state, sPAPRDRConnector),
+        VMSTATE_UINT32(indicator_state, sPAPRDRConnector),
+        VMSTATE_BOOL(configured, sPAPRDRConnector),
+        VMSTATE_BOOL(awaiting_release, sPAPRDRConnector),
+        VMSTATE_BOOL(signalled, sPAPRDRConnector),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static void realize(DeviceState *d, Error **errp)
 {
     sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(d);
@@ -534,6 +592,8 @@ static void realize(DeviceState *d, Error **errp)
         object_unref(OBJECT(drc));
     }
     g_free(child_name);
+    vmstate_register(DEVICE(drc), drck->get_index(drc), &vmstate_spapr_drc,
+                     drc);
     trace_spapr_drc_realize_complete(drck->get_index(drc));
 }
 
@@ -652,6 +712,7 @@ static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
     dk->reset = reset;
     dk->realize = realize;
     dk->unrealize = unrealize;
+    dk->vmsd = &vmstate_spapr_drc;
     drck->set_isolation_state = set_isolation_state;
     drck->set_indicator_state = set_indicator_state;
     drck->set_allocation_state = set_allocation_state;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 4/5] migration: spapr: migrate ccs_list in spapr state
  2017-04-30 17:25 [Qemu-devel] [PATCH 0/5 v8] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
                   ` (2 preceding siblings ...)
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 3/5] hw/ppc: migrating the DRC state of hotplugged devices Daniel Henrique Barboza
@ 2017-04-30 17:25 ` Daniel Henrique Barboza
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 5/5] migration: spapr: migrate pending_events of " Daniel Henrique Barboza
  4 siblings, 0 replies; 20+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-30 17:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

From: Jianjun Duan <duanj@linux.vnet.ibm.com>

ccs_list in spapr state maintains the device tree related
information on the rtas side for hotplugged devices. In racing
situations between hotplug events and migration operation, a rtas
hotplug event could be migrated from the source guest to target
guest, or the source guest could have not yet finished fetching
the device tree when migration is started, the target will try
to finish fetching the device tree. By migrating ccs_list, the
target can fetch the device tree properly.

In theory there would be other alternatives besides migrating the
css_list to fix this. For example, we could add a flag that indicates
whether a device is in the middle of the configure_connector during the
migration process, in the post_load we can detect if this flag
is active and then return an error informing the guest to restart the
hotplug process. However, the DRC state can still be modified outside of
hotplug. Using:

   drmgr -c pci -s <drc_index> -r
   drmgr -c pci -s <drc_index> -a

it is possible to return a device to firmware and then later take it
back and reconfigure it. This is not a common case but it's not prohibited,
and performing a migration between these 2 operations would fail because
the default coldplug state on target assumes a configured state in
the source*. Migrating ccs_list is one solution that cover this
case as well.

ccs_list is put in a subsection in the spapr state VMSD to make
sure migration across different versions is not broken.

* see http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg01763.html
for more information on this discussion.

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 8b9a6cf..cb3f0e8 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1437,6 +1437,37 @@ static bool version_before_3(void *opaque, int version_id)
     return version_id < 3;
 }
 
+static bool spapr_ccs_list_needed(void *opaque)
+{
+    sPAPRMachineState *spapr = (sPAPRMachineState *)opaque;
+    return !QTAILQ_EMPTY(&spapr->ccs_list);
+}
+
+static const VMStateDescription vmstate_spapr_ccs = {
+    .name = "spapr_configure_connector_state",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(drc_index, sPAPRConfigureConnectorState),
+        VMSTATE_INT32(fdt_offset, sPAPRConfigureConnectorState),
+        VMSTATE_INT32(fdt_depth, sPAPRConfigureConnectorState),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static const VMStateDescription vmstate_spapr_ccs_list = {
+    .name = "spapr_ccs_list",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_ccs_list_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_QTAILQ_V(ccs_list, sPAPRMachineState, 1,
+                         vmstate_spapr_ccs, sPAPRConfigureConnectorState,
+                         next),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static bool spapr_ov5_cas_needed(void *opaque)
 {
     sPAPRMachineState *spapr = opaque;
@@ -1535,6 +1566,7 @@ static const VMStateDescription vmstate_spapr = {
     .subsections = (const VMStateDescription*[]) {
         &vmstate_spapr_ov5_cas,
         &vmstate_spapr_patb_entry,
+        &vmstate_spapr_ccs_list,
         NULL
     }
 };
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [Qemu-devel] [PATCH 5/5] migration: spapr: migrate pending_events of spapr state
  2017-04-30 17:25 [Qemu-devel] [PATCH 0/5 v8] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
                   ` (3 preceding siblings ...)
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 4/5] migration: spapr: migrate ccs_list in spapr state Daniel Henrique Barboza
@ 2017-04-30 17:25 ` Daniel Henrique Barboza
  4 siblings, 0 replies; 20+ messages in thread
From: Daniel Henrique Barboza @ 2017-04-30 17:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, david

From: Jianjun Duan <duanj@linux.vnet.ibm.com>

In racing situations between hotplug events and migration operation,
a rtas hotplug event could have not yet be delivered to the source
guest when migration is started. In this case the pending_events of
spapr state need be transmitted to the target so that the hotplug
event can be finished on the target.

All the different fields of the events are encoded as defined by
PAPR. We can migrate them as uint8_t binary stream without any
concerns about data padding or endianess.

pending_events is put in a subsection in the spapr state VMSD to make
sure migration across different versions is not broken.

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c         | 33 +++++++++++++++++++++++++++++++++
 hw/ppc/spapr_events.c  | 24 +++++++++++++-----------
 include/hw/ppc/spapr.h |  3 ++-
 3 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index cb3f0e8..cd42449 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1468,6 +1468,38 @@ static const VMStateDescription vmstate_spapr_ccs_list = {
     },
 };
 
+static bool spapr_pending_events_needed(void *opaque)
+{
+    sPAPRMachineState *spapr = (sPAPRMachineState *)opaque;
+    return !QTAILQ_EMPTY(&spapr->pending_events);
+}
+
+static const VMStateDescription vmstate_spapr_event_entry = {
+    .name = "spapr_event_log_entry",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_INT32(log_type, sPAPREventLogEntry),
+        VMSTATE_BOOL(exception, sPAPREventLogEntry),
+        VMSTATE_UINT32(data_size, sPAPREventLogEntry),
+        VMSTATE_VARRAY_UINT32_ALLOC(data, sPAPREventLogEntry, data_size,
+                                    0, vmstate_info_uint8, uint8_t),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static const VMStateDescription vmstate_spapr_pending_events = {
+    .name = "spapr_pending_events",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_pending_events_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_QTAILQ_V(pending_events, sPAPRMachineState, 1,
+                         vmstate_spapr_event_entry, sPAPREventLogEntry, next),
+        VMSTATE_END_OF_LIST()
+    },
+};
+
 static bool spapr_ov5_cas_needed(void *opaque)
 {
     sPAPRMachineState *spapr = opaque;
@@ -1567,6 +1599,7 @@ static const VMStateDescription vmstate_spapr = {
         &vmstate_spapr_ov5_cas,
         &vmstate_spapr_patb_entry,
         &vmstate_spapr_ccs_list,
+        &vmstate_spapr_pending_events,
         NULL
     }
 };
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index f0b28d8..70c7cfc 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -342,7 +342,8 @@ static int rtas_event_log_to_irq(sPAPRMachineState *spapr, int log_type)
     return source->irq;
 }
 
-static void rtas_event_log_queue(int log_type, void *data, bool exception)
+static void rtas_event_log_queue(int log_type, void *data, bool exception,
+                                 int data_size)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
     sPAPREventLogEntry *entry = g_new(sPAPREventLogEntry, 1);
@@ -351,6 +352,7 @@ static void rtas_event_log_queue(int log_type, void *data, bool exception)
     entry->log_type = log_type;
     entry->exception = exception;
     entry->data = data;
+    entry->data_size = data_size;
     QTAILQ_INSERT_TAIL(&spapr->pending_events, entry, next);
 }
 
@@ -445,6 +447,7 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
     struct rtas_event_log_v6_mainb *mainb;
     struct rtas_event_log_v6_epow *epow;
     struct epow_log_full *new_epow;
+    uint32_t data_size;
 
     new_epow = g_malloc0(sizeof(*new_epow));
     hdr = &new_epow->hdr;
@@ -453,14 +456,13 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
     mainb = &new_epow->mainb;
     epow = &new_epow->epow;
 
+    data_size = sizeof(*new_epow);
     hdr->summary = cpu_to_be32(RTAS_LOG_VERSION_6
                                | RTAS_LOG_SEVERITY_EVENT
                                | RTAS_LOG_DISPOSITION_NOT_RECOVERED
                                | RTAS_LOG_OPTIONAL_PART_PRESENT
                                | RTAS_LOG_TYPE_EPOW);
-    hdr->extended_length = cpu_to_be32(sizeof(*new_epow)
-                                       - sizeof(new_epow->hdr));
-
+    hdr->extended_length = cpu_to_be32(data_size - sizeof(new_epow->hdr));
     spapr_init_v6hdr(v6hdr);
     spapr_init_maina(maina, 3 /* Main-A, Main-B and EPOW */);
 
@@ -479,7 +481,7 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
     epow->event_modifier = RTAS_LOG_V6_EPOW_MODIFIER_NORMAL;
     epow->extended_modifier = RTAS_LOG_V6_EPOW_XMODIFIER_PARTITION_SPECIFIC;
 
-    rtas_event_log_queue(RTAS_LOG_TYPE_EPOW, new_epow, true);
+    rtas_event_log_queue(RTAS_LOG_TYPE_EPOW, new_epow, true, data_size);
 
     qemu_irq_pulse(xics_get_qirq(XICS_FABRIC(spapr),
                                  rtas_event_log_to_irq(spapr,
@@ -504,6 +506,7 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
     struct rtas_event_log_v6_maina *maina;
     struct rtas_event_log_v6_mainb *mainb;
     struct rtas_event_log_v6_hp *hp;
+    uint32_t data_size;
 
     new_hp = g_malloc0(sizeof(struct hp_log_full));
     hdr = &new_hp->hdr;
@@ -512,14 +515,14 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
     mainb = &new_hp->mainb;
     hp = &new_hp->hp;
 
+    data_size = sizeof(*new_hp);
     hdr->summary = cpu_to_be32(RTAS_LOG_VERSION_6
                                | RTAS_LOG_SEVERITY_EVENT
                                | RTAS_LOG_DISPOSITION_NOT_RECOVERED
                                | RTAS_LOG_OPTIONAL_PART_PRESENT
                                | RTAS_LOG_INITIATOR_HOTPLUG
                                | RTAS_LOG_TYPE_HOTPLUG);
-    hdr->extended_length = cpu_to_be32(sizeof(*new_hp)
-                                       - sizeof(new_hp->hdr));
+    hdr->extended_length = cpu_to_be32(data_size - sizeof(new_hp->hdr));
 
     spapr_init_v6hdr(v6hdr);
     spapr_init_maina(maina, 3 /* Main-A, Main-B, HP */);
@@ -572,7 +575,7 @@ static void spapr_hotplug_req_event(uint8_t hp_id, uint8_t hp_action,
             cpu_to_be32(drc_id->count_indexed.index);
     }
 
-    rtas_event_log_queue(RTAS_LOG_TYPE_HOTPLUG, new_hp, true);
+    rtas_event_log_queue(RTAS_LOG_TYPE_HOTPLUG, new_hp, true, data_size);
 
     qemu_irq_pulse(xics_get_qirq(XICS_FABRIC(spapr),
                                  rtas_event_log_to_irq(spapr,
@@ -671,8 +674,7 @@ static void check_exception(PowerPCCPU *cpu, sPAPRMachineState *spapr,
     if (!event) {
         goto out_no_events;
     }
-
-    hdr = event->data;
+    hdr = (struct rtas_error_log *)event->data;
     event_len = be32_to_cpu(hdr->extended_length) + sizeof(*hdr);
 
     if (event_len < len) {
@@ -728,7 +730,7 @@ static void event_scan(PowerPCCPU *cpu, sPAPRMachineState *spapr,
         goto out_no_events;
     }
 
-    hdr = event->data;
+    hdr = (struct rtas_error_log *)event->data;
     event_len = be32_to_cpu(hdr->extended_length) + sizeof(*hdr);
 
     if (event_len < len) {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 5802f88..fbe1d93 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -599,7 +599,8 @@ sPAPRTCETable *spapr_tce_find_by_liobn(target_ulong liobn);
 struct sPAPREventLogEntry {
     int log_type;
     bool exception;
-    void *data;
+    uint8_t *data;
+    uint32_t data_size;
     QTAILQ_ENTRY(sPAPREventLogEntry) next;
 };
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques Daniel Henrique Barboza
@ 2017-05-02  3:40   ` Bharata B Rao
  2017-05-02  7:43     ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
  2017-05-03 14:33   ` [Qemu-devel] " Laurent Vivier
  1 sibling, 1 reply; 20+ messages in thread
From: Bharata B Rao @ 2017-05-02  3:40 UTC (permalink / raw)
  To: Daniel Henrique Barboza; +Cc: qemu-devel, qemu-ppc, David Gibson

On Sun, Apr 30, 2017 at 10:55 PM, Daniel Henrique Barboza <
danielhb@linux.vnet.ibm.com> wrote:

> Following up the previous detach_cb change, this patch removes the
> detach_cb_opaque entirely from the code.
>
> The reason is that the drc->detach_cb_opaque object can't be
> restored in the post load of the upcoming DRC migration and no detach
> callbacks actually need this opaque. 'spapr_core_release' is
> receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is receiving
> a phb object as opaque but is't using it. These were trivial removal
> cases.
>
> However, the LM removal callback 'spapr_lmb_release' is receiving
> and using the opaque object, a 'sPAPRDIMMState' struct. This struct
> holds the number of LMBs the DIMM object contains and the callback
> was using this counter as a countdown to check if all LMB DRCs were
> release before proceeding to the DIMM unplug. To remove the need of
> this callback we have choices such as:
>
> - migrate the 'sPAPRDIMMState' struct. This would require creating a
> QTAILQ to store all DIMMStates and an additional 'dimm_id' field to
> associate the DIMMState with the DIMM object. We could attach this
> QTAILQ to the 'sPAPRPHBState' and retrieve it later in the callback.
>
> - fetch the state of the LMB DRCs directly by scanning the state of
> them and, if all of them are released, proceed with the DIMM unplug.
>
> The second approach was chosen. The new 'spapr_all_lmbs_drcs_released'
> function scans all LMBs of a given DIMM device to see if their DRC
> state are inactive. If all of them are inactive return 'true', 'false'
> otherwise. This function is being called inside the 'spapr_lmb_release'
> callback, replacing the role of the 'sPAPRDIMMState' opaque. The
> 'sPAPRDIMMState' struct was removed from the code given that there are
> no more uses for it.
>
> After all these changes, there are no roles left for the 'detach_cb_opaque'
> attribute of the 'sPAPRDRConnector' as well, so we can safely remove
> it from the code too.
>
> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c             | 46 ++++++++++++++++++++++++++++++
> +++-------------
>  hw/ppc/spapr_drc.c         | 16 +++++-----------
>  hw/ppc/spapr_pci.c         |  4 ++--
>  include/hw/ppc/spapr_drc.h |  6 ++----
>  4 files changed, 42 insertions(+), 30 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index bc11757..8b9a6cf 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1887,21 +1887,43 @@ static void spapr_drc_reset(void *opaque)
>      }
>  }
>
> -typedef struct sPAPRDIMMState {
> -    uint32_t nr_lmbs;
> -} sPAPRDIMMState;
> +static bool spapr_all_lmbs_drcs_released(PCDIMMDevice *dimm)
> +{
> +    Error *local_err = NULL;
> +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
> +    MemoryRegion *mr = ddc->get_memory_region(dimm);
> +    uint64_t size = memory_region_size(mr);
> +
> +    uint64_t addr;
> +    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP,
> &local_err);
> +    if (local_err) {
> +        error_propagate(&error_abort, local_err);
> +        return false;
> +    }
> +    uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE;
>
> -static void spapr_lmb_release(DeviceState *dev, void *opaque)
> +    sPAPRDRConnector *drc;
> +    int i = 0;
> +    for (i = 0; i < nr_lmbs; i++) {
> +        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> +                addr / SPAPR_MEMORY_BLOCK_SIZE);
> +        g_assert(drc);
> +        if (drc->indicator_state != SPAPR_DR_INDICATOR_STATE_INACTIVE) {
> +            return false;
> +        }
> +        addr += SPAPR_MEMORY_BLOCK_SIZE;
> +    }
> +    return true;
> +}
> +
> +static void spapr_lmb_release(DeviceState *dev)
>  {
> -    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
>      HotplugHandler *hotplug_ctrl;
>
> -    if (--ds->nr_lmbs) {
> +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
>          return;
>      }
>

I am concerned about the number of times we walk the DRC list corresponding
to each DIMM device. When a DIMM device is being removed,
spapr_lmb_release() will be invoked for each of the LMBs of that DIMM. Now
in this scheme, we end up walking through all the DRC objects of the DIMM
from every LMB's release function.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-05-02  3:40   ` Bharata B Rao
@ 2017-05-02  7:43     ` Daniel Henrique Barboza
  2017-05-03  3:26       ` Bharata B Rao
  2017-05-04  7:20       ` David Gibson
  0 siblings, 2 replies; 20+ messages in thread
From: Daniel Henrique Barboza @ 2017-05-02  7:43 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: qemu-ppc, qemu-devel, David Gibson



On 05/02/2017 12:40 AM, Bharata B Rao wrote:
> On Sun, Apr 30, 2017 at 10:55 PM, Daniel Henrique Barboza 
> <danielhb@linux.vnet.ibm.com <mailto:danielhb@linux.vnet.ibm.com>> wrote:
>
>     Following up the previous detach_cb change, this patch removes the
>     detach_cb_opaque entirely from the code.
>
>     The reason is that the drc->detach_cb_opaque object can't be
>     restored in the post load of the upcoming DRC migration and no detach
>     callbacks actually need this opaque. 'spapr_core_release' is
>     receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is receiving
>     a phb object as opaque but is't using it. These were trivial removal
>     cases.
>
>     However, the LM removal callback 'spapr_lmb_release' is receiving
>     and using the opaque object, a 'sPAPRDIMMState' struct. This struct
>     holds the number of LMBs the DIMM object contains and the callback
>     was using this counter as a countdown to check if all LMB DRCs were
>     release before proceeding to the DIMM unplug. To remove the need of
>     this callback we have choices such as:
>
>     - migrate the 'sPAPRDIMMState' struct. This would require creating a
>     QTAILQ to store all DIMMStates and an additional 'dimm_id' field to
>     associate the DIMMState with the DIMM object. We could attach this
>     QTAILQ to the 'sPAPRPHBState' and retrieve it later in the callback.
>
>     - fetch the state of the LMB DRCs directly by scanning the state of
>     them and, if all of them are released, proceed with the DIMM unplug.
>
>     The second approach was chosen. The new 'spapr_all_lmbs_drcs_released'
>     function scans all LMBs of a given DIMM device to see if their DRC
>     state are inactive. If all of them are inactive return 'true', 'false'
>     otherwise. This function is being called inside the
>     'spapr_lmb_release'
>     callback, replacing the role of the 'sPAPRDIMMState' opaque. The
>     'sPAPRDIMMState' struct was removed from the code given that there are
>     no more uses for it.
>
>     After all these changes, there are no roles left for the
>     'detach_cb_opaque'
>     attribute of the 'sPAPRDRConnector' as well, so we can safely remove
>     it from the code too.
>
>     Signed-off-by: Daniel Henrique Barboza
>     <danielhb@linux.vnet.ibm.com <mailto:danielhb@linux.vnet.ibm.com>>
>     ---
>      hw/ppc/spapr.c             | 46
>     +++++++++++++++++++++++++++++++++-------------
>      hw/ppc/spapr_drc.c         | 16 +++++-----------
>      hw/ppc/spapr_pci.c         |  4 ++--
>      include/hw/ppc/spapr_drc.h |  6 ++----
>      4 files changed, 42 insertions(+), 30 deletions(-)
>
>     diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>     index bc11757..8b9a6cf 100644
>     --- a/hw/ppc/spapr.c
>     +++ b/hw/ppc/spapr.c
>     @@ -1887,21 +1887,43 @@ static void spapr_drc_reset(void *opaque)
>          }
>      }
>
>     -typedef struct sPAPRDIMMState {
>     -    uint32_t nr_lmbs;
>     -} sPAPRDIMMState;
>     +static bool spapr_all_lmbs_drcs_released(PCDIMMDevice *dimm)
>     +{
>     +    Error *local_err = NULL;
>     +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
>     +    MemoryRegion *mr = ddc->get_memory_region(dimm);
>     +    uint64_t size = memory_region_size(mr);
>     +
>     +    uint64_t addr;
>     +    addr = object_property_get_int(OBJECT(dimm),
>     PC_DIMM_ADDR_PROP, &local_err);
>     +    if (local_err) {
>     +        error_propagate(&error_abort, local_err);
>     +        return false;
>     +    }
>     +    uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE;
>
>     -static void spapr_lmb_release(DeviceState *dev, void *opaque)
>     +    sPAPRDRConnector *drc;
>     +    int i = 0;
>     +    for (i = 0; i < nr_lmbs; i++) {
>     +        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
>     +                addr / SPAPR_MEMORY_BLOCK_SIZE);
>     +        g_assert(drc);
>     +        if (drc->indicator_state !=
>     SPAPR_DR_INDICATOR_STATE_INACTIVE) {
>     +            return false;
>     +        }
>     +        addr += SPAPR_MEMORY_BLOCK_SIZE;
>     +    }
>     +    return true;
>     +}
>     +
>     +static void spapr_lmb_release(DeviceState *dev)
>      {
>     -    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
>          HotplugHandler *hotplug_ctrl;
>
>     -    if (--ds->nr_lmbs) {
>     +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
>              return;
>          }
>
>
> I am concerned about the number of times we walk the DRC list 
> corresponding to each DIMM device. When a DIMM device is being 
> removed, spapr_lmb_release() will be invoked for each of the LMBs of 
> that DIMM. Now in this scheme, we end up walking through all the DRC 
> objects of the DIMM from every LMB's release function.

Hi Bharata,


I agree, this is definitely a poorer performance than simply 
decrementing ds->nr_lmbs.
The reasons why I went on with it:

- hot unplug isn't an operation that happens too often, so it's not terrible
to have a delay increase here;

- it didn't increased the unplug delay in an human noticeable way, at 
least in
my tests;

- apart from migrating the information, there is nothing much we can do 
in the
callback side about it. The callback isn't aware of the current state of 
the DIMM
removal process, so the scanning is required every time.


All that said, assuming that the process of DIMM removal will always go 
through
'spapr_del_lmbs', why do we need this callback? Can't we simply do something
like this in spapr_del_lmbs?


diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index cd42449..e443fea 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2734,6 +2734,20 @@ static void spapr_del_lmbs(DeviceState *dev, 
uint64_t addr_start, uint64_t size,
          addr += SPAPR_MEMORY_BLOCK_SIZE;
      }

+    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
+        // something went wrong in the removal of the LMBs.
+        // propagate error and return
+        throw_error_code;
+        return;
+    }
+
+    /*
+     * Now that all the LMBs have been removed by the guest, call the
+     * pc-dimm unplug handler to cleanup up the pc-dimm device.
+     */
+    hotplug_ctrl = qdev_get_hotplug_handler(dev);
+    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
+
      drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
                                     addr_start / SPAPR_MEMORY_BLOCK_SIZE);
      drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);


With this change we run the LMB scanning once at the end of the for
loop inside spapr_del_lmbs to make sure everything went fine (something
that the current code  isn't doing, there are operationsvbeing done 
afterwards
without checking if the LMB removals actually happened).

If something went wrong, propagate an error. If not, proceed with the 
removal
of the DIMM device and the remaining spapr_del_lmbs code. 
spapr_lmb_release can
be safely removed from the code after that.


What do you think?


Daniel

>
> Regards,
> Bharata.

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-05-02  7:43     ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
@ 2017-05-03  3:26       ` Bharata B Rao
  2017-05-03 13:56         ` Daniel Henrique Barboza
  2017-05-04  7:20       ` David Gibson
  1 sibling, 1 reply; 20+ messages in thread
From: Bharata B Rao @ 2017-05-03  3:26 UTC (permalink / raw)
  To: Daniel Henrique Barboza; +Cc: qemu-ppc, qemu-devel, David Gibson

On Tue, May 2, 2017 at 1:13 PM, Daniel Henrique Barboza <
danielhb@linux.vnet.ibm.com> wrote:

>
>
> On 05/02/2017 12:40 AM, Bharata B Rao wrote:
>
> On Sun, Apr 30, 2017 at 10:55 PM, Daniel Henrique Barboza <
> danielhb@linux.vnet.ibm.com> wrote:
>
>> Following up the previous detach_cb change, this patch removes the
>> detach_cb_opaque entirely from the code.
>>
>> The reason is that the drc->detach_cb_opaque object can't be
>> restored in the post load of the upcoming DRC migration and no detach
>> callbacks actually need this opaque. 'spapr_core_release' is
>> receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is receiving
>> a phb object as opaque but is't using it. These were trivial removal
>> cases.
>>
>> However, the LM removal callback 'spapr_lmb_release' is receiving
>> and using the opaque object, a 'sPAPRDIMMState' struct. This struct
>> holds the number of LMBs the DIMM object contains and the callback
>> was using this counter as a countdown to check if all LMB DRCs were
>> release before proceeding to the DIMM unplug. To remove the need of
>> this callback we have choices such as:
>>
>> - migrate the 'sPAPRDIMMState' struct. This would require creating a
>> QTAILQ to store all DIMMStates and an additional 'dimm_id' field to
>> associate the DIMMState with the DIMM object. We could attach this
>> QTAILQ to the 'sPAPRPHBState' and retrieve it later in the callback.
>>
>> - fetch the state of the LMB DRCs directly by scanning the state of
>> them and, if all of them are released, proceed with the DIMM unplug.
>>
>> The second approach was chosen. The new 'spapr_all_lmbs_drcs_released'
>> function scans all LMBs of a given DIMM device to see if their DRC
>> state are inactive. If all of them are inactive return 'true', 'false'
>> otherwise. This function is being called inside the 'spapr_lmb_release'
>> callback, replacing the role of the 'sPAPRDIMMState' opaque. The
>> 'sPAPRDIMMState' struct was removed from the code given that there are
>> no more uses for it.
>>
>> After all these changes, there are no roles left for the
>> 'detach_cb_opaque'
>> attribute of the 'sPAPRDRConnector' as well, so we can safely remove
>> it from the code too.
>>
>> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
>> ---
>>  hw/ppc/spapr.c             | 46 ++++++++++++++++++++++++++++++
>> +++-------------
>>  hw/ppc/spapr_drc.c         | 16 +++++-----------
>>  hw/ppc/spapr_pci.c         |  4 ++--
>>  include/hw/ppc/spapr_drc.h |  6 ++----
>>  4 files changed, 42 insertions(+), 30 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index bc11757..8b9a6cf 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1887,21 +1887,43 @@ static void spapr_drc_reset(void *opaque)
>>      }
>>  }
>>
>> -typedef struct sPAPRDIMMState {
>> -    uint32_t nr_lmbs;
>> -} sPAPRDIMMState;
>> +static bool spapr_all_lmbs_drcs_released(PCDIMMDevice *dimm)
>> +{
>> +    Error *local_err = NULL;
>> +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
>> +    MemoryRegion *mr = ddc->get_memory_region(dimm);
>> +    uint64_t size = memory_region_size(mr);
>> +
>> +    uint64_t addr;
>> +    addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP,
>> &local_err);
>> +    if (local_err) {
>> +        error_propagate(&error_abort, local_err);
>> +        return false;
>> +    }
>> +    uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE;
>>
>> -static void spapr_lmb_release(DeviceState *dev, void *opaque)
>> +    sPAPRDRConnector *drc;
>> +    int i = 0;
>> +    for (i = 0; i < nr_lmbs; i++) {
>> +        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
>> +                addr / SPAPR_MEMORY_BLOCK_SIZE);
>> +        g_assert(drc);
>> +        if (drc->indicator_state != SPAPR_DR_INDICATOR_STATE_INACTIVE) {
>> +            return false;
>> +        }
>> +        addr += SPAPR_MEMORY_BLOCK_SIZE;
>> +    }
>> +    return true;
>> +}
>> +
>> +static void spapr_lmb_release(DeviceState *dev)
>>  {
>> -    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
>>      HotplugHandler *hotplug_ctrl;
>>
>> -    if (--ds->nr_lmbs) {
>> +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
>>          return;
>>      }
>>
>
> I am concerned about the number of times we walk the DRC list
> corresponding to each DIMM device. When a DIMM device is being removed,
> spapr_lmb_release() will be invoked for each of the LMBs of that DIMM. Now
> in this scheme, we end up walking through all the DRC objects of the DIMM
> from every LMB's release function.
>
>
> Hi Bharata,
>
>
> I agree, this is definitely a poorer performance than simply decrementing
> ds->nr_lmbs.
> The reasons why I went on with it:
>
> - hot unplug isn't an operation that happens too often, so it's not
> terrible
> to have a delay increase here;
>
> - it didn't increased the unplug delay in an human noticeable way, at
> least in
> my tests;
>
> - apart from migrating the information, there is nothing much we can do in
> the
> callback side about it. The callback isn't aware of the current state of
> the DIMM
> removal process, so the scanning is required every time.
>
>
> All that said, assuming that the process of DIMM removal will always go
> through
> 'spapr_del_lmbs', why do we need this callback? Can't we simply do
> something
> like this in spapr_del_lmbs?
>
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index cd42449..e443fea 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2734,6 +2734,20 @@ static void spapr_del_lmbs(DeviceState *dev,
> uint64_t addr_start, uint64_t size,
>          addr += SPAPR_MEMORY_BLOCK_SIZE;
>      }
>
> +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
> +        // something went wrong in the removal of the LMBs.
> +        // propagate error and return
> +        throw_error_code;
> +        return;
> +    }
>

spapr_del_lmbs() is called from ->unplug_request(). Here we notify the
guest about the unplug request. We have to wait till the guest gives us a
go ahead so that we can cleanup the DIMM device. The cleanup is done as
part of release callback (spapr_lmb_release) at which point we are sure
that the given LMB has been indeed removed by the guest.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-05-03  3:26       ` Bharata B Rao
@ 2017-05-03 13:56         ` Daniel Henrique Barboza
  2017-05-03 20:33           ` Daniel Henrique Barboza
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Henrique Barboza @ 2017-05-03 13:56 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: qemu-ppc, qemu-devel, David Gibson



On 05/03/2017 12:26 AM, Bharata B Rao wrote:
> On Tue, May 2, 2017 at 1:13 PM, Daniel Henrique Barboza 
> <danielhb@linux.vnet.ibm.com <mailto:danielhb@linux.vnet.ibm.com>> wrote:
>
>
>
>     On 05/02/2017 12:40 AM, Bharata B Rao wrote:
>>     On Sun, Apr 30, 2017 at 10:55 PM, Daniel Henrique Barboza
>>     <danielhb@linux.vnet.ibm.com
>>     <mailto:danielhb@linux.vnet.ibm.com>> wrote:
>>
>>         Following up the previous detach_cb change, this patch
>>         removes the
>>         detach_cb_opaque entirely from the code.
>>
>>         The reason is that the drc->detach_cb_opaque object can't be
>>         restored in the post load of the upcoming DRC migration and
>>         no detach
>>         callbacks actually need this opaque. 'spapr_core_release' is
>>         receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is
>>         receiving
>>         a phb object as opaque but is't using it. These were trivial
>>         removal
>>         cases.
>>
>>         However, the LM removal callback 'spapr_lmb_release' is receiving
>>         and using the opaque object, a 'sPAPRDIMMState' struct. This
>>         struct
>>         holds the number of LMBs the DIMM object contains and the
>>         callback
>>         was using this counter as a countdown to check if all LMB
>>         DRCs were
>>         release before proceeding to the DIMM unplug. To remove the
>>         need of
>>         this callback we have choices such as:
>>
>>         - migrate the 'sPAPRDIMMState' struct. This would require
>>         creating a
>>         QTAILQ to store all DIMMStates and an additional 'dimm_id'
>>         field to
>>         associate the DIMMState with the DIMM object. We could attach
>>         this
>>         QTAILQ to the 'sPAPRPHBState' and retrieve it later in the
>>         callback.
>>
>>         - fetch the state of the LMB DRCs directly by scanning the
>>         state of
>>         them and, if all of them are released, proceed with the DIMM
>>         unplug.
>>
>>         The second approach was chosen. The new
>>         'spapr_all_lmbs_drcs_released'
>>         function scans all LMBs of a given DIMM device to see if
>>         their DRC
>>         state are inactive. If all of them are inactive return
>>         'true', 'false'
>>         otherwise. This function is being called inside the
>>         'spapr_lmb_release'
>>         callback, replacing the role of the 'sPAPRDIMMState' opaque. The
>>         'sPAPRDIMMState' struct was removed from the code given that
>>         there are
>>         no more uses for it.
>>
>>         After all these changes, there are no roles left for the
>>         'detach_cb_opaque'
>>         attribute of the 'sPAPRDRConnector' as well, so we can safely
>>         remove
>>         it from the code too.
>>
>>         Signed-off-by: Daniel Henrique Barboza
>>         <danielhb@linux.vnet.ibm.com
>>         <mailto:danielhb@linux.vnet.ibm.com>>
>>         ---
>>          hw/ppc/spapr.c             | 46
>>         +++++++++++++++++++++++++++++++++-------------
>>          hw/ppc/spapr_drc.c         | 16 +++++-----------
>>          hw/ppc/spapr_pci.c         |  4 ++--
>>          include/hw/ppc/spapr_drc.h |  6 ++----
>>          4 files changed, 42 insertions(+), 30 deletions(-)
>>
>>         diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>         index bc11757..8b9a6cf 100644
>>         --- a/hw/ppc/spapr.c
>>         +++ b/hw/ppc/spapr.c
>>         @@ -1887,21 +1887,43 @@ static void spapr_drc_reset(void *opaque)
>>              }
>>          }
>>
>>         -typedef struct sPAPRDIMMState {
>>         -    uint32_t nr_lmbs;
>>         -} sPAPRDIMMState;
>>         +static bool spapr_all_lmbs_drcs_released(PCDIMMDevice *dimm)
>>         +{
>>         +    Error *local_err = NULL;
>>         +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
>>         +    MemoryRegion *mr = ddc->get_memory_region(dimm);
>>         +    uint64_t size = memory_region_size(mr);
>>         +
>>         +    uint64_t addr;
>>         +    addr = object_property_get_int(OBJECT(dimm),
>>         PC_DIMM_ADDR_PROP, &local_err);
>>         +    if (local_err) {
>>         +        error_propagate(&error_abort, local_err);
>>         +        return false;
>>         +    }
>>         +    uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE;
>>
>>         -static void spapr_lmb_release(DeviceState *dev, void *opaque)
>>         +    sPAPRDRConnector *drc;
>>         +    int i = 0;
>>         +    for (i = 0; i < nr_lmbs; i++) {
>>         +        drc =
>>         spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
>>         +                addr / SPAPR_MEMORY_BLOCK_SIZE);
>>         +        g_assert(drc);
>>         +        if (drc->indicator_state !=
>>         SPAPR_DR_INDICATOR_STATE_INACTIVE) {
>>         +            return false;
>>         +        }
>>         +        addr += SPAPR_MEMORY_BLOCK_SIZE;
>>         +    }
>>         +    return true;
>>         +}
>>         +
>>         +static void spapr_lmb_release(DeviceState *dev)
>>          {
>>         -    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
>>              HotplugHandler *hotplug_ctrl;
>>
>>         -    if (--ds->nr_lmbs) {
>>         +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
>>                  return;
>>              }
>>
>>
>>     I am concerned about the number of times we walk the DRC list
>>     corresponding to each DIMM device. When a DIMM device is being
>>     removed, spapr_lmb_release() will be invoked for each of the LMBs
>>     of that DIMM. Now in this scheme, we end up walking through all
>>     the DRC objects of the DIMM from every LMB's release function.
>
>     Hi Bharata,
>
>
>     I agree, this is definitely a poorer performance than simply
>     decrementing ds->nr_lmbs.
>     The reasons why I went on with it:
>
>     - hot unplug isn't an operation that happens too often, so it's
>     not terrible
>     to have a delay increase here;
>
>     - it didn't increased the unplug delay in an human noticeable way,
>     at least in
>     my tests;
>
>     - apart from migrating the information, there is nothing much we
>     can do in the
>     callback side about it. The callback isn't aware of the current
>     state of the DIMM
>     removal process, so the scanning is required every time.
>
>
>     All that said, assuming that the process of DIMM removal will
>     always go through
>     'spapr_del_lmbs', why do we need this callback? Can't we simply do
>     something
>     like this in spapr_del_lmbs?
>
>
>     diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>     index cd42449..e443fea 100644
>     --- a/hw/ppc/spapr.c
>     +++ b/hw/ppc/spapr.c
>     @@ -2734,6 +2734,20 @@ static void spapr_del_lmbs(DeviceState
>     *dev, uint64_t addr_start, uint64_t size,
>              addr += SPAPR_MEMORY_BLOCK_SIZE;
>          }
>
>     +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
>     +        // something went wrong in the removal of the LMBs.
>     +        // propagate error and return
>     +        throw_error_code;
>     +        return;
>     +    }
>
>
> spapr_del_lmbs() is called from ->unplug_request(). Here we notify the 
> guest about the unplug request. We have to wait till the guest gives 
> us a go ahead so that we can cleanup the DIMM device. The cleanup is 
> done as part of release callback (spapr_lmb_release) at which point we 
> are sure that the given LMB has been indeed removed by the guest.

I wasn't clear enough in my last comment. Let me rephrase. Is there any 
other use for
the 'spapr_lmb_release' callback function other than being called by the 
spapr_del_lmbs()
in the flow you've stated above? Searching the master code now I've found:


$ grep -R 'spapr_lmb_release' .
./spapr.c:static void spapr_lmb_release(DeviceState *dev, void *opaque)
./spapr.c:        drck->detach(drc, dev, spapr_lmb_release, ds, errp);


Note that all the callback is doing is asserting that a nr_lmb counter 
will be zero after
a decrement and, if true,  execute the following:


     hotplug_ctrl = qdev_get_hotplug_handler(dev);
     hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);


So, if the callback spapr_lmb_release is only being called in the 
following for loop of spapr_del_lmbs()
to clean up each LMB DRC, can't we get rid of it and do the following 
after this for loop?

     for (i = 0; i < nr_lmbs; i++) {
         drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
                 addr / SPAPR_MEMORY_BLOCK_SIZE);
         g_assert(drc);

         drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
         drck->detach(drc, dev, ds, errp);
         addr += SPAPR_MEMORY_BLOCK_SIZE;
     }

     if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
         // All LMBs were cleared, proceed with detach
         hotplug_ctrl = qdev_get_hotplug_handler(dev);
         hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
    }
    // proceed with spapr_del_lmbs code


Doesn't this code does exactly the same thing that the callback does 
today? Note that we can
even use that conditional to block the remaining spapr_del_lmbs code 
from executing if the
LMBs weren't properly cleansed - something that today isn't done.


If removing this callback is too problematic or can somehow cause 
problems that I am unable to
foresee, then the alternative would be to either deal with the scanning 
inside the callback
(as it is being done in this patch) or migrate the nr_lmbs information 
for late retrieval in
the callback. I am fine with any alternative, we just need to agree on 
what makes more
sense.


Daniel

>
> Regards,
> Bharata.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] hw/ppc: setting spapr_drc_detach_cb in spapr_dr_connector_new
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 1/5] hw/ppc: setting spapr_drc_detach_cb in spapr_dr_connector_new Daniel Henrique Barboza
@ 2017-05-03 14:01   ` Laurent Vivier
  2017-05-04  5:33   ` David Gibson
  1 sibling, 0 replies; 20+ messages in thread
From: Laurent Vivier @ 2017-05-03 14:01 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel; +Cc: qemu-ppc, david

On 30/04/2017 19:25, Daniel Henrique Barboza wrote:
> The idea of moving the detach callback functions to the constructor
> of the dr_connector is to set them statically at init time, avoiding
> any post-load hooks to restore it (after a migration, for example).
> 
> Summary of changes:
> 
> - hw/ppc/spapr_drc.c and include/hw/ppc/spapr_drc.h:
>     *  spapr_dr_connector_new() now has an additional parameter,
> spapr_drc_detach_cb *detach_cb
>     *  'spapr_drc_detach_cb *detach_cb' parameter was removed of
> the detach function pointer in sPAPRDRConnectorClass
> 
> - hw/ppc/spapr_pci.c:
>     * the callback 'spapr_phb_remove_pci_device_cb' is now passed
> as a parameter in 'spapr_dr_connector_new' instead of 'drck->detach()'
> 
> - hw/ppc/spapr.c:
>     * 'spapr_create_lmb_dr_connectors' now passes the callback
> 'spapr_lmb_release' to 'spapr_dr_connector_new' instead of 'drck-detach()'
>     * 'spapr_init_cpus' now passes the callback 'spapr_core_release'
> to 'spapr_dr_connector_new' instead of 'drck-detach()'
>     * moved the callback functions up in the code so they can be referenced
> by 'spapr_create_lmb_dr_connectors' and 'spapr_init_cpus'
> 
> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c             | 71 +++++++++++++++++++++++-----------------------
>  hw/ppc/spapr_drc.c         | 17 ++++++-----
>  hw/ppc/spapr_pci.c         |  5 ++--
>  include/hw/ppc/spapr_drc.h |  4 +--
>  4 files changed, 49 insertions(+), 48 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 80d12d0..bc11757 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1887,6 +1887,29 @@ static void spapr_drc_reset(void *opaque)
>      }
>  }
>  
> +typedef struct sPAPRDIMMState {
> +    uint32_t nr_lmbs;
> +} sPAPRDIMMState;
> +
> +static void spapr_lmb_release(DeviceState *dev, void *opaque)
> +{
> +    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
> +    HotplugHandler *hotplug_ctrl;
> +
> +    if (--ds->nr_lmbs) {
> +        return;
> +    }
> +
> +    g_free(ds);
> +
> +    /*
> +     * Now that all the LMBs have been removed by the guest, call the
> +     * pc-dimm unplug handler to cleanup up the pc-dimm device.
> +     */
> +    hotplug_ctrl = qdev_get_hotplug_handler(dev);
> +    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
> +}
> +
>  static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
>  {
>      MachineState *machine = MACHINE(spapr);
> @@ -1900,7 +1923,7 @@ static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
>  
>          addr = i * lmb_size + spapr->hotplug_memory.base;
>          drc = spapr_dr_connector_new(OBJECT(spapr), SPAPR_DR_CONNECTOR_TYPE_LMB,
> -                                     addr/lmb_size);
> +                                     (addr / lmb_size), spapr_lmb_release);

You have added useless parenthesis around "addr / lmb_size".

Reviewed-by: Laurent Vivier <lvivier@redhat.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques Daniel Henrique Barboza
  2017-05-02  3:40   ` Bharata B Rao
@ 2017-05-03 14:33   ` Laurent Vivier
  2017-05-04  7:27     ` David Gibson
  1 sibling, 1 reply; 20+ messages in thread
From: Laurent Vivier @ 2017-05-03 14:33 UTC (permalink / raw)
  To: Daniel Henrique Barboza, qemu-devel; +Cc: qemu-ppc, david

On 30/04/2017 19:25, Daniel Henrique Barboza wrote:
> Following up the previous detach_cb change, this patch removes the
> detach_cb_opaque entirely from the code.
> 
> The reason is that the drc->detach_cb_opaque object can't be
> restored in the post load of the upcoming DRC migration and no detach
> callbacks actually need this opaque. 'spapr_core_release' is
> receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is receiving
> a phb object as opaque but is't using it. These were trivial removal
> cases.
> 
> However, the LM removal callback 'spapr_lmb_release' is receiving
> and using the opaque object, a 'sPAPRDIMMState' struct. This struct
> holds the number of LMBs the DIMM object contains and the callback
> was using this counter as a countdown to check if all LMB DRCs were
> release before proceeding to the DIMM unplug. To remove the need of
> this callback we have choices such as:
> 
> - migrate the 'sPAPRDIMMState' struct. This would require creating a
> QTAILQ to store all DIMMStates and an additional 'dimm_id' field to
> associate the DIMMState with the DIMM object. We could attach this
> QTAILQ to the 'sPAPRPHBState' and retrieve it later in the callback.
> 

Did you think about adding the nr_lmbs into PCDIMMDevice structure?
Or to create a SPAPRPCDIMMDevice with PCDIMMDevice parent_obj and
nr_lmbs field we can retrieve from the DeviceState pointer?

I don't know if it is a good idea or an acceptable way to do that, but
I'd like to know if you have thought about that.

Thanks,
Laurent

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-05-03 13:56         ` Daniel Henrique Barboza
@ 2017-05-03 20:33           ` Daniel Henrique Barboza
  2017-05-04  7:24             ` David Gibson
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Henrique Barboza @ 2017-05-03 20:33 UTC (permalink / raw)
  To: Bharata B Rao; +Cc: qemu-ppc, qemu-devel, David Gibson, Michael Roth

Update: I have talked with Michael Roth about the spapr_release_lmb 
callback, the flow
of the LMB releases and so on. He clarified to me that it is not 
possible to get rid of
the callback and put its code in the spapr_del_lmbs function.

The reason is that the callback is being executed by the guest via 
spapr_rtas.c:rtas_set_indicator(),
which in turn will follow the flow of the DRC releases and eventually 
will hit the spapr_release_lmb
callback, but this will not necessarily happen in the spam of the 
spapr_del_lmbs function. This means that
my idea of putting the cb code in the spapr_del_lmbs and removing the 
callback not possible.

On the other hand, Bharata raised the issue about the scan function in 
the callback being a problem.
My tests with a 1 Gb unplug didn't show any noticeable delay increase 
but in theory we support memory
unplugs of 1 Terabyte. With 1Gb of RAM we need 4 DRCs, so 1 Tb would 
require 4000 DRCs. Then we
would scan through them 4000 times. I don't think the scan inside the 
callback is feasible in this scenario
either.

In the v9 I'll migrate the sPAPRDIMMState structure like Michael Roth 
mentioned somewhere in the
v6 review to use it inside the spapr_lmb_release callback - looks like 
the best option we have.


Thanks,


Daniel



On 05/03/2017 10:56 AM, Daniel Henrique Barboza wrote:
>
>
> On 05/03/2017 12:26 AM, Bharata B Rao wrote:
>> On Tue, May 2, 2017 at 1:13 PM, Daniel Henrique Barboza 
>> <danielhb@linux.vnet.ibm.com <mailto:danielhb@linux.vnet.ibm.com>> 
>> wrote:
>>
>>
>>
>>     On 05/02/2017 12:40 AM, Bharata B Rao wrote:
>>>     On Sun, Apr 30, 2017 at 10:55 PM, Daniel Henrique Barboza
>>>     <danielhb@linux.vnet.ibm.com
>>>     <mailto:danielhb@linux.vnet.ibm.com>> wrote:
>>>
>>>         Following up the previous detach_cb change, this patch
>>>         removes the
>>>         detach_cb_opaque entirely from the code.
>>>
>>>         The reason is that the drc->detach_cb_opaque object can't be
>>>         restored in the post load of the upcoming DRC migration and
>>>         no detach
>>>         callbacks actually need this opaque. 'spapr_core_release' is
>>>         receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is
>>>         receiving
>>>         a phb object as opaque but is't using it. These were trivial
>>>         removal
>>>         cases.
>>>
>>>         However, the LM removal callback 'spapr_lmb_release' is 
>>> receiving
>>>         and using the opaque object, a 'sPAPRDIMMState' struct. This
>>>         struct
>>>         holds the number of LMBs the DIMM object contains and the
>>>         callback
>>>         was using this counter as a countdown to check if all LMB
>>>         DRCs were
>>>         release before proceeding to the DIMM unplug. To remove the
>>>         need of
>>>         this callback we have choices such as:
>>>
>>>         - migrate the 'sPAPRDIMMState' struct. This would require
>>>         creating a
>>>         QTAILQ to store all DIMMStates and an additional 'dimm_id'
>>>         field to
>>>         associate the DIMMState with the DIMM object. We could attach
>>>         this
>>>         QTAILQ to the 'sPAPRPHBState' and retrieve it later in the
>>>         callback.
>>>
>>>         - fetch the state of the LMB DRCs directly by scanning the
>>>         state of
>>>         them and, if all of them are released, proceed with the DIMM
>>>         unplug.
>>>
>>>         The second approach was chosen. The new
>>>         'spapr_all_lmbs_drcs_released'
>>>         function scans all LMBs of a given DIMM device to see if
>>>         their DRC
>>>         state are inactive. If all of them are inactive return
>>>         'true', 'false'
>>>         otherwise. This function is being called inside the
>>>         'spapr_lmb_release'
>>>         callback, replacing the role of the 'sPAPRDIMMState' opaque. 
>>> The
>>>         'sPAPRDIMMState' struct was removed from the code given that
>>>         there are
>>>         no more uses for it.
>>>
>>>         After all these changes, there are no roles left for the
>>>         'detach_cb_opaque'
>>>         attribute of the 'sPAPRDRConnector' as well, so we can safely
>>>         remove
>>>         it from the code too.
>>>
>>>         Signed-off-by: Daniel Henrique Barboza
>>>         <danielhb@linux.vnet.ibm.com
>>>         <mailto:danielhb@linux.vnet.ibm.com>>
>>>         ---
>>>          hw/ppc/spapr.c             | 46
>>>         +++++++++++++++++++++++++++++++++-------------
>>>          hw/ppc/spapr_drc.c         | 16 +++++-----------
>>>          hw/ppc/spapr_pci.c         |  4 ++--
>>>          include/hw/ppc/spapr_drc.h |  6 ++----
>>>          4 files changed, 42 insertions(+), 30 deletions(-)
>>>
>>>         diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>         index bc11757..8b9a6cf 100644
>>>         --- a/hw/ppc/spapr.c
>>>         +++ b/hw/ppc/spapr.c
>>>         @@ -1887,21 +1887,43 @@ static void spapr_drc_reset(void 
>>> *opaque)
>>>              }
>>>          }
>>>
>>>         -typedef struct sPAPRDIMMState {
>>>         -    uint32_t nr_lmbs;
>>>         -} sPAPRDIMMState;
>>>         +static bool spapr_all_lmbs_drcs_released(PCDIMMDevice *dimm)
>>>         +{
>>>         +    Error *local_err = NULL;
>>>         +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
>>>         +    MemoryRegion *mr = ddc->get_memory_region(dimm);
>>>         +    uint64_t size = memory_region_size(mr);
>>>         +
>>>         +    uint64_t addr;
>>>         +    addr = object_property_get_int(OBJECT(dimm),
>>>         PC_DIMM_ADDR_PROP, &local_err);
>>>         +    if (local_err) {
>>>         +        error_propagate(&error_abort, local_err);
>>>         +        return false;
>>>         +    }
>>>         +    uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE;
>>>
>>>         -static void spapr_lmb_release(DeviceState *dev, void *opaque)
>>>         +    sPAPRDRConnector *drc;
>>>         +    int i = 0;
>>>         +    for (i = 0; i < nr_lmbs; i++) {
>>>         +        drc =
>>>         spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
>>>         +                addr / SPAPR_MEMORY_BLOCK_SIZE);
>>>         +        g_assert(drc);
>>>         +        if (drc->indicator_state !=
>>>         SPAPR_DR_INDICATOR_STATE_INACTIVE) {
>>>         +            return false;
>>>         +        }
>>>         +        addr += SPAPR_MEMORY_BLOCK_SIZE;
>>>         +    }
>>>         +    return true;
>>>         +}
>>>         +
>>>         +static void spapr_lmb_release(DeviceState *dev)
>>>          {
>>>         -    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
>>>              HotplugHandler *hotplug_ctrl;
>>>
>>>         -    if (--ds->nr_lmbs) {
>>>         +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
>>>                  return;
>>>              }
>>>
>>>
>>>     I am concerned about the number of times we walk the DRC list
>>>     corresponding to each DIMM device. When a DIMM device is being
>>>     removed, spapr_lmb_release() will be invoked for each of the LMBs
>>>     of that DIMM. Now in this scheme, we end up walking through all
>>>     the DRC objects of the DIMM from every LMB's release function.
>>
>>     Hi Bharata,
>>
>>
>>     I agree, this is definitely a poorer performance than simply
>>     decrementing ds->nr_lmbs.
>>     The reasons why I went on with it:
>>
>>     - hot unplug isn't an operation that happens too often, so it's
>>     not terrible
>>     to have a delay increase here;
>>
>>     - it didn't increased the unplug delay in an human noticeable way,
>>     at least in
>>     my tests;
>>
>>     - apart from migrating the information, there is nothing much we
>>     can do in the
>>     callback side about it. The callback isn't aware of the current
>>     state of the DIMM
>>     removal process, so the scanning is required every time.
>>
>>
>>     All that said, assuming that the process of DIMM removal will
>>     always go through
>>     'spapr_del_lmbs', why do we need this callback? Can't we simply do
>>     something
>>     like this in spapr_del_lmbs?
>>
>>
>>     diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>     index cd42449..e443fea 100644
>>     --- a/hw/ppc/spapr.c
>>     +++ b/hw/ppc/spapr.c
>>     @@ -2734,6 +2734,20 @@ static void spapr_del_lmbs(DeviceState
>>     *dev, uint64_t addr_start, uint64_t size,
>>              addr += SPAPR_MEMORY_BLOCK_SIZE;
>>          }
>>
>>     +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
>>     +        // something went wrong in the removal of the LMBs.
>>     +        // propagate error and return
>>     +        throw_error_code;
>>     +        return;
>>     +    }
>>
>>
>> spapr_del_lmbs() is called from ->unplug_request(). Here we notify 
>> the guest about the unplug request. We have to wait till the guest 
>> gives us a go ahead so that we can cleanup the DIMM device. The 
>> cleanup is done as part of release callback (spapr_lmb_release) at 
>> which point we are sure that the given LMB has been indeed removed by 
>> the guest.
>
> I wasn't clear enough in my last comment. Let me rephrase. Is there 
> any other use for
> the 'spapr_lmb_release' callback function other than being called by 
> the spapr_del_lmbs()
> in the flow you've stated above? Searching the master code now I've 
> found:
>
>
> $ grep -R 'spapr_lmb_release' .
> ./spapr.c:static void spapr_lmb_release(DeviceState *dev, void *opaque)
> ./spapr.c:        drck->detach(drc, dev, spapr_lmb_release, ds, errp);
>
>
> Note that all the callback is doing is asserting that a nr_lmb counter 
> will be zero after
> a decrement and, if true,  execute the following:
>
>
>     hotplug_ctrl = qdev_get_hotplug_handler(dev);
>     hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
>
>
> So, if the callback spapr_lmb_release is only being called in the 
> following for loop of spapr_del_lmbs()
> to clean up each LMB DRC, can't we get rid of it and do the following 
> after this for loop?
>
>     for (i = 0; i < nr_lmbs; i++) {
>         drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
>                 addr / SPAPR_MEMORY_BLOCK_SIZE);
>         g_assert(drc);
>
>         drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>         drck->detach(drc, dev, ds, errp);
>         addr += SPAPR_MEMORY_BLOCK_SIZE;
>     }
>
>     if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
>         // All LMBs were cleared, proceed with detach
>         hotplug_ctrl = qdev_get_hotplug_handler(dev);
>         hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
>    }
>    // proceed with spapr_del_lmbs code
>
>
> Doesn't this code does exactly the same thing that the callback does 
> today? Note that we can
> even use that conditional to block the remaining spapr_del_lmbs code 
> from executing if the
> LMBs weren't properly cleansed - something that today isn't done.
>
>
> If removing this callback is too problematic or can somehow cause 
> problems that I am unable to
> foresee, then the alternative would be to either deal with the 
> scanning inside the callback
> (as it is being done in this patch) or migrate the nr_lmbs information 
> for late retrieval in
> the callback. I am fine with any alternative, we just need to agree on 
> what makes more
> sense.
>
>
> Daniel
>
>>
>> Regards,
>> Bharata.
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] hw/ppc: setting spapr_drc_detach_cb in spapr_dr_connector_new
  2017-04-30 17:25 ` [Qemu-devel] [PATCH 1/5] hw/ppc: setting spapr_drc_detach_cb in spapr_dr_connector_new Daniel Henrique Barboza
  2017-05-03 14:01   ` Laurent Vivier
@ 2017-05-04  5:33   ` David Gibson
  2017-05-04 12:57     ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
  1 sibling, 1 reply; 20+ messages in thread
From: David Gibson @ 2017-05-04  5:33 UTC (permalink / raw)
  To: Daniel Henrique Barboza; +Cc: qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 12805 bytes --]

On Sun, Apr 30, 2017 at 02:25:43PM -0300, Daniel Henrique Barboza wrote:
> The idea of moving the detach callback functions to the constructor
> of the dr_connector is to set them statically at init time, avoiding
> any post-load hooks to restore it (after a migration, for example).
> 
> Summary of changes:
> 
> - hw/ppc/spapr_drc.c and include/hw/ppc/spapr_drc.h:
>     *  spapr_dr_connector_new() now has an additional parameter,
> spapr_drc_detach_cb *detach_cb
>     *  'spapr_drc_detach_cb *detach_cb' parameter was removed of
> the detach function pointer in sPAPRDRConnectorClass
> 
> - hw/ppc/spapr_pci.c:
>     * the callback 'spapr_phb_remove_pci_device_cb' is now passed
> as a parameter in 'spapr_dr_connector_new' instead of 'drck->detach()'
> 
> - hw/ppc/spapr.c:
>     * 'spapr_create_lmb_dr_connectors' now passes the callback
> 'spapr_lmb_release' to 'spapr_dr_connector_new' instead of 'drck-detach()'
>     * 'spapr_init_cpus' now passes the callback 'spapr_core_release'
> to 'spapr_dr_connector_new' instead of 'drck-detach()'
>     * moved the callback functions up in the code so they can be referenced
> by 'spapr_create_lmb_dr_connectors' and 'spapr_init_cpus'
> 
> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>

So, the patch looks correct, but the approach bothers me a bit.  The
DRC type and the detach callback are essentially redundant information
- the callback still gets stored in the instance, which doesn't make a
whole lot of sense.

Ideally we'd use QOM subtypes of the base DRC type and put the
callback in the drck, but I suspect that will raise some other
complications, so I'm ok with postponing that.

In the meantime, I think we'd be better of doing an explicit switch on
the DRC type when we want to call the detach function, rather than
storing a function pointer at all.  It's kind of ugly, but we already
have a bunch of nasty switches on the type, so I don't think it's
really any uglier than what we have.


> ---
>  hw/ppc/spapr.c             | 71 +++++++++++++++++++++++-----------------------
>  hw/ppc/spapr_drc.c         | 17 ++++++-----
>  hw/ppc/spapr_pci.c         |  5 ++--
>  include/hw/ppc/spapr_drc.h |  4 +--
>  4 files changed, 49 insertions(+), 48 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 80d12d0..bc11757 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1887,6 +1887,29 @@ static void spapr_drc_reset(void *opaque)
>      }
>  }
>  
> +typedef struct sPAPRDIMMState {
> +    uint32_t nr_lmbs;
> +} sPAPRDIMMState;
> +
> +static void spapr_lmb_release(DeviceState *dev, void *opaque)
> +{
> +    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
> +    HotplugHandler *hotplug_ctrl;
> +
> +    if (--ds->nr_lmbs) {
> +        return;
> +    }
> +
> +    g_free(ds);
> +
> +    /*
> +     * Now that all the LMBs have been removed by the guest, call the
> +     * pc-dimm unplug handler to cleanup up the pc-dimm device.
> +     */
> +    hotplug_ctrl = qdev_get_hotplug_handler(dev);
> +    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
> +}
> +
>  static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
>  {
>      MachineState *machine = MACHINE(spapr);
> @@ -1900,7 +1923,7 @@ static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
>  
>          addr = i * lmb_size + spapr->hotplug_memory.base;
>          drc = spapr_dr_connector_new(OBJECT(spapr), SPAPR_DR_CONNECTOR_TYPE_LMB,
> -                                     addr/lmb_size);
> +                                     (addr / lmb_size), spapr_lmb_release);
>          qemu_register_reset(spapr_drc_reset, drc);
>      }
>  }
> @@ -1956,6 +1979,14 @@ static CPUArchId *spapr_find_cpu_slot(MachineState *ms, uint32_t id, int *idx)
>      return &ms->possible_cpus->cpus[index];
>  }
>  
> +static void spapr_core_release(DeviceState *dev, void *opaque)
> +{
> +    HotplugHandler *hotplug_ctrl;
> +
> +    hotplug_ctrl = qdev_get_hotplug_handler(dev);
> +    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
> +}
> +
>  static void spapr_init_cpus(sPAPRMachineState *spapr)
>  {
>      MachineState *machine = MACHINE(spapr);
> @@ -1998,7 +2029,8 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
>              sPAPRDRConnector *drc =
>                  spapr_dr_connector_new(OBJECT(spapr),
>                                         SPAPR_DR_CONNECTOR_TYPE_CPU,
> -                                       (core_id / smp_threads) * smt);
> +                                       (core_id / smp_threads) * smt,
> +                                       spapr_core_release);
>  
>              qemu_register_reset(spapr_drc_reset, drc);
>          }
> @@ -2596,29 +2628,6 @@ out:
>      error_propagate(errp, local_err);
>  }
>  
> -typedef struct sPAPRDIMMState {
> -    uint32_t nr_lmbs;
> -} sPAPRDIMMState;
> -
> -static void spapr_lmb_release(DeviceState *dev, void *opaque)
> -{
> -    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
> -    HotplugHandler *hotplug_ctrl;
> -
> -    if (--ds->nr_lmbs) {
> -        return;
> -    }
> -
> -    g_free(ds);
> -
> -    /*
> -     * Now that all the LMBs have been removed by the guest, call the
> -     * pc-dimm unplug handler to cleanup up the pc-dimm device.
> -     */
> -    hotplug_ctrl = qdev_get_hotplug_handler(dev);
> -    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
> -}
> -
>  static void spapr_del_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
>                             Error **errp)
>  {
> @@ -2636,7 +2645,7 @@ static void spapr_del_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
>          g_assert(drc);
>  
>          drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> -        drck->detach(drc, dev, spapr_lmb_release, ds, errp);
> +        drck->detach(drc, dev, ds, errp);
>          addr += SPAPR_MEMORY_BLOCK_SIZE;
>      }
>  
> @@ -2712,14 +2721,6 @@ static void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>      object_unparent(OBJECT(dev));
>  }
>  
> -static void spapr_core_release(DeviceState *dev, void *opaque)
> -{
> -    HotplugHandler *hotplug_ctrl;
> -
> -    hotplug_ctrl = qdev_get_hotplug_handler(dev);
> -    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
> -}
> -
>  static
>  void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                 Error **errp)
> @@ -2745,7 +2746,7 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>      g_assert(drc);
>  
>      drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> -    drck->detach(drc, dev, spapr_core_release, NULL, &local_err);
> +    drck->detach(drc, dev, NULL, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index a1cdc87..afe5d82 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -99,8 +99,8 @@ static uint32_t set_isolation_state(sPAPRDRConnector *drc,
>          if (drc->awaiting_release) {
>              if (drc->configured) {
>                  trace_spapr_drc_set_isolation_state_finalizing(get_index(drc));
> -                drck->detach(drc, DEVICE(drc->dev), drc->detach_cb,
> -                             drc->detach_cb_opaque, NULL);
> +                drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque,
> +                                         NULL);
>              } else {
>                  trace_spapr_drc_set_isolation_state_deferring(get_index(drc));
>              }
> @@ -153,8 +153,8 @@ static uint32_t set_allocation_state(sPAPRDRConnector *drc,
>          if (drc->awaiting_release &&
>              drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) {
>              trace_spapr_drc_set_allocation_state_finalizing(get_index(drc));
> -            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb,
> -                         drc->detach_cb_opaque, NULL);
> +            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque,
> +                         NULL);
>          } else if (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) {
>              drc->awaiting_allocation = false;
>          }
> @@ -405,12 +405,10 @@ static void attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
>  }
>  
>  static void detach(sPAPRDRConnector *drc, DeviceState *d,
> -                   spapr_drc_detach_cb *detach_cb,
>                     void *detach_cb_opaque, Error **errp)
>  {
>      trace_spapr_drc_detach(get_index(drc));
>  
> -    drc->detach_cb = detach_cb;
>      drc->detach_cb_opaque = detach_cb_opaque;
>  
>      /* if we've signalled device presence to the guest, or if the guest
> @@ -498,8 +496,7 @@ static void reset(DeviceState *d)
>           * force removal if we are
>           */
>          if (drc->awaiting_release) {
> -            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb,
> -                         drc->detach_cb_opaque, NULL);
> +            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque, NULL);
>          }
>  
>          /* non-PCI devices may be awaiting a transition to UNUSABLE */
> @@ -566,7 +563,8 @@ static void unrealize(DeviceState *d, Error **errp)
>  
>  sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
>                                           sPAPRDRConnectorType type,
> -                                         uint32_t id)
> +                                         uint32_t id,
> +                                         spapr_drc_detach_cb *detach_cb)
>  {
>      sPAPRDRConnector *drc =
>          SPAPR_DR_CONNECTOR(object_new(TYPE_SPAPR_DR_CONNECTOR));
> @@ -577,6 +575,7 @@ sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
>      drc->type = type;
>      drc->id = id;
>      drc->owner = owner;
> +    drc->detach_cb = detach_cb;
>      prop_name = g_strdup_printf("dr-connector[%"PRIu32"]", get_index(drc));
>      object_property_add_child(owner, prop_name, OBJECT(drc), NULL);
>      object_property_set_bool(OBJECT(drc), true, "realized", NULL);
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index e7567e2..935e65e 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1392,7 +1392,7 @@ static void spapr_phb_remove_pci_device(sPAPRDRConnector *drc,
>  {
>      sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>  
> -    drck->detach(drc, DEVICE(pdev), spapr_phb_remove_pci_device_cb, phb, errp);
> +    drck->detach(drc, DEVICE(pdev), phb, errp);
>  }
>  
>  static sPAPRDRConnector *spapr_phb_get_pci_func_drc(sPAPRPHBState *phb,
> @@ -1764,7 +1764,8 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>          for (i = 0; i < PCI_SLOT_MAX * 8; i++) {
>              spapr_dr_connector_new(OBJECT(phb),
>                                     SPAPR_DR_CONNECTOR_TYPE_PCI,
> -                                   (sphb->index << 16) | i);
> +                                   (sphb->index << 16) | i,
> +                                   spapr_phb_remove_pci_device_cb);
>          }
>      }
>  
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index 5524247..0a2c173 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -189,7 +189,6 @@ typedef struct sPAPRDRConnectorClass {
>      void (*attach)(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
>                     int fdt_start_offset, bool coldplug, Error **errp);
>      void (*detach)(sPAPRDRConnector *drc, DeviceState *d,
> -                   spapr_drc_detach_cb *detach_cb,
>                     void *detach_cb_opaque, Error **errp);
>      bool (*release_pending)(sPAPRDRConnector *drc);
>      void (*set_signalled)(sPAPRDRConnector *drc);
> @@ -197,7 +196,8 @@ typedef struct sPAPRDRConnectorClass {
>  
>  sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
>                                           sPAPRDRConnectorType type,
> -                                         uint32_t id);
> +                                         uint32_t id,
> +                                         spapr_drc_detach_cb *detach_cb);
>  sPAPRDRConnector *spapr_dr_connector_by_index(uint32_t index);
>  sPAPRDRConnector *spapr_dr_connector_by_id(sPAPRDRConnectorType type,
>                                             uint32_t id);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-05-02  7:43     ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
  2017-05-03  3:26       ` Bharata B Rao
@ 2017-05-04  7:20       ` David Gibson
  2017-05-05  4:38         ` Bharata B Rao
  1 sibling, 1 reply; 20+ messages in thread
From: David Gibson @ 2017-05-04  7:20 UTC (permalink / raw)
  To: Daniel Henrique Barboza; +Cc: Bharata B Rao, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 8365 bytes --]

On Tue, May 02, 2017 at 04:43:51AM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 05/02/2017 12:40 AM, Bharata B Rao wrote:
> > On Sun, Apr 30, 2017 at 10:55 PM, Daniel Henrique Barboza
> > <danielhb@linux.vnet.ibm.com <mailto:danielhb@linux.vnet.ibm.com>>
> > wrote:
> > 
> >     Following up the previous detach_cb change, this patch removes the
> >     detach_cb_opaque entirely from the code.
> > 
> >     The reason is that the drc->detach_cb_opaque object can't be
> >     restored in the post load of the upcoming DRC migration and no detach
> >     callbacks actually need this opaque. 'spapr_core_release' is
> >     receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is receiving
> >     a phb object as opaque but is't using it. These were trivial removal
> >     cases.
> > 
> >     However, the LM removal callback 'spapr_lmb_release' is receiving
> >     and using the opaque object, a 'sPAPRDIMMState' struct. This struct
> >     holds the number of LMBs the DIMM object contains and the callback
> >     was using this counter as a countdown to check if all LMB DRCs were
> >     release before proceeding to the DIMM unplug. To remove the need of
> >     this callback we have choices such as:
> > 
> >     - migrate the 'sPAPRDIMMState' struct. This would require creating a
> >     QTAILQ to store all DIMMStates and an additional 'dimm_id' field to
> >     associate the DIMMState with the DIMM object. We could attach this
> >     QTAILQ to the 'sPAPRPHBState' and retrieve it later in the callback.
> > 
> >     - fetch the state of the LMB DRCs directly by scanning the state of
> >     them and, if all of them are released, proceed with the DIMM unplug.
> > 
> >     The second approach was chosen. The new 'spapr_all_lmbs_drcs_released'
> >     function scans all LMBs of a given DIMM device to see if their DRC
> >     state are inactive. If all of them are inactive return 'true', 'false'
> >     otherwise. This function is being called inside the
> >     'spapr_lmb_release'
> >     callback, replacing the role of the 'sPAPRDIMMState' opaque. The
> >     'sPAPRDIMMState' struct was removed from the code given that there are
> >     no more uses for it.
> > 
> >     After all these changes, there are no roles left for the
> >     'detach_cb_opaque'
> >     attribute of the 'sPAPRDRConnector' as well, so we can safely remove
> >     it from the code too.
> > 
> >     Signed-off-by: Daniel Henrique Barboza
> >     <danielhb@linux.vnet.ibm.com <mailto:danielhb@linux.vnet.ibm.com>>
> >     ---
> >      hw/ppc/spapr.c             | 46
> >     +++++++++++++++++++++++++++++++++-------------
> >      hw/ppc/spapr_drc.c         | 16 +++++-----------
> >      hw/ppc/spapr_pci.c         |  4 ++--
> >      include/hw/ppc/spapr_drc.h |  6 ++----
> >      4 files changed, 42 insertions(+), 30 deletions(-)
> > 
> >     diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >     index bc11757..8b9a6cf 100644
> >     --- a/hw/ppc/spapr.c
> >     +++ b/hw/ppc/spapr.c
> >     @@ -1887,21 +1887,43 @@ static void spapr_drc_reset(void *opaque)
> >          }
> >      }
> > 
> >     -typedef struct sPAPRDIMMState {
> >     -    uint32_t nr_lmbs;
> >     -} sPAPRDIMMState;
> >     +static bool spapr_all_lmbs_drcs_released(PCDIMMDevice *dimm)
> >     +{
> >     +    Error *local_err = NULL;
> >     +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
> >     +    MemoryRegion *mr = ddc->get_memory_region(dimm);
> >     +    uint64_t size = memory_region_size(mr);
> >     +
> >     +    uint64_t addr;
> >     +    addr = object_property_get_int(OBJECT(dimm),
> >     PC_DIMM_ADDR_PROP, &local_err);
> >     +    if (local_err) {
> >     +        error_propagate(&error_abort, local_err);
> >     +        return false;
> >     +    }
> >     +    uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE;
> > 
> >     -static void spapr_lmb_release(DeviceState *dev, void *opaque)
> >     +    sPAPRDRConnector *drc;
> >     +    int i = 0;
> >     +    for (i = 0; i < nr_lmbs; i++) {
> >     +        drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> >     +                addr / SPAPR_MEMORY_BLOCK_SIZE);
> >     +        g_assert(drc);
> >     +        if (drc->indicator_state !=
> >     SPAPR_DR_INDICATOR_STATE_INACTIVE) {
> >     +            return false;
> >     +        }
> >     +        addr += SPAPR_MEMORY_BLOCK_SIZE;
> >     +    }
> >     +    return true;
> >     +}
> >     +
> >     +static void spapr_lmb_release(DeviceState *dev)
> >      {
> >     -    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
> >          HotplugHandler *hotplug_ctrl;
> > 
> >     -    if (--ds->nr_lmbs) {
> >     +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
> >              return;
> >          }
> > 
> > 
> > I am concerned about the number of times we walk the DRC list
> > corresponding to each DIMM device. When a DIMM device is being removed,
> > spapr_lmb_release() will be invoked for each of the LMBs of that DIMM.
> > Now in this scheme, we end up walking through all the DRC objects of the
> > DIMM from every LMB's release function.
> 
> Hi Bharata,
> 
> 
> I agree, this is definitely a poorer performance than simply decrementing
> ds->nr_lmbs.
> The reasons why I went on with it:
> 
> - hot unplug isn't an operation that happens too often, so it's not terrible
> to have a delay increase here;

So, if it were just a fixed increase in the time, I'd agree.  But IIUC
from the above, this basically makes the removal O(N^2) in the size of
the DIMM, which sounds like it could get bad to me.

> - it didn't increased the unplug delay in an human noticeable way, at least
> in
> my tests;

Right, but what size of DIMM did you use?

> - apart from migrating the information, there is nothing much we can do in
> the
> callback side about it. The callback isn't aware of the current state of the
> DIMM
> removal process, so the scanning is required every time.

Well we could potentially use "cached" state here.  In the normal way
of things we use a value like this, but in the case of migration we
re-generate the information with a full scan.

> All that said, assuming that the process of DIMM removal will always go
> through
> 'spapr_del_lmbs', why do we need this callback? Can't we simply do something
> like this in spapr_del_lmbs?
> 
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index cd42449..e443fea 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2734,6 +2734,20 @@ static void spapr_del_lmbs(DeviceState *dev, uint64_t
> addr_start, uint64_t size,
>          addr += SPAPR_MEMORY_BLOCK_SIZE;
>      }
> 
> +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
> +        // something went wrong in the removal of the LMBs.
> +        // propagate error and return
> +        throw_error_code;
> +        return;
> +    }
> +
> +    /*
> +     * Now that all the LMBs have been removed by the guest, call the
> +     * pc-dimm unplug handler to cleanup up the pc-dimm device.
> +     */
> +    hotplug_ctrl = qdev_get_hotplug_handler(dev);
> +    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
> +
>      drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
>                                     addr_start / SPAPR_MEMORY_BLOCK_SIZE);
>      drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> 
> 
> With this change we run the LMB scanning once at the end of the for
> loop inside spapr_del_lmbs to make sure everything went fine (something
> that the current code  isn't doing, there are operationsvbeing done
> afterwards
> without checking if the LMB removals actually happened).
> 
> If something went wrong, propagate an error. If not, proceed with the
> removal
> of the DIMM device and the remaining spapr_del_lmbs code. spapr_lmb_release
> can
> be safely removed from the code after that.
> 
> 
> What do you think?

That seems like a good idea, independent of anything else.  But I may
not be remembering how the LMB removal paths all work.  Bharata?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-05-03 20:33           ` Daniel Henrique Barboza
@ 2017-05-04  7:24             ` David Gibson
  2017-05-04 16:30               ` Daniel Henrique Barboza
  0 siblings, 1 reply; 20+ messages in thread
From: David Gibson @ 2017-05-04  7:24 UTC (permalink / raw)
  To: Daniel Henrique Barboza; +Cc: Bharata B Rao, qemu-ppc, qemu-devel, Michael Roth

[-- Attachment #1: Type: text/plain, Size: 13240 bytes --]

On Wed, May 03, 2017 at 05:33:54PM -0300, Daniel Henrique Barboza wrote:
> Update: I have talked with Michael Roth about the spapr_release_lmb
> callback, the flow
> of the LMB releases and so on. He clarified to me that it is not possible to
> get rid of
> the callback and put its code in the spapr_del_lmbs function.
> 
> The reason is that the callback is being executed by the guest via
> spapr_rtas.c:rtas_set_indicator(),
> which in turn will follow the flow of the DRC releases and eventually will
> hit the spapr_release_lmb
> callback, but this will not necessarily happen in the spam of the
> spapr_del_lmbs function. This means that
> my idea of putting the cb code in the spapr_del_lmbs and removing the
> callback not possible.
> 
> On the other hand, Bharata raised the issue about the scan function in the
> callback being a problem.
> My tests with a 1 Gb unplug didn't show any noticeable delay increase but in
> theory we support memory
> unplugs of 1 Terabyte. With 1Gb of RAM we need 4 DRCs, so 1 Tb would require
> 4000 DRCs. Then we
> would scan through them 4000 times. I don't think the scan inside the
> callback is feasible in this scenario
> either.
> 
> In the v9 I'll migrate the sPAPRDIMMState structure like Michael Roth
> mentioned somewhere in the
> v6 review to use it inside the spapr_lmb_release callback - looks like the
> best option we have.

I don't think you have to migrate that actual structure, just the fact
that there's an in-progress removal (which you might be able to derive
from existing migrated state anyway).  You can reconstruct the nr_lmbs
state with a full scan on post_load().  That way it's only O(N) not
O(N^2), and only in the case that a migration occurs mid-unplug.

> 
> 
> Thanks,
> 
> 
> Daniel
> 
> 
> 
> On 05/03/2017 10:56 AM, Daniel Henrique Barboza wrote:
> > 
> > 
> > On 05/03/2017 12:26 AM, Bharata B Rao wrote:
> > > On Tue, May 2, 2017 at 1:13 PM, Daniel Henrique Barboza
> > > <danielhb@linux.vnet.ibm.com <mailto:danielhb@linux.vnet.ibm.com>>
> > > wrote:
> > > 
> > > 
> > > 
> > >     On 05/02/2017 12:40 AM, Bharata B Rao wrote:
> > > >     On Sun, Apr 30, 2017 at 10:55 PM, Daniel Henrique Barboza
> > > >     <danielhb@linux.vnet.ibm.com
> > > >     <mailto:danielhb@linux.vnet.ibm.com>> wrote:
> > > > 
> > > >         Following up the previous detach_cb change, this patch
> > > >         removes the
> > > >         detach_cb_opaque entirely from the code.
> > > > 
> > > >         The reason is that the drc->detach_cb_opaque object can't be
> > > >         restored in the post load of the upcoming DRC migration and
> > > >         no detach
> > > >         callbacks actually need this opaque. 'spapr_core_release' is
> > > >         receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is
> > > >         receiving
> > > >         a phb object as opaque but is't using it. These were trivial
> > > >         removal
> > > >         cases.
> > > > 
> > > >         However, the LM removal callback 'spapr_lmb_release' is
> > > > receiving
> > > >         and using the opaque object, a 'sPAPRDIMMState' struct. This
> > > >         struct
> > > >         holds the number of LMBs the DIMM object contains and the
> > > >         callback
> > > >         was using this counter as a countdown to check if all LMB
> > > >         DRCs were
> > > >         release before proceeding to the DIMM unplug. To remove the
> > > >         need of
> > > >         this callback we have choices such as:
> > > > 
> > > >         - migrate the 'sPAPRDIMMState' struct. This would require
> > > >         creating a
> > > >         QTAILQ to store all DIMMStates and an additional 'dimm_id'
> > > >         field to
> > > >         associate the DIMMState with the DIMM object. We could attach
> > > >         this
> > > >         QTAILQ to the 'sPAPRPHBState' and retrieve it later in the
> > > >         callback.
> > > > 
> > > >         - fetch the state of the LMB DRCs directly by scanning the
> > > >         state of
> > > >         them and, if all of them are released, proceed with the DIMM
> > > >         unplug.
> > > > 
> > > >         The second approach was chosen. The new
> > > >         'spapr_all_lmbs_drcs_released'
> > > >         function scans all LMBs of a given DIMM device to see if
> > > >         their DRC
> > > >         state are inactive. If all of them are inactive return
> > > >         'true', 'false'
> > > >         otherwise. This function is being called inside the
> > > >         'spapr_lmb_release'
> > > >         callback, replacing the role of the 'sPAPRDIMMState'
> > > > opaque. The
> > > >         'sPAPRDIMMState' struct was removed from the code given that
> > > >         there are
> > > >         no more uses for it.
> > > > 
> > > >         After all these changes, there are no roles left for the
> > > >         'detach_cb_opaque'
> > > >         attribute of the 'sPAPRDRConnector' as well, so we can safely
> > > >         remove
> > > >         it from the code too.
> > > > 
> > > >         Signed-off-by: Daniel Henrique Barboza
> > > >         <danielhb@linux.vnet.ibm.com
> > > >         <mailto:danielhb@linux.vnet.ibm.com>>
> > > >         ---
> > > >          hw/ppc/spapr.c             | 46
> > > >         +++++++++++++++++++++++++++++++++-------------
> > > >          hw/ppc/spapr_drc.c         | 16 +++++-----------
> > > >          hw/ppc/spapr_pci.c         |  4 ++--
> > > >          include/hw/ppc/spapr_drc.h |  6 ++----
> > > >          4 files changed, 42 insertions(+), 30 deletions(-)
> > > > 
> > > >         diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > >         index bc11757..8b9a6cf 100644
> > > >         --- a/hw/ppc/spapr.c
> > > >         +++ b/hw/ppc/spapr.c
> > > >         @@ -1887,21 +1887,43 @@ static void spapr_drc_reset(void
> > > > *opaque)
> > > >              }
> > > >          }
> > > > 
> > > >         -typedef struct sPAPRDIMMState {
> > > >         -    uint32_t nr_lmbs;
> > > >         -} sPAPRDIMMState;
> > > >         +static bool spapr_all_lmbs_drcs_released(PCDIMMDevice *dimm)
> > > >         +{
> > > >         +    Error *local_err = NULL;
> > > >         +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
> > > >         +    MemoryRegion *mr = ddc->get_memory_region(dimm);
> > > >         +    uint64_t size = memory_region_size(mr);
> > > >         +
> > > >         +    uint64_t addr;
> > > >         +    addr = object_property_get_int(OBJECT(dimm),
> > > >         PC_DIMM_ADDR_PROP, &local_err);
> > > >         +    if (local_err) {
> > > >         +        error_propagate(&error_abort, local_err);
> > > >         +        return false;
> > > >         +    }
> > > >         +    uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE;
> > > > 
> > > >         -static void spapr_lmb_release(DeviceState *dev, void *opaque)
> > > >         +    sPAPRDRConnector *drc;
> > > >         +    int i = 0;
> > > >         +    for (i = 0; i < nr_lmbs; i++) {
> > > >         +        drc =
> > > >         spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> > > >         +                addr / SPAPR_MEMORY_BLOCK_SIZE);
> > > >         +        g_assert(drc);
> > > >         +        if (drc->indicator_state !=
> > > >         SPAPR_DR_INDICATOR_STATE_INACTIVE) {
> > > >         +            return false;
> > > >         +        }
> > > >         +        addr += SPAPR_MEMORY_BLOCK_SIZE;
> > > >         +    }
> > > >         +    return true;
> > > >         +}
> > > >         +
> > > >         +static void spapr_lmb_release(DeviceState *dev)
> > > >          {
> > > >         -    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
> > > >              HotplugHandler *hotplug_ctrl;
> > > > 
> > > >         -    if (--ds->nr_lmbs) {
> > > >         +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
> > > >                  return;
> > > >              }
> > > > 
> > > > 
> > > >     I am concerned about the number of times we walk the DRC list
> > > >     corresponding to each DIMM device. When a DIMM device is being
> > > >     removed, spapr_lmb_release() will be invoked for each of the LMBs
> > > >     of that DIMM. Now in this scheme, we end up walking through all
> > > >     the DRC objects of the DIMM from every LMB's release function.
> > > 
> > >     Hi Bharata,
> > > 
> > > 
> > >     I agree, this is definitely a poorer performance than simply
> > >     decrementing ds->nr_lmbs.
> > >     The reasons why I went on with it:
> > > 
> > >     - hot unplug isn't an operation that happens too often, so it's
> > >     not terrible
> > >     to have a delay increase here;
> > > 
> > >     - it didn't increased the unplug delay in an human noticeable way,
> > >     at least in
> > >     my tests;
> > > 
> > >     - apart from migrating the information, there is nothing much we
> > >     can do in the
> > >     callback side about it. The callback isn't aware of the current
> > >     state of the DIMM
> > >     removal process, so the scanning is required every time.
> > > 
> > > 
> > >     All that said, assuming that the process of DIMM removal will
> > >     always go through
> > >     'spapr_del_lmbs', why do we need this callback? Can't we simply do
> > >     something
> > >     like this in spapr_del_lmbs?
> > > 
> > > 
> > >     diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > >     index cd42449..e443fea 100644
> > >     --- a/hw/ppc/spapr.c
> > >     +++ b/hw/ppc/spapr.c
> > >     @@ -2734,6 +2734,20 @@ static void spapr_del_lmbs(DeviceState
> > >     *dev, uint64_t addr_start, uint64_t size,
> > >              addr += SPAPR_MEMORY_BLOCK_SIZE;
> > >          }
> > > 
> > >     +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
> > >     +        // something went wrong in the removal of the LMBs.
> > >     +        // propagate error and return
> > >     +        throw_error_code;
> > >     +        return;
> > >     +    }
> > > 
> > > 
> > > spapr_del_lmbs() is called from ->unplug_request(). Here we notify
> > > the guest about the unplug request. We have to wait till the guest
> > > gives us a go ahead so that we can cleanup the DIMM device. The
> > > cleanup is done as part of release callback (spapr_lmb_release) at
> > > which point we are sure that the given LMB has been indeed removed
> > > by the guest.
> > 
> > I wasn't clear enough in my last comment. Let me rephrase. Is there any
> > other use for
> > the 'spapr_lmb_release' callback function other than being called by the
> > spapr_del_lmbs()
> > in the flow you've stated above? Searching the master code now I've
> > found:
> > 
> > 
> > $ grep -R 'spapr_lmb_release' .
> > ./spapr.c:static void spapr_lmb_release(DeviceState *dev, void *opaque)
> > ./spapr.c:        drck->detach(drc, dev, spapr_lmb_release, ds, errp);
> > 
> > 
> > Note that all the callback is doing is asserting that a nr_lmb counter
> > will be zero after
> > a decrement and, if true,  execute the following:
> > 
> > 
> >     hotplug_ctrl = qdev_get_hotplug_handler(dev);
> >     hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
> > 
> > 
> > So, if the callback spapr_lmb_release is only being called in the
> > following for loop of spapr_del_lmbs()
> > to clean up each LMB DRC, can't we get rid of it and do the following
> > after this for loop?
> > 
> >     for (i = 0; i < nr_lmbs; i++) {
> >         drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> >                 addr / SPAPR_MEMORY_BLOCK_SIZE);
> >         g_assert(drc);
> > 
> >         drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> >         drck->detach(drc, dev, ds, errp);
> >         addr += SPAPR_MEMORY_BLOCK_SIZE;
> >     }
> > 
> >     if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
> >         // All LMBs were cleared, proceed with detach
> >         hotplug_ctrl = qdev_get_hotplug_handler(dev);
> >         hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
> >    }
> >    // proceed with spapr_del_lmbs code
> > 
> > 
> > Doesn't this code does exactly the same thing that the callback does
> > today? Note that we can
> > even use that conditional to block the remaining spapr_del_lmbs code
> > from executing if the
> > LMBs weren't properly cleansed - something that today isn't done.
> > 
> > 
> > If removing this callback is too problematic or can somehow cause
> > problems that I am unable to
> > foresee, then the alternative would be to either deal with the scanning
> > inside the callback
> > (as it is being done in this patch) or migrate the nr_lmbs information
> > for late retrieval in
> > the callback. I am fine with any alternative, we just need to agree on
> > what makes more
> > sense.
> > 
> > 
> > Daniel
> > 
> > > 
> > > Regards,
> > > Bharata.
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-05-03 14:33   ` [Qemu-devel] " Laurent Vivier
@ 2017-05-04  7:27     ` David Gibson
  0 siblings, 0 replies; 20+ messages in thread
From: David Gibson @ 2017-05-04  7:27 UTC (permalink / raw)
  To: Laurent Vivier; +Cc: Daniel Henrique Barboza, qemu-devel, qemu-ppc

[-- Attachment #1: Type: text/plain, Size: 2446 bytes --]

On Wed, May 03, 2017 at 04:33:56PM +0200, Laurent Vivier wrote:
> On 30/04/2017 19:25, Daniel Henrique Barboza wrote:
> > Following up the previous detach_cb change, this patch removes the
> > detach_cb_opaque entirely from the code.
> > 
> > The reason is that the drc->detach_cb_opaque object can't be
> > restored in the post load of the upcoming DRC migration and no detach
> > callbacks actually need this opaque. 'spapr_core_release' is
> > receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is receiving
> > a phb object as opaque but is't using it. These were trivial removal
> > cases.
> > 
> > However, the LM removal callback 'spapr_lmb_release' is receiving
> > and using the opaque object, a 'sPAPRDIMMState' struct. This struct
> > holds the number of LMBs the DIMM object contains and the callback
> > was using this counter as a countdown to check if all LMB DRCs were
> > release before proceeding to the DIMM unplug. To remove the need of
> > this callback we have choices such as:
> > 
> > - migrate the 'sPAPRDIMMState' struct. This would require creating a
> > QTAILQ to store all DIMMStates and an additional 'dimm_id' field to
> > associate the DIMMState with the DIMM object. We could attach this
> > QTAILQ to the 'sPAPRPHBState' and retrieve it later in the callback.
> > 
> 
> Did you think about adding the nr_lmbs into PCDIMMDevice structure?

I think that's unlikely to fly, because it's very platform specific
state to put into the supposedly general DIMM object.

> Or to create a SPAPRPCDIMMDevice with PCDIMMDevice parent_obj and
> nr_lmbs field we can retrieve from the DeviceState pointer?

I wondered about that.  In some ways it is the simplest way forward;
however it means the user (and/or libvirt) has to be aware that they
need an spapr dimm not a pc-dimm for this platform, which is pretty
awful UX.

Plus, if we were going to have an SPAPR specific way of handling
memory hotplug, I'd prefer to really base it on sPAPR, which would let
us get rid of a bunch of the ugly glue between DIMMs and LMBs.

> 
> I don't know if it is a good idea or an acceptable way to do that, but
> I'd like to know if you have thought about that.
> 
> Thanks,
> Laurent
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 1/5] hw/ppc: setting spapr_drc_detach_cb in spapr_dr_connector_new
  2017-05-04  5:33   ` David Gibson
@ 2017-05-04 12:57     ` Daniel Henrique Barboza
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel Henrique Barboza @ 2017-05-04 12:57 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel



On 05/04/2017 02:33 AM, David Gibson wrote:
> On Sun, Apr 30, 2017 at 02:25:43PM -0300, Daniel Henrique Barboza wrote:
>> The idea of moving the detach callback functions to the constructor
>> of the dr_connector is to set them statically at init time, avoiding
>> any post-load hooks to restore it (after a migration, for example).
>>
>> Summary of changes:
>>
>> - hw/ppc/spapr_drc.c and include/hw/ppc/spapr_drc.h:
>>      *  spapr_dr_connector_new() now has an additional parameter,
>> spapr_drc_detach_cb *detach_cb
>>      *  'spapr_drc_detach_cb *detach_cb' parameter was removed of
>> the detach function pointer in sPAPRDRConnectorClass
>>
>> - hw/ppc/spapr_pci.c:
>>      * the callback 'spapr_phb_remove_pci_device_cb' is now passed
>> as a parameter in 'spapr_dr_connector_new' instead of 'drck->detach()'
>>
>> - hw/ppc/spapr.c:
>>      * 'spapr_create_lmb_dr_connectors' now passes the callback
>> 'spapr_lmb_release' to 'spapr_dr_connector_new' instead of 'drck-detach()'
>>      * 'spapr_init_cpus' now passes the callback 'spapr_core_release'
>> to 'spapr_dr_connector_new' instead of 'drck-detach()'
>>      * moved the callback functions up in the code so they can be referenced
>> by 'spapr_create_lmb_dr_connectors' and 'spapr_init_cpus'
>>
>> Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
> So, the patch looks correct, but the approach bothers me a bit.  The
> DRC type and the detach callback are essentially redundant information
> - the callback still gets stored in the instance, which doesn't make a
> whole lot of sense.
>
> Ideally we'd use QOM subtypes of the base DRC type and put the
> callback in the drck, but I suspect that will raise some other
> complications, so I'm ok with postponing that.

Interesting.

>
> In the meantime, I think we'd be better of doing an explicit switch on
> the DRC type when we want to call the detach function, rather than
> storing a function pointer at all.  It's kind of ugly, but we already
> have a bunch of nasty switches on the type, so I don't think it's
> really any uglier than what we have.

That sounds reasonable to me. If no one has a strong objection against 
it I'll
add this change in the next version.


Daniel

>
>
>> ---
>>   hw/ppc/spapr.c             | 71 +++++++++++++++++++++++-----------------------
>>   hw/ppc/spapr_drc.c         | 17 ++++++-----
>>   hw/ppc/spapr_pci.c         |  5 ++--
>>   include/hw/ppc/spapr_drc.h |  4 +--
>>   4 files changed, 49 insertions(+), 48 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 80d12d0..bc11757 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1887,6 +1887,29 @@ static void spapr_drc_reset(void *opaque)
>>       }
>>   }
>>   
>> +typedef struct sPAPRDIMMState {
>> +    uint32_t nr_lmbs;
>> +} sPAPRDIMMState;
>> +
>> +static void spapr_lmb_release(DeviceState *dev, void *opaque)
>> +{
>> +    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
>> +    HotplugHandler *hotplug_ctrl;
>> +
>> +    if (--ds->nr_lmbs) {
>> +        return;
>> +    }
>> +
>> +    g_free(ds);
>> +
>> +    /*
>> +     * Now that all the LMBs have been removed by the guest, call the
>> +     * pc-dimm unplug handler to cleanup up the pc-dimm device.
>> +     */
>> +    hotplug_ctrl = qdev_get_hotplug_handler(dev);
>> +    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
>> +}
>> +
>>   static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
>>   {
>>       MachineState *machine = MACHINE(spapr);
>> @@ -1900,7 +1923,7 @@ static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
>>   
>>           addr = i * lmb_size + spapr->hotplug_memory.base;
>>           drc = spapr_dr_connector_new(OBJECT(spapr), SPAPR_DR_CONNECTOR_TYPE_LMB,
>> -                                     addr/lmb_size);
>> +                                     (addr / lmb_size), spapr_lmb_release);
>>           qemu_register_reset(spapr_drc_reset, drc);
>>       }
>>   }
>> @@ -1956,6 +1979,14 @@ static CPUArchId *spapr_find_cpu_slot(MachineState *ms, uint32_t id, int *idx)
>>       return &ms->possible_cpus->cpus[index];
>>   }
>>   
>> +static void spapr_core_release(DeviceState *dev, void *opaque)
>> +{
>> +    HotplugHandler *hotplug_ctrl;
>> +
>> +    hotplug_ctrl = qdev_get_hotplug_handler(dev);
>> +    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
>> +}
>> +
>>   static void spapr_init_cpus(sPAPRMachineState *spapr)
>>   {
>>       MachineState *machine = MACHINE(spapr);
>> @@ -1998,7 +2029,8 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
>>               sPAPRDRConnector *drc =
>>                   spapr_dr_connector_new(OBJECT(spapr),
>>                                          SPAPR_DR_CONNECTOR_TYPE_CPU,
>> -                                       (core_id / smp_threads) * smt);
>> +                                       (core_id / smp_threads) * smt,
>> +                                       spapr_core_release);
>>   
>>               qemu_register_reset(spapr_drc_reset, drc);
>>           }
>> @@ -2596,29 +2628,6 @@ out:
>>       error_propagate(errp, local_err);
>>   }
>>   
>> -typedef struct sPAPRDIMMState {
>> -    uint32_t nr_lmbs;
>> -} sPAPRDIMMState;
>> -
>> -static void spapr_lmb_release(DeviceState *dev, void *opaque)
>> -{
>> -    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
>> -    HotplugHandler *hotplug_ctrl;
>> -
>> -    if (--ds->nr_lmbs) {
>> -        return;
>> -    }
>> -
>> -    g_free(ds);
>> -
>> -    /*
>> -     * Now that all the LMBs have been removed by the guest, call the
>> -     * pc-dimm unplug handler to cleanup up the pc-dimm device.
>> -     */
>> -    hotplug_ctrl = qdev_get_hotplug_handler(dev);
>> -    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
>> -}
>> -
>>   static void spapr_del_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
>>                              Error **errp)
>>   {
>> @@ -2636,7 +2645,7 @@ static void spapr_del_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
>>           g_assert(drc);
>>   
>>           drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>> -        drck->detach(drc, dev, spapr_lmb_release, ds, errp);
>> +        drck->detach(drc, dev, ds, errp);
>>           addr += SPAPR_MEMORY_BLOCK_SIZE;
>>       }
>>   
>> @@ -2712,14 +2721,6 @@ static void spapr_core_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>       object_unparent(OBJECT(dev));
>>   }
>>   
>> -static void spapr_core_release(DeviceState *dev, void *opaque)
>> -{
>> -    HotplugHandler *hotplug_ctrl;
>> -
>> -    hotplug_ctrl = qdev_get_hotplug_handler(dev);
>> -    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
>> -}
>> -
>>   static
>>   void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>>                                  Error **errp)
>> @@ -2745,7 +2746,7 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>>       g_assert(drc);
>>   
>>       drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>> -    drck->detach(drc, dev, spapr_core_release, NULL, &local_err);
>> +    drck->detach(drc, dev, NULL, &local_err);
>>       if (local_err) {
>>           error_propagate(errp, local_err);
>>           return;
>> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
>> index a1cdc87..afe5d82 100644
>> --- a/hw/ppc/spapr_drc.c
>> +++ b/hw/ppc/spapr_drc.c
>> @@ -99,8 +99,8 @@ static uint32_t set_isolation_state(sPAPRDRConnector *drc,
>>           if (drc->awaiting_release) {
>>               if (drc->configured) {
>>                   trace_spapr_drc_set_isolation_state_finalizing(get_index(drc));
>> -                drck->detach(drc, DEVICE(drc->dev), drc->detach_cb,
>> -                             drc->detach_cb_opaque, NULL);
>> +                drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque,
>> +                                         NULL);
>>               } else {
>>                   trace_spapr_drc_set_isolation_state_deferring(get_index(drc));
>>               }
>> @@ -153,8 +153,8 @@ static uint32_t set_allocation_state(sPAPRDRConnector *drc,
>>           if (drc->awaiting_release &&
>>               drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) {
>>               trace_spapr_drc_set_allocation_state_finalizing(get_index(drc));
>> -            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb,
>> -                         drc->detach_cb_opaque, NULL);
>> +            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque,
>> +                         NULL);
>>           } else if (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) {
>>               drc->awaiting_allocation = false;
>>           }
>> @@ -405,12 +405,10 @@ static void attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
>>   }
>>   
>>   static void detach(sPAPRDRConnector *drc, DeviceState *d,
>> -                   spapr_drc_detach_cb *detach_cb,
>>                      void *detach_cb_opaque, Error **errp)
>>   {
>>       trace_spapr_drc_detach(get_index(drc));
>>   
>> -    drc->detach_cb = detach_cb;
>>       drc->detach_cb_opaque = detach_cb_opaque;
>>   
>>       /* if we've signalled device presence to the guest, or if the guest
>> @@ -498,8 +496,7 @@ static void reset(DeviceState *d)
>>            * force removal if we are
>>            */
>>           if (drc->awaiting_release) {
>> -            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb,
>> -                         drc->detach_cb_opaque, NULL);
>> +            drck->detach(drc, DEVICE(drc->dev), drc->detach_cb_opaque, NULL);
>>           }
>>   
>>           /* non-PCI devices may be awaiting a transition to UNUSABLE */
>> @@ -566,7 +563,8 @@ static void unrealize(DeviceState *d, Error **errp)
>>   
>>   sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
>>                                            sPAPRDRConnectorType type,
>> -                                         uint32_t id)
>> +                                         uint32_t id,
>> +                                         spapr_drc_detach_cb *detach_cb)
>>   {
>>       sPAPRDRConnector *drc =
>>           SPAPR_DR_CONNECTOR(object_new(TYPE_SPAPR_DR_CONNECTOR));
>> @@ -577,6 +575,7 @@ sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
>>       drc->type = type;
>>       drc->id = id;
>>       drc->owner = owner;
>> +    drc->detach_cb = detach_cb;
>>       prop_name = g_strdup_printf("dr-connector[%"PRIu32"]", get_index(drc));
>>       object_property_add_child(owner, prop_name, OBJECT(drc), NULL);
>>       object_property_set_bool(OBJECT(drc), true, "realized", NULL);
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index e7567e2..935e65e 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -1392,7 +1392,7 @@ static void spapr_phb_remove_pci_device(sPAPRDRConnector *drc,
>>   {
>>       sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>>   
>> -    drck->detach(drc, DEVICE(pdev), spapr_phb_remove_pci_device_cb, phb, errp);
>> +    drck->detach(drc, DEVICE(pdev), phb, errp);
>>   }
>>   
>>   static sPAPRDRConnector *spapr_phb_get_pci_func_drc(sPAPRPHBState *phb,
>> @@ -1764,7 +1764,8 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
>>           for (i = 0; i < PCI_SLOT_MAX * 8; i++) {
>>               spapr_dr_connector_new(OBJECT(phb),
>>                                      SPAPR_DR_CONNECTOR_TYPE_PCI,
>> -                                   (sphb->index << 16) | i);
>> +                                   (sphb->index << 16) | i,
>> +                                   spapr_phb_remove_pci_device_cb);
>>           }
>>       }
>>   
>> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
>> index 5524247..0a2c173 100644
>> --- a/include/hw/ppc/spapr_drc.h
>> +++ b/include/hw/ppc/spapr_drc.h
>> @@ -189,7 +189,6 @@ typedef struct sPAPRDRConnectorClass {
>>       void (*attach)(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
>>                      int fdt_start_offset, bool coldplug, Error **errp);
>>       void (*detach)(sPAPRDRConnector *drc, DeviceState *d,
>> -                   spapr_drc_detach_cb *detach_cb,
>>                      void *detach_cb_opaque, Error **errp);
>>       bool (*release_pending)(sPAPRDRConnector *drc);
>>       void (*set_signalled)(sPAPRDRConnector *drc);
>> @@ -197,7 +196,8 @@ typedef struct sPAPRDRConnectorClass {
>>   
>>   sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
>>                                            sPAPRDRConnectorType type,
>> -                                         uint32_t id);
>> +                                         uint32_t id,
>> +                                         spapr_drc_detach_cb *detach_cb);
>>   sPAPRDRConnector *spapr_dr_connector_by_index(uint32_t index);
>>   sPAPRDRConnector *spapr_dr_connector_by_id(sPAPRDRConnectorType type,
>>                                              uint32_t id);

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-05-04  7:24             ` David Gibson
@ 2017-05-04 16:30               ` Daniel Henrique Barboza
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel Henrique Barboza @ 2017-05-04 16:30 UTC (permalink / raw)
  To: David Gibson; +Cc: Michael Roth, qemu-ppc, qemu-devel, Bharata B Rao



On 05/04/2017 04:24 AM, David Gibson wrote:
> On Wed, May 03, 2017 at 05:33:54PM -0300, Daniel Henrique Barboza wrote:
>> Update: I have talked with Michael Roth about the spapr_release_lmb
>> callback, the flow
>> of the LMB releases and so on. He clarified to me that it is not possible to
>> get rid of
>> the callback and put its code in the spapr_del_lmbs function.
>>
>> The reason is that the callback is being executed by the guest via
>> spapr_rtas.c:rtas_set_indicator(),
>> which in turn will follow the flow of the DRC releases and eventually will
>> hit the spapr_release_lmb
>> callback, but this will not necessarily happen in the spam of the
>> spapr_del_lmbs function. This means that
>> my idea of putting the cb code in the spapr_del_lmbs and removing the
>> callback not possible.
>>
>> On the other hand, Bharata raised the issue about the scan function in the
>> callback being a problem.
>> My tests with a 1 Gb unplug didn't show any noticeable delay increase but in
>> theory we support memory
>> unplugs of 1 Terabyte. With 1Gb of RAM we need 4 DRCs, so 1 Tb would require
>> 4000 DRCs. Then we
>> would scan through them 4000 times. I don't think the scan inside the
>> callback is feasible in this scenario
>> either.
>>
>> In the v9 I'll migrate the sPAPRDIMMState structure like Michael Roth
>> mentioned somewhere in the
>> v6 review to use it inside the spapr_lmb_release callback - looks like the
>> best option we have.
> I don't think you have to migrate that actual structure, just the fact
> that there's an in-progress removal (which you might be able to derive
> from existing migrated state anyway).  You can reconstruct the nr_lmbs
> state with a full scan on post_load().  That way it's only O(N) not
> O(N^2), and only in the case that a migration occurs mid-unplug.

Interesting idea. I'll give it a go.

Daniel

>
>>
>> Thanks,
>>
>>
>> Daniel
>>
>>
>>
>> On 05/03/2017 10:56 AM, Daniel Henrique Barboza wrote:
>>>
>>> On 05/03/2017 12:26 AM, Bharata B Rao wrote:
>>>> On Tue, May 2, 2017 at 1:13 PM, Daniel Henrique Barboza
>>>> <danielhb@linux.vnet.ibm.com <mailto:danielhb@linux.vnet.ibm.com>>
>>>> wrote:
>>>>
>>>>
>>>>
>>>>      On 05/02/2017 12:40 AM, Bharata B Rao wrote:
>>>>>      On Sun, Apr 30, 2017 at 10:55 PM, Daniel Henrique Barboza
>>>>>      <danielhb@linux.vnet.ibm.com
>>>>>      <mailto:danielhb@linux.vnet.ibm.com>> wrote:
>>>>>
>>>>>          Following up the previous detach_cb change, this patch
>>>>>          removes the
>>>>>          detach_cb_opaque entirely from the code.
>>>>>
>>>>>          The reason is that the drc->detach_cb_opaque object can't be
>>>>>          restored in the post load of the upcoming DRC migration and
>>>>>          no detach
>>>>>          callbacks actually need this opaque. 'spapr_core_release' is
>>>>>          receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is
>>>>>          receiving
>>>>>          a phb object as opaque but is't using it. These were trivial
>>>>>          removal
>>>>>          cases.
>>>>>
>>>>>          However, the LM removal callback 'spapr_lmb_release' is
>>>>> receiving
>>>>>          and using the opaque object, a 'sPAPRDIMMState' struct. This
>>>>>          struct
>>>>>          holds the number of LMBs the DIMM object contains and the
>>>>>          callback
>>>>>          was using this counter as a countdown to check if all LMB
>>>>>          DRCs were
>>>>>          release before proceeding to the DIMM unplug. To remove the
>>>>>          need of
>>>>>          this callback we have choices such as:
>>>>>
>>>>>          - migrate the 'sPAPRDIMMState' struct. This would require
>>>>>          creating a
>>>>>          QTAILQ to store all DIMMStates and an additional 'dimm_id'
>>>>>          field to
>>>>>          associate the DIMMState with the DIMM object. We could attach
>>>>>          this
>>>>>          QTAILQ to the 'sPAPRPHBState' and retrieve it later in the
>>>>>          callback.
>>>>>
>>>>>          - fetch the state of the LMB DRCs directly by scanning the
>>>>>          state of
>>>>>          them and, if all of them are released, proceed with the DIMM
>>>>>          unplug.
>>>>>
>>>>>          The second approach was chosen. The new
>>>>>          'spapr_all_lmbs_drcs_released'
>>>>>          function scans all LMBs of a given DIMM device to see if
>>>>>          their DRC
>>>>>          state are inactive. If all of them are inactive return
>>>>>          'true', 'false'
>>>>>          otherwise. This function is being called inside the
>>>>>          'spapr_lmb_release'
>>>>>          callback, replacing the role of the 'sPAPRDIMMState'
>>>>> opaque. The
>>>>>          'sPAPRDIMMState' struct was removed from the code given that
>>>>>          there are
>>>>>          no more uses for it.
>>>>>
>>>>>          After all these changes, there are no roles left for the
>>>>>          'detach_cb_opaque'
>>>>>          attribute of the 'sPAPRDRConnector' as well, so we can safely
>>>>>          remove
>>>>>          it from the code too.
>>>>>
>>>>>          Signed-off-by: Daniel Henrique Barboza
>>>>>          <danielhb@linux.vnet.ibm.com
>>>>>          <mailto:danielhb@linux.vnet.ibm.com>>
>>>>>          ---
>>>>>           hw/ppc/spapr.c             | 46
>>>>>          +++++++++++++++++++++++++++++++++-------------
>>>>>           hw/ppc/spapr_drc.c         | 16 +++++-----------
>>>>>           hw/ppc/spapr_pci.c         |  4 ++--
>>>>>           include/hw/ppc/spapr_drc.h |  6 ++----
>>>>>           4 files changed, 42 insertions(+), 30 deletions(-)
>>>>>
>>>>>          diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>>          index bc11757..8b9a6cf 100644
>>>>>          --- a/hw/ppc/spapr.c
>>>>>          +++ b/hw/ppc/spapr.c
>>>>>          @@ -1887,21 +1887,43 @@ static void spapr_drc_reset(void
>>>>> *opaque)
>>>>>               }
>>>>>           }
>>>>>
>>>>>          -typedef struct sPAPRDIMMState {
>>>>>          -    uint32_t nr_lmbs;
>>>>>          -} sPAPRDIMMState;
>>>>>          +static bool spapr_all_lmbs_drcs_released(PCDIMMDevice *dimm)
>>>>>          +{
>>>>>          +    Error *local_err = NULL;
>>>>>          +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
>>>>>          +    MemoryRegion *mr = ddc->get_memory_region(dimm);
>>>>>          +    uint64_t size = memory_region_size(mr);
>>>>>          +
>>>>>          +    uint64_t addr;
>>>>>          +    addr = object_property_get_int(OBJECT(dimm),
>>>>>          PC_DIMM_ADDR_PROP, &local_err);
>>>>>          +    if (local_err) {
>>>>>          +        error_propagate(&error_abort, local_err);
>>>>>          +        return false;
>>>>>          +    }
>>>>>          +    uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE;
>>>>>
>>>>>          -static void spapr_lmb_release(DeviceState *dev, void *opaque)
>>>>>          +    sPAPRDRConnector *drc;
>>>>>          +    int i = 0;
>>>>>          +    for (i = 0; i < nr_lmbs; i++) {
>>>>>          +        drc =
>>>>>          spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
>>>>>          +                addr / SPAPR_MEMORY_BLOCK_SIZE);
>>>>>          +        g_assert(drc);
>>>>>          +        if (drc->indicator_state !=
>>>>>          SPAPR_DR_INDICATOR_STATE_INACTIVE) {
>>>>>          +            return false;
>>>>>          +        }
>>>>>          +        addr += SPAPR_MEMORY_BLOCK_SIZE;
>>>>>          +    }
>>>>>          +    return true;
>>>>>          +}
>>>>>          +
>>>>>          +static void spapr_lmb_release(DeviceState *dev)
>>>>>           {
>>>>>          -    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
>>>>>               HotplugHandler *hotplug_ctrl;
>>>>>
>>>>>          -    if (--ds->nr_lmbs) {
>>>>>          +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
>>>>>                   return;
>>>>>               }
>>>>>
>>>>>
>>>>>      I am concerned about the number of times we walk the DRC list
>>>>>      corresponding to each DIMM device. When a DIMM device is being
>>>>>      removed, spapr_lmb_release() will be invoked for each of the LMBs
>>>>>      of that DIMM. Now in this scheme, we end up walking through all
>>>>>      the DRC objects of the DIMM from every LMB's release function.
>>>>      Hi Bharata,
>>>>
>>>>
>>>>      I agree, this is definitely a poorer performance than simply
>>>>      decrementing ds->nr_lmbs.
>>>>      The reasons why I went on with it:
>>>>
>>>>      - hot unplug isn't an operation that happens too often, so it's
>>>>      not terrible
>>>>      to have a delay increase here;
>>>>
>>>>      - it didn't increased the unplug delay in an human noticeable way,
>>>>      at least in
>>>>      my tests;
>>>>
>>>>      - apart from migrating the information, there is nothing much we
>>>>      can do in the
>>>>      callback side about it. The callback isn't aware of the current
>>>>      state of the DIMM
>>>>      removal process, so the scanning is required every time.
>>>>
>>>>
>>>>      All that said, assuming that the process of DIMM removal will
>>>>      always go through
>>>>      'spapr_del_lmbs', why do we need this callback? Can't we simply do
>>>>      something
>>>>      like this in spapr_del_lmbs?
>>>>
>>>>
>>>>      diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>      index cd42449..e443fea 100644
>>>>      --- a/hw/ppc/spapr.c
>>>>      +++ b/hw/ppc/spapr.c
>>>>      @@ -2734,6 +2734,20 @@ static void spapr_del_lmbs(DeviceState
>>>>      *dev, uint64_t addr_start, uint64_t size,
>>>>               addr += SPAPR_MEMORY_BLOCK_SIZE;
>>>>           }
>>>>
>>>>      +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
>>>>      +        // something went wrong in the removal of the LMBs.
>>>>      +        // propagate error and return
>>>>      +        throw_error_code;
>>>>      +        return;
>>>>      +    }
>>>>
>>>>
>>>> spapr_del_lmbs() is called from ->unplug_request(). Here we notify
>>>> the guest about the unplug request. We have to wait till the guest
>>>> gives us a go ahead so that we can cleanup the DIMM device. The
>>>> cleanup is done as part of release callback (spapr_lmb_release) at
>>>> which point we are sure that the given LMB has been indeed removed
>>>> by the guest.
>>> I wasn't clear enough in my last comment. Let me rephrase. Is there any
>>> other use for
>>> the 'spapr_lmb_release' callback function other than being called by the
>>> spapr_del_lmbs()
>>> in the flow you've stated above? Searching the master code now I've
>>> found:
>>>
>>>
>>> $ grep -R 'spapr_lmb_release' .
>>> ./spapr.c:static void spapr_lmb_release(DeviceState *dev, void *opaque)
>>> ./spapr.c:        drck->detach(drc, dev, spapr_lmb_release, ds, errp);
>>>
>>>
>>> Note that all the callback is doing is asserting that a nr_lmb counter
>>> will be zero after
>>> a decrement and, if true,  execute the following:
>>>
>>>
>>>      hotplug_ctrl = qdev_get_hotplug_handler(dev);
>>>      hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
>>>
>>>
>>> So, if the callback spapr_lmb_release is only being called in the
>>> following for loop of spapr_del_lmbs()
>>> to clean up each LMB DRC, can't we get rid of it and do the following
>>> after this for loop?
>>>
>>>      for (i = 0; i < nr_lmbs; i++) {
>>>          drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
>>>                  addr / SPAPR_MEMORY_BLOCK_SIZE);
>>>          g_assert(drc);
>>>
>>>          drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
>>>          drck->detach(drc, dev, ds, errp);
>>>          addr += SPAPR_MEMORY_BLOCK_SIZE;
>>>      }
>>>
>>>      if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
>>>          // All LMBs were cleared, proceed with detach
>>>          hotplug_ctrl = qdev_get_hotplug_handler(dev);
>>>          hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
>>>     }
>>>     // proceed with spapr_del_lmbs code
>>>
>>>
>>> Doesn't this code does exactly the same thing that the callback does
>>> today? Note that we can
>>> even use that conditional to block the remaining spapr_del_lmbs code
>>> from executing if the
>>> LMBs weren't properly cleansed - something that today isn't done.
>>>
>>>
>>> If removing this callback is too problematic or can somehow cause
>>> problems that I am unable to
>>> foresee, then the alternative would be to either deal with the scanning
>>> inside the callback
>>> (as it is being done in this patch) or migrate the nr_lmbs information
>>> for late retrieval in
>>> the callback. I am fine with any alternative, we just need to agree on
>>> what makes more
>>> sense.
>>>
>>>
>>> Daniel
>>>
>>>> Regards,
>>>> Bharata.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques
  2017-05-04  7:20       ` David Gibson
@ 2017-05-05  4:38         ` Bharata B Rao
  0 siblings, 0 replies; 20+ messages in thread
From: Bharata B Rao @ 2017-05-05  4:38 UTC (permalink / raw)
  To: David Gibson; +Cc: Daniel Henrique Barboza, qemu-ppc, qemu-devel

On Thu, May 4, 2017 at 12:50 PM, David Gibson <david@gibson.dropbear.id.au>
wrote:

> On Tue, May 02, 2017 at 04:43:51AM -0300, Daniel Henrique Barboza wrote:
> >
> >
> > On 05/02/2017 12:40 AM, Bharata B Rao wrote:
> > > On Sun, Apr 30, 2017 at 10:55 PM, Daniel Henrique Barboza
> > > <danielhb@linux.vnet.ibm.com <mailto:danielhb@linux.vnet.ibm.com>>
> > > wrote:
> > >
> > >     Following up the previous detach_cb change, this patch removes the
> > >     detach_cb_opaque entirely from the code.
> > >
> > >     The reason is that the drc->detach_cb_opaque object can't be
> > >     restored in the post load of the upcoming DRC migration and no
> detach
> > >     callbacks actually need this opaque. 'spapr_core_release' is
> > >     receiving it as NULL, 'spapr_phb_remove_pci_device_cb' is
> receiving
> > >     a phb object as opaque but is't using it. These were trivial
> removal
> > >     cases.
> > >
> > >     However, the LM removal callback 'spapr_lmb_release' is receiving
> > >     and using the opaque object, a 'sPAPRDIMMState' struct. This struct
> > >     holds the number of LMBs the DIMM object contains and the callback
> > >     was using this counter as a countdown to check if all LMB DRCs were
> > >     release before proceeding to the DIMM unplug. To remove the need of
> > >     this callback we have choices such as:
> > >
> > >     - migrate the 'sPAPRDIMMState' struct. This would require creating
> a
> > >     QTAILQ to store all DIMMStates and an additional 'dimm_id' field to
> > >     associate the DIMMState with the DIMM object. We could attach this
> > >     QTAILQ to the 'sPAPRPHBState' and retrieve it later in the
> callback.
> > >
> > >     - fetch the state of the LMB DRCs directly by scanning the state of
> > >     them and, if all of them are released, proceed with the DIMM
> unplug.
> > >
> > >     The second approach was chosen. The new
> 'spapr_all_lmbs_drcs_released'
> > >     function scans all LMBs of a given DIMM device to see if their DRC
> > >     state are inactive. If all of them are inactive return 'true',
> 'false'
> > >     otherwise. This function is being called inside the
> > >     'spapr_lmb_release'
> > >     callback, replacing the role of the 'sPAPRDIMMState' opaque. The
> > >     'sPAPRDIMMState' struct was removed from the code given that there
> are
> > >     no more uses for it.
> > >
> > >     After all these changes, there are no roles left for the
> > >     'detach_cb_opaque'
> > >     attribute of the 'sPAPRDRConnector' as well, so we can safely
> remove
> > >     it from the code too.
> > >
> > >     Signed-off-by: Daniel Henrique Barboza
> > >     <danielhb@linux.vnet.ibm.com <mailto:danielhb@linux.vnet.ibm.com>>
> > >     ---
> > >      hw/ppc/spapr.c             | 46
> > >     +++++++++++++++++++++++++++++++++-------------
> > >      hw/ppc/spapr_drc.c         | 16 +++++-----------
> > >      hw/ppc/spapr_pci.c         |  4 ++--
> > >      include/hw/ppc/spapr_drc.h |  6 ++----
> > >      4 files changed, 42 insertions(+), 30 deletions(-)
> > >
> > >     diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > >     index bc11757..8b9a6cf 100644
> > >     --- a/hw/ppc/spapr.c
> > >     +++ b/hw/ppc/spapr.c
> > >     @@ -1887,21 +1887,43 @@ static void spapr_drc_reset(void *opaque)
> > >          }
> > >      }
> > >
> > >     -typedef struct sPAPRDIMMState {
> > >     -    uint32_t nr_lmbs;
> > >     -} sPAPRDIMMState;
> > >     +static bool spapr_all_lmbs_drcs_released(PCDIMMDevice *dimm)
> > >     +{
> > >     +    Error *local_err = NULL;
> > >     +    PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm);
> > >     +    MemoryRegion *mr = ddc->get_memory_region(dimm);
> > >     +    uint64_t size = memory_region_size(mr);
> > >     +
> > >     +    uint64_t addr;
> > >     +    addr = object_property_get_int(OBJECT(dimm),
> > >     PC_DIMM_ADDR_PROP, &local_err);
> > >     +    if (local_err) {
> > >     +        error_propagate(&error_abort, local_err);
> > >     +        return false;
> > >     +    }
> > >     +    uint32_t nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE;
> > >
> > >     -static void spapr_lmb_release(DeviceState *dev, void *opaque)
> > >     +    sPAPRDRConnector *drc;
> > >     +    int i = 0;
> > >     +    for (i = 0; i < nr_lmbs; i++) {
> > >     +        drc = spapr_dr_connector_by_id(
> SPAPR_DR_CONNECTOR_TYPE_LMB,
> > >     +                addr / SPAPR_MEMORY_BLOCK_SIZE);
> > >     +        g_assert(drc);
> > >     +        if (drc->indicator_state !=
> > >     SPAPR_DR_INDICATOR_STATE_INACTIVE) {
> > >     +            return false;
> > >     +        }
> > >     +        addr += SPAPR_MEMORY_BLOCK_SIZE;
> > >     +    }
> > >     +    return true;
> > >     +}
> > >     +
> > >     +static void spapr_lmb_release(DeviceState *dev)
> > >      {
> > >     -    sPAPRDIMMState *ds = (sPAPRDIMMState *)opaque;
> > >          HotplugHandler *hotplug_ctrl;
> > >
> > >     -    if (--ds->nr_lmbs) {
> > >     +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
> > >              return;
> > >          }
> > >
> > >
> > > I am concerned about the number of times we walk the DRC list
> > > corresponding to each DIMM device. When a DIMM device is being removed,
> > > spapr_lmb_release() will be invoked for each of the LMBs of that DIMM.
> > > Now in this scheme, we end up walking through all the DRC objects of
> the
> > > DIMM from every LMB's release function.
> >
> > Hi Bharata,
> >
> >
> > I agree, this is definitely a poorer performance than simply decrementing
> > ds->nr_lmbs.
> > The reasons why I went on with it:
> >
> > - hot unplug isn't an operation that happens too often, so it's not
> terrible
> > to have a delay increase here;
>
> So, if it were just a fixed increase in the time, I'd agree.  But IIUC
> from the above, this basically makes the removal O(N^2) in the size of
> the DIMM, which sounds like it could get bad to me.
>
> > - it didn't increased the unplug delay in an human noticeable way, at
> least
> > in
> > my tests;
>
> Right, but what size of DIMM did you use?
>
> > - apart from migrating the information, there is nothing much we can do
> in
> > the
> > callback side about it. The callback isn't aware of the current state of
> the
> > DIMM
> > removal process, so the scanning is required every time.
>
> Well we could potentially use "cached" state here.  In the normal way
> of things we use a value like this, but in the case of migration we
> re-generate the information with a full scan.
>
> > All that said, assuming that the process of DIMM removal will always go
> > through
> > 'spapr_del_lmbs', why do we need this callback? Can't we simply do
> something
> > like this in spapr_del_lmbs?
> >
> >
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index cd42449..e443fea 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2734,6 +2734,20 @@ static void spapr_del_lmbs(DeviceState *dev,
> uint64_t
> > addr_start, uint64_t size,
> >          addr += SPAPR_MEMORY_BLOCK_SIZE;
> >      }
> >
> > +    if (!spapr_all_lmbs_drcs_released(PC_DIMM(dev))) {
> > +        // something went wrong in the removal of the LMBs.
> > +        // propagate error and return
> > +        throw_error_code;
> > +        return;
> > +    }
> > +
> > +    /*
> > +     * Now that all the LMBs have been removed by the guest, call the
> > +     * pc-dimm unplug handler to cleanup up the pc-dimm device.
> > +     */
> > +    hotplug_ctrl = qdev_get_hotplug_handler(dev);
> > +    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
> > +
> >      drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> >                                     addr_start /
> SPAPR_MEMORY_BLOCK_SIZE);
> >      drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> >
> >
> > With this change we run the LMB scanning once at the end of the for
> > loop inside spapr_del_lmbs to make sure everything went fine (something
> > that the current code  isn't doing, there are operationsvbeing done
> > afterwards
> > without checking if the LMB removals actually happened).
> >
> > If something went wrong, propagate an error. If not, proceed with the
> > removal
> > of the DIMM device and the remaining spapr_del_lmbs code.
> spapr_lmb_release
> > can
> > be safely removed from the code after that.
> >
> >
> > What do you think?
>
> That seems like a good idea, independent of anything else.  But I may
> not be remembering how the LMB removal paths all work.  Bharata?
>

As I pointed out in another thread and as Daniel himself realized, the
above scheme won't work as we have to wait for the guest to acknowledge the
removal and that is when callback is executed.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2017-05-05  4:38 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-30 17:25 [Qemu-devel] [PATCH 0/5 v8] migration/ppc: migrating DRC, ccs_list and pending_events Daniel Henrique Barboza
2017-04-30 17:25 ` [Qemu-devel] [PATCH 1/5] hw/ppc: setting spapr_drc_detach_cb in spapr_dr_connector_new Daniel Henrique Barboza
2017-05-03 14:01   ` Laurent Vivier
2017-05-04  5:33   ` David Gibson
2017-05-04 12:57     ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
2017-04-30 17:25 ` [Qemu-devel] [PATCH 2/5] hw/ppc: removing spapr_drc_detach_cb opaques Daniel Henrique Barboza
2017-05-02  3:40   ` Bharata B Rao
2017-05-02  7:43     ` [Qemu-devel] [Qemu-ppc] " Daniel Henrique Barboza
2017-05-03  3:26       ` Bharata B Rao
2017-05-03 13:56         ` Daniel Henrique Barboza
2017-05-03 20:33           ` Daniel Henrique Barboza
2017-05-04  7:24             ` David Gibson
2017-05-04 16:30               ` Daniel Henrique Barboza
2017-05-04  7:20       ` David Gibson
2017-05-05  4:38         ` Bharata B Rao
2017-05-03 14:33   ` [Qemu-devel] " Laurent Vivier
2017-05-04  7:27     ` David Gibson
2017-04-30 17:25 ` [Qemu-devel] [PATCH 3/5] hw/ppc: migrating the DRC state of hotplugged devices Daniel Henrique Barboza
2017-04-30 17:25 ` [Qemu-devel] [PATCH 4/5] migration: spapr: migrate ccs_list in spapr state Daniel Henrique Barboza
2017-04-30 17:25 ` [Qemu-devel] [PATCH 5/5] migration: spapr: migrate pending_events of " Daniel Henrique Barboza

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.