All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/5] spapr: DRC cleanups (part VI)
@ 2017-06-21  9:18 David Gibson
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 1/5] spapr: Remove 'awaiting_allocation' DRC flag David Gibson
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: David Gibson @ 2017-06-21  9:18 UTC (permalink / raw)
  To: mdroth
  Cc: bharata, groug, sursingh, lvivier, qemu-ppc, qemu-devel, David Gibson

This sixth set of DRC cleanup patches (to be applied on top of the
patches from part V) is a complete rework of DRC state management.  We
stop tracking some unnecessary things, and change the basic state
representation to a simpler and more robust model.

Most of the patches in this set "break" migration from earlier git
snapshots, but not from any released qemu version.  The previous
migration stream format had multiple problems, so better to fix them
now, before 2.10 is out.

David Gibson (5):
  spapr: Remove 'awaiting_allocation' DRC flag
  spapr: Refactor spapr_drc_detach()
  spapr: Cleanups relating to DRC awaiting_release field
  spapr: Consolidate DRC state variables
  spapr: Remove sPAPRConfigureConnectorState sub-structure

 hw/ppc/spapr.c             |   9 +-
 hw/ppc/spapr_drc.c         | 321 +++++++++++++++++++++------------------------
 hw/ppc/spapr_pci.c         |  13 +-
 hw/ppc/trace-events        |   3 +-
 include/hw/ppc/spapr_drc.h |  53 +++++---
 5 files changed, 187 insertions(+), 212 deletions(-)

-- 
2.9.4

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 1/5] spapr: Remove 'awaiting_allocation' DRC flag
  2017-06-21  9:18 [Qemu-devel] [PATCH 0/5] spapr: DRC cleanups (part VI) David Gibson
@ 2017-06-21  9:18 ` David Gibson
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 2/5] spapr: Refactor spapr_drc_detach() David Gibson
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2017-06-21  9:18 UTC (permalink / raw)
  To: mdroth
  Cc: bharata, groug, sursingh, lvivier, qemu-ppc, qemu-devel, David Gibson

The awaiting_allocation flag in the DRC was introduced by aab9913
"spapr_drc: Prevent detach racing against attach for CPU DR", allegedly to
prevent a guest crash on racing attach and detach.  Except.. information
from the BZ actually suggests a qemu crash, not a guest crash.  And there
shouldn't be a problem here anyway: if the guest has already moved the DRC
away from UNUSABLE state, the detach would already be deferred, and if it
hadn't it should be safe to detach it (the guest should fail gracefully
when it attempts to change the allocation state).

I think this was probably just a bandaid for some other problem in the
state management.  So, remove awaiting_allocation and associated code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_drc.c         | 23 +++--------------------
 include/hw/ppc/spapr_drc.h |  1 -
 2 files changed, 3 insertions(+), 21 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index f34355d..59e19f8 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -170,19 +170,13 @@ static uint32_t drc_set_usable(sPAPRDRConnector *drc)
     if (!drc->dev) {
         return RTAS_OUT_NO_SUCH_INDICATOR;
     }
-    if (drc->awaiting_release && drc->awaiting_allocation) {
-        /* kernel is acknowledging a previous hotplug event
-         * while we are already removing it.
-         * it's safe to ignore awaiting_allocation here since we know the
-         * situation is predicated on the guest either already having done
-         * so (boot-time hotplug), or never being able to acquire in the
-         * first place (hotplug followed by immediate unplug).
-         */
+    if (drc->awaiting_release) {
+        /* Don't allow the guest to move a device away from UNUSABLE
+         * state when we want to unplug it */
         return RTAS_OUT_NO_SUCH_INDICATOR;
     }
 
     drc->allocation_state = SPAPR_DR_ALLOCATION_STATE_USABLE;
-    drc->awaiting_allocation = false;
 
     return RTAS_OUT_SUCCESS;
 }
@@ -357,10 +351,6 @@ void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
     drc->fdt = fdt;
     drc->fdt_start_offset = fdt_start_offset;
 
-    if (spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PCI) {
-        drc->awaiting_allocation = true;
-    }
-
     object_property_add_link(OBJECT(drc), "device",
                              object_get_typename(OBJECT(drc->dev)),
                              (Object **)(&drc->dev),
@@ -398,12 +388,6 @@ void spapr_drc_detach(sPAPRDRConnector *drc, DeviceState *d, Error **errp)
         return;
     }
 
-    if (drc->awaiting_allocation) {
-        drc->awaiting_release = true;
-        trace_spapr_drc_awaiting_allocation(spapr_drc_index(drc));
-        return;
-    }
-
     spapr_drc_release(drc);
 }
 
@@ -490,7 +474,6 @@ static const VMStateDescription vmstate_spapr_drc = {
         VMSTATE_UINT32(dr_indicator, sPAPRDRConnector),
         VMSTATE_BOOL(configured, sPAPRDRConnector),
         VMSTATE_BOOL(awaiting_release, sPAPRDRConnector),
-        VMSTATE_BOOL(awaiting_allocation, sPAPRDRConnector),
         VMSTATE_END_OF_LIST()
     }
 };
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index d15e9eb..42c3722 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -199,7 +199,6 @@ typedef struct sPAPRDRConnector {
     sPAPRConfigureConnectorState *ccs;
 
     bool awaiting_release;
-    bool awaiting_allocation;
 
     /* device pointer, via link property */
     DeviceState *dev;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 2/5] spapr: Refactor spapr_drc_detach()
  2017-06-21  9:18 [Qemu-devel] [PATCH 0/5] spapr: DRC cleanups (part VI) David Gibson
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 1/5] spapr: Remove 'awaiting_allocation' DRC flag David Gibson
@ 2017-06-21  9:18 ` David Gibson
  2017-06-22  9:32   ` Greg Kurz
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 3/5] spapr: Cleanups relating to DRC awaiting_release field David Gibson
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: David Gibson @ 2017-06-21  9:18 UTC (permalink / raw)
  To: mdroth
  Cc: bharata, groug, sursingh, lvivier, qemu-ppc, qemu-devel, David Gibson

This function has two unused parameters - remove them.

It also sets awaiting_release on all paths, except one.  On that path
setting it is harmless, since it will be immediately cleared by
spapr_drc_release().  So factor it out of the if statements.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr.c             |  9 ++-------
 hw/ppc/spapr_drc.c         | 12 ++++++------
 hw/ppc/spapr_pci.c         |  7 +------
 include/hw/ppc/spapr_drc.h |  2 +-
 4 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5aa3760..8b5ab3a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2827,7 +2827,7 @@ static void spapr_memory_unplug_request(HotplugHandler *hotplug_dev,
                               addr / SPAPR_MEMORY_BLOCK_SIZE);
         g_assert(drc);
 
-        spapr_drc_detach(drc, dev, errp);
+        spapr_drc_detach(drc);
         addr += SPAPR_MEMORY_BLOCK_SIZE;
     }
 
@@ -2902,7 +2902,6 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
 {
     int index;
     sPAPRDRConnector *drc;
-    Error *local_err = NULL;
     CPUCore *cc = CPU_CORE(dev);
     int smt = kvmppc_smt_threads();
 
@@ -2919,11 +2918,7 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
     drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * smt);
     g_assert(drc);
 
-    spapr_drc_detach(drc, dev, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        return;
-    }
+    spapr_drc_detach(drc);
 
     spapr_hotplug_req_remove_by_index(drc);
 }
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 59e19f8..dae1f79 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -70,7 +70,7 @@ static uint32_t drc_isolate_physical(sPAPRDRConnector *drc)
         uint32_t drc_index = spapr_drc_index(drc);
         if (drc->configured) {
             trace_spapr_drc_set_isolation_state_finalizing(drc_index);
-            spapr_drc_detach(drc, DEVICE(drc->dev), NULL);
+            spapr_drc_detach(drc);
         } else {
             trace_spapr_drc_set_isolation_state_deferring(drc_index);
         }
@@ -134,7 +134,7 @@ static uint32_t drc_isolate_logical(sPAPRDRConnector *drc)
         uint32_t drc_index = spapr_drc_index(drc);
         if (drc->configured) {
             trace_spapr_drc_set_isolation_state_finalizing(drc_index);
-            spapr_drc_detach(drc, DEVICE(drc->dev), NULL);
+            spapr_drc_detach(drc);
         } else {
             trace_spapr_drc_set_isolation_state_deferring(drc_index);
         }
@@ -187,7 +187,7 @@ static uint32_t drc_set_unusable(sPAPRDRConnector *drc)
     if (drc->awaiting_release) {
         uint32_t drc_index = spapr_drc_index(drc);
         trace_spapr_drc_set_allocation_state_finalizing(drc_index);
-        spapr_drc_detach(drc, DEVICE(drc->dev), NULL);
+        spapr_drc_detach(drc);
     }
 
     return RTAS_OUT_SUCCESS;
@@ -371,20 +371,20 @@ static void spapr_drc_release(sPAPRDRConnector *drc)
     drc->dev = NULL;
 }
 
-void spapr_drc_detach(sPAPRDRConnector *drc, DeviceState *d, Error **errp)
+void spapr_drc_detach(sPAPRDRConnector *drc)
 {
     trace_spapr_drc_detach(spapr_drc_index(drc));
 
+    drc->awaiting_release = true;
+
     if (drc->isolation_state != SPAPR_DR_ISOLATION_STATE_ISOLATED) {
         trace_spapr_drc_awaiting_isolated(spapr_drc_index(drc));
-        drc->awaiting_release = true;
         return;
     }
 
     if (spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PCI &&
         drc->allocation_state != SPAPR_DR_ALLOCATION_STATE_UNUSABLE) {
         trace_spapr_drc_awaiting_unusable(spapr_drc_index(drc));
-        drc->awaiting_release = true;
         return;
     }
 
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index bda8938..af925c0 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1476,7 +1476,6 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
     PCIDevice *pdev = PCI_DEVICE(plugged_dev);
     sPAPRDRConnectorClass *drck;
     sPAPRDRConnector *drc = spapr_phb_get_pci_drc(phb, pdev);
-    Error *local_err = NULL;
 
     if (!phb->dr_enabled) {
         error_setg(errp, QERR_BUS_NO_HOTPLUG,
@@ -1514,11 +1513,7 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
             }
         }
 
-        spapr_drc_detach(drc, DEVICE(pdev), &local_err);
-        if (local_err) {
-            error_propagate(errp, local_err);
-            return;
-        }
+        spapr_drc_detach(drc);
 
         /* if this isn't func 0, defer unplug event. otherwise signal removal
          * for all present functions
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index 42c3722..ab64235 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -234,6 +234,6 @@ int spapr_drc_populate_dt(void *fdt, int fdt_offset, Object *owner,
 
 void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
                       int fdt_start_offset, Error **errp);
-void spapr_drc_detach(sPAPRDRConnector *drc, DeviceState *d, Error **errp);
+void spapr_drc_detach(sPAPRDRConnector *drc);
 
 #endif /* HW_SPAPR_DRC_H */
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 3/5] spapr: Cleanups relating to DRC awaiting_release field
  2017-06-21  9:18 [Qemu-devel] [PATCH 0/5] spapr: DRC cleanups (part VI) David Gibson
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 1/5] spapr: Remove 'awaiting_allocation' DRC flag David Gibson
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 2/5] spapr: Refactor spapr_drc_detach() David Gibson
@ 2017-06-21  9:18 ` David Gibson
  2017-06-22 12:51   ` Greg Kurz
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 4/5] spapr: Consolidate DRC state variables David Gibson
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: David Gibson @ 2017-06-21  9:18 UTC (permalink / raw)
  To: mdroth
  Cc: bharata, groug, sursingh, lvivier, qemu-ppc, qemu-devel, David Gibson

'awaiting_release' indicates that the host has requested an unplug of the
device attached to the DRC, but the guest has not (yet) put the device
into a state where it is safe to complete removal.

1. Rename it to 'unplug_requested' which to me at least is clearer

2. Remove the ->release_pending() method used to check this from outside
spapr_drc.c.  The method only plausibly has one implementation, so use
a plain function (spapr_drc_unplug_requested()) instead.

3. Remove it from the migration stream.  Attempting to migrate mid-unplug
is broken not just for spapr - in general management has no good way to
determine if the device should be present on the destination or not.  So,
until that's fixed, there's no point adding extra things to the stream.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_drc.c         | 26 +++++++++-----------------
 hw/ppc/spapr_pci.c         |  6 ++----
 include/hw/ppc/spapr_drc.h | 11 ++++++-----
 3 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index dae1f79..7fa9595 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -66,7 +66,7 @@ static uint32_t drc_isolate_physical(sPAPRDRConnector *drc)
      * configured state, as suggested by the state diagram from PAPR+
      * 2.7, 13.4
      */
-    if (drc->awaiting_release) {
+    if (drc->unplug_requested) {
         uint32_t drc_index = spapr_drc_index(drc);
         if (drc->configured) {
             trace_spapr_drc_set_isolation_state_finalizing(drc_index);
@@ -116,7 +116,7 @@ static uint32_t drc_isolate_logical(sPAPRDRConnector *drc)
      * actually being unplugged, fail the isolation request here.
      */
     if (spapr_drc_type(drc) == SPAPR_DR_CONNECTOR_TYPE_LMB
-        && !drc->awaiting_release) {
+        && !drc->unplug_requested) {
         return RTAS_OUT_HW_ERROR;
     }
 
@@ -130,7 +130,7 @@ static uint32_t drc_isolate_logical(sPAPRDRConnector *drc)
      * configured state, as suggested by the state diagram from PAPR+
      * 2.7, 13.4
      */
-    if (drc->awaiting_release) {
+    if (drc->unplug_requested) {
         uint32_t drc_index = spapr_drc_index(drc);
         if (drc->configured) {
             trace_spapr_drc_set_isolation_state_finalizing(drc_index);
@@ -170,7 +170,7 @@ static uint32_t drc_set_usable(sPAPRDRConnector *drc)
     if (!drc->dev) {
         return RTAS_OUT_NO_SUCH_INDICATOR;
     }
-    if (drc->awaiting_release) {
+    if (drc->unplug_requested) {
         /* Don't allow the guest to move a device away from UNUSABLE
          * state when we want to unplug it */
         return RTAS_OUT_NO_SUCH_INDICATOR;
@@ -184,7 +184,7 @@ static uint32_t drc_set_usable(sPAPRDRConnector *drc)
 static uint32_t drc_set_unusable(sPAPRDRConnector *drc)
 {
     drc->allocation_state = SPAPR_DR_ALLOCATION_STATE_UNUSABLE;
-    if (drc->awaiting_release) {
+    if (drc->unplug_requested) {
         uint32_t drc_index = spapr_drc_index(drc);
         trace_spapr_drc_set_allocation_state_finalizing(drc_index);
         spapr_drc_detach(drc);
@@ -363,7 +363,7 @@ static void spapr_drc_release(sPAPRDRConnector *drc)
 
     drck->release(drc->dev);
 
-    drc->awaiting_release = false;
+    drc->unplug_requested = false;
     g_free(drc->fdt);
     drc->fdt = NULL;
     drc->fdt_start_offset = 0;
@@ -375,7 +375,7 @@ void spapr_drc_detach(sPAPRDRConnector *drc)
 {
     trace_spapr_drc_detach(spapr_drc_index(drc));
 
-    drc->awaiting_release = true;
+    drc->unplug_requested = true;
 
     if (drc->isolation_state != SPAPR_DR_ISOLATION_STATE_ISOLATED) {
         trace_spapr_drc_awaiting_isolated(spapr_drc_index(drc));
@@ -391,11 +391,6 @@ void spapr_drc_detach(sPAPRDRConnector *drc)
     spapr_drc_release(drc);
 }
 
-static bool release_pending(sPAPRDRConnector *drc)
-{
-    return drc->awaiting_release;
-}
-
 static void drc_reset(void *opaque)
 {
     sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(opaque);
@@ -408,7 +403,7 @@ static void drc_reset(void *opaque)
     /* immediately upon reset we can safely assume DRCs whose devices
      * are pending removal can be safely removed.
      */
-    if (drc->awaiting_release) {
+    if (drc->unplug_requested) {
         spapr_drc_release(drc);
     }
 
@@ -453,7 +448,7 @@ static bool spapr_drc_needed(void *opaque)
     case SPAPR_DR_CONNECTOR_TYPE_LMB:
         rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
                (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
-               drc->configured && !drc->awaiting_release);
+               drc->configured);
         break;
     case SPAPR_DR_CONNECTOR_TYPE_PHB:
     case SPAPR_DR_CONNECTOR_TYPE_VIO:
@@ -473,7 +468,6 @@ static const VMStateDescription vmstate_spapr_drc = {
         VMSTATE_UINT32(allocation_state, sPAPRDRConnector),
         VMSTATE_UINT32(dr_indicator, sPAPRDRConnector),
         VMSTATE_BOOL(configured, sPAPRDRConnector),
-        VMSTATE_BOOL(awaiting_release, sPAPRDRConnector),
         VMSTATE_END_OF_LIST()
     }
 };
@@ -564,11 +558,9 @@ static void spapr_dr_connector_instance_init(Object *obj)
 static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
 {
     DeviceClass *dk = DEVICE_CLASS(k);
-    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
 
     dk->realize = realize;
     dk->unrealize = unrealize;
-    drck->release_pending = release_pending;
     /*
      * Reason: it crashes FIXME find and document the real reason
      */
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index af925c0..3dfb77d 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1474,7 +1474,6 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
 {
     sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(DEVICE(plug_handler));
     PCIDevice *pdev = PCI_DEVICE(plugged_dev);
-    sPAPRDRConnectorClass *drck;
     sPAPRDRConnector *drc = spapr_phb_get_pci_drc(phb, pdev);
 
     if (!phb->dr_enabled) {
@@ -1486,8 +1485,7 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
     g_assert(drc);
     g_assert(drc->dev == plugged_dev);
 
-    drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
-    if (!drck->release_pending(drc)) {
+    if (!spapr_drc_unplug_requested(drc)) {
         PCIBus *bus = PCI_BUS(qdev_get_parent_bus(DEVICE(pdev)));
         uint32_t slotnr = PCI_SLOT(pdev->devfn);
         sPAPRDRConnector *func_drc;
@@ -1503,7 +1501,7 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
                 func_drck = SPAPR_DR_CONNECTOR_GET_CLASS(func_drc);
                 state = func_drck->dr_entity_sense(func_drc);
                 if (state == SPAPR_DR_ENTITY_SENSE_PRESENT
-                    && !func_drck->release_pending(func_drc)) {
+                    && !spapr_drc_unplug_requested(func_drc)) {
                     error_setg(errp,
                                "PCI: slot %d, function %d still present. "
                                "Must unplug all non-0 functions first.",
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index ab64235..7846cca 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -198,10 +198,9 @@ typedef struct sPAPRDRConnector {
     bool configured;
     sPAPRConfigureConnectorState *ccs;
 
-    bool awaiting_release;
-
     /* device pointer, via link property */
     DeviceState *dev;
+    bool unplug_requested;
 } sPAPRDRConnector;
 
 typedef struct sPAPRDRConnectorClass {
@@ -217,9 +216,6 @@ typedef struct sPAPRDRConnectorClass {
     uint32_t (*isolate)(sPAPRDRConnector *drc);
     uint32_t (*unisolate)(sPAPRDRConnector *drc);
     void (*release)(DeviceState *dev);
-
-    /* QEMU interfaces for managing hotplug operations */
-    bool (*release_pending)(sPAPRDRConnector *drc);
 } sPAPRDRConnectorClass;
 
 uint32_t spapr_drc_index(sPAPRDRConnector *drc);
@@ -236,4 +232,9 @@ void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
                       int fdt_start_offset, Error **errp);
 void spapr_drc_detach(sPAPRDRConnector *drc);
 
+static inline bool spapr_drc_unplug_requested(sPAPRDRConnector *drc)
+{
+    return drc->unplug_requested;
+}
+
 #endif /* HW_SPAPR_DRC_H */
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 4/5] spapr: Consolidate DRC state variables
  2017-06-21  9:18 [Qemu-devel] [PATCH 0/5] spapr: DRC cleanups (part VI) David Gibson
                   ` (2 preceding siblings ...)
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 3/5] spapr: Cleanups relating to DRC awaiting_release field David Gibson
@ 2017-06-21  9:18 ` David Gibson
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 5/5] spapr: Remove sPAPRConfigureConnectorState sub-structure David Gibson
  2017-07-04 21:13 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI) Daniel Henrique Barboza
  5 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2017-06-21  9:18 UTC (permalink / raw)
  To: mdroth
  Cc: bharata, groug, sursingh, lvivier, qemu-ppc, qemu-devel, David Gibson

Each DRC has three fields describing its state: isolation_state,
allocation_state and configured.  At first this seems like a reasonable
representation, since its based directly on the PAPR defined
isolation-state and allocation-state indicators.  However:
  * Only a few combinations of the two fields' values are permitted
  * allocation_state isn't used at all for physical DRCs
  * The indicators are write only so they don't really have a well
    defined current value independent of each other

This replaces these variables with a single state variable, whose names
and numbers are based on the diagram in LoPAPR section 13.4.  Along with
this we add code to check the current state on various operations and make
sure the requested transition is permitted.

Strictly speaking, this makes guest visible changes to behaviour (since we
probably allowed some transitions we shouldn't have before).  However, a
hypothetical guest broken by that wasn't PAPR compliant, and probably
wouldn't have worked under PowerVM.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_drc.c         | 224 ++++++++++++++++++++++++---------------------
 hw/ppc/trace-events        |   3 +-
 include/hw/ppc/spapr_drc.h |  25 ++++-
 3 files changed, 144 insertions(+), 108 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 7fa9595..54c3757 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -48,6 +48,17 @@ uint32_t spapr_drc_index(sPAPRDRConnector *drc)
 
 static uint32_t drc_isolate_physical(sPAPRDRConnector *drc)
 {
+    switch (drc->state) {
+    case SPAPR_DRC_STATE_PHYSICAL_POWERON:
+        return RTAS_OUT_SUCCESS; /* Nothing to do */
+    case SPAPR_DRC_STATE_PHYSICAL_CONFIGURED:
+        break; /* see below */
+    case SPAPR_DRC_STATE_PHYSICAL_UNISOLATE:
+        return RTAS_OUT_PARAM_ERROR; /* not allowed */
+    default:
+        g_assert_not_reached();
+    }
+
     /* if the guest is configuring a device attached to this DRC, we
      * should reset the configuration state at this point since it may
      * no longer be reliable (guest released device and needs to start
@@ -56,32 +67,29 @@ static uint32_t drc_isolate_physical(sPAPRDRConnector *drc)
     g_free(drc->ccs);
     drc->ccs = NULL;
 
-    drc->isolation_state = SPAPR_DR_ISOLATION_STATE_ISOLATED;
+    drc->state = SPAPR_DRC_STATE_PHYSICAL_POWERON;
 
-    /* if we're awaiting release, but still in an unconfigured state,
-     * it's likely the guest is still in the process of configuring
-     * the device and is transitioning the devices to an ISOLATED
-     * state as a part of that process. so we only complete the
-     * removal when this transition happens for a device in a
-     * configured state, as suggested by the state diagram from PAPR+
-     * 2.7, 13.4
-     */
     if (drc->unplug_requested) {
         uint32_t drc_index = spapr_drc_index(drc);
-        if (drc->configured) {
-            trace_spapr_drc_set_isolation_state_finalizing(drc_index);
-            spapr_drc_detach(drc);
-        } else {
-            trace_spapr_drc_set_isolation_state_deferring(drc_index);
-        }
+        trace_spapr_drc_set_isolation_state_finalizing(drc_index);
+        spapr_drc_detach(drc);
     }
-    drc->configured = false;
 
     return RTAS_OUT_SUCCESS;
 }
 
 static uint32_t drc_unisolate_physical(sPAPRDRConnector *drc)
 {
+    switch (drc->state) {
+    case SPAPR_DRC_STATE_PHYSICAL_UNISOLATE:
+    case SPAPR_DRC_STATE_PHYSICAL_CONFIGURED:
+        return RTAS_OUT_SUCCESS; /* Nothing to do */
+    case SPAPR_DRC_STATE_PHYSICAL_POWERON:
+        break; /* see below */
+    default:
+        g_assert_not_reached();
+    }
+
     /* cannot unisolate a non-existent resource, and, or resources
      * which are in an 'UNUSABLE' allocation state. (PAPR 2.7,
      * 13.5.3.5)
@@ -90,13 +98,25 @@ static uint32_t drc_unisolate_physical(sPAPRDRConnector *drc)
         return RTAS_OUT_NO_SUCH_INDICATOR;
     }
 
-    drc->isolation_state = SPAPR_DR_ISOLATION_STATE_UNISOLATED;
+    drc->state = SPAPR_DRC_STATE_PHYSICAL_UNISOLATE;
 
     return RTAS_OUT_SUCCESS;
 }
 
 static uint32_t drc_isolate_logical(sPAPRDRConnector *drc)
 {
+    switch (drc->state) {
+    case SPAPR_DRC_STATE_LOGICAL_AVAILABLE:
+    case SPAPR_DRC_STATE_LOGICAL_UNUSABLE:
+        return RTAS_OUT_SUCCESS; /* Nothing to do */
+    case SPAPR_DRC_STATE_LOGICAL_CONFIGURED:
+        break; /* see below */
+    case SPAPR_DRC_STATE_LOGICAL_UNISOLATE:
+        return RTAS_OUT_PARAM_ERROR; /* not allowed */
+    default:
+        g_assert_not_reached();
+    }
+
     /* if the guest is configuring a device attached to this DRC, we
      * should reset the configuration state at this point since it may
      * no longer be reliable (guest released device and needs to start
@@ -120,7 +140,7 @@ static uint32_t drc_isolate_logical(sPAPRDRConnector *drc)
         return RTAS_OUT_HW_ERROR;
     }
 
-    drc->isolation_state = SPAPR_DR_ISOLATION_STATE_ISOLATED;
+    drc->state = SPAPR_DRC_STATE_LOGICAL_AVAILABLE;
 
     /* if we're awaiting release, but still in an unconfigured state,
      * it's likely the guest is still in the process of configuring
@@ -132,36 +152,46 @@ static uint32_t drc_isolate_logical(sPAPRDRConnector *drc)
      */
     if (drc->unplug_requested) {
         uint32_t drc_index = spapr_drc_index(drc);
-        if (drc->configured) {
-            trace_spapr_drc_set_isolation_state_finalizing(drc_index);
-            spapr_drc_detach(drc);
-        } else {
-            trace_spapr_drc_set_isolation_state_deferring(drc_index);
-        }
+        trace_spapr_drc_set_isolation_state_finalizing(drc_index);
+        spapr_drc_detach(drc);
     }
-    drc->configured = false;
-
     return RTAS_OUT_SUCCESS;
 }
 
 static uint32_t drc_unisolate_logical(sPAPRDRConnector *drc)
 {
-    /* cannot unisolate a non-existent resource, and, or resources
-     * which are in an 'UNUSABLE' allocation state. (PAPR 2.7,
-     * 13.5.3.5)
-     */
-    if (!drc->dev ||
-        drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) {
-        return RTAS_OUT_NO_SUCH_INDICATOR;
+    switch (drc->state) {
+    case SPAPR_DRC_STATE_LOGICAL_UNISOLATE:
+    case SPAPR_DRC_STATE_LOGICAL_CONFIGURED:
+        return RTAS_OUT_SUCCESS; /* Nothing to do */
+    case SPAPR_DRC_STATE_LOGICAL_AVAILABLE:
+        break; /* see below */
+    case SPAPR_DRC_STATE_LOGICAL_UNUSABLE:
+        return RTAS_OUT_NO_SUCH_INDICATOR; /* not allowed */
+    default:
+        g_assert_not_reached();
     }
 
-    drc->isolation_state = SPAPR_DR_ISOLATION_STATE_UNISOLATED;
+    /* Move to AVAILABLE state should have ensured device was present */
+    g_assert(drc->dev);
 
+    drc->state = SPAPR_DRC_STATE_LOGICAL_UNISOLATE;
     return RTAS_OUT_SUCCESS;
 }
 
 static uint32_t drc_set_usable(sPAPRDRConnector *drc)
 {
+    switch (drc->state) {
+    case SPAPR_DRC_STATE_LOGICAL_AVAILABLE:
+    case SPAPR_DRC_STATE_LOGICAL_UNISOLATE:
+    case SPAPR_DRC_STATE_LOGICAL_CONFIGURED:
+        return RTAS_OUT_SUCCESS; /* Nothing to do */
+    case SPAPR_DRC_STATE_LOGICAL_UNUSABLE:
+        break; /* see below */
+    default:
+        g_assert_not_reached();
+    }
+
     /* if there's no resource/device associated with the DRC, there's
      * no way for us to put it in an allocation state consistent with
      * being 'USABLE'. PAPR 2.7, 13.5.3.4 documents that this should
@@ -176,14 +206,26 @@ static uint32_t drc_set_usable(sPAPRDRConnector *drc)
         return RTAS_OUT_NO_SUCH_INDICATOR;
     }
 
-    drc->allocation_state = SPAPR_DR_ALLOCATION_STATE_USABLE;
+    drc->state = SPAPR_DRC_STATE_LOGICAL_AVAILABLE;
 
     return RTAS_OUT_SUCCESS;
 }
 
 static uint32_t drc_set_unusable(sPAPRDRConnector *drc)
 {
-    drc->allocation_state = SPAPR_DR_ALLOCATION_STATE_UNUSABLE;
+    switch (drc->state) {
+    case SPAPR_DRC_STATE_LOGICAL_UNUSABLE:
+        return RTAS_OUT_SUCCESS; /* Nothing to do */
+    case SPAPR_DRC_STATE_LOGICAL_AVAILABLE:
+        break; /* see below */
+    case SPAPR_DRC_STATE_LOGICAL_UNISOLATE:
+    case SPAPR_DRC_STATE_LOGICAL_CONFIGURED:
+        return RTAS_OUT_NO_SUCH_INDICATOR; /* not allowed */
+    default:
+        g_assert_not_reached();
+    }
+
+    drc->state = SPAPR_DRC_STATE_LOGICAL_UNUSABLE;
     if (drc->unplug_requested) {
         uint32_t drc_index = spapr_drc_index(drc);
         trace_spapr_drc_set_allocation_state_finalizing(drc_index);
@@ -241,11 +283,16 @@ static sPAPRDREntitySense physical_entity_sense(sPAPRDRConnector *drc)
 
 static sPAPRDREntitySense logical_entity_sense(sPAPRDRConnector *drc)
 {
-    if (drc->dev
-        && (drc->allocation_state != SPAPR_DR_ALLOCATION_STATE_UNUSABLE)) {
-        return SPAPR_DR_ENTITY_SENSE_PRESENT;
-    } else {
+    switch (drc->state) {
+    case SPAPR_DRC_STATE_LOGICAL_UNUSABLE:
         return SPAPR_DR_ENTITY_SENSE_UNUSABLE;
+    case SPAPR_DRC_STATE_LOGICAL_AVAILABLE:
+    case SPAPR_DRC_STATE_LOGICAL_UNISOLATE:
+    case SPAPR_DRC_STATE_LOGICAL_CONFIGURED:
+        g_assert(drc->dev);
+        return SPAPR_DR_ENTITY_SENSE_PRESENT;
+    default:
+        g_assert_not_reached();
     }
 }
 
@@ -338,13 +385,12 @@ void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
 {
     trace_spapr_drc_attach(spapr_drc_index(drc));
 
-    if (drc->isolation_state != SPAPR_DR_ISOLATION_STATE_ISOLATED) {
+    if (drc->dev) {
         error_setg(errp, "an attached device is still awaiting release");
         return;
     }
-    if (spapr_drc_type(drc) == SPAPR_DR_CONNECTOR_TYPE_PCI) {
-        g_assert(drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE);
-    }
+    g_assert((drc->state == SPAPR_DRC_STATE_LOGICAL_UNUSABLE)
+             || (drc->state == SPAPR_DRC_STATE_PHYSICAL_POWERON));
     g_assert(fdt);
 
     drc->dev = d;
@@ -373,18 +419,16 @@ static void spapr_drc_release(sPAPRDRConnector *drc)
 
 void spapr_drc_detach(sPAPRDRConnector *drc)
 {
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+
     trace_spapr_drc_detach(spapr_drc_index(drc));
 
-    drc->unplug_requested = true;
+    g_assert(drc->dev);
 
-    if (drc->isolation_state != SPAPR_DR_ISOLATION_STATE_ISOLATED) {
-        trace_spapr_drc_awaiting_isolated(spapr_drc_index(drc));
-        return;
-    }
+    drc->unplug_requested = true;
 
-    if (spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PCI &&
-        drc->allocation_state != SPAPR_DR_ALLOCATION_STATE_UNUSABLE) {
-        trace_spapr_drc_awaiting_unusable(spapr_drc_index(drc));
+    if (drc->state != drck->empty_state) {
+        trace_spapr_drc_awaiting_quiesce(spapr_drc_index(drc));
         return;
     }
 
@@ -394,6 +438,7 @@ void spapr_drc_detach(sPAPRDRConnector *drc)
 static void drc_reset(void *opaque)
 {
     sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(opaque);
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 
     trace_spapr_drc_reset(spapr_drc_index(drc));
 
@@ -410,19 +455,10 @@ static void drc_reset(void *opaque)
     drc->awaiting_allocation = false;
 
     if (drc->dev) {
-        /* A device present at reset is coldplugged */
-        drc->isolation_state = SPAPR_DR_ISOLATION_STATE_UNISOLATED;
-        if (spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PCI) {
-            drc->allocation_state = SPAPR_DR_ALLOCATION_STATE_USABLE;
-        }
-        drc->dr_indicator = SPAPR_DR_INDICATOR_ACTIVE;
+        /* A device present at reset is ready to go, same as coldplugged */
+        drc->state = drck->ready_state;
     } else {
-        /* Otherwise device is absent, but might be hotplugged */
-        drc->isolation_state = SPAPR_DR_ISOLATION_STATE_ISOLATED;
-        if (spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PCI) {
-            drc->allocation_state = SPAPR_DR_ALLOCATION_STATE_UNUSABLE;
-        }
-        drc->dr_indicator = SPAPR_DR_INDICATOR_INACTIVE;
+        drc->state = drck->empty_state;
     }
 }
 
@@ -430,7 +466,6 @@ static bool spapr_drc_needed(void *opaque)
 {
     sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
     sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
-    bool rc = false;
     sPAPRDREntitySense value = drck->dr_entity_sense(drc);
 
     /* If no dev is plugged in there is no need to migrate the DRC state */
@@ -439,23 +474,10 @@ static bool spapr_drc_needed(void *opaque)
     }
 
     /*
-     * If there is dev plugged in, we need to migrate the DRC state when
-     * it is different from cold-plugged state
-     */
-    switch (spapr_drc_type(drc)) {
-    case SPAPR_DR_CONNECTOR_TYPE_PCI:
-    case SPAPR_DR_CONNECTOR_TYPE_CPU:
-    case SPAPR_DR_CONNECTOR_TYPE_LMB:
-        rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
-               (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
-               drc->configured);
-        break;
-    case SPAPR_DR_CONNECTOR_TYPE_PHB:
-    case SPAPR_DR_CONNECTOR_TYPE_VIO:
-    default:
-        g_assert_not_reached();
-    }
-    return rc;
+     * We need to migrate the state if it's not equal to the expected
+     * long-term state, which is the same as the coldplugged initial
+     * state */
+    return (drc->state != drck->ready_state);
 }
 
 static const VMStateDescription vmstate_spapr_drc = {
@@ -464,10 +486,8 @@ static const VMStateDescription vmstate_spapr_drc = {
     .minimum_version_id = 1,
     .needed = spapr_drc_needed,
     .fields  = (VMStateField []) {
-        VMSTATE_UINT32(isolation_state, sPAPRDRConnector),
-        VMSTATE_UINT32(allocation_state, sPAPRDRConnector),
+        VMSTATE_UINT32(state, sPAPRDRConnector),
         VMSTATE_UINT32(dr_indicator, sPAPRDRConnector),
-        VMSTATE_BOOL(configured, sPAPRDRConnector),
         VMSTATE_END_OF_LIST()
     }
 };
@@ -536,23 +556,20 @@ sPAPRDRConnector *spapr_dr_connector_new(Object *owner, const char *type,
     object_property_set_bool(OBJECT(drc), true, "realized", NULL);
     g_free(prop_name);
 
-    /* PCI slot always start in a USABLE state, and stay there */
-    if (spapr_drc_type(drc) == SPAPR_DR_CONNECTOR_TYPE_PCI) {
-        drc->allocation_state = SPAPR_DR_ALLOCATION_STATE_USABLE;
-    }
-
     return drc;
 }
 
 static void spapr_dr_connector_instance_init(Object *obj)
 {
     sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(obj);
+    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 
     object_property_add_uint32_ptr(obj, "id", &drc->id, NULL);
     object_property_add(obj, "index", "uint32", prop_get_index,
                         NULL, NULL, NULL, NULL);
     object_property_add(obj, "fdt", "struct", prop_get_fdt,
                         NULL, NULL, NULL, NULL);
+    drc->state = drck->empty_state;
 }
 
 static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
@@ -574,6 +591,8 @@ static void spapr_drc_physical_class_init(ObjectClass *k, void *data)
     drck->dr_entity_sense = physical_entity_sense;
     drck->isolate = drc_isolate_physical;
     drck->unisolate = drc_unisolate_physical;
+    drck->ready_state = SPAPR_DRC_STATE_PHYSICAL_CONFIGURED;
+    drck->empty_state = SPAPR_DRC_STATE_PHYSICAL_POWERON;
 }
 
 static void spapr_drc_logical_class_init(ObjectClass *k, void *data)
@@ -583,6 +602,8 @@ static void spapr_drc_logical_class_init(ObjectClass *k, void *data)
     drck->dr_entity_sense = logical_entity_sense;
     drck->isolate = drc_isolate_logical;
     drck->unisolate = drc_unisolate_logical;
+    drck->ready_state = SPAPR_DRC_STATE_LOGICAL_CONFIGURED;
+    drck->empty_state = SPAPR_DRC_STATE_LOGICAL_UNUSABLE;
 }
 
 static void spapr_drc_cpu_class_init(ObjectClass *k, void *data)
@@ -986,6 +1007,7 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
     uint64_t wa_offset;
     uint32_t drc_index;
     sPAPRDRConnector *drc;
+    sPAPRDRConnectorClass *drck;
     sPAPRConfigureConnectorState *ccs;
     sPAPRDRCCResponse resp = SPAPR_DR_CC_RESPONSE_CONTINUE;
     int rc;
@@ -1005,12 +1027,17 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
         goto out;
     }
 
-    if (!drc->fdt) {
-        trace_spapr_rtas_ibm_configure_connector_missing_fdt(drc_index);
+    if ((drc->state != SPAPR_DRC_STATE_LOGICAL_UNISOLATE)
+        && (drc->state != SPAPR_DRC_STATE_PHYSICAL_UNISOLATE)) {
+        /* Need to unisolate the device before configuring */
         rc = SPAPR_DR_CC_RESPONSE_NOT_CONFIGURABLE;
         goto out;
     }
 
+    g_assert(drc->fdt);
+
+    drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+
     ccs = drc->ccs;
     if (!ccs) {
         ccs = g_new0(sPAPRConfigureConnectorState, 1);
@@ -1040,18 +1067,11 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
         case FDT_END_NODE:
             ccs->fdt_depth--;
             if (ccs->fdt_depth == 0) {
-                sPAPRDRIsolationState state = drc->isolation_state;
                 uint32_t drc_index = spapr_drc_index(drc);
-                /* done sending the device tree, don't need to track
-                 * the state anymore
-                 */
+
+                /* done sending the device tree, move to configured state */
                 trace_spapr_drc_set_configured(drc_index);
-                if (state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) {
-                    drc->configured = true;
-                } else {
-                    /* guest should be not configuring an isolated device */
-                    trace_spapr_drc_set_configured_skipping(drc_index);
-                }
+                drc->state = drck->ready_state;
                 g_free(ccs);
                 drc->ccs = NULL;
                 ccs = NULL;
diff --git a/hw/ppc/trace-events b/hw/ppc/trace-events
index 3e8e3cf..8e79f7e 100644
--- a/hw/ppc/trace-events
+++ b/hw/ppc/trace-events
@@ -46,8 +46,7 @@ spapr_drc_set_configured(uint32_t index) "drc: 0x%"PRIx32
 spapr_drc_set_configured_skipping(uint32_t index) "drc: 0x%"PRIx32", isolated device"
 spapr_drc_attach(uint32_t index) "drc: 0x%"PRIx32
 spapr_drc_detach(uint32_t index) "drc: 0x%"PRIx32
-spapr_drc_awaiting_isolated(uint32_t index) "drc: 0x%"PRIx32
-spapr_drc_awaiting_unusable(uint32_t index) "drc: 0x%"PRIx32
+spapr_drc_awaiting_quiesce(uint32_t index) "drc: 0x%"PRIx32
 spapr_drc_awaiting_allocation(uint32_t index) "drc: 0x%"PRIx32
 spapr_drc_reset(uint32_t index) "drc: 0x%"PRIx32
 spapr_drc_realize(uint32_t index) "drc: 0x%"PRIx32
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index 7846cca..c229722 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -172,6 +172,24 @@ typedef enum {
     SPAPR_DR_CC_RESPONSE_NOT_CONFIGURABLE = -9003,
 } sPAPRDRCCResponse;
 
+typedef enum {
+    /*
+     * Values come from Fig. 12 in LoPAPR section 13.4
+     *
+     * These are exposed in the migration stream, so don't change
+     * them.
+     */
+    SPAPR_DRC_STATE_INVALID             = 0,
+    SPAPR_DRC_STATE_LOGICAL_UNUSABLE    = 1,
+    SPAPR_DRC_STATE_LOGICAL_AVAILABLE   = 2,
+    SPAPR_DRC_STATE_LOGICAL_UNISOLATE   = 3,
+    SPAPR_DRC_STATE_LOGICAL_CONFIGURED  = 4,
+    SPAPR_DRC_STATE_PHYSICAL_AVAILABLE  = 5,
+    SPAPR_DRC_STATE_PHYSICAL_POWERON    = 6,
+    SPAPR_DRC_STATE_PHYSICAL_UNISOLATE  = 7,
+    SPAPR_DRC_STATE_PHYSICAL_CONFIGURED = 8,
+} sPAPRDRCState;
+
 /* rtas-configure-connector state */
 typedef struct sPAPRConfigureConnectorState {
     int fdt_offset;
@@ -188,14 +206,11 @@ typedef struct sPAPRDRConnector {
     /* DR-indicator */
     uint32_t dr_indicator;
 
-    /* sensor/indicator states */
-    uint32_t isolation_state;
-    uint32_t allocation_state;
+    uint32_t state;
 
     /* configure-connector state */
     void *fdt;
     int fdt_start_offset;
-    bool configured;
     sPAPRConfigureConnectorState *ccs;
 
     /* device pointer, via link property */
@@ -206,6 +221,8 @@ typedef struct sPAPRDRConnector {
 typedef struct sPAPRDRConnectorClass {
     /*< private >*/
     DeviceClass parent;
+    sPAPRDRCState empty_state;
+    sPAPRDRCState ready_state;
 
     /*< public >*/
     sPAPRDRConnectorTypeShift typeshift;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 5/5] spapr: Remove sPAPRConfigureConnectorState sub-structure
  2017-06-21  9:18 [Qemu-devel] [PATCH 0/5] spapr: DRC cleanups (part VI) David Gibson
                   ` (3 preceding siblings ...)
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 4/5] spapr: Consolidate DRC state variables David Gibson
@ 2017-06-21  9:18 ` David Gibson
  2017-07-04 21:13 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI) Daniel Henrique Barboza
  5 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2017-06-21  9:18 UTC (permalink / raw)
  To: mdroth
  Cc: bharata, groug, sursingh, lvivier, qemu-ppc, qemu-devel, David Gibson

Most of the time, the state of a DRC object is contained in the single
'state' variable.  However, during the transition from UNISOLATE to
CONFIGURED state requires multiple calls to the ibm,configure-connector
RTAS call to retrieve the device tree for the attached device.  We need
some extra state to keep track of where we're up to in delivering the
device tree information to the guest.

Currently that extra state is in a sPAPRConfigureConnectorState
substructure which is only allocated when we're in the middle of the
configure connector process.  That sounds like a good idea, but the extra
state is only two integers - on many platforms that will take up the same
room as the (maybe NULL) ccs pointer even before malloc() overhead.  Plus
it's another object whose lifetime we need to manage.  In short, it's not
worth it.

So, fold the sPAPRConfigureConnectorState substructure directly into the
DRC object.

Previously the structure was allocated lazily when the configure-connector
call discovers it's not there.  Now, we need to initialize the subfields
pre-emptively, as soon as we enter UNISOLATE state.

Although it's not strictly necessary (the field values should only ever
be consulted when in UNISOLATE state), we try to keep them at -1 when in
other states, as a debugging aid.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ppc/spapr_drc.c         | 56 +++++++++++++++-------------------------------
 include/hw/ppc/spapr_drc.h | 16 +++++--------
 2 files changed, 24 insertions(+), 48 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 54c3757..4251031 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -59,14 +59,6 @@ static uint32_t drc_isolate_physical(sPAPRDRConnector *drc)
         g_assert_not_reached();
     }
 
-    /* if the guest is configuring a device attached to this DRC, we
-     * should reset the configuration state at this point since it may
-     * no longer be reliable (guest released device and needs to start
-     * over, or unplug occurred so the FDT is no longer valid)
-     */
-    g_free(drc->ccs);
-    drc->ccs = NULL;
-
     drc->state = SPAPR_DRC_STATE_PHYSICAL_POWERON;
 
     if (drc->unplug_requested) {
@@ -99,6 +91,8 @@ static uint32_t drc_unisolate_physical(sPAPRDRConnector *drc)
     }
 
     drc->state = SPAPR_DRC_STATE_PHYSICAL_UNISOLATE;
+    drc->ccs_offset = drc->fdt_start_offset;
+    drc->ccs_depth = 0;
 
     return RTAS_OUT_SUCCESS;
 }
@@ -117,14 +111,6 @@ static uint32_t drc_isolate_logical(sPAPRDRConnector *drc)
         g_assert_not_reached();
     }
 
-    /* if the guest is configuring a device attached to this DRC, we
-     * should reset the configuration state at this point since it may
-     * no longer be reliable (guest released device and needs to start
-     * over, or unplug occurred so the FDT is no longer valid)
-     */
-    g_free(drc->ccs);
-    drc->ccs = NULL;
-
     /*
      * Fail any requests to ISOLATE the LMB DRC if this LMB doesn't
      * belong to a DIMM device that is marked for removal.
@@ -176,6 +162,9 @@ static uint32_t drc_unisolate_logical(sPAPRDRConnector *drc)
     g_assert(drc->dev);
 
     drc->state = SPAPR_DRC_STATE_LOGICAL_UNISOLATE;
+    drc->ccs_offset = drc->fdt_start_offset;
+    drc->ccs_depth = 0;
+
     return RTAS_OUT_SUCCESS;
 }
 
@@ -442,9 +431,6 @@ static void drc_reset(void *opaque)
 
     trace_spapr_drc_reset(spapr_drc_index(drc));
 
-    g_free(drc->ccs);
-    drc->ccs = NULL;
-
     /* immediately upon reset we can safely assume DRCs whose devices
      * are pending removal can be safely removed.
      */
@@ -460,6 +446,9 @@ static void drc_reset(void *opaque)
     } else {
         drc->state = drck->empty_state;
     }
+
+    drc->ccs_offset = -1;
+    drc->ccs_depth = -1;
 }
 
 static bool spapr_drc_needed(void *opaque)
@@ -1008,7 +997,6 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
     uint32_t drc_index;
     sPAPRDRConnector *drc;
     sPAPRDRConnectorClass *drck;
-    sPAPRConfigureConnectorState *ccs;
     sPAPRDRCCResponse resp = SPAPR_DR_CC_RESPONSE_CONTINUE;
     int rc;
 
@@ -1038,25 +1026,18 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
 
     drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 
-    ccs = drc->ccs;
-    if (!ccs) {
-        ccs = g_new0(sPAPRConfigureConnectorState, 1);
-        ccs->fdt_offset = drc->fdt_start_offset;
-        drc->ccs = ccs;
-    }
-
     do {
         uint32_t tag;
         const char *name;
         const struct fdt_property *prop;
         int fdt_offset_next, prop_len;
 
-        tag = fdt_next_tag(drc->fdt, ccs->fdt_offset, &fdt_offset_next);
+        tag = fdt_next_tag(drc->fdt, drc->ccs_offset, &fdt_offset_next);
 
         switch (tag) {
         case FDT_BEGIN_NODE:
-            ccs->fdt_depth++;
-            name = fdt_get_name(drc->fdt, ccs->fdt_offset, NULL);
+            drc->ccs_depth++;
+            name = fdt_get_name(drc->fdt, drc->ccs_offset, NULL);
 
             /* provide the name of the next OF node */
             wa_offset = CC_VAL_DATA_OFFSET;
@@ -1065,23 +1046,22 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
             resp = SPAPR_DR_CC_RESPONSE_NEXT_CHILD;
             break;
         case FDT_END_NODE:
-            ccs->fdt_depth--;
-            if (ccs->fdt_depth == 0) {
+            drc->ccs_depth--;
+            if (drc->ccs_depth == 0) {
                 uint32_t drc_index = spapr_drc_index(drc);
 
                 /* done sending the device tree, move to configured state */
                 trace_spapr_drc_set_configured(drc_index);
                 drc->state = drck->ready_state;
-                g_free(ccs);
-                drc->ccs = NULL;
-                ccs = NULL;
+                drc->ccs_offset = -1;
+                drc->ccs_depth = -1;
                 resp = SPAPR_DR_CC_RESPONSE_SUCCESS;
             } else {
                 resp = SPAPR_DR_CC_RESPONSE_PREV_PARENT;
             }
             break;
         case FDT_PROP:
-            prop = fdt_get_property_by_offset(drc->fdt, ccs->fdt_offset,
+            prop = fdt_get_property_by_offset(drc->fdt, drc->ccs_offset,
                                               &prop_len);
             name = fdt_string(drc->fdt, fdt32_to_cpu(prop->nameoff));
 
@@ -1106,8 +1086,8 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
             /* keep seeking for an actionable tag */
             break;
         }
-        if (ccs) {
-            ccs->fdt_offset = fdt_offset_next;
+        if (drc->ccs_offset >= 0) {
+            drc->ccs_offset = fdt_offset_next;
         }
     } while (resp == SPAPR_DR_CC_RESPONSE_CONTINUE);
 
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index c229722..4c54864 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -190,12 +190,6 @@ typedef enum {
     SPAPR_DRC_STATE_PHYSICAL_CONFIGURED = 8,
 } sPAPRDRCState;
 
-/* rtas-configure-connector state */
-typedef struct sPAPRConfigureConnectorState {
-    int fdt_offset;
-    int fdt_depth;
-} sPAPRConfigureConnectorState;
-
 typedef struct sPAPRDRConnector {
     /*< private >*/
     DeviceState parent;
@@ -208,14 +202,16 @@ typedef struct sPAPRDRConnector {
 
     uint32_t state;
 
-    /* configure-connector state */
-    void *fdt;
-    int fdt_start_offset;
-    sPAPRConfigureConnectorState *ccs;
+    /* RTAS ibm,configure-connector state */
+    /* (only valid in UNISOLATE state) */
+    int ccs_offset;
+    int ccs_depth;
 
     /* device pointer, via link property */
     DeviceState *dev;
     bool unplug_requested;
+    void *fdt;
+    int fdt_start_offset;
 } sPAPRDRConnector;
 
 typedef struct sPAPRDRConnectorClass {
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] spapr: Refactor spapr_drc_detach()
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 2/5] spapr: Refactor spapr_drc_detach() David Gibson
@ 2017-06-22  9:32   ` Greg Kurz
  2017-07-04 14:54     ` David Gibson
  0 siblings, 1 reply; 18+ messages in thread
From: Greg Kurz @ 2017-06-22  9:32 UTC (permalink / raw)
  To: David Gibson; +Cc: mdroth, bharata, sursingh, lvivier, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 6217 bytes --]

On Wed, 21 Jun 2017 17:18:45 +0800
David Gibson <david@gibson.dropbear.id.au> wrote:

> This function has two unused parameters - remove them.
> 

It's ok for the d argument but I'm not sure about errp... Indeed it isn't used
in the current code, but looking at the paths below spapr_drc_detach(), we
have:

spapr_drc_detach()
 spapr_drc_release()
  spapr_core_release()
   hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);

and

spapr_drc_detach()
 spapr_drc_release()
  spapr_lmb_release()
   hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);

Shouldn't we pass the errp down to hotplug_handler_unplug() instead of
aborting ?

> It also sets awaiting_release on all paths, except one.  On that path
> setting it is harmless, since it will be immediately cleared by
> spapr_drc_release().  So factor it out of the if statements.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ppc/spapr.c             |  9 ++-------
>  hw/ppc/spapr_drc.c         | 12 ++++++------
>  hw/ppc/spapr_pci.c         |  7 +------
>  include/hw/ppc/spapr_drc.h |  2 +-
>  4 files changed, 10 insertions(+), 20 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 5aa3760..8b5ab3a 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2827,7 +2827,7 @@ static void spapr_memory_unplug_request(HotplugHandler *hotplug_dev,
>                                addr / SPAPR_MEMORY_BLOCK_SIZE);
>          g_assert(drc);
>  
> -        spapr_drc_detach(drc, dev, errp);
> +        spapr_drc_detach(drc);
>          addr += SPAPR_MEMORY_BLOCK_SIZE;
>      }
>  
> @@ -2902,7 +2902,6 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>  {
>      int index;
>      sPAPRDRConnector *drc;
> -    Error *local_err = NULL;
>      CPUCore *cc = CPU_CORE(dev);
>      int smt = kvmppc_smt_threads();
>  
> @@ -2919,11 +2918,7 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>      drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * smt);
>      g_assert(drc);
>  
> -    spapr_drc_detach(drc, dev, &local_err);
> -    if (local_err) {
> -        error_propagate(errp, local_err);
> -        return;
> -    }
> +    spapr_drc_detach(drc);
>  
>      spapr_hotplug_req_remove_by_index(drc);
>  }
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index 59e19f8..dae1f79 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -70,7 +70,7 @@ static uint32_t drc_isolate_physical(sPAPRDRConnector *drc)
>          uint32_t drc_index = spapr_drc_index(drc);
>          if (drc->configured) {
>              trace_spapr_drc_set_isolation_state_finalizing(drc_index);
> -            spapr_drc_detach(drc, DEVICE(drc->dev), NULL);
> +            spapr_drc_detach(drc);
>          } else {
>              trace_spapr_drc_set_isolation_state_deferring(drc_index);
>          }
> @@ -134,7 +134,7 @@ static uint32_t drc_isolate_logical(sPAPRDRConnector *drc)
>          uint32_t drc_index = spapr_drc_index(drc);
>          if (drc->configured) {
>              trace_spapr_drc_set_isolation_state_finalizing(drc_index);
> -            spapr_drc_detach(drc, DEVICE(drc->dev), NULL);
> +            spapr_drc_detach(drc);
>          } else {
>              trace_spapr_drc_set_isolation_state_deferring(drc_index);
>          }
> @@ -187,7 +187,7 @@ static uint32_t drc_set_unusable(sPAPRDRConnector *drc)
>      if (drc->awaiting_release) {
>          uint32_t drc_index = spapr_drc_index(drc);
>          trace_spapr_drc_set_allocation_state_finalizing(drc_index);
> -        spapr_drc_detach(drc, DEVICE(drc->dev), NULL);
> +        spapr_drc_detach(drc);
>      }
>  
>      return RTAS_OUT_SUCCESS;
> @@ -371,20 +371,20 @@ static void spapr_drc_release(sPAPRDRConnector *drc)
>      drc->dev = NULL;
>  }
>  
> -void spapr_drc_detach(sPAPRDRConnector *drc, DeviceState *d, Error **errp)
> +void spapr_drc_detach(sPAPRDRConnector *drc)
>  {
>      trace_spapr_drc_detach(spapr_drc_index(drc));
>  
> +    drc->awaiting_release = true;
> +
>      if (drc->isolation_state != SPAPR_DR_ISOLATION_STATE_ISOLATED) {
>          trace_spapr_drc_awaiting_isolated(spapr_drc_index(drc));
> -        drc->awaiting_release = true;
>          return;
>      }
>  
>      if (spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PCI &&
>          drc->allocation_state != SPAPR_DR_ALLOCATION_STATE_UNUSABLE) {
>          trace_spapr_drc_awaiting_unusable(spapr_drc_index(drc));
> -        drc->awaiting_release = true;
>          return;
>      }
>  
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index bda8938..af925c0 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1476,7 +1476,6 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
>      PCIDevice *pdev = PCI_DEVICE(plugged_dev);
>      sPAPRDRConnectorClass *drck;
>      sPAPRDRConnector *drc = spapr_phb_get_pci_drc(phb, pdev);
> -    Error *local_err = NULL;
>  
>      if (!phb->dr_enabled) {
>          error_setg(errp, QERR_BUS_NO_HOTPLUG,
> @@ -1514,11 +1513,7 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
>              }
>          }
>  
> -        spapr_drc_detach(drc, DEVICE(pdev), &local_err);
> -        if (local_err) {
> -            error_propagate(errp, local_err);
> -            return;
> -        }
> +        spapr_drc_detach(drc);
>  
>          /* if this isn't func 0, defer unplug event. otherwise signal removal
>           * for all present functions
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index 42c3722..ab64235 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -234,6 +234,6 @@ int spapr_drc_populate_dt(void *fdt, int fdt_offset, Object *owner,
>  
>  void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
>                        int fdt_start_offset, Error **errp);
> -void spapr_drc_detach(sPAPRDRConnector *drc, DeviceState *d, Error **errp);
> +void spapr_drc_detach(sPAPRDRConnector *drc);
>  
>  #endif /* HW_SPAPR_DRC_H */


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] spapr: Cleanups relating to DRC awaiting_release field
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 3/5] spapr: Cleanups relating to DRC awaiting_release field David Gibson
@ 2017-06-22 12:51   ` Greg Kurz
  0 siblings, 0 replies; 18+ messages in thread
From: Greg Kurz @ 2017-06-22 12:51 UTC (permalink / raw)
  To: David Gibson; +Cc: mdroth, bharata, sursingh, lvivier, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 8973 bytes --]

On Wed, 21 Jun 2017 17:18:46 +0800
David Gibson <david@gibson.dropbear.id.au> wrote:

> 'awaiting_release' indicates that the host has requested an unplug of the
> device attached to the DRC, but the guest has not (yet) put the device
> into a state where it is safe to complete removal.
> 
> 1. Rename it to 'unplug_requested' which to me at least is clearer
> 
> 2. Remove the ->release_pending() method used to check this from outside
> spapr_drc.c.  The method only plausibly has one implementation, so use
> a plain function (spapr_drc_unplug_requested()) instead.
> 
> 3. Remove it from the migration stream.  Attempting to migrate mid-unplug
> is broken not just for spapr - in general management has no good way to
> determine if the device should be present on the destination or not.  So,
> until that's fixed, there's no point adding extra things to the stream.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/ppc/spapr_drc.c         | 26 +++++++++-----------------
>  hw/ppc/spapr_pci.c         |  6 ++----
>  include/hw/ppc/spapr_drc.h | 11 ++++++-----
>  3 files changed, 17 insertions(+), 26 deletions(-)
> 
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index dae1f79..7fa9595 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -66,7 +66,7 @@ static uint32_t drc_isolate_physical(sPAPRDRConnector *drc)
>       * configured state, as suggested by the state diagram from PAPR+
>       * 2.7, 13.4
>       */
> -    if (drc->awaiting_release) {
> +    if (drc->unplug_requested) {
>          uint32_t drc_index = spapr_drc_index(drc);
>          if (drc->configured) {
>              trace_spapr_drc_set_isolation_state_finalizing(drc_index);
> @@ -116,7 +116,7 @@ static uint32_t drc_isolate_logical(sPAPRDRConnector *drc)
>       * actually being unplugged, fail the isolation request here.
>       */
>      if (spapr_drc_type(drc) == SPAPR_DR_CONNECTOR_TYPE_LMB
> -        && !drc->awaiting_release) {
> +        && !drc->unplug_requested) {
>          return RTAS_OUT_HW_ERROR;
>      }
>  
> @@ -130,7 +130,7 @@ static uint32_t drc_isolate_logical(sPAPRDRConnector *drc)
>       * configured state, as suggested by the state diagram from PAPR+
>       * 2.7, 13.4
>       */
> -    if (drc->awaiting_release) {
> +    if (drc->unplug_requested) {
>          uint32_t drc_index = spapr_drc_index(drc);
>          if (drc->configured) {
>              trace_spapr_drc_set_isolation_state_finalizing(drc_index);
> @@ -170,7 +170,7 @@ static uint32_t drc_set_usable(sPAPRDRConnector *drc)
>      if (!drc->dev) {
>          return RTAS_OUT_NO_SUCH_INDICATOR;
>      }
> -    if (drc->awaiting_release) {
> +    if (drc->unplug_requested) {
>          /* Don't allow the guest to move a device away from UNUSABLE
>           * state when we want to unplug it */
>          return RTAS_OUT_NO_SUCH_INDICATOR;
> @@ -184,7 +184,7 @@ static uint32_t drc_set_usable(sPAPRDRConnector *drc)
>  static uint32_t drc_set_unusable(sPAPRDRConnector *drc)
>  {
>      drc->allocation_state = SPAPR_DR_ALLOCATION_STATE_UNUSABLE;
> -    if (drc->awaiting_release) {
> +    if (drc->unplug_requested) {
>          uint32_t drc_index = spapr_drc_index(drc);
>          trace_spapr_drc_set_allocation_state_finalizing(drc_index);
>          spapr_drc_detach(drc);
> @@ -363,7 +363,7 @@ static void spapr_drc_release(sPAPRDRConnector *drc)
>  
>      drck->release(drc->dev);
>  
> -    drc->awaiting_release = false;
> +    drc->unplug_requested = false;
>      g_free(drc->fdt);
>      drc->fdt = NULL;
>      drc->fdt_start_offset = 0;
> @@ -375,7 +375,7 @@ void spapr_drc_detach(sPAPRDRConnector *drc)
>  {
>      trace_spapr_drc_detach(spapr_drc_index(drc));
>  
> -    drc->awaiting_release = true;
> +    drc->unplug_requested = true;
>  
>      if (drc->isolation_state != SPAPR_DR_ISOLATION_STATE_ISOLATED) {
>          trace_spapr_drc_awaiting_isolated(spapr_drc_index(drc));
> @@ -391,11 +391,6 @@ void spapr_drc_detach(sPAPRDRConnector *drc)
>      spapr_drc_release(drc);
>  }
>  
> -static bool release_pending(sPAPRDRConnector *drc)
> -{
> -    return drc->awaiting_release;
> -}
> -
>  static void drc_reset(void *opaque)
>  {
>      sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(opaque);
> @@ -408,7 +403,7 @@ static void drc_reset(void *opaque)
>      /* immediately upon reset we can safely assume DRCs whose devices
>       * are pending removal can be safely removed.
>       */
> -    if (drc->awaiting_release) {
> +    if (drc->unplug_requested) {
>          spapr_drc_release(drc);
>      }
>  
> @@ -453,7 +448,7 @@ static bool spapr_drc_needed(void *opaque)
>      case SPAPR_DR_CONNECTOR_TYPE_LMB:
>          rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
>                 (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
> -               drc->configured && !drc->awaiting_release);
> +               drc->configured);
>          break;
>      case SPAPR_DR_CONNECTOR_TYPE_PHB:
>      case SPAPR_DR_CONNECTOR_TYPE_VIO:
> @@ -473,7 +468,6 @@ static const VMStateDescription vmstate_spapr_drc = {
>          VMSTATE_UINT32(allocation_state, sPAPRDRConnector),
>          VMSTATE_UINT32(dr_indicator, sPAPRDRConnector),
>          VMSTATE_BOOL(configured, sPAPRDRConnector),
> -        VMSTATE_BOOL(awaiting_release, sPAPRDRConnector),
>          VMSTATE_END_OF_LIST()
>      }
>  };
> @@ -564,11 +558,9 @@ static void spapr_dr_connector_instance_init(Object *obj)
>  static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
>  {
>      DeviceClass *dk = DEVICE_CLASS(k);
> -    sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
>  
>      dk->realize = realize;
>      dk->unrealize = unrealize;
> -    drck->release_pending = release_pending;
>      /*
>       * Reason: it crashes FIXME find and document the real reason
>       */
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index af925c0..3dfb77d 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1474,7 +1474,6 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
>  {
>      sPAPRPHBState *phb = SPAPR_PCI_HOST_BRIDGE(DEVICE(plug_handler));
>      PCIDevice *pdev = PCI_DEVICE(plugged_dev);
> -    sPAPRDRConnectorClass *drck;
>      sPAPRDRConnector *drc = spapr_phb_get_pci_drc(phb, pdev);
>  
>      if (!phb->dr_enabled) {
> @@ -1486,8 +1485,7 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
>      g_assert(drc);
>      g_assert(drc->dev == plugged_dev);
>  
> -    drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> -    if (!drck->release_pending(drc)) {
> +    if (!spapr_drc_unplug_requested(drc)) {
>          PCIBus *bus = PCI_BUS(qdev_get_parent_bus(DEVICE(pdev)));
>          uint32_t slotnr = PCI_SLOT(pdev->devfn);
>          sPAPRDRConnector *func_drc;
> @@ -1503,7 +1501,7 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler,
>                  func_drck = SPAPR_DR_CONNECTOR_GET_CLASS(func_drc);
>                  state = func_drck->dr_entity_sense(func_drc);
>                  if (state == SPAPR_DR_ENTITY_SENSE_PRESENT
> -                    && !func_drck->release_pending(func_drc)) {
> +                    && !spapr_drc_unplug_requested(func_drc)) {
>                      error_setg(errp,
>                                 "PCI: slot %d, function %d still present. "
>                                 "Must unplug all non-0 functions first.",
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index ab64235..7846cca 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -198,10 +198,9 @@ typedef struct sPAPRDRConnector {
>      bool configured;
>      sPAPRConfigureConnectorState *ccs;
>  
> -    bool awaiting_release;
> -
>      /* device pointer, via link property */
>      DeviceState *dev;
> +    bool unplug_requested;
>  } sPAPRDRConnector;
>  
>  typedef struct sPAPRDRConnectorClass {
> @@ -217,9 +216,6 @@ typedef struct sPAPRDRConnectorClass {
>      uint32_t (*isolate)(sPAPRDRConnector *drc);
>      uint32_t (*unisolate)(sPAPRDRConnector *drc);
>      void (*release)(DeviceState *dev);
> -
> -    /* QEMU interfaces for managing hotplug operations */
> -    bool (*release_pending)(sPAPRDRConnector *drc);
>  } sPAPRDRConnectorClass;
>  
>  uint32_t spapr_drc_index(sPAPRDRConnector *drc);
> @@ -236,4 +232,9 @@ void spapr_drc_attach(sPAPRDRConnector *drc, DeviceState *d, void *fdt,
>                        int fdt_start_offset, Error **errp);
>  void spapr_drc_detach(sPAPRDRConnector *drc);
>  
> +static inline bool spapr_drc_unplug_requested(sPAPRDRConnector *drc)
> +{
> +    return drc->unplug_requested;
> +}
> +
>  #endif /* HW_SPAPR_DRC_H */


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] spapr: Refactor spapr_drc_detach()
  2017-06-22  9:32   ` Greg Kurz
@ 2017-07-04 14:54     ` David Gibson
  0 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2017-07-04 14:54 UTC (permalink / raw)
  To: Greg Kurz; +Cc: mdroth, bharata, sursingh, lvivier, qemu-ppc, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1372 bytes --]

On Thu, Jun 22, 2017 at 11:32:15AM +0200, Greg Kurz wrote:
> On Wed, 21 Jun 2017 17:18:45 +0800
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > This function has two unused parameters - remove them.
> > 
> 
> It's ok for the d argument but I'm not sure about errp... Indeed it isn't used
> in the current code, but looking at the paths below spapr_drc_detach(), we
> have:
> 
> spapr_drc_detach()
>  spapr_drc_release()
>   spapr_core_release()
>    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
> 
> and
> 
> spapr_drc_detach()
>  spapr_drc_release()
>   spapr_lmb_release()
>    hotplug_handler_unplug(hotplug_ctrl, dev, &error_abort);
> 
> Shouldn't we pass the errp down to hotplug_handler_unplug() instead of
> aborting ?

So, at first I thought you were right, and was going to rewrite on
that basis.

Then I realized that calling hotplug_handler_unplug() is a bizarrely
roundabout way of doing what we need anyway (usually the unplug
handler is never called if there is an unplug_request() handler).
Once we expand those out to what they actually call, the errors
disappear again.  I'll do that in the next spin.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
  2017-06-21  9:18 [Qemu-devel] [PATCH 0/5] spapr: DRC cleanups (part VI) David Gibson
                   ` (4 preceding siblings ...)
  2017-06-21  9:18 ` [Qemu-devel] [PATCH 5/5] spapr: Remove sPAPRConfigureConnectorState sub-structure David Gibson
@ 2017-07-04 21:13 ` Daniel Henrique Barboza
  2017-07-05 11:04   ` David Gibson
  5 siblings, 1 reply; 18+ messages in thread
From: Daniel Henrique Barboza @ 2017-07-04 21:13 UTC (permalink / raw)
  To: David Gibson, mdroth
  Cc: lvivier, sursingh, qemu-devel, groug, qemu-ppc, bharata

I just tested this patch set on top of current ppc-for-2.10 branch 
(which contains
the patches from part V). It applied cleanly but required a couple of 
trivial
fixes to build probably because it was made on top of an older code base.

The trivial migration test worked fine. The libvirt scenario (attaching 
a device on
target before migration, try to unplug after migration) isn't working as 
expected
but we have a different result with this series. Instead of silently 
failing to unplug
with error messages on dmesg, the hot unplug works on QEMU level:

(qemu) device_del core1
(qemu)
(qemu) info cpus
* CPU #0: nip=0xc0000000000a3e0c thread_id=86162
(qemu) info hotpluggable-cpus
Hotpluggable CPUs:
   type: "host-spapr-cpu-core"
   vcpus_count: "1"
   CPUInstance Properties:
     core-id: "3"
   type: "host-spapr-cpu-core"
   vcpus_count: "1"
   CPUInstance Properties:
     core-id: "2"
   type: "host-spapr-cpu-core"
   vcpus_count: "1"
   CPUInstance Properties:
     core-id: "1"
   type: "host-spapr-cpu-core"
   vcpus_count: "1"
   qom_path: "/machine/unattached/device[0]"
   CPUInstance Properties:
     core-id: "0"
(qemu)


However, any operation on the guest afterwards (tried with lscpu and 
dmesg) seems
to hung the guest. This is what I got when trying to do a dmesg after the
hot unplug:

danielhb@ubuntu1704:~$
danielhb@ubuntu1704:~$
danielhb@ubuntu1704:~$ dmesg | tail -n 5
^C

[  362.749693] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
[  362.749819]       Not tainted 4.10.0-26-generic #30-Ubuntu
[  362.749892] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  362.750224] INFO: task kworker/0:3:1887 blocked for more than 120 
seconds.
[  362.750320]       Not tainted 4.10.0-26-generic #30-Ubuntu
[  362.750394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  483.568842] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
[  483.568997]       Not tainted 4.10.0-26-generic #30-Ubuntu
[  483.569067] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  483.569364] INFO: task kworker/0:3:1887 blocked for more than 120 
seconds.
(...)


I am not sure if it was intended for this scenario to work with this 
patch set already. Figured
it can serve as a FYI for the next series.


Given that hotplug/unplug without migration still works and on the 
assumption that
we'll look more into this libvirt scenario in the next spins, +1.


Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>



On 06/21/2017 06:18 AM, David Gibson wrote:
> This sixth set of DRC cleanup patches (to be applied on top of the
> patches from part V) is a complete rework of DRC state management.  We
> stop tracking some unnecessary things, and change the basic state
> representation to a simpler and more robust model.
>
> Most of the patches in this set "break" migration from earlier git
> snapshots, but not from any released qemu version.  The previous
> migration stream format had multiple problems, so better to fix them
> now, before 2.10 is out.
>
> David Gibson (5):
>    spapr: Remove 'awaiting_allocation' DRC flag
>    spapr: Refactor spapr_drc_detach()
>    spapr: Cleanups relating to DRC awaiting_release field
>    spapr: Consolidate DRC state variables
>    spapr: Remove sPAPRConfigureConnectorState sub-structure
>
>   hw/ppc/spapr.c             |   9 +-
>   hw/ppc/spapr_drc.c         | 321 +++++++++++++++++++++------------------------
>   hw/ppc/spapr_pci.c         |  13 +-
>   hw/ppc/trace-events        |   3 +-
>   include/hw/ppc/spapr_drc.h |  53 +++++---
>   5 files changed, 187 insertions(+), 212 deletions(-)
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
  2017-07-04 21:13 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI) Daniel Henrique Barboza
@ 2017-07-05 11:04   ` David Gibson
  2017-07-05 21:53     ` Daniel Henrique Barboza
  0 siblings, 1 reply; 18+ messages in thread
From: David Gibson @ 2017-07-05 11:04 UTC (permalink / raw)
  To: Daniel Henrique Barboza
  Cc: mdroth, lvivier, sursingh, qemu-devel, groug, qemu-ppc, bharata

[-- Attachment #1: Type: text/plain, Size: 5366 bytes --]

On Tue, Jul 04, 2017 at 06:13:31PM -0300, Daniel Henrique Barboza wrote:
> I just tested this patch set on top of current ppc-for-2.10 branch (which
> contains
> the patches from part V). It applied cleanly but required a couple of
> trivial
> fixes to build probably because it was made on top of an older code base.

Right, I fixed that up locally already, but haven't gotten around to
reposting yet.  You can look at the 'drcVI' branch on my github tree,
if you're interested.

> The trivial migration test worked fine. The libvirt scenario (attaching a
> device on
> target before migration, try to unplug after migration) isn't working as
> expected
> but we have a different result with this series. Instead of silently failing
> to unplug
> with error messages on dmesg, the hot unplug works on QEMU level:

Thanks for testing.  Just to clarify what you're saying here, you
haven't spotted a regression with this series, but there is a case
which was broken and is still broken with slightly different
symptoms.  Yes?

> 
> (qemu) device_del core1
> (qemu)
> (qemu) info cpus
> * CPU #0: nip=0xc0000000000a3e0c thread_id=86162
> (qemu) info hotpluggable-cpus
> Hotpluggable CPUs:
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   CPUInstance Properties:
>     core-id: "3"
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   CPUInstance Properties:
>     core-id: "2"
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   CPUInstance Properties:
>     core-id: "1"
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   qom_path: "/machine/unattached/device[0]"
>   CPUInstance Properties:
>     core-id: "0"
> (qemu)
> 
> 
> However, any operation on the guest afterwards (tried with lscpu and dmesg)
> seems
> to hung the guest. This is what I got when trying to do a dmesg after the
> hot unplug:

Ouch.  That's bad.  I'll have to look into it.

I have rather a lot on my plate at the moment - if you get a chance to
work out which of the patches in the series causes this behaviour,
that could be handy.

> danielhb@ubuntu1704:~$
> danielhb@ubuntu1704:~$
> danielhb@ubuntu1704:~$ dmesg | tail -n 5
> ^C
> 
> [  362.749693] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
> [  362.749819]       Not tainted 4.10.0-26-generic #30-Ubuntu
> [  362.749892] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  362.750224] INFO: task kworker/0:3:1887 blocked for more than 120
> seconds.
> [  362.750320]       Not tainted 4.10.0-26-generic #30-Ubuntu
> [  362.750394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  483.568842] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
> [  483.568997]       Not tainted 4.10.0-26-generic #30-Ubuntu
> [  483.569067] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  483.569364] INFO: task kworker/0:3:1887 blocked for more than 120
> seconds.
> (...)
> 
> 
> I am not sure if it was intended for this scenario to work with this patch
> set already. Figured
> it can serve as a FYI for the next series.

This might be due to a problem I've discussed previously with
Laurent.  libvirt uses device_add on the destination's monitor before
sending the migration stream, rather than starting the destination
with already-hotplugged devices as -device.

To my surprise, there doesn't seem to be a system reset between that
device_add phase and processing the incoming stream, which means DRCs
for those devices might be in the wrong state.

I might need to explicitly trigger a reset of DRC state at incoming
migration time to handle this.

> Given that hotplug/unplug without migration still works and on the
> assumption that
> we'll look more into this libvirt scenario in the next spins, +1.
> 
> 
> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
> 
> 
> 
> On 06/21/2017 06:18 AM, David Gibson wrote:
> > This sixth set of DRC cleanup patches (to be applied on top of the
> > patches from part V) is a complete rework of DRC state management.  We
> > stop tracking some unnecessary things, and change the basic state
> > representation to a simpler and more robust model.
> > 
> > Most of the patches in this set "break" migration from earlier git
> > snapshots, but not from any released qemu version.  The previous
> > migration stream format had multiple problems, so better to fix them
> > now, before 2.10 is out.
> > 
> > David Gibson (5):
> >    spapr: Remove 'awaiting_allocation' DRC flag
> >    spapr: Refactor spapr_drc_detach()
> >    spapr: Cleanups relating to DRC awaiting_release field
> >    spapr: Consolidate DRC state variables
> >    spapr: Remove sPAPRConfigureConnectorState sub-structure
> > 
> >   hw/ppc/spapr.c             |   9 +-
> >   hw/ppc/spapr_drc.c         | 321 +++++++++++++++++++++------------------------
> >   hw/ppc/spapr_pci.c         |  13 +-
> >   hw/ppc/trace-events        |   3 +-
> >   include/hw/ppc/spapr_drc.h |  53 +++++---
> >   5 files changed, 187 insertions(+), 212 deletions(-)
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
  2017-07-05 11:04   ` David Gibson
@ 2017-07-05 21:53     ` Daniel Henrique Barboza
  2017-07-05 22:41       ` Michael Roth
  2017-07-06 14:31       ` Daniel Henrique Barboza
  0 siblings, 2 replies; 18+ messages in thread
From: Daniel Henrique Barboza @ 2017-07-05 21:53 UTC (permalink / raw)
  To: David Gibson
  Cc: mdroth, lvivier, sursingh, qemu-devel, groug, qemu-ppc, bharata



On 07/05/2017 08:04 AM, David Gibson wrote:
> On Tue, Jul 04, 2017 at 06:13:31PM -0300, Daniel Henrique Barboza wrote:
>> I just tested this patch set on top of current ppc-for-2.10 branch (which
>> contains
>> the patches from part V). It applied cleanly but required a couple of
>> trivial
>> fixes to build probably because it was made on top of an older code base.
> Right, I fixed that up locally already, but haven't gotten around to
> reposting yet.  You can look at the 'drcVI' branch on my github tree,
> if you're interested.
>
>> The trivial migration test worked fine. The libvirt scenario (attaching a
>> device on
>> target before migration, try to unplug after migration) isn't working as
>> expected
>> but we have a different result with this series. Instead of silently failing
>> to unplug
>> with error messages on dmesg, the hot unplug works on QEMU level:
> Thanks for testing.  Just to clarify what you're saying here, you
> haven't spotted a regression with this series, but there is a case
> which was broken and is still broken with slightly different
> symptoms.  Yes?
In my opinion, yes. It is debatable if the patch series made it worse 
because
the guest is now misbehaving, but the feature per se wasn't working
prior to it.

>
>> (qemu) device_del core1
>> (qemu)
>> (qemu) info cpus
>> * CPU #0: nip=0xc0000000000a3e0c thread_id=86162
>> (qemu) info hotpluggable-cpus
>> Hotpluggable CPUs:
>>    type: "host-spapr-cpu-core"
>>    vcpus_count: "1"
>>    CPUInstance Properties:
>>      core-id: "3"
>>    type: "host-spapr-cpu-core"
>>    vcpus_count: "1"
>>    CPUInstance Properties:
>>      core-id: "2"
>>    type: "host-spapr-cpu-core"
>>    vcpus_count: "1"
>>    CPUInstance Properties:
>>      core-id: "1"
>>    type: "host-spapr-cpu-core"
>>    vcpus_count: "1"
>>    qom_path: "/machine/unattached/device[0]"
>>    CPUInstance Properties:
>>      core-id: "0"
>> (qemu)
>>
>>
>> However, any operation on the guest afterwards (tried with lscpu and dmesg)
>> seems
>> to hung the guest. This is what I got when trying to do a dmesg after the
>> hot unplug:
> Ouch.  That's bad.  I'll have to look into it.
>
> I have rather a lot on my plate at the moment - if you get a chance to
> work out which of the patches in the series causes this behaviour,
> that could be handy.

***long post warning***

With the current master and current HEAD of ppc-for-2.10, the behavior in
the Libvirt scenario (device_add in both source and target before migration,
hot unplug after migration is completed) is that QEMU fails to hot 
unplug the
CPU from the guest OS. lscpu reports the same # of cpus even after 
device_del,
dmesg shows an error like this:

[  108.182291] pseries-hotplug-cpu: CPU with drc index 10000008 already 
exists


WIth this patch series, QEMU removes the CPU but the guest misbehaves 
like I said
in my previous message.

With the current HEAD of drcVI branch, I rolled back until I found the 
patch that was
causing this new symptom. The patch is the very first of the series:

b752844 spapr: Remove 'awaiting_allocation' DRC flag

In short, adding this single patch into the HEAD of ppc-for-2.10 is 
causing this new
behavior I saw in my tests. This is the retrieved kernel log after the 
failed unplug:

[  176.434099] random: crng init done
[  461.182729] pseries-hotplug-cpu: CPU with drc index 10000008 already 
exists
[  604.707369] INFO: task kworker/0:2:920 blocked for more than 120 seconds.
[  604.707666]       Not tainted 4.10.0-26-generic #30-Ubuntu
[  604.707881] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  604.708194] kworker/0:2     D    0   920      2 0x00000800
[  604.708248] Workqueue: events vmstat_shepherd
[  604.708251] Call Trace:
[  604.708265] [c0000000055a7830] [c000000001492090] 
sysctl_sched_migration_cost+0x0/0x4 (unreliable)
[  604.708271] [c0000000055a7a00] [c00000000001b770] __switch_to+0x2c0/0x450
[  604.708286] [c0000000055a7a60] [c000000000b51238] __schedule+0x2f8/0x990
[  604.708289] [c0000000055a7b40] [c000000000b51918] schedule+0x48/0xc0
[  604.708292] [c0000000055a7b70] [c000000000b51e40] 
schedule_preempt_disabled+0x20/0x30
[  604.708295] [c0000000055a7b90] [c000000000b54598] 
__mutex_lock_slowpath+0x208/0x380
[  604.708310] [c0000000055a7c10] [c0000000000e2ec8] 
get_online_cpus+0x58/0xa0
[  604.708312] [c0000000055a7c40] [c0000000002a5a18] 
vmstat_shepherd+0x38/0x160
[  604.708316] [c0000000055a7c90] [c0000000001061a0] 
process_one_work+0x2b0/0x5a0
[  604.708319] [c0000000055a7d20] [c000000000106538] 
worker_thread+0xa8/0x650
[  604.708322] [c0000000055a7dc0] [c00000000010f0a4] kthread+0x164/0x1b0
[  604.708326] [c0000000055a7e30] [c00000000000b4e8] 
ret_from_kernel_thread+0x5c/0x74
[  604.708341] INFO: task kworker/u8:0:3068 blocked for more than 120 
seconds.
[  604.708601]       Not tainted 4.10.0-26-generic #30-Ubuntu
[  604.708809] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  604.709107] kworker/u8:0    D    0  3068      2 0x00000800
[  604.709114] Workqueue: pseries hotplug workque pseries_hp_work_fn
[  604.709116] Call Trace:
[  604.709118] [c00000000589b3c0] [000fffffffffffff] 0xfffffffffffff 
(unreliable)
[  604.709122] [c00000000589b590] [c00000000001b770] __switch_to+0x2c0/0x450
[  604.709125] [c00000000589b5f0] [c000000000b51238] __schedule+0x2f8/0x990
[  604.709128] [c00000000589b6d0] [c000000000b51918] schedule+0x48/0xc0
[  604.709131] [c00000000589b700] [c000000000b566b4] 
schedule_timeout+0x274/0x470
[  604.709134] [c00000000589b7f0] [c000000000b528ec] 
wait_for_common+0xec/0x250
[  604.709137] [c00000000589b870] [c0000000000e3320] 
cpuhp_kick_ap_work+0x90/0x210
[  604.709140] [c00000000589b8d0] [c00000000026f57c] _cpu_down+0xdc/0x1b0
[  604.709143] [c00000000589b930] [c0000000000e44b4] do_cpu_down+0x64/0xb0
[  604.709157] [c00000000589b970] [c0000000007541b4] 
cpu_subsys_offline+0x24/0x40
[  604.709160] [c00000000589b990] [c00000000074bc04] 
device_offline+0xf4/0x130
[  604.709163] [c00000000589b9d0] [c0000000000ae4bc] 
dlpar_cpu_remove+0x24c/0x350
[  604.709166] [c00000000589bab0] [c0000000000ae6bc] 
dlpar_cpu_remove_by_index+0xfc/0x130
[  604.709169] [c00000000589bb40] [c0000000000af838] dlpar_cpu+0x78/0x540
[  604.709172] [c00000000589bbf0] [c0000000000a98a0] 
handle_dlpar_errorlog+0xf0/0x170
[  604.709174] [c00000000589bc60] [c0000000000a99b4] 
pseries_hp_work_fn+0x94/0xa0
[  604.709177] [c00000000589bc90] [c0000000001061a0] 
process_one_work+0x2b0/0x5a0
[  604.709180] [c00000000589bd20] [c000000000106538] 
worker_thread+0xa8/0x650
[  604.709183] [c00000000589bdc0] [c00000000010f0a4] kthread+0x164/0x1b0
[  604.709186] [c00000000589be30] [c00000000000b4e8] 
ret_from_kernel_thread+0x5c/0x74


Digging around in the kernel code reported in the trace and reading the 
commit message
of the patch that is causing it, my first guess is that there is some 
confusion left
in the DRC state management that, at this moment, is being fixed by the
'awaiting_allocation' flag. One potentially relevant use of this flag 
that was removed was this
piece of code:

(inside spapr_drc_detach)

-    if (drc->awaiting_allocation) {
-        drc->awaiting_release = true;
-        trace_spapr_drc_awaiting_allocation(spapr_drc_index(drc));
-        return;
-    }


Speculating a little, perhaps the guest is trying to release a DRC that 
isn't ready to be released
yet (inconsistent state probably) and this code was preventing it from 
doing so. I'll
know more when I dig further.


>
>> danielhb@ubuntu1704:~$
>> danielhb@ubuntu1704:~$
>> danielhb@ubuntu1704:~$ dmesg | tail -n 5
>> ^C
>>
>> [  362.749693] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
>> [  362.749819]       Not tainted 4.10.0-26-generic #30-Ubuntu
>> [  362.749892] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
>> this message.
>> [  362.750224] INFO: task kworker/0:3:1887 blocked for more than 120
>> seconds.
>> [  362.750320]       Not tainted 4.10.0-26-generic #30-Ubuntu
>> [  362.750394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
>> this message.
>> [  483.568842] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
>> [  483.568997]       Not tainted 4.10.0-26-generic #30-Ubuntu
>> [  483.569067] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
>> this message.
>> [  483.569364] INFO: task kworker/0:3:1887 blocked for more than 120
>> seconds.
>> (...)
>>
>>
>> I am not sure if it was intended for this scenario to work with this patch
>> set already. Figured
>> it can serve as a FYI for the next series.
> This might be due to a problem I've discussed previously with
> Laurent.  libvirt uses device_add on the destination's monitor before
> sending the migration stream, rather than starting the destination
> with already-hotplugged devices as -device.
>
> To my surprise, there doesn't seem to be a system reset between that
> device_add phase and processing the incoming stream, which means DRCs
> for those devices might be in the wrong state.
>
> I might need to explicitly trigger a reset of DRC state at incoming
> migration time to handle this.

Worth a try. I'll see if I can make a POC of this DRC reset at incoming
migration time.


Daniel

>
>> Given that hotplug/unplug without migration still works and on the
>> assumption that
>> we'll look more into this libvirt scenario in the next spins, +1.
>>
>>
>> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
>>
>>
>>
>> On 06/21/2017 06:18 AM, David Gibson wrote:
>>> This sixth set of DRC cleanup patches (to be applied on top of the
>>> patches from part V) is a complete rework of DRC state management.  We
>>> stop tracking some unnecessary things, and change the basic state
>>> representation to a simpler and more robust model.
>>>
>>> Most of the patches in this set "break" migration from earlier git
>>> snapshots, but not from any released qemu version.  The previous
>>> migration stream format had multiple problems, so better to fix them
>>> now, before 2.10 is out.
>>>
>>> David Gibson (5):
>>>     spapr: Remove 'awaiting_allocation' DRC flag
>>>     spapr: Refactor spapr_drc_detach()
>>>     spapr: Cleanups relating to DRC awaiting_release field
>>>     spapr: Consolidate DRC state variables
>>>     spapr: Remove sPAPRConfigureConnectorState sub-structure
>>>
>>>    hw/ppc/spapr.c             |   9 +-
>>>    hw/ppc/spapr_drc.c         | 321 +++++++++++++++++++++------------------------
>>>    hw/ppc/spapr_pci.c         |  13 +-
>>>    hw/ppc/trace-events        |   3 +-
>>>    include/hw/ppc/spapr_drc.h |  53 +++++---
>>>    5 files changed, 187 insertions(+), 212 deletions(-)
>>>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
  2017-07-05 21:53     ` Daniel Henrique Barboza
@ 2017-07-05 22:41       ` Michael Roth
  2017-07-06  9:46         ` David Gibson
  2017-07-06 14:31       ` Daniel Henrique Barboza
  1 sibling, 1 reply; 18+ messages in thread
From: Michael Roth @ 2017-07-05 22:41 UTC (permalink / raw)
  To: Daniel Henrique Barboza, David Gibson
  Cc: lvivier, sursingh, qemu-devel, groug, qemu-ppc, bharata

Quoting Daniel Henrique Barboza (2017-07-05 16:53:57)
> 
> 
> On 07/05/2017 08:04 AM, David Gibson wrote:
> > On Tue, Jul 04, 2017 at 06:13:31PM -0300, Daniel Henrique Barboza wrote:
> >> I just tested this patch set on top of current ppc-for-2.10 branch (which
> >> contains
> >> the patches from part V). It applied cleanly but required a couple of
> >> trivial
> >> fixes to build probably because it was made on top of an older code base.
> > Right, I fixed that up locally already, but haven't gotten around to
> > reposting yet.  You can look at the 'drcVI' branch on my github tree,
> > if you're interested.
> >
> >> The trivial migration test worked fine. The libvirt scenario (attaching a
> >> device on
> >> target before migration, try to unplug after migration) isn't working as
> >> expected
> >> but we have a different result with this series. Instead of silently failing
> >> to unplug
> >> with error messages on dmesg, the hot unplug works on QEMU level:
> > Thanks for testing.  Just to clarify what you're saying here, you
> > haven't spotted a regression with this series, but there is a case
> > which was broken and is still broken with slightly different
> > symptoms.  Yes?
> In my opinion, yes. It is debatable if the patch series made it worse 
> because
> the guest is now misbehaving, but the feature per se wasn't working
> prior to it.

I think it's the removal of awaiting_allocation. We know currently that
in the libvirt scenario the DRC is exposed in an pre-hotplug state of
ISOLATED/UNALLOCATED. In that state, spapr_drc_detach() completes
immediately because from the perspective of QEMU it apparently has not
been exposed to the guest yet, or the guest has already quiesced it on
it's end.

awaiting_allocation guarded against this, as it's intention was to make
sure that resource was put into an ALLOCATED state prior to getting moved
back into an UNALLOCATED state, so we didn't immediately unplug a CPU
while the hotplug was in progress.

So in your scenario the CPU is just mysteriously vanishing out from
under the guest, which probably explains the hang. The fix for this
particular scenario is to fix the initial DRC on the target. I think
we have a plan for that so I wouldn't consider this a regression
necessarily.

> 
> >
> >> (qemu) device_del core1
> >> (qemu)
> >> (qemu) info cpus
> >> * CPU #0: nip=0xc0000000000a3e0c thread_id=86162
> >> (qemu) info hotpluggable-cpus
> >> Hotpluggable CPUs:
> >>    type: "host-spapr-cpu-core"
> >>    vcpus_count: "1"
> >>    CPUInstance Properties:
> >>      core-id: "3"
> >>    type: "host-spapr-cpu-core"
> >>    vcpus_count: "1"
> >>    CPUInstance Properties:
> >>      core-id: "2"
> >>    type: "host-spapr-cpu-core"
> >>    vcpus_count: "1"
> >>    CPUInstance Properties:
> >>      core-id: "1"
> >>    type: "host-spapr-cpu-core"
> >>    vcpus_count: "1"
> >>    qom_path: "/machine/unattached/device[0]"
> >>    CPUInstance Properties:
> >>      core-id: "0"
> >> (qemu)
> >>
> >>
> >> However, any operation on the guest afterwards (tried with lscpu and dmesg)
> >> seems
> >> to hung the guest. This is what I got when trying to do a dmesg after the
> >> hot unplug:
> > Ouch.  That's bad.  I'll have to look into it.
> >
> > I have rather a lot on my plate at the moment - if you get a chance to
> > work out which of the patches in the series causes this behaviour,
> > that could be handy.
> 
> ***long post warning***
> 
> With the current master and current HEAD of ppc-for-2.10, the behavior in
> the Libvirt scenario (device_add in both source and target before migration,
> hot unplug after migration is completed) is that QEMU fails to hot 
> unplug the
> CPU from the guest OS. lscpu reports the same # of cpus even after 
> device_del,
> dmesg shows an error like this:
> 
> [  108.182291] pseries-hotplug-cpu: CPU with drc index 10000008 already 
> exists
> 
> 
> WIth this patch series, QEMU removes the CPU but the guest misbehaves 
> like I said
> in my previous message.
> 
> With the current HEAD of drcVI branch, I rolled back until I found the 
> patch that was
> causing this new symptom. The patch is the very first of the series:
> 
> b752844 spapr: Remove 'awaiting_allocation' DRC flag
> 
> In short, adding this single patch into the HEAD of ppc-for-2.10 is 
> causing this new
> behavior I saw in my tests. This is the retrieved kernel log after the 
> failed unplug:
> 
> [  176.434099] random: crng init done
> [  461.182729] pseries-hotplug-cpu: CPU with drc index 10000008 already 
> exists
> [  604.707369] INFO: task kworker/0:2:920 blocked for more than 120 seconds.
> [  604.707666]       Not tainted 4.10.0-26-generic #30-Ubuntu
> [  604.707881] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> disables this message.
> [  604.708194] kworker/0:2     D    0   920      2 0x00000800
> [  604.708248] Workqueue: events vmstat_shepherd
> [  604.708251] Call Trace:
> [  604.708265] [c0000000055a7830] [c000000001492090] 
> sysctl_sched_migration_cost+0x0/0x4 (unreliable)
> [  604.708271] [c0000000055a7a00] [c00000000001b770] __switch_to+0x2c0/0x450
> [  604.708286] [c0000000055a7a60] [c000000000b51238] __schedule+0x2f8/0x990
> [  604.708289] [c0000000055a7b40] [c000000000b51918] schedule+0x48/0xc0
> [  604.708292] [c0000000055a7b70] [c000000000b51e40] 
> schedule_preempt_disabled+0x20/0x30
> [  604.708295] [c0000000055a7b90] [c000000000b54598] 
> __mutex_lock_slowpath+0x208/0x380
> [  604.708310] [c0000000055a7c10] [c0000000000e2ec8] 
> get_online_cpus+0x58/0xa0
> [  604.708312] [c0000000055a7c40] [c0000000002a5a18] 
> vmstat_shepherd+0x38/0x160
> [  604.708316] [c0000000055a7c90] [c0000000001061a0] 
> process_one_work+0x2b0/0x5a0
> [  604.708319] [c0000000055a7d20] [c000000000106538] 
> worker_thread+0xa8/0x650
> [  604.708322] [c0000000055a7dc0] [c00000000010f0a4] kthread+0x164/0x1b0
> [  604.708326] [c0000000055a7e30] [c00000000000b4e8] 
> ret_from_kernel_thread+0x5c/0x74
> [  604.708341] INFO: task kworker/u8:0:3068 blocked for more than 120 
> seconds.
> [  604.708601]       Not tainted 4.10.0-26-generic #30-Ubuntu
> [  604.708809] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> disables this message.
> [  604.709107] kworker/u8:0    D    0  3068      2 0x00000800
> [  604.709114] Workqueue: pseries hotplug workque pseries_hp_work_fn
> [  604.709116] Call Trace:
> [  604.709118] [c00000000589b3c0] [000fffffffffffff] 0xfffffffffffff 
> (unreliable)
> [  604.709122] [c00000000589b590] [c00000000001b770] __switch_to+0x2c0/0x450
> [  604.709125] [c00000000589b5f0] [c000000000b51238] __schedule+0x2f8/0x990
> [  604.709128] [c00000000589b6d0] [c000000000b51918] schedule+0x48/0xc0
> [  604.709131] [c00000000589b700] [c000000000b566b4] 
> schedule_timeout+0x274/0x470
> [  604.709134] [c00000000589b7f0] [c000000000b528ec] 
> wait_for_common+0xec/0x250
> [  604.709137] [c00000000589b870] [c0000000000e3320] 
> cpuhp_kick_ap_work+0x90/0x210
> [  604.709140] [c00000000589b8d0] [c00000000026f57c] _cpu_down+0xdc/0x1b0
> [  604.709143] [c00000000589b930] [c0000000000e44b4] do_cpu_down+0x64/0xb0
> [  604.709157] [c00000000589b970] [c0000000007541b4] 
> cpu_subsys_offline+0x24/0x40
> [  604.709160] [c00000000589b990] [c00000000074bc04] 
> device_offline+0xf4/0x130
> [  604.709163] [c00000000589b9d0] [c0000000000ae4bc] 
> dlpar_cpu_remove+0x24c/0x350
> [  604.709166] [c00000000589bab0] [c0000000000ae6bc] 
> dlpar_cpu_remove_by_index+0xfc/0x130
> [  604.709169] [c00000000589bb40] [c0000000000af838] dlpar_cpu+0x78/0x540
> [  604.709172] [c00000000589bbf0] [c0000000000a98a0] 
> handle_dlpar_errorlog+0xf0/0x170
> [  604.709174] [c00000000589bc60] [c0000000000a99b4] 
> pseries_hp_work_fn+0x94/0xa0
> [  604.709177] [c00000000589bc90] [c0000000001061a0] 
> process_one_work+0x2b0/0x5a0
> [  604.709180] [c00000000589bd20] [c000000000106538] 
> worker_thread+0xa8/0x650
> [  604.709183] [c00000000589bdc0] [c00000000010f0a4] kthread+0x164/0x1b0
> [  604.709186] [c00000000589be30] [c00000000000b4e8] 
> ret_from_kernel_thread+0x5c/0x74
> 
> 
> Digging around in the kernel code reported in the trace and reading the 
> commit message
> of the patch that is causing it, my first guess is that there is some 
> confusion left
> in the DRC state management that, at this moment, is being fixed by the
> 'awaiting_allocation' flag. One potentially relevant use of this flag 
> that was removed was this
> piece of code:
> 
> (inside spapr_drc_detach)
> 
> -    if (drc->awaiting_allocation) {
> -        drc->awaiting_release = true;
> -        trace_spapr_drc_awaiting_allocation(spapr_drc_index(drc));
> -        return;
> -    }
> 
> 
> Speculating a little, perhaps the guest is trying to release a DRC that 
> isn't ready to be released
> yet (inconsistent state probably) and this code was preventing it from 
> doing so. I'll
> know more when I dig further.
> 
> 
> >
> >> danielhb@ubuntu1704:~$
> >> danielhb@ubuntu1704:~$
> >> danielhb@ubuntu1704:~$ dmesg | tail -n 5
> >> ^C
> >>
> >> [  362.749693] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
> >> [  362.749819]       Not tainted 4.10.0-26-generic #30-Ubuntu
> >> [  362.749892] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> >> this message.
> >> [  362.750224] INFO: task kworker/0:3:1887 blocked for more than 120
> >> seconds.
> >> [  362.750320]       Not tainted 4.10.0-26-generic #30-Ubuntu
> >> [  362.750394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> >> this message.
> >> [  483.568842] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
> >> [  483.568997]       Not tainted 4.10.0-26-generic #30-Ubuntu
> >> [  483.569067] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> >> this message.
> >> [  483.569364] INFO: task kworker/0:3:1887 blocked for more than 120
> >> seconds.
> >> (...)
> >>
> >>
> >> I am not sure if it was intended for this scenario to work with this patch
> >> set already. Figured
> >> it can serve as a FYI for the next series.
> > This might be due to a problem I've discussed previously with
> > Laurent.  libvirt uses device_add on the destination's monitor before
> > sending the migration stream, rather than starting the destination
> > with already-hotplugged devices as -device.
> >
> > To my surprise, there doesn't seem to be a system reset between that
> > device_add phase and processing the incoming stream, which means DRCs
> > for those devices might be in the wrong state.
> >
> > I might need to explicitly trigger a reset of DRC state at incoming
> > migration time to handle this.
> 
> Worth a try. I'll see if I can make a POC of this DRC reset at incoming
> migration time.
> 
> 
> Daniel
> 
> >
> >> Given that hotplug/unplug without migration still works and on the
> >> assumption that
> >> we'll look more into this libvirt scenario in the next spins, +1.
> >>
> >>
> >> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
> >>
> >>
> >>
> >> On 06/21/2017 06:18 AM, David Gibson wrote:
> >>> This sixth set of DRC cleanup patches (to be applied on top of the
> >>> patches from part V) is a complete rework of DRC state management.  We
> >>> stop tracking some unnecessary things, and change the basic state
> >>> representation to a simpler and more robust model.
> >>>
> >>> Most of the patches in this set "break" migration from earlier git
> >>> snapshots, but not from any released qemu version.  The previous
> >>> migration stream format had multiple problems, so better to fix them
> >>> now, before 2.10 is out.
> >>>
> >>> David Gibson (5):
> >>>     spapr: Remove 'awaiting_allocation' DRC flag
> >>>     spapr: Refactor spapr_drc_detach()
> >>>     spapr: Cleanups relating to DRC awaiting_release field
> >>>     spapr: Consolidate DRC state variables
> >>>     spapr: Remove sPAPRConfigureConnectorState sub-structure
> >>>
> >>>    hw/ppc/spapr.c             |   9 +-
> >>>    hw/ppc/spapr_drc.c         | 321 +++++++++++++++++++++------------------------
> >>>    hw/ppc/spapr_pci.c         |  13 +-
> >>>    hw/ppc/trace-events        |   3 +-
> >>>    include/hw/ppc/spapr_drc.h |  53 +++++---
> >>>    5 files changed, 187 insertions(+), 212 deletions(-)
> >>>
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
  2017-07-05 22:41       ` Michael Roth
@ 2017-07-06  9:46         ` David Gibson
  0 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2017-07-06  9:46 UTC (permalink / raw)
  To: Michael Roth
  Cc: Daniel Henrique Barboza, lvivier, sursingh, qemu-devel, groug,
	qemu-ppc, bharata

[-- Attachment #1: Type: text/plain, Size: 3828 bytes --]

On Wed, Jul 05, 2017 at 05:41:40PM -0500, Michael Roth wrote:
> Quoting Daniel Henrique Barboza (2017-07-05 16:53:57)
> > 
> > 
> > On 07/05/2017 08:04 AM, David Gibson wrote:
> > > On Tue, Jul 04, 2017 at 06:13:31PM -0300, Daniel Henrique Barboza wrote:
> > >> I just tested this patch set on top of current ppc-for-2.10 branch (which
> > >> contains
> > >> the patches from part V). It applied cleanly but required a couple of
> > >> trivial
> > >> fixes to build probably because it was made on top of an older code base.
> > > Right, I fixed that up locally already, but haven't gotten around to
> > > reposting yet.  You can look at the 'drcVI' branch on my github tree,
> > > if you're interested.
> > >
> > >> The trivial migration test worked fine. The libvirt scenario (attaching a
> > >> device on
> > >> target before migration, try to unplug after migration) isn't working as
> > >> expected
> > >> but we have a different result with this series. Instead of silently failing
> > >> to unplug
> > >> with error messages on dmesg, the hot unplug works on QEMU level:
> > > Thanks for testing.  Just to clarify what you're saying here, you
> > > haven't spotted a regression with this series, but there is a case
> > > which was broken and is still broken with slightly different
> > > symptoms.  Yes?
> > In my opinion, yes. It is debatable if the patch series made it worse 
> > because
> > the guest is now misbehaving, but the feature per se wasn't working
> > prior to it.
> 
> I think it's the removal of awaiting_allocation.

So.. yes, in the sense that I think we've rediscovered the problem
which prompter the awaiting_allocation flag in the first place.  I
still don't think awaiting_allocation is a sensible fix for.. well,
anything.

So we need to find the right fix for this problem.

> We know currently that
> in the libvirt scenario the DRC is exposed in an pre-hotplug state of
> ISOLATED/UNALLOCATED.

Right, need to understand exactly how we get there.

> In that state, spapr_drc_detach() completes
> immediately because from the perspective of QEMU it apparently has not
> been exposed to the guest yet, or the guest has already quiesced it on
> it's end.

Right.

> awaiting_allocation guarded against this, as it's intention was to make
> sure that resource was put into an ALLOCATED state prior to getting moved
> back into an UNALLOCATED state,

Right, but that seems broken to me.  It means if a cpu is hotplug when
the guest isn't paying attention (e.g. early boot, guest is
halted/crashed), then you can't remove it until you either boot an OS
that allocates then releases it, or you reset (and then release).

> so we didn't immediately unplug a CPU
> while the hotplug was in progress.

But in what sense is the hotplug "in progress"?  The guest has been
given a notification, but it hasn't touched the device yet.  Moving
the state away from UNALLOCATED is never guaranteed to work,
regardless of what notifications have been received.

> So in your scenario the CPU is just mysteriously vanishing out from
> under the guest,

Because the DRC is ending up in UNUSABLE state on the destination when
it should be in some other state (and was on the source)?  Yeah, that
sounds plausible.

> which probably explains the hang. The fix for this
> particular scenario is to fix the initial DRC on the target. I think
> we have a plan for that so I wouldn't consider this a regression
> necessarily.

Yeah.. I'm not quite sure how it's ending up wrong; why isn't the
incoming migration state setting it correctly?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
  2017-07-05 21:53     ` Daniel Henrique Barboza
  2017-07-05 22:41       ` Michael Roth
@ 2017-07-06 14:31       ` Daniel Henrique Barboza
  2017-07-07  7:14         ` David Gibson
  1 sibling, 1 reply; 18+ messages in thread
From: Daniel Henrique Barboza @ 2017-07-06 14:31 UTC (permalink / raw)
  To: David Gibson
  Cc: lvivier, qemu-devel, sursingh, groug, mdroth, qemu-ppc, bharata



On 07/05/2017 06:53 PM, Daniel Henrique Barboza wrote:
> Worth a try. I'll see if I can make a POC of this DRC reset at incoming
> migration time. 


Just played a little with the idea of manual reset during migration. 
I've created a
POC that resets the CPU DRCs in a pre_load hook. Then I've done the same
experiment - device_add on both source and target before the migration,
hot unplug the device after migration is completed. The hot unplug worked as
expected in both QEMU and guest.

To minimize the impact I am resetting only the DRCs of the CPUs that 
were hotplugged
in the target instead of resetting everybody.

I'll see if this solution works for LMBs and PCI devices. In case 
affirmative, and
if we are fine with this solution of resetting the DRCs in pre_load (not 
sure if pre_load is
the right place for doing it - suggestions welcome), I can send a patch 
to be applied on top
of this series.



Thanks,

Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
  2017-07-06 14:31       ` Daniel Henrique Barboza
@ 2017-07-07  7:14         ` David Gibson
  2017-07-07 10:23           ` Daniel Henrique Barboza
  0 siblings, 1 reply; 18+ messages in thread
From: David Gibson @ 2017-07-07  7:14 UTC (permalink / raw)
  To: Daniel Henrique Barboza
  Cc: lvivier, qemu-devel, sursingh, groug, mdroth, qemu-ppc, bharata

[-- Attachment #1: Type: text/plain, Size: 1552 bytes --]

On Thu, Jul 06, 2017 at 11:31:35AM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 07/05/2017 06:53 PM, Daniel Henrique Barboza wrote:
> > Worth a try. I'll see if I can make a POC of this DRC reset at incoming
> > migration time.
> 
> 
> Just played a little with the idea of manual reset during migration. I've
> created a
> POC that resets the CPU DRCs in a pre_load hook.

Ok.  I'm not sure does the pre_load hook get called for objects where
nothing ends up being sent in the migration stream?

> Then I've done the same
> experiment - device_add on both source and target before the migration,
> hot unplug the device after migration is completed. The hot unplug worked as
> expected in both QEMU and guest.

That's encouraging.

> 
> To minimize the impact I am resetting only the DRCs of the CPUs that were
> hotplugged
> in the target instead of resetting everybody.

That sounds like more trouble than it's worth, I think it'll be easier
and safer to reset all DRCs.

> I'll see if this solution works for LMBs and PCI devices. In case
> affirmative, and
> if we are fine with this solution of resetting the DRCs in pre_load (not
> sure if pre_load is
> the right place for doing it - suggestions welcome), I can send a patch to
> be applied on top
> of this series.
> 
> 
> 
> Thanks,
> 
> Daniel
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
  2017-07-07  7:14         ` David Gibson
@ 2017-07-07 10:23           ` Daniel Henrique Barboza
  2017-07-07 21:24             ` Daniel Henrique Barboza
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Henrique Barboza @ 2017-07-07 10:23 UTC (permalink / raw)
  To: David Gibson
  Cc: lvivier, sursingh, mdroth, qemu-devel, groug, qemu-ppc, bharata



On 07/07/2017 04:14 AM, David Gibson wrote:
> On Thu, Jul 06, 2017 at 11:31:35AM -0300, Daniel Henrique Barboza wrote:
>>
>> On 07/05/2017 06:53 PM, Daniel Henrique Barboza wrote:
>>> Worth a try. I'll see if I can make a POC of this DRC reset at incoming
>>> migration time.
>>
>> Just played a little with the idea of manual reset during migration. I've
>> created a
>> POC that resets the CPU DRCs in a pre_load hook.
> Ok.  I'm not sure does the pre_load hook get called for objects where
> nothing ends up being sent in the migration stream?

In this POC I've added a pre_load hook in the machine state to ensure
that it will be called.  I saw that we already had a post_load hook in there
that does adjustments in the CPU and RADIX state, so adding a pre_load hook
sounded viable too. I am not sure if we can add some form of hook for
the DRCs that won't be migrated in the DRC state or if there is a better
VMSD to do the reset of the DRCs.

Anyway, this was more of an experiment than a serious fix proposal.
I believe the general idea is solid but it'll get a couple of spins to get a
good patch for pushing. I'll send the POC as a RFC to get things started and
we'll see what happens.


Daniel

>> Then I've done the same
>> experiment - device_add on both source and target before the migration,
>> hot unplug the device after migration is completed. The hot unplug worked as
>> expected in both QEMU and guest.
> That's encouraging.
>
>> To minimize the impact I am resetting only the DRCs of the CPUs that were
>> hotplugged
>> in the target instead of resetting everybody.
> That sounds like more trouble than it's worth, I think it'll be easier
> and safer to reset all DRCs.
>
>> I'll see if this solution works for LMBs and PCI devices. In case
>> affirmative, and
>> if we are fine with this solution of resetting the DRCs in pre_load (not
>> sure if pre_load is
>> the right place for doing it - suggestions welcome), I can send a patch to
>> be applied on top
>> of this series.
>>
>>
>>
>> Thanks,
>>
>> Daniel
>>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
  2017-07-07 10:23           ` Daniel Henrique Barboza
@ 2017-07-07 21:24             ` Daniel Henrique Barboza
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel Henrique Barboza @ 2017-07-07 21:24 UTC (permalink / raw)
  To: David Gibson
  Cc: lvivier, qemu-devel, sursingh, groug, mdroth, qemu-ppc, bharata


> Anyway, this was more of an experiment than a serious fix proposal.
> I believe the general idea is solid but it'll get a couple of spins to 
> get a
> good patch for pushing. I'll send the POC as a RFC to get things 
> started and
> we'll see what happens.

I've sent the POC as RFC here:

"[RFC drcVI PATCH] spapr: reset DRCs on migration pre_load"


The patch mentions drcVI because it was made on top of this series. Let 
me know
what you think.

Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-07-07 21:25 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-21  9:18 [Qemu-devel] [PATCH 0/5] spapr: DRC cleanups (part VI) David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 1/5] spapr: Remove 'awaiting_allocation' DRC flag David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 2/5] spapr: Refactor spapr_drc_detach() David Gibson
2017-06-22  9:32   ` Greg Kurz
2017-07-04 14:54     ` David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 3/5] spapr: Cleanups relating to DRC awaiting_release field David Gibson
2017-06-22 12:51   ` Greg Kurz
2017-06-21  9:18 ` [Qemu-devel] [PATCH 4/5] spapr: Consolidate DRC state variables David Gibson
2017-06-21  9:18 ` [Qemu-devel] [PATCH 5/5] spapr: Remove sPAPRConfigureConnectorState sub-structure David Gibson
2017-07-04 21:13 ` [Qemu-devel] [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI) Daniel Henrique Barboza
2017-07-05 11:04   ` David Gibson
2017-07-05 21:53     ` Daniel Henrique Barboza
2017-07-05 22:41       ` Michael Roth
2017-07-06  9:46         ` David Gibson
2017-07-06 14:31       ` Daniel Henrique Barboza
2017-07-07  7:14         ` David Gibson
2017-07-07 10:23           ` Daniel Henrique Barboza
2017-07-07 21:24             ` Daniel Henrique Barboza

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.