All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] vhost-user: Add suspend/resume
@ 2023-07-11 15:52 Hanna Czenczek
  2023-07-11 15:52 ` [PATCH 1/6] vhost-user.rst: " Hanna Czenczek
                   ` (6 more replies)
  0 siblings, 7 replies; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-11 15:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Hanna Czenczek, Michael S . Tsirkin, Stefan Hajnoczi,
	Eugenio Pérez, German Maglione

Hi,

As discussed on the previous version of the virtio-fs migration series
(https://lists.nongnu.org/archive/html/qemu-devel/2023-04/msg01575.html),
we currently don’t have a good way to have a vhost-user back-end fully
cease all operations, including background operations.  To work around
this, we reset it, which is not an option for stateful devices like
virtio-fs.

Instead, we want the same SUSPEND/RESUME model that vhost-vdpa already
has, so that we can suspend back-ends when we want them to stop doing
anything (i.e. on VM stop), and resume them later (i.e. on VM resume).
This series adds these vhost-user operations to the protocol and
implements them in qemu.  Furthermore, it has vhost-user and vhost-vdpa
do roughly the same thing in their reset paths, as far as possible.
That path will still remain as a fall-back if SUSPEND/RESUME is not
implemented, and, given that qemu’s vhost-vdpa code currently does not
make use of RESUME, it is actually always used for vhost-vdpa (to take
the device out of a suspended state).


Hanna Czenczek (6):
  vhost-user.rst: Add suspend/resume
  vhost-vdpa: Move vhost_vdpa_reset_status() up
  vhost: Do not reset suspended devices on stop
  vhost-user: Implement suspend/resume
  vhost-vdpa: Match vhost-user's status reset
  vhost-user: Have reset_status fall back to reset

 docs/interop/vhost-user.rst    |  35 +++++++++++-
 include/hw/virtio/vhost-vdpa.h |   2 -
 include/hw/virtio/vhost.h      |   8 +++
 hw/virtio/vhost-user.c         | 101 ++++++++++++++++++++++++++++++++-
 hw/virtio/vhost-vdpa.c         |  41 ++++++-------
 hw/virtio/vhost.c              |   8 ++-
 6 files changed, 169 insertions(+), 26 deletions(-)

-- 
2.41.0



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 1/6] vhost-user.rst: Add suspend/resume
  2023-07-11 15:52 [PATCH 0/6] vhost-user: Add suspend/resume Hanna Czenczek
@ 2023-07-11 15:52 ` Hanna Czenczek
  2023-07-18 14:25   ` Stefan Hajnoczi
  2023-07-11 15:52 ` [PATCH 2/6] vhost-vdpa: Move vhost_vdpa_reset_status() up Hanna Czenczek
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-11 15:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Hanna Czenczek, Michael S . Tsirkin, Stefan Hajnoczi,
	Eugenio Pérez, German Maglione

When stopping the VM, qemu wants all devices to fully cease any
operation, too.  Currently, we can only have vhost-user back-ends stop
processing vrings, but not background operations.  Add the SUSPEND and
RESUME commands from vDPA, which allow the front-end (qemu) to tell
back-ends to cease all operations, including those running in the
background.

qemu's current work-around for this is to reset the back-end instead of
suspending it, which will not work for back-ends that have internal
state that must be preserved across e.g. stop/cont.

Note that the given specification requires the back-end to delay
processing kicks (i.e. starting vrings) until the device is resumed,
instead of requiring the front-end not to send kicks while suspended.
qemu starts devices (and would just resume them) only when the VM is in
a running state, so it would be difficult to have qemu delay kicks until
the device is resumed, which is why this patch specifies handling of
kicks as it does.

Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
 docs/interop/vhost-user.rst | 35 +++++++++++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 5a070adbc1..ac6be34c4c 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -381,6 +381,10 @@ readable) on the descriptor specified by ``VHOST_USER_SET_VRING_KICK``
 or receiving the in-band message ``VHOST_USER_VRING_KICK`` if negotiated,
 and stop ring upon receiving ``VHOST_USER_GET_VRING_BASE``.
 
+While the back-end is suspended (via ``VHOST_USER_SUSPEND``), it must
+never process rings, and thus also delay handling kicks until the
+back-end is resumed again.
+
 Rings can be enabled or disabled by ``VHOST_USER_SET_VRING_ENABLE``.
 
 If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
@@ -479,8 +483,9 @@ supplied by ``VhostUserLog``.
 ancillary data, it may be used to inform the front-end that the log has
 been modified.
 
-Once the source has finished migration, rings will be stopped by the
-source. No further update must be done before rings are restarted.
+Once the source has finished migration, the device will be suspended and
+its rings will be stopped by the source. No further update must be done
+before the device and its rings are resumed.
 
 In postcopy migration the back-end is started before all the memory has
 been received from the source host, and care must be taken to avoid
@@ -885,6 +890,7 @@ Protocol features
   #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
   #define VHOST_USER_PROTOCOL_F_STATUS               16
   #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
+  #define VHOST_USER_PROTOCOL_F_SUSPEND              18
 
 Front-end message types
 -----------------------
@@ -1440,6 +1446,31 @@ Front-end message types
   query the back-end for its device status as defined in the Virtio
   specification.
 
+``VHOST_USER_SUSPEND``
+  :id: 41
+  :equivalent ioctl: VHOST_VDPA_SUSPEND
+  :request payload: N/A
+  :reply payload: N/A
+
+  When the ``VHOST_USER_PROTOCOL_F_SUSPEND`` protocol feature has been
+  successfully negotiated, this message is submitted by the front-end to
+  have the back-end cease all operations except for handling vhost-user
+  requests.  The back-end must stop processing all virt queues, and it
+  must not perform any background operations.  It may not resume until a
+  subsequent ``VHOST_USER_RESUME`` call.
+
+``VHOST_USER_RESUME``
+  :id: 42
+  :equivalent ioctl: VHOST_VDPA_RESUME
+  :request payload: N/A
+  :reply payload: N/A
+
+  When the ``VHOST_USER_PROTOCOL_F_SUSPEND`` protocol feature has been
+  successfully negotiated, this message is submitted by the front-end to
+  allow the back-end to resume operations after having been suspended
+  before.  The back-end must again begin processing rings that are not
+  stopped, and it may resume background operations.
+
 
 Back-end message types
 ----------------------
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 2/6] vhost-vdpa: Move vhost_vdpa_reset_status() up
  2023-07-11 15:52 [PATCH 0/6] vhost-user: Add suspend/resume Hanna Czenczek
  2023-07-11 15:52 ` [PATCH 1/6] vhost-user.rst: " Hanna Czenczek
@ 2023-07-11 15:52 ` Hanna Czenczek
  2023-07-18 14:29   ` Stefan Hajnoczi
  2023-07-11 15:52 ` [PATCH 3/6] vhost: Do not reset suspended devices on stop Hanna Czenczek
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-11 15:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Hanna Czenczek, Michael S . Tsirkin, Stefan Hajnoczi,
	Eugenio Pérez, German Maglione

The next commit is going to have vhost_vdpa_dev_start() call this, so
move it up to have the declaration where we are going to need it.

Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 42f2a4bae9..7b7dee468e 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1286,6 +1286,20 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
     vhost_vdpa_reset_device(dev);
 }
 
+static void vhost_vdpa_reset_status(struct vhost_dev *dev)
+{
+    struct vhost_vdpa *v = dev->opaque;
+
+    if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
+        return;
+    }
+
+    vhost_vdpa_reset_device(dev);
+    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
+                               VIRTIO_CONFIG_S_DRIVER);
+    memory_listener_unregister(&v->listener);
+}
+
 static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
 {
     struct vhost_vdpa *v = dev->opaque;
@@ -1323,20 +1337,6 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
     return 0;
 }
 
-static void vhost_vdpa_reset_status(struct vhost_dev *dev)
-{
-    struct vhost_vdpa *v = dev->opaque;
-
-    if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
-        return;
-    }
-
-    vhost_vdpa_reset_device(dev);
-    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
-                               VIRTIO_CONFIG_S_DRIVER);
-    memory_listener_unregister(&v->listener);
-}
-
 static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
                                      struct vhost_log *log)
 {
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-11 15:52 [PATCH 0/6] vhost-user: Add suspend/resume Hanna Czenczek
  2023-07-11 15:52 ` [PATCH 1/6] vhost-user.rst: " Hanna Czenczek
  2023-07-11 15:52 ` [PATCH 2/6] vhost-vdpa: Move vhost_vdpa_reset_status() up Hanna Czenczek
@ 2023-07-11 15:52 ` Hanna Czenczek
  2023-07-18 14:33   ` Stefan Hajnoczi
  2023-07-21 15:25   ` Eugenio Perez Martin
  2023-07-11 15:52 ` [PATCH 4/6] vhost-user: Implement suspend/resume Hanna Czenczek
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-11 15:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Hanna Czenczek, Michael S . Tsirkin, Stefan Hajnoczi,
	Eugenio Pérez, German Maglione

Move the `suspended` field from vhost_vdpa into the global vhost_dev
struct, so vhost_dev_stop() can check whether the back-end has been
suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
there is no need to reset it; the reset is just a fall-back to stop
device operations for back-ends that do not support suspend.

Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
when the device is re-started, we still have to do the reset to have it
un-suspend.

Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
 include/hw/virtio/vhost-vdpa.h |  2 --
 include/hw/virtio/vhost.h      |  8 ++++++++
 hw/virtio/vhost-vdpa.c         | 11 +++++++----
 hw/virtio/vhost.c              |  8 +++++++-
 4 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index e64bfc7f98..72c3686b7f 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
     bool shadow_vqs_enabled;
     /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
     bool shadow_data;
-    /* Device suspended successfully */
-    bool suspended;
     /* IOVA mapping used by the Shadow Virtqueue */
     VhostIOVATree *iova_tree;
     GPtrArray *shadow_vqs;
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 6a173cb9fa..69bf59d630 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -120,6 +120,14 @@ struct vhost_dev {
     uint64_t backend_cap;
     /* @started: is the vhost device started? */
     bool started;
+    /**
+     * @suspended: Whether the vhost device is currently suspended.  Set
+     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
+     * are supposed to automatically suspend/resume in their
+     * vhost_dev_start handlers as required.  Must also be cleared when
+     * the device is reset.
+     */
+    bool suspended;
     bool log_enabled;
     uint64_t log_size;
     Error *migration_blocker;
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7b7dee468e..f7fd19a203 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
 
 static int vhost_vdpa_reset_device(struct vhost_dev *dev)
 {
-    struct vhost_vdpa *v = dev->opaque;
     int ret;
     uint8_t status = 0;
 
     ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
     trace_vhost_vdpa_reset_device(dev);
-    v->suspended = false;
+    dev->suspended = false;
     return ret;
 }
 
@@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
         if (unlikely(r)) {
             error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
         } else {
-            v->suspended = true;
+            dev->suspended = true;
             return;
         }
     }
@@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
             return -1;
         }
         vhost_vdpa_set_vring_ready(dev);
+        if (dev->suspended) {
+            /* TODO: When RESUME is available, use it instead of resetting */
+            vhost_vdpa_reset_status(dev);
+        }
     } else {
         vhost_vdpa_suspend(dev);
         vhost_vdpa_svqs_stop(dev);
@@ -1400,7 +1403,7 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev,
         return 0;
     }
 
-    if (!v->suspended) {
+    if (!dev->suspended) {
         /*
          * Cannot trust in value returned by device, let vhost recover used
          * idx from guest.
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index abf0d03c8d..2e28e58da7 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2059,7 +2059,13 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
                              hdev->vqs + i,
                              hdev->vq_index + i);
     }
-    if (hdev->vhost_ops->vhost_reset_status) {
+
+    /*
+     * If we failed to successfully stop the device via SUSPEND (should have
+     * been attempted by `vhost_dev_start(hdev, false)`), reset it to stop it.
+     * Stateful devices where this would be problematic must implement SUSPEND.
+     */
+    if (!hdev->suspended && hdev->vhost_ops->vhost_reset_status) {
         hdev->vhost_ops->vhost_reset_status(hdev);
     }
 
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 4/6] vhost-user: Implement suspend/resume
  2023-07-11 15:52 [PATCH 0/6] vhost-user: Add suspend/resume Hanna Czenczek
                   ` (2 preceding siblings ...)
  2023-07-11 15:52 ` [PATCH 3/6] vhost: Do not reset suspended devices on stop Hanna Czenczek
@ 2023-07-11 15:52 ` Hanna Czenczek
  2023-07-18 14:37   ` Stefan Hajnoczi
  2023-07-11 15:52 ` [PATCH 5/6] vhost-vdpa: Match vhost-user's status reset Hanna Czenczek
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-11 15:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Hanna Czenczek, Michael S . Tsirkin, Stefan Hajnoczi,
	Eugenio Pérez, German Maglione

Implement SUSPEND/RESUME like vDPA does, by automatically using it in
vhost_user_dev_start().  (Though our vDPA code does not implement RESUME
yet, so there, the device is reset when it is to be resumed.)

Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
 hw/virtio/vhost-user.c | 99 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 97 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8dcf049d42..4507de5a92 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -74,6 +74,8 @@ enum VhostUserProtocolFeature {
     /* Feature 14 reserved for VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS. */
     VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS = 15,
     VHOST_USER_PROTOCOL_F_STATUS = 16,
+    /* Feature 17 reserved for VHOST_USER_PROTOCOL_F_XEN_MMAP. */
+    VHOST_USER_PROTOCOL_F_SUSPEND = 18,
     VHOST_USER_PROTOCOL_F_MAX
 };
 
@@ -121,6 +123,8 @@ typedef enum VhostUserRequest {
     VHOST_USER_REM_MEM_REG = 38,
     VHOST_USER_SET_STATUS = 39,
     VHOST_USER_GET_STATUS = 40,
+    VHOST_USER_SUSPEND = 41,
+    VHOST_USER_RESUME = 42,
     VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -1389,7 +1393,19 @@ static int vhost_user_set_u64(struct vhost_dev *dev, int request, uint64_t u64,
 
 static int vhost_user_set_status(struct vhost_dev *dev, uint8_t status)
 {
-    return vhost_user_set_u64(dev, VHOST_USER_SET_STATUS, status, false);
+    int ret;
+
+    ret = vhost_user_set_u64(dev, VHOST_USER_SET_STATUS, status, false);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (!status) {
+        /* reset */
+        dev->suspended = false;
+    }
+
+    return 0;
 }
 
 static int vhost_user_get_status(struct vhost_dev *dev, uint8_t *status)
@@ -1490,6 +1506,7 @@ static int vhost_user_get_max_memslots(struct vhost_dev *dev,
 
 static int vhost_user_reset_device(struct vhost_dev *dev)
 {
+    int ret;
     VhostUserMsg msg = {
         .hdr.flags = VHOST_USER_VERSION,
     };
@@ -1499,7 +1516,13 @@ static int vhost_user_reset_device(struct vhost_dev *dev)
         ? VHOST_USER_RESET_DEVICE
         : VHOST_USER_RESET_OWNER;
 
-    return vhost_user_write(dev, &msg, NULL, 0);
+    ret = vhost_user_write(dev, &msg, NULL, 0);
+    if (ret < 0) {
+        return ret;
+    }
+
+    dev->suspended = false;
+    return 0;
 }
 
 static int vhost_user_backend_handle_config_change(struct vhost_dev *dev)
@@ -2707,8 +2730,80 @@ void vhost_user_async_close(DeviceState *d,
     }
 }
 
+static bool vhost_user_supports_suspend(struct vhost_dev *dev)
+{
+    return virtio_has_feature(dev->protocol_features,
+                              VHOST_USER_PROTOCOL_F_SUSPEND);
+}
+
+static int vhost_user_do_suspend_resume(struct vhost_dev *dev, bool suspend)
+{
+    VhostUserMsg msg;
+    bool reply_supported = virtio_has_feature(dev->protocol_features,
+                                              VHOST_USER_PROTOCOL_F_REPLY_ACK);
+    VhostUserRequest request = suspend ? VHOST_USER_SUSPEND : VHOST_USER_RESUME;
+    int ret;
+
+    if (dev->suspended == suspend) {
+        /* Nothing to do */
+        return 0;
+    }
+
+    if (!vhost_user_supports_suspend(dev)) {
+        return -ENOTSUP;
+    }
+
+    msg = (VhostUserMsg) {
+        .hdr = {
+            .request = request,
+            .size = 0,
+            .flags = VHOST_USER_VERSION,
+        },
+    };
+    if (reply_supported) {
+        msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
+    }
+
+    ret = vhost_user_write(dev, &msg, NULL, 0);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (reply_supported) {
+        ret = process_message_reply(dev, &msg);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    dev->suspended = suspend;
+    return 0;
+}
+
+static int vhost_user_suspend(struct vhost_dev *dev)
+{
+    return vhost_user_do_suspend_resume(dev, true);
+}
+
+static int vhost_user_resume(struct vhost_dev *dev)
+{
+    return vhost_user_do_suspend_resume(dev, false);
+}
+
 static int vhost_user_dev_start(struct vhost_dev *dev, bool started)
 {
+    /*
+     * Ignore results.  If the central vhost code cares, it will check
+     * dev->suspended.  (These calls will fail if the back-end does not
+     * support suspend/resume, which callers that just want to start the
+     * device do not care about.)
+     */
+    if (started) {
+        vhost_user_resume(dev);
+    } else {
+        vhost_user_suspend(dev);
+    }
+
     if (!virtio_has_feature(dev->protocol_features,
                             VHOST_USER_PROTOCOL_F_STATUS)) {
         return 0;
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 5/6] vhost-vdpa: Match vhost-user's status reset
  2023-07-11 15:52 [PATCH 0/6] vhost-user: Add suspend/resume Hanna Czenczek
                   ` (3 preceding siblings ...)
  2023-07-11 15:52 ` [PATCH 4/6] vhost-user: Implement suspend/resume Hanna Czenczek
@ 2023-07-11 15:52 ` Hanna Czenczek
  2023-07-18 14:50   ` Stefan Hajnoczi
  2023-07-11 15:52 ` [PATCH 6/6] vhost-user: Have reset_status fall back to reset Hanna Czenczek
  2023-07-18 15:14 ` [PATCH 0/6] vhost-user: Add suspend/resume Stefan Hajnoczi
  6 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-11 15:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Hanna Czenczek, Michael S . Tsirkin, Stefan Hajnoczi,
	Eugenio Pérez, German Maglione

vhost-vdpa and vhost-user differ in how they reset the status in their
respective vhost_reset_status implementations: vhost-vdpa zeroes it,
then re-adds the S_ACKNOWLEDGE and S_DRIVER config bits.  S_DRIVER_OK is
then set in vhost_vdpa_dev_start().

vhost-user in contrast just zeroes the status, and does no re-add any
config bits until vhost_user_dev_start() (where it does re-add all of
S_ACKNOWLEDGE, S_DRIVER, and S_DRIVER_OK).

There is no documentation for vhost_reset_status, but its only caller is
vhost_dev_stop().  So apparently, the device is to be stopped after
vhost_reset_status, and therefore it makes more sense to keep the status
field fully cleared until the back-end is re-started, which is how
vhost-user does it.  Make vhost-vdpa do the same -- if nothing else it's
confusing to have both vhost implementations handle this differently.

Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index f7fd19a203..0cde8b40de 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1294,8 +1294,6 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
     }
 
     vhost_vdpa_reset_device(dev);
-    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
-                               VIRTIO_CONFIG_S_DRIVER);
     memory_listener_unregister(&v->listener);
 }
 
@@ -1334,7 +1332,9 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
         }
         memory_listener_register(&v->listener, dev->vdev->dma_as);
 
-        return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
+        return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
+                                          VIRTIO_CONFIG_S_DRIVER |
+                                          VIRTIO_CONFIG_S_DRIVER_OK);
     }
 
     return 0;
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 6/6] vhost-user: Have reset_status fall back to reset
  2023-07-11 15:52 [PATCH 0/6] vhost-user: Add suspend/resume Hanna Czenczek
                   ` (4 preceding siblings ...)
  2023-07-11 15:52 ` [PATCH 5/6] vhost-vdpa: Match vhost-user's status reset Hanna Czenczek
@ 2023-07-11 15:52 ` Hanna Czenczek
  2023-07-18 15:10   ` Stefan Hajnoczi
  2023-07-18 15:14 ` [PATCH 0/6] vhost-user: Add suspend/resume Stefan Hajnoczi
  6 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-11 15:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Hanna Czenczek, Michael S . Tsirkin, Stefan Hajnoczi,
	Eugenio Pérez, German Maglione

The only user of vhost_user_reset_status() is vhost_dev_stop(), which
only uses it as a fall-back to stop the back-end if it does not support
SUSPEND.  However, vhost-user's implementation is a no-op unless the
back-end supports SET_STATUS.

vhost-vdpa's implementation instead just calls
vhost_vdpa_reset_device(), implying that it's OK to fully reset the
device if SET_STATUS is not supported.

To be fair, vhost_vdpa_reset_device() does nothing but to set the status
to zero.  However, that may well be because vhost-vdpa has no method
besides this to reset a device.  In contrast, vhost-user has
RESET_DEVICE and a RESET_OWNER, which can be used instead.

While it is not entirely clear from documentation or git logs, from
discussions and the order of vhost-user protocol features, it appears to
me as if RESET_OWNER originally had no real meaning for vhost-user, and
was thus used to signal a device reset to the back-end.  Then,
RESET_DEVICE was introduced, to have a well-defined dedicated reset
command.  Finally, vhost-user received full STATUS support, including
SET_STATUS, so setting the device status to 0 is now the preferred way
of resetting a device.  Still, RESET_DEVICE and RESET_OWNER should
remain valid as fall-backs.

Therefore, have vhost_user_reset_status() fall back to
vhost_user_reset_device() if the back-end has no STATUS support.

Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
 hw/virtio/vhost-user.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 4507de5a92..53a881ec2a 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct vhost_dev *dev)
     if (virtio_has_feature(dev->protocol_features,
                            VHOST_USER_PROTOCOL_F_STATUS)) {
         vhost_user_set_status(dev, 0);
+    } else {
+        vhost_user_reset_device(dev);
     }
 }
 
-- 
2.41.0



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/6] vhost-user.rst: Add suspend/resume
  2023-07-11 15:52 ` [PATCH 1/6] vhost-user.rst: " Hanna Czenczek
@ 2023-07-18 14:25   ` Stefan Hajnoczi
  2023-07-19 13:59     ` Hanna Czenczek
  0 siblings, 1 reply; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-18 14:25 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

[-- Attachment #1: Type: text/plain, Size: 5477 bytes --]

On Tue, Jul 11, 2023 at 05:52:23PM +0200, Hanna Czenczek wrote:
> When stopping the VM, qemu wants all devices to fully cease any
> operation, too.  Currently, we can only have vhost-user back-ends stop
> processing vrings, but not background operations.  Add the SUSPEND and
> RESUME commands from vDPA, which allow the front-end (qemu) to tell
> back-ends to cease all operations, including those running in the
> background.
> 
> qemu's current work-around for this is to reset the back-end instead of
> suspending it, which will not work for back-ends that have internal
> state that must be preserved across e.g. stop/cont.
> 
> Note that the given specification requires the back-end to delay
> processing kicks (i.e. starting vrings) until the device is resumed,
> instead of requiring the front-end not to send kicks while suspended.
> qemu starts devices (and would just resume them) only when the VM is in
> a running state, so it would be difficult to have qemu delay kicks until
> the device is resumed, which is why this patch specifies handling of
> kicks as it does.
> 
> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> ---
>  docs/interop/vhost-user.rst | 35 +++++++++++++++++++++++++++++++++--
>  1 file changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 5a070adbc1..ac6be34c4c 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -381,6 +381,10 @@ readable) on the descriptor specified by ``VHOST_USER_SET_VRING_KICK``
>  or receiving the in-band message ``VHOST_USER_VRING_KICK`` if negotiated,
>  and stop ring upon receiving ``VHOST_USER_GET_VRING_BASE``.
>  
> +While the back-end is suspended (via ``VHOST_USER_SUSPEND``), it must
> +never process rings, and thus also delay handling kicks until the

If you respin this series, I suggest replacing "never" with "not" to
emphasize that ring processing is only skipped while the device is
suspended (rather than forever). "Never" feels too strong to use when
describing a temporary state.

> +back-end is resumed again.
> +
>  Rings can be enabled or disabled by ``VHOST_USER_SET_VRING_ENABLE``.
>  
>  If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
> @@ -479,8 +483,9 @@ supplied by ``VhostUserLog``.
>  ancillary data, it may be used to inform the front-end that the log has
>  been modified.
>  
> -Once the source has finished migration, rings will be stopped by the
> -source. No further update must be done before rings are restarted.
> +Once the source has finished migration, the device will be suspended and
> +its rings will be stopped by the source. No further update must be done
> +before the device and its rings are resumed.

This paragraph is abstract and doesn't directly identify the mechanisms
or who does what:
- "the device will be suspended" via VHOST_USER_SUSPEND (or reset when
  VHOST_USER_SUSPEND is not supported?) or automatically by the device
  itself or some other mechanism?
- "before the device and its rings are resumed" via VHOST_USER_RESUME?
  And is this referring to the source device?

Please rephrase the paragraph to identify the vhost-user messages
involved.

>  
>  In postcopy migration the back-end is started before all the memory has
>  been received from the source host, and care must be taken to avoid
> @@ -885,6 +890,7 @@ Protocol features
>    #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
>    #define VHOST_USER_PROTOCOL_F_STATUS               16
>    #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
> +  #define VHOST_USER_PROTOCOL_F_SUSPEND              18
>  
>  Front-end message types
>  -----------------------
> @@ -1440,6 +1446,31 @@ Front-end message types
>    query the back-end for its device status as defined in the Virtio
>    specification.
>  
> +``VHOST_USER_SUSPEND``
> +  :id: 41
> +  :equivalent ioctl: VHOST_VDPA_SUSPEND
> +  :request payload: N/A
> +  :reply payload: N/A
> +
> +  When the ``VHOST_USER_PROTOCOL_F_SUSPEND`` protocol feature has been
> +  successfully negotiated, this message is submitted by the front-end to
> +  have the back-end cease all operations except for handling vhost-user
> +  requests.  The back-end must stop processing all virt queues, and it
> +  must not perform any background operations.  It may not resume until a

"background operations" are not defined. What does it mean:
- Anything that writes to memory slots
- Anything that changes the visible state of the device
- Anything that changes the non-visible internal state of the device
- etc
?

> +  subsequent ``VHOST_USER_RESUME`` call.
> +
> +``VHOST_USER_RESUME``
> +  :id: 42
> +  :equivalent ioctl: VHOST_VDPA_RESUME
> +  :request payload: N/A
> +  :reply payload: N/A
> +
> +  When the ``VHOST_USER_PROTOCOL_F_SUSPEND`` protocol feature has been
> +  successfully negotiated, this message is submitted by the front-end to
> +  allow the back-end to resume operations after having been suspended
> +  before.  The back-end must again begin processing rings that are not

This can be made more concrete by referencing the vhost-user message
used to suspend the device:

"suspended before" -> "suspended with VHOST_USER_SUSPEND"

> +  stopped, and it may resume background operations.
> +
>  
>  Back-end message types
>  ----------------------
> -- 
> 2.41.0
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/6] vhost-vdpa: Move vhost_vdpa_reset_status() up
  2023-07-11 15:52 ` [PATCH 2/6] vhost-vdpa: Move vhost_vdpa_reset_status() up Hanna Czenczek
@ 2023-07-18 14:29   ` Stefan Hajnoczi
  0 siblings, 0 replies; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-18 14:29 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

[-- Attachment #1: Type: text/plain, Size: 440 bytes --]

On Tue, Jul 11, 2023 at 05:52:24PM +0200, Hanna Czenczek wrote:
> The next commit is going to have vhost_vdpa_dev_start() call this, so
> move it up to have the declaration where we are going to need it.
> 
> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> ---
>  hw/virtio/vhost-vdpa.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-11 15:52 ` [PATCH 3/6] vhost: Do not reset suspended devices on stop Hanna Czenczek
@ 2023-07-18 14:33   ` Stefan Hajnoczi
  2023-07-21 15:25   ` Eugenio Perez Martin
  1 sibling, 0 replies; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-18 14:33 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

[-- Attachment #1: Type: text/plain, Size: 948 bytes --]

On Tue, Jul 11, 2023 at 05:52:25PM +0200, Hanna Czenczek wrote:
> Move the `suspended` field from vhost_vdpa into the global vhost_dev
> struct, so vhost_dev_stop() can check whether the back-end has been
> suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
> there is no need to reset it; the reset is just a fall-back to stop
> device operations for back-ends that do not support suspend.
> 
> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
> when the device is re-started, we still have to do the reset to have it
> un-suspend.
> 
> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> ---
>  include/hw/virtio/vhost-vdpa.h |  2 --
>  include/hw/virtio/vhost.h      |  8 ++++++++
>  hw/virtio/vhost-vdpa.c         | 11 +++++++----
>  hw/virtio/vhost.c              |  8 +++++++-
>  4 files changed, 22 insertions(+), 7 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 4/6] vhost-user: Implement suspend/resume
  2023-07-11 15:52 ` [PATCH 4/6] vhost-user: Implement suspend/resume Hanna Czenczek
@ 2023-07-18 14:37   ` Stefan Hajnoczi
  0 siblings, 0 replies; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-18 14:37 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

[-- Attachment #1: Type: text/plain, Size: 525 bytes --]

On Tue, Jul 11, 2023 at 05:52:26PM +0200, Hanna Czenczek wrote:
> Implement SUSPEND/RESUME like vDPA does, by automatically using it in
> vhost_user_dev_start().  (Though our vDPA code does not implement RESUME
> yet, so there, the device is reset when it is to be resumed.)
> 
> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> ---
>  hw/virtio/vhost-user.c | 99 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 97 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 5/6] vhost-vdpa: Match vhost-user's status reset
  2023-07-11 15:52 ` [PATCH 5/6] vhost-vdpa: Match vhost-user's status reset Hanna Czenczek
@ 2023-07-18 14:50   ` Stefan Hajnoczi
  2023-07-19 14:09     ` Hanna Czenczek
  0 siblings, 1 reply; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-18 14:50 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

[-- Attachment #1: Type: text/plain, Size: 3091 bytes --]

On Tue, Jul 11, 2023 at 05:52:27PM +0200, Hanna Czenczek wrote:
> vhost-vdpa and vhost-user differ in how they reset the status in their
> respective vhost_reset_status implementations: vhost-vdpa zeroes it,
> then re-adds the S_ACKNOWLEDGE and S_DRIVER config bits.  S_DRIVER_OK is
> then set in vhost_vdpa_dev_start().
> 
> vhost-user in contrast just zeroes the status, and does no re-add any
> config bits until vhost_user_dev_start() (where it does re-add all of
> S_ACKNOWLEDGE, S_DRIVER, and S_DRIVER_OK).
> 
> There is no documentation for vhost_reset_status, but its only caller is
> vhost_dev_stop().  So apparently, the device is to be stopped after
> vhost_reset_status, and therefore it makes more sense to keep the status
> field fully cleared until the back-end is re-started, which is how
> vhost-user does it.  Make vhost-vdpa do the same -- if nothing else it's
> confusing to have both vhost implementations handle this differently.
> 
> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> ---
>  hw/virtio/vhost-vdpa.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Hi Hanna,
The VIRTIO spec lists the Device Initialization sequence including the
bits set in the Device Status Register here:
https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-1070001

ACKNOWLEDGE and DRIVER must be set before FEATURES_OK. DRIVER_OK is set
after FEATURES_OK.

The driver may read the Device Configuration Space once ACKNOWLEDGE and
DRIVER are set.

QEMU's vhost code should follow this sequence (especially for vDPA where
full VIRTIO devices are implemented).

vhost-user is not faithful to the VIRTIO spec here. That's probably due
to the fact that vhost-user didn't have the concept of the Device Status
Register until recently and back-ends mostly ignore it.

Please do the opposite of this patch: bring vhost-user in line with the
VIRTIO specification so that the Device Initialization sequence is
followed correctly. I think vhost-vdpa already does the right thing.

> 
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index f7fd19a203..0cde8b40de 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -1294,8 +1294,6 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
>      }
>  
>      vhost_vdpa_reset_device(dev);
> -    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> -                               VIRTIO_CONFIG_S_DRIVER);
>      memory_listener_unregister(&v->listener);
>  }
>  
> @@ -1334,7 +1332,9 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>          }
>          memory_listener_register(&v->listener, dev->vdev->dma_as);
>  
> -        return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> +        return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> +                                          VIRTIO_CONFIG_S_DRIVER |
> +                                          VIRTIO_CONFIG_S_DRIVER_OK);
>      }
>  
>      return 0;
> -- 
> 2.41.0
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 6/6] vhost-user: Have reset_status fall back to reset
  2023-07-11 15:52 ` [PATCH 6/6] vhost-user: Have reset_status fall back to reset Hanna Czenczek
@ 2023-07-18 15:10   ` Stefan Hajnoczi
  2023-07-19 14:11     ` Hanna Czenczek
  0 siblings, 1 reply; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-18 15:10 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

[-- Attachment #1: Type: text/plain, Size: 2601 bytes --]

On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote:
> The only user of vhost_user_reset_status() is vhost_dev_stop(), which
> only uses it as a fall-back to stop the back-end if it does not support
> SUSPEND.  However, vhost-user's implementation is a no-op unless the
> back-end supports SET_STATUS.
> 
> vhost-vdpa's implementation instead just calls
> vhost_vdpa_reset_device(), implying that it's OK to fully reset the
> device if SET_STATUS is not supported.
> 
> To be fair, vhost_vdpa_reset_device() does nothing but to set the status
> to zero.  However, that may well be because vhost-vdpa has no method
> besides this to reset a device.  In contrast, vhost-user has
> RESET_DEVICE and a RESET_OWNER, which can be used instead.
> 
> While it is not entirely clear from documentation or git logs, from
> discussions and the order of vhost-user protocol features, it appears to
> me as if RESET_OWNER originally had no real meaning for vhost-user, and
> was thus used to signal a device reset to the back-end.  Then,
> RESET_DEVICE was introduced, to have a well-defined dedicated reset
> command.  Finally, vhost-user received full STATUS support, including
> SET_STATUS, so setting the device status to 0 is now the preferred way
> of resetting a device.  Still, RESET_DEVICE and RESET_OWNER should
> remain valid as fall-backs.
> 
> Therefore, have vhost_user_reset_status() fall back to
> vhost_user_reset_device() if the back-end has no STATUS support.
> 
> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> ---
>  hw/virtio/vhost-user.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 4507de5a92..53a881ec2a 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct vhost_dev *dev)
>      if (virtio_has_feature(dev->protocol_features,
>                             VHOST_USER_PROTOCOL_F_STATUS)) {
>          vhost_user_set_status(dev, 0);
> +    } else {
> +        vhost_user_reset_device(dev);
>      }
>  }

Did you check whether DPDK treats setting the status to 0 as equivalent
to RESET_DEVICE?

My understanding is that SET_STATUS is mostly ignored by vhost-user
back-ends today. Even those that implement it may not treat SET_STATUS 0
as equivalent to RESET_DEVICE.

If you decide it's safe to make this change, please also update
vhost-user.rst to document that front-ends should use SET_STATUS 0,
RESET_DEVICE, and RESET_OWNER in order of preference.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 0/6] vhost-user: Add suspend/resume
  2023-07-11 15:52 [PATCH 0/6] vhost-user: Add suspend/resume Hanna Czenczek
                   ` (5 preceding siblings ...)
  2023-07-11 15:52 ` [PATCH 6/6] vhost-user: Have reset_status fall back to reset Hanna Czenczek
@ 2023-07-18 15:14 ` Stefan Hajnoczi
  6 siblings, 0 replies; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-18 15:14 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

[-- Attachment #1: Type: text/plain, Size: 2079 bytes --]

On Tue, Jul 11, 2023 at 05:52:22PM +0200, Hanna Czenczek wrote:
> Hi,
> 
> As discussed on the previous version of the virtio-fs migration series
> (https://lists.nongnu.org/archive/html/qemu-devel/2023-04/msg01575.html),
> we currently don’t have a good way to have a vhost-user back-end fully
> cease all operations, including background operations.  To work around
> this, we reset it, which is not an option for stateful devices like
> virtio-fs.
> 
> Instead, we want the same SUSPEND/RESUME model that vhost-vdpa already
> has, so that we can suspend back-ends when we want them to stop doing
> anything (i.e. on VM stop), and resume them later (i.e. on VM resume).
> This series adds these vhost-user operations to the protocol and
> implements them in qemu.  Furthermore, it has vhost-user and vhost-vdpa
> do roughly the same thing in their reset paths, as far as possible.
> That path will still remain as a fall-back if SUSPEND/RESUME is not
> implemented, and, given that qemu’s vhost-vdpa code currently does not
> make use of RESUME, it is actually always used for vhost-vdpa (to take
> the device out of a suspended state).
> 
> 
> Hanna Czenczek (6):
>   vhost-user.rst: Add suspend/resume
>   vhost-vdpa: Move vhost_vdpa_reset_status() up
>   vhost: Do not reset suspended devices on stop
>   vhost-user: Implement suspend/resume
>   vhost-vdpa: Match vhost-user's status reset
>   vhost-user: Have reset_status fall back to reset
> 
>  docs/interop/vhost-user.rst    |  35 +++++++++++-
>  include/hw/virtio/vhost-vdpa.h |   2 -
>  include/hw/virtio/vhost.h      |   8 +++
>  hw/virtio/vhost-user.c         | 101 ++++++++++++++++++++++++++++++++-
>  hw/virtio/vhost-vdpa.c         |  41 ++++++-------
>  hw/virtio/vhost.c              |   8 ++-
>  6 files changed, 169 insertions(+), 26 deletions(-)

Hi Hanna,
I posted comments but wanted to say great job! There was a long and
somewhat messy email discussion to figure out how to proceed and you
came up with a clean patch series that solves the issues.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/6] vhost-user.rst: Add suspend/resume
  2023-07-18 14:25   ` Stefan Hajnoczi
@ 2023-07-19 13:59     ` Hanna Czenczek
  2023-07-24 17:55       ` Stefan Hajnoczi
  0 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-19 13:59 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

On 18.07.23 16:25, Stefan Hajnoczi wrote:
> On Tue, Jul 11, 2023 at 05:52:23PM +0200, Hanna Czenczek wrote:
>> When stopping the VM, qemu wants all devices to fully cease any
>> operation, too.  Currently, we can only have vhost-user back-ends stop
>> processing vrings, but not background operations.  Add the SUSPEND and
>> RESUME commands from vDPA, which allow the front-end (qemu) to tell
>> back-ends to cease all operations, including those running in the
>> background.
>>
>> qemu's current work-around for this is to reset the back-end instead of
>> suspending it, which will not work for back-ends that have internal
>> state that must be preserved across e.g. stop/cont.
>>
>> Note that the given specification requires the back-end to delay
>> processing kicks (i.e. starting vrings) until the device is resumed,
>> instead of requiring the front-end not to send kicks while suspended.
>> qemu starts devices (and would just resume them) only when the VM is in
>> a running state, so it would be difficult to have qemu delay kicks until
>> the device is resumed, which is why this patch specifies handling of
>> kicks as it does.
>>
>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>> ---
>>   docs/interop/vhost-user.rst | 35 +++++++++++++++++++++++++++++++++--
>>   1 file changed, 33 insertions(+), 2 deletions(-)
>>
>> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
>> index 5a070adbc1..ac6be34c4c 100644
>> --- a/docs/interop/vhost-user.rst
>> +++ b/docs/interop/vhost-user.rst
>> @@ -381,6 +381,10 @@ readable) on the descriptor specified by ``VHOST_USER_SET_VRING_KICK``
>>   or receiving the in-band message ``VHOST_USER_VRING_KICK`` if negotiated,
>>   and stop ring upon receiving ``VHOST_USER_GET_VRING_BASE``.
>>   
>> +While the back-end is suspended (via ``VHOST_USER_SUSPEND``), it must
>> +never process rings, and thus also delay handling kicks until the
> If you respin this series, I suggest replacing "never" with "not" to
> emphasize that ring processing is only skipped while the device is
> suspended (rather than forever). "Never" feels too strong to use when
> describing a temporary state.

Sure.

>> +back-end is resumed again.
>> +
>>   Rings can be enabled or disabled by ``VHOST_USER_SET_VRING_ENABLE``.
>>   
>>   If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
>> @@ -479,8 +483,9 @@ supplied by ``VhostUserLog``.
>>   ancillary data, it may be used to inform the front-end that the log has
>>   been modified.
>>   
>> -Once the source has finished migration, rings will be stopped by the
>> -source. No further update must be done before rings are restarted.
>> +Once the source has finished migration, the device will be suspended and
>> +its rings will be stopped by the source. No further update must be done
>> +before the device and its rings are resumed.
> This paragraph is abstract and doesn't directly identify the mechanisms
> or who does what:
> - "the device will be suspended" via VHOST_USER_SUSPEND (or reset when
>    VHOST_USER_SUSPEND is not supported?) or automatically by the device
>    itself or some other mechanism?

OK, I’ll add VHOST_USER_SUSPEND.

So far I hadn’t considered making a note of resetting as a fallback 
right in the specification.  I don’t think I would want it in the 
specification, but on the other hand, it is probably important for 
back-end authors to know that if they do not implement SUSPEND, their 
device is going to be reset by qemu.

Can we make that a ”may”, i.e.:

```
Once the source has finished migration, the device will be suspended via 
VHOST_USER_SUSPEND and its rings will be stopped by the source. No 
further update must be done before the device and its rings are resumed. 
If and only if the back-end does not support VHOST_USER_SUSPEND, the 
front-end may reset it instead (via VHOST_USER_SET_STATUS, 
VHOST_USER_RESET_DEVICE, or VHOST_USER_RESET_OWNER).
```

I’m unsure about the “If and only if” – older qemu versions will break 
this, but I feel like we must promise back-end writers that if they 
implement SUSPEND, their back-end is not going to be reset; if it is, 
and something breaks because of it, it’s the front-end that must be 
updated to match the specification.

> - "before the device and its rings are resumed" via VHOST_USER_RESUME?
>    And is this referring to the source device?

Yes, via VHOST_USER_RESUME, and restarting the rings by starting them 
(i.e. a kick).

Whether this is referring to the source device…  Well, the text as it 
was before begs the exact same question, so honestly, I don’t know for 
sure.  “Restarting” only makes sense if the rings were stopped before, 
so I assume it’s referring to the source, e.g. for the case of a failed 
migration.  RESUME at least definitely will only happen after a prior 
SUSPEND, so this one will definitely only apply on the source side.

> Please rephrase the paragraph to identify the vhost-user messages
> involved.
>
>>   
>>   In postcopy migration the back-end is started before all the memory has
>>   been received from the source host, and care must be taken to avoid
>> @@ -885,6 +890,7 @@ Protocol features
>>     #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
>>     #define VHOST_USER_PROTOCOL_F_STATUS               16
>>     #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
>> +  #define VHOST_USER_PROTOCOL_F_SUSPEND              18
>>   
>>   Front-end message types
>>   -----------------------
>> @@ -1440,6 +1446,31 @@ Front-end message types
>>     query the back-end for its device status as defined in the Virtio
>>     specification.
>>   
>> +``VHOST_USER_SUSPEND``
>> +  :id: 41
>> +  :equivalent ioctl: VHOST_VDPA_SUSPEND
>> +  :request payload: N/A
>> +  :reply payload: N/A
>> +
>> +  When the ``VHOST_USER_PROTOCOL_F_SUSPEND`` protocol feature has been
>> +  successfully negotiated, this message is submitted by the front-end to
>> +  have the back-end cease all operations except for handling vhost-user
>> +  requests.  The back-end must stop processing all virt queues, and it
>> +  must not perform any background operations.  It may not resume until a
> "background operations" are not defined. What does it mean:
> - Anything that writes to memory slots
> - Anything that changes the visible state of the device
> - Anything that changes the non-visible internal state of the device
> - etc
> ?

My best answer (honestly) is: You tell me.  This series is introducing 
SUSPEND/RESUME because qemu wants to reset devices to make them stop 
“background operations”, and this would break virtiofsd if any form of 
reset were actually implemented.  The implementation of SUSPEND/RESUME 
in virtiofsd on the other hand is supposed to basically be a no-op 
(besides delaying ring processing until a RESUME, but even if we 
processed them before, i.e. really make SUSPEND/RESUME no-ops, it would 
most likely work out fine), so I have no idea what kind of background 
operations we are even talking about, or whether any such actually exist 
in practice.

I don’t know what anyone had in mind when introducing the RESET.  It 
comes from vDPA (c3716f260bf moved it from vdpa into the common code), 
and exists since the code was originally added (108a64818e6), so there’s 
no comment explaining why it exists.  I can’t explain what the back-end 
is supposed to stop doing, because so far it isn’t explained anywhere in 
qemu, its git log, or in any documentation (there basically is no vdpa 
documentation).

I can only say what I just completely naïvely assumed it to mean so far: 
That the back-end basically should stop doing anything besides handling 
and replying to vhost-user messages.  If such a message requires 
changing any state, visible or not, then this state change is permissible.

>> +  subsequent ``VHOST_USER_RESUME`` call.
>> +
>> +``VHOST_USER_RESUME``
>> +  :id: 42
>> +  :equivalent ioctl: VHOST_VDPA_RESUME
>> +  :request payload: N/A
>> +  :reply payload: N/A
>> +
>> +  When the ``VHOST_USER_PROTOCOL_F_SUSPEND`` protocol feature has been
>> +  successfully negotiated, this message is submitted by the front-end to
>> +  allow the back-end to resume operations after having been suspended
>> +  before.  The back-end must again begin processing rings that are not
> This can be made more concrete by referencing the vhost-user message
> used to suspend the device:
>
> "suspended before" -> "suspended with VHOST_USER_SUSPEND"

Alright.

Hanna

>> +  stopped, and it may resume background operations.
>> +
>>   
>>   Back-end message types
>>   ----------------------
>> -- 
>> 2.41.0
>>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 5/6] vhost-vdpa: Match vhost-user's status reset
  2023-07-18 14:50   ` Stefan Hajnoczi
@ 2023-07-19 14:09     ` Hanna Czenczek
  2023-07-19 15:06       ` Stefan Hajnoczi
  2023-07-21 15:47       ` Eugenio Perez Martin
  0 siblings, 2 replies; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-19 14:09 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

On 18.07.23 16:50, Stefan Hajnoczi wrote:
> On Tue, Jul 11, 2023 at 05:52:27PM +0200, Hanna Czenczek wrote:
>> vhost-vdpa and vhost-user differ in how they reset the status in their
>> respective vhost_reset_status implementations: vhost-vdpa zeroes it,
>> then re-adds the S_ACKNOWLEDGE and S_DRIVER config bits.  S_DRIVER_OK is
>> then set in vhost_vdpa_dev_start().
>>
>> vhost-user in contrast just zeroes the status, and does no re-add any
>> config bits until vhost_user_dev_start() (where it does re-add all of
>> S_ACKNOWLEDGE, S_DRIVER, and S_DRIVER_OK).
>>
>> There is no documentation for vhost_reset_status, but its only caller is
>> vhost_dev_stop().  So apparently, the device is to be stopped after
>> vhost_reset_status, and therefore it makes more sense to keep the status
>> field fully cleared until the back-end is re-started, which is how
>> vhost-user does it.  Make vhost-vdpa do the same -- if nothing else it's
>> confusing to have both vhost implementations handle this differently.
>>
>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>> ---
>>   hw/virtio/vhost-vdpa.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
> Hi Hanna,
> The VIRTIO spec lists the Device Initialization sequence including the
> bits set in the Device Status Register here:
> https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-1070001
>
> ACKNOWLEDGE and DRIVER must be set before FEATURES_OK. DRIVER_OK is set
> after FEATURES_OK.
>
> The driver may read the Device Configuration Space once ACKNOWLEDGE and
> DRIVER are set.
>
> QEMU's vhost code should follow this sequence (especially for vDPA where
> full VIRTIO devices are implemented).
>
> vhost-user is not faithful to the VIRTIO spec here. That's probably due
> to the fact that vhost-user didn't have the concept of the Device Status
> Register until recently and back-ends mostly ignore it.
>
> Please do the opposite of this patch: bring vhost-user in line with the
> VIRTIO specification so that the Device Initialization sequence is
> followed correctly. I think vhost-vdpa already does the right thing.

Hm.  This sounds all very good, but what leaves me lost is the fact that 
we never actually expose the status field to the guest, as far as I can 
see.  We have no set_status callback, and as written in the commit 
message, the only caller of reset_status is vhost_dev_stop().  So the 
status field seems completely artificial in vhost right now.  That is 
why I’m wondering what the flags even really mean.

Another point I made in the commit message is that it is strange that we 
reset the status to 0, and then add the ACKNOWLEDGE and DRIVER while the 
VM is still stopped.  It doesn’t make sense to me to set these flags 
while the guest driver is not operative.

If what you’re saying is that we must set FEATURES_OK only after 
ACKNOWLEDGE and DRIVER, wouldn’t it be still better to set all of these 
flags only in vhost_*_dev_start(), but do it in two separate SET_STATUS 
calls?

(You mentioned the configuration space – is that accessed while between 
vhost_dev_stop and vhost_dev_start?)

Hanna

>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>> index f7fd19a203..0cde8b40de 100644
>> --- a/hw/virtio/vhost-vdpa.c
>> +++ b/hw/virtio/vhost-vdpa.c
>> @@ -1294,8 +1294,6 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
>>       }
>>   
>>       vhost_vdpa_reset_device(dev);
>> -    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
>> -                               VIRTIO_CONFIG_S_DRIVER);
>>       memory_listener_unregister(&v->listener);
>>   }
>>   
>> @@ -1334,7 +1332,9 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>>           }
>>           memory_listener_register(&v->listener, dev->vdev->dma_as);
>>   
>> -        return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
>> +        return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
>> +                                          VIRTIO_CONFIG_S_DRIVER |
>> +                                          VIRTIO_CONFIG_S_DRIVER_OK);
>>       }
>>   
>>       return 0;
>> -- 
>> 2.41.0
>>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 6/6] vhost-user: Have reset_status fall back to reset
  2023-07-18 15:10   ` Stefan Hajnoczi
@ 2023-07-19 14:11     ` Hanna Czenczek
  2023-07-19 14:27       ` Hanna Czenczek
  0 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-19 14:11 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

On 18.07.23 17:10, Stefan Hajnoczi wrote:
> On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote:
>> The only user of vhost_user_reset_status() is vhost_dev_stop(), which
>> only uses it as a fall-back to stop the back-end if it does not support
>> SUSPEND.  However, vhost-user's implementation is a no-op unless the
>> back-end supports SET_STATUS.
>>
>> vhost-vdpa's implementation instead just calls
>> vhost_vdpa_reset_device(), implying that it's OK to fully reset the
>> device if SET_STATUS is not supported.
>>
>> To be fair, vhost_vdpa_reset_device() does nothing but to set the status
>> to zero.  However, that may well be because vhost-vdpa has no method
>> besides this to reset a device.  In contrast, vhost-user has
>> RESET_DEVICE and a RESET_OWNER, which can be used instead.
>>
>> While it is not entirely clear from documentation or git logs, from
>> discussions and the order of vhost-user protocol features, it appears to
>> me as if RESET_OWNER originally had no real meaning for vhost-user, and
>> was thus used to signal a device reset to the back-end.  Then,
>> RESET_DEVICE was introduced, to have a well-defined dedicated reset
>> command.  Finally, vhost-user received full STATUS support, including
>> SET_STATUS, so setting the device status to 0 is now the preferred way
>> of resetting a device.  Still, RESET_DEVICE and RESET_OWNER should
>> remain valid as fall-backs.
>>
>> Therefore, have vhost_user_reset_status() fall back to
>> vhost_user_reset_device() if the back-end has no STATUS support.
>>
>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>> ---
>>   hw/virtio/vhost-user.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index 4507de5a92..53a881ec2a 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct vhost_dev *dev)
>>       if (virtio_has_feature(dev->protocol_features,
>>                              VHOST_USER_PROTOCOL_F_STATUS)) {
>>           vhost_user_set_status(dev, 0);
>> +    } else {
>> +        vhost_user_reset_device(dev);
>>       }
>>   }
> Did you check whether DPDK treats setting the status to 0 as equivalent
> to RESET_DEVICE?

If it doesn’t, what’s even the point of using reset_status?

I will investigate, but if there’s a difference, that makes the whole 
reset_* thing even more questionable to me than it has already been so far.

Hanna

> My understanding is that SET_STATUS is mostly ignored by vhost-user
> back-ends today. Even those that implement it may not treat SET_STATUS 0
> as equivalent to RESET_DEVICE.
>
> If you decide it's safe to make this change, please also update
> vhost-user.rst to document that front-ends should use SET_STATUS 0,
> RESET_DEVICE, and RESET_OWNER in order of preference.
>
> Stefan



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 6/6] vhost-user: Have reset_status fall back to reset
  2023-07-19 14:11     ` Hanna Czenczek
@ 2023-07-19 14:27       ` Hanna Czenczek
  2023-07-20 16:03         ` Stefan Hajnoczi
  0 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-19 14:27 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

On 19.07.23 16:11, Hanna Czenczek wrote:
> On 18.07.23 17:10, Stefan Hajnoczi wrote:
>> On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote:
>>> The only user of vhost_user_reset_status() is vhost_dev_stop(), which
>>> only uses it as a fall-back to stop the back-end if it does not support
>>> SUSPEND.  However, vhost-user's implementation is a no-op unless the
>>> back-end supports SET_STATUS.
>>>
>>> vhost-vdpa's implementation instead just calls
>>> vhost_vdpa_reset_device(), implying that it's OK to fully reset the
>>> device if SET_STATUS is not supported.
>>>
>>> To be fair, vhost_vdpa_reset_device() does nothing but to set the 
>>> status
>>> to zero.  However, that may well be because vhost-vdpa has no method
>>> besides this to reset a device.  In contrast, vhost-user has
>>> RESET_DEVICE and a RESET_OWNER, which can be used instead.
>>>
>>> While it is not entirely clear from documentation or git logs, from
>>> discussions and the order of vhost-user protocol features, it 
>>> appears to
>>> me as if RESET_OWNER originally had no real meaning for vhost-user, and
>>> was thus used to signal a device reset to the back-end.  Then,
>>> RESET_DEVICE was introduced, to have a well-defined dedicated reset
>>> command.  Finally, vhost-user received full STATUS support, including
>>> SET_STATUS, so setting the device status to 0 is now the preferred way
>>> of resetting a device.  Still, RESET_DEVICE and RESET_OWNER should
>>> remain valid as fall-backs.
>>>
>>> Therefore, have vhost_user_reset_status() fall back to
>>> vhost_user_reset_device() if the back-end has no STATUS support.
>>>
>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>>> ---
>>>   hw/virtio/vhost-user.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>>> index 4507de5a92..53a881ec2a 100644
>>> --- a/hw/virtio/vhost-user.c
>>> +++ b/hw/virtio/vhost-user.c
>>> @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct 
>>> vhost_dev *dev)
>>>       if (virtio_has_feature(dev->protocol_features,
>>>                              VHOST_USER_PROTOCOL_F_STATUS)) {
>>>           vhost_user_set_status(dev, 0);
>>> +    } else {
>>> +        vhost_user_reset_device(dev);
>>>       }
>>>   }
>> Did you check whether DPDK treats setting the status to 0 as equivalent
>> to RESET_DEVICE?
>
> If it doesn’t, what’s even the point of using reset_status?

Sorry, I’m being unclear, and I think this may be important because it 
ties into the question from patch 1, what qemu is even trying to do by 
running SET_STATUS(0) vhost_dev_stop(), so here’s what gave me the 
impression that SET_STATUS(0) and RESET_DEVICE should be equivalent:

vhost-vdpa.c runs SET_STATUS(0) in a function called 
vhost_vdpa_reset_device().  This is one thing that gave me the 
impression that this is about an actual full reset.

Another is the whole discussion that we’ve had.  vhost_dev_stop() does 
not call a `vhost_reset_device()` function, it calls 
`vhost_reset_status()`.  Still, we were always talking about resetting 
the device.

It doesn’t make sense to me that vDPA would provide no function to fully 
reset a device, while vhost-user does.  Being able to reset a device 
sounds vital to me.  This also gave me the impression that SET_STATUS(0) 
on vDPA at least is functionally equivalent to a full device reset.

Maybe SET_STATUS(0) does mean a full device reset on vDPA, but not on 
vhost-user.  That would be a real shame, so I assumed this would not be 
the case; that SET_STATUS(0) does the same thing on both protocols.

The virtio specification says “Writing 0 into this field resets the 
device.” about the device_status field.

This also makes sense, because the device_status field is basically used 
to tell the device that a driver has taken control.  If reset, this 
indicates the driver has given up control, and to me this is a point 
where a device should fully reset itself.

So all in all, I can’t see the rationale why any implementation that 
supports SET_STATUS would decide to treat SET_STATUS(0) not as 
equivalent or a superset of RESET_DEVICE.  I may be wrong, and this 
might explain a whole deal about what kind of background operations we 
hope to stop with SET_STATUS(0).

Hanna



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 5/6] vhost-vdpa: Match vhost-user's status reset
  2023-07-19 14:09     ` Hanna Czenczek
@ 2023-07-19 15:06       ` Stefan Hajnoczi
  2023-07-21 15:47       ` Eugenio Perez Martin
  1 sibling, 0 replies; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-19 15:06 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: Stefan Hajnoczi, qemu-devel, Michael S . Tsirkin,
	Eugenio Pérez, German Maglione

On Wed, 19 Jul 2023 at 10:10, Hanna Czenczek <hreitz@redhat.com> wrote:
>
> On 18.07.23 16:50, Stefan Hajnoczi wrote:
> > On Tue, Jul 11, 2023 at 05:52:27PM +0200, Hanna Czenczek wrote:
> >> vhost-vdpa and vhost-user differ in how they reset the status in their
> >> respective vhost_reset_status implementations: vhost-vdpa zeroes it,
> >> then re-adds the S_ACKNOWLEDGE and S_DRIVER config bits.  S_DRIVER_OK is
> >> then set in vhost_vdpa_dev_start().
> >>
> >> vhost-user in contrast just zeroes the status, and does no re-add any
> >> config bits until vhost_user_dev_start() (where it does re-add all of
> >> S_ACKNOWLEDGE, S_DRIVER, and S_DRIVER_OK).
> >>
> >> There is no documentation for vhost_reset_status, but its only caller is
> >> vhost_dev_stop().  So apparently, the device is to be stopped after
> >> vhost_reset_status, and therefore it makes more sense to keep the status
> >> field fully cleared until the back-end is re-started, which is how
> >> vhost-user does it.  Make vhost-vdpa do the same -- if nothing else it's
> >> confusing to have both vhost implementations handle this differently.
> >>
> >> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> >> ---
> >>   hw/virtio/vhost-vdpa.c | 6 +++---
> >>   1 file changed, 3 insertions(+), 3 deletions(-)
> > Hi Hanna,
> > The VIRTIO spec lists the Device Initialization sequence including the
> > bits set in the Device Status Register here:
> > https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-1070001
> >
> > ACKNOWLEDGE and DRIVER must be set before FEATURES_OK. DRIVER_OK is set
> > after FEATURES_OK.
> >
> > The driver may read the Device Configuration Space once ACKNOWLEDGE and
> > DRIVER are set.
> >
> > QEMU's vhost code should follow this sequence (especially for vDPA where
> > full VIRTIO devices are implemented).
> >
> > vhost-user is not faithful to the VIRTIO spec here. That's probably due
> > to the fact that vhost-user didn't have the concept of the Device Status
> > Register until recently and back-ends mostly ignore it.
> >
> > Please do the opposite of this patch: bring vhost-user in line with the
> > VIRTIO specification so that the Device Initialization sequence is
> > followed correctly. I think vhost-vdpa already does the right thing.
>
> Hm.  This sounds all very good, but what leaves me lost is the fact that
> we never actually expose the status field to the guest, as far as I can
> see.  We have no set_status callback, and as written in the commit
> message, the only caller of reset_status is vhost_dev_stop().  So the
> status field seems completely artificial in vhost right now.  That is
> why I’m wondering what the flags even really mean.

vhost (including vDPA and vhost-user) is not a 100% passthrough
solution. The VMM emulates a VIRTIO device (e.g. virtio-fs-pci) that
has some separate state from the vhost back-end, including the Device
Status Register. This is analogous to how passthrough PCI devices
still have emulated PCI registers that are not passed through to the
physical PCI device.

However, just because the vDPA, and now vhost-user with the SET_STATUS
message, back-end is not directly exposed to the guest does not mean
it should diverge from the VIRTIO specification for no reason.

> Another point I made in the commit message is that it is strange that we
> reset the status to 0, and then add the ACKNOWLEDGE and DRIVER while the
> VM is still stopped.  It doesn’t make sense to me to set these flags
> while the guest driver is not operative.

While there is no harm in setting those bits, I agree that leaving the
Device Status Register at 0 while the VM is stopped would be nicer.

> If what you’re saying is that we must set FEATURES_OK only after
> ACKNOWLEDGE and DRIVER, wouldn’t it be still better to set all of these
> flags only in vhost_*_dev_start(), but do it in two separate SET_STATUS
> calls?

The device initialization sequence could be put into vhost_dev_start():
1. ACKNOWLEDGE | DRIVER
2. FEATURES_OK via vhost_dev_set_features()
3. DRIVER_OK via ->vhost_dev_start()

But note that the ->vhost_dev_start() callback is too late to set
ACKNOWLEDGE | DRIVER because feature negotiation happens earlier.

> (You mentioned the configuration space – is that accessed while between
> vhost_dev_stop and vhost_dev_start?)

I don't think so.

>
> Hanna
>
> >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >> index f7fd19a203..0cde8b40de 100644
> >> --- a/hw/virtio/vhost-vdpa.c
> >> +++ b/hw/virtio/vhost-vdpa.c
> >> @@ -1294,8 +1294,6 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> >>       }
> >>
> >>       vhost_vdpa_reset_device(dev);
> >> -    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> >> -                               VIRTIO_CONFIG_S_DRIVER);
> >>       memory_listener_unregister(&v->listener);
> >>   }
> >>
> >> @@ -1334,7 +1332,9 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >>           }
> >>           memory_listener_register(&v->listener, dev->vdev->dma_as);
> >>
> >> -        return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> >> +        return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> >> +                                          VIRTIO_CONFIG_S_DRIVER |
> >> +                                          VIRTIO_CONFIG_S_DRIVER_OK);
> >>       }
> >>
> >>       return 0;
> >> --
> >> 2.41.0
> >>
>
>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 6/6] vhost-user: Have reset_status fall back to reset
  2023-07-19 14:27       ` Hanna Czenczek
@ 2023-07-20 16:03         ` Stefan Hajnoczi
  2023-07-21 14:16           ` Hanna Czenczek
  0 siblings, 1 reply; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-20 16:03 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez,
	German Maglione, Maxime Coquelin

[-- Attachment #1: Type: text/plain, Size: 5643 bytes --]

On Wed, Jul 19, 2023 at 04:27:58PM +0200, Hanna Czenczek wrote:
> On 19.07.23 16:11, Hanna Czenczek wrote:
> > On 18.07.23 17:10, Stefan Hajnoczi wrote:
> > > On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote:
> > > > The only user of vhost_user_reset_status() is vhost_dev_stop(), which
> > > > only uses it as a fall-back to stop the back-end if it does not support
> > > > SUSPEND.  However, vhost-user's implementation is a no-op unless the
> > > > back-end supports SET_STATUS.
> > > > 
> > > > vhost-vdpa's implementation instead just calls
> > > > vhost_vdpa_reset_device(), implying that it's OK to fully reset the
> > > > device if SET_STATUS is not supported.
> > > > 
> > > > To be fair, vhost_vdpa_reset_device() does nothing but to set
> > > > the status
> > > > to zero.  However, that may well be because vhost-vdpa has no method
> > > > besides this to reset a device.  In contrast, vhost-user has
> > > > RESET_DEVICE and a RESET_OWNER, which can be used instead.
> > > > 
> > > > While it is not entirely clear from documentation or git logs, from
> > > > discussions and the order of vhost-user protocol features, it
> > > > appears to
> > > > me as if RESET_OWNER originally had no real meaning for vhost-user, and
> > > > was thus used to signal a device reset to the back-end.  Then,
> > > > RESET_DEVICE was introduced, to have a well-defined dedicated reset
> > > > command.  Finally, vhost-user received full STATUS support, including
> > > > SET_STATUS, so setting the device status to 0 is now the preferred way
> > > > of resetting a device.  Still, RESET_DEVICE and RESET_OWNER should
> > > > remain valid as fall-backs.
> > > > 
> > > > Therefore, have vhost_user_reset_status() fall back to
> > > > vhost_user_reset_device() if the back-end has no STATUS support.
> > > > 
> > > > Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> > > > ---
> > > >   hw/virtio/vhost-user.c | 2 ++
> > > >   1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > > > index 4507de5a92..53a881ec2a 100644
> > > > --- a/hw/virtio/vhost-user.c
> > > > +++ b/hw/virtio/vhost-user.c
> > > > @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct
> > > > vhost_dev *dev)
> > > >       if (virtio_has_feature(dev->protocol_features,
> > > >                              VHOST_USER_PROTOCOL_F_STATUS)) {
> > > >           vhost_user_set_status(dev, 0);
> > > > +    } else {
> > > > +        vhost_user_reset_device(dev);
> > > >       }
> > > >   }
> > > Did you check whether DPDK treats setting the status to 0 as equivalent
> > > to RESET_DEVICE?
> > 
> > If it doesn’t, what’s even the point of using reset_status?
> 
> Sorry, I’m being unclear, and I think this may be important because it ties
> into the question from patch 1, what qemu is even trying to do by running
> SET_STATUS(0) vhost_dev_stop(), so here’s what gave me the impression that
> SET_STATUS(0) and RESET_DEVICE should be equivalent:
> 
> vhost-vdpa.c runs SET_STATUS(0) in a function called
> vhost_vdpa_reset_device().  This is one thing that gave me the impression
> that this is about an actual full reset.
> 
> Another is the whole discussion that we’ve had.  vhost_dev_stop() does not
> call a `vhost_reset_device()` function, it calls `vhost_reset_status()`. 
> Still, we were always talking about resetting the device.

There is some hacky stuff with struct vhost_dev's vq_index_end and
multi-queue devices. I think it's because multi-queue vhost-net device
consist of many vhost_devs and NetClientStates, so certain vhost
operations are skipped unless this is the "first" or "last" vhost_dev
from a large aggregate vhost-net device. That might be responsible for
part of the weirdness.

> 
> It doesn’t make sense to me that vDPA would provide no function to fully
> reset a device, while vhost-user does.  Being able to reset a device sounds
> vital to me.  This also gave me the impression that SET_STATUS(0) on vDPA at
> least is functionally equivalent to a full device reset.
> 
> 
> Maybe SET_STATUS(0) does mean a full device reset on vDPA, but not on
> vhost-user.  That would be a real shame, so I assumed this would not be the
> case; that SET_STATUS(0) does the same thing on both protocols.

Yes, exactly. It has the real VIRTIO spec meaning in vDPA. In vhost-user
it's currently only used by DPDK as a hint for when device
initialization is complete:
https://github.com/DPDK/dpdk/commit/41d201804c4c44738168e2d247d3b1780845faa1

> The virtio specification says “Writing 0 into this field resets the device.”
> about the device_status field.
> 
> This also makes sense, because the device_status field is basically used to
> tell the device that a driver has taken control.  If reset, this indicates
> the driver has given up control, and to me this is a point where a device
> should fully reset itself.
> 
> So all in all, I can’t see the rationale why any implementation that
> supports SET_STATUS would decide to treat SET_STATUS(0) not as equivalent or
> a superset of RESET_DEVICE.  I may be wrong, and this might explain a whole
> deal about what kind of background operations we hope to stop with
> SET_STATUS(0).

I would like vhost-user devices to implement SET_STATUS according to the
VIRTIO specification in the future and they can do that. But I think
front-ends should continue sending RESET_DEVICE in order to support old
devices.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 6/6] vhost-user: Have reset_status fall back to reset
  2023-07-20 16:03         ` Stefan Hajnoczi
@ 2023-07-21 14:16           ` Hanna Czenczek
  2023-07-24 18:04             ` Stefan Hajnoczi
  0 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-21 14:16 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez,
	German Maglione, Maxime Coquelin

On 20.07.23 18:03, Stefan Hajnoczi wrote:
> On Wed, Jul 19, 2023 at 04:27:58PM +0200, Hanna Czenczek wrote:
>> On 19.07.23 16:11, Hanna Czenczek wrote:
>>> On 18.07.23 17:10, Stefan Hajnoczi wrote:
>>>> On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote:
>>>>> The only user of vhost_user_reset_status() is vhost_dev_stop(), which
>>>>> only uses it as a fall-back to stop the back-end if it does not support
>>>>> SUSPEND.  However, vhost-user's implementation is a no-op unless the
>>>>> back-end supports SET_STATUS.
>>>>>
>>>>> vhost-vdpa's implementation instead just calls
>>>>> vhost_vdpa_reset_device(), implying that it's OK to fully reset the
>>>>> device if SET_STATUS is not supported.
>>>>>
>>>>> To be fair, vhost_vdpa_reset_device() does nothing but to set
>>>>> the status
>>>>> to zero.  However, that may well be because vhost-vdpa has no method
>>>>> besides this to reset a device.  In contrast, vhost-user has
>>>>> RESET_DEVICE and a RESET_OWNER, which can be used instead.
>>>>>
>>>>> While it is not entirely clear from documentation or git logs, from
>>>>> discussions and the order of vhost-user protocol features, it
>>>>> appears to
>>>>> me as if RESET_OWNER originally had no real meaning for vhost-user, and
>>>>> was thus used to signal a device reset to the back-end.  Then,
>>>>> RESET_DEVICE was introduced, to have a well-defined dedicated reset
>>>>> command.  Finally, vhost-user received full STATUS support, including
>>>>> SET_STATUS, so setting the device status to 0 is now the preferred way
>>>>> of resetting a device.  Still, RESET_DEVICE and RESET_OWNER should
>>>>> remain valid as fall-backs.
>>>>>
>>>>> Therefore, have vhost_user_reset_status() fall back to
>>>>> vhost_user_reset_device() if the back-end has no STATUS support.
>>>>>
>>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>>>>> ---
>>>>>    hw/virtio/vhost-user.c | 2 ++
>>>>>    1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>>>>> index 4507de5a92..53a881ec2a 100644
>>>>> --- a/hw/virtio/vhost-user.c
>>>>> +++ b/hw/virtio/vhost-user.c
>>>>> @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct
>>>>> vhost_dev *dev)
>>>>>        if (virtio_has_feature(dev->protocol_features,
>>>>>                               VHOST_USER_PROTOCOL_F_STATUS)) {
>>>>>            vhost_user_set_status(dev, 0);
>>>>> +    } else {
>>>>> +        vhost_user_reset_device(dev);
>>>>>        }
>>>>>    }
>>>> Did you check whether DPDK treats setting the status to 0 as equivalent
>>>> to RESET_DEVICE?
>>> If it doesn’t, what’s even the point of using reset_status?
>> Sorry, I’m being unclear, and I think this may be important because it ties
>> into the question from patch 1, what qemu is even trying to do by running
>> SET_STATUS(0) vhost_dev_stop(), so here’s what gave me the impression that
>> SET_STATUS(0) and RESET_DEVICE should be equivalent:
>>
>> vhost-vdpa.c runs SET_STATUS(0) in a function called
>> vhost_vdpa_reset_device().  This is one thing that gave me the impression
>> that this is about an actual full reset.
>>
>> Another is the whole discussion that we’ve had.  vhost_dev_stop() does not
>> call a `vhost_reset_device()` function, it calls `vhost_reset_status()`.
>> Still, we were always talking about resetting the device.
> There is some hacky stuff with struct vhost_dev's vq_index_end and
> multi-queue devices. I think it's because multi-queue vhost-net device
> consist of many vhost_devs and NetClientStates, so certain vhost
> operations are skipped unless this is the "first" or "last" vhost_dev
> from a large aggregate vhost-net device. That might be responsible for
> part of the weirdness.
>
>> It doesn’t make sense to me that vDPA would provide no function to fully
>> reset a device, while vhost-user does.  Being able to reset a device sounds
>> vital to me.  This also gave me the impression that SET_STATUS(0) on vDPA at
>> least is functionally equivalent to a full device reset.
>>
>>
>> Maybe SET_STATUS(0) does mean a full device reset on vDPA, but not on
>> vhost-user.  That would be a real shame, so I assumed this would not be the
>> case; that SET_STATUS(0) does the same thing on both protocols.
> Yes, exactly. It has the real VIRTIO spec meaning in vDPA. In vhost-user
> it's currently only used by DPDK as a hint for when device
> initialization is complete:
> https://github.com/DPDK/dpdk/commit/41d201804c4c44738168e2d247d3b1780845faa1

FWIW, now the code is a bit different. 
https://github.com/DPDK/dpdk/commit/671cc679a5fcd26705bb20ddc13b93e665719054 
has added a RESET interpretation for the status field, i.e. when it is 
0.  It doesn’t do anything, but at least DPDK seems to agree that 
SET_STATUS(0) is a reset.

>> The virtio specification says “Writing 0 into this field resets the device.”
>> about the device_status field.
>>
>> This also makes sense, because the device_status field is basically used to
>> tell the device that a driver has taken control.  If reset, this indicates
>> the driver has given up control, and to me this is a point where a device
>> should fully reset itself.
>>
>> So all in all, I can’t see the rationale why any implementation that
>> supports SET_STATUS would decide to treat SET_STATUS(0) not as equivalent or
>> a superset of RESET_DEVICE.  I may be wrong, and this might explain a whole
>> deal about what kind of background operations we hope to stop with
>> SET_STATUS(0).
> I would like vhost-user devices to implement SET_STATUS according to the
> VIRTIO specification in the future and they can do that. But I think
> front-ends should continue sending RESET_DEVICE in order to support old
> devices.

Well, yes, exactly.  That is what I meant to address with this patch, 
vhost-user right now does not send RESET_DEVICE in its 
vhost_reset_status implementation, so the front-end will not fall back 
to RESET_DEVICE when it apparently does intend to reset the device[1].  
We do arguably have vhost_reset_device, too, but for vDPA that is just a 
SET_STATUS(0) (there is no RESET_DEVICE on vDPA), and it’s also only 
called by vhost-user-scsi.

So this also begs the question why we even do have vhost_reset_status 
and vhost_reset_device as two separate things. The commit introducing 
vhost_reset_status (c3716f260bf) doesn’t say.  Maybe the intention was 
that vhost_reset_device would leave the status at 0, while 
vhost_reset_status would return it to ACKNOWLEDGE | DRIVER, as done by 
the introducing commit, but that comes back to patch 5 in this series – 
we don’t need to have ACKNOWLEDGE | DRIVER set after vhost_dev_stop(), 
so we don’t need vhost_reset_status to set those flags.  They should be 
set in vhost_dev_start().

[1] This is assuming that SET_STATUS(0) is intended to reset the device, 
but it sounds like you agree on that.



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-11 15:52 ` [PATCH 3/6] vhost: Do not reset suspended devices on stop Hanna Czenczek
  2023-07-18 14:33   ` Stefan Hajnoczi
@ 2023-07-21 15:25   ` Eugenio Perez Martin
  2023-07-21 16:07     ` Hanna Czenczek
  1 sibling, 1 reply; 37+ messages in thread
From: Eugenio Perez Martin @ 2023-07-21 15:25 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Stefan Hajnoczi, German Maglione

On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>
> Move the `suspended` field from vhost_vdpa into the global vhost_dev
> struct, so vhost_dev_stop() can check whether the back-end has been
> suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
> there is no need to reset it; the reset is just a fall-back to stop
> device operations for back-ends that do not support suspend.
>
> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
> when the device is re-started, we still have to do the reset to have it
> un-suspend.
>
> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> ---
>  include/hw/virtio/vhost-vdpa.h |  2 --
>  include/hw/virtio/vhost.h      |  8 ++++++++
>  hw/virtio/vhost-vdpa.c         | 11 +++++++----
>  hw/virtio/vhost.c              |  8 +++++++-
>  4 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index e64bfc7f98..72c3686b7f 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
>      bool shadow_vqs_enabled;
>      /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
>      bool shadow_data;
> -    /* Device suspended successfully */
> -    bool suspended;
>      /* IOVA mapping used by the Shadow Virtqueue */
>      VhostIOVATree *iova_tree;
>      GPtrArray *shadow_vqs;
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 6a173cb9fa..69bf59d630 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -120,6 +120,14 @@ struct vhost_dev {
>      uint64_t backend_cap;
>      /* @started: is the vhost device started? */
>      bool started;
> +    /**
> +     * @suspended: Whether the vhost device is currently suspended.  Set
> +     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
> +     * are supposed to automatically suspend/resume in their
> +     * vhost_dev_start handlers as required.  Must also be cleared when
> +     * the device is reset.
> +     */
> +    bool suspended;
>      bool log_enabled;
>      uint64_t log_size;
>      Error *migration_blocker;
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 7b7dee468e..f7fd19a203 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
>
>  static int vhost_vdpa_reset_device(struct vhost_dev *dev)
>  {
> -    struct vhost_vdpa *v = dev->opaque;
>      int ret;
>      uint8_t status = 0;
>
>      ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
>      trace_vhost_vdpa_reset_device(dev);
> -    v->suspended = false;
> +    dev->suspended = false;
>      return ret;
>  }
>
> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
>          if (unlikely(r)) {
>              error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
>          } else {
> -            v->suspended = true;
> +            dev->suspended = true;
>              return;
>          }
>      }
> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>              return -1;
>          }
>          vhost_vdpa_set_vring_ready(dev);
> +        if (dev->suspended) {
> +            /* TODO: When RESUME is available, use it instead of resetting */
> +            vhost_vdpa_reset_status(dev);

How is that we reset the status at each vhost_vdpa_dev_start? That
will clean all the vqs configured, features negotiated, etc. in the
vDPA device. Or am I missing something?

I'm totally ok with the move of "suspended" member.

Thanks!


> +        }
>      } else {
>          vhost_vdpa_suspend(dev);
>          vhost_vdpa_svqs_stop(dev);
> @@ -1400,7 +1403,7 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev,
>          return 0;
>      }
>
> -    if (!v->suspended) {
> +    if (!dev->suspended) {
>          /*
>           * Cannot trust in value returned by device, let vhost recover used
>           * idx from guest.
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index abf0d03c8d..2e28e58da7 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -2059,7 +2059,13 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>                               hdev->vqs + i,
>                               hdev->vq_index + i);
>      }
> -    if (hdev->vhost_ops->vhost_reset_status) {
> +
> +    /*
> +     * If we failed to successfully stop the device via SUSPEND (should have
> +     * been attempted by `vhost_dev_start(hdev, false)`), reset it to stop it.
> +     * Stateful devices where this would be problematic must implement SUSPEND.
> +     */
> +    if (!hdev->suspended && hdev->vhost_ops->vhost_reset_status) {
>          hdev->vhost_ops->vhost_reset_status(hdev);
>      }
>
> --
> 2.41.0
>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 5/6] vhost-vdpa: Match vhost-user's status reset
  2023-07-19 14:09     ` Hanna Czenczek
  2023-07-19 15:06       ` Stefan Hajnoczi
@ 2023-07-21 15:47       ` Eugenio Perez Martin
  1 sibling, 0 replies; 37+ messages in thread
From: Eugenio Perez Martin @ 2023-07-21 15:47 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: Stefan Hajnoczi, qemu-devel, Michael S . Tsirkin, German Maglione

On Wed, Jul 19, 2023 at 4:10 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>
> On 18.07.23 16:50, Stefan Hajnoczi wrote:
> > On Tue, Jul 11, 2023 at 05:52:27PM +0200, Hanna Czenczek wrote:
> >> vhost-vdpa and vhost-user differ in how they reset the status in their
> >> respective vhost_reset_status implementations: vhost-vdpa zeroes it,
> >> then re-adds the S_ACKNOWLEDGE and S_DRIVER config bits.  S_DRIVER_OK is
> >> then set in vhost_vdpa_dev_start().
> >>
> >> vhost-user in contrast just zeroes the status, and does no re-add any
> >> config bits until vhost_user_dev_start() (where it does re-add all of
> >> S_ACKNOWLEDGE, S_DRIVER, and S_DRIVER_OK).
> >>
> >> There is no documentation for vhost_reset_status, but its only caller is
> >> vhost_dev_stop().  So apparently, the device is to be stopped after
> >> vhost_reset_status, and therefore it makes more sense to keep the status
> >> field fully cleared until the back-end is re-started, which is how
> >> vhost-user does it.  Make vhost-vdpa do the same -- if nothing else it's
> >> confusing to have both vhost implementations handle this differently.
> >>

Ok now I understand better why you moved the call to vhost_vdpa_reset_status.

Maybe we can move the vhost_vdpa_add_status(dev, _S_ACKNOWLEDGE |
_S_DRIVER) to vhost_vdpa_set_features then, and let reset_status call
vhost_vdpa_reset_device directly. But I'm not 100% sure if I'm missing
something, it would need testing.

Thanks!

> >> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> >> ---
> >>   hw/virtio/vhost-vdpa.c | 6 +++---
> >>   1 file changed, 3 insertions(+), 3 deletions(-)
> > Hi Hanna,
> > The VIRTIO spec lists the Device Initialization sequence including the
> > bits set in the Device Status Register here:
> > https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-1070001
> >
> > ACKNOWLEDGE and DRIVER must be set before FEATURES_OK. DRIVER_OK is set
> > after FEATURES_OK.
> >
> > The driver may read the Device Configuration Space once ACKNOWLEDGE and
> > DRIVER are set.
> >
> > QEMU's vhost code should follow this sequence (especially for vDPA where
> > full VIRTIO devices are implemented).
> >
> > vhost-user is not faithful to the VIRTIO spec here. That's probably due
> > to the fact that vhost-user didn't have the concept of the Device Status
> > Register until recently and back-ends mostly ignore it.
> >
> > Please do the opposite of this patch: bring vhost-user in line with the
> > VIRTIO specification so that the Device Initialization sequence is
> > followed correctly. I think vhost-vdpa already does the right thing.
>
> Hm.  This sounds all very good, but what leaves me lost is the fact that
> we never actually expose the status field to the guest, as far as I can
> see.  We have no set_status callback, and as written in the commit
> message, the only caller of reset_status is vhost_dev_stop().  So the
> status field seems completely artificial in vhost right now.  That is
> why I’m wondering what the flags even really mean.
>
> Another point I made in the commit message is that it is strange that we
> reset the status to 0, and then add the ACKNOWLEDGE and DRIVER while the
> VM is still stopped.  It doesn’t make sense to me to set these flags
> while the guest driver is not operative.
>
> If what you’re saying is that we must set FEATURES_OK only after
> ACKNOWLEDGE and DRIVER, wouldn’t it be still better to set all of these
> flags only in vhost_*_dev_start(), but do it in two separate SET_STATUS
> calls?
>
> (You mentioned the configuration space – is that accessed while between
> vhost_dev_stop and vhost_dev_start?)
>
> Hanna
>
> >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >> index f7fd19a203..0cde8b40de 100644
> >> --- a/hw/virtio/vhost-vdpa.c
> >> +++ b/hw/virtio/vhost-vdpa.c
> >> @@ -1294,8 +1294,6 @@ static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> >>       }
> >>
> >>       vhost_vdpa_reset_device(dev);
> >> -    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> >> -                               VIRTIO_CONFIG_S_DRIVER);
> >>       memory_listener_unregister(&v->listener);
> >>   }
> >>
> >> @@ -1334,7 +1332,9 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >>           }
> >>           memory_listener_register(&v->listener, dev->vdev->dma_as);
> >>
> >> -        return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> >> +        return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> >> +                                          VIRTIO_CONFIG_S_DRIVER |
> >> +                                          VIRTIO_CONFIG_S_DRIVER_OK);
> >>       }
> >>
> >>       return 0;
> >> --
> >> 2.41.0
> >>
>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-21 15:25   ` Eugenio Perez Martin
@ 2023-07-21 16:07     ` Hanna Czenczek
  2023-07-24 15:48       ` Eugenio Perez Martin
  0 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-21 16:07 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Michael S . Tsirkin, Stefan Hajnoczi, German Maglione

On 21.07.23 17:25, Eugenio Perez Martin wrote:
> On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>> Move the `suspended` field from vhost_vdpa into the global vhost_dev
>> struct, so vhost_dev_stop() can check whether the back-end has been
>> suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
>> there is no need to reset it; the reset is just a fall-back to stop
>> device operations for back-ends that do not support suspend.
>>
>> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
>> when the device is re-started, we still have to do the reset to have it
>> un-suspend.
>>
>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>> ---
>>   include/hw/virtio/vhost-vdpa.h |  2 --
>>   include/hw/virtio/vhost.h      |  8 ++++++++
>>   hw/virtio/vhost-vdpa.c         | 11 +++++++----
>>   hw/virtio/vhost.c              |  8 +++++++-
>>   4 files changed, 22 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
>> index e64bfc7f98..72c3686b7f 100644
>> --- a/include/hw/virtio/vhost-vdpa.h
>> +++ b/include/hw/virtio/vhost-vdpa.h
>> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
>>       bool shadow_vqs_enabled;
>>       /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
>>       bool shadow_data;
>> -    /* Device suspended successfully */
>> -    bool suspended;
>>       /* IOVA mapping used by the Shadow Virtqueue */
>>       VhostIOVATree *iova_tree;
>>       GPtrArray *shadow_vqs;
>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>> index 6a173cb9fa..69bf59d630 100644
>> --- a/include/hw/virtio/vhost.h
>> +++ b/include/hw/virtio/vhost.h
>> @@ -120,6 +120,14 @@ struct vhost_dev {
>>       uint64_t backend_cap;
>>       /* @started: is the vhost device started? */
>>       bool started;
>> +    /**
>> +     * @suspended: Whether the vhost device is currently suspended.  Set
>> +     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
>> +     * are supposed to automatically suspend/resume in their
>> +     * vhost_dev_start handlers as required.  Must also be cleared when
>> +     * the device is reset.
>> +     */
>> +    bool suspended;
>>       bool log_enabled;
>>       uint64_t log_size;
>>       Error *migration_blocker;
>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>> index 7b7dee468e..f7fd19a203 100644
>> --- a/hw/virtio/vhost-vdpa.c
>> +++ b/hw/virtio/vhost-vdpa.c
>> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
>>
>>   static int vhost_vdpa_reset_device(struct vhost_dev *dev)
>>   {
>> -    struct vhost_vdpa *v = dev->opaque;
>>       int ret;
>>       uint8_t status = 0;
>>
>>       ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
>>       trace_vhost_vdpa_reset_device(dev);
>> -    v->suspended = false;
>> +    dev->suspended = false;
>>       return ret;
>>   }
>>
>> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
>>           if (unlikely(r)) {
>>               error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
>>           } else {
>> -            v->suspended = true;
>> +            dev->suspended = true;
>>               return;
>>           }
>>       }
>> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>>               return -1;
>>           }
>>           vhost_vdpa_set_vring_ready(dev);
>> +        if (dev->suspended) {
>> +            /* TODO: When RESUME is available, use it instead of resetting */
>> +            vhost_vdpa_reset_status(dev);
> How is that we reset the status at each vhost_vdpa_dev_start? That
> will clean all the vqs configured, features negotiated, etc. in the
> vDPA device. Or am I missing something?

What alternative do you propose?  We don’t have RESUME for vDPA in qemu, 
but we somehow need to lift the previous SUSPEND so the device will 
again respond to guest requests, do we not?

But more generally, is this any different from what is done before this 
patch?  Before this patch, vhost_dev_stop() unconditionally invokes 
vhost_reset_status(), so the device is reset in every stop/start cycle, 
that doesn’t change.  And we still won’t reset it on the first 
vhost_dev_start(), because dev->suspended will be false then, only on 
subsequent stop/start cycles, as before.  So the only difference is that 
now the device is reset on start, not on stop.

Hanna



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-21 16:07     ` Hanna Czenczek
@ 2023-07-24 15:48       ` Eugenio Perez Martin
  2023-07-25  7:53         ` Hanna Czenczek
  0 siblings, 1 reply; 37+ messages in thread
From: Eugenio Perez Martin @ 2023-07-24 15:48 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Stefan Hajnoczi, German Maglione

On Fri, Jul 21, 2023 at 6:07 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>
> On 21.07.23 17:25, Eugenio Perez Martin wrote:
> > On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hreitz@redhat.com> wrote:
> >> Move the `suspended` field from vhost_vdpa into the global vhost_dev
> >> struct, so vhost_dev_stop() can check whether the back-end has been
> >> suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
> >> there is no need to reset it; the reset is just a fall-back to stop
> >> device operations for back-ends that do not support suspend.
> >>
> >> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
> >> when the device is re-started, we still have to do the reset to have it
> >> un-suspend.
> >>
> >> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> >> ---
> >>   include/hw/virtio/vhost-vdpa.h |  2 --
> >>   include/hw/virtio/vhost.h      |  8 ++++++++
> >>   hw/virtio/vhost-vdpa.c         | 11 +++++++----
> >>   hw/virtio/vhost.c              |  8 +++++++-
> >>   4 files changed, 22 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> >> index e64bfc7f98..72c3686b7f 100644
> >> --- a/include/hw/virtio/vhost-vdpa.h
> >> +++ b/include/hw/virtio/vhost-vdpa.h
> >> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
> >>       bool shadow_vqs_enabled;
> >>       /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
> >>       bool shadow_data;
> >> -    /* Device suspended successfully */
> >> -    bool suspended;
> >>       /* IOVA mapping used by the Shadow Virtqueue */
> >>       VhostIOVATree *iova_tree;
> >>       GPtrArray *shadow_vqs;
> >> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> >> index 6a173cb9fa..69bf59d630 100644
> >> --- a/include/hw/virtio/vhost.h
> >> +++ b/include/hw/virtio/vhost.h
> >> @@ -120,6 +120,14 @@ struct vhost_dev {
> >>       uint64_t backend_cap;
> >>       /* @started: is the vhost device started? */
> >>       bool started;
> >> +    /**
> >> +     * @suspended: Whether the vhost device is currently suspended.  Set
> >> +     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
> >> +     * are supposed to automatically suspend/resume in their
> >> +     * vhost_dev_start handlers as required.  Must also be cleared when
> >> +     * the device is reset.
> >> +     */
> >> +    bool suspended;
> >>       bool log_enabled;
> >>       uint64_t log_size;
> >>       Error *migration_blocker;
> >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >> index 7b7dee468e..f7fd19a203 100644
> >> --- a/hw/virtio/vhost-vdpa.c
> >> +++ b/hw/virtio/vhost-vdpa.c
> >> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> >>
> >>   static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> >>   {
> >> -    struct vhost_vdpa *v = dev->opaque;
> >>       int ret;
> >>       uint8_t status = 0;
> >>
> >>       ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> >>       trace_vhost_vdpa_reset_device(dev);
> >> -    v->suspended = false;
> >> +    dev->suspended = false;
> >>       return ret;
> >>   }
> >>
> >> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
> >>           if (unlikely(r)) {
> >>               error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
> >>           } else {
> >> -            v->suspended = true;
> >> +            dev->suspended = true;
> >>               return;
> >>           }
> >>       }
> >> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >>               return -1;
> >>           }
> >>           vhost_vdpa_set_vring_ready(dev);
> >> +        if (dev->suspended) {
> >> +            /* TODO: When RESUME is available, use it instead of resetting */
> >> +            vhost_vdpa_reset_status(dev);
> > How is that we reset the status at each vhost_vdpa_dev_start? That
> > will clean all the vqs configured, features negotiated, etc. in the
> > vDPA device. Or am I missing something?
>
> What alternative do you propose?  We don’t have RESUME for vDPA in qemu,
> but we somehow need to lift the previous SUSPEND so the device will
> again respond to guest requests, do we not?
>

Reset also clears the suspend state in vDPA, and it should be called
at vhost_dev_stop. So the device should never be in suspended state
here. Does that solve your concerns?

> But more generally, is this any different from what is done before this
> patch?  Before this patch, vhost_dev_stop() unconditionally invokes
> vhost_reset_status(), so the device is reset in every stop/start cycle,
> that doesn’t change.  And we still won’t reset it on the first
> vhost_dev_start(), because dev->suspended will be false then, only on
> subsequent stop/start cycles, as before.  So the only difference is that
> now the device is reset on start, not on stop.
>

The difference is that vhost_vdpa_dev_start is called after features
ack (via vhost_dev_start, through vhost_dev_set_features call) and vq
configuration (using vhost_virtqueue_start). A device reset forces the
device to forget about all of that, and qemu cannot configure them
again until qemu acks the features again.



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/6] vhost-user.rst: Add suspend/resume
  2023-07-19 13:59     ` Hanna Czenczek
@ 2023-07-24 17:55       ` Stefan Hajnoczi
  2023-07-25  8:30         ` Hanna Czenczek
  0 siblings, 1 reply; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-24 17:55 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

[-- Attachment #1: Type: text/plain, Size: 9531 bytes --]

On Wed, Jul 19, 2023 at 03:59:32PM +0200, Hanna Czenczek wrote:
> On 18.07.23 16:25, Stefan Hajnoczi wrote:
> > On Tue, Jul 11, 2023 at 05:52:23PM +0200, Hanna Czenczek wrote:
> > > When stopping the VM, qemu wants all devices to fully cease any
> > > operation, too.  Currently, we can only have vhost-user back-ends stop
> > > processing vrings, but not background operations.  Add the SUSPEND and
> > > RESUME commands from vDPA, which allow the front-end (qemu) to tell
> > > back-ends to cease all operations, including those running in the
> > > background.
> > > 
> > > qemu's current work-around for this is to reset the back-end instead of
> > > suspending it, which will not work for back-ends that have internal
> > > state that must be preserved across e.g. stop/cont.
> > > 
> > > Note that the given specification requires the back-end to delay
> > > processing kicks (i.e. starting vrings) until the device is resumed,
> > > instead of requiring the front-end not to send kicks while suspended.
> > > qemu starts devices (and would just resume them) only when the VM is in
> > > a running state, so it would be difficult to have qemu delay kicks until
> > > the device is resumed, which is why this patch specifies handling of
> > > kicks as it does.
> > > 
> > > Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> > > ---
> > >   docs/interop/vhost-user.rst | 35 +++++++++++++++++++++++++++++++++--
> > >   1 file changed, 33 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > > index 5a070adbc1..ac6be34c4c 100644
> > > --- a/docs/interop/vhost-user.rst
> > > +++ b/docs/interop/vhost-user.rst
> > > @@ -381,6 +381,10 @@ readable) on the descriptor specified by ``VHOST_USER_SET_VRING_KICK``
> > >   or receiving the in-band message ``VHOST_USER_VRING_KICK`` if negotiated,
> > >   and stop ring upon receiving ``VHOST_USER_GET_VRING_BASE``.
> > > +While the back-end is suspended (via ``VHOST_USER_SUSPEND``), it must
> > > +never process rings, and thus also delay handling kicks until the
> > If you respin this series, I suggest replacing "never" with "not" to
> > emphasize that ring processing is only skipped while the device is
> > suspended (rather than forever). "Never" feels too strong to use when
> > describing a temporary state.
> 
> Sure.
> 
> > > +back-end is resumed again.
> > > +
> > >   Rings can be enabled or disabled by ``VHOST_USER_SET_VRING_ENABLE``.
> > >   If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
> > > @@ -479,8 +483,9 @@ supplied by ``VhostUserLog``.
> > >   ancillary data, it may be used to inform the front-end that the log has
> > >   been modified.
> > > -Once the source has finished migration, rings will be stopped by the
> > > -source. No further update must be done before rings are restarted.
> > > +Once the source has finished migration, the device will be suspended and
> > > +its rings will be stopped by the source. No further update must be done
> > > +before the device and its rings are resumed.
> > This paragraph is abstract and doesn't directly identify the mechanisms
> > or who does what:
> > - "the device will be suspended" via VHOST_USER_SUSPEND (or reset when
> >    VHOST_USER_SUSPEND is not supported?) or automatically by the device
> >    itself or some other mechanism?
> 
> OK, I’ll add VHOST_USER_SUSPEND.
> 
> So far I hadn’t considered making a note of resetting as a fallback right in
> the specification.  I don’t think I would want it in the specification, but
> on the other hand, it is probably important for back-end authors to know
> that if they do not implement SUSPEND, their device is going to be reset by
> qemu.
> 
> Can we make that a ”may”, i.e.:
> 
> ```
> Once the source has finished migration, the device will be suspended via
> VHOST_USER_SUSPEND and its rings will be stopped by the source.

I'm not sure what "its rings will be stopped by the source" means
exactly. Is it summarizing the effect of VHOST_USER_SUSPEND or is there
an additional action after VHOST_USER_SUSPEND that stops rings? And I'm
not sure whether "by the source" means by the front-end or back-end on
the source machine?

> No further
> update must be done before the device and its rings are resumed.

"Update" to what? Guest RAM? Device state? Rings?

I feel like this text is too vague for a spec. People may interpret it
differently. Can you make rephrase this to more concrete?

> If and only
> if the back-end does not support VHOST_USER_SUSPEND, the front-end may reset
> it instead (via VHOST_USER_SET_STATUS, VHOST_USER_RESET_DEVICE, or
> VHOST_USER_RESET_OWNER).
> ```
> 
> I’m unsure about the “If and only if” – older qemu versions will break this,
> but I feel like we must promise back-end writers that if they implement
> SUSPEND, their back-end is not going to be reset; if it is, and something
> breaks because of it, it’s the front-end that must be updated to match the
> specification.

I this the trick is to say "if and only if VHOST_USER_F_SUSPEND has not
been negotiated". That way really only new front-ends that support
VHOST_USER_SUSPEND are required to use suspend instead of reset and
older versions of QEMU will not violate this statement.

> 
> > - "before the device and its rings are resumed" via VHOST_USER_RESUME?
> >    And is this referring to the source device?
> 
> Yes, via VHOST_USER_RESUME, and restarting the rings by starting them (i.e.
> a kick).
> 
> Whether this is referring to the source device…  Well, the text as it was
> before begs the exact same question, so honestly, I don’t know for sure. 
> “Restarting” only makes sense if the rings were stopped before, so I assume
> it’s referring to the source, e.g. for the case of a failed migration. 
> RESUME at least definitely will only happen after a prior SUSPEND, so this
> one will definitely only apply on the source side.
> 
> > Please rephrase the paragraph to identify the vhost-user messages
> > involved.
> > 
> > >   In postcopy migration the back-end is started before all the memory has
> > >   been received from the source host, and care must be taken to avoid
> > > @@ -885,6 +890,7 @@ Protocol features
> > >     #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
> > >     #define VHOST_USER_PROTOCOL_F_STATUS               16
> > >     #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
> > > +  #define VHOST_USER_PROTOCOL_F_SUSPEND              18
> > >   Front-end message types
> > >   -----------------------
> > > @@ -1440,6 +1446,31 @@ Front-end message types
> > >     query the back-end for its device status as defined in the Virtio
> > >     specification.
> > > +``VHOST_USER_SUSPEND``
> > > +  :id: 41
> > > +  :equivalent ioctl: VHOST_VDPA_SUSPEND
> > > +  :request payload: N/A
> > > +  :reply payload: N/A
> > > +
> > > +  When the ``VHOST_USER_PROTOCOL_F_SUSPEND`` protocol feature has been
> > > +  successfully negotiated, this message is submitted by the front-end to
> > > +  have the back-end cease all operations except for handling vhost-user
> > > +  requests.  The back-end must stop processing all virt queues, and it
> > > +  must not perform any background operations.  It may not resume until a
> > "background operations" are not defined. What does it mean:
> > - Anything that writes to memory slots
> > - Anything that changes the visible state of the device
> > - Anything that changes the non-visible internal state of the device
> > - etc
> > ?
> 
> My best answer (honestly) is: You tell me.  This series is introducing
> SUSPEND/RESUME because qemu wants to reset devices to make them stop
> “background operations”, and this would break virtiofsd if any form of reset
> were actually implemented.  The implementation of SUSPEND/RESUME in
> virtiofsd on the other hand is supposed to basically be a no-op (besides
> delaying ring processing until a RESUME, but even if we processed them
> before, i.e. really make SUSPEND/RESUME no-ops, it would most likely work
> out fine), so I have no idea what kind of background operations we are even
> talking about, or whether any such actually exist in practice.
> 
> I don’t know what anyone had in mind when introducing the RESET.  It comes
> from vDPA (c3716f260bf moved it from vdpa into the common code), and exists
> since the code was originally added (108a64818e6), so there’s no comment
> explaining why it exists.  I can’t explain what the back-end is supposed to
> stop doing, because so far it isn’t explained anywhere in qemu, its git log,
> or in any documentation (there basically is no vdpa documentation).
> 
> I can only say what I just completely naïvely assumed it to mean so far:
> That the back-end basically should stop doing anything besides handling and
> replying to vhost-user messages.  If such a message requires changing any
> state, visible or not, then this state change is permissible.

Okay, I suggest the following instead of "background operations":

- Changes to the device state produced by SET_DEVICE_STATE_FD.
- Writing to memory regions.
- Signalling that buffers have been used.
- Signalling that the configuration space has changed.

The goal is to ensure the device state and memory regions are stable and
that back-end doesn't trigger activity in the front-end.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 6/6] vhost-user: Have reset_status fall back to reset
  2023-07-21 14:16           ` Hanna Czenczek
@ 2023-07-24 18:04             ` Stefan Hajnoczi
  2023-07-25  8:39               ` Hanna Czenczek
  0 siblings, 1 reply; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-24 18:04 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez,
	German Maglione, Maxime Coquelin

[-- Attachment #1: Type: text/plain, Size: 8105 bytes --]

On Fri, Jul 21, 2023 at 04:16:07PM +0200, Hanna Czenczek wrote:
> On 20.07.23 18:03, Stefan Hajnoczi wrote:
> > On Wed, Jul 19, 2023 at 04:27:58PM +0200, Hanna Czenczek wrote:
> > > On 19.07.23 16:11, Hanna Czenczek wrote:
> > > > On 18.07.23 17:10, Stefan Hajnoczi wrote:
> > > > > On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote:
> > > > > > The only user of vhost_user_reset_status() is vhost_dev_stop(), which
> > > > > > only uses it as a fall-back to stop the back-end if it does not support
> > > > > > SUSPEND.  However, vhost-user's implementation is a no-op unless the
> > > > > > back-end supports SET_STATUS.
> > > > > > 
> > > > > > vhost-vdpa's implementation instead just calls
> > > > > > vhost_vdpa_reset_device(), implying that it's OK to fully reset the
> > > > > > device if SET_STATUS is not supported.
> > > > > > 
> > > > > > To be fair, vhost_vdpa_reset_device() does nothing but to set
> > > > > > the status
> > > > > > to zero.  However, that may well be because vhost-vdpa has no method
> > > > > > besides this to reset a device.  In contrast, vhost-user has
> > > > > > RESET_DEVICE and a RESET_OWNER, which can be used instead.
> > > > > > 
> > > > > > While it is not entirely clear from documentation or git logs, from
> > > > > > discussions and the order of vhost-user protocol features, it
> > > > > > appears to
> > > > > > me as if RESET_OWNER originally had no real meaning for vhost-user, and
> > > > > > was thus used to signal a device reset to the back-end.  Then,
> > > > > > RESET_DEVICE was introduced, to have a well-defined dedicated reset
> > > > > > command.  Finally, vhost-user received full STATUS support, including
> > > > > > SET_STATUS, so setting the device status to 0 is now the preferred way
> > > > > > of resetting a device.  Still, RESET_DEVICE and RESET_OWNER should
> > > > > > remain valid as fall-backs.
> > > > > > 
> > > > > > Therefore, have vhost_user_reset_status() fall back to
> > > > > > vhost_user_reset_device() if the back-end has no STATUS support.
> > > > > > 
> > > > > > Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> > > > > > ---
> > > > > >    hw/virtio/vhost-user.c | 2 ++
> > > > > >    1 file changed, 2 insertions(+)
> > > > > > 
> > > > > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > > > > > index 4507de5a92..53a881ec2a 100644
> > > > > > --- a/hw/virtio/vhost-user.c
> > > > > > +++ b/hw/virtio/vhost-user.c
> > > > > > @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct
> > > > > > vhost_dev *dev)
> > > > > >        if (virtio_has_feature(dev->protocol_features,
> > > > > >                               VHOST_USER_PROTOCOL_F_STATUS)) {
> > > > > >            vhost_user_set_status(dev, 0);
> > > > > > +    } else {
> > > > > > +        vhost_user_reset_device(dev);
> > > > > >        }
> > > > > >    }
> > > > > Did you check whether DPDK treats setting the status to 0 as equivalent
> > > > > to RESET_DEVICE?
> > > > If it doesn’t, what’s even the point of using reset_status?
> > > Sorry, I’m being unclear, and I think this may be important because it ties
> > > into the question from patch 1, what qemu is even trying to do by running
> > > SET_STATUS(0) vhost_dev_stop(), so here’s what gave me the impression that
> > > SET_STATUS(0) and RESET_DEVICE should be equivalent:
> > > 
> > > vhost-vdpa.c runs SET_STATUS(0) in a function called
> > > vhost_vdpa_reset_device().  This is one thing that gave me the impression
> > > that this is about an actual full reset.
> > > 
> > > Another is the whole discussion that we’ve had.  vhost_dev_stop() does not
> > > call a `vhost_reset_device()` function, it calls `vhost_reset_status()`.
> > > Still, we were always talking about resetting the device.
> > There is some hacky stuff with struct vhost_dev's vq_index_end and
> > multi-queue devices. I think it's because multi-queue vhost-net device
> > consist of many vhost_devs and NetClientStates, so certain vhost
> > operations are skipped unless this is the "first" or "last" vhost_dev
> > from a large aggregate vhost-net device. That might be responsible for
> > part of the weirdness.
> > 
> > > It doesn’t make sense to me that vDPA would provide no function to fully
> > > reset a device, while vhost-user does.  Being able to reset a device sounds
> > > vital to me.  This also gave me the impression that SET_STATUS(0) on vDPA at
> > > least is functionally equivalent to a full device reset.
> > > 
> > > 
> > > Maybe SET_STATUS(0) does mean a full device reset on vDPA, but not on
> > > vhost-user.  That would be a real shame, so I assumed this would not be the
> > > case; that SET_STATUS(0) does the same thing on both protocols.
> > Yes, exactly. It has the real VIRTIO spec meaning in vDPA. In vhost-user
> > it's currently only used by DPDK as a hint for when device
> > initialization is complete:
> > https://github.com/DPDK/dpdk/commit/41d201804c4c44738168e2d247d3b1780845faa1
> 
> FWIW, now the code is a bit different.
> https://github.com/DPDK/dpdk/commit/671cc679a5fcd26705bb20ddc13b93e665719054
> has added a RESET interpretation for the status field, i.e. when it is 0. 
> It doesn’t do anything, but at least DPDK seems to agree that SET_STATUS(0)
> is a reset.

That patch adds diagnostics but does not perform any action for
SET_STATUS 0. DPDK's vhost_user_reset_owner() is still the only place
where the device is actually reset. QEMU cannot switch to just
SET_STATUS 0, it still needs to send RESET_DEVICE/RESET_OWNER.

> 
> > > The virtio specification says “Writing 0 into this field resets the device.”
> > > about the device_status field.
> > > 
> > > This also makes sense, because the device_status field is basically used to
> > > tell the device that a driver has taken control.  If reset, this indicates
> > > the driver has given up control, and to me this is a point where a device
> > > should fully reset itself.
> > > 
> > > So all in all, I can’t see the rationale why any implementation that
> > > supports SET_STATUS would decide to treat SET_STATUS(0) not as equivalent or
> > > a superset of RESET_DEVICE.  I may be wrong, and this might explain a whole
> > > deal about what kind of background operations we hope to stop with
> > > SET_STATUS(0).
> > I would like vhost-user devices to implement SET_STATUS according to the
> > VIRTIO specification in the future and they can do that. But I think
> > front-ends should continue sending RESET_DEVICE in order to support old
> > devices.
> 
> Well, yes, exactly.  That is what I meant to address with this patch,
> vhost-user right now does not send RESET_DEVICE in its vhost_reset_status
> implementation, so the front-end will not fall back to RESET_DEVICE when it
> apparently does intend to reset the device[1].  We do arguably have
> vhost_reset_device, too, but for vDPA that is just a SET_STATUS(0) (there is
> no RESET_DEVICE on vDPA), and it’s also only called by vhost-user-scsi.
> 
> So this also begs the question why we even do have vhost_reset_status and
> vhost_reset_device as two separate things. The commit introducing
> vhost_reset_status (c3716f260bf) doesn’t say.  Maybe the intention was that
> vhost_reset_device would leave the status at 0, while vhost_reset_status
> would return it to ACKNOWLEDGE | DRIVER, as done by the introducing commit,
> but that comes back to patch 5 in this series – we don’t need to have
> ACKNOWLEDGE | DRIVER set after vhost_dev_stop(), so we don’t need
> vhost_reset_status to set those flags.  They should be set in
> vhost_dev_start().
> 
> [1] This is assuming that SET_STATUS(0) is intended to reset the device, but
> it sounds like you agree on that.

I don't know the answers, but I think it's safe to go ahead with a
SET_STATUS sequence that follows the VIRTIO spec, plus a
VHOST_USER_RESET_DEVICE/VHOST_USER_RESET_OWNER.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-24 15:48       ` Eugenio Perez Martin
@ 2023-07-25  7:53         ` Hanna Czenczek
  2023-07-25 10:03           ` Eugenio Perez Martin
  0 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-25  7:53 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Michael S . Tsirkin, Stefan Hajnoczi, German Maglione

On 24.07.23 17:48, Eugenio Perez Martin wrote:
> On Fri, Jul 21, 2023 at 6:07 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>> On 21.07.23 17:25, Eugenio Perez Martin wrote:
>>> On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>>>> Move the `suspended` field from vhost_vdpa into the global vhost_dev
>>>> struct, so vhost_dev_stop() can check whether the back-end has been
>>>> suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
>>>> there is no need to reset it; the reset is just a fall-back to stop
>>>> device operations for back-ends that do not support suspend.
>>>>
>>>> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
>>>> when the device is re-started, we still have to do the reset to have it
>>>> un-suspend.
>>>>
>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>>>> ---
>>>>    include/hw/virtio/vhost-vdpa.h |  2 --
>>>>    include/hw/virtio/vhost.h      |  8 ++++++++
>>>>    hw/virtio/vhost-vdpa.c         | 11 +++++++----
>>>>    hw/virtio/vhost.c              |  8 +++++++-
>>>>    4 files changed, 22 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
>>>> index e64bfc7f98..72c3686b7f 100644
>>>> --- a/include/hw/virtio/vhost-vdpa.h
>>>> +++ b/include/hw/virtio/vhost-vdpa.h
>>>> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
>>>>        bool shadow_vqs_enabled;
>>>>        /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
>>>>        bool shadow_data;
>>>> -    /* Device suspended successfully */
>>>> -    bool suspended;
>>>>        /* IOVA mapping used by the Shadow Virtqueue */
>>>>        VhostIOVATree *iova_tree;
>>>>        GPtrArray *shadow_vqs;
>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>> index 6a173cb9fa..69bf59d630 100644
>>>> --- a/include/hw/virtio/vhost.h
>>>> +++ b/include/hw/virtio/vhost.h
>>>> @@ -120,6 +120,14 @@ struct vhost_dev {
>>>>        uint64_t backend_cap;
>>>>        /* @started: is the vhost device started? */
>>>>        bool started;
>>>> +    /**
>>>> +     * @suspended: Whether the vhost device is currently suspended.  Set
>>>> +     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
>>>> +     * are supposed to automatically suspend/resume in their
>>>> +     * vhost_dev_start handlers as required.  Must also be cleared when
>>>> +     * the device is reset.
>>>> +     */
>>>> +    bool suspended;
>>>>        bool log_enabled;
>>>>        uint64_t log_size;
>>>>        Error *migration_blocker;
>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>>> index 7b7dee468e..f7fd19a203 100644
>>>> --- a/hw/virtio/vhost-vdpa.c
>>>> +++ b/hw/virtio/vhost-vdpa.c
>>>> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
>>>>
>>>>    static int vhost_vdpa_reset_device(struct vhost_dev *dev)
>>>>    {
>>>> -    struct vhost_vdpa *v = dev->opaque;
>>>>        int ret;
>>>>        uint8_t status = 0;
>>>>
>>>>        ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
>>>>        trace_vhost_vdpa_reset_device(dev);
>>>> -    v->suspended = false;
>>>> +    dev->suspended = false;
>>>>        return ret;
>>>>    }
>>>>
>>>> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
>>>>            if (unlikely(r)) {
>>>>                error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
>>>>            } else {
>>>> -            v->suspended = true;
>>>> +            dev->suspended = true;
>>>>                return;
>>>>            }
>>>>        }
>>>> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>>>>                return -1;
>>>>            }
>>>>            vhost_vdpa_set_vring_ready(dev);
>>>> +        if (dev->suspended) {
>>>> +            /* TODO: When RESUME is available, use it instead of resetting */
>>>> +            vhost_vdpa_reset_status(dev);
>>> How is that we reset the status at each vhost_vdpa_dev_start? That
>>> will clean all the vqs configured, features negotiated, etc. in the
>>> vDPA device. Or am I missing something?
>> What alternative do you propose?  We don’t have RESUME for vDPA in qemu,
>> but we somehow need to lift the previous SUSPEND so the device will
>> again respond to guest requests, do we not?
>>
> Reset also clears the suspend state in vDPA, and it should be called
> at vhost_dev_stop. So the device should never be in suspended state
> here. Does that solve your concerns?

My intention with this patch was precisely not to reset in 
vhost_dev_stop when suspending is supported.  So now I’m more confused 
than before.

>> But more generally, is this any different from what is done before this
>> patch?  Before this patch, vhost_dev_stop() unconditionally invokes
>> vhost_reset_status(), so the device is reset in every stop/start cycle,
>> that doesn’t change.  And we still won’t reset it on the first
>> vhost_dev_start(), because dev->suspended will be false then, only on
>> subsequent stop/start cycles, as before.  So the only difference is that
>> now the device is reset on start, not on stop.
>>
> The difference is that vhost_vdpa_dev_start is called after features
> ack (via vhost_dev_start, through vhost_dev_set_features call) and vq
> configuration (using vhost_virtqueue_start). A device reset forces the
> device to forget about all of that, and qemu cannot configure them
> again until qemu acks the features again.

Now I’m completely confused, because I don’t see the point of 
implementing suspend at all if we rely on a reset immediately afterwards 
anyway.  It was my impression this whole time that suspending would 
remove the need to reset.  Well, at least until the device should be 
resumed again, i.e. in vhost_dev_start().

In addition, I also don’t understand the magnitude of the problem with 
ordering.  If the order in vhost_dev_start() is wrong, can we not easily 
fix it?  E.g. add a full vhost_dev_resume callback to invoke right at 
the start of vhost_dev_start(); or check (in the same place) whether the 
back-end supports resuming, and if it doesn’t (and it is currently 
suspended), reset it there.

In any case, if we need to reset in vhost_dev_stop(), i.e. immediately 
after suspend, I don’t see the point of suspending, indicating to me 
that I still fail to understand its purpose.

Hanna



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/6] vhost-user.rst: Add suspend/resume
  2023-07-24 17:55       ` Stefan Hajnoczi
@ 2023-07-25  8:30         ` Hanna Czenczek
  2023-07-27 21:12           ` Stefan Hajnoczi
  0 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-25  8:30 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

On 24.07.23 19:55, Stefan Hajnoczi wrote:
> On Wed, Jul 19, 2023 at 03:59:32PM +0200, Hanna Czenczek wrote:
>> On 18.07.23 16:25, Stefan Hajnoczi wrote:
>>> On Tue, Jul 11, 2023 at 05:52:23PM +0200, Hanna Czenczek wrote:
>>>> When stopping the VM, qemu wants all devices to fully cease any
>>>> operation, too.  Currently, we can only have vhost-user back-ends stop
>>>> processing vrings, but not background operations.  Add the SUSPEND and
>>>> RESUME commands from vDPA, which allow the front-end (qemu) to tell
>>>> back-ends to cease all operations, including those running in the
>>>> background.
>>>>
>>>> qemu's current work-around for this is to reset the back-end instead of
>>>> suspending it, which will not work for back-ends that have internal
>>>> state that must be preserved across e.g. stop/cont.
>>>>
>>>> Note that the given specification requires the back-end to delay
>>>> processing kicks (i.e. starting vrings) until the device is resumed,
>>>> instead of requiring the front-end not to send kicks while suspended.
>>>> qemu starts devices (and would just resume them) only when the VM is in
>>>> a running state, so it would be difficult to have qemu delay kicks until
>>>> the device is resumed, which is why this patch specifies handling of
>>>> kicks as it does.
>>>>
>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>>>> ---
>>>>    docs/interop/vhost-user.rst | 35 +++++++++++++++++++++++++++++++++--
>>>>    1 file changed, 33 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
>>>> index 5a070adbc1..ac6be34c4c 100644
>>>> --- a/docs/interop/vhost-user.rst
>>>> +++ b/docs/interop/vhost-user.rst
>>>> @@ -381,6 +381,10 @@ readable) on the descriptor specified by ``VHOST_USER_SET_VRING_KICK``
>>>>    or receiving the in-band message ``VHOST_USER_VRING_KICK`` if negotiated,
>>>>    and stop ring upon receiving ``VHOST_USER_GET_VRING_BASE``.
>>>> +While the back-end is suspended (via ``VHOST_USER_SUSPEND``), it must
>>>> +never process rings, and thus also delay handling kicks until the
>>> If you respin this series, I suggest replacing "never" with "not" to
>>> emphasize that ring processing is only skipped while the device is
>>> suspended (rather than forever). "Never" feels too strong to use when
>>> describing a temporary state.
>> Sure.
>>
>>>> +back-end is resumed again.
>>>> +
>>>>    Rings can be enabled or disabled by ``VHOST_USER_SET_VRING_ENABLE``.
>>>>    If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
>>>> @@ -479,8 +483,9 @@ supplied by ``VhostUserLog``.
>>>>    ancillary data, it may be used to inform the front-end that the log has
>>>>    been modified.
>>>> -Once the source has finished migration, rings will be stopped by the
>>>> -source. No further update must be done before rings are restarted.
>>>> +Once the source has finished migration, the device will be suspended and
>>>> +its rings will be stopped by the source. No further update must be done
>>>> +before the device and its rings are resumed.
>>> This paragraph is abstract and doesn't directly identify the mechanisms
>>> or who does what:
>>> - "the device will be suspended" via VHOST_USER_SUSPEND (or reset when
>>>     VHOST_USER_SUSPEND is not supported?) or automatically by the device
>>>     itself or some other mechanism?
>> OK, I’ll add VHOST_USER_SUSPEND.
>>
>> So far I hadn’t considered making a note of resetting as a fallback right in
>> the specification.  I don’t think I would want it in the specification, but
>> on the other hand, it is probably important for back-end authors to know
>> that if they do not implement SUSPEND, their device is going to be reset by
>> qemu.
>>
>> Can we make that a ”may”, i.e.:
>>
>> ```
>> Once the source has finished migration, the device will be suspended via
>> VHOST_USER_SUSPEND and its rings will be stopped by the source.
> I'm not sure what "its rings will be stopped by the source" means
> exactly. Is it summarizing the effect of VHOST_USER_SUSPEND or is there
> an additional action after VHOST_USER_SUSPEND that stops rings? And I'm
> not sure whether "by the source" means by the front-end or back-end on
> the source machine?

This is pre-existing text and I assumed it (with not doubt) to refer to 
GET_VRING_BASE, because that is how rings are stopped.

I can improve the existing documentation and add the reference to 
GET_VRING_BASE, and say that it clearly must come from the front-end.

>> No further
>> update must be done before the device and its rings are resumed.
> "Update" to what? Guest RAM? Device state? Rings?
>
> I feel like this text is too vague for a spec. People may interpret it
> differently. Can you make rephrase this to more concrete?

Honestly, no.  This is pre-existing, and I have the same questions as 
you do.

I cannot “rephrase” this to make it more concrete.  I can try to 
actually specify that was was left unspecified, but that would be a 
change in specification that would require its own patch, separate from 
this series.

Personally, I’ve generally taken this sentence to be fluff.  If the 
rings are stopped, clearly, they should not be accessed at all. Probably 
the back-end should also refrain from writing to guest memory, because 
that is a logical conclusion from having the rings stopped.  But now 
it’s even clearer: The back-end ideally is suspended, which directly 
means not to modify guest memory, and not to perform “background 
operations”.  Updating device state of course is possible through 
vhost-user commands, because those must always be executed.

So basically it’s just “Device and rings are stopped (RESUME and 
GET_VRING_BASE, resp.), so you know, adhere to that.”

Or maybe I’m completely wrong and “Once the source has finished 
migration, rings will be stopped by the source. No further update must 
be done before rings are restarted.” is to be taken together and the 
second sentence just refers to the rings, i.e. the front-end stops the 
rings, and the back-end must not update them.  Or it means that the 
front-end must not send any commands to the back-end until restarting 
the rings, but that feels practically impossible.

Again, because this sentence currently doesn’t specify anything, really, 
changing it to carry any meaning would be to add to the specification, 
not just clarify it.

>> If and only
>> if the back-end does not support VHOST_USER_SUSPEND, the front-end may reset
>> it instead (via VHOST_USER_SET_STATUS, VHOST_USER_RESET_DEVICE, or
>> VHOST_USER_RESET_OWNER).
>> ```
>>
>> I’m unsure about the “If and only if” – older qemu versions will break this,
>> but I feel like we must promise back-end writers that if they implement
>> SUSPEND, their back-end is not going to be reset; if it is, and something
>> breaks because of it, it’s the front-end that must be updated to match the
>> specification.
> I this the trick is to say "if and only if VHOST_USER_F_SUSPEND has not
> been negotiated". That way really only new front-ends that support
> VHOST_USER_SUSPEND are required to use suspend instead of reset and
> older versions of QEMU will not violate this statement.

Ah, right, thanks!

>>> - "before the device and its rings are resumed" via VHOST_USER_RESUME?
>>>     And is this referring to the source device?
>> Yes, via VHOST_USER_RESUME, and restarting the rings by starting them (i.e.
>> a kick).
>>
>> Whether this is referring to the source device…  Well, the text as it was
>> before begs the exact same question, so honestly, I don’t know for sure.
>> “Restarting” only makes sense if the rings were stopped before, so I assume
>> it’s referring to the source, e.g. for the case of a failed migration.
>> RESUME at least definitely will only happen after a prior SUSPEND, so this
>> one will definitely only apply on the source side.
>>
>>> Please rephrase the paragraph to identify the vhost-user messages
>>> involved.
>>>
>>>>    In postcopy migration the back-end is started before all the memory has
>>>>    been received from the source host, and care must be taken to avoid
>>>> @@ -885,6 +890,7 @@ Protocol features
>>>>      #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
>>>>      #define VHOST_USER_PROTOCOL_F_STATUS               16
>>>>      #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
>>>> +  #define VHOST_USER_PROTOCOL_F_SUSPEND              18
>>>>    Front-end message types
>>>>    -----------------------
>>>> @@ -1440,6 +1446,31 @@ Front-end message types
>>>>      query the back-end for its device status as defined in the Virtio
>>>>      specification.
>>>> +``VHOST_USER_SUSPEND``
>>>> +  :id: 41
>>>> +  :equivalent ioctl: VHOST_VDPA_SUSPEND
>>>> +  :request payload: N/A
>>>> +  :reply payload: N/A
>>>> +
>>>> +  When the ``VHOST_USER_PROTOCOL_F_SUSPEND`` protocol feature has been
>>>> +  successfully negotiated, this message is submitted by the front-end to
>>>> +  have the back-end cease all operations except for handling vhost-user
>>>> +  requests.  The back-end must stop processing all virt queues, and it
>>>> +  must not perform any background operations.  It may not resume until a
>>> "background operations" are not defined. What does it mean:
>>> - Anything that writes to memory slots
>>> - Anything that changes the visible state of the device
>>> - Anything that changes the non-visible internal state of the device
>>> - etc
>>> ?
>> My best answer (honestly) is: You tell me.  This series is introducing
>> SUSPEND/RESUME because qemu wants to reset devices to make them stop
>> “background operations”, and this would break virtiofsd if any form of reset
>> were actually implemented.  The implementation of SUSPEND/RESUME in
>> virtiofsd on the other hand is supposed to basically be a no-op (besides
>> delaying ring processing until a RESUME, but even if we processed them
>> before, i.e. really make SUSPEND/RESUME no-ops, it would most likely work
>> out fine), so I have no idea what kind of background operations we are even
>> talking about, or whether any such actually exist in practice.
>>
>> I don’t know what anyone had in mind when introducing the RESET.  It comes
>> from vDPA (c3716f260bf moved it from vdpa into the common code), and exists
>> since the code was originally added (108a64818e6), so there’s no comment
>> explaining why it exists.  I can’t explain what the back-end is supposed to
>> stop doing, because so far it isn’t explained anywhere in qemu, its git log,
>> or in any documentation (there basically is no vdpa documentation).
>>
>> I can only say what I just completely naïvely assumed it to mean so far:
>> That the back-end basically should stop doing anything besides handling and
>> replying to vhost-user messages.  If such a message requires changing any
>> state, visible or not, then this state change is permissible.
> Okay, I suggest the following instead of "background operations":
>
> - Changes to the device state produced by SET_DEVICE_STATE_FD.

This is definitely something that I would absolutely allow after 
SUSPEND.  If the front-end does not wish the back-end to do this, it can 
just not send this command while the back-end is suspended.

> - Writing to memory regions.
> - Signalling that buffers have been used.

This feels both too tight and not concrete enough.  The only buffers I 
can think of are buffers supplied through virt queues, which I intended 
to already be included in “stop processing all virt queues”.  (I took 
this to mean the used-buffer ring, too, but I can of course be more 
explicit about this, e.g. “stop processing all virt queues, including 
returning buffers to the driver”.) “Signalling” sounds like triggering 
the callfd, but that also seems clear; if you can’t process virt queues, 
including returning buffers, you can never trigger the callfd (or send 
VHOST_USER_BACKEND_VRING_CALL), because there can never be a new buffer 
returned to the driver.

So too tight because it feels like a subset of virt queue processing, 
but not concrete enough because “buffers” makes me feel like I’m 
overlooking something besides virt queues.

> - Signalling that the configuration space has changed.

Maybe this could be more general, i.e. the back-end must not send any 
vhost-user messages to the front-end?

> The goal is to ensure the device state and memory regions are stable and
> that back-end doesn't trigger activity in the front-end.

If the goal is to ensure that the device state is stable, I feel like we 
must then specify precisely this, and not just to say it must not be 
mutable through SET_DEVICE_STATE_FD: In general, the state is more 
likely to be changed by other factors after all.

On the other hand, I also don’t see why that’s a goal.  For migration, 
it might seem desirable, but I don’t actually think so: The back-end is 
required to send a consistent device state anyway, so it is up to the 
back-end to ensure the state is consistent, we don’t need to make it a 
requirement for SUSPEND.  It is implicitly clear from the migration 
model that if the device state were to change after the back-end has 
sent it to the front-end, those change will be lost on the destination 
side, so it is clear that the back-end must anticipate this and work 
around it.

Other than migration, qemu doesn’t see the device state at all.  I don’t 
see why internal state should not change between stop/cont. If a device 
experience some signal that (for some reason) it can’t pause to receive 
only after the subsequent RESUME, it might need to make note of that 
signal to act on it after RESUME.  I would consider that a change in 
internal state, and I don’t immediately see a problem with it.  (It may 
be problematic when migrating, because receiving such signals on the 
source side after transferring the state would mean they’re lost, but 
again, this is something the device clearly has to solve, e.g. by 
redirecting the signals to the destination starting with the 
SET_TRANSFER_STATE_FD call.)

Hanna



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 6/6] vhost-user: Have reset_status fall back to reset
  2023-07-24 18:04             ` Stefan Hajnoczi
@ 2023-07-25  8:39               ` Hanna Czenczek
  0 siblings, 0 replies; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-25  8:39 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez,
	German Maglione, Maxime Coquelin

On 24.07.23 20:04, Stefan Hajnoczi wrote:
> On Fri, Jul 21, 2023 at 04:16:07PM +0200, Hanna Czenczek wrote:
>> On 20.07.23 18:03, Stefan Hajnoczi wrote:
>>> On Wed, Jul 19, 2023 at 04:27:58PM +0200, Hanna Czenczek wrote:
>>>> On 19.07.23 16:11, Hanna Czenczek wrote:
>>>>> On 18.07.23 17:10, Stefan Hajnoczi wrote:
>>>>>> On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote:
>>>>>>> The only user of vhost_user_reset_status() is vhost_dev_stop(), which
>>>>>>> only uses it as a fall-back to stop the back-end if it does not support
>>>>>>> SUSPEND.  However, vhost-user's implementation is a no-op unless the
>>>>>>> back-end supports SET_STATUS.
>>>>>>>
>>>>>>> vhost-vdpa's implementation instead just calls
>>>>>>> vhost_vdpa_reset_device(), implying that it's OK to fully reset the
>>>>>>> device if SET_STATUS is not supported.
>>>>>>>
>>>>>>> To be fair, vhost_vdpa_reset_device() does nothing but to set
>>>>>>> the status
>>>>>>> to zero.  However, that may well be because vhost-vdpa has no method
>>>>>>> besides this to reset a device.  In contrast, vhost-user has
>>>>>>> RESET_DEVICE and a RESET_OWNER, which can be used instead.
>>>>>>>
>>>>>>> While it is not entirely clear from documentation or git logs, from
>>>>>>> discussions and the order of vhost-user protocol features, it
>>>>>>> appears to
>>>>>>> me as if RESET_OWNER originally had no real meaning for vhost-user, and
>>>>>>> was thus used to signal a device reset to the back-end.  Then,
>>>>>>> RESET_DEVICE was introduced, to have a well-defined dedicated reset
>>>>>>> command.  Finally, vhost-user received full STATUS support, including
>>>>>>> SET_STATUS, so setting the device status to 0 is now the preferred way
>>>>>>> of resetting a device.  Still, RESET_DEVICE and RESET_OWNER should
>>>>>>> remain valid as fall-backs.
>>>>>>>
>>>>>>> Therefore, have vhost_user_reset_status() fall back to
>>>>>>> vhost_user_reset_device() if the back-end has no STATUS support.
>>>>>>>
>>>>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>>>>>>> ---
>>>>>>>     hw/virtio/vhost-user.c | 2 ++
>>>>>>>     1 file changed, 2 insertions(+)
>>>>>>>
>>>>>>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>>>>>>> index 4507de5a92..53a881ec2a 100644
>>>>>>> --- a/hw/virtio/vhost-user.c
>>>>>>> +++ b/hw/virtio/vhost-user.c
>>>>>>> @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct
>>>>>>> vhost_dev *dev)
>>>>>>>         if (virtio_has_feature(dev->protocol_features,
>>>>>>>                                VHOST_USER_PROTOCOL_F_STATUS)) {
>>>>>>>             vhost_user_set_status(dev, 0);
>>>>>>> +    } else {
>>>>>>> +        vhost_user_reset_device(dev);
>>>>>>>         }
>>>>>>>     }
>>>>>> Did you check whether DPDK treats setting the status to 0 as equivalent
>>>>>> to RESET_DEVICE?
>>>>> If it doesn’t, what’s even the point of using reset_status?
>>>> Sorry, I’m being unclear, and I think this may be important because it ties
>>>> into the question from patch 1, what qemu is even trying to do by running
>>>> SET_STATUS(0) vhost_dev_stop(), so here’s what gave me the impression that
>>>> SET_STATUS(0) and RESET_DEVICE should be equivalent:
>>>>
>>>> vhost-vdpa.c runs SET_STATUS(0) in a function called
>>>> vhost_vdpa_reset_device().  This is one thing that gave me the impression
>>>> that this is about an actual full reset.
>>>>
>>>> Another is the whole discussion that we’ve had.  vhost_dev_stop() does not
>>>> call a `vhost_reset_device()` function, it calls `vhost_reset_status()`.
>>>> Still, we were always talking about resetting the device.
>>> There is some hacky stuff with struct vhost_dev's vq_index_end and
>>> multi-queue devices. I think it's because multi-queue vhost-net device
>>> consist of many vhost_devs and NetClientStates, so certain vhost
>>> operations are skipped unless this is the "first" or "last" vhost_dev
>>> from a large aggregate vhost-net device. That might be responsible for
>>> part of the weirdness.
>>>
>>>> It doesn’t make sense to me that vDPA would provide no function to fully
>>>> reset a device, while vhost-user does.  Being able to reset a device sounds
>>>> vital to me.  This also gave me the impression that SET_STATUS(0) on vDPA at
>>>> least is functionally equivalent to a full device reset.
>>>>
>>>>
>>>> Maybe SET_STATUS(0) does mean a full device reset on vDPA, but not on
>>>> vhost-user.  That would be a real shame, so I assumed this would not be the
>>>> case; that SET_STATUS(0) does the same thing on both protocols.
>>> Yes, exactly. It has the real VIRTIO spec meaning in vDPA. In vhost-user
>>> it's currently only used by DPDK as a hint for when device
>>> initialization is complete:
>>> https://github.com/DPDK/dpdk/commit/41d201804c4c44738168e2d247d3b1780845faa1
>> FWIW, now the code is a bit different.
>> https://github.com/DPDK/dpdk/commit/671cc679a5fcd26705bb20ddc13b93e665719054
>> has added a RESET interpretation for the status field, i.e. when it is 0.
>> It doesn’t do anything, but at least DPDK seems to agree that SET_STATUS(0)
>> is a reset.
> That patch adds diagnostics but does not perform any action for
> SET_STATUS 0. DPDK's vhost_user_reset_owner() is still the only place
> where the device is actually reset.

That’s what I said, it doesn’t do anything, but the diagnostics agree 
that it is a RESET.

> QEMU cannot switch to just
> SET_STATUS 0, it still needs to send RESET_DEVICE/RESET_OWNER.

That is what I questioned below: We currently *do not* call 
RESET_DEVICE/RESET_OWNER.  This patch is not about switching to 
SET_STATUS(0), it is about having RESET_DEVICE/RESET_OWNER be fallbacks 
for it.

>>>> The virtio specification says “Writing 0 into this field resets the device.”
>>>> about the device_status field.
>>>>
>>>> This also makes sense, because the device_status field is basically used to
>>>> tell the device that a driver has taken control.  If reset, this indicates
>>>> the driver has given up control, and to me this is a point where a device
>>>> should fully reset itself.
>>>>
>>>> So all in all, I can’t see the rationale why any implementation that
>>>> supports SET_STATUS would decide to treat SET_STATUS(0) not as equivalent or
>>>> a superset of RESET_DEVICE.  I may be wrong, and this might explain a whole
>>>> deal about what kind of background operations we hope to stop with
>>>> SET_STATUS(0).
>>> I would like vhost-user devices to implement SET_STATUS according to the
>>> VIRTIO specification in the future and they can do that. But I think
>>> front-ends should continue sending RESET_DEVICE in order to support old
>>> devices.
>> Well, yes, exactly.  That is what I meant to address with this patch,
>> vhost-user right now does not send RESET_DEVICE in its vhost_reset_status
>> implementation, so the front-end will not fall back to RESET_DEVICE when it
>> apparently does intend to reset the device[1].  We do arguably have
>> vhost_reset_device, too, but for vDPA that is just a SET_STATUS(0) (there is
>> no RESET_DEVICE on vDPA), and it’s also only called by vhost-user-scsi.
>>
>> So this also begs the question why we even do have vhost_reset_status and
>> vhost_reset_device as two separate things. The commit introducing
>> vhost_reset_status (c3716f260bf) doesn’t say.  Maybe the intention was that
>> vhost_reset_device would leave the status at 0, while vhost_reset_status
>> would return it to ACKNOWLEDGE | DRIVER, as done by the introducing commit,
>> but that comes back to patch 5 in this series – we don’t need to have
>> ACKNOWLEDGE | DRIVER set after vhost_dev_stop(), so we don’t need
>> vhost_reset_status to set those flags.  They should be set in
>> vhost_dev_start().
>>
>> [1] This is assuming that SET_STATUS(0) is intended to reset the device, but
>> it sounds like you agree on that.
> I don't know the answers, but I think it's safe to go ahead with a
> SET_STATUS sequence that follows the VIRTIO spec, plus a
> VHOST_USER_RESET_DEVICE/VHOST_USER_RESET_OWNER.

So what you’re saying is that RESET_DEVICE/RESET_OWNER should not be 
fallbacks, but be invoked in addition to SET_STATUS(0)?

If so, that would be silly.  I see your point that DPDK resets only in 
response to RESET_DEVICE/RESET_OWNER, but the diagnostics agree that 
SET_STATUS(0) is a reset, which is why I find this so silly. It sounds 
to me as if any properly behaving implementation would fully reset the 
back-end on SET_STATUS(0), so unconditionally invoking 
RESET_DEVICE/RESET_OWNER afterwards is just doing a double-reset.

Notably, invoking RESET_DEVICE/RESET_OWNER in addition to SET_STATUS(0) 
(instead of as a fallback) would be a change in behavior, because we do 
not call RESET_DEVICE/RESET_OWNER outside of vhost-user-scsi today.

Hanna



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-25  7:53         ` Hanna Czenczek
@ 2023-07-25 10:03           ` Eugenio Perez Martin
  2023-07-25 13:09             ` Hanna Czenczek
  0 siblings, 1 reply; 37+ messages in thread
From: Eugenio Perez Martin @ 2023-07-25 10:03 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Stefan Hajnoczi, German Maglione

On Tue, Jul 25, 2023 at 9:53 AM Hanna Czenczek <hreitz@redhat.com> wrote:
>
> On 24.07.23 17:48, Eugenio Perez Martin wrote:
> > On Fri, Jul 21, 2023 at 6:07 PM Hanna Czenczek <hreitz@redhat.com> wrote:
> >> On 21.07.23 17:25, Eugenio Perez Martin wrote:
> >>> On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hreitz@redhat.com> wrote:
> >>>> Move the `suspended` field from vhost_vdpa into the global vhost_dev
> >>>> struct, so vhost_dev_stop() can check whether the back-end has been
> >>>> suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
> >>>> there is no need to reset it; the reset is just a fall-back to stop
> >>>> device operations for back-ends that do not support suspend.
> >>>>
> >>>> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
> >>>> when the device is re-started, we still have to do the reset to have it
> >>>> un-suspend.
> >>>>
> >>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> >>>> ---
> >>>>    include/hw/virtio/vhost-vdpa.h |  2 --
> >>>>    include/hw/virtio/vhost.h      |  8 ++++++++
> >>>>    hw/virtio/vhost-vdpa.c         | 11 +++++++----
> >>>>    hw/virtio/vhost.c              |  8 +++++++-
> >>>>    4 files changed, 22 insertions(+), 7 deletions(-)
> >>>>
> >>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> >>>> index e64bfc7f98..72c3686b7f 100644
> >>>> --- a/include/hw/virtio/vhost-vdpa.h
> >>>> +++ b/include/hw/virtio/vhost-vdpa.h
> >>>> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
> >>>>        bool shadow_vqs_enabled;
> >>>>        /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
> >>>>        bool shadow_data;
> >>>> -    /* Device suspended successfully */
> >>>> -    bool suspended;
> >>>>        /* IOVA mapping used by the Shadow Virtqueue */
> >>>>        VhostIOVATree *iova_tree;
> >>>>        GPtrArray *shadow_vqs;
> >>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> >>>> index 6a173cb9fa..69bf59d630 100644
> >>>> --- a/include/hw/virtio/vhost.h
> >>>> +++ b/include/hw/virtio/vhost.h
> >>>> @@ -120,6 +120,14 @@ struct vhost_dev {
> >>>>        uint64_t backend_cap;
> >>>>        /* @started: is the vhost device started? */
> >>>>        bool started;
> >>>> +    /**
> >>>> +     * @suspended: Whether the vhost device is currently suspended.  Set
> >>>> +     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
> >>>> +     * are supposed to automatically suspend/resume in their
> >>>> +     * vhost_dev_start handlers as required.  Must also be cleared when
> >>>> +     * the device is reset.
> >>>> +     */
> >>>> +    bool suspended;
> >>>>        bool log_enabled;
> >>>>        uint64_t log_size;
> >>>>        Error *migration_blocker;
> >>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>>> index 7b7dee468e..f7fd19a203 100644
> >>>> --- a/hw/virtio/vhost-vdpa.c
> >>>> +++ b/hw/virtio/vhost-vdpa.c
> >>>> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> >>>>
> >>>>    static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> >>>>    {
> >>>> -    struct vhost_vdpa *v = dev->opaque;
> >>>>        int ret;
> >>>>        uint8_t status = 0;
> >>>>
> >>>>        ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> >>>>        trace_vhost_vdpa_reset_device(dev);
> >>>> -    v->suspended = false;
> >>>> +    dev->suspended = false;
> >>>>        return ret;
> >>>>    }
> >>>>
> >>>> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
> >>>>            if (unlikely(r)) {
> >>>>                error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
> >>>>            } else {
> >>>> -            v->suspended = true;
> >>>> +            dev->suspended = true;
> >>>>                return;
> >>>>            }
> >>>>        }
> >>>> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >>>>                return -1;
> >>>>            }
> >>>>            vhost_vdpa_set_vring_ready(dev);
> >>>> +        if (dev->suspended) {
> >>>> +            /* TODO: When RESUME is available, use it instead of resetting */
> >>>> +            vhost_vdpa_reset_status(dev);
> >>> How is that we reset the status at each vhost_vdpa_dev_start? That
> >>> will clean all the vqs configured, features negotiated, etc. in the
> >>> vDPA device. Or am I missing something?
> >> What alternative do you propose?  We don’t have RESUME for vDPA in qemu,
> >> but we somehow need to lift the previous SUSPEND so the device will
> >> again respond to guest requests, do we not?
> >>
> > Reset also clears the suspend state in vDPA, and it should be called
> > at vhost_dev_stop. So the device should never be in suspended state
> > here. Does that solve your concerns?
>
> My intention with this patch was precisely not to reset in
> vhost_dev_stop when suspending is supported.  So now I’m more confused
> than before.
>

At this moment, I think that should be focused as an optimization and
not to be included in the main series.

> >> But more generally, is this any different from what is done before this
> >> patch?  Before this patch, vhost_dev_stop() unconditionally invokes
> >> vhost_reset_status(), so the device is reset in every stop/start cycle,
> >> that doesn’t change.  And we still won’t reset it on the first
> >> vhost_dev_start(), because dev->suspended will be false then, only on
> >> subsequent stop/start cycles, as before.  So the only difference is that
> >> now the device is reset on start, not on stop.
> >>
> > The difference is that vhost_vdpa_dev_start is called after features
> > ack (via vhost_dev_start, through vhost_dev_set_features call) and vq
> > configuration (using vhost_virtqueue_start). A device reset forces the
> > device to forget about all of that, and qemu cannot configure them
> > again until qemu acks the features again.
>
> Now I’m completely confused, because I don’t see the point of
> implementing suspend at all if we rely on a reset immediately afterwards
> anyway.

It's not immediate. From vhost_dev_stop, comments added only in this mail:

void vhost_virtqueue_stop(struct vhost_dev *dev,
                          struct VirtIODevice *vdev,
                          struct vhost_virtqueue *vq,
                          unsigned idx)
{
    ...
    // Get each vring indexes, trusting the destination device can
continue safely from there
    r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
    if (r < 0) {
        VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
        /* Connection to the backend is broken, so let's sync internal
         * last avail idx to the device used idx.
         */
        virtio_queue_restore_last_avail_idx(vdev, idx);
    } else {
        virtio_queue_set_last_avail_idx(vdev, idx, state.num);
    }
    ...
}

void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
{
    ...
    // Suspend the device, so we can trust in vring indexes / vq state
    if (hdev->vhost_ops->vhost_dev_start) {
        hdev->vhost_ops->vhost_dev_start(hdev, false);
    }
    if (vrings) {
        vhost_dev_set_vring_enable(hdev, false);
    }

    // Fetch each vq index / state and store in vdev->vq[i]
    for (i = 0; i < hdev->nvqs; ++i) {
        vhost_virtqueue_stop(hdev,
                             vdev,
                             hdev->vqs + i,
                             hdev->vq_index + i);
    }

    // Reset the device, as we don't need it anymore and it can
release the resources
    if (hdev->vhost_ops->vhost_reset_status) {
        hdev->vhost_ops->vhost_reset_status(hdev);
    }
}
---

>  It was my impression this whole time that suspending would
> remove the need to reset.  Well, at least until the device should be
> resumed again, i.e. in vhost_dev_start().
>

It cannot. vhost_dev_stop is also called when the guest reset the
device, so the guest trusts the device to be in a clean state.

Also, the reset is the moment when the device frees the unused
resources. This would mandate the device to

> In addition, I also don’t understand the magnitude of the problem with
> ordering.  If the order in vhost_dev_start() is wrong, can we not easily
> fix it?

The order in vhost_dev_start follows the VirtIO standard. The move of
the reset should be to remove it from vhost_dev_stop to something like
both virtio_reset and the end of virtio_save. I'm not sure if I'm
forgetting some other use cases.

> E.g. add a full vhost_dev_resume callback to invoke right at
> the start of vhost_dev_start(); or check (in the same place) whether the
> back-end supports resuming, and if it doesn’t (and it is currently
> suspended), reset it there.
>
> In any case, if we need to reset in vhost_dev_stop(), i.e. immediately
> after suspend, I don’t see the point of suspending, indicating to me
> that I still fail to understand its purpose.
>
> Hanna
>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-25 10:03           ` Eugenio Perez Martin
@ 2023-07-25 13:09             ` Hanna Czenczek
  2023-07-25 18:53               ` Eugenio Perez Martin
  0 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-25 13:09 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Michael S . Tsirkin, Stefan Hajnoczi, German Maglione

On 25.07.23 12:03, Eugenio Perez Martin wrote:
> On Tue, Jul 25, 2023 at 9:53 AM Hanna Czenczek <hreitz@redhat.com> wrote:
>> On 24.07.23 17:48, Eugenio Perez Martin wrote:
>>> On Fri, Jul 21, 2023 at 6:07 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>>>> On 21.07.23 17:25, Eugenio Perez Martin wrote:
>>>>> On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>>>>>> Move the `suspended` field from vhost_vdpa into the global vhost_dev
>>>>>> struct, so vhost_dev_stop() can check whether the back-end has been
>>>>>> suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
>>>>>> there is no need to reset it; the reset is just a fall-back to stop
>>>>>> device operations for back-ends that do not support suspend.
>>>>>>
>>>>>> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
>>>>>> when the device is re-started, we still have to do the reset to have it
>>>>>> un-suspend.
>>>>>>
>>>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>>>>>> ---
>>>>>>     include/hw/virtio/vhost-vdpa.h |  2 --
>>>>>>     include/hw/virtio/vhost.h      |  8 ++++++++
>>>>>>     hw/virtio/vhost-vdpa.c         | 11 +++++++----
>>>>>>     hw/virtio/vhost.c              |  8 +++++++-
>>>>>>     4 files changed, 22 insertions(+), 7 deletions(-)
>>>>>>
>>>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
>>>>>> index e64bfc7f98..72c3686b7f 100644
>>>>>> --- a/include/hw/virtio/vhost-vdpa.h
>>>>>> +++ b/include/hw/virtio/vhost-vdpa.h
>>>>>> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
>>>>>>         bool shadow_vqs_enabled;
>>>>>>         /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
>>>>>>         bool shadow_data;
>>>>>> -    /* Device suspended successfully */
>>>>>> -    bool suspended;
>>>>>>         /* IOVA mapping used by the Shadow Virtqueue */
>>>>>>         VhostIOVATree *iova_tree;
>>>>>>         GPtrArray *shadow_vqs;
>>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>>>> index 6a173cb9fa..69bf59d630 100644
>>>>>> --- a/include/hw/virtio/vhost.h
>>>>>> +++ b/include/hw/virtio/vhost.h
>>>>>> @@ -120,6 +120,14 @@ struct vhost_dev {
>>>>>>         uint64_t backend_cap;
>>>>>>         /* @started: is the vhost device started? */
>>>>>>         bool started;
>>>>>> +    /**
>>>>>> +     * @suspended: Whether the vhost device is currently suspended.  Set
>>>>>> +     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
>>>>>> +     * are supposed to automatically suspend/resume in their
>>>>>> +     * vhost_dev_start handlers as required.  Must also be cleared when
>>>>>> +     * the device is reset.
>>>>>> +     */
>>>>>> +    bool suspended;
>>>>>>         bool log_enabled;
>>>>>>         uint64_t log_size;
>>>>>>         Error *migration_blocker;
>>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>>>>> index 7b7dee468e..f7fd19a203 100644
>>>>>> --- a/hw/virtio/vhost-vdpa.c
>>>>>> +++ b/hw/virtio/vhost-vdpa.c
>>>>>> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
>>>>>>
>>>>>>     static int vhost_vdpa_reset_device(struct vhost_dev *dev)
>>>>>>     {
>>>>>> -    struct vhost_vdpa *v = dev->opaque;
>>>>>>         int ret;
>>>>>>         uint8_t status = 0;
>>>>>>
>>>>>>         ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
>>>>>>         trace_vhost_vdpa_reset_device(dev);
>>>>>> -    v->suspended = false;
>>>>>> +    dev->suspended = false;
>>>>>>         return ret;
>>>>>>     }
>>>>>>
>>>>>> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
>>>>>>             if (unlikely(r)) {
>>>>>>                 error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
>>>>>>             } else {
>>>>>> -            v->suspended = true;
>>>>>> +            dev->suspended = true;
>>>>>>                 return;
>>>>>>             }
>>>>>>         }
>>>>>> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>>>>>>                 return -1;
>>>>>>             }
>>>>>>             vhost_vdpa_set_vring_ready(dev);
>>>>>> +        if (dev->suspended) {
>>>>>> +            /* TODO: When RESUME is available, use it instead of resetting */
>>>>>> +            vhost_vdpa_reset_status(dev);
>>>>> How is that we reset the status at each vhost_vdpa_dev_start? That
>>>>> will clean all the vqs configured, features negotiated, etc. in the
>>>>> vDPA device. Or am I missing something?
>>>> What alternative do you propose?  We don’t have RESUME for vDPA in qemu,
>>>> but we somehow need to lift the previous SUSPEND so the device will
>>>> again respond to guest requests, do we not?
>>>>
>>> Reset also clears the suspend state in vDPA, and it should be called
>>> at vhost_dev_stop. So the device should never be in suspended state
>>> here. Does that solve your concerns?
>> My intention with this patch was precisely not to reset in
>> vhost_dev_stop when suspending is supported.  So now I’m more confused
>> than before.
>>
> At this moment, I think that should be focused as an optimization and
> not to be included in the main series.

It is absolutely not an optimization but vital for my use case. 
virtiofsd does not currently implement resetting, but if it did (and we 
want this support in the future), resetting it during stop/cont would be 
catastrophic.  There must be a way to have qemu not issue a reset.  This 
patch is the reason why this series exists.

>>>> But more generally, is this any different from what is done before this
>>>> patch?  Before this patch, vhost_dev_stop() unconditionally invokes
>>>> vhost_reset_status(), so the device is reset in every stop/start cycle,
>>>> that doesn’t change.  And we still won’t reset it on the first
>>>> vhost_dev_start(), because dev->suspended will be false then, only on
>>>> subsequent stop/start cycles, as before.  So the only difference is that
>>>> now the device is reset on start, not on stop.
>>>>
>>> The difference is that vhost_vdpa_dev_start is called after features
>>> ack (via vhost_dev_start, through vhost_dev_set_features call) and vq
>>> configuration (using vhost_virtqueue_start). A device reset forces the
>>> device to forget about all of that, and qemu cannot configure them
>>> again until qemu acks the features again.
>> Now I’m completely confused, because I don’t see the point of
>> implementing suspend at all if we rely on a reset immediately afterwards
>> anyway.
> It's not immediate. From vhost_dev_stop, comments added only in this mail:
>
> void vhost_virtqueue_stop(struct vhost_dev *dev,
>                            struct VirtIODevice *vdev,
>                            struct vhost_virtqueue *vq,
>                            unsigned idx)
> {
>      ...
>      // Get each vring indexes, trusting the destination device can
> continue safely from there
>      r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
>      if (r < 0) {
>          VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
>          /* Connection to the backend is broken, so let's sync internal
>           * last avail idx to the device used idx.
>           */
>          virtio_queue_restore_last_avail_idx(vdev, idx);
>      } else {
>          virtio_queue_set_last_avail_idx(vdev, idx, state.num);
>      }
>      ...
> }
>
> void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
> {
>      ...
>      // Suspend the device, so we can trust in vring indexes / vq state

I don’t understand this purpose.  GET_VRING_BASE stops the vring in 
question, so the vring index returned must be trustworthy, no?

>      if (hdev->vhost_ops->vhost_dev_start) {
>          hdev->vhost_ops->vhost_dev_start(hdev, false);
>      }
>      if (vrings) {
>          vhost_dev_set_vring_enable(hdev, false);
>      }
>
>      // Fetch each vq index / state and store in vdev->vq[i]
>      for (i = 0; i < hdev->nvqs; ++i) {
>          vhost_virtqueue_stop(hdev,
>                               vdev,
>                               hdev->vqs + i,
>                               hdev->vq_index + i);
>      }
>
>      // Reset the device, as we don't need it anymore and it can
> release the resources
>      if (hdev->vhost_ops->vhost_reset_status) {
>          hdev->vhost_ops->vhost_reset_status(hdev);
>      }
> }
> ---
>
>>   It was my impression this whole time that suspending would
>> remove the need to reset.  Well, at least until the device should be
>> resumed again, i.e. in vhost_dev_start().
>>
> It cannot. vhost_dev_stop is also called when the guest reset the
> device, so the guest trusts the device to be in a clean state.
>
> Also, the reset is the moment when the device frees the unused
> resources. This would mandate the device to

What resources are we talking about?  This function is called when the 
VM is paused (stop).  If a stateful device is reset to free “unused 
resources”, this means dropping its internal state, which is absolutely 
wrong in a stop/cont cycle.

>> In addition, I also don’t understand the magnitude of the problem with
>> ordering.  If the order in vhost_dev_start() is wrong, can we not easily
>> fix it?
> The order in vhost_dev_start follows the VirtIO standard.

What does the VirtIO standard say about suspended vhost back-ends?

Hanna

> The move of
> the reset should be to remove it from vhost_dev_stop to something like
> both virtio_reset and the end of virtio_save. I'm not sure if I'm
> forgetting some other use cases.
>
>> E.g. add a full vhost_dev_resume callback to invoke right at
>> the start of vhost_dev_start(); or check (in the same place) whether the
>> back-end supports resuming, and if it doesn’t (and it is currently
>> suspended), reset it there.
>>
>> In any case, if we need to reset in vhost_dev_stop(), i.e. immediately
>> after suspend, I don’t see the point of suspending, indicating to me
>> that I still fail to understand its purpose.
>>
>> Hanna
>>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-25 13:09             ` Hanna Czenczek
@ 2023-07-25 18:53               ` Eugenio Perez Martin
  2023-07-26  6:57                 ` Hanna Czenczek
  0 siblings, 1 reply; 37+ messages in thread
From: Eugenio Perez Martin @ 2023-07-25 18:53 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Stefan Hajnoczi, German Maglione

On Tue, Jul 25, 2023 at 3:09 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>
> On 25.07.23 12:03, Eugenio Perez Martin wrote:
> > On Tue, Jul 25, 2023 at 9:53 AM Hanna Czenczek <hreitz@redhat.com> wrote:
> >> On 24.07.23 17:48, Eugenio Perez Martin wrote:
> >>> On Fri, Jul 21, 2023 at 6:07 PM Hanna Czenczek <hreitz@redhat.com> wrote:
> >>>> On 21.07.23 17:25, Eugenio Perez Martin wrote:
> >>>>> On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hreitz@redhat.com> wrote:
> >>>>>> Move the `suspended` field from vhost_vdpa into the global vhost_dev
> >>>>>> struct, so vhost_dev_stop() can check whether the back-end has been
> >>>>>> suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
> >>>>>> there is no need to reset it; the reset is just a fall-back to stop
> >>>>>> device operations for back-ends that do not support suspend.
> >>>>>>
> >>>>>> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
> >>>>>> when the device is re-started, we still have to do the reset to have it
> >>>>>> un-suspend.
> >>>>>>
> >>>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> >>>>>> ---
> >>>>>>     include/hw/virtio/vhost-vdpa.h |  2 --
> >>>>>>     include/hw/virtio/vhost.h      |  8 ++++++++
> >>>>>>     hw/virtio/vhost-vdpa.c         | 11 +++++++----
> >>>>>>     hw/virtio/vhost.c              |  8 +++++++-
> >>>>>>     4 files changed, 22 insertions(+), 7 deletions(-)
> >>>>>>
> >>>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> >>>>>> index e64bfc7f98..72c3686b7f 100644
> >>>>>> --- a/include/hw/virtio/vhost-vdpa.h
> >>>>>> +++ b/include/hw/virtio/vhost-vdpa.h
> >>>>>> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
> >>>>>>         bool shadow_vqs_enabled;
> >>>>>>         /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
> >>>>>>         bool shadow_data;
> >>>>>> -    /* Device suspended successfully */
> >>>>>> -    bool suspended;
> >>>>>>         /* IOVA mapping used by the Shadow Virtqueue */
> >>>>>>         VhostIOVATree *iova_tree;
> >>>>>>         GPtrArray *shadow_vqs;
> >>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> >>>>>> index 6a173cb9fa..69bf59d630 100644
> >>>>>> --- a/include/hw/virtio/vhost.h
> >>>>>> +++ b/include/hw/virtio/vhost.h
> >>>>>> @@ -120,6 +120,14 @@ struct vhost_dev {
> >>>>>>         uint64_t backend_cap;
> >>>>>>         /* @started: is the vhost device started? */
> >>>>>>         bool started;
> >>>>>> +    /**
> >>>>>> +     * @suspended: Whether the vhost device is currently suspended.  Set
> >>>>>> +     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
> >>>>>> +     * are supposed to automatically suspend/resume in their
> >>>>>> +     * vhost_dev_start handlers as required.  Must also be cleared when
> >>>>>> +     * the device is reset.
> >>>>>> +     */
> >>>>>> +    bool suspended;
> >>>>>>         bool log_enabled;
> >>>>>>         uint64_t log_size;
> >>>>>>         Error *migration_blocker;
> >>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>>>>> index 7b7dee468e..f7fd19a203 100644
> >>>>>> --- a/hw/virtio/vhost-vdpa.c
> >>>>>> +++ b/hw/virtio/vhost-vdpa.c
> >>>>>> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> >>>>>>
> >>>>>>     static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> >>>>>>     {
> >>>>>> -    struct vhost_vdpa *v = dev->opaque;
> >>>>>>         int ret;
> >>>>>>         uint8_t status = 0;
> >>>>>>
> >>>>>>         ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> >>>>>>         trace_vhost_vdpa_reset_device(dev);
> >>>>>> -    v->suspended = false;
> >>>>>> +    dev->suspended = false;
> >>>>>>         return ret;
> >>>>>>     }
> >>>>>>
> >>>>>> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
> >>>>>>             if (unlikely(r)) {
> >>>>>>                 error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
> >>>>>>             } else {
> >>>>>> -            v->suspended = true;
> >>>>>> +            dev->suspended = true;
> >>>>>>                 return;
> >>>>>>             }
> >>>>>>         }
> >>>>>> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >>>>>>                 return -1;
> >>>>>>             }
> >>>>>>             vhost_vdpa_set_vring_ready(dev);
> >>>>>> +        if (dev->suspended) {
> >>>>>> +            /* TODO: When RESUME is available, use it instead of resetting */
> >>>>>> +            vhost_vdpa_reset_status(dev);
> >>>>> How is that we reset the status at each vhost_vdpa_dev_start? That
> >>>>> will clean all the vqs configured, features negotiated, etc. in the
> >>>>> vDPA device. Or am I missing something?
> >>>> What alternative do you propose?  We don’t have RESUME for vDPA in qemu,
> >>>> but we somehow need to lift the previous SUSPEND so the device will
> >>>> again respond to guest requests, do we not?
> >>>>
> >>> Reset also clears the suspend state in vDPA, and it should be called
> >>> at vhost_dev_stop. So the device should never be in suspended state
> >>> here. Does that solve your concerns?
> >> My intention with this patch was precisely not to reset in
> >> vhost_dev_stop when suspending is supported.  So now I’m more confused
> >> than before.
> >>
> > At this moment, I think that should be focused as an optimization and
> > not to be included in the main series.
>
> It is absolutely not an optimization but vital for my use case.
> virtiofsd does not currently implement resetting, but if it did (and we
> want this support in the future), resetting it during stop/cont would be
> catastrophic.  There must be a way to have qemu not issue a reset.  This
> patch is the reason why this series exists.
>

Sorry, I see I confused things with the first reply. Let me do a recap.

If I understand the problem correctly, your use case requires that
qemu does not reset the device before the device state is fetched with
virtio_save in the case of a migration. This is understandable and I
think we have a solution for that: to move the vhost_ops call to
virtio_reset and the end of virtio_save. To remove the reset call from
vhost_dev_stop is somehow mandatory, as it is called before
virtio_save.

But we cannot move to vhost_vdpa_dev_start, as proposed here. The reasons are:
* All the features, vq parameters, etc are set before any
vhost_vdpa_dev_start call. To reset at any vhost_vdpa_dev_start would
wipe them.
* The device needs to hold all the resources until it is reset. Things
like descriptor status etc.

And, regarding the comment "When RESUME is available, use it instead
of resetting", we cannot use resume to replace reset in all cases.
This is because the semantics are different.

For example, vhost_dev_stop and vhost_dev_start are also called when
the guest reset by itself the device. You can check it rmmoding and
modprobing virtio-net driver, for example. In this case, the driver
expects to find all vqs to start with 0, but the resume must not reset
these indexes.

It can be applied as an optimizations sometimes, but not for the general case.

> >>>> But more generally, is this any different from what is done before this
> >>>> patch?  Before this patch, vhost_dev_stop() unconditionally invokes
> >>>> vhost_reset_status(), so the device is reset in every stop/start cycle,
> >>>> that doesn’t change.  And we still won’t reset it on the first
> >>>> vhost_dev_start(), because dev->suspended will be false then, only on
> >>>> subsequent stop/start cycles, as before.  So the only difference is that
> >>>> now the device is reset on start, not on stop.
> >>>>
> >>> The difference is that vhost_vdpa_dev_start is called after features
> >>> ack (via vhost_dev_start, through vhost_dev_set_features call) and vq
> >>> configuration (using vhost_virtqueue_start). A device reset forces the
> >>> device to forget about all of that, and qemu cannot configure them
> >>> again until qemu acks the features again.
> >> Now I’m completely confused, because I don’t see the point of
> >> implementing suspend at all if we rely on a reset immediately afterwards
> >> anyway.
> > It's not immediate. From vhost_dev_stop, comments added only in this mail:
> >
> > void vhost_virtqueue_stop(struct vhost_dev *dev,
> >                            struct VirtIODevice *vdev,
> >                            struct vhost_virtqueue *vq,
> >                            unsigned idx)
> > {
> >      ...
> >      // Get each vring indexes, trusting the destination device can
> > continue safely from there
> >      r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
> >      if (r < 0) {
> >          VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
> >          /* Connection to the backend is broken, so let's sync internal
> >           * last avail idx to the device used idx.
> >           */
> >          virtio_queue_restore_last_avail_idx(vdev, idx);
> >      } else {
> >          virtio_queue_set_last_avail_idx(vdev, idx, state.num);
> >      }
> >      ...
> > }
> >
> > void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
> > {
> >      ...
> >      // Suspend the device, so we can trust in vring indexes / vq state
>
> I don’t understand this purpose.  GET_VRING_BASE stops the vring in
> question, so the vring index returned must be trustworthy, no?
>

That only happens in vhost-user, not in vhost-vdpa.

> >      if (hdev->vhost_ops->vhost_dev_start) {
> >          hdev->vhost_ops->vhost_dev_start(hdev, false);
> >      }
> >      if (vrings) {
> >          vhost_dev_set_vring_enable(hdev, false);
> >      }
> >
> >      // Fetch each vq index / state and store in vdev->vq[i]
> >      for (i = 0; i < hdev->nvqs; ++i) {
> >          vhost_virtqueue_stop(hdev,
> >                               vdev,
> >                               hdev->vqs + i,
> >                               hdev->vq_index + i);
> >      }
> >
> >      // Reset the device, as we don't need it anymore and it can
> > release the resources
> >      if (hdev->vhost_ops->vhost_reset_status) {
> >          hdev->vhost_ops->vhost_reset_status(hdev);
> >      }
> > }
> > ---
> >
> >>   It was my impression this whole time that suspending would
> >> remove the need to reset.  Well, at least until the device should be
> >> resumed again, i.e. in vhost_dev_start().
> >>
> > It cannot. vhost_dev_stop is also called when the guest reset the
> > device, so the guest trusts the device to be in a clean state.
> >
> > Also, the reset is the moment when the device frees the unused
> > resources. This would mandate the device to
>
> What resources are we talking about?  This function is called when the
> VM is paused (stop).  If a stateful device is reset to free “unused
> resources”, this means dropping its internal state, which is absolutely
> wrong in a stop/cont cycle.
>

But is the expected result in the virtio reset cycle. We need to split
these paths.

> >> In addition, I also don’t understand the magnitude of the problem with
> >> ordering.  If the order in vhost_dev_start() is wrong, can we not easily
> >> fix it?
> > The order in vhost_dev_start follows the VirtIO standard.
>
> What does the VirtIO standard say about suspended vhost back-ends?
>

Suspend does not exist in the VirtIO standard. I meant the device
initialization order in "3.1 Device Initialization":

1. Reset the device.
...
5. Set the FEATURES_OK status bit. [...]
...
7. Perform device-specific setup, including discovery of virtqueues
for the device, optional per-bus setup, reading and possibly writing
the device’s virtio configuration space, and population of virtqueues.
8.Set the DRIVER_OK status bit. At this point the device is “live”.

Steps 4-8 are all done in vhost_dev_start, in that particular order.
To call vhost_vdpa_reset_status from vhost_vdpa_dev_start(true) would
reset the device back to step 1, but there is no more code to set all
configuration from 2-7 before 8 (DRIVER_OK).

> Hanna
>
> > The move of
> > the reset should be to remove it from vhost_dev_stop to something like
> > both virtio_reset and the end of virtio_save. I'm not sure if I'm
> > forgetting some other use cases.
> >
> >> E.g. add a full vhost_dev_resume callback to invoke right at
> >> the start of vhost_dev_start(); or check (in the same place) whether the
> >> back-end supports resuming, and if it doesn’t (and it is currently
> >> suspended), reset it there.
> >>
> >> In any case, if we need to reset in vhost_dev_stop(), i.e. immediately
> >> after suspend, I don’t see the point of suspending, indicating to me
> >> that I still fail to understand its purpose.
> >>
> >> Hanna
> >>
>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-25 18:53               ` Eugenio Perez Martin
@ 2023-07-26  6:57                 ` Hanna Czenczek
  2023-07-27 12:49                   ` Eugenio Perez Martin
  0 siblings, 1 reply; 37+ messages in thread
From: Hanna Czenczek @ 2023-07-26  6:57 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Michael S . Tsirkin, Stefan Hajnoczi, German Maglione

On 25.07.23 20:53, Eugenio Perez Martin wrote:
> On Tue, Jul 25, 2023 at 3:09 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>> On 25.07.23 12:03, Eugenio Perez Martin wrote:
>>> On Tue, Jul 25, 2023 at 9:53 AM Hanna Czenczek <hreitz@redhat.com> wrote:
>>>> On 24.07.23 17:48, Eugenio Perez Martin wrote:
>>>>> On Fri, Jul 21, 2023 at 6:07 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>>>>>> On 21.07.23 17:25, Eugenio Perez Martin wrote:
>>>>>>> On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hreitz@redhat.com> wrote:
>>>>>>>> Move the `suspended` field from vhost_vdpa into the global vhost_dev
>>>>>>>> struct, so vhost_dev_stop() can check whether the back-end has been
>>>>>>>> suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
>>>>>>>> there is no need to reset it; the reset is just a fall-back to stop
>>>>>>>> device operations for back-ends that do not support suspend.
>>>>>>>>
>>>>>>>> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
>>>>>>>> when the device is re-started, we still have to do the reset to have it
>>>>>>>> un-suspend.
>>>>>>>>
>>>>>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
>>>>>>>> ---
>>>>>>>>      include/hw/virtio/vhost-vdpa.h |  2 --
>>>>>>>>      include/hw/virtio/vhost.h      |  8 ++++++++
>>>>>>>>      hw/virtio/vhost-vdpa.c         | 11 +++++++----
>>>>>>>>      hw/virtio/vhost.c              |  8 +++++++-
>>>>>>>>      4 files changed, 22 insertions(+), 7 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
>>>>>>>> index e64bfc7f98..72c3686b7f 100644
>>>>>>>> --- a/include/hw/virtio/vhost-vdpa.h
>>>>>>>> +++ b/include/hw/virtio/vhost-vdpa.h
>>>>>>>> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
>>>>>>>>          bool shadow_vqs_enabled;
>>>>>>>>          /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
>>>>>>>>          bool shadow_data;
>>>>>>>> -    /* Device suspended successfully */
>>>>>>>> -    bool suspended;
>>>>>>>>          /* IOVA mapping used by the Shadow Virtqueue */
>>>>>>>>          VhostIOVATree *iova_tree;
>>>>>>>>          GPtrArray *shadow_vqs;
>>>>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>>>>>> index 6a173cb9fa..69bf59d630 100644
>>>>>>>> --- a/include/hw/virtio/vhost.h
>>>>>>>> +++ b/include/hw/virtio/vhost.h
>>>>>>>> @@ -120,6 +120,14 @@ struct vhost_dev {
>>>>>>>>          uint64_t backend_cap;
>>>>>>>>          /* @started: is the vhost device started? */
>>>>>>>>          bool started;
>>>>>>>> +    /**
>>>>>>>> +     * @suspended: Whether the vhost device is currently suspended.  Set
>>>>>>>> +     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
>>>>>>>> +     * are supposed to automatically suspend/resume in their
>>>>>>>> +     * vhost_dev_start handlers as required.  Must also be cleared when
>>>>>>>> +     * the device is reset.
>>>>>>>> +     */
>>>>>>>> +    bool suspended;
>>>>>>>>          bool log_enabled;
>>>>>>>>          uint64_t log_size;
>>>>>>>>          Error *migration_blocker;
>>>>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>>>>>>> index 7b7dee468e..f7fd19a203 100644
>>>>>>>> --- a/hw/virtio/vhost-vdpa.c
>>>>>>>> +++ b/hw/virtio/vhost-vdpa.c
>>>>>>>> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
>>>>>>>>
>>>>>>>>      static int vhost_vdpa_reset_device(struct vhost_dev *dev)
>>>>>>>>      {
>>>>>>>> -    struct vhost_vdpa *v = dev->opaque;
>>>>>>>>          int ret;
>>>>>>>>          uint8_t status = 0;
>>>>>>>>
>>>>>>>>          ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
>>>>>>>>          trace_vhost_vdpa_reset_device(dev);
>>>>>>>> -    v->suspended = false;
>>>>>>>> +    dev->suspended = false;
>>>>>>>>          return ret;
>>>>>>>>      }
>>>>>>>>
>>>>>>>> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
>>>>>>>>              if (unlikely(r)) {
>>>>>>>>                  error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
>>>>>>>>              } else {
>>>>>>>> -            v->suspended = true;
>>>>>>>> +            dev->suspended = true;
>>>>>>>>                  return;
>>>>>>>>              }
>>>>>>>>          }
>>>>>>>> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
>>>>>>>>                  return -1;
>>>>>>>>              }
>>>>>>>>              vhost_vdpa_set_vring_ready(dev);
>>>>>>>> +        if (dev->suspended) {
>>>>>>>> +            /* TODO: When RESUME is available, use it instead of resetting */
>>>>>>>> +            vhost_vdpa_reset_status(dev);
>>>>>>> How is that we reset the status at each vhost_vdpa_dev_start? That
>>>>>>> will clean all the vqs configured, features negotiated, etc. in the
>>>>>>> vDPA device. Or am I missing something?
>>>>>> What alternative do you propose?  We don’t have RESUME for vDPA in qemu,
>>>>>> but we somehow need to lift the previous SUSPEND so the device will
>>>>>> again respond to guest requests, do we not?
>>>>>>
>>>>> Reset also clears the suspend state in vDPA, and it should be called
>>>>> at vhost_dev_stop. So the device should never be in suspended state
>>>>> here. Does that solve your concerns?
>>>> My intention with this patch was precisely not to reset in
>>>> vhost_dev_stop when suspending is supported.  So now I’m more confused
>>>> than before.
>>>>
>>> At this moment, I think that should be focused as an optimization and
>>> not to be included in the main series.
>> It is absolutely not an optimization but vital for my use case.
>> virtiofsd does not currently implement resetting, but if it did (and we
>> want this support in the future), resetting it during stop/cont would be
>> catastrophic.  There must be a way to have qemu not issue a reset.  This
>> patch is the reason why this series exists.
>>
> Sorry, I see I confused things with the first reply. Let me do a recap.
>
> If I understand the problem correctly, your use case requires that
> qemu does not reset the device before the device state is fetched with
> virtio_save in the case of a migration.

That is only part of the problem, the bigger picture has nothing to do 
with migration at all.  The problem is that when the VM is paused 
(stop), we invoke vhost_dev_stop(), and when it is resumed (cont), we 
invoke vhost_dev_start().  To me, it therefore sounds absolutely wrong 
to reset the back-end in either of these functions.  For stateless 
devices, it was determined to not be an issue (I still find it extremely 
strange), and as far as I’ve understood, we’ve come to the agreement 
that it’s basically a fallback for when there is no other way to stop 
the back-end.  But stateful devices like virtio-fs would be completely 
broken by resetting them there.

Therefore, if virtiofsd did implement reset through SET_STATUS, 
stop/cont would break it today.  Maybe other vhost-user devices, too, 
which just implement RESET_OWNER/RESET_DEVICE, which aren’t even called 
when the device is supposed to be reset in vhost_dev_stop() (patch 6).

So not just because of migration, but in general, there must be a way 
for back-ends to force qemu not to reset them in vhost_dev_start()/stop().

Or we stop using vhost_dev_start()/stop() when the VM is paused/resumed 
(stop/cont).

> This is understandable and I
> think we have a solution for that: to move the vhost_ops call to
> virtio_reset and the end of virtio_save.

Why would we reset the device in virtio_save()?

> To remove the reset call from
> vhost_dev_stop is somehow mandatory, as it is called before
> virtio_save.
>
> But we cannot move to vhost_vdpa_dev_start, as proposed here. The reasons are:
> * All the features, vq parameters, etc are set before any
> vhost_vdpa_dev_start call. To reset at any vhost_vdpa_dev_start would
> wipe them.
> * The device needs to hold all the resources until it is reset. Things
> like descriptor status etc.
>
> And, regarding the comment "When RESUME is available, use it instead
> of resetting", we cannot use resume to replace reset in all cases.
> This is because the semantics are different.
>
> For example, vhost_dev_stop and vhost_dev_start are also called when
> the guest reset by itself the device. You can check it rmmoding and
> modprobing virtio-net driver, for example. In this case, the driver
> expects to find all vqs to start with 0, but the resume must not reset
> these indexes.

This isn’t quite clear to me.  I understand this to mean that there must 
be a reset somewhere in vhost_dev_stop() and/or vhost_dev_start().  But 
above, you proposed moving the reset from vhost_dev_stop() into 
virtio_reset().  Is virtio_reset() called in addition to 
vhost_dev_stop() and vhost_dev_start() when the guest driver is changed?

Because we can’t have an always-present reset in vhost_dev_stop() or 
vhost_dev_start().  It just doesn’t work with stop/cont.  At the same 
time, I understand that you say we must have it because 
vhost_dev_{stop,start}() are also used when the guest driver changes.  
Consequently, it sounds clear to me that using the exact same functions 
in stop/cont and when the guest driver is unloaded/loaded is and has 
always been wrong.  Because in stop/cont, the guest driver never 
changes, so we shouldn’t tell the back-end that it did (by sending 
SET_STATUS(0)).

> It can be applied as an optimizations sometimes, but not for the general case.
>
>>>>>> But more generally, is this any different from what is done before this
>>>>>> patch?  Before this patch, vhost_dev_stop() unconditionally invokes
>>>>>> vhost_reset_status(), so the device is reset in every stop/start cycle,
>>>>>> that doesn’t change.  And we still won’t reset it on the first
>>>>>> vhost_dev_start(), because dev->suspended will be false then, only on
>>>>>> subsequent stop/start cycles, as before.  So the only difference is that
>>>>>> now the device is reset on start, not on stop.
>>>>>>
>>>>> The difference is that vhost_vdpa_dev_start is called after features
>>>>> ack (via vhost_dev_start, through vhost_dev_set_features call) and vq
>>>>> configuration (using vhost_virtqueue_start). A device reset forces the
>>>>> device to forget about all of that, and qemu cannot configure them
>>>>> again until qemu acks the features again.
>>>> Now I’m completely confused, because I don’t see the point of
>>>> implementing suspend at all if we rely on a reset immediately afterwards
>>>> anyway.
>>> It's not immediate. From vhost_dev_stop, comments added only in this mail:
>>>
>>> void vhost_virtqueue_stop(struct vhost_dev *dev,
>>>                             struct VirtIODevice *vdev,
>>>                             struct vhost_virtqueue *vq,
>>>                             unsigned idx)
>>> {
>>>       ...
>>>       // Get each vring indexes, trusting the destination device can
>>> continue safely from there
>>>       r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
>>>       if (r < 0) {
>>>           VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
>>>           /* Connection to the backend is broken, so let's sync internal
>>>            * last avail idx to the device used idx.
>>>            */
>>>           virtio_queue_restore_last_avail_idx(vdev, idx);
>>>       } else {
>>>           virtio_queue_set_last_avail_idx(vdev, idx, state.num);
>>>       }
>>>       ...
>>> }
>>>
>>> void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
>>> {
>>>       ...
>>>       // Suspend the device, so we can trust in vring indexes / vq state
>> I don’t understand this purpose.  GET_VRING_BASE stops the vring in
>> question, so the vring index returned must be trustworthy, no?
>>
> That only happens in vhost-user, not in vhost-vdpa.

OK, so that begs the question: Was SUSPEND ever intended to do anything 
but stop all vrings?  Because this series is about to make its meaning a 
whole lot broader than that in vhost-user.

>>>       if (hdev->vhost_ops->vhost_dev_start) {
>>>           hdev->vhost_ops->vhost_dev_start(hdev, false);
>>>       }
>>>       if (vrings) {
>>>           vhost_dev_set_vring_enable(hdev, false);
>>>       }
>>>
>>>       // Fetch each vq index / state and store in vdev->vq[i]
>>>       for (i = 0; i < hdev->nvqs; ++i) {
>>>           vhost_virtqueue_stop(hdev,
>>>                                vdev,
>>>                                hdev->vqs + i,
>>>                                hdev->vq_index + i);
>>>       }
>>>
>>>       // Reset the device, as we don't need it anymore and it can
>>> release the resources
>>>       if (hdev->vhost_ops->vhost_reset_status) {
>>>           hdev->vhost_ops->vhost_reset_status(hdev);
>>>       }
>>> }
>>> ---
>>>
>>>>    It was my impression this whole time that suspending would
>>>> remove the need to reset.  Well, at least until the device should be
>>>> resumed again, i.e. in vhost_dev_start().
>>>>
>>> It cannot. vhost_dev_stop is also called when the guest reset the
>>> device, so the guest trusts the device to be in a clean state.
>>>
>>> Also, the reset is the moment when the device frees the unused
>>> resources. This would mandate the device to
>> What resources are we talking about?  This function is called when the
>> VM is paused (stop).  If a stateful device is reset to free “unused
>> resources”, this means dropping its internal state, which is absolutely
>> wrong in a stop/cont cycle.
>>
> But is the expected result in the virtio reset cycle. We need to split
> these paths.
>
>>>> In addition, I also don’t understand the magnitude of the problem with
>>>> ordering.  If the order in vhost_dev_start() is wrong, can we not easily
>>>> fix it?
>>> The order in vhost_dev_start follows the VirtIO standard.
>> What does the VirtIO standard say about suspended vhost back-ends?
>>
> Suspend does not exist in the VirtIO standard. I meant the device
> initialization order in "3.1 Device Initialization":
>
> 1. Reset the device.
> ...
> 5. Set the FEATURES_OK status bit. [...]
> ...
> 7. Perform device-specific setup, including discovery of virtqueues
> for the device, optional per-bus setup, reading and possibly writing
> the device’s virtio configuration space, and population of virtqueues.
> 8.Set the DRIVER_OK status bit. At this point the device is “live”.
>
> Steps 4-8 are all done in vhost_dev_start, in that particular order.
> To call vhost_vdpa_reset_status from vhost_vdpa_dev_start(true) would
> reset the device back to step 1, but there is no more code to set all
> configuration from 2-7 before 8 (DRIVER_OK).

That’s why I’ve proposed doing the reset at the start of 
vhost_dev_start() (quoted below still).  To me, that sounds in line with 
the virtio specification.

Still, if you insist there must a reset somewhere in 
vhost_dev_start()/stop() because it may be guest-initiated, then there 
is no solution that can work for both.  We must be able to distinguish 
between a paused VM and a change in the guest driver.

>> Hanna
>>
>>> The move of
>>> the reset should be to remove it from vhost_dev_stop to something like
>>> both virtio_reset and the end of virtio_save. I'm not sure if I'm
>>> forgetting some other use cases.
>>>
>>>> E.g. add a full vhost_dev_resume callback to invoke right at
>>>> the start of vhost_dev_start(); or check (in the same place) whether the
>>>> back-end supports resuming, and if it doesn’t (and it is currently
>>>> suspended), reset it there.




^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-26  6:57                 ` Hanna Czenczek
@ 2023-07-27 12:49                   ` Eugenio Perez Martin
  2023-07-27 20:26                     ` Stefan Hajnoczi
  0 siblings, 1 reply; 37+ messages in thread
From: Eugenio Perez Martin @ 2023-07-27 12:49 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Stefan Hajnoczi, German Maglione

On Wed, Jul 26, 2023 at 8:57 AM Hanna Czenczek <hreitz@redhat.com> wrote:
>
> On 25.07.23 20:53, Eugenio Perez Martin wrote:
> > On Tue, Jul 25, 2023 at 3:09 PM Hanna Czenczek <hreitz@redhat.com> wrote:
> >> On 25.07.23 12:03, Eugenio Perez Martin wrote:
> >>> On Tue, Jul 25, 2023 at 9:53 AM Hanna Czenczek <hreitz@redhat.com> wrote:
> >>>> On 24.07.23 17:48, Eugenio Perez Martin wrote:
> >>>>> On Fri, Jul 21, 2023 at 6:07 PM Hanna Czenczek <hreitz@redhat.com> wrote:
> >>>>>> On 21.07.23 17:25, Eugenio Perez Martin wrote:
> >>>>>>> On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hreitz@redhat.com> wrote:
> >>>>>>>> Move the `suspended` field from vhost_vdpa into the global vhost_dev
> >>>>>>>> struct, so vhost_dev_stop() can check whether the back-end has been
> >>>>>>>> suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
> >>>>>>>> there is no need to reset it; the reset is just a fall-back to stop
> >>>>>>>> device operations for back-ends that do not support suspend.
> >>>>>>>>
> >>>>>>>> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
> >>>>>>>> when the device is re-started, we still have to do the reset to have it
> >>>>>>>> un-suspend.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> >>>>>>>> ---
> >>>>>>>>      include/hw/virtio/vhost-vdpa.h |  2 --
> >>>>>>>>      include/hw/virtio/vhost.h      |  8 ++++++++
> >>>>>>>>      hw/virtio/vhost-vdpa.c         | 11 +++++++----
> >>>>>>>>      hw/virtio/vhost.c              |  8 +++++++-
> >>>>>>>>      4 files changed, 22 insertions(+), 7 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> >>>>>>>> index e64bfc7f98..72c3686b7f 100644
> >>>>>>>> --- a/include/hw/virtio/vhost-vdpa.h
> >>>>>>>> +++ b/include/hw/virtio/vhost-vdpa.h
> >>>>>>>> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
> >>>>>>>>          bool shadow_vqs_enabled;
> >>>>>>>>          /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
> >>>>>>>>          bool shadow_data;
> >>>>>>>> -    /* Device suspended successfully */
> >>>>>>>> -    bool suspended;
> >>>>>>>>          /* IOVA mapping used by the Shadow Virtqueue */
> >>>>>>>>          VhostIOVATree *iova_tree;
> >>>>>>>>          GPtrArray *shadow_vqs;
> >>>>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> >>>>>>>> index 6a173cb9fa..69bf59d630 100644
> >>>>>>>> --- a/include/hw/virtio/vhost.h
> >>>>>>>> +++ b/include/hw/virtio/vhost.h
> >>>>>>>> @@ -120,6 +120,14 @@ struct vhost_dev {
> >>>>>>>>          uint64_t backend_cap;
> >>>>>>>>          /* @started: is the vhost device started? */
> >>>>>>>>          bool started;
> >>>>>>>> +    /**
> >>>>>>>> +     * @suspended: Whether the vhost device is currently suspended.  Set
> >>>>>>>> +     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
> >>>>>>>> +     * are supposed to automatically suspend/resume in their
> >>>>>>>> +     * vhost_dev_start handlers as required.  Must also be cleared when
> >>>>>>>> +     * the device is reset.
> >>>>>>>> +     */
> >>>>>>>> +    bool suspended;
> >>>>>>>>          bool log_enabled;
> >>>>>>>>          uint64_t log_size;
> >>>>>>>>          Error *migration_blocker;
> >>>>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>>>>>>> index 7b7dee468e..f7fd19a203 100644
> >>>>>>>> --- a/hw/virtio/vhost-vdpa.c
> >>>>>>>> +++ b/hw/virtio/vhost-vdpa.c
> >>>>>>>> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> >>>>>>>>
> >>>>>>>>      static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> >>>>>>>>      {
> >>>>>>>> -    struct vhost_vdpa *v = dev->opaque;
> >>>>>>>>          int ret;
> >>>>>>>>          uint8_t status = 0;
> >>>>>>>>
> >>>>>>>>          ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> >>>>>>>>          trace_vhost_vdpa_reset_device(dev);
> >>>>>>>> -    v->suspended = false;
> >>>>>>>> +    dev->suspended = false;
> >>>>>>>>          return ret;
> >>>>>>>>      }
> >>>>>>>>
> >>>>>>>> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
> >>>>>>>>              if (unlikely(r)) {
> >>>>>>>>                  error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
> >>>>>>>>              } else {
> >>>>>>>> -            v->suspended = true;
> >>>>>>>> +            dev->suspended = true;
> >>>>>>>>                  return;
> >>>>>>>>              }
> >>>>>>>>          }
> >>>>>>>> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >>>>>>>>                  return -1;
> >>>>>>>>              }
> >>>>>>>>              vhost_vdpa_set_vring_ready(dev);
> >>>>>>>> +        if (dev->suspended) {
> >>>>>>>> +            /* TODO: When RESUME is available, use it instead of resetting */
> >>>>>>>> +            vhost_vdpa_reset_status(dev);
> >>>>>>> How is that we reset the status at each vhost_vdpa_dev_start? That
> >>>>>>> will clean all the vqs configured, features negotiated, etc. in the
> >>>>>>> vDPA device. Or am I missing something?
> >>>>>> What alternative do you propose?  We don’t have RESUME for vDPA in qemu,
> >>>>>> but we somehow need to lift the previous SUSPEND so the device will
> >>>>>> again respond to guest requests, do we not?
> >>>>>>
> >>>>> Reset also clears the suspend state in vDPA, and it should be called
> >>>>> at vhost_dev_stop. So the device should never be in suspended state
> >>>>> here. Does that solve your concerns?
> >>>> My intention with this patch was precisely not to reset in
> >>>> vhost_dev_stop when suspending is supported.  So now I’m more confused
> >>>> than before.
> >>>>
> >>> At this moment, I think that should be focused as an optimization and
> >>> not to be included in the main series.
> >> It is absolutely not an optimization but vital for my use case.
> >> virtiofsd does not currently implement resetting, but if it did (and we
> >> want this support in the future), resetting it during stop/cont would be
> >> catastrophic.  There must be a way to have qemu not issue a reset.  This
> >> patch is the reason why this series exists.
> >>
> > Sorry, I see I confused things with the first reply. Let me do a recap.
> >
> > If I understand the problem correctly, your use case requires that
> > qemu does not reset the device before the device state is fetched with
> > virtio_save in the case of a migration.
>
> That is only part of the problem, the bigger picture has nothing to do
> with migration at all.  The problem is that when the VM is paused
> (stop), we invoke vhost_dev_stop(), and when it is resumed (cont), we
> invoke vhost_dev_start().  To me, it therefore sounds absolutely wrong
> to reset the back-end in either of these functions.  For stateless
> devices, it was determined to not be an issue (I still find it extremely
> strange), and as far as I’ve understood, we’ve come to the agreement
> that it’s basically a fallback for when there is no other way to stop
> the back-end.  But stateful devices like virtio-fs would be completely
> broken by resetting them there.
>
> Therefore, if virtiofsd did implement reset through SET_STATUS,
> stop/cont would break it today.  Maybe other vhost-user devices, too,
> which just implement RESET_OWNER/RESET_DEVICE, which aren’t even called
> when the device is supposed to be reset in vhost_dev_stop() (patch 6).
>
> So not just because of migration, but in general, there must be a way
> for back-ends to force qemu not to reset them in vhost_dev_start()/stop().
>
> Or we stop using vhost_dev_start()/stop() when the VM is paused/resumed
> (stop/cont).
>

Yes, that comes back to the thread [1].

As a third alternative, you can keep vhost_dev_start and let the
function check the current state and initialize the device only if
needed. But you can keep symmetrical functions and call one or another
at the device's code, of course. Not sure what is cleaner or requires
less changes.

> > This is understandable and I
> > think we have a solution for that: to move the vhost_ops call to
> > virtio_reset and the end of virtio_save.
>
> Why would we reset the device in virtio_save()?
>

If the VM continues in the source because of whatever reason,
vhost_dev_start would expect the device to be clean. You can test it
with the command "cont" after the LM.

> > To remove the reset call from
> > vhost_dev_stop is somehow mandatory, as it is called before
> > virtio_save.
> >
> > But we cannot move to vhost_vdpa_dev_start, as proposed here. The reasons are:
> > * All the features, vq parameters, etc are set before any
> > vhost_vdpa_dev_start call. To reset at any vhost_vdpa_dev_start would
> > wipe them.
> > * The device needs to hold all the resources until it is reset. Things
> > like descriptor status etc.
> >
> > And, regarding the comment "When RESUME is available, use it instead
> > of resetting", we cannot use resume to replace reset in all cases.
> > This is because the semantics are different.
> >
> > For example, vhost_dev_stop and vhost_dev_start are also called when
> > the guest reset by itself the device. You can check it rmmoding and
> > modprobing virtio-net driver, for example. In this case, the driver
> > expects to find all vqs to start with 0, but the resume must not reset
> > these indexes.
>
> This isn’t quite clear to me.  I understand this to mean that there must
> be a reset somewhere in vhost_dev_stop() and/or vhost_dev_start().  But
> above, you proposed moving the reset from vhost_dev_stop() into
> virtio_reset().  Is virtio_reset() called in addition to
> vhost_dev_stop() and vhost_dev_start() when the guest driver is changed?
>

Right. Maybe another option is virtio_set_status?

The point is that other parts of qemu / guest trust the device to be
reset after (current version of) vhost_dev_stop, so if we are going to
move the reset it must be added to the callers at least. To trace
these callers is needed, so maybe it is easy to add another function
(vhost_dev_suspend?), your first alternative.

> Because we can’t have an always-present reset in vhost_dev_stop() or
> vhost_dev_start().  It just doesn’t work with stop/cont.  At the same
> time, I understand that you say we must have it because
> vhost_dev_{stop,start}() are also used when the guest driver changes.
> Consequently, it sounds clear to me that using the exact same functions
> in stop/cont and when the guest driver is unloaded/loaded is and has
> always been wrong.  Because in stop/cont, the guest driver never
> changes, so we shouldn’t tell the back-end that it did (by sending
> SET_STATUS(0)).
>

Talking just about vhost_net here, the device dumps all the state
(like vqs) to qemu in vhost_dev_stop. That is what allows to have, for
example, an unified code in migrating: qemu only needs to know how to
migrate its emulated device, and vhost code just writes or read at
suspend / continue or live migrating. To suspend and continue is the
same operation actually for vhost_net.

> > It can be applied as an optimizations sometimes, but not for the general case.
> >
> >>>>>> But more generally, is this any different from what is done before this
> >>>>>> patch?  Before this patch, vhost_dev_stop() unconditionally invokes
> >>>>>> vhost_reset_status(), so the device is reset in every stop/start cycle,
> >>>>>> that doesn’t change.  And we still won’t reset it on the first
> >>>>>> vhost_dev_start(), because dev->suspended will be false then, only on
> >>>>>> subsequent stop/start cycles, as before.  So the only difference is that
> >>>>>> now the device is reset on start, not on stop.
> >>>>>>
> >>>>> The difference is that vhost_vdpa_dev_start is called after features
> >>>>> ack (via vhost_dev_start, through vhost_dev_set_features call) and vq
> >>>>> configuration (using vhost_virtqueue_start). A device reset forces the
> >>>>> device to forget about all of that, and qemu cannot configure them
> >>>>> again until qemu acks the features again.
> >>>> Now I’m completely confused, because I don’t see the point of
> >>>> implementing suspend at all if we rely on a reset immediately afterwards
> >>>> anyway.
> >>> It's not immediate. From vhost_dev_stop, comments added only in this mail:
> >>>
> >>> void vhost_virtqueue_stop(struct vhost_dev *dev,
> >>>                             struct VirtIODevice *vdev,
> >>>                             struct vhost_virtqueue *vq,
> >>>                             unsigned idx)
> >>> {
> >>>       ...
> >>>       // Get each vring indexes, trusting the destination device can
> >>> continue safely from there
> >>>       r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
> >>>       if (r < 0) {
> >>>           VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
> >>>           /* Connection to the backend is broken, so let's sync internal
> >>>            * last avail idx to the device used idx.
> >>>            */
> >>>           virtio_queue_restore_last_avail_idx(vdev, idx);
> >>>       } else {
> >>>           virtio_queue_set_last_avail_idx(vdev, idx, state.num);
> >>>       }
> >>>       ...
> >>> }
> >>>
> >>> void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
> >>> {
> >>>       ...
> >>>       // Suspend the device, so we can trust in vring indexes / vq state
> >> I don’t understand this purpose.  GET_VRING_BASE stops the vring in
> >> question, so the vring index returned must be trustworthy, no?
> >>
> > That only happens in vhost-user, not in vhost-vdpa.
>
> OK, so that begs the question: Was SUSPEND ever intended to do anything
> but stop all vrings?  Because this series is about to make its meaning a
> whole lot broader than that in vhost-user.
>

That was the big part, yes. The device must also stop modifying
config, like in the case of link status changes. Other actions like
fetch vq state are performed later (at vhost_virtqueue_stop).

VHOST_VDPA_SUSPEND requires the device to preserve all the needed
state to be recovered or continued:

Suspend a device so it does not process virtqueue requests anymore

After the return of ioctl the device must preserve all the necessary state
(the virtqueue vring base plus the possible device specific states) that is
required for restoring in the future. The device must not change its
configuration after that point.
---

> >>>       if (hdev->vhost_ops->vhost_dev_start) {
> >>>           hdev->vhost_ops->vhost_dev_start(hdev, false);
> >>>       }
> >>>       if (vrings) {
> >>>           vhost_dev_set_vring_enable(hdev, false);
> >>>       }
> >>>
> >>>       // Fetch each vq index / state and store in vdev->vq[i]
> >>>       for (i = 0; i < hdev->nvqs; ++i) {
> >>>           vhost_virtqueue_stop(hdev,
> >>>                                vdev,
> >>>                                hdev->vqs + i,
> >>>                                hdev->vq_index + i);
> >>>       }
> >>>
> >>>       // Reset the device, as we don't need it anymore and it can
> >>> release the resources
> >>>       if (hdev->vhost_ops->vhost_reset_status) {
> >>>           hdev->vhost_ops->vhost_reset_status(hdev);
> >>>       }
> >>> }
> >>> ---
> >>>
> >>>>    It was my impression this whole time that suspending would
> >>>> remove the need to reset.  Well, at least until the device should be
> >>>> resumed again, i.e. in vhost_dev_start().
> >>>>
> >>> It cannot. vhost_dev_stop is also called when the guest reset the
> >>> device, so the guest trusts the device to be in a clean state.
> >>>
> >>> Also, the reset is the moment when the device frees the unused
> >>> resources. This would mandate the device to
> >> What resources are we talking about?  This function is called when the
> >> VM is paused (stop).  If a stateful device is reset to free “unused
> >> resources”, this means dropping its internal state, which is absolutely
> >> wrong in a stop/cont cycle.
> >>
> > But is the expected result in the virtio reset cycle. We need to split
> > these paths.
> >
> >>>> In addition, I also don’t understand the magnitude of the problem with
> >>>> ordering.  If the order in vhost_dev_start() is wrong, can we not easily
> >>>> fix it?
> >>> The order in vhost_dev_start follows the VirtIO standard.
> >> What does the VirtIO standard say about suspended vhost back-ends?
> >>
> > Suspend does not exist in the VirtIO standard. I meant the device
> > initialization order in "3.1 Device Initialization":
> >
> > 1. Reset the device.
> > ...
> > 5. Set the FEATURES_OK status bit. [...]
> > ...
> > 7. Perform device-specific setup, including discovery of virtqueues
> > for the device, optional per-bus setup, reading and possibly writing
> > the device’s virtio configuration space, and population of virtqueues.
> > 8.Set the DRIVER_OK status bit. At this point the device is “live”.
> >
> > Steps 4-8 are all done in vhost_dev_start, in that particular order.
> > To call vhost_vdpa_reset_status from vhost_vdpa_dev_start(true) would
> > reset the device back to step 1, but there is no more code to set all
> > configuration from 2-7 before 8 (DRIVER_OK).
>
> That’s why I’ve proposed doing the reset at the start of
> vhost_dev_start() (quoted below still).  To me, that sounds in line with
> the virtio specification.
>
> Still, if you insist there must a reset somewhere in
> vhost_dev_start()/stop() because it may be guest-initiated, then there
> is no solution that can work for both.  We must be able to distinguish
> between a paused VM and a change in the guest driver.
>

Right.

[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg954916.html

> >> Hanna
> >>
> >>> The move of
> >>> the reset should be to remove it from vhost_dev_stop to something like
> >>> both virtio_reset and the end of virtio_save. I'm not sure if I'm
> >>> forgetting some other use cases.
> >>>
> >>>> E.g. add a full vhost_dev_resume callback to invoke right at
> >>>> the start of vhost_dev_start(); or check (in the same place) whether the
> >>>> back-end supports resuming, and if it doesn’t (and it is currently
> >>>> suspended), reset it there.
>
>



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/6] vhost: Do not reset suspended devices on stop
  2023-07-27 12:49                   ` Eugenio Perez Martin
@ 2023-07-27 20:26                     ` Stefan Hajnoczi
  0 siblings, 0 replies; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-27 20:26 UTC (permalink / raw)
  To: longpeng2
  Cc: Hanna Czenczek, qemu-devel, Michael S . Tsirkin, German Maglione,
	Eugenio Perez Martin

[-- Attachment #1: Type: text/plain, Size: 12365 bytes --]

On Thu, Jul 27, 2023 at 02:49:04PM +0200, Eugenio Perez Martin wrote:
> On Wed, Jul 26, 2023 at 8:57 AM Hanna Czenczek <hreitz@redhat.com> wrote:
> >
> > On 25.07.23 20:53, Eugenio Perez Martin wrote:
> > > On Tue, Jul 25, 2023 at 3:09 PM Hanna Czenczek <hreitz@redhat.com> wrote:
> > >> On 25.07.23 12:03, Eugenio Perez Martin wrote:
> > >>> On Tue, Jul 25, 2023 at 9:53 AM Hanna Czenczek <hreitz@redhat.com> wrote:
> > >>>> On 24.07.23 17:48, Eugenio Perez Martin wrote:
> > >>>>> On Fri, Jul 21, 2023 at 6:07 PM Hanna Czenczek <hreitz@redhat.com> wrote:
> > >>>>>> On 21.07.23 17:25, Eugenio Perez Martin wrote:
> > >>>>>>> On Tue, Jul 11, 2023 at 5:52 PM Hanna Czenczek <hreitz@redhat.com> wrote:
> > >>>>>>>> Move the `suspended` field from vhost_vdpa into the global vhost_dev
> > >>>>>>>> struct, so vhost_dev_stop() can check whether the back-end has been
> > >>>>>>>> suspended by `vhost_ops->vhost_dev_start(hdev, false)`.  If it has,
> > >>>>>>>> there is no need to reset it; the reset is just a fall-back to stop
> > >>>>>>>> device operations for back-ends that do not support suspend.
> > >>>>>>>>
> > >>>>>>>> Unfortunately, for vDPA specifically, RESUME is not yet implemented, so
> > >>>>>>>> when the device is re-started, we still have to do the reset to have it
> > >>>>>>>> un-suspend.
> > >>>>>>>>
> > >>>>>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> > >>>>>>>> ---
> > >>>>>>>>      include/hw/virtio/vhost-vdpa.h |  2 --
> > >>>>>>>>      include/hw/virtio/vhost.h      |  8 ++++++++
> > >>>>>>>>      hw/virtio/vhost-vdpa.c         | 11 +++++++----
> > >>>>>>>>      hw/virtio/vhost.c              |  8 +++++++-
> > >>>>>>>>      4 files changed, 22 insertions(+), 7 deletions(-)
> > >>>>>>>>
> > >>>>>>>> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > >>>>>>>> index e64bfc7f98..72c3686b7f 100644
> > >>>>>>>> --- a/include/hw/virtio/vhost-vdpa.h
> > >>>>>>>> +++ b/include/hw/virtio/vhost-vdpa.h
> > >>>>>>>> @@ -42,8 +42,6 @@ typedef struct vhost_vdpa {
> > >>>>>>>>          bool shadow_vqs_enabled;
> > >>>>>>>>          /* Vdpa must send shadow addresses as IOTLB key for data queues, not GPA */
> > >>>>>>>>          bool shadow_data;
> > >>>>>>>> -    /* Device suspended successfully */
> > >>>>>>>> -    bool suspended;
> > >>>>>>>>          /* IOVA mapping used by the Shadow Virtqueue */
> > >>>>>>>>          VhostIOVATree *iova_tree;
> > >>>>>>>>          GPtrArray *shadow_vqs;
> > >>>>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> > >>>>>>>> index 6a173cb9fa..69bf59d630 100644
> > >>>>>>>> --- a/include/hw/virtio/vhost.h
> > >>>>>>>> +++ b/include/hw/virtio/vhost.h
> > >>>>>>>> @@ -120,6 +120,14 @@ struct vhost_dev {
> > >>>>>>>>          uint64_t backend_cap;
> > >>>>>>>>          /* @started: is the vhost device started? */
> > >>>>>>>>          bool started;
> > >>>>>>>> +    /**
> > >>>>>>>> +     * @suspended: Whether the vhost device is currently suspended.  Set
> > >>>>>>>> +     * and reset by implementations (vhost-user, vhost-vdpa, ...), which
> > >>>>>>>> +     * are supposed to automatically suspend/resume in their
> > >>>>>>>> +     * vhost_dev_start handlers as required.  Must also be cleared when
> > >>>>>>>> +     * the device is reset.
> > >>>>>>>> +     */
> > >>>>>>>> +    bool suspended;
> > >>>>>>>>          bool log_enabled;
> > >>>>>>>>          uint64_t log_size;
> > >>>>>>>>          Error *migration_blocker;
> > >>>>>>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > >>>>>>>> index 7b7dee468e..f7fd19a203 100644
> > >>>>>>>> --- a/hw/virtio/vhost-vdpa.c
> > >>>>>>>> +++ b/hw/virtio/vhost-vdpa.c
> > >>>>>>>> @@ -858,13 +858,12 @@ static int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> > >>>>>>>>
> > >>>>>>>>      static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> > >>>>>>>>      {
> > >>>>>>>> -    struct vhost_vdpa *v = dev->opaque;
> > >>>>>>>>          int ret;
> > >>>>>>>>          uint8_t status = 0;
> > >>>>>>>>
> > >>>>>>>>          ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> > >>>>>>>>          trace_vhost_vdpa_reset_device(dev);
> > >>>>>>>> -    v->suspended = false;
> > >>>>>>>> +    dev->suspended = false;
> > >>>>>>>>          return ret;
> > >>>>>>>>      }
> > >>>>>>>>
> > >>>>>>>> @@ -1278,7 +1277,7 @@ static void vhost_vdpa_suspend(struct vhost_dev *dev)
> > >>>>>>>>              if (unlikely(r)) {
> > >>>>>>>>                  error_report("Cannot suspend: %s(%d)", g_strerror(errno), errno);
> > >>>>>>>>              } else {
> > >>>>>>>> -            v->suspended = true;
> > >>>>>>>> +            dev->suspended = true;
> > >>>>>>>>                  return;
> > >>>>>>>>              }
> > >>>>>>>>          }
> > >>>>>>>> @@ -1313,6 +1312,10 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> > >>>>>>>>                  return -1;
> > >>>>>>>>              }
> > >>>>>>>>              vhost_vdpa_set_vring_ready(dev);
> > >>>>>>>> +        if (dev->suspended) {
> > >>>>>>>> +            /* TODO: When RESUME is available, use it instead of resetting */
> > >>>>>>>> +            vhost_vdpa_reset_status(dev);
> > >>>>>>> How is that we reset the status at each vhost_vdpa_dev_start? That
> > >>>>>>> will clean all the vqs configured, features negotiated, etc. in the
> > >>>>>>> vDPA device. Or am I missing something?
> > >>>>>> What alternative do you propose?  We don’t have RESUME for vDPA in qemu,
> > >>>>>> but we somehow need to lift the previous SUSPEND so the device will
> > >>>>>> again respond to guest requests, do we not?
> > >>>>>>
> > >>>>> Reset also clears the suspend state in vDPA, and it should be called
> > >>>>> at vhost_dev_stop. So the device should never be in suspended state
> > >>>>> here. Does that solve your concerns?
> > >>>> My intention with this patch was precisely not to reset in
> > >>>> vhost_dev_stop when suspending is supported.  So now I’m more confused
> > >>>> than before.
> > >>>>
> > >>> At this moment, I think that should be focused as an optimization and
> > >>> not to be included in the main series.
> > >> It is absolutely not an optimization but vital for my use case.
> > >> virtiofsd does not currently implement resetting, but if it did (and we
> > >> want this support in the future), resetting it during stop/cont would be
> > >> catastrophic.  There must be a way to have qemu not issue a reset.  This
> > >> patch is the reason why this series exists.
> > >>
> > > Sorry, I see I confused things with the first reply. Let me do a recap.
> > >
> > > If I understand the problem correctly, your use case requires that
> > > qemu does not reset the device before the device state is fetched with
> > > virtio_save in the case of a migration.
> >
> > That is only part of the problem, the bigger picture has nothing to do
> > with migration at all.  The problem is that when the VM is paused
> > (stop), we invoke vhost_dev_stop(), and when it is resumed (cont), we
> > invoke vhost_dev_start().  To me, it therefore sounds absolutely wrong
> > to reset the back-end in either of these functions.  For stateless
> > devices, it was determined to not be an issue (I still find it extremely
> > strange), and as far as I’ve understood, we’ve come to the agreement
> > that it’s basically a fallback for when there is no other way to stop
> > the back-end.  But stateful devices like virtio-fs would be completely
> > broken by resetting them there.
> >
> > Therefore, if virtiofsd did implement reset through SET_STATUS,
> > stop/cont would break it today.  Maybe other vhost-user devices, too,
> > which just implement RESET_OWNER/RESET_DEVICE, which aren’t even called
> > when the device is supposed to be reset in vhost_dev_stop() (patch 6).
> >
> > So not just because of migration, but in general, there must be a way
> > for back-ends to force qemu not to reset them in vhost_dev_start()/stop().
> >
> > Or we stop using vhost_dev_start()/stop() when the VM is paused/resumed
> > (stop/cont).
> >
> 
> Yes, that comes back to the thread [1].
> 
> As a third alternative, you can keep vhost_dev_start and let the
> function check the current state and initialize the device only if
> needed. But you can keep symmetrical functions and call one or another
> at the device's code, of course. Not sure what is cleaner or requires
> less changes.
> 
> > > This is understandable and I
> > > think we have a solution for that: to move the vhost_ops call to
> > > virtio_reset and the end of virtio_save.
> >
> > Why would we reset the device in virtio_save()?
> >
> 
> If the VM continues in the source because of whatever reason,
> vhost_dev_start would expect the device to be clean. You can test it
> with the command "cont" after the LM.
> 
> > > To remove the reset call from
> > > vhost_dev_stop is somehow mandatory, as it is called before
> > > virtio_save.
> > >
> > > But we cannot move to vhost_vdpa_dev_start, as proposed here. The reasons are:
> > > * All the features, vq parameters, etc are set before any
> > > vhost_vdpa_dev_start call. To reset at any vhost_vdpa_dev_start would
> > > wipe them.
> > > * The device needs to hold all the resources until it is reset. Things
> > > like descriptor status etc.
> > >
> > > And, regarding the comment "When RESUME is available, use it instead
> > > of resetting", we cannot use resume to replace reset in all cases.
> > > This is because the semantics are different.
> > >
> > > For example, vhost_dev_stop and vhost_dev_start are also called when
> > > the guest reset by itself the device. You can check it rmmoding and
> > > modprobing virtio-net driver, for example. In this case, the driver
> > > expects to find all vqs to start with 0, but the resume must not reset
> > > these indexes.
> >
> > This isn’t quite clear to me.  I understand this to mean that there must
> > be a reset somewhere in vhost_dev_stop() and/or vhost_dev_start().  But
> > above, you proposed moving the reset from vhost_dev_stop() into
> > virtio_reset().  Is virtio_reset() called in addition to
> > vhost_dev_stop() and vhost_dev_start() when the guest driver is changed?
> >
> 
> Right. Maybe another option is virtio_set_status?
> 
> The point is that other parts of qemu / guest trust the device to be
> reset after (current version of) vhost_dev_stop, so if we are going to
> move the reset it must be added to the callers at least. To trace
> these callers is needed, so maybe it is easy to add another function
> (vhost_dev_suspend?), your first alternative.
> 
> > Because we can’t have an always-present reset in vhost_dev_stop() or
> > vhost_dev_start().  It just doesn’t work with stop/cont.  At the same
> > time, I understand that you say we must have it because
> > vhost_dev_{stop,start}() are also used when the guest driver changes.
> > Consequently, it sounds clear to me that using the exact same functions
> > in stop/cont and when the guest driver is unloaded/loaded is and has
> > always been wrong.  Because in stop/cont, the guest driver never
> > changes, so we shouldn’t tell the back-end that it did (by sending
> > SET_STATUS(0)).
> >
> 
> Talking just about vhost_net here, the device dumps all the state
> (like vqs) to qemu in vhost_dev_stop. That is what allows to have, for
> example, an unified code in migrating: qemu only needs to know how to
> migrate its emulated device, and vhost code just writes or read at
> suspend / continue or live migrating. To suspend and continue is the
> same operation actually for vhost_net.

Hi Longpeng,
This discussion made me realize that --device vhost-vdpa-device-pci does
not support stateful vDPA devices.

For example, a vdpa-net device that implements the controlq will lose
state (e.g. packet receive filtering) when the guest is stopped with the
QEMU 'stop' command.

QEMU needs to use the VHOST_VDPA_RESUME ioctl so it can resume stateful
devices instead of resetting them across 'stop'/'cont'.

Whatever solution we agree on in this thread should also work for
vhost-vdpa and solve the issue for --device vhost-vdpa-device-pci.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/6] vhost-user.rst: Add suspend/resume
  2023-07-25  8:30         ` Hanna Czenczek
@ 2023-07-27 21:12           ` Stefan Hajnoczi
  0 siblings, 0 replies; 37+ messages in thread
From: Stefan Hajnoczi @ 2023-07-27 21:12 UTC (permalink / raw)
  To: Hanna Czenczek
  Cc: qemu-devel, Michael S . Tsirkin, Eugenio Pérez, German Maglione

[-- Attachment #1: Type: text/plain, Size: 17061 bytes --]

On Tue, Jul 25, 2023 at 10:30:49AM +0200, Hanna Czenczek wrote:
> On 24.07.23 19:55, Stefan Hajnoczi wrote:
> > On Wed, Jul 19, 2023 at 03:59:32PM +0200, Hanna Czenczek wrote:
> > > On 18.07.23 16:25, Stefan Hajnoczi wrote:
> > > > On Tue, Jul 11, 2023 at 05:52:23PM +0200, Hanna Czenczek wrote:
> > > > > When stopping the VM, qemu wants all devices to fully cease any
> > > > > operation, too.  Currently, we can only have vhost-user back-ends stop
> > > > > processing vrings, but not background operations.  Add the SUSPEND and
> > > > > RESUME commands from vDPA, which allow the front-end (qemu) to tell
> > > > > back-ends to cease all operations, including those running in the
> > > > > background.
> > > > > 
> > > > > qemu's current work-around for this is to reset the back-end instead of
> > > > > suspending it, which will not work for back-ends that have internal
> > > > > state that must be preserved across e.g. stop/cont.
> > > > > 
> > > > > Note that the given specification requires the back-end to delay
> > > > > processing kicks (i.e. starting vrings) until the device is resumed,
> > > > > instead of requiring the front-end not to send kicks while suspended.
> > > > > qemu starts devices (and would just resume them) only when the VM is in
> > > > > a running state, so it would be difficult to have qemu delay kicks until
> > > > > the device is resumed, which is why this patch specifies handling of
> > > > > kicks as it does.
> > > > > 
> > > > > Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
> > > > > ---
> > > > >    docs/interop/vhost-user.rst | 35 +++++++++++++++++++++++++++++++++--
> > > > >    1 file changed, 33 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > > > > index 5a070adbc1..ac6be34c4c 100644
> > > > > --- a/docs/interop/vhost-user.rst
> > > > > +++ b/docs/interop/vhost-user.rst
> > > > > @@ -381,6 +381,10 @@ readable) on the descriptor specified by ``VHOST_USER_SET_VRING_KICK``
> > > > >    or receiving the in-band message ``VHOST_USER_VRING_KICK`` if negotiated,
> > > > >    and stop ring upon receiving ``VHOST_USER_GET_VRING_BASE``.
> > > > > +While the back-end is suspended (via ``VHOST_USER_SUSPEND``), it must
> > > > > +never process rings, and thus also delay handling kicks until the
> > > > If you respin this series, I suggest replacing "never" with "not" to
> > > > emphasize that ring processing is only skipped while the device is
> > > > suspended (rather than forever). "Never" feels too strong to use when
> > > > describing a temporary state.
> > > Sure.
> > > 
> > > > > +back-end is resumed again.
> > > > > +
> > > > >    Rings can be enabled or disabled by ``VHOST_USER_SET_VRING_ENABLE``.
> > > > >    If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
> > > > > @@ -479,8 +483,9 @@ supplied by ``VhostUserLog``.
> > > > >    ancillary data, it may be used to inform the front-end that the log has
> > > > >    been modified.
> > > > > -Once the source has finished migration, rings will be stopped by the
> > > > > -source. No further update must be done before rings are restarted.
> > > > > +Once the source has finished migration, the device will be suspended and
> > > > > +its rings will be stopped by the source. No further update must be done
> > > > > +before the device and its rings are resumed.
> > > > This paragraph is abstract and doesn't directly identify the mechanisms
> > > > or who does what:
> > > > - "the device will be suspended" via VHOST_USER_SUSPEND (or reset when
> > > >     VHOST_USER_SUSPEND is not supported?) or automatically by the device
> > > >     itself or some other mechanism?
> > > OK, I’ll add VHOST_USER_SUSPEND.
> > > 
> > > So far I hadn’t considered making a note of resetting as a fallback right in
> > > the specification.  I don’t think I would want it in the specification, but
> > > on the other hand, it is probably important for back-end authors to know
> > > that if they do not implement SUSPEND, their device is going to be reset by
> > > qemu.
> > > 
> > > Can we make that a ”may”, i.e.:
> > > 
> > > ```
> > > Once the source has finished migration, the device will be suspended via
> > > VHOST_USER_SUSPEND and its rings will be stopped by the source.
> > I'm not sure what "its rings will be stopped by the source" means
> > exactly. Is it summarizing the effect of VHOST_USER_SUSPEND or is there
> > an additional action after VHOST_USER_SUSPEND that stops rings? And I'm
> > not sure whether "by the source" means by the front-end or back-end on
> > the source machine?
> 
> This is pre-existing text and I assumed it (with not doubt) to refer to
> GET_VRING_BASE, because that is how rings are stopped.
> 
> I can improve the existing documentation and add the reference to
> GET_VRING_BASE, and say that it clearly must come from the front-end.

Yes, please.

> 
> > > No further
> > > update must be done before the device and its rings are resumed.
> > "Update" to what? Guest RAM? Device state? Rings?
> > 
> > I feel like this text is too vague for a spec. People may interpret it
> > differently. Can you make rephrase this to more concrete?
> 
> Honestly, no.  This is pre-existing, and I have the same questions as you
> do.
> 
> I cannot “rephrase” this to make it more concrete.  I can try to actually
> specify that was was left unspecified, but that would be a change in
> specification that would require its own patch, separate from this series.

I want to make sure that VHOST_USER_SUSPEND is well-defined. If it's
pre-existing live migration text that doesn't involve
VHOST_USER_SUSPEND, then it would be unfair to ask you to rewrite it.

> 
> Personally, I’ve generally taken this sentence to be fluff.  If the rings
> are stopped, clearly, they should not be accessed at all. Probably the
> back-end should also refrain from writing to guest memory, because that is a
> logical conclusion from having the rings stopped.  But now it’s even
> clearer: The back-end ideally is suspended, which directly means not to
> modify guest memory, and not to perform “background operations”.  Updating
> device state of course is possible through vhost-user commands, because
> those must always be executed.
> 
> So basically it’s just “Device and rings are stopped (RESUME and
> GET_VRING_BASE, resp.), so you know, adhere to that.”
> 
> Or maybe I’m completely wrong and “Once the source has finished migration,
> rings will be stopped by the source. No further update must be done before
> rings are restarted.” is to be taken together and the second sentence just
> refers to the rings, i.e. the front-end stops the rings, and the back-end
> must not update them.  Or it means that the front-end must not send any
> commands to the back-end until restarting the rings, but that feels
> practically impossible.
> 
> Again, because this sentence currently doesn’t specify anything, really,
> changing it to carry any meaning would be to add to the specification, not
> just clarify it.
> 
> > > If and only
> > > if the back-end does not support VHOST_USER_SUSPEND, the front-end may reset
> > > it instead (via VHOST_USER_SET_STATUS, VHOST_USER_RESET_DEVICE, or
> > > VHOST_USER_RESET_OWNER).
> > > ```
> > > 
> > > I’m unsure about the “If and only if” – older qemu versions will break this,
> > > but I feel like we must promise back-end writers that if they implement
> > > SUSPEND, their back-end is not going to be reset; if it is, and something
> > > breaks because of it, it’s the front-end that must be updated to match the
> > > specification.
> > I this the trick is to say "if and only if VHOST_USER_F_SUSPEND has not
> > been negotiated". That way really only new front-ends that support
> > VHOST_USER_SUSPEND are required to use suspend instead of reset and
> > older versions of QEMU will not violate this statement.
> 
> Ah, right, thanks!
> 
> > > > - "before the device and its rings are resumed" via VHOST_USER_RESUME?
> > > >     And is this referring to the source device?
> > > Yes, via VHOST_USER_RESUME, and restarting the rings by starting them (i.e.
> > > a kick).
> > > 
> > > Whether this is referring to the source device…  Well, the text as it was
> > > before begs the exact same question, so honestly, I don’t know for sure.
> > > “Restarting” only makes sense if the rings were stopped before, so I assume
> > > it’s referring to the source, e.g. for the case of a failed migration.
> > > RESUME at least definitely will only happen after a prior SUSPEND, so this
> > > one will definitely only apply on the source side.
> > > 
> > > > Please rephrase the paragraph to identify the vhost-user messages
> > > > involved.
> > > > 
> > > > >    In postcopy migration the back-end is started before all the memory has
> > > > >    been received from the source host, and care must be taken to avoid
> > > > > @@ -885,6 +890,7 @@ Protocol features
> > > > >      #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
> > > > >      #define VHOST_USER_PROTOCOL_F_STATUS               16
> > > > >      #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
> > > > > +  #define VHOST_USER_PROTOCOL_F_SUSPEND              18
> > > > >    Front-end message types
> > > > >    -----------------------
> > > > > @@ -1440,6 +1446,31 @@ Front-end message types
> > > > >      query the back-end for its device status as defined in the Virtio
> > > > >      specification.
> > > > > +``VHOST_USER_SUSPEND``
> > > > > +  :id: 41
> > > > > +  :equivalent ioctl: VHOST_VDPA_SUSPEND
> > > > > +  :request payload: N/A
> > > > > +  :reply payload: N/A
> > > > > +
> > > > > +  When the ``VHOST_USER_PROTOCOL_F_SUSPEND`` protocol feature has been
> > > > > +  successfully negotiated, this message is submitted by the front-end to
> > > > > +  have the back-end cease all operations except for handling vhost-user
> > > > > +  requests.  The back-end must stop processing all virt queues, and it
> > > > > +  must not perform any background operations.  It may not resume until a
> > > > "background operations" are not defined. What does it mean:
> > > > - Anything that writes to memory slots
> > > > - Anything that changes the visible state of the device
> > > > - Anything that changes the non-visible internal state of the device
> > > > - etc
> > > > ?
> > > My best answer (honestly) is: You tell me.  This series is introducing
> > > SUSPEND/RESUME because qemu wants to reset devices to make them stop
> > > “background operations”, and this would break virtiofsd if any form of reset
> > > were actually implemented.  The implementation of SUSPEND/RESUME in
> > > virtiofsd on the other hand is supposed to basically be a no-op (besides
> > > delaying ring processing until a RESUME, but even if we processed them
> > > before, i.e. really make SUSPEND/RESUME no-ops, it would most likely work
> > > out fine), so I have no idea what kind of background operations we are even
> > > talking about, or whether any such actually exist in practice.
> > > 
> > > I don’t know what anyone had in mind when introducing the RESET.  It comes
> > > from vDPA (c3716f260bf moved it from vdpa into the common code), and exists
> > > since the code was originally added (108a64818e6), so there’s no comment
> > > explaining why it exists.  I can’t explain what the back-end is supposed to
> > > stop doing, because so far it isn’t explained anywhere in qemu, its git log,
> > > or in any documentation (there basically is no vdpa documentation).
> > > 
> > > I can only say what I just completely naïvely assumed it to mean so far:
> > > That the back-end basically should stop doing anything besides handling and
> > > replying to vhost-user messages.  If such a message requires changing any
> > > state, visible or not, then this state change is permissible.
> > Okay, I suggest the following instead of "background operations":
> > 
> > - Changes to the device state produced by SET_DEVICE_STATE_FD.
> 
> This is definitely something that I would absolutely allow after SUSPEND. 
> If the front-end does not wish the back-end to do this, it can just not send
> this command while the back-end is suspended.
>
> 
> > - Writing to memory regions.
> > - Signalling that buffers have been used.
> 
> This feels both too tight and not concrete enough.  The only buffers I can
> think of are buffers supplied through virt queues, which I intended to
> already be included in “stop processing all virt queues”.  (I took this to
> mean the used-buffer ring, too, but I can of course be more explicit about
> this, e.g. “stop processing all virt queues, including returning buffers to
> the driver”.) “Signalling” sounds like triggering the callfd, but that also
> seems clear; if you can’t process virt queues, including returning buffers,
> you can never trigger the callfd (or send VHOST_USER_BACKEND_VRING_CALL),
> because there can never be a new buffer returned to the driver.

Yes, I was trying to use the language of the vhost-user spec related to
the callfd. If the back-end has a timer for interrupt mitigation then it
technically isn't doing any virtqueue processing after SUSPEND but might
still signal the callfd when the timer expires.

There is overlap with virtqueue processing, depending on how you define
it. Maybe mentioning this separately is overkill.

> 
> So too tight because it feels like a subset of virt queue processing, but
> not concrete enough because “buffers” makes me feel like I’m overlooking
> something besides virt queues.
> 
> > - Signalling that the configuration space has changed.
> 
> Maybe this could be more general, i.e. the back-end must not send any
> vhost-user messages to the front-end?

Yes, I was thinking solely of VHOST_USER_BACKEND_CONFIG_CHANGE_MSG but
it could include all back-end message types.

[I am reordering your reply, because this might be important before we
continue below...]
> Other than migration, qemu doesn’t see the device state at all.  I don’t see
> why internal state should not change between stop/cont. If a device
> experience some signal that (for some reason) it can’t pause to receive only
> after the subsequent RESUME, it might need to make note of that signal to
> act on it after RESUME.  I would consider that a change in internal state,
> and I don’t immediately see a problem with it.  (It may be problematic when
> migrating, because receiving such signals on the source side after
> transferring the state would mean they’re lost, but again, this is something
> the device clearly has to solve, e.g. by redirecting the signals to the
> destination starting with the SET_TRANSFER_STATE_FD call.)

I only consider device state to be the data that is transferred over the
device state fd. The other state in the back-end, like the pending
signal example you mentioned, is not part of what I call the device
state.

> > The goal is to ensure the device state and memory regions are stable and
> > that back-end doesn't trigger activity in the front-end.
> 
> If the goal is to ensure that the device state is stable, I feel like we
> must then specify precisely this, and not just to say it must not be mutable
> through SET_DEVICE_STATE_FD: In general, the state is more likely to be
> changed by other factors after all.
> 
> On the other hand, I also don’t see why that’s a goal.  For migration, it
> might seem desirable, but I don’t actually think so: The back-end is
> required to send a consistent device state anyway, so it is up to the
> back-end to ensure the state is consistent, we don’t need to make it a
> requirement for SUSPEND.  It is implicitly clear from the migration model
> that if the device state were to change after the back-end has sent it to
> the front-end, those change will be lost on the destination side, so it is
> clear that the back-end must anticipate this and work around it.

I don't follow your last sentence. I agree that device state changes
after the device state fd has been read would be lost on the
destination. But that does not mean that it is valid to modify the
device state while reading the device state fd (there is no iterative
migration support yet).

Are you assuming that SET_DEVICE_STATE_FD takes a snapshot of the device
state and reading the fd reads the (consistent) snapshot?

I worry that it may not be possible for all implementations to take a
snapshot of the device state and then continue modifying the actual
device state. This could be due to resource limitations or due to
underlying API limitations. It would be easier if the spec forbid
modifying the device state while the device was suspended.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2023-07-27 21:48 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-11 15:52 [PATCH 0/6] vhost-user: Add suspend/resume Hanna Czenczek
2023-07-11 15:52 ` [PATCH 1/6] vhost-user.rst: " Hanna Czenczek
2023-07-18 14:25   ` Stefan Hajnoczi
2023-07-19 13:59     ` Hanna Czenczek
2023-07-24 17:55       ` Stefan Hajnoczi
2023-07-25  8:30         ` Hanna Czenczek
2023-07-27 21:12           ` Stefan Hajnoczi
2023-07-11 15:52 ` [PATCH 2/6] vhost-vdpa: Move vhost_vdpa_reset_status() up Hanna Czenczek
2023-07-18 14:29   ` Stefan Hajnoczi
2023-07-11 15:52 ` [PATCH 3/6] vhost: Do not reset suspended devices on stop Hanna Czenczek
2023-07-18 14:33   ` Stefan Hajnoczi
2023-07-21 15:25   ` Eugenio Perez Martin
2023-07-21 16:07     ` Hanna Czenczek
2023-07-24 15:48       ` Eugenio Perez Martin
2023-07-25  7:53         ` Hanna Czenczek
2023-07-25 10:03           ` Eugenio Perez Martin
2023-07-25 13:09             ` Hanna Czenczek
2023-07-25 18:53               ` Eugenio Perez Martin
2023-07-26  6:57                 ` Hanna Czenczek
2023-07-27 12:49                   ` Eugenio Perez Martin
2023-07-27 20:26                     ` Stefan Hajnoczi
2023-07-11 15:52 ` [PATCH 4/6] vhost-user: Implement suspend/resume Hanna Czenczek
2023-07-18 14:37   ` Stefan Hajnoczi
2023-07-11 15:52 ` [PATCH 5/6] vhost-vdpa: Match vhost-user's status reset Hanna Czenczek
2023-07-18 14:50   ` Stefan Hajnoczi
2023-07-19 14:09     ` Hanna Czenczek
2023-07-19 15:06       ` Stefan Hajnoczi
2023-07-21 15:47       ` Eugenio Perez Martin
2023-07-11 15:52 ` [PATCH 6/6] vhost-user: Have reset_status fall back to reset Hanna Czenczek
2023-07-18 15:10   ` Stefan Hajnoczi
2023-07-19 14:11     ` Hanna Czenczek
2023-07-19 14:27       ` Hanna Czenczek
2023-07-20 16:03         ` Stefan Hajnoczi
2023-07-21 14:16           ` Hanna Czenczek
2023-07-24 18:04             ` Stefan Hajnoczi
2023-07-25  8:39               ` Hanna Czenczek
2023-07-18 15:14 ` [PATCH 0/6] vhost-user: Add suspend/resume Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.