* [PATCH v2 0/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously
@ 2023-08-30 13:40 Laszlo Ersek
  2023-08-30 13:40 ` [PATCH v2 1/7] vhost-user: strip superfluous whitespace Laszlo Ersek
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Laszlo Ersek @ 2023-08-30 13:40 UTC (permalink / raw)
  To: qemu-devel, lersek
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella

v1:

- http://mid.mail-archive.com/20230827182937.146450-1-lersek@redhat.com
- https://patchwork.ozlabs.org/project/qemu-devel/cover/20230827182937.146450-1-lersek@redhat.com/

v2 picks up tags from Phil and Stefano, and addresses feedback from
Stefano. Please see the Notes section on each patch, for the v2 changes.

I've not CC'd the stable list, as we've not figured out what prior
releases to target. Applying the series to 8.1 is easy; to 8.0 -- not so
much.

Retested.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>

Thanks,
Laszlo

Laszlo Ersek (7):
  vhost-user: strip superfluous whitespace
  vhost-user: tighten "reply_supported" scope in "set_vring_addr"
  vhost-user: factor out "vhost_user_write_sync"
  vhost-user: flatten "enforce_reply" into "vhost_user_write_sync"
  vhost-user: hoist "write_sync", "get_features", "get_u64"
  vhost-user: allow "vhost_set_vring" to wait for a reply
  vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously

 hw/virtio/vhost-user.c | 216 ++++++++++----------
 1 file changed, 108 insertions(+), 108 deletions(-)


base-commit: 813bac3d8d70d85cb7835f7945eb9eed84c2d8d0

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/7] vhost-user: strip superfluous whitespace
  2023-08-30 13:40 [PATCH v2 0/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
@ 2023-08-30 13:40 ` Laszlo Ersek
  2023-08-31  7:13   ` Laszlo Ersek
  2023-08-30 13:40 ` [PATCH v2 2/7] vhost-user: tighten "reply_supported" scope in "set_vring_addr" Laszlo Ersek
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Laszlo Ersek @ 2023-08-30 13:40 UTC (permalink / raw)
  To: qemu-devel, lersek
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---

Notes:
    v2:
    
    - pick up Stefano's R-b

 hw/virtio/vhost-user.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8dcf049d422b..b4b677c1ce66 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -398,7 +398,7 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
      * operations such as configuring device memory mappings or issuing device
      * resets, which affect the whole device instead of individual VQs,
      * vhost-user messages should only be sent once.
-     * 
+     *
      * Devices with multiple vhost_devs are given an associated dev->vq_index
      * so per_device requests are only sent if vq_index is 0.
      */


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/7] vhost-user: tighten "reply_supported" scope in "set_vring_addr"
  2023-08-30 13:40 [PATCH v2 0/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
  2023-08-30 13:40 ` [PATCH v2 1/7] vhost-user: strip superfluous whitespace Laszlo Ersek
@ 2023-08-30 13:40 ` Laszlo Ersek
  2023-08-31  7:14   ` Laszlo Ersek
  2023-08-30 13:40 ` [PATCH v2 3/7] vhost-user: factor out "vhost_user_write_sync" Laszlo Ersek
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Laszlo Ersek @ 2023-08-30 13:40 UTC (permalink / raw)
  To: qemu-devel, lersek
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella

In the vhost_user_set_vring_addr() function, we calculate
"reply_supported" unconditionally, even though we'll only need it if
"wait_for_reply" is also true.

Restrict the scope of "reply_supported" to the minimum.

This is purely refactoring -- no observable change.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---

Notes:
    v2:
    
    - pick up Stefano's R-b

 hw/virtio/vhost-user.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index b4b677c1ce66..64eac317bfb2 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1331,17 +1331,18 @@ static int vhost_user_set_vring_addr(struct vhost_dev *dev,
         .hdr.size = sizeof(msg.payload.addr),
     };
 
-    bool reply_supported = virtio_has_feature(dev->protocol_features,
-                                              VHOST_USER_PROTOCOL_F_REPLY_ACK);
-
     /*
      * wait for a reply if logging is enabled to make sure
      * backend is actually logging changes
      */
     bool wait_for_reply = addr->flags & (1 << VHOST_VRING_F_LOG);
 
-    if (reply_supported && wait_for_reply) {
-        msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
+    if (wait_for_reply) {
+        bool reply_supported = virtio_has_feature(dev->protocol_features,
+                                          VHOST_USER_PROTOCOL_F_REPLY_ACK);
+        if (reply_supported) {
+            msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
+        }
     }
 
     ret = vhost_user_write(dev, &msg, NULL, 0);


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/7] vhost-user: factor out "vhost_user_write_sync"
  2023-08-30 13:40 [PATCH v2 0/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
  2023-08-30 13:40 ` [PATCH v2 1/7] vhost-user: strip superfluous whitespace Laszlo Ersek
  2023-08-30 13:40 ` [PATCH v2 2/7] vhost-user: tighten "reply_supported" scope in "set_vring_addr" Laszlo Ersek
@ 2023-08-30 13:40 ` Laszlo Ersek
  2023-08-30 13:40 ` [PATCH v2 4/7] vhost-user: flatten "enforce_reply" into "vhost_user_write_sync" Laszlo Ersek
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Laszlo Ersek @ 2023-08-30 13:40 UTC (permalink / raw)
  To: qemu-devel, lersek
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella

The tails of the "vhost_user_set_vring_addr" and "vhost_user_set_u64"
functions are now byte-for-byte identical. Factor the common tail out to a
new function called "vhost_user_write_sync".

This is purely refactoring -- no observable change.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---

Notes:
    v2:
    
    - pick up R-b's from Phil and Stefano
    
    - rename "vhost_user_write_msg" to "vhost_user_write_sync" (in code and
      commit message) [Stefano]

 hw/virtio/vhost-user.c | 66 +++++++++-----------
 1 file changed, 28 insertions(+), 38 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 64eac317bfb2..1476b36f0a6e 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1320,10 +1320,35 @@ static int enforce_reply(struct vhost_dev *dev,
     return vhost_user_get_features(dev, &dummy);
 }
 
+/* Note: "msg->hdr.flags" may be modified. */
+static int vhost_user_write_sync(struct vhost_dev *dev, VhostUserMsg *msg,
+                                 bool wait_for_reply)
+{
+    int ret;
+
+    if (wait_for_reply) {
+        bool reply_supported = virtio_has_feature(dev->protocol_features,
+                                          VHOST_USER_PROTOCOL_F_REPLY_ACK);
+        if (reply_supported) {
+            msg->hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
+        }
+    }
+
+    ret = vhost_user_write(dev, msg, NULL, 0);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (wait_for_reply) {
+        return enforce_reply(dev, msg);
+    }
+
+    return 0;
+}
+
 static int vhost_user_set_vring_addr(struct vhost_dev *dev,
                                      struct vhost_vring_addr *addr)
 {
-    int ret;
     VhostUserMsg msg = {
         .hdr.request = VHOST_USER_SET_VRING_ADDR,
         .hdr.flags = VHOST_USER_VERSION,
@@ -1337,24 +1362,7 @@ static int vhost_user_set_vring_addr(struct vhost_dev *dev,
      */
     bool wait_for_reply = addr->flags & (1 << VHOST_VRING_F_LOG);
 
-    if (wait_for_reply) {
-        bool reply_supported = virtio_has_feature(dev->protocol_features,
-                                          VHOST_USER_PROTOCOL_F_REPLY_ACK);
-        if (reply_supported) {
-            msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
-        }
-    }
-
-    ret = vhost_user_write(dev, &msg, NULL, 0);
-    if (ret < 0) {
-        return ret;
-    }
-
-    if (wait_for_reply) {
-        return enforce_reply(dev, &msg);
-    }
-
-    return 0;
+    return vhost_user_write_sync(dev, &msg, wait_for_reply);
 }
 
 static int vhost_user_set_u64(struct vhost_dev *dev, int request, uint64_t u64,
@@ -1366,26 +1374,8 @@ static int vhost_user_set_u64(struct vhost_dev *dev, int request, uint64_t u64,
         .payload.u64 = u64,
         .hdr.size = sizeof(msg.payload.u64),
     };
-    int ret;
 
-    if (wait_for_reply) {
-        bool reply_supported = virtio_has_feature(dev->protocol_features,
-                                          VHOST_USER_PROTOCOL_F_REPLY_ACK);
-        if (reply_supported) {
-            msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
-        }
-    }
-
-    ret = vhost_user_write(dev, &msg, NULL, 0);
-    if (ret < 0) {
-        return ret;
-    }
-
-    if (wait_for_reply) {
-        return enforce_reply(dev, &msg);
-    }
-
-    return 0;
+    return vhost_user_write_sync(dev, &msg, wait_for_reply);
 }
 
 static int vhost_user_set_status(struct vhost_dev *dev, uint8_t status)


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 4/7] vhost-user: flatten "enforce_reply" into "vhost_user_write_sync"
  2023-08-30 13:40 [PATCH v2 0/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
                   ` (2 preceding siblings ...)
  2023-08-30 13:40 ` [PATCH v2 3/7] vhost-user: factor out "vhost_user_write_sync" Laszlo Ersek
@ 2023-08-30 13:40 ` Laszlo Ersek
  2023-08-30 13:40 ` [PATCH v2 5/7] vhost-user: hoist "write_sync", "get_features", "get_u64" Laszlo Ersek
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Laszlo Ersek @ 2023-08-30 13:40 UTC (permalink / raw)
  To: qemu-devel, lersek
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella

At this point, only "vhost_user_write_sync" calls "enforce_reply"; embed
the latter into the former.

This is purely refactoring -- no observable change.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---

Notes:
    v2:
    
    - pick up R-b's from Phil and Stefano
    
    - rename "vhost_user_write_msg" to "vhost_user_write_sync" (in code
      context and commit message) [Stefano]

 hw/virtio/vhost-user.c | 32 ++++++++------------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 1476b36f0a6e..4129ba72e408 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1302,24 +1302,6 @@ static int vhost_user_get_features(struct vhost_dev *dev, uint64_t *features)
     return 0;
 }
 
-static int enforce_reply(struct vhost_dev *dev,
-                         const VhostUserMsg *msg)
-{
-    uint64_t dummy;
-
-    if (msg->hdr.flags & VHOST_USER_NEED_REPLY_MASK) {
-        return process_message_reply(dev, msg);
-    }
-
-   /*
-    * We need to wait for a reply but the backend does not
-    * support replies for the command we just sent.
-    * Send VHOST_USER_GET_FEATURES which makes all backends
-    * send a reply.
-    */
-    return vhost_user_get_features(dev, &dummy);
-}
-
 /* Note: "msg->hdr.flags" may be modified. */
 static int vhost_user_write_sync(struct vhost_dev *dev, VhostUserMsg *msg,
                                  bool wait_for_reply)
@@ -1340,7 +1322,19 @@ static int vhost_user_write_sync(struct vhost_dev *dev, VhostUserMsg *msg,
     }
 
     if (wait_for_reply) {
-        return enforce_reply(dev, msg);
+        uint64_t dummy;
+
+        if (msg->hdr.flags & VHOST_USER_NEED_REPLY_MASK) {
+            return process_message_reply(dev, msg);
+        }
+
+       /*
+        * We need to wait for a reply but the backend does not
+        * support replies for the command we just sent.
+        * Send VHOST_USER_GET_FEATURES which makes all backends
+        * send a reply.
+        */
+        return vhost_user_get_features(dev, &dummy);
     }
 
     return 0;


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 5/7] vhost-user: hoist "write_sync", "get_features", "get_u64"
  2023-08-30 13:40 [PATCH v2 0/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
                   ` (3 preceding siblings ...)
  2023-08-30 13:40 ` [PATCH v2 4/7] vhost-user: flatten "enforce_reply" into "vhost_user_write_sync" Laszlo Ersek
@ 2023-08-30 13:40 ` Laszlo Ersek
  2023-08-31  7:20   ` Laszlo Ersek
  2023-08-30 13:40 ` [PATCH v2 6/7] vhost-user: allow "vhost_set_vring" to wait for a reply Laszlo Ersek
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Laszlo Ersek @ 2023-08-30 13:40 UTC (permalink / raw)
  To: qemu-devel, lersek
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella

In order to avoid a forward-declaration for "vhost_user_write_sync" in a
subsequent patch, hoist "vhost_user_write_sync" ->
"vhost_user_get_features" -> "vhost_user_get_u64" just above
"vhost_set_vring".

This is purely code movement -- no observable change.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---

Notes:
    v2:
    
    - pick up R-b from Stefano
    
    - rename "vhost_user_write_msg" to "vhost_user_write_sync" (in code and
      commit message) [Stefano]

 hw/virtio/vhost-user.c | 170 ++++++++++----------
 1 file changed, 85 insertions(+), 85 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 4129ba72e408..c79b6f77cdca 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1083,6 +1083,91 @@ static int vhost_user_set_vring_endian(struct vhost_dev *dev,
     return vhost_user_write(dev, &msg, NULL, 0);
 }
 
+static int vhost_user_get_u64(struct vhost_dev *dev, int request, uint64_t *u64)
+{
+    int ret;
+    VhostUserMsg msg = {
+        .hdr.request = request,
+        .hdr.flags = VHOST_USER_VERSION,
+    };
+
+    if (vhost_user_per_device_request(request) && dev->vq_index != 0) {
+        return 0;
+    }
+
+    ret = vhost_user_write(dev, &msg, NULL, 0);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = vhost_user_read(dev, &msg);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (msg.hdr.request != request) {
+        error_report("Received unexpected msg type. Expected %d received %d",
+                     request, msg.hdr.request);
+        return -EPROTO;
+    }
+
+    if (msg.hdr.size != sizeof(msg.payload.u64)) {
+        error_report("Received bad msg size.");
+        return -EPROTO;
+    }
+
+    *u64 = msg.payload.u64;
+
+    return 0;
+}
+
+static int vhost_user_get_features(struct vhost_dev *dev, uint64_t *features)
+{
+    if (vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features) < 0) {
+        return -EPROTO;
+    }
+
+    return 0;
+}
+
+/* Note: "msg->hdr.flags" may be modified. */
+static int vhost_user_write_sync(struct vhost_dev *dev, VhostUserMsg *msg,
+                                 bool wait_for_reply)
+{
+    int ret;
+
+    if (wait_for_reply) {
+        bool reply_supported = virtio_has_feature(dev->protocol_features,
+                                          VHOST_USER_PROTOCOL_F_REPLY_ACK);
+        if (reply_supported) {
+            msg->hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
+        }
+    }
+
+    ret = vhost_user_write(dev, msg, NULL, 0);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (wait_for_reply) {
+        uint64_t dummy;
+
+        if (msg->hdr.flags & VHOST_USER_NEED_REPLY_MASK) {
+            return process_message_reply(dev, msg);
+        }
+
+       /*
+        * We need to wait for a reply but the backend does not
+        * support replies for the command we just sent.
+        * Send VHOST_USER_GET_FEATURES which makes all backends
+        * send a reply.
+        */
+        return vhost_user_get_features(dev, &dummy);
+    }
+
+    return 0;
+}
+
 static int vhost_set_vring(struct vhost_dev *dev,
                            unsigned long int request,
                            struct vhost_vring_state *ring)
@@ -1255,91 +1340,6 @@ static int vhost_user_set_vring_err(struct vhost_dev *dev,
     return vhost_set_vring_file(dev, VHOST_USER_SET_VRING_ERR, file);
 }
 
-static int vhost_user_get_u64(struct vhost_dev *dev, int request, uint64_t *u64)
-{
-    int ret;
-    VhostUserMsg msg = {
-        .hdr.request = request,
-        .hdr.flags = VHOST_USER_VERSION,
-    };
-
-    if (vhost_user_per_device_request(request) && dev->vq_index != 0) {
-        return 0;
-    }
-
-    ret = vhost_user_write(dev, &msg, NULL, 0);
-    if (ret < 0) {
-        return ret;
-    }
-
-    ret = vhost_user_read(dev, &msg);
-    if (ret < 0) {
-        return ret;
-    }
-
-    if (msg.hdr.request != request) {
-        error_report("Received unexpected msg type. Expected %d received %d",
-                     request, msg.hdr.request);
-        return -EPROTO;
-    }
-
-    if (msg.hdr.size != sizeof(msg.payload.u64)) {
-        error_report("Received bad msg size.");
-        return -EPROTO;
-    }
-
-    *u64 = msg.payload.u64;
-
-    return 0;
-}
-
-static int vhost_user_get_features(struct vhost_dev *dev, uint64_t *features)
-{
-    if (vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features) < 0) {
-        return -EPROTO;
-    }
-
-    return 0;
-}
-
-/* Note: "msg->hdr.flags" may be modified. */
-static int vhost_user_write_sync(struct vhost_dev *dev, VhostUserMsg *msg,
-                                 bool wait_for_reply)
-{
-    int ret;
-
-    if (wait_for_reply) {
-        bool reply_supported = virtio_has_feature(dev->protocol_features,
-                                          VHOST_USER_PROTOCOL_F_REPLY_ACK);
-        if (reply_supported) {
-            msg->hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
-        }
-    }
-
-    ret = vhost_user_write(dev, msg, NULL, 0);
-    if (ret < 0) {
-        return ret;
-    }
-
-    if (wait_for_reply) {
-        uint64_t dummy;
-
-        if (msg->hdr.flags & VHOST_USER_NEED_REPLY_MASK) {
-            return process_message_reply(dev, msg);
-        }
-
-       /*
-        * We need to wait for a reply but the backend does not
-        * support replies for the command we just sent.
-        * Send VHOST_USER_GET_FEATURES which makes all backends
-        * send a reply.
-        */
-        return vhost_user_get_features(dev, &dummy);
-    }
-
-    return 0;
-}
-
 static int vhost_user_set_vring_addr(struct vhost_dev *dev,
                                      struct vhost_vring_addr *addr)
 {


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 6/7] vhost-user: allow "vhost_set_vring" to wait for a reply
  2023-08-30 13:40 [PATCH v2 0/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
                   ` (4 preceding siblings ...)
  2023-08-30 13:40 ` [PATCH v2 5/7] vhost-user: hoist "write_sync", "get_features", "get_u64" Laszlo Ersek
@ 2023-08-30 13:40 ` Laszlo Ersek
  2023-08-30 13:40 ` [PATCH v2 7/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
  2023-09-07 16:01 ` [PATCH v2 0/7] " Eugenio Perez Martin
  7 siblings, 0 replies; 18+ messages in thread
From: Laszlo Ersek @ 2023-08-30 13:40 UTC (permalink / raw)
  To: qemu-devel, lersek
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella

The "vhost_set_vring" function already centralizes the common parts of
"vhost_user_set_vring_num", "vhost_user_set_vring_base" and
"vhost_user_set_vring_enable". We'll want to allow some of those callers
to wait for a reply.

Therefore, rebase "vhost_set_vring" from just "vhost_user_write" to
"vhost_user_write_sync", exposing the "wait_for_reply" parameter.

This is purely refactoring -- there is no observable change. That's
because:

- all three callers pass in "false" for "wait_for_reply", which disables
  all logic in "vhost_user_write_sync" except the call to
  "vhost_user_write";

- the fds=NULL and fd_num=0 arguments of the original "vhost_user_write"
  call inside "vhost_set_vring" are hard-coded within
  "vhost_user_write_sync".

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---

Notes:
    v2:
    
    - pick up R-b's from Phil and Stefano
    
    - rename "vhost_user_write_msg" to "vhost_user_write_sync" (in code and
      commit message) [Stefano]

 hw/virtio/vhost-user.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index c79b6f77cdca..18e15a9bb359 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1170,7 +1170,8 @@ static int vhost_user_write_sync(struct vhost_dev *dev, VhostUserMsg *msg,
 
 static int vhost_set_vring(struct vhost_dev *dev,
                            unsigned long int request,
-                           struct vhost_vring_state *ring)
+                           struct vhost_vring_state *ring,
+                           bool wait_for_reply)
 {
     VhostUserMsg msg = {
         .hdr.request = request,
@@ -1179,13 +1180,13 @@ static int vhost_set_vring(struct vhost_dev *dev,
         .hdr.size = sizeof(msg.payload.state),
     };
 
-    return vhost_user_write(dev, &msg, NULL, 0);
+    return vhost_user_write_sync(dev, &msg, wait_for_reply);
 }
 
 static int vhost_user_set_vring_num(struct vhost_dev *dev,
                                     struct vhost_vring_state *ring)
 {
-    return vhost_set_vring(dev, VHOST_USER_SET_VRING_NUM, ring);
+    return vhost_set_vring(dev, VHOST_USER_SET_VRING_NUM, ring, false);
 }
 
 static void vhost_user_host_notifier_free(VhostUserHostNotifier *n)
@@ -1216,7 +1217,7 @@ static void vhost_user_host_notifier_remove(VhostUserHostNotifier *n,
 static int vhost_user_set_vring_base(struct vhost_dev *dev,
                                      struct vhost_vring_state *ring)
 {
-    return vhost_set_vring(dev, VHOST_USER_SET_VRING_BASE, ring);
+    return vhost_set_vring(dev, VHOST_USER_SET_VRING_BASE, ring, false);
 }
 
 static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable)
@@ -1234,7 +1235,7 @@ static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable)
             .num   = enable,
         };
 
-        ret = vhost_set_vring(dev, VHOST_USER_SET_VRING_ENABLE, &state);
+        ret = vhost_set_vring(dev, VHOST_USER_SET_VRING_ENABLE, &state, false);
         if (ret < 0) {
             /*
              * Restoring the previous state is likely infeasible, as well as


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 7/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously
  2023-08-30 13:40 [PATCH v2 0/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
                   ` (5 preceding siblings ...)
  2023-08-30 13:40 ` [PATCH v2 6/7] vhost-user: allow "vhost_set_vring" to wait for a reply Laszlo Ersek
@ 2023-08-30 13:40 ` Laszlo Ersek
  2023-09-07 15:59   ` Eugenio Perez Martin
  2023-09-07 16:01 ` [PATCH v2 0/7] " Eugenio Perez Martin
  7 siblings, 1 reply; 18+ messages in thread
From: Laszlo Ersek @ 2023-08-30 13:40 UTC (permalink / raw)
  To: qemu-devel, lersek
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella

(1) The virtio-1.2 specification
<http://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html> writes:

> 3     General Initialization And Device Operation
> 3.1   Device Initialization
> 3.1.1 Driver Requirements: Device Initialization
>
> [...]
>
> 7. Perform device-specific setup, including discovery of virtqueues for
>    the device, optional per-bus setup, reading and possibly writing the
>    device’s virtio configuration space, and population of virtqueues.
>
> 8. Set the DRIVER_OK status bit. At this point the device is “live”.

and

> 4         Virtio Transport Options
> 4.1       Virtio Over PCI Bus
> 4.1.4     Virtio Structure PCI Capabilities
> 4.1.4.3   Common configuration structure layout
> 4.1.4.3.2 Driver Requirements: Common configuration structure layout
>
> [...]
>
> The driver MUST configure the other virtqueue fields before enabling the
> virtqueue with queue_enable.
>
> [...]

(The same statements are present in virtio-1.0 identically, at
<http://docs.oasis-open.org/virtio/virtio/v1.0/virtio-v1.0.html>.)

These together mean that the following sub-sequence of steps is valid for
a virtio-1.0 guest driver:

(1.1) set "queue_enable" for the needed queues as the final part of device
initialization step (7),

(1.2) set DRIVER_OK in step (8),

(1.3) immediately start sending virtio requests to the device.

(2) When vhost-user is enabled, and the VHOST_USER_F_PROTOCOL_FEATURES
special virtio feature is negotiated, then virtio rings start in disabled
state, according to
<https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#ring-states>.
In this case, explicit VHOST_USER_SET_VRING_ENABLE messages are needed for
enabling vrings.

Therefore setting "queue_enable" from the guest (1.1) is a *control plane*
operation, which travels from the guest through QEMU to the vhost-user
backend, using a unix domain socket.

Whereas sending a virtio request (1.3) is a *data plane* operation, which
evades QEMU -- it travels from guest to the vhost-user backend via
eventfd.

This means that steps (1.1) and (1.3) travel through different channels,
and their relative order can be reversed, as perceived by the vhost-user
backend.

That's exactly what happens when OVMF's virtiofs driver (VirtioFsDxe) runs
against the Rust-language virtiofsd version 1.7.2. (Which uses version
0.10.1 of the vhost-user-backend crate, and version 0.8.1 of the vhost
crate.)

Namely, when VirtioFsDxe binds a virtiofs device, it goes through the
device initialization steps (i.e., control plane operations), and
immediately sends a FUSE_INIT request too (i.e., performs a data plane
operation). In the Rust-language virtiofsd, this creates a race between
two components that run *concurrently*, i.e., in different threads or
processes:

- Control plane, handling vhost-user protocol messages:

  The "VhostUserSlaveReqHandlerMut::set_vring_enable" method
  [crates/vhost-user-backend/src/handler.rs] handles
  VHOST_USER_SET_VRING_ENABLE messages, and updates each vring's "enabled"
  flag according to the message processed.

- Data plane, handling virtio / FUSE requests:

  The "VringEpollHandler::handle_event" method
  [crates/vhost-user-backend/src/event_loop.rs] handles the incoming
  virtio / FUSE request, consuming the virtio kick at the same time. If
  the vring's "enabled" flag is set, the virtio / FUSE request is
  processed genuinely. If the vring's "enabled" flag is clear, then the
  virtio / FUSE request is discarded.

Note that OVMF enables the queue *first*, and sends FUSE_INIT *second*.
However, if the data plane processor in virtiofsd wins the race, then it
sees the FUSE_INIT *before* the control plane processor took notice of
VHOST_USER_SET_VRING_ENABLE and green-lit the queue for the data plane
processor. Therefore the latter drops FUSE_INIT on the floor, and goes
back to waiting for further virtio / FUSE requests with epoll_wait.
Meanwhile OVMF is stuck waiting for the FUSE_INIT response -- a deadlock.
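
(Illustration: below is a minimal, self-contained toy model of the race,
built with "-pthread". It is NOT the real virtiofsd or QEMU code; all the
names in it -- "control_plane", "data_plane", "enabled", "request_pending"
-- are made up for the sketch. The control plane thread stands in for the
SET_VRING_ENABLE handler, the data plane thread for the kick handler that
checks the vring's "enabled" flag.)

  #include <pthread.h>
  #include <stdbool.h>
  #include <stdio.h>
  #include <unistd.h>

  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  static bool enabled;         /* set by the control plane thread */
  static bool request_pending; /* set by the guest's kick */

  static void *control_plane(void *arg)
  {
      usleep(1000); /* processing delay -- the control plane loses the race */
      pthread_mutex_lock(&lock);
      enabled = true; /* VHOST_USER_SET_VRING_ENABLE finally handled */
      pthread_mutex_unlock(&lock);
      return NULL;
  }

  static void *data_plane(void *arg)
  {
      pthread_mutex_lock(&lock);
      if (request_pending && !enabled) {
          printf("request dropped: ring still looks disabled\n");
      } else if (request_pending) {
          printf("request processed\n");
      }
      pthread_mutex_unlock(&lock);
      return NULL;
  }

  int main(void)
  {
      pthread_t cp, dp;

      /* Guest: enable the queue (control plane), then kick (data plane). */
      pthread_create(&cp, NULL, control_plane, NULL);
      request_pending = true; /* FUSE_INIT submitted via eventfd */
      pthread_create(&dp, NULL, data_plane, NULL);

      pthread_join(cp, NULL);
      pthread_join(dp, NULL);
      return 0;
  }

In the real system, the guest then waits forever for a FUSE_INIT response
that will never arrive.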

The deadlock is not deterministic. OVMF hangs infrequently during first
boot. However, OVMF almost always hangs during reboots from the UEFI
shell.

The race can be "reliably masked" by inserting a very small delay -- a
single debug message -- at the top of "VringEpollHandler::handle_event",
i.e., just before the data plane processor checks the "enabled" field of
the vring. That delay suffices for the control plane processor to act upon
VHOST_USER_SET_VRING_ENABLE.

We can deterministically prevent the race in QEMU, by blocking OVMF inside
step (1.1) -- i.e., in the write to the "queue_enable" register -- until
VHOST_USER_SET_VRING_ENABLE actually *completes*. That way OVMF's VCPU
cannot advance to the FUSE_INIT submission before virtiofsd's control
plane processor takes notice of the queue being enabled.

Wait for VHOST_USER_SET_VRING_ENABLE completion by:

- setting the NEED_REPLY flag on VHOST_USER_SET_VRING_ENABLE, and waiting
  for the reply, if the VHOST_USER_PROTOCOL_F_REPLY_ACK vhost-user feature
  has been negotiated, or

- performing a separate VHOST_USER_GET_FEATURES *exchange*, which requires
  a backend response regardless of VHOST_USER_PROTOCOL_F_REPLY_ACK.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Cc: German Maglione <gmaglione@redhat.com>
Cc: Liu Jiang <gerry@linux.alibaba.com>
Cc: Sergio Lopez Pascual <slp@redhat.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---

Notes:
    v2:
    
    - pick up R-b from Stefano
    
    - update virtio spec reference from 1.0 to 1.2 (also keep the 1.0 ref)
      in the commit message; re-check the quotes / section headers [Stefano]
    
    - summarize commit message in code comment [Stefano]

 hw/virtio/vhost-user.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 18e15a9bb359..41842eb023b5 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -1235,7 +1235,21 @@ static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable)
             .num   = enable,
         };
 
-        ret = vhost_set_vring(dev, VHOST_USER_SET_VRING_ENABLE, &state, false);
+        /*
+         * SET_VRING_ENABLE travels from guest to QEMU to vhost-user backend /
+         * control plane thread via unix domain socket. Virtio requests travel
+         * from guest to vhost-user backend / data plane thread via eventfd.
+         * Even if the guest enables the ring first, and pushes its first virtio
+         * request second (conforming to the virtio spec), the data plane thread
+         * in the backend may see the virtio request before the control plane
+         * thread sees the queue enablement. This causes (in fact, requires) the
+         * data plane thread to discard the virtio request (it arrived on a
+         * seemingly disabled queue). To prevent this out-of-order delivery,
+         * don't let the guest proceed to pushing the virtio request until the
+         * backend control plane acknowledges enabling the queue -- IOW, pass
+         * wait_for_reply=true below.
+         */
+        ret = vhost_set_vring(dev, VHOST_USER_SET_VRING_ENABLE, &state, true);
         if (ret < 0) {
             /*
              * Restoring the previous state is likely infeasible, as well as

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/7] vhost-user: strip superfluous whitespace
  2023-08-30 13:40 ` [PATCH v2 1/7] vhost-user: strip superfluous whitespace Laszlo Ersek
@ 2023-08-31  7:13   ` Laszlo Ersek
  2023-09-06  7:12     ` Albert Esteve
  0 siblings, 1 reply; 18+ messages in thread
From: Laszlo Ersek @ 2023-08-31  7:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella,
	Philippe Mathieu-Daudé

On 8/30/23 15:40, Laszlo Ersek wrote:
> Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
> Cc: Eugenio Perez Martin <eperezma@redhat.com>
> Cc: German Maglione <gmaglione@redhat.com>
> Cc: Liu Jiang <gerry@linux.alibaba.com>
> Cc: Sergio Lopez Pascual <slp@redhat.com>
> Cc: Stefano Garzarella <sgarzare@redhat.com>
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> 
> Notes:
>     v2:
>     
>     - pick up Stefano's R-b
> 
>  hw/virtio/vhost-user.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

This has been

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

under the (identical) v1 posting:

http://mid.mail-archive.com/cd0604a1-6826-ac6c-6c47-dcb6def64b28@linaro.org

Thanks, Phil! (and sorry that I posted v2 too quickly -- I forgot that
sometimes reviewers split a review over multiple days.)

Laszlo

> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 8dcf049d422b..b4b677c1ce66 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -398,7 +398,7 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
>       * operations such as configuring device memory mappings or issuing device
>       * resets, which affect the whole device instead of individual VQs,
>       * vhost-user messages should only be sent once.
> -     * 
> +     *
>       * Devices with multiple vhost_devs are given an associated dev->vq_index
>       * so per_device requests are only sent if vq_index is 0.
>       */
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/7] vhost-user: tighten "reply_supported" scope in "set_vring_addr"
  2023-08-30 13:40 ` [PATCH v2 2/7] vhost-user: tighten "reply_supported" scope in "set_vring_addr" Laszlo Ersek
@ 2023-08-31  7:14   ` Laszlo Ersek
  0 siblings, 0 replies; 18+ messages in thread
From: Laszlo Ersek @ 2023-08-31  7:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella

On 8/30/23 15:40, Laszlo Ersek wrote:
> In the vhost_user_set_vring_addr() function, we calculate
> "reply_supported" unconditionally, even though we'll only need it if
> "wait_for_reply" is also true.
> 
> Restrict the scope of "reply_supported" to the minimum.
> 
> This is purely refactoring -- no observable change.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
> Cc: Eugenio Perez Martin <eperezma@redhat.com>
> Cc: German Maglione <gmaglione@redhat.com>
> Cc: Liu Jiang <gerry@linux.alibaba.com>
> Cc: Sergio Lopez Pascual <slp@redhat.com>
> Cc: Stefano Garzarella <sgarzare@redhat.com>
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> 
> Notes:
>     v2:
>     
>     - pick up Stefano's R-b
> 
>  hw/virtio/vhost-user.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)

identical to v1, so:

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

from
<http://mid.mail-archive.com/6c12069e-da31-9758-4972-7121ab5ffdee@linaro.org>.

Laszlo

> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index b4b677c1ce66..64eac317bfb2 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -1331,17 +1331,18 @@ static int vhost_user_set_vring_addr(struct vhost_dev *dev,
>          .hdr.size = sizeof(msg.payload.addr),
>      };
>  
> -    bool reply_supported = virtio_has_feature(dev->protocol_features,
> -                                              VHOST_USER_PROTOCOL_F_REPLY_ACK);
> -
>      /*
>       * wait for a reply if logging is enabled to make sure
>       * backend is actually logging changes
>       */
>      bool wait_for_reply = addr->flags & (1 << VHOST_VRING_F_LOG);
>  
> -    if (reply_supported && wait_for_reply) {
> -        msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
> +    if (wait_for_reply) {
> +        bool reply_supported = virtio_has_feature(dev->protocol_features,
> +                                          VHOST_USER_PROTOCOL_F_REPLY_ACK);
> +        if (reply_supported) {
> +            msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
> +        }
>      }
>  
>      ret = vhost_user_write(dev, &msg, NULL, 0);
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/7] vhost-user: hoist "write_sync", "get_features", "get_u64"
  2023-08-30 13:40 ` [PATCH v2 5/7] vhost-user: hoist "write_sync", "get_features", "get_u64" Laszlo Ersek
@ 2023-08-31  7:20   ` Laszlo Ersek
  2023-09-06  9:59     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 18+ messages in thread
From: Laszlo Ersek @ 2023-08-31  7:20 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella,
	Philippe Mathieu-Daudé

On 8/30/23 15:40, Laszlo Ersek wrote:
> In order to avoid a forward-declaration for "vhost_user_write_sync" in a
> subsequent patch, hoist "vhost_user_write_sync" ->
> "vhost_user_get_features" -> "vhost_user_get_u64" just above
> "vhost_set_vring".
> 
> This is purely code movement -- no observable change.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
> Cc: Eugenio Perez Martin <eperezma@redhat.com>
> Cc: German Maglione <gmaglione@redhat.com>
> Cc: Liu Jiang <gerry@linux.alibaba.com>
> Cc: Sergio Lopez Pascual <slp@redhat.com>
> Cc: Stefano Garzarella <sgarzare@redhat.com>
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
> 
> Notes:
>     v2:
>     
>     - pick up R-b from Stefano
>     
>     - rename "vhost_user_write_msg" to "vhost_user_write_sync" (in code and
>       commit message) [Stefano]
> 
>  hw/virtio/vhost-user.c | 170 ++++++++++----------
>  1 file changed, 85 insertions(+), 85 deletions(-)

Phil reviewed v1:

http://mid.mail-archive.com/98150923-39ef-7581-6144-8d0ad8d4dd52@linaro.org

and I would've kept his R-b (similar to Stefano's) across the
vhost_user_write_msg->vhost_user_write_sync rename in v2; so I'm copying
it here:

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

Hope that's OK.

Laszlo

> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 4129ba72e408..c79b6f77cdca 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -1083,6 +1083,91 @@ static int vhost_user_set_vring_endian(struct vhost_dev *dev,
>      return vhost_user_write(dev, &msg, NULL, 0);
>  }
>  
> +static int vhost_user_get_u64(struct vhost_dev *dev, int request, uint64_t *u64)
> +{
> +    int ret;
> +    VhostUserMsg msg = {
> +        .hdr.request = request,
> +        .hdr.flags = VHOST_USER_VERSION,
> +    };
> +
> +    if (vhost_user_per_device_request(request) && dev->vq_index != 0) {
> +        return 0;
> +    }
> +
> +    ret = vhost_user_write(dev, &msg, NULL, 0);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +    ret = vhost_user_read(dev, &msg);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +    if (msg.hdr.request != request) {
> +        error_report("Received unexpected msg type. Expected %d received %d",
> +                     request, msg.hdr.request);
> +        return -EPROTO;
> +    }
> +
> +    if (msg.hdr.size != sizeof(msg.payload.u64)) {
> +        error_report("Received bad msg size.");
> +        return -EPROTO;
> +    }
> +
> +    *u64 = msg.payload.u64;
> +
> +    return 0;
> +}
> +
> +static int vhost_user_get_features(struct vhost_dev *dev, uint64_t *features)
> +{
> +    if (vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features) < 0) {
> +        return -EPROTO;
> +    }
> +
> +    return 0;
> +}
> +
> +/* Note: "msg->hdr.flags" may be modified. */
> +static int vhost_user_write_sync(struct vhost_dev *dev, VhostUserMsg *msg,
> +                                 bool wait_for_reply)
> +{
> +    int ret;
> +
> +    if (wait_for_reply) {
> +        bool reply_supported = virtio_has_feature(dev->protocol_features,
> +                                          VHOST_USER_PROTOCOL_F_REPLY_ACK);
> +        if (reply_supported) {
> +            msg->hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
> +        }
> +    }
> +
> +    ret = vhost_user_write(dev, msg, NULL, 0);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +    if (wait_for_reply) {
> +        uint64_t dummy;
> +
> +        if (msg->hdr.flags & VHOST_USER_NEED_REPLY_MASK) {
> +            return process_message_reply(dev, msg);
> +        }
> +
> +       /*
> +        * We need to wait for a reply but the backend does not
> +        * support replies for the command we just sent.
> +        * Send VHOST_USER_GET_FEATURES which makes all backends
> +        * send a reply.
> +        */
> +        return vhost_user_get_features(dev, &dummy);
> +    }
> +
> +    return 0;
> +}
> +
>  static int vhost_set_vring(struct vhost_dev *dev,
>                             unsigned long int request,
>                             struct vhost_vring_state *ring)
> @@ -1255,91 +1340,6 @@ static int vhost_user_set_vring_err(struct vhost_dev *dev,
>      return vhost_set_vring_file(dev, VHOST_USER_SET_VRING_ERR, file);
>  }
>  
> -static int vhost_user_get_u64(struct vhost_dev *dev, int request, uint64_t *u64)
> -{
> -    int ret;
> -    VhostUserMsg msg = {
> -        .hdr.request = request,
> -        .hdr.flags = VHOST_USER_VERSION,
> -    };
> -
> -    if (vhost_user_per_device_request(request) && dev->vq_index != 0) {
> -        return 0;
> -    }
> -
> -    ret = vhost_user_write(dev, &msg, NULL, 0);
> -    if (ret < 0) {
> -        return ret;
> -    }
> -
> -    ret = vhost_user_read(dev, &msg);
> -    if (ret < 0) {
> -        return ret;
> -    }
> -
> -    if (msg.hdr.request != request) {
> -        error_report("Received unexpected msg type. Expected %d received %d",
> -                     request, msg.hdr.request);
> -        return -EPROTO;
> -    }
> -
> -    if (msg.hdr.size != sizeof(msg.payload.u64)) {
> -        error_report("Received bad msg size.");
> -        return -EPROTO;
> -    }
> -
> -    *u64 = msg.payload.u64;
> -
> -    return 0;
> -}
> -
> -static int vhost_user_get_features(struct vhost_dev *dev, uint64_t *features)
> -{
> -    if (vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features) < 0) {
> -        return -EPROTO;
> -    }
> -
> -    return 0;
> -}
> -
> -/* Note: "msg->hdr.flags" may be modified. */
> -static int vhost_user_write_sync(struct vhost_dev *dev, VhostUserMsg *msg,
> -                                 bool wait_for_reply)
> -{
> -    int ret;
> -
> -    if (wait_for_reply) {
> -        bool reply_supported = virtio_has_feature(dev->protocol_features,
> -                                          VHOST_USER_PROTOCOL_F_REPLY_ACK);
> -        if (reply_supported) {
> -            msg->hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
> -        }
> -    }
> -
> -    ret = vhost_user_write(dev, msg, NULL, 0);
> -    if (ret < 0) {
> -        return ret;
> -    }
> -
> -    if (wait_for_reply) {
> -        uint64_t dummy;
> -
> -        if (msg->hdr.flags & VHOST_USER_NEED_REPLY_MASK) {
> -            return process_message_reply(dev, msg);
> -        }
> -
> -       /*
> -        * We need to wait for a reply but the backend does not
> -        * support replies for the command we just sent.
> -        * Send VHOST_USER_GET_FEATURES which makes all backends
> -        * send a reply.
> -        */
> -        return vhost_user_get_features(dev, &dummy);
> -    }
> -
> -    return 0;
> -}
> -
>  static int vhost_user_set_vring_addr(struct vhost_dev *dev,
>                                       struct vhost_vring_addr *addr)
>  {
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/7] vhost-user: strip superfluous whitespace
  2023-08-31  7:13   ` Laszlo Ersek
@ 2023-09-06  7:12     ` Albert Esteve
  2023-09-06  9:09       ` Laszlo Ersek
  2023-10-02 19:48       ` Laszlo Ersek
  0 siblings, 2 replies; 18+ messages in thread
From: Albert Esteve @ 2023-09-06  7:12 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: qemu-devel, Michael S. Tsirkin, Eugenio Perez Martin,
	German Maglione, Liu Jiang, Sergio Lopez Pascual,
	Stefano Garzarella, Philippe Mathieu-Daudé


On Thu, Aug 31, 2023 at 9:14 AM Laszlo Ersek <lersek@redhat.com> wrote:

> On 8/30/23 15:40, Laszlo Ersek wrote:
> > Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
> > Cc: Eugenio Perez Martin <eperezma@redhat.com>
> > Cc: German Maglione <gmaglione@redhat.com>
> > Cc: Liu Jiang <gerry@linux.alibaba.com>
> > Cc: Sergio Lopez Pascual <slp@redhat.com>
> > Cc: Stefano Garzarella <sgarzare@redhat.com>
> > Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> > Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
> > ---
> >
> > Notes:
> >     v2:
> >
> >     - pick up Stefano's R-b
> >
> >  hw/virtio/vhost-user.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
>
> This has been
>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>
> under the (identical) v1 posting:
>
> http://mid.mail-archive.com/cd0604a1-6826-ac6c-6c47-dcb6def64b28@linaro.org
>
> Thanks, Phil! (and sorry that I posted v2 too quickly -- I forgot that
> sometimes reviewers split a review over multiple days.)
>
> Laszlo
>
> >
> > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > index 8dcf049d422b..b4b677c1ce66 100644
> > --- a/hw/virtio/vhost-user.c
> > +++ b/hw/virtio/vhost-user.c
> > @@ -398,7 +398,7 @@ static int vhost_user_write(struct vhost_dev *dev,
> VhostUserMsg *msg,
> >       * operations such as configuring device memory mappings or issuing
> device
> >       * resets, which affect the whole device instead of individual VQs,
> >       * vhost-user messages should only be sent once.
> > -     *
> > +     *
> >       * Devices with multiple vhost_devs are given an associated
> dev->vq_index
> >       * so per_device requests are only sent if vq_index is 0.
> >       */
> >
>
>
Thanks for the series!
I had a timeout problem with a virtio device I am developing, and I was
not yet sure what was going on. Your description of the problem seemed to
fit mine: in my case the driver sent a command through the data plane
right after feature negotiation, and it reached the backend too soon.
Adding delays alleviated the issue, which already hinted at a race
condition.

I tested with this patch series, and according to my experiments it
really lowers the chances of hitting the deadlock. Sadly, I do still get
the issue sometimes, though not frequently... However, I think the
solution probably comes not from the QEMU side, but from the rust-vmm
vhost-user-backend crate. I am looking for that solution on my side.

But that does not invalidate this patch, which I think is a necessary
improvement, and in my tests it really helps a lot with the described
issue. Therefore:

Tested-by: Albert Esteve <aesteve@redhat.com>

BR,
Albert


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/7] vhost-user: strip superfluous whitespace
  2023-09-06  7:12     ` Albert Esteve
@ 2023-09-06  9:09       ` Laszlo Ersek
  2023-10-02 19:48       ` Laszlo Ersek
  1 sibling, 0 replies; 18+ messages in thread
From: Laszlo Ersek @ 2023-09-06  9:09 UTC (permalink / raw)
  To: Albert Esteve
  Cc: qemu-devel, Michael S. Tsirkin, Eugenio Perez Martin,
	German Maglione, Liu Jiang, Sergio Lopez Pascual,
	Stefano Garzarella, Philippe Mathieu-Daudé

On 9/6/23 09:12, Albert Esteve wrote:
> 
> 
> On Thu, Aug 31, 2023 at 9:14 AM Laszlo Ersek <lersek@redhat.com
> <mailto:lersek@redhat.com>> wrote:
> 
>     On 8/30/23 15:40, Laszlo Ersek wrote:
>     > Cc: "Michael S. Tsirkin" <mst@redhat.com <mailto:mst@redhat.com>>
>     (supporter:vhost)
>     > Cc: Eugenio Perez Martin <eperezma@redhat.com
>     <mailto:eperezma@redhat.com>>
>     > Cc: German Maglione <gmaglione@redhat.com
>     <mailto:gmaglione@redhat.com>>
>     > Cc: Liu Jiang <gerry@linux.alibaba.com
>     <mailto:gerry@linux.alibaba.com>>
>     > Cc: Sergio Lopez Pascual <slp@redhat.com <mailto:slp@redhat.com>>
>     > Cc: Stefano Garzarella <sgarzare@redhat.com
>     <mailto:sgarzare@redhat.com>>
>     > Signed-off-by: Laszlo Ersek <lersek@redhat.com
>     <mailto:lersek@redhat.com>>
>     > Reviewed-by: Stefano Garzarella <sgarzare@redhat.com
>     <mailto:sgarzare@redhat.com>>
>     > ---
>     >
>     > Notes:
>     >     v2:
>     >     
>     >     - pick up Stefano's R-b
>     >
>     >  hw/virtio/vhost-user.c | 2 +-
>     >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
>     This has been
> 
>     Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org
>     <mailto:philmd@linaro.org>>
> 
>     under the (identical) v1 posting:
> 
>     http://mid.mail-archive.com/cd0604a1-6826-ac6c-6c47-dcb6def64b28@linaro.org <http://mid.mail-archive.com/cd0604a1-6826-ac6c-6c47-dcb6def64b28@linaro.org>
> 
>     Thanks, Phil! (and sorry that I posted v2 too quickly -- I forgot that
>     sometimes reviewers split a review over multiple days.)
> 
>     Laszlo
> 
>     >
>     > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>     > index 8dcf049d422b..b4b677c1ce66 100644
>     > --- a/hw/virtio/vhost-user.c
>     > +++ b/hw/virtio/vhost-user.c
>     > @@ -398,7 +398,7 @@ static int vhost_user_write(struct vhost_dev
>     *dev, VhostUserMsg *msg,
>     >       * operations such as configuring device memory mappings or
>     issuing device
>     >       * resets, which affect the whole device instead of
>     individual VQs,
>     >       * vhost-user messages should only be sent once.
>     > -     *
>     > +     *
>     >       * Devices with multiple vhost_devs are given an associated
>     dev->vq_index
>     >       * so per_device requests are only sent if vq_index is 0.
>     >       */
>     >
> 
> 
> Thanks for the series!
> I had a timeout problem with a virtio device I am developing, and I was
> not sure yet what was going on.
> Your description of the problem seemed to fit mine, in my case the
> driver sent a command through the data plane
> right after the feature negotiation that reached the backend too soon.
> Adding delays alleviated the issue, so it
> already hinted me to a race condition.
> 
> I tested using this patch series and according to my experiments, it
> really lowers the chances to get the deadlock.
> Sadly, I do still get the issue sometimes, though (not frequently)...
> However, I think probably the solution comes not
> from the Qemu side, but from the rust-vmm vhost-user-backend crate. I am
> looking for that solution on my side.
> 
> But that does not invalidate this patch, which I think is a necessary
> improvement, and in my tests it really
> helps a lot with the described issue. Therefore:
> 
> Tested-by: Albert Esteve <aesteve@redhat.com <mailto:aesteve@redhat.com>>

Thank you for the testing and the feedback!

My problem with relegating the fix to rust-vmm/vhost -- i.e. with
calling the hang a bug inside rust-vmm/vhost -- is that rust-vmm/vhost
is *unfixable* as long as the vhost-user specification says what it
says. As soon as the guest is permitted to issue a data plane operation,
without forcing it to await completion of an earlier control plane
operation, the cat is out of the bag. Those operations travel through
independent channels, and the vhost-user spec currently requires the
backend to listen to both channels at the same time. There's no way to
restore the original order after both operations have been submitted
effectively in parallel from the guest side.

The guest cannot limit itself: the virtio operations it needs to do (on
the control plane) are described in the virtio spec, in "driver
requirements" sections, and most of those steps are inherently
treated/specified as synchronous. The guest already thinks those steps
are synchronous.

So it really is a spec problem. I see the big problem in the switch from
vhost-net to the generic vhost-user protocol. My understanding from the
virtio-networking series in the RH Developer Blog is that vhost-net was
entirely ioctl() based, and consequently totally synchronous. That was
lost when the protocol was rebased to unix domain sockets, without
upholding the explicit request-response pattern in all operations.

I'm worried we can't get this unstuck until Michael answers Stefan's
question, concerning the spec.

Laszlo



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 5/7] vhost-user: hoist "write_sync", "get_features", "get_u64"
  2023-08-31  7:20   ` Laszlo Ersek
@ 2023-09-06  9:59     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 18+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-09-06  9:59 UTC (permalink / raw)
  To: Laszlo Ersek, qemu-devel
  Cc: Michael S. Tsirkin, Eugenio Perez Martin, German Maglione,
	Liu Jiang, Sergio Lopez Pascual, Stefano Garzarella

On 31/8/23 09:20, Laszlo Ersek wrote:
> On 8/30/23 15:40, Laszlo Ersek wrote:
>> In order to avoid a forward-declaration for "vhost_user_write_sync" in a
>> subsequent patch, hoist "vhost_user_write_sync" ->
>> "vhost_user_get_features" -> "vhost_user_get_u64" just above
>> "vhost_set_vring".
>>
>> This is purely code movement -- no observable change.
>>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
>> Cc: Eugenio Perez Martin <eperezma@redhat.com>
>> Cc: German Maglione <gmaglione@redhat.com>
>> Cc: Liu Jiang <gerry@linux.alibaba.com>
>> Cc: Sergio Lopez Pascual <slp@redhat.com>
>> Cc: Stefano Garzarella <sgarzare@redhat.com>
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
>> ---
>>
>> Notes:
>>      v2:
>>      
>>      - pick up R-b from Stefano
>>      
>>      - rename "vhost_user_write_msg" to "vhost_user_write_sync" (in code and
>>        commit message) [Stefano]
>>
>>   hw/virtio/vhost-user.c | 170 ++++++++++----------
>>   1 file changed, 85 insertions(+), 85 deletions(-)
> 
> Phil reviewed v1:
> 
> http://mid.mail-archive.com/98150923-39ef-7581-6144-8d0ad8d4dd52@linaro.org
> 
> and I would've kept his R-b (similar to Stefano's) across the
> vhost_user_write_msg->vhost_user_write_sync rename in v2; so I'm copying
> it here:
> 
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> 
> Hope that's OK.

Sure! (same for patch 2/7)

Thanks :)

Phil.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 7/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously
  2023-08-30 13:40 ` [PATCH v2 7/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
@ 2023-09-07 15:59   ` Eugenio Perez Martin
  2023-09-07 20:27     ` Laszlo Ersek
  0 siblings, 1 reply; 18+ messages in thread
From: Eugenio Perez Martin @ 2023-09-07 15:59 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: qemu-devel, Michael S. Tsirkin, German Maglione, Liu Jiang,
	Sergio Lopez Pascual, Stefano Garzarella

On Wed, Aug 30, 2023 at 3:41 PM Laszlo Ersek <lersek@redhat.com> wrote:
>
> (1) The virtio-1.2 specification
> <http://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html> writes:
>
> > 3     General Initialization And Device Operation
> > 3.1   Device Initialization
> > 3.1.1 Driver Requirements: Device Initialization
> >
> > [...]
> >
> > 7. Perform device-specific setup, including discovery of virtqueues for
> >    the device, optional per-bus setup, reading and possibly writing the
> >    device’s virtio configuration space, and population of virtqueues.
> >
> > 8. Set the DRIVER_OK status bit. At this point the device is “live”.
>
> and
>
> > 4         Virtio Transport Options
> > 4.1       Virtio Over PCI Bus
> > 4.1.4     Virtio Structure PCI Capabilities
> > 4.1.4.3   Common configuration structure layout
> > 4.1.4.3.2 Driver Requirements: Common configuration structure layout
> >
> > [...]
> >
> > The driver MUST configure the other virtqueue fields before enabling the
> > virtqueue with queue_enable.
> >
> > [...]
>
> (The same statements are present in virtio-1.0 identically, at
> <http://docs.oasis-open.org/virtio/virtio/v1.0/virtio-v1.0.html>.)
>
> These together mean that the following sub-sequence of steps is valid for
> a virtio-1.0 guest driver:
>
> (1.1) set "queue_enable" for the needed queues as the final part of device
> initialization step (7),
>
> (1.2) set DRIVER_OK in step (8),
>
> (1.3) immediately start sending virtio requests to the device.
>
> (2) When vhost-user is enabled, and the VHOST_USER_F_PROTOCOL_FEATURES
> special virtio feature is negotiated, then virtio rings start in disabled
> state, according to
> <https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#ring-states>.
> In this case, explicit VHOST_USER_SET_VRING_ENABLE messages are needed for
> enabling vrings.
>
> Therefore setting "queue_enable" from the guest (1.1) is a *control plane*
> operation, which travels from the guest through QEMU to the vhost-user
> backend, using a unix domain socket.
>

The code looks good to me, but this part of the message is not precise
if I understood it correctly.

Guest PCI "queue_enable" writes remain in the qemu virtio device model
until the guest writes DRIVER_OK to the status. I'm referring to
hw/virtio/virtio-pci.c:virtio_pci_common_write, case
VIRTIO_PCI_COMMON_Q_ENABLE. From there, virtio_queue_enable just saves
the info in VirtIOPCIProxy.

After the needed queues are enabled, the guest writes the DRIVER_OK
status bit. Then, the vhost backend is started and QEMU sends
VHOST_USER_SET_VRING_ENABLE through the unix socket. And that is the
source of the message that is racing with the data plane.
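
In other words -- rough sketch only, this is not the actual QEMU code and
all names below are made up -- the shape of the flow is:

#include <stdbool.h>

#define EXAMPLE_DRIVER_OK 0x4   /* made-up status bit */

struct example_vq { bool enabled; };

static struct example_vq example_vqs[16];
static unsigned example_queue_sel;

void example_start_vhost_backend(void)
{
    /*
     * Only at this point are the recorded queue states acted upon and
     * VHOST_USER_SET_VRING_ENABLE sent over the unix socket -- the
     * message that then races with the guest's first kick on the
     * eventfd.
     */
}

void example_q_enable_write(unsigned val)
{
    /* the guest's queue_enable write is merely recorded for later */
    example_vqs[example_queue_sel].enabled = (val == 1);
}

void example_status_write(unsigned val)
{
    if (val & EXAMPLE_DRIVER_OK) {
        example_start_vhost_backend();
    }
}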

I didn't confirm it with virtiofs through tracing / debugging, so I
may be missing something.

Even with the small nit, the code fixes the problem.

Acked-by: Eugenio Pérez <eperezma@redhat.com>

> Whereas sending a virtio request (1.3) is a *data plane* operation, which
> evades QEMU -- it travels from guest to the vhost-user backend via
> eventfd.
>
> This means that steps (1.1) and (1.3) travel through different channels,
> and their relative order can be reversed, as perceived by the vhost-user
> backend.
>
> That's exactly what happens when OVMF's virtiofs driver (VirtioFsDxe) runs
> against the Rust-language virtiofsd version 1.7.2. (Which uses version
> 0.10.1 of the vhost-user-backend crate, and version 0.8.1 of the vhost
> crate.)
>
> Namely, when VirtioFsDxe binds a virtiofs device, it goes through the
> device initialization steps (i.e., control plane operations), and
> immediately sends a FUSE_INIT request too (i.e., performs a data plane
> operation). In the Rust-language virtiofsd, this creates a race between
> two components that run *concurrently*, i.e., in different threads or
> processes:
>
> - Control plane, handling vhost-user protocol messages:
>
>   The "VhostUserSlaveReqHandlerMut::set_vring_enable" method
>   [crates/vhost-user-backend/src/handler.rs] handles
>   VHOST_USER_SET_VRING_ENABLE messages, and updates each vring's "enabled"
>   flag according to the message processed.
>
> - Data plane, handling virtio / FUSE requests:
>
>   The "VringEpollHandler::handle_event" method
>   [crates/vhost-user-backend/src/event_loop.rs] handles the incoming
>   virtio / FUSE request, consuming the virtio kick at the same time. If
>   the vring's "enabled" flag is set, the virtio / FUSE request is
>   processed genuinely. If the vring's "enabled" flag is clear, then the
>   virtio / FUSE request is discarded.
>
> Note that OVMF enables the queue *first*, and sends FUSE_INIT *second*.
> However, if the data plane processor in virtiofsd wins the race, then it
> sees the FUSE_INIT *before* the control plane processor took notice of
> VHOST_USER_SET_VRING_ENABLE and green-lit the queue for the data plane
> processor. Therefore the latter drops FUSE_INIT on the floor, and goes
> back to waiting for further virtio / FUSE requests with epoll_wait.
> Meanwhile OVMF is stuck waiting for the FUSE_INIT response -- a deadlock.
>
> The deadlock is not deterministic. OVMF hangs infrequently during first
> boot. However, OVMF hangs almost certainly during reboots from the UEFI
> shell.
>
> The race can be "reliably masked" by inserting a very small delay -- a
> single debug message -- at the top of "VringEpollHandler::handle_event",
> i.e., just before the data plane processor checks the "enabled" field of
> the vring. That delay suffices for the control plane processor to act upon
> VHOST_USER_SET_VRING_ENABLE.
>
> We can deterministically prevent the race in QEMU, by blocking OVMF inside
> step (1.1) -- i.e., in the write to the "queue_enable" register -- until
> VHOST_USER_SET_VRING_ENABLE actually *completes*. That way OVMF's VCPU
> cannot advance to the FUSE_INIT submission before virtiofsd's control
> plane processor takes notice of the queue being enabled.
>
> Wait for VHOST_USER_SET_VRING_ENABLE completion by:
>
> - setting the NEED_REPLY flag on VHOST_USER_SET_VRING_ENABLE, and waiting
>   for the reply, if the VHOST_USER_PROTOCOL_F_REPLY_ACK vhost-user feature
>   has been negotiated, or
>
> - performing a separate VHOST_USER_GET_FEATURES *exchange*, which requires
>   a backend response regardless of VHOST_USER_PROTOCOL_F_REPLY_ACK.
>
> Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
> Cc: Eugenio Perez Martin <eperezma@redhat.com>
> Cc: German Maglione <gmaglione@redhat.com>
> Cc: Liu Jiang <gerry@linux.alibaba.com>
> Cc: Sergio Lopez Pascual <slp@redhat.com>
> Cc: Stefano Garzarella <sgarzare@redhat.com>
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
> ---
>
> Notes:
>     v2:
>
>     - pick up R-b from Stefano
>
>     - update virtio spec reference from 1.0 to 1.2 (also keep the 1.0 ref)
>       in the commit message; re-check the quotes / section headers [Stefano]
>
>     - summarize commit message in code comment [Stefano]
>
>  hw/virtio/vhost-user.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 18e15a9bb359..41842eb023b5 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -1235,7 +1235,21 @@ static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable)
>              .num   = enable,
>          };
>
> -        ret = vhost_set_vring(dev, VHOST_USER_SET_VRING_ENABLE, &state, false);
> +        /*
> +         * SET_VRING_ENABLE travels from guest to QEMU to vhost-user backend /
> +         * control plane thread via unix domain socket. Virtio requests travel
> +         * from guest to vhost-user backend / data plane thread via eventfd.
> +         * Even if the guest enables the ring first, and pushes its first virtio
> +         * request second (conforming to the virtio spec), the data plane thread
> +         * in the backend may see the virtio request before the control plane
> +         * thread sees the queue enablement. This causes (in fact, requires) the
> +         * data plane thread to discard the virtio request (it arrived on a
> +         * seemingly disabled queue). To prevent this out-of-order delivery,
> +         * don't let the guest proceed to pushing the virtio request until the
> +         * backend control plane acknowledges enabling the queue -- IOW, pass
> +         * wait_for_reply=true below.
> +         */
> +        ret = vhost_set_vring(dev, VHOST_USER_SET_VRING_ENABLE, &state, true);
>          if (ret < 0) {
>              /*
>               * Restoring the previous state is likely infeasible, as well as



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 0/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously
  2023-08-30 13:40 [PATCH v2 0/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
                   ` (6 preceding siblings ...)
  2023-08-30 13:40 ` [PATCH v2 7/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
@ 2023-09-07 16:01 ` Eugenio Perez Martin
  7 siblings, 0 replies; 18+ messages in thread
From: Eugenio Perez Martin @ 2023-09-07 16:01 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: qemu-devel, Michael S. Tsirkin, German Maglione, Liu Jiang,
	Sergio Lopez Pascual, Stefano Garzarella

On Wed, Aug 30, 2023 at 3:41 PM Laszlo Ersek <lersek@redhat.com> wrote:
>
> v1:
>
> - http://mid.mail-archive.com/20230827182937.146450-1-lersek@redhat.com
> - https://patchwork.ozlabs.org/project/qemu-devel/cover/20230827182937.146450-1-lersek@redhat.com/
>
> v2 picks up tags from Phil and Stefano, and addresses feedback from
> Stefano. Please see the Notes section on each patch, for the v2 changes.
>
> I've not CC'd the stable list, as we've not figured out what prior
> releases to target. Applying the series to 8.1 is easy; to 8.0 -- not so
> much.
>
> Retested.
>
> Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
> Cc: Eugenio Perez Martin <eperezma@redhat.com>
> Cc: German Maglione <gmaglione@redhat.com>
> Cc: Liu Jiang <gerry@linux.alibaba.com>
> Cc: Sergio Lopez Pascual <slp@redhat.com>
> Cc: Stefano Garzarella <sgarzare@redhat.com>
>

Reviewed-by: Eugenio Pérez <eperezma@redhat.com>

> Thanks,
> Laszlo
>
> Laszlo Ersek (7):
>   vhost-user: strip superfluous whitespace
>   vhost-user: tighten "reply_supported" scope in "set_vring_addr"
>   vhost-user: factor out "vhost_user_write_sync"
>   vhost-user: flatten "enforce_reply" into "vhost_user_write_sync"
>   vhost-user: hoist "write_sync", "get_features", "get_u64"
>   vhost-user: allow "vhost_set_vring" to wait for a reply
>   vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously
>
>  hw/virtio/vhost-user.c | 216 ++++++++++----------
>  1 file changed, 108 insertions(+), 108 deletions(-)
>
>
> base-commit: 813bac3d8d70d85cb7835f7945eb9eed84c2d8d0



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 7/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously
  2023-09-07 15:59   ` Eugenio Perez Martin
@ 2023-09-07 20:27     ` Laszlo Ersek
  0 siblings, 0 replies; 18+ messages in thread
From: Laszlo Ersek @ 2023-09-07 20:27 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: qemu-devel, Michael S. Tsirkin, German Maglione, Liu Jiang,
	Sergio Lopez Pascual, Stefano Garzarella

On 9/7/23 17:59, Eugenio Perez Martin wrote:
> On Wed, Aug 30, 2023 at 3:41 PM Laszlo Ersek <lersek@redhat.com> wrote:
>>
>> (1) The virtio-1.2 specification
>> <http://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html> writes:
>>
>>> 3     General Initialization And Device Operation
>>> 3.1   Device Initialization
>>> 3.1.1 Driver Requirements: Device Initialization
>>>
>>> [...]
>>>
>>> 7. Perform device-specific setup, including discovery of virtqueues for
>>>    the device, optional per-bus setup, reading and possibly writing the
>>>    device’s virtio configuration space, and population of virtqueues.
>>>
>>> 8. Set the DRIVER_OK status bit. At this point the device is “live”.
>>
>> and
>>
>>> 4         Virtio Transport Options
>>> 4.1       Virtio Over PCI Bus
>>> 4.1.4     Virtio Structure PCI Capabilities
>>> 4.1.4.3   Common configuration structure layout
>>> 4.1.4.3.2 Driver Requirements: Common configuration structure layout
>>>
>>> [...]
>>>
>>> The driver MUST configure the other virtqueue fields before enabling the
>>> virtqueue with queue_enable.
>>>
>>> [...]
>>
>> (The same statements are present in virtio-1.0 identically, at
>> <http://docs.oasis-open.org/virtio/virtio/v1.0/virtio-v1.0.html>.)
>>
>> These together mean that the following sub-sequence of steps is valid for
>> a virtio-1.0 guest driver:
>>
>> (1.1) set "queue_enable" for the needed queues as the final part of device
>> initialization step (7),
>>
>> (1.2) set DRIVER_OK in step (8),
>>
>> (1.3) immediately start sending virtio requests to the device.
>>
>> (2) When vhost-user is enabled, and the VHOST_USER_F_PROTOCOL_FEATURES
>> special virtio feature is negotiated, then virtio rings start in disabled
>> state, according to
>> <https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#ring-states>.
>> In this case, explicit VHOST_USER_SET_VRING_ENABLE messages are needed for
>> enabling vrings.
>>
>> Therefore setting "queue_enable" from the guest (1.1) is a *control plane*
>> operation, which travels from the guest through QEMU to the vhost-user
>> backend, using a unix domain socket.
>>
> 
> The code looks good to me, but this part of the message is not precise
> if I understood it correctly.
> 
> Guest PCI "queue_enable" writes remain in the qemu virtio device model
> until the guest writes DRIVER_OK to the status. I'm referring to
> hw/virtio/virtio-pci.c:virtio_pci_common_write, case
> VIRTIO_PCI_COMMON_Q_ENABLE. From there, virtio_queue_enable just saves
> the info in VirtIOPCIProxy.
> 
> After the needed queues are enabled, the guest writes the DRIVER_OK
> status bit. Then, the vhost backend is started and QEMU sends
> VHOST_USER_SET_VRING_ENABLE through the unix socket. And that is the
> source of the message that is racing with the data plane.

OK, so this means that 1.1 is "buffered" in QEMU until 1.2, but the race
between 1.2 and 1.3 is just the same.

I can reword the commit message to take this into account.

Thanks!
Laszlo

> 
> I didn't confirm it with virtiofs through tracing / debugging, so I
> may be missing something.
> 
> Even with the small nit, the code fixes the problem.
> 
> Acked-by: Eugenio Pérez <eperezma@redhat.com>
> 
>> Whereas sending a virtio request (1.3) is a *data plane* operation, which
>> evades QEMU -- it travels from guest to the vhost-user backend via
>> eventfd.
>>
>> This means that steps (1.1) and (1.3) travel through different channels,
>> and their relative order can be reversed, as perceived by the vhost-user
>> backend.
>>
>> That's exactly what happens when OVMF's virtiofs driver (VirtioFsDxe) runs
>> against the Rust-language virtiofsd version 1.7.2. (Which uses version
>> 0.10.1 of the vhost-user-backend crate, and version 0.8.1 of the vhost
>> crate.)
>>
>> Namely, when VirtioFsDxe binds a virtiofs device, it goes through the
>> device initialization steps (i.e., control plane operations), and
>> immediately sends a FUSE_INIT request too (i.e., performs a data plane
>> operation). In the Rust-language virtiofsd, this creates a race between
>> two components that run *concurrently*, i.e., in different threads or
>> processes:
>>
>> - Control plane, handling vhost-user protocol messages:
>>
>>   The "VhostUserSlaveReqHandlerMut::set_vring_enable" method
>>   [crates/vhost-user-backend/src/handler.rs] handles
>>   VHOST_USER_SET_VRING_ENABLE messages, and updates each vring's "enabled"
>>   flag according to the message processed.
>>
>> - Data plane, handling virtio / FUSE requests:
>>
>>   The "VringEpollHandler::handle_event" method
>>   [crates/vhost-user-backend/src/event_loop.rs] handles the incoming
>>   virtio / FUSE request, consuming the virtio kick at the same time. If
>>   the vring's "enabled" flag is set, the virtio / FUSE request is
>>   processed genuinely. If the vring's "enabled" flag is clear, then the
>>   virtio / FUSE request is discarded.
>>
>> Note that OVMF enables the queue *first*, and sends FUSE_INIT *second*.
>> However, if the data plane processor in virtiofsd wins the race, then it
>> sees the FUSE_INIT *before* the control plane processor took notice of
>> VHOST_USER_SET_VRING_ENABLE and green-lit the queue for the data plane
>> processor. Therefore the latter drops FUSE_INIT on the floor, and goes
>> back to waiting for further virtio / FUSE requests with epoll_wait.
>> Meanwhile OVMF is stuck waiting for the FUSE_INIT response -- a deadlock.
>>
>> The deadlock is not deterministic. OVMF hangs infrequently during first
>> boot. However, OVMF hangs almost certainly during reboots from the UEFI
>> shell.
>>
>> The race can be "reliably masked" by inserting a very small delay -- a
>> single debug message -- at the top of "VringEpollHandler::handle_event",
>> i.e., just before the data plane processor checks the "enabled" field of
>> the vring. That delay suffices for the control plane processor to act upon
>> VHOST_USER_SET_VRING_ENABLE.
>>
>> We can deterministically prevent the race in QEMU, by blocking OVMF inside
>> step (1.1) -- i.e., in the write to the "queue_enable" register -- until
>> VHOST_USER_SET_VRING_ENABLE actually *completes*. That way OVMF's VCPU
>> cannot advance to the FUSE_INIT submission before virtiofsd's control
>> plane processor takes notice of the queue being enabled.
>>
>> Wait for VHOST_USER_SET_VRING_ENABLE completion by:
>>
>> - setting the NEED_REPLY flag on VHOST_USER_SET_VRING_ENABLE, and waiting
>>   for the reply, if the VHOST_USER_PROTOCOL_F_REPLY_ACK vhost-user feature
>>   has been negotiated, or
>>
>> - performing a separate VHOST_USER_GET_FEATURES *exchange*, which requires
>>   a backend response regardless of VHOST_USER_PROTOCOL_F_REPLY_ACK.
>>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:vhost)
>> Cc: Eugenio Perez Martin <eperezma@redhat.com>
>> Cc: German Maglione <gmaglione@redhat.com>
>> Cc: Liu Jiang <gerry@linux.alibaba.com>
>> Cc: Sergio Lopez Pascual <slp@redhat.com>
>> Cc: Stefano Garzarella <sgarzare@redhat.com>
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
>> ---
>>
>> Notes:
>>     v2:
>>
>>     - pick up R-b from Stefano
>>
>>     - update virtio spec reference from 1.0 to 1.2 (also keep the 1.0 ref)
>>       in the commit message; re-check the quotes / section headers [Stefano]
>>
>>     - summarize commit message in code comment [Stefano]
>>
>>  hw/virtio/vhost-user.c | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index 18e15a9bb359..41842eb023b5 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -1235,7 +1235,21 @@ static int vhost_user_set_vring_enable(struct vhost_dev *dev, int enable)
>>              .num   = enable,
>>          };
>>
>> -        ret = vhost_set_vring(dev, VHOST_USER_SET_VRING_ENABLE, &state, false);
>> +        /*
>> +         * SET_VRING_ENABLE travels from guest to QEMU to vhost-user backend /
>> +         * control plane thread via unix domain socket. Virtio requests travel
>> +         * from guest to vhost-user backend / data plane thread via eventfd.
>> +         * Even if the guest enables the ring first, and pushes its first virtio
>> +         * request second (conforming to the virtio spec), the data plane thread
>> +         * in the backend may see the virtio request before the control plane
>> +         * thread sees the queue enablement. This causes (in fact, requires) the
>> +         * data plane thread to discard the virtio request (it arrived on a
>> +         * seemingly disabled queue). To prevent this out-of-order delivery,
>> +         * don't let the guest proceed to pushing the virtio request until the
>> +         * backend control plane acknowledges enabling the queue -- IOW, pass
>> +         * wait_for_reply=true below.
>> +         */
>> +        ret = vhost_set_vring(dev, VHOST_USER_SET_VRING_ENABLE, &state, true);
>>          if (ret < 0) {
>>              /*
>>               * Restoring the previous state is likely infeasible, as well as
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/7] vhost-user: strip superfluous whitespace
  2023-09-06  7:12     ` Albert Esteve
  2023-09-06  9:09       ` Laszlo Ersek
@ 2023-10-02 19:48       ` Laszlo Ersek
  1 sibling, 0 replies; 18+ messages in thread
From: Laszlo Ersek @ 2023-10-02 19:48 UTC (permalink / raw)
  To: Albert Esteve
  Cc: qemu-devel, Michael S. Tsirkin, Eugenio Perez Martin,
	German Maglione, Liu Jiang, Sergio Lopez Pascual,
	Stefano Garzarella, Philippe Mathieu-Daudé

On 9/6/23 09:12, Albert Esteve wrote:
> 
> 
> On Thu, Aug 31, 2023 at 9:14 AM Laszlo Ersek <lersek@redhat.com> wrote:
> 
>     On 8/30/23 15:40, Laszlo Ersek wrote:
>     > Cc: "Michael S. Tsirkin" <mst@redhat.com <mailto:mst@redhat.com>>
>     (supporter:vhost)
>     > Cc: Eugenio Perez Martin <eperezma@redhat.com
>     <mailto:eperezma@redhat.com>>
>     > Cc: German Maglione <gmaglione@redhat.com
>     <mailto:gmaglione@redhat.com>>
>     > Cc: Liu Jiang <gerry@linux.alibaba.com
>     <mailto:gerry@linux.alibaba.com>>
>     > Cc: Sergio Lopez Pascual <slp@redhat.com <mailto:slp@redhat.com>>
>     > Cc: Stefano Garzarella <sgarzare@redhat.com
>     <mailto:sgarzare@redhat.com>>
>     > Signed-off-by: Laszlo Ersek <lersek@redhat.com
>     <mailto:lersek@redhat.com>>
>     > Reviewed-by: Stefano Garzarella <sgarzare@redhat.com
>     <mailto:sgarzare@redhat.com>>
>     > ---
>     >
>     > Notes:
>     >     v2:
>     >     
>     >     - pick up Stefano's R-b
>     >
>     >  hw/virtio/vhost-user.c | 2 +-
>     >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
>     This has been
> 
>     Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> 
>     under the (identical) v1 posting:
> 
>     http://mid.mail-archive.com/cd0604a1-6826-ac6c-6c47-dcb6def64b28@linaro.org
> 
>     Thanks, Phil! (and sorry that I posted v2 too quickly -- I forgot that
>     sometimes reviewers split a review over multiple days.)
> 
>     Laszlo
> 
>     >
>     > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>     > index 8dcf049d422b..b4b677c1ce66 100644
>     > --- a/hw/virtio/vhost-user.c
>     > +++ b/hw/virtio/vhost-user.c
>     > @@ -398,7 +398,7 @@ static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
>     >       * operations such as configuring device memory mappings or issuing device
>     >       * resets, which affect the whole device instead of individual VQs,
>     >       * vhost-user messages should only be sent once.
>     > -     *
>     > +     *
>     >       * Devices with multiple vhost_devs are given an associated dev->vq_index
>     >       * so per_device requests are only sent if vq_index is 0.
>     >       */
>     >
> 
> 
> Thanks for the series!
> I had a timeout problem with a virtio device I am developing, and I was
> not yet sure what was going on.
> Your description of the problem seemed to fit mine: in my case the
> driver sent a command through the data plane
> right after the feature negotiation, and it reached the backend too soon.
> Adding delays alleviated the issue, so it
> already hinted at a race condition.
> 
> I tested using this patch series and, according to my experiments, it
> really lowers the chances of hitting the deadlock.
> Sadly, I do still get the issue sometimes, though (not frequently)...
> However, I think the solution probably comes not
> from the QEMU side, but from the rust-vmm vhost-user-backend crate. I am
> looking for that solution on my side.
> 
> But that does not invalidate this patch, which I think is a necessary
> improvement, and in my tests it really
> helps a lot with the described issue. Therefore:
> 
> Tested-by: Albert Esteve <aesteve@redhat.com>

Thanks again -- I'm picking this up for the whole series.

Laszlo

> 
> BR,
> Albert



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-10-02 19:49 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-30 13:40 [PATCH v2 0/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
2023-08-30 13:40 ` [PATCH v2 1/7] vhost-user: strip superfluous whitespace Laszlo Ersek
2023-08-31  7:13   ` Laszlo Ersek
2023-09-06  7:12     ` Albert Esteve
2023-09-06  9:09       ` Laszlo Ersek
2023-10-02 19:48       ` Laszlo Ersek
2023-08-30 13:40 ` [PATCH v2 2/7] vhost-user: tighten "reply_supported" scope in "set_vring_addr" Laszlo Ersek
2023-08-31  7:14   ` Laszlo Ersek
2023-08-30 13:40 ` [PATCH v2 3/7] vhost-user: factor out "vhost_user_write_sync" Laszlo Ersek
2023-08-30 13:40 ` [PATCH v2 4/7] vhost-user: flatten "enforce_reply" into "vhost_user_write_sync" Laszlo Ersek
2023-08-30 13:40 ` [PATCH v2 5/7] vhost-user: hoist "write_sync", "get_features", "get_u64" Laszlo Ersek
2023-08-31  7:20   ` Laszlo Ersek
2023-09-06  9:59     ` Philippe Mathieu-Daudé
2023-08-30 13:40 ` [PATCH v2 6/7] vhost-user: allow "vhost_set_vring" to wait for a reply Laszlo Ersek
2023-08-30 13:40 ` [PATCH v2 7/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously Laszlo Ersek
2023-09-07 15:59   ` Eugenio Perez Martin
2023-09-07 20:27     ` Laszlo Ersek
2023-09-07 16:01 ` [PATCH v2 0/7] " Eugenio Perez Martin
