* [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command.
@ 2016-07-01  9:46 Prerna Saxena
  2016-07-01  9:46 ` [Qemu-devel] [PATCH 1/2] vhost-user: Attempt to fix a race with set_mem_table Prerna Saxena
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Prerna Saxena @ 2016-07-01  9:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: felipe, mst, marcandre.lureau, anilkumar.boggarapu, Prerna Saxena

From: Prerna Saxena <prerna.saxena@nutanix.com>

The current vhost-user protocol requires the client to send responses to only a
few commands. For the remaining commands, it is impossible for QEMU to know the
status of the requested operation -- i.e., did it succeed, and if so, by when?

This is inconvenient, and can also lead to races. As an example:

(1) QEMU sends a SET_MEM_TABLE to the backend (e.g., a vhost-user net application).
Note that SET_MEM_TABLE does not require a reply according to the spec.
(2) QEMU commits the memory to the guest.
(3) The guest issues an I/O operation over a new memory region which was configured in (1).
(4) The application has not yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it does not know about those GPAs.

Note that the kernel vhost implementation does not suffer from this limitation, since
messages are sent via an ioctl(). The ioctl() blocks until the backend (e.g. vhost-net)
completes the command and returns (with an error code).
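
For comparison, a rough sketch of the synchronous kernel path (error handling
elided; build_mem_table() is a hypothetical helper, not from QEMU):

    #include <sys/ioctl.h>
    #include <linux/vhost.h>

    /* Kernel vhost: the ioctl() reports success or failure synchronously. */
    struct vhost_memory *mem = build_mem_table();   /* hypothetical */
    if (ioctl(vhost_fd, VHOST_SET_MEM_TABLE, mem) < 0) {
        /* the error is known here, before any guest I/O is started */
    }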

Changing the behaviour of the current vhost-user commands would break existing applications.
To work around this race, patch 1 sends a GET_FEATURES message before returning from
set_mem_table. While this is not a complete fix, it helps client applications that
process messages strictly in order.

The second patch introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK.
If negotiated, this feature allows QEMU to request a response to any message by
setting the newly introduced "need_response" flag. The application must then reply
to QEMU with the status of the requested operation.
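
For illustration only (this is not part of the series), a backend that negotiates the
feature might acknowledge a SET_MEM_TABLE roughly as follows; the message layout mirrors
VhostUserMsg from the spec, and remap_guest_memory()/write_msg() are hypothetical helpers:

    #define VHOST_USER_REPLY_MASK          (0x1 << 2)
    #define VHOST_USER_NEED_RESPONSE_MASK  (0x1 << 3)

    /* Backend-side sketch: send the u64 status only when QEMU asked for it. */
    static void handle_set_mem_table(int fd, VhostUserMsg *msg)
    {
        int ret = remap_guest_memory(&msg->payload.memory);  /* hypothetical */

        if (msg->flags & VHOST_USER_NEED_RESPONSE_MASK) {
            msg->flags &= ~VHOST_USER_NEED_RESPONSE_MASK;
            msg->flags |= VHOST_USER_REPLY_MASK;
            msg->size = sizeof(msg->payload.u64);
            msg->payload.u64 = ret ? 1 : 0;   /* non-zero reports failure */
            write_msg(fd, msg);               /* hypothetical */
        }
    }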

Changelog:
---------
Changes since v1:
Patch 1 (new) : Send a GET_FEATURES request before returning from set_mem_table.
Patch 2 : * Improve documentation. 
          * Abstract out commonly used operations in the form of a function, process_message_response(). Also implement this only for SET_MEM_TABLE.

Prerna Saxena (2):
  vhost-user: Attempt to prevent a race on set_mem_table.
  vhost-user : Introduce a new feature VHOST_USER_PROTOCOL_F_REPLY_ACK.

 docs/specs/vhost-user.txt |  40 ++++++++++++
 hw/virtio/vhost-user.c    | 157 ++++++++++++++++++++++++++++++++--------------
 2 files changed, 150 insertions(+), 47 deletions(-)

-- 
1.8.1.2


* [Qemu-devel] [PATCH 1/2] vhost-user: Attempt to fix a race with set_mem_table.
  2016-07-01  9:46 [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command Prerna Saxena
@ 2016-07-01  9:46 ` Prerna Saxena
  2016-07-01  9:46 ` [Qemu-devel] [PATCH 2/2] vhost-user : Introduce a new protocol feature REPLY_ACK Prerna Saxena
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Prerna Saxena @ 2016-07-01  9:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: felipe, mst, marcandre.lureau, anilkumar.boggarapu, Prerna Saxena

From: Prerna Saxena <prerna.saxena@nutanix.com>

The set_mem_table command currently does not seek a reply. Hence, there is
no easy way for a remote application to notify QEMU when it has finished
setting up the memory, or if there were errors doing so.

As an example:
(1) QEMU sends a SET_MEM_TABLE to the backend (e.g., a vhost-user net
application). SET_MEM_TABLE does not require a reply according to the spec.
(2) QEMU commits the memory to the guest.
(3) The guest issues an I/O operation over a new memory region which was configured in (1).
(4) The application has not yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it does not know about those GPAs.

While a guaranteed fix would require a protocol extension (committed separately),
a best-effort workaround for existing applications is to send a GET_FEATURES
message before completing the vhost_user_set_mem_table() call.
Since GET_FEATURES requires a reply, an application that processes vhost-user
messages synchronously will most likely have completed the SET_MEM_TABLE
before replying.

For a vhost-user application that processes messages strictly in order, a reply
to GET_FEATURES ensures that the application has also finished processing the
preceding SET_MEM_TABLE request.
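
For instance, a single-threaded backend that handles requests strictly in order behaves
roughly like the sketch below (read_msg(), remap_guest_memory() and send_reply() are
placeholder names, not taken from any particular backend):

    /* Requests are handled one at a time, in the order they arrive. */
    for (;;) {
        VhostUserMsg msg;

        read_msg(client_fd, &msg);
        switch (msg.request) {
        case VHOST_USER_SET_MEM_TABLE:
            remap_guest_memory(&msg.payload.memory);  /* no reply required */
            break;
        case VHOST_USER_GET_FEATURES:
            msg.payload.u64 = supported_features;
            msg.size = sizeof(msg.payload.u64);
            send_reply(client_fd, &msg);  /* QEMU sees this reply only after the
                                             SET_MEM_TABLE above has completed */
            break;
        /* ... other requests ... */
        }
    }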

Signed-off-by: Prerna Saxena <prerna.saxena@nutanix.com>
---
 hw/virtio/vhost-user.c | 104 +++++++++++++++++++++++++++----------------------
 1 file changed, 57 insertions(+), 47 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 495e09f..858a1bb 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -233,53 +233,6 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, uint64_t base,
     return 0;
 }
 
-static int vhost_user_set_mem_table(struct vhost_dev *dev,
-                                    struct vhost_memory *mem)
-{
-    int fds[VHOST_MEMORY_MAX_NREGIONS];
-    int i, fd;
-    size_t fd_num = 0;
-    VhostUserMsg msg = {
-        .request = VHOST_USER_SET_MEM_TABLE,
-        .flags = VHOST_USER_VERSION,
-    };
-
-    for (i = 0; i < dev->mem->nregions; ++i) {
-        struct vhost_memory_region *reg = dev->mem->regions + i;
-        ram_addr_t offset;
-        MemoryRegion *mr;
-
-        assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
-        mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
-                                     &offset);
-        fd = memory_region_get_fd(mr);
-        if (fd > 0) {
-            msg.payload.memory.regions[fd_num].userspace_addr = reg->userspace_addr;
-            msg.payload.memory.regions[fd_num].memory_size  = reg->memory_size;
-            msg.payload.memory.regions[fd_num].guest_phys_addr = reg->guest_phys_addr;
-            msg.payload.memory.regions[fd_num].mmap_offset = offset;
-            assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
-            fds[fd_num++] = fd;
-        }
-    }
-
-    msg.payload.memory.nregions = fd_num;
-
-    if (!fd_num) {
-        error_report("Failed initializing vhost-user memory map, "
-                     "consider using -object memory-backend-file share=on");
-        return -1;
-    }
-
-    msg.size = sizeof(msg.payload.memory.nregions);
-    msg.size += sizeof(msg.payload.memory.padding);
-    msg.size += fd_num * sizeof(VhostUserMemoryRegion);
-
-    vhost_user_write(dev, &msg, fds, fd_num);
-
-    return 0;
-}
-
 static int vhost_user_set_vring_addr(struct vhost_dev *dev,
                                      struct vhost_vring_addr *addr)
 {
@@ -482,6 +435,63 @@ static int vhost_user_get_features(struct vhost_dev *dev, uint64_t *features)
     return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
 }
 
+static int vhost_user_set_mem_table(struct vhost_dev *dev,
+                                    struct vhost_memory *mem)
+{
+    int fds[VHOST_MEMORY_MAX_NREGIONS];
+    int i, fd;
+    size_t fd_num = 0;
+    uint64_t features;
+    VhostUserMsg msg = {
+        .request = VHOST_USER_SET_MEM_TABLE,
+        .flags = VHOST_USER_VERSION,
+    };
+
+    for (i = 0; i < dev->mem->nregions; ++i) {
+        struct vhost_memory_region *reg = dev->mem->regions + i;
+        ram_addr_t offset;
+        MemoryRegion *mr;
+
+        assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
+        mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
+                                     &offset);
+        fd = memory_region_get_fd(mr);
+        if (fd > 0) {
+            msg.payload.memory.regions[fd_num].userspace_addr \
+                                        = reg->userspace_addr;
+            msg.payload.memory.regions[fd_num].memory_size  \
+                                        = reg->memory_size;
+            msg.payload.memory.regions[fd_num].guest_phys_addr \
+                                        = reg->guest_phys_addr;
+            msg.payload.memory.regions[fd_num].mmap_offset = offset;
+            assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
+            fds[fd_num++] = fd;
+        }
+    }
+
+    msg.payload.memory.nregions = fd_num;
+
+    if (!fd_num) {
+        error_report("Failed initializing vhost-user memory map, "
+                     "consider using -object memory-backend-file share=on");
+        return -1;
+    }
+
+    msg.size = sizeof(msg.payload.memory.nregions);
+    msg.size += sizeof(msg.payload.memory.padding);
+    msg.size += fd_num * sizeof(VhostUserMemoryRegion);
+
+    vhost_user_write(dev, &msg, fds, fd_num);
+
+    /* Note: It is (yet) unknown when the client application has finished
+     * remapping the GPA.
+     * Attempt to prevent a race by sending a command that requires a reply.
+     */
+    vhost_user_get_features(dev, &features);
+
+    return 0;
+}
+
 static int vhost_user_set_owner(struct vhost_dev *dev)
 {
     VhostUserMsg msg = {
-- 
1.8.1.2


* [Qemu-devel] [PATCH 2/2] vhost-user : Introduce a new protocol feature REPLY_ACK.
  2016-07-01  9:46 [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command Prerna Saxena
  2016-07-01  9:46 ` [Qemu-devel] [PATCH 1/2] vhost-user: Attempt to fix a race with set_mem_table Prerna Saxena
@ 2016-07-01  9:46 ` Prerna Saxena
  2016-07-03 11:47 ` [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command Marc-André Lureau
  2016-07-04 11:59 ` Michael S. Tsirkin
  3 siblings, 0 replies; 8+ messages in thread
From: Prerna Saxena @ 2016-07-01  9:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: felipe, mst, marcandre.lureau, anilkumar.boggarapu, Prerna Saxena

From: Prerna Saxena <prerna.saxena@nutanix.com>

This introduces a new protocol feature, VHOST_USER_PROTOCOL_F_REPLY_ACK.

If negotiated, client applications should send a u64 payload in
response to any message that has the "need_response" bit set in its
message flags. Setting the payload to zero indicates that the command
finished successfully; setting it to a non-zero value indicates an
error.

Currently implemented only for SET_MEM_TABLE.
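
As a sketch of how the same pattern could later be reused for other commands (purely
illustrative, not part of this patch; vhost_user_send_with_ack() is a hypothetical
helper built on top of the functions added below):

    static int vhost_user_send_with_ack(struct vhost_dev *dev, VhostUserMsg *msg)
    {
        bool reply_supported = virtio_has_feature(dev->protocol_features,
                                                  VHOST_USER_PROTOCOL_F_REPLY_ACK);

        if (reply_supported) {
            msg->flags |= VHOST_USER_NEED_RESPONSE_MASK;
        }

        vhost_user_write(dev, msg, NULL, 0);

        /* Wait for the u64 status only when the feature was negotiated. */
        return reply_supported ? process_message_response(dev, msg->request) : 0;
    }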

Signed-off-by: Prerna Saxena <prerna.saxena@nutanix.com>
---
 docs/specs/vhost-user.txt | 44 ++++++++++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost-user.c    | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 76 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 777c49c..26dbe71 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -37,6 +37,8 @@ consists of 3 header fields and a payload:
  * Flags: 32-bit bit field:
    - Lower 2 bits are the version (currently 0x01)
    - Bit 2 is the reply flag - needs to be sent on each reply from the slave
+   - Bit 3 is the need_response flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK for
+     details.
  * Size - 32-bit size of the payload
 
 
@@ -126,6 +128,8 @@ the ones that do:
  * VHOST_GET_VRING_BASE
  * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 
+[ Also see the section on the REPLY_ACK protocol extension. ]
+
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
@@ -254,6 +258,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_MQ             0
 #define VHOST_USER_PROTOCOL_F_LOG_SHMFD      1
 #define VHOST_USER_PROTOCOL_F_RARP           2
+#define VHOST_USER_PROTOCOL_F_REPLY_ACK      3
 
 Message types
 -------------
@@ -464,3 +469,42 @@ Message types
       is present in VHOST_USER_GET_PROTOCOL_FEATURES.
       The first 6 bytes of the payload contain the mac address of the guest to
       allow the vhost user backend to construct and broadcast the fake RARP.
+
+VHOST_USER_PROTOCOL_F_REPLY_ACK:
+-------------------------------
+The original vhost-user specification requires replies for only a few
+commands. This differs from the kernel vhost implementation, where commands
+are sent over an ioctl() and block until the backend has completed them.
+
+With this protocol extension negotiated, the sender (QEMU) can set the newly
+introduced "need_response" flag [Bit 3] on any command. This indicates that
+the client MUST respond with a u64 payload indicating success or failure:
+the payload should be set to zero on success and to a non-zero value on
+failure. In other words, the response must be in the following format:
+
+------------------------------------
+| request | flags | size | payload |
+------------------------------------
+
+ * Request: 32-bit type of the request
+ * Flags: 32-bit bit field:
+ * Size: size of the payload (see below)
+ * Payload: a u64 integer, where a non-zero value indicates a failure.
+
+This lets QEMU know deterministically whether the requested operation has
+succeeded, and it aids in debugging the application's responses. Today, QEMU
+is expected to terminate the main vhost-user loop upon receiving such errors.
+In the future, QEMU could be taught to be more resilient for selected
+requests.
+
+Note that, as per the original vhost-user protocol, the following four
+messages already require explicit responses from the vhost-user client process:
+ * VHOST_GET_FEATURES
+ * VHOST_GET_PROTOCOL_FEATURES
+ * VHOST_GET_VRING_BASE
+ * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+
+For these message types, the presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or
+of the need_response bit brings no behavioural change: the response from the
+client is identical whether or not the REPLY_ACK feature has been
+negotiated.
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 858a1bb..bff229e 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -31,6 +31,7 @@ enum VhostUserProtocolFeature {
     VHOST_USER_PROTOCOL_F_MQ = 0,
     VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
     VHOST_USER_PROTOCOL_F_RARP = 2,
+    VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
 
     VHOST_USER_PROTOCOL_F_MAX
 };
@@ -84,6 +85,7 @@ typedef struct VhostUserMsg {
 
 #define VHOST_USER_VERSION_MASK     (0x3)
 #define VHOST_USER_REPLY_MASK       (0x1<<2)
+#define VHOST_USER_NEED_RESPONSE_MASK       (0x1 << 3)
     uint32_t flags;
     uint32_t size; /* the following payload size */
     union {
@@ -107,6 +109,25 @@ static VhostUserMsg m __attribute__ ((unused));
 /* The version of the protocol we support */
 #define VHOST_USER_VERSION    (0x1)
 
+static int process_message_response(struct vhost_dev *dev,
+                                    VhostUserRequest request)
+{
+    VhostUserMsg msg;
+
+    if (vhost_user_read(dev, &msg) < 0) {
+        return 0;
+    }
+
+    if (msg.request != request) {
+        error_report("Received unexpected msg type. "
+                     "Expected %d, received %d",
+                     request, msg.request);
+        return -1;
+    }
+
+    return msg.payload.u64 ? -1 : 0;
+}
+
 static bool ioeventfd_enabled(void)
 {
     return kvm_enabled() && kvm_eventfds_enabled();
@@ -442,11 +463,18 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
     int i, fd;
     size_t fd_num = 0;
     uint64_t features;
+    bool reply_supported = virtio_has_feature(dev->protocol_features,
+                            VHOST_USER_PROTOCOL_F_REPLY_ACK);
+
     VhostUserMsg msg = {
         .request = VHOST_USER_SET_MEM_TABLE,
         .flags = VHOST_USER_VERSION,
     };
 
+    if (reply_supported) {
+        msg.flags |= VHOST_USER_NEED_RESPONSE_MASK;
+    }
+
     for (i = 0; i < dev->mem->nregions; ++i) {
         struct vhost_memory_region *reg = dev->mem->regions + i;
         ram_addr_t offset;
@@ -483,6 +511,9 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
 
     vhost_user_write(dev, &msg, fds, fd_num);
 
+    if (reply_supported) {
+        return process_message_response(dev, msg.request);
+    } else {
     /* Note: It is (yet) unknown when the client application has finished
      * remapping the GPA.
      * Attempt to prevent a race by sending a command that requires a reply.
@@ -490,6 +521,7 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
     vhost_user_get_features(dev, &features);
 
     return 0;
+    }
 }
 
 static int vhost_user_set_owner(struct vhost_dev *dev)
-- 
1.8.1.2


* Re: [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command.
  2016-07-01  9:46 [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command Prerna Saxena
  2016-07-01  9:46 ` [Qemu-devel] [PATCH 1/2] vhost-user: Attempt to fix a race with set_mem_table Prerna Saxena
  2016-07-01  9:46 ` [Qemu-devel] [PATCH 2/2] vhost-user : Introduce a new protocol feature REPLY_ACK Prerna Saxena
@ 2016-07-03 11:47 ` Marc-André Lureau
  2016-07-04  2:27   ` Prerna Saxena
  2016-07-04 11:59 ` Michael S. Tsirkin
  3 siblings, 1 reply; 8+ messages in thread
From: Marc-André Lureau @ 2016-07-03 11:47 UTC (permalink / raw)
  To: Prerna Saxena
  Cc: QEMU, Felipe Franciosi, Michael S. Tsirkin, Anil Kumar Boggarapu,
	Prerna Saxena

Hi

On Fri, Jul 1, 2016 at 11:46 AM, Prerna Saxena <saxenap.ltc@gmail.com> wrote:
> From: Prerna Saxena <prerna.saxena@nutanix.com>
>
> The current vhost-user protocol requires the client to send responses to only a
> few commands. For the remaining commands, it is impossible for QEMU to know the
> status of the requested operation -- ie, did it succeed? If so, by what time?
>
> This is inconvenient, and can also lead to races. As an example:
>
> (1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net application).
> Note that SET_MEM_TABLE does not require a reply according to the spec.
> (2) Qemu commits the memory to the guest.
> (3) Guest issues an I/O operation over a new memory region which was configured on (1).
> (4) The application hasn't yet remapped the memory, but it sees the I/O request.
> (5) The application cannot satisfy the request because it does not know about those GPAs.
>
> Note that the kernel implementation does not suffer from this limitation since messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. vhost-net) completes the command and returns (with an error code).
>
> Changing the behaviour of current vhost-user commands would break existing applications.
> To work around this race, Patch 1 adds a get_features command to be sent before returning from set_mem_table. While this is not a complete fix, it will help client applications that strictly process messages in order.
>
> The second patch introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows QEMU to request a response to any message by setting the newly introduced "need_response" flag. The application must then respond to qemu by providing a status about the requested operation.
>
> Changelog:
> ---------
> Changes since v1:
> Patch 1 : Ask for get_features before returning from set_mem_table(new).
> Patch 2 : * Improve documentation.
>           * Abstract out commonly used operations in the form of a function, process_message_response(). Also implement this only for SET_MEM_TABLE.
>

Overall, that looks good to me.

Why do we have both "response" and "reply", which basically mean the
same thing? I would rather stick with "reply".

I am not convinced the first patch is needed; IMHO it is a
workaround/hack. The real solution is given by patch 2 alone.

> Prerna Saxena (2):
>   vhost-user: Attempt to prevent a race on set_mem_table.
>   vhost-user : Introduce a new feature VHOST_USER_PROTOCOL_F_REPLY_ACK.
>
>  docs/specs/vhost-user.txt |  40 ++++++++++++
>  hw/virtio/vhost-user.c    | 157 ++++++++++++++++++++++++++++++++--------------
>  2 files changed, 150 insertions(+), 47 deletions(-)
>
> --
> 1.8.1.2
>



-- 
Marc-André Lureau


* Re: [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command.
  2016-07-03 11:47 ` [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command Marc-André Lureau
@ 2016-07-04  2:27   ` Prerna Saxena
  0 siblings, 0 replies; 8+ messages in thread
From: Prerna Saxena @ 2016-07-04  2:27 UTC (permalink / raw)
  To: Marc-André Lureau, Prerna Saxena
  Cc: QEMU, Felipe Franciosi, Michael S. Tsirkin, Anil Kumar Boggarapu

Hi Marc-André,
Thank you for taking a look.





On 03/07/16 5:17 pm, "Marc-André Lureau" <marcandre.lureau@gmail.com> wrote:

>Hi
>
>On Fri, Jul 1, 2016 at 11:46 AM, Prerna Saxena <saxenap.ltc@gmail.com> wrote:
>> From: Prerna Saxena <prerna.saxena@nutanix.com>
>>
>> The current vhost-user protocol requires the client to send responses to only a
>> few commands. For the remaining commands, it is impossible for QEMU to know the
>> status of the requested operation -- ie, did it succeed? If so, by what time?
>>
>> This is inconvenient, and can also lead to races. As an example:
>>
>> (1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net application).
>> Note that SET_MEM_TABLE does not require a reply according to the spec.
>> (2) Qemu commits the memory to the guest.
>> (3) Guest issues an I/O operation over a new memory region which was configured on (1).
>> (4) The application hasn't yet remapped the memory, but it sees the I/O request.
>> (5) The application cannot satisfy the request because it does not know about those GPAs.
>>
>> Note that the kernel implementation does not suffer from this limitation since messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. vhost-net) completes the command and returns (with an error code).
>>
>> Changing the behaviour of current vhost-user commands would break existing applications.
>> To work around this race, Patch 1 adds a get_features command to be sent before returning from set_mem_table. While this is not a complete fix, it will help client applications that strictly process messages in order.
>>
>> The second patch introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows QEMU to request a response to any message by setting the newly introduced "need_response" flag. The application must then respond to qemu by providing a status about the requested operation.
>>
>> Changelog:
>> ---------
>> Changes since v1:
>> Patch 1 : Ask for get_features before returning from set_mem_table(new).
>> Patch 2 : * Improve documentation.
>>           * Abstract out commonly used operations in the form of a function, process_message_response(). Also implement this only for SET_MEM_TABLE.
>>
>
>Overall, that looks good to me.
>
>Why do we have both "response" and "reply" which basically means the
>same thing, right? I would rather stick with "reply".

Alright, I will rename this function to process_message_reply().

>
>I am not convinced the first patch is needed, imho it is a
>workaround/hack, the solution is given with the patch 2 only.

Great, I’ll post a v3 with just patch 2.

Regards,
Prerna

>
>> Prerna Saxena (2):
>>   vhost-user: Attempt to prevent a race on set_mem_table.
>>   vhost-user : Introduce a new feature VHOST_USER_PROTOCOL_F_REPLY_ACK.
>>
>>  docs/specs/vhost-user.txt |  40 ++++++++++++
>>  hw/virtio/vhost-user.c    | 157 ++++++++++++++++++++++++++++++++--------------
>>  2 files changed, 150 insertions(+), 47 deletions(-)
>>
>> --
>> 1.8.1.2
>>
>
>
>
>-- 
>Marc-André Lureau
>


* Re: [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command.
  2016-07-01  9:46 [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command Prerna Saxena
                   ` (2 preceding siblings ...)
  2016-07-03 11:47 ` [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command Marc-André Lureau
@ 2016-07-04 11:59 ` Michael S. Tsirkin
  2016-07-04 16:50   ` Prerna Saxena
  3 siblings, 1 reply; 8+ messages in thread
From: Michael S. Tsirkin @ 2016-07-04 11:59 UTC (permalink / raw)
  To: Prerna Saxena
  Cc: qemu-devel, felipe, marcandre.lureau, anilkumar.boggarapu, Prerna Saxena

On Fri, Jul 01, 2016 at 02:46:20AM -0700, Prerna Saxena wrote:
> From: Prerna Saxena <prerna.saxena@nutanix.com>
> 
> The current vhost-user protocol requires the client to send responses to only a
> few commands. For the remaining commands, it is impossible for QEMU to know the
> status of the requested operation -- ie, did it succeed? If so, by what time?
> 
> This is inconvenient, and can also lead to races. As an example:
> 
> (1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net application).
> Note that SET_MEM_TABLE does not require a reply according to the spec.
> (2) Qemu commits the memory to the guest.
> (3) Guest issues an I/O operation over a new memory region which was configured on (1).
> (4) The application hasn't yet remapped the memory, but it sees the I/O request.
> (5) The application cannot satisfy the request because it does not know about those GPAs.
> 
> Note that the kernel implementation does not suffer from this limitation since messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. vhost-net) completes the command and returns (with an error code).
> 
> Changing the behaviour of current vhost-user commands would break existing applications. 
> To work around this race, Patch 1 adds a get_features command to be sent before returning from set_mem_table. While this is not a complete fix, it will help client applications that strictly process messages in order.
> 
> The second patch introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows QEMU to request a response to any message by setting the newly introduced "need_response" flag. The application must then respond to qemu by providing a status about the requested operation.


OK, this all looks very reasonable (and I do like patch 1 too),
but there's one source of waste here: we do not need to
synchronize when we set up the device for the first time,
i.e. when hdev->memory_changed is false.

I think we should test that and skip the sync in both patches
unless hdev->memory_changed is set.
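
A minimal sketch of that check, assuming it sits inside vhost_user_set_mem_table()
and reuses the ack path from patch 2:

    /* Skip the synchronization when the memory map did not actually change. */
    if (reply_supported && dev->memory_changed) {
        return process_message_response(dev, msg.request);
    }

    return 0;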


> Changelog:
> ---------
> Changes since v1:
> Patch 1 : Ask for get_features before returning from set_mem_table(new). 
> Patch 2 : * Improve documentation. 
>           * Abstract out commonly used operations in the form of a function, process_message_response(). Also implement this only for SET_MEM_TABLE.
> 
> Prerna Saxena (2):
>   vhost-user: Attempt to prevent a race on set_mem_table.
>   vhost-user : Introduce a new feature VHOST_USER_PROTOCOL_F_REPLY_ACK.
> 
>  docs/specs/vhost-user.txt |  40 ++++++++++++
>  hw/virtio/vhost-user.c    | 157 ++++++++++++++++++++++++++++++++--------------
>  2 files changed, 150 insertions(+), 47 deletions(-)
> 
> -- 
> 1.8.1.2


* Re: [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command.
  2016-07-04 11:59 ` Michael S. Tsirkin
@ 2016-07-04 16:50   ` Prerna Saxena
  2016-07-04 19:54     ` Michael S. Tsirkin
  0 siblings, 1 reply; 8+ messages in thread
From: Prerna Saxena @ 2016-07-04 16:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: qemu-devel, Felipe Franciosi, marcandre.lureau,
	Anil Kumar Boggarapu, Prerna Saxena

Hi Michael,
Thank you for taking a look.





On 04/07/16 5:29 pm, "Michael S. Tsirkin" <mst@redhat.com> wrote:

>On Fri, Jul 01, 2016 at 02:46:20AM -0700, Prerna Saxena wrote:
>> From: Prerna Saxena <prerna.saxena@nutanix.com>
>> 
>> The current vhost-user protocol requires the client to send responses to only a
>> few commands. For the remaining commands, it is impossible for QEMU to know the
>> status of the requested operation -- ie, did it succeed? If so, by what time?
>> 
>> This is inconvenient, and can also lead to races. As an example:
>> 
>> (1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net application).
>> Note that SET_MEM_TABLE does not require a reply according to the spec.
>> (2) Qemu commits the memory to the guest.
>> (3) Guest issues an I/O operation over a new memory region which was configured on (1).
>> (4) The application hasn't yet remapped the memory, but it sees the I/O request.
>> (5) The application cannot satisfy the request because it does not know about those GPAs.
>> 
>> Note that the kernel implementation does not suffer from this limitation since messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. vhost-net) completes the command and returns (with an error code).
>> 
>> Changing the behaviour of current vhost-user commands would break existing applications. 
>> To work around this race, Patch 1 adds a get_features command to be sent before returning from set_mem_table. While this is not a complete fix, it will help client applications that strictly process messages in order.
>> 
>> The second patch introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows QEMU to request a response to any message by setting the newly introduced "need_response" flag. The application must then respond to qemu by providing a status about the requested operation.
>
>
>OK this all looks very reasonable (and I do like patch 1 too)
>but there's one source of waste here: we do not need to
>synchronize when we set up device the first time
>when hdev->memory_changed is false.
>
>I think we should test that and skip synch in both patches
>unless  hdev->memory_changed is set.

I do not entirely agree with that. The first set_mem_table command is not much different from subsequent set_mem_table calls.
For all cases, there is a fair chance that the vhost-user application may, for some reason, not be able to map the guest memory.
This protocol extension provides a mechanism for such errors to be propagated back to QEMU. It is up to QEMU to acknowledge the failure (by terminating itself or failing the device) or to ignore it. However, in the absence of such a mechanism, it would be really bad for QEMU to believe that the vhost application is all set to process guest requests when the reality is quite the opposite.

Also, as pointed out before, QEMU needs to have a notion of _when_ the memory mapping has finished, so that it may proceed to pass actual requests on to the vhost-user application. The race described in the cover letter (above) can potentially happen even at first-time initialization.

This protocol extension is an attempt to bridge the subtle behavioural difference between vhost-user and vhost-kernel. Patch 1, in my opinion, makes the code less intuitive, because we end up sending a GET_FEATURES message from inside the handler for another vhost command, SET_MEM_TABLE. However, if you think it is better to have both patches 1 and 2, I'll be happy to post both.

Regards,
Prerna

>
>
>> Changelog:
>> ---------
>> Changes since v1:
>> Patch 1 : Ask for get_features before returning from set_mem_table(new). 
>> Patch 2 : * Improve documentation. 
>>           * Abstract out commonly used operations in the form of a function, process_message_response(). Also implement this only for SET_MEM_TABLE.
>> 
>> Prerna Saxena (2):
>>   vhost-user: Attempt to prevent a race on set_mem_table.
>>   vhost-user : Introduce a new feature VHOST_USER_PROTOCOL_F_REPLY_ACK.
>> 
>>  docs/specs/vhost-user.txt |  40 ++++++++++++
>>  hw/virtio/vhost-user.c    | 157 ++++++++++++++++++++++++++++++++--------------
>>  2 files changed, 150 insertions(+), 47 deletions(-)
>> 
>> -- 
>> 1.8.1.2


* Re: [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command.
  2016-07-04 16:50   ` Prerna Saxena
@ 2016-07-04 19:54     ` Michael S. Tsirkin
  0 siblings, 0 replies; 8+ messages in thread
From: Michael S. Tsirkin @ 2016-07-04 19:54 UTC (permalink / raw)
  To: Prerna Saxena
  Cc: qemu-devel, Felipe Franciosi, marcandre.lureau,
	Anil Kumar Boggarapu, Prerna Saxena

On Mon, Jul 04, 2016 at 04:50:04PM +0000, Prerna Saxena wrote:
> Hi Michael,
> Thank you for taking a look.
> 
> 
> 
> 
> 
> On 04/07/16 5:29 pm, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> >On Fri, Jul 01, 2016 at 02:46:20AM -0700, Prerna Saxena wrote:
> >> From: Prerna Saxena <prerna.saxena@nutanix.com>
> >> 
> >> The current vhost-user protocol requires the client to send responses to only a
> >> few commands. For the remaining commands, it is impossible for QEMU to know the
> >> status of the requested operation -- ie, did it succeed? If so, by what time?
> >> 
> >> This is inconvenient, and can also lead to races. As an example:
> >> 
> >> (1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net application).
> >> Note that SET_MEM_TABLE does not require a reply according to the spec.
> >> (2) Qemu commits the memory to the guest.
> >> (3) Guest issues an I/O operation over a new memory region which was configured on (1).
> >> (4) The application hasn't yet remapped the memory, but it sees the I/O request.
> >> (5) The application cannot satisfy the request because it does not know about those GPAs.
> >> 
> >> Note that the kernel implementation does not suffer from this limitation since messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. vhost-net) completes the command and returns (with an error code).
> >> 
> >> Changing the behaviour of current vhost-user commands would break existing applications. 
> >> To work around this race, Patch 1 adds a get_features command to be sent before returning from set_mem_table. While this is not a complete fix, it will help client applications that strictly process messages in order.
> >> 
> >> The second patch introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows QEMU to request a response to any message by setting the newly introduced "need_response" flag. The application must then respond to qemu by providing a status about the requested operation.
> >
> >
> >OK this all looks very reasonable (and I do like patch 1 too)
> >but there's one source of waste here: we do not need to
> >synchronize when we set up device the first time
> >when hdev->memory_changed is false.
> >
> >I think we should test that and skip synch in both patches
> >unless  hdev->memory_changed is set.
> 
> I do not entirely agree with that. The first set_mem_table command is not much different from subsequent set_mem_table calls.
> For all cases, there is a fair chance that the vhost-user application may, for some reason, not be able to map the guest memory.
> This protocol extension provides a mechanism for such errors to be propagated back to QEMU. It is upto QEMU to acknowledge the failure (by terminating itself or failing the device) or ignore it. However, in the absence of such a mechanism, it would be really bad for QEMU to believe that the vhost application is all set to process guest requests when reality is quite the opposite.


The same applies to any other request, in theory.
The backend can disconnect if it cannot handle a request;
this seems sufficient to me.



> Also, as pointed out before, QEMU needs to have a notion of _when_ the memory mapping was finished, so that it may proceed to pass on actual requests to the vhost user application. The race described in the covering letter (above) can potentially happen even at first-time initialization.
> 
> This protocol extension is an attempt to bridge the subtle behavioural difference between vhost-user and vhost-kernel. Patch 1, in my opinion, makes the code less intuitive. This is because we are calling a GET_FEATURES vhost message from inside the handler for another vhost command— SET_MEM_TABLE. However, if you think it better to have both Patch 1 & 2, I’ll be happy to post both.
> 
> Regards,
> Prerna
> 
> >
> >
> >> Changelog:
> >> ---------
> >> Changes since v1:
> >> Patch 1 : Ask for get_features before returning from set_mem_table(new). 
> >> Patch 2 : * Improve documentation. 
> >>           * Abstract out commonly used operations in the form of a function, process_message_response(). Also implement this only for SET_MEM_TABLE.
> >> 
> >> Prerna Saxena (2):
> >>   vhost-user: Attempt to prevent a race on set_mem_table.
> >>   vhost-user : Introduce a new feature VHOST_USER_PROTOCOL_F_REPLY_ACK.
> >> 
> >>  docs/specs/vhost-user.txt |  40 ++++++++++++
> >>  hw/virtio/vhost-user.c    | 157 ++++++++++++++++++++++++++++++++--------------
> >>  2 files changed, 150 insertions(+), 47 deletions(-)
> >> 
> >> -- 
> >> 1.8.1.2

