All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC 0/2] vhost-user: Specify device IOTLB support
@ 2017-04-11 10:10 Maxime Coquelin
  2017-04-11 10:10 ` [Qemu-devel] [RFC 1/2] spec/vhost-user: Introduce secondary channel for slave initiated requests Maxime Coquelin
  2017-04-11 10:10 ` [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support Maxime Coquelin
  0 siblings, 2 replies; 18+ messages in thread
From: Maxime Coquelin @ 2017-04-11 10:10 UTC (permalink / raw)
  To: mst, vkaplans, jasowang, wexu, peterx, yuanhan.liu,
	virtio-comment, qemu-devel
  Cc: Maxime Coquelin

This two-patches series aims at specifying the protocol update
required to support device IOTLB with user backends.

The protocol is very close to kernel backends, except that a new
communication channel is introduced to enable the slave to send
requests to the master.

While unix sockets are bidirectionnal, I create a dedicated channel
for slave intiated requests to simplify message handling with multi-
threaded masters and slaves.

Maxime Coquelin (2):
  spec/vhost-user: Introduce secondary channel for slave initiated
    requests
  spec/vhost-user spec: Add IOMMU support

 docs/specs/vhost-user.txt | 94 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Qemu-devel] [RFC 1/2] spec/vhost-user: Introduce secondary channel for slave initiated requests
  2017-04-11 10:10 [Qemu-devel] [RFC 0/2] vhost-user: Specify device IOTLB support Maxime Coquelin
@ 2017-04-11 10:10 ` Maxime Coquelin
  2017-04-11 13:06   ` Marc-André Lureau
  2017-04-11 10:10 ` [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support Maxime Coquelin
  1 sibling, 1 reply; 18+ messages in thread
From: Maxime Coquelin @ 2017-04-11 10:10 UTC (permalink / raw)
  To: mst, vkaplans, jasowang, wexu, peterx, yuanhan.liu,
	virtio-comment, qemu-devel
  Cc: Maxime Coquelin

This vhost-user specification update aims at enabling the
slave to send requests to the master using a dedicated socket
created by the master.

It can be used for example when the slave implements a device
IOTLB to send cache miss requests to the master.

The message types list is updated with an "Initiator" field to
indicate for each type whether the master and/or slave can
initiate the request.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 docs/specs/vhost-user.txt | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 036890f..b365047 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -139,6 +139,7 @@ in the ancillary data:
  * VHOST_USER_SET_VRING_KICK
  * VHOST_USER_SET_VRING_CALL
  * VHOST_USER_SET_VRING_ERR
+ * VHOST_USER_SET_SLAVE_REQ_FD
 
 If Master is unable to send the full message or receives a wrong reply it will
 close the connection. An optional reconnection mechanism can be implemented.
@@ -150,6 +151,11 @@ As older slaves don't support negotiating protocol features,
 a feature bit was dedicated for this purpose:
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
 
+If the slave supports VHOST_USER_PROTOCOL_F_SLAVE_REQ protocol feature, the
+master may create a secondary Unix domain socket and send its file descriptor
+to the slave using VHOST_USER_SET_SLAVE_REQ_FD request. This new channel enables
+the slave to send message requests and master to send message replies.
+
 Starting and stopping rings
 ----------------------
 Client must only process each ring when it is started.
@@ -260,6 +266,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_RARP           2
 #define VHOST_USER_PROTOCOL_F_REPLY_ACK      3
 #define VHOST_USER_PROTOCOL_F_MTU            4
+#define VHOST_USER_PROTOCOL_F_SLAVE_REQ      5
 
 Message types
 -------------
@@ -268,6 +275,7 @@ Message types
 
       Id: 1
       Equivalent ioctl: VHOST_GET_FEATURES
+      Initiator: Master
       Master payload: N/A
       Slave payload: u64
 
@@ -279,6 +287,7 @@ Message types
 
       Id: 2
       Ioctl: VHOST_SET_FEATURES
+      Initiator: Master
       Master payload: u64
 
       Enable features in the underlying vhost implementation using a bitmask.
@@ -289,6 +298,7 @@ Message types
 
       Id: 15
       Equivalent ioctl: VHOST_GET_FEATURES
+      Initiator: Master
       Master payload: N/A
       Slave payload: u64
 
@@ -302,6 +312,7 @@ Message types
 
       Id: 16
       Ioctl: VHOST_SET_FEATURES
+      Initiator: Master
       Master payload: u64
 
       Enable protocol features in the underlying vhost implementation.
@@ -314,6 +325,7 @@ Message types
 
       Id: 3
       Equivalent ioctl: VHOST_SET_OWNER
+      Initiator: Master
       Master payload: N/A
 
       Issued when a new connection is established. It sets the current Master
@@ -323,6 +335,7 @@ Message types
  * VHOST_USER_RESET_OWNER
 
       Id: 4
+      Initiator: Master
       Master payload: N/A
 
       This is no longer used. Used to be sent to request disabling
@@ -335,6 +348,7 @@ Message types
 
       Id: 5
       Equivalent ioctl: VHOST_SET_MEM_TABLE
+      Initiator: Master
       Master payload: memory regions description
 
       Sets the memory map regions on the slave so it can translate the vring
@@ -346,6 +360,7 @@ Message types
 
       Id: 6
       Equivalent ioctl: VHOST_SET_LOG_BASE
+      Initiator: Master
       Master payload: u64
       Slave payload: N/A
 
@@ -360,6 +375,7 @@ Message types
 
       Id: 7
       Equivalent ioctl: VHOST_SET_LOG_FD
+      Initiator: Master
       Master payload: N/A
 
       Sets the logging file descriptor, which is passed as ancillary data.
@@ -368,6 +384,7 @@ Message types
 
       Id: 8
       Equivalent ioctl: VHOST_SET_VRING_NUM
+      Initiator: Master
       Master payload: vring state description
 
       Set the size of the queue.
@@ -376,6 +393,7 @@ Message types
 
       Id: 9
       Equivalent ioctl: VHOST_SET_VRING_ADDR
+      Initiator: Master
       Master payload: vring address description
       Slave payload: N/A
 
@@ -385,6 +403,7 @@ Message types
 
       Id: 10
       Equivalent ioctl: VHOST_SET_VRING_BASE
+      Initiator: Master
       Master payload: vring state description
 
       Sets the base offset in the available vring.
@@ -393,6 +412,7 @@ Message types
 
       Id: 11
       Equivalent ioctl: VHOST_USER_GET_VRING_BASE
+      Initiator: Master
       Master payload: vring state description
       Slave payload: vring state description
 
@@ -402,6 +422,7 @@ Message types
 
       Id: 12
       Equivalent ioctl: VHOST_SET_VRING_KICK
+      Initiator: Master
       Master payload: u64
 
       Set the event file descriptor for adding buffers to the vring. It
@@ -415,6 +436,7 @@ Message types
 
       Id: 13
       Equivalent ioctl: VHOST_SET_VRING_CALL
+      Initiator: Master
       Master payload: u64
 
       Set the event file descriptor to signal when buffers are used. It
@@ -428,6 +450,7 @@ Message types
 
       Id: 14
       Equivalent ioctl: VHOST_SET_VRING_ERR
+      Initiator: Master
       Master payload: u64
 
       Set the event file descriptor to signal when error occurs. It
@@ -440,6 +463,7 @@ Message types
 
       Id: 17
       Equivalent ioctl: N/A
+      Initiator: Master
       Master payload: N/A
       Slave payload: u64
 
@@ -451,6 +475,7 @@ Message types
 
       Id: 18
       Equivalent ioctl: N/A
+      Initiator: Master
       Master payload: vring state description
 
       Signal slave to enable or disable corresponding vring.
@@ -461,6 +486,7 @@ Message types
 
       Id: 19
       Equivalent ioctl: N/A
+      Initiator: Master
       Master payload: u64
 
       Ask vhost user backend to broadcast a fake RARP to notify the migration
@@ -475,6 +501,7 @@ Message types
 
       Id: 20
       Equivalent ioctl: N/A
+      Initiator: Master
       Master payload: u64
 
       Set host MTU value exposed to the guest.
@@ -486,6 +513,17 @@ Message types
       If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, slave must respond
       with zero in case the specified MTU is valid, or non-zero otherwise.
 
+ * VHOST_USER_SET_SLAVE_REQ_FD
+
+      Id: 21
+      Equivalent ioctl: N/A
+      Initiator: Master
+
+      Set the socket file descriptor for slave initiated requests.
+      This request should be sent only when VHOST_USER_F_PROTOCOL_FEATURES
+      has been negotiated, and protocol feature bit VHOST_USER_PROTOCOL_F_SLAVE_REQ
+      bit is present in VHOST_USER_GET_PROTOCOL_FEATURES.
+
 VHOST_USER_PROTOCOL_F_REPLY_ACK:
 -------------------------------
 The original vhost-user specification only demands replies for certain
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support
  2017-04-11 10:10 [Qemu-devel] [RFC 0/2] vhost-user: Specify device IOTLB support Maxime Coquelin
  2017-04-11 10:10 ` [Qemu-devel] [RFC 1/2] spec/vhost-user: Introduce secondary channel for slave initiated requests Maxime Coquelin
@ 2017-04-11 10:10 ` Maxime Coquelin
  2017-04-11 13:20   ` Peter Xu
  1 sibling, 1 reply; 18+ messages in thread
From: Maxime Coquelin @ 2017-04-11 10:10 UTC (permalink / raw)
  To: mst, vkaplans, jasowang, wexu, peterx, yuanhan.liu,
	virtio-comment, qemu-devel
  Cc: Maxime Coquelin

This patch specifies the master/slave communication to support
device IOTLB implementation in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 docs/specs/vhost-user.txt | 56 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index b365047..048a4d6 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -97,6 +97,23 @@ Depending on the request type, payload can be:
    log offset: offset from start of supplied file descriptor
        where logging starts (i.e. where guest address 0 would be logged)
 
+ * An IOTLB message
+   ---------------------------------------------------------
+   | iova | size | user address | permissions flags | type |
+   ---------------------------------------------------------
+
+   IOVA: a 64-bit guest I/O virtual address
+   Size: a 64-bit size
+   User address: a 64-bit user address
+   Permissions flags: a 8-bit bit field:
+    - Bit 0: Read access
+    - Bit 1: Write access
+   Type: a 8-bit IOTLB message type:
+    - 1: IOTLB miss
+    - 2: IOTLB update
+    - 3: IOTLB invalidate
+    - 4: IOTLB access fail
+
 In QEMU the vhost-user message is implemented with the following struct:
 
 typedef struct VhostUserMsg {
@@ -109,6 +126,7 @@ typedef struct VhostUserMsg {
         struct vhost_vring_addr addr;
         VhostUserMemory memory;
         VhostUserLog log;
+        struct vhost_iotlb_msg iotlb;
     };
 } QEMU_PACKED VhostUserMsg;
 
@@ -258,6 +276,30 @@ Once the source has finished migration, rings will be stopped by
 the source. No further update must be done before rings are
 restarted.
 
+IOMMU support
+-------------
+
+When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master has
+to send IOTLB entries update & invalidation by sending VHOST_USER_IOTLB_MSG
+requests to the slave with a struct vhost_iotlb_msg payload. For update events,
+the iotlb payload has to be filled with the update message type (2), the I/O
+virtual address, the size, the user virtual address, and the permissions
+flags. For invalidation events, the iotlb payload has to be filled with the
+update message type (3), the I/O virtual address and the size. On success, the
+slave is expected to reply with a zero payload, non-zero otherwise.
+
+When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and the
+master initiated the slave to master communication channel using the
+VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and access
+failure events by sending VHOST_USER_IOTLB_MSG requests to the master with a
+struct vhost_iotlb_msg payload. For miss events, the iotlb payload has to be
+filled with the miss message type (1), the I/O virtual address and the
+permissions flags. For access failure event, the iotlb payload has to be
+filled with the access failure message type (4), the I/O virtual address and
+the permissions flags. On success, the master is expected to reply  when the
+request has been handled (for example, on miss requests, once the device IOTLB
+has been updated) with a zero payload, non-zero otherwise.
+
 Protocol features
 -----------------
 
@@ -524,6 +566,20 @@ Message types
       has been negotiated, and protocol feature bit VHOST_USER_PROTOCOL_F_SLAVE_REQ
       bit is present in VHOST_USER_GET_PROTOCOL_FEATURES.
 
+ * VHOST_USER_IOTLB_MSG
+
+      Id: 22
+      Equivalent ioctl: N/A (equivalent to VHOST_IOTLB_MSG message type)
+      Initiator: Master or slave
+
+      Send IOTLB messages with struct vhost_iotlb_msg as payload.
+      Master sends such requests to update and invalidate entries in the device
+      IOTLB. Slave sends such requests to notify of an IOTLB miss, or an IOTLB
+      access failure. The recipient has to acknowledge the request with
+      sending zero as u64 payload for success, non-zero otherwise.
+      This request should be send only when VIRTIO_F_IOMMU_PLATFORM feature
+      has been successfully negotiated.
+
 VHOST_USER_PROTOCOL_F_REPLY_ACK:
 -------------------------------
 The original vhost-user specification only demands replies for certain
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 1/2] spec/vhost-user: Introduce secondary channel for slave initiated requests
  2017-04-11 10:10 ` [Qemu-devel] [RFC 1/2] spec/vhost-user: Introduce secondary channel for slave initiated requests Maxime Coquelin
@ 2017-04-11 13:06   ` Marc-André Lureau
  2017-04-11 13:53     ` Maxime Coquelin
  0 siblings, 1 reply; 18+ messages in thread
From: Marc-André Lureau @ 2017-04-11 13:06 UTC (permalink / raw)
  To: Maxime Coquelin, mst, vkaplans, jasowang, wexu, peterx,
	yuanhan.liu, virtio-comment, qemu-devel

Hi

On Tue, Apr 11, 2017 at 12:10 PM Maxime Coquelin <maxime.coquelin@redhat.com>
wrote:

> This vhost-user specification update aims at enabling the
> slave to send requests to the master using a dedicated socket
> created by the master.
>
> It can be used for example when the slave implements a device
> IOTLB to send cache miss requests to the master.
>
> The message types list is updated with an "Initiator" field to
> indicate for each type whether the master and/or slave can
> initiate the request.
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>

This is very similar to a patch I proposed for shutdown slave initiated
requests:
https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg00095.html


> ---
>  docs/specs/vhost-user.txt | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
>
> diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> index 036890f..b365047 100644
> --- a/docs/specs/vhost-user.txt
> +++ b/docs/specs/vhost-user.txt
> @@ -139,6 +139,7 @@ in the ancillary data:
>   * VHOST_USER_SET_VRING_KICK
>   * VHOST_USER_SET_VRING_CALL
>   * VHOST_USER_SET_VRING_ERR
> + * VHOST_USER_SET_SLAVE_REQ_FD
>
>
I like "slave-req-fd" better than "slave-fd"


>  If Master is unable to send the full message or receives a wrong reply it
> will
>  close the connection. An optional reconnection mechanism can be
> implemented.
> @@ -150,6 +151,11 @@ As older slaves don't support negotiating protocol
> features,
>  a feature bit was dedicated for this purpose:
>  #define VHOST_USER_F_PROTOCOL_FEATURES 30
>
> +If the slave supports VHOST_USER_PROTOCOL_F_SLAVE_REQ protocol feature,
> the
> +master may create a secondary Unix domain socket and send its file
> descriptor
> +to the slave using VHOST_USER_SET_SLAVE_REQ_FD request. This new channel
> enables
> +the slave to send message requests and master to send message replies.
> +
>  Starting and stopping rings
>  ----------------------
>  Client must only process each ring when it is started.
> @@ -260,6 +266,7 @@ Protocol features
>  #define VHOST_USER_PROTOCOL_F_RARP           2
>  #define VHOST_USER_PROTOCOL_F_REPLY_ACK      3
>  #define VHOST_USER_PROTOCOL_F_MTU            4
> +#define VHOST_USER_PROTOCOL_F_SLAVE_REQ      5
>
>  Message types
>  -------------
> @@ -268,6 +275,7 @@ Message types
>
>        Id: 1
>        Equivalent ioctl: VHOST_GET_FEATURES
> +      Initiator: Master
>        Master payload: N/A
>        Slave payload: u64
>
> @@ -279,6 +287,7 @@ Message types
>
>        Id: 2
>        Ioctl: VHOST_SET_FEATURES
> +      Initiator: Master
>        Master payload: u64
>
>        Enable features in the underlying vhost implementation using a
> bitmask.
> @@ -289,6 +298,7 @@ Message types
>
>        Id: 15
>        Equivalent ioctl: VHOST_GET_FEATURES
> +      Initiator: Master
>        Master payload: N/A
>        Slave payload: u64
>
> @@ -302,6 +312,7 @@ Message types
>
>        Id: 16
>        Ioctl: VHOST_SET_FEATURES
> +      Initiator: Master
>        Master payload: u64
>
>        Enable protocol features in the underlying vhost implementation.
> @@ -314,6 +325,7 @@ Message types
>
>        Id: 3
>        Equivalent ioctl: VHOST_SET_OWNER
> +      Initiator: Master
>        Master payload: N/A
>
>        Issued when a new connection is established. It sets the current
> Master
> @@ -323,6 +335,7 @@ Message types
>   * VHOST_USER_RESET_OWNER
>
>        Id: 4
> +      Initiator: Master
>        Master payload: N/A
>
>        This is no longer used. Used to be sent to request disabling
> @@ -335,6 +348,7 @@ Message types
>
>        Id: 5
>        Equivalent ioctl: VHOST_SET_MEM_TABLE
> +      Initiator: Master
>        Master payload: memory regions description
>
>        Sets the memory map regions on the slave so it can translate the
> vring
> @@ -346,6 +360,7 @@ Message types
>
>        Id: 6
>        Equivalent ioctl: VHOST_SET_LOG_BASE
> +      Initiator: Master
>        Master payload: u64
>        Slave payload: N/A
>
> @@ -360,6 +375,7 @@ Message types
>
>        Id: 7
>        Equivalent ioctl: VHOST_SET_LOG_FD
> +      Initiator: Master
>        Master payload: N/A
>
>        Sets the logging file descriptor, which is passed as ancillary data.
> @@ -368,6 +384,7 @@ Message types
>
>        Id: 8
>        Equivalent ioctl: VHOST_SET_VRING_NUM
> +      Initiator: Master
>        Master payload: vring state description
>
>        Set the size of the queue.
> @@ -376,6 +393,7 @@ Message types
>
>        Id: 9
>        Equivalent ioctl: VHOST_SET_VRING_ADDR
> +      Initiator: Master
>        Master payload: vring address description
>        Slave payload: N/A
>
> @@ -385,6 +403,7 @@ Message types
>
>        Id: 10
>        Equivalent ioctl: VHOST_SET_VRING_BASE
> +      Initiator: Master
>        Master payload: vring state description
>
>        Sets the base offset in the available vring.
> @@ -393,6 +412,7 @@ Message types
>
>        Id: 11
>        Equivalent ioctl: VHOST_USER_GET_VRING_BASE
> +      Initiator: Master
>        Master payload: vring state description
>        Slave payload: vring state description
>
> @@ -402,6 +422,7 @@ Message types
>
>        Id: 12
>        Equivalent ioctl: VHOST_SET_VRING_KICK
> +      Initiator: Master
>        Master payload: u64
>
>        Set the event file descriptor for adding buffers to the vring. It
> @@ -415,6 +436,7 @@ Message types
>
>        Id: 13
>        Equivalent ioctl: VHOST_SET_VRING_CALL
> +      Initiator: Master
>        Master payload: u64
>
>        Set the event file descriptor to signal when buffers are used. It
> @@ -428,6 +450,7 @@ Message types
>
>        Id: 14
>        Equivalent ioctl: VHOST_SET_VRING_ERR
> +      Initiator: Master
>        Master payload: u64
>
>        Set the event file descriptor to signal when error occurs. It
> @@ -440,6 +463,7 @@ Message types
>
>        Id: 17
>        Equivalent ioctl: N/A
> +      Initiator: Master
>        Master payload: N/A
>        Slave payload: u64
>
> @@ -451,6 +475,7 @@ Message types
>
>        Id: 18
>        Equivalent ioctl: N/A
> +      Initiator: Master
>        Master payload: vring state description
>
>        Signal slave to enable or disable corresponding vring.
> @@ -461,6 +486,7 @@ Message types
>
>        Id: 19
>        Equivalent ioctl: N/A
> +      Initiator: Master
>        Master payload: u64
>
>        Ask vhost user backend to broadcast a fake RARP to notify the
> migration
> @@ -475,6 +501,7 @@ Message types
>
>        Id: 20
>        Equivalent ioctl: N/A
> +      Initiator: Master
>        Master payload: u64
>
>        Set host MTU value exposed to the guest.
> @@ -486,6 +513,17 @@ Message types
>        If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, slave must respond
>        with zero in case the specified MTU is valid, or non-zero otherwise.
>
> + * VHOST_USER_SET_SLAVE_REQ_FD
> +
> +      Id: 21
> +      Equivalent ioctl: N/A
> +      Initiator: Master
> +
> +      Set the socket file descriptor for slave initiated requests.
> +      This request should be sent only when VHOST_USER_F_PROTOCOL_FEATURES
> +      has been negotiated, and protocol feature bit
> VHOST_USER_PROTOCOL_F_SLAVE_REQ
> +      bit is present in VHOST_USER_GET_PROTOCOL_FEATURES.
> +
>

looks good to me


>  VHOST_USER_PROTOCOL_F_REPLY_ACK:
>  -------------------------------
>  The original vhost-user specification only demands replies for certain
> --
> 2.9.3
>
>
> --
Marc-André Lureau

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support
  2017-04-11 10:10 ` [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support Maxime Coquelin
@ 2017-04-11 13:20   ` Peter Xu
  2017-04-11 15:16     ` Maxime Coquelin
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Xu @ 2017-04-11 13:20 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: mst, vkaplans, jasowang, wexu, yuanhan.liu, virtio-comment, qemu-devel

On Tue, Apr 11, 2017 at 12:10:02PM +0200, Maxime Coquelin wrote:
> This patch specifies the master/slave communication to support
> device IOTLB implementation in slave.
> 
> The vhost_iotlb_msg structure introduced for kernel backends is
> re-used, making the design close between the two backends.
> 
> An exception is the use of the secondary channel to enable the
> slave to send IOTLB miss requests to the master.
> 
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>  docs/specs/vhost-user.txt | 56 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 56 insertions(+)
> 
> diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
> index b365047..048a4d6 100644
> --- a/docs/specs/vhost-user.txt
> +++ b/docs/specs/vhost-user.txt
> @@ -97,6 +97,23 @@ Depending on the request type, payload can be:
>     log offset: offset from start of supplied file descriptor
>         where logging starts (i.e. where guest address 0 would be logged)
>  
> + * An IOTLB message
> +   ---------------------------------------------------------
> +   | iova | size | user address | permissions flags | type |
> +   ---------------------------------------------------------
> +
> +   IOVA: a 64-bit guest I/O virtual address
> +   Size: a 64-bit size
> +   User address: a 64-bit user address
> +   Permissions flags: a 8-bit bit field:
> +    - Bit 0: Read access
> +    - Bit 1: Write access
> +   Type: a 8-bit IOTLB message type:
> +    - 1: IOTLB miss
> +    - 2: IOTLB update
> +    - 3: IOTLB invalidate
> +    - 4: IOTLB access fail
> +
>  In QEMU the vhost-user message is implemented with the following struct:
>  
>  typedef struct VhostUserMsg {
> @@ -109,6 +126,7 @@ typedef struct VhostUserMsg {
>          struct vhost_vring_addr addr;
>          VhostUserMemory memory;
>          VhostUserLog log;
> +        struct vhost_iotlb_msg iotlb;
>      };
>  } QEMU_PACKED VhostUserMsg;
>  
> @@ -258,6 +276,30 @@ Once the source has finished migration, rings will be stopped by
>  the source. No further update must be done before rings are
>  restarted.
>  
> +IOMMU support
> +-------------
> +
> +When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master has
> +to send IOTLB entries update & invalidation by sending VHOST_USER_IOTLB_MSG
> +requests to the slave with a struct vhost_iotlb_msg payload. For update events,
> +the iotlb payload has to be filled with the update message type (2), the I/O
> +virtual address, the size, the user virtual address, and the permissions
> +flags. For invalidation events, the iotlb payload has to be filled with the
> +update message type (3), the I/O virtual address and the size. On success, the

s/update/invalidate/?

> +slave is expected to reply with a zero payload, non-zero otherwise.

Is this ack mechanism really necessary? If not, not sure it'll be nice
to keep vhost-user/vhost-kernel aligned on this behavior. At least
that'll simplify vhost-user implementation on QEMU side (iiuc even
without introducing new functions for update/invalidate operations).

> +
> +When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and the
> +master initiated the slave to master communication channel using the
> +VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and access
> +failure events by sending VHOST_USER_IOTLB_MSG requests to the master with a
> +struct vhost_iotlb_msg payload. For miss events, the iotlb payload has to be
> +filled with the miss message type (1), the I/O virtual address and the
> +permissions flags. For access failure event, the iotlb payload has to be
> +filled with the access failure message type (4), the I/O virtual address and
> +the permissions flags. On success, the master is expected to reply  when the
> +request has been handled (for example, on miss requests, once the device IOTLB
> +has been updated) with a zero payload, non-zero otherwise.

Failed to understand the last sentence clearly. IIUC vhost-net will
reply with an UPDATE message when a MISS message is received. Here for
vhost-user are we going to send one extra zero payload after that?

> +
>  Protocol features
>  -----------------
>  
> @@ -524,6 +566,20 @@ Message types
>        has been negotiated, and protocol feature bit VHOST_USER_PROTOCOL_F_SLAVE_REQ
>        bit is present in VHOST_USER_GET_PROTOCOL_FEATURES.
>  
> + * VHOST_USER_IOTLB_MSG
> +
> +      Id: 22
> +      Equivalent ioctl: N/A (equivalent to VHOST_IOTLB_MSG message type)
> +      Initiator: Master or slave
> +
> +      Send IOTLB messages with struct vhost_iotlb_msg as payload.
> +      Master sends such requests to update and invalidate entries in the device
> +      IOTLB. Slave sends such requests to notify of an IOTLB miss, or an IOTLB

s/of//?

> +      access failure. The recipient has to acknowledge the request with
> +      sending zero as u64 payload for success, non-zero otherwise.

Same question here...

Thanks,

> +      This request should be send only when VIRTIO_F_IOMMU_PLATFORM feature
> +      has been successfully negotiated.
> +
>  VHOST_USER_PROTOCOL_F_REPLY_ACK:
>  -------------------------------
>  The original vhost-user specification only demands replies for certain
> -- 
> 2.9.3
> 

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 1/2] spec/vhost-user: Introduce secondary channel for slave initiated requests
  2017-04-11 13:06   ` Marc-André Lureau
@ 2017-04-11 13:53     ` Maxime Coquelin
  2017-04-14  9:03       ` Marc-André Lureau
  0 siblings, 1 reply; 18+ messages in thread
From: Maxime Coquelin @ 2017-04-11 13:53 UTC (permalink / raw)
  To: Marc-André Lureau, mst, vkaplans, jasowang, wexu, peterx,
	yuanhan.liu, virtio-comment, qemu-devel

Hi Marc-André,

On 04/11/2017 03:06 PM, Marc-André Lureau wrote:
> Hi
>
> On Tue, Apr 11, 2017 at 12:10 PM Maxime Coquelin
> <maxime.coquelin@redhat.com <mailto:maxime.coquelin@redhat.com>> wrote:
>
>     This vhost-user specification update aims at enabling the
>     slave to send requests to the master using a dedicated socket
>     created by the master.
>
>     It can be used for example when the slave implements a device
>     IOTLB to send cache miss requests to the master.
>
>     The message types list is updated with an "Initiator" field to
>     indicate for each type whether the master and/or slave can
>     initiate the request.
>
>     Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com
>     <mailto:maxime.coquelin@redhat.com>>
>
>
> This is very similar to a patch I proposed for shutdown slave initiated
> requests:
> https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg00095.html

Indeed, thanks for pointing this out, I wasn't aware of your series.

I find your proposal of having dedicated messages types
(VHOST_USER_SLAVE_*) cleaner.

Are you ok if I handover your patch, and replace
VHOST_USER_SET_SLAVE_FD to VHOST_USER_SET_SLAVE_REQ_FD?

>
>
>     ---
>      docs/specs/vhost-user.txt | 38 ++++++++++++++++++++++++++++++++++++++
>      1 file changed, 38 insertions(+)
>
>     diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
>     index 036890f..b365047 100644
>     --- a/docs/specs/vhost-user.txt
>     +++ b/docs/specs/vhost-user.txt
>     @@ -139,6 +139,7 @@ in the ancillary data:
>       * VHOST_USER_SET_VRING_KICK
>       * VHOST_USER_SET_VRING_CALL
>       * VHOST_USER_SET_VRING_ERR
>     + * VHOST_USER_SET_SLAVE_REQ_FD
>
>
> I like "slave-req-fd" better than "slave-fd"
>
>
>      If Master is unable to send the full message or receives a wrong
>     reply it will
>      close the connection. An optional reconnection mechanism can be
>     implemented.
>     @@ -150,6 +151,11 @@ As older slaves don't support negotiating
>     protocol features,
>      a feature bit was dedicated for this purpose:
>      #define VHOST_USER_F_PROTOCOL_FEATURES 30
>
>     +If the slave supports VHOST_USER_PROTOCOL_F_SLAVE_REQ protocol
>     feature, the
>     +master may create a secondary Unix domain socket and send its file
>     descriptor
>     +to the slave using VHOST_USER_SET_SLAVE_REQ_FD request. This new
>     channel enables
>     +the slave to send message requests and master to send message replies.
>     +
>      Starting and stopping rings
>      ----------------------
>      Client must only process each ring when it is started.
>     @@ -260,6 +266,7 @@ Protocol features
>      #define VHOST_USER_PROTOCOL_F_RARP           2
>      #define VHOST_USER_PROTOCOL_F_REPLY_ACK      3
>      #define VHOST_USER_PROTOCOL_F_MTU            4
>     +#define VHOST_USER_PROTOCOL_F_SLAVE_REQ      5
>
>      Message types
>      -------------
>     @@ -268,6 +275,7 @@ Message types
>
>            Id: 1
>            Equivalent ioctl: VHOST_GET_FEATURES
>     +      Initiator: Master
>            Master payload: N/A
>            Slave payload: u64
>
>     @@ -279,6 +287,7 @@ Message types
>
>            Id: 2
>            Ioctl: VHOST_SET_FEATURES
>     +      Initiator: Master
>            Master payload: u64
>
>            Enable features in the underlying vhost implementation using
>     a bitmask.
>     @@ -289,6 +298,7 @@ Message types
>
>            Id: 15
>            Equivalent ioctl: VHOST_GET_FEATURES
>     +      Initiator: Master
>            Master payload: N/A
>            Slave payload: u64
>
>     @@ -302,6 +312,7 @@ Message types
>
>            Id: 16
>            Ioctl: VHOST_SET_FEATURES
>     +      Initiator: Master
>            Master payload: u64
>
>            Enable protocol features in the underlying vhost implementation.
>     @@ -314,6 +325,7 @@ Message types
>
>            Id: 3
>            Equivalent ioctl: VHOST_SET_OWNER
>     +      Initiator: Master
>            Master payload: N/A
>
>            Issued when a new connection is established. It sets the
>     current Master
>     @@ -323,6 +335,7 @@ Message types
>       * VHOST_USER_RESET_OWNER
>
>            Id: 4
>     +      Initiator: Master
>            Master payload: N/A
>
>            This is no longer used. Used to be sent to request disabling
>     @@ -335,6 +348,7 @@ Message types
>
>            Id: 5
>            Equivalent ioctl: VHOST_SET_MEM_TABLE
>     +      Initiator: Master
>            Master payload: memory regions description
>
>            Sets the memory map regions on the slave so it can translate
>     the vring
>     @@ -346,6 +360,7 @@ Message types
>
>            Id: 6
>            Equivalent ioctl: VHOST_SET_LOG_BASE
>     +      Initiator: Master
>            Master payload: u64
>            Slave payload: N/A
>
>     @@ -360,6 +375,7 @@ Message types
>
>            Id: 7
>            Equivalent ioctl: VHOST_SET_LOG_FD
>     +      Initiator: Master
>            Master payload: N/A
>
>            Sets the logging file descriptor, which is passed as
>     ancillary data.
>     @@ -368,6 +384,7 @@ Message types
>
>            Id: 8
>            Equivalent ioctl: VHOST_SET_VRING_NUM
>     +      Initiator: Master
>            Master payload: vring state description
>
>            Set the size of the queue.
>     @@ -376,6 +393,7 @@ Message types
>
>            Id: 9
>            Equivalent ioctl: VHOST_SET_VRING_ADDR
>     +      Initiator: Master
>            Master payload: vring address description
>            Slave payload: N/A
>
>     @@ -385,6 +403,7 @@ Message types
>
>            Id: 10
>            Equivalent ioctl: VHOST_SET_VRING_BASE
>     +      Initiator: Master
>            Master payload: vring state description
>
>            Sets the base offset in the available vring.
>     @@ -393,6 +412,7 @@ Message types
>
>            Id: 11
>            Equivalent ioctl: VHOST_USER_GET_VRING_BASE
>     +      Initiator: Master
>            Master payload: vring state description
>            Slave payload: vring state description
>
>     @@ -402,6 +422,7 @@ Message types
>
>            Id: 12
>            Equivalent ioctl: VHOST_SET_VRING_KICK
>     +      Initiator: Master
>            Master payload: u64
>
>            Set the event file descriptor for adding buffers to the vring. It
>     @@ -415,6 +436,7 @@ Message types
>
>            Id: 13
>            Equivalent ioctl: VHOST_SET_VRING_CALL
>     +      Initiator: Master
>            Master payload: u64
>
>            Set the event file descriptor to signal when buffers are used. It
>     @@ -428,6 +450,7 @@ Message types
>
>            Id: 14
>            Equivalent ioctl: VHOST_SET_VRING_ERR
>     +      Initiator: Master
>            Master payload: u64
>
>            Set the event file descriptor to signal when error occurs. It
>     @@ -440,6 +463,7 @@ Message types
>
>            Id: 17
>            Equivalent ioctl: N/A
>     +      Initiator: Master
>            Master payload: N/A
>            Slave payload: u64
>
>     @@ -451,6 +475,7 @@ Message types
>
>            Id: 18
>            Equivalent ioctl: N/A
>     +      Initiator: Master
>            Master payload: vring state description
>
>            Signal slave to enable or disable corresponding vring.
>     @@ -461,6 +486,7 @@ Message types
>
>            Id: 19
>            Equivalent ioctl: N/A
>     +      Initiator: Master
>            Master payload: u64
>
>            Ask vhost user backend to broadcast a fake RARP to notify the
>     migration
>     @@ -475,6 +501,7 @@ Message types
>
>            Id: 20
>            Equivalent ioctl: N/A
>     +      Initiator: Master
>            Master payload: u64
>
>            Set host MTU value exposed to the guest.
>     @@ -486,6 +513,17 @@ Message types
>            If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, slave must
>     respond
>            with zero in case the specified MTU is valid, or non-zero
>     otherwise.
>
>     + * VHOST_USER_SET_SLAVE_REQ_FD
>     +
>     +      Id: 21
>     +      Equivalent ioctl: N/A
>     +      Initiator: Master
>     +
>     +      Set the socket file descriptor for slave initiated requests.
>     +      This request should be sent only when
>     VHOST_USER_F_PROTOCOL_FEATURES
>     +      has been negotiated, and protocol feature bit
>     VHOST_USER_PROTOCOL_F_SLAVE_REQ
>     +      bit is present in VHOST_USER_GET_PROTOCOL_FEATURES.
>     +
>
>
> looks good to me
>
>
>      VHOST_USER_PROTOCOL_F_REPLY_ACK:
>      -------------------------------
>      The original vhost-user specification only demands replies for certain
>     --
>     2.9.3
>
>
> --
> Marc-André Lureau

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support
  2017-04-11 13:20   ` Peter Xu
@ 2017-04-11 15:16     ` Maxime Coquelin
  2017-04-12  7:17       ` Peter Xu
  0 siblings, 1 reply; 18+ messages in thread
From: Maxime Coquelin @ 2017-04-11 15:16 UTC (permalink / raw)
  To: Peter Xu
  Cc: mst, vkaplans, jasowang, wexu, yuanhan.liu, virtio-comment, qemu-devel



On 04/11/2017 03:20 PM, Peter Xu wrote:
> On Tue, Apr 11, 2017 at 12:10:02PM +0200, Maxime Coquelin wrote:
>> This patch specifies the master/slave communication to support
>> device IOTLB implementation in slave.
>>
>> The vhost_iotlb_msg structure introduced for kernel backends is
>> re-used, making the design close between the two backends.
>>
>> An exception is the use of the secondary channel to enable the
>> slave to send IOTLB miss requests to the master.
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>>  docs/specs/vhost-user.txt | 56 +++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 56 insertions(+)
>>
>> diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
>> index b365047..048a4d6 100644
>> --- a/docs/specs/vhost-user.txt
>> +++ b/docs/specs/vhost-user.txt
>> @@ -97,6 +97,23 @@ Depending on the request type, payload can be:
>>     log offset: offset from start of supplied file descriptor
>>         where logging starts (i.e. where guest address 0 would be logged)
>>
>> + * An IOTLB message
>> +   ---------------------------------------------------------
>> +   | iova | size | user address | permissions flags | type |
>> +   ---------------------------------------------------------
>> +
>> +   IOVA: a 64-bit guest I/O virtual address
>> +   Size: a 64-bit size
>> +   User address: a 64-bit user address
>> +   Permissions flags: a 8-bit bit field:
>> +    - Bit 0: Read access
>> +    - Bit 1: Write access
>> +   Type: a 8-bit IOTLB message type:
>> +    - 1: IOTLB miss
>> +    - 2: IOTLB update
>> +    - 3: IOTLB invalidate
>> +    - 4: IOTLB access fail
>> +
>>  In QEMU the vhost-user message is implemented with the following struct:
>>
>>  typedef struct VhostUserMsg {
>> @@ -109,6 +126,7 @@ typedef struct VhostUserMsg {
>>          struct vhost_vring_addr addr;
>>          VhostUserMemory memory;
>>          VhostUserLog log;
>> +        struct vhost_iotlb_msg iotlb;
>>      };
>>  } QEMU_PACKED VhostUserMsg;
>>
>> @@ -258,6 +276,30 @@ Once the source has finished migration, rings will be stopped by
>>  the source. No further update must be done before rings are
>>  restarted.
>>
>> +IOMMU support
>> +-------------
>> +
>> +When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master has
>> +to send IOTLB entries update & invalidation by sending VHOST_USER_IOTLB_MSG
>> +requests to the slave with a struct vhost_iotlb_msg payload. For update events,
>> +the iotlb payload has to be filled with the update message type (2), the I/O
>> +virtual address, the size, the user virtual address, and the permissions
>> +flags. For invalidation events, the iotlb payload has to be filled with the
>> +update message type (3), the I/O virtual address and the size. On success, the
>
> s/update/invalidate/?

Indeed.

>
>> +slave is expected to reply with a zero payload, non-zero otherwise.
>
> Is this ack mechanism really necessary? If not, not sure it'll be nice
> to keep vhost-user/vhost-kernel aligned on this behavior. At least
> that'll simplify vhost-user implementation on QEMU side (iiuc even
> without introducing new functions for update/invalidate operations).

I think this is necessary, and it won't complexify the vhost-user
implementation on QEMU side, since already widely used (see reply-ack
feature).

This reply-ack mechanism is used to obtain a behaviour closer to kernel
backend. Indeed, when QEMU sends a vhost_msg to the kernel backend, it
is blocked in the write() while the message is being processed in the
Kernel. With user backend, QEMU is unblocked from the write() when the
backend has read the message, before it is being processed.


>> +
>> +When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and the
>> +master initiated the slave to master communication channel using the
>> +VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and access
>> +failure events by sending VHOST_USER_IOTLB_MSG requests to the master with a
>> +struct vhost_iotlb_msg payload. For miss events, the iotlb payload has to be
>> +filled with the miss message type (1), the I/O virtual address and the
>> +permissions flags. For access failure event, the iotlb payload has to be
>> +filled with the access failure message type (4), the I/O virtual address and
>> +the permissions flags. On success, the master is expected to reply  when the
>> +request has been handled (for example, on miss requests, once the device IOTLB
>> +has been updated) with a zero payload, non-zero otherwise.
>
> Failed to understand the last sentence clearly. IIUC vhost-net will
> reply with an UPDATE message when a MISS message is received. Here for
> vhost-user are we going to send one extra zero payload after that?

Not exactly. There are two channels, one for QEMU to backend requests
(channel A), one for backend to QEMU requests (channel B).

The backend may be multi-threaded (like DPDK), one thread for handling
QEMU initiated requests (channel A), the others to handle packet
processing (i.e. one for Rx, one for Tx).

The processing threads will need to translate iova adresses by
searching in the IOTLB cache. In case of miss, it will send an IOTLB
miss request on channel B, and then wait for the ack/nack. In case of
ack, it can search again the IOTLB cache and find the translation.

On QEMU side, when the thread handling channel B requests receives the
IOTLB miss message, it gets the translation and send an IOTLB update
message on channel A. Then it waits for the ack from the backend,
meaning that the IOTLB cache has been updated, and replies ack on
channel B.

Doing this, in backend, we have only one writer in the IOTLB cache, and
multiple readers.

>> +
>>  Protocol features
>>  -----------------
>>
>> @@ -524,6 +566,20 @@ Message types
>>        has been negotiated, and protocol feature bit VHOST_USER_PROTOCOL_F_SLAVE_REQ
>>        bit is present in VHOST_USER_GET_PROTOCOL_FEATURES.
>>
>> + * VHOST_USER_IOTLB_MSG
>> +
>> +      Id: 22
>> +      Equivalent ioctl: N/A (equivalent to VHOST_IOTLB_MSG message type)
>> +      Initiator: Master or slave
>> +
>> +      Send IOTLB messages with struct vhost_iotlb_msg as payload.
>> +      Master sends such requests to update and invalidate entries in the device
>> +      IOTLB. Slave sends such requests to notify of an IOTLB miss, or an IOTLB
>
> s/of//?

Yes.

>
>> +      access failure. The recipient has to acknowledge the request with
>> +      sending zero as u64 payload for success, non-zero otherwise.
>
> Same question here...

Thanks,
Maxime

> Thanks,
>
>> +      This request should be send only when VIRTIO_F_IOMMU_PLATFORM feature
>> +      has been successfully negotiated.
>> +
>>  VHOST_USER_PROTOCOL_F_REPLY_ACK:
>>  -------------------------------
>>  The original vhost-user specification only demands replies for certain
>> --
>> 2.9.3
>>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support
  2017-04-11 15:16     ` Maxime Coquelin
@ 2017-04-12  7:17       ` Peter Xu
  2017-04-12  7:24         ` Maxime Coquelin
  2017-04-12  8:54         ` Jason Wang
  0 siblings, 2 replies; 18+ messages in thread
From: Peter Xu @ 2017-04-12  7:17 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: mst, vkaplans, jasowang, wexu, yuanhan.liu, virtio-comment, qemu-devel

On Tue, Apr 11, 2017 at 05:16:19PM +0200, Maxime Coquelin wrote:
> On 04/11/2017 03:20 PM, Peter Xu wrote:
> >On Tue, Apr 11, 2017 at 12:10:02PM +0200, Maxime Coquelin wrote:

[...]

> >
> >>+slave is expected to reply with a zero payload, non-zero otherwise.
> >
> >Is this ack mechanism really necessary? If not, not sure it'll be nice
> >to keep vhost-user/vhost-kernel aligned on this behavior. At least
> >that'll simplify vhost-user implementation on QEMU side (iiuc even
> >without introducing new functions for update/invalidate operations).
> 
> I think this is necessary, and it won't complexify the vhost-user
> implementation on QEMU side, since already widely used (see reply-ack
> feature).

Could you provide file/function/link pointer to the "reply-ack"
feature? I failed to find it myself.

> 
> This reply-ack mechanism is used to obtain a behaviour closer to kernel
> backend. Indeed, when QEMU sends a vhost_msg to the kernel backend, it
> is blocked in the write() while the message is being processed in the
> Kernel. With user backend, QEMU is unblocked from the write() when the
> backend has read the message, before it is being processed.
> 

I see. Then I agree with you that we may need a synchronized way to do
it. One thing I think of is IOMMU page invalidation - it should be a
sync operation to make sure that all the related caches were destroyed
when the invalidation command returns in QEMU vIOMMU emulation path.

> 
> >>+
> >>+When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and the
> >>+master initiated the slave to master communication channel using the
> >>+VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and access
> >>+failure events by sending VHOST_USER_IOTLB_MSG requests to the master with a
> >>+struct vhost_iotlb_msg payload. For miss events, the iotlb payload has to be
> >>+filled with the miss message type (1), the I/O virtual address and the
> >>+permissions flags. For access failure event, the iotlb payload has to be
> >>+filled with the access failure message type (4), the I/O virtual address and
> >>+the permissions flags. On success, the master is expected to reply  when the
> >>+request has been handled (for example, on miss requests, once the device IOTLB
> >>+has been updated) with a zero payload, non-zero otherwise.
> >
> >Failed to understand the last sentence clearly. IIUC vhost-net will
> >reply with an UPDATE message when a MISS message is received. Here for
> >vhost-user are we going to send one extra zero payload after that?
> 
> Not exactly. There are two channels, one for QEMU to backend requests
> (channel A), one for backend to QEMU requests (channel B).
> 
> The backend may be multi-threaded (like DPDK), one thread for handling
> QEMU initiated requests (channel A), the others to handle packet
> processing (i.e. one for Rx, one for Tx).
> 
> The processing threads will need to translate iova adresses by
> searching in the IOTLB cache. In case of miss, it will send an IOTLB
> miss request on channel B, and then wait for the ack/nack. In case of
> ack, it can search again the IOTLB cache and find the translation.
> 
> On QEMU side, when the thread handling channel B requests receives the
> IOTLB miss message, it gets the translation and send an IOTLB update
> message on channel A. Then it waits for the ack from the backend,
> meaning that the IOTLB cache has been updated, and replies ack on
> channel B.

If the ack on channel B is used to notify the processing thread that
"cache is ready", then... would it be faster that we just let the
processing thread poll the cache until it finds it, or let the other
thread notify it when it receives ack on channel A? Not sure whether
it'll be faster.

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support
  2017-04-12  7:17       ` Peter Xu
@ 2017-04-12  7:24         ` Maxime Coquelin
  2017-04-12  7:49           ` Peter Xu
  2017-04-12  9:00           ` Jason Wang
  2017-04-12  8:54         ` Jason Wang
  1 sibling, 2 replies; 18+ messages in thread
From: Maxime Coquelin @ 2017-04-12  7:24 UTC (permalink / raw)
  To: Peter Xu
  Cc: mst, vkaplans, jasowang, wexu, yuanhan.liu, virtio-comment, qemu-devel



On 04/12/2017 09:17 AM, Peter Xu wrote:
> On Tue, Apr 11, 2017 at 05:16:19PM +0200, Maxime Coquelin wrote:
>> On 04/11/2017 03:20 PM, Peter Xu wrote:
>>> On Tue, Apr 11, 2017 at 12:10:02PM +0200, Maxime Coquelin wrote:
>
> [...]
>
>>>
>>>> +slave is expected to reply with a zero payload, non-zero otherwise.
>>>
>>> Is this ack mechanism really necessary? If not, not sure it'll be nice
>>> to keep vhost-user/vhost-kernel aligned on this behavior. At least
>>> that'll simplify vhost-user implementation on QEMU side (iiuc even
>>> without introducing new functions for update/invalidate operations).
>>
>> I think this is necessary, and it won't complexify the vhost-user
>> implementation on QEMU side, since already widely used (see reply-ack
>> feature).
>
> Could you provide file/function/link pointer to the "reply-ack"
> feature? I failed to find it myself.
>
>>
>> This reply-ack mechanism is used to obtain a behaviour closer to kernel
>> backend. Indeed, when QEMU sends a vhost_msg to the kernel backend, it
>> is blocked in the write() while the message is being processed in the
>> Kernel. With user backend, QEMU is unblocked from the write() when the
>> backend has read the message, before it is being processed.
>>
>
> I see. Then I agree with you that we may need a synchronized way to do
> it. One thing I think of is IOMMU page invalidation - it should be a
> sync operation to make sure that all the related caches were destroyed
> when the invalidation command returns in QEMU vIOMMU emulation path.
>
>>
>>>> +
>>>> +When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and the
>>>> +master initiated the slave to master communication channel using the
>>>> +VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and access
>>>> +failure events by sending VHOST_USER_IOTLB_MSG requests to the master with a
>>>> +struct vhost_iotlb_msg payload. For miss events, the iotlb payload has to be
>>>> +filled with the miss message type (1), the I/O virtual address and the
>>>> +permissions flags. For access failure event, the iotlb payload has to be
>>>> +filled with the access failure message type (4), the I/O virtual address and
>>>> +the permissions flags. On success, the master is expected to reply  when the
>>>> +request has been handled (for example, on miss requests, once the device IOTLB
>>>> +has been updated) with a zero payload, non-zero otherwise.
>>>
>>> Failed to understand the last sentence clearly. IIUC vhost-net will
>>> reply with an UPDATE message when a MISS message is received. Here for
>>> vhost-user are we going to send one extra zero payload after that?
>>
>> Not exactly. There are two channels, one for QEMU to backend requests
>> (channel A), one for backend to QEMU requests (channel B).
>>
>> The backend may be multi-threaded (like DPDK), one thread for handling
>> QEMU initiated requests (channel A), the others to handle packet
>> processing (i.e. one for Rx, one for Tx).
>>
>> The processing threads will need to translate iova adresses by
>> searching in the IOTLB cache. In case of miss, it will send an IOTLB
>> miss request on channel B, and then wait for the ack/nack. In case of
>> ack, it can search again the IOTLB cache and find the translation.
>>
>> On QEMU side, when the thread handling channel B requests receives the
>> IOTLB miss message, it gets the translation and send an IOTLB update
>> message on channel A. Then it waits for the ack from the backend,
>> meaning that the IOTLB cache has been updated, and replies ack on
>> channel B.
>
> If the ack on channel B is used to notify the processing thread that
> "cache is ready", then... would it be faster that we just let the
> processing thread poll the cache until it finds it, or let the other
> thread notify it when it receives ack on channel A? Not sure whether
> it'll be faster.

Not sure either.
Not requiring a ack can indeed make sense in some cases, for example
with single-threaded backends.

What we can do is to remove the mandatory ack reply for
VHOST_USER_IOTLB_MSG slave requests (miss, access fail).
The backend then can just rely on the REPLY_ACK feature, and set the
VHOST_USER_NEED_REPLY flag if it want to receive such ack.

Would it be fine for you?

Thanks,
Maxime

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support
  2017-04-12  7:24         ` Maxime Coquelin
@ 2017-04-12  7:49           ` Peter Xu
  2017-04-12  9:00           ` Jason Wang
  1 sibling, 0 replies; 18+ messages in thread
From: Peter Xu @ 2017-04-12  7:49 UTC (permalink / raw)
  To: Maxime Coquelin
  Cc: mst, vkaplans, jasowang, wexu, yuanhan.liu, virtio-comment, qemu-devel

On Wed, Apr 12, 2017 at 09:24:47AM +0200, Maxime Coquelin wrote:
> 
> 
> On 04/12/2017 09:17 AM, Peter Xu wrote:
> >On Tue, Apr 11, 2017 at 05:16:19PM +0200, Maxime Coquelin wrote:
> >>On 04/11/2017 03:20 PM, Peter Xu wrote:
> >>>On Tue, Apr 11, 2017 at 12:10:02PM +0200, Maxime Coquelin wrote:
> >
> >[...]
> >
> >>>
> >>>>+slave is expected to reply with a zero payload, non-zero otherwise.
> >>>
> >>>Is this ack mechanism really necessary? If not, not sure it'll be nice
> >>>to keep vhost-user/vhost-kernel aligned on this behavior. At least
> >>>that'll simplify vhost-user implementation on QEMU side (iiuc even
> >>>without introducing new functions for update/invalidate operations).
> >>
> >>I think this is necessary, and it won't complexify the vhost-user
> >>implementation on QEMU side, since already widely used (see reply-ack
> >>feature).
> >
> >Could you provide file/function/link pointer to the "reply-ack"
> >feature? I failed to find it myself.
> >
> >>
> >>This reply-ack mechanism is used to obtain a behaviour closer to kernel
> >>backend. Indeed, when QEMU sends a vhost_msg to the kernel backend, it
> >>is blocked in the write() while the message is being processed in the
> >>Kernel. With user backend, QEMU is unblocked from the write() when the
> >>backend has read the message, before it is being processed.
> >>
> >
> >I see. Then I agree with you that we may need a synchronized way to do
> >it. One thing I think of is IOMMU page invalidation - it should be a
> >sync operation to make sure that all the related caches were destroyed
> >when the invalidation command returns in QEMU vIOMMU emulation path.
> >
> >>
> >>>>+
> >>>>+When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the slave, and the
> >>>>+master initiated the slave to master communication channel using the
> >>>>+VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB miss and access
> >>>>+failure events by sending VHOST_USER_IOTLB_MSG requests to the master with a
> >>>>+struct vhost_iotlb_msg payload. For miss events, the iotlb payload has to be
> >>>>+filled with the miss message type (1), the I/O virtual address and the
> >>>>+permissions flags. For access failure event, the iotlb payload has to be
> >>>>+filled with the access failure message type (4), the I/O virtual address and
> >>>>+the permissions flags. On success, the master is expected to reply  when the
> >>>>+request has been handled (for example, on miss requests, once the device IOTLB
> >>>>+has been updated) with a zero payload, non-zero otherwise.
> >>>
> >>>Failed to understand the last sentence clearly. IIUC vhost-net will
> >>>reply with an UPDATE message when a MISS message is received. Here for
> >>>vhost-user are we going to send one extra zero payload after that?
> >>
> >>Not exactly. There are two channels, one for QEMU to backend requests
> >>(channel A), one for backend to QEMU requests (channel B).
> >>
> >>The backend may be multi-threaded (like DPDK), one thread for handling
> >>QEMU initiated requests (channel A), the others to handle packet
> >>processing (i.e. one for Rx, one for Tx).
> >>
> >>The processing threads will need to translate iova adresses by
> >>searching in the IOTLB cache. In case of miss, it will send an IOTLB
> >>miss request on channel B, and then wait for the ack/nack. In case of
> >>ack, it can search again the IOTLB cache and find the translation.
> >>
> >>On QEMU side, when the thread handling channel B requests receives the
> >>IOTLB miss message, it gets the translation and send an IOTLB update
> >>message on channel A. Then it waits for the ack from the backend,
> >>meaning that the IOTLB cache has been updated, and replies ack on
> >>channel B.
> >
> >If the ack on channel B is used to notify the processing thread that
> >"cache is ready", then... would it be faster that we just let the
> >processing thread poll the cache until it finds it, or let the other
> >thread notify it when it receives ack on channel A? Not sure whether
> >it'll be faster.
> 
> Not sure either.
> Not requiring a ack can indeed make sense in some cases, for example
> with single-threaded backends.
> 
> What we can do is to remove the mandatory ack reply for
> VHOST_USER_IOTLB_MSG slave requests (miss, access fail).
> The backend then can just rely on the REPLY_ACK feature, and set the
> VHOST_USER_NEED_REPLY flag if it want to receive such ack.
> 
> Would it be fine for you?

Okay I found the REPLY_ACK feature now.

It's okay to me (actually, in either way). Thank you.

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support
  2017-04-12  7:17       ` Peter Xu
  2017-04-12  7:24         ` Maxime Coquelin
@ 2017-04-12  8:54         ` Jason Wang
  2017-04-12  9:26           ` Peter Xu
  1 sibling, 1 reply; 18+ messages in thread
From: Jason Wang @ 2017-04-12  8:54 UTC (permalink / raw)
  To: Peter Xu, Maxime Coquelin
  Cc: mst, vkaplans, wexu, yuanhan.liu, virtio-comment, qemu-devel



On 2017年04月12日 15:17, Peter Xu wrote:
> On Tue, Apr 11, 2017 at 05:16:19PM +0200, Maxime Coquelin wrote:
>> On 04/11/2017 03:20 PM, Peter Xu wrote:
>>> On Tue, Apr 11, 2017 at 12:10:02PM +0200, Maxime Coquelin wrote:
> [...]
>
>>>> +slave is expected to reply with a zero payload, non-zero otherwise.
>>> Is this ack mechanism really necessary? If not, not sure it'll be nice
>>> to keep vhost-user/vhost-kernel aligned on this behavior. At least
>>> that'll simplify vhost-user implementation on QEMU side (iiuc even
>>> without introducing new functions for update/invalidate operations).
>> I think this is necessary, and it won't complexify the vhost-user
>> implementation on QEMU side, since already widely used (see reply-ack
>> feature).
> Could you provide file/function/link pointer to the "reply-ack"
> feature? I failed to find it myself.
>
>> This reply-ack mechanism is used to obtain a behaviour closer to kernel
>> backend. Indeed, when QEMU sends a vhost_msg to the kernel backend, it
>> is blocked in the write() while the message is being processed in the
>> Kernel. With user backend, QEMU is unblocked from the write() when the
>> backend has read the message, before it is being processed.
>>
> I see. Then I agree with you that we may need a synchronized way to do
> it. One thing I think of is IOMMU page invalidation - it should be a
> sync operation to make sure that all the related caches were destroyed
> when the invalidation command returns in QEMU vIOMMU emulation path.
>

Looks not, if I understand correctly, e.g for Intel IOMMU, when QI is 
enabled, this could be done asynchronously by not waiting for the 
completion through wait descriptor. Vhost-kernel always implement the 
invalidation as a synchronous one for simplicity, but looks like this is 
not needed.

Thakns

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support
  2017-04-12  7:24         ` Maxime Coquelin
  2017-04-12  7:49           ` Peter Xu
@ 2017-04-12  9:00           ` Jason Wang
  1 sibling, 0 replies; 18+ messages in thread
From: Jason Wang @ 2017-04-12  9:00 UTC (permalink / raw)
  To: Maxime Coquelin, Peter Xu
  Cc: yuanhan.liu, mst, qemu-devel, wexu, virtio-comment, vkaplans



On 2017年04月12日 15:24, Maxime Coquelin wrote:
>
>
> On 04/12/2017 09:17 AM, Peter Xu wrote:
>> On Tue, Apr 11, 2017 at 05:16:19PM +0200, Maxime Coquelin wrote:
>>> On 04/11/2017 03:20 PM, Peter Xu wrote:
>>>> On Tue, Apr 11, 2017 at 12:10:02PM +0200, Maxime Coquelin wrote:
>>
>> [...]
>>
>>>>
>>>>> +slave is expected to reply with a zero payload, non-zero otherwise.
>>>>
>>>> Is this ack mechanism really necessary? If not, not sure it'll be nice
>>>> to keep vhost-user/vhost-kernel aligned on this behavior. At least
>>>> that'll simplify vhost-user implementation on QEMU side (iiuc even
>>>> without introducing new functions for update/invalidate operations).
>>>
>>> I think this is necessary, and it won't complexify the vhost-user
>>> implementation on QEMU side, since already widely used (see reply-ack
>>> feature).
>>
>> Could you provide file/function/link pointer to the "reply-ack"
>> feature? I failed to find it myself.
>>
>>>
>>> This reply-ack mechanism is used to obtain a behaviour closer to kernel
>>> backend. Indeed, when QEMU sends a vhost_msg to the kernel backend, it
>>> is blocked in the write() while the message is being processed in the
>>> Kernel. With user backend, QEMU is unblocked from the write() when the
>>> backend has read the message, before it is being processed.
>>>
>>
>> I see. Then I agree with you that we may need a synchronized way to do
>> it. One thing I think of is IOMMU page invalidation - it should be a
>> sync operation to make sure that all the related caches were destroyed
>> when the invalidation command returns in QEMU vIOMMU emulation path.
>>
>>>
>>>>> +
>>>>> +When the VHOST_USER_PROTOCOL_F_SLAVE_REQ is supported by the 
>>>>> slave, and the
>>>>> +master initiated the slave to master communication channel using the
>>>>> +VHOST_USER_SET_SLAVE_REQ_FD request, the slave can send IOTLB 
>>>>> miss and access
>>>>> +failure events by sending VHOST_USER_IOTLB_MSG requests to the 
>>>>> master with a
>>>>> +struct vhost_iotlb_msg payload. For miss events, the iotlb 
>>>>> payload has to be
>>>>> +filled with the miss message type (1), the I/O virtual address 
>>>>> and the
>>>>> +permissions flags. For access failure event, the iotlb payload 
>>>>> has to be
>>>>> +filled with the access failure message type (4), the I/O virtual 
>>>>> address and
>>>>> +the permissions flags. On success, the master is expected to 
>>>>> reply  when the
>>>>> +request has been handled (for example, on miss requests, once the 
>>>>> device IOTLB
>>>>> +has been updated) with a zero payload, non-zero otherwise.
>>>>
>>>> Failed to understand the last sentence clearly. IIUC vhost-net will
>>>> reply with an UPDATE message when a MISS message is received. Here for
>>>> vhost-user are we going to send one extra zero payload after that?
>>>
>>> Not exactly. There are two channels, one for QEMU to backend requests
>>> (channel A), one for backend to QEMU requests (channel B).
>>>
>>> The backend may be multi-threaded (like DPDK), one thread for handling
>>> QEMU initiated requests (channel A), the others to handle packet
>>> processing (i.e. one for Rx, one for Tx).
>>>
>>> The processing threads will need to translate iova adresses by
>>> searching in the IOTLB cache. In case of miss, it will send an IOTLB
>>> miss request on channel B, and then wait for the ack/nack. In case of
>>> ack, it can search again the IOTLB cache and find the translation.
>>>
>>> On QEMU side, when the thread handling channel B requests receives the
>>> IOTLB miss message, it gets the translation and send an IOTLB update
>>> message on channel A. Then it waits for the ack from the backend,
>>> meaning that the IOTLB cache has been updated, and replies ack on
>>> channel B.
>>
>> If the ack on channel B is used to notify the processing thread that
>> "cache is ready", then... would it be faster that we just let the
>> processing thread poll the cache until it finds it, or let the other
>> thread notify it when it receives ack on channel A? Not sure whether
>> it'll be faster.
>
> Not sure either.
> Not requiring a ack can indeed make sense in some cases, for example
> with single-threaded backends.
>
> What we can do is to remove the mandatory ack reply for
> VHOST_USER_IOTLB_MSG slave requests (miss, access fail).

I don't see any requirement for ack reply unless slave want to do any 
post processing when guest want to access the forbidden area. It looks 
to me that this should be done by userspace, if it's a valid map, master 
will send the IOTLB update message. If not, it will just report to 
guest. What needs to be guaranteed is that slave can still handle other 
request e.g set_owner or other in this case.

Thanks

> The backend then can just rely on the REPLY_ACK feature, and set the
> VHOST_USER_NEED_REPLY flag if it want to receive such ack.
>
> Would it be fine for you?
>
> Thanks,
> Maxime
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support
  2017-04-12  8:54         ` Jason Wang
@ 2017-04-12  9:26           ` Peter Xu
  2017-04-13  7:12             ` Jason Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Peter Xu @ 2017-04-12  9:26 UTC (permalink / raw)
  To: Jason Wang
  Cc: Maxime Coquelin, mst, vkaplans, wexu, yuanhan.liu,
	virtio-comment, qemu-devel

On Wed, Apr 12, 2017 at 04:54:25PM +0800, Jason Wang wrote:
> 
> 
> On 2017年04月12日 15:17, Peter Xu wrote:
> >On Tue, Apr 11, 2017 at 05:16:19PM +0200, Maxime Coquelin wrote:
> >>On 04/11/2017 03:20 PM, Peter Xu wrote:
> >>>On Tue, Apr 11, 2017 at 12:10:02PM +0200, Maxime Coquelin wrote:
> >[...]
> >
> >>>>+slave is expected to reply with a zero payload, non-zero otherwise.
> >>>Is this ack mechanism really necessary? If not, not sure it'll be nice
> >>>to keep vhost-user/vhost-kernel aligned on this behavior. At least
> >>>that'll simplify vhost-user implementation on QEMU side (iiuc even
> >>>without introducing new functions for update/invalidate operations).
> >>I think this is necessary, and it won't complexify the vhost-user
> >>implementation on QEMU side, since already widely used (see reply-ack
> >>feature).
> >Could you provide file/function/link pointer to the "reply-ack"
> >feature? I failed to find it myself.
> >
> >>This reply-ack mechanism is used to obtain a behaviour closer to kernel
> >>backend. Indeed, when QEMU sends a vhost_msg to the kernel backend, it
> >>is blocked in the write() while the message is being processed in the
> >>Kernel. With user backend, QEMU is unblocked from the write() when the
> >>backend has read the message, before it is being processed.
> >>
> >I see. Then I agree with you that we may need a synchronized way to do
> >it. One thing I think of is IOMMU page invalidation - it should be a
> >sync operation to make sure that all the related caches were destroyed
> >when the invalidation command returns in QEMU vIOMMU emulation path.
> >
> 
> Looks not, if I understand correctly, e.g for Intel IOMMU, when QI is
> enabled, this could be done asynchronously by not waiting for the completion
> through wait descriptor. Vhost-kernel always implement the invalidation as a
> synchronous one for simplicity, but looks like this is not needed.

IMHO, the point is guest cannot reuse that IOVA only if it sends a
invalidation wait descriptor. If without wait descriptor, the guest
should never release any IOVA range, if so, that'll be dangerous,
because the cache may still be dirty on that range on specific device.

And since guest will for sure use wait descriptor (as long as it wants
to reuse iova addresses), then we should possibly finally need a way
to synchronously invalidate IOTLB, including to vhost-user backends.

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support
  2017-04-12  9:26           ` Peter Xu
@ 2017-04-13  7:12             ` Jason Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Jason Wang @ 2017-04-13  7:12 UTC (permalink / raw)
  To: Peter Xu
  Cc: Maxime Coquelin, mst, vkaplans, wexu, yuanhan.liu,
	virtio-comment, qemu-devel



On 2017年04月12日 17:26, Peter Xu wrote:
> On Wed, Apr 12, 2017 at 04:54:25PM +0800, Jason Wang wrote:
>> On 2017年04月12日 15:17, Peter Xu wrote:
>>> On Tue, Apr 11, 2017 at 05:16:19PM +0200, Maxime Coquelin wrote:
>>>> On 04/11/2017 03:20 PM, Peter Xu wrote:
>>>>> On Tue, Apr 11, 2017 at 12:10:02PM +0200, Maxime Coquelin wrote:
>>> [...]
>>>
>>>>>> +slave is expected to reply with a zero payload, non-zero otherwise.
>>>>> Is this ack mechanism really necessary? If not, not sure it'll be nice
>>>>> to keep vhost-user/vhost-kernel aligned on this behavior. At least
>>>>> that'll simplify vhost-user implementation on QEMU side (iiuc even
>>>>> without introducing new functions for update/invalidate operations).
>>>> I think this is necessary, and it won't complexify the vhost-user
>>>> implementation on QEMU side, since already widely used (see reply-ack
>>>> feature).
>>> Could you provide file/function/link pointer to the "reply-ack"
>>> feature? I failed to find it myself.
>>>
>>>> This reply-ack mechanism is used to obtain a behaviour closer to kernel
>>>> backend. Indeed, when QEMU sends a vhost_msg to the kernel backend, it
>>>> is blocked in the write() while the message is being processed in the
>>>> Kernel. With user backend, QEMU is unblocked from the write() when the
>>>> backend has read the message, before it is being processed.
>>>>
>>> I see. Then I agree with you that we may need a synchronized way to do
>>> it. One thing I think of is IOMMU page invalidation - it should be a
>>> sync operation to make sure that all the related caches were destroyed
>>> when the invalidation command returns in QEMU vIOMMU emulation path.
>>>
>> Looks not, if I understand correctly, e.g for Intel IOMMU, when QI is
>> enabled, this could be done asynchronously by not waiting for the completion
>> through wait descriptor. Vhost-kernel always implement the invalidation as a
>> synchronous one for simplicity, but looks like this is not needed.
> IMHO, the point is guest cannot reuse that IOVA only if it sends a
> invalidation wait descriptor. If without wait descriptor, the guest
> should never release any IOVA range, if so, that'll be dangerous,
> because the cache may still be dirty on that range on specific device.
>
> And since guest will for sure use wait descriptor (as long as it wants
> to reuse iova addresses), then we should possibly finally need a way
> to synchronously invalidate IOTLB, including to vhost-user backends.

Yes, what I mean is technically we can implement this only for wait 
descriptor.

Thanks

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [RFC 1/2] spec/vhost-user: Introduce secondary channel for slave initiated requests
  2017-04-11 13:53     ` Maxime Coquelin
@ 2017-04-14  9:03       ` Marc-André Lureau
  2017-04-24  8:05         ` [Qemu-devel] [virtio-comment] " Wei Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Marc-André Lureau @ 2017-04-14  9:03 UTC (permalink / raw)
  To: Maxime Coquelin, mst, vkaplans, jasowang, wexu, peterx,
	yuanhan.liu, virtio-comment, qemu-devel

Hi

On Tue, Apr 11, 2017 at 5:53 PM Maxime Coquelin <maxime.coquelin@redhat.com>
wrote:

> Hi Marc-André,
>
> On 04/11/2017 03:06 PM, Marc-André Lureau wrote:
> > Hi
> >
> > On Tue, Apr 11, 2017 at 12:10 PM Maxime Coquelin
> > <maxime.coquelin@redhat.com <mailto:maxime.coquelin@redhat.com>> wrote:
> >
> >     This vhost-user specification update aims at enabling the
> >     slave to send requests to the master using a dedicated socket
> >     created by the master.
> >
> >     It can be used for example when the slave implements a device
> >     IOTLB to send cache miss requests to the master.
> >
> >     The message types list is updated with an "Initiator" field to
> >     indicate for each type whether the master and/or slave can
> >     initiate the request.
> >
> >     Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com
> >     <mailto:maxime.coquelin@redhat.com>>
> >
> >
> > This is very similar to a patch I proposed for shutdown slave initiated
> > requests:
> > https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg00095.html
>
> Indeed, thanks for pointing this out, I wasn't aware of your series.
>
> I find your proposal of having dedicated messages types
> (VHOST_USER_SLAVE_*) cleaner.
>
> ok


> Are you ok if I handover your patch, and replace
> VHOST_USER_SET_SLAVE_FD to VHOST_USER_SET_SLAVE_REQ_FD?
>

They are very similar, I suggest you update your patch with the best of
both.

I suppose you came to the same conclusion with me that trying to make the
communication both ways on the same fd would be quite difficult, although
it's a bit strange that the qemu implementation forces the design of the
protocol in some direction.
-- 
Marc-André Lureau

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [virtio-comment] Re: [RFC 1/2] spec/vhost-user: Introduce secondary channel for slave initiated requests
  2017-04-14  9:03       ` Marc-André Lureau
@ 2017-04-24  8:05         ` Wei Wang
  2017-04-25 11:55           ` Maxime Coquelin
  0 siblings, 1 reply; 18+ messages in thread
From: Wei Wang @ 2017-04-24  8:05 UTC (permalink / raw)
  To: Marc-André Lureau, Maxime Coquelin, mst, vkaplans, jasowang,
	wexu, peterx, yuanhan.liu, virtio-comment, qemu-devel

On 04/14/2017 05:03 PM, Marc-André Lureau wrote:
> Hi
>
> On Tue, Apr 11, 2017 at 5:53 PM Maxime Coquelin 
> <maxime.coquelin@redhat.com <mailto:maxime.coquelin@redhat.com>> wrote:
>
>     Hi Marc-André,
>
>     On 04/11/2017 03:06 PM, Marc-André Lureau wrote:
>     > Hi
>     >
>     > On Tue, Apr 11, 2017 at 12:10 PM Maxime Coquelin
>     > <maxime.coquelin@redhat.com <mailto:maxime.coquelin@redhat.com>
>     <mailto:maxime.coquelin@redhat.com
>     <mailto:maxime.coquelin@redhat.com>>> wrote:
>     >
>     >     This vhost-user specification update aims at enabling the
>     >     slave to send requests to the master using a dedicated socket
>     >     created by the master.
>     >
>     >     It can be used for example when the slave implements a device
>     >     IOTLB to send cache miss requests to the master.
>     >
>     >     The message types list is updated with an "Initiator" field to
>     >     indicate for each type whether the master and/or slave can
>     >     initiate the request.
>     >
>     >     Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com
>     <mailto:maxime.coquelin@redhat.com>
>     >     <mailto:maxime.coquelin@redhat.com
>     <mailto:maxime.coquelin@redhat.com>>>
>     >
>     >
>     > This is very similar to a patch I proposed for shutdown slave
>     initiated
>     > requests:
>     > https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg00095.html
>
>     Indeed, thanks for pointing this out, I wasn't aware of your series.
>
>     I find your proposal of having dedicated messages types
>     (VHOST_USER_SLAVE_*) cleaner.
>
> ok
>
>     Are you ok if I handover your patch, and replace
>     VHOST_USER_SET_SLAVE_FD to VHOST_USER_SET_SLAVE_REQ_FD?
>
>
> They are very similar, I suggest you update your patch with the best 
> of both.
>
> I suppose you came to the same conclusion with me that trying to make 
> the communication both ways on the same fd would be quite difficult, 
> although it's a bit strange that the qemu implementation forces the 
> design of the protocol in some direction.
> -- 
>

When would you get the implementation patch ready? Thanks.

Best,
Wei

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [virtio-comment] Re: [RFC 1/2] spec/vhost-user: Introduce secondary channel for slave initiated requests
  2017-04-24  8:05         ` [Qemu-devel] [virtio-comment] " Wei Wang
@ 2017-04-25 11:55           ` Maxime Coquelin
  2017-04-26 11:29             ` Wei Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Maxime Coquelin @ 2017-04-25 11:55 UTC (permalink / raw)
  To: Wei Wang, Marc-André Lureau, mst, vkaplans, jasowang, wexu,
	peterx, yuanhan.liu, virtio-comment, qemu-devel

Hi Wei,

On 04/24/2017 10:05 AM, Wei Wang wrote:
> On 04/14/2017 05:03 PM, Marc-André Lureau wrote:
>> Hi
>>
>> On Tue, Apr 11, 2017 at 5:53 PM Maxime Coquelin 
>> <maxime.coquelin@redhat.com <mailto:maxime.coquelin@redhat.com>> wrote:
>>
>>     Hi Marc-André,
>>
>>     On 04/11/2017 03:06 PM, Marc-André Lureau wrote:
>>     > Hi
>>     >
>>     > On Tue, Apr 11, 2017 at 12:10 PM Maxime Coquelin
>>     > <maxime.coquelin@redhat.com <mailto:maxime.coquelin@redhat.com>
>>     <mailto:maxime.coquelin@redhat.com
>>     <mailto:maxime.coquelin@redhat.com>>> wrote:
>>     >
>>     >     This vhost-user specification update aims at enabling the
>>     >     slave to send requests to the master using a dedicated socket
>>     >     created by the master.
>>     >
>>     >     It can be used for example when the slave implements a device
>>     >     IOTLB to send cache miss requests to the master.
>>     >
>>     >     The message types list is updated with an "Initiator" field to
>>     >     indicate for each type whether the master and/or slave can
>>     >     initiate the request.
>>     >
>>     >     Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com
>>     <mailto:maxime.coquelin@redhat.com>
>>     >     <mailto:maxime.coquelin@redhat.com
>>     <mailto:maxime.coquelin@redhat.com>>>
>>     >
>>     >
>>     > This is very similar to a patch I proposed for shutdown slave
>>     initiated
>>     > requests:
>>     > https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg00095.html
>>
>>     Indeed, thanks for pointing this out, I wasn't aware of your series.
>>
>>     I find your proposal of having dedicated messages types
>>     (VHOST_USER_SLAVE_*) cleaner.
>>
>> ok
>>
>>     Are you ok if I handover your patch, and replace
>>     VHOST_USER_SET_SLAVE_FD to VHOST_USER_SET_SLAVE_REQ_FD?
>>
>>
>> They are very similar, I suggest you update your patch with the best 
>> of both.
>>
>> I suppose you came to the same conclusion with me that trying to make 
>> the communication both ways on the same fd would be quite difficult, 
>> although it's a bit strange that the qemu implementation forces the 
>> design of the protocol in some direction.
>> -- 
>>
> 
> When would you get the implementation patch ready? Thanks.

I sent second version of the RFC on April 14th, which comprises the
implementation:
https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg02467.html

Cheers,
Maxime

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [virtio-comment] Re: [RFC 1/2] spec/vhost-user: Introduce secondary channel for slave initiated requests
  2017-04-25 11:55           ` Maxime Coquelin
@ 2017-04-26 11:29             ` Wei Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Wei Wang @ 2017-04-26 11:29 UTC (permalink / raw)
  To: Maxime Coquelin, Marc-André Lureau, mst, vkaplans, jasowang,
	wexu, peterx, yuanhan.liu, virtio-comment, qemu-devel

On 04/25/2017 07:55 PM, Maxime Coquelin wrote:
> Hi Wei,
>
> On 04/24/2017 10:05 AM, Wei Wang wrote:
>> On 04/14/2017 05:03 PM, Marc-André Lureau wrote:
>>> Hi
>>>
>>> On Tue, Apr 11, 2017 at 5:53 PM Maxime Coquelin 
>>> <maxime.coquelin@redhat.com <mailto:maxime.coquelin@redhat.com>> wrote:
>>>
>>>     Hi Marc-André,
>>>
>>>     On 04/11/2017 03:06 PM, Marc-André Lureau wrote:
>>>     > Hi
>>>     >
>>>     > On Tue, Apr 11, 2017 at 12:10 PM Maxime Coquelin
>>>     > <maxime.coquelin@redhat.com <mailto:maxime.coquelin@redhat.com>
>>>     <mailto:maxime.coquelin@redhat.com
>>>     <mailto:maxime.coquelin@redhat.com>>> wrote:
>>>     >
>>>     >     This vhost-user specification update aims at enabling the
>>>     >     slave to send requests to the master using a dedicated socket
>>>     >     created by the master.
>>>     >
>>>     >     It can be used for example when the slave implements a device
>>>     >     IOTLB to send cache miss requests to the master.
>>>     >
>>>     >     The message types list is updated with an "Initiator" 
>>> field to
>>>     >     indicate for each type whether the master and/or slave can
>>>     >     initiate the request.
>>>     >
>>>     >     Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com
>>>     <mailto:maxime.coquelin@redhat.com>
>>>     >     <mailto:maxime.coquelin@redhat.com
>>>     <mailto:maxime.coquelin@redhat.com>>>
>>>     >
>>>     >
>>>     > This is very similar to a patch I proposed for shutdown slave
>>>     initiated
>>>     > requests:
>>>     > 
>>> https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg00095.html
>>>
>>>     Indeed, thanks for pointing this out, I wasn't aware of your 
>>> series.
>>>
>>>     I find your proposal of having dedicated messages types
>>>     (VHOST_USER_SLAVE_*) cleaner.
>>>
>>> ok
>>>
>>>     Are you ok if I handover your patch, and replace
>>>     VHOST_USER_SET_SLAVE_FD to VHOST_USER_SET_SLAVE_REQ_FD?
>>>
>>>
>>> They are very similar, I suggest you update your patch with the best 
>>> of both.
>>>
>>> I suppose you came to the same conclusion with me that trying to 
>>> make the communication both ways on the same fd would be quite 
>>> difficult, although it's a bit strange that the qemu implementation 
>>> forces the design of the protocol in some direction.
>>> -- 
>>>
>>
>> When would you get the implementation patch ready? Thanks.
>
> I sent second version of the RFC on April 14th, which comprises the
> implementation:
> https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg02467.html

Thanks, Maxime. I was trying to make the connection bidirectional
(https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg02617.html),
which was reported as problematic due to the possibility of race (though
I think it can be solved by re-sending the msg in that rare case).

Anyway, hope to see you guys' second channel based implementation to
be merged soon. I would also consider to switch to use it then.

Best,
Wei

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-04-26 11:27 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-11 10:10 [Qemu-devel] [RFC 0/2] vhost-user: Specify device IOTLB support Maxime Coquelin
2017-04-11 10:10 ` [Qemu-devel] [RFC 1/2] spec/vhost-user: Introduce secondary channel for slave initiated requests Maxime Coquelin
2017-04-11 13:06   ` Marc-André Lureau
2017-04-11 13:53     ` Maxime Coquelin
2017-04-14  9:03       ` Marc-André Lureau
2017-04-24  8:05         ` [Qemu-devel] [virtio-comment] " Wei Wang
2017-04-25 11:55           ` Maxime Coquelin
2017-04-26 11:29             ` Wei Wang
2017-04-11 10:10 ` [Qemu-devel] [RFC 2/2] spec/vhost-user spec: Add IOMMU support Maxime Coquelin
2017-04-11 13:20   ` Peter Xu
2017-04-11 15:16     ` Maxime Coquelin
2017-04-12  7:17       ` Peter Xu
2017-04-12  7:24         ` Maxime Coquelin
2017-04-12  7:49           ` Peter Xu
2017-04-12  9:00           ` Jason Wang
2017-04-12  8:54         ` Jason Wang
2017-04-12  9:26           ` Peter Xu
2017-04-13  7:12             ` Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.