* [RFC v3 0/8] vDPA support in qemu
@ 2020-05-29 14:06 Cindy Lu
  2020-05-29 14:06 ` [RFC v3 1/8] net: introduce qemu_get_peer Cindy Lu
                   ` (10 more replies)
  0 siblings, 11 replies; 41+ messages in thread
From: Cindy Lu @ 2020-05-29 14:06 UTC
  To: mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, lulu, hanand, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, maxime.coquelin, lingshan.zhu

A vDPA device is a device whose datapath complies with the virtio
specification but whose control path is vendor specific. vDPA devices
can be either physically implemented in hardware or emulated in software.
This RFC introduces vDPA support in QEMU.
TODO:
1) vIOMMU support
2) live migration support

Changes from v1:
separate out the patch introducing the vhost_set_vring_ready method
separate out the qemu_get_peer patch
separate out the vhost_set_state patch
introduce a new vDPA-specific macro in configure
introduce a function to pass the fd from the command line
add documentation to qemu-options.hx
address the other comments from the last version

Changes from v2:
change the workflow of vhost set status
introduce vhost_get_device_id
tested on top of qemu v5.0.0-rc4
address the other comments from the last version

Cindy Lu (3):
  net: introduce qemu_get_peer
  vhost_net: use the function qemu_get_peer
  vhost-backend: export the vhost backend helper

Jason Wang (3):
  virtio-bus: introduce queue_enabled method
  virtio-pci: implement queue_enabled method
  vhost: introduce vhost_set_vring_ready method

Tiwei Bie (2):
  vhost-vdpa: introduce vhost-vdpa backend
  vhost-vdpa: introduce vhost-vdpa net client

 configure                         |  21 ++
 hw/net/vhost_net-stub.c           |   9 +
 hw/net/vhost_net.c                |  72 +++++-
 hw/virtio/Makefile.objs           |   1 +
 hw/virtio/vhost-backend.c         |  39 +--
 hw/virtio/vhost-vdpa.c            | 399 ++++++++++++++++++++++++++++++
 hw/virtio/vhost.c                 |  37 ++-
 hw/virtio/virtio-pci.c            |  13 +
 hw/virtio/virtio.c                |   6 +
 include/hw/virtio/vhost-backend.h |  38 ++-
 include/hw/virtio/vhost-vdpa.h    |  26 ++
 include/hw/virtio/vhost.h         |   2 +
 include/hw/virtio/virtio-bus.h    |   4 +
 include/net/net.h                 |   1 +
 include/net/vhost-vdpa.h          |  19 ++
 include/net/vhost_net.h           |   3 +-
 net/Makefile.objs                 |   2 +-
 net/clients.h                     |   2 +
 net/net.c                         |   9 +
 net/vhost-vdpa.c                  | 235 ++++++++++++++++++
 qapi/net.json                     |  26 +-
 qemu-options.hx                   |  15 ++
 22 files changed, 951 insertions(+), 28 deletions(-)
 create mode 100644 hw/virtio/vhost-vdpa.c
 create mode 100644 include/hw/virtio/vhost-vdpa.h
 create mode 100644 include/net/vhost-vdpa.h
 create mode 100644 net/vhost-vdpa.c

-- 
2.21.1




* [RFC v3 1/8] net: introduce qemu_get_peer
  2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
@ 2020-05-29 14:06 ` Cindy Lu
  2020-06-11  9:07   ` Laurent Vivier
  2020-05-29 14:06 ` [RFC v3 2/8] vhost_net: use the function qemu_get_peer Cindy Lu
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 41+ messages in thread
From: Cindy Lu @ 2020-05-29 14:06 UTC
  To: mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, lulu, hanand, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, maxime.coquelin, lingshan.zhu

Add a small helper that returns the peer of the NetClientState at the
given queue_index.
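
For readers unfamiliar with the per-queue layout, here is a minimal standalone
sketch of the lookup. It is not QEMU code: DemoNetClient and demo_get_peer are
hypothetical stand-ins for NetClientState and qemu_get_peer, and it assumes
that the per-queue client states live in one contiguous array, which is what
the pointer arithmetic in the helper relies on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for QEMU's NetClientState. */
typedef struct DemoNetClient {
    struct DemoNetClient *peer;   /* client at the other end of the link */
    int queue_index;              /* which queue this state belongs to */
} DemoNetClient;

/*
 * Same shape as the helper in this patch: treat nc as the base of a
 * per-queue array and return the peer of the queue_index-th element.
 */
static DemoNetClient *demo_get_peer(DemoNetClient *nc, int queue_index)
{
    assert(nc != NULL);
    DemoNetClient *ncs = nc + queue_index;
    return ncs->peer;
}
```

The caller is responsible for passing the first element of the array and an
in-range queue_index; the sketch, like the helper, does no bounds checking.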

Signed-off-by: Cindy Lu <lulu@redhat.com>
---
 include/net/net.h | 1 +
 net/net.c         | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/include/net/net.h b/include/net/net.h
index 39085d9444..e7ef42d62b 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -176,6 +176,7 @@ void hmp_info_network(Monitor *mon, const QDict *qdict);
 void net_socket_rs_init(SocketReadState *rs,
                         SocketReadStateFinalize *finalize,
                         bool vnet_hdr);
+NetClientState *qemu_get_peer(NetClientState *nc, int queue_index);
 
 /* NIC info */
 
diff --git a/net/net.c b/net/net.c
index 38778e831d..599fb61028 100644
--- a/net/net.c
+++ b/net/net.c
@@ -324,6 +324,12 @@ void *qemu_get_nic_opaque(NetClientState *nc)
 
     return nic->opaque;
 }
+NetClientState *qemu_get_peer(NetClientState *nc, int queue_index)
+{
+    assert(nc != NULL);
+    NetClientState *ncs = nc + queue_index;
+    return ncs->peer;
+}
 
 static void qemu_cleanup_net_client(NetClientState *nc)
 {
-- 
2.21.1




* [RFC v3 2/8] vhost_net: use the function qemu_get_peer
  2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
  2020-05-29 14:06 ` [RFC v3 1/8] net: introduce qemu_get_peer Cindy Lu
@ 2020-05-29 14:06 ` Cindy Lu
  2020-06-16  7:47   ` Laurent Vivier
  2020-05-29 14:06 ` [RFC v3 3/8] virtio-bus: introduce queue_enabled method Cindy Lu
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 41+ messages in thread
From: Cindy Lu @ 2020-05-29 14:06 UTC
  To: mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, lulu, hanand, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, maxime.coquelin, lingshan.zhu

Use qemu_get_peer() to replace the open-coded ncs[i].peer lookups.

Signed-off-by: Cindy Lu <lulu@redhat.com>
---
 hw/net/vhost_net.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 6b82803fa7..d1d421e3d9 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -306,7 +306,9 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(dev)));
     VirtioBusState *vbus = VIRTIO_BUS(qbus);
     VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
+    struct vhost_net *net;
     int r, e, i;
+    NetClientState *peer;
 
     if (!k->set_guest_notifiers) {
         error_report("binding does not support guest notifiers");
@@ -314,9 +316,9 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
     }
 
     for (i = 0; i < total_queues; i++) {
-        struct vhost_net *net;
 
-        net = get_vhost_net(ncs[i].peer);
+        peer = qemu_get_peer(ncs, i);
+        net = get_vhost_net(peer);
         vhost_net_set_vq_index(net, i * 2);
 
         /* Suppress the masking guest notifiers on vhost user
@@ -335,7 +337,8 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
     }
 
     for (i = 0; i < total_queues; i++) {
-        r = vhost_net_start_one(get_vhost_net(ncs[i].peer), dev);
+        peer = qemu_get_peer(ncs, i);
+        r = vhost_net_start_one(get_vhost_net(peer), dev);
 
         if (r < 0) {
             goto err_start;
@@ -343,7 +346,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
 
         if (ncs[i].peer->vring_enable) {
             /* restore vring enable state */
-            r = vhost_set_vring_enable(ncs[i].peer, ncs[i].peer->vring_enable);
+            r = vhost_set_vring_enable(peer, peer->vring_enable);
 
             if (r < 0) {
                 goto err_start;
@@ -355,7 +358,8 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
 
 err_start:
     while (--i >= 0) {
-        vhost_net_stop_one(get_vhost_net(ncs[i].peer), dev);
+        peer = qemu_get_peer(ncs , i);
+        vhost_net_stop_one(get_vhost_net(peer), dev);
     }
     e = k->set_guest_notifiers(qbus->parent, total_queues * 2, false);
     if (e < 0) {
-- 
2.21.1




* [RFC v3 3/8] virtio-bus: introduce queue_enabled method
  2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
  2020-05-29 14:06 ` [RFC v3 1/8] net: introduce qemu_get_peer Cindy Lu
  2020-05-29 14:06 ` [RFC v3 2/8] vhost_net: use the function qemu_get_peer Cindy Lu
@ 2020-05-29 14:06 ` Cindy Lu
  2020-06-16  7:49   ` Laurent Vivier
  2020-05-29 14:06 ` [RFC v3 4/8] virtio-pci: implement " Cindy Lu
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 41+ messages in thread
From: Cindy Lu @ 2020-05-29 14:06 UTC
  To: mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, lulu, hanand, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, maxime.coquelin, lingshan.zhu

From: Jason Wang <jasowang@redhat.com>

This patch introduces the queue_enabled() method, which allows the
transport to implement its own way of reporting whether or not a queue
is enabled.
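
The pattern is an optional per-transport callback with a generic fallback. A
minimal standalone sketch follows; DemoDevice, DemoBusClass and the demo_*
functions are hypothetical stand-ins rather than QEMU types, and the fixed
queue count of 4 is an assumption made only to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_QUEUES 4

/* Hypothetical stand-in for a virtio device plus its transport state. */
typedef struct DemoDevice {
    uint64_t desc_addr[DEMO_NUM_QUEUES]; /* descriptor table address per queue */
    bool enabled[DEMO_NUM_QUEUES];       /* enable flag tracked by the transport */
} DemoDevice;

/* Hypothetical stand-in for the bus class: the callback is optional. */
typedef struct DemoBusClass {
    bool (*queue_enabled)(DemoDevice *d, int n); /* NULL if not implemented */
} DemoBusClass;

/* A transport that tracks an explicit enable flag per queue. */
static bool demo_transport_queue_enabled(DemoDevice *d, int n)
{
    return d->enabled[n];
}

/* Ask the transport if it can answer, else fall back to the desc address. */
static bool demo_queue_enabled(const DemoBusClass *k, DemoDevice *d, int n)
{
    assert(n >= 0 && n < DEMO_NUM_QUEUES);
    if (k->queue_enabled) {
        return k->queue_enabled(d, n);
    }
    return d->desc_addr[n] != 0;
}
```

Transports that leave the callback unset keep the existing behaviour, so the
change is opt-in for each transport.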

Signed-off-by: Jason Wang <jasowang@redhat.com>

---
 hw/virtio/virtio.c             | 6 ++++++
 include/hw/virtio/virtio-bus.h | 4 ++++
 2 files changed, 10 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index b6c8ef5bc0..445a4ed760 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3285,6 +3285,12 @@ hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n)
 
 bool virtio_queue_enabled(VirtIODevice *vdev, int n)
 {
+    BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
+    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+
+    if (k->queue_enabled) {
+        return k->queue_enabled(qbus->parent, n);
+    }
     return virtio_queue_get_desc_addr(vdev, n) != 0;
 }
 
diff --git a/include/hw/virtio/virtio-bus.h b/include/hw/virtio/virtio-bus.h
index 38c9399cd4..0f6f215925 100644
--- a/include/hw/virtio/virtio-bus.h
+++ b/include/hw/virtio/virtio-bus.h
@@ -83,6 +83,10 @@ typedef struct VirtioBusClass {
      */
     int (*ioeventfd_assign)(DeviceState *d, EventNotifier *notifier,
                             int n, bool assign);
+    /*
+     * Whether queue number n is enabled.
+     */
+    bool (*queue_enabled)(DeviceState *d, int n);
     /*
      * Does the transport have variable vring alignment?
      * (ie can it ever call virtio_queue_set_align()?)
-- 
2.21.1




* [RFC v3 4/8] virtio-pci: implement queue_enabled method
  2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
                   ` (2 preceding siblings ...)
  2020-05-29 14:06 ` [RFC v3 3/8] virtio-bus: introduce queue_enabled method Cindy Lu
@ 2020-05-29 14:06 ` Cindy Lu
  2020-06-16  7:56   ` Laurent Vivier
  2020-05-29 14:06 ` [RFC v3 5/8] vhost: introduce vhost_set_vring_ready method Cindy Lu
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 41+ messages in thread
From: Cindy Lu @ 2020-05-29 14:06 UTC
  To: mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, lulu, hanand, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, maxime.coquelin, lingshan.zhu

From: Jason Wang <jasowang@redhat.com>

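When VIRTIO_F_VERSION_1 has been negotiated, we can detect whether a
queue is enabled from the per-queue enabled state kept by the virtio-pci
proxy. Otherwise, fall back to checking the descriptor address.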

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/virtio/virtio-pci.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 4cb784389c..2c82ed5246 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1107,6 +1107,18 @@ static AddressSpace *virtio_pci_get_dma_as(DeviceState *d)
     return pci_get_address_space(dev);
 }
 
+static bool virtio_pci_queue_enabled(DeviceState *d, int n)
+{
+    VirtIOPCIProxy *proxy = VIRTIO_PCI(d);
+    VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+
+    if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
+        return proxy->vqs[n].enabled;
+    }
+
+    return virtio_queue_get_desc_addr(vdev, n) != 0;
+}
+
 static int virtio_pci_add_mem_cap(VirtIOPCIProxy *proxy,
                                    struct virtio_pci_cap *cap)
 {
@@ -2059,6 +2071,7 @@ static void virtio_pci_bus_class_init(ObjectClass *klass, void *data)
     k->ioeventfd_enabled = virtio_pci_ioeventfd_enabled;
     k->ioeventfd_assign = virtio_pci_ioeventfd_assign;
     k->get_dma_as = virtio_pci_get_dma_as;
+    k->queue_enabled = virtio_pci_queue_enabled;
 }
 
 static const TypeInfo virtio_pci_bus_info = {
-- 
2.21.1




* [RFC v3 5/8] vhost: introduce vhost_set_vring_ready method
  2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
                   ` (3 preceding siblings ...)
  2020-05-29 14:06 ` [RFC v3 4/8] virtio-pci: implement " Cindy Lu
@ 2020-05-29 14:06 ` Cindy Lu
  2020-06-16  8:04   ` Laurent Vivier
  2020-05-29 14:06 ` [RFC v3 6/8] vhost-backend: export the vhost backend helper Cindy Lu
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 41+ messages in thread
From: Cindy Lu @ 2020-05-29 14:06 UTC
  To: mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, lulu, hanand, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, maxime.coquelin, lingshan.zhu

From: Jason Wang <jasowang@redhat.com>

vhost-vdpa introduces VHOST_VDPA_SET_VRING_ENABLE, which follows the
semantics of queue_enable defined in the virtio spec. It can be used to
prevent the device from executing requests for a specific virtqueue.
This patch introduces the corresponding vhost_ops method.

Note that we already have vhost_set_vring_enable, which has different
semantics: it allows a specific virtqueue to be enabled or disabled for
some kinds of vhost backends. E.g. vhost-user uses it to change the
number of active queue pairs.
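
Because only some backends need the new operation, it is dispatched as an
optional entry in the ops table. The standalone sketch below illustrates that
dispatch; DemoVhostOps, DemoVhostDev and the demo_* functions are hypothetical
names, not the QEMU definitions, and the ready_count bookkeeping exists only to
make the effect observable.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoVhostDev DemoVhostDev;

/* Hypothetical ops table: set_vring_ready is optional and may be NULL. */
typedef struct DemoVhostOps {
    int (*set_vring_ready)(DemoVhostDev *dev);
} DemoVhostOps;

struct DemoVhostDev {
    const DemoVhostOps *ops;
    int nvqs;         /* number of virtqueues owned by this device */
    int ready_count;  /* how many vrings were marked ready */
};

/* A backend that implements the operation, as a vDPA-like backend would. */
static int demo_vdpa_set_vring_ready(DemoVhostDev *dev)
{
    dev->ready_count = dev->nvqs;
    return 0;
}

/* Call the backend hook if present; otherwise succeed without doing anything. */
static int demo_set_vring_ready(DemoVhostDev *dev)
{
    assert(dev != NULL);
    const DemoVhostOps *ops = dev->ops;
    if (ops && ops->set_vring_ready) {
        return ops->set_vring_ready(dev);
    }
    return 0;
}
```

Returning 0 when the hook is absent lets common code call it unconditionally
without special-casing backends that have no such notion.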

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/net/vhost_net-stub.c |  4 ++++
 hw/net/vhost_net.c      | 11 ++++++++++-
 include/net/vhost_net.h |  1 +
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/hw/net/vhost_net-stub.c b/hw/net/vhost_net-stub.c
index aac0e98228..43e93e1a9a 100644
--- a/hw/net/vhost_net-stub.c
+++ b/hw/net/vhost_net-stub.c
@@ -86,6 +86,10 @@ int vhost_set_vring_enable(NetClientState *nc, int enable)
     return 0;
 }
 
+int vhost_set_vring_ready(NetClientState *nc)
+{
+    return 0;
+}
 int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
 {
     return 0;
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index d1d421e3d9..e2bc7de2eb 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -344,7 +344,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
             goto err_start;
         }
 
-        if (ncs[i].peer->vring_enable) {
+        if (peer->vring_enable) {
             /* restore vring enable state */
             r = vhost_set_vring_enable(peer, peer->vring_enable);
 
@@ -455,6 +455,15 @@ int vhost_set_vring_enable(NetClientState *nc, int enable)
     return 0;
 }
 
+int vhost_set_vring_ready(NetClientState *nc)
+{
+    VHostNetState *net = get_vhost_net(nc);
+    const VhostOps *vhost_ops = net->dev.vhost_ops;
+    if (vhost_ops && vhost_ops->vhost_set_vring_ready) {
+        return vhost_ops->vhost_set_vring_ready(&net->dev);
+    }
+    return 0;
+}
 int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
 {
     const VhostOps *vhost_ops = net->dev.vhost_ops;
diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index 77e47398c4..8a6f208189 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -35,6 +35,7 @@ int vhost_net_notify_migration_done(VHostNetState *net, char* mac_addr);
 VHostNetState *get_vhost_net(NetClientState *nc);
 
 int vhost_set_vring_enable(NetClientState * nc, int enable);
+int vhost_set_vring_ready(NetClientState *nc);
 
 uint64_t vhost_net_get_acked_features(VHostNetState *net);
 
-- 
2.21.1




* [RFC v3 6/8] vhost-backend: export the vhost backend helper
  2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
                   ` (4 preceding siblings ...)
  2020-05-29 14:06 ` [RFC v3 5/8] vhost: introduce vhost_set_vring_ready method Cindy Lu
@ 2020-05-29 14:06 ` Cindy Lu
  2020-06-16  8:16   ` Laurent Vivier
  2020-05-29 14:06 ` [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend Cindy Lu
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 41+ messages in thread
From: Cindy Lu @ 2020-05-29 14:06 UTC
  To: mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, lulu, hanand, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, maxime.coquelin, lingshan.zhu

Export the vhost kernel backend helpers so that vhost-vdpa can reuse
some of them.

Signed-off-by: Cindy Lu <lulu@redhat.com>
---
 hw/virtio/vhost-backend.c         | 34 ++++++++++++++++++-------------
 include/hw/virtio/vhost-backend.h | 28 +++++++++++++++++++++++++
 2 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index 48905383f8..42efb4967b 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -14,7 +14,7 @@
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
 #include "standard-headers/linux/vhost_types.h"
-
+#include "hw/virtio/vhost-vdpa.h"
 #ifdef CONFIG_VHOST_KERNEL
 #include <linux/vhost.h>
 #include <sys/ioctl.h>
@@ -22,10 +22,16 @@
 static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
                              void *arg)
 {
-    int fd = (uintptr_t) dev->opaque;
-
-    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_KERNEL);
-
+    int fd = -1;
+    struct vhost_vdpa *v = NULL;
+    if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_KERNEL) {
+        fd  = (uintptr_t) dev->opaque;
+    }
+    if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA) {
+        v = dev->opaque;
+        fd = v->device_fd;
+    }
+    assert(fd != -1);
     return ioctl(fd, request, arg);
 }
 
@@ -89,7 +95,7 @@ static int vhost_kernel_scsi_get_abi_version(struct vhost_dev *dev, int *version
     return vhost_kernel_call(dev, VHOST_SCSI_GET_ABI_VERSION, version);
 }
 
-static int vhost_kernel_set_log_base(struct vhost_dev *dev, uint64_t base,
+int vhost_kernel_set_log_base(struct vhost_dev *dev, uint64_t base,
                                      struct vhost_log *log)
 {
     return vhost_kernel_call(dev, VHOST_SET_LOG_BASE, &base);
@@ -101,7 +107,7 @@ static int vhost_kernel_set_mem_table(struct vhost_dev *dev,
     return vhost_kernel_call(dev, VHOST_SET_MEM_TABLE, mem);
 }
 
-static int vhost_kernel_set_vring_addr(struct vhost_dev *dev,
+int vhost_kernel_set_vring_addr(struct vhost_dev *dev,
                                        struct vhost_vring_addr *addr)
 {
     return vhost_kernel_call(dev, VHOST_SET_VRING_ADDR, addr);
@@ -113,31 +119,31 @@ static int vhost_kernel_set_vring_endian(struct vhost_dev *dev,
     return vhost_kernel_call(dev, VHOST_SET_VRING_ENDIAN, ring);
 }
 
-static int vhost_kernel_set_vring_num(struct vhost_dev *dev,
+int vhost_kernel_set_vring_num(struct vhost_dev *dev,
                                       struct vhost_vring_state *ring)
 {
     return vhost_kernel_call(dev, VHOST_SET_VRING_NUM, ring);
 }
 
-static int vhost_kernel_set_vring_base(struct vhost_dev *dev,
+int vhost_kernel_set_vring_base(struct vhost_dev *dev,
                                        struct vhost_vring_state *ring)
 {
     return vhost_kernel_call(dev, VHOST_SET_VRING_BASE, ring);
 }
 
-static int vhost_kernel_get_vring_base(struct vhost_dev *dev,
+int vhost_kernel_get_vring_base(struct vhost_dev *dev,
                                        struct vhost_vring_state *ring)
 {
     return vhost_kernel_call(dev, VHOST_GET_VRING_BASE, ring);
 }
 
-static int vhost_kernel_set_vring_kick(struct vhost_dev *dev,
+int vhost_kernel_set_vring_kick(struct vhost_dev *dev,
                                        struct vhost_vring_file *file)
 {
     return vhost_kernel_call(dev, VHOST_SET_VRING_KICK, file);
 }
 
-static int vhost_kernel_set_vring_call(struct vhost_dev *dev,
+int vhost_kernel_set_vring_call(struct vhost_dev *dev,
                                        struct vhost_vring_file *file)
 {
     return vhost_kernel_call(dev, VHOST_SET_VRING_CALL, file);
@@ -155,13 +161,13 @@ static int vhost_kernel_set_features(struct vhost_dev *dev,
     return vhost_kernel_call(dev, VHOST_SET_FEATURES, &features);
 }
 
-static int vhost_kernel_get_features(struct vhost_dev *dev,
+int vhost_kernel_get_features(struct vhost_dev *dev,
                                      uint64_t *features)
 {
     return vhost_kernel_call(dev, VHOST_GET_FEATURES, features);
 }
 
-static int vhost_kernel_set_owner(struct vhost_dev *dev)
+int vhost_kernel_set_owner(struct vhost_dev *dev)
 {
     return vhost_kernel_call(dev, VHOST_SET_OWNER, NULL);
 }
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index 6f6670783f..300b59c172 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -172,4 +172,32 @@ int vhost_backend_handle_iotlb_msg(struct vhost_dev *dev,
 
 int vhost_user_gpu_set_socket(struct vhost_dev *dev, int fd);
 
+
+int vhost_kernel_set_log_base(struct vhost_dev *dev, uint64_t base,
+                                     struct vhost_log *log);
+
+int vhost_kernel_set_vring_addr(struct vhost_dev *dev,
+                                       struct vhost_vring_addr *addr);
+
+int vhost_kernel_set_vring_num(struct vhost_dev *dev,
+                                      struct vhost_vring_state *ring);
+
+int vhost_kernel_set_vring_base(struct vhost_dev *dev,
+                                       struct vhost_vring_state *ring);
+
+int vhost_kernel_get_vring_base(struct vhost_dev *dev,
+                                       struct vhost_vring_state *ring);
+
+int vhost_kernel_set_vring_kick(struct vhost_dev *dev,
+                                       struct vhost_vring_file *file);
+
+int vhost_kernel_set_vring_call(struct vhost_dev *dev,
+                                       struct vhost_vring_file *file);
+
+int vhost_kernel_set_owner(struct vhost_dev *dev);
+
+int vhost_kernel_get_features(struct vhost_dev *dev,
+                                     uint64_t *features);
+
+
 #endif /* VHOST_BACKEND_H */
-- 
2.21.1




* [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
                   ` (5 preceding siblings ...)
  2020-05-29 14:06 ` [RFC v3 6/8] vhost-backend: export the vhost backend helper Cindy Lu
@ 2020-05-29 14:06 ` Cindy Lu
  2020-06-03  2:52   ` Jason Wang
                     ` (4 more replies)
  2020-05-29 14:06 ` [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client Cindy Lu
                   ` (3 subsequent siblings)
  10 siblings, 5 replies; 41+ messages in thread
From: Cindy Lu @ 2020-05-29 14:06 UTC
  To: mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, lulu, hanand, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	Tiwei Bie, aadam, rdunlap, maxime.coquelin, lingshan.zhu

From: Tiwei Bie <tiwei.bie@intel.com>

Currently we have two types of vhost backends in QEMU: vhost kernel and
vhost-user. The above patch provides a generic vDPA device, which exposes
to user space a non-vendor-specific configuration interface for setting
up a vhost HW accelerator. This patch introduces a third vhost backend,
called vhost-vdpa, based on that vDPA interface.

Vhost-vdpa usage:

  qemu-system-x86_64 -cpu host -enable-kvm \
    ......
  -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
  -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on

Co-Authored-By: Lingshan zhu <lingshan.zhu@intel.com>
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
 configure                         |  21 ++
 hw/net/vhost_net-stub.c           |   5 +
 hw/net/vhost_net.c                |  47 +++-
 hw/virtio/Makefile.objs           |   1 +
 hw/virtio/vhost-backend.c         |   5 +
 hw/virtio/vhost-vdpa.c            | 399 ++++++++++++++++++++++++++++++
 hw/virtio/vhost.c                 |  37 ++-
 include/hw/virtio/vhost-backend.h |  10 +-
 include/hw/virtio/vhost-vdpa.h    |  26 ++
 include/hw/virtio/vhost.h         |   2 +
 include/net/vhost_net.h           |   4 +-
 qemu-options.hx                   |  15 ++
 12 files changed, 566 insertions(+), 6 deletions(-)
 create mode 100644 hw/virtio/vhost-vdpa.c
 create mode 100644 include/hw/virtio/vhost-vdpa.h

diff --git a/configure b/configure
index 23b5e93752..53679ee57f 100755
--- a/configure
+++ b/configure
@@ -1557,6 +1557,10 @@ for opt do
   ;;
   --enable-vhost-user) vhost_user="yes"
   ;;
+  --disable-vhost-vdpa) vhost_vdpa="no"
+  ;;
+  --enable-vhost-vdpa) vhost_vdpa="yes"
+  ;;
   --disable-vhost-kernel) vhost_kernel="no"
   ;;
   --enable-vhost-kernel) vhost_kernel="yes"
@@ -1846,6 +1850,7 @@ disabled with --disable-FEATURE, default is enabled if available:
   vhost-crypto    vhost-user-crypto backend support
   vhost-kernel    vhost kernel backend support
   vhost-user      vhost-user backend support
+  vhost-vdpa      vhost-vdpa kernel backend support
   spice           spice
   rbd             rados block device (rbd)
   libiscsi        iscsi support
@@ -2336,6 +2341,10 @@ test "$vhost_user" = "" && vhost_user=yes
 if test "$vhost_user" = "yes" && test "$mingw32" = "yes"; then
   error_exit "vhost-user isn't available on win32"
 fi
+test "$vhost_vdpa" = "" && vhost_vdpa=$linux
+if test "$vhost_vdpa" = "yes" && test "$linux" != "yes"; then
+  error_exit "vhost-vdpa is only available on Linux"
+fi
 test "$vhost_kernel" = "" && vhost_kernel=$linux
 if test "$vhost_kernel" = "yes" && test "$linux" != "yes"; then
   error_exit "vhost-kernel is only available on Linux"
@@ -2364,6 +2373,11 @@ test "$vhost_user_fs" = "" && vhost_user_fs=$vhost_user
 if test "$vhost_user_fs" = "yes" && test "$vhost_user" = "no"; then
   error_exit "--enable-vhost-user-fs requires --enable-vhost-user"
 fi
+#vhost-vdpa backends
+test "$vhost_net_vdpa" = "" && vhost_net_vdpa=$vhost_vdpa
+if test "$vhost_net_vdpa" = "yes" && test "$vhost_vdpa" = "no"; then
+  error_exit "--enable-vhost-net-vdpa requires --enable-vhost-vdpa"
+fi
 
 # OR the vhost-kernel and vhost-user values for simplicity
 if test "$vhost_net" = ""; then
@@ -6673,6 +6687,7 @@ echo "vhost-scsi support $vhost_scsi"
 echo "vhost-vsock support $vhost_vsock"
 echo "vhost-user support $vhost_user"
 echo "vhost-user-fs support $vhost_user_fs"
+echo "vhost-vdpa support $vhost_vdpa"
 echo "Trace backends    $trace_backends"
 if have_backend "simple"; then
 echo "Trace output file $trace_file-<pid>"
@@ -7170,6 +7185,9 @@ fi
 if test "$vhost_net_user" = "yes" ; then
   echo "CONFIG_VHOST_NET_USER=y" >> $config_host_mak
 fi
+if test "$vhost_net_vdpa" = "yes" ; then
+  echo "CONFIG_VHOST_NET_VDPA=y" >> $config_host_mak
+fi
 if test "$vhost_crypto" = "yes" ; then
   echo "CONFIG_VHOST_CRYPTO=y" >> $config_host_mak
 fi
@@ -7182,6 +7200,9 @@ fi
 if test "$vhost_user" = "yes" ; then
   echo "CONFIG_VHOST_USER=y" >> $config_host_mak
 fi
+if test "$vhost_vdpa" = "yes" ; then
+  echo "CONFIG_VHOST_VDPA=y" >> $config_host_mak
+fi
 if test "$vhost_user_fs" = "yes" ; then
   echo "CONFIG_VHOST_USER_FS=y" >> $config_host_mak
 fi
diff --git a/hw/net/vhost_net-stub.c b/hw/net/vhost_net-stub.c
index 43e93e1a9a..ab77a92a7d 100644
--- a/hw/net/vhost_net-stub.c
+++ b/hw/net/vhost_net-stub.c
@@ -94,3 +94,8 @@ int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
 {
     return 0;
 }
+int vhost_net_get_device_id(struct vhost_net *net, uint32_t * device_id)
+{
+    return 0;
+}
+
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index e2bc7de2eb..25045cff59 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -17,8 +17,10 @@
 #include "net/net.h"
 #include "net/tap.h"
 #include "net/vhost-user.h"
+#include "net/vhost-vdpa.h"
 
 #include "standard-headers/linux/vhost_types.h"
+#include "linux-headers/linux/vhost.h"
 #include "hw/virtio/virtio-net.h"
 #include "net/vhost_net.h"
 #include "qemu/error-report.h"
@@ -85,6 +87,30 @@ static const int user_feature_bits[] = {
     VHOST_INVALID_FEATURE_BIT
 };
 
+static const int vdpa_feature_bits[] = {
+    VIRTIO_F_NOTIFY_ON_EMPTY,
+    VIRTIO_RING_F_INDIRECT_DESC,
+    VIRTIO_RING_F_EVENT_IDX,
+    VIRTIO_F_ANY_LAYOUT,
+    VIRTIO_F_VERSION_1,
+    VIRTIO_NET_F_CSUM,
+    VIRTIO_NET_F_GUEST_CSUM,
+    VIRTIO_NET_F_GSO,
+    VIRTIO_NET_F_GUEST_TSO4,
+    VIRTIO_NET_F_GUEST_TSO6,
+    VIRTIO_NET_F_GUEST_ECN,
+    VIRTIO_NET_F_GUEST_UFO,
+    VIRTIO_NET_F_HOST_TSO4,
+    VIRTIO_NET_F_HOST_TSO6,
+    VIRTIO_NET_F_HOST_ECN,
+    VIRTIO_NET_F_HOST_UFO,
+    VIRTIO_NET_F_MRG_RXBUF,
+    VIRTIO_NET_F_MTU,
+    VIRTIO_F_IOMMU_PLATFORM,
+    VIRTIO_F_RING_PACKED,
+    VIRTIO_NET_F_GUEST_ANNOUNCE,
+    VHOST_INVALID_FEATURE_BIT
+};
 static const int *vhost_net_get_feature_bits(struct vhost_net *net)
 {
     const int *feature_bits = 0;
@@ -96,6 +122,9 @@ static const int *vhost_net_get_feature_bits(struct vhost_net *net)
     case NET_CLIENT_DRIVER_VHOST_USER:
         feature_bits = user_feature_bits;
         break;
+    case NET_CLIENT_DRIVER_VHOST_VDPA:
+        feature_bits = vdpa_feature_bits;
+        break;
     default:
         error_report("Feature bits not defined for this type: %d",
                 net->nc->info->type);
@@ -110,7 +139,10 @@ uint64_t vhost_net_get_features(struct vhost_net *net, uint64_t features)
     return vhost_get_features(&net->dev, vhost_net_get_feature_bits(net),
             features);
 }
-
+int vhost_net_get_device_id(struct vhost_net *net, uint32_t * device_id)
+{
+    return vhost_dev_get_device_id(&net->dev, device_id);
+}
 void vhost_net_ack_features(struct vhost_net *net, uint64_t features)
 {
     net->dev.acked_features = net->dev.backend_features;
@@ -337,6 +369,11 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
     }
 
     for (i = 0; i < total_queues; i++) {
         peer = qemu_get_peer(ncs, i);
+
+        if (virtio_queue_enabled(dev, i)) {
+            vhost_set_vring_ready(peer);
+        }
+
         r = vhost_net_start_one(get_vhost_net(peer), dev);
 
@@ -433,6 +470,12 @@ VHostNetState *get_vhost_net(NetClientState *nc)
         vhost_net = vhost_user_get_vhost_net(nc);
         assert(vhost_net);
         break;
+#endif
+#ifdef CONFIG_VHOST_NET_VDPA
+    case NET_CLIENT_DRIVER_VHOST_VDPA:
+        vhost_net = vhost_vdpa_get_vhost_net(nc);
+        assert(vhost_net);
+        break;
 #endif
     default:
         break;
@@ -474,3 +517,5 @@ int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
 
     return vhost_ops->vhost_net_set_mtu(&net->dev, mtu);
 }
+
+
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 4e4d39a0a4..6b1b1a5fce 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -5,6 +5,7 @@ obj-y += virtio.o
 obj-$(CONFIG_VHOST) += vhost.o vhost-backend.o
 common-obj-$(call lnot,$(CONFIG_VHOST)) += vhost-stub.o
 obj-$(CONFIG_VHOST_USER) += vhost-user.o
+obj-$(CONFIG_VHOST_VDPA) += vhost-vdpa.o
 
 common-obj-$(CONFIG_VIRTIO_RNG) += virtio-rng.o
 common-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index 42efb4967b..420341e8c5 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -291,6 +291,11 @@ int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType backend_type)
     case VHOST_BACKEND_TYPE_USER:
         dev->vhost_ops = &user_ops;
         break;
+#endif
+#ifdef CONFIG_VHOST_VDPA
+    case VHOST_BACKEND_TYPE_VDPA:
+        dev->vhost_ops = &vdpa_ops;
+        break;
 #endif
     default:
         error_report("Unknown vhost backend type");
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
new file mode 100644
index 0000000000..2d136a8565
--- /dev/null
+++ b/hw/virtio/vhost-vdpa.c
@@ -0,0 +1,399 @@
+/*
+ * vhost-vdpa
+ *
+ *  Copyright(c) 2017-2018 Intel Corporation.
+ *  Copyright(c) 2020 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include <linux/vhost.h>
+#include <linux/vfio.h>
+#include <sys/eventfd.h>
+#include <sys/ioctl.h>
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-backend.h"
+#include "hw/virtio/virtio-net.h"
+#include "hw/virtio/vhost-vdpa.h"
+#include "qemu/main-loop.h"
+#include <linux/kvm.h>
+#include "sysemu/kvm.h"
+
+
+static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section)
+{
+    return (!memory_region_is_ram(section->mr) &&
+            !memory_region_is_iommu(section->mr)) ||
+           /*
+            * Sizing an enabled 64-bit BAR can cause spurious mappings to
+            * addresses in the upper part of the 64-bit address space.  These
+            * are never accessed by the CPU and beyond the address width of
+            * some IOMMU hardware.  TODO: VDPA should tell us the IOMMU width.
+            */
+           section->offset_within_address_space & (1ULL << 63);
+}
+
+static int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
+                              void *vaddr, bool readonly)
+{
+    struct vhost_msg_v2 msg = {};
+    int fd = v->device_fd;
+    int ret = 0;
+
+    msg.type = v->msg_type;
+    msg.iotlb.iova = iova;
+    msg.iotlb.size = size;
+    msg.iotlb.uaddr = (uint64_t)vaddr;
+    msg.iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW;
+    msg.iotlb.type = VHOST_IOTLB_UPDATE;
+
+    if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
+        error_report("failed to write, fd=%d, errno=%d (%s)",
+            fd, errno, strerror(errno));
+        return -EIO;
+    }
+
+    return ret;
+}
+
+static int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova,
+                                hwaddr size)
+{
+    struct vhost_msg_v2 msg = {};
+    int fd = v->device_fd;
+    int ret = 0;
+
+    msg.type = v->msg_type;
+    msg.iotlb.iova = iova;
+    msg.iotlb.size = size;
+    msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
+
+    if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
+        error_report("failed to write, fd=%d, errno=%d (%s)",
+            fd, errno, strerror(errno));
+        return -EIO;
+    }
+
+    return ret;
+}
+
+static void vhost_vdpa_listener_region_add(MemoryListener *listener,
+                                           MemoryRegionSection *section)
+{
+    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
+    hwaddr iova;
+    Int128 llend, llsize;
+    void *vaddr;
+    int ret;
+
+    if (vhost_vdpa_listener_skipped_section(section)) {
+        return;
+    }
+
+    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
+                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
+        error_report("%s received unaligned region", __func__);
+        return;
+    }
+
+    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+    llend = int128_make64(section->offset_within_address_space);
+    llend = int128_add(llend, section->size);
+    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
+
+    if (int128_ge(int128_make64(iova), llend)) {
+        return;
+    }
+
+    memory_region_ref(section->mr);
+
+    /* Here we assume that memory_region_is_ram(section->mr)==true */
+
+    vaddr = memory_region_get_ram_ptr(section->mr) +
+            section->offset_within_region +
+            (iova - section->offset_within_address_space);
+
+    llsize = int128_sub(llend, int128_make64(iova));
+
+    ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
+                             vaddr, section->readonly);
+    if (ret) {
+        error_report("vhost vdpa map fail!");
+        if (memory_region_is_ram_device(section->mr)) {
+            /* Allow unexpected mappings not to be fatal for RAM devices */
+            error_report("map ram fail!");
+            return;
+        }
+        goto fail;
+    }
+
+    return;
+
+fail:
+    if (memory_region_is_ram_device(section->mr)) {
+        error_report("failed to vdpa_dma_map. pci p2p may not work");
+        return;
+
+    }
+    /*
+     * On the initfn path, store the first error in the container so we
+     * can gracefully fail.  Runtime, there's not much we can do other
+     * than throw a hardware error.
+     */
+    error_report("vhost-vdpa: DMA mapping failed, unable to continue");
+    return;
+
+}
+
+static void vhost_vdpa_listener_region_del(MemoryListener *listener,
+                                           MemoryRegionSection *section)
+{
+    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
+    hwaddr iova;
+    Int128 llend, llsize;
+    int ret;
+    bool try_unmap = true;
+
+    if (vhost_vdpa_listener_skipped_section(section)) {
+        return;
+    }
+
+    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
+                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
+        error_report("%s received unaligned region", __func__);
+        return;
+    }
+
+    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+    llend = int128_make64(section->offset_within_address_space);
+    llend = int128_add(llend, section->size);
+    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
+
+    if (int128_ge(int128_make64(iova), llend)) {
+        return;
+    }
+
+    llsize = int128_sub(llend, int128_make64(iova));
+
+    if (try_unmap) {
+        ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
+        if (ret) {
+            error_report("vhost_vdpa dma unmap error!");
+        }
+    }
+
+    memory_region_unref(section->mr);
+}
+/* Register a new memory listener only to get diffs from QEMU;
+ * this helps to reduce the tricky code in vhost
+ * (e.g. generating diffs of two rbtrees, as usnic did). */
+static const MemoryListener vhost_vdpa_memory_listener = {
+    .region_add = vhost_vdpa_listener_region_add,
+    .region_del = vhost_vdpa_listener_region_del,
+};
+
+static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
+                             void *arg)
+{
+    struct vhost_vdpa *v = dev->opaque;
+    int fd = v->device_fd;
+
+    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
+
+    return ioctl(fd, request, arg);
+}
+
+static void vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
+{
+    uint8_t s;
+
+    if (vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s)) {
+        return;
+    }
+
+    s |= status;
+
+    vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
+}
+
+static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque)
+{
+    struct vhost_vdpa *v;
+
+    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
+
+    v = opaque;
+    dev->opaque = opaque;
+
+    v->listener = vhost_vdpa_memory_listener;
+    v->msg_type = VHOST_IOTLB_MSG_V2;
+    memory_listener_register(&v->listener, &address_space_memory);
+
+    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
+                               VIRTIO_CONFIG_S_DRIVER);
+
+    return 0;
+}
+
+static int vhost_vdpa_cleanup(struct vhost_dev *dev)
+{
+    struct vhost_vdpa *v;
+    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
+
+    v = dev->opaque;
+    memory_listener_unregister(&v->listener);
+
+    dev->opaque = NULL;
+    return 0;
+}
+
+static int vhost_vdpa_memslots_limit(struct vhost_dev *dev)
+{
+    return INT_MAX;
+}
+
+static int vhost_vdpa_set_mem_table(struct vhost_dev *dev,
+                                    struct vhost_memory *mem)
+{
+
+    if (mem->padding) {
+        return -1;
+    }
+
+    return 0;
+}
+
+static int vhost_vdpa_set_features(struct vhost_dev *dev,
+                                   uint64_t features)
+{
+    int ret = vhost_vdpa_call(dev, VHOST_SET_FEATURES, &features);
+    uint8_t status = 0;
+
+    if (ret) {
+        return ret;
+    }
+    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK);
+    vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &status);
+
+    return !(status & VIRTIO_CONFIG_S_FEATURES_OK);
+}
+
+int vhost_vdpa_get_device_id(struct vhost_dev *dev,
+                                   uint32_t *device_id)
+{
+    return vhost_vdpa_call(dev, VHOST_VDPA_GET_DEVICE_ID, device_id);
+}
+
+static int vhost_vdpa_reset_device(struct vhost_dev *dev)
+{
+    uint8_t status = 0;
+
+    return vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
+}
+
+static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
+{
+    assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
+
+    return idx - dev->vq_index;
+}
+
+static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev)
+{
+    int i;
+    for (i = 0; i < dev->nvqs; ++i) {
+        struct vhost_vring_state state = {
+            .index = dev->vq_index + i,
+            .num = 1,
+        };
+        vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
+    }
+    return 0;
+}
+
+static int vhost_vdpa_set_config(struct vhost_dev *dev, const uint8_t *data,
+                                   uint32_t offset, uint32_t size,
+                                   uint32_t flags)
+{
+    struct vhost_vdpa_config config;
+    int ret;
+    if ((size > VHOST_VDPA_MAX_CONFIG_SIZE) || (data == NULL)) {
+        return -1;
+    }
+    memset(&config, 0, sizeof(struct vhost_vdpa_config));
+    config.off = offset;
+    config.len = size;
+    memcpy(&config.buf, data, size);
+    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_CONFIG, &config);
+    return ret;
+}
+
+static int vhost_vdpa_get_config(struct vhost_dev *dev, uint8_t *config,
+                                   uint32_t config_len)
+{
+    struct vhost_vdpa_config v_config;
+    int ret;
+
+    memset(&v_config, 0, sizeof(struct vhost_vdpa_config));
+    if (config == NULL) {
+        return -1;
+    }
+    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_CONFIG, &v_config);
+    if ((v_config.len > config_len) || (v_config.len == 0)) {
+        return -EINVAL;
+    }
+    memcpy(config, &v_config.buf, config_len);
+    return ret;
+}
+
+static int vhost_vdpa_set_state(struct vhost_dev *dev, bool started)
+{
+    if (started) {
+        uint8_t status = 0;
+
+        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
+        vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &status);
+
+        return !(status & VIRTIO_CONFIG_S_DRIVER_OK);
+    } else {
+        vhost_vdpa_reset_device(dev);
+        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
+                                   VIRTIO_CONFIG_S_DRIVER);
+        return 0;
+    }
+}
+
+const VhostOps vdpa_ops = {
+        .backend_type = VHOST_BACKEND_TYPE_VDPA,
+        .vhost_backend_init = vhost_vdpa_init,
+        .vhost_backend_cleanup = vhost_vdpa_cleanup,
+        .vhost_set_log_base = vhost_kernel_set_log_base,
+        .vhost_set_vring_addr = vhost_kernel_set_vring_addr,
+        .vhost_set_vring_num = vhost_kernel_set_vring_num,
+        .vhost_set_vring_base = vhost_kernel_set_vring_base,
+        .vhost_get_vring_base = vhost_kernel_get_vring_base,
+        .vhost_set_vring_kick = vhost_kernel_set_vring_kick,
+        .vhost_set_vring_call = vhost_kernel_set_vring_call,
+        .vhost_get_features = vhost_kernel_get_features,
+        .vhost_set_owner = vhost_kernel_set_owner,
+        .vhost_set_vring_endian = NULL,
+        .vhost_backend_memslots_limit = vhost_vdpa_memslots_limit,
+        .vhost_set_mem_table = vhost_vdpa_set_mem_table,
+        .vhost_set_features = vhost_vdpa_set_features,
+        .vhost_reset_device = vhost_vdpa_reset_device,
+        .vhost_get_vq_index = vhost_vdpa_get_vq_index,
+        .vhost_set_vring_ready = vhost_vdpa_set_vring_ready,
+        .vhost_get_config  = vhost_vdpa_get_config,
+        .vhost_set_config = vhost_vdpa_set_config,
+        .vhost_requires_shm_log = NULL,
+        .vhost_migration_done = NULL,
+        .vhost_backend_can_merge = NULL,
+        .vhost_net_set_mtu = NULL,
+        .vhost_set_iotlb_callback = NULL,
+        .vhost_send_device_iotlb_msg = NULL,
+        .vhost_set_state = vhost_vdpa_set_state,
+        .vhost_get_device_id = vhost_vdpa_get_device_id,
+};
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 01ebe12f28..b97aa02a4c 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -756,6 +756,12 @@ static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
         .log_guest_addr = vq->used_phys,
         .flags = enable_log ? (1 << VHOST_VRING_F_LOG) : 0,
     };
+    /* vDPA needs the physical addresses here to program the hardware */
+    if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA) {
+        addr.desc_user_addr = (uint64_t)(unsigned long)vq->desc_phys;
+        addr.avail_user_addr = (uint64_t)(unsigned long)vq->avail_phys;
+        addr.used_user_addr = (uint64_t)(unsigned long)vq->used_phys;
+    }
     int r = dev->vhost_ops->vhost_set_vring_addr(dev, &addr);
     if (r < 0) {
         VHOST_OPS_DEBUG("vhost_set_vring_addr failed");
@@ -1506,6 +1512,14 @@ int vhost_dev_set_config(struct vhost_dev *hdev, const uint8_t *data,
     return -1;
 }
 
+int vhost_dev_get_device_id(struct vhost_dev *hdev, uint32_t *device_id)
+{
+    assert(hdev->vhost_ops);
+    if (hdev->vhost_ops->vhost_get_device_id) {
+        return hdev->vhost_ops->vhost_get_device_id(hdev, device_id);
+    }
+    return -1;
+}
 void vhost_dev_set_config_notifier(struct vhost_dev *hdev,
                                    const VhostDevConfigOps *ops)
 {
@@ -1661,7 +1675,13 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
         }
     }
 
-    if (vhost_dev_has_iommu(hdev)) {
+    r = vhost_set_state(hdev, true);
+    if (r) {
+        goto fail_log;
+    }
+
+    if (vhost_dev_has_iommu(hdev) &&
+        hdev->vhost_ops->vhost_set_iotlb_callback) {
         hdev->vhost_ops->vhost_set_iotlb_callback(hdev, true);
 
         /* Update used ring information for IOTLB to work correctly,
@@ -1697,6 +1717,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
     /* should only be called after backend is connected */
     assert(hdev->vhost_ops);
 
+    vhost_set_state(hdev, false);
+
     for (i = 0; i < hdev->nvqs; ++i) {
         vhost_virtqueue_stop(hdev,
                              vdev,
@@ -1705,7 +1727,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
     }
 
     if (vhost_dev_has_iommu(hdev)) {
-        hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
+        if (hdev->vhost_ops->vhost_set_iotlb_callback) {
+            hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
+        }
         memory_listener_unregister(&hdev->iommu_listener);
     }
     vhost_log_put(hdev, true);
@@ -1722,3 +1746,12 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
 
     return -1;
 }
+
+int vhost_set_state(struct vhost_dev *hdev, bool started)
+{
+    if (hdev->vhost_ops->vhost_set_state) {
+        return hdev->vhost_ops->vhost_set_state(hdev, started);
+    }
+
+    return 0;
+}
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index 300b59c172..1ebe3785cf 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -17,7 +17,8 @@ typedef enum VhostBackendType {
     VHOST_BACKEND_TYPE_NONE = 0,
     VHOST_BACKEND_TYPE_KERNEL = 1,
     VHOST_BACKEND_TYPE_USER = 2,
-    VHOST_BACKEND_TYPE_MAX = 3,
+    VHOST_BACKEND_TYPE_VDPA = 3,
+    VHOST_BACKEND_TYPE_MAX = 4,
 } VhostBackendType;
 
 typedef enum VhostSetConfigType {
@@ -77,6 +78,7 @@ typedef int (*vhost_reset_device_op)(struct vhost_dev *dev);
 typedef int (*vhost_get_vq_index_op)(struct vhost_dev *dev, int idx);
 typedef int (*vhost_set_vring_enable_op)(struct vhost_dev *dev,
                                          int enable);
+typedef int (*vhost_set_vring_ready_op)(struct vhost_dev *dev);
 typedef bool (*vhost_requires_shm_log_op)(struct vhost_dev *dev);
 typedef int (*vhost_migration_done_op)(struct vhost_dev *dev,
                                        char *mac_addr);
@@ -112,6 +114,8 @@ typedef int (*vhost_get_inflight_fd_op)(struct vhost_dev *dev,
 typedef int (*vhost_set_inflight_fd_op)(struct vhost_dev *dev,
                                         struct vhost_inflight *inflight);
 
+typedef int (*vhost_set_state_op)(struct vhost_dev *dev, bool started);
+typedef int (*vhost_get_device_id_op)(struct vhost_dev *dev, uint32_t *dev_id);
 typedef struct VhostOps {
     VhostBackendType backend_type;
     vhost_backend_init vhost_backend_init;
@@ -138,6 +142,7 @@ typedef struct VhostOps {
     vhost_reset_device_op vhost_reset_device;
     vhost_get_vq_index_op vhost_get_vq_index;
     vhost_set_vring_enable_op vhost_set_vring_enable;
+    vhost_set_vring_ready_op vhost_set_vring_ready;
     vhost_requires_shm_log_op vhost_requires_shm_log;
     vhost_migration_done_op vhost_migration_done;
     vhost_backend_can_merge_op vhost_backend_can_merge;
@@ -152,9 +157,12 @@ typedef struct VhostOps {
     vhost_backend_mem_section_filter_op vhost_backend_mem_section_filter;
     vhost_get_inflight_fd_op vhost_get_inflight_fd;
     vhost_set_inflight_fd_op vhost_set_inflight_fd;
+    vhost_set_state_op vhost_set_state;
+    vhost_get_device_id_op vhost_get_device_id;
 } VhostOps;
 
 extern const VhostOps user_ops;
+extern const VhostOps vdpa_ops;
 
 int vhost_set_backend_type(struct vhost_dev *dev,
                            VhostBackendType backend_type);
diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
new file mode 100644
index 0000000000..6455663388
--- /dev/null
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -0,0 +1,26 @@
+/*
+ * vhost-vdpa.h
+ *
+ * Copyright(c) 2017-2018 Intel Corporation.
+ * Copyright(c) 2020 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef HW_VIRTIO_VHOST_VDPA_H
+#define HW_VIRTIO_VHOST_VDPA_H
+
+#include "hw/virtio/virtio.h"
+
+typedef struct vhost_vdpa {
+    int device_fd;
+    uint32_t msg_type;
+    MemoryListener listener;
+} VhostVDPA;
+
+extern AddressSpace address_space_memory;
+extern int vhost_vdpa_get_device_id(struct vhost_dev *dev,
+                                   uint32_t *device_id);
+#endif
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 085450c6f8..b682545f51 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -124,6 +124,7 @@ int vhost_dev_get_config(struct vhost_dev *dev, uint8_t *config,
                          uint32_t config_len);
 int vhost_dev_set_config(struct vhost_dev *dev, const uint8_t *data,
                          uint32_t offset, uint32_t size, uint32_t flags);
+int vhost_dev_get_device_id(struct vhost_dev *hdev, uint32_t *device_id);
 /* notifier callback in case vhost device config space changed
  */
 void vhost_dev_set_config_notifier(struct vhost_dev *dev,
@@ -137,4 +138,5 @@ int vhost_dev_set_inflight(struct vhost_dev *dev,
                            struct vhost_inflight *inflight);
 int vhost_dev_get_inflight(struct vhost_dev *dev, uint16_t queue_size,
                            struct vhost_inflight *inflight);
+int vhost_set_state(struct vhost_dev *dev, bool started);
 #endif
diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index 8a6f208189..56e67fe164 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -40,5 +40,5 @@ int vhost_set_vring_ready(NetClientState *nc);
 uint64_t vhost_net_get_acked_features(VHostNetState *net);
 
 int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu);
-
-#endif
+int vhost_net_get_device_id(struct vhost_net *net, uint32_t *device_id);
+endif
diff --git a/qemu-options.hx b/qemu-options.hx
index 292d4e7c0c..c19e10ce9c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2409,6 +2409,10 @@ DEF("netdev", HAS_ARG, QEMU_OPTION_netdev,
 #ifdef CONFIG_POSIX
     "-netdev vhost-user,id=str,chardev=dev[,vhostforce=on|off]\n"
     "                configure a vhost-user network, backed by a chardev 'dev'\n"
+#endif
+#ifdef CONFIG_POSIX
+    "-netdev vhost-vdpa,id=str,vhostdev=/path/to/dev\n"
+    "                configure a vhost-vdpa network backed by the device at 'vhostdev'\n"
 #endif
     "-netdev hubport,id=str,hubid=n[,netdev=nd]\n"
     "                configure a hub port on the hub with ID 'n'\n", QEMU_ARCH_ALL)
@@ -2428,6 +2432,9 @@ DEF("nic", HAS_ARG, QEMU_OPTION_nic,
 #endif
 #ifdef CONFIG_POSIX
     "vhost-user|"
+#endif
+#ifdef CONFIG_POSIX
+    "vhost-vdpa|"
 #endif
     "socket][,option][,...][mac=macaddr]\n"
     "                initialize an on-board / default host NIC (using MAC address\n"
@@ -2896,6 +2903,14 @@ SRST
     hubport to another netdev with ID nd by using the ``netdev=nd``
     option.
 
+``-netdev vhost-vdpa,vhostdev=/path/to/dev``
+    Establish a vhost-vdpa netdev.
+
+    A vDPA device uses a datapath that complies with the virtio
+    specification, combined with a vendor-specific control path.
+    vDPA devices can be either physically located in hardware or
+    emulated by software.
+
 ``-net nic[,netdev=nd][,macaddr=mac][,model=type] [,name=name][,addr=addr][,vectors=v]``
     Legacy option to configure or create an on-board (or machine
     default) Network Interface Card(NIC) and connect it either to the
-- 
2.21.1



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client
  2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
                   ` (6 preceding siblings ...)
  2020-05-29 14:06 ` [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend Cindy Lu
@ 2020-05-29 14:06 ` Cindy Lu
  2020-05-29 14:22   ` Eric Blake
  2020-06-03  6:39   ` Jason Wang
  2020-05-29 20:29 ` [RFC v3 0/8] vDPA support in qemu no-reply
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 41+ messages in thread
From: Cindy Lu @ 2020-05-29 14:06 UTC (permalink / raw)
  To: mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, lulu, hanand, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	Tiwei Bie, aadam, rdunlap, maxime.coquelin, lingshan.zhu

From: Tiwei Bie <tiwei.bie@intel.com>

This patch introduces a new net client type: vhost-vdpa.
The vhost-vdpa net client sets up the vDPA device specified
by the "vhostdev" parameter.
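
As a rough illustration of how the device descriptor is chosen, the
sketch below mirrors the fd-or-path decision made at client creation.
It is not code from this patch: the helper name example_get_vdpa_fd and
its int-typed passed_fd argument are assumptions for illustration (the
patch itself receives the fd as a string and resolves it through the
monitor).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: select the vDPA device fd.
 * If the caller already holds an open descriptor, use it; otherwise open
 * the path given by "vhostdev". Returns the fd on success or a negative
 * errno value on failure.
 */
static int example_get_vdpa_fd(const char *vhostdev, bool has_fd,
                               int passed_fd)
{
    int fd;

    if (has_fd) {
        /* A descriptor was handed in; reject obviously invalid values. */
        return passed_fd >= 0 ? passed_fd : -EBADF;
    }

    if (vhostdev == NULL) {
        /* Neither an fd nor a device path was supplied. */
        return -EINVAL;
    }

    fd = open(vhostdev, O_RDWR);
    return fd >= 0 ? fd : -errno;
}
```

Returning a negative errno in every failure case lets the caller report
why the device could not be obtained without consulting errno again.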

Co-authored-by: Lingshan Zhu <lingshan.zhu@intel.com>
Signed-off-by: Cindy Lu <lulu@redhat.com>
---
 include/net/vhost-vdpa.h |  19 ++++
 include/net/vhost_net.h  |   2 +-
 net/Makefile.objs        |   2 +-
 net/clients.h            |   2 +
 net/net.c                |   3 +
 net/vhost-vdpa.c         | 235 +++++++++++++++++++++++++++++++++++++++
 qapi/net.json            |  26 ++++-
 7 files changed, 285 insertions(+), 4 deletions(-)
 create mode 100644 include/net/vhost-vdpa.h
 create mode 100644 net/vhost-vdpa.c

diff --git a/include/net/vhost-vdpa.h b/include/net/vhost-vdpa.h
new file mode 100644
index 0000000000..6ce0d04f72
--- /dev/null
+++ b/include/net/vhost-vdpa.h
@@ -0,0 +1,19 @@
+/*
+ * vhost-vdpa.h
+ *
+ * Copyright(c) 2017-2018 Intel Corporation.
+ * Copyright(c) 2020 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef VHOST_VDPA_H
+#define VHOST_VDPA_H
+
+struct vhost_net;
+struct vhost_net *vhost_vdpa_get_vhost_net(NetClientState *nc);
+uint64_t vhost_vdpa_get_acked_features(NetClientState *nc);
+
+#endif /* VHOST_VDPA_H */
diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index 56e67fe164..0b87d3c6e9 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -41,4 +41,4 @@ uint64_t vhost_net_get_acked_features(VHostNetState *net);
 
 int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu);
 int vhost_net_get_device_id(struct vhost_net *net, uint32_t *device_id);
-endif
+#endif
diff --git a/net/Makefile.objs b/net/Makefile.objs
index c5d076d19c..5ab45545db 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -26,7 +26,7 @@ tap-obj-$(CONFIG_SOLARIS) = tap-solaris.o
 tap-obj-y ?= tap-stub.o
 common-obj-$(CONFIG_POSIX) += tap.o $(tap-obj-y)
 common-obj-$(CONFIG_WIN32) += tap-win32.o
-
+common-obj-$(CONFIG_VHOST_NET_VDPA) += vhost-vdpa.o
 vde.o-libs = $(VDE_LIBS)
 
 common-obj-$(CONFIG_CAN_BUS) += can/
diff --git a/net/clients.h b/net/clients.h
index a6ef267e19..92f9b59aed 100644
--- a/net/clients.h
+++ b/net/clients.h
@@ -61,4 +61,6 @@ int net_init_netmap(const Netdev *netdev, const char *name,
 int net_init_vhost_user(const Netdev *netdev, const char *name,
                         NetClientState *peer, Error **errp);
 
+int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
+                        NetClientState *peer, Error **errp);
 #endif /* QEMU_NET_CLIENTS_H */
diff --git a/net/net.c b/net/net.c
index 599fb61028..82624ea9ac 100644
--- a/net/net.c
+++ b/net/net.c
@@ -965,6 +965,9 @@ static int (* const net_client_init_fun[NET_CLIENT_DRIVER__MAX])(
 #ifdef CONFIG_VHOST_NET_USER
         [NET_CLIENT_DRIVER_VHOST_USER] = net_init_vhost_user,
 #endif
+#ifdef CONFIG_VHOST_NET_VDPA
+        [NET_CLIENT_DRIVER_VHOST_VDPA] = net_init_vhost_vdpa,
+#endif
 #ifdef CONFIG_L2TPV3
         [NET_CLIENT_DRIVER_L2TPV3]    = net_init_l2tpv3,
 #endif
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
new file mode 100644
index 0000000000..7b98c142b5
--- /dev/null
+++ b/net/vhost-vdpa.c
@@ -0,0 +1,235 @@
+/*
+ * vhost-vdpa.c
+ *
+ * Copyright(c) 2017-2018 Intel Corporation.
+ * Copyright(c) 2020 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "clients.h"
+#include "net/vhost_net.h"
+#include "net/vhost-vdpa.h"
+#include "hw/virtio/vhost-vdpa.h"
+#include "qemu/config-file.h"
+#include "qemu/error-report.h"
+#include "qemu/option.h"
+#include "qapi/error.h"
+#include <sys/ioctl.h>
+#include <err.h>
+#include "standard-headers/linux/virtio_net.h"
+#include "monitor/monitor.h"
+#include "hw/virtio/vhost.h"
+
+/* TODO: add multiqueue support here */
+typedef struct VhostVDPAState {
+    NetClientState nc;
+    struct vhost_vdpa vhost_vdpa;
+    VHostNetState *vhost_net;
+    uint64_t acked_features;
+    bool started;
+} VhostVDPAState;
+
+VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
+{
+    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+    return s->vhost_net;
+}
+
+uint64_t vhost_vdpa_get_acked_features(NetClientState *nc)
+{
+    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+    return s->acked_features;
+}
+
+static int vhost_vdpa_check_device_id(NetClientState *nc)
+{
+    uint32_t device_id;
+    int ret;
+    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+    /* Get the device id from hw */
+    ret = vhost_net_get_device_id(s->vhost_net, &device_id);
+    if (ret == 0 && device_id != VIRTIO_ID_NET) {
+        return -ENOTSUP;
+    }
+    return ret;
+}
+
+static void vhost_vdpa_del(NetClientState *ncs)
+{
+    VhostVDPAState *s;
+
+    assert(ncs->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+
+    s = DO_UPCAST(VhostVDPAState, nc, ncs);
+
+    if (s->vhost_net) {
+        /* save acked features */
+        uint64_t features = vhost_net_get_acked_features(s->vhost_net);
+        if (features) {
+            s->acked_features = features;
+        }
+        vhost_net_cleanup(s->vhost_net);
+    }
+}
+
+static int vhost_vdpa_add(NetClientState *ncs, void *be)
+{
+    VhostNetOptions options;
+    struct vhost_net *net = NULL;
+    VhostVDPAState *s;
+    int ret;
+
+    options.backend_type = VHOST_BACKEND_TYPE_VDPA;
+
+    assert(ncs->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+
+    s = DO_UPCAST(VhostVDPAState, nc, ncs);
+
+    options.net_backend = ncs;
+    options.opaque      = be;
+    options.busyloop_timeout = 0;
+    net = vhost_net_init(&options);
+    if (!net) {
+        error_report("failed to init vhost_net for queue");
+        goto err;
+    }
+
+    if (s->vhost_net) {
+        vhost_net_cleanup(s->vhost_net);
+        g_free(s->vhost_net);
+    }
+    s->vhost_net = net;
+    /* check the device id for vdpa */
+    ret = vhost_vdpa_check_device_id(ncs);
+    if (ret) {
+        goto err;
+    }
+    return 0;
+err:
+    if (net) {
+        vhost_net_cleanup(net);
+    }
+    vhost_vdpa_del(ncs);
+    return -1;
+}
+
+static void vhost_vdpa_cleanup(NetClientState *nc)
+{
+    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+
+    if (s->vhost_net) {
+        vhost_net_cleanup(s->vhost_net);
+        g_free(s->vhost_net);
+        s->vhost_net = NULL;
+    }
+
+    qemu_purge_queued_packets(nc);
+}
+
+static bool vhost_vdpa_has_vnet_hdr(NetClientState *nc)
+{
+    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+
+    return true;
+}
+
+static bool vhost_vdpa_has_ufo(NetClientState *nc)
+{
+    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
+    uint64_t features = 0;
+
+    features |= (1ULL << VIRTIO_NET_F_HOST_UFO);
+    features = vhost_net_get_features(s->vhost_net, features);
+    return !!(features & (1ULL << VIRTIO_NET_F_HOST_UFO));
+
+}
+
+static NetClientInfo net_vhost_vdpa_info = {
+        .type = NET_CLIENT_DRIVER_VHOST_VDPA,
+        .size = sizeof(VhostVDPAState),
+        .cleanup = vhost_vdpa_cleanup,
+        .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
+        .has_ufo = vhost_vdpa_has_ufo,
+};
+
+static int net_vhost_vdpa_init(NetClientState *peer, const char *device,
+                               const char *name, const char *vhostdev,
+                               bool has_fd, char *fd)
+{
+    NetClientState *nc = NULL;
+    VhostVDPAState *s;
+    int vdpa_device_fd = -1;
+    Error *err = NULL;
+    int ret = 0;
+    assert(name);
+
+    nc = qemu_new_net_client(&net_vhost_vdpa_info, peer, device, name);
+    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-vdpa");
+    nc->queue_index = 0;
+
+    s = DO_UPCAST(VhostVDPAState, nc, nc);
+
+    if (has_fd) {
+        vdpa_device_fd = monitor_fd_param(cur_mon, fd, &err);
+    } else {
+        vdpa_device_fd = open(vhostdev, O_RDWR);
+    }
+
+    if (vdpa_device_fd == -1) {
+        return -errno;
+    }
+    s->vhost_vdpa.device_fd = vdpa_device_fd;
+    ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa);
+    assert(s->vhost_net);
+
+    if (ret) {
+        if (has_fd) {
+            close(vdpa_device_fd);
+        }
+    }
+    return ret;
+}
+
+static int net_vhost_check_net(void *opaque, QemuOpts *opts, Error **errp)
+{
+    const char *name = opaque;
+    const char *driver, *netdev;
+
+    driver = qemu_opt_get(opts, "driver");
+    netdev = qemu_opt_get(opts, "netdev");
+    if (!driver || !netdev) {
+        return 0;
+    }
+
+    if (strcmp(netdev, name) == 0 &&
+        !g_str_has_prefix(driver, "virtio-net-")) {
+        error_setg(errp, "vhost-vdpa requires frontend driver virtio-net-*");
+        return -1;
+    }
+
+    return 0;
+}
+
+int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
+                        NetClientState *peer, Error **errp)
+{
+    const NetdevVhostVDPAOptions *opts;
+
+    assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
+    opts = &netdev->u.vhost_vdpa;
+    /* verify net frontend */
+    if (qemu_opts_foreach(qemu_find_opts("device"), net_vhost_check_net,
+                          (char *)name, errp)) {
+        return -1;
+    }
+    return net_vhost_vdpa_init(peer, "vhost_vdpa", name, opts->vhostdev,
+                    opts->has_fd, opts->fd);
+}
diff --git a/qapi/net.json b/qapi/net.json
index cebb1b52e3..37507ce9ba 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -428,6 +428,27 @@
     '*vhostforce':    'bool',
     '*queues':        'int' } }
 
+##
+# @NetdevVhostVDPAOptions:
+#
+# Vhost-vdpa network backend
+#
+# @vhostdev: name of a vdpa dev path in sysfs
+#            (default path:/dev/vhost-vdpa-$ID)
+#
+# @fd: file descriptor of an already opened vdpa device
+#
+# @queues: number of queues to be created for multiqueue vhost-vdpa
+#          (default: 1)
+#
+# Since: 5.1
+##
+{ 'struct': 'NetdevVhostVDPAOptions',
+  'data': {
+    '*vhostdev':     'str',
+    '*fd':           'str',
+    '*queues':       'int' } }
+
 ##
 # @NetClientDriver:
 #
@@ -437,7 +458,7 @@
 ##
 { 'enum': 'NetClientDriver',
   'data': [ 'none', 'nic', 'user', 'tap', 'l2tpv3', 'socket', 'vde',
-            'bridge', 'hubport', 'netmap', 'vhost-user' ] }
+            'bridge', 'hubport', 'netmap', 'vhost-user', 'vhost-vdpa' ] }
 
 ##
 # @Netdev:
@@ -465,7 +486,8 @@
     'bridge':   'NetdevBridgeOptions',
     'hubport':  'NetdevHubPortOptions',
     'netmap':   'NetdevNetmapOptions',
-    'vhost-user': 'NetdevVhostUserOptions' } }
+    'vhost-user': 'NetdevVhostUserOptions',
+    'vhost-vdpa': 'NetdevVhostVDPAOptions' } }
 
 ##
 # @NetLegacy:
-- 
2.21.1



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client
  2020-05-29 14:06 ` [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client Cindy Lu
@ 2020-05-29 14:22   ` Eric Blake
  2020-06-01  1:41     ` Cindy Lu
  2020-06-03  6:39   ` Jason Wang
  1 sibling, 1 reply; 41+ messages in thread
From: Eric Blake @ 2020-05-29 14:22 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, hanand, hch, eperezma,
	jgg, shahafs, kevin.tian, parav, vmireyno, cunming.liang, gdawar,
	jiri, xiao.w.wang, stefanha, zhihong.wang, Tiwei Bie, aadam,
	rdunlap, maxime.coquelin, lingshan.zhu

On 5/29/20 9:06 AM, Cindy Lu wrote:
> From: Tiwei Bie <tiwei.bie@intel.com>
> 
> This patch set introduces a new net client type: vhost-vdpa.
> vhost-vdpa net client will set up a vDPA device which is specified
> by a "vhostdev" parameter.
> 
> Co-authored-by: Lingshan Zhu <lingshan.zhu@intel.com>
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---

> +static int net_vhost_vdpa_init(NetClientState *peer, const char *device,
> +                               const char *name, const char *vhostdev,
> +                               bool has_fd, char *fd)
> +{

fd is usually an int, not a string.

> +    NetClientState *nc = NULL;
> +    VhostVDPAState *s;
> +    int vdpa_device_fd = -1;
> +    Error *err = NULL;
> +    int ret = 0;
> +    assert(name);
> +
> +    nc = qemu_new_net_client(&net_vhost_vdpa_info, peer, device, name);
> +    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-vdpa");
> +    nc->queue_index = 0;
> +
> +    s = DO_UPCAST(VhostVDPAState, nc, nc);
> +
> +    if (has_fd) {
> +        vdpa_device_fd = monitor_fd_param(cur_mon, fd, &err);
> +    } else{
> +        vdpa_device_fd = open(vhostdev, O_RDWR);
> +    }

Oh, you're trying to use the old way for passing in fds.  The preferred 
way is to use qemu_open(), at which point you can pass in fds via the 
add-fd QMP command, and then pass the string "/dev/fdset/NNN" as 
vhostdev.  Then you don't need a special fd parameter here.
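
To illustrate the path convention only: a minimal sketch of how a
caller could tell a "/dev/fdset/NNN" string apart from an ordinary
device path. parse_fdset_path() is a hypothetical helper, not a QEMU
function; the real lookup of the fd set happens inside qemu_open() and
is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of QEMU): returns true and stores the
 * numeric id if 'path' has the form "/dev/fdset/NNN", false otherwise.
 */
static bool parse_fdset_path(const char *path, long *fdset_id)
{
    static const char prefix[] = "/dev/fdset/";
    const char *digits;
    char *end;
    long id;

    assert(path && fdset_id);

    if (strncmp(path, prefix, sizeof(prefix) - 1) != 0) {
        return false;               /* ordinary path, e.g. a device node */
    }

    digits = path + sizeof(prefix) - 1;
    if (*digits < '0' || *digits > '9') {
        return false;               /* empty id, sign or whitespace */
    }

    errno = 0;
    id = strtol(digits, &end, 10);
    if (errno != 0 || *end != '\0') {
        return false;               /* overflow or trailing garbage */
    }

    *fdset_id = id;
    return true;
}
```

With a single string parameter of this shape, vhostdev can name either
a device node or a previously passed fd set, which is why a separate fd
option becomes unnecessary.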

> +++ b/qapi/net.json
> @@ -428,6 +428,27 @@
>       '*vhostforce':    'bool',
>       '*queues':        'int' } }
>   
> +##
> +# @NetdevVhostVDPAOptions:
> +#
> +# Vhost-vdpa network backend
> +#
> +# @vhostdev: name of a vdpa dev path in sysfs
> +#            (default path:/dev/vhost-vdpa-$ID)
> +#
> +# @fd: file descriptor of an already opened vdpa device
> +#
> +# @queues: number of queues to be created for multiqueue vhost-vdpa
> +#          (default: 1)
> +#
> +# Since: 5.1
> +##
> +{ 'struct': 'NetdevVhostVDPAOptions',
> +  'data': {
> +    '*vhostdev':     'str',
> +    '*fd':           'str',
> +    '*queues':       'int' } }

Instead of having vhostdev and fd both be optional (but where the user 
has to specify exactly one of them), you should only have vhostdev be 
mandatory, and rely on the /dev/fdset/NNN string as a way to get 
vhostdev to point to a previously-passed fd.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC v3 0/8] vDPA support in qemu
  2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
                   ` (7 preceding siblings ...)
  2020-05-29 14:06 ` [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client Cindy Lu
@ 2020-05-29 20:29 ` no-reply
  2020-05-29 20:33 ` no-reply
  2020-05-29 20:37 ` no-reply
  10 siblings, 0 replies; 41+ messages in thread
From: no-reply @ 2020-05-29 20:29 UTC (permalink / raw)
  To: lulu
  Cc: rdunlap, mst, mhabets, qemu-devel, rob.miller, saugatm, lulu,
	armbru, hch, eperezma, jgg, jasowang, shahafs, kevin.tian, parav,
	vmireyno, cunming.liang, gdawar, jiri, xiao.w.wang, stefanha,
	zhihong.wang, maxime.coquelin, aadam, cohuck, hanand,
	lingshan.zhu

Patchew URL: https://patchew.org/QEMU/20200529140620.28759-1-lulu@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      x86_64-softmmu/hw/virtio/vhost-user-fs-pci.o
  CC      x86_64-softmmu/hw/virtio/virtio-iommu.o
/tmp/qemu-test/src/hw/virtio/vhost-vdpa.c: In function 'vhost_vdpa_set_config':
/tmp/qemu-test/src/hw/virtio/vhost-vdpa.c:323:17: error: 'VHOST_VDPA_MAX_CONFIG_SIZE' undeclared (first use in this function)
     if ((size > VHOST_VDPA_MAX_CONFIG_SIZE) || (data == NULL)) {
                 ^
/tmp/qemu-test/src/hw/virtio/vhost-vdpa.c:323:17: note: each undeclared identifier is reported only once for each function it appears in
make[1]: *** [hw/virtio/vhost-vdpa.o] Error 1
make[1]: *** Waiting for unfinished jobs....
  CC      aarch64-softmmu/hw/vfio/common.o
  CC      aarch64-softmmu/hw/vfio/spapr.o
---
  CC      aarch64-softmmu/hw/virtio/vhost-vsock.o
  CC      aarch64-softmmu/hw/virtio/vhost-vsock-pci.o
/tmp/qemu-test/src/hw/virtio/vhost-vdpa.c: In function 'vhost_vdpa_set_config':
/tmp/qemu-test/src/hw/virtio/vhost-vdpa.c:323:17: error: 'VHOST_VDPA_MAX_CONFIG_SIZE' undeclared (first use in this function)
     if ((size > VHOST_VDPA_MAX_CONFIG_SIZE) || (data == NULL)) {
                 ^
/tmp/qemu-test/src/hw/virtio/vhost-vdpa.c:323:17: note: each undeclared identifier is reported only once for each function it appears in
make[1]: *** [hw/virtio/vhost-vdpa.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [x86_64-softmmu/all] Error 2
make: *** Waiting for unfinished jobs....
make: *** [aarch64-softmmu/all] Error 2
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 664, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=c4d0aff9719e4e6986252b1cdad2d78a', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-jxkhgbnq/src/docker-src.2020-05-29-16.25.59.22140:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=c4d0aff9719e4e6986252b1cdad2d78a
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-jxkhgbnq/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    3m7.072s
user    0m9.059s


The full log is available at
http://patchew.org/logs/20200529140620.28759-1-lulu@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC v3 0/8] vDPA support in qemu
  2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
                   ` (8 preceding siblings ...)
  2020-05-29 20:29 ` [RFC v3 0/8] vDPA support in qemu no-reply
@ 2020-05-29 20:33 ` no-reply
  2020-05-29 20:37 ` no-reply
  10 siblings, 0 replies; 41+ messages in thread
From: no-reply @ 2020-05-29 20:33 UTC (permalink / raw)
  To: lulu
  Cc: rdunlap, mst, mhabets, qemu-devel, rob.miller, saugatm, lulu,
	armbru, hch, eperezma, jgg, jasowang, shahafs, kevin.tian, parav,
	vmireyno, cunming.liang, gdawar, jiri, xiao.w.wang, stefanha,
	zhihong.wang, maxime.coquelin, aadam, cohuck, hanand,
	lingshan.zhu

Patchew URL: https://patchew.org/QEMU/20200529140620.28759-1-lulu@redhat.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      accel/tcg/trace.o
  CC      backends/trace.o

Warning, treated as error:
/tmp/qemu-test/src/docs/../qemu-options.hx:2920:Inline literal start-string without end-string.
  CC      crypto/trace.o
  CC      monitor/trace.o
---
  CC      block/trace.o
  CC      io/trace.o
  CC      nbd/trace.o
make: *** [Makefile:1114: .docs_system_qemu.1_docs_system_qemu-block-drivers.7_docs_system_qemu-cpu-models.7.sentinel.] Error 2
make: *** Deleting file '.docs_system_qemu.1_docs_system_qemu-block-drivers.7_docs_system_qemu-cpu-models.7.sentinel.'
make: *** Waiting for unfinished jobs....

Warning, treated as error:
/tmp/qemu-test/src/docs/../qemu-options.hx:2920:Inline literal start-string without end-string.
make: *** [Makefile:1103: docs/system/index.html] Error 2
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 664, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=97acfdab68d34ed4abd8fdcbff72793c', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=x86_64-softmmu', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-l3clgv2k/src/docker-src.2020-05-29-16.30.04.2097:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-debug']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=97acfdab68d34ed4abd8fdcbff72793c
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-l3clgv2k/src'
make: *** [docker-run-test-debug@fedora] Error 2

real    3m32.660s
user    0m8.310s


The full log is available at
http://patchew.org/logs/20200529140620.28759-1-lulu@redhat.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC v3 0/8] vDPA support in qemu
  2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
                   ` (9 preceding siblings ...)
  2020-05-29 20:33 ` no-reply
@ 2020-05-29 20:37 ` no-reply
  10 siblings, 0 replies; 41+ messages in thread
From: no-reply @ 2020-05-29 20:37 UTC (permalink / raw)
  To: lulu
  Cc: rdunlap, mst, mhabets, qemu-devel, rob.miller, saugatm, lulu,
	armbru, hch, eperezma, jgg, jasowang, shahafs, kevin.tian, parav,
	vmireyno, cunming.liang, gdawar, jiri, xiao.w.wang, stefanha,
	zhihong.wang, maxime.coquelin, aadam, cohuck, hanand,
	lingshan.zhu

Patchew URL: https://patchew.org/QEMU/20200529140620.28759-1-lulu@redhat.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      io/channel-websock.o
  CC      io/channel-util.o

Warning, treated as error:
/tmp/qemu-test/src/docs/../qemu-options.hx:2920:Inline literal start-string without end-string.
  CC      io/dns-resolver.o
  CC      io/net-listener.o
---
  CC      qom/container.o
  CC      qom/qom-qobject.o

Warning, treated as error:
/tmp/qemu-test/src/docs/../qemu-options.hx:2920:Inline literal start-string without end-string.
  CC      qom/object_interfaces.o
  CC      qemu-io.o
---
  CC      iothread.o
  CC      job-qmp.o
  CC      os-win32.o
make: *** [Makefile:1103: docs/system/index.html] Error 2
make: *** Waiting for unfinished jobs....
make: *** [Makefile:1114: .docs_system_qemu.1_docs_system_qemu-block-drivers.7_docs_system_qemu-cpu-models.7.sentinel.] Error 2
make: *** Deleting file '.docs_system_qemu.1_docs_system_qemu-block-drivers.7_docs_system_qemu-cpu-models.7.sentinel.'
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 664, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=1b05e0f6710048c78da28e8db7addc87', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-9mffj7y4/src/docker-src.2020-05-29-16.35.13.7869:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=1b05e0f6710048c78da28e8db7addc87
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-9mffj7y4/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    1m50.552s
user    0m8.680s


The full log is available at
http://patchew.org/logs/20200529140620.28759-1-lulu@redhat.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client
  2020-05-29 14:22   ` Eric Blake
@ 2020-06-01  1:41     ` Cindy Lu
  0 siblings, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-01  1:41 UTC (permalink / raw)
  To: Eric Blake
  Cc: Cornelia Huck, Michael Tsirkin, Jason Wang, qemu-devel, hanand,
	rob.miller, saugatm, Markus Armbruster, hch,
	Eugenio Perez Martin, jgg, mhabets, Shahaf Shuler, kevin.tian,
	parav, vmireyno, Liang, Cunming, gdawar, jiri, xiao.w.wang,
	Stefan Hajnoczi, Wang, Zhihong, Tiwei Bie, Ariel Adam, rdunlap,
	Maxime Coquelin, Zhu, Lingshan

On Fri, May 29, 2020 at 10:23 PM Eric Blake <eblake@redhat.com> wrote:
>
> On 5/29/20 9:06 AM, Cindy Lu wrote:
> > From: Tiwei Bie <tiwei.bie@intel.com>
> >
> > This patch set introduces a new net client type: vhost-vdpa.
> > vhost-vdpa net client will set up a vDPA device which is specified
> > by a "vhostdev" parameter.
> >
> > Co-authored-by: Lingshan Zhu <lingshan.zhu@intel.com>
> > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > ---
>
> > +static int net_vhost_vdpa_init(NetClientState *peer, const char *device,
> > +                               const char *name, const char *vhostdev,
> > +                               bool has_fd, char *fd)
> > +{
>
> fd is usually an int, not a string.
>
will fix this
> > +    NetClientState *nc = NULL;
> > +    VhostVDPAState *s;
> > +    int vdpa_device_fd = -1;
> > +    Error *err = NULL;
> > +    int ret = 0;
> > +    assert(name);
> > +
> > +    nc = qemu_new_net_client(&net_vhost_vdpa_info, peer, device, name);
> > +    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-vdpa");
> > +    nc->queue_index = 0;
> > +
> > +    s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +
> > +    if (has_fd) {
> > +        vdpa_device_fd = monitor_fd_param(cur_mon, fd, &err);
> > +    } else{
> > +        vdpa_device_fd = open(vhostdev, O_RDWR);
> > +    }
>
> Oh, you're trying to use the old way for passing in fds.  The preferred
> way is to use qemu_open(), at which point you can pass in fds via the
> add-fd QMP command, and then pass the string "/dev/fdset/NNN" as
> vhostdev.  Then you don't need a special fd parameter here.
>
Thanks Eric, I will try this.

> > +++ b/qapi/net.json
> > @@ -428,6 +428,27 @@
> >       '*vhostforce':    'bool',
> >       '*queues':        'int' } }
> >
> > +##
> > +# @NetdevVhostVDPAOptions:
> > +#
> > +# Vhost-vdpa network backend
> > +#
> > +# @vhostdev: name of a vdpa dev path in sysfs
> > +#            (default path:/dev/vhost-vdpa-$ID)
> > +#
> > +# @fd: file descriptor of an already opened vdpa device
> > +#
> > +# @queues: number of queues to be created for multiqueue vhost-vdpa
> > +#          (default: 1)
> > +#
> > +# Since: 5.1
> > +##
> > +{ 'struct': 'NetdevVhostVDPAOptions',
> > +  'data': {
> > +    '*vhostdev':     'str',
> > +    '*fd':           'str',
> > +    '*queues':       'int' } }
>
> Instead of having vhostdev and fd both be optional (but where the user
> has to specify exactly one of them), you should only have vhostdev be
> mandatory, and rely on the /dev/fdset/NNN string as a way to get
> vhostdev to point to a previously-passed fd.
>
will fix this
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org
>



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-05-29 14:06 ` [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend Cindy Lu
@ 2020-06-03  2:52   ` Jason Wang
  2020-06-03  5:23     ` Cindy Lu
  2020-06-03  2:53   ` Jason Wang
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2020-06-03  2:52 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, eblake, cohuck
  Cc: mhabets, qemu-devel, rob.miller, saugatm, hanand, hch, eperezma,
	jgg, shahafs, kevin.tian, parav, vmireyno, cunming.liang, gdawar,
	jiri, xiao.w.wang, stefanha, zhihong.wang, Tiwei Bie, aadam,
	rdunlap, maxime.coquelin, lingshan.zhu


On 2020/5/29 10:06 PM, Cindy Lu wrote:
> From: Tiwei Bie <tiwei.bie@intel.com>
>
> Currently we have 2 types of vhost backends in QEMU: vhost kernel and
> vhost-user. The above patch provides a generic device for vDPA purpose,
> this vDPA device exposes to user space a non-vendor-specific configuration
> interface for setting up a vhost HW accelerator, this patch set introduces
> a third vhost backend called vhost-vdpa based on the vDPA interface.
>
> Vhost-vdpa usage:
>
>    qemu-system-x86_64 -cpu host -enable-kvm \
>      ......
>    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
>    -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \
>
> Co-Authored-By: Lingshan zhu <lingshan.zhu@intel.com>
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>   configure                         |  21 ++
>   hw/net/vhost_net-stub.c           |   5 +
>   hw/net/vhost_net.c                |  47 +++-
>   hw/virtio/Makefile.objs           |   1 +
>   hw/virtio/vhost-backend.c         |   5 +
>   hw/virtio/vhost-vdpa.c            | 399 ++++++++++++++++++++++++++++++
>   hw/virtio/vhost.c                 |  37 ++-
>   include/hw/virtio/vhost-backend.h |  10 +-
>   include/hw/virtio/vhost-vdpa.h    |  26 ++
>   include/hw/virtio/vhost.h         |   2 +
>   include/net/vhost_net.h           |   4 +-
>   qemu-options.hx                   |  15 ++
>   12 files changed, 566 insertions(+), 6 deletions(-)
>   create mode 100644 hw/virtio/vhost-vdpa.c
>   create mode 100644 include/hw/virtio/vhost-vdpa.h
>
> diff --git a/configure b/configure
> index 23b5e93752..53679ee57f 100755
> --- a/configure
> +++ b/configure
> @@ -1557,6 +1557,10 @@ for opt do
>     ;;
>     --enable-vhost-user) vhost_user="yes"
>     ;;
> +  --disable-vhost-vdpa) vhost_vdpa="no"
> +  ;;
> +  --enable-vhost-vdpa) vhost_vdpa="yes"
> +  ;;
>     --disable-vhost-kernel) vhost_kernel="no"
>     ;;
>     --enable-vhost-kernel) vhost_kernel="yes"
> @@ -1846,6 +1850,7 @@ disabled with --disable-FEATURE, default is enabled if available:
>     vhost-crypto    vhost-user-crypto backend support
>     vhost-kernel    vhost kernel backend support
>     vhost-user      vhost-user backend support
> +  vhost-vdpa      vhost-vdpa kernel backend support
>     spice           spice
>     rbd             rados block device (rbd)
>     libiscsi        iscsi support
> @@ -2336,6 +2341,10 @@ test "$vhost_user" = "" && vhost_user=yes
>   if test "$vhost_user" = "yes" && test "$mingw32" = "yes"; then
>     error_exit "vhost-user isn't available on win32"
>   fi
> +test "$vhost_vdpa" = "" && vhost_vdpa=$linux
> +if test "$vhost_vdpa" = "yes" && test "$linux" != "yes"; then
> +  error_exit "vhost-vdpa is only available on Linux"
> +fi
>   test "$vhost_kernel" = "" && vhost_kernel=$linux
>   if test "$vhost_kernel" = "yes" && test "$linux" != "yes"; then
>     error_exit "vhost-kernel is only available on Linux"
> @@ -2364,6 +2373,11 @@ test "$vhost_user_fs" = "" && vhost_user_fs=$vhost_user
>   if test "$vhost_user_fs" = "yes" && test "$vhost_user" = "no"; then
>     error_exit "--enable-vhost-user-fs requires --enable-vhost-user"
>   fi
> +#vhost-vdpa backends
> +test "$vhost_net_vdpa" = "" && vhost_net_vdpa=$vhost_vdpa
> +if test "$vhost_net_vdpa" = "yes" && test "$vhost_vdpa" = "no"; then
> +  error_exit "--enable-vhost-net-vdpa requires --enable-vhost-vdpa"
> +fi
>   
>   # OR the vhost-kernel and vhost-user values for simplicity
>   if test "$vhost_net" = ""; then
> @@ -6673,6 +6687,7 @@ echo "vhost-scsi support $vhost_scsi"
>   echo "vhost-vsock support $vhost_vsock"
>   echo "vhost-user support $vhost_user"
>   echo "vhost-user-fs support $vhost_user_fs"
> +echo "vhost-vdpa support $vhost_vdpa"
>   echo "Trace backends    $trace_backends"
>   if have_backend "simple"; then
>   echo "Trace output file $trace_file-<pid>"
> @@ -7170,6 +7185,9 @@ fi
>   if test "$vhost_net_user" = "yes" ; then
>     echo "CONFIG_VHOST_NET_USER=y" >> $config_host_mak
>   fi
> +if test "$vhost_net_vdpa" = "yes" ; then
> +  echo "CONFIG_VHOST_NET_VDPA=y" >> $config_host_mak
> +fi
>   if test "$vhost_crypto" = "yes" ; then
>     echo "CONFIG_VHOST_CRYPTO=y" >> $config_host_mak
>   fi
> @@ -7182,6 +7200,9 @@ fi
>   if test "$vhost_user" = "yes" ; then
>     echo "CONFIG_VHOST_USER=y" >> $config_host_mak
>   fi
> +if test "$vhost_vdpa" = "yes" ; then
> +  echo "CONFIG_VHOST_VDPA=y" >> $config_host_mak
> +fi
>   if test "$vhost_user_fs" = "yes" ; then
>     echo "CONFIG_VHOST_USER_FS=y" >> $config_host_mak
>   fi
> diff --git a/hw/net/vhost_net-stub.c b/hw/net/vhost_net-stub.c
> index 43e93e1a9a..ab77a92a7d 100644
> --- a/hw/net/vhost_net-stub.c
> +++ b/hw/net/vhost_net-stub.c
> @@ -94,3 +94,8 @@ int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
>   {
>       return 0;
>   }
> +int vhost_net_get_device_id(struct vhost_net *net, uint32_t * device_id)
> +{
> +    return 0;
> +}
> +
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index e2bc7de2eb..25045cff59 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -17,8 +17,10 @@
>   #include "net/net.h"
>   #include "net/tap.h"
>   #include "net/vhost-user.h"
> +#include "net/vhost-vdpa.h"
>   
>   #include "standard-headers/linux/vhost_types.h"
> +#include "linux-headers/linux/vhost.h"
>   #include "hw/virtio/virtio-net.h"
>   #include "net/vhost_net.h"
>   #include "qemu/error-report.h"
> @@ -85,6 +87,30 @@ static const int user_feature_bits[] = {
>       VHOST_INVALID_FEATURE_BIT
>   };
>   
> +static const int vdpa_feature_bits[] = {
> +    VIRTIO_F_NOTIFY_ON_EMPTY,
> +    VIRTIO_RING_F_INDIRECT_DESC,
> +    VIRTIO_RING_F_EVENT_IDX,
> +    VIRTIO_F_ANY_LAYOUT,
> +    VIRTIO_F_VERSION_1,
> +    VIRTIO_NET_F_CSUM,
> +    VIRTIO_NET_F_GUEST_CSUM,
> +    VIRTIO_NET_F_GSO,
> +    VIRTIO_NET_F_GUEST_TSO4,
> +    VIRTIO_NET_F_GUEST_TSO6,
> +    VIRTIO_NET_F_GUEST_ECN,
> +    VIRTIO_NET_F_GUEST_UFO,
> +    VIRTIO_NET_F_HOST_TSO4,
> +    VIRTIO_NET_F_HOST_TSO6,
> +    VIRTIO_NET_F_HOST_ECN,
> +    VIRTIO_NET_F_HOST_UFO,
> +    VIRTIO_NET_F_MRG_RXBUF,
> +    VIRTIO_NET_F_MTU,
> +    VIRTIO_F_IOMMU_PLATFORM,
> +    VIRTIO_F_RING_PACKED,
> +    VIRTIO_NET_F_GUEST_ANNOUNCE,
> +    VHOST_INVALID_FEATURE_BIT
> +};


I think those feature bits should belong to net/vhost-vdpa.c, since it 
contains bits that are net specific.
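
A rough sketch of why the table can live with the net client: the
masking helper only needs a sentinel-terminated list passed in, so the
list itself can be owned by whichever file knows the device type. All
toy_* names and TOY_INVALID_FEATURE_BIT are made up for illustration,
and the loop is only my assumption of what the generic helper does, not
a copy of it.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INVALID_FEATURE_BIT 0xff    /* hypothetical list terminator */

/*
 * Generic part: for every bit named in 'feature_bits', drop it from
 * 'features' unless the backend offers it. Bits that are not listed
 * pass through untouched.
 */
static uint64_t toy_mask_features(uint64_t backend_features,
                                  const int *feature_bits,
                                  uint64_t features)
{
    assert(feature_bits);

    for (const int *bit = feature_bits;
         *bit != TOY_INVALID_FEATURE_BIT; bit++) {
        uint64_t mask = 1ULL << *bit;

        if (!(backend_features & mask)) {
            features &= ~mask;
        }
    }
    return features;
}

/* Device-specific part: a table that a net client file could own. */
static const int toy_net_feature_bits[] = {
    14,                         /* example bit numbers only */
    15,
    TOY_INVALID_FEATURE_BIT
};
```

Here a backend offering only bit 14 would keep bit 14, lose bit 15, and
leave an unlisted bit such as 3 alone.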


>   static const int *vhost_net_get_feature_bits(struct vhost_net *net)
>   {
>       const int *feature_bits = 0;
> @@ -96,6 +122,9 @@ static const int *vhost_net_get_feature_bits(struct vhost_net *net)
>       case NET_CLIENT_DRIVER_VHOST_USER:
>           feature_bits = user_feature_bits;
>           break;
> +    case NET_CLIENT_DRIVER_VHOST_VDPA:
> +        feature_bits = vdpa_feature_bits;
> +        break;
>       default:
>           error_report("Feature bits not defined for this type: %d",
>                   net->nc->info->type);
> @@ -110,7 +139,10 @@ uint64_t vhost_net_get_features(struct vhost_net *net, uint64_t features)
>       return vhost_get_features(&net->dev, vhost_net_get_feature_bits(net),
>               features);
>   }
> -
> +int vhost_net_get_device_id(struct vhost_net *net, uint32_t * device_id)
> +{
> +    return vhost_dev_get_device_id(&net->dev, device_id);
> +}
>   void vhost_net_ack_features(struct vhost_net *net, uint64_t features)
>   {
>       net->dev.acked_features = net->dev.backend_features;
> @@ -337,6 +369,11 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>       }
>   
>       for (i = 0; i < total_queues; i++) {
> +
> +        if (virtio_queue_enabled(dev, i)) {
> +            vhost_set_vring_ready(peer);
> +        }


So this may break vdpa_sim, since it calls set_vq_ready() before 
set_vring_addr.

I think maybe it's better not to introduce vhost_set_vring_ready(), but 
to enable the virtqueue in vhost_vdpa_set_state() before setting DRIVER_OK.
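
Something like the following ordering is what I have in mind. This is
only a toy model: the toy_* types and functions are hypothetical and do
not correspond to the vdpa_sim or QEMU code, and the rule that a queue
cannot be made ready before its address is set is an assumption used to
model the concern above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_VQS          2
#define TOY_STATUS_DRIVER_OK 0x4    /* value of the virtio DRIVER_OK bit */

struct toy_vq {
    bool addr_set;
    bool ready;
};

struct toy_dev {
    struct toy_vq vq[TOY_MAX_VQS];
    uint8_t status;
};

static int toy_set_vring_addr(struct toy_dev *d, unsigned int i)
{
    d->vq[i].addr_set = true;
    return 0;
}

/* Assumed device rule: a queue cannot become ready without an address. */
static int toy_set_vq_ready(struct toy_dev *d, unsigned int i)
{
    if (!d->vq[i].addr_set) {
        return -1;
    }
    d->vq[i].ready = true;
    return 0;
}

/* Set up all rings first, then enable them, and set DRIVER_OK last. */
static int toy_start(struct toy_dev *d, unsigned int nvqs)
{
    unsigned int i;

    assert(nvqs <= TOY_MAX_VQS);

    for (i = 0; i < nvqs; i++) {
        toy_set_vring_addr(d, i);
    }
    for (i = 0; i < nvqs; i++) {
        if (toy_set_vq_ready(d, i) < 0) {
            return -1;
        }
    }
    d->status |= TOY_STATUS_DRIVER_OK;
    return 0;
}
```

Doing the enable step in one place, right before DRIVER_OK, keeps the
ordering independent of when the generic start loop runs.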


> +
>           peer = qemu_get_peer(ncs, i);
>           r = vhost_net_start_one(get_vhost_net(peer), dev);
>   
> @@ -433,6 +470,12 @@ VHostNetState *get_vhost_net(NetClientState *nc)
>           vhost_net = vhost_user_get_vhost_net(nc);
>           assert(vhost_net);
>           break;
> +#endif
> +#ifdef CONFIG_VHOST_NET_VDPA
> +    case NET_CLIENT_DRIVER_VHOST_VDPA:
> +        vhost_net = vhost_vdpa_get_vhost_net(nc);
> +        assert(vhost_net);
> +        break;
>   #endif
>       default:
>           break;
> @@ -474,3 +517,5 @@ int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
>   
>       return vhost_ops->vhost_net_set_mtu(&net->dev, mtu);
>   }
> +
> +


One extra newline.


> diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> index 4e4d39a0a4..6b1b1a5fce 100644
> --- a/hw/virtio/Makefile.objs
> +++ b/hw/virtio/Makefile.objs
> @@ -5,6 +5,7 @@ obj-y += virtio.o
>   obj-$(CONFIG_VHOST) += vhost.o vhost-backend.o
>   common-obj-$(call lnot,$(CONFIG_VHOST)) += vhost-stub.o
>   obj-$(CONFIG_VHOST_USER) += vhost-user.o
> +obj-$(CONFIG_VHOST_VDPA) += vhost-vdpa.o
>   
>   common-obj-$(CONFIG_VIRTIO_RNG) += virtio-rng.o
>   common-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> index 42efb4967b..420341e8c5 100644
> --- a/hw/virtio/vhost-backend.c
> +++ b/hw/virtio/vhost-backend.c
> @@ -291,6 +291,11 @@ int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType backend_type)
>       case VHOST_BACKEND_TYPE_USER:
>           dev->vhost_ops = &user_ops;
>           break;
> +#endif
> +#ifdef CONFIG_VHOST_VDPA
> +    case VHOST_BACKEND_TYPE_VDPA:
> +        dev->vhost_ops = &vdpa_ops;
> +        break;
>   #endif
>       default:
>           error_report("Unknown vhost backend type");
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> new file mode 100644
> index 0000000000..2d136a8565
> --- /dev/null
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -0,0 +1,399 @@
> +/*
> + * vhost-vdpa
> + *
> + *  Copyright(c) 2017-2018 Intel Corporation.
> + *  Copyright(c) 2020 Red Hat, Inc.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include <linux/vhost.h>
> +#include <linux/vfio.h>
> +#include <sys/eventfd.h>
> +#include <sys/ioctl.h>
> +#include "hw/virtio/vhost.h"
> +#include "hw/virtio/vhost-backend.h"
> +#include "hw/virtio/virtio-net.h"
> +#include "hw/virtio/vhost-vdpa.h"
> +#include "qemu/main-loop.h"
> +#include <linux/kvm.h>
> +#include "sysemu/kvm.h"
> +
> +
> +static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section)
> +{
> +    return (!memory_region_is_ram(section->mr) &&
> +            !memory_region_is_iommu(section->mr)) ||
> +           /*
> +            * Sizing an enabled 64-bit BAR can cause spurious mappings to
> +            * addresses in the upper part of the 64-bit address space.  These
> +            * are never accessed by the CPU and beyond the address width of
> +            * some IOMMU hardware.  TODO: VDPA should tell us the IOMMU width.
> +            */
> +           section->offset_within_address_space & (1ULL << 63);
> +}
> +
> +static int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> +                              void *vaddr, bool readonly)
> +{
> +    struct vhost_msg_v2 msg;
> +    int fd = v->device_fd;
> +    int ret = 0;
> +
> +    msg.type =  v->msg_type;
> +    msg.iotlb.iova = iova;
> +    msg.iotlb.size = size;
> +    msg.iotlb.uaddr = (uint64_t)vaddr;
> +    msg.iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW;
> +    msg.iotlb.type = VHOST_IOTLB_UPDATE;
> +
> +    if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> +        error_report("failed to write, fd=%d, errno=%d (%s)",
> +            fd, errno, strerror(errno));
> +        return -EIO ;
> +    }
> +
> +    return ret;
> +}
> +
> +static int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova,
> +                                hwaddr size)
> +{
> +    struct vhost_msg_v2 msg;
> +    int fd = v->device_fd;
> +    int ret = 0;
> +
> +    msg.type =  v->msg_type;
> +    msg.iotlb.iova = iova;
> +    msg.iotlb.size = size;
> +    msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
> +
> +    if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> +        error_report("failed to write, fd=%d, errno=%d (%s)",
> +            fd, errno, strerror(errno));
> +        return -EIO ;
> +    }
> +
> +    return ret;
> +}
> +
> +static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> +                                           MemoryRegionSection *section)
> +{
> +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> +    hwaddr iova;
> +    Int128 llend, llsize;
> +    void *vaddr;
> +    int ret;
> +
> +    if (vhost_vdpa_listener_skipped_section(section)) {
> +        return;
> +    }
> +
> +    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> +                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> +        error_report("%s received unaligned region", __func__);
> +        return;
> +    }
> +
> +    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> +    llend = int128_make64(section->offset_within_address_space);
> +    llend = int128_add(llend, section->size);
> +    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
> +
> +    if (int128_ge(int128_make64(iova), llend)) {
> +        return;
> +    }
> +
> +    memory_region_ref(section->mr);
> +
> +    /* Here we assume that memory_region_is_ram(section->mr)==true */
> +
> +    vaddr = memory_region_get_ram_ptr(section->mr) +
> +            section->offset_within_region +
> +            (iova - section->offset_within_address_space);
> +
> +    llsize = int128_sub(llend, int128_make64(iova));
> +
> +    ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
> +                             vaddr, section->readonly);
> +    if (ret) {
> +        error_report("vhost vdpa map fail!");
> +        if (memory_region_is_ram_device(section->mr)) {
> +            /* Allow unexpected mappings not to be fatal for RAM devices */
> +            error_report("map ram fail!");
> +          return ;
> +        }
> +        goto fail;
> +    }
> +
> +    return;
> +
> +fail:
> +    if (memory_region_is_ram_device(section->mr)) {
> +        error_report("failed to vdpa_dma_map. pci p2p may not work");
> +        return;
> +
> +    }
> +    /*
> +     * On the initfn path, store the first error in the container so we
> +     * can gracefully fail.  Runtime, there's not much we can do other
> +     * than throw a hardware error.
> +     */
> +    error_report("vhost-vdpa: DMA mapping failed, unable to continue");
> +    return;
> +
> +}
> +
> +static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> +                                           MemoryRegionSection *section)
> +{
> +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> +    hwaddr iova;
> +    Int128 llend, llsize;
> +    int ret;
> +    bool try_unmap = true;
> +
> +    if (vhost_vdpa_listener_skipped_section(section)) {
> +        return;
> +    }
> +
> +    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> +                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> +        error_report("%s received unaligned region", __func__);
> +        return;
> +    }
> +
> +    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> +    llend = int128_make64(section->offset_within_address_space);
> +    llend = int128_add(llend, section->size);
> +    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
> +
> +    if (int128_ge(int128_make64(iova), llend)) {
> +        return;
> +    }
> +
> +    llsize = int128_sub(llend, int128_make64(iova));
> +
> +    if (try_unmap) {
> +        ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
> +        if (ret) {
> +            error_report("vhost_vdpa dma unmap error!");
> +        }
> +    }
> +
> +    memory_region_unref(section->mr);
> +}


A newline is needed here.


> +/* Register a new memory listener, only to get diffs from qemu,
> + * this help to reduce the tricky codes in vhost
> + * (e.g generating diffs of two rbtree as usnic did).*/


This comment needs some improvement. How about:

/* The IOTLB API is used by vhost-vdpa, which requires incremental updating
 * of the mapping, so we cannot use the generic vhost memory listener, which
 * depends on addnop(). */


> +static const MemoryListener vhost_vdpa_memory_listener = {
> +    .region_add = vhost_vdpa_listener_region_add,
> +    .region_del = vhost_vdpa_listener_region_del,
> +};
> +
> +static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> +                             void *arg)
> +{
> +    struct vhost_vdpa *v = dev->opaque;
> +    int fd = v->device_fd;
> +
> +    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> +
> +    return ioctl(fd, request, arg);
> +}
> +
> +static void vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> +{
> +    uint8_t s;
> +
> +    if (vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s)) {
> +        return;
> +    }
> +
> +    s |= status;
> +
> +    vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> +}
> +
> +static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque)
> +{
> +    struct vhost_vdpa *v;
> +
> +    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> +
> +    v = opaque;
> +    dev->opaque =  opaque ;
> +
> +    v->listener = vhost_vdpa_memory_listener;
> +    v->msg_type = VHOST_IOTLB_MSG_V2;
> +    memory_listener_register(&v->listener, &address_space_memory);


Let's move the memory listener register/unregister to
vhost_vdpa_set_state(). Then we can avoid lots of unnecessary vhost
IOTLB transactions before DRIVER_OK, which vhost-vDPA doesn't care about.
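
A minimal sketch of that suggestion, assuming the listener is only needed
while the device is running. The sketch_* names and the struct layout are
hypothetical stand-ins for the QEMU types, not code from the patch; the two
helpers model memory_listener_register()/memory_listener_unregister().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Virtio device status bits (values as defined by the virtio spec). */
#define STATUS_ACKNOWLEDGE 0x01
#define STATUS_DRIVER      0x02
#define STATUS_DRIVER_OK   0x04

/* Hypothetical stand-in for struct vhost_vdpa. */
struct sketch_vdpa {
    bool listener_registered;  /* models the MemoryListener state */
    uint8_t status;            /* models the device status byte */
};

/* Models memory_listener_register(&v->listener, &address_space_memory). */
static void sketch_listener_register(struct sketch_vdpa *v)
{
    v->listener_registered = true;
}

/* Models memory_listener_unregister(&v->listener). */
static void sketch_listener_unregister(struct sketch_vdpa *v)
{
    v->listener_registered = false;
}

/* Init only announces the driver; no mappings reach the device yet. */
int sketch_vdpa_init(struct sketch_vdpa *v)
{
    assert(v != NULL);
    v->listener_registered = false;
    v->status = STATUS_ACKNOWLEDGE | STATUS_DRIVER;
    return 0;
}

/* Start: map guest memory, then set DRIVER_OK. Stop: reset, then unmap. */
int sketch_vdpa_set_state(struct sketch_vdpa *v, bool started)
{
    assert(v != NULL);
    if (started) {
        sketch_listener_register(v);
        v->status |= STATUS_DRIVER_OK;
        return !(v->status & STATUS_DRIVER_OK);
    }
    v->status = 0;                                    /* device reset */
    sketch_listener_unregister(v);
    v->status |= STATUS_ACKNOWLEDGE | STATUS_DRIVER;  /* re-announce driver */
    return 0;
}
```

With this shape, the init and cleanup paths no longer touch the listener,
so no IOTLB updates are sent while the guest driver is still negotiating.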


> +
> +    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> +                               VIRTIO_CONFIG_S_DRIVER);
> +
> +    return 0;
> +}
> +
> +static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> +{
> +    struct vhost_vdpa *v;
> +    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> +
> +    v = dev->opaque;
> +    memory_listener_unregister(&v->listener);
> +
> +    dev->opaque = NULL;
> +    return 0;
> +}
> +
> +static int vhost_vdpa_memslots_limit(struct vhost_dev *dev)
> +{
> +    return INT_MAX;
> +}
> +
> +static int vhost_vdpa_set_mem_table(struct vhost_dev *dev,
> +                                    struct vhost_memory *mem)
> +{
> +
> +    if (mem->padding) {
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +static int vhost_vdpa_set_features(struct vhost_dev *dev,
> +                                   uint64_t features)
> +{
> +    int ret = vhost_vdpa_call(dev, VHOST_SET_FEATURES, &features);
> +    uint8_t status = 0;
> +
> +    if (ret) {
> +        return ret;
> +    }
> +    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK);
> +    vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &status);
> +
> +    return !(status & VIRTIO_CONFIG_S_FEATURES_OK);
> +}
> +
> +int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> +                                   uint32_t *device_id)
> +{
> +    return vhost_vdpa_call(dev, VHOST_VDPA_GET_DEVICE_ID, device_id);
> +}
> +
> +static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> +{
> +    uint8_t status = 0;
> +
> +    return vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> +}
> +
> +static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
> +{
> +    assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
> +
> +    return idx - dev->vq_index;
> +}
> +
> +static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev)
> +{
> +    int i;
> +    for (i = 0; i < dev->nvqs; ++i) {
> +        struct vhost_vring_state state = {
> +            .index = dev->vq_index + i,
> +            .num = 1,
> +        };
> +        vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
> +    }
> +    return 0;
> +}
> +
> +static int vhost_vdpa_set_config(struct vhost_dev *dev, const uint8_t *data,
> +                                   uint32_t offset, uint32_t size,
> +                                   uint32_t flags)
> +{
> +    struct vhost_vdpa_config config;
> +    int ret;
> +    if ((size > VHOST_VDPA_MAX_CONFIG_SIZE) || (data == NULL)) {
> +        return -1;
> +    }
> +    memset(&config, 0, sizeof(struct vhost_vdpa_config));
> +    config.off = 0;
> +    config.len = size;
> +    memcpy(&config.buf, data, size);
> +    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_CONFIG, &config);
> +    return ret;
> +}
> +
> +static int vhost_vdpa_get_config(struct vhost_dev *dev, uint8_t *config,
> +                                   uint32_t config_len)
> +{
> +    struct vhost_vdpa_config v_config;
> +    int ret;
> +
> +    memset(&v_config, 0, sizeof(struct vhost_vdpa_config));
> +    if (config == NULL) {
> +        return -1;
> +    }
> +    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_CONFIG, &v_config);
> +    if ((v_config.len > config_len) || (v_config.len == 0)) {
> +        return -EINVAL;
> +    }
> +    memcpy(config, &v_config.buf, config_len);
> +    return ret;
> + }
> +
> +static int vhost_vdpa_set_state(struct vhost_dev *dev, bool started)


We probably need a better name, e.g., vhost_vdpa_start()?


> +{
> +    if (started) {
> +        uint8_t status = 0;
> +
> +        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> +        vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &status);
> +
> +        return !(status & VIRTIO_CONFIG_S_DRIVER_OK);
> +    } else {
> +        vhost_vdpa_reset_device(dev);
> +        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> +                                   VIRTIO_CONFIG_S_DRIVER);
> +        return 0;
> +    }
> +}
> +
> +const VhostOps vdpa_ops = {
> +        .backend_type = VHOST_BACKEND_TYPE_VDPA,
> +        .vhost_backend_init = vhost_vdpa_init,
> +        .vhost_backend_cleanup = vhost_vdpa_cleanup,
> +        .vhost_set_log_base = vhost_kernel_set_log_base,
> +        .vhost_set_vring_addr = vhost_kernel_set_vring_addr,
> +        .vhost_set_vring_num = vhost_kernel_set_vring_num,
> +        .vhost_set_vring_base = vhost_kernel_set_vring_base,
> +        .vhost_get_vring_base = vhost_kernel_get_vring_base,
> +        .vhost_set_vring_kick = vhost_kernel_set_vring_kick,
> +        .vhost_set_vring_call = vhost_kernel_set_vring_call,
> +        .vhost_get_features = vhost_kernel_get_features,
> +        .vhost_set_owner = vhost_kernel_set_owner,
> +        .vhost_set_vring_endian = NULL,
> +        .vhost_backend_memslots_limit = vhost_vdpa_memslots_limit,
> +        .vhost_set_mem_table = vhost_vdpa_set_mem_table,
> +        .vhost_set_features = vhost_vdpa_set_features,
> +        .vhost_reset_device = vhost_vdpa_reset_device,
> +        .vhost_get_vq_index = vhost_vdpa_get_vq_index,
> +        .vhost_set_vring_ready = vhost_vdpa_set_vring_ready,
> +        .vhost_get_config  = vhost_vdpa_get_config,
> +        .vhost_set_config = vhost_vdpa_set_config,
> +        .vhost_requires_shm_log = NULL,
> +        .vhost_migration_done = NULL,
> +        .vhost_backend_can_merge = NULL,
> +        .vhost_net_set_mtu = NULL,
> +        .vhost_set_iotlb_callback = NULL,
> +        .vhost_send_device_iotlb_msg = NULL,
> +        .vhost_set_state = vhost_vdpa_set_state,


Since it only accepts a boolean parameter, I guess vhost_dev_start() is better?


> +        .vhost_get_device_id = vhost_vdpa_get_device_id,
> +};
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 01ebe12f28..b97aa02a4c 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -756,6 +756,12 @@ static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
>           .log_guest_addr = vq->used_phys,
>           .flags = enable_log ? (1 << VHOST_VRING_F_LOG) : 0,
>       };
> +    /*vDPA need to use the phys address here to set to hardware*/


Actually it's "IOVA" instead of "phys address".


> +    if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA) {
> +        addr.desc_user_addr = (uint64_t)(unsigned long)vq->desc_phys;
> +        addr.avail_user_addr = (uint64_t)(unsigned long)vq->avail_phys;
> +        addr.used_user_addr = (uint64_t)(unsigned long)vq->used_phys;
> +    }


Let's introduce a callback here instead of such hard-coded ones.
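
One way such a callback could look, as a rough sketch only. The hook name
vq_get_addr and all sketch_* types are hypothetical simplifications of
VhostOps, vhost_virtqueue and vhost_vring_addr, not an existing QEMU
interface. Generic code fills in the default user-space addresses, and a
backend that needs IOVAs overrides them through the optional hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct vhost_virtqueue. */
struct sketch_vq {
    uint64_t desc_user, avail_user, used_user;  /* QEMU virtual addresses */
    uint64_t desc_iova, avail_iova, used_iova;  /* guest-visible addresses */
};

/* Hypothetical, simplified stand-in for struct vhost_vring_addr. */
struct sketch_vring_addr {
    uint64_t desc_user_addr, avail_user_addr, used_user_addr;
};

/* Hypothetical, simplified stand-in for VhostOps. */
struct sketch_ops {
    /* Optional hook: the backend fills in the addresses it wants. */
    int (*vq_get_addr)(struct sketch_vring_addr *addr,
                       const struct sketch_vq *vq);
};

/* A vDPA-style backend programs IOVAs into the device. */
static int sketch_vdpa_vq_get_addr(struct sketch_vring_addr *addr,
                                   const struct sketch_vq *vq)
{
    addr->desc_user_addr = vq->desc_iova;
    addr->avail_user_addr = vq->avail_iova;
    addr->used_user_addr = vq->used_iova;
    return 0;
}

const struct sketch_ops sketch_vdpa_ops = {
    .vq_get_addr = sketch_vdpa_vq_get_addr,
};

/* A kernel-style backend keeps the defaults, so it leaves the hook unset. */
const struct sketch_ops sketch_kernel_ops = {
    .vq_get_addr = NULL,
};

/* Generic code: no backend_type check, only an optional callback. */
int sketch_set_addr(const struct sketch_ops *ops, const struct sketch_vq *vq,
                    struct sketch_vring_addr *addr)
{
    assert(ops && vq && addr);
    addr->desc_user_addr = vq->desc_user;
    addr->avail_user_addr = vq->avail_user;
    addr->used_user_addr = vq->used_user;
    if (ops->vq_get_addr) {
        return ops->vq_get_addr(addr, vq);
    }
    return 0;
}
```

The point of the design is that vhost.c would no longer need to know which
backend it is talking to; a new backend only has to set or omit the hook.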


>       int r = dev->vhost_ops->vhost_set_vring_addr(dev, &addr);
>       if (r < 0) {
>           VHOST_OPS_DEBUG("vhost_set_vring_addr failed");
> @@ -1506,6 +1512,14 @@ int vhost_dev_set_config(struct vhost_dev *hdev, const uint8_t *data,
>       return -1;
>   }
>   
> +int vhost_dev_get_device_id(struct vhost_dev *hdev, uint32_t *device_id)
> +{
> +    assert(hdev->vhost_ops);
> +    if (hdev->vhost_ops->vhost_get_device_id) {
> +        return hdev->vhost_ops->vhost_get_device_id(hdev, device_id);
> +    }
> +    return -1;
> +}
>   void vhost_dev_set_config_notifier(struct vhost_dev *hdev,
>                                      const VhostDevConfigOps *ops)
>   {
> @@ -1661,7 +1675,13 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>           }
>       }
>   
> -    if (vhost_dev_has_iommu(hdev)) {
> +    r = vhost_set_state(hdev, true);
> +    if (r) {
> +        goto fail_log;
> +    }


Please use a separate patch for introducing vhost_set_state().


> +
> +    if (vhost_dev_has_iommu(hdev) &&
> +        hdev->vhost_ops->vhost_set_iotlb_callback) {


Please use a new patch for checking vhost_set_iotlb_callback(), or just
implement it for vhost-vdpa so that it only warns about IOTLB misses.


>           hdev->vhost_ops->vhost_set_iotlb_callback(hdev, true);
>   
>           /* Update used ring information for IOTLB to work correctly,
> @@ -1697,6 +1717,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>       /* should only be called after backend is connected */
>       assert(hdev->vhost_ops);
>   
> +    vhost_set_state(hdev, false);
> +
>       for (i = 0; i < hdev->nvqs; ++i) {
>           vhost_virtqueue_stop(hdev,
>                                vdev,
> @@ -1705,7 +1727,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>       }
>   
>       if (vhost_dev_has_iommu(hdev)) {
> -        hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
> +        if (hdev->vhost_ops->vhost_set_iotlb_callback) {
> +            hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
> +        }
>           memory_listener_unregister(&hdev->iommu_listener);
>       }
>       vhost_log_put(hdev, true);
> @@ -1722,3 +1746,12 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>   
>       return -1;
>   }
> +
> +int vhost_set_state(struct vhost_dev *hdev, bool started)
> +{
> +    if (hdev->vhost_ops->vhost_set_state) {
> +        return hdev->vhost_ops->vhost_set_state(hdev, started);
> +    }
> +
> +    return 0;
> +}
> diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> index 300b59c172..1ebe3785cf 100644
> --- a/include/hw/virtio/vhost-backend.h
> +++ b/include/hw/virtio/vhost-backend.h
> @@ -17,7 +17,8 @@ typedef enum VhostBackendType {
>       VHOST_BACKEND_TYPE_NONE = 0,
>       VHOST_BACKEND_TYPE_KERNEL = 1,
>       VHOST_BACKEND_TYPE_USER = 2,
> -    VHOST_BACKEND_TYPE_MAX = 3,
> +    VHOST_BACKEND_TYPE_VDPA = 3,
> +    VHOST_BACKEND_TYPE_MAX = 4,
>   } VhostBackendType;
>   
>   typedef enum VhostSetConfigType {
> @@ -77,6 +78,7 @@ typedef int (*vhost_reset_device_op)(struct vhost_dev *dev);
>   typedef int (*vhost_get_vq_index_op)(struct vhost_dev *dev, int idx);
>   typedef int (*vhost_set_vring_enable_op)(struct vhost_dev *dev,
>                                            int enable);
> +typedef int (*vhost_set_vring_ready_op)(struct vhost_dev *dev);
>   typedef bool (*vhost_requires_shm_log_op)(struct vhost_dev *dev);
>   typedef int (*vhost_migration_done_op)(struct vhost_dev *dev,
>                                          char *mac_addr);
> @@ -112,6 +114,8 @@ typedef int (*vhost_get_inflight_fd_op)(struct vhost_dev *dev,
>   typedef int (*vhost_set_inflight_fd_op)(struct vhost_dev *dev,
>                                           struct vhost_inflight *inflight);
>   
> +typedef int (*vhost_set_state_op)(struct vhost_dev *dev, bool started);
> +typedef int (*vhost_get_device_id_op)(struct vhost_dev *dev, uint32_t *dev_id);
>   typedef struct VhostOps {
>       VhostBackendType backend_type;
>       vhost_backend_init vhost_backend_init;
> @@ -138,6 +142,7 @@ typedef struct VhostOps {
>       vhost_reset_device_op vhost_reset_device;
>       vhost_get_vq_index_op vhost_get_vq_index;
>       vhost_set_vring_enable_op vhost_set_vring_enable;
> +    vhost_set_vring_ready_op vhost_set_vring_ready;
>       vhost_requires_shm_log_op vhost_requires_shm_log;
>       vhost_migration_done_op vhost_migration_done;
>       vhost_backend_can_merge_op vhost_backend_can_merge;
> @@ -152,9 +157,12 @@ typedef struct VhostOps {
>       vhost_backend_mem_section_filter_op vhost_backend_mem_section_filter;
>       vhost_get_inflight_fd_op vhost_get_inflight_fd;
>       vhost_set_inflight_fd_op vhost_set_inflight_fd;
> +    vhost_set_state_op vhost_set_state;
> +    vhost_get_device_id_op vhost_get_device_id;
>   } VhostOps;
>   
>   extern const VhostOps user_ops;
> +extern const VhostOps vdpa_ops;
>   
>   int vhost_set_backend_type(struct vhost_dev *dev,
>                              VhostBackendType backend_type);
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> new file mode 100644
> index 0000000000..6455663388
> --- /dev/null
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -0,0 +1,26 @@
> +/*
> + * vhost-vdpa.h
> + *
> + * Copyright(c) 2017-2018 Intel Corporation.
> + * Copyright(c) 2020 Red Hat, Inc.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef HW_VIRTIO_VHOST_VDPA_H
> +#define HW_VIRTIO_VHOST_VDPA_H
> +
> +#include "hw/virtio/virtio.h"
> +
> +typedef struct vhost_vdpa {
> +    int device_fd;
> +    uint32_t msg_type;
> +    MemoryListener listener;
> +} VhostVDPA;
> +
> +extern AddressSpace address_space_memory;
> +extern int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> +                                   uint32_t *device_id);
> +#endif
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 085450c6f8..b682545f51 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -124,6 +124,7 @@ int vhost_dev_get_config(struct vhost_dev *dev, uint8_t *config,
>                            uint32_t config_len);
>   int vhost_dev_set_config(struct vhost_dev *dev, const uint8_t *data,
>                            uint32_t offset, uint32_t size, uint32_t flags);
> +int vhost_dev_get_device_id(struct vhost_dev *hdev, uint32_t *device_id);
>   /* notifier callback in case vhost device config space changed
>    */
>   void vhost_dev_set_config_notifier(struct vhost_dev *dev,
> @@ -137,4 +138,5 @@ int vhost_dev_set_inflight(struct vhost_dev *dev,
>                              struct vhost_inflight *inflight);
>   int vhost_dev_get_inflight(struct vhost_dev *dev, uint16_t queue_size,
>                              struct vhost_inflight *inflight);
> +int vhost_set_state(struct vhost_dev *dev, bool started);
>   #endif
> diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
> index 8a6f208189..56e67fe164 100644
> --- a/include/net/vhost_net.h
> +++ b/include/net/vhost_net.h
> @@ -40,5 +40,5 @@ int vhost_set_vring_ready(NetClientState *nc);
>   uint64_t vhost_net_get_acked_features(VHostNetState *net);
>   
>   int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu);
> -
> -#endif
> +int vhost_net_get_device_id(struct vhost_net *net, uint32_t *device_id);


This should belong to the vhost-vdpa header.

Thanks


> +endif
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 292d4e7c0c..c19e10ce9c 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -2409,6 +2409,10 @@ DEF("netdev", HAS_ARG, QEMU_OPTION_netdev,
>   #ifdef CONFIG_POSIX
>       "-netdev vhost-user,id=str,chardev=dev[,vhostforce=on|off]\n"
>       "                configure a vhost-user network, backed by a chardev 'dev'\n"
> +#endif
> +#ifdef CONFIG_POSIX
> +    "-netdev vhost-vdpa,id=str,vhostdev=/path/to/dev\n"
> +    "                configure a vhost-vdpa network,Establish a vhost-vdpa netdev\n"
>   #endif
>       "-netdev hubport,id=str,hubid=n[,netdev=nd]\n"
>       "                configure a hub port on the hub with ID 'n'\n", QEMU_ARCH_ALL)
> @@ -2428,6 +2432,9 @@ DEF("nic", HAS_ARG, QEMU_OPTION_nic,
>   #endif
>   #ifdef CONFIG_POSIX
>       "vhost-user|"
> +#endif
> +#ifdef CONFIG_POSIX
> +    "vhost-vdpa|"
>   #endif
>       "socket][,option][,...][mac=macaddr]\n"
>       "                initialize an on-board / default host NIC (using MAC address\n"
> @@ -2896,6 +2903,14 @@ SRST
>       hubport to another netdev with ID nd by using the ``netdev=nd``
>       option.
>   
> +``-netdev vhost-vdpa,vhostdev=/path/to/dev ``
> +    Establish a vhost-vdpa netdev.
> +
> +    vDPA device is a device that uses a datapath which complies with
> +    the virtio specifications with a vendor specific control path.
> +    vDPA devices can be both physically located on the hardware or
> +    emulated by software.
> +
>   ``-net nic[,netdev=nd][,macaddr=mac][,model=type] [,name=name][,addr=addr][,vectors=v]``
>       Legacy option to configure or create an on-board (or machine
>       default) Network Interface Card(NIC) and connect it either to the




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-05-29 14:06 ` [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend Cindy Lu
  2020-06-03  2:52   ` Jason Wang
@ 2020-06-03  2:53   ` Jason Wang
  2020-06-03  5:23     ` Cindy Lu
  2020-06-03  6:43   ` Jason Wang
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2020-06-03  2:53 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, eblake, cohuck
  Cc: mhabets, qemu-devel, rob.miller, saugatm, maxime.coquelin, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	Tiwei Bie, aadam, rdunlap, hanand, lingshan.zhu


On 2020/5/29 10:06 PM, Cindy Lu wrote:
> From: Tiwei Bie<tiwei.bie@intel.com>


Considering the significant modifications relative to the original patch,
I think you may change the author to yourself and keep the SoBs for both
Tiwei and Lingshan.

Thanks


>
> Currently we have 2 types of vhost backends in QEMU: vhost kernel and
> vhost-user. The above patch provides a generic device for vDPA purpose,
> this vDPA device exposes to user space a non-vendor-specific configuration
> interface for setting up a vhost HW accelerator, this patch set introduces
> a third vhost backend called vhost-vdpa based on the vDPA interface.
>
> Vhost-vdpa usage:
>
>    qemu-system-x86_64 -cpu host -enable-kvm \
>      ......
>    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
>    -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \
>
> Co-Authored-By: Lingshan zhu<lingshan.zhu@intel.com>
> Signed-off-by: Cindy Lu<lulu@redhat.com>




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-06-03  2:52   ` Jason Wang
@ 2020-06-03  5:23     ` Cindy Lu
  0 siblings, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-03  5:23 UTC (permalink / raw)
  To: Jason Wang
  Cc: Cornelia Huck, Michael Tsirkin, mhabets, qemu-devel, hanand,
	rob.miller, saugatm, Markus Armbruster, hch,
	Eugenio Perez Martin, jgg, Shahaf Shuler, kevin.tian, parav,
	vmireyno, Liang, Cunming, gdawar, jiri, xiao.w.wang,
	Stefan Hajnoczi, Wang, Zhihong, Tiwei Bie, Ariel Adam, rdunlap,
	Maxime Coquelin, Zhu, Lingshan

Hi Jason,

On Wed, Jun 3, 2020 at 10:52 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/5/29 10:06 PM, Cindy Lu wrote:
> > From: Tiwei Bie <tiwei.bie@intel.com>
> >
> > Currently we have 2 types of vhost backends in QEMU: vhost kernel and
> > vhost-user. The above patch provides a generic device for vDPA purpose,
> > this vDPA device exposes to user space a non-vendor-specific configuration
> > interface for setting up a vhost HW accelerator, this patch set introduces
> > a third vhost backend called vhost-vdpa based on the vDPA interface.
> >
> > Vhost-vdpa usage:
> >
> >    qemu-system-x86_64 -cpu host -enable-kvm \
> >      ......
> >    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
> >    -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \
> >
> > Co-Authored-By: Lingshan zhu <lingshan.zhu@intel.com>
> > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > ---
> >   configure                         |  21 ++
> >   hw/net/vhost_net-stub.c           |   5 +
> >   hw/net/vhost_net.c                |  47 +++-
> >   hw/virtio/Makefile.objs           |   1 +
> >   hw/virtio/vhost-backend.c         |   5 +
> >   hw/virtio/vhost-vdpa.c            | 399 ++++++++++++++++++++++++++++++
> >   hw/virtio/vhost.c                 |  37 ++-
> >   include/hw/virtio/vhost-backend.h |  10 +-
> >   include/hw/virtio/vhost-vdpa.h    |  26 ++
> >   include/hw/virtio/vhost.h         |   2 +
> >   include/net/vhost_net.h           |   4 +-
> >   qemu-options.hx                   |  15 ++
> >   12 files changed, 566 insertions(+), 6 deletions(-)
> >   create mode 100644 hw/virtio/vhost-vdpa.c
> >   create mode 100644 include/hw/virtio/vhost-vdpa.h
> >
> > diff --git a/configure b/configure
> > index 23b5e93752..53679ee57f 100755
> > --- a/configure
> > +++ b/configure
> > @@ -1557,6 +1557,10 @@ for opt do
> >     ;;
> >     --enable-vhost-user) vhost_user="yes"
> >     ;;
> > +  --disable-vhost-vdpa) vhost_vdpa="no"
> > +  ;;
> > +  --enable-vhost-vdpa) vhost_vdpa="yes"
> > +  ;;
> >     --disable-vhost-kernel) vhost_kernel="no"
> >     ;;
> >     --enable-vhost-kernel) vhost_kernel="yes"
> > @@ -1846,6 +1850,7 @@ disabled with --disable-FEATURE, default is enabled if available:
> >     vhost-crypto    vhost-user-crypto backend support
> >     vhost-kernel    vhost kernel backend support
> >     vhost-user      vhost-user backend support
> > +  vhost-vdpa      vhost-vdpa kernel backend support
> >     spice           spice
> >     rbd             rados block device (rbd)
> >     libiscsi        iscsi support
> > @@ -2336,6 +2341,10 @@ test "$vhost_user" = "" && vhost_user=yes
> >   if test "$vhost_user" = "yes" && test "$mingw32" = "yes"; then
> >     error_exit "vhost-user isn't available on win32"
> >   fi
> > +test "$vhost_vdpa" = "" && vhost_vdpa=$linux
> > +if test "$vhost_vdpa" = "yes" && test "$linux" != "yes"; then
> > +  error_exit "vhost-vdpa is only available on Linux"
> > +fi
> >   test "$vhost_kernel" = "" && vhost_kernel=$linux
> >   if test "$vhost_kernel" = "yes" && test "$linux" != "yes"; then
> >     error_exit "vhost-kernel is only available on Linux"
> > @@ -2364,6 +2373,11 @@ test "$vhost_user_fs" = "" && vhost_user_fs=$vhost_user
> >   if test "$vhost_user_fs" = "yes" && test "$vhost_user" = "no"; then
> >     error_exit "--enable-vhost-user-fs requires --enable-vhost-user"
> >   fi
> > +#vhost-vdpa backends
> > +test "$vhost_net_vdpa" = "" && vhost_net_vdpa=$vhost_vdpa
> > +if test "$vhost_net_vdpa" = "yes" && test "$vhost_vdpa" = "no"; then
> > +  error_exit "--enable-vhost-net-vdpa requires --enable-vhost-vdpa"
> > +fi
> >
> >   # OR the vhost-kernel and vhost-user values for simplicity
> >   if test "$vhost_net" = ""; then
> > @@ -6673,6 +6687,7 @@ echo "vhost-scsi support $vhost_scsi"
> >   echo "vhost-vsock support $vhost_vsock"
> >   echo "vhost-user support $vhost_user"
> >   echo "vhost-user-fs support $vhost_user_fs"
> > +echo "vhost-vdpa support $vhost_vdpa"
> >   echo "Trace backends    $trace_backends"
> >   if have_backend "simple"; then
> >   echo "Trace output file $trace_file-<pid>"
> > @@ -7170,6 +7185,9 @@ fi
> >   if test "$vhost_net_user" = "yes" ; then
> >     echo "CONFIG_VHOST_NET_USER=y" >> $config_host_mak
> >   fi
> > +if test "$vhost_net_vdpa" = "yes" ; then
> > +  echo "CONFIG_VHOST_NET_VDPA=y" >> $config_host_mak
> > +fi
> >   if test "$vhost_crypto" = "yes" ; then
> >     echo "CONFIG_VHOST_CRYPTO=y" >> $config_host_mak
> >   fi
> > @@ -7182,6 +7200,9 @@ fi
> >   if test "$vhost_user" = "yes" ; then
> >     echo "CONFIG_VHOST_USER=y" >> $config_host_mak
> >   fi
> > +if test "$vhost_vdpa" = "yes" ; then
> > +  echo "CONFIG_VHOST_VDPA=y" >> $config_host_mak
> > +fi
> >   if test "$vhost_user_fs" = "yes" ; then
> >     echo "CONFIG_VHOST_USER_FS=y" >> $config_host_mak
> >   fi
> > diff --git a/hw/net/vhost_net-stub.c b/hw/net/vhost_net-stub.c
> > index 43e93e1a9a..ab77a92a7d 100644
> > --- a/hw/net/vhost_net-stub.c
> > +++ b/hw/net/vhost_net-stub.c
> > @@ -94,3 +94,8 @@ int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
> >   {
> >       return 0;
> >   }
> > +int vhost_net_get_device_id(struct vhost_net *net, uint32_t * device_id)
> > +{
> > +    return 0;
> > +}
> > +
> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> > index e2bc7de2eb..25045cff59 100644
> > --- a/hw/net/vhost_net.c
> > +++ b/hw/net/vhost_net.c
> > @@ -17,8 +17,10 @@
> >   #include "net/net.h"
> >   #include "net/tap.h"
> >   #include "net/vhost-user.h"
> > +#include "net/vhost-vdpa.h"
> >
> >   #include "standard-headers/linux/vhost_types.h"
> > +#include "linux-headers/linux/vhost.h"
> >   #include "hw/virtio/virtio-net.h"
> >   #include "net/vhost_net.h"
> >   #include "qemu/error-report.h"
> > @@ -85,6 +87,30 @@ static const int user_feature_bits[] = {
> >       VHOST_INVALID_FEATURE_BIT
> >   };
> >
> > +static const int vdpa_feature_bits[] = {
> > +    VIRTIO_F_NOTIFY_ON_EMPTY,
> > +    VIRTIO_RING_F_INDIRECT_DESC,
> > +    VIRTIO_RING_F_EVENT_IDX,
> > +    VIRTIO_F_ANY_LAYOUT,
> > +    VIRTIO_F_VERSION_1,
> > +    VIRTIO_NET_F_CSUM,
> > +    VIRTIO_NET_F_GUEST_CSUM,
> > +    VIRTIO_NET_F_GSO,
> > +    VIRTIO_NET_F_GUEST_TSO4,
> > +    VIRTIO_NET_F_GUEST_TSO6,
> > +    VIRTIO_NET_F_GUEST_ECN,
> > +    VIRTIO_NET_F_GUEST_UFO,
> > +    VIRTIO_NET_F_HOST_TSO4,
> > +    VIRTIO_NET_F_HOST_TSO6,
> > +    VIRTIO_NET_F_HOST_ECN,
> > +    VIRTIO_NET_F_HOST_UFO,
> > +    VIRTIO_NET_F_MRG_RXBUF,
> > +    VIRTIO_NET_F_MTU,
> > +    VIRTIO_F_IOMMU_PLATFORM,
> > +    VIRTIO_F_RING_PACKED,
> > +    VIRTIO_NET_F_GUEST_ANNOUNCE,
> > +    VHOST_INVALID_FEATURE_BIT
> > +};
>
>
> I think those feature bits should belong to net/vhost-vdpa.c, since it
> contains bits that are net specific.
>
Sure will move these.
>
> >   static const int *vhost_net_get_feature_bits(struct vhost_net *net)
> >   {
> >       const int *feature_bits = 0;
> > @@ -96,6 +122,9 @@ static const int *vhost_net_get_feature_bits(struct vhost_net *net)
> >       case NET_CLIENT_DRIVER_VHOST_USER:
> >           feature_bits = user_feature_bits;
> >           break;
> > +    case NET_CLIENT_DRIVER_VHOST_VDPA:
> > +        feature_bits = vdpa_feature_bits;
> > +        break;
> >       default:
> >           error_report("Feature bits not defined for this type: %d",
> >                   net->nc->info->type);
> > @@ -110,7 +139,10 @@ uint64_t vhost_net_get_features(struct vhost_net *net, uint64_t features)
> >       return vhost_get_features(&net->dev, vhost_net_get_feature_bits(net),
> >               features);
> >   }
> > -
> > +int vhost_net_get_device_id(struct vhost_net *net, uint32_t * device_id)
> > +{
> > +    return vhost_dev_get_device_id(&net->dev, device_id);
> > +}
> >   void vhost_net_ack_features(struct vhost_net *net, uint64_t features)
> >   {
> >       net->dev.acked_features = net->dev.backend_features;
> > @@ -337,6 +369,11 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
> >       }
> >
> >       for (i = 0; i < total_queues; i++) {
> > +
> > +        if (virtio_queue_enabled(dev, i)) {
> > +            vhost_set_vring_ready(peer);
> > +        }
>
>
> So this may break vdpa_sim, since it calls set_vq_ready() before
> set_vring_addr.
>
> I think maybe it's better not to introduce vhost_set_vring_ready() but to
> enable the virtqueues in vhost_vdpa_set_state() before setting DRIVER_OK.
>
>
will fix this
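
A rough sketch of the ordering suggested above, using a mock device rather
than the real ioctls. The status bit values are the ones from the virtio
specification; `mock_dev`, `mock_set_vq_addr()`, `mock_set_vq_ready()` and
`mock_start()` are hypothetical names used only for illustration, and the
mock simply rejects enabling a queue without ring addresses (how a real
parent driver reacts is not modeled here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Device status bits as defined by the virtio specification. */
#define S_ACKNOWLEDGE 1
#define S_DRIVER      2
#define S_DRIVER_OK   4
#define S_FEATURES_OK 8

#define MOCK_NVQS 2

/* Hypothetical stand-in for a vDPA device; not a QEMU or kernel type. */
struct mock_dev {
    uint8_t status;
    bool addr_set[MOCK_NVQS];
    bool ready[MOCK_NVQS];
};

/* Record that the ring addresses of queue i have been programmed. */
static void mock_set_vq_addr(struct mock_dev *d, int i)
{
    assert(i >= 0 && i < MOCK_NVQS);
    d->addr_set[i] = true;
}

/* Refuse to enable a queue whose ring addresses are not programmed yet. */
static int mock_set_vq_ready(struct mock_dev *d, int i)
{
    assert(i >= 0 && i < MOCK_NVQS);
    if (!d->addr_set[i]) {
        return -1;
    }
    d->ready[i] = true;
    return 0;
}

/* OR new bits into the status byte, as vhost_vdpa_add_status() does. */
static void mock_add_status(struct mock_dev *d, uint8_t bits)
{
    d->status |= bits;
}

/* Start path: enable every queue first, only then set DRIVER_OK. */
static int mock_start(struct mock_dev *d)
{
    for (int i = 0; i < MOCK_NVQS; i++) {
        if (mock_set_vq_ready(d, i) < 0) {
            return -1;
        }
    }
    mock_add_status(d, S_DRIVER_OK);
    return (d->status & S_DRIVER_OK) ? 0 : -1;
}
```

With this shape the enable step cannot run ahead of set_vring_addr(), because
the start path is only reached after the rings have been set up.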
> > +
> >           peer = qemu_get_peer(ncs, i);
> >           r = vhost_net_start_one(get_vhost_net(peer), dev);
> >
> > @@ -433,6 +470,12 @@ VHostNetState *get_vhost_net(NetClientState *nc)
> >           vhost_net = vhost_user_get_vhost_net(nc);
> >           assert(vhost_net);
> >           break;
> > +#endif
> > +#ifdef CONFIG_VHOST_NET_VDPA
> > +    case NET_CLIENT_DRIVER_VHOST_VDPA:
> > +        vhost_net = vhost_vdpa_get_vhost_net(nc);
> > +        assert(vhost_net);
> > +        break;
> >   #endif
> >       default:
> >           break;
> > @@ -474,3 +517,5 @@ int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
> >
> >       return vhost_ops->vhost_net_set_mtu(&net->dev, mtu);
> >   }
> > +
> > +
>
>
> One extra newline.
>
will fix this
>
> > diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> > index 4e4d39a0a4..6b1b1a5fce 100644
> > --- a/hw/virtio/Makefile.objs
> > +++ b/hw/virtio/Makefile.objs
> > @@ -5,6 +5,7 @@ obj-y += virtio.o
> >   obj-$(CONFIG_VHOST) += vhost.o vhost-backend.o
> >   common-obj-$(call lnot,$(CONFIG_VHOST)) += vhost-stub.o
> >   obj-$(CONFIG_VHOST_USER) += vhost-user.o
> > +obj-$(CONFIG_VHOST_VDPA) += vhost-vdpa.o
> >
> >   common-obj-$(CONFIG_VIRTIO_RNG) += virtio-rng.o
> >   common-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> > diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> > index 42efb4967b..420341e8c5 100644
> > --- a/hw/virtio/vhost-backend.c
> > +++ b/hw/virtio/vhost-backend.c
> > @@ -291,6 +291,11 @@ int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType backend_type)
> >       case VHOST_BACKEND_TYPE_USER:
> >           dev->vhost_ops = &user_ops;
> >           break;
> > +#endif
> > +#ifdef CONFIG_VHOST_VDPA
> > +    case VHOST_BACKEND_TYPE_VDPA:
> > +        dev->vhost_ops = &vdpa_ops;
> > +        break;
> >   #endif
> >       default:
> >           error_report("Unknown vhost backend type");
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > new file mode 100644
> > index 0000000000..2d136a8565
> > --- /dev/null
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -0,0 +1,399 @@
> > +/*
> > + * vhost-vdpa
> > + *
> > + *  Copyright(c) 2017-2018 Intel Corporation.
> > + *  Copyright(c) 2020 Red Hat, Inc.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include <linux/vhost.h>
> > +#include <linux/vfio.h>
> > +#include <sys/eventfd.h>
> > +#include <sys/ioctl.h>
> > +#include "hw/virtio/vhost.h"
> > +#include "hw/virtio/vhost-backend.h"
> > +#include "hw/virtio/virtio-net.h"
> > +#include "hw/virtio/vhost-vdpa.h"
> > +#include "qemu/main-loop.h"
> > +#include <linux/kvm.h>
> > +#include "sysemu/kvm.h"
> > +
> > +
> > +static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section)
> > +{
> > +    return (!memory_region_is_ram(section->mr) &&
> > +            !memory_region_is_iommu(section->mr)) ||
> > +           /*
> > +            * Sizing an enabled 64-bit BAR can cause spurious mappings to
> > +            * addresses in the upper part of the 64-bit address space.  These
> > +            * are never accessed by the CPU and beyond the address width of
> > +            * some IOMMU hardware.  TODO: VDPA should tell us the IOMMU width.
> > +            */
> > +           section->offset_within_address_space & (1ULL << 63);
> > +}
> > +
> > +static int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> > +                              void *vaddr, bool readonly)
> > +{
> > +    struct vhost_msg_v2 msg;
> > +    int fd = v->device_fd;
> > +    int ret = 0;
> > +
> > +    msg.type =  v->msg_type;
> > +    msg.iotlb.iova = iova;
> > +    msg.iotlb.size = size;
> > +    msg.iotlb.uaddr = (uint64_t)vaddr;
> > +    msg.iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW;
> > +    msg.iotlb.type = VHOST_IOTLB_UPDATE;
> > +
> > +    if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> > +        error_report("failed to write, fd=%d, errno=%d (%s)",
> > +            fd, errno, strerror(errno));
> > +        return -EIO ;
> > +    }
> > +
> > +    return ret;
> > +}
> > +
> > +static int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova,
> > +                                hwaddr size)
> > +{
> > +    struct vhost_msg_v2 msg;
> > +    int fd = v->device_fd;
> > +    int ret = 0;
> > +
> > +    msg.type =  v->msg_type;
> > +    msg.iotlb.iova = iova;
> > +    msg.iotlb.size = size;
> > +    msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
> > +
> > +    if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> > +        error_report("failed to write, fd=%d, errno=%d (%s)",
> > +            fd, errno, strerror(errno));
> > +        return -EIO ;
> > +    }
> > +
> > +    return ret;
> > +}
> > +
> > +static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> > +                                           MemoryRegionSection *section)
> > +{
> > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > +    hwaddr iova;
> > +    Int128 llend, llsize;
> > +    void *vaddr;
> > +    int ret;
> > +
> > +    if (vhost_vdpa_listener_skipped_section(section)) {
> > +        return;
> > +    }
> > +
> > +    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > +                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > +        error_report("%s received unaligned region", __func__);
> > +        return;
> > +    }
> > +
> > +    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> > +    llend = int128_make64(section->offset_within_address_space);
> > +    llend = int128_add(llend, section->size);
> > +    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
> > +
> > +    if (int128_ge(int128_make64(iova), llend)) {
> > +        return;
> > +    }
> > +
> > +    memory_region_ref(section->mr);
> > +
> > +    /* Here we assume that memory_region_is_ram(section->mr)==true */
> > +
> > +    vaddr = memory_region_get_ram_ptr(section->mr) +
> > +            section->offset_within_region +
> > +            (iova - section->offset_within_address_space);
> > +
> > +    llsize = int128_sub(llend, int128_make64(iova));
> > +
> > +    ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
> > +                             vaddr, section->readonly);
> > +    if (ret) {
> > +        error_report("vhost vdpa map fail!");
> > +        if (memory_region_is_ram_device(section->mr)) {
> > +            /* Allow unexpected mappings not to be fatal for RAM devices */
> > +            error_report("map ram fail!");
> > +          return ;
> > +        }
> > +        goto fail;
> > +    }
> > +
> > +    return;
> > +
> > +fail:
> > +    if (memory_region_is_ram_device(section->mr)) {
> > +        error_report("failed to vdpa_dma_map. pci p2p may not work");
> > +        return;
> > +
> > +    }
> > +    /*
> > +     * On the initfn path, store the first error in the container so we
> > +     * can gracefully fail.  Runtime, there's not much we can do other
> > +     * than throw a hardware error.
> > +     */
> > +    error_report("vhost-vdpa: DMA mapping failed, unable to continue");
> > +    return;
> > +
> > +}
> > +
> > +static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> > +                                           MemoryRegionSection *section)
> > +{
> > +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> > +    hwaddr iova;
> > +    Int128 llend, llsize;
> > +    int ret;
> > +    bool try_unmap = true;
> > +
> > +    if (vhost_vdpa_listener_skipped_section(section)) {
> > +        return;
> > +    }
> > +
> > +    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> > +                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> > +        error_report("%s received unaligned region", __func__);
> > +        return;
> > +    }
> > +
> > +    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> > +    llend = int128_make64(section->offset_within_address_space);
> > +    llend = int128_add(llend, section->size);
> > +    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
> > +
> > +    if (int128_ge(int128_make64(iova), llend)) {
> > +        return;
> > +    }
> > +
> > +    llsize = int128_sub(llend, int128_make64(iova));
> > +
> > +    if (try_unmap) {
> > +        ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
> > +        if (ret) {
> > +            error_report("vhost_vdpa dma unmap error!");
> > +        }
> > +    }
> > +
> > +    memory_region_unref(section->mr);
> > +}
>
>
> newline is needed here.
>
will fix this
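
For reference, the range computation shared by the two listener callbacks
above can be sketched standalone. This is only an illustration: it assumes a
4 KiB page (QEMU derives the real value from the target), uses plain 64-bit
arithmetic instead of Int128, and `aligned_range()` and the `SKETCH_*` macros
are hypothetical names that do not exist in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 4 KiB page size; QEMU uses TARGET_PAGE_MASK instead. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/*
 * Shrink [start, start + len) to the largest page-aligned range it
 * contains. Returns false when no whole page fits, in which case the
 * listener would return without mapping or unmapping anything.
 */
static bool aligned_range(uint64_t start, uint64_t len,
                          uint64_t *iova, uint64_t *size)
{
    uint64_t first, end;

    /* Unlike the Int128 code, this sketch does not handle wrap-around. */
    assert(len <= UINT64_MAX - start);
    assert(start <= UINT64_MAX - (SKETCH_PAGE_SIZE - 1));

    first = (start + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK; /* round up */
    end = (start + len) & SKETCH_PAGE_MASK;                    /* round down */

    if (first >= end) {
        return false;
    }

    *iova = first;
    *size = end - first;
    return true;
}
```

For example, a section at 0x1800 of length 0x2000 shrinks to the single page
at 0x2000, while a section at 0x1800 of length 0x400 contains no whole page
and is skipped.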
>
> > +/* Register a new memory listener, only to get diffs from qemu,
> > + * this help to reduce the tricky codes in vhost
> > + * (e.g generating diffs of two rbtree as usnic did).*/
>
>
> This comment needs some improvement. How about:
>
> /* IOTLB API is used by vhost-vdpa which requires incremental updating
> of the mapping. So we cannot use the generic vhost memory listener which
> depends on the addnop(). */
>
>
> > +static const MemoryListener vhost_vdpa_memory_listener = {
> > +    .region_add = vhost_vdpa_listener_region_add,
> > +    .region_del = vhost_vdpa_listener_region_del,
> > +};
> > +
> > +static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> > +                             void *arg)
> > +{
> > +    struct vhost_vdpa *v = dev->opaque;
> > +    int fd = v->device_fd;
> > +
> > +    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> > +
> > +    return ioctl(fd, request, arg);
> > +}
> > +
> > +static void vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > +{
> > +    uint8_t s;
> > +
> > +    if (vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s)) {
> > +        return;
> > +    }
> > +
> > +    s |= status;
> > +
> > +    vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> > +}
> > +
> > +static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque)
> > +{
> > +    struct vhost_vdpa *v;
> > +
> > +    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> > +
> > +    v = opaque;
> > +    dev->opaque =  opaque ;
> > +
> > +    v->listener = vhost_vdpa_memory_listener;
> > +    v->msg_type = VHOST_IOTLB_MSG_V2;
> > +    memory_listener_register(&v->listener, &address_space_memory);
>
>
> Let's move the memory listener register/unregister to
> vhost_vdpa_set_state(). Then we can avoid lots of unnecessary vhost
> IOTLB transactions before DRIVER_OK, which vhost-vDPA doesn't care about.
>
>
Sure, will move this.

> > +
> > +    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > +                               VIRTIO_CONFIG_S_DRIVER);
> > +
> > +    return 0;
> > +}
> > +
> > +static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> > +{
> > +    struct vhost_vdpa *v;
> > +    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> > +
> > +    v = dev->opaque;
> > +    memory_listener_unregister(&v->listener);
> > +
> > +    dev->opaque = NULL;
> > +    return 0;
> > +}
> > +
> > +static int vhost_vdpa_memslots_limit(struct vhost_dev *dev)
> > +{
> > +    return INT_MAX;
> > +}
> > +
> > +static int vhost_vdpa_set_mem_table(struct vhost_dev *dev,
> > +                                    struct vhost_memory *mem)
> > +{
> > +
> > +    if (mem->padding) {
> > +        return -1;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +static int vhost_vdpa_set_features(struct vhost_dev *dev,
> > +                                   uint64_t features)
> > +{
> > +    int ret = vhost_vdpa_call(dev, VHOST_SET_FEATURES, &features);
> > +    uint8_t status = 0;
> > +
> > +    if (ret) {
> > +        return ret;
> > +    }
> > +    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK);
> > +    vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &status);
> > +
> > +    return !(status & VIRTIO_CONFIG_S_FEATURES_OK);
> > +}
> > +
> > +int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> > +                                   uint32_t *device_id)
> > +{
> > +    return vhost_vdpa_call(dev, VHOST_VDPA_GET_DEVICE_ID, device_id);
> > +}
> > +
> > +static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> > +{
> > +    uint8_t status = 0;
> > +
> > +    return vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> > +}
> > +
> > +static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
> > +{
> > +    assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
> > +
> > +    return idx - dev->vq_index;
> > +}
> > +
> > +static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev)
> > +{
> > +    int i;
> > +    for (i = 0; i < dev->nvqs; ++i) {
> > +        struct vhost_vring_state state = {
> > +            .index = dev->vq_index + i,
> > +            .num = 1,
> > +        };
> > +        vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
> > +    }
> > +    return 0;
> > +}
> > +
> > +static int vhost_vdpa_set_config(struct vhost_dev *dev, const uint8_t *data,
> > +                                   uint32_t offset, uint32_t size,
> > +                                   uint32_t flags)
> > +{
> > +    struct vhost_vdpa_config config;
> > +    int ret;
> > +    if ((size > VHOST_VDPA_MAX_CONFIG_SIZE) || (data == NULL)) {
> > +        return -1;
> > +    }
> > +    memset(&config, 0, sizeof(struct vhost_vdpa_config));
> > +    config.off = 0;
> > +    config.len = size;
> > +    memcpy(&config.buf, data, size);
> > +    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_CONFIG, &config);
> > +    return ret;
> > +}
> > +
> > +static int vhost_vdpa_get_config(struct vhost_dev *dev, uint8_t *config,
> > +                                   uint32_t config_len)
> > +{
> > +    struct vhost_vdpa_config v_config;
> > +    int ret;
> > +
> > +    memset(&v_config, 0, sizeof(struct vhost_vdpa_config));
> > +    if (config == NULL) {
> > +        return -1;
> > +    }
> > +    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_CONFIG, &v_config);
> > +    if ((v_config.len > config_len) || (v_config.len == 0)) {
> > +        return -EINVAL;
> > +    }
> > +    memcpy(config, &v_config.buf, config_len);
> > +    return ret;
> > + }
> > +
> > +static int vhost_vdpa_set_state(struct vhost_dev *dev, bool started)
>
>
> We probably need a better name, e.g. vhost_vdpa_start()?
>
>
will fix this
> > +{
> > +    if (started) {
> > +        uint8_t status = 0;
> > +
> > +        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> > +        vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &status);
> > +
> > +        return !(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > +    } else {
> > +        vhost_vdpa_reset_device(dev);
> > +        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> > +                                   VIRTIO_CONFIG_S_DRIVER);
> > +        return 0;
> > +    }
> > +}
> > +
> > +const VhostOps vdpa_ops = {
> > +        .backend_type = VHOST_BACKEND_TYPE_VDPA,
> > +        .vhost_backend_init = vhost_vdpa_init,
> > +        .vhost_backend_cleanup = vhost_vdpa_cleanup,
> > +        .vhost_set_log_base = vhost_kernel_set_log_base,
> > +        .vhost_set_vring_addr = vhost_kernel_set_vring_addr,
> > +        .vhost_set_vring_num = vhost_kernel_set_vring_num,
> > +        .vhost_set_vring_base = vhost_kernel_set_vring_base,
> > +        .vhost_get_vring_base = vhost_kernel_get_vring_base,
> > +        .vhost_set_vring_kick = vhost_kernel_set_vring_kick,
> > +        .vhost_set_vring_call = vhost_kernel_set_vring_call,
> > +        .vhost_get_features = vhost_kernel_get_features,
> > +        .vhost_set_owner = vhost_kernel_set_owner,
> > +        .vhost_set_vring_endian = NULL,
> > +        .vhost_backend_memslots_limit = vhost_vdpa_memslots_limit,
> > +        .vhost_set_mem_table = vhost_vdpa_set_mem_table,
> > +        .vhost_set_features = vhost_vdpa_set_features,
> > +        .vhost_reset_device = vhost_vdpa_reset_device,
> > +        .vhost_get_vq_index = vhost_vdpa_get_vq_index,
> > +        .vhost_set_vring_ready = vhost_vdpa_set_vring_ready,
> > +        .vhost_get_config  = vhost_vdpa_get_config,
> > +        .vhost_set_config = vhost_vdpa_set_config,
> > +        .vhost_requires_shm_log = NULL,
> > +        .vhost_migration_done = NULL,
> > +        .vhost_backend_can_merge = NULL,
> > +        .vhost_net_set_mtu = NULL,
> > +        .vhost_set_iotlb_callback = NULL,
> > +        .vhost_send_device_iotlb_msg = NULL,
> > +        .vhost_set_state = vhost_vdpa_set_state,
>
>
> Since it only accepts a boolean parameter, I guess vhost_dev_start() is better?
>
will fix this
>
> > +        .vhost_get_device_id = vhost_vdpa_get_device_id,
> > +};
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 01ebe12f28..b97aa02a4c 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -756,6 +756,12 @@ static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
> >           .log_guest_addr = vq->used_phys,
> >           .flags = enable_log ? (1 << VHOST_VRING_F_LOG) : 0,
> >       };
> > +    /*vDPA need to use the phys address here to set to hardware*/
>
>
> Actually it's "IOVA" instead of "phys address".
>
>
> > +    if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA) {
> > +        addr.desc_user_addr = (uint64_t)(unsigned long)vq->desc_phys;
> > +        addr.avail_user_addr = (uint64_t)(unsigned long)vq->avail_phys;
> > +        addr.used_user_addr = (uint64_t)(unsigned long)vq->used_phys;
> > +    }
>
>
> Let's introduce a callback here instead of such hard-coded ones.
>
will fix this
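
A minimal sketch of what such a callback could look like, assuming an
optional op that lets the backend choose which addresses go into the vring
setup. All types and names here (`sketch_vq`, `sketch_vring_addr`,
`sketch_ops`, `vq_get_addr`) are hypothetical, trimmed-down stand-ins and not
the real QEMU structures or a final API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a virtqueue; not the real QEMU struct. */
struct sketch_vq {
    uint64_t desc_user, avail_user, used_user; /* userspace addresses */
    uint64_t desc_iova, avail_iova, used_iova; /* IOVAs seen by the device */
};

struct sketch_vring_addr {
    uint64_t desc, avail, used;
};

struct sketch_ops {
    /* Optional: when NULL, the generic code uses userspace addresses. */
    void (*vq_get_addr)(const struct sketch_vq *vq,
                        struct sketch_vring_addr *addr);
};

/* What a vDPA backend might provide: hand IOVAs to the device. */
static void sketch_vdpa_vq_get_addr(const struct sketch_vq *vq,
                                    struct sketch_vring_addr *addr)
{
    addr->desc = vq->desc_iova;
    addr->avail = vq->avail_iova;
    addr->used = vq->used_iova;
}

/* Generic path: no backend-type check, only an optional callback. */
static void sketch_fill_vring_addr(const struct sketch_ops *ops,
                                   const struct sketch_vq *vq,
                                   struct sketch_vring_addr *addr)
{
    assert(ops && vq && addr);

    if (ops->vq_get_addr) {
        ops->vq_get_addr(vq, addr);
        return;
    }
    addr->desc = vq->desc_user;
    addr->avail = vq->avail_user;
    addr->used = vq->used_user;
}
```

The generic code then stays free of VHOST_BACKEND_TYPE_VDPA checks, and
backends that leave the op unset keep their current behaviour.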
>
> >       int r = dev->vhost_ops->vhost_set_vring_addr(dev, &addr);
> >       if (r < 0) {
> >           VHOST_OPS_DEBUG("vhost_set_vring_addr failed");
> > @@ -1506,6 +1512,14 @@ int vhost_dev_set_config(struct vhost_dev *hdev, const uint8_t *data,
> >       return -1;
> >   }
> >
> > +int vhost_dev_get_device_id(struct vhost_dev *hdev, uint32_t *device_id)
> > +{
> > +    assert(hdev->vhost_ops);
> > +    if (hdev->vhost_ops->vhost_get_device_id) {
> > +        return hdev->vhost_ops->vhost_get_device_id(hdev, device_id);
> > +    }
> > +    return -1;
> > +}
> >   void vhost_dev_set_config_notifier(struct vhost_dev *hdev,
> >                                      const VhostDevConfigOps *ops)
> >   {
> > @@ -1661,7 +1675,13 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
> >           }
> >       }
> >
> > -    if (vhost_dev_has_iommu(hdev)) {
> > +    r = vhost_set_state(hdev, true);
> > +    if (r) {
> > +        goto fail_log;
> > +    }
>
>
> Please use a separate patch for introducing vhost_set_state().
>
will fix this
>
> > +
> > +    if (vhost_dev_has_iommu(hdev) &&
> > +        hdev->vhost_ops->vhost_set_iotlb_callback) {
>
>
> Please use a new patch for the vhost_set_iotlb_callback() check, or just
> implement it for vhost-vdpa so that it only warns about an IOTLB miss.
>
>
> >           hdev->vhost_ops->vhost_set_iotlb_callback(hdev, true);
> >
> >           /* Update used ring information for IOTLB to work correctly,
> > @@ -1697,6 +1717,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
> >       /* should only be called after backend is connected */
> >       assert(hdev->vhost_ops);
> >
> > +    vhost_set_state(hdev, false);
> > +
> >       for (i = 0; i < hdev->nvqs; ++i) {
> >           vhost_virtqueue_stop(hdev,
> >                                vdev,
> > @@ -1705,7 +1727,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
> >       }
> >
> >       if (vhost_dev_has_iommu(hdev)) {
> > -        hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
> > +        if (hdev->vhost_ops->vhost_set_iotlb_callback) {
> > +            hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
> > +        }
> >           memory_listener_unregister(&hdev->iommu_listener);
> >       }
> >       vhost_log_put(hdev, true);
> > @@ -1722,3 +1746,12 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
> >
> >       return -1;
> >   }
> > +
> > +int vhost_set_state(struct vhost_dev *hdev, bool started)
> > +{
> > +    if (hdev->vhost_ops->vhost_set_state) {
> > +        return hdev->vhost_ops->vhost_set_state(hdev, started);
> > +    }
> > +
> > +    return 0;
> > +}
> > diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> > index 300b59c172..1ebe3785cf 100644
> > --- a/include/hw/virtio/vhost-backend.h
> > +++ b/include/hw/virtio/vhost-backend.h
> > @@ -17,7 +17,8 @@ typedef enum VhostBackendType {
> >       VHOST_BACKEND_TYPE_NONE = 0,
> >       VHOST_BACKEND_TYPE_KERNEL = 1,
> >       VHOST_BACKEND_TYPE_USER = 2,
> > -    VHOST_BACKEND_TYPE_MAX = 3,
> > +    VHOST_BACKEND_TYPE_VDPA = 3,
> > +    VHOST_BACKEND_TYPE_MAX = 4,
> >   } VhostBackendType;
> >
> >   typedef enum VhostSetConfigType {
> > @@ -77,6 +78,7 @@ typedef int (*vhost_reset_device_op)(struct vhost_dev *dev);
> >   typedef int (*vhost_get_vq_index_op)(struct vhost_dev *dev, int idx);
> >   typedef int (*vhost_set_vring_enable_op)(struct vhost_dev *dev,
> >                                            int enable);
> > +typedef int (*vhost_set_vring_ready_op)(struct vhost_dev *dev);
> >   typedef bool (*vhost_requires_shm_log_op)(struct vhost_dev *dev);
> >   typedef int (*vhost_migration_done_op)(struct vhost_dev *dev,
> >                                          char *mac_addr);
> > @@ -112,6 +114,8 @@ typedef int (*vhost_get_inflight_fd_op)(struct vhost_dev *dev,
> >   typedef int (*vhost_set_inflight_fd_op)(struct vhost_dev *dev,
> >                                           struct vhost_inflight *inflight);
> >
> > +typedef int (*vhost_set_state_op)(struct vhost_dev *dev, bool started);
> > +typedef int (*vhost_get_device_id_op)(struct vhost_dev *dev, uint32_t *dev_id);
> >   typedef struct VhostOps {
> >       VhostBackendType backend_type;
> >       vhost_backend_init vhost_backend_init;
> > @@ -138,6 +142,7 @@ typedef struct VhostOps {
> >       vhost_reset_device_op vhost_reset_device;
> >       vhost_get_vq_index_op vhost_get_vq_index;
> >       vhost_set_vring_enable_op vhost_set_vring_enable;
> > +    vhost_set_vring_ready_op vhost_set_vring_ready;
> >       vhost_requires_shm_log_op vhost_requires_shm_log;
> >       vhost_migration_done_op vhost_migration_done;
> >       vhost_backend_can_merge_op vhost_backend_can_merge;
> > @@ -152,9 +157,12 @@ typedef struct VhostOps {
> >       vhost_backend_mem_section_filter_op vhost_backend_mem_section_filter;
> >       vhost_get_inflight_fd_op vhost_get_inflight_fd;
> >       vhost_set_inflight_fd_op vhost_set_inflight_fd;
> > +    vhost_set_state_op vhost_set_state;
> > +    vhost_get_device_id_op vhost_get_device_id;
> >   } VhostOps;
> >
> >   extern const VhostOps user_ops;
> > +extern const VhostOps vdpa_ops;
> >
> >   int vhost_set_backend_type(struct vhost_dev *dev,
> >                              VhostBackendType backend_type);
> > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > new file mode 100644
> > index 0000000000..6455663388
> > --- /dev/null
> > +++ b/include/hw/virtio/vhost-vdpa.h
> > @@ -0,0 +1,26 @@
> > +/*
> > + * vhost-vdpa.h
> > + *
> > + * Copyright(c) 2017-2018 Intel Corporation.
> > + * Copyright(c) 2020 Red Hat, Inc.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#ifndef HW_VIRTIO_VHOST_VDPA_H
> > +#define HW_VIRTIO_VHOST_VDPA_H
> > +
> > +#include "hw/virtio/virtio.h"
> > +
> > +typedef struct vhost_vdpa {
> > +    int device_fd;
> > +    uint32_t msg_type;
> > +    MemoryListener listener;
> > +} VhostVDPA;
> > +
> > +extern AddressSpace address_space_memory;
> > +extern int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> > +                                   uint32_t *device_id);
> > +#endif
> > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> > index 085450c6f8..b682545f51 100644
> > --- a/include/hw/virtio/vhost.h
> > +++ b/include/hw/virtio/vhost.h
> > @@ -124,6 +124,7 @@ int vhost_dev_get_config(struct vhost_dev *dev, uint8_t *config,
> >                            uint32_t config_len);
> >   int vhost_dev_set_config(struct vhost_dev *dev, const uint8_t *data,
> >                            uint32_t offset, uint32_t size, uint32_t flags);
> > +int vhost_dev_get_device_id(struct vhost_dev *hdev, uint32_t *device_id);
> >   /* notifier callback in case vhost device config space changed
> >    */
> >   void vhost_dev_set_config_notifier(struct vhost_dev *dev,
> > @@ -137,4 +138,5 @@ int vhost_dev_set_inflight(struct vhost_dev *dev,
> >                              struct vhost_inflight *inflight);
> >   int vhost_dev_get_inflight(struct vhost_dev *dev, uint16_t queue_size,
> >                              struct vhost_inflight *inflight);
> > +int vhost_set_state(struct vhost_dev *dev, bool started);
> >   #endif
> > diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
> > index 8a6f208189..56e67fe164 100644
> > --- a/include/net/vhost_net.h
> > +++ b/include/net/vhost_net.h
> > @@ -40,5 +40,5 @@ int vhost_set_vring_ready(NetClientState *nc);
> >   uint64_t vhost_net_get_acked_features(VHostNetState *net);
> >
> >   int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu);
> > -
> > -#endif
> > +int vhost_net_get_device_id(struct vhost_net *net, uint32_t *device_id);
>
>
> This should belong to the vhost-vdpa header.
>
> Thanks
>
will fix this
>
> > +endif
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index 292d4e7c0c..c19e10ce9c 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -2409,6 +2409,10 @@ DEF("netdev", HAS_ARG, QEMU_OPTION_netdev,
> >   #ifdef CONFIG_POSIX
> >       "-netdev vhost-user,id=str,chardev=dev[,vhostforce=on|off]\n"
> >       "                configure a vhost-user network, backed by a chardev 'dev'\n"
> > +#endif
> > +#ifdef CONFIG_POSIX
> > +    "-netdev vhost-vdpa,id=str,vhostdev=/path/to/dev\n"
> > +    "                configure a vhost-vdpa network,Establish a vhost-vdpa netdev\n"
> >   #endif
> >       "-netdev hubport,id=str,hubid=n[,netdev=nd]\n"
> >       "                configure a hub port on the hub with ID 'n'\n", QEMU_ARCH_ALL)
> > @@ -2428,6 +2432,9 @@ DEF("nic", HAS_ARG, QEMU_OPTION_nic,
> >   #endif
> >   #ifdef CONFIG_POSIX
> >       "vhost-user|"
> > +#endif
> > +#ifdef CONFIG_POSIX
> > +    "vhost-vdpa|"
> >   #endif
> >       "socket][,option][,...][mac=macaddr]\n"
> >       "                initialize an on-board / default host NIC (using MAC address\n"
> > @@ -2896,6 +2903,14 @@ SRST
> >       hubport to another netdev with ID nd by using the ``netdev=nd``
> >       option.
> >
> > +``-netdev vhost-vdpa,vhostdev=/path/to/dev ``
> > +    Establish a vhost-vdpa netdev.
> > +
> > +    vDPA device is a device that uses a datapath which complies with
> > +    the virtio specifications with a vendor specific control path.
> > +    vDPA devices can be both physically located on the hardware or
> > +    emulated by software.
> > +
> >   ``-net nic[,netdev=nd][,macaddr=mac][,model=type] [,name=name][,addr=addr][,vectors=v]``
> >       Legacy option to configure or create an on-board (or machine
> >       default) Network Interface Card(NIC) and connect it either to the
>




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-06-03  2:53   ` Jason Wang
@ 2020-06-03  5:23     ` Cindy Lu
  0 siblings, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-03  5:23 UTC (permalink / raw)
  To: Jason Wang
  Cc: rdunlap, Michael Tsirkin, mhabets, qemu-devel, rob.miller,
	saugatm, Markus Armbruster, hch, Eugenio Perez Martin, jgg,
	Shahaf Shuler, kevin.tian, parav, vmireyno, Liang, Cunming,
	gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, Wang, Zhihong,
	Maxime Coquelin, Tiwei Bie, Ariel Adam, Cornelia Huck, hanand,
	Zhu, Lingshan

On Wed, Jun 3, 2020 at 10:54 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/5/29 下午10:06, Cindy Lu wrote:
> > From: Tiwei Bie<tiwei.bie@intel.com>
>
>
> Considering the significant modifications relative to the original patch,
>
> I think you may change the author to yourself and keep the S-o-bs for both
> Tiwei and Lingshan.
>
> Thanks
>
>
Sure, will change this.
> >
> > Currently we have 2 types of vhost backends in QEMU: vhost kernel and
> > vhost-user. The above patch provides a generic device for vDPA purposes.
> > This vDPA device exposes to user space a non-vendor-specific configuration
> > interface for setting up a vhost HW accelerator. This patch set introduces
> > a third vhost backend called vhost-vdpa based on the vDPA interface.
> >
> > Vhost-vdpa usage:
> >
> >    qemu-system-x86_64 -cpu host -enable-kvm \
> >      ......
> >    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
> >    -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \
> >
> > Co-Authored-By: Lingshan zhu<lingshan.zhu@intel.com>
> > Signed-off-by: Cindy Lu<lulu@redhat.com>
>




* Re: [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client
  2020-05-29 14:06 ` [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client Cindy Lu
  2020-05-29 14:22   ` Eric Blake
@ 2020-06-03  6:39   ` Jason Wang
  2020-06-03  8:19     ` Cindy Lu
  1 sibling, 1 reply; 41+ messages in thread
From: Jason Wang @ 2020-06-03  6:39 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, eblake, cohuck
  Cc: mhabets, qemu-devel, rob.miller, saugatm, maxime.coquelin, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	Tiwei Bie, aadam, rdunlap, hanand, lingshan.zhu


On 2020/5/29 下午10:06, Cindy Lu wrote:
> From: Tiwei Bie <tiwei.bie@intel.com>


Similarly for this patch, you can change the git author and keep the S-o-bs
for both Tiwei and Lingshan.


>
> This patch set introduces a new net client type: vhost-vdpa.
> vhost-vdpa net client will set up a vDPA device which is specified
> by a "vhostdev" parameter.
>
> Co-authored-by: Lingshan Zhu <lingshan.zhu@intel.com>
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>   include/net/vhost-vdpa.h |  19 ++++
>   include/net/vhost_net.h  |   2 +-
>   net/Makefile.objs        |   2 +-
>   net/clients.h            |   2 +
>   net/net.c                |   3 +
>   net/vhost-vdpa.c         | 235 +++++++++++++++++++++++++++++++++++++++
>   qapi/net.json            |  26 ++++-
>   7 files changed, 285 insertions(+), 4 deletions(-)
>   create mode 100644 include/net/vhost-vdpa.h
>   create mode 100644 net/vhost-vdpa.c
>
> diff --git a/include/net/vhost-vdpa.h b/include/net/vhost-vdpa.h
> new file mode 100644
> index 0000000000..6ce0d04f72
> --- /dev/null
> +++ b/include/net/vhost-vdpa.h
> @@ -0,0 +1,19 @@
> +/*
> + * vhost-vdpa.h
> + *
> + * Copyright(c) 2017-2018 Intel Corporation.
> + * Copyright(c) 2020 Red Hat, Inc.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef VHOST_VDPA_H
> +#define VHOST_VDPA_H
> +
> +struct vhost_net;
> +struct vhost_net *vhost_vdpa_get_vhost_net(NetClientState *nc);
> +uint64_t vhost_vdpa_get_acked_features(NetClientState *nc);
> +
> +#endif /* VHOST_VDPA_H */
> diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
> index 56e67fe164..0b87d3c6e9 100644
> --- a/include/net/vhost_net.h
> +++ b/include/net/vhost_net.h
> @@ -41,4 +41,4 @@ uint64_t vhost_net_get_acked_features(VHostNetState *net);
>   
>   int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu);
>   int vhost_net_get_device_id(struct vhost_net *net, uint32_t *device_id);
> -endif
> +#endif
> diff --git a/net/Makefile.objs b/net/Makefile.objs
> index c5d076d19c..5ab45545db 100644
> --- a/net/Makefile.objs
> +++ b/net/Makefile.objs
> @@ -26,7 +26,7 @@ tap-obj-$(CONFIG_SOLARIS) = tap-solaris.o
>   tap-obj-y ?= tap-stub.o
>   common-obj-$(CONFIG_POSIX) += tap.o $(tap-obj-y)
>   common-obj-$(CONFIG_WIN32) += tap-win32.o
> -
> +common-obj-$(CONFIG_VHOST_NET_VDPA) += vhost-vdpa.o
>   vde.o-libs = $(VDE_LIBS)
>   
>   common-obj-$(CONFIG_CAN_BUS) += can/
> diff --git a/net/clients.h b/net/clients.h
> index a6ef267e19..92f9b59aed 100644
> --- a/net/clients.h
> +++ b/net/clients.h
> @@ -61,4 +61,6 @@ int net_init_netmap(const Netdev *netdev, const char *name,
>   int net_init_vhost_user(const Netdev *netdev, const char *name,
>                           NetClientState *peer, Error **errp);
>   
> +int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> +                        NetClientState *peer, Error **errp);
>   #endif /* QEMU_NET_CLIENTS_H */
> diff --git a/net/net.c b/net/net.c
> index 599fb61028..82624ea9ac 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -965,6 +965,9 @@ static int (* const net_client_init_fun[NET_CLIENT_DRIVER__MAX])(
>   #ifdef CONFIG_VHOST_NET_USER
>           [NET_CLIENT_DRIVER_VHOST_USER] = net_init_vhost_user,
>   #endif
> +#ifdef CONFIG_VHOST_NET_VDPA
> +        [NET_CLIENT_DRIVER_VHOST_VDPA] = net_init_vhost_vdpa,
> +#endif
>   #ifdef CONFIG_L2TPV3
>           [NET_CLIENT_DRIVER_L2TPV3]    = net_init_l2tpv3,
>   #endif
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> new file mode 100644
> index 0000000000..7b98c142b5
> --- /dev/null
> +++ b/net/vhost-vdpa.c
> @@ -0,0 +1,235 @@
> +/*
> + * vhost-vdpa.c
> + *
> + * Copyright(c) 2017-2018 Intel Corporation.
> + * Copyright(c) 2020 Red Hat, Inc.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "clients.h"
> +#include "net/vhost_net.h"
> +#include "net/vhost-vdpa.h"
> +#include "hw/virtio/vhost-vdpa.h"
> +#include "qemu/config-file.h"
> +#include "qemu/error-report.h"
> +#include "qemu/option.h"
> +#include "qapi/error.h"
> +#include <sys/ioctl.h>
> +#include <err.h>
> +#include "standard-headers/linux/virtio_net.h"
> +#include "monitor/monitor.h"
> +#include "hw/virtio/vhost.h"
> +
> +/* Todo:need to add the multiqueue support here */
> +typedef struct VhostVDPAState {
> +    NetClientState nc;
> +    struct vhost_vdpa vhost_vdpa;
> +    VHostNetState *vhost_net;
> +    uint64_t acked_features;
> +    bool started;
> +} VhostVDPAState;
> +
> +VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
> +{
> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> +    return s->vhost_net;
> +}
> +
> +uint64_t vhost_vdpa_get_acked_features(NetClientState *nc)
> +{
> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> +    return s->acked_features;
> +}
> +
> +static int vhost_vdpa_check_device_id(NetClientState *nc)


A better name is needed, something like "vhost_vdpa_net_check_device_id".

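As a rough illustration of the renamed helper, here is a minimal standalone
sketch. Everything prefixed with sketch_/SKETCH_ is a hypothetical stand-in
so the example compiles outside QEMU; it is not the QEMU API. The value 1
for the network device ID follows the virtio specification. Checking the
query's return value before looking at device_id is an assumption of this
sketch, not something the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Virtio device ID for a network card (1 in the virtio specification). */
#define SKETCH_VIRTIO_ID_NET 1

/* Hypothetical stand-in for struct vhost_net; not the QEMU type. */
struct sketch_vhost_net {
    int query_ret;       /* result the fake device-id query should return */
    uint32_t device_id;  /* ID the fake device reports on success */
};

/* Plays the role of vhost_net_get_device_id() in this sketch. */
static int sketch_get_device_id(struct sketch_vhost_net *net, uint32_t *id)
{
    if (net->query_ret) {
        return net->query_ret;
    }
    *id = net->device_id;
    return 0;
}

/* Net-specific check: succeed only if the device reports VIRTIO_ID_NET. */
static int sketch_vhost_vdpa_net_check_device_id(struct sketch_vhost_net *net)
{
    uint32_t device_id = 0;
    int ret = sketch_get_device_id(net, &device_id);

    if (ret) {
        return ret;              /* query failed: device_id is not valid */
    }
    if (device_id != SKETCH_VIRTIO_ID_NET) {
        return -ENOTSUP;         /* a vDPA device, but not a net device */
    }
    return 0;
}
```

The "net" in the name makes it clear that the check belongs to the net
client and not to the generic vhost-vdpa backend.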

> +{
> +    uint32_t device_id;
> +    int ret;
> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> +    /* Get the device id from hw*/


The code explains itself, so no need for this comment.


> +    ret = vhost_net_get_device_id(s->vhost_net, &device_id);
> +    if (device_id != VIRTIO_ID_NET) {
> +        return -ENOTSUP;
> +    }
> +    return ret;
> +}
> +
> +static void vhost_vdpa_del(NetClientState *ncs)
> +{
> +    VhostVDPAState *s;
> +
> +    assert(ncs->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> +
> +    s = DO_UPCAST(VhostVDPAState, nc, ncs);
> +
> +    if (s->vhost_net) {
> +        /* save acked features */
> +        uint64_t features = vhost_net_get_acked_features(s->vhost_net);
> +        if (features) {
> +            s->acked_features = features;
> +        }


I'm not sure I follow this part: is acked_features used in
vhost_net_cleanup()?


> +        vhost_net_cleanup(s->vhost_net);
> +    }
> +}
> +
> +static int vhost_vdpa_add(NetClientState *ncs, void *be)
> +{
> +    VhostNetOptions options;
> +    struct vhost_net *net = NULL;
> +    VhostVDPAState *s;
> +    int ret;
> +
> +    options.backend_type = VHOST_BACKEND_TYPE_VDPA;
> +
> +    assert(ncs->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> +
> +    s = DO_UPCAST(VhostVDPAState, nc, ncs);
> +
> +    options.net_backend = ncs;
> +    options.opaque      = be;
> +    options.busyloop_timeout = 0;
> +    net = vhost_net_init(&options);
> +    if (!net) {
> +        error_report("failed to init vhost_net for queue");
> +        goto err;
> +    }
> +
> +    if (s->vhost_net) {
> +        vhost_net_cleanup(s->vhost_net);
> +        g_free(s->vhost_net);
> +    }
> +    s->vhost_net = net;
> +    /* check the device id for vdpa */


The comment could be removed as well.


> +    ret = vhost_vdpa_check_device_id(ncs);
> +    if (ret) {
> +        goto err;
> +    }
> +    return 0;
> +err:
> +    if (net) {
> +        vhost_net_cleanup(net);
> +    }
> +    vhost_vdpa_del(ncs);
> +    return -1;
> +}
> +
> +static void vhost_vdpa_cleanup(NetClientState *nc)
> +{
> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> +
> +    if (s->vhost_net) {
> +        vhost_net_cleanup(s->vhost_net);
> +        g_free(s->vhost_net);
> +        s->vhost_net = NULL;
> +    }
> +
> +    qemu_purge_queued_packets(nc);


Why is this needed?

Thanks


> +}
> +
> +static bool vhost_vdpa_has_vnet_hdr(NetClientState *nc)
> +{
> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> +
> +    return true;
> +}
> +
> +static bool vhost_vdpa_has_ufo(NetClientState *nc)
> +{
> +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> +    uint64_t  features = 0;
> +
> +    features |= (1ULL << VIRTIO_NET_F_HOST_UFO);
> +    features = vhost_net_get_features(s->vhost_net, features);
> +    return !!(features & (1ULL << VIRTIO_NET_F_HOST_UFO));
> +
> +}
> +
> +static NetClientInfo net_vhost_vdpa_info = {
> +        .type = NET_CLIENT_DRIVER_VHOST_VDPA,
> +        .size = sizeof(VhostVDPAState),
> +        .cleanup = vhost_vdpa_cleanup,
> +        .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
> +        .has_ufo = vhost_vdpa_has_ufo,
> +};
> +
> +static int net_vhost_vdpa_init(NetClientState *peer, const char *device,
> +                               const char *name, const char *vhostdev,
> +                               bool has_fd, char *fd)
> +{
> +    NetClientState *nc = NULL;
> +    VhostVDPAState *s;
> +    int vdpa_device_fd = -1;
> +    Error *err = NULL;
> +    int ret = 0;
> +    assert(name);
> +
> +    nc = qemu_new_net_client(&net_vhost_vdpa_info, peer, device, name);
> +    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-vdpa");
> +    nc->queue_index = 0;
> +
> +    s = DO_UPCAST(VhostVDPAState, nc, nc);
> +
> +    if (has_fd) {
> +        vdpa_device_fd = monitor_fd_param(cur_mon, fd, &err);
> +    } else{
> +        vdpa_device_fd = open(vhostdev, O_RDWR);
> +    }
> +
> +    if (vdpa_device_fd == -1) {
> +        return -errno;
> +    }
> +    s->vhost_vdpa.device_fd = vdpa_device_fd;
> +    ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa);
> +    assert(s->vhost_net);
> +
> +    if (ret) {
> +        if (has_fd) {
> +            close(vdpa_device_fd);
> +        }
> +    }
> +    return ret;
> +}
> +
> +static int net_vhost_check_net(void *opaque, QemuOpts *opts, Error **errp)
> +{
> +    const char *name = opaque;
> +    const char *driver, *netdev;
> +
> +    driver = qemu_opt_get(opts, "driver");
> +    netdev = qemu_opt_get(opts, "netdev");
> +    if (!driver || !netdev) {
> +        return 0;
> +    }
> +
> +    if (strcmp(netdev, name) == 0 &&
> +        !g_str_has_prefix(driver, "virtio-net-")) {
> +        error_setg(errp, "vhost-vdpa requires frontend driver virtio-net-*");
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> +                        NetClientState *peer, Error **errp)
> +{
> +    const NetdevVhostVDPAOptions *opts;
> +
> +    assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> +    opts = &netdev->u.vhost_vdpa;
> +    /* verify net frontend */
> +    if (qemu_opts_foreach(qemu_find_opts("device"), net_vhost_check_net,
> +                          (char *)name, errp)) {
> +        return -1;
> +    }
> +    return net_vhost_vdpa_init(peer, "vhost_vdpa", name, opts->vhostdev,
> +                    opts->has_fd, opts->fd);
> +}
> diff --git a/qapi/net.json b/qapi/net.json
> index cebb1b52e3..37507ce9ba 100644
> --- a/qapi/net.json
> +++ b/qapi/net.json
> @@ -428,6 +428,27 @@
>       '*vhostforce':    'bool',
>       '*queues':        'int' } }
>   
> +##
> +# @NetdevVhostVDPAOptions:
> +#
> +# Vhost-vdpa network backend
> +#
> +# @vhostdev: name of a vdpa dev path in sysfs
> +#            (default path:/dev/vhost-vdpa-$ID)
> +#
> +# @fd: file descriptor of an already opened vdpa device
> +#
> +# @queues: number of queues to be created for multiqueue vhost-vdpa
> +#          (default: 1)
> +#
> +# Since: 5.1
> +##
> +{ 'struct': 'NetdevVhostVDPAOptions',
> +  'data': {
> +    '*vhostdev':     'str',
> +    '*fd':           'str',
> +    '*queues':       'int' } }
> +
>   ##
>   # @NetClientDriver:
>   #
> @@ -437,7 +458,7 @@
>   ##
>   { 'enum': 'NetClientDriver',
>     'data': [ 'none', 'nic', 'user', 'tap', 'l2tpv3', 'socket', 'vde',
> -            'bridge', 'hubport', 'netmap', 'vhost-user' ] }
> +            'bridge', 'hubport', 'netmap', 'vhost-user', 'vhost-vdpa' ] }
>   
>   ##
>   # @Netdev:
> @@ -465,7 +486,8 @@
>       'bridge':   'NetdevBridgeOptions',
>       'hubport':  'NetdevHubPortOptions',
>       'netmap':   'NetdevNetmapOptions',
> -    'vhost-user': 'NetdevVhostUserOptions' } }
> +    'vhost-user': 'NetdevVhostUserOptions',
> +    'vhost-vdpa': 'NetdevVhostVDPAOptions' } }
>   
>   ##
>   # @NetLegacy:




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-05-29 14:06 ` [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend Cindy Lu
  2020-06-03  2:52   ` Jason Wang
  2020-06-03  2:53   ` Jason Wang
@ 2020-06-03  6:43   ` Jason Wang
  2020-06-03  8:20     ` Cindy Lu
  2020-06-04 10:39   ` Eugenio Perez Martin
  2020-06-08 20:14   ` Eric Blake
  4 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2020-06-03  6:43 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, eblake, cohuck
  Cc: mhabets, qemu-devel, rob.miller, saugatm, maxime.coquelin, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	Tiwei Bie, aadam, rdunlap, hanand, lingshan.zhu


On 2020/5/29 10:06 PM, Cindy Lu wrote:
> From: Tiwei Bie<tiwei.bie@intel.com>
>
> Currently we have 2 types of vhost backends in QEMU: vhost kernel and
> vhost-user. The above patch provides a generic device for vDPA purpose,
> this vDPA device exposes to user space a non-vendor-specific configuration
> interface for setting up a vhost HW accelerator, this patch set introduces
> a third vhost backend called vhost-vdpa based on the vDPA interface.
>
> Vhost-vdpa usage:
>
>    qemu-system-x86_64 -cpu host -enable-kvm \
>      ......
>    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
>    -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \
>
> Co-Authored-By: Lingshan zhu<lingshan.zhu@intel.com>
> Signed-off-by: Cindy Lu<lulu@redhat.com>
> ---


Btw, I don't see how vhost_set/get_config() is connected to
virtio_net_set/get_config().

Thanks

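One possible shape for that connection, as a standalone sketch only: the
front end's config accessors forward to the backend when the peer is a vDPA
device. All sketch_/SKETCH_ names, the 16-byte config size, and the struct
layouts are assumptions made so the example compiles on its own; they are
not QEMU types or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CONFIG_SIZE 16   /* arbitrary size chosen for the sketch */

enum sketch_backend_type { SKETCH_BACKEND_KERNEL, SKETCH_BACKEND_VDPA };

/* Hypothetical backend: owns the config space of the vDPA device. */
struct sketch_backend {
    enum sketch_backend_type type;
    uint8_t hw_config[SKETCH_CONFIG_SIZE];
};

/* Hypothetical front end: keeps an emulated copy of the config space. */
struct sketch_virtio_net {
    struct sketch_backend *peer;
    uint8_t config[SKETCH_CONFIG_SIZE];
};

/* Plays the role of vhost_get_config(): read from the device. */
static int sketch_vhost_get_config(struct sketch_backend *be,
                                   uint8_t *buf, size_t len)
{
    if (len > sizeof(be->hw_config)) {
        return -1;
    }
    memcpy(buf, be->hw_config, len);
    return 0;
}

/* Plays the role of vhost_set_config(): write to the device. */
static int sketch_vhost_set_config(struct sketch_backend *be,
                                   const uint8_t *buf, size_t len)
{
    if (len > sizeof(be->hw_config)) {
        return -1;
    }
    memcpy(be->hw_config, buf, len);
    return 0;
}

/* Plays the role of virtio_net_get_config(). */
static void sketch_virtio_net_get_config(struct sketch_virtio_net *n,
                                         uint8_t *out, size_t len)
{
    if (len > sizeof(n->config)) {
        len = sizeof(n->config);
    }
    memcpy(out, n->config, len);            /* emulated values by default */
    if (n->peer && n->peer->type == SKETCH_BACKEND_VDPA) {
        uint8_t hw[SKETCH_CONFIG_SIZE];
        if (sketch_vhost_get_config(n->peer, hw, len) == 0) {
            memcpy(out, hw, len);           /* device values win for vDPA */
        }
    }
}

/* Plays the role of virtio_net_set_config(). */
static void sketch_virtio_net_set_config(struct sketch_virtio_net *n,
                                         const uint8_t *in, size_t len)
{
    if (len > sizeof(n->config)) {
        len = sizeof(n->config);
    }
    memcpy(n->config, in, len);
    if (n->peer && n->peer->type == SKETCH_BACKEND_VDPA) {
        sketch_vhost_set_config(n->peer, in, len);
    }
}
```

With a non-vDPA peer the accessors only touch the emulated copy, so the
existing backends would see no change in behaviour.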




* Re: [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client
  2020-06-03  6:39   ` Jason Wang
@ 2020-06-03  8:19     ` Cindy Lu
  2020-06-03  8:43       ` Jason Wang
  0 siblings, 1 reply; 41+ messages in thread
From: Cindy Lu @ 2020-06-03  8:19 UTC (permalink / raw)
  To: Jason Wang
  Cc: rdunlap, Michael Tsirkin, mhabets, qemu-devel, rob.miller,
	saugatm, Markus Armbruster, hch, Eugenio Perez Martin, jgg,
	Shahaf Shuler, kevin.tian, parav, vmireyno, Liang, Cunming,
	gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, Wang, Zhihong,
	Maxime Coquelin, Tiwei Bie, Ariel Adam, Cornelia Huck, hanand,
	Zhu, Lingshan

Hi Jason,

On Wed, Jun 3, 2020 at 2:39 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/5/29 10:06 PM, Cindy Lu wrote:
> > From: Tiwei Bie <tiwei.bie@intel.com>
>
>
> Similarly for this patch: you can change the git author and keep the
> Signed-off-by lines for both Tiwei and Lingshan.
>
>
Will fix this.

> >
> > This patch set introduces a new net client type: vhost-vdpa.
> > vhost-vdpa net client will set up a vDPA device which is specified
> > by a "vhostdev" parameter.
> >
> > Co-authored-by: Lingshan Zhu <lingshan.zhu@intel.com>
> > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > ---
> >   include/net/vhost-vdpa.h |  19 ++++
> >   include/net/vhost_net.h  |   2 +-
> >   net/Makefile.objs        |   2 +-
> >   net/clients.h            |   2 +
> >   net/net.c                |   3 +
> >   net/vhost-vdpa.c         | 235 +++++++++++++++++++++++++++++++++++++++
> >   qapi/net.json            |  26 ++++-
> >   7 files changed, 285 insertions(+), 4 deletions(-)
> >   create mode 100644 include/net/vhost-vdpa.h
> >   create mode 100644 net/vhost-vdpa.c
> >
> > diff --git a/include/net/vhost-vdpa.h b/include/net/vhost-vdpa.h
> > new file mode 100644
> > index 0000000000..6ce0d04f72
> > --- /dev/null
> > +++ b/include/net/vhost-vdpa.h
> > @@ -0,0 +1,19 @@
> > +/*
> > + * vhost-vdpa.h
> > + *
> > + * Copyright(c) 2017-2018 Intel Corporation.
> > + * Copyright(c) 2020 Red Hat, Inc.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#ifndef VHOST_VDPA_H
> > +#define VHOST_VDPA_H
> > +
> > +struct vhost_net;
> > +struct vhost_net *vhost_vdpa_get_vhost_net(NetClientState *nc);
> > +uint64_t vhost_vdpa_get_acked_features(NetClientState *nc);
> > +
> > +#endif /* VHOST_VDPA_H */
> > diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
> > index 56e67fe164..0b87d3c6e9 100644
> > --- a/include/net/vhost_net.h
> > +++ b/include/net/vhost_net.h
> > @@ -41,4 +41,4 @@ uint64_t vhost_net_get_acked_features(VHostNetState *net);
> >
> >   int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu);
> >   int vhost_net_get_device_id(struct vhost_net *net, uint32_t *device_id);
> > -endif
> > +#endif
> > diff --git a/net/Makefile.objs b/net/Makefile.objs
> > index c5d076d19c..5ab45545db 100644
> > --- a/net/Makefile.objs
> > +++ b/net/Makefile.objs
> > @@ -26,7 +26,7 @@ tap-obj-$(CONFIG_SOLARIS) = tap-solaris.o
> >   tap-obj-y ?= tap-stub.o
> >   common-obj-$(CONFIG_POSIX) += tap.o $(tap-obj-y)
> >   common-obj-$(CONFIG_WIN32) += tap-win32.o
> > -
> > +common-obj-$(CONFIG_VHOST_NET_VDPA) += vhost-vdpa.o
> >   vde.o-libs = $(VDE_LIBS)
> >
> >   common-obj-$(CONFIG_CAN_BUS) += can/
> > diff --git a/net/clients.h b/net/clients.h
> > index a6ef267e19..92f9b59aed 100644
> > --- a/net/clients.h
> > +++ b/net/clients.h
> > @@ -61,4 +61,6 @@ int net_init_netmap(const Netdev *netdev, const char *name,
> >   int net_init_vhost_user(const Netdev *netdev, const char *name,
> >                           NetClientState *peer, Error **errp);
> >
> > +int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> > +                        NetClientState *peer, Error **errp);
> >   #endif /* QEMU_NET_CLIENTS_H */
> > diff --git a/net/net.c b/net/net.c
> > index 599fb61028..82624ea9ac 100644
> > --- a/net/net.c
> > +++ b/net/net.c
> > @@ -965,6 +965,9 @@ static int (* const net_client_init_fun[NET_CLIENT_DRIVER__MAX])(
> >   #ifdef CONFIG_VHOST_NET_USER
> >           [NET_CLIENT_DRIVER_VHOST_USER] = net_init_vhost_user,
> >   #endif
> > +#ifdef CONFIG_VHOST_NET_VDPA
> > +        [NET_CLIENT_DRIVER_VHOST_VDPA] = net_init_vhost_vdpa,
> > +#endif
> >   #ifdef CONFIG_L2TPV3
> >           [NET_CLIENT_DRIVER_L2TPV3]    = net_init_l2tpv3,
> >   #endif
> > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > new file mode 100644
> > index 0000000000..7b98c142b5
> > --- /dev/null
> > +++ b/net/vhost-vdpa.c
> > @@ -0,0 +1,235 @@
> > +/*
> > + * vhost-vdpa.c
> > + *
> > + * Copyright(c) 2017-2018 Intel Corporation.
> > + * Copyright(c) 2020 Red Hat, Inc.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "clients.h"
> > +#include "net/vhost_net.h"
> > +#include "net/vhost-vdpa.h"
> > +#include "hw/virtio/vhost-vdpa.h"
> > +#include "qemu/config-file.h"
> > +#include "qemu/error-report.h"
> > +#include "qemu/option.h"
> > +#include "qapi/error.h"
> > +#include <sys/ioctl.h>
> > +#include <err.h>
> > +#include "standard-headers/linux/virtio_net.h"
> > +#include "monitor/monitor.h"
> > +#include "hw/virtio/vhost.h"
> > +
> > +/* Todo:need to add the multiqueue support here */
> > +typedef struct VhostVDPAState {
> > +    NetClientState nc;
> > +    struct vhost_vdpa vhost_vdpa;
> > +    VHostNetState *vhost_net;
> > +    uint64_t acked_features;
> > +    bool started;
> > +} VhostVDPAState;
> > +
> > +VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
> > +{
> > +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > +    return s->vhost_net;
> > +}
> > +
> > +uint64_t vhost_vdpa_get_acked_features(NetClientState *nc)
> > +{
> > +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > +    return s->acked_features;
> > +}
> > +
> > +static int vhost_vdpa_check_device_id(NetClientState *nc)
>
>
> A better name is needed, something like "vhost_vdpa_net_check_device_id".
>
Sure, will fix this.
>
> > +{
> > +    uint32_t device_id;
> > +    int ret;
> > +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +    /* Get the device id from hw*/
>
>
> The code explains itself, so no need for this comment.
>
Will remove this.
>
> > +    ret = vhost_net_get_device_id(s->vhost_net, &device_id);
> > +    if (device_id != VIRTIO_ID_NET) {
> > +        return -ENOTSUP;
> > +    }
> > +    return ret;
> > +}
> > +
> > +static void vhost_vdpa_del(NetClientState *ncs)
> > +{
> > +    VhostVDPAState *s;
> > +
> > +    assert(ncs->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > +
> > +    s = DO_UPCAST(VhostVDPAState, nc, ncs);
> > +
> > +    if (s->vhost_net) {
> > +        /* save acked features */
> > +        uint64_t features = vhost_net_get_acked_features(s->vhost_net);
> > +        if (features) {
> > +            s->acked_features = features;
> > +        }
>
>
> I'm not sure I follow this part: is acked_features used in
> vhost_net_cleanup()?
>
I think we can remove this part; it seems these bits are no longer used.
>
> > +        vhost_net_cleanup(s->vhost_net);
> > +    }
> > +}
> > +
> > +static int vhost_vdpa_add(NetClientState *ncs, void *be)
> > +{
> > +    VhostNetOptions options;
> > +    struct vhost_net *net = NULL;
> > +    VhostVDPAState *s;
> > +    int ret;
> > +
> > +    options.backend_type = VHOST_BACKEND_TYPE_VDPA;
> > +
> > +    assert(ncs->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > +
> > +    s = DO_UPCAST(VhostVDPAState, nc, ncs);
> > +
> > +    options.net_backend = ncs;
> > +    options.opaque      = be;
> > +    options.busyloop_timeout = 0;
> > +    net = vhost_net_init(&options);
> > +    if (!net) {
> > +        error_report("failed to init vhost_net for queue");
> > +        goto err;
> > +    }
> > +
> > +    if (s->vhost_net) {
> > +        vhost_net_cleanup(s->vhost_net);
> > +        g_free(s->vhost_net);
> > +    }
> > +    s->vhost_net = net;
> > +    /* check the device id for vdpa */
>
>
> The comment could be removed as well.
>
>
Sure, will remove this.

> > +    ret = vhost_vdpa_check_device_id(ncs);
> > +    if (ret) {
> > +        goto err;
> > +    }
> > +    return 0;
> > +err:
> > +    if (net) {
> > +        vhost_net_cleanup(net);
> > +    }
> > +    vhost_vdpa_del(ncs);
> > +    return -1;
> > +}
> > +
> > +static void vhost_vdpa_cleanup(NetClientState *nc)
> > +{
> > +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +
> > +    if (s->vhost_net) {
> > +        vhost_net_cleanup(s->vhost_net);
> > +        g_free(s->vhost_net);
> > +        s->vhost_net = NULL;
> > +    }
> > +
> > +    qemu_purge_queued_packets(nc);
>
>
> Why is this needed?
>
> Thanks
>
This is to purge the packets left in the queue when the vdpa device is
removed. I will double-check this part.
>
> > +}
> > +
> > +static bool vhost_vdpa_has_vnet_hdr(NetClientState *nc)
> > +{
> > +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > +
> > +    return true;
> > +}
> > +
> > +static bool vhost_vdpa_has_ufo(NetClientState *nc)
> > +{
> > +    assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +    uint64_t  features = 0;
> > +
> > +    features |= (1ULL << VIRTIO_NET_F_HOST_UFO);
> > +    features = vhost_net_get_features(s->vhost_net, features);
> > +    return !!(features & (1ULL << VIRTIO_NET_F_HOST_UFO));
> > +
> > +}
> > +
> > +static NetClientInfo net_vhost_vdpa_info = {
> > +        .type = NET_CLIENT_DRIVER_VHOST_VDPA,
> > +        .size = sizeof(VhostVDPAState),
> > +        .cleanup = vhost_vdpa_cleanup,
> > +        .has_vnet_hdr = vhost_vdpa_has_vnet_hdr,
> > +        .has_ufo = vhost_vdpa_has_ufo,
> > +};
> > +
> > +static int net_vhost_vdpa_init(NetClientState *peer, const char *device,
> > +                               const char *name, const char *vhostdev,
> > +                               bool has_fd, char *fd)
> > +{
> > +    NetClientState *nc = NULL;
> > +    VhostVDPAState *s;
> > +    int vdpa_device_fd = -1;
> > +    Error *err = NULL;
> > +    int ret = 0;
> > +    assert(name);
> > +
> > +    nc = qemu_new_net_client(&net_vhost_vdpa_info, peer, device, name);
> > +    snprintf(nc->info_str, sizeof(nc->info_str), "vhost-vdpa");
> > +    nc->queue_index = 0;
> > +
> > +    s = DO_UPCAST(VhostVDPAState, nc, nc);
> > +
> > +    if (has_fd) {
> > +        vdpa_device_fd = monitor_fd_param(cur_mon, fd, &err);
> > +    } else{
> > +        vdpa_device_fd = open(vhostdev, O_RDWR);
> > +    }
> > +
> > +    if (vdpa_device_fd == -1) {
> > +        return -errno;
> > +    }
> > +    s->vhost_vdpa.device_fd = vdpa_device_fd;
> > +    ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa);
> > +    assert(s->vhost_net);
> > +
> > +    if (ret) {
> > +        if (has_fd) {
> > +            close(vdpa_device_fd);
> > +        }
> > +    }
> > +    return ret;
> > +}
> > +
> > +static int net_vhost_check_net(void *opaque, QemuOpts *opts, Error **errp)
> > +{
> > +    const char *name = opaque;
> > +    const char *driver, *netdev;
> > +
> > +    driver = qemu_opt_get(opts, "driver");
> > +    netdev = qemu_opt_get(opts, "netdev");
> > +    if (!driver || !netdev) {
> > +        return 0;
> > +    }
> > +
> > +    if (strcmp(netdev, name) == 0 &&
> > +        !g_str_has_prefix(driver, "virtio-net-")) {
> > +        error_setg(errp, "vhost-vdpa requires frontend driver virtio-net-*");
> > +        return -1;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
> > +                        NetClientState *peer, Error **errp)
> > +{
> > +    const NetdevVhostVDPAOptions *opts;
> > +
> > +    assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
> > +    opts = &netdev->u.vhost_vdpa;
> > +    /* verify net frontend */
> > +    if (qemu_opts_foreach(qemu_find_opts("device"), net_vhost_check_net,
> > +                          (char *)name, errp)) {
> > +        return -1;
> > +    }
> > +    return net_vhost_vdpa_init(peer, "vhost_vdpa", name, opts->vhostdev,
> > +                    opts->has_fd, opts->fd);
> > +}
> > diff --git a/qapi/net.json b/qapi/net.json
> > index cebb1b52e3..37507ce9ba 100644
> > --- a/qapi/net.json
> > +++ b/qapi/net.json
> > @@ -428,6 +428,27 @@
> >       '*vhostforce':    'bool',
> >       '*queues':        'int' } }
> >
> > +##
> > +# @NetdevVhostVDPAOptions:
> > +#
> > +# Vhost-vdpa network backend
> > +#
> > +# @vhostdev: name of a vdpa dev path in sysfs
> > +#            (default path:/dev/vhost-vdpa-$ID)
> > +#
> > +# @fd: file descriptor of an already opened vdpa device
> > +#
> > +# @queues: number of queues to be created for multiqueue vhost-vdpa
> > +#          (default: 1)
> > +#
> > +# Since: 5.1
> > +##
> > +{ 'struct': 'NetdevVhostVDPAOptions',
> > +  'data': {
> > +    '*vhostdev':     'str',
> > +    '*fd':           'str',
> > +    '*queues':       'int' } }
> > +
> >   ##
> >   # @NetClientDriver:
> >   #
> > @@ -437,7 +458,7 @@
> >   ##
> >   { 'enum': 'NetClientDriver',
> >     'data': [ 'none', 'nic', 'user', 'tap', 'l2tpv3', 'socket', 'vde',
> > -            'bridge', 'hubport', 'netmap', 'vhost-user' ] }
> > +            'bridge', 'hubport', 'netmap', 'vhost-user', 'vhost-vdpa' ] }
> >
> >   ##
> >   # @Netdev:
> > @@ -465,7 +486,8 @@
> >       'bridge':   'NetdevBridgeOptions',
> >       'hubport':  'NetdevHubPortOptions',
> >       'netmap':   'NetdevNetmapOptions',
> > -    'vhost-user': 'NetdevVhostUserOptions' } }
> > +    'vhost-user': 'NetdevVhostUserOptions',
> > +    'vhost-vdpa': 'NetdevVhostVDPAOptions' } }
> >
> >   ##
> >   # @NetLegacy:
>




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-06-03  6:43   ` Jason Wang
@ 2020-06-03  8:20     ` Cindy Lu
  0 siblings, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-03  8:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: rdunlap, Michael Tsirkin, mhabets, qemu-devel, rob.miller,
	saugatm, Markus Armbruster, hch, Eugenio Perez Martin, jgg,
	Shahaf Shuler, kevin.tian, parav, vmireyno, Liang, Cunming,
	gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, Wang, Zhihong,
	Maxime Coquelin, Tiwei Bie, Ariel Adam, Cornelia Huck, hanand,
	Zhu, Lingshan

On Wed, Jun 3, 2020 at 2:43 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/5/29 10:06 PM, Cindy Lu wrote:
> > From: Tiwei Bie<tiwei.bie@intel.com>
> >
> > Currently we have 2 types of vhost backends in QEMU: vhost kernel and
> > vhost-user. The above patch provides a generic device for vDPA purpose,
> > this vDPA device exposes to user space a non-vendor-specific configuration
> > interface for setting up a vhost HW accelerator, this patch set introduces
> > a third vhost backend called vhost-vdpa based on the vDPA interface.
> >
> > Vhost-vdpa usage:
> >
> >    qemu-system-x86_64 -cpu host -enable-kvm \
> >      ......
> >    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
> >    -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \
> >
> > Co-Authored-By: Lingshan zhu<lingshan.zhu@intel.com>
> > Signed-off-by: Cindy Lu<lulu@redhat.com>
> > ---
>
>
> Btw, I don't see how vhost_set/get_config() is connected to
> virtio_net_set/get_config().
>
Sure, I will add this part.
> Thanks
>
>




* Re: [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client
  2020-06-03  8:19     ` Cindy Lu
@ 2020-06-03  8:43       ` Jason Wang
  2020-06-03  8:49         ` Cindy Lu
  0 siblings, 1 reply; 41+ messages in thread
From: Jason Wang @ 2020-06-03  8:43 UTC (permalink / raw)
  To: Cindy Lu
  Cc: rdunlap, Michael Tsirkin, mhabets, qemu-devel, rob.miller,
	saugatm, Markus Armbruster, hch, Eugenio Perez Martin, jgg,
	Shahaf Shuler, kevin.tian, parav, vmireyno, Liang, Cunming,
	gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, Wang, Zhihong,
	Maxime Coquelin, Tiwei Bie, Ariel Adam, Cornelia Huck, hanand,
	Zhu, Lingshan


On 2020/6/3 4:19 PM, Cindy Lu wrote:
>>> +static void vhost_vdpa_cleanup(NetClientState *nc)
>>> +{
>>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
>>> +
>>> +    if (s->vhost_net) {
>>> +        vhost_net_cleanup(s->vhost_net);
>>> +        g_free(s->vhost_net);
>>> +        s->vhost_net = NULL;
>>> +    }
>>> +
>>> +    qemu_purge_queued_packets(nc);
>> Why is this needed?
>>
>> Thanks
>>
> This is to purge the packets left in the queue when the vdpa device is
> removed. I will double-check this part.


Note that QEMU currently has no software fallback driver for vDPA (we
will probably need one in the future).

So we can't fall back to userspace, which means packets can never be
queued by QEMU.

Thanks

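To make the consequence concrete, a cleanup path without the purge step
could look roughly like the standalone sketch below. The sketch_ types and
functions are hypothetical stand-ins (plain malloc/free in place of the
glib allocators, a fake fd field in place of the real backend state); they
are not the QEMU definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vhost_net. */
struct sketch_vhost_net {
    int device_fd;
};

/* Hypothetical stand-in for VhostVDPAState. */
struct sketch_vdpa_state {
    struct sketch_vhost_net *vhost_net;
};

/* Plays the role of vhost_net_cleanup(): release backend resources. */
static void sketch_vhost_net_cleanup(struct sketch_vhost_net *net)
{
    net->device_fd = -1;
}

/*
 * Cleanup without a purge step: since there is no userspace fallback,
 * no packet is ever queued on the QEMU side, so there is nothing to drop.
 */
static void sketch_vhost_vdpa_cleanup(struct sketch_vdpa_state *s)
{
    if (s->vhost_net) {
        sketch_vhost_net_cleanup(s->vhost_net);
        free(s->vhost_net);
        s->vhost_net = NULL;   /* makes a second cleanup call a no-op */
    }
}
```

If a software fallback is added later, the purge would probably need to
come back together with it.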



* Re: [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client
  2020-06-03  8:43       ` Jason Wang
@ 2020-06-03  8:49         ` Cindy Lu
  0 siblings, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-03  8:49 UTC (permalink / raw)
  To: Jason Wang
  Cc: rdunlap, Michael Tsirkin, mhabets, qemu-devel, rob.miller,
	saugatm, Markus Armbruster, hch, Eugenio Perez Martin, jgg,
	Shahaf Shuler, kevin.tian, parav, vmireyno, Liang, Cunming,
	gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, Wang, Zhihong,
	Maxime Coquelin, Tiwei Bie, Ariel Adam, Cornelia Huck, hanand,
	Zhu, Lingshan

Hi Jason,


On Wed, Jun 3, 2020 at 4:43 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2020/6/3 4:19 PM, Cindy Lu wrote:
> >>> +static void vhost_vdpa_cleanup(NetClientState *nc)
> >>> +{
> >>> +    VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
> >>> +
> >>> +    if (s->vhost_net) {
> >>> +        vhost_net_cleanup(s->vhost_net);
> >>> +        g_free(s->vhost_net);
> >>> +        s->vhost_net = NULL;
> >>> +    }
> >>> +
> >>> +    qemu_purge_queued_packets(nc);
> >> Why is this needed?
> >>
> >> Thanks
> >>
> > This is to purge the packets left in the queue when the vdpa device is
> > removed. I will double-check this part.
>
>
> Note that QEMU currently has no software fallback driver for vDPA (we
> will probably need one in the future).
>
> So we can't fall back to userspace, which means packets can never be
> queued by QEMU.
>
Got it, thanks Jason. I will remove this part.

> Thanks
>




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-05-29 14:06 ` [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend Cindy Lu
                     ` (2 preceding siblings ...)
  2020-06-03  6:43   ` Jason Wang
@ 2020-06-04 10:39   ` Eugenio Perez Martin
  2020-06-04 11:33     ` Michael S. Tsirkin
  2020-06-08 20:14   ` Eric Blake
  4 siblings, 1 reply; 41+ messages in thread
From: Eugenio Perez Martin @ 2020-06-04 10:39 UTC (permalink / raw)
  To: Cindy Lu
  Cc: Cornelia Huck, Michael Tsirkin, Jason Wang, qemu-devel, hanand,
	Rob Miller, saugatm, Markus Armbruster, hch, jgg, mhabets,
	shahafs, kevin.tian, parav, Vitaly Mireyno, cunming.liang,
	gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, zhihong.wang,
	Tiwei Bie, Ariel Adam, rdunlap, Maxime Coquelin, lingshan.zhu

On Fri, May 29, 2020 at 4:10 PM Cindy Lu <lulu@redhat.com> wrote:
>
> From: Tiwei Bie <tiwei.bie@intel.com>
>
> Currently we have 2 types of vhost backends in QEMU: vhost kernel and
> vhost-user. The above patch provides a generic device for vDPA purpose,
> this vDPA device exposes to user space a non-vendor-specific configuration
> interface for setting up a vhost HW accelerator, this patch set introduces
> a third vhost backend called vhost-vdpa based on the vDPA interface.
>
> Vhost-vdpa usage:
>
>   qemu-system-x86_64 -cpu host -enable-kvm \
>     ......
>   -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
>   -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on
>
> Co-Authored-By: Lingshan zhu <lingshan.zhu@intel.com>
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>  configure                         |  21 ++
>  hw/net/vhost_net-stub.c           |   5 +
>  hw/net/vhost_net.c                |  47 +++-
>  hw/virtio/Makefile.objs           |   1 +
>  hw/virtio/vhost-backend.c         |   5 +
>  hw/virtio/vhost-vdpa.c            | 399 ++++++++++++++++++++++++++++++
>  hw/virtio/vhost.c                 |  37 ++-
>  include/hw/virtio/vhost-backend.h |  10 +-
>  include/hw/virtio/vhost-vdpa.h    |  26 ++
>  include/hw/virtio/vhost.h         |   2 +
>  include/net/vhost_net.h           |   4 +-
>  qemu-options.hx                   |  15 ++
>  12 files changed, 566 insertions(+), 6 deletions(-)
>  create mode 100644 hw/virtio/vhost-vdpa.c
>  create mode 100644 include/hw/virtio/vhost-vdpa.h
>
> diff --git a/configure b/configure
> index 23b5e93752..53679ee57f 100755
> --- a/configure
> +++ b/configure
> @@ -1557,6 +1557,10 @@ for opt do
>    ;;
>    --enable-vhost-user) vhost_user="yes"
>    ;;
> +  --disable-vhost-vdpa) vhost_vdpa="no"
> +  ;;
> +  --enable-vhost-vdpa) vhost_vdpa="yes"
> +  ;;
>    --disable-vhost-kernel) vhost_kernel="no"
>    ;;
>    --enable-vhost-kernel) vhost_kernel="yes"
> @@ -1846,6 +1850,7 @@ disabled with --disable-FEATURE, default is enabled if available:
>    vhost-crypto    vhost-user-crypto backend support
>    vhost-kernel    vhost kernel backend support
>    vhost-user      vhost-user backend support
> +  vhost-vdpa      vhost-vdpa kernel backend support
>    spice           spice
>    rbd             rados block device (rbd)
>    libiscsi        iscsi support
> @@ -2336,6 +2341,10 @@ test "$vhost_user" = "" && vhost_user=yes
>  if test "$vhost_user" = "yes" && test "$mingw32" = "yes"; then
>    error_exit "vhost-user isn't available on win32"
>  fi
> +test "$vhost_vdpa" = "" && vhost_vdpa=$linux
> +if test "$vhost_vdpa" = "yes" && test "$linux" != "yes"; then
> +  error_exit "vhost-vdpa is only available on Linux"
> +fi
>  test "$vhost_kernel" = "" && vhost_kernel=$linux
>  if test "$vhost_kernel" = "yes" && test "$linux" != "yes"; then
>    error_exit "vhost-kernel is only available on Linux"
> @@ -2364,6 +2373,11 @@ test "$vhost_user_fs" = "" && vhost_user_fs=$vhost_user
>  if test "$vhost_user_fs" = "yes" && test "$vhost_user" = "no"; then
>    error_exit "--enable-vhost-user-fs requires --enable-vhost-user"
>  fi
> +#vhost-vdpa backends
> +test "$vhost_net_vdpa" = "" && vhost_net_vdpa=$vhost_vdpa
> +if test "$vhost_net_vdpa" = "yes" && test "$vhost_vdpa" = "no"; then
> +  error_exit "--enable-vhost-net-vdpa requires --enable-vhost-vdpa"
> +fi
>
>  # OR the vhost-kernel and vhost-user values for simplicity
>  if test "$vhost_net" = ""; then
> @@ -6673,6 +6687,7 @@ echo "vhost-scsi support $vhost_scsi"
>  echo "vhost-vsock support $vhost_vsock"
>  echo "vhost-user support $vhost_user"
>  echo "vhost-user-fs support $vhost_user_fs"
> +echo "vhost-vdpa support $vhost_vdpa"
>  echo "Trace backends    $trace_backends"
>  if have_backend "simple"; then
>  echo "Trace output file $trace_file-<pid>"
> @@ -7170,6 +7185,9 @@ fi
>  if test "$vhost_net_user" = "yes" ; then
>    echo "CONFIG_VHOST_NET_USER=y" >> $config_host_mak
>  fi
> +if test "$vhost_net_vdpa" = "yes" ; then
> +  echo "CONFIG_VHOST_NET_VDPA=y" >> $config_host_mak
> +fi
>  if test "$vhost_crypto" = "yes" ; then
>    echo "CONFIG_VHOST_CRYPTO=y" >> $config_host_mak
>  fi
> @@ -7182,6 +7200,9 @@ fi
>  if test "$vhost_user" = "yes" ; then
>    echo "CONFIG_VHOST_USER=y" >> $config_host_mak
>  fi
> +if test "$vhost_vdpa" = "yes" ; then
> +  echo "CONFIG_VHOST_VDPA=y" >> $config_host_mak
> +fi
>  if test "$vhost_user_fs" = "yes" ; then
>    echo "CONFIG_VHOST_USER_FS=y" >> $config_host_mak
>  fi
> diff --git a/hw/net/vhost_net-stub.c b/hw/net/vhost_net-stub.c
> index 43e93e1a9a..ab77a92a7d 100644
> --- a/hw/net/vhost_net-stub.c
> +++ b/hw/net/vhost_net-stub.c
> @@ -94,3 +94,8 @@ int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
>  {
>      return 0;
>  }
> +int vhost_net_get_device_id(struct vhost_net *net, uint32_t * device_id)
> +{
> +    return 0;
> +}
> +
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index e2bc7de2eb..25045cff59 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -17,8 +17,10 @@
>  #include "net/net.h"
>  #include "net/tap.h"
>  #include "net/vhost-user.h"
> +#include "net/vhost-vdpa.h"
>
>  #include "standard-headers/linux/vhost_types.h"
> +#include "linux-headers/linux/vhost.h"
>  #include "hw/virtio/virtio-net.h"
>  #include "net/vhost_net.h"
>  #include "qemu/error-report.h"
> @@ -85,6 +87,30 @@ static const int user_feature_bits[] = {
>      VHOST_INVALID_FEATURE_BIT
>  };
>
> +static const int vdpa_feature_bits[] = {
> +    VIRTIO_F_NOTIFY_ON_EMPTY,
> +    VIRTIO_RING_F_INDIRECT_DESC,
> +    VIRTIO_RING_F_EVENT_IDX,
> +    VIRTIO_F_ANY_LAYOUT,
> +    VIRTIO_F_VERSION_1,
> +    VIRTIO_NET_F_CSUM,
> +    VIRTIO_NET_F_GUEST_CSUM,
> +    VIRTIO_NET_F_GSO,
> +    VIRTIO_NET_F_GUEST_TSO4,
> +    VIRTIO_NET_F_GUEST_TSO6,
> +    VIRTIO_NET_F_GUEST_ECN,
> +    VIRTIO_NET_F_GUEST_UFO,
> +    VIRTIO_NET_F_HOST_TSO4,
> +    VIRTIO_NET_F_HOST_TSO6,
> +    VIRTIO_NET_F_HOST_ECN,
> +    VIRTIO_NET_F_HOST_UFO,
> +    VIRTIO_NET_F_MRG_RXBUF,
> +    VIRTIO_NET_F_MTU,
> +    VIRTIO_F_IOMMU_PLATFORM,
> +    VIRTIO_F_RING_PACKED,
> +    VIRTIO_NET_F_GUEST_ANNOUNCE,
> +    VHOST_INVALID_FEATURE_BIT
> +};
>  static const int *vhost_net_get_feature_bits(struct vhost_net *net)
>  {
>      const int *feature_bits = 0;
> @@ -96,6 +122,9 @@ static const int *vhost_net_get_feature_bits(struct vhost_net *net)
>      case NET_CLIENT_DRIVER_VHOST_USER:
>          feature_bits = user_feature_bits;
>          break;
> +    case NET_CLIENT_DRIVER_VHOST_VDPA:
> +        feature_bits = vdpa_feature_bits;
> +        break;
>      default:
>          error_report("Feature bits not defined for this type: %d",
>                  net->nc->info->type);
> @@ -110,7 +139,10 @@ uint64_t vhost_net_get_features(struct vhost_net *net, uint64_t features)
>      return vhost_get_features(&net->dev, vhost_net_get_feature_bits(net),
>              features);
>  }
> -
> +int vhost_net_get_device_id(struct vhost_net *net, uint32_t * device_id)
> +{
> +    return vhost_dev_get_device_id(&net->dev, device_id);
> +}
>  void vhost_net_ack_features(struct vhost_net *net, uint64_t features)
>  {
>      net->dev.acked_features = net->dev.backend_features;
> @@ -337,6 +369,11 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>      }
>
>      for (i = 0; i < total_queues; i++) {
>          peer = qemu_get_peer(ncs, i);
> +
> +        if (virtio_queue_enabled(dev, i)) {
> +            vhost_set_vring_ready(peer);
> +        }
> +
>          r = vhost_net_start_one(get_vhost_net(peer), dev);
>
> @@ -433,6 +470,12 @@ VHostNetState *get_vhost_net(NetClientState *nc)
>          vhost_net = vhost_user_get_vhost_net(nc);
>          assert(vhost_net);
>          break;
> +#endif
> +#ifdef CONFIG_VHOST_NET_VDPA
> +    case NET_CLIENT_DRIVER_VHOST_VDPA:
> +        vhost_net = vhost_vdpa_get_vhost_net(nc);
> +        assert(vhost_net);
> +        break;
>  #endif
>      default:
>          break;
> @@ -474,3 +517,5 @@ int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
>
>      return vhost_ops->vhost_net_set_mtu(&net->dev, mtu);
>  }
> +
> +
> diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> index 4e4d39a0a4..6b1b1a5fce 100644
> --- a/hw/virtio/Makefile.objs
> +++ b/hw/virtio/Makefile.objs
> @@ -5,6 +5,7 @@ obj-y += virtio.o
>  obj-$(CONFIG_VHOST) += vhost.o vhost-backend.o
>  common-obj-$(call lnot,$(CONFIG_VHOST)) += vhost-stub.o
>  obj-$(CONFIG_VHOST_USER) += vhost-user.o
> +obj-$(CONFIG_VHOST_VDPA) += vhost-vdpa.o
>
>  common-obj-$(CONFIG_VIRTIO_RNG) += virtio-rng.o
>  common-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> index 42efb4967b..420341e8c5 100644
> --- a/hw/virtio/vhost-backend.c
> +++ b/hw/virtio/vhost-backend.c
> @@ -291,6 +291,11 @@ int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType backend_type)
>      case VHOST_BACKEND_TYPE_USER:
>          dev->vhost_ops = &user_ops;
>          break;
> +#endif
> +#ifdef CONFIG_VHOST_VDPA
> +    case VHOST_BACKEND_TYPE_VDPA:
> +        dev->vhost_ops = &vdpa_ops;
> +        break;
>  #endif
>      default:
>          error_report("Unknown vhost backend type");
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> new file mode 100644
> index 0000000000..2d136a8565
> --- /dev/null
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -0,0 +1,399 @@
> +/*
> + * vhost-vdpa
> + *
> + *  Copyright(c) 2017-2018 Intel Corporation.
> + *  Copyright(c) 2020 Red Hat, Inc.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include <linux/vhost.h>
> +#include <linux/vfio.h>
> +#include <sys/eventfd.h>
> +#include <sys/ioctl.h>
> +#include "hw/virtio/vhost.h"
> +#include "hw/virtio/vhost-backend.h"
> +#include "hw/virtio/virtio-net.h"
> +#include "hw/virtio/vhost-vdpa.h"
> +#include "qemu/main-loop.h"
> +#include <linux/kvm.h>
> +#include "sysemu/kvm.h"
> +
> +
> +static bool vhost_vdpa_listener_skipped_section(MemoryRegionSection *section)
> +{
> +    return (!memory_region_is_ram(section->mr) &&
> +            !memory_region_is_iommu(section->mr)) ||
> +           /*
> +            * Sizing an enabled 64-bit BAR can cause spurious mappings to
> +            * addresses in the upper part of the 64-bit address space.  These
> +            * are never accessed by the CPU and beyond the address width of
> +            * some IOMMU hardware.  TODO: VDPA should tell us the IOMMU width.
> +            */
> +           section->offset_within_address_space & (1ULL << 63);
> +}
> +
> +static int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
> +                              void *vaddr, bool readonly)
> +{
> +    struct vhost_msg_v2 msg;
> +    int fd = v->device_fd;
> +    int ret = 0;
> +
> +    msg.type =  v->msg_type;
> +    msg.iotlb.iova = iova;
> +    msg.iotlb.size = size;
> +    msg.iotlb.uaddr = (uint64_t)vaddr;
> +    msg.iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW;
> +    msg.iotlb.type = VHOST_IOTLB_UPDATE;
> +
> +    if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> +        error_report("failed to write, fd=%d, errno=%d (%s)",
> +            fd, errno, strerror(errno));
> +        return -EIO ;
> +    }
> +
> +    return ret;
> +}
> +
> +static int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova,
> +                                hwaddr size)
> +{
> +    struct vhost_msg_v2 msg;
> +    int fd = v->device_fd;
> +    int ret = 0;
> +
> +    msg.type =  v->msg_type;
> +    msg.iotlb.iova = iova;
> +    msg.iotlb.size = size;
> +    msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
> +
> +    if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
> +        error_report("failed to write, fd=%d, errno=%d (%s)",
> +            fd, errno, strerror(errno));
> +        return -EIO ;
> +    }
> +
> +    return ret;
> +}
> +
> +static void vhost_vdpa_listener_region_add(MemoryListener *listener,
> +                                           MemoryRegionSection *section)
> +{
> +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> +    hwaddr iova;
> +    Int128 llend, llsize;
> +    void *vaddr;
> +    int ret;
> +
> +    if (vhost_vdpa_listener_skipped_section(section)) {
> +        return;
> +    }
> +
> +    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> +                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> +        error_report("%s received unaligned region", __func__);
> +        return;
> +    }
> +
> +    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> +    llend = int128_make64(section->offset_within_address_space);
> +    llend = int128_add(llend, section->size);
> +    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
> +
> +    if (int128_ge(int128_make64(iova), llend)) {
> +        return;
> +    }
> +
> +    memory_region_ref(section->mr);
> +
> +    /* Here we assume that memory_region_is_ram(section->mr)==true */
> +
> +    vaddr = memory_region_get_ram_ptr(section->mr) +
> +            section->offset_within_region +
> +            (iova - section->offset_within_address_space);
> +
> +    llsize = int128_sub(llend, int128_make64(iova));
> +
> +    ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
> +                             vaddr, section->readonly);
> +    if (ret) {
> +        error_report("vhost vdpa map fail!");
> +        if (memory_region_is_ram_device(section->mr)) {
> +            /* Allow unexpected mappings not to be fatal for RAM devices */
> +            error_report("map ram fail!");
> +          return ;
> +        }
> +        goto fail;
> +    }
> +
> +    return;
> +
> +fail:
> +    if (memory_region_is_ram_device(section->mr)) {
> +        error_report("failed to vdpa_dma_map. pci p2p may not work");
> +        return;
> +
> +    }
> +    /*
> +     * On the initfn path, store the first error in the container so we
> +     * can gracefully fail.  Runtime, there's not much we can do other
> +     * than throw a hardware error.
> +     */
> +    error_report("vhost-vdpa: DMA mapping failed, unable to continue");
> +    return;
> +
> +}
> +
> +static void vhost_vdpa_listener_region_del(MemoryListener *listener,
> +                                           MemoryRegionSection *section)
> +{
> +    struct vhost_vdpa *v = container_of(listener, struct vhost_vdpa, listener);
> +    hwaddr iova;
> +    Int128 llend, llsize;
> +    int ret;
> +    bool try_unmap = true;
> +
> +    if (vhost_vdpa_listener_skipped_section(section)) {
> +        return;
> +    }
> +
> +    if (unlikely((section->offset_within_address_space & ~TARGET_PAGE_MASK) !=
> +                 (section->offset_within_region & ~TARGET_PAGE_MASK))) {
> +        error_report("%s received unaligned region", __func__);
> +        return;
> +    }
> +
> +    iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> +    llend = int128_make64(section->offset_within_address_space);
> +    llend = int128_add(llend, section->size);
> +    llend = int128_and(llend, int128_exts64(TARGET_PAGE_MASK));
> +
> +    if (int128_ge(int128_make64(iova), llend)) {
> +        return;
> +    }
> +
> +    llsize = int128_sub(llend, int128_make64(iova));
> +
> +    if (try_unmap) {
> +        ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
> +        if (ret) {
> +            error_report("vhost_vdpa dma unmap error!");
> +        }
> +    }
> +
> +    memory_region_unref(section->mr);
> +}
> +/* Register a new memory listener only to get diffs from qemu;
> + * this helps avoid tricky code in vhost
> + * (e.g. generating diffs of two rbtrees, as usnic did). */
> +static const MemoryListener vhost_vdpa_memory_listener = {
> +    .region_add = vhost_vdpa_listener_region_add,
> +    .region_del = vhost_vdpa_listener_region_del,
> +};
> +
> +static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> +                             void *arg)
> +{
> +    struct vhost_vdpa *v = dev->opaque;
> +    int fd = v->device_fd;
> +
> +    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> +
> +    return ioctl(fd, request, arg);
> +}
> +
> +static void vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> +{
> +    uint8_t s;
> +
> +    if (vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s)) {
> +        return;
> +    }
> +
> +    s |= status;
> +
> +    vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> +}
> +
> +static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque)
> +{
> +    struct vhost_vdpa *v;
> +
> +    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> +
> +    v = opaque;
> +    dev->opaque =  opaque ;
> +
> +    v->listener = vhost_vdpa_memory_listener;
> +    v->msg_type = VHOST_IOTLB_MSG_V2;
> +    memory_listener_register(&v->listener, &address_space_memory);
> +
> +    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> +                               VIRTIO_CONFIG_S_DRIVER);
> +
> +    return 0;
> +}
> +
> +static int vhost_vdpa_cleanup(struct vhost_dev *dev)
> +{
> +    struct vhost_vdpa *v;
> +    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> +
> +    v = dev->opaque;
> +    memory_listener_unregister(&v->listener);
> +
> +    dev->opaque = NULL;
> +    return 0;
> +}
> +
> +static int vhost_vdpa_memslots_limit(struct vhost_dev *dev)
> +{
> +    return INT_MAX;
> +}
> +
> +static int vhost_vdpa_set_mem_table(struct vhost_dev *dev,
> +                                    struct vhost_memory *mem)
> +{
> +
> +    if (mem->padding) {
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +static int vhost_vdpa_set_features(struct vhost_dev *dev,
> +                                   uint64_t features)
> +{
> +    int ret = vhost_vdpa_call(dev, VHOST_SET_FEATURES, &features);
> +    uint8_t status = 0;
> +
> +    if (ret) {
> +        return ret;
> +    }
> +    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK);
> +    vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &status);
> +
> +    return !(status & VIRTIO_CONFIG_S_FEATURES_OK);
> +}
> +
> +int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> +                                   uint32_t *device_id)
> +{
> +    return vhost_vdpa_call(dev, VHOST_VDPA_GET_DEVICE_ID, device_id);
> +}
> +
> +static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> +{
> +    uint8_t status = 0;
> +
> +    return vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> +}
> +
> +static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
> +{
> +    assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
> +
> +    return idx - dev->vq_index;
> +}
> +
> +static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev)
> +{
> +    int i;
> +    for (i = 0; i < dev->nvqs; ++i) {
> +        struct vhost_vring_state state = {
> +            .index = dev->vq_index + i,
> +            .num = 1,
> +        };
> +        vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
> +    }
> +    return 0;
> +}
> +
> +static int vhost_vdpa_set_config(struct vhost_dev *dev, const uint8_t *data,
> +                                   uint32_t offset, uint32_t size,
> +                                   uint32_t flags)
> +{
> +    struct vhost_vdpa_config config;
> +    int ret;
> +    if ((size > VHOST_VDPA_MAX_CONFIG_SIZE) || (data == NULL)) {

VHOST_VDPA_MAX_CONFIG_SIZE is currently undefined.

If we want to maintain this as a stack allocation (as proposed in
https://www.mail-archive.com/qemu-devel@nongnu.org/msg701744.html) I
think that the best option is to decide which is the maximum value buf
can hold, and set it in vhost_vdpa_config.buf declaration.

In Linux v5.7 the only user I can see is IFCVF, which expects a struct
virtio_net_config there. Can we assume that is the largest config we
will need to store there through qemu? The limit could still be raised
in future versions if needed, couldn't it?

If it could be much bigger at this moment, we should go back to heap allocation.
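
For reference, here is a minimal sketch of what the heap-allocated variant
could look like. It assumes the config message ends in a flexible array
member; struct vdpa_config_msg and vdpa_config_msg_new() are illustrative
names I made up for this mail, not code from the patch or the kernel
headers. Unlike the posted patch, it also passes the caller's offset through
instead of forcing it to 0.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for the config message; not the real uapi struct. */
struct vdpa_config_msg {
    uint32_t off;
    uint32_t len;
    uint8_t buf[];   /* flexible array member, sized at allocation time */
};

/*
 * Build a config message on the heap, sized to the payload.
 * Returns NULL on bad input or allocation failure; the caller
 * releases the result with free().
 */
static struct vdpa_config_msg *vdpa_config_msg_new(const uint8_t *data,
                                                   uint32_t offset,
                                                   uint32_t size)
{
    struct vdpa_config_msg *msg;

    if (data == NULL || size == 0) {
        return NULL;
    }

    /* Header plus exactly 'size' payload bytes, zero-initialised. */
    msg = calloc(1, sizeof(*msg) + size);
    if (msg == NULL) {
        return NULL;
    }

    msg->off = offset;
    msg->len = size;
    memcpy(msg->buf, data, size);
    return msg;
}
```

With this shape no compile-time maximum is needed, at the cost of one
allocation per config write.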

Any thoughts?

Thanks!


> +        return -1;
> +    }
> +    memset(&config, 0, sizeof(struct vhost_vdpa_config));
> +    config.off = 0;
> +    config.len = size;
> +    memcpy(&config.buf, data, size);
> +    ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_CONFIG, &config);
> +    return ret;
> +}
> +
> +static int vhost_vdpa_get_config(struct vhost_dev *dev, uint8_t *config,
> +                                   uint32_t config_len)
> +{
> +    struct vhost_vdpa_config v_config;
> +    int ret;
> +
> +    memset(&v_config, 0, sizeof(struct vhost_vdpa_config));
> +    if (config == NULL) {
> +        return -1;
> +    }
> +    ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_CONFIG, &v_config);
> +    if ((v_config.len > config_len) || (v_config.len == 0)) {
> +        return -EINVAL;
> +    }
> +    memcpy(config, &v_config.buf, config_len);
> +    return ret;
> + }
> +
> +static int vhost_vdpa_set_state(struct vhost_dev *dev, bool started)
> +{
> +    if (started) {
> +        uint8_t status = 0;
> +
> +        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> +        vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &status);
> +
> +        return !(status & VIRTIO_CONFIG_S_DRIVER_OK);
> +    } else {
> +        vhost_vdpa_reset_device(dev);
> +        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> +                                   VIRTIO_CONFIG_S_DRIVER);
> +        return 0;
> +    }
> +}
> +
> +const VhostOps vdpa_ops = {
> +        .backend_type = VHOST_BACKEND_TYPE_VDPA,
> +        .vhost_backend_init = vhost_vdpa_init,
> +        .vhost_backend_cleanup = vhost_vdpa_cleanup,
> +        .vhost_set_log_base = vhost_kernel_set_log_base,
> +        .vhost_set_vring_addr = vhost_kernel_set_vring_addr,
> +        .vhost_set_vring_num = vhost_kernel_set_vring_num,
> +        .vhost_set_vring_base = vhost_kernel_set_vring_base,
> +        .vhost_get_vring_base = vhost_kernel_get_vring_base,
> +        .vhost_set_vring_kick = vhost_kernel_set_vring_kick,
> +        .vhost_set_vring_call = vhost_kernel_set_vring_call,
> +        .vhost_get_features = vhost_kernel_get_features,
> +        .vhost_set_owner = vhost_kernel_set_owner,
> +        .vhost_set_vring_endian = NULL,
> +        .vhost_backend_memslots_limit = vhost_vdpa_memslots_limit,
> +        .vhost_set_mem_table = vhost_vdpa_set_mem_table,
> +        .vhost_set_features = vhost_vdpa_set_features,
> +        .vhost_reset_device = vhost_vdpa_reset_device,
> +        .vhost_get_vq_index = vhost_vdpa_get_vq_index,
> +        .vhost_set_vring_ready = vhost_vdpa_set_vring_ready,
> +        .vhost_get_config  = vhost_vdpa_get_config,
> +        .vhost_set_config = vhost_vdpa_set_config,
> +        .vhost_requires_shm_log = NULL,
> +        .vhost_migration_done = NULL,
> +        .vhost_backend_can_merge = NULL,
> +        .vhost_net_set_mtu = NULL,
> +        .vhost_set_iotlb_callback = NULL,
> +        .vhost_send_device_iotlb_msg = NULL,
> +        .vhost_set_state = vhost_vdpa_set_state,
> +        .vhost_get_device_id = vhost_vdpa_get_device_id,
> +};
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 01ebe12f28..b97aa02a4c 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -756,6 +756,12 @@ static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
>          .log_guest_addr = vq->used_phys,
>          .flags = enable_log ? (1 << VHOST_VRING_F_LOG) : 0,
>      };
> +    /* vDPA needs to pass the physical addresses to the hardware here */
> +    if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA) {
> +        addr.desc_user_addr = (uint64_t)(unsigned long)vq->desc_phys;
> +        addr.avail_user_addr = (uint64_t)(unsigned long)vq->avail_phys;
> +        addr.used_user_addr = (uint64_t)(unsigned long)vq->used_phys;
> +    }
>      int r = dev->vhost_ops->vhost_set_vring_addr(dev, &addr);
>      if (r < 0) {
>          VHOST_OPS_DEBUG("vhost_set_vring_addr failed");
> @@ -1506,6 +1512,14 @@ int vhost_dev_set_config(struct vhost_dev *hdev, const uint8_t *data,
>      return -1;
>  }
>
> +int vhost_dev_get_device_id(struct vhost_dev *hdev, uint32_t *device_id)
> +{
> +    assert(hdev->vhost_ops);
> +    if (hdev->vhost_ops->vhost_get_device_id) {
> +        return hdev->vhost_ops->vhost_get_device_id(hdev, device_id);
> +    }
> +    return -1;
> +}
>  void vhost_dev_set_config_notifier(struct vhost_dev *hdev,
>                                     const VhostDevConfigOps *ops)
>  {
> @@ -1661,7 +1675,13 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>          }
>      }
>
> -    if (vhost_dev_has_iommu(hdev)) {
> +    r = vhost_set_state(hdev, true);
> +    if (r) {
> +        goto fail_log;
> +    }
> +
> +    if (vhost_dev_has_iommu(hdev) &&
> +        hdev->vhost_ops->vhost_set_iotlb_callback) {
>          hdev->vhost_ops->vhost_set_iotlb_callback(hdev, true);
>
>          /* Update used ring information for IOTLB to work correctly,
> @@ -1697,6 +1717,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>      /* should only be called after backend is connected */
>      assert(hdev->vhost_ops);
>
> +    vhost_set_state(hdev, false);
> +
>      for (i = 0; i < hdev->nvqs; ++i) {
>          vhost_virtqueue_stop(hdev,
>                               vdev,
> @@ -1705,7 +1727,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>      }
>
>      if (vhost_dev_has_iommu(hdev)) {
> -        hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
> +        if (hdev->vhost_ops->vhost_set_iotlb_callback) {
> +            hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false);
> +        }
>          memory_listener_unregister(&hdev->iommu_listener);
>      }
>      vhost_log_put(hdev, true);
> @@ -1722,3 +1746,12 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>
>      return -1;
>  }
> +
> +int vhost_set_state(struct vhost_dev *hdev, bool started)
> +{
> +    if (hdev->vhost_ops->vhost_set_state) {
> +        return hdev->vhost_ops->vhost_set_state(hdev, started);
> +    }
> +
> +    return 0;
> +}
> diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> index 300b59c172..1ebe3785cf 100644
> --- a/include/hw/virtio/vhost-backend.h
> +++ b/include/hw/virtio/vhost-backend.h
> @@ -17,7 +17,8 @@ typedef enum VhostBackendType {
>      VHOST_BACKEND_TYPE_NONE = 0,
>      VHOST_BACKEND_TYPE_KERNEL = 1,
>      VHOST_BACKEND_TYPE_USER = 2,
> -    VHOST_BACKEND_TYPE_MAX = 3,
> +    VHOST_BACKEND_TYPE_VDPA = 3,
> +    VHOST_BACKEND_TYPE_MAX = 4,
>  } VhostBackendType;
>
>  typedef enum VhostSetConfigType {
> @@ -77,6 +78,7 @@ typedef int (*vhost_reset_device_op)(struct vhost_dev *dev);
>  typedef int (*vhost_get_vq_index_op)(struct vhost_dev *dev, int idx);
>  typedef int (*vhost_set_vring_enable_op)(struct vhost_dev *dev,
>                                           int enable);
> +typedef int (*vhost_set_vring_ready_op)(struct vhost_dev *dev);
>  typedef bool (*vhost_requires_shm_log_op)(struct vhost_dev *dev);
>  typedef int (*vhost_migration_done_op)(struct vhost_dev *dev,
>                                         char *mac_addr);
> @@ -112,6 +114,8 @@ typedef int (*vhost_get_inflight_fd_op)(struct vhost_dev *dev,
>  typedef int (*vhost_set_inflight_fd_op)(struct vhost_dev *dev,
>                                          struct vhost_inflight *inflight);
>
> +typedef int (*vhost_set_state_op)(struct vhost_dev *dev, bool started);
> +typedef int (*vhost_get_device_id_op)(struct vhost_dev *dev, uint32_t *dev_id);
>  typedef struct VhostOps {
>      VhostBackendType backend_type;
>      vhost_backend_init vhost_backend_init;
> @@ -138,6 +142,7 @@ typedef struct VhostOps {
>      vhost_reset_device_op vhost_reset_device;
>      vhost_get_vq_index_op vhost_get_vq_index;
>      vhost_set_vring_enable_op vhost_set_vring_enable;
> +    vhost_set_vring_ready_op vhost_set_vring_ready;
>      vhost_requires_shm_log_op vhost_requires_shm_log;
>      vhost_migration_done_op vhost_migration_done;
>      vhost_backend_can_merge_op vhost_backend_can_merge;
> @@ -152,9 +157,12 @@ typedef struct VhostOps {
>      vhost_backend_mem_section_filter_op vhost_backend_mem_section_filter;
>      vhost_get_inflight_fd_op vhost_get_inflight_fd;
>      vhost_set_inflight_fd_op vhost_set_inflight_fd;
> +    vhost_set_state_op vhost_set_state;
> +    vhost_get_device_id_op vhost_get_device_id;
>  } VhostOps;
>
>  extern const VhostOps user_ops;
> +extern const VhostOps vdpa_ops;
>
>  int vhost_set_backend_type(struct vhost_dev *dev,
>                             VhostBackendType backend_type);
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> new file mode 100644
> index 0000000000..6455663388
> --- /dev/null
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -0,0 +1,26 @@
> +/*
> + * vhost-vdpa.h
> + *
> + * Copyright(c) 2017-2018 Intel Corporation.
> + * Copyright(c) 2020 Red Hat, Inc.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef HW_VIRTIO_VHOST_VDPA_H
> +#define HW_VIRTIO_VHOST_VDPA_H
> +
> +#include "hw/virtio/virtio.h"
> +
> +typedef struct vhost_vdpa {
> +    int device_fd;
> +    uint32_t msg_type;
> +    MemoryListener listener;
> +} VhostVDPA;
> +
> +extern AddressSpace address_space_memory;
> +extern int vhost_vdpa_get_device_id(struct vhost_dev *dev,
> +                                   uint32_t *device_id);
> +#endif
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 085450c6f8..b682545f51 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -124,6 +124,7 @@ int vhost_dev_get_config(struct vhost_dev *dev, uint8_t *config,
>                           uint32_t config_len);
>  int vhost_dev_set_config(struct vhost_dev *dev, const uint8_t *data,
>                           uint32_t offset, uint32_t size, uint32_t flags);
> +int vhost_dev_get_device_id(struct vhost_dev *hdev, uint32_t *device_id);
>  /* notifier callback in case vhost device config space changed
>   */
>  void vhost_dev_set_config_notifier(struct vhost_dev *dev,
> @@ -137,4 +138,5 @@ int vhost_dev_set_inflight(struct vhost_dev *dev,
>                             struct vhost_inflight *inflight);
>  int vhost_dev_get_inflight(struct vhost_dev *dev, uint16_t queue_size,
>                             struct vhost_inflight *inflight);
> +int vhost_set_state(struct vhost_dev *dev, bool started);
>  #endif
> diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
> index 8a6f208189..56e67fe164 100644
> --- a/include/net/vhost_net.h
> +++ b/include/net/vhost_net.h
> @@ -40,5 +40,5 @@ int vhost_set_vring_ready(NetClientState *nc);
>  uint64_t vhost_net_get_acked_features(VHostNetState *net);
>
>  int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu);
> -
> -#endif
> +int vhost_net_get_device_id(struct vhost_net *net, uint32_t *device_id);
> +#endif
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 292d4e7c0c..c19e10ce9c 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -2409,6 +2409,10 @@ DEF("netdev", HAS_ARG, QEMU_OPTION_netdev,
>  #ifdef CONFIG_POSIX
>      "-netdev vhost-user,id=str,chardev=dev[,vhostforce=on|off]\n"
>      "                configure a vhost-user network, backed by a chardev 'dev'\n"
> +#endif
> +#ifdef CONFIG_POSIX
> +    "-netdev vhost-vdpa,id=str,vhostdev=/path/to/dev\n"
> +    "                configure a vhost-vdpa network,Establish a vhost-vdpa netdev\n"
>  #endif
>      "-netdev hubport,id=str,hubid=n[,netdev=nd]\n"
>      "                configure a hub port on the hub with ID 'n'\n", QEMU_ARCH_ALL)
> @@ -2428,6 +2432,9 @@ DEF("nic", HAS_ARG, QEMU_OPTION_nic,
>  #endif
>  #ifdef CONFIG_POSIX
>      "vhost-user|"
> +#endif
> +#ifdef CONFIG_POSIX
> +    "vhost-vdpa|"
>  #endif
>      "socket][,option][,...][mac=macaddr]\n"
>      "                initialize an on-board / default host NIC (using MAC address\n"
> @@ -2896,6 +2903,14 @@ SRST
>      hubport to another netdev with ID nd by using the ``netdev=nd``
>      option.
>
> +``-netdev vhost-vdpa,vhostdev=/path/to/dev``
> +    Establish a vhost-vdpa netdev.
> +
> +    A vDPA device uses a datapath that complies with the virtio
> +    specification, together with a vendor-specific control path.
> +    vDPA devices can be either physically located on the hardware
> +    or emulated by software.
> +
>  ``-net nic[,netdev=nd][,macaddr=mac][,model=type] [,name=name][,addr=addr][,vectors=v]``
>      Legacy option to configure or create an on-board (or machine
>      default) Network Interface Card(NIC) and connect it either to the
> --
> 2.21.1
>




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-06-04 10:39   ` Eugenio Perez Martin
@ 2020-06-04 11:33     ` Michael S. Tsirkin
  2020-06-08 14:46       ` Cindy Lu
  0 siblings, 1 reply; 41+ messages in thread
From: Michael S. Tsirkin @ 2020-06-04 11:33 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Cornelia Huck, Jason Wang, qemu-devel, hanand, Rob Miller,
	saugatm, Cindy Lu, Markus Armbruster, hch, jgg, mhabets, shahafs,
	kevin.tian, parav, Vitaly Mireyno, cunming.liang, gdawar, jiri,
	xiao.w.wang, Stefan Hajnoczi, zhihong.wang, Tiwei Bie,
	Ariel Adam, rdunlap, Maxime Coquelin, lingshan.zhu

On Thu, Jun 04, 2020 at 12:39:34PM +0200, Eugenio Perez Martin wrote:
> > +static int vhost_vdpa_set_config(struct vhost_dev *dev, const uint8_t *data,
> > +                                   uint32_t offset, uint32_t size,
> > +                                   uint32_t flags)
> > +{
> > +    struct vhost_vdpa_config config;
> > +    int ret;
> > +    if ((size > VHOST_VDPA_MAX_CONFIG_SIZE) || (data == NULL)) {
> 
> VHOST_VDPA_MAX_CONFIG_SIZE is currently undefined.
> 
> If we want to maintain this as a stack allocation (as proposed in
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg701744.html) I
> think that the best option is to decide which is the maximum value buf
> can hold, and set it in vhost_vdpa_config.buf declaration.

That depends on device features. qemu has logic to figure out
config size based on that and set config_size accordingly.
Why not reuse it? Sending more should be ok and extra
data just ignored.
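
A minimal sketch of the idea, assuming a table that maps feature bits to the
end offset of the config fields they enable. The names feature_size,
example_sizes and config_size_for, and all the offsets, are hypothetical and
chosen only for illustration; they are loosely modelled on the kind of lookup
qemu does for virtio-net, not copied from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: if any bit in 'flags' is offered, the config
 * space must extend at least to byte offset 'end'. */
struct feature_size {
    uint64_t flags;
    size_t end;
};

/* Illustrative bits and offsets only; not a real device layout. */
static const struct feature_size example_sizes[] = {
    { 1ULL << 5,  6 },   /* hypothetical feature A: fields end at byte 6  */
    { 1ULL << 16, 8 },   /* hypothetical feature B: fields end at byte 8  */
    { 1ULL << 3,  12 },  /* hypothetical feature C: fields end at byte 12 */
    { 0, 0 }             /* terminator */
};

/* Return the largest 'end' among the entries whose feature bits are set. */
static size_t config_size_for(const struct feature_size *tbl, uint64_t features)
{
    size_t size = 0;

    for (; tbl->flags != 0; tbl++) {
        if ((features & tbl->flags) && tbl->end > size) {
            size = tbl->end;
        }
    }
    return size;
}
```

With a helper of this shape the caller could size the buffer from the
negotiated features instead of relying on a fixed maximum.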

-- 
MST




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-06-04 11:33     ` Michael S. Tsirkin
@ 2020-06-08 14:46       ` Cindy Lu
  0 siblings, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-08 14:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, Jason Wang, qemu-devel, hanand, Rob Miller,
	saugatm, Markus Armbruster, hch, Eugenio Perez Martin, jgg,
	mhabets, Shahaf Shuler, kevin.tian, parav, Vitaly Mireyno, Liang,
	Cunming, gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, Wang,
	Zhihong, Tiwei Bie, Ariel Adam, rdunlap, Maxime Coquelin, Zhu,
	Lingshan

On Thu, Jun 4, 2020 at 7:34 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Jun 04, 2020 at 12:39:34PM +0200, Eugenio Perez Martin wrote:
> > > +static int vhost_vdpa_set_config(struct vhost_dev *dev, const uint8_t *data,
> > > +                                   uint32_t offset, uint32_t size,
> > > +                                   uint32_t flags)
> > > +{
> > > +    struct vhost_vdpa_config config;
> > > +    int ret;
> > > +    if ((size > VHOST_VDPA_MAX_CONFIG_SIZE) || (data == NULL)) {
> >
> > VHOST_VDPA_MAX_CONFIG_SIZE is currently undefined.
> >
> > If we want to maintain this as a stack allocation (as proposed in
> > https://www.mail-archive.com/qemu-devel@nongnu.org/msg701744.html) I
> > think that the best option is to decide which is the maximum value buf
> > can hold, and set it in vhost_vdpa_config.buf declaration.
>
> That depends on device features. qemu has logic to figure out
> config size based on that and set config_size accordingly.
> Why not reuse it? Sending more should be ok and extra
> data just ignored.
>
> --
> MST
>
Thanks Michael and Eugenio for your suggestions, I'm rewriting this part.




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-05-29 14:06 ` [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend Cindy Lu
                     ` (3 preceding siblings ...)
  2020-06-04 10:39   ` Eugenio Perez Martin
@ 2020-06-08 20:14   ` Eric Blake
  2020-06-09  3:42     ` Cindy Lu
  2020-06-15 14:44     ` Laurent Vivier
  4 siblings, 2 replies; 41+ messages in thread
From: Eric Blake @ 2020-06-08 20:14 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, hanand, hch, eperezma,
	jgg, shahafs, kevin.tian, parav, vmireyno, cunming.liang, gdawar,
	jiri, xiao.w.wang, stefanha, zhihong.wang, Tiwei Bie, aadam,
	rdunlap, maxime.coquelin, lingshan.zhu

On 5/29/20 9:06 AM, Cindy Lu wrote:
> From: Tiwei Bie <tiwei.bie@intel.com>

The original author is Tiwei Bie...

> 
> Currently we have 2 types of vhost backends in QEMU: vhost kernel and
> vhost-user. The above patch provides a generic device for vDPA purpose,
> this vDPA device exposes to user space a non-vendor-specific configuration
> interface for setting up a vhost HW accelerator, this patch set introduces
> a third vhost backend called vhost-vdpa based on the vDPA interface.
> 
> Vhost-vdpa usage:
> 
>    qemu-system-x86_64 -cpu host -enable-kvm \
>      ......
>    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
>    -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \
> 
> Co-Authored-By: Lingshan zhu <lingshan.zhu@intel.com>
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---

...but there is no S-o-b here.  Also, Co-Authored-By is an unusual tag; 
it's just as easy to spell it Signed-off-by even for co-authors.

[Pardon my delicacy in wording my response; I unfortunately lack enough 
cultural context to know a preferred name or even gender-correct 
pronouns for referring to the authors in shorthand]

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-06-08 20:14   ` Eric Blake
@ 2020-06-09  3:42     ` Cindy Lu
  2020-06-15 14:44     ` Laurent Vivier
  1 sibling, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-09  3:42 UTC (permalink / raw)
  To: Eric Blake
  Cc: Cornelia Huck, Michael Tsirkin, Jason Wang, qemu-devel, hanand,
	Rob Miller, saugatm, Markus Armbruster, hch,
	Eugenio Perez Martin, jgg, mhabets, Shahaf Shuler, kevin.tian,
	parav, Vitaly Mireyno, Liang, Cunming, gdawar, jiri, xiao.w.wang,
	Stefan Hajnoczi, Wang, Zhihong, Tiwei Bie, Ariel Adam, rdunlap,
	Maxime Coquelin, Zhu, Lingshan

On Tue, Jun 9, 2020 at 4:14 AM Eric Blake <eblake@redhat.com> wrote:
>
> On 5/29/20 9:06 AM, Cindy Lu wrote:
> > From: Tiwei Bie <tiwei.bie@intel.com>
>
> The original author is Tiwei Bie...
>
> >
> > Currently we have 2 types of vhost backends in QEMU: vhost kernel and
> > vhost-user. The above patch provides a generic device for vDPA purpose,
> > this vDPA device exposes to user space a non-vendor-specific configuration
> > interface for setting up a vhost HW accelerator, this patch set introduces
> > a third vhost backend called vhost-vdpa based on the vDPA interface.
> >
> > Vhost-vdpa usage:
> >
> >    qemu-system-x86_64 -cpu host -enable-kvm \
> >      ......
> >    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
> >    -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \
> >
> > Co-Authored-By: Lingshan zhu <lingshan.zhu@intel.com>
> > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > ---
>
> ...but there is no S-o-b here.  Also, Co-Authored-By is an unusual tag;
> it's just as easy to spell it Signed-off-by even for co-authors.
>
> [Pardon my delicacy in wording my response; I unfortunately lack enough
> cultural context to know a preferred name or even gender-correct
> pronouns for referring to the authors in shorthand]
>
Thanks Eric for pointing that out :-), I will fix this soon.

Thanks
Cindy

> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org
>




* Re: [RFC v3 1/8] net: introduce qemu_get_peer
  2020-05-29 14:06 ` [RFC v3 1/8] net: introduce qemu_get_peer Cindy Lu
@ 2020-06-11  9:07   ` Laurent Vivier
  2020-06-11 13:12     ` Cindy Lu
  0 siblings, 1 reply; 41+ messages in thread
From: Laurent Vivier @ 2020-06-11  9:07 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, maxime.coquelin, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, hanand, lingshan.zhu

On 29/05/2020 16:06, Cindy Lu wrote:
> This is a small function that can get the peer from given NetClientState and queue_index
> 
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>  include/net/net.h | 1 +
>  net/net.c         | 6 ++++++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/include/net/net.h b/include/net/net.h
> index 39085d9444..e7ef42d62b 100644
> --- a/include/net/net.h
> +++ b/include/net/net.h
> @@ -176,6 +176,7 @@ void hmp_info_network(Monitor *mon, const QDict *qdict);
>  void net_socket_rs_init(SocketReadState *rs,
>                          SocketReadStateFinalize *finalize,
>                          bool vnet_hdr);
> +NetClientState *qemu_get_peer(NetClientState *nc, int queue_index);
>  
>  /* NIC info */
>  
> diff --git a/net/net.c b/net/net.c
> index 38778e831d..599fb61028 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -324,6 +324,12 @@ void *qemu_get_nic_opaque(NetClientState *nc)
>  
>      return nic->opaque;
>  }

To be consistent with the style of the file, you should add a blank line
here.

> +NetClientState *qemu_get_peer(NetClientState *nc, int queue_index)
> +{
> +    assert(nc != NULL);
> +    NetClientState *ncs = nc + queue_index;
> +    return ncs->peer;
> +}
>  
>  static void qemu_cleanup_net_client(NetClientState *nc)
>  {
> 

Thanks,
Laurent




* Re: [RFC v3 1/8] net: introduce qemu_get_peer
  2020-06-11  9:07   ` Laurent Vivier
@ 2020-06-11 13:12     ` Cindy Lu
  0 siblings, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-11 13:12 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: rdunlap, Michael Tsirkin, mhabets, qemu-devel, Rob Miller,
	saugatm, Markus Armbruster, hch, Eugenio Perez Martin, jgg,
	Jason Wang, Shahaf Shuler, kevin.tian, parav, Vitaly Mireyno,
	Liang, Cunming, gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, Wang,
	Zhihong, Maxime Coquelin, Ariel Adam, Cornelia Huck, hanand, Zhu,
	Lingshan

On Thu, Jun 11, 2020 at 5:08 PM Laurent Vivier <lvivier@redhat.com> wrote:
>
> On 29/05/2020 16:06, Cindy Lu wrote:
> > This is a small function that can get the peer from given NetClientState and queue_index
> >
> > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > ---
> >  include/net/net.h | 1 +
> >  net/net.c         | 6 ++++++
> >  2 files changed, 7 insertions(+)
> >
> > diff --git a/include/net/net.h b/include/net/net.h
> > index 39085d9444..e7ef42d62b 100644
> > --- a/include/net/net.h
> > +++ b/include/net/net.h
> > @@ -176,6 +176,7 @@ void hmp_info_network(Monitor *mon, const QDict *qdict);
> >  void net_socket_rs_init(SocketReadState *rs,
> >                          SocketReadStateFinalize *finalize,
> >                          bool vnet_hdr);
> > +NetClientState *qemu_get_peer(NetClientState *nc, int queue_index);
> >
> >  /* NIC info */
> >
> > diff --git a/net/net.c b/net/net.c
> > index 38778e831d..599fb61028 100644
> > --- a/net/net.c
> > +++ b/net/net.c
> > @@ -324,6 +324,12 @@ void *qemu_get_nic_opaque(NetClientState *nc)
> >
> >      return nic->opaque;
> >  }
>
> To be consistent with the style of the file, you should add a blank line
> here.
>
Thanks Laurent, I will fix this
> > +NetClientState *qemu_get_peer(NetClientState *nc, int queue_index)
> > +{
> > +    assert(nc != NULL);
> > +    NetClientState *ncs = nc + queue_index;
> > +    return ncs->peer;
> > +}
> >
> >  static void qemu_cleanup_net_client(NetClientState *nc)
> >  {
> >
>
> Thanks,
> Laurent
>




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-06-08 20:14   ` Eric Blake
  2020-06-09  3:42     ` Cindy Lu
@ 2020-06-15 14:44     ` Laurent Vivier
  2020-06-16  8:52       ` Cindy Lu
  1 sibling, 1 reply; 41+ messages in thread
From: Laurent Vivier @ 2020-06-15 14:44 UTC (permalink / raw)
  To: Eric Blake, Cindy Lu, mst, armbru, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, maxime.coquelin, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	Tiwei Bie, aadam, rdunlap, hanand, lingshan.zhu

On 08/06/2020 22:14, Eric Blake wrote:
> On 5/29/20 9:06 AM, Cindy Lu wrote:
>> From: Tiwei Bie <tiwei.bie@intel.com>
> 
> The original author is Tiwei Bie...
> 
>>
>> Currently we have 2 types of vhost backends in QEMU: vhost kernel and
>> vhost-user. The above patch provides a generic device for vDPA purpose,
>> this vDPA device exposes to user space a non-vendor-specific
>> configuration
>> interface for setting up a vhost HW accelerator, this patch set
>> introduces
>> a third vhost backend called vhost-vdpa based on the vDPA interface.
>>
>> Vhost-vdpa usage:
>>
>>    qemu-system-x86_64 -cpu host -enable-kvm \
>>      ......
>>    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
>>    -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \
>>
>> Co-Authored-By: Lingshan zhu <lingshan.zhu@intel.com>
>> Signed-off-by: Cindy Lu <lulu@redhat.com>
>> ---
> 
> ...but there is no S-o-b here.  Also, Co-Authored-By is an unusual tag;
> it's just as easy to spell it Signed-off-by even for co-authors.
> 

Normally the tag to use in this case is "Co-developed-by".

https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by

"A Co-Developed-by: states that the patch was also created by another
developer along with the original author. This is useful at times when
multiple people work on a single patch. Note, this person also needs to
have a Signed-off-by: line in the patch as well."

Thanks,
Laurent




* Re: [RFC v3 2/8] vhost_net: use the function qemu_get_peer
  2020-05-29 14:06 ` [RFC v3 2/8] vhost_net: use the function qemu_get_peer Cindy Lu
@ 2020-06-16  7:47   ` Laurent Vivier
  0 siblings, 0 replies; 41+ messages in thread
From: Laurent Vivier @ 2020-06-16  7:47 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, maxime.coquelin, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, hanand, lingshan.zhu

On 29/05/2020 16:06, Cindy Lu wrote:
> user the qemu_get_peer to replace the old process
> 
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>  hw/net/vhost_net.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index 6b82803fa7..d1d421e3d9 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -306,7 +306,9 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>      BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(dev)));
>      VirtioBusState *vbus = VIRTIO_BUS(qbus);
>      VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
> +    struct vhost_net *net;
>      int r, e, i;
> +    NetClientState *peer;
>  
>      if (!k->set_guest_notifiers) {
>          error_report("binding does not support guest notifiers");
> @@ -314,9 +316,9 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>      }
>  
>      for (i = 0; i < total_queues; i++) {
> -        struct vhost_net *net;
>  
> -        net = get_vhost_net(ncs[i].peer);
> +        peer = qemu_get_peer(ncs, i);
> +        net = get_vhost_net(peer);
>          vhost_net_set_vq_index(net, i * 2);
>  
>          /* Suppress the masking guest notifiers on vhost user
> @@ -335,7 +337,8 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>      }
>  
>      for (i = 0; i < total_queues; i++) {
> -        r = vhost_net_start_one(get_vhost_net(ncs[i].peer), dev);
> +        peer = qemu_get_peer(ncs, i);
> +        r = vhost_net_start_one(get_vhost_net(peer), dev);
>  
>          if (r < 0) {
>              goto err_start;
> @@ -343,7 +346,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>  
>          if (ncs[i].peer->vring_enable) {

You can replace this "ncs[i].peer->vring_enable" by
"peer->vring_enable"... and you do this later in PATCH 5/8.

Thanks,
Laurent
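
For reference, a small self-contained sketch of the point about reusing
"peer" inside the loop. struct net_client, get_peer and count_enabled are
stand-ins invented for this example (they are not the QEMU types); get_peer
uses the same pointer arithmetic as the quoted qemu_get_peer().

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for NetClientState; only the fields used here. */
struct net_client {
    struct net_client *peer;
    int vring_enable;
};

/* Same pointer arithmetic as the quoted qemu_get_peer(). */
static struct net_client *get_peer(struct net_client *nc, int queue_index)
{
    assert(nc != NULL);
    return (nc + queue_index)->peer;
}

/* Count queues whose peer has vring_enable set. The peer is looked up once
 * per iteration and reused, rather than spelled as ncs[i].peer again. */
static int count_enabled(struct net_client *ncs, int total_queues)
{
    int n = 0;

    for (int i = 0; i < total_queues; i++) {
        struct net_client *peer = get_peer(ncs, i);

        if (peer->vring_enable) {
            n++;
        }
    }
    return n;
}
```

Since get_peer(ncs, i) and ncs[i].peer name the same pointer, using the local
variable everywhere in the loop body keeps the two spellings from drifting
apart.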




* Re: [RFC v3 3/8] virtio-bus: introduce queue_enabled method
  2020-05-29 14:06 ` [RFC v3 3/8] virtio-bus: introduce queue_enabled method Cindy Lu
@ 2020-06-16  7:49   ` Laurent Vivier
  2020-06-16 12:22     ` Cindy Lu
  0 siblings, 1 reply; 41+ messages in thread
From: Laurent Vivier @ 2020-06-16  7:49 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, maxime.coquelin, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, hanand, lingshan.zhu

On 29/05/2020 16:06, Cindy Lu wrote:
> From: Jason Wang <jasowang@redhat.com>
> 
> This patch introduces queue_enabled() method which allows the
> transport to implement its own way to report whether or not a queue is
> enabled.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Cindy, you must add your Signed-off-by to every patch you send, after
all the existing S-o-b lines.

> 
> 0005-virtio-bus-introduce-queue_enabled-method.patch

bad cut&paste?

Thanks,
Laurent




* Re: [RFC v3 4/8] virtio-pci: implement queue_enabled method
  2020-05-29 14:06 ` [RFC v3 4/8] virtio-pci: implement " Cindy Lu
@ 2020-06-16  7:56   ` Laurent Vivier
  0 siblings, 0 replies; 41+ messages in thread
From: Laurent Vivier @ 2020-06-16  7:56 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, maxime.coquelin, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, hanand, lingshan.zhu

On 29/05/2020 16:06, Cindy Lu wrote:
> From: Jason Wang <jasowang@redhat.com>
> 
> With version 1, we can detect whether a queue is enabled via
> queue_enabled.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Add your S-o-b.

> ---
>  hw/virtio/virtio-pci.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 4cb784389c..2c82ed5246 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -1107,6 +1107,18 @@ static AddressSpace *virtio_pci_get_dma_as(DeviceState *d)
>      return pci_get_address_space(dev);
>  }
>  
> +static bool virtio_pci_queue_enabled(DeviceState *d, int n)
> +{
> +    VirtIOPCIProxy *proxy = VIRTIO_PCI(d);
> +    VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
> +
> +    if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
> +        return proxy->vqs[vdev->queue_sel].enabled;
> +    }
> +
> +    return virtio_queue_get_desc_addr(vdev, n) != 0;

I think it would be clearer/cleaner to use here:

  return virtio_queue_enabled(vdev, n);

Thanks,
Laurent




* Re: [RFC v3 5/8] vhost: introduce vhost_set_vring_ready method
  2020-05-29 14:06 ` [RFC v3 5/8] vhost: introduce vhost_set_vring_ready method Cindy Lu
@ 2020-06-16  8:04   ` Laurent Vivier
  2020-06-16 12:21     ` Cindy Lu
  0 siblings, 1 reply; 41+ messages in thread
From: Laurent Vivier @ 2020-06-16  8:04 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, maxime.coquelin, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, hanand, lingshan.zhu

On 29/05/2020 16:06, Cindy Lu wrote:
> From: Jason Wang <jasowang@redhat.com>
> 
> Vhost-vdpa introduces VHOST_VDPA_SET_VRING_ENABLE which complies the
> semantic of queue_enable defined in virtio spec. This method can be
> used for preventing device from executing request for a specific
> virtqueue. This patch introduces the vhost_ops for this.
> 
> Note that, we've already had vhost_set_vring_enable which has different
> semantic which allows to enable or disable a specific virtqueue for
> some kinds of vhost backends. E.g vhost-user use this to changes the
> number of active queue pairs.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Add your S-o-b.

> ---
>  hw/net/vhost_net-stub.c |  4 ++++
>  hw/net/vhost_net.c      | 11 ++++++++++-
>  include/net/vhost_net.h |  1 +
>  3 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/net/vhost_net-stub.c b/hw/net/vhost_net-stub.c
> index aac0e98228..43e93e1a9a 100644
> --- a/hw/net/vhost_net-stub.c
> +++ b/hw/net/vhost_net-stub.c
> @@ -86,6 +86,10 @@ int vhost_set_vring_enable(NetClientState *nc, int enable)
>      return 0;
>  }
>  
> +int vhost_set_vring_ready(NetClientState *nc)
> +{
> +    return 0;
> +}

Add a blank line here.

>  int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
>  {
>      return 0;
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index d1d421e3d9..e2bc7de2eb 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -344,7 +344,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>              goto err_start;
>          }
>  
> -        if (ncs[i].peer->vring_enable) {
> +        if (peer->vring_enable) {
>              /* restore vring enable state */
>              r = vhost_set_vring_enable(peer, peer->vring_enable);

Move this part to PATCH 2/8

> @@ -455,6 +455,15 @@ int vhost_set_vring_enable(NetClientState *nc, int enable)
>      return 0;
>  }
>  
> +int vhost_set_vring_ready(NetClientState *nc)
> +{
> +    VHostNetState *net = get_vhost_net(nc);
> +    const VhostOps *vhost_ops = net->dev.vhost_ops;
> +    if (vhost_ops && vhost_ops->vhost_set_vring_ready) {

The structure VhostOps doesn't declare the vhost_set_vring_ready field.
Your patch is missing something and cannot be built as is.

It is defined in PATCH 7/8. If you want to keep this patch you should
move the declaration of "vhost_set_vring_ready_op vhost_set_vring_ready"
(and related) to this patch.
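
A minimal sketch of the pattern, with stand-in types: struct dev, struct
dev_ops and the helper names below are hypothetical and only illustrate why
the member has to be declared in the same patch that first dereferences it.

```c
#include <assert.h>
#include <stddef.h>

struct dev;  /* stand-in for struct vhost_dev */

/* Stand-in ops table. Without this member declaration, the
 * "ops->set_vring_ready" test further down would not compile. */
struct dev_ops {
    int (*set_vring_ready)(struct dev *dev);  /* optional; may be NULL */
};

struct dev {
    const struct dev_ops *ops;
    int ready_calls;
};

/* Example backend hook: records that it was called. */
static int backend_set_ready(struct dev *dev)
{
    dev->ready_calls++;
    return 0;
}

static const struct dev_ops ops_with_ready = { .set_vring_ready = backend_set_ready };
static const struct dev_ops ops_without_ready = { .set_vring_ready = NULL };

/* Call the hook when the backend provides one; otherwise do nothing and
 * report success, as the quoted vhost_set_vring_ready() does. */
static int set_vring_ready(struct dev *dev)
{
    const struct dev_ops *ops = dev->ops;

    if (ops && ops->set_vring_ready) {
        return ops->set_vring_ready(dev);
    }
    return 0;
}
```

Backends that leave the member NULL keep working unchanged, which is why the
declaration can move to this patch without touching them.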

> +        return vhost_ops->vhost_set_vring_ready(&net->dev);
> +    }
> +    return 0;
> +}

Add a blank line.

>  int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
>  {
>      const VhostOps *vhost_ops = net->dev.vhost_ops;
> diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
> index 77e47398c4..8a6f208189 100644
> --- a/include/net/vhost_net.h
> +++ b/include/net/vhost_net.h
> @@ -35,6 +35,7 @@ int vhost_net_notify_migration_done(VHostNetState *net, char* mac_addr);
>  VHostNetState *get_vhost_net(NetClientState *nc);
>  
>  int vhost_set_vring_enable(NetClientState * nc, int enable);
> +int vhost_set_vring_ready(NetClientState *nc);
>  
>  uint64_t vhost_net_get_acked_features(VHostNetState *net);
>  
> 

Thanks,
Laurent




* Re: [RFC v3 6/8] vhost-backend: export the vhost backend helper
  2020-05-29 14:06 ` [RFC v3 6/8] vhost-backend: export the vhost backend helper Cindy Lu
@ 2020-06-16  8:16   ` Laurent Vivier
  2020-06-17  3:03     ` Cindy Lu
  0 siblings, 1 reply; 41+ messages in thread
From: Laurent Vivier @ 2020-06-16  8:16 UTC (permalink / raw)
  To: Cindy Lu, mst, armbru, eblake, cohuck, jasowang
  Cc: mhabets, qemu-devel, rob.miller, saugatm, maxime.coquelin, hch,
	eperezma, jgg, shahafs, kevin.tian, parav, vmireyno,
	cunming.liang, gdawar, jiri, xiao.w.wang, stefanha, zhihong.wang,
	aadam, rdunlap, hanand, lingshan.zhu

On 29/05/2020 16:06, Cindy Lu wrote:
> export the helper then we can reuse some of them in vhost-vdpa
> 
> Signed-off-by: Cindy Lu <lulu@redhat.com>
> ---
>  hw/virtio/vhost-backend.c         | 34 ++++++++++++++++++-------------
>  include/hw/virtio/vhost-backend.h | 28 +++++++++++++++++++++++++
>  2 files changed, 48 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> index 48905383f8..42efb4967b 100644
> --- a/hw/virtio/vhost-backend.c
> +++ b/hw/virtio/vhost-backend.c
> @@ -14,7 +14,7 @@
>  #include "qemu/error-report.h"
>  #include "qemu/main-loop.h"
>  #include "standard-headers/linux/vhost_types.h"
> -
> +#include "hw/virtio/vhost-vdpa.h"

You can't include this file because it is created in the next patch.

>  #ifdef CONFIG_VHOST_KERNEL
>  #include <linux/vhost.h>
>  #include <sys/ioctl.h>
> @@ -22,10 +22,16 @@
>  static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
>                               void *arg)
>  {
> -    int fd = (uintptr_t) dev->opaque;
> -
> -    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_KERNEL);
> -
> +    int fd = -1;
> +    struct vhost_vdpa *v = NULL;
> +    if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_KERNEL) {
> +        fd  = (uintptr_t) dev->opaque;
> +    }
> +    if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA) {
> +        v = dev->opaque;
> +        fd = v->device_fd;
> +    }
> +    assert(fd != -1);

A switch would be cleaner:

    switch (dev->vhost_ops->backend_type) {
    case VHOST_BACKEND_TYPE_KERNEL:
        fd = (uintptr_t)dev->opaque;
        break;
    case VHOST_BACKEND_TYPE_VDPA:
        fd = ((struct vhost_vdpa *)dev->opaque)->device_fd;
        break;
    default:
        g_assert_not_reached();
    }
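
The same dispatch as a self-contained sketch, with stand-in types so it
compiles outside the QEMU tree. enum backend_type, struct vdpa_state,
struct dev and dev_fd are hypothetical names for this example, and abort()
stands in for g_assert_not_reached().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum backend_type { BACKEND_KERNEL, BACKEND_VDPA, BACKEND_USER };

struct vdpa_state {
    int device_fd;
};

struct dev {
    enum backend_type type;
    void *opaque;  /* an fd stored as a pointer-sized integer, or a vdpa_state */
};

/* Pick the file descriptor according to the backend type. */
static int dev_fd(const struct dev *dev)
{
    switch (dev->type) {
    case BACKEND_KERNEL:
        return (int)(uintptr_t)dev->opaque;
    case BACKEND_VDPA:
        return ((const struct vdpa_state *)dev->opaque)->device_fd;
    default:
        abort();  /* unsupported backend: fail loudly */
    }
}
```

Compared with two independent "if" tests followed by an assert on fd, the
default branch makes the unsupported case explicit at the point of dispatch.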

>      return ioctl(fd, request, arg);
>  }
>  

Thanks,
Laurent




* Re: [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend
  2020-06-15 14:44     ` Laurent Vivier
@ 2020-06-16  8:52       ` Cindy Lu
  0 siblings, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-16  8:52 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: rdunlap, Michael Tsirkin, mhabets, qemu-devel, Rob Miller,
	saugatm, Markus Armbruster, hch, Eugenio Perez Martin, jgg,
	Jason Wang, Shahaf Shuler, kevin.tian, parav, Vitaly Mireyno,
	Liang, Cunming, gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, Wang,
	Zhihong, Maxime Coquelin, Tiwei Bie, Ariel Adam, Cornelia Huck,
	hanand, Zhu, Lingshan

On Mon, Jun 15, 2020 at 10:44 PM Laurent Vivier <lvivier@redhat.com> wrote:
>
> On 08/06/2020 22:14, Eric Blake wrote:
> > On 5/29/20 9:06 AM, Cindy Lu wrote:
> >> From: Tiwei Bie <tiwei.bie@intel.com>
> >
> > The original author is Tiwei Bie...
> >
> >>
> >> Currently we have 2 types of vhost backends in QEMU: vhost kernel and
> >> vhost-user. The above patch provides a generic device for vDPA purpose,
> >> this vDPA device exposes to user space a non-vendor-specific
> >> configuration
> >> interface for setting up a vhost HW accelerator, this patch set
> >> introduces
> >> a third vhost backend called vhost-vdpa based on the vDPA interface.
> >>
> >> Vhost-vdpa usage:
> >>
> >>    qemu-system-x86_64 -cpu host -enable-kvm \
> >>      ......
> >>    -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-id,id=vhost-vdpa0 \
> >>    -device virtio-net-pci,netdev=vhost-vdpa0,page-per-vq=on \
> >>
> >> Co-Authored-By: Lingshan zhu <lingshan.zhu@intel.com>
> >> Signed-off-by: Cindy Lu <lulu@redhat.com>
> >> ---
> >
> > ...but there is no S-o-b here.  Also, Co-Authored-By is an unusual tag;
> > it's just as easy to spell it Signed-off-by even for co-authors.
> >
>
> Normally the tag to use in this case is "Co-developed-by".
>
> https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by
>
> "A Co-Developed-by: states that the patch was also created by another
> developer along with the original author. This is useful at times when
> multiple people work on a single patch. Note, this person also needs to
> have a Signed-off-by: line in the patch as well."
>
> Thanks,
> Laurent
>
Thanks Laurent, I will fix this




* Re: [RFC v3 5/8] vhost: introduce vhost_set_vring_ready method
  2020-06-16  8:04   ` Laurent Vivier
@ 2020-06-16 12:21     ` Cindy Lu
  0 siblings, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-16 12:21 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: rdunlap, Michael Tsirkin, mhabets, qemu-devel, Rob Miller,
	saugatm, Markus Armbruster, hch, Eugenio Perez Martin, jgg,
	Jason Wang, Shahaf Shuler, kevin.tian, parav, Vitaly Mireyno,
	Liang, Cunming, gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, Wang,
	Zhihong, Maxime Coquelin, Ariel Adam, Cornelia Huck, hanand, Zhu,
	Lingshan

On Tue, Jun 16, 2020 at 4:04 PM Laurent Vivier <lvivier@redhat.com> wrote:
>
> On 29/05/2020 16:06, Cindy Lu wrote:
> > From: Jason Wang <jasowang@redhat.com>
> >
> > Vhost-vdpa introduces VHOST_VDPA_SET_VRING_ENABLE which complies the
> > semantic of queue_enable defined in virtio spec. This method can be
> > used for preventing device from executing request for a specific
> > virtqueue. This patch introduces the vhost_ops for this.
> >
> > Note that, we've already had vhost_set_vring_enable which has different
> > semantic which allows to enable or disable a specific virtqueue for
> > some kinds of vhost backends. E.g vhost-user use this to changes the
> > number of active queue pairs.
> >
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
>
> Add your S-o-b.
>
will fix this
> > ---
> >  hw/net/vhost_net-stub.c |  4 ++++
> >  hw/net/vhost_net.c      | 11 ++++++++++-
> >  include/net/vhost_net.h |  1 +
> >  3 files changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/net/vhost_net-stub.c b/hw/net/vhost_net-stub.c
> > index aac0e98228..43e93e1a9a 100644
> > --- a/hw/net/vhost_net-stub.c
> > +++ b/hw/net/vhost_net-stub.c
> > @@ -86,6 +86,10 @@ int vhost_set_vring_enable(NetClientState *nc, int enable)
> >      return 0;
> >  }
> >
> > +int vhost_set_vring_ready(NetClientState *nc)
> > +{
> > +    return 0;
> > +}
>
> Add a blank line here.
>
will fix this
> >  int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
> >  {
> >      return 0;
> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> > index d1d421e3d9..e2bc7de2eb 100644
> > --- a/hw/net/vhost_net.c
> > +++ b/hw/net/vhost_net.c
> > @@ -344,7 +344,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
> >              goto err_start;
> >          }
> >
> > -        if (ncs[i].peer->vring_enable) {
> > +        if (peer->vring_enable) {
> >              /* restore vring enable state */
> >              r = vhost_set_vring_enable(peer, peer->vring_enable);
>
> Move this part to PATCH 2/8
>
will fix this
> > @@ -455,6 +455,15 @@ int vhost_set_vring_enable(NetClientState *nc, int enable)
> >      return 0;
> >  }
> >
> > +int vhost_set_vring_ready(NetClientState *nc)
> > +{
> > +    VHostNetState *net = get_vhost_net(nc);
> > +    const VhostOps *vhost_ops = net->dev.vhost_ops;
> > +    if (vhost_ops && vhost_ops->vhost_set_vring_ready) {
>
> The structure VhostOps doesn't declare the vhost_set_vring_ready field.
> Your patch is missing something and it could be not built.
>
> It is defined in PATCH 7/8. If you want to keep this patch you should
> move the declaration of "vhost_set_vring_ready_op vhost_set_vring_ready"
> (and related) to this patch.
>
Thanks Laurent, I will fix this

> > +        return vhost_ops->vhost_set_vring_ready(&net->dev);
> > +    }
> > +    return 0;
> > +}
>
> Add a blank line.
>
sure will fix this
> >  int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu)
> >  {
> >      const VhostOps *vhost_ops = net->dev.vhost_ops;
> > diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
> > index 77e47398c4..8a6f208189 100644
> > --- a/include/net/vhost_net.h
> > +++ b/include/net/vhost_net.h
> > @@ -35,6 +35,7 @@ int vhost_net_notify_migration_done(VHostNetState *net, char* mac_addr);
> >  VHostNetState *get_vhost_net(NetClientState *nc);
> >
> >  int vhost_set_vring_enable(NetClientState * nc, int enable);
> > +int vhost_set_vring_ready(NetClientState *nc);
> >
> >  uint64_t vhost_net_get_acked_features(VHostNetState *net);
> >
> >
>
> Thanks,
> Laurent
>
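
The build problem raised above (a callback used before its field is declared in the ops structure) and the "call it only if the backend provides it" pattern can be shown with a minimal, self-contained sketch. All names below (demo_ops, demo_dev, demo_set_vring_ready, demo_backend_set_ready) are hypothetical stand-ins, not QEMU's actual VhostOps or vhost_dev definitions; the only point is that the function-pointer typedef and the struct field must be declared in, or before, the patch that dereferences them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a device and its backend ops table. */
struct demo_dev;

/* The typedef and the field below must exist before any caller uses them. */
typedef int (*demo_set_vring_ready_op)(struct demo_dev *dev);

struct demo_ops {
    demo_set_vring_ready_op set_vring_ready; /* optional, may be NULL */
};

struct demo_dev {
    const struct demo_ops *ops;
    int ready;
};

/* One possible backend implementation of the optional callback. */
static int demo_backend_set_ready(struct demo_dev *dev)
{
    dev->ready = 1;
    return 0;
}

/* Call the callback only when the backend provides it; otherwise succeed. */
static int demo_set_vring_ready(struct demo_dev *dev)
{
    const struct demo_ops *ops;

    assert(dev != NULL);
    ops = dev->ops;
    if (ops && ops->set_vring_ready) {
        return ops->set_vring_ready(dev);
    }
    return 0;
}
```

A backend that leaves the field NULL is treated as "nothing to do", which mirrors the fallback to return 0 in the quoted patch.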




* Re: [RFC v3 3/8] virtio-bus: introduce queue_enabled method
  2020-06-16  7:49   ` Laurent Vivier
@ 2020-06-16 12:22     ` Cindy Lu
  0 siblings, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-16 12:22 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: rdunlap, Michael Tsirkin, mhabets, qemu-devel, Rob Miller,
	saugatm, Markus Armbruster, hch, Eugenio Perez Martin, jgg,
	Jason Wang, Shahaf Shuler, kevin.tian, parav, Vitaly Mireyno,
	Liang, Cunming, gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, Wang,
	Zhihong, Maxime Coquelin, Ariel Adam, Cornelia Huck, hanand, Zhu,
	Lingshan

On Tue, Jun 16, 2020 at 3:50 PM Laurent Vivier <lvivier@redhat.com> wrote:
>
> On 29/05/2020 16:06, Cindy Lu wrote:
> > From: Jason Wang <jasowang@redhat.com>
> >
> > This patch introduces queue_enabled() method which allows the
> > transport to implement its own way to report whether or not a queue is
> > enabled.
> >
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
>
> Cindy, you must add your Signed-off-by to all the patches you send, after
> all the existing S-o-b.
>
sure will fix this
> >
> > 0005-virtio-bus-introduce-queue_enabled-method.patch
>
will remove this part

> bad cut&paste?
>
> Thanks,
> Laurent
>




* Re: [RFC v3 6/8] vhost-backend: export the vhost backend helper
  2020-06-16  8:16   ` Laurent Vivier
@ 2020-06-17  3:03     ` Cindy Lu
  0 siblings, 0 replies; 41+ messages in thread
From: Cindy Lu @ 2020-06-17  3:03 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: rdunlap, Michael Tsirkin, mhabets, qemu-devel, Rob Miller,
	saugatm, Markus Armbruster, hch, Eugenio Perez Martin, jgg,
	Jason Wang, Shahaf Shuler, kevin.tian, parav, Vitaly Mireyno,
	Liang, Cunming, gdawar, jiri, xiao.w.wang, Stefan Hajnoczi, Wang,
	Zhihong, Maxime Coquelin, Ariel Adam, Cornelia Huck, hanand, Zhu,
	Lingshan

On Tue, Jun 16, 2020 at 4:17 PM Laurent Vivier <lvivier@redhat.com> wrote:
>
> On 29/05/2020 16:06, Cindy Lu wrote:
> > Export the helpers so that some of them can be reused in vhost-vdpa.
> >
> > Signed-off-by: Cindy Lu <lulu@redhat.com>
> > ---
> >  hw/virtio/vhost-backend.c         | 34 ++++++++++++++++++-------------
> >  include/hw/virtio/vhost-backend.h | 28 +++++++++++++++++++++++++
> >  2 files changed, 48 insertions(+), 14 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> > index 48905383f8..42efb4967b 100644
> > --- a/hw/virtio/vhost-backend.c
> > +++ b/hw/virtio/vhost-backend.c
> > @@ -14,7 +14,7 @@
> >  #include "qemu/error-report.h"
> >  #include "qemu/main-loop.h"
> >  #include "standard-headers/linux/vhost_types.h"
> > -
> > +#include "hw/virtio/vhost-vdpa.h"
>
> You can't include this file because it is created in the next patch.
>
> >  #ifdef CONFIG_VHOST_KERNEL
> >  #include <linux/vhost.h>
> >  #include <sys/ioctl.h>
> > @@ -22,10 +22,16 @@
> >  static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
> >                               void *arg)
> >  {
> > -    int fd = (uintptr_t) dev->opaque;
> > -
> > -    assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_KERNEL);
> > -
> > +    int fd = -1;
> > +    struct vhost_vdpa *v = NULL;
> > +    if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_KERNEL) {
> > +        fd  = (uintptr_t) dev->opaque;
> > +    }
> > +    if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA) {
> > +        v = dev->opaque;
> > +        fd = v->device_fd;
> > +    }
> > +    assert(fd != -1);
>
> A switch would be cleaner:
>
>     switch (dev->vhost_ops->backend_type) {
>     case VHOST_BACKEND_TYPE_KERNEL:
>         fd = (uintptr_t)dev->opaque;
>         break;
>     case VHOST_BACKEND_TYPE_VDPA:
>         fd = ((struct vhost_vdpa *)dev->opaque)->device_fd;
>         break;
>     default:
>         g_assert_not_reached();
>     }
>
> >      return ioctl(fd, request, arg);
> >  }
> >
>
Thanks Laurent, will fix this
> Thanks,
> Laurent
>
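
For readers who want to try the switch-based dispatch suggested in this review outside of QEMU, here is a minimal standalone sketch. The types and names (fd_backend_type, fd_vdpa, fd_dev, fd_from_dev) are hypothetical and only imitate the shape of the quoted code; abort() stands in for GLib's g_assert_not_reached() so that the example needs nothing beyond the C standard library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical backend kinds, imitating VHOST_BACKEND_TYPE_* in the patch. */
enum fd_backend_type {
    FD_BACKEND_KERNEL = 1,
    FD_BACKEND_VDPA = 2
};

/* Hypothetical vDPA state that carries the device file descriptor. */
struct fd_vdpa {
    int device_fd;
};

/* opaque holds either the fd itself (kernel) or a struct fd_vdpa * (vDPA). */
struct fd_dev {
    enum fd_backend_type backend_type;
    void *opaque;
};

/* Resolve the file descriptor according to the backend type. */
static int fd_from_dev(const struct fd_dev *dev)
{
    assert(dev != NULL);

    switch (dev->backend_type) {
    case FD_BACKEND_KERNEL:
        /* The fd was stored directly in the pointer-sized opaque field. */
        return (int)(uintptr_t)dev->opaque;
    case FD_BACKEND_VDPA:
        return ((const struct fd_vdpa *)dev->opaque)->device_fd;
    default:
        /* Unknown backend type: fail loudly instead of using a bad fd. */
        abort();
    }
}
```

Compared with the two independent if statements followed by assert(fd != -1) in the patch, the switch makes the set of supported backends explicit and rejects any other value in a single place.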




end of thread, other threads:[~2020-06-17  3:04 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
2020-05-29 14:06 [RFC v3 0/8] vDPA support in qemu Cindy Lu
2020-05-29 14:06 ` [RFC v3 1/8] net: introduce qemu_get_peer Cindy Lu
2020-06-11  9:07   ` Laurent Vivier
2020-06-11 13:12     ` Cindy Lu
2020-05-29 14:06 ` [RFC v3 2/8] vhost_net: use the function qemu_get_peer Cindy Lu
2020-06-16  7:47   ` Laurent Vivier
2020-05-29 14:06 ` [RFC v3 3/8] virtio-bus: introduce queue_enabled method Cindy Lu
2020-06-16  7:49   ` Laurent Vivier
2020-06-16 12:22     ` Cindy Lu
2020-05-29 14:06 ` [RFC v3 4/8] virtio-pci: implement " Cindy Lu
2020-06-16  7:56   ` Laurent Vivier
2020-05-29 14:06 ` [RFC v3 5/8] vhost: introduce vhost_set_vring_ready method Cindy Lu
2020-06-16  8:04   ` Laurent Vivier
2020-06-16 12:21     ` Cindy Lu
2020-05-29 14:06 ` [RFC v3 6/8] vhost-backend: export the vhost backend helper Cindy Lu
2020-06-16  8:16   ` Laurent Vivier
2020-06-17  3:03     ` Cindy Lu
2020-05-29 14:06 ` [RFC v3 7/8] vhost-vdpa: introduce vhost-vdpa backend Cindy Lu
2020-06-03  2:52   ` Jason Wang
2020-06-03  5:23     ` Cindy Lu
2020-06-03  2:53   ` Jason Wang
2020-06-03  5:23     ` Cindy Lu
2020-06-03  6:43   ` Jason Wang
2020-06-03  8:20     ` Cindy Lu
2020-06-04 10:39   ` Eugenio Perez Martin
2020-06-04 11:33     ` Michael S. Tsirkin
2020-06-08 14:46       ` Cindy Lu
2020-06-08 20:14   ` Eric Blake
2020-06-09  3:42     ` Cindy Lu
2020-06-15 14:44     ` Laurent Vivier
2020-06-16  8:52       ` Cindy Lu
2020-05-29 14:06 ` [RFC v3 8/8] vhost-vdpa: introduce vhost-vdpa net client Cindy Lu
2020-05-29 14:22   ` Eric Blake
2020-06-01  1:41     ` Cindy Lu
2020-06-03  6:39   ` Jason Wang
2020-06-03  8:19     ` Cindy Lu
2020-06-03  8:43       ` Jason Wang
2020-06-03  8:49         ` Cindy Lu
2020-05-29 20:29 ` [RFC v3 0/8] vDPA support in qemu no-reply
2020-05-29 20:33 ` no-reply
2020-05-29 20:37 ` no-reply
