* [RFC 00/10] add generic vDPA device support
@ 2022-01-05  0:58 Longpeng(Mike) via
  2022-01-05  0:58 ` [RFC 01/10] virtio: get class_id and pci device id by the virtio id Longpeng(Mike) via
                   ` (10 more replies)
  0 siblings, 11 replies; 52+ messages in thread
From: Longpeng(Mike) via @ 2022-01-05  0:58 UTC (permalink / raw)
  To: stefanha, mst, jasowang, sgarzare
  Cc: cohuck, pbonzini, arei.gonglei, yechuan, huangzhichao,
	qemu-devel, Longpeng

From: Longpeng <longpeng2@huawei.com>

Hi guys,

This patchset adds support for a generic vDPA device; the previous
discussion can be found here [1].

With the generic vDPA device, QEMU no longer needs to know about the
individual device types, much like VFIO.

We can use the generic vDPA device as follows:
  -device vhost-vdpa-device-pci,vdpa-dev=/dev/vhost-vdpa-X
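
For example, a complete command line might look like this (the machine
and guest options, and the /dev/vhost-vdpa-0 path, are only illustrative;
add whatever boot disk/kernel options the guest needs):

  qemu-system-x86_64 -M q35 -m 4G -smp 2 -nographic \
      -device vhost-vdpa-device-pci,vdpa-dev=/dev/vhost-vdpa-0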

I've done some simple tests on Huawei's offloading card (net, virtio 0.95)
and on vdpa_sim_blk (virtio 1.0).
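
For the vdpa_sim_blk case, the host side can be prepared roughly as
follows (module, management-device and tool names are those of the
in-kernel simulator and the iproute2 "vdpa" tool; adjust as needed):

  modprobe vdpa_sim_blk
  modprobe vhost_vdpa
  vdpa dev add name blk0 mgmtdev vdpasim_blk
  # a /dev/vhost-vdpa-X node should then show up for the new device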

Note:
  the kernel part has not been sent out yet; I'll send it as soon as possible.

[1] https://lore.kernel.org/all/20211208052010.1719-1-longpeng2@huawei.com/

Longpeng (Mike) (10):
  virtio: get class_id and pci device id by the virtio id
  vhost: add 3 commands for vhost-vdpa
  vdpa: add the infrastructure of vdpa-dev
  vdpa-dev: implement the instance_init/class_init interface
  vdpa-dev: implement the realize interface
  vdpa-dev: implement the unrealize interface
  vdpa-dev: implement the get_config/set_config interface
  vdpa-dev: implement the get_features interface
  vdpa-dev: implement the set_status interface
  vdpa-dev: mark the device as unmigratable

 hw/virtio/Kconfig            |   5 +
 hw/virtio/meson.build        |   2 +
 hw/virtio/vdpa-dev-pci.c     | 127 +++++++++++++
 hw/virtio/vdpa-dev.c         | 355 +++++++++++++++++++++++++++++++++++
 hw/virtio/virtio-pci.c       |  93 +++++++++
 hw/virtio/virtio-pci.h       |   4 +
 include/hw/virtio/vdpa-dev.h |  26 +++
 linux-headers/linux/vhost.h  |  10 +
 8 files changed, 622 insertions(+)
 create mode 100644 hw/virtio/vdpa-dev-pci.c
 create mode 100644 hw/virtio/vdpa-dev.c
 create mode 100644 include/hw/virtio/vdpa-dev.h

-- 
2.23.0




* [RFC 01/10] virtio: get class_id and pci device id by the virtio id
  2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
@ 2022-01-05  0:58 ` Longpeng(Mike) via
  2022-01-05  4:37   ` Jason Wang
                     ` (2 more replies)
  2022-01-05  0:58 ` [RFC 02/10] vhost: add 3 commands for vhost-vdpa Longpeng(Mike) via
                   ` (9 subsequent siblings)
  10 siblings, 3 replies; 52+ messages in thread
From: Longpeng(Mike) via @ 2022-01-05  0:58 UTC (permalink / raw)
  To: stefanha, mst, jasowang, sgarzare
  Cc: cohuck, pbonzini, arei.gonglei, yechuan, huangzhichao,
	qemu-devel, Longpeng

From: Longpeng <longpeng2@huawei.com>

Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
device specified by the "Virtio Device ID".

These helpers will be used to build the generic vDPA device later.
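
For instance, a caller that only knows the virtio device ID could use the
helpers like this (illustrative snippet, not part of the patch):

    uint16_t pdev_id  = virtio_pci_get_pci_devid(VIRTIO_ID_NET); /* PCI_DEVICE_ID_VIRTIO_NET */
    uint16_t class_id = virtio_pci_get_class_id(VIRTIO_ID_NET);  /* PCI_CLASS_NETWORK_ETHERNET */
    /* both helpers return 0 if the virtio ID is not in the table */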

Signed-off-by: Longpeng <longpeng2@huawei.com>
---
 hw/virtio/virtio-pci.c | 93 ++++++++++++++++++++++++++++++++++++++++++
 hw/virtio/virtio-pci.h |  4 ++
 2 files changed, 97 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 750aa47ec1..843085c4ea 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -19,6 +19,7 @@
 
 #include "exec/memop.h"
 #include "standard-headers/linux/virtio_pci.h"
+#include "standard-headers/linux/virtio_ids.h"
 #include "hw/boards.h"
 #include "hw/virtio/virtio.h"
 #include "migration/qemu-file-types.h"
@@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n, QEMUFile *f)
     return 0;
 }
 
+typedef struct VirtIOPCIIDInfo {
+    uint16_t vdev_id; /* virtio id */
+    uint16_t pdev_id; /* pci device id */
+    uint16_t class_id;
+} VirtIOPCIIDInfo;
+
+static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
+    {
+        .vdev_id = VIRTIO_ID_NET,
+        .pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
+        .class_id = PCI_CLASS_NETWORK_ETHERNET,
+    },
+    {
+        .vdev_id = VIRTIO_ID_BLOCK,
+        .pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
+        .class_id = PCI_CLASS_STORAGE_SCSI,
+    },
+    {
+        .vdev_id = VIRTIO_ID_CONSOLE,
+        .pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
+        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
+    },
+    {
+        .vdev_id = VIRTIO_ID_SCSI,
+        .pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
+        .class_id = PCI_CLASS_STORAGE_SCSI,
+    },
+    {
+        .vdev_id = VIRTIO_ID_9P,
+        .pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
+        .class_id = PCI_BASE_CLASS_NETWORK,
+    },
+    {
+        .vdev_id = VIRTIO_ID_VSOCK,
+        .pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
+        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
+    },
+    {
+        .vdev_id = VIRTIO_ID_IOMMU,
+        .pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
+        .class_id = PCI_CLASS_OTHERS,
+    },
+    {
+        .vdev_id = VIRTIO_ID_MEM,
+        .pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
+        .class_id = PCI_CLASS_OTHERS,
+    },
+    {
+        .vdev_id = VIRTIO_ID_PMEM,
+        .pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
+        .class_id = PCI_CLASS_OTHERS,
+    },
+    {
+        .vdev_id = VIRTIO_ID_RNG,
+        .pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
+        .class_id = PCI_CLASS_OTHERS,
+    },
+    {
+        .vdev_id = VIRTIO_ID_BALLOON,
+        .pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
+        .class_id = PCI_CLASS_OTHERS,
+    },
+};
+
+static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
+{
+    VirtIOPCIIDInfo info = {};
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
+        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
+            info = virtio_pci_id_info[i];
+            break;
+        }
+    }
+
+    return info;
+}
+
+uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
+{
+    return virtio_pci_get_id_info(device_id).pdev_id;
+}
+
+uint16_t virtio_pci_get_class_id(uint16_t device_id)
+{
+    return virtio_pci_get_id_info(device_id).class_id;
+}
+
 static bool virtio_pci_ioeventfd_enabled(DeviceState *d)
 {
     VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
@@ -1674,6 +1764,9 @@ static void virtio_pci_device_plugged(DeviceState *d, Error **errp)
          * is set to PCI_SUBVENDOR_ID_REDHAT_QUMRANET by default.
          */
         pci_set_word(config + PCI_SUBSYSTEM_ID, virtio_bus_get_vdev_id(bus));
+        if (proxy->pdev_id) {
+            pci_config_set_device_id(config, proxy->pdev_id);
+        }
     } else {
         /* pure virtio-1.0 */
         pci_set_word(config + PCI_VENDOR_ID,
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 2446dcd9ae..06aa59436e 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -146,6 +146,7 @@ struct VirtIOPCIProxy {
     bool disable_modern;
     bool ignore_backend_features;
     OnOffAuto disable_legacy;
+    uint16_t pdev_id;
     uint32_t class_code;
     uint32_t nvectors;
     uint32_t dfselect;
@@ -158,6 +159,9 @@ struct VirtIOPCIProxy {
     VirtioBusState bus;
 };
 
+uint16_t virtio_pci_get_pci_devid(uint16_t device_id);
+uint16_t virtio_pci_get_class_id(uint16_t device_id);
+
 static inline bool virtio_pci_modern(VirtIOPCIProxy *proxy)
 {
     return !proxy->disable_modern;
-- 
2.23.0




* [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
  2022-01-05  0:58 ` [RFC 01/10] virtio: get class_id and pci device id by the virtio id Longpeng(Mike) via
@ 2022-01-05  0:58 ` Longpeng(Mike) via
  2022-01-05  4:35   ` Jason Wang
  2022-01-05  0:58 ` [RFC 03/10] vdpa: add the infrastructure of vdpa-dev Longpeng(Mike) via
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 52+ messages in thread
From: Longpeng(Mike) via @ 2022-01-05  0:58 UTC (permalink / raw)
  To: stefanha, mst, jasowang, sgarzare
  Cc: cohuck, pbonzini, arei.gonglei, yechuan, huangzhichao,
	qemu-devel, Longpeng

From: Longpeng <longpeng2@huawei.com>

To support the generic vDPA device, we need to add the following ioctls
(a short usage sketch follows the list):
- GET_VECTORS_NUM: the number of vectors the device supports
- GET_CONFIG_SIZE: the size of the virtio config space
- GET_VQS_NUM: the number of virtqueues the device exports
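
A rough sketch of how userspace is expected to use them (hypothetical
device path, error handling omitted):

    int fd = open("/dev/vhost-vdpa-0", O_RDWR);
    int vectors, config_size, nvqs;

    ioctl(fd, VHOST_VDPA_GET_VECTORS_NUM, &vectors);
    ioctl(fd, VHOST_VDPA_GET_CONFIG_SIZE, &config_size);
    ioctl(fd, VHOST_VDPA_GET_VQS_NUM, &nvqs);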

Signed-off-by: Longpeng <longpeng2@huawei.com>
---
 linux-headers/linux/vhost.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index c998860d7b..c5edd75d15 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -150,4 +150,14 @@
 /* Get the valid iova range */
 #define VHOST_VDPA_GET_IOVA_RANGE	_IOR(VHOST_VIRTIO, 0x78, \
 					     struct vhost_vdpa_iova_range)
+
+/* Get the number of vectors */
+#define VHOST_VDPA_GET_VECTORS_NUM	_IOR(VHOST_VIRTIO, 0x79, int)
+
+/* Get the virtio config size */
+#define VHOST_VDPA_GET_CONFIG_SIZE	_IOR(VHOST_VIRTIO, 0x80, int)
+
+/* Get the number of virtqueues */
+#define VHOST_VDPA_GET_VQS_NUM		_IOR(VHOST_VIRTIO, 0x81, int)
+
 #endif
-- 
2.23.0




* [RFC 03/10] vdpa: add the infrastructure of vdpa-dev
  2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
  2022-01-05  0:58 ` [RFC 01/10] virtio: get class_id and pci device id by the virtio id Longpeng(Mike) via
  2022-01-05  0:58 ` [RFC 02/10] vhost: add 3 commands for vhost-vdpa Longpeng(Mike) via
@ 2022-01-05  0:58 ` Longpeng(Mike) via
  2022-01-05  9:48   ` Stefan Hajnoczi
  2022-01-05  0:58 ` [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface Longpeng(Mike) via
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 52+ messages in thread
From: Longpeng(Mike) via @ 2022-01-05  0:58 UTC (permalink / raw)
  To: stefanha, mst, jasowang, sgarzare
  Cc: cohuck, pbonzini, arei.gonglei, yechuan, huangzhichao,
	qemu-devel, Longpeng

From: Longpeng <longpeng2@huawei.com>

Add the infrastructure of vdpa-dev (the generic vDPA device). With it, we
can add a generic vDPA device as follows:
  -device vhost-vdpa-device-pci,vdpa-dev=/dev/vhost-vdpa-X

Signed-off-by: Longpeng <longpeng2@huawei.com>
---
 hw/virtio/Kconfig            |  5 ++++
 hw/virtio/meson.build        |  2 ++
 hw/virtio/vdpa-dev-pci.c     | 51 ++++++++++++++++++++++++++++++++++++
 hw/virtio/vdpa-dev.c         | 41 +++++++++++++++++++++++++++++
 include/hw/virtio/vdpa-dev.h | 16 +++++++++++
 5 files changed, 115 insertions(+)
 create mode 100644 hw/virtio/vdpa-dev-pci.c
 create mode 100644 hw/virtio/vdpa-dev.c
 create mode 100644 include/hw/virtio/vdpa-dev.h

diff --git a/hw/virtio/Kconfig b/hw/virtio/Kconfig
index c144d42f9b..2723283382 100644
--- a/hw/virtio/Kconfig
+++ b/hw/virtio/Kconfig
@@ -68,3 +68,8 @@ config VHOST_USER_RNG
     bool
     default y
     depends on VIRTIO && VHOST_USER
+
+config VHOST_VDPA_DEV
+    bool
+    default y if VIRTIO_PCI
+    depends on VIRTIO && VHOST_VDPA && LINUX
diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index 521f7d64a8..8e8943e20b 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -29,6 +29,7 @@ virtio_ss.add(when: 'CONFIG_VHOST_USER_I2C', if_true: files('vhost-user-i2c.c'))
 virtio_ss.add(when: ['CONFIG_VIRTIO_PCI', 'CONFIG_VHOST_USER_I2C'], if_true: files('vhost-user-i2c-pci.c'))
 virtio_ss.add(when: 'CONFIG_VHOST_USER_RNG', if_true: files('vhost-user-rng.c'))
 virtio_ss.add(when: ['CONFIG_VHOST_USER_RNG', 'CONFIG_VIRTIO_PCI'], if_true: files('vhost-user-rng-pci.c'))
+virtio_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: files('vdpa-dev.c'))
 
 virtio_pci_ss = ss.source_set()
 virtio_pci_ss.add(when: 'CONFIG_VHOST_VSOCK', if_true: files('vhost-vsock-pci.c'))
@@ -49,6 +50,7 @@ virtio_pci_ss.add(when: 'CONFIG_VIRTIO_SERIAL', if_true: files('virtio-serial-pc
 virtio_pci_ss.add(when: 'CONFIG_VIRTIO_PMEM', if_true: files('virtio-pmem-pci.c'))
 virtio_pci_ss.add(when: 'CONFIG_VIRTIO_IOMMU', if_true: files('virtio-iommu-pci.c'))
 virtio_pci_ss.add(when: 'CONFIG_VIRTIO_MEM', if_true: files('virtio-mem-pci.c'))
+virtio_pci_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: files('vdpa-dev-pci.c'))
 
 virtio_ss.add_all(when: 'CONFIG_VIRTIO_PCI', if_true: virtio_pci_ss)
 
diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
new file mode 100644
index 0000000000..a5a7b528a9
--- /dev/null
+++ b/hw/virtio/vdpa-dev-pci.c
@@ -0,0 +1,51 @@
+#include "qemu/osdep.h"
+#include <sys/ioctl.h>
+#include <linux/vhost.h>
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/vdpa-dev.h"
+#include "hw/pci/pci.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/module.h"
+#include "virtio-pci.h"
+#include "qom/object.h"
+
+
+typedef struct VhostVdpaDevicePCI VhostVdpaDevicePCI;
+
+#define TYPE_VHOST_VDPA_DEVICE_PCI "vhost-vdpa-device-pci-base"
+DECLARE_INSTANCE_CHECKER(VhostVdpaDevicePCI, VHOST_VDPA_DEVICE_PCI,
+                         TYPE_VHOST_VDPA_DEVICE_PCI)
+
+struct VhostVdpaDevicePCI {
+    VirtIOPCIProxy parent_obj;
+    VhostVdpaDevice vdev;
+};
+
+static void vhost_vdpa_device_pci_instance_init(Object *obj)
+{
+    return;
+}
+
+static void vhost_vdpa_device_pci_class_init(ObjectClass *klass, void *data)
+{
+    return;
+}
+
+static const VirtioPCIDeviceTypeInfo vhost_vdpa_device_pci_info = {
+    .base_name               = TYPE_VHOST_VDPA_DEVICE_PCI,
+    .generic_name            = "vhost-vdpa-device-pci",
+    .transitional_name       = "vhost-vdpa-device-pci-transitional",
+    .non_transitional_name   = "vhost-vdpa-device-pci-non-transitional",
+    .instance_size  = sizeof(VhostVdpaDevicePCI),
+    .instance_init  = vhost_vdpa_device_pci_instance_init,
+    .class_init     = vhost_vdpa_device_pci_class_init,
+};
+
+static void vhost_vdpa_device_pci_register(void)
+{
+    virtio_pci_types_register(&vhost_vdpa_device_pci_info);
+}
+
+type_init(vhost_vdpa_device_pci_register);
diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
new file mode 100644
index 0000000000..f4f92b90b0
--- /dev/null
+++ b/hw/virtio/vdpa-dev.c
@@ -0,0 +1,41 @@
+#include "qemu/osdep.h"
+#include <sys/ioctl.h>
+#include <linux/vhost.h>
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/cutils.h"
+#include "hw/qdev-core.h"
+#include "hw/qdev-properties.h"
+#include "hw/qdev-properties-system.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+#include "hw/virtio/vdpa-dev.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/runstate.h"
+
+static void vhost_vdpa_device_class_init(ObjectClass *klass, void *data)
+{
+    return;
+}
+
+static void vhost_vdpa_device_instance_init(Object *obj)
+{
+    return;
+}
+
+static const TypeInfo vhost_vdpa_device_info = {
+    .name = TYPE_VHOST_VDPA_DEVICE,
+    .parent = TYPE_VIRTIO_DEVICE,
+    .instance_size = sizeof(VhostVdpaDevice),
+    .class_init = vhost_vdpa_device_class_init,
+    .instance_init = vhost_vdpa_device_instance_init,
+};
+
+static void register_vhost_vdpa_device_type(void)
+{
+    type_register_static(&vhost_vdpa_device_info);
+}
+
+type_init(register_vhost_vdpa_device_type);
diff --git a/include/hw/virtio/vdpa-dev.h b/include/hw/virtio/vdpa-dev.h
new file mode 100644
index 0000000000..dd94bd74a2
--- /dev/null
+++ b/include/hw/virtio/vdpa-dev.h
@@ -0,0 +1,16 @@
+#ifndef _VHOST_VDPA_DEVICE_H
+#define _VHOST_VDPA_DEVICE_H
+
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-vdpa.h"
+#include "qom/object.h"
+
+
+#define TYPE_VHOST_VDPA_DEVICE "vhost-vdpa-device"
+OBJECT_DECLARE_SIMPLE_TYPE(VhostVdpaDevice, VHOST_VDPA_DEVICE)
+
+struct VhostVdpaDevice {
+    VirtIODevice parent_obj;
+};
+
+#endif
-- 
2.23.0




* [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface
  2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
                   ` (2 preceding siblings ...)
  2022-01-05  0:58 ` [RFC 03/10] vdpa: add the infrastructure of vdpa-dev Longpeng(Mike) via
@ 2022-01-05  0:58 ` Longpeng(Mike) via
  2022-01-05 10:00   ` Stefan Hajnoczi
  2022-01-05 11:28   ` Stefano Garzarella
  2022-01-05  0:58 ` [RFC 05/10] vdpa-dev: implement the realize interface Longpeng(Mike) via
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 52+ messages in thread
From: Longpeng(Mike) via @ 2022-01-05  0:58 UTC (permalink / raw)
  To: stefanha, mst, jasowang, sgarzare
  Cc: cohuck, pbonzini, arei.gonglei, yechuan, huangzhichao,
	qemu-devel, Longpeng

From: Longpeng <longpeng2@huawei.com>

Implements the .instance_init and .class_init interfaces.

Signed-off-by: Longpeng <longpeng2@huawei.com>
---
 hw/virtio/vdpa-dev-pci.c     | 80 +++++++++++++++++++++++++++++++++++-
 hw/virtio/vdpa-dev.c         | 68 +++++++++++++++++++++++++++++-
 include/hw/virtio/vdpa-dev.h |  2 +
 3 files changed, 146 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
index a5a7b528a9..0af54a26d4 100644
--- a/hw/virtio/vdpa-dev-pci.c
+++ b/hw/virtio/vdpa-dev-pci.c
@@ -23,14 +23,90 @@ struct VhostVdpaDevicePCI {
     VhostVdpaDevice vdev;
 };
 
+static uint32_t
+vdpa_dev_pci_get_info(const char *name, uint64_t cmd, Error **errp)
+{
+    int device_fd;
+    uint32_t val;
+    int ret;
+
+    device_fd = qemu_open(name, O_RDWR, errp);
+    if (device_fd == -1) {
+        return (uint32_t)-1;
+    }
+
+    ret = ioctl(device_fd, cmd, &val);
+    if (ret < 0) {
+        error_setg(errp, "vhost-vdpa-device-pci: cmd 0x%lx failed: %s",
+                   cmd, strerror(errno));
+        goto out;
+    }
+
+out:
+    close(device_fd);
+    return val;
+}
+
+static inline uint32_t
+vdpa_dev_pci_get_devid(VhostVdpaDevicePCI *dev, Error **errp)
+{
+    return vdpa_dev_pci_get_info(dev->vdev.vdpa_dev,
+                                 VHOST_VDPA_GET_DEVICE_ID, errp);
+}
+
+static inline uint32_t
+vdpa_dev_pci_get_vectors_num(VhostVdpaDevicePCI *dev, Error **errp)
+{
+    return vdpa_dev_pci_get_info(dev->vdev.vdpa_dev,
+                                 VHOST_VDPA_GET_VECTORS_NUM, errp);
+}
+
 static void vhost_vdpa_device_pci_instance_init(Object *obj)
 {
-    return;
+    VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(obj);
+
+    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+                                TYPE_VHOST_VDPA_DEVICE);
+    object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
+                              "bootindex");
+}
+
+static Property vhost_vdpa_device_pci_properties[] = {
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void
+vhost_vdpa_device_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
+{
+    VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(vpci_dev);
+    DeviceState *vdev = DEVICE(&dev->vdev);
+    uint32_t devid;
+    uint32_t vectors;
+
+    devid = vdpa_dev_pci_get_devid(dev, errp);
+    if (*errp) {
+        return;
+    }
+
+    vectors = vdpa_dev_pci_get_vectors_num(dev, errp);
+    if (*errp) {
+        return;
+    }
+
+    vpci_dev->class_code = virtio_pci_get_class_id(devid);
+    vpci_dev->pdev_id = virtio_pci_get_pci_devid(devid);
+    vpci_dev->nvectors = vectors;
+    qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
 }
 
 static void vhost_vdpa_device_pci_class_init(ObjectClass *klass, void *data)
 {
-    return;
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
+
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+    device_class_set_props(dc, vhost_vdpa_device_pci_properties);
+    k->realize = vhost_vdpa_device_pci_realize;
 }
 
 static const VirtioPCIDeviceTypeInfo vhost_vdpa_device_pci_info = {
diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index f4f92b90b0..790117fb3b 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -15,16 +15,80 @@
 #include "sysemu/sysemu.h"
 #include "sysemu/runstate.h"
 
-static void vhost_vdpa_device_class_init(ObjectClass *klass, void *data)
+static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
 {
     return;
 }
 
-static void vhost_vdpa_device_instance_init(Object *obj)
+static void vhost_vdpa_device_unrealize(DeviceState *dev)
+{
+    return;
+}
+
+static void
+vhost_vdpa_device_get_config(VirtIODevice *vdev, uint8_t *config)
+{
+    return;
+}
+
+static void
+vhost_vdpa_device_set_config(VirtIODevice *vdev, const uint8_t *config)
 {
     return;
 }
 
+static uint64_t vhost_vdpa_device_get_features(VirtIODevice *vdev,
+                                               uint64_t features,
+                                               Error **errp)
+{
+    return (uint64_t)-1;
+}
+
+static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
+{
+    return;
+}
+
+static Property vhost_vdpa_device_properties[] = {
+    DEFINE_PROP_STRING("vdpa-dev", VhostVdpaDevice, vdpa_dev),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static const VMStateDescription vmstate_vhost_vdpa_device = {
+    .name = "vhost-vdpa-device",
+    .minimum_version_id = 1,
+    .version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_VIRTIO_DEVICE,
+        VMSTATE_END_OF_LIST()
+    },
+};
+
+static void vhost_vdpa_device_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+
+    device_class_set_props(dc, vhost_vdpa_device_properties);
+    dc->desc = "VDPA-based generic PCI device assignment";
+    dc->vmsd = &vmstate_vhost_vdpa_device;
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+    vdc->realize = vhost_vdpa_device_realize;
+    vdc->unrealize = vhost_vdpa_device_unrealize;
+    vdc->get_config = vhost_vdpa_device_get_config;
+    vdc->set_config = vhost_vdpa_device_set_config;
+    vdc->get_features = vhost_vdpa_device_get_features;
+    vdc->set_status = vhost_vdpa_device_set_status;
+}
+
+static void vhost_vdpa_device_instance_init(Object *obj)
+{
+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(obj);
+
+    device_add_bootindex_property(obj, &s->bootindex, "bootindex",
+                                  NULL, DEVICE(obj));
+}
+
 static const TypeInfo vhost_vdpa_device_info = {
     .name = TYPE_VHOST_VDPA_DEVICE,
     .parent = TYPE_VIRTIO_DEVICE,
diff --git a/include/hw/virtio/vdpa-dev.h b/include/hw/virtio/vdpa-dev.h
index dd94bd74a2..7a0e6bdcf8 100644
--- a/include/hw/virtio/vdpa-dev.h
+++ b/include/hw/virtio/vdpa-dev.h
@@ -11,6 +11,8 @@ OBJECT_DECLARE_SIMPLE_TYPE(VhostVdpaDevice, VHOST_VDPA_DEVICE)
 
 struct VhostVdpaDevice {
     VirtIODevice parent_obj;
+    char *vdpa_dev;
+    int32_t bootindex;
 };
 
 #endif
-- 
2.23.0




* [RFC 05/10] vdpa-dev: implement the realize interface
  2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
                   ` (3 preceding siblings ...)
  2022-01-05  0:58 ` [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface Longpeng(Mike) via
@ 2022-01-05  0:58 ` Longpeng(Mike) via
  2022-01-05 10:17   ` Stefan Hajnoczi
  2022-01-05  0:58 ` [RFC 06/10] vdpa-dev: implement the unrealize interface Longpeng(Mike) via
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 52+ messages in thread
From: Longpeng(Mike) via @ 2022-01-05  0:58 UTC (permalink / raw)
  To: stefanha, mst, jasowang, sgarzare
  Cc: cohuck, pbonzini, arei.gonglei, yechuan, huangzhichao,
	qemu-devel, Longpeng

From: Longpeng <longpeng2@huawei.com>

Implements the .realize interface.

Signed-off-by: Longpeng <longpeng2@huawei.com>
---
 hw/virtio/vdpa-dev.c         | 114 +++++++++++++++++++++++++++++++++++
 include/hw/virtio/vdpa-dev.h |   8 +++
 2 files changed, 122 insertions(+)

diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 790117fb3b..2d534d837a 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -15,9 +15,122 @@
 #include "sysemu/sysemu.h"
 #include "sysemu/runstate.h"
 
+static void
+vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+{
+    /* Nothing to do */
+}
+
+static int vdpa_dev_get_info_by_fd(int fd, uint64_t cmd, Error **errp)
+{
+    int val;
+
+    if (ioctl(fd, cmd, &val) < 0) {
+        error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
+                   cmd, strerror(errno));
+        return -1;
+    }
+
+    return val;
+}
+
+static inline int vdpa_dev_get_queue_size(int fd, Error **errp)
+{
+    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VRING_NUM, errp);
+}
+
+static inline int vdpa_dev_get_vqs_num(int fd, Error **errp)
+{
+    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VQS_NUM, errp);
+}
+
+static inline int vdpa_dev_get_config_size(int fd, Error **errp)
+{
+    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_CONFIG_SIZE, errp);
+}
+
 static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
 {
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
+    uint32_t device_id;
+    int max_queue_size;
+    int fd;
+    int i, ret;
+
+    fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
+    if (fd == -1) {
+        return;
+    }
+    s->vdpa.device_fd = fd;
+
+    max_queue_size = vdpa_dev_get_queue_size(fd, errp);
+    if (*errp) {
+        goto out;
+    }
+
+    if (s->queue_size > max_queue_size) {
+        error_setg(errp, "vhost-vdpa-device: invalid queue_size: %d (max:%d)",
+                   s->queue_size, max_queue_size);
+        goto out;
+    } else if (!s->queue_size) {
+        s->queue_size = max_queue_size;
+    }
+
+    ret = vdpa_dev_get_vqs_num(fd, errp);
+    if (*errp) {
+        goto out;
+    }
+
+    s->dev.nvqs = ret;
+    s->dev.vqs = g_new0(struct vhost_virtqueue, s->dev.nvqs);
+    s->dev.vq_index = 0;
+    s->dev.vq_index_end = s->dev.nvqs;
+    s->dev.backend_features = 0;
+    s->started = false;
+
+    ret = vhost_dev_init(&s->dev, &s->vdpa, VHOST_BACKEND_TYPE_VDPA, 0, NULL);
+    if (ret < 0) {
+        error_setg(errp, "vhost-vdpa-device: vhost initialization failed: %s",
+                   strerror(-ret));
+        goto out;
+    }
+
+    ret = s->dev.vhost_ops->vhost_get_device_id(&s->dev, &device_id);
+    if (ret < 0) {
+        error_setg(errp, "vhost-vdpa-device: vhost get device id failed: %s",
+                   strerror(-ret));
+        goto vhost_cleanup;
+    }
+
+    s->config_size = vdpa_dev_get_config_size(fd, errp);
+    if (*errp) {
+        goto vhost_cleanup;
+    }
+
+    s->config = g_malloc0(s->config_size);
+
+    ret = vhost_dev_get_config(&s->dev, s->config, s->config_size, NULL);
+    if (ret < 0) {
+        error_setg(errp, "vhost-vdpa-device: get config failed");
+        goto config_err;
+    }
+
+    virtio_init(vdev, "vhost-vdpa", device_id, s->config_size);
+
+    s->virtqs = g_new0(VirtQueue *, s->dev.nvqs);
+    for (i = 0; i < s->dev.nvqs; i++) {
+        s->virtqs[i] = virtio_add_queue(vdev, s->queue_size,
+                                        vhost_vdpa_device_dummy_handle_output);
+    }
+
     return;
+config_err:
+    g_free(s->config);
+vhost_cleanup:
+    vhost_dev_cleanup(&s->dev);
+out:
+    close(fd);
 }
 
 static void vhost_vdpa_device_unrealize(DeviceState *dev)
@@ -51,6 +164,7 @@ static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
 
 static Property vhost_vdpa_device_properties[] = {
     DEFINE_PROP_STRING("vdpa-dev", VhostVdpaDevice, vdpa_dev),
+    DEFINE_PROP_UINT16("queue-size", VhostVdpaDevice, queue_size, 0),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/virtio/vdpa-dev.h b/include/hw/virtio/vdpa-dev.h
index 7a0e6bdcf8..49f8145d61 100644
--- a/include/hw/virtio/vdpa-dev.h
+++ b/include/hw/virtio/vdpa-dev.h
@@ -13,6 +13,14 @@ struct VhostVdpaDevice {
     VirtIODevice parent_obj;
     char *vdpa_dev;
     int32_t bootindex;
+    struct vhost_dev dev;
+    struct vhost_vdpa vdpa;
+    VirtQueue **virtqs;
+    uint8_t *config;
+    int config_size;
+    uint32_t num_queues;
+    uint16_t queue_size;
+    bool started;
 };
 
 #endif
-- 
2.23.0




* [RFC 06/10] vdpa-dev: implement the unrealize interface
  2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
                   ` (4 preceding siblings ...)
  2022-01-05  0:58 ` [RFC 05/10] vdpa-dev: implement the realize interface Longpeng(Mike) via
@ 2022-01-05  0:58 ` Longpeng(Mike) via
  2022-01-05 11:16   ` Stefano Garzarella
  2022-01-05  0:58 ` [RFC 07/10] vdpa-dev: implement the get_config/set_config interface Longpeng(Mike) via
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 52+ messages in thread
From: Longpeng(Mike) via @ 2022-01-05  0:58 UTC (permalink / raw)
  To: stefanha, mst, jasowang, sgarzare
  Cc: cohuck, pbonzini, arei.gonglei, yechuan, huangzhichao,
	qemu-devel, Longpeng

From: Longpeng <longpeng2@huawei.com>

Implements the .unrealize interface.

Signed-off-by: Longpeng <longpeng2@huawei.com>
---
 hw/virtio/vdpa-dev.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 2d534d837a..4e4dd3d201 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -133,9 +133,29 @@ out:
     close(fd);
 }
 
+static void vhost_vdpa_vdev_unrealize(VhostVdpaDevice *s)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+    int i;
+
+    for (i = 0; i < s->num_queues; i++) {
+        virtio_delete_queue(s->virtqs[i]);
+    }
+    g_free(s->virtqs);
+    virtio_cleanup(vdev);
+
+    g_free(s->config);
+}
+
 static void vhost_vdpa_device_unrealize(DeviceState *dev)
 {
-    return;
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
+
+    virtio_set_status(vdev, 0);
+    vhost_dev_cleanup(&s->dev);
+    vhost_vdpa_vdev_unrealize(s);
+    close(s->vdpa.device_fd);
 }
 
 static void
-- 
2.23.0




* [RFC 07/10] vdpa-dev: implement the get_config/set_config interface
  2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
                   ` (5 preceding siblings ...)
  2022-01-05  0:58 ` [RFC 06/10] vdpa-dev: implement the unrealize interface Longpeng(Mike) via
@ 2022-01-05  0:58 ` Longpeng(Mike) via
  2022-01-05  0:58 ` [RFC 08/10] vdpa-dev: implement the get_features interface Longpeng(Mike) via
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 52+ messages in thread
From: Longpeng(Mike) via @ 2022-01-05  0:58 UTC (permalink / raw)
  To: stefanha, mst, jasowang, sgarzare
  Cc: cohuck, pbonzini, arei.gonglei, yechuan, huangzhichao,
	qemu-devel, Longpeng

From: Longpeng <longpeng2@huawei.com>

Implements the .get_config and .set_config interfaces.

Signed-off-by: Longpeng <longpeng2@huawei.com>
---
 hw/virtio/vdpa-dev.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 4e4dd3d201..4f97a7521b 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -161,13 +161,23 @@ static void vhost_vdpa_device_unrealize(DeviceState *dev)
 static void
 vhost_vdpa_device_get_config(VirtIODevice *vdev, uint8_t *config)
 {
-    return;
+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
+
+    memcpy(config, s->config, s->config_size);
 }
 
 static void
 vhost_vdpa_device_set_config(VirtIODevice *vdev, const uint8_t *config)
 {
-    return;
+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
+    int ret;
+
+    ret = vhost_dev_set_config(&s->dev, s->config, 0, s->config_size,
+                               VHOST_SET_CONFIG_TYPE_MASTER);
+    if (ret) {
+        error_report("set device config space failed");
+        return;
+    }
 }
 
 static uint64_t vhost_vdpa_device_get_features(VirtIODevice *vdev,
-- 
2.23.0




* [RFC 08/10] vdpa-dev: implement the get_features interface
  2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
                   ` (6 preceding siblings ...)
  2022-01-05  0:58 ` [RFC 07/10] vdpa-dev: implement the get_config/set_config interface Longpeng(Mike) via
@ 2022-01-05  0:58 ` Longpeng(Mike) via
  2022-01-05  0:58 ` [RFC 09/10] vdpa-dev: implement the set_status interface Longpeng(Mike) via
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 52+ messages in thread
From: Longpeng(Mike) via @ 2022-01-05  0:58 UTC (permalink / raw)
  To: stefanha, mst, jasowang, sgarzare
  Cc: cohuck, pbonzini, arei.gonglei, yechuan, huangzhichao,
	qemu-devel, Longpeng

From: Longpeng <longpeng2@huawei.com>

Implements the .get_features interface.

Signed-off-by: Longpeng <longpeng2@huawei.com>
---
 hw/virtio/vdpa-dev.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 4f97a7521b..32b3117c4b 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -184,7 +184,14 @@ static uint64_t vhost_vdpa_device_get_features(VirtIODevice *vdev,
                                                uint64_t features,
                                                Error **errp)
 {
-    return (uint64_t)-1;
+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
+    uint64_t backend_features = s->dev.features;
+
+    if (!virtio_has_feature(features, VIRTIO_F_IOMMU_PLATFORM)) {
+        virtio_clear_feature(&backend_features, VIRTIO_F_IOMMU_PLATFORM);
+    }
+
+    return backend_features;
 }
 
 static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
-- 
2.23.0




* [RFC 09/10] vdpa-dev: implement the set_status interface
  2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
                   ` (7 preceding siblings ...)
  2022-01-05  0:58 ` [RFC 08/10] vdpa-dev: implement the get_features interface Longpeng(Mike) via
@ 2022-01-05  0:58 ` Longpeng(Mike) via
  2022-01-05  0:59 ` [RFC 10/10] vdpa-dev: mark the device as unmigratable Longpeng(Mike) via
  2022-01-05 10:21 ` [RFC 00/10] add generic vDPA device support Stefan Hajnoczi
  10 siblings, 0 replies; 52+ messages in thread
From: Longpeng(Mike) via @ 2022-01-05  0:58 UTC (permalink / raw)
  To: stefanha, mst, jasowang, sgarzare
  Cc: cohuck, pbonzini, arei.gonglei, yechuan, huangzhichao,
	qemu-devel, Longpeng

From: Longpeng <longpeng2@huawei.com>

Implements the .set_status interface.

Signed-off-by: Longpeng <longpeng2@huawei.com>
---
 hw/virtio/vdpa-dev.c | 100 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 99 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 32b3117c4b..64649bfb5a 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -194,9 +194,107 @@ static uint64_t vhost_vdpa_device_get_features(VirtIODevice *vdev,
     return backend_features;
 }
 
+static int vhost_vdpa_device_start(VirtIODevice *vdev, Error **errp)
+{
+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
+    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+    int i, ret;
+
+    if (!k->set_guest_notifiers) {
+        error_setg(errp, "binding does not support guest notifiers");
+        return -ENOSYS;
+    }
+
+    ret = vhost_dev_enable_notifiers(&s->dev, vdev);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Error enabling host notifiers");
+        return ret;
+    }
+
+    ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, true);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Error binding guest notifier");
+        goto err_host_notifiers;
+    }
+
+    s->dev.acked_features = vdev->guest_features;
+
+    ret = vhost_dev_start(&s->dev, vdev);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Error starting vhost");
+        goto err_guest_notifiers;
+    }
+    s->started = true;
+
+    /*
+     * guest_notifier_mask/pending not used yet, so just unmask
+     * everything here. virtio-pci will do the right thing by
+     * enabling/disabling irqfd.
+     */
+    for (i = 0; i < s->dev.nvqs; i++) {
+        vhost_virtqueue_mask(&s->dev, vdev, i, false);
+    }
+
+    return ret;
+
+err_guest_notifiers:
+    k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
+err_host_notifiers:
+    vhost_dev_disable_notifiers(&s->dev, vdev);
+    return ret;
+}
+
+static void vhost_vdpa_device_stop(VirtIODevice *vdev)
+{
+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
+    BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+    VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
+    int ret;
+
+    if (!s->started) {
+        return;
+    }
+    s->started = false;
+
+    if (!k->set_guest_notifiers) {
+        return;
+    }
+
+    vhost_dev_stop(&s->dev, vdev);
+
+    ret = k->set_guest_notifiers(qbus->parent, s->dev.nvqs, false);
+    if (ret < 0) {
+        error_report("vhost guest notifier cleanup failed: %d", ret);
+        return;
+    }
+
+    vhost_dev_disable_notifiers(&s->dev, vdev);
+}
+
 static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
 {
-    return;
+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
+    bool should_start = virtio_device_started(vdev, status);
+    Error *local_err = NULL;
+    int ret;
+
+    if (!vdev->vm_running) {
+        should_start = false;
+    }
+
+    if (s->started == should_start) {
+        return;
+    }
+
+    if (should_start) {
+        ret = vhost_vdpa_device_start(vdev, &local_err);
+        if (ret < 0) {
+            error_reportf_err(local_err, "vhost-vdpa-device: start failed: ");
+        }
+    } else {
+        vhost_vdpa_device_stop(vdev);
+    }
 }
 
 static Property vhost_vdpa_device_properties[] = {
-- 
2.23.0




* [RFC 10/10] vdpa-dev: mark the device as unmigratable
  2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
                   ` (8 preceding siblings ...)
  2022-01-05  0:58 ` [RFC 09/10] vdpa-dev: implement the set_status interface Longpeng(Mike) via
@ 2022-01-05  0:59 ` Longpeng(Mike) via
  2022-01-05 10:21 ` [RFC 00/10] add generic vDPA device support Stefan Hajnoczi
  10 siblings, 0 replies; 52+ messages in thread
From: Longpeng(Mike) via @ 2022-01-05  0:59 UTC (permalink / raw)
  To: stefanha, mst, jasowang, sgarzare
  Cc: cohuck, pbonzini, arei.gonglei, yechuan, huangzhichao,
	qemu-devel, Longpeng

From: Longpeng <longpeng2@huawei.com>

The generic vDPA device does not support migration yet, so mark it as
unmigratable for now.

Signed-off-by: Longpeng <longpeng2@huawei.com>
---
 hw/virtio/vdpa-dev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 64649bfb5a..0644aace22 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -305,6 +305,7 @@ static Property vhost_vdpa_device_properties[] = {
 
 static const VMStateDescription vmstate_vhost_vdpa_device = {
     .name = "vhost-vdpa-device",
+    .unmigratable = 1,
     .minimum_version_id = 1,
     .version_id = 1,
     .fields = (VMStateField[]) {
-- 
2.23.0




* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05  0:58 ` [RFC 02/10] vhost: add 3 commands for vhost-vdpa Longpeng(Mike) via
@ 2022-01-05  4:35   ` Jason Wang
  2022-01-05  6:40     ` longpeng2--- via
  2022-01-05  7:02     ` Michael S. Tsirkin
  0 siblings, 2 replies; 52+ messages in thread
From: Jason Wang @ 2022-01-05  4:35 UTC (permalink / raw)
  To: Longpeng(Mike)
  Cc: mst, Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	huangzhichao, Stefan Hajnoczi, pbonzini, Stefano Garzarella

On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
>
> From: Longpeng <longpeng2@huawei.com>
>
> To support generic vdpa deivce, we need add the following ioctls:
> - GET_VECTORS_NUM: the count of vectors that supported

Does this mean MSI vectors? If yes, it looks like a layer violation:
vhost is transport independent.  And it reveals device implementation
details which block (cross vendor) migration.

Thanks

> - GET_CONFIG_SIZE: the size of the virtio config space
> - GET_VQS_NUM: the count of virtqueues that exported
>
> Signed-off-by: Longpeng <longpeng2@huawei.com>
> ---
>  linux-headers/linux/vhost.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> index c998860d7b..c5edd75d15 100644
> --- a/linux-headers/linux/vhost.h
> +++ b/linux-headers/linux/vhost.h
> @@ -150,4 +150,14 @@
>  /* Get the valid iova range */
>  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
>                                              struct vhost_vdpa_iova_range)
> +
> +/* Get the number of vectors */
> +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
> +
> +/* Get the virtio config size */
> +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
> +
> +/* Get the number of virtqueues */
> +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
> +
>  #endif
> --
> 2.23.0
>




* Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio id
  2022-01-05  0:58 ` [RFC 01/10] virtio: get class_id and pci device id by the virtio id Longpeng(Mike) via
@ 2022-01-05  4:37   ` Jason Wang
  2022-01-05  5:47     ` longpeng2--- via
  2022-01-05 10:46   ` Cornelia Huck
  2022-01-10  5:43   ` Michael S. Tsirkin
  2 siblings, 1 reply; 52+ messages in thread
From: Jason Wang @ 2022-01-05  4:37 UTC (permalink / raw)
  To: Longpeng(Mike)
  Cc: mst, Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	huangzhichao, Stefan Hajnoczi, pbonzini, Stefano Garzarella

On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
>
> From: Longpeng <longpeng2@huawei.com>
>
> Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> deivce which is specificed by the "Virtio Device ID".
>
> These helpers will be used to build the generic vDPA device later.
>
> Signed-off-by: Longpeng <longpeng2@huawei.com>
> ---
>  hw/virtio/virtio-pci.c | 93 ++++++++++++++++++++++++++++++++++++++++++
>  hw/virtio/virtio-pci.h |  4 ++
>  2 files changed, 97 insertions(+)
>
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 750aa47ec1..843085c4ea 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -19,6 +19,7 @@
>
>  #include "exec/memop.h"
>  #include "standard-headers/linux/virtio_pci.h"
> +#include "standard-headers/linux/virtio_ids.h"
>  #include "hw/boards.h"
>  #include "hw/virtio/virtio.h"
>  #include "migration/qemu-file-types.h"
> @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n, QEMUFile *f)
>      return 0;
>  }
>
> +typedef struct VirtIOPCIIDInfo {
> +    uint16_t vdev_id; /* virtio id */
> +    uint16_t pdev_id; /* pci device id */
> +    uint16_t class_id;
> +} VirtIOPCIIDInfo;
> +
> +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> +    {

Any way to get rid of this array? E.g using the algorithm that is used
by the kernel virtio driver.

Thanks

> +        .vdev_id = VIRTIO_ID_NET,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> +        .class_id = PCI_CLASS_NETWORK_ETHERNET,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_BLOCK,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> +        .class_id = PCI_CLASS_STORAGE_SCSI,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_CONSOLE,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_SCSI,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> +        .class_id = PCI_CLASS_STORAGE_SCSI,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_9P,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> +        .class_id = PCI_BASE_CLASS_NETWORK,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_VSOCK,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_IOMMU,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_MEM,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_PMEM,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_RNG,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_BALLOON,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +};
> +
> +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> +{
> +    VirtIOPCIIDInfo info = {};
> +    int i;
> +
> +    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> +        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> +            info = virtio_pci_id_info[i];
> +            break;
> +        }
> +    }
> +
> +    return info;
> +}
> +
> +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> +{
> +    return virtio_pci_get_id_info(device_id).pdev_id;
> +}
> +
> +uint16_t virtio_pci_get_class_id(uint16_t device_id)
> +{
> +    return virtio_pci_get_id_info(device_id).class_id;
> +}
> +
>  static bool virtio_pci_ioeventfd_enabled(DeviceState *d)
>  {
>      VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
> @@ -1674,6 +1764,9 @@ static void virtio_pci_device_plugged(DeviceState *d, Error **errp)
>           * is set to PCI_SUBVENDOR_ID_REDHAT_QUMRANET by default.
>           */
>          pci_set_word(config + PCI_SUBSYSTEM_ID, virtio_bus_get_vdev_id(bus));
> +        if (proxy->pdev_id) {
> +            pci_config_set_device_id(config, proxy->pdev_id);
> +        }
>      } else {
>          /* pure virtio-1.0 */
>          pci_set_word(config + PCI_VENDOR_ID,
> diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> index 2446dcd9ae..06aa59436e 100644
> --- a/hw/virtio/virtio-pci.h
> +++ b/hw/virtio/virtio-pci.h
> @@ -146,6 +146,7 @@ struct VirtIOPCIProxy {
>      bool disable_modern;
>      bool ignore_backend_features;
>      OnOffAuto disable_legacy;
> +    uint16_t pdev_id;
>      uint32_t class_code;
>      uint32_t nvectors;
>      uint32_t dfselect;
> @@ -158,6 +159,9 @@ struct VirtIOPCIProxy {
>      VirtioBusState bus;
>  };
>
> +uint16_t virtio_pci_get_pci_devid(uint16_t device_id);
> +uint16_t virtio_pci_get_class_id(uint16_t device_id);
> +
>  static inline bool virtio_pci_modern(VirtIOPCIProxy *proxy)
>  {
>      return !proxy->disable_modern;
> --
> 2.23.0
>




* RE: [RFC 01/10] virtio: get class_id and pci device id by the virtio id
  2022-01-05  4:37   ` Jason Wang
@ 2022-01-05  5:47     ` longpeng2--- via
  2022-01-05  6:15       ` Jason Wang
  0 siblings, 1 reply; 52+ messages in thread
From: longpeng2--- via @ 2022-01-05  5:47 UTC (permalink / raw)
  To: Jason Wang
  Cc: Stefan Hajnoczi, mst, Stefano Garzarella, Cornelia Huck,
	pbonzini, Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Wednesday, January 5, 2022 12:38 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>; mst <mst@redhat.com>; Stefano
> Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
> <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
> <qemu-devel@nongnu.org>
> Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio
> id
> 
> On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> >
> > From: Longpeng <longpeng2@huawei.com>
> >
> > Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> > deivce which is specificed by the "Virtio Device ID".
> >
> > These helpers will be used to build the generic vDPA device later.
> >
> > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > ---
> >  hw/virtio/virtio-pci.c | 93 ++++++++++++++++++++++++++++++++++++++++++
> >  hw/virtio/virtio-pci.h |  4 ++
> >  2 files changed, 97 insertions(+)
> >
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index 750aa47ec1..843085c4ea 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -19,6 +19,7 @@
> >
> >  #include "exec/memop.h"
> >  #include "standard-headers/linux/virtio_pci.h"
> > +#include "standard-headers/linux/virtio_ids.h"
> >  #include "hw/boards.h"
> >  #include "hw/virtio/virtio.h"
> >  #include "migration/qemu-file-types.h"
> > @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n,
> QEMUFile *f)
> >      return 0;
> >  }
> >
> > +typedef struct VirtIOPCIIDInfo {
> > +    uint16_t vdev_id; /* virtio id */
> > +    uint16_t pdev_id; /* pci device id */
> > +    uint16_t class_id;
> > +} VirtIOPCIIDInfo;
> > +
> > +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> > +    {
> 
> Any way to get rid of this array? E.g using the algorithm that is used
> by the kernel virtio driver.
> 

For the device ID, we can use that algorithm if we don't need to support
the transitional IDs. But how would we get the class ID?
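
(For reference, the "algorithm" mentioned above maps a modern virtio ID to
its PCI device ID arithmetically, roughly:

    /* modern (non-transitional) virtio-pci devices only */
    uint16_t pdev_id = 0x1040 + vdev_id;

but it covers neither the transitional device IDs nor the PCI class codes,
which is what the lookup table provides.)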

> Thanks
> 
> > +        .vdev_id = VIRTIO_ID_NET,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> > +        .class_id = PCI_CLASS_NETWORK_ETHERNET,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_BLOCK,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_CONSOLE,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_SCSI,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_9P,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> > +        .class_id = PCI_BASE_CLASS_NETWORK,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_VSOCK,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_IOMMU,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_MEM,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_PMEM,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_RNG,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_BALLOON,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +};
> > +
> > +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> > +{
> > +    VirtIOPCIIDInfo info = {};
> > +    int i;
> > +
> > +    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> > +        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> > +            info = virtio_pci_id_info[i];
> > +            break;
> > +        }
> > +    }
> > +
> > +    return info;
> > +}
> > +
> > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> > +{
> > +    return virtio_pci_get_id_info(device_id).pdev_id;
> > +}
> > +
> > +uint16_t virtio_pci_get_class_id(uint16_t device_id)
> > +{
> > +    return virtio_pci_get_id_info(device_id).class_id;
> > +}
> > +
> >  static bool virtio_pci_ioeventfd_enabled(DeviceState *d)
> >  {
> >      VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
> > @@ -1674,6 +1764,9 @@ static void virtio_pci_device_plugged(DeviceState *d,
> Error **errp)
> >           * is set to PCI_SUBVENDOR_ID_REDHAT_QUMRANET by default.
> >           */
> >          pci_set_word(config + PCI_SUBSYSTEM_ID,
> virtio_bus_get_vdev_id(bus));
> > +        if (proxy->pdev_id) {
> > +            pci_config_set_device_id(config, proxy->pdev_id);
> > +        }
> >      } else {
> >          /* pure virtio-1.0 */
> >          pci_set_word(config + PCI_VENDOR_ID,
> > diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> > index 2446dcd9ae..06aa59436e 100644
> > --- a/hw/virtio/virtio-pci.h
> > +++ b/hw/virtio/virtio-pci.h
> > @@ -146,6 +146,7 @@ struct VirtIOPCIProxy {
> >      bool disable_modern;
> >      bool ignore_backend_features;
> >      OnOffAuto disable_legacy;
> > +    uint16_t pdev_id;
> >      uint32_t class_code;
> >      uint32_t nvectors;
> >      uint32_t dfselect;
> > @@ -158,6 +159,9 @@ struct VirtIOPCIProxy {
> >      VirtioBusState bus;
> >  };
> >
> > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id);
> > +uint16_t virtio_pci_get_class_id(uint16_t device_id);
> > +
> >  static inline bool virtio_pci_modern(VirtIOPCIProxy *proxy)
> >  {
> >      return !proxy->disable_modern;
> > --
> > 2.23.0
> >



* Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio id
  2022-01-05  5:47     ` longpeng2--- via
@ 2022-01-05  6:15       ` Jason Wang
  2022-01-10  3:03         ` longpeng2--- via
  0 siblings, 1 reply; 52+ messages in thread
From: Jason Wang @ 2022-01-05  6:15 UTC (permalink / raw)
  To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
  Cc: mst, Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, Stefan Hajnoczi, pbonzini, Stefano Garzarella

On Wed, Jan 5, 2022 at 1:48 PM Longpeng (Mike, Cloud Infrastructure
Service Product Dept.) <longpeng2@huawei.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Jason Wang [mailto:jasowang@redhat.com]
> > Sent: Wednesday, January 5, 2022 12:38 PM
> > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > <longpeng2@huawei.com>
> > Cc: Stefan Hajnoczi <stefanha@redhat.com>; mst <mst@redhat.com>; Stefano
> > Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
> > <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> > <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
> > <qemu-devel@nongnu.org>
> > Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio
> > id
> >
> > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> > >
> > > From: Longpeng <longpeng2@huawei.com>
> > >
> > > Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> > > deivce which is specificed by the "Virtio Device ID".
> > >
> > > These helpers will be used to build the generic vDPA device later.
> > >
> > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > ---
> > >  hw/virtio/virtio-pci.c | 93 ++++++++++++++++++++++++++++++++++++++++++
> > >  hw/virtio/virtio-pci.h |  4 ++
> > >  2 files changed, 97 insertions(+)
> > >
> > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > index 750aa47ec1..843085c4ea 100644
> > > --- a/hw/virtio/virtio-pci.c
> > > +++ b/hw/virtio/virtio-pci.c
> > > @@ -19,6 +19,7 @@
> > >
> > >  #include "exec/memop.h"
> > >  #include "standard-headers/linux/virtio_pci.h"
> > > +#include "standard-headers/linux/virtio_ids.h"
> > >  #include "hw/boards.h"
> > >  #include "hw/virtio/virtio.h"
> > >  #include "migration/qemu-file-types.h"
> > > @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n,
> > QEMUFile *f)
> > >      return 0;
> > >  }
> > >
> > > +typedef struct VirtIOPCIIDInfo {
> > > +    uint16_t vdev_id; /* virtio id */
> > > +    uint16_t pdev_id; /* pci device id */
> > > +    uint16_t class_id;
> > > +} VirtIOPCIIDInfo;
> > > +
> > > +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> > > +    {
> >
> > Any way to get rid of this array? E.g using the algorithm that is used
> > by the kernel virtio driver.
> >
>
> For device id, we can use the algorithm if we no need to support
> Transitional id. But how to get the class id ?

Right, I missed this. So the current code should be fine.

Thanks
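
For reference, a minimal sketch of the algorithm referred to above, i.e. the
virtio spec rule for modern (non-transitional) devices; the helper name is an
assumption, and it shows why the lookup table is still needed for the
transitional IDs and the PCI class codes:

#include <stdint.h>

/* Modern (virtio 1.0+) devices: PCI device ID = 0x1040 + virtio device ID. */
static uint16_t virtio_pci_modern_devid(uint16_t vdev_id)
{
    return 0x1040 + vdev_id;  /* e.g. VIRTIO_ID_NET (1) -> 0x1041 */
}

/* Transitional PCI device IDs and the PCI class code follow no such formula,
 * so they still have to come from a per-device table like the one above. */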

>
> > Thanks
> >
> > > +        .vdev_id = VIRTIO_ID_NET,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> > > +        .class_id = PCI_CLASS_NETWORK_ETHERNET,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_BLOCK,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> > > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_CONSOLE,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> > > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_SCSI,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> > > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_9P,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> > > +        .class_id = PCI_BASE_CLASS_NETWORK,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_VSOCK,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> > > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_IOMMU,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> > > +        .class_id = PCI_CLASS_OTHERS,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_MEM,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> > > +        .class_id = PCI_CLASS_OTHERS,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_PMEM,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> > > +        .class_id = PCI_CLASS_OTHERS,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_RNG,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> > > +        .class_id = PCI_CLASS_OTHERS,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_BALLOON,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> > > +        .class_id = PCI_CLASS_OTHERS,
> > > +    },
> > > +};
> > > +
> > > +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> > > +{
> > > +    VirtIOPCIIDInfo info = {};
> > > +    int i;
> > > +
> > > +    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> > > +        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> > > +            info = virtio_pci_id_info[i];
> > > +            break;
> > > +        }
> > > +    }
> > > +
> > > +    return info;
> > > +}
> > > +
> > > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> > > +{
> > > +    return virtio_pci_get_id_info(device_id).pdev_id;
> > > +}
> > > +
> > > +uint16_t virtio_pci_get_class_id(uint16_t device_id)
> > > +{
> > > +    return virtio_pci_get_id_info(device_id).class_id;
> > > +}
> > > +
> > >  static bool virtio_pci_ioeventfd_enabled(DeviceState *d)
> > >  {
> > >      VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
> > > @@ -1674,6 +1764,9 @@ static void virtio_pci_device_plugged(DeviceState *d,
> > Error **errp)
> > >           * is set to PCI_SUBVENDOR_ID_REDHAT_QUMRANET by default.
> > >           */
> > >          pci_set_word(config + PCI_SUBSYSTEM_ID,
> > virtio_bus_get_vdev_id(bus));
> > > +        if (proxy->pdev_id) {
> > > +            pci_config_set_device_id(config, proxy->pdev_id);
> > > +        }
> > >      } else {
> > >          /* pure virtio-1.0 */
> > >          pci_set_word(config + PCI_VENDOR_ID,
> > > diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> > > index 2446dcd9ae..06aa59436e 100644
> > > --- a/hw/virtio/virtio-pci.h
> > > +++ b/hw/virtio/virtio-pci.h
> > > @@ -146,6 +146,7 @@ struct VirtIOPCIProxy {
> > >      bool disable_modern;
> > >      bool ignore_backend_features;
> > >      OnOffAuto disable_legacy;
> > > +    uint16_t pdev_id;
> > >      uint32_t class_code;
> > >      uint32_t nvectors;
> > >      uint32_t dfselect;
> > > @@ -158,6 +159,9 @@ struct VirtIOPCIProxy {
> > >      VirtioBusState bus;
> > >  };
> > >
> > > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id);
> > > +uint16_t virtio_pci_get_class_id(uint16_t device_id);
> > > +
> > >  static inline bool virtio_pci_modern(VirtIOPCIProxy *proxy)
> > >  {
> > >      return !proxy->disable_modern;
> > > --
> > > 2.23.0
> > >
>



^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05  4:35   ` Jason Wang
@ 2022-01-05  6:40     ` longpeng2--- via
  2022-01-05  6:43       ` Jason Wang
  2022-01-05  7:02     ` Michael S. Tsirkin
  1 sibling, 1 reply; 52+ messages in thread
From: longpeng2--- via @ 2022-01-05  6:40 UTC (permalink / raw)
  To: Jason Wang
  Cc: Stefan Hajnoczi, mst, Stefano Garzarella, Cornelia Huck,
	pbonzini, Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Wednesday, January 5, 2022 12:36 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>; mst <mst@redhat.com>; Stefano
> Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
> <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
> <qemu-devel@nongnu.org>
> Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> 
> On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> >
> > From: Longpeng <longpeng2@huawei.com>
> >
> > To support the generic vdpa device, we need to add the following ioctls:
> > - GET_VECTORS_NUM: the count of vectors that are supported
> 
> Does this mean MSI vectors? If yes, it looks like a layer violation:
> vhost is transport independent.  And it reveals device implementation
> details which block (cross vendor) migration.
> 

Can we set the VirtIOPCIProxy.nvectors to "the count of virtqueues + 1 (config)" ?

> Thanks
> 
> > - GET_CONFIG_SIZE: the size of the virtio config space
> > - GET_VQS_NUM: the count of virtqueues that are exported
> >
> > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > ---
> >  linux-headers/linux/vhost.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > index c998860d7b..c5edd75d15 100644
> > --- a/linux-headers/linux/vhost.h
> > +++ b/linux-headers/linux/vhost.h
> > @@ -150,4 +150,14 @@
> >  /* Get the valid iova range */
> >  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
> >                                              struct vhost_vdpa_iova_range)
> > +
> > +/* Get the number of vectors */
> > +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
> > +
> > +/* Get the virtio config size */
> > +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
> > +
> > +/* Get the number of virtqueues */
> > +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
> > +
> >  #endif
> > --
> > 2.23.0
> >


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05  6:40     ` longpeng2--- via
@ 2022-01-05  6:43       ` Jason Wang
  0 siblings, 0 replies; 52+ messages in thread
From: Jason Wang @ 2022-01-05  6:43 UTC (permalink / raw)
  To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
  Cc: mst, Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, Stefan Hajnoczi, pbonzini, Stefano Garzarella


On 2022/1/5 at 2:40 PM, Longpeng (Mike, Cloud Infrastructure Service Product 
Dept.) wrote:
>
>> -----Original Message-----
>> From: Jason Wang [mailto:jasowang@redhat.com]
>> Sent: Wednesday, January 5, 2022 12:36 PM
>> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
>> <longpeng2@huawei.com>
>> Cc: Stefan Hajnoczi <stefanha@redhat.com>; mst <mst@redhat.com>; Stefano
>> Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
>> <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
>> <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
>> <qemu-devel@nongnu.org>
>> Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
>>
>> On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
>>> From: Longpeng <longpeng2@huawei.com>
>>>
>>> To support the generic vdpa device, we need to add the following ioctls:
>>> - GET_VECTORS_NUM: the count of vectors that are supported
>> Does this mean MSI vectors? If yes, it looks like a layer violation:
>> vhost is transport independent.  And it reveals device implementation
>> details which block (cross vendor) migration.
>>
> Can we set the VirtIOPCIProxy.nvectors to "the count of virtqueues + 1 (config)" ?


That should work.

Thanks
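
A minimal sketch of that idea, assuming the proxy already knows the number of
virtqueues (e.g. from VHOST_VDPA_GET_VQS_NUM); the helper name is made up for
illustration:

/* Derive the MSI-X vector count from the virtqueue count plus one vector
 * for the config interrupt, instead of querying the parent device for it. */
static void vhost_vdpa_device_pci_set_nvectors(VirtIOPCIProxy *vpci_dev,
                                               uint32_t nvqs)
{
    vpci_dev->nvectors = nvqs + 1; /* one per VQ + one for config changes */
}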


>
>> Thanks
>>
>>> - GET_CONFIG_SIZE: the size of the virtio config space
>>> - GET_VQS_NUM: the count of virtqueues that are exported
>>>
>>> Signed-off-by: Longpeng <longpeng2@huawei.com>
>>> ---
>>>   linux-headers/linux/vhost.h | 10 ++++++++++
>>>   1 file changed, 10 insertions(+)
>>>
>>> diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
>>> index c998860d7b..c5edd75d15 100644
>>> --- a/linux-headers/linux/vhost.h
>>> +++ b/linux-headers/linux/vhost.h
>>> @@ -150,4 +150,14 @@
>>>   /* Get the valid iova range */
>>>   #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
>>>                                               struct vhost_vdpa_iova_range)
>>> +
>>> +/* Get the number of vectors */
>>> +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
>>> +
>>> +/* Get the virtio config size */
>>> +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
>>> +
>>> +/* Get the number of virtqueues */
>>> +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
>>> +
>>>   #endif
>>> --
>>> 2.23.0
>>>



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05  4:35   ` Jason Wang
  2022-01-05  6:40     ` longpeng2--- via
@ 2022-01-05  7:02     ` Michael S. Tsirkin
  2022-01-05  7:54       ` Jason Wang
  1 sibling, 1 reply; 52+ messages in thread
From: Michael S. Tsirkin @ 2022-01-05  7:02 UTC (permalink / raw)
  To: Jason Wang
  Cc: Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	huangzhichao, Stefan Hajnoczi, pbonzini, Longpeng(Mike),
	Stefano Garzarella

On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> >
> > From: Longpeng <longpeng2@huawei.com>
> >
> > To support the generic vdpa device, we need to add the following ioctls:
> > - GET_VECTORS_NUM: the count of vectors that are supported
> 
> Does this mean MSI vectors? If yes, it looks like a layer violation:
> vhost is transport independent.

Well *guest* needs to know how many vectors device supports.
I don't think there's a way around that. Do you?
Otherwise guests will at best be suboptimal.

>  And it reveals device implementation
> details which block (cross vendor) migration.
> 
> Thanks

Not necessarily, userspace can hide this from guest if it
wants to, just validate.


> > - GET_CONFIG_SIZE: the size of the virtio config space
> > - GET_VQS_NUM: the count of virtqueues that are exported
> >
> > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > ---
> >  linux-headers/linux/vhost.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > index c998860d7b..c5edd75d15 100644
> > --- a/linux-headers/linux/vhost.h
> > +++ b/linux-headers/linux/vhost.h
> > @@ -150,4 +150,14 @@
> >  /* Get the valid iova range */
> >  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
> >                                              struct vhost_vdpa_iova_range)
> > +
> > +/* Get the number of vectors */
> > +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
> > +
> > +/* Get the virtio config size */
> > +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
> > +
> > +/* Get the number of virtqueues */
> > +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
> > +
> >  #endif
> > --
> > 2.23.0
> >



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05  7:02     ` Michael S. Tsirkin
@ 2022-01-05  7:54       ` Jason Wang
  2022-01-05  8:37         ` longpeng2--- via
  2022-01-05  9:12         ` Michael S. Tsirkin
  0 siblings, 2 replies; 52+ messages in thread
From: Jason Wang @ 2022-01-05  7:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, Stefan Hajnoczi, pbonzini, Longpeng(Mike),
	Stefano Garzarella

On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> > >
> > > From: Longpeng <longpeng2@huawei.com>
> > >
> > > To support the generic vdpa device, we need to add the following ioctls:
> > > - GET_VECTORS_NUM: the count of vectors that are supported
> >
> > Does this mean MSI vectors? If yes, it looks like a layer violation:
> > vhost is transport independent.
>
> Well *guest* needs to know how many vectors device supports.
> I don't think there's a way around that. Do you?

We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
simply assume #vqs + 1?

> Otherwise guests will at best be suboptimal.
>
> >  And it reveals device implementation
> > details which block (cross vendor) migration.
> >
> > Thanks
>
> Not necessarily, userspace can hide this from guest if it
> wants to, just validate.

If we can hide it at vhost/uAPI level, it would be even better?

Thanks

>
>
> > > - GET_CONFIG_SIZE: the size of the virtio config space
> > > - GET_VQS_NUM: the count of virtqueues that are exported
> > >
> > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > ---
> > >  linux-headers/linux/vhost.h | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > >
> > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > > index c998860d7b..c5edd75d15 100644
> > > --- a/linux-headers/linux/vhost.h
> > > +++ b/linux-headers/linux/vhost.h
> > > @@ -150,4 +150,14 @@
> > >  /* Get the valid iova range */
> > >  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
> > >                                              struct vhost_vdpa_iova_range)
> > > +
> > > +/* Get the number of vectors */
> > > +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
> > > +
> > > +/* Get the virtio config size */
> > > +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
> > > +
> > > +/* Get the number of virtqueues */
> > > +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
> > > +
> > >  #endif
> > > --
> > > 2.23.0
> > >
>



^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05  7:54       ` Jason Wang
@ 2022-01-05  8:37         ` longpeng2--- via
  2022-01-05  9:09           ` Jason Wang
  2022-01-05  9:12         ` Michael S. Tsirkin
  1 sibling, 1 reply; 52+ messages in thread
From: longpeng2--- via @ 2022-01-05  8:37 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: Stefan Hajnoczi, Stefano Garzarella, Cornelia Huck, pbonzini,
	Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Wednesday, January 5, 2022 3:54 PM
> To: Michael S. Tsirkin <mst@redhat.com>
> Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>; Stefan Hajnoczi <stefanha@redhat.com>; Stefano
> Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
> <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
> <qemu-devel@nongnu.org>
> Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> 
> On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> > > >
> > > > From: Longpeng <longpeng2@huawei.com>
> > > >
> > > > To support the generic vdpa device, we need to add the following ioctls:
> > > > - GET_VECTORS_NUM: the count of vectors that are supported
> > >
> > > Does this mean MSI vectors? If yes, it looks like a layer violation:
> > > vhost is transport independent.
> >
> > Well *guest* needs to know how many vectors device supports.
> > I don't think there's a way around that. Do you?
> 
> We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
> simply assume #vqs + 1?
> 
> > Otherwise guests will at best be suboptimal.
> >
> > >  And it reveals device implementation
> > > details which block (cross vendor) migration.
> > >
> > > Thanks
> >
> > Not necessarily, userspace can hide this from guest if it
> > wants to, just validate.
> 
> If we can hide it at vhost/uAPI level, it would be even better?
> 

Not only MSI vectors, but also queue-size, #vqs, etc.

Maybe the vhost level could expose the hardware's real capabilities
and let the userspace (QEMU) do the hiding? The userspace knows how
to process them.

> Thanks
> 
> >
> >
> > > > - GET_CONFIG_SIZE: the size of the virtio config space
> > > > - GET_VQS_NUM: the count of virtqueues that are exported
> > > >
> > > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > > ---
> > > >  linux-headers/linux/vhost.h | 10 ++++++++++
> > > >  1 file changed, 10 insertions(+)
> > > >
> > > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > > > index c998860d7b..c5edd75d15 100644
> > > > --- a/linux-headers/linux/vhost.h
> > > > +++ b/linux-headers/linux/vhost.h
> > > > @@ -150,4 +150,14 @@
> > > >  /* Get the valid iova range */
> > > >  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
> > > >                                              struct vhost_vdpa_iova_range)
> > > > +
> > > > +/* Get the number of vectors */
> > > > +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
> > > > +
> > > > +/* Get the virtio config size */
> > > > +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
> > > > +
> > > > +/* Get the number of virtqueues */
> > > > +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
> > > > +
> > > >  #endif
> > > > --
> > > > 2.23.0
> > > >
> >


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05  8:37         ` longpeng2--- via
@ 2022-01-05  9:09           ` Jason Wang
  2022-01-05 12:26             ` Michael S. Tsirkin
  0 siblings, 1 reply; 52+ messages in thread
From: Jason Wang @ 2022-01-05  9:09 UTC (permalink / raw)
  To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
  Cc: Michael S. Tsirkin, Cornelia Huck, qemu-devel, Yechuan,
	Gonglei (Arei),
	Huangzhichao, Stefan Hajnoczi, pbonzini, Stefano Garzarella

On Wed, Jan 5, 2022 at 4:37 PM Longpeng (Mike, Cloud Infrastructure
Service Product Dept.) <longpeng2@huawei.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Jason Wang [mailto:jasowang@redhat.com]
> > Sent: Wednesday, January 5, 2022 3:54 PM
> > To: Michael S. Tsirkin <mst@redhat.com>
> > Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > <longpeng2@huawei.com>; Stefan Hajnoczi <stefanha@redhat.com>; Stefano
> > Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
> > <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> > <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
> > <qemu-devel@nongnu.org>
> > Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> >
> > On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> > > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> > > > >
> > > > > From: Longpeng <longpeng2@huawei.com>
> > > > >
> > > > > To support the generic vdpa device, we need to add the following ioctls:
> > > > > - GET_VECTORS_NUM: the count of vectors that are supported
> > > >
> > > > Does this mean MSI vectors? If yes, it looks like a layer violation:
> > > > vhost is transport independent.
> > >
> > > Well *guest* needs to know how many vectors device supports.
> > > I don't think there's a way around that. Do you?
> >
> > We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
> > simply assume #vqs + 1?
> >
> > > Otherwise guests will at best be suboptimal.
> > >
> > > >  And it reveals device implementation
> > > > details which block (cross vendor) migration.
> > > >
> > > > Thanks
> > >
> > > Not necessarily, userspace can hide this from guest if it
> > > wants to, just validate.
> >
> > If we can hide it at vhost/uAPI level, it would be even better?
> >
>
> Not only MSI vectors, but also queue-size, #vqs, etc.

MSI is PCI-specific; we have non-PCI vDPA parents, e.g. VDUSE/simulator/mlx5.

And it's not guaranteed to stay unchanged, e.g. some drivers may choose to
allocate MSI during set_status(), which can fail for various reasons.

>
> Maybe the vhost level could expose the hardware's real capabilities
> and let the userspace (QEMU) do the hiding? The userspace knows how
> to process them.

The #MSI vectors is much easier to mediate than the queue size and #vqs.

For interrupts, we already have VHOST_SET_X_KICK, so we can keep
allocating eventfds based on the #MSI vectors to make it work with any
number of MSI vectors that the virtual device has.

For the queue size, it's OK to have a new uAPI but it's not a must; QEMU
can simply fail if SET_VRING_NUM fails.

For #vqs, it's OK to have a new uAPI since the emulated virtio-pci
device requires knowledge of the #vqs in the config space. (Still not a
must; we could enumerate #vqs per device type.)

For the config size, it's OK but not a must; technically we can simply
relay what the guest writes to vhost-vdpa. It's just that the current QEMU
requires it during virtio device initialization.

Thanks
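
As an illustration of the "simply fail" option for the queue size, a sketch
with a hypothetical helper name (the surrounding error handling is assumed)
that just lets the backend reject an unsupported ring size:

static int vhost_vdpa_check_queue_size(struct vhost_dev *dev,
                                       VirtIODevice *vdev, int idx)
{
    struct vhost_vring_state state = {
        .index = dev->vq_index + idx,
        .num   = virtio_queue_get_num(vdev, idx),
    };
    int r = dev->vhost_ops->vhost_set_vring_num(dev, &state);

    if (r) {
        error_report("vhost-vdpa: ring size %u rejected by the parent device",
                     state.num);
    }
    return r;
}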

>
> > Thanks
> >
> > >
> > >
> > > > > - GET_CONFIG_SIZE: the size of the virtio config space
> > > > > - GET_VQS_NUM: the count of virtqueues that are exported
> > > > >
> > > > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > > > ---
> > > > >  linux-headers/linux/vhost.h | 10 ++++++++++
> > > > >  1 file changed, 10 insertions(+)
> > > > >
> > > > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > > > > index c998860d7b..c5edd75d15 100644
> > > > > --- a/linux-headers/linux/vhost.h
> > > > > +++ b/linux-headers/linux/vhost.h
> > > > > @@ -150,4 +150,14 @@
> > > > >  /* Get the valid iova range */
> > > > >  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
> > > > >                                              struct vhost_vdpa_iova_range)
> > > > > +
> > > > > +/* Get the number of vectors */
> > > > > +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
> > > > > +
> > > > > +/* Get the virtio config size */
> > > > > +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
> > > > > +
> > > > > +/* Get the number of virtqueues */
> > > > > +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
> > > > > +
> > > > >  #endif
> > > > > --
> > > > > 2.23.0
> > > > >
> > >
>



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05  7:54       ` Jason Wang
  2022-01-05  8:37         ` longpeng2--- via
@ 2022-01-05  9:12         ` Michael S. Tsirkin
  2022-01-05  9:21           ` Jason Wang
  1 sibling, 1 reply; 52+ messages in thread
From: Michael S. Tsirkin @ 2022-01-05  9:12 UTC (permalink / raw)
  To: Jason Wang
  Cc: Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, Stefan Hajnoczi, pbonzini, Longpeng(Mike),
	Stefano Garzarella

On Wed, Jan 05, 2022 at 03:54:13PM +0800, Jason Wang wrote:
> On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> > > >
> > > > From: Longpeng <longpeng2@huawei.com>
> > > >
> > > > To support the generic vdpa device, we need to add the following ioctls:
> > > > - GET_VECTORS_NUM: the count of vectors that are supported
> > >
> > > Does this mean MSI vectors? If yes, it looks like a layer violation:
> > > vhost is transport independent.
> >
> > Well *guest* needs to know how many vectors device supports.
> > I don't think there's a way around that. Do you?
> 
> We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
> simply assume #vqs + 1?

Some hardware may be more limited. E.g. there could be a bunch of vqs
which don't need a dedicated interrupt. Or the device could support a single
interrupt shared between VQs.


> > Otherwise guests will at best be suboptimal.
> >
> > >  And it reveals device implementation
> > > details which block (cross vendor) migration.
> > >
> > > Thanks
> >
> > Not necessarily, userspace can hide this from guest if it
> > wants to, just validate.
> 
> If we can hide it at vhost/uAPI level, it would be even better?
> 
> Thanks
> 
> >
> >
> > > > - GET_CONFIG_SIZE: the size of the virtio config space
> > > > - GET_VQS_NUM: the count of virtqueues that are exported
> > > >
> > > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > > ---
> > > >  linux-headers/linux/vhost.h | 10 ++++++++++
> > > >  1 file changed, 10 insertions(+)
> > > >
> > > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > > > index c998860d7b..c5edd75d15 100644
> > > > --- a/linux-headers/linux/vhost.h
> > > > +++ b/linux-headers/linux/vhost.h
> > > > @@ -150,4 +150,14 @@
> > > >  /* Get the valid iova range */
> > > >  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
> > > >                                              struct vhost_vdpa_iova_range)
> > > > +
> > > > +/* Get the number of vectors */
> > > > +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
> > > > +
> > > > +/* Get the virtio config size */
> > > > +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
> > > > +
> > > > +/* Get the number of virtqueues */
> > > > +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
> > > > +
> > > >  #endif
> > > > --
> > > > 2.23.0
> > > >
> >



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05  9:12         ` Michael S. Tsirkin
@ 2022-01-05  9:21           ` Jason Wang
  0 siblings, 0 replies; 52+ messages in thread
From: Jason Wang @ 2022-01-05  9:21 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, Stefan Hajnoczi, pbonzini, Longpeng(Mike),
	Stefano Garzarella

On Wed, Jan 5, 2022 at 5:12 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jan 05, 2022 at 03:54:13PM +0800, Jason Wang wrote:
> > On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> > > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> > > > >
> > > > > From: Longpeng <longpeng2@huawei.com>
> > > > >
> > > > > To support the generic vdpa device, we need to add the following ioctls:
> > > > > - GET_VECTORS_NUM: the count of vectors that are supported
> > > >
> > > > Does this mean MSI vectors? If yes, it looks like a layer violation:
> > > > vhost is transport independent.
> > >
> > > Well *guest* needs to know how many vectors device supports.
> > > I don't think there's a way around that. Do you?
> >
> > We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
> > simply assume #vqs + 1?
>
> Some hardware may be more limited. E.g. there could be a bunch of vqs
> which don't need a dedicated interrupt. Or the device could support a single
> interrupt shared between VQs.

Right, but in the worst case, we may just waste some eventfds?

Or, if we want to be optimal in the case you mentioned above, we need
to know the association between vectors and vqs, which requires more
extensions (a purely illustrative sketch follows below):

1) an API to know which vector a dedicated VQ belongs to
2) the kick eventfd is set based on the vector, not the vq

And such mappings are not static, e.g. IFCVF requests MSI-X vectors
during DRIVER_OK.

Thanks
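
Purely for illustration, a hypothetical shape such an extension could take
(this uAPI does not exist and is not proposed by this series):

/* Hypothetical: query which MSI-X vector currently backs a given virtqueue. */
struct vhost_vdpa_vring_vector {
    __u32 index;   /* in: virtqueue index */
    __u32 vector;  /* out: vector assigned by the parent device */
};

#define VHOST_VDPA_GET_VRING_VECTOR    _IOWR(VHOST_VIRTIO, 0x82, \
                                             struct vhost_vdpa_vring_vector)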

>
>
> > > Otherwise guests will at best be suboptimal.
> > >
> > > >  And it reveals device implementation
> > > > details which block (cross vendor) migration.
> > > >
> > > > Thanks
> > >
> > > Not necessarily, userspace can hide this from guest if it
> > > wants to, just validate.
> >
> > If we can hide it at vhost/uAPI level, it would be even better?
> >
> > Thanks
> >
> > >
> > >
> > > > > - GET_CONFIG_SIZE: the size of the virtio config space
> > > > > - GET_VQS_NUM: the count of virtqueues that are exported
> > > > >
> > > > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > > > ---
> > > > >  linux-headers/linux/vhost.h | 10 ++++++++++
> > > > >  1 file changed, 10 insertions(+)
> > > > >
> > > > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > > > > index c998860d7b..c5edd75d15 100644
> > > > > --- a/linux-headers/linux/vhost.h
> > > > > +++ b/linux-headers/linux/vhost.h
> > > > > @@ -150,4 +150,14 @@
> > > > >  /* Get the valid iova range */
> > > > >  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
> > > > >                                              struct vhost_vdpa_iova_range)
> > > > > +
> > > > > +/* Get the number of vectors */
> > > > > +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
> > > > > +
> > > > > +/* Get the virtio config size */
> > > > > +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
> > > > > +
> > > > > +/* Get the number of virtqueues */
> > > > > +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
> > > > > +
> > > > >  #endif
> > > > > --
> > > > > 2.23.0
> > > > >
> > >
>



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 03/10] vdpa: add the infrastructure of vdpa-dev
  2022-01-05  0:58 ` [RFC 03/10] vdpa: add the infrastructure of vdpa-dev Longpeng(Mike) via
@ 2022-01-05  9:48   ` Stefan Hajnoczi
  2022-01-06  1:22     ` longpeng2--- via
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Hajnoczi @ 2022-01-05  9:48 UTC (permalink / raw)
  To: Longpeng(Mike)
  Cc: mst, jasowang, cohuck, qemu-devel, yechuan, arei.gonglei,
	huangzhichao, pbonzini, sgarzare

[-- Attachment #1: Type: text/plain, Size: 516 bytes --]

On Wed, Jan 05, 2022 at 08:58:53AM +0800, Longpeng(Mike) wrote:
> +static const VirtioPCIDeviceTypeInfo vhost_vdpa_device_pci_info = {
> +    .base_name               = TYPE_VHOST_VDPA_DEVICE_PCI,
> +    .generic_name            = "vhost-vdpa-device-pci",
> +    .transitional_name       = "vhost-vdpa-device-pci-transitional",
> +    .non_transitional_name   = "vhost-vdpa-device-pci-non-transitional",

Does vDPA support Transitional VIRTIO devices?

I expected this device to support Modern devices only.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface
  2022-01-05  0:58 ` [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface Longpeng(Mike) via
@ 2022-01-05 10:00   ` Stefan Hajnoczi
  2022-01-06  2:39     ` longpeng2--- via
  2022-01-05 11:28   ` Stefano Garzarella
  1 sibling, 1 reply; 52+ messages in thread
From: Stefan Hajnoczi @ 2022-01-05 10:00 UTC (permalink / raw)
  To: Longpeng(Mike)
  Cc: mst, jasowang, cohuck, qemu-devel, yechuan, arei.gonglei,
	huangzhichao, pbonzini, sgarzare

[-- Attachment #1: Type: text/plain, Size: 1953 bytes --]

On Wed, Jan 05, 2022 at 08:58:54AM +0800, Longpeng(Mike) wrote:
> From: Longpeng <longpeng2@huawei.com>
> 
> Implements the .instance_init and the .class_init interface.
> 
> Signed-off-by: Longpeng <longpeng2@huawei.com>
> ---
>  hw/virtio/vdpa-dev-pci.c     | 80 +++++++++++++++++++++++++++++++++++-
>  hw/virtio/vdpa-dev.c         | 68 +++++++++++++++++++++++++++++-
>  include/hw/virtio/vdpa-dev.h |  2 +
>  3 files changed, 146 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
> index a5a7b528a9..0af54a26d4 100644
> --- a/hw/virtio/vdpa-dev-pci.c
> +++ b/hw/virtio/vdpa-dev-pci.c
> @@ -23,14 +23,90 @@ struct VhostVdpaDevicePCI {
>      VhostVdpaDevice vdev;
>  };
>  
> +static uint32_t
> +vdpa_dev_pci_get_info(const char *name, uint64_t cmd, Error **errp)

vdpa_dev_pci_get_u32() might be a clearer name.

> +{
> +    int device_fd;
> +    uint32_t val;
> +    int ret;
> +
> +    device_fd = qemu_open(name, O_RDWR, errp);
> +    if (device_fd == -1) {
> +        return (uint32_t)-1;
> +    }
> +
> +    ret = ioctl(device_fd, cmd, &val);
> +    if (ret < 0) {
> +        error_setg(errp, "vhost-vdpa-device-pci: cmd 0x%lx failed: %s",
> +                   cmd, strerror(errno));
> +        goto out;
> +    }
> +
> +out:
> +    close(device_fd);

Race conditions are possible if the device node is replaced between
calls. Why not open the file once and reuse the fd across ioctl calls?

Both VhostVdpaDevicePCI and VhostVdpaDevice need the fd but
VhostVdpaDevicePCI needs it first. I suggest passing ownership of the fd
to the VhostVdpaDevice. One way of doing this is using QOM properties so
that VhostVdpaDevice can use the given fd instead of reopening the file.
(If fd is -1 then VhostVdpaDevice can try to open the file. That is
necessary when VhostVdpaDevice is used directly with virtio-mmio because
there is no proxy object.)
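
A sketch of that approach; the "vdpa-dev-fd" property name and the int32
field are assumptions, not part of the posted series:

/* Let the PCI proxy open /dev/vhost-vdpa-X once and hand the fd to the
 * generic device through a QOM property; -1 means "open vdpa-dev myself",
 * which covers the virtio-mmio case where there is no proxy object. */
static Property vhost_vdpa_device_properties[] = {
    DEFINE_PROP_STRING("vdpa-dev", VhostVdpaDevice, vdpa_dev),
    DEFINE_PROP_INT32("vdpa-dev-fd", VhostVdpaDevice, vdpa_dev_fd, -1),
    DEFINE_PROP_END_OF_LIST(),
};

In realize() the device would then use the fd when it is >= 0 and only fall
back to qemu_open() on the path otherwise.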

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 05/10] vdpa-dev: implement the realize interface
  2022-01-05  0:58 ` [RFC 05/10] vdpa-dev: implement the realize interface Longpeng(Mike) via
@ 2022-01-05 10:17   ` Stefan Hajnoczi
  2022-01-06  3:02     ` longpeng2--- via
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Hajnoczi @ 2022-01-05 10:17 UTC (permalink / raw)
  To: Longpeng(Mike)
  Cc: mst, jasowang, cohuck, qemu-devel, yechuan, arei.gonglei,
	huangzhichao, pbonzini, sgarzare

[-- Attachment #1: Type: text/plain, Size: 4131 bytes --]

On Wed, Jan 05, 2022 at 08:58:55AM +0800, Longpeng(Mike) wrote:
> From: Longpeng <longpeng2@huawei.com>
> 
> Implements the .realize interface.
> 
> Signed-off-by: Longpeng <longpeng2@huawei.com>
> ---
>  hw/virtio/vdpa-dev.c         | 114 +++++++++++++++++++++++++++++++++++
>  include/hw/virtio/vdpa-dev.h |   8 +++
>  2 files changed, 122 insertions(+)
> 
> diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> index 790117fb3b..2d534d837a 100644
> --- a/hw/virtio/vdpa-dev.c
> +++ b/hw/virtio/vdpa-dev.c
> @@ -15,9 +15,122 @@
>  #include "sysemu/sysemu.h"
>  #include "sysemu/runstate.h"
>  
> +static void
> +vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +    /* Nothing to do */
> +}
> +
> +static int vdpa_dev_get_info_by_fd(int fd, uint64_t cmd, Error **errp)

This looks similar to the helper function in a previous patch but this
time the return value type is int instead of uint32_t. Please make the
types consistent.

> +{
> +    int val;
> +
> +    if (ioctl(fd, cmd, &val) < 0) {
> +        error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
> +                   cmd, strerror(errno));
> +        return -1;
> +    }
> +
> +    return val;
> +}
> +
> +static inline int vdpa_dev_get_queue_size(int fd, Error **errp)
> +{
> +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VRING_NUM, errp);
> +}
> +
> +static inline int vdpa_dev_get_vqs_num(int fd, Error **errp)
> +{
> +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VQS_NUM, errp);
> +}
> +
> +static inline int vdpa_dev_get_config_size(int fd, Error **errp)
> +{
> +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_CONFIG_SIZE, errp);
> +}
> +
>  static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
>  {
> +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> +    uint32_t device_id;
> +    int max_queue_size;
> +    int fd;
> +    int i, ret;
> +
> +    fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> +    if (fd == -1) {
> +        return;
> +    }
> +    s->vdpa.device_fd = fd;

This is the field I suggest exposing as a QOM property so it can be set
from the proxy object (e.g. when the PCI proxy opens the vdpa device
before our .realize() function is called).

> +
> +    max_queue_size = vdpa_dev_get_queue_size(fd, errp);
> +    if (*errp) {
> +        goto out;
> +    }
> +
> +    if (s->queue_size > max_queue_size) {
> +        error_setg(errp, "vhost-vdpa-device: invalid queue_size: %d (max:%d)",
> +                   s->queue_size, max_queue_size);
> +        goto out;
> +    } else if (!s->queue_size) {
> +        s->queue_size = max_queue_size;
> +    }
> +
> +    ret = vdpa_dev_get_vqs_num(fd, errp);
> +    if (*errp) {
> +        goto out;
> +    }
> +
> +    s->dev.nvqs = ret;

There is no input validation because we trust the kernel vDPA return
values. That seems okay for now but if there is a vhost-user version of
this in the future then input validation will be necessary to achieve
isolation between QEMU and the vhost-user processes. I suggest including
input validation code right away because it's harder to audit the code
and fix missing input validation later on.
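
A sketch of the kind of validation being asked for here; the bounds are
assumptions and VIRTIO_QUEUE_MAX is just one possible upper limit:

    ret = vdpa_dev_get_vqs_num(fd, errp);
    if (*errp) {
        goto out;
    }
    /* Do not trust the backend: reject nonsensical virtqueue counts. */
    if (ret < 1 || ret > VIRTIO_QUEUE_MAX) {
        error_setg(errp, "vhost-vdpa-device: invalid number of virtqueues: %d",
                   ret);
        goto out;
    }
    s->dev.nvqs = ret;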

> +    s->dev.vqs = g_new0(struct vhost_virtqueue, s->dev.nvqs);
> +    s->dev.vq_index = 0;
> +    s->dev.vq_index_end = s->dev.nvqs;
> +    s->dev.backend_features = 0;
> +    s->started = false;
> +
> +    ret = vhost_dev_init(&s->dev, &s->vdpa, VHOST_BACKEND_TYPE_VDPA, 0, NULL);
> +    if (ret < 0) {
> +        error_setg(errp, "vhost-vdpa-device: vhost initialization failed: %s",
> +                   strerror(-ret));
> +        goto out;
> +    }
> +
> +    ret = s->dev.vhost_ops->vhost_get_device_id(&s->dev, &device_id);

The vhost_*() API abstracts the ioctl calls, but this source file and the
PCI proxy have ioctl calls. I wonder if it's possible to move the ioctl
calls into the vhost_*() API? That would be cleaner and also make it
easier to add vhost-user vDPA support in the future.
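
For illustration, one way such a wrapper could look, assuming the existing
vhost_vdpa_call() helper in hw/virtio/vhost-vdpa.c (the function name here
is an assumption):

static int vhost_vdpa_get_config_size(struct vhost_dev *dev, uint32_t *size)
{
    /* Route the new query through the vhost-vdpa backend instead of a raw
     * ioctl() in the device code, keeping the device model fd-agnostic. */
    return vhost_vdpa_call(dev, VHOST_VDPA_GET_CONFIG_SIZE, size);
}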

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 00/10] add generic vDPA device support
  2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
                   ` (9 preceding siblings ...)
  2022-01-05  0:59 ` [RFC 10/10] vdpa-dev: mark the device as unmigratable Longpeng(Mike) via
@ 2022-01-05 10:21 ` Stefan Hajnoczi
  10 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2022-01-05 10:21 UTC (permalink / raw)
  To: Longpeng(Mike)
  Cc: mst, jasowang, cohuck, qemu-devel, yechuan, arei.gonglei,
	huangzhichao, pbonzini, sgarzare

[-- Attachment #1: Type: text/plain, Size: 1312 bytes --]

On Wed, Jan 05, 2022 at 08:58:50AM +0800, Longpeng(Mike) wrote:
> From: Longpeng <longpeng2@huawei.com>
> 
> Hi guys,
> 
> This patchset tries to support the generic vDPA device, the previous
> disscussion can be found here [1].
> 
> With the generic vDPA device, QEMU won't need to touch the device
> types any more, such like vfio.
> 
> We can use the generic vDPA device as follow:
>   -device vhost-vdpa-device-pci,vdpa-dev=/dev/vhost-vdpa-X
> 
> I've done some simple tests on Huawei's offloading card (net, 0.95)
> and vdpa_sim_blk (1.0);
> 
> Note:
>   the kernel part does not send out yet, I'll send it as soon as possible.
> 
> [1] https://lore.kernel.org/all/20211208052010.1719-1-longpeng2@huawei.com/
> 
> Longpeng (Mike) (10):
>   virtio: get class_id and pci device id by the virtio id
>   vhost: add 3 commands for vhost-vdpa
>   vdpa: add the infrastructure of vdpa-dev
>   vdpa-dev: implement the instance_init/class_init interface
>   vdpa-dev: implement the realize interface
>   vdpa-dev: implement the unrealize interface
>   vdpa-dev: implement the get_config/set_config interface
>   vdpa-dev: implement the get_features interface
>   vdpa-dev: implement the set_status interface
>   vdpa-dev: mark the device as unmigratable

Nice and small.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio id
  2022-01-05  0:58 ` [RFC 01/10] virtio: get class_id and pci device id by the virtio id Longpeng(Mike) via
  2022-01-05  4:37   ` Jason Wang
@ 2022-01-05 10:46   ` Cornelia Huck
  2022-01-06  1:50     ` longpeng2--- via
  2022-01-10  5:43   ` Michael S. Tsirkin
  2 siblings, 1 reply; 52+ messages in thread
From: Cornelia Huck @ 2022-01-05 10:46 UTC (permalink / raw)
  To: Longpeng(Mike), stefanha, mst, jasowang, sgarzare
  Cc: qemu-devel, yechuan, arei.gonglei, huangzhichao, pbonzini, Longpeng

On Wed, Jan 05 2022, "Longpeng(Mike)" <longpeng2@huawei.com> wrote:

> From: Longpeng <longpeng2@huawei.com>
>
> Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> device which is specified by the "Virtio Device ID".
>
> These helpers will be used to build the generic vDPA device later.
>
> Signed-off-by: Longpeng <longpeng2@huawei.com>
> ---
>  hw/virtio/virtio-pci.c | 93 ++++++++++++++++++++++++++++++++++++++++++
>  hw/virtio/virtio-pci.h |  4 ++
>  2 files changed, 97 insertions(+)
>
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 750aa47ec1..843085c4ea 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -19,6 +19,7 @@
>  
>  #include "exec/memop.h"
>  #include "standard-headers/linux/virtio_pci.h"
> +#include "standard-headers/linux/virtio_ids.h"
>  #include "hw/boards.h"
>  #include "hw/virtio/virtio.h"
>  #include "migration/qemu-file-types.h"
> @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n, QEMUFile *f)
>      return 0;
>  }
>  
> +typedef struct VirtIOPCIIDInfo {
> +    uint16_t vdev_id; /* virtio id */
> +    uint16_t pdev_id; /* pci device id */
> +    uint16_t class_id;
> +} VirtIOPCIIDInfo;
> +
> +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> +    {
> +        .vdev_id = VIRTIO_ID_NET,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> +        .class_id = PCI_CLASS_NETWORK_ETHERNET,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_BLOCK,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> +        .class_id = PCI_CLASS_STORAGE_SCSI,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_CONSOLE,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_SCSI,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> +        .class_id = PCI_CLASS_STORAGE_SCSI,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_9P,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> +        .class_id = PCI_BASE_CLASS_NETWORK,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_VSOCK,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_IOMMU,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_MEM,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_PMEM,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_RNG,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_BALLOON,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +};
> +
> +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> +{
> +    VirtIOPCIIDInfo info = {};
> +    int i;
> +
> +    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> +        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> +            info = virtio_pci_id_info[i];
> +            break;
> +        }
> +    }
> +
> +    return info;
> +}
> +
> +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> +{
> +    return virtio_pci_get_id_info(device_id).pdev_id;
> +}
> +
> +uint16_t virtio_pci_get_class_id(uint16_t device_id)
> +{
> +    return virtio_pci_get_id_info(device_id).class_id;
> +}

What happens if these functions are called for a device_id that is not
in the array, e.g. if we forgot to add a new id to the array?

Can the array be generated in some way?
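
One possible answer, as a sketch rather than the posted code: make a missing
table entry fail loudly instead of silently handing out a zero PCI device ID
and class code:

static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
{
    int i;

    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
            return virtio_pci_id_info[i];
        }
    }

    /* Unknown virtio ID: 0 would be an invalid PCI device/class ID. */
    error_report("virtio-pci: unknown virtio device id %u", vdev_id);
    abort();
}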



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 06/10] vdpa-dev: implement the unrealize interface
  2022-01-05  0:58 ` [RFC 06/10] vdpa-dev: implement the unrealize interface Longpeng(Mike) via
@ 2022-01-05 11:16   ` Stefano Garzarella
  2022-01-06  3:23     ` longpeng2--- via
  0 siblings, 1 reply; 52+ messages in thread
From: Stefano Garzarella @ 2022-01-05 11:16 UTC (permalink / raw)
  To: Longpeng(Mike)
  Cc: mst, jasowang, cohuck, qemu-devel, yechuan, arei.gonglei,
	huangzhichao, stefanha, pbonzini

On Wed, Jan 05, 2022 at 08:58:56AM +0800, Longpeng(Mike) wrote:
>From: Longpeng <longpeng2@huawei.com>
>
>Implements the .unrealize interface.
>
>Signed-off-by: Longpeng <longpeng2@huawei.com>
>---
> hw/virtio/vdpa-dev.c | 22 +++++++++++++++++++++-
> 1 file changed, 21 insertions(+), 1 deletion(-)
>
>diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
>index 2d534d837a..4e4dd3d201 100644
>--- a/hw/virtio/vdpa-dev.c
>+++ b/hw/virtio/vdpa-dev.c
>@@ -133,9 +133,29 @@ out:
>     close(fd);
> }
>
>+static void vhost_vdpa_vdev_unrealize(VhostVdpaDevice *s)
>+{
>+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
>+    int i;
>+
>+    for (i = 0; i < s->num_queues; i++) {
                       ^
`s->num_queues` seems uninitialized to me, maybe we could just remove 
the num_queues field from VhostVdpaDevice, and use `s->dev.nvqs` as in 
vhost_vdpa_device_realize().

>+        virtio_delete_queue(s->virtqs[i]);
>+    }
>+    g_free(s->virtqs);
>+    virtio_cleanup(vdev);
>+
>+    g_free(s->config);
>+}
>+
> static void vhost_vdpa_device_unrealize(DeviceState *dev)
> {
>-    return;
>+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
>+
>+    virtio_set_status(vdev, 0);
>+    vhost_dev_cleanup(&s->dev);

If we use `s->dev.nvqs` in vhost_vdpa_vdev_unrealize(), we should 
call vhost_dev_cleanup() after it, just before close(), as we already do 
in the error path of vhost_vdpa_device_realize().
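
Putting both suggestions together, a sketch of how unrealize could look
(assumptions, not the posted patch):

static void vhost_vdpa_device_unrealize(DeviceState *dev)
{
    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
    int i;

    virtio_set_status(vdev, 0);

    /* Use the queue count that realize() actually set up. */
    for (i = 0; i < s->dev.nvqs; i++) {
        virtio_delete_queue(s->virtqs[i]);
    }
    g_free(s->virtqs);
    virtio_cleanup(vdev);
    g_free(s->config);

    /* Clean up the vhost device only after the queues are gone, then
     * close the fd, mirroring the realize() error path. */
    vhost_dev_cleanup(&s->dev);
    close(s->vdpa.device_fd);
}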

>+    vhost_vdpa_vdev_unrealize(s);
>+    close(s->vdpa.device_fd);
> }
>
> static void
>-- 
>2.23.0
>



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface
  2022-01-05  0:58 ` [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface Longpeng(Mike) via
  2022-01-05 10:00   ` Stefan Hajnoczi
@ 2022-01-05 11:28   ` Stefano Garzarella
  2022-01-06  2:40     ` longpeng2--- via
  1 sibling, 1 reply; 52+ messages in thread
From: Stefano Garzarella @ 2022-01-05 11:28 UTC (permalink / raw)
  To: Longpeng(Mike)
  Cc: mst, jasowang, cohuck, qemu-devel, yechuan, arei.gonglei,
	huangzhichao, stefanha, pbonzini

On Wed, Jan 05, 2022 at 08:58:54AM +0800, Longpeng(Mike) wrote:
>From: Longpeng <longpeng2@huawei.com>
>
>Implements the .instance_init and the .class_init interface.
>
>Signed-off-by: Longpeng <longpeng2@huawei.com>
>---
> hw/virtio/vdpa-dev-pci.c     | 80 +++++++++++++++++++++++++++++++++++-
> hw/virtio/vdpa-dev.c         | 68 +++++++++++++++++++++++++++++-
> include/hw/virtio/vdpa-dev.h |  2 +
> 3 files changed, 146 insertions(+), 4 deletions(-)
>
>diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
>index a5a7b528a9..0af54a26d4 100644
>--- a/hw/virtio/vdpa-dev-pci.c
>+++ b/hw/virtio/vdpa-dev-pci.c
>@@ -23,14 +23,90 @@ struct VhostVdpaDevicePCI {
>     VhostVdpaDevice vdev;
> };
>
>+static uint32_t
>+vdpa_dev_pci_get_info(const char *name, uint64_t cmd, Error **errp)
>+{
>+    int device_fd;
>+    uint32_t val;
>+    int ret;
>+
>+    device_fd = qemu_open(name, O_RDWR, errp);
>+    if (device_fd == -1) {
>+        return (uint32_t)-1;
>+    }
>+
>+    ret = ioctl(device_fd, cmd, &val);
>+    if (ret < 0) {
>+        error_setg(errp, "vhost-vdpa-device-pci: cmd 0x%lx failed: %s",
>+                   cmd, strerror(errno));
>+        goto out;
>+    }
>+
>+out:
>+    close(device_fd);
>+    return val;
>+}
>+
>+static inline uint32_t
>+vdpa_dev_pci_get_devid(VhostVdpaDevicePCI *dev, Error **errp)
>+{
>+    return vdpa_dev_pci_get_info(dev->vdev.vdpa_dev,
>+                                 VHOST_VDPA_GET_DEVICE_ID, errp);
>+}
>+
>+static inline uint32_t
>+vdpa_dev_pci_get_vectors_num(VhostVdpaDevicePCI *dev, Error **errp)
>+{
>+    return vdpa_dev_pci_get_info(dev->vdev.vdpa_dev,
>+                                 VHOST_VDPA_GET_VECTORS_NUM, errp);
>+}
>+
> static void vhost_vdpa_device_pci_instance_init(Object *obj)
> {
>-    return;
>+    VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(obj);
>+
>+    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
>+                                TYPE_VHOST_VDPA_DEVICE);
>+    object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
>+                              "bootindex");
>+}
>+
>+static Property vhost_vdpa_device_pci_properties[] = {
>+    DEFINE_PROP_END_OF_LIST(),
>+};
>+
>+static void
>+vhost_vdpa_device_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
>+{
>+    VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(vpci_dev);
>+    DeviceState *vdev = DEVICE(&dev->vdev);
>+    uint32_t devid;
>+    uint32_t vectors;
>+
>+    devid = vdpa_dev_pci_get_devid(dev, errp);
>+    if (*errp) {
>+        return;
>+    }
>+
>+    vectors = vdpa_dev_pci_get_vectors_num(dev, errp);
>+    if (*errp) {
>+        return;
>+    }
>+
>+    vpci_dev->class_code = virtio_pci_get_class_id(devid);
>+    vpci_dev->pdev_id = virtio_pci_get_pci_devid(devid);
>+    vpci_dev->nvectors = vectors;
>+    qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
> }
>
> static void vhost_vdpa_device_pci_class_init(ObjectClass *klass, void *data)
> {
>-    return;
>+    DeviceClass *dc = DEVICE_CLASS(klass);
>+    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
>+
>+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>+    device_class_set_props(dc, vhost_vdpa_device_pci_properties);
>+    k->realize = vhost_vdpa_device_pci_realize;
> }
>
> static const VirtioPCIDeviceTypeInfo vhost_vdpa_device_pci_info = {
>diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
>index f4f92b90b0..790117fb3b 100644
>--- a/hw/virtio/vdpa-dev.c
>+++ b/hw/virtio/vdpa-dev.c
>@@ -15,16 +15,80 @@
> #include "sysemu/sysemu.h"
> #include "sysemu/runstate.h"
>
>-static void vhost_vdpa_device_class_init(ObjectClass *klass, void *data)
>+static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> {
>     return;
> }
>
>-static void vhost_vdpa_device_instance_init(Object *obj)
>+static void vhost_vdpa_device_unrealize(DeviceState *dev)
>+{
>+    return;
>+}
>+
>+static void
>+vhost_vdpa_device_get_config(VirtIODevice *vdev, uint8_t *config)
>+{
>+    return;
>+}
>+
>+static void
>+vhost_vdpa_device_set_config(VirtIODevice *vdev, const uint8_t *config)
> {
>     return;
> }
>
>+static uint64_t vhost_vdpa_device_get_features(VirtIODevice *vdev,
>+                                               uint64_t features,
>+                                               Error **errp)
>+{
>+    return (uint64_t)-1;
>+}
>+
>+static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
>+{
>+    return;
>+}
>+
>+static Property vhost_vdpa_device_properties[] = {
>+    DEFINE_PROP_STRING("vdpa-dev", VhostVdpaDevice, vdpa_dev),
>+    DEFINE_PROP_END_OF_LIST(),
>+};
>+
>+static const VMStateDescription vmstate_vhost_vdpa_device = {
>+    .name = "vhost-vdpa-device",
>+    .minimum_version_id = 1,
>+    .version_id = 1,
>+    .fields = (VMStateField[]) {
>+        VMSTATE_VIRTIO_DEVICE,
>+        VMSTATE_END_OF_LIST()
>+    },
>+};
>+
>+static void vhost_vdpa_device_class_init(ObjectClass *klass, void *data)
>+{
>+    DeviceClass *dc = DEVICE_CLASS(klass);
>+    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
>+
>+    device_class_set_props(dc, vhost_vdpa_device_properties);
>+    dc->desc = "VDPA-based generic PCI device assignment";

IIUC, this should be the description of the generic vhost vDPA device, 
not the PCI implementation, right?

Thanks,
Stefano



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05  9:09           ` Jason Wang
@ 2022-01-05 12:26             ` Michael S. Tsirkin
  2022-01-06  2:34               ` Jason Wang
  0 siblings, 1 reply; 52+ messages in thread
From: Michael S. Tsirkin @ 2022-01-05 12:26 UTC (permalink / raw)
  To: Jason Wang
  Cc: Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, Stefan Hajnoczi, pbonzini, Longpeng (Mike,
	Cloud Infrastructure Service Product Dept.),
	Stefano Garzarella

On Wed, Jan 05, 2022 at 05:09:07PM +0800, Jason Wang wrote:
> On Wed, Jan 5, 2022 at 4:37 PM Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) <longpeng2@huawei.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Jason Wang [mailto:jasowang@redhat.com]
> > > Sent: Wednesday, January 5, 2022 3:54 PM
> > > To: Michael S. Tsirkin <mst@redhat.com>
> > > Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > <longpeng2@huawei.com>; Stefan Hajnoczi <stefanha@redhat.com>; Stefano
> > > Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
> > > <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> > > <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
> > > <qemu-devel@nongnu.org>
> > > Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> > >
> > > On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> > > > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> > > > > >
> > > > > > From: Longpeng <longpeng2@huawei.com>
> > > > > >
> > > > > > To support the generic vdpa device, we need to add the following ioctls:
> > > > > > - GET_VECTORS_NUM: the count of vectors that are supported
> > > > >
> > > > > Does this mean MSI vectors? If yes, it looks like a layer violation:
> > > > > vhost is transport independent.
> > > >
> > > > Well *guest* needs to know how many vectors device supports.
> > > > I don't think there's a way around that. Do you?
> > >
> > > We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
> > > simply assume #vqs + 1?
> > >
> > > > Otherwise guests will at best be suboptimal.
> > > >
> > > > >  And it reveals device implementation
> > > > > details which block (cross vendor) migration.
> > > > >
> > > > > Thanks
> > > >
> > > > Not necessarily, userspace can hide this from guest if it
> > > > wants to, just validate.
> > >
> > > If we can hide it at vhost/uAPI level, it would be even better?
> > >
> >
> > Not only MSI vectors, but also queue-size, #vqs, etc.
> 
> MSI is PCI-specific; we have non-PCI vDPA parents, e.g. VDUSE/simulator/mlx5.
>
> And it's not guaranteed to stay unchanged, e.g. some drivers may choose to
> allocate MSI during set_status(), which can fail for various reasons.
> 
> >
> > Maybe the vhost level could expose the hardware's real capabilities
> > and let the userspace (QEMU) do the hiding? The userspace knows how
> > to process them.
> 
> The #MSI vectors is much easier to mediate than the queue size and #vqs.
>
> For interrupts, we already have VHOST_SET_X_KICK, so we can keep
> allocating eventfds based on the #MSI vectors to make it work with any
> number of MSI vectors that the virtual device has.

Right, but if the hardware does not support that many, then what?
Just fail? Having a query API would make things somewhat cleaner IMHO.

> For the queue size, it's OK to have a new uAPI but it's not a must; QEMU
> can simply fail if SET_VRING_NUM fails.
>
> For #vqs, it's OK to have a new uAPI since the emulated virtio-pci
> device requires knowledge of the #vqs in the config space. (Still not a
> must; we could enumerate #vqs per device type.)
>
> For the config size, it's OK but not a must; technically we can simply
> relay what the guest writes to vhost-vdpa. It's just that the current QEMU
> requires it during virtio device initialization.
> 
> Thanks


I agree, but these "OK but not a must" items make for a cleaner API, I think.

> >
> > > Thanks
> > >
> > > >
> > > >
> > > > > > - GET_CONFIG_SIZE: the size of the virtio config space
> > > > > > - GET_VQS_NUM: the count of virtqueues that exported
> > > > > >
> > > > > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > > > > ---
> > > > > >  linux-headers/linux/vhost.h | 10 ++++++++++
> > > > > >  1 file changed, 10 insertions(+)
> > > > > >
> > > > > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > > > > > index c998860d7b..c5edd75d15 100644
> > > > > > --- a/linux-headers/linux/vhost.h
> > > > > > +++ b/linux-headers/linux/vhost.h
> > > > > > @@ -150,4 +150,14 @@
> > > > > >  /* Get the valid iova range */
> > > > > >  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
> > > > > >                                              struct vhost_vdpa_iova_range)
> > > > > > +
> > > > > > +/* Get the number of vectors */
> > > > > > +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
> > > > > > +
> > > > > > +/* Get the virtio config size */
> > > > > > +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
> > > > > > +
> > > > > > +/* Get the number of virtqueues */
> > > > > > +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
> > > > > > +
> > > > > >  #endif
> > > > > > --
> > > > > > 2.23.0
> > > > > >
> > > >
> >



^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 03/10] vdpa: add the infrastructure of vdpa-dev
  2022-01-05  9:48   ` Stefan Hajnoczi
@ 2022-01-06  1:22     ` longpeng2--- via
  2022-01-06 11:25       ` Stefan Hajnoczi
  0 siblings, 1 reply; 52+ messages in thread
From: longpeng2--- via @ 2022-01-06  1:22 UTC (permalink / raw)
  To: Stefan Hajnoczi, jasowang
  Cc: mst, sgarzare, cohuck, pbonzini, Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> Sent: Wednesday, January 5, 2022 5:49 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>
> Cc: mst@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> Subject: Re: [RFC 03/10] vdpa: add the infrastructure of vdpa-dev
> 
> On Wed, Jan 05, 2022 at 08:58:53AM +0800, Longpeng(Mike) wrote:
> > +static const VirtioPCIDeviceTypeInfo vhost_vdpa_device_pci_info = {
> > +    .base_name               = TYPE_VHOST_VDPA_DEVICE_PCI,
> > +    .generic_name            = "vhost-vdpa-device-pci",
> > +    .transitional_name       = "vhost-vdpa-device-pci-transitional",
> > +    .non_transitional_name   = "vhost-vdpa-device-pci-non-transitional",
> 
> Does vDPA support Transitional VIRTIO devices?
> 
> I expected this device to support Modern devices only.
> 

There's already a 0.95 vDPA driver (Alibaba ENI) in the kernel source, and
supporting 0.95 devices is necessary for some older guest OSes.

I'm OK if other guys also approve of supporting 1.0+ devices only :)

> Stefan


^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 01/10] virtio: get class_id and pci device id by the virtio id
  2022-01-05 10:46   ` Cornelia Huck
@ 2022-01-06  1:50     ` longpeng2--- via
  0 siblings, 0 replies; 52+ messages in thread
From: longpeng2--- via @ 2022-01-06  1:50 UTC (permalink / raw)
  To: Cornelia Huck, stefanha, mst, jasowang, sgarzare
  Cc: pbonzini, Gonglei (Arei), Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Cornelia Huck [mailto:cohuck@redhat.com]
> Sent: Wednesday, January 5, 2022 6:46 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>; stefanha@redhat.com; mst@redhat.com;
> jasowang@redhat.com; sgarzare@redhat.com
> Cc: pbonzini@redhat.com; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>;
> qemu-devel@nongnu.org; Longpeng (Mike, Cloud Infrastructure Service Product
> Dept.) <longpeng2@huawei.com>
> Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio
> id
> 
> On Wed, Jan 05 2022, "Longpeng(Mike)" <longpeng2@huawei.com> wrote:
> 
> > From: Longpeng <longpeng2@huawei.com>
> >
> > Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> > deivce which is specificed by the "Virtio Device ID".
> >
> > These helpers will be used to build the generic vDPA device later.
> >
> > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > ---
> >  hw/virtio/virtio-pci.c | 93 ++++++++++++++++++++++++++++++++++++++++++
> >  hw/virtio/virtio-pci.h |  4 ++
> >  2 files changed, 97 insertions(+)
> >
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index 750aa47ec1..843085c4ea 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -19,6 +19,7 @@
> >
> >  #include "exec/memop.h"
> >  #include "standard-headers/linux/virtio_pci.h"
> > +#include "standard-headers/linux/virtio_ids.h"
> >  #include "hw/boards.h"
> >  #include "hw/virtio/virtio.h"
> >  #include "migration/qemu-file-types.h"
> > @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n,
> QEMUFile *f)
> >      return 0;
> >  }
> >
> > +typedef struct VirtIOPCIIDInfo {
> > +    uint16_t vdev_id; /* virtio id */
> > +    uint16_t pdev_id; /* pci device id */
> > +    uint16_t class_id;
> > +} VirtIOPCIIDInfo;
> > +
> > +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> > +    {
> > +        .vdev_id = VIRTIO_ID_NET,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> > +        .class_id = PCI_CLASS_NETWORK_ETHERNET,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_BLOCK,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_CONSOLE,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_SCSI,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_9P,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> > +        .class_id = PCI_BASE_CLASS_NETWORK,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_VSOCK,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_IOMMU,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_MEM,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_PMEM,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_RNG,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_BALLOON,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +};
> > +
> > +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> > +{
> > +    VirtIOPCIIDInfo info = {};
> > +    int i;
> > +
> > +    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> > +        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> > +            info = virtio_pci_id_info[i];
> > +            break;
> > +        }
> > +    }
> > +
> > +    return info;
> > +}
> > +
> > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> > +{
> > +    return virtio_pci_get_id_info(device_id).pdev_id;
> > +}
> > +
> > +uint16_t virtio_pci_get_class_id(uint16_t device_id)
> > +{
> > +    return virtio_pci_get_id_info(device_id).class_id;
> > +}
> 
> What happens if these functions are called for a device_id that is not
> in the array, e.g. if we forgot to add a new id to the array?
> 

It would return pdev_id=0 or class_id=0 in that case. A virtio device with
pdev_id=0 is undefined, and class_id=0 is also treated as undefined
(PCI_CLASS_NOT_DEFINED), so the caller should check the returned value.

> Can the array be generated in some way?

For the PCI Device ID:
  - If we need to support Transitional VIRTIO devices, there's no algorithm
    that can map a VIRTIO ID to a PCI Device ID.
  - If we only need to support 1.0+ VIRTIO devices, then we can calculate the
    PCI Device ID based on the VIRTIO ID.

For the Class ID, there seems to be no way :(
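
For the caller side, I'd add a check roughly like this (only a sketch; the
function name is a placeholder, not something from the patch):

static bool vdpa_dev_pci_devid_is_supported(uint16_t vdev_id, Error **errp)
{
    uint16_t pdev_id = virtio_pci_get_pci_devid(vdev_id);
    uint16_t class_id = virtio_pci_get_class_id(vdev_id);

    /* both helpers return 0 for an unknown virtio id */
    if (pdev_id == 0 || class_id == 0 /* PCI_CLASS_NOT_DEFINED */) {
        error_setg(errp, "vhost-vdpa-device-pci: unsupported virtio id %d",
                   vdev_id);
        return false;
    }

    return true;
}

If we drop Transitional support later, the pdev_id half of the table could be
replaced by the 1.0 spec rule (PCI Device ID = 0x1040 + VIRTIO ID), but the
class_id would still need a table.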



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-05 12:26             ` Michael S. Tsirkin
@ 2022-01-06  2:34               ` Jason Wang
  2022-01-06  8:00                 ` longpeng2--- via
  2022-01-06 14:09                 ` Michael S. Tsirkin
  0 siblings, 2 replies; 52+ messages in thread
From: Jason Wang @ 2022-01-06  2:34 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, Stefan Hajnoczi, pbonzini, Longpeng (Mike,
	Cloud Infrastructure Service Product Dept.),
	Stefano Garzarella

On Wed, Jan 5, 2022 at 8:26 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jan 05, 2022 at 05:09:07PM +0800, Jason Wang wrote:
> > On Wed, Jan 5, 2022 at 4:37 PM Longpeng (Mike, Cloud Infrastructure
> > Service Product Dept.) <longpeng2@huawei.com> wrote:
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jason Wang [mailto:jasowang@redhat.com]
> > > > Sent: Wednesday, January 5, 2022 3:54 PM
> > > > To: Michael S. Tsirkin <mst@redhat.com>
> > > > Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > > <longpeng2@huawei.com>; Stefan Hajnoczi <stefanha@redhat.com>; Stefano
> > > > Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
> > > > <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> > > > <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
> > > > <qemu-devel@nongnu.org>
> > > > Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> > > >
> > > > On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> > > > > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> > > > > > >
> > > > > > > From: Longpeng <longpeng2@huawei.com>
> > > > > > >
> > > > > > > To support generic vdpa deivce, we need add the following ioctls:
> > > > > > > - GET_VECTORS_NUM: the count of vectors that supported
> > > > > >
> > > > > > Does this mean MSI vectors? If yes, it looks like a layer violation:
> > > > > > vhost is transport independent.
> > > > >
> > > > > Well *guest* needs to know how many vectors device supports.
> > > > > I don't think there's a way around that. Do you?
> > > >
> > > > We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
> > > > simply assume #vqs + 1?
> > > >
> > > > > Otherwise guests will at best be suboptimal.
> > > > >
> > > > > >  And it reveals device implementation
> > > > > > details which block (cross vendor) migration.
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > Not necessarily, userspace can hide this from guest if it
> > > > > wants to, just validate.
> > > >
> > > > If we can hide it at vhost/uAPI level, it would be even better?
> > > >
> > >
> > > Not only MSI vectors, but also queue-size, #vqs, etc.
> >
> > MSI is PCI specific, we have non PCI vDPA parent e.g VDUSE/simulator/mlx5
> >
> > And it's something that is not guaranteed to be not changed. E.g some
> > drivers may choose to allocate MSI during set_status() which can fail
> > for various reasons.
> >
> > >
> > > Maybe the vhost level could expose the hardware's real capabilities
> > > and let the userspace (QEMU) do the hiding? The userspace know how
> > > to process them.
> >
> > #MSI vectors is much more easier to be mediated than queue-size and #vqs.
> >
> > For interrupts, we've already had VHOST_SET_X_KICK, we can keep
> > allocating eventfd based on #MSI vectors to make it work with any
> > number of MSI vectors that the virtual device had.
>
> Right but if hardware does not support so many then what?
> Just fail?

Or just trigger the callbacks of the vqs that share the vector.

> Having a query API would make things somewhat cleaner imho.

I may be missing something: even if we know #vectors, we still don't know
which virtqueues are associated with a dedicated vector?

>
> > For queue-size, it's Ok to have a new uAPI but it's not a must, Qemu
> > can simply fail if SET_VRING_NUM fail.
> >
> > For #vqs, it's OK to have a new uAPI since the emulated virtio-pci
> > device requires knowledge the #vqs in the config space. (still not a
> > must, we can enumerate #vqs per device type)
> >
> > For the config size, it's OK but not a must, technically we can simply
> > relay what guest write to vhost-vdpa. It's just because current Qemu
> > require to have it during virtio device initialization.
> >
> > Thanks
>
>
> I agree, but these "OK but not a must" items make for a cleaner API, I think.

Right.

Thanks

>
> > >
> > > > Thanks
> > > >
> > > > >
> > > > >
> > > > > > > - GET_CONFIG_SIZE: the size of the virtio config space
> > > > > > > - GET_VQS_NUM: the count of virtqueues that exported
> > > > > > >
> > > > > > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > > > > > ---
> > > > > > >  linux-headers/linux/vhost.h | 10 ++++++++++
> > > > > > >  1 file changed, 10 insertions(+)
> > > > > > >
> > > > > > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > > > > > > index c998860d7b..c5edd75d15 100644
> > > > > > > --- a/linux-headers/linux/vhost.h
> > > > > > > +++ b/linux-headers/linux/vhost.h
> > > > > > > @@ -150,4 +150,14 @@
> > > > > > >  /* Get the valid iova range */
> > > > > > >  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
> > > > > > >                                              struct vhost_vdpa_iova_range)
> > > > > > > +
> > > > > > > +/* Get the number of vectors */
> > > > > > > +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
> > > > > > > +
> > > > > > > +/* Get the virtio config size */
> > > > > > > +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
> > > > > > > +
> > > > > > > +/* Get the number of virtqueues */
> > > > > > > +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
> > > > > > > +
> > > > > > >  #endif
> > > > > > > --
> > > > > > > 2.23.0
> > > > > > >
> > > > >
> > >
>



^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface
  2022-01-05 10:00   ` Stefan Hajnoczi
@ 2022-01-06  2:39     ` longpeng2--- via
  0 siblings, 0 replies; 52+ messages in thread
From: longpeng2--- via @ 2022-01-06  2:39 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: mst, jasowang, sgarzare, cohuck, pbonzini, Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> Sent: Wednesday, January 5, 2022 6:01 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>
> Cc: mst@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> Subject: Re: [RFC 04/10] vdpa-dev: implement the instance_init/class_init
> interface
> 
> On Wed, Jan 05, 2022 at 08:58:54AM +0800, Longpeng(Mike) wrote:
> > From: Longpeng <longpeng2@huawei.com>
> >
> > Implements the .instance_init and the .class_init interface.
> >
> > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > ---
> >  hw/virtio/vdpa-dev-pci.c     | 80 +++++++++++++++++++++++++++++++++++-
> >  hw/virtio/vdpa-dev.c         | 68 +++++++++++++++++++++++++++++-
> >  include/hw/virtio/vdpa-dev.h |  2 +
> >  3 files changed, 146 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
> > index a5a7b528a9..0af54a26d4 100644
> > --- a/hw/virtio/vdpa-dev-pci.c
> > +++ b/hw/virtio/vdpa-dev-pci.c
> > @@ -23,14 +23,90 @@ struct VhostVdpaDevicePCI {
> >      VhostVdpaDevice vdev;
> >  };
> >
> > +static uint32_t
> > +vdpa_dev_pci_get_info(const char *name, uint64_t cmd, Error **errp)
> 
> vdpa_dev_pci_get_u32() might be a clearer name.
> 

OK.

> > +{
> > +    int device_fd;
> > +    uint32_t val;
> > +    int ret;
> > +
> > +    device_fd = qemu_open(name, O_RDWR, errp);
> > +    if (device_fd == -1) {
> > +        return (uint32_t)-1;
> > +    }
> > +
> > +    ret = ioctl(device_fd, cmd, &val);
> > +    if (ret < 0) {
> > +        error_setg(errp, "vhost-vdpa-device-pci: cmd 0x%lx failed: %s",
> > +                   cmd, strerror(errno));
> > +        goto out;
> > +    }
> > +
> > +out:
> > +    close(device_fd);
> 
> Race conditions are possible if the device node is replaced between
> calls. Why not open the file once and reuse the fd across ioctl calls?
> 
> Both VhostVdpaDevicePCI and VhostVdpaDevice need the fd but
> VhostVdpaDevicePCI needs it first. I suggest passing ownership of the fd
> to the VhostVdpaDevice. One way of doing this is using QOM properties so
> that VhostVdpaDevice can use the given fd instead of reopening the file.
> (If fd is -1 then VhostVdpaDevice can try to open the file. That is
> necessary when VhostVdpaDevice is used directly with virtio-mmio because
> there is no proxy object.)

Adding the fd field into the VhostVdpaDevice looks fine! I'll do it in V2.
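
Roughly like this, I think (just a sketch; the "vhostfd" property name and the
int32 field behind it are placeholders I'm assuming here, with -1 meaning "no
fd was handed over, open vdpa_dev ourselves", e.g. for virtio-mmio):

static Property vhost_vdpa_device_properties[] = {
    DEFINE_PROP_STRING("vdpa-dev", VhostVdpaDevice, vdpa_dev),
    DEFINE_PROP_INT32("vhostfd", VhostVdpaDevice, vhostfd, -1),
    DEFINE_PROP_END_OF_LIST(),
};

static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
{
    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(dev);

    if (s->vhostfd == -1) {
        /* no proxy passed an fd in, so open the character device here */
        s->vhostfd = qemu_open(s->vdpa_dev, O_RDWR, errp);
        if (s->vhostfd == -1) {
            return;
        }
    }
    s->vdpa.device_fd = s->vhostfd;
    /* ... the rest of realize ... */
}

The PCI proxy would open the device first, do its own ioctls (device id,
vectors), and then set "vhostfd" on the inner device before qdev_realize().
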
Thanks.




^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface
  2022-01-05 11:28   ` Stefano Garzarella
@ 2022-01-06  2:40     ` longpeng2--- via
  0 siblings, 0 replies; 52+ messages in thread
From: longpeng2--- via @ 2022-01-06  2:40 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: stefanha, mst, jasowang, cohuck, pbonzini, Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Stefano Garzarella [mailto:sgarzare@redhat.com]
> Sent: Wednesday, January 5, 2022 7:29 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>
> Cc: stefanha@redhat.com; mst@redhat.com; jasowang@redhat.com;
> cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> Subject: Re: [RFC 04/10] vdpa-dev: implement the instance_init/class_init
> interface
> 
> On Wed, Jan 05, 2022 at 08:58:54AM +0800, Longpeng(Mike) wrote:
> >From: Longpeng <longpeng2@huawei.com>
> >
> >Implements the .instance_init and the .class_init interface.
> >
> >Signed-off-by: Longpeng <longpeng2@huawei.com>
> >---
> > hw/virtio/vdpa-dev-pci.c     | 80 +++++++++++++++++++++++++++++++++++-
> > hw/virtio/vdpa-dev.c         | 68 +++++++++++++++++++++++++++++-
> > include/hw/virtio/vdpa-dev.h |  2 +
> > 3 files changed, 146 insertions(+), 4 deletions(-)
> >
> >diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
> >index a5a7b528a9..0af54a26d4 100644
> >--- a/hw/virtio/vdpa-dev-pci.c
> >+++ b/hw/virtio/vdpa-dev-pci.c
> >@@ -23,14 +23,90 @@ struct VhostVdpaDevicePCI {
> >     VhostVdpaDevice vdev;
> > };
> >
> >+static uint32_t
> >+vdpa_dev_pci_get_info(const char *name, uint64_t cmd, Error **errp)
> >+{
> >+    int device_fd;
> >+    uint32_t val;
> >+    int ret;
> >+
> >+    device_fd = qemu_open(name, O_RDWR, errp);
> >+    if (device_fd == -1) {
> >+        return (uint32_t)-1;
> >+    }
> >+
> >+    ret = ioctl(device_fd, cmd, &val);
> >+    if (ret < 0) {
> >+        error_setg(errp, "vhost-vdpa-device-pci: cmd 0x%lx failed: %s",
> >+                   cmd, strerror(errno));
> >+        goto out;
> >+    }
> >+
> >+out:
> >+    close(device_fd);
> >+    return val;
> >+}
> >+
> >+static inline uint32_t
> >+vdpa_dev_pci_get_devid(VhostVdpaDevicePCI *dev, Error **errp)
> >+{
> >+    return vdpa_dev_pci_get_info(dev->vdev.vdpa_dev,
> >+                                 VHOST_VDPA_GET_DEVICE_ID, errp);
> >+}
> >+
> >+static inline uint32_t
> >+vdpa_dev_pci_get_vectors_num(VhostVdpaDevicePCI *dev, Error **errp)
> >+{
> >+    return vdpa_dev_pci_get_info(dev->vdev.vdpa_dev,
> >+                                 VHOST_VDPA_GET_VECTORS_NUM, errp);
> >+}
> >+
> > static void vhost_vdpa_device_pci_instance_init(Object *obj)
> > {
> >-    return;
> >+    VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(obj);
> >+
> >+    virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
> >+                                TYPE_VHOST_VDPA_DEVICE);
> >+    object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
> >+                              "bootindex");
> >+}
> >+
> >+static Property vhost_vdpa_device_pci_properties[] = {
> >+    DEFINE_PROP_END_OF_LIST(),
> >+};
> >+
> >+static void
> >+vhost_vdpa_device_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> >+{
> >+    VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(vpci_dev);
> >+    DeviceState *vdev = DEVICE(&dev->vdev);
> >+    uint32_t devid;
> >+    uint32_t vectors;
> >+
> >+    devid = vdpa_dev_pci_get_devid(dev, errp);
> >+    if (*errp) {
> >+        return;
> >+    }
> >+
> >+    vectors = vdpa_dev_pci_get_vectors_num(dev, errp);
> >+    if (*errp) {
> >+        return;
> >+    }
> >+
> >+    vpci_dev->class_code = virtio_pci_get_class_id(devid);
> >+    vpci_dev->pdev_id = virtio_pci_get_pci_devid(devid);
> >+    vpci_dev->nvectors = vectors;
> >+    qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
> > }
> >
> > static void vhost_vdpa_device_pci_class_init(ObjectClass *klass, void *data)
> > {
> >-    return;
> >+    DeviceClass *dc = DEVICE_CLASS(klass);
> >+    VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
> >+
> >+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> >+    device_class_set_props(dc, vhost_vdpa_device_pci_properties);
> >+    k->realize = vhost_vdpa_device_pci_realize;
> > }
> >
> > static const VirtioPCIDeviceTypeInfo vhost_vdpa_device_pci_info = {
> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >index f4f92b90b0..790117fb3b 100644
> >--- a/hw/virtio/vdpa-dev.c
> >+++ b/hw/virtio/vdpa-dev.c
> >@@ -15,16 +15,80 @@
> > #include "sysemu/sysemu.h"
> > #include "sysemu/runstate.h"
> >
> >-static void vhost_vdpa_device_class_init(ObjectClass *klass, void *data)
> >+static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> > {
> >     return;
> > }
> >
> >-static void vhost_vdpa_device_instance_init(Object *obj)
> >+static void vhost_vdpa_device_unrealize(DeviceState *dev)
> >+{
> >+    return;
> >+}
> >+
> >+static void
> >+vhost_vdpa_device_get_config(VirtIODevice *vdev, uint8_t *config)
> >+{
> >+    return;
> >+}
> >+
> >+static void
> >+vhost_vdpa_device_set_config(VirtIODevice *vdev, const uint8_t *config)
> > {
> >     return;
> > }
> >
> >+static uint64_t vhost_vdpa_device_get_features(VirtIODevice *vdev,
> >+                                               uint64_t features,
> >+                                               Error **errp)
> >+{
> >+    return (uint64_t)-1;
> >+}
> >+
> >+static void vhost_vdpa_device_set_status(VirtIODevice *vdev, uint8_t status)
> >+{
> >+    return;
> >+}
> >+
> >+static Property vhost_vdpa_device_properties[] = {
> >+    DEFINE_PROP_STRING("vdpa-dev", VhostVdpaDevice, vdpa_dev),
> >+    DEFINE_PROP_END_OF_LIST(),
> >+};
> >+
> >+static const VMStateDescription vmstate_vhost_vdpa_device = {
> >+    .name = "vhost-vdpa-device",
> >+    .minimum_version_id = 1,
> >+    .version_id = 1,
> >+    .fields = (VMStateField[]) {
> >+        VMSTATE_VIRTIO_DEVICE,
> >+        VMSTATE_END_OF_LIST()
> >+    },
> >+};
> >+
> >+static void vhost_vdpa_device_class_init(ObjectClass *klass, void *data)
> >+{
> >+    DeviceClass *dc = DEVICE_CLASS(klass);
> >+    VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
> >+
> >+    device_class_set_props(dc, vhost_vdpa_device_properties);
> >+    dc->desc = "VDPA-based generic PCI device assignment";
> 
> IIUC, this should be the description of the generic vhost vDPA device,
> not the PCI implementation, right?
> 

Good catch, thanks.

> Thanks,
> Stefano



^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 05/10] vdpa-dev: implement the realize interface
  2022-01-05 10:17   ` Stefan Hajnoczi
@ 2022-01-06  3:02     ` longpeng2--- via
  2022-01-06 11:34       ` Stefan Hajnoczi
  0 siblings, 1 reply; 52+ messages in thread
From: longpeng2--- via @ 2022-01-06  3:02 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: mst, jasowang, sgarzare, cohuck, pbonzini, Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> Sent: Wednesday, January 5, 2022 6:18 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>
> Cc: mst@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> Subject: Re: [RFC 05/10] vdpa-dev: implement the realize interface
> 
> On Wed, Jan 05, 2022 at 08:58:55AM +0800, Longpeng(Mike) wrote:
> > From: Longpeng <longpeng2@huawei.com>
> >
> > Implements the .realize interface.
> >
> > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > ---
> >  hw/virtio/vdpa-dev.c         | 114 +++++++++++++++++++++++++++++++++++
> >  include/hw/virtio/vdpa-dev.h |   8 +++
> >  2 files changed, 122 insertions(+)
> >
> > diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> > index 790117fb3b..2d534d837a 100644
> > --- a/hw/virtio/vdpa-dev.c
> > +++ b/hw/virtio/vdpa-dev.c
> > @@ -15,9 +15,122 @@
> >  #include "sysemu/sysemu.h"
> >  #include "sysemu/runstate.h"
> >
> > +static void
> > +vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > +    /* Nothing to do */
> > +}
> > +
> > +static int vdpa_dev_get_info_by_fd(int fd, uint64_t cmd, Error **errp)
> 
> This looks similar to the helper function in a previous patch but this
> time the return value type is int instead of uint32_t. Please make the
> types consistent.
> 

OK.
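
For V2 I'd fold the two helpers into one with a single return type, something
like this (sketch only; the name is a placeholder):

static int vdpa_dev_get_u32(int fd, unsigned long cmd, uint32_t *val,
                            Error **errp)
{
    if (ioctl(fd, cmd, val) < 0) {
        error_setg_errno(errp, errno,
                         "vhost-vdpa-device: cmd 0x%lx failed", cmd);
        return -1;
    }

    return 0;
}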

> > +{
> > +    int val;
> > +
> > +    if (ioctl(fd, cmd, &val) < 0) {
> > +        error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
> > +                   cmd, strerror(errno));
> > +        return -1;
> > +    }
> > +
> > +    return val;
> > +}
> > +
> > +static inline int vdpa_dev_get_queue_size(int fd, Error **errp)
> > +{
> > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VRING_NUM, errp);
> > +}
> > +
> > +static inline int vdpa_dev_get_vqs_num(int fd, Error **errp)
> > +{
> > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VQS_NUM, errp);
> > +}
> > +
> > +static inline int vdpa_dev_get_config_size(int fd, Error **errp)
> > +{
> > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_CONFIG_SIZE, errp);
> > +}
> > +
> >  static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> >  {
> > +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > +    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> > +    uint32_t device_id;
> > +    int max_queue_size;
> > +    int fd;
> > +    int i, ret;
> > +
> > +    fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> > +    if (fd == -1) {
> > +        return;
> > +    }
> > +    s->vdpa.device_fd = fd;
> 
> This is the field I suggest exposing as a QOM property so it can be set
> from the proxy object (e.g. when the PCI proxy opens the vdpa device
> before our .realize() function is called).
> 

OK.

> > +
> > +    max_queue_size = vdpa_dev_get_queue_size(fd, errp);
> > +    if (*errp) {
> > +        goto out;
> > +    }
> > +
> > +    if (s->queue_size > max_queue_size) {
> > +        error_setg(errp, "vhost-vdpa-device: invalid queue_size: %d
> (max:%d)",
> > +                   s->queue_size, max_queue_size);
> > +        goto out;
> > +    } else if (!s->queue_size) {
> > +        s->queue_size = max_queue_size;
> > +    }
> > +
> > +    ret = vdpa_dev_get_vqs_num(fd, errp);
> > +    if (*errp) {
> > +        goto out;
> > +    }
> > +
> > +    s->dev.nvqs = ret;
> 
> There is no input validation because we trust the kernel vDPA return
> values. That seems okay for now but if there is a vhost-user version of
> this in the future then input validation will be necessary to achieve
> isolation between QEMU and the vhost-user processes. I suggest including
> input validation code right away because it's harder to audit the code
> and fix missing input validation later on.
> 

Makes sense!

Do we only need to validate the upper boundary (e.g. < VIRTIO_QUEUE_MAX)?

> > +    s->dev.vqs = g_new0(struct vhost_virtqueue, s->dev.nvqs);
> > +    s->dev.vq_index = 0;
> > +    s->dev.vq_index_end = s->dev.nvqs;
> > +    s->dev.backend_features = 0;
> > +    s->started = false;
> > +
> > +    ret = vhost_dev_init(&s->dev, &s->vdpa, VHOST_BACKEND_TYPE_VDPA, 0,
> NULL);
> > +    if (ret < 0) {
> > +        error_setg(errp, "vhost-vdpa-device: vhost initialization
> failed: %s",
> > +                   strerror(-ret));
> > +        goto out;
> > +    }
> > +
> > +    ret = s->dev.vhost_ops->vhost_get_device_id(&s->dev, &device_id);
> 
> The vhost_*() API abstracts the ioctl calls but this source file and the
> PCI proxy have ioctl calls. I wonder if it's possible to move the ioctls
> calls into the vhost_*() API? That would be cleaner and also make it
> easier to add vhost-user vDPA support in the future.

We need these ioctl calls because we need to invoke them before the vhost-dev
object is initialized.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 06/10] vdpa-dev: implement the unrealize interface
  2022-01-05 11:16   ` Stefano Garzarella
@ 2022-01-06  3:23     ` longpeng2--- via
  2022-01-10  9:38       ` Stefano Garzarella
  0 siblings, 1 reply; 52+ messages in thread
From: longpeng2--- via @ 2022-01-06  3:23 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: stefanha, mst, jasowang, cohuck, pbonzini, Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Stefano Garzarella [mailto:sgarzare@redhat.com]
> Sent: Wednesday, January 5, 2022 7:16 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>
> Cc: stefanha@redhat.com; mst@redhat.com; jasowang@redhat.com;
> cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> Subject: Re: [RFC 06/10] vdpa-dev: implement the unrealize interface
> 
> On Wed, Jan 05, 2022 at 08:58:56AM +0800, Longpeng(Mike) wrote:
> >From: Longpeng <longpeng2@huawei.com>
> >
> >Implements the .unrealize interface.
> >
> >Signed-off-by: Longpeng <longpeng2@huawei.com>
> >---
> > hw/virtio/vdpa-dev.c | 22 +++++++++++++++++++++-
> > 1 file changed, 21 insertions(+), 1 deletion(-)
> >
> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >index 2d534d837a..4e4dd3d201 100644
> >--- a/hw/virtio/vdpa-dev.c
> >+++ b/hw/virtio/vdpa-dev.c
> >@@ -133,9 +133,29 @@ out:
> >     close(fd);
> > }
> >
> >+static void vhost_vdpa_vdev_unrealize(VhostVdpaDevice *s)
> >+{
> >+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
> >+    int i;
> >+
> >+    for (i = 0; i < s->num_queues; i++) {
>                        ^
> `s->num_queues` seems uninitialized to me, maybe we could just remove
> the num_queues field from VhostVdpaDevice, and use `s->dev.nvqs` as in
> vhost_vdpa_device_realize().
> 

Good catch, I'll fix the bug.

But I think we should keep the num_queues field; we'll need it if we support
migration in a later step, right?

> >+        virtio_delete_queue(s->virtqs[i]);
> >+    }
> >+    g_free(s->virtqs);
> >+    virtio_cleanup(vdev);
> >+
> >+    g_free(s->config);
> >+}
> >+
> > static void vhost_vdpa_device_unrealize(DeviceState *dev)
> > {
> >-    return;
> >+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> >+
> >+    virtio_set_status(vdev, 0);
> >+    vhost_dev_cleanup(&s->dev);
> 
> If we will use `s->dev.nvqs` in vhost_vdpa_vdev_unrealize(), we should
> call vhost_dev_cleanup() after it, just before close() as we already do
> in the error path of vhost_vdpa_device_realize().
> 

I'll try to fix the above bug first if you agree that we should keep the
num_queues field.

I just realized that I forgot to free s->dev.vqs here...
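
So the teardown would end up roughly like this (just a sketch, assuming we
keep num_queues and free dev.vqs via a saved pointer, since
vhost_dev_cleanup() still walks the array and then zeroes the vhost_dev):

static void vhost_vdpa_device_unrealize(DeviceState *dev)
{
    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
    struct vhost_virtqueue *vqs = s->dev.vqs;

    virtio_set_status(vdev, 0);
    vhost_vdpa_vdev_unrealize(s);   /* delete vqs, virtio_cleanup(), free config */
    vhost_dev_cleanup(&s->dev);     /* after the virtio teardown ... */
    g_free(vqs);                    /* the array the RFC leaked */
    close(s->vdpa.device_fd);       /* ... and just before close() */
}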

> >+    vhost_vdpa_vdev_unrealize(s);
> >+    close(s->vdpa.device_fd);
> > }
> >
> > static void
> >--
> >2.23.0
> >



^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-06  2:34               ` Jason Wang
@ 2022-01-06  8:00                 ` longpeng2--- via
  2022-01-07  2:41                   ` Jason Wang
  2022-01-06 14:09                 ` Michael S. Tsirkin
  1 sibling, 1 reply; 52+ messages in thread
From: longpeng2--- via @ 2022-01-06  8:00 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: Stefan Hajnoczi, Stefano Garzarella, Cornelia Huck, pbonzini,
	Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Thursday, January 6, 2022 10:34 AM
> To: Michael S. Tsirkin <mst@redhat.com>
> Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>; Stefan Hajnoczi <stefanha@redhat.com>; Stefano
> Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
> <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
> <qemu-devel@nongnu.org>
> Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> 
> On Wed, Jan 5, 2022 at 8:26 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jan 05, 2022 at 05:09:07PM +0800, Jason Wang wrote:
> > > On Wed, Jan 5, 2022 at 4:37 PM Longpeng (Mike, Cloud Infrastructure
> > > Service Product Dept.) <longpeng2@huawei.com> wrote:
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Jason Wang [mailto:jasowang@redhat.com]
> > > > > Sent: Wednesday, January 5, 2022 3:54 PM
> > > > > To: Michael S. Tsirkin <mst@redhat.com>
> > > > > Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > > > <longpeng2@huawei.com>; Stefan Hajnoczi <stefanha@redhat.com>; Stefano
> > > > > Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>;
> pbonzini
> > > > > <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> > > > > <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>;
> qemu-devel
> > > > > <qemu-devel@nongnu.org>
> > > > > Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> > > > >
> > > > > On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> > > > > > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com>
> wrote:
> > > > > > > >
> > > > > > > > From: Longpeng <longpeng2@huawei.com>
> > > > > > > >
> > > > > > > > To support generic vdpa deivce, we need add the following ioctls:
> > > > > > > > - GET_VECTORS_NUM: the count of vectors that supported
> > > > > > >
> > > > > > > Does this mean MSI vectors? If yes, it looks like a layer violation:
> > > > > > > vhost is transport independent.
> > > > > >
> > > > > > Well *guest* needs to know how many vectors device supports.
> > > > > > I don't think there's a way around that. Do you?
> > > > >
> > > > > We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
> > > > > simply assume #vqs + 1?
> > > > >
> > > > > > Otherwise guests will at best be suboptimal.
> > > > > >
> > > > > > >  And it reveals device implementation
> > > > > > > details which block (cross vendor) migration.
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > Not necessarily, userspace can hide this from guest if it
> > > > > > wants to, just validate.
> > > > >
> > > > > If we can hide it at vhost/uAPI level, it would be even better?
> > > > >
> > > >
> > > > Not only MSI vectors, but also queue-size, #vqs, etc.
> > >
> > > MSI is PCI specific, we have non PCI vDPA parent e.g VDUSE/simulator/mlx5
> > >
> > > And it's something that is not guaranteed to be not changed. E.g some
> > > drivers may choose to allocate MSI during set_status() which can fail
> > > for various reasons.
> > >
> > > >
> > > > Maybe the vhost level could expose the hardware's real capabilities
> > > > and let the userspace (QEMU) do the hiding? The userspace know how
> > > > to process them.
> > >
> > > #MSI vectors is much more easier to be mediated than queue-size and #vqs.
> > >
> > > For interrupts, we've already had VHOST_SET_X_KICK, we can keep
> > > allocating eventfd based on #MSI vectors to make it work with any
> > > number of MSI vectors that the virtual device had.
> >
> > Right but if hardware does not support so many then what?
> > Just fail?
> 
> Or just trigger the callbacks of the vqs that share the vector.
> 

Then we would have to disable PI (posted interrupts) if we need to share a
vector in this case?

> > Having a query API would make things somewhat cleaner imho.
> 
> I may be missing something: even if we know #vectors, we still don't know
> which virtqueues are associated with a dedicated vector?
> 
> >
> > > For queue-size, it's Ok to have a new uAPI but it's not a must, Qemu
> > > can simply fail if SET_VRING_NUM fail.
> > >
> > > For #vqs, it's OK to have a new uAPI since the emulated virtio-pci
> > > device requires knowledge the #vqs in the config space. (still not a
> > > must, we can enumerate #vqs per device type)
> > >
> > > For the config size, it's OK but not a must, technically we can simply
> > > relay what guest write to vhost-vdpa. It's just because current Qemu
> > > require to have it during virtio device initialization.
> > >
> > > Thanks
> >
> >
> > I agree, but these "OK but not a must" items make for a cleaner API, I think.
> 
> Right.
> 
> Thanks
> 
> >
> > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > >
> > > > > > > > - GET_CONFIG_SIZE: the size of the virtio config space
> > > > > > > > - GET_VQS_NUM: the count of virtqueues that exported
> > > > > > > >
> > > > > > > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > > > > > > ---
> > > > > > > >  linux-headers/linux/vhost.h | 10 ++++++++++
> > > > > > > >  1 file changed, 10 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/linux-headers/linux/vhost.h
> b/linux-headers/linux/vhost.h
> > > > > > > > index c998860d7b..c5edd75d15 100644
> > > > > > > > --- a/linux-headers/linux/vhost.h
> > > > > > > > +++ b/linux-headers/linux/vhost.h
> > > > > > > > @@ -150,4 +150,14 @@
> > > > > > > >  /* Get the valid iova range */
> > > > > > > >  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78,
> \
> > > > > > > >                                              struct
> vhost_vdpa_iova_range)
> > > > > > > > +
> > > > > > > > +/* Get the number of vectors */
> > > > > > > > +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79,
> int)
> > > > > > > > +
> > > > > > > > +/* Get the virtio config size */
> > > > > > > > +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80,
> int)
> > > > > > > > +
> > > > > > > > +/* Get the number of virtqueues */
> > > > > > > > +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81,
> int)
> > > > > > > > +
> > > > > > > >  #endif
> > > > > > > > --
> > > > > > > > 2.23.0
> > > > > > > >
> > > > > >
> > > >
> >


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 03/10] vdpa: add the infrastructure of vdpa-dev
  2022-01-06  1:22     ` longpeng2--- via
@ 2022-01-06 11:25       ` Stefan Hajnoczi
  2022-01-07  2:22         ` Jason Wang
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Hajnoczi @ 2022-01-06 11:25 UTC (permalink / raw)
  To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
  Cc: mst, jasowang, cohuck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, pbonzini, sgarzare


On Thu, Jan 06, 2022 at 01:22:19AM +0000, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
> 
> 
> > -----Original Message-----
> > From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> > Sent: Wednesday, January 5, 2022 5:49 PM
> > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > <longpeng2@huawei.com>
> > Cc: mst@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> > cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> > <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> > <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> > Subject: Re: [RFC 03/10] vdpa: add the infrastructure of vdpa-dev
> > 
> > On Wed, Jan 05, 2022 at 08:58:53AM +0800, Longpeng(Mike) wrote:
> > > +static const VirtioPCIDeviceTypeInfo vhost_vdpa_device_pci_info = {
> > > +    .base_name               = TYPE_VHOST_VDPA_DEVICE_PCI,
> > > +    .generic_name            = "vhost-vdpa-device-pci",
> > > +    .transitional_name       = "vhost-vdpa-device-pci-transitional",
> > > +    .non_transitional_name   = "vhost-vdpa-device-pci-non-transitional",
> > 
> > Does vDPA support Transitional VIRTIO devices?
> > 
> > I expected this device to support Modern devices only.
> > 
> 
> There's already a 0.95 vDPA driver (Alibaba ENI) in the kernel source, and
> supporting 0.95 devices is necessary for some older guest OSes.
> 
> I'm OK if other guys also approve of supporting 1.0+ devices only :)

If vDPA supports Transitional VIRTIO devices then it's fine to keep this
code unchanged in this patch series.

Stefan


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 05/10] vdpa-dev: implement the realize interface
  2022-01-06  3:02     ` longpeng2--- via
@ 2022-01-06 11:34       ` Stefan Hajnoczi
  2022-01-17 12:34         ` longpeng2--- via
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Hajnoczi @ 2022-01-06 11:34 UTC (permalink / raw)
  To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
  Cc: mst, jasowang, cohuck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, pbonzini, sgarzare


On Thu, Jan 06, 2022 at 03:02:37AM +0000, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
> 
> 
> > -----Original Message-----
> > From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> > Sent: Wednesday, January 5, 2022 6:18 PM
> > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > <longpeng2@huawei.com>
> > Cc: mst@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> > cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> > <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> > <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> > Subject: Re: [RFC 05/10] vdpa-dev: implement the realize interface
> > 
> > On Wed, Jan 05, 2022 at 08:58:55AM +0800, Longpeng(Mike) wrote:
> > > From: Longpeng <longpeng2@huawei.com>
> > >
> > > Implements the .realize interface.
> > >
> > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > ---
> > >  hw/virtio/vdpa-dev.c         | 114 +++++++++++++++++++++++++++++++++++
> > >  include/hw/virtio/vdpa-dev.h |   8 +++
> > >  2 files changed, 122 insertions(+)
> > >
> > > diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> > > index 790117fb3b..2d534d837a 100644
> > > --- a/hw/virtio/vdpa-dev.c
> > > +++ b/hw/virtio/vdpa-dev.c
> > > @@ -15,9 +15,122 @@
> > >  #include "sysemu/sysemu.h"
> > >  #include "sysemu/runstate.h"
> > >
> > > +static void
> > > +vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> > > +{
> > > +    /* Nothing to do */
> > > +}
> > > +
> > > +static int vdpa_dev_get_info_by_fd(int fd, uint64_t cmd, Error **errp)
> > 
> > This looks similar to the helper function in a previous patch but this
> > time the return value type is int instead of uint32_t. Please make the
> > types consistent.
> > 
> 
> OK.
> 
> > > +{
> > > +    int val;
> > > +
> > > +    if (ioctl(fd, cmd, &val) < 0) {
> > > +        error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
> > > +                   cmd, strerror(errno));
> > > +        return -1;
> > > +    }
> > > +
> > > +    return val;
> > > +}
> > > +
> > > +static inline int vdpa_dev_get_queue_size(int fd, Error **errp)
> > > +{
> > > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VRING_NUM, errp);
> > > +}
> > > +
> > > +static inline int vdpa_dev_get_vqs_num(int fd, Error **errp)
> > > +{
> > > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VQS_NUM, errp);
> > > +}
> > > +
> > > +static inline int vdpa_dev_get_config_size(int fd, Error **errp)
> > > +{
> > > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_CONFIG_SIZE, errp);
> > > +}
> > > +
> > >  static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> > >  {
> > > +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > > +    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> > > +    uint32_t device_id;
> > > +    int max_queue_size;
> > > +    int fd;
> > > +    int i, ret;
> > > +
> > > +    fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> > > +    if (fd == -1) {
> > > +        return;
> > > +    }
> > > +    s->vdpa.device_fd = fd;
> > 
> > This is the field I suggest exposing as a QOM property so it can be set
> > from the proxy object (e.g. when the PCI proxy opens the vdpa device
> > before our .realize() function is called).
> > 
> 
> OK.
> 
> > > +
> > > +    max_queue_size = vdpa_dev_get_queue_size(fd, errp);
> > > +    if (*errp) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    if (s->queue_size > max_queue_size) {
> > > +        error_setg(errp, "vhost-vdpa-device: invalid queue_size: %d
> > (max:%d)",
> > > +                   s->queue_size, max_queue_size);
> > > +        goto out;
> > > +    } else if (!s->queue_size) {
> > > +        s->queue_size = max_queue_size;
> > > +    }
> > > +
> > > +    ret = vdpa_dev_get_vqs_num(fd, errp);
> > > +    if (*errp) {
> > > +        goto out;
> > > +    }
> > > +
> > > +    s->dev.nvqs = ret;
> > 
> > There is no input validation because we trust the kernel vDPA return
> > values. That seems okay for now but if there is a vhost-user version of
> > this in the future then input validation will be necessary to achieve
> > isolation between QEMU and the vhost-user processes. I suggest including
> > input validation code right away because it's harder to audit the code
> > and fix missing input validation later on.
> > 
> 
> Makes sense!
> 
> Do we only need to validate the upper boundary (e.g. < VIRTIO_QUEUE_MAX)?

Careful, ret is currently an int so negative values would bypass the <
VIRTIO_QUEUE_MAX check.
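
Something along these lines would close both holes (a sketch on top of the
realize code quoted above):

    ret = vdpa_dev_get_vqs_num(fd, errp);
    if (*errp) {
        goto out;
    }

    if (ret <= 0 || ret > VIRTIO_QUEUE_MAX) {
        error_setg(errp, "vhost-vdpa-device: invalid number of virtqueues: %d",
                   ret);
        goto out;
    }

    s->dev.nvqs = ret;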

> 
> > > +    s->dev.vqs = g_new0(struct vhost_virtqueue, s->dev.nvqs);
> > > +    s->dev.vq_index = 0;
> > > +    s->dev.vq_index_end = s->dev.nvqs;
> > > +    s->dev.backend_features = 0;
> > > +    s->started = false;
> > > +
> > > +    ret = vhost_dev_init(&s->dev, &s->vdpa, VHOST_BACKEND_TYPE_VDPA, 0,
> > NULL);
> > > +    if (ret < 0) {
> > > +        error_setg(errp, "vhost-vdpa-device: vhost initialization
> > failed: %s",
> > > +                   strerror(-ret));
> > > +        goto out;
> > > +    }
> > > +
> > > +    ret = s->dev.vhost_ops->vhost_get_device_id(&s->dev, &device_id);
> > 
> > The vhost_*() API abstracts the ioctl calls but this source file and the
> > PCI proxy have ioctl calls. I wonder if it's possible to move the ioctls
> > calls into the vhost_*() API? That would be cleaner and also make it
> > easier to add vhost-user vDPA support in the future.
> 
> We need these ioctl calls because we need to invoke them before the vhost-dev
> object is initialized.

It may be possible to clean this up by changing how vhost_dev_init()
works but I haven't investigated. The issue is that the vhost_dev_init()
API requires information from the caller that has to be fetched from the
vDPA device. This forces the caller to communicate directly with the
vDPA device before calling vhost_dev_init(). It may be possible to move
this setup code inside vhost_dev_init() (and vhost_ops callbacks).

Stefan


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-06  2:34               ` Jason Wang
  2022-01-06  8:00                 ` longpeng2--- via
@ 2022-01-06 14:09                 ` Michael S. Tsirkin
  2022-01-07  2:53                   ` Jason Wang
  1 sibling, 1 reply; 52+ messages in thread
From: Michael S. Tsirkin @ 2022-01-06 14:09 UTC (permalink / raw)
  To: Jason Wang
  Cc: Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, Stefan Hajnoczi, pbonzini, Longpeng (Mike,
	Cloud Infrastructure Service Product Dept.),
	Stefano Garzarella

On Thu, Jan 06, 2022 at 10:34:20AM +0800, Jason Wang wrote:
> On Wed, Jan 5, 2022 at 8:26 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jan 05, 2022 at 05:09:07PM +0800, Jason Wang wrote:
> > > On Wed, Jan 5, 2022 at 4:37 PM Longpeng (Mike, Cloud Infrastructure
> > > Service Product Dept.) <longpeng2@huawei.com> wrote:
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Jason Wang [mailto:jasowang@redhat.com]
> > > > > Sent: Wednesday, January 5, 2022 3:54 PM
> > > > > To: Michael S. Tsirkin <mst@redhat.com>
> > > > > Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > > > <longpeng2@huawei.com>; Stefan Hajnoczi <stefanha@redhat.com>; Stefano
> > > > > Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
> > > > > <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> > > > > <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
> > > > > <qemu-devel@nongnu.org>
> > > > > Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> > > > >
> > > > > On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> > > > > > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> > > > > > > >
> > > > > > > > From: Longpeng <longpeng2@huawei.com>
> > > > > > > >
> > > > > > > > To support generic vdpa deivce, we need add the following ioctls:
> > > > > > > > - GET_VECTORS_NUM: the count of vectors that supported
> > > > > > >
> > > > > > > Does this mean MSI vectors? If yes, it looks like a layer violation:
> > > > > > > vhost is transport independent.
> > > > > >
> > > > > > Well *guest* needs to know how many vectors device supports.
> > > > > > I don't think there's a way around that. Do you?
> > > > >
> > > > > We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
> > > > > simply assume #vqs + 1?
> > > > >
> > > > > > Otherwise guests will at best be suboptimal.
> > > > > >
> > > > > > >  And it reveals device implementation
> > > > > > > details which block (cross vendor) migration.
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > Not necessarily, userspace can hide this from guest if it
> > > > > > wants to, just validate.
> > > > >
> > > > > If we can hide it at vhost/uAPI level, it would be even better?
> > > > >
> > > >
> > > > Not only MSI vectors, but also queue-size, #vqs, etc.
> > >
> > > MSI is PCI specific, we have non PCI vDPA parent e.g VDUSE/simulator/mlx5
> > >
> > > And it's something that is not guaranteed to be not changed. E.g some
> > > drivers may choose to allocate MSI during set_status() which can fail
> > > for various reasons.
> > >
> > > >
> > > > Maybe the vhost level could expose the hardware's real capabilities
> > > > and let the userspace (QEMU) do the hiding? The userspace know how
> > > > to process them.
> > >
> > > #MSI vectors is much more easier to be mediated than queue-size and #vqs.
> > >
> > > For interrupts, we've already had VHOST_SET_X_KICK, we can keep
> > > allocating eventfd based on #MSI vectors to make it work with any
> > > number of MSI vectors that the virtual device had.
> >
> > Right but if hardware does not support so many then what?
> > Just fail?
> 
> Or just trigger the callbacks of the vqs that share the vector.


Right, but we want userspace to be able to report this to the guest accurately
if it wants to. The guest can then configure itself correctly.


> > Having a query API would make things somewhat cleaner imho.
> 
> I may be missing something: even if we know #vectors, we still don't know
> which virtqueues are associated with a dedicated vector?

This is up to the guest.

> >
> > > For queue-size, it's Ok to have a new uAPI but it's not a must, Qemu
> > > can simply fail if SET_VRING_NUM fail.
> > >
> > > For #vqs, it's OK to have a new uAPI since the emulated virtio-pci
> > > device requires knowledge the #vqs in the config space. (still not a
> > > must, we can enumerate #vqs per device type)
> > >
> > > For the config size, it's OK but not a must, technically we can simply
> > > relay what guest write to vhost-vdpa. It's just because current Qemu
> > > require to have it during virtio device initialization.
> > >
> > > Thanks
> >
> >
> > I agree, but these "OK but not a must" items make for a cleaner API, I think.
> 
> Right.
> 
> Thanks
> 
> >
> > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > >
> > > > > > > > - GET_CONFIG_SIZE: the size of the virtio config space
> > > > > > > > - GET_VQS_NUM: the count of virtqueues that exported
> > > > > > > >
> > > > > > > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > > > > > > ---
> > > > > > > >  linux-headers/linux/vhost.h | 10 ++++++++++
> > > > > > > >  1 file changed, 10 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > > > > > > > index c998860d7b..c5edd75d15 100644
> > > > > > > > --- a/linux-headers/linux/vhost.h
> > > > > > > > +++ b/linux-headers/linux/vhost.h
> > > > > > > > @@ -150,4 +150,14 @@
> > > > > > > >  /* Get the valid iova range */
> > > > > > > >  #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
> > > > > > > >                                              struct vhost_vdpa_iova_range)
> > > > > > > > +
> > > > > > > > +/* Get the number of vectors */
> > > > > > > > +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
> > > > > > > > +
> > > > > > > > +/* Get the virtio config size */
> > > > > > > > +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
> > > > > > > > +
> > > > > > > > +/* Get the number of virtqueues */
> > > > > > > > +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
> > > > > > > > +
> > > > > > > >  #endif
> > > > > > > > --
> > > > > > > > 2.23.0
> > > > > > > >
> > > > > >
> > > >
> >



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 03/10] vdpa: add the infrastructure of vdpa-dev
  2022-01-06 11:25       ` Stefan Hajnoczi
@ 2022-01-07  2:22         ` Jason Wang
  0 siblings, 0 replies; 52+ messages in thread
From: Jason Wang @ 2022-01-07  2:22 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: mst, cohuck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, pbonzini, Longpeng (Mike,
	Cloud Infrastructure Service Product Dept.),
	sgarzare

On Thu, Jan 6, 2022 at 7:25 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Thu, Jan 06, 2022 at 01:22:19AM +0000, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
> >
> >
> > > -----Original Message-----
> > > From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> > > Sent: Wednesday, January 5, 2022 5:49 PM
> > > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > <longpeng2@huawei.com>
> > > Cc: mst@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> > > cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> > > <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> > > <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> > > Subject: Re: [RFC 03/10] vdpa: add the infrastructure of vdpa-dev
> > >
> > > On Wed, Jan 05, 2022 at 08:58:53AM +0800, Longpeng(Mike) wrote:
> > > > +static const VirtioPCIDeviceTypeInfo vhost_vdpa_device_pci_info = {
> > > > +    .base_name               = TYPE_VHOST_VDPA_DEVICE_PCI,
> > > > +    .generic_name            = "vhost-vdpa-device-pci",
> > > > +    .transitional_name       = "vhost-vdpa-device-pci-transitional",
> > > > +    .non_transitional_name   = "vhost-vdpa-device-pci-non-transitional",
> > >
> > > Does vDPA support Transitional VIRTIO devices?
> > >
> > > I expected this device to support Modern devices only.
> > >
> >
> > There's already a 0.95 vdpa driver (Alibaba ENI) in the kernel source and
> > supporting 0.95 devices is necessary for some older GuestOS.
> >
> > I'm OK if other guys also approve of supporting 1.0+ devices only :)
>
> If vDPA supports Transitional VIRTIO devices then it's fine to keep this
> code unchanged in this patch series.

Right, and I think that's the plan.

Thanks

>
> Stefan



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-06  8:00                 ` longpeng2--- via
@ 2022-01-07  2:41                   ` Jason Wang
  0 siblings, 0 replies; 52+ messages in thread
From: Jason Wang @ 2022-01-07  2:41 UTC (permalink / raw)
  To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.),
	Michael S. Tsirkin
  Cc: Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, Stefan Hajnoczi, pbonzini, Zhu, Lingshan,
	Stefano Garzarella


On 2022/1/6 4:00 PM, Longpeng (Mike, Cloud Infrastructure Service Product
Dept.) wrote:
>> -----Original Message-----
>> From: Jason Wang [mailto:jasowang@redhat.com]
>> Sent: Thursday, January 6, 2022 10:34 AM
>> To: Michael S. Tsirkin<mst@redhat.com>
>> Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
>> <longpeng2@huawei.com>; Stefan Hajnoczi<stefanha@redhat.com>; Stefano
>> Garzarella<sgarzare@redhat.com>; Cornelia Huck<cohuck@redhat.com>; pbonzini
>> <pbonzini@redhat.com>; Gonglei (Arei)<arei.gonglei@huawei.com>; Yechuan
>> <yechuan@huawei.com>; Huangzhichao<huangzhichao@huawei.com>; qemu-devel
>> <qemu-devel@nongnu.org>
>> Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
>>
>> On Wed, Jan 5, 2022 at 8:26 PM Michael S. Tsirkin<mst@redhat.com>  wrote:
>>> On Wed, Jan 05, 2022 at 05:09:07PM +0800, Jason Wang wrote:
>>>> On Wed, Jan 5, 2022 at 4:37 PM Longpeng (Mike, Cloud Infrastructure
>>>> Service Product Dept.)<longpeng2@huawei.com>  wrote:
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jason Wang [mailto:jasowang@redhat.com]
>>>>>> Sent: Wednesday, January 5, 2022 3:54 PM
>>>>>> To: Michael S. Tsirkin<mst@redhat.com>
>>>>>> Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
>>>>>> <longpeng2@huawei.com>; Stefan Hajnoczi<stefanha@redhat.com>; Stefano
>>>>>> Garzarella<sgarzare@redhat.com>; Cornelia Huck<cohuck@redhat.com>;
>> pbonzini
>>>>>> <pbonzini@redhat.com>; Gonglei (Arei)<arei.gonglei@huawei.com>; Yechuan
>>>>>> <yechuan@huawei.com>; Huangzhichao<huangzhichao@huawei.com>;
>> qemu-devel
>>>>>> <qemu-devel@nongnu.org>
>>>>>> Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
>>>>>>
>>>>>> On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin<mst@redhat.com>  wrote:
>>>>>>> On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
>>>>>>>> On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike)<longpeng2@huawei.com>
>> wrote:
>>>>>>>>> From: Longpeng<longpeng2@huawei.com>
>>>>>>>>>
>>>>>>>>> To support generic vdpa deivce, we need add the following ioctls:
>>>>>>>>> - GET_VECTORS_NUM: the count of vectors that supported
>>>>>>>> Does this mean MSI vectors? If yes, it looks like a layer violation:
>>>>>>>> vhost is transport independent.
>>>>>>> Well*guest*  needs to know how many vectors device supports.
>>>>>>> I don't think there's a way around that. Do you?
>>>>>> We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
>>>>>> simply assume #vqs + 1?
>>>>>>
>>>>>>> Otherwise guests will at best be suboptimal.
>>>>>>>
>>>>>>>>   And it reveals device implementation
>>>>>>>> details which block (cross vendor) migration.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>> Not necessarily, userspace can hide this from guest if it
>>>>>>> wants to, just validate.
>>>>>> If we can hide it at vhost/uAPI level, it would be even better?
>>>>>>
>>>>> Not only MSI vectors, but also queue-size, #vqs, etc.
>>>> MSI is PCI specific, we have non PCI vDPA parent e.g VDUSE/simulator/mlx5
>>>>
>>>> And it's something that is not guaranteed to be not changed. E.g some
>>>> drivers may choose to allocate MSI during set_status() which can fail
>>>> for various reasons.
>>>>
>>>>> Maybe the vhost level could expose the hardware's real capabilities
>>>>> and let the userspace (QEMU) do the hiding? The userspace know how
>>>>> to process them.
>>>> #MSI vectors is much more easier to be mediated than queue-size and #vqs.
>>>>
>>>> For interrupts, we've already had VHOST_SET_X_KICK, we can keep
>>>> allocating eventfd based on #MSI vectors to make it work with any
>>>> number of MSI vectors that the virtual device had.
>>> Right but if hardware does not support so many then what?
>>> Just fail?
>> Or just trigger the callback of vqs that shares the vector.
>>
> Then we should disable PI if we need to share a vector in this case?


I may be missing something, but I don't see any reason for doing this. I
think the irqbypass manager and the arch-specific PI code should deal
with this case.

Ling Shan (cced) told me it worked in the past.

Thanks


>



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
  2022-01-06 14:09                 ` Michael S. Tsirkin
@ 2022-01-07  2:53                   ` Jason Wang
  0 siblings, 0 replies; 52+ messages in thread
From: Jason Wang @ 2022-01-07  2:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, Stefan Hajnoczi, pbonzini, Longpeng (Mike,
	Cloud Infrastructure Service Product Dept.),
	Stefano Garzarella


On 2022/1/6 10:09 PM, Michael S. Tsirkin wrote:
> On Thu, Jan 06, 2022 at 10:34:20AM +0800, Jason Wang wrote:
>> On Wed, Jan 5, 2022 at 8:26 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>> On Wed, Jan 05, 2022 at 05:09:07PM +0800, Jason Wang wrote:
>>>> On Wed, Jan 5, 2022 at 4:37 PM Longpeng (Mike, Cloud Infrastructure
>>>> Service Product Dept.) <longpeng2@huawei.com> wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jason Wang [mailto:jasowang@redhat.com]
>>>>>> Sent: Wednesday, January 5, 2022 3:54 PM
>>>>>> To: Michael S. Tsirkin <mst@redhat.com>
>>>>>> Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
>>>>>> <longpeng2@huawei.com>; Stefan Hajnoczi <stefanha@redhat.com>; Stefano
>>>>>> Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
>>>>>> <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
>>>>>> <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
>>>>>> <qemu-devel@nongnu.org>
>>>>>> Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
>>>>>>
>>>>>> On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>> On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
>>>>>>>> On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
>>>>>>>>> From: Longpeng <longpeng2@huawei.com>
>>>>>>>>>
>>>>>>>>> To support generic vdpa deivce, we need add the following ioctls:
>>>>>>>>> - GET_VECTORS_NUM: the count of vectors that supported
>>>>>>>> Does this mean MSI vectors? If yes, it looks like a layer violation:
>>>>>>>> vhost is transport independent.
>>>>>>> Well *guest* needs to know how many vectors device supports.
>>>>>>> I don't think there's a way around that. Do you?
>>>>>> We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
>>>>>> simply assume #vqs + 1?
>>>>>>
>>>>>>> Otherwise guests will at best be suboptimal.
>>>>>>>
>>>>>>>>   And it reveals device implementation
>>>>>>>> details which block (cross vendor) migration.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>> Not necessarily, userspace can hide this from guest if it
>>>>>>> wants to, just validate.
>>>>>> If we can hide it at vhost/uAPI level, it would be even better?
>>>>>>
>>>>> Not only MSI vectors, but also queue-size, #vqs, etc.
>>>> MSI is PCI specific, we have non PCI vDPA parent e.g VDUSE/simulator/mlx5
>>>>
>>>> And it's something that is not guaranteed to be not changed. E.g some
>>>> drivers may choose to allocate MSI during set_status() which can fail
>>>> for various reasons.
>>>>
>>>>> Maybe the vhost level could expose the hardware's real capabilities
>>>>> and let the userspace (QEMU) do the hiding? The userspace know how
>>>>> to process them.
>>>> #MSI vectors is much more easier to be mediated than queue-size and #vqs.
>>>>
>>>> For interrupts, we've already had VHOST_SET_X_KICK, we can keep
>>>> allocating eventfd based on #MSI vectors to make it work with any
>>>> number of MSI vectors that the virtual device had.
>>> Right but if hardware does not support so many then what?
>>> Just fail?
>> Or just trigger the callback of vqs that shares the vector.
>
> Right but we want userspace to be able to report this to guest accurately
> if it wants to. Guest can then configure itself correctly.
>
>
>>> Having a query API would make things somewhat cleaner imho.
>> I may miss something,  even if we know #vectors, we still don't know
>> the associated virtqueues for a dedicated vector?
> This is up to the guest.


Just to clarify the possible issue: this only works if the vDPA parent
uses the same irq binding policy as the virtio-pci driver does in the
guest.

Consider a vDPA parent with 3 vectors allocated:

host vector 0: tx/rx
host vector 1: cvq
host vector 2: config

So we return 3 for get_vectors, and the virtual device will have 3
vectors in this case.

But a guest driver may do:

guest vector 0: tx (eventfd0)
guest vector 1: rx (eventfd1)
guest vector 2: cvq/config (eventfd2)

The irq handler of host vector 0 will then notify both eventfd0 (guest
vector 0) and eventfd1 (guest vector 1) in this case.
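
Roughly, the fan-out on the QEMU side would have to look something like
the sketch below (the HostVector struct and all the names here are made
up, just to illustrate the point, it's not an existing API):

/*
 * Sketch only: one call eventfd per *host* vector, fanned out to the
 * guest notifiers of every vq the guest happened to bind elsewhere.
 * EventNotifier helpers are from include/qemu/event_notifier.h.
 */
typedef struct HostVector {
    EventNotifier call_notifier;      /* signalled for the host vector */
    EventNotifier **guest_notifiers;  /* guest MSI-X vectors to kick   */
    int num_guest_notifiers;
} HostVector;

static void host_vector_handler(EventNotifier *n)
{
    HostVector *hv = container_of(n, HostVector, call_notifier);
    int i;

    if (!event_notifier_test_and_clear(n)) {
        return;
    }
    /* e.g. host vector 0 ends up kicking both eventfd0 and eventfd1 */
    for (i = 0; i < hv->num_guest_notifiers; i++) {
        event_notifier_set(hv->guest_notifiers[i]);
    }
}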

And using such "vector passthrough" may block migration between vDPA
devices whose only difference is the #vectors.

Thanks


>
>>>> For queue-size, it's Ok to have a new uAPI but it's not a must, Qemu
>>>> can simply fail if SET_VRING_NUM fail.
>>>>
>>>> For #vqs, it's OK to have a new uAPI since the emulated virtio-pci
>>>> device requires knowledge the #vqs in the config space. (still not a
>>>> must, we can enumerate #vqs per device type)
>>>>
>>>> For the config size, it's OK but not a must, technically we can simply
>>>> relay what guest write to vhost-vdpa. It's just because current Qemu
>>>> require to have it during virtio device initialization.
>>>>
>>>> Thanks
>>>
>>> I agree but these ok things make for a cleaner API I think.
>> Right.
>>
>> Thanks
>>
>>>>>> Thanks
>>>>>>
>>>>>>>
>>>>>>>>> - GET_CONFIG_SIZE: the size of the virtio config space
>>>>>>>>> - GET_VQS_NUM: the count of virtqueues that exported
>>>>>>>>>
>>>>>>>>> Signed-off-by: Longpeng <longpeng2@huawei.com>
>>>>>>>>> ---
>>>>>>>>>   linux-headers/linux/vhost.h | 10 ++++++++++
>>>>>>>>>   1 file changed, 10 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
>>>>>>>>> index c998860d7b..c5edd75d15 100644
>>>>>>>>> --- a/linux-headers/linux/vhost.h
>>>>>>>>> +++ b/linux-headers/linux/vhost.h
>>>>>>>>> @@ -150,4 +150,14 @@
>>>>>>>>>   /* Get the valid iova range */
>>>>>>>>>   #define VHOST_VDPA_GET_IOVA_RANGE      _IOR(VHOST_VIRTIO, 0x78, \
>>>>>>>>>                                               struct vhost_vdpa_iova_range)
>>>>>>>>> +
>>>>>>>>> +/* Get the number of vectors */
>>>>>>>>> +#define VHOST_VDPA_GET_VECTORS_NUM     _IOR(VHOST_VIRTIO, 0x79, int)
>>>>>>>>> +
>>>>>>>>> +/* Get the virtio config size */
>>>>>>>>> +#define VHOST_VDPA_GET_CONFIG_SIZE     _IOR(VHOST_VIRTIO, 0x80, int)
>>>>>>>>> +
>>>>>>>>> +/* Get the number of virtqueues */
>>>>>>>>> +#define VHOST_VDPA_GET_VQS_NUM         _IOR(VHOST_VIRTIO, 0x81, int)
>>>>>>>>> +
>>>>>>>>>   #endif
>>>>>>>>> --
>>>>>>>>> 2.23.0
>>>>>>>>>



^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 01/10] virtio: get class_id and pci device id by the virtio id
  2022-01-05  6:15       ` Jason Wang
@ 2022-01-10  3:03         ` longpeng2--- via
  0 siblings, 0 replies; 52+ messages in thread
From: longpeng2--- via @ 2022-01-10  3:03 UTC (permalink / raw)
  To: Jason Wang
  Cc: Stefan Hajnoczi, mst, Stefano Garzarella, Cornelia Huck,
	pbonzini, Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Jason Wang [mailto:jasowang@redhat.com]
> Sent: Wednesday, January 5, 2022 2:15 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>; mst <mst@redhat.com>; Stefano
> Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>; pbonzini
> <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
> <qemu-devel@nongnu.org>
> Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio
> id
> 
> On Wed, Jan 5, 2022 at 1:48 PM Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) <longpeng2@huawei.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Jason Wang [mailto:jasowang@redhat.com]
> > > Sent: Wednesday, January 5, 2022 12:38 PM
> > > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > <longpeng2@huawei.com>
> > > Cc: Stefan Hajnoczi <stefanha@redhat.com>; mst <mst@redhat.com>; Stefano
> > > Garzarella <sgarzare@redhat.com>; Cornelia Huck <cohuck@redhat.com>;
> pbonzini
> > > <pbonzini@redhat.com>; Gonglei (Arei) <arei.gonglei@huawei.com>; Yechuan
> > > <yechuan@huawei.com>; Huangzhichao <huangzhichao@huawei.com>; qemu-devel
> > > <qemu-devel@nongnu.org>
> > > Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio
> > > id
> > >
> > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) <longpeng2@huawei.com> wrote:
> > > >
> > > > From: Longpeng <longpeng2@huawei.com>
> > > >
> > > > Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> > > > deivce which is specificed by the "Virtio Device ID".
> > > >
> > > > These helpers will be used to build the generic vDPA device later.
> > > >
> > > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > > ---
> > > >  hw/virtio/virtio-pci.c | 93 ++++++++++++++++++++++++++++++++++++++++++
> > > >  hw/virtio/virtio-pci.h |  4 ++
> > > >  2 files changed, 97 insertions(+)
> > > >
> > > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > > index 750aa47ec1..843085c4ea 100644
> > > > --- a/hw/virtio/virtio-pci.c
> > > > +++ b/hw/virtio/virtio-pci.c
> > > > @@ -19,6 +19,7 @@
> > > >
> > > >  #include "exec/memop.h"
> > > >  #include "standard-headers/linux/virtio_pci.h"
> > > > +#include "standard-headers/linux/virtio_ids.h"
> > > >  #include "hw/boards.h"
> > > >  #include "hw/virtio/virtio.h"
> > > >  #include "migration/qemu-file-types.h"
> > > > @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int
> n,
> > > QEMUFile *f)
> > > >      return 0;
> > > >  }
> > > >
> > > > +typedef struct VirtIOPCIIDInfo {
> > > > +    uint16_t vdev_id; /* virtio id */
> > > > +    uint16_t pdev_id; /* pci device id */
> > > > +    uint16_t class_id;
> > > > +} VirtIOPCIIDInfo;
> > > > +
> > > > +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> > > > +    {
> > >
> > > Any way to get rid of this array? E.g using the algorithm that is used
> > > by the kernel virtio driver.
> > >
> >
> > For device id, we can use the algorithm if we no need to support
> > Transitional id. But how to get the class id ?
> 
> Right, I miss this. So the current code should be fine.
> 

Maybe the following approach would be better? It saves about 40 lines.

#define VIRTIO_PCI_ID_INFO(name, class)   \
    {VIRTIO_ID_##name, PCI_DEVICE_ID_VIRTIO_##name, class}

static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
    VIRTIO_PCI_ID_INFO(NET,     PCI_CLASS_NETWORK_ETHERNET),
    VIRTIO_PCI_ID_INFO(BLOCK,   PCI_CLASS_STORAGE_SCSI),
    VIRTIO_PCI_ID_INFO(SCSI,    PCI_CLASS_STORAGE_SCSI),
    VIRTIO_PCI_ID_INFO(CONSOLE, PCI_CLASS_COMMUNICATION_OTHER),
    VIRTIO_PCI_ID_INFO(VSOCK,   PCI_CLASS_COMMUNICATION_OTHER),
    VIRTIO_PCI_ID_INFO(IOMMU,   PCI_CLASS_OTHERS),
    VIRTIO_PCI_ID_INFO(MEM,     PCI_CLASS_OTHERS),
    VIRTIO_PCI_ID_INFO(PMEM,    PCI_CLASS_OTHERS),
    VIRTIO_PCI_ID_INFO(RNG,     PCI_CLASS_OTHERS),
    VIRTIO_PCI_ID_INFO(BALLOON, PCI_CLASS_OTHERS),
    VIRTIO_PCI_ID_INFO(9P,      PCI_BASE_CLASS_NETWORK),
};
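
For example, VIRTIO_PCI_ID_INFO(NET, PCI_CLASS_NETWORK_ETHERNET) expands
to {VIRTIO_ID_NET, PCI_DEVICE_ID_VIRTIO_NET, PCI_CLASS_NETWORK_ETHERNET},
i.e. the same {vdev_id, pdev_id, class_id} entry as the open-coded array,
just generated by token pasting.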


> Thanks
> 
> >
> > > Thanks
> > >
> > > > +        .vdev_id = VIRTIO_ID_NET,
> > > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> > > > +        .class_id = PCI_CLASS_NETWORK_ETHERNET,
> > > > +    },
> > > > +    {
> > > > +        .vdev_id = VIRTIO_ID_BLOCK,
> > > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> > > > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > > > +    },
> > > > +    {
> > > > +        .vdev_id = VIRTIO_ID_CONSOLE,
> > > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> > > > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > > > +    },
> > > > +    {
> > > > +        .vdev_id = VIRTIO_ID_SCSI,
> > > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> > > > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > > > +    },
> > > > +    {
> > > > +        .vdev_id = VIRTIO_ID_9P,
> > > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> > > > +        .class_id = PCI_BASE_CLASS_NETWORK,
> > > > +    },
> > > > +    {
> > > > +        .vdev_id = VIRTIO_ID_VSOCK,
> > > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> > > > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > > > +    },
> > > > +    {
> > > > +        .vdev_id = VIRTIO_ID_IOMMU,
> > > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> > > > +        .class_id = PCI_CLASS_OTHERS,
> > > > +    },
> > > > +    {
> > > > +        .vdev_id = VIRTIO_ID_MEM,
> > > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> > > > +        .class_id = PCI_CLASS_OTHERS,
> > > > +    },
> > > > +    {
> > > > +        .vdev_id = VIRTIO_ID_PMEM,
> > > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> > > > +        .class_id = PCI_CLASS_OTHERS,
> > > > +    },
> > > > +    {
> > > > +        .vdev_id = VIRTIO_ID_RNG,
> > > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> > > > +        .class_id = PCI_CLASS_OTHERS,
> > > > +    },
> > > > +    {
> > > > +        .vdev_id = VIRTIO_ID_BALLOON,
> > > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> > > > +        .class_id = PCI_CLASS_OTHERS,
> > > > +    },
> > > > +};
> > > > +
> > > > +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> > > > +{
> > > > +    VirtIOPCIIDInfo info = {};
> > > > +    int i;
> > > > +
> > > > +    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> > > > +        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> > > > +            info = virtio_pci_id_info[i];
> > > > +            break;
> > > > +        }
> > > > +    }
> > > > +
> > > > +    return info;
> > > > +}
> > > > +
> > > > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> > > > +{
> > > > +    return virtio_pci_get_id_info(device_id).pdev_id;
> > > > +}
> > > > +
> > > > +uint16_t virtio_pci_get_class_id(uint16_t device_id)
> > > > +{
> > > > +    return virtio_pci_get_id_info(device_id).class_id;
> > > > +}
> > > > +
> > > >  static bool virtio_pci_ioeventfd_enabled(DeviceState *d)
> > > >  {
> > > >      VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
> > > > @@ -1674,6 +1764,9 @@ static void virtio_pci_device_plugged(DeviceState
> *d,
> > > Error **errp)
> > > >           * is set to PCI_SUBVENDOR_ID_REDHAT_QUMRANET by default.
> > > >           */
> > > >          pci_set_word(config + PCI_SUBSYSTEM_ID,
> > > virtio_bus_get_vdev_id(bus));
> > > > +        if (proxy->pdev_id) {
> > > > +            pci_config_set_device_id(config, proxy->pdev_id);
> > > > +        }
> > > >      } else {
> > > >          /* pure virtio-1.0 */
> > > >          pci_set_word(config + PCI_VENDOR_ID,
> > > > diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> > > > index 2446dcd9ae..06aa59436e 100644
> > > > --- a/hw/virtio/virtio-pci.h
> > > > +++ b/hw/virtio/virtio-pci.h
> > > > @@ -146,6 +146,7 @@ struct VirtIOPCIProxy {
> > > >      bool disable_modern;
> > > >      bool ignore_backend_features;
> > > >      OnOffAuto disable_legacy;
> > > > +    uint16_t pdev_id;
> > > >      uint32_t class_code;
> > > >      uint32_t nvectors;
> > > >      uint32_t dfselect;
> > > > @@ -158,6 +159,9 @@ struct VirtIOPCIProxy {
> > > >      VirtioBusState bus;
> > > >  };
> > > >
> > > > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id);
> > > > +uint16_t virtio_pci_get_class_id(uint16_t device_id);
> > > > +
> > > >  static inline bool virtio_pci_modern(VirtIOPCIProxy *proxy)
> > > >  {
> > > >      return !proxy->disable_modern;
> > > > --
> > > > 2.23.0
> > > >
> >


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio id
  2022-01-05  0:58 ` [RFC 01/10] virtio: get class_id and pci device id by the virtio id Longpeng(Mike) via
  2022-01-05  4:37   ` Jason Wang
  2022-01-05 10:46   ` Cornelia Huck
@ 2022-01-10  5:43   ` Michael S. Tsirkin
  2022-01-10  6:27     ` longpeng2--- via
  2 siblings, 1 reply; 52+ messages in thread
From: Michael S. Tsirkin @ 2022-01-10  5:43 UTC (permalink / raw)
  To: Longpeng(Mike)
  Cc: jasowang, cohuck, qemu-devel, yechuan, arei.gonglei,
	huangzhichao, stefanha, pbonzini, sgarzare

On Wed, Jan 05, 2022 at 08:58:51AM +0800, Longpeng(Mike) wrote:
> From: Longpeng <longpeng2@huawei.com>
> 
> Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> deivce which is specificed by the "Virtio Device ID".

ton of typos here.

> These helpers will be used to build the generic vDPA device later.
> 
> Signed-off-by: Longpeng <longpeng2@huawei.com>
> ---
>  hw/virtio/virtio-pci.c | 93 ++++++++++++++++++++++++++++++++++++++++++
>  hw/virtio/virtio-pci.h |  4 ++
>  2 files changed, 97 insertions(+)
> 
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 750aa47ec1..843085c4ea 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -19,6 +19,7 @@
>  
>  #include "exec/memop.h"
>  #include "standard-headers/linux/virtio_pci.h"
> +#include "standard-headers/linux/virtio_ids.h"
>  #include "hw/boards.h"
>  #include "hw/virtio/virtio.h"
>  #include "migration/qemu-file-types.h"
> @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n, QEMUFile *f)
>      return 0;
>  }
>  
> +typedef struct VirtIOPCIIDInfo {
> +    uint16_t vdev_id; /* virtio id */
> +    uint16_t pdev_id; /* pci device id */
> +    uint16_t class_id;
> +} VirtIOPCIIDInfo;


if this is transitional as comment says make it explicit
in the names and comments.
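
E.g. something like this, just to illustrate the naming (trans_devid is
only a suggestion):

typedef struct VirtIOPCIIDInfo {
    uint16_t vdev_id;      /* virtio id */
    uint16_t trans_devid;  /* transitional PCI device id */
    uint16_t class_id;
} VirtIOPCIIDInfo;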

> +
> +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> +    {
> +        .vdev_id = VIRTIO_ID_NET,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> +        .class_id = PCI_CLASS_NETWORK_ETHERNET,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_BLOCK,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> +        .class_id = PCI_CLASS_STORAGE_SCSI,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_CONSOLE,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_SCSI,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> +        .class_id = PCI_CLASS_STORAGE_SCSI,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_9P,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> +        .class_id = PCI_BASE_CLASS_NETWORK,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_VSOCK,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_IOMMU,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_MEM,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_PMEM,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_RNG,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +    {
> +        .vdev_id = VIRTIO_ID_BALLOON,
> +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> +        .class_id = PCI_CLASS_OTHERS,
> +    },
> +};
> +


this is the list from the spec:

0x1000 network card
0x1001 block device
0x1002 memory ballooning (traditional)
0x1003 console
0x1004 SCSI host
0x1005 entropy source
0x1009 9P transport


I'd drop all the rest, use the algorithm for non transitional.
And when class is other I'd just not include it in the array,
make this the default.
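
Something like this maybe (completely untested sketch; 0x1040 + virtio id
is the non-transitional device ID the spec and the kernel driver use):

static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
{
    VirtIOPCIIDInfo info = {
        .vdev_id  = vdev_id,
        .pdev_id  = 0x1040 + vdev_id,  /* non-transitional device ID */
        .class_id = PCI_CLASS_OTHERS,  /* default when not in the array */
    };
    int i;

    /* only the transitional devices from the list above need an entry */
    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
            info = virtio_pci_id_info[i];
            break;
        }
    }

    return info;
}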



> +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> +{
> +    VirtIOPCIIDInfo info = {};
> +    int i;
> +
> +    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> +        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> +            info = virtio_pci_id_info[i];
> +            break;
> +        }
> +    }
> +
> +    return info;
> +}
> +
> +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> +{
> +    return virtio_pci_get_id_info(device_id).pdev_id;
> +}
> +
> +uint16_t virtio_pci_get_class_id(uint16_t device_id)
> +{
> +    return virtio_pci_get_id_info(device_id).class_id;
> +}
> +
>  static bool virtio_pci_ioeventfd_enabled(DeviceState *d)
>  {
>      VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
> @@ -1674,6 +1764,9 @@ static void virtio_pci_device_plugged(DeviceState *d, Error **errp)
>           * is set to PCI_SUBVENDOR_ID_REDHAT_QUMRANET by default.
>           */
>          pci_set_word(config + PCI_SUBSYSTEM_ID, virtio_bus_get_vdev_id(bus));
> +        if (proxy->pdev_id) {
> +            pci_config_set_device_id(config, proxy->pdev_id);
> +        }
>      } else {
>          /* pure virtio-1.0 */
>          pci_set_word(config + PCI_VENDOR_ID,
> diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> index 2446dcd9ae..06aa59436e 100644
> --- a/hw/virtio/virtio-pci.h
> +++ b/hw/virtio/virtio-pci.h
> @@ -146,6 +146,7 @@ struct VirtIOPCIProxy {
>      bool disable_modern;
>      bool ignore_backend_features;
>      OnOffAuto disable_legacy;
> +    uint16_t pdev_id;
>      uint32_t class_code;
>      uint32_t nvectors;
>      uint32_t dfselect;
> @@ -158,6 +159,9 @@ struct VirtIOPCIProxy {
>      VirtioBusState bus;
>  };
>  
> +uint16_t virtio_pci_get_pci_devid(uint16_t device_id);
> +uint16_t virtio_pci_get_class_id(uint16_t device_id);
> +
>  static inline bool virtio_pci_modern(VirtIOPCIProxy *proxy)
>  {
>      return !proxy->disable_modern;
> -- 
> 2.23.0



^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 01/10] virtio: get class_id and pci device id by the virtio id
  2022-01-10  5:43   ` Michael S. Tsirkin
@ 2022-01-10  6:27     ` longpeng2--- via
  2022-01-10  7:14       ` Michael S. Tsirkin
  0 siblings, 1 reply; 52+ messages in thread
From: longpeng2--- via @ 2022-01-10  6:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: stefanha, jasowang, sgarzare, cohuck, pbonzini, Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Michael S. Tsirkin [mailto:mst@redhat.com]
> Sent: Monday, January 10, 2022 1:43 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>
> Cc: stefanha@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio
> id
> 
> On Wed, Jan 05, 2022 at 08:58:51AM +0800, Longpeng(Mike) wrote:
> > From: Longpeng <longpeng2@huawei.com>
> >
> > Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> > deivce which is specificed by the "Virtio Device ID".
> 
> ton of typos here.
> 

Will fix all in the V2.

> > These helpers will be used to build the generic vDPA device later.
> >
> > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > ---
> >  hw/virtio/virtio-pci.c | 93 ++++++++++++++++++++++++++++++++++++++++++
> >  hw/virtio/virtio-pci.h |  4 ++
> >  2 files changed, 97 insertions(+)
> >
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index 750aa47ec1..843085c4ea 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -19,6 +19,7 @@
> >
> >  #include "exec/memop.h"
> >  #include "standard-headers/linux/virtio_pci.h"
> > +#include "standard-headers/linux/virtio_ids.h"
> >  #include "hw/boards.h"
> >  #include "hw/virtio/virtio.h"
> >  #include "migration/qemu-file-types.h"
> > @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n,
> QEMUFile *f)
> >      return 0;
> >  }
> >
> > +typedef struct VirtIOPCIIDInfo {
> > +    uint16_t vdev_id; /* virtio id */
> > +    uint16_t pdev_id; /* pci device id */
> > +    uint16_t class_id;
> > +} VirtIOPCIIDInfo;
> 
> 
> if this is transitional as comment says make it explicit
> in the names and comments.
> 

OK.

> > +
> > +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> > +    {
> > +        .vdev_id = VIRTIO_ID_NET,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> > +        .class_id = PCI_CLASS_NETWORK_ETHERNET,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_BLOCK,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_CONSOLE,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_SCSI,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_9P,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> > +        .class_id = PCI_BASE_CLASS_NETWORK,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_VSOCK,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_IOMMU,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_MEM,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_PMEM,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_RNG,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +    {
> > +        .vdev_id = VIRTIO_ID_BALLOON,
> > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> > +        .class_id = PCI_CLASS_OTHERS,
> > +    },
> > +};
> > +
> 
> 
> this is the list from the spec:
> 
> 0x1000 network card
> 0x1001 block device
> 0x1002 memory ballooning (traditional)
> 0x1003 console
> 0x1004 SCSI host
> 0x1005 entropy source
> 0x1009 9P transport
> 

Why are the following device IDs introduced? They are non-transitional
devices.

#define PCI_DEVICE_ID_VIRTIO_VSOCK       0x1012
#define PCI_DEVICE_ID_VIRTIO_PMEM        0x1013
#define PCI_DEVICE_ID_VIRTIO_IOMMU       0x1014
#define PCI_DEVICE_ID_VIRTIO_MEM         0x1015

> 
> I'd drop all the rest, use the algorithm for non transitional.
> And when class is other I'd just not include it in the array,
> make this the default.
> 
> 
> 
> > +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> > +{
> > +    VirtIOPCIIDInfo info = {};
> > +    int i;
> > +
> > +    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> > +        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> > +            info = virtio_pci_id_info[i];
> > +            break;
> > +        }
> > +    }
> > +
> > +    return info;
> > +}
> > +
> > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> > +{
> > +    return virtio_pci_get_id_info(device_id).pdev_id;
> > +}
> > +
> > +uint16_t virtio_pci_get_class_id(uint16_t device_id)
> > +{
> > +    return virtio_pci_get_id_info(device_id).class_id;
> > +}
> > +
> >  static bool virtio_pci_ioeventfd_enabled(DeviceState *d)
> >  {
> >      VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
> > @@ -1674,6 +1764,9 @@ static void virtio_pci_device_plugged(DeviceState *d,
> Error **errp)
> >           * is set to PCI_SUBVENDOR_ID_REDHAT_QUMRANET by default.
> >           */
> >          pci_set_word(config + PCI_SUBSYSTEM_ID,
> virtio_bus_get_vdev_id(bus));
> > +        if (proxy->pdev_id) {
> > +            pci_config_set_device_id(config, proxy->pdev_id);
> > +        }
> >      } else {
> >          /* pure virtio-1.0 */
> >          pci_set_word(config + PCI_VENDOR_ID,
> > diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> > index 2446dcd9ae..06aa59436e 100644
> > --- a/hw/virtio/virtio-pci.h
> > +++ b/hw/virtio/virtio-pci.h
> > @@ -146,6 +146,7 @@ struct VirtIOPCIProxy {
> >      bool disable_modern;
> >      bool ignore_backend_features;
> >      OnOffAuto disable_legacy;
> > +    uint16_t pdev_id;
> >      uint32_t class_code;
> >      uint32_t nvectors;
> >      uint32_t dfselect;
> > @@ -158,6 +159,9 @@ struct VirtIOPCIProxy {
> >      VirtioBusState bus;
> >  };
> >
> > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id);
> > +uint16_t virtio_pci_get_class_id(uint16_t device_id);
> > +
> >  static inline bool virtio_pci_modern(VirtIOPCIProxy *proxy)
> >  {
> >      return !proxy->disable_modern;
> > --
> > 2.23.0



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio id
  2022-01-10  6:27     ` longpeng2--- via
@ 2022-01-10  7:14       ` Michael S. Tsirkin
  0 siblings, 0 replies; 52+ messages in thread
From: Michael S. Tsirkin @ 2022-01-10  7:14 UTC (permalink / raw)
  To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
  Cc: jasowang, cohuck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, stefanha, pbonzini, sgarzare

On Mon, Jan 10, 2022 at 06:27:05AM +0000, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
> 
> 
> > -----Original Message-----
> > From: Michael S. Tsirkin [mailto:mst@redhat.com]
> > Sent: Monday, January 10, 2022 1:43 PM
> > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > <longpeng2@huawei.com>
> > Cc: stefanha@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> > cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> > <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> > <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> > Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio
> > id
> > 
> > On Wed, Jan 05, 2022 at 08:58:51AM +0800, Longpeng(Mike) wrote:
> > > From: Longpeng <longpeng2@huawei.com>
> > >
> > > Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> > > deivce which is specificed by the "Virtio Device ID".
> > 
> > ton of typos here.
> > 
> 
> Will fix all in the V2.
> 
> > > These helpers will be used to build the generic vDPA device later.
> > >
> > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > ---
> > >  hw/virtio/virtio-pci.c | 93 ++++++++++++++++++++++++++++++++++++++++++
> > >  hw/virtio/virtio-pci.h |  4 ++
> > >  2 files changed, 97 insertions(+)
> > >
> > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > index 750aa47ec1..843085c4ea 100644
> > > --- a/hw/virtio/virtio-pci.c
> > > +++ b/hw/virtio/virtio-pci.c
> > > @@ -19,6 +19,7 @@
> > >
> > >  #include "exec/memop.h"
> > >  #include "standard-headers/linux/virtio_pci.h"
> > > +#include "standard-headers/linux/virtio_ids.h"
> > >  #include "hw/boards.h"
> > >  #include "hw/virtio/virtio.h"
> > >  #include "migration/qemu-file-types.h"
> > > @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n,
> > QEMUFile *f)
> > >      return 0;
> > >  }
> > >
> > > +typedef struct VirtIOPCIIDInfo {
> > > +    uint16_t vdev_id; /* virtio id */
> > > +    uint16_t pdev_id; /* pci device id */
> > > +    uint16_t class_id;
> > > +} VirtIOPCIIDInfo;
> > 
> > 
> > if this is transitional as comment says make it explicit
> > in the names and comments.
> > 
> 
> OK.
> 
> > > +
> > > +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_NET,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> > > +        .class_id = PCI_CLASS_NETWORK_ETHERNET,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_BLOCK,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> > > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_CONSOLE,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> > > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_SCSI,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> > > +        .class_id = PCI_CLASS_STORAGE_SCSI,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_9P,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> > > +        .class_id = PCI_BASE_CLASS_NETWORK,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_VSOCK,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> > > +        .class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_IOMMU,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> > > +        .class_id = PCI_CLASS_OTHERS,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_MEM,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> > > +        .class_id = PCI_CLASS_OTHERS,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_PMEM,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> > > +        .class_id = PCI_CLASS_OTHERS,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_RNG,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> > > +        .class_id = PCI_CLASS_OTHERS,
> > > +    },
> > > +    {
> > > +        .vdev_id = VIRTIO_ID_BALLOON,
> > > +        .pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> > > +        .class_id = PCI_CLASS_OTHERS,
> > > +    },
> > > +};
> > > +
> > 
> > 
> > this is the list from the spec:
> > 
> > 0x1000 network card
> > 0x1001 block device
> > 0x1002 memory ballooning (traditional)
> > 0x1003 console
> > 0x1004 SCSI host
> > 0x1005 entropy source
> > 0x1009 9P transport
> > 
> 
> Why are the following device IDs introduced? They are non-transitional
> devices.
> 
> #define PCI_DEVICE_ID_VIRTIO_VSOCK       0x1012
> #define PCI_DEVICE_ID_VIRTIO_PMEM        0x1013
> #define PCI_DEVICE_ID_VIRTIO_IOMMU       0x1014
> #define PCI_DEVICE_ID_VIRTIO_MEM         0x1015

Just to have a single place to put these things.
E.g. the vsock id is used in more than one place.

> > 
> > I'd drop all the rest, use the algorithm for non transitional.
> > And when class is other I'd just not include it in the array,
> > make this the default.
> > 
> > 
> > 
> > > +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> > > +{
> > > +    VirtIOPCIIDInfo info = {};
> > > +    int i;
> > > +
> > > +    for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> > > +        if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> > > +            info = virtio_pci_id_info[i];
> > > +            break;
> > > +        }
> > > +    }
> > > +
> > > +    return info;
> > > +}
> > > +
> > > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> > > +{
> > > +    return virtio_pci_get_id_info(device_id).pdev_id;
> > > +}
> > > +
> > > +uint16_t virtio_pci_get_class_id(uint16_t device_id)
> > > +{
> > > +    return virtio_pci_get_id_info(device_id).class_id;
> > > +}
> > > +
> > >  static bool virtio_pci_ioeventfd_enabled(DeviceState *d)
> > >  {
> > >      VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
> > > @@ -1674,6 +1764,9 @@ static void virtio_pci_device_plugged(DeviceState *d,
> > Error **errp)
> > >           * is set to PCI_SUBVENDOR_ID_REDHAT_QUMRANET by default.
> > >           */
> > >          pci_set_word(config + PCI_SUBSYSTEM_ID,
> > virtio_bus_get_vdev_id(bus));
> > > +        if (proxy->pdev_id) {
> > > +            pci_config_set_device_id(config, proxy->pdev_id);
> > > +        }
> > >      } else {
> > >          /* pure virtio-1.0 */
> > >          pci_set_word(config + PCI_VENDOR_ID,
> > > diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> > > index 2446dcd9ae..06aa59436e 100644
> > > --- a/hw/virtio/virtio-pci.h
> > > +++ b/hw/virtio/virtio-pci.h
> > > @@ -146,6 +146,7 @@ struct VirtIOPCIProxy {
> > >      bool disable_modern;
> > >      bool ignore_backend_features;
> > >      OnOffAuto disable_legacy;
> > > +    uint16_t pdev_id;
> > >      uint32_t class_code;
> > >      uint32_t nvectors;
> > >      uint32_t dfselect;
> > > @@ -158,6 +159,9 @@ struct VirtIOPCIProxy {
> > >      VirtioBusState bus;
> > >  };
> > >
> > > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id);
> > > +uint16_t virtio_pci_get_class_id(uint16_t device_id);
> > > +
> > >  static inline bool virtio_pci_modern(VirtIOPCIProxy *proxy)
> > >  {
> > >      return !proxy->disable_modern;
> > > --
> > > 2.23.0



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 06/10] vdpa-dev: implement the unrealize interface
  2022-01-06  3:23     ` longpeng2--- via
@ 2022-01-10  9:38       ` Stefano Garzarella
  0 siblings, 0 replies; 52+ messages in thread
From: Stefano Garzarella @ 2022-01-10  9:38 UTC (permalink / raw)
  To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
  Cc: mst, jasowang, cohuck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, stefanha, pbonzini

On Thu, Jan 06, 2022 at 03:23:07AM +0000, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
>
>
>> -----Original Message-----
>> From: Stefano Garzarella [mailto:sgarzare@redhat.com]
>> Sent: Wednesday, January 5, 2022 7:16 PM
>> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
>> <longpeng2@huawei.com>
>> Cc: stefanha@redhat.com; mst@redhat.com; jasowang@redhat.com;
>> cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
>> <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
>> <huangzhichao@huawei.com>; qemu-devel@nongnu.org
>> Subject: Re: [RFC 06/10] vdpa-dev: implement the unrealize interface
>>
>> On Wed, Jan 05, 2022 at 08:58:56AM +0800, Longpeng(Mike) wrote:
>> >From: Longpeng <longpeng2@huawei.com>
>> >
>> >Implements the .unrealize interface.
>> >
>> >Signed-off-by: Longpeng <longpeng2@huawei.com>
>> >---
>> > hw/virtio/vdpa-dev.c | 22 +++++++++++++++++++++-
>> > 1 file changed, 21 insertions(+), 1 deletion(-)
>> >
>> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
>> >index 2d534d837a..4e4dd3d201 100644
>> >--- a/hw/virtio/vdpa-dev.c
>> >+++ b/hw/virtio/vdpa-dev.c
>> >@@ -133,9 +133,29 @@ out:
>> >     close(fd);
>> > }
>> >
>> >+static void vhost_vdpa_vdev_unrealize(VhostVdpaDevice *s)
>> >+{
>> >+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
>> >+    int i;
>> >+
>> >+    for (i = 0; i < s->num_queues; i++) {
>>                        ^
>> `s->num_queues` seems uninitialized to me, maybe we could just remove
>> the num_queues field from VhostVdpaDevice, and use `s->dev.nvqs` as in
>> vhost_vdpa_device_realize().
>>
>
>Good catch, I'll fix the bug.
>
>But I think we should keep the num_queues field, we need it if we support
>migration in the next step, right?
>
>> >+        virtio_delete_queue(s->virtqs[i]);
>> >+    }
>> >+    g_free(s->virtqs);
>> >+    virtio_cleanup(vdev);
>> >+
>> >+    g_free(s->config);
>> >+}
>> >+
>> > static void vhost_vdpa_device_unrealize(DeviceState *dev)
>> > {
>> >-    return;
>> >+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>> >+    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
>> >+
>> >+    virtio_set_status(vdev, 0);
>> >+    vhost_dev_cleanup(&s->dev);
>>
>> If we will use `s->dev.nvqs` in vhost_vdpa_vdev_unrealize(), we should
>> call vhost_dev_cleanup() after it, just before close() as we already do
>> in the error path of vhost_vdpa_device_realize().
>>
>
>I'll try to fix the above bug first if you agree that we should keep the
>num_queues field.

Yep, if it could be useful, we can keep it.

>
>I just realize that I forgot to free s->dev.vqs here...

Oh right, I missed it too.
We should also free it in the error path of vhost_vdpa_device_realize().
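
For the unrealize path, something like this maybe (untested, just to put
together what we discussed; I used s->dev.nvqs here, but num_queues works
too once it is properly initialized):

static void vhost_vdpa_device_unrealize(DeviceState *dev)
{
    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
    struct vhost_virtqueue *vqs = s->dev.vqs;
    int i;

    virtio_set_status(vdev, 0);

    for (i = 0; i < s->dev.nvqs; i++) {
        virtio_delete_queue(s->virtqs[i]);
    }
    g_free(s->virtqs);
    virtio_cleanup(vdev);
    g_free(s->config);

    /*
     * vhost_dev_cleanup() still walks dev.vqs and then clears dev,
     * so keep the pointer aside and free it only afterwards.
     */
    vhost_dev_cleanup(&s->dev);
    g_free(vqs);

    close(s->vdpa.device_fd);
}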

Thanks,
Stefano



^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [RFC 05/10] vdpa-dev: implement the realize interface
  2022-01-06 11:34       ` Stefan Hajnoczi
@ 2022-01-17 12:34         ` longpeng2--- via
  2022-01-19 17:15           ` Stefan Hajnoczi
  0 siblings, 1 reply; 52+ messages in thread
From: longpeng2--- via @ 2022-01-17 12:34 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: mst, jasowang, sgarzare, cohuck, pbonzini, Gonglei (Arei),
	Yechuan, Huangzhichao, qemu-devel



> -----Original Message-----
> From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> Sent: Thursday, January 6, 2022 7:34 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> <longpeng2@huawei.com>
> Cc: mst@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> Subject: Re: [RFC 05/10] vdpa-dev: implement the realize interface
> 
> On Thu, Jan 06, 2022 at 03:02:37AM +0000, Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) wrote:
> >
> >
> > > -----Original Message-----
> > > From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> > > Sent: Wednesday, January 5, 2022 6:18 PM
> > > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > <longpeng2@huawei.com>
> > > Cc: mst@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> > > cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> > > <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> > > <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> > > Subject: Re: [RFC 05/10] vdpa-dev: implement the realize interface
> > >
> > > On Wed, Jan 05, 2022 at 08:58:55AM +0800, Longpeng(Mike) wrote:
> > > > From: Longpeng <longpeng2@huawei.com>
> > > >
> > > > Implements the .realize interface.
> > > >
> > > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > > ---
> > > >  hw/virtio/vdpa-dev.c         | 114 +++++++++++++++++++++++++++++++++++
> > > >  include/hw/virtio/vdpa-dev.h |   8 +++
> > > >  2 files changed, 122 insertions(+)
> > > >
> > > > diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> > > > index 790117fb3b..2d534d837a 100644
> > > > --- a/hw/virtio/vdpa-dev.c
> > > > +++ b/hw/virtio/vdpa-dev.c
> > > > @@ -15,9 +15,122 @@
> > > >  #include "sysemu/sysemu.h"
> > > >  #include "sysemu/runstate.h"
> > > >
> > > > +static void
> > > > +vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue
> *vq)
> > > > +{
> > > > +    /* Nothing to do */
> > > > +}
> > > > +
> > > > +static int vdpa_dev_get_info_by_fd(int fd, uint64_t cmd, Error **errp)
> > >
> > > This looks similar to the helper function in a previous patch but this
> > > time the return value type is int instead of uint32_t. Please make the
> > > types consistent.
> > >
> >
> > OK.
> >
> > > > +{
> > > > +    int val;
> > > > +
> > > > +    if (ioctl(fd, cmd, &val) < 0) {
> > > > +        error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
> > > > +                   cmd, strerror(errno));
> > > > +        return -1;
> > > > +    }
> > > > +
> > > > +    return val;
> > > > +}
> > > > +
> > > > +static inline int vdpa_dev_get_queue_size(int fd, Error **errp)
> > > > +{
> > > > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VRING_NUM, errp);
> > > > +}
> > > > +
> > > > +static inline int vdpa_dev_get_vqs_num(int fd, Error **errp)
> > > > +{
> > > > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VQS_NUM, errp);
> > > > +}
> > > > +
> > > > +static inline int vdpa_dev_get_config_size(int fd, Error **errp)
> > > > +{
> > > > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_CONFIG_SIZE,
> errp);
> > > > +}
> > > > +
> > > >  static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> > > >  {
> > > > +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > > > +    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> > > > +    uint32_t device_id;
> > > > +    int max_queue_size;
> > > > +    int fd;
> > > > +    int i, ret;
> > > > +
> > > > +    fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> > > > +    if (fd == -1) {
> > > > +        return;
> > > > +    }
> > > > +    s->vdpa.device_fd = fd;
> > >
> > > This is the field I suggest exposing as a QOM property so it can be set
> > > from the proxy object (e.g. when the PCI proxy opens the vdpa device
> > > before our .realize() function is called).
> > >
> >
> > OK.
> >
> > > > +
> > > > +    max_queue_size = vdpa_dev_get_queue_size(fd, errp);
> > > > +    if (*errp) {
> > > > +        goto out;
> > > > +    }
> > > > +
> > > > +    if (s->queue_size > max_queue_size) {
> > > > +        error_setg(errp, "vhost-vdpa-device: invalid queue_size: %d
> > > (max:%d)",
> > > > +                   s->queue_size, max_queue_size);
> > > > +        goto out;
> > > > +    } else if (!s->queue_size) {
> > > > +        s->queue_size = max_queue_size;
> > > > +    }
> > > > +
> > > > +    ret = vdpa_dev_get_vqs_num(fd, errp);
> > > > +    if (*errp) {
> > > > +        goto out;
> > > > +    }
> > > > +
> > > > +    s->dev.nvqs = ret;
> > >
> > > There is no input validation because we trust the kernel vDPA return
> > > values. That seems okay for now but if there is a vhost-user version of
> > > this in the future then input validation will be necessary to achieve
> > > isolation between QEMU and the vhost-user processes. I suggest including
> > > input validation code right away because it's harder to audit the code
> > > and fix missing input validation later on.
> > >
> >
> > Make sense!
> >
> > Should we only need to validate the upper boundary (e.g. <VIRTIO_QUEUE_MAX)?
> 
> Careful, ret is currently an int so negative values would bypass the <
> VIRTIO_QUEUE_MAX check.
> 
> >
> > > > +    s->dev.vqs = g_new0(struct vhost_virtqueue, s->dev.nvqs);
> > > > +    s->dev.vq_index = 0;
> > > > +    s->dev.vq_index_end = s->dev.nvqs;
> > > > +    s->dev.backend_features = 0;
> > > > +    s->started = false;
> > > > +
> > > > +    ret = vhost_dev_init(&s->dev, &s->vdpa, VHOST_BACKEND_TYPE_VDPA, 0,
> > > NULL);
> > > > +    if (ret < 0) {
> > > > +        error_setg(errp, "vhost-vdpa-device: vhost initialization
> > > failed: %s",
> > > > +                   strerror(-ret));
> > > > +        goto out;
> > > > +    }
> > > > +
> > > > +    ret = s->dev.vhost_ops->vhost_get_device_id(&s->dev, &device_id);
> > >
> > > The vhost_*() API abstracts the ioctl calls but this source file and the
> > > PCI proxy have ioctl calls. I wonder if it's possible to move the ioctls
> > > calls into the vhost_*() API? That would be cleaner and also make it
> > > easier to add vhost-user vDPA support in the future.
> >
> > We need these ioctls calls because we need invoke them before the vhost-dev
> > object is initialized.
> 
> It may be possible to clean this up by changing how vhost_dev_init()
> works but I haven't investigated. The issue is that the vhost_dev_init()
> API requires information from the caller that has to be fetched from the
> vDPA device. This forces the caller to communicate directly with the
> vDPA device before calling vhost_dev_init(). It may be possible to move
> this setup code inside vhost_dev_init() (and vhost_ops callbacks).
> 

Hmm, this is still not clear to me, so let's continue to discuss this
in v2 if you think it's necessary.

> Stefan


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC 05/10] vdpa-dev: implement the realize interface
  2022-01-17 12:34         ` longpeng2--- via
@ 2022-01-19 17:15           ` Stefan Hajnoczi
  0 siblings, 0 replies; 52+ messages in thread
From: Stefan Hajnoczi @ 2022-01-19 17:15 UTC (permalink / raw)
  To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
  Cc: mst, jasowang, cohuck, qemu-devel, Yechuan, Gonglei (Arei),
	Huangzhichao, pbonzini, sgarzare

[-- Attachment #1: Type: text/plain, Size: 7581 bytes --]

On Mon, Jan 17, 2022 at 12:34:50PM +0000, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
> 
> 
> > -----Original Message-----
> > From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> > Sent: Thursday, January 6, 2022 7:34 PM
> > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > <longpeng2@huawei.com>
> > Cc: mst@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> > cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> > <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> > <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> > Subject: Re: [RFC 05/10] vdpa-dev: implement the realize interface
> > 
> > On Thu, Jan 06, 2022 at 03:02:37AM +0000, Longpeng (Mike, Cloud Infrastructure
> > Service Product Dept.) wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> > > > Sent: Wednesday, January 5, 2022 6:18 PM
> > > > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > > <longpeng2@huawei.com>
> > > > Cc: mst@redhat.com; jasowang@redhat.com; sgarzare@redhat.com;
> > > > cohuck@redhat.com; pbonzini@redhat.com; Gonglei (Arei)
> > > > <arei.gonglei@huawei.com>; Yechuan <yechuan@huawei.com>; Huangzhichao
> > > > <huangzhichao@huawei.com>; qemu-devel@nongnu.org
> > > > Subject: Re: [RFC 05/10] vdpa-dev: implement the realize interface
> > > >
> > > > On Wed, Jan 05, 2022 at 08:58:55AM +0800, Longpeng(Mike) wrote:
> > > > > From: Longpeng <longpeng2@huawei.com>
> > > > >
> > > > > Implements the .realize interface.
> > > > >
> > > > > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > > > > ---
> > > > >  hw/virtio/vdpa-dev.c         | 114 +++++++++++++++++++++++++++++++++++
> > > > >  include/hw/virtio/vdpa-dev.h |   8 +++
> > > > >  2 files changed, 122 insertions(+)
> > > > >
> > > > > diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> > > > > index 790117fb3b..2d534d837a 100644
> > > > > --- a/hw/virtio/vdpa-dev.c
> > > > > +++ b/hw/virtio/vdpa-dev.c
> > > > > @@ -15,9 +15,122 @@
> > > > >  #include "sysemu/sysemu.h"
> > > > >  #include "sysemu/runstate.h"
> > > > >
> > > > > +static void
> > > > > +vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue
> > *vq)
> > > > > +{
> > > > > +    /* Nothing to do */
> > > > > +}
> > > > > +
> > > > > +static int vdpa_dev_get_info_by_fd(int fd, uint64_t cmd, Error **errp)
> > > >
> > > > This looks similar to the helper function in a previous patch but this
> > > > time the return value type is int instead of uint32_t. Please make the
> > > > types consistent.
> > > >
> > >
> > > OK.
> > >
> > > > > +{
> > > > > +    int val;
> > > > > +
> > > > > +    if (ioctl(fd, cmd, &val) < 0) {
> > > > > +        error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
> > > > > +                   cmd, strerror(errno));
> > > > > +        return -1;
> > > > > +    }
> > > > > +
> > > > > +    return val;
> > > > > +}
> > > > > +
> > > > > +static inline int vdpa_dev_get_queue_size(int fd, Error **errp)
> > > > > +{
> > > > > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VRING_NUM, errp);
> > > > > +}
> > > > > +
> > > > > +static inline int vdpa_dev_get_vqs_num(int fd, Error **errp)
> > > > > +{
> > > > > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VQS_NUM, errp);
> > > > > +}
> > > > > +
> > > > > +static inline int vdpa_dev_get_config_size(int fd, Error **errp)
> > > > > +{
> > > > > +    return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_CONFIG_SIZE, errp);
> > > > > +}
> > > > > +
> > > > >  static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> > > > >  {
> > > > > +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > > > > +    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> > > > > +    uint32_t device_id;
> > > > > +    int max_queue_size;
> > > > > +    int fd;
> > > > > +    int i, ret;
> > > > > +
> > > > > +    fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> > > > > +    if (fd == -1) {
> > > > > +        return;
> > > > > +    }
> > > > > +    s->vdpa.device_fd = fd;
> > > >
> > > > This is the field I suggest exposing as a QOM property so it can be set
> > > > from the proxy object (e.g. when the PCI proxy opens the vdpa device
> > > > before our .realize() function is called).
> > > >
> > >
> > > OK.
> > >
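For reference, a rough sketch of how that property could look; the "vhostfd" name and the extra field are assumptions here, not something the series defines:

/* assumes an extra field in VhostVdpaDevice, e.g.:  int vhostfd; */
static Property vhost_vdpa_device_properties[] = {
    DEFINE_PROP_STRING("vdpa-dev", VhostVdpaDevice, vdpa_dev),
    DEFINE_PROP_INT32("vhostfd", VhostVdpaDevice, vhostfd, -1),
    DEFINE_PROP_END_OF_LIST(),
};

The PCI proxy could then qemu_open() the character device itself and call
qdev_prop_set_int32(DEVICE(&dev->vdev), "vhostfd", fd) before realizing the
inner device (the "vdev" member name is also assumed); .realize() would use
the fd when it is >= 0 and only fall back to opening s->vdpa_dev otherwise.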
> > > > > +
> > > > > +    max_queue_size = vdpa_dev_get_queue_size(fd, errp);
> > > > > +    if (*errp) {
> > > > > +        goto out;
> > > > > +    }
> > > > > +
> > > > > +    if (s->queue_size > max_queue_size) {
> > > > > +        error_setg(errp, "vhost-vdpa-device: invalid queue_size: %d (max:%d)",
> > > > > +                   s->queue_size, max_queue_size);
> > > > > +        goto out;
> > > > > +    } else if (!s->queue_size) {
> > > > > +        s->queue_size = max_queue_size;
> > > > > +    }
> > > > > +
> > > > > +    ret = vdpa_dev_get_vqs_num(fd, errp);
> > > > > +    if (*errp) {
> > > > > +        goto out;
> > > > > +    }
> > > > > +
> > > > > +    s->dev.nvqs = ret;
> > > >
> > > > There is no input validation because we trust the kernel vDPA return
> > > > values. That seems okay for now but if there is a vhost-user version of
> > > > this in the future then input validation will be necessary to achieve
> > > > isolation between QEMU and the vhost-user processes. I suggest including
> > > > input validation code right away because it's harder to audit the code
> > > > and fix missing input validation later on.
> > > >
> > >
> > > Make sense!
> > >
> > > Do we only need to validate the upper boundary (e.g. < VIRTIO_QUEUE_MAX)?
> > 
> > Careful, ret is currently an int so negative values would bypass the <
> > VIRTIO_QUEUE_MAX check.
> > 
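Something along these lines, as a sketch only, reusing the names already in the patch:

    ret = vdpa_dev_get_vqs_num(fd, errp);
    if (*errp) {
        goto out;
    }
    /* reject nonsense from the backend before using it as nvqs */
    if (ret <= 0 || ret > VIRTIO_QUEUE_MAX) {
        error_setg(errp, "vhost-vdpa-device: invalid number of virtqueues: %d",
                   ret);
        goto out;
    }
    s->dev.nvqs = ret;

Keeping ret signed and rejecting everything outside (0, VIRTIO_QUEUE_MAX] covers the negative case as well as the too-large one.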
> > >
> > > > > +    s->dev.vqs = g_new0(struct vhost_virtqueue, s->dev.nvqs);
> > > > > +    s->dev.vq_index = 0;
> > > > > +    s->dev.vq_index_end = s->dev.nvqs;
> > > > > +    s->dev.backend_features = 0;
> > > > > +    s->started = false;
> > > > > +
> > > > > +    ret = vhost_dev_init(&s->dev, &s->vdpa, VHOST_BACKEND_TYPE_VDPA, 0, NULL);
> > > > > +    if (ret < 0) {
> > > > > +        error_setg(errp, "vhost-vdpa-device: vhost initialization failed: %s",
> > > > > +                   strerror(-ret));
> > > > > +        goto out;
> > > > > +    }
> > > > > +
> > > > > +    ret = s->dev.vhost_ops->vhost_get_device_id(&s->dev, &device_id);
> > > >
> > > > The vhost_*() API abstracts the ioctl calls, but this source file and the
> > > > PCI proxy make ioctl calls directly. I wonder if it's possible to move the
> > > > ioctl calls into the vhost_*() API? That would be cleaner and also make it
> > > > easier to add vhost-user vDPA support in the future.
> > >
> > > We need these ioctl calls because we need to invoke them before the vhost-dev
> > > object is initialized.
> > 
> > It may be possible to clean this up by changing how vhost_dev_init()
> > works but I haven't investigated. The issue is that the vhost_dev_init()
> > API requires information from the caller that has to be fetched from the
> > vDPA device. This forces the caller to communicate directly with the
> > vDPA device before calling vhost_dev_init(). It may be possible to move
> > this setup code inside vhost_dev_init() (and vhost_ops callbacks).
> > 
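Purely as a sketch of that direction (none of these hooks exist today), the backend could grow VhostOps callbacks that vhost_dev_init() invokes itself, for example:

/* hypothetical additions to VhostOps in vhost-backend.h */
typedef int (*vhost_get_vqs_num_op)(struct vhost_dev *dev, unsigned int *nvqs);
typedef int (*vhost_get_config_size_op)(struct vhost_dev *dev, uint32_t *size);

/* vhost-vdpa implementation, using the ioctl this series introduces
 * (sketch; would live next to vhost_vdpa_call() in hw/virtio/vhost-vdpa.c) */
static int vhost_vdpa_get_vqs_num(struct vhost_dev *dev, unsigned int *nvqs)
{
    return vhost_vdpa_call(dev, VHOST_VDPA_GET_VQS_NUM, nvqs);
}

vhost_dev_init() could then query nvqs and the config size from the backend instead of the device model issuing raw ioctl()s before init, and a future vhost-user backend could implement the same ops with protocol messages.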
> 
> Hmm, this is still not clear to me, so let's continue to discuss this
> in v2 if you think it's necessary.

Okay.

Stefan


end of thread

Thread overview: 52+ messages
2022-01-05  0:58 [RFC 00/10] add generic vDPA device support Longpeng(Mike) via
2022-01-05  0:58 ` [RFC 01/10] virtio: get class_id and pci device id by the virtio id Longpeng(Mike) via
2022-01-05  4:37   ` Jason Wang
2022-01-05  5:47     ` longpeng2--- via
2022-01-05  6:15       ` Jason Wang
2022-01-10  3:03         ` longpeng2--- via
2022-01-05 10:46   ` Cornelia Huck
2022-01-06  1:50     ` longpeng2--- via
2022-01-10  5:43   ` Michael S. Tsirkin
2022-01-10  6:27     ` longpeng2--- via
2022-01-10  7:14       ` Michael S. Tsirkin
2022-01-05  0:58 ` [RFC 02/10] vhost: add 3 commands for vhost-vdpa Longpeng(Mike) via
2022-01-05  4:35   ` Jason Wang
2022-01-05  6:40     ` longpeng2--- via
2022-01-05  6:43       ` Jason Wang
2022-01-05  7:02     ` Michael S. Tsirkin
2022-01-05  7:54       ` Jason Wang
2022-01-05  8:37         ` longpeng2--- via
2022-01-05  9:09           ` Jason Wang
2022-01-05 12:26             ` Michael S. Tsirkin
2022-01-06  2:34               ` Jason Wang
2022-01-06  8:00                 ` longpeng2--- via
2022-01-07  2:41                   ` Jason Wang
2022-01-06 14:09                 ` Michael S. Tsirkin
2022-01-07  2:53                   ` Jason Wang
2022-01-05  9:12         ` Michael S. Tsirkin
2022-01-05  9:21           ` Jason Wang
2022-01-05  0:58 ` [RFC 03/10] vdpa: add the infrastructure of vdpa-dev Longpeng(Mike) via
2022-01-05  9:48   ` Stefan Hajnoczi
2022-01-06  1:22     ` longpeng2--- via
2022-01-06 11:25       ` Stefan Hajnoczi
2022-01-07  2:22         ` Jason Wang
2022-01-05  0:58 ` [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface Longpeng(Mike) via
2022-01-05 10:00   ` Stefan Hajnoczi
2022-01-06  2:39     ` longpeng2--- via
2022-01-05 11:28   ` Stefano Garzarella
2022-01-06  2:40     ` longpeng2--- via
2022-01-05  0:58 ` [RFC 05/10] vdpa-dev: implement the realize interface Longpeng(Mike) via
2022-01-05 10:17   ` Stefan Hajnoczi
2022-01-06  3:02     ` longpeng2--- via
2022-01-06 11:34       ` Stefan Hajnoczi
2022-01-17 12:34         ` longpeng2--- via
2022-01-19 17:15           ` Stefan Hajnoczi
2022-01-05  0:58 ` [RFC 06/10] vdpa-dev: implement the unrealize interface Longpeng(Mike) via
2022-01-05 11:16   ` Stefano Garzarella
2022-01-06  3:23     ` longpeng2--- via
2022-01-10  9:38       ` Stefano Garzarella
2022-01-05  0:58 ` [RFC 07/10] vdpa-dev: implement the get_config/set_config interface Longpeng(Mike) via
2022-01-05  0:58 ` [RFC 08/10] vdpa-dev: implement the get_features interface Longpeng(Mike) via
2022-01-05  0:58 ` [RFC 09/10] vdpa-dev: implement the set_status interface Longpeng(Mike) via
2022-01-05  0:59 ` [RFC 10/10] vdpa-dev: mark the device as unmigratable Longpeng(Mike) via
2022-01-05 10:21 ` [RFC 00/10] add generic vDPA device support Stefan Hajnoczi
