All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 00/10] vDPA shadow virtqueue - notifications forwarding
@ 2021-01-29 20:54 Eugenio Pérez
  2021-01-29 20:54 ` [RFC 01/10] virtio: Add virtqueue_set_handler Eugenio Pérez
                   ` (9 more replies)
  0 siblings, 10 replies; 42+ messages in thread
From: Eugenio Pérez @ 2021-01-29 20:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

This series enable vhost (And vhost-vdpa) notifications forwarding for
software assisted live migration, implemented through a shadow
virtqueue.

Shadow virtqueue is a new method of tracking memory for migration:
Instead of relay on vDPA device's dirty logging capability, SW assisted
LM intercepts dataplane, forwarding the descriptors between VM and
device.

In this migration mode, qemu offers a new (shadow) vring to the device
to read and write into, and forwards descriptors between host vring
and qemu one. On used buffer relay, qemu will mark the dirty memory as
with plain virtio-net devices. This way, devices does not need to have
dirty page logging capability.

This RFC series just enables just the notifications forwarding part,
not buffer forwarding/tracking.

It is based on the ideas of DPDK SW assisted LM, in the series of
DPDK's https://patchwork.dpdk.org/cover/48370/ , but will use memory in
qemu Virtual Address Space for rings, instead of in guest's.

Main changes from previous RFC [1] are:
* Use QMP to enable. Can disable through QMP too.
* Do not use vhost_dev_{enable,disable}_notifiers, since they override
  the VM ioeventfd set, and could cause race conditions. Do never modify
  irqfd or ioeventfd used for the guest.

Comments are welcome.

Thanks!

[1] https://patchew.org/QEMU/20201120185105.279030-1-eperezma@redhat.com/

Eugenio Pérez (10):
  virtio: Add virtqueue_set_handler
  virtio: Add set_vq_handler
  virtio: Add virtio_queue_get_idx
  virtio: Add virtio_queue_host_notifier_status
  vhost: Add vhost_dev_from_virtio
  vhost: Save masked_notifier state
  vhost: Add VhostShadowVirtqueue
  vhost: Add x-vhost-enable-shadow-vq qmp
  vhost: Route guest->host notification through shadow virtqueue
  vhost: Route host->guest notification through shadow virtqueue

 qapi/net.json                      |  23 +++
 hw/virtio/vhost-shadow-virtqueue.h |  31 ++++
 include/hw/virtio/vhost.h          |   6 +
 include/hw/virtio/virtio.h         |  14 +-
 hw/net/virtio-net.c                |  26 ++++
 hw/virtio/vhost-shadow-virtqueue.c | 234 +++++++++++++++++++++++++++++
 hw/virtio/vhost.c                  | 161 ++++++++++++++++++++
 hw/virtio/virtio.c                 |  24 +++
 hw/virtio/meson.build              |   2 +-
 9 files changed, 517 insertions(+), 4 deletions(-)
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.h
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.c

-- 
2.27.0



^ permalink raw reply	[flat|nested] 42+ messages in thread

* [RFC 01/10] virtio: Add virtqueue_set_handler
  2021-01-29 20:54 [RFC 00/10] vDPA shadow virtqueue - notifications forwarding Eugenio Pérez
@ 2021-01-29 20:54 ` Eugenio Pérez
  2021-01-29 20:54 ` [RFC 02/10] virtio: Add set_vq_handler Eugenio Pérez
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2021-01-29 20:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

This allows qemu to override vq handler.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/virtio.h |  6 +++---
 hw/virtio/virtio.c         | 14 ++++++++++++++
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index b7ece7a6a8..9b5479e256 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -47,6 +47,8 @@ size_t virtio_feature_get_config_size(VirtIOFeature *features,
                                       uint64_t host_features);
 
 typedef struct VirtQueue VirtQueue;
+typedef void (*VirtIOHandleOutput)(VirtIODevice *, VirtQueue *);
+typedef bool (*VirtIOHandleAIOOutput)(VirtIODevice *, VirtQueue *);
 
 #define VIRTQUEUE_MAX_SIZE 1024
 
@@ -174,9 +176,6 @@ void virtio_error(VirtIODevice *vdev, const char *fmt, ...) GCC_FMT_ATTR(2, 3);
 /* Set the child bus name. */
 void virtio_device_set_child_bus_name(VirtIODevice *vdev, char *bus_name);
 
-typedef void (*VirtIOHandleOutput)(VirtIODevice *, VirtQueue *);
-typedef bool (*VirtIOHandleAIOOutput)(VirtIODevice *, VirtQueue *);
-
 VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
                             VirtIOHandleOutput handle_output);
 
@@ -184,6 +183,7 @@ void virtio_del_queue(VirtIODevice *vdev, int n);
 
 void virtio_delete_queue(VirtQueue *vq);
 
+void virtqueue_set_handler(VirtQueue *vq, VirtIOHandleOutput handler);
 void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
                     unsigned int len);
 void virtqueue_flush(VirtQueue *vq, unsigned int count);
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index b308026596..ebb780fb42 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1796,6 +1796,20 @@ unsigned int virtqueue_drop_all(VirtQueue *vq)
     }
 }
 
+/*
+ * virtqueue_set_handler:
+ * @vq The #VirtQueue
+ * @handler The handler to call on vq event
+ * Replaces vq handler.
+ *
+ * Note: It takes no protection, so make sure no other calls to the handler
+ * are happening.
+ */
+void virtqueue_set_handler(VirtQueue *vq, VirtIOHandleOutput handler)
+{
+    vq->handle_output = handler;
+}
+
 /* Reading and writing a structure directly to QEMUFile is *awful*, but
  * it is what QEMU has always done by mistake.  We can change it sooner
  * or later by bumping the version number of the affected vm states.
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC 02/10] virtio: Add set_vq_handler
  2021-01-29 20:54 [RFC 00/10] vDPA shadow virtqueue - notifications forwarding Eugenio Pérez
  2021-01-29 20:54 ` [RFC 01/10] virtio: Add virtqueue_set_handler Eugenio Pérez
@ 2021-01-29 20:54 ` Eugenio Pérez
  2021-01-29 20:54 ` [RFC 03/10] virtio: Add virtio_queue_get_idx Eugenio Pérez
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2021-01-29 20:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

So other subsystem can override vq handler and device can reset it.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/virtio.h |  5 +++++
 hw/net/virtio-net.c        | 26 ++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 9b5479e256..9988c6d5c9 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -149,6 +149,11 @@ struct VirtioDeviceClass {
     void (*guest_notifier_mask)(VirtIODevice *vdev, int n, bool mask);
     int (*start_ioeventfd)(VirtIODevice *vdev);
     void (*stop_ioeventfd)(VirtIODevice *vdev);
+    /*
+     * Set handler for a vq. NULL handler for reset to default.
+     */
+    bool (*set_vq_handler)(VirtIODevice *vdev, unsigned int n,
+                           VirtIOHandleOutput handle_output);
     /* Saving and loading of a device; trying to deprecate save/load
      * use vmsd for new devices.
      */
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 5150f295e8..f7b2998fb1 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2699,6 +2699,31 @@ static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue)
     virtio_net_set_queues(n);
 }
 
+static bool virtio_net_set_vq_handler(VirtIODevice *vdev, unsigned int i,
+                                      VirtIOHandleOutput handle_output)
+{
+    const VirtIONet *n = VIRTIO_NET(vdev);
+    const unsigned max_queues = n->multiqueue ? n->max_queues : 1;
+    VirtQueue *vq;
+
+    /* Reset control queue also not supported */
+    assert(i < max_queues * 2);
+
+    vq = virtio_get_queue(vdev, i);
+    if (handle_output == NULL) {
+        if (i % 2) {
+            handle_output = virtio_net_handle_rx;
+        } else {
+            const VirtIONetQueue *q = &n->vqs[i / 2];
+            handle_output = q->tx_timer ? virtio_net_handle_tx_timer
+                                        : virtio_net_handle_tx_bh;
+        }
+    }
+
+    virtqueue_set_handler(vq, handle_output);
+    return true;
+}
+
 static int virtio_net_post_load_device(void *opaque, int version_id)
 {
     VirtIONet *n = opaque;
@@ -3519,6 +3544,7 @@ static void virtio_net_class_init(ObjectClass *klass, void *data)
     vdc->set_status = virtio_net_set_status;
     vdc->guest_notifier_mask = virtio_net_guest_notifier_mask;
     vdc->guest_notifier_pending = virtio_net_guest_notifier_pending;
+    vdc->set_vq_handler = virtio_net_set_vq_handler;
     vdc->legacy_features |= (0x1 << VIRTIO_NET_F_GSO);
     vdc->post_load = virtio_net_post_load_virtio;
     vdc->vmsd = &vmstate_virtio_net_device;
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC 03/10] virtio: Add virtio_queue_get_idx
  2021-01-29 20:54 [RFC 00/10] vDPA shadow virtqueue - notifications forwarding Eugenio Pérez
  2021-01-29 20:54 ` [RFC 01/10] virtio: Add virtqueue_set_handler Eugenio Pérez
  2021-01-29 20:54 ` [RFC 02/10] virtio: Add set_vq_handler Eugenio Pérez
@ 2021-01-29 20:54 ` Eugenio Pérez
  2021-02-01  6:10     ` Jason Wang
  2021-01-29 20:54 ` [RFC 04/10] virtio: Add virtio_queue_host_notifier_status Eugenio Pérez
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 42+ messages in thread
From: Eugenio Pérez @ 2021-01-29 20:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/virtio.h | 2 ++
 hw/virtio/virtio.c         | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 9988c6d5c9..9013c03424 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -399,6 +399,8 @@ static inline bool virtio_device_disabled(VirtIODevice *vdev)
     return unlikely(vdev->disabled || vdev->broken);
 }
 
+unsigned virtio_queue_get_idx(const VirtIODevice *vdev, const VirtQueue *vq);
+
 bool virtio_legacy_allowed(VirtIODevice *vdev);
 bool virtio_legacy_check_disabled(VirtIODevice *vdev);
 
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index ebb780fb42..3d14b0ef74 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -500,6 +500,11 @@ void virtio_queue_set_notification(VirtQueue *vq, int enable)
     }
 }
 
+unsigned virtio_queue_get_idx(const VirtIODevice *vdev, const VirtQueue *vq)
+{
+    return vq - vdev->vq;
+}
+
 int virtio_queue_ready(VirtQueue *vq)
 {
     return vq->vring.avail != 0;
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC 04/10] virtio: Add virtio_queue_host_notifier_status
  2021-01-29 20:54 [RFC 00/10] vDPA shadow virtqueue - notifications forwarding Eugenio Pérez
                   ` (2 preceding siblings ...)
  2021-01-29 20:54 ` [RFC 03/10] virtio: Add virtio_queue_get_idx Eugenio Pérez
@ 2021-01-29 20:54 ` Eugenio Pérez
  2021-01-29 20:54 ` [RFC 05/10] vhost: Add vhost_dev_from_virtio Eugenio Pérez
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2021-01-29 20:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/virtio.h | 1 +
 hw/virtio/virtio.c         | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 9013c03424..c5fcd9b169 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -321,6 +321,7 @@ void virtio_device_release_ioeventfd(VirtIODevice *vdev);
 bool virtio_device_ioeventfd_enabled(VirtIODevice *vdev);
 EventNotifier *virtio_queue_get_host_notifier(VirtQueue *vq);
 void virtio_queue_set_host_notifier_enabled(VirtQueue *vq, bool enabled);
+bool virtio_queue_host_notifier_status(const VirtQueue *vq);
 void virtio_queue_host_notifier_read(EventNotifier *n);
 void virtio_queue_aio_set_host_notifier_handler(VirtQueue *vq, AioContext *ctx,
                                                 VirtIOHandleAIOOutput handle_output);
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 3d14b0ef74..fdf37d8e48 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3613,6 +3613,11 @@ EventNotifier *virtio_queue_get_host_notifier(VirtQueue *vq)
     return &vq->host_notifier;
 }
 
+bool virtio_queue_host_notifier_status(const VirtQueue *vq)
+{
+    return vq->host_notifier_enabled;
+}
+
 void virtio_queue_set_host_notifier_enabled(VirtQueue *vq, bool enabled)
 {
     vq->host_notifier_enabled = enabled;
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC 05/10] vhost: Add vhost_dev_from_virtio
  2021-01-29 20:54 [RFC 00/10] vDPA shadow virtqueue - notifications forwarding Eugenio Pérez
                   ` (3 preceding siblings ...)
  2021-01-29 20:54 ` [RFC 04/10] virtio: Add virtio_queue_host_notifier_status Eugenio Pérez
@ 2021-01-29 20:54 ` Eugenio Pérez
  2021-02-01  6:12     ` Jason Wang
  2021-01-29 20:54 ` [RFC 06/10] vhost: Save masked_notifier state Eugenio Pérez
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 42+ messages in thread
From: Eugenio Pérez @ 2021-01-29 20:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost.h |  1 +
 hw/virtio/vhost.c         | 17 +++++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 4a8bc75415..fca076e3f0 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
 void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
                         uint64_t features);
 bool vhost_has_free_slot(void);
+struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
 
 int vhost_net_set_backend(struct vhost_dev *hdev,
                           struct vhost_vring_file *file);
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 28c7d78172..8683d507f5 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
     return slots_limit > used_memslots;
 }
 
+/*
+ * Get the vhost device associated to a VirtIO device.
+ */
+struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
+{
+    struct vhost_dev *hdev;
+
+    QLIST_FOREACH(hdev, &vhost_devices, entry) {
+        if (hdev->vdev == vdev) {
+            return hdev;
+        }
+    }
+
+    assert(hdev);
+    return NULL;
+}
+
 static void vhost_dev_sync_region(struct vhost_dev *dev,
                                   MemoryRegionSection *section,
                                   uint64_t mfirst, uint64_t mlast,
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC 06/10] vhost: Save masked_notifier state
  2021-01-29 20:54 [RFC 00/10] vDPA shadow virtqueue - notifications forwarding Eugenio Pérez
                   ` (4 preceding siblings ...)
  2021-01-29 20:54 ` [RFC 05/10] vhost: Add vhost_dev_from_virtio Eugenio Pérez
@ 2021-01-29 20:54 ` Eugenio Pérez
  2021-01-29 20:54 ` [RFC 07/10] vhost: Add VhostShadowVirtqueue Eugenio Pérez
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2021-01-29 20:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

It will be used to recover call eventfd.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost.h | 1 +
 hw/virtio/vhost.c         | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index fca076e3f0..2be782cefd 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -28,6 +28,7 @@ struct vhost_virtqueue {
     unsigned avail_size;
     unsigned long long used_phys;
     unsigned used_size;
+    bool notifier_is_masked;
     EventNotifier masked_notifier;
     struct vhost_dev *dev;
 };
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 8683d507f5..040f68ff2e 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1526,6 +1526,8 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n,
     /* should only be called after backend is connected */
     assert(hdev->vhost_ops);
 
+    hdev->vqs[index].notifier_is_masked = mask;
+
     if (mask) {
         assert(vdev->use_guest_notifier_mask);
         file.fd = event_notifier_get_fd(&hdev->vqs[index].masked_notifier);
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC 07/10] vhost: Add VhostShadowVirtqueue
  2021-01-29 20:54 [RFC 00/10] vDPA shadow virtqueue - notifications forwarding Eugenio Pérez
                   ` (5 preceding siblings ...)
  2021-01-29 20:54 ` [RFC 06/10] vhost: Save masked_notifier state Eugenio Pérez
@ 2021-01-29 20:54 ` Eugenio Pérez
  2021-01-29 20:54 ` [RFC 08/10] vhost: Add x-vhost-enable-shadow-vq qmp Eugenio Pérez
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2021-01-29 20:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

Vhost shadow virtqueue is an intermediate jump for virtqueue
notifications and buffers, allowing qemu to track them.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h | 24 ++++++++++++
 hw/virtio/vhost-shadow-virtqueue.c | 60 ++++++++++++++++++++++++++++++
 hw/virtio/meson.build              |  2 +-
 3 files changed, 85 insertions(+), 1 deletion(-)
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.h
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.c

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
new file mode 100644
index 0000000000..6cc18d6acb
--- /dev/null
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -0,0 +1,24 @@
+/*
+ * vhost software live migration ring
+ *
+ * SPDX-FileCopyrightText: Red Hat, Inc. 2021
+ * SPDX-FileContributor: Author: Eugenio Pérez <eperezma@redhat.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef VHOST_SHADOW_VIRTQUEUE_H
+#define VHOST_SHADOW_VIRTQUEUE_H
+
+#include "qemu/osdep.h"
+
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/vhost.h"
+
+typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
+
+VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
+
+void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
+
+#endif
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
new file mode 100644
index 0000000000..c0c967a7c5
--- /dev/null
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -0,0 +1,60 @@
+/*
+ * vhost software live migration ring
+ *
+ * SPDX-FileCopyrightText: Red Hat, Inc. 2021
+ * SPDX-FileContributor: Author: Eugenio Pérez <eperezma@redhat.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "hw/virtio/vhost-shadow-virtqueue.h"
+
+#include "qemu/error-report.h"
+#include "qemu/event_notifier.h"
+
+typedef struct VhostShadowVirtqueue {
+    EventNotifier kick_notifier;
+    EventNotifier call_notifier;
+} VhostShadowVirtqueue;
+
+/*
+ * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
+ * methods and file descriptors.
+ */
+VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
+{
+    g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
+    int r;
+
+    r = event_notifier_init(&svq->kick_notifier, 0);
+    if (r != 0) {
+        error_report("Couldn't create kick event notifier: %s",
+                     strerror(errno));
+        goto err_init_kick_notifier;
+    }
+
+    r = event_notifier_init(&svq->call_notifier, 0);
+    if (r != 0) {
+        error_report("Couldn't create call event notifier: %s",
+                     strerror(errno));
+        goto err_init_call_notifier;
+    }
+
+    return svq;
+
+err_init_call_notifier:
+    event_notifier_cleanup(&svq->kick_notifier);
+
+err_init_kick_notifier:
+    return NULL;
+}
+
+/*
+ * Free the resources of the shadow virtqueue.
+ */
+void vhost_shadow_vq_free(VhostShadowVirtqueue *vq)
+{
+    event_notifier_cleanup(&vq->kick_notifier);
+    event_notifier_cleanup(&vq->call_notifier);
+    g_free(vq);
+}
diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index fbff9bc9d4..8b5a0225fe 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -11,7 +11,7 @@ softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost-stub.c'))
 
 virtio_ss = ss.source_set()
 virtio_ss.add(files('virtio.c'))
-virtio_ss.add(when: 'CONFIG_VHOST', if_true: files('vhost.c', 'vhost-backend.c'))
+virtio_ss.add(when: 'CONFIG_VHOST', if_true: files('vhost.c', 'vhost-backend.c', 'vhost-shadow-virtqueue.c'))
 virtio_ss.add(when: 'CONFIG_VHOST_USER', if_true: files('vhost-user.c'))
 virtio_ss.add(when: 'CONFIG_VHOST_VDPA', if_true: files('vhost-vdpa.c'))
 virtio_ss.add(when: 'CONFIG_VIRTIO_BALLOON', if_true: files('virtio-balloon.c'))
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC 08/10] vhost: Add x-vhost-enable-shadow-vq qmp
  2021-01-29 20:54 [RFC 00/10] vDPA shadow virtqueue - notifications forwarding Eugenio Pérez
                   ` (6 preceding siblings ...)
  2021-01-29 20:54 ` [RFC 07/10] vhost: Add VhostShadowVirtqueue Eugenio Pérez
@ 2021-01-29 20:54 ` Eugenio Pérez
  2021-02-02 15:38     ` Eric Blake
  2021-01-29 20:54 ` [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue Eugenio Pérez
  2021-01-29 20:54 ` [RFC 10/10] vhost: Route host->guest " Eugenio Pérez
  9 siblings, 1 reply; 42+ messages in thread
From: Eugenio Pérez @ 2021-01-29 20:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

Command to enable shadow virtqueue looks like:

{ "execute": "x-vhost-enable-shadow-vq", "arguments": { "name": "dev0", "enable": true } }

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 qapi/net.json     | 23 +++++++++++++++++++++++
 hw/virtio/vhost.c |  6 ++++++
 2 files changed, 29 insertions(+)

diff --git a/qapi/net.json b/qapi/net.json
index c31748c87f..6170d69798 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -77,6 +77,29 @@
 ##
 { 'command': 'netdev_del', 'data': {'id': 'str'} }
 
+##
+# @x-vhost-enable-shadow-vq:
+#
+# Use vhost shadow virtqueue.
+#
+# @name: the device name of the virtual network adapter
+#
+# @enable: true to use he alternate shadow VQ notification path
+#
+# Returns: Error if failure, or 'no error' for success
+#
+# Since: 5.3
+#
+# Example:
+#
+# -> { "execute": "x-vhost_enable_shadow_vq", "arguments": {"enable": true} }
+# <- { "return": { "enabled" : true } }
+#
+##
+{ 'command': 'x-vhost-enable-shadow-vq',
+  'data': {'name': 'str', 'enable': 'bool'},
+  'if': 'defined(CONFIG_VHOST_KERNEL)' }
+
 ##
 # @NetLegacyNicOptions:
 #
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 040f68ff2e..42836e45f3 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -15,6 +15,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qapi/qapi-commands-net.h"
 #include "hw/virtio/vhost.h"
 #include "qemu/atomic.h"
 #include "qemu/range.h"
@@ -1841,3 +1842,8 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
 
     return -1;
 }
+
+void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
+{
+    error_setg(errp, "Shadow virtqueue still not implemented.");
+}
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue
  2021-01-29 20:54 [RFC 00/10] vDPA shadow virtqueue - notifications forwarding Eugenio Pérez
                   ` (7 preceding siblings ...)
  2021-01-29 20:54 ` [RFC 08/10] vhost: Add x-vhost-enable-shadow-vq qmp Eugenio Pérez
@ 2021-01-29 20:54 ` Eugenio Pérez
  2021-02-01  6:29     ` Jason Wang
  2021-01-29 20:54 ` [RFC 10/10] vhost: Route host->guest " Eugenio Pérez
  9 siblings, 1 reply; 42+ messages in thread
From: Eugenio Pérez @ 2021-01-29 20:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

Shadow virtqueue notifications forwarding is disabled when vhost_dev
stops.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |   5 ++
 include/hw/virtio/vhost.h          |   4 +
 hw/virtio/vhost-shadow-virtqueue.c | 123 +++++++++++++++++++++++++-
 hw/virtio/vhost.c                  | 135 ++++++++++++++++++++++++++++-
 4 files changed, 264 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 6cc18d6acb..466f8ae595 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -17,6 +17,11 @@
 
 typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
 
+bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
+                               VhostShadowVirtqueue *svq);
+void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
+                              VhostShadowVirtqueue *svq);
+
 VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
 
 void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 2be782cefd..732a4b2a2b 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -55,6 +55,8 @@ struct vhost_iommu {
     QLIST_ENTRY(vhost_iommu) iommu_next;
 };
 
+typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
+
 typedef struct VhostDevConfigOps {
     /* Vhost device config space changed callback
      */
@@ -83,7 +85,9 @@ struct vhost_dev {
     uint64_t backend_cap;
     bool started;
     bool log_enabled;
+    bool sw_lm_enabled;
     uint64_t log_size;
+    VhostShadowVirtqueue **shadow_vqs;
     Error *migration_blocker;
     const VhostOps *vhost_ops;
     void *opaque;
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index c0c967a7c5..908c36c66d 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -8,15 +8,129 @@
  */
 
 #include "hw/virtio/vhost-shadow-virtqueue.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/virtio-access.h"
+
+#include "standard-headers/linux/vhost_types.h"
+#include "standard-headers/linux/virtio_ring.h"
 
 #include "qemu/error-report.h"
-#include "qemu/event_notifier.h"
+#include "qemu/main-loop.h"
 
 typedef struct VhostShadowVirtqueue {
     EventNotifier kick_notifier;
     EventNotifier call_notifier;
+    const struct vhost_virtqueue *hvq;
+    VirtIODevice *vdev;
+    VirtQueue *vq;
 } VhostShadowVirtqueue;
 
+static uint16_t vhost_shadow_vring_used_flags(VhostShadowVirtqueue *svq)
+{
+    const struct vring_used *used = svq->hvq->used;
+    return virtio_tswap16(svq->vdev, used->flags);
+}
+
+static bool vhost_shadow_vring_should_kick(VhostShadowVirtqueue *vq)
+{
+    return !(vhost_shadow_vring_used_flags(vq) & VRING_USED_F_NO_NOTIFY);
+}
+
+static void vhost_shadow_vring_kick(VhostShadowVirtqueue *vq)
+{
+    if (vhost_shadow_vring_should_kick(vq)) {
+        event_notifier_set(&vq->kick_notifier);
+    }
+}
+
+static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
+{
+    struct vhost_dev *hdev = vhost_dev_from_virtio(vdev);
+    uint16_t idx = virtio_get_queue_index(vq);
+
+    VhostShadowVirtqueue *svq = hdev->shadow_vqs[idx];
+
+    vhost_shadow_vring_kick(svq);
+}
+
+/*
+ * Start shadow virtqueue operation.
+ * @dev vhost device
+ * @svq Shadow Virtqueue
+ *
+ * Run in RCU context
+ */
+bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
+                               VhostShadowVirtqueue *svq)
+{
+    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
+    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
+    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
+    struct vhost_vring_file kick_file = {
+        .index = idx,
+        .fd = event_notifier_get_fd(&svq->kick_notifier),
+    };
+    int r;
+    bool ok;
+
+    /* Check that notifications are still going directly to vhost dev */
+    assert(virtio_queue_host_notifier_status(svq->vq));
+
+    ok = k->set_vq_handler(dev->vdev, idx, handle_shadow_vq);
+    if (!ok) {
+        error_report("Couldn't set the vq handler");
+        goto err_set_kick_handler;
+    }
+
+    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
+    if (r != 0) {
+        error_report("Couldn't set kick fd: %s", strerror(errno));
+        goto err_set_vring_kick;
+    }
+
+    event_notifier_set_handler(vq_host_notifier,
+                               virtio_queue_host_notifier_read);
+    virtio_queue_set_host_notifier_enabled(svq->vq, false);
+    virtio_queue_host_notifier_read(vq_host_notifier);
+
+    return true;
+
+err_set_vring_kick:
+    k->set_vq_handler(dev->vdev, idx, NULL);
+
+err_set_kick_handler:
+    return false;
+}
+
+/*
+ * Stop shadow virtqueue operation.
+ * @dev vhost device
+ * @svq Shadow Virtqueue
+ *
+ * Run in RCU context
+ */
+void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
+                              VhostShadowVirtqueue *svq)
+{
+    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(svq->vdev);
+    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
+    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
+    struct vhost_vring_file kick_file = {
+        .index = idx,
+        .fd = event_notifier_get_fd(vq_host_notifier),
+    };
+    int r;
+
+    /* Restore vhost kick */
+    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
+    /* Cannot do a lot of things */
+    assert(r == 0);
+
+    event_notifier_set_handler(vq_host_notifier, NULL);
+    virtio_queue_set_host_notifier_enabled(svq->vq, true);
+    k->set_vq_handler(svq->vdev, idx, NULL);
+}
+
 /*
  * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
  * methods and file descriptors.
@@ -24,8 +138,13 @@ typedef struct VhostShadowVirtqueue {
 VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
 {
     g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
+    int vq_idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + idx);
     int r;
 
+    svq->vq = virtio_get_queue(dev->vdev, vq_idx);
+    svq->hvq = &dev->vqs[idx];
+    svq->vdev = dev->vdev;
+
     r = event_notifier_init(&svq->kick_notifier, 0);
     if (r != 0) {
         error_report("Couldn't create kick event notifier: %s",
@@ -40,7 +159,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
         goto err_init_call_notifier;
     }
 
-    return svq;
+    return g_steal_pointer(&svq);
 
 err_init_call_notifier:
     event_notifier_cleanup(&svq->kick_notifier);
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 42836e45f3..bde688f278 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -25,6 +25,7 @@
 #include "exec/address-spaces.h"
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
+#include "hw/virtio/vhost-shadow-virtqueue.h"
 #include "migration/blocker.h"
 #include "migration/qemu-file-types.h"
 #include "sysemu/dma.h"
@@ -945,6 +946,82 @@ static void vhost_log_global_stop(MemoryListener *listener)
     }
 }
 
+static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
+{
+    int idx;
+
+    WITH_RCU_READ_LOCK_GUARD() {
+        dev->sw_lm_enabled = false;
+
+        for (idx = 0; idx < dev->nvqs; ++idx) {
+            vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[idx]);
+        }
+    }
+
+    for (idx = 0; idx < dev->nvqs; ++idx) {
+        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
+    }
+
+    g_free(dev->shadow_vqs);
+    dev->shadow_vqs = NULL;
+    return 0;
+}
+
+static int vhost_sw_live_migration_start(struct vhost_dev *dev)
+{
+    int idx;
+
+    dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs);
+    for (idx = 0; idx < dev->nvqs; ++idx) {
+        dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx);
+        if (unlikely(dev->shadow_vqs[idx] == NULL)) {
+            goto err;
+        }
+    }
+
+    WITH_RCU_READ_LOCK_GUARD() {
+        for (idx = 0; idx < dev->nvqs; ++idx) {
+            int stop_idx = idx;
+            bool ok = vhost_shadow_vq_start_rcu(dev,
+                                                dev->shadow_vqs[idx]);
+
+            if (!ok) {
+                while (--stop_idx >= 0) {
+                    vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[stop_idx]);
+                }
+
+                goto err;
+            }
+        }
+    }
+
+    dev->sw_lm_enabled = true;
+    return 0;
+
+err:
+    for (; idx >= 0; --idx) {
+        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
+    }
+    g_free(dev->shadow_vqs[idx]);
+
+    return -1;
+}
+
+static int vhost_sw_live_migration_enable(struct vhost_dev *dev,
+                                          bool enable_lm)
+{
+    int r;
+
+    if (enable_lm == dev->sw_lm_enabled) {
+        return 0;
+    }
+
+    r = enable_lm ? vhost_sw_live_migration_start(dev)
+                  : vhost_sw_live_migration_stop(dev);
+
+    return r;
+}
+
 static void vhost_log_start(MemoryListener *listener,
                             MemoryRegionSection *section,
                             int old, int new)
@@ -1389,6 +1466,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     hdev->log = NULL;
     hdev->log_size = 0;
     hdev->log_enabled = false;
+    hdev->sw_lm_enabled = false;
     hdev->started = false;
     memory_listener_register(&hdev->memory_listener, &address_space_memory);
     QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
@@ -1816,6 +1894,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
         hdev->vhost_ops->vhost_dev_start(hdev, false);
     }
     for (i = 0; i < hdev->nvqs; ++i) {
+        if (hdev->sw_lm_enabled) {
+            vhost_shadow_vq_stop_rcu(hdev, hdev->shadow_vqs[i]);
+            vhost_shadow_vq_free(hdev->shadow_vqs[i]);
+        }
+
         vhost_virtqueue_stop(hdev,
                              vdev,
                              hdev->vqs + i,
@@ -1829,6 +1912,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
         memory_listener_unregister(&hdev->iommu_listener);
     }
     vhost_log_put(hdev, true);
+    g_free(hdev->shadow_vqs);
+    hdev->sw_lm_enabled = false;
     hdev->started = false;
     hdev->vdev = NULL;
 }
@@ -1845,5 +1930,53 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
 
 void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
 {
-    error_setg(errp, "Shadow virtqueue still not implemented.");
+    struct vhost_dev *hdev;
+    const char *err_cause = NULL;
+    const VirtioDeviceClass *k;
+    int r;
+    ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR;
+
+    QLIST_FOREACH(hdev, &vhost_devices, entry) {
+        if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) {
+            break;
+        }
+    }
+
+    if (!hdev) {
+        err_class = ERROR_CLASS_DEVICE_NOT_FOUND;
+        err_cause = "Device not found";
+        goto err;
+    }
+
+    if (!hdev->started) {
+        err_cause = "Device is not started";
+        goto err;
+    }
+
+    if (hdev->acked_features & BIT_ULL(VIRTIO_F_RING_PACKED)) {
+        err_cause = "Use packed vq";
+        goto err;
+    }
+
+    if (vhost_dev_has_iommu(hdev)) {
+        err_cause = "Device use IOMMU";
+        goto err;
+    }
+
+    k = VIRTIO_DEVICE_GET_CLASS(hdev->vdev);
+    if (!k->set_vq_handler) {
+        err_cause = "Virtio device type does not support reset of vq handler";
+        goto err;
+    }
+
+    r = vhost_sw_live_migration_enable(hdev, enable);
+    if (unlikely(r)) {
+        err_cause = "Error enabling (see monitor)";
+    }
+
+err:
+    if (err_cause) {
+        error_set(errp, err_class,
+                  "Can't enable shadow vq on %s: %s", name, err_cause);
+    }
 }
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC 10/10] vhost: Route host->guest notification through shadow virtqueue
  2021-01-29 20:54 [RFC 00/10] vDPA shadow virtqueue - notifications forwarding Eugenio Pérez
                   ` (8 preceding siblings ...)
  2021-01-29 20:54 ` [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue Eugenio Pérez
@ 2021-01-29 20:54 ` Eugenio Pérez
  9 siblings, 0 replies; 42+ messages in thread
From: Eugenio Pérez @ 2021-01-29 20:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 hw/virtio/vhost-shadow-virtqueue.h |  2 ++
 hw/virtio/vhost-shadow-virtqueue.c | 55 ++++++++++++++++++++++++++++++
 hw/virtio/vhost.c                  |  5 ++-
 3 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 466f8ae595..99a4e011fd 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -17,6 +17,8 @@
 
 typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
 
+EventNotifier *vhost_shadow_vq_get_call_notifier(VhostShadowVirtqueue *vq);
+
 bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
                                VhostShadowVirtqueue *svq);
 void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 908c36c66d..e2e0bfe325 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -53,6 +53,34 @@ static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
     vhost_shadow_vring_kick(svq);
 }
 
+static void vhost_handle_call(EventNotifier *n)
+{
+    VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue,
+                                             call_notifier);
+
+    if (event_notifier_test_and_clear(n)) {
+        unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
+
+        /*
+         * Since QEMU has not add any descriptors, virtqueue code thinks its
+         * not needed to signal used. QEMU shadow virtqueue will take
+         * descriptor forwarding soon, so just invalidate used cache for now.
+         */
+        virtio_queue_invalidate_signalled_used(svq->vdev, idx);
+        virtio_notify_irqfd(svq->vdev, svq->vq);
+    }
+}
+
+/*
+ * Get the vhost call notifier of the shadow vq
+ * @vq Shadow virtqueue
+ */
+EventNotifier *vhost_shadow_vq_get_call_notifier(VhostShadowVirtqueue *vq)
+{
+    return &vq->call_notifier;
+}
+
+
 /*
  * Start shadow virtqueue operation.
  * @dev vhost device
@@ -70,6 +98,10 @@ bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
         .index = idx,
         .fd = event_notifier_get_fd(&svq->kick_notifier),
     };
+    struct vhost_vring_file call_file = {
+        .index = idx,
+        .fd = event_notifier_get_fd(&svq->call_notifier),
+    };
     int r;
     bool ok;
 
@@ -88,6 +120,12 @@ bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
         goto err_set_vring_kick;
     }
 
+    r = dev->vhost_ops->vhost_set_vring_call(dev, &call_file);
+    if (r != 0) {
+        error_report("Couldn't set call fd: %s", strerror(errno));
+        goto err_set_vring_call;
+    }
+
     event_notifier_set_handler(vq_host_notifier,
                                virtio_queue_host_notifier_read);
     virtio_queue_set_host_notifier_enabled(svq->vq, false);
@@ -95,6 +133,11 @@ bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
 
     return true;
 
+err_set_vring_call:
+    kick_file.fd = event_notifier_get_fd(vq_host_notifier);
+    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
+    assert(r == 0);
+
 err_set_vring_kick:
     k->set_vq_handler(dev->vdev, idx, NULL);
 
@@ -129,6 +172,17 @@ void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
     event_notifier_set_handler(vq_host_notifier, NULL);
     virtio_queue_set_host_notifier_enabled(svq->vq, true);
     k->set_vq_handler(svq->vdev, idx, NULL);
+
+    if (!dev->vqs[idx].notifier_is_masked) {
+        EventNotifier *e = vhost_shadow_vq_get_call_notifier(svq);
+
+        /* Restore vhost call */
+        vhost_virtqueue_mask(dev, svq->vdev, idx, false);
+        if (event_notifier_test_and_clear(e)) {
+            virtio_queue_invalidate_signalled_used(svq->vdev, idx);
+            virtio_notify_irqfd(svq->vdev, svq->vq);
+        }
+    }
 }
 
 /*
@@ -159,6 +213,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
         goto err_init_call_notifier;
     }
 
+    event_notifier_set_handler(&svq->call_notifier, vhost_handle_call);
     return g_steal_pointer(&svq);
 
 err_init_call_notifier:
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index bde688f278..5ad0990509 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -984,7 +984,6 @@ static int vhost_sw_live_migration_start(struct vhost_dev *dev)
             int stop_idx = idx;
             bool ok = vhost_shadow_vq_start_rcu(dev,
                                                 dev->shadow_vqs[idx]);
-
             if (!ok) {
                 while (--stop_idx >= 0) {
                     vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[stop_idx]);
@@ -1610,6 +1609,10 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n,
     if (mask) {
         assert(vdev->use_guest_notifier_mask);
         file.fd = event_notifier_get_fd(&hdev->vqs[index].masked_notifier);
+    } else if (hdev->sw_lm_enabled) {
+        VhostShadowVirtqueue *svq = hdev->shadow_vqs[n];
+        EventNotifier *e = vhost_shadow_vq_get_call_notifier(svq);
+        file.fd = event_notifier_get_fd(e);
     } else {
         file.fd = event_notifier_get_fd(virtio_queue_get_guest_notifier(vvq));
     }
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [RFC 03/10] virtio: Add virtio_queue_get_idx
  2021-01-29 20:54 ` [RFC 03/10] virtio: Add virtio_queue_get_idx Eugenio Pérez
@ 2021-02-01  6:10     ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-01  6:10 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller


On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   include/hw/virtio/virtio.h | 2 ++
>   hw/virtio/virtio.c         | 5 +++++
>   2 files changed, 7 insertions(+)
>
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index 9988c6d5c9..9013c03424 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -399,6 +399,8 @@ static inline bool virtio_device_disabled(VirtIODevice *vdev)
>       return unlikely(vdev->disabled || vdev->broken);
>   }
>   
> +unsigned virtio_queue_get_idx(const VirtIODevice *vdev, const VirtQueue *vq);
> +
>   bool virtio_legacy_allowed(VirtIODevice *vdev);
>   bool virtio_legacy_check_disabled(VirtIODevice *vdev);
>   
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index ebb780fb42..3d14b0ef74 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -500,6 +500,11 @@ void virtio_queue_set_notification(VirtQueue *vq, int enable)
>       }
>   }
>   
> +unsigned virtio_queue_get_idx(const VirtIODevice *vdev, const VirtQueue *vq)
> +{
> +    return vq - vdev->vq;
> +}


It looks to me we had a dedicated index stored in VirtQueue: 
vq->queue_index.

Thanks


> +
>   int virtio_queue_ready(VirtQueue *vq)
>   {
>       return vq->vring.avail != 0;



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 03/10] virtio: Add virtio_queue_get_idx
@ 2021-02-01  6:10     ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-01  6:10 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, virtualization,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	Eric Blake, Michael Lilja, Jim Harford, Rob Miller


On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   include/hw/virtio/virtio.h | 2 ++
>   hw/virtio/virtio.c         | 5 +++++
>   2 files changed, 7 insertions(+)
>
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index 9988c6d5c9..9013c03424 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -399,6 +399,8 @@ static inline bool virtio_device_disabled(VirtIODevice *vdev)
>       return unlikely(vdev->disabled || vdev->broken);
>   }
>   
> +unsigned virtio_queue_get_idx(const VirtIODevice *vdev, const VirtQueue *vq);
> +
>   bool virtio_legacy_allowed(VirtIODevice *vdev);
>   bool virtio_legacy_check_disabled(VirtIODevice *vdev);
>   
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index ebb780fb42..3d14b0ef74 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -500,6 +500,11 @@ void virtio_queue_set_notification(VirtQueue *vq, int enable)
>       }
>   }
>   
> +unsigned virtio_queue_get_idx(const VirtIODevice *vdev, const VirtQueue *vq)
> +{
> +    return vq - vdev->vq;
> +}


It looks to me we had a dedicated index stored in VirtQueue: 
vq->queue_index.

Thanks


> +
>   int virtio_queue_ready(VirtQueue *vq)
>   {
>       return vq->vring.avail != 0;

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
  2021-01-29 20:54 ` [RFC 05/10] vhost: Add vhost_dev_from_virtio Eugenio Pérez
@ 2021-02-01  6:12     ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-01  6:12 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller


On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   include/hw/virtio/vhost.h |  1 +
>   hw/virtio/vhost.c         | 17 +++++++++++++++++
>   2 files changed, 18 insertions(+)
>
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 4a8bc75415..fca076e3f0 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>   void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
>                           uint64_t features);
>   bool vhost_has_free_slot(void);
> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
>   
>   int vhost_net_set_backend(struct vhost_dev *hdev,
>                             struct vhost_vring_file *file);
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 28c7d78172..8683d507f5 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
>       return slots_limit > used_memslots;
>   }
>   
> +/*
> + * Get the vhost device associated to a VirtIO device.
> + */
> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
> +{
> +    struct vhost_dev *hdev;
> +
> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
> +        if (hdev->vdev == vdev) {
> +            return hdev;
> +        }
> +    }
> +
> +    assert(hdev);
> +    return NULL;
> +}


I'm not sure this can work in the case of multiqueue. E.g vhost-net 
multiqueue is a N:1 mapping between vhost devics and virtio devices.

Thanks


> +
>   static void vhost_dev_sync_region(struct vhost_dev *dev,
>                                     MemoryRegionSection *section,
>                                     uint64_t mfirst, uint64_t mlast,



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
@ 2021-02-01  6:12     ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-01  6:12 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, virtualization,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	Eric Blake, Michael Lilja, Jim Harford, Rob Miller


On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   include/hw/virtio/vhost.h |  1 +
>   hw/virtio/vhost.c         | 17 +++++++++++++++++
>   2 files changed, 18 insertions(+)
>
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 4a8bc75415..fca076e3f0 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>   void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
>                           uint64_t features);
>   bool vhost_has_free_slot(void);
> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
>   
>   int vhost_net_set_backend(struct vhost_dev *hdev,
>                             struct vhost_vring_file *file);
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 28c7d78172..8683d507f5 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
>       return slots_limit > used_memslots;
>   }
>   
> +/*
> + * Get the vhost device associated to a VirtIO device.
> + */
> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
> +{
> +    struct vhost_dev *hdev;
> +
> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
> +        if (hdev->vdev == vdev) {
> +            return hdev;
> +        }
> +    }
> +
> +    assert(hdev);
> +    return NULL;
> +}


I'm not sure this can work in the case of multiqueue. E.g vhost-net 
multiqueue is a N:1 mapping between vhost devics and virtio devices.

Thanks


> +
>   static void vhost_dev_sync_region(struct vhost_dev *dev,
>                                     MemoryRegionSection *section,
>                                     uint64_t mfirst, uint64_t mlast,

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue
  2021-01-29 20:54 ` [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue Eugenio Pérez
@ 2021-02-01  6:29     ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-01  6:29 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller


On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> Shadow virtqueue notifications forwarding is disabled when vhost_dev
> stops.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   hw/virtio/vhost-shadow-virtqueue.h |   5 ++
>   include/hw/virtio/vhost.h          |   4 +
>   hw/virtio/vhost-shadow-virtqueue.c | 123 +++++++++++++++++++++++++-
>   hw/virtio/vhost.c                  | 135 ++++++++++++++++++++++++++++-
>   4 files changed, 264 insertions(+), 3 deletions(-)
>
> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> index 6cc18d6acb..466f8ae595 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.h
> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> @@ -17,6 +17,11 @@
>   
>   typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
>   
> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
> +                               VhostShadowVirtqueue *svq);
> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
> +                              VhostShadowVirtqueue *svq);
> +
>   VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
>   
>   void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 2be782cefd..732a4b2a2b 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -55,6 +55,8 @@ struct vhost_iommu {
>       QLIST_ENTRY(vhost_iommu) iommu_next;
>   };
>   
> +typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
> +
>   typedef struct VhostDevConfigOps {
>       /* Vhost device config space changed callback
>        */
> @@ -83,7 +85,9 @@ struct vhost_dev {
>       uint64_t backend_cap;
>       bool started;
>       bool log_enabled;
> +    bool sw_lm_enabled;
>       uint64_t log_size;
> +    VhostShadowVirtqueue **shadow_vqs;
>       Error *migration_blocker;
>       const VhostOps *vhost_ops;
>       void *opaque;
> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> index c0c967a7c5..908c36c66d 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.c
> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> @@ -8,15 +8,129 @@
>    */
>   
>   #include "hw/virtio/vhost-shadow-virtqueue.h"
> +#include "hw/virtio/vhost.h"
> +#include "hw/virtio/virtio-access.h"
> +
> +#include "standard-headers/linux/vhost_types.h"
> +#include "standard-headers/linux/virtio_ring.h"
>   
>   #include "qemu/error-report.h"
> -#include "qemu/event_notifier.h"
> +#include "qemu/main-loop.h"
>   
>   typedef struct VhostShadowVirtqueue {
>       EventNotifier kick_notifier;
>       EventNotifier call_notifier;
> +    const struct vhost_virtqueue *hvq;
> +    VirtIODevice *vdev;
> +    VirtQueue *vq;
>   } VhostShadowVirtqueue;


So instead of doing things at virtio level, how about do the shadow 
stuffs at vhost level?

It works like:

virtio -> [shadow vhost backend] -> vhost backend

Then the QMP is used to plug the shadow vhost backend in the middle or not.

It looks kind of easier since we don't need to deal with virtqueue 
handlers etc.. Instead, we just need to deal with eventfd stuffs:

When shadow vhost mode is enabled, we just intercept the host_notifiers 
and guest_notifiers. When it was disabled, we just pass the host/guest 
notifiers to the real vhost backends?

Thanks


>   
> +static uint16_t vhost_shadow_vring_used_flags(VhostShadowVirtqueue *svq)
> +{
> +    const struct vring_used *used = svq->hvq->used;
> +    return virtio_tswap16(svq->vdev, used->flags);
> +}
> +
> +static bool vhost_shadow_vring_should_kick(VhostShadowVirtqueue *vq)
> +{
> +    return !(vhost_shadow_vring_used_flags(vq) & VRING_USED_F_NO_NOTIFY);
> +}
> +
> +static void vhost_shadow_vring_kick(VhostShadowVirtqueue *vq)
> +{
> +    if (vhost_shadow_vring_should_kick(vq)) {
> +        event_notifier_set(&vq->kick_notifier);
> +    }
> +}
> +
> +static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +    struct vhost_dev *hdev = vhost_dev_from_virtio(vdev);
> +    uint16_t idx = virtio_get_queue_index(vq);
> +
> +    VhostShadowVirtqueue *svq = hdev->shadow_vqs[idx];
> +
> +    vhost_shadow_vring_kick(svq);
> +}
> +
> +/*
> + * Start shadow virtqueue operation.
> + * @dev vhost device
> + * @svq Shadow Virtqueue
> + *
> + * Run in RCU context
> + */
> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
> +                               VhostShadowVirtqueue *svq)
> +{
> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
> +    struct vhost_vring_file kick_file = {
> +        .index = idx,
> +        .fd = event_notifier_get_fd(&svq->kick_notifier),
> +    };
> +    int r;
> +    bool ok;
> +
> +    /* Check that notifications are still going directly to vhost dev */
> +    assert(virtio_queue_host_notifier_status(svq->vq));
> +
> +    ok = k->set_vq_handler(dev->vdev, idx, handle_shadow_vq);
> +    if (!ok) {
> +        error_report("Couldn't set the vq handler");
> +        goto err_set_kick_handler;
> +    }
> +
> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
> +    if (r != 0) {
> +        error_report("Couldn't set kick fd: %s", strerror(errno));
> +        goto err_set_vring_kick;
> +    }
> +
> +    event_notifier_set_handler(vq_host_notifier,
> +                               virtio_queue_host_notifier_read);
> +    virtio_queue_set_host_notifier_enabled(svq->vq, false);
> +    virtio_queue_host_notifier_read(vq_host_notifier);
> +
> +    return true;
> +
> +err_set_vring_kick:
> +    k->set_vq_handler(dev->vdev, idx, NULL);
> +
> +err_set_kick_handler:
> +    return false;
> +}
> +
> +/*
> + * Stop shadow virtqueue operation.
> + * @dev vhost device
> + * @svq Shadow Virtqueue
> + *
> + * Run in RCU context
> + */
> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
> +                              VhostShadowVirtqueue *svq)
> +{
> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(svq->vdev);
> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> +    struct vhost_vring_file kick_file = {
> +        .index = idx,
> +        .fd = event_notifier_get_fd(vq_host_notifier),
> +    };
> +    int r;
> +
> +    /* Restore vhost kick */
> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
> +    /* Cannot do a lot of things */
> +    assert(r == 0);
> +
> +    event_notifier_set_handler(vq_host_notifier, NULL);
> +    virtio_queue_set_host_notifier_enabled(svq->vq, true);
> +    k->set_vq_handler(svq->vdev, idx, NULL);
> +}
> +
>   /*
>    * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
>    * methods and file descriptors.
> @@ -24,8 +138,13 @@ typedef struct VhostShadowVirtqueue {
>   VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>   {
>       g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
> +    int vq_idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + idx);
>       int r;
>   
> +    svq->vq = virtio_get_queue(dev->vdev, vq_idx);
> +    svq->hvq = &dev->vqs[idx];
> +    svq->vdev = dev->vdev;
> +
>       r = event_notifier_init(&svq->kick_notifier, 0);
>       if (r != 0) {
>           error_report("Couldn't create kick event notifier: %s",
> @@ -40,7 +159,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>           goto err_init_call_notifier;
>       }
>   
> -    return svq;
> +    return g_steal_pointer(&svq);
>   
>   err_init_call_notifier:
>       event_notifier_cleanup(&svq->kick_notifier);
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 42836e45f3..bde688f278 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -25,6 +25,7 @@
>   #include "exec/address-spaces.h"
>   #include "hw/virtio/virtio-bus.h"
>   #include "hw/virtio/virtio-access.h"
> +#include "hw/virtio/vhost-shadow-virtqueue.h"
>   #include "migration/blocker.h"
>   #include "migration/qemu-file-types.h"
>   #include "sysemu/dma.h"
> @@ -945,6 +946,82 @@ static void vhost_log_global_stop(MemoryListener *listener)
>       }
>   }
>   
> +static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
> +{
> +    int idx;
> +
> +    WITH_RCU_READ_LOCK_GUARD() {
> +        dev->sw_lm_enabled = false;
> +
> +        for (idx = 0; idx < dev->nvqs; ++idx) {
> +            vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[idx]);
> +        }
> +    }
> +
> +    for (idx = 0; idx < dev->nvqs; ++idx) {
> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> +    }
> +
> +    g_free(dev->shadow_vqs);
> +    dev->shadow_vqs = NULL;
> +    return 0;
> +}
> +
> +static int vhost_sw_live_migration_start(struct vhost_dev *dev)
> +{
> +    int idx;
> +
> +    dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs);
> +    for (idx = 0; idx < dev->nvqs; ++idx) {
> +        dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx);
> +        if (unlikely(dev->shadow_vqs[idx] == NULL)) {
> +            goto err;
> +        }
> +    }
> +
> +    WITH_RCU_READ_LOCK_GUARD() {
> +        for (idx = 0; idx < dev->nvqs; ++idx) {
> +            int stop_idx = idx;
> +            bool ok = vhost_shadow_vq_start_rcu(dev,
> +                                                dev->shadow_vqs[idx]);
> +
> +            if (!ok) {
> +                while (--stop_idx >= 0) {
> +                    vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[stop_idx]);
> +                }
> +
> +                goto err;
> +            }
> +        }
> +    }
> +
> +    dev->sw_lm_enabled = true;
> +    return 0;
> +
> +err:
> +    for (; idx >= 0; --idx) {
> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> +    }
> +    g_free(dev->shadow_vqs[idx]);
> +
> +    return -1;
> +}
> +
> +static int vhost_sw_live_migration_enable(struct vhost_dev *dev,
> +                                          bool enable_lm)
> +{
> +    int r;
> +
> +    if (enable_lm == dev->sw_lm_enabled) {
> +        return 0;
> +    }
> +
> +    r = enable_lm ? vhost_sw_live_migration_start(dev)
> +                  : vhost_sw_live_migration_stop(dev);
> +
> +    return r;
> +}
> +
>   static void vhost_log_start(MemoryListener *listener,
>                               MemoryRegionSection *section,
>                               int old, int new)
> @@ -1389,6 +1466,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>       hdev->log = NULL;
>       hdev->log_size = 0;
>       hdev->log_enabled = false;
> +    hdev->sw_lm_enabled = false;
>       hdev->started = false;
>       memory_listener_register(&hdev->memory_listener, &address_space_memory);
>       QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
> @@ -1816,6 +1894,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>           hdev->vhost_ops->vhost_dev_start(hdev, false);
>       }
>       for (i = 0; i < hdev->nvqs; ++i) {
> +        if (hdev->sw_lm_enabled) {
> +            vhost_shadow_vq_stop_rcu(hdev, hdev->shadow_vqs[i]);
> +            vhost_shadow_vq_free(hdev->shadow_vqs[i]);
> +        }
> +
>           vhost_virtqueue_stop(hdev,
>                                vdev,
>                                hdev->vqs + i,
> @@ -1829,6 +1912,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>           memory_listener_unregister(&hdev->iommu_listener);
>       }
>       vhost_log_put(hdev, true);
> +    g_free(hdev->shadow_vqs);
> +    hdev->sw_lm_enabled = false;
>       hdev->started = false;
>       hdev->vdev = NULL;
>   }
> @@ -1845,5 +1930,53 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>   
>   void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
>   {
> -    error_setg(errp, "Shadow virtqueue still not implemented.");
> +    struct vhost_dev *hdev;
> +    const char *err_cause = NULL;
> +    const VirtioDeviceClass *k;
> +    int r;
> +    ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR;
> +
> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
> +        if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) {
> +            break;
> +        }
> +    }
> +
> +    if (!hdev) {
> +        err_class = ERROR_CLASS_DEVICE_NOT_FOUND;
> +        err_cause = "Device not found";
> +        goto err;
> +    }
> +
> +    if (!hdev->started) {
> +        err_cause = "Device is not started";
> +        goto err;
> +    }
> +
> +    if (hdev->acked_features & BIT_ULL(VIRTIO_F_RING_PACKED)) {
> +        err_cause = "Use packed vq";
> +        goto err;
> +    }
> +
> +    if (vhost_dev_has_iommu(hdev)) {
> +        err_cause = "Device use IOMMU";
> +        goto err;
> +    }
> +
> +    k = VIRTIO_DEVICE_GET_CLASS(hdev->vdev);
> +    if (!k->set_vq_handler) {
> +        err_cause = "Virtio device type does not support reset of vq handler";
> +        goto err;
> +    }
> +
> +    r = vhost_sw_live_migration_enable(hdev, enable);
> +    if (unlikely(r)) {
> +        err_cause = "Error enabling (see monitor)";
> +    }
> +
> +err:
> +    if (err_cause) {
> +        error_set(errp, err_class,
> +                  "Can't enable shadow vq on %s: %s", name, err_cause);
> +    }
>   }



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue
@ 2021-02-01  6:29     ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-01  6:29 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, virtualization,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	Eric Blake, Michael Lilja, Jim Harford, Rob Miller


On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> Shadow virtqueue notifications forwarding is disabled when vhost_dev
> stops.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   hw/virtio/vhost-shadow-virtqueue.h |   5 ++
>   include/hw/virtio/vhost.h          |   4 +
>   hw/virtio/vhost-shadow-virtqueue.c | 123 +++++++++++++++++++++++++-
>   hw/virtio/vhost.c                  | 135 ++++++++++++++++++++++++++++-
>   4 files changed, 264 insertions(+), 3 deletions(-)
>
> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> index 6cc18d6acb..466f8ae595 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.h
> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> @@ -17,6 +17,11 @@
>   
>   typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
>   
> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
> +                               VhostShadowVirtqueue *svq);
> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
> +                              VhostShadowVirtqueue *svq);
> +
>   VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
>   
>   void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 2be782cefd..732a4b2a2b 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -55,6 +55,8 @@ struct vhost_iommu {
>       QLIST_ENTRY(vhost_iommu) iommu_next;
>   };
>   
> +typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
> +
>   typedef struct VhostDevConfigOps {
>       /* Vhost device config space changed callback
>        */
> @@ -83,7 +85,9 @@ struct vhost_dev {
>       uint64_t backend_cap;
>       bool started;
>       bool log_enabled;
> +    bool sw_lm_enabled;
>       uint64_t log_size;
> +    VhostShadowVirtqueue **shadow_vqs;
>       Error *migration_blocker;
>       const VhostOps *vhost_ops;
>       void *opaque;
> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> index c0c967a7c5..908c36c66d 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.c
> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> @@ -8,15 +8,129 @@
>    */
>   
>   #include "hw/virtio/vhost-shadow-virtqueue.h"
> +#include "hw/virtio/vhost.h"
> +#include "hw/virtio/virtio-access.h"
> +
> +#include "standard-headers/linux/vhost_types.h"
> +#include "standard-headers/linux/virtio_ring.h"
>   
>   #include "qemu/error-report.h"
> -#include "qemu/event_notifier.h"
> +#include "qemu/main-loop.h"
>   
>   typedef struct VhostShadowVirtqueue {
>       EventNotifier kick_notifier;
>       EventNotifier call_notifier;
> +    const struct vhost_virtqueue *hvq;
> +    VirtIODevice *vdev;
> +    VirtQueue *vq;
>   } VhostShadowVirtqueue;


So instead of doing things at virtio level, how about do the shadow 
stuffs at vhost level?

It works like:

virtio -> [shadow vhost backend] -> vhost backend

Then the QMP is used to plug the shadow vhost backend in the middle or not.

It looks kind of easier since we don't need to deal with virtqueue 
handlers etc.. Instead, we just need to deal with eventfd stuffs:

When shadow vhost mode is enabled, we just intercept the host_notifiers 
and guest_notifiers. When it was disabled, we just pass the host/guest 
notifiers to the real vhost backends?

Thanks


>   
> +static uint16_t vhost_shadow_vring_used_flags(VhostShadowVirtqueue *svq)
> +{
> +    const struct vring_used *used = svq->hvq->used;
> +    return virtio_tswap16(svq->vdev, used->flags);
> +}
> +
> +static bool vhost_shadow_vring_should_kick(VhostShadowVirtqueue *vq)
> +{
> +    return !(vhost_shadow_vring_used_flags(vq) & VRING_USED_F_NO_NOTIFY);
> +}
> +
> +static void vhost_shadow_vring_kick(VhostShadowVirtqueue *vq)
> +{
> +    if (vhost_shadow_vring_should_kick(vq)) {
> +        event_notifier_set(&vq->kick_notifier);
> +    }
> +}
> +
> +static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +    struct vhost_dev *hdev = vhost_dev_from_virtio(vdev);
> +    uint16_t idx = virtio_get_queue_index(vq);
> +
> +    VhostShadowVirtqueue *svq = hdev->shadow_vqs[idx];
> +
> +    vhost_shadow_vring_kick(svq);
> +}
> +
> +/*
> + * Start shadow virtqueue operation.
> + * @dev vhost device
> + * @svq Shadow Virtqueue
> + *
> + * Run in RCU context
> + */
> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
> +                               VhostShadowVirtqueue *svq)
> +{
> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
> +    struct vhost_vring_file kick_file = {
> +        .index = idx,
> +        .fd = event_notifier_get_fd(&svq->kick_notifier),
> +    };
> +    int r;
> +    bool ok;
> +
> +    /* Check that notifications are still going directly to vhost dev */
> +    assert(virtio_queue_host_notifier_status(svq->vq));
> +
> +    ok = k->set_vq_handler(dev->vdev, idx, handle_shadow_vq);
> +    if (!ok) {
> +        error_report("Couldn't set the vq handler");
> +        goto err_set_kick_handler;
> +    }
> +
> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
> +    if (r != 0) {
> +        error_report("Couldn't set kick fd: %s", strerror(errno));
> +        goto err_set_vring_kick;
> +    }
> +
> +    event_notifier_set_handler(vq_host_notifier,
> +                               virtio_queue_host_notifier_read);
> +    virtio_queue_set_host_notifier_enabled(svq->vq, false);
> +    virtio_queue_host_notifier_read(vq_host_notifier);
> +
> +    return true;
> +
> +err_set_vring_kick:
> +    k->set_vq_handler(dev->vdev, idx, NULL);
> +
> +err_set_kick_handler:
> +    return false;
> +}
> +
> +/*
> + * Stop shadow virtqueue operation.
> + * @dev vhost device
> + * @svq Shadow Virtqueue
> + *
> + * Run in RCU context
> + */
> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
> +                              VhostShadowVirtqueue *svq)
> +{
> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(svq->vdev);
> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> +    struct vhost_vring_file kick_file = {
> +        .index = idx,
> +        .fd = event_notifier_get_fd(vq_host_notifier),
> +    };
> +    int r;
> +
> +    /* Restore vhost kick */
> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
> +    /* Cannot do a lot of things */
> +    assert(r == 0);
> +
> +    event_notifier_set_handler(vq_host_notifier, NULL);
> +    virtio_queue_set_host_notifier_enabled(svq->vq, true);
> +    k->set_vq_handler(svq->vdev, idx, NULL);
> +}
> +
>   /*
>    * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
>    * methods and file descriptors.
> @@ -24,8 +138,13 @@ typedef struct VhostShadowVirtqueue {
>   VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>   {
>       g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
> +    int vq_idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + idx);
>       int r;
>   
> +    svq->vq = virtio_get_queue(dev->vdev, vq_idx);
> +    svq->hvq = &dev->vqs[idx];
> +    svq->vdev = dev->vdev;
> +
>       r = event_notifier_init(&svq->kick_notifier, 0);
>       if (r != 0) {
>           error_report("Couldn't create kick event notifier: %s",
> @@ -40,7 +159,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>           goto err_init_call_notifier;
>       }
>   
> -    return svq;
> +    return g_steal_pointer(&svq);
>   
>   err_init_call_notifier:
>       event_notifier_cleanup(&svq->kick_notifier);
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 42836e45f3..bde688f278 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -25,6 +25,7 @@
>   #include "exec/address-spaces.h"
>   #include "hw/virtio/virtio-bus.h"
>   #include "hw/virtio/virtio-access.h"
> +#include "hw/virtio/vhost-shadow-virtqueue.h"
>   #include "migration/blocker.h"
>   #include "migration/qemu-file-types.h"
>   #include "sysemu/dma.h"
> @@ -945,6 +946,82 @@ static void vhost_log_global_stop(MemoryListener *listener)
>       }
>   }
>   
> +static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
> +{
> +    int idx;
> +
> +    WITH_RCU_READ_LOCK_GUARD() {
> +        dev->sw_lm_enabled = false;
> +
> +        for (idx = 0; idx < dev->nvqs; ++idx) {
> +            vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[idx]);
> +        }
> +    }
> +
> +    for (idx = 0; idx < dev->nvqs; ++idx) {
> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> +    }
> +
> +    g_free(dev->shadow_vqs);
> +    dev->shadow_vqs = NULL;
> +    return 0;
> +}
> +
> +static int vhost_sw_live_migration_start(struct vhost_dev *dev)
> +{
> +    int idx;
> +
> +    dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs);
> +    for (idx = 0; idx < dev->nvqs; ++idx) {
> +        dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx);
> +        if (unlikely(dev->shadow_vqs[idx] == NULL)) {
> +            goto err;
> +        }
> +    }
> +
> +    WITH_RCU_READ_LOCK_GUARD() {
> +        for (idx = 0; idx < dev->nvqs; ++idx) {
> +            int stop_idx = idx;
> +            bool ok = vhost_shadow_vq_start_rcu(dev,
> +                                                dev->shadow_vqs[idx]);
> +
> +            if (!ok) {
> +                while (--stop_idx >= 0) {
> +                    vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[stop_idx]);
> +                }
> +
> +                goto err;
> +            }
> +        }
> +    }
> +
> +    dev->sw_lm_enabled = true;
> +    return 0;
> +
> +err:
> +    for (; idx >= 0; --idx) {
> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> +    }
> +    g_free(dev->shadow_vqs[idx]);
> +
> +    return -1;
> +}
> +
> +static int vhost_sw_live_migration_enable(struct vhost_dev *dev,
> +                                          bool enable_lm)
> +{
> +    int r;
> +
> +    if (enable_lm == dev->sw_lm_enabled) {
> +        return 0;
> +    }
> +
> +    r = enable_lm ? vhost_sw_live_migration_start(dev)
> +                  : vhost_sw_live_migration_stop(dev);
> +
> +    return r;
> +}
> +
>   static void vhost_log_start(MemoryListener *listener,
>                               MemoryRegionSection *section,
>                               int old, int new)
> @@ -1389,6 +1466,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>       hdev->log = NULL;
>       hdev->log_size = 0;
>       hdev->log_enabled = false;
> +    hdev->sw_lm_enabled = false;
>       hdev->started = false;
>       memory_listener_register(&hdev->memory_listener, &address_space_memory);
>       QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
> @@ -1816,6 +1894,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>           hdev->vhost_ops->vhost_dev_start(hdev, false);
>       }
>       for (i = 0; i < hdev->nvqs; ++i) {
> +        if (hdev->sw_lm_enabled) {
> +            vhost_shadow_vq_stop_rcu(hdev, hdev->shadow_vqs[i]);
> +            vhost_shadow_vq_free(hdev->shadow_vqs[i]);
> +        }
> +
>           vhost_virtqueue_stop(hdev,
>                                vdev,
>                                hdev->vqs + i,
> @@ -1829,6 +1912,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>           memory_listener_unregister(&hdev->iommu_listener);
>       }
>       vhost_log_put(hdev, true);
> +    g_free(hdev->shadow_vqs);
> +    hdev->sw_lm_enabled = false;
>       hdev->started = false;
>       hdev->vdev = NULL;
>   }
> @@ -1845,5 +1930,53 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>   
>   void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
>   {
> -    error_setg(errp, "Shadow virtqueue still not implemented.");
> +    struct vhost_dev *hdev;
> +    const char *err_cause = NULL;
> +    const VirtioDeviceClass *k;
> +    int r;
> +    ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR;
> +
> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
> +        if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) {
> +            break;
> +        }
> +    }
> +
> +    if (!hdev) {
> +        err_class = ERROR_CLASS_DEVICE_NOT_FOUND;
> +        err_cause = "Device not found";
> +        goto err;
> +    }
> +
> +    if (!hdev->started) {
> +        err_cause = "Device is not started";
> +        goto err;
> +    }
> +
> +    if (hdev->acked_features & BIT_ULL(VIRTIO_F_RING_PACKED)) {
> +        err_cause = "Use packed vq";
> +        goto err;
> +    }
> +
> +    if (vhost_dev_has_iommu(hdev)) {
> +        err_cause = "Device use IOMMU";
> +        goto err;
> +    }
> +
> +    k = VIRTIO_DEVICE_GET_CLASS(hdev->vdev);
> +    if (!k->set_vq_handler) {
> +        err_cause = "Virtio device type does not support reset of vq handler";
> +        goto err;
> +    }
> +
> +    r = vhost_sw_live_migration_enable(hdev, enable);
> +    if (unlikely(r)) {
> +        err_cause = "Error enabling (see monitor)";
> +    }
> +
> +err:
> +    if (err_cause) {
> +        error_set(errp, err_class,
> +                  "Can't enable shadow vq on %s: %s", name, err_cause);
> +    }
>   }

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 03/10] virtio: Add virtio_queue_get_idx
  2021-02-01  6:10     ` Jason Wang
  (?)
@ 2021-02-01  7:20     ` Eugenio Perez Martin
  -1 siblings, 0 replies; 42+ messages in thread
From: Eugenio Perez Martin @ 2021-02-01  7:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, Michael S. Tsirkin, Juan Quintela, qemu-level,
	Markus Armbruster, Stefano Garzarella, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, virtualization,
	Michael Lilja, Jim Harford, Rob Miller

On Mon, Feb 1, 2021 at 7:10 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >   include/hw/virtio/virtio.h | 2 ++
> >   hw/virtio/virtio.c         | 5 +++++
> >   2 files changed, 7 insertions(+)
> >
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > index 9988c6d5c9..9013c03424 100644
> > --- a/include/hw/virtio/virtio.h
> > +++ b/include/hw/virtio/virtio.h
> > @@ -399,6 +399,8 @@ static inline bool virtio_device_disabled(VirtIODevice *vdev)
> >       return unlikely(vdev->disabled || vdev->broken);
> >   }
> >
> > +unsigned virtio_queue_get_idx(const VirtIODevice *vdev, const VirtQueue *vq);
> > +
> >   bool virtio_legacy_allowed(VirtIODevice *vdev);
> >   bool virtio_legacy_check_disabled(VirtIODevice *vdev);
> >
> > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > index ebb780fb42..3d14b0ef74 100644
> > --- a/hw/virtio/virtio.c
> > +++ b/hw/virtio/virtio.c
> > @@ -500,6 +500,11 @@ void virtio_queue_set_notification(VirtQueue *vq, int enable)
> >       }
> >   }
> >
> > +unsigned virtio_queue_get_idx(const VirtIODevice *vdev, const VirtQueue *vq)
> > +{
> > +    return vq - vdev->vq;
> > +}
>
>
> It looks to me we had a dedicated index stored in VirtQueue:
> vq->queue_index.
>

You are right, I don't know why but missed it! It will be used in the
next series.

Thanks!

> Thanks
>
>
> > +
> >   int virtio_queue_ready(VirtQueue *vq)
> >   {
> >       return vq->vring.avail != 0;
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
  2021-02-01  6:12     ` Jason Wang
  (?)
@ 2021-02-01  8:28     ` Eugenio Perez Martin
  2021-02-02  3:31         ` Jason Wang
  -1 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2021-02-01  8:28 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, Michael S. Tsirkin, Juan Quintela, qemu-level,
	Markus Armbruster, Stefano Garzarella, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, virtualization,
	Michael Lilja, Jim Harford, Rob Miller

On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >   include/hw/virtio/vhost.h |  1 +
> >   hw/virtio/vhost.c         | 17 +++++++++++++++++
> >   2 files changed, 18 insertions(+)
> >
> > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> > index 4a8bc75415..fca076e3f0 100644
> > --- a/include/hw/virtio/vhost.h
> > +++ b/include/hw/virtio/vhost.h
> > @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
> >   void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
> >                           uint64_t features);
> >   bool vhost_has_free_slot(void);
> > +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
> >
> >   int vhost_net_set_backend(struct vhost_dev *hdev,
> >                             struct vhost_vring_file *file);
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 28c7d78172..8683d507f5 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
> >       return slots_limit > used_memslots;
> >   }
> >
> > +/*
> > + * Get the vhost device associated to a VirtIO device.
> > + */
> > +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
> > +{
> > +    struct vhost_dev *hdev;
> > +
> > +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
> > +        if (hdev->vdev == vdev) {
> > +            return hdev;
> > +        }
> > +    }
> > +
> > +    assert(hdev);
> > +    return NULL;
> > +}
>
>
> I'm not sure this can work in the case of multiqueue. E.g vhost-net
> multiqueue is a N:1 mapping between vhost devics and virtio devices.
>
> Thanks
>

Right. We could add an "vdev vq index" parameter to the function in
this case, but I guess the most reliable way to do this is to add a
vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.

I need to take this into account in qmp_x_vhost_enable_shadow_vq too.

>
> > +
> >   static void vhost_dev_sync_region(struct vhost_dev *dev,
> >                                     MemoryRegionSection *section,
> >                                     uint64_t mfirst, uint64_t mlast,
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
  2021-02-01  8:28     ` Eugenio Perez Martin
@ 2021-02-02  3:31         ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-02  3:31 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Parav Pandit, Michael S. Tsirkin, Juan Quintela, qemu-level,
	Markus Armbruster, Stefano Garzarella, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, virtualization,
	Michael Lilja, Jim Harford, Rob Miller


On 2021/2/1 下午4:28, Eugenio Perez Martin wrote:
> On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>>    include/hw/virtio/vhost.h |  1 +
>>>    hw/virtio/vhost.c         | 17 +++++++++++++++++
>>>    2 files changed, 18 insertions(+)
>>>
>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>> index 4a8bc75415..fca076e3f0 100644
>>> --- a/include/hw/virtio/vhost.h
>>> +++ b/include/hw/virtio/vhost.h
>>> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>>>    void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
>>>                            uint64_t features);
>>>    bool vhost_has_free_slot(void);
>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
>>>
>>>    int vhost_net_set_backend(struct vhost_dev *hdev,
>>>                              struct vhost_vring_file *file);
>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>> index 28c7d78172..8683d507f5 100644
>>> --- a/hw/virtio/vhost.c
>>> +++ b/hw/virtio/vhost.c
>>> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
>>>        return slots_limit > used_memslots;
>>>    }
>>>
>>> +/*
>>> + * Get the vhost device associated to a VirtIO device.
>>> + */
>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
>>> +{
>>> +    struct vhost_dev *hdev;
>>> +
>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>> +        if (hdev->vdev == vdev) {
>>> +            return hdev;
>>> +        }
>>> +    }
>>> +
>>> +    assert(hdev);
>>> +    return NULL;
>>> +}
>>
>> I'm not sure this can work in the case of multiqueue. E.g vhost-net
>> multiqueue is a N:1 mapping between vhost devics and virtio devices.
>>
>> Thanks
>>
> Right. We could add an "vdev vq index" parameter to the function in
> this case, but I guess the most reliable way to do this is to add a
> vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.


So the question still, it looks like it's easier to hide the shadow 
virtqueue stuffs at vhost layer instead of expose them to virtio layer:

1) vhost protocol is stable ABI
2) no need to deal with virtio stuffs which is more complex than vhost

Or are there any advantages if we do it at virtio layer?

Thanks


>
> I need to take this into account in qmp_x_vhost_enable_shadow_vq too.
>
>>> +
>>>    static void vhost_dev_sync_region(struct vhost_dev *dev,
>>>                                      MemoryRegionSection *section,
>>>                                      uint64_t mfirst, uint64_t mlast,



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
@ 2021-02-02  3:31         ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-02  3:31 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Parav Pandit, Michael S. Tsirkin, qemu-level,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	virtualization, Eric Blake, Michael Lilja, Jim Harford,
	Rob Miller


On 2021/2/1 下午4:28, Eugenio Perez Martin wrote:
> On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>>    include/hw/virtio/vhost.h |  1 +
>>>    hw/virtio/vhost.c         | 17 +++++++++++++++++
>>>    2 files changed, 18 insertions(+)
>>>
>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>> index 4a8bc75415..fca076e3f0 100644
>>> --- a/include/hw/virtio/vhost.h
>>> +++ b/include/hw/virtio/vhost.h
>>> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>>>    void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
>>>                            uint64_t features);
>>>    bool vhost_has_free_slot(void);
>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
>>>
>>>    int vhost_net_set_backend(struct vhost_dev *hdev,
>>>                              struct vhost_vring_file *file);
>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>> index 28c7d78172..8683d507f5 100644
>>> --- a/hw/virtio/vhost.c
>>> +++ b/hw/virtio/vhost.c
>>> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
>>>        return slots_limit > used_memslots;
>>>    }
>>>
>>> +/*
>>> + * Get the vhost device associated to a VirtIO device.
>>> + */
>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
>>> +{
>>> +    struct vhost_dev *hdev;
>>> +
>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>> +        if (hdev->vdev == vdev) {
>>> +            return hdev;
>>> +        }
>>> +    }
>>> +
>>> +    assert(hdev);
>>> +    return NULL;
>>> +}
>>
>> I'm not sure this can work in the case of multiqueue. E.g vhost-net
>> multiqueue is a N:1 mapping between vhost devics and virtio devices.
>>
>> Thanks
>>
> Right. We could add an "vdev vq index" parameter to the function in
> this case, but I guess the most reliable way to do this is to add a
> vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.


So the question still, it looks like it's easier to hide the shadow 
virtqueue stuffs at vhost layer instead of expose them to virtio layer:

1) vhost protocol is stable ABI
2) no need to deal with virtio stuffs which is more complex than vhost

Or are there any advantages if we do it at virtio layer?

Thanks


>
> I need to take this into account in qmp_x_vhost_enable_shadow_vq too.
>
>>> +
>>>    static void vhost_dev_sync_region(struct vhost_dev *dev,
>>>                                      MemoryRegionSection *section,
>>>                                      uint64_t mfirst, uint64_t mlast,

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue
  2021-02-01  6:29     ` Jason Wang
  (?)
@ 2021-02-02 10:08     ` Eugenio Perez Martin
  2021-02-04  3:26         ` Jason Wang
  -1 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2021-02-02 10:08 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, Michael S. Tsirkin, Juan Quintela, qemu-level,
	Markus Armbruster, Stefano Garzarella, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, virtualization,
	Michael Lilja, Jim Harford, Rob Miller

On Mon, Feb 1, 2021 at 7:29 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> > Shadow virtqueue notifications forwarding is disabled when vhost_dev
> > stops.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >   hw/virtio/vhost-shadow-virtqueue.h |   5 ++
> >   include/hw/virtio/vhost.h          |   4 +
> >   hw/virtio/vhost-shadow-virtqueue.c | 123 +++++++++++++++++++++++++-
> >   hw/virtio/vhost.c                  | 135 ++++++++++++++++++++++++++++-
> >   4 files changed, 264 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> > index 6cc18d6acb..466f8ae595 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.h
> > +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > @@ -17,6 +17,11 @@
> >
> >   typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
> >
> > +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
> > +                               VhostShadowVirtqueue *svq);
> > +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
> > +                              VhostShadowVirtqueue *svq);
> > +
> >   VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
> >
> >   void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
> > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> > index 2be782cefd..732a4b2a2b 100644
> > --- a/include/hw/virtio/vhost.h
> > +++ b/include/hw/virtio/vhost.h
> > @@ -55,6 +55,8 @@ struct vhost_iommu {
> >       QLIST_ENTRY(vhost_iommu) iommu_next;
> >   };
> >
> > +typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
> > +
> >   typedef struct VhostDevConfigOps {
> >       /* Vhost device config space changed callback
> >        */
> > @@ -83,7 +85,9 @@ struct vhost_dev {
> >       uint64_t backend_cap;
> >       bool started;
> >       bool log_enabled;
> > +    bool sw_lm_enabled;
> >       uint64_t log_size;
> > +    VhostShadowVirtqueue **shadow_vqs;
> >       Error *migration_blocker;
> >       const VhostOps *vhost_ops;
> >       void *opaque;
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> > index c0c967a7c5..908c36c66d 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.c
> > +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > @@ -8,15 +8,129 @@
> >    */
> >
> >   #include "hw/virtio/vhost-shadow-virtqueue.h"
> > +#include "hw/virtio/vhost.h"
> > +#include "hw/virtio/virtio-access.h"
> > +
> > +#include "standard-headers/linux/vhost_types.h"
> > +#include "standard-headers/linux/virtio_ring.h"
> >
> >   #include "qemu/error-report.h"
> > -#include "qemu/event_notifier.h"
> > +#include "qemu/main-loop.h"
> >
> >   typedef struct VhostShadowVirtqueue {
> >       EventNotifier kick_notifier;
> >       EventNotifier call_notifier;
> > +    const struct vhost_virtqueue *hvq;
> > +    VirtIODevice *vdev;
> > +    VirtQueue *vq;
> >   } VhostShadowVirtqueue;
>
>
> So instead of doing things at virtio level, how about do the shadow
> stuffs at vhost level?
>
> It works like:
>
> virtio -> [shadow vhost backend] -> vhost backend
>
> Then the QMP is used to plug the shadow vhost backend in the middle or not.
>
> It looks kind of easier since we don't need to deal with virtqueue
> handlers etc.. Instead, we just need to deal with eventfd stuffs:
>
> When shadow vhost mode is enabled, we just intercept the host_notifiers
> and guest_notifiers. When it was disabled, we just pass the host/guest
> notifiers to the real vhost backends?
>

Hi Jason.

Sure we can try that model, but it seems to me that it comes with a
different set of problems.

For example, there are code in vhost.c that checks if implementations
are available in vhost_ops, like:

if (dev->vhost_ops->vhost_vq_get_addr) {
        r = dev->vhost_ops->vhost_vq_get_addr(dev, &addr, vq);
        ...
}

I can count 14 of these, checking:

dev->vhost_ops->vhost_backend_can_merge
dev->vhost_ops->vhost_backend_mem_section_filter
dev->vhost_ops->vhost_force_iommu
dev->vhost_ops->vhost_requires_shm_log
dev->vhost_ops->vhost_set_backend_cap
dev->vhost_ops->vhost_set_vring_busyloop_timeout
dev->vhost_ops->vhost_vq_get_addr
hdev->vhost_ops->vhost_dev_start
hdev->vhost_ops->vhost_get_config
hdev->vhost_ops->vhost_get_inflight_fd
hdev->vhost_ops->vhost_net_set_backend
hdev->vhost_ops->vhost_set_config
hdev->vhost_ops->vhost_set_inflight_fd
hdev->vhost_ops->vhost_set_iotlb_callback

So we should Implement all of the vhost_ops callbacks, forwarding them
to actual vhost_backed, and delete conditionally these ones? In other
words, dynamically generate the new shadow vq vhost_ops? If a new
callback is added to any vhost backend in the future, do we have to
force the adding / checking for NULL in shadow backend vhost_ops?
Would this be a good moment to check if all backends implement these
and delete the checks?

There are also checks like:

if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER)

How would shadow_vq backend expose itself? (I guess as the actual used backend).

I can modify this patchset to not relay the guest->host notifications
on vq handlers but on eventfd handlers. Although this will make it
independent of the actual virtio device kind used, I can see two
drawbacks:
* The actual fact that it makes it independent of virtio device kind.
If a device does not use the notifiers and poll the ring by itself, it
has no chance of knowing that it should stop. What happens if
virtio-net tx timer is armed when we start shadow vq?.
* The fixes (current and future) in vq notifications, like the one
currently implemented in virtio_notify_irqfd for windows drivers
regarding ISR bit 0. I think this one in particular is OK not to
carry, but I think many changes affecting any of the functions will
have to be mirrored in the other.

Thoughts on this?

Thanks!

> Thanks
>
>
> >
> > +static uint16_t vhost_shadow_vring_used_flags(VhostShadowVirtqueue *svq)
> > +{
> > +    const struct vring_used *used = svq->hvq->used;
> > +    return virtio_tswap16(svq->vdev, used->flags);
> > +}
> > +
> > +static bool vhost_shadow_vring_should_kick(VhostShadowVirtqueue *vq)
> > +{
> > +    return !(vhost_shadow_vring_used_flags(vq) & VRING_USED_F_NO_NOTIFY);
> > +}
> > +
> > +static void vhost_shadow_vring_kick(VhostShadowVirtqueue *vq)
> > +{
> > +    if (vhost_shadow_vring_should_kick(vq)) {
> > +        event_notifier_set(&vq->kick_notifier);
> > +    }
> > +}
> > +
> > +static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > +    struct vhost_dev *hdev = vhost_dev_from_virtio(vdev);
> > +    uint16_t idx = virtio_get_queue_index(vq);
> > +
> > +    VhostShadowVirtqueue *svq = hdev->shadow_vqs[idx];
> > +
> > +    vhost_shadow_vring_kick(svq);
> > +}
> > +
> > +/*
> > + * Start shadow virtqueue operation.
> > + * @dev vhost device
> > + * @svq Shadow Virtqueue
> > + *
> > + * Run in RCU context
> > + */
> > +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
> > +                               VhostShadowVirtqueue *svq)
> > +{
> > +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
> > +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> > +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
> > +    struct vhost_vring_file kick_file = {
> > +        .index = idx,
> > +        .fd = event_notifier_get_fd(&svq->kick_notifier),
> > +    };
> > +    int r;
> > +    bool ok;
> > +
> > +    /* Check that notifications are still going directly to vhost dev */
> > +    assert(virtio_queue_host_notifier_status(svq->vq));
> > +
> > +    ok = k->set_vq_handler(dev->vdev, idx, handle_shadow_vq);
> > +    if (!ok) {
> > +        error_report("Couldn't set the vq handler");
> > +        goto err_set_kick_handler;
> > +    }
> > +
> > +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
> > +    if (r != 0) {
> > +        error_report("Couldn't set kick fd: %s", strerror(errno));
> > +        goto err_set_vring_kick;
> > +    }
> > +
> > +    event_notifier_set_handler(vq_host_notifier,
> > +                               virtio_queue_host_notifier_read);
> > +    virtio_queue_set_host_notifier_enabled(svq->vq, false);
> > +    virtio_queue_host_notifier_read(vq_host_notifier);
> > +
> > +    return true;
> > +
> > +err_set_vring_kick:
> > +    k->set_vq_handler(dev->vdev, idx, NULL);
> > +
> > +err_set_kick_handler:
> > +    return false;
> > +}
> > +
> > +/*
> > + * Stop shadow virtqueue operation.
> > + * @dev vhost device
> > + * @svq Shadow Virtqueue
> > + *
> > + * Run in RCU context
> > + */
> > +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
> > +                              VhostShadowVirtqueue *svq)
> > +{
> > +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(svq->vdev);
> > +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
> > +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> > +    struct vhost_vring_file kick_file = {
> > +        .index = idx,
> > +        .fd = event_notifier_get_fd(vq_host_notifier),
> > +    };
> > +    int r;
> > +
> > +    /* Restore vhost kick */
> > +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
> > +    /* Cannot do a lot of things */
> > +    assert(r == 0);
> > +
> > +    event_notifier_set_handler(vq_host_notifier, NULL);
> > +    virtio_queue_set_host_notifier_enabled(svq->vq, true);
> > +    k->set_vq_handler(svq->vdev, idx, NULL);
> > +}
> > +
> >   /*
> >    * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
> >    * methods and file descriptors.
> > @@ -24,8 +138,13 @@ typedef struct VhostShadowVirtqueue {
> >   VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
> >   {
> >       g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
> > +    int vq_idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + idx);
> >       int r;
> >
> > +    svq->vq = virtio_get_queue(dev->vdev, vq_idx);
> > +    svq->hvq = &dev->vqs[idx];
> > +    svq->vdev = dev->vdev;
> > +
> >       r = event_notifier_init(&svq->kick_notifier, 0);
> >       if (r != 0) {
> >           error_report("Couldn't create kick event notifier: %s",
> > @@ -40,7 +159,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
> >           goto err_init_call_notifier;
> >       }
> >
> > -    return svq;
> > +    return g_steal_pointer(&svq);
> >
> >   err_init_call_notifier:
> >       event_notifier_cleanup(&svq->kick_notifier);
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 42836e45f3..bde688f278 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -25,6 +25,7 @@
> >   #include "exec/address-spaces.h"
> >   #include "hw/virtio/virtio-bus.h"
> >   #include "hw/virtio/virtio-access.h"
> > +#include "hw/virtio/vhost-shadow-virtqueue.h"
> >   #include "migration/blocker.h"
> >   #include "migration/qemu-file-types.h"
> >   #include "sysemu/dma.h"
> > @@ -945,6 +946,82 @@ static void vhost_log_global_stop(MemoryListener *listener)
> >       }
> >   }
> >
> > +static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
> > +{
> > +    int idx;
> > +
> > +    WITH_RCU_READ_LOCK_GUARD() {
> > +        dev->sw_lm_enabled = false;
> > +
> > +        for (idx = 0; idx < dev->nvqs; ++idx) {
> > +            vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[idx]);
> > +        }
> > +    }
> > +
> > +    for (idx = 0; idx < dev->nvqs; ++idx) {
> > +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> > +    }
> > +
> > +    g_free(dev->shadow_vqs);
> > +    dev->shadow_vqs = NULL;
> > +    return 0;
> > +}
> > +
> > +static int vhost_sw_live_migration_start(struct vhost_dev *dev)
> > +{
> > +    int idx;
> > +
> > +    dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs);
> > +    for (idx = 0; idx < dev->nvqs; ++idx) {
> > +        dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx);
> > +        if (unlikely(dev->shadow_vqs[idx] == NULL)) {
> > +            goto err;
> > +        }
> > +    }
> > +
> > +    WITH_RCU_READ_LOCK_GUARD() {
> > +        for (idx = 0; idx < dev->nvqs; ++idx) {
> > +            int stop_idx = idx;
> > +            bool ok = vhost_shadow_vq_start_rcu(dev,
> > +                                                dev->shadow_vqs[idx]);
> > +
> > +            if (!ok) {
> > +                while (--stop_idx >= 0) {
> > +                    vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[stop_idx]);
> > +                }
> > +
> > +                goto err;
> > +            }
> > +        }
> > +    }
> > +
> > +    dev->sw_lm_enabled = true;
> > +    return 0;
> > +
> > +err:
> > +    for (; idx >= 0; --idx) {
> > +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> > +    }
> > +    g_free(dev->shadow_vqs[idx]);
> > +
> > +    return -1;
> > +}
> > +
> > +static int vhost_sw_live_migration_enable(struct vhost_dev *dev,
> > +                                          bool enable_lm)
> > +{
> > +    int r;
> > +
> > +    if (enable_lm == dev->sw_lm_enabled) {
> > +        return 0;
> > +    }
> > +
> > +    r = enable_lm ? vhost_sw_live_migration_start(dev)
> > +                  : vhost_sw_live_migration_stop(dev);
> > +
> > +    return r;
> > +}
> > +
> >   static void vhost_log_start(MemoryListener *listener,
> >                               MemoryRegionSection *section,
> >                               int old, int new)
> > @@ -1389,6 +1466,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> >       hdev->log = NULL;
> >       hdev->log_size = 0;
> >       hdev->log_enabled = false;
> > +    hdev->sw_lm_enabled = false;
> >       hdev->started = false;
> >       memory_listener_register(&hdev->memory_listener, &address_space_memory);
> >       QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
> > @@ -1816,6 +1894,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
> >           hdev->vhost_ops->vhost_dev_start(hdev, false);
> >       }
> >       for (i = 0; i < hdev->nvqs; ++i) {
> > +        if (hdev->sw_lm_enabled) {
> > +            vhost_shadow_vq_stop_rcu(hdev, hdev->shadow_vqs[i]);
> > +            vhost_shadow_vq_free(hdev->shadow_vqs[i]);
> > +        }
> > +
> >           vhost_virtqueue_stop(hdev,
> >                                vdev,
> >                                hdev->vqs + i,
> > @@ -1829,6 +1912,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
> >           memory_listener_unregister(&hdev->iommu_listener);
> >       }
> >       vhost_log_put(hdev, true);
> > +    g_free(hdev->shadow_vqs);
> > +    hdev->sw_lm_enabled = false;
> >       hdev->started = false;
> >       hdev->vdev = NULL;
> >   }
> > @@ -1845,5 +1930,53 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
> >
> >   void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
> >   {
> > -    error_setg(errp, "Shadow virtqueue still not implemented.");
> > +    struct vhost_dev *hdev;
> > +    const char *err_cause = NULL;
> > +    const VirtioDeviceClass *k;
> > +    int r;
> > +    ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR;
> > +
> > +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
> > +        if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) {
> > +            break;
> > +        }
> > +    }
> > +
> > +    if (!hdev) {
> > +        err_class = ERROR_CLASS_DEVICE_NOT_FOUND;
> > +        err_cause = "Device not found";
> > +        goto err;
> > +    }
> > +
> > +    if (!hdev->started) {
> > +        err_cause = "Device is not started";
> > +        goto err;
> > +    }
> > +
> > +    if (hdev->acked_features & BIT_ULL(VIRTIO_F_RING_PACKED)) {
> > +        err_cause = "Use packed vq";
> > +        goto err;
> > +    }
> > +
> > +    if (vhost_dev_has_iommu(hdev)) {
> > +        err_cause = "Device use IOMMU";
> > +        goto err;
> > +    }
> > +
> > +    k = VIRTIO_DEVICE_GET_CLASS(hdev->vdev);
> > +    if (!k->set_vq_handler) {
> > +        err_cause = "Virtio device type does not support reset of vq handler";
> > +        goto err;
> > +    }
> > +
> > +    r = vhost_sw_live_migration_enable(hdev, enable);
> > +    if (unlikely(r)) {
> > +        err_cause = "Error enabling (see monitor)";
> > +    }
> > +
> > +err:
> > +    if (err_cause) {
> > +        error_set(errp, err_class,
> > +                  "Can't enable shadow vq on %s: %s", name, err_cause);
> > +    }
> >   }
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
  2021-02-02  3:31         ` Jason Wang
  (?)
@ 2021-02-02 10:17         ` Eugenio Perez Martin
  2021-02-04  3:14             ` Jason Wang
  -1 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2021-02-02 10:17 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, Michael S. Tsirkin, Juan Quintela, qemu-level,
	Markus Armbruster, Stefano Garzarella, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, virtualization,
	Michael Lilja, Jim Harford, Rob Miller

On Tue, Feb 2, 2021 at 4:31 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2021/2/1 下午4:28, Eugenio Perez Martin wrote:
> > On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>>    include/hw/virtio/vhost.h |  1 +
> >>>    hw/virtio/vhost.c         | 17 +++++++++++++++++
> >>>    2 files changed, 18 insertions(+)
> >>>
> >>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> >>> index 4a8bc75415..fca076e3f0 100644
> >>> --- a/include/hw/virtio/vhost.h
> >>> +++ b/include/hw/virtio/vhost.h
> >>> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
> >>>    void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
> >>>                            uint64_t features);
> >>>    bool vhost_has_free_slot(void);
> >>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
> >>>
> >>>    int vhost_net_set_backend(struct vhost_dev *hdev,
> >>>                              struct vhost_vring_file *file);
> >>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >>> index 28c7d78172..8683d507f5 100644
> >>> --- a/hw/virtio/vhost.c
> >>> +++ b/hw/virtio/vhost.c
> >>> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
> >>>        return slots_limit > used_memslots;
> >>>    }
> >>>
> >>> +/*
> >>> + * Get the vhost device associated to a VirtIO device.
> >>> + */
> >>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
> >>> +{
> >>> +    struct vhost_dev *hdev;
> >>> +
> >>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
> >>> +        if (hdev->vdev == vdev) {
> >>> +            return hdev;
> >>> +        }
> >>> +    }
> >>> +
> >>> +    assert(hdev);
> >>> +    return NULL;
> >>> +}
> >>
> >> I'm not sure this can work in the case of multiqueue. E.g vhost-net
> >> multiqueue is a N:1 mapping between vhost devics and virtio devices.
> >>
> >> Thanks
> >>
> > Right. We could add an "vdev vq index" parameter to the function in
> > this case, but I guess the most reliable way to do this is to add a
> > vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.
>
>
> So the question still, it looks like it's easier to hide the shadow
> virtqueue stuffs at vhost layer instead of expose them to virtio layer:
>
> 1) vhost protocol is stable ABI
> 2) no need to deal with virtio stuffs which is more complex than vhost
>
> Or are there any advantages if we do it at virtio layer?
>

As far as I can tell, we will need the virtio layer the moment we
start copying/translating buffers.

In this series, the virtio dependency can be reduced if qemu does not
check the used ring _F_NO_NOTIFY flag before writing to irqfd. It
would enable packed queues and IOMMU immediately, and I think the cost
should not be so high. In the previous RFC this check was deleted
later anyway, so I think it was a bad idea to include it from the start.





> Thanks
>
>
> >
> > I need to take this into account in qmp_x_vhost_enable_shadow_vq too.
> >
> >>> +
> >>>    static void vhost_dev_sync_region(struct vhost_dev *dev,
> >>>                                      MemoryRegionSection *section,
> >>>                                      uint64_t mfirst, uint64_t mlast,
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 08/10] vhost: Add x-vhost-enable-shadow-vq qmp
  2021-01-29 20:54 ` [RFC 08/10] vhost: Add x-vhost-enable-shadow-vq qmp Eugenio Pérez
@ 2021-02-02 15:38     ` Eric Blake
  0 siblings, 0 replies; 42+ messages in thread
From: Eric Blake @ 2021-02-02 15:38 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	Markus Armbruster, virtualization, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, Stefano Garzarella,
	Michael Lilja, Jim Harford, Rob Miller

On 1/29/21 2:54 PM, Eugenio Pérez wrote:
> Command to enable shadow virtqueue looks like:
> 
> { "execute": "x-vhost-enable-shadow-vq", "arguments": { "name": "dev0", "enable": true } }
> 
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  qapi/net.json     | 23 +++++++++++++++++++++++
>  hw/virtio/vhost.c |  6 ++++++
>  2 files changed, 29 insertions(+)
> 
> diff --git a/qapi/net.json b/qapi/net.json
> index c31748c87f..6170d69798 100644
> --- a/qapi/net.json
> +++ b/qapi/net.json
> @@ -77,6 +77,29 @@
>  ##
>  { 'command': 'netdev_del', 'data': {'id': 'str'} }
>  
> +##
> +# @x-vhost-enable-shadow-vq:

This spelling is the preferred form...[1]

> +#
> +# Use vhost shadow virtqueue.
> +#
> +# @name: the device name of the virtual network adapter
> +#
> +# @enable: true to use he alternate shadow VQ notification path
> +#
> +# Returns: Error if failure, or 'no error' for success

This line...[2]

> +#
> +# Since: 5.3

The next release is 6.0, not 5.3.

> +#
> +# Example:
> +#
> +# -> { "execute": "x-vhost_enable_shadow_vq", "arguments": {"enable": true} }

[1]...but doesn't match the example.

> +# <- { "return": { "enabled" : true } }

[2]...doesn't match this comment.  I'd just drop the line, since there
is no explicit return listed.

> +#
> +##
> +{ 'command': 'x-vhost-enable-shadow-vq',
> +  'data': {'name': 'str', 'enable': 'bool'},
> +  'if': 'defined(CONFIG_VHOST_KERNEL)' }
> +
>  ##
>  # @NetLegacyNicOptions:
>  #
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 040f68ff2e..42836e45f3 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -15,6 +15,7 @@
>  
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
> +#include "qapi/qapi-commands-net.h"
>  #include "hw/virtio/vhost.h"
>  #include "qemu/atomic.h"
>  #include "qemu/range.h"
> @@ -1841,3 +1842,8 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>  
>      return -1;
>  }
> +
> +void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
> +{
> +    error_setg(errp, "Shadow virtqueue still not implemented.");

error_setg() should not be passed a trailing '.'.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 08/10] vhost: Add x-vhost-enable-shadow-vq qmp
@ 2021-02-02 15:38     ` Eric Blake
  0 siblings, 0 replies; 42+ messages in thread
From: Eric Blake @ 2021-02-02 15:38 UTC (permalink / raw)
  To: Eugenio Pérez, qemu-devel
  Cc: Parav Pandit, Michael S. Tsirkin, virtualization,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	Michael Lilja, Jim Harford, Rob Miller

On 1/29/21 2:54 PM, Eugenio Pérez wrote:
> Command to enable shadow virtqueue looks like:
> 
> { "execute": "x-vhost-enable-shadow-vq", "arguments": { "name": "dev0", "enable": true } }
> 
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>  qapi/net.json     | 23 +++++++++++++++++++++++
>  hw/virtio/vhost.c |  6 ++++++
>  2 files changed, 29 insertions(+)
> 
> diff --git a/qapi/net.json b/qapi/net.json
> index c31748c87f..6170d69798 100644
> --- a/qapi/net.json
> +++ b/qapi/net.json
> @@ -77,6 +77,29 @@
>  ##
>  { 'command': 'netdev_del', 'data': {'id': 'str'} }
>  
> +##
> +# @x-vhost-enable-shadow-vq:

This spelling is the preferred form...[1]

> +#
> +# Use vhost shadow virtqueue.
> +#
> +# @name: the device name of the virtual network adapter
> +#
> +# @enable: true to use he alternate shadow VQ notification path
> +#
> +# Returns: Error if failure, or 'no error' for success

This line...[2]

> +#
> +# Since: 5.3

The next release is 6.0, not 5.3.

> +#
> +# Example:
> +#
> +# -> { "execute": "x-vhost_enable_shadow_vq", "arguments": {"enable": true} }

[1]...but doesn't match the example.

> +# <- { "return": { "enabled" : true } }

[2]...doesn't match this comment.  I'd just drop the line, since there
is no explicit return listed.

> +#
> +##
> +{ 'command': 'x-vhost-enable-shadow-vq',
> +  'data': {'name': 'str', 'enable': 'bool'},
> +  'if': 'defined(CONFIG_VHOST_KERNEL)' }
> +
>  ##
>  # @NetLegacyNicOptions:
>  #
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 040f68ff2e..42836e45f3 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -15,6 +15,7 @@
>  
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
> +#include "qapi/qapi-commands-net.h"
>  #include "hw/virtio/vhost.h"
>  #include "qemu/atomic.h"
>  #include "qemu/range.h"
> @@ -1841,3 +1842,8 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>  
>      return -1;
>  }
> +
> +void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
> +{
> +    error_setg(errp, "Shadow virtqueue still not implemented.");

error_setg() should not be passed a trailing '.'.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
  2021-02-02 10:17         ` Eugenio Perez Martin
@ 2021-02-04  3:14             ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-04  3:14 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Rob Miller, Parav Pandit, Juan Quintela, Michael S. Tsirkin,
	qemu-level, Markus Armbruster, Harpreet Singh Anand, Xiao W Wang,
	Stefan Hajnoczi, Eli Cohen, virtualization, Michael Lilja,
	Jim Harford, Stefano Garzarella


On 2021/2/2 下午6:17, Eugenio Perez Martin wrote:
> On Tue, Feb 2, 2021 at 4:31 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/2/1 下午4:28, Eugenio Perez Martin wrote:
>>> On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>>     include/hw/virtio/vhost.h |  1 +
>>>>>     hw/virtio/vhost.c         | 17 +++++++++++++++++
>>>>>     2 files changed, 18 insertions(+)
>>>>>
>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>>> index 4a8bc75415..fca076e3f0 100644
>>>>> --- a/include/hw/virtio/vhost.h
>>>>> +++ b/include/hw/virtio/vhost.h
>>>>> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>     void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>                             uint64_t features);
>>>>>     bool vhost_has_free_slot(void);
>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
>>>>>
>>>>>     int vhost_net_set_backend(struct vhost_dev *hdev,
>>>>>                               struct vhost_vring_file *file);
>>>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>>>> index 28c7d78172..8683d507f5 100644
>>>>> --- a/hw/virtio/vhost.c
>>>>> +++ b/hw/virtio/vhost.c
>>>>> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
>>>>>         return slots_limit > used_memslots;
>>>>>     }
>>>>>
>>>>> +/*
>>>>> + * Get the vhost device associated to a VirtIO device.
>>>>> + */
>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
>>>>> +{
>>>>> +    struct vhost_dev *hdev;
>>>>> +
>>>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>>>> +        if (hdev->vdev == vdev) {
>>>>> +            return hdev;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    assert(hdev);
>>>>> +    return NULL;
>>>>> +}
>>>> I'm not sure this can work in the case of multiqueue. E.g vhost-net
>>>> multiqueue is a N:1 mapping between vhost devics and virtio devices.
>>>>
>>>> Thanks
>>>>
>>> Right. We could add an "vdev vq index" parameter to the function in
>>> this case, but I guess the most reliable way to do this is to add a
>>> vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.
>>
>> So the question still, it looks like it's easier to hide the shadow
>> virtqueue stuffs at vhost layer instead of expose them to virtio layer:
>>
>> 1) vhost protocol is stable ABI
>> 2) no need to deal with virtio stuffs which is more complex than vhost
>>
>> Or are there any advantages if we do it at virtio layer?
>>
> As far as I can tell, we will need the virtio layer the moment we
> start copying/translating buffers.
>
> In this series, the virtio dependency can be reduced if qemu does not
> check the used ring _F_NO_NOTIFY flag before writing to irqfd. It
> would enable packed queues and IOMMU immediately, and I think the cost
> should not be so high. In the previous RFC this check was deleted
> later anyway, so I think it was a bad idea to include it from the start.


I am not sure I understand here. For vhost, we can still do anything we 
want, e.g accessing guest memory etc. Any blocker that prevent us from 
copying/translating buffers? (Note that qemu will propagate memory 
mappings to vhost).

Thanks


>
>
>
>
>
>> Thanks
>>
>>
>>> I need to take this into account in qmp_x_vhost_enable_shadow_vq too.
>>>
>>>>> +
>>>>>     static void vhost_dev_sync_region(struct vhost_dev *dev,
>>>>>                                       MemoryRegionSection *section,
>>>>>                                       uint64_t mfirst, uint64_t mlast,
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
@ 2021-02-04  3:14             ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-04  3:14 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Rob Miller, Parav Pandit, Michael S. Tsirkin, qemu-level,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	virtualization, Michael Lilja, Jim Harford


On 2021/2/2 下午6:17, Eugenio Perez Martin wrote:
> On Tue, Feb 2, 2021 at 4:31 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/2/1 下午4:28, Eugenio Perez Martin wrote:
>>> On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>>     include/hw/virtio/vhost.h |  1 +
>>>>>     hw/virtio/vhost.c         | 17 +++++++++++++++++
>>>>>     2 files changed, 18 insertions(+)
>>>>>
>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>>> index 4a8bc75415..fca076e3f0 100644
>>>>> --- a/include/hw/virtio/vhost.h
>>>>> +++ b/include/hw/virtio/vhost.h
>>>>> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>     void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>                             uint64_t features);
>>>>>     bool vhost_has_free_slot(void);
>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
>>>>>
>>>>>     int vhost_net_set_backend(struct vhost_dev *hdev,
>>>>>                               struct vhost_vring_file *file);
>>>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>>>> index 28c7d78172..8683d507f5 100644
>>>>> --- a/hw/virtio/vhost.c
>>>>> +++ b/hw/virtio/vhost.c
>>>>> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
>>>>>         return slots_limit > used_memslots;
>>>>>     }
>>>>>
>>>>> +/*
>>>>> + * Get the vhost device associated to a VirtIO device.
>>>>> + */
>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
>>>>> +{
>>>>> +    struct vhost_dev *hdev;
>>>>> +
>>>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>>>> +        if (hdev->vdev == vdev) {
>>>>> +            return hdev;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    assert(hdev);
>>>>> +    return NULL;
>>>>> +}
>>>> I'm not sure this can work in the case of multiqueue. E.g vhost-net
>>>> multiqueue is a N:1 mapping between vhost devics and virtio devices.
>>>>
>>>> Thanks
>>>>
>>> Right. We could add an "vdev vq index" parameter to the function in
>>> this case, but I guess the most reliable way to do this is to add a
>>> vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.
>>
>> So the question still, it looks like it's easier to hide the shadow
>> virtqueue stuffs at vhost layer instead of expose them to virtio layer:
>>
>> 1) vhost protocol is stable ABI
>> 2) no need to deal with virtio stuffs which is more complex than vhost
>>
>> Or are there any advantages if we do it at virtio layer?
>>
> As far as I can tell, we will need the virtio layer the moment we
> start copying/translating buffers.
>
> In this series, the virtio dependency can be reduced if qemu does not
> check the used ring _F_NO_NOTIFY flag before writing to irqfd. It
> would enable packed queues and IOMMU immediately, and I think the cost
> should not be so high. In the previous RFC this check was deleted
> later anyway, so I think it was a bad idea to include it from the start.


I am not sure I understand here. For vhost, we can still do anything we 
want, e.g accessing guest memory etc. Any blocker that prevent us from 
copying/translating buffers? (Note that qemu will propagate memory 
mappings to vhost).

Thanks


>
>
>
>
>
>> Thanks
>>
>>
>>> I need to take this into account in qmp_x_vhost_enable_shadow_vq too.
>>>
>>>>> +
>>>>>     static void vhost_dev_sync_region(struct vhost_dev *dev,
>>>>>                                       MemoryRegionSection *section,
>>>>>                                       uint64_t mfirst, uint64_t mlast,
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue
  2021-02-02 10:08     ` Eugenio Perez Martin
@ 2021-02-04  3:26         ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-04  3:26 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Parav Pandit, Michael S. Tsirkin, Juan Quintela, qemu-level,
	Markus Armbruster, Stefano Garzarella, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, virtualization,
	Michael Lilja, Jim Harford, Rob Miller


On 2021/2/2 下午6:08, Eugenio Perez Martin wrote:
> On Mon, Feb 1, 2021 at 7:29 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>> Shadow virtqueue notifications forwarding is disabled when vhost_dev
>>> stops.
>>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>>    hw/virtio/vhost-shadow-virtqueue.h |   5 ++
>>>    include/hw/virtio/vhost.h          |   4 +
>>>    hw/virtio/vhost-shadow-virtqueue.c | 123 +++++++++++++++++++++++++-
>>>    hw/virtio/vhost.c                  | 135 ++++++++++++++++++++++++++++-
>>>    4 files changed, 264 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
>>> index 6cc18d6acb..466f8ae595 100644
>>> --- a/hw/virtio/vhost-shadow-virtqueue.h
>>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
>>> @@ -17,6 +17,11 @@
>>>
>>>    typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
>>>
>>> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
>>> +                               VhostShadowVirtqueue *svq);
>>> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
>>> +                              VhostShadowVirtqueue *svq);
>>> +
>>>    VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
>>>
>>>    void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>> index 2be782cefd..732a4b2a2b 100644
>>> --- a/include/hw/virtio/vhost.h
>>> +++ b/include/hw/virtio/vhost.h
>>> @@ -55,6 +55,8 @@ struct vhost_iommu {
>>>        QLIST_ENTRY(vhost_iommu) iommu_next;
>>>    };
>>>
>>> +typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
>>> +
>>>    typedef struct VhostDevConfigOps {
>>>        /* Vhost device config space changed callback
>>>         */
>>> @@ -83,7 +85,9 @@ struct vhost_dev {
>>>        uint64_t backend_cap;
>>>        bool started;
>>>        bool log_enabled;
>>> +    bool sw_lm_enabled;
>>>        uint64_t log_size;
>>> +    VhostShadowVirtqueue **shadow_vqs;
>>>        Error *migration_blocker;
>>>        const VhostOps *vhost_ops;
>>>        void *opaque;
>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
>>> index c0c967a7c5..908c36c66d 100644
>>> --- a/hw/virtio/vhost-shadow-virtqueue.c
>>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
>>> @@ -8,15 +8,129 @@
>>>     */
>>>
>>>    #include "hw/virtio/vhost-shadow-virtqueue.h"
>>> +#include "hw/virtio/vhost.h"
>>> +#include "hw/virtio/virtio-access.h"
>>> +
>>> +#include "standard-headers/linux/vhost_types.h"
>>> +#include "standard-headers/linux/virtio_ring.h"
>>>
>>>    #include "qemu/error-report.h"
>>> -#include "qemu/event_notifier.h"
>>> +#include "qemu/main-loop.h"
>>>
>>>    typedef struct VhostShadowVirtqueue {
>>>        EventNotifier kick_notifier;
>>>        EventNotifier call_notifier;
>>> +    const struct vhost_virtqueue *hvq;
>>> +    VirtIODevice *vdev;
>>> +    VirtQueue *vq;
>>>    } VhostShadowVirtqueue;
>>
>> So instead of doing things at virtio level, how about do the shadow
>> stuffs at vhost level?
>>
>> It works like:
>>
>> virtio -> [shadow vhost backend] -> vhost backend
>>
>> Then the QMP is used to plug the shadow vhost backend in the middle or not.
>>
>> It looks kind of easier since we don't need to deal with virtqueue
>> handlers etc.. Instead, we just need to deal with eventfd stuffs:
>>
>> When shadow vhost mode is enabled, we just intercept the host_notifiers
>> and guest_notifiers. When it was disabled, we just pass the host/guest
>> notifiers to the real vhost backends?
>>
> Hi Jason.
>
> Sure we can try that model, but it seems to me that it comes with a
> different set of problems.
>
> For example, there are code in vhost.c that checks if implementations
> are available in vhost_ops, like:
>
> if (dev->vhost_ops->vhost_vq_get_addr) {
>          r = dev->vhost_ops->vhost_vq_get_addr(dev, &addr, vq);
>          ...
> }
>
> I can count 14 of these, checking:
>
> dev->vhost_ops->vhost_backend_can_merge
> dev->vhost_ops->vhost_backend_mem_section_filter
> dev->vhost_ops->vhost_force_iommu
> dev->vhost_ops->vhost_requires_shm_log
> dev->vhost_ops->vhost_set_backend_cap
> dev->vhost_ops->vhost_set_vring_busyloop_timeout
> dev->vhost_ops->vhost_vq_get_addr
> hdev->vhost_ops->vhost_dev_start
> hdev->vhost_ops->vhost_get_config
> hdev->vhost_ops->vhost_get_inflight_fd
> hdev->vhost_ops->vhost_net_set_backend
> hdev->vhost_ops->vhost_set_config
> hdev->vhost_ops->vhost_set_inflight_fd
> hdev->vhost_ops->vhost_set_iotlb_callback
>
> So we should Implement all of the vhost_ops callbacks, forwarding them
> to actual vhost_backed, and delete conditionally these ones? In other
> words, dynamically generate the new shadow vq vhost_ops? If a new
> callback is added to any vhost backend in the future, do we have to
> force the adding / checking for NULL in shadow backend vhost_ops?
> Would this be a good moment to check if all backends implement these
> and delete the checks?


I think it won't be easy if we want to support all kinds of vhost 
backends from the start. So we can go with vhost-vdpa one first.

Actually how it work might be something like (no need to switch 
vhost_ops, we can do everything silently in the ops)

1) when device to switch to shadow vq (e.g via QMP)
2) vhost-vdpa will stop and sync state (last_avail_idx) internally
3) reset vhost-vdpa, clean call and kick eventfd
4) allocate vqs for vhost-vdpa, new call and kick eventfd, restart 
vhost-vdpa
5) start the shadow vq (make it start for last_avail_idx)
6) intercept ioeventfd and forward the request to callfd
7) intercept callfd and forward the request to irqfd
8) forward request between shadow virtqueue and vhost-vdpa


>
> There are also checks like:
>
> if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER)
>
> How would shadow_vq backend expose itself? (I guess as the actual used backend).
>
> I can modify this patchset to not relay the guest->host notifications
> on vq handlers but on eventfd handlers. Although this will make it
> independent of the actual virtio device kind used, I can see two
> drawbacks:
> * The actual fact that it makes it independent of virtio device kind.
> If a device does not use the notifiers and poll the ring by itself, it
> has no chance of knowing that it should stop. What happens if
> virtio-net tx timer is armed when we start shadow vq?.


So if we do that in vhost level, it's a vhost backend from the virtio 
layer. Then we don't need to worry about tx timer stuffs.


> * The fixes (current and future) in vq notifications, like the one
> currently implemented in virtio_notify_irqfd for windows drivers
> regarding ISR bit 0. I think this one in particular is OK not to
> carry, but I think many changes affecting any of the functions will
> have to be mirrored in the other.


Consider we behave like a vhost, it just work as in the past for other 
type of vhost backends when MSI-X is not enabled?

Thanks


>
> Thoughts on this?
>
> Thanks!
>
>> Thanks
>>
>>
>>> +static uint16_t vhost_shadow_vring_used_flags(VhostShadowVirtqueue *svq)
>>> +{
>>> +    const struct vring_used *used = svq->hvq->used;
>>> +    return virtio_tswap16(svq->vdev, used->flags);
>>> +}
>>> +
>>> +static bool vhost_shadow_vring_should_kick(VhostShadowVirtqueue *vq)
>>> +{
>>> +    return !(vhost_shadow_vring_used_flags(vq) & VRING_USED_F_NO_NOTIFY);
>>> +}
>>> +
>>> +static void vhost_shadow_vring_kick(VhostShadowVirtqueue *vq)
>>> +{
>>> +    if (vhost_shadow_vring_should_kick(vq)) {
>>> +        event_notifier_set(&vq->kick_notifier);
>>> +    }
>>> +}
>>> +
>>> +static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
>>> +{
>>> +    struct vhost_dev *hdev = vhost_dev_from_virtio(vdev);
>>> +    uint16_t idx = virtio_get_queue_index(vq);
>>> +
>>> +    VhostShadowVirtqueue *svq = hdev->shadow_vqs[idx];
>>> +
>>> +    vhost_shadow_vring_kick(svq);
>>> +}
>>> +
>>> +/*
>>> + * Start shadow virtqueue operation.
>>> + * @dev vhost device
>>> + * @svq Shadow Virtqueue
>>> + *
>>> + * Run in RCU context
>>> + */
>>> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
>>> +                               VhostShadowVirtqueue *svq)
>>> +{
>>> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
>>> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
>>> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
>>> +    struct vhost_vring_file kick_file = {
>>> +        .index = idx,
>>> +        .fd = event_notifier_get_fd(&svq->kick_notifier),
>>> +    };
>>> +    int r;
>>> +    bool ok;
>>> +
>>> +    /* Check that notifications are still going directly to vhost dev */
>>> +    assert(virtio_queue_host_notifier_status(svq->vq));
>>> +
>>> +    ok = k->set_vq_handler(dev->vdev, idx, handle_shadow_vq);
>>> +    if (!ok) {
>>> +        error_report("Couldn't set the vq handler");
>>> +        goto err_set_kick_handler;
>>> +    }
>>> +
>>> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
>>> +    if (r != 0) {
>>> +        error_report("Couldn't set kick fd: %s", strerror(errno));
>>> +        goto err_set_vring_kick;
>>> +    }
>>> +
>>> +    event_notifier_set_handler(vq_host_notifier,
>>> +                               virtio_queue_host_notifier_read);
>>> +    virtio_queue_set_host_notifier_enabled(svq->vq, false);
>>> +    virtio_queue_host_notifier_read(vq_host_notifier);
>>> +
>>> +    return true;
>>> +
>>> +err_set_vring_kick:
>>> +    k->set_vq_handler(dev->vdev, idx, NULL);
>>> +
>>> +err_set_kick_handler:
>>> +    return false;
>>> +}
>>> +
>>> +/*
>>> + * Stop shadow virtqueue operation.
>>> + * @dev vhost device
>>> + * @svq Shadow Virtqueue
>>> + *
>>> + * Run in RCU context
>>> + */
>>> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
>>> +                              VhostShadowVirtqueue *svq)
>>> +{
>>> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(svq->vdev);
>>> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
>>> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
>>> +    struct vhost_vring_file kick_file = {
>>> +        .index = idx,
>>> +        .fd = event_notifier_get_fd(vq_host_notifier),
>>> +    };
>>> +    int r;
>>> +
>>> +    /* Restore vhost kick */
>>> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
>>> +    /* Cannot do a lot of things */
>>> +    assert(r == 0);
>>> +
>>> +    event_notifier_set_handler(vq_host_notifier, NULL);
>>> +    virtio_queue_set_host_notifier_enabled(svq->vq, true);
>>> +    k->set_vq_handler(svq->vdev, idx, NULL);
>>> +}
>>> +
>>>    /*
>>>     * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
>>>     * methods and file descriptors.
>>> @@ -24,8 +138,13 @@ typedef struct VhostShadowVirtqueue {
>>>    VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>>>    {
>>>        g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
>>> +    int vq_idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + idx);
>>>        int r;
>>>
>>> +    svq->vq = virtio_get_queue(dev->vdev, vq_idx);
>>> +    svq->hvq = &dev->vqs[idx];
>>> +    svq->vdev = dev->vdev;
>>> +
>>>        r = event_notifier_init(&svq->kick_notifier, 0);
>>>        if (r != 0) {
>>>            error_report("Couldn't create kick event notifier: %s",
>>> @@ -40,7 +159,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>>>            goto err_init_call_notifier;
>>>        }
>>>
>>> -    return svq;
>>> +    return g_steal_pointer(&svq);
>>>
>>>    err_init_call_notifier:
>>>        event_notifier_cleanup(&svq->kick_notifier);
>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>> index 42836e45f3..bde688f278 100644
>>> --- a/hw/virtio/vhost.c
>>> +++ b/hw/virtio/vhost.c
>>> @@ -25,6 +25,7 @@
>>>    #include "exec/address-spaces.h"
>>>    #include "hw/virtio/virtio-bus.h"
>>>    #include "hw/virtio/virtio-access.h"
>>> +#include "hw/virtio/vhost-shadow-virtqueue.h"
>>>    #include "migration/blocker.h"
>>>    #include "migration/qemu-file-types.h"
>>>    #include "sysemu/dma.h"
>>> @@ -945,6 +946,82 @@ static void vhost_log_global_stop(MemoryListener *listener)
>>>        }
>>>    }
>>>
>>> +static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
>>> +{
>>> +    int idx;
>>> +
>>> +    WITH_RCU_READ_LOCK_GUARD() {
>>> +        dev->sw_lm_enabled = false;
>>> +
>>> +        for (idx = 0; idx < dev->nvqs; ++idx) {
>>> +            vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[idx]);
>>> +        }
>>> +    }
>>> +
>>> +    for (idx = 0; idx < dev->nvqs; ++idx) {
>>> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
>>> +    }
>>> +
>>> +    g_free(dev->shadow_vqs);
>>> +    dev->shadow_vqs = NULL;
>>> +    return 0;
>>> +}
>>> +
>>> +static int vhost_sw_live_migration_start(struct vhost_dev *dev)
>>> +{
>>> +    int idx;
>>> +
>>> +    dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs);
>>> +    for (idx = 0; idx < dev->nvqs; ++idx) {
>>> +        dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx);
>>> +        if (unlikely(dev->shadow_vqs[idx] == NULL)) {
>>> +            goto err;
>>> +        }
>>> +    }
>>> +
>>> +    WITH_RCU_READ_LOCK_GUARD() {
>>> +        for (idx = 0; idx < dev->nvqs; ++idx) {
>>> +            int stop_idx = idx;
>>> +            bool ok = vhost_shadow_vq_start_rcu(dev,
>>> +                                                dev->shadow_vqs[idx]);
>>> +
>>> +            if (!ok) {
>>> +                while (--stop_idx >= 0) {
>>> +                    vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[stop_idx]);
>>> +                }
>>> +
>>> +                goto err;
>>> +            }
>>> +        }
>>> +    }
>>> +
>>> +    dev->sw_lm_enabled = true;
>>> +    return 0;
>>> +
>>> +err:
>>> +    for (; idx >= 0; --idx) {
>>> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
>>> +    }
>>> +    g_free(dev->shadow_vqs[idx]);
>>> +
>>> +    return -1;
>>> +}
>>> +
>>> +static int vhost_sw_live_migration_enable(struct vhost_dev *dev,
>>> +                                          bool enable_lm)
>>> +{
>>> +    int r;
>>> +
>>> +    if (enable_lm == dev->sw_lm_enabled) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    r = enable_lm ? vhost_sw_live_migration_start(dev)
>>> +                  : vhost_sw_live_migration_stop(dev);
>>> +
>>> +    return r;
>>> +}
>>> +
>>>    static void vhost_log_start(MemoryListener *listener,
>>>                                MemoryRegionSection *section,
>>>                                int old, int new)
>>> @@ -1389,6 +1466,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>>>        hdev->log = NULL;
>>>        hdev->log_size = 0;
>>>        hdev->log_enabled = false;
>>> +    hdev->sw_lm_enabled = false;
>>>        hdev->started = false;
>>>        memory_listener_register(&hdev->memory_listener, &address_space_memory);
>>>        QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
>>> @@ -1816,6 +1894,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>>>            hdev->vhost_ops->vhost_dev_start(hdev, false);
>>>        }
>>>        for (i = 0; i < hdev->nvqs; ++i) {
>>> +        if (hdev->sw_lm_enabled) {
>>> +            vhost_shadow_vq_stop_rcu(hdev, hdev->shadow_vqs[i]);
>>> +            vhost_shadow_vq_free(hdev->shadow_vqs[i]);
>>> +        }
>>> +
>>>            vhost_virtqueue_stop(hdev,
>>>                                 vdev,
>>>                                 hdev->vqs + i,
>>> @@ -1829,6 +1912,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>>>            memory_listener_unregister(&hdev->iommu_listener);
>>>        }
>>>        vhost_log_put(hdev, true);
>>> +    g_free(hdev->shadow_vqs);
>>> +    hdev->sw_lm_enabled = false;
>>>        hdev->started = false;
>>>        hdev->vdev = NULL;
>>>    }
>>> @@ -1845,5 +1930,53 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>>>
>>>    void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
>>>    {
>>> -    error_setg(errp, "Shadow virtqueue still not implemented.");
>>> +    struct vhost_dev *hdev;
>>> +    const char *err_cause = NULL;
>>> +    const VirtioDeviceClass *k;
>>> +    int r;
>>> +    ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR;
>>> +
>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>> +        if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) {
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>> +    if (!hdev) {
>>> +        err_class = ERROR_CLASS_DEVICE_NOT_FOUND;
>>> +        err_cause = "Device not found";
>>> +        goto err;
>>> +    }
>>> +
>>> +    if (!hdev->started) {
>>> +        err_cause = "Device is not started";
>>> +        goto err;
>>> +    }
>>> +
>>> +    if (hdev->acked_features & BIT_ULL(VIRTIO_F_RING_PACKED)) {
>>> +        err_cause = "Use packed vq";
>>> +        goto err;
>>> +    }
>>> +
>>> +    if (vhost_dev_has_iommu(hdev)) {
>>> +        err_cause = "Device use IOMMU";
>>> +        goto err;
>>> +    }
>>> +
>>> +    k = VIRTIO_DEVICE_GET_CLASS(hdev->vdev);
>>> +    if (!k->set_vq_handler) {
>>> +        err_cause = "Virtio device type does not support reset of vq handler";
>>> +        goto err;
>>> +    }
>>> +
>>> +    r = vhost_sw_live_migration_enable(hdev, enable);
>>> +    if (unlikely(r)) {
>>> +        err_cause = "Error enabling (see monitor)";
>>> +    }
>>> +
>>> +err:
>>> +    if (err_cause) {
>>> +        error_set(errp, err_class,
>>> +                  "Can't enable shadow vq on %s: %s", name, err_cause);
>>> +    }
>>>    }



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue
@ 2021-02-04  3:26         ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-04  3:26 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Parav Pandit, Michael S. Tsirkin, qemu-level,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	virtualization, Eric Blake, Michael Lilja, Jim Harford,
	Rob Miller


On 2021/2/2 下午6:08, Eugenio Perez Martin wrote:
> On Mon, Feb 1, 2021 at 7:29 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>> Shadow virtqueue notifications forwarding is disabled when vhost_dev
>>> stops.
>>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>>    hw/virtio/vhost-shadow-virtqueue.h |   5 ++
>>>    include/hw/virtio/vhost.h          |   4 +
>>>    hw/virtio/vhost-shadow-virtqueue.c | 123 +++++++++++++++++++++++++-
>>>    hw/virtio/vhost.c                  | 135 ++++++++++++++++++++++++++++-
>>>    4 files changed, 264 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
>>> index 6cc18d6acb..466f8ae595 100644
>>> --- a/hw/virtio/vhost-shadow-virtqueue.h
>>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
>>> @@ -17,6 +17,11 @@
>>>
>>>    typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
>>>
>>> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
>>> +                               VhostShadowVirtqueue *svq);
>>> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
>>> +                              VhostShadowVirtqueue *svq);
>>> +
>>>    VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
>>>
>>>    void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>> index 2be782cefd..732a4b2a2b 100644
>>> --- a/include/hw/virtio/vhost.h
>>> +++ b/include/hw/virtio/vhost.h
>>> @@ -55,6 +55,8 @@ struct vhost_iommu {
>>>        QLIST_ENTRY(vhost_iommu) iommu_next;
>>>    };
>>>
>>> +typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
>>> +
>>>    typedef struct VhostDevConfigOps {
>>>        /* Vhost device config space changed callback
>>>         */
>>> @@ -83,7 +85,9 @@ struct vhost_dev {
>>>        uint64_t backend_cap;
>>>        bool started;
>>>        bool log_enabled;
>>> +    bool sw_lm_enabled;
>>>        uint64_t log_size;
>>> +    VhostShadowVirtqueue **shadow_vqs;
>>>        Error *migration_blocker;
>>>        const VhostOps *vhost_ops;
>>>        void *opaque;
>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
>>> index c0c967a7c5..908c36c66d 100644
>>> --- a/hw/virtio/vhost-shadow-virtqueue.c
>>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
>>> @@ -8,15 +8,129 @@
>>>     */
>>>
>>>    #include "hw/virtio/vhost-shadow-virtqueue.h"
>>> +#include "hw/virtio/vhost.h"
>>> +#include "hw/virtio/virtio-access.h"
>>> +
>>> +#include "standard-headers/linux/vhost_types.h"
>>> +#include "standard-headers/linux/virtio_ring.h"
>>>
>>>    #include "qemu/error-report.h"
>>> -#include "qemu/event_notifier.h"
>>> +#include "qemu/main-loop.h"
>>>
>>>    typedef struct VhostShadowVirtqueue {
>>>        EventNotifier kick_notifier;
>>>        EventNotifier call_notifier;
>>> +    const struct vhost_virtqueue *hvq;
>>> +    VirtIODevice *vdev;
>>> +    VirtQueue *vq;
>>>    } VhostShadowVirtqueue;
>>
>> So instead of doing things at virtio level, how about do the shadow
>> stuffs at vhost level?
>>
>> It works like:
>>
>> virtio -> [shadow vhost backend] -> vhost backend
>>
>> Then the QMP is used to plug the shadow vhost backend in the middle or not.
>>
>> It looks kind of easier since we don't need to deal with virtqueue
>> handlers etc.. Instead, we just need to deal with eventfd stuffs:
>>
>> When shadow vhost mode is enabled, we just intercept the host_notifiers
>> and guest_notifiers. When it was disabled, we just pass the host/guest
>> notifiers to the real vhost backends?
>>
> Hi Jason.
>
> Sure we can try that model, but it seems to me that it comes with a
> different set of problems.
>
> For example, there are code in vhost.c that checks if implementations
> are available in vhost_ops, like:
>
> if (dev->vhost_ops->vhost_vq_get_addr) {
>          r = dev->vhost_ops->vhost_vq_get_addr(dev, &addr, vq);
>          ...
> }
>
> I can count 14 of these, checking:
>
> dev->vhost_ops->vhost_backend_can_merge
> dev->vhost_ops->vhost_backend_mem_section_filter
> dev->vhost_ops->vhost_force_iommu
> dev->vhost_ops->vhost_requires_shm_log
> dev->vhost_ops->vhost_set_backend_cap
> dev->vhost_ops->vhost_set_vring_busyloop_timeout
> dev->vhost_ops->vhost_vq_get_addr
> hdev->vhost_ops->vhost_dev_start
> hdev->vhost_ops->vhost_get_config
> hdev->vhost_ops->vhost_get_inflight_fd
> hdev->vhost_ops->vhost_net_set_backend
> hdev->vhost_ops->vhost_set_config
> hdev->vhost_ops->vhost_set_inflight_fd
> hdev->vhost_ops->vhost_set_iotlb_callback
>
> So we should Implement all of the vhost_ops callbacks, forwarding them
> to actual vhost_backed, and delete conditionally these ones? In other
> words, dynamically generate the new shadow vq vhost_ops? If a new
> callback is added to any vhost backend in the future, do we have to
> force the adding / checking for NULL in shadow backend vhost_ops?
> Would this be a good moment to check if all backends implement these
> and delete the checks?


I think it won't be easy if we want to support all kinds of vhost 
backends from the start. So we can go with vhost-vdpa one first.

Actually how it work might be something like (no need to switch 
vhost_ops, we can do everything silently in the ops)

1) when device to switch to shadow vq (e.g via QMP)
2) vhost-vdpa will stop and sync state (last_avail_idx) internally
3) reset vhost-vdpa, clean call and kick eventfd
4) allocate vqs for vhost-vdpa, new call and kick eventfd, restart 
vhost-vdpa
5) start the shadow vq (make it start for last_avail_idx)
6) intercept ioeventfd and forward the request to callfd
7) intercept callfd and forward the request to irqfd
8) forward request between shadow virtqueue and vhost-vdpa


>
> There are also checks like:
>
> if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER)
>
> How would shadow_vq backend expose itself? (I guess as the actual used backend).
>
> I can modify this patchset to not relay the guest->host notifications
> on vq handlers but on eventfd handlers. Although this will make it
> independent of the actual virtio device kind used, I can see two
> drawbacks:
> * The actual fact that it makes it independent of virtio device kind.
> If a device does not use the notifiers and poll the ring by itself, it
> has no chance of knowing that it should stop. What happens if
> virtio-net tx timer is armed when we start shadow vq?.


So if we do that in vhost level, it's a vhost backend from the virtio 
layer. Then we don't need to worry about tx timer stuffs.


> * The fixes (current and future) in vq notifications, like the one
> currently implemented in virtio_notify_irqfd for windows drivers
> regarding ISR bit 0. I think this one in particular is OK not to
> carry, but I think many changes affecting any of the functions will
> have to be mirrored in the other.


Consider we behave like a vhost, it just work as in the past for other 
type of vhost backends when MSI-X is not enabled?

Thanks


>
> Thoughts on this?
>
> Thanks!
>
>> Thanks
>>
>>
>>> +static uint16_t vhost_shadow_vring_used_flags(VhostShadowVirtqueue *svq)
>>> +{
>>> +    const struct vring_used *used = svq->hvq->used;
>>> +    return virtio_tswap16(svq->vdev, used->flags);
>>> +}
>>> +
>>> +static bool vhost_shadow_vring_should_kick(VhostShadowVirtqueue *vq)
>>> +{
>>> +    return !(vhost_shadow_vring_used_flags(vq) & VRING_USED_F_NO_NOTIFY);
>>> +}
>>> +
>>> +static void vhost_shadow_vring_kick(VhostShadowVirtqueue *vq)
>>> +{
>>> +    if (vhost_shadow_vring_should_kick(vq)) {
>>> +        event_notifier_set(&vq->kick_notifier);
>>> +    }
>>> +}
>>> +
>>> +static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
>>> +{
>>> +    struct vhost_dev *hdev = vhost_dev_from_virtio(vdev);
>>> +    uint16_t idx = virtio_get_queue_index(vq);
>>> +
>>> +    VhostShadowVirtqueue *svq = hdev->shadow_vqs[idx];
>>> +
>>> +    vhost_shadow_vring_kick(svq);
>>> +}
>>> +
>>> +/*
>>> + * Start shadow virtqueue operation.
>>> + * @dev vhost device
>>> + * @svq Shadow Virtqueue
>>> + *
>>> + * Run in RCU context
>>> + */
>>> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
>>> +                               VhostShadowVirtqueue *svq)
>>> +{
>>> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
>>> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
>>> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
>>> +    struct vhost_vring_file kick_file = {
>>> +        .index = idx,
>>> +        .fd = event_notifier_get_fd(&svq->kick_notifier),
>>> +    };
>>> +    int r;
>>> +    bool ok;
>>> +
>>> +    /* Check that notifications are still going directly to vhost dev */
>>> +    assert(virtio_queue_host_notifier_status(svq->vq));
>>> +
>>> +    ok = k->set_vq_handler(dev->vdev, idx, handle_shadow_vq);
>>> +    if (!ok) {
>>> +        error_report("Couldn't set the vq handler");
>>> +        goto err_set_kick_handler;
>>> +    }
>>> +
>>> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
>>> +    if (r != 0) {
>>> +        error_report("Couldn't set kick fd: %s", strerror(errno));
>>> +        goto err_set_vring_kick;
>>> +    }
>>> +
>>> +    event_notifier_set_handler(vq_host_notifier,
>>> +                               virtio_queue_host_notifier_read);
>>> +    virtio_queue_set_host_notifier_enabled(svq->vq, false);
>>> +    virtio_queue_host_notifier_read(vq_host_notifier);
>>> +
>>> +    return true;
>>> +
>>> +err_set_vring_kick:
>>> +    k->set_vq_handler(dev->vdev, idx, NULL);
>>> +
>>> +err_set_kick_handler:
>>> +    return false;
>>> +}
>>> +
>>> +/*
>>> + * Stop shadow virtqueue operation.
>>> + * @dev vhost device
>>> + * @svq Shadow Virtqueue
>>> + *
>>> + * Run in RCU context
>>> + */
>>> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
>>> +                              VhostShadowVirtqueue *svq)
>>> +{
>>> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(svq->vdev);
>>> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
>>> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
>>> +    struct vhost_vring_file kick_file = {
>>> +        .index = idx,
>>> +        .fd = event_notifier_get_fd(vq_host_notifier),
>>> +    };
>>> +    int r;
>>> +
>>> +    /* Restore vhost kick */
>>> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
>>> +    /* Cannot do a lot of things */
>>> +    assert(r == 0);
>>> +
>>> +    event_notifier_set_handler(vq_host_notifier, NULL);
>>> +    virtio_queue_set_host_notifier_enabled(svq->vq, true);
>>> +    k->set_vq_handler(svq->vdev, idx, NULL);
>>> +}
>>> +
>>>    /*
>>>     * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
>>>     * methods and file descriptors.
>>> @@ -24,8 +138,13 @@ typedef struct VhostShadowVirtqueue {
>>>    VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>>>    {
>>>        g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
>>> +    int vq_idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + idx);
>>>        int r;
>>>
>>> +    svq->vq = virtio_get_queue(dev->vdev, vq_idx);
>>> +    svq->hvq = &dev->vqs[idx];
>>> +    svq->vdev = dev->vdev;
>>> +
>>>        r = event_notifier_init(&svq->kick_notifier, 0);
>>>        if (r != 0) {
>>>            error_report("Couldn't create kick event notifier: %s",
>>> @@ -40,7 +159,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>>>            goto err_init_call_notifier;
>>>        }
>>>
>>> -    return svq;
>>> +    return g_steal_pointer(&svq);
>>>
>>>    err_init_call_notifier:
>>>        event_notifier_cleanup(&svq->kick_notifier);
>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>> index 42836e45f3..bde688f278 100644
>>> --- a/hw/virtio/vhost.c
>>> +++ b/hw/virtio/vhost.c
>>> @@ -25,6 +25,7 @@
>>>    #include "exec/address-spaces.h"
>>>    #include "hw/virtio/virtio-bus.h"
>>>    #include "hw/virtio/virtio-access.h"
>>> +#include "hw/virtio/vhost-shadow-virtqueue.h"
>>>    #include "migration/blocker.h"
>>>    #include "migration/qemu-file-types.h"
>>>    #include "sysemu/dma.h"
>>> @@ -945,6 +946,82 @@ static void vhost_log_global_stop(MemoryListener *listener)
>>>        }
>>>    }
>>>
>>> +static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
>>> +{
>>> +    int idx;
>>> +
>>> +    WITH_RCU_READ_LOCK_GUARD() {
>>> +        dev->sw_lm_enabled = false;
>>> +
>>> +        for (idx = 0; idx < dev->nvqs; ++idx) {
>>> +            vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[idx]);
>>> +        }
>>> +    }
>>> +
>>> +    for (idx = 0; idx < dev->nvqs; ++idx) {
>>> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
>>> +    }
>>> +
>>> +    g_free(dev->shadow_vqs);
>>> +    dev->shadow_vqs = NULL;
>>> +    return 0;
>>> +}
>>> +
>>> +static int vhost_sw_live_migration_start(struct vhost_dev *dev)
>>> +{
>>> +    int idx;
>>> +
>>> +    dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs);
>>> +    for (idx = 0; idx < dev->nvqs; ++idx) {
>>> +        dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx);
>>> +        if (unlikely(dev->shadow_vqs[idx] == NULL)) {
>>> +            goto err;
>>> +        }
>>> +    }
>>> +
>>> +    WITH_RCU_READ_LOCK_GUARD() {
>>> +        for (idx = 0; idx < dev->nvqs; ++idx) {
>>> +            int stop_idx = idx;
>>> +            bool ok = vhost_shadow_vq_start_rcu(dev,
>>> +                                                dev->shadow_vqs[idx]);
>>> +
>>> +            if (!ok) {
>>> +                while (--stop_idx >= 0) {
>>> +                    vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[stop_idx]);
>>> +                }
>>> +
>>> +                goto err;
>>> +            }
>>> +        }
>>> +    }
>>> +
>>> +    dev->sw_lm_enabled = true;
>>> +    return 0;
>>> +
>>> +err:
>>> +    for (; idx >= 0; --idx) {
>>> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
>>> +    }
>>> +    g_free(dev->shadow_vqs[idx]);
>>> +
>>> +    return -1;
>>> +}
>>> +
>>> +static int vhost_sw_live_migration_enable(struct vhost_dev *dev,
>>> +                                          bool enable_lm)
>>> +{
>>> +    int r;
>>> +
>>> +    if (enable_lm == dev->sw_lm_enabled) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    r = enable_lm ? vhost_sw_live_migration_start(dev)
>>> +                  : vhost_sw_live_migration_stop(dev);
>>> +
>>> +    return r;
>>> +}
>>> +
>>>    static void vhost_log_start(MemoryListener *listener,
>>>                                MemoryRegionSection *section,
>>>                                int old, int new)
>>> @@ -1389,6 +1466,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>>>        hdev->log = NULL;
>>>        hdev->log_size = 0;
>>>        hdev->log_enabled = false;
>>> +    hdev->sw_lm_enabled = false;
>>>        hdev->started = false;
>>>        memory_listener_register(&hdev->memory_listener, &address_space_memory);
>>>        QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
>>> @@ -1816,6 +1894,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>>>            hdev->vhost_ops->vhost_dev_start(hdev, false);
>>>        }
>>>        for (i = 0; i < hdev->nvqs; ++i) {
>>> +        if (hdev->sw_lm_enabled) {
>>> +            vhost_shadow_vq_stop_rcu(hdev, hdev->shadow_vqs[i]);
>>> +            vhost_shadow_vq_free(hdev->shadow_vqs[i]);
>>> +        }
>>> +
>>>            vhost_virtqueue_stop(hdev,
>>>                                 vdev,
>>>                                 hdev->vqs + i,
>>> @@ -1829,6 +1912,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>>>            memory_listener_unregister(&hdev->iommu_listener);
>>>        }
>>>        vhost_log_put(hdev, true);
>>> +    g_free(hdev->shadow_vqs);
>>> +    hdev->sw_lm_enabled = false;
>>>        hdev->started = false;
>>>        hdev->vdev = NULL;
>>>    }
>>> @@ -1845,5 +1930,53 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>>>
>>>    void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
>>>    {
>>> -    error_setg(errp, "Shadow virtqueue still not implemented.");
>>> +    struct vhost_dev *hdev;
>>> +    const char *err_cause = NULL;
>>> +    const VirtioDeviceClass *k;
>>> +    int r;
>>> +    ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR;
>>> +
>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>> +        if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) {
>>> +            break;
>>> +        }
>>> +    }
>>> +
>>> +    if (!hdev) {
>>> +        err_class = ERROR_CLASS_DEVICE_NOT_FOUND;
>>> +        err_cause = "Device not found";
>>> +        goto err;
>>> +    }
>>> +
>>> +    if (!hdev->started) {
>>> +        err_cause = "Device is not started";
>>> +        goto err;
>>> +    }
>>> +
>>> +    if (hdev->acked_features & BIT_ULL(VIRTIO_F_RING_PACKED)) {
>>> +        err_cause = "Use packed vq";
>>> +        goto err;
>>> +    }
>>> +
>>> +    if (vhost_dev_has_iommu(hdev)) {
>>> +        err_cause = "Device use IOMMU";
>>> +        goto err;
>>> +    }
>>> +
>>> +    k = VIRTIO_DEVICE_GET_CLASS(hdev->vdev);
>>> +    if (!k->set_vq_handler) {
>>> +        err_cause = "Virtio device type does not support reset of vq handler";
>>> +        goto err;
>>> +    }
>>> +
>>> +    r = vhost_sw_live_migration_enable(hdev, enable);
>>> +    if (unlikely(r)) {
>>> +        err_cause = "Error enabling (see monitor)";
>>> +    }
>>> +
>>> +err:
>>> +    if (err_cause) {
>>> +        error_set(errp, err_class,
>>> +                  "Can't enable shadow vq on %s: %s", name, err_cause);
>>> +    }
>>>    }

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 08/10] vhost: Add x-vhost-enable-shadow-vq qmp
  2021-02-02 15:38     ` Eric Blake
  (?)
@ 2021-02-04  9:01     ` Eugenio Perez Martin
  2021-02-04 12:16         ` Markus Armbruster
  -1 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2021-02-04  9:01 UTC (permalink / raw)
  To: Eric Blake
  Cc: Parav Pandit, Michael S. Tsirkin, Jason Wang, Juan Quintela,
	qemu-level, Markus Armbruster, Stefano Garzarella,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	virtualization, Michael Lilja, Jim Harford, Rob Miller

On Tue, Feb 2, 2021 at 4:38 PM Eric Blake <eblake@redhat.com> wrote:
>
> On 1/29/21 2:54 PM, Eugenio Pérez wrote:
> > Command to enable shadow virtqueue looks like:
> >
> > { "execute": "x-vhost-enable-shadow-vq", "arguments": { "name": "dev0", "enable": true } }
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >  qapi/net.json     | 23 +++++++++++++++++++++++
> >  hw/virtio/vhost.c |  6 ++++++
> >  2 files changed, 29 insertions(+)
> >
> > diff --git a/qapi/net.json b/qapi/net.json
> > index c31748c87f..6170d69798 100644
> > --- a/qapi/net.json
> > +++ b/qapi/net.json
> > @@ -77,6 +77,29 @@
> >  ##
> >  { 'command': 'netdev_del', 'data': {'id': 'str'} }
> >
> > +##
> > +# @x-vhost-enable-shadow-vq:
>
> This spelling is the preferred form...[1]
>
> > +#
> > +# Use vhost shadow virtqueue.
> > +#
> > +# @name: the device name of the virtual network adapter
> > +#
> > +# @enable: true to use he alternate shadow VQ notification path
> > +#
> > +# Returns: Error if failure, or 'no error' for success
>
> This line...[2]
>
> > +#
> > +# Since: 5.3
>
> The next release is 6.0, not 5.3.
>
> > +#
> > +# Example:
> > +#
> > +# -> { "execute": "x-vhost_enable_shadow_vq", "arguments": {"enable": true} }
>
> [1]...but doesn't match the example.
>
> > +# <- { "return": { "enabled" : true } }
>
> [2]...doesn't match this comment.  I'd just drop the line, since there
> is no explicit return listed.
>

Hi Eric.

Thanks for your comments, they will be addressed in the next revision.

> > +#
> > +##
> > +{ 'command': 'x-vhost-enable-shadow-vq',
> > +  'data': {'name': 'str', 'enable': 'bool'},
> > +  'if': 'defined(CONFIG_VHOST_KERNEL)' }
> > +
> >  ##
> >  # @NetLegacyNicOptions:
> >  #
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 040f68ff2e..42836e45f3 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -15,6 +15,7 @@
> >
> >  #include "qemu/osdep.h"
> >  #include "qapi/error.h"
> > +#include "qapi/qapi-commands-net.h"
> >  #include "hw/virtio/vhost.h"
> >  #include "qemu/atomic.h"
> >  #include "qemu/range.h"
> > @@ -1841,3 +1842,8 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
> >
> >      return -1;
> >  }
> > +
> > +void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
> > +{
> > +    error_setg(errp, "Shadow virtqueue still not implemented.");
>
> error_setg() should not be passed a trailing '.'.
>

Oh, sorry I missed the comment in the error_setg doc.

I copy&pasted from the call to error_setg "Migration disabled: vhost
lacks VHOST_F_LOG_ALL feature.". I'm wondering if it's a good moment
to delete the dot there too, since other tools could depend on parsing
it.

Thanks!

> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
  2021-02-04  3:14             ` Jason Wang
  (?)
@ 2021-02-04  9:25             ` Eugenio Perez Martin
  2021-02-05  3:51                 ` Jason Wang
  -1 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2021-02-04  9:25 UTC (permalink / raw)
  To: Jason Wang
  Cc: Rob Miller, Parav Pandit, Juan Quintela, Michael S. Tsirkin,
	qemu-level, Markus Armbruster, Harpreet Singh Anand, Xiao W Wang,
	Stefan Hajnoczi, Eli Cohen, virtualization, Michael Lilja,
	Jim Harford, Stefano Garzarella

On Thu, Feb 4, 2021 at 4:14 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2021/2/2 下午6:17, Eugenio Perez Martin wrote:
> > On Tue, Feb 2, 2021 at 4:31 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> On 2021/2/1 下午4:28, Eugenio Perez Martin wrote:
> >>> On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
> >>>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> >>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>>>> ---
> >>>>>     include/hw/virtio/vhost.h |  1 +
> >>>>>     hw/virtio/vhost.c         | 17 +++++++++++++++++
> >>>>>     2 files changed, 18 insertions(+)
> >>>>>
> >>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> >>>>> index 4a8bc75415..fca076e3f0 100644
> >>>>> --- a/include/hw/virtio/vhost.h
> >>>>> +++ b/include/hw/virtio/vhost.h
> >>>>> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
> >>>>>     void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
> >>>>>                             uint64_t features);
> >>>>>     bool vhost_has_free_slot(void);
> >>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
> >>>>>
> >>>>>     int vhost_net_set_backend(struct vhost_dev *hdev,
> >>>>>                               struct vhost_vring_file *file);
> >>>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >>>>> index 28c7d78172..8683d507f5 100644
> >>>>> --- a/hw/virtio/vhost.c
> >>>>> +++ b/hw/virtio/vhost.c
> >>>>> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
> >>>>>         return slots_limit > used_memslots;
> >>>>>     }
> >>>>>
> >>>>> +/*
> >>>>> + * Get the vhost device associated to a VirtIO device.
> >>>>> + */
> >>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
> >>>>> +{
> >>>>> +    struct vhost_dev *hdev;
> >>>>> +
> >>>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
> >>>>> +        if (hdev->vdev == vdev) {
> >>>>> +            return hdev;
> >>>>> +        }
> >>>>> +    }
> >>>>> +
> >>>>> +    assert(hdev);
> >>>>> +    return NULL;
> >>>>> +}
> >>>> I'm not sure this can work in the case of multiqueue. E.g vhost-net
> >>>> multiqueue is a N:1 mapping between vhost devics and virtio devices.
> >>>>
> >>>> Thanks
> >>>>
> >>> Right. We could add an "vdev vq index" parameter to the function in
> >>> this case, but I guess the most reliable way to do this is to add a
> >>> vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.
> >>
> >> So the question still, it looks like it's easier to hide the shadow
> >> virtqueue stuffs at vhost layer instead of expose them to virtio layer:
> >>
> >> 1) vhost protocol is stable ABI
> >> 2) no need to deal with virtio stuffs which is more complex than vhost
> >>
> >> Or are there any advantages if we do it at virtio layer?
> >>
> > As far as I can tell, we will need the virtio layer the moment we
> > start copying/translating buffers.
> >
> > In this series, the virtio dependency can be reduced if qemu does not
> > check the used ring _F_NO_NOTIFY flag before writing to irqfd. It
> > would enable packed queues and IOMMU immediately, and I think the cost
> > should not be so high. In the previous RFC this check was deleted
> > later anyway, so I think it was a bad idea to include it from the start.
>
>
> I am not sure I understand here. For vhost, we can still do anything we
> want, e.g accessing guest memory etc. Any blocker that prevent us from
> copying/translating buffers? (Note that qemu will propagate memory
> mappings to vhost).
>

There is nothing that forbids us to access directly, but if we don't
reuse the virtio layer functionality we would have to duplicate every
access function. "Need" was a too strong word maybe :).

In other words: for the shadow vq vring exposed for the device, qemu
treats it as a driver, and this functionality needs to be added to
qemu. But for accessing the guest's one do not reuse virtio.c would be
a bad idea in my opinion.

> Thanks
>
>
> >
> >
> >
> >
> >
> >> Thanks
> >>
> >>
> >>> I need to take this into account in qmp_x_vhost_enable_shadow_vq too.
> >>>
> >>>>> +
> >>>>>     static void vhost_dev_sync_region(struct vhost_dev *dev,
> >>>>>                                       MemoryRegionSection *section,
> >>>>>                                       uint64_t mfirst, uint64_t mlast,
> >
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 08/10] vhost: Add x-vhost-enable-shadow-vq qmp
  2021-02-04  9:01     ` Eugenio Perez Martin
@ 2021-02-04 12:16         ` Markus Armbruster
  0 siblings, 0 replies; 42+ messages in thread
From: Markus Armbruster @ 2021-02-04 12:16 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Parav Pandit, Juan Quintela, Jason Wang, Michael S. Tsirkin,
	qemu-level, virtualization, Harpreet Singh Anand, Xiao W Wang,
	Stefan Hajnoczi, Eli Cohen, Rob Miller, Michael Lilja,
	Jim Harford, Stefano Garzarella

Eugenio Perez Martin <eperezma@redhat.com> writes:

> On Tue, Feb 2, 2021 at 4:38 PM Eric Blake <eblake@redhat.com> wrote:
>>
>> On 1/29/21 2:54 PM, Eugenio Pérez wrote:
[...]
>> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> > index 040f68ff2e..42836e45f3 100644
>> > --- a/hw/virtio/vhost.c
>> > +++ b/hw/virtio/vhost.c
>> > @@ -15,6 +15,7 @@
>> >
>> >  #include "qemu/osdep.h"
>> >  #include "qapi/error.h"
>> > +#include "qapi/qapi-commands-net.h"
>> >  #include "hw/virtio/vhost.h"
>> >  #include "qemu/atomic.h"
>> >  #include "qemu/range.h"
>> > @@ -1841,3 +1842,8 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>> >
>> >      return -1;
>> >  }
>> > +
>> > +void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
>> > +{
>> > +    error_setg(errp, "Shadow virtqueue still not implemented.");
>>
>> error_setg() should not be passed a trailing '.'.
>>
>
> Oh, sorry I missed the comment in the error_setg doc.
>
> I copy&pasted from the call to error_setg "Migration disabled: vhost
> lacks VHOST_F_LOG_ALL feature.". I'm wondering if it's a good moment
> to delete the dot there too, since other tools could depend on parsing
> it.

It's pretty much always a good moment for patches improving error
messages :)



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 08/10] vhost: Add x-vhost-enable-shadow-vq qmp
@ 2021-02-04 12:16         ` Markus Armbruster
  0 siblings, 0 replies; 42+ messages in thread
From: Markus Armbruster @ 2021-02-04 12:16 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Parav Pandit, Michael S. Tsirkin, qemu-level, virtualization,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	Rob Miller, Eric Blake, Michael Lilja, Jim Harford

Eugenio Perez Martin <eperezma@redhat.com> writes:

> On Tue, Feb 2, 2021 at 4:38 PM Eric Blake <eblake@redhat.com> wrote:
>>
>> On 1/29/21 2:54 PM, Eugenio Pérez wrote:
[...]
>> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> > index 040f68ff2e..42836e45f3 100644
>> > --- a/hw/virtio/vhost.c
>> > +++ b/hw/virtio/vhost.c
>> > @@ -15,6 +15,7 @@
>> >
>> >  #include "qemu/osdep.h"
>> >  #include "qapi/error.h"
>> > +#include "qapi/qapi-commands-net.h"
>> >  #include "hw/virtio/vhost.h"
>> >  #include "qemu/atomic.h"
>> >  #include "qemu/range.h"
>> > @@ -1841,3 +1842,8 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>> >
>> >      return -1;
>> >  }
>> > +
>> > +void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
>> > +{
>> > +    error_setg(errp, "Shadow virtqueue still not implemented.");
>>
>> error_setg() should not be passed a trailing '.'.
>>
>
> Oh, sorry I missed the comment in the error_setg doc.
>
> I copy&pasted from the call to error_setg "Migration disabled: vhost
> lacks VHOST_F_LOG_ALL feature.". I'm wondering if it's a good moment
> to delete the dot there too, since other tools could depend on parsing
> it.

It's pretty much always a good moment for patches improving error
messages :)

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 08/10] vhost: Add x-vhost-enable-shadow-vq qmp
  2021-02-04 12:16         ` Markus Armbruster
  (?)
@ 2021-02-04 14:03         ` Eugenio Perez Martin
  -1 siblings, 0 replies; 42+ messages in thread
From: Eugenio Perez Martin @ 2021-02-04 14:03 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Parav Pandit, Juan Quintela, Jason Wang, Michael S. Tsirkin,
	qemu-level, virtualization, Harpreet Singh Anand, Xiao W Wang,
	Stefan Hajnoczi, Eli Cohen, Rob Miller, Michael Lilja,
	Jim Harford, Stefano Garzarella

On Thu, Feb 4, 2021 at 1:16 PM Markus Armbruster <armbru@redhat.com> wrote:
>
> Eugenio Perez Martin <eperezma@redhat.com> writes:
>
> > On Tue, Feb 2, 2021 at 4:38 PM Eric Blake <eblake@redhat.com> wrote:
> >>
> >> On 1/29/21 2:54 PM, Eugenio Pérez wrote:
> [...]
> >> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >> > index 040f68ff2e..42836e45f3 100644
> >> > --- a/hw/virtio/vhost.c
> >> > +++ b/hw/virtio/vhost.c
> >> > @@ -15,6 +15,7 @@
> >> >
> >> >  #include "qemu/osdep.h"
> >> >  #include "qapi/error.h"
> >> > +#include "qapi/qapi-commands-net.h"
> >> >  #include "hw/virtio/vhost.h"
> >> >  #include "qemu/atomic.h"
> >> >  #include "qemu/range.h"
> >> > @@ -1841,3 +1842,8 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
> >> >
> >> >      return -1;
> >> >  }
> >> > +
> >> > +void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
> >> > +{
> >> > +    error_setg(errp, "Shadow virtqueue still not implemented.");
> >>
> >> error_setg() should not be passed a trailing '.'.
> >>
> >
> > Oh, sorry I missed the comment in the error_setg doc.
> >
> > I copy&pasted from the call to error_setg "Migration disabled: vhost
> > lacks VHOST_F_LOG_ALL feature.". I'm wondering if it's a good moment
> > to delete the dot there too, since other tools could depend on parsing
> > it.
>
> It's pretty much always a good moment for patches improving error
> messages :)
>

Great, changing it too :).

Thanks!



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
  2021-02-04  9:25             ` Eugenio Perez Martin
@ 2021-02-05  3:51                 ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-05  3:51 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Parav Pandit, Juan Quintela, Michael S. Tsirkin, qemu-level,
	Markus Armbruster, Stefano Garzarella, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, virtualization,
	Michael Lilja, Jim Harford, Rob Miller


On 2021/2/4 下午5:25, Eugenio Perez Martin wrote:
> On Thu, Feb 4, 2021 at 4:14 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/2/2 下午6:17, Eugenio Perez Martin wrote:
>>> On Tue, Feb 2, 2021 at 4:31 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/2/1 下午4:28, Eugenio Perez Martin wrote:
>>>>> On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>>>> ---
>>>>>>>      include/hw/virtio/vhost.h |  1 +
>>>>>>>      hw/virtio/vhost.c         | 17 +++++++++++++++++
>>>>>>>      2 files changed, 18 insertions(+)
>>>>>>>
>>>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>>>>> index 4a8bc75415..fca076e3f0 100644
>>>>>>> --- a/include/hw/virtio/vhost.h
>>>>>>> +++ b/include/hw/virtio/vhost.h
>>>>>>> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>>>      void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>>>                              uint64_t features);
>>>>>>>      bool vhost_has_free_slot(void);
>>>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
>>>>>>>
>>>>>>>      int vhost_net_set_backend(struct vhost_dev *hdev,
>>>>>>>                                struct vhost_vring_file *file);
>>>>>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>>>>>> index 28c7d78172..8683d507f5 100644
>>>>>>> --- a/hw/virtio/vhost.c
>>>>>>> +++ b/hw/virtio/vhost.c
>>>>>>> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
>>>>>>>          return slots_limit > used_memslots;
>>>>>>>      }
>>>>>>>
>>>>>>> +/*
>>>>>>> + * Get the vhost device associated to a VirtIO device.
>>>>>>> + */
>>>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
>>>>>>> +{
>>>>>>> +    struct vhost_dev *hdev;
>>>>>>> +
>>>>>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>>>>>> +        if (hdev->vdev == vdev) {
>>>>>>> +            return hdev;
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    assert(hdev);
>>>>>>> +    return NULL;
>>>>>>> +}
>>>>>> I'm not sure this can work in the case of multiqueue. E.g vhost-net
>>>>>> multiqueue is a N:1 mapping between vhost devics and virtio devices.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>> Right. We could add an "vdev vq index" parameter to the function in
>>>>> this case, but I guess the most reliable way to do this is to add a
>>>>> vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.
>>>> So the question still, it looks like it's easier to hide the shadow
>>>> virtqueue stuffs at vhost layer instead of expose them to virtio layer:
>>>>
>>>> 1) vhost protocol is stable ABI
>>>> 2) no need to deal with virtio stuffs which is more complex than vhost
>>>>
>>>> Or are there any advantages if we do it at virtio layer?
>>>>
>>> As far as I can tell, we will need the virtio layer the moment we
>>> start copying/translating buffers.
>>>
>>> In this series, the virtio dependency can be reduced if qemu does not
>>> check the used ring _F_NO_NOTIFY flag before writing to irqfd. It
>>> would enable packed queues and IOMMU immediately, and I think the cost
>>> should not be so high. In the previous RFC this check was deleted
>>> later anyway, so I think it was a bad idea to include it from the start.
>>
>> I am not sure I understand here. For vhost, we can still do anything we
>> want, e.g accessing guest memory etc. Any blocker that prevent us from
>> copying/translating buffers? (Note that qemu will propagate memory
>> mappings to vhost).
>>
> There is nothing that forbids us to access directly, but if we don't
> reuse the virtio layer functionality we would have to duplicate every
> access function. "Need" was a too strong word maybe :).
>
> In other words: for the shadow vq vring exposed for the device, qemu
> treats it as a driver, and this functionality needs to be added to
> qemu. But for accessing the guest's one do not reuse virtio.c would be
> a bad idea in my opinion.


The problem is, virtio.c is not a library and it has a lot of dependency 
with other qemu modules basically makes it impossible to be reused at 
vhost level.

We can solve this by:

1) split the core functions out as a library or
2) switch to use contrib/lib-vhostuser but needs to decouple UNIX socket 
transport

None of the above looks trivial and they are only device codes. For 
shadow virtqueue, we need driver codes as well where no code can be reused.

As we discussed, we probably need IOVA allocated when forwarding 
descriptors between the two virtqueues. So my feeling is we can have our 
own codes to start then we can consider whether we can reuse some from 
the existing virtio.c or lib-vhostuser.

Thanks


>
>> Thanks
>>
>>
>>>
>>>
>>>
>>>
>>>> Thanks
>>>>
>>>>
>>>>> I need to take this into account in qmp_x_vhost_enable_shadow_vq too.
>>>>>
>>>>>>> +
>>>>>>>      static void vhost_dev_sync_region(struct vhost_dev *dev,
>>>>>>>                                        MemoryRegionSection *section,
>>>>>>>                                        uint64_t mfirst, uint64_t mlast,
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
@ 2021-02-05  3:51                 ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-05  3:51 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Parav Pandit, Michael S. Tsirkin, qemu-level,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	virtualization, Michael Lilja, Jim Harford, Rob Miller


On 2021/2/4 下午5:25, Eugenio Perez Martin wrote:
> On Thu, Feb 4, 2021 at 4:14 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/2/2 下午6:17, Eugenio Perez Martin wrote:
>>> On Tue, Feb 2, 2021 at 4:31 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/2/1 下午4:28, Eugenio Perez Martin wrote:
>>>>> On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>>>> ---
>>>>>>>      include/hw/virtio/vhost.h |  1 +
>>>>>>>      hw/virtio/vhost.c         | 17 +++++++++++++++++
>>>>>>>      2 files changed, 18 insertions(+)
>>>>>>>
>>>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>>>>> index 4a8bc75415..fca076e3f0 100644
>>>>>>> --- a/include/hw/virtio/vhost.h
>>>>>>> +++ b/include/hw/virtio/vhost.h
>>>>>>> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>>>      void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>>>                              uint64_t features);
>>>>>>>      bool vhost_has_free_slot(void);
>>>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
>>>>>>>
>>>>>>>      int vhost_net_set_backend(struct vhost_dev *hdev,
>>>>>>>                                struct vhost_vring_file *file);
>>>>>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>>>>>> index 28c7d78172..8683d507f5 100644
>>>>>>> --- a/hw/virtio/vhost.c
>>>>>>> +++ b/hw/virtio/vhost.c
>>>>>>> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
>>>>>>>          return slots_limit > used_memslots;
>>>>>>>      }
>>>>>>>
>>>>>>> +/*
>>>>>>> + * Get the vhost device associated to a VirtIO device.
>>>>>>> + */
>>>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
>>>>>>> +{
>>>>>>> +    struct vhost_dev *hdev;
>>>>>>> +
>>>>>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>>>>>> +        if (hdev->vdev == vdev) {
>>>>>>> +            return hdev;
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    assert(hdev);
>>>>>>> +    return NULL;
>>>>>>> +}
>>>>>> I'm not sure this can work in the case of multiqueue. E.g vhost-net
>>>>>> multiqueue is a N:1 mapping between vhost devics and virtio devices.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>> Right. We could add an "vdev vq index" parameter to the function in
>>>>> this case, but I guess the most reliable way to do this is to add a
>>>>> vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.
>>>> So the question still, it looks like it's easier to hide the shadow
>>>> virtqueue stuffs at vhost layer instead of expose them to virtio layer:
>>>>
>>>> 1) vhost protocol is stable ABI
>>>> 2) no need to deal with virtio stuffs which is more complex than vhost
>>>>
>>>> Or are there any advantages if we do it at virtio layer?
>>>>
>>> As far as I can tell, we will need the virtio layer the moment we
>>> start copying/translating buffers.
>>>
>>> In this series, the virtio dependency can be reduced if qemu does not
>>> check the used ring _F_NO_NOTIFY flag before writing to irqfd. It
>>> would enable packed queues and IOMMU immediately, and I think the cost
>>> should not be so high. In the previous RFC this check was deleted
>>> later anyway, so I think it was a bad idea to include it from the start.
>>
>> I am not sure I understand here. For vhost, we can still do anything we
>> want, e.g accessing guest memory etc. Any blocker that prevent us from
>> copying/translating buffers? (Note that qemu will propagate memory
>> mappings to vhost).
>>
> There is nothing that forbids us to access directly, but if we don't
> reuse the virtio layer functionality we would have to duplicate every
> access function. "Need" was a too strong word maybe :).
>
> In other words: for the shadow vq vring exposed for the device, qemu
> treats it as a driver, and this functionality needs to be added to
> qemu. But for accessing the guest's one do not reuse virtio.c would be
> a bad idea in my opinion.


The problem is, virtio.c is not a library and it has a lot of dependency 
with other qemu modules basically makes it impossible to be reused at 
vhost level.

We can solve this by:

1) split the core functions out as a library or
2) switch to use contrib/lib-vhostuser but needs to decouple UNIX socket 
transport

None of the above looks trivial and they are only device codes. For 
shadow virtqueue, we need driver codes as well where no code can be reused.

As we discussed, we probably need IOVA allocated when forwarding 
descriptors between the two virtqueues. So my feeling is we can have our 
own codes to start then we can consider whether we can reuse some from 
the existing virtio.c or lib-vhostuser.

Thanks


>
>> Thanks
>>
>>
>>>
>>>
>>>
>>>
>>>> Thanks
>>>>
>>>>
>>>>> I need to take this into account in qmp_x_vhost_enable_shadow_vq too.
>>>>>
>>>>>>> +
>>>>>>>      static void vhost_dev_sync_region(struct vhost_dev *dev,
>>>>>>>                                        MemoryRegionSection *section,
>>>>>>>                                        uint64_t mfirst, uint64_t mlast,
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue
  2021-02-04  3:26         ` Jason Wang
  (?)
@ 2021-02-09 15:02         ` Eugenio Perez Martin
  2021-02-10  5:57             ` Jason Wang
  -1 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2021-02-09 15:02 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, Michael S. Tsirkin, Juan Quintela, qemu-level,
	Markus Armbruster, Stefano Garzarella, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, virtualization,
	Michael Lilja, Jim Harford, Rob Miller

On Thu, Feb 4, 2021 at 4:27 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2021/2/2 下午6:08, Eugenio Perez Martin wrote:
> > On Mon, Feb 1, 2021 at 7:29 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> >>> Shadow virtqueue notifications forwarding is disabled when vhost_dev
> >>> stops.
> >>>
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>>    hw/virtio/vhost-shadow-virtqueue.h |   5 ++
> >>>    include/hw/virtio/vhost.h          |   4 +
> >>>    hw/virtio/vhost-shadow-virtqueue.c | 123 +++++++++++++++++++++++++-
> >>>    hw/virtio/vhost.c                  | 135 ++++++++++++++++++++++++++++-
> >>>    4 files changed, 264 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> >>> index 6cc18d6acb..466f8ae595 100644
> >>> --- a/hw/virtio/vhost-shadow-virtqueue.h
> >>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> >>> @@ -17,6 +17,11 @@
> >>>
> >>>    typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
> >>>
> >>> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
> >>> +                               VhostShadowVirtqueue *svq);
> >>> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
> >>> +                              VhostShadowVirtqueue *svq);
> >>> +
> >>>    VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
> >>>
> >>>    void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
> >>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> >>> index 2be782cefd..732a4b2a2b 100644
> >>> --- a/include/hw/virtio/vhost.h
> >>> +++ b/include/hw/virtio/vhost.h
> >>> @@ -55,6 +55,8 @@ struct vhost_iommu {
> >>>        QLIST_ENTRY(vhost_iommu) iommu_next;
> >>>    };
> >>>
> >>> +typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
> >>> +
> >>>    typedef struct VhostDevConfigOps {
> >>>        /* Vhost device config space changed callback
> >>>         */
> >>> @@ -83,7 +85,9 @@ struct vhost_dev {
> >>>        uint64_t backend_cap;
> >>>        bool started;
> >>>        bool log_enabled;
> >>> +    bool sw_lm_enabled;
> >>>        uint64_t log_size;
> >>> +    VhostShadowVirtqueue **shadow_vqs;
> >>>        Error *migration_blocker;
> >>>        const VhostOps *vhost_ops;
> >>>        void *opaque;
> >>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> >>> index c0c967a7c5..908c36c66d 100644
> >>> --- a/hw/virtio/vhost-shadow-virtqueue.c
> >>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> >>> @@ -8,15 +8,129 @@
> >>>     */
> >>>
> >>>    #include "hw/virtio/vhost-shadow-virtqueue.h"
> >>> +#include "hw/virtio/vhost.h"
> >>> +#include "hw/virtio/virtio-access.h"
> >>> +
> >>> +#include "standard-headers/linux/vhost_types.h"
> >>> +#include "standard-headers/linux/virtio_ring.h"
> >>>
> >>>    #include "qemu/error-report.h"
> >>> -#include "qemu/event_notifier.h"
> >>> +#include "qemu/main-loop.h"
> >>>
> >>>    typedef struct VhostShadowVirtqueue {
> >>>        EventNotifier kick_notifier;
> >>>        EventNotifier call_notifier;
> >>> +    const struct vhost_virtqueue *hvq;
> >>> +    VirtIODevice *vdev;
> >>> +    VirtQueue *vq;
> >>>    } VhostShadowVirtqueue;
> >>
> >> So instead of doing things at virtio level, how about do the shadow
> >> stuffs at vhost level?
> >>
> >> It works like:
> >>
> >> virtio -> [shadow vhost backend] -> vhost backend
> >>
> >> Then the QMP is used to plug the shadow vhost backend in the middle or not.
> >>
> >> It looks kind of easier since we don't need to deal with virtqueue
> >> handlers etc.. Instead, we just need to deal with eventfd stuffs:
> >>
> >> When shadow vhost mode is enabled, we just intercept the host_notifiers
> >> and guest_notifiers. When it was disabled, we just pass the host/guest
> >> notifiers to the real vhost backends?
> >>
> > Hi Jason.
> >
> > Sure we can try that model, but it seems to me that it comes with a
> > different set of problems.
> >
> > For example, there are code in vhost.c that checks if implementations
> > are available in vhost_ops, like:
> >
> > if (dev->vhost_ops->vhost_vq_get_addr) {
> >          r = dev->vhost_ops->vhost_vq_get_addr(dev, &addr, vq);
> >          ...
> > }
> >
> > I can count 14 of these, checking:
> >
> > dev->vhost_ops->vhost_backend_can_merge
> > dev->vhost_ops->vhost_backend_mem_section_filter
> > dev->vhost_ops->vhost_force_iommu
> > dev->vhost_ops->vhost_requires_shm_log
> > dev->vhost_ops->vhost_set_backend_cap
> > dev->vhost_ops->vhost_set_vring_busyloop_timeout
> > dev->vhost_ops->vhost_vq_get_addr
> > hdev->vhost_ops->vhost_dev_start
> > hdev->vhost_ops->vhost_get_config
> > hdev->vhost_ops->vhost_get_inflight_fd
> > hdev->vhost_ops->vhost_net_set_backend
> > hdev->vhost_ops->vhost_set_config
> > hdev->vhost_ops->vhost_set_inflight_fd
> > hdev->vhost_ops->vhost_set_iotlb_callback
> >
> > So we should Implement all of the vhost_ops callbacks, forwarding them
> > to actual vhost_backed, and delete conditionally these ones? In other
> > words, dynamically generate the new shadow vq vhost_ops? If a new
> > callback is added to any vhost backend in the future, do we have to
> > force the adding / checking for NULL in shadow backend vhost_ops?
> > Would this be a good moment to check if all backends implement these
> > and delete the checks?
>
>
> I think it won't be easy if we want to support all kinds of vhost
> backends from the start. So we can go with vhost-vdpa one first.
>
> Actually how it work might be something like (no need to switch
> vhost_ops, we can do everything silently in the ops)
>
> 1) when device to switch to shadow vq (e.g via QMP)
> 2) vhost-vdpa will stop and sync state (last_avail_idx) internally
> 3) reset vhost-vdpa, clean call and kick eventfd
> 4) allocate vqs for vhost-vdpa, new call and kick eventfd, restart
> vhost-vdpa
> 5) start the shadow vq (make it start for last_avail_idx)
> 6) intercept ioeventfd and forward the request to callfd
> 7) intercept callfd and forward the request to irqfd
> 8) forward request between shadow virtqueue and vhost-vdpa
>

Sorry, still not clear to me how it relates with the actual backend used :).

Would it work for you if we finish the notifications forwarding part
and we work on the buffer forwarding part?

I'm going to send another cleaner revision, with less dependencies on
virtio code, only notifications eventfds. I think it will be easier to
discuss the changes on top of that.

>
> >
> > There are also checks like:
> >
> > if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER)
> >
> > How would shadow_vq backend expose itself? (I guess as the actual used backend).
> >
> > I can modify this patchset to not relay the guest->host notifications
> > on vq handlers but on eventfd handlers. Although this will make it
> > independent of the actual virtio device kind used, I can see two
> > drawbacks:
> > * The actual fact that it makes it independent of virtio device kind.
> > If a device does not use the notifiers and poll the ring by itself, it
> > has no chance of knowing that it should stop. What happens if
> > virtio-net tx timer is armed when we start shadow vq?.
>
>
> So if we do that in vhost level, it's a vhost backend from the virtio
> layer. Then we don't need to worry about tx timer stuffs.
>

Got it.

So I'm going to assume that no device in virtio layer needs to be
aware of the change. It seems a valid assumption to me.

>
> > * The fixes (current and future) in vq notifications, like the one
> > currently implemented in virtio_notify_irqfd for windows drivers
> > regarding ISR bit 0. I think this one in particular is OK not to
> > carry, but I think many changes affecting any of the functions will
> > have to be mirrored in the other.
>
>
> Consider we behave like a vhost, it just work as in the past for other
> type of vhost backends when MSI-X is not enabled?
>

Yes, it may be a bad example, as vhost devices may not update it.
However I still think a lot of features (packed ring buffers
treatment, etc), performance optimizations and future fixes will have
to apply to two different codes in case we roll our own buffers
treatment.

Expanding in thread of patch 05/10, since you propose two solutions for this.

Thanks!


> Thanks
>
>
> >
> > Thoughts on this?
> >
> > Thanks!
> >
> >> Thanks
> >>
> >>
> >>> +static uint16_t vhost_shadow_vring_used_flags(VhostShadowVirtqueue *svq)
> >>> +{
> >>> +    const struct vring_used *used = svq->hvq->used;
> >>> +    return virtio_tswap16(svq->vdev, used->flags);
> >>> +}
> >>> +
> >>> +static bool vhost_shadow_vring_should_kick(VhostShadowVirtqueue *vq)
> >>> +{
> >>> +    return !(vhost_shadow_vring_used_flags(vq) & VRING_USED_F_NO_NOTIFY);
> >>> +}
> >>> +
> >>> +static void vhost_shadow_vring_kick(VhostShadowVirtqueue *vq)
> >>> +{
> >>> +    if (vhost_shadow_vring_should_kick(vq)) {
> >>> +        event_notifier_set(&vq->kick_notifier);
> >>> +    }
> >>> +}
> >>> +
> >>> +static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
> >>> +{
> >>> +    struct vhost_dev *hdev = vhost_dev_from_virtio(vdev);
> >>> +    uint16_t idx = virtio_get_queue_index(vq);
> >>> +
> >>> +    VhostShadowVirtqueue *svq = hdev->shadow_vqs[idx];
> >>> +
> >>> +    vhost_shadow_vring_kick(svq);
> >>> +}
> >>> +
> >>> +/*
> >>> + * Start shadow virtqueue operation.
> >>> + * @dev vhost device
> >>> + * @svq Shadow Virtqueue
> >>> + *
> >>> + * Run in RCU context
> >>> + */
> >>> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
> >>> +                               VhostShadowVirtqueue *svq)
> >>> +{
> >>> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
> >>> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> >>> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
> >>> +    struct vhost_vring_file kick_file = {
> >>> +        .index = idx,
> >>> +        .fd = event_notifier_get_fd(&svq->kick_notifier),
> >>> +    };
> >>> +    int r;
> >>> +    bool ok;
> >>> +
> >>> +    /* Check that notifications are still going directly to vhost dev */
> >>> +    assert(virtio_queue_host_notifier_status(svq->vq));
> >>> +
> >>> +    ok = k->set_vq_handler(dev->vdev, idx, handle_shadow_vq);
> >>> +    if (!ok) {
> >>> +        error_report("Couldn't set the vq handler");
> >>> +        goto err_set_kick_handler;
> >>> +    }
> >>> +
> >>> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
> >>> +    if (r != 0) {
> >>> +        error_report("Couldn't set kick fd: %s", strerror(errno));
> >>> +        goto err_set_vring_kick;
> >>> +    }
> >>> +
> >>> +    event_notifier_set_handler(vq_host_notifier,
> >>> +                               virtio_queue_host_notifier_read);
> >>> +    virtio_queue_set_host_notifier_enabled(svq->vq, false);
> >>> +    virtio_queue_host_notifier_read(vq_host_notifier);
> >>> +
> >>> +    return true;
> >>> +
> >>> +err_set_vring_kick:
> >>> +    k->set_vq_handler(dev->vdev, idx, NULL);
> >>> +
> >>> +err_set_kick_handler:
> >>> +    return false;
> >>> +}
> >>> +
> >>> +/*
> >>> + * Stop shadow virtqueue operation.
> >>> + * @dev vhost device
> >>> + * @svq Shadow Virtqueue
> >>> + *
> >>> + * Run in RCU context
> >>> + */
> >>> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
> >>> +                              VhostShadowVirtqueue *svq)
> >>> +{
> >>> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(svq->vdev);
> >>> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
> >>> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> >>> +    struct vhost_vring_file kick_file = {
> >>> +        .index = idx,
> >>> +        .fd = event_notifier_get_fd(vq_host_notifier),
> >>> +    };
> >>> +    int r;
> >>> +
> >>> +    /* Restore vhost kick */
> >>> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
> >>> +    /* Cannot do a lot of things */
> >>> +    assert(r == 0);
> >>> +
> >>> +    event_notifier_set_handler(vq_host_notifier, NULL);
> >>> +    virtio_queue_set_host_notifier_enabled(svq->vq, true);
> >>> +    k->set_vq_handler(svq->vdev, idx, NULL);
> >>> +}
> >>> +
> >>>    /*
> >>>     * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
> >>>     * methods and file descriptors.
> >>> @@ -24,8 +138,13 @@ typedef struct VhostShadowVirtqueue {
> >>>    VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
> >>>    {
> >>>        g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
> >>> +    int vq_idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + idx);
> >>>        int r;
> >>>
> >>> +    svq->vq = virtio_get_queue(dev->vdev, vq_idx);
> >>> +    svq->hvq = &dev->vqs[idx];
> >>> +    svq->vdev = dev->vdev;
> >>> +
> >>>        r = event_notifier_init(&svq->kick_notifier, 0);
> >>>        if (r != 0) {
> >>>            error_report("Couldn't create kick event notifier: %s",
> >>> @@ -40,7 +159,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
> >>>            goto err_init_call_notifier;
> >>>        }
> >>>
> >>> -    return svq;
> >>> +    return g_steal_pointer(&svq);
> >>>
> >>>    err_init_call_notifier:
> >>>        event_notifier_cleanup(&svq->kick_notifier);
> >>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >>> index 42836e45f3..bde688f278 100644
> >>> --- a/hw/virtio/vhost.c
> >>> +++ b/hw/virtio/vhost.c
> >>> @@ -25,6 +25,7 @@
> >>>    #include "exec/address-spaces.h"
> >>>    #include "hw/virtio/virtio-bus.h"
> >>>    #include "hw/virtio/virtio-access.h"
> >>> +#include "hw/virtio/vhost-shadow-virtqueue.h"
> >>>    #include "migration/blocker.h"
> >>>    #include "migration/qemu-file-types.h"
> >>>    #include "sysemu/dma.h"
> >>> @@ -945,6 +946,82 @@ static void vhost_log_global_stop(MemoryListener *listener)
> >>>        }
> >>>    }
> >>>
> >>> +static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
> >>> +{
> >>> +    int idx;
> >>> +
> >>> +    WITH_RCU_READ_LOCK_GUARD() {
> >>> +        dev->sw_lm_enabled = false;
> >>> +
> >>> +        for (idx = 0; idx < dev->nvqs; ++idx) {
> >>> +            vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[idx]);
> >>> +        }
> >>> +    }
> >>> +
> >>> +    for (idx = 0; idx < dev->nvqs; ++idx) {
> >>> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> >>> +    }
> >>> +
> >>> +    g_free(dev->shadow_vqs);
> >>> +    dev->shadow_vqs = NULL;
> >>> +    return 0;
> >>> +}
> >>> +
> >>> +static int vhost_sw_live_migration_start(struct vhost_dev *dev)
> >>> +{
> >>> +    int idx;
> >>> +
> >>> +    dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs);
> >>> +    for (idx = 0; idx < dev->nvqs; ++idx) {
> >>> +        dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx);
> >>> +        if (unlikely(dev->shadow_vqs[idx] == NULL)) {
> >>> +            goto err;
> >>> +        }
> >>> +    }
> >>> +
> >>> +    WITH_RCU_READ_LOCK_GUARD() {
> >>> +        for (idx = 0; idx < dev->nvqs; ++idx) {
> >>> +            int stop_idx = idx;
> >>> +            bool ok = vhost_shadow_vq_start_rcu(dev,
> >>> +                                                dev->shadow_vqs[idx]);
> >>> +
> >>> +            if (!ok) {
> >>> +                while (--stop_idx >= 0) {
> >>> +                    vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[stop_idx]);
> >>> +                }
> >>> +
> >>> +                goto err;
> >>> +            }
> >>> +        }
> >>> +    }
> >>> +
> >>> +    dev->sw_lm_enabled = true;
> >>> +    return 0;
> >>> +
> >>> +err:
> >>> +    for (; idx >= 0; --idx) {
> >>> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> >>> +    }
> >>> +    g_free(dev->shadow_vqs[idx]);
> >>> +
> >>> +    return -1;
> >>> +}
> >>> +
> >>> +static int vhost_sw_live_migration_enable(struct vhost_dev *dev,
> >>> +                                          bool enable_lm)
> >>> +{
> >>> +    int r;
> >>> +
> >>> +    if (enable_lm == dev->sw_lm_enabled) {
> >>> +        return 0;
> >>> +    }
> >>> +
> >>> +    r = enable_lm ? vhost_sw_live_migration_start(dev)
> >>> +                  : vhost_sw_live_migration_stop(dev);
> >>> +
> >>> +    return r;
> >>> +}
> >>> +
> >>>    static void vhost_log_start(MemoryListener *listener,
> >>>                                MemoryRegionSection *section,
> >>>                                int old, int new)
> >>> @@ -1389,6 +1466,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> >>>        hdev->log = NULL;
> >>>        hdev->log_size = 0;
> >>>        hdev->log_enabled = false;
> >>> +    hdev->sw_lm_enabled = false;
> >>>        hdev->started = false;
> >>>        memory_listener_register(&hdev->memory_listener, &address_space_memory);
> >>>        QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
> >>> @@ -1816,6 +1894,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
> >>>            hdev->vhost_ops->vhost_dev_start(hdev, false);
> >>>        }
> >>>        for (i = 0; i < hdev->nvqs; ++i) {
> >>> +        if (hdev->sw_lm_enabled) {
> >>> +            vhost_shadow_vq_stop_rcu(hdev, hdev->shadow_vqs[i]);
> >>> +            vhost_shadow_vq_free(hdev->shadow_vqs[i]);
> >>> +        }
> >>> +
> >>>            vhost_virtqueue_stop(hdev,
> >>>                                 vdev,
> >>>                                 hdev->vqs + i,
> >>> @@ -1829,6 +1912,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
> >>>            memory_listener_unregister(&hdev->iommu_listener);
> >>>        }
> >>>        vhost_log_put(hdev, true);
> >>> +    g_free(hdev->shadow_vqs);
> >>> +    hdev->sw_lm_enabled = false;
> >>>        hdev->started = false;
> >>>        hdev->vdev = NULL;
> >>>    }
> >>> @@ -1845,5 +1930,53 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
> >>>
> >>>    void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
> >>>    {
> >>> -    error_setg(errp, "Shadow virtqueue still not implemented.");
> >>> +    struct vhost_dev *hdev;
> >>> +    const char *err_cause = NULL;
> >>> +    const VirtioDeviceClass *k;
> >>> +    int r;
> >>> +    ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR;
> >>> +
> >>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
> >>> +        if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) {
> >>> +            break;
> >>> +        }
> >>> +    }
> >>> +
> >>> +    if (!hdev) {
> >>> +        err_class = ERROR_CLASS_DEVICE_NOT_FOUND;
> >>> +        err_cause = "Device not found";
> >>> +        goto err;
> >>> +    }
> >>> +
> >>> +    if (!hdev->started) {
> >>> +        err_cause = "Device is not started";
> >>> +        goto err;
> >>> +    }
> >>> +
> >>> +    if (hdev->acked_features & BIT_ULL(VIRTIO_F_RING_PACKED)) {
> >>> +        err_cause = "Use packed vq";
> >>> +        goto err;
> >>> +    }
> >>> +
> >>> +    if (vhost_dev_has_iommu(hdev)) {
> >>> +        err_cause = "Device use IOMMU";
> >>> +        goto err;
> >>> +    }
> >>> +
> >>> +    k = VIRTIO_DEVICE_GET_CLASS(hdev->vdev);
> >>> +    if (!k->set_vq_handler) {
> >>> +        err_cause = "Virtio device type does not support reset of vq handler";
> >>> +        goto err;
> >>> +    }
> >>> +
> >>> +    r = vhost_sw_live_migration_enable(hdev, enable);
> >>> +    if (unlikely(r)) {
> >>> +        err_cause = "Error enabling (see monitor)";
> >>> +    }
> >>> +
> >>> +err:
> >>> +    if (err_cause) {
> >>> +        error_set(errp, err_class,
> >>> +                  "Can't enable shadow vq on %s: %s", name, err_cause);
> >>> +    }
> >>>    }
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
  2021-02-05  3:51                 ` Jason Wang
  (?)
@ 2021-02-09 15:35                 ` Eugenio Perez Martin
  2021-02-10  5:54                     ` Jason Wang
  -1 siblings, 1 reply; 42+ messages in thread
From: Eugenio Perez Martin @ 2021-02-09 15:35 UTC (permalink / raw)
  To: Jason Wang
  Cc: Parav Pandit, Juan Quintela, Michael S. Tsirkin, qemu-level,
	Markus Armbruster, Stefano Garzarella, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, virtualization,
	Michael Lilja, Jim Harford, Rob Miller

On Fri, Feb 5, 2021 at 4:52 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2021/2/4 下午5:25, Eugenio Perez Martin wrote:
> > On Thu, Feb 4, 2021 at 4:14 AM Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> On 2021/2/2 下午6:17, Eugenio Perez Martin wrote:
> >>> On Tue, Feb 2, 2021 at 4:31 AM Jason Wang <jasowang@redhat.com> wrote:
> >>>> On 2021/2/1 下午4:28, Eugenio Perez Martin wrote:
> >>>>> On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
> >>>>>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
> >>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>>>>>> ---
> >>>>>>>      include/hw/virtio/vhost.h |  1 +
> >>>>>>>      hw/virtio/vhost.c         | 17 +++++++++++++++++
> >>>>>>>      2 files changed, 18 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> >>>>>>> index 4a8bc75415..fca076e3f0 100644
> >>>>>>> --- a/include/hw/virtio/vhost.h
> >>>>>>> +++ b/include/hw/virtio/vhost.h
> >>>>>>> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
> >>>>>>>      void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
> >>>>>>>                              uint64_t features);
> >>>>>>>      bool vhost_has_free_slot(void);
> >>>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
> >>>>>>>
> >>>>>>>      int vhost_net_set_backend(struct vhost_dev *hdev,
> >>>>>>>                                struct vhost_vring_file *file);
> >>>>>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >>>>>>> index 28c7d78172..8683d507f5 100644
> >>>>>>> --- a/hw/virtio/vhost.c
> >>>>>>> +++ b/hw/virtio/vhost.c
> >>>>>>> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
> >>>>>>>          return slots_limit > used_memslots;
> >>>>>>>      }
> >>>>>>>
> >>>>>>> +/*
> >>>>>>> + * Get the vhost device associated to a VirtIO device.
> >>>>>>> + */
> >>>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
> >>>>>>> +{
> >>>>>>> +    struct vhost_dev *hdev;
> >>>>>>> +
> >>>>>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
> >>>>>>> +        if (hdev->vdev == vdev) {
> >>>>>>> +            return hdev;
> >>>>>>> +        }
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    assert(hdev);
> >>>>>>> +    return NULL;
> >>>>>>> +}
> >>>>>> I'm not sure this can work in the case of multiqueue. E.g vhost-net
> >>>>>> multiqueue is a N:1 mapping between vhost devics and virtio devices.
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>> Right. We could add an "vdev vq index" parameter to the function in
> >>>>> this case, but I guess the most reliable way to do this is to add a
> >>>>> vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.
> >>>> So the question still, it looks like it's easier to hide the shadow
> >>>> virtqueue stuffs at vhost layer instead of expose them to virtio layer:
> >>>>
> >>>> 1) vhost protocol is stable ABI
> >>>> 2) no need to deal with virtio stuffs which is more complex than vhost
> >>>>
> >>>> Or are there any advantages if we do it at virtio layer?
> >>>>
> >>> As far as I can tell, we will need the virtio layer the moment we
> >>> start copying/translating buffers.
> >>>
> >>> In this series, the virtio dependency can be reduced if qemu does not
> >>> check the used ring _F_NO_NOTIFY flag before writing to irqfd. It
> >>> would enable packed queues and IOMMU immediately, and I think the cost
> >>> should not be so high. In the previous RFC this check was deleted
> >>> later anyway, so I think it was a bad idea to include it from the start.
> >>
> >> I am not sure I understand here. For vhost, we can still do anything we
> >> want, e.g accessing guest memory etc. Any blocker that prevent us from
> >> copying/translating buffers? (Note that qemu will propagate memory
> >> mappings to vhost).
> >>
> > There is nothing that forbids us to access directly, but if we don't
> > reuse the virtio layer functionality we would have to duplicate every
> > access function. "Need" was a too strong word maybe :).
> >
> > In other words: for the shadow vq vring exposed for the device, qemu
> > treats it as a driver, and this functionality needs to be added to
> > qemu. But for accessing the guest's one do not reuse virtio.c would be
> > a bad idea in my opinion.
>
>
> The problem is, virtio.c is not a library and it has a lot of dependency
> with other qemu modules basically makes it impossible to be reused at
> vhost level.
>

While virtio.c as a whole has dependencies, I think that the functions
needed in the original RFC do not have these dependencies.

However I see how to split vring dataplane from virtio device
management can benefit.

> We can solve this by:
>
> 1) split the core functions out as a library or
> 2) switch to use contrib/lib-vhostuser but needs to decouple UNIX socket
> transport
>
> None of the above looks trivial and they are only device codes. For
> shadow virtqueue, we need driver codes as well where no code can be reused.
>
> As we discussed, we probably need IOVA allocated when forwarding
> descriptors between the two virtqueues. So my feeling is we can have our
> own codes to start then we can consider whether we can reuse some from
> the existing virtio.c or lib-vhostuser.
>

As I see it, if we develop our own code a lot of it will be copied
from current virtio.c, which itself duplicates a lot of contrib/ lib
functionality.

Maybe it's better to combine your proposals and decouple the vring
functions, the vhost transport, and the qemu virtio device management,
so other projects can reuse them directly?

I still think this can be left for a later series with buffer
forwarding on top of this one, do you think they can/should be merged
independently?

Thanks!

> Thanks
>
>
> >
> >> Thanks
> >>
> >>
> >>>
> >>>
> >>>
> >>>
> >>>> Thanks
> >>>>
> >>>>
> >>>>> I need to take this into account in qmp_x_vhost_enable_shadow_vq too.
> >>>>>
> >>>>>>> +
> >>>>>>>      static void vhost_dev_sync_region(struct vhost_dev *dev,
> >>>>>>>                                        MemoryRegionSection *section,
> >>>>>>>                                        uint64_t mfirst, uint64_t mlast,
> >
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
  2021-02-09 15:35                 ` Eugenio Perez Martin
@ 2021-02-10  5:54                     ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-10  5:54 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Parav Pandit, Juan Quintela, Michael S. Tsirkin, qemu-level,
	Markus Armbruster, Stefano Garzarella, Harpreet Singh Anand,
	Xiao W Wang, Stefan Hajnoczi, Eli Cohen, virtualization,
	Michael Lilja, Jim Harford, Rob Miller


On 2021/2/9 下午11:35, Eugenio Perez Martin wrote:
> On Fri, Feb 5, 2021 at 4:52 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/2/4 下午5:25, Eugenio Perez Martin wrote:
>>> On Thu, Feb 4, 2021 at 4:14 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/2/2 下午6:17, Eugenio Perez Martin wrote:
>>>>> On Tue, Feb 2, 2021 at 4:31 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> On 2021/2/1 下午4:28, Eugenio Perez Martin wrote:
>>>>>>> On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>>>>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>>>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>>>>>> ---
>>>>>>>>>       include/hw/virtio/vhost.h |  1 +
>>>>>>>>>       hw/virtio/vhost.c         | 17 +++++++++++++++++
>>>>>>>>>       2 files changed, 18 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>>>>>>> index 4a8bc75415..fca076e3f0 100644
>>>>>>>>> --- a/include/hw/virtio/vhost.h
>>>>>>>>> +++ b/include/hw/virtio/vhost.h
>>>>>>>>> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>>>>>       void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>>>>>                               uint64_t features);
>>>>>>>>>       bool vhost_has_free_slot(void);
>>>>>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
>>>>>>>>>
>>>>>>>>>       int vhost_net_set_backend(struct vhost_dev *hdev,
>>>>>>>>>                                 struct vhost_vring_file *file);
>>>>>>>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>>>>>>>> index 28c7d78172..8683d507f5 100644
>>>>>>>>> --- a/hw/virtio/vhost.c
>>>>>>>>> +++ b/hw/virtio/vhost.c
>>>>>>>>> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
>>>>>>>>>           return slots_limit > used_memslots;
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>> +/*
>>>>>>>>> + * Get the vhost device associated to a VirtIO device.
>>>>>>>>> + */
>>>>>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
>>>>>>>>> +{
>>>>>>>>> +    struct vhost_dev *hdev;
>>>>>>>>> +
>>>>>>>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>>>>>>>> +        if (hdev->vdev == vdev) {
>>>>>>>>> +            return hdev;
>>>>>>>>> +        }
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    assert(hdev);
>>>>>>>>> +    return NULL;
>>>>>>>>> +}
>>>>>>>> I'm not sure this can work in the case of multiqueue. E.g vhost-net
>>>>>>>> multiqueue is a N:1 mapping between vhost devics and virtio devices.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>> Right. We could add an "vdev vq index" parameter to the function in
>>>>>>> this case, but I guess the most reliable way to do this is to add a
>>>>>>> vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.
>>>>>> So the question still, it looks like it's easier to hide the shadow
>>>>>> virtqueue stuffs at vhost layer instead of expose them to virtio layer:
>>>>>>
>>>>>> 1) vhost protocol is stable ABI
>>>>>> 2) no need to deal with virtio stuffs which is more complex than vhost
>>>>>>
>>>>>> Or are there any advantages if we do it at virtio layer?
>>>>>>
>>>>> As far as I can tell, we will need the virtio layer the moment we
>>>>> start copying/translating buffers.
>>>>>
>>>>> In this series, the virtio dependency can be reduced if qemu does not
>>>>> check the used ring _F_NO_NOTIFY flag before writing to irqfd. It
>>>>> would enable packed queues and IOMMU immediately, and I think the cost
>>>>> should not be so high. In the previous RFC this check was deleted
>>>>> later anyway, so I think it was a bad idea to include it from the start.
>>>> I am not sure I understand here. For vhost, we can still do anything we
>>>> want, e.g accessing guest memory etc. Any blocker that prevent us from
>>>> copying/translating buffers? (Note that qemu will propagate memory
>>>> mappings to vhost).
>>>>
>>> There is nothing that forbids us to access directly, but if we don't
>>> reuse the virtio layer functionality we would have to duplicate every
>>> access function. "Need" was a too strong word maybe :).
>>>
>>> In other words: for the shadow vq vring exposed for the device, qemu
>>> treats it as a driver, and this functionality needs to be added to
>>> qemu. But for accessing the guest's one do not reuse virtio.c would be
>>> a bad idea in my opinion.
>>
>> The problem is, virtio.c is not a library and it has a lot of dependency
>> with other qemu modules basically makes it impossible to be reused at
>> vhost level.
>>
> While virtio.c as a whole has dependencies, I think that the functions
> needed in the original RFC do not have these dependencies.
>
> However I see how to split vring dataplane from virtio device
> management can benefit.


If you can split them out, that would be fine.


>
>> We can solve this by:
>>
>> 1) split the core functions out as a library or
>> 2) switch to use contrib/lib-vhostuser but needs to decouple UNIX socket
>> transport
>>
>> None of the above looks trivial and they are only device codes. For
>> shadow virtqueue, we need driver codes as well where no code can be reused.
>>
>> As we discussed, we probably need IOVA allocated when forwarding
>> descriptors between the two virtqueues. So my feeling is we can have our
>> own codes to start then we can consider whether we can reuse some from
>> the existing virtio.c or lib-vhostuser.
>>
> As I see it, if we develop our own code a lot of it will be copied
> from current virtio.c, which itself duplicates a lot of contrib/ lib
> functionality.
>
> Maybe it's better to combine your proposals and decouple the vring
> functions, the vhost transport, and the qemu virtio device management,
> so other projects can reuse them directly?


I think this can work.


>
> I still think this can be left for a later series with buffer
> forwarding on top of this one, do you think they can/should be merged
> independently?


Since you post a new series, let's see.

Thanks


>
> Thanks!
>
>> Thanks
>>
>>
>>>> Thanks
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>> I need to take this into account in qmp_x_vhost_enable_shadow_vq too.
>>>>>>>
>>>>>>>>> +
>>>>>>>>>       static void vhost_dev_sync_region(struct vhost_dev *dev,
>>>>>>>>>                                         MemoryRegionSection *section,
>>>>>>>>>                                         uint64_t mfirst, uint64_t mlast,



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 05/10] vhost: Add vhost_dev_from_virtio
@ 2021-02-10  5:54                     ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-10  5:54 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Parav Pandit, Michael S. Tsirkin, qemu-level,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	virtualization, Michael Lilja, Jim Harford, Rob Miller


On 2021/2/9 下午11:35, Eugenio Perez Martin wrote:
> On Fri, Feb 5, 2021 at 4:52 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/2/4 下午5:25, Eugenio Perez Martin wrote:
>>> On Thu, Feb 4, 2021 at 4:14 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/2/2 下午6:17, Eugenio Perez Martin wrote:
>>>>> On Tue, Feb 2, 2021 at 4:31 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> On 2021/2/1 下午4:28, Eugenio Perez Martin wrote:
>>>>>>> On Mon, Feb 1, 2021 at 7:13 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>>>>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>>>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>>>>>> ---
>>>>>>>>>       include/hw/virtio/vhost.h |  1 +
>>>>>>>>>       hw/virtio/vhost.c         | 17 +++++++++++++++++
>>>>>>>>>       2 files changed, 18 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>>>>>>> index 4a8bc75415..fca076e3f0 100644
>>>>>>>>> --- a/include/hw/virtio/vhost.h
>>>>>>>>> +++ b/include/hw/virtio/vhost.h
>>>>>>>>> @@ -123,6 +123,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>>>>>       void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
>>>>>>>>>                               uint64_t features);
>>>>>>>>>       bool vhost_has_free_slot(void);
>>>>>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev);
>>>>>>>>>
>>>>>>>>>       int vhost_net_set_backend(struct vhost_dev *hdev,
>>>>>>>>>                                 struct vhost_vring_file *file);
>>>>>>>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>>>>>>>> index 28c7d78172..8683d507f5 100644
>>>>>>>>> --- a/hw/virtio/vhost.c
>>>>>>>>> +++ b/hw/virtio/vhost.c
>>>>>>>>> @@ -61,6 +61,23 @@ bool vhost_has_free_slot(void)
>>>>>>>>>           return slots_limit > used_memslots;
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>> +/*
>>>>>>>>> + * Get the vhost device associated to a VirtIO device.
>>>>>>>>> + */
>>>>>>>>> +struct vhost_dev *vhost_dev_from_virtio(const VirtIODevice *vdev)
>>>>>>>>> +{
>>>>>>>>> +    struct vhost_dev *hdev;
>>>>>>>>> +
>>>>>>>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>>>>>>>> +        if (hdev->vdev == vdev) {
>>>>>>>>> +            return hdev;
>>>>>>>>> +        }
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    assert(hdev);
>>>>>>>>> +    return NULL;
>>>>>>>>> +}
>>>>>>>> I'm not sure this can work in the case of multiqueue. E.g vhost-net
>>>>>>>> multiqueue is a N:1 mapping between vhost devics and virtio devices.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>> Right. We could add an "vdev vq index" parameter to the function in
>>>>>>> this case, but I guess the most reliable way to do this is to add a
>>>>>>> vhost_opaque value to VirtQueue, as Stefan proposed in previous RFC.
>>>>>> So the question still, it looks like it's easier to hide the shadow
>>>>>> virtqueue stuffs at vhost layer instead of expose them to virtio layer:
>>>>>>
>>>>>> 1) vhost protocol is stable ABI
>>>>>> 2) no need to deal with virtio stuffs which is more complex than vhost
>>>>>>
>>>>>> Or are there any advantages if we do it at virtio layer?
>>>>>>
>>>>> As far as I can tell, we will need the virtio layer the moment we
>>>>> start copying/translating buffers.
>>>>>
>>>>> In this series, the virtio dependency can be reduced if qemu does not
>>>>> check the used ring _F_NO_NOTIFY flag before writing to irqfd. It
>>>>> would enable packed queues and IOMMU immediately, and I think the cost
>>>>> should not be so high. In the previous RFC this check was deleted
>>>>> later anyway, so I think it was a bad idea to include it from the start.
>>>> I am not sure I understand here. For vhost, we can still do anything we
>>>> want, e.g accessing guest memory etc. Any blocker that prevent us from
>>>> copying/translating buffers? (Note that qemu will propagate memory
>>>> mappings to vhost).
>>>>
>>> There is nothing that forbids us to access directly, but if we don't
>>> reuse the virtio layer functionality we would have to duplicate every
>>> access function. "Need" was a too strong word maybe :).
>>>
>>> In other words: for the shadow vq vring exposed for the device, qemu
>>> treats it as a driver, and this functionality needs to be added to
>>> qemu. But for accessing the guest's one do not reuse virtio.c would be
>>> a bad idea in my opinion.
>>
>> The problem is, virtio.c is not a library and it has a lot of dependency
>> with other qemu modules basically makes it impossible to be reused at
>> vhost level.
>>
> While virtio.c as a whole has dependencies, I think that the functions
> needed in the original RFC do not have these dependencies.
>
> However I see how to split vring dataplane from virtio device
> management can benefit.


If you can split them out, that would be fine.


>
>> We can solve this by:
>>
>> 1) split the core functions out as a library or
>> 2) switch to use contrib/lib-vhostuser but needs to decouple UNIX socket
>> transport
>>
>> None of the above looks trivial and they are only device codes. For
>> shadow virtqueue, we need driver codes as well where no code can be reused.
>>
>> As we discussed, we probably need IOVA allocated when forwarding
>> descriptors between the two virtqueues. So my feeling is we can have our
>> own codes to start then we can consider whether we can reuse some from
>> the existing virtio.c or lib-vhostuser.
>>
> As I see it, if we develop our own code a lot of it will be copied
> from current virtio.c, which itself duplicates a lot of contrib/ lib
> functionality.
>
> Maybe it's better to combine your proposals and decouple the vring
> functions, the vhost transport, and the qemu virtio device management,
> so other projects can reuse them directly?


I think this can work.


>
> I still think this can be left for a later series with buffer
> forwarding on top of this one, do you think they can/should be merged
> independently?


Since you post a new series, let's see.

Thanks


>
> Thanks!
>
>> Thanks
>>
>>
>>>> Thanks
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>> I need to take this into account in qmp_x_vhost_enable_shadow_vq too.
>>>>>>>
>>>>>>>>> +
>>>>>>>>>       static void vhost_dev_sync_region(struct vhost_dev *dev,
>>>>>>>>>                                         MemoryRegionSection *section,
>>>>>>>>>                                         uint64_t mfirst, uint64_t mlast,

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue
  2021-02-09 15:02         ` Eugenio Perez Martin
@ 2021-02-10  5:57             ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-10  5:57 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Rob Miller, Parav Pandit, Juan Quintela, Michael S. Tsirkin,
	qemu-level, Markus Armbruster, Harpreet Singh Anand, Xiao W Wang,
	Stefan Hajnoczi, Eli Cohen, virtualization, Michael Lilja,
	Jim Harford, Stefano Garzarella


On 2021/2/9 下午11:02, Eugenio Perez Martin wrote:
> On Thu, Feb 4, 2021 at 4:27 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/2/2 下午6:08, Eugenio Perez Martin wrote:
>>> On Mon, Feb 1, 2021 at 7:29 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>>>> Shadow virtqueue notifications forwarding is disabled when vhost_dev
>>>>> stops.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>>     hw/virtio/vhost-shadow-virtqueue.h |   5 ++
>>>>>     include/hw/virtio/vhost.h          |   4 +
>>>>>     hw/virtio/vhost-shadow-virtqueue.c | 123 +++++++++++++++++++++++++-
>>>>>     hw/virtio/vhost.c                  | 135 ++++++++++++++++++++++++++++-
>>>>>     4 files changed, 264 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
>>>>> index 6cc18d6acb..466f8ae595 100644
>>>>> --- a/hw/virtio/vhost-shadow-virtqueue.h
>>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
>>>>> @@ -17,6 +17,11 @@
>>>>>
>>>>>     typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
>>>>>
>>>>> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
>>>>> +                               VhostShadowVirtqueue *svq);
>>>>> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
>>>>> +                              VhostShadowVirtqueue *svq);
>>>>> +
>>>>>     VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
>>>>>
>>>>>     void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>>> index 2be782cefd..732a4b2a2b 100644
>>>>> --- a/include/hw/virtio/vhost.h
>>>>> +++ b/include/hw/virtio/vhost.h
>>>>> @@ -55,6 +55,8 @@ struct vhost_iommu {
>>>>>         QLIST_ENTRY(vhost_iommu) iommu_next;
>>>>>     };
>>>>>
>>>>> +typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
>>>>> +
>>>>>     typedef struct VhostDevConfigOps {
>>>>>         /* Vhost device config space changed callback
>>>>>          */
>>>>> @@ -83,7 +85,9 @@ struct vhost_dev {
>>>>>         uint64_t backend_cap;
>>>>>         bool started;
>>>>>         bool log_enabled;
>>>>> +    bool sw_lm_enabled;
>>>>>         uint64_t log_size;
>>>>> +    VhostShadowVirtqueue **shadow_vqs;
>>>>>         Error *migration_blocker;
>>>>>         const VhostOps *vhost_ops;
>>>>>         void *opaque;
>>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
>>>>> index c0c967a7c5..908c36c66d 100644
>>>>> --- a/hw/virtio/vhost-shadow-virtqueue.c
>>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
>>>>> @@ -8,15 +8,129 @@
>>>>>      */
>>>>>
>>>>>     #include "hw/virtio/vhost-shadow-virtqueue.h"
>>>>> +#include "hw/virtio/vhost.h"
>>>>> +#include "hw/virtio/virtio-access.h"
>>>>> +
>>>>> +#include "standard-headers/linux/vhost_types.h"
>>>>> +#include "standard-headers/linux/virtio_ring.h"
>>>>>
>>>>>     #include "qemu/error-report.h"
>>>>> -#include "qemu/event_notifier.h"
>>>>> +#include "qemu/main-loop.h"
>>>>>
>>>>>     typedef struct VhostShadowVirtqueue {
>>>>>         EventNotifier kick_notifier;
>>>>>         EventNotifier call_notifier;
>>>>> +    const struct vhost_virtqueue *hvq;
>>>>> +    VirtIODevice *vdev;
>>>>> +    VirtQueue *vq;
>>>>>     } VhostShadowVirtqueue;
>>>> So instead of doing things at virtio level, how about do the shadow
>>>> stuffs at vhost level?
>>>>
>>>> It works like:
>>>>
>>>> virtio -> [shadow vhost backend] -> vhost backend
>>>>
>>>> Then the QMP is used to plug the shadow vhost backend in the middle or not.
>>>>
>>>> It looks kind of easier since we don't need to deal with virtqueue
>>>> handlers etc.. Instead, we just need to deal with eventfd stuffs:
>>>>
>>>> When shadow vhost mode is enabled, we just intercept the host_notifiers
>>>> and guest_notifiers. When it was disabled, we just pass the host/guest
>>>> notifiers to the real vhost backends?
>>>>
>>> Hi Jason.
>>>
>>> Sure we can try that model, but it seems to me that it comes with a
>>> different set of problems.
>>>
>>> For example, there are code in vhost.c that checks if implementations
>>> are available in vhost_ops, like:
>>>
>>> if (dev->vhost_ops->vhost_vq_get_addr) {
>>>           r = dev->vhost_ops->vhost_vq_get_addr(dev, &addr, vq);
>>>           ...
>>> }
>>>
>>> I can count 14 of these, checking:
>>>
>>> dev->vhost_ops->vhost_backend_can_merge
>>> dev->vhost_ops->vhost_backend_mem_section_filter
>>> dev->vhost_ops->vhost_force_iommu
>>> dev->vhost_ops->vhost_requires_shm_log
>>> dev->vhost_ops->vhost_set_backend_cap
>>> dev->vhost_ops->vhost_set_vring_busyloop_timeout
>>> dev->vhost_ops->vhost_vq_get_addr
>>> hdev->vhost_ops->vhost_dev_start
>>> hdev->vhost_ops->vhost_get_config
>>> hdev->vhost_ops->vhost_get_inflight_fd
>>> hdev->vhost_ops->vhost_net_set_backend
>>> hdev->vhost_ops->vhost_set_config
>>> hdev->vhost_ops->vhost_set_inflight_fd
>>> hdev->vhost_ops->vhost_set_iotlb_callback
>>>
>>> So we should Implement all of the vhost_ops callbacks, forwarding them
>>> to actual vhost_backed, and delete conditionally these ones? In other
>>> words, dynamically generate the new shadow vq vhost_ops? If a new
>>> callback is added to any vhost backend in the future, do we have to
>>> force the adding / checking for NULL in shadow backend vhost_ops?
>>> Would this be a good moment to check if all backends implement these
>>> and delete the checks?
>>
>> I think it won't be easy if we want to support all kinds of vhost
>> backends from the start. So we can go with vhost-vdpa one first.
>>
>> Actually how it work might be something like (no need to switch
>> vhost_ops, we can do everything silently in the ops)
>>
>> 1) when device to switch to shadow vq (e.g via QMP)
>> 2) vhost-vdpa will stop and sync state (last_avail_idx) internally
>> 3) reset vhost-vdpa, clean call and kick eventfd
>> 4) allocate vqs for vhost-vdpa, new call and kick eventfd, restart
>> vhost-vdpa
>> 5) start the shadow vq (make it start for last_avail_idx)
>> 6) intercept ioeventfd and forward the request to callfd
>> 7) intercept callfd and forward the request to irqfd
>> 8) forward request between shadow virtqueue and vhost-vdpa
>>
> Sorry, still not clear to me how it relates with the actual backend used :).


So I think I meant is, if we're doing at vhost level, we only play with 
eventfds not the event notifiers abstraction in virtio level.


>
> Would it work for you if we finish the notifications forwarding part
> and we work on the buffer forwarding part?


I think it's better to do them all since the notification forward itself 
is not a complete fnction.


>
> I'm going to send another cleaner revision, with less dependencies on
> virtio code, only notifications eventfds. I think it will be easier to
> discuss the changes on top of that.


Right, let's discuss there.

Thanks


>
>>> There are also checks like:
>>>
>>> if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER)
>>>
>>> How would shadow_vq backend expose itself? (I guess as the actual used backend).
>>>
>>> I can modify this patchset to not relay the guest->host notifications
>>> on vq handlers but on eventfd handlers. Although this will make it
>>> independent of the actual virtio device kind used, I can see two
>>> drawbacks:
>>> * The actual fact that it makes it independent of virtio device kind.
>>> If a device does not use the notifiers and poll the ring by itself, it
>>> has no chance of knowing that it should stop. What happens if
>>> virtio-net tx timer is armed when we start shadow vq?.
>>
>> So if we do that in vhost level, it's a vhost backend from the virtio
>> layer. Then we don't need to worry about tx timer stuffs.
>>
> Got it.
>
> So I'm going to assume that no device in virtio layer needs to be
> aware of the change. It seems a valid assumption to me.
>
>>> * The fixes (current and future) in vq notifications, like the one
>>> currently implemented in virtio_notify_irqfd for windows drivers
>>> regarding ISR bit 0. I think this one in particular is OK not to
>>> carry, but I think many changes affecting any of the functions will
>>> have to be mirrored in the other.
>>
>> Consider we behave like a vhost, it just work as in the past for other
>> type of vhost backends when MSI-X is not enabled?
>>
> Yes, it may be a bad example, as vhost devices may not update it.
> However I still think a lot of features (packed ring buffers
> treatment, etc), performance optimizations and future fixes will have
> to apply to two different codes in case we roll our own buffers
> treatment.
>
> Expanding in thread of patch 05/10, since you propose two solutions for this.
>
> Thanks!
>
>
>> Thanks
>>
>>
>>> Thoughts on this?
>>>
>>> Thanks!
>>>
>>>> Thanks
>>>>
>>>>
>>>>> +static uint16_t vhost_shadow_vring_used_flags(VhostShadowVirtqueue *svq)
>>>>> +{
>>>>> +    const struct vring_used *used = svq->hvq->used;
>>>>> +    return virtio_tswap16(svq->vdev, used->flags);
>>>>> +}
>>>>> +
>>>>> +static bool vhost_shadow_vring_should_kick(VhostShadowVirtqueue *vq)
>>>>> +{
>>>>> +    return !(vhost_shadow_vring_used_flags(vq) & VRING_USED_F_NO_NOTIFY);
>>>>> +}
>>>>> +
>>>>> +static void vhost_shadow_vring_kick(VhostShadowVirtqueue *vq)
>>>>> +{
>>>>> +    if (vhost_shadow_vring_should_kick(vq)) {
>>>>> +        event_notifier_set(&vq->kick_notifier);
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
>>>>> +{
>>>>> +    struct vhost_dev *hdev = vhost_dev_from_virtio(vdev);
>>>>> +    uint16_t idx = virtio_get_queue_index(vq);
>>>>> +
>>>>> +    VhostShadowVirtqueue *svq = hdev->shadow_vqs[idx];
>>>>> +
>>>>> +    vhost_shadow_vring_kick(svq);
>>>>> +}
>>>>> +
>>>>> +/*
>>>>> + * Start shadow virtqueue operation.
>>>>> + * @dev vhost device
>>>>> + * @svq Shadow Virtqueue
>>>>> + *
>>>>> + * Run in RCU context
>>>>> + */
>>>>> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
>>>>> +                               VhostShadowVirtqueue *svq)
>>>>> +{
>>>>> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
>>>>> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
>>>>> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
>>>>> +    struct vhost_vring_file kick_file = {
>>>>> +        .index = idx,
>>>>> +        .fd = event_notifier_get_fd(&svq->kick_notifier),
>>>>> +    };
>>>>> +    int r;
>>>>> +    bool ok;
>>>>> +
>>>>> +    /* Check that notifications are still going directly to vhost dev */
>>>>> +    assert(virtio_queue_host_notifier_status(svq->vq));
>>>>> +
>>>>> +    ok = k->set_vq_handler(dev->vdev, idx, handle_shadow_vq);
>>>>> +    if (!ok) {
>>>>> +        error_report("Couldn't set the vq handler");
>>>>> +        goto err_set_kick_handler;
>>>>> +    }
>>>>> +
>>>>> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
>>>>> +    if (r != 0) {
>>>>> +        error_report("Couldn't set kick fd: %s", strerror(errno));
>>>>> +        goto err_set_vring_kick;
>>>>> +    }
>>>>> +
>>>>> +    event_notifier_set_handler(vq_host_notifier,
>>>>> +                               virtio_queue_host_notifier_read);
>>>>> +    virtio_queue_set_host_notifier_enabled(svq->vq, false);
>>>>> +    virtio_queue_host_notifier_read(vq_host_notifier);
>>>>> +
>>>>> +    return true;
>>>>> +
>>>>> +err_set_vring_kick:
>>>>> +    k->set_vq_handler(dev->vdev, idx, NULL);
>>>>> +
>>>>> +err_set_kick_handler:
>>>>> +    return false;
>>>>> +}
>>>>> +
>>>>> +/*
>>>>> + * Stop shadow virtqueue operation.
>>>>> + * @dev vhost device
>>>>> + * @svq Shadow Virtqueue
>>>>> + *
>>>>> + * Run in RCU context
>>>>> + */
>>>>> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
>>>>> +                              VhostShadowVirtqueue *svq)
>>>>> +{
>>>>> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(svq->vdev);
>>>>> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
>>>>> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
>>>>> +    struct vhost_vring_file kick_file = {
>>>>> +        .index = idx,
>>>>> +        .fd = event_notifier_get_fd(vq_host_notifier),
>>>>> +    };
>>>>> +    int r;
>>>>> +
>>>>> +    /* Restore vhost kick */
>>>>> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
>>>>> +    /* Cannot do a lot of things */
>>>>> +    assert(r == 0);
>>>>> +
>>>>> +    event_notifier_set_handler(vq_host_notifier, NULL);
>>>>> +    virtio_queue_set_host_notifier_enabled(svq->vq, true);
>>>>> +    k->set_vq_handler(svq->vdev, idx, NULL);
>>>>> +}
>>>>> +
>>>>>     /*
>>>>>      * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
>>>>>      * methods and file descriptors.
>>>>> @@ -24,8 +138,13 @@ typedef struct VhostShadowVirtqueue {
>>>>>     VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>>>>>     {
>>>>>         g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
>>>>> +    int vq_idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + idx);
>>>>>         int r;
>>>>>
>>>>> +    svq->vq = virtio_get_queue(dev->vdev, vq_idx);
>>>>> +    svq->hvq = &dev->vqs[idx];
>>>>> +    svq->vdev = dev->vdev;
>>>>> +
>>>>>         r = event_notifier_init(&svq->kick_notifier, 0);
>>>>>         if (r != 0) {
>>>>>             error_report("Couldn't create kick event notifier: %s",
>>>>> @@ -40,7 +159,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>>>>>             goto err_init_call_notifier;
>>>>>         }
>>>>>
>>>>> -    return svq;
>>>>> +    return g_steal_pointer(&svq);
>>>>>
>>>>>     err_init_call_notifier:
>>>>>         event_notifier_cleanup(&svq->kick_notifier);
>>>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>>>> index 42836e45f3..bde688f278 100644
>>>>> --- a/hw/virtio/vhost.c
>>>>> +++ b/hw/virtio/vhost.c
>>>>> @@ -25,6 +25,7 @@
>>>>>     #include "exec/address-spaces.h"
>>>>>     #include "hw/virtio/virtio-bus.h"
>>>>>     #include "hw/virtio/virtio-access.h"
>>>>> +#include "hw/virtio/vhost-shadow-virtqueue.h"
>>>>>     #include "migration/blocker.h"
>>>>>     #include "migration/qemu-file-types.h"
>>>>>     #include "sysemu/dma.h"
>>>>> @@ -945,6 +946,82 @@ static void vhost_log_global_stop(MemoryListener *listener)
>>>>>         }
>>>>>     }
>>>>>
>>>>> +static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    WITH_RCU_READ_LOCK_GUARD() {
>>>>> +        dev->sw_lm_enabled = false;
>>>>> +
>>>>> +        for (idx = 0; idx < dev->nvqs; ++idx) {
>>>>> +            vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[idx]);
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    for (idx = 0; idx < dev->nvqs; ++idx) {
>>>>> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
>>>>> +    }
>>>>> +
>>>>> +    g_free(dev->shadow_vqs);
>>>>> +    dev->shadow_vqs = NULL;
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static int vhost_sw_live_migration_start(struct vhost_dev *dev)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs);
>>>>> +    for (idx = 0; idx < dev->nvqs; ++idx) {
>>>>> +        dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx);
>>>>> +        if (unlikely(dev->shadow_vqs[idx] == NULL)) {
>>>>> +            goto err;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    WITH_RCU_READ_LOCK_GUARD() {
>>>>> +        for (idx = 0; idx < dev->nvqs; ++idx) {
>>>>> +            int stop_idx = idx;
>>>>> +            bool ok = vhost_shadow_vq_start_rcu(dev,
>>>>> +                                                dev->shadow_vqs[idx]);
>>>>> +
>>>>> +            if (!ok) {
>>>>> +                while (--stop_idx >= 0) {
>>>>> +                    vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[stop_idx]);
>>>>> +                }
>>>>> +
>>>>> +                goto err;
>>>>> +            }
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    dev->sw_lm_enabled = true;
>>>>> +    return 0;
>>>>> +
>>>>> +err:
>>>>> +    for (; idx >= 0; --idx) {
>>>>> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
>>>>> +    }
>>>>> +    g_free(dev->shadow_vqs[idx]);
>>>>> +
>>>>> +    return -1;
>>>>> +}
>>>>> +
>>>>> +static int vhost_sw_live_migration_enable(struct vhost_dev *dev,
>>>>> +                                          bool enable_lm)
>>>>> +{
>>>>> +    int r;
>>>>> +
>>>>> +    if (enable_lm == dev->sw_lm_enabled) {
>>>>> +        return 0;
>>>>> +    }
>>>>> +
>>>>> +    r = enable_lm ? vhost_sw_live_migration_start(dev)
>>>>> +                  : vhost_sw_live_migration_stop(dev);
>>>>> +
>>>>> +    return r;
>>>>> +}
>>>>> +
>>>>>     static void vhost_log_start(MemoryListener *listener,
>>>>>                                 MemoryRegionSection *section,
>>>>>                                 int old, int new)
>>>>> @@ -1389,6 +1466,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>>>>>         hdev->log = NULL;
>>>>>         hdev->log_size = 0;
>>>>>         hdev->log_enabled = false;
>>>>> +    hdev->sw_lm_enabled = false;
>>>>>         hdev->started = false;
>>>>>         memory_listener_register(&hdev->memory_listener, &address_space_memory);
>>>>>         QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
>>>>> @@ -1816,6 +1894,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>>>>>             hdev->vhost_ops->vhost_dev_start(hdev, false);
>>>>>         }
>>>>>         for (i = 0; i < hdev->nvqs; ++i) {
>>>>> +        if (hdev->sw_lm_enabled) {
>>>>> +            vhost_shadow_vq_stop_rcu(hdev, hdev->shadow_vqs[i]);
>>>>> +            vhost_shadow_vq_free(hdev->shadow_vqs[i]);
>>>>> +        }
>>>>> +
>>>>>             vhost_virtqueue_stop(hdev,
>>>>>                                  vdev,
>>>>>                                  hdev->vqs + i,
>>>>> @@ -1829,6 +1912,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>>>>>             memory_listener_unregister(&hdev->iommu_listener);
>>>>>         }
>>>>>         vhost_log_put(hdev, true);
>>>>> +    g_free(hdev->shadow_vqs);
>>>>> +    hdev->sw_lm_enabled = false;
>>>>>         hdev->started = false;
>>>>>         hdev->vdev = NULL;
>>>>>     }
>>>>> @@ -1845,5 +1930,53 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>>>>>
>>>>>     void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
>>>>>     {
>>>>> -    error_setg(errp, "Shadow virtqueue still not implemented.");
>>>>> +    struct vhost_dev *hdev;
>>>>> +    const char *err_cause = NULL;
>>>>> +    const VirtioDeviceClass *k;
>>>>> +    int r;
>>>>> +    ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR;
>>>>> +
>>>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>>>> +        if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) {
>>>>> +            break;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    if (!hdev) {
>>>>> +        err_class = ERROR_CLASS_DEVICE_NOT_FOUND;
>>>>> +        err_cause = "Device not found";
>>>>> +        goto err;
>>>>> +    }
>>>>> +
>>>>> +    if (!hdev->started) {
>>>>> +        err_cause = "Device is not started";
>>>>> +        goto err;
>>>>> +    }
>>>>> +
>>>>> +    if (hdev->acked_features & BIT_ULL(VIRTIO_F_RING_PACKED)) {
>>>>> +        err_cause = "Use packed vq";
>>>>> +        goto err;
>>>>> +    }
>>>>> +
>>>>> +    if (vhost_dev_has_iommu(hdev)) {
>>>>> +        err_cause = "Device use IOMMU";
>>>>> +        goto err;
>>>>> +    }
>>>>> +
>>>>> +    k = VIRTIO_DEVICE_GET_CLASS(hdev->vdev);
>>>>> +    if (!k->set_vq_handler) {
>>>>> +        err_cause = "Virtio device type does not support reset of vq handler";
>>>>> +        goto err;
>>>>> +    }
>>>>> +
>>>>> +    r = vhost_sw_live_migration_enable(hdev, enable);
>>>>> +    if (unlikely(r)) {
>>>>> +        err_cause = "Error enabling (see monitor)";
>>>>> +    }
>>>>> +
>>>>> +err:
>>>>> +    if (err_cause) {
>>>>> +        error_set(errp, err_class,
>>>>> +                  "Can't enable shadow vq on %s: %s", name, err_cause);
>>>>> +    }
>>>>>     }
>



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue
@ 2021-02-10  5:57             ` Jason Wang
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Wang @ 2021-02-10  5:57 UTC (permalink / raw)
  To: Eugenio Perez Martin
  Cc: Rob Miller, Parav Pandit, Michael S. Tsirkin, qemu-level,
	Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
	virtualization, Michael Lilja, Jim Harford


On 2021/2/9 下午11:02, Eugenio Perez Martin wrote:
> On Thu, Feb 4, 2021 at 4:27 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On 2021/2/2 下午6:08, Eugenio Perez Martin wrote:
>>> On Mon, Feb 1, 2021 at 7:29 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On 2021/1/30 上午4:54, Eugenio Pérez wrote:
>>>>> Shadow virtqueue notifications forwarding is disabled when vhost_dev
>>>>> stops.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>>     hw/virtio/vhost-shadow-virtqueue.h |   5 ++
>>>>>     include/hw/virtio/vhost.h          |   4 +
>>>>>     hw/virtio/vhost-shadow-virtqueue.c | 123 +++++++++++++++++++++++++-
>>>>>     hw/virtio/vhost.c                  | 135 ++++++++++++++++++++++++++++-
>>>>>     4 files changed, 264 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
>>>>> index 6cc18d6acb..466f8ae595 100644
>>>>> --- a/hw/virtio/vhost-shadow-virtqueue.h
>>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
>>>>> @@ -17,6 +17,11 @@
>>>>>
>>>>>     typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
>>>>>
>>>>> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
>>>>> +                               VhostShadowVirtqueue *svq);
>>>>> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
>>>>> +                              VhostShadowVirtqueue *svq);
>>>>> +
>>>>>     VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
>>>>>
>>>>>     void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>>> index 2be782cefd..732a4b2a2b 100644
>>>>> --- a/include/hw/virtio/vhost.h
>>>>> +++ b/include/hw/virtio/vhost.h
>>>>> @@ -55,6 +55,8 @@ struct vhost_iommu {
>>>>>         QLIST_ENTRY(vhost_iommu) iommu_next;
>>>>>     };
>>>>>
>>>>> +typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
>>>>> +
>>>>>     typedef struct VhostDevConfigOps {
>>>>>         /* Vhost device config space changed callback
>>>>>          */
>>>>> @@ -83,7 +85,9 @@ struct vhost_dev {
>>>>>         uint64_t backend_cap;
>>>>>         bool started;
>>>>>         bool log_enabled;
>>>>> +    bool sw_lm_enabled;
>>>>>         uint64_t log_size;
>>>>> +    VhostShadowVirtqueue **shadow_vqs;
>>>>>         Error *migration_blocker;
>>>>>         const VhostOps *vhost_ops;
>>>>>         void *opaque;
>>>>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
>>>>> index c0c967a7c5..908c36c66d 100644
>>>>> --- a/hw/virtio/vhost-shadow-virtqueue.c
>>>>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
>>>>> @@ -8,15 +8,129 @@
>>>>>      */
>>>>>
>>>>>     #include "hw/virtio/vhost-shadow-virtqueue.h"
>>>>> +#include "hw/virtio/vhost.h"
>>>>> +#include "hw/virtio/virtio-access.h"
>>>>> +
>>>>> +#include "standard-headers/linux/vhost_types.h"
>>>>> +#include "standard-headers/linux/virtio_ring.h"
>>>>>
>>>>>     #include "qemu/error-report.h"
>>>>> -#include "qemu/event_notifier.h"
>>>>> +#include "qemu/main-loop.h"
>>>>>
>>>>>     typedef struct VhostShadowVirtqueue {
>>>>>         EventNotifier kick_notifier;
>>>>>         EventNotifier call_notifier;
>>>>> +    const struct vhost_virtqueue *hvq;
>>>>> +    VirtIODevice *vdev;
>>>>> +    VirtQueue *vq;
>>>>>     } VhostShadowVirtqueue;
>>>> So instead of doing things at virtio level, how about do the shadow
>>>> stuffs at vhost level?
>>>>
>>>> It works like:
>>>>
>>>> virtio -> [shadow vhost backend] -> vhost backend
>>>>
>>>> Then the QMP is used to plug the shadow vhost backend in the middle or not.
>>>>
>>>> It looks kind of easier since we don't need to deal with virtqueue
>>>> handlers etc.. Instead, we just need to deal with eventfd stuffs:
>>>>
>>>> When shadow vhost mode is enabled, we just intercept the host_notifiers
>>>> and guest_notifiers. When it was disabled, we just pass the host/guest
>>>> notifiers to the real vhost backends?
>>>>
>>> Hi Jason.
>>>
>>> Sure we can try that model, but it seems to me that it comes with a
>>> different set of problems.
>>>
>>> For example, there are code in vhost.c that checks if implementations
>>> are available in vhost_ops, like:
>>>
>>> if (dev->vhost_ops->vhost_vq_get_addr) {
>>>           r = dev->vhost_ops->vhost_vq_get_addr(dev, &addr, vq);
>>>           ...
>>> }
>>>
>>> I can count 14 of these, checking:
>>>
>>> dev->vhost_ops->vhost_backend_can_merge
>>> dev->vhost_ops->vhost_backend_mem_section_filter
>>> dev->vhost_ops->vhost_force_iommu
>>> dev->vhost_ops->vhost_requires_shm_log
>>> dev->vhost_ops->vhost_set_backend_cap
>>> dev->vhost_ops->vhost_set_vring_busyloop_timeout
>>> dev->vhost_ops->vhost_vq_get_addr
>>> hdev->vhost_ops->vhost_dev_start
>>> hdev->vhost_ops->vhost_get_config
>>> hdev->vhost_ops->vhost_get_inflight_fd
>>> hdev->vhost_ops->vhost_net_set_backend
>>> hdev->vhost_ops->vhost_set_config
>>> hdev->vhost_ops->vhost_set_inflight_fd
>>> hdev->vhost_ops->vhost_set_iotlb_callback
>>>
>>> So we should Implement all of the vhost_ops callbacks, forwarding them
>>> to actual vhost_backed, and delete conditionally these ones? In other
>>> words, dynamically generate the new shadow vq vhost_ops? If a new
>>> callback is added to any vhost backend in the future, do we have to
>>> force the adding / checking for NULL in shadow backend vhost_ops?
>>> Would this be a good moment to check if all backends implement these
>>> and delete the checks?
>>
>> I think it won't be easy if we want to support all kinds of vhost
>> backends from the start. So we can go with vhost-vdpa one first.
>>
>> Actually how it work might be something like (no need to switch
>> vhost_ops, we can do everything silently in the ops)
>>
>> 1) when device to switch to shadow vq (e.g via QMP)
>> 2) vhost-vdpa will stop and sync state (last_avail_idx) internally
>> 3) reset vhost-vdpa, clean call and kick eventfd
>> 4) allocate vqs for vhost-vdpa, new call and kick eventfd, restart
>> vhost-vdpa
>> 5) start the shadow vq (make it start for last_avail_idx)
>> 6) intercept ioeventfd and forward the request to callfd
>> 7) intercept callfd and forward the request to irqfd
>> 8) forward request between shadow virtqueue and vhost-vdpa
>>
> Sorry, still not clear to me how it relates with the actual backend used :).


So I think I meant is, if we're doing at vhost level, we only play with 
eventfds not the event notifiers abstraction in virtio level.


>
> Would it work for you if we finish the notifications forwarding part
> and we work on the buffer forwarding part?


I think it's better to do them all since the notification forward itself 
is not a complete fnction.


>
> I'm going to send another cleaner revision, with less dependencies on
> virtio code, only notifications eventfds. I think it will be easier to
> discuss the changes on top of that.


Right, let's discuss there.

Thanks


>
>>> There are also checks like:
>>>
>>> if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER)
>>>
>>> How would shadow_vq backend expose itself? (I guess as the actual used backend).
>>>
>>> I can modify this patchset to not relay the guest->host notifications
>>> on vq handlers but on eventfd handlers. Although this will make it
>>> independent of the actual virtio device kind used, I can see two
>>> drawbacks:
>>> * The actual fact that it makes it independent of virtio device kind.
>>> If a device does not use the notifiers and poll the ring by itself, it
>>> has no chance of knowing that it should stop. What happens if
>>> virtio-net tx timer is armed when we start shadow vq?.
>>
>> So if we do that in vhost level, it's a vhost backend from the virtio
>> layer. Then we don't need to worry about tx timer stuffs.
>>
> Got it.
>
> So I'm going to assume that no device in virtio layer needs to be
> aware of the change. It seems a valid assumption to me.
>
>>> * The fixes (current and future) in vq notifications, like the one
>>> currently implemented in virtio_notify_irqfd for windows drivers
>>> regarding ISR bit 0. I think this one in particular is OK not to
>>> carry, but I think many changes affecting any of the functions will
>>> have to be mirrored in the other.
>>
>> Consider we behave like a vhost, it just work as in the past for other
>> type of vhost backends when MSI-X is not enabled?
>>
> Yes, it may be a bad example, as vhost devices may not update it.
> However I still think a lot of features (packed ring buffers
> treatment, etc), performance optimizations and future fixes will have
> to apply to two different codes in case we roll our own buffers
> treatment.
>
> Expanding in thread of patch 05/10, since you propose two solutions for this.
>
> Thanks!
>
>
>> Thanks
>>
>>
>>> Thoughts on this?
>>>
>>> Thanks!
>>>
>>>> Thanks
>>>>
>>>>
>>>>> +static uint16_t vhost_shadow_vring_used_flags(VhostShadowVirtqueue *svq)
>>>>> +{
>>>>> +    const struct vring_used *used = svq->hvq->used;
>>>>> +    return virtio_tswap16(svq->vdev, used->flags);
>>>>> +}
>>>>> +
>>>>> +static bool vhost_shadow_vring_should_kick(VhostShadowVirtqueue *vq)
>>>>> +{
>>>>> +    return !(vhost_shadow_vring_used_flags(vq) & VRING_USED_F_NO_NOTIFY);
>>>>> +}
>>>>> +
>>>>> +static void vhost_shadow_vring_kick(VhostShadowVirtqueue *vq)
>>>>> +{
>>>>> +    if (vhost_shadow_vring_should_kick(vq)) {
>>>>> +        event_notifier_set(&vq->kick_notifier);
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
>>>>> +{
>>>>> +    struct vhost_dev *hdev = vhost_dev_from_virtio(vdev);
>>>>> +    uint16_t idx = virtio_get_queue_index(vq);
>>>>> +
>>>>> +    VhostShadowVirtqueue *svq = hdev->shadow_vqs[idx];
>>>>> +
>>>>> +    vhost_shadow_vring_kick(svq);
>>>>> +}
>>>>> +
>>>>> +/*
>>>>> + * Start shadow virtqueue operation.
>>>>> + * @dev vhost device
>>>>> + * @svq Shadow Virtqueue
>>>>> + *
>>>>> + * Run in RCU context
>>>>> + */
>>>>> +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
>>>>> +                               VhostShadowVirtqueue *svq)
>>>>> +{
>>>>> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
>>>>> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
>>>>> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
>>>>> +    struct vhost_vring_file kick_file = {
>>>>> +        .index = idx,
>>>>> +        .fd = event_notifier_get_fd(&svq->kick_notifier),
>>>>> +    };
>>>>> +    int r;
>>>>> +    bool ok;
>>>>> +
>>>>> +    /* Check that notifications are still going directly to vhost dev */
>>>>> +    assert(virtio_queue_host_notifier_status(svq->vq));
>>>>> +
>>>>> +    ok = k->set_vq_handler(dev->vdev, idx, handle_shadow_vq);
>>>>> +    if (!ok) {
>>>>> +        error_report("Couldn't set the vq handler");
>>>>> +        goto err_set_kick_handler;
>>>>> +    }
>>>>> +
>>>>> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
>>>>> +    if (r != 0) {
>>>>> +        error_report("Couldn't set kick fd: %s", strerror(errno));
>>>>> +        goto err_set_vring_kick;
>>>>> +    }
>>>>> +
>>>>> +    event_notifier_set_handler(vq_host_notifier,
>>>>> +                               virtio_queue_host_notifier_read);
>>>>> +    virtio_queue_set_host_notifier_enabled(svq->vq, false);
>>>>> +    virtio_queue_host_notifier_read(vq_host_notifier);
>>>>> +
>>>>> +    return true;
>>>>> +
>>>>> +err_set_vring_kick:
>>>>> +    k->set_vq_handler(dev->vdev, idx, NULL);
>>>>> +
>>>>> +err_set_kick_handler:
>>>>> +    return false;
>>>>> +}
>>>>> +
>>>>> +/*
>>>>> + * Stop shadow virtqueue operation.
>>>>> + * @dev vhost device
>>>>> + * @svq Shadow Virtqueue
>>>>> + *
>>>>> + * Run in RCU context
>>>>> + */
>>>>> +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
>>>>> +                              VhostShadowVirtqueue *svq)
>>>>> +{
>>>>> +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(svq->vdev);
>>>>> +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
>>>>> +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
>>>>> +    struct vhost_vring_file kick_file = {
>>>>> +        .index = idx,
>>>>> +        .fd = event_notifier_get_fd(vq_host_notifier),
>>>>> +    };
>>>>> +    int r;
>>>>> +
>>>>> +    /* Restore vhost kick */
>>>>> +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
>>>>> +    /* Cannot do a lot of things */
>>>>> +    assert(r == 0);
>>>>> +
>>>>> +    event_notifier_set_handler(vq_host_notifier, NULL);
>>>>> +    virtio_queue_set_host_notifier_enabled(svq->vq, true);
>>>>> +    k->set_vq_handler(svq->vdev, idx, NULL);
>>>>> +}
>>>>> +
>>>>>     /*
>>>>>      * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
>>>>>      * methods and file descriptors.
>>>>> @@ -24,8 +138,13 @@ typedef struct VhostShadowVirtqueue {
>>>>>     VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>>>>>     {
>>>>>         g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
>>>>> +    int vq_idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + idx);
>>>>>         int r;
>>>>>
>>>>> +    svq->vq = virtio_get_queue(dev->vdev, vq_idx);
>>>>> +    svq->hvq = &dev->vqs[idx];
>>>>> +    svq->vdev = dev->vdev;
>>>>> +
>>>>>         r = event_notifier_init(&svq->kick_notifier, 0);
>>>>>         if (r != 0) {
>>>>>             error_report("Couldn't create kick event notifier: %s",
>>>>> @@ -40,7 +159,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
>>>>>             goto err_init_call_notifier;
>>>>>         }
>>>>>
>>>>> -    return svq;
>>>>> +    return g_steal_pointer(&svq);
>>>>>
>>>>>     err_init_call_notifier:
>>>>>         event_notifier_cleanup(&svq->kick_notifier);
>>>>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>>>>> index 42836e45f3..bde688f278 100644
>>>>> --- a/hw/virtio/vhost.c
>>>>> +++ b/hw/virtio/vhost.c
>>>>> @@ -25,6 +25,7 @@
>>>>>     #include "exec/address-spaces.h"
>>>>>     #include "hw/virtio/virtio-bus.h"
>>>>>     #include "hw/virtio/virtio-access.h"
>>>>> +#include "hw/virtio/vhost-shadow-virtqueue.h"
>>>>>     #include "migration/blocker.h"
>>>>>     #include "migration/qemu-file-types.h"
>>>>>     #include "sysemu/dma.h"
>>>>> @@ -945,6 +946,82 @@ static void vhost_log_global_stop(MemoryListener *listener)
>>>>>         }
>>>>>     }
>>>>>
>>>>> +static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    WITH_RCU_READ_LOCK_GUARD() {
>>>>> +        dev->sw_lm_enabled = false;
>>>>> +
>>>>> +        for (idx = 0; idx < dev->nvqs; ++idx) {
>>>>> +            vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[idx]);
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    for (idx = 0; idx < dev->nvqs; ++idx) {
>>>>> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
>>>>> +    }
>>>>> +
>>>>> +    g_free(dev->shadow_vqs);
>>>>> +    dev->shadow_vqs = NULL;
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static int vhost_sw_live_migration_start(struct vhost_dev *dev)
>>>>> +{
>>>>> +    int idx;
>>>>> +
>>>>> +    dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs);
>>>>> +    for (idx = 0; idx < dev->nvqs; ++idx) {
>>>>> +        dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx);
>>>>> +        if (unlikely(dev->shadow_vqs[idx] == NULL)) {
>>>>> +            goto err;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    WITH_RCU_READ_LOCK_GUARD() {
>>>>> +        for (idx = 0; idx < dev->nvqs; ++idx) {
>>>>> +            int stop_idx = idx;
>>>>> +            bool ok = vhost_shadow_vq_start_rcu(dev,
>>>>> +                                                dev->shadow_vqs[idx]);
>>>>> +
>>>>> +            if (!ok) {
>>>>> +                while (--stop_idx >= 0) {
>>>>> +                    vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[stop_idx]);
>>>>> +                }
>>>>> +
>>>>> +                goto err;
>>>>> +            }
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    dev->sw_lm_enabled = true;
>>>>> +    return 0;
>>>>> +
>>>>> +err:
>>>>> +    for (; idx >= 0; --idx) {
>>>>> +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
>>>>> +    }
>>>>> +    g_free(dev->shadow_vqs[idx]);
>>>>> +
>>>>> +    return -1;
>>>>> +}
>>>>> +
>>>>> +static int vhost_sw_live_migration_enable(struct vhost_dev *dev,
>>>>> +                                          bool enable_lm)
>>>>> +{
>>>>> +    int r;
>>>>> +
>>>>> +    if (enable_lm == dev->sw_lm_enabled) {
>>>>> +        return 0;
>>>>> +    }
>>>>> +
>>>>> +    r = enable_lm ? vhost_sw_live_migration_start(dev)
>>>>> +                  : vhost_sw_live_migration_stop(dev);
>>>>> +
>>>>> +    return r;
>>>>> +}
>>>>> +
>>>>>     static void vhost_log_start(MemoryListener *listener,
>>>>>                                 MemoryRegionSection *section,
>>>>>                                 int old, int new)
>>>>> @@ -1389,6 +1466,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
>>>>>         hdev->log = NULL;
>>>>>         hdev->log_size = 0;
>>>>>         hdev->log_enabled = false;
>>>>> +    hdev->sw_lm_enabled = false;
>>>>>         hdev->started = false;
>>>>>         memory_listener_register(&hdev->memory_listener, &address_space_memory);
>>>>>         QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
>>>>> @@ -1816,6 +1894,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>>>>>             hdev->vhost_ops->vhost_dev_start(hdev, false);
>>>>>         }
>>>>>         for (i = 0; i < hdev->nvqs; ++i) {
>>>>> +        if (hdev->sw_lm_enabled) {
>>>>> +            vhost_shadow_vq_stop_rcu(hdev, hdev->shadow_vqs[i]);
>>>>> +            vhost_shadow_vq_free(hdev->shadow_vqs[i]);
>>>>> +        }
>>>>> +
>>>>>             vhost_virtqueue_stop(hdev,
>>>>>                                  vdev,
>>>>>                                  hdev->vqs + i,
>>>>> @@ -1829,6 +1912,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
>>>>>             memory_listener_unregister(&hdev->iommu_listener);
>>>>>         }
>>>>>         vhost_log_put(hdev, true);
>>>>> +    g_free(hdev->shadow_vqs);
>>>>> +    hdev->sw_lm_enabled = false;
>>>>>         hdev->started = false;
>>>>>         hdev->vdev = NULL;
>>>>>     }
>>>>> @@ -1845,5 +1930,53 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>>>>>
>>>>>     void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
>>>>>     {
>>>>> -    error_setg(errp, "Shadow virtqueue still not implemented.");
>>>>> +    struct vhost_dev *hdev;
>>>>> +    const char *err_cause = NULL;
>>>>> +    const VirtioDeviceClass *k;
>>>>> +    int r;
>>>>> +    ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR;
>>>>> +
>>>>> +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
>>>>> +        if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) {
>>>>> +            break;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    if (!hdev) {
>>>>> +        err_class = ERROR_CLASS_DEVICE_NOT_FOUND;
>>>>> +        err_cause = "Device not found";
>>>>> +        goto err;
>>>>> +    }
>>>>> +
>>>>> +    if (!hdev->started) {
>>>>> +        err_cause = "Device is not started";
>>>>> +        goto err;
>>>>> +    }
>>>>> +
>>>>> +    if (hdev->acked_features & BIT_ULL(VIRTIO_F_RING_PACKED)) {
>>>>> +        err_cause = "Use packed vq";
>>>>> +        goto err;
>>>>> +    }
>>>>> +
>>>>> +    if (vhost_dev_has_iommu(hdev)) {
>>>>> +        err_cause = "Device use IOMMU";
>>>>> +        goto err;
>>>>> +    }
>>>>> +
>>>>> +    k = VIRTIO_DEVICE_GET_CLASS(hdev->vdev);
>>>>> +    if (!k->set_vq_handler) {
>>>>> +        err_cause = "Virtio device type does not support reset of vq handler";
>>>>> +        goto err;
>>>>> +    }
>>>>> +
>>>>> +    r = vhost_sw_live_migration_enable(hdev, enable);
>>>>> +    if (unlikely(r)) {
>>>>> +        err_cause = "Error enabling (see monitor)";
>>>>> +    }
>>>>> +
>>>>> +err:
>>>>> +    if (err_cause) {
>>>>> +        error_set(errp, err_class,
>>>>> +                  "Can't enable shadow vq on %s: %s", name, err_cause);
>>>>> +    }
>>>>>     }
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2021-02-10  5:58 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-29 20:54 [RFC 00/10] vDPA shadow virtqueue - notifications forwarding Eugenio Pérez
2021-01-29 20:54 ` [RFC 01/10] virtio: Add virtqueue_set_handler Eugenio Pérez
2021-01-29 20:54 ` [RFC 02/10] virtio: Add set_vq_handler Eugenio Pérez
2021-01-29 20:54 ` [RFC 03/10] virtio: Add virtio_queue_get_idx Eugenio Pérez
2021-02-01  6:10   ` Jason Wang
2021-02-01  6:10     ` Jason Wang
2021-02-01  7:20     ` Eugenio Perez Martin
2021-01-29 20:54 ` [RFC 04/10] virtio: Add virtio_queue_host_notifier_status Eugenio Pérez
2021-01-29 20:54 ` [RFC 05/10] vhost: Add vhost_dev_from_virtio Eugenio Pérez
2021-02-01  6:12   ` Jason Wang
2021-02-01  6:12     ` Jason Wang
2021-02-01  8:28     ` Eugenio Perez Martin
2021-02-02  3:31       ` Jason Wang
2021-02-02  3:31         ` Jason Wang
2021-02-02 10:17         ` Eugenio Perez Martin
2021-02-04  3:14           ` Jason Wang
2021-02-04  3:14             ` Jason Wang
2021-02-04  9:25             ` Eugenio Perez Martin
2021-02-05  3:51               ` Jason Wang
2021-02-05  3:51                 ` Jason Wang
2021-02-09 15:35                 ` Eugenio Perez Martin
2021-02-10  5:54                   ` Jason Wang
2021-02-10  5:54                     ` Jason Wang
2021-01-29 20:54 ` [RFC 06/10] vhost: Save masked_notifier state Eugenio Pérez
2021-01-29 20:54 ` [RFC 07/10] vhost: Add VhostShadowVirtqueue Eugenio Pérez
2021-01-29 20:54 ` [RFC 08/10] vhost: Add x-vhost-enable-shadow-vq qmp Eugenio Pérez
2021-02-02 15:38   ` Eric Blake
2021-02-02 15:38     ` Eric Blake
2021-02-04  9:01     ` Eugenio Perez Martin
2021-02-04 12:16       ` Markus Armbruster
2021-02-04 12:16         ` Markus Armbruster
2021-02-04 14:03         ` Eugenio Perez Martin
2021-01-29 20:54 ` [RFC 09/10] vhost: Route guest->host notification through shadow virtqueue Eugenio Pérez
2021-02-01  6:29   ` Jason Wang
2021-02-01  6:29     ` Jason Wang
2021-02-02 10:08     ` Eugenio Perez Martin
2021-02-04  3:26       ` Jason Wang
2021-02-04  3:26         ` Jason Wang
2021-02-09 15:02         ` Eugenio Perez Martin
2021-02-10  5:57           ` Jason Wang
2021-02-10  5:57             ` Jason Wang
2021-01-29 20:54 ` [RFC 10/10] vhost: Route host->guest " Eugenio Pérez

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.