From: Jason Wang <jasowang@redhat.com>
To: "Eugenio Pérez" <eperezma@redhat.com>, qemu-devel@nongnu.org
Cc: Parav Pandit <parav@mellanox.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Guru Prasad <guru.prasad@broadcom.com>,
Juan Quintela <quintela@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
virtualization@lists.linux-foundation.org,
Harpreet Singh Anand <hanand@xilinx.com>,
Xiao W Wang <xiao.w.wang@intel.com>, Eli Cohen <eli@mellanox.com>,
Stefano Garzarella <sgarzare@redhat.com>,
Michael Lilja <ml@napatech.com>,
Jim Harford <jim.harford@broadcom.com>,
Rob Miller <rob.miller@broadcom.com>
Subject: Re: [RFC v2 05/13] vhost: Route guest->host notification through shadow virtqueue
Date: Tue, 16 Mar 2021 15:18:02 +0800 [thread overview]
Message-ID: <23e492d1-9e86-20d3-e2b3-b3d7c8c6da9c@redhat.com> (raw)
In-Reply-To: <20210315194842.277740-6-eperezma@redhat.com>
On 2021/3/16 3:48 AM, Eugenio Pérez wrote:
> Shadow virtqueue notifications forwarding is disabled when vhost_dev
> stops, so code flow follows usual cleanup.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> hw/virtio/vhost-shadow-virtqueue.h | 7 ++
> include/hw/virtio/vhost.h | 4 +
> hw/virtio/vhost-shadow-virtqueue.c | 113 ++++++++++++++++++++++-
> hw/virtio/vhost.c | 143 ++++++++++++++++++++++++++++-
> 4 files changed, 265 insertions(+), 2 deletions(-)
>
> diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> index 6cc18d6acb..c891c6510d 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.h
> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> @@ -17,6 +17,13 @@
>
> typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
>
> +bool vhost_shadow_vq_start(struct vhost_dev *dev,
> + unsigned idx,
> + VhostShadowVirtqueue *svq);
> +void vhost_shadow_vq_stop(struct vhost_dev *dev,
> + unsigned idx,
> + VhostShadowVirtqueue *svq);
> +
> VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
>
> void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index ac963bf23d..7ffdf9aea0 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -55,6 +55,8 @@ struct vhost_iommu {
> QLIST_ENTRY(vhost_iommu) iommu_next;
> };
>
> +typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
> +
> typedef struct VhostDevConfigOps {
> /* Vhost device config space changed callback
> */
> @@ -83,7 +85,9 @@ struct vhost_dev {
> uint64_t backend_cap;
> bool started;
> bool log_enabled;
> + bool shadow_vqs_enabled;
> uint64_t log_size;
> + VhostShadowVirtqueue **shadow_vqs;
Is there any reason you don't embed the shadow virtqueue into the
vhost_virtqueue structure?
(Note that struct vhost_virtqueue already contains a masked_notifier.)
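To illustrate, the embedded layout could look roughly like the sketch below. The types here are minimal stand-ins, not QEMU's real definitions, and the helper name is hypothetical; the point is that per-vq shadow state would live next to masked_notifier and the owning vq stays reachable via container_of:

```c
#include <assert.h>
#include <stddef.h>
#include <stdbool.h>

/* Stand-in types; the real ones live in QEMU's vhost headers. */
typedef struct EventNotifier { int fd; } EventNotifier;

typedef struct VhostShadowVirtqueue {
    EventNotifier kick_notifier;
    EventNotifier call_notifier;
} VhostShadowVirtqueue;

/* Sketch: shadow vq state embedded per-virtqueue, next to masked_notifier,
 * instead of a separately allocated array hanging off vhost_dev. */
struct vhost_virtqueue {
    EventNotifier masked_notifier;
    VhostShadowVirtqueue shadow_vq;
    bool shadow_enabled;
};

#define container_of(ptr, type, member) \
    ((type *)(void *)((char *)(ptr) - offsetof(type, member)))

/* From the embedded shadow vq we can still recover the owning vq,
 * so no extra back-pointer or parallel array is needed. */
static inline struct vhost_virtqueue *
shadow_vq_to_vq(VhostShadowVirtqueue *svq)
{
    return container_of(svq, struct vhost_virtqueue, shadow_vq);
}
```

With this layout the shadow state is allocated and freed together with the vq, so the separate shadow_vqs array and its alloc/free error paths would go away.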
> Error *migration_blocker;
> const VhostOps *vhost_ops;
> void *opaque;
> diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> index 4512e5b058..3e43399e9c 100644
> --- a/hw/virtio/vhost-shadow-virtqueue.c
> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> @@ -8,9 +8,12 @@
> */
>
> #include "hw/virtio/vhost-shadow-virtqueue.h"
> +#include "hw/virtio/vhost.h"
> +
> +#include "standard-headers/linux/vhost_types.h"
>
> #include "qemu/error-report.h"
> -#include "qemu/event_notifier.h"
> +#include "qemu/main-loop.h"
>
> /* Shadow virtqueue to relay notifications */
> typedef struct VhostShadowVirtqueue {
> @@ -18,14 +21,121 @@ typedef struct VhostShadowVirtqueue {
> EventNotifier kick_notifier;
> /* Shadow call notifier, sent to vhost */
> EventNotifier call_notifier;
> +
> + /*
> + * Borrowed virtqueue's guest to host notifier.
> + * To borrow it in this event notifier allows to register on the event
> + * loop and access the associated shadow virtqueue easily. If we use the
> + * VirtQueue, we don't have an easy way to retrieve it.
So this is something that worries me. It looks like a layering violation
that makes the code harder to get right.
I wonder if it would be simpler to start from a vDPA-dedicated shadow
virtqueue implementation:
1) embed the above fields in the vhost_vdpa structure
2) work at the level of
vhost_vdpa_set_vring_kick()/vhost_vdpa_set_vring_call()
Then the layering stays isolated and you have a much simpler context to
work in, where you don't need to care about a lot of synchronization:
1) vq masking
2) vhost dev start and stop
?
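Something like the sketch below is what I have in mind. All names and types here are illustrative stand-ins (the real plumbing is the VHOST_SET_VRING_KICK ioctl in hw/virtio/vhost-vdpa.c), but it shows the interception point: if shadowing is on, save the guest's fd and hand the device the shadow notifier instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in types; the real vhost_vdpa and vring-file definitions live in
 * hw/virtio/vhost-vdpa.c and the Linux vhost UAPI headers. */
typedef struct EventNotifier { int fd; } EventNotifier;
struct vhost_vring_file { unsigned int index; int fd; };

#define VDPA_MAX_VQS 8

struct vhost_vdpa {
    bool shadow_vqs_enabled;
    EventNotifier shadow_kick[VDPA_MAX_VQS];          /* svq kick notifiers */
    struct vhost_vring_file guest_kick[VDPA_MAX_VQS]; /* saved for restore */
};

/* Stand-in for the VHOST_SET_VRING_KICK ioctl; records what the device
 * actually received so the behavior can be checked. */
static int device_kick_fd = -1;
static int vdpa_kick_ioctl(struct vhost_vdpa *v, struct vhost_vring_file *f)
{
    (void)v;
    device_kick_fd = f->fd;
    return 0;
}

/* The single interception point: with shadowing enabled, remember the
 * guest's fd (for restore on stop) and give the device the shadow fd.
 * Everything stays inside the vhost-vdpa backend. */
static int vhost_vdpa_set_vring_kick(struct vhost_vdpa *v,
                                     struct vhost_vring_file *file)
{
    if (v->shadow_vqs_enabled) {
        v->guest_kick[file->index] = *file;
        struct vhost_vring_file shadow = {
            .index = file->index,
            .fd = v->shadow_kick[file->index].fd,
        };
        return vdpa_kick_ioctl(v, &shadow);
    }
    return vdpa_kick_ioctl(v, file);
}
```

Disabling shadowing and replaying the saved guest_kick entry through the same path would restore the direct notification, with no generic-vhost code involved.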
> + *
> + * So shadow virtqueue must not clean it, or we would lose VirtQueue one.
> + */
> + EventNotifier host_notifier;
> +
> + /* Virtio queue shadowing */
> + VirtQueue *vq;
> } VhostShadowVirtqueue;
>
> +/* Forward guest notifications */
> +static void vhost_handle_guest_kick(EventNotifier *n)
> +{
> + VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue,
> + host_notifier);
> +
> + if (unlikely(!event_notifier_test_and_clear(n))) {
> + return;
> + }
> +
> + event_notifier_set(&svq->kick_notifier);
> +}
> +
> +/*
> + * Restore the vhost guest to host notifier, i.e., disables svq effect.
> + */
> +static int vhost_shadow_vq_restore_vdev_host_notifier(struct vhost_dev *dev,
> + unsigned vhost_index,
> + VhostShadowVirtqueue *svq)
> +{
> + EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> + struct vhost_vring_file file = {
> + .index = vhost_index,
> + .fd = event_notifier_get_fd(vq_host_notifier),
> + };
> + int r;
> +
> + /* Restore vhost kick */
> + r = dev->vhost_ops->vhost_set_vring_kick(dev, &file);
> + return r ? -errno : 0;
> +}
> +
> +/*
> + * Start shadow virtqueue operation.
> + * @dev vhost device
> + * @hidx vhost virtqueue index
> + * @svq Shadow Virtqueue
> + */
> +bool vhost_shadow_vq_start(struct vhost_dev *dev,
> + unsigned idx,
> + VhostShadowVirtqueue *svq)
It looks to me like this assumes the vhost_dev is already started before
vhost_shadow_vq_start() is called?
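If that ordering is indeed assumed, it may be worth making it explicit with a guard, roughly as sketched below (stand-in types; vhost_dev does have a 'started' flag in QEMU, the rest of the names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-ins for the real QEMU structures. */
struct vhost_dev { bool started; };
typedef struct VhostShadowVirtqueue { int unused; } VhostShadowVirtqueue;

/* Hypothetical guard: refuse to switch the kick fd unless the device is
 * already started, so a wrong start/stop ordering fails loudly instead of
 * installing an fd that the backend will later overwrite. */
static bool vhost_shadow_vq_start_checked(struct vhost_dev *dev,
                                          unsigned idx,
                                          VhostShadowVirtqueue *svq)
{
    (void)idx;
    (void)svq;
    if (!dev->started) {
        return false;
    }
    /* ... proceed as vhost_shadow_vq_start() does ... */
    return true;
}
```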
> +{
> + EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> + struct vhost_vring_file file = {
> + .index = idx,
> + .fd = event_notifier_get_fd(&svq->kick_notifier),
> + };
> + int r;
> +
> + /* Check that notifications are still going directly to vhost dev */
> + assert(virtio_queue_is_host_notifier_enabled(svq->vq));
> +
> + /*
> + * event_notifier_set_handler already checks for guest's notifications if
> + * they arrive in the switch, so there is no need to explicitly check for
> + * them.
> + */
> + event_notifier_init_fd(&svq->host_notifier,
> + event_notifier_get_fd(vq_host_notifier));
> + event_notifier_set_handler(&svq->host_notifier, vhost_handle_guest_kick);
> +
> + r = dev->vhost_ops->vhost_set_vring_kick(dev, &file);
> + if (unlikely(r != 0)) {
> + error_report("Couldn't set kick fd: %s", strerror(errno));
> + goto err_set_vring_kick;
> + }
> +
> + return true;
> +
> +err_set_vring_kick:
> + event_notifier_set_handler(&svq->host_notifier, NULL);
> +
> + return false;
> +}
> +
> +/*
> + * Stop shadow virtqueue operation.
> + * @dev vhost device
> + * @idx vhost queue index
> + * @svq Shadow Virtqueue
> + */
> +void vhost_shadow_vq_stop(struct vhost_dev *dev,
> + unsigned idx,
> + VhostShadowVirtqueue *svq)
> +{
> + int r = vhost_shadow_vq_restore_vdev_host_notifier(dev, idx, svq);
> + if (unlikely(r < 0)) {
> + error_report("Couldn't restore vq kick fd: %s", strerror(-r));
> + }
> +
> + event_notifier_set_handler(&svq->host_notifier, NULL);
> +}
> +
> /*
> * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
> * methods and file descriptors.
> */
> VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
> {
> + int vq_idx = dev->vq_index + idx;
> g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
> int r;
>
> @@ -43,6 +153,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
> goto err_init_call_notifier;
> }
>
> + svq->vq = virtio_get_queue(dev->vdev, vq_idx);
> return g_steal_pointer(&svq);
>
> err_init_call_notifier:
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 97f1bcfa42..4858a35df6 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -25,6 +25,7 @@
> #include "exec/address-spaces.h"
> #include "hw/virtio/virtio-bus.h"
> #include "hw/virtio/virtio-access.h"
> +#include "hw/virtio/vhost-shadow-virtqueue.h"
> #include "migration/blocker.h"
> #include "migration/qemu-file-types.h"
> #include "sysemu/dma.h"
> @@ -1219,6 +1220,74 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
> 0, virtio_queue_get_desc_size(vdev, idx));
> }
>
> +static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
> +{
> + int idx;
> +
> + dev->shadow_vqs_enabled = false;
> +
> + for (idx = 0; idx < dev->nvqs; ++idx) {
> + vhost_shadow_vq_stop(dev, idx, dev->shadow_vqs[idx]);
> + vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> + }
> +
> + g_free(dev->shadow_vqs);
> + dev->shadow_vqs = NULL;
> + return 0;
> +}
> +
> +static int vhost_sw_live_migration_start(struct vhost_dev *dev)
> +{
> + int idx, stop_idx;
> +
> + dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs);
> + for (idx = 0; idx < dev->nvqs; ++idx) {
> + dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx);
> + if (unlikely(dev->shadow_vqs[idx] == NULL)) {
> + goto err_new;
> + }
> + }
> +
> + dev->shadow_vqs_enabled = true;
> + for (idx = 0; idx < dev->nvqs; ++idx) {
> + bool ok = vhost_shadow_vq_start(dev, idx, dev->shadow_vqs[idx]);
> + if (unlikely(!ok)) {
> + goto err_start;
> + }
> + }
> +
> + return 0;
> +
> +err_start:
> + dev->shadow_vqs_enabled = false;
> + for (stop_idx = 0; stop_idx < idx; stop_idx++) {
> + vhost_shadow_vq_stop(dev, idx, dev->shadow_vqs[stop_idx]);
> + }
> +
> +err_new:
> + for (idx = 0; idx < dev->nvqs; ++idx) {
> + vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> + }
> + g_free(dev->shadow_vqs);
> +
> + return -1;
> +}
> +
> +static int vhost_sw_live_migration_enable(struct vhost_dev *dev,
> + bool enable_lm)
> +{
So the live migration part should be done in a separate patch.
Thanks
> + int r;
> +
> + if (enable_lm == dev->shadow_vqs_enabled) {
> + return 0;
> + }
> +
> + r = enable_lm ? vhost_sw_live_migration_start(dev)
> + : vhost_sw_live_migration_stop(dev);
> +
> + return r;
> +}
> +
> static void vhost_eventfd_add(MemoryListener *listener,
> MemoryRegionSection *section,
> bool match_data, uint64_t data, EventNotifier *e)
> @@ -1381,6 +1450,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> hdev->log = NULL;
> hdev->log_size = 0;
> hdev->log_enabled = false;
> + hdev->shadow_vqs_enabled = false;
> hdev->started = false;
> memory_listener_register(&hdev->memory_listener, &address_space_memory);
> QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
> @@ -1484,6 +1554,10 @@ void vhost_dev_disable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
> BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
> int i, r;
>
> + if (hdev->shadow_vqs_enabled) {
> + vhost_sw_live_migration_enable(hdev, false);
> + }
> +
> for (i = 0; i < hdev->nvqs; ++i) {
> r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i,
> false);
> @@ -1798,6 +1872,7 @@ fail_features:
> void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
> {
> int i;
> + bool is_shadow_vqs_enabled = hdev->shadow_vqs_enabled;
>
> /* should only be called after backend is connected */
> assert(hdev->vhost_ops);
> @@ -1805,7 +1880,16 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
> if (hdev->vhost_ops->vhost_dev_start) {
> hdev->vhost_ops->vhost_dev_start(hdev, false);
> }
> + if (is_shadow_vqs_enabled) {
> + /* Shadow virtqueue will be stopped */
> + hdev->shadow_vqs_enabled = false;
> + }
> for (i = 0; i < hdev->nvqs; ++i) {
> + if (is_shadow_vqs_enabled) {
> + vhost_shadow_vq_stop(hdev, i, hdev->shadow_vqs[i]);
> + vhost_shadow_vq_free(hdev->shadow_vqs[i]);
> + }
> +
> vhost_virtqueue_stop(hdev,
> vdev,
> hdev->vqs + i,
> @@ -1819,6 +1903,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
> memory_listener_unregister(&hdev->iommu_listener);
> }
> vhost_log_put(hdev, true);
> + g_free(hdev->shadow_vqs);
> + hdev->shadow_vqs_enabled = false;
> hdev->started = false;
> hdev->vdev = NULL;
> }
> @@ -1835,5 +1921,60 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
>
> void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
> {
> - error_setg(errp, "Shadow virtqueue still not implemented");
> + struct vhost_dev *hdev, *hdev_err;
> + VirtIODevice *vdev;
> + const char *err_cause = NULL;
> + int r;
> + ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR;
> +
> + QLIST_FOREACH(hdev, &vhost_devices, entry) {
> + if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) {
> + vdev = hdev->vdev;
> + break;
> + }
> + }
> +
> + if (!hdev) {
> + err_class = ERROR_CLASS_DEVICE_NOT_FOUND;
> + err_cause = "Device not found";
> + goto not_found_err;
> + }
> +
> + for ( ; hdev; hdev = QLIST_NEXT(hdev, entry)) {
> + if (vdev != hdev->vdev) {
> + continue;
> + }
> +
> + if (!hdev->started) {
> + err_cause = "Device is not started";
> + goto err;
> + }
> +
> + r = vhost_sw_live_migration_enable(hdev, enable);
> + if (unlikely(r)) {
> + err_cause = "Error enabling (see monitor)";
> + goto err;
> + }
> + }
> +
> + return;
> +
> +err:
> + QLIST_FOREACH(hdev_err, &vhost_devices, entry) {
> + if (hdev_err == hdev) {
> + break;
> + }
> +
> + if (vdev != hdev->vdev) {
> + continue;
> + }
> +
> + vhost_sw_live_migration_enable(hdev, !enable);
> + }
> +
> +not_found_err:
> + if (err_cause) {
> + error_set(errp, err_class,
> + "Can't enable shadow vq on %s: %s", name, err_cause);
> + }
> }