From: Stefan Hajnoczi <stefanha@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Parav Pandit <parav@mellanox.com>,
	Juan Quintela <quintela@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	qemu-level <qemu-devel@nongnu.org>,
	Markus Armbruster <armbru@redhat.com>,
	Eugenio Perez Martin <eperezma@redhat.com>,
	Harpreet Singh Anand <hanand@xilinx.com>,
	Xiao W Wang <xiao.w.wang@intel.com>, Eli Cohen <eli@mellanox.com>,
	virtualization@lists.linux-foundation.org,
	Eric Blake <eblake@redhat.com>, Michael Lilja <ml@napatech.com>,
	Stefano Garzarella <sgarzare@redhat.com>
Subject: Re: [RFC v3 00/29] vDPA software assisted live migration
Date: Mon, 19 Jul 2021 15:13:37 +0100
Message-ID: <YPWIkRLSd7/wj11k@stefanha-x1.localdomain>
In-Reply-To: <20210524072739-mutt-send-email-mst@kernel.org>

On Mon, May 24, 2021 at 07:29:06AM -0400, Michael S. Tsirkin wrote:
> On Mon, May 24, 2021 at 12:37:48PM +0200, Eugenio Perez Martin wrote:
> > On Mon, May 24, 2021 at 11:38 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, May 19, 2021 at 06:28:34PM +0200, Eugenio Pérez wrote:
> > > > Commit 17 introduces the buffer forwarding. The previous ones are
> > > > preparation again, and the later ones enable some obvious
> > > > optimizations. However, it needs the vDPA device to be able to map
> > > > every IOVA space, and some vDPA devices are not able to do so. A
> > > > check for this is added in previous commits.
> > >
> > > That might become a significant limitation. And it worries me that
> > > this is such a big patchset which might yet take a while to get
> > > finalized.
> > >
> > 
> > Sorry, maybe I've been unclear here: Later commits in this series
> > address this limitation. Still not perfect: for example, it does not
> > support adding or removing guest's memory at the moment, but this
> > should be easy to implement on top.
> > 
> > The main issue I'm observing comes from the kernel, if I'm not wrong:
> > if I unmap every address, I cannot re-map them again. But the code in
> > this patchset is mostly final, except for any comments that may arise
> > on the mailing list, of course.
> > 
> > > I have an idea: how about as a first step we implement a transparent
> > > switch from vdpa to a software virtio in QEMU or a software vhost in
> > > kernel?
> > >
> > > This will give us live migration quickly with performance comparable
> > > to failover but without dependence on guest cooperation.
> > >
> > 
> > I think it should be doable. I'm not sure about the effort that needs
> > to be done in qemu to hide these "hypervisor-failover devices" from
> > guest's view but it should be comparable to failover, as you say.
> > 
> > Networking should be ok by its nature, although it could require care
> > on the host hardware setup. But I'm not sure how other types of
> > vhost/vdpa devices may work that way. How would a disk/scsi device
> > switch modes? Can the kernel take control of the vdpa device through
> > vhost, and just start reporting with a dirty bitmap?
> > 
> > Thanks!
> 
> It depends, of course. For example, blk is mostly reads/writes, so
> there is not a lot of state. Just don't reorder or drop requests.

QEMU's virtio-blk does not attempt to change states (e.g. quiesce the
device or switch between vhost kernel/QEMU, etc.) while there are
in-flight requests. Instead, all currently active requests must complete
(in some cases they can be cancelled to stop them early). Note that
failed requests can be kept in a list across the switch and then
resubmitted later.

The underlying storage never has requests in flight while the device is
switched. QEMU does this because there's no way to hand over an
in-flight preadv(2), Linux AIO, or other host kernel block layer request
to another process.
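
The drain-then-switch ordering described above can be sketched roughly
as follows. This is only an illustrative model, not QEMU code: the
Request and Backend classes, the fail flag, and switch_backend() are
all hypothetical names, and the sketch assumes a failed request can
simply be retried on the new backend.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    req_id: int
    fail: bool = False  # hypothetical flag: simulate an I/O error


class Backend:
    """Toy stand-in for a request-processing backend (e.g. vhost or QEMU)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.in_flight: List[Request] = []
        self.completed: List[int] = []

    def submit(self, req: Request) -> None:
        self.in_flight.append(req)

    def drain(self) -> List[Request]:
        """Complete every in-flight request in order; return the failed ones."""
        failed = []
        while self.in_flight:
            req = self.in_flight.pop(0)
            if req.fail:
                failed.append(req)
            else:
                self.completed.append(req.req_id)
        return failed


def switch_backend(old: Backend, new: Backend) -> Backend:
    # Nothing is handed over mid-flight: the old backend drains first.
    failed = old.drain()
    assert not old.in_flight
    # Failed requests are kept in a list across the switch and
    # resubmitted afterwards, in their original order.
    for req in failed:
        req.fail = False  # assumption: the error was transient
        new.submit(req)
    return new
```

In this model, submitting requests 1, 2 (failing), and 3 to the old
backend and then calling switch_backend() completes 1 and 3 on the old
side and leaves only request 2 queued on the new one.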

Stefan
