kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Wang <jasowang@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "Stefan Hajnoczi" <stefanha@gmail.com>,
	"Eugenio Pérez" <eperezma@redhat.com>,
	qemu-devel@nongnu.org, "Lars Ganrot" <lars.ganrot@gmail.com>,
	virtualization@lists.linux-foundation.org,
	"Salil Mehta" <mehta.salil.lnk@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Liran Alon" <liralon@gmail.com>,
	"Rob Miller" <rob.miller@broadcom.com>,
	"Max Gurtovoy" <maxgu14@gmail.com>,
	"Alex Barba" <alex.barba@broadcom.com>,
	"Jim Harford" <jim.harford@broadcom.com>,
	"Harpreet Singh Anand" <hanand@xilinx.com>,
	"Christophe Fontaine" <cfontain@redhat.com>,
	vm <vmireyno@marvell.com>, "Daniel Daly" <dandaly0@gmail.com>,
	"Michael Lilja" <ml@napatech.com>,
	"Stefano Garzarella" <sgarzare@redhat.com>,
	"Nitin Shrivastav" <nitin.shrivastav@broadcom.com>,
	"Lee Ballard" <ballle98@gmail.com>,
	"Dmytro Kazantsev" <dmytro.kazantsev@gmail.com>,
	"Juan Quintela" <quintela@redhat.com>,
	kvm@vger.kernel.org, "Howard Cai" <howard.cai@gmail.com>,
	"Xiao W Wang" <xiao.w.wang@intel.com>,
	"Sean Mooney" <smooney@redhat.com>,
	"Parav Pandit" <parav@mellanox.com>,
	"Eli Cohen" <eli@mellanox.com>, "Siwei Liu" <loseweigh@gmail.com>,
	"Stephen Finucane" <stephenfin@redhat.com>
Subject: Re: [RFC PATCH 00/27] vDPA software assisted live migration
Date: Thu, 10 Dec 2020 17:12:29 +0800	[thread overview]
Message-ID: <750d098a-20e1-983b-9085-5197776cde35@redhat.com> (raw)
In-Reply-To: <20201209155729.GB396498@stefanha-x1.localdomain>


On 2020/12/9 下午11:57, Stefan Hajnoczi wrote:
> On Wed, Dec 09, 2020 at 04:26:50AM -0500, Jason Wang wrote:
>> ----- Original Message -----
>>> On Fri, Nov 20, 2020 at 07:50:38PM +0100, Eugenio Pérez wrote:
>>>> This series enable vDPA software assisted live migration for vhost-net
>>>> devices. This is a new method of vhost devices migration: Instead of
>>>> relay on vDPA device's dirty logging capability, SW assisted LM
>>>> intercepts dataplane, forwarding the descriptors between VM and device.
>>> Pros:
>>> + vhost/vDPA devices don't need to implement dirty memory logging
>>> + Obsoletes ioctl(VHOST_SET_LOG_BASE) and friends
>>>
>>> Cons:
>>> - Not generic, relies on vhost-net-specific ioctls
>>> - Doesn't support VIRTIO Shared Memory Regions
>>>    https://github.com/oasis-tcs/virtio-spec/blob/master/shared-mem.tex
>> I may miss something but my understanding is that it's the
>> responsiblity of device to migrate this part?
> Good point. You're right.
>
>>> - Performance (see below)
>>>
>>> I think performance will be significantly lower when the shadow vq is
>>> enabled. Imagine a vDPA device with hardware vq doorbell registers
>>> mapped into the guest so the guest driver can directly kick the device.
>>> When the shadow vq is enabled a vmexit is needed to write to the shadow
>>> vq ioeventfd, then the host kernel scheduler switches to a QEMU thread
>>> to read the ioeventfd, the descriptors are translated, QEMU writes to
>>> the vhost hdev kick fd, the host kernel scheduler switches to the vhost
>>> worker thread, vhost/vDPA notifies the virtqueue, and finally the
>>> vDPA driver writes to the hardware vq doorbell register. That is a lot
>>> of overhead compared to writing to an exitless MMIO register!
>> I think it's a balance. E.g we can poll the virtqueue to have an
>> exitless doorbell.
>>
>>> If the shadow vq was implemented in drivers/vhost/ and QEMU used the
>>> existing ioctl(VHOST_SET_LOG_BASE) approach, then the overhead would be
>>> reduced to just one set of ioeventfd/irqfd. In other words, the QEMU
>>> dirty memory logging happens asynchronously and isn't in the dataplane.
>>>
>>> In addition, hardware that supports dirty memory logging as well as
>>> software vDPA devices could completely eliminate the shadow vq for even
>>> better performance.
>> Yes. That's our plan. But the interface might require more thought.
>>
>> E.g is the bitmap a good approach? To me reporting dirty pages via
>> virqueue is better since it get less footprint and is self throttled.
>>
>> And we need an address space other than the one used by guest for
>> either bitmap for virtqueue.
>>
>>> But performance is a question of "is it good enough?". Maybe this
>>> approach is okay and users don't expect good performance while dirty
>>> memory logging is enabled.
>> Yes, and actually such slow down may help for the converge of the
>> migration.
>>
>> Note that the whole idea is try to have a generic solution for all
>> types of devices. It's good to consider the performance but for the
>> first stage, it should be sufficient to make it work and consider to
>> optimize on top.
> Moving the shadow vq to the kernel later would be quite a big change
> requiring rewriting much of the code. That's why I mentioned this now
> before a lot of effort is invested in a QEMU implementation.


Right.


>
>>> I just wanted to share the idea of moving the
>>> shadow vq into the kernel in case you like that approach better.
>> My understanding is to keep kernel as simple as possible and leave the
>> polices to userspace as much as possible. E.g it requires us to
>> disable doorbell mapping and irq offloading, all of which were under
>> the control of userspace.
> If the performance is acceptable with the QEMU approach then I think
> that's the best place to implement it. It looks high-overhead though so
> maybe one of the first things to do is to run benchmarks to collect data
> on how it performs?


Yes, I agree.

Thanks


>
> Stefan


      reply	other threads:[~2020-12-10  9:14 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-20 18:50 [RFC PATCH 00/27] vDPA software assisted live migration Eugenio Pérez
2020-11-20 18:50 ` [RFC PATCH 01/27] vhost: Add vhost_dev_can_log Eugenio Pérez
2020-11-20 18:50 ` [RFC PATCH 02/27] vhost: Add device callback in vhost_migration_log Eugenio Pérez
2020-12-07 16:19   ` Stefan Hajnoczi
2020-12-09 12:20     ` Eugenio Perez Martin
2020-11-20 18:50 ` [RFC PATCH 03/27] vhost: Move log resize/put to vhost_dev_set_log Eugenio Pérez
2020-11-20 18:50 ` [RFC PATCH 04/27] vhost: add vhost_kernel_set_vring_enable Eugenio Pérez
2020-12-07 16:43   ` Stefan Hajnoczi
2020-12-09 12:00     ` Eugenio Perez Martin
2020-12-09 16:08       ` Stefan Hajnoczi
2020-11-20 18:50 ` [RFC PATCH 05/27] vhost: Add hdev->dev.sw_lm_vq_handler Eugenio Pérez
2020-12-07 16:52   ` Stefan Hajnoczi
2020-12-09 15:02     ` Eugenio Perez Martin
2020-12-10 11:30       ` Stefan Hajnoczi
2020-11-20 18:50 ` [RFC PATCH 06/27] virtio: Add virtio_queue_get_used_notify_split Eugenio Pérez
2020-12-07 16:58   ` Stefan Hajnoczi
2021-01-12 18:21     ` Eugenio Perez Martin
2021-03-02 11:22       ` Stefan Hajnoczi
2021-03-02 18:34         ` Eugenio Perez Martin
2021-03-08 10:46           ` Stefan Hajnoczi
2020-11-20 18:50 ` [RFC PATCH 07/27] vhost: Route guest->host notification through qemu Eugenio Pérez
2020-12-07 17:42   ` Stefan Hajnoczi
2020-12-09 17:08     ` Eugenio Perez Martin
2020-12-10 11:50       ` Stefan Hajnoczi
2021-01-21 20:10         ` Eugenio Perez Martin
2020-11-20 18:50 ` [RFC PATCH 08/27] vhost: Add a flag for software assisted Live Migration Eugenio Pérez
2020-12-08  7:20   ` Stefan Hajnoczi
2020-12-09 17:57     ` Eugenio Perez Martin
2020-11-20 18:50 ` [RFC PATCH 09/27] vhost: Route host->guest notification through qemu Eugenio Pérez
2020-12-08  7:34   ` Stefan Hajnoczi
2020-11-20 18:50 ` [RFC PATCH 10/27] vhost: Allocate shadow vring Eugenio Pérez
2020-12-08  7:49   ` Stefan Hajnoczi
2020-12-08  8:17   ` Stefan Hajnoczi
2020-12-09 18:15     ` Eugenio Perez Martin
2020-11-20 18:50 ` [RFC PATCH 11/27] virtio: const-ify all virtio_tswap* functions Eugenio Pérez
2020-11-20 18:50 ` [RFC PATCH 12/27] virtio: Add virtio_queue_full Eugenio Pérez
2020-11-20 18:50 ` [RFC PATCH 13/27] vhost: Send buffers to device Eugenio Pérez
2020-12-08  8:16   ` Stefan Hajnoczi
2020-12-09 18:41     ` Eugenio Perez Martin
2020-12-10 11:55       ` Stefan Hajnoczi
2021-01-22 18:18         ` Eugenio Perez Martin
2020-11-20 18:50 ` [RFC PATCH 14/27] virtio: Remove virtio_queue_get_used_notify_split Eugenio Pérez
2020-11-20 18:50 ` [RFC PATCH 15/27] vhost: Do not invalidate signalled used Eugenio Pérez
2020-11-20 18:50 ` [RFC PATCH 16/27] virtio: Expose virtqueue_alloc_element Eugenio Pérez
2020-12-08  8:25   ` Stefan Hajnoczi
2020-12-09 18:46     ` Eugenio Perez Martin
2020-12-10 11:57       ` Stefan Hajnoczi
2020-11-20 18:50 ` [RFC PATCH 17/27] vhost: add vhost_vring_set_notification_rcu Eugenio Pérez
2020-11-20 18:50 ` [RFC PATCH 18/27] vhost: add vhost_vring_poll_rcu Eugenio Pérez
2020-12-08  8:41   ` Stefan Hajnoczi
2020-12-09 18:48     ` Eugenio Perez Martin
2020-11-20 18:50 ` [RFC PATCH 19/27] vhost: add vhost_vring_get_buf_rcu Eugenio Pérez
2020-11-20 18:50 ` [RFC PATCH 20/27] vhost: Return used buffers Eugenio Pérez
2020-12-08  8:50   ` Stefan Hajnoczi
2020-11-20 18:50 ` [RFC PATCH 21/27] vhost: Add vhost_virtqueue_memory_unmap Eugenio Pérez
2020-11-20 18:51 ` [RFC PATCH 22/27] vhost: Add vhost_virtqueue_memory_map Eugenio Pérez
2020-11-20 18:51 ` [RFC PATCH 23/27] vhost: unmap qemu's shadow virtqueues on sw live migration Eugenio Pérez
2020-11-27 15:29   ` Stefano Garzarella
2020-11-30  7:54     ` Eugenio Perez Martin
2020-11-20 18:51 ` [RFC PATCH 24/27] vhost: iommu changes Eugenio Pérez
2020-12-08  9:02   ` Stefan Hajnoczi
2020-11-20 18:51 ` [RFC PATCH 25/27] vhost: Do not commit vhost used idx on vhost_virtqueue_stop Eugenio Pérez
2020-11-20 19:35   ` Eugenio Perez Martin
2020-11-20 18:51 ` [RFC PATCH 26/27] vhost: Add vhost_hdev_can_sw_lm Eugenio Pérez
2020-11-20 18:51 ` [RFC PATCH 27/27] vhost: forbid vhost devices logging Eugenio Pérez
2020-11-20 19:03 ` [RFC PATCH 00/27] vDPA software assisted live migration Eugenio Perez Martin
2020-11-20 19:30 ` no-reply
2020-11-25  7:08 ` Jason Wang
2020-11-25 12:03   ` Eugenio Perez Martin
2020-11-25 12:14     ` Eugenio Perez Martin
2020-11-26  3:07     ` Jason Wang
2020-11-27 15:44 ` Stefano Garzarella
2020-12-08  9:37 ` Stefan Hajnoczi
2020-12-09  9:26   ` Jason Wang
2020-12-09 15:57     ` Stefan Hajnoczi
2020-12-10  9:12       ` Jason Wang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=750d098a-20e1-983b-9085-5197776cde35@redhat.com \
    --to=jasowang@redhat.com \
    --cc=alex.barba@broadcom.com \
    --cc=ballle98@gmail.com \
    --cc=cfontain@redhat.com \
    --cc=dandaly0@gmail.com \
    --cc=dmytro.kazantsev@gmail.com \
    --cc=eli@mellanox.com \
    --cc=eperezma@redhat.com \
    --cc=hanand@xilinx.com \
    --cc=howard.cai@gmail.com \
    --cc=jim.harford@broadcom.com \
    --cc=kvm@vger.kernel.org \
    --cc=lars.ganrot@gmail.com \
    --cc=liralon@gmail.com \
    --cc=loseweigh@gmail.com \
    --cc=maxgu14@gmail.com \
    --cc=mehta.salil.lnk@gmail.com \
    --cc=ml@napatech.com \
    --cc=mst@redhat.com \
    --cc=nitin.shrivastav@broadcom.com \
    --cc=parav@mellanox.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=rob.miller@broadcom.com \
    --cc=sgarzare@redhat.com \
    --cc=smooney@redhat.com \
    --cc=stefanha@gmail.com \
    --cc=stefanha@redhat.com \
    --cc=stephenfin@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=vmireyno@marvell.com \
    --cc=xiao.w.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).