From: Peter Xu <peterx@redhat.com>
To: Claudio Fontana <cfontana@suse.de>
Cc: "Fabiano Rosas" <farosas@suse.de>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	qemu-devel@nongnu.org, jfehlig@suse.com, dfaggioli@suse.com,
	dgilbert@redhat.com, "Juan Quintela" <quintela@redhat.com>
Subject: Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
Date: Mon, 3 Apr 2023 15:26:58 -0400	[thread overview]
Message-ID: <ZCsogia3r7ePKBR9@x1n> (raw)
In-Reply-To: <d2b40262-3791-8820-5104-e4eb313cd796@suse.de>

Hi, Claudio,

Thanks for the context.

On Mon, Apr 03, 2023 at 09:47:26AM +0200, Claudio Fontana wrote:
> Hi, I'm not sure whether what is being asked for here is context on the
> previous upstream discussions, or on the specific requirement we are
> trying to bring upstream.
>
> In terms of the specific requirement we are trying to bring upstream, we
> need to get libvirt+QEMU VM save and restore functionality to be able to
> transfer VM sizes of ~30 GB (4/8 vcpus) in roughly 5 seconds.  When an
> event trigger happens, the VM needs to be quickly paused and saved to
> disk safely, including datasync, and another VM needs to be restored,
> also in ~5 secs.  For our specific requirement, the VM is never running
> when its data (mostly consisting of RAM) is saved.
>
> I understand that the need to also handle the "live" case comes from
> upstream discussions about solving the "general case", where someone
> might want to do this for "live" VMs, but if helpful I want to highlight
> that it is not part of the specific requirement we are trying to address,
> and for this specific case it won't be in the future either, since the
> whole point of the trigger is to replace the running VM with another VM,
> so it cannot be kept running.

From what I read so far, that scenario suits exactly what a live snapshot
would do with current QEMU - at the very least it should involve a snapshot
of the disks being used, or I can't see how it could be live.  So it looks
like a separate request.

> The reason we are using "migrate" here likely stems from the fact that
> existing libvirt code currently uses QMP migrate to implement the save
> and restore commands.  And in my personal view, I think that reusing the
> existing building blocks (migration, multifd) would be preferable, to
> avoid having to maintain two separate ways to do the same thing.  That
> said, it could be done in a different way, if the performance can keep
> up. Just thinking of reducing the overall effort and also maintenance
> surface.

I would vaguely guess the performance can not only keep up but exceed what
the current solution provides, due to the possibility of (1) batch handling
of contiguous guest pages, and (2) completely avoiding dirty tracking
overhead.

For (2), it's not about wr-protect page faults or vmexits due to PML being
full (the vcpus will be stopped anyway..), it's about enabling dirty
tracking in the first place (which already carries overhead, especially
when huge pages are enabled, since huge pages must be split in the EPT
pgtables) and all the bitmap operations QEMU does during live migration
even when the VM is not live.

IMHO reusing multifd may or may not be a good idea here, because it will of
course also complicate the multifd code, hence make multifd harder to
maintain, and not in a good way, because as I mentioned I don't think it
can use much of what multifd provides.

I don't have a strong opinion on the impl (even though I do have a
preference..), but I think at least we should still check on two things:

  - Being crystal clear on the use case above, and double check whether "VM
    stop" should be the default operation at the start of the new cmd - we
    shouldn't assume the user will be aware of doing this, neither should
    we assume the user is aware of the performance implications.

  - Making sure the image layout is well defined, so:

    - It'll be extensible in the future, and,

    - If someone would like to refactor it to not use the migration thread
      model anymore, the image format, hopefully, can be easy to keep
      untouched so it can be compatible with the current approach.

Just my two cents. I think Juan should have the best grasp on this.

Thanks,

-- 
Peter Xu




Thread overview: 65+ messages
2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 01/26] migration: Add support for 'file:' uri for source migration Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 02/26] migration: Add support for 'file:' uri for incoming migration Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 03/26] tests/qtest: migration: Add migrate_incoming_qmp helper Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 04/26] tests/qtest: migration-test: Add tests for file-based migration Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 05/26] migration: Initial support of fixed-ram feature for analyze-migration.py Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 06/26] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 07/26] io: Add generic pwritev/preadv interface Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 08/26] io: implement io_pwritev/preadv for QIOChannelFile Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 09/26] migration/qemu-file: add utility methods for working with seekable channels Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability Fabiano Rosas
2023-03-30 22:01   ` Peter Xu
2023-03-31  7:56     ` Daniel P. Berrangé
2023-03-31 14:39       ` Peter Xu
2023-03-31 15:34         ` Daniel P. Berrangé
2023-03-31 16:13           ` Peter Xu
2023-03-31 15:05     ` Fabiano Rosas
2023-03-31  5:50   ` Markus Armbruster
2023-03-30 18:03 ` [RFC PATCH v1 11/26] migration: Refactor precopy ram loading code Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 12/26] migration: Add support for 'fixed-ram' migration restore Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 13/26] tests/qtest: migration-test: Add tests for fixed-ram file-based migration Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 14/26] migration: Add completion tracepoint Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 15/26] migration/multifd: Remove direct "socket" references Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 16/26] migration/multifd: Allow multifd without packets Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 17/26] migration/multifd: Add outgoing QIOChannelFile support Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 18/26] migration/multifd: Add incoming " Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 19/26] migration/multifd: Add pages to the receiving side Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 20/26] io: Add a pwritev/preadv version that takes a discontiguous iovec Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 21/26] migration/ram: Add a wrapper for fixed-ram shadow bitmap Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 22/26] migration/multifd: Support outgoing fixed-ram stream format Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 23/26] migration/multifd: Support incoming " Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 24/26] tests/qtest: Add a multifd + fixed-ram migration test Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 25/26] migration: Add direct-io parameter Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 26/26] tests/migration/guestperf: Add file, fixed-ram and direct-io support Fabiano Rosas
2023-03-30 21:41 ` [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Peter Xu
2023-03-31 14:37   ` Fabiano Rosas
2023-03-31 14:52     ` Peter Xu
2023-03-31 15:30       ` Fabiano Rosas
2023-03-31 15:55         ` Peter Xu
2023-03-31 16:10           ` Daniel P. Berrangé
2023-03-31 16:27             ` Peter Xu
2023-03-31 18:18               ` Fabiano Rosas
2023-03-31 21:52                 ` Peter Xu
2023-04-03  7:47                   ` Claudio Fontana
2023-04-03 19:26                     ` Peter Xu [this message]
2023-04-04  8:00                       ` Claudio Fontana
2023-04-04 14:53                         ` Peter Xu
2023-04-04 15:10                           ` Claudio Fontana
2023-04-04 15:56                             ` Peter Xu
2023-04-06 16:46                               ` Fabiano Rosas
2023-04-07 10:36                                 ` Claudio Fontana
2023-04-11 15:48                                   ` Peter Xu
2023-04-18 16:58               ` Daniel P. Berrangé
2023-04-18 19:26                 ` Peter Xu
2023-04-19 17:12                   ` Daniel P. Berrangé
2023-04-19 19:07                     ` Peter Xu
2023-04-20  9:02                       ` Daniel P. Berrangé
2023-04-20 19:19                         ` Peter Xu
2023-04-21  7:48                           ` Daniel P. Berrangé
2023-04-21 13:56                             ` Peter Xu
2023-03-31 15:46       ` Daniel P. Berrangé
2023-04-03  7:38 ` David Hildenbrand
2023-04-03 14:41   ` Fabiano Rosas
2023-04-03 16:24     ` David Hildenbrand
2023-04-03 16:36       ` Fabiano Rosas
