qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Cc: Peter Krempa <pkrempa@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	qemu-devel@nongnu.org,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Den Lunev <den@openvz.org>
Subject: Re: [PATCH v4 0/6] UFFD write-tracking migration/snapshots
Date: Fri, 27 Nov 2020 10:45:46 -0500	[thread overview]
Message-ID: <20201127154546.GC6573@xz-x1> (raw)
In-Reply-To: <23cbb153-9260-3357-04ba-94da7da8c0d2@virtuozzo.com>

On Fri, Nov 27, 2020 at 01:00:48PM +0300, Andrey Gruzdev wrote:
> On 27.11.2020 12:49, Peter Krempa wrote:
> > On Fri, Nov 27, 2020 at 11:21:39 +0300, Andrey Gruzdev wrote:
> > > On 26.11.2020 18:47, Peter Krempa wrote:
> > > > On Thu, Nov 26, 2020 at 18:17:28 +0300, Andrey Gruzdev via wrote:
> > > > > This patch series is a kind of 'rethinking' of Denis Plotnikov's ideas he's
> > > > > implemented in his series '[PATCH v0 0/4] migration: add background snapshot'.
> > > > > 
> > > > > Currently the only way to make (external) live VM snapshot is using existing
> > > > > dirty page logging migration mechanism. The main problem is that it tends to
> > > > > produce a lot of page duplicates while running VM goes on updating already
> > > > > saved pages. That leads to the fact that vmstate image size is commonly several
> > > > > times bigger then non-zero part of virtual machine's RSS. Time required to
> > > > > converge RAM migration and the size of snapshot image severely depend on the
> > > > > guest memory write rate, sometimes resulting in unacceptably long snapshot
> > > > > creation time and huge image size.
> > > > > 
> > > > > This series propose a way to solve the aforementioned problems. This is done
> > > > > by using different RAM migration mechanism based on UFFD write protection
> > > > > management introduced in v5.7 kernel. The migration strategy is to 'freeze'
> > > > > guest RAM content using write-protection and iteratively release protection
> > > > > for memory ranges that have already been saved to the migration stream.
> > > > > At the same time we read in pending UFFD write fault events and save those
> > > > > pages out-of-order with higher priority.
> > > > 
> > > > This sounds amazing! Based on your description I assume that the final
> > > > memory image contains state image from the beginning of the migration.
> > > > 
> > > > This would make it desirable for the 'migrate' qmp command to be used as
> > > > part of a 'transaction' qmp command so that we can do an instant disk
> > > > and memory snapshot without any extraneous pausing of the VM.
> > > > 
> > > > I'll have a look at using this mechanism in libvirt natively.
> > > > 
> > > 
> > > Correct, the final image contains state at the beginning of migration.
> > > 
> > > So far, if I'm not missing something about libvirt, for external snapshot
> > > creation it performs a sequence like that:
> > > migrate(fd)->transaction(blockdev-snapshot-all)->cont.
> > > 
> > > So, in case 'background-snapshot' capability is enabled, the sequence would
> > > change to:
> > > stop->transaction(blockdev-snapshot-all)->migrate(fd).
> > > With background snapshot migration it will finish with VM running so there's
> > > not need to 'cont' here.
> > 
> > Yes, that's correct.
> > 
> > The reason I've suggested that 'migrate' being part of a 'transaction'
> > is that it would remove the need to stop it for the disk snapshot part.
> > 
> 
> Hmm, I believe stopping VM for a short time is unavoidable to keep saved
> device state consistent with blockdev snapshot base.. May be I've missed
> something but it seems logical.

I guess PeterK meant an explicit stop vm command, rather than the stop in this
series that should be undetectable from the guest.  It would be definitely
great if PeterK could quickly follow up with this on libvirt. :)

One overall comment (before go into details) is that please remember to start
using "migration: " prefixes in patch subjects from the next version.

Thanks,

-- 
Peter Xu



  reply	other threads:[~2020-11-27 15:55 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-26 15:17 [PATCH v4 0/6] UFFD write-tracking migration/snapshots Andrey Gruzdev via
2020-11-26 15:17 ` [PATCH v4 1/6] introduce 'background-snapshot' migration capability Andrey Gruzdev via
2020-11-27 19:55   ` Peter Xu
2020-11-28 16:35     ` Andrey Gruzdev
2020-11-26 15:17 ` [PATCH v4 2/6] introduce UFFD-WP low-level interface helpers Andrey Gruzdev via
2020-11-27 21:04   ` Peter Xu
2020-11-29 20:12     ` Andrey Gruzdev
2020-11-30 15:34       ` Peter Xu
2020-11-30 18:41         ` Andrey Gruzdev
2020-12-01 12:24   ` Dr. David Alan Gilbert
2020-12-01 19:32     ` Andrey Gruzdev
2020-11-26 15:17 ` [PATCH v4 3/6] support UFFD write fault processing in ram_save_iterate() Andrey Gruzdev via
2020-11-27 21:49   ` Peter Xu
2020-11-29 21:14     ` Andrey Gruzdev
2020-11-30 16:32       ` Peter Xu
2020-11-30 19:27         ` Andrey Gruzdev
2020-11-26 15:17 ` [PATCH v4 4/6] implementation of background snapshot thread Andrey Gruzdev via
2020-11-26 15:17 ` [PATCH v4 5/6] the rest of write tracking migration code Andrey Gruzdev via
2020-11-27 22:26   ` Peter Xu
2020-11-30  8:09     ` Andrey Gruzdev
2020-11-26 15:17 ` [PATCH v4 6/6] introduce simple linear scan rate limiting mechanism Andrey Gruzdev via
2020-11-27 22:28   ` Peter Xu
2020-11-30  8:11     ` Andrey Gruzdev
2020-11-30 16:40       ` Peter Xu
2020-11-30 19:30         ` Andrey Gruzdev
2020-11-26 15:47 ` [PATCH v4 0/6] UFFD write-tracking migration/snapshots Peter Krempa
2020-11-27  8:21   ` Andrey Gruzdev
2020-11-27  9:49     ` Peter Krempa
2020-11-27 10:00       ` Andrey Gruzdev
2020-11-27 15:45         ` Peter Xu [this message]
2020-11-27 17:19           ` Andrey Gruzdev
2020-11-27 22:04 ` Peter Xu
2020-11-30  8:07   ` Andrey Gruzdev
2020-12-01  7:08 ` Peter Krempa
2020-12-01  8:42   ` Andrey Gruzdev
2020-12-01 10:53     ` Peter Krempa
2020-12-01 11:24       ` Andrey Gruzdev
2020-12-01 18:40         ` Dr. David Alan Gilbert
2020-12-01 19:22           ` Peter Xu
2020-12-01 20:01             ` Dr. David Alan Gilbert
2020-12-01 20:29               ` Andrey Gruzdev
2020-12-01 20:11           ` Andrey Gruzdev
2020-12-01 18:54         ` Peter Xu
2020-12-01 20:00           ` Dr. David Alan Gilbert
2020-12-01 20:26           ` Andrey Gruzdev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201127154546.GC6573@xz-x1 \
    --to=peterx@redhat.com \
    --cc=andrey.gruzdev@virtuozzo.com \
    --cc=armbru@redhat.com \
    --cc=den@openvz.org \
    --cc=dgilbert@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=pkrempa@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).