From: Andrey Gruzdev via <qemu-devel@nongnu.org>
To: qemu-devel@nongnu.org
Cc: Den Lunev <den@openvz.org>, Eric Blake <eblake@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	Peter Xu <peterx@redhat.com>,
	Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Subject: [PATCH v4 0/6] UFFD write-tracking migration/snapshots
Date: Thu, 26 Nov 2020 18:17:28 +0300
Message-ID: <20201126151734.743849-1-andrey.gruzdev@virtuozzo.com>

This patch series is a rethinking of the ideas Denis Plotnikov implemented in
his series '[PATCH v0 0/4] migration: add background snapshot'.

Currently the only way to make an (external) live VM snapshot is to use the
existing dirty page logging migration mechanism. The main problem is that it
tends to produce many duplicate pages, because the running VM keeps updating
pages that have already been saved. As a result, the vmstate image is commonly
several times larger than the non-zero part of the virtual machine's RSS. The
time required to converge RAM migration and the size of the snapshot image
depend heavily on the guest memory write rate, sometimes resulting in
unacceptably long snapshot creation time and a huge image size.

This series proposes a way to solve the aforementioned problems. It uses a
different RAM migration mechanism based on the UFFD write-protection
management introduced in the v5.7 kernel. The migration strategy is to 'freeze'
guest RAM content using write protection and iteratively release protection
for memory ranges that have already been saved to the migration stream.
At the same time we read pending UFFD write fault events and save those
pages out of order with higher priority.
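
The save order described above can be illustrated with a small toy model. This
is only a sketch under stated assumptions: it does not touch userfaultfd at
all, guest RAM is modelled as one byte per page, and every name in it
(struct toy_ram, toy_freeze(), toy_guest_write(), toy_save_next()) is
hypothetical and does not appear in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Toy model only: one byte stands in for the content of a guest page. */
struct toy_ram {
    unsigned char content[NPAGES]; /* live guest memory */
    unsigned char image[NPAGES];   /* snapshot image, indexed by page */
    bool wp[NPAGES];               /* page is still write-protected */
    bool saved[NPAGES];            /* page is already in the image */
    bool queued[NPAGES];           /* a write fault was already queued */
    size_t fault_queue[NPAGES];    /* pages with pending write faults */
    size_t fault_head, fault_tail;
    size_t scan_pos;               /* cursor of the linear scan */
    size_t nsaved;
};

/* 'Freeze' RAM: reset the bookkeeping and write-protect every page. */
static void toy_freeze(struct toy_ram *r)
{
    for (size_t i = 0; i < NPAGES; i++) {
        r->wp[i] = true;
        r->saved[i] = false;
        r->queued[i] = false;
    }
    r->fault_head = r->fault_tail = 0;
    r->scan_pos = 0;
    r->nsaved = 0;
}

/*
 * Guest write. Returns false when the page is still protected: the write
 * does not happen, a fault is queued once, and the caller has to retry.
 */
static bool toy_guest_write(struct toy_ram *r, size_t page, unsigned char v)
{
    assert(page < NPAGES);
    if (r->wp[page]) {
        if (!r->queued[page]) {
            r->queued[page] = true;
            r->fault_queue[r->fault_tail++] = page;
        }
        return false;
    }
    r->content[page] = v;
    return true;
}

/* Copy the page to the image first, then release its write protection. */
static void toy_save_page(struct toy_ram *r, size_t page)
{
    r->image[page] = r->content[page];
    r->saved[page] = true;
    r->wp[page] = false;
    r->nsaved++;
}

/*
 * Save one page and return its index, or -1 when everything is saved.
 * Faulted pages are served before the linear scan continues.
 */
static long toy_save_next(struct toy_ram *r)
{
    while (r->fault_head < r->fault_tail) {
        size_t page = r->fault_queue[r->fault_head++];
        if (!r->saved[page]) {
            toy_save_page(r, page);
            return (long)page;
        }
    }
    while (r->scan_pos < NPAGES) {
        size_t page = r->scan_pos++;
        if (!r->saved[page]) {
            toy_save_page(r, page);
            return (long)page;
        }
    }
    return -1;
}
```

In this model each page is written to the image exactly once, with the content
it had at freeze time, regardless of how often the guest writes afterwards. A
caller would fill content[], call toy_freeze(), and then call toy_save_next()
in a loop until it returns -1, interleaved with toy_guest_write() calls.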

How to use:
1. Enable the write-tracking migration capability
   virsh qemu-monitor-command <domain> --hmp migrate_set_capability background-snapshot on

2. Start the external migration to a file
   virsh qemu-monitor-command <domain> --hmp migrate exec:'cat > ./vm_state'

3. Wait for the migration to finish and check that it has reached the
   'completed' state.

Changes v3->v4:

* 1. Renamed migration capability 'track-writes-ram' -> 'background-snapshot'.
* 2. Use an array of incompatible caps to replace bulky 'if' constructs.
* 3. Moved UFFD low-level code to a separate module ('util/userfaultfd.c').
* 4. Always do UFFD wr-unprotect on cleanup; just closing the file descriptor
*    won't clean up PTEs. It releases registration ranges, wait queues etc.,
*    but doesn't clean up the process MM context at the MMU level.
* 5. Allow enabling the 'background-snapshot' capability on Linux hosts only.
* 6. Put UFFD code usage under an '#ifdef CONFIG_LINUX' guard.
* 7. Removed the 'wt_' prefix in the RAMState struct.
* 8. Refactored ram_find_and_save_block() to make it cleaner: poll UFFD
*    wr-fault events in get_queued_page(), and use the ram_save_host_page_pre()
*    and ram_save_host_page_post() notifiers around ram_save_host_page()
*    instead of bulky inline write-unprotect code.
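
For item 2, the general shape of an array-driven compatibility check might look
roughly like the sketch below. It is not the code from this series: the enum,
the array contents and toy_snapshot_caps_ok() are all hypothetical names, and
the list of conflicting capabilities is purely illustrative.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability indices, not QEMU's QAPI-generated enum. */
enum toy_cap {
    TOY_CAP_POSTCOPY_RAM,
    TOY_CAP_COMPRESS,
    TOY_CAP_XBZRLE,
    TOY_CAP_BACKGROUND_SNAPSHOT,
    TOY_CAP_MAX,
};

/* Illustrative list of capabilities that conflict with the snapshot mode. */
static const enum toy_cap toy_snapshot_incompatible[] = {
    TOY_CAP_POSTCOPY_RAM,
    TOY_CAP_COMPRESS,
    TOY_CAP_XBZRLE,
};

/*
 * Return true if the capability set is acceptable. On failure, store the
 * first conflicting capability in *conflict when conflict is not NULL.
 */
static bool toy_snapshot_caps_ok(const bool *caps, enum toy_cap *conflict)
{
    size_t n = sizeof(toy_snapshot_incompatible) /
               sizeof(toy_snapshot_incompatible[0]);

    if (!caps[TOY_CAP_BACKGROUND_SNAPSHOT]) {
        return true; /* nothing to check without the snapshot capability */
    }
    for (size_t i = 0; i < n; i++) {
        if (caps[toy_snapshot_incompatible[i]]) {
            if (conflict) {
                *conflict = toy_snapshot_incompatible[i];
            }
            return false;
        }
    }
    return true;
}
```

With this layout, adding another incompatible capability is a one-line change
to the array rather than one more 'if' branch.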

Andrey Gruzdev (6):
  introduce 'background-snapshot' migration capability
  introduce UFFD-WP low-level interface helpers
  support UFFD write fault processing in ram_save_iterate()
  implementation of background snapshot thread
  the rest of write tracking migration code
  introduce simple linear scan rate limiting mechanism

 include/exec/memory.h      |   7 +
 include/qemu/userfaultfd.h |  29 ++++
 migration/migration.c      | 314 +++++++++++++++++++++++++++++++++-
 migration/migration.h      |   4 +
 migration/ram.c            | 334 ++++++++++++++++++++++++++++++++++++-
 migration/ram.h            |   4 +
 migration/savevm.c         |   1 -
 migration/savevm.h         |   2 +
 qapi/migration.json        |   7 +-
 util/meson.build           |   1 +
 util/userfaultfd.c         | 215 ++++++++++++++++++++++++
 11 files changed, 908 insertions(+), 10 deletions(-)
 create mode 100644 include/qemu/userfaultfd.h
 create mode 100644 util/userfaultfd.c

-- 
2.25.1


