From: Steve Sistare <steven.sistare@oracle.com>
To: qemu-devel@nongnu.org
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Daniel P. Berrange" <berrange@redhat.com>,
	"Juan Quintela" <quintela@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Eric Blake" <eblake@redhat.com>,
	"Jason Zeng" <jason.zeng@linux.intel.com>,
	"Zheng Chuan" <zhengchuan@huawei.com>,
	"Steve Sistare" <steven.sistare@oracle.com>,
	"Mark Kanda" <mark.kanda@oracle.com>,
	"Guoyi Tu" <tugy@chinatelecom.cn>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Philippe Mathieu-Daudé" <philippe.mathieu.daude@gmail.com>,
	"Igor Mammedov" <imammedo@redhat.com>,
	"David Hildenbrand" <david@redhat.com>,
	"John Snow" <jsnow@redhat.com>, "Peng Liang" <tcx4c70@gmail.com>
Subject: [PATCH V9 00/46] Live Update
Date: Tue, 26 Jul 2022 09:09:57 -0700
Message-ID: <1658851843-236870-1-git-send-email-steven.sistare@oracle.com>

This version of the live update patch series integrates live update into the
live migration framework.  The new interfaces are:
  * mode (migration parameter)
  * cpr-exec-args (migration parameter)
  * file (migration URI)
  * migrate-mode-enable (command-line argument)
  * only-cpr-capable (command-line argument)

Provide the cpr-exec and cpr-reboot migration modes for live update.  These
save and restore VM state, with minimal guest pause time, so that qemu may be
updated to a new version in between.  The caller sets the mode parameter
before invoking the migrate or migrate-incoming commands.

In cpr-reboot mode, the migrate command saves state to a file, allowing
one to quit qemu, reboot to an updated kernel, start an updated version of
qemu, and resume via the migrate-incoming command.  The caller must specify
a migration URI that writes to and reads from a file.  Unlike normal mode,
the use of certain local storage options does not block the migration, but
the caller must not modify guest block devices between the quit and restart.
The guest RAM memory-backend must be shared, and the @x-ignore-shared
migration capability must be set, so that guest RAM is not saved to the file.
Guest RAM must persist across the reboot, which can be achieved by backing it
with a dax device, or with /dev/shm PKRAM as proposed in
https://lore.kernel.org/lkml/1617140178-8773-1-git-send-email-anthony.yznaga@oracle.com
but qemu does not enforce this.  The restarted qemu arguments must match those
used to initially start qemu, plus the -incoming option.
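
As a rough illustration of the save-side steps and the restart rule for
cpr-reboot mode, the sketch below builds the QMP messages a management script
might send, and derives the restart command line from the original one.  The
QMP spelling of the mode parameter is assumed to match the HMP form used in
this letter, and the helper names are hypothetical.

```python
import json


def reboot_save_commands(state_file):
    """QMP messages for saving state in cpr-reboot mode (illustrative only)."""
    return [
        # x-ignore-shared keeps shared guest RAM out of the state file.
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [
             {"capability": "x-ignore-shared", "state": True}]}},
        # 'mode' is the migration parameter added by this series; its QMP
        # spelling is an assumption based on the HMP examples.
        {"execute": "migrate-set-parameters",
         "arguments": {"mode": "cpr-reboot"}},
        {"execute": "migrate", "arguments": {"uri": "file:" + state_file}},
    ]


def restart_argv(original_argv):
    """Same arguments as the initial start, plus the -incoming option."""
    if "-incoming" in original_argv:
        raise ValueError("original command line already has -incoming")
    return list(original_argv) + ["-incoming", "defer"]


if __name__ == "__main__":
    for msg in reboot_save_commands("/tmp/qemu.sav"):
        print(json.dumps(msg))
    print(restart_argv(["qemu-system-x86_64", "-m", "4G"]))
```

Deriving the restart arguments from the original list, rather than retyping
them, is one way to keep the two command lines from drifting apart.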

The reboot mode supports vfio devices if the caller first suspends the guest,
such as by issuing guest-suspend-ram to the qemu guest agent.  The guest
drivers' suspend methods flush outstanding requests and re-initialize the
devices, and thus there is no device state to save and restore.  After
issuing migrate-incoming, the caller must issue a system_wakeup command to
resume.
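
The ordering constraint for vfio in reboot mode can be written down as a small
check, shown here as a sketch.  It takes a list of command names in the order
a caller issued them and verifies that the guest was suspended before the
migrate, and that system_wakeup follows migrate-incoming.  The function name is
hypothetical.  In practice guest-suspend-ram goes to the guest agent while the
other commands go to the monitor; the sketch ignores that distinction.

```python
def check_vfio_reboot_order(trace):
    """Return True if the trace contains the required commands in order.

    trace is a list of command names in the order they were issued.
    Other commands may appear in between.
    """
    required = ["guest-suspend-ram", "migrate",
                "migrate-incoming", "system_wakeup"]
    # Position of each required command, or -1 if it never appears.
    positions = [trace.index(name) if name in trace else -1
                 for name in required]
    return all(p >= 0 for p in positions) and positions == sorted(positions)
```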

In cpr-exec mode, the migrate command saves state to a file and directly
exec's a new version of qemu on the same host, replacing the original process
while retaining its PID.  The caller must specify a migration URI that writes
to and reads from a file, and must resume execution via the migrate-incoming
command.  Arguments for the new qemu process are taken from the cpr-exec-args
migration parameter, and must include the -incoming option.
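
A caller may want to validate the new command line before handing it to qemu.
The sketch below does that and builds a migrate-set-parameters message.  It
assumes that cpr-exec-args is expressed in QMP as a list of strings (suggested
by the strList patches in this series, but not stated here), and the helper
name is hypothetical.

```python
def exec_mode_parameters(new_argv):
    """Build a QMP message selecting cpr-exec mode (illustrative only)."""
    # The new qemu must start in incoming mode so it can load the state.
    if "-incoming" not in new_argv:
        raise ValueError("cpr-exec-args must include the -incoming option")
    return {
        "execute": "migrate-set-parameters",
        "arguments": {
            "mode": "cpr-exec",
            # Assumed to be a list of strings in QMP.
            "cpr-exec-args": list(new_argv),
        },
    }
```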

Guest RAM must be backed by a memory backend with share=on, but cannot be
memory-backend-ram.  The memory is re-mmap'd in the updated process, so guest
RAM is efficiently preserved in place, albeit with new virtual addresses.
In addition, the '-migrate-mode-enable cpr-exec' option is required.  This
causes secondary guest RAM blocks (those not specified on the command line)
to be allocated by mmap'ing a memfd.  The memfds are kept open across exec,
their descriptor numbers are saved in special cpr state which is retrieved
after exec, and they are re-mmap'd.  Since guest RAM is not copied, and storage
blocks are not migrated, the caller must disable all capabilities related to
page and block copy.  The implementation ignores all related parameters.

The exec mode supports vfio devices by preserving the vfio container, group,
device, and event descriptors across the qemu re-exec, and by updating DMA
mapping virtual addresses using VFIO_DMA_UNMAP_FLAG_VADDR and
VFIO_DMA_MAP_FLAG_VADDR as defined in
  https://lore.kernel.org/kvm/1611939252-7240-1-git-send-email-steven.sistare@oracle.com
and integrated in Linux kernel 5.12.

Here is an example of updating qemu from v7.0.50 to v7.0.51 using exec mode.
The software update is performed while the guest is running to minimize
downtime.

window 1                                        | window 2
                                                |
# qemu-system-$arch ...                         |
  -migrate-mode-enable cpr-exec                 |
QEMU 7.0.50 monitor - type 'help' ...           |
(qemu) info status                              |
VM status: running                              |
                                                | # yum update qemu
(qemu) migrate_set_parameter mode cpr-exec      |
(qemu) migrate_set_parameter cpr-exec-args      |
  qemu-system-$arch ... -incoming defer         |
(qemu) migrate -d file:/tmp/qemu.sav            |
QEMU 7.0.51 monitor - type 'help' ...           |
(qemu) info status                              |
VM status: paused (inmigrate)                   |
(qemu) migrate_incoming file:/tmp/qemu.sav      |
(qemu) info status                              |
VM status: running                              |


Here is an example of updating the host kernel using reboot mode.

window 1                                        | window 2
                                                |
# qemu-system-$arch ... mem-path=/dev/dax0.0    |
  -migrate-mode-enable cpr-reboot               |
QEMU 7.0.50 monitor - type 'help' ...           |
(qemu) info status                              |
VM status: running                              |
                                                | # yum update kernel-uek
(qemu) migrate_set_parameter mode cpr-reboot    |
(qemu) migrate -d file:/tmp/qemu.sav            |
(qemu) quit                                     |
                                                |
# systemctl kexec                               |
kexec_core: Starting new kernel                 |
...                                             |
                                                |
# qemu-system-$arch mem-path=/dev/dax0.0 ...    |
  -incoming defer                               |
QEMU 7.0.51 monitor - type 'help' ...           |
(qemu) info status                              |
VM status: paused (inmigrate)                   |
(qemu) migrate_incoming file:/tmp/qemu.sav      |
(qemu) info status                              |
VM status: running                              |

Changes from V8 to V9:
  vfio:
    - free all cpr state during unwind in vfio_connect_container
    - change cpr_resave_fd to return void, and avoid new unwind cases
    - delete incorrect .unmigratable=1 in vmstate handlers
    - add route batching in vfio_claim_vectors
    - simplified vfio intx cpr code
    - fix commit message for 'recover from unmap-all-vaddr failure'
    - verify suspended runstate for cpr-reboot mode
  Other:
    - delete cpr-save, cpr-exec, cpr-load
    - delete ram block vmstate handlers that were added in V8
    - rename cpr-enable option to migrate-mode-enable
    - add file URI for migration
    - add mode and cpr-exec-args migration parameters
    - add per-mode migration blockers
    - add mode checks in migration notifiers
    - fix suspended runstate during migration
    - replace RAM_ANON flag with RAM_NAMED_FILE
    - support memory-backend-epc

Steve Sistare (44):
  migration: fix populate_vfio_info                  ---  reboot mode  ---
  memory: RAM_NAMED_FILE flag
  migration: file URI
  migration: mode parameter
  migration: migrate-enable-mode option
  migration: simplify blockers
  migration: per-mode blockers
  cpr: relax some blockers
  cpr: reboot mode

  qdev-properties: strList                           ---  exec mode ---
  qapi: strList_from_string
  qapi: QAPI_LIST_LENGTH
  qapi: strv_from_strList
  qapi: strList unit tests
  migration: cpr-exec-args parameter
  migration: simplify notifiers
  migration: check mode in notifiers
  memory: flat section iterator
  oslib: qemu_clear_cloexec
  vl: helper to request re-exec
  cpr: preserve extra state
  cpr: exec mode
  cpr: add exec-mode blockers
  cpr: ram block blockers
  cpr: only-cpr-capable
  cpr: Mismatched GPAs fix
  hostmem-memfd: cpr support
  hostmem-epc: cpr support

  pci: export msix_is_pending                       --- vfio for exec ---
  vfio-pci: refactor for cpr
  vfio-pci: cpr part 1 (fd and dma)
  vfio-pci: cpr part 2 (msi)
  vfio-pci: cpr part 3 (intx)
  vfio-pci: recover from unmap-all-vaddr failure

  chardev: cpr framework                            --- misc for exec ---
  chardev: cpr for simple devices
  chardev: cpr for pty
  python/machine: QEMUMachine full_args
  python/machine: QEMUMachine reopen_qmp_connection
  tests/avocado: add cpr regression test

  vl: start on wakeup request                       --- vfio for reboot ---
  migration: fix suspended runstate
  migration: notifier error reporting
  vfio: allow cpr-reboot migration if suspended

Mark Kanda, Steve Sistare (2):
  vhost: reset vhost devices for cpr
  chardev: cpr for sockets

 MAINTAINERS                         |  14 ++
 accel/xen/xen-all.c                 |   3 +
 backends/hostmem-epc.c              |  18 +-
 backends/hostmem-file.c             |   1 +
 backends/hostmem-memfd.c            |  22 ++-
 backends/tpm/tpm_emulator.c         |  11 +-
 block/parallels.c                   |   7 +-
 block/qcow.c                        |   7 +-
 block/vdi.c                         |   7 +-
 block/vhdx.c                        |   7 +-
 block/vmdk.c                        |   7 +-
 block/vpc.c                         |   7 +-
 block/vvfat.c                       |   7 +-
 chardev/char-mux.c                  |   1 +
 chardev/char-null.c                 |   1 +
 chardev/char-pty.c                  |  16 +-
 chardev/char-serial.c               |   1 +
 chardev/char-socket.c               |  48 +++++
 chardev/char-stdio.c                |  31 +++
 chardev/char.c                      |  49 ++++-
 dump/dump.c                         |   4 +-
 gdbstub.c                           |   1 +
 hmp-commands.hx                     |   2 +-
 hw/9pfs/9p.c                        |  11 +-
 hw/core/qdev-properties-system.c    |  12 ++
 hw/core/qdev-properties.c           |  44 +++++
 hw/display/virtio-gpu-base.c        |   8 +-
 hw/intc/arm_gic_kvm.c               |   3 +-
 hw/intc/arm_gicv3_its_kvm.c         |   3 +-
 hw/intc/arm_gicv3_kvm.c             |   3 +-
 hw/misc/ivshmem.c                   |   8 +-
 hw/net/virtio-net.c                 |  10 +-
 hw/pci/msix.c                       |   2 +-
 hw/pci/pci.c                        |  12 ++
 hw/ppc/pef.c                        |   2 +-
 hw/ppc/spapr.c                      |   2 +-
 hw/ppc/spapr_events.c               |   2 +-
 hw/ppc/spapr_rtas.c                 |   2 +-
 hw/remote/proxy.c                   |   7 +-
 hw/s390x/s390-virtio-ccw.c          |   9 +-
 hw/scsi/vhost-scsi.c                |   9 +-
 hw/vfio/common.c                    | 235 +++++++++++++++++++----
 hw/vfio/cpr.c                       | 177 ++++++++++++++++++
 hw/vfio/meson.build                 |   1 +
 hw/vfio/migration.c                 |  23 +--
 hw/vfio/pci.c                       | 336 ++++++++++++++++++++++++++++-----
 hw/vfio/trace-events                |   1 +
 hw/virtio/vhost-vdpa.c              |   6 +-
 hw/virtio/vhost.c                   |  32 +++-
 include/chardev/char-socket.h       |   1 +
 include/chardev/char.h              |   5 +
 include/exec/memory.h               |  48 +++++
 include/exec/ram_addr.h             |   1 +
 include/exec/ramblock.h             |   1 +
 include/hw/pci/msix.h               |   1 +
 include/hw/qdev-properties-system.h |   4 +
 include/hw/qdev-properties.h        |   3 +
 include/hw/vfio/vfio-common.h       |  12 ++
 include/hw/virtio/vhost.h           |   1 +
 include/migration/blocker.h         |  69 ++++++-
 include/migration/cpr-state.h       |  30 +++
 include/migration/cpr.h             |  20 ++
 include/migration/misc.h            |  13 +-
 include/migration/vmstate.h         |   2 +
 include/qapi/util.h                 |  28 +++
 include/qemu/osdep.h                |   9 +
 include/sysemu/runstate.h           |   2 +
 migration/cpr-state.c               | 362 ++++++++++++++++++++++++++++++++++++
 migration/cpr.c                     |  85 +++++++++
 migration/file.c                    |  62 ++++++
 migration/file.h                    |  14 ++
 migration/meson.build               |   3 +
 migration/migration.c               | 268 +++++++++++++++++++++++---
 migration/ram.c                     |  24 ++-
 migration/target.c                  |   1 +
 migration/trace-events              |  12 ++
 monitor/hmp-cmds.c                  |  59 +++---
 monitor/hmp.c                       |   3 +
 monitor/qmp.c                       |   4 +
 python/qemu/machine/machine.py      |  14 ++
 qapi/char.json                      |   7 +-
 qapi/migration.json                 |  68 ++++++-
 qapi/qapi-util.c                    |  37 ++++
 qemu-options.hx                     |  50 ++++-
 replay/replay.c                     |   4 +
 softmmu/memory.c                    |  31 ++-
 softmmu/physmem.c                   | 100 +++++++++-
 softmmu/runstate.c                  |  42 ++++-
 softmmu/vl.c                        |  10 +
 stubs/cpr-state.c                   |  26 +++
 stubs/meson.build                   |   2 +
 stubs/migr-blocker.c                |   9 +-
 stubs/migration.c                   |  33 ++++
 target/i386/kvm/kvm.c               |   8 +-
 target/i386/nvmm/nvmm-all.c         |   4 +-
 target/i386/sev.c                   |   2 +-
 target/i386/whpx/whpx-all.c         |   3 +-
 tests/avocado/cpr.py                | 176 ++++++++++++++++++
 tests/unit/meson.build              |   1 +
 tests/unit/test-strlist.c           |  81 ++++++++
 trace-events                        |   1 +
 ui/spice-core.c                     |   5 +-
 ui/vdagent.c                        |   5 +-
 util/oslib-posix.c                  |   9 +
 util/oslib-win32.c                  |   4 +
 105 files changed, 2781 insertions(+), 330 deletions(-)
 create mode 100644 hw/vfio/cpr.c
 create mode 100644 include/migration/cpr-state.h
 create mode 100644 include/migration/cpr.h
 create mode 100644 migration/cpr-state.c
 create mode 100644 migration/cpr.c
 create mode 100644 migration/file.c
 create mode 100644 migration/file.h
 create mode 100644 stubs/cpr-state.c
 create mode 100644 stubs/migration.c
 create mode 100644 tests/avocado/cpr.py
 create mode 100644 tests/unit/test-strlist.c

-- 
1.8.3.1


