* [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
@ 2023-03-30 18:03 Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 01/26] migration: Add support for 'file:' uri for source migration Fabiano Rosas
                   ` (27 more replies)
  0 siblings, 28 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

Hi folks,

I'm continuing the work done last year to add a new migration stream
format that can be used to migrate large guests to a single file in a
performant way.

This is an early RFC with the previous code + my additions to support
multifd and direct IO. Let me know what you think!

Here are the reference links for previous discussions:

https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01813.html
https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg01338.html
https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg05536.html

The series has 4 main parts:

1) File migration: A new "file:" migration URI. So "file:mig" does the
   same as "exec:cat > mig". Patches 1-4 implement this;

2) Fixed-ram format: A new format for the migration stream. Puts guest
   pages at their relative offsets in the migration file. This bounds
   the file size even in the worst case of RAM utilization, because
   every page has a fixed offset in the migration file, and
   (potentially) saves us time because we can write pages
   independently in parallel. It also gives alignment guarantees so we
   can use O_DIRECT. Patches 5-13 implement this;

With patches 1-13, these two features can be used with:

(qemu) migrate_set_capability fixed-ram on
(qemu) migrate[_incoming] file:mig

--> new in this series:

3) MultiFD support: This is about making use of the parallelism
   allowed by the new format. We just need the threading and page
   queuing infrastructure that is already in place for
   multifd. Patches 14-24 implement this;

(qemu) migrate_set_capability fixed-ram on
(qemu) migrate_set_capability multifd on
(qemu) migrate_set_parameter multifd-channels 4
(qemu) migrate_set_parameter max-bandwidth 0
(qemu) migrate[_incoming] file:mig

4) Add a new "direct-io" parameter and enable O_DIRECT for the
   properly aligned segments of the migration (mostly ram). Patch 25
   implements this;

(qemu) migrate_set_parameter direct-io on

Thanks! Some data below:
=====

Outgoing migration to file. NVMe disk. XFS filesystem.

- Single migration runs of a stopped 32G guest with ~90% RAM usage. Guest
  running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
  10m -v`:

migration type  | MB/s | pages/s |  ms
----------------+------+---------+------
savevm io_uring |  434 |  102294 | 71473
file:           | 3017 |  855862 | 10301
fixed-ram       | 1982 |  330686 | 15637
----------------+------+---------+------
fixed-ram + multifd + O_DIRECT
         2 ch.  | 5565 | 1500882 |  5576
         4 ch.  | 5735 | 1991549 |  5412
         8 ch.  | 5650 | 1769650 |  5489
        16 ch.  | 6071 | 1832407 |  5114
        32 ch.  | 6147 | 1809588 |  5050
        64 ch.  | 6344 | 1841728 |  4895
       128 ch.  | 6120 | 1915669 |  5085
----------------+------+---------+------

- Average of 10 migration runs of guestperf.py --mem 32 --cpus 4:

migration type | #ch. | MB/s | ms
---------------+------+------+-----
fixed-ram +    |    2 | 4132 | 8388
multifd        |    4 | 4273 | 8082
               |    8 | 4094 | 8441
               |   16 | 4204 | 8217
               |   32 | 4048 | 8528
               |   64 | 3861 | 8946
               |  128 | 3777 | 9147
---------------+------+------+-----
fixed-ram +    |    2 | 6031 | 5754
multifd +      |    4 | 6377 | 5421
O_DIRECT       |    8 | 6386 | 5416
               |   16 | 6321 | 5466
               |   32 | 5911 | 5321
               |   64 | 6375 | 5433
               |  128 | 6400 | 5412
---------------+------+------+-----

Fabiano Rosas (13):
  migration: Add completion tracepoint
  migration/multifd: Remove direct "socket" references
  migration/multifd: Allow multifd without packets
  migration/multifd: Add outgoing QIOChannelFile support
  migration/multifd: Add incoming QIOChannelFile support
  migration/multifd: Add pages to the receiving side
  io: Add a pwritev/preadv version that takes a discontiguous iovec
  migration/ram: Add a wrapper for fixed-ram shadow bitmap
  migration/multifd: Support outgoing fixed-ram stream format
  migration/multifd: Support incoming fixed-ram stream format
  tests/qtest: Add a multifd + fixed-ram migration test
  migration: Add direct-io parameter
  tests/migration/guestperf: Add file, fixed-ram and direct-io support

Nikolay Borisov (13):
  migration: Add support for 'file:' uri for source migration
  migration: Add support for 'file:' uri for incoming migration
  tests/qtest: migration: Add migrate_incoming_qmp helper
  tests/qtest: migration-test: Add tests for file-based migration
  migration: Initial support of fixed-ram feature for
    analyze-migration.py
  io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file
  io: Add generic pwritev/preadv interface
  io: implement io_pwritev/preadv for QIOChannelFile
  migration/qemu-file: add utility methods for working with seekable
    channels
  migration/ram: Introduce 'fixed-ram' migration stream capability
  migration: Refactor precopy ram loading code
  migration: Add support for 'fixed-ram' migration restore
  tests/qtest: migration-test: Add tests for fixed-ram file-based
    migration

 docs/devel/migration.rst              |  38 +++
 include/exec/ramblock.h               |   8 +
 include/io/channel-file.h             |   1 +
 include/io/channel.h                  | 133 ++++++++++
 include/migration/qemu-file-types.h   |   2 +
 include/qemu/osdep.h                  |   2 +
 io/channel-file.c                     |  60 +++++
 io/channel.c                          | 140 +++++++++++
 migration/file.c                      | 130 ++++++++++
 migration/file.h                      |  14 ++
 migration/meson.build                 |   1 +
 migration/migration-hmp-cmds.c        |   9 +
 migration/migration.c                 | 108 +++++++-
 migration/migration.h                 |  11 +-
 migration/multifd.c                   | 327 ++++++++++++++++++++----
 migration/multifd.h                   |  13 +
 migration/qemu-file.c                 |  80 ++++++
 migration/qemu-file.h                 |   4 +
 migration/ram.c                       | 349 ++++++++++++++++++++------
 migration/ram.h                       |   1 +
 migration/savevm.c                    |  23 +-
 migration/trace-events                |   1 +
 qapi/migration.json                   |  19 +-
 scripts/analyze-migration.py          |  51 +++-
 tests/migration/guestperf/engine.py   |  38 ++-
 tests/migration/guestperf/scenario.py |  14 +-
 tests/migration/guestperf/shell.py    |  18 +-
 tests/qtest/migration-helpers.c       |  19 ++
 tests/qtest/migration-helpers.h       |   4 +
 tests/qtest/migration-test.c          |  73 ++++++
 util/osdep.c                          |   9 +
 31 files changed, 1546 insertions(+), 154 deletions(-)
 create mode 100644 migration/file.c
 create mode 100644 migration/file.h

-- 
2.35.3




* [RFC PATCH v1 01/26] migration: Add support for 'file:' uri for source migration
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 02/26] migration: Add support for 'file:' uri for incoming migration Fabiano Rosas
                   ` (26 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov

From: Nikolay Borisov <nborisov@suse.com>

Implement support for a "file:" uri so that a migration can be initiated
directly to a file from QEMU.

Unlike other migration protocol backends, the 'file' protocol cannot
honour non-blocking mode. POSIX file/block storage will always report
ready to read/write, regardless of how slow the underlying storage
will be at servicing the request.

For outgoing migration this limitation is not a serious problem as
the migration data transfer always happens in a dedicated thread.
It may, however, result in delays in honouring a request to cancel
the migration operation.
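
As a usage sketch (the file path here is illustrative), an outgoing
migration to a file can be started from the HMP monitor or via QMP:

    (qemu) migrate file:/path/to/vm.mig

    { "execute": "migrate",
      "arguments": { "uri": "file:/path/to/vm.mig" } }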

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/file.c      | 21 +++++++++++++++++++++
 migration/file.h      |  9 +++++++++
 migration/meson.build |  1 +
 migration/migration.c |  3 +++
 4 files changed, 34 insertions(+)
 create mode 100644 migration/file.c
 create mode 100644 migration/file.h

diff --git a/migration/file.c b/migration/file.c
new file mode 100644
index 0000000000..36d6178c75
--- /dev/null
+++ b/migration/file.c
@@ -0,0 +1,21 @@
+#include "qemu/osdep.h"
+#include "channel.h"
+#include "io/channel-file.h"
+#include "file.h"
+#include "qemu/error-report.h"
+
+
+void file_start_outgoing_migration(MigrationState *s, const char *fname, Error **errp)
+{
+    QIOChannelFile *ioc;
+
+    ioc = qio_channel_file_new_path(fname, O_CREAT | O_TRUNC | O_WRONLY, 0660, errp);
+    if (!ioc) {
+        error_report("Error creating a channel");
+        return;
+    }
+
+    qio_channel_set_name(QIO_CHANNEL(ioc), "migration-file-outgoing");
+    migration_channel_connect(s, QIO_CHANNEL(ioc), NULL, NULL);
+    object_unref(OBJECT(ioc));
+}
diff --git a/migration/file.h b/migration/file.h
new file mode 100644
index 0000000000..d476eb1157
--- /dev/null
+++ b/migration/file.h
@@ -0,0 +1,9 @@
+#ifndef QEMU_MIGRATION_FILE_H
+#define QEMU_MIGRATION_FILE_H
+
+void file_start_outgoing_migration(MigrationState *s,
+                                   const char *filename,
+                                   Error **errp);
+
+#endif
+
diff --git a/migration/meson.build b/migration/meson.build
index 0d1bb9f96e..6c02298c70 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -17,6 +17,7 @@ softmmu_ss.add(files(
   'colo.c',
   'exec.c',
   'fd.c',
+  'file.c',
   'global_state.c',
   'migration-hmp-cmds.c',
   'migration.c',
diff --git a/migration/migration.c b/migration/migration.c
index ae2025d9d8..58ff0cb7c7 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -20,6 +20,7 @@
 #include "migration/blocker.h"
 #include "exec.h"
 #include "fd.h"
+#include "file.h"
 #include "socket.h"
 #include "sysemu/runstate.h"
 #include "sysemu/sysemu.h"
@@ -2523,6 +2524,8 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
         exec_start_outgoing_migration(s, p, &local_err);
     } else if (strstart(uri, "fd:", &p)) {
         fd_start_outgoing_migration(s, p, &local_err);
+    } else if (strstart(uri, "file:", &p)) {
+        file_start_outgoing_migration(s, p, &local_err);
     } else {
         if (!(has_resume && resume)) {
             yank_unregister_instance(MIGRATION_YANK_INSTANCE);
-- 
2.35.3




* [RFC PATCH v1 02/26] migration: Add support for 'file:' uri for incoming migration
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 01/26] migration: Add support for 'file:' uri for source migration Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 03/26] tests/qtest: migration: Add migrate_incoming_qmp helper Fabiano Rosas
                   ` (25 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov

From: Nikolay Borisov <nborisov@suse.com>

This is a counterpart to the 'file:' uri support for source migration;
now a file can also serve as the source of an incoming migration.

Unlike other migration protocol backends, the 'file' protocol cannot
honour non-blocking mode. POSIX file/block storage will always report
ready to read/write, regardless of how slow the underlying storage
will be at servicing the request.

For incoming migration this limitation may result in the main event
loop not being fully responsive while loading the VM state. This
won't impact the VM since it is not running at this phase, however,
it may impact management applications.
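
As a usage sketch (the file path is illustrative): start the
destination with "-incoming defer" and, once the source has finished
writing the file, issue:

    (qemu) migrate_incoming file:/path/to/vm.mig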

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 docs/devel/migration.rst |  2 ++
 migration/file.c         | 15 +++++++++++++++
 migration/file.h         |  1 +
 migration/migration.c    |  2 ++
 4 files changed, 20 insertions(+)

diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
index 6f65c23b47..1080211f8e 100644
--- a/docs/devel/migration.rst
+++ b/docs/devel/migration.rst
@@ -39,6 +39,8 @@ over any transport.
 - exec migration: do the migration using the stdin/stdout through a process.
 - fd migration: do the migration using a file descriptor that is
   passed to QEMU.  QEMU doesn't care how this file descriptor is opened.
+- file migration: do the migration using a file that is passed by name
+  to QEMU.
 
 In addition, support is included for migration using RDMA, which
 transports the page data using ``RDMA``, where the hardware takes care of
diff --git a/migration/file.c b/migration/file.c
index 36d6178c75..ab4e12926c 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -19,3 +19,18 @@ void file_start_outgoing_migration(MigrationState *s, const char *fname, Error *
     migration_channel_connect(s, QIO_CHANNEL(ioc), NULL, NULL);
     object_unref(OBJECT(ioc));
 }
+
+void file_start_incoming_migration(const char *fname, Error **errp)
+{
+    QIOChannelFile *ioc;
+
+    ioc = qio_channel_file_new_path(fname, O_RDONLY, 0, errp);
+    if (!ioc) {
+        error_report("Error creating a channel");
+        return;
+    }
+
+    qio_channel_set_name(QIO_CHANNEL(ioc), "migration-file-incoming");
+    migration_channel_process_incoming(QIO_CHANNEL(ioc));
+    object_unref(OBJECT(ioc));
+}
diff --git a/migration/file.h b/migration/file.h
index d476eb1157..cdbd291322 100644
--- a/migration/file.h
+++ b/migration/file.h
@@ -5,5 +5,6 @@ void file_start_outgoing_migration(MigrationState *s,
                                    const char *filename,
                                    Error **errp);
 
+void file_start_incoming_migration(const char *fname, Error **errp);
 #endif
 
diff --git a/migration/migration.c b/migration/migration.c
index 58ff0cb7c7..5408d87453 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -527,6 +527,8 @@ static void qemu_start_incoming_migration(const char *uri, Error **errp)
         exec_start_incoming_migration(p, errp);
     } else if (strstart(uri, "fd:", &p)) {
         fd_start_incoming_migration(p, errp);
+    } else if (strstart(uri, "file:", &p)) {
+        file_start_incoming_migration(p, errp);
     } else {
         error_setg(errp, "unknown migration protocol: %s", uri);
     }
-- 
2.35.3




* [RFC PATCH v1 03/26] tests/qtest: migration: Add migrate_incoming_qmp helper
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 01/26] migration: Add support for 'file:' uri for source migration Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 02/26] migration: Add support for 'file:' uri for incoming migration Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 04/26] tests/qtest: migration-test: Add tests for file-based migration Fabiano Rosas
                   ` (24 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov, Thomas Huth, Laurent Vivier,
	Paolo Bonzini

From: Nikolay Borisov <nborisov@suse.com>

File-based migration requires the target to initiate its migration after
the source has finished writing out the data in the file. Currently
there's no easy way to initiate 'migrate-incoming'; allow this by
introducing a migrate_incoming_qmp helper, similar to migrate_qmp.
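
A usage sketch mirroring how the next patch calls it from a test (the
file path is illustrative; the target was started with "-incoming
defer"):

    /* kick off loading the VM state once the source has finished
     * writing the migration file */
    migrate_incoming_qmp(to, "file:/tmp/migfile", "{}");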

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/migration-helpers.c | 19 +++++++++++++++++++
 tests/qtest/migration-helpers.h |  4 ++++
 2 files changed, 23 insertions(+)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index f6f3c6680f..8161495c27 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -130,6 +130,25 @@ void migrate_qmp(QTestState *who, const char *uri, const char *fmt, ...)
     qobject_unref(rsp);
 }
 
+
+void migrate_incoming_qmp(QTestState *who, const char *uri, const char *fmt, ...)
+{
+    va_list ap;
+    QDict *args, *rsp;
+
+    va_start(ap, fmt);
+    args = qdict_from_vjsonf_nofail(fmt, ap);
+    va_end(ap);
+
+    g_assert(!qdict_haskey(args, "uri"));
+    qdict_put_str(args, "uri", uri);
+
+    rsp = qtest_qmp(who, "{ 'execute': 'migrate-incoming', 'arguments': %p}", args);
+
+    g_assert(qdict_haskey(rsp, "return"));
+    qobject_unref(rsp);
+}
+
 /*
  * Note: caller is responsible to free the returned object via
  * qobject_unref() after use
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index a188b62787..53ddeaebb7 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -31,6 +31,10 @@ QDict *qmp_command(QTestState *who, const char *command, ...);
 G_GNUC_PRINTF(3, 4)
 void migrate_qmp(QTestState *who, const char *uri, const char *fmt, ...);
 
+G_GNUC_PRINTF(3, 4)
+void migrate_incoming_qmp(QTestState *who, const char *uri,
+                          const char *fmt, ...);
+
 QDict *migrate_query(QTestState *who);
 QDict *migrate_query_not_failed(QTestState *who);
 
-- 
2.35.3




* [RFC PATCH v1 04/26] tests/qtest: migration-test: Add tests for file-based migration
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (2 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 03/26] tests/qtest: migration: Add migrate_incoming_qmp helper Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 05/26] migration: Initial support of fixed-ram feature for analyze-migration.py Fabiano Rosas
                   ` (23 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov, Thomas Huth, Laurent Vivier,
	Paolo Bonzini

From: Nikolay Borisov <nborisov@suse.com>

Add basic tests for file-based migration.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
(farosas) fix segfault when connect_uri is not set
---
 tests/qtest/migration-test.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 3b615b0da9..13e5cdd5a4 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -748,6 +748,7 @@ static void test_migrate_end(QTestState *from, QTestState *to, bool test_dest)
     cleanup("migsocket");
     cleanup("src_serial");
     cleanup("dest_serial");
+    cleanup("migfile");
 }
 
 #ifdef CONFIG_GNUTLS
@@ -1371,6 +1372,14 @@ static void test_precopy_common(MigrateCommon *args)
          * hanging forever if migration didn't converge */
         wait_for_migration_complete(from);
 
+        /*
+         * For file based migration the target must begin its migration after
+         * the source has finished
+         */
+        if (args->connect_uri && strstr(args->connect_uri, "file:")) {
+            migrate_incoming_qmp(to, args->connect_uri, "{}");
+        }
+
         if (!got_stop) {
             qtest_qmp_eventwait(from, "STOP");
         }
@@ -1524,6 +1533,17 @@ static void test_precopy_unix_xbzrle(void)
     test_precopy_common(&args);
 }
 
+static void test_precopy_file_stream_ram(void)
+{
+    g_autofree char *uri = g_strdup_printf("file:%s/migfile", tmpfs);
+    MigrateCommon args = {
+        .connect_uri = uri,
+        .listen_uri = "defer",
+    };
+
+    test_precopy_common(&args);
+}
+
 static void test_precopy_tcp_plain(void)
 {
     MigrateCommon args = {
@@ -2515,6 +2535,10 @@ int main(int argc, char **argv)
     qtest_add_func("/migration/bad_dest", test_baddest);
     qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
     qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);
+
+    qtest_add_func("/migration/precopy/file/stream-ram",
+                   test_precopy_file_stream_ram);
+
 #ifdef CONFIG_GNUTLS
     qtest_add_func("/migration/precopy/unix/tls/psk",
                    test_precopy_unix_tls_psk);
-- 
2.35.3




* [RFC PATCH v1 05/26] migration: Initial support of fixed-ram feature for analyze-migration.py
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (3 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 04/26] tests/qtest: migration-test: Add tests for file-based migration Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 06/26] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file Fabiano Rosas
                   ` (22 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov, John Snow, Cleber Rosa

From: Nikolay Borisov <nborisov@suse.com>

In order to allow the analyze-migration.py script to work with
migration streams that have the 'fixed-ram' capability, it needs
access to the stream's configuration object. This commit enables that
by making the migration JSON writer part of the MigrationState struct,
allowing the configuration object to be serialized to JSON.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c        |  1 +
 migration/savevm.c           | 18 ++++++++++---
 scripts/analyze-migration.py | 51 +++++++++++++++++++++++++++++++++---
 3 files changed, 62 insertions(+), 8 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 5408d87453..177fb0de0f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2260,6 +2260,7 @@ void migrate_init(MigrationState *s)
     error_free(s->error);
     s->error = NULL;
     s->hostname = NULL;
+    s->vmdesc = NULL;
 
     migrate_set_state(&s->state, MIGRATION_STATUS_NONE, MIGRATION_STATUS_SETUP);
 
diff --git a/migration/savevm.c b/migration/savevm.c
index aa54a67fda..92102c1fe5 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1206,13 +1206,25 @@ void qemu_savevm_non_migratable_list(strList **reasons)
 
 void qemu_savevm_state_header(QEMUFile *f)
 {
+    MigrationState *s = migrate_get_current();
+
+    s->vmdesc = json_writer_new(false);
+
     trace_savevm_state_header();
     qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
     qemu_put_be32(f, QEMU_VM_FILE_VERSION);
 
-    if (migrate_get_current()->send_configuration) {
+    if (s->send_configuration) {
         qemu_put_byte(f, QEMU_VM_CONFIGURATION);
-        vmstate_save_state(f, &vmstate_configuration, &savevm_state, 0);
+        /*
+         * This starts the main json object and is paired with the
+         * json_writer_end_object in
+         * qemu_savevm_state_complete_precopy_non_iterable
+         */
+        json_writer_start_object(s->vmdesc, NULL);
+        json_writer_start_object(s->vmdesc, "configuration");
+        vmstate_save_state(f, &vmstate_configuration, &savevm_state, s->vmdesc);
+        json_writer_end_object(s->vmdesc);
     }
 }
 
@@ -1237,8 +1249,6 @@ void qemu_savevm_state_setup(QEMUFile *f)
     Error *local_err = NULL;
     int ret;
 
-    ms->vmdesc = json_writer_new(false);
-    json_writer_start_object(ms->vmdesc, NULL);
     json_writer_int64(ms->vmdesc, "page_size", qemu_target_page_size());
     json_writer_start_array(ms->vmdesc, "devices");
 
diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index b82a1b0c58..05af9efd2f 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -23,7 +23,7 @@
 import collections
 import struct
 import sys
-
+import math
 
 def mkdir_p(path):
     try:
@@ -119,11 +119,16 @@ def __init__(self, file, version_id, ramargs, section_key):
         self.file = file
         self.section_key = section_key
         self.TARGET_PAGE_SIZE = ramargs['page_size']
+        self.TARGET_PAGE_BITS = math.log2(self.TARGET_PAGE_SIZE)
         self.dump_memory = ramargs['dump_memory']
         self.write_memory = ramargs['write_memory']
+        self.fixed_ram = ramargs['fixed-ram']
         self.sizeinfo = collections.OrderedDict()
+        self.bitmap_offset = collections.OrderedDict()
+        self.pages_offset = collections.OrderedDict()
         self.data = collections.OrderedDict()
         self.data['section sizes'] = self.sizeinfo
+        self.ram_read = False
         self.name = ''
         if self.write_memory:
             self.files = { }
@@ -140,7 +145,13 @@ def __str__(self):
     def getDict(self):
         return self.data
 
+    def write_or_dump_fixed_ram(self):
+        pass
+
     def read(self):
+        if self.fixed_ram and self.ram_read:
+            return
+
         # Read all RAM sections
         while True:
             addr = self.file.read64()
@@ -167,7 +178,26 @@ def read(self):
                         f.truncate(0)
                         f.truncate(len)
                         self.files[self.name] = f
+
+                    if self.fixed_ram:
+                        bitmap_len = self.file.read32()
+                        # skip the pages_offset which we don't need
+                        offset = self.file.tell() + 8
+                        self.bitmap_offset[self.name] = offset
+                        offset = ((offset + bitmap_len + self.TARGET_PAGE_SIZE - 1) //
+                                  self.TARGET_PAGE_SIZE) * self.TARGET_PAGE_SIZE
+                        self.pages_offset[self.name] = offset
+                        self.file.file.seek(offset + len)
+
                 flags &= ~self.RAM_SAVE_FLAG_MEM_SIZE
+                if self.fixed_ram:
+                    self.ram_read = True
+                # now we should rewind to the ram page offset of the first
+                # ram section
+                if self.fixed_ram:
+                    if self.write_memory or self.dump_memory:
+                        self.write_or_dump_fixed_ram()
+                        return
 
             if flags & self.RAM_SAVE_FLAG_COMPRESS:
                 if flags & self.RAM_SAVE_FLAG_CONTINUE:
@@ -208,7 +238,7 @@ def read(self):
 
             # End of RAM section
             if flags & self.RAM_SAVE_FLAG_EOS:
-                break
+                return
 
             if flags != 0:
                 raise Exception("Unknown RAM flags: %x" % flags)
@@ -521,6 +551,7 @@ def read(self, desc_only = False, dump_memory = False, write_memory = False):
         ramargs['page_size'] = self.vmsd_desc['page_size']
         ramargs['dump_memory'] = dump_memory
         ramargs['write_memory'] = write_memory
+        ramargs['fixed-ram'] = False
         self.section_classes[('ram',0)][1] = ramargs
 
         while True:
@@ -528,8 +559,20 @@ def read(self, desc_only = False, dump_memory = False, write_memory = False):
             if section_type == self.QEMU_VM_EOF:
                 break
             elif section_type == self.QEMU_VM_CONFIGURATION:
-                section = ConfigurationSection(file)
-                section.read()
+                config_desc = self.vmsd_desc.get('configuration')
+                if config_desc is not None:
+                    config = VMSDSection(file, 1, config_desc, 'configuration')
+                    config.read()
+                    caps = config.data.get("configuration/capabilities")
+                    if caps is not None:
+                        caps = caps.data["capabilities"]
+                        if type(caps) != list:
+                            caps = [caps]
+                        for i in caps:
+                            # chomp out string length
+                            cap = i.data[1:].decode("utf8")
+                            if cap == "fixed-ram":
+                                ramargs['fixed-ram'] = True
             elif section_type == self.QEMU_VM_SECTION_START or section_type == self.QEMU_VM_SECTION_FULL:
                 section_id = file.read32()
                 name = file.readstr()
-- 
2.35.3




* [RFC PATCH v1 06/26] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (4 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 05/26] migration: Initial support of fixed-ram feature for analyze-migration.py Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 07/26] io: Add generic pwritev/preadv interface Fabiano Rosas
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov

From: Nikolay Borisov <nborisov@suse.com>

Add a generic QIOChannel feature SEEKABLE which will be used by the
qemu_file* APIs. For the time being this will only be implemented for
file channels.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 include/io/channel.h | 1 +
 io/channel-file.c    | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/include/io/channel.h b/include/io/channel.h
index 153fbd2904..29461dda41 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -44,6 +44,7 @@ enum QIOChannelFeature {
     QIO_CHANNEL_FEATURE_LISTEN,
     QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY,
     QIO_CHANNEL_FEATURE_READ_MSG_PEEK,
+    QIO_CHANNEL_FEATURE_SEEKABLE,
 };
 
 
diff --git a/io/channel-file.c b/io/channel-file.c
index d76663e6ae..a0268232da 100644
--- a/io/channel-file.c
+++ b/io/channel-file.c
@@ -35,6 +35,10 @@ qio_channel_file_new_fd(int fd)
 
     ioc->fd = fd;
 
+    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1) {
+        qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SEEKABLE);
+    }
+
     trace_qio_channel_file_new_fd(ioc, fd);
 
     return ioc;
@@ -59,6 +63,10 @@ qio_channel_file_new_path(const char *path,
         return NULL;
     }
 
+    if (lseek(ioc->fd, 0, SEEK_CUR) != (off_t)-1) {
+        qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SEEKABLE);
+    }
+
     trace_qio_channel_file_new_path(ioc, path, flags, mode, ioc->fd);
 
     return ioc;
-- 
2.35.3




* [RFC PATCH v1 07/26] io: Add generic pwritev/preadv interface
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (5 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 06/26] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 08/26] io: implement io_pwritev/preadv for QIOChannelFile Fabiano Rosas
                   ` (20 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov

From: Nikolay Borisov <nborisov@suse.com>

Introduce basic pwritev/preadv support in the generic channel layer.
A specific implementation will follow for the file channel, as this is
required in order to support migration streams with a fixed location
for each RAM page.
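
A rough caller-side sketch (error handling trimmed; 'buf', 'len' and
'pos' are illustrative): positioned I/O is gated on the SEEKABLE
feature added in the previous patch, otherwise callers fall back to
the usual sequential path:

    if (qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_SEEKABLE)) {
        /* place 'len' bytes from 'buf' at absolute offset 'pos' */
        qio_channel_pwritev(ioc, buf, len, pos, errp);
    } else {
        /* sequential write at the current position */
        qio_channel_write_all(ioc, buf, len, errp);
    }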

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 include/io/channel.h | 82 ++++++++++++++++++++++++++++++++++++++++++++
 io/channel.c         | 58 +++++++++++++++++++++++++++++++
 2 files changed, 140 insertions(+)

diff --git a/include/io/channel.h b/include/io/channel.h
index 29461dda41..28bce7ef17 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -129,6 +129,16 @@ struct QIOChannelClass {
                            Error **errp);
 
     /* Optional callbacks */
+    ssize_t (*io_pwritev)(QIOChannel *ioc,
+                          const struct iovec *iov,
+                          size_t niov,
+                          off_t offset,
+                          Error **errp);
+    ssize_t (*io_preadv)(QIOChannel *ioc,
+                         const struct iovec *iov,
+                         size_t niov,
+                         off_t offset,
+                         Error **errp);
     int (*io_shutdown)(QIOChannel *ioc,
                        QIOChannelShutdown how,
                        Error **errp);
@@ -511,6 +521,78 @@ int qio_channel_set_blocking(QIOChannel *ioc,
 int qio_channel_close(QIOChannel *ioc,
                       Error **errp);
 
+/**
+ * qio_channel_pwritev_full
+ * @ioc: the channel object
+ * @iov: the array of memory regions to write data from
+ * @niov: the length of the @iov array
+ * @offset: offset in the channel where writes should begin
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Not all implementations will support this facility, so may report
+ * an error. To avoid errors, the caller may check for the feature
+ * flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
+ *
+ * Behaves as qio_channel_writev_full, apart from not supporting
+ * sending of file handles as well as beginning the write at the
+ * passed @offset
+ *
+ */
+ssize_t qio_channel_pwritev_full(QIOChannel *ioc, const struct iovec *iov,
+                                 size_t niov, off_t offset, Error **errp);
+
+/**
+ * qio_channel_pwritev
+ * @ioc: the channel object
+ * @buf: the memory region to write data from
+ * @buflen: the number of bytes in @buf
+ * @offset: offset in the channel where writes should begin
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Not all implementations will support this facility, so may report
+ * an error. To avoid errors, the caller may check for the feature
+ * flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
+ *
+ */
+ssize_t qio_channel_pwritev(QIOChannel *ioc, char *buf, size_t buflen,
+                            off_t offset, Error **errp);
+
+/**
+ * qio_channel_preadv_full
+ * @ioc: the channel object
+ * @iov: the array of memory regions to read data into
+ * @niov: the length of the @iov array
+ * @offset: offset in the channel where reads should begin
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Not all implementations will support this facility, so may report
+ * an error.  To avoid errors, the caller may check for the feature
+ * flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
+ *
+ * Behaves as qio_channel_readv_full, apart from not supporting
+ * receiving of file handles as well as beginning the read at the
+ * passed @offset
+ *
+ */
+ssize_t qio_channel_preadv_full(QIOChannel *ioc, const struct iovec *iov,
+                                size_t niov, off_t offset, Error **errp);
+
+/**
+ * qio_channel_preadv
+ * @ioc: the channel object
+ * @buf: the memory region to read data into
+ * @buflen: the number of bytes in @buf
+ * @offset: offset in the channel where reads should begin
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Not all implementations will support this facility, so may report
+ * an error.  To avoid errors, the caller may check for the feature
+ * flag QIO_CHANNEL_FEATURE_SEEKABLE prior to calling this method.
+ *
+ */
+ssize_t qio_channel_preadv(QIOChannel *ioc, char *buf, size_t buflen,
+                           off_t offset, Error **errp);
+
 /**
  * qio_channel_shutdown:
  * @ioc: the channel object
diff --git a/io/channel.c b/io/channel.c
index a8c7f11649..312445b3aa 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -445,6 +445,64 @@ GSource *qio_channel_add_watch_source(QIOChannel *ioc,
 }
 
 
+ssize_t qio_channel_pwritev_full(QIOChannel *ioc, const struct iovec *iov,
+                                 size_t niov, off_t offset, Error **errp)
+{
+    QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
+
+    if (!klass->io_pwritev) {
+        error_setg(errp, "Channel does not support pwritev");
+        return -1;
+    }
+
+    if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_SEEKABLE)) {
+        error_setg_errno(errp, EINVAL, "Requested channel is not seekable");
+        return -1;
+    }
+
+    return klass->io_pwritev(ioc, iov, niov, offset, errp);
+}
+
+ssize_t qio_channel_pwritev(QIOChannel *ioc, char *buf, size_t buflen,
+                            off_t offset, Error **errp)
+{
+    struct iovec iov = {
+        .iov_base = buf,
+        .iov_len = buflen
+    };
+
+    return qio_channel_pwritev_full(ioc, &iov, 1, offset, errp);
+}
+
+ssize_t qio_channel_preadv_full(QIOChannel *ioc, const struct iovec *iov,
+                                size_t niov, off_t offset, Error **errp)
+{
+    QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
+
+    if (!klass->io_preadv) {
+        error_setg(errp, "Channel does not support preadv");
+        return -1;
+    }
+
+    if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_SEEKABLE)) {
+        error_setg_errno(errp, EINVAL, "Requested channel is not seekable");
+        return -1;
+    }
+
+    return klass->io_preadv(ioc, iov, niov, offset, errp);
+}
+
+ssize_t qio_channel_preadv(QIOChannel *ioc, char *buf, size_t buflen,
+                           off_t offset, Error **errp)
+{
+    struct iovec iov = {
+        .iov_base = buf,
+        .iov_len = buflen
+    };
+
+    return qio_channel_preadv_full(ioc, &iov, 1, offset, errp);
+}
+
 int qio_channel_shutdown(QIOChannel *ioc,
                          QIOChannelShutdown how,
                          Error **errp)
-- 
2.35.3




* [RFC PATCH v1 08/26] io: implement io_pwritev/preadv for QIOChannelFile
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (6 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 07/26] io: Add generic pwritev/preadv interface Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 09/26] migration/qemu-file: add utility methods for working with seekable channels Fabiano Rosas
                   ` (19 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov

From: Nikolay Borisov <nborisov@suse.com>

The upcoming 'fixed-ram' feature will require qemu to write data to
(and restore from) specific offsets of the migration file.

Add a minimal implementation of pwritev/preadv and expose them via the
io_pwritev and io_preadv interfaces.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
---
 io/channel-file.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/io/channel-file.c b/io/channel-file.c
index a0268232da..a3d2f0bcf9 100644
--- a/io/channel-file.c
+++ b/io/channel-file.c
@@ -145,6 +145,56 @@ static ssize_t qio_channel_file_writev(QIOChannel *ioc,
     return ret;
 }
 
+static ssize_t qio_channel_file_preadv(QIOChannel *ioc,
+                                       const struct iovec *iov,
+                                       size_t niov,
+                                       off_t offset,
+                                       Error **errp)
+{
+    QIOChannelFile *fioc = QIO_CHANNEL_FILE(ioc);
+    ssize_t ret;
+
+ retry:
+    ret = preadv(fioc->fd, iov, niov, offset);
+    if (ret < 0) {
+        if (errno == EAGAIN) {
+            return QIO_CHANNEL_ERR_BLOCK;
+        }
+        if (errno == EINTR) {
+            goto retry;
+        }
+
+        error_setg_errno(errp, errno, "Unable to read from file");
+        return -1;
+    }
+
+    return ret;
+}
+
+static ssize_t qio_channel_file_pwritev(QIOChannel *ioc,
+                                        const struct iovec *iov,
+                                        size_t niov,
+                                        off_t offset,
+                                        Error **errp)
+{
+    QIOChannelFile *fioc = QIO_CHANNEL_FILE(ioc);
+    ssize_t ret;
+
+ retry:
+    ret = pwritev(fioc->fd, iov, niov, offset);
+    if (ret <= 0) {
+        if (errno == EAGAIN) {
+            return QIO_CHANNEL_ERR_BLOCK;
+        }
+        if (errno == EINTR) {
+            goto retry;
+        }
+        error_setg_errno(errp, errno, "Unable to write to file");
+        return -1;
+    }
+    return ret;
+}
+
 static int qio_channel_file_set_blocking(QIOChannel *ioc,
                                          bool enabled,
                                          Error **errp)
@@ -227,6 +277,8 @@ static void qio_channel_file_class_init(ObjectClass *klass,
     ioc_klass->io_writev = qio_channel_file_writev;
     ioc_klass->io_readv = qio_channel_file_readv;
     ioc_klass->io_set_blocking = qio_channel_file_set_blocking;
+    ioc_klass->io_pwritev = qio_channel_file_pwritev;
+    ioc_klass->io_preadv = qio_channel_file_preadv;
     ioc_klass->io_seek = qio_channel_file_seek;
     ioc_klass->io_close = qio_channel_file_close;
     ioc_klass->io_create_watch = qio_channel_file_create_watch;
-- 
2.35.3




* [RFC PATCH v1 09/26] migration/qemu-file: add utility methods for working with seekable channels
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (7 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 08/26] io: implement io_pwritev/preadv for QIOChannelFile Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability Fabiano Rosas
                   ` (18 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov

From: Nikolay Borisov <nborisov@suse.com>

Add utility methods that will be needed when implementing the
'fixed-ram' migration capability:

qemu_file_is_seekable
qemu_put_buffer_at
qemu_get_buffer_at
qemu_set_offset
qemu_get_offset
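
A rough sketch of how the RAM code later in this series (patch 10) is
expected to use the new helpers, placing one page directly at its
pre-computed spot in the file (names taken from that patch):

    qemu_put_buffer_at(file, buf, TARGET_PAGE_SIZE,
                       block->pages_offset + offset);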

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
fixed total_transferred accounting

restructured to use qio_channel_file_preadv instead of the _full
variant
---
 include/migration/qemu-file-types.h |  2 +
 migration/qemu-file.c               | 80 +++++++++++++++++++++++++++++
 migration/qemu-file.h               |  4 ++
 3 files changed, 86 insertions(+)

diff --git a/include/migration/qemu-file-types.h b/include/migration/qemu-file-types.h
index 2867e3da84..eb0325ee86 100644
--- a/include/migration/qemu-file-types.h
+++ b/include/migration/qemu-file-types.h
@@ -50,6 +50,8 @@ unsigned int qemu_get_be16(QEMUFile *f);
 unsigned int qemu_get_be32(QEMUFile *f);
 uint64_t qemu_get_be64(QEMUFile *f);
 
+bool qemu_file_is_seekable(QEMUFile *f);
+
 static inline void qemu_put_be64s(QEMUFile *f, const uint64_t *pv)
 {
     qemu_put_be64(f, *pv);
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 102ab3b439..a1f7dbb3d9 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -30,6 +30,7 @@
 #include "qemu-file.h"
 #include "trace.h"
 #include "qapi/error.h"
+#include "io/channel-file.h"
 
 #define IO_BUF_SIZE 32768
 #define MAX_IOV_SIZE MIN_CONST(IOV_MAX, 64)
@@ -281,6 +282,10 @@ static void qemu_iovec_release_ram(QEMUFile *f)
     memset(f->may_free, 0, sizeof(f->may_free));
 }
 
+bool qemu_file_is_seekable(QEMUFile *f)
+{
+    return qio_channel_has_feature(f->ioc, QIO_CHANNEL_FEATURE_SEEKABLE);
+}
 
 /**
  * Flushes QEMUFile buffer
@@ -559,6 +564,81 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, size_t size)
     }
 }
 
+void qemu_put_buffer_at(QEMUFile *f, const uint8_t *buf, size_t buflen, off_t pos)
+{
+    Error *err = NULL;
+
+    if (f->last_error) {
+        return;
+    }
+
+    qemu_fflush(f);
+    qio_channel_pwritev(f->ioc, (char *)buf, buflen, pos, &err);
+
+    if (err) {
+        qemu_file_set_error_obj(f, -EIO, err);
+    } else {
+        f->total_transferred += buflen;
+    }
+
+    return;
+}
+
+
+size_t qemu_get_buffer_at(QEMUFile *f, const uint8_t *buf, size_t buflen, off_t pos)
+{
+    Error *err = NULL;
+    ssize_t ret;
+
+    if (f->last_error) {
+        return 0;
+    }
+
+    ret = qio_channel_preadv(f->ioc, (char *)buf, buflen, pos, &err);
+    if (ret == -1 || err) {
+        goto error;
+    }
+
+    return (size_t)ret;
+
+ error:
+    qemu_file_set_error_obj(f, -EIO, err);
+    return 0;
+}
+
+void qemu_set_offset(QEMUFile *f, off_t off, int whence)
+{
+    Error *err = NULL;
+    off_t ret;
+
+    qemu_fflush(f);
+
+    if (!qemu_file_is_writable(f)) {
+        f->buf_index = 0;
+        f->buf_size = 0;
+    }
+
+    ret = qio_channel_io_seek(f->ioc, off, whence, &err);
+    if (ret == (off_t)-1) {
+        qemu_file_set_error_obj(f, -EIO, err);
+    }
+}
+
+off_t qemu_get_offset(QEMUFile *f)
+{
+    Error *err = NULL;
+    off_t ret;
+
+    qemu_fflush(f);
+
+    ret = qio_channel_io_seek(f->ioc, 0, SEEK_CUR, &err);
+    if (ret == (off_t)-1) {
+        qemu_file_set_error_obj(f, -EIO, err);
+    }
+    return ret;
+}
+
+
 void qemu_put_byte(QEMUFile *f, int v)
 {
     if (f->last_error) {
diff --git a/migration/qemu-file.h b/migration/qemu-file.h
index 9d0155a2a1..350273b441 100644
--- a/migration/qemu-file.h
+++ b/migration/qemu-file.h
@@ -149,6 +149,10 @@ QEMUFile *qemu_file_get_return_path(QEMUFile *f);
 void qemu_fflush(QEMUFile *f);
 void qemu_file_set_blocking(QEMUFile *f, bool block);
 int qemu_file_get_to_fd(QEMUFile *f, int fd, size_t size);
+void qemu_set_offset(QEMUFile *f, off_t off, int whence);
+off_t qemu_get_offset(QEMUFile *f);
+void qemu_put_buffer_at(QEMUFile *f, const uint8_t *buf, size_t buflen, off_t pos);
+size_t qemu_get_buffer_at(QEMUFile *f, const uint8_t *buf, size_t buflen, off_t pos);
 
 void ram_control_before_iterate(QEMUFile *f, uint64_t flags);
 void ram_control_after_iterate(QEMUFile *f, uint64_t flags);
-- 
2.35.3




* [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (8 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 09/26] migration/qemu-file: add utility methods for working with seekable channels Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 22:01   ` Peter Xu
  2023-03-31  5:50   ` Markus Armbruster
  2023-03-30 18:03 ` [RFC PATCH v1 11/26] migration: Refactor precopy ram loading code Fabiano Rosas
                   ` (17 subsequent siblings)
  27 siblings, 2 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov, Paolo Bonzini, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Eric Blake, Markus Armbruster

From: Nikolay Borisov <nborisov@suse.com>

Implement the 'fixed-ram' feature. The core of the feature is to ensure
that each RAM page of the migration stream has a specific offset in the
resulting migration file. The reasons why we'd want such behavior are
twofold:

 - When doing a 'fixed-ram' migration the resulting file will have a
   bounded size, since pages which are dirtied multiple times will
   always go to a fixed location in the file, rather than constantly
   being added to a sequential stream. This eliminates cases where a vm
   with, say, 1G of ram can result in a migration file that's 10s of
   GBs, provided that the workload constantly redirties memory.

 - It paves the way to implement DIO-enabled save/restore of the
   migration stream as the pages are ensured to be written at aligned
   offsets.

The feature requires changing the stream format. First, a bitmap is
introduced which tracks which pages have been written (i.e. are
dirtied) during migration; it is subsequently written to the
resulting file, again at a fixed location for every RAMBlock. Zero
pages are ignored as they'd be zero on the destination as well. With
the changed format, data would look like the following:

|name len|name|used_len|pc*|bitmap_size|pages_offset|bitmap|pages|

* pc - refers to the page_size/mr->addr members, so newly added members
begin from "bitmap_size".

This layout is initialized during ram_save_setup, so instead of having
a sequential stream of pages that follow the RAMBlock headers, the
dirty pages for a RAMBlock follow its header. Since all pages have a
fixed location, RAM_SAVE_FLAG_EOS is no longer generated on every
migration iteration; there is effectively a single RAM_SAVE_FLAG_EOS
right at the end.
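
As a worked example of the resulting layout (numbers illustrative,
assuming 4k target pages): a dirty page at offset 0x2000 within a
RAMBlock whose pages_offset is 0x100000 is written at file offset
0x100000 + 0x2000 = 0x102000, and bit 0x2000 >> TARGET_PAGE_BITS = 2
is set in that block's shadow bitmap. A clean or zero page simply
leaves its slot in the file untouched.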

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 docs/devel/migration.rst | 36 +++++++++++++++
 include/exec/ramblock.h  |  8 ++++
 migration/migration.c    | 51 +++++++++++++++++++++-
 migration/migration.h    |  1 +
 migration/ram.c          | 94 +++++++++++++++++++++++++++++++++-------
 migration/savevm.c       |  1 +
 qapi/migration.json      |  2 +-
 7 files changed, 176 insertions(+), 17 deletions(-)

diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
index 1080211f8e..84112d7f3f 100644
--- a/docs/devel/migration.rst
+++ b/docs/devel/migration.rst
@@ -568,6 +568,42 @@ Others (especially either older devices or system devices which for
 some reason don't have a bus concept) make use of the ``instance id``
 for otherwise identically named devices.
 
+Fixed-ram format
+----------------
+
+When the ``fixed-ram`` capability is enabled, a slightly different
+stream format is used for the RAM section. Instead of having a
+sequential stream of pages that follow the RAMBlock headers, the dirty
+pages for a RAMBlock follow its header. This ensures that each RAM
+page has a fixed offset in the resulting migration stream.
+
+  - RAMBlock 1
+
+    - ID string length
+    - ID string
+    - Used size
+    - Shadow bitmap size
+    - Pages offset in migration stream*
+
+  - Shadow bitmap
+  - Sequence of pages for RAMBlock 1 (* offset points here)
+
+  - RAMBlock 2
+
+    - ID string length
+    - ID string
+    - Used size
+    - Shadow bitmap size
+    - Pages offset in migration stream*
+
+  - Shadow bitmap
+  - Sequence of pages for RAMBlock 2 (* offset points here)
+
+The ``fixed-ram`` capability can be enabled in both source and
+destination with:
+
+    ``migrate_set_capability fixed-ram on``
+
 Return path
 -----------
 
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index adc03df59c..4360c772c2 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -43,6 +43,14 @@ struct RAMBlock {
     size_t page_size;
     /* dirty bitmap used during migration */
     unsigned long *bmap;
+    /* shadow dirty bitmap used when migrating to a file */
+    unsigned long *shadow_bmap;
+    /*
+     * offset in the file pages belonging to this ramblock are saved,
+     * used only during migration to a file.
+     */
+    off_t bitmap_offset;
+    uint64_t pages_offset;
     /* bitmap of already received pages in postcopy */
     unsigned long *receivedmap;
 
diff --git a/migration/migration.c b/migration/migration.c
index 177fb0de0f..29630523e2 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -168,7 +168,8 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot,
     MIGRATION_CAPABILITY_XBZRLE,
     MIGRATION_CAPABILITY_X_COLO,
     MIGRATION_CAPABILITY_VALIDATE_UUID,
-    MIGRATION_CAPABILITY_ZERO_COPY_SEND);
+    MIGRATION_CAPABILITY_ZERO_COPY_SEND,
+    MIGRATION_CAPABILITY_FIXED_RAM);
 
 /* When we add fault tolerance, we could have several
    migrations at once.  For now we don't need to add
@@ -1341,6 +1342,28 @@ static bool migrate_caps_check(bool *cap_list,
     }
 #endif
 
+    if (cap_list[MIGRATION_CAPABILITY_FIXED_RAM]) {
+        if (cap_list[MIGRATION_CAPABILITY_MULTIFD]) {
+            error_setg(errp, "Directly mapped memory incompatible with multifd");
+            return false;
+        }
+
+        if (cap_list[MIGRATION_CAPABILITY_XBZRLE]) {
+            error_setg(errp, "Directly mapped memory incompatible with xbzrle");
+            return false;
+        }
+
+        if (cap_list[MIGRATION_CAPABILITY_COMPRESS]) {
+            error_setg(errp, "Directly mapped memory incompatible with compression");
+            return false;
+        }
+
+        if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
+            error_setg(errp, "Directly mapped memory incompatible with postcopy ram");
+            return false;
+        }
+    }
+
     if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
         /* This check is reasonably expensive, so only when it's being
          * set the first time, also it's only the destination that needs
@@ -2736,6 +2759,11 @@ MultiFDCompression migrate_multifd_compression(void)
     return s->parameters.multifd_compression;
 }
 
+int migrate_fixed_ram(void)
+{
+    return migrate_get_current()->enabled_capabilities[MIGRATION_CAPABILITY_FIXED_RAM];
+}
+
 int migrate_multifd_zlib_level(void)
 {
     MigrationState *s;
@@ -4324,6 +4352,20 @@ fail:
     return NULL;
 }
 
+static int migrate_check_fixed_ram(MigrationState *s, Error **errp)
+{
+    if (!s->enabled_capabilities[MIGRATION_CAPABILITY_FIXED_RAM]) {
+        return 0;
+    }
+
+    if (!qemu_file_is_seekable(s->to_dst_file)) {
+        error_setg(errp, "Directly mapped memory requires a seekable transport");
+        return -1;
+    }
+
+    return 0;
+}
+
 void migrate_fd_connect(MigrationState *s, Error *error_in)
 {
     Error *local_err = NULL;
@@ -4390,6 +4432,12 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
         }
     }
 
+    if (migrate_check_fixed_ram(s, &local_err) < 0) {
+        migrate_fd_cleanup(s);
+        migrate_fd_error(s, local_err);
+        return;
+    }
+
     if (resume) {
         /* Wakeup the main migration thread to do the recovery */
         migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_PAUSED,
@@ -4519,6 +4567,7 @@ static Property migration_properties[] = {
     DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
 
     /* Migration capabilities */
+    DEFINE_PROP_MIG_CAP("x-fixed-ram", MIGRATION_CAPABILITY_FIXED_RAM),
     DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
     DEFINE_PROP_MIG_CAP("x-rdma-pin-all", MIGRATION_CAPABILITY_RDMA_PIN_ALL),
     DEFINE_PROP_MIG_CAP("x-auto-converge", MIGRATION_CAPABILITY_AUTO_CONVERGE),
diff --git a/migration/migration.h b/migration/migration.h
index 2da2f8a164..8cf3caecfe 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -416,6 +416,7 @@ bool migrate_zero_blocks(void);
 bool migrate_dirty_bitmaps(void);
 bool migrate_ignore_shared(void);
 bool migrate_validate_uuid(void);
+int migrate_fixed_ram(void);
 
 bool migrate_auto_converge(void);
 bool migrate_use_multifd(void);
diff --git a/migration/ram.c b/migration/ram.c
index 96e8a19a58..56f0f782c8 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1310,9 +1310,14 @@ static int save_zero_page_to_file(PageSearchStatus *pss,
     int len = 0;
 
     if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
-        len += save_page_header(pss, block, offset | RAM_SAVE_FLAG_ZERO);
-        qemu_put_byte(file, 0);
-        len += 1;
+        if (migrate_fixed_ram()) {
+            /* for zero pages we don't need to do anything */
+            len = 1;
+        } else {
+            len += save_page_header(pss, block, offset | RAM_SAVE_FLAG_ZERO);
+            qemu_put_byte(file, 0);
+            len += 1;
+        }
         ram_release_page(block->idstr, offset);
     }
     return len;
@@ -1394,14 +1399,20 @@ static int save_normal_page(PageSearchStatus *pss, RAMBlock *block,
 {
     QEMUFile *file = pss->pss_channel;
 
-    ram_transferred_add(save_page_header(pss, block,
-                                         offset | RAM_SAVE_FLAG_PAGE));
-    if (async) {
-        qemu_put_buffer_async(file, buf, TARGET_PAGE_SIZE,
-                              migrate_release_ram() &&
-                              migration_in_postcopy());
+    if (migrate_fixed_ram()) {
+        qemu_put_buffer_at(file, buf, TARGET_PAGE_SIZE,
+                           block->pages_offset + offset);
+        set_bit(offset >> TARGET_PAGE_BITS, block->shadow_bmap);
     } else {
-        qemu_put_buffer(file, buf, TARGET_PAGE_SIZE);
+        ram_transferred_add(save_page_header(pss, block,
+                                             offset | RAM_SAVE_FLAG_PAGE));
+        if (async) {
+            qemu_put_buffer_async(file, buf, TARGET_PAGE_SIZE,
+                                  migrate_release_ram() &&
+                                  migration_in_postcopy());
+        } else {
+            qemu_put_buffer(file, buf, TARGET_PAGE_SIZE);
+        }
     }
     ram_transferred_add(TARGET_PAGE_SIZE);
     stat64_add(&ram_atomic_counters.normal, 1);
@@ -2731,6 +2742,8 @@ static void ram_save_cleanup(void *opaque)
         block->clear_bmap = NULL;
         g_free(block->bmap);
         block->bmap = NULL;
+        g_free(block->shadow_bmap);
+        block->shadow_bmap = NULL;
     }
 
     xbzrle_cleanup();
@@ -3098,6 +3111,7 @@ static void ram_list_init_bitmaps(void)
              */
             block->bmap = bitmap_new(pages);
             bitmap_set(block->bmap, 0, pages);
+            block->shadow_bmap = bitmap_new(block->used_length >> TARGET_PAGE_BITS);
             block->clear_bmap_shift = shift;
             block->clear_bmap = bitmap_new(clear_bmap_size(pages, shift));
         }
@@ -3287,6 +3301,33 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
             if (migrate_ignore_shared()) {
                 qemu_put_be64(f, block->mr->addr);
             }
+
+            if (migrate_fixed_ram()) {
+                long num_pages = block->used_length >> TARGET_PAGE_BITS;
+                long bitmap_size = BITS_TO_LONGS(num_pages) * sizeof(unsigned long);
+
+                /* Needed for external programs (think analyze-migration.py) */
+                qemu_put_be32(f, bitmap_size);
+
+                /*
+                 * The bitmap starts after pages_offset, so add 8 to
+                 * account for the pages_offset size.
+                 */
+                block->bitmap_offset = qemu_get_offset(f) + 8;
+
+                /*
+                 * Make pages_offset aligned to 1 MiB to account for
+                 * migration file movement between filesystems with
+                 * possibly different alignment restrictions when
+                 * using O_DIRECT.
+                 */
+                block->pages_offset = ROUND_UP(block->bitmap_offset +
+                                               bitmap_size, 0x100000);
+                qemu_put_be64(f, block->pages_offset);
+
+                /* Now prepare offset for next ramblock */
+                qemu_set_offset(f, block->pages_offset + block->used_length, SEEK_SET);
+            }
         }
     }
 
@@ -3306,6 +3347,18 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     return 0;
 }
 
+static void ram_save_shadow_bmap(QEMUFile *f)
+{
+    RAMBlock *block;
+
+    RAMBLOCK_FOREACH_MIGRATABLE(block) {
+        long num_pages = block->used_length >> TARGET_PAGE_BITS;
+        long bitmap_size = BITS_TO_LONGS(num_pages) * sizeof(unsigned long);
+        qemu_put_buffer_at(f, (uint8_t *)block->shadow_bmap, bitmap_size,
+                           block->bitmap_offset);
+    }
+}
+
 /**
  * ram_save_iterate: iterative stage for migration
  *
@@ -3413,9 +3466,15 @@ out:
             return ret;
         }
 
-        qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
-        qemu_fflush(f);
-        ram_transferred_add(8);
+        /*
+         * For fixed ram we don't want to pollute the migration stream with
+         * EOS flags.
+         */
+        if (!migrate_fixed_ram()) {
+            qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
+            qemu_fflush(f);
+            ram_transferred_add(8);
+        }
 
         ret = qemu_file_get_error(f);
     }
@@ -3461,6 +3520,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
             pages = ram_find_and_save_block(rs);
             /* no more blocks to sent */
             if (pages == 0) {
+                if (migrate_fixed_ram()) {
+                    ram_save_shadow_bmap(f);
+                }
                 break;
             }
             if (pages < 0) {
@@ -3483,8 +3545,10 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
-    qemu_fflush(f);
+    if (!migrate_fixed_ram()) {
+        qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
+        qemu_fflush(f);
+    }
 
     return 0;
 }
diff --git a/migration/savevm.c b/migration/savevm.c
index 92102c1fe5..1f1bc19224 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -241,6 +241,7 @@ static bool should_validate_capability(int capability)
     /* Validate only new capabilities to keep compatibility. */
     switch (capability) {
     case MIGRATION_CAPABILITY_X_IGNORE_SHARED:
+    case MIGRATION_CAPABILITY_FIXED_RAM:
         return true;
     default:
         return false;
diff --git a/qapi/migration.json b/qapi/migration.json
index c84fa10e86..22eea58ce3 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -485,7 +485,7 @@
 ##
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
-           'compress', 'events', 'postcopy-ram',
+           'compress', 'events', 'postcopy-ram', 'fixed-ram',
            { 'name': 'x-colo', 'features': [ 'unstable' ] },
            'release-ram',
            'block', 'return-path', 'pause-before-switchover', 'multifd',
-- 
2.35.3



* [RFC PATCH v1 11/26] migration: Refactor precopy ram loading code
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (9 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 12/26] migration: Add support for 'fixed-ram' migration restore Fabiano Rosas
                   ` (16 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov

From: Nikolay Borisov <nborisov@suse.com>

To facilitate the implementation of the 'fixed-ram' migration restore,
factor out the code responsible for parsing the ramblock headers. This
also makes ram_load_precopy easier to comprehend.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/ram.c | 142 +++++++++++++++++++++++++++---------------------
 1 file changed, 80 insertions(+), 62 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 56f0f782c8..5c085d6154 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4319,6 +4319,83 @@ void colo_flush_ram_cache(void)
     trace_colo_flush_ram_cache_end();
 }
 
+static int parse_ramblock(QEMUFile *f, RAMBlock *block, ram_addr_t length)
+{
+    int ret = 0;
+    /* ADVISE is earlier, it shows the source has the postcopy capability on */
+    bool postcopy_advised = migration_incoming_postcopy_advised();
+
+    assert(block);
+
+    if (!qemu_ram_is_migratable(block)) {
+        error_report("block %s should not be migrated !", block->idstr);
+        ret = -EINVAL;
+    }
+
+    if (length != block->used_length) {
+        Error *local_err = NULL;
+
+        ret = qemu_ram_resize(block, length, &local_err);
+        if (local_err) {
+            error_report_err(local_err);
+        }
+    }
+    /* For postcopy we need to check hugepage sizes match */
+    if (postcopy_advised && migrate_postcopy_ram() &&
+        block->page_size != qemu_host_page_size) {
+        uint64_t remote_page_size = qemu_get_be64(f);
+        if (remote_page_size != block->page_size) {
+            error_report("Mismatched RAM page size %s "
+                         "(local) %zd != %" PRId64, block->idstr,
+                         block->page_size, remote_page_size);
+            ret = -EINVAL;
+        }
+    }
+    if (migrate_ignore_shared()) {
+        hwaddr addr = qemu_get_be64(f);
+        if (ramblock_is_ignored(block) &&
+            block->mr->addr != addr) {
+            error_report("Mismatched GPAs for block %s "
+                         "%" PRId64 "!= %" PRId64, block->idstr,
+                         (uint64_t)addr,
+                         (uint64_t)block->mr->addr);
+            ret = -EINVAL;
+        }
+    }
+    ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG, block->idstr);
+
+    return ret;
+}
+
+static int parse_ramblocks(QEMUFile *f, ram_addr_t total_ram_bytes)
+{
+    int ret = 0;
+
+    /* Synchronize RAM block list */
+    while (!ret && total_ram_bytes) {
+        char id[256];
+        RAMBlock *block;
+        ram_addr_t length;
+        int len = qemu_get_byte(f);
+
+        qemu_get_buffer(f, (uint8_t *)id, len);
+        id[len] = 0;
+        length = qemu_get_be64(f);
+
+        block = qemu_ram_block_by_name(id);
+        if (block) {
+            ret = parse_ramblock(f, block, length);
+        } else {
+            error_report("Unknown ramblock \"%s\", cannot accept "
+                         "migration", id);
+            ret = -EINVAL;
+        }
+        total_ram_bytes -= length;
+    }
+
+    return ret;
+}
+
 /**
  * ram_load_precopy: load pages in precopy case
  *
@@ -4333,14 +4410,13 @@ static int ram_load_precopy(QEMUFile *f)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
     int flags = 0, ret = 0, invalid_flags = 0, len = 0, i = 0;
-    /* ADVISE is earlier, it shows the source has the postcopy capability on */
-    bool postcopy_advised = migration_incoming_postcopy_advised();
+
     if (!migrate_use_compression()) {
         invalid_flags |= RAM_SAVE_FLAG_COMPRESS_PAGE;
     }
 
     while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
-        ram_addr_t addr, total_ram_bytes;
+        ram_addr_t addr;
         void *host = NULL, *host_bak = NULL;
         uint8_t ch;
 
@@ -4411,65 +4487,7 @@ static int ram_load_precopy(QEMUFile *f)
 
         switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
         case RAM_SAVE_FLAG_MEM_SIZE:
-            /* Synchronize RAM block list */
-            total_ram_bytes = addr;
-            while (!ret && total_ram_bytes) {
-                RAMBlock *block;
-                char id[256];
-                ram_addr_t length;
-
-                len = qemu_get_byte(f);
-                qemu_get_buffer(f, (uint8_t *)id, len);
-                id[len] = 0;
-                length = qemu_get_be64(f);
-
-                block = qemu_ram_block_by_name(id);
-                if (block && !qemu_ram_is_migratable(block)) {
-                    error_report("block %s should not be migrated !", id);
-                    ret = -EINVAL;
-                } else if (block) {
-                    if (length != block->used_length) {
-                        Error *local_err = NULL;
-
-                        ret = qemu_ram_resize(block, length,
-                                              &local_err);
-                        if (local_err) {
-                            error_report_err(local_err);
-                        }
-                    }
-                    /* For postcopy we need to check hugepage sizes match */
-                    if (postcopy_advised && migrate_postcopy_ram() &&
-                        block->page_size != qemu_host_page_size) {
-                        uint64_t remote_page_size = qemu_get_be64(f);
-                        if (remote_page_size != block->page_size) {
-                            error_report("Mismatched RAM page size %s "
-                                         "(local) %zd != %" PRId64,
-                                         id, block->page_size,
-                                         remote_page_size);
-                            ret = -EINVAL;
-                        }
-                    }
-                    if (migrate_ignore_shared()) {
-                        hwaddr addr = qemu_get_be64(f);
-                        if (ramblock_is_ignored(block) &&
-                            block->mr->addr != addr) {
-                            error_report("Mismatched GPAs for block %s "
-                                         "%" PRId64 "!= %" PRId64,
-                                         id, (uint64_t)addr,
-                                         (uint64_t)block->mr->addr);
-                            ret = -EINVAL;
-                        }
-                    }
-                    ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
-                                          block->idstr);
-                } else {
-                    error_report("Unknown ramblock \"%s\", cannot "
-                                 "accept migration", id);
-                    ret = -EINVAL;
-                }
-
-                total_ram_bytes -= length;
-            }
+            ret = parse_ramblocks(f, addr);
             break;
 
         case RAM_SAVE_FLAG_ZERO:
-- 
2.35.3



* [RFC PATCH v1 12/26] migration: Add support for 'fixed-ram' migration restore
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (10 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 11/26] migration: Refactor precopy ram loading code Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 13/26] tests/qtest: migration-test: Add tests for fixed-ram file-based migration Fabiano Rosas
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov

From: Nikolay Borisov <nborisov@suse.com>

Add the necessary code to parse the format changes for the 'fixed-ram'
capability.

One of the more notable changes in behavior is that in the 'fixed-ram'
case ram pages are restored in one go rather than constantly looping
through the migration stream.

Also, due to idiosyncrasies of the format, I have added the
'ram_migrated' flag, since it was easier to simply return directly from
->load_state rather than introduce more conditionals around the code to
prevent ->load_state from being called multiple times (from
qemu_loadvm_section_start_full/qemu_loadvm_section_part_end, i.e. from
multiple QEMU_VM_SECTION_(PART|END) flags).
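
For reference, here is a minimal standalone sketch (not part of the
patch, plain POSIX reads rather than the QEMUFile API) of the
per-ramblock header that parse_ramblocks_fixed_ram() below walks,
assuming none of the optional per-block fields (ignore-shared GPA,
postcopy page size) are present:

  #include <endian.h>
  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <unistd.h>

  /* Hypothetical helper, error handling omitted. */
  static void dump_one_ramblock_header(int fd)
  {
      uint8_t idstr_len;
      char idstr[256 + 1];
      uint64_t used_length, pages_offset;
      uint32_t bitmap_size;

      read(fd, &idstr_len, 1);          /* length of the block name */
      read(fd, idstr, idstr_len);
      idstr[idstr_len] = 0;
      read(fd, &used_length, 8);        /* big-endian on the wire */
      read(fd, &bitmap_size, 4);        /* shadow bitmap size in bytes */
      read(fd, &pages_offset, 8);       /* 1 MiB aligned start of the pages */

      printf("%s: %" PRIu64 " bytes, bitmap %" PRIu32 " bytes, pages at 0x%" PRIx64 "\n",
             idstr, be64toh(used_length), be32toh(bitmap_size),
             be64toh(pages_offset));

      /* The shadow bitmap follows right here; a page at offset 'off'
       * within the block is stored at pages_offset + off in the file. */
  }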

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.h |   2 +
 migration/ram.c       | 105 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 105 insertions(+), 2 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index 8cf3caecfe..84be34587f 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -96,6 +96,8 @@ struct MigrationIncomingState {
     bool           have_listen_thread;
     QemuThread     listen_thread;
 
+    bool ram_migrated;
+
     /* For the kernel to send us notifications */
     int       userfault_fd;
     /* To notify the fault_thread to wake, e.g., when need to quit */
diff --git a/migration/ram.c b/migration/ram.c
index 5c085d6154..1666ce6d5f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4396,6 +4396,100 @@ static int parse_ramblocks(QEMUFile *f, ram_addr_t total_ram_bytes)
     return ret;
 }
 
+static void read_ramblock_fixed_ram(QEMUFile *f, RAMBlock *block,
+                                    long num_pages, unsigned long *bitmap)
+{
+    unsigned long set_bit_idx, clear_bit_idx;
+    unsigned long len;
+    ram_addr_t offset;
+    void *host;
+    size_t read, completed, read_len;
+
+    for (set_bit_idx = find_first_bit(bitmap, num_pages);
+         set_bit_idx < num_pages;
+         set_bit_idx = find_next_bit(bitmap, num_pages, clear_bit_idx + 1)) {
+
+        clear_bit_idx = find_next_zero_bit(bitmap, num_pages, set_bit_idx + 1);
+
+        len = TARGET_PAGE_SIZE * (clear_bit_idx - set_bit_idx);
+        offset = set_bit_idx << TARGET_PAGE_BITS;
+
+        for (read = 0, completed = 0; completed < len; offset += read) {
+            host = host_from_ram_block_offset(block, offset);
+            read_len = MIN(len, TARGET_PAGE_SIZE);
+
+            read = qemu_get_buffer_at(f, host, read_len,
+                                      block->pages_offset + offset);
+            completed += read;
+        }
+    }
+}
+
+static int parse_ramblocks_fixed_ram(QEMUFile *f)
+{
+    int ret = 0;
+
+    while (!ret) {
+        char id[256];
+        RAMBlock *block;
+        ram_addr_t length;
+        long num_pages, bitmap_size;
+        int len = qemu_get_byte(f);
+        g_autofree unsigned long *dirty_bitmap = NULL;
+
+        qemu_get_buffer(f, (uint8_t *)id, len);
+        id[len] = 0;
+        length = qemu_get_be64(f);
+
+        block = qemu_ram_block_by_name(id);
+        if (block) {
+            ret = parse_ramblock(f, block, length);
+            if (ret < 0) {
+                return ret;
+            }
+        } else {
+            error_report("Unknown ramblock \"%s\", cannot accept "
+                         "migration", id);
+            ret = -EINVAL;
+            continue;
+        }
+
+        /* 1. read the bitmap size */
+        num_pages = length >> TARGET_PAGE_BITS;
+        bitmap_size = qemu_get_be32(f);
+
+        assert(bitmap_size == BITS_TO_LONGS(num_pages) * sizeof(unsigned long));
+
+        block->pages_offset = qemu_get_be64(f);
+
+        /* 2. read the actual bitmap */
+        dirty_bitmap = g_malloc0(bitmap_size);
+        if (qemu_get_buffer(f, (uint8_t *)dirty_bitmap, bitmap_size) != bitmap_size) {
+            error_report("Error parsing dirty bitmap");
+            return -EINVAL;
+        }
+
+        read_ramblock_fixed_ram(f, block, num_pages, dirty_bitmap);
+
+        /* Skip pages array */
+        qemu_set_offset(f, block->pages_offset + length, SEEK_SET);
+
+        /* Check if this is the last ramblock */
+        if (qemu_get_be64(f) == RAM_SAVE_FLAG_EOS) {
+            ret = 1;
+        } else {
+            /*
+             * If not, adjust the internal file index to account for the
+             * previous 64 bit read
+             */
+            qemu_file_skip(f, -8);
+            ret = 0;
+        }
+    }
+
+    return ret;
+}
+
 /**
  * ram_load_precopy: load pages in precopy case
  *
@@ -4415,7 +4509,7 @@ static int ram_load_precopy(QEMUFile *f)
         invalid_flags |= RAM_SAVE_FLAG_COMPRESS_PAGE;
     }
 
-    while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
+    while (!ret && !(flags & RAM_SAVE_FLAG_EOS) && !mis->ram_migrated) {
         ram_addr_t addr;
         void *host = NULL, *host_bak = NULL;
         uint8_t ch;
@@ -4487,7 +4581,14 @@ static int ram_load_precopy(QEMUFile *f)
 
         switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
         case RAM_SAVE_FLAG_MEM_SIZE:
-            ret = parse_ramblocks(f, addr);
+            if (migrate_fixed_ram()) {
+                ret = parse_ramblocks_fixed_ram(f);
+                if (ret == 1) {
+                    mis->ram_migrated = true;
+                }
+            } else {
+                ret = parse_ramblocks(f, addr);
+            }
             break;
 
         case RAM_SAVE_FLAG_ZERO:
-- 
2.35.3



* [RFC PATCH v1 13/26] tests/qtest: migration-test: Add tests for fixed-ram file-based migration
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (11 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 12/26] migration: Add support for 'fixed-ram' migration restore Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 14/26] migration: Add completion tracepoint Fabiano Rosas
                   ` (14 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov, Thomas Huth, Laurent Vivier,
	Paolo Bonzini

From: Nikolay Borisov <nborisov@suse.com>

Add basic tests for 'fixed-ram' migration.
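
For reference, one way to run just the new test on an x86-64 build
would be something like the following (binary paths are illustrative;
qtest prefixes test paths with the target architecture):

  $ QTEST_QEMU_BINARY=./qemu-system-x86_64 \
      ./tests/qtest/migration-test -p /x86_64/migration/precopy/file/fixed-ram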

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 tests/qtest/migration-test.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 13e5cdd5a4..84b4c761ad 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1544,6 +1544,26 @@ static void test_precopy_file_stream_ram(void)
     test_precopy_common(&args);
 }
 
+static void *migrate_fixed_ram_start(QTestState *from, QTestState *to)
+{
+    migrate_set_capability(from, "fixed-ram", true);
+    migrate_set_capability(to, "fixed-ram", true);
+
+    return NULL;
+}
+
+static void test_precopy_file_fixed_ram(void)
+{
+    g_autofree char *uri = g_strdup_printf("file:%s/migfile", tmpfs);
+    MigrateCommon args = {
+        .connect_uri = uri,
+        .listen_uri = "defer",
+        .start_hook = migrate_fixed_ram_start,
+    };
+
+    test_precopy_common(&args);
+}
+
 static void test_precopy_tcp_plain(void)
 {
     MigrateCommon args = {
@@ -2538,6 +2558,8 @@ int main(int argc, char **argv)
 
     qtest_add_func("/migration/precopy/file/stream-ram",
                    test_precopy_file_stream_ram);
+    qtest_add_func("/migration/precopy/file/fixed-ram",
+                   test_precopy_file_fixed_ram);
 
 #ifdef CONFIG_GNUTLS
     qtest_add_func("/migration/precopy/unix/tls/psk",
-- 
2.35.3



* [RFC PATCH v1 14/26] migration: Add completion tracepoint
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (12 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 13/26] tests/qtest: migration-test: Add tests for fixed-ram file-based migration Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 15/26] migration/multifd: Remove direct "socket" references Fabiano Rosas
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

Add a completion tracepoint that provides basic stats for
debugging. It displays throughput (MB/s and pages/s) and total time (ms).

Usage:
  $QEMU ... -trace migration_status

Output:
  migration_status 1506 MB/s, 436725 pages/s, 8698 ms

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c  | 6 +++---
 migration/migration.h  | 4 +++-
 migration/savevm.c     | 4 ++++
 migration/trace-events | 1 +
 4 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 29630523e2..17b26c1808 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3811,7 +3811,7 @@ static uint64_t migration_total_bytes(MigrationState *s)
         ram_counters.multifd_bytes;
 }
 
-static void migration_calculate_complete(MigrationState *s)
+void migration_calculate_complete(MigrationState *s)
 {
     uint64_t bytes = migration_total_bytes(s);
     int64_t end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
@@ -3843,8 +3843,7 @@ static void update_iteration_initial_status(MigrationState *s)
     s->iteration_initial_pages = ram_get_total_transferred_pages();
 }
 
-static void migration_update_counters(MigrationState *s,
-                                      int64_t current_time)
+void migration_update_counters(MigrationState *s, int64_t current_time)
 {
     uint64_t transferred, transferred_pages, time_spent;
     uint64_t current_bytes; /* bytes transferred since the beginning */
@@ -3941,6 +3940,7 @@ static void migration_iteration_finish(MigrationState *s)
     case MIGRATION_STATUS_COMPLETED:
         migration_calculate_complete(s);
         runstate_set(RUN_STATE_POSTMIGRATE);
+        trace_migration_status((int)s->mbps / 8, (int)s->pages_per_second, s->total_time);
         break;
     case MIGRATION_STATUS_COLO:
         if (!migrate_colo_enabled()) {
diff --git a/migration/migration.h b/migration/migration.h
index 84be34587f..01c8201cfa 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -387,7 +387,9 @@ struct MigrationState {
 };
 
 void migrate_set_state(int *state, int old_state, int new_state);
-
+void migration_calculate_complete(MigrationState *s);
+void migration_update_counters(MigrationState *s,
+                               int64_t current_time);
 void migration_fd_process_incoming(QEMUFile *f, Error **errp);
 void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
 void migration_incoming_process(void);
diff --git a/migration/savevm.c b/migration/savevm.c
index 1f1bc19224..b369d11b19 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1638,6 +1638,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
     qemu_mutex_lock_iothread();
 
     while (qemu_file_get_error(f) == 0) {
+        migration_update_counters(ms, qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
         if (qemu_savevm_state_iterate(f, false) > 0) {
             break;
         }
@@ -1660,6 +1661,9 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
     }
     migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP, status);
 
+    migration_calculate_complete(ms);
+    trace_migration_status((int)ms->mbps / 8, (int)ms->pages_per_second, ms->total_time);
+
     /* f is outer parameter, it should not stay in global migration state after
      * this function finished */
     ms->to_dst_file = NULL;
diff --git a/migration/trace-events b/migration/trace-events
index 92161eeac5..23e4dad1ec 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -165,6 +165,7 @@ migration_return_path_end_after(int rp_error) "%d"
 migration_thread_after_loop(void) ""
 migration_thread_file_err(void) ""
 migration_thread_setup_complete(void) ""
+migration_status(int mbps, int pages_per_second, int64_t total_time) "%d MB/s, %d pages/s, %ld ms"
 open_return_path_on_source(void) ""
 open_return_path_on_source_continue(void) ""
 postcopy_start(void) ""
-- 
2.35.3



* [RFC PATCH v1 15/26] migration/multifd: Remove direct "socket" references
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (13 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 14/26] migration: Add completion tracepoint Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 16/26] migration/multifd: Allow multifd without packets Fabiano Rosas
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

We're about to enable support for other transports in multifd, so
remove direct references to sockets.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index cbc0dfe39b..e613d85e24 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -512,6 +512,11 @@ static void multifd_send_terminate_threads(Error *err)
     }
 }
 
+static int multifd_send_channel_destroy(QIOChannel *send)
+{
+    return socket_send_channel_destroy(send);
+}
+
 void multifd_save_cleanup(void)
 {
     int i;
@@ -534,7 +539,7 @@ void multifd_save_cleanup(void)
         if (p->registered_yank) {
             migration_ioc_unregister_yank(p->c);
         }
-        socket_send_channel_destroy(p->c);
+        multifd_send_channel_destroy(p->c);
         p->c = NULL;
         qemu_mutex_destroy(&p->mutex);
         qemu_sem_destroy(&p->sem);
@@ -889,20 +894,25 @@ static void multifd_new_send_channel_cleanup(MultiFDSendParams *p,
 static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
 {
     MultiFDSendParams *p = opaque;
-    QIOChannel *sioc = QIO_CHANNEL(qio_task_get_source(task));
+    QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
     Error *local_err = NULL;
 
     trace_multifd_new_send_channel_async(p->id);
     if (!qio_task_propagate_error(task, &local_err)) {
-        p->c = QIO_CHANNEL(sioc);
+        p->c = QIO_CHANNEL(ioc);
         qio_channel_set_delay(p->c, false);
         p->running = true;
-        if (multifd_channel_connect(p, sioc, local_err)) {
+        if (multifd_channel_connect(p, ioc, local_err)) {
             return;
         }
     }
 
-    multifd_new_send_channel_cleanup(p, sioc, local_err);
+    multifd_new_send_channel_cleanup(p, ioc, local_err);
+}
+
+static void multifd_new_send_channel_create(gpointer opaque)
+{
+    socket_send_channel_create(multifd_new_send_channel_async, opaque);
 }
 
 int multifd_save_setup(Error **errp)
@@ -951,7 +961,7 @@ int multifd_save_setup(Error **errp)
             p->write_flags = 0;
         }
 
-        socket_send_channel_create(multifd_new_send_channel_async, p);
+        multifd_new_send_channel_create(p);
     }
 
     for (i = 0; i < thread_count; i++) {
-- 
2.35.3



* [RFC PATCH v1 16/26] migration/multifd: Allow multifd without packets
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (14 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 15/26] migration/multifd: Remove direct "socket" references Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 17/26] migration/multifd: Add outgoing QIOChannelFile support Fabiano Rosas
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

For the upcoming 'fixed-ram' migration stream format, we cannot use
multifd packets because each write into the ramblock section of the
migration file is expected to contain only guest pages. They are
written at their respective offsets relative to the ramblock section
header.

There is no space for the packet information and the expected gains
from the new approach come partly from being able to write the pages
sequentially without extraneous data in between.
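
As a minimal illustration (the helper name is made up, not the patch's
API), the file position of a guest page under fixed-ram is a pure
function of the block's pages_offset and the page's offset within the
block, which is what lets each channel write its pages independently:

  #include <stdint.h>

  /* Illustration only: where a guest page lands in the migration file. */
  static inline uint64_t fixed_ram_file_offset(uint64_t pages_offset,
                                               uint64_t offset_in_block)
  {
      return pages_offset + offset_in_block;
  }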

The new format also doesn't need the packets: all necessary information
can be taken from the standard migration headers, with some (future)
changes to the multifd code.

Use the presence of the fixed-ram capability to decide whether to send
packets. For now this has no effect as fixed-ram cannot yet be enabled
with multifd.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c |   5 ++
 migration/migration.h |   2 +-
 migration/multifd.c   | 119 ++++++++++++++++++++++++++----------------
 3 files changed, 80 insertions(+), 46 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 17b26c1808..c647fbffa6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2764,6 +2764,11 @@ int migrate_fixed_ram(void)
     return migrate_get_current()->enabled_capabilities[MIGRATION_CAPABILITY_FIXED_RAM];
 }
 
+bool migrate_multifd_use_packets(void)
+{
+    return !migrate_fixed_ram();
+}
+
 int migrate_multifd_zlib_level(void)
 {
     MigrationState *s;
diff --git a/migration/migration.h b/migration/migration.h
index 01c8201cfa..d7a014ce57 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -421,7 +421,7 @@ bool migrate_dirty_bitmaps(void);
 bool migrate_ignore_shared(void);
 bool migrate_validate_uuid(void);
 int migrate_fixed_ram(void);
-
+bool migrate_multifd_use_packets(void);
 bool migrate_auto_converge(void);
 bool migrate_use_multifd(void);
 bool migrate_pause_before_switchover(void);
diff --git a/migration/multifd.c b/migration/multifd.c
index e613d85e24..9f6b2787ed 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -659,18 +659,22 @@ static void *multifd_send_thread(void *opaque)
     Error *local_err = NULL;
     int ret = 0;
     bool use_zero_copy_send = migrate_use_zero_copy_send();
+    bool use_packets = migrate_multifd_use_packets();
 
     thread = MigrationThreadAdd(p->name, qemu_get_thread_id());
 
     trace_multifd_send_thread_start(p->id);
     rcu_register_thread();
 
-    if (multifd_send_initial_packet(p, &local_err) < 0) {
-        ret = -1;
-        goto out;
+    if (use_packets) {
+        if (multifd_send_initial_packet(p, &local_err) < 0) {
+            ret = -1;
+            goto out;
+        }
+
+        /* initial packet */
+        p->num_packets = 1;
     }
-    /* initial packet */
-    p->num_packets = 1;
 
     while (true) {
         qemu_sem_wait(&p->sem);
@@ -681,11 +685,10 @@ static void *multifd_send_thread(void *opaque)
         qemu_mutex_lock(&p->mutex);
 
         if (p->pending_job) {
-            uint64_t packet_num = p->packet_num;
             uint32_t flags;
             p->normal_num = 0;
 
-            if (use_zero_copy_send) {
+            if (!use_packets || use_zero_copy_send) {
                 p->iovs_num = 0;
             } else {
                 p->iovs_num = 1;
@@ -703,16 +706,20 @@ static void *multifd_send_thread(void *opaque)
                     break;
                 }
             }
-            multifd_send_fill_packet(p);
+
+            if (use_packets) {
+                multifd_send_fill_packet(p);
+                p->num_packets++;
+            }
+
             flags = p->flags;
             p->flags = 0;
-            p->num_packets++;
             p->total_normal_pages += p->normal_num;
             p->pages->num = 0;
             p->pages->block = NULL;
             qemu_mutex_unlock(&p->mutex);
 
-            trace_multifd_send(p->id, packet_num, p->normal_num, flags,
+            trace_multifd_send(p->id, p->packet_num, p->normal_num, flags,
                                p->next_packet_size);
 
             if (use_zero_copy_send) {
@@ -722,7 +729,7 @@ static void *multifd_send_thread(void *opaque)
                 if (ret != 0) {
                     break;
                 }
-            } else {
+            } else if (use_packets) {
                 /* Send header using the same writev call */
                 p->iov[0].iov_len = p->packet_len;
                 p->iov[0].iov_base = p->packet;
@@ -919,6 +926,7 @@ int multifd_save_setup(Error **errp)
 {
     int thread_count;
     uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size();
+    bool use_packets = migrate_multifd_use_packets();
     uint8_t i;
 
     if (!migrate_use_multifd()) {
@@ -943,14 +951,20 @@ int multifd_save_setup(Error **errp)
         p->pending_job = 0;
         p->id = i;
         p->pages = multifd_pages_init(page_count);
-        p->packet_len = sizeof(MultiFDPacket_t)
-                      + sizeof(uint64_t) * page_count;
-        p->packet = g_malloc0(p->packet_len);
-        p->packet->magic = cpu_to_be32(MULTIFD_MAGIC);
-        p->packet->version = cpu_to_be32(MULTIFD_VERSION);
+
+        if (use_packets) {
+            p->packet_len = sizeof(MultiFDPacket_t)
+                          + sizeof(uint64_t) * page_count;
+            p->packet = g_malloc0(p->packet_len);
+            p->packet->magic = cpu_to_be32(MULTIFD_MAGIC);
+            p->packet->version = cpu_to_be32(MULTIFD_VERSION);
+
+            /* We need one extra place for the packet header */
+            p->iov = g_new0(struct iovec, page_count + 1);
+        } else {
+            p->iov = g_new0(struct iovec, page_count);
+        }
         p->name = g_strdup_printf("multifdsend_%d", i);
-        /* We need one extra place for the packet header */
-        p->iov = g_new0(struct iovec, page_count + 1);
         p->normal = g_new0(ram_addr_t, page_count);
         p->page_size = qemu_target_page_size();
         p->page_count = page_count;
@@ -1082,7 +1096,7 @@ void multifd_recv_sync_main(void)
 {
     int i;
 
-    if (!migrate_use_multifd()) {
+    if (!migrate_use_multifd() || !migrate_multifd_use_packets()) {
         return;
     }
     for (i = 0; i < migrate_multifd_channels(); i++) {
@@ -1109,6 +1123,7 @@ static void *multifd_recv_thread(void *opaque)
 {
     MultiFDRecvParams *p = opaque;
     Error *local_err = NULL;
+    bool use_packets = migrate_multifd_use_packets();
     int ret;
 
     trace_multifd_recv_thread_start(p->id);
@@ -1121,17 +1136,20 @@ static void *multifd_recv_thread(void *opaque)
             break;
         }
 
-        ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
-                                       p->packet_len, &local_err);
-        if (ret == 0 || ret == -1) {   /* 0: EOF  -1: Error */
-            break;
-        }
+        if (use_packets) {
+            ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
+                                           p->packet_len, &local_err);
+            if (ret == 0 || ret == -1) {   /* 0: EOF  -1: Error */
+                break;
+            }
 
-        qemu_mutex_lock(&p->mutex);
-        ret = multifd_recv_unfill_packet(p, &local_err);
-        if (ret) {
-            qemu_mutex_unlock(&p->mutex);
-            break;
+            qemu_mutex_lock(&p->mutex);
+            ret = multifd_recv_unfill_packet(p, &local_err);
+            if (ret) {
+                qemu_mutex_unlock(&p->mutex);
+                break;
+            }
+            p->num_packets++;
         }
 
         flags = p->flags;
@@ -1139,7 +1157,7 @@ static void *multifd_recv_thread(void *opaque)
         p->flags &= ~MULTIFD_FLAG_SYNC;
         trace_multifd_recv(p->id, p->packet_num, p->normal_num, flags,
                            p->next_packet_size);
-        p->num_packets++;
+
         p->total_normal_pages += p->normal_num;
         qemu_mutex_unlock(&p->mutex);
 
@@ -1174,6 +1192,7 @@ int multifd_load_setup(Error **errp)
 {
     int thread_count;
     uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size();
+    bool use_packets = migrate_multifd_use_packets();
     uint8_t i;
 
     /*
@@ -1198,9 +1217,12 @@ int multifd_load_setup(Error **errp)
         qemu_sem_init(&p->sem_sync, 0);
         p->quit = false;
         p->id = i;
-        p->packet_len = sizeof(MultiFDPacket_t)
-                      + sizeof(uint64_t) * page_count;
-        p->packet = g_malloc0(p->packet_len);
+
+        if (use_packets) {
+            p->packet_len = sizeof(MultiFDPacket_t)
+                + sizeof(uint64_t) * page_count;
+            p->packet = g_malloc0(p->packet_len);
+        }
         p->name = g_strdup_printf("multifdrecv_%d", i);
         p->iov = g_new0(struct iovec, page_count);
         p->normal = g_new0(ram_addr_t, page_count);
@@ -1246,18 +1268,26 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
 {
     MultiFDRecvParams *p;
     Error *local_err = NULL;
-    int id;
+    bool use_packets = migrate_multifd_use_packets();
+    int id, num_packets = 0;
 
-    id = multifd_recv_initial_packet(ioc, &local_err);
-    if (id < 0) {
-        multifd_recv_terminate_threads(local_err);
-        error_propagate_prepend(errp, local_err,
-                                "failed to receive packet"
-                                " via multifd channel %d: ",
-                                qatomic_read(&multifd_recv_state->count));
-        return;
+    if (use_packets) {
+        id = multifd_recv_initial_packet(ioc, &local_err);
+        if (id < 0) {
+            multifd_recv_terminate_threads(local_err);
+            error_propagate_prepend(errp, local_err,
+                                    "failed to receive packet"
+                                    " via multifd channel %d: ",
+                                    qatomic_read(&multifd_recv_state->count));
+            return;
+        }
+        trace_multifd_recv_new_channel(id);
+
+        /* initial packet */
+        num_packets = 1;
+    } else {
+        id = 0;
     }
-    trace_multifd_recv_new_channel(id);
 
     p = &multifd_recv_state->params[id];
     if (p->c != NULL) {
@@ -1268,9 +1298,8 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
         return;
     }
     p->c = ioc;
+    p->num_packets = num_packets;
     object_ref(OBJECT(ioc));
-    /* initial packet */
-    p->num_packets = 1;
 
     p->running = true;
     qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
-- 
2.35.3



* [RFC PATCH v1 17/26] migration/multifd: Add outgoing QIOChannelFile support
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (15 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 16/26] migration/multifd: Allow multifd without packets Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 18/26] migration/multifd: Add incoming " Fabiano Rosas
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

Allow multifd to open file-backed channels. This will be used when
enabling the fixed-ram migration stream format, which expects a
seekable transport.

The QIOChannel read and write methods will use the preadv/pwritev
variants, which don't update the file offset at each call, so we can
reuse the fd without re-opening it for every channel.
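
The property relied on can be shown with a small standalone sketch
(plain POSIX, not the QIOChannel API): pwritev() takes an explicit
file offset and never advances the offset shared by the duplicated
fds, so every channel can reuse the same fd concurrently:

  #include <sys/types.h>
  #include <sys/uio.h>

  /* Write one buffer at an absolute position in the migration file. */
  static ssize_t write_buf_at(int fd, void *buf, size_t len, off_t file_off)
  {
      struct iovec iov = { .iov_base = buf, .iov_len = len };

      /* Does not move the file offset shared across channels. */
      return pwritev(fd, &iov, 1, file_off);
  }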

Note that this is just setup code and multifd cannot yet make use of
the file channels.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 include/io/channel-file.h |  1 +
 migration/file.c          | 63 +++++++++++++++++++++++++++++++++++++--
 migration/file.h          |  6 +++-
 migration/migration.c     | 11 ++++++-
 migration/migration.h     |  1 +
 migration/multifd.c       | 14 +++++++--
 6 files changed, 89 insertions(+), 7 deletions(-)

diff --git a/include/io/channel-file.h b/include/io/channel-file.h
index 50e8eb1138..85b6c34a72 100644
--- a/include/io/channel-file.h
+++ b/include/io/channel-file.h
@@ -22,6 +22,7 @@
 #define QIO_CHANNEL_FILE_H
 
 #include "io/channel.h"
+#include "io/task.h"
 #include "qom/object.h"
 
 #define TYPE_QIO_CHANNEL_FILE "qio-channel-file"
diff --git a/migration/file.c b/migration/file.c
index ab4e12926c..f674cd1bdb 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -1,20 +1,77 @@
 #include "qemu/osdep.h"
-#include "channel.h"
 #include "io/channel-file.h"
 #include "file.h"
 #include "qemu/error-report.h"
 
+static struct FileOutgoingArgs {
+    char *fname;
+    int flags;
+    int mode;
+} outgoing_args;
+
+static void qio_channel_file_connect_worker(QIOTask *task, gpointer opaque)
+{
+    /* noop */
+}
+
+static void file_migration_cancel(Error *errp)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
+                      MIGRATION_STATUS_FAILED);
+    migration_cancel(errp);
+}
+
+int file_send_channel_destroy(QIOChannel *ioc)
+{
+    if (ioc) {
+        qio_channel_close(ioc, NULL);
+        object_unref(OBJECT(ioc));
+    }
+    g_free(outgoing_args.fname);
+    outgoing_args.fname = NULL;
+
+    return 0;
+}
+
+void file_send_channel_create(QIOTaskFunc f, void *data)
+{
+    QIOChannelFile *ioc;
+    QIOTask *task;
+    Error *errp = NULL;
+
+    ioc = qio_channel_file_new_path(outgoing_args.fname,
+                                    outgoing_args.flags,
+                                    outgoing_args.mode, &errp);
+    if (!ioc) {
+        file_migration_cancel(errp);
+        return;
+    }
+
+    task = qio_task_new(OBJECT(ioc), f, (gpointer)data, NULL);
+    qio_task_run_in_thread(task, qio_channel_file_connect_worker,
+                           (gpointer)data, NULL, NULL);
+}
 
 void file_start_outgoing_migration(MigrationState *s, const char *fname, Error **errp)
 {
     QIOChannelFile *ioc;
+    int flags = O_CREAT | O_TRUNC | O_WRONLY;
+    mode_t mode = 0660;
 
-    ioc = qio_channel_file_new_path(fname, O_CREAT | O_TRUNC | O_WRONLY, 0660, errp);
+    ioc = qio_channel_file_new_path(fname, flags, mode, errp);
     if (!ioc) {
-        error_report("Error creating a channel");
+        error_report("Error creating migration outgoing channel");
         return;
     }
 
+    outgoing_args.fname = g_strdup(fname);
+    outgoing_args.flags = flags;
+    outgoing_args.mode = mode;
+
     qio_channel_set_name(QIO_CHANNEL(ioc), "migration-file-outgoing");
     migration_channel_connect(s, QIO_CHANNEL(ioc), NULL, NULL);
     object_unref(OBJECT(ioc));
diff --git a/migration/file.h b/migration/file.h
index cdbd291322..5e27ca6afd 100644
--- a/migration/file.h
+++ b/migration/file.h
@@ -1,10 +1,14 @@
 #ifndef QEMU_MIGRATION_FILE_H
 #define QEMU_MIGRATION_FILE_H
 
+#include "io/task.h"
+#include "channel.h"
+
 void file_start_outgoing_migration(MigrationState *s,
                                    const char *filename,
                                    Error **errp);
 
 void file_start_incoming_migration(const char *fname, Error **errp);
+void file_send_channel_create(QIOTaskFunc f, void *data);
+int file_send_channel_destroy(QIOChannel *ioc);
 #endif
-
diff --git a/migration/migration.c b/migration/migration.c
index c647fbffa6..6594c2f404 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -194,7 +194,7 @@ static bool migration_needs_multiple_sockets(void)
 static bool uri_supports_multi_channels(const char *uri)
 {
     return strstart(uri, "tcp:", NULL) || strstart(uri, "unix:", NULL) ||
-           strstart(uri, "vsock:", NULL);
+           strstart(uri, "vsock:", NULL) || strstart(uri, "file:", NULL);
 }
 
 static bool
@@ -2740,6 +2740,15 @@ bool migrate_pause_before_switchover(void)
         MIGRATION_CAPABILITY_PAUSE_BEFORE_SWITCHOVER];
 }
 
+bool migrate_to_file(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return qemu_file_is_seekable(s->to_dst_file);
+}
+
 int migrate_multifd_channels(void)
 {
     MigrationState *s;
diff --git a/migration/migration.h b/migration/migration.h
index d7a014ce57..8459201958 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -425,6 +425,7 @@ bool migrate_multifd_use_packets(void);
 bool migrate_auto_converge(void);
 bool migrate_use_multifd(void);
 bool migrate_pause_before_switchover(void);
+bool migrate_to_file(void);
 int migrate_multifd_channels(void);
 MultiFDCompression migrate_multifd_compression(void);
 int migrate_multifd_zlib_level(void);
diff --git a/migration/multifd.c b/migration/multifd.c
index 9f6b2787ed..50bd9b32eb 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -17,6 +17,7 @@
 #include "exec/ramblock.h"
 #include "qemu/error-report.h"
 #include "qapi/error.h"
+#include "file.h"
 #include "ram.h"
 #include "migration.h"
 #include "socket.h"
@@ -27,6 +28,7 @@
 #include "threadinfo.h"
 
 #include "qemu/yank.h"
+#include "io/channel-file.h"
 #include "io/channel-socket.h"
 #include "yank_functions.h"
 
@@ -514,7 +516,11 @@ static void multifd_send_terminate_threads(Error *err)
 
 static int multifd_send_channel_destroy(QIOChannel *send)
 {
-    return socket_send_channel_destroy(send);
+    if (migrate_to_file()) {
+        return file_send_channel_destroy(send);
+    } else {
+        return socket_send_channel_destroy(send);
+    }
 }
 
 void multifd_save_cleanup(void)
@@ -919,7 +925,11 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
 
 static void multifd_new_send_channel_create(gpointer opaque)
 {
-    socket_send_channel_create(multifd_new_send_channel_async, opaque);
+    if (migrate_to_file()) {
+        file_send_channel_create(multifd_new_send_channel_async, opaque);
+    } else {
+        socket_send_channel_create(multifd_new_send_channel_async, opaque);
+    }
 }
 
 int multifd_save_setup(Error **errp)
-- 
2.35.3



* [RFC PATCH v1 18/26] migration/multifd: Add incoming QIOChannelFile support
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (16 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 17/26] migration/multifd: Add outgoing QIOChannelFile support Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 19/26] migration/multifd: Add pages to the receiving side Fabiano Rosas
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

On the receiving side we don't need to differentiate between the main
channel and the multifd threads, so whichever channel is set up first
gets to be the main one. And since there are no packets, use the atomic
channel count to index into the params array.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/file.c      | 38 +++++++++++++++++++++++++++++++++-----
 migration/migration.c |  2 ++
 migration/multifd.c   |  7 ++++++-
 migration/multifd.h   |  1 +
 4 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/migration/file.c b/migration/file.c
index f674cd1bdb..6f40894488 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -2,6 +2,7 @@
 #include "io/channel-file.h"
 #include "file.h"
 #include "qemu/error-report.h"
+#include "migration.h"
 
 static struct FileOutgoingArgs {
     char *fname;
@@ -77,17 +78,44 @@ void file_start_outgoing_migration(MigrationState *s, const char *fname, Error *
     object_unref(OBJECT(ioc));
 }
 
+static void file_process_migration_incoming(QIOTask *task, gpointer opaque)
+{
+    QIOChannelFile *ioc = opaque;
+
+    migration_channel_process_incoming(QIO_CHANNEL(ioc));
+    object_unref(OBJECT(ioc));
+}
+
 void file_start_incoming_migration(const char *fname, Error **errp)
 {
     QIOChannelFile *ioc;
+    QIOTask *task;
+    int channels = 1;
+    int i = 0, fd;
 
     ioc = qio_channel_file_new_path(fname, O_RDONLY, 0, errp);
     if (!ioc) {
-        error_report("Error creating a channel");
+        goto out;
+    }
+
+    if (migrate_use_multifd()) {
+        channels += migrate_multifd_channels();
+    }
+
+    fd = ioc->fd;
+
+    do {
+        qio_channel_set_name(QIO_CHANNEL(ioc), "migration-file-incoming");
+        task = qio_task_new(OBJECT(ioc), file_process_migration_incoming,
+                            (gpointer)ioc, NULL);
+
+        qio_task_run_in_thread(task, qio_channel_file_connect_worker,
+                               (gpointer)ioc, NULL, NULL);
+    } while (++i < channels && (ioc = qio_channel_file_new_fd(fd)));
+
+out:
+    if (!ioc) {
+        error_report("Error creating migration incoming channel");
         return;
     }
-
-    qio_channel_set_name(QIO_CHANNEL(ioc), "migration-file-incoming");
-    migration_channel_process_incoming(QIO_CHANNEL(ioc));
-    object_unref(OBJECT(ioc));
 }
diff --git a/migration/migration.c b/migration/migration.c
index 6594c2f404..258709aee1 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -794,6 +794,8 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
         }
 
         default_channel = (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC));
+    } else if (migrate_use_multifd() && migrate_fixed_ram()) {
+        default_channel = multifd_recv_first_channel();
     } else {
         default_channel = !mis->from_src_file;
     }
diff --git a/migration/multifd.c b/migration/multifd.c
index 50bd9b32eb..1332b426ce 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1254,6 +1254,11 @@ int multifd_load_setup(Error **errp)
     return 0;
 }
 
+bool multifd_recv_first_channel(void)
+{
+    return !multifd_recv_state;
+}
+
 bool multifd_recv_all_channels_created(void)
 {
     int thread_count = migrate_multifd_channels();
@@ -1296,7 +1301,7 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
         /* initial packet */
         num_packets = 1;
     } else {
-        id = 0;
+        id = qatomic_read(&multifd_recv_state->count);
     }
 
     p = &multifd_recv_state->params[id];
diff --git a/migration/multifd.h b/migration/multifd.h
index 7cfc265148..354150ff55 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -18,6 +18,7 @@ void multifd_save_cleanup(void);
 int multifd_load_setup(Error **errp);
 void multifd_load_cleanup(void);
 void multifd_load_shutdown(void);
+bool multifd_recv_first_channel(void);
 bool multifd_recv_all_channels_created(void);
 void multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
 void multifd_recv_sync_main(void);
-- 
2.35.3



* [RFC PATCH v1 19/26] migration/multifd: Add pages to the receiving side
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (17 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 18/26] migration/multifd: Add incoming " Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 20/26] io: Add a pwritev/preadv version that takes a discontiguous iovec Fabiano Rosas
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

Currently multifd does not need to have knowledge of pages on the
receiving side because all the information needed is within the
packets that come in the stream.

We're about to add support for fixed-ram migration, which cannot use
packets because it expects the ramblock section in the migration file
to contain only the guest page data.

Add a MultiFDPages_t pointer to multifd_recv_state and use the pages
similarly to what we already do on the sending side. The pages are used
to transfer data between the ram migration code in the main migration
thread and the multifd receiving threads.
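
As a rough generic sketch of the handoff this enables (pthread-based,
names made up, not the QEMU API): the producer fills a batch of page
offsets and, once it is full, trades it for an idle worker's drained
batch and wakes that worker, which mirrors the
multifd_recv_queue_page()/multifd_recv_pages() pair below:

  #include <pthread.h>
  #include <stddef.h>

  typedef struct {
      size_t offsets[128];
      size_t num;
  } Batch;

  typedef struct {
      pthread_mutex_t lock;
      pthread_cond_t cond;
      Batch *batch;     /* batch currently owned by this worker */
      int pending;      /* set when the worker has a full batch queued */
  } Worker;

  /* Producer side: give 'full' to the worker, get its drained batch back. */
  static Batch *hand_off_batch(Worker *w, Batch *full)
  {
      Batch *empty;

      pthread_mutex_lock(&w->lock);
      empty = w->batch;
      w->batch = full;
      w->pending = 1;
      pthread_cond_signal(&w->cond);
      pthread_mutex_unlock(&w->lock);

      empty->num = 0;
      return empty;
  }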

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd.c | 98 +++++++++++++++++++++++++++++++++++++++++++++
 migration/multifd.h | 12 ++++++
 2 files changed, 110 insertions(+)

diff --git a/migration/multifd.c b/migration/multifd.c
index 1332b426ce..20ef665218 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1004,6 +1004,8 @@ int multifd_save_setup(Error **errp)
 
 struct {
     MultiFDRecvParams *params;
+    /* array of pages to receive */
+    MultiFDPages_t *pages;
     /* number of created threads */
     int count;
     /* syncs main thread and channels */
@@ -1014,6 +1016,66 @@ struct {
     MultiFDMethods *ops;
 } *multifd_recv_state;
 
+static int multifd_recv_pages(QEMUFile *f)
+{
+    int i;
+    static int next_recv_channel;
+    MultiFDRecvParams *p = NULL;
+    MultiFDPages_t *pages = multifd_recv_state->pages;
+
+    /*
+     * next_channel can remain from a previous migration that was
+     * using more channels, so ensure it doesn't overflow if the
+     * limit is lower now.
+     */
+    next_recv_channel %= migrate_multifd_channels();
+    for (i = next_recv_channel;; i = (i + 1) % migrate_multifd_channels()) {
+        p = &multifd_recv_state->params[i];
+
+        qemu_mutex_lock(&p->mutex);
+        if (p->quit) {
+            error_report("%s: channel %d has already quit!", __func__, i);
+            qemu_mutex_unlock(&p->mutex);
+            return -1;
+        }
+        if (!p->pending_job) {
+            p->pending_job++;
+            next_recv_channel = (i + 1) % migrate_multifd_channels();
+            break;
+        }
+        qemu_mutex_unlock(&p->mutex);
+    }
+
+    multifd_recv_state->pages = p->pages;
+    p->pages = pages;
+    qemu_mutex_unlock(&p->mutex);
+    qemu_sem_post(&p->sem);
+
+    return 1;
+}
+
+int multifd_recv_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset)
+{
+    MultiFDPages_t *pages = multifd_recv_state->pages;
+
+    if (!pages->block) {
+        pages->block = block;
+    }
+
+    pages->offset[pages->num] = offset;
+    pages->num++;
+
+    if (pages->num < pages->allocated) {
+        return 1;
+    }
+
+    if (multifd_recv_pages(f) < 0) {
+        return -1;
+    }
+
+    return 1;
+}
+
 static void multifd_recv_terminate_threads(Error *err)
 {
     int i;
@@ -1035,6 +1097,7 @@ static void multifd_recv_terminate_threads(Error *err)
 
         qemu_mutex_lock(&p->mutex);
         p->quit = true;
+        qemu_sem_post(&p->sem);
         /*
          * We could arrive here for two reasons:
          *  - normal quit, i.e. everything went fine, just finished
@@ -1083,9 +1146,12 @@ void multifd_load_cleanup(void)
         object_unref(OBJECT(p->c));
         p->c = NULL;
         qemu_mutex_destroy(&p->mutex);
+        qemu_sem_destroy(&p->sem);
         qemu_sem_destroy(&p->sem_sync);
         g_free(p->name);
         p->name = NULL;
+        multifd_pages_clear(p->pages);
+        p->pages = NULL;
         p->packet_len = 0;
         g_free(p->packet);
         p->packet = NULL;
@@ -1098,6 +1164,8 @@ void multifd_load_cleanup(void)
     qemu_sem_destroy(&multifd_recv_state->sem_sync);
     g_free(multifd_recv_state->params);
     multifd_recv_state->params = NULL;
+    multifd_pages_clear(multifd_recv_state->pages);
+    multifd_recv_state->pages = NULL;
     g_free(multifd_recv_state);
     multifd_recv_state = NULL;
 }
@@ -1160,6 +1228,25 @@ static void *multifd_recv_thread(void *opaque)
                 break;
             }
             p->num_packets++;
+        } else {
+            /*
+             * No packets, so we need to wait for the vmstate code to
+             * queue pages.
+             */
+            qemu_sem_wait(&p->sem);
+            qemu_mutex_lock(&p->mutex);
+            if (!p->pending_job) {
+                qemu_mutex_unlock(&p->mutex);
+                break;
+            }
+
+            for (int i = 0; i < p->pages->num; i++) {
+                p->normal[p->normal_num] = p->pages->offset[i];
+                p->normal_num++;
+            }
+
+            p->pages->num = 0;
+            p->host = p->pages->block->host;
         }
 
         flags = p->flags;
@@ -1182,6 +1269,13 @@ static void *multifd_recv_thread(void *opaque)
             qemu_sem_post(&multifd_recv_state->sem_sync);
             qemu_sem_wait(&p->sem_sync);
         }
+
+        if (!use_packets) {
+            qemu_mutex_lock(&p->mutex);
+            p->pending_job--;
+            p->pages->block = NULL;
+            qemu_mutex_unlock(&p->mutex);
+        }
     }
 
     if (local_err) {
@@ -1216,6 +1310,7 @@ int multifd_load_setup(Error **errp)
     thread_count = migrate_multifd_channels();
     multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
     multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
+    multifd_recv_state->pages = multifd_pages_init(page_count);
     qatomic_set(&multifd_recv_state->count, 0);
     qemu_sem_init(&multifd_recv_state->sem_sync, 0);
     multifd_recv_state->ops = multifd_ops[migrate_multifd_compression()];
@@ -1224,9 +1319,12 @@ int multifd_load_setup(Error **errp)
         MultiFDRecvParams *p = &multifd_recv_state->params[i];
 
         qemu_mutex_init(&p->mutex);
+        qemu_sem_init(&p->sem, 0);
         qemu_sem_init(&p->sem_sync, 0);
         p->quit = false;
+        p->pending_job = 0;
         p->id = i;
+        p->pages = multifd_pages_init(page_count);
 
         if (use_packets) {
             p->packet_len = sizeof(MultiFDPacket_t)
diff --git a/migration/multifd.h b/migration/multifd.h
index 354150ff55..2f008217c3 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -24,6 +24,7 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
 void multifd_recv_sync_main(void);
 int multifd_send_sync_main(QEMUFile *f);
 int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset);
+int multifd_recv_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset);
 
 /* Multifd Compression flags */
 #define MULTIFD_FLAG_SYNC (1 << 0)
@@ -153,7 +154,11 @@ typedef struct {
     uint32_t page_size;
     /* number of pages in a full packet */
     uint32_t page_count;
+    /* multifd flags for receiving ram */
+    int read_flags;
 
+    /* sem where to wait for more work */
+    QemuSemaphore sem;
     /* syncs main thread and channels */
     QemuSemaphore sem_sync;
 
@@ -167,6 +172,13 @@ typedef struct {
     uint32_t flags;
     /* global number of generated multifd packets */
     uint64_t packet_num;
+    int pending_job;
+    /* array of pages to send.
+     * The owner of 'pages' depends on the 'pending_job' value:
+     * pending_job == 0 -> migration_thread can use it.
+     * pending_job != 0 -> multifd_channel can use it.
+     */
+    MultiFDPages_t *pages;
 
     /* thread local variables. No locking required */
 
-- 
2.35.3




* [RFC PATCH v1 20/26] io: Add a pwritev/preadv version that takes a discontiguous iovec
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (18 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 19/26] migration/multifd: Add pages to the receiving side Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 21/26] migration/ram: Add a wrapper for fixed-ram shadow bitmap Fabiano Rosas
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

For the upcoming fixed-ram migration support with multifd, we need
to be able to accept an iovec array with non-contiguous data.

Add pwritev and preadv versions that split the array into contiguous
segments before writing or reading. With that, the ram code can
continue to add pages in any order and the multifd code can continue
to send large arrays for reading and writing.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
Since iovs can be non-contiguous, we'd need a separate array on the
side to carry an extra file offset for each of them, so I'm relying on
the fact that the iovs are all within the same host page and passing
in an encoded offset that takes the host page into account.
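
As a side note, the slicing rule is easy to see in a standalone
example (illustrative code only, not the series' API; split_slices()
and the sample addresses are invented): adjacent iov elements are
grouped into one slice, and each slice is issued at the offset plus
the host address of its first element.

#include <stdio.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>

static void split_slices(const struct iovec *iov, size_t niov, off_t offset)
{
    size_t slice_idx = 0, slice_num = 1;

    for (size_t i = 0; i < niov; i++, slice_num++) {
        uintptr_t base = (uintptr_t)iov[i].iov_base;

        /* keep growing the slice while the next element is adjacent */
        if (i != niov - 1 &&
            base + iov[i].iov_len == (uintptr_t)iov[i + 1].iov_base) {
            continue;
        }

        printf("slice of %zu iov(s) at file offset 0x%llx\n", slice_num,
               (unsigned long long)(offset + (uintptr_t)iov[slice_idx].iov_base));

        slice_idx += slice_num;
        slice_num = 0;
    }
}

int main(void)
{
    static char buf[4 * 4096];
    struct iovec iov[3] = {
        { .iov_base = buf,            .iov_len = 4096 }, /* contiguous with */
        { .iov_base = buf + 4096,     .iov_len = 4096 }, /* the previous one */
        { .iov_base = buf + 3 * 4096, .iov_len = 4096 }, /* gap: new slice */
    };
    /* like write_base in the series: pages_offset - host, pages_offset=0x100000 */
    off_t write_base = 0x100000 - (off_t)(uintptr_t)buf;

    split_slices(iov, 3, write_base);   /* prints 0x100000 and 0x103000 */
    return 0;
}

Because each call gets a single file offset, this is what lets the
existing pwritev/preadv channel functions be reused for a sparse set
of pages.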
---
 include/io/channel.h | 50 +++++++++++++++++++++++++++
 io/channel.c         | 82 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 132 insertions(+)

diff --git a/include/io/channel.h b/include/io/channel.h
index 28bce7ef17..7c4d2432cf 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -33,8 +33,10 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass,
 #define QIO_CHANNEL_ERR_BLOCK -2
 
 #define QIO_CHANNEL_WRITE_FLAG_ZERO_COPY 0x1
+#define QIO_CHANNEL_WRITE_FLAG_WITH_OFFSET 0x2
 
 #define QIO_CHANNEL_READ_FLAG_MSG_PEEK 0x1
+#define QIO_CHANNEL_READ_FLAG_WITH_OFFSET 0x2
 
 typedef enum QIOChannelFeature QIOChannelFeature;
 
@@ -541,6 +543,30 @@ int qio_channel_close(QIOChannel *ioc,
 ssize_t qio_channel_pwritev_full(QIOChannel *ioc, const struct iovec *iov,
                                  size_t niov, off_t offset, Error **errp);
 
+/**
+ * qio_channel_write_full_all:
+ * @ioc: the channel object
+ * @iov: the array of memory regions to write data from
+ * @niov: the length of the @iov array
+ * @offset: the iovec offset in the file where to write the data
+ * @fds: an array of file handles to send
+ * @nfds: number of file handles in @fds
+ * @flags: write flags (QIO_CHANNEL_WRITE_FLAG_*)
+ * @errp: pointer to a NULL-initialized error object
+ *
+ *
+ * Selects between a writev or pwritev channel writer function.
+ *
+ * If QIO_CHANNEL_WRITE_FLAG_WITH_OFFSET is passed in flags, pwritev is
+ * used and @offset is expected to be a meaningful value, @fds and
+ * @nfds are ignored; otherwise uses writev and @offset is ignored.
+ *
+ * Returns: 0 if all bytes were written, or -1 on error
+ */
+int qio_channel_write_full_all(QIOChannel *ioc, const struct iovec *iov,
+                               size_t niov, off_t offset, int *fds, size_t nfds,
+                               int flags, Error **errp);
+
 /**
  * qio_channel_pwritev
  * @ioc: the channel object
@@ -577,6 +603,30 @@ ssize_t qio_channel_pwritev(QIOChannel *ioc, char *buf, size_t buflen,
 ssize_t qio_channel_preadv_full(QIOChannel *ioc, const struct iovec *iov,
                                 size_t niov, off_t offset, Error **errp);
 
+/**
+ * qio_channel_read_full_all:
+ * @ioc: the channel object
+ * @iov: the array of memory regions to read data to
+ * @niov: the length of the @iov array
+ * @offset: the iovec offset in the file from where to read the data
+ * @fds: an array of file handles to send
+ * @nfds: number of file handles in @fds
+ * @flags: read flags (QIO_CHANNEL_READ_FLAG_*)
+ * @errp: pointer to a NULL-initialized error object
+ *
+ *
+ * Selects between a readv or preadv channel reader function.
+ *
+ * If QIO_CHANNEL_READ_FLAG_WITH_OFFSET is passed in flags, preadv is
+ * used and @offset is expected to be a meaningful value, @fds and
+ * @nfds are ignored; otherwise uses readv and @offset is ignored.
+ *
+ * Returns: 0 if all bytes were read, or -1 on error
+ */
+int qio_channel_read_full_all(QIOChannel *ioc, const struct iovec *iov,
+                              size_t niov, off_t offset,
+                              int flags, Error **errp);
+
 /**
  * qio_channel_preadv
  * @ioc: the channel object
diff --git a/io/channel.c b/io/channel.c
index 312445b3aa..64b26040c2 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -463,6 +463,76 @@ ssize_t qio_channel_pwritev_full(QIOChannel *ioc, const struct iovec *iov,
     return klass->io_pwritev(ioc, iov, niov, offset, errp);
 }
 
+static int qio_channel_preadv_pwritev_contiguous(QIOChannel *ioc,
+                                                 const struct iovec *iov,
+                                                 size_t niov, off_t offset,
+                                                 bool is_write, Error **errp)
+{
+    ssize_t ret;
+    int i, slice_idx, slice_num;
+    uint64_t base, next, file_offset;
+    size_t len;
+
+    slice_idx = 0;
+    slice_num = 1;
+
+    /*
+     * If the iov array doesn't have contiguous elements, we need to
+     * split it in slices because we only have one (file) 'offset' for
+     * the whole iov. Do this here so callers don't need to break the
+     * iov array themselves.
+     */
+    for (i = 0; i < niov; i++, slice_num++) {
+        base = (uint64_t) iov[i].iov_base;
+
+        if (i != niov - 1) {
+            len = iov[i].iov_len;
+            next = (uint64_t) iov[i + 1].iov_base;
+
+            if (base + len == next) {
+                continue;
+            }
+        }
+
+        /*
+         * Use the offset of the first element of the segment that
+         * we're sending.
+         */
+        file_offset = offset + (uint64_t) iov[slice_idx].iov_base;
+
+        if (is_write) {
+            ret = qio_channel_pwritev_full(ioc, &iov[slice_idx], slice_num,
+                                           file_offset, errp);
+        } else {
+            ret = qio_channel_preadv_full(ioc, &iov[slice_idx], slice_num,
+                                          file_offset, errp);
+        }
+
+        if (ret < 0) {
+            break;
+        }
+
+        slice_idx += slice_num;
+        slice_num = 0;
+    }
+
+    return (ret < 0) ? -1 : 0;
+}
+
+int qio_channel_write_full_all(QIOChannel *ioc,
+                                const struct iovec *iov,
+                                size_t niov, off_t offset,
+                                int *fds, size_t nfds,
+                                int flags, Error **errp)
+{
+    if (flags & QIO_CHANNEL_WRITE_FLAG_WITH_OFFSET) {
+        return qio_channel_preadv_pwritev_contiguous(ioc, iov, niov,
+                                                     offset, true, errp);
+    }
+
+    return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, flags, errp);
+}
+
 ssize_t qio_channel_pwritev(QIOChannel *ioc, char *buf, size_t buflen,
                             off_t offset, Error **errp)
 {
@@ -492,6 +562,18 @@ ssize_t qio_channel_preadv_full(QIOChannel *ioc, const struct iovec *iov,
     return klass->io_preadv(ioc, iov, niov, offset, errp);
 }
 
+int qio_channel_read_full_all(QIOChannel *ioc, const struct iovec *iov,
+                              size_t niov, off_t offset,
+                              int flags, Error **errp)
+{
+    if (flags & QIO_CHANNEL_READ_FLAG_WITH_OFFSET) {
+        return qio_channel_preadv_pwritev_contiguous(ioc, iov, niov,
+                                                     offset, false, errp);
+    }
+
+    return qio_channel_readv_full_all(ioc, iov, niov, NULL, NULL, errp);
+}
+
 ssize_t qio_channel_preadv(QIOChannel *ioc, char *buf, size_t buflen,
                            off_t offset, Error **errp)
 {
-- 
2.35.3




* [RFC PATCH v1 21/26] migration/ram: Add a wrapper for fixed-ram shadow bitmap
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (19 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 20/26] io: Add a pwritev/preadv version that takes a discontiguous iovec Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 22/26] migration/multifd: Support outgoing fixed-ram stream format Fabiano Rosas
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

We'll need to set the shadow_bmap bits from outside ram.c soon, and
TARGET_PAGE_BITS is poisoned there, so add a wrapper for it.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/ram.c | 9 +++++++++
 migration/ram.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 1666ce6d5f..e9b28c16da 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3359,6 +3359,15 @@ static void ram_save_shadow_bmap(QEMUFile *f)
     }
 }
 
+void ramblock_set_shadow_bmap(RAMBlock *block, ram_addr_t offset, bool set)
+{
+    if (set) {
+        set_bit(offset >> TARGET_PAGE_BITS, block->shadow_bmap);
+    } else {
+        clear_bit(offset >> TARGET_PAGE_BITS, block->shadow_bmap);
+    }
+}
+
 /**
  * ram_save_iterate: iterative stage for migration
  *
diff --git a/migration/ram.h b/migration/ram.h
index 81cbb0947c..8d8258fee1 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -98,6 +98,7 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
 bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start);
 void postcopy_preempt_shutdown_file(MigrationState *s);
 void *postcopy_preempt_thread(void *opaque);
+void ramblock_set_shadow_bmap(RAMBlock *block, ram_addr_t offset, bool set);
 
 /* ram cache */
 int colo_init_ram_cache(void);
-- 
2.35.3




* [RFC PATCH v1 22/26] migration/multifd: Support outgoing fixed-ram stream format
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (20 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 21/26] migration/ram: Add a wrapper for fixed-ram shadow bitmap Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 23/26] migration/multifd: Support incoming " Fabiano Rosas
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

The new fixed-ram stream format uses a file transport and puts ram
pages in the migration file at their respective offsets. Writing can
be done in parallel by using the pwritev system call, which takes
iovecs and an offset.

Add support for enabling the new format along with multifd to make use
of the threading and page handling already in place.

This requires multifd to stop sending headers and leave the stream
format to the fixed-ram code. When it comes time to write the data, we
need to call a version of qio_channel_write that can take an offset.

Usage on HMP is:

(qemu) stop
(qemu) migrate_set_capability multifd on
(qemu) migrate_set_capability fixed-ram on
(qemu) migrate_set_parameter max-bandwidth 0
(qemu) migrate_set_parameter multifd-channels 8
(qemu) migrate file:migfile
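
For reference, a tiny standalone example (made-up addresses, not QEMU
code) of the offset arithmetic the write path uses: with
write_base = pages_offset - host, the file offset passed to pwritev
for any iov ends up being pages_offset plus the page's offset within
the ramblock.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t pages_offset = 0x100000;        /* ramblock payload in the file */
    uint64_t host         = 0x7f0000000000;  /* ramblock mapping in memory */
    uint64_t iov_base     = host + 5 * 4096; /* 6th guest page of the block */

    uint64_t write_base = pages_offset - host;

    /* prints "file offset = 0x105000" */
    printf("file offset = 0x%" PRIx64 "\n", write_base + iov_base);
    return 0;
}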

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c |  5 -----
 migration/multifd.c   | 51 +++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 258709aee1..77d24a5114 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1345,11 +1345,6 @@ static bool migrate_caps_check(bool *cap_list,
 #endif
 
     if (cap_list[MIGRATION_CAPABILITY_FIXED_RAM]) {
-        if (cap_list[MIGRATION_CAPABILITY_MULTIFD]) {
-            error_setg(errp, "Directly mapped memory incompatible with multifd");
-            return false;
-        }
-
         if (cap_list[MIGRATION_CAPABILITY_XBZRLE]) {
             error_setg(errp, "Directly mapped memory incompatible with xbzrle");
             return false;
diff --git a/migration/multifd.c b/migration/multifd.c
index 20ef665218..cc70b20ff7 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -256,6 +256,19 @@ static void multifd_pages_clear(MultiFDPages_t *pages)
     g_free(pages);
 }
 
+static void multifd_set_file_bitmap(MultiFDSendParams *p, bool set)
+{
+    MultiFDPages_t *pages = p->pages;
+
+    if (!pages->block) {
+        return;
+    }
+
+    for (int i = 0; i < p->normal_num; i++) {
+        ramblock_set_shadow_bmap(pages->block, pages->offset[i], set);
+    }
+}
+
 static void multifd_send_fill_packet(MultiFDSendParams *p)
 {
     MultiFDPacket_t *packet = p->packet;
@@ -608,6 +621,17 @@ int multifd_send_sync_main(QEMUFile *f)
         }
     }
 
+    if (!migrate_multifd_use_packets()) {
+        for (i = 0; i < migrate_multifd_channels(); i++) {
+            MultiFDSendParams *p = &multifd_send_state->params[i];
+
+            qemu_sem_post(&p->sem);
+            continue;
+        }
+
+        return 0;
+    }
+
     /*
      * When using zero-copy, it's necessary to flush the pages before any of
      * the pages can be sent again, so we'll make sure the new version of the
@@ -692,6 +716,8 @@ static void *multifd_send_thread(void *opaque)
 
         if (p->pending_job) {
             uint32_t flags;
+            uint64_t write_base;
+
             p->normal_num = 0;
 
             if (!use_packets || use_zero_copy_send) {
@@ -716,6 +742,16 @@ static void *multifd_send_thread(void *opaque)
             if (use_packets) {
                 multifd_send_fill_packet(p);
                 p->num_packets++;
+                write_base = 0;
+            } else {
+                multifd_set_file_bitmap(p, true);
+
+                /*
+                 * If we subtract the host address now, we don't need to
+                 * pass it into qio_channel_write_full_all() below.
+                 */
+                write_base = p->pages->block->pages_offset -
+                    (uint64_t)p->pages->block->host;
             }
 
             flags = p->flags;
@@ -741,8 +777,9 @@ static void *multifd_send_thread(void *opaque)
                 p->iov[0].iov_base = p->packet;
             }
 
-            ret = qio_channel_writev_full_all(p->c, p->iov, p->iovs_num, NULL,
-                                              0, p->write_flags, &local_err);
+            ret = qio_channel_write_full_all(p->c, p->iov, p->iovs_num,
+                                             write_base, NULL, 0,
+                                             p->write_flags, &local_err);
             if (ret != 0) {
                 break;
             }
@@ -758,6 +795,13 @@ static void *multifd_send_thread(void *opaque)
         } else if (p->quit) {
             qemu_mutex_unlock(&p->mutex);
             break;
+        } else if (!use_packets) {
+            /*
+             * When migrating to a file there's not need for a SYNC
+             * packet, the channels are ready right away.
+             */
+            qemu_sem_post(&multifd_send_state->channels_ready);
+            qemu_mutex_unlock(&p->mutex);
         } else {
             qemu_mutex_unlock(&p->mutex);
             /* sometimes there are spurious wakeups */
@@ -767,6 +811,7 @@ static void *multifd_send_thread(void *opaque)
 out:
     if (local_err) {
         trace_multifd_send_error(p->id);
+        multifd_set_file_bitmap(p, false);
         multifd_send_terminate_threads(local_err);
         error_free(local_err);
     }
@@ -981,6 +1026,8 @@ int multifd_save_setup(Error **errp)
 
         if (migrate_use_zero_copy_send()) {
             p->write_flags = QIO_CHANNEL_WRITE_FLAG_ZERO_COPY;
+        } else if (!use_packets) {
+            p->write_flags |= QIO_CHANNEL_WRITE_FLAG_WITH_OFFSET;
         } else {
             p->write_flags = 0;
         }
-- 
2.35.3




* [RFC PATCH v1 23/26] migration/multifd: Support incoming fixed-ram stream format
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (21 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 22/26] migration/multifd: Support outgoing fixed-ram stream format Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 24/26] tests/qtest: Add a multifd + fixed-ram migration test Fabiano Rosas
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

For the incoming fixed-ram migration we need to read the ramblock
headers, get the pages bitmap and send the host address of each
non-zero page to the multifd channel thread so the data can be read
from the file into it.

To read from the migration file we need a preadv function that can
read into the iovs in segments of contiguous pages because (as in the
writing case) the file offset applies to the entire iovec.
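
A rough standalone sketch of that flow (invented names; queue_page()
stands in for multifd_recv_queue_page() and PAGE_SHIFT for the target
page bits): walk the received bitmap and queue the offset of every
page that is present in the file.

#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

static void queue_page(uint64_t offset)
{
    /* the series hands this offset to one of the multifd channels */
    printf("queue page at ramblock offset 0x%" PRIx64 "\n", offset);
}

int main(void)
{
    /* pages 0 and 3, plus the first page of the second bitmap word */
    unsigned long bmap[2] = { (1UL << 0) | (1UL << 3), 1UL << 0 };
    uint64_t nr_pages = 2 * BITS_PER_LONG;

    for (uint64_t page = 0; page < nr_pages; page++) {
        if (bmap[page / BITS_PER_LONG] & (1UL << (page % BITS_PER_LONG))) {
            queue_page(page << PAGE_SHIFT);
        }
    }
    return 0;
}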

Usage on HMP is:

(qemu) migrate_set_capability multifd on
(qemu) migrate_set_capability fixed-ram on
(qemu) migrate_set_parameter max-bandwidth 0
(qemu) migrate_set_parameter multifd-channels 8
(qemu) migrate_incoming file:migfile
(qemu) info status
(qemu) c

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd.c | 26 ++++++++++++++++++++++++--
 migration/ram.c     |  9 +++++++--
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index cc70b20ff7..36b5aedb16 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -141,6 +141,7 @@ static void nocomp_recv_cleanup(MultiFDRecvParams *p)
 static int nocomp_recv_pages(MultiFDRecvParams *p, Error **errp)
 {
     uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK;
+    uint64_t read_base = 0;
 
     if (flags != MULTIFD_FLAG_NOCOMP) {
         error_setg(errp, "multifd %u: flags received %x flags expected %x",
@@ -151,7 +152,13 @@ static int nocomp_recv_pages(MultiFDRecvParams *p, Error **errp)
         p->iov[i].iov_base = p->host + p->normal[i];
         p->iov[i].iov_len = p->page_size;
     }
-    return qio_channel_readv_all(p->c, p->iov, p->normal_num, errp);
+
+    if (migrate_fixed_ram()) {
+        read_base = p->pages->block->pages_offset - (uint64_t) p->host;
+    }
+
+    return qio_channel_read_full_all(p->c, p->iov, p->normal_num, read_base,
+                                     p->read_flags, errp);
 }
 
 static MultiFDMethods multifd_nocomp_ops = {
@@ -1221,9 +1228,21 @@ void multifd_recv_sync_main(void)
 {
     int i;
 
-    if (!migrate_use_multifd() || !migrate_multifd_use_packets()) {
+    if (!migrate_use_multifd()) {
         return;
     }
+
+    if (!migrate_multifd_use_packets()) {
+        for (i = 0; i < migrate_multifd_channels(); i++) {
+            MultiFDRecvParams *p = &multifd_recv_state->params[i];
+
+            qemu_sem_post(&p->sem);
+            continue;
+        }
+
+        return;
+    }
+
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDRecvParams *p = &multifd_recv_state->params[i];
 
@@ -1256,6 +1275,7 @@ static void *multifd_recv_thread(void *opaque)
 
     while (true) {
         uint32_t flags;
+        p->normal_num = 0;
 
         if (p->quit) {
             break;
@@ -1377,6 +1397,8 @@ int multifd_load_setup(Error **errp)
             p->packet_len = sizeof(MultiFDPacket_t)
                 + sizeof(uint64_t) * page_count;
             p->packet = g_malloc0(p->packet_len);
+        } else {
+            p->read_flags |= QIO_CHANNEL_READ_FLAG_WITH_OFFSET;
         }
         p->name = g_strdup_printf("multifdrecv_%d", i);
         p->iov = g_new0(struct iovec, page_count);
diff --git a/migration/ram.c b/migration/ram.c
index e9b28c16da..180e8e0d94 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4427,8 +4427,13 @@ static void read_ramblock_fixed_ram(QEMUFile *f, RAMBlock *block,
             host = host_from_ram_block_offset(block, offset);
             read_len = MIN(len, TARGET_PAGE_SIZE);
 
-            read = qemu_get_buffer_at(f, host, read_len,
-                                      block->pages_offset + offset);
+            if (migrate_use_multifd()) {
+                multifd_recv_queue_page(f, block, offset);
+                read = read_len;
+            } else {
+                read = qemu_get_buffer_at(f, host, read_len,
+                                          block->pages_offset + offset);
+            }
             completed += read;
         }
     }
-- 
2.35.3




* [RFC PATCH v1 24/26] tests/qtest: Add a multifd + fixed-ram migration test
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (22 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 23/26] migration/multifd: Support incoming " Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 25/26] migration: Add direct-io parameter Fabiano Rosas
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Thomas Huth, Laurent Vivier, Paolo Bonzini

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 tests/qtest/migration-test.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 84b4c761ad..2e0911996d 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1564,6 +1564,31 @@ static void test_precopy_file_fixed_ram(void)
     test_precopy_common(&args);
 }
 
+static void *migrate_multifd_fixed_ram_start(QTestState *from, QTestState *to)
+{
+    migrate_fixed_ram_start(from, to);
+
+    migrate_set_parameter_int(from, "multifd-channels", 4);
+    migrate_set_parameter_int(to, "multifd-channels", 4);
+
+    migrate_set_capability(from, "multifd", true);
+    migrate_set_capability(to, "multifd", true);
+
+    return NULL;
+}
+
+static void test_multifd_file_fixed_ram(void)
+{
+    g_autofree char *uri = g_strdup_printf("file:%s/migfile", tmpfs);
+    MigrateCommon args = {
+        .connect_uri = uri,
+        .listen_uri = "defer",
+        .start_hook = migrate_multifd_fixed_ram_start,
+    };
+
+    test_precopy_common(&args);
+}
+
 static void test_precopy_tcp_plain(void)
 {
     MigrateCommon args = {
@@ -2560,6 +2585,8 @@ int main(int argc, char **argv)
                    test_precopy_file_stream_ram);
     qtest_add_func("/migration/precopy/file/fixed-ram",
                    test_precopy_file_fixed_ram);
+    qtest_add_func("/migration/multifd/file/fixed-ram",
+                   test_multifd_file_fixed_ram);
 
 #ifdef CONFIG_GNUTLS
     qtest_add_func("/migration/precopy/unix/tls/psk",
-- 
2.35.3




* [RFC PATCH v1 25/26] migration: Add direct-io parameter
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (23 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 24/26] tests/qtest: Add a multifd + fixed-ram migration test Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 18:03 ` [RFC PATCH v1 26/26] tests/migration/guestperf: Add file, fixed-ram and direct-io support Fabiano Rosas
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Eric Blake, Markus Armbruster

Add the direct-io migration parameter that tells the migration code to
use O_DIRECT when opening the migration stream file whenever possible.

This is currently only used for the secondary channels of fixed-ram
migration, which can guarantee that writes are page-aligned.

However, the parameter could be made to affect other types of
file-based migration in the future.
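
To make the constraint concrete, here is a small standalone
illustration (not QEMU code; the file name and values are arbitrary)
of what O_DIRECT requires and what the fixed-ram secondary channels
can provide: an aligned buffer, an aligned length and an aligned file
offset.

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);
    int flags = O_WRONLY | O_CREAT;
    void *buf;
    int fd;

#ifdef O_DIRECT
    flags |= O_DIRECT;   /* qemu_has_direct_io() in this series checks this */
#endif

    if (posix_memalign(&buf, page, page)) {
        return 1;
    }
    memset(buf, 0xaa, page);

    fd = open("migfile", flags, 0660);
    if (fd < 0) {
        perror("open");  /* e.g. filesystems that don't support O_DIRECT */
        return 1;
    }

    /* buffer, length and file offset are all page aligned */
    if (pwrite(fd, buf, page, 4 * page) != page) {
        perror("pwrite");
    }

    close(fd);
    free(buf);
    return 0;
}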

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 include/qemu/osdep.h           |  2 ++
 migration/file.c               | 13 +++++++++++--
 migration/migration-hmp-cmds.c |  9 +++++++++
 migration/migration.c          | 32 ++++++++++++++++++++++++++++++++
 migration/migration.h          |  1 +
 qapi/migration.json            | 17 ++++++++++++++---
 util/osdep.c                   |  9 +++++++++
 7 files changed, 78 insertions(+), 5 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 9eff0be95b..19c1d5f999 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -570,6 +570,8 @@ int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive);
 bool qemu_has_ofd_lock(void);
 #endif
 
+bool qemu_has_direct_io(void);
+
 #if defined(__HAIKU__) && defined(__i386__)
 #define FMT_pid "%ld"
 #elif defined(WIN64)
diff --git a/migration/file.c b/migration/file.c
index 6f40894488..1a40da097d 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -43,9 +43,18 @@ void file_send_channel_create(QIOTaskFunc f, void *data)
     QIOChannelFile *ioc;
     QIOTask *task;
     Error *errp = NULL;
+    int flags = outgoing_args.flags;
 
-    ioc = qio_channel_file_new_path(outgoing_args.fname,
-                                    outgoing_args.flags,
+    if (migrate_use_direct_io() && qemu_has_direct_io()) {
+        /*
+         * Enable O_DIRECT for the secondary channels. These are used
+         * for sending ram pages and writes should be guaranteed to be
+         * aligned to at least page size.
+         */
+        flags |= O_DIRECT;
+    }
+
+    ioc = qio_channel_file_new_path(outgoing_args.fname, flags,
                                     outgoing_args.mode, &errp);
     if (!ioc) {
         file_migration_cancel(errp);
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 72519ea99f..c9a8d84714 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -344,6 +344,11 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
                 }
             }
         }
+        if (params->has_direct_io) {
+            monitor_printf(mon, "%s: %s\n",
+                           MigrationParameter_str(MIGRATION_PARAMETER_DIRECT_IO),
+                           params->direct_io ? "on" : "off");
+        }
     }
 
     qapi_free_MigrationParameters(params);
@@ -600,6 +605,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
         error_setg(&err, "The block-bitmap-mapping parameter can only be set "
                    "through QMP");
         break;
+    case MIGRATION_PARAMETER_DIRECT_IO:
+        p->has_direct_io = true;
+        visit_type_bool(v, param, &p->direct_io, &err);
+        break;
     default:
         assert(0);
     }
diff --git a/migration/migration.c b/migration/migration.c
index 77d24a5114..65798171e4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1022,6 +1022,11 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
                        s->parameters.block_bitmap_mapping);
     }
 
+    if (s->parameters.has_direct_io) {
+        params->has_direct_io = true;
+        params->direct_io = s->parameters.direct_io;
+    }
+
     return params;
 }
 
@@ -1782,6 +1787,10 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
         dest->has_block_bitmap_mapping = true;
         dest->block_bitmap_mapping = params->block_bitmap_mapping;
     }
+
+    if (params->has_direct_io) {
+        dest->direct_io = params->direct_io;
+    }
 }
 
 static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
@@ -1904,6 +1913,10 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
             QAPI_CLONE(BitmapMigrationNodeAliasList,
                        params->block_bitmap_mapping);
     }
+
+    if (params->has_direct_io) {
+        s->parameters.direct_io = params->direct_io;
+    }
 }
 
 void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp)
@@ -2885,6 +2898,24 @@ bool migrate_postcopy_preempt(void)
     return s->enabled_capabilities[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT];
 }
 
+bool migrate_use_direct_io(void)
+{
+    MigrationState *s;
+
+    /* For now O_DIRECT is only supported with fixed-ram */
+    if (!migrate_fixed_ram()) {
+        return false;
+    }
+
+    s = migrate_get_current();
+
+    if (s->parameters.has_direct_io) {
+        return s->parameters.direct_io;
+    }
+
+    return false;
+}
+
 /* migration thread support */
 /*
  * Something bad happened to the RP stream, mark an error
@@ -4666,6 +4697,7 @@ static void migration_instance_init(Object *obj)
     params->has_announce_max = true;
     params->has_announce_rounds = true;
     params->has_announce_step = true;
+    params->has_direct_io = qemu_has_direct_io();
 
     qemu_sem_init(&ms->postcopy_pause_sem, 0);
     qemu_sem_init(&ms->postcopy_pause_rp_sem, 0);
diff --git a/migration/migration.h b/migration/migration.h
index 8459201958..e0c9c78570 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -422,6 +422,7 @@ bool migrate_ignore_shared(void);
 bool migrate_validate_uuid(void);
 int migrate_fixed_ram(void);
 bool migrate_multifd_use_packets(void);
+bool migrate_use_direct_io(void);
 bool migrate_auto_converge(void);
 bool migrate_use_multifd(void);
 bool migrate_pause_before_switchover(void);
diff --git a/qapi/migration.json b/qapi/migration.json
index 22eea58ce3..2190d98ded 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -776,6 +776,9 @@
 #                        block device name if there is one, and to their node name
 #                        otherwise. (Since 5.2)
 #
+# @direct-io: Open migration files with O_DIRECT when possible. Not
+#             all migration transports support this. (since 8.1)
+#
 # Features:
 # @unstable: Member @x-checkpoint-delay is experimental.
 #
@@ -796,7 +799,7 @@
            'xbzrle-cache-size', 'max-postcopy-bandwidth',
            'max-cpu-throttle', 'multifd-compression',
            'multifd-zlib-level' ,'multifd-zstd-level',
-           'block-bitmap-mapping' ] }
+           'block-bitmap-mapping', 'direct-io' ] }
 
 ##
 # @MigrateSetParameters:
@@ -941,6 +944,9 @@
 #                        block device name if there is one, and to their node name
 #                        otherwise. (Since 5.2)
 #
+# @direct-io: Open migration files with O_DIRECT when possible. Not
+#             all migration transports support this. (since 8.1)
+#
 # Features:
 # @unstable: Member @x-checkpoint-delay is experimental.
 #
@@ -976,7 +982,8 @@
             '*multifd-compression': 'MultiFDCompression',
             '*multifd-zlib-level': 'uint8',
             '*multifd-zstd-level': 'uint8',
-            '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } }
+            '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
+            '*direct-io': 'bool' } }
 
 ##
 # @migrate-set-parameters:
@@ -1141,6 +1148,9 @@
 #                        block device name if there is one, and to their node name
 #                        otherwise. (Since 5.2)
 #
+# @direct-io: Open migration files with O_DIRECT when possible. Not
+#             all migration transports support this. (since 8.1)
+#
 # Features:
 # @unstable: Member @x-checkpoint-delay is experimental.
 #
@@ -1174,7 +1184,8 @@
             '*multifd-compression': 'MultiFDCompression',
             '*multifd-zlib-level': 'uint8',
             '*multifd-zstd-level': 'uint8',
-            '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ] } }
+            '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
+            '*direct-io': 'bool' } }
 
 ##
 # @query-migrate-parameters:
diff --git a/util/osdep.c b/util/osdep.c
index e996c4744a..d0227a60ab 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -277,6 +277,15 @@ int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive)
 }
 #endif
 
+bool qemu_has_direct_io(void)
+{
+#ifdef O_DIRECT
+    return true;
+#else
+    return false;
+#endif
+}
+
 static int qemu_open_cloexec(const char *name, int flags, mode_t mode)
 {
     int ret;
-- 
2.35.3




* [RFC PATCH v1 26/26] tests/migration/guestperf: Add file, fixed-ram and direct-io support
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (24 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 25/26] migration: Add direct-io parameter Fabiano Rosas
@ 2023-03-30 18:03 ` Fabiano Rosas
  2023-03-30 21:41 ` [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Peter Xu
  2023-04-03  7:38 ` David Hildenbrand
  27 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-30 18:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

Add support for the new migration features:
- 'file' transport;
- 'fixed-ram' stream format capability;
- 'direct-io' parameter;

Usage:
$ ./guestperf.py --binary <path/to/qemu> --initrd <path/to/initrd-stress.img> \
                 --transport file --dst-file migfile --multifd --fixed-ram \
		 --multifd-channels 4 --output fixed-ram.json  --verbose

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 tests/migration/guestperf/engine.py   | 38 +++++++++++++++++++++++++--
 tests/migration/guestperf/scenario.py | 14 ++++++++--
 tests/migration/guestperf/shell.py    | 18 +++++++++++--
 3 files changed, 64 insertions(+), 6 deletions(-)

diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py
index e69d16a62c..a465336184 100644
--- a/tests/migration/guestperf/engine.py
+++ b/tests/migration/guestperf/engine.py
@@ -35,10 +35,11 @@
 class Engine(object):
 
     def __init__(self, binary, dst_host, kernel, initrd, transport="tcp",
-                 sleep=15, verbose=False, debug=False):
+                 sleep=15, verbose=False, debug=False, dst_file="/tmp/migfile"):
 
         self._binary = binary # Path to QEMU binary
         self._dst_host = dst_host # Hostname of target host
+        self._dst_file = dst_file # Path to file (for file transport)
         self._kernel = kernel # Path to kernel image
         self._initrd = initrd # Path to stress initrd
         self._transport = transport # 'unix' or 'tcp' or 'rdma'
@@ -203,6 +204,23 @@ def _migrate(self, hardware, scenario, src, dst, connect_uri):
             resp = dst.command("migrate-set-parameters",
                                multifd_channels=scenario._multifd_channels)
 
+        if scenario._fixed_ram:
+            resp = src.command("migrate-set-capabilities",
+                               capabilities = [
+                                   { "capability": "fixed-ram",
+                                     "state": True }
+                               ])
+            resp = dst.command("migrate-set-capabilities",
+                               capabilities = [
+                                   { "capability": "fixed-ram",
+                                     "state": True }
+                               ])
+
+        if scenario._direct_io:
+            resp = src.command("migrate-set-parameters",
+                               direct_io=scenario._direct_io)
+
+
         resp = src.command("migrate", uri=connect_uri)
 
         post_copy = False
@@ -233,6 +251,11 @@ def _migrate(self, hardware, scenario, src, dst, connect_uri):
                     progress_history.append(progress)
 
                 if progress._status == "completed":
+                    if connect_uri[0:5] == "file:":
+                        if self._verbose:
+                            print("Migrating incoming")
+                        dst.command("migrate-incoming", uri=connect_uri)
+
                     if self._verbose:
                         print("Sleeping %d seconds for final guest workload run" % self._sleep)
                     sleep_secs = self._sleep
@@ -357,7 +380,11 @@ def _get_dst_args(self, hardware, uri):
         if self._dst_host != "localhost":
             tunnelled = True
         argv = self._get_common_args(hardware, tunnelled)
-        return argv + ["-incoming", uri]
+
+        incoming = ["-incoming", uri]
+        if uri[0:5] == "file:":
+            incoming = ["-incoming", "defer"]
+        return argv + incoming
 
     @staticmethod
     def _get_common_wrapper(cpu_bind, mem_bind):
@@ -417,6 +444,10 @@ def run(self, hardware, scenario, result_dir=os.getcwd()):
                 os.remove(monaddr)
             except:
                 pass
+        elif self._transport == "file":
+            if self._dst_host != "localhost":
+                raise Exception("Use unix migration transport for non-local host")
+            uri = "file:%s" % self._dst_file
 
         if self._dst_host != "localhost":
             dstmonaddr = ("localhost", 9001)
@@ -453,6 +484,9 @@ def run(self, hardware, scenario, result_dir=os.getcwd()):
             if self._dst_host == "localhost" and os.path.exists(dstmonaddr):
                 os.remove(dstmonaddr)
 
+            if uri[0:5] == "file:" and os.path.exists(uri[5:]):
+                os.remove(uri[5:])
+
             if self._verbose:
                 print("Finished migration")
 
diff --git a/tests/migration/guestperf/scenario.py b/tests/migration/guestperf/scenario.py
index de70d9b2f5..29b6af41ac 100644
--- a/tests/migration/guestperf/scenario.py
+++ b/tests/migration/guestperf/scenario.py
@@ -30,7 +30,8 @@ def __init__(self, name,
                  auto_converge=False, auto_converge_step=10,
                  compression_mt=False, compression_mt_threads=1,
                  compression_xbzrle=False, compression_xbzrle_cache=10,
-                 multifd=False, multifd_channels=2):
+                 multifd=False, multifd_channels=2,
+                 fixed_ram=False, direct_io=False):
 
         self._name = name
 
@@ -60,6 +61,11 @@ def __init__(self, name,
         self._multifd = multifd
         self._multifd_channels = multifd_channels
 
+        self._fixed_ram = fixed_ram
+
+        self._direct_io = direct_io
+
+
     def serialize(self):
         return {
             "name": self._name,
@@ -79,6 +85,8 @@ def serialize(self):
             "compression_xbzrle_cache": self._compression_xbzrle_cache,
             "multifd": self._multifd,
             "multifd_channels": self._multifd_channels,
+            "fixed_ram": self._fixed_ram,
+            "direct_io": self._direct_io,
         }
 
     @classmethod
@@ -100,4 +108,6 @@ def deserialize(cls, data):
             data["compression_xbzrle"],
             data["compression_xbzrle_cache"],
             data["multifd"],
-            data["multifd_channels"])
+            data["multifd_channels"],
+            data["fixed_ram"],
+            data["direct_io"])
diff --git a/tests/migration/guestperf/shell.py b/tests/migration/guestperf/shell.py
index 8a809e3dda..0cb402adce 100644
--- a/tests/migration/guestperf/shell.py
+++ b/tests/migration/guestperf/shell.py
@@ -48,6 +48,7 @@ def __init__(self):
         parser.add_argument("--kernel", dest="kernel", default="/boot/vmlinuz-%s" % platform.release())
         parser.add_argument("--initrd", dest="initrd", default="tests/migration/initrd-stress.img")
         parser.add_argument("--transport", dest="transport", default="unix")
+        parser.add_argument("--dst-file", dest="dst_file")
 
 
         # Hardware args
@@ -71,7 +72,8 @@ def get_engine(self, args):
                       transport=args.transport,
                       sleep=args.sleep,
                       debug=args.debug,
-                      verbose=args.verbose)
+                      verbose=args.verbose,
+                      dst_file=args.dst_file)
 
     def get_hardware(self, args):
         def split_map(value):
@@ -127,6 +129,13 @@ def __init__(self):
         parser.add_argument("--multifd-channels", dest="multifd_channels",
                             default=2, type=int)
 
+        parser.add_argument("--fixed-ram", dest="fixed_ram", default=False,
+                            action="store_true")
+
+        parser.add_argument("--direct-io", dest="direct_io", default=False,
+                            action="store_true")
+
+
     def get_scenario(self, args):
         return Scenario(name="perfreport",
                         downtime=args.downtime,
@@ -150,7 +159,12 @@ def get_scenario(self, args):
                         compression_xbzrle_cache=args.compression_xbzrle_cache,
 
                         multifd=args.multifd,
-                        multifd_channels=args.multifd_channels)
+                        multifd_channels=args.multifd_channels,
+
+                        fixed_ram=args.fixed_ram,
+
+                        direct_io=args.direct_io)
+
 
     def run(self, argv):
         args = self._parser.parse_args(argv)
-- 
2.35.3




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (25 preceding siblings ...)
  2023-03-30 18:03 ` [RFC PATCH v1 26/26] tests/migration/guestperf: Add file, fixed-ram and direct-io support Fabiano Rosas
@ 2023-03-30 21:41 ` Peter Xu
  2023-03-31 14:37   ` Fabiano Rosas
  2023-04-03  7:38 ` David Hildenbrand
  27 siblings, 1 reply; 65+ messages in thread
From: Peter Xu @ 2023-03-30 21:41 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

On Thu, Mar 30, 2023 at 03:03:10PM -0300, Fabiano Rosas wrote:
> Hi folks,

Hi,

> 
> I'm continuing the work done last year to add a new format of
> migration stream that can be used to migrate large guests to a single
> file in a performant way.
> 
> This is an early RFC with the previous code + my additions to support
> multifd and direct IO. Let me know what you think!
> 
> Here are the reference links for previous discussions:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01813.html
> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg01338.html
> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg05536.html
> 
> The series has 4 main parts:
> 
> 1) File migration: A new "file:" migration URI. So "file:mig" does the
>    same as "exec:cat > mig". Patches 1-4 implement this;
> 
> 2) Fixed-ram format: A new format for the migration stream. Puts guest
>    pages at their relative offsets in the migration file. This saves
>    space on the worst case of RAM utilization because every page has a
>    fixed offset in the migration file and (potentially) saves us time
>    because we could write pages independently in parallel. It also
>    gives alignment guarantees so we could use O_DIRECT. Patches 5-13
>    implement this;
> 
> With patches 1-13 these two^ can be used with:
> 
> (qemu) migrate_set_capability fixed-ram on
> (qemu) migrate[_incoming] file:mig

Have you considered enabling the new fixed-ram format with postcopy when
loading?

Due to the linear offsetting of pages, I think it can achieve super fast VM
loads thanks to O(1) lookup of pages and local page fault resolution.

> 
> --> new in this series:
> 
> 3) MultiFD support: This is about making use of the parallelism
>    allowed by the new format. We just need the threading and page
>    queuing infrastructure that is already in place for
>    multifd. Patches 14-24 implement this;
> 
> (qemu) migrate_set_capability fixed-ram on
> (qemu) migrate_set_capability multifd on
> (qemu) migrate_set_parameter multifd-channels 4
> (qemu) migrate_set_parameter max-bandwith 0
> (qemu) migrate[_incoming] file:mig
> 
> 4) Add a new "direct_io" parameter and enable O_DIRECT for the
>    properly aligned segments of the migration (mostly ram). Patch 25.
> 
> (qemu) migrate_set_parameter direct-io on
> 
> Thanks! Some data below:
> =====
> 
> Outgoing migration to file. NVMe disk. XFS filesystem.
> 
> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
>   10m -v`:
> 
> migration type  | MB/s | pages/s |  ms
> ----------------+------+---------+------
> savevm io_uring |  434 |  102294 | 71473

So I assume this is the non-live migration scenario.  Could you explain
what io_uring means here?

> file:           | 3017 |  855862 | 10301
> fixed-ram       | 1982 |  330686 | 15637
> ----------------+------+---------+------
> fixed-ram + multifd + O_DIRECT
>          2 ch.  | 5565 | 1500882 |  5576
>          4 ch.  | 5735 | 1991549 |  5412
>          8 ch.  | 5650 | 1769650 |  5489
>         16 ch.  | 6071 | 1832407 |  5114
>         32 ch.  | 6147 | 1809588 |  5050
>         64 ch.  | 6344 | 1841728 |  4895
>        128 ch.  | 6120 | 1915669 |  5085
> ----------------+------+---------+------

Thanks,

-- 
Peter Xu




* Re: [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability
  2023-03-30 18:03 ` [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability Fabiano Rosas
@ 2023-03-30 22:01   ` Peter Xu
  2023-03-31  7:56     ` Daniel P. Berrangé
  2023-03-31 15:05     ` Fabiano Rosas
  2023-03-31  5:50   ` Markus Armbruster
  1 sibling, 2 replies; 65+ messages in thread
From: Peter Xu @ 2023-03-30 22:01 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov, Paolo Bonzini, David Hildenbrand,
	Philippe Mathieu-Daudé,
	Eric Blake, Markus Armbruster

On Thu, Mar 30, 2023 at 03:03:20PM -0300, Fabiano Rosas wrote:
> From: Nikolay Borisov <nborisov@suse.com>
> 
> Implement 'fixed-ram' feature. The core of the feature is to ensure that
> each ram page of the migration stream has a specific offset in the
> resulting migration stream. The reason why we'd want such behavior are
> two fold:
> 
>  - When doing a 'fixed-ram' migration the resulting file will have a
>    bounded size, since pages which are dirtied multiple times will
>    always go to a fixed location in the file, rather than constantly
>    being added to a sequential stream. This eliminates cases where a vm
>    with, say, 1G of ram can result in a migration file that's 10s of
>    GBs, provided that the workload constantly redirties memory.
> 
>  - It paves the way to implement DIO-enabled save/restore of the
>    migration stream as the pages are ensured to be written at aligned
>    offsets.
> 
> The feature requires changing the stream format. First, a bitmap is
> introduced which tracks which pages have been written (i.e are
> dirtied) during migration and subsequently it's being written in the
> resulting file, again at a fixed location for every ramblock. Zero
> pages are ignored as they'd be zero in the destination migration as
> well. With the changed format data would look like the following:
> 
> |name len|name|used_len|pc*|bitmap_size|pages_offset|bitmap|pages|

What happens with huge pages?  Would page size matter here?

I would assume it's fine if it uses a constant (small) page size, assuming
that matches the granularity at which qemu tracks dirty pages (which IIUC
is the host page size, not the guest's).

But I haven't given it any further thought yet; maybe it would be
worthwhile in all cases to record the page size here to be explicit, or the
meaning of the bitmap may not be clear (and then the bitmap_size will be a
field just for sanity checking too).

If postcopy might be an option, we'd want the page size to be the host page
size because then looking up the bitmap will be straightforward, deciding
whether we should copy over the page (UFFDIO_COPY) or fill it in with zeros
(UFFDIO_ZEROPAGE).

> 
> * pc - refers to the page_size/mr->addr members, so newly added members
> begin from "bitmap_size".

Could you elaborate more on what the pc is?

I also didn't see this *pc in the migration.rst update below.

> 
> This layout is initialized during ram_save_setup so instead of having a
> sequential stream of pages that follow the ramblock headers the dirty
> pages for a ramblock follow its header. Since all pages have a fixed
> location RAM_SAVE_FLAG_EOS is no longer generated on every migration
> iteration but there is effectively a single RAM_SAVE_FLAG_EOS right at
> the end.
> 
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  docs/devel/migration.rst | 36 +++++++++++++++
>  include/exec/ramblock.h  |  8 ++++
>  migration/migration.c    | 51 +++++++++++++++++++++-
>  migration/migration.h    |  1 +
>  migration/ram.c          | 94 +++++++++++++++++++++++++++++++++-------
>  migration/savevm.c       |  1 +
>  qapi/migration.json      |  2 +-
>  7 files changed, 176 insertions(+), 17 deletions(-)
> 
> diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
> index 1080211f8e..84112d7f3f 100644
> --- a/docs/devel/migration.rst
> +++ b/docs/devel/migration.rst
> @@ -568,6 +568,42 @@ Others (especially either older devices or system devices which for
>  some reason don't have a bus concept) make use of the ``instance id``
>  for otherwise identically named devices.
>  
> +Fixed-ram format
> +----------------
> +
> +When the ``fixed-ram`` capability is enabled, a slightly different
> +stream format is used for the RAM section. Instead of having a
> +sequential stream of pages that follow the RAMBlock headers, the dirty
> +pages for a RAMBlock follow its header. This ensures that each RAM
> +page has a fixed offset in the resulting migration stream.
> +
> +  - RAMBlock 1
> +
> +    - ID string length
> +    - ID string
> +    - Used size
> +    - Shadow bitmap size
> +    - Pages offset in migration stream*
> +
> +  - Shadow bitmap
> +  - Sequence of pages for RAMBlock 1 (* offset points here)
> +
> +  - RAMBlock 2
> +
> +    - ID string length
> +    - ID string
> +    - Used size
> +    - Shadow bitmap size
> +    - Pages offset in migration stream*
> +
> +  - Shadow bitmap
> +  - Sequence of pages for RAMBlock 2 (* offset points here)
> +
> +The ``fixed-ram`` capaility can be enabled in both source and
> +destination with:
> +
> +    ``migrate_set_capability fixed-ram on``
> +
>  Return path
>  -----------
>  
> diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
> index adc03df59c..4360c772c2 100644
> --- a/include/exec/ramblock.h
> +++ b/include/exec/ramblock.h
> @@ -43,6 +43,14 @@ struct RAMBlock {
>      size_t page_size;
>      /* dirty bitmap used during migration */
>      unsigned long *bmap;
> +    /* shadow dirty bitmap used when migrating to a file */
> +    unsigned long *shadow_bmap;
> +    /*
> +     * offset in the file pages belonging to this ramblock are saved,
> +     * used only during migration to a file.
> +     */
> +    off_t bitmap_offset;
> +    uint64_t pages_offset;
>      /* bitmap of already received pages in postcopy */
>      unsigned long *receivedmap;
>  
> diff --git a/migration/migration.c b/migration/migration.c
> index 177fb0de0f..29630523e2 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -168,7 +168,8 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot,
>      MIGRATION_CAPABILITY_XBZRLE,
>      MIGRATION_CAPABILITY_X_COLO,
>      MIGRATION_CAPABILITY_VALIDATE_UUID,
> -    MIGRATION_CAPABILITY_ZERO_COPY_SEND);
> +    MIGRATION_CAPABILITY_ZERO_COPY_SEND,
> +    MIGRATION_CAPABILITY_FIXED_RAM);
>  
>  /* When we add fault tolerance, we could have several
>     migrations at once.  For now we don't need to add
> @@ -1341,6 +1342,28 @@ static bool migrate_caps_check(bool *cap_list,
>      }
>  #endif
>  
> +    if (cap_list[MIGRATION_CAPABILITY_FIXED_RAM]) {
> +        if (cap_list[MIGRATION_CAPABILITY_MULTIFD]) {
> +            error_setg(errp, "Directly mapped memory incompatible with multifd");
> +            return false;
> +        }
> +
> +        if (cap_list[MIGRATION_CAPABILITY_XBZRLE]) {
> +            error_setg(errp, "Directly mapped memory incompatible with xbzrle");
> +            return false;
> +        }
> +
> +        if (cap_list[MIGRATION_CAPABILITY_COMPRESS]) {
> +            error_setg(errp, "Directly mapped memory incompatible with compression");
> +            return false;
> +        }
> +
> +        if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
> +            error_setg(errp, "Directly mapped memory incompatible with postcopy ram");
> +            return false;
> +        }
> +    }
> +
>      if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) {
>          /* This check is reasonably expensive, so only when it's being
>           * set the first time, also it's only the destination that needs
> @@ -2736,6 +2759,11 @@ MultiFDCompression migrate_multifd_compression(void)
>      return s->parameters.multifd_compression;
>  }
>  
> +int migrate_fixed_ram(void)
> +{
> +    return migrate_get_current()->enabled_capabilities[MIGRATION_CAPABILITY_FIXED_RAM];
> +}
> +
>  int migrate_multifd_zlib_level(void)
>  {
>      MigrationState *s;
> @@ -4324,6 +4352,20 @@ fail:
>      return NULL;
>  }
>  
> +static int migrate_check_fixed_ram(MigrationState *s, Error **errp)
> +{
> +    if (!s->enabled_capabilities[MIGRATION_CAPABILITY_FIXED_RAM]) {
> +        return 0;
> +    }
> +
> +    if (!qemu_file_is_seekable(s->to_dst_file)) {
> +        error_setg(errp, "Directly mapped memory requires a seekable transport");
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
>  void migrate_fd_connect(MigrationState *s, Error *error_in)
>  {
>      Error *local_err = NULL;
> @@ -4390,6 +4432,12 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>          }
>      }
>  
> +    if (migrate_check_fixed_ram(s, &local_err) < 0) {

This check might be too late afaict; the QMP cmd "migrate" could have already
succeeded.

Can we do an early check in / close to qmp_migrate()?  The idea is to fail
right at the QMP migrate command.
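
Something along these lines is what I have in mind -- a rough sketch only,
not from the patch, names approximate, assuming the usual migration.c
helpers (strstart, error_setg):

    /*
     * Sketch only: an early validation helper that qmp_migrate() could
     * call before any setup work, so the QMP command itself fails when
     * fixed-ram cannot work with the given URI.
     */
    static bool migrate_fixed_ram_check_uri(MigrationState *s,
                                            const char *uri, Error **errp)
    {
        if (!s->enabled_capabilities[MIGRATION_CAPABILITY_FIXED_RAM]) {
            return true;
        }
        /* Only the file: transport gives us a seekable QEMUFile. */
        if (!strstart(uri, "file:", NULL)) {
            error_setg(errp, "fixed-ram requires a seekable transport");
            return false;
        }
        return true;
    }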

> +        migrate_fd_cleanup(s);
> +        migrate_fd_error(s, local_err);
> +        return;
> +    }
> +
>      if (resume) {
>          /* Wakeup the main migration thread to do the recovery */
>          migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_PAUSED,
> @@ -4519,6 +4567,7 @@ static Property migration_properties[] = {
>      DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
>  
>      /* Migration capabilities */
> +    DEFINE_PROP_MIG_CAP("x-fixed-ram", MIGRATION_CAPABILITY_FIXED_RAM),
>      DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
>      DEFINE_PROP_MIG_CAP("x-rdma-pin-all", MIGRATION_CAPABILITY_RDMA_PIN_ALL),
>      DEFINE_PROP_MIG_CAP("x-auto-converge", MIGRATION_CAPABILITY_AUTO_CONVERGE),
> diff --git a/migration/migration.h b/migration/migration.h
> index 2da2f8a164..8cf3caecfe 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -416,6 +416,7 @@ bool migrate_zero_blocks(void);
>  bool migrate_dirty_bitmaps(void);
>  bool migrate_ignore_shared(void);
>  bool migrate_validate_uuid(void);
> +int migrate_fixed_ram(void);
>  
>  bool migrate_auto_converge(void);
>  bool migrate_use_multifd(void);
> diff --git a/migration/ram.c b/migration/ram.c
> index 96e8a19a58..56f0f782c8 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1310,9 +1310,14 @@ static int save_zero_page_to_file(PageSearchStatus *pss,
>      int len = 0;
>  
>      if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
> -        len += save_page_header(pss, block, offset | RAM_SAVE_FLAG_ZERO);
> -        qemu_put_byte(file, 0);
> -        len += 1;
> +        if (migrate_fixed_ram()) {
> +            /* for zero pages we don't need to do anything */
> +            len = 1;

I think you wanted to increase the "duplicated" counter, but this will also
increase ram-transferred, even if only by 1 byte.

Perhaps just pass a pointer to report the bytes, and return true/false for
whether to increase the counter (to keep everything accurate)?
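
As a rough illustration of what I mean (signature and surrounding details
approximate, not from the patch):

    /*
     * Sketch only: report the bytes written via a pointer and return
     * whether the page was a zero page, so the callers can bump the
     * "duplicated" counter without inflating ram-transferred when
     * fixed-ram writes nothing to the stream.
     */
    static bool save_zero_page_to_file(PageSearchStatus *pss, RAMBlock *block,
                                       ram_addr_t offset, size_t *bytes)
    {
        uint8_t *p = block->host + offset;

        *bytes = 0;
        if (!buffer_is_zero(p, TARGET_PAGE_SIZE)) {
            return false;
        }
        if (!migrate_fixed_ram()) {
            *bytes = save_page_header(pss, block, offset | RAM_SAVE_FLAG_ZERO);
            qemu_put_byte(pss->pss_channel, 0);
            *bytes += 1;
        }
        ram_release_page(block->idstr, offset);
        return true;
    }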

> +        } else {
> +            len += save_page_header(pss, block, offset | RAM_SAVE_FLAG_ZERO);
> +            qemu_put_byte(file, 0);
> +            len += 1;
> +        }
>          ram_release_page(block->idstr, offset);
>      }
>      return len;
> @@ -1394,14 +1399,20 @@ static int save_normal_page(PageSearchStatus *pss, RAMBlock *block,
>  {
>      QEMUFile *file = pss->pss_channel;
>  
> -    ram_transferred_add(save_page_header(pss, block,
> -                                         offset | RAM_SAVE_FLAG_PAGE));
> -    if (async) {
> -        qemu_put_buffer_async(file, buf, TARGET_PAGE_SIZE,
> -                              migrate_release_ram() &&
> -                              migration_in_postcopy());
> +    if (migrate_fixed_ram()) {
> +        qemu_put_buffer_at(file, buf, TARGET_PAGE_SIZE,
> +                           block->pages_offset + offset);
> +        set_bit(offset >> TARGET_PAGE_BITS, block->shadow_bmap);
>      } else {
> -        qemu_put_buffer(file, buf, TARGET_PAGE_SIZE);
> +        ram_transferred_add(save_page_header(pss, block,
> +                                             offset | RAM_SAVE_FLAG_PAGE));
> +        if (async) {
> +            qemu_put_buffer_async(file, buf, TARGET_PAGE_SIZE,
> +                                  migrate_release_ram() &&
> +                                  migration_in_postcopy());
> +        } else {
> +            qemu_put_buffer(file, buf, TARGET_PAGE_SIZE);
> +        }
>      }
>      ram_transferred_add(TARGET_PAGE_SIZE);
>      stat64_add(&ram_atomic_counters.normal, 1);
> @@ -2731,6 +2742,8 @@ static void ram_save_cleanup(void *opaque)
>          block->clear_bmap = NULL;
>          g_free(block->bmap);
>          block->bmap = NULL;
> +        g_free(block->shadow_bmap);
> +        block->shadow_bmap = NULL;
>      }
>  
>      xbzrle_cleanup();
> @@ -3098,6 +3111,7 @@ static void ram_list_init_bitmaps(void)
>               */
>              block->bmap = bitmap_new(pages);
>              bitmap_set(block->bmap, 0, pages);
> +            block->shadow_bmap = bitmap_new(block->used_length >> TARGET_PAGE_BITS);
>              block->clear_bmap_shift = shift;
>              block->clear_bmap = bitmap_new(clear_bmap_size(pages, shift));
>          }
> @@ -3287,6 +3301,33 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>              if (migrate_ignore_shared()) {
>                  qemu_put_be64(f, block->mr->addr);
>              }
> +
> +            if (migrate_fixed_ram()) {
> +                long num_pages = block->used_length >> TARGET_PAGE_BITS;
> +                long bitmap_size = BITS_TO_LONGS(num_pages) * sizeof(unsigned long);
> +
> +                /* Needed for external programs (think analyze-migration.py) */
> +                qemu_put_be32(f, bitmap_size);
> +
> +                /*
> +                 * The bitmap starts after pages_offset, so add 8 to
> +                 * account for the pages_offset size.
> +                 */
> +                block->bitmap_offset = qemu_get_offset(f) + 8;
> +
> +                /*
> +                 * Make pages_offset aligned to 1 MiB to account for
> +                 * migration file movement between filesystems with
> +                 * possibly different alignment restrictions when
> +                 * using O_DIRECT.
> +                 */
> +                block->pages_offset = ROUND_UP(block->bitmap_offset +
> +                                               bitmap_size, 0x100000);
> +                qemu_put_be64(f, block->pages_offset);
> +
> +                /* Now prepare offset for next ramblock */
> +                qemu_set_offset(f, block->pages_offset + block->used_length, SEEK_SET);
> +            }
>          }
>      }
>  
> @@ -3306,6 +3347,18 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      return 0;
>  }
>  
> +static void ram_save_shadow_bmap(QEMUFile *f)
> +{
> +    RAMBlock *block;
> +
> +    RAMBLOCK_FOREACH_MIGRATABLE(block) {
> +        long num_pages = block->used_length >> TARGET_PAGE_BITS;
> +        long bitmap_size = BITS_TO_LONGS(num_pages) * sizeof(unsigned long);
> +        qemu_put_buffer_at(f, (uint8_t *)block->shadow_bmap, bitmap_size,
> +                           block->bitmap_offset);
> +    }
> +}
> +
>  /**
>   * ram_save_iterate: iterative stage for migration
>   *
> @@ -3413,9 +3466,15 @@ out:
>              return ret;
>          }
>  
> -        qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> -        qemu_fflush(f);
> -        ram_transferred_add(8);
> +        /*
> +         * For fixed ram we don't want to pollute the migration stream with
> +         * EOS flags.
> +         */
> +        if (!migrate_fixed_ram()) {
> +            qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> +            qemu_fflush(f);
> +            ram_transferred_add(8);
> +        }
>  
>          ret = qemu_file_get_error(f);
>      }
> @@ -3461,6 +3520,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>              pages = ram_find_and_save_block(rs);
>              /* no more blocks to sent */
>              if (pages == 0) {
> +                if (migrate_fixed_ram()) {
> +                    ram_save_shadow_bmap(f);
> +                }
>                  break;
>              }
>              if (pages < 0) {
> @@ -3483,8 +3545,10 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>          return ret;
>      }
>  
> -    qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> -    qemu_fflush(f);
> +    if (!migrate_fixed_ram()) {
> +        qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> +        qemu_fflush(f);
> +    }
>  
>      return 0;
>  }
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 92102c1fe5..1f1bc19224 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -241,6 +241,7 @@ static bool should_validate_capability(int capability)
>      /* Validate only new capabilities to keep compatibility. */
>      switch (capability) {
>      case MIGRATION_CAPABILITY_X_IGNORE_SHARED:
> +    case MIGRATION_CAPABILITY_FIXED_RAM:
>          return true;
>      default:
>          return false;
> diff --git a/qapi/migration.json b/qapi/migration.json
> index c84fa10e86..22eea58ce3 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -485,7 +485,7 @@
>  ##
>  { 'enum': 'MigrationCapability',
>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
> -           'compress', 'events', 'postcopy-ram',
> +           'compress', 'events', 'postcopy-ram', 'fixed-ram',
>             { 'name': 'x-colo', 'features': [ 'unstable' ] },
>             'release-ram',
>             'block', 'return-path', 'pause-before-switchover', 'multifd',
> -- 
> 2.35.3
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability
  2023-03-30 18:03 ` [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability Fabiano Rosas
  2023-03-30 22:01   ` Peter Xu
@ 2023-03-31  5:50   ` Markus Armbruster
  1 sibling, 0 replies; 65+ messages in thread
From: Markus Armbruster @ 2023-03-31  5:50 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov, Paolo Bonzini, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Eric Blake

Fabiano Rosas <farosas@suse.de> writes:

> From: Nikolay Borisov <nborisov@suse.com>
>
> Implement 'fixed-ram' feature. The core of the feature is to ensure that
> each ram page of the migration stream has a specific offset in the
> resulting migration stream. The reason why we'd want such behavior are
> two fold:
>
>  - When doing a 'fixed-ram' migration the resulting file will have a
>    bounded size, since pages which are dirtied multiple times will
>    always go to a fixed location in the file, rather than constantly
>    being added to a sequential stream. This eliminates cases where a vm
>    with, say, 1G of ram can result in a migration file that's 10s of
>    GBs, provided that the workload constantly redirties memory.
>
>  - It paves the way to implement DIO-enabled save/restore of the
>    migration stream as the pages are ensured to be written at aligned
>    offsets.
>
> The feature requires changing the stream format. First, a bitmap is
> introduced which tracks which pages have been written (i.e are
> dirtied) during migration and subsequently it's being written in the
> resulting file, again at a fixed location for every ramblock. Zero
> pages are ignored as they'd be zero in the destination migration as
> well. With the changed format data would look like the following:
>
> |name len|name|used_len|pc*|bitmap_size|pages_offset|bitmap|pages|
>
> * pc - refers to the page_size/mr->addr members, so newly added members
> begin from "bitmap_size".
>
> This layout is initialized during ram_save_setup so instead of having a
> sequential stream of pages that follow the ramblock headers the dirty
> pages for a ramblock follow its header. Since all pages have a fixed
> location RAM_SAVE_FLAG_EOS is no longer generated on every migration
> iteration but there is effectively a single RAM_SAVE_FLAG_EOS right at
> the end.
>
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

[...]

> diff --git a/qapi/migration.json b/qapi/migration.json
> index c84fa10e86..22eea58ce3 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -485,7 +485,7 @@
>  ##
>  { 'enum': 'MigrationCapability',
>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
> -           'compress', 'events', 'postcopy-ram',
> +           'compress', 'events', 'postcopy-ram', 'fixed-ram',
>             { 'name': 'x-colo', 'features': [ 'unstable' ] },
>             'release-ram',
>             'block', 'return-path', 'pause-before-switchover', 'multifd',

Doc comment update is missing.



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability
  2023-03-30 22:01   ` Peter Xu
@ 2023-03-31  7:56     ` Daniel P. Berrangé
  2023-03-31 14:39       ` Peter Xu
  2023-03-31 15:05     ` Fabiano Rosas
  1 sibling, 1 reply; 65+ messages in thread
From: Daniel P. Berrangé @ 2023-03-31  7:56 UTC (permalink / raw)
  To: Peter Xu
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela, Nikolay Borisov, Paolo Bonzini,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Eric Blake, Markus Armbruster

On Thu, Mar 30, 2023 at 06:01:51PM -0400, Peter Xu wrote:
> On Thu, Mar 30, 2023 at 03:03:20PM -0300, Fabiano Rosas wrote:
> > From: Nikolay Borisov <nborisov@suse.com>
> > 
> > Implement 'fixed-ram' feature. The core of the feature is to ensure that
> > each ram page of the migration stream has a specific offset in the
> > resulting migration stream. The reason why we'd want such behavior are
> > two fold:
> > 
> >  - When doing a 'fixed-ram' migration the resulting file will have a
> >    bounded size, since pages which are dirtied multiple times will
> >    always go to a fixed location in the file, rather than constantly
> >    being added to a sequential stream. This eliminates cases where a vm
> >    with, say, 1G of ram can result in a migration file that's 10s of
> >    GBs, provided that the workload constantly redirties memory.
> > 
> >  - It paves the way to implement DIO-enabled save/restore of the
> >    migration stream as the pages are ensured to be written at aligned
> >    offsets.
> > 
> > The feature requires changing the stream format. First, a bitmap is
> > introduced which tracks which pages have been written (i.e are
> > dirtied) during migration and subsequently it's being written in the
> > resulting file, again at a fixed location for every ramblock. Zero
> > pages are ignored as they'd be zero in the destination migration as
> > well. With the changed format data would look like the following:
> > 
> > |name len|name|used_len|pc*|bitmap_size|pages_offset|bitmap|pages|
> 
> What happens with huge pages?  Would page size matter here?
> 
> I would assume it's fine it uses a constant (small) page size, assuming
> that should match with the granule that qemu tracks dirty (which IIUC is
> the host page size not guest's).
> 
> But I didn't yet pay any further thoughts on that, maybe it would be
> worthwhile in all cases to record page sizes here to be explicit or the
> meaning of bitmap may not be clear (and then the bitmap_size will be a
> field just for sanity check too).

I think recording the page sizes is an anti-feature in this case.

The migration format / state needs to reflect the guest ABI, but
we need to be free to have a different backend config behind that
on either side of the save/restore.

IOW, if I start a QEMU with 2 GB of RAM, I should be free to use
small pages initially and after restore use 2 x 1 GB hugepages,
or vice versa.

The important thing with the pages that are saved into the file
is that they are a 1:1 mapping of guest RAM regions to file offsets.
IOW, the 2 GB of guest RAM is always a contiguous 2 GB region
in the file.

If the src VM used 1 GB pages, we would be writing a full 2 GB
of data assuming both pages were dirty.

If the src VM used 4k pages, we would be writing some subset of
the 2 GB of data, and the rest would be unwritten.

Either way, when reading back the data we restore it into either
1 GB pages or 4k pages, because any places that were unwritten
originally will read back as zeros.
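
Put another way (just an illustration, mirroring the offsets this patch
records in the RAMBlock header), the file position of a guest page is a
pure function of its offset within the RAMBlock, independent of the
backing page size:

    /*
     * Illustration only: where a dirty page lands in the migration file
     * under fixed-ram.  pages_offset is the 1 MiB-aligned start that
     * ram_save_setup() writes into the RAMBlock header; the dirty bit
     * is always tracked at TARGET_PAGE_SIZE granularity.
     */
    static uint64_t fixed_ram_file_offset(const RAMBlock *block,
                                          ram_addr_t offset_in_block)
    {
        return block->pages_offset + offset_in_block;
    }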

> If postcopy might be an option, we'd want the page size to be the host page
> size because then looking up the bitmap will be straightforward, deciding
> whether we should copy over page (UFFDIO_COPY) or fill in with zeros
> (UFFDIO_ZEROPAGE).

This format is only intended for the case where we are migrating to
a random-access medium, aka a file, because the fixed RAM mappings
to disk mean that we need to seek back to the original location to
re-write pages that get dirtied. It isn't suitable for a live
migration stream, and thus postcopy is inherently out of scope.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-30 21:41 ` [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Peter Xu
@ 2023-03-31 14:37   ` Fabiano Rosas
  2023-03-31 14:52     ` Peter Xu
  0 siblings, 1 reply; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-31 14:37 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

Peter Xu <peterx@redhat.com> writes:

> On Thu, Mar 30, 2023 at 03:03:10PM -0300, Fabiano Rosas wrote:
>> Hi folks,
>
> Hi,
>
>> 
>> I'm continuing the work done last year to add a new format of
>> migration stream that can be used to migrate large guests to a single
>> file in a performant way.
>> 
>> This is an early RFC with the previous code + my additions to support
>> multifd and direct IO. Let me know what you think!
>> 
>> Here are the reference links for previous discussions:
>> 
>> https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01813.html
>> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg01338.html
>> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg05536.html
>> 
>> The series has 4 main parts:
>> 
>> 1) File migration: A new "file:" migration URI. So "file:mig" does the
>>    same as "exec:cat > mig". Patches 1-4 implement this;
>> 
>> 2) Fixed-ram format: A new format for the migration stream. Puts guest
>>    pages at their relative offsets in the migration file. This saves
>>    space on the worst case of RAM utilization because every page has a
>>    fixed offset in the migration file and (potentially) saves us time
>>    because we could write pages independently in parallel. It also
>>    gives alignment guarantees so we could use O_DIRECT. Patches 5-13
>>    implement this;
>> 
>> With patches 1-13 these two^ can be used with:
>> 
>> (qemu) migrate_set_capability fixed-ram on
>> (qemu) migrate[_incoming] file:mig
>
> Have you considered enabling the new fixed-ram format with postcopy when
> loading?
>
> Due to the linear offseting of pages, I think it can achieve super fast vm
> loads due to O(1) lookup of pages and local page fault resolutions.
>

I don't think we have looked that much at the loading side yet. Good to
know that it has potential to be faster. I'll look into it. Thanks for
the suggestion.

>> 
>> --> new in this series:
>> 
>> 3) MultiFD support: This is about making use of the parallelism
>>    allowed by the new format. We just need the threading and page
>>    queuing infrastructure that is already in place for
>>    multifd. Patches 14-24 implement this;
>> 
>> (qemu) migrate_set_capability fixed-ram on
>> (qemu) migrate_set_capability multifd on
>> (qemu) migrate_set_parameter multifd-channels 4
>> (qemu) migrate_set_parameter max-bandwith 0
>> (qemu) migrate[_incoming] file:mig
>> 
>> 4) Add a new "direct_io" parameter and enable O_DIRECT for the
>>    properly aligned segments of the migration (mostly ram). Patch 25.
>> 
>> (qemu) migrate_set_parameter direct-io on
>> 
>> Thanks! Some data below:
>> =====
>> 
>> Outgoing migration to file. NVMe disk. XFS filesystem.
>> 
>> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
>>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
>>   10m -v`:
>> 
>> migration type  | MB/s | pages/s |  ms
>> ----------------+------+---------+------
>> savevm io_uring |  434 |  102294 | 71473
>
> So I assume this is the non-live migration scenario.  Could you explain
what io_uring means here?
>

This table is all non-live migration. This particular line is a snapshot
(hmp_savevm->save_snapshot). I thought it could be relevant because it
is another way by which we write RAM into disk.

The io_uring is noise; I was initially under the impression that the
block device aio configuration affected this scenario.

>> file:           | 3017 |  855862 | 10301
>> fixed-ram       | 1982 |  330686 | 15637
>> ----------------+------+---------+------
>> fixed-ram + multifd + O_DIRECT
>>          2 ch.  | 5565 | 1500882 |  5576
>>          4 ch.  | 5735 | 1991549 |  5412
>>          8 ch.  | 5650 | 1769650 |  5489
>>         16 ch.  | 6071 | 1832407 |  5114
>>         32 ch.  | 6147 | 1809588 |  5050
>>         64 ch.  | 6344 | 1841728 |  4895
>>        128 ch.  | 6120 | 1915669 |  5085
>> ----------------+------+---------+------
>
> Thanks,


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability
  2023-03-31  7:56     ` Daniel P. Berrangé
@ 2023-03-31 14:39       ` Peter Xu
  2023-03-31 15:34         ` Daniel P. Berrangé
  0 siblings, 1 reply; 65+ messages in thread
From: Peter Xu @ 2023-03-31 14:39 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela, Nikolay Borisov, Paolo Bonzini,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Eric Blake, Markus Armbruster

On Fri, Mar 31, 2023 at 08:56:01AM +0100, Daniel P. Berrangé wrote:
> On Thu, Mar 30, 2023 at 06:01:51PM -0400, Peter Xu wrote:
> > On Thu, Mar 30, 2023 at 03:03:20PM -0300, Fabiano Rosas wrote:
> > > From: Nikolay Borisov <nborisov@suse.com>
> > > 
> > > Implement 'fixed-ram' feature. The core of the feature is to ensure that
> > > each ram page of the migration stream has a specific offset in the
> > > resulting migration stream. The reason why we'd want such behavior are
> > > two fold:
> > > 
> > >  - When doing a 'fixed-ram' migration the resulting file will have a
> > >    bounded size, since pages which are dirtied multiple times will
> > >    always go to a fixed location in the file, rather than constantly
> > >    being added to a sequential stream. This eliminates cases where a vm
> > >    with, say, 1G of ram can result in a migration file that's 10s of
> > >    GBs, provided that the workload constantly redirties memory.
> > > 
> > >  - It paves the way to implement DIO-enabled save/restore of the
> > >    migration stream as the pages are ensured to be written at aligned
> > >    offsets.
> > > 
> > > The feature requires changing the stream format. First, a bitmap is
> > > introduced which tracks which pages have been written (i.e are
> > > dirtied) during migration and subsequently it's being written in the
> > > resulting file, again at a fixed location for every ramblock. Zero
> > > pages are ignored as they'd be zero in the destination migration as
> > > well. With the changed format data would look like the following:
> > > 
> > > |name len|name|used_len|pc*|bitmap_size|pages_offset|bitmap|pages|
> > 
> > What happens with huge pages?  Would page size matter here?
> > 
> > I would assume it's fine it uses a constant (small) page size, assuming
> > that should match with the granule that qemu tracks dirty (which IIUC is
> > the host page size not guest's).
> > 
> > But I didn't yet pay any further thoughts on that, maybe it would be
> > worthwhile in all cases to record page sizes here to be explicit or the
> > meaning of bitmap may not be clear (and then the bitmap_size will be a
> > field just for sanity check too).
> 
> I think recording the page sizes is an anti-feature in this case.
> 
> The migration format / state needs to reflect the guest ABI, but
> we need to be free to have different backend config behind that
> either side of the save/restore.
> 
> IOW, if I start a QEMU with 2 GB of RAM, I should be free to use
> small pages initially and after restore use 2 x 1 GB hugepages,
> or vica-verca.
> 
> The important thing with the pages that are saved into the file
> is that they are a 1:1 mapping guest RAM regions to file offsets.
> IOW, the 2 GB of guest RAM is always a contiguous 2 GB region
> in the file.
> 
> If the src VM used 1 GB pages, we would be writing a full 2 GB
> of data assuming both pages were dirty.
> 
> If the src VM used 4k pages, we would be writing some subset of
> the 2 GB of data, and the rest would be unwritten.
> 
> Either way, when reading back the data we restore it into either
> 1 GB pages of 4k pages, beause any places there were unwritten
> orignally  will read back as zeros.

I think the page size information is already there, because there's a bitmap
embedded in the format, at least in the current proposal, and the bitmap can
only be defined with a page size provided in some form.

Here I agree the backend can change before/after a migration (live or
not).  The question, though, is whether the page size matters in the snapshot
layout rather than in what the loaded QEMU instance will use as the backend.

> 
> > If postcopy might be an option, we'd want the page size to be the host page
> > size because then looking up the bitmap will be straightforward, deciding
> > whether we should copy over page (UFFDIO_COPY) or fill in with zeros
> > (UFFDIO_ZEROPAGE).
> 
> This format is only intended for the case where we are migrating to
> a random-access medium, aka a file, because the fixed RAM mappings
> to disk mean that we need to seek back to the original location to
> re-write pages that get dirtied. It isn't suitable for a live
> migration stream, and thus postcopy is inherantly out of scope.

Yes, I've commented also in the cover letter, but I can expand a bit.

I mean supporting postcopy only when loading, but not when saving.

Saving to file definitely cannot work with postcopy because there's no dest
qemu running.

Loading from file, OTOH, can work together with postcopy.

Right now AFAICT the current approach is to precopy-load the whole guest image
in the supported snapshot format (if I can call it just a snapshot).

What I want to say is that we can consider supporting postcopy on loading:
we start an "empty" QEMU dest node, and when any page fault is triggered we
resolve it using userfault by looking up the snapshot file, rather than
sending a request back to the source.  I mention that because there are
two major benefits, which I touched on quickly in reply to the cover letter,
but I can also expand here:

  - Firstly, the snapshot format stores pages at linear offsets, which
    means that when we find a page missing we can look it up in the
    snapshot image in O(1) time.

  - Secondly, we don't need to let the page go through the wire, nor do
    we need to send a request to the src qemu or anyone.  What we need
    here is simply to test the bit in the snapshot bitmap, then:

    - If it is copied, do UFFDIO_COPY to resolve page faults,
    - If it is not copied, do UFFDIO_ZEROPAGE (e.g., if not hugetlb,
      hugetlb can use a fake UFFDIO_COPY)

So this is a perfect testing ground for using postcopy in a very efficient
way against a file snapshot.
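
To make it concrete, the fault resolution could look roughly like this --
a hypothetical sketch, not part of this series; it assumes the dest has
mmap()ed the snapshot's pages area, has the per-RAMBlock bitmap loaded,
and includes <sys/ioctl.h> and <linux/userfaultfd.h>:

    /*
     * Sketch only: resolve a userfault at fault_addr inside `block`,
     * using the bitmap read back from the snapshot (reusing the
     * shadow_bmap field here purely for illustration) plus an mmap()ed
     * view of the snapshot's pages area (snap_pages).
     */
    static int resolve_fault_from_snapshot(int uffd, RAMBlock *block,
                                           void *fault_addr, void *snap_pages)
    {
        uint64_t off = (uintptr_t)fault_addr - (uintptr_t)block->host;
        uint64_t start = off & ~((uint64_t)TARGET_PAGE_SIZE - 1);

        if (test_bit(off >> TARGET_PAGE_BITS, block->shadow_bmap)) {
            /* Page exists in the snapshot: copy it in. */
            struct uffdio_copy copy = {
                .dst = (uintptr_t)block->host + start,
                .src = (uintptr_t)snap_pages + start,
                .len = TARGET_PAGE_SIZE,
            };
            return ioctl(uffd, UFFDIO_COPY, &copy);
        } else {
            /* Never written: it reads back as zeros. */
            struct uffdio_zeropage zero = {
                .range = { .start = (uintptr_t)block->host + start,
                           .len = TARGET_PAGE_SIZE },
            };
            return ioctl(uffd, UFFDIO_ZEROPAGE, &zero);
        }
    }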

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-31 14:37   ` Fabiano Rosas
@ 2023-03-31 14:52     ` Peter Xu
  2023-03-31 15:30       ` Fabiano Rosas
  2023-03-31 15:46       ` Daniel P. Berrangé
  0 siblings, 2 replies; 65+ messages in thread
From: Peter Xu @ 2023-03-31 14:52 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

On Fri, Mar 31, 2023 at 11:37:50AM -0300, Fabiano Rosas wrote:
> >> Outgoing migration to file. NVMe disk. XFS filesystem.
> >> 
> >> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
> >>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
> >>   10m -v`:
> >> 
> >> migration type  | MB/s | pages/s |  ms
> >> ----------------+------+---------+------
> >> savevm io_uring |  434 |  102294 | 71473
> >
> > So I assume this is the non-live migration scenario.  Could you explain
> > what does io_uring mean here?
> >
> 
> This table is all non-live migration. This particular line is a snapshot
> (hmp_savevm->save_snapshot). I thought it could be relevant because it
> is another way by which we write RAM into disk.

I see, so if it's all non-live that explains it; I was curious about the
relationship between this feature and the live snapshot that QEMU also
supports.

I also don't immediately see why savevm would be much slower; do you have an
answer?  Maybe it's somewhere and I just overlooked it.

IIUC this is the "vm suspend" case, so there's the extra benefit of knowing
that "we can stop the VM".  It smells slightly weird to build this on top of
"migrate" from that pov, rather than "savevm", though.  Any thoughts on
this aspect (on why not build this on top of "savevm")?

Thanks,

> 
> The io_uring is noise, I was initially under the impression that the
> block device aio configuration affected this scenario.
> 
> >> file:           | 3017 |  855862 | 10301
> >> fixed-ram       | 1982 |  330686 | 15637
> >> ----------------+------+---------+------
> >> fixed-ram + multifd + O_DIRECT
> >>          2 ch.  | 5565 | 1500882 |  5576
> >>          4 ch.  | 5735 | 1991549 |  5412
> >>          8 ch.  | 5650 | 1769650 |  5489
> >>         16 ch.  | 6071 | 1832407 |  5114
> >>         32 ch.  | 6147 | 1809588 |  5050
> >>         64 ch.  | 6344 | 1841728 |  4895
> >>        128 ch.  | 6120 | 1915669 |  5085
> >> ----------------+------+---------+------
> >
> > Thanks,
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability
  2023-03-30 22:01   ` Peter Xu
  2023-03-31  7:56     ` Daniel P. Berrangé
@ 2023-03-31 15:05     ` Fabiano Rosas
  1 sibling, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-31 15:05 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela, Nikolay Borisov, Paolo Bonzini, David Hildenbrand,
	Philippe Mathieu-Daudé,
	Eric Blake, Markus Armbruster

Peter Xu <peterx@redhat.com> writes:

>> 
>> * pc - refers to the page_size/mr->addr members, so newly added members
>> begin from "bitmap_size".
>
> Could you elaborate more on what's the pc?
>
> I also didn't see this *pc in below migration.rst update.
>

Yeah, you need to be looking at the code to figure that one out. That
was intended to reference some postcopy data that is (already) inserted
into the stream. Literally this:

    if (migrate_postcopy_ram() && block->page_size !=
                                  qemu_host_page_size) {
        qemu_put_be64(f, block->page_size);
    }
    if (migrate_ignore_shared()) {
        qemu_put_be64(f, block->mr->addr);
    }

It has nothing to do with this patch. I need to rewrite that part of the
commit message a bit.

>> 
>> This layout is initialized during ram_save_setup so instead of having a
>> sequential stream of pages that follow the ramblock headers the dirty
>> pages for a ramblock follow its header. Since all pages have a fixed
>> location RAM_SAVE_FLAG_EOS is no longer generated on every migration
>> iteration but there is effectively a single RAM_SAVE_FLAG_EOS right at
>> the end.
>> 
>> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>

...

>> @@ -4390,6 +4432,12 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>>          }
>>      }
>>  
>> +    if (migrate_check_fixed_ram(s, &local_err) < 0) {
>
> This check might be too late afaict, QMP cmd "migrate" could have already
> succeeded.
>
> Can we do an early check in / close to qmp_migrate()?  The idea is we fail
> at the QMP migrate command there.
>

Yes, some of it depends on the QEMUFile being known but I can at least
move part of the verification earlier.

>> +        migrate_fd_cleanup(s);
>> +        migrate_fd_error(s, local_err);
>> +        return;
>> +    }
>> +
>>      if (resume) {
>>          /* Wakeup the main migration thread to do the recovery */
>>          migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_PAUSED,
>> @@ -4519,6 +4567,7 @@ static Property migration_properties[] = {
>>      DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
>>  
>>      /* Migration capabilities */
>> +    DEFINE_PROP_MIG_CAP("x-fixed-ram", MIGRATION_CAPABILITY_FIXED_RAM),
>>      DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
>>      DEFINE_PROP_MIG_CAP("x-rdma-pin-all", MIGRATION_CAPABILITY_RDMA_PIN_ALL),
>>      DEFINE_PROP_MIG_CAP("x-auto-converge", MIGRATION_CAPABILITY_AUTO_CONVERGE),
>> diff --git a/migration/migration.h b/migration/migration.h
>> index 2da2f8a164..8cf3caecfe 100644
>> --- a/migration/migration.h
>> +++ b/migration/migration.h
>> @@ -416,6 +416,7 @@ bool migrate_zero_blocks(void);
>>  bool migrate_dirty_bitmaps(void);
>>  bool migrate_ignore_shared(void);
>>  bool migrate_validate_uuid(void);
>> +int migrate_fixed_ram(void);
>>  
>>  bool migrate_auto_converge(void);
>>  bool migrate_use_multifd(void);
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 96e8a19a58..56f0f782c8 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1310,9 +1310,14 @@ static int save_zero_page_to_file(PageSearchStatus *pss,
>>      int len = 0;
>>  
>>      if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
>> -        len += save_page_header(pss, block, offset | RAM_SAVE_FLAG_ZERO);
>> -        qemu_put_byte(file, 0);
>> -        len += 1;
>> +        if (migrate_fixed_ram()) {
>> +            /* for zero pages we don't need to do anything */
>> +            len = 1;
>
> I think you wanted to increase the "duplicated" counter, but this will also
> increase ram-transferred even though only 1 byte.
>

Ah, well spotted, that is indeed incorrect.

> Perhaps just pass a pointer to keep the bytes, and return true/false to
> increase the counter (to make everything accurate)?
>

Ok

>> +        } else {
>> +            len += save_page_header(pss, block, offset | RAM_SAVE_FLAG_ZERO);
>> +            qemu_put_byte(file, 0);
>> +            len += 1;
>> +        }
>>          ram_release_page(block->idstr, offset);
>>      }
>>      return len;


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-31 14:52     ` Peter Xu
@ 2023-03-31 15:30       ` Fabiano Rosas
  2023-03-31 15:55         ` Peter Xu
  2023-03-31 15:46       ` Daniel P. Berrangé
  1 sibling, 1 reply; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-31 15:30 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

Peter Xu <peterx@redhat.com> writes:

> On Fri, Mar 31, 2023 at 11:37:50AM -0300, Fabiano Rosas wrote:
>> >> Outgoing migration to file. NVMe disk. XFS filesystem.
>> >> 
>> >> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
>> >>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
>> >>   10m -v`:
>> >> 
>> >> migration type  | MB/s | pages/s |  ms
>> >> ----------------+------+---------+------
>> >> savevm io_uring |  434 |  102294 | 71473
>> >
>> > So I assume this is the non-live migration scenario.  Could you explain
>> > what does io_uring mean here?
>> >
>> 
>> This table is all non-live migration. This particular line is a snapshot
>> (hmp_savevm->save_snapshot). I thought it could be relevant because it
>> is another way by which we write RAM into disk.
>
> I see, so if all non-live that explains, because I was curious what's the
> relationship between this feature and the live snapshot that QEMU also
> supports.
>
> I also don't immediately see why savevm will be much slower, do you have an
> answer?  Maybe it's somewhere but I just overlooked..
>

I don't have a concrete answer. I could take a jab and maybe blame the
extra memcpy for the buffer in QEMUFile? Or perhaps an unintended effect
of bandwidth limits?

> IIUC this is "vm suspend" case, so there's an extra benefit knowledge of
> "we can stop the VM".  It smells slightly weird to build this on top of
> "migrate" from that pov, rather than "savevm", though.  Any thoughts on
> this aspect (on why not building this on top of "savevm")?
>

I share the same perception. I have done initial experiments with
savevm, but I decided to carry on the work that was already started by
others because my understanding of the problem was still incomplete.

One point that has been raised is that the fixed-ram format alone does
not bring that many performance improvements. So we'll need
multi-threading and direct-io on top of it. Re-using multifd
infrastructure seems like it could be a good idea.

> Thanks,
>
>> 
>> The io_uring is noise, I was initially under the impression that the
>> block device aio configuration affected this scenario.
>> 
>> >> file:           | 3017 |  855862 | 10301
>> >> fixed-ram       | 1982 |  330686 | 15637
>> >> ----------------+------+---------+------
>> >> fixed-ram + multifd + O_DIRECT
>> >>          2 ch.  | 5565 | 1500882 |  5576
>> >>          4 ch.  | 5735 | 1991549 |  5412
>> >>          8 ch.  | 5650 | 1769650 |  5489
>> >>         16 ch.  | 6071 | 1832407 |  5114
>> >>         32 ch.  | 6147 | 1809588 |  5050
>> >>         64 ch.  | 6344 | 1841728 |  4895
>> >>        128 ch.  | 6120 | 1915669 |  5085
>> >> ----------------+------+---------+------
>> >
>> > Thanks,
>> 


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability
  2023-03-31 14:39       ` Peter Xu
@ 2023-03-31 15:34         ` Daniel P. Berrangé
  2023-03-31 16:13           ` Peter Xu
  0 siblings, 1 reply; 65+ messages in thread
From: Daniel P. Berrangé @ 2023-03-31 15:34 UTC (permalink / raw)
  To: Peter Xu
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela, Nikolay Borisov, Paolo Bonzini,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Eric Blake, Markus Armbruster

On Fri, Mar 31, 2023 at 10:39:23AM -0400, Peter Xu wrote:
> On Fri, Mar 31, 2023 at 08:56:01AM +0100, Daniel P. Berrangé wrote:
> > On Thu, Mar 30, 2023 at 06:01:51PM -0400, Peter Xu wrote:
> > > On Thu, Mar 30, 2023 at 03:03:20PM -0300, Fabiano Rosas wrote:
> > > > From: Nikolay Borisov <nborisov@suse.com>
> > > > 
> > > > Implement 'fixed-ram' feature. The core of the feature is to ensure that
> > > > each ram page of the migration stream has a specific offset in the
> > > > resulting migration stream. The reason why we'd want such behavior are
> > > > two fold:
> > > > 
> > > >  - When doing a 'fixed-ram' migration the resulting file will have a
> > > >    bounded size, since pages which are dirtied multiple times will
> > > >    always go to a fixed location in the file, rather than constantly
> > > >    being added to a sequential stream. This eliminates cases where a vm
> > > >    with, say, 1G of ram can result in a migration file that's 10s of
> > > >    GBs, provided that the workload constantly redirties memory.
> > > > 
> > > >  - It paves the way to implement DIO-enabled save/restore of the
> > > >    migration stream as the pages are ensured to be written at aligned
> > > >    offsets.
> > > > 
> > > > The feature requires changing the stream format. First, a bitmap is
> > > > introduced which tracks which pages have been written (i.e are
> > > > dirtied) during migration and subsequently it's being written in the
> > > > resulting file, again at a fixed location for every ramblock. Zero
> > > > pages are ignored as they'd be zero in the destination migration as
> > > > well. With the changed format data would look like the following:
> > > > 
> > > > |name len|name|used_len|pc*|bitmap_size|pages_offset|bitmap|pages|
> > > 
> > > What happens with huge pages?  Would page size matter here?
> > > 
> > > I would assume it's fine it uses a constant (small) page size, assuming
> > > that should match with the granule that qemu tracks dirty (which IIUC is
> > > the host page size not guest's).
> > > 
> > > But I didn't yet pay any further thoughts on that, maybe it would be
> > > worthwhile in all cases to record page sizes here to be explicit or the
> > > meaning of bitmap may not be clear (and then the bitmap_size will be a
> > > field just for sanity check too).
> > 
> > I think recording the page sizes is an anti-feature in this case.
> > 
> > The migration format / state needs to reflect the guest ABI, but
> > we need to be free to have different backend config behind that
> > either side of the save/restore.
> > 
> > IOW, if I start a QEMU with 2 GB of RAM, I should be free to use
> > small pages initially and after restore use 2 x 1 GB hugepages,
> > or vica-verca.
> > 
> > The important thing with the pages that are saved into the file
> > is that they are a 1:1 mapping guest RAM regions to file offsets.
> > IOW, the 2 GB of guest RAM is always a contiguous 2 GB region
> > in the file.
> > 
> > If the src VM used 1 GB pages, we would be writing a full 2 GB
> > of data assuming both pages were dirty.
> > 
> > If the src VM used 4k pages, we would be writing some subset of
> > the 2 GB of data, and the rest would be unwritten.
> > 
> > Either way, when reading back the data we restore it into either
> > 1 GB pages of 4k pages, beause any places there were unwritten
> > orignally  will read back as zeros.
> 
> I think there's already the page size information, because there's a bitmap
> embeded in the format at least in the current proposal, and the bitmap can
> only be defined with a page size provided in some form.
> 
> Here I agree the backend can change before/after a migration (live or
> not).  Though the question is whether page size matters in the snapshot
> layout rather than what the loaded QEMU instance will use as backend.

IIUC, the page size information merely sets a constraint on the granularity
of unwritten (sparse) regions in the file. If we didn't want to express
the page size directly in the file format we would need explicit start/end
offsets for each written block. This is less convenient than just having
a bitmap, so I think it's ok to use the page-size bitmap.

> > > If postcopy might be an option, we'd want the page size to be the host page
> > > size because then looking up the bitmap will be straightforward, deciding
> > > whether we should copy over page (UFFDIO_COPY) or fill in with zeros
> > > (UFFDIO_ZEROPAGE).
> > 
> > This format is only intended for the case where we are migrating to
> > a random-access medium, aka a file, because the fixed RAM mappings
> > to disk mean that we need to seek back to the original location to
> > re-write pages that get dirtied. It isn't suitable for a live
> > migration stream, and thus postcopy is inherantly out of scope.
> 
> Yes, I've commented also in the cover letter, but I can expand a bit.
> 
> I mean support postcopy only when loading, but not when saving.
> 
> Saving to file definitely cannot work with postcopy because there's no dest
> qemu running.
> 
> Loading from file, OTOH, can work together with postcopy.

Ahh, I see what you mean.

> Right now AFAICT current approach is precopy loading the whole guest image
> with the supported snapshot format (if I can call it just a snapshot).
> 
> What I want to say is we can consider supporting postcopy on loading in
> that we start an "empty" QEMU dest node, when any page fault triggered we
> do it using userfault and lookup the snapshot file instead rather than
> sending a request back to the source.  I mentioned that because there'll be
> two major benefits which I mentioned in reply to the cover letter quickly,
> but I can also extend here:
> 
>   - Firstly, the snapshot format is ideally storing pages in linear
>     offsets, it means when we know some page missing we can use O(1) time
>     looking it up from the snapshot image.
> 
>   - Secondly, we don't need to let the page go through the wires, neither
>     do we need to send a request to src qemu or anyone.  What we need here
>     is simply test the bit on the snapshot bitmap, then:
> 
>     - If it is copied, do UFFDIO_COPY to resolve page faults,
>     - If it is not copied, do UFFDIO_ZEROPAGE (e.g., if not hugetlb,
>       hugetlb can use a fake UFFDIO_COPY)
> 
> So this is a perfect testing ground for using postcopy in a very efficient
> way against a file snapshot.

Yes, that's a nice unexpected benefit of this fixed-ram file format.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-31 14:52     ` Peter Xu
  2023-03-31 15:30       ` Fabiano Rosas
@ 2023-03-31 15:46       ` Daniel P. Berrangé
  1 sibling, 0 replies; 65+ messages in thread
From: Daniel P. Berrangé @ 2023-03-31 15:46 UTC (permalink / raw)
  To: Peter Xu
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela

On Fri, Mar 31, 2023 at 10:52:09AM -0400, Peter Xu wrote:
> On Fri, Mar 31, 2023 at 11:37:50AM -0300, Fabiano Rosas wrote:
> > >> Outgoing migration to file. NVMe disk. XFS filesystem.
> > >> 
> > >> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
> > >>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
> > >>   10m -v`:
> > >> 
> > >> migration type  | MB/s | pages/s |  ms
> > >> ----------------+------+---------+------
> > >> savevm io_uring |  434 |  102294 | 71473
> > >
> > > So I assume this is the non-live migration scenario.  Could you explain
> > > what does io_uring mean here?
> > >
> > 
> > This table is all non-live migration. This particular line is a snapshot
> > (hmp_savevm->save_snapshot). I thought it could be relevant because it
> > is another way by which we write RAM into disk.
> 
> I see, so if all non-live that explains, because I was curious what's the
> relationship between this feature and the live snapshot that QEMU also
> supports.
> 
> I also don't immediately see why savevm will be much slower, do you have an
> answer?  Maybe it's somewhere but I just overlooked..
> 
> IIUC this is "vm suspend" case, so there's an extra benefit knowledge of
> "we can stop the VM".  It smells slightly weird to build this on top of
> "migrate" from that pov, rather than "savevm", though.  Any thoughts on
> this aspect (on why not building this on top of "savevm")?

Currently savevm covers saving memory, device state and disk snapshots
into the VM's disks, which basically means it only works
with qcow2.

Libvirt's save logic only cares about saving memory and device
state, and supports saving guests regardless of what storage is
used, saving it externally, apart from the disks.

This is only possible with 'migrate' today and so 'savevm' isn't
useful for this task from libvirt's POV.

In the past it has been suggested that the 'savevm' command as a concept
is actually redundant, and that we could in fact layer it on top of a
combination of migration and block snapshot APIs, e.g. if we had a
'blockdev:' migration protocol for saving the vmstate.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-31 15:30       ` Fabiano Rosas
@ 2023-03-31 15:55         ` Peter Xu
  2023-03-31 16:10           ` Daniel P. Berrangé
  0 siblings, 1 reply; 65+ messages in thread
From: Peter Xu @ 2023-03-31 15:55 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

On Fri, Mar 31, 2023 at 12:30:45PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Fri, Mar 31, 2023 at 11:37:50AM -0300, Fabiano Rosas wrote:
> >> >> Outgoing migration to file. NVMe disk. XFS filesystem.
> >> >> 
> >> >> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
> >> >>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
> >> >>   10m -v`:
> >> >> 
> >> >> migration type  | MB/s | pages/s |  ms
> >> >> ----------------+------+---------+------
> >> >> savevm io_uring |  434 |  102294 | 71473
> >> >
> >> > So I assume this is the non-live migration scenario.  Could you explain
> >> > what does io_uring mean here?
> >> >
> >> 
> >> This table is all non-live migration. This particular line is a snapshot
> >> (hmp_savevm->save_snapshot). I thought it could be relevant because it
> >> is another way by which we write RAM into disk.
> >
> > I see, so if all non-live that explains, because I was curious what's the
> > relationship between this feature and the live snapshot that QEMU also
> > supports.
> >
> > I also don't immediately see why savevm will be much slower, do you have an
> > answer?  Maybe it's somewhere but I just overlooked..
> >
> 
> I don't have a concrete answer. I could take a jab and maybe blame the
> extra memcpy for the buffer in QEMUFile? Or perhaps an unintended effect
> of bandwidth limits?

IMHO it would be great if this could be investigated and the reasons provided
in the next cover letter.

> 
> > IIUC this is "vm suspend" case, so there's an extra benefit knowledge of
> > "we can stop the VM".  It smells slightly weird to build this on top of
> > "migrate" from that pov, rather than "savevm", though.  Any thoughts on
> > this aspect (on why not building this on top of "savevm")?
> >
> 
> I share the same perception. I have done initial experiments with
> savevm, but I decided to carry on the work that was already started by
> others because my understanding of the problem was yet incomplete.
> 
> One point that has been raised is that the fixed-ram format alone does
> not bring that many performance improvements. So we'll need
> multi-threading and direct-io on top of it. Re-using multifd
> infrastructure seems like it could be a good idea.

The thing is, IMHO concurrency is not as hard if the VM is stopped and when
we're 100% sure locally where each page will go.

IOW, I think multifd provides a lot of features that may not really be
useful for this effort, while using it may already mean paying the overhead
of supporting those features.

For example, a major benefit of multifd is that it allows pages to be sent
out of order, so it indexes each page with a header.  I didn't read the
follow-up patches, but I assume that's not needed in this effort.

What I understand so far with fixed-ram is that we dump the whole ramblock
memory into a chunk at an offset of a file.  Couldn't that concurrency be
achieved easily by creating a bunch of threads dumping together during
the savevm, each passed a different range of guest ram & file offsets?
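
Roughly along these lines -- a sketch only, plain pthreads + pwrite(), all
names made up, no multifd infrastructure involved:

    #include <errno.h>
    #include <pthread.h>
    #include <stdint.h>
    #include <unistd.h>

    /* Sketch: dump one guest RAM range at its fixed file offset. */
    struct dump_job {
        int fd;               /* migration file */
        const uint8_t *host;  /* start of the RAM range in QEMU memory */
        uint64_t file_offset; /* fixed offset of that range in the file */
        size_t len;
    };

    static void *dump_worker(void *opaque)
    {
        struct dump_job *job = opaque;
        size_t done = 0;

        while (done < job->len) {
            ssize_t n = pwrite(job->fd, job->host + done, job->len - done,
                               job->file_offset + done);
            if (n <= 0) {
                return (void *)(intptr_t)(n < 0 ? -errno : -EIO);
            }
            done += n;
        }
        return NULL;
    }

One pthread_create() per RAM range, then join them all; since every writer
has its own fixed offset there is no shared stream position and no page
headers to serialize on.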

It's very possible that I overlooked a lot of things, but IMHO my point is
it would always be great to have a small section in the cover letter
discussing the pros and cons of the decision to use the "migrate" infra
rather than "savevm", because it still goes against the intuition of at
least some reviewers (like me..).  What I worry about is that this could be
implemented more efficiently and with fewer LOCs in savevm (and perhaps
also benefit normal savevm, so current savevm users could benefit from it
too), but we didn't do so because the project simply started out using QMP
migrate.  Any investigation to figure more of this out would be greatly
helpful.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-31 15:55         ` Peter Xu
@ 2023-03-31 16:10           ` Daniel P. Berrangé
  2023-03-31 16:27             ` Peter Xu
  0 siblings, 1 reply; 65+ messages in thread
From: Daniel P. Berrangé @ 2023-03-31 16:10 UTC (permalink / raw)
  To: Peter Xu
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela

On Fri, Mar 31, 2023 at 11:55:03AM -0400, Peter Xu wrote:
> On Fri, Mar 31, 2023 at 12:30:45PM -0300, Fabiano Rosas wrote:
> > Peter Xu <peterx@redhat.com> writes:
> > 
> > > On Fri, Mar 31, 2023 at 11:37:50AM -0300, Fabiano Rosas wrote:
> > >> >> Outgoing migration to file. NVMe disk. XFS filesystem.
> > >> >> 
> > >> >> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
> > >> >>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
> > >> >>   10m -v`:
> > >> >> 
> > >> >> migration type  | MB/s | pages/s |  ms
> > >> >> ----------------+------+---------+------
> > >> >> savevm io_uring |  434 |  102294 | 71473
> > >> >
> > >> > So I assume this is the non-live migration scenario.  Could you explain
> > >> > what does io_uring mean here?
> > >> >
> > >> 
> > >> This table is all non-live migration. This particular line is a snapshot
> > >> (hmp_savevm->save_snapshot). I thought it could be relevant because it
> > >> is another way by which we write RAM into disk.
> > >
> > > I see, so if all non-live that explains, because I was curious what's the
> > > relationship between this feature and the live snapshot that QEMU also
> > > supports.
> > >
> > > I also don't immediately see why savevm will be much slower, do you have an
> > > answer?  Maybe it's somewhere but I just overlooked..
> > >
> > 
> > I don't have a concrete answer. I could take a jab and maybe blame the
> > extra memcpy for the buffer in QEMUFile? Or perhaps an unintended effect
> > of bandwidth limits?
> 
> IMHO it would be great if this can be investigated and reasons provided in
> the next cover letter.
> 
> > 
> > > IIUC this is "vm suspend" case, so there's an extra benefit knowledge of
> > > "we can stop the VM".  It smells slightly weird to build this on top of
> > > "migrate" from that pov, rather than "savevm", though.  Any thoughts on
> > > this aspect (on why not building this on top of "savevm")?
> > >
> > 
> > I share the same perception. I have done initial experiments with
> > savevm, but I decided to carry on the work that was already started by
> > others because my understanding of the problem was yet incomplete.
> > 
> > One point that has been raised is that the fixed-ram format alone does
> > not bring that many performance improvements. So we'll need
> > multi-threading and direct-io on top of it. Re-using multifd
> > infrastructure seems like it could be a good idea.
> 
> The thing is IMHO concurrency is not as hard if VM stopped, and when we're
> 100% sure locally on where the page will go.

We shouldn't assume the VM is stopped though. When saving to the file
the VM may still be active. The fixed-ram format lets us re-write the
same memory location on disk multiple times in this case, thus avoiding
growth of the file size.

> IOW, I think multifd provides a lot of features that may not really be
> useful for this effort, meanwhile using those features may need to already
> pay for the overhead to support those features.
> 
> For example, a major benefit of multifd is it allows pages sent out of
> order, so it indexes the page as a header.  I didn't read the follow up
> patches, but I assume that's not needed in this effort.
> 
> What I understand so far with fixes-ram is we dump the whole ramblock
> memory into a chunk at offset of a file.  Can concurrency of that
> achievable easily by creating a bunch of threads dumping altogether during
> the savevm, with different offsets of guest ram & file passed over?

I feel like the migration code is already insanely complicated and
the many threads involved have caused no end of subtle bugs. 

It was Juan, I believe, who expressed a desire to entirely remove the
non-multifd code in the future, in order to reduce the maintenance burden.
IOW, ideally we would be pushing mgmt apps towards always using multifd,
even if they only ask it to create a single thread.

That would in turn suggest against creating new concurrency
mechanisms on top of non-multifd code, both to avoid adding yet
more complexity and also because it would make it harder to later
delete the non-multifd code.

On the libvirt side wrt fixed-ram, we could just use multifd
exclusively, as there should be no downside to it even for a
single FD.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability
  2023-03-31 15:34         ` Daniel P. Berrangé
@ 2023-03-31 16:13           ` Peter Xu
  0 siblings, 0 replies; 65+ messages in thread
From: Peter Xu @ 2023-03-31 16:13 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela, Nikolay Borisov, Paolo Bonzini,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Eric Blake, Markus Armbruster

On Fri, Mar 31, 2023 at 04:34:57PM +0100, Daniel P. Berrangé wrote:
> On Fri, Mar 31, 2023 at 10:39:23AM -0400, Peter Xu wrote:
> > On Fri, Mar 31, 2023 at 08:56:01AM +0100, Daniel P. Berrangé wrote:
> > > On Thu, Mar 30, 2023 at 06:01:51PM -0400, Peter Xu wrote:
> > > > On Thu, Mar 30, 2023 at 03:03:20PM -0300, Fabiano Rosas wrote:
> > > > > From: Nikolay Borisov <nborisov@suse.com>
> > > > > 
> > > > > Implement 'fixed-ram' feature. The core of the feature is to ensure that
> > > > > each ram page of the migration stream has a specific offset in the
> > > > > resulting migration stream. The reason why we'd want such behavior are
> > > > > two fold:
> > > > > 
> > > > >  - When doing a 'fixed-ram' migration the resulting file will have a
> > > > >    bounded size, since pages which are dirtied multiple times will
> > > > >    always go to a fixed location in the file, rather than constantly
> > > > >    being added to a sequential stream. This eliminates cases where a vm
> > > > >    with, say, 1G of ram can result in a migration file that's 10s of
> > > > >    GBs, provided that the workload constantly redirties memory.
> > > > > 
> > > > >  - It paves the way to implement DIO-enabled save/restore of the
> > > > >    migration stream as the pages are ensured to be written at aligned
> > > > >    offsets.
> > > > > 
> > > > > The feature requires changing the stream format. First, a bitmap is
> > > > > introduced which tracks which pages have been written (i.e are
> > > > > dirtied) during migration and subsequently it's being written in the
> > > > > resulting file, again at a fixed location for every ramblock. Zero
> > > > > pages are ignored as they'd be zero in the destination migration as
> > > > > well. With the changed format data would look like the following:
> > > > > 
> > > > > |name len|name|used_len|pc*|bitmap_size|pages_offset|bitmap|pages|
> > > > 
> > > > What happens with huge pages?  Would page size matter here?
> > > > 
> > > > I would assume it's fine it uses a constant (small) page size, assuming
> > > > that should match with the granule that qemu tracks dirty (which IIUC is
> > > > the host page size not guest's).
> > > > 
> > > > But I didn't yet pay any further thoughts on that, maybe it would be
> > > > worthwhile in all cases to record page sizes here to be explicit or the
> > > > meaning of bitmap may not be clear (and then the bitmap_size will be a
> > > > field just for sanity check too).
> > > 
> > > I think recording the page sizes is an anti-feature in this case.
> > > 
> > > The migration format / state needs to reflect the guest ABI, but
> > > we need to be free to have different backend config behind that
> > > either side of the save/restore.
> > > 
> > > IOW, if I start a QEMU with 2 GB of RAM, I should be free to use
> > > small pages initially and after restore use 2 x 1 GB hugepages,
> > > or vica-verca.
> > > 
> > > The important thing with the pages that are saved into the file
> > > is that they are a 1:1 mapping guest RAM regions to file offsets.
> > > IOW, the 2 GB of guest RAM is always a contiguous 2 GB region
> > > in the file.
> > > 
> > > If the src VM used 1 GB pages, we would be writing a full 2 GB
> > > of data assuming both pages were dirty.
> > > 
> > > If the src VM used 4k pages, we would be writing some subset of
> > > the 2 GB of data, and the rest would be unwritten.
> > > 
> > > Either way, when reading back the data we restore it into either
> > > 1 GB pages of 4k pages, beause any places there were unwritten
> > > orignally  will read back as zeros.
> > 
> > I think there's already the page size information, because there's a bitmap
> > embeded in the format at least in the current proposal, and the bitmap can
> > only be defined with a page size provided in some form.
> > 
> > Here I agree the backend can change before/after a migration (live or
> > not).  Though the question is whether page size matters in the snapshot
> > layout rather than what the loaded QEMU instance will use as backend.
> 
> IIUC, the page size information merely sets a constraint on the granularity
> of unwritten (sparse) regions in the file. If we didn't want to express
> page size directly in the file format we would need explicit start/end
> offsets for each written block. This is less convenient that just having
> a bitmap, so I think its ok to use the page size bitmap

I'm perfectly fine with having the bitmap.  The original question was about
whether we should also store the page_size in the same header along with
the bitmap.

Currently I think the page size can be implied by the system configuration
(e.g. arch, cpu setup) and also by the size of the bitmap.  So I'm
wondering whether it'd be cleaner to replace the bitmap size with the page
size (so one can calculate the bitmap size from the page size), or to just
keep both of them for sanity.

Besides, since we seem to be defining a new header format to be stored on
disk, maybe it'll be worthwhile to leave some space for future extensions
of the image?

So the image format could start with a version field (perhaps also with
fields explaining what it contains).  Then if someday we want to extend the
image, the new QEMU binary will still be able to load the old image even if
the format changes.  Or vice versa, the old QEMU binary would be able to
identify that it's loading a newer image it doesn't really understand, and
properly notify the user rather than fail with weird loading errors.
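
For instance, something along these lines - field names and sizes are
purely illustrative, not a proposal for the actual layout in the series:

#include <stdint.h>

/* Hypothetical versioned header for a fixed-ram image; illustration only. */
struct fixed_ram_image_header {
    uint32_t magic;          /* identifies a fixed-ram migration file */
    uint32_t version;        /* bumped on incompatible format changes */
    uint64_t page_size;      /* granularity the dirty bitmap refers to */
    uint64_t bitmap_offset;  /* where the per-ramblock bitmap starts */
    uint64_t pages_offset;   /* where page data starts, aligned for O_DIRECT */
    /* new fields appended here; an old binary rejects a newer version,
     * a new binary keeps loading old versions */
};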

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-31 16:10           ` Daniel P. Berrangé
@ 2023-03-31 16:27             ` Peter Xu
  2023-03-31 18:18               ` Fabiano Rosas
  2023-04-18 16:58               ` Daniel P. Berrangé
  0 siblings, 2 replies; 65+ messages in thread
From: Peter Xu @ 2023-03-31 16:27 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela

On Fri, Mar 31, 2023 at 05:10:16PM +0100, Daniel P. Berrangé wrote:
> On Fri, Mar 31, 2023 at 11:55:03AM -0400, Peter Xu wrote:
> > On Fri, Mar 31, 2023 at 12:30:45PM -0300, Fabiano Rosas wrote:
> > > Peter Xu <peterx@redhat.com> writes:
> > > 
> > > > On Fri, Mar 31, 2023 at 11:37:50AM -0300, Fabiano Rosas wrote:
> > > >> >> Outgoing migration to file. NVMe disk. XFS filesystem.
> > > >> >> 
> > > >> >> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
> > > >> >>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
> > > >> >>   10m -v`:
> > > >> >> 
> > > >> >> migration type  | MB/s | pages/s |  ms
> > > >> >> ----------------+------+---------+------
> > > >> >> savevm io_uring |  434 |  102294 | 71473
> > > >> >
> > > >> > So I assume this is the non-live migration scenario.  Could you explain
> > > >> > what does io_uring mean here?
> > > >> >
> > > >> 
> > > >> This table is all non-live migration. This particular line is a snapshot
> > > >> (hmp_savevm->save_snapshot). I thought it could be relevant because it
> > > >> is another way by which we write RAM into disk.
> > > >
> > > > I see, so if all non-live that explains, because I was curious what's the
> > > > relationship between this feature and the live snapshot that QEMU also
> > > > supports.
> > > >
> > > > I also don't immediately see why savevm will be much slower, do you have an
> > > > answer?  Maybe it's somewhere but I just overlooked..
> > > >
> > > 
> > > I don't have a concrete answer. I could take a jab and maybe blame the
> > > extra memcpy for the buffer in QEMUFile? Or perhaps an unintended effect
> > > of bandwidth limits?
> > 
> > IMHO it would be great if this can be investigated and reasons provided in
> > the next cover letter.
> > 
> > > 
> > > > IIUC this is "vm suspend" case, so there's an extra benefit knowledge of
> > > > "we can stop the VM".  It smells slightly weird to build this on top of
> > > > "migrate" from that pov, rather than "savevm", though.  Any thoughts on
> > > > this aspect (on why not building this on top of "savevm")?
> > > >
> > > 
> > > I share the same perception. I have done initial experiments with
> > > savevm, but I decided to carry on the work that was already started by
> > > others because my understanding of the problem was yet incomplete.
> > > 
> > > One point that has been raised is that the fixed-ram format alone does
> > > not bring that many performance improvements. So we'll need
> > > multi-threading and direct-io on top of it. Re-using multifd
> > > infrastructure seems like it could be a good idea.
> > 
> > The thing is IMHO concurrency is not as hard if VM stopped, and when we're
> > 100% sure locally on where the page will go.
> 
> We shouldn't assume the VM is stopped though. When saving to the file
> the VM may still be active. The fixed-ram format lets us re-write the
> same memory location on disk multiple times in this case, thus avoiding
> growth of the file size.

Before discussing reusing multifd below, I now have a major confusion about
the use case of the feature..

The question is whether we would like to stop the VM after fixed-ram
migration completes.  I'm asking because:

  1. If it will stop, then it looks like a "VM suspend" to me. If so, could
     anyone help explain why we don't stop the VM first then migrate?
     Because it avoids copying single pages multiple times, no fiddling
     with dirty tracking at all - we just don't ever track anything.  In
     short, we'll stop the VM anyway, then why not stop it slightly
     earlier?

  2. If it will not stop, then it's "VM live snapshot" to me.  We already
     have that, don't we?  That's more efficient because it'll wr-protect
     all guest pages, any write triggers a CoW, and we only copy the guest
     pages once and for all.

Either way, there's no need to copy any page more than once.  Did I perhaps
miss something very important?

I would guess it's option (1) above, because it seems we don't snapshot the
disk alongside.  But I am really not sure now..

> 
> > IOW, I think multifd provides a lot of features that may not really be
> > useful for this effort, meanwhile using those features may need to already
> > pay for the overhead to support those features.
> > 
> > For example, a major benefit of multifd is it allows pages sent out of
> > order, so it indexes the page as a header.  I didn't read the follow up
> > patches, but I assume that's not needed in this effort.
> > 
> > What I understand so far with fixes-ram is we dump the whole ramblock
> > memory into a chunk at offset of a file.  Can concurrency of that
> > achievable easily by creating a bunch of threads dumping altogether during
> > the savevm, with different offsets of guest ram & file passed over?
> 
> I feel like the migration code is already insanely complicated and
> the many threads involved have caused no end of subtle bugs. 
> 
> It was Juan I believe who expressed a desire to entirely remove
> non-multifd code in the future, in order to reduce the maint burden.
> IOW, ideally we would be pushing mgmt apps towards always using
> multifd at all times, even if they only ask it to create 1 single
> thread.
> 
> That would in turn suggest against creating new concurrency
> mechanisms on top of non-multifd code, both to avoid adding yet
> more complexity and also because it would make it harder to later
> delete the non-multifd code.
> 
> On the libvirt side wrt fixed-ram, we could just use multifd
> exclusively, as there should be no downside to it even for a
> single FD.
> 
> With regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> 

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-31 16:27             ` Peter Xu
@ 2023-03-31 18:18               ` Fabiano Rosas
  2023-03-31 21:52                 ` Peter Xu
  2023-04-18 16:58               ` Daniel P. Berrangé
  1 sibling, 1 reply; 65+ messages in thread
From: Fabiano Rosas @ 2023-03-31 18:18 UTC (permalink / raw)
  To: Peter Xu, Daniel P. Berrangé
  Cc: qemu-devel, Claudio Fontana, jfehlig, dfaggioli, dgilbert, Juan Quintela

Peter Xu <peterx@redhat.com> writes:

> On Fri, Mar 31, 2023 at 05:10:16PM +0100, Daniel P. Berrangé wrote:
>> On Fri, Mar 31, 2023 at 11:55:03AM -0400, Peter Xu wrote:
>> > On Fri, Mar 31, 2023 at 12:30:45PM -0300, Fabiano Rosas wrote:
>> > > Peter Xu <peterx@redhat.com> writes:
>> > > 
>> > > > On Fri, Mar 31, 2023 at 11:37:50AM -0300, Fabiano Rosas wrote:
>> > > >> >> Outgoing migration to file. NVMe disk. XFS filesystem.
>> > > >> >> 
>> > > >> >> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
>> > > >> >>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
>> > > >> >>   10m -v`:
>> > > >> >> 
>> > > >> >> migration type  | MB/s | pages/s |  ms
>> > > >> >> ----------------+------+---------+------
>> > > >> >> savevm io_uring |  434 |  102294 | 71473
>> > > >> >
>> > > >> > So I assume this is the non-live migration scenario.  Could you explain
>> > > >> > what does io_uring mean here?
>> > > >> >
>> > > >> 
>> > > >> This table is all non-live migration. This particular line is a snapshot
>> > > >> (hmp_savevm->save_snapshot). I thought it could be relevant because it
>> > > >> is another way by which we write RAM into disk.
>> > > >
>> > > > I see, so if all non-live that explains, because I was curious what's the
>> > > > relationship between this feature and the live snapshot that QEMU also
>> > > > supports.
>> > > >
>> > > > I also don't immediately see why savevm will be much slower, do you have an
>> > > > answer?  Maybe it's somewhere but I just overlooked..
>> > > >
>> > > 
>> > > I don't have a concrete answer. I could take a jab and maybe blame the
>> > > extra memcpy for the buffer in QEMUFile? Or perhaps an unintended effect
>> > > of bandwidth limits?
>> > 
>> > IMHO it would be great if this can be investigated and reasons provided in
>> > the next cover letter.
>> > 
>> > > 
>> > > > IIUC this is "vm suspend" case, so there's an extra benefit knowledge of
>> > > > "we can stop the VM".  It smells slightly weird to build this on top of
>> > > > "migrate" from that pov, rather than "savevm", though.  Any thoughts on
>> > > > this aspect (on why not building this on top of "savevm")?
>> > > >
>> > > 
>> > > I share the same perception. I have done initial experiments with
>> > > savevm, but I decided to carry on the work that was already started by
>> > > others because my understanding of the problem was yet incomplete.
>> > > 
>> > > One point that has been raised is that the fixed-ram format alone does
>> > > not bring that many performance improvements. So we'll need
>> > > multi-threading and direct-io on top of it. Re-using multifd
>> > > infrastructure seems like it could be a good idea.
>> > 
>> > The thing is IMHO concurrency is not as hard if VM stopped, and when we're
>> > 100% sure locally on where the page will go.
>> 
>> We shouldn't assume the VM is stopped though. When saving to the file
>> the VM may still be active. The fixed-ram format lets us re-write the
>> same memory location on disk multiple times in this case, thus avoiding
>> growth of the file size.
>
> Before discussing on reusing multifd below, now I have a major confusing on
> the use case of the feature..
>
> The question is whether we would like to stop the VM after fixed-ram
> migration completes.  I'm asking because:
>

We would.

>   1. If it will stop, then it looks like a "VM suspend" to me. If so, could
>      anyone help explain why we don't stop the VM first then migrate?
>      Because it avoids copying single pages multiple times, no fiddling
>      with dirty tracking at all - we just don't ever track anything.  In
>      short, we'll stop the VM anyway, then why not stop it slightly
>      earlier?
>

Looking at the previous discussions I don't see explicit mentions of a
requirement either way (stop before or stop after). I agree it makes
more sense to stop the guest first and then migrate without having to
deal with dirty pages.

I presume libvirt just migrates without altering the guest run state so
we implemented this to work in both scenarios. But even then, it seems
QEMU could store the current VM state, stop it, migrate and restore the
state on the destination.

I might be missing context here since I wasn't around when this work
started. Someone correct me if I'm wrong please.

>   2. If it will not stop, then it's "VM live snapshot" to me.  We have
>      that, aren't we?  That's more efficient because it'll wr-protect all
>      guest pages, any write triggers a CoW and we only copy the guest pages
>      once and for all.
>
> Either way to go, there's no need to copy any page more than once.  Did I
> miss anything perhaps very important?
>
> I would guess it's option (1) above, because it seems we don't snapshot the
> disk alongside.  But I am really not sure now..
>


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-31 18:18               ` Fabiano Rosas
@ 2023-03-31 21:52                 ` Peter Xu
  2023-04-03  7:47                   ` Claudio Fontana
  0 siblings, 1 reply; 65+ messages in thread
From: Peter Xu @ 2023-03-31 21:52 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: Daniel P. Berrangé,
	qemu-devel, Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Juan Quintela

On Fri, Mar 31, 2023 at 03:18:37PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Fri, Mar 31, 2023 at 05:10:16PM +0100, Daniel P. Berrangé wrote:
> >> On Fri, Mar 31, 2023 at 11:55:03AM -0400, Peter Xu wrote:
> >> > On Fri, Mar 31, 2023 at 12:30:45PM -0300, Fabiano Rosas wrote:
> >> > > Peter Xu <peterx@redhat.com> writes:
> >> > > 
> >> > > > On Fri, Mar 31, 2023 at 11:37:50AM -0300, Fabiano Rosas wrote:
> >> > > >> >> Outgoing migration to file. NVMe disk. XFS filesystem.
> >> > > >> >> 
> >> > > >> >> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
> >> > > >> >>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
> >> > > >> >>   10m -v`:
> >> > > >> >> 
> >> > > >> >> migration type  | MB/s | pages/s |  ms
> >> > > >> >> ----------------+------+---------+------
> >> > > >> >> savevm io_uring |  434 |  102294 | 71473
> >> > > >> >
> >> > > >> > So I assume this is the non-live migration scenario.  Could you explain
> >> > > >> > what does io_uring mean here?
> >> > > >> >
> >> > > >> 
> >> > > >> This table is all non-live migration. This particular line is a snapshot
> >> > > >> (hmp_savevm->save_snapshot). I thought it could be relevant because it
> >> > > >> is another way by which we write RAM into disk.
> >> > > >
> >> > > > I see, so if all non-live that explains, because I was curious what's the
> >> > > > relationship between this feature and the live snapshot that QEMU also
> >> > > > supports.
> >> > > >
> >> > > > I also don't immediately see why savevm will be much slower, do you have an
> >> > > > answer?  Maybe it's somewhere but I just overlooked..
> >> > > >
> >> > > 
> >> > > I don't have a concrete answer. I could take a jab and maybe blame the
> >> > > extra memcpy for the buffer in QEMUFile? Or perhaps an unintended effect
> >> > > of bandwidth limits?
> >> > 
> >> > IMHO it would be great if this can be investigated and reasons provided in
> >> > the next cover letter.
> >> > 
> >> > > 
> >> > > > IIUC this is "vm suspend" case, so there's an extra benefit knowledge of
> >> > > > "we can stop the VM".  It smells slightly weird to build this on top of
> >> > > > "migrate" from that pov, rather than "savevm", though.  Any thoughts on
> >> > > > this aspect (on why not building this on top of "savevm")?
> >> > > >
> >> > > 
> >> > > I share the same perception. I have done initial experiments with
> >> > > savevm, but I decided to carry on the work that was already started by
> >> > > others because my understanding of the problem was yet incomplete.
> >> > > 
> >> > > One point that has been raised is that the fixed-ram format alone does
> >> > > not bring that many performance improvements. So we'll need
> >> > > multi-threading and direct-io on top of it. Re-using multifd
> >> > > infrastructure seems like it could be a good idea.
> >> > 
> >> > The thing is IMHO concurrency is not as hard if VM stopped, and when we're
> >> > 100% sure locally on where the page will go.
> >> 
> >> We shouldn't assume the VM is stopped though. When saving to the file
> >> the VM may still be active. The fixed-ram format lets us re-write the
> >> same memory location on disk multiple times in this case, thus avoiding
> >> growth of the file size.
> >
> > Before discussing on reusing multifd below, now I have a major confusing on
> > the use case of the feature..
> >
> > The question is whether we would like to stop the VM after fixed-ram
> > migration completes.  I'm asking because:
> >
> 
> We would.
> 
> >   1. If it will stop, then it looks like a "VM suspend" to me. If so, could
> >      anyone help explain why we don't stop the VM first then migrate?
> >      Because it avoids copying single pages multiple times, no fiddling
> >      with dirty tracking at all - we just don't ever track anything.  In
> >      short, we'll stop the VM anyway, then why not stop it slightly
> >      earlier?
> >
> 
> Looking at the previous discussions I don't see explicit mentions of a
> requirement either way (stop before or stop after). I agree it makes
> more sense to stop the guest first and then migrate without having to
> deal with dirty pages.
> 
> I presume libvirt just migrates without altering the guest run state so
> we implemented this to work in both scenarios. But even then, it seems
> QEMU could store the current VM state, stop it, migrate and restore the
> state on the destination.

Yes, I can understand that having a unified interface for libvirt would be
great in this case.  So I am personally not against reusing the QMP command
"migrate" if that would help in any way from libvirt's PoV.

However, this is an important question to be answered with certainty before
building more things on top.  IOW, even if we reuse QMP migrate, we could
consider a totally different implementation (e.g. not reusing the migration
thread model).

As I mentioned above, it seems ideal to always stop the VM, so the stop
could be part of the command (unlike normal QMP migrate); then it gets
closer to save_snapshot(), which has the vm_stop().  We should make sure
that when the user uses the new command it always does that, because that's
the most performant option (compared to enabling dirty tracking and
migrating live).
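
To illustrate the ordering I mean, a pseudo-code sketch - vm_stop() and
vm_start() are the existing helpers, while fixed_ram_save() and
vmstate_save_devices() are made-up names standing in for whatever the real
save path ends up being:

/* Pseudo-code sketch only; fixed_ram_save() and vmstate_save_devices()
 * are invented names, not existing QEMU functions. */
static int save_to_file_stopped(const char *filename)
{
    int ret;

    /* Stop vcpus first: no dirty tracking, each page is copied exactly once. */
    vm_stop(RUN_STATE_SAVE_VM);

    ret = fixed_ram_save(filename);           /* RAM at fixed file offsets */
    if (ret == 0) {
        ret = vmstate_save_devices(filename); /* device state after the RAM */
    }

    if (ret < 0) {
        /* Resume only on failure; on success the VM stays stopped,
         * matching the suspend-style use case. */
        vm_start();
    }
    return ret;
}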

> 
> I might be missing context here since I wasn't around when this work
> started. Someone correct me if I'm wrong please.

Yes, it would be great if someone can help clarify.

Thanks,

> 
> >   2. If it will not stop, then it's "VM live snapshot" to me.  We have
> >      that, aren't we?  That's more efficient because it'll wr-protect all
> >      guest pages, any write triggers a CoW and we only copy the guest pages
> >      once and for all.
> >
> > Either way to go, there's no need to copy any page more than once.  Did I
> > miss anything perhaps very important?
> >
> > I would guess it's option (1) above, because it seems we don't snapshot the
> > disk alongside.  But I am really not sure now..
> >
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
                   ` (26 preceding siblings ...)
  2023-03-30 21:41 ` [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Peter Xu
@ 2023-04-03  7:38 ` David Hildenbrand
  2023-04-03 14:41   ` Fabiano Rosas
  27 siblings, 1 reply; 65+ messages in thread
From: David Hildenbrand @ 2023-04-03  7:38 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

On 30.03.23 20:03, Fabiano Rosas wrote:
> Hi folks,
> 
> I'm continuing the work done last year to add a new format of
> migration stream that can be used to migrate large guests to a single
> file in a performant way.
> 
> This is an early RFC with the previous code + my additions to support
> multifd and direct IO. Let me know what you think!
> 
> Here are the reference links for previous discussions:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01813.html
> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg01338.html
> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg05536.html
> 
> The series has 4 main parts:
> 
> 1) File migration: A new "file:" migration URI. So "file:mig" does the
>     same as "exec:cat > mig". Patches 1-4 implement this;
> 
> 2) Fixed-ram format: A new format for the migration stream. Puts guest
>     pages at their relative offsets in the migration file. This saves
>     space on the worst case of RAM utilization because every page has a
>     fixed offset in the migration file and (potentially) saves us time
>     because we could write pages independently in parallel. It also
>     gives alignment guarantees so we could use O_DIRECT. Patches 5-13
>     implement this;
> 
> With patches 1-13 these two^ can be used with:
> 
> (qemu) migrate_set_capability fixed-ram on
> (qemu) migrate[_incoming] file:mig

There are some use cases (especially virtio-mem, but also virtio-balloon 
with free-page-hinting) where we end up having very sparse guest RAM. We 
don't want to have such "memory without meaning" in the migration stream 
nor restore it on the destination.

Would that still be supported with the new format? For example, have a 
sparse VM savefile and remember which ranges actually contain reasonable 
data?

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-31 21:52                 ` Peter Xu
@ 2023-04-03  7:47                   ` Claudio Fontana
  2023-04-03 19:26                     ` Peter Xu
  0 siblings, 1 reply; 65+ messages in thread
From: Claudio Fontana @ 2023-04-03  7:47 UTC (permalink / raw)
  To: Peter Xu, Fabiano Rosas
  Cc: Daniel P. Berrangé,
	qemu-devel, jfehlig, dfaggioli, dgilbert, Juan Quintela

On 3/31/23 23:52, Peter Xu wrote:
> On Fri, Mar 31, 2023 at 03:18:37PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>>
>>> On Fri, Mar 31, 2023 at 05:10:16PM +0100, Daniel P. Berrangé wrote:
>>>> On Fri, Mar 31, 2023 at 11:55:03AM -0400, Peter Xu wrote:
>>>>> On Fri, Mar 31, 2023 at 12:30:45PM -0300, Fabiano Rosas wrote:
>>>>>> Peter Xu <peterx@redhat.com> writes:
>>>>>>
>>>>>>> On Fri, Mar 31, 2023 at 11:37:50AM -0300, Fabiano Rosas wrote:
>>>>>>>>>> Outgoing migration to file. NVMe disk. XFS filesystem.
>>>>>>>>>>
>>>>>>>>>> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
>>>>>>>>>>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
>>>>>>>>>>   10m -v`:
>>>>>>>>>>
>>>>>>>>>> migration type  | MB/s | pages/s |  ms
>>>>>>>>>> ----------------+------+---------+------
>>>>>>>>>> savevm io_uring |  434 |  102294 | 71473
>>>>>>>>>
>>>>>>>>> So I assume this is the non-live migration scenario.  Could you explain
>>>>>>>>> what does io_uring mean here?
>>>>>>>>>
>>>>>>>>
>>>>>>>> This table is all non-live migration. This particular line is a snapshot
>>>>>>>> (hmp_savevm->save_snapshot). I thought it could be relevant because it
>>>>>>>> is another way by which we write RAM into disk.
>>>>>>>
>>>>>>> I see, so if all non-live that explains, because I was curious what's the
>>>>>>> relationship between this feature and the live snapshot that QEMU also
>>>>>>> supports.
>>>>>>>
>>>>>>> I also don't immediately see why savevm will be much slower, do you have an
>>>>>>> answer?  Maybe it's somewhere but I just overlooked..
>>>>>>>
>>>>>>
>>>>>> I don't have a concrete answer. I could take a jab and maybe blame the
>>>>>> extra memcpy for the buffer in QEMUFile? Or perhaps an unintended effect
>>>>>> of bandwidth limits?
>>>>>
>>>>> IMHO it would be great if this can be investigated and reasons provided in
>>>>> the next cover letter.
>>>>>
>>>>>>
>>>>>>> IIUC this is "vm suspend" case, so there's an extra benefit knowledge of
>>>>>>> "we can stop the VM".  It smells slightly weird to build this on top of
>>>>>>> "migrate" from that pov, rather than "savevm", though.  Any thoughts on
>>>>>>> this aspect (on why not building this on top of "savevm")?
>>>>>>>
>>>>>>
>>>>>> I share the same perception. I have done initial experiments with
>>>>>> savevm, but I decided to carry on the work that was already started by
>>>>>> others because my understanding of the problem was yet incomplete.
>>>>>>
>>>>>> One point that has been raised is that the fixed-ram format alone does
>>>>>> not bring that many performance improvements. So we'll need
>>>>>> multi-threading and direct-io on top of it. Re-using multifd
>>>>>> infrastructure seems like it could be a good idea.
>>>>>
>>>>> The thing is IMHO concurrency is not as hard if VM stopped, and when we're
>>>>> 100% sure locally on where the page will go.
>>>>
>>>> We shouldn't assume the VM is stopped though. When saving to the file
>>>> the VM may still be active. The fixed-ram format lets us re-write the
>>>> same memory location on disk multiple times in this case, thus avoiding
>>>> growth of the file size.
>>>
>>> Before discussing on reusing multifd below, now I have a major confusing on
>>> the use case of the feature..
>>>
>>> The question is whether we would like to stop the VM after fixed-ram
>>> migration completes.  I'm asking because:
>>>
>>
>> We would.
>>
>>>   1. If it will stop, then it looks like a "VM suspend" to me. If so, could
>>>      anyone help explain why we don't stop the VM first then migrate?
>>>      Because it avoids copying single pages multiple times, no fiddling
>>>      with dirty tracking at all - we just don't ever track anything.  In
>>>      short, we'll stop the VM anyway, then why not stop it slightly
>>>      earlier?
>>>
>>
>> Looking at the previous discussions I don't see explicit mentions of a
>> requirement either way (stop before or stop after). I agree it makes
>> more sense to stop the guest first and then migrate without having to
>> deal with dirty pages.
>>
>> I presume libvirt just migrates without altering the guest run state so
>> we implemented this to work in both scenarios. But even then, it seems
>> QEMU could store the current VM state, stop it, migrate and restore the
>> state on the destination.
> 
> Yes, I can understand having a unified interface for libvirt would be great
> in this case.  So I am personally not against reusing qmp command "migrate"
> if that would help in any case from libvirt pov.
> 
> However this is an important question to be answered very sure before
> building more things on top.  IOW, even if reusing QMP migrate, we could
> consider a totally different impl (e.g. don't reuse migration thread model).
> 
> As I mentioned above it seems just ideal we always stop the VM so it could
> be part of the command (unlike normal QMP migrate), then it's getting more
> like save_snapshot() as there's the vm_stop().  We should make sure when
> the user uses the new cmd it'll always do that because that's the most
> performant (comparing to enabling dirty tracking and live migrate).
> 
>>
>> I might be missing context here since I wasn't around when this work
>> started. Someone correct me if I'm wrong please.


Hi, I'm not sure whether what is asked for here is context in terms of the previous upstream discussions or in terms of the specific requirement we are trying to bring upstream.

In terms of the specific requirement we are trying to bring upstream, we need to get libvirt+QEMU VM save and restore functionality to be able to transfer VM sizes of ~30 GB (4/8 vcpus) in roughly 5 seconds.
When an event trigger happens, the VM needs to be quickly paused and saved to disk safely, including datasync, and another VM needs to be restored, also in ~5 secs.
For our specific requirement, the VM is never running when its data (mostly consisting of RAM) is saved.

I understand that the need to also handle the "live" case comes from upstream discussions about solving the "general case", where someone might want to do this for "live" VMs, but if helpful I want to highlight that it is not part of the specific requirement we are trying to address,
and it will not be in the future either for this specific case, as the whole point of the trigger is to replace the running VM with another VM, so it cannot be kept running.

The reason we are using "migrate" here likely stems from the fact that existing libvirt code currently uses QMP migrate to implement the save and restore commands.
And in my personal view, reusing the existing building blocks (migration, multifd) would be preferable, to avoid having to maintain two separate ways to do the same thing.

That said, it could be done in a different way if the performance can keep up; I'm just thinking of reducing the overall effort and the maintenance surface.

Ciao,

Claudio

> 
> Yes, it would be great if someone can help clarify.
> 
> Thanks,
> 
>>
>>>   2. If it will not stop, then it's "VM live snapshot" to me.  We have
>>>      that, aren't we?  That's more efficient because it'll wr-protect all
>>>      guest pages, any write triggers a CoW and we only copy the guest pages
>>>      once and for all.
>>>
>>> Either way to go, there's no need to copy any page more than once.  Did I
>>> miss anything perhaps very important?
>>>
>>> I would guess it's option (1) above, because it seems we don't snapshot the
>>> disk alongside.  But I am really not sure now..
>>>
>>
> 



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-03  7:38 ` David Hildenbrand
@ 2023-04-03 14:41   ` Fabiano Rosas
  2023-04-03 16:24     ` David Hildenbrand
  0 siblings, 1 reply; 65+ messages in thread
From: Fabiano Rosas @ 2023-04-03 14:41 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

David Hildenbrand <david@redhat.com> writes:

> On 30.03.23 20:03, Fabiano Rosas wrote:
>> Hi folks,
>> 
>> I'm continuing the work done last year to add a new format of
>> migration stream that can be used to migrate large guests to a single
>> file in a performant way.
>> 
>> This is an early RFC with the previous code + my additions to support
>> multifd and direct IO. Let me know what you think!
>> 
>> Here are the reference links for previous discussions:
>> 
>> https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01813.html
>> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg01338.html
>> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg05536.html
>> 
>> The series has 4 main parts:
>> 
>> 1) File migration: A new "file:" migration URI. So "file:mig" does the
>>     same as "exec:cat > mig". Patches 1-4 implement this;
>> 
>> 2) Fixed-ram format: A new format for the migration stream. Puts guest
>>     pages at their relative offsets in the migration file. This saves
>>     space on the worst case of RAM utilization because every page has a
>>     fixed offset in the migration file and (potentially) saves us time
>>     because we could write pages independently in parallel. It also
>>     gives alignment guarantees so we could use O_DIRECT. Patches 5-13
>>     implement this;
>> 
>> With patches 1-13 these two^ can be used with:
>> 
>> (qemu) migrate_set_capability fixed-ram on
>> (qemu) migrate[_incoming] file:mig
>
> There are some use cases (especially virtio-mem, but also virtio-balloon 
> with free-page-hinting) where we end up having very sparse guest RAM. We 
> don't want to have such "memory without meaning" in the migration stream 
> nor restore it on the destination.
>

Is that what is currently defined by ramblock_page_is_discarded ->
virtio_mem_rdm_is_populated ?

> Would that still be supported with the new format? For example, have a 
> sparse VM savefile and remember which ranges actually contain reasonable 
> data?

We do ignore zero pages, so I don't think it would be an issue to have
another criterion for ignoring pages. It seems that if we do enable
postcopy load w/ fixed-ram, that would already be handled in
postcopy_request_page.
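
Roughly, the check when populating the fixed-ram bitmap could look
something like the sketch below - ramblock_page_is_discarded() is the
helper mentioned above, the other calls mirror the existing zero-page
handling, and the function name and exact signatures are just illustrative:

/* Sketch: should this page be written to the file and set in the
 * fixed-ram bitmap?  Discarded (virtio-mem) ranges and zero pages are
 * skipped; exact helpers may differ in the final patches. */
static bool fixed_ram_page_needs_saving(RAMBlock *rb, ram_addr_t offset)
{
    if (ramblock_page_is_discarded(rb, offset)) {
        return false;  /* "memory without meaning": neither save nor restore */
    }
    if (buffer_is_zero(ramblock_ptr(rb, offset), qemu_target_page_size())) {
        return false;  /* reads back as zeros from the sparse file anyway */
    }
    return true;
}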


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-03 14:41   ` Fabiano Rosas
@ 2023-04-03 16:24     ` David Hildenbrand
  2023-04-03 16:36       ` Fabiano Rosas
  0 siblings, 1 reply; 65+ messages in thread
From: David Hildenbrand @ 2023-04-03 16:24 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

On 03.04.23 16:41, Fabiano Rosas wrote:
> David Hildenbrand <david@redhat.com> writes:
> 
>> On 30.03.23 20:03, Fabiano Rosas wrote:
>>> Hi folks,
>>>
>>> I'm continuing the work done last year to add a new format of
>>> migration stream that can be used to migrate large guests to a single
>>> file in a performant way.
>>>
>>> This is an early RFC with the previous code + my additions to support
>>> multifd and direct IO. Let me know what you think!
>>>
>>> Here are the reference links for previous discussions:
>>>
>>> https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01813.html
>>> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg01338.html
>>> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg05536.html
>>>
>>> The series has 4 main parts:
>>>
>>> 1) File migration: A new "file:" migration URI. So "file:mig" does the
>>>      same as "exec:cat > mig". Patches 1-4 implement this;
>>>
>>> 2) Fixed-ram format: A new format for the migration stream. Puts guest
>>>      pages at their relative offsets in the migration file. This saves
>>>      space on the worst case of RAM utilization because every page has a
>>>      fixed offset in the migration file and (potentially) saves us time
>>>      because we could write pages independently in parallel. It also
>>>      gives alignment guarantees so we could use O_DIRECT. Patches 5-13
>>>      implement this;
>>>
>>> With patches 1-13 these two^ can be used with:
>>>
>>> (qemu) migrate_set_capability fixed-ram on
>>> (qemu) migrate[_incoming] file:mig
>>
>> There are some use cases (especially virtio-mem, but also virtio-balloon
>> with free-page-hinting) where we end up having very sparse guest RAM. We
>> don't want to have such "memory without meaning" in the migration stream
>> nor restore it on the destination.
>>
> 
> Is that what is currently defined by ramblock_page_is_discarded ->
> virtio_mem_rdm_is_populated ?

For virtio-mem, yes. For virtio-balloon we communicate that information 
via qemu_guest_free_page_hint().

> 
>> Would that still be supported with the new format? For example, have a
>> sparse VM savefile and remember which ranges actually contain reasonable
>> data?
> 
> We do ignore zero pages, so I don't think it would be an issue to have
> another criteria for ignoring pages. It seems if we do enable postcopy
> load w/ fixed-ram that would be already handled in postcopy_request_page.

Ok, good. Just to note that we do have migration of sparse RAM blocks
working, and if fixed-ram were incompatible we'd have to fence it.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-03 16:24     ` David Hildenbrand
@ 2023-04-03 16:36       ` Fabiano Rosas
  0 siblings, 0 replies; 65+ messages in thread
From: Fabiano Rosas @ 2023-04-03 16:36 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: Claudio Fontana, jfehlig, dfaggioli, dgilbert,
	Daniel P . Berrangé,
	Juan Quintela

David Hildenbrand <david@redhat.com> writes:

> On 03.04.23 16:41, Fabiano Rosas wrote:
>> David Hildenbrand <david@redhat.com> writes:
>> 
>>> On 30.03.23 20:03, Fabiano Rosas wrote:
>>>> Hi folks,
>>>>
>>>> I'm continuing the work done last year to add a new format of
>>>> migration stream that can be used to migrate large guests to a single
>>>> file in a performant way.
>>>>
>>>> This is an early RFC with the previous code + my additions to support
>>>> multifd and direct IO. Let me know what you think!
>>>>
>>>> Here are the reference links for previous discussions:
>>>>
>>>> https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01813.html
>>>> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg01338.html
>>>> https://lists.gnu.org/archive/html/qemu-devel/2022-10/msg05536.html
>>>>
>>>> The series has 4 main parts:
>>>>
>>>> 1) File migration: A new "file:" migration URI. So "file:mig" does the
>>>>      same as "exec:cat > mig". Patches 1-4 implement this;
>>>>
>>>> 2) Fixed-ram format: A new format for the migration stream. Puts guest
>>>>      pages at their relative offsets in the migration file. This saves
>>>>      space on the worst case of RAM utilization because every page has a
>>>>      fixed offset in the migration file and (potentially) saves us time
>>>>      because we could write pages independently in parallel. It also
>>>>      gives alignment guarantees so we could use O_DIRECT. Patches 5-13
>>>>      implement this;
>>>>
>>>> With patches 1-13 these two^ can be used with:
>>>>
>>>> (qemu) migrate_set_capability fixed-ram on
>>>> (qemu) migrate[_incoming] file:mig
>>>
>>> There are some use cases (especially virtio-mem, but also virtio-balloon
>>> with free-page-hinting) where we end up having very sparse guest RAM. We
>>> don't want to have such "memory without meaning" in the migration stream
>>> nor restore it on the destination.
>>>
>> 
>> Is that what is currently defined by ramblock_page_is_discarded ->
>> virtio_mem_rdm_is_populated ?
>
> For virtio-mem, yes. For virtio-balloon we communicate that information 
> via qemu_guest_free_page_hint().
>
>> 
>>> Would that still be supported with the new format? For example, have a
>>> sparse VM savefile and remember which ranges actually contain reasonable
>>> data?
>> 
>> We do ignore zero pages, so I don't think it would be an issue to have
>> another criteria for ignoring pages. It seems if we do enable postcopy
>> load w/ fixed-ram that would be already handled in postcopy_request_page.
>
> Ok, good. Just to note that we do have migration of sparse RAM blocks 
> working and if fixed-ram would be incompatible we'd have to fence it.

Yep, thanks for the heads-up. I'll keep that in mind.


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-03  7:47                   ` Claudio Fontana
@ 2023-04-03 19:26                     ` Peter Xu
  2023-04-04  8:00                       ` Claudio Fontana
  0 siblings, 1 reply; 65+ messages in thread
From: Peter Xu @ 2023-04-03 19:26 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Fabiano Rosas, Daniel P. Berrangé,
	qemu-devel, jfehlig, dfaggioli, dgilbert, Juan Quintela

Hi, Claudio,

Thanks for the context.

On Mon, Apr 03, 2023 at 09:47:26AM +0200, Claudio Fontana wrote:
> Hi, not sure if what is asked here is context in terms of the previous
> upstream discussions or our specific requirement we are trying to bring
> upstream.
>
> In terms of the specific requirement we are trying to bring upstream, we
> need to get libvirt+QEMU VM save and restore functionality to be able to
> transfer VM sizes of ~30 GB (4/8 vcpus) in roughly 5 seconds.  When an
> event trigger happens, the VM needs to be quickly paused and saved to
> disk safely, including datasync, and another VM needs to be restored,
> also in ~5 secs.  For our specific requirement, the VM is never running
> when its data (mostly consisting of RAM) is saved.
>
> I understand that the need to handle also the "live" case comes from
> upstream discussions about solving the "general case", where someone
> might want to do this for "live" VMs, but if helpful I want to highlight
> that it is not part of the specific requirement we are trying to address,
> and for this specific case won't also in the future, as the whole point
> of the trigger is to replace the running VM with another VM, so it cannot
> be kept running.

From what I read so far, that scenario suits exactly what a live snapshot
would do with current QEMU - that at least should involve a snapshot of the
disks being used, or I can't see how it can be live.  So it looks like a
separate request.

> The reason we are using "migrate" here likely stems from the fact that
> existing libvirt code currently uses QMP migrate to implement the save
> and restore commands.  And in my personal view, I think that reusing the
> existing building blocks (migration, multifd) would be preferable, to
> avoid having to maintain two separate ways to do the same thing.  That
> said, it could be done in a different way, if the performance can keep
> up. Just thinking of reducing the overall effort and also maintenance
> surface.

I would vaguely guess the performance can not only keep up but be better
than what the current solution would provide, due to the possibility of (1)
batch handling of contiguous guest pages, and (2) no dirty tracking
overhead at all.

For (2), it's not about wr-protect page faults or vmexits due to PML being
full (because vcpus will be stopped anyway..), it's about enabling the
dirty tracking (which already contains overhead, especially when huge pages
are enabled, to split huge pages in EPT pgtables) and all the bitmap
operations QEMU does during live migration even if the VM is not live.

IMHO reusing multifd may or may not be a good idea here, because it'll of
course also complicate the multifd code, hence make multifd harder to
maintain, and not in a good way, since as I mentioned I don't think this
effort can use much of what multifd provides.

I don't have a strong opinion on the impl (even though I do have a
preference..), but I think at least we should still check on two things:

  - Being crystal clear on the use case above, and double check whether "VM
    stop" should be the default operation at the start of the new cmd - we
    shouldn't assume the user will be aware of doing this, neither should
    we assume the user is aware of the performance implications.

  - Making sure the image layout is well defined, so:

    - It'll be extensible in the future, and,

    - If someone would like to refactor it to not use the migration thread
      model anymore, the image format can hopefully be kept untouched, so
      that it stays compatible with the current approach.

Just my two cents. I think Juan should have the best grasp on this.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-03 19:26                     ` Peter Xu
@ 2023-04-04  8:00                       ` Claudio Fontana
  2023-04-04 14:53                         ` Peter Xu
  0 siblings, 1 reply; 65+ messages in thread
From: Claudio Fontana @ 2023-04-04  8:00 UTC (permalink / raw)
  To: Peter Xu
  Cc: Fabiano Rosas, Daniel P. Berrangé,
	qemu-devel, jfehlig, dfaggioli, dgilbert, Juan Quintela

Hi Peter,

On 4/3/23 21:26, Peter Xu wrote:
> Hi, Claudio,
> 
> Thanks for the context.
> 
> On Mon, Apr 03, 2023 at 09:47:26AM +0200, Claudio Fontana wrote:
>> Hi, not sure if what is asked here is context in terms of the previous
>> upstream discussions or our specific requirement we are trying to bring
>> upstream.
>>
>> In terms of the specific requirement we are trying to bring upstream, we
>> need to get libvirt+QEMU VM save and restore functionality to be able to
>> transfer VM sizes of ~30 GB (4/8 vcpus) in roughly 5 seconds.  When an
>> event trigger happens, the VM needs to be quickly paused and saved to
>> disk safely, including datasync, and another VM needs to be restored,
>> also in ~5 secs.  For our specific requirement, the VM is never running
>> when its data (mostly consisting of RAM) is saved.
>>
>> I understand that the need to handle also the "live" case comes from
>> upstream discussions about solving the "general case", where someone
>> might want to do this for "live" VMs, but if helpful I want to highlight
>> that it is not part of the specific requirement we are trying to address,
>> and for this specific case won't also in the future, as the whole point
>> of the trigger is to replace the running VM with another VM, so it cannot
>> be kept running.
> 
> From what I read so far, that scenario suites exactly what live snapshot
> would do with current QEMU - that at least should involve a snapshot on the
> disks being used or I can't see how that can be live.  So it looks like a
> separate request.
> 
>> The reason we are using "migrate" here likely stems from the fact that
>> existing libvirt code currently uses QMP migrate to implement the save
>> and restore commands.  And in my personal view, I think that reusing the
>> existing building blocks (migration, multifd) would be preferable, to
>> avoid having to maintain two separate ways to do the same thing.  That
>> said, it could be done in a different way, if the performance can keep
>> up. Just thinking of reducing the overall effort and also maintenance
>> surface.
> 
> I would vaguely guess the performance can not only keep up but better than
> what the current solution would provide, due to the possibility of (1)
> batch handling of continuous guest pages, and (2) completely no dirty
> tracking overhead.
> 
> For (2), it's not about wr-protect page faults or vmexits due to PML being
> full (because vcpus will be stopped anyway..), it's about enabling the
> dirty tracking (which already contains overhead, especially when huge pages
> are enabled, to split huge pages in EPT pgtables) and all the bitmap
> operations QEMU does during live migration even if the VM is not live.

Something we could profile for; I do not remember it being a really important source of overhead in my previous profiling runs,
but it may be worthwhile to redo the profiling with Fabiano's patchset.

> 
> IMHO reusing multifd may or may not be a good idea here, because it'll of
> course also complicate multifd code, hence makes multifd harder to
> maintain, while not in a good way, because as I mentioned I don't think it
> can use much of what multifd provides.


The main advantage we get is the automatic multithreading of the qemu_savevm_state_iterate code in my view.

Reimplementing the same thing again has the potential to cause bitrot for this use case. Using multiple fds for the transfer is exactly what is needed here,
and in my understanding that is the exact reason multifd exists: to take advantage of high-bandwidth migration channels.

The only adjustment needed to multifd is the ability to work with block devices (file fds) as the migration channels instead of just sockets,
so it seems a very natural extension of multifd to me.
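
Conceptually it is only a question of where the channel comes from, i.e. something
like this (hand-waving, not the actual patch code, just to illustrate; the file
name is only an example):

    Error *local_err = NULL;
    QIOChannelFile *ioc;

    /* open the output file instead of connecting a socket */
    ioc = qio_channel_file_new_path("migfile", O_WRONLY | O_CREAT, 0600,
                                    &local_err);
    /* ...then hand QIO_CHANNEL(ioc) to the multifd send thread setup,
     * where a socket channel would normally be used */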

> 
> I don't have a strong opinion on the impl (even though I do have a
> preference..), but I think at least we should still check on two things:
> 
>   - Being crystal clear on the use case above, and double check whether "VM
>     stop" should be the default operation at the start of the new cmd - we
>     shouldn't assume the user will be aware of doing this, neither should
>     we assume the user is aware of the performance implications.


Not sure I can identify what you are asking specifically: the use case is to stop executing the currently running VM as soon as possible, save it to disk, then restore another VM as soon as possible.
Probably I missed something there.

> 
>   - Making sure the image layout is well defined, so:
> 
>     - It'll be extensible in the future, and,
> 
>     - If someone would like to refactor it to not use the migration thread
>       model anymore, the image format, hopefully, can be easy to keep
>       untouched so it can be compatible with the current approach.
> 
> Just my two cents. I think Juan should have the best grasp on this.
> 
> Thanks,
> 

Ciao,

Claudio



* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-04  8:00                       ` Claudio Fontana
@ 2023-04-04 14:53                         ` Peter Xu
  2023-04-04 15:10                           ` Claudio Fontana
  0 siblings, 1 reply; 65+ messages in thread
From: Peter Xu @ 2023-04-04 14:53 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Fabiano Rosas, Daniel P. Berrangé,
	qemu-devel, jfehlig, dfaggioli, dgilbert, Juan Quintela

On Tue, Apr 04, 2023 at 10:00:16AM +0200, Claudio Fontana wrote:
> Hi Peter,

Hi, Claudio,

> 
> On 4/3/23 21:26, Peter Xu wrote:
> > Hi, Claudio,
> > 
> > Thanks for the context.
> > 
> > On Mon, Apr 03, 2023 at 09:47:26AM +0200, Claudio Fontana wrote:
> >> Hi, not sure if what is asked here is context in terms of the previous
> >> upstream discussions or our specific requirement we are trying to bring
> >> upstream.
> >>
> >> In terms of the specific requirement we are trying to bring upstream, we
> >> need to get libvirt+QEMU VM save and restore functionality to be able to
> >> transfer VM sizes of ~30 GB (4/8 vcpus) in roughly 5 seconds.  When an
> >> event trigger happens, the VM needs to be quickly paused and saved to
> >> disk safely, including datasync, and another VM needs to be restored,
> >> also in ~5 secs.  For our specific requirement, the VM is never running
> >> when its data (mostly consisting of RAM) is saved.
> >>
> >> I understand that the need to handle also the "live" case comes from
> >> upstream discussions about solving the "general case", where someone
> >> might want to do this for "live" VMs, but if helpful I want to highlight
> >> that it is not part of the specific requirement we are trying to address,
> >> and for this specific case won't also in the future, as the whole point
> >> of the trigger is to replace the running VM with another VM, so it cannot
> >> be kept running.
> > 
> > From what I read so far, that scenario suites exactly what live snapshot
> > would do with current QEMU - that at least should involve a snapshot on the
> > disks being used or I can't see how that can be live.  So it looks like a
> > separate request.
> > 
> >> The reason we are using "migrate" here likely stems from the fact that
> >> existing libvirt code currently uses QMP migrate to implement the save
> >> and restore commands.  And in my personal view, I think that reusing the
> >> existing building blocks (migration, multifd) would be preferable, to
> >> avoid having to maintain two separate ways to do the same thing.  That
> >> said, it could be done in a different way, if the performance can keep
> >> up. Just thinking of reducing the overall effort and also maintenance
> >> surface.
> > 
> > I would vaguely guess the performance can not only keep up but better than
> > what the current solution would provide, due to the possibility of (1)
> > batch handling of continuous guest pages, and (2) completely no dirty
> > tracking overhead.
> > 
> > For (2), it's not about wr-protect page faults or vmexits due to PML being
> > full (because vcpus will be stopped anyway..), it's about enabling the
> > dirty tracking (which already contains overhead, especially when huge pages
> > are enabled, to split huge pages in EPT pgtables) and all the bitmap
> > operations QEMU does during live migration even if the VM is not live.
> 
> something we could profile for, I do not remember it being really an important source of overhead in my previous profile runs,
> but maybe worthwhile redoing the profiling with Fabiano's patchset.

Yes, I don't know the detailed numbers either; it should depend on the guest
configuration (mem size, mem type, kernel version etc).  It could be less of a
concern compared to the time used elsewhere.  More on this below.

> 
> > 
> > IMHO reusing multifd may or may not be a good idea here, because it'll of
> > course also complicate multifd code, hence makes multifd harder to
> > maintain, while not in a good way, because as I mentioned I don't think it
> > can use much of what multifd provides.
> 
> 
> The main advantage we get is the automatic multithreading of the qemu_savevm_state_iterate code in my view.
> 
> Reimplementing the same thing again has the potential to cause bitrot for this use case, and using multiple fds for the transfer is exactly what is needed here,
> and in my understanding the same exact reason multifd exists: to take advantage of high bandwidth migration channels.
> 
> The only adjustment needed to multifd is the ability to work with block devices (file fds) as the migration channels instead of just sockets,
> so it seems a very natural extension of multifd to me.

Yes - since I haven't looked at the multifd patches at all, I don't have a
solid clue on how much it'll affect multifd.  I'll leave that to Juan.

> 
> > 
> > I don't have a strong opinion on the impl (even though I do have a
> > preference..), but I think at least we should still check on two things:
> > 
> >   - Being crystal clear on the use case above, and double check whether "VM
> >     stop" should be the default operation at the start of the new cmd - we
> >     shouldn't assume the user will be aware of doing this, neither should
> >     we assume the user is aware of the performance implications.
> 
> 
> Not sure I can identify what you are asking specifically: the use case is to stop executing the currently running VM as soon as possible, save it to disk, then restore another VM as soon as possible.
> Probably I missed something there.

Yes, then IMHO as mentioned we should make "vm stop" part of the command
procedure if vm was still running when invoked.  Then we can already
optimize dirty logging of above (2) with the current framework. E.g., we
already optimized live snapshot to not enable dirty logging:

        if (!migrate_background_snapshot()) {
            memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
            migration_bitmap_sync_precopy(rs);
        }

Maybe that can also be done for fixed-ram migration, so no matter how much
overhead there will be, that can be avoided.
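
Concretely, the existing check could perhaps be extended like this (untested;
migrate_fixed_ram() here just stands for whatever the capability helper in
this series ends up being called):

        /* skip dirty logging when the guest is saved with vcpus stopped */
        if (!migrate_background_snapshot() && !migrate_fixed_ram()) {
            memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
            migration_bitmap_sync_precopy(rs);
        }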

PS: I think similar optimizations can also be done in ram_save_complete() or
ram_state_pending_exact()... maybe we should move the check into
migration_bitmap_sync_precopy() so the whole sync can be skipped when possible.

Thanks,

> 
> > 
> >   - Making sure the image layout is well defined, so:
> > 
> >     - It'll be extensible in the future, and,
> > 
> >     - If someone would like to refactor it to not use the migration thread
> >       model anymore, the image format, hopefully, can be easy to keep
> >       untouched so it can be compatible with the current approach.
> > 
> > Just my two cents. I think Juan should have the best grasp on this.
> > 
> > Thanks,
> > 
> 
> Ciao,
> 
> Claudio
> 

-- 
Peter Xu




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-04 14:53                         ` Peter Xu
@ 2023-04-04 15:10                           ` Claudio Fontana
  2023-04-04 15:56                             ` Peter Xu
  0 siblings, 1 reply; 65+ messages in thread
From: Claudio Fontana @ 2023-04-04 15:10 UTC (permalink / raw)
  To: Peter Xu
  Cc: Fabiano Rosas, Daniel P. Berrangé,
	qemu-devel, jfehlig, dfaggioli, dgilbert, Juan Quintela

On 4/4/23 16:53, Peter Xu wrote:
> On Tue, Apr 04, 2023 at 10:00:16AM +0200, Claudio Fontana wrote:
>> Hi Peter,
> 
> Hi, Claudio,
> 
>>
>> On 4/3/23 21:26, Peter Xu wrote:
>>> Hi, Claudio,
>>>
>>> Thanks for the context.
>>>
>>> On Mon, Apr 03, 2023 at 09:47:26AM +0200, Claudio Fontana wrote:
>>>> Hi, not sure if what is asked here is context in terms of the previous
>>>> upstream discussions or our specific requirement we are trying to bring
>>>> upstream.
>>>>
>>>> In terms of the specific requirement we are trying to bring upstream, we
>>>> need to get libvirt+QEMU VM save and restore functionality to be able to
>>>> transfer VM sizes of ~30 GB (4/8 vcpus) in roughly 5 seconds.  When an
>>>> event trigger happens, the VM needs to be quickly paused and saved to
>>>> disk safely, including datasync, and another VM needs to be restored,
>>>> also in ~5 secs.  For our specific requirement, the VM is never running
>>>> when its data (mostly consisting of RAM) is saved.
>>>>
>>>> I understand that the need to handle also the "live" case comes from
>>>> upstream discussions about solving the "general case", where someone
>>>> might want to do this for "live" VMs, but if helpful I want to highlight
>>>> that it is not part of the specific requirement we are trying to address,
>>>> and for this specific case won't also in the future, as the whole point
>>>> of the trigger is to replace the running VM with another VM, so it cannot
>>>> be kept running.
>>>
>>> From what I read so far, that scenario suites exactly what live snapshot
>>> would do with current QEMU - that at least should involve a snapshot on the
>>> disks being used or I can't see how that can be live.  So it looks like a
>>> separate request.
>>>
>>>> The reason we are using "migrate" here likely stems from the fact that
>>>> existing libvirt code currently uses QMP migrate to implement the save
>>>> and restore commands.  And in my personal view, I think that reusing the
>>>> existing building blocks (migration, multifd) would be preferable, to
>>>> avoid having to maintain two separate ways to do the same thing.  That
>>>> said, it could be done in a different way, if the performance can keep
>>>> up. Just thinking of reducing the overall effort and also maintenance
>>>> surface.
>>>
>>> I would vaguely guess the performance can not only keep up but better than
>>> what the current solution would provide, due to the possibility of (1)
>>> batch handling of continuous guest pages, and (2) completely no dirty
>>> tracking overhead.
>>>
>>> For (2), it's not about wr-protect page faults or vmexits due to PML being
>>> full (because vcpus will be stopped anyway..), it's about enabling the
>>> dirty tracking (which already contains overhead, especially when huge pages
>>> are enabled, to split huge pages in EPT pgtables) and all the bitmap
>>> operations QEMU does during live migration even if the VM is not live.
>>
>> something we could profile for, I do not remember it being really an important source of overhead in my previous profile runs,
>> but maybe worthwhile redoing the profiling with Fabiano's patchset.
> 
> Yes I don't know the detailed number either, it should depend on the guest
> configuration (mem size, mem type, kernel version etc).  It could be less a
> concern comparing to the time used elsewhere.  More on this on below.
> 
>>
>>>
>>> IMHO reusing multifd may or may not be a good idea here, because it'll of
>>> course also complicate multifd code, hence makes multifd harder to
>>> maintain, while not in a good way, because as I mentioned I don't think it
>>> can use much of what multifd provides.
>>
>>
>> The main advantage we get is the automatic multithreading of the qemu_savevm_state_iterate code in my view.
>>
>> Reimplementing the same thing again has the potential to cause bitrot for this use case, and using multiple fds for the transfer is exactly what is needed here,
>> and in my understanding the same exact reason multifd exists: to take advantage of high bandwidth migration channels.
>>
>> The only adjustment needed to multifd is the ability to work with block devices (file fds) as the migration channels instead of just sockets,
>> so it seems a very natural extension of multifd to me.
> 
> Yes, since I haven't looked at the multifd patches at all so I don't have
> solid clue on how much it'll affect multifd.  I'll leave that to Juan.
> 
>>
>>>
>>> I don't have a strong opinion on the impl (even though I do have a
>>> preference..), but I think at least we should still check on two things:
>>>
>>>   - Being crystal clear on the use case above, and double check whether "VM
>>>     stop" should be the default operation at the start of the new cmd - we
>>>     shouldn't assume the user will be aware of doing this, neither should
>>>     we assume the user is aware of the performance implications.
>>
>>
>> Not sure I can identify what you are asking specifically: the use case is to stop executing the currently running VM as soon as possible, save it to disk, then restore another VM as soon as possible.
>> Probably I missed something there.
> 
> Yes, then IMHO as mentioned we should make "vm stop" part of the command
> procedure if vm was still running when invoked.  Then we can already
> optimize dirty logging of above (2) with the current framework. E.g., we
> already optimized live snapshot to not enable dirty logging:
> 
>         if (!migrate_background_snapshot()) {
>             memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
>             migration_bitmap_sync_precopy(rs);
>         }
> 
> Maybe that can also be done for fixed-ram migration, so no matter how much
> overhead there will be, that can be avoided.

Understood, agree.

Would it make sense to check for something like if (!runstate_is_running())
instead of checking for the specific multifd + fixed-ram feature?

I think from a high level perspective, there should not be dirtying if the vcpus are not running right?
This could even be a bit more future proof to avoid checking for many features, if they all happen to share the fact that vcpus are not running.
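
Something along these lines, purely to illustrate the idea (untested):

        /* if the vcpus are not running, nothing should be dirtying guest RAM */
        if (runstate_is_running()) {
            memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
            migration_bitmap_sync_precopy(rs);
        }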

> 
> PS: I think similar optimizations can be done too in ram_save_complete() or
> ram_state_pending_exact().. maybe we should move the check into
> migration_bitmap_sync_precopy() so it can be skipped as a whole when it can.

makes sense, interesting.

I wonder if ramblock_is_ignored() could be optimized a bit too, since it seems to consume roughly the same amount of cpu as the dirty bitmap handling, even when "ignore-shared" is not used.

this feature was added by:

commit fbd162e629aaf8a7e464af44d2f73d06b26428ad
Author: Yury Kotov <yury-kotov@yandex-team.ru>
Date:   Fri Feb 15 20:45:46 2019 +0300

    migration: Add an ability to ignore shared RAM blocks
    
    If ignore-shared capability is set then skip shared RAMBlocks during the
    RAM migration.
    Also, move qemu_ram_foreach_migratable_block (and rename) to the
    migration code, because it requires access to the migration capabilities.
    
    Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
    Message-Id: <20190215174548.2630-4-yury-kotov@yandex-team.ru>
    Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
    Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
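
If I remember correctly, the check itself boils down to something like this
(quoting from memory, so it may be slightly off):

    return !qemu_ram_is_migratable(block) ||
           (migrate_ignore_shared() && qemu_ram_is_shared(block));

so the cost likely comes from how often it gets called rather than from the
check itself.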

Probably not that important, just to mention since we were thinking of possible small optimizations.
I would like to share the complete previous callgrind data, but I cannot find a way to export it in a readable state; I could export the graph as a PDF though, if helpful.

Likely we'd need a new round of measurements with perf...

Ciao,

Claudio

> 
> Thanks,
> 
>>
>>>
>>>   - Making sure the image layout is well defined, so:
>>>
>>>     - It'll be extensible in the future, and,
>>>
>>>     - If someone would like to refactor it to not use the migration thread
>>>       model anymore, the image format, hopefully, can be easy to keep
>>>       untouched so it can be compatible with the current approach.
>>>
>>> Just my two cents. I think Juan should have the best grasp on this.
>>>
>>> Thanks,
>>>
>>
>> Ciao,
>>
>> Claudio
>>
> 




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-04 15:10                           ` Claudio Fontana
@ 2023-04-04 15:56                             ` Peter Xu
  2023-04-06 16:46                               ` Fabiano Rosas
  0 siblings, 1 reply; 65+ messages in thread
From: Peter Xu @ 2023-04-04 15:56 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Fabiano Rosas, Daniel P. Berrangé,
	qemu-devel, jfehlig, dfaggioli, dgilbert, Juan Quintela

On Tue, Apr 04, 2023 at 05:10:52PM +0200, Claudio Fontana wrote:
> On 4/4/23 16:53, Peter Xu wrote:
> > On Tue, Apr 04, 2023 at 10:00:16AM +0200, Claudio Fontana wrote:
> >> Hi Peter,
> > 
> > Hi, Claudio,
> > 
> >>
> >> On 4/3/23 21:26, Peter Xu wrote:
> >>> Hi, Claudio,
> >>>
> >>> Thanks for the context.
> >>>
> >>> On Mon, Apr 03, 2023 at 09:47:26AM +0200, Claudio Fontana wrote:
> >>>> Hi, not sure if what is asked here is context in terms of the previous
> >>>> upstream discussions or our specific requirement we are trying to bring
> >>>> upstream.
> >>>>
> >>>> In terms of the specific requirement we are trying to bring upstream, we
> >>>> need to get libvirt+QEMU VM save and restore functionality to be able to
> >>>> transfer VM sizes of ~30 GB (4/8 vcpus) in roughly 5 seconds.  When an
> >>>> event trigger happens, the VM needs to be quickly paused and saved to
> >>>> disk safely, including datasync, and another VM needs to be restored,
> >>>> also in ~5 secs.  For our specific requirement, the VM is never running
> >>>> when its data (mostly consisting of RAM) is saved.
> >>>>
> >>>> I understand that the need to handle also the "live" case comes from
> >>>> upstream discussions about solving the "general case", where someone
> >>>> might want to do this for "live" VMs, but if helpful I want to highlight
> >>>> that it is not part of the specific requirement we are trying to address,
> >>>> and for this specific case won't also in the future, as the whole point
> >>>> of the trigger is to replace the running VM with another VM, so it cannot
> >>>> be kept running.
> >>>
> >>> From what I read so far, that scenario suites exactly what live snapshot
> >>> would do with current QEMU - that at least should involve a snapshot on the
> >>> disks being used or I can't see how that can be live.  So it looks like a
> >>> separate request.
> >>>
> >>>> The reason we are using "migrate" here likely stems from the fact that
> >>>> existing libvirt code currently uses QMP migrate to implement the save
> >>>> and restore commands.  And in my personal view, I think that reusing the
> >>>> existing building blocks (migration, multifd) would be preferable, to
> >>>> avoid having to maintain two separate ways to do the same thing.  That
> >>>> said, it could be done in a different way, if the performance can keep
> >>>> up. Just thinking of reducing the overall effort and also maintenance
> >>>> surface.
> >>>
> >>> I would vaguely guess the performance can not only keep up but better than
> >>> what the current solution would provide, due to the possibility of (1)
> >>> batch handling of continuous guest pages, and (2) completely no dirty
> >>> tracking overhead.
> >>>
> >>> For (2), it's not about wr-protect page faults or vmexits due to PML being
> >>> full (because vcpus will be stopped anyway..), it's about enabling the
> >>> dirty tracking (which already contains overhead, especially when huge pages
> >>> are enabled, to split huge pages in EPT pgtables) and all the bitmap
> >>> operations QEMU does during live migration even if the VM is not live.
> >>
> >> something we could profile for, I do not remember it being really an important source of overhead in my previous profile runs,
> >> but maybe worthwhile redoing the profiling with Fabiano's patchset.
> > 
> > Yes I don't know the detailed number either, it should depend on the guest
> > configuration (mem size, mem type, kernel version etc).  It could be less a
> > concern comparing to the time used elsewhere.  More on this on below.
> > 
> >>
> >>>
> >>> IMHO reusing multifd may or may not be a good idea here, because it'll of
> >>> course also complicate multifd code, hence makes multifd harder to
> >>> maintain, while not in a good way, because as I mentioned I don't think it
> >>> can use much of what multifd provides.
> >>
> >>
> >> The main advantage we get is the automatic multithreading of the qemu_savevm_state_iterate code in my view.
> >>
> >> Reimplementing the same thing again has the potential to cause bitrot for this use case, and using multiple fds for the transfer is exactly what is needed here,
> >> and in my understanding the same exact reason multifd exists: to take advantage of high bandwidth migration channels.
> >>
> >> The only adjustment needed to multifd is the ability to work with block devices (file fds) as the migration channels instead of just sockets,
> >> so it seems a very natural extension of multifd to me.
> > 
> > Yes, since I haven't looked at the multifd patches at all so I don't have
> > solid clue on how much it'll affect multifd.  I'll leave that to Juan.
> > 
> >>
> >>>
> >>> I don't have a strong opinion on the impl (even though I do have a
> >>> preference..), but I think at least we should still check on two things:
> >>>
> >>>   - Being crystal clear on the use case above, and double check whether "VM
> >>>     stop" should be the default operation at the start of the new cmd - we
> >>>     shouldn't assume the user will be aware of doing this, neither should
> >>>     we assume the user is aware of the performance implications.
> >>
> >>
> >> Not sure I can identify what you are asking specifically: the use case is to stop executing the currently running VM as soon as possible, save it to disk, then restore another VM as soon as possible.
> >> Probably I missed something there.
> > 
> > Yes, then IMHO as mentioned we should make "vm stop" part of the command
> > procedure if vm was still running when invoked.  Then we can already
> > optimize dirty logging of above (2) with the current framework. E.g., we
> > already optimized live snapshot to not enable dirty logging:
> > 
> >         if (!migrate_background_snapshot()) {
> >             memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
> >             migration_bitmap_sync_precopy(rs);
> >         }
> > 
> > Maybe that can also be done for fixed-ram migration, so no matter how much
> > overhead there will be, that can be avoided.
> 
> Understood, agree.
> 
> Would it make sense to check for something like if (!runstate_is_running())
> instead of checking for the specific multifd + fixed-ram feature?
> 
> I think from a high level perspective, there should not be dirtying if the vcpus are not running right?
> This could even be a bit more future proof to avoid checking for many features, if they all happen to share the fact that vcpus are not running.

Hmm, I'm not sure.  I think we still allow users to stop/start VMs during
migration?  If so, it's probably not applicable.

And it won't cover live snapshot either - live snapshot always runs with the
VM running, but it doesn't need this kind of dirty tracking.  It actually
needs to track dirty pages, but in a synchronous way to make it efficient
(while KVM dirty tracking is asynchronous, i.e. a vcpu won't be blocked when
it dirties a page).

So here we can make it "if (migrate_needs_async_dirty_tracking())", and have
both live snapshot and fixed-ram migration covered in the helper to opt out
of dirty tracking.
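
A rough sketch of that helper (untested; migrate_fixed_ram() is a placeholder
for the capability helper from this series), to be used in place of the plain
!migrate_background_snapshot() check:

        static bool migrate_needs_async_dirty_tracking(void)
        {
            /*
             * Background snapshot tracks dirty pages synchronously, and
             * fixed-ram saves the VM with the vcpus stopped, so neither
             * needs the asynchronous dirty tracking.
             */
            return !migrate_background_snapshot() && !migrate_fixed_ram();
        }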

One thing worth keeping an eye on here is that if we go that way we need to
make sure the VM won't be started during the fixed-ram migration.  IOW, we
can cancel the fixed-ram migration (in this case more suitably called
"vm suspend") if the user starts the VM during the process.

> 
> > 
> > PS: I think similar optimizations can be done too in ram_save_complete() or
> > ram_state_pending_exact().. maybe we should move the check into
> > migration_bitmap_sync_precopy() so it can be skipped as a whole when it can.
> 
> makes sense, interesting.
> 
> I wonder if ramblock_is_ignored() could be optimized a bit too, since it seems to consume roughly the same amount of cpu as the dirty bitmap handling, even when "ignore-shared" is not used.

Do you mean we can skip dirty tracking when ramblock_is_ignored() for a
ramblock?  I think it's doable but it'll be slightly more involved, because
ignored/shared ramblocks can be used together with private/non-ignored
ramblocks, hence at least it's not applicable globally.

> 
> this feature was added by:
> 
> commit fbd162e629aaf8a7e464af44d2f73d06b26428ad
> Author: Yury Kotov <yury-kotov@yandex-team.ru>
> Date:   Fri Feb 15 20:45:46 2019 +0300
> 
>     migration: Add an ability to ignore shared RAM blocks
>     
>     If ignore-shared capability is set then skip shared RAMBlocks during the
>     RAM migration.
>     Also, move qemu_ram_foreach_migratable_block (and rename) to the
>     migration code, because it requires access to the migration capabilities.
>     
>     Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
>     Message-Id: <20190215174548.2630-4-yury-kotov@yandex-team.ru>
>     Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>     Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 
> Probably not that important, just to mention since we were thinking of possible small optimizations.
> I would like to share the complete previous callgrind data, but cannot find a way to export them in a readable state, could export the graph though as PDF if helpful.
> 
> Likely we'd need a new round of measurements with perf...

Yes, it would be good to know. That said, I think it'll also be fine if
optimizations are done on top, as long as the change is compatible
with the interface being proposed.

Here e.g. "stop the VM within the cmd" is part of the interface, so IMHO it
should be decided before this series gets merged.

Thanks.

> 
> Ciao,
> 
> Claudio
> 
> > 
> > Thanks,
> > 
> >>
> >>>
> >>>   - Making sure the image layout is well defined, so:
> >>>
> >>>     - It'll be extensible in the future, and,
> >>>
> >>>     - If someone would like to refactor it to not use the migration thread
> >>>       model anymore, the image format, hopefully, can be easy to keep
> >>>       untouched so it can be compatible with the current approach.
> >>>
> >>> Just my two cents. I think Juan should have the best grasp on this.
> >>>
> >>> Thanks,
> >>>
> >>
> >> Ciao,
> >>
> >> Claudio
> >>
> > 
> 

-- 
Peter Xu




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-04 15:56                             ` Peter Xu
@ 2023-04-06 16:46                               ` Fabiano Rosas
  2023-04-07 10:36                                 ` Claudio Fontana
  0 siblings, 1 reply; 65+ messages in thread
From: Fabiano Rosas @ 2023-04-06 16:46 UTC (permalink / raw)
  To: Peter Xu, Claudio Fontana
  Cc: Daniel P. Berrangé,
	qemu-devel, jfehlig, dfaggioli, dgilbert, Juan Quintela

Peter Xu <peterx@redhat.com> writes:

> On Tue, Apr 04, 2023 at 05:10:52PM +0200, Claudio Fontana wrote:
>> On 4/4/23 16:53, Peter Xu wrote:
>> > On Tue, Apr 04, 2023 at 10:00:16AM +0200, Claudio Fontana wrote:
>> >> Hi Peter,
>> > 
>> > Hi, Claudio,
>> > 
>> >>
>> >> On 4/3/23 21:26, Peter Xu wrote:
>> >>> Hi, Claudio,
>> >>>
>> >>> Thanks for the context.
>> >>>
>> >>> On Mon, Apr 03, 2023 at 09:47:26AM +0200, Claudio Fontana wrote:
>> >>>> Hi, not sure if what is asked here is context in terms of the previous
>> >>>> upstream discussions or our specific requirement we are trying to bring
>> >>>> upstream.
>> >>>>
>> >>>> In terms of the specific requirement we are trying to bring upstream, we
>> >>>> need to get libvirt+QEMU VM save and restore functionality to be able to
>> >>>> transfer VM sizes of ~30 GB (4/8 vcpus) in roughly 5 seconds.  When an
>> >>>> event trigger happens, the VM needs to be quickly paused and saved to
>> >>>> disk safely, including datasync, and another VM needs to be restored,
>> >>>> also in ~5 secs.  For our specific requirement, the VM is never running
>> >>>> when its data (mostly consisting of RAM) is saved.
>> >>>>
>> >>>> I understand that the need to handle also the "live" case comes from
>> >>>> upstream discussions about solving the "general case", where someone
>> >>>> might want to do this for "live" VMs, but if helpful I want to highlight
>> >>>> that it is not part of the specific requirement we are trying to address,
>> >>>> and for this specific case won't also in the future, as the whole point
>> >>>> of the trigger is to replace the running VM with another VM, so it cannot
>> >>>> be kept running.
>> >>>
>> >>> From what I read so far, that scenario suites exactly what live snapshot
>> >>> would do with current QEMU - that at least should involve a snapshot on the
>> >>> disks being used or I can't see how that can be live.  So it looks like a
>> >>> separate request.
>> >>>
>> >>>> The reason we are using "migrate" here likely stems from the fact that
>> >>>> existing libvirt code currently uses QMP migrate to implement the save
>> >>>> and restore commands.  And in my personal view, I think that reusing the
>> >>>> existing building blocks (migration, multifd) would be preferable, to
>> >>>> avoid having to maintain two separate ways to do the same thing.  That
>> >>>> said, it could be done in a different way, if the performance can keep
>> >>>> up. Just thinking of reducing the overall effort and also maintenance
>> >>>> surface.
>> >>>
>> >>> I would vaguely guess the performance can not only keep up but better than
>> >>> what the current solution would provide, due to the possibility of (1)
>> >>> batch handling of continuous guest pages, and (2) completely no dirty
>> >>> tracking overhead.
>> >>>
>> >>> For (2), it's not about wr-protect page faults or vmexits due to PML being
>> >>> full (because vcpus will be stopped anyway..), it's about enabling the
>> >>> dirty tracking (which already contains overhead, especially when huge pages
>> >>> are enabled, to split huge pages in EPT pgtables) and all the bitmap
>> >>> operations QEMU does during live migration even if the VM is not live.
>> >>
>> >> something we could profile for, I do not remember it being really an important source of overhead in my previous profile runs,
>> >> but maybe worthwhile redoing the profiling with Fabiano's patchset.
>> > 
>> > Yes I don't know the detailed number either, it should depend on the guest
>> > configuration (mem size, mem type, kernel version etc).  It could be less a
>> > concern comparing to the time used elsewhere.  More on this on below.
>> > 
>> >>
>> >>>
>> >>> IMHO reusing multifd may or may not be a good idea here, because it'll of
>> >>> course also complicate multifd code, hence makes multifd harder to
>> >>> maintain, while not in a good way, because as I mentioned I don't think it
>> >>> can use much of what multifd provides.
>> >>
>> >>
>> >> The main advantage we get is the automatic multithreading of the qemu_savevm_state_iterate code in my view.
>> >>
>> >> Reimplementing the same thing again has the potential to cause bitrot for this use case, and using multiple fds for the transfer is exactly what is needed here,
>> >> and in my understanding the same exact reason multifd exists: to take advantage of high bandwidth migration channels.
>> >>
>> >> The only adjustment needed to multifd is the ability to work with block devices (file fds) as the migration channels instead of just sockets,
>> >> so it seems a very natural extension of multifd to me.
>> > 
>> > Yes, since I haven't looked at the multifd patches at all so I don't have
>> > solid clue on how much it'll affect multifd.  I'll leave that to Juan.
>> > 
>> >>
>> >>>
>> >>> I don't have a strong opinion on the impl (even though I do have a
>> >>> preference..), but I think at least we should still check on two things:
>> >>>
>> >>>   - Being crystal clear on the use case above, and double check whether "VM
>> >>>     stop" should be the default operation at the start of the new cmd - we
>> >>>     shouldn't assume the user will be aware of doing this, neither should
>> >>>     we assume the user is aware of the performance implications.
>> >>
>> >>
>> >> Not sure I can identify what you are asking specifically: the use case is to stop executing the currently running VM as soon as possible, save it to disk, then restore another VM as soon as possible.
>> >> Probably I missed something there.
>> > 
>> > Yes, then IMHO as mentioned we should make "vm stop" part of the command
>> > procedure if vm was still running when invoked.  Then we can already
>> > optimize dirty logging of above (2) with the current framework. E.g., we
>> > already optimized live snapshot to not enable dirty logging:
>> > 
>> >         if (!migrate_background_snapshot()) {
>> >             memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
>> >             migration_bitmap_sync_precopy(rs);
>> >         }
>> > 
>> > Maybe that can also be done for fixed-ram migration, so no matter how much
>> > overhead there will be, that can be avoided.
>> 
>> Understood, agree.
>> 
>> Would it make sense to check for something like if (!runstate_is_running())
>> instead of checking for the specific multifd + fixed-ram feature?
>> 
>> I think from a high level perspective, there should not be dirtying if the vcpus are not running right?
>> This could even be a bit more future proof to avoid checking for many features, if they all happen to share the fact that vcpus are not running.
>
> Hmm I'm not sure.  I think we still allow use to stop/start VMs during
> migration?  If so, probably not applicable.
>
> And it won't cover live snapshot too - live snapshot always run with VM
> running, but it doesn't need to track dirty.  It actually needs to track
> dirty, but in a synchronous way to make it efficient (while kvm dirty
> tracking is asynchronous, aka, vcpu won't be blocked if dirtied).
>
> So here we can make it "if (migrate_needs_async_dirty_tracking())", and
> having both live snapshot and fixed-ram migration covered in the helper to
> opt-out dirty tracking.
>
> One thing worth keeping an eye here is if we go that way we need to make
> sure VM won't be started during the fixed-ram migration.  IOW, we can
> cancel the fixed-ram migration (in this case, more suitable to be called
> "vm suspend") if the user starts the VM during the process.
>
>> 
>> > 
>> > PS: I think similar optimizations can be done too in ram_save_complete() or
>> > ram_state_pending_exact().. maybe we should move the check into
>> > migration_bitmap_sync_precopy() so it can be skipped as a whole when it can.
>> 
>> makes sense, interesting.
>> 
>> I wonder if ramblock_is_ignored() could be optimized a bit too, since it seems to consume roughly the same amount of cpu as the dirty bitmap handling, even when "ignore-shared" is not used.
>
> Do you mean we can skip dirty tracking when ramblock_is_ignored() for a
> ramblock?  I think it's doable but it'll be slightly more involved, because
> ignored/shared ramblocks can be used together with private/non-ignored
> ramblocks, hence at least it's not applicable globally.
>
>> 
>> this feature was added by:
>> 
>> commit fbd162e629aaf8a7e464af44d2f73d06b26428ad
>> Author: Yury Kotov <yury-kotov@yandex-team.ru>
>> Date:   Fri Feb 15 20:45:46 2019 +0300
>> 
>>     migration: Add an ability to ignore shared RAM blocks
>>     
>>     If ignore-shared capability is set then skip shared RAMBlocks during the
>>     RAM migration.
>>     Also, move qemu_ram_foreach_migratable_block (and rename) to the
>>     migration code, because it requires access to the migration capabilities.
>>     
>>     Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
>>     Message-Id: <20190215174548.2630-4-yury-kotov@yandex-team.ru>
>>     Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>     Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> 
>> Probably not that important, just to mention since we were thinking of possible small optimizations.
>> I would like to share the complete previous callgrind data, but cannot find a way to export them in a readable state, could export the graph though as PDF if helpful.
>> 
>> Likely we'd need a new round of measurements with perf...
>
> Yes it would be good to know. Said that, I think it'll also be fine if
> optimizations are done on top, as long as the change will be compatible
> with the interface being proposed.
>
> Here e.g. "stop the VM within the cmd" is part of the interface so IMHO it
> should be decided before this series got merged.
>

Ok, so in summary, the high level requirement says we need to stop the
VM and we've determined that stopping it before the migration is what
probably makes more sense.

Keeping in mind that the design of fixed-ram already supports live
migration, I see three options for the interface so far:

1) Add a new command that does vm_stop + fixed-ram migrate;

2) Arbitrarily declare that fixed-ram is always non-live and hardcode
   that;

3) Add a new migration capability "live migration", ON by default and
   have the management layer set fixed-ram=on, live-migration=off.

I guess this also largely depends on what direction we're going with the
migration code in general. I.e. do we prefer a more isolated
implementation or keep the new feature flexible for future use-cases?

I'll give people time to catch up and in the meantime work on adding the
stop and the safeguards around the user re-starting.
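
For the stop part I'm thinking of something roughly like this (untested, error
handling omitted; migrate_fixed_ram() being the capability helper, and whether
we stop unconditionally depends on which option above we pick):

    if (migrate_fixed_ram() && runstate_is_running()) {
        /* stop the guest before saving, effectively a "vm suspend" */
        vm_stop(RUN_STATE_PAUSED);
    }

plus a check to fail or cancel the migration if the user re-starts the VM
while it is in progress.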

Thanks all for the input so far.



* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-06 16:46                               ` Fabiano Rosas
@ 2023-04-07 10:36                                 ` Claudio Fontana
  2023-04-11 15:48                                   ` Peter Xu
  0 siblings, 1 reply; 65+ messages in thread
From: Claudio Fontana @ 2023-04-07 10:36 UTC (permalink / raw)
  To: Fabiano Rosas, Peter Xu
  Cc: Daniel P. Berrangé,
	qemu-devel, jfehlig, dfaggioli, dgilbert, Juan Quintela

On 4/6/23 18:46, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
>> On Tue, Apr 04, 2023 at 05:10:52PM +0200, Claudio Fontana wrote:
>>> On 4/4/23 16:53, Peter Xu wrote:
>>>> On Tue, Apr 04, 2023 at 10:00:16AM +0200, Claudio Fontana wrote:
>>>>> Hi Peter,
>>>>
>>>> Hi, Claudio,
>>>>
>>>>>
>>>>> On 4/3/23 21:26, Peter Xu wrote:
>>>>>> Hi, Claudio,
>>>>>>
>>>>>> Thanks for the context.
>>>>>>
>>>>>> On Mon, Apr 03, 2023 at 09:47:26AM +0200, Claudio Fontana wrote:
>>>>>>> Hi, not sure if what is asked here is context in terms of the previous
>>>>>>> upstream discussions or our specific requirement we are trying to bring
>>>>>>> upstream.
>>>>>>>
>>>>>>> In terms of the specific requirement we are trying to bring upstream, we
>>>>>>> need to get libvirt+QEMU VM save and restore functionality to be able to
>>>>>>> transfer VM sizes of ~30 GB (4/8 vcpus) in roughly 5 seconds.  When an
>>>>>>> event trigger happens, the VM needs to be quickly paused and saved to
>>>>>>> disk safely, including datasync, and another VM needs to be restored,
>>>>>>> also in ~5 secs.  For our specific requirement, the VM is never running
>>>>>>> when its data (mostly consisting of RAM) is saved.
>>>>>>>
>>>>>>> I understand that the need to handle also the "live" case comes from
>>>>>>> upstream discussions about solving the "general case", where someone
>>>>>>> might want to do this for "live" VMs, but if helpful I want to highlight
>>>>>>> that it is not part of the specific requirement we are trying to address,
>>>>>>> and for this specific case won't also in the future, as the whole point
>>>>>>> of the trigger is to replace the running VM with another VM, so it cannot
>>>>>>> be kept running.
>>>>>>
>>>>>> From what I read so far, that scenario suites exactly what live snapshot
>>>>>> would do with current QEMU - that at least should involve a snapshot on the
>>>>>> disks being used or I can't see how that can be live.  So it looks like a
>>>>>> separate request.
>>>>>>
>>>>>>> The reason we are using "migrate" here likely stems from the fact that
>>>>>>> existing libvirt code currently uses QMP migrate to implement the save
>>>>>>> and restore commands.  And in my personal view, I think that reusing the
>>>>>>> existing building blocks (migration, multifd) would be preferable, to
>>>>>>> avoid having to maintain two separate ways to do the same thing.  That
>>>>>>> said, it could be done in a different way, if the performance can keep
>>>>>>> up. Just thinking of reducing the overall effort and also maintenance
>>>>>>> surface.
>>>>>>
>>>>>> I would vaguely guess the performance can not only keep up but better than
>>>>>> what the current solution would provide, due to the possibility of (1)
>>>>>> batch handling of continuous guest pages, and (2) completely no dirty
>>>>>> tracking overhead.
>>>>>>
>>>>>> For (2), it's not about wr-protect page faults or vmexits due to PML being
>>>>>> full (because vcpus will be stopped anyway..), it's about enabling the
>>>>>> dirty tracking (which already contains overhead, especially when huge pages
>>>>>> are enabled, to split huge pages in EPT pgtables) and all the bitmap
>>>>>> operations QEMU does during live migration even if the VM is not live.
>>>>>
>>>>> something we could profile for, I do not remember it being really an important source of overhead in my previous profile runs,
>>>>> but maybe worthwhile redoing the profiling with Fabiano's patchset.
>>>>
>>>> Yes I don't know the detailed number either, it should depend on the guest
>>>> configuration (mem size, mem type, kernel version etc).  It could be less a
>>>> concern comparing to the time used elsewhere.  More on this on below.
>>>>
>>>>>
>>>>>>
>>>>>> IMHO reusing multifd may or may not be a good idea here, because it'll of
>>>>>> course also complicate multifd code, hence makes multifd harder to
>>>>>> maintain, while not in a good way, because as I mentioned I don't think it
>>>>>> can use much of what multifd provides.
>>>>>
>>>>>
>>>>> The main advantage we get is the automatic multithreading of the qemu_savevm_state_iterate code in my view.
>>>>>
>>>>> Reimplementing the same thing again has the potential to cause bitrot for this use case, and using multiple fds for the transfer is exactly what is needed here,
>>>>> and in my understanding the same exact reason multifd exists: to take advantage of high bandwidth migration channels.
>>>>>
>>>>> The only adjustment needed to multifd is the ability to work with block devices (file fds) as the migration channels instead of just sockets,
>>>>> so it seems a very natural extension of multifd to me.
>>>>
>>>> Yes, since I haven't looked at the multifd patches at all so I don't have
>>>> solid clue on how much it'll affect multifd.  I'll leave that to Juan.
>>>>
>>>>>
>>>>>>
>>>>>> I don't have a strong opinion on the impl (even though I do have a
>>>>>> preference..), but I think at least we should still check on two things:
>>>>>>
>>>>>>   - Being crystal clear on the use case above, and double check whether "VM
>>>>>>     stop" should be the default operation at the start of the new cmd - we
>>>>>>     shouldn't assume the user will be aware of doing this, neither should
>>>>>>     we assume the user is aware of the performance implications.
>>>>>
>>>>>
>>>>> Not sure I can identify what you are asking specifically: the use case is to stop executing the currently running VM as soon as possible, save it to disk, then restore another VM as soon as possible.
>>>>> Probably I missed something there.
>>>>
>>>> Yes, then IMHO as mentioned we should make "vm stop" part of the command
>>>> procedure if vm was still running when invoked.  Then we can already
>>>> optimize dirty logging of above (2) with the current framework. E.g., we
>>>> already optimized live snapshot to not enable dirty logging:
>>>>
>>>>         if (!migrate_background_snapshot()) {
>>>>             memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
>>>>             migration_bitmap_sync_precopy(rs);
>>>>         }
>>>>
>>>> Maybe that can also be done for fixed-ram migration, so no matter how much
>>>> overhead there will be, that can be avoided.
>>>
>>> Understood, agree.
>>>
>>> Would it make sense to check for something like if (!runstate_is_running())
>>> instead of checking for the specific multifd + fixed-ram feature?
>>>
>>> I think from a high level perspective, there should not be dirtying if the vcpus are not running right?
>>> This could even be a bit more future proof to avoid checking for many features, if they all happen to share the fact that vcpus are not running.
>>
>> Hmm I'm not sure.  I think we still allow use to stop/start VMs during
>> migration?  If so, probably not applicable.
>>
>> And it won't cover live snapshot too - live snapshot always run with VM
>> running, but it doesn't need to track dirty.  It actually needs to track
>> dirty, but in a synchronous way to make it efficient (while kvm dirty
>> tracking is asynchronous, aka, vcpu won't be blocked if dirtied).
>>
>> So here we can make it "if (migrate_needs_async_dirty_tracking())", and
>> having both live snapshot and fixed-ram migration covered in the helper to
>> opt-out dirty tracking.
>>
>> One thing worth keeping an eye here is if we go that way we need to make
>> sure VM won't be started during the fixed-ram migration.  IOW, we can
>> cancel the fixed-ram migration (in this case, more suitable to be called
>> "vm suspend") if the user starts the VM during the process.
>>
>>>
>>>>
>>>> PS: I think similar optimizations can be done too in ram_save_complete() or
>>>> ram_state_pending_exact().. maybe we should move the check into
>>>> migration_bitmap_sync_precopy() so it can be skipped as a whole when it can.
>>>
>>> makes sense, interesting.
>>>
>>> I wonder if ramblock_is_ignored() could be optimized a bit too, since it seems to consume roughly the same amount of cpu as the dirty bitmap handling, even when "ignore-shared" is not used.
>>
>> Do you mean we can skip dirty tracking when ramblock_is_ignored() for a
>> ramblock?  I think it's doable but it'll be slightly more involved, because
>> ignored/shared ramblocks can be used together with private/non-ignored
>> ramblocks, hence at least it's not applicable globally.
>>
>>>
>>> this feature was added by:
>>>
>>> commit fbd162e629aaf8a7e464af44d2f73d06b26428ad
>>> Author: Yury Kotov <yury-kotov@yandex-team.ru>
>>> Date:   Fri Feb 15 20:45:46 2019 +0300
>>>
>>>     migration: Add an ability to ignore shared RAM blocks
>>>     
>>>     If ignore-shared capability is set then skip shared RAMBlocks during the
>>>     RAM migration.
>>>     Also, move qemu_ram_foreach_migratable_block (and rename) to the
>>>     migration code, because it requires access to the migration capabilities.
>>>     
>>>     Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
>>>     Message-Id: <20190215174548.2630-4-yury-kotov@yandex-team.ru>
>>>     Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>     Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>
>>> Probably not that important, just to mention since we were thinking of possible small optimizations.
>>> I would like to share the complete previous callgrind data, but cannot find a way to export them in a readable state, could export the graph though as PDF if helpful.
>>>
>>> Likely we'd need a new round of measurements with perf...
>>
>> Yes it would be good to know. Said that, I think it'll also be fine if
>> optimizations are done on top, as long as the change will be compatible
>> with the interface being proposed.
>>
>> Here e.g. "stop the VM within the cmd" is part of the interface so IMHO it
>> should be decided before this series got merged.
>>
> 
> Ok, so in summary, the high level requirement says we need to stop the
> VM and we've determined that stopping it before the migration is what
> probably makes more sense.
> 
> Keeping in mind that the design of fixed-ram already supports live
> migration, I see three options for the interface so far:

(just my opinion here - I might be wrong, and this is not directly a requirement I am presenting)

Maybe there are other reasons to provide the fixed-ram offsets thing beyond the live case? I am unclear on that.

If the live case is a potential requirement for someone else, or there are other reasons for fixed-ram offsets anyway,
I think it would be better to leave the decision of whether or not to stop the VM prior to the transfer to the user, or to the management tools (libvirt ...).

We care about the stop case, but since the proposal already supports live too, I see no really good reason to force the user to stop the VM and impose our own use case when others might find use for "live".

If we want to detect the two cases separately at runtime in the future for a potential additional performance gain, that is a possibility for future work in my view,
but we already know experimentally that the extra overhead of the dirty bitmap tracking is not the real bottleneck, at least in our testing,
even with devices capable of transferring ~6 gigabytes per second.

But again, this is assuming that the live case is compatible and does not make things overly complicated;
otherwise, looking at it purely from the perspective of these business requirements, we don't need it, and we could even scrap live.

> 
> 1) Add a new command that does vm_stop + fixed-ram migrate;
> 
> 2) Arbitrarily declare that fixed-ram is always non-live and hardcode
>    that;
> 
> 3) Add a new migration capability "live migration", ON by default and
>    have the management layer set fixed-ram=on, live-migration=off.

(just a minor point, for the case where this would apply): instead of an additional option, could we not just detect whether we are "live" by checking whether the guest is in a running state?
I suppose we don't allow starting/stopping guests while the migration is running...


> 
> I guess this also largely depends on what direction we're going with the
> migration code in general. I.e. do we prefer a more isolated
> implementation or keep the new feature flexible for future use-cases?

right, looking for the migration experts and maintainers to chime in here :-)

> 
> I'll give people time to catch up and in the meantime work on adding the
> stop and the safeguards around the user re-starting.
> 
> Thanks all for the input so far.

Thanks, and again: all this is just my 2c, truly.

Ciao,

Claudio



* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-07 10:36                                 ` Claudio Fontana
@ 2023-04-11 15:48                                   ` Peter Xu
  0 siblings, 0 replies; 65+ messages in thread
From: Peter Xu @ 2023-04-11 15:48 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Fabiano Rosas, Daniel P. Berrangé,
	qemu-devel, jfehlig, dfaggioli, dgilbert, Juan Quintela

On Fri, Apr 07, 2023 at 12:36:24PM +0200, Claudio Fontana wrote:
> On 4/6/23 18:46, Fabiano Rosas wrote:
> > Peter Xu <peterx@redhat.com> writes:
> > 
> >> On Tue, Apr 04, 2023 at 05:10:52PM +0200, Claudio Fontana wrote:
> >>> On 4/4/23 16:53, Peter Xu wrote:
> >>>> On Tue, Apr 04, 2023 at 10:00:16AM +0200, Claudio Fontana wrote:
> >>>>> Hi Peter,
> >>>>
> >>>> Hi, Claudio,
> >>>>
> >>>>>
> >>>>> On 4/3/23 21:26, Peter Xu wrote:
> >>>>>> Hi, Claudio,
> >>>>>>
> >>>>>> Thanks for the context.
> >>>>>>
> >>>>>> On Mon, Apr 03, 2023 at 09:47:26AM +0200, Claudio Fontana wrote:
> >>>>>>> Hi, not sure if what is asked here is context in terms of the previous
> >>>>>>> upstream discussions or our specific requirement we are trying to bring
> >>>>>>> upstream.
> >>>>>>>
> >>>>>>> In terms of the specific requirement we are trying to bring upstream, we
> >>>>>>> need to get libvirt+QEMU VM save and restore functionality to be able to
> >>>>>>> transfer VM sizes of ~30 GB (4/8 vcpus) in roughly 5 seconds.  When an
> >>>>>>> event trigger happens, the VM needs to be quickly paused and saved to
> >>>>>>> disk safely, including datasync, and another VM needs to be restored,
> >>>>>>> also in ~5 secs.  For our specific requirement, the VM is never running
> >>>>>>> when its data (mostly consisting of RAM) is saved.
> >>>>>>>
> >>>>>>> I understand that the need to handle also the "live" case comes from
> >>>>>>> upstream discussions about solving the "general case", where someone
> >>>>>>> might want to do this for "live" VMs, but if helpful I want to highlight
> >>>>>>> that it is not part of the specific requirement we are trying to address,
> >>>>>>> and for this specific case won't also in the future, as the whole point
> >>>>>>> of the trigger is to replace the running VM with another VM, so it cannot
> >>>>>>> be kept running.
> >>>>>>
> >>>>>> From what I read so far, that scenario suites exactly what live snapshot
> >>>>>> would do with current QEMU - that at least should involve a snapshot on the
> >>>>>> disks being used or I can't see how that can be live.  So it looks like a
> >>>>>> separate request.
> >>>>>>
> >>>>>>> The reason we are using "migrate" here likely stems from the fact that
> >>>>>>> existing libvirt code currently uses QMP migrate to implement the save
> >>>>>>> and restore commands.  And in my personal view, I think that reusing the
> >>>>>>> existing building blocks (migration, multifd) would be preferable, to
> >>>>>>> avoid having to maintain two separate ways to do the same thing.  That
> >>>>>>> said, it could be done in a different way, if the performance can keep
> >>>>>>> up. Just thinking of reducing the overall effort and also maintenance
> >>>>>>> surface.
> >>>>>>
> >>>>>> I would vaguely guess the performance can not only keep up but better than
> >>>>>> what the current solution would provide, due to the possibility of (1)
> >>>>>> batch handling of continuous guest pages, and (2) completely no dirty
> >>>>>> tracking overhead.
> >>>>>>
> >>>>>> For (2), it's not about wr-protect page faults or vmexits due to PML being
> >>>>>> full (because vcpus will be stopped anyway..), it's about enabling the
> >>>>>> dirty tracking (which already contains overhead, especially when huge pages
> >>>>>> are enabled, to split huge pages in EPT pgtables) and all the bitmap
> >>>>>> operations QEMU does during live migration even if the VM is not live.
> >>>>>
> >>>>> something we could profile for, I do not remember it being really an important source of overhead in my previous profile runs,
> >>>>> but maybe worthwhile redoing the profiling with Fabiano's patchset.
> >>>>
> >>>> Yes I don't know the detailed number either, it should depend on the guest
> >>>> configuration (mem size, mem type, kernel version etc).  It could be less a
> >>>> concern comparing to the time used elsewhere.  More on this on below.
> >>>>
> >>>>>
> >>>>>>
> >>>>>> IMHO reusing multifd may or may not be a good idea here, because it'll of
> >>>>>> course also complicate multifd code, hence makes multifd harder to
> >>>>>> maintain, while not in a good way, because as I mentioned I don't think it
> >>>>>> can use much of what multifd provides.
> >>>>>
> >>>>>
> >>>>> The main advantage we get is the automatic multithreading of the qemu_savevm_state_iterate code in my view.
> >>>>>
> >>>>> Reimplementing the same thing again has the potential to cause bitrot for this use case, and using multiple fds for the transfer is exactly what is needed here,
> >>>>> and in my understanding the same exact reason multifd exists: to take advantage of high bandwidth migration channels.
> >>>>>
> >>>>> The only adjustment needed to multifd is the ability to work with block devices (file fds) as the migration channels instead of just sockets,
> >>>>> so it seems a very natural extension of multifd to me.
> >>>>
> >>>> Yes, since I haven't looked at the multifd patches at all so I don't have
> >>>> solid clue on how much it'll affect multifd.  I'll leave that to Juan.
> >>>>
> >>>>>
> >>>>>>
> >>>>>> I don't have a strong opinion on the impl (even though I do have a
> >>>>>> preference..), but I think at least we should still check on two things:
> >>>>>>
> >>>>>>   - Being crystal clear on the use case above, and double check whether "VM
> >>>>>>     stop" should be the default operation at the start of the new cmd - we
> >>>>>>     shouldn't assume the user will be aware of doing this, neither should
> >>>>>>     we assume the user is aware of the performance implications.
> >>>>>
> >>>>>
> >>>>> Not sure I can identify what you are asking specifically: the use case is to stop executing the currently running VM as soon as possible, save it to disk, then restore another VM as soon as possible.
> >>>>> Probably I missed something there.
> >>>>
> >>>> Yes, then IMHO as mentioned we should make "vm stop" part of the command
> >>>> procedure if vm was still running when invoked.  Then we can already
> >>>> optimize dirty logging of above (2) with the current framework. E.g., we
> >>>> already optimized live snapshot to not enable dirty logging:
> >>>>
> >>>>         if (!migrate_background_snapshot()) {
> >>>>             memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
> >>>>             migration_bitmap_sync_precopy(rs);
> >>>>         }
> >>>>
> >>>> Maybe that can also be done for fixed-ram migration, so no matter how much
> >>>> overhead there will be, that can be avoided.
> >>>
> >>> Understood, agree.
> >>>
> >>> Would it make sense to check for something like if (!runstate_is_running())
> >>> instead of checking for the specific multifd + fixed-ram feature?
> >>>
> >>> I think from a high level perspective, there should not be dirtying if the vcpus are not running right?
> >>> This could even be a bit more future proof to avoid checking for many features, if they all happen to share the fact that vcpus are not running.
> >>
> >> Hmm I'm not sure.  I think we still allow users to stop/start VMs during
> >> migration?  If so, probably not applicable.
> >>
> >> And it won't cover live snapshot either - live snapshot always runs with the
> >> VM running, and while it does need to track dirtying, it does so in a
> >> synchronous way to make it efficient (whereas kvm dirty tracking is
> >> asynchronous, aka, the vcpu won't be blocked when it dirties a page).
> >>
> >> So here we can make it "if (migrate_needs_async_dirty_tracking())", and
> >> having both live snapshot and fixed-ram migration covered in the helper to
> >> opt-out dirty tracking.
> >>
> >> One thing worth keeping an eye here is if we go that way we need to make
> >> sure VM won't be started during the fixed-ram migration.  IOW, we can
> >> cancel the fixed-ram migration (in this case, more suitable to be called
> >> "vm suspend") if the user starts the VM during the process.
> >>
> >>>
> >>>>
> >>>> PS: I think similar optimizations can be done too in ram_save_complete() or
> >>>> ram_state_pending_exact().. maybe we should move the check into
> >>>> migration_bitmap_sync_precopy() so it can be skipped as a whole when it can.
> >>>
> >>> makes sense, interesting.
> >>>
> >>> I wonder if ramblock_is_ignored() could be optimized a bit too, since it seems to consume roughly the same amount of cpu as the dirty bitmap handling, even when "ignore-shared" is not used.
> >>
> >> Do you mean we can skip dirty tracking when ramblock_is_ignored() for a
> >> ramblock?  I think it's doable but it'll be slightly more involved, because
> >> ignored/shared ramblocks can be used together with private/non-ignored
> >> ramblocks, hence at least it's not applicable globally.
> >>
> >>>
> >>> this feature was added by:
> >>>
> >>> commit fbd162e629aaf8a7e464af44d2f73d06b26428ad
> >>> Author: Yury Kotov <yury-kotov@yandex-team.ru>
> >>> Date:   Fri Feb 15 20:45:46 2019 +0300
> >>>
> >>>     migration: Add an ability to ignore shared RAM blocks
> >>>     
> >>>     If ignore-shared capability is set then skip shared RAMBlocks during the
> >>>     RAM migration.
> >>>     Also, move qemu_ram_foreach_migratable_block (and rename) to the
> >>>     migration code, because it requires access to the migration capabilities.
> >>>     
> >>>     Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
> >>>     Message-Id: <20190215174548.2630-4-yury-kotov@yandex-team.ru>
> >>>     Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >>>     Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >>>
> >>> Probably not that important, just to mention since we were thinking of possible small optimizations.
> >>> I would like to share the complete previous callgrind data, but cannot find a way to export them in a readable state, could export the graph though as PDF if helpful.
> >>>
> >>> Likely we'd need a new round of measurements with perf...
> >>
> >> Yes it would be good to know. Said that, I think it'll also be fine if
> >> optimizations are done on top, as long as the change will be compatible
> >> with the interface being proposed.
> >>
> >> Here e.g. "stop the VM within the cmd" is part of the interface so IMHO it
> >> should be decided before this series got merged.
> >>
> > 
> > Ok, so in summary, the high level requirement says we need to stop the
> > VM and we've determined that stopping it before the migration is what
> > probably makes more sense.
> > 
> > Keeping in mind that the design of fixed-ram already supports live
> > migration, I see three options for the interface so far:
> 
> (just my opinion here, I might be wrong and is not directly a requirement I am presenting here)
> 
> Maybe there are other reasons to provide the fixed-ram offsets thing beyond the live case? I am unclear on that.
> 
> If the live case is a potential requirement for someone else, or there are other reasons for fixed-ram offsets anyway,
> I think it would be better to leave the decision of whether to stop or not to stop the vm prior to transfer to the user, or to the management tools (libvirt ...)

We'll need someone to stand up and explain the use case.  IMHO we should not
assume something can happen and design the interface around that assumption,
especially if the assumption can have an impact on the design.  Per my own
experience that's the major source of over-engineering.

If we live migrate the VM and then stop it after migration completes, it means
we're "assuming" the ultimate disk image will match the VM image we're going
to create, but I doubt that holds.

Here the whole process actually contains a few steps:

     (1)                  (2)                     (3)               (4)
  VM running --> start live migration --> migration completes --> VM stop
                   (fixed-ram=on)

We have the VM image containing all the VM state (including device and
memory) at step (3), then we can optionally & quickly turn off the VM at
step (4).  IOW, the final disk image contains the state at step (4), not at
step (3).  The question is: could something have changed on the disk, or
could an IO flush have happened, between steps (3) and (4)?

IOW, I think the use case so far can only be justified if it's VM suspend.

> 
> We care about the stop case, but since the proposal already supports live too, there is no real good reason I think to force the user to stop the VM, forcing our own use case when others might find use for "live".
> 
> If we want to detect the two cases at runtime separately in the future for potential additional performance gain, that is a possibility in my view for future work,
> but we know already experimentally that the bits of extra overhead for the dirty bitmap tracking is not the real bottleneck at least in our testing,
> even with devices capable of transferring ~6 gigabytes per second.

Is that device assigned to the guest?  I'm very curious why that wouldn't
make a difference.

It could be that the device is reusing a small buffer of the guest, so even
if it is dirtied very fast the impact is limited to a small range.  Logically,
high dirty loads will definitely make a difference regardless of the dirty
tracking overhead (e.g., besides the tracking overhead, we'll also need to
migrate dirtied pages more than once, which we don't need to do with a live
snapshot).

> 
> But again this is assuming that the live case is compatible and does not make things overly complicated,
> otherwise looking instead at the thing from purely these business requirements perspective we don't need it, and we could even scrap live.
> 
> > 
> > 1) Add a new command that does vm_stop + fixed-ram migrate;
> > 
> > 2) Arbitrarily declare that fixed-ram is always non-live and hardcode
> >    that;
> > 
> > 3) Add a new migration capability "live migration", ON by default and
> >    have the management layer set fixed-ram=on, live-migration=off.
> 
> (just a minor point, for the case where this would apply): instead of an additional option, could we not just detect whether we are "live" or not by just checking whether the guest is in a running state?
> I suppose we don't allow to start/stop guests while the migration is running..
> 
> 
> > 
> > I guess this also largely depends on what direction we're going with the
> > migration code in general. I.e. do we prefer a more isolated
> > implementation or keep the new feature flexible for future use-cases?
> 
> right, looking for the migration experts and maintainers to chime in here :-)
> 
> > 
> > I'll give people time to catch up and in the meantime work on adding the
> > stop and the safeguards around the user re-starting.
> > 
> > Thanks all for the input so far.
> 
> Thanks and as again: all this is just my 2c truly.
> 
> Ciao,
> 
> Claudio
> 

-- 
Peter Xu




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-03-31 16:27             ` Peter Xu
  2023-03-31 18:18               ` Fabiano Rosas
@ 2023-04-18 16:58               ` Daniel P. Berrangé
  2023-04-18 19:26                 ` Peter Xu
  1 sibling, 1 reply; 65+ messages in thread
From: Daniel P. Berrangé @ 2023-04-18 16:58 UTC (permalink / raw)
  To: Peter Xu
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela

On Fri, Mar 31, 2023 at 12:27:48PM -0400, Peter Xu wrote:
> On Fri, Mar 31, 2023 at 05:10:16PM +0100, Daniel P. Berrangé wrote:
> > On Fri, Mar 31, 2023 at 11:55:03AM -0400, Peter Xu wrote:
> > > On Fri, Mar 31, 2023 at 12:30:45PM -0300, Fabiano Rosas wrote:
> > > > Peter Xu <peterx@redhat.com> writes:
> > > > 
> > > > > On Fri, Mar 31, 2023 at 11:37:50AM -0300, Fabiano Rosas wrote:
> > > > >> >> Outgoing migration to file. NVMe disk. XFS filesystem.
> > > > >> >> 
> > > > >> >> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
> > > > >> >>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
> > > > >> >>   10m -v`:
> > > > >> >> 
> > > > >> >> migration type  | MB/s | pages/s |  ms
> > > > >> >> ----------------+------+---------+------
> > > > >> >> savevm io_uring |  434 |  102294 | 71473
> > > > >> >
> > > > >> > So I assume this is the non-live migration scenario.  Could you explain
> > > > >> > what does io_uring mean here?
> > > > >> >
> > > > >> 
> > > > >> This table is all non-live migration. This particular line is a snapshot
> > > > >> (hmp_savevm->save_snapshot). I thought it could be relevant because it
> > > > >> is another way by which we write RAM into disk.
> > > > >
> > > > > I see, so if all non-live that explains, because I was curious what's the
> > > > > relationship between this feature and the live snapshot that QEMU also
> > > > > supports.
> > > > >
> > > > > I also don't immediately see why savevm will be much slower, do you have an
> > > > > answer?  Maybe it's somewhere but I just overlooked..
> > > > >
> > > > 
> > > > I don't have a concrete answer. I could take a jab and maybe blame the
> > > > extra memcpy for the buffer in QEMUFile? Or perhaps an unintended effect
> > > > of bandwidth limits?
> > > 
> > > IMHO it would be great if this can be investigated and reasons provided in
> > > the next cover letter.
> > > 
> > > > 
> > > > > IIUC this is "vm suspend" case, so there's an extra benefit knowledge of
> > > > > "we can stop the VM".  It smells slightly weird to build this on top of
> > > > > "migrate" from that pov, rather than "savevm", though.  Any thoughts on
> > > > > this aspect (on why not building this on top of "savevm")?
> > > > >
> > > > 
> > > > I share the same perception. I have done initial experiments with
> > > > savevm, but I decided to carry on the work that was already started by
> > > > others because my understanding of the problem was yet incomplete.
> > > > 
> > > > One point that has been raised is that the fixed-ram format alone does
> > > > not bring that many performance improvements. So we'll need
> > > > multi-threading and direct-io on top of it. Re-using multifd
> > > > infrastructure seems like it could be a good idea.
> > > 
> > > The thing is IMHO concurrency is not as hard if VM stopped, and when we're
> > > 100% sure locally on where the page will go.
> > 
> > We shouldn't assume the VM is stopped though. When saving to the file
> > the VM may still be active. The fixed-ram format lets us re-write the
> > same memory location on disk multiple times in this case, thus avoiding
> > growth of the file size.
> 
> Before discussing on reusing multifd below, now I have a major confusing on
> the use case of the feature..
> 
> The question is whether we would like to stop the VM after fixed-ram
> migration completes.  I'm asking because:
> 
>   1. If it will stop, then it looks like a "VM suspend" to me. If so, could
>      anyone help explain why we don't stop the VM first then migrate?
>      Because it avoids copying single pages multiple times, no fiddling
>      with dirty tracking at all - we just don't ever track anything.  In
>      short, we'll stop the VM anyway, then why not stop it slightly
>      earlier?
> 
>   2. If it will not stop, then it's "VM live snapshot" to me.  We have
>      that, aren't we?  That's more efficient because it'll wr-protect all
>      guest pages, any write triggers a CoW and we only copy the guest pages
>      once and for all.
> 
> Either way to go, there's no need to copy any page more than once.  Did I
> miss anything perhaps very important?
> 
> I would guess it's option (1) above, because it seems we don't snapshot the
> disk alongside.  But I am really not sure now..

It is both options above.

Libvirt has multiple APIs where it currently uses its migrate-to-file
approach

  * virDomainManagedSave()

    This saves VM state to an libvirt managed file, stops the VM, and the
    file state is auto-restored on next request to start the VM, and the
    file deleted. The VM CPUs are stopped during both save + restore
    phase

  * virDomainSave/virDomainRestore

    The former saves VM state to a file specified by the mgmt app/user.
    A later call to virDomaniRestore starts the VM using that saved
    state. The mgmt app / user can delete the file state, or re-use
    it many times as they desire. The VM CPUs are stopped during both
    save + restore phase

  * virDomainSnapshotXXX

    This family of APIs takes snapshots of the VM disks, optionally
    also including the full VM state to a separate file. The snapshots
    can later be restored. The VM CPUs remain running during the
    save phase, but are stopped during restore phase

All these APIs end up calling the same code inside libvirt that uses
the libvirt-iohelper, together with QEMU migrate:fd driver.

IIUC, Suse's original motivation for the performance improvements was
wrt to the first case of virDomainManagedSave. From the POV of actually
supporting this in libvirt though, we need to cover all the scenarios
there. Thus we need this to work both when CPUs are running and stopped,
and if we didn't use migrate in this case, then we basically just end
up re-inventing migrate again which IMHO is undesirable both from
libvirt's POV and QEMU's POV.
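
For anyone less familiar with those entry points, the save/restore pair looks
roughly like this from a client's point of view (minimal sketch, error
handling omitted, domain name and path made up):

    #include <libvirt/libvirt.h>

    int main(void)
    {
        virConnectPtr conn = virConnectOpen("qemu:///system");
        virDomainPtr dom = virDomainLookupByName(conn, "myguest");

        /* stop the CPUs and write the VM state to the given file */
        virDomainSave(dom, "/var/tmp/myguest.sav");

        /* ... later, start a new instance from that saved state */
        virDomainRestore(conn, "/var/tmp/myguest.sav");

        /* virDomainManagedSave(dom, 0) is the same idea, except that
         * libvirt picks and manages the state file itself */

        virDomainFree(dom);
        virConnectClose(conn);
        return 0;
    }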

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-18 16:58               ` Daniel P. Berrangé
@ 2023-04-18 19:26                 ` Peter Xu
  2023-04-19 17:12                   ` Daniel P. Berrangé
  0 siblings, 1 reply; 65+ messages in thread
From: Peter Xu @ 2023-04-18 19:26 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela

On Tue, Apr 18, 2023 at 05:58:44PM +0100, Daniel P. Berrangé wrote:
> On Fri, Mar 31, 2023 at 12:27:48PM -0400, Peter Xu wrote:
> > On Fri, Mar 31, 2023 at 05:10:16PM +0100, Daniel P. Berrangé wrote:
> > > On Fri, Mar 31, 2023 at 11:55:03AM -0400, Peter Xu wrote:
> > > > On Fri, Mar 31, 2023 at 12:30:45PM -0300, Fabiano Rosas wrote:
> > > > > Peter Xu <peterx@redhat.com> writes:
> > > > > 
> > > > > > On Fri, Mar 31, 2023 at 11:37:50AM -0300, Fabiano Rosas wrote:
> > > > > >> >> Outgoing migration to file. NVMe disk. XFS filesystem.
> > > > > >> >> 
> > > > > >> >> - Single migration runs of stopped 32G guest with ~90% RAM usage. Guest
> > > > > >> >>   running `stress-ng --vm 4 --vm-bytes 90% --vm-method all --verify -t
> > > > > >> >>   10m -v`:
> > > > > >> >> 
> > > > > >> >> migration type  | MB/s | pages/s |  ms
> > > > > >> >> ----------------+------+---------+------
> > > > > >> >> savevm io_uring |  434 |  102294 | 71473
> > > > > >> >
> > > > > >> > So I assume this is the non-live migration scenario.  Could you explain
> > > > > >> > what does io_uring mean here?
> > > > > >> >
> > > > > >> 
> > > > > >> This table is all non-live migration. This particular line is a snapshot
> > > > > >> (hmp_savevm->save_snapshot). I thought it could be relevant because it
> > > > > >> is another way by which we write RAM into disk.
> > > > > >
> > > > > > I see, so if all non-live that explains, because I was curious what's the
> > > > > > relationship between this feature and the live snapshot that QEMU also
> > > > > > supports.
> > > > > >
> > > > > > I also don't immediately see why savevm will be much slower, do you have an
> > > > > > answer?  Maybe it's somewhere but I just overlooked..
> > > > > >
> > > > > 
> > > > > I don't have a concrete answer. I could take a jab and maybe blame the
> > > > > extra memcpy for the buffer in QEMUFile? Or perhaps an unintended effect
> > > > > of bandwidth limits?
> > > > 
> > > > IMHO it would be great if this can be investigated and reasons provided in
> > > > the next cover letter.
> > > > 
> > > > > 
> > > > > > IIUC this is "vm suspend" case, so there's an extra benefit knowledge of
> > > > > > "we can stop the VM".  It smells slightly weird to build this on top of
> > > > > > "migrate" from that pov, rather than "savevm", though.  Any thoughts on
> > > > > > this aspect (on why not building this on top of "savevm")?
> > > > > >
> > > > > 
> > > > > I share the same perception. I have done initial experiments with
> > > > > savevm, but I decided to carry on the work that was already started by
> > > > > others because my understanding of the problem was yet incomplete.
> > > > > 
> > > > > One point that has been raised is that the fixed-ram format alone does
> > > > > not bring that many performance improvements. So we'll need
> > > > > multi-threading and direct-io on top of it. Re-using multifd
> > > > > infrastructure seems like it could be a good idea.
> > > > 
> > > > The thing is IMHO concurrency is not as hard if VM stopped, and when we're
> > > > 100% sure locally on where the page will go.
> > > 
> > > We shouldn't assume the VM is stopped though. When saving to the file
> > > the VM may still be active. The fixed-ram format lets us re-write the
> > > same memory location on disk multiple times in this case, thus avoiding
> > > growth of the file size.
> > 
> > Before discussing on reusing multifd below, now I have a major confusing on
> > the use case of the feature..
> > 
> > The question is whether we would like to stop the VM after fixed-ram
> > migration completes.  I'm asking because:
> > 
> >   1. If it will stop, then it looks like a "VM suspend" to me. If so, could
> >      anyone help explain why we don't stop the VM first then migrate?
> >      Because it avoids copying single pages multiple times, no fiddling
> >      with dirty tracking at all - we just don't ever track anything.  In
> >      short, we'll stop the VM anyway, then why not stop it slightly
> >      earlier?
> > 
> >   2. If it will not stop, then it's "VM live snapshot" to me.  We have
> >      that, aren't we?  That's more efficient because it'll wr-protect all
> >      guest pages, any write triggers a CoW and we only copy the guest pages
> >      once and for all.
> > 
> > Either way to go, there's no need to copy any page more than once.  Did I
> > miss anything perhaps very important?
> > 
> > I would guess it's option (1) above, because it seems we don't snapshot the
> > disk alongside.  But I am really not sure now..
> 
> It is both options above.
> 
> Libvirt has multiple APIs where it currently uses its migrate-to-file
> approach
> 
>   * virDomainManagedSave()
> 
>     This saves VM state to an libvirt managed file, stops the VM, and the
>     file state is auto-restored on next request to start the VM, and the
>     file deleted. The VM CPUs are stopped during both save + restore
>     phase
> 
>   * virDomainSave/virDomainRestore
> 
>     The former saves VM state to a file specified by the mgmt app/user.
>     A later call to virDomaniRestore starts the VM using that saved
>     state. The mgmt app / user can delete the file state, or re-use
>     it many times as they desire. The VM CPUs are stopped during both
>     save + restore phase
> 
>   * virDomainSnapshotXXX
> 
>     This family of APIs takes snapshots of the VM disks, optionally
>     also including the full VM state to a separate file. The snapshots
>     can later be restored. The VM CPUs remain running during the
>     save phase, but are stopped during restore phase

For this one IMHO it'll be good if Libvirt can consider leveraging the new
background-snapshot capability (QEMU 6.0+, so not very new..).  Or is there
perhaps any reason why a generic migrate:fd approach is better?

> 
> All these APIs end up calling the same code inside libvirt that uses
> the libvirt-iohelper, together with QEMU migrate:fd driver.
> 
> IIUC, Suse's original motivation for the performance improvements was
> wrt to the first case of virDomainManagedSave. From the POV of actually
> supporting this in libvirt though, we need to cover all the scenarios
> there. Thus we need this to work both when CPUs are running and stopped,
> and if we didn't use migrate in this case, then we basically just end
> up re-inventing migrate again which IMHO is undesirable both from
> libvirt's POV and QEMU's POV.

Just to make sure we're on the same page - I always think it fine to use
the QMP "migrate" command to do this.

Meanwhile, we can also reuse the migration framework if we think that's
still the good way to go (even if I am not 100% sure on this... I still
think _lots_ of the live migration framework is logic trying to take care
of a "live" VM; IOW, that logic will become pure overhead if we reuse the
live migration framework for vm suspend).

However could you help elaborate more on why it must support live mode for
a virDomainManagedSave() request?  As I assume this is the core of the goal.

IMHO virDomainManagedSave() is a good interface design, because it contains
the target goal of what it wants to do (according to above).  To ask in
another way, I'm curious whether virDomainManagedSave() will stop the VM
before triggering the QMP "migrate" to fd: If it doesn't, why not?  If it
does, then why can't we have that assumption also for QEMU?

That assumption is IMHO important for QEMU because non-live VM migration
can avoid tons of overhead that a live migration will need.  I've mentioned
this in the other reply, even if we keep using the migration framework, we
can still optimize other things like dirty tracking.  We probably don't
even need any bitmap at all because we simply scan over all ramblocks.
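
Roughly, in pseudo-code (names are made up for illustration, this is not the
actual patches):

    /*
     * With vcpus stopped and no dirty tracking, a fixed-ram save can walk
     * every ramblock exactly once and write each page at its fixed offset
     * in the output file.
     */
    RAMBlock *block;

    RAMBLOCK_FOREACH_MIGRATABLE(block) {
        for (ram_addr_t off = 0; off < block->used_length;
             off += TARGET_PAGE_SIZE) {
            /* block_file_offset() stands in for the fixed-ram file layout */
            pwrite(fd, block->host + off, TARGET_PAGE_SIZE,
                   block_file_offset(block) + off);
        }
    }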

OTOH, if QEMU supports live mode for a "vm suspend" in the initial design,
not only it doesn't sound right at all from interface level, it means QEMU
will need to keep doing so forever because we need to be compatible with
the old interfaces even on new binaries.  That's why I keep suggesting we
should take "VM turned off" part of the cmd if that's what we're looking
for.

Thanks,

-- 
Peter Xu




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-18 19:26                 ` Peter Xu
@ 2023-04-19 17:12                   ` Daniel P. Berrangé
  2023-04-19 19:07                     ` Peter Xu
  0 siblings, 1 reply; 65+ messages in thread
From: Daniel P. Berrangé @ 2023-04-19 17:12 UTC (permalink / raw)
  To: Peter Xu
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela

On Tue, Apr 18, 2023 at 03:26:45PM -0400, Peter Xu wrote:
> On Tue, Apr 18, 2023 at 05:58:44PM +0100, Daniel P. Berrangé wrote:
> > Libvirt has multiple APIs where it currently uses its migrate-to-file
> > approach
> > 
> >   * virDomainManagedSave()
> > 
> >     This saves VM state to an libvirt managed file, stops the VM, and the
> >     file state is auto-restored on next request to start the VM, and the
> >     file deleted. The VM CPUs are stopped during both save + restore
> >     phase
> > 
> >   * virDomainSave/virDomainRestore
> > 
> >     The former saves VM state to a file specified by the mgmt app/user.
> >     A later call to virDomaniRestore starts the VM using that saved
> >     state. The mgmt app / user can delete the file state, or re-use
> >     it many times as they desire. The VM CPUs are stopped during both
> >     save + restore phase
> > 
> >   * virDomainSnapshotXXX
> > 
> >     This family of APIs takes snapshots of the VM disks, optionally
> >     also including the full VM state to a separate file. The snapshots
> >     can later be restored. The VM CPUs remain running during the
> >     save phase, but are stopped during restore phase
> 
> For this one IMHO it'll be good if Libvirt can consider leveraging the new
> background-snapshot capability (QEMU 6.0+, so not very new..).  Or is there
> perhaps any reason why a generic migrate:fd approach is better?

I'm not sure I fully understand the implications of 'background-snapshot' ?

Based on what the QAPI comment says, it sounds potentially interesting,
as conceptually it would be nicer to have the memory / state snapshot
represent the VM at the point where we started the snapshot operation,
rather than where we finished the snapshot operation.

It would not solve the performance problems that the work in this thread
was intended to address though.  With large VMs (100's of GB of RAM),
saving all the RAM state to disk takes a very long time, regardless of
whether the VM vCPUs are paused or running.

Currently when doing this libvirt has a "libvirt_iohelper" process
that we use so that we can do writes with O_DIRECT set. This avoids
thrashing the host OS's  I/O buffers/cache, and thus negatively
impacting performance of anything else on the host doing I/O. This
can't take advantage of multifd though, and even if extended to do
so, it still imposes extra data copies during the save/restore paths.
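
For context, the core of what the iohelper does is conceptually just the loop
below (simplified sketch, not the actual libvirt code; a real implementation
also has to handle the unaligned tail, e.g. by dropping O_DIRECT for the
final partial block):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define CHUNK (1024 * 1024)

    /* read the VM state from the migration pipe and write it out while
     * bypassing the host page cache; note the extra copy through 'buf',
     * which is the overhead mentioned above */
    static int copy_direct(int src_fd, const char *path)
    {
        void *buf = NULL;
        ssize_t n;
        int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);

        if (fd < 0 || posix_memalign(&buf, CHUNK, CHUNK) != 0)
            return -1;

        while ((n = read(src_fd, buf, CHUNK)) > 0) {
            if (write(fd, buf, n) != n)
                return -1;
        }

        free(buf);
        close(fd);
        return n < 0 ? -1 : 0;
    }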


So to speed up the above 3 libvirt APIs, we want QEMU to be able to
directly save/restore mem/vmstate to files, with parallization and
O_DIRECT.


> > All these APIs end up calling the same code inside libvirt that uses
> > the libvirt-iohelper, together with QEMU migrate:fd driver.
> > 
> > IIUC, Suse's original motivation for the performance improvements was
> > wrt to the first case of virDomainManagedSave. From the POV of actually
> > supporting this in libvirt though, we need to cover all the scenarios
> > there. Thus we need this to work both when CPUs are running and stopped,
> > and if we didn't use migrate in this case, then we basically just end
> > up re-inventing migrate again which IMHO is undesirable both from
> > libvirt's POV and QEMU's POV.
> 
> Just to make sure we're on the same page - I always think it fine to use
> the QMP "migrate" command to do this.
> 
> Meanwhile, we can also reuse the migration framework if we think that's
> still the good way to go (even if I am not 100% sure on this... I still
> think _lots_ of the live migration framework is logic trying to take care
> of a "live" VM; IOW, that logic will become pure overhead if we reuse the
> live migration framework for vm suspend).
> 
> However could you help elaborate more on why it must support live mode for
> a virDomainManagedSave() request?  As I assume this is the core of the goal.

No, we've no need for live mode for virDomainManagedSave. Live mode is
needed for virDomainSnapshot* APIs.

The point I'm making is that all three of the above libvirt APIs run exactly
the same migration code in libvirt. The only difference in the APIs is how
the operation gets triggered and whether the CPUs are running or not.

We want the improved performance of having parallel save/restore-to-disk
and use of O_DIRECT to be available to all 3 APIs. To me it doesn't make
sense to provide different impls for these APIs when they all have the
same end goal - it would be extra work on QEMU side and libvirt side alike
to use different solutions for each. 

> IMHO virDomainManagedSave() is a good interface design, because it contains
> the target goal of what it wants to do (according to above).  To ask in
> another way, I'm curious whether virDomainManagedSave() will stop the VM
> before triggering the QMP "migrate" to fd: If it doesn't, why not?  If it
> does, then why can't we have that assumption also for QEMU?
> 
> That assumption is IMHO important for QEMU because non-live VM migration
> can avoid tons of overhead that a live migration will need.  I've mentioned
> this in the other reply, even if we keep using the migration framework, we
> can still optimize other things like dirty tracking.  We probably don't
> even need any bitmap at all because we simply scan over all ramblocks.
> 
> OTOH, if QEMU supports live mode for a "vm suspend" in the initial design,
> not only it doesn't sound right at all from interface level, it means QEMU
> will need to keep doing so forever because we need to be compatible with
> the old interfaces even on new binaries.  That's why I keep suggesting we
> should take "VM turned off" part of the cmd if that's what we're looking
> for.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-19 17:12                   ` Daniel P. Berrangé
@ 2023-04-19 19:07                     ` Peter Xu
  2023-04-20  9:02                       ` Daniel P. Berrangé
  0 siblings, 1 reply; 65+ messages in thread
From: Peter Xu @ 2023-04-19 19:07 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela

On Wed, Apr 19, 2023 at 06:12:05PM +0100, Daniel P. Berrangé wrote:
> On Tue, Apr 18, 2023 at 03:26:45PM -0400, Peter Xu wrote:
> > On Tue, Apr 18, 2023 at 05:58:44PM +0100, Daniel P. Berrangé wrote:
> > > Libvirt has multiple APIs where it currently uses its migrate-to-file
> > > approach
> > > 
> > >   * virDomainManagedSave()
> > > 
> > >     This saves VM state to an libvirt managed file, stops the VM, and the
> > >     file state is auto-restored on next request to start the VM, and the
> > >     file deleted. The VM CPUs are stopped during both save + restore
> > >     phase
> > > 
> > >   * virDomainSave/virDomainRestore
> > > 
> > >     The former saves VM state to a file specified by the mgmt app/user.
> > >     A later call to virDomaniRestore starts the VM using that saved
> > >     state. The mgmt app / user can delete the file state, or re-use
> > >     it many times as they desire. The VM CPUs are stopped during both
> > >     save + restore phase
> > > 
> > >   * virDomainSnapshotXXX
> > > 
> > >     This family of APIs takes snapshots of the VM disks, optionally
> > >     also including the full VM state to a separate file. The snapshots
> > >     can later be restored. The VM CPUs remain running during the
> > >     save phase, but are stopped during restore phase
> > 
> > For this one IMHO it'll be good if Libvirt can consider leveraging the new
> > background-snapshot capability (QEMU 6.0+, so not very new..).  Or is there
> > perhaps any reason why a generic migrate:fd approach is better?
> 
> I'm not sure I fully understand the implications of 'background-snapshot' ?
> 
> Based on what the QAPI comment says, it sounds potentially interesting,
> as conceptually it would be nicer to have the memory / state snapshot
> represent the VM at the point where we started the snapshot operation,
> rather than where we finished the snapshot operation.
> 
> It would not solve the performance problems that the work in this thread
> was intended to address though.  With large VMs (100's of GB of RAM),
> saving all the RAM state to disk takes a very long time, regardless of
> whether the VM vCPUs are paused or running.

I think it solves the performance problem by only copying each guest
page once, even if the guest is running.

Different from mostly all the rest of "migrate" use cases, background
snapshot does not use the generic dirty tracking at all (for KVM that's
get-dirty-log), instead it uses userfaultfd wr-protects, so that when
taking the snapshot all the guest pages will be protected once.

Then when each page is written, the guest cannot proceed before copying the
snapshot page over first.  After one guest page is unprotected, any write
to it will be with full speed because the follow up writes won't matter for
a snapshot.
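
For reference, the core of that flow is roughly the following (heavily
simplified sketch; the fault-handling thread and all error handling are
omitted):

    #include <fcntl.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <linux/userfaultfd.h>

    static int wr_protect_guest_ram(void *host_addr, uint64_t len)
    {
        int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
        struct uffdio_api api = { .api = UFFD_API,
                                  .features = UFFD_FEATURE_PAGEFAULT_FLAG_WP };
        struct uffdio_register reg = {
            .range = { .start = (uintptr_t)host_addr, .len = len },
            .mode = UFFDIO_REGISTER_MODE_WP,
        };
        struct uffdio_writeprotect wp = {
            .range = { .start = (uintptr_t)host_addr, .len = len },
            .mode = UFFDIO_WRITEPROTECT_MODE_WP,
        };

        ioctl(uffd, UFFDIO_API, &api);
        ioctl(uffd, UFFDIO_REGISTER, &reg);
        ioctl(uffd, UFFDIO_WRITEPROTECT, &wp); /* protect all guest RAM once */

        /*
         * A background thread then read()s fault events from 'uffd'; on a
         * write fault it copies the faulting page into the snapshot and
         * clears the protection (mode = 0) for just that page, so the vcpu
         * only blocks until that single copy is done.
         */
        return uffd;
    }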

It guarantees the best efficiency of creating a snapshot with VM running,
afaict.  I sincerely think Libvirt should have someone investigating and
see whether virDomainSnapshotXXX() can be implemented by this cap rather
than the default migration.

I actually thought the Libvirt support was there. I think it must be that
someone posted support for Libvirt but it didn't really land for some
reason.

> 
> Currently when doing this libvirt has a "libvirt_iohelper" process
> that we use so that we can do writes with O_DIRECT set. This avoids
> thrashing the host OS's  I/O buffers/cache, and thus negatively
> impacting performance of anything else on the host doing I/O. This
> can't take advantage of multifd though, and even if extended to do
> so, it still imposes extra data copies during the save/restore paths.
> 
> So to speed up the above 3 libvirt APIs, we want QEMU to be able to
> directly save/restore mem/vmstate to files, with parallization and
> O_DIRECT.

Here IIUC above question can be really important on whether existing
virDomainSnapshotXXX() can (and should) use "background-snapshot" to
implement, because that's the only one that will need to support migration
live (out of 3 use cases).

If virDomainSnapshotXXX() can be implemented differently, I think it'll be
much easier to have both virDomainManagedSave() and virDomainSave() trigger
a migration command that will stop the VM first by whatever way.

It's probably fine if we still want to have CAP_FIXED_RAM as a new
capability describing the file property (so that libvirt will know iohelper
is not needed anymore), it can support live migrating even if it shouldn't
really use it.  But then we could probably have another CAP_SUSPEND which
gives QEMU a hint so QEMU can be smart on this non-live migration.

It's just that AFAIU CAP_FIXED_RAM should just always be set with
CAP_SUSPEND, because it must be a SUSPEND to fixed ram or one should just
use virDomainSnapshotXXX() (or say, live snapshot).

Thanks,

-- 
Peter Xu




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-19 19:07                     ` Peter Xu
@ 2023-04-20  9:02                       ` Daniel P. Berrangé
  2023-04-20 19:19                         ` Peter Xu
  0 siblings, 1 reply; 65+ messages in thread
From: Daniel P. Berrangé @ 2023-04-20  9:02 UTC (permalink / raw)
  To: Peter Xu
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela

On Wed, Apr 19, 2023 at 03:07:19PM -0400, Peter Xu wrote:
> On Wed, Apr 19, 2023 at 06:12:05PM +0100, Daniel P. Berrangé wrote:
> > On Tue, Apr 18, 2023 at 03:26:45PM -0400, Peter Xu wrote:
> > > On Tue, Apr 18, 2023 at 05:58:44PM +0100, Daniel P. Berrangé wrote:
> > > > Libvirt has multiple APIs where it currently uses its migrate-to-file
> > > > approach
> > > > 
> > > >   * virDomainManagedSave()
> > > > 
> > > >     This saves VM state to an libvirt managed file, stops the VM, and the
> > > >     file state is auto-restored on next request to start the VM, and the
> > > >     file deleted. The VM CPUs are stopped during both save + restore
> > > >     phase
> > > > 
> > > >   * virDomainSave/virDomainRestore
> > > > 
> > > >     The former saves VM state to a file specified by the mgmt app/user.
> > > >     A later call to virDomaniRestore starts the VM using that saved
> > > >     state. The mgmt app / user can delete the file state, or re-use
> > > >     it many times as they desire. The VM CPUs are stopped during both
> > > >     save + restore phase
> > > > 
> > > >   * virDomainSnapshotXXX
> > > > 
> > > >     This family of APIs takes snapshots of the VM disks, optionally
> > > >     also including the full VM state to a separate file. The snapshots
> > > >     can later be restored. The VM CPUs remain running during the
> > > >     save phase, but are stopped during restore phase
> > > 
> > > For this one IMHO it'll be good if Libvirt can consider leveraging the new
> > > background-snapshot capability (QEMU 6.0+, so not very new..).  Or is there
> > > perhaps any reason why a generic migrate:fd approach is better?
> > 
> > I'm not sure I fully understand the implications of 'background-snapshot' ?
> > 
> > Based on what the QAPI comment says, it sounds potentially interesting,
> > as conceptually it would be nicer to have the memory / state snapshot
> > represent the VM at the point where we started the snapshot operation,
> > rather than where we finished the snapshot operation.
> > 
> > It would not solve the performance problems that the work in this thread
> > was intended to address though.  With large VMs (100's of GB of RAM),
> > saving all the RAM state to disk takes a very long time, regardless of
> > whether the VM vCPUs are paused or running.
> 
> I think it solves the performance problem by only copying each guest
> page once, even if the guest is running.

I think we're talking about different performance problems.

What you describe here is about ensuring the snapshot is of finite size
and completes in linear time, by ensuring each page is written only
once.

What I'm talking about is being able to parallelize the writing of all
RAM, so if a single thread can saturate the storage, using multiple
threads will make the overall process faster, even when we're only
writing each page once.
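
That parallelism is exactly what the fixed offsets make trivial: N threads can
each pwrite() their own slice of guest RAM with no coordination at all, e.g.
one thread per channel spawned with pthread_create().  Illustrative sketch
only (not the actual patches):

    #include <pthread.h>
    #include <sys/types.h>
    #include <unistd.h>

    struct slice {
        int fd;            /* output file */
        const char *ram;   /* start of the guest RAM block mapping */
        size_t start, len; /* this thread's slice of the block */
        off_t file_base;   /* fixed offset of the block in the file */
    };

    static void *write_slice(void *opaque)
    {
        struct slice *s = opaque;
        size_t done = 0;

        while (done < s->len) {
            ssize_t n = pwrite(s->fd, s->ram + s->start + done,
                               s->len - done,
                               s->file_base + s->start + done);
            if (n <= 0)
                break;  /* a real implementation would report the error */
            done += n;
        }
        return NULL;
    }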

> Different from mostly all the rest of "migrate" use cases, background
> snapshot does not use the generic dirty tracking at all (for KVM that's
> get-dirty-log), instead it uses userfaultfd wr-protects, so that when
> taking the snapshot all the guest pages will be protected once.

Oh, so that means this 'background-snapshot' feature only works on
Linux, and only when permissions allow it. The migration parameter
probably should be marked with 'CONFIG_LINUX' in the QAPI schema
to make it clear this is a non-portable feature.

> It guarantees the best efficiency of creating a snapshot with VM running,
> afaict.  I sincerely think Libvirt should have someone investigating and
> see whether virDomainSnapshotXXX() can be implemented by this cap rather
> than the default migration.

Since the background-snapshot feature is not universally available,
it will only ever be possible to use it as an optional enhancement
with virDomainSnapshotXXX, we'll need the portable impl to be the
default / fallback.

> > Currently when doing this libvirt has a "libvirt_iohelper" process
> > that we use so that we can do writes with O_DIRECT set. This avoids
> > thrashing the host OS's  I/O buffers/cache, and thus negatively
> > impacting performance of anything else on the host doing I/O. This
> > can't take advantage of multifd though, and even if extended to do
> > so, it still imposes extra data copies during the save/restore paths.
> > 
> > So to speed up the above 3 libvirt APIs, we want QEMU to be able to
> > directly save/restore mem/vmstate to files, with parallization and
> > O_DIRECT.
> 
> Here IIUC above question can be really important on whether existing
> virDomainSnapshotXXX() can (and should) use "background-snapshot" to
> implement, because that's the only one that will need to support migration
> live (out of 3 use cases).
> 
> If virDomainSnapshotXXX() can be implemented differently, I think it'll be
> much easier to have both virDomainManagedSave() and virDomainSave() trigger
> a migration command that will stop the VM first by whatever way.
> 
> It's probably fine if we still want to have CAP_FIXED_RAM as a new
> capability describing the file property (so that libvirt will know iohelper
> is not needed anymore), it can support live migrating even if it shouldn't
> really use it.  But then we could probably have another CAP_SUSPEND which
> gives QEMU a hint so QEMU can be smart on this non-live migration.
> 
> It's just that AFAIU CAP_FIXED_RAM should just always be set with
> CAP_SUSPEND, because it must be a SUSPEND to fixed ram or one should just
> use virDomainSnapshotXXX() (or say, live snapshot).

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-20  9:02                       ` Daniel P. Berrangé
@ 2023-04-20 19:19                         ` Peter Xu
  2023-04-21  7:48                           ` Daniel P. Berrangé
  0 siblings, 1 reply; 65+ messages in thread
From: Peter Xu @ 2023-04-20 19:19 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela

On Thu, Apr 20, 2023 at 10:02:43AM +0100, Daniel P. Berrangé wrote:
> On Wed, Apr 19, 2023 at 03:07:19PM -0400, Peter Xu wrote:
> > On Wed, Apr 19, 2023 at 06:12:05PM +0100, Daniel P. Berrangé wrote:
> > > On Tue, Apr 18, 2023 at 03:26:45PM -0400, Peter Xu wrote:
> > > > On Tue, Apr 18, 2023 at 05:58:44PM +0100, Daniel P. Berrangé wrote:
> > > > > Libvirt has multiple APIs where it currently uses its migrate-to-file
> > > > > approach
> > > > > 
> > > > >   * virDomainManagedSave()
> > > > > 
> > > > >     This saves VM state to an libvirt managed file, stops the VM, and the
> > > > >     file state is auto-restored on next request to start the VM, and the
> > > > >     file deleted. The VM CPUs are stopped during both save + restore
> > > > >     phase
> > > > > 
> > > > >   * virDomainSave/virDomainRestore
> > > > > 
> > > > >     The former saves VM state to a file specified by the mgmt app/user.
> > > > >     A later call to virDomaniRestore starts the VM using that saved
> > > > >     state. The mgmt app / user can delete the file state, or re-use
> > > > >     it many times as they desire. The VM CPUs are stopped during both
> > > > >     save + restore phase
> > > > > 
> > > > >   * virDomainSnapshotXXX
> > > > > 
> > > > >     This family of APIs takes snapshots of the VM disks, optionally
> > > > >     also including the full VM state to a separate file. The snapshots
> > > > >     can later be restored. The VM CPUs remain running during the
> > > > >     save phase, but are stopped during restore phase
> > > > 
> > > > For this one IMHO it'll be good if Libvirt can consider leveraging the new
> > > > background-snapshot capability (QEMU 6.0+, so not very new..).  Or is there
> > > > perhaps any reason why a generic migrate:fd approach is better?
> > > 
> > > I'm not sure I fully understand the implications of 'background-snapshot' ?
> > > 
> > > Based on what the QAPI comment says, it sounds potentially interesting,
> > > as conceptually it would be nicer to have the memory / state snapshot
> > > represent the VM at the point where we started the snapshot operation,
> > > rather than where we finished the snapshot operation.
> > > 
> > > It would not solve the performance problems that the work in this thread
> > > was intended to address though.  With large VMs (100's of GB of RAM),
> > > saving all the RAM state to disk takes a very long time, regardless of
> > > whether the VM vCPUs are paused or running.
> > 
> > I think it solves the performance problem by only copying each guest
> > page once, even if the guest is running.
> 
> I think we're talking about different performance problems.
> 
> What you describe here is about ensuring the snapshot is of finite size
> and completes in linear time, by ensuring each page is written only
> once.
> 
> What I'm talking about is being able to parallelize the writing of all
> RAM, so if a single thread can saturate the storage, using multiple
> > threads will make the overall process faster, even when we're only
> writing each page once.

It depends on how much we want it.  Here the live snapshot scenario could
probably leverage the same multi-threading framework as the vm suspend case
because it can assume all the pages are static and saved only once.

But I agree it's at least not there yet.. so we can directly leverage
multifd at least for now.

> 
> > Different from mostly all the rest of "migrate" use cases, background
> > snapshot does not use the generic dirty tracking at all (for KVM that's
> > get-dirty-log), instead it uses userfaultfd wr-protects, so that when
> > taking the snapshot all the guest pages will be protected once.
> 
> Oh, so that means this 'background-snapshot' feature only works on
> Linux, and only when permissions allow it. The migration parameter
> probably should be marked with 'CONFIG_LINUX' in the QAPI schema
> to make it clear this is a non-portable feature.

Indeed, I can have a follow up patch for this.  But it'll be the same as
some other features, like, postcopy (and all its sub-features including
postcopy-blocktime and postcopy-preempt)?

> 
> > It guarantees the best efficiency of creating a snapshot with VM running,
> > afaict.  I sincerely think Libvirt should have someone investigating and
> > see whether virDomainSnapshotXXX() can be implemented by this cap rather
> > than the default migration.
> 
> Since the background-snapshot feature is not universally available,
> it will only ever be possible to use it as an optional enhancement
> with virDomainSnapshotXXX, we'll need the portable impl to be the
> default / fallback.

I am actually curious on how a live snapshot can be implemented correctly
without something like background snapshot.  I raised this question in
another reply here:

https://lore.kernel.org/all/ZDWBSuGDU9IMohEf@x1n/

I was using fixed-ram and vm suspend as example, but I assume it applies to
any live snapshot that is based on current default migration scheme.

For a real live snapshot (not vm suspend), IIUC we have similar challenges.

The problem is when migration completes (snapshot taken) the VM is still
running with a live disk image.  Then how can we take a snapshot exactly at
the same time when we got the guest image mirrored in the vm dump?  What
guarantees that there's no IO changes after VM image created but before we
take a snapshot on the disk image?

In short, it's a question on how libvirt can make sure the VM image and
disk snapshot image be taken at exactly the same time for live.

Thanks,

-- 
Peter Xu




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-20 19:19                         ` Peter Xu
@ 2023-04-21  7:48                           ` Daniel P. Berrangé
  2023-04-21 13:56                             ` Peter Xu
  0 siblings, 1 reply; 65+ messages in thread
From: Daniel P. Berrangé @ 2023-04-21  7:48 UTC (permalink / raw)
  To: Peter Xu
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela

On Thu, Apr 20, 2023 at 03:19:39PM -0400, Peter Xu wrote:
> On Thu, Apr 20, 2023 at 10:02:43AM +0100, Daniel P. Berrangé wrote:
> > On Wed, Apr 19, 2023 at 03:07:19PM -0400, Peter Xu wrote:
> > > On Wed, Apr 19, 2023 at 06:12:05PM +0100, Daniel P. Berrangé wrote:
> > > > On Tue, Apr 18, 2023 at 03:26:45PM -0400, Peter Xu wrote:
> > > > > On Tue, Apr 18, 2023 at 05:58:44PM +0100, Daniel P. Berrangé wrote:
> > > > > > Libvirt has multiple APIs where it currently uses its migrate-to-file
> > > > > > approach
> > > > > > 
> > > > > >   * virDomainManagedSave()
> > > > > > 
> > > > > >     This saves VM state to an libvirt managed file, stops the VM, and the
> > > > > >     file state is auto-restored on next request to start the VM, and the
> > > > > >     file deleted. The VM CPUs are stopped during both save + restore
> > > > > >     phase
> > > > > > 
> > > > > >   * virDomainSave/virDomainRestore
> > > > > > 
> > > > > >     The former saves VM state to a file specified by the mgmt app/user.
> > > > > >     A later call to virDomaniRestore starts the VM using that saved
> > > > > >     state. The mgmt app / user can delete the file state, or re-use
> > > > > >     it many times as they desire. The VM CPUs are stopped during both
> > > > > >     save + restore phase
> > > > > > 
> > > > > >   * virDomainSnapshotXXX
> > > > > > 
> > > > > >     This family of APIs takes snapshots of the VM disks, optionally
> > > > > >     also including the full VM state to a separate file. The snapshots
> > > > > >     can later be restored. The VM CPUs remain running during the
> > > > > >     save phase, but are stopped during restore phase
> > > > > 
> > > > > For this one IMHO it'll be good if Libvirt can consider leveraging the new
> > > > > background-snapshot capability (QEMU 6.0+, so not very new..).  Or is there
> > > > > perhaps any reason why a generic migrate:fd approach is better?
> > > > 
> > > > I'm not sure I fully understand the implications of 'background-snapshot' ?
> > > > 
> > > > Based on what the QAPI comment says, it sounds potentially interesting,
> > > > as conceptually it would be nicer to have the memory / state snapshot
> > > > represent the VM at the point where we started the snapshot operation,
> > > > rather than where we finished the snapshot operation.
> > > > 
> > > > It would not solve the performance problems that the work in this thread
> > > > was intended to address though.  With large VMs (100's of GB of RAM),
> > > > saving all the RAM state to disk takes a very long time, regardless of
> > > > whether the VM vCPUs are paused or running.
> > > 
> > > I think it solves the performance problem by only copying each guest
> > > page once, even if the guest is running.
> > 
> > I think we're talking about different performance problems.
> > 
> > What you describe here is about ensuring the snapshot is of finite size
> > and completes in linear time, by ensuring each page is written only
> > once.
> > 
> > What I'm talking about is being able to parallelize the writing of all
> > RAM, so if a single thread can saturate the storage, using multiple
> > threads will make the overall process faster, even when we're only
> > writing each page once.
> 
> It depends on how much we want it.  Here the live snapshot scenario could
> probably leverage the same multi-threading framework as the vm suspend case
> because it can assume all the pages are static and saved only once.
> 
> But I agree it's at least not there yet.. so we can directly leverage
> multifd at least for now.
> 
> > 
> > > Different from mostly all the rest of "migrate" use cases, background
> > > snapshot does not use the generic dirty tracking at all (for KVM that's
> > > get-dirty-log), instead it uses userfaultfd wr-protects, so that when
> > > taking the snapshot all the guest pages will be protected once.
> > 
> > Oh, so that means this 'background-snapshot' feature only works on
> > Linux, and only when permissions allow it. The migration parameter
> > probably should be marked with 'CONFIG_LINUX' in the QAPI schema
> > to make it clear this is a non-portable feature.
> 
> Indeed, I can have a follow up patch for this.  But it'll be the same as
> some other features, like, postcopy (and all its sub-features including
> postcopy-blocktime and postcopy-preempt)?
> 
> > 
> > > It guarantees the best efficiency of creating a snapshot with VM running,
> > > afaict.  I sincerely think Libvirt should have someone investigating and
> > > see whether virDomainSnapshotXXX() can be implemented by this cap rather
> > > than the default migration.
> > 
> > Since the background-snapshot feature is not universally available,
> > it will only ever be possible to use it as an optional enhancement
> > with virDomainSnapshotXXX, we'll need the portable impl to be the
> > default / fallback.
> 
> I am actually curious on how a live snapshot can be implemented correctly
> without something like background snapshot.  I raised this question in
> another reply here:
> 
> https://lore.kernel.org/all/ZDWBSuGDU9IMohEf@x1n/
> 
> I was using fixed-ram and vm suspend as example, but I assume it applies to
> any live snapshot that is based on current default migration scheme.
> 
> For a real live snapshot (not vm suspend), IIUC we have similar challenges.
> 
> The problem is when migration completes (snapshot taken) the VM is still
> running with a live disk image.  Then how can we take a snapshot exactly at
> the same time when we got the guest image mirrored in the vm dump?  What
> guarantees that there's no IO changes after VM image created but before we
> take a snapshot on the disk image?
> 
> In short, it's a question on how libvirt can make sure the VM image and
> disk snapshot image be taken at exactly the same time for live.

It is just a matter of where you have the synchronization point.

With background-snapshot, you have to snapshot the disks at the
start of the migrate operation. Without background-snapshot
you have to snapshot the disks at the end of the migrate
operation. The CPUs are paused at the end of the migrate, so
when the CPUs pause, initiate the storage snapshot in the
background and then let the CPUs resume.
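
For concreteness, a rough sketch of the two orderings in HMP terms. The
device name "drive0", the overlay path and the "file:mig" URI are made up
for the example, and in the first case the mgmt app would in practice pause
the vCPUs briefly around the disk snapshot and the start of the migration
so that the two states coincide:

# with background-snapshot: the saved RAM reflects the start of the
# migration, so take the disk snapshot up front and let the guest run
(qemu) migrate_set_capability background-snapshot on
(qemu) snapshot_blkdev drive0 /tmp/overlay.qcow2 qcow2
(qemu) migrate -d file:mig     # guest keeps running while RAM is written out

# without background-snapshot: the saved RAM reflects the point where
# the vCPUs pause at completion, so take the disk snapshot there
(qemu) migrate file:mig        # returns once migration completes, vCPUs paused
(qemu) snapshot_blkdev drive0 /tmp/overlay.qcow2 qcow2
(qemu) cont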

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram
  2023-04-21  7:48                           ` Daniel P. Berrangé
@ 2023-04-21 13:56                             ` Peter Xu
  0 siblings, 0 replies; 65+ messages in thread
From: Peter Xu @ 2023-04-21 13:56 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Fabiano Rosas, qemu-devel, Claudio Fontana, jfehlig, dfaggioli,
	dgilbert, Juan Quintela

On Fri, Apr 21, 2023 at 08:48:02AM +0100, Daniel P. Berrangé wrote:
> On Thu, Apr 20, 2023 at 03:19:39PM -0400, Peter Xu wrote:
> > On Thu, Apr 20, 2023 at 10:02:43AM +0100, Daniel P. Berrangé wrote:
> > > On Wed, Apr 19, 2023 at 03:07:19PM -0400, Peter Xu wrote:
> > > > On Wed, Apr 19, 2023 at 06:12:05PM +0100, Daniel P. Berrangé wrote:
> > > > > On Tue, Apr 18, 2023 at 03:26:45PM -0400, Peter Xu wrote:
> > > > > > On Tue, Apr 18, 2023 at 05:58:44PM +0100, Daniel P. Berrangé wrote:
> > > > > > > Libvirt has multiple APIs where it currently uses its migrate-to-file
> > > > > > > approach
> > > > > > > 
> > > > > > >   * virDomainManagedSave()
> > > > > > > 
> > > > > > >     This saves VM state to an libvirt managed file, stops the VM, and the
> > > > > > >     file state is auto-restored on next request to start the VM, and the
> > > > > > >     file deleted. The VM CPUs are stopped during both save + restore
> > > > > > >     phase
> > > > > > > 
> > > > > > >   * virDomainSave/virDomainRestore
> > > > > > > 
> > > > > > >     The former saves VM state to a file specified by the mgmt app/user.
> > > > > > >     A later call to virDomainRestore starts the VM using that saved
> > > > > > >     state. The mgmt app / user can delete the file state, or re-use
> > > > > > >     it many times as they desire. The VM CPUs are stopped during both
> > > > > > >     save + restore phase
> > > > > > > 
> > > > > > >   * virDomainSnapshotXXX
> > > > > > > 
> > > > > > >     This family of APIs takes snapshots of the VM disks, optionally
> > > > > > >     also including the full VM state to a separate file. The snapshots
> > > > > > >     can later be restored. The VM CPUs remain running during the
> > > > > > >     save phase, but are stopped during restore phase
> > > > > > 
> > > > > > For this one IMHO it'll be good if Libvirt can consider leveraging the new
> > > > > > background-snapshot capability (QEMU 6.0+, so not very new..).  Or is there
> > > > > > perhaps any reason why a generic migrate:fd approach is better?
> > > > > 
> > > > > I'm not sure I fully understand the implications of 'background-snapshot'?
> > > > > 
> > > > > Based on what the QAPI comment says, it sounds potentially interesting,
> > > > > as conceptually it would be nicer to have the memory / state snapshot
> > > > > represent the VM at the point where we started the snapshot operation,
> > > > > rather than where we finished the snapshot operation.
> > > > > 
> > > > > It would not solve the performance problems that the work in this thread
> > > > > was intended to address though.  With large VMs (100's of GB of RAM),
> > > > > saving all the RAM state to disk takes a very long time, regardless of
> > > > > whether the VM vCPUs are paused or running.
> > > > 
> > > > I think it solves the performance problem by only copying each guest
> > > > page once, even if the guest is running.
> > > 
> > > I think we're talking about different performance problems.
> > > 
> > > What you describe here is about ensuring the snapshot is of finite size
> > > and completes in linear time, by ensuring each page is written only
> > > once.
> > > 
> > > What I'm talking about is being able to parallelize the writing of all
> > > RAM, so if a single thread can't saturate the storage, using multiple
> > > threads will make the overall process faster, even when we're only
> > > writing each page once.
> > 
> > It depends on how much we want it.  Here the live snapshot scenario could
> > probably leverage the same multi-threading framework as the vm suspend case
> > because it can assume all the pages are static and only saved once.
> > 
> > But I agree it's at least not there yet.. so we can directly leverage
> > multifd at least for now.
> > 
> > > 
> > > > Different from mostly all the rest of "migrate" use cases, background
> > > > snapshot does not use the generic dirty tracking at all (for KVM that's
> > > > get-dirty-log), instead it uses userfaultfd wr-protects, so that when
> > > > taking the snapshot all the guest pages will be protected once.
> > > 
> > > Oh, so that means this 'background-snapshot' feature only works on
> > > Linux, and only when permissions allow it. The migration parameter
> > > probably should be marked with 'CONFIG_LINUX' in the QAPI schema
> > > to make it clear this is a non-portable feature.
> > 
> > Indeed, I can have a follow-up patch for this.  But it'll be the same as
> > some other features, like postcopy (and all its sub-features including
> > postcopy-blocktime and postcopy-preempt)?
> > 
> > > 
> > > > It guarantees the best efficiency of creating a snapshot with the VM running,
> > > > afaict.  I sincerely think Libvirt should have someone investigate whether
> > > > virDomainSnapshotXXX() can be implemented with this cap rather than the
> > > > default migration.
> > > 
> > > Since the background-snapshot feature is not universally available,
> > > it will only ever be possible to use it as an optional enhancement
> > > with virDomainSnapshotXXX; we'll need the portable impl to be the
> > > default / fallback.
> > 
> > I am actually curious how a live snapshot can be implemented correctly
> > without something like background snapshot.  I raised this question in
> > another reply here:
> > 
> > https://lore.kernel.org/all/ZDWBSuGDU9IMohEf@x1n/
> > 
> > I was using fixed-ram and vm suspend as an example, but I assume it applies to
> > any live snapshot that is based on the current default migration scheme.
> > 
> > For a real live snapshot (not vm suspend), IIUC we have similar challenges.
> > 
> > The problem is that when the migration completes (snapshot taken) the VM is
> > still running with a live disk image.  How can we then take a disk snapshot
> > at exactly the point in time when the guest state was captured in the vm
> > dump?  What guarantees that there are no IO changes after the VM image is
> > created but before we take a snapshot of the disk image?
> > 
> > In short, it's a question of how libvirt can make sure the VM image and the
> > disk snapshot are taken at exactly the same point in time for a live snapshot.
> 
> It is just a matter of where you have the synchronization point.
> 
> With background-snapshot, you have to snapshot the disks at the
> start of the migrate operation. Without background-snapshot
> you have to snapshot the disks at the end of the migrate
> operation. The CPUs are paused at the end of the migrate, so
> when the CPUs pause, initiate the storage snapshot in the
> background and then let the CPUs resume.

Ah, indeed.

Thanks.

-- 
Peter Xu




end of thread, other threads:[~2023-04-21 13:57 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-30 18:03 [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 01/26] migration: Add support for 'file:' uri for source migration Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 02/26] migration: Add support for 'file:' uri for incoming migration Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 03/26] tests/qtest: migration: Add migrate_incoming_qmp helper Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 04/26] tests/qtest: migration-test: Add tests for file-based migration Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 05/26] migration: Initial support of fixed-ram feature for analyze-migration.py Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 06/26] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 07/26] io: Add generic pwritev/preadv interface Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 08/26] io: implement io_pwritev/preadv for QIOChannelFile Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 09/26] migration/qemu-file: add utility methods for working with seekable channels Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 10/26] migration/ram: Introduce 'fixed-ram' migration stream capability Fabiano Rosas
2023-03-30 22:01   ` Peter Xu
2023-03-31  7:56     ` Daniel P. Berrangé
2023-03-31 14:39       ` Peter Xu
2023-03-31 15:34         ` Daniel P. Berrangé
2023-03-31 16:13           ` Peter Xu
2023-03-31 15:05     ` Fabiano Rosas
2023-03-31  5:50   ` Markus Armbruster
2023-03-30 18:03 ` [RFC PATCH v1 11/26] migration: Refactor precopy ram loading code Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 12/26] migration: Add support for 'fixed-ram' migration restore Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 13/26] tests/qtest: migration-test: Add tests for fixed-ram file-based migration Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 14/26] migration: Add completion tracepoint Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 15/26] migration/multifd: Remove direct "socket" references Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 16/26] migration/multifd: Allow multifd without packets Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 17/26] migration/multifd: Add outgoing QIOChannelFile support Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 18/26] migration/multifd: Add incoming " Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 19/26] migration/multifd: Add pages to the receiving side Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 20/26] io: Add a pwritev/preadv version that takes a discontiguous iovec Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 21/26] migration/ram: Add a wrapper for fixed-ram shadow bitmap Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 22/26] migration/multifd: Support outgoing fixed-ram stream format Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 23/26] migration/multifd: Support incoming " Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 24/26] tests/qtest: Add a multifd + fixed-ram migration test Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 25/26] migration: Add direct-io parameter Fabiano Rosas
2023-03-30 18:03 ` [RFC PATCH v1 26/26] tests/migration/guestperf: Add file, fixed-ram and direct-io support Fabiano Rosas
2023-03-30 21:41 ` [RFC PATCH v1 00/26] migration: File based migration with multifd and fixed-ram Peter Xu
2023-03-31 14:37   ` Fabiano Rosas
2023-03-31 14:52     ` Peter Xu
2023-03-31 15:30       ` Fabiano Rosas
2023-03-31 15:55         ` Peter Xu
2023-03-31 16:10           ` Daniel P. Berrangé
2023-03-31 16:27             ` Peter Xu
2023-03-31 18:18               ` Fabiano Rosas
2023-03-31 21:52                 ` Peter Xu
2023-04-03  7:47                   ` Claudio Fontana
2023-04-03 19:26                     ` Peter Xu
2023-04-04  8:00                       ` Claudio Fontana
2023-04-04 14:53                         ` Peter Xu
2023-04-04 15:10                           ` Claudio Fontana
2023-04-04 15:56                             ` Peter Xu
2023-04-06 16:46                               ` Fabiano Rosas
2023-04-07 10:36                                 ` Claudio Fontana
2023-04-11 15:48                                   ` Peter Xu
2023-04-18 16:58               ` Daniel P. Berrangé
2023-04-18 19:26                 ` Peter Xu
2023-04-19 17:12                   ` Daniel P. Berrangé
2023-04-19 19:07                     ` Peter Xu
2023-04-20  9:02                       ` Daniel P. Berrangé
2023-04-20 19:19                         ` Peter Xu
2023-04-21  7:48                           ` Daniel P. Berrangé
2023-04-21 13:56                             ` Peter Xu
2023-03-31 15:46       ` Daniel P. Berrangé
2023-04-03  7:38 ` David Hildenbrand
2023-04-03 14:41   ` Fabiano Rosas
2023-04-03 16:24     ` David Hildenbrand
2023-04-03 16:36       ` Fabiano Rosas
