All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability
@ 2019-01-10 12:01 Yury Kotov
  2019-01-10 12:01 ` [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation Yury Kotov
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Yury Kotov @ 2019-01-10 12:01 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela,
	Dr. David Alan Gilbert, Eric Blake, Markus Armbruster,
	Thomas Huth, Laurent Vivier
  Cc: wrfsh

Hi,

The series adds migration capability which allows to skip 'external' RAM blocks
during migration. External block is a RAMBlock which available from the outside
of current QEMU process (e.g. file in /dev/shm). It's useful for fast local
migration to update QEMU for the running guests.

Patches:
1. Add offset validation to make sure that external RAM block has the same
   physical offset on target side,
2. Add RAM_EXTERNAL flag to determine external RAM blocks,
3. Add ignore-external migration capability,
4. Add a test.

Usage example:
1. Start source VM:
   qemu-system-x86 \
     -m 4G \
     -object memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
     -numa node,memdev=mem0 \
     -qmp unix:/tmp/qemu-qmp-1.sock,server,nowait \

2. Start target VM:
   qemu-system-x86 \
     -m 4G \
     -object memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
     -numa node,memdev=mem0 \
     -qmp unix:/tmp/qemu-qmp-2.sock,server,nowait \
     -incoming defer

3. Enable ignore-external capability on both VMs:
   { "execute": "migrate-set-capabilities" , "arguments":
     { "capabilities": [ { "capability": "x-ignore-external", "state": true } ] } }

4. Start migration.

Regards,
Yury

Yury Kotov (4):
  migration: add RAMBlock's offset validation
  exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks
  migration: introduce ignore-external capability
  tests/migration-test: Add a test for ignore-external capability

 backends/hostmem-file.c   |   3 +-
 exec.c                    |   7 ++-
 include/exec/cpu-common.h |   1 +
 include/exec/memory.h     |   3 ++
 migration/migration.c     |   9 ++++
 migration/migration.h     |   1 +
 migration/ram.c           |  52 ++++++++++++++++--
 numa.c                    |   4 +-
 qapi/migration.json       |   6 ++-
 tests/migration-test.c    | 109 +++++++++++++++++++++++++++++++-------
 10 files changed, 165 insertions(+), 30 deletions(-)

-- 
2.20.1

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation
  2019-01-10 12:01 [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Yury Kotov
@ 2019-01-10 12:01 ` Yury Kotov
  2019-01-10 20:14   ` Dr. David Alan Gilbert
  2019-01-10 12:01 ` [Qemu-devel] [PATCH 2/4] exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks Yury Kotov
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Yury Kotov @ 2019-01-10 12:01 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela,
	Dr. David Alan Gilbert, Eric Blake, Markus Armbruster,
	Thomas Huth, Laurent Vivier
  Cc: wrfsh

RAM migration has a RAMBlock validation stage (flag RAM_SAVE_FLAG_MEM_SIZE).
In this stage QEMU checks further information about RAMBlock:
1. Presence (by idstr),
2. Length (trying to resize, when differs),
3. Optional page size.

This patch adds a check for RAMBlock's offset. Currently we check it during
RAM pages loading - every RAM page has an offset in its header. But there is a
case when we don't send RAM pages (see below).

The following commits introduce a capability (ignore-external) to skip some
RAM blocks from migration. In such case the migration stream contains only
meta information about RAM blocks to validate them. So, the only way to check
block's offset is to send it explicitly.

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
---
 migration/ram.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 7e7deec4d8..39629254e1 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3171,6 +3171,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
         if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
             qemu_put_be64(f, block->page_size);
         }
+        qemu_put_be64(f, block->offset);
     }
 
     rcu_read_unlock();
@@ -4031,7 +4032,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
 
     seq_iter++;
 
-    if (version_id != 4) {
+    if (version_id < 4) {
         ret = -EINVAL;
     }
 
@@ -4132,6 +4133,16 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                             ret = -EINVAL;
                         }
                     }
+                    if (version_id >= 5) {
+                        ram_addr_t offset;
+                        offset = qemu_get_be64(f);
+                        if (block->offset != offset) {
+                            error_report("Mismatched RAM block offset %s "
+                                         "%" PRId64 "!= %" PRId64,
+                                         id, offset, (uint64_t)block->offset);
+                            ret = -EINVAL;
+                        }
+                    }
                     ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
                                           block->idstr);
                 } else {
@@ -4363,5 +4374,5 @@ static SaveVMHandlers savevm_ram_handlers = {
 void ram_mig_init(void)
 {
     qemu_mutex_init(&XBZRLE.lock);
-    register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, &ram_state);
+    register_savevm_live(NULL, "ram", 0, 5, &savevm_ram_handlers, &ram_state);
 }
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 2/4] exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks
  2019-01-10 12:01 [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Yury Kotov
  2019-01-10 12:01 ` [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation Yury Kotov
@ 2019-01-10 12:01 ` Yury Kotov
  2019-01-10 20:14   ` Dr. David Alan Gilbert
  2019-01-10 12:01 ` [Qemu-devel] [PATCH 3/4] migration: introduce ignore-external capability Yury Kotov
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Yury Kotov @ 2019-01-10 12:01 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela,
	Dr. David Alan Gilbert, Eric Blake, Markus Armbruster,
	Thomas Huth, Laurent Vivier
  Cc: wrfsh

This flag allows to determine whether RAM block is available from the outside.
E.g. when we use -object memory-backend-file or -mem-path options we have
a RAM block which is mapped to shared file.

We need this flag in the following commits.

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
---
 backends/hostmem-file.c | 3 ++-
 exec.c                  | 2 +-
 include/exec/memory.h   | 3 +++
 numa.c                  | 4 ++--
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 7a34e25c43..37fe28f2ac 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -63,7 +63,8 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
                                      name,
                                      backend->size, fb->align,
                                      (backend->share ? RAM_SHARED : 0) |
-                                     (fb->is_pmem ? RAM_PMEM : 0),
+                                     (fb->is_pmem ? RAM_PMEM : 0) |
+                                     RAM_EXTERNAL,
                                      fb->mem_path, errp);
     g_free(name);
 #endif
diff --git a/exec.c b/exec.c
index 6e875f0640..ef2f29d7cb 100644
--- a/exec.c
+++ b/exec.c
@@ -2241,7 +2241,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
     int64_t file_size;
 
     /* Just support these ram flags by now. */
-    assert((ram_flags & ~(RAM_SHARED | RAM_PMEM)) == 0);
+    assert((ram_flags & ~(RAM_SHARED | RAM_PMEM | RAM_EXTERNAL)) == 0);
 
     if (xen_enabled()) {
         error_setg(errp, "-mem-path not supported with Xen");
diff --git a/include/exec/memory.h b/include/exec/memory.h
index ffd23ed8d8..3a9cb34f1e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -126,6 +126,9 @@ typedef struct IOMMUNotifier IOMMUNotifier;
 /* RAM is a persistent kind memory */
 #define RAM_PMEM (1 << 5)
 
+/* RAM is from external source (e.g. from file) */
+#define RAM_EXTERNAL (1 << 6)
+
 static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
                                        IOMMUNotifierFlag flags,
                                        hwaddr start, hwaddr end,
diff --git a/numa.c b/numa.c
index 50ec016013..653c5a08de 100644
--- a/numa.c
+++ b/numa.c
@@ -482,8 +482,8 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
     if (mem_path) {
 #ifdef __linux__
         Error *err = NULL;
-        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0, 0,
-                                         mem_path, &err);
+        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0,
+                                         RAM_EXTERNAL, mem_path, &err);
         if (err) {
             error_report_err(err);
             if (mem_prealloc) {
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 3/4] migration: introduce ignore-external capability
  2019-01-10 12:01 [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Yury Kotov
  2019-01-10 12:01 ` [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation Yury Kotov
  2019-01-10 12:01 ` [Qemu-devel] [PATCH 2/4] exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks Yury Kotov
@ 2019-01-10 12:01 ` Yury Kotov
  2019-01-10 12:01 ` [Qemu-devel] [PATCH 4/4] tests/migration-test: Add a test for " Yury Kotov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Yury Kotov @ 2019-01-10 12:01 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela,
	Dr. David Alan Gilbert, Eric Blake, Markus Armbruster,
	Thomas Huth, Laurent Vivier
  Cc: wrfsh

We want to use local migration to update QEMU for running guests.
In this case we don't need to migrate external RAM.
So, add a capability to ignore such blocks during live migration.

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
---
 exec.c                    |  5 +++++
 include/exec/cpu-common.h |  1 +
 migration/migration.c     |  9 +++++++++
 migration/migration.h     |  1 +
 migration/ram.c           | 37 ++++++++++++++++++++++++++++++++++---
 qapi/migration.json       |  6 +++++-
 6 files changed, 55 insertions(+), 4 deletions(-)

diff --git a/exec.c b/exec.c
index ef2f29d7cb..3c3e42993f 100644
--- a/exec.c
+++ b/exec.c
@@ -2000,6 +2000,11 @@ void qemu_ram_unset_migratable(RAMBlock *rb)
     rb->flags &= ~RAM_MIGRATABLE;
 }
 
+bool qemu_ram_is_external(RAMBlock *rb)
+{
+    return rb->flags & RAM_EXTERNAL;
+}
+
 /* Called with iothread lock held.  */
 void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
 {
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 2ad2d6d86b..57e84e5aa4 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -78,6 +78,7 @@ void qemu_ram_set_uf_zeroable(RAMBlock *rb);
 bool qemu_ram_is_migratable(RAMBlock *rb);
 void qemu_ram_set_migratable(RAMBlock *rb);
 void qemu_ram_unset_migratable(RAMBlock *rb);
+bool qemu_ram_is_external(RAMBlock *rb);
 
 size_t qemu_ram_pagesize(RAMBlock *block);
 size_t qemu_ram_pagesize_largest(void);
diff --git a/migration/migration.c b/migration/migration.c
index ffc4d9e556..9b789d3535 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1979,6 +1979,15 @@ bool migrate_dirty_bitmaps(void)
     return s->enabled_capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS];
 }
 
+bool migrate_ignore_external(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_X_IGNORE_EXTERNAL];
+}
+
 bool migrate_use_events(void)
 {
     MigrationState *s;
diff --git a/migration/migration.h b/migration/migration.h
index e413d4d8b6..06691242f3 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -255,6 +255,7 @@ bool migrate_release_ram(void);
 bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);
 bool migrate_dirty_bitmaps(void);
+bool migrate_ignore_external(void);
 
 bool migrate_auto_converge(void);
 bool migrate_use_multifd(void);
diff --git a/migration/ram.c b/migration/ram.c
index 39629254e1..0091a8ae3a 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -166,6 +166,11 @@ out:
 
 #undef RAMBLOCK_FOREACH
 
+static bool is_ignored_block(RAMBlock *block)
+{
+    return migrate_ignore_external() && qemu_ram_is_external(block);
+}
+
 static void ramblock_recv_map_init(void)
 {
     RAMBlock *rb;
@@ -1537,7 +1542,7 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
     unsigned long *bitmap = rb->bmap;
     unsigned long next;
 
-    if (!qemu_ram_is_migratable(rb)) {
+    if (!qemu_ram_is_migratable(rb) || is_ignored_block(rb)) {
         return size;
     }
 
@@ -1651,6 +1656,9 @@ static void migration_bitmap_sync(RAMState *rs)
     qemu_mutex_lock(&rs->bitmap_mutex);
     rcu_read_lock();
     RAMBLOCK_FOREACH_MIGRATABLE(block) {
+        if (is_ignored_block(block)) {
+            continue;
+        }
         migration_bitmap_sync_range(rs, block, 0, block->used_length);
     }
     ram_counters.remaining = ram_bytes_remaining();
@@ -2472,19 +2480,27 @@ void acct_update_position(QEMUFile *f, size_t size, bool zero)
     }
 }
 
-uint64_t ram_bytes_total(void)
+static uint64_t ram_bytes_total_common(bool skip_ignored)
 {
     RAMBlock *block;
     uint64_t total = 0;
 
     rcu_read_lock();
     RAMBLOCK_FOREACH_MIGRATABLE(block) {
+        if (skip_ignored && is_ignored_block(block)) {
+            continue;
+        }
         total += block->used_length;
     }
     rcu_read_unlock();
     return total;
 }
 
+uint64_t ram_bytes_total(void)
+{
+    return ram_bytes_total_common(true);
+}
+
 static void xbzrle_load_setup(void)
 {
     XBZRLE.decoded_buf = g_malloc(TARGET_PAGE_SIZE);
@@ -3162,7 +3178,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 
     rcu_read_lock();
 
-    qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
+    qemu_put_be64(f, ram_bytes_total_common(false) | RAM_SAVE_FLAG_MEM_SIZE);
 
     RAMBLOCK_FOREACH_MIGRATABLE(block) {
         qemu_put_byte(f, strlen(block->idstr));
@@ -3172,6 +3188,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
             qemu_put_be64(f, block->page_size);
         }
         qemu_put_be64(f, block->offset);
+        qemu_put_byte(f, is_ignored_block(block) ? 1 : 0);
     }
 
     rcu_read_unlock();
@@ -4135,13 +4152,27 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
                     }
                     if (version_id >= 5) {
                         ram_addr_t offset;
+                        bool ignored;
                         offset = qemu_get_be64(f);
+                        ignored = qemu_get_byte(f);
                         if (block->offset != offset) {
                             error_report("Mismatched RAM block offset %s "
                                          "%" PRId64 "!= %" PRId64,
                                          id, offset, (uint64_t)block->offset);
                             ret = -EINVAL;
                         }
+                        if (ignored) {
+                            if (!migrate_ignore_external()) {
+                                error_report("Unexpected ignored RAM block %s: "
+                                             "ignore-external capability is "
+                                             "disabled", id);
+                                ret = -EINVAL;
+                            } else if (!qemu_ram_is_external(block)) {
+                                error_report("Only external RAM block %s "
+                                             "can be ignored", id);
+                                ret = -EINVAL;
+                            }
+                        }
                     }
                     ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
                                           block->idstr);
diff --git a/qapi/migration.json b/qapi/migration.json
index 31b589ec26..10bbac87f8 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -406,13 +406,17 @@
 #           devices (and thus take locks) immediately at the end of migration.
 #           (since 3.0)
 #
+# @x-ignore-external: If enabled, QEMU will not migrate shared memory
+#                     (since 4.0)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
            'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
            'block', 'return-path', 'pause-before-switchover', 'x-multifd',
-           'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate' ] }
+           'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
+           'x-ignore-external' ] }
 
 ##
 # @MigrationCapabilityStatus:
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Qemu-devel] [PATCH 4/4] tests/migration-test: Add a test for ignore-external capability
  2019-01-10 12:01 [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Yury Kotov
                   ` (2 preceding siblings ...)
  2019-01-10 12:01 ` [Qemu-devel] [PATCH 3/4] migration: introduce ignore-external capability Yury Kotov
@ 2019-01-10 12:01 ` Yury Kotov
  2019-01-10 20:11 ` [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Dr. David Alan Gilbert
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Yury Kotov @ 2019-01-10 12:01 UTC (permalink / raw)
  To: qemu-devel, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela,
	Dr. David Alan Gilbert, Eric Blake, Markus Armbruster,
	Thomas Huth, Laurent Vivier
  Cc: wrfsh

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
---
 tests/migration-test.c | 109 +++++++++++++++++++++++++++++++++--------
 1 file changed, 89 insertions(+), 20 deletions(-)

diff --git a/tests/migration-test.c b/tests/migration-test.c
index 06ca5068d8..67e6d6dad2 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -332,6 +332,13 @@ static void cleanup(const char *filename)
     g_free(path);
 }
 
+static char *get_shmem_opts(const char *mem_size, const char *shmem_path)
+{
+    return g_strdup_printf("-object memory-backend-file,id=mem0,size=%s"
+                           ",mem-path=%s,share=on -numa node,memdev=mem0",
+                           mem_size, shmem_path);
+}
+
 static void migrate_check_parameter(QTestState *who, const char *parameter,
                                     long long value)
 {
@@ -430,73 +437,91 @@ static void migrate_postcopy_start(QTestState *from, QTestState *to)
 }
 
 static int test_migrate_start(QTestState **from, QTestState **to,
-                               const char *uri, bool hide_stderr)
+                               const char *uri, bool hide_stderr,
+                               bool use_shmem)
 {
     gchar *cmd_src, *cmd_dst;
     char *bootpath = g_strdup_printf("%s/bootsect", tmpfs);
+    char *extra_opts = NULL;
+    char *shmem_path = NULL;
     const char *arch = qtest_get_arch();
     const char *accel = "kvm:tcg";
 
     got_stop = false;
 
+    if (use_shmem) {
+        shmem_path = g_strdup_printf("/dev/shm/qemu-%d", getpid());
+    }
+
     if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
         init_bootfile(bootpath, x86_bootsect);
+        extra_opts = use_shmem ? get_shmem_opts("150M", shmem_path) : NULL;
         cmd_src = g_strdup_printf("-machine accel=%s -m 150M"
                                   " -name source,debug-threads=on"
                                   " -serial file:%s/src_serial"
-                                  " -drive file=%s,format=raw",
-                                  accel, tmpfs, bootpath);
+                                  " -drive file=%s,format=raw %s",
+                                  accel, tmpfs, bootpath,
+                                  extra_opts ? extra_opts : "");
         cmd_dst = g_strdup_printf("-machine accel=%s -m 150M"
                                   " -name target,debug-threads=on"
                                   " -serial file:%s/dest_serial"
                                   " -drive file=%s,format=raw"
-                                  " -incoming %s",
-                                  accel, tmpfs, bootpath, uri);
+                                  " -incoming %s %s",
+                                  accel, tmpfs, bootpath, uri,
+                                  extra_opts ? extra_opts : "");
         start_address = X86_TEST_MEM_START;
         end_address = X86_TEST_MEM_END;
     } else if (g_str_equal(arch, "s390x")) {
         init_bootfile_s390x(bootpath);
+        extra_opts = use_shmem ? get_shmem_opts("128M", shmem_path) : NULL;
         cmd_src = g_strdup_printf("-machine accel=%s -m 128M"
                                   " -name source,debug-threads=on"
-                                  " -serial file:%s/src_serial -bios %s",
-                                  accel, tmpfs, bootpath);
+                                  " -serial file:%s/src_serial -bios %s %s",
+                                  accel, tmpfs, bootpath,
+                                  extra_opts ? extra_opts : "");
         cmd_dst = g_strdup_printf("-machine accel=%s -m 128M"
                                   " -name target,debug-threads=on"
                                   " -serial file:%s/dest_serial -bios %s"
-                                  " -incoming %s",
-                                  accel, tmpfs, bootpath, uri);
+                                  " -incoming %s %s",
+                                  accel, tmpfs, bootpath, uri,
+                                  extra_opts ? extra_opts : "");
         start_address = S390_TEST_MEM_START;
         end_address = S390_TEST_MEM_END;
     } else if (strcmp(arch, "ppc64") == 0) {
+        extra_opts = use_shmem ? get_shmem_opts("256M", shmem_path) : NULL;
         cmd_src = g_strdup_printf("-machine accel=%s -m 256M -nodefaults"
                                   " -name source,debug-threads=on"
                                   " -serial file:%s/src_serial"
                                   " -prom-env 'use-nvramrc?=true' -prom-env "
                                   "'nvramrc=hex .\" _\" begin %x %x "
                                   "do i c@ 1 + i c! 1000 +loop .\" B\" 0 "
-                                  "until'",  accel, tmpfs, end_address,
-                                  start_address);
+                                  "until' %s",  accel, tmpfs, end_address,
+                                  start_address, extra_opts ? extra_opts : "");
         cmd_dst = g_strdup_printf("-machine accel=%s -m 256M"
                                   " -name target,debug-threads=on"
                                   " -serial file:%s/dest_serial"
-                                  " -incoming %s",
-                                  accel, tmpfs, uri);
+                                  " -incoming %s %s",
+                                  accel, tmpfs, uri,
+                                  extra_opts ? extra_opts : "");
 
         start_address = PPC_TEST_MEM_START;
         end_address = PPC_TEST_MEM_END;
     } else if (strcmp(arch, "aarch64") == 0) {
         init_bootfile(bootpath, aarch64_kernel);
+        extra_opts = use_shmem ? get_shmem_opts("150M", shmem_path) : NULL;
         cmd_src = g_strdup_printf("-machine virt,accel=%s,gic-version=max "
                                   "-name vmsource,debug-threads=on -cpu max "
                                   "-m 150M -serial file:%s/src_serial "
-                                  "-kernel %s ",
-                                  accel, tmpfs, bootpath);
+                                  "-kernel %s %s",
+                                  accel, tmpfs, bootpath,
+                                  extra_opts ? extra_opts : "");
         cmd_dst = g_strdup_printf("-machine virt,accel=%s,gic-version=max "
                                   "-name vmdest,debug-threads=on -cpu max "
                                   "-m 150M -serial file:%s/dest_serial "
                                   "-kernel %s "
-                                  "-incoming %s ",
-                                  accel, tmpfs, bootpath, uri);
+                                  "-incoming %s %s",
+                                  accel, tmpfs, bootpath, uri,
+                                  extra_opts ? extra_opts : "");
 
         start_address = ARM_TEST_MEM_START;
         end_address = ARM_TEST_MEM_END;
@@ -507,6 +532,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
     }
 
     g_free(bootpath);
+    g_free(extra_opts);
 
     if (hide_stderr) {
         gchar *tmp;
@@ -524,6 +550,16 @@ static int test_migrate_start(QTestState **from, QTestState **to,
 
     *to = qtest_init(cmd_dst);
     g_free(cmd_dst);
+
+    /*
+     * Remove shmem file immediately to avoid memory leak in test failed case.
+     * It's valid becase QEMU has already opened this file
+     */
+    if (use_shmem) {
+        unlink(shmem_path);
+        g_free(shmem_path);
+    }
+
     return 0;
 }
 
@@ -603,7 +639,7 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
     char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
     QTestState *from, *to;
 
-    if (test_migrate_start(&from, &to, uri, hide_error)) {
+    if (test_migrate_start(&from, &to, uri, hide_error, false)) {
         return -1;
     }
 
@@ -720,7 +756,7 @@ static void test_baddest(void)
     char *status;
     bool failed;
 
-    if (test_migrate_start(&from, &to, "tcp:0:0", true)) {
+    if (test_migrate_start(&from, &to, "tcp:0:0", true, false)) {
         return;
     }
     migrate(from, "tcp:0:0", "{}");
@@ -745,7 +781,7 @@ static void test_precopy_unix(void)
     char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
     QTestState *from, *to;
 
-    if (test_migrate_start(&from, &to, uri, false)) {
+    if (test_migrate_start(&from, &to, uri, false, false)) {
         return;
     }
 
@@ -781,6 +817,38 @@ static void test_precopy_unix(void)
     g_free(uri);
 }
 
+static void test_ignore_external(void)
+{
+    char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+    QTestState *from, *to;
+
+    if (test_migrate_start(&from, &to, uri, false, true)) {
+        return;
+    }
+
+    migrate_set_capability(from, "x-ignore-external", true);
+    migrate_set_capability(to, "x-ignore-external", true);
+
+    /* Wait for the first serial output from the source */
+    wait_for_serial("src_serial");
+
+    migrate(from, uri, "{}");
+
+    wait_for_migration_pass(from);
+
+    if (!got_stop) {
+        qtest_qmp_eventwait(from, "STOP");
+    }
+
+    qtest_qmp_eventwait(to, "RESUME");
+
+    wait_for_serial("dest_serial");
+    wait_for_migration_complete(from);
+
+    test_migrate_end(from, to, true);
+    g_free(uri);
+}
+
 int main(int argc, char **argv)
 {
     char template[] = "/tmp/migration-test-XXXXXX";
@@ -832,6 +900,7 @@ int main(int argc, char **argv)
     qtest_add_func("/migration/deprecated", test_deprecated);
     qtest_add_func("/migration/bad_dest", test_baddest);
     qtest_add_func("/migration/precopy/unix", test_precopy_unix);
+    qtest_add_func("/migration/ignore_external", test_ignore_external);
 
     ret = g_test_run();
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability
  2019-01-10 12:01 [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Yury Kotov
                   ` (3 preceding siblings ...)
  2019-01-10 12:01 ` [Qemu-devel] [PATCH 4/4] tests/migration-test: Add a test for " Yury Kotov
@ 2019-01-10 20:11 ` Dr. David Alan Gilbert
  2019-01-11 15:49   ` Yury Kotov
  2019-01-13 14:37 ` no-reply
  2019-01-13 23:57 ` no-reply
  6 siblings, 1 reply; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2019-01-10 20:11 UTC (permalink / raw)
  To: Yury Kotov
  Cc: qemu-devel, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela, Eric Blake,
	Markus Armbruster, Thomas Huth, Laurent Vivier, wrfsh

* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> Hi,
> 
> The series adds migration capability which allows to skip 'external' RAM blocks
> during migration. External block is a RAMBlock which available from the outside
> of current QEMU process (e.g. file in /dev/shm). It's useful for fast local
> migration to update QEMU for the running guests.

Hi Yury,
  There have been a few similar patch series around from people wanting
to do similar things.
  In particular Lai Jiangshan's https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg07511.html
and Cédric Le Goater wanted to skip regions for a different reason.

  We merged some of Cédric's code last year so that we now
have the qemu_ram_is_migratable() function - and we should be reusing
that to skip things rather than adding a new check that we have to add
everywhere.

  Also, ypu're skipping 'external' things, I think the other suggestion
was to skip 'shared' things (i.e. anything with share=0); skipping
share=on cases sounds easier to me.

Dave

> Patches:
> 1. Add offset validation to make sure that external RAM block has the same
>    physical offset on target side,
> 2. Add RAM_EXTERNAL flag to determine external RAM blocks,
> 3. Add ignore-external migration capability,
> 4. Add a test.
> 
> Usage example:
> 1. Start source VM:
>    qemu-system-x86 \
>      -m 4G \
>      -object memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
>      -numa node,memdev=mem0 \
>      -qmp unix:/tmp/qemu-qmp-1.sock,server,nowait \
> 
> 2. Start target VM:
>    qemu-system-x86 \
>      -m 4G \
>      -object memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
>      -numa node,memdev=mem0 \
>      -qmp unix:/tmp/qemu-qmp-2.sock,server,nowait \
>      -incoming defer
> 
> 3. Enable ignore-external capability on both VMs:
>    { "execute": "migrate-set-capabilities" , "arguments":
>      { "capabilities": [ { "capability": "x-ignore-external", "state": true } ] } }
> 
> 4. Start migration.
> 
> Regards,
> Yury
> 
> Yury Kotov (4):
>   migration: add RAMBlock's offset validation
>   exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks
>   migration: introduce ignore-external capability
>   tests/migration-test: Add a test for ignore-external capability
> 
>  backends/hostmem-file.c   |   3 +-
>  exec.c                    |   7 ++-
>  include/exec/cpu-common.h |   1 +
>  include/exec/memory.h     |   3 ++
>  migration/migration.c     |   9 ++++
>  migration/migration.h     |   1 +
>  migration/ram.c           |  52 ++++++++++++++++--
>  numa.c                    |   4 +-
>  qapi/migration.json       |   6 ++-
>  tests/migration-test.c    | 109 +++++++++++++++++++++++++++++++-------
>  10 files changed, 165 insertions(+), 30 deletions(-)
> 
> -- 
> 2.20.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation
  2019-01-10 12:01 ` [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation Yury Kotov
@ 2019-01-10 20:14   ` Dr. David Alan Gilbert
  2019-01-11 10:06     ` Igor Mammedov
  2019-01-11 16:38     ` Yury Kotov
  0 siblings, 2 replies; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2019-01-10 20:14 UTC (permalink / raw)
  To: Yury Kotov
  Cc: qemu-devel, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela, Eric Blake,
	Markus Armbruster, Thomas Huth, Laurent Vivier, wrfsh

* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> RAM migration has a RAMBlock validation stage (flag RAM_SAVE_FLAG_MEM_SIZE).
> In this stage QEMU checks further information about RAMBlock:
> 1. Presence (by idstr),
> 2. Length (trying to resize, when differs),
> 3. Optional page size.
> 
> This patch adds a check for RAMBlock's offset. Currently we check it during
> RAM pages loading - every RAM page has an offset in its header. But there is a
> case when we don't send RAM pages (see below).
> 
> The following commits introduce a capability (ignore-external) to skip some
> RAM blocks from migration. In such case the migration stream contains only
> meta information about RAM blocks to validate them. So, the only way to check
> block's offset is to send it explicitly.
> 
> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>

But why check that offsets match?  THey aren't supposed to!
Offset's are entirely private to each qemu and they're allowed to be
different; the only requirement is that the length and name of each
RAMBlock matches, then all the operations we do over the migration
stream are relative to the start of the block.

One example where they are validly different is where you hotplug some
RAM, so for example:


  source qemu
      -M 4G
      hotplug PCI card
      hotplug 2G

  destination qemu
      -M 4G
      PCI card declared on the command line
      extra 2G declared on the command line

The offsets are different but we can migrate that case fine.

Dave

> ---
>  migration/ram.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 7e7deec4d8..39629254e1 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3171,6 +3171,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>          if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
>              qemu_put_be64(f, block->page_size);
>          }
> +        qemu_put_be64(f, block->offset);
>      }
>  
>      rcu_read_unlock();
> @@ -4031,7 +4032,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>  
>      seq_iter++;
>  
> -    if (version_id != 4) {
> +    if (version_id < 4) {
>          ret = -EINVAL;
>      }
>  
> @@ -4132,6 +4133,16 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>                              ret = -EINVAL;
>                          }
>                      }
> +                    if (version_id >= 5) {
> +                        ram_addr_t offset;
> +                        offset = qemu_get_be64(f);
> +                        if (block->offset != offset) {
> +                            error_report("Mismatched RAM block offset %s "
> +                                         "%" PRId64 "!= %" PRId64,
> +                                         id, offset, (uint64_t)block->offset);
> +                            ret = -EINVAL;
> +                        }
> +                    }
>                      ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
>                                            block->idstr);
>                  } else {
> @@ -4363,5 +4374,5 @@ static SaveVMHandlers savevm_ram_handlers = {
>  void ram_mig_init(void)
>  {
>      qemu_mutex_init(&XBZRLE.lock);
> -    register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, &ram_state);
> +    register_savevm_live(NULL, "ram", 0, 5, &savevm_ram_handlers, &ram_state);
>  }
> -- 
> 2.20.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks
  2019-01-10 12:01 ` [Qemu-devel] [PATCH 2/4] exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks Yury Kotov
@ 2019-01-10 20:14   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2019-01-10 20:14 UTC (permalink / raw)
  To: Yury Kotov
  Cc: qemu-devel, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela, Eric Blake,
	Markus Armbruster, Thomas Huth, Laurent Vivier, wrfsh

* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> This flag allows to determine whether RAM block is available from the outside.
> E.g. when we use -object memory-backend-file or -mem-path options we have
> a RAM block which is mapped to shared file.
> 
> We need this flag in the following commits.

As I said in the cover letter, why not just use shared?

Dave

> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
> ---
>  backends/hostmem-file.c | 3 ++-
>  exec.c                  | 2 +-
>  include/exec/memory.h   | 3 +++
>  numa.c                  | 4 ++--
>  4 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> index 7a34e25c43..37fe28f2ac 100644
> --- a/backends/hostmem-file.c
> +++ b/backends/hostmem-file.c
> @@ -63,7 +63,8 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>                                       name,
>                                       backend->size, fb->align,
>                                       (backend->share ? RAM_SHARED : 0) |
> -                                     (fb->is_pmem ? RAM_PMEM : 0),
> +                                     (fb->is_pmem ? RAM_PMEM : 0) |
> +                                     RAM_EXTERNAL,
>                                       fb->mem_path, errp);
>      g_free(name);
>  #endif
> diff --git a/exec.c b/exec.c
> index 6e875f0640..ef2f29d7cb 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2241,7 +2241,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
>      int64_t file_size;
>  
>      /* Just support these ram flags by now. */
> -    assert((ram_flags & ~(RAM_SHARED | RAM_PMEM)) == 0);
> +    assert((ram_flags & ~(RAM_SHARED | RAM_PMEM | RAM_EXTERNAL)) == 0);
>  
>      if (xen_enabled()) {
>          error_setg(errp, "-mem-path not supported with Xen");
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index ffd23ed8d8..3a9cb34f1e 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -126,6 +126,9 @@ typedef struct IOMMUNotifier IOMMUNotifier;
>  /* RAM is a persistent kind memory */
>  #define RAM_PMEM (1 << 5)
>  
> +/* RAM is from external source (e.g. from file) */
> +#define RAM_EXTERNAL (1 << 6)
> +
>  static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
>                                         IOMMUNotifierFlag flags,
>                                         hwaddr start, hwaddr end,
> diff --git a/numa.c b/numa.c
> index 50ec016013..653c5a08de 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -482,8 +482,8 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
>      if (mem_path) {
>  #ifdef __linux__
>          Error *err = NULL;
> -        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0, 0,
> -                                         mem_path, &err);
> +        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0,
> +                                         RAM_EXTERNAL, mem_path, &err);
>          if (err) {
>              error_report_err(err);
>              if (mem_prealloc) {
> -- 
> 2.20.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation
  2019-01-10 20:14   ` Dr. David Alan Gilbert
@ 2019-01-11 10:06     ` Igor Mammedov
  2019-01-11 10:58       ` Dr. David Alan Gilbert
  2019-01-11 16:38     ` Yury Kotov
  1 sibling, 1 reply; 22+ messages in thread
From: Igor Mammedov @ 2019-01-11 10:06 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Yury Kotov, qemu-devel, Eduardo Habkost, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela, Eric Blake,
	Markus Armbruster, Thomas Huth, Laurent Vivier, wrfsh

On Thu, 10 Jan 2019 20:14:19 +0000
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> > RAM migration has a RAMBlock validation stage (flag RAM_SAVE_FLAG_MEM_SIZE).
> > In this stage QEMU checks further information about RAMBlock:
> > 1. Presence (by idstr),
> > 2. Length (trying to resize, when differs),
> > 3. Optional page size.
> > 
> > This patch adds a check for RAMBlock's offset. Currently we check it during
> > RAM pages loading - every RAM page has an offset in its header. But there is a
> > case when we don't send RAM pages (see below).
> > 
> > The following commits introduce a capability (ignore-external) to skip some
> > RAM blocks from migration. In such case the migration stream contains only
> > meta information about RAM blocks to validate them. So, the only way to check
> > block's offset is to send it explicitly.
> > 
> > Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>  
> 
> But why check that offsets match?  THey aren't supposed to!
> Offset's are entirely private to each qemu and they're allowed to be
> different; the only requirement is that the length and name of each
> RAMBlock matches, then all the operations we do over the migration
> stream are relative to the start of the block.
> 
> One example where they are validly different is where you hotplug some
> RAM, so for example:
> 
> 
>   source qemu
>       -M 4G
>       hotplug PCI card
>       hotplug 2G
> 
>   destination qemu
>       -M 4G
>       PCI card declared on the command line
>       extra 2G declared on the command line
> 
> The offsets are different but we can migrate that case fine.
PCI mappings are updated after migration is done, to make DST match SRC
(get_pci_config_device) that mapping is irrelevant here, but I was under
impression that offset in terms of MemoryRegion (GPA) was passed in
migration stream, even if we do not update memory mappings for RAM
(mapping depends on hotplug/CLI order or explicit addr on CLI for memory
devices).

> 
> Dave
> 
> > ---
> >  migration/ram.c | 15 +++++++++++++--
> >  1 file changed, 13 insertions(+), 2 deletions(-)
> > 
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 7e7deec4d8..39629254e1 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -3171,6 +3171,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >          if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
> >              qemu_put_be64(f, block->page_size);
> >          }
> > +        qemu_put_be64(f, block->offset);
> >      }
> >  
> >      rcu_read_unlock();
> > @@ -4031,7 +4032,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> >  
> >      seq_iter++;
> >  
> > -    if (version_id != 4) {
> > +    if (version_id < 4) {
> >          ret = -EINVAL;
> >      }
> >  
> > @@ -4132,6 +4133,16 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> >                              ret = -EINVAL;
> >                          }
> >                      }
> > +                    if (version_id >= 5) {
> > +                        ram_addr_t offset;
> > +                        offset = qemu_get_be64(f);
> > +                        if (block->offset != offset) {
> > +                            error_report("Mismatched RAM block offset %s "
> > +                                         "%" PRId64 "!= %" PRId64,
> > +                                         id, offset, (uint64_t)block->offset);
> > +                            ret = -EINVAL;
> > +                        }
> > +                    }
> >                      ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
> >                                            block->idstr);
> >                  } else {
> > @@ -4363,5 +4374,5 @@ static SaveVMHandlers savevm_ram_handlers = {
> >  void ram_mig_init(void)
> >  {
> >      qemu_mutex_init(&XBZRLE.lock);
> > -    register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, &ram_state);
> > +    register_savevm_live(NULL, "ram", 0, 5, &savevm_ram_handlers, &ram_state);
> >  }
> > -- 
> > 2.20.1
> >   
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation
  2019-01-11 10:06     ` Igor Mammedov
@ 2019-01-11 10:58       ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2019-01-11 10:58 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Yury Kotov, qemu-devel, Eduardo Habkost, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela, Eric Blake,
	Markus Armbruster, Thomas Huth, Laurent Vivier, wrfsh

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Thu, 10 Jan 2019 20:14:19 +0000
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> > > RAM migration has a RAMBlock validation stage (flag RAM_SAVE_FLAG_MEM_SIZE).
> > > In this stage QEMU checks further information about RAMBlock:
> > > 1. Presence (by idstr),
> > > 2. Length (trying to resize, when differs),
> > > 3. Optional page size.
> > > 
> > > This patch adds a check for RAMBlock's offset. Currently we check it during
> > > RAM pages loading - every RAM page has an offset in its header. But there is a
> > > case when we don't send RAM pages (see below).
> > > 
> > > The following commits introduce a capability (ignore-external) to skip some
> > > RAM blocks from migration. In such case the migration stream contains only
> > > meta information about RAM blocks to validate them. So, the only way to check
> > > block's offset is to send it explicitly.
> > > 
> > > Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>  
> > 
> > But why check that offsets match?  THey aren't supposed to!
> > Offset's are entirely private to each qemu and they're allowed to be
> > different; the only requirement is that the length and name of each
> > RAMBlock matches, then all the operations we do over the migration
> > stream are relative to the start of the block.
> > 
> > One example where they are validly different is where you hotplug some
> > RAM, so for example:
> > 
> > 
> >   source qemu
> >       -M 4G
> >       hotplug PCI card
> >       hotplug 2G
> > 
> >   destination qemu
> >       -M 4G
> >       PCI card declared on the command line
> >       extra 2G declared on the command line
> > 
> > The offsets are different but we can migrate that case fine.
> PCI mappings are updated after migration is done, to make DST match SRC
> (get_pci_config_device) that mapping is irrelevant here, but I was under
> impression that offset in terms of MemoryRegion (GPA) was passed in
> migration stream, even if we do not update memory mappings for RAM
> (mapping depends on hotplug/CLI order or explicit addr on CLI for memory
> devices).

The GPA must match, but the offset within ram_addr_t space is abstract
and isn't visible to the guest.

Dave

> > 
> > Dave
> > 
> > > ---
> > >  migration/ram.c | 15 +++++++++++++--
> > >  1 file changed, 13 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index 7e7deec4d8..39629254e1 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -3171,6 +3171,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> > >          if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
> > >              qemu_put_be64(f, block->page_size);
> > >          }
> > > +        qemu_put_be64(f, block->offset);
> > >      }
> > >  
> > >      rcu_read_unlock();
> > > @@ -4031,7 +4032,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > >  
> > >      seq_iter++;
> > >  
> > > -    if (version_id != 4) {
> > > +    if (version_id < 4) {
> > >          ret = -EINVAL;
> > >      }
> > >  
> > > @@ -4132,6 +4133,16 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > >                              ret = -EINVAL;
> > >                          }
> > >                      }
> > > +                    if (version_id >= 5) {
> > > +                        ram_addr_t offset;
> > > +                        offset = qemu_get_be64(f);
> > > +                        if (block->offset != offset) {
> > > +                            error_report("Mismatched RAM block offset %s "
> > > +                                         "%" PRId64 "!= %" PRId64,
> > > +                                         id, offset, (uint64_t)block->offset);
> > > +                            ret = -EINVAL;
> > > +                        }
> > > +                    }
> > >                      ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
> > >                                            block->idstr);
> > >                  } else {
> > > @@ -4363,5 +4374,5 @@ static SaveVMHandlers savevm_ram_handlers = {
> > >  void ram_mig_init(void)
> > >  {
> > >      qemu_mutex_init(&XBZRLE.lock);
> > > -    register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, &ram_state);
> > > +    register_savevm_live(NULL, "ram", 0, 5, &savevm_ram_handlers, &ram_state);
> > >  }
> > > -- 
> > > 2.20.1
> > >   
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability
  2019-01-10 20:11 ` [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Dr. David Alan Gilbert
@ 2019-01-11 15:49   ` Yury Kotov
  2019-01-11 20:09     ` Dr. David Alan Gilbert
  2019-01-11 20:55     ` Eduardo Habkost
  0 siblings, 2 replies; 22+ messages in thread
From: Yury Kotov @ 2019-01-11 15:49 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Peter Crosthwaite,
	Juan Quintela, qemu-devel, Markus Armbruster, Paolo Bonzini,
	Igor Mammedov, wrfsh, Richard Henderson

10.01.2019, 23:12, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>>  Hi,
>>
>>  The series adds migration capability which allows to skip 'external' RAM blocks
>>  during migration. External block is a RAMBlock which available from the outside
>>  of current QEMU process (e.g. file in /dev/shm). It's useful for fast local
>>  migration to update QEMU for the running guests.
>
> Hi Yury,
>   There have been a few similar patch series around from people wanting
> to do similar things.
>   In particular Lai Jiangshan's https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg07511.html
> and Cédric Le Goater wanted to skip regions for a different reason.
>
>   We merged some of Cédric's code last year so that we now
> have the qemu_ram_is_migratable() function - and we should be reusing
> that to skip things rather than adding a new check that we have to add
> everywhere.
>

I didn't see the series, so I'll check it, thanks!
But I saw qemu_ram_is_migratable() function and corresponding patch.
It's very close to my needs, but it works a bit different IIUC:
1. Not migratable blocks isn't validated (existence and size) during migration,
2. "Migratable" state is determined during the block creation time.
   Such case isn't valid because of it:
   * Source has one migratable and one not migratable RAM blocks,
   * Target has the same (idstr) blocks, but both are not migratable.
   Thus, target will not expect pages for not migratable blocks.

>   Also, ypu're skipping 'external' things, I think the other suggestion
> was to skip 'shared' things (i.e. anything with share=0); skipping
> share=on cases sounds easier to me.

I agree that introducing new term is a complication, but 'share' and 'external'
terms have important differences (I'll describe it below).

Just to clarify:
* 'share' means that other processes has an access to such memory,
* 'external' means file backed memory.

There is another use case I wanted to support (I had to write about it in
the cover letter, sorry..):
1. Migrate source VM to file and kill source,
2. Start target VM and migrate it from file.
In such case source VM may have memory-backend-ram with share=off, it's ok.

Thus, in the new migration capability I want to migrate memory that meets
three conditions:
1. The source will not use the memory after migration ends,
2. The source may exit before target starts (migrate to file),
3. The target has an access to the memory.

I think 'external' fits them better than 'share'.

>
> Dave
>
>>  Patches:
>>  1. Add offset validation to make sure that external RAM block has the same
>>     physical offset on target side,
>>  2. Add RAM_EXTERNAL flag to determine external RAM blocks,
>>  3. Add ignore-external migration capability,
>>  4. Add a test.
>>
>>  Usage example:
>>  1. Start source VM:
>>     qemu-system-x86 \
>>       -m 4G \
>>       -object memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
>>       -numa node,memdev=mem0 \
>>       -qmp unix:/tmp/qemu-qmp-1.sock,server,nowait \
>>
>>  2. Start target VM:
>>     qemu-system-x86 \
>>       -m 4G \
>>       -object memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
>>       -numa node,memdev=mem0 \
>>       -qmp unix:/tmp/qemu-qmp-2.sock,server,nowait \
>>       -incoming defer
>>
>>  3. Enable ignore-external capability on both VMs:
>>     { "execute": "migrate-set-capabilities" , "arguments":
>>       { "capabilities": [ { "capability": "x-ignore-external", "state": true } ] } }
>>
>>  4. Start migration.
>>
>>  Regards,
>>  Yury
>>
>>  Yury Kotov (4):
>>    migration: add RAMBlock's offset validation
>>    exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks
>>    migration: introduce ignore-external capability
>>    tests/migration-test: Add a test for ignore-external capability
>>
>>   backends/hostmem-file.c | 3 +-
>>   exec.c | 7 ++-
>>   include/exec/cpu-common.h | 1 +
>>   include/exec/memory.h | 3 ++
>>   migration/migration.c | 9 ++++
>>   migration/migration.h | 1 +
>>   migration/ram.c | 52 ++++++++++++++++--
>>   numa.c | 4 +-
>>   qapi/migration.json | 6 ++-
>>   tests/migration-test.c | 109 +++++++++++++++++++++++++++++++-------
>>   10 files changed, 165 insertions(+), 30 deletions(-)
>>
>>  --
>>  2.20.1
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Regards,
Yury

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation
  2019-01-10 20:14   ` Dr. David Alan Gilbert
  2019-01-11 10:06     ` Igor Mammedov
@ 2019-01-11 16:38     ` Yury Kotov
  2019-01-11 18:25       ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 22+ messages in thread
From: Yury Kotov @ 2019-01-11 16:38 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela, Eric Blake,
	Markus Armbruster, Thomas Huth, Laurent Vivier, wrfsh

10.01.2019, 23:14, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>>  RAM migration has a RAMBlock validation stage (flag RAM_SAVE_FLAG_MEM_SIZE).
>>  In this stage QEMU checks further information about RAMBlock:
>>  1. Presence (by idstr),
>>  2. Length (trying to resize, when differs),
>>  3. Optional page size.
>>
>>  This patch adds a check for RAMBlock's offset. Currently we check it during
>>  RAM pages loading - every RAM page has an offset in its header. But there is a
>>  case when we don't send RAM pages (see below).
>>
>>  The following commits introduce a capability (ignore-external) to skip some
>>  RAM blocks from migration. In such case the migration stream contains only
>>  meta information about RAM blocks to validate them. So, the only way to check
>>  block's offset is to send it explicitly.
>>
>>  Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
>
> But why check that offsets match? THey aren't supposed to!
> Offset's are entirely private to each qemu and they're allowed to be
> different; the only requirement is that the length and name of each
> RAMBlock matches, then all the operations we do over the migration
> stream are relative to the start of the block.
>

Yes, you are right. It seems that instead I should check block->mr->addr.

>
> One example where they are validly different is where you hotplug some
> RAM, so for example:
>
>   source qemu
>       -M 4G
>       hotplug PCI card
>       hotplug 2G
>
>   destination qemu
>       -M 4G
>       PCI card declared on the command line
>       extra 2G declared on the command line
>
> The offsets are different but we can migrate that case fine.
>
> Dave
>
>>  ---
>>   migration/ram.c | 15 +++++++++++++--
>>   1 file changed, 13 insertions(+), 2 deletions(-)
>>
>>  diff --git a/migration/ram.c b/migration/ram.c
>>  index 7e7deec4d8..39629254e1 100644
>>  --- a/migration/ram.c
>>  +++ b/migration/ram.c
>>  @@ -3171,6 +3171,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>           if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
>>               qemu_put_be64(f, block->page_size);
>>           }
>>  + qemu_put_be64(f, block->offset);
>>       }
>>
>>       rcu_read_unlock();
>>  @@ -4031,7 +4032,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>
>>       seq_iter++;
>>
>>  - if (version_id != 4) {
>>  + if (version_id < 4) {
>>           ret = -EINVAL;
>>       }
>>
>>  @@ -4132,6 +4133,16 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>                               ret = -EINVAL;
>>                           }
>>                       }
>>  + if (version_id >= 5) {
>>  + ram_addr_t offset;
>>  + offset = qemu_get_be64(f);
>>  + if (block->offset != offset) {
>>  + error_report("Mismatched RAM block offset %s "
>>  + "%" PRId64 "!= %" PRId64,
>>  + id, offset, (uint64_t)block->offset);
>>  + ret = -EINVAL;
>>  + }
>>  + }
>>                       ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
>>                                             block->idstr);
>>                   } else {
>>  @@ -4363,5 +4374,5 @@ static SaveVMHandlers savevm_ram_handlers = {
>>   void ram_mig_init(void)
>>   {
>>       qemu_mutex_init(&XBZRLE.lock);
>>  - register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, &ram_state);
>>  + register_savevm_live(NULL, "ram", 0, 5, &savevm_ram_handlers, &ram_state);
>>   }
>>  --
>>  2.20.1
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Regards,
Yury

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation
  2019-01-11 16:38     ` Yury Kotov
@ 2019-01-11 18:25       ` Dr. David Alan Gilbert
  2019-01-14 12:58         ` Yury Kotov
  0 siblings, 1 reply; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2019-01-11 18:25 UTC (permalink / raw)
  To: Yury Kotov
  Cc: qemu-devel, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela, Eric Blake,
	Markus Armbruster, Thomas Huth, Laurent Vivier, wrfsh

* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> 10.01.2019, 23:14, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> >>  RAM migration has a RAMBlock validation stage (flag RAM_SAVE_FLAG_MEM_SIZE).
> >>  In this stage QEMU checks further information about RAMBlock:
> >>  1. Presence (by idstr),
> >>  2. Length (trying to resize, when differs),
> >>  3. Optional page size.
> >>
> >>  This patch adds a check for RAMBlock's offset. Currently we check it during
> >>  RAM pages loading - every RAM page has an offset in its header. But there is a
> >>  case when we don't send RAM pages (see below).
> >>
> >>  The following commits introduce a capability (ignore-external) to skip some
> >>  RAM blocks from migration. In such case the migration stream contains only
> >>  meta information about RAM blocks to validate them. So, the only way to check
> >>  block's offset is to send it explicitly.
> >>
> >>  Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
> >
> > But why check that offsets match? THey aren't supposed to!
> > Offset's are entirely private to each qemu and they're allowed to be
> > different; the only requirement is that the length and name of each
> > RAMBlock matches, then all the operations we do over the migration
> > stream are relative to the start of the block.
> >
> 
> Yes, you are right. It seems that instead I should check block->mr->addr.

I don't think that's guaranteed to be the same either;  for example
a video buffer or ROM on a  PCI card gets mapped into different parts of
the guests physical address space depending on writes into the PCI
bars.  Some RAMBlocks dont even have a mapping associated with them.

If you do want to add something here, please don't increment the version
- unless we're desperate I want to keep the version number the same so
that backwards migration works.

Dave

> >
> > One example where they are validly different is where you hotplug some
> > RAM, so for example:
> >
> >   source qemu
> >       -M 4G
> >       hotplug PCI card
> >       hotplug 2G
> >
> >   destination qemu
> >       -M 4G
> >       PCI card declared on the command line
> >       extra 2G declared on the command line
> >
> > The offsets are different but we can migrate that case fine.
> >
> > Dave
> >
> >>  ---
> >>   migration/ram.c | 15 +++++++++++++--
> >>   1 file changed, 13 insertions(+), 2 deletions(-)
> >>
> >>  diff --git a/migration/ram.c b/migration/ram.c
> >>  index 7e7deec4d8..39629254e1 100644
> >>  --- a/migration/ram.c
> >>  +++ b/migration/ram.c
> >>  @@ -3171,6 +3171,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >>           if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
> >>               qemu_put_be64(f, block->page_size);
> >>           }
> >>  + qemu_put_be64(f, block->offset);
> >>       }
> >>
> >>       rcu_read_unlock();
> >>  @@ -4031,7 +4032,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> >>
> >>       seq_iter++;
> >>
> >>  - if (version_id != 4) {
> >>  + if (version_id < 4) {
> >>           ret = -EINVAL;
> >>       }
> >>
> >>  @@ -4132,6 +4133,16 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> >>                               ret = -EINVAL;
> >>                           }
> >>                       }
> >>  + if (version_id >= 5) {
> >>  + ram_addr_t offset;
> >>  + offset = qemu_get_be64(f);
> >>  + if (block->offset != offset) {
> >>  + error_report("Mismatched RAM block offset %s "
> >>  + "%" PRId64 "!= %" PRId64,
> >>  + id, offset, (uint64_t)block->offset);
> >>  + ret = -EINVAL;
> >>  + }
> >>  + }
> >>                       ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
> >>                                             block->idstr);
> >>                   } else {
> >>  @@ -4363,5 +4374,5 @@ static SaveVMHandlers savevm_ram_handlers = {
> >>   void ram_mig_init(void)
> >>   {
> >>       qemu_mutex_init(&XBZRLE.lock);
> >>  - register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, &ram_state);
> >>  + register_savevm_live(NULL, "ram", 0, 5, &savevm_ram_handlers, &ram_state);
> >>   }
> >>  --
> >>  2.20.1
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> Regards,
> Yury
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability
  2019-01-11 15:49   ` Yury Kotov
@ 2019-01-11 20:09     ` Dr. David Alan Gilbert
  2019-01-14 15:16       ` Yury Kotov
  2019-01-21 14:09       ` Yury Kotov
  2019-01-11 20:55     ` Eduardo Habkost
  1 sibling, 2 replies; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2019-01-11 20:09 UTC (permalink / raw)
  To: Yury Kotov, clg
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Peter Crosthwaite,
	Juan Quintela, qemu-devel, Markus Armbruster, Paolo Bonzini,
	Igor Mammedov, wrfsh, Richard Henderson

* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> 10.01.2019, 23:12, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> >>  Hi,
> >>
> >>  The series adds migration capability which allows to skip 'external' RAM blocks
> >>  during migration. External block is a RAMBlock which available from the outside
> >>  of current QEMU process (e.g. file in /dev/shm). It's useful for fast local
> >>  migration to update QEMU for the running guests.
> >
> > Hi Yury,
> >   There have been a few similar patch series around from people wanting
> > to do similar things.
> >   In particular Lai Jiangshan's https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg07511.html
> > and Cédric Le Goater wanted to skip regions for a different reason.
> >
> >   We merged some of Cédric's code last year so that we now
> > have the qemu_ram_is_migratable() function - and we should be reusing
> > that to skip things rather than adding a new check that we have to add
> > everywhere.
> >
> 
> I didn't see the series, so I'll check it, thanks!
> But I saw qemu_ram_is_migratable() function and corresponding patch.
> It's very close to my needs, but it works a bit different IIUC:
> 1. Not migratable blocks isn't validated (existence and size) during migration,
> 2. "Migratable" state is determined during the block creation time.
>    Such case isn't valid because of it:
>    * Source has one migratable and one not migratable RAM blocks,
>    * Target has the same (idstr) blocks, but both are not migratable.
>    Thus, target will not expect pages for not migratable blocks.

I've added Cédric to the mail;
there were other cases people were interested in, including switching
it dynamically, just no one else used it yet.

I'd prefer it if you did modify the is_migratable - that will fix a lot
of other places in the migration code to avoid your blocks;  I think the
best thing is for you to add a spearate 'needs_migration_check' for
those blocks like yours which you want to check but you don't send the
data.  Please don't change the format of the migratino stream except
in the case where you are sending those to be checked.

> >   Also, ypu're skipping 'external' things, I think the other suggestion
> > was to skip 'shared' things (i.e. anything with share=0); skipping
> > share=on cases sounds easier to me.
> 
> I agree that introducing new term is a complication, but 'share' and 'external'
> terms have important differences (I'll describe it below).
> 
> Just to clarify:
> * 'share' means that other processes has an access to such memory,
> * 'external' means file backed memory.
> 
> There is another use case I wanted to support (I had to write about it in
> the cover letter, sorry..):
> 1. Migrate source VM to file and kill source,
> 2. Start target VM and migrate it from file.
> In such case source VM may have memory-backend-ram with share=off, it's ok.
> 
> Thus, in the new migration capability I want to migrate memory that meets
> three conditions:
> 1. The source will not use the memory after migration ends,
> 2. The source may exit before target starts (migrate to file),
> 3. The target has an access to the memory.
> 
> I think 'external' fits them better than 'share'.

Are you sure that with share=off (the default), in the case where the
source shuts down, that the changes have been written to the backing
file?

I'm also wondering if perhaps we'd be better having an explicit
migrate=off property on memory objects rather than trying to guess from
the share= or the fact it's an external path.

Igor: Does that make sense to you?

Dave

> >
> > Dave
> >
> >>  Patches:
> >>  1. Add offset validation to make sure that external RAM block has the same
> >>     physical offset on target side,
> >>  2. Add RAM_EXTERNAL flag to determine external RAM blocks,
> >>  3. Add ignore-external migration capability,
> >>  4. Add a test.
> >>
> >>  Usage example:
> >>  1. Start source VM:
> >>     qemu-system-x86 \
> >>       -m 4G \
> >>       -object memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
> >>       -numa node,memdev=mem0 \
> >>       -qmp unix:/tmp/qemu-qmp-1.sock,server,nowait \
> >>
> >>  2. Start target VM:
> >>     qemu-system-x86 \
> >>       -m 4G \
> >>       -object memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
> >>       -numa node,memdev=mem0 \
> >>       -qmp unix:/tmp/qemu-qmp-2.sock,server,nowait \
> >>       -incoming defer
> >>
> >>  3. Enable ignore-external capability on both VMs:
> >>     { "execute": "migrate-set-capabilities" , "arguments":
> >>       { "capabilities": [ { "capability": "x-ignore-external", "state": true } ] } }
> >>
> >>  4. Start migration.
> >>
> >>  Regards,
> >>  Yury
> >>
> >>  Yury Kotov (4):
> >>    migration: add RAMBlock's offset validation
> >>    exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks
> >>    migration: introduce ignore-external capability
> >>    tests/migration-test: Add a test for ignore-external capability
> >>
> >>   backends/hostmem-file.c | 3 +-
> >>   exec.c | 7 ++-
> >>   include/exec/cpu-common.h | 1 +
> >>   include/exec/memory.h | 3 ++
> >>   migration/migration.c | 9 ++++
> >>   migration/migration.h | 1 +
> >>   migration/ram.c | 52 ++++++++++++++++--
> >>   numa.c | 4 +-
> >>   qapi/migration.json | 6 ++-
> >>   tests/migration-test.c | 109 +++++++++++++++++++++++++++++++-------
> >>   10 files changed, 165 insertions(+), 30 deletions(-)
> >>
> >>  --
> >>  2.20.1
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> Regards,
> Yury
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability
  2019-01-11 15:49   ` Yury Kotov
  2019-01-11 20:09     ` Dr. David Alan Gilbert
@ 2019-01-11 20:55     ` Eduardo Habkost
  2019-01-14 15:31       ` Yury Kotov
  1 sibling, 1 reply; 22+ messages in thread
From: Eduardo Habkost @ 2019-01-11 20:55 UTC (permalink / raw)
  To: Yury Kotov
  Cc: Dr. David Alan Gilbert, Laurent Vivier, Thomas Huth,
	Peter Crosthwaite, Juan Quintela, qemu-devel, Markus Armbruster,
	Paolo Bonzini, Igor Mammedov, wrfsh, Richard Henderson

On Fri, Jan 11, 2019 at 06:49:53PM +0300, Yury Kotov wrote:
> 10.01.2019, 23:12, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> >>  Hi,
> >>
> >>  The series adds migration capability which allows to skip 'external' RAM blocks
> >>  during migration. External block is a RAMBlock which available from the outside
> >>  of current QEMU process (e.g. file in /dev/shm). It's useful for fast local
> >>  migration to update QEMU for the running guests.
> >
> > Hi Yury,
> >   There have been a few similar patch series around from people wanting
> > to do similar things.
> >   In particular Lai Jiangshan's https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg07511.html
> > and Cédric Le Goater wanted to skip regions for a different reason.
> >
> >   We merged some of Cédric's code last year so that we now
> > have the qemu_ram_is_migratable() function - and we should be reusing
> > that to skip things rather than adding a new check that we have to add
> > everywhere.
> >
> 
> I didn't see the series, so I'll check it, thanks!
> But I saw qemu_ram_is_migratable() function and corresponding patch.
> It's very close to my needs, but it works a bit different IIUC:
> 1. Not migratable blocks isn't validated (existence and size) during migration,
> 2. "Migratable" state is determined during the block creation time.
>    Such case isn't valid because of it:
>    * Source has one migratable and one not migratable RAM blocks,
>    * Target has the same (idstr) blocks, but both are not migratable.
>    Thus, target will not expect pages for not migratable blocks.
> 
> >   Also, ypu're skipping 'external' things, I think the other suggestion
> > was to skip 'shared' things (i.e. anything with share=0); skipping
> > share=on cases sounds easier to me.
> 
> I agree that introducing new term is a complication, but 'share' and 'external'
> terms have important differences (I'll describe it below).
> 
> Just to clarify:
> * 'share' means that other processes has an access to such memory,
> * 'external' means file backed memory.

If you use file backed memory with share=off, writes are not
propagated to the file (they are mapped with MAP_PRIVATE).  Would
you really want to skip file backed memory if it has share=off?

> 
> There is another use case I wanted to support (I had to write about it in
> the cover letter, sorry..):
> 1. Migrate source VM to file and kill source,
> 2. Start target VM and migrate it from file.
> In such case source VM may have memory-backend-ram with share=off, it's ok.
> 
> Thus, in the new migration capability I want to migrate memory that meets
> three conditions:
> 1. The source will not use the memory after migration ends,
> 2. The source may exit before target starts (migrate to file),
> 3. The target has an access to the memory.
> 
> I think 'external' fits them better than 'share'.
> 

In either case, defining "external" seems tricky.  A memory
region might be backed by a file on tmpfs or hugetlbfs that was
deleted, which makes the file "internal" for practical purposes.
QEMU has no way to tell if (3) is really true.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability
  2019-01-10 12:01 [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Yury Kotov
                   ` (4 preceding siblings ...)
  2019-01-10 20:11 ` [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Dr. David Alan Gilbert
@ 2019-01-13 14:37 ` no-reply
  2019-01-13 23:57 ` no-reply
  6 siblings, 0 replies; 22+ messages in thread
From: no-reply @ 2019-01-13 14:37 UTC (permalink / raw)
  To: yury-kotov
  Cc: fam, qemu-devel, ehabkost, imammedo, pbonzini, crosthwaite.peter,
	rth, quintela, dgilbert, eblake, armbru, thuth, lvivier, wrfsh

Patchew URL: https://patchew.org/QEMU/20190110120120.9943-1-yury-kotov@yandex-team.ru/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
time make docker-test-quick@centos7 SHOW_ENV=1 J=8
=== TEST SCRIPT END ===




The full log is available at
http://patchew.org/logs/20190110120120.9943-1-yury-kotov@yandex-team.ru/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability
  2019-01-10 12:01 [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Yury Kotov
                   ` (5 preceding siblings ...)
  2019-01-13 14:37 ` no-reply
@ 2019-01-13 23:57 ` no-reply
  6 siblings, 0 replies; 22+ messages in thread
From: no-reply @ 2019-01-13 23:57 UTC (permalink / raw)
  To: yury-kotov
  Cc: fam, qemu-devel, ehabkost, imammedo, pbonzini, crosthwaite.peter,
	rth, quintela, dgilbert, eblake, armbru, thuth, lvivier, wrfsh

Patchew URL: https://patchew.org/QEMU/20190110120120.9943-1-yury-kotov@yandex-team.ru/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
time make docker-test-mingw@fedora SHOW_ENV=1 J=14
=== TEST SCRIPT END ===

  CC      aarch64-softmmu/balloon.o
  CC      x86_64-softmmu/hw/block/virtio-blk.o
/tmp/qemu-test/src/migration/ram.c: In function 'ram_load':
/tmp/qemu-test/src/migration/ram.c:4159:42: error: format '%lld' expects argument of type 'long long int', but argument 3 has type 'ram_addr_t {aka unsigned int}' [-Werror=format=]
                             error_report("Mismatched RAM block offset %s "
                                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/i686-w64-mingw32/sys-root/mingw/include/inttypes.h:299:0,
---
  CC      aarch64-softmmu/hw/cpu/a15mpcore.o
  CC      aarch64-softmmu/hw/display/omap_dss.o
/tmp/qemu-test/src/migration/ram.c: In function 'ram_load':
/tmp/qemu-test/src/migration/ram.c:4159:42: error: format '%lld' expects argument of type 'long long int', but argument 3 has type 'ram_addr_t {aka unsigned int}' [-Werror=format=]
                             error_report("Mismatched RAM block offset %s "
                                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/i686-w64-mingw32/sys-root/mingw/include/inttypes.h:299:0,


The full log is available at
http://patchew.org/logs/20190110120120.9943-1-yury-kotov@yandex-team.ru/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation
  2019-01-11 18:25       ` Dr. David Alan Gilbert
@ 2019-01-14 12:58         ` Yury Kotov
  0 siblings, 0 replies; 22+ messages in thread
From: Yury Kotov @ 2019-01-14 12:58 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: qemu-devel, Eduardo Habkost, Igor Mammedov, Paolo Bonzini,
	Peter Crosthwaite, Richard Henderson, Juan Quintela, Eric Blake,
	Markus Armbruster, Thomas Huth, Laurent Vivier, wrfsh

11.01.2019, 21:25, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>>  10.01.2019, 23:14, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
>>  > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>>  >>  RAM migration has a RAMBlock validation stage (flag RAM_SAVE_FLAG_MEM_SIZE).
>>  >>  In this stage QEMU checks further information about RAMBlock:
>>  >>  1. Presence (by idstr),
>>  >>  2. Length (trying to resize, when differs),
>>  >>  3. Optional page size.
>>  >>
>>  >>  This patch adds a check for RAMBlock's offset. Currently we check it during
>>  >>  RAM pages loading - every RAM page has an offset in its header. But there is a
>>  >>  case when we don't send RAM pages (see below).
>>  >>
>>  >>  The following commits introduce a capability (ignore-external) to skip some
>>  >>  RAM blocks from migration. In such case the migration stream contains only
>>  >>  meta information about RAM blocks to validate them. So, the only way to check
>>  >>  block's offset is to send it explicitly.
>>  >>
>>  >>  Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
>>  >
>>  > But why check that offsets match? THey aren't supposed to!
>>  > Offset's are entirely private to each qemu and they're allowed to be
>>  > different; the only requirement is that the length and name of each
>>  > RAMBlock matches, then all the operations we do over the migration
>>  > stream are relative to the start of the block.
>>  >
>>
>>  Yes, you are right. It seems that instead I should check block->mr->addr.
>
> I don't think that's guaranteed to be the same either; for example
> a video buffer or ROM on a PCI card gets mapped into different parts of
> the guests physical address space depending on writes into the PCI
> bars. Some RAMBlocks dont even have a mapping associated with them.
>
> If you do want to add something here, please don't increment the version
> - unless we're desperate I want to keep the version number the same so
> that backwards migration works.
>
> Dave
>

Ok, thanks. I didn't know that. What do you think about addr checking only for
ignored RAMBlocks? So, only memory backends will be checked and their addrs must
be equal IIUC. Also, in this case there is no need to increment the version.

>>  >
>>  > One example where they are validly different is where you hotplug some
>>  > RAM, so for example:
>>  >
>>  >   source qemu
>>  >       -M 4G
>>  >       hotplug PCI card
>>  >       hotplug 2G
>>  >
>>  >   destination qemu
>>  >       -M 4G
>>  >       PCI card declared on the command line
>>  >       extra 2G declared on the command line
>>  >
>>  > The offsets are different but we can migrate that case fine.
>>  >
>>  > Dave
>>  >
>>  >>  ---
>>  >>   migration/ram.c | 15 +++++++++++++--
>>  >>   1 file changed, 13 insertions(+), 2 deletions(-)
>>  >>
>>  >>  diff --git a/migration/ram.c b/migration/ram.c
>>  >>  index 7e7deec4d8..39629254e1 100644
>>  >>  --- a/migration/ram.c
>>  >>  +++ b/migration/ram.c
>>  >>  @@ -3171,6 +3171,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>  >>           if (migrate_postcopy_ram() && block->page_size != qemu_host_page_size) {
>>  >>               qemu_put_be64(f, block->page_size);
>>  >>           }
>>  >>  + qemu_put_be64(f, block->offset);
>>  >>       }
>>  >>
>>  >>       rcu_read_unlock();
>>  >>  @@ -4031,7 +4032,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>  >>
>>  >>       seq_iter++;
>>  >>
>>  >>  - if (version_id != 4) {
>>  >>  + if (version_id < 4) {
>>  >>           ret = -EINVAL;
>>  >>       }
>>  >>
>>  >>  @@ -4132,6 +4133,16 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>  >>                               ret = -EINVAL;
>>  >>                           }
>>  >>                       }
>>  >>  + if (version_id >= 5) {
>>  >>  + ram_addr_t offset;
>>  >>  + offset = qemu_get_be64(f);
>>  >>  + if (block->offset != offset) {
>>  >>  + error_report("Mismatched RAM block offset %s "
>>  >>  + "%" PRId64 "!= %" PRId64,
>>  >>  + id, offset, (uint64_t)block->offset);
>>  >>  + ret = -EINVAL;
>>  >>  + }
>>  >>  + }
>>  >>                       ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
>>  >>                                             block->idstr);
>>  >>                   } else {
>>  >>  @@ -4363,5 +4374,5 @@ static SaveVMHandlers savevm_ram_handlers = {
>>  >>   void ram_mig_init(void)
>>  >>   {
>>  >>       qemu_mutex_init(&XBZRLE.lock);
>>  >>  - register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, &ram_state);
>>  >>  + register_savevm_live(NULL, "ram", 0, 5, &savevm_ram_handlers, &ram_state);
>>  >>   }
>>  >>  --
>>  >>  2.20.1
>>  > --
>>  > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>
>>  Regards,
>>  Yury
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Regards,
Yury

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability
  2019-01-11 20:09     ` Dr. David Alan Gilbert
@ 2019-01-14 15:16       ` Yury Kotov
  2019-01-21 14:09       ` Yury Kotov
  1 sibling, 0 replies; 22+ messages in thread
From: Yury Kotov @ 2019-01-14 15:16 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, clg
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Peter Crosthwaite,
	Juan Quintela, qemu-devel, Markus Armbruster, Paolo Bonzini,
	Igor Mammedov, wrfsh, Richard Henderson

11.01.2019, 23:09, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>>  10.01.2019, 23:12, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
>>  > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>>  >> Hi,
>>  >>
>>  >> The series adds migration capability which allows to skip 'external' RAM blocks
>>  >> during migration. External block is a RAMBlock which available from the outside
>>  >> of current QEMU process (e.g. file in /dev/shm). It's useful for fast local
>>  >> migration to update QEMU for the running guests.
>>  >
>>  > Hi Yury,
>>  > There have been a few similar patch series around from people wanting
>>  > to do similar things.
>>  > In particular Lai Jiangshan's https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg07511.html
>>  > and Cédric Le Goater wanted to skip regions for a different reason.
>>  >
>>  > We merged some of Cédric's code last year so that we now
>>  > have the qemu_ram_is_migratable() function - and we should be reusing
>>  > that to skip things rather than adding a new check that we have to add
>>  > everywhere.
>>  >
>>
>>  I didn't see the series, so I'll check it, thanks!
>>  But I saw qemu_ram_is_migratable() function and corresponding patch.
>>  It's very close to my needs, but it works a bit different IIUC:
>>  1. Not migratable blocks isn't validated (existence and size) during migration,
>>  2. "Migratable" state is determined during the block creation time.
>>     Such case isn't valid because of it:
>>     * Source has one migratable and one not migratable RAM blocks,
>>     * Target has the same (idstr) blocks, but both are not migratable.
>>     Thus, target will not expect pages for not migratable blocks.
>
> I've added Cédric to the mail;
> there were other cases people were interested in, including switching
> it dynamically, just no one else used it yet.
>
> I'd prefer it if you did modify the is_migratable - that will fix a lot
> of other places in the migration code to avoid your blocks; I think the
> best thing is for you to add a spearate 'needs_migration_check' for
> those blocks like yours which you want to check but you don't send the
> data. Please don't change the format of the migratino stream except
> in the case where you are sending those to be checked.
>

Ok, I've got your point and will try to reuse/modify is_migratable.

>>  > Also, ypu're skipping 'external' things, I think the other suggestion
>>  > was to skip 'shared' things (i.e. anything with share=0); skipping
>>  > share=on cases sounds easier to me.
>>
>>  I agree that introducing new term is a complication, but 'share' and 'external'
>>  terms have important differences (I'll describe it below).
>>
>>  Just to clarify:
>>  * 'share' means that other processes has an access to such memory,
>>  * 'external' means file backed memory.
>>
>>  There is another use case I wanted to support (I had to write about it in
>>  the cover letter, sorry..):
>>  1. Migrate source VM to file and kill source,
>>  2. Start target VM and migrate it from file.
>>  In such case source VM may have memory-backend-ram with share=off, it's ok.
>>
>>  Thus, in the new migration capability I want to migrate memory that meets
>>  three conditions:
>>  1. The source will not use the memory after migration ends,
>>  2. The source may exit before target starts (migrate to file),
>>  3. The target has an access to the memory.
>>
>>  I think 'external' fits them better than 'share'.
>
> Are you sure that with share=off (the default), in the case where the
> source shuts down, that the changes have been written to the backing
> file?
>

Oh, you're right. I was sure it would work...

> I'm also wondering if perhaps we'd be better having an explicit
> migrate=off property on memory objects rather than trying to guess from
> the share= or the fact it's an external path.

It's good idea. Perhaps this is the best option, given the above.

>
> Igor: Does that make sense to you?
>
> Dave
>
>>  >
>>  > Dave
>>  >
>>  >> Patches:
>>  >> 1. Add offset validation to make sure that external RAM block has the same
>>  >> physical offset on target side,
>>  >> 2. Add RAM_EXTERNAL flag to determine external RAM blocks,
>>  >> 3. Add ignore-external migration capability,
>>  >> 4. Add a test.
>>  >>
>>  >> Usage example:
>>  >> 1. Start source VM:
>>  >> qemu-system-x86 \
>>  >> -m 4G \
>>  >> -object memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
>>  >> -numa node,memdev=mem0 \
>>  >> -qmp unix:/tmp/qemu-qmp-1.sock,server,nowait \
>>  >>
>>  >> 2. Start target VM:
>>  >> qemu-system-x86 \
>>  >> -m 4G \
>>  >> -object memory-backend-file,id=mem0,size=4G,share=on,mem-path=/dev/shm/mem0 \
>>  >> -numa node,memdev=mem0 \
>>  >> -qmp unix:/tmp/qemu-qmp-2.sock,server,nowait \
>>  >> -incoming defer
>>  >>
>>  >> 3. Enable ignore-external capability on both VMs:
>>  >> { "execute": "migrate-set-capabilities" , "arguments":
>>  >> { "capabilities": [ { "capability": "x-ignore-external", "state": true } ] } }
>>  >>
>>  >> 4. Start migration.
>>  >>
>>  >> Regards,
>>  >> Yury
>>  >>
>>  >> Yury Kotov (4):
>>  >> migration: add RAMBlock's offset validation
>>  >> exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks
>>  >> migration: introduce ignore-external capability
>>  >> tests/migration-test: Add a test for ignore-external capability
>>  >>
>>  >> backends/hostmem-file.c | 3 +-
>>  >> exec.c | 7 ++-
>>  >> include/exec/cpu-common.h | 1 +
>>  >> include/exec/memory.h | 3 ++
>>  >> migration/migration.c | 9 ++++
>>  >> migration/migration.h | 1 +
>>  >> migration/ram.c | 52 ++++++++++++++++--
>>  >> numa.c | 4 +-
>>  >> qapi/migration.json | 6 ++-
>>  >> tests/migration-test.c | 109 +++++++++++++++++++++++++++++++-------
>>  >> 10 files changed, 165 insertions(+), 30 deletions(-)
>>  >>
>>  >> --
>>  >> 2.20.1
>>  > --
>>  > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>
>>  Regards,
>>  Yury
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Regards,
Yury

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability
  2019-01-11 20:55     ` Eduardo Habkost
@ 2019-01-14 15:31       ` Yury Kotov
  0 siblings, 0 replies; 22+ messages in thread
From: Yury Kotov @ 2019-01-14 15:31 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Dr. David Alan Gilbert, Laurent Vivier, Thomas Huth,
	Peter Crosthwaite, Juan Quintela, qemu-devel, Markus Armbruster,
	Paolo Bonzini, Igor Mammedov, wrfsh, Richard Henderson

11.01.2019, 23:55, "Eduardo Habkost" <ehabkost@redhat.com>:
> On Fri, Jan 11, 2019 at 06:49:53PM +0300, Yury Kotov wrote:
>>  10.01.2019, 23:12, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
>>  > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>>  >> Hi,
>>  >>
>>  >> The series adds migration capability which allows to skip 'external' RAM blocks
>>  >> during migration. External block is a RAMBlock which available from the outside
>>  >> of current QEMU process (e.g. file in /dev/shm). It's useful for fast local
>>  >> migration to update QEMU for the running guests.
>>  >
>>  > Hi Yury,
>>  > There have been a few similar patch series around from people wanting
>>  > to do similar things.
>>  > In particular Lai Jiangshan's https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg07511.html
>>  > and Cédric Le Goater wanted to skip regions for a different reason.
>>  >
>>  > We merged some of Cédric's code last year so that we now
>>  > have the qemu_ram_is_migratable() function - and we should be reusing
>>  > that to skip things rather than adding a new check that we have to add
>>  > everywhere.
>>  >
>>
>>  I didn't see the series, so I'll check it, thanks!
>>  But I saw qemu_ram_is_migratable() function and corresponding patch.
>>  It's very close to my needs, but it works a bit different IIUC:
>>  1. Not migratable blocks isn't validated (existence and size) during migration,
>>  2. "Migratable" state is determined during the block creation time.
>>     Such case isn't valid because of it:
>>     * Source has one migratable and one not migratable RAM blocks,
>>     * Target has the same (idstr) blocks, but both are not migratable.
>>     Thus, target will not expect pages for not migratable blocks.
>>
>>  > Also, ypu're skipping 'external' things, I think the other suggestion
>>  > was to skip 'shared' things (i.e. anything with share=0); skipping
>>  > share=on cases sounds easier to me.
>>
>>  I agree that introducing new term is a complication, but 'share' and 'external'
>>  terms have important differences (I'll describe it below).
>>
>>  Just to clarify:
>>  * 'share' means that other processes has an access to such memory,
>>  * 'external' means file backed memory.
>
> If you use file backed memory with share=off, writes are not
> propagated to the file (they are mapped with MAP_PRIVATE). Would
> you really want to skip file backed memory if it has share=off?
>

Yes, you're right. I was sure it would work, but share=on is also needed in
my case.

>>  There is another use case I wanted to support (I had to write about it in
>>  the cover letter, sorry..):
>>  1. Migrate source VM to file and kill source,
>>  2. Start target VM and migrate it from file.
>>  In such case source VM may have memory-backend-ram with share=off, it's ok.
>>
>>  Thus, in the new migration capability I want to migrate memory that meets
>>  three conditions:
>>  1. The source will not use the memory after migration ends,
>>  2. The source may exit before target starts (migrate to file),
>>  3. The target has an access to the memory.
>>
>>  I think 'external' fits them better than 'share'.
>
> In either case, defining "external" seems tricky. A memory
> region might be backed by a file on tmpfs or hugetlbfs that was
> deleted, which makes the file "internal" for practical purposes.
> QEMU has no way to tell if (3) is really true.
>
> --
> Eduardo

Agree. Perhaps the best is a separate flag, as suggested by Dave.

Regards,
Yury

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability
  2019-01-11 20:09     ` Dr. David Alan Gilbert
  2019-01-14 15:16       ` Yury Kotov
@ 2019-01-21 14:09       ` Yury Kotov
  2019-01-22 18:08         ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 22+ messages in thread
From: Yury Kotov @ 2019-01-21 14:09 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Peter Crosthwaite,
	Juan Quintela, qemu-devel, Markus Armbruster, Paolo Bonzini,
	Igor Mammedov, wrfsh, Richard Henderson, clg

Hi,

Just want to clarify your suggestions.

1) migrate=off/share=on

I'm not sure that adding new flag 'migrate=off' is a good idea. I think that
share=on as you suggested at first is enough.
* It's a new flag which has sense only with share=on
* It seems to that the meaning of this flag isn't clear. migrate=off isn't
  RAM_MIGRATABLE, it's rather RAM_IGNORABLE. I.e. it means that we don't migrate
  such block only if the capability is enabled.
If you don't mind, I'll prefer share=on and ignore-shared capability.

2) Keep RAM migration version

It's more complicated question. To validate GPAs for ignored blocks I have to
change the stream format. I can do this conditionally (if (cap_enabled) { ... })
but in this case, I want to make sure that the capability is enabled or disabled
on both source and target. Otherwise, there is an undefined behavior on the
target side (most likely some error during deserialization).
To fix that I can add a capability validation feature.

For example, add a new section to vmstate_configuration (not complete):
+static const VMStateDescription vmstate_capabilites = {
+    .name = "configuration/capabilities",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = vmstate_capabilites_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32_V(caps_count, SaveState, 1),
+        VMSTATE_VARRAY_UINT32_ALLOC(capabilities, SaveState, caps_count, 1,
+                                    vmstate_info_bool, bool),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_configuration = {
     .name = "configuration",
     .version_id = 1,
@@ -404,6 +456,7 @@ static const VMStateDescription vmstate_configuration = {
     },
     .subsections = (const VMStateDescription*[]) {
         &vmstate_target_page_bits,
+        &vmstate_capabilites,
         NULL
     }
 };

It seems too complicated. If I should change the migration stream anyway, maybe
it's better to increment the version?

What do you think?

Regards,
Yury

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability
  2019-01-21 14:09       ` Yury Kotov
@ 2019-01-22 18:08         ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 22+ messages in thread
From: Dr. David Alan Gilbert @ 2019-01-22 18:08 UTC (permalink / raw)
  To: Yury Kotov
  Cc: Laurent Vivier, Thomas Huth, Eduardo Habkost, Peter Crosthwaite,
	Juan Quintela, qemu-devel, Markus Armbruster, Paolo Bonzini,
	Igor Mammedov, wrfsh, Richard Henderson, clg, jiangshanlai

* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> Hi,
> 
> Just want to clarify your suggestions.
> 
> 1) migrate=off/share=on
> 
> I'm not sure that adding new flag 'migrate=off' is a good idea. I think that
> share=on as you suggested at first is enough.
> * It's a new flag which has sense only with share=on
> * It seems to that the meaning of this flag isn't clear. migrate=off isn't
>   RAM_MIGRATABLE, it's rather RAM_IGNORABLE. I.e. it means that we don't migrate
>   such block only if the capability is enabled.
> If you don't mind, I'll prefer share=on and ignore-shared capability.

Yes, I'm OK with that, just check with Lai Jiangshan (cc'd) that it
meets their requirement as well.

> 2) Keep RAM migration version
> 
> It's more complicated question. To validate GPAs for ignored blocks I have to
> change the stream format. I can do this conditionally (if (cap_enabled) { ... })
> but in this case, I want to make sure that the capability is enabled or disabled
> on both source and target. Otherwise, there is an undefined behavior on the
> target side (most likely some error during deserialization).
> To fix that I can add a capability validation feature.
> 
> For example, add a new section to vmstate_configuration (not complete):
> +static const VMStateDescription vmstate_capabilites = {
> +    .name = "configuration/capabilities",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = vmstate_capabilites_needed,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT32_V(caps_count, SaveState, 1),
> +        VMSTATE_VARRAY_UINT32_ALLOC(capabilities, SaveState, caps_count, 1,
> +                                    vmstate_info_bool, bool),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
>  static const VMStateDescription vmstate_configuration = {
>      .name = "configuration",
>      .version_id = 1,
> @@ -404,6 +456,7 @@ static const VMStateDescription vmstate_configuration = {
>      },
>      .subsections = (const VMStateDescription*[]) {
>          &vmstate_target_page_bits,
> +        &vmstate_capabilites,
>          NULL
>      }
>  };
> 
> It seems too complicated. If I should change the migration stream anyway, maybe
> it's better to increment the version?
> 
> What do you think?

Never break compatibility!

There's three separate things here:
  a) Skipping the shared blocks
  b) Adding a check that skipped blocks actually match
  c) Adding a check for matching capabilities

Lets solve them separately.

I think (a) we've discussed.
(b) If you only enable the checking when your ignore-shared is enabled
then it doesn't break any compatibility.  But watch out, if they're
skipped for some other reason then you might not want to check them;
for example some of the things Peter Maydell was talking about for ARM
was that the block may or may not be present, so there's no requirement
it's on the destination.

(c) That's a nice to have; you'd have to create a list of capabilities
you care about including; if they're only new capabilities then you
can just enable the fields when any are set and it doesn't break old
things; if you want to include some other capabilities then tie
it to the machine type and it wont break old setups.

Dave

> Regards,
> Yury
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-01-22 18:45 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-10 12:01 [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Yury Kotov
2019-01-10 12:01 ` [Qemu-devel] [PATCH 1/4] migration: add RAMBlock's offset validation Yury Kotov
2019-01-10 20:14   ` Dr. David Alan Gilbert
2019-01-11 10:06     ` Igor Mammedov
2019-01-11 10:58       ` Dr. David Alan Gilbert
2019-01-11 16:38     ` Yury Kotov
2019-01-11 18:25       ` Dr. David Alan Gilbert
2019-01-14 12:58         ` Yury Kotov
2019-01-10 12:01 ` [Qemu-devel] [PATCH 2/4] exec: add RAM_EXTERNAL flag to mark non-QEMU allocated blocks Yury Kotov
2019-01-10 20:14   ` Dr. David Alan Gilbert
2019-01-10 12:01 ` [Qemu-devel] [PATCH 3/4] migration: introduce ignore-external capability Yury Kotov
2019-01-10 12:01 ` [Qemu-devel] [PATCH 4/4] tests/migration-test: Add a test for " Yury Kotov
2019-01-10 20:11 ` [Qemu-devel] [PATCH 0/4] Add ignore-external migration capability Dr. David Alan Gilbert
2019-01-11 15:49   ` Yury Kotov
2019-01-11 20:09     ` Dr. David Alan Gilbert
2019-01-14 15:16       ` Yury Kotov
2019-01-21 14:09       ` Yury Kotov
2019-01-22 18:08         ` Dr. David Alan Gilbert
2019-01-11 20:55     ` Eduardo Habkost
2019-01-14 15:31       ` Yury Kotov
2019-01-13 14:37 ` no-reply
2019-01-13 23:57 ` no-reply

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.