From: "Dr. David Alan Gilbert" <dgilbert@redhat.com> The following changes since commit 240ab11fb72049d6373cbbec8d788f8e411a00bc: Merge remote-tracking branch 'remotes/aperard/tags/pull-xen-20190924' into staging (2019-09-24 15:36:31 +0100) are available in the Git repository at: git://github.com/dagrh/qemu.git tags/pull-migration-20190925a for you to fetch changes up to 3748fef9b95a9bc1602f3c4ed2a329d8ef47e63c: migration/postcopy: Recognise the recovery states as 'in_postcopy' (2019-09-25 15:51:19 +0100) ---------------------------------------------------------------- Migration pull 2019-09-25 me: test fixes from (should stop hangs in postcopy tests). me: An RDMA cleanup hang fix Wei: Tidy ups around postcopy Marc-Andre: mem leak fix ---------------------------------------------------------------- Dr. David Alan Gilbert (5): migration/rdma: Don't moan about disconnects at the end migration/rdma.c: Swap synchronize_rcu for call_rcu tests/migration: Fail on unexpected migration states tests/migration/postcopy: trim migration bandwidth migration/postcopy: Recognise the recovery states as 'in_postcopy' Marc-André Lureau (1): migration: fix vmdesc leak on vmstate_save() error Wei Yang (3): migration/postcopy: not necessary to do discard when canonicalizing bitmap migration/postcopy: unsentmap is not necessary for postcopy migration: remove sent parameter in get_queued_page_not_dirty include/exec/ram_addr.h | 6 ---- migration/migration.c | 9 ++++- migration/qjson.h | 2 ++ migration/ram.c | 94 ++++++++----------------------------------------- migration/rdma.c | 51 ++++++++++++++++++--------- migration/savevm.c | 3 +- migration/trace-events | 2 +- tests/migration-test.c | 25 +++++++++---- 8 files changed, 80 insertions(+), 112 deletions(-)
On Wed, 25 Sep 2019 at 16:06, Dr. David Alan Gilbert (git) <dgilbert@redhat.com> wrote: > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com> > > The following changes since commit 240ab11fb72049d6373cbbec8d788f8e411a00bc: > > Merge remote-tracking branch 'remotes/aperard/tags/pull-xen-20190924' into staging (2019-09-24 15:36:31 +0100) > > are available in the Git repository at: > > git://github.com/dagrh/qemu.git tags/pull-migration-20190925a > > for you to fetch changes up to 3748fef9b95a9bc1602f3c4ed2a329d8ef47e63c: > > migration/postcopy: Recognise the recovery states as 'in_postcopy' (2019-09-25 15:51:19 +0100) > > ---------------------------------------------------------------- > Migration pull 2019-09-25 > > me: test fixes from (should stop hangs in postcopy tests). > me: An RDMA cleanup hang fix > Wei: Tidy ups around postcopy > Marc-Andre: mem leak fix > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2 for any user-visible changes. -- PMM
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com> The following changes since commit 736cf607e40674776d752acc201f565723e86045: Update version for v5.0.0-rc0 release (2020-03-24 17:50:00 +0000) are available in the Git repository at: git://github.com/dagrh/qemu.git tags/pull-migration-20200325b for you to fetch changes up to 7cd75cbdb8a45d9e2d5912f774d8194cbafdfa97: migration: use "" instead of (null) for tls-authz (2020-03-25 12:31:38 +0000) ---------------------------------------------------------------- Combo Migration/HMP/virtiofs pull Small fixes all around. Ones that are noticeable: a) Igor's migration compatibility fix affecting older machine types has been seen in the wild b) Philippe's autconverge fix should fix an intermittently failing migration test. c) Mao's makes a small change to the output of 'info migrate_parameters' for tls-authz. ---------------------------------------------------------------- Dr. David Alan Gilbert (1): hmp/vnc: Fix info vnc list leak Igor Mammedov (1): vl.c: fix migration failure for 3.1 and older machine types Mao Zhongyi (2): xbzrle: update xbzrle doc migration: use "" instead of (null) for tls-authz Pan Nengyuan (1): hmp-cmd: fix a missing_break warning Philippe Mathieu-Daudé (2): tests/migration: Reduce autoconverge initial bandwidth tools/virtiofsd/passthrough_ll: Fix double close() Vladimir Sementsov-Ogievskiy (2): migration/colo: fix use after free of local_err migration/ram: fix use after free of local_err docs/xbzrle.txt | 7 ++++++- migration/colo.c | 1 + migration/migration.c | 5 +++-- migration/ram.c | 1 + monitor/hmp-cmds.c | 12 +++++++----- softmmu/vl.c | 3 +++ tests/qtest/migration-test.c | 2 +- tools/virtiofsd/passthrough_ll.c | 3 +-- 8 files changed, 23 insertions(+), 11 deletions(-)
From: Pan Nengyuan <pannengyuan@huawei.com> This fix coverity issues 94417686: 1260 break; CID 94417686: (MISSING_BREAK) 1261. unterminated_case: The case for value "MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD" is not terminated by a 'break' statement. 1261 case MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD: 1262 p->has_throttle_trigger_threshold = true; 1263 visit_type_int(v, param, &p->throttle_trigger_threshold, &err); 1264 case MIGRATION_PARAMETER_CPU_THROTTLE_INITIAL: Fixes: dc14a470763c96fd9d360e1028ce38e8c3613a77 Fixes: Coverity (CID 1421950) Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-Id: <20200318071620.59748-1-pannengyuan@huawei.com> Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- monitor/hmp-cmds.c | 1 + 1 file changed, 1 insertion(+) diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index 58724031ea..c882c9f3cc 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -1261,6 +1261,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) case MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD: p->has_throttle_trigger_threshold = true; visit_type_int(v, param, &p->throttle_trigger_threshold, &err); + break; case MIGRATION_PARAMETER_CPU_THROTTLE_INITIAL: p->has_cpu_throttle_initial = true; visit_type_int(v, param, &p->cpu_throttle_initial, &err); -- 2.25.1
From: Mao Zhongyi <maozhongyi@cmss.chinamobile.com> Add new parameter description, also: 1. Remove unsociable space. 2. Nit picking: s/two/2 in report Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com> Message-Id: <20200320143216.423374-1-maozhongyi@cmss.chinamobile.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- docs/xbzrle.txt | 7 ++++++- migration/migration.c | 2 +- monitor/hmp-cmds.c | 2 +- 3 files changed, 8 insertions(+), 3 deletions(-) diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt index c0a7dfd44c..b431bdaf0f 100644 --- a/docs/xbzrle.txt +++ b/docs/xbzrle.txt @@ -92,6 +92,11 @@ Usage power of 2. The cache default value is 64MBytes. (on source only) {qemu} migrate_set_cache_size 256m +Commit 73af8dd8d7 "migration: Make xbzrle_cache_size a migration parameter" +(v2.11.0) deprecated migrate-set-cache-size, therefore, the new parameter +is recommended. + {qemu} migrate_set_parameter xbzrle-cache-size 256m + 4. Start outgoing migration {qemu} migrate -d tcp:destination.host:4444 {qemu} info migrate @@ -108,7 +113,7 @@ power of 2. The cache default value is 64MBytes. (on source only) xbzrle transferred: I kbytes xbzrle pages: J pages xbzrle cache miss: K - xbzrle overflow : L + xbzrle overflow: L xbzrle cache-miss: the number of cache misses to date - high cache-miss rate indicates that the cache size is set too low. diff --git a/migration/migration.c b/migration/migration.c index c1d88ace7f..4b26110d57 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1243,7 +1243,7 @@ static bool migrate_params_check(MigrationParameters *params, Error **errp) error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "xbzrle_cache_size", "is invalid, it should be bigger than target page size" - " and a power of two"); + " and a power of 2"); return false; } diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index c882c9f3cc..76725c2ace 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -303,7 +303,7 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->xbzrle_cache->cache_miss); monitor_printf(mon, "xbzrle cache miss rate: %0.2f\n", info->xbzrle_cache->cache_miss_rate); - monitor_printf(mon, "xbzrle overflow : %" PRIu64 "\n", + monitor_printf(mon, "xbzrle overflow: %" PRIu64 "\n", info->xbzrle_cache->overflow); } -- 2.25.1
From: Philippe Mathieu-Daudé <philmd@redhat.com> When using max-bandwidth=~100Mb/s, this test fails on Travis-CI s390x when configured with --disable-tcg: $ make check-qtest TEST check-qtest-s390x: tests/qtest/boot-serial-test qemu-system-s390x: -accel tcg: invalid accelerator tcg qemu-system-s390x: falling back to KVM TEST check-qtest-s390x: tests/qtest/pxe-test TEST check-qtest-s390x: tests/qtest/test-netfilter TEST check-qtest-s390x: tests/qtest/test-filter-mirror TEST check-qtest-s390x: tests/qtest/test-filter-redirector TEST check-qtest-s390x: tests/qtest/drive_del-test TEST check-qtest-s390x: tests/qtest/device-plug-test TEST check-qtest-s390x: tests/qtest/virtio-ccw-test TEST check-qtest-s390x: tests/qtest/cpu-plug-test TEST check-qtest-s390x: tests/qtest/migration-test ** ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1 Per David Gilbert, "it could just be the writing is slow on s390 and the migration thread fast; in which case the autocomplete wouldn't be needed. Perhaps we just need to reduce the bandwidth limit." Tuning the threshold by reducing the initial bandwidth makes the autoconverge test pass. Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200323184015.11565-1-philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- tests/qtest/migration-test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 3d6cc83b88..2568c9529c 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void) * without throttling. */ migrate_set_parameter_int(from, "downtime-limit", 1); - migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */ + migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */ /* To check remaining size after precopy */ migrate_set_capability(from, "pause-before-switchover", true); -- 2.25.1
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com> We're iterating the list, and then freeing the iteration pointer rather than the list head. Fixes: 0a9667ecdb6d ("hmp: Update info vnc") Reported-by: Coverity (CID 1421932) Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20200323120822.51266-1-dgilbert@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- monitor/hmp-cmds.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index 76725c2ace..04ca342c51 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -527,10 +527,11 @@ static void hmp_info_vnc_servers(Monitor *mon, VncServerInfo2List *server) void hmp_info_vnc(Monitor *mon, const QDict *qdict) { - VncInfo2List *info2l; + VncInfo2List *info2l, *info2l_head; Error *err = NULL; info2l = qmp_query_vnc_servers(&err); + info2l_head = info2l; if (err) { hmp_handle_error(mon, err); return; @@ -559,7 +560,7 @@ void hmp_info_vnc(Monitor *mon, const QDict *qdict) info2l = info2l->next; } - qapi_free_VncInfo2List(info2l); + qapi_free_VncInfo2List(info2l_head); } #endif -- 2.25.1
From: Philippe Mathieu-Daudé <philmd@redhat.com> On success, the fdopendir() call closes fd. Later on the error path we try to close an already-closed fd. This can lead to use-after-free. Fix by only closing the fd if the fdopendir() call failed. Cc: qemu-stable@nongnu.org Fixes: b39bce121b (add dirp_map to hide lo_dirp pointers) Reported-by: Coverity (CID 1421933 USE_AFTER_FREE) Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200321120654.7985-1-philmd@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- tools/virtiofsd/passthrough_ll.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c index 4f259aac70..4c35c95b25 100644 --- a/tools/virtiofsd/passthrough_ll.c +++ b/tools/virtiofsd/passthrough_ll.c @@ -1520,8 +1520,7 @@ out_err: if (d) { if (d->dp) { closedir(d->dp); - } - if (fd != -1) { + } else if (fd != -1) { close(fd); } free(d); -- 2.25.1
From: Igor Mammedov <imammedo@redhat.com> Migration from QEMU(v4.0) fails when using 3.1 or older machine type. For example if one attempts to migrate QEMU-2.12 started as qemu-system-ppc64 -nodefaults -M pseries-2.12 -m 4096 -mem-path /tmp/ to current master, it will fail with qemu-system-ppc64: Unknown ramblock "ppc_spapr.ram", cannot accept migration qemu-system-ppc64: error while loading state for instance 0x0 of device 'ram' qemu-system-ppc64: load of migration failed: Invalid argument Caused by 900c0ba373 commit which switches main RAM allocation to memory backends and the fact in 3.1 and older QEMU, backends used full[***] QOM path as memory region name instead of backend's name. That was changed after 3.1 to use prefix-less names by default (fa0cb34d22) for new machine types. *** effectively makes main RAM memory region names defined by MachineClass::default_ram_id being altered with '/objects/' prefix and therefore migration fails as old QEMU sends prefix-less name while new QEMU expects name with prefix when using 3.1 and older machine types. Fix it by forcing implicit[1] memory backend to always use prefix-less names for its memory region by setting 'x-use-canonical-path-for-ramblock-id' property to false. 1) i.e. memory backend created by compat glue which maps -m/-mem-path/-mem-prealloc/default RAM size into appropriate backend type/options to match old CLI format. Fixes: 900c0ba373 Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reported-by: Lukáš Doktor <ldoktor@redhat.com> Message-Id: <20200304172748.15338-1-imammedo@redhat.com> Tested-by: Lukáš Doktor <ldoktor@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- softmmu/vl.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/softmmu/vl.c b/softmmu/vl.c index 1d33a28340..814537bb42 100644 --- a/softmmu/vl.c +++ b/softmmu/vl.c @@ -2801,6 +2801,9 @@ static void create_default_memdev(MachineState *ms, const char *path) object_property_set_int(obj, ms->ram_size, "size", &error_fatal); object_property_add_child(object_get_objects_root(), mc->default_ram_id, obj, &error_fatal); + /* Ensure backend's memory region name is equal to mc->default_ram_id */ + object_property_set_bool(obj, false, "x-use-canonical-path-for-ramblock-id", + &error_fatal); user_creatable_complete(USER_CREATABLE(obj), &error_fatal); object_unref(obj); object_property_set_str(OBJECT(ms), mc->default_ram_id, "memory-backend", -- 2.25.1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> local_err is used again in secondary_vm_do_failover() after replication_stop_all(), so we must zero it. Otherwise try to set non-NULL local_err will crash. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200324153630.11882-5-vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- migration/colo.c | 1 + 1 file changed, 1 insertion(+) diff --git a/migration/colo.c b/migration/colo.c index 44942c4e23..a54ac84f41 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -93,6 +93,7 @@ static void secondary_vm_do_failover(void) replication_stop_all(true, &local_err); if (local_err) { error_report_err(local_err); + local_err = NULL; } /* Notify all filters of all NIC to do checkpoint */ -- 2.25.1
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> local_err is used again in migration_bitmap_sync_precopy() after precopy_notify(), so we must zero it. Otherwise try to set non-NULL local_err will crash. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200324153630.11882-6-vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- migration/ram.c | 1 + 1 file changed, 1 insertion(+) diff --git a/migration/ram.c b/migration/ram.c index c12cfdbe26..04f13feb2e 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -980,6 +980,7 @@ static void migration_bitmap_sync_precopy(RAMState *rs) */ if (precopy_notify(PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC, &local_err)) { error_report_err(local_err); + local_err = NULL; } migration_bitmap_sync(rs); -- 2.25.1
From: Mao Zhongyi <maozhongyi@cmss.chinamobile.com> run: (qemu) info migrate_parameters announce-initial: 50 ms ... announce-max: 550 ms multifd-compression: none xbzrle-cache-size: 4194304 max-postcopy-bandwidth: 0 tls-authz: '(null)' Migration parameter 'tls-authz' is used to provide the QOM ID of a QAuthZ subclass instance that provides the access control check, default is NULL. But the empty string is not a valid object ID, so use "" instead of the default. Although it will fail when lookup an object with ID "", it is harmless, just consistent with tls_creds. As a bonus, this patch also fixed the bad indentation on the last line and removed 'has_tls_authz' redundant check in 'hmp_info_migrate_parameters'. Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com> Message-Id: <119f539a9f4d198bc3bcced46b8280520d60bc51.1585100802.git.maozhongyi@cmss.chinamobile.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- migration/migration.c | 3 ++- monitor/hmp-cmds.c | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 4b26110d57..c4c9aee15e 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -790,7 +790,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->has_tls_hostname = true; params->tls_hostname = g_strdup(s->parameters.tls_hostname); params->has_tls_authz = true; - params->tls_authz = g_strdup(s->parameters.tls_authz); + params->tls_authz = g_strdup(s->parameters.tls_authz ? + s->parameters.tls_authz : ""); params->has_max_bandwidth = true; params->max_bandwidth = s->parameters.max_bandwidth; params->has_downtime_limit = true; diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index 04ca342c51..9b94e67879 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -459,9 +459,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %" PRIu64 "\n", MigrationParameter_str(MIGRATION_PARAMETER_MAX_POSTCOPY_BANDWIDTH), params->max_postcopy_bandwidth); - monitor_printf(mon, " %s: '%s'\n", + monitor_printf(mon, "%s: '%s'\n", MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ), - params->has_tls_authz ? params->tls_authz : ""); + params->tls_authz); } qapi_free_MigrationParameters(params); -- 2.25.1
On Wed, 25 Mar 2020 at 13:17, Dr. David Alan Gilbert (git) <dgilbert@redhat.com> wrote: > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com> > > The following changes since commit 736cf607e40674776d752acc201f565723e86045: > > Update version for v5.0.0-rc0 release (2020-03-24 17:50:00 +0000) > > are available in the Git repository at: > > git://github.com/dagrh/qemu.git tags/pull-migration-20200325b > > for you to fetch changes up to 7cd75cbdb8a45d9e2d5912f774d8194cbafdfa97: > > migration: use "" instead of (null) for tls-authz (2020-03-25 12:31:38 +0000) > > ---------------------------------------------------------------- > Combo Migration/HMP/virtiofs pull > > Small fixes all around. > Ones that are noticeable: > a) Igor's migration compatibility fix affecting older machine types > has been seen in the wild > b) Philippe's autconverge fix should fix an intermittently > failing migration test. > c) Mao's makes a small change to the output of 'info > migrate_parameters' for tls-authz. > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/5.0 for any user-visible changes. -- PMM
On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote: > From: Philippe Mathieu-Daudé <philmd@redhat.com> > > When using max-bandwidth=~100Mb/s, this test fails on Travis-CI > s390x when configured with --disable-tcg: > > $ make check-qtest > TEST check-qtest-s390x: tests/qtest/boot-serial-test > qemu-system-s390x: -accel tcg: invalid accelerator tcg > qemu-system-s390x: falling back to KVM > TEST check-qtest-s390x: tests/qtest/pxe-test > TEST check-qtest-s390x: tests/qtest/test-netfilter > TEST check-qtest-s390x: tests/qtest/test-filter-mirror > TEST check-qtest-s390x: tests/qtest/test-filter-redirector > TEST check-qtest-s390x: tests/qtest/drive_del-test > TEST check-qtest-s390x: tests/qtest/device-plug-test > TEST check-qtest-s390x: tests/qtest/virtio-ccw-test > TEST check-qtest-s390x: tests/qtest/cpu-plug-test > TEST check-qtest-s390x: tests/qtest/migration-test > ** > ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE > ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE > make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1 > > Per David Gilbert, "it could just be the writing is slow on s390 > and the migration thread fast; in which case the autocomplete > wouldn't be needed. Perhaps we just need to reduce the bandwidth > limit." > > Tuning the threshold by reducing the initial bandwidth makes the > autoconverge test pass. > > Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> > Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> > Message-Id: <20200323184015.11565-1-philmd@redhat.com> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> > Tested-by: Alex Bennée <alex.bennee@linaro.org> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> This slows make check down significantly for me, it stays at the migration test for minutes. I'm carrying a revert at top of my tree for now but I'd rather not need that. This seems like a fragile way to test things anyway. What happens if someone slows writing even more e.g. because it's running in a container or a VM? How about detecting that migration finished too early and slowing it down until autocomplete triggers? > --- > tests/qtest/migration-test.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c > index 3d6cc83b88..2568c9529c 100644 > --- a/tests/qtest/migration-test.c > +++ b/tests/qtest/migration-test.c > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void) > * without throttling. > */ > migrate_set_parameter_int(from, "downtime-limit", 1); > - migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */ > + migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */ > > /* To check remaining size after precopy */ > migrate_set_capability(from, "pause-before-switchover", true); > -- > 2.25.1 > >
On 6/9/20 6:36 PM, Michael S. Tsirkin wrote: > On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote: >> From: Philippe Mathieu-Daudé <philmd@redhat.com> >> >> When using max-bandwidth=~100Mb/s, this test fails on Travis-CI >> s390x when configured with --disable-tcg: >> >> $ make check-qtest >> TEST check-qtest-s390x: tests/qtest/boot-serial-test >> qemu-system-s390x: -accel tcg: invalid accelerator tcg >> qemu-system-s390x: falling back to KVM >> TEST check-qtest-s390x: tests/qtest/pxe-test >> TEST check-qtest-s390x: tests/qtest/test-netfilter >> TEST check-qtest-s390x: tests/qtest/test-filter-mirror >> TEST check-qtest-s390x: tests/qtest/test-filter-redirector >> TEST check-qtest-s390x: tests/qtest/drive_del-test >> TEST check-qtest-s390x: tests/qtest/device-plug-test >> TEST check-qtest-s390x: tests/qtest/virtio-ccw-test >> TEST check-qtest-s390x: tests/qtest/cpu-plug-test >> TEST check-qtest-s390x: tests/qtest/migration-test >> ** >> ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE >> ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE >> make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1 >> >> Per David Gilbert, "it could just be the writing is slow on s390 >> and the migration thread fast; in which case the autocomplete >> wouldn't be needed. Perhaps we just need to reduce the bandwidth >> limit." >> >> Tuning the threshold by reducing the initial bandwidth makes the >> autoconverge test pass. >> >> Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> >> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> >> Message-Id: <20200323184015.11565-1-philmd@redhat.com> >> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> >> Tested-by: Alex Bennée <alex.bennee@linaro.org> >> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> > > > This slows make check down significantly for me, it stays > at the migration test for minutes. Alex Bennée reported the same problem, I don't know what is the best way to fix this. > > I'm carrying a revert at top of my tree for now but I'd rather > not need that. > > > This seems like a fragile way to test things anyway. > What happens if someone slows writing even more > e.g. because it's running in a container or a VM? This seems to be the problem I noticed and tried to fix. > > How about detecting that migration finished too early > and slowing it down until autocomplete triggers? David, I am a bit clueless with this code, do you mind having a look? We can revert this test and disable the s390x runner, but it is our only big-endian host, which recently show to be useful. > >> --- >> tests/qtest/migration-test.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c >> index 3d6cc83b88..2568c9529c 100644 >> --- a/tests/qtest/migration-test.c >> +++ b/tests/qtest/migration-test.c >> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void) >> * without throttling. >> */ >> migrate_set_parameter_int(from, "downtime-limit", 1); >> - migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */ >> + migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */ >> >> /* To check remaining size after precopy */ >> migrate_set_capability(from, "pause-before-switchover", true); >> -- >> 2.25.1 >> >> >
* Michael S. Tsirkin (mst@redhat.com) wrote: > On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote: > > From: Philippe Mathieu-Daudé <philmd@redhat.com> > > > > When using max-bandwidth=~100Mb/s, this test fails on Travis-CI > > s390x when configured with --disable-tcg: > > > > $ make check-qtest > > TEST check-qtest-s390x: tests/qtest/boot-serial-test > > qemu-system-s390x: -accel tcg: invalid accelerator tcg > > qemu-system-s390x: falling back to KVM > > TEST check-qtest-s390x: tests/qtest/pxe-test > > TEST check-qtest-s390x: tests/qtest/test-netfilter > > TEST check-qtest-s390x: tests/qtest/test-filter-mirror > > TEST check-qtest-s390x: tests/qtest/test-filter-redirector > > TEST check-qtest-s390x: tests/qtest/drive_del-test > > TEST check-qtest-s390x: tests/qtest/device-plug-test > > TEST check-qtest-s390x: tests/qtest/virtio-ccw-test > > TEST check-qtest-s390x: tests/qtest/cpu-plug-test > > TEST check-qtest-s390x: tests/qtest/migration-test > > ** > > ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE > > ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE > > make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1 > > > > Per David Gilbert, "it could just be the writing is slow on s390 > > and the migration thread fast; in which case the autocomplete > > wouldn't be needed. Perhaps we just need to reduce the bandwidth > > limit." > > > > Tuning the threshold by reducing the initial bandwidth makes the > > autoconverge test pass. > > > > Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> > > Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> > > Message-Id: <20200323184015.11565-1-philmd@redhat.com> > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> > > Tested-by: Alex Bennée <alex.bennee@linaro.org> > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> > > > This slows make check down significantly for me, it stays > at the migration test for minutes. Is this on the s390x version or all of them? > I'm carrying a revert at top of my tree for now but I'd rather > not need that. > > > This seems like a fragile way to test things anyway. > What happens if someone slows writing even more > e.g. because it's running in a container or a VM? > > How about detecting that migration finished too early > and slowing it down until autocomplete triggers? THe problem is you can't rely on any form of consistency in a heavily contended container, so the fact that it took n-seconds to migrate at this attempt means very little about what the next attempt will take. Dave > > > > --- > > tests/qtest/migration-test.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c > > index 3d6cc83b88..2568c9529c 100644 > > --- a/tests/qtest/migration-test.c > > +++ b/tests/qtest/migration-test.c > > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void) > > * without throttling. > > */ > > migrate_set_parameter_int(from, "downtime-limit", 1); > > - migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */ > > + migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */ > > > > /* To check remaining size after precopy */ > > migrate_set_capability(from, "pause-before-switchover", true); > > -- > > 2.25.1 > > > > > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 6/9/20 7:00 PM, Dr. David Alan Gilbert wrote: > * Michael S. Tsirkin (mst@redhat.com) wrote: >> On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote: >>> From: Philippe Mathieu-Daudé <philmd@redhat.com> >>> >>> When using max-bandwidth=~100Mb/s, this test fails on Travis-CI >>> s390x when configured with --disable-tcg: >>> >>> $ make check-qtest >>> TEST check-qtest-s390x: tests/qtest/boot-serial-test >>> qemu-system-s390x: -accel tcg: invalid accelerator tcg >>> qemu-system-s390x: falling back to KVM >>> TEST check-qtest-s390x: tests/qtest/pxe-test >>> TEST check-qtest-s390x: tests/qtest/test-netfilter >>> TEST check-qtest-s390x: tests/qtest/test-filter-mirror >>> TEST check-qtest-s390x: tests/qtest/test-filter-redirector >>> TEST check-qtest-s390x: tests/qtest/drive_del-test >>> TEST check-qtest-s390x: tests/qtest/device-plug-test >>> TEST check-qtest-s390x: tests/qtest/virtio-ccw-test >>> TEST check-qtest-s390x: tests/qtest/cpu-plug-test >>> TEST check-qtest-s390x: tests/qtest/migration-test >>> ** >>> ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE >>> ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE >>> make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1 >>> >>> Per David Gilbert, "it could just be the writing is slow on s390 >>> and the migration thread fast; in which case the autocomplete >>> wouldn't be needed. Perhaps we just need to reduce the bandwidth >>> limit." >>> >>> Tuning the threshold by reducing the initial bandwidth makes the >>> autoconverge test pass. >>> >>> Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> >>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> >>> Message-Id: <20200323184015.11565-1-philmd@redhat.com> >>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> >>> Tested-by: Alex Bennée <alex.bennee@linaro.org> >>> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> >> >> >> This slows make check down significantly for me, it stays >> at the migration test for minutes. > > Is this on the s390x version or all of them? Personally I only noticed it on s390x. Alex have you seen another arch? > >> I'm carrying a revert at top of my tree for now but I'd rather >> not need that. >> >> >> This seems like a fragile way to test things anyway. >> What happens if someone slows writing even more >> e.g. because it's running in a container or a VM? >> >> How about detecting that migration finished too early >> and slowing it down until autocomplete triggers? > > THe problem is you can't rely on any form of consistency in a heavily > contended container, so the fact that it took n-seconds to migrate at > this attempt means very little about what the next attempt will take. > > Dave > >> >> >>> --- >>> tests/qtest/migration-test.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c >>> index 3d6cc83b88..2568c9529c 100644 >>> --- a/tests/qtest/migration-test.c >>> +++ b/tests/qtest/migration-test.c >>> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void) >>> * without throttling. >>> */ >>> migrate_set_parameter_int(from, "downtime-limit", 1); >>> - migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */ >>> + migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */ >>> >>> /* To check remaining size after precopy */ >>> migrate_set_capability(from, "pause-before-switchover", true); >>> -- >>> 2.25.1 >>> >>> >> > -- > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK >
On Tue, Jun 09, 2020 at 06:00:17PM +0100, Dr. David Alan Gilbert wrote: > * Michael S. Tsirkin (mst@redhat.com) wrote: > > On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote: > > > From: Philippe Mathieu-Daudé <philmd@redhat.com> > > > > > > When using max-bandwidth=~100Mb/s, this test fails on Travis-CI > > > s390x when configured with --disable-tcg: > > > > > > $ make check-qtest > > > TEST check-qtest-s390x: tests/qtest/boot-serial-test > > > qemu-system-s390x: -accel tcg: invalid accelerator tcg > > > qemu-system-s390x: falling back to KVM > > > TEST check-qtest-s390x: tests/qtest/pxe-test > > > TEST check-qtest-s390x: tests/qtest/test-netfilter > > > TEST check-qtest-s390x: tests/qtest/test-filter-mirror > > > TEST check-qtest-s390x: tests/qtest/test-filter-redirector > > > TEST check-qtest-s390x: tests/qtest/drive_del-test > > > TEST check-qtest-s390x: tests/qtest/device-plug-test > > > TEST check-qtest-s390x: tests/qtest/virtio-ccw-test > > > TEST check-qtest-s390x: tests/qtest/cpu-plug-test > > > TEST check-qtest-s390x: tests/qtest/migration-test > > > ** > > > ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE > > > ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE > > > make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1 > > > > > > Per David Gilbert, "it could just be the writing is slow on s390 > > > and the migration thread fast; in which case the autocomplete > > > wouldn't be needed. Perhaps we just need to reduce the bandwidth > > > limit." > > > > > > Tuning the threshold by reducing the initial bandwidth makes the > > > autoconverge test pass. > > > > > > Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> > > > Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> > > > Message-Id: <20200323184015.11565-1-philmd@redhat.com> > > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> > > > Tested-by: Alex Bennée <alex.bennee@linaro.org> > > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> > > > > > > This slows make check down significantly for me, it stays > > at the migration test for minutes. > > Is this on the s390x version or all of them? x86 slows down for me. > > I'm carrying a revert at top of my tree for now but I'd rather > > not need that. > > > > > > This seems like a fragile way to test things anyway. > > What happens if someone slows writing even more > > e.g. because it's running in a container or a VM? > > > > How about detecting that migration finished too early > > and slowing it down until autocomplete triggers? > > THe problem is you can't rely on any form of consistency in a heavily > contended container, so the fact that it took n-seconds to migrate at > this attempt means very little about what the next attempt will take. > > Dave What I'm saying is try migrating quickly. If it's too quick try again ... > > > > > > > --- > > > tests/qtest/migration-test.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c > > > index 3d6cc83b88..2568c9529c 100644 > > > --- a/tests/qtest/migration-test.c > > > +++ b/tests/qtest/migration-test.c > > > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void) > > > * without throttling. > > > */ > > > migrate_set_parameter_int(from, "downtime-limit", 1); > > > - migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */ > > > + migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */ > > > > > > /* To check remaining size after precopy */ > > > migrate_set_capability(from, "pause-before-switchover", true); > > > -- > > > 2.25.1 > > > > > > > > > -- > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com> The following changes since commit a4716fd8d7c877185652f5f8e25032dc7699d51b: Merge remote-tracking branch 'remotes/alistair/tags/pull-riscv-to-apply-20210608-1' into staging (2021-06-08 13:54:23 +0100) are available in the Git repository at: https://gitlab.com/dagrh/qemu.git tags/pull-migration-20210609a for you to fetch changes up to a4a571d97866d056787d7a654be5792765be8a60: hmp: Add "calc_dirty_rate" and "info dirty_rate" cmds (2021-06-08 20:18:26 +0100) ---------------------------------------------------------------- Migration pull for 2021-06-09 Yank crash fix from Leo RDMA fix from Li mptcp support from me dirty-rate changes from Hyman and Peter (Note I've switched to the gitlab I've been using for virtiofs pulls) Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> ---------------------------------------------------------------- Dr. David Alan Gilbert (5): channel-socket: Only set CLOEXEC if we have space for fds io/net-listener: Call the notifier during finalize migration: Add cleanup hook for inwards migration migration/socket: Close the listener at the end sockets: Support multipath TCP Hyman Huang(黄勇) (1): migration/dirtyrate: make sample page count configurable Leonardo Bras (1): yank: Unregister function when using TLS migration Li Zhijian (1): migration/rdma: Fix cm event use after free Peter Xu (1): hmp: Add "calc_dirty_rate" and "info dirty_rate" cmds hmp-commands-info.hx | 13 ++++++++ hmp-commands.hx | 14 ++++++++ include/monitor/hmp.h | 2 ++ io/channel-socket.c | 8 ++--- io/dns-resolver.c | 4 +++ io/net-listener.c | 3 ++ migration/channel.c | 26 ++++++++------- migration/dirtyrate.c | 78 ++++++++++++++++++++++++++++++++++++++++--- migration/dirtyrate.h | 8 ++++- migration/migration.c | 3 ++ migration/migration.h | 4 +++ migration/multifd.c | 8 ++++- migration/qemu-file-channel.c | 4 ++- migration/rdma.c | 11 ++++-- migration/socket.c | 24 +++++++++---- qapi/migration.json | 15 +++++++-- qapi/sockets.json | 5 ++- util/qemu-sockets.c | 23 +++++++++++++ 18 files changed, 217 insertions(+), 36 deletions(-)
On Wed, 9 Jun 2021 at 15:47, Dr. David Alan Gilbert (git) <dgilbert@redhat.com> wrote: > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com> > > The following changes since commit a4716fd8d7c877185652f5f8e25032dc7699d51b: > > Merge remote-tracking branch 'remotes/alistair/tags/pull-riscv-to-apply-20210608-1' into staging (2021-06-08 13:54:23 +0100) > > are available in the Git repository at: > > https://gitlab.com/dagrh/qemu.git tags/pull-migration-20210609a > > for you to fetch changes up to a4a571d97866d056787d7a654be5792765be8a60: > > hmp: Add "calc_dirty_rate" and "info dirty_rate" cmds (2021-06-08 20:18:26 +0100) > > ---------------------------------------------------------------- > Migration pull for 2021-06-09 > > Yank crash fix from Leo > RDMA fix from Li > mptcp support from me > dirty-rate changes from Hyman and Peter > > (Note I've switched to the gitlab I've been using for virtiofs pulls) > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/6.1 for any user-visible changes. -- PMM
Patchew URL: https://patchew.org/QEMU/20210609144512.211746-1-dgilbert@redhat.com/ Hi, This series seems to have some coding style problems. See output below for more information: Type: series Message-id: 20210609144512.211746-1-dgilbert@redhat.com Subject: [PULL 0/9] migration queue === TEST SCRIPT BEGIN === #!/bin/bash git rev-parse base > /dev/null || exit 0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram ./scripts/checkpatch.pl --mailback base.. === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu * [new tag] patchew/20210609184955.1193081-1-philmd@redhat.com -> patchew/20210609184955.1193081-1-philmd@redhat.com * [new tag] patchew/20210611120427.49736-1-berrange@redhat.com -> patchew/20210611120427.49736-1-berrange@redhat.com * [new tag] patchew/20210613141222.548357-1-lukas.juenger@greensocs.com -> patchew/20210613141222.548357-1-lukas.juenger@greensocs.com Switched to a new branch 'test' === OUTPUT BEGIN === checkpatch.pl: no revisions returned for revlist 'base..' === OUTPUT END === Test command exited with code: 255 The full log is available at http://patchew.org/logs/20210609144512.211746-1-dgilbert@redhat.com/testing.checkpatch/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-devel@redhat.com