qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PULL 0/9] migration queue
@ 2020-03-25 13:16 Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 1/9] hmp-cmd: fix a missing_break warning Dr. David Alan Gilbert (git)
                   ` (9 more replies)
  0 siblings, 10 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The following changes since commit 736cf607e40674776d752acc201f565723e86045:

  Update version for v5.0.0-rc0 release (2020-03-24 17:50:00 +0000)

are available in the Git repository at:

  git://github.com/dagrh/qemu.git tags/pull-migration-20200325b

for you to fetch changes up to 7cd75cbdb8a45d9e2d5912f774d8194cbafdfa97:

  migration: use "" instead of (null) for tls-authz (2020-03-25 12:31:38 +0000)

----------------------------------------------------------------
Combo Migration/HMP/virtiofs pull

Small fixes all around.
Ones that are noticeable:
  a) Igor's migration compatibility fix affecting older machine types
     has been seen in the wild
  b) Philippe's autconverge fix should fix an intermittently
     failing migration test.
  c) Mao's makes a small change to the output of 'info
     migrate_parameters'  for tls-authz.

----------------------------------------------------------------
Dr. David Alan Gilbert (1):
      hmp/vnc: Fix info vnc list leak

Igor Mammedov (1):
      vl.c: fix migration failure for 3.1 and older machine types

Mao Zhongyi (2):
      xbzrle: update xbzrle doc
      migration: use "" instead of (null) for tls-authz

Pan Nengyuan (1):
      hmp-cmd: fix a missing_break warning

Philippe Mathieu-Daudé (2):
      tests/migration: Reduce autoconverge initial bandwidth
      tools/virtiofsd/passthrough_ll: Fix double close()

Vladimir Sementsov-Ogievskiy (2):
      migration/colo: fix use after free of local_err
      migration/ram: fix use after free of local_err

 docs/xbzrle.txt                  |  7 ++++++-
 migration/colo.c                 |  1 +
 migration/migration.c            |  5 +++--
 migration/ram.c                  |  1 +
 monitor/hmp-cmds.c               | 12 +++++++-----
 softmmu/vl.c                     |  3 +++
 tests/qtest/migration-test.c     |  2 +-
 tools/virtiofsd/passthrough_ll.c |  3 +--
 8 files changed, 23 insertions(+), 11 deletions(-)



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PULL 1/9] hmp-cmd: fix a missing_break warning
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 2/9] xbzrle: update xbzrle doc Dr. David Alan Gilbert (git)
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Pan Nengyuan <pannengyuan@huawei.com>

This fix coverity issues 94417686:
    1260        break;
    CID 94417686: (MISSING_BREAK)
    1261. unterminated_case: The case for value "MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD" is not terminated by a 'break' statement.
    1261    case MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD:
    1262        p->has_throttle_trigger_threshold = true;
    1263        visit_type_int(v, param, &p->throttle_trigger_threshold, &err);
    1264    case MIGRATION_PARAMETER_CPU_THROTTLE_INITIAL:

Fixes: dc14a470763c96fd9d360e1028ce38e8c3613a77
Fixes: Coverity (CID 1421950)
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200318071620.59748-1-pannengyuan@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 monitor/hmp-cmds.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 58724031ea..c882c9f3cc 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1261,6 +1261,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
     case MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD:
         p->has_throttle_trigger_threshold = true;
         visit_type_int(v, param, &p->throttle_trigger_threshold, &err);
+        break;
     case MIGRATION_PARAMETER_CPU_THROTTLE_INITIAL:
         p->has_cpu_throttle_initial = true;
         visit_type_int(v, param, &p->cpu_throttle_initial, &err);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 2/9] xbzrle: update xbzrle doc
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 1/9] hmp-cmd: fix a missing_break warning Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth Dr. David Alan Gilbert (git)
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>

Add new parameter description, also:
1. Remove unsociable space.
2. Nit picking: s/two/2 in report

Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Message-Id: <20200320143216.423374-1-maozhongyi@cmss.chinamobile.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 docs/xbzrle.txt       | 7 ++++++-
 migration/migration.c | 2 +-
 monitor/hmp-cmds.c    | 2 +-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt
index c0a7dfd44c..b431bdaf0f 100644
--- a/docs/xbzrle.txt
+++ b/docs/xbzrle.txt
@@ -92,6 +92,11 @@ Usage
 power of 2. The cache default value is 64MBytes. (on source only)
     {qemu} migrate_set_cache_size 256m
 
+Commit 73af8dd8d7 "migration: Make xbzrle_cache_size a migration parameter"
+(v2.11.0) deprecated migrate-set-cache-size, therefore, the new parameter
+is recommended.
+    {qemu} migrate_set_parameter xbzrle-cache-size 256m
+
 4. Start outgoing migration
     {qemu} migrate -d tcp:destination.host:4444
     {qemu} info migrate
@@ -108,7 +113,7 @@ power of 2. The cache default value is 64MBytes. (on source only)
     xbzrle transferred: I kbytes
     xbzrle pages: J pages
     xbzrle cache miss: K
-    xbzrle overflow : L
+    xbzrle overflow: L
 
 xbzrle cache-miss: the number of cache misses to date - high cache-miss rate
 indicates that the cache size is set too low.
diff --git a/migration/migration.c b/migration/migration.c
index c1d88ace7f..4b26110d57 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1243,7 +1243,7 @@ static bool migrate_params_check(MigrationParameters *params, Error **errp)
         error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
                    "xbzrle_cache_size",
                    "is invalid, it should be bigger than target page size"
-                   " and a power of two");
+                   " and a power of 2");
         return false;
     }
 
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index c882c9f3cc..76725c2ace 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -303,7 +303,7 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
                        info->xbzrle_cache->cache_miss);
         monitor_printf(mon, "xbzrle cache miss rate: %0.2f\n",
                        info->xbzrle_cache->cache_miss_rate);
-        monitor_printf(mon, "xbzrle overflow : %" PRIu64 "\n",
+        monitor_printf(mon, "xbzrle overflow: %" PRIu64 "\n",
                        info->xbzrle_cache->overflow);
     }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 1/9] hmp-cmd: fix a missing_break warning Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 2/9] xbzrle: update xbzrle doc Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-06-09 16:36   ` Michael S. Tsirkin
  2020-03-25 13:16 ` [PULL 4/9] hmp/vnc: Fix info vnc list leak Dr. David Alan Gilbert (git)
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Philippe Mathieu-Daudé <philmd@redhat.com>

When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
s390x when configured with --disable-tcg:

  $ make check-qtest
    TEST    check-qtest-s390x: tests/qtest/boot-serial-test
  qemu-system-s390x: -accel tcg: invalid accelerator tcg
  qemu-system-s390x: falling back to KVM
    TEST    check-qtest-s390x: tests/qtest/pxe-test
    TEST    check-qtest-s390x: tests/qtest/test-netfilter
    TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
    TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
    TEST    check-qtest-s390x: tests/qtest/drive_del-test
    TEST    check-qtest-s390x: tests/qtest/device-plug-test
    TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
    TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
    TEST    check-qtest-s390x: tests/qtest/migration-test
  **
  ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
  ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
  make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1

Per David Gilbert, "it could just be the writing is slow on s390
and the migration thread fast; in which case the autocomplete
wouldn't be needed. Perhaps we just need to reduce the bandwidth
limit."

Tuning the threshold by reducing the initial bandwidth makes the
autoconverge test pass.

Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200323184015.11565-1-philmd@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 tests/qtest/migration-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 3d6cc83b88..2568c9529c 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
      * without throttling.
      */
     migrate_set_parameter_int(from, "downtime-limit", 1);
-    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
+    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
 
     /* To check remaining size after precopy */
     migrate_set_capability(from, "pause-before-switchover", true);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 4/9] hmp/vnc: Fix info vnc list leak
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (2 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 5/9] tools/virtiofsd/passthrough_ll: Fix double close() Dr. David Alan Gilbert (git)
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

We're iterating the list, and then freeing the iteration pointer rather
than the list head.

Fixes: 0a9667ecdb6d ("hmp: Update info vnc")
Reported-by: Coverity (CID 1421932)
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200323120822.51266-1-dgilbert@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 monitor/hmp-cmds.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 76725c2ace..04ca342c51 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -527,10 +527,11 @@ static void hmp_info_vnc_servers(Monitor *mon, VncServerInfo2List *server)
 
 void hmp_info_vnc(Monitor *mon, const QDict *qdict)
 {
-    VncInfo2List *info2l;
+    VncInfo2List *info2l, *info2l_head;
     Error *err = NULL;
 
     info2l = qmp_query_vnc_servers(&err);
+    info2l_head = info2l;
     if (err) {
         hmp_handle_error(mon, err);
         return;
@@ -559,7 +560,7 @@ void hmp_info_vnc(Monitor *mon, const QDict *qdict)
         info2l = info2l->next;
     }
 
-    qapi_free_VncInfo2List(info2l);
+    qapi_free_VncInfo2List(info2l_head);
 
 }
 #endif
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 5/9] tools/virtiofsd/passthrough_ll: Fix double close()
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (3 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 4/9] hmp/vnc: Fix info vnc list leak Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 6/9] vl.c: fix migration failure for 3.1 and older machine types Dr. David Alan Gilbert (git)
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Philippe Mathieu-Daudé <philmd@redhat.com>

On success, the fdopendir() call closes fd. Later on the error
path we try to close an already-closed fd. This can lead to
use-after-free. Fix by only closing the fd if the fdopendir()
call failed.

Cc: qemu-stable@nongnu.org
Fixes: b39bce121b (add dirp_map to hide lo_dirp pointers)
Reported-by: Coverity (CID 1421933 USE_AFTER_FREE)
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200321120654.7985-1-philmd@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 tools/virtiofsd/passthrough_ll.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index 4f259aac70..4c35c95b25 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -1520,8 +1520,7 @@ out_err:
     if (d) {
         if (d->dp) {
             closedir(d->dp);
-        }
-        if (fd != -1) {
+        } else if (fd != -1) {
             close(fd);
         }
         free(d);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 6/9] vl.c: fix migration failure for 3.1 and older machine types
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (4 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 5/9] tools/virtiofsd/passthrough_ll: Fix double close() Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 7/9] migration/colo: fix use after free of local_err Dr. David Alan Gilbert (git)
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Igor Mammedov <imammedo@redhat.com>

Migration from QEMU(v4.0) fails when using 3.1 or older machine
type. For example if one attempts to migrate
QEMU-2.12 started as
  qemu-system-ppc64 -nodefaults -M pseries-2.12 -m 4096 -mem-path /tmp/
to current master, it will fail with
  qemu-system-ppc64: Unknown ramblock "ppc_spapr.ram", cannot accept migration
  qemu-system-ppc64: error while loading state for instance 0x0 of device 'ram'
  qemu-system-ppc64: load of migration failed: Invalid argument

Caused by 900c0ba373 commit which switches main RAM allocation to
memory backends and the fact in 3.1 and older QEMU, backends used
full[***] QOM path as memory region name instead of backend's name.
That was changed after 3.1 to use prefix-less names by default
(fa0cb34d22) for new machine types.
*** effectively makes main RAM memory region names defined by
MachineClass::default_ram_id being altered with '/objects/' prefix
and therefore migration fails as old QEMU sends prefix-less
name while new QEMU expects name with prefix when using 3.1 and
older machine types.

Fix it by forcing implicit[1] memory backend to always use
prefix-less names for its memory region by setting
  'x-use-canonical-path-for-ramblock-id'
property to false.

1) i.e. memory backend created by compat glue which maps
-m/-mem-path/-mem-prealloc/default RAM size into
appropriate backend type/options to match old CLI format.

Fixes: 900c0ba373
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Message-Id: <20200304172748.15338-1-imammedo@redhat.com>
Tested-by: Lukáš Doktor <ldoktor@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 softmmu/vl.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/softmmu/vl.c b/softmmu/vl.c
index 1d33a28340..814537bb42 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2801,6 +2801,9 @@ static void create_default_memdev(MachineState *ms, const char *path)
     object_property_set_int(obj, ms->ram_size, "size", &error_fatal);
     object_property_add_child(object_get_objects_root(), mc->default_ram_id,
                               obj, &error_fatal);
+    /* Ensure backend's memory region name is equal to mc->default_ram_id */
+    object_property_set_bool(obj, false, "x-use-canonical-path-for-ramblock-id",
+                             &error_fatal);
     user_creatable_complete(USER_CREATABLE(obj), &error_fatal);
     object_unref(obj);
     object_property_set_str(OBJECT(ms), mc->default_ram_id, "memory-backend",
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 7/9] migration/colo: fix use after free of local_err
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (5 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 6/9] vl.c: fix migration failure for 3.1 and older machine types Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 8/9] migration/ram: " Dr. David Alan Gilbert (git)
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

local_err is used again in secondary_vm_do_failover() after
replication_stop_all(), so we must zero it. Otherwise try to set
non-NULL local_err will crash.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200324153630.11882-5-vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/colo.c b/migration/colo.c
index 44942c4e23..a54ac84f41 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -93,6 +93,7 @@ static void secondary_vm_do_failover(void)
     replication_stop_all(true, &local_err);
     if (local_err) {
         error_report_err(local_err);
+        local_err = NULL;
     }
 
     /* Notify all filters of all NIC to do checkpoint */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 8/9] migration/ram: fix use after free of local_err
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (6 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 7/9] migration/colo: fix use after free of local_err Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 9/9] migration: use "" instead of (null) for tls-authz Dr. David Alan Gilbert (git)
  2020-03-26 10:46 ` [PULL 0/9] migration queue Peter Maydell
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

local_err is used again in migration_bitmap_sync_precopy() after
precopy_notify(), so we must zero it. Otherwise try to set
non-NULL local_err will crash.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200324153630.11882-6-vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/ram.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/ram.c b/migration/ram.c
index c12cfdbe26..04f13feb2e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -980,6 +980,7 @@ static void migration_bitmap_sync_precopy(RAMState *rs)
      */
     if (precopy_notify(PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC, &local_err)) {
         error_report_err(local_err);
+        local_err = NULL;
     }
 
     migration_bitmap_sync(rs);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 9/9] migration: use "" instead of (null) for tls-authz
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (7 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 8/9] migration/ram: " Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-26 10:46 ` [PULL 0/9] migration queue Peter Maydell
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>

run:
(qemu) info migrate_parameters
announce-initial: 50 ms
...
announce-max: 550 ms
multifd-compression: none
xbzrle-cache-size: 4194304
max-postcopy-bandwidth: 0
 tls-authz: '(null)'

Migration parameter 'tls-authz' is used to provide the QOM ID
of a QAuthZ subclass instance that provides the access control
check, default is NULL. But the empty string is not a valid
object ID, so use "" instead of the default. Although it will
fail when lookup an object with ID "", it is harmless, just
consistent with tls_creds.

As a bonus, this patch also fixed the bad indentation on the
last line and removed 'has_tls_authz' redundant check in
'hmp_info_migrate_parameters'.

Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Message-Id: <119f539a9f4d198bc3bcced46b8280520d60bc51.1585100802.git.maozhongyi@cmss.chinamobile.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/migration.c | 3 ++-
 monitor/hmp-cmds.c    | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 4b26110d57..c4c9aee15e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -790,7 +790,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
     params->has_tls_hostname = true;
     params->tls_hostname = g_strdup(s->parameters.tls_hostname);
     params->has_tls_authz = true;
-    params->tls_authz = g_strdup(s->parameters.tls_authz);
+    params->tls_authz = g_strdup(s->parameters.tls_authz ?
+                                 s->parameters.tls_authz : "");
     params->has_max_bandwidth = true;
     params->max_bandwidth = s->parameters.max_bandwidth;
     params->has_downtime_limit = true;
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 04ca342c51..9b94e67879 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -459,9 +459,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "%s: %" PRIu64 "\n",
             MigrationParameter_str(MIGRATION_PARAMETER_MAX_POSTCOPY_BANDWIDTH),
             params->max_postcopy_bandwidth);
-        monitor_printf(mon, " %s: '%s'\n",
+        monitor_printf(mon, "%s: '%s'\n",
             MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
-            params->has_tls_authz ? params->tls_authz : "");
+            params->tls_authz);
     }
 
     qapi_free_MigrationParameters(params);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PULL 0/9] migration queue
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (8 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 9/9] migration: use "" instead of (null) for tls-authz Dr. David Alan Gilbert (git)
@ 2020-03-26 10:46 ` Peter Maydell
  9 siblings, 0 replies; 16+ messages in thread
From: Peter Maydell @ 2020-03-26 10:46 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git)
  Cc: Vladimir Sementsov-Ogievskiy, Mao Zhongyi, Pan Nengyuan,
	QEMU Developers, Igor Mammedov, Philippe Mathieu-Daudé

On Wed, 25 Mar 2020 at 13:17, Dr. David Alan Gilbert (git)
<dgilbert@redhat.com> wrote:
>
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> The following changes since commit 736cf607e40674776d752acc201f565723e86045:
>
>   Update version for v5.0.0-rc0 release (2020-03-24 17:50:00 +0000)
>
> are available in the Git repository at:
>
>   git://github.com/dagrh/qemu.git tags/pull-migration-20200325b
>
> for you to fetch changes up to 7cd75cbdb8a45d9e2d5912f774d8194cbafdfa97:
>
>   migration: use "" instead of (null) for tls-authz (2020-03-25 12:31:38 +0000)
>
> ----------------------------------------------------------------
> Combo Migration/HMP/virtiofs pull
>
> Small fixes all around.
> Ones that are noticeable:
>   a) Igor's migration compatibility fix affecting older machine types
>      has been seen in the wild
>   b) Philippe's autconverge fix should fix an intermittently
>      failing migration test.
>   c) Mao's makes a small change to the output of 'info
>      migrate_parameters'  for tls-authz.
>

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.0
for any user-visible changes.

-- PMM


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-03-25 13:16 ` [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth Dr. David Alan Gilbert (git)
@ 2020-06-09 16:36   ` Michael S. Tsirkin
  2020-06-09 16:45     ` Philippe Mathieu-Daudé
  2020-06-09 17:00     ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2020-06-09 16:36 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git)
  Cc: vsementsov, maozhongyi, pannengyuan, qemu-devel, imammedo, philmd

On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote:
> From: Philippe Mathieu-Daudé <philmd@redhat.com>
> 
> When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
> s390x when configured with --disable-tcg:
> 
>   $ make check-qtest
>     TEST    check-qtest-s390x: tests/qtest/boot-serial-test
>   qemu-system-s390x: -accel tcg: invalid accelerator tcg
>   qemu-system-s390x: falling back to KVM
>     TEST    check-qtest-s390x: tests/qtest/pxe-test
>     TEST    check-qtest-s390x: tests/qtest/test-netfilter
>     TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
>     TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
>     TEST    check-qtest-s390x: tests/qtest/drive_del-test
>     TEST    check-qtest-s390x: tests/qtest/device-plug-test
>     TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
>     TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
>     TEST    check-qtest-s390x: tests/qtest/migration-test
>   **
>   ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>   ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>   make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1
> 
> Per David Gilbert, "it could just be the writing is slow on s390
> and the migration thread fast; in which case the autocomplete
> wouldn't be needed. Perhaps we just need to reduce the bandwidth
> limit."
> 
> Tuning the threshold by reducing the initial bandwidth makes the
> autoconverge test pass.
> 
> Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Message-Id: <20200323184015.11565-1-philmd@redhat.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Tested-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>


This slows make check down significantly for me, it stays
at the migration test for minutes.

I'm carrying a revert at top of my tree for now but I'd rather
not need that.


This seems like a fragile way to test things anyway.
What happens if someone slows writing even more
e.g. because it's running in a container or a VM?

How about detecting that migration finished too early
and slowing it down until autocomplete triggers?



> ---
>  tests/qtest/migration-test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 3d6cc83b88..2568c9529c 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
>       * without throttling.
>       */
>      migrate_set_parameter_int(from, "downtime-limit", 1);
> -    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
> +    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
>  
>      /* To check remaining size after precopy */
>      migrate_set_capability(from, "pause-before-switchover", true);
> -- 
> 2.25.1
> 
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-06-09 16:36   ` Michael S. Tsirkin
@ 2020-06-09 16:45     ` Philippe Mathieu-Daudé
  2020-06-09 17:00     ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 16+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-06-09 16:45 UTC (permalink / raw)
  To: Michael S. Tsirkin, Dr. David Alan Gilbert (git)
  Cc: Thomas Huth, vsementsov, maozhongyi, Cornelia Huck, pannengyuan,
	qemu-devel, imammedo, Alex Bennée

On 6/9/20 6:36 PM, Michael S. Tsirkin wrote:
> On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote:
>> From: Philippe Mathieu-Daudé <philmd@redhat.com>
>>
>> When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
>> s390x when configured with --disable-tcg:
>>
>>   $ make check-qtest
>>     TEST    check-qtest-s390x: tests/qtest/boot-serial-test
>>   qemu-system-s390x: -accel tcg: invalid accelerator tcg
>>   qemu-system-s390x: falling back to KVM
>>     TEST    check-qtest-s390x: tests/qtest/pxe-test
>>     TEST    check-qtest-s390x: tests/qtest/test-netfilter
>>     TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
>>     TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
>>     TEST    check-qtest-s390x: tests/qtest/drive_del-test
>>     TEST    check-qtest-s390x: tests/qtest/device-plug-test
>>     TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
>>     TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
>>     TEST    check-qtest-s390x: tests/qtest/migration-test
>>   **
>>   ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>>   ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>>   make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1
>>
>> Per David Gilbert, "it could just be the writing is slow on s390
>> and the migration thread fast; in which case the autocomplete
>> wouldn't be needed. Perhaps we just need to reduce the bandwidth
>> limit."
>>
>> Tuning the threshold by reducing the initial bandwidth makes the
>> autoconverge test pass.
>>
>> Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> Message-Id: <20200323184015.11565-1-philmd@redhat.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 
> 
> This slows make check down significantly for me, it stays
> at the migration test for minutes.

Alex Bennée reported the same problem, I don't know what is the best way
to fix this.

> 
> I'm carrying a revert at top of my tree for now but I'd rather
> not need that.
> 
> 
> This seems like a fragile way to test things anyway.
> What happens if someone slows writing even more
> e.g. because it's running in a container or a VM?

This seems to be the problem I noticed and tried to fix.

> 
> How about detecting that migration finished too early
> and slowing it down until autocomplete triggers?

David, I am a bit clueless with this code, do you mind having a look?

We can revert this test and disable the s390x runner, but it is our only
big-endian host, which recently show to be useful.

> 
>> ---
>>  tests/qtest/migration-test.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> index 3d6cc83b88..2568c9529c 100644
>> --- a/tests/qtest/migration-test.c
>> +++ b/tests/qtest/migration-test.c
>> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
>>       * without throttling.
>>       */
>>      migrate_set_parameter_int(from, "downtime-limit", 1);
>> -    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
>> +    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
>>  
>>      /* To check remaining size after precopy */
>>      migrate_set_capability(from, "pause-before-switchover", true);
>> -- 
>> 2.25.1
>>
>>
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-06-09 16:36   ` Michael S. Tsirkin
  2020-06-09 16:45     ` Philippe Mathieu-Daudé
@ 2020-06-09 17:00     ` Dr. David Alan Gilbert
  2020-06-09 17:07       ` Philippe Mathieu-Daudé
  2020-06-09 18:10       ` Michael S. Tsirkin
  1 sibling, 2 replies; 16+ messages in thread
From: Dr. David Alan Gilbert @ 2020-06-09 17:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: vsementsov, maozhongyi, pannengyuan, qemu-devel, imammedo, philmd

* Michael S. Tsirkin (mst@redhat.com) wrote:
> On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote:
> > From: Philippe Mathieu-Daudé <philmd@redhat.com>
> > 
> > When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
> > s390x when configured with --disable-tcg:
> > 
> >   $ make check-qtest
> >     TEST    check-qtest-s390x: tests/qtest/boot-serial-test
> >   qemu-system-s390x: -accel tcg: invalid accelerator tcg
> >   qemu-system-s390x: falling back to KVM
> >     TEST    check-qtest-s390x: tests/qtest/pxe-test
> >     TEST    check-qtest-s390x: tests/qtest/test-netfilter
> >     TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
> >     TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
> >     TEST    check-qtest-s390x: tests/qtest/drive_del-test
> >     TEST    check-qtest-s390x: tests/qtest/device-plug-test
> >     TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
> >     TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
> >     TEST    check-qtest-s390x: tests/qtest/migration-test
> >   **
> >   ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
> >   ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
> >   make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1
> > 
> > Per David Gilbert, "it could just be the writing is slow on s390
> > and the migration thread fast; in which case the autocomplete
> > wouldn't be needed. Perhaps we just need to reduce the bandwidth
> > limit."
> > 
> > Tuning the threshold by reducing the initial bandwidth makes the
> > autoconverge test pass.
> > 
> > Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > Message-Id: <20200323184015.11565-1-philmd@redhat.com>
> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Tested-by: Alex Bennée <alex.bennee@linaro.org>
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 
> 
> This slows make check down significantly for me, it stays
> at the migration test for minutes.

Is this on the s390x version or all of them?

> I'm carrying a revert at top of my tree for now but I'd rather
> not need that.
> 
> 
> This seems like a fragile way to test things anyway.
> What happens if someone slows writing even more
> e.g. because it's running in a container or a VM?
> 
> How about detecting that migration finished too early
> and slowing it down until autocomplete triggers?

THe problem is you can't rely on any form of consistency in a heavily
contended container, so the fact that it took n-seconds to migrate at
this attempt means very little about what the next attempt will take.

Dave

> 
> 
> > ---
> >  tests/qtest/migration-test.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > index 3d6cc83b88..2568c9529c 100644
> > --- a/tests/qtest/migration-test.c
> > +++ b/tests/qtest/migration-test.c
> > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
> >       * without throttling.
> >       */
> >      migrate_set_parameter_int(from, "downtime-limit", 1);
> > -    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
> > +    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> >  
> >      /* To check remaining size after precopy */
> >      migrate_set_capability(from, "pause-before-switchover", true);
> > -- 
> > 2.25.1
> > 
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-06-09 17:00     ` Dr. David Alan Gilbert
@ 2020-06-09 17:07       ` Philippe Mathieu-Daudé
  2020-06-09 18:10       ` Michael S. Tsirkin
  1 sibling, 0 replies; 16+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-06-09 17:07 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Michael S. Tsirkin
  Cc: imammedo, vsementsov, pannengyuan, qemu-devel, maozhongyi

On 6/9/20 7:00 PM, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (mst@redhat.com) wrote:
>> On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote:
>>> From: Philippe Mathieu-Daudé <philmd@redhat.com>
>>>
>>> When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
>>> s390x when configured with --disable-tcg:
>>>
>>>   $ make check-qtest
>>>     TEST    check-qtest-s390x: tests/qtest/boot-serial-test
>>>   qemu-system-s390x: -accel tcg: invalid accelerator tcg
>>>   qemu-system-s390x: falling back to KVM
>>>     TEST    check-qtest-s390x: tests/qtest/pxe-test
>>>     TEST    check-qtest-s390x: tests/qtest/test-netfilter
>>>     TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
>>>     TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
>>>     TEST    check-qtest-s390x: tests/qtest/drive_del-test
>>>     TEST    check-qtest-s390x: tests/qtest/device-plug-test
>>>     TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
>>>     TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
>>>     TEST    check-qtest-s390x: tests/qtest/migration-test
>>>   **
>>>   ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>>>   ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>>>   make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1
>>>
>>> Per David Gilbert, "it could just be the writing is slow on s390
>>> and the migration thread fast; in which case the autocomplete
>>> wouldn't be needed. Perhaps we just need to reduce the bandwidth
>>> limit."
>>>
>>> Tuning the threshold by reducing the initial bandwidth makes the
>>> autoconverge test pass.
>>>
>>> Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>>> Message-Id: <20200323184015.11565-1-philmd@redhat.com>
>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>>> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>
>>
>> This slows make check down significantly for me, it stays
>> at the migration test for minutes.
> 
> Is this on the s390x version or all of them?

Personally I only noticed it on s390x.

Alex have you seen another arch?

> 
>> I'm carrying a revert at top of my tree for now but I'd rather
>> not need that.
>>
>>
>> This seems like a fragile way to test things anyway.
>> What happens if someone slows writing even more
>> e.g. because it's running in a container or a VM?
>>
>> How about detecting that migration finished too early
>> and slowing it down until autocomplete triggers?
> 
> THe problem is you can't rely on any form of consistency in a heavily
> contended container, so the fact that it took n-seconds to migrate at
> this attempt means very little about what the next attempt will take.
> 
> Dave
> 
>>
>>
>>> ---
>>>  tests/qtest/migration-test.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>>> index 3d6cc83b88..2568c9529c 100644
>>> --- a/tests/qtest/migration-test.c
>>> +++ b/tests/qtest/migration-test.c
>>> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
>>>       * without throttling.
>>>       */
>>>      migrate_set_parameter_int(from, "downtime-limit", 1);
>>> -    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
>>> +    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
>>>  
>>>      /* To check remaining size after precopy */
>>>      migrate_set_capability(from, "pause-before-switchover", true);
>>> -- 
>>> 2.25.1
>>>
>>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-06-09 17:00     ` Dr. David Alan Gilbert
  2020-06-09 17:07       ` Philippe Mathieu-Daudé
@ 2020-06-09 18:10       ` Michael S. Tsirkin
  1 sibling, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2020-06-09 18:10 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: vsementsov, maozhongyi, pannengyuan, qemu-devel, imammedo, philmd

On Tue, Jun 09, 2020 at 06:00:17PM +0100, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (mst@redhat.com) wrote:
> > On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote:
> > > From: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > 
> > > When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
> > > s390x when configured with --disable-tcg:
> > > 
> > >   $ make check-qtest
> > >     TEST    check-qtest-s390x: tests/qtest/boot-serial-test
> > >   qemu-system-s390x: -accel tcg: invalid accelerator tcg
> > >   qemu-system-s390x: falling back to KVM
> > >     TEST    check-qtest-s390x: tests/qtest/pxe-test
> > >     TEST    check-qtest-s390x: tests/qtest/test-netfilter
> > >     TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
> > >     TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
> > >     TEST    check-qtest-s390x: tests/qtest/drive_del-test
> > >     TEST    check-qtest-s390x: tests/qtest/device-plug-test
> > >     TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
> > >     TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
> > >     TEST    check-qtest-s390x: tests/qtest/migration-test
> > >   **
> > >   ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
> > >   ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
> > >   make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1
> > > 
> > > Per David Gilbert, "it could just be the writing is slow on s390
> > > and the migration thread fast; in which case the autocomplete
> > > wouldn't be needed. Perhaps we just need to reduce the bandwidth
> > > limit."
> > > 
> > > Tuning the threshold by reducing the initial bandwidth makes the
> > > autoconverge test pass.
> > > 
> > > Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > Message-Id: <20200323184015.11565-1-philmd@redhat.com>
> > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > Tested-by: Alex Bennée <alex.bennee@linaro.org>
> > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > 
> > 
> > This slows make check down significantly for me, it stays
> > at the migration test for minutes.
> 
> Is this on the s390x version or all of them?

x86 slows down for me.

> > I'm carrying a revert at top of my tree for now but I'd rather
> > not need that.
> > 
> > 
> > This seems like a fragile way to test things anyway.
> > What happens if someone slows writing even more
> > e.g. because it's running in a container or a VM?
> > 
> > How about detecting that migration finished too early
> > and slowing it down until autocomplete triggers?
> 
> THe problem is you can't rely on any form of consistency in a heavily
> contended container, so the fact that it took n-seconds to migrate at
> this attempt means very little about what the next attempt will take.
> 
> Dave

What I'm saying is try migrating quickly. If it's too quick
try again ...

> > 
> > 
> > > ---
> > >  tests/qtest/migration-test.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > > index 3d6cc83b88..2568c9529c 100644
> > > --- a/tests/qtest/migration-test.c
> > > +++ b/tests/qtest/migration-test.c
> > > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
> > >       * without throttling.
> > >       */
> > >      migrate_set_parameter_int(from, "downtime-limit", 1);
> > > -    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
> > > +    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> > >  
> > >      /* To check remaining size after precopy */
> > >      migrate_set_capability(from, "pause-before-switchover", true);
> > > -- 
> > > 2.25.1
> > > 
> > > 
> > 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-06-09 18:14 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 1/9] hmp-cmd: fix a missing_break warning Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 2/9] xbzrle: update xbzrle doc Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth Dr. David Alan Gilbert (git)
2020-06-09 16:36   ` Michael S. Tsirkin
2020-06-09 16:45     ` Philippe Mathieu-Daudé
2020-06-09 17:00     ` Dr. David Alan Gilbert
2020-06-09 17:07       ` Philippe Mathieu-Daudé
2020-06-09 18:10       ` Michael S. Tsirkin
2020-03-25 13:16 ` [PULL 4/9] hmp/vnc: Fix info vnc list leak Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 5/9] tools/virtiofsd/passthrough_ll: Fix double close() Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 6/9] vl.c: fix migration failure for 3.1 and older machine types Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 7/9] migration/colo: fix use after free of local_err Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 8/9] migration/ram: " Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 9/9] migration: use "" instead of (null) for tls-authz Dr. David Alan Gilbert (git)
2020-03-26 10:46 ` [PULL 0/9] migration queue Peter Maydell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).