All of lore.kernel.org
 help / color / mirror / Atom feed
* [PULL 0/9] migration queue
@ 2020-03-25 13:16 Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 1/9] hmp-cmd: fix a missing_break warning Dr. David Alan Gilbert (git)
                   ` (9 more replies)
  0 siblings, 10 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The following changes since commit 736cf607e40674776d752acc201f565723e86045:

  Update version for v5.0.0-rc0 release (2020-03-24 17:50:00 +0000)

are available in the Git repository at:

  git://github.com/dagrh/qemu.git tags/pull-migration-20200325b

for you to fetch changes up to 7cd75cbdb8a45d9e2d5912f774d8194cbafdfa97:

  migration: use "" instead of (null) for tls-authz (2020-03-25 12:31:38 +0000)

----------------------------------------------------------------
Combo Migration/HMP/virtiofs pull

Small fixes all around.
Ones that are noticeable:
  a) Igor's migration compatibility fix affecting older machine types
     has been seen in the wild
  b) Philippe's autconverge fix should fix an intermittently
     failing migration test.
  c) Mao's makes a small change to the output of 'info
     migrate_parameters'  for tls-authz.

----------------------------------------------------------------
Dr. David Alan Gilbert (1):
      hmp/vnc: Fix info vnc list leak

Igor Mammedov (1):
      vl.c: fix migration failure for 3.1 and older machine types

Mao Zhongyi (2):
      xbzrle: update xbzrle doc
      migration: use "" instead of (null) for tls-authz

Pan Nengyuan (1):
      hmp-cmd: fix a missing_break warning

Philippe Mathieu-Daudé (2):
      tests/migration: Reduce autoconverge initial bandwidth
      tools/virtiofsd/passthrough_ll: Fix double close()

Vladimir Sementsov-Ogievskiy (2):
      migration/colo: fix use after free of local_err
      migration/ram: fix use after free of local_err

 docs/xbzrle.txt                  |  7 ++++++-
 migration/colo.c                 |  1 +
 migration/migration.c            |  5 +++--
 migration/ram.c                  |  1 +
 monitor/hmp-cmds.c               | 12 +++++++-----
 softmmu/vl.c                     |  3 +++
 tests/qtest/migration-test.c     |  2 +-
 tools/virtiofsd/passthrough_ll.c |  3 +--
 8 files changed, 23 insertions(+), 11 deletions(-)



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PULL 1/9] hmp-cmd: fix a missing_break warning
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 2/9] xbzrle: update xbzrle doc Dr. David Alan Gilbert (git)
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Pan Nengyuan <pannengyuan@huawei.com>

This fix coverity issues 94417686:
    1260        break;
    CID 94417686: (MISSING_BREAK)
    1261. unterminated_case: The case for value "MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD" is not terminated by a 'break' statement.
    1261    case MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD:
    1262        p->has_throttle_trigger_threshold = true;
    1263        visit_type_int(v, param, &p->throttle_trigger_threshold, &err);
    1264    case MIGRATION_PARAMETER_CPU_THROTTLE_INITIAL:

Fixes: dc14a470763c96fd9d360e1028ce38e8c3613a77
Fixes: Coverity (CID 1421950)
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Message-Id: <20200318071620.59748-1-pannengyuan@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 monitor/hmp-cmds.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 58724031ea..c882c9f3cc 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1261,6 +1261,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
     case MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD:
         p->has_throttle_trigger_threshold = true;
         visit_type_int(v, param, &p->throttle_trigger_threshold, &err);
+        break;
     case MIGRATION_PARAMETER_CPU_THROTTLE_INITIAL:
         p->has_cpu_throttle_initial = true;
         visit_type_int(v, param, &p->cpu_throttle_initial, &err);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 2/9] xbzrle: update xbzrle doc
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 1/9] hmp-cmd: fix a missing_break warning Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth Dr. David Alan Gilbert (git)
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>

Add new parameter description, also:
1. Remove unsociable space.
2. Nit picking: s/two/2 in report

Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Message-Id: <20200320143216.423374-1-maozhongyi@cmss.chinamobile.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 docs/xbzrle.txt       | 7 ++++++-
 migration/migration.c | 2 +-
 monitor/hmp-cmds.c    | 2 +-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt
index c0a7dfd44c..b431bdaf0f 100644
--- a/docs/xbzrle.txt
+++ b/docs/xbzrle.txt
@@ -92,6 +92,11 @@ Usage
 power of 2. The cache default value is 64MBytes. (on source only)
     {qemu} migrate_set_cache_size 256m
 
+Commit 73af8dd8d7 "migration: Make xbzrle_cache_size a migration parameter"
+(v2.11.0) deprecated migrate-set-cache-size, therefore, the new parameter
+is recommended.
+    {qemu} migrate_set_parameter xbzrle-cache-size 256m
+
 4. Start outgoing migration
     {qemu} migrate -d tcp:destination.host:4444
     {qemu} info migrate
@@ -108,7 +113,7 @@ power of 2. The cache default value is 64MBytes. (on source only)
     xbzrle transferred: I kbytes
     xbzrle pages: J pages
     xbzrle cache miss: K
-    xbzrle overflow : L
+    xbzrle overflow: L
 
 xbzrle cache-miss: the number of cache misses to date - high cache-miss rate
 indicates that the cache size is set too low.
diff --git a/migration/migration.c b/migration/migration.c
index c1d88ace7f..4b26110d57 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1243,7 +1243,7 @@ static bool migrate_params_check(MigrationParameters *params, Error **errp)
         error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
                    "xbzrle_cache_size",
                    "is invalid, it should be bigger than target page size"
-                   " and a power of two");
+                   " and a power of 2");
         return false;
     }
 
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index c882c9f3cc..76725c2ace 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -303,7 +303,7 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
                        info->xbzrle_cache->cache_miss);
         monitor_printf(mon, "xbzrle cache miss rate: %0.2f\n",
                        info->xbzrle_cache->cache_miss_rate);
-        monitor_printf(mon, "xbzrle overflow : %" PRIu64 "\n",
+        monitor_printf(mon, "xbzrle overflow: %" PRIu64 "\n",
                        info->xbzrle_cache->overflow);
     }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 1/9] hmp-cmd: fix a missing_break warning Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 2/9] xbzrle: update xbzrle doc Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-06-09 16:36   ` Michael S. Tsirkin
  2020-03-25 13:16 ` [PULL 4/9] hmp/vnc: Fix info vnc list leak Dr. David Alan Gilbert (git)
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Philippe Mathieu-Daudé <philmd@redhat.com>

When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
s390x when configured with --disable-tcg:

  $ make check-qtest
    TEST    check-qtest-s390x: tests/qtest/boot-serial-test
  qemu-system-s390x: -accel tcg: invalid accelerator tcg
  qemu-system-s390x: falling back to KVM
    TEST    check-qtest-s390x: tests/qtest/pxe-test
    TEST    check-qtest-s390x: tests/qtest/test-netfilter
    TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
    TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
    TEST    check-qtest-s390x: tests/qtest/drive_del-test
    TEST    check-qtest-s390x: tests/qtest/device-plug-test
    TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
    TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
    TEST    check-qtest-s390x: tests/qtest/migration-test
  **
  ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
  ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
  make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1

Per David Gilbert, "it could just be the writing is slow on s390
and the migration thread fast; in which case the autocomplete
wouldn't be needed. Perhaps we just need to reduce the bandwidth
limit."

Tuning the threshold by reducing the initial bandwidth makes the
autoconverge test pass.

Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200323184015.11565-1-philmd@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 tests/qtest/migration-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 3d6cc83b88..2568c9529c 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
      * without throttling.
      */
     migrate_set_parameter_int(from, "downtime-limit", 1);
-    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
+    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
 
     /* To check remaining size after precopy */
     migrate_set_capability(from, "pause-before-switchover", true);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 4/9] hmp/vnc: Fix info vnc list leak
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (2 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 5/9] tools/virtiofsd/passthrough_ll: Fix double close() Dr. David Alan Gilbert (git)
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

We're iterating the list, and then freeing the iteration pointer rather
than the list head.

Fixes: 0a9667ecdb6d ("hmp: Update info vnc")
Reported-by: Coverity (CID 1421932)
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200323120822.51266-1-dgilbert@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 monitor/hmp-cmds.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 76725c2ace..04ca342c51 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -527,10 +527,11 @@ static void hmp_info_vnc_servers(Monitor *mon, VncServerInfo2List *server)
 
 void hmp_info_vnc(Monitor *mon, const QDict *qdict)
 {
-    VncInfo2List *info2l;
+    VncInfo2List *info2l, *info2l_head;
     Error *err = NULL;
 
     info2l = qmp_query_vnc_servers(&err);
+    info2l_head = info2l;
     if (err) {
         hmp_handle_error(mon, err);
         return;
@@ -559,7 +560,7 @@ void hmp_info_vnc(Monitor *mon, const QDict *qdict)
         info2l = info2l->next;
     }
 
-    qapi_free_VncInfo2List(info2l);
+    qapi_free_VncInfo2List(info2l_head);
 
 }
 #endif
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 5/9] tools/virtiofsd/passthrough_ll: Fix double close()
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (3 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 4/9] hmp/vnc: Fix info vnc list leak Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 6/9] vl.c: fix migration failure for 3.1 and older machine types Dr. David Alan Gilbert (git)
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Philippe Mathieu-Daudé <philmd@redhat.com>

On success, the fdopendir() call closes fd. Later on the error
path we try to close an already-closed fd. This can lead to
use-after-free. Fix by only closing the fd if the fdopendir()
call failed.

Cc: qemu-stable@nongnu.org
Fixes: b39bce121b (add dirp_map to hide lo_dirp pointers)
Reported-by: Coverity (CID 1421933 USE_AFTER_FREE)
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200321120654.7985-1-philmd@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 tools/virtiofsd/passthrough_ll.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index 4f259aac70..4c35c95b25 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -1520,8 +1520,7 @@ out_err:
     if (d) {
         if (d->dp) {
             closedir(d->dp);
-        }
-        if (fd != -1) {
+        } else if (fd != -1) {
             close(fd);
         }
         free(d);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 6/9] vl.c: fix migration failure for 3.1 and older machine types
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (4 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 5/9] tools/virtiofsd/passthrough_ll: Fix double close() Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 7/9] migration/colo: fix use after free of local_err Dr. David Alan Gilbert (git)
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Igor Mammedov <imammedo@redhat.com>

Migration from QEMU(v4.0) fails when using 3.1 or older machine
type. For example if one attempts to migrate
QEMU-2.12 started as
  qemu-system-ppc64 -nodefaults -M pseries-2.12 -m 4096 -mem-path /tmp/
to current master, it will fail with
  qemu-system-ppc64: Unknown ramblock "ppc_spapr.ram", cannot accept migration
  qemu-system-ppc64: error while loading state for instance 0x0 of device 'ram'
  qemu-system-ppc64: load of migration failed: Invalid argument

Caused by 900c0ba373 commit which switches main RAM allocation to
memory backends and the fact in 3.1 and older QEMU, backends used
full[***] QOM path as memory region name instead of backend's name.
That was changed after 3.1 to use prefix-less names by default
(fa0cb34d22) for new machine types.
*** effectively makes main RAM memory region names defined by
MachineClass::default_ram_id being altered with '/objects/' prefix
and therefore migration fails as old QEMU sends prefix-less
name while new QEMU expects name with prefix when using 3.1 and
older machine types.

Fix it by forcing implicit[1] memory backend to always use
prefix-less names for its memory region by setting
  'x-use-canonical-path-for-ramblock-id'
property to false.

1) i.e. memory backend created by compat glue which maps
-m/-mem-path/-mem-prealloc/default RAM size into
appropriate backend type/options to match old CLI format.

Fixes: 900c0ba373
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Message-Id: <20200304172748.15338-1-imammedo@redhat.com>
Tested-by: Lukáš Doktor <ldoktor@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 softmmu/vl.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/softmmu/vl.c b/softmmu/vl.c
index 1d33a28340..814537bb42 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2801,6 +2801,9 @@ static void create_default_memdev(MachineState *ms, const char *path)
     object_property_set_int(obj, ms->ram_size, "size", &error_fatal);
     object_property_add_child(object_get_objects_root(), mc->default_ram_id,
                               obj, &error_fatal);
+    /* Ensure backend's memory region name is equal to mc->default_ram_id */
+    object_property_set_bool(obj, false, "x-use-canonical-path-for-ramblock-id",
+                             &error_fatal);
     user_creatable_complete(USER_CREATABLE(obj), &error_fatal);
     object_unref(obj);
     object_property_set_str(OBJECT(ms), mc->default_ram_id, "memory-backend",
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 7/9] migration/colo: fix use after free of local_err
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (5 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 6/9] vl.c: fix migration failure for 3.1 and older machine types Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 8/9] migration/ram: " Dr. David Alan Gilbert (git)
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

local_err is used again in secondary_vm_do_failover() after
replication_stop_all(), so we must zero it. Otherwise try to set
non-NULL local_err will crash.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200324153630.11882-5-vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/colo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/colo.c b/migration/colo.c
index 44942c4e23..a54ac84f41 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -93,6 +93,7 @@ static void secondary_vm_do_failover(void)
     replication_stop_all(true, &local_err);
     if (local_err) {
         error_report_err(local_err);
+        local_err = NULL;
     }
 
     /* Notify all filters of all NIC to do checkpoint */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 8/9] migration/ram: fix use after free of local_err
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (6 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 7/9] migration/colo: fix use after free of local_err Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-25 13:16 ` [PULL 9/9] migration: use "" instead of (null) for tls-authz Dr. David Alan Gilbert (git)
  2020-03-26 10:46 ` [PULL 0/9] migration queue Peter Maydell
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

local_err is used again in migration_bitmap_sync_precopy() after
precopy_notify(), so we must zero it. Otherwise try to set
non-NULL local_err will crash.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200324153630.11882-6-vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/ram.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/ram.c b/migration/ram.c
index c12cfdbe26..04f13feb2e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -980,6 +980,7 @@ static void migration_bitmap_sync_precopy(RAMState *rs)
      */
     if (precopy_notify(PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC, &local_err)) {
         error_report_err(local_err);
+        local_err = NULL;
     }
 
     migration_bitmap_sync(rs);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PULL 9/9] migration: use "" instead of (null) for tls-authz
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (7 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 8/9] migration/ram: " Dr. David Alan Gilbert (git)
@ 2020-03-25 13:16 ` Dr. David Alan Gilbert (git)
  2020-03-26 10:46 ` [PULL 0/9] migration queue Peter Maydell
  9 siblings, 0 replies; 16+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2020-03-25 13:16 UTC (permalink / raw)
  To: qemu-devel, pannengyuan, maozhongyi, vsementsov, imammedo, philmd

From: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>

run:
(qemu) info migrate_parameters
announce-initial: 50 ms
...
announce-max: 550 ms
multifd-compression: none
xbzrle-cache-size: 4194304
max-postcopy-bandwidth: 0
 tls-authz: '(null)'

Migration parameter 'tls-authz' is used to provide the QOM ID
of a QAuthZ subclass instance that provides the access control
check, default is NULL. But the empty string is not a valid
object ID, so use "" instead of the default. Although it will
fail when lookup an object with ID "", it is harmless, just
consistent with tls_creds.

As a bonus, this patch also fixed the bad indentation on the
last line and removed 'has_tls_authz' redundant check in
'hmp_info_migrate_parameters'.

Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com>
Message-Id: <119f539a9f4d198bc3bcced46b8280520d60bc51.1585100802.git.maozhongyi@cmss.chinamobile.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/migration.c | 3 ++-
 monitor/hmp-cmds.c    | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 4b26110d57..c4c9aee15e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -790,7 +790,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
     params->has_tls_hostname = true;
     params->tls_hostname = g_strdup(s->parameters.tls_hostname);
     params->has_tls_authz = true;
-    params->tls_authz = g_strdup(s->parameters.tls_authz);
+    params->tls_authz = g_strdup(s->parameters.tls_authz ?
+                                 s->parameters.tls_authz : "");
     params->has_max_bandwidth = true;
     params->max_bandwidth = s->parameters.max_bandwidth;
     params->has_downtime_limit = true;
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 04ca342c51..9b94e67879 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -459,9 +459,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "%s: %" PRIu64 "\n",
             MigrationParameter_str(MIGRATION_PARAMETER_MAX_POSTCOPY_BANDWIDTH),
             params->max_postcopy_bandwidth);
-        monitor_printf(mon, " %s: '%s'\n",
+        monitor_printf(mon, "%s: '%s'\n",
             MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
-            params->has_tls_authz ? params->tls_authz : "");
+            params->tls_authz);
     }
 
     qapi_free_MigrationParameters(params);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PULL 0/9] migration queue
  2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
                   ` (8 preceding siblings ...)
  2020-03-25 13:16 ` [PULL 9/9] migration: use "" instead of (null) for tls-authz Dr. David Alan Gilbert (git)
@ 2020-03-26 10:46 ` Peter Maydell
  9 siblings, 0 replies; 16+ messages in thread
From: Peter Maydell @ 2020-03-26 10:46 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git)
  Cc: Vladimir Sementsov-Ogievskiy, Mao Zhongyi, Pan Nengyuan,
	QEMU Developers, Igor Mammedov, Philippe Mathieu-Daudé

On Wed, 25 Mar 2020 at 13:17, Dr. David Alan Gilbert (git)
<dgilbert@redhat.com> wrote:
>
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> The following changes since commit 736cf607e40674776d752acc201f565723e86045:
>
>   Update version for v5.0.0-rc0 release (2020-03-24 17:50:00 +0000)
>
> are available in the Git repository at:
>
>   git://github.com/dagrh/qemu.git tags/pull-migration-20200325b
>
> for you to fetch changes up to 7cd75cbdb8a45d9e2d5912f774d8194cbafdfa97:
>
>   migration: use "" instead of (null) for tls-authz (2020-03-25 12:31:38 +0000)
>
> ----------------------------------------------------------------
> Combo Migration/HMP/virtiofs pull
>
> Small fixes all around.
> Ones that are noticeable:
>   a) Igor's migration compatibility fix affecting older machine types
>      has been seen in the wild
>   b) Philippe's autconverge fix should fix an intermittently
>      failing migration test.
>   c) Mao's makes a small change to the output of 'info
>      migrate_parameters'  for tls-authz.
>

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.0
for any user-visible changes.

-- PMM


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-03-25 13:16 ` [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth Dr. David Alan Gilbert (git)
@ 2020-06-09 16:36   ` Michael S. Tsirkin
  2020-06-09 16:45     ` Philippe Mathieu-Daudé
  2020-06-09 17:00     ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2020-06-09 16:36 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git)
  Cc: vsementsov, maozhongyi, pannengyuan, qemu-devel, imammedo, philmd

On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote:
> From: Philippe Mathieu-Daudé <philmd@redhat.com>
> 
> When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
> s390x when configured with --disable-tcg:
> 
>   $ make check-qtest
>     TEST    check-qtest-s390x: tests/qtest/boot-serial-test
>   qemu-system-s390x: -accel tcg: invalid accelerator tcg
>   qemu-system-s390x: falling back to KVM
>     TEST    check-qtest-s390x: tests/qtest/pxe-test
>     TEST    check-qtest-s390x: tests/qtest/test-netfilter
>     TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
>     TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
>     TEST    check-qtest-s390x: tests/qtest/drive_del-test
>     TEST    check-qtest-s390x: tests/qtest/device-plug-test
>     TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
>     TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
>     TEST    check-qtest-s390x: tests/qtest/migration-test
>   **
>   ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>   ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>   make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1
> 
> Per David Gilbert, "it could just be the writing is slow on s390
> and the migration thread fast; in which case the autocomplete
> wouldn't be needed. Perhaps we just need to reduce the bandwidth
> limit."
> 
> Tuning the threshold by reducing the initial bandwidth makes the
> autoconverge test pass.
> 
> Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Message-Id: <20200323184015.11565-1-philmd@redhat.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Tested-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>


This slows make check down significantly for me, it stays
at the migration test for minutes.

I'm carrying a revert at top of my tree for now but I'd rather
not need that.


This seems like a fragile way to test things anyway.
What happens if someone slows writing even more
e.g. because it's running in a container or a VM?

How about detecting that migration finished too early
and slowing it down until autocomplete triggers?



> ---
>  tests/qtest/migration-test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 3d6cc83b88..2568c9529c 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
>       * without throttling.
>       */
>      migrate_set_parameter_int(from, "downtime-limit", 1);
> -    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
> +    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
>  
>      /* To check remaining size after precopy */
>      migrate_set_capability(from, "pause-before-switchover", true);
> -- 
> 2.25.1
> 
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-06-09 16:36   ` Michael S. Tsirkin
@ 2020-06-09 16:45     ` Philippe Mathieu-Daudé
  2020-06-09 17:00     ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 16+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-06-09 16:45 UTC (permalink / raw)
  To: Michael S. Tsirkin, Dr. David Alan Gilbert (git)
  Cc: Thomas Huth, vsementsov, maozhongyi, Cornelia Huck, pannengyuan,
	qemu-devel, imammedo, Alex Bennée

On 6/9/20 6:36 PM, Michael S. Tsirkin wrote:
> On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote:
>> From: Philippe Mathieu-Daudé <philmd@redhat.com>
>>
>> When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
>> s390x when configured with --disable-tcg:
>>
>>   $ make check-qtest
>>     TEST    check-qtest-s390x: tests/qtest/boot-serial-test
>>   qemu-system-s390x: -accel tcg: invalid accelerator tcg
>>   qemu-system-s390x: falling back to KVM
>>     TEST    check-qtest-s390x: tests/qtest/pxe-test
>>     TEST    check-qtest-s390x: tests/qtest/test-netfilter
>>     TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
>>     TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
>>     TEST    check-qtest-s390x: tests/qtest/drive_del-test
>>     TEST    check-qtest-s390x: tests/qtest/device-plug-test
>>     TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
>>     TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
>>     TEST    check-qtest-s390x: tests/qtest/migration-test
>>   **
>>   ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>>   ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>>   make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1
>>
>> Per David Gilbert, "it could just be the writing is slow on s390
>> and the migration thread fast; in which case the autocomplete
>> wouldn't be needed. Perhaps we just need to reduce the bandwidth
>> limit."
>>
>> Tuning the threshold by reducing the initial bandwidth makes the
>> autoconverge test pass.
>>
>> Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> Message-Id: <20200323184015.11565-1-philmd@redhat.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 
> 
> This slows make check down significantly for me, it stays
> at the migration test for minutes.

Alex Bennée reported the same problem, I don't know what is the best way
to fix this.

> 
> I'm carrying a revert at top of my tree for now but I'd rather
> not need that.
> 
> 
> This seems like a fragile way to test things anyway.
> What happens if someone slows writing even more
> e.g. because it's running in a container or a VM?

This seems to be the problem I noticed and tried to fix.

> 
> How about detecting that migration finished too early
> and slowing it down until autocomplete triggers?

David, I am a bit clueless with this code, do you mind having a look?

We can revert this test and disable the s390x runner, but it is our only
big-endian host, which recently show to be useful.

> 
>> ---
>>  tests/qtest/migration-test.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> index 3d6cc83b88..2568c9529c 100644
>> --- a/tests/qtest/migration-test.c
>> +++ b/tests/qtest/migration-test.c
>> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
>>       * without throttling.
>>       */
>>      migrate_set_parameter_int(from, "downtime-limit", 1);
>> -    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
>> +    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
>>  
>>      /* To check remaining size after precopy */
>>      migrate_set_capability(from, "pause-before-switchover", true);
>> -- 
>> 2.25.1
>>
>>
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-06-09 16:36   ` Michael S. Tsirkin
  2020-06-09 16:45     ` Philippe Mathieu-Daudé
@ 2020-06-09 17:00     ` Dr. David Alan Gilbert
  2020-06-09 17:07       ` Philippe Mathieu-Daudé
  2020-06-09 18:10       ` Michael S. Tsirkin
  1 sibling, 2 replies; 16+ messages in thread
From: Dr. David Alan Gilbert @ 2020-06-09 17:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: vsementsov, maozhongyi, pannengyuan, qemu-devel, imammedo, philmd

* Michael S. Tsirkin (mst@redhat.com) wrote:
> On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote:
> > From: Philippe Mathieu-Daudé <philmd@redhat.com>
> > 
> > When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
> > s390x when configured with --disable-tcg:
> > 
> >   $ make check-qtest
> >     TEST    check-qtest-s390x: tests/qtest/boot-serial-test
> >   qemu-system-s390x: -accel tcg: invalid accelerator tcg
> >   qemu-system-s390x: falling back to KVM
> >     TEST    check-qtest-s390x: tests/qtest/pxe-test
> >     TEST    check-qtest-s390x: tests/qtest/test-netfilter
> >     TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
> >     TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
> >     TEST    check-qtest-s390x: tests/qtest/drive_del-test
> >     TEST    check-qtest-s390x: tests/qtest/device-plug-test
> >     TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
> >     TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
> >     TEST    check-qtest-s390x: tests/qtest/migration-test
> >   **
> >   ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
> >   ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
> >   make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1
> > 
> > Per David Gilbert, "it could just be the writing is slow on s390
> > and the migration thread fast; in which case the autocomplete
> > wouldn't be needed. Perhaps we just need to reduce the bandwidth
> > limit."
> > 
> > Tuning the threshold by reducing the initial bandwidth makes the
> > autoconverge test pass.
> > 
> > Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > Message-Id: <20200323184015.11565-1-philmd@redhat.com>
> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Tested-by: Alex Bennée <alex.bennee@linaro.org>
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 
> 
> This slows make check down significantly for me, it stays
> at the migration test for minutes.

Is this on the s390x version or all of them?

> I'm carrying a revert at top of my tree for now but I'd rather
> not need that.
> 
> 
> This seems like a fragile way to test things anyway.
> What happens if someone slows writing even more
> e.g. because it's running in a container or a VM?
> 
> How about detecting that migration finished too early
> and slowing it down until autocomplete triggers?

THe problem is you can't rely on any form of consistency in a heavily
contended container, so the fact that it took n-seconds to migrate at
this attempt means very little about what the next attempt will take.

Dave

> 
> 
> > ---
> >  tests/qtest/migration-test.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > index 3d6cc83b88..2568c9529c 100644
> > --- a/tests/qtest/migration-test.c
> > +++ b/tests/qtest/migration-test.c
> > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
> >       * without throttling.
> >       */
> >      migrate_set_parameter_int(from, "downtime-limit", 1);
> > -    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
> > +    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> >  
> >      /* To check remaining size after precopy */
> >      migrate_set_capability(from, "pause-before-switchover", true);
> > -- 
> > 2.25.1
> > 
> > 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-06-09 17:00     ` Dr. David Alan Gilbert
@ 2020-06-09 17:07       ` Philippe Mathieu-Daudé
  2020-06-09 18:10       ` Michael S. Tsirkin
  1 sibling, 0 replies; 16+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-06-09 17:07 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Michael S. Tsirkin
  Cc: imammedo, vsementsov, pannengyuan, qemu-devel, maozhongyi

On 6/9/20 7:00 PM, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (mst@redhat.com) wrote:
>> On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote:
>>> From: Philippe Mathieu-Daudé <philmd@redhat.com>
>>>
>>> When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
>>> s390x when configured with --disable-tcg:
>>>
>>>   $ make check-qtest
>>>     TEST    check-qtest-s390x: tests/qtest/boot-serial-test
>>>   qemu-system-s390x: -accel tcg: invalid accelerator tcg
>>>   qemu-system-s390x: falling back to KVM
>>>     TEST    check-qtest-s390x: tests/qtest/pxe-test
>>>     TEST    check-qtest-s390x: tests/qtest/test-netfilter
>>>     TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
>>>     TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
>>>     TEST    check-qtest-s390x: tests/qtest/drive_del-test
>>>     TEST    check-qtest-s390x: tests/qtest/device-plug-test
>>>     TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
>>>     TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
>>>     TEST    check-qtest-s390x: tests/qtest/migration-test
>>>   **
>>>   ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>>>   ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
>>>   make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1
>>>
>>> Per David Gilbert, "it could just be the writing is slow on s390
>>> and the migration thread fast; in which case the autocomplete
>>> wouldn't be needed. Perhaps we just need to reduce the bandwidth
>>> limit."
>>>
>>> Tuning the threshold by reducing the initial bandwidth makes the
>>> autoconverge test pass.
>>>
>>> Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>>> Message-Id: <20200323184015.11565-1-philmd@redhat.com>
>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>>> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>
>>
>> This slows make check down significantly for me, it stays
>> at the migration test for minutes.
> 
> Is this on the s390x version or all of them?

Personally I only noticed it on s390x.

Alex have you seen another arch?

> 
>> I'm carrying a revert at top of my tree for now but I'd rather
>> not need that.
>>
>>
>> This seems like a fragile way to test things anyway.
>> What happens if someone slows writing even more
>> e.g. because it's running in a container or a VM?
>>
>> How about detecting that migration finished too early
>> and slowing it down until autocomplete triggers?
> 
> THe problem is you can't rely on any form of consistency in a heavily
> contended container, so the fact that it took n-seconds to migrate at
> this attempt means very little about what the next attempt will take.
> 
> Dave
> 
>>
>>
>>> ---
>>>  tests/qtest/migration-test.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>>> index 3d6cc83b88..2568c9529c 100644
>>> --- a/tests/qtest/migration-test.c
>>> +++ b/tests/qtest/migration-test.c
>>> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
>>>       * without throttling.
>>>       */
>>>      migrate_set_parameter_int(from, "downtime-limit", 1);
>>> -    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
>>> +    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
>>>  
>>>      /* To check remaining size after precopy */
>>>      migrate_set_capability(from, "pause-before-switchover", true);
>>> -- 
>>> 2.25.1
>>>
>>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth
  2020-06-09 17:00     ` Dr. David Alan Gilbert
  2020-06-09 17:07       ` Philippe Mathieu-Daudé
@ 2020-06-09 18:10       ` Michael S. Tsirkin
  1 sibling, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2020-06-09 18:10 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: vsementsov, maozhongyi, pannengyuan, qemu-devel, imammedo, philmd

On Tue, Jun 09, 2020 at 06:00:17PM +0100, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (mst@redhat.com) wrote:
> > On Wed, Mar 25, 2020 at 01:16:26PM +0000, Dr. David Alan Gilbert (git) wrote:
> > > From: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > 
> > > When using max-bandwidth=~100Mb/s, this test fails on Travis-CI
> > > s390x when configured with --disable-tcg:
> > > 
> > >   $ make check-qtest
> > >     TEST    check-qtest-s390x: tests/qtest/boot-serial-test
> > >   qemu-system-s390x: -accel tcg: invalid accelerator tcg
> > >   qemu-system-s390x: falling back to KVM
> > >     TEST    check-qtest-s390x: tests/qtest/pxe-test
> > >     TEST    check-qtest-s390x: tests/qtest/test-netfilter
> > >     TEST    check-qtest-s390x: tests/qtest/test-filter-mirror
> > >     TEST    check-qtest-s390x: tests/qtest/test-filter-redirector
> > >     TEST    check-qtest-s390x: tests/qtest/drive_del-test
> > >     TEST    check-qtest-s390x: tests/qtest/device-plug-test
> > >     TEST    check-qtest-s390x: tests/qtest/virtio-ccw-test
> > >     TEST    check-qtest-s390x: tests/qtest/cpu-plug-test
> > >     TEST    check-qtest-s390x: tests/qtest/migration-test
> > >   **
> > >   ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
> > >   ERROR - Bail out! ERROR:tests/qtest/migration-test.c:1229:test_migrate_auto_converge: 'got_stop' should be FALSE
> > >   make: *** [tests/Makefile.include:633: check-qtest-s390x] Error 1
> > > 
> > > Per David Gilbert, "it could just be the writing is slow on s390
> > > and the migration thread fast; in which case the autocomplete
> > > wouldn't be needed. Perhaps we just need to reduce the bandwidth
> > > limit."
> > > 
> > > Tuning the threshold by reducing the initial bandwidth makes the
> > > autoconverge test pass.
> > > 
> > > Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > Message-Id: <20200323184015.11565-1-philmd@redhat.com>
> > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > Tested-by: Alex Bennée <alex.bennee@linaro.org>
> > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > 
> > 
> > This slows make check down significantly for me, it stays
> > at the migration test for minutes.
> 
> Is this on the s390x version or all of them?

x86 slows down for me.

> > I'm carrying a revert at top of my tree for now but I'd rather
> > not need that.
> > 
> > 
> > This seems like a fragile way to test things anyway.
> > What happens if someone slows writing even more
> > e.g. because it's running in a container or a VM?
> > 
> > How about detecting that migration finished too early
> > and slowing it down until autocomplete triggers?
> 
> THe problem is you can't rely on any form of consistency in a heavily
> contended container, so the fact that it took n-seconds to migrate at
> this attempt means very little about what the next attempt will take.
> 
> Dave

What I'm saying is try migrating quickly. If it's too quick
try again ...

> > 
> > 
> > > ---
> > >  tests/qtest/migration-test.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > > index 3d6cc83b88..2568c9529c 100644
> > > --- a/tests/qtest/migration-test.c
> > > +++ b/tests/qtest/migration-test.c
> > > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
> > >       * without throttling.
> > >       */
> > >      migrate_set_parameter_int(from, "downtime-limit", 1);
> > > -    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
> > > +    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> > >  
> > >      /* To check remaining size after precopy */
> > >      migrate_set_capability(from, "pause-before-switchover", true);
> > > -- 
> > > 2.25.1
> > > 
> > > 
> > 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-06-09 18:14 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-25 13:16 [PULL 0/9] migration queue Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 1/9] hmp-cmd: fix a missing_break warning Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 2/9] xbzrle: update xbzrle doc Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 3/9] tests/migration: Reduce autoconverge initial bandwidth Dr. David Alan Gilbert (git)
2020-06-09 16:36   ` Michael S. Tsirkin
2020-06-09 16:45     ` Philippe Mathieu-Daudé
2020-06-09 17:00     ` Dr. David Alan Gilbert
2020-06-09 17:07       ` Philippe Mathieu-Daudé
2020-06-09 18:10       ` Michael S. Tsirkin
2020-03-25 13:16 ` [PULL 4/9] hmp/vnc: Fix info vnc list leak Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 5/9] tools/virtiofsd/passthrough_ll: Fix double close() Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 6/9] vl.c: fix migration failure for 3.1 and older machine types Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 7/9] migration/colo: fix use after free of local_err Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 8/9] migration/ram: " Dr. David Alan Gilbert (git)
2020-03-25 13:16 ` [PULL 9/9] migration: use "" instead of (null) for tls-authz Dr. David Alan Gilbert (git)
2020-03-26 10:46 ` [PULL 0/9] migration queue Peter Maydell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.